Snapshot Backups for MongoDB Using MinIO

As the volume and velocity of data continue to grow, object storage serves as the backbone of the data stack, driving innovation and efficiency in data products of all kinds. Among these, MongoDB stands out as a frontrunner in the NoSQL realm, offering a versatile and scalable solution for managing unstructured and semi-structured data, such as JSON documents. It excels at meeting the demands of application systems that consistently handle small yet mission-critical data, such as that created by e-commerce transactions, research and development, IoT sensor readings and others.

The only true backup storage for a modern data infrastructure is object storage that is highly performant, open source and accessible with industry standard S3 APIs. Veeam and Commvault have continuously validated this assertion.

We’ve already covered how to accelerate MongoDB backups with MinIO Jumbo. That method makes use of mongodump, a great tool for backing up and restoring small MongoDB deployments. However, it is not recommended for capturing snapshots of larger, enterprise deployments, because mongodump can adversely affect performance when the data being backed up is larger than system memory. Furthermore, mongodump is not suitable for use with sharded clusters because it cannot guarantee transaction atomicity across multiple shards.

In light of these limitations, Ops Manager is the preferred method for large-scale backups. Ops Manager falls within the suite of paid subscription tools available in MongoDB Enterprise Advanced. Ops Manager specializes in continuously backing up replica sets and sharded clusters by reading the Oplog data to create snapshots at specified intervals. It also offers point-in-time recovery capabilities. What distinguishes Ops Manager from other MongoDB backup tools is its flexibility, as it can be deployed on your infrastructure, enabling some level of portability and potential cost savings.

This tutorial will focus on using Ops Manager to back up replica sets running on MongoDB Enterprise Server using MinIO as both Snapshot storage and additional Oplog storage.

Understanding MongoDB Backup Basics

Ops Manager can perform both full and incremental backups. A full backup is necessary for the initial backup, after a deletion, or after a block size change. A full backup followed by incremental backups is a very common strategy, as it offers both cost and network efficiency.
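To make the full-then-incremental idea concrete, here is a minimal toy model in Python. It is not how Ops Manager or WiredTiger implement backups; the function name plan_backup, the 8-byte block size and the sample data are all assumptions chosen for illustration. The sketch hashes fixed-size blocks and uploads only the blocks whose hash differs from the previous run.

```python
import hashlib
from typing import Dict, Tuple


def plan_backup(previous_index: Dict[int, str], data: bytes,
                block_size: int) -> Tuple[Dict[int, bytes], Dict[int, str]]:
    """Return (blocks to upload, new index of block number -> SHA-256 digest)."""
    to_upload: Dict[int, bytes] = {}
    new_index: Dict[int, str] = {}
    for start in range(0, len(data), block_size):
        number = start // block_size
        block = data[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        new_index[number] = digest
        # A block is uploaded when it is new or its content has changed.
        if previous_index.get(number) != digest:
            to_upload[number] = block
    return to_upload, new_index


# First run: there is no previous index, so every block is uploaded (full backup).
data_v1 = b"A" * 8 + b"B" * 8 + b"C" * 8
full_upload, index_v1 = plan_backup({}, data_v1, block_size=8)

# Second run: only the middle block changed, so only that block is uploaded.
data_v2 = b"A" * 8 + b"X" * 8 + b"C" * 8
incremental_upload, index_v2 = plan_backup(index_v1, data_v2, block_size=8)

print(sorted(full_upload))         # [0, 1, 2]
print(sorted(incremental_upload))  # [1]
```

In this toy model, an index built with one block size cannot be compared meaningfully against blocks cut at another size, which is one way to see why a block size change calls for a fresh full backup.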

When you enable Backup for your MongoDB deployment, you are creating a workflow for taking scheduled snapshots of data from MongoDB replica sets or sharded clusters.

Here are some key elements of that workflow:

  • Data Snapshot: Ops Manager regularly takes a snapshot of your data directory at user-scheduled intervals.
  • Monitoring: MongoDB Agents monitor your deployment and offer built-in tools to evaluate metrics like query and schema performance.
  • Data Transfer: The backup daemon copies these snapshots from your MongoDB deployment and sends them to MinIO.
  • Incremental Changes: To keep backups up to date, Ops Manager uses WiredTiger’s incremental backup cursor to capture any changes made to the data since the last snapshot in each replica set’s operation log (Oplog). These logs are tailed by the MongoDB Agents, and compressed batches, or slices, are stored in MinIO.
  • Recovery: In case of data loss or other issue, Ops Manager allows you to restore your MongoDB data from these snapshots.
Chart adapted from “Ops Manager Architecture.” https://www.mongodb.com/docs/ops-manager/current/core/system-overview

When you’re in the planning stages for your backup process, MongoDB strongly recommends that you create a ticket through their support portal. They can help you map out the appropriate architecture for your deployment. In almost all cases, object storage plays a key part.

Setting Up MinIO for MongoDB Snapshot Backups

To utilize MinIO for your backup needs, you'll need to have MinIO Server up and running. If you haven't already set up MinIO, please go ahead and install it on your chosen platform. Once MinIO is installed, you can access it using either the web console or the mc command-line tool. If you opt for mc, here are the steps to install it.

First, create a bucket for your Snapshot Store. Do not create subfolders, as these are not supported by Ops Manager.

Then create another bucket for your Oplog Storage.
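If you prefer to script bucket creation rather than use the console or mc, the two buckets can be created with the MinIO Python SDK (the minio package). The following is a minimal sketch under stated assumptions: the endpoint localhost:9000, the placeholder credentials, the bucket names mongodb-snapshots and mongodb-oplog, and the helper names ensure_buckets and connect_to_minio are all hypothetical and should be replaced with your own values.

```python
from typing import Iterable, List


def ensure_buckets(client, bucket_names: Iterable[str]) -> List[str]:
    """Create each bucket that does not exist yet and return the names created.

    `client` is expected to behave like a minio.Minio instance.
    """
    created = []
    for name in bucket_names:
        # The tutorial warns that Ops Manager does not support subfolders.
        if "/" in name:
            raise ValueError(f"use a plain bucket name, not a path: {name!r}")
        if not client.bucket_exists(bucket_name=name):
            client.make_bucket(bucket_name=name)
            created.append(name)
    return created


def connect_to_minio():
    """Build a client for a local, non-TLS MinIO server (assumed values)."""
    from minio import Minio  # third-party package: pip install minio

    return Minio(
        endpoint="localhost:9000",     # S3 API port, not the Console port
        access_key="YOUR-ACCESS-KEY",  # placeholder
        secret_key="YOUR-SECRET-KEY",  # placeholder
        secure=False,                  # assumes plain HTTP
    )
```

With a MinIO server running, calling ensure_buckets(connect_to_minio(), ["mongodb-snapshots", "mongodb-oplog"]) would create whichever of the two buckets is missing. The function is safe to run more than once because existing buckets are skipped.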

Configuring MongoDB for Snapshot Backups to MinIO

The prerequisite for this tutorial is a working Ops Manager and a replica set which you intend to back up. If you haven’t yet installed Ops Manager, please reference MongoDB’s guide.

You can use Ops Manager to create new deployments, but we will assume for this tutorial that you already have an existing replica set. If you haven’t already, follow MongoDB’s guide to monitor an existing deployment. You will need to download Ops Manager agents to servers where the MongoDB deployment is running, but the wizard will walk you through the process.

The next few configurations can be done in the Admin panel in Ops Manager. Click on the Backup panel to open options for Snapshot storage.

In this configuration, Snapshots will be stored in MinIO. When configured correctly, your S3 Storage should look similar to the settings below. You will have to use the advanced configuration option.

  • S3 Bucket Name - MinIO bucket you created for Snapshot Storage
  • Region Override - Set to us-east-1
  • S3 Endpoint - For this tutorial, leave the https off your MinIO host URL. Specify the port number for the MinIO S3 API, not the MinIO Console.
  • Path Style Access - Check
  • AWS Access Key - MinIO username
  • AWS Secret Key - MinIO password
  • <hostname>:<port> - comma-separated list of replica set members
  • Acknowledgement that MongoDB cannot offer support for MinIO - Check
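Before typing these values into the form, it can help to sanity-check them. The sketch below is a hypothetical pre-flight check, not part of Ops Manager: the S3StoreSettings class, the find_problems and parse_members helpers, the example host names and the Console port value are all assumptions. It only encodes the rules listed above (no subfolders, us-east-1, path-style access, the S3 API port rather than the Console port, and a comma-separated <hostname>:<port> list).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class S3StoreSettings:
    bucket: str
    endpoint: str
    replica_set_members: str
    region_override: str = "us-east-1"
    path_style_access: bool = True


def parse_members(members: str) -> List[Tuple[str, int]]:
    """Split 'host1:27017, host2:27017' into [(host, port), ...]."""
    parsed = []
    for item in members.split(","):
        host, sep, port = item.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected <hostname>:<port>, got {item.strip()!r}")
        parsed.append((host, int(port)))
    return parsed


def find_problems(settings: S3StoreSettings, console_port: int) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if "/" in settings.bucket:
        problems.append("bucket name must not contain a subfolder path")
    if settings.region_override != "us-east-1":
        problems.append("region override should be us-east-1")
    if not settings.path_style_access:
        problems.append("path style access should be checked")
    # The port is whatever follows the last colon of the endpoint.
    port = settings.endpoint.rstrip("/").rpartition(":")[2]
    if not port.isdigit():
        problems.append("endpoint needs an explicit port")
    elif int(port) == console_port:
        problems.append("endpoint uses the MinIO Console port, not the S3 API port")
    try:
        parse_members(settings.replica_set_members)
    except ValueError as exc:
        problems.append(str(exc))
    return problems


settings = S3StoreSettings(
    bucket="mongodb-snapshots",
    endpoint="minio.example.internal:9000",
    replica_set_members="db1.example.internal:27017, db2.example.internal:27017",
)
print(find_problems(settings, console_port=9001))  # []
```

The Console port is passed in as an argument because it depends on how your MinIO server was started; 9001 here is only an example value.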

While the need for a Snapshot storage bucket seems obvious, the need for an Oplog Storage option might not be. As an explanation and recap of the workflow outlined earlier: MongoDB Agent constantly monitors your replica set and tracks any changes in the Oplog (operation log) of the primary member. The secondary members copy and apply these changes to their own Oplogs. MongoDB Agent tails the Oplog of each replica set and then transmits the new Oplog entries to Ops Manager. These entries are sent in compressed bundles called Oplog slices, which require storage in order for Ops Manager to function.
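The idea of an Oplog slice, a compressed batch of Oplog entries, can be sketched in a few lines of Python. This is an illustration only: the real slice format used by Ops Manager is not described in this tutorial, so the JSON-plus-gzip encoding, the function names make_slices and read_slice, and the simplified entries (with an integer ts instead of a BSON timestamp) are all assumptions.

```python
import gzip
import json
from typing import Dict, List


def make_slices(entries: List[Dict], max_entries_per_slice: int) -> List[bytes]:
    """Group oplog-like entries into batches and compress each batch."""
    slices = []
    for start in range(0, len(entries), max_entries_per_slice):
        batch = entries[start:start + max_entries_per_slice]
        payload = json.dumps(batch).encode("utf-8")
        slices.append(gzip.compress(payload))
    return slices


def read_slice(blob: bytes) -> List[Dict]:
    """Decompress one slice back into its list of entries."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Simplified entries: ts = time of the operation, op = i(nsert)/u(pdate)/d(elete),
# ns = namespace (database.collection), o = the operation document.
entries = [
    {"ts": 101, "op": "i", "ns": "shop.orders", "o": {"_id": 1, "qty": 5}},
    {"ts": 102, "op": "u", "ns": "shop.orders", "o": {"_id": 1, "qty": 4}},
    {"ts": 103, "op": "i", "ns": "shop.orders", "o": {"_id": 2, "qty": 9}},
    {"ts": 104, "op": "d", "ns": "shop.orders", "o": {"_id": 1}},
    {"ts": 105, "op": "i", "ns": "shop.orders", "o": {"_id": 3, "qty": 1}},
]

slices = make_slices(entries, max_entries_per_slice=2)
print(len(slices))  # 3
```

Each element of slices is an opaque blob of bytes, which is the kind of object that suits a bucket in MinIO. Reading the slices back in order reproduces the original sequence of entries.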

Navigate to the Admin panel and then the Oplog Storage panel to configure Oplog Storage for Ops Manager. Select Add New s3 Oplog Store and then the advanced configuration panel.

When you’ve configured correctly, your Oplog S3 storage panel should look similar to the screenshot below. You absolutely need your Oplog Storage to be a different bucket than your Snapshot Storage. All the other required fields should be familiar to you from the Snapshot Storage configuration panel.

You can now enable continuous backup.

You’ll first be asked to enable a HEAD directory if you haven’t already done so. This can be done in Admin under the Backup Initial Configuration. The default directory is the dbpath of your replica set.

Start the Ops Manager Backup wizard by clicking on Continuous backup on the Ops Manager start page.

Click on the green Begin Setup button to begin the process.

You’ll need to install the MongoDB Agents first. We assumed earlier in the tutorial you had already done this, but if you haven’t yet you must now. The minimum requirements are one instance of Monitoring and one instance of Backup. In production, these two should be activated on the same server, and that must be a different server than the mongod instances that will be monitored.

Follow the guide to install and configure the agent. Once it is verified, click the Next button to move forward.

The next screen guides you through enabling backup on the replica sets you already added to your project. Note that backups can only be enabled on MongoDB Enterprise. If you are using the Community Edition, you will have to upgrade or use the previously mentioned mongodump.

If you were successful, you should now be able to see Backup Enabled on your Deployment.

You were asked to configure the Ops Manager Snapshot schedule when configuring Ops Manager for the first time. If your requirements have changed, you can edit the schedule at any time by navigating to Admin and then Ops Manager Config. You cannot create an on-demand snapshot, and the initial backup feature has been deprecated. Essentially, you enable backups and then wait for the first one to run.
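Since you have to wait for the first scheduled snapshot, one simple way to confirm that data is arriving is to count the objects in your Snapshot Store bucket. The helper below is a hypothetical sketch: summarize_bucket is not part of any library, and it assumes that client is an already configured minio.Minio instance from the MinIO Python SDK and that you pass your own bucket name.

```python
from typing import Tuple


def summarize_bucket(client, bucket_name: str) -> Tuple[int, int]:
    """Return (object count, total size in bytes) for a bucket.

    `client` is expected to behave like a minio.Minio instance.
    """
    count = 0
    total_bytes = 0
    for obj in client.list_objects(bucket_name=bucket_name, recursive=True):
        count += 1
        total_bytes += obj.size or 0  # size can be missing for some entries
    return count, total_bytes
```

A result of (0, 0) simply means that no snapshot data has been written yet. A count that grows after the scheduled snapshot time suggests that the backup daemon is writing to MinIO as configured.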

How to Restore Backups

Ops Manager enables you to restore data from either a complete scheduled snapshot or from a specific point in time between snapshots, for both sharded clusters and replica sets. Snapshot-based restoration is the simplest method and involves Ops Manager directly reading from MinIO.

In contrast, point-in-time restoration with Ops Manager involves taking snapshot data from MinIO, applying stored Oplog entries to it up to the specified point, and then delivering the combined result to Ops Manager. Configuration options in Ops Manager allow you to specify how much Oplog data to retain per backup, which determines the window in which point-in-time restores are possible.
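The mechanics of a point-in-time restore, a snapshot plus a replay of later Oplog entries, can be shown with a small Python model. This is a conceptual sketch and not Ops Manager's implementation: the function restore_to_point_in_time, the dictionary used as a stand-in for a collection, and the simplified entry fields (ts, op, _id, doc) are all assumptions made for illustration.

```python
import copy
from typing import Dict, List


def restore_to_point_in_time(snapshot: Dict[int, Dict], snapshot_ts: int,
                             oplog: List[Dict], target_ts: int) -> Dict[int, Dict]:
    """Replay entries with snapshot_ts < ts <= target_ts on a copy of the snapshot."""
    if target_ts < snapshot_ts:
        raise ValueError("target time is earlier than the snapshot; use an older snapshot")
    data = copy.deepcopy(snapshot)  # never modify the stored snapshot itself
    for entry in sorted(oplog, key=lambda e: e["ts"]):
        if entry["ts"] <= snapshot_ts:
            continue  # already reflected in the snapshot
        if entry["ts"] > target_ts:
            break     # past the requested point in time
        if entry["op"] == "i":
            data[entry["_id"]] = dict(entry["doc"])
        elif entry["op"] == "u":
            data.setdefault(entry["_id"], {}).update(entry["doc"])
        elif entry["op"] == "d":
            data.pop(entry["_id"], None)
    return data


snapshot = {1: {"qty": 5}}  # state captured at ts=100
oplog = [
    {"ts": 90, "op": "i", "_id": 1, "doc": {"qty": 5}},
    {"ts": 110, "op": "u", "_id": 1, "doc": {"qty": 4}},
    {"ts": 120, "op": "i", "_id": 2, "doc": {"qty": 9}},
    {"ts": 130, "op": "d", "_id": 1},
]

print(restore_to_point_in_time(snapshot, 100, oplog, target_ts=125))
# {1: {'qty': 4}, 2: {'qty': 9}}
```

The model also shows why Oplog retention matters: a restore can only reach a point in time if every entry between the snapshot and that point is still available to replay.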

To configure Ops Manager for restoration from the Continuous Backup panel, simply click on the deployment and select the restore option. Follow the wizard's steps to choose your restore point and decide whether to restore to another cluster or download the files.

Conclusion

This guide explored how MongoDB Ops Manager and MinIO work together, showcasing the role of each in safeguarding MongoDB data. While mongodump remains a viable option for smaller deployments, Ops Manager is the preferred choice for larger enterprise scenarios. Its ability to create and restore snapshots of MongoDB resources, combined with MinIO's high-performance, open-source, and S3 API-compatible storage, creates the foundation of a robust backup strategy that keeps MongoDB data secure and accessible, even in the face of unexpected challenges.

As a parting note, remember to keep your installations of both MongoDB and MinIO updated to take advantage of the latest features (like Ops Manager) and security updates. Show us your MongoDB and MinIO backup architecture on Slack or email us at hello@min.io.
