Accelerating Database Backup and Restore with MinIO Jumbo

Object storage is the primary storage of modern, cloud-native architectures. One implication of this shift is that databases either run on, connect to, or back up to a high-performance object store.

Backup, for instance, can be done in the traditional manner: tools such as Veeam and Commvault have native support for the S3 API.

A large percentage of enterprises still depend on the tools provided by their database vendor(s) for snapshot/restore. These database-specific snapshot/restore tools often lack S3 support (although it is coming). Yet if enterprises are running database workloads on object storage, why would they want to add additional filesystem-based storage just to house backups? This gives them an extra storage deployment to manage and deprives them of the opportunity to consolidate their snapshots/backups/archives onto the same object storage deployment that already serves as primary storage. Furthermore, they must continue to rely on file systems as a key component of their BC/DR posture.

Here’s the problem: modern databases are too big for SAN/NAS architectures. The inability of SAN/NAS to scale creates significant issues for the enterprise.

As a result, at the behest of one of our Fortune 100 customers, MinIO built a tool we call Jumbo. Jumbo can handle massive dumps of data by creating parallel streams to upload segments of large objects. That collection of objects can be read back with a single restore command.
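
The post does not show Jumbo’s code, but the parallel-segment upload it describes maps naturally onto S3 multipart upload. Below is a minimal sketch with the minio-go SDK; the endpoint, credentials, bucket, file path, part size and thread count are all illustrative assumptions, not Jumbo’s actual implementation:

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// Hypothetical endpoint, credentials, bucket, and path; not Jumbo itself.
	client, err := minio.New("minio.example.com:9000", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	f, err := os.Open("/backups/db-snapshot.dump")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	stat, err := f.Stat()
	if err != nil {
		log.Fatal(err)
	}

	// With a seekable reader and a known size, minio-go splits the object
	// into parts and uploads up to NumThreads parts concurrently, the same
	// parallel-segment idea the post attributes to Jumbo.
	_, err = client.PutObject(context.Background(), "backups",
		"db-snapshot.dump", f, stat.Size(), minio.PutObjectOptions{
			PartSize:   128 * 1024 * 1024, // 128 MiB per part
			NumThreads: 16,                // concurrent part uploads
		})
	if err != nil {
		log.Fatal(err)
	}
}
```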

In essence, Jumbo organizes a large file, such as a database snapshot, into a single large stream of objects and uploads them rapidly in parallel to object storage using the S3 API. The only limitation to Jumbo is network speed (more on this below). Jumbo can read files via a Linux pipe or directly from drives: the database backup tool can either stream backup content to STDOUT or write files to the file system, and Jumbo reads from either. The backup runs, hands its output to Jumbo through a Linux pipe, and Jumbo does the rest. Jumbo supports any backup tool that writes to STDOUT. In our testing, Linux pipe throughput is far lower than reading directly from disks (under 3 GB/s for pipes), despite optimizations to address issues that commonly plague pipes.
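
In pipe mode, the receiving process simply treats STDIN as the object body. Here is a minimal sketch with minio-go, assuming the same hypothetical endpoint and credentials as above and a stream of unknown length:

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// Hypothetical endpoint and credentials; this is not Jumbo's actual CLI.
	client, err := minio.New("minio.example.com:9000", &minio.Options{
		Creds: credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
	})
	if err != nil {
		log.Fatal(err)
	}

	// Size -1 tells minio-go the stream length is unknown, so it buffers
	// fixed-size parts from STDIN and uploads them as a multipart object.
	_, err = client.PutObject(context.Background(), "backups", "db.dump",
		os.Stdin, -1, minio.PutObjectOptions{
			PartSize:              64 * 1024 * 1024, // 64 MiB parts
			NumThreads:            8,                // parts in flight
			ConcurrentStreamParts: true,             // overlap buffering and upload
		})
	if err != nil {
		log.Fatal(err)
	}
}
```

Run as, for example, `db_backup_tool | ./pipe-upload`, this mirrors the STDOUT hand-off described above.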

Further optimization is possible, but it requires embedding Jumbo directly into your backup tool. Please reach out to us if you would like to discuss how.

The utility of Jumbo goes beyond backups. It can be used to upload any large file from a local file system to S3 storage, giving it potential value in any environment where this operation takes place, such as genetics, DNA sequencing, space exploration, oil/gas exploration, particle collisions and more.

Combining Jumbo with a MinIO deployment creates a flexible, durable, high-performance home for very large backups. Jumbo adds to the suite of MinIO capabilities that make it so well suited as a backup storage endpoint:

  • High performance: MinIO is capable of 1.32 Tbps PUT throughput and 2.6 Tbps GET throughput on a single 32-node NVMe cluster. This means that backup and restore operations run faster, decreasing the potential business impact of downtime.
  • Optimized across a range of object sizes: MinIO can handle any object size with aplomb. Because MinIO writes metadata atomically along with object data, it does not require a separate metadata database. This dramatically decreases response time for small file PUTs and GETs. Jumbo parallelizes large object uploads to use the network as efficiently as possible.
  • Inline and strictly consistent: All I/O is committed synchronously with inline erasure coding. Bitrot hashing provides integrity checks that allow MinIO to detect and heal corruption on disk, while encryption protects against theft of data from disks (a sketch of this style of integrity check follows this list). The S3 API is resilient to disruption or restart during an upload or download, so a backup can’t disappear. Finally, because there is no caching or staging of data, all backup operations that complete are guaranteed to be present on persistent storage.
  • Built for commodity hardware: Commercial off-the-shelf hardware means that you’ll realize huge savings over purpose-built appliances. In today’s economic climate, you’ll get more bang for the buck as backup data grows into petabytes.
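
MinIO’s bitrot protection is built on HighwayHash. As a minimal illustration of that style of integrity check (the file path and the all-zero key are placeholders; MinIO manages real keys and hashes each object part internally):

```go
package main

import (
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"

	"github.com/minio/highwayhash"
)

func main() {
	// Placeholder 32-byte key; a real deployment would use a secret key.
	key := make([]byte, 32)

	h, err := highwayhash.New(key) // HighwayHash-256
	if err != nil {
		log.Fatal(err)
	}

	f, err := os.Open("/backups/db-snapshot.dump")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Hash the file contents; comparing this checksum on later reads is
	// how bitrot (silent on-disk corruption) is detected.
	if _, err := io.Copy(h, f); err != nil {
		log.Fatal(err)
	}
	fmt.Println("checksum:", hex.EncodeToString(h.Sum(nil)))
}
```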

Jumbo Performance Testing

As we noted above, we built Jumbo for a large customer. This wasn’t just any database backup job; it was the colossus of backup jobs, involving large database backup files, typically 15TiB to 30TiB per job. The company was facing a backup window longer than 24 hours (the 16TiB backup took over 28 hours to complete), and the end-to-end time hindered its ability to create backups and clones, threatening operations.

In building the solution, the customer tested with the latest Intel Sapphire Rapids-based Supermicro server platform.

MinIO Client: 1 node with 6 x 15.36TB NVMe drives was used to perform 50TiB and 64TiB blob backups. These files were sharded over the 6 drives.

MinIO Client: 1 node with 6 x 7.6TB NVMe drives was used to perform 1TiB, 16TiB and 32TiB blob backups. These files were sharded over the 6 drives.

MinIO Servers: 12 nodes with 6 x 7.68TB NVMe drives per node.

  • Network: Dual 100GbE QSFP28, Mellanox ConnectX-6
  • Storage: 6 x Kioxia CD6-R 7.68TB NVMe
  • CPU: 1 x Intel Xeon (Sapphire Rapids) 8461V, 48 physical cores (96 threads) per node - Q16Z/E3 SPR 8461V 48C 2.2G 97.5MB FC-LGA16A 300W XCC
  • Memory: 512GB DDR5-4800 (8 x 64GB)
  • Boot storage: Kioxia XG6 2TB NVMe (4 x 512GB)
  • Operating System: RHEL 9.1

Each SYS-211GT-HNTR 2U server enclosure is a quad-server (2U 4-node) configuration.

This configuration was also used for a Small Files Benchmark that can be found in Need for Speed 2 - The Supermicro GrandTwin™ SuperServer Benchmarks.

We ran Jumbo on one data node with data sharded across 6 NVMe disks. Jumbo reads from the disks in parallel and pushes data to the 12-node MinIO cluster over the two 100 Gbps NICs present on the data node.
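
The post does not describe Jumbo’s internals, but the fan-out read pattern can be sketched as one goroutine per drive. The mount paths and shard layout below are assumptions for illustration, not the real tool’s format:

```go
package main

import (
	"fmt"
	"io"
	"os"

	"golang.org/x/sync/errgroup"
)

// readShards opens one file per NVMe mount point and reads them concurrently,
// mirroring the parallel reads from the six data shards described above.
func readShards(paths []string, sink func(idx int, r io.Reader) error) error {
	var g errgroup.Group
	for i, p := range paths {
		i, p := i, p // capture loop variables for the goroutine
		g.Go(func() error {
			f, err := os.Open(p)
			if err != nil {
				return err
			}
			defer f.Close()
			return sink(i, f)
		})
	}
	return g.Wait()
}

func main() {
	paths := []string{
		"/mnt/nvme0/shard.0", "/mnt/nvme1/shard.1", "/mnt/nvme2/shard.2",
		"/mnt/nvme3/shard.3", "/mnt/nvme4/shard.4", "/mnt/nvme5/shard.5",
	}
	err := readShards(paths, func(idx int, r io.Reader) error {
		n, err := io.Copy(io.Discard, r) // stand-in for the upload path
		fmt.Printf("shard %d: %d bytes\n", idx, n)
		return err
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```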

MinIO Object Store Benchmark

Before testing the actual large file backup performance, MinIO used its benchmarking tool to do a performance test on the 12 node MinIO cluster.

Typically, MinIO performance is either network-bound or disk-bound. In this particular test, MinIO delivered the following server-side performance:

Large File Backup Benchmark

The goal of this benchmark test was to achieve at least 20 GiB/s of throughput on the client side, where the DB backup file is stored. As part of these tests, we were able to optimize the Jumbo code to take advantage of O_DIRECT reads from the high-performance NVMe disks.

O_DIRECT reads ensured that we avoided copying into the OS page cache and then into application memory; with this mode enabled we saw 3x faster performance. The benchmark tests (see the table below) saturated the network while CPU and memory were still available, indicating that the bottleneck in this test was, in fact, the network. We were able to consistently achieve 20 to 21 GiB/s across DB sizes.
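
O_DIRECT is requested at open time and requires aligned buffers. Here is a Linux-only sketch in Go of the technique (not Jumbo’s code; the shard path is a placeholder and error handling is minimal):

```go
package main

import (
	"fmt"
	"log"
	"unsafe"

	"golang.org/x/sys/unix"
)

const alignment = 4096 // typical page/sector alignment required by O_DIRECT

// alignedBuf returns a page-aligned slice, since O_DIRECT requires the
// buffer address, length, and file offset to all be aligned.
func alignedBuf(size int) []byte {
	buf := make([]byte, size+alignment)
	off := alignment - int(uintptr(unsafe.Pointer(&buf[0]))%alignment)
	return buf[off : off+size]
}

func main() {
	// O_DIRECT bypasses the OS page cache, so data moves from the NVMe
	// device straight into application memory.
	fd, err := unix.Open("/mnt/nvme0/shard.0", unix.O_RDONLY|unix.O_DIRECT, 0)
	if err != nil {
		log.Fatal(err)
	}
	defer unix.Close(fd)

	buf := alignedBuf(1 << 20) // 1 MiB aligned read buffer
	var total int64
	for {
		n, err := unix.Read(fd, buf)
		if n > 0 {
			total += int64(n)
		}
		if n == 0 || err != nil {
			break
		}
	}
	fmt.Println("read bytes:", total)
}
```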

In practice, the database and backup tool also limit overall throughput because they cannot drive traffic at wire speed, but if we return to the original reason for these tests, the value of Jumbo is illustrated clearly. We began with a 16TiB backup that took over 28 hours to complete. A backup of the same size was written directly from the DB backup tool to STDOUT, piped to Jumbo, and uploaded to the MinIO cluster in 1 hour 51 minutes (28 h ÷ 1.85 h ≈ 15), a 15x performance improvement over the company’s existing backup solution.

Jamming with Jumbo

Jumbo shrinks backup windows with speedy uploads to MinIO. It accomplishes this by parallelizing uploads: Jumbo uses all available bandwidth to write the backup files to MinIO. Given that MinIO is the fastest object storage available, and that Jumbo maxes out the network, the only limit on how fast a backup can run is the source system itself.

This is the way in modern architectures: disaggregated databases should run against disaggregated storage, which alleviates the need for the database itself to manage storage. The modern reference architecture combines a data lake with cloud-native database, analytics and AI applications, all running on Kubernetes and replicating data as needed across the multi-cloud.

Don’t take our word for it: download MinIO today and take it for a test drive. If you want to talk about Jumbo or have any questions, please ask us on our Slack channel.
