Scaling MinIO: Benchmarking Performance From Terabytes to Petabytes

Sidhartha Mani on Benchmarks 17 December 2019

MinIO provides best-in-class performance, as we have repeatedly shown in our previous benchmarks. In those benchmarks, we chose the highest-end hardware and measured whether MinIO could squeeze every bit of performance out of the resources afforded it. This established two key points:

MinIO utilizes the maximum possible CPU, network, and storage available.
MinIO is NOT the I/O bottleneck.

Having established these points, we turned our attention to measuring the behavior of MinIO along another equally important dimension:

MinIO’s performance does not degrade as we increase the cluster size.

The chart above depicts the linear scalability of MinIO with an HDD backend. This post covers both HDD and NVMe backends and what we observed in each case.

Performance at Scale

To measure performance at scale, we ran separate sets of tests for NVMe and HDD backends, because NVMe drives and hard disk drives (HDDs) have different scaling dynamics and warrant separate measurements.

NVMe Backends: Scaling for Performance

The maximum sustained throughput of a single NVMe drive is ~3.5 GB/sec for reads and ~2.5 GB/sec for writes. This means that as few as 4 NVMe drives are enough to saturate 16 PCIe 2.0 lanes (the maximum bandwidth of 16x PCIe 2.0 is 8 GB/sec).
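The drive count above is simple division, and a small sketch makes the arithmetic explicit. The function name `drives_to_saturate` and the constant `PCIE2_X16_GBPS` are illustrative names introduced here, not part of MinIO or any benchmark tool; the throughput figures are the approximate ones quoted in the text.

```python
import math

# Bus ceiling quoted in the text: 16 PCIe 2.0 lanes, about 8 GB/sec in total.
PCIE2_X16_GBPS = 8.0


def drives_to_saturate(bus_gbps: float, drive_gbps: float) -> int:
    """Smallest whole number of drives whose combined sustained
    throughput meets or exceeds the bus bandwidth."""
    return math.ceil(bus_gbps / drive_gbps)


# Approximate per-drive NVMe figures from the text.
nvme_read = drives_to_saturate(PCIE2_X16_GBPS, 3.5)   # reads at ~3.5 GB/sec
nvme_write = drives_to_saturate(PCIE2_X16_GBPS, 2.5)  # writes at ~2.5 GB/sec

print(f"NVMe drives to saturate the bus: reads={nvme_read}, writes={nvme_write}")
```

Under these assumptions, writes are the limiting case, which is where the figure of 4 drives comes from; reads alone would saturate the bus with fewer drives.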

In real-world scenarios, a variety of workloads simultaneously hitting the drives would justify a higher number of drives. As a rule of thumb, 8 NVMe drives per machine can be considered the point of diminishing returns.

The chart above depicts the linear scalability of MinIO with an NVMe backend.

The above-mentioned dynamics make NVMe backends a great choice for scaling workloads where maximizing throughput is the primary requirement.

We performed tests with 8 nodes and 32 nodes, each node with 8 NVMe drives and a 100 GbE network. Each drive’s capacity was 8 TB, so the total available storage was 512 TB (0.5 PB) and 2048 TB (2 PB), respectively.

As we increased the node count from 8 to 32, we noticed a near-linear (~4x) increase in Read performance.

Note: The PUT numbers seemingly indicate superlinear scalability. However, this is an artifact of hardware performance variability on AWS. The 32-node tests were performed after AWS launched bare-metal NVMe instances (i3en.metal), which did not have this issue.

HDD Backends: Scaling for Storage Capacity

In contrast to NVMe drives, the maximum sustained throughput of a single hard disk drive is ~250 MB/sec for both reads and writes. It takes approximately 32 HDDs working simultaneously at peak performance to saturate 16 PCIe 2.0 lanes. An even larger number of drives per machine can be justified, because random access (multiple concurrent I/O requests) rapidly degrades HDD performance.

HDDs enable higher storage capacity to be achieved with fewer servers as compared to NVMe backends as more drives can be packed into a single server. This makes HDD backends a great choice for scaling workloads where maximizing total storage capacity is the primary requirement.

We performed tests with 16 nodes and 24 nodes, each node with 8 HDDs and a 25 GbE network. Each drive’s capacity was 2 TB, so the total available storage was 256 TB and 384 TB, respectively.
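The storage totals quoted for the NVMe and HDD clusters are the product of node count, drives per node, and drive capacity. The sketch below reproduces them; `raw_capacity_tb` and the `clusters` table are illustrative names, and the calculation is the plain product of the quoted figures, without modeling any redundancy overhead.

```python
def raw_capacity_tb(nodes: int, drives_per_node: int, drive_tb: int) -> int:
    """Raw capacity in TB: nodes x drives per node x capacity per drive."""
    return nodes * drives_per_node * drive_tb


# (nodes, drives per node, TB per drive) for each cluster described in the text.
clusters = {
    "NVMe, 8 nodes": (8, 8, 8),
    "NVMe, 32 nodes": (32, 8, 8),
    "HDD, 16 nodes": (16, 8, 2),
    "HDD, 24 nodes": (24, 8, 2),
}

for name, config in clusters.items():
    print(f"{name}: {raw_capacity_tb(*config)} TB")
```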

As we increased the node count from 16 to 24 (1.5x), we noticed a near-linear increase in Read performance. The increase in write performance was less than linear, likely due to increased random I/O caused by the sheer load on the drives from the benchmark tests. Please note that hardware performance showed slight variability in these AWS instances as well.
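One way to quantify "near-linear" is to divide the throughput ratio by the node-count ratio: a value of 1.0 means perfectly linear scaling, and values below 1.0 mean sub-linear scaling. The sketch below assumes this definition; `scaling_efficiency` is an illustrative helper, and the throughput values are hypothetical placeholders in arbitrary units, not measurements from these tests.

```python
def scaling_efficiency(nodes_before: int, nodes_after: int,
                       tput_before: float, tput_after: float) -> float:
    """Throughput growth relative to node growth.

    1.0 means perfectly linear scaling; below 1.0 is sub-linear,
    above 1.0 is super-linear.
    """
    node_ratio = nodes_after / nodes_before
    tput_ratio = tput_after / tput_before
    return tput_ratio / node_ratio


# Hypothetical placeholder throughputs (arbitrary units), NOT benchmark results.
read_eff = scaling_efficiency(16, 24, tput_before=10.0, tput_after=14.7)
write_eff = scaling_efficiency(16, 24, tput_before=10.0, tput_after=12.0)

print(f"read efficiency:  {read_eff:.2f}")
print(f"write efficiency: {write_eff:.2f}")
```

Substituting the measured GET and PUT throughput for each cluster size gives a single number per workload that can be compared across the HDD and NVMe runs.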

Conclusion

HDD backends offer better price, efficiency, and performance when the goal is to scale total storage capacity. NVMe backends are better suited to scaling the maximum bandwidth available for clients to read and write data. In both cases, the maximum throughput scaled near-linearly as the cluster size was increased. If you have any questions, please reach out to me at sid@min.io
