Open Source = Bombproof

Software isn't usually described as bombproof.

That is particularly true of the type of software that is responsible for large analytic jobs or machine learning workloads. The words “finicky”, “complex” or, in the case of good marketing, “professional grade” (meaning you need years of study and multiple certifications) are more common.

Bombproof software, however, is one of the many benefits associated with active open source software projects. This may seem counterintuitive to those who consider open source software to be “unfinished” or “test/dev quality”, but in practice highly utilized open source software is the most tried and tested software in the market.

Let’s start with scale. MinIO is deployed on every continent. There are more than 227M Docker pulls as of this writing. More than 16.5K GitHub stars. More than 17K different entities, from AAA to Zyztm Research, with thousands more behind unresolvable IP addresses. Every day those entities pour in more and more data.

Exabytes upon exabytes in aggregate.

Big instances, small instances and every size in between. The type of scale that the appliance vendors dream of. This scale is a function of the fact that anyone can download and run MinIO - from large, global enterprises who build applications on top of it (more on that in a bit) to the small consulting shops who deploy it in banks, hospitals and retail settings.

Next is configurations. Every one of these seventeen thousand entities is different. Even the instances within a single entity are different. What features are enabled, what language, what APIs, what applications, what compute and storage hardware, what network, what security, what type of data. This list is incredibly long and the permutations approach the number of atoms in the universe.

MinIO has to work in every case. If it doesn’t work we will hear about it on Slack, on Twitter, on Hello and on GitHub. This is the power of the community. Just because the software is open source doesn’t mean that customers are not demanding. On the contrary, when they build MinIO into their infrastructure they have expectations. When they see MinIO featured in another vendor’s solution they expect it to perform accordingly in their configuration. This hardens the software, pushes its limits and stretches its capabilities.

Third is workloads. MinIO is high-performance object storage. That means that we not only handle the traditional use cases like archival, backup and disaster recovery, but also an entirely new class of workloads, from AI/ML and big data analytics to serving as the persistent datastore for cloud native applications. In fact, many of our clients deploy us for high-performance cases and then decide to dump their legacy, incumbent backup storage, because they effectively get that use case for free with the high-performance applications.

Within these classes of workloads the use cases are incredibly deep. For example, AI/ML includes Spark, Presto, TensorFlow, H2O.ai and scikit-learn, to name just a few. Again, MinIO has to perform, whether we are replacing HDFS or running as an S3 gateway in front of Azure and Google Cloud. The workloads may differ but the importance of the underlying data does not.

Fourth is security. Securing data, in flight and at rest, is paramount for any storage system, be it file, block or object. The countless configurations and workloads expose our software in every conceivable way, yet we retain one of the best reputations in the industry. How? Well, MinIO’s encryption is based on cryptographic building blocks that are provably secure and open source. Formal proofs ensure that you haven’t made a conceptual mistake, while open source implementations would reveal any malicious backdoors and also help find potential implementation bugs. The latter point also applies to security mechanisms in general, like authentication and access control. The community is a powerful source of QA. Its members are constantly on the lookout for potential vulnerabilities and communicate them back to MinIO via the channels referenced above. Another source of QA is the incredible number of deployed configurations. MinIO tightly integrates with security applications and key management systems to match the security requirements of the most rigorous and demanding organizations.

Fifth is the license. Because we selected the most liberal Apache license, our product is built into dozens of other products. We are the developer object store for the world’s most valuable technology company. We are the object storage foundation for one of the biggest players in healthcare. We are the object layer for the leader in the software-defined enterprise cloud space and we power dozens of other commercial applications. Again, these are customer-facing applications with billions of dollars on the line, and they are powered by MinIO and enabled by the Apache V2 license.

The net result is that open source makes MinIO better, stronger, safer and more resilient than proprietary software. Yes, it impacts the top line - when you make software as good as ours and don’t charge for it, you generate less revenue in the short term. In the long term, however, the adoption of MinIO object storage provides an exceptionally bombproof foundation for our customers to build on - and that helps us build our business.