The Flaw in the Unified Storage Narrative

It seems like more and more companies are positioning themselves as one-stop shops for object, file and block storage these days - adopting the mantle of “unified” storage and offering to support a variety of storage protocols. The idea of supporting S3, NFS, SMB, HDFS, iSCSI, FCoE, NVMe-oF and FCP all at once is touted as the epitome of flexibility and convenience.

The truth is that this practice results in complexity, inefficiency, and subpar performance. It is the equivalent of the Swiss Army knife or hybrid bicycle - it is mediocre at every task in the hopes of being relevant to every task.

We don’t just say that because we are solely focused on the S3 API. We say it because it follows from engineering first principles. The team here built GlusterFS. They know a thing or two about POSIX. They decided to build something modern and performant with MinIO. Something that was designed for the cloud. In the cloud operating model, object storage rules. Nonetheless, there are use cases where file and block are better tools for the job and we recommend dedicated options there as well.

Understanding the Pitfalls of Multi-Protocol Approaches

Multi-protocol storage systems can be alluring at first glance, but on closer inspection the approach reveals itself to be the lazy thinking of companies that are comfortable with shortcuts (the same criticism can be leveled at analysts who promote this approach).

The siren song of catering to the needs of all users results in the worst of all worlds - as the tradeoffs and compromises come at the cost of excellence. This is not to say that one company cannot support two first class offerings (one POSIX and one S3). It can. It just can’t be done from the same product.

The flaws are everywhere.

Let’s start with the problem of complexity. As the number of supported protocols increases, so too does the complexity of the storage system. This complexity makes the system increasingly difficult to manage and maintain, leading to more errors and increased downtime. A perfect example can be found in the legendary complexity of Ceph. It is a good product in many disciplines - it is a great product in none. It is an operational nightmare. Even the Ceph devotees will tell you that.

Then there are the inefficiencies. Since each protocol has its unique requirements, supporting multiple protocols leads to unnecessary duplication of functions and data, causing storage inefficiency and wastage. The result is a bloated piece of software that is ill-suited to any one approach in the hopes of being mediocre in many. We have never understood this thinking. It may drive short-term cost savings, but over time, the shortcomings of the product will result in lost profitability and lower margins.

Performance is another problem. When a storage system supports multiple protocols, it's often a "jack of all trades, master of none." It fails to excel at handling any single protocol, resulting in suboptimal performance. It is simply too hard to “do the work” around things like SIMD acceleration, AVX-512 instruction sets, in-line erasure coding or encryption optimizations if you are spread out over file, block and object.

Historical Precedent in Tech: The Power of Focus

The tech industry has numerous examples where focusing on one thing has led to success, and one of the most famous is Microsoft with the MS-DOS operating system. In the early 1980s, Microsoft focused solely on MS-DOS when many competitors were developing operating systems supporting multiple protocols. This focus allowed Microsoft to capture a significant market share and led to its dominance of the PC industry. Apple in turn built macOS into a formidable competitor by relentlessly focusing on supporting its own GUI-driven perspective.

Another instance can be seen in the rise of TCP/IP as the standard for Internet communication. In the early days of the Internet, various protocols were used, such as IPX, NetBEUI, and AppleTalk. However, the singular focus on developing and promoting TCP/IP by companies and organizations like ARPA eventually led to its dominance.

A more timely example is Tesla. Their singular focus on electric vehicles and the infrastructure that supports them has allowed the company to dictate the standards for charging and capture a disproportionate share of the market. It isn’t spreading its talent across gas, hybrid and electric technologies. It is in the game to win electric. It is a philosophical choice.

The Object Storage Bandwagon

The rise of object storage as the storage of the cloud operating model isn’t really debated much anymore. That is not to say that the SAN/NAS folks don’t make their case that they are still relevant. They do, at every opportunity. It is just that the people who know don’t really listen anymore.

Redshift, BigQuery, Netflix, Uber and now OpenAI all run on object storage. Performance at scale is the reason.

As with any trend, the early adopters chose object storage for its various attributes and the functional superiority of the S3 API. Now almost everyone sees the trend. There are still holdouts but they are increasingly isolated.

They still have buying and/or veto power. As a result, they have told their incumbent vendors, “if you want to stick around, you better be able to offer object storage or I will be forced to move to someone else.”

As a file or block provider, when your $100M account tells you this - you figure out how to offer object storage. This has spawned a whole slew of “check the box” offerings from companies that until recently disparaged object storage as “cheap, deep and slow.” Since pure play offerings like MinIO and AWS S3 have shown that to be patently false, those same companies are now adding object stores to their product offerings.

They don’t really believe in object storage. They are SAN/NAS types at heart. As a result they do one of the following to “add” object storage to their portfolio:

1. They partner with someone who does it - white labeling the offering.

2. They try to build something from scratch.

3. They use old Apache-vintage MinIO code and call it their own.

They all have issues.

In #1, it is not their code and they don’t have control over the roadmap so it will always be someone else’s product. If quality suffers there, quality suffers for your customers. Further, when big companies make such arrangements, the economics are generally poor for the object storage vendor. There isn’t much incentive (or often resources) to do more.

In #2, we applaud the effort. It is hard to build an object store from scratch. We know. It takes a lot of time. If it is proprietary, it will be buggy. MinIO is open source and we have tens of thousands of developers submitting GitHub issues improving our work on a near hourly basis. Still - this is the right path. It requires long term vision.

We see #3 more than we would like. If a vendor magically adds object storage - it is pretty much a guarantee it is our code. Provided you adhere to the license it is perfectly acceptable - but the Apache code is several years old at this point and the subsequently discovered security vulnerabilities make it such that you should not be running it for anything.

The Power of Best of Breed Technologies

Best-of-breed technologies represent the pinnacle of performance in their specific domains. In the storage domain, there are superb file and block solutions, and enterprises should select them for the workloads where they are ideal. For workloads where object storage is ideal, enterprises should select the best version of that.

The overhead of running more than one system, such as learning a different GUI, is minimal compared to the gains you get from a system built specifically for the task.

Look at specialized systems, such as Salesforce for CRM or Epic for EHR. They offer advanced features that comprehensive suites from a single vendor often lack. Yes, these are sprawling products in their own right - but that is because they have customized modules for personas and businesses. They deliver the depth and sophistication needed to drive business processes effectively, providing firms with a distinct competitive advantage.

Additionally, the user experience, a critical factor influencing the adoption and efficiency of tech systems, is significantly enhanced in best-of-breed solutions. Vendors that specialize in one area generally offer intuitive, user-friendly interfaces that reflect a deep understanding of user needs, resulting in greater productivity and satisfaction.

Finally, enterprises, and particularly those who see data storage as strategic, have the skill set to walk and chew gum at the same time. Managing two superior storage solutions, each designed to be the best in its field, isn’t a challenge - it is an opportunity to move the business forward. Frankly, best-of-breed solutions, due to their focus and modern architectures, readily lend themselves to such adaptability, contrasting with the rigid, monolithic structures of integrated suites. That makes them easier to integrate - particularly with the outside (cloud-native) world.

Don’t Fall for the Myth

The multi-protocol, unified storage argument is weak. It asks the enterprise to sacrifice on multiple fronts under the guise of convenience. Mediocrity is not convenient. It is an excuse.

We welcome debate on this subject. We understand that our position is not without bias, but that doesn’t make it any less true. Let us know where you disagree.