A Sneak Peek: The MinIO Object Storage and AI Survey

MinIO recently surveyed 656 IT leaders as part of a primary research initiative with User Evidence. The results were very interesting and underscore the massive sea change we are seeing in the enterprise, both around the movement to object storage and the interest in using object storage as the primary building block for an organization’s AI initiatives. We will summarize some of those key points here as a sneak peek, with the full report to be released in time for the Gartner IOCS event in early December.

Nearly 50% of the respondents worked in IT Operations/Infrastructure, with Application and Software Development coming in next (27%) and IT Architecture (11%) following. The rest were spread (in order) among DevOps, data engineering, and other.

Which of the following best describes your primary job responsibility? 


The respondents were hands-on. We asked for their top three responsibilities, and the following came up most frequently: Evaluate and Select Storage Technologies (63%), Building Applications/Databases/AI/ML (52%), and Implementation (49%).

What are your top three professional responsibilities for your organization's data storage? 

Choose up to three:

More than 80% of the respondents were either management (VP, Director) or managers (team leads, project leads). Seventy percent came from organizations with more than 500 employees, with the largest bucket (29%) coming from the 1K to 5K employee range. The respondents were primarily from North America (61%), followed by Europe (31%), with the balance from APAC and MENA.

Which of the following best describes your current job? 

Where is your organization headquartered? 

Let’s start by looking at object storage usage. Even we were surprised by what we heard, and we are the biggest champions of object storage in the industry. The question asked was: “Think about all the data your organization has in cloud native storage TODAY. To the best of your knowledge, what percentage of that data is in object storage?”

The average was 70%. The most frequently given answer was 99% and the median was 71%.

Think about all the data your organization has in cloud native storage TODAY. To the best of your knowledge, what percentage of that data is in object storage?

More importantly, these respondents all saw that number growing: “Think about how data storage at your organization is evolving. What is your best guess about what percentage of their data will be in object storage TWO YEARS FROM TODAY?”

The average was 75%. The most frequently given answer was 99% and the median was 80%.

Think about how data storage at your organization is evolving. What is your best guess about what percentage of their data will be in object storage TWO YEARS FROM TODAY?

Object storage is the dominant storage type today and will continue to grow. Many organizations think almost ALL of their data will be in object storage within two years. That will come as a fairly big shock to the SAN/NAS community. Collectively, they dominate the traditional media sources, but the truth is, they are not that relevant and will become less so over time. The reason is unstructured data. It is the fuel for AI. Video, audio, images, log files, telemetry data, time series data. That is what the enterprise produces today and it is stored on object storage. Add the new open table formats to the mix and you can easily deal with structured data as well. That is why databases are building themselves on object storage.

But the story here is AI = object storage. 

The research is clear on this: What are the top three business or technology factors motivating your organization’s adoption of object storage (public or private cloud)?

Support AI 52%

Performance Requirements 49%

Scalability 44%

These are interlocking concepts. Object storage is deployed to support AI, which, guess what, requires performance and scale. If we had been able to offer “performance at scale” as a single option, it would have done even better.

This is the MinIO story. We tell it everywhere and to anyone who will listen.

What are the top three business or technology factors motivating your organization’s adoption of object storage (public or private cloud)? Choose up to three.


The data also tells us that when it comes to value drivers, cost is way down the list. This runs counter to the “cheap, deep and slow” narrative the SAN/NAS vendors would have you believe.

Since we are on the subject of AI, let’s take a look at how it is going to market in the enterprise.

When we asked what workloads used object storage, the answer was what we would have expected. Enterprises are still using object stores as the foundation of their analytics workloads (54%), but increasingly also for AI model training and inference (51%). This is followed by modern data lakes and lakehouses (44%). Then, and only then, do you find traditional workloads like disaster recovery (41%).

What workloads use object storage? Choose all that apply. 

When you drill down a little, you really get a sense of what is driving training data development. This is all AI and it is fairly tightly packed. Application data leads, followed by log data. We would have expected “custom corpus” to come in a little higher, but that could be a function of the technical nature of the term.

What types of training sets does your organization send to object storage for AI analysis? Choose all that apply. 

Having said that, there is consistency in the responses. When we asked what types of workloads they run on object storage, custom corpus came in last… What was more interesting is that the number of enterprises using the public cloud and private cloud for GenAI was effectively the same. 

Do you have plans to build a Data Lakehouse with object storage in the near future?

It seems that everybody wants to build a data lakehouse, however: 92% say they plan to or already have, and 62% plan on doing it in the next year if they don’t have one. That is a pretty clear indication that SAN/NAS need not apply to those workloads.

What types of AI workloads does your org currently run or plan on running on your object store? Choose all that apply:


Last up on the AI front. We asked respondents to name the top three most challenging elements of AI for their organization. Unsurprisingly, Security and Privacy were at the top. 

What are the top three most challenging elements of AI for your organization? Choose up to three:


One of the reasons that enterprises will repatriate is control. Security and Privacy are about control. Data governance is a similar expression of that concern. Understanding what is in your data and who has access to it are core expressions of this control narrative. A number of responses are at the same level and can be generally grouped. For example, fast networking and performant storage speak to the ability to run different types of workloads. Cloud-Native storage speaks to support for containerization, orchestration, RESTful APIs and microservices. SAN/NAS technologies are ill-suited for the cloud native world and you can’t containerize an appliance. Cloud-Native = Software Defined.

There is more, but this is a sneak peek. We have data on drive types, object sizes, who manages object storage, how many FTEs it generally takes to manage a single PB, how many clouds (public and private), network speeds and more. We will have the full report out after Thanksgiving (early December). Be on the lookout.

The key takeaway, however, is that in the enterprise, object storage is primary storage and that AI runs on object storage. This isn’t news to practitioners. It isn’t news to developers. It isn’t news to architects. It probably is news to senior IT leaders who grew up in a world where SAN/NAS were dominant and have a bias towards those technologies and the appliance model. That’s clearly changing, however, and the stakes are incredibly high.

What got you here isn’t going to get you there.

It is time to get on the object storage bandwagon.