The Challenge in Big Data is Small Files
Large numbers of small files present big challenges for application performance.
When you think about object storage workloads and storage types, databases are increasingly a core workload. The changes are driven by two forces: the availability of high-performance object storage and the explosive growth of data and specifically its associated metadata. Because of these two forces, almost every major database vendor now includes S3-compatible endpoints. Further, for many
The regulatory landscape is evolving rapidly, and the upcoming Digital Operational Resilience Act (DORA) in Europe is a testament to this dynamic change. We have multiple European banking customers and each one is approaching the problem from a slightly different angle with one exception - almost all of them are using modern object storage as the foundational layer. For IT
The modern enterprise defines itself by its data. This requires a data infrastructure for AI/ML as well as a data infrastructure that is the foundation for a Modern Datalake capable of supporting business intelligence, data analytics, and data science. This is true if they are behind, getting started or using AI for advanced insights. For the foreseeable future, this
This post initially appeared on The New Stack. For a few years there, the term “private cloud” had a negative connotation. But as we know, technology is more of a wheel than an arrow, and right on cue, the private cloud is getting a ton of attention and it is all positive. The statistics are clear: Forrester’s 2023 Infrastructure
The Modern Datalake is one-half data warehouse and one-half data lake and uses object storage for everything. The use of object storage to build a data warehouse is made possible by Open Table Formats (OTFs) like Apache Iceberg, Apache Hudi, and Delta Lake, which are specifications that, once implemented, make it seamless for object storage to be used as the
In this blog, we will demonstrate how to use MinIO to build a Retrieval Augmented Generation (RAG) based chat application using commodity hardware.
This post first appeared on The New Stack on June 3rd, 2024. I previously wrote about the modern data lake reference architecture, addressing the challenges in every enterprise — more data, aging Hadoop tooling (specifically HDFS) and greater demands for RESTful APIs (S3) and performance — but I want to fill in some gaps. The modern data lake, sometimes referred to as
An abbreviated version of this post appeared on The New Stack on March 19th, 2024. In enterprise artificial intelligence, there are two main types of models: discriminative and generative. Discriminative models are used to classify or predict data, while generative models are used to create new data. Even though Generative AI has dominated the news of late, organizations are still
I suspect some folks will accuse me of clickbait titling. Others will say that’s not really a reach - most folks will fail in their initial AI attempts, but it doesn’t matter and the learnings are worth it. On some level both are right - but I think WHY enterprises will fail is worth exploration and may allow
Unleash data collaboration and quality with Nessie! Learn to manage branches, commits, and merges effortlessly. This guide walks you through deploying Dremio, MinIO, and Nessie, transforming your data engineering with collaborative precision. Dive in to revolutionize your workflows!
In this post, let's take a look at how to set up multiple LXMIN servers backing up to a multi-node multi-drive MinIO cluster.
Unlock the secrets of modern datalake migration to the private cloud. Embrace S3 compatibility, data control, and the ever-evolving landscape for cost-effective data management. Don't miss the journey to enhanced flexibility, efficiency, and the future-proofing of your data ecosystem.
This is your symphony for data excellence. Explore the components of this modern data stack, including storage, data integration, transformation, data observability, data discovery, data visualization, data analytics, and machine learning.
Unlock the true potential of your cloud migration journey! Learn how embracing the cloud as an operating model, rather than a location, can revolutionize your technology approach. Find out why portability, the right tools, and open standards are your keys to success.
Build a streaming Change Data Capture (CDC) pipeline with Redpanda and MinIO into Snowflake. This solution simplifies data migration and analytics, with Redpanda offering scalability, MinIO as efficient storage, and Snowflake as a cloud-native analytics engine.
Email is the ultimate performance-at-scale use case as it generally only goes up in terms of data volume. Further, the more data that’s stored, the more valuable the data becomes. MinIO’s multi-site active-active replication focuses on keeping the cluster in top performance.
We were recently asked by a journalist to help frame the challenges and complexity of the hybrid cloud for technology leaders. While we suspect many technologists have given this a fair amount of thought, we also know from first-hand discussions with customers and community members that this is still an area of significant inquiry. We wanted to summarize that thinking
Most developers, engineers, architects and DevOps folks know MinIO. Not all know that the only thing we do is software-defined object storage. We don’t do file or block. We don’t offer a service; it is self-hosted. Our focus is singular. The result is that our object store is objectively, based on adoption, awards and customer feedback, the best
In this post, we’ll talk about Erasure Coding and Erasure Sets, and then dive deeper into how to use the Erasure Code Calculator when designing deployments to make the most out of MinIO by opting for the right hardware configuration from the get-go.