Amid the fervor to adopt AI is a critical and often overlooked truth: the success of any AI initiative is intrinsically tied to the quality, reliability and performance of the underlying data infrastructure. If you don't have the proper foundation, you are limited in what you can build and therefore what you can achieve.
Your data infrastructure
The combination of StarRocks and MinIO offers a cloud-native, flexible, and efficient data architecture for modern enterprises, enabling independent scaling and optimized resource utilization. Read the full tutorial for insights into cloud-native analytics with StarRocks and MinIO.
Explore the integration of Dockerized MinIO with localhost Flask apps. This guide addresses Docker networking challenges, ensuring seamless MinIO and Flask communication for a development environment that closely mirrors production. Dive into practical solutions for robust workflows.
In today’s post, we’ll go deeper into some of the considerations for long-term MinIO management that you need to take into account, so that when Day 2 does roll around 48 hours later you have all your ducks in a row.
There is an interesting report out from McKinsey on the impending impact of AI on an enterprise’s cloud investments.
Early in the piece, McKinsey states: “While the possible impact varies by sector, adopting cloud represents an opportunity for the average company to increase profitability by 20 to 30 percent.”
To many, this would
Introduction
Distributed data processing is a key component of an efficient end-to-end distributed machine-learning training pipeline. This is true if you are building a basic neural network for statistical predictions where distributed training could mean each experiment runs in 10 minutes vs. an hour. It is also true if you are training or fine-tuning a Large Language Model (LLM) where
Discover how Databricks and Apache Iceberg's strides in open table formats influence data portability in the modern data stack. Learn how the shift to a private cloud operating model aligns with this evolution, fostering an adaptable, interoperable data ecosystem.
This post was written in collaboration with Amit Kesarwani from lakeFS.
The reality of running multiple machine learning experiments is that managing them can become unpredictable and complicated - especially in a team environment. What often happens is that during the research process, teams constantly change configuration and data between experiments. For example, try several training sets and several hyperparameter
As we were writing the blogs on Event Notifications and Object Lambda, we found ourselves asking why there are two different features doing almost the same thing. Or are they? What is the difference between the Greek Lambda and the Lightning Bolt?
Unleash data collaboration and quality with Nessie! Learn to manage branches, commits, and merges effortlessly. This guide walks you through deploying Dremio, MinIO, and Nessie, transforming your data engineering with collaborative precision. Dive in to revolutionize your workflows!
Every system needs to be backed up because there are countless ways to lose local filesystem data and configurations. That loss can be devastating – potentially resulting in revenue loss, dissatisfied customers and even costly litigation. The statistics are pretty bleak – sixty percent of businesses that suffer a data loss event close within six months and ninety-three percent of companies that
In today’s post we’ll show you how to configure MinIO as a storage provider and metadata store for Quickwit.
Here we are with our semi-annual critique of KubeCon. We do it for Europe and we do it for North America, and we don’t pull our punches. If you don’t believe us, check out our write-up on Detroit.
This year is very different. Chicago had some sizzle to it. There was buzz. There was unseasonably beautiful weather.
In the Harvard Business Review's recent How Companies Think About Data, Leandro DalleMule and Thomas H. Davenport present "a framework for building a robust data strategy that can be applied across industries and levels of data maturity." The framework draws on their experience at AIG, a global insurance company where Mr. DalleMule is CDO, combined with
Empower regulatory compliance with MinIO Object Lambdas. Seamlessly customize data on-the-fly for cost-effective and efficient data pipelines. Explore the tutorial for real-life scenarios and unleash the power of MinIO's Object Lambdas.
In this post let's take a look at how to set up multiple LXMIN servers backing up to a multi-node multi-drive MinIO cluster.
Introduction
Generative AI represents the latest technique an enterprise can employ to unlock the data trapped within its boundaries. The easiest way to conceptualize what is possible with Generative AI is to imagine a customized Large Language Model - similar to the one powering ChatGPT - running inside your firewall. Now, this custom LLM is not the same as the
Unlock the secrets of migrating modern datalakes to the private cloud. Embrace S3 compatibility, data control, and the ever-evolving landscape for cost-effective data management. Don't miss the journey to enhanced flexibility, efficiency, and the future-proofing of your data ecosystem.
Today we’ll talk about how we use our local lab to test some of the key features and functionality, not only to show you but also, hopefully, to inspire you to elevate the technology and processes in your own lab so that debugging any application becomes a piece of cake.
A lot of ink has been spilled on the significance of the AI/ML technology wave (here are our posts). What doesn’t get attention, but probably should, is how AI/ML is remaking the technology power structure inside the enterprise. As companies reorganize around a data-centric orientation, they are also reorganizing who makes and executes the technology architecture. While