MinIO Blog

MinIO Batch Framework Adds Support for Expiry

You can now use the MinIO Batch Framework to perform S3 Delete operations and remove large numbers of objects with a single API request. The MinIO Batch Framework lets you quickly and easily perform repetitive or bulk actions like Batch Replication and Batch Key-Rotate across your MinIO deployment. The MinIO Batch Framework handles all the manual work, including managing retries and reporting

The Blog Year in Review: Top 10 for 2023

Sasha Wodtke

With only a few days left in 2023 (who else can’t believe it?), we have been taking some time to look back on what an amazing year it’s been. There have been so many highlights. Whether it’s been the many awards, conferences, or meeting so many of you, we are eternally grateful! The biggest part of MinIO

Distributed Training and Experiment Tracking with Ray Train, MLflow, and MinIO

Over the past few months, I have written about a number of different technologies (Ray Data, Ray Train, and MLflow). I thought it would make sense to pull them all together and deliver an easy-to-understand recipe for distributed data preprocessing and distributed training using a production-ready MLOps tool for tracking and model serving. This post integrates the code I presented

Recent Launch of Amazon S3 Express One Zone Validates That Object Storage is Primary Storage for AI

Matt Sarrel (@msarrel)

We have made the case for several years that, in modern data stacks, object storage is primary storage. This is even more true in the age of AI, where enterprises focus almost exclusively on object storage. The modern data stack relies on disaggregated compute and storage alongside cloud-native microservices running in containers on Kubernetes. As more enterprises shift to this

Distributed Training with Ray Train and MinIO

Most machine learning projects start off as a single-threaded proof of concept where each task is completed before the next task can begin. The single-threaded ML pipeline depicted below is an example. However, at some point, you will outgrow this pipeline. This may be caused by datasets that no longer fit into the memory of a single process.

Data Science and AI with a SQL Server 2022 Data Lakehouse

Matt Sarrel (@msarrel) on SQL

Microsoft SQL Server 2022 is one of the most commonly implemented enterprise relational databases. Many of the world's most successful companies, regardless of vertical, have significant SQL Server deployments. Thousands of companies have relied on SQL Server for decades. Microsoft has made great strides over the past decade in embracing open-source and standards-compliant technologies. The result is that

Scaling up MinIO Internal Connectivity

Klaus Post on Programming

A MinIO cluster operates as a uniform cluster. This means that any request must be seamlessly handled by any server. As a consequence, servers need to coordinate between themselves. This has so far been handled with traditional HTTP RPC requests, and this has served us well. Whenever server A would like to call server B, an HTTP request would

Airgapped MinIO Deployments

AJ on DevOps

In this post we’ll discuss what an airgapped network is, what to consider when deploying MinIO in such an environment, and how to replicate and scale the deployment with other airgapped sites.

Two Things Can Be True at the Same Time

There is an interesting report out from McKinsey on the impending impact of AI on an enterprise’s cloud investments. There was a quote early on in the piece where McKinsey states: “While the possible impact varies by sector, adopting cloud represents an opportunity for the average company to increase profitability by 20 to 30 percent.” To many, this would

Distributed Data Processing with Ray Data and MinIO

Distributed data processing is a key component of an efficient end-to-end distributed machine-learning training pipeline. This is true if you are building a basic neural network for statistical predictions where distributed training could mean each experiment runs in 10 minutes vs. an hour. It is also true if you are training or fine-tuning a Large Language Model (LLM) where

AI/ML Reproducibility with lakeFS and MinIO

MinIO on AI/ML

This post was written in collaboration with Amit Kesarwani from lakeFS. The reality of running multiple machine learning experiments is that managing them can become unpredictable and complicated, especially in a team environment. What often happens is that during the research process, teams constantly change configuration and data between experiments. For example, try several training sets and several hyperparameter

Event Notifications vs Object Lambda

AJ on Object Lambdas

As we were writing the blogs on Event Notifications and Object Lambda, we found ourselves asking: why are there two different features doing almost the same thing? Or are they? What is the difference between the Greek Lambda and Lightning Bolt?
