MinIO Blog

AI/ML

A collection of 118 posts tagged with "AI/ML"

Distributed Data Processing with Ray Data and MinIO

Introduction: Distributed data processing is a key component of an efficient end-to-end distributed machine-learning training pipeline. This is true if you are building a basic neural network for statistical predictions where distributed training could mean each experiment runs in 10 minutes vs. an hour. It is also true if you are training or fine-tuning a Large Language Model (LLM) where

AI/ML Reproducibility with lakeFS and MinIO

MinIO on AI/ML

This post was written in collaboration with Amit Kesarwani from lakeFS. The reality of running multiple machine learning experiments is that managing them can become unpredictable and complicated - especially in a team environment. What often happens is that during the research process, teams constantly change configuration and data between experiments. For example, try several training sets and several hyperparameter

Generative AI for the Enterprise

Introduction: Generative AI represents the latest technique an enterprise can employ to unlock the data trapped within its boundaries. The easiest way to conceptualize what is possible with Generative AI is to imagine a customized Large Language Model - similar to the one powering ChatGPT - running inside your firewall. Now, this custom LLM is not the same as the

An Unintended Consequence of the AI/ML Revolution - Power Shifts in the Enterprise

A lot of ink has been spilled on the significance of the AI/ML technology wave (here are our posts). What doesn’t get attention, but probably should, is how AI/ML is remaking the technology power structure inside the enterprise. As companies reorganize around a data-centric orientation, they are also reorganizing who makes and executes the technology architecture. While

Creating an ML Scenario in SAP Data Intelligence Cloud to Read and Model Data in MinIO

Enterprise customers use MinIO to build data lakehouses to store a wide variety of structured and unstructured data, and work with it using ML and analytics. Data flows into MinIO from across the enterprise, and the S3 API allows applications, such as analytics and AI/ML, to work with it. I previously blogged about building data pipelines with SAP Data

Object Detection Made Simple with MinIO and YOLO

Tl;dr: In this post, we will create a custom image dataset and then train a You-Only-Look-Once (YOLO) model for the ubiquitous task of object detection. We will then implement a system using MinIO Bucket Notifications that can automatically perform inference on a new image. Introduction: Computer vision remains an extremely compelling application of artificial intelligence. Whether it’s recognizing

A Developer’s Introduction to Apache Iceberg using MinIO

Introduction: Open Table Formats (OTFs) are a phenomenon in the data analytics world that has been gaining momentum recently. The promise of OTFs is as a solution that leverages distributed computing and distributed object stores to provide capabilities that exceed what is possible with a Data Warehouse. The open aspect of these formats gives organizations options when it comes to

Anomaly Detection from Log Files: The Performance at Scale Use Case

Moiz Kohari on AI/ML

Driving competitive advantage by employing the best technologies separates great operators from good operators. Discovering the hidden gems in your corporate data and then presenting key actionable insights to your clients will help create an indispensable service for your clients, and isn’t this what every executive wishes to create? Cloud-based data storage (led by the likes of Amazon S3,

MLflow Tracking and MinIO

Introduction: It’s challenging to keep track of machine learning experiments. Let’s say you have a collection of raw files in a MinIO bucket to be used to train and test a model. There will always be multiple ways to preprocess the data, engineer features, and design the model. Given all these options, you will want to run many

AI/ML Best Practices During a Gold Rush

Introduction: The California Gold Rush started in 1848 and lasted until 1855. It is estimated that approximately 300,000 people migrated to California from other parts of the United States and abroad. Economic estimates suggest that, on average, only half made a modest profit. The other half either lost money or broke even. Very few gold seekers made a significant

Parallel ML Experimentation leveraging MinIO & lakeFS

MinIO on AI/ML

Introduction: This post was written in collaboration with Iddo Avneri from lakeFS. Managing the growing complexity of ML models and the ever-increasing volume of data has become a daunting challenge for ML practitioners. Efficient data management and data version control are now critical aspects of successful ML workflows. In this blog post, we delve into the power of parallel ML

Setting up a Development Machine with MLFlow and MinIO

About MLflow: MLflow is an open-source platform designed to manage the complete machine learning lifecycle. Databricks created it as an internal project to address challenges faced in their own machine learning development and deployment processes. MLflow was later released as an open-source project in June 2018. As a tool for managing the complete lifecycle, MLflow contains the following components. * MLflow

Enhance Large Language Models Leveraging RAG and MinIO on cnvrg.io

MinIO on AI/ML

This post was written in collaboration with Harinder Mashiana from cnvrg.io. Large language models (LLMs) have revolutionized the world of technology, offering powerful capabilities for text analysis, language translation, and chatbot interactions. The revolution will heavily impact businesses: according to OpenAI, approximately 80% of the U.S. workforce could have at least 10% of their work tasks affected by

Object Management for AI/ML

Introduction: In a few previous posts on AI/ML, I mentioned that one of the benefits of MinIO is that you have tools for Versioning, Lifecycle Management, Object Locking, Object Retention and Legal Holds. These capabilities have a variety of uses. You may need a simple way to keep track of training experiments. You could also use these features to
