MinIO Blog

AI/ML

A collection of 117 posts tagged with "AI/ML"

An Easier Path to Scalable AI: Intel Tiber Developer Cloud + MinIO Object Store

One of the biggest challenges organizations face today in AI and data management is access to reliable infrastructure and compute resources. The Intel Tiber Developer Cloud is purpose-built for engineers who need an environment for proofs of concept, experimentation, model training, and service deployments. Unlike other clouds, which can be unapproachable and complex, the Intel Tiber Developer Cloud is simple and easy

Read more

Open Source or Closed? The AI Dilemma

This post first appeared on The New Stack on July 29th, 2024. Artificial Intelligence is in the middle of a perfect storm in the software industry, and now Mark Zuckerberg is calling for open-source AI. Three powerful perspectives are colliding on how to control AI: 1. All AI should be open-source for sharing and transparency. 2. Keep AI closed-source and

Read more

The MinIO DataPod: A Reference Architecture for Exascale

The modern enterprise defines itself by its data. This requires a data infrastructure for AI/ML as well as a data infrastructure that is the foundation for a Modern Datalake capable of supporting business intelligence, data analytics, and data science. This is true whether the enterprise is behind, just getting started, or already using AI for advanced insights. For the foreseeable future, this

Read more

Build a Distributed Embedding Subsystem with MinIO, Langchain, and Ray Data

An embedding subsystem is one of four subsystems needed to implement Retrieval Augmented Generation. It turns your custom corpus into a database of vectors that can be searched for semantic meaning. The other subsystems are the data pipeline for creating your custom corpus, the retriever for querying the vector database to add more context to a user query, and finally,

Read more

Bringing ARM into the AI Data Infrastructure Fold at MinIO Using SVE

One of the reasons that MinIO is so performant is that we do the granular work that others will not or cannot. From SIMD acceleration to AVX-512 optimizations, we have done the hard stuff. Recent developments in the ARM CPU architecture, in particular the Scalable Vector Extension (SVE), presented us with the opportunity to deliver significant performance and efficiency gains

Read more

Data-Centric AI with Snorkel and MinIO

With all the talk in the industry today regarding large language models with their encoders, decoders, multi-headed attention layers, and billions (soon trillions) of parameters, it is tempting to believe that good AI is the result of model design only. Unfortunately, this is not the case. Good AI requires more than a well-designed model. It also requires properly constructed training

Read more

The Architect's Guide to Machine Learning Operations (MLOps)

MLOps, short for Machine Learning Operations, is a set of practices and tools aimed at addressing the specific needs of engineers building models and moving them into production. Some organizations start off with a few homegrown tools that version datasets after each experiment and checkpoint models after every epoch of training. On the other hand, many organizations have chosen to

Read more

The Architect’s Guide to the GenAI Tech Stack - Ten Tools

This post first appeared on The New Stack on June 3rd, 2024. I previously wrote about the modern data lake reference architecture, addressing the challenges in every enterprise — more data, aging Hadoop tooling (specifically HDFS) and greater demands for RESTful APIs (S3) and performance — but I want to fill in some gaps.  The modern data lake, sometimes referred to as

Read more

WARP speed your AI data storage Infrastructure

AJ on AI/ML

Do you know the secret to some of the best AI models out there? It's the amount of data they had access to for training. For AI/ML models, fast, accessible data is king. Let me emphasize: it's not just data, but fast, accessible data.

Read more