Distributed Data Processing with Ray Data and MinIO

Introduction: Distributed data processing is a key component of an efficient end-to-end distributed machine-learning training pipeline. This is true if you are building a basic neural network for statistical predictions, where distributed training could mean each experiment runs in 10 minutes vs. an hour. It is also true if you are training or fine-tuning a Large Language Model (LLM) where

Read more...

AI/ML Reproducibility with lakeFS and MinIO

This post was written in collaboration with Amit Kesarwani from lakeFS. The reality of running multiple machine learning experiments is that managing them can become unpredictable and complicated - especially in a team environment. What often happens is that during the research process, teams constantly change configuration and data between experiments. For example, try several training sets and several hyperparameter

Read more...

How to Back Up with Restic and MinIO

Every system needs to be backed up because there are countless ways to lose local filesystem data and configurations. That loss can be devastating – potentially resulting in revenue loss, dissatisfied customers and even costly litigation. The statistics are pretty bleak – sixty percent of businesses that suffer a data loss event close within six months and ninety-three percent of companies that

Read more...

Generative AI for the Enterprise

Introduction: Generative AI represents the latest technique an enterprise can employ to unlock the data trapped within its boundaries. The easiest way to conceptualize what is possible with Generative AI is to imagine a customized Large Language Model - similar to the one powering ChatGPT - running inside your firewall. Now, this custom LLM is not the same as the

Read more...

An Unintended Consequence of the AI/ML Revolution - Power Shifts in the Enterprise

A lot of ink has been spilled on the significance of the AI/ML technology wave (here are our posts). What doesn’t get attention, but probably should, is how AI/ML is remaking the technology power structure inside the enterprise. As companies reorganize around a data-centric orientation, they are also reorganizing who makes and executes the technology architecture. While

Read more...