MinIO Blog (Page 10)

Replication Strategies Deep Dive

AJ on DevOps | 6 February 2024

With so many types of replication floating around, one has to wonder which replication strategy to use where. Today we’ll demystify these replication strategies and see which one should be used in which scenario.

Backing Up Weaviate with MinIO S3 Buckets

David Cannan on AI/ML | 6 February 2024

Explore integrating MinIO with Weaviate using Docker Compose for AI-enhanced data management. Learn to back up Weaviate to MinIO S3 buckets, ensuring data integrity and scalability with practical Docker and Python examples. Streamline your AI-driven search and analysis with this robust setup.

SQL Server 2022 Machine Learning Services Unlock the Value of Your Data

Matt Sarrel @msarrel on Integrations | 6 February 2024

Learn how to run Python stored procedures on SQL Server 2022.

MinIO and Apache Tika: A Pattern for Text Extraction

Sidharth Rajaram @sidharrrrrth on AI/ML | 2 February 2024

TL;DR: In this post, we will use MinIO Bucket Notifications and Apache Tika for document text extraction, which is at the heart of critical downstream tasks like Large Language Model (LLM) training and Retrieval Augmented Generation (RAG). The Premise: Let’s say that I want to construct a dataset of text that I can then use to fine-tune an

Hungry GPUs Need Fast Object Storage

Keith Pijanowski on AI/ML | 31 January 2024

A chain is as strong as its weakest link - and your AI/ML infrastructure is only as fast as your slowest component. If you train machine learning models with GPUs, then your weak link may be your storage solution. The result is what I call the “Starving GPU Problem.” The Starving GPU problem occurs when your network or your

Why Your Enterprise AI Strategy Is Likely to Fail in 2024: Model Down vs. Data Up

Jonathan Symonds on AI/ML | 30 January 2024

I suspect some folks will accuse me of clickbait titling. Others will say, that’s not really a reach - most folks will fail in their initial AI attempts but it doesn’t matter and the learnings are worth it. On some level both are right - but I think WHY enterprises will fail is worth exploration and may allow

Innovating S3 Bucket Retrieval: Langchain Community S3 Loaders with OpenAI API

David Cannan on AI/ML | 30 January 2024

Explore the synergy of MinIO, Langchain, and OpenAI in enhancing data storage and processing. This article illustrates MinIO’s integration for efficient document summarization using Langchain and OpenAI’s GPT, revolutionizing AI and ML data handling.

Supercharge TileDB Engine with MinIO

AJ on Vector Database | 30 January 2024

MinIO makes a powerful primary TileDB backend because both are built for performance and scale.

Data Before Models: The Unsung Heroes Who Unlock Real AI Results

Brenna Buuck on AI/ML | 29 January 2024

Explore the essential role of Data Engineers in unleashing the true power of AI! Data Engineers lay a critical foundation by cleaning and structuring raw data for ML success. Learn why their expertise in data infrastructure, feature engineering, and pipeline optimization is indispensable.

The Strengths, Weaknesses and Dangers of LLMs

Sidharth Rajaram @sidharrrrrth, Keith Pijanowski on AI/ML | 25 January 2024

Much has been said lately about the wonders of Large Language Models (LLMs). Most of these accolades are deserved. Ask ChatGPT to describe the General Theory of Relativity and you will get a very good (and accurate) answer. However, at the end of the day ChatGPT is still a computer program (as are all other LLMs) that is blindly executing

We Read Google’s New Egress Policy So You Don’t Have To…It Is Surprising

Matt Sarrel @msarrel on GCP | 24 January 2024

Google recently announced that it would eliminate data egress fees for those leaving the platform. Given our position on the cloud operating model and the lifecycle of the cloud, this appeared to be a major announcement. It is not. You could understand our initial enthusiasm. Google stated that any "customers who wish to stop using Google Cloud and migrate

Event-Driven Architecture: MinIO Event Notification Webhooks using Flask

David Cannan on Events | 23 January 2024

Explore deploying MinIO and Flask with Docker Compose for event-driven architecture. Master MinIO bucket events and Flask webhooks for efficient data workflows and robust applications. Dive into the synergy of cloud technologies.

Locking down MinIO Operator Permissions

AJ on Kubernetes | 23 January 2024

In this post, we’ll show you how to configure the MinIO Operator with the most restrictive namespace permissions – all the while being able to fully utilize the power and flexibility of the MinIO Operator for day-to-day operations.

Everything You Need to Know to Repatriate from AWS S3 to MinIO

Matt Sarrel @msarrel on Operator's Guide | 22 January 2024

Step-by-step instructions to plan for and migrate data off AWS S3 and onto MinIO on-premises.

Debugging MinIO Installs

AJ on DevOps | 19 January 2024

In this blog post, we’ll show you how to debug a MinIO install running in Kubernetes, as well as some of the common issues you might encounter during a bare-metal installation and how to fix them.

Never Say Die: Persistent Data with a CDC MinIO Sink for CockroachDB

Brenna Buuck on Databases | 18 January 2024

Learn how to integrate MinIO into your Enterprise CockroachDB instance as a changefeed sink, ensuring durability and scalability. This guide enables an enterprise-grade CDC strategy, vital for real-time data fabrics, analytics, and machine learning.

Understanding True Costs - Hardware and Software for 10PB

Dudley Nostrand on Value Engineering | 18 January 2024

We had a conversation with the CIO of a major bank the other day. They are one of the global systemically important banks - the biggest of the big. The CIO had decided to bring in MinIO as the object store for a data analytics initiative. This deployment collects data from mortgage, transactional and news platforms to run Spark and

Building an S3 Compliant Stock Market Data Lake with MinIO

Keith Pijanowski on Delta Lake | 18 January 2024

In all my previous posts on MinIO, where I had to write code, I used MinIO’s Python SDK, which is documented here. I prefer this SDK because it is easy to use and it provides programmatic access to MinIO’s enterprise features, such as Lifecycle Management, Object Locking, Bucket Notifications, and Site Replication. (I showed how to set up

Renewing KES certificate

AJ on Security | 18 January 2024

In this post we'll show you some of the common errors you can run into when the certs in KES expire, and how to renew and update the certs quickly.

Backing Up SQL Server 2022 Databases to MinIO

Matt Sarrel @msarrel on BC/DR | 17 January 2024

Learn how to back up SQL Server 2022 to MinIO on-premises.