AI Data Workflows with Kafka and MinIO
AIStor is a foundational component for creating and executing complex data workflows. At the core of this event-driven functionality are MinIO bucket notifications delivered over Kafka.
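Each notification MinIO publishes to Kafka is a JSON document following the S3 event-record schema. A minimal sketch of the consumer-side parsing, assuming that payload shape (the sample message below is a trimmed, invented example; broker and topic wiring are omitted):

```python
import json

def parse_minio_event(raw: bytes):
    """Extract (event name, bucket, object key) from a MinIO
    bucket-notification message in the S3 event-record format."""
    event = json.loads(raw)
    record = event["Records"][0]
    return (
        record["eventName"],
        record["s3"]["bucket"]["name"],
        record["s3"]["object"]["key"],
    )

# A trimmed-down example of the kind of payload MinIO publishes:
sample = json.dumps({
    "Records": [{
        "eventName": "s3:ObjectCreated:Put",
        "s3": {"bucket": {"name": "images"},
               "object": {"key": "cat.png", "size": 1024}},
    }]
}).encode()

print(parse_minio_event(sample))  # ('s3:ObjectCreated:Put', 'images', 'cat.png')
```

A downstream worker would call `parse_minio_event` on each Kafka message to decide which object to fetch and process.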
AutoMQ enhances Kafka's architecture by using MinIO object storage, cutting costs and boosting elasticity while keeping Kafka API compatibility. The combination offers scalable, secure, and efficient data streaming, ideal for diverse cloud environments.
Build a streaming Change Data Capture (CDC) pipeline from Redpanda and MinIO into Snowflake. This solution simplifies data migration and analytics, with Redpanda providing scalability, MinIO serving as efficient storage, and Snowflake acting as a cloud-native analytics engine.
Confluent, Intel, and MinIO conducted benchmarking and certification testing of MinIO tiered object storage for Kafka. This post describes the observations and results of testing MinIO object storage as a backend for the tiered storage feature of Confluent Platform 7.1.0 on servers equipped with third-generation Intel Xeon Scalable processors.
Explore the next generation of data streaming with WarpStream and MinIO! While Apache Kafka has been the standard for streaming data, it may be time to consider a simpler, more cost-effective, and cloud-native solution.
Apache Kafka and Apache Spark are two leading technologies used to build the streaming data pipelines that feed data lakes and lakehouses. At a high level, Kafka streams messages to Spark, where they are transformed into a format that can be consumed by applications and saved to storage.
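The per-message transform stage in such a pipeline can be sketched in plain Python; the field names and payload here are invented for illustration, and in practice this logic would run inside the Spark job:

```python
import json
from datetime import datetime, timezone

def transform(value: bytes) -> dict:
    """Shape of the per-message transform a streaming job applies:
    decode the raw Kafka value, pull out the fields downstream
    applications query, and add a partition column for storage layout."""
    event = json.loads(value)
    ts = datetime.fromtimestamp(event["ts"] / 1000, tz=timezone.utc)
    return {
        "user": event["user"],
        "action": event["action"],
        "event_time": ts.isoformat(),
        "dt": ts.strftime("%Y-%m-%d"),  # typical date partition key
    }

msg = json.dumps({"user": "alice", "action": "login",
                  "ts": 1700000000000}).encode()
print(transform(msg)["dt"])  # 2023-11-14
```

The `dt` column is what lets the sink lay files out as date-partitioned prefixes in object storage, which downstream query engines can prune on.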
Kafka and Spark Structured Streaming are used together to build data lakes and lakehouses fed by streaming data and to provide real-time business insights.
Make your Kafka topics performant and efficient with the Kafka Schema Registry.
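As one illustration of what a schema registry enforces, here is a toy backward-compatibility check over Avro-style record schemas. This is a simplified sketch of the rule that new fields need defaults, not the registry's actual algorithm:

```python
import json

def backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    """Toy check: a new record schema stays backward compatible if every
    field it adds (relative to the old schema) carries a default value,
    so readers on the new schema can still decode old records."""
    old_fields = {f["name"] for f in old_schema["fields"]}
    return all(
        f["name"] in old_fields or "default" in f
        for f in new_schema["fields"]
    )

v1 = json.loads('{"type": "record", "name": "Click", "fields": '
                '[{"name": "url", "type": "string"}]}')
v2 = {**v1, "fields": v1["fields"] + [
    {"name": "user_id", "type": ["null", "string"], "default": None}]}
v3 = {**v1, "fields": v1["fields"] + [
    {"name": "ts", "type": "long"}]}  # no default: breaks old readers

print(backward_compatible(v1, v2))  # True
print(backward_compatible(v1, v3))  # False
```

A registry rejects the `v3` registration up front, so producers never publish messages that existing consumers cannot decode.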
Apache Kafka is an open-source distributed event streaming platform used for building real-time data pipelines and streaming applications. It was originally developed by LinkedIn and is now maintained by the Apache Software Foundation. Kafka is designed to handle high-volume, high-throughput, low-latency data streams, making it a popular choice for building scalable and reliable data pipelines.
Streaming data is a core component of the modern object storage stack. Whether the source of that data is an edge device or an application running in the datacenter, streaming data is quickly outpacing traditional batch processing frameworks. Streaming data includes everything from log files (think Splunk SmartStore), web or mobile applications, autonomous vehicles, and social networks to, of course, financial data.
In the first part of this two-post series, we'll look at how object storage differs from other storage approaches and why it makes sense to use object storage like MinIO for data lakes.