Spark Structured Streaming With Kafka and MinIO

Kafka and Spark Structured Streaming are used together to build data lakes and lakehouses fed by streaming data and to provide real-time business insights.
Make your Kafka topics performant and efficient with Kafka Schema Registry.
Apache Kafka is an open-source distributed event streaming platform used for building real-time data pipelines and streaming applications. It was originally developed at LinkedIn and is now maintained by the Apache Software Foundation. Kafka is designed to handle high-volume, high-throughput, low-latency data streams, making it a popular choice for building scalable and reliable data
Build your on-prem data lake with Apache Iceberg, Dremio and MinIO.
Learn how to get started with Dremio and MinIO on Kubernetes for fast, scalable analytics.
In this blog post, we will build a notebook that uses MinIO as object storage for Spark jobs that manage Iceberg tables.
Apache Spark and MinIO are powerful tools for data lakes and analytics. Learn how to run them in Kubernetes.