Spark Structured Streaming With Kafka and MinIO

Kafka and Spark Structured Streaming are used together to build data lakes and lakehouses fed by streaming data and to deliver real-time business insights.
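As a minimal sketch of that pattern, the PySpark job below reads a Kafka topic as a streaming DataFrame and continuously appends it to a MinIO bucket as Parquet over the S3A connector. The broker address, MinIO endpoint, credentials, topic, and bucket names are placeholders, and the job assumes the spark-sql-kafka and hadoop-aws packages are already on the Spark classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Endpoint, credentials, bucket, and topic names below are illustrative.
spark = (
    SparkSession.builder
    .appName("kafka-to-minio")
    .config("spark.hadoop.fs.s3a.endpoint", "http://minio:9000")
    .config("spark.hadoop.fs.s3a.access.key", "minioadmin")
    .config("spark.hadoop.fs.s3a.secret.key", "minioadmin")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# Read the Kafka topic as an unbounded streaming DataFrame.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")
    .option("subscribe", "events")
    .option("startingOffsets", "earliest")
    .load()
    .select(col("key").cast("string"), col("value").cast("string"), "timestamp")
)

# Continuously append the stream to a MinIO bucket as Parquet files;
# the checkpoint location lets the query recover cleanly on restart.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3a://datalake/events/")
    .option("checkpointLocation", "s3a://datalake/checkpoints/events/")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```

In practice the raw Kafka value column would usually be parsed (for example with from_json) into a typed schema before landing in the lake, but the shape of the pipeline stays the same.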
Make your Kafka topics performant and efficient with Kafka Schema Registry.
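As one hedged illustration of how that looks in code (the registry URL, broker address, topic name, and record schema here are all made up for the example), a confluent-kafka-python producer can serialize records as Avro and register the schema with Schema Registry, so producers and consumers share a single, enforced contract per topic:

```python
from confluent_kafka import SerializingProducer
from confluent_kafka.serialization import StringSerializer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer

# Example Avro schema; real topics would evolve and version this in Schema Registry.
schema_str = """
{
  "type": "record",
  "name": "Trade",
  "fields": [
    {"name": "symbol", "type": "string"},
    {"name": "price", "type": "double"}
  ]
}
"""

# Registry and broker addresses are placeholders for this sketch.
schema_registry = SchemaRegistryClient({"url": "http://localhost:8081"})
avro_serializer = AvroSerializer(schema_registry, schema_str)

producer = SerializingProducer({
    "bootstrap.servers": "localhost:9092",
    "key.serializer": StringSerializer("utf_8"),
    "value.serializer": avro_serializer,
})

# Records that do not match the registered schema fail at serialization time.
producer.produce(topic="trades", key="AAPL", value={"symbol": "AAPL", "price": 189.5})
producer.flush()
```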
Apache Kafka is an open-source distributed event streaming platform that is used for building real-time data pipelines and streaming applications. It was originally developed by LinkedIn and is now maintained by the Apache Software Foundation. Kafka is designed to handle high-volume, high-throughput, and low-latency data streams, making it a popular choice for building scalable and reliable data pipelines.
MinIO is a foundational component for creating and executing complex data workflows. At the core of this event-driven functionality are MinIO bucket notifications using Kafka.
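A rough sketch of wiring that up with the MinIO Python SDK follows; it assumes the MinIO server already has a Kafka notification target configured (referenced here by the ARN arn:minio:sqs::PRIMARY:kafka), and the endpoint, credentials, and bucket name are placeholders:

```python
from minio import Minio
from minio.notificationconfig import NotificationConfig, QueueConfig

# Endpoint and credentials are illustrative; the Kafka target ("PRIMARY")
# must already be defined in the MinIO server configuration.
client = Minio(
    "minio:9000",
    access_key="minioadmin",
    secret_key="minioadmin",
    secure=False,
)

# Publish an event to the Kafka target for every object created in the bucket.
config = NotificationConfig(
    queue_config_list=[
        QueueConfig(
            "arn:minio:sqs::PRIMARY:kafka",
            ["s3:ObjectCreated:*"],
            config_id="new-objects",
        ),
    ],
)
client.set_bucket_notification("datalake", config)
```

Downstream consumers can then subscribe to the configured Kafka topic and react to each uploaded object, which is what makes the bucket the trigger point for event-driven workflows.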
Streaming data is a core component of the modern object storage stack. Whether the source of that data is an edge device or an application running in the datacenter, streaming data is quickly outpacing traditional batch processing frameworks. Streaming data includes everything from log files (think Splunk SmartStore), web or mobile applications, autonomous vehicles, social networks and, of course, financial data.
In the first part of this two-post series, we'll take a look at how object storage is different from other storage approaches and why it makes sense to leverage object storage like MinIO for data lakes.