Apache Arrow is an open-source columnar memory format that is vital for modern datalakes. This is because Arrow makes data processing swift and seamless across various systems. Arrow propels AI and analytics by enhancing interoperability and computational efficiency.
Read more
Learn how to get started with Dremio and MinIO on Kubernetes for fast, scalable analytics.
Read more
You must have heard of different data formats like Parquet, ORC, Avro, Arrow, Protobuf, Thrift and MessagePack. What are they and how to choose the right one?
Read more
Arrow here, Arrow there, Arrow everywhere. Seems like currently you can't swing
a dead cat without hitting an article or blog post about Apache Arrow. Most seem
to be addressing a developer audience and are based on a Python and Spark style
development platform. Today I’m going to write about using Apache Arrow with
MinIO from the
Read more
With the introduction of Apache Arrow, language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations, MinIO data lakes can be much more powerful. This article explains how to make use of Apache Arrow by using ArrowRDD.
Read more