Announcing General Availability of MinIO AIStor Tables
AIStor Tables: The first data store with Apache Iceberg™ V3 built in, unifying both tables and objects for analytics and AI at scale
Read more
A collection of 25 posts tagged with "Apache Iceberg"
Apache Iceberg V3 boosts performance with deletion vectors that move delete costs to write time, row-level lineage for accurate incremental processing, variant types for efficient JSON queries, and native geospatial types with storage-level pruning. AIStor is the first to support V3.
Read more
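To make the delete-at-write-time idea concrete, here is a minimal PySpark sketch of the merge-on-read delete path that V3 deletion vectors accelerate. It assumes a Spark session with a recent Iceberg runtime that accepts format version 3 and a catalog named demo; the table, columns, and filter are illustrative.

```python
# Minimal sketch of row-level deletes with merge-on-read, the write path
# that Iceberg V3 deletion vectors speed up. Assumes an Iceberg runtime
# recent enough to accept format-version 3; names are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-v3-deletes").getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS demo.sales.orders (
        order_id BIGINT,
        status   STRING,
        payload  STRING
    ) USING iceberg
    TBLPROPERTIES (
        'format-version' = '3',                 -- opt in to the V3 spec
        'write.delete.mode' = 'merge-on-read'   -- record deletes instead of rewriting files
    )
""")

# The cost is paid at write time: matching rows are flagged as deleted
# rather than rewriting the data files that contain them.
spark.sql("DELETE FROM demo.sales.orders WHERE status = 'cancelled'")

# Readers skip the flagged rows automatically when scanning.
spark.sql("SELECT count(*) FROM demo.sales.orders").show()
```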
On-premises data should participate in cloud AI/ML workflows without being copied anywhere. Delta Sharing should be embedded directly into object storage, eliminating separate servers and infrastructure. Data stays put. Queries travel.
Read more
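For the consumer side, a small sketch with the open-source delta-sharing Python client; the profile file and the share/schema/table names are placeholders, not an AIStor-specific interface.

```python
# Consumer-side sketch of Delta Sharing: the client discovers shared
# tables and reads them straight from the provider's object storage via
# short-lived URLs, so no second copy of the data is maintained.
# "config.share" and the share/schema/table names are placeholders.
import delta_sharing

profile = "config.share"  # JSON profile describing the sharing endpoint and token
client = delta_sharing.SharingClient(profile)

# See what the provider has exposed under this profile.
for table in client.list_all_tables():
    print(table)

# Load one shared table as a pandas DataFrame without copying it into
# another storage system first.
df = delta_sharing.load_as_pandas(f"{profile}#analytics_share.telemetry.events")
print(df.head())
```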
1. Executive Summary: Our customer, a global telecommunications leader, established a Data Platform team to transform how data improves customer experiences and business operations. Faced with ballooning data growth and legacy storage constraints, they replaced their aging storage systems with a high-performance, cloud-native data lakehouse built on MinIO’s AIStor. The result: a scalable, cost-efficient foundation ready for AI, …
Read more
AIStor Tables brings Iceberg catalogs natively into on-prem object storage. It simplifies data organization, enforces table-aware security, and lets AI teams catalog unstructured assets in structured tables, making those assets discoverable.
Read more
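As an illustration of that cataloging pattern (not AIStor's own API), here is a hedged PyIceberg sketch that registers object metadata in an Iceberg table behind a REST catalog; the endpoint, token, namespace, and schema are all assumptions.

```python
# Sketch: catalog unstructured objects (images, documents, audio) as rows
# in an Iceberg table served by a REST catalog, so they can be discovered
# with ordinary table queries. Endpoint, token, and names are placeholders,
# not an AIStor-specific API.
import pyarrow as pa
from pyiceberg.catalog import load_catalog
from pyiceberg.schema import Schema
from pyiceberg.types import NestedField, StringType, LongType

catalog = load_catalog(
    "lakehouse",
    **{
        "type": "rest",
        "uri": "https://iceberg-catalog.example.com",  # placeholder endpoint
        "token": "REPLACE_ME",
    },
)

catalog.create_namespace("ml")  # skip if the namespace already exists

asset_schema = Schema(
    NestedField(1, "bucket", StringType(), required=False),
    NestedField(2, "key", StringType(), required=False),
    NestedField(3, "size_bytes", LongType(), required=False),
    NestedField(4, "content_type", StringType(), required=False),
)
assets = catalog.create_table("ml.raw_assets", schema=asset_schema)

# Append one batch of object metadata; AI teams can now find assets
# with a plain table filter instead of listing buckets.
assets.append(pa.table({
    "bucket": ["training-data"],
    "key": ["images/cat_0001.jpg"],
    "size_bytes": [204800],
    "content_type": ["image/jpeg"],
}))
```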
The format war is over, and Iceberg won. Every major engine now supports it, from Snowflake to Spark. Built for object stores, Iceberg delivers scale, consistency, and simplicity. It is the unified foundation for enterprise AI and analytics. Future-proof your data with Iceberg.
Read more
Apache Iceberg 1.9.0 closes the gap with Delta Lake, adding row-level ops with lineage, JSON-friendly variant type, geospatial support, REST catalog upgrades, and easier Delta migration. Dropping Hadoop 2 and Spark 3.3 signals a modern focus, driving open table format convergence.
Read more
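For the variant type specifically, a hedged sketch of what querying semi-structured data could look like, assuming both the engine and the Iceberg runtime already support V3 variant columns; the parse_json and variant_get calls follow Spark's variant functions, and the table is illustrative.

```python
# Sketch of the V3 variant type: keep raw JSON queryable without locking
# in a schema. Assumes a Spark build with VARIANT support and an Iceberg
# runtime that can write V3 tables; names are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-variant").getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS demo.events.clicks (
        event_id BIGINT,
        payload  VARIANT
    ) USING iceberg
    TBLPROPERTIES ('format-version' = '3')
""")

spark.sql("""
    INSERT INTO demo.events.clicks
    SELECT 1, parse_json('{"page": "/pricing", "ms": 42}')
""")

# Pull typed fields out of the variant at query time instead of parsing
# JSON strings row by row.
spark.sql("""
    SELECT variant_get(payload, '$.page', 'string') AS page,
           variant_get(payload, '$.ms', 'int')      AS ms
    FROM demo.events.clicks
""").show()
```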
The data lake was once heralded as the future, an infinitely scalable reservoir for all our raw data, promising to transform it into actionable insights. This was a logical progression from databases and data warehouses, each step driven by the increasing demand for scalability. Yet, in embracing the data lake's scale and flexibility, we overlooked a critical difference.
Read more
Apache Iceberg is significantly transforming modern data lakes. Its introduction to object storage platforms has been celebrated for delivering ACID transactions, strong schema evolution, and warehouse-like reliability to data lake architectures. The Iceberg Catalog API standard is crucial to this transformation, as it ensures that various tools can consistently discover tables and execute atomic transactions once a compliant catalog service …
Read more
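As a concrete example of that discovery path, a short PyIceberg sketch against a REST catalog; the catalog name, endpoint, token, and table identifier are placeholders.

```python
# Sketch of table discovery through the Iceberg REST Catalog API with
# PyIceberg. Any spec-compliant catalog service should answer these
# calls; the endpoint, token, and identifiers are placeholders.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "analytics",
    **{
        "type": "rest",
        "uri": "https://iceberg-catalog.example.com",
        "token": "REPLACE_ME",
    },
)

# Enumerate namespaces and tables exactly as a query engine would.
for namespace in catalog.list_namespaces():
    for identifier in catalog.list_tables(namespace):
        print(identifier)

# Loading a table resolves its current metadata pointer; commits made
# through the catalog are atomic swaps of that pointer.
orders = catalog.load_table("sales.orders")
print(orders.current_snapshot())
```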
In data engineering, open standards are foundational for building interoperable, evolvable, and non-proprietary systems. Apache Iceberg, an open table format, is a prime example. Along with compute, Iceberg brings structure and reliability to data lakes. When coupled with high-performance object storage like MinIO AIStor, Iceberg unlocks new avenues for creating next-generation, high-performance, cost-effective, and scalable architectures. However, this powerful table …
Read more
Apache Iceberg has significantly reshaped how organizations manage and interact with massive structured analytical datasets inside object storage. It brings database-like reliability and powerful features such as ACID transactions, schema evolution, and time travel. Although these features are commonly emphasized, the Iceberg Catalog API is what makes these tables accessible. The Iceberg Catalog API is a centralized interface for managing …
Read more
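To make "managing" concrete, a hedged sketch of an atomic schema change committed through the catalog with PyIceberg; the catalog name, endpoint, and table are placeholders.

```python
# Sketch of table management through the Catalog API: a schema change
# committed atomically, so concurrent readers see either the old schema
# or the new one, never a half-applied change. Names are placeholders.
from pyiceberg.catalog import load_catalog
from pyiceberg.types import StringType

catalog = load_catalog(
    "analytics",
    **{"type": "rest", "uri": "https://iceberg-catalog.example.com", "token": "REPLACE_ME"},
)
orders = catalog.load_table("sales.orders")

# The context manager stages the change and commits it on exit.
with orders.update_schema() as update:
    update.add_column("channel", StringType(), doc="order source channel")
```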
Cloud lakehouses break the bank at scale and compromise control. On-prem Iceberg lakehouses deliver speed, savings, and sovereignty. From cancer research to finance, real-world deployments prove it: petabyte-scale performance, full control, and lower TCO are within reach.
Read more
Choosing the right open table format (Apache Iceberg, Delta Lake, or Apache Hudi) can make or break your data lakehouse. This guide breaks down their strengths, how they integrate with object storage, and which one is best for AI, analytics, and real-time workloads.
Read more
Discover the power of Apache Iceberg and AIStor in transforming data lakehouses! From multi-engine compatibility to time travel, schema evolution, and blazing-fast performance, this guide dives deep into how Iceberg unlocks the full potential of modern AI and analytics workloads.
Read more
Pairing the Iceberg table format with AIStor creates a powerful, flexible and extensible lakehouse platform. The Iceberg Table Spec declares a table format that is designed to manage “a large, slow-changing collection” of files or objects stored in a distributed system.
Read more
Iceberg is shifting the market's focus to scalable, cloud-native storage. This shift is leading to the commoditization of query engines, offering users more flexibility, better pricing, and innovation.
Read more
Catalogs are revolutionizing modern data lakes, with industry giants like Databricks and Snowflake adopting Apache Iceberg’s catalog REST API. A commitment to open standards enhances performance, fosters innovation, and transforms data management for AI and ML.
Read more
Databricks' acquisition of Tabular, founded by the creators of Apache Iceberg, underscores the importance of open frameworks in modern data lake design. Open frameworks ensure interoperability, flexibility, and simplicity, benefiting those leveraging data for AI.
Read more
Explore modern data architecture with Iceberg, Tabular, and MinIO. Learn to seamlessly integrate structured and unstructured data, optimize AI/ML workloads, and build a high-performance, cloud-native data lake.
Read more
Discover how Databricks and Apache Iceberg's strides in open table formats influence data portability in the modern data stack. Learn how the shift to a private cloud operating model aligns with this evolution, fostering an adaptable, interoperable data ecosystem.
Read more