Modern Data Lakes

A collection of 70 posts tagged with "Modern Data Lakes"

A Global Telecommunications Leader and MinIO AIStor: Powering the Next Generation of Data Lakehouse for Analytics and AI

David Koppe on Case Study

1. Executive Summary: Our customer, a global telecommunications leader, established a Data Platform team to transform how data improves customer experiences and business operations. Faced with ballooning data growth and legacy storage constraints, they replaced aging data storage systems with a high-performance, cloud-native data lakehouse built on MinIO’s AIStor. The result: a scalable, cost-efficient foundation ready for AI,


From Data Swamps to Reliable Data Systems: How Iceberg Brought 40 Years of Database Wisdom to Data Lakes


The data lake was once heralded as the future, an infinitely scalable reservoir for all our raw data, promising to transform it into actionable insights. This was a logical progression from databases and data warehouses, each step driven by the increasing demand for scalability. Yet, in embracing the data lake's scale and flexibility, we overlooked a critical difference.


AI ML Architecture: Modern Datalake Reference Guide


An abbreviated version of this post appeared on The New Stack on March 19th, 2024. In enterprise artificial intelligence, there are two main types of models: discriminative and generative. Discriminative models are used to classify or predict data, while generative models are used to create new data. Even though Generative AI has dominated the news of late, organizations are still


ACID Transactions with Iceberg on AIStor

AJ on Apache Iceberg

Pairing the Iceberg table format with AIStor creates a powerful, flexible and extensible lakehouse platform. The Iceberg Table Spec declares a table format that is designed to manage “a large, slow-changing collection” of files or objects stored in a distributed system.


Stop Giving Your Data to Vendors

Brenna Buuck on Databricks

Decouple storage and compute for more control, lower costs, and scalability. Modern datalakes, built on high-performance object storage like MinIO, empower you to handle AI/ML workloads with flexibility and performance, without relying on proprietary platforms.


Architecting a Modern Data Lake


The Modern Datalake is one-half data warehouse and one-half data lake and uses object storage for everything. The use of object storage to build a data warehouse is made possible by Open Table Formats (OTFs) like Apache Iceberg, Apache Hudi, and Delta Lake, which are specifications that, once implemented, make it seamless for object storage to be used as the
