Object tags give you greater power: you can now categorize objects along up to ten dimensions. If you want to add an object, such as a diagram, to a project, all you have to do is tag it appropriately, as in the sketch below.
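A minimal sketch of what that looks like with the MinIO Python SDK; the endpoint, credentials, bucket, object name, and tag keys below are placeholders, not values from the post:

```python
from minio import Minio
from minio.commonconfig import Tags

# Placeholder endpoint and credentials for an S3-compatible deployment.
client = Minio("play.min.io", access_key="YOUR-ACCESS-KEY", secret_key="YOUR-SECRET-KEY")

# Attach up to ten key/value tags to an existing object.
tags = Tags.new_object_tags()
tags["project"] = "apollo"
tags["department"] = "engineering"
client.set_object_tags("my-bucket", "architecture-diagram.png", tags)

# Read the tags back to confirm the object is now categorized.
print(client.get_object_tags("my-bucket", "architecture-diagram.png"))
```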
Read more
To ensure AI success, start by hiring a data engineer, not an AI/ML expert. Learn from our experience and find out why a strong data foundation—focused on object storage, data lakehouses, and optimized pipelines—is critical for scalable, efficient AI/ML workloads.
Read more
Faced with skyrocketing compute costs, MinIO data scientist Archana Vaidyanathan leveraged the power of the data lakehouse, which allows for flexible compute choices without overhauling storage. AIStor enhances this model, delivering speed, scalability, and cost savings.
Read more
Large numbers of small files present big challenges for application performance.
Read more
Pairing the Iceberg table format with AIStor creates a powerful, flexible and extensible lakehouse platform. The Iceberg Table Spec declares a table format that is designed to manage “a large, slow-changing collection” of files or objects stored in a distributed system.
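To make the pairing concrete, here is a rough sketch using PyIceberg against an S3-compatible endpoint; the catalog URI, credentials, namespace, and table name are assumptions for a local setup, not values from the post:

```python
import pyarrow as pa
from pyiceberg.catalog import load_catalog

# Placeholder endpoints and credentials: an Iceberg REST catalog on :8181
# and a MinIO/AIStor S3 API on :9000.
catalog = load_catalog(
    "lakehouse",
    **{
        "uri": "http://localhost:8181",
        "s3.endpoint": "http://localhost:9000",
        "s3.access-key-id": "minioadmin",
        "s3.secret-access-key": "minioadmin",
    },
)

catalog.create_namespace("analytics")

# Create an Iceberg table from a PyArrow schema and append a small batch of rows.
# The data files and metadata all land in object storage.
schema = pa.schema([("id", pa.int64()), ("event", pa.string())])
table = catalog.create_table("analytics.events", schema=schema)
table.append(pa.Table.from_pylist([{"id": 1, "event": "login"}], schema=schema))
```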
Read more
MinIO introduced its conditional write feature long before AWS S3’s recent announcement. This powerful tool offers greater control in high-concurrency environments, ensuring data consistency and reliability, especially in AI and ML workflows.
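For a sense of how a conditional write behaves in practice, here is a hedged sketch using boto3 against an S3-compatible endpoint, assuming a botocore version that exposes the IfNoneMatch parameter; the endpoint, credentials, bucket, and key are placeholders:

```python
import boto3
from botocore.exceptions import ClientError

# Placeholder endpoint and credentials for a local deployment.
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
)

try:
    # IfNoneMatch="*" asks the server to write only if the key does not already
    # exist, so concurrent writers cannot silently overwrite each other.
    s3.put_object(Bucket="checkpoints", Key="epoch-42.ckpt", Body=b"...", IfNoneMatch="*")
except ClientError as err:
    if err.response["Error"]["Code"] == "PreconditionFailed":
        print("Object already exists; another writer won the race.")
    else:
        raise
```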
Read more
Decouple storage and compute, as Databricks CEO Ali Ghodsi advocates, for more control, lower costs, and scalability. Modern datalakes, built on high-performance object storage like MinIO, empower you to handle AI/ML workloads with flexibility and performance, without relying on proprietary platforms.
Read more
Take advantage of cloud native, Kubernetes-oriented, microservices-based architectures with object storage.
Read more
Our client, a global financial institution headquartered in Japan, recently completed an ambitious Hadoop replacement project with MinIO and Dremio. You can see them present it in this talk from Subsurface, but we thought we would write it up as well.
Like most banks, the firm had built out a large Hadoop footprint to power its analytics and risk management
Read more
The rise of lakehouse functionality is reshaping data management. ParadeDB's pg_lakehouse extension lets PostgreSQL integrate with object storage, enabling scalable, secure analytics. This makes the modernization of data infrastructure possible without extensive overhauls. Welcome to the future!
Read more
Amid the AI frenzy, one silent hero powers it all: modern object storage. It may not be glamorous, but it's the backbone of today's data lakes, enabling vast, efficient data management. Discover how AIStor elevates your data infrastructure.
Read more
Catalogs are revolutionizing modern datalakes, with industry giants like Databricks and Snowflake adopting Apache Iceberg’s catalog REST API. A commitment to open standards enhances performance, fosters innovation, and transforms data management for AI and ML.
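As a rough sketch of what the open catalog REST API buys you, the following walks the spec's namespace and table listing endpoints with plain HTTP; the base URL is a placeholder, and it assumes a catalog served without authentication or a URL prefix:

```python
from urllib.parse import quote

import requests

# Placeholder base URL for any catalog that implements the Iceberg REST spec.
BASE = "http://localhost:8181/v1"

# List namespaces, then the tables in each one, using only the open endpoints.
namespaces = requests.get(f"{BASE}/namespaces", timeout=10).json()["namespaces"]
for ns in namespaces:
    # Per the spec, multipart namespace levels are joined with the 0x1F unit
    # separator and percent-encoded in the URL path.
    ns_path = quote("\x1f".join(ns), safe="")
    tables = requests.get(f"{BASE}/namespaces/{ns_path}/tables", timeout=10).json()
    print(ns, "->", [t["name"] for t in tables.get("identifiers", [])])
```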
Read more
This post initially appeared on The New Stack.
For a few years there, the term “private cloud” had a negative connotation. But as we know, technology is more of a wheel than an arrow, and right on cue, the private cloud is getting a ton of attention and it is all positive. The statistics are clear: Forrester’s 2023 Infrastructure
Read more
The semantic layer in modern datalakes provides context and structure to raw data, crucial for key data initiatives like AI model training, data management and data governance. A unified strategy and robust infrastructure are essential for effective implementation of the semantic layer.
Read more
The Modern Datalake is one-half data warehouse and one-half data lake and uses object storage for everything. The use of object storage to build a data warehouse is made possible by Open Table Formats (OTFs) like Apache Iceberg, Apache Hudi, and Delta Lake, which are specifications that, once implemented, make it seamless for object storage to be used as the
Read more
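To make the OTF idea above concrete, here is a small sketch that writes a Delta Lake table straight to an S3-compatible bucket with the deltalake (delta-rs) package; the endpoint, credentials, and table path are assumptions for a local test, not values from the post:

```python
import pyarrow as pa
from deltalake import DeltaTable, write_deltalake

# Placeholder endpoint, credentials, and bucket/table path.
storage_options = {
    "AWS_ENDPOINT_URL": "http://localhost:9000",
    "AWS_ACCESS_KEY_ID": "minioadmin",
    "AWS_SECRET_ACCESS_KEY": "minioadmin",
    "AWS_ALLOW_HTTP": "true",              # plain-HTTP endpoint for a local test
    "AWS_S3_ALLOW_UNSAFE_RENAME": "true",  # skip the external locking layer in this sketch
}

# Writing a PyArrow table produces Parquet data files plus the Delta transaction log,
# which is what lets plain object storage behave like a warehouse table.
data = pa.table({"id": [1, 2, 3], "city": ["Osaka", "Lima", "Oslo"]})
write_deltalake("s3://warehouse/cities", data, storage_options=storage_options)

# Read it back through the table format rather than as raw objects.
dt = DeltaTable("s3://warehouse/cities", storage_options=storage_options)
print(dt.to_pyarrow_table().num_rows)
```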
In this blog, we will demonstrate how to use MinIO to build a Retrieval Augmented Generation (RAG) based chat application using commodity hardware.
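The full walkthrough is in the post; as a bare-bones sketch of the retrieval half, the following pulls documents from a MinIO bucket and ranks them against a question. The endpoint, credentials, and bucket are placeholders, and embed() is a hypothetical stand-in for a real embedding model:

```python
import numpy as np
from minio import Minio

# Placeholder endpoint, credentials, and bucket for a local deployment.
client = Minio("localhost:9000", access_key="minioadmin", secret_key="minioadmin", secure=False)

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for a real embedding model; returns a unit vector.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

# 1. Pull the document corpus out of object storage and embed each document.
docs, vectors = [], []
for obj in client.list_objects("corpus", recursive=True):
    resp = client.get_object("corpus", obj.object_name)
    body = resp.read().decode("utf-8")
    resp.close()
    resp.release_conn()
    docs.append(obj.object_name)
    vectors.append(embed(body))

# 2. Retrieve: rank documents by cosine similarity to the question; in the full
#    application the top hits become the context handed to the chat model.
question = "How do I rotate access keys?"
scores = np.stack(vectors) @ embed(question)
for idx in np.argsort(scores)[::-1][:3]:
    print(docs[idx], round(float(scores[idx]), 3))
```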
Read more
Snowflake's support for external tables has seen significant updates since our last blog post on how to extend your Snowflake implementation with MinIO. External tables allow users of Snowflake to treat data in object storage like MinIO as a read-only table in Snowflake without migration. Snowflake's ongoing enhancements to their external table functionality clearly demonstrate the
Read more
In this tutorial, we'll deploy a cohesive system that allows distributed SQL querying across large datasets stored in MinIO, with Trino leveraging metadata from Hive Metastore and table schemas from Redis.
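A minimal sketch of the querying side with the trino Python client; the host, user, catalog, schema, and table name are placeholders for the stack described above:

```python
from trino.dbapi import connect

# Placeholder connection details: Trino in front of Hive Metastore,
# with the data files themselves living in MinIO.
conn = connect(host="localhost", port=8080, user="analyst", catalog="hive", schema="default")
cur = conn.cursor()

# Trino plans the query against the metastore's metadata and reads the
# underlying Parquet/ORC objects directly from object storage.
cur.execute("SELECT count(*) FROM orders")
print(cur.fetchone()[0])
```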
Read more
Discover RisingWave, an open-source streaming database revolutionizing data lakehouses. Built for speed and scalability, it empowers developers with SQL on streaming data. Unlock the potential of real-time analytics and scalable data processing for your AI initiatives.
Read more
Apache Arrow is an open-source columnar memory format that is vital for modern datalakes. This is because Arrow makes data processing swift and seamless across various systems. Arrow propels AI and analytics by enhancing interoperability and computational efficiency.
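A small PyArrow sketch of the columnar format in action; the column names and the Parquet round-trip are illustrative choices, not taken from the post:

```python
import pyarrow as pa
import pyarrow.parquet as pq

# Build a column-oriented table in Arrow's in-memory format.
table = pa.table({"user_id": [1, 2, 3], "score": [0.91, 0.47, 0.88]})

# Arrow-aware engines can share these buffers without copying; here we simply
# round-trip through Parquet in memory as a demonstration.
buf = pa.BufferOutputStream()
pq.write_table(table, buf)
roundtrip = pq.read_table(pa.BufferReader(buf.getvalue()))

print(roundtrip.schema)
print(roundtrip.to_pydict())
```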
Read more