Architecting a Modern Data Lake

The Modern Datalake is one-half data warehouse and one-half data lake and uses object storage for everything. The use of object storage to build a data warehouse is made possible by Open Table Formats (OTFs) like Apache Iceberg, Apache Hudi, and Delta Lake, which are specifications that, once implemented, make it seamless for object storage to be used as the

Data-Centric AI with Snorkel and MinIO

With all the talk in the industry today regarding large language models with their encoders, decoders, multi-headed attention layers, and billions (soon trillions) of parameters, it is tempting to believe that good AI is the result of model design only. Unfortunately, this is not the case. Good AI requires more than a well-designed model. It also requires properly constructed training

The Architect’s Guide to Machine Learning Operations (MLOps)

MLOps, short for Machine Learning Operations, is a set of practices and tools aimed at addressing the specific needs of engineers building models and moving them into production. Some organizations start off with a few homegrown tools that version datasets after each experiment and checkpoint models after every epoch of training. On the other hand, many organizations have chosen to

The Architect’s Guide to the GenAI Tech Stack - Ten Tools

This post first appeared on The New Stack on June 3rd, 2024. I previously wrote about the modern data lake reference architecture, addressing the challenges in every enterprise — more data, aging Hadoop tooling (specifically HDFS) and greater demands for RESTful APIs (S3) and performance — but I want to fill in some gaps.  The modern data lake, sometimes referred to as

WARP speed your AI data storage Infrastructure

AJ AJ on AI/ML |

Do you know the secret to some of the best AI models out there? It's the amount of data they had access to for training. For AI/ML models, fast, accessible data is king. Let me emphasize: it's not just data, but fast, accessible data.

Dell ECS Data Movement to MinIO

AJ AJ on Cloud Repatriation |

Dell ECS's “Data Movement”, also called copy-to-cloud, is a feature introduced in ECS 3.8.0.1 that allows you to copy objects from Dell ECS to MinIO, which is rather popular with customers and prospects who are modernizing their storage stack to support their AI data infrastructure requirements.

Integrate MinIO with Keycloak OIDC

AJ AJ on Security |

In this blog post, we’ll show you how to set up MinIO to work with Keycloak. More broadly, it should also give you an idea of how OIDC is configured with MinIO, so you can use it with identity providers other than Keycloak; we use Keycloak here only as an example.

Latest Enhancements to Snowflake External Tables: What You Need to Know

Snowflake's support for external tables has seen significant updates since our last blog post on how to extend your Snowflake implementation with MinIO. External tables allow users of Snowflake to treat data in object storage like MinIO as a read-only table in Snowflake without migration. Snowflake's ongoing enhancements to their external table functionality clearly demonstrate the

MinIO Audit Logs in ElasticSearch in Kubernetes

AJ AJ on AI/ML |

Whether you are on-prem or in the cloud, the cloud operating model calls for processes that are set up in a homogeneous way. This tutorial gives you a full overview of how to surface MinIO audit logs in ElasticSearch so they are searchable.

Introducing Technical Certifications at MinIO

We are excited to announce our first technical certification, the MinIO Certified Administrator - Practitioner. The MinIO certified professional program is designed to validate an individual's practical skills in administering MinIO. For the practitioner-level exam, candidates will need working knowledge of all core features and capabilities, including deployment, bucket creation, versioning, lifecycle management, replication, encryption, and authentication,
