The Modern Datalake is one-half data warehouse and one-half data lake and uses object storage for everything. The use of object storage to build a data warehouse is made possible by Open Table Formats (OTFs) like Apache Iceberg, Apache Hudi, and Delta Lake, which are specifications that, once implemented, make it seamless for object storage to be used as the
Read more
This post was a collaboration between Kevin Lambrecht of UCE Systems and Raghav Karnam
The cloud operating model, and Kubernetes specifically, has become the standard for large-scale infrastructure today. More importantly, both are evolving at an exceptional pace, with material impacts on data science, data analytics and AI/ML.
This transition has a significant impact on the Hadoop ecosystem.
Read more
You have probably heard of data formats like Parquet, ORC, Avro, Arrow, Protobuf, Thrift and MessagePack. What are they, and how do you choose the right one?
Read more
In this blog post we’ll show you how you can quickly get up and running with MinIO, KES and Vault to fully understand the capabilities of server-side encryption.
Read more
Let open source software simplify your enterprise conversational AI needs, and let MinIO handle storage so that you can enable continuous learning and optimize the knowledge base for a better chatbot experience.
Read more
This post focuses on how Iceberg and MinIO complement each other and how various analytic frameworks (Spark, Flink, Trino, Dremio, and Snowflake) can leverage the two.
Read more
We’ll go over how to set up the infrastructure required for GitHub Enterprise packages and actions to use MinIO as a backend. At a high level, we’ll need running instances of MinIO and GitHub Enterprise.
Read more
With the advent of cloud computing, ephemeral compute instances have become
ubiquitous. This introduces a whole set of challenges around managing the
software, applying DevOps principles, addressing security vulnerabilities and
ensuring automation. These are mission-critical in order to prevent data theft
and service disruption.
Addressing security vulnerabilities is particularly challenging as it frequently
takes the form of updating and restarting
Read more