Architecting a Modern Data Lake

The Modern Data Lake is one-half data warehouse and one-half data lake, and it uses object storage for everything. Using object storage to build a data warehouse is made possible by Open Table Formats (OTFs) like Apache Iceberg, Apache Hudi, and Delta Lake, which are specifications that, once implemented, make it seamless for object storage to be used as the

Read more...
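The excerpt above describes how OTFs layer table semantics over an object store. As a rough conceptual sketch, not any real format's actual layout or API (the `ObjectStore` and `Table` names here are purely illustrative), the core idea is that a table is a set of immutable data files plus versioned snapshot metadata, and a commit is an atomic advance of the current-metadata pointer:

```python
class ObjectStore:
    """Stand-in for an object store like MinIO or S3: flat keys, immutable blobs."""
    def __init__(self):
        self.objects = {}

    def put(self, key, blob):
        self.objects[key] = blob

    def get(self, key):
        return self.objects[key]


class Table:
    """A 'table' is immutable data files plus a chain of snapshot metadata."""
    def __init__(self, store, name):
        self.store, self.name, self.version = store, name, 0
        store.put(self._meta_key(0), {"files": []})

    def _meta_key(self, version):
        return f"{self.name}/metadata/v{version}.json"

    def append(self, rows):
        # 1. Write a new immutable data file.
        data_key = f"{self.name}/data/file-{self.version + 1}"
        self.store.put(data_key, rows)
        # 2. Write new snapshot metadata listing all live data files.
        old = self.store.get(self._meta_key(self.version))
        self.store.put(self._meta_key(self.version + 1),
                       {"files": old["files"] + [data_key]})
        # 3. "Commit" = atomically advance the current-version pointer.
        self.version += 1

    def scan(self):
        # Readers resolve the current snapshot, then read only its files.
        meta = self.store.get(self._meta_key(self.version))
        return [row for key in meta["files"] for row in self.store.get(key)]


store = ObjectStore()
t = Table(store, "events")
t.append([{"id": 1}])
t.append([{"id": 2}, {"id": 3}])
print(t.scan())  # rows from both immutable data files
```

Because data files are never rewritten in place and readers always go through a snapshot, this design gives object storage the atomic commits and consistent reads a warehouse needs.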

Migrating from Hadoop to a Cloud-Ready Architecture for Data Analytics

This post was a collaboration between Kevin Lambrecht of UCE Systems and Raghav Karnam. The cloud operating model, and Kubernetes specifically, has become the standard for large-scale infrastructure today. More importantly, both are evolving at an exceptional pace, with material impacts on data science, data analytics, and AI/ML. This transition has a significant impact on the Hadoop ecosystem.

Read more...
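One concrete, mechanical step in a Hadoop-to-cloud migration is rewriting `hdfs://` paths in job configs to object-storage URIs. A minimal sketch of that rewrite is below; the bucket name and the choice to drop the NameNode host are illustrative assumptions, since a real migration maps paths explicitly:

```python
from urllib.parse import urlparse

def hdfs_to_s3(uri, bucket="analytics-data"):
    """Rewrite an hdfs:// URI to an s3a:// URI under a target bucket.

    The bucket name and the decision to discard the NameNode host:port
    are assumptions for this sketch, not a general migration rule.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "hdfs":
        return uri  # leave non-HDFS URIs untouched
    return f"s3a://{bucket}{parsed.path}"

print(hdfs_to_s3("hdfs://namenode:8020/warehouse/sales/2023/part-0001"))
# s3a://analytics-data/warehouse/sales/2023/part-0001
```

In practice this kind of rewrite is applied across Hive table locations, Spark job arguments, and Oozie/Airflow configs, paired with a bulk data copy into the bucket.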

Migrating MinIO Cluster Instances with Zero Downtime and Zero Data Loss

With the advent of cloud computing, ephemeral compute instances have become ubiquitous. This introduces a whole set of challenges around managing software, applying DevOps principles, addressing security vulnerabilities, and ensuring automation. These tasks are mission-critical to prevent data theft and service disruption. Addressing security vulnerabilities is particularly challenging because it frequently takes the form of updating and restarting

Read more...
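The general pattern behind zero-downtime updates like the one this post describes is a rolling restart: take nodes down one at a time, and only while enough nodes remain online to serve writes. The sketch below illustrates that invariant with a simple quorum check; the function name and the quorum arithmetic are illustrative, not MinIO's actual erasure-coding rules or tooling:

```python
def rolling_restart_order(nodes, write_quorum):
    """Plan a one-at-a-time restart, verifying that taking each node
    offline still leaves at least `write_quorum` nodes serving traffic.

    Illustrative sketch only: real quorum rules depend on the storage
    system's replication or erasure-coding configuration.
    """
    plan = []
    for node in nodes:
        online_during_restart = len(nodes) - 1  # only this node is down
        if online_during_restart < write_quorum:
            raise RuntimeError(f"restarting {node} would break write quorum")
        plan.append(node)  # safe: restart, wait until healthy, move on
    return plan

nodes = ["minio-1", "minio-2", "minio-3", "minio-4"]
print(rolling_restart_order(nodes, write_quorum=3))
```

The key design point is the health gate between steps: the next restart begins only after the previous node rejoins, so the cluster never dips below quorum for longer than one node's restart window.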