Introduction
Open Table Formats (OTFs) have recently been gaining momentum in the data analytics world. OTFs promise a solution that leverages distributed computing and distributed object stores to provide capabilities that exceed what is possible with a data warehouse. The open aspect of these formats gives organizations options when it comes to
Read more
We were recently asked by a journalist to help frame the challenges and complexity of the hybrid cloud for technology leaders. While we suspect many technologists have given this a fair amount of thought, we also know from first-hand discussions with customers and community members that this is still an area of significant inquiry. We wanted to summarize that thinking
Read more
Employing the best technologies to drive competitive advantage separates great operators from good ones. Discovering the hidden gems in your corporate data and then presenting key actionable insights to your clients will help create an indispensable service, and isn’t this what every executive wishes to create?
Cloud-based data storage (led by the likes of Amazon S3,
Read more
Introduction
It’s challenging to keep track of machine learning experiments. Let’s say you have a collection of raw files in a MinIO bucket to be used to train and test a model. There will always be multiple ways to preprocess the data, engineer features, and design the model. Given all these options, you will want to run many
Read more
Introduction
The California Gold Rush started in 1848 and lasted until 1855. It is estimated that approximately 300,000 people migrated to California from other parts of the United States and abroad. Economic estimates suggest that, on average, only half made a modest profit. The other half either lost money or broke even. Very few gold seekers made a significant
Read more
Between the public cloud and your data center exists a middle ground where you can have full control over infrastructure hardware without the high initial investment.
Read more
If S3 costs are burning a hole in your pocket, then it's time to start thinking about running MinIO on-premises for your private cloud.
Read more
MinIO has partners across the ecosystem - from our cloud partnerships with AWS, GCP, Azure and IBM to more solution-focused partnerships like Snowflake and Dremio. We are pleased to add UCE Systems to our roster of solutions-based partnerships.
UCE is a leading consulting firm focused on modern data platforms (like the aforementioned Dremio). UCE has brought dozens of enterprises out
Read more
Introduction
This post was written in collaboration with Iddo Avneri from lakeFS.
Managing the growing complexity of ML models and the ever-increasing volume of data has become a daunting challenge for ML practitioners. Efficient data management and data version control are now critical aspects of successful ML workflows.
In this blog post, we delve into the power of parallel ML
Read more
When we announced the availability of MinIO on Red Hat OpenShift, we didn’t anticipate that demand would be so great that we would someday write a series of blog posts about this powerful combination. This combination is being rapidly adopted because on-prem cloud is ubiquitous and large organizations need to bring their data
Read more
About MLflow
MLflow is an open-source platform designed to manage the complete machine learning lifecycle. Databricks created it as an internal project to address challenges faced in their own machine learning development and deployment processes. MLflow was later released as an open-source project in June 2018.
As a tool for managing the complete lifecycle, MLflow contains the following components.
* MLflow
Read more
This post was a collaboration between Kevin Lambrecht of UCE Systems and Raghav Karnam.
The cloud operating model and specifically Kubernetes have become the standard for large scale infrastructure today. More importantly, they are evolving at an exceptional pace with material impacts to data science, data analytics and AI/ML.
This transition has a significant impact on the Hadoop ecosystem.
Read more
MinIO has developed into a core building block for the media and entertainment industry. With a customer roster that includes the leading cable company, the biggest streaming company and dozens of companies up and down the stack, we have added a number of different features in recent quarters. One of those is called the fan out feature and it is
Read more
Making the serving of your AI models more lightweight by leveraging the simplicity of MinIO’s object store.
tl;dr
MinIO object storage can be used as a ‘single source of truth’ for your machine learning models and, in turn, make serving with PyTorch Serve more efficient when managing changes to Large Language Models (LLMs). As always, sample code is
Read more
MinIO is built with speed and resiliency at the forefront, regardless of the type of environment you choose to run it on. Whether your environment is multi-cloud, bare metal, cloud instances or even on-premises, MinIO is designed to run there: on AWS, GCP, Azure, colocated bare metal servers and Kubernetes distributions such as Red Hat OpenShift. MinIO runs just as
Read more
Backup has entered a brave new world where traditional solutions still have utility but where the scale, speed of change and application landscape require different, radically different, approaches. This post seeks to lay out the challenges of this new world, where the line of demarcation exists and how to think about architecting a data protection framework that
Read more
This post was written in collaboration with Harinder Mashiana from cnvrg.io.
Large language models (LLMs) have revolutionized the world of technology, offering powerful capabilities for text analysis, language translation, and chatbot interactions. The revolution will heavily impact businesses: according to OpenAI, approximately 80% of the U.S. workforce could have at least 10% of their work tasks affected by
Read more