This post first appeared in The New Stack.
Developers gravitate to technologies that are software defined, open source, cloud native and simple. That essentially defines object storage.
Introduction
Choosing the best storage for all phases of a machine learning (ML) project is critical. Research engineers need to create multiple versions of datasets and experiment with different model architectures. When a
Read more
Most developers, engineers, architects and DevOps folks know MinIO. Not all of them know that the only thing we do is software-defined object storage. We don’t do file or block, and we don’t offer a hosted service; MinIO is self-hosted.
Our focus is singular.
The result is that our object store is objectively, based on adoption, awards and customer feedback, the best
Read more
Apache Kafka and Apache Spark are two leading technologies used to build the streaming data pipelines that feed data lakes and lake houses. At a high level, Kafka streams messages to Spark, where they are transformed into a format that applications can read and then saved to storage.
Read more
Build data pipelines with S3 to MinIO and MinIO to MinIO batch replication.
Read more
Encryption is an important part of the MinIO architecture. MinIO applies encryption to ensure objects are secure at rest and are compliant with regulations.
Read more
Engineers like to play and learn locally. It does not matter which tool is under investigation: a high-end storage solution, a workflow orchestration engine, or the latest thing in distributed computing. The best way to learn a new technology is to find a way to cram it all on a single machine so that you can put your hands on
Read more
InfluxDB is built on the same ethos as MinIO. It is a single Go binary, cloud agnostic and lightweight, yet feature-packed with capabilities like replication and encryption, and it provides integrations with various applications.
Read more
88 GB/s writes in a 2U form factor for on-prem, colo and edge object storage.
Read more
Kubeflow Pipelines (KFP) is the most popular feature of Kubeflow. A Python engineer can turn a function written in plain old Python into a component that runs in Kubernetes using the KFP decorators. If you used KFP v1, be warned: the programming model in KFP v2 is very different, but it is a big improvement. Transforming plain old
Read more
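To make the decorator idea concrete, here is a toy Python sketch of the pattern KFP's decorators follow: wrap a plain function with metadata so a pipeline engine can treat it as a step. The `component` decorator below is a hypothetical stand-in for illustration only, not the real `kfp.dsl.component` API.

```python
# Conceptual mimic of a pipeline-component decorator: tag a plain Python
# function with metadata that a pipeline runner could inspect.
# `component` is a toy stand-in, NOT the real kfp.dsl API.
import functools

def component(func):
    """Mark a plain function as a pipeline component (toy version)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    wrapper.is_component = True  # metadata a runner could look for
    return wrapper

@component
def normalize(values: list[float]) -> list[float]:
    """A plain old Python function, now usable as a pipeline step."""
    total = sum(values)
    return [v / total for v in values]

assert normalize.is_component
assert normalize([1.0, 1.0, 2.0]) == [0.25, 0.25, 0.5]
```

In real KFP v2, the decorated function additionally gets packaged with its dependencies so it can run inside a container on Kubernetes; the calling convention above is only the surface of that.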
Kafka and Spark Structured Streaming are used together to build data lakes and lake houses fed by streaming data, providing real-time business insights.
Read more
Make your Kafka topics performant and efficient with the Kafka Schema Registry.
Read more
HDD failure rates create big complications for RAID arrays. Find out why erasure coding is a better option for data durability.
Read more
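The core idea behind erasure coding can be shown with a toy single-parity scheme in pure Python: XOR the data shards into a parity shard, and any one lost shard can be rebuilt from the survivors. This is an illustrative sketch only; MinIO itself uses Reed-Solomon erasure coding, which tolerates multiple simultaneous drive failures.

```python
# Toy single-parity erasure code: any ONE lost shard can be rebuilt by
# XOR-ing the survivors. Real systems like MinIO use Reed-Solomon codes,
# which survive several losses at once; this only demonstrates the idea.

def encode(shards: list[bytes]) -> bytes:
    """Compute a parity shard as the byte-wise XOR of all shards."""
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, b in enumerate(shard):
            parity[i] ^= b
    return bytes(parity)

def reconstruct(survivors: list[bytes]) -> bytes:
    """Rebuild the single missing shard from remaining shards + parity."""
    return encode(survivors)  # XOR is its own inverse

data = [b"0123", b"4567", b"89ab"]  # three equal-size data shards
parity = encode(data)

# Simulate losing the middle shard, then recover it from the rest.
recovered = reconstruct([data[0], data[2], parity])
assert recovered == data[1]
```

Unlike RAID mirroring, the storage overhead here is one parity shard per stripe rather than a full copy, which is why erasure coding scales better for durability per terabyte.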
Managing users, groups, and policies for security and functionality with MinIO.
Read more
MinIO licensees gain access to SUBNET features like long-term support and security policy reviews.
Read more
MinIO has added support for FTP and SFTP into the MinIO Server.
Read more
What is ArgoCD? In short, it's a GitOps continuous deployment tool that stores the desired state of the infrastructure in a Git repository and automates deployment by tracking the differences between the existing and desired configurations.
Read more
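The comparison at the heart of GitOps can be sketched in a few lines of Python: diff the live state against the desired state stored in Git, then report what must change. The function and field names below are hypothetical for illustration; they are not ArgoCD's actual API.

```python
# Illustrative GitOps-style reconciliation: compare live state against the
# desired state and produce a plan. Names here are invented for the sketch,
# not taken from ArgoCD.

def diff_state(live: dict, desired: dict) -> dict:
    """Return the changes needed to move `live` toward `desired`."""
    return {
        "create": sorted(k for k in desired if k not in live),
        "delete": sorted(k for k in live if k not in desired),
        "update": sorted(k for k in desired
                         if k in live and live[k] != desired[k]),
    }

live = {"web": {"replicas": 2}, "worker": {"replicas": 1}}
desired = {"web": {"replicas": 3}, "cache": {"replicas": 1}}

plan = diff_state(live, desired)
assert plan == {"create": ["cache"], "delete": ["worker"], "update": ["web"]}
```

A tool like ArgoCD runs this kind of comparison continuously, treating the Git repository as the source of truth and applying the resulting plan to the cluster.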
I wanted to share my thoughts on the semi-annual confab that is Kubecon, this one the European edition. These are fairly candid takes; I can be critical or complimentary, but given how important this space is to us, it is worthy of analysis.
Let’s get one thing out of the way. This was a superb Kubecon. The location was
Read more
Apache Kafka is an open-source distributed event streaming platform that is used for building real-time data pipelines and streaming applications. It was originally developed by LinkedIn and is now maintained by the Apache Software Foundation. Kafka is designed to handle high volume, high throughput, and low latency data streams, making it a popular choice for building scalable and reliable data
Read more