The rise of lakehouse functionality is reshaping data management. ParadeDB's pg_lakehouse extension lets PostgreSQL integrate with object storage, enabling scalable, secure analytics. This makes the modernization of data infrastructure possible without extensive overhauls. Welcome to the future!
Read more
Amid the AI frenzy, one silent hero powers it all: modern object storage. It may not be glamorous, but it's the backbone of today's data lakes, enabling vast, efficient data management. Discover how AIStor elevates your data infrastructure.
Read more
The regulatory landscape is evolving rapidly, and the upcoming Digital Operational Resilience Act (DORA) in Europe is a testament to this dynamic change. We have multiple European banking customers and each one is approaching the problem from a slightly different angle with one exception - almost all of them are using modern object storage as the foundational layer.
For IT
Read more
In this post we explain how to use Splunk's advanced log analytics to help understand the performance of AIStor and the data under management.
Read more
The modern enterprise defines itself by its data. This requires a data infrastructure for AI/ML as well as a data infrastructure that is the foundation for a Modern Datalake capable of supporting business intelligence, data analytics, and data science. This is true whether the enterprise is behind, getting started, or using AI for advanced insights. For the foreseeable future, this
Read more
The Load Balancer in MinIO Firewall solves the network bottleneck. In a cloud-native environment like Kubernetes, it is fairly easy to enable load balancing with MinIO Firewall, without any modification to your application binary or container image.
Read more
The team at Insight Partners just released their State of Enterprise Tech report for 2024. There is a lot to consume in the 60+ slides, but we cherry-picked the things that should be interesting to our audience - and frankly there is a lot of interesting stuff.
I will leave the survey methodology stuff for you to consume, but
Read more
AutoMQ enhances Kafka's architecture by using MinIO's object storage, cutting costs, and boosting elasticity while keeping Kafka API compatibility. This combo offers scalable, secure, and efficient data streaming, ideal for diverse cloud environments.
Read more
Iceberg is shifting the market's focus to scalable, cloud-native storage. This shift is leading to the commoditization of query engines, offering users more flexibility, better pricing, and innovation.
Read more
An embedding subsystem is one of four subsystems needed to implement Retrieval Augmented Generation. It turns your custom corpus into a database of vectors that can be searched for semantic meaning. The other subsystems are the data pipeline for creating your custom corpus, the retriever for querying the vector database to add more context to a user query, and finally,
Read more
Catalogs are revolutionizing modern datalakes, with industry giants like Databricks and Snowflake adopting Apache Iceberg’s catalog REST API. A commitment to open standards enhances performance, fosters innovation, and transforms data management for AI and ML.
Read more
Observability is all about gathering information (traces, logs, metrics) with the goal of improving performance, reliability, and availability.
Read more
One of the reasons that MinIO is so performant is that we do the granular work that others will not or cannot. From SIMD acceleration to the AVX-512 optimizations, we have done the hard stuff. Recent developments for the ARM CPU architecture, in particular the Scalable Vector Extension (SVE), presented us with the opportunity to deliver significant performance and efficiency gains
Read more
This post initially appeared on The New Stack.
For a few years there, the term “private cloud” had a negative connotation. But as we know, technology is more of a wheel than an arrow, and right on cue, the private cloud is getting a ton of attention and it is all positive. The statistics are clear: Forrester’s 2023 Infrastructure
Read more
The semantic layer in modern datalakes provides context and structure to raw data, crucial for key data initiatives like AI model training, data management and data governance. A unified strategy and robust infrastructure are essential for effective implementation of the semantic layer.
Read more
Simply put, OperatorHub is to OpenShift what the App Store is to Apple. With a web console interface, an Operator can be pulled from its off-cluster source, installed and subscribed on the cluster, and made ready for engineering teams to manage the product on a self-service basis across deployment environments.
Read more
The Modern Datalake is one-half data warehouse and one-half data lake and uses object storage for everything. The use of object storage to build a data warehouse is made possible by Open Table Formats (OTFs) like Apache Iceberg, Apache Hudi, and Delta Lake, which are specifications that, once implemented, make it seamless for object storage to be used as the
Read more
With all the talk in the industry today regarding large language models with their encoders, decoders, multi-headed attention layers, and billions (soon trillions) of parameters, it is tempting to believe that good AI is the result of model design only. Unfortunately, this is not the case. Good AI requires more than a well-designed model. It also requires properly constructed training
Read more
Boundary helps record SSH sessions to meet compliance requirements and improve security. These sessions are then stored on MinIO for fast retrieval for auditing purposes in the event of a data breach.
Read more