Master Full Text Search with MeiliSearch on MinIO
How to pair fast and efficient search with high-performance Kubernetes-native object storage.
Read more
A collection of 71 posts tagged with "Modern Data Lakes"
When you think about the cloud, it helps to think about the types of businesses that have been built with elastic compute, networking and storage as a foundational component and self-service/multi-tenancy as the vehicle for customer engagement. For the most part, those businesses succeeded at scaling by focusing their efforts on building their product, almost exclusively on a single
Read more
Document management is a core requirement for all sorts of regulated institutions: finance, telecom, healthcare, government and others. These institutions need to manage and retain an ever-growing number of documents, and regulatory guidelines often require these documents to be stored for a very long time (7-10 years). Take, for example, KYC (Know Your Customer) documents. Anyone starting
Read more
With the introduction of Apache Arrow, a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations, MinIO data lakes can be much more powerful. This article explains how to make use of Apache Arrow by using ArrowRDD.
Read more
Apache NiFi is one of the most popular open source data flow engines available today. NiFi supports almost all the major enterprise data systems and allows users to create effective, fast, and scalable information flow systems. Creating data flow systems is simple with NiFi and there is a clear path to add support for systems not already available as NiFi
Read more
When early object storage APIs were developed, they focused on the efficient storage and retrieval of objects. With Amazon’s success with S3, its robust S3 API quickly became the de facto standard for object storage in the cloud. MinIO, recognizing this, invested heavily in creating the most compliant implementation of the S3 API outside of Amazon.
Read more
In this post we’ll learn more about object storage, specifically MinIO, and then see how to connect MinIO with tools like Apache Spark and Presto for analytics workloads.
Read more
In the first part of this two-post series, we’ll take a look at how object storage is different from other storage approaches and why it makes sense to leverage object storage like MinIO for data lakes.
Read more
In this post, we learn about Pivotal Container Service deployment and how to use the pks command-line tool to create and manage Kubernetes clusters. We also see how to deploy MinIO once your PKS Kubernetes cluster is set up and running.
Read more
In this post, we learn why and how Presto is becoming the tool of choice for querying large datasets from platforms like MinIO. We then learn the steps to set up and deploy Presto on private infrastructure.
Read more
One of the major requirements for success with an IoT strategy is the ability to store and analyze device and sensor data. As IoT brings thousands of devices online every day, the data being generated by all these devices combined is reaching staggering levels. Storing the IoT data in a scalable yet cost-effective manner, while being able to analyze it
Read more