All posts

Benchmarking MinIO vs. AWS S3 for Apache Spark

Benchmarking MinIO vs. AWS S3 for Apache Spark

Apache Spark is a framework for distributed computing. It provides one of the best mechanisms for distributing data across multiple machines in a cluster and performing computations on it. Spark achieves this by constructing data structures called RDDs (Resilient Distributed Datasets). RDDs allow data to be broken into disparate chunks and processed independently of one another. The individual chunks can

Read more...

Scalable Genomics Data Processing Pipeline with Alluxio, Mesos, and MinIO

This is a guest blog from our friends at Guardant Health.Guardant Health is the world leader in comprehensive liquid biopsy. Oncologists order our blood test to help determine if their advanced cancer patients are eligible for certain drugs that target specific genomic alterations in tumour DNA. Each test produces huge amounts of genomic data that we process into easily

Read more...