WARP Speed Your AI Data Storage Infrastructure

Do you know the secret to some of the best AI models out there? It's the amount of data they had access to during training. For AI/ML models, fast, accessible data is king. Let me emphasize: it's not just data, but fast, accessible data. If someone else can build a faster, stronger model, you've already lost the AI race.

When designing AI infrastructure components, especially data storage components, it's crucial to account for the overall experience of ML engineers and data scientists as they store their machine learning models and manage the resources available in the MinIO cluster. This ensures that reliable models are built quickly and efficiently, without the storage infrastructure becoming the bottleneck.

Several components in the AI infrastructure layer are needed not only to build AI models, but also to train them and store the resulting models in a fast, accessible data store such as MinIO. MLOps sits at the junction of DevOps and the ML models being produced at breakneck speed. In this post we'll show you how to measure the performance of your MinIO AI data storage infrastructure using WARP.

WARP is an open-source, full-featured S3 performance assessment tool built to conduct tests between WARP clients and object storage hosts. WARP measures GET and PUT performance from multiple clients against a MinIO cluster. WARP has many options, configured by command line or environment variables, allowing you to create tests that align with your workloads. We'll quickly show you how to run it so you can start analyzing your AI data storage infrastructure.
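For example, here's what a basic standalone run looks like (the endpoint and credentials below are placeholders for your own; warp mixed runs a mix of GET, PUT, STAT and DELETE operations):

warp mixed --host=minio.example.com:9000 \
  --access-key=minioadmin --secret-key=minioadmin \
  --duration=5m --concurrent=64 --obj.size=10MiB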

Run and Analyze WARP

Create WARP client listeners to run distributed WARP benchmarks; here we will run them as a StatefulSet across the client nodes.

kubectl apply -f https://raw.githubusercontent.com/minio/warp/master/k8s/warp.yaml
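This creates the WARP client pods. Before moving on, it's worth confirming they are up; assuming the default names from the manifest above, a quick check looks like:

kubectl get pods | grep warp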

In warp-job.yaml, update the --warp-client and --host flags to match your cluster specifics.
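As a rough sketch, the relevant job arguments end up looking something like this (the hostnames, service name, and client count here are placeholders matching a four-node cluster, not the manifest's defaults):

args:
  - get
  - --warp-client=warp-{0...3}.warp
  - --host=minio-{0...3}.minio.default.svc.cluster.local:9000
  - --access-key=minioadmin
  - --secret-key=minioadmin

Once set, deploy as follows: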

kubectl apply -f https://raw.githubusercontent.com/minio/warp/master/k8s/warp-job.yaml

Once the WARP job completes, the status can be found in the logs:

kubectl get pods -l job-name=warp-job

NAME             READY   STATUS      RESTARTS   AGE
warp-job-6xt5k   0/1     Completed   0          8m53s

kubectl logs warp-job-6xt5k

...
-------------------
Operation: PUT. Concurrency: 256. Hosts: 4.
* Average: 412.73 MiB/s, 12.90 obj/s (1m48.853s, starting 19:14:51 UTC)

Throughput by host:
 * http://minio-0.minio.default.svc.cluster.local:9000: Avg: 101.52 MiB/s, 3.17 obj/s (2m32.632s, starting 19:14:30 UTC)
...

Aggregated Throughput, split into 108 x 1s time segments:
 * Fastest: 677.1MiB/s, 21.16 obj/s (1s, starting 19:15:54 UTC)
 * 50% Median: 406.4MiB/s, 12.70 obj/s (1s, starting 19:14:51 UTC)
 * Slowest: 371.5MiB/s, 11.61 obj/s (1s, starting 19:15:42 UTC)

As the Kubernetes setup above demonstrates, WARP supports distributed benchmarking. Running the test from multiple WARP clients at once is more realistic, since that is usually how load arrives in the real world.
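Outside of Kubernetes, the pattern is the same (hostnames below are placeholders): start a listener on each client machine, then drive all of them from a single coordinating node.

warp client    # run on each client machine; listens on port 7761 by default

warp get --warp-client=warp-client{1...4}.example.com \
  --host=minio{1...4}.example.com:9000 \
  --access-key=minioadmin --secret-key=minioadmin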

When running WARP, make sure the nodes where you install the clients are on a private network, since an exposed client could potentially be abused for a DDoS attack. Also avoid running WARP against your cluster during peak production periods, or the benchmark and your production workloads will end up competing for resources.

It's also possible to randomize object sizes, in which case each object will have a "random" size up to the object size you define.
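For example, this sketch uploads objects with randomized sizes up to a 100 MiB cap (endpoint and credentials are placeholders):

warp put --obj.size=100MiB --obj.randsize \
  --host=minio.example.com:9000 \
  --access-key=minioadmin --secret-key=minioadmin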

[Figure: example of objects (horizontally) and their sizes, 100 MB max]

You can also have WARP terminate automatically once results are considered stable. To detect a stable setup, WARP continuously downsamples the current data to 25 data points stretched over the current timeframe. For a benchmark to be considered "stable", the last 7 of the 25 data points must be within a specified percentage of each other.
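A minimal sketch of an auto-terminating run (the window and threshold values shown match WARP's documented defaults, made explicit here; endpoint and credentials are placeholders):

warp get --autoterm --autoterm.dur=10s --autoterm.pct=7.5 \
  --host=minio.example.com:9000 \
  --access-key=minioadmin --secret-key=minioadmin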

Looking at the throughput over time, it could look like this:

[Figure: benchmark throughput over time]

The red frame shows the window used to evaluate stability. The height of the box is determined by the threshold percentage of the current speed. 

Warp Speed Ahead!

We encourage you to refer to the documentation to learn about executing more test scenarios. For example, you can enable TLS and server-side encryption to measure their impact in your environment. You can stress the infrastructure further by increasing the number of concurrent operations. You can use a random mix of object sizes, or specify an object size that matches your current environment and workload. You can configure tests to run for a defined period of time or to auto-terminate as we did above.
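For instance, here is a sketch combining several of those options (assuming your deployment already has TLS and server-side encryption configured; endpoint, credentials, and sizing values are placeholders):

warp mixed --tls --encrypt \
  --concurrent=128 --duration=10m --obj.size=64MiB \
  --host=minio.example.com:9000 \
  --access-key=minioadmin --secret-key=minioadmin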

If you have any questions on WARP be sure to reach out to us on Slack!