Using mc to Migrate Data to/from AWS Snowball


As data has grown, so has the challenge of moving it. Indeed, the bandwidth charges to migrate a PB of data out of AWS would exceed the cost of keeping it there for years. Still, customers often need to move large amounts (hundreds of TBs up to PBs) with some frequency.

Amazon knows this and has, in their intensely customer-focused way, come up with a clever solution in Amazon Snowball. Snowball is reasonably priced and can move massive amounts of data. This handy table from their FAQ page provides some context. At these speeds, FedEx and UPS are a better bet.

There is, however, one problem worth noting: Amazon's CLI for getting the data out is quite limited. Moving data out of Snowball requires a staging area of equivalent capacity before the data can be moved to another object storage system, and that staging area has to be a NAS or a file system.

To facilitate moving data from AWS Snowball directly to another S3-compatible object store at scale, we used our popular MinIO Client (mc) tool.

Snowball speaks S3, but we noticed it does so only in a limited fashion.

By enhancing mc to recognize Snowball as an S3-compatible object storage server, one gains all of the mc commands to manage data on the Snowball. With the addition of our gateway (for backends such as Azure, HDFS, NAS, GCS and Alibaba), users can even migrate data between Snowball and object stores that are not S3 compatible.

Mirror, migrate and move at will.

This enhancement is available in the latest release and should make it really simple for data/devops/infrastructure professionals to move massive amounts of data in and out of AWS.

Download the latest mc:

wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
./mc --help

While this enhancement will certainly accelerate the efforts of those looking to repatriate some or all of their data, it will also make it easier to get data onto AWS S3 if that is what the customer wants.

Here are the instructions to configure mc for AWS Snowball. As we noted, Amazon Snowball implements only a restricted subset of the S3 API.

The latest release of mc will automatically detect a Snowball and behave accordingly.

Before starting, the user must get the AWS Snowball credentials as detailed here.

There are two methods to move the data, secured and unsecured. While we recommend the secure method, we detail both here.

Secured (recommended)

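Before you configure mc to use the HTTPS endpoint, make sure to obtain the current Snowball Edge certificate following this document.

The commands below create the directory where mc looks for trusted CA certificates and save the Snowball Edge certificate into it. Yours should look similar, with your own certificate ARN: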

~ mkdir -p ${HOME}/.mc/certs/CAs
~ snowballEdge get-certificate --certificate-arn arn:aws:snowball-device:::certificate/78EXAMPLE516EXAMPLEf538EXAMPLEa7 > ${HOME}/.mc/certs/CAs/snowball.pem
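Because the command above redirects its output straight into snowball.pem, a failed call can leave an empty or malformed file behind. A minimal sketch of a sanity check follows, assuming the certificate is written in PEM format; the function name cert_looks_like_pem is a hypothetical helper and not part of mc or the Snowball client.

```shell
#!/bin/sh
# Hypothetical helper (not part of mc or snowballEdge): succeed only if the
# file exists, is non-empty, and contains a PEM certificate header line.
# Assumption: the certificate was saved in PEM format.
cert_looks_like_pem() {
  [ -s "$1" ] && grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

For example, running cert_looks_like_pem "${HOME}/.mc/certs/CAs/snowball.pem" || echo "certificate missing or malformed" before creating the alias would flag a bad download early. This only checks the file's shape; it does not validate the certificate itself.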

With the certificate saved, the next step is to create the alias with the obtained credentials as shown below:

~ mc config host add mysnowball https://<YOUR-SNOWBALL-IP>:8443 YOUR-SNOWBALL-ACCESS-KEY YOUR-SNOWBALL-SECRET-KEY

Unsecured

Snowball also exposes an insecure, plain HTTP endpoint that does not require the certificate.

~ mc config host add mysnowball http://<YOUR-SNOWBALL-IP>:8080 YOUR-SNOWBALL-ACCESS-KEY YOUR-SNOWBALL-SECRET-KEY

With this complete, we now turn our attention to the MinIO endpoint.

Start mirror

Assuming you have configured the MinIO server as "myminio", start mirror to copy all the buckets and all objects on Amazon Snowball to MinIO.

~ mc mirror mysnowball/ myminio/
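If you would rather move one bucket at a time and check the result, the mirror step can be wrapped in a small script. The following is a sketch under assumptions: the alias names match the ones used above, the bucket name is a hypothetical placeholder, the run and migrate_bucket helpers are illustrative names, and it presumes the listing calls that mc diff relies on work against your Snowball. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: mirror a single bucket, then compare source and target.
# Aliases follow the article; override them through the environment if needed.
SRC="${SRC:-mysnowball}"
DST="${DST:-myminio}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; set DRY_RUN=0 to execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Mirror one bucket from SRC to DST, then list any remaining differences.
migrate_bucket() {
  bucket="$1"
  run mc mirror "$SRC/$bucket" "$DST/$bucket" || return 1
  # mc diff reports differences between the two locations
  run mc diff "$SRC/$bucket" "$DST/$bucket"
}
```

Calling migrate_bucket photos (where photos is a placeholder bucket name) in dry-run mode prints the mc mirror and mc diff commands it would run, so you can review them before setting DRY_RUN=0.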

Voila!

Now your Snowball is a first-class object storage server from mc's perspective.

If you have any specific questions, drop us a note at hello@min.io or join the conversation on Slack. We are here to help.