Replication Strategies Deep Dive

In previous blogs we’ve talked about replication best practices and the different types of replication, such as Batch, Site and Bucket. With all these replication types floating around, one has to wonder which strategy to use where. Do you use mc mirror or Batch Replication when migrating data from an existing S3-compatible data store? When replicating between clusters, should you use Site Replication or Bucket Replication?

Today we’ll demystify these different replication strategies to see which one should be used in which scenario.

Replicating from an existing source

Generally, if you already have existing data, either locally on a drive or in an existing S3-compatible store, there are two ways we recommend replicating it:

  • Batch Replication: This requires an existing source that is either MinIO or another S3-compatible store such as AWS S3.
  • mc mirror: The source can be a local directory or an NFS mount, among others.

Before we go through the specifics, let's take a look at some of the prerequisites.

Create an alias in mc called miniostore for the MinIO cluster.

mc alias set miniostore ENDPOINT ACCESS-KEY SECRET-KEY

Create a bucket in miniostore to which the data from olderstore will be transferred.

mc mb miniostore/mybucket

Create another alias in mc for the existing S3-compatible store.

mc alias set olderstore ENDPOINT ACCESS-KEY SECRET-KEY

In this case we will assume there is already a bucket in olderstore named mybucket.
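
For illustration, with hypothetical endpoints and credentials (substitute your own values), the two alias commands might look something like this:

mc alias set miniostore https://miniostore.example.net:9000 miniouser miniopassword
mc alias set olderstore https://olderstore.example.net:9000 olderuser olderpassword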

Batch Replication

Let's take a look at how we can use Batch Replication to migrate data from an existing S3-compatible source to a MinIO bucket.

Create the YAML file for the batch replication configuration

mc batch generate olderstore/ replicate

You should see a replication.yaml file similar to the one below, where the source is olderstore and the target is miniostore.

replicate:
  apiVersion: v1
  # source of the objects is `olderstore` alias
  source:
    type: TYPE # valid values are "s3"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if source is remote then target must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used

  # target of the objects is `miniostore` alias
  target:
    type: TYPE # valid values are "s3"
    bucket: BUCKET
    prefix: PREFIX
    # NOTE: if target is remote then source must be "local"
    # endpoint: ENDPOINT
    # credentials:
    #   accessKey: ACCESS-KEY
    #   secretKey: SECRET-KEY
    #   sessionToken: SESSION-TOKEN # Available when rotating credentials are used
[TRUNCATED]
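
As a rough sketch, here is what a filled-in configuration for this migration might look like, assuming the job runs on olderstore so that the target is the remote miniostore. The endpoint and credentials below are hypothetical placeholders:

replicate:
  apiVersion: v1
  # source: the `olderstore` deployment the job runs on (local, so no endpoint)
  source:
    type: s3
    bucket: mybucket
    prefix: ""
  # target: the remote `miniostore` deployment (hypothetical endpoint and credentials)
  target:
    type: s3
    bucket: mybucket
    prefix: ""
    endpoint: "https://miniostore.example.net:9000"
    credentials:
      accessKey: MINIOSTORE-ACCESS-KEY
      secretKey: MINIOSTORE-SECRET-KEY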

Execute batch replication using the command below

mc batch start olderstore/ ./replicate.yaml

Successfully start 'replicate' job `E24HH4nNMcgY5taynaPfxu` on '2024-02-04 14:19:06.296974771 -0400 EDT'

Using the replicate job ID above, in this case E24HH4nNMcgY5taynaPfxu, we can find the status of the batch job.

mc batch status olderstore/ E24HH4nNMcgY5taynaPfxu

●∙∙
Objects:      28766
Versions:     28766
Throughput:   3.0 MiB/s
Transferred:  406 MiB
Elapsed:      2m14.227222868s
CurrObjName:  share/doc/xml-core/examples/foo.xmlcatalogs

You can list all the batch jobs currently running and inspect the configuration of each.

mc batch list olderstore/


ID                      TYPE        USER        STARTED
E24HH4nNMcgY5taynaPfxu  replicate   minioadmin  1 minute ago

mc batch describe olderstore/ E24HH4nNMcgY5taynaPfxu


replicate:
  apiVersion: v1

You can also cancel a batch job and start it again later, for example if it's saturating the network and you'd rather resume during off hours when traffic is at its lowest.
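
For example, using the job ID from above, you can cancel the running job and later kick off a fresh one (it will get a new job ID) with the same YAML:

mc batch cancel olderstore/ E24HH4nNMcgY5taynaPfxu
mc batch start olderstore/ ./replicate.yaml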

mc mirror

Let’s take a quick look at how mc mirror would work in this case.

mc mirror --watch olderstore/mybucket miniostore/mybucket

The above command is similar to rsync. It not only copies the data from olderstore to miniostore, but also watches for new objects arriving in olderstore and copies them over to miniostore. There are some nuances to batch jobs on versioned vs. unversioned buckets: if either the source or the target is an S3-compatible store, the batch job works just like mirror and copies only the latest version of each object. However, version IDs will not be preserved.
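
If you also want the target to track deletions on the source, mc mirror has flags for that; for instance, the following variant overwrites changed objects and removes objects from miniostore that no longer exist on olderstore (use --remove with care):

mc mirror --watch --overwrite --remove olderstore/mybucket miniostore/mybucket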

You can compare the two buckets to see if the data has been copied over successfully.

mc diff olderstore/mybucket miniostore/mybucket

It's as simple as that.

Which is the better option?

Although mc mirror seems simple and straightforward, we actually recommend the Batch Replication method for migrating data from an existing S3-compatible store, for several reasons.

Batch replication runs server-side, while mc mirror runs client-side. This means batch replication has the full resources of the MinIO servers available to perform its batch jobs, whereas mc mirror is bottlenecked by the client system where the command is run, so your data takes a longer route. In other words, with Batch Replication the path looks like olderstore -> miniostore, but with mirroring it looks like olderstore -> mc mirror -> miniostore.

Batch jobs are one-time processes that allow fine-grained control over replication. For example, if you notice the network being saturated while replication is running, you can cancel the batch replication job and resume it later during off hours when traffic is at its lowest. In the event that some objects fail to replicate, the job retries multiple times until the objects eventually replicate.

So does Batch Replication have no downsides? Well, not many. One possible concern we see in the real world is that batch replication is sometimes slow rather than instantaneous. Depending on network speed and the amount of data to transfer, you might see some slowness compared to other methods. That being said, we still recommend Batch Replication because it's more stable and gives us more control over how and when the data gets migrated.

Replicating to another site

Once you have data in your MinIO cluster, you will want to ensure that it gets replicated to another MinIO cluster at another site for redundancy, performance and disaster recovery purposes. There are several ways to do this, but in this case let's talk about the following two:

  • Site Replication
  • Bucket Replication

Site Replication

Once data is in a MinIO object store cluster, it opens up several different possibilities for replicating and managing your data.

The first step is to set up 3 identical MinIO clusters and name them minio1, minio2 and minio3, respectively. We will assume minio1 already has the data migrated to it using Batch Replication.

mc alias set minio1 http://ENDPOINT1 minioadmin minioadmin

mc alias set minio2 http://ENDPOINT2 minioadmin minioadmin

mc alias set minio3 http://ENDPOINT3 minioadmin minioadmin

Enable site replication across all 3 sites

mc admin replicate add minio1 minio2 minio3

Verify that site replication is set up properly across all 3 sites

mc admin replicate info minio1


SiteReplication enabled for:


Deployment ID                        | Site Name | Endpoint
f96a6675-ddc3-4c6e-907d-edccd9eae7a4 | minio1    | http://
0dfce53f-e85b-48d0-91de-4d7564d5456f | minio2    | http://
8527896f-0d4b-48fe-bddc-a3203dccd75f | minio3    | http://

Check the current replication status using the following command

mc admin replicate status minio1

Once site replication is enabled, data will automatically start to replicate between all the sites. Depending on the amount of data to transfer and the network and disk speeds, it might take anywhere from a couple of hours to a few days for the objects to be synchronized across the sites.

If it's taking longer than usual, or you still don’t see everything replicated over, you can run a resync as shown below

mc admin replicate resync start minio1 minio2 minio3

The status can be checked using the following command

mc admin replicate resync status minio1 minio2 minio3

Eventually all the data will be replicated to the minio2 and minio3 sites.

Bucket Replication

Bucket replication, as the name suggests, sets up replication on a particular bucket in MinIO based on its ARN.

Set up the following two MinIO aliases

Source:

mc alias set minio1 ENDPOINT ACCESS-KEY SECRET-KEY

Destination:

mc alias set minio2 ENDPOINT ACCESS-KEY SECRET-KEY
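
Note that bucket replication requires versioning to be enabled on both the source and destination buckets. If srcbucket and destbucket (used in the steps below) don't exist yet, one way to create them with versioning enabled is:

mc mb --with-versioning minio1/srcbucket
mc mb --with-versioning minio2/destbucket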

Once both aliases are set, create a replication user repluser on minio2 and attach a user policy that grants, at minimum, the permissions required for replication on the destination bucket.

mc admin user add minio2 repluser repluserpwd

Set the minimum policy required for repluser to run the replication operations

$ cat > replicationPolicy.json << EOF
{
 "Version": "2012-10-17",
 "Statement": [
  {
   "Effect": "Allow",
   "Action": [
    "s3:GetBucketVersioning"
   ],
   "Resource": [
    "arn:aws:s3:::destbucket"
   ]
  },
  {
   "Effect": "Allow",
   "Action": [
    "s3:ReplicateTags",
    "s3:GetObject",
    "s3:GetObjectVersion",
    "s3:GetObjectVersionTagging",
    "s3:PutObject",
    "s3:ReplicateObject"
   ],
   "Resource": [
    "arn:aws:s3:::destbucket/*"
   ]
  }
 ]
}
EOF

Attach the above policy, as replpolicy, to repluser

$ mc admin policy add minio2 replpolicy ./replicationPolicy.json

$ mc admin policy set minio2 replpolicy user=repluser
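
On recent versions of mc these subcommands have been renamed, so depending on your version the equivalent commands may be:

mc admin policy create minio2 replpolicy ./replicationPolicy.json
mc admin policy attach minio2 replpolicy --user repluser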

This is where it gets interesting. Now that you have the replication user (repluser) and the replication policy (replpolicy) created on the minio2 cluster, you need to set the bucket replication target on minio1. This doesn’t start the bucket replication yet; it only sets it up for when we actually start the process.

$ mc admin bucket remote add minio1/srcbucket https://repluser:repluserpwd@replica-endpoint:9000/destbucket --service "replication" --region "us-east-1"

Replication ARN = 'arn:minio:replication:us-east-1:28285312-2dec-4982-b14d-c24e99d472e6:destbucket'

Finally, this is where the rubber meets the road: let's start the replication process.

$ mc replicate add minio1/srcbucket --remote-bucket https://repluser:repluserpwd@replica-endpoint:9000/destbucket

Any objects uploaded to the source bucket that meet replication criteria will now be automatically replicated by the MinIO server to the remote destination bucket. Replication can be disabled at any time by disabling specific rules in the configuration or deleting the replication configuration entirely.
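
For example, you can list the replication rules on the source bucket, and remove them entirely if you want to tear the configuration down:

mc replicate ls minio1/srcbucket
mc replicate rm --all --force minio1/srcbucket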

Which is the better option?

So why can’t we use site-to-site replication for everything, and why do we need Batch Replication at all? Well, Batch Replication provides more control over the replication process. Think of site replication as a firehose when you start it for the first time: once started, it has the potential to use all the available network bandwidth, to the point where no other applications can use the network. On the other hand, while batch replication might sometimes be slow, it will not disrupt your existing network during the initial data transfer. Bucket replication is generally useful when you want to replicate just a handful of buckets and not the entire cluster.

Okay great, then what about site replication? Batch replication is not ideal for continuous replication because once the batch job ends it won’t replicate any new objects, so you have to keep re-running the batch replication job at certain intervals to ensure the delta gets replicated to the minio2 site. Site replication, on the other hand, allows data to be replicated both from minio1 to minio2 and vice versa if you have an active-active replication setup.
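
For example, a hypothetical crontab entry that re-runs the batch job nightly at 2:00 to replicate the delta to minio2 might look like this (the YAML path is illustrative):

# re-run the batch replication job every night at 2:00
0 2 * * * mc batch start minio1/ /home/user/replicate.yaml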

It is not possible to have both bucket and site replication enabled at the same time; you have to pick one or the other. So generally, unless you want to replicate only certain buckets or certain objects in a particular bucket, we highly recommend going with site replication, as it will not only replicate existing buckets and objects but also any new buckets and objects that are created. Moreover, without too much configuration, you can set up replication in a distributed manner: you can have minio1 in North America and minio2 in Africa, so the MENA (Middle East and North Africa) region adds data to minio2 and the North America region adds data to minio1, and they replicate to each other.

Final Thoughts

In this post we went deeper into the Bucket, Batch and Site replication types. While there is no set rule for choosing a particular replication strategy, after working with countless cluster setups, migrating them, expanding them, and thinking through disaster recovery scenarios, our engineers behind SUBNET have come up with the above replication strategies, which should help most folks out there thinking of migrating their data to MinIO.

If you have any questions on replication or best practices, be sure to reach out to us on Slack, or better yet sign up for SUBNET and we can get you going!