Add Pools and expand capacity

Server pools help you expand the capacity of your existing MinIO cluster quickly and easily. This blog post focuses on increasing the capacity of one cluster, which is different from adding another cluster and replicating the same data across multiple clusters. When adding a server pool to an existing cluster, you increase the overall usable capacity of that cluster. If you have replication set up, then you will need to grow your replication target equally to accommodate the growth of the replication origin. 

Server pools are an important concept in MinIO because they facilitate rapid storage capacity expansion. We recommend sizing a single-pool cluster for at least 2-3 years of storage capacity runway –  and possibly more if you anticipate dramatic growth. This way you avoid adding unnecessary server pools and instead start with a simple MinIO cluster that grows simply over time. Even though server pools are easier to work with than individual nodes, they still add a tiny bit of management overhead. Once you've expanded, you should think about consolidating multiple pools into a few large ones by decommissioning the smaller pools.

In this post we’ll show you what you need to consider before expanding a server pool, how to create your initial pool and then later how to expand it by adding a new pool.

Build the Cluster

When setting up a server pool to expand the cluster there are certain prerequisites that need to be met in order to have the necessary specs for the additional pool.

Network and Firewall: The nodes in the new pool need to be able to talk to all the existing nodes in the cluster bi-directionally. All the new nodes must be listening on the same port as the existing ones. For example if you use port `9000` then the new pool must also communicate on `9000`. We also recommend using a load balancer such as Nginx or HAProxy for proxying the requests. Configure the routing algorithm to ensure traffic is routed based on least connections.

Sequential Hostnames: MinIO uses an expansion notation `{x...y}` to denote a sequential series of hostnames. It is therefore mandatory to name the new nodes in your pool in a sequential manner. If the existing nodes had these hostnames:

minio1.example.com

minio2.example.com

minio3.example.com

minio4.example.com

Then the new pool should have the following hostnames:

minio5.example.com

minio6.example.com

minio7.example.com

minio8.example.com

Be sure to create the DNS records for these hostnames prior to launching the new pool.

Sequential Drives: Similar to the hostnames, the drives need to be mounted in sequential order as well using the same expansion notation {x...y}. Here is an example of an /etc/fstab file.

$ mkfs.xfs /dev/sdb -L DISK1

$ mkfs.xfs /dev/sdc -L DISK2

$ mkfs.xfs /dev/sdd -L DISK3

$ mkfs.xfs /dev/sde -L DISK4


$ nano /etc/fstab


  #                

  LABEL=DISK1      /mnt/disk1     xfs     defaults,noatime  0       2

  LABEL=DISK2      /mnt/disk2     xfs     defaults,noatime  0       2

  LABEL=DISK3      /mnt/disk3     xfs     defaults,noatime  0       2

  LABEL=DISK4      /mnt/disk4     xfs     defaults,noatime  0       2

You can then specify the entire range of drives with /mnt/disk{1...4}. If you want to use a specific sub-folder on each drive, specify it as /mnt/disk{1...4}/minio.

Erasure Code: As mentioned earlier, MinIO requires each server pool to satisfy the deployment parameters of existing clusters. Specifically the new pool topology must support a minimum of 2 x EC:N drives per erasure set, where EC:N is the standard parity storage class of the deployment. This requirement ensures the new server pool can satisfy the expected SLA of the deployment. For reference, this blog post explains how to use the Erasure Code Calculator to determine the number of disks and capacity  you need. For an explanation of erasure coding, please see Erasure Coding 101.

Atomic Updates: You should also ensure that the new pool is homogeneous with the existing cluster as much as possible. It does not have to match spec for spec but the drives and network configuration need to be as close as possible to avoid potential edge case issues. Adding a new server pool requires restarting all MinIO nodes in the deployment at the same time. MinIO recommends restarting all nodes simultaneously. Do not perform rolling restarts (e.g. one node at a time), MinIO operations are atomic and strictly consistent. As such the restart procedure is non-disruptive to applications and ongoing operations.

Let's go ahead and build the cluster. In this example, we’ll build a Kubernetes cluster using KIND. We’ll use the following configuration to build a virtual 8-node cluster.

kind: Cluster

apiVersion: kind.x-k8s.io/v1alpha4

networking:

  apiServerAddress: "127.0.0.1"

  apiServerPort: 6443

nodes:

  - role: control-plane

    extraPortMappings:

    - containerPort: 30080

      hostPort: 30080

      listenAddress: "127.0.0.1"

      protocol: TCP

  - role: worker

    extraPortMappings:

    - containerPort: 30081

      hostPort: 30081

      listenAddress: "127.0.0.1"

      protocol: TCP

  - role: worker

    extraPortMappings:

    - containerPort: 30082

      hostPort: 30082

      listenAddress: "127.0.0.1"

      protocol: TCP

  - role: worker

    extraPortMappings:

    - containerPort: 30083

      hostPort: 30083

      listenAddress: "127.0.0.1"

      protocol: TCP

  - role: worker

    extraPortMappings:

    - containerPort: 30084

      hostPort: 30084

      listenAddress: "127.0.0.1"

      protocol: TCP

  - role: worker

    extraPortMappings:

    - containerPort: 30085

      hostPort: 30085

      listenAddress: "127.0.0.1"

      protocol: TCP

  - role: worker

    extraPortMappings:

    - containerPort: 30086

      hostPort: 30086

      listenAddress: "127.0.0.1"

      protocol: TCP

  - role: worker

    extraPortMappings:

    - containerPort: 30087

      hostPort: 30087

      listenAddress: "127.0.0.1"

      protocol: TCP

  - role: worker

    extraPortMappings:

    - containerPort: 30088

      hostPort: 30088

      listenAddress: "127.0.0.1"

      protocol: TCP

Add a label to the first 4 nodes into pool zero like so

kubectl label nodes kind-worker  pool=zero

kubectl label nodes kind-worker2 pool=zero

kubectl label nodes kind-worker3 pool=zero

kubectl label nodes kind-worker4 pool=zero

Clone the MinIO operator’s tenant lite Kustomize configuration

git clone https://github.com/minio/operator.git

Ensure tenant.yaml has a pool called pool-0 like below

pools:

  - name: pool-0

    nodeSelector:

      pool: zero

Apply tenant configuration to launch pool-0

$ kubectl apply -k operator/resources

$ kubectl apply -k operator/examples/kustomization/tenant-lite

Check to make sure there are 4 pods in the pool

$ kubectl get pods -n tenant-lite -o wide

NAME               READY   STATUS    RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES

myminio-pool-0-0   1/1     Running   0          12h   10.244.7.5    kind-worker             

myminio-pool-0-1   1/1     Running   0          12h   10.244.5.5    kind-worker3            

myminio-pool-0-2   1/1     Running   0          12h   10.244.4.10   kind-worker2            

myminio-pool-0-3   1/1     Running   0          12h   10.244.8.13   kind-worker4            

This is the initial setup that most folks start out with. This ensures you are set up in a way that sets you up to seamlessly expand in the future. Speaking of expanding pools, let's take a look how that would look like

Expand the Cluster

Expanding a pool is a non-disruptive operation which causes zero cluster downtime. Below is a diagram of what we intend to achieve as an end result.

In the above diagram on the left hand side we see pool-0 that has already been setup in the previous steps. In this section we’ll tackle adding pool-1 to expand the overall capacity of the cluster. You need to add 4 more similar nodes to pool-0 in order to expand into pool-1. We’ve already done this by launching an 8-node cluster to simplify the demo.

Edit the tenant-lite config to add pool-1

kubectl edit tenant -n tenant-lite

It should open a yaml file, find the pools section and add this below that.

  - affinity:

      podAntiAffinity:

        requiredDuringSchedulingIgnoredDuringExecution:

        - labelSelector:

            matchExpressions:

            - key: v1.min.io/tenant

              operator: In

              values:

              - myminio

            - key: v1.min.io/pool

              operator: In

              values:

              - pool-1

          topologyKey: kubernetes.io/hostname

    name: pool-1

    nodeSelector:

      pool: one

    resources: {}

    runtimeClassName: ""

    servers: 4

    volumeClaimTemplate:

      metadata:

        creationTimestamp: null

        name: data

      spec:

        accessModes:

        - ReadWriteOnce

        resources:

          requests:

            storage: "2147483648"

        storageClassName: standard

      status: {}

    volumesPerServer: 2

As soon as you save the file, the new pool should be starting to deploy. Verify it by getting a list of pods.

$ kubectl get pods -n tenant-lite -o wide

NAME               READY   STATUS    RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES

myminio-pool-0-0   1/1     Running   0          12h   10.244.7.5    kind-worker             

myminio-pool-0-1   1/1     Running   0          12h   10.244.5.5    kind-worker3            

myminio-pool-0-2   1/1     Running   0          12h   10.244.4.10   kind-worker2            

myminio-pool-0-3   1/1     Running   0          12h   10.244.8.13   kind-worker4            

myminio-pool-1-0   1/1     Running   0          12h   10.244.3.10   kind-worker8            

myminio-pool-1-1   1/1     Running   0          12h   10.244.6.15   kind-worker6            

myminio-pool-1-2   1/1     Running   0          12h   10.244.2.7    kind-worker5            

myminio-pool-1-3   1/1     Running   0          12h   10.244.1.10   kind-worker7            

There you have it. Wasn’t that a remarkably easy way to expand?

Ermahgerd Pools!

Server pools streamline ongoing operations of MinIO clusters. Pools allow you to expand your cluster on a moment's notice without having to move your data around to different clusters or re-balancing the cluster. Server pools enhance operational efficiency because they give storage admins a powerful shortcut in the ability to address an entire hardware cluster as a single resource.

While server pools are an excellent way to expand the capacity of your cluster, they should be used judiciously. We recommend that you size the cluster from Day One to have enough space for 3 years of anticipated growth so you don’t need to immediately start adding more pools. In addition, consider tiering before buying more capacity – tier off the old data to less expensive hardware and devote the latest and greatest hardware to storing the most recent and heavily accessed objects. If and when you have to expand the cluster by adding pools, have a game plan to eventually decommission the older pools and consolidate into a single large cluster per site. This will further lower the overhead required to keep your MinIO cluster running smoothly.

If you have any questions on how to add and expand server pools be sure to reach out to us on Slack!