CI/CD Deploy with MinIO distributed cluster on Kubernetes
Welcome to the third and final installment of our MinIO and CI/CD series. So far, we’ve discussed the basics of CI/CD concepts, how to build MinIO artifacts, and how to test them in development. In this blog post, we’ll focus on Continuous Delivery and MinIO. We’ll show you how to deploy a MinIO cluster in a production environment using infrastructure as code, so that anyone can read the resources being installed and any change is under version control.
MinIO is very versatile and can be installed in almost any environment. It fits multiple use cases, letting developers run the same environment on a laptop that they work with in production, using the CI/CD concepts and pipelines we discussed. We showed you previously how to install MinIO as a Docker container and even as a systemd service. Today we’ll show you how to deploy MinIO in distributed mode in a production Kubernetes cluster using an operator. We’ll use Terraform to deploy the infrastructure first, and then we’ll deploy the required MinIO resources.
MinIO Network
First we’ll use Terraform to build the basic network needed for our infrastructure to get up and running. We are going to set up a VPC with 3 commonly used network types. Within that network we’ll launch a Kubernetes cluster where we can deploy our MinIO workloads. The structure of our Terraform modules looks something like this:
modules
├── eks
└── vpc
https://github.com/minio/blog-assets/tree/main/ci-cd-deploy/terraform/aws/modules
In order for the VPC to have different networks, each network requires a unique, non-overlapping subnet. These subnets are split into CIDR blocks. For a handful of subnets this is pretty easy to calculate by hand, but for the many subnets we have here, Terraform provides a handy function, cidrsubnet(), to split the subnets for us based on a larger subnet we provide, in this case 10.0.0.0/16.
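As a quick illustration (not taken from the module source), here is how cidrsubnet() carves three smaller blocks out of 10.0.0.0/16 when newbits is 4, matching the hello_minio_aws_vpc_cidr_newbits value we set later:

locals {
  vpc_cidr_block = "10.0.0.0/16"

  # newbits = 4 turns the /16 into /20 blocks; the last argument selects which one.
  # Result: ["10.0.0.0/20", "10.0.16.0/20", "10.0.32.0/20"]
  example_subnet_cidrs = [for i in range(3) : cidrsubnet(local.vpc_cidr_block, 4, i)]
}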
variable "minio_aws_vpc_cidr_block" { |
Define the VPC resource in Terraform. Any subnet created will be based on this VPC.
resource "aws_vpc" "minio_aws_vpc" { |
Set up 3 different networks: Public, Private and Isolated.
The Public Network with Internet Gateway (IGW) will have inbound and outbound internet access with a public IP and an Internet Gateway.
variable "minio_public_igw_cidr_blocks" { |
The aws_subnet
resource will loop 3 times, creating 3 public subnets in the VPC:
resource "aws_subnet" "minio_aws_subnet_public_igw" { |
The Private Network with NAT Gateway (NGW) will have outbound network access, but no inbound network access, with a private IP address and NAT Gateway.
variable "minio_private_ngw_cidr_blocks" { |
The aws_subnet
resource will loop 3 times, creating 3 private subnets in the VPC:
resource "aws_subnet" "minio_aws_subnet_private_ngw" { |
Finally, we create an isolated, air-gapped network with neither outbound nor inbound internet access. This network is completely cut off from the internet and uses only private IP addresses.
variable "minio_private_isolated_cidr_blocks" { |
The aws_subnet
resource will loop 3 times, creating 3 isolated/air-gapped subnets in the VPC:
resource "aws_subnet" "minio_aws_subnet_private_isolated" { |
MinIO Kubernetes Cluster
Create a Kubernetes cluster on which we’ll deploy our MinIO cluster. The minio_aws_eks_cluster_subnet_ids
will be provided by the VPC that we’ll create. Later, we’ll show how to stitch all this together in the deployment phase.
variable "minio_aws_eks_cluster_subnet_ids" { |
Note: In production you probably don’t want public access to the Kubernetes API endpoint, since it opens up control of the cluster and could become a security issue.
You will also need a couple of roles to ensure the Kubernetes cluster can communicate properly via the networks we’ve created, and those are defined at eks/main.tf#L1-L29. The Kubernetes cluster definition is as follows
resource "aws_eks_cluster" "minio_aws_eks_cluster" { |
The cluster takes in the API requests made from commands like kubectl, but there’s more to it than that: the workloads need to be scheduled somewhere. This is where a Kubernetes node group is required. Below, we define the node group name, the instance type, and the desired group size. Since we have 3 AZs, we’ll create 3 nodes, one for each of them.
variable "minio_aws_eks_node_group_name" { |
You need a couple of roles to ensure the Kubernetes node group can communicate properly, and those are defined at eks/main.tf#L48-L81. The Kubernetes node group (workers) definition is as follows:
resource "aws_eks_node_group" "minio_aws_eks_node_group" { |
This configuration will launch a control plane with worker nodes in any of the 3 VPC networks we configured. We’ll show the output of kubectl get no later, once the cluster is launched.
MinIO Deployment
By now, we have all the necessary infrastructure in code form. Next, we’ll deploy these resources and create the cluster on which we’ll deploy MinIO.
Install Terraform using the following command
brew install terraform |
Install aws CLI using the following command
brew install awscli |
Create an AWS IAM user with the following policy. Note the AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY
after creating the user.
Set environment variables for AWS, as they will be used by terraform and awscli.
$ export AWS_ACCESS_KEY_ID=<access_key> |
Create a folder called hello_world
in the same directory as modules
using the structure below
.
├── hello_world
│   ├── main.tf
│   ├── outputs.tf
│   ├── terraform.tfvars
│   └── variables.tf
├── modules
│   ├── eks
│   └── vpc
https://github.com/minio/blog-assets/tree/main/ci-cd-deploy/terraform/aws/hello_world
Create a file called terraform.tfvars
and set the following variable
hello_minio_aws_region = "us-east-1" |
Create a file called main.tf
and initialize the Terraform AWS provider and S3 backend. Note that the S3 bucket needs to exist beforehand. We use the S3 backend to store the state so that it can be shared among developers and CI/CD processes alike, without having to keep local state in sync across the org.
terraform { |
Setting the backend bucket and key as variables is not supported, so those values need to be hard coded.
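A sketch of what that terraform block might look like; the bucket and key below are placeholders, and the region comes from the hello_minio_aws_region variable we just set:

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }

  backend "s3" {
    # These cannot be variables, so they are hard coded; the bucket must
    # already exist. The names here are placeholders.
    bucket = "my-terraform-state-bucket"
    key    = "hello_world/terraform.tfstate"
    region = "us-east-1"
  }
}

provider "aws" {
  region = var.hello_minio_aws_region
}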
Call the VPC module from main.tf
and name it hello_minio_aws_vpc
module "hello_minio_aws_vpc" { |
These are the variables required by vpc module
hello_minio_aws_vpc_cidr_block = "10.0.0.0/16"
hello_minio_aws_vpc_cidr_newbits = 4
hello_world/terraform.tfvars#L3-L22
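Putting the two together, the module call in main.tf might look like the sketch below; the module's input names are inferred from the variables shown earlier and may not match the repository exactly:

module "hello_minio_aws_vpc" {
  source = "../modules/vpc"

  minio_aws_vpc_cidr_block   = var.hello_minio_aws_vpc_cidr_block
  minio_aws_vpc_cidr_newbits = var.hello_minio_aws_vpc_cidr_newbits

  minio_public_igw_cidr_blocks       = var.hello_minio_public_igw_cidr_blocks
  minio_private_ngw_cidr_blocks      = var.hello_minio_private_ngw_cidr_blocks
  minio_private_isolated_cidr_blocks = var.hello_minio_private_isolated_cidr_blocks
}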
Once the VPC has been created, the next step is to create the Kubernetes cluster. The only value we will use from the VPC creation is minio_aws_eks_cluster_subnet_ids
. We’ll use the private subnets created by the VPC
module "hello_minio_aws_eks_cluster" { |
These are the variables required by EKS module
hello_minio_aws_eks_cluster_name = "hello_minio_aws_eks_cluster" |
hello_world/terraform.tfvars#L24-L32
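And the matching EKS module call, again as a sketch; the output name on the VPC module used for the subnet IDs is an assumption, so check the vpc module's outputs.tf for the real one:

module "hello_minio_aws_eks_cluster" {
  source = "../modules/eks"

  minio_aws_eks_cluster_name    = var.hello_minio_aws_eks_cluster_name
  minio_aws_eks_node_group_name = var.hello_minio_aws_eks_node_group_name

  # Wire in the private subnets created by the VPC module; the output name
  # here is illustrative.
  minio_aws_eks_cluster_subnet_ids = module.hello_minio_aws_vpc.minio_private_subnet_ids
}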
Finally we’ll apply the configuration. While still in the hello_world
directory run the following terraform
commands. This will take about 15-20 minutes to get the entire infrastructure up and running. Towards the end, you should see an output similar to below:
$ terraform init
…TRUNCATED…
$ terraform apply
…TRUNCATED…
hello_minio_aws_eks_cluster_name = "hello_minio_aws_eks_cluster"
…TRUNCATED…
Update your --kubeconfig
default configuration to use the cluster we just created using aws eks
command. The --region
and --name
are available from the previous output.
$ aws eks --region us-east-1 update-kubeconfig \ |
Check to verify that you can get a list of nodes
$ kubectl get no |
Next, install the AWS EBS CSI driver so that gp2 PersistentVolumeClaims (PVCs) can be mounted. We are using gp2 because it is the default storage class supported by AWS.
Set credentials for the AWS secret using the same credentials used for awscli
kubectl create secret generic aws-secret \ |
Apply the EBS drivers resources:
$ kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.12" |
Your Kubernetes cluster should be ready now.
Now we’re ready to deploy MinIO. First, clone the MinIO repository
$ git clone https://github.com/minio/operator.git |
Since this is AWS, we need to update the storageClassName
to gp2
. Open the following file and update any references from storageClassName: standard
to storageClassName: gp2
. Each MinIO tenant has its own tenant.yaml
that contains the storageClassName configuration. Based on the tenant you are using, be sure to update the storageClassName accordingly.
$ vim ./operator/examples/kustomization/base/tenant.yaml |
Apply the resources to Kubernetes to install MinIO
$ kubectl apply -k operator/resources |
Wait at least 5 minutes for the resources to come up, then verify that MinIO is up and running.
$ kubectl -n tenant-lite get po -o wide |
Looking at the above output, the storage-lite-pool- pods are spread across the worker nodes. Two of them share a node because there are 4 MinIO pods but only 3 nodes, and that is okay because we only have 3 availability zones (AZs). In total there are 3 nodes across 3 AZs and 4 MinIO pods with 2 PVCs each, which is reflected in the status 8 Online
below.
$ kubectl -n tenant-lite logs storage-lite-pool-0-0 |
You will need the TCP port of the MinIO console; in this case it is 9443
.
$ kubectl -n tenant-lite get svc | grep -i console |
With this information, we can set up Kubernetes port forwarding. We chose port 39443
for the host, but this could be anything; just be sure to use the same port when accessing the console through a web browser.
$ kubectl -n tenant-lite port-forward svc/storage-lite-console 39443:9443 |
Access the MinIO Console through the web browser using the following credentials:
URL: https://localhost:39443
User: minio
Password: minio123
You now have a full production setup of a distributed MinIO cluster. Here is how you can automate it using Jenkins; the Execute Shell step’s command is shown below in text format:
export PATH=$PATH:/usr/local/bin
cd ci-cd-deploy/terraform/aws/hello_world/
terraform init
terraform plan
terraform apply -auto-approve
Final Thoughts
In these past few blogs of the CI/CD series we’ve shown you how nimble and flexible MinIO is. You can build it into anything you want using Packer and deploy it in VMs or Kubernetes clusters wherever it is needed. This allows your developers to have as close to a production infrastructure as possible in their development environment, while at the same time leveraging powerful security features such as Server Side Object Encryption and managing IAM policies for restricting access to buckets.
In a production environment, you might want to restrict the IAM user to a specific policy, but that really depends on your use case. For demonstration purposes, we kept things simple with a broad policy, but in production you would want to narrow it down to specific resources and groups of users. In a later blog, we’ll show some best practices for designing your infrastructure across different AZs and regions.
Would you like to try automating the kubectl
part as well with Jenkins instead of applying manually? Let us know what type of pipeline you’ve built using our tutorials for planning, deploying, scaling and securing MinIO across the multicloud, and reach out to us on our Slack and share your pipelines!