Using K8ssandra to Backup and Restore Cassandra with MinIO

Apache Cassandra is a distributed NoSQL database that enjoys broad popularity with the developer community. From Ably to Yelp, Cassandra is commonly deployed because of its ability to quickly process queries across petabytes of data and scale to hundreds or thousands of nodes. Cassandra can be deployed to any environment with relative ease, scales seamlessly both up and out, and operates well in distributed mode. It is also flexible in how it stores data and can handle high volume without compromising on performance. Because it is open source, fixes ship on a quick iteration cycle, and the Cassandra Query Language (CQL) is easy to learn for anyone who has worked with SQL syntax.

Cassandra is commonly used for write-intensive data models such as metrics and time series data, page-view history, taxi tracking, and so on. What folks stress most when it comes to Cassandra is pairing the right data model with the appropriate use case. And, of course, make sure you tune your JVM properly in order to have a pleasant Cassandra experience.

Cassandra comes in a few different flavors:

The Apache version is the original open source version of Cassandra, which was developed by Meta/Facebook. It has all the core features that are required to run Cassandra, but support is community based. You have to be familiar with the command line tools to manage and operate the cluster.
The DataStax version of Cassandra is built on top of the open source version but with a focus on security and supportability for the enterprise. For instance, one cool feature it provides is OpsCenter, a browser-based console UI that shows the state of the cluster and can be used to perform operational tasks.

There are several ways to deploy Cassandra: bare metal, VMs, containers, etc. Today we’ll deploy it to a Kubernetes cluster with the help of K8ssandra, which sets up all the necessary scaffolding so that our Cassandra cluster is up and running in no time.

No matter which method you use to deploy Cassandra, it makes sense to pair it with cloud-native object storage to make the most of it. MinIO is the perfect complement to Cassandra/K8ssandra because of its industry-leading performance and scalability. That combination puts every data-intensive workload, not just Cassandra, within reach. MinIO has created a comprehensive blueprint for data infrastructure to support exascale AI and other large scale data lake workloads. It is called the MinIO DataPod. Why? Because exascale data is a common reality in today's enterprise.

MinIO was designed and built to be Kubernetes-native and cloud-native, and to scale seamlessly from TBs to EBs and beyond. MinIO tenants are fully isolated from each other in their own namespace. By following the Kubernetes plugin and operator paradigm, MinIO fits seamlessly into existing DevOps practices and toolchains, making it possible to automate Cassandra backup operations. MinIO makes a safe home for Cassandra backups. Data written to MinIO is immutable, versioned and protected by erasure coding. Let’s look at how you can back up and restore data fast with MinIO to minimize downtime.

Prerequisites

Before we get started there are some prerequisites you need to have ready for our installation.

Helm
Kubernetes cluster (Minikube, Kind etc.)

In this guided tour we will use Minikube as our Kubernetes cluster of choice but any properly configured Kubernetes cluster will work.

Cassandra

There are several ways we can install Apache Cassandra. In this example we will deploy Cassandra in a semi-automated way using K8ssandra.

Configuring

We will deploy a simple Cassandra cluster with the following spec. Save this YAML as k8ssandra.yaml.

cassandra:
  version: "3.11.10"
  cassandraLibDirVolume:
    storageClass: local-path
    size: 5Gi
  allowMultipleNodesPerWorker: true
  heap:
    size: 1G
    newGenSize: 1G
  resources:
    requests:
      cpu: 1000m
      memory: 2Gi
    limits:
      cpu: 1000m
      memory: 2Gi
  datacenters:
  - name: dc1
    size: 1
    racks:
    - name: default
kube-prometheus-stack:
  grafana:
    adminUser: admin
    adminPassword: admin123
stargate:
  enabled: true
  replicas: 1
  heapMB: 256
  cpuReqMillicores: 200
  cpuLimMillicores: 1000
medusa:
  enabled: true
  storage: s3_compatible
  storage_properties:
    host: minio
    port: 9000
    secure: "False"
  bucketName: k8ssandra-medusa
  storageSecret: medusa-bucket-key

We will not go through the entire spec but we will go through some of the key components below:

Storage Classes

There are a couple of different storage classes that you can use, but the main requirement is that the VOLUMEBINDINGMODE must be WaitForFirstConsumer.

The default Minikube storage class uses the Immediate binding mode, which is not supported, so let's create a Rancher local-path storage class.

kubectl get storageclass

NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
standard (default)   k8s.io/minikube-hostpath   Delete          Immediate           false                  4m33s

kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml

namespace/local-path-storage created
serviceaccount/local-path-provisioner-service-account created
clusterrole.rbac.authorization.k8s.io/local-path-provisioner-role created
clusterrolebinding.rbac.authorization.k8s.io/local-path-provisioner-bind created
deployment.apps/local-path-provisioner created
storageclass.storage.k8s.io/local-path created
configmap/local-path-config created

Now you should see a second storage class with the required binding mode:

kubectl get storageclass

NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path           rancher.io/local-path      Delete          WaitForFirstConsumer   false                  20s
standard (default)   k8s.io/minikube-hostpath   Delete          Immediate              false                  6m30s
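If you would rather check the binding mode from a script than read the table by eye, the same information is available from kubectl get storageclass -o json. The following is a minimal Python sketch under that assumption; the function name classes_with_binding_mode and the trimmed sample document are ours, made up for illustration.

```python
import json


def classes_with_binding_mode(storageclass_list, mode="WaitForFirstConsumer"):
    """Return the names of StorageClasses whose volumeBindingMode equals `mode`."""
    names = []
    for item in storageclass_list.get("items", []):
        # Kubernetes treats a missing volumeBindingMode as "Immediate".
        if item.get("volumeBindingMode", "Immediate") == mode:
            names.append(item["metadata"]["name"])
    return names


# Hypothetical, trimmed sample shaped like `kubectl get storageclass -o json`.
SAMPLE_STORAGE_CLASSES = json.loads("""
{
  "items": [
    {"metadata": {"name": "local-path"},
     "provisioner": "rancher.io/local-path",
     "volumeBindingMode": "WaitForFirstConsumer"},
    {"metadata": {"name": "standard"},
     "provisioner": "k8s.io/minikube-hostpath",
     "volumeBindingMode": "Immediate"}
  ]
}
""")

print(classes_with_binding_mode(SAMPLE_STORAGE_CLASSES))
```

Any name returned by this helper is a candidate for the storageClass field under cassandraLibDirVolume in k8ssandra.yaml.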

Medusa

Medusa is a component of the Apache Cassandra ecosystem. At its core Medusa orchestrates the backup and restore of the Cassandra cluster.

The fields of particular interest are:

host
port
storageSecret
bucketName

These fields allow you to connect to MinIO to perform backup and restore operations.

Storage Secret

The storage secret is a Kubernetes Secret that we’ll create with credentials to connect to MinIO. Put the below contents in a file called minio-medusa-secret.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: medusa-bucket-key
type: Opaque
stringData:
  medusa_s3_credentials: |-
    [default]
    aws_access_key_id = miniok8ssandra_key
    aws_secret_access_key = miniok8ssandra_secret

metadata.name must match the storageSecret field in k8ssandra.yaml.
aws_access_key_id and aws_secret_access_key must match MinIO’s accessKey and secretKey, respectively.
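Because the secret has to agree with two other places (k8ssandra.yaml and the MinIO install), it can help to generate the manifest from a single set of values instead of typing them three times. Here is a small Python sketch of that idea. The helper names render_medusa_secret and parse_credentials are hypothetical, and the template simply mirrors the manifest shown above.

```python
import configparser

# Mirrors the Secret manifest from this tutorial; only the three values vary.
SECRET_TEMPLATE = """\
apiVersion: v1
kind: Secret
metadata:
  name: {name}
type: Opaque
stringData:
  medusa_s3_credentials: |-
    [default]
    aws_access_key_id = {access_key}
    aws_secret_access_key = {secret_key}
"""


def render_medusa_secret(name, access_key, secret_key):
    """Return the Secret manifest as text for the given name and credentials."""
    return SECRET_TEMPLATE.format(
        name=name, access_key=access_key, secret_key=secret_key
    )


def parse_credentials(ini_text):
    """Parse the INI-style credentials block and return its [default] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return dict(parser["default"])


manifest = render_medusa_secret(
    "medusa-bucket-key", "miniok8ssandra_key", "miniok8ssandra_secret"
)
print(manifest)
```

Writing the printed text to minio-medusa-secret.yaml gives the same file as above. parse_credentials is handy for reading the credentials block back when you want to compare it against other settings.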

Apply the minio-medusa-secret.yaml resource to the cluster:

kubectl apply -f minio-medusa-secret.yaml

secret/medusa-bucket-key created

Verify with kubectl get secret medusa-bucket-key:

NAME                TYPE     DATA   AGE
medusa-bucket-key   Opaque   1      12s

Installing

To install, add the K8ssandra Helm repo and update Helm:

Add the repo: helm repo add k8ssandra https://helm.k8ssandra.io/stable
Verify with helm repo list
Run helm repo update to pull the latest updates

Install Apache Cassandra using K8ssandra. Before you do, make sure the secret medusa-bucket-key exists.

helm install -f k8ssandra.yaml k8ssandra k8ssandra/k8ssandra

The above command should do the following:

Create a Cassandra cluster named k8ssandra.
Deploy the cluster to the default Kubernetes namespace.

Wait at least 5 minutes for the cluster to come fully online. Running kubectl get pods should eventually show every pod in the Running state. In this walkthrough, however, one pod ends up in CrashLoopBackOff:

NAME READY STATUS RESTARTS AGE
k8ssandra-cass-operator-699b8c4cbc-ptbtk 1/1 Running 0 3m35s
k8ssandra-dc1-default-sts-0 2/3 CrashLoopBackOff 4 (14s ago) 3m23s
k8ssandra-dc1-stargate-556ddcc5-8rb7v 0/1 Init:0/1 0 3m34s
k8ssandra-grafana-795f9865fc-7v9lq 2/2 Running 0 3m35s
k8ssandra-kube-prometheus-operator-9544bb7bf-b8m6k 1/1 Running 0 3m35s
k8ssandra-medusa-operator-66d8969ff9-knkd6 1/1 Running 0 3m35s
k8ssandra-reaper-operator-77554f9458-zd5mh 1/1 Running 0 3m35s
prometheus-k8ssandra-kube-prometheus-prometheus-0 2/2 Running 0 3m31s
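Spotting the one unhealthy pod in a long listing gets harder as clusters grow. As a rough sketch, the same check can be scripted against the output of kubectl get pods -o json. The function name unready_containers is our own, and the sample below is a hypothetical, heavily trimmed document; the cassandra container name in it is an assumption used only for illustration.

```python
def unready_containers(pod_list):
    """Return (pod, container, reason) tuples for containers that are not ready."""
    problems = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            if status.get("ready"):
                continue
            # A waiting container carries a reason such as CrashLoopBackOff.
            waiting = status.get("state", {}).get("waiting") or {}
            reason = waiting.get("reason", "NotReady")
            problems.append((pod_name, status["name"], reason))
    return problems


# Hypothetical, heavily trimmed sample shaped like `kubectl get pods -o json`.
SAMPLE_PODS = {
    "items": [
        {
            "metadata": {"name": "k8ssandra-dc1-default-sts-0"},
            "status": {
                "containerStatuses": [
                    {"name": "cassandra", "ready": True, "state": {"running": {}}},
                    {
                        "name": "medusa",
                        "ready": False,
                        "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                    },
                ]
            },
        }
    ]
}

for pod, container, reason in unready_containers(SAMPLE_PODS):
    print(f"{pod}/{container}: {reason}")
```

The helper only narrows down which container to look at. The actual cause still has to come from kubectl describe and kubectl logs.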

Let’s take this opportunity to explore debugging methodology. This is a little outside the scope of an introductory tutorial, but it will be helpful in the long run.

Debugging

Let’s describe the pod to see what it is stuck on:

kubectl describe pod k8ssandra-dc1-default-sts-0

  medusa:
    ……
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
    ……

We can see from the describe output above that the medusa container is having issues. Let's look at its logs to see if we can determine what the issue is:

kubectl logs k8ssandra-dc1-default-sts-0 -c medusa

[2022-07-06 14:06:18,778] DEBUG: Starting new HTTP connection (1): minio:9000
...
socket.gaierror: [Errno -2] Name or service not known
...

Aha! It looks like there is an issue connecting to the MinIO service, which makes sense because we do not have any service named minio running in the cluster.

kubectl get service

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
k8ssandra-dc1-additional-seed-service ClusterIP None <none> <none> 3h26m
k8ssandra-dc1-all-pods-service ClusterIP None <none> 9042/TCP,8080/TCP,9103/TCP 3h26m
k8ssandra-dc1-service ClusterIP None <none> 9042/TCP,9142/TCP,8080/TCP,9103/TCP,9160/TCP 3h26m
k8ssandra-dc1-stargate-service ClusterIP 10.110.1.67 <none> 8080/TCP,8081/TCP,8082/TCP,8084/TCP,8085/TCP,9042/TCP 3h27m
k8ssandra-grafana ClusterIP 10.100.244.68 <none> 80/TCP 3h27m
k8ssandra-kube-prometheus-operator ClusterIP 10.96.25.118 <none> 443/TCP 3h27m
k8ssandra-kube-prometheus-prometheus ClusterIP 10.104.105.81 <none> 9090/TCP 3h27m
k8ssandra-reaper-reaper-service ClusterIP 10.98.35.86 <none> 8080/TCP 3h26m
k8ssandra-seed-service ClusterIP None <none> <none> 3h26m
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 5h3m
prometheus-operated ClusterIP None <none> 9090/TCP 3h26m

Tip: If you run into any issues with backups/restores and need additional debug info, the medusa container is where you would find most of the details.
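The socket.gaierror in the Medusa log is a plain DNS lookup failure, and it can be reproduced outside Medusa. The following is a minimal Python sketch and is not part of Medusa or K8ssandra. The function name can_resolve and its resolver parameter are our own, and it only gives a meaningful answer when run from a pod inside the cluster, where the service name minio is expected to resolve once the service exists.

```python
import socket


def can_resolve(host, port, resolver=socket.getaddrinfo):
    """Return True if `host` resolves, False if the lookup raises socket.gaierror.

    `resolver` is injectable so the function can be exercised without DNS.
    """
    try:
        resolver(host, port)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Host and port taken from the medusa storage_properties in k8ssandra.yaml.
    print("minio resolves:", can_resolve("minio", 9000))
```

A False result here points at a missing or misnamed Kubernetes service rather than at credentials or the bucket, which narrows down where to look next.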

MinIO

As we found out, for the medusa container to come up we need to first get MinIO online. We will use Helm to set up MinIO as well.

Installing

To install, add the MinIO Helm repo and update Helm:

Add the repo: helm repo add minio https://helm.min.io/
Verify with helm repo list
Run helm repo update to pull the latest updates

Install MinIO using the helm command:

helm install --set accessKey=miniok8ssandra_key,secretKey=miniok8ssandra_secret,defaultBucket.enabled=true,defaultBucket.name=k8ssandra-medusa minio minio/minio

accessKey and secretKey must match aws_access_key_id and aws_secret_access_key, respectively, in minio-medusa-secret.yaml.
defaultBucket.name must match bucketName in k8ssandra.yaml.
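These matching rules are easy to get wrong by a single character, so here is a small Python sketch that cross-checks them before you install. It is only an illustration: parse_set_flags handles nothing beyond a simple comma-separated key=value list (Helm's real --set syntax also supports escaping, lists and nesting), and both function names are hypothetical.

```python
def parse_set_flags(set_arg):
    """Parse a simple comma-separated key=value list as passed to `helm --set`."""
    flags = {}
    for item in set_arg.split(","):
        key, _, value = item.partition("=")
        flags[key.strip()] = value.strip()
    return flags


def find_mismatches(minio_flags, secret_creds, medusa_bucket_name):
    """Return a list of human-readable mismatches; an empty list means all agree."""
    mismatches = []
    if minio_flags.get("accessKey") != secret_creds.get("aws_access_key_id"):
        mismatches.append("accessKey does not match aws_access_key_id")
    if minio_flags.get("secretKey") != secret_creds.get("aws_secret_access_key"):
        mismatches.append("secretKey does not match aws_secret_access_key")
    if minio_flags.get("defaultBucket.name") != medusa_bucket_name:
        mismatches.append("defaultBucket.name does not match bucketName")
    return mismatches


# Values copied from this tutorial.
MINIO_SET = (
    "accessKey=miniok8ssandra_key,secretKey=miniok8ssandra_secret,"
    "defaultBucket.enabled=true,defaultBucket.name=k8ssandra-medusa"
)
SECRET_CREDS = {
    "aws_access_key_id": "miniok8ssandra_key",
    "aws_secret_access_key": "miniok8ssandra_secret",
}

print(find_mismatches(parse_set_flags(MINIO_SET), SECRET_CREDS, "k8ssandra-medusa"))
```

With the values used in this tutorial the list of mismatches is empty. Any entry in the list names the pair of settings that disagree.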

The above command will do the following:

Create a deployment and service named minio.
Deploy MinIO to the default Kubernetes namespace.

If you wait a few minutes and run kubectl get pods you should see all of them in the Running state. But if you are impatient like me and want to see the medusa container come up quickly, just delete the k8ssandra-dc1-default-sts-0 pod and let Kubernetes recreate it.

kubectl delete pod k8ssandra-dc1-default-sts-0

After about a minute or so you should see all the pods Running:

kubectl get po
NAME READY STATUS RESTARTS AGE
k8ssandra-cass-operator-699b8c4cbc-s7qqz 1/1 Running 0 3h46m
k8ssandra-dc1-default-sts-0 3/3 Running 0 3h34m
k8ssandra-dc1-stargate-556ddcc5-mz2fp 1/1 Running 0 3h46m
k8ssandra-grafana-795f9865fc-95pwc 2/2 Running 0 3h46m
k8ssandra-kube-prometheus-operator-9544bb7bf-f6nxb 1/1 Running 0 3h46m
k8ssandra-medusa-operator-66d8969ff9-svbbw 1/1 Running 0 3h46m
k8ssandra-reaper-679fd4fc7d-lbjhw 1/1 Running 0 3h33m
k8ssandra-reaper-operator-77554f9458-zkdcl 1/1 Running 0 3h46m
minio-5fb8f49576-l2s7l 1/1 Running 0 3h37m
prometheus-k8ssandra-kube-prometheus-prometheus-0 2/2 Running 0 3h46m

Verify Bucket

How do we check if Medusa was able to connect to MinIO? Well there is a very simple way. If we port forward the MinIO console and log in we should see a bucket named k8ssandra-medusa.

kubectl port-forward service/minio 39000:9000
Using a browser, go to http://localhost:39000
Log in with the accessKey and secretKey as the username and password, respectively.
If you see the bucket, then Medusa was able to successfully connect to MinIO.

Backup

In order to test the backup there are a few prerequisites:

cqlsh: The utility that is used to interact with Cassandra clusters.
Test data which we can hydrate, delete and restore to show the capabilities.

Install cqlsh

In order to interact with Cassandra we’ll use a utility called cqlsh.

Install cqlsh using pip

pip3 install cqlsh

Get Cassandra Superuser credentials

kubectl get secret k8ssandra-superuser -o jsonpath="{.data.username}" | base64 --decode ; echo
kubectl get secret k8ssandra-superuser -o jsonpath="{.data.password}" | base64 --decode ; echo
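Kubernetes stores every value under a Secret's data field base64-encoded, which is why the commands above pipe through base64 --decode. If you fetch the whole secret with -o json instead, both fields can be decoded in one go. This is a minimal Python sketch; the function name decode_secret_data is ours, and the sample credentials are placeholders rather than real K8ssandra values.

```python
import base64


def decode_secret_data(secret):
    """Decode every base64 value under the Secret's `data` field."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in secret.get("data", {}).items()
    }


def _encode(text):
    """Helper for building the sample below; mirrors how Kubernetes stores data."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# Hypothetical sample shaped like `kubectl get secret k8ssandra-superuser -o json`.
SAMPLE_SECRET = {
    "kind": "Secret",
    "data": {
        "username": _encode("example-user"),
        "password": _encode("example-pass"),
    },
}

credentials = decode_secret_data(SAMPLE_SECRET)
print(credentials["username"])
```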

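Open a port-forward so that cqlsh can reach our Cassandra cluster. This command keeps running in the foreground:

kubectl port-forward svc/k8ssandra-dc1-stargate-service 8080 8081 8082 9042

Then, in a separate terminal, connect with the superuser credentials:

cqlsh -u <username> -p <password>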

Hydrate data

Once you are able to log in, you should see a prompt such as below:

<username>@cqlsh>

Copy and paste the following statements into the REPL, then run them:

CREATE KEYSPACE medusa_test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE medusa_test;
CREATE TABLE users (email text primary key, name text, state text);
insert into users (email, name, state) values ('alice@example.com', 'Alice Smith', 'TX');
insert into users (email, name, state) values ('bob@example.com', 'Bob Jones', 'VA');
insert into users (email, name, state) values ('carol@example.com', 'Carol Jackson', 'CA');
insert into users (email, name, state) values ('david@example.com', 'David Yang', 'NV');

Once the above statements have run, let’s verify the data was entered properly:

SELECT * FROM medusa_test.users;

Now we can finally back up to MinIO using the following command. Because this is a small data set, it shouldn’t take too long.

helm install demo-backup k8ssandra/backup --set name=backup1,cassandraDatacenter.name=dc1

Be sure to note the name of the backup which is required later in this tutorial for restoring.

Verify the status of the backup. If the below command returns a timestamp, it means the backup was successful.

kubectl get cassandrabackup backup1 -o jsonpath="{.status.finishTime}"
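Rather than re-running that command by hand until a timestamp appears, you can poll for it. The Python sketch below shows one way to do that, assuming kubectl is on your PATH. The names finish_time, wait_for_finish and fetch_backup are ours, and the retry count and delay are arbitrary defaults, not values recommended by K8ssandra.

```python
import json
import subprocess
import time


def finish_time(resource):
    """Return .status.finishTime from a resource dict, or None if not set yet."""
    return (resource.get("status") or {}).get("finishTime")


def wait_for_finish(fetch, attempts=30, delay_seconds=10.0, sleep=time.sleep):
    """Call `fetch()` until the resource reports a finishTime or attempts run out."""
    for attempt in range(attempts):
        stamp = finish_time(fetch())
        if stamp:
            return stamp
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return None


def fetch_backup(name="backup1"):
    """Fetch the cassandrabackup resource as a dict using kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "cassandrabackup", name, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    print("finished at:", wait_for_finish(fetch_backup))
```

The fetch argument is deliberately a plain callable, so the same loop can be reused for any resource that reports a finishTime in its status.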

You should verify that the backup was created in MinIO console as well:

Set up port forwarding: kubectl port-forward service/minio 39000:9000
Go to the MinIO Console at http://localhost:39000

Restore

Now that we have confirmed the backup was successful, as an SRE/DevOps engineer you will also want to verify that you can restore it. In practice, test the restore against your entire dataset to make sure everything comes back.

Delete data

While still in the cqlsh REPL, delete the existing data.

TRUNCATE medusa_test.users;

Verify that the data has in fact been removed:

SELECT * FROM medusa_test.users;

email | name | state
-------+------+-------

(0 rows)

When we had data there were 4 rows and now we have 0.
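If you want to compare the table contents before and after a restore in a script, the tabular output that cqlsh prints can be turned into rows. The parser below is a simplified Python sketch of our own (parse_cqlsh_table is not part of cqlsh), and it assumes that no column value contains the | character.

```python
def parse_cqlsh_table(output):
    """Parse cqlsh's tabular SELECT output into a list of {column: value} dicts."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # The header row is followed by a separator made only of '-' and '+'.
    sep_index = next(
        (i for i, line in enumerate(lines) if set(line) <= {"-", "+"}), None
    )
    if sep_index is None or sep_index == 0:
        return []
    columns = [name.strip() for name in lines[sep_index - 1].split("|")]
    rows = []
    for line in lines[sep_index + 1:]:
        if line.startswith("(") and line.endswith("rows)"):
            break  # trailing "(N rows)" summary line
        values = [value.strip() for value in line.split("|")]
        rows.append(dict(zip(columns, values)))
    return rows


# The empty result shown above, after the TRUNCATE.
EMPTY_OUTPUT = """
 email | name | state
-------+------+-------

(0 rows)
"""

print(len(parse_cqlsh_table(EMPTY_OUTPUT)))
```

Saving the parsed rows before the TRUNCATE and comparing them with the rows returned after the restore gives a quick equality check instead of counting rows by eye.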

Fetch Backup

Similar to how we ran the backup, we’ll use Helm to run the restore as well:

helm install demo-restore k8ssandra/restore --set name=restore-backup1,backup.name=backup1,cassandraDatacenter.name=dc1

In the above command backup.name should match the name of the original backup.

Check the status of the restore:

kubectl get cassandrarestore restore-backup1 -o jsonpath="{.status}"

When you see finishTime in the output, the restore has completed. You can also verify this by querying the list of users again, which should return the original 4 rows:

SELECT * FROM medusa_test.users;

Conclusion

There you go! Let’s recap what we did:

Using K8ssandra we deployed Cassandra to Kubernetes
We deployed MinIO object storage
We successfully backed up and restored the data in Apache Cassandra to and from MinIO object storage.

You can take this one step further and perform a disaster recovery scenario where you can bring up a new cluster from scratch, connect Medusa to MinIO, and start restoring from a previously backed up copy of the data. Give it a try and let us know how it goes!

If you have any questions regarding how to set this up, be sure to join our Slack channel here.