Airgapped MinIO Deployments

AJ on DevOps

A network is typically divided into segments such as DMZ, Public, Private, and Bastion, among others; the exact layout depends on your organization and its networking requirements. When deploying any application, we need to consider what type of application it is and which segment of the network it belongs in.

For example, if you are deploying a database, you do not want it on the Public network. You probably want it in a Private network where it cannot be reached from the internet. Why? A database holds a lot of sensitive information, and the access controls in some databases are not stringent enough to withstand direct exposure. Also, end users seldom access the database directly; they generally go through a frontend application, which then performs structured queries on the database.

In this post we’ll talk about what an airgapped network is, what to consider when deploying MinIO in such an environment, and how to replicate and scale it thereafter with other airgapped sites.

What is an Airgapped Network?

Nodes in a Private network generally have IPs in the following CIDRs: 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. You cannot reach these nodes inbound from the internet unless you first connect to a proxy with a public IP, such as a VPN or reverse proxy. The nodes can usually still talk to the outside world, because the datacenter routes outbound requests through a NAT or Internet Gateway (IGW) device so that nodes can download packages and updates, among other operations.

An airgapped network, as the name suggests, goes a step further: not only can you not access it from the internet, but the nodes cannot connect out to the internet either. The nodes are completely locked down in this network. You might still be able to access them via VPN, but it is generally recommended to connect to a bastion host and make the airgapped network accessible only from the bastion node’s private IP.
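
As a small illustration of the private ranges listed above, here is a minimal Python sketch that checks whether an address falls inside one of those three CIDRs. The function name is_rfc1918 is made up for this example; only the standard library ipaddress module is used.

```python
import ipaddress

# The three private ranges named in the text (RFC 1918).
PRIVATE_CIDRS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address lies in one of the private CIDRs above."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_CIDRS)


if __name__ == "__main__":
    for candidate in ("10.1.2.3", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
        print(candidate, is_rfc1918(candidate))
```

Note that 172.16.0.0/12 ends at 172.31.255.255, so an address such as 172.32.0.1 is public even though it looks similar.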

Deploying MinIO in an Airgapped Network

So why is this level of security needed? For many reasons. We mentioned databases, but it especially applies to critical infrastructure components such as MinIO, where you cannot afford to expose the data to the outside world, yet you need to store it and keep it accessible to other applications within your ecosystem.

When designing an airgapped network, you have to ensure that every public resource a node needs in order to operate is reachable from inside the airgap. This means the operating system packages, RPM/APT packages, the MinIO binary, and any other dependency that the node would normally pull from the internet must already be available to the airgapped network. This can be achieved by syncing the repositories for all these dependencies locally in your datacenter, in a special network called the Ops Network. What makes the Ops Network special is that any service running in it can talk to the internet (but not vice versa), and any application in a private network within the datacenter (even an airgapped one) can talk to the services in the Ops Network. With this approach you kill not just two but several birds with one stone:

  • You can scrub the packages downloaded from the internet to ensure they have not been tampered with. You can rest assured that all your applications install packages that have been vetted as safe, from a trusted repo mirror reachable from the airgapped environment.
  • All the packages needed to build a node are available locally. If there is an outage of the upstream repo, it will not affect the operations of the airgapped environment.
  • In case of a complete node failure, time is of the essence. The faster you replace the node, the lower the chance of the cluster losing quorum if more nodes fail. By rebuilding nodes quicker you can bring down the MTTR (Mean Time to Resolve).
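
One way to picture the "scrub" step from the first bullet is a checksum comparison before an artifact is admitted to the internal mirror. The sketch below is only an illustration under assumptions: the function names are invented for this example, and it presumes you have obtained a trusted SHA-256 digest for the artifact through some separate channel.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_trusted_digest(path: str, expected_hex: str) -> bool:
    """Compare the file's SHA-256 with a digest obtained from a trusted source."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

A mirror sync job could call matches_trusted_digest for each downloaded file and refuse to publish anything that does not match. A checksum only proves the file is the one the publisher described; vetting the contents is a separate step.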

Once all the required resources are accessible from inside the airgapped network, deploying MinIO there is just like deploying it anywhere else. We recommend at least a Multi-Node Multi-Drive deployment at each site, so that if a drive or even an entire node fails, Erasure Coding can reconstruct the data from the data and parity shards spread across the remaining drives and nodes. You just have to make sure the port MinIO listens on is open on all the nodes and reachable in both directions from within the airgapped network.

So far so good. We designed an airgapped network and found a way to deploy MinIO in it with upstream resources mirrored locally. But how do we actually use the cluster? If it cannot reach the outside, how do we add data to it? It's actually quite simple: we can use the same techniques we would use for a traditional database. The network should be designed so that MinIO is only accessible through the application in front of it. That could be a full-blown frontend application that perhaps acts as a CDN, or a simple ETL workflow that uses MinIO to retrieve raw data and store the final processed result.

MinIO is never directly accessible to the end user, so in case of a network compromise in the Public subnet or the DMZ, the attacker will not be able to reach services in the airgapped network such as MinIO, and your data and models stay safe. Moreover, you will notice your overall cluster operating much more stably than before. Why is that, with everything else the same?

The reason is that stability matters: nothing in an airgapped network gets updated unless you explicitly want it to. Let me take a quick story-time detour. Ten years ago, while I was on an SRE team, we had the task of upgrading 260 Redis servers. While we were doing routine maintenance and staging the binary for the new version, we noticed our page load times going up and some features not loading at all. Unfortunately, all of them were services backed by Redis. We were baffled: we had only staged the binary, so why was everything going offline? After 5 minutes we noticed that all the Redis servers were automagically being upgraded to the newer binary that we had only staged, not deployed. It turned out that in our infrastructure automation system (Puppet) we had set Redis to upgrade as soon as a new package was staged. So it did what it was supposed to do, upgraded all the servers, and took down multiple Redis clusters at once. The fault was ours and we quickly rectified it. Even though our Redis servers were airgapped, we still managed to take them down, so just imagine what happens if you pull directly from upstream: your services get updated at the whim of the package maintainer.

MinIO updates are backwards compatible, and we highly recommend keeping up with new releases as regularly as possible using our no-downtime atomic upgrades. Still, we would not want your cluster to upgrade itself the moment we publish a binary. That is just bad DevOps practice; you should at least be aware of what is being upgraded on your systems. Some teams go the extra mile and make the entire node immutable, meaning that any time anything needs to be upgraded or even modified they rebuild the entire node, which ensures there is no configuration drift. That might be a larger undertaking than most cases require, so an airgapped environment is a good option in those cases.
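
To check the port requirement in practice, something like the following Python sketch could be run from each node against its peers. It assumes MinIO is listening on its default API port 9000 (adjust if you changed it); the function names and the hostnames in the usage note are placeholders invented for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_nodes(hosts, port: int = 9000, timeout: float = 2.0):
    """Return the hosts whose MinIO port could not be reached from this machine."""
    return [host for host in hosts if not port_is_open(host, port, timeout)]
```

For example, unreachable_nodes(["minio1.internal", "minio2.internal"]) returns the peers this node cannot reach. Because the port must be reachable in both directions, the same check would need to be run from every node, not just one.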
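
The lesson from the Redis story can be reduced to a simple rule: a staged package should never trigger an upgrade unless someone has explicitly approved that exact version. The Python sketch below is a hypothetical illustration of such a gate, not how Puppet or MinIO implement anything; the function name and the release strings in the usage note are example values only.

```python
def should_upgrade(installed: str, staged: str, approved: str) -> bool:
    """Upgrade only when the staged release is the one an operator approved.

    Staging a new binary in the mirror is not enough on its own: the
    approved version must be changed deliberately to match it.
    """
    return staged == approved and staged != installed
```

With installed set to "RELEASE.2023-10-01T00-00-00Z" and a newer "RELEASE.2023-11-01T00-00-00Z" staged in the mirror, should_upgrade stays False until the approved value is deliberately changed to the newer release.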

Airgapped Site Replication and Security

We talked about a Single Cluster in a Single Region, but what about multi-site? Does our Airgap network now have to talk over the internet to reach other sites? Doesn’t that defeat the purpose of an airgapped environment?

Just as the internet runs over physical fiber connections, data centers are for the most part connected to each other via Internet Exchange facilities. These facilities provide dark fiber WAN (Wide Area Network) links that you can do essentially anything with: you can start your own ISP, or connect two data centers located in different parts of the world. Any protocol can run on these WAN links, and you don’t need to run a VPN on top of them. While we still recommend encrypting MinIO traffic, the WAN link should also have its own encryption so that data cannot be sniffed by neighboring traffic, since essentially all these strands are shared.

Once the WAN link is up and operational, you will be able to talk to other MinIO clusters on a private airgapped network just as if site-to-site replication were running over the regular internet. The differences are:

  • The traffic is completely private, which provides an extra layer of security.
  • Because you are not sharing the pipe with anyone else, you have access to its full throughput. For critical and time-sensitive applications, a dedicated WAN link is generally preferred.

Please take a moment to read this blog post where we talk about best practices for Site to Site replication.

Monitoring MinIO in Airgap

So if everything is airgapped, how can you send MinIO cluster metrics and use the diagnostics analysis with SUBNET?

That is a great question, and it is exactly why we provide the mc support diag command with an --airgap flag. You can also anonymize the data so that sensitive information such as hostnames and IPs is obfuscated before the diagnostics bundle is sent.

To create a diagnostics bundle in an airgapped environment, run the following command, which also anonymizes the data:

mc support diag myminio --airgap --anonymize=strict

● CPU Info ... ✔
● Disk Info ... ✔
● Net Info ... ✔
● Os Info ... ✔
● Mem Info ... ✔
● Process Info ... ✔
● Server Config ... ✔
● System Errors ... ✔
● System Services ... ✔
● System Config ... ✔
● Admin Info ... ✔

mc: MinIO diagnostics report saved to myminio-health_20231111053323.json.gz

Log in to https://subnet.min.io/ and go to the Deployments section. Select the deployment you want this report analyzed for and upload the gzip file generated above. Once the file is uploaded, our engineers in SUBNET can help you diagnose any issues with your cluster, sometimes issues you might not even have been aware of.
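
Before a bundle leaves the airgapped network, you may want to take a quick look at it yourself. The Python sketch below is one possible sanity check, not part of mc: it treats the .json.gz report as plain text and lists any strings that parse as IPv4 addresses. The function name is invented for this example, and no assumption is made about the report's internal structure. A hit is not necessarily a leak, since this sketch does not know what the anonymizer's replacement values look like.

```python
import gzip
import ipaddress
import re

# Candidate dotted quads; each one is validated with ipaddress below.
IPV4_LIKE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_ipv4_strings(bundle_path: str):
    """Return the sorted, de-duplicated IPv4-looking strings found in a gzip file."""
    with gzip.open(bundle_path, "rt", encoding="utf-8", errors="replace") as handle:
        text = handle.read()
    found = set()
    for candidate in IPV4_LIKE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. 999.1.1.1 is not a valid address
        found.add(candidate)
    return sorted(found)
```

Running find_ipv4_strings on the file named in the mc output gives you a list to review by hand before uploading.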

ET Phone home

It almost always makes sense to run MinIO in an airgapped environment, because the data stored is almost always so sensitive that you don’t want the MinIO port listening on a public network where it could potentially be compromised. We build MinIO with security first and foremost, but ultimately you are running on a network and hardware that MinIO has no control over, so it's better to be cautious and keep your data safe in an airgapped environment.

If you have any questions on Airgapped MinIO installs be sure to reach out to us on Slack!