Building a Scalable, Data Sovereign National ID System

Some of the smartest minds in philanthropy are backing the concept of a simple yet powerful national ID system. The Bill and Melinda Gates Foundation, the Tata Trusts, the Omidyar Network and the Pratiksha Trust have all gotten involved with this movement because of its foundational capabilities for enabling a wide range of social programmes. They have put their resources behind an open source project called MOSIP (Modular Open Source Identity Platform), and it is quietly remaking national identity across Africa and Asia.

A national ID system is a centralized database that stores information about all citizens and legal residents of a country. This information can include name, date of birth, address, photograph, fingerprints, and other biometric data. Core applications include: 

  • Enhancing security: These systems can help to prevent identity theft and fraud by providing a secure and reliable way to verify a person's identity. This is important for a number of purposes, such as opening a bank account, applying for a job, or voting.
  • Promoting efficiency: A national ID system can help to streamline government services by making it easier for citizens and legal residents to access them. For example, a national ID card can be used to verify a person's identity when applying for a driver's license, passport, or other government document.
  • Encouraging financial inclusion: A national ID system can help to make financial services more accessible to people who have previously been excluded from the formal financial system. This is because a national ID card can be used to open a bank account or obtain a loan, even if the person does not have other documentation, such as a birth certificate or marriage certificate.
  • Improving access to healthcare: A national ID system can help to improve access to healthcare by making it easier for people to register with a doctor or hospital. This is important for people who move frequently or who do not have other documentation, such as a permanent address.

There are other ancillary benefits as well. These include reducing corruption by curbing identity fraud, enhancing economic growth through broader access to the financial system, and increasing social cohesion by reducing friction in everyday transactions.

MOSIP can work in a greenfield model, where the program is built from scratch and each of the open source modules is customized, from pre-registration to issuance and verification. MOSIP also works in a brownfield model, where the open source modules are integrated with existing databases or identity systems.

At its heart, this is a technology problem and one where MinIO is deeply embedded. Indeed, MOSIP ultimately recommends two deployment options: MinIO for countries that keep their data within their borders, and AWS when that data is permitted to leave the country and go to the cloud.

We want to document how this architecture fits together and why a MinIO-based, data sovereign approach matters.

The following aspects are important when considering a storage platform capable of handling a national ID program:

Strong Security

By its nature, a national ID program stores the most sensitive data imaginable: personally identifiable information, including personal details, images and biometric data. The highest levels of data security are required when dealing with such data. MinIO provides enterprise-level encryption for data both in flight, using TLS, and at rest, using keys stored on external key management systems.

MinIO supports Transport Layer Security (TLS) v1.2+ between all components in the cluster. This approach ensures there are no weak links in either inter- or intra-cluster encrypted traffic. TLS is a ubiquitous encryption framework: it’s what puts the s in https and is the same protocol used by banks, e-commerce sites and other enterprise-grade systems to protect data in transit.
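As a rough illustration of the TLS v1.2+ requirement from the client side, the sketch below builds a TLS context with Python's standard ssl module that refuses anything older than TLS 1.2. The helper name make_tls_context and the optional ca_file parameter are assumptions made for this example; they are not part of MinIO or MOSIP.

```python
import ssl


def make_tls_context(ca_file=None):
    """Return a client-side TLS context that rejects protocols older than TLS 1.2.

    ca_file is a hypothetical path to a private CA bundle, which is common when
    a government deployment uses its own certificate authority. When it is None,
    the system trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    # Certificate verification and hostname checking are already enabled by
    # create_default_context(); here we only raise the protocol floor.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls_context()
print(ctx.minimum_version.name)
```

A context like this can be handed to standard-library clients such as http.client.HTTPSConnection(host, context=ctx), so that a connection to an endpoint offering only older protocol versions fails during the handshake.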

MinIO’s state-of-the-art encryption schemes support granular object-level encryption using modern, industry-standard encryption algorithms, such as AES-256-GCM, ChaCha20-Poly1305, and AES-CBC. MinIO is fully compatible with S3 encryption semantics, and also extends S3 by including support for non-AWS key management services such as HashiCorp Vault, Gemalto KeySecure, and Google Secret Manager.
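In the S3 API, server-side encryption is requested per object through request headers. The sketch below builds those headers for the two common modes, SSE-S3 and SSE-KMS. The function name sse_headers and the key name national-id-key are hypothetical and chosen only for illustration; the header names themselves are the standard S3 ones.

```python
def sse_headers(mode, kms_key_id=None):
    """Build the S3 request headers that ask the server to encrypt an object.

    mode is "SSE-S3" (server-managed key) or "SSE-KMS" (key held in an external
    key management system). kms_key_id is optional; when omitted, the server's
    default key is used.
    """
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id is not None:
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    raise ValueError(f"unsupported encryption mode: {mode!r}")


# "national-id-key" is a made-up key name used only for this example.
print(sse_headers("SSE-KMS", "national-id-key"))
```

In practice an S3 SDK sets these headers for you; the point of the sketch is only that encryption is chosen object by object, which is what makes the granularity described above possible.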

Enterprise Grade Object Storage Encryption

Easily Expandable Capacity

Populations tend to grow with time, and the types and amount of data stored per ID are likely to grow as well. Typically a MOSIP deployment starts with a few million IDs and grows from there to the hundreds of millions. Capacity, and the ability to easily scale capacity, becomes critical for a deployment such as a national ID program. MinIO’s erasure coding provides very efficient storage with the ability to survive the loss of drives and nodes. Capacity is easily scaled through the use of Server Pools. Server Pools eliminate the need to rebalance data - a legacy approach that is both expensive and time consuming. Scaling in pools allows growth from terabytes to petabytes as needs change.
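To make the relationship between erasure coding, pools and usable capacity concrete, here is a minimal sketch. It assumes usable capacity is simply raw capacity scaled by the data-to-total ratio of each erasure set, and that a cluster's capacity is the sum of its pools. The Pool class, the drive sizes, the erasure set sizes and the parity counts are all illustrative assumptions, not values taken from MOSIP or from a specific MinIO deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    """One server pool. All field values used below are illustrative."""
    nodes: int
    drives_per_node: int
    drive_tib: float   # capacity of a single drive, in TiB
    set_size: int      # drives per erasure set (assumption)
    parity: int        # parity drives per erasure set (assumption)

    def usable_tib(self) -> float:
        drives = self.nodes * self.drives_per_node
        if drives % self.set_size != 0:
            raise ValueError("drive count must be a multiple of the erasure set size")
        if not 0 <= self.parity <= self.set_size // 2:
            raise ValueError("parity must be between 0 and half the set size")
        # Each erasure set keeps (set_size - parity) drives' worth of data.
        return drives * self.drive_tib * (self.set_size - self.parity) / self.set_size


def cluster_usable_tib(pools) -> float:
    """Capacity grows by adding pools; existing pools are left untouched."""
    return sum(pool.usable_tib() for pool in pools)


initial = Pool(nodes=4, drives_per_node=4, drive_tib=8.0, set_size=16, parity=4)
expansion = Pool(nodes=4, drives_per_node=6, drive_tib=16.0, set_size=12, parity=4)

print(cluster_usable_tib([initial]))             # 96.0
print(cluster_usable_tib([initial, expansion]))  # 352.0
```

The second call shows the point of pools: the expansion uses different hardware from the original pool, and the total is just the sum of the two, with no data moved out of the first pool.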

Scalable Object Storage

Consistent High-Throughput Performance

As the number of IDs grows, and the data per ID grows, throughput becomes critical to a successful national program in order to ensure a speedy user experience. The data structure of an ID is made up of a large number of data items. MinIO is a high-performance object storage system designed for these types of workloads. When properly deployed across 32 nodes, MinIO can deliver sustained READ throughput of over 320 GiB per second.

MinIO NVMe Benchmark

Real-time Replication for BC/DR

At-scale replication is the only rational way to provide data resiliency across sites. The time it takes to back up or restore even a small quantity of data, for example 10 TiB, across a slow network is unacceptable for almost all use cases, and certainly unacceptable for a government ID system that needs to be available 24/7. Active-Active Replication for object storage is a key requirement for mission-critical production environments. MinIO is the only vendor that offers synchronous, object-level replication to multiple sites today.
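A quick back-of-the-envelope calculation shows why bulk backup and restore is a poor fit here. The sketch below estimates how long it takes to move 10 TiB over a link of a given speed. The link speeds are assumptions picked for illustration, and the model ignores protocol overhead, so real transfers would take longer.

```python
TIB_BYTES = 2**40  # bytes in one TiB


def transfer_hours(size_tib, link_gbps):
    """Hours needed to move size_tib of data over a link of link_gbps (10^9 bits/s).

    Assumes the link is fully and continuously utilized, which is optimistic.
    """
    bits = size_tib * TIB_BYTES * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


print(round(transfer_hours(10, 1.0), 1))  # 24.4 hours on a 1 Gbit/s link
print(round(transfer_hours(10, 0.1), 1))  # 244.3 hours on a 100 Mbit/s link
```

Even under these ideal assumptions, restoring 10 TiB takes about a day at 1 Gbit/s and roughly ten days at 100 Mbit/s, which is why continuous object-level replication to a second site is the more practical route to availability.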

Active Active Replication for Object Storage

Efficient and Cost-Effective Deployment and Growth

The required storage per ID varies with the MOSIP modules that are deployed, the types of data stored, the resolution of the images and the biometric data items.  Please consult MOSIP for a more accurate prediction of storage needs. That being said, when a deployment starts at a few million IDs the data storage required is typically on the order of 10 TiB. When it grows to 100 million IDs, the storage requirements can be on the order of 1 PiB, and the throughput requirements grow proportionately. A typical initial deployment for MinIO, able to handle a few million IDs, would consist of 4 nodes with 4 drives each.
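The sizing figures above can be turned into a rough planning calculation. The sketch below derives an average per-ID footprint from the 100 million IDs and 1 PiB data point and then applies it to other enrollment counts. The function names are made up for this example, and the per-ID figure is only an order-of-magnitude average implied by those two numbers, not a MOSIP specification.

```python
MIB_PER_TIB = 2**20
MIB_PER_PIB = 2**30


def per_id_mib(total_pib, num_ids):
    """Average storage per ID, in MiB, implied by a total size and an ID count."""
    return total_pib * MIB_PER_PIB / num_ids


def storage_tib(num_ids, mib_per_id):
    """Estimated total storage, in TiB, for num_ids at mib_per_id each."""
    return num_ids * mib_per_id / MIB_PER_TIB


# Derived from the "100 million IDs is on the order of 1 PiB" figure.
avg_mib = per_id_mib(1, 100_000_000)
print(round(avg_mib, 2))  # 10.74

for ids in (1_000_000, 10_000_000, 100_000_000):
    print(ids, round(storage_tib(ids, avg_mib), 1), "TiB")
```

At roughly 10.7 MiB per ID, one million IDs already land near 10 TiB, so the "few million IDs at about 10 TiB" starting point implies a smaller early per-ID footprint. That is consistent with the data stored per ID growing over time, and it is a reminder to treat these numbers as orders of magnitude.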

MinIO runs on commodity hardware, so any vendor is fine. As an example, below is a 4-node 2U system (4 separate sets of CPU, RAM, and drives in a single 2U chassis) from Supermicro that makes deployment easy. This unit supports up to 6 drives per node for a total of 24 drives per chassis. Using hardware like this makes it easy and cost effective to deploy enterprise-grade object storage using MinIO in order to support MOSIP.

Supermicro GrandTwin™ SuperServer and MinIO 

The Supermicro GrandTwin™ SuperServer SYS-211GT-HNTR 2U server enclosure is a dense, rack-optimized platform for deploying MinIO object storage. 

As the MOSIP deployment grows, additional units can be added to scale the MinIO storage capacity. Using three of the above-referenced Supermicro units deployed across 3 racks would provide exceedingly fast data access to over 3 PiB of storage. MinIO is resilient, and erasure coding would allow for the loss of 96 drives, or 4 servers, or 1 rack while still maintaining full functionality.

Bringing a National ID Program to Life

As national ID systems proliferate for all the benefits outlined above, and the quantity of data they store grows, it becomes incumbent on governments to keep that data in object storage that is secure, cost-effective, rapidly scalable and high-performance.

Governments and NGOs can’t take risks when it comes to national ID systems because citizen data is too valuable and sensitive to lose. Implementing a national ID system such as MOSIP on an enterprise-class object storage system such as MinIO lays the groundwork for a successful deployment and the overall success of the program.

MinIO is always available for a call to discuss your object storage needs and growth path, and to work with your hardware vendor to ensure a proper object storage deployment. Reach out to us on Slack or email us at hello@min.io.