Cloud Repatriation: Why Enterprises Are Moving Workloads Off Hyperscalers

Enterprises are increasingly moving workloads from the public cloud back on-premises, especially for AI infrastructure. The reason is clear. Modern private cloud stacks now offer the scalability, performance, and reliability of the hyperscalers, while delivering better economics, greater control, and stronger data security.
What is Cloud Repatriation?
Cloud repatriation is the process of moving workloads, data, and infrastructure from the public cloud back to on-premises or private cloud environments. It typically occurs when organizations face rising costs, performance limits, or control challenges in the public cloud. Repatriation applies cloud-native principles like disaggregation, orchestration, and scalability to infrastructure owned and operated by the enterprise.
Why Cloud Repatriation? Understanding the Trend
Legacy architectures like Hadoop tightly coupled storage and compute, forcing operators to scale both in lockstep. These systems were difficult to deploy and maintain, often requiring large, specialized teams. Public cloud providers emerged as a compelling alternative by offering managed services and elastic scale.
For a while, it worked. But data volumes have exploded. Enterprises that once handled terabytes now deal in petabytes. AI workloads operate on orders of magnitude more data, with concurrency levels and cost profiles that legacy designs, and even public cloud pricing models, struggle to support.
The very same workloads that drove enterprises to the cloud are now driving them out.
Modern on-prem solutions are not a return to Hadoop. Today’s infrastructure is disaggregated, performant, and built to scale linearly on commodity hardware. Object storage is decoupled from compute. Kubernetes orchestrates everything. It’s hyperscaler design without the hyperscaler bill.
Back in 2023, IDC reported that 80% of enterprises expect to repatriate some portion of their workloads from the cloud. This trend is continuing. For predictable, data-intensive workloads, it’s no longer a question of whether to repatriate, but when.
What Workloads Are Going Back On-Prem?
The most common workloads being repatriated are those that involve persistent, high-volume data: data lakes, data lakehouses, and AI pipelines.
An AI-ready data lakehouse built in a private cloud supports a full range of data operations, from ingestion to inference. At the core is high-performance, S3-compatible object storage, which serves as the durable and scalable foundation for structured, semi-structured, and unstructured data. It is a cloud-native design that disaggregates storage and compute, supports massive concurrency, and delivers high-throughput access to large datasets.
Layered on top of the storage tier are open table formats like Apache Iceberg. These provide features critical for AI workflows and production-grade analytics: ACID transactions for write consistency, schema evolution to adapt to changing data models, partition evolution for optimizing file layout, hidden partitioning to simplify query planning, and time travel for rollback and reproducibility. Iceberg’s metadata layer also decouples data layout from physical storage, enabling fine-grained control over performance tuning and lifecycle management.
At the compute layer, modern engines such as Dremio, Trino, StarRocks, and Apache Flink can query Iceberg tables directly from object storage using ANSI SQL. Python-based frameworks like PyIceberg, DuckDB, Pandas, and Polars provide programmatic access for data scientists and ML engineers, supporting tasks like feature engineering, exploratory analysis, and model evaluation. AI model training can be distributed across on-prem GPU clusters, with direct access to training sets stored in object storage.
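For a concrete feel of that programmatic access path, here is a minimal sketch using PyIceberg, including a time-travel read. The catalog URI, object storage endpoint, credentials, and table name are illustrative placeholders, not fixed conventions.

```python
# Minimal sketch: query an Iceberg table on S3-compatible object storage
# with PyIceberg. Endpoints, credentials, and the table name
# (analytics.events) are illustrative placeholders.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "lakehouse",
    **{
        "type": "rest",
        "uri": "http://iceberg-rest.internal:8181",         # REST catalog (assumed)
        "s3.endpoint": "https://objectstore.internal:9000",  # on-prem S3 API (assumed)
        "s3.access-key-id": "ACCESS_KEY",
        "s3.secret-access-key": "SECRET_KEY",
    },
)

table = catalog.load_table("analytics.events")

# Predicate pushdown and column pruning happen before data leaves storage.
df = table.scan(
    row_filter="event_date >= '2024-01-01'",
    selected_fields=("user_id", "event_type", "event_date"),
).to_pandas()

# Time travel: read the table as of an earlier snapshot for reproducibility.
history = table.history()
if len(history) > 1:
    old_df = table.scan(snapshot_id=history[0].snapshot_id).to_pandas()
```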
This architecture is designed for flexibility and modularity. Each layer, from storage to open table format to compute, can be independently scaled, swapped, or extended. And because everything runs on open formats and interfaces, it avoids lock-in while enabling best-of-breed choices across the stack.
Benefits of Cloud Repatriation: Why Enterprises Are Moving Workloads off Hyperscalers
1. Significant Cost Savings
Organizations migrating workloads from the public cloud back on-prem consistently see significant cost savings. We’ve observed our customers saving upwards of 60% by migrating. These savings come from multiple sources.
First, public cloud object storage charges for every operation: every PUT, every GET, every LIST. In high-concurrency environments, these costs add up fast. And that's before factoring in egress charges, which hit every time your AI team pulls a dataset or a developer syncs model checkpoints to a local GPU node.
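A back-of-envelope calculation makes the point. The rates below are illustrative ballpark figures, not a quote; substitute your provider's actual pricing:

```python
# Back-of-envelope request and egress cost for a read-heavy AI workload.
# Rates are illustrative ballpark figures (assumed); plug in your own.
GET_PER_1K = 0.0004        # $ per 1,000 GET requests (assumed rate)
EGRESS_PER_GB = 0.09       # $ per GB transferred out (assumed rate)

gets_per_day = 50_000_000      # high-concurrency training/feature reads
egress_gb_per_day = 2_000      # datasets pulled to local GPU nodes

monthly_requests = gets_per_day * 30 / 1_000 * GET_PER_1K
monthly_egress = egress_gb_per_day * 30 * EGRESS_PER_GB

print(f"GET requests: ${monthly_requests:,.0f}/month")   # $600/month
print(f"Egress:       ${monthly_egress:,.0f}/month")     # $5,400/month
```

Neither line item exists on-prem, and both scale with exactly the access patterns AI workloads produce.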
2. Enhanced Security and Control
Security is another factor. Shared responsibility models place much of the burden on the customer without giving them full control. By contrast, on-prem deployments allow organizations to enforce air-gapped architectures, hardware-based key management, and granular access policies across sites. An air-gapped on-prem deployment remains the gold standard for data protection. You no longer have to trade data sovereignty for modernization: the future is on-prem.
3. Minimal Workflow Disruption and Staff Retraining
Enterprises have learned that you don't need to retrain teams or rebuild workflows to migrate from the cloud. The skills earned managing and scaling cloud infrastructure transfer directly to the cloud-native stack, and so does your infrastructure-as-code. There is no steep learning curve.
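To see why the learning curve is flat, consider that the same S3 SDK code teams already run against the public cloud works against an on-prem, S3-compatible endpoint. In this hypothetical boto3 sketch, only the endpoint URL and credentials change:

```python
# The same boto3 code your teams already use against AWS S3 works against
# an on-prem, S3-compatible store: only the endpoint and credentials change.
# The endpoint URL below is a hypothetical internal address.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.internal:9000",  # on-prem endpoint (assumed)
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Everything downstream -- uploads, listings, presigned URLs -- is unchanged.
s3.upload_file("model.ckpt", "checkpoints", "run-42/model.ckpt")
for obj in s3.list_objects_v2(Bucket="checkpoints", Prefix="run-42/")["Contents"]:
    print(obj["Key"], obj["Size"])
```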
4. High Performance Without Hyperscaler Overhead
Performance is the final reason. Legacy architectures treated object storage as cheap, deep, and slow. Those aging infrastructures look nothing like today's high-performance data lakehouse stacks, where object storage is fast enough to serve as the primary tier for AI and analytics.
How AI Drives Cloud Repatriation
AI workloads are data-hungry and cost-sensitive. Cloud pricing models were not built for massive, frequent reads across large datasets, nor for the storage of rapidly growing model checkpoints, embeddings, and training data.
With operation-based billing, even modest experimentation can result in runaway cloud bills. On-prem deployments offer predictability and performance. The cost of hardware, staffing and facility footprint can be amortized over several years, turning unpredictable cloud bills into predictable and forecastable infrastructure costs.
In addition, more and more organizations are choosing to keep their training and inference pipelines close to the data. Physics will impact your AI pipelines wherever you deploy them, and putting your data production and data consumption layers in the same place is a surefire way to reduce latency. It's a winning strategy when every second matters.
To build a best-in-class AI data lakehouse on-prem, start with a modern, modular architecture. The stack should be declarative, reproducible, and container-native. It should work in a colo, in your own data center, or at the edge. And it should scale linearly as you add commodity hardware.
If your entire stack can’t be deployed with a single YAML file, you are already behind.
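As a rough illustration of what single-manifest deployment looks like programmatically, here is a sketch using the official Kubernetes Python client; the manifest file name is hypothetical:

```python
# Sketch: apply an entire stack from one manifest file using the official
# Kubernetes Python client. "lakehouse-stack.yaml" is a hypothetical
# multi-document manifest (storage, catalog, query engines, monitoring).
from kubernetes import client, config, utils

config.load_kube_config()       # or load_incluster_config() inside a pod
k8s = client.ApiClient()

# create_from_yaml handles multi-document YAML, creating each resource in order.
utils.create_from_yaml(k8s, "lakehouse-stack.yaml", verbose=True)
```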
The Public Cloud Still Makes Sense for the Right Workloads
Public clouds still have a role to play. They are a good fit for bursty, short-lived workloads, archival storage, and low-volume systems. But as data volumes increase and workloads move from exploratory to operational, the benefits of private cloud infrastructure grow stronger.
Practical Steps for Cloud Repatriation
1. Audit Your Cloud Workloads
Identify what's driving cost. Categorize workloads by performance profile, regulatory requirement, latency tolerance, and concurrency. Pay close attention to high-frequency object access patterns, as these typically incur the highest costs. Prioritize workloads with large datasets, repetitive access, or batch AI processing for migration.
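As a starting point, here is a small sketch of that audit, assuming you have exported request counts (from access logs or a billing report) into a CSV with prefix, operation, and count columns:

```python
# Sketch: rank storage prefixes by request volume to find repatriation
# candidates. Assumes request counts have been exported (from access logs
# or a billing/usage report) into requests.csv with columns:
# prefix, operation, count.
import pandas as pd

df = pd.read_csv("requests.csv")

# High-frequency GET/LIST traffic is where per-operation pricing bites hardest.
hot = (
    df[df["operation"].isin(["GET", "LIST"])]
    .groupby("prefix")["count"]
    .sum()
    .sort_values(ascending=False)
)
print(hot.head(10))  # top candidates for moving on-prem
```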
If you haven't already done so, leverage tools like Kubecost, CloudQuery, and FinOps dashboards for granular visibility into cloud spend and usage. A practical way to ensure these tools are actually used is to integrate them with your CI/CD workflows and log aggregators to create dynamic reporting. Short-term, these tools help control and lower cloud spend. Long-term, they establish a baseline to inform infrastructure planning and capacity forecasting. The only way to know where you're going is to quantify where you've been.
2. Calculate Total Cost of Ownership
Include all direct and indirect components: physical infrastructure (servers, racks, switches), power and cooling, licensing, observability tooling, and operational staffing. Compare against your existing cloud spend across compute, storage, egress, and managed services.
This comparison often drives executive buy-in, especially when the cost savings of migrating are framed over a multi-year horizon. Include projected data growth, workload trends, and availability requirements. Use historical usage data to size object storage tiers (hot, warm, cold) and forecast bandwidth and IO needs.
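A simple amortization model, with placeholder figures, shows how the framing works:

```python
# Back-of-envelope TCO comparison over a multi-year horizon. All figures are
# illustrative placeholders; substitute your own quotes and cloud bills.
YEARS = 5

capex = 1_200_000          # servers, racks, switches (one-time, assumed)
annual_opex = 250_000      # power, cooling, colo space, support (assumed)
annual_staff = 300_000     # incremental operations staffing (assumed)

onprem_total = capex + YEARS * (annual_opex + annual_staff)

cloud_monthly = 180_000    # current compute + storage + egress bill (assumed)
growth = 1.25              # 25% annual data/usage growth (assumed)
cloud_total = sum(cloud_monthly * 12 * growth**y for y in range(YEARS))

print(f"On-prem, {YEARS}y: ${onprem_total:>12,.0f}")
print(f"Cloud,   {YEARS}y: ${cloud_total:>12,.0f}")
print(f"Savings:         ${cloud_total - onprem_total:>12,.0f}")
```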
3. Design Your Repatriation Architecture
Embrace a software-defined, container-native model. Design for immutability, reproducibility, and declarative deployment. For streamlined orchestration, use Kubernetes or OpenShift to ensure all services can be deployed from a manifest.
Design around open standards. Use open table formats like Iceberg. Select compute engines that support SQL federation and metadata integration. Consider identity, access policies, and multi-tenancy from day one.
Validate the network stack, especially if AI pipelines are moving petabytes between compute and storage. Architect for east-west bandwidth and enable RDMA where applicable.
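If you want a quick sanity check before formal benchmarking, a crude TCP throughput test between two nodes can flag obvious problems. This sketch is no substitute for iperf or vendor tooling, and the host address is hypothetical:

```python
# Crude east-west bandwidth sanity check: stream 1 GiB over a TCP socket and
# report throughput. Not a substitute for iperf or vendor tooling; the host
# and port are hypothetical. Run with "serve" on one node, then run the
# sender from another.
import socket, sys, time

HOST, PORT, TOTAL = "10.0.0.2", 5201, 1 << 30   # 1 GiB payload (assumed host)
CHUNK = bytes(1 << 20)                          # 1 MiB of zeros per send

if sys.argv[1:] == ["serve"]:
    with socket.create_server(("", PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            while conn.recv(1 << 20):           # drain until sender closes
                pass
else:
    with socket.create_connection((HOST, PORT)) as sock:
        start = time.perf_counter()
        for _ in range(TOTAL // len(CHUNK)):
            sock.sendall(CHUNK)
        elapsed = time.perf_counter() - start
        print(f"{TOTAL / elapsed / 1e9:.2f} GB/s")
```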
4. Implement a Phased Migration Strategy
Begin migration with non-critical data pipelines, development environments, and test data. Establish replication policies to mirror cloud object storage to on-prem using tools like MinIO batch replication.
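To illustrate the mirroring idea at its simplest, here is a boto3 sketch that copies objects under a prefix from a cloud bucket to an on-prem bucket. This is not MinIO batch replication itself, just a minimal version of the pattern; endpoints and bucket names are placeholders:

```python
# Minimal mirroring sketch using boto3: copy objects under a prefix from a
# public-cloud bucket to an on-prem, S3-compatible bucket. Illustrative only;
# for production-scale moves, use a purpose-built tool such as MinIO batch
# replication. Endpoints and bucket names are placeholders.
import boto3

src = boto3.client("s3")  # public cloud, default credentials
dst = boto3.client(
    "s3",
    endpoint_url="https://objectstore.internal:9000",  # on-prem (assumed)
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

paginator = src.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="cloud-datalake", Prefix="training/"):
    for obj in page.get("Contents", []):
        # Stream each object through; the response Body is file-like.
        body = src.get_object(Bucket="cloud-datalake", Key=obj["Key"])["Body"]
        dst.upload_fileobj(body, "onprem-datalake", obj["Key"])
        print("mirrored", obj["Key"])
```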
Validate workflows in shadow deployments and mirror observability. Use canary workloads to confirm performance parity. Instrument everything: disk, network, memory, query latency.
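One lightweight way to instrument query latency during canary runs is with prometheus_client; run_query below is a placeholder for whatever engine client you use:

```python
# Sketch: instrument query latency during shadow/canary runs and expose it
# for Prometheus scraping. run_query is a stand-in for your engine client
# (Trino, Dremio, StarRocks, ...); the port is an assumption.
import time
from prometheus_client import Histogram, start_http_server

QUERY_LATENCY = Histogram(
    "lakehouse_query_latency_seconds",
    "End-to-end query latency during migration validation",
    ["engine", "workload"],
)

def run_query(engine, workload, sql):
    # The context manager records elapsed time into the histogram.
    with QUERY_LATENCY.labels(engine=engine, workload=workload).time():
        ...  # submit sql to the engine and wait for results (placeholder)

start_http_server(9108)  # scrape endpoint for Prometheus (assumed port)
while True:
    run_query("trino", "canary", "SELECT count(*) FROM analytics.events")
    time.sleep(60)
```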
Plan for rollback. Automate backups. Monitor error rates. Scale incrementally and document every failure. Migration is an iterative process of validation, correction, and expansion.
5. Align Teams
Repatriation requires cross-functional coordination. Ensure DevOps, Security, Data Engineering, and IT teams are aligned on goals, timelines, and ownership. Define escalation paths and shared KPIs. Document and communicate changes frequently.
Use retrospectives to improve each migration wave. This cultural component can make or break the long-term success of a hybrid or on-prem architecture.
How MinIO Enables Successful Cloud Repatriation
MinIO AIStor was purpose-built to support this kind of architecture. Its high-performance object storage engine is capable of saturating the network for read operations, making it an ideal foundation for AI and analytics workloads. In benchmark comparisons, AIStor has outperformed Amazon S3, even when deployed inside EKS. With its newly announced support for the S3 Express API, AIStor is faster than ever. It delivers industry-leading performance using a minimal hardware footprint, enabling faster access to training data and large-scale datasets without overprovisioning.
AIStor offers batch replication capabilities to move datasets efficiently from public cloud object stores to on-prem environments. Pricing is transparent and available at min.io/pricing, with no per-operation charges.
Whether you're building a new data lakehouse from scratch or repatriating from the public cloud, AIStor provides the speed, simplicity, and flexibility required for success. If you're planning a migration and have questions, book a demo with us today to see what AIStor can do for you.