The Case for On-Prem Iceberg: Cost, Control, and Performance

As organizations scale their data initiatives, many are discovering that cloud lakehouses aren’t always the best fit for performance, compliance, or cost. That’s where on-prem Iceberg architectures shine. In this blog, we’ll walk through the key challenges cloud deployments face, the case for repatriating workloads on-prem, and how organizations such as a leading cancer research center and Nomura are already succeeding with hybrid and on-prem Iceberg lakehouses.

Cloud Challenges: Cost and Control

Let’s ground this in reality. Across hundreds of enterprise conversations, two issues consistently emerge as top concerns when it comes to cloud-based data lakehouses: cost and control.

Cloud pricing, especially at scale, is rarely transparent and never forgiving. You’re billed for storage, API calls, egress, and replication; each line item quietly inflates your monthly bill. Adding well-intentioned but redundant backups, region-based compliance, and siloed team environments can lead to further infrastructure sprawl and even larger cloud bills. What works at 100TB becomes financially untenable at 10PB.

Control is the other elephant in the room. Some of your most sensitive data, such as patient records, financial transactions, and proprietary models, must remain on-prem for compliance and governance reasons. Encryption alone isn’t enough. You need true isolation, object-level access control, and full auditability. These aren’t edge cases; they’re central to how enterprises operate.

Why On-Prem Makes Sense

On-prem is no longer a tradeoff. It offers the best of all worlds: scale, performance, control, and cost-efficiency.

With modern, software-defined object storage, enterprises can deploy Iceberg lakehouses anywhere in minutes, from laptops to full-scale data centers. Scale works in both directions: software-defined storage can, and should, be tailored to fit the workload. Enterprises that adopt it can scale linearly and predictably, and can upgrade their infrastructure without starting over. A swap-and-drop upgrade model means your architecture keeps pace with new hardware generations and advancing workloads.

And the performance speaks for itself. AIStor delivers over 2.2 TiB/s of throughput, saturating 400GbE networking on clusters ranging from 12 to 550 nodes. This isn't hypothetical; it’s proven. Real-world Iceberg workloads, from training LLMs to serving analytical dashboards, already benefit from this level of performance.

Security-wise, on-prem means total data sovereignty. You control the hardware, define the namespace boundaries, and set policies, down to individual objects, that govern who accesses what, when, and how.

From a cost perspective, it’s night and day. There are no egress charges and no surprise API bills, just straightforward infrastructure with a predictable TCO that improves as you scale. There is a clear inflection point at around 5PB, beyond which cloud costs far outpace keeping hot data on-prem, even after accounting for hardware, subscription, and colocation. By the time you hit 20 or 30PB, the cost delta is millions of dollars annually, and that’s just the storage layer.
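To make the inflection point concrete, here is a toy cost model. Every number in it (per-PB rates, fixed colocation-plus-subscription cost) is an illustrative assumption for the sketch, not a quoted price; the point is the shape of the curves, not the specific figures.

```python
# Toy monthly cost model. All dollar figures are illustrative
# assumptions, not quoted prices from any vendor.
CLOUD_PER_PB = 25_000    # assumed cloud storage + API + egress, $/PB-month
ONPREM_FIXED = 50_000    # assumed colo + subscription base, $/month
ONPREM_PER_PB = 15_000   # assumed amortized hardware, $/PB-month

def cloud_monthly(pb: float) -> float:
    """Cloud cost scales linearly with capacity."""
    return CLOUD_PER_PB * pb

def onprem_monthly(pb: float) -> float:
    """On-prem pays a fixed base, then a lower per-PB rate."""
    return ONPREM_FIXED + ONPREM_PER_PB * pb

# Crossover where the two lines meet: fixed / (cloud rate - on-prem rate)
crossover_pb = ONPREM_FIXED / (CLOUD_PER_PB - ONPREM_PER_PB)
print(crossover_pb)  # 5.0 PB under these assumptions

# Annual delta at 25 PB
annual_delta = (cloud_monthly(25) - onprem_monthly(25)) * 12
print(annual_delta)  # 2,400,000 $/year under these assumptions
```

Under these assumed rates the curves cross at 5PB, and by 25PB the gap is about $2.4M per year, consistent with the "millions annually" range described above.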

Standards Power Performance

One powerful enabler of performance on-prem is S3 over RDMA.

RDMA eliminates the overhead of the TCP/IP stack, allowing direct memory-to-memory transfers that largely bypass the CPU. With 400GbE or 800GbE networks, your storage can move data at terabit speeds, making it ideal for Iceberg’s large-scale metadata operations or machine learning pipelines that require low-latency access.

This isn’t just about speed. It’s about standards. Just as AIStor closely conforms to the S3 API and rallies around open table formats like Iceberg, Hudi, and Delta Lake, we see RDMA over Ethernet as the open foundation for next-gen data infrastructure. Multi-vendor, cloud-native, future-proof standards such as S3 over RDMA let vendors focus on moving the needle for their customers rather than reinventing the wheel. They also let enterprises hire for broadly marketable skills rather than retain niche, proprietary skill sets.

Real-World Architectures

Consider the largest private cancer center in the world. They run an on-prem Iceberg lakehouse to support cancer research. Their data lake ingests everything from genomic files to clinical trial metadata. Previously, data took weeks or months to reach researchers. Now it’s minutes or hours. No more ETL pipelines. No more offline data silos. Just fast, trusted access to the latest datasets.

Or take Nomura, which uses a hybrid approach. Some extremely sensitive data resides on-prem with AIStor, while less critical workloads use S3 in the cloud. Everything is orchestrated via Kubernetes and served through tools such as Spark, Dremio, and Jupyter, as well as CI/CD tools and pipelines that keep the platform agile and developer-friendly.
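As a sketch of what wiring Spark to such a lakehouse might look like, the properties below register an Iceberg catalog whose warehouse lives on an on-prem, S3-compatible endpoint. The catalog name, bucket, and endpoint URL are placeholders, not any customer's actual configuration.

```properties
# Hypothetical Spark config: Iceberg catalog backed by an on-prem
# S3-compatible object store (names and endpoint are placeholders).
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.lakehouse=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.lakehouse.type=hadoop
spark.sql.catalog.lakehouse.warehouse=s3://research-bucket/warehouse
spark.sql.catalog.lakehouse.io-impl=org.apache.iceberg.aws.s3.S3FileIO
spark.sql.catalog.lakehouse.s3.endpoint=https://objectstore.internal:9000
spark.sql.catalog.lakehouse.s3.path-style-access=true
```

Because Iceberg speaks the S3 API, pointing the same catalog definition at a cloud bucket or an on-prem endpoint is largely a matter of changing the endpoint and credentials, which is what makes the hybrid pattern practical.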

These aren’t demos. These are production deployments solving real problems at petabyte scale.

Time to Value and TCO

One of the most underappreciated benefits of on-prem Iceberg deployments is how quickly they begin to deliver value. Among the surveyed customers, 36% reported seeing value within three months, and half experienced returns within a year. The remaining ones were still implementing when surveyed, but we expect similar or superior results.

This speed is possible because the building blocks are simple: standard 2U servers, NVMe drives, 256GB RAM, and high-speed NICs. Combine that with software-defined object storage and an open format like Iceberg, and you’ve got an architecture that’s fast to deploy and easy to scale.
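As a hedged illustration of how few moving parts are involved, a small software-defined object storage cluster in the MinIO/AIStor family can be brought up with one command per node. The hostnames, drive paths, and credentials below are placeholders, not a production recipe.

```shell
# Hypothetical 4-node, 4-drive-per-node launch (hostnames and paths
# are placeholders). Run the same command on every node; the servers
# discover each other and form the cluster.
export MINIO_ROOT_USER=admin
export MINIO_ROOT_PASSWORD=change-me
minio server http://node{1...4}.example.internal/mnt/nvme{1...4} \
      --console-address :9001
```

With the object layer up, the Iceberg warehouse is just a bucket on that endpoint, which is why deployments can go from bare metal to first query so quickly.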

Your Lakehouse. Your Terms.

Iceberg allows you to choose your deployment model: on-prem if you value control, cloud if you need elasticity, or hybrid if you want both.

With AIStor and Iceberg, you’re not just buying software; you’re building a data platform that reflects your priorities: performance, governance, and cost-efficiency.

The petabyte is the new terabyte. It’s time to architect accordingly.