Demystifying Amazon S3 Tables: Why AIStor Makes Special Buckets Unnecessary
AWS recently announced Amazon S3 Tables, a feature designed to address the unique challenges of storing and querying tabular data in the cloud. While this may sound revolutionary, a closer inspection reveals that the limitations AWS is trying to overcome are intrinsic to their own infrastructure, not universal to object storage. Let’s break it down and explain why AIStor users don’t need to worry about “special buckets” for their data lakehouse.
What Are S3 Tables?
S3 Tables introduce a new type of S3 bucket, the table bucket, specifically optimized for Apache Iceberg-based analytics workloads. Key features include:
- Higher Transaction Limits: Table buckets increase request limits to 35,000 PUTs/s and 55,000 GETs/s, compared to standard S3 buckets (3,500 PUTs/s and 5,500 GETs/s).
- Built-in Table Maintenance: Compaction, snapshot expiration, and unreferenced file removal are automated.
- AWS Glue and Lake Formation Integration: Tight coupling with AWS Glue for cataloging, with the option to register your Glue catalog as a Lake Formation data location.
However, these optimizations come with notable trade-offs: added expenses, potential Glue dependencies, and limited flexibility for non-AWS environments. For example, if you’re already running compaction with Lambda functions, Spark, Athena, or another service or piece of software, you could be paying extra for compute that you don’t need.
Why Did AWS Build S3 Tables?
The primary driver for S3 Tables is to address performance bottlenecks in analytics workloads. Standard S3 buckets hit their transaction limits quickly when used with Apache Iceberg for data lakehouses, causing hotspots and degraded performance. By introducing table buckets, AWS can now offer significantly higher request rates for these specialized workloads.
But here’s the kicker: these bottlenecks are unique to AWS’s architecture. They arise because of the way S3 is built and because of the request limits initially imposed by AWS, not because of Iceberg or object storage in general. For MinIO users, these issues simply don’t exist.
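To put the gap between the two bucket types in perspective, here is a small arithmetic sketch that uses only the request limits quoted earlier in this post. The workload size (1,100,000 GET requests) is a hypothetical figure chosen for illustration, and the time estimate is a lower bound that assumes requests are issued at exactly the stated limit.

```python
# Request limits quoted in this post (requests per second).
STANDARD_BUCKET = {"PUT": 3_500, "GET": 5_500}
TABLE_BUCKET = {"PUT": 35_000, "GET": 55_000}


def limit_ratio(op: str) -> float:
    """How many times higher the table-bucket limit is for one operation type."""
    return TABLE_BUCKET[op] / STANDARD_BUCKET[op]


def seconds_at_limit(requests: int, op: str, limits: dict) -> float:
    """Lower bound on the time needed to issue `requests` calls at the given limit."""
    return requests / limits[op]


if __name__ == "__main__":
    for op in ("PUT", "GET"):
        print(f"{op}: table bucket limit is {limit_ratio(op):.0f}x the standard limit")

    # Hypothetical workload: a job that has to read 1,100,000 small files.
    n = 1_100_000
    print("standard bucket:", seconds_at_limit(n, "GET", STANDARD_BUCKET), "s")
    print("table bucket:   ", seconds_at_limit(n, "GET", TABLE_BUCKET), "s")
```

The ratio is the same for both operation types, which is where the "10x" framing comes from. The ceiling itself is a number AWS sets, not a property of Iceberg.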
All in on Iceberg
It goes without saying that AWS is a major part of the S3 ecosystem. S3 Tables only optimize Iceberg tables, leaving users of Hudi and Delta Lake to create and manage their own buckets. AWS’s choice to throw its weight behind Iceberg, investing in and promoting it over the other open table formats, will be very impactful. It continues the course set by Databricks’ acquisition of Tabular and Snowflake’s open-sourcing of Polaris.
More importantly, these investments, convergences, and contributions all support the rising supremacy of open table format data lakehouses built on object storage. The era of object storage as primary storage has arrived.
Why AIStor Doesn’t Need “Special Buckets”
AIStor subscribers have always been able to store Iceberg tables in any bucket without worrying about request limits. Of course, because AIStor is your storage layer, you’ve always needed a compute layer like Spark, Dremio, or Starburst to create, manage, and retrieve your open table format data. AIStor is uniquely capable in this partnership for the following reasons:
- Performance by Design: AIStor is the fastest object store on the market. The only limits on MinIO’s throughput have always been the network and the underlying hardware. MinIO will completely fill the wire and go as fast as your disks can spin. We have never limited the rate of GETs/s and PUTs/s and then charged you for the privilege of performance.
- No Vendor Lock-in: AIStor is compatible with the S3 API, meaning you can fully integrate it into every layer of the modern data stack. With AIStor’s flexible deployment, you can go anywhere with your data: on-prem, in any of the public clouds (AWS, GCP, Azure), private clouds, colos, data centers, or the edge. You can use any compute engine to query your open table format data, and you are free to explore and build your stack wherever your workloads dictate without being beholden to any particular cloud, vendor, or process.
- Streamlined Maintenance: Only pay for what you need. Iceberg’s table maintenance features (compaction, snapshot expiration, etc.) can be scheduled and executed independently of the storage layer. MinIO’s high performance ensures these operations run efficiently.
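As a rough sketch of what "any bucket, any compute engine" can look like in practice, the snippet below builds the Spark properties for an Iceberg catalog that stores its files on an S3-compatible endpoint, plus the Iceberg Spark procedure calls for the three maintenance tasks mentioned above. The choice of a REST catalog, the catalog name, every URL, and the table name are assumptions for illustration, not AIStor defaults. The snippet only generates strings, so you would still pass the properties to your own Spark session and run the statements on whatever schedule suits you.

```python
from datetime import datetime, timedelta, timezone


def iceberg_spark_conf(catalog: str, catalog_uri: str, endpoint: str, warehouse: str) -> dict[str, str]:
    """Spark properties for an Iceberg catalog backed by an S3-compatible store.

    Assumes a REST catalog; swap `type`/`uri` for whatever catalog you run.
    """
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": catalog_uri,
        f"{prefix}.warehouse": warehouse,
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        # Point S3FileIO at the S3-compatible endpoint instead of AWS.
        f"{prefix}.s3.endpoint": endpoint,
        f"{prefix}.s3.path-style-access": "true",
    }


def maintenance_sql(catalog: str, table: str, retain_days: int, now: datetime) -> list[str]:
    """Iceberg Spark procedure calls: compaction, snapshot expiration, orphan file removal."""
    cutoff = (now - timedelta(days=retain_days)).strftime("%Y-%m-%d %H:%M:%S")
    return [
        f"CALL {catalog}.system.rewrite_data_files(table => '{table}')",
        f"CALL {catalog}.system.expire_snapshots(table => '{table}', "
        f"older_than => TIMESTAMP '{cutoff}')",
        f"CALL {catalog}.system.remove_orphan_files(table => '{table}')",
    ]


if __name__ == "__main__":
    # All names and URLs below are hypothetical placeholders.
    conf = iceberg_spark_conf(
        catalog="lake",
        catalog_uri="http://catalog.example.internal:8181",
        endpoint="https://aistor.example.internal:9000",
        warehouse="s3://warehouse/",
    )
    for key, value in conf.items():
        print(f"--conf {key}={value}")
    for stmt in maintenance_sql("lake", "analytics.events", 7, datetime.now(timezone.utc)):
        print(stmt)
```

Because maintenance is just a set of engine-side calls, you decide when it runs and which compute pays for it, rather than having it bundled into the bucket.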
Let’s Talk Performance
AWS markets S3 Tables as delivering up to 10x higher transaction rates for Iceberg tables. But with AIStor, you’re not constrained by predefined limits. Instead, you size your cluster based on workload demands, achieving the performance you need without additional cost or complexity.
Moreover, AIStor’s object storage is built to deliver consistent, high performance for both analytics and transactional workloads. Unlike with AWS’s table buckets, you’re not forced to segregate storage types to achieve acceptable performance, and you’re not limited to a single open table format.
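To make "size your cluster based on workload demands" concrete, here is a deliberately simplified back-of-the-envelope model. It assumes per-node throughput is capped by whichever is slower, the network link or the drives combined, and it ignores erasure-coding overhead, object size, and CPU. All hardware numbers are hypothetical placeholders, not AIStor benchmarks.

```python
import math


def node_ceiling_gbps(nic_gbps: float, drives: int, drive_gbps: float) -> float:
    """Per-node throughput ceiling: the slower of the NIC and the drives combined."""
    return min(nic_gbps, drives * drive_gbps)


def nodes_needed(target_gbps: float, nic_gbps: float, drives: int, drive_gbps: float) -> int:
    """Smallest node count whose combined ceiling meets the target throughput."""
    return math.ceil(target_gbps / node_ceiling_gbps(nic_gbps, drives, drive_gbps))


if __name__ == "__main__":
    # Hypothetical node: one 100 Gbps NIC, 8 drives at 20 Gbps each.
    # Drives could deliver 160 Gbps, so the network is the bottleneck.
    print(node_ceiling_gbps(100, 8, 20))

    # Hypothetical target: 750 Gbps aggregate for the lakehouse workload.
    print(nodes_needed(750, 100, 8, 20))

    # Same NIC with only 4 drives: now the drives are the bottleneck.
    print(nodes_needed(750, 100, 4, 20))
```

In this model the ceiling moves when you change the hardware, not when a provider raises a quota.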
The Real Cost
S3 Tables introduce complexity and additional expense:
- Higher Costs: AWS’s premium for S3 Tables can add up quickly, especially for large-scale workloads. A little back-of-the-napkin math shows that S3 Tables cost 15% more than a normal S3 bucket.
- Hidden Migration Costs: If you chose to use S3 Tables, you’d have to migrate any existing Iceberg tables into these new buckets. Data migration is never easy and rarely cheap.
By contrast, MinIO offers a simpler, more cost-effective solution. There’s no “table bucket” tax, and you can use open table formats like Iceberg without artificial restrictions.
The Future of Data Lakehouse Storage
Amazon S3 Tables address limitations in AWS’s own infrastructure but add complexity, cost, and lock-in for users. AIStor, on the other hand, empowers users to run high-performance queries over Iceberg tables without special buckets or cloud dependencies.
The takeaway? If you’re already using AIStor, you’re ahead of the game. And if you’re considering S3 Tables, take a closer look at whether they’re solving real problems—or just the ones created by AWS.
If you have any further questions on how AIStor delivers scalable, high-performance object storage for AI and advanced analytics in data lakehouses, please reach out to us at hello@min.io or on our Slack channel.