The Innovations from AWS re:Invent

Earlier this month, Amazon held their re:Invent conference in Las Vegas, Nevada, from December 1st to 5th. If you have never been to re:Invent, the word that describes it best is “huge” - not just the number of attendees (60,000) but also the breadth of topics covered. The MinIO booth was busy for all five days, so even though I was there in person, I did not get a chance to immerse myself in all the information coming from Amazon about their plans for cloud computing in the coming year. I had to wait until the holidays to do what most would consider quite nerdy: I watched every day’s keynote presentation, taking careful notes on everything important that was discussed. By the time I was done, I had reviewed over 10 hours of video and accumulated over 15 pages of notes.

The purpose of this effort was not to get in touch with my inner nerd - I wanted to put my finger on the pulse of AI. Here are the facts as I see them. First, Amazon runs the largest cloud platform in the world. Second, they have a lot of customers using that platform for all forms of AI, and they have the budget to do whatever it takes to serve those customers and remain competitive. Finally, it is a common opinion that in recent years Amazon has fallen behind the likes of Microsoft, Google, and Meta with respect to AI. Putting all this together, my goal was straightforward: use the re:Invent keynotes to get a sense of where AI is heading in 2025, and whether Amazon’s effectively unlimited engineering resources, massive customer base, and large partner network will let them close the AI gap that is perceived to exist.

Below is a list of the keynotes for each day of the conference:

  • Day #1 - Monday Night Live with Peter DeSantis, SVP, Utility Computing, AWS
  • Day #2 - CEO Keynote with Matt Garman, CEO, AWS
  • Day #3 - Keynote with Dr. Swami Sivasubramanian, VP, AI and Data, AWS
  • Day #4 - AWS Partner Keynote with Dr. Ruba Borno, VP, Global Specialists and Partners, AWS
  • Day #5 - Keynote with Dr. Werner Vogels, VP and CTO, Amazon.com

Going back to my 15 pages of notes, I was somewhat surprised at how easily every important topic and every newly announced feature fit into one of three categories: Compute, Storage, and AI. That is how I will organize this post: for each category, I will give my overall opinion and then list the key technologies and new announcements.

Compute

My biggest surprise at re:Invent 2024 had to do with Amazon’s compute goals. It took me a moment to get my head around it, but it is clear that they have no intention of simply being a provider of other vendors' chips and servers. Rather, they want an offering of hardware they have designed themselves. They have been at this for a while, but I honestly did not appreciate the magnitude of their efforts. They have developed their own CPUs (the Graviton line), AI accelerators (Trainium), an interconnect (NeuronLink), and a custom networking protocol (SIDR). Their new Trainium2 UltraServers use NeuronLink to connect four Trainium2 servers into one giant server for training and inference.

Many AWS services run on this in-house hardware, and it is also available to customers as EC2 instance types. AWS still has a partnership with Nvidia and will continue to offer Nvidia GPUs as instance types. The promise of Trainium and Graviton is that they will provide a cost advantage over other vendors' chips.

Announcements

  1. P6 family of instances supporting Nvidia Blackwell chips.
  2. Trn2 family of instances built on Trainium2 chips. A single Trn2 instance contains 16 Trainium2 chips and delivers up to 20.8 FP8 petaFLOPS.
  3. Trn2 UltraServers - use NeuronLink to connect four Trn2 instances (64 Trainium2 chips), meant for models that cannot fit on one server. Up to 83.2 FP8 petaFLOPS.
  4. Trainium3 is coming in 2025. It will be the first AWS chip made on a 3-nanometer process, and it will be twice as fast as Trainium2.
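The quoted petaFLOPS figures are internally consistent, which is a quick sanity check worth doing with any vendor numbers. The script below derives the per-chip figure from the quoted instance total (all numbers come from the announcements above):

```python
# Sanity-check the quoted FP8 petaFLOPS figures for Trn2 instances
# and UltraServers, using only the numbers from the announcements.
chips_per_instance = 16
instance_pflops = 20.8  # quoted for a single Trn2 instance

# Implied per-chip performance: 20.8 / 16 = 1.3 FP8 petaFLOPS per chip
per_chip_pflops = instance_pflops / chips_per_instance

# An UltraServer is four Trn2 instances joined by NeuronLink
instances_per_ultraserver = 4
ultraserver_pflops = instance_pflops * instances_per_ultraserver

print(f"per chip: {per_chip_pflops} PFLOPS")        # 1.3
print(f"UltraServer: {ultraserver_pflops} PFLOPS")  # 83.2, matching the quote
```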

Storage

Most of the storage announcements came during the second day’s keynote with Matt Garman. However, a really interesting story was told during the first day’s keynote about Project Barge, an effort to build a massive storage server to increase storage density (and reduce costs). Barge packed 288 20TB hard disk drives into a single host. Each Barge rack weighed 4,500 lbs, which meant that data center floors had to be reinforced and specialized equipment was needed to move the racks. It also turned out that 288 drives spinning at 7,200 rpm cause vibrations that increase failure rates, and the blast radius of a single host failure was 6 PB of data, which had to be recovered at great expense. In the end, they sunk Barge and started thinking about how to disaggregate storage from compute.

What struck me about the storage announcements was that the need for unstructured storage is growing. This is evident in the attention S3 is getting (see the announcements below). It is also obvious from statistics shared during Matt’s keynote: S3 now stores over 400 trillion objects. Ten years ago, fewer than 100 customers stored a petabyte of data in S3; today thousands do, and several store more than an exabyte (which closely matches our experience). Another thought I had regarding unstructured storage: many of the new foundation models Amazon is adding to their cloud (described in the next section) generate images and videos, and those images and videos need to be stored somewhere. If this flavor of generative AI takes off, the need for unstructured storage will increase further. At MinIO, we believe customers will want an on-premise option for this data.

Amazon is also serious about their storage solutions for structured data. They put a lot of effort into improving consistency between active-active database instances running in different regions. To do this, they redesigned the transaction engine used within their SQL and NoSQL databases. There was an interesting story during this part of the keynote about syncing EC2 instance clocks against satellite-based reference clocks to keep time across regions with much greater precision. The result is that both Aurora and DynamoDB can now run active-active with multi-region strong consistency using the redesigned transaction engine.
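Amazon did not publish the internals of the redesigned transaction engine, but the role precise clocks play in multi-region strong consistency can be illustrated with a toy sketch: if every region stamps its writes using a tightly synchronized clock, all regions can independently compute the same global order by sorting on (timestamp, region). Everything below is my own illustration of that general idea, not AWS’s actual design:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Write:
    timestamp_ns: int  # from a tightly synchronized (e.g. satellite-disciplined) clock
    region: str        # tiebreaker when two regions stamp the same nanosecond
    key: str
    value: str

def global_order(writes):
    """A deterministic global order every region can compute independently."""
    return sorted(writes, key=lambda w: (w.timestamp_ns, w.region))

def apply(writes):
    """Replay writes in global order; the last write per key wins everywhere."""
    state = {}
    for w in global_order(writes):
        state[w.key] = w.value
    return state

# Writes arriving in different orders in different regions...
us = [Write(105, "us-east-1", "k", "a"), Write(100, "eu-west-1", "k", "b")]
eu = list(reversed(us))
# ...still converge to the same final state.
assert apply(us) == apply(eu) == {"k": "a"}
```

The tighter the clock synchronization, the less often the tiebreaker is needed, which is why the satellite-based time work matters.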

Announcements

  1. S3 Table Buckets - a new bucket type for Apache Iceberg tables. The promise is that AWS will take care of all the maintenance an Iceberg table requires, such as compaction and snapshot management. Initial tests show that SQL queries against table buckets deliver up to 3x faster query performance compared to brute-force “queries” against plain S3.
  2. S3 Metadata - this feature takes the metadata associated with an object and stores it in a table bucket. You can then use your favorite analytics tool to query the metadata and find the associated objects.
  3. Amazon Aurora DSQL - a distributed SQL database with strong consistency between regions, built on the new transaction engine.
  4. Amazon DynamoDB global tables now support multi-region strong consistency - the same redesigned transaction engine applied to DynamoDB, Amazon’s NoSQL database.
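The pattern behind S3 Metadata - object metadata landing in a queryable table instead of being discovered by listing a bucket - can be mimicked locally. The sketch below uses Python’s built-in sqlite3 as a stand-in for a table bucket; the schema and column names are my own invention, not the actual S3 Metadata schema:

```python
import sqlite3

# Toy stand-in for an S3 Metadata table: one row per object.
# The schema and column names here are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE object_metadata (
        bucket TEXT, key TEXT, size_bytes INTEGER, content_type TEXT
    )
""")
conn.executemany(
    "INSERT INTO object_metadata VALUES (?, ?, ?, ?)",
    [
        ("media", "cat.png",  204_800,    "image/png"),
        ("media", "demo.mp4", 52_428_800, "video/mp4"),
        ("logs",  "app.log",  4_096,      "text/plain"),
    ],
)

# The point of the feature: find objects with a plain SQL query
# instead of listing the bucket and filtering by brute force.
videos = conn.execute(
    "SELECT key FROM object_metadata WHERE content_type LIKE 'video/%'"
).fetchall()
print(videos)  # [('demo.mp4',)]
```

In the real feature the table is an Iceberg table in a table bucket, so the same query would run from Athena, Spark, or any Iceberg-aware engine.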

AI

By far, most of the announcements at this year’s re:Invent were about AI. A few came during Matt Garman’s keynote on the second day, but the bulk arrived during Swami Sivasubramanian’s keynote on the third day, which was packed with announcements and almost entirely focused on AI. Some are trivial, and all of them would have been easier to digest had Amazon chunked them up. I describe them below in the order they were presented, but grouped into the following categories, which are my own doing:

  • Guardrails - for double-checking model output.
  • Foundation models - to compete with Meta, Google, and Microsoft.
  • Developer tools - for code generation and automating portions of the software development lifecycle.
  • Agentic AI - to help customers automate other parts of AWS.
  • Miscellaneous - a few announcements did not fit the categories above.

Guardrail Announcements

  1. Amazon Bedrock Automated Reasoning checks - a guardrail that aims to prevent factual errors due to model hallucinations.
  2. Amazon SageMaker HyperPod task governance - maximizes accelerator utilization and reduces costs for model training, fine-tuning, and inference.
  3. Amazon Bedrock Guardrails multimodal toxicity detection - configurable safeguards for image content, available for all foundation models in Amazon Bedrock with image support. Filters out violence, hate, and misconduct in images.
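The common shape of these guardrail features is a second check that runs on model output before it reaches the user. Here is a deliberately minimal keyword-based sketch of that pattern - the categories and term lists are mine, and Bedrock’s actual filters are model-based classifiers, not keyword lists:

```python
# Toy category lists for illustration only - not Bedrock's real filters.
BLOCKED_TERMS = {
    "violence": ["attack", "weapon"],
    "hate": ["slur"],
}

def guardrail(model_output: str) -> tuple[bool, list[str]]:
    """Return (allowed, violated_categories) for a piece of model output.

    The guardrail runs after the model produces a response and before
    the response is shown to the user.
    """
    text = model_output.lower()
    violations = [
        category
        for category, terms in BLOCKED_TERMS.items()
        if any(term in text for term in terms)
    ]
    return (not violations, violations)

assert guardrail("Here is a recipe for pancakes.") == (True, [])
assert guardrail("How to build a weapon at home") == (False, ["violence"])
```

The value of the real features is that the second check (a reasoning verifier or a toxicity classifier) is far more capable than a keyword match, but the control flow is the same.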

Foundation Model Announcements

  1. Amazon Nova - foundation models coming in four flavors: Micro, Lite, Pro, and Premier.
  2. Amazon Nova Canvas - an image generation model.
  3. Amazon Nova Reel - a video generation model; 6-second videos today, with 2-minute videos coming soon.
  4. poolside is coming to Bedrock - the poolside assistant and its models (malibu and point).
  5. Stability AI’s Stable Diffusion 3.5 is coming to Bedrock.
  6. Luma AI is coming soon to Bedrock - starting with the Luma Ray2 video generation model; all Luma models are coming to Bedrock.
  7. Amazon Bedrock Marketplace - provides access to hundreds of emerging and specialized models.

Developer Tool Announcements

  1. Amazon Bedrock Model Distillation - transfer knowledge from a large, complex model into a smaller one using prompts with known answers. Similar in spirit to fine-tuning, except the training data is generated by the larger model.
  2. Amazon Q Developer Transformations for .NET - transform .NET applications from Windows to Linux in a fraction of the time, using agents. Amazon introduced this feature by saying that customers want an easy button for getting off of Windows.
  3. Amazon Q Developer Transformations for VMware workloads - transform VMware workloads into cloud-native architectures. Generates a migration plan based on dependencies and launches agents that can convert VMware network configurations to AWS equivalents.
  4. Amazon Q Developer Transformations for Mainframe - uses agents to automate discovery, planning, refactoring, and code analysis of mainframe code (COBOL).
  5. Amazon Q Developer can now investigate issues across your AWS environment in a fraction of the time, using CloudWatch data and CloudTrail logs, and suggests AWS runbooks and curated documentation to quickly resolve issues.
  6. ISV integration with the Amazon Q index via new APIs.
  7. Amazon Bedrock Prompt Caching - Cache repetitive context in prompts across multiple API calls.
  8. Amazon Bedrock Intelligent Prompt Routing - Automatically route prompts to different foundation models to optimize response quality and lower costs.
  9. Amazon Kendra Generative AI Index - Connect to enterprise sources like SharePoint, OneDrive, and Salesforce. There are more than 40 enterprise data sources supported for RAG usage.
  10. Amazon Bedrock Knowledge Bases supports structured data retrieval. Use data stored in Amazon SageMaker, Lakehouse, Redshift and S3 tables for RAG.
  11. Amazon Bedrock Knowledge Bases now supports GraphRAG - Generate more relevant responses for generative AI applications using knowledge graphs. Knowledge graphs link relationships across data sources.
  12. The next generation of Amazon SageMaker - positioned as the center for all your data, analytics, and AI needs, expanding SageMaker by integrating data, analytics, and AI tools.
  13. Amazon SageMaker Lakehouse - Simplify analytics and AI with an open, unified, and secure data lake house. Unified access to your data across S3, Redshift, SaaS, and federated data sources. 
  14. Amazon SageMaker HyperPod flexible training plans
  15. Amazon Bedrock Data Automation - Transform unstructured multimodal data for generative AI applications and analytics.
  16. Amazon Q Developer is now available in SageMaker Canvas - build machine learning models quickly with natural language; a low-code offering for building models.
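Two of the items above, prompt caching and intelligent prompt routing, are really cost controls around model calls. Bedrock’s prompt caching actually caches repeated prompt prefixes inside the model, but the surrounding control flow - check a cache, then route to a model tier chosen by the request - can be sketched as follows. The routing heuristic, cache design, and model tier names are all my own invention:

```python
import hashlib

# Hypothetical model tiers - illustrative names, not an actual Bedrock config.
MODELS = {"cheap": "nova-micro", "strong": "nova-pro"}
_cache: dict[str, str] = {}

def route(prompt: str) -> str:
    """Invented heuristic: short prompts go to a cheap model,
    long prompts to a strong one."""
    return MODELS["cheap"] if len(prompt.split()) < 50 else MODELS["strong"]

def answer(prompt: str, call_model) -> str:
    """Check the cache first; on a miss, route the prompt and call
    the chosen model, then remember the response."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(route(prompt), prompt)
    return _cache[key]

# A stub standing in for a real model API, so we can see what gets called.
calls = []
def fake_model(model: str, prompt: str) -> str:
    calls.append(model)
    return f"{model}: ok"

answer("What is S3?", fake_model)
answer("What is S3?", fake_model)  # served from the cache - no second call
assert calls == ["nova-micro"]     # the short prompt was routed to the cheap tier
```

The real features make the same trade: spend a little work deciding where a prompt goes (and whether it needs to go anywhere at all) to avoid paying full price for every call.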

Agentic AI Announcements

  1. Amazon Bedrock multi-agent collaboration.
  2. Three new autonomous agents as part of Q Developer for generating unit tests, code documentation, and code reviews. Amazon is also integrating Q with GitLab’s Duo assistant.
  3. Amazon Q Business automation for complex workflows - automatically builds workflows based on documentation or recordings and navigates changes to workflows in real time, reducing breakage.

Miscellaneous Announcements

  1. Combine QuickSight and Amazon Q Business Data (and vice versa).
  2. AI apps from AWS partners now available in Amazon SageMaker
  3. Amazon Q in QuickSight Scenarios
  4. AWS Education Equity Initiative - Offering AWS credits for educational services with the community.

Conclusion

Amazon had a busy 2024 developing all the features described above. A simple count of the new announcements is heavily weighted toward AI, but not all features are equal in effort. In my opinion, the work done on compute and storage is just as significant: designing CPUs, accelerators, and new server instances is hard, and the storage engineering pushes the boundaries of both structured and unstructured storage solutions. With respect to AI, I really like the thinking going into the guardrail features.

If 2025 turns into the year of agentic AI, then proper guardrails will be key to making sure agentic AI is done correctly. 2025 will be an interesting year as these features get adopted and further refined.