
Google Cloud debuts powerful Ironwood chip

Ironwood Chip

Google has introduced Ironwood, its seventh-generation Tensor Processing Unit (TPU), designed specifically for inference.

This powerful AI accelerator is built to handle the massive computational demands of “thinking models,” such as large language models and mixture-of-experts models. Ironwood scales up to 9,216 chips, offering 42.5 Exaflops of compute power, which Google says makes it more powerful than the world’s largest supercomputer.

Ironwood is the latest in a lineage of TPUs that have powered Google’s most demanding AI workloads for over a decade. It represents a significant shift in AI development, moving from responsive models that provide real-time information for humans to interpret, toward models that proactively generate insights and answers. This marks the beginning of what Google calls the “age of inference.”

Designed to manage complex computations and communications required by advanced AI models, Ironwood features a low-latency, high-bandwidth Inter-Chip Interconnect (ICI) network.

This allows for coordinated, synchronous communication at full TPU pod scale, supporting up to 9,216 chips per pod. For Google Cloud customers, Ironwood is available in two configurations: a 256-chip setup and a 9,216-chip setup.

Offering over 24 times the compute power of El Capitan, the world’s largest supercomputer, Ironwood delivers the massive parallel processing necessary for the most demanding AI workloads, such as large language models and mixture-of-experts models with reasoning capabilities, for both training and inference.


Each Ironwood chip boasts a peak compute of 4,614 TFLOPS. It also features enhanced SparseCore support, which accelerates workloads beyond traditional AI domains, extending to financial and scientific computations.
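The per-chip and pod-level figures quoted above are consistent with each other, which is a quick sanity check worth making explicit. A minimal sketch (using only the numbers quoted in this article):

```python
# Sanity check: per-chip peak compute times full pod size should
# reproduce the quoted 42.5 Exaflops pod-level figure.
# All figures are those quoted in the article.

PER_CHIP_TFLOPS = 4_614   # peak compute per Ironwood chip
CHIPS_PER_POD = 9_216     # largest pod configuration

pod_tflops = PER_CHIP_TFLOPS * CHIPS_PER_POD
pod_exaflops = pod_tflops / 1e6   # 1 Exaflop = 1,000,000 TFLOPS

print(f"{pod_exaflops:.1f} Exaflops")  # → 42.5 Exaflops
```

The multiplication lands on 42.52 Exaflops, matching the rounded 42.5 Exaflops figure Google cites for a full pod.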

Ironwood’s unprecedented compute power

The architecture includes expanded High Bandwidth Memory (HBM) and improved ICI networking for rapid data access. Google’s Pathways ML runtime, developed by Google DeepMind, enables efficient distributed computing across multiple TPU chips. This allows developers to harness the combined power of tens of thousands of Ironwood TPUs, facilitating advancements in generative AI computation at an unprecedented scale.

Ironwood offers significant performance gains with a focus on power efficiency, providing 2x the performance per watt of the previous-generation Trillium TPU. It is nearly 30x more power efficient than the first Cloud TPU from 2018. Ironwood’s advanced liquid cooling and optimized chip design can reliably sustain up to twice the performance of standard air cooling, even under continuous, heavy AI workloads.

This TPU offers 192 GB of HBM per chip, 6x that of Trillium, enabling processing of larger models and datasets. HBM bandwidth has been dramatically improved to 7.2 TBps per chip. ICI bandwidth has been increased to 1.2 Tbps bidirectional, enabling faster communication between chips and facilitating efficient distributed training and inference at scale.
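The per-chip memory figure also implies a striking pod-wide total. A back-of-the-envelope aggregation, using only the numbers quoted in this article (decimal units assumed, i.e. 1 PB = 1,000,000 GB):

```python
# Aggregate the per-chip HBM figure quoted in the article across
# a full 9,216-chip pod. This is an illustrative calculation, not
# a figure Google publishes directly.

HBM_PER_CHIP_GB = 192   # HBM capacity per Ironwood chip
CHIPS_PER_POD = 9_216   # largest pod configuration

total_gb = HBM_PER_CHIP_GB * CHIPS_PER_POD
total_pb = total_gb / 1e6   # decimal petabytes

print(f"{total_gb:,} GB ≈ {total_pb:.2f} PB of pod-wide HBM")
```

That works out to roughly 1.77 PB of high-bandwidth memory addressable across a single pod.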

Ironwood represents a breakthrough in AI technology, offering unprecedented computational power, memory capacity, and networking capabilities. It’s built to meet the demands of tomorrow’s AI workloads, enabling Google Cloud customers to tackle complex AI challenges with high performance and efficiency. Ironwood’s innovations ensure that the right data is always available to support peak performance at massive scales, making it a cornerstone in the development of AI during the age of inference.


Image Credits: Photo by Brian Kostiuk on Unsplash

Johannah Lopez is a versatile professional who seamlessly navigates two worlds. By day, she excels as a SaaS freelance writer, crafting informative and persuasive content for tech companies. By night, she showcases her vibrant personality and customer service skills as a part-time bartender. Johannah's ability to blend her writing expertise with her social finesse makes her a well-rounded and engaging storyteller in any setting.

About Our Editorial Process

At DevX, we’re dedicated to tech entrepreneurship. Our team closely follows industry shifts, new products, AI breakthroughs, technology trends, and funding announcements. Articles undergo thorough editing to ensure accuracy and clarity, reflecting DevX’s style and supporting entrepreneurs in the tech sphere.
