On April 22, 2026, Google announced its eighth-generation Tensor Processing Units (TPUs), splitting AI training and inference across two dedicated chips to better serve the growing demands of modern AI workloads, particularly AI agents.
The new lineup includes a dedicated training chip and an inference-focused TPU 8i. Both processors are scheduled for availability later in 2026 via Google Cloud.
According to Google, the training chip delivers 2.8 times the performance of the previous seventh-generation Ironwood TPU at the same price point. The TPU 8i offers 80% better performance than Ironwood and features 384 MB of SRAM, three times more than its predecessor, enabling massive throughput and low latency.
The split architecture is designed to run millions of AI agents concurrently and cost-effectively. “With the rise of AI agents, we determined the community would benefit from chips individually specialized to the needs of training and serving,” said Amin Vahdat, Google senior vice president and chief technologist for AI and infrastructure.
Alphabet CEO Sundar Pichai added that the design delivers “the massive throughput and low latency needed to concurrently run millions of agents cost-effectively.”
The move represents Google’s latest push to provide a strong alternative to Nvidia’s GPUs in the AI hardware space, focusing on custom silicon tailored for both heavy model training and high-volume inference.