What is Google’s TPU?
A TPU (Tensor Processing Unit) is Google’s custom AI chip, designed from scratch for the giant matrix multiplications that modern models live on. GPUs were built for graphics first.
TPUs were built for deep learning from day one.
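To get a feel for why matrix multiplication dominates the workload, here is a rough back-of-the-envelope sketch in plain Python. It uses the common convention that multiplying an (m x k) matrix by a (k x n) matrix costs about 2·m·k·n floating-point operations (one multiply and one add per term). The layer sizes are hypothetical, chosen only for illustration, and nothing here reflects how a TPU actually schedules the work.

```python
def matmul_flops(m: int, k: int, n: int) -> int:
    """Approximate FLOPs for an (m x k) @ (k x n) product.

    Convention: each of the m*n outputs needs k multiplies and about k adds.
    """
    return 2 * m * k * n


def matmul(a, b):
    """Naive reference matmul over lists of lists, for illustration only."""
    k = len(b)
    n = len(b[0])
    return [
        [sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
        for row in a
    ]


# Hypothetical sizes for one dense layer (assumption, not from any real model).
batch, d_in, d_out = 8, 4096, 4096
print(f"FLOPs for one layer: {matmul_flops(batch, d_in, d_out):,}")

# Tiny sanity check of the reference implementation.
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Even one modest layer lands in the hundreds of millions of operations, and a real model repeats this across many layers and many steps, which is the kind of work a matrix-focused chip is meant to speed up.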
At Cloud Next ’26, Google unveiled its 8th-generation TPU, and for the first time it ships in two flavors. TPU 8t is built for training, where raw throughput wins. TPU 8i is built for inference, where latency and chip-to-chip speed matter most.
Both share the same Axion CPUs, liquid cooling, and software stack, so code written for one runs on the other.
The diagram below is a quick study guide to what’s the same, what’s different, and why, based on our understanding of published Google articles.