What is a tensor processing unit (TPU)?

Google's in-house AI chips are the most notable alternative to Nvidia at the enterprise scale

Google Cloud CEO Thomas Kurian speaking on stage at the Sphere in Las Vegas, Nevada, during the opening event of the 2025 Google Cloud Next conference.
(Image credit: ITPro/Rory Bathgate)

A tensor processing unit (TPU) is an application-specific integrated circuit (ASIC) designed to accelerate the high-volume matrix and vector math that dominates machine learning (ML) workloads.
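To give a rough sense of the kind of work a TPU offloads, the sketch below uses JAX, one of several frameworks that can compile for TPUs, to run a single large, low-precision matrix multiplication. The function and array names are illustrative rather than drawn from any Google codebase; on a Cloud TPU VM the same code dispatches to TPU cores, and elsewhere it falls back to CPU or GPU.

```python
# Illustrative sketch: the dense tensor math a TPU is built to accelerate.
import jax
import jax.numpy as jnp

# Show whichever accelerators the runtime can see: TPU cores on a Cloud TPU VM,
# otherwise CPU or GPU devices.
print(jax.devices())

@jax.jit  # compiled via XLA for whatever backend is available
def dense_layer(x, w):
    # One large matrix multiply plus a non-linearity: the pattern TPU matrix
    # units are designed to run at low precision and high throughput.
    return jax.nn.relu(x @ w)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (1024, 1024), dtype=jnp.bfloat16)
w = jax.random.normal(key, (1024, 1024), dtype=jnp.bfloat16)
print(dense_layer(x, w).shape)  # (1024, 1024)
```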

The first TPUs were designed back in 2015 by Google and released for third-party use in 2018.

Over successive generations, TPUs have evolved from modest accelerators into massive supercomputer-scale pods. Early generations focused on inference (and later training) using low-precision, high-throughput arithmetic, and were used to power early DeepMind breakthroughs such as the Go-playing program AlphaGo.

Google progressively improved memory bandwidth, chip count, interconnects, and energy efficiency, with recent models focused largely on meeting the massive demand for generative AI hardware. The latest generation, Ironwood, takes a major leap forward particularly when it comes to running “thinking” AI workloads (i.e., inference and reasoning at huge scale).

The Ironwood TPU is a seventh-generation chip designed for AI inference workloads, featuring 4,614 teraflops of peak compute, 192 GB of HBM memory, and 7.2 TB/s of memory bandwidth per chip. Scaled up to a pod of 9,216 chips, it delivers 42.5 exaflops of compute power and features a 1.2 TB/s bidirectional interconnect network. It also includes a specialized "SparseCore" to accelerate large-scale data tasks like those in recommendation systems.
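That pod-level figure follows directly from the per-chip number: 4,614 teraflops multiplied across 9,216 chips works out to roughly 42.5 exaflops of aggregate peak compute.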

Ironwood is set to become generally available towards the end of 2025, at which point customers will be able to test Google Cloud’s claims of competitive performance for training AI models and running inference on them efficiently.

The pros and cons of TPUs

Google has staked a great deal of its AI ambitions on TPUs, which it relies on heavily for training and running inference on its own flagship Gemini models. It argues that TPUs offer far better performance per watt than graphics processing units (GPUs) such as those offered by Nvidia, with Ironwood pitched as a peer to the likes of Nvidia’s Blackwell chips.

“TPUs offer several compelling pros for modern AI development, particularly in deep learning. They are highly efficient for training and running large AI models like Large Language Models (LLMs),” Petr Baudis, CTO & chief AI architect at Rossum, tells ITPro.

One of their most significant advantages is the ability to scale thousands of chips together into ‘TPU pods,’ making them an ideal architecture for massive model training. When fully utilized, TPUs often provide a lower cost per unit of work (better $/training hour). Lastly, since Google controls the supply, users are less dependent on GPU shortages, which can be a critical advantage.
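As a concrete, if simplified, illustration of that scaling model, the JAX sketch below (an assumption for illustration, not Google's code, and not the only framework TPUs support) shards one batched computation across every accelerator core the runtime can see, whether that is a single chip or a slice of a pod; on a machine without TPUs it simply runs across the available CPU or GPU devices.

```python
# Illustrative sketch: spread one computation across all visible devices.
# On a Cloud TPU VM those devices are TPU cores; elsewhere JAX falls back to CPU/GPU.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Arrange every visible device into a one-dimensional mesh with a "batch" axis.
mesh = Mesh(np.array(jax.devices()), axis_names=("batch",))

# Split a large batch along its leading axis, one slice per device.
batch = jnp.ones((8 * jax.device_count(), 1024), dtype=jnp.bfloat16)
batch = jax.device_put(batch, NamedSharding(mesh, P("batch", None)))

weights = jnp.ones((1024, 1024), dtype=jnp.bfloat16)

@jax.jit  # XLA compiles one program; each device executes its own slice in parallel
def forward(x, w):
    return jax.nn.relu(x @ w)

out = forward(batch, weights)
print(out.shape, out.sharding)
```

In frameworks like JAX, that same sharding abstraction is what lets a job grow from a handful of chips towards a full pod without rewriting the model.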

But TPUs aren’t better than GPUs across the board, as Baudis also explains.

“The primary limitation is that they are only available on Google Cloud, which results in significant vendor lock-in. They also have a smaller developer ecosystem compared to Nvidia's CUDA model, which is everywhere. Finally, TPUs work best on large, well-structured ML workloads and are not always ideal for smaller or irregular tasks.”

TPUs vs GPUs designed by Nvidia, AMD, or Intel

In the past few years, GPUs have become the go-to chips for AI training and inference, which makes Google’s reliance on TPUs all the more notable.

"TPUs are designed for AI tensor operations versus a GPU by Nvidia/AMD/Intel which is a more flexible architecture targeted at a broader set of applications (goes beyond AI) and will have general-purpose processing cores that are both smaller and in greater numbers than equivalent TPU cores," explains Alvin Nguyen, senior analyst at Forrester Research.

Using in-house chips also saves Google from the kind of supply chain limitations other hyperscalers face when it comes to acquiring in-demand GPUs.

“From a business standpoint, Google's TPU technology provides significant advantages, primarily centered on strategic independence,” Baudis tells ITPro. “It isn't a hostage to Nvidia's pricing or supply bottlenecks, and it can reliably offer AI capacity to customers even when GPUs are sold out elsewhere.” But he cautions that, in his opinion, while TPUs may offer better efficiency than GPUs for LLM-specific workloads, this is a smaller factor that will shift quarter by quarter with each hardware generation.

Nguyen adds that TPUs come with trade-offs but undeniably help Google to consolidate the control it has over its own AI ecosystem.

"Using TPUs allows Google to target the specific workloads they want to accelerate with an ASIC that they design and manufacture, providing them with more control over their infrastructure and tech stack (minimizes their dependence on NVIDIA and others for their technology, especially important with how in-demand GPUs are for AI)," he tells ITPro.

Google isn’t the only firm relying on its own silicon, however, with Baudis pointing to AWS’s Trainium and Inferentia as well as Microsoft’s Azure Maia. These aren’t theoretical competitors either: Anthropic already uses a mix of TPUs, Trainium, and Inferentia chips for its AI workloads.

Crucially, Nvidia still dominates the software ecosystem, with CUDA remaining the 'default language of AI compute'. TPUs therefore likely give Google an advantage mainly in cost, scale, and internal efficiencies, rather than total market control.

The future of TPU

Beyond the hardware, the broader strategic context around Google’s TPUs matters greatly. In October 2025, Anthropic announced it would access up to one million TPUs from Google Cloud, representing “well over a gigawatt” of compute capacity coming online in 2026. The deal is reportedly worth tens of billions of dollars.

“Google’s TPU strategy will age well if the world keeps moving towards giant AI models,” says Baudis. “It supports huge, dense AI training at an extreme scale very well. The risks all seem relatively unlikely – either a shift towards very small models (while increasing intelligence demands) or a paradigm shift to a different technology than transformer-based neural networks. Smaller shifts in neural architectures, etc., that cannot be accommodated by existing TPUs can likely be absorbed by new incremental generations of the chip design.”

With regards to the Anthropic deal, Baudis adds that Anthropic will serve as a “demand anchor tenant” for Google, bringing several notable benefits.

“Firstly, it amortizes the costs associated with TPU datacenter build-outs,” he says. “Secondly, it validates the performance and efficiency of TPUs at the frontier scale required by a leading AI research company. Crucially, this partnership is expected to pull independent software vendors (ISVs) and enterprises toward the TPU software path, influencing their choice of models and tooling.”

Nguyen adds that the deal may help popularize Google’s TPUs, just as the hardware becomes more widely available. This could drive partners to optimize their AI services for Google Cloud more broadly. As enterprises look to avoid AI vendor lock-in, the TPU could become an ever more popular option.