What is Microsoft Maia?
Microsoft's in-house chip is planned to be a core component of Microsoft Copilot and future Azure AI offerings
With AI taking over the tech world, Microsoft is doubling down on cloud-scale AI infrastructure by building its own silicon, starting with the Azure Maia 100.
Unveiled in November 2023, Maia is Microsoft’s custom AI accelerator, designed from the ground up to power the next generation of large language models, generative AI, and other demanding machine learning workloads inside its Azure cloud. With more than 100 billion transistors, Maia represents Microsoft's intention to break its reliance on external GPUs and offer hyperscale AI computing on its own terms. Let's look at the Maia 100 in more detail.
What the Maia 100 chip is designed for
The Maia 100 chip is Microsoft's custom-designed AI accelerator, built for large-scale, cloud-based artificial intelligence workloads including training and inference for the generative AI models that power services like Azure OpenAI and Copilot. By optimizing the entire hardware-software stack for Azure's infrastructure, Microsoft aims to improve performance, reduce costs, and lessen its reliance on third-party GPU vendors like Nvidia.
"The primary goal of Maia 100 chip is to run AI workloads (training and inference) for Azure cloud,” Gaurav Gupta, VP analyst at Gartner tells ITPro. “The idea is to reduce reliance on merchant silicon and bring improved energy efficiency and a lower-cost solution. The chip is designed for the best performance per dollar and watt.”
Like Google’s tensor processing unit (TPU) and AWS’ Trainium and Inferentia chips, Maia gives Microsoft the security of internally owned and designed silicon that can be fully optimized for its own software and hardware.
“Microsoft could control the entire stack, with a better understanding of its workloads and a chip designed specifically for AI workloads,” Gupta adds. “With its own designed chip, Microsoft could also control the supply chain better and leverage IP synergies."
How Maia 100 integrates with Azure
The Maia chip family was designed specifically for Microsoft Azure’s infrastructure, with the aim of forming the most efficient compute platform for its Copilot offering and other AI services.
"Microsoft hasn’t just made a chip, they’ve engineered the server boards, the racks, the liquid cooling, the networking, and the software in one coordinated push,” explains Greg Hawthorn, managing director, IT Services at Espria. “Maia is tightly designed around the Azure stack. The result is a vertically optimized platform that reduces overhead and improves performance for AI at cloud scale.”
This performance will be turned to tasks such as training and fine-tuning large language models (LLMs), high-bandwidth inference for Microsoft Copilot, and support for wider enterprise AI platforms built in Azure AI Studio. Hawthorn stresses that this makes strategic sense for Microsoft to pursue.
“Because Microsoft isn’t relying solely on Nvidia, Copilot and Azure OpenAI capacity will become more stable over time, with fewer bottlenecks and quota issues. Owning silicon also allows Microsoft to control long-term AI costs, affording it the headroom to improve pricing or increase throughput without passing costs downstream.
“In short, with Maia, Microsoft intends to control its AI destiny, reduce reliance on external supply chains, and deliver AI services that are faster, more affordable, and more predictable for customers."
In addition, Gupta tells ITPro that Microsoft plans to offer the Maia chip to customers via racks installed in their data centers.
Microsoft Maia vs Google TPUs
Google’s TPUs and Microsoft’s Maia 100 sit at different points on the maturity curve, and that is the biggest difference. Google is now on its 6th/7th TPU generation, with a decade of optimization behind it. Maia 100 is Microsoft’s first-generation AI accelerator.
The main architectural and performance distinctions, according to Hawthorn, are:
- Architectural maturity: Google TPUs have evolved over several iterations, with highly refined systolic array designs, large-scale pod interconnects, and deep software integration through XLA, JAX and TensorFlow. On the other hand, Maia 100 is a first-generation application-specific integrated circuit (ASIC), tuned heavily for Microsoft workloads such as Copilot and Azure OpenAI. Its architecture is optimized around Azure’s server boards, networking, and liquid cooling system rather than broad machine learning (ML) frameworks.
- Integration philosophy: Google TPUs are tightly integrated with Google Cloud and the Gemini model family, making it a vertically aligned ecosystem from chip to model. Maia takes a “Microsoft-first” approach: its primary purpose is to reduce GPU dependency, support internal workloads, and improve the economics of Azure AI.
- Performance positioning: TPUs aim to push absolute accelerator throughput for frontier model training, while Maia focuses on balanced performance and power efficiency for large-scale inference and ongoing training of production workloads.
- Software ecosystem: Google has a deeply mature optimization stack (XLA, compiler-level graph optimization), whereas Microsoft is building its ecosystem around PyTorch, ONNX Runtime and Triton, with a strong emphasis on portability between GPUs and Maia (a rough portability sketch follows below).
"Google’s latest TPU offering is much more advanced as Google has been doing custom silicon for almost a decade and this is the 7th generation, while Microsoft entered late into this space,” he explains. “Google has a vertically integrated solution with not just the compute, but its own architecture for networking and software stack at the server level, and not just a chip. On the other hand, Maia 100 is still an early offering with low volume and potentially primarily for internal workloads. Google’s TPUs have strong external demand too."
Microsoft Maia vs AWS Trainium and Inferentia
As mentioned above, Microsoft's other hyperscale rival AWS has an ASIC family of its own in the form of Trainium and Inferentia. These chips are distinct from Maia in several key ways, as Hawthorn explains:
- Efficiency: Trainium/Inferentia are designed around cost-per-training-hour and cost-per-inference targets (see the rough cost sketch below). AWS markets these aggressively, and enterprises can access them today with strong cost benefits compared to GPUs. Maia 100 is designed to improve Microsoft’s efficiency first by reducing the cost of running Copilot, Defender AI, Azure OpenAI, and other internal services.
- Scalability: AWS has scaled Trainium/Inferentia across multiple regions, with large-capacity instance families (Trn1, Inf2), whereas Microsoft is deploying Maia systems region by region and building out next-generation variants.
- Enterprise suitability: AWS silicon is ready for enterprise adoption at scale as it's heavily documented, easy to consume, and backed by the Neuron SDK. Maia will indirectly benefit enterprises by strengthening Azure AI services, but direct consumption will come later.
- Integration into the ecosystem: AWS chips integrate with the full ML stack on AWS, including SageMaker, Neuron, PyTorch, TensorFlow, and JAX. Maia, by contrast, is targeted squarely at Azure AI workloads, Copilot, and Microsoft’s model deployments, making it highly relevant for Microsoft-first organizations.
“AWS silicon is currently the better option for direct accelerator consumption, but Maia strengthens the economics and capacity of the Azure services that companies already rely on,” Hawthorn says. “As the asset inside Azure that will make Copilot, Defender, Sentinel, and Azure OpenAI faster, Maia is more predictable and more cost-effective over time."
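The cost-per-inference framing mentioned in the efficiency point above boils down to simple arithmetic. The snippet below is a rough, back-of-the-envelope sketch of how such comparisons are typically reasoned about; the prices and throughput figures are hypothetical placeholders, not published AWS or Azure numbers.

```python
# Back-of-the-envelope cost-per-inference comparison. All figures are
# hypothetical placeholders used only to show the arithmetic.

def cost_per_million_inferences(hourly_price_usd: float,
                                inferences_per_second: float) -> float:
    """Cost to serve one million inferences on an instance running flat out."""
    inferences_per_hour = inferences_per_second * 3600
    return hourly_price_usd / inferences_per_hour * 1_000_000


# Hypothetical comparison of a GPU instance and a custom-silicon instance.
gpu_cost = cost_per_million_inferences(hourly_price_usd=32.0, inferences_per_second=4_000)
asic_cost = cost_per_million_inferences(hourly_price_usd=20.0, inferences_per_second=3_500)

print(f"GPU instance:  ${gpu_cost:.2f} per million inferences")
print(f"ASIC instance: ${asic_cost:.2f} per million inferences")
```

Small per-chip differences in this ratio compound quickly at hyperscale, which is why both AWS and Microsoft frame their custom silicon around cost and power efficiency rather than peak benchmark numbers.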
Maia is being deployed across Azure data centers now and is powering Microsoft’s own Copilot and Azure OpenAI workloads.
The roadmap includes follow-on Maia variants in the coming years. However, limited public deployment of Maia 100, continued reliance on third-party GPUs, and delays in its next-gen chips make a full-scale rollout of custom Maia-based infrastructure across all of Azure within a one to two-year window seem optimistic.