Microsoft’s in-house AI chip could ‘hasten’ product rollout, cut costs


Microsoft is reportedly working on its own chips for training AI models, a move that could help expand its cloud AI offerings and intensify competition with rivals AWS and Google Cloud.

The move could allow Microsoft to further expand its generative AI products, such as those launched through its partnership with OpenAI, and embed AI tools across its productivity ecosystem.

A Bloomberg report stated that the chip is codenamed ‘Athena’ and could be the product of a strategic partnership between Microsoft and AMD.

Microsoft’s need to train large language models (LLMs) at scale has sharply increased over the past year, as the company has placed generative AI front and center in its plans for the future.

AMD shares rose after the initial reports, though neither firm has officially announced a partnership of this kind.

Microsoft spokesman Frank Shaw told Bloomberg that while “AMD is a great partner,” it is not involved in Athena.

“Microsoft’s reported move into custom silicon makes sense, particularly as Microsoft is integrating AI inference into its own products, like Windows, Bing, and Office,” said James Sanders, principal analyst, cloud and infrastructure at CCS Insight, to ITPro.


“There’s also ample precedent for this, as Amazon Web Services designs and offers the Trainium and Inferentia chips, and Google Cloud builds and offers TPUs, while Google at large also builds custom silicon for video encoding to support YouTube.”

Google has claimed that its AI chips are ‘faster and greener’ than competitors’, stating that its fourth-generation tensor processing unit (TPU) can train AI at 1.3 to 1.9 times lower power intensity than Nvidia’s A100 chips.

Sanders explained that although Copilot and Bing AI will generate revenue through subscriptions and advertisements, they will also drive workload requirements to the extent that Microsoft could save money by using in-house hardware.

“While Copilot features are paid add-ons and Bing with AI is ultimately ad-supported, Microsoft will have the volume of inference workloads to warrant building custom chips for it, and doing so will improve the margins associated with adding AI functions to these products. 

“Depending on how far Microsoft is in development, it may also help to hasten the scale of availability to more users, as Nvidia is experiencing sustained demand for A100 and H100 chips.”

He noted that Microsoft is highly likely to continue to offer Nvidia graphics processing units (GPUs) to its cloud customers, in the same way that Amazon provides Intel and AMD CPUs alongside its Arm-based Graviton processors.

In November, Microsoft announced a collaboration with Nvidia to build an AI supercomputer, a years-long project involving tens of thousands of Nvidia’s AI chips.

Where does big tech stand on AI hardware?

Big tech is flocking to Nvidia at present, with Microsoft, AWS, and Adobe all having chosen the firm’s cloud infrastructure for LLM training.

Microsoft has been a particularly large customer, having worked with Nvidia on the aforementioned supercomputer and paid the firm significant sums as part of its $10 billion OpenAI investment.

OpenAI has relied heavily on Nvidia’s cloud architecture and hardware to train its GPT-3 and GPT-4 LLMs.

OpenAI has particularly benefited from Nvidia’s CUDA software framework, which provides a consistent development platform that firms can rely on to be compatible with their software.

AMD could benefit hugely from focused Microsoft investment and assistance, which could place it in a much better position to shake Nvidia from its top spot.

Nasdaq figures for 2022 showed that Nvidia controlled 88% of the GPU market compared to AMD’s 8%.

Despite this, AMD demonstrated stronger growth in 2022, with revenue rising 44% year on year compared with Nvidia’s 0.2%, driven in large part by data center sales.

Intel has also targeted AI hardware dominance by 2025, with a roadmap for its diverse product range seeking to address current bottlenecks such as memory bandwidth.

The key to the chip giant’s success could lie in its open approach to AI, having helped Hugging Face improve the open-science BLOOM LLM using its Gaudi 2 processor.

In a blog post, Hugging Face engineer Régis Pierrard compared Intel’s hardware favorably to Nvidia’s A100 AI chip and stated that the former offered better performance and lower costs.

Intel’s existing customer base may also help it quickly scale its AI market share, with approximately 6.2 million active developers in its community ready to adopt its AI tools and hardware.

Using Intel’s SYCLomatic tool, developers can migrate CUDA source code to SYCL, the open programming model that Intel backs, which could ease the way for a steady stream of customers moving away from Nvidia.
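To illustrate the kind of porting involved, the sketch below shows a trivial vector addition, the sort of workload a CUDA kernel would handle, hand-written in standard SYCL C++. It is a minimal illustrative example under the assumption of a SYCL 2020 toolchain, not SYCLomatic’s actual output, and the buffer and kernel names are the author’s own.

```cpp
// Minimal sketch: a CUDA-style vector addition expressed in standard SYCL C++
// (the open model SYCLomatic migrates CUDA code to). Illustrative only.
#include <sycl/sycl.hpp>
#include <vector>

int main() {
    constexpr size_t N = 1024;
    std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

    sycl::queue q;  // picks a default device, e.g. a GPU if one is present
    {
        // Buffers manage the host data for the duration of this scope
        sycl::buffer<float> bufA(a.data(), sycl::range<1>(N));
        sycl::buffer<float> bufB(b.data(), sycl::range<1>(N));
        sycl::buffer<float> bufC(c.data(), sycl::range<1>(N));

        q.submit([&](sycl::handler& h) {
            sycl::accessor A(bufA, h, sycl::read_only);
            sycl::accessor B(bufB, h, sycl::read_only);
            sycl::accessor C(bufC, h, sycl::write_only, sycl::no_init);

            // Plays the role of a CUDA __global__ kernel launched over N
            // threads: each work-item computes one element, much as a
            // threadIdx/blockIdx calculation would in CUDA.
            h.parallel_for(sycl::range<1>(N), [=](sycl::id<1> i) {
                C[i] = A[i] + B[i];
            });
        });
    }  // buffer destruction copies results back into the host vector c

    return c[0] == 3.0f ? 0 : 1;  // trivial correctness check
}
```

Compiled with a SYCL implementation such as Intel’s oneAPI DPC++ compiler, the same source can in principle target Intel, Nvidia, or AMD devices, which is the portability argument underpinning the migration pitch.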

ITPro has approached Microsoft and AMD for comment.

Rory Bathgate
Features and Multimedia Editor

Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.

In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism. You can contact Rory at rory.bathgate@futurenet.com or on LinkedIn.