AMD strikes deal to power Microsoft Azure OpenAI service workloads with Instinct MI300X accelerators

published 29 May 2024

AMD and Microsoft have expanded their partnership to bolster end-to-end compute and software capabilities for the hyperscaler’s enterprise customers.

Under the expanded partnership, unveiled at Microsoft Build 2024, AMD solutions such as its Instinct MI300X accelerators, ROCm open software, Ryzen AI processors and software, and Alveo MA35D media accelerators will support Microsoft offerings such as the Azure OpenAI service and its new virtual machines.

The deal will help underpin a wide array of critical services from Microsoft, according to AMD president Victor Peng.

“The AMD Instinct MI300X and ROCm software stack is powering the Azure OpenAI Chat GPT 3.5 and 4 services, which are some of the world’s most demanding AI workloads,” he said.

“With the general availability of the new VMs from Azure, AI customers have broader access to MI300X to deliver high-performance and efficient solutions for AI applications.”

Kevin Scott, chief technology officer and executive vice president of AI at Microsoft, said the deal showcases the long-standing partnership between the two companies.

“Microsoft and AMD have a rich history of partnering across multiple computing platforms: first the PC, then custom silicon for Xbox, HPC and now AI,” he said.

“Over the more recent past, we’ve recognized the importance of coupling powerful compute hardware with the system and software optimization needed to deliver amazing AI performance and value.

“Together with AMD, we’ve done so through our use of ROCm and MI300X, empowering Microsoft AI customers and developers to achieve excellent price-performance results for the most advanced and compute-intense frontier models. We’re committed to our collaboration with AMD to continue pushing AI progress forward.”

AMD supporting AI advances at Microsoft

First announced in preview in November 2023, the Azure ND MI300X v5 VM series is now available in the Canada Central region for customers to run AI workloads.

Microsoft said these new VMs provide significant HBM capacity and memory bandwidth and will enable customers to fit larger models in GPU memory or use fewer GPUs.

In the long term, this will help reduce power consumption and costs and rapidly accelerate time-to-solution.
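
As a rough illustration of Microsoft's point about fitting larger models in GPU memory, the sketch below estimates how many GPUs are needed just to hold a model's weights. The 192GB figure is AMD's published HBM3 capacity for the MI300X rather than a number from this announcement, and the 80GB comparison GPU, the 70-billion-parameter model, and the function names are all hypothetical choices made for the example.

```python
import math


def weights_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes).

    bytes_per_param=2 corresponds to 16-bit weights (FP16/BF16).
    """
    return params_billions * bytes_per_param


def gpus_needed(params_billions: float, hbm_gb_per_gpu: float,
                bytes_per_param: int = 2) -> int:
    """Minimum number of GPUs whose combined HBM can hold the weights."""
    return math.ceil(weights_gb(params_billions, bytes_per_param) / hbm_gb_per_gpu)


if __name__ == "__main__":
    model_b = 70  # hypothetical 70-billion-parameter model
    for name, hbm in [("192GB GPU (MI300X spec, assumed)", 192),
                      ("80GB GPU (hypothetical comparison)", 80)]:
        print(f"{name}: {gpus_needed(model_b, hbm)} GPU(s) "
              f"for {weights_gb(model_b):.0f}GB of weights")
```

The estimate deliberately ignores activations, key-value cache, and framework overhead, so real deployments need more headroom than this lower bound suggests.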

Powered by AMD’s ROCm software, the VMs are also being used for Azure AI production workloads. Microsoft said this will give customers access to GPT-3.5 and GPT-4 models.

Hugging Face sees performance improvements with ND MI300X VMs

One of the first customers to harness the new VMs was Hugging Face, Microsoft and AMD revealed.

The AI company ported its models to the ND MI300X VMs in just one month, a move the companies said delivered significant performance and price-performance improvements.

As part of the deal, ND MI300X VM customers can now bring Hugging Face models to the new VMs to easily create and deploy NLP applications.

“The deep collaboration between Microsoft, AMD, and Hugging Face on the ROCm open software ecosystem will enable Hugging Face users to run hundreds of thousands of AI models available on the Hugging Face Hub on Azure with AMD Instinct GPUs without code changes, making it easier for Azure customers to build AI with open models and open source,” said Julien Simon, chief evangelist officer, Hugging Face.

Optimizing with AMD Ryzen AI software

AMD also announced that developers can now use Ryzen AI software to optimize and deploy inference on Ryzen-powered PCs.

This software allows applications to run on the neural processing unit (NPU) built on AMD XDNA architecture, which is the first dedicated AI processing silicon on a Windows x86 processor.

This addresses a long-standing problem: running AI models on a CPU or GPU alone can drain battery power rapidly. On a Ryzen AI-powered laptop, AI models can instead run on the dedicated NPU, freeing up CPU and GPU resources for other tasks.

AMD said this helps to increase battery life and allows developers to run on-device AI workloads and concurrent applications more efficiently.

Advancing video services and enterprise compute

Finally, Microsoft revealed it has selected the AMD Alveo MA35D media accelerator to power live streaming video workloads. These include Microsoft Teams, SharePoint video, and others, the firm said.

The Alveo MA35D media accelerator is purpose-built to power live interactive streaming services “at scale”, according to AMD. The company said this will enable Microsoft to ensure high-quality video experiences for users by streamlining video processing workloads, including video transcoding, decoding, and encoding.

Using the Alveo MA35D accelerator in servers powered by 4th Gen AMD EPYC processors, Microsoft will be able to consolidate cloud infrastructure by harnessing high-channel-density, energy-efficient, and ultra-low-latency video processing capabilities.

The Alveo MA35D also features ASIC-based video processing units that support AV1 compression standards, allowing AI-enabled video quality improvements to provide a “smooth and seamless” video experience for users.

AMD also noted that the deal will support “future-ready AV1 technology”. By supporting emerging standards like AV1, the Alveo MA35D “provides Microsoft with a solution that can adapt to evolving video processing requirements.”
