AMD has unveiled the AMD Instinct MI300X, a new accelerator optimized for generative artificial intelligence (AI), as part of a wider announcement for the MI300 series of accelerators.
The MI300X, which was launched at the company’s Data Center and AI Technology Premiere, is AMD’s response to the current thirst for all things generative AI, making it the star of the new lineup.
It’s based on the AMD CDNA 3 accelerator architecture and supports up to 192GB of high-bandwidth memory (HBM) – the kind of capacity needed for large language models (LLMs) and generative AI workloads. With that memory headroom, even large and demanding LLMs, such as the 40-billion-parameter Falcon-40B model, can run on a single MI300X accelerator, according to AMD.
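A rough back-of-the-envelope calculation (an illustrative sketch, not an AMD figure) shows why 192GB matters: at 16-bit precision, a model's weights alone need two bytes per parameter.

```python
# Illustrative estimate of LLM weight memory vs. HBM capacity.
# These are rules of thumb for sizing, not AMD benchmarks.

def weights_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Memory needed for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# Falcon-40B with 16-bit (fp16/bf16) weights: 40B params x 2 bytes
falcon_40b_fp16 = weights_gb(40, bytes_per_param=2)
print(f"Falcon-40B fp16 weights: ~{falcon_40b_fp16:.0f} GB (vs 192 GB of HBM)")
```

At roughly 80GB for fp16 weights, Falcon-40B fits within a single MI300X's 192GB with room left for the KV cache and activations; that headroom is what a single-accelerator deployment depends on.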
In addition to the MI300X, the MI300 series also includes the MI300A – described by AMD as the “world’s first APU accelerator for high-performance computing (HPC) and AI workloads”. Other notable announcements included the AMD Instinct Platform, which combines eight MI300X accelerators into what AMD calls an ‘ultimate solution’ for AI inference and training.
“AI is the defining technology shaping the next generation of computing and the largest strategic growth opportunity for AMD,” said AMD chair and CEO Lisa Su. “We are laser-focused on accelerating the deployment of AMD AI platforms at scale in the data center, led by the launch of our Instinct MI300 accelerators planned for later this year and the growing ecosystem of enterprise-ready AI software optimized for our hardware.”
Other AI-based announcements included the Pervasive AI Vision, an AI platform that offers a portfolio of cloud-to-edge-to-endpoint hardware products and software collaborations. The aim is to develop scalable and pervasive AI services.
The announcement was hailed by Su as a “significant step forward for [AMD’s] data center strategy”. The expanded 4th Gen EPYC processor family now supports cloud and technical computing workloads, along with new public instances and internal deployments at the largest cloud providers.
Along with the new capabilities, a burgeoning ecosystem of partners has been tipped to shape the future of computing – this includes the likes of Amazon Web Services (AWS), Meta, Microsoft, and Oracle. AWS already uses the AMD EPYC processor in the Amazon Elastic Compute Cloud (Amazon EC2) M7a instances, while Oracle is planning to make its Oracle Cloud Infrastructure (OCI) E5 instances available with 4th Gen AMD EPYC processors.
“AWS has worked with AMD since 2018 to offer Amazon EC2 instances to customers. Today, we are seeing customers wanting to bring new types of applications to AWS, like financial applications, application servers, video transcoding, and simulation modeling," said David Brown, vice president of Amazon EC2 at AWS. “When we combine the performance of 4th Gen AMD EPYC processors with the AWS Nitro System, we’re advancing cloud technology for our customers by allowing them to do more with better performance on even more Amazon EC2 instances.”
AMD also showcased the ROCm software ecosystem for its data center accelerators, pitched as the foundation of an open AI software ecosystem. The announcement came with a presentation from PyTorch on its work with AMD to fully upstream the ROCm software stack, delivering immediate “day zero” support for PyTorch 2.0 with ROCm release 5.4.2 on all AMD Instinct accelerators.