Microsoft is building a new AI model to rival some of the biggest
Microsoft is working on MAI-1, a new internally developed large language model to rival some of the biggest around, despite its deal with OpenAI
Microsoft has confirmed it is working on MAI-1, a new large language model (LLM) that could be big enough to rival the largest models currently available, including Google Gemini and GPT-4.
According to a report by The Information, the new 500 billion parameter model is called MAI-1 and is being overseen by Microsoft AI CEO Mustafa Suleyman.
Suleyman, who was only recently hired by Microsoft to lead its consumer AI development division, was co-founder of AI company Inflection, and one of the founders of UK AI pioneer DeepMind.
The move marks a step change for Microsoft, which until now has relied largely on models developed by OpenAI to fuel its charge in the generative AI race against major competitors such as Google and AWS.
How does MAI-1 compare to its rivals?
If MAI-1 is being built with 500 billion parameters, that would make it one of the largest models currently known.
For comparison, OpenAI’s GPT-4 is thought to have around 1 trillion parameters, while Grok, from Elon Musk’s xAI, has 314 billion parameters.
Other big AI players such as Google and Anthropic have kept the number of parameters in their LLMs under wraps.
It’s not entirely clear why Microsoft would need to build another LLM, given that it has already made a major investment of $10 billion in OpenAI, whose ChatGPT models have dominated the generative AI landscape up to now.
Why is Microsoft building MAI-1?
Microsoft CTO Kevin Scott sought to downplay the story in a posting on LinkedIn, while also apparently confirming that the company does have a model called MAI.
“Just to summarize the obvious: we build big supercomputers to train AI models; our partner Open AI uses these supercomputers to train frontier-defining models; and then we both make these models available in products and services so that lots of people can benefit from them. We rather like this arrangement,” he said.
Scott said that each supercomputer built for OpenAI is a lot bigger than the one that preceded it, and each frontier model they train is a lot more powerful than its predecessors.
“We will continue to be on this path--building increasingly powerful supercomputer for Open AI to train the models that will set pace for the whole field - well into the future. There's no end in sight to the increasing impact that our work together will have,” he said.
Scott said that Microsoft has also built its own AI models “for years and years and years,” adding that AI models are used in almost every one of the products, services, and operating processes at Microsoft.
“The teams making and operating things on occasion need to do their own custom work, whether that's training a model from scratch, or fine tuning a model that someone else has built. There will be more of this in the future too. Some of these models have names like Turing, and MAI. Some, like Phi for instance, we even open source,” he said.
Are LLMs the only game in town?
Not all AI models need a gigantic parameter count. Microsoft recently introduced Phi-3, a small language model which it said is capable of outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks, and could be a more practical choice for customers looking to build generative AI applications.
Building an LLM with a vast number of parameters is only part of the story; AI companies are also racing to secure the best sources of data to train their generative AI tools.
Just this week OpenAI announced a deal with Stack Overflow to use the millions of questions and answers posted by developers on the knowledge site to enhance the responses from ChatGPT.
Steve Ranger is an award-winning reporter and editor who writes about technology and business. Previously he was the editorial director at ZDNET and the editor of silicon.com.
