Microsoft is building a new AI model to rival some of the biggest
Microsoft is working on MAI-1, a new internally developed large language model to rival some of the biggest around, despite its deal with OpenAI
Microsoft has confirmed it is working on MAI-1, a new large language model (LLM) that could be big enough to rival the largest models currently available, including Google Gemini and GPT-4.
According to a report by The Information, the new 500-billion-parameter model is called MAI-1 and is being overseen by Microsoft AI CEO Mustafa Suleyman.
Suleyman, who was only recently hired by Microsoft to lead its consumer AI division, co-founded AI company Inflection and was also one of the founders of UK AI pioneer DeepMind.
The move marks a step change for Microsoft, which until now has relied largely on models developed by OpenAI to fuel its charge in the generative AI race against major competitors such as Google and AWS.
How does MAI-1 compare to its rivals?
If MAI-1 is being built with 500 billion parameters, that would make it one of the largest models currently known.
For example, OpenAI’s GPT-4 is thought to have around 1 trillion parameters, while Grok, from Elon Musk’s xAI, has 314 billion parameters.
Other big AI players such as Google and Anthropic have kept the number of parameters in their LLMs under wraps.
It’s not entirely clear why Microsoft would need to build another LLM, given its major – $10 billion – investment in OpenAI, whose GPT models have dominated the generative AI landscape up to now.
Why is Microsoft building MAI-1?
Microsoft CTO Kevin Scott sought to downplay the story in a post on LinkedIn, while also apparently confirming that the company does have a model called MAI.
“Just to summarize the obvious: we build big supercomputers to train AI models; our partner OpenAI uses these supercomputers to train frontier-defining models; and then we both make these models available in products and services so that lots of people can benefit from them. We rather like this arrangement,” he said.
Scott said that each supercomputer built for OpenAI is a lot bigger than the one that preceded it, and each frontier model they train is a lot more powerful than its predecessors.
“We will continue to be on this path, building increasingly powerful supercomputers for OpenAI to train the models that will set the pace for the whole field, well into the future. There's no end in sight to the increasing impact that our work together will have,” he said.
Scott said that Microsoft has also built its own AI models “for years and years and years,” and said AI models are used in almost every one of the products, services, and operating processes at Microsoft.
“The teams making and operating things on occasion need to do their own custom work, whether that's training a model from scratch, or fine tuning a model that someone else has built. There will be more of this in the future too. Some of these models have names like Turing, and MAI. Some, like Phi for instance, we even open source,” he said.
Are LLMs the only game in town?
Not all AI models need a gigantic parameter count. Microsoft recently introduced Phi-3, a small language model which it said can outperform models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks, and which could be a more practical choice for customers looking to build generative AI applications.
Building an LLM with a vast number of parameters is only part of the story; AI companies are also racing to secure the best sources of data to train their generative AI tools.
Just this week OpenAI announced a deal with Stack Overflow to use the millions of questions and answers posted by developers on the knowledge site to enhance the responses from ChatGPT.
Steve Ranger is an award-winning reporter and editor who writes about technology and business. Previously he was the editorial director at ZDNET and the editor of silicon.com.
