Microsoft is building a new AI model to rival some of the biggest
Microsoft is working on MAI-1, a new internally developed large language model to rival some of the biggest around, despite its deal with OpenAI
Microsoft has confirmed it is working on MAI-1, a new large language model (LLM) that could be big enough to rival the largest models currently available, including Google Gemini and GPT-4.
According to a report by The Information, the new 500 billion parameter model is called MAI-1 and is being overseen by Microsoft AI CEO Mustafa Suleyman.
Suleyman, who was only recently hired by Microsoft to lead its consumer AI development division, was co-founder of AI company Inflection, and one of the founders of UK AI pioneer DeepMind.
The move marks a step change for Microsoft, which until now has relied largely on models developed by OpenAI to fuel its charge in the generative AI race against major competitors such as Google and AWS.
How does MAI-1 compare to its rivals?
If MAI-1 is being built with 500 billion parameters, that would make it one of the largest models currently known.
For comparison, OpenAI’s GPT-4 is thought to have around 1 trillion parameters, while Grok, from Elon Musk’s xAI, has 314 billion parameters.
Other big AI players such as Google and Anthropic have kept the number of parameters in their LLMs under wraps.
It’s not entirely clear why Microsoft would need to build another LLM, given that it has already made a major $10 billion investment in OpenAI, whose ChatGPT models have dominated the generative AI landscape up to now.
Why is Microsoft building MAI-1?
Microsoft CTO Kevin Scott sought to downplay the story in a posting on LinkedIn, while also apparently confirming that the company does have a model called MAI.
“Just to summarize the obvious: we build big supercomputers to train AI models; our partner Open AI uses these supercomputers to train frontier-defining models; and then we both make these models available in products and services so that lots of people can benefit from them. We rather like this arrangement,” he said.
Scott said that each supercomputer built for OpenAI is a lot bigger than the one that preceded it, and each frontier model they train is a lot more powerful than its predecessors.
“We will continue to be on this path--building increasingly powerful supercomputer for Open AI to train the models that will set pace for the whole field - well into the future. There's no end in sight to the increasing impact that our work together will have,” he said.
Scott said that Microsoft has also built its own AI models “for years and years and years,” and said AI models are used in almost every one of the products, services, and operating processes at Microsoft.
“The teams making and operating things on occasion need to do their own custom work, whether that's training a model from scratch, or fine tuning a model that someone else has built. There will be more of this in the future too. Some of these models have names like Turing, and MAI. Some, like Phi for instance, we even open source,” he said.
Are LLMs the only game in town?
Not all AI models have to have a gigantic parameter count. Microsoft recently introduced Phi-3, a small language model which it said is capable of outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks, and could be a more practical choice for customers looking to build generative AI applications.
Building an LLM with a vast number of parameters is only part of the story; AI companies are also racing to secure the best sources of data to train their generative AI tools.
Just this week OpenAI announced a deal with Stack Overflow to use the millions of questions and answers posted by developers on the knowledge site to enhance the responses from ChatGPT.
Steve Ranger is an award-winning reporter and editor who writes about technology and business. Previously he was the editorial director at ZDNET and the editor of silicon.com.
