Microsoft joins competitors in handing over AI models for advanced testing
US and UK government agencies will evaluate the firm's frontier models, along with those from Google and xAI
Microsoft, Google, and xAI have agreed to hand over their AI tools to the US Center for AI Standards and Innovation (CAISI) and the UK's AI Security Institute (AISI) for pre-deployment testing.
The agencies will evaluate the firms' frontier models, assess safeguards, and help mitigate national security and large-scale public safety risks, Microsoft said.
"Well-constructed tests help us understand whether our systems are working as intended and delivering the benefits they are designed to provide. Testing also helps us stay ahead of risks, such as AI-driven cyber attacks and other criminal misuses of AI systems, that can emerge once advanced AI systems are deployed in the world," said Natasha Crampton, Microsoft’s chief responsible AI officer.
"While Microsoft regularly undertakes many types of AI testing on its own, testing for national security and large-scale public safety risks necessarily must be a collaborative endeavor with governments. This type of testing depends on deep technical, scientific, and national security expertise that is uniquely held by institutions like CAISI in the US and AISI in the UK and the government agencies they work with."
In the US, Microsoft and the National Institute of Standards and Technology (NIST) will collaborate with CAISI on improving methodologies for adversarial assessments.
That work will involve testing AI systems by examining unexpected behaviors, misuse pathways, and failure modes. It also includes developing more systematic and reproducible approaches to evaluation, such as shared frameworks, datasets, and workflows for assessing safety, security, and robustness risks in advanced AI systems.
“Independent, rigorous measurement science is essential to understanding frontier AI and its national security implications,” said CAISI director Chris Fall. “These expanded industry collaborations help us scale our work in the public interest at a critical moment.”
Microsoft strikes agreement with UK researchers
In the UK, Microsoft will collaborate with AISI on research related to frontier safety and security, including ways of evaluating high-risk capabilities and the effectiveness of the safeguards used to address them.
"The partnership will also include research into societal resilience, examining how conversational AI systems interact with users in sensitive contexts," said AISI.
"As AI systems become increasingly capable, sustained two-way collaboration between government and companies developing and deploying frontier AI is essential to advance our joint understanding of large-scale risks to public safety and national security."
Microsoft said future plans include collaborating with other AI institutes around the world, sharing priorities and methodologies for testing through the International Network for AI Measurement, Evaluation and Science.
The company is also working with the Frontier Model Forum (FMF), an initiative dedicated to advancing the science and practice of frontier AI safety and security, to support independent research and promote transparency around risk mitigation strategies.
It is also contributing to MLCommons, a multistakeholder non-profit that develops and operationalizes testing tools such as AILuminate, a family of safety and security benchmarks.
"As AI capabilities advance, so too must the rigor of the testing and safeguards that underpin them. We will apply what we learn from these partnerships directly into how we design, test, and deploy AI systems, ensuring that progress in evaluation science translates into safer, more secure products for our customers," said Crampton.
"As these partnerships progress, we will share what we learn and look for opportunities to apply insights and best practices to AI testing more broadly."
Emma Woollacott is a freelance journalist writing for publications including the BBC, Private Eye, Forbes, Raconteur and specialist technology titles.
