Anthropic announces Claude Opus 4.5, the new AI coding frontrunner
The new frontier model is a leap forward for the firm across agentic tool use and resilience against attacks
Anthropic has announced Claude Opus 4.5, its most advanced model to date and the new industry leader for AI code generation.
The AI company claims the new model is a strong contender for all agentic workflows, including code generation and autonomous computer use.
Opus 4.5 scored 80.9% on SWE-Bench Verified, making it the new state-of-the-art model for code generation.
SWE-Bench Verified is one of the most rigorous benchmarks for testing the agentic coding capabilities of AI models. Models tested against the benchmark are presented with real-world coding problems taken from open source GitHub repositories.
In comparison, GPT-5.1 Codex Max scored 77.9% and Gemini 3 Pro, Google’s latest frontier model, scored 76.2%.
Claude Sonnet 4.5 has, to date, been widely praised as the best AI model for generating code across a variety of programming languages.
In addition to its raw performance, Opus 4.5 offers developers more choice in how to approach a problem.
Via the Claude API, developers can use a new ‘effort’ parameter to determine how many tokens they want the model to use for a given task. This affects how long the output will take and how expensive it will be.
In tests, Opus 4.5 set to ‘medium’ was able to match Claude Sonnet 4.5 scores on SWE-bench Verified while using 76% fewer output tokens.
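To make the 76% figure concrete, here is a minimal arithmetic sketch. Only the 76% reduction comes from Anthropic's reported tests; the 100,000-token workload and the price ratio are hypothetical placeholders, and the function names are illustrative rather than part of any Claude API.

```python
# Figure from the article: Opus 4.5 at 'medium' effort matched Sonnet 4.5
# on SWE-bench Verified while using 76% fewer output tokens.
OUTPUT_TOKEN_REDUCTION = 0.76


def reduced_output_tokens(baseline_tokens: int,
                          reduction: float = OUTPUT_TOKEN_REDUCTION) -> int:
    """Output tokens left after cutting `reduction` (a fraction) from a baseline."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


def relative_output_cost(price_ratio: float,
                         reduction: float = OUTPUT_TOKEN_REDUCTION) -> float:
    """Output cost relative to the baseline model.

    `price_ratio` is (new model's price per output token) divided by
    (baseline model's price per output token). It is an assumption the
    caller supplies; no real prices are encoded here.
    """
    return (1.0 - reduction) * price_ratio


# Hypothetical workload: the baseline model emits 100,000 output tokens.
baseline_tokens = 100_000
tokens_at_medium = reduced_output_tokens(baseline_tokens)
print(f"Output tokens at 'medium' effort: {tokens_at_medium}")

# Hypothetical price ratios, for illustration only.
for ratio in (1.0, 2.0):
    share = relative_output_cost(ratio)
    print(f"Price ratio {ratio}: output cost is {share:.0%} of baseline")
```

Fewer output tokens only translate into lower spend if the per-token price gap between the two models is smaller than the token saving, which is why the price ratio is left as an input.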
A big leap forward for Anthropic
Aside from its coding capabilities, Anthropic underlined the overall improvement Opus 4.5 brings to various enterprise tasks.
For example, the model is capable of complex information retrieval, agentic tool use, and deep analysis, as well as Excel automation.
Across agentic tool use benchmarks, Opus 4.5 was found to consistently outclass rival models.
In early testing with Excel automation, Anthropic said its customers measured 20% accuracy improvements and 15% efficiency gains.
Anthropic emphasized these tangible improvements as a sign that the Claude model family has become a strong choice for various enterprise tasks, in addition to its code-generation pedigree.
With the launch of Opus 4.5, Anthropic sees the three models in the Claude family fulfilling distinct roles in the development lifecycle:
- Opus 4.5 is the go-to model for core agentic tasks and production code, with a focus on maximum sophistication and accuracy.
- Sonnet 4.5 is the model of choice for agents at scale, particularly customer-facing agents, as well as low-latency code generation for iterative development.
- Haiku 4.5 is for businesses seeking access to a free tier of Claude, as well as for sub-agents.
Anthropic defines sub-agents as agents with specific, pre-defined tasks that don’t necessarily require its frontier model to accomplish.
Expanding on its computer use capabilities, Opus 4.5 will become available via a new Chrome extension, Claude for Chrome.
This will allow Max subscribers to let Claude take various actions within their browser.
“Claude Opus 4.5 represents a breakthrough in self-improving AI agents,” said Yusuke Kaji, GM of AI for Business at Rakuten.
“For automation of office tasks, our agents were able to autonomously refine their own capabilities—achieving peak performance in 4 iterations while other models couldn’t match that quality after 10.
“They also demonstrated the ability to learn from experience across technical tasks, storing insights from past work and applying them to new challenges like SRE operations.”
New resilience to prompt injection
In addition to its benchmark improvements, Opus 4.5 was trained to be as trustworthy as possible and to defend against common prompt injection attacks launched against reasoning models.
When simulated attackers used 100 “very strong” prompt injection attacks, they saw a success rate of 63% against Opus 4.5 Thinking, compared to 87.8% against GPT-5.1 Thinking and 92% against Gemini 3 Pro Thinking.
When only a single attack was used, 4.7% of attempts succeeded against Opus 4.5, versus 12.6% against GPT-5.1 Thinking and 12.5% against Gemini 3 Pro.
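As a rough way to compare these figures, the sketch below computes the gap between Opus 4.5 and each rival in percentage points and as a relative reduction. The success rates are the ones quoted above; the helper name and the choice of metrics are illustrative assumptions, not part of Anthropic's published methodology.

```python
def compare_rates(ours: float, theirs: float) -> tuple[float, float]:
    """Return (percentage-point gap, relative reduction) for two success rates in percent."""
    gap_points = theirs - ours
    relative_reduction = gap_points / theirs
    return gap_points, relative_reduction


# Attack success rates in percent, as quoted in the article.
OPUS_100_ATTACKS = 63.0
OPUS_SINGLE_ATTACK = 4.7
RIVALS = {
    "GPT-5.1 Thinking": {"100 attacks": 87.8, "single attack": 12.6},
    "Gemini 3 Pro": {"100 attacks": 92.0, "single attack": 12.5},
}

for rival, rates in RIVALS.items():
    for setting, opus_rate in (("100 attacks", OPUS_100_ATTACKS),
                               ("single attack", OPUS_SINGLE_ATTACK)):
        points, relative = compare_rates(opus_rate, rates[setting])
        print(f"{setting} vs {rival}: {points:.1f} points lower, "
              f"{relative:.0%} relative reduction")
```

The two metrics answer different questions: the percentage-point gap shows how many more attacks get through in absolute terms, while the relative reduction shows how much of a rival's exposure is removed.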
Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.
In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism. You can contact Rory at rory.bathgate@futurenet.com or on LinkedIn.