AI is getting better at security – and it's doing it faster than expected
UK AISI warns that AI models are already exceeding existing benchmarks for testing
AI models are getting better at handling more complex security tasks, doubling results in one benchmark in the last few months – and that's before the arrival of security-focused models, notably Anthropic's Claude Mythos and OpenAI's GPT-5.5.
That's according to the UK's AI Security Institute (AISI), which tracks the potential impact of AI on the security industry and on efforts to protect organisations. The institute found that newer models had doubled the length of cyber tasks they could complete every 4.7 months – much faster than expected.
"In February 2026, we internally estimated that the length of cyber tasks AI models could complete had doubled every 4.7 months since late 2024 – already an acceleration from our November 2025 estimate of 8 months," the organisation said in a blog post. "Since then, AISI reported on two new models, Claude Mythos Preview and GPT-5.5, which substantially exceeded both doubling rate trends."
AISI added: "It is unclear whether this represents a new, faster trend."
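To see what those doubling rates imply, consider a back-of-the-envelope sketch of exponential growth in task length. This is purely illustrative, not AISI's methodology, and the one-hour starting task length is a hypothetical value:

```python
# Back-of-the-envelope: how completable task length scales when it doubles
# at a fixed interval. The doubling periods (8 and 4.7 months) come from the
# article; the starting length of 1 hour is a made-up illustration.

def task_length_after(months: float, doubling_months: float,
                      start_hours: float = 1.0) -> float:
    """Task length (hours) after `months`, doubling every `doubling_months`."""
    return start_hours * 2 ** (months / doubling_months)

# Over one year, the earlier 8-month doubling estimate vs the 4.7-month one:
slow = task_length_after(12, 8.0)   # roughly 2.8 hours
fast = task_length_after(12, 4.7)   # roughly 5.9 hours
```

The gap compounds: the difference between the two estimates is small over a few months but widens sharply over multi-year horizons.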
That follows the release of Claude Mythos, which sparked concerns that companies wouldn't be able to keep up with AI security, and of GPT-5.5 Cyber last week. OpenAI released the security-focused model in a limited preview, with access restricted to security professionals, amid fears that generative AI is accelerating a security arms race. Indeed, Forescout VP of security intelligence Rik Ferguson said last week that AI tools are now "a standard part of the attacker toolkit."
How the AISI tests
These results are based on a time-horizon benchmark, which tracks the success rate of AI models on tasks of different lengths, with length defined by how long a human expert would take to complete the same task. One set of tests, for example, includes reverse engineering and web exploits in self-contained setups. AISI considers a model capable of tasks of a certain length if it succeeds at least 80% of the time.
AISI admits the time horizon benchmark is imperfect. "They are inexact predictors of performance; AI struggles with some tasks humans do quickly, and easily completes others that humans find hard," the blog post noted. "However, we use this type of benchmark because it offers a measure of AI autonomy from which we can draw trends."
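The threshold rule described above can be sketched in a few lines. This is a minimal illustration of how a time-horizon figure might be derived from per-task results; the data is invented, and AISI's actual methodology (including any curve fitting across task lengths) is not detailed in the post:

```python
# Illustrative time-horizon calculation: tasks are labelled with the time a
# human expert needs, and the horizon is the longest task length at which the
# model still succeeds at least 80% of the time. All results are made up.

from collections import defaultdict

# (human_minutes, model_succeeded) pairs - hypothetical benchmark results
results = [
    (15, True), (15, True), (15, True), (15, True), (15, False),
    (60, True), (60, True), (60, True), (60, True), (60, True),
    (240, True), (240, True), (240, False), (240, False), (240, False),
]

def time_horizon(results, threshold=0.8):
    """Longest task length (minutes) where the success rate meets `threshold`."""
    by_length = defaultdict(list)
    for minutes, ok in results:
        by_length[minutes].append(ok)
    horizon = 0
    for minutes in sorted(by_length):
        outcomes = by_length[minutes]
        if sum(outcomes) / len(outcomes) >= threshold:
            horizon = minutes
    return horizon

print(time_horizon(results))  # 60: the model passes 80% at 15 and 60 minutes
```

With these invented numbers, the model clears the 80% bar on 15- and 60-minute tasks but falls to 40% on 4-hour tasks, so its time horizon is one hour.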
Plus, the tests cover only some of the capabilities that would be needed to run a real-world attack. AISI also limits the models to 2.5m tokens to maintain comparability across results.
AISI said that the 2.5m token cap constrains the models' results: without it, "success rates are so high that time horizons become impossible to calculate." At the same time, the organisation's tests are now too short to reveal the point at which model reliability would start to fail on longer tasks – the longest task takes 12 hours.
"No single benchmark result should be read as a precise measure of AI capability," the post noted, adding: "Regardless, the direction of change and rapid growth have been consistent across the models, methodological choices, and independent data we examined."
New evaluation methods were in development, the AISI added.
What this means for security
The AISI said it was unclear how AI's pace of progress would continue, or how the technology's capabilities would work against real-world systems. But the agency said it was clear that AI was bringing opportunities and risks.
"The time to invest in strong security baselines is now," the AISI post warned. "Frontier AI can strengthen attackers as well as defenders, and there is a critical window to build resilience."
That was echoed by Palo Alto Networks this week, with CTO Lee Klarich warning that AI cyberattacks would become the new normal in the next few months. "This impending vulnerability deluge demands urgency," he wrote.
Freelance journalist Nicole Kobie first started writing for ITPro in 2007, with bylines in New Scientist, Wired, PC Pro and many more.
Nicole is the author of a book about the history of technology, The Long History of the Future.
