AI’s use as a hacking tool has been overhyped


The offensive potential of popular large language models (LLMs) has been put to the test in a new study that found GPT-4 was the only model capable of writing viable exploits for a range of CVEs.

The paper from researchers at University of Illinois Urbana-Champaign tested a series of popular LLMs including OpenAI’s GPT-3.5 and GPT-4, as well as leading open-source agents from Mistral AI, Hugging Face, and Meta.

The agents were given a list of 15 vulnerabilities, ranging from medium to critical severity, to test how successfully the LLMs could autonomously write exploit code for CVEs.

The researchers tailored a specific prompt to yield the best results from the models that encouraged the agent not to give up and be as creative as possible with its solution.

During the test, the agents were given access to web browsing elements, a terminal, search results, file creation and editing, and a code interpreter.
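The paper does not publish its agent scaffolding, but the setup described above (an LLM steering a fixed set of tools) is commonly implemented as a dispatch loop that routes each model-requested action to a registered tool. The sketch below is purely illustrative, with stub functions standing in for the real terminal and file tools; none of the names come from the study itself.

```python
# Hypothetical illustration only, not the researchers' code: a minimal
# tool-dispatch loop of the kind used to give an LLM agent access to a
# terminal, file editing, and similar capabilities.

def run_terminal(cmd: str) -> str:
    """Stub standing in for a real sandboxed terminal tool."""
    return f"ran: {cmd}"

def read_file(path: str) -> str:
    """Stub standing in for a real file-access tool."""
    return f"contents of {path}"

# The agent's available actions are a fixed registry of named tools.
TOOLS = {
    "terminal": run_terminal,
    "file": read_file,
}

def dispatch(tool_name: str, argument: str) -> str:
    """Route one model-requested action to the matching tool."""
    tool = TOOLS.get(tool_name)
    if tool is None:
        # Unknown actions are reported back to the model rather than crashing.
        return f"unknown tool: {tool_name}"
    return tool(argument)
```

In a real agent, the model's output would be parsed into `(tool_name, argument)` pairs each turn, and the tool's result fed back into the context window until the task succeeds or the agent gives up.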

The investigation found GPT-4 was the only model capable of successfully writing exploits for the one-day vulnerabilities, achieving an 86.7% success rate.

The authors noted they did not have access to GPT-4’s commercial rivals such as Anthropic’s Claude 3 or Google’s Gemini 1.5 Pro, and so were not able to compare their performance to that of OpenAI’s flagship GPT-4.

The researchers argued the results demonstrate the “possibility of an emergent capability” in LLMs to exploit one-day vulnerabilities, but also that finding the vulnerability itself is a more difficult task than exploiting it.

GPT-4 was highly capable when provided with a specific vulnerability to exploit according to the study. With more features including better planning, larger response sizes, and the use of subagents, it could become even more capable, the researchers said.

In fact, when given an Astrophy RCE exploit that was published after GPT-4’s knowledge cutoff date, the agent was still able to write code that successfully exploited the vulnerability, despite its absence from the model's training dataset.

Removing CVE descriptions significantly hamstrings GPT-4’s blackhat capabilities

While GPT-4’s capacity for malicious use by hackers may seem concerning, the offensive potential of LLMs remains limited for the moment, according to the research: even GPT-4 needed full access to the CVE description before it could create a viable exploit. Without this, it was only able to muster a 7% success rate.

This weakness was further underlined when the study found that although GPT-4 was able to identify the correct vulnerability 33% of the time, its ability to exploit the flaw without further information was limited: of the vulnerabilities it successfully identified, GPT-4 was able to exploit only one.

In addition, the researchers tested how many actions the agent took when operating with and without the CVE description, noting that the average number of actions differed by only 14%, which the authors attributed to the length of the model’s context window.

Speaking to ITPro, Yuval Wollman, president at managed detection and response firm CyberProof, said that despite growing interest from cyber criminals in the offensive capabilities of AI chatbots, their efficacy remains limited at this time.



“The rise, by hundreds of percentage points, in discussions of ChatGPT on the dark web shows that something is going on, but whether it's being translated into more effective attacks? Not yet.”

Wollman said the offensive potential of AI systems is well established, citing previous simulations run on the AI-powered BlackMamba malware, but argued these tools are not yet mature enough to be adopted more widely by threat actors.

Ultimately, Wollman believes AI will have a significant impact on the ongoing arms race between threat actors and security professionals, but says it is too early to predict how that will play out.

“The big question would be how the GenAI revolution and the new capabilities and engines that are now being discussed on the dark web would affect this arms race. I think it's too soon to answer that question.”

Solomon Klappholz
Staff Writer

Solomon Klappholz is a Staff Writer at ITPro. He has experience writing about the technologies that facilitate industrial manufacturing which led to him developing a particular interest in IT regulation, industrial infrastructure applications, and machine learning.