Generative AI could play an important role for organisations looking to prevent cyber attacks such as phishing in the future.
Large language models (LLMs) used by generative AIs such as ChatGPT and Bard could prove effective at learning the language styles of an organisation's staff and be deployed to detect unusual activity coming from their accounts, such as the text in an email.
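The detection idea here can be illustrated with a toy stylometry check: score how "surprising" a new message is relative to an employee's past writing, and flag high scores. The sketch below uses a character-bigram model with made-up sample emails purely as a stand-in for the fine-tuned LLM a real system would use; it is a minimal illustration of the perplexity-style anomaly scoring, not a production detector.

```python
import math
from collections import Counter

def train_bigram_model(corpus):
    """Count character bigrams in a user's past emails (a stand-in for a
    fine-tuned LLM; a real system would use per-token LLM likelihoods)."""
    bigrams, unigrams = Counter(), Counter()
    for text in corpus:
        for a, b in zip(text, text[1:]):
            bigrams[(a, b)] += 1
            unigrams[a] += 1
    return bigrams, unigrams

def perplexity(text, model, alpha=1.0, vocab=128):
    """Per-character perplexity with add-alpha smoothing: higher means
    the text looks less like the training author's style."""
    bigrams, unigrams = model
    log_prob, n = 0.0, 0
    for a, b in zip(text, text[1:]):
        p = (bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab)
        log_prob += math.log(p)
        n += 1
    return math.exp(-log_prob / max(n, 1))

# Hypothetical emails in an employee's usual style, then a suspicious one.
history = ["hi team, quick update on the quarterly numbers",
           "hi all, quick reminder about tomorrow's standup"]
model = train_bigram_model(history)

usual = perplexity("hi team, quick update on the numbers", model)
phish = perplexity("URGENT!!! VERIFY YOUR ACCOUNT NOW $$$", model)
assert phish > usual  # the off-style message scores as more surprising
```

A deployed system would compare the score against a tuned threshold rather than another message, but the principle is the same: text that departs sharply from an account's learned style warrants a closer look.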
Kunal Anand, CTO at cyber security firm Imperva, told ITPro that the “cat and mouse game” of cyber security would be enhanced by AI, but that businesses will have to carefully consider what they want used as training data.
Cyber security systems could become more effective if they embedded LLMs fed with data offering company-by-company context, although such models remain largely in the hands of hyperscalers.
Citing the anti-phishing potential for LLMs in security, Anand said the technology could prove more effective than existing automated security systems due to its capacity for content analysis.
If the technology were ever deployed at scale in the security industry, it's likely that it would augment existing systems, either as standalone products or as features added to unified solutions.
“I think there's going to be interesting use cases where we will use AI, but it will be in conjunction with a signature-based system, it will be in conjunction with a logic-based system.
"Whether that's a positive security model or a negative security model, I think this is just going to be another layer on top of those things.”
Anand noted that company-specific LLMs could also be a way for large firms to avoid the unwanted collection of valuable data such as source code during the course of training the AI.
Companies such as Google and OpenAI that produce these AI systems aren't currently forthcoming about how their tools collect or store the data fed into them, raising questions about safe use in the enterprise.
The potential privacy issues involved in using the tools manifested just last week, when ChatGPT was found to expose parts of users' chat histories to other users.
Citing a recent conversation with employees at an unnamed large company, which had been using GPT to generate client code for internal APIs and microservices, Anand suggested that organisations are already putting too much sensitive data at risk.
“I said, ‘okay, so let me get this straight. You're using some third-party solution, and you have no idea how they’re storing this data, and you're asking it to build you a proprietary application internally using your proprietary schema, that represents your proprietary APIs? You can see the problem, right?’
"And they said 'yeah, I don't think we should do that anymore', and I replied 'yeah, I don't think you should do that anymore either'."
“I absolutely believe that from an enterprise perspective, companies will, let's say from data security and data governance perspectives, probably urge that they bring these generative AI capabilities in-house," Anand said.
"That way they can use their proprietary data and mix it with it.”
Generative AI in malware development
On the other side of the threat landscape, there are already concerns that generative AI can be used to dramatically improve the complexity of malware developed by threat actors.
In January, threat researchers at CyberArk Labs developed polymorphic malware using ChatGPT, a demonstration of the potential threat that generative AI poses to traditional security countermeasures.
Recent calls for leaked AI models to be stored on Bitcoin could exacerbate this potential misuse of LLMs, offering a route through which threat actors could anonymously download full model weights and run them on home systems. Anand acknowledged that threat actors are already using GPT models to produce malicious code.
“People are developing novel attack payloads using these tools, and they're asking very open-ended questions of GPT to go in and generate a unique payload that is, you know, something that embeds cross-site scripting or SQL injection, some OWASP top-ten issue, and those typically will get sussed out by firewalls in general.”
Generative AI could be used to improve a technique known as ‘fuzzing’, which involves developing an automated script to flood a system with randomised inputs to expose potential vulnerabilities.
Fuzzing tools have been used to expose flaws in popular software such as Word and Acrobat, and generative AI could improve the accuracy with which fuzzing software can iterate on attack results to discover flaws.
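Stripped of the AI layer, the basic fuzzing loop is simple to sketch. The snippet below is a minimal illustration only: the hypothetical `naive_parser` stands in for the software under test, and a real fuzzer (let alone an AI-guided one) would target actual binaries and use crash feedback to steer input generation rather than picking inputs purely at random.

```python
import random
import string

def naive_parser(data: str) -> int:
    """A toy parser standing in for real software under test; it has a
    deliberate bug when a comma-separated token contains ';'."""
    total = 0
    for token in data.split(","):
        if token.strip().isdigit():
            total += int(token)
        elif ";" in token:
            total += int(token)  # bug: ';' is not numeric -> ValueError
    return total

def fuzz(target, trials=10_000, seed=7):
    """Flood the target with randomised inputs, mirroring the technique
    described above, and report the first input that causes a crash."""
    rng = random.Random(seed)
    alphabet = string.digits + ",;- "
    for _ in range(trials):
        candidate = "".join(
            rng.choice(alphabet) for _ in range(rng.randint(1, 12)))
        try:
            target(candidate)
        except Exception as exc:
            return candidate, exc  # crashing input found
    return None, None

crasher, error = fuzz(naive_parser)
assert crasher is not None and ";" in crasher
```

The improvement Anand describes would replace the blind `rng.choice` step with a model that mutates previously blocked or crashing inputs, converging on a target's weak points far faster than uniform randomness.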
“If I try to launch an attack and you block it, I can then use that signal to train my AI, note that that was not a valid payload and try again,” Anand noted.
"And then I can keep mutating over and over and over again, until I find the boundary conditions in your language model.”
Despite the threat of code generation, the use of LLMs to create malicious text is the bigger issue at present.
This is due in part to the complicated nature of programming: code must pass validity checks before it can run, in a way that prose need not.
Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.