OpenAI says prompt injection attacks are a serious threat for AI browsers – and it’s a problem that’s ‘unlikely to ever be fully solved'
OpenAI details efforts to protect ChatGPT Atlas against prompt injection attacks
OpenAI has updated its browser to boost protection against prompt injection attacks, but it warned the risk may never fully disappear.
Released in October, OpenAI's ChatGPT Atlas browser includes agent mode, which looks at webpages to click its way through transactions, forms and other online tasks.
But OpenAI noted that as a browser agent can do more, it also becomes more at risk to "adversarial attacks" – in particular prompt injections, which are sneaking malicious instructions into the agent to drive its behavior.
"Prompt injection is one of the most significant risks we actively defend against to help ensure ChatGPT Atlas can operate securely on your behalf," OpenAI said in a blog post.
Indeed, days after the release of OpenAI's browser, security researchers spotted several serious flaws, including a prompt-injection technique – no wonder then that analysts at Gartner have warned companies to ban AI browsers for fear of security risks.
OpenAI said it recently updated ChatGPT Atlas's agent security safeguards and gave it a new model that had been "adversarially trained", as well developing a "rapid response loop" to find flaws and address them.
That was sparked by red teaming, in which an internal team acts like threat actors to test the system for flaws or weaknesses. In this instance, what they found suggests prompt injection is a "long-term AI security challenge".
Sign up today and you will receive a free copy of our Future Focus 2026 report - the leading resource for IT decision-maker insight on priorities and investment areas in AI, security and more.
"Prompt injection, much like scams and social engineering on the web, is unlikely to ever be fully 'solved'," the company said.
"But we’re optimistic that a proactive, highly responsive rapid response loop can continue to materially reduce real-world risk over time," OpenAI added.
"By combining automated attack discovery with adversarial training and system-level safeguards, we can identify new attack patterns earlier, close gaps faster, and continuously raise the cost of exploitation."
New challenge for AI browsers
Prompt injection is when attackers get between an agent's prompt box and the AI model, changing instructions to create malicious results. It's a new problem for browsers that boast AI features – and there’s a growing array already.
The main issue here is that since agents can take many of the same actions as a user, OpenAI said the potential impact of a successful attack could be “just as broad”.
As an example, OpenAI said an attacker could send a malicious email to trick an agent to ignore the user's actual request in favour of forwarding sensitive documents.
The user gives access to email for a legitimate task, such as summarizing messages, but if the agent also follows the injected instructions, sensitive data could leak.
Fighting back
OpenAI has made previous efforts to protect against such attacks, but is now adding new techniques to help avoid prompt injections.
First, the company has built an AI-powered hacker to use as an automated red teaming tool to proactively hunt out prompt injection attacks – even complicated ones taking hundreds of steps.
"We trained this attacker end-to-end with reinforcement learning, so it learns from its own successes and failures to improve its red teaming skills," the company said.
Beyond that, OpenAI has developed what it calls a "rapid response loop". When that automated red team spots a potential injection technique, that's fed back into the AI via adversarial training.
"We continuously train updated agent models against our best automated attacker—prioritizing the attacks where the target agents currently fail," the company added.
"The goal is to teach agents to ignore adversarial instructions and stay aligned with the user’s intent, improving resistance to newly discovered prompt-injection strategies."
Similarly, when using the agent in the ChatGPT Atlas browser, OpenAI advised using "logged out" mode where possible, signing in only when necessary to complete a task, and told users to review all confirmation requests carefully.
When it comes to prompts, be specific rather than broad: saying "review my emails and take whatever action is needed" gives space for threat actors to meddle.
FOLLOW US ON SOCIAL MEDIA
Make sure to follow ITPro on Google News to keep tabs on all our latest news, analysis, and reviews.
You can also follow ITPro on LinkedIn, X, Facebook, and BlueSky.
Freelance journalist Nicole Kobie first started writing for ITPro in 2007, with bylines in New Scientist, Wired, PC Pro and many more.
Nicole the author of a book about the history of technology, The Long History of the Future.
-
IT leaders are being stung by "unexpected" AI costsNews The growing costs associated with AI are hitting organizations large and small
-
'Botsitting' is destroying productivity as workers spend nearly a full day each week making AI 'usable'News While workers are reporting productivity improvements, ‘botsitting’ means these are often negated
-
'Most enterprises are still unprepared to operationalize it': IT leaders are bullish on agents, but keeping falling at the final hurdle – here's whyNews Forrester points to challenges scaling agentic AI, saying companies start rolling out the tech before they're ready to scale
-
‘Chat is dead’: OpenAI plots ChatGPT ‘super app’ overhaul ahead of public listing – with agents and coding tools the new focusNews The company looks set to spruce up ChatGPT with a particular focus on agents to drive subscriptions
-
Uber’s eye-watering AI bill shows enterprises are ‘still measuring AI success through consumption rather than outcomes’ – and it's warping our perception of ROI and productivityNews ‘Tokenmaxxing’ might pad the stats, but it’s a trend that could come back to haunt enterprises
-
Destination AI: Una partnership affidabile per superare gli ostacoli e gettare le basi per la crescita futuraSponsored Con l'accelerazione dell'adozione dell''AI aziendale, i partner IT devono spostare la loro attenzione dall'hype tecnologico ai risultati aziendali tangibili, sfruttando ecosistemi strutturati per promuovere la monetizzazione a lungo termine
-
Le programme Destination AI : un partenariat de confiance pour surmonter les obstacles et poser les bases de votre croissance futureSponsored Alors que l'adoption de l'IA en entreprise s'accélère, les partenaires informatiques doivent réorienter leurs priorités : délaisser le battage technologique au profit de résultats commerciaux concrets, en exploitant des écosystèmes structurés pour assurer une monétisation à long terme

