OpenAI is cracking down on AI misuse with a new bug bounty program


OpenAI is launching a public Safety Bug Bounty program focused on identifying how its AI tools could be misused.

The program will complement the firm's existing Security Bug Bounty program by covering issues that pose meaningful abuse and safety risks, even if they don’t meet the criteria for a security vulnerability.

"Our goal is to ensure our systems remain safe and secure against misuse or abuse that could lead to tangible harm," OpenAI said in a blog post.

"Through this program, we look forward to continuing to partner with safety and security researchers to help us identify and address issues that fall outside conventional security vulnerabilities but still pose real risks."

To qualify, issues must represent a design or implementation issue in an active OpenAI product that can be abused by an attacker to cause material harm, and must be addressable via a clear set of recommended steps or mitigations.

"The goal of this program is to reward for bug fixes and we cannot reward requests for general product improvements," said the firm.

Issues must be consistently reproducible, any accounts used as victims must be test accounts owned by the researcher, and vulnerability testing must not risk damage or compromise to any real-world accounts.

The firm lists various types of risk that will fall under the new program. These include:

Third-party prompt injection and data exfiltration
Browser-related risks, such as account hijacking
Manipulation of ChatGPT Agent to carry out harmful actions

To qualify, the behavior must be reproducible at least 50% of the time. Also covered is the ability of an agentic OpenAI product to perform a disallowed action on OpenAI’s website at scale, or perform some other potentially harmful action.
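As a rough illustration of the 50% reproducibility bar, a researcher could log the outcome of repeated attempts and check the success rate before submitting. The sketch below is only an assumption about how that might be tallied: the function names and the sample trial data are hypothetical, and the article does not say how OpenAI itself measures reproducibility.

```python
def reproduction_rate(outcomes):
    """Return the fraction of trials in which the behavior reproduced.

    `outcomes` is a list of booleans, one per attempt (hypothetical format).
    """
    if not outcomes:
        raise ValueError("need at least one trial")
    return sum(1 for reproduced in outcomes if reproduced) / len(outcomes)


def meets_threshold(outcomes, threshold=0.5):
    """True if the behavior reproduced at least `threshold` of the time."""
    return reproduction_rate(outcomes) >= threshold


# Hypothetical log of ten attempts: True means the behavior reproduced.
trials = [True, False, True, True, False, True, False, True, True, False]
print(f"rate={reproduction_rate(trials):.0%}, qualifies={meets_threshold(trials)}")
```

A larger number of attempts gives a more trustworthy estimate than a handful, so a borderline result is worth re-running before it is reported.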

OpenAI targets IP protection

The program also covers the exposure of OpenAI proprietary information, including model generations that return proprietary information related to reasoning, as well as vulnerabilities that expose other proprietary information.

Meanwhile, the firm said it will consider vulnerabilities in account integrity and platform integrity signals, such as bypassing anti-automation controls, manipulating account trust signals, evading account restrictions, suspensions, or bans, and similar issues.

Any issues that allow users to access features, data, or functionalities beyond authorized permissions should be reported to the Security Bug Bounty program, the company noted.

Private OpenAI bounties for other issues

While OpenAI said that jailbreaks are out of scope for this particular program, it periodically runs private bug bounty campaigns focused on certain harm types.

Examples include campaigns covering biorisk content issues in ChatGPT Agent and GPT‑5, and researchers can apply to these programs as and when they arise.

"Outside of the categories listed above, if researchers identify flaws that facilitate direct paths to user harm and actionable, discrete remediation steps, these may be considered in scope for rewards on a case-by-case basis," OpenAI said.

"General content-policy bypasses without demonstrable safety or abuse impact are out of scope for this program. For example, 'jailbreaks' that result in the model using rude language or returning information that is easily findable via search engines are out of scope."

Submissions will be triaged by OpenAI’s Safety and Security Bug Bounty teams, and could be handled by either. The program is hosted by Bugcrowd.


Emma Woollacott is a freelance journalist writing for publications including the BBC, Private Eye, Forbes, Raconteur and specialist technology titles.
