Security experts issue warning over the rise of 'gray bot' AI web scrapers
While not malicious, the bots can overwhelm web applications in a way similar to bad actors
Security firm Barracuda has called for organizations to factor AI bots that scrape data from public websites into their security strategies, labelling them not as good or bad bots, but “gray bots”.
Defining these three categories of bot, Rahul Gupta, senior principal software engineer for application security engineering at Barracuda, said: “There are good bots – such as search engine crawler bots, SEO bots, and customer service bots – and bad bots, designed for malicious or harmful online activities like breaching accounts to steal personal data or commit fraud.
“In the space between them you will find what Barracuda calls ‘gray bots.’ … Gray bots are blurring the boundaries of legitimate activity. They are not overtly malicious, but their approach can be questionable. Some are highly aggressive.”
Examples of gray bots given by Gupta include web scraper bots, automated content aggregators for news, travel offers, and so on, and generative AI scraper bots.
The activity of this third category was specifically highlighted by Gupta, with web applications receiving millions of requests from bots such as Anthropic’s ClaudeBot and TikTok’s Bytespider bot.
“ClaudeBot is the most active Gen AI gray bot in our dataset by a considerable margin,” said Gupta. “ClaudeBot’s relentless requests are likely to impact many of its targeted web applications.”
According to Barracuda's analysis, one web application received an average of 323,300 AI scraper bot requests a day over the course of 30 days.
Another received 500,000 requests in a single day. A third received approximately 408,000 requests over the course of a day, with an average request rate of 17,000 per hour.
Gupta said this level of consistency was “unexpected”.
“It is generally assumed, and often the case, that gray bot traffic comes in waves, hitting a website for a few minutes to an hour or so before falling back,” he said, although he added that “constant bombardment or unexpected, ad hoc traffic surges [both] present challenges for web applications”.
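The distinction Gupta draws between steady bombardment and traffic that arrives in waves can be illustrated with a rough sketch. The Python below is not Barracuda's method: the function name `traffic_pattern`, the use of the coefficient of variation, and the 0.5 threshold are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def traffic_pattern(hourly_requests: list[int], burst_ratio: float = 0.5) -> str:
    """Label a non-empty series of hourly request counts.

    Hypothetical heuristic: compare the spread of the hourly counts to
    their average (coefficient of variation). The 0.5 threshold is an
    arbitrary assumption, not a figure from Barracuda.
    """
    avg = mean(hourly_requests)
    if avg == 0:
        return "idle"
    variation = pstdev(hourly_requests) / avg
    return "bursty" if variation > burst_ratio else "constant"


# A day of steady traffic at the 17,000-per-hour rate cited above.
steady_day = [17_000] * 24
# A made-up day where all requests land in a single hour.
wave_day = [0] * 23 + [24_000]

print(traffic_pattern(steady_day))
print(traffic_pattern(wave_day))
```

A real deployment would likely look at shorter windows than an hour and tune the threshold against its own logs.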
This level of activity can disrupt operations and degrade the performance of web applications, Gupta said, while the bots also gather up “vast volumes of proprietary or commercial data”.
There can also be more indirect impacts, such as distorted web traffic figures that make it harder to take data-driven decisions, Barracuda claimed.
Defensive measures
There are multiple reasons why organizations may wish to protect themselves from AI web scrapers, ranging from protecting their IP and copyright to data privacy concerns, as well as performance degradation.
Those in the creative industries in particular are increasingly worried about their data being used to train generative AI models without their permission, but it’s a dilemma that affects other businesses too.
In January 2024, the UK’s Information Commissioner’s Office (ICO) said it would examine web scraping by generative AI bots as part of its investigation into the collection and processing of personal data by LLMs owned by companies like OpenAI and Anthropic.
"The impact of generative AI can be transformative for society if it's developed and deployed responsibly," said the ICO's executive director for regulatory risk, Stephen Almond, at the time.
"This call for views will help the ICO provide industry with certainty regarding its obligations and safeguard people's information rights and freedoms," he added.
For his part, Gupta recommended: “To ensure your web applications are protected against the impact of gray bots, consider implementing bot protection capable of detecting and blocking generative AI scraper bot activity.”
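As a very simple illustration of what detecting and blocking can mean in practice, the sketch below checks a request's User-Agent header for the two crawlers named in this article. It is not Barracuda's product or method. The names `GEN_AI_BOT_TOKENS`, `is_gen_ai_scraper` and `status_for_request` are hypothetical, and the approach only catches bots that identify themselves honestly, since a User-Agent header is easy to spoof.

```python
import re

# Assumption: a short, non-exhaustive list of User-Agent substrings.
# ClaudeBot and Bytespider are the two crawlers named in the article.
GEN_AI_BOT_TOKENS = ("ClaudeBot", "Bytespider")

_BOT_PATTERN = re.compile(
    "|".join(re.escape(token) for token in GEN_AI_BOT_TOKENS),
    re.IGNORECASE,
)


def is_gen_ai_scraper(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a listed bot token."""
    return bool(_BOT_PATTERN.search(user_agent or ""))


def status_for_request(user_agent: str) -> int:
    """Return HTTP 403 for self-identified Gen AI scrapers, else 200."""
    return 403 if is_gen_ai_scraper(user_agent) else 200


if __name__ == "__main__":
    print(status_for_request("Mozilla/5.0 (compatible; ClaudeBot/1.0)"))
    print(status_for_request("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

Commercial bot protection typically goes further than string matching, for example by combining rate limits and behavioral signals, which is closer to what Gupta describes.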
Jane McCallion is Managing Editor of ITPro and ChannelPro, specializing in data centers, enterprise IT infrastructure, and cybersecurity. Before becoming Managing Editor, she held the role of Deputy Editor and, prior to that, Features Editor, managing a pool of freelance and internal writers, while continuing to specialize in enterprise IT infrastructure, and business strategy.
Prior to joining ITPro, Jane was a freelance business journalist writing as both Jane McCallion and Jane Bordenave for titles such as European CEO, World Finance, and Business Excellence Magazine.