Cloudflare says AI companies have been “scraping content without limits” – now it’s letting website owners block crawlers and force them to pay

The move by Cloudflare looks to improve transparency and support publishers

Cloudflare has announced new capabilities aimed at blocking AI crawlers from accessing content without permission or compensation.

From today (1st July), the web infrastructure firm will by default allow website owners to choose whether they want AI crawlers to access their content.

Meanwhile, the company's “pay-per-crawl” feature, which is currently in private preview for select customers, will allow publishers to set prices that bots are forced to pay before scraping content.

The move will also see AI companies required to “clearly state their purpose” and disclose whether crawlers are used for training, inference, or search purposes.

This, Cloudflare said, will “help website owners decide which crawlers to allow”.

“If the internet is going to survive the age of AI, we need to give publishers the control they deserve and build a new economic model that works for everyone - creators, consumers, tomorrow’s AI founders, and the future of the web itself,” said Matthew Prince, co-founder and CEO of Cloudflare.

“Original content is what makes the Internet one of the greatest inventions in the last century, and it's essential that creators continue making it.”

Prince noted that AI crawlers have been “scraping content without limits” and the company aims to “put the power back in the hands of creators while still helping AI companies innovate.”

“This is about safeguarding the future of a free and vibrant Internet with a new model that works for everyone,” he added.

How Cloudflare plans to tackle AI crawlers

Cloudflare has made previous moves to tackle the issue of AI crawlers, having introduced the option to block AI crawlers in September 2024. The company said more than one million customers have chosen this option since launch.

This latest attempt is “taking the next step” to enforce a permission-based model, the company noted, with AI companies now required to obtain “explicit permission” from a website before scraping content.

Owners of every new domain will now be asked upon sign-up whether they want to allow AI crawlers, the company added.

“This significant shift means that every new domain starts with the default of control and eliminates the need for webpage owners to manually configure their settings to opt out,” Cloudflare said in a statement.

“Customers can easily check their settings and enable crawling at any time if they want their content to be freely accessed.”

Publishers have welcomed the move

The move by Cloudflare comes amidst long-running disputes between media organizations and AI companies over the use of crawlers to scrape data from the web for training purposes.

A host of high-profile lawsuits contesting the practice have been filed against major AI providers, including OpenAI.

An array of publishers have welcomed the move, including Condé Nast, The Atlantic, the Associated Press, and ADWEEK.

Roger Lynch, CEO of Condé Nast, described the new approach as a “game-changer for publishers” that will “set a new standard for how content is respected online”.

“When AI companies can no longer take anything they want for free, it opens the door to sustainable innovation built on permission and partnership,” Lynch said.

“This is a critical step toward creating a fair value exchange on the Internet that protects creators, supports quality journalism and holds AI companies accountable.”

Steve Huffman, co-founder and CEO of Reddit, echoed Lynch’s comments. The social media platform recently filed a lawsuit against AI giant Anthropic, claiming the company has illegally scraped user comments to train its Claude AI model.

“AI companies, search engines, researchers, and anyone else crawling sites have to be who they say they are. And any platform on the web should have a say in who is taking their content for what,” Huffman said.

“The whole ecosystem of creators, platforms, web users and crawlers will be better when crawling is more transparent and controlled, and Cloudflare’s efforts are a step in the right direction for everyone.”

Ross Kelly
News and Analysis Editor
