Cloudflare's new policy pushes AI companies to pay for publishers' content
Cloudflare has issued the AI industry a new deadline to separate the web crawlers used for traditional search, like Google Search, from those used for AI agents and training. Starting on September 15, 2026, Cloudflare's default settings will block "mixed-use" crawlers from any pages that host ads, the company announced on Wednesday.
Key Takeaways
- Crawlers that blend search, agent use, and training will be blocked from pages that host ads by default, unless the site owner adjusts the settings.
- The new defaults will apply to new Cloudflare customers, new sites set up by existing customers, and all existing free customers, the company says.
- Cloudflare calls out the "world's largest search engine" as having access to about "2x more information" than other AI companies because the search giant makes it difficult for customers to remain discoverable without their content being used for AI.
- Google has pushed back against this generalization in the past, noting that it provides a bot called Google-Extended that lets site owners opt out of having their content used for training and AI products and services like Gemini Apps and Vertex API.
- That shift was not expected to occur until next year.
- "Cloudflare's new tools and partnerships give website owners increased visibility and commercial opportunities and benefit AI companies that have bots with clear and transparent intent."
- The latter is now also evolving into "Pay Per Use," the company said, which will allow publishers to charge AI companies when their content creates value, not just when it's fetched.
- The change could also help conserve bandwidth for publishers and compute resources for AI model providers, as Cloudflare's data suggests that over 50% of crawl traffic from AI crawlers is spent re-fetching unchanged pages.
- Other AI companies can customize this model for how they work, Cloudflare says.
That means crawlers that blend search, agent use, and training will be blocked from crawling these sites by default, unless the site owner adjusts the settings. The new defaults will apply to new Cloudflare customers, new sites set up by existing customers, and all existing free customers, the company says. The move could affect how AI model providers access web content for training and for powering their agentic services.
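As a rough illustration, the default described above can be modeled as a small decision function. This is a sketch under stated assumptions, not Cloudflare's implementation: the names `Purpose`, `Crawler`, `is_mixed_use`, and `blocked_by_default` are hypothetical, and "mixed-use" is assumed here to mean a crawler that declares search plus at least one AI purpose (training or agents).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Purpose(Enum):
    SEARCH = "search"
    TRAINING = "training"
    AGENT = "agent"


@dataclass(frozen=True)
class Crawler:
    # Hypothetical model of a crawler and the purposes it declares.
    name: str
    purposes: frozenset


def is_mixed_use(crawler: Crawler) -> bool:
    # Assumption: "mixed-use" means search combined with an AI purpose.
    ai_purposes = {Purpose.TRAINING, Purpose.AGENT}
    return Purpose.SEARCH in crawler.purposes and bool(crawler.purposes & ai_purposes)


def blocked_by_default(
    crawler: Crawler,
    page_has_ads: bool,
    owner_allows: Optional[bool] = None,
) -> bool:
    # An explicit site-owner setting always wins over the default.
    if owner_allows is not None:
        return not owner_allows
    # Default: block mixed-use crawlers only on pages that host ads.
    return page_has_ads and is_mixed_use(crawler)


# Illustrative crawlers; the names are made up.
search_only = Crawler("search-bot", frozenset({Purpose.SEARCH}))
mixed = Crawler("mixed-bot", frozenset({Purpose.SEARCH, Purpose.TRAINING}))

print(blocked_by_default(search_only, page_has_ads=True))
print(blocked_by_default(mixed, page_has_ads=True))
print(blocked_by_default(mixed, page_has_ads=True, owner_allows=True))
```

Under these assumptions, a crawler that separates its search traffic from its AI traffic is unaffected by the default, which is the incentive the policy is meant to create.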
Cloudflare points out that most website owners want their content to be discoverable via search, and often through AI services as well, but they want protections against having their intellectual property given away for free. Cloudflare specifically calls out the "world's largest search engine" (clearly a Google reference!) as having access to about "2x more information" than other AI companies because the search giant makes it difficult for customers to remain discoverable without their content being used for AI.
Google has pushed back against this generalization in the past, noting that it provides a bot called Google-Extended that lets site owners opt out of having their content used for training and AI products and services like Gemini Apps and Vertex API. Its use doesn't impact a site's inclusion in Google Search. However, the tech giant's flagship Googlebot crawls for Search, including AI features like AI Overviews and AI Mode.
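To make the opt-out concrete, the sketch below uses Python's standard `urllib.robotparser` to evaluate a sample robots.txt that disallows the `Google-Extended` token while leaving other crawlers alone. The robots.txt contents, the `allowed` helper, and the example URL are illustrative assumptions, and Python's parser only approximates how any particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: opt out of Google-Extended, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    # Hypothetical helper: parse the rules and check one user agent against a URL.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


URL = "https://example.com/article"  # placeholder URL
print(allowed("Google-Extended", URL))  # False: the token is disallowed
print(allowed("Googlebot", URL))        # True: falls through to the * rule
```

The second result reflects the limitation described above: because Googlebot stays allowed so the site remains in Search, a rule like this does not separate Search crawling from AI features such as AI Overviews and AI Mode.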
For more details please read the original article at TechCrunch AI.