diff --git a/public/__redirects b/public/__redirects index dfdbcbb3ba5..c9ce3f36906 100644 --- a/public/__redirects +++ b/public/__redirects @@ -439,8 +439,9 @@ /bots/about/plans/bm-subscription/ /bots/plans/bm-subscription/ 301 /support/firewall/tools/cloudflare-bot-products-faqs/ /bots/troubleshooting/ 301 /support/other-languages/deutsch/cloudflare-bot/ /bots/troubleshooting/ 301 -/bots/reference/verified-bot-categories/ /bots/concepts/bot/verified-bots/#categories 301 -/bots/reference/verified-bot-policy/ /bots/concepts/bot/verified-bots/policy/ 301 +/bots/reference/verified-bot-categories/ /bots/concepts/bot/verified-bots/#legacy-categories 301 +/bots/reference/verified-bot-policy/ /bots/concepts/bot/verified-bots/ 301 +/bots/concepts/bot/signed-agents/ /bots/concepts/bot/verified-bots/ 301 /bots/concepts/challenge-solve-rate/ /cloudflare-challenges/reference/challenge-solve-rate/ 301 /bots/concepts/detection-ids/ /bots/additional-configurations/detection-ids/ 301 /bots/concepts/ja3-ja4-fingerprint/ /bots/additional-configurations/ja3-ja4-fingerprint/ 301 @@ -457,10 +458,11 @@ /bots/get-started/pro/ /bots/get-started/super-bot-fight-mode/ 301 /bots/additional-configurations/javascript-detections/ /cloudflare-challenges/challenge-types/javascript-detections/ 301 /bots/troubleshooting/frequently-asked-questions/ /bots/ 301 -/bots/concepts/bot/verified-bots/categories/ /bots/concepts/bot/verified-bots/#categories 301 +/bots/concepts/bot/verified-bots/categories/ /bots/concepts/bot/verified-bots/#legacy-categories 301 /bots/concepts/bot/verified-bots/ip-validation/ /bots/reference/bot-verification/ip-validation/ 301 /bots/concepts/bot/verified-bots/web-bot-auth/ /bots/reference/bot-verification/web-bot-auth/ 301 /bots/concepts/bot/verified-bots/overview/ /bots/concepts/bot/verified-bots/ 301 +/bots/concepts/bot/verified-bots/policy/ /bots/concepts/bot/verified-bots/ 301 /bots/frequently-asked-questions/ /bots/ 301 
/browser-rendering/get-started/browser-rendering-with-DO/ /browser-run/how-to/browser-run-with-do/ 301 @@ -2985,3 +2987,6 @@ # Security Insights (moved from Security Center to Security) /security-center/security-insights/* /security/security-insights/:splat 301 + +# Bots: signed agents deprecated -> verified bots +/bots/concepts/bot/signed-agents/* /bots/concepts/bot/verified-bots/ 301 diff --git a/src/assets/images/changelog/bots/ai-bot-traffic-policies.png b/src/assets/images/changelog/bots/ai-bot-traffic-policies.png new file mode 100644 index 00000000000..e29e8f9b58e Binary files /dev/null and b/src/assets/images/changelog/bots/ai-bot-traffic-policies.png differ diff --git a/src/assets/images/changelog/bots/attribution-business-insights.png b/src/assets/images/changelog/bots/attribution-business-insights.png new file mode 100644 index 00000000000..044ce5181b0 Binary files /dev/null and b/src/assets/images/changelog/bots/attribution-business-insights.png differ diff --git a/src/content/changelog/bots/2026-07-01-ai-traffic-options.mdx b/src/content/changelog/bots/2026-07-01-ai-traffic-options.mdx new file mode 100644 index 00000000000..ff30c16036a --- /dev/null +++ b/src/content/changelog/bots/2026-07-01-ai-traffic-options.mdx @@ -0,0 +1,16 @@ +--- +title: New options to manage AI traffic +description: All customers can now manage AI crawlers by behavior — Search, Agent, and Training — instead of a single Block AI bots toggle. +products: + - bots +date: 2026-07-01 +publish_future_dated_entry: true +--- + +Not all AI traffic is the same. Now, all customers — including those on the Free plan — can manage AI crawlers based on what they actually do on your site. Cloudflare groups AI traffic into three behaviors you can control independently: [Search, Agent, and Training](/bots/concepts/bot/#ai-bots). This lets you keep the automated traffic that sends readers and revenue back to you, while blocking the traffic that only takes from your content. 
+ +Each behavior maps to a real use case. **Search** covers crawlers that index your content so they can answer questions about it later, where you should expect referral traffic or other equitable compensation in return. **Agent** covers automated activity acting in real time on a person's behalf, such as chat fetch bots and browser-use agents. **Training** covers crawlers that take your content to train or fine-tune a model. For each preset you can choose to block on all pages, block only on pages that display ads, or choose not to block. + +![The Configure AI bot traffic policies screen, where Search, Agent, and Training can each be set to allow, block, or block only on pages with ads](~/assets/images/changelog/bots/ai-bot-traffic-policies.png) + +Starting **September 15, 2026**, new domains onboarding to Cloudflare receive updated defaults: Bots classified as Training or as Agent are blocked on pages that display ads, while **Search** remains allowed. On that date, multi-purpose crawlers that combine Search and Training will be affected by the new defaults to block Training. All customers can [opt out of the new defaults](https://dash.cloudflare.com/?to=/:account/:zone/security/settings) at any time before September 15. diff --git a/src/content/changelog/bots/2026-07-01-botbase-attribution-business-insights.mdx b/src/content/changelog/bots/2026-07-01-botbase-attribution-business-insights.mdx new file mode 100644 index 00000000000..23beb27af81 --- /dev/null +++ b/src/content/changelog/bots/2026-07-01-botbase-attribution-business-insights.mdx @@ -0,0 +1,16 @@ +--- +title: More visibility into bot traffic with BotBase and Attribution Business Insights +description: Enterprise Bot Management gains a searchable directory of every tracked bot and a dashboard showing crawl-to-referral ratios. 
+products: + - bots +date: 2026-07-01 +publish_future_dated_entry: true +--- + +With Content Independence Day 2026, [Enterprise Bot Management](/bots/get-started/bot-management/) customers get two new tools that make bot traffic far easier to see and reason about: [BotBase](/bots/botbase/), a searchable directory of every bot Cloudflare tracks, and [Attribution Business Insights](/bots/attribution-business-insights/), a dashboard that shows how much value each crawler sends back to your business. + +BotBase is Cloudflare's directory of all known bots and agents, available directly in the dashboard. It shows how Cloudflare classifies each bot by behavior — Search, Agent, Training, and other categories such as Transact, Data Collection, SEO, and Ads Verification — so you can understand why a given crawler is visiting you. You can search and filter the full catalogue, filter your own traffic down to a single bot to investigate its activity on your zone, and copy any bot's detection ID to target it precisely in [Security rules](/security/rules/). Every tracked bot in BotBase is also published in [Cloudflare Radar's bots and agents directory](https://radar.cloudflare.com/bots/directory). + +Attribution Business Insights is built for content owners and business decision-makers who want to know which bots help or harm their business, without reading rule syntax. The dashboard reports crawl-to-referral ratios both site-wide and per bot operator — comparing how often a company crawls your content against how many visitors it actually refers back — over the last 24 hours, 7 days, or 30 days. Each operator is labeled with Cloudflare's [updated classification](/bots/concepts/bot/verified-bots/) and an action status of Allowed, Blocked, or Partially blocked, giving stakeholders a shared, at-a-glance view of the AI traffic reaching your site. 
+ +![The Attribution Business Insights dashboard, showing bot traffic, content page requests, crawl-to-referral ratio, and a per-operator bot activity table](~/assets/images/changelog/bots/attribution-business-insights.png) diff --git a/src/content/changelog/browser-run/2026-03-10-br-crawl-endpoint.mdx b/src/content/changelog/browser-run/2026-03-10-br-crawl-endpoint.mdx index 5116c6d13bd..19cd3c577fc 100644 --- a/src/content/changelog/browser-run/2026-03-10-br-crawl-endpoint.mdx +++ b/src/content/changelog/browser-run/2026-03-10-br-crawl-endpoint.mdx @@ -8,7 +8,7 @@ date: 2026-03-10 _Edit: this post has been edited to clarify crawling behavior with respect to site guidance._ -You can now crawl an entire website with a single API call using [Browser Rendering](/browser-run/)'s new [`/crawl` endpoint](/browser-run/quick-actions/crawl-endpoint/), available in open beta. Submit a starting URL, and pages are automatically discovered, rendered in a headless browser, and returned in multiple formats, including HTML, Markdown, and structured JSON. The endpoint is a [signed-agent](https://developers.cloudflare.com/bots/concepts/bot/signed-agents/) that respects robots.txt and [AI Crawl Control](https://www.cloudflare.com/ai-crawl-control/) by default, making it easy for developers to comply with website rules, and making it less likely for crawlers to ignore web-owner guidance. This is great for training models, building RAG pipelines, and researching or monitoring content across a site. +You can now crawl an entire website with a single API call using [Browser Rendering](/browser-run/)'s new [`/crawl` endpoint](/browser-run/quick-actions/crawl-endpoint/), available in open beta. Submit a starting URL, and pages are automatically discovered, rendered in a headless browser, and returned in multiple formats, including HTML, Markdown, and structured JSON. 
The endpoint is a [verified bot (intermediary agent)](/bots/concepts/bot/verified-bots/) that respects robots.txt and [AI Crawl Control](https://www.cloudflare.com/ai-crawl-control/) by default, making it easy for developers to comply with website rules, and making it less likely for crawlers to ignore web-owner guidance. This is great for training models, building RAG pipelines, and researching or monitoring content across a site. Crawl jobs run asynchronously. You submit a URL, receive a job ID, and check back for results as pages are processed. diff --git a/src/content/dash-routes/core.json b/src/content/dash-routes/core.json index f864ddbd178..ca92b82f265 100644 --- a/src/content/dash-routes/core.json +++ b/src/content/dash-routes/core.json @@ -493,6 +493,11 @@ "deeplink": "/?to=/:account/:zone/analytics/traffic", "parent": ["Analytics & logs"] }, + { + "name": "Attribution Business Insights", + "deeplink": "/?to=/:account/:zone/analytics/attribution-business-insights", + "parent": ["Analytics & logs"] + }, { "name": "Web analytics", "deeplink": "/?to=/:account/:zone/analytics/web/overview", diff --git a/src/content/docs/ai-crawl-control/configuration/ai-crawl-control-with-bots.mdx b/src/content/docs/ai-crawl-control/configuration/ai-crawl-control-with-bots.mdx index 25135f65b7f..996d2974b33 100644 --- a/src/content/docs/ai-crawl-control/configuration/ai-crawl-control-with-bots.mdx +++ b/src/content/docs/ai-crawl-control/configuration/ai-crawl-control-with-bots.mdx @@ -25,7 +25,7 @@ C --> D[AI Crawl Control:
Pay Per Crawl] classDef highlight fill:#F6821F,color:white ``` -For more information on how Cloudflare bot solutions works with WAF custom rules, refer to [How it works](/bots/concepts/bot/#how-it-works). +For more information on how Cloudflare classifies bot traffic, refer to [AI bots](/bots/concepts/bot/#ai-bots). ## Examples diff --git a/src/content/docs/ai-crawl-control/features/manage-ai-crawlers.mdx b/src/content/docs/ai-crawl-control/features/manage-ai-crawlers.mdx index 37400acba85..0fd900e5cf2 100644 --- a/src/content/docs/ai-crawl-control/features/manage-ai-crawlers.mdx +++ b/src/content/docs/ai-crawl-control/features/manage-ai-crawlers.mdx @@ -28,7 +28,7 @@ The **Crawlers** tab displays a table of AI crawlers that are requesting access | Column | Details | | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | Crawler | The name of the AI crawler and the operator that owns it. | -| Category | The category of the AI crawler. Refer to [Verified bot categories](/bots/concepts/bot/verified-bots/#categories). | +| Category | The category of the AI crawler. Refer to [Verified bot categories](/bots/concepts/bot/verified-bots/#legacy-categories). | | Requests | The total number of allowed and unsuccessful requests, with trend chart. Unsuccessful requests may come from any rule or response error, not just the block action in AI Crawl Control. | | Robots.txt violations | The number of times the AI crawler has violated your `robots.txt` file. | | Action | The action you wish to take for the AI crawler. Refer to [Take action for each AI crawler](/ai-crawl-control/features/manage-ai-crawlers/#take-action-for-each-ai-crawler). | @@ -95,7 +95,7 @@ For each AI crawler, you can take one of three actions: allow, charge, or block. - **Implementation:** From the **Actions** column, select **Allow**. 
Note that you can still choose to [Enforce `robots.txt`](/ai-crawl-control/features/manage-ai-crawlers/#take-action-for-each-ai-crawler). -For more details on how this rule interacts with other Cloudflare settings, refer to [How it works](/bots/concepts/bot/#how-it-works). +For more details on how Cloudflare classifies AI bot traffic, refer to [AI bots](/bots/concepts/bot/#ai-bots). diff --git a/src/content/docs/ai-crawl-control/features/pay-per-crawl/use-pay-per-crawl-as-ai-owner/verify-ai-crawler.mdx b/src/content/docs/ai-crawl-control/features/pay-per-crawl/use-pay-per-crawl-as-ai-owner/verify-ai-crawler.mdx index d2dee4b934c..fef2d60b21c 100644 --- a/src/content/docs/ai-crawl-control/features/pay-per-crawl/use-pay-per-crawl-as-ai-owner/verify-ai-crawler.mdx +++ b/src/content/docs/ai-crawl-control/features/pay-per-crawl/use-pay-per-crawl-as-ai-owner/verify-ai-crawler.mdx @@ -45,7 +45,7 @@ Follow the steps found in [Web Both Auth](/bots/reference/bot-verification/web-b ## 2. Follow verified bot policy -Ensure your AI crawler follows Cloudflare's [verified bots policy](/bots/concepts/bot/verified-bots/policy/). +Ensure your AI crawler follows Cloudflare's [verified bots policy](/bots/concepts/bot/verified-bots/). ## 3. Submit verification request diff --git a/src/content/docs/ai-crawl-control/reference/redirects-for-ai-training.mdx b/src/content/docs/ai-crawl-control/reference/redirects-for-ai-training.mdx index 1122f0a6d7d..d39976186b5 100644 --- a/src/content/docs/ai-crawl-control/reference/redirects-for-ai-training.mdx +++ b/src/content/docs/ai-crawl-control/reference/redirects-for-ai-training.mdx @@ -11,7 +11,7 @@ products: import { TabItem, Tabs, GlossaryTooltip } from "~/components"; -Redirects for AI Training enforces your existing `` tags as 301 redirects for [verified AI training crawlers](/bots/concepts/bot/#verified-bots). 
When a verified bot with the [AI Crawler category](/bots/concepts/bot/verified-bots/#categories) requests a page whose canonical tag points to a different same-origin URL, Cloudflare returns a `301 Moved Permanently` to the canonical. All other visitors—browsers, search engines, [AI Assistants](/bots/concepts/bot/verified-bots/#categories)—receive the original page unchanged. +Redirects for AI Training enforces your existing `` tags as 301 redirects for [verified AI training crawlers](/bots/concepts/bot/#verified-bots). When a verified bot with the [AI Crawler category](/bots/concepts/bot/verified-bots/#legacy-categories) requests a page whose canonical tag points to a different same-origin URL, Cloudflare returns a `301 Moved Permanently` to the canonical. All other visitors—browsers, search engines, [AI Assistants](/bots/concepts/bot/verified-bots/#legacy-categories)—receive the original page unchanged. To learn more about why this feature can be useful, refer to the [announcement blog post](https://blog.cloudflare.com/ai-redirects/). @@ -190,7 +190,7 @@ Available on Pro, Business, and Enterprise plans at no additional cost. - Only HTML responses (`content-type: text/html`) from the origin are evaluated. Other content types pass through unchanged. - The canonical tag must appear within the first 256 KB of the uncompressed HTML response body. - Only same-origin canonical URLs trigger a redirect. Cross-origin canonicals are ignored. -- Only [verified bots](/bots/concepts/bot/#verified-bots) with the [AI Crawler category](/bots/concepts/bot/verified-bots/#categories) are redirected. [AI Assistants and AI Search bots](/bots/concepts/bot/verified-bots/#categories) are not affected. +- Only [verified bots](/bots/concepts/bot/#verified-bots) with the [AI Crawler category](/bots/concepts/bot/verified-bots/#legacy-categories) are redirected. [AI Assistants and AI Search bots](/bots/concepts/bot/verified-bots/#legacy-categories) are not affected. 
- Self-canonical pages (where the canonical URL matches the request URL) are not redirected. - Best-effort loop detection uses the `Referer` header. If a crawler was just redirected from the canonical URL back to the current page, the origin HTML is served instead of redirecting. This handles common two-page canonical misconfigurations (Page A canonical points to Page B, Page B canonical points to Page A). @@ -206,4 +206,4 @@ When a redirect is issued, Cloudflare logs the canonical target URL in your HTTP - [Content Signals Policy](https://contentsignals.org/) - Signal post-access content usage preferences in `robots.txt` - [Directives](/ai-crawl-control/features/track-robots-txt/) - Monitor `robots.txt` compliance and check Agent Readiness - [Single Redirects](/rules/url-forwarding/single-redirects/) - Rule-based URL redirects that execute before origin -- [Verified bots](/bots/concepts/bot/verified-bots/#categories) - Bot categories including AI Crawler, AI Assistant, and AI Search +- [Verified bots](/bots/concepts/bot/verified-bots/#legacy-categories) - Bot categories including AI Crawler, AI Assistant, and AI Search diff --git a/src/content/docs/bots/account-abuse-protection.mdx b/src/content/docs/bots/account-abuse-protection.mdx index ddd1362d953..448fb32a580 100644 --- a/src/content/docs/bots/account-abuse-protection.mdx +++ b/src/content/docs/bots/account-abuse-protection.mdx @@ -7,7 +7,7 @@ products: tags: - Account takeover sidebar: - order: 6 + order: 8 label: Account Abuse Protection badge: text: Early Access diff --git a/src/content/docs/bots/additional-configurations/block-ai-bots.mdx b/src/content/docs/bots/additional-configurations/block-ai-bots.mdx index 61260abf06b..f28de27a1fe 100644 --- a/src/content/docs/bots/additional-configurations/block-ai-bots.mdx +++ b/src/content/docs/bots/additional-configurations/block-ai-bots.mdx @@ -12,27 +12,34 @@ sidebar: label: Block AI Bots --- -import { Render, Steps, DashButton } from "~/components" +## 
Configure AI bot policies -:::note[Block AI bots availability] -The **Block AI bots** feature is only available in the new [application security dashboard](/security/). -::: +### New defaults on September 15, 2026 + +On September 15, 2026, Cloudflare will set updated defaults for new domains: bots classified as Training or as Agent will be blocked on pages that display ads, and Search will remain allowed. Mixed-purpose crawlers that combine Search and Training will also be blocked by all configurations to block AI training, including the legacy "Block AI bots" option. Before September 15, all customers can [opt out of these new defaults](https://dash.cloudflare.com/?to=/:account/:zone/security/settings). + +All Cloudflare customers can choose to block AI bots and agents based on their behavior. Cloudflare offers presets for the most common AI behaviors to give customers the option to treat different AI use cases distinctly: + +- **Search**: crawlers that collect or index your content to answer questions about it later. +- **Agent**: automated activity acting in real time on a person's behalf, such as chat fetch bots and browser-use agents. +- **Training**: crawlers taking your content to train or fine-tune a model, including mixed-purpose crawlers that are used both for Training and for Search. -You can choose to block AI bots by activating **Block AI bots**. Activating this setting will block [verified bots](/bots/concepts/bot/verified-bots/) that are classified as AI crawlers, as well as a number of unverified bots that behave similarly. +Each blocking option will block Verified bots classified with that behavior, plus additional unverified bots that fall under these classifications. -To block [AI bots](/bots/concepts/bot/#ai-bots): +Each setting includes three mitigation options: - - 1. In the Cloudflare dashboard, go to the **Security Settings** page. +- **Block (on all pages)** - Issues the block across the entire zone. 
+- **Block on pages with ads** - Uses Cloudflare's automated detection of pages that display ads on your zone to block only on those pages. +- **Allow (do not block)** - Does not add any blocking. - - 2. Filter by **Bot traffic**. - 3. Go to **Block AI bots**. - 4. Under **Configurations**, select the edit icon. Choose from: - - **Only block on hostnames with ads**: Use this option if you wish to block AI bots only on portions of your site that show ads. Cloudflare automatically detects whether ads are present on a subdomain, and only block on hostnames that contain those ad units. - - **Block on all pages**: Use this option if you wish to block AI bots on all your pages. - - **Do not block (off)**: Use this option if you wish to allow AI bots on all your pages. - 5. Select **Save** to save your configuration. - +To configure these policies, customers can go to **Security Settings** > **Configure AI bot policies**. + +## Block AI bots [Deprecating on September 15, 2026] + +This setting blocks verified bots that are classified as crawling for the purpose of AI training, as well as a number of unverified bots that behave similarly. + +:::note +This option excludes mixed-purpose bots that are used both for Training and for Search. +::: -To block individual AI crawlers (rather than blocking all crawlers), use [AI Crawl Control](/ai-crawl-control/). \ No newline at end of file +To configure this setting, including the preference for blocking mixed-purpose bots, customers can go to **Security Settings** > **Block AI bots**. 
diff --git a/src/content/docs/bots/attribution-business-insights.mdx b/src/content/docs/bots/attribution-business-insights.mdx new file mode 100644 index 00000000000..57021b987c9 --- /dev/null +++ b/src/content/docs/bots/attribution-business-insights.mdx @@ -0,0 +1,41 @@ +--- +pcx_content_type: concept +title: Attribution Business Insights +description: Understand which bots help or harm your business with crawl-to-referral ratios and behavior-based classification. +products: + - bots +tags: + - AI + - Bots +sidebar: + order: 7 +head: + - tag: title + content: Attribution Business Insights +--- + +import { DashButton } from "~/components" + +**Attribution Business Insights** is a dashboard designed for business decision-makers and content owners, delivering a targeted view of bot traffic flowing to your website. Analyze crawler patterns to your website in the last 24 hours, 7 days, or 30 days. + +## Availability + +Attribution Business Insights is available to all [Bot Management Enterprise](/bots/get-started/bot-management/) customers. + +This dashboard is meant for visibility for a new set of stakeholders, and does not provide a new control plane. To mitigate certain bots, website owners can use [Security rules](/security/rules/) or the [new AI bot mitigation options](/bots/additional-configurations/block-ai-bots/). + +## Access + + + +You can also reach the dashboard from your zone-level **Analytics** > **Attribution Business Insights** in the Cloudflare dashboard. + +## Definitions + +The dashboard surfaces both existing and new metrics that help you evaluate AI traffic. In the current version, we use the following definitions for the metrics shown on the dashboard: + +- **Content pages**: Content is initially defined as HTML pages on your website. +- **Crawl-to-referral ratio, per bot operator**: The average crawl-to-referral ratio (number of crawls sent by this company, vs. 
 the number of visitors who visit you through a referral link from that company, tracked through UTM parameters) for a given company, in the selected time period. +- **Crawl-to-referral ratio, site-wide**: The average crawl-to-referral ratio (number of crawls sent by all companies, vs. the number of visitors who visit you through referral links from those companies, tracked through UTM parameters) across all activity on your zone, in the selected time period. +- **Classification**: Each crawler is classified with Cloudflare's updated taxonomy. See [Verified bot classifications](/bots/concepts/bot/verified-bots/) for more information. If the company has at least 1 bot with an AI use case, we label the operator with the "AI" label and provide this label as a filter. +- **Action**: Action reflects whether requests from this company are Blocked, Allowed, or Partially blocked. Companies that have some bots blocked but at least 1 bot allowed will be marked as "Partially blocked", and configuration can be confirmed in [Security rules](/security/rules/). diff --git a/src/content/docs/bots/botbase.mdx b/src/content/docs/bots/botbase.mdx new file mode 100644 index 00000000000..31da4368c54 --- /dev/null +++ b/src/content/docs/bots/botbase.mdx @@ -0,0 +1,42 @@ +--- +pcx_content_type: concept +title: BotBase +description: Browse Cloudflare's directory of all known bots and agents, with behavior-based classification, directly in the dashboard. +products: + - bots +tags: + - AI + - Bots +sidebar: + order: 6 +head: + - tag: title + content: BotBase +--- + +BotBase is Cloudflare's directory of all known bots, including [verified bots and agents](/bots/concepts/bot/verified-bots/). It provides a comprehensive, searchable view of the entire bot directory directly in the Cloudflare dashboard, where you can see how Cloudflare classifies each bot and target individual bots in your security configuration. + +BotBase currently serves as a visibility plane for tracked bots.
To mitigate these bots, you can use [Security rules](/security/rules/) or the [AI traffic options](/bots/concepts/bot/#ai-bots). + +## Availability + +BotBase is available to [Enterprise Bot Management](/bots/get-started/bot-management/) customers. + +## Access + +To view BotBase, go to **Security Analytics** > **Bot analysis** > **BotBase**. You can also access BotBase from **Security Settings** > **Bot Management** > **BotBase**. + +## What you can do + +- Browse the full catalogue of all verified bots and agents, and see the behavior or behaviors each one is classified under. +- Search and filter the directory to find a specific bot or group of bots. +- Filter your own traffic to a specific bot to investigate its activity on your zone. +- Copy a bot's detection ID to target it in [Security rules](/security/rules/). + +## Classification + +BotBase classifies each tracked bot by its behavior — what the bot may do on your site. A single bot can have one or more behaviors. To read more, see [Verified bot classifications](/bots/concepts/bot/verified-bots/). + +## Radar's public-facing BotBase + +Every bot tracked in BotBase, along with select metadata, is available publicly in [Cloudflare Radar's bots and agents directory](https://radar.cloudflare.com/bots/directory). diff --git a/src/content/docs/bots/concepts/bot/index.mdx b/src/content/docs/bots/concepts/bot/index.mdx index 79b102ed141..ae1372e5744 100644 --- a/src/content/docs/bots/concepts/bot/index.mdx +++ b/src/content/docs/bots/concepts/bot/index.mdx @@ -13,8 +13,6 @@ tags: - AI --- -import { Render } from "~/components"; - A **bot** is a software application programmed to do certain tasks. Bots can be used for good (chatbots, search engine crawlers) or for evil (inventory hoarding, credential stuffing). @@ -24,69 +22,18 @@ Bots can be used for good (chatbots, search engine crawlers) or for evil (invent For more background, refer to [What is a bot?](https://www.cloudflare.com/learning/bots/what-is-a-bot/). 
::: -## Verified bots and signed agents - - - -:::note -The method for allowing or blocking verified bots depends on [your plan](/bots/concepts/bot/verified-bots/#availability). -::: - ## AI bots -To prevent AI-related usage of your site content (such as training language models or generating search answers), you can turn on a managed rule that blocks known AI crawlers that use data for training models ("AI Bots"). A managed rule is a rule that Cloudflare maintains and updates — you turn it on, but you do not write or edit the rule yourself. - -### Which bots are blocked - -When you enable this feature, Cloudflare will block the following bots: - -- `Amazonbot` (Amazon) -- `Applebot` (Apple) -- `Bytespider` (ByteDance) -- `ClaudeBot` (Anthropic) -- `DuckAssistBot` (DuckDuckGo) -- `Google-CloudVertexBot` (Google) -- `GoogleOther` (Google) -- `GPTBot` (OpenAI) -- `Meta-ExternalAgent` (Meta) -- `PetalBot` (Huawei) -- `TikTokSpider` (ByteDance) -- `CCBot` (Common Crawl) - -In addition to this list, [verified bots](https://radar.cloudflare.com/bots#verified-bots) that are classified as AI crawlers, as well as a number of unverified bots that behave similarly, are included in the rule. This rule does not include verified bots that fall into the `Search Engine` categories. - -These categories, and the bots classified in these categories, may change from time to time. - -If you are a bot operator and feel your bot may have been incorrectly categorized, [add your bot to the list of verified bots](https://dash.cloudflare.com/?to=/:account/configurations/verified-bots). +AI crawlers and agents interact with your site for very different reasons, and you may want to treat those reasons differently. Rather than relying on a single "AI bot" label, Cloudflare classifies bots by **behavior** — what a bot does on your site — so you can allow the behavior that helps your business and block the behavior that harms it. A single bot can have more than one behavior. 
-### How it works +### Classification -When you enable this feature, Cloudflare detects and blocks two categories of AI bots: +Cloudflare lets all customers manage three AI-related use cases directly: -- **Well-behaved AI crawlers** that comply with `robots.txt`, respect crawl rates, and do not hide their behavior from your website. -- **Evasive AI crawlers** that do not follow these conventions but are detected through additional signatures. +| Behavior | What it does | +| --- | --- | +| **Search** | Collects or indexes your content so it can answer questions about it later. | +| **Agent** | Automated activity acting in real time on a person's behalf to get something done, such as chat fetch bots and browser-use agents. | +| **Training** | Crawls your content to train or fine-tune a model, permanently absorbing your data into the model. | -### Rule evaluation order - -Cloudflare evaluates bot-related rules in a specific order. When a request matches a rule and receives a terminating action (such as block or challenge), it does not continue to later rules in the sequence. - -1. **Custom rules** (WAF custom rules you create) — evaluated first. -2. **Block AI bots** (the managed AI rule) — evaluated second. -3. **Other Super Bot Fight Mode rules** (definitely automated, likely automated, verified bots) — evaluated last. - -The Block AI bots rule takes precedence over all other Super Bot Fight Mode rules. For example, if you have enabled **Block AI bots** and **Allow verified bots**, verified AI bots will still be blocked. - -For Bot Management customers, custom rules run before the Block AI bots rule. If your custom rule challenges definitely automated traffic, AI bots will receive that challenge instead of reaching the Block AI bots rule. Because the challenge is a terminating action, Cloudflare does not evaluate the request against later rules in the sequence. - -The SBFM settings for verified, definitely automated, and likely bots also affect evaluation. 
If these settings are set to `allow`, the request is not matched to any SBFM rule and proceeds to the next phase — where the Block AI bots rule can still block it. If the setting is `block`, the request is blocked in the earlier phase and does not reach the AI rule at all. If the setting is `challenge`, the request matches a rule and receives a terminating action, so it will not continue to later rules. - -For self-serve non-Bot Management customers, all rules for verified, definitely automated, and likely bots run in the phase following the AI bots rule. - - - -This feature is available on all Cloudflare plans. - -:::note - -The method for blocking AI bots depends on [your plan](/bots/get-started/). -::: +Cloudflare classifies other behaviors, too — refer to [Verified bots](/bots/concepts/bot/verified-bots/). diff --git a/src/content/docs/bots/concepts/bot/signed-agents/index.mdx b/src/content/docs/bots/concepts/bot/signed-agents/index.mdx deleted file mode 100644 index 79b4ca63733..00000000000 --- a/src/content/docs/bots/concepts/bot/signed-agents/index.mdx +++ /dev/null @@ -1,36 +0,0 @@ ---- -pcx_content_type: overview -title: Signed agents -description: End-user-controlled agents verified through Web Bot Auth cryptographic signatures. -products: - - bots -sidebar: - order: 3 -learning_center: - title: What is a bot? - link: https://www.cloudflare.com/learning/bots/what-is-a-bot/ - ---- - -A signed agent is a bot controlled by an end user and verified through [Web Bot Auth](/bots/reference/bot-verification/web-bot-auth/) cryptographic signatures. - -You can request for your agent to be added to Cloudflare's bots and agents directory by filling out an [online application](https://dash.cloudflare.com/?to=/:account/configurations/verified-bots) in the Cloudflare dashboard. - -:::note -A bot cannot be registered as both a verified bot and a signed agent. Review Cloudflare's [verified bots](/bots/concepts/bot/verified-bots/) to determine how to identify your bot. 
-::: - -## Signed agent requirement - -For an agent to be recognized, it must meet the following requirements: - -1. The agent must follow the [signed agents policy](/bots/concepts/bot/signed-agents/policy/). -2. The bot must be using [Web Bot Auth](/bots/reference/bot-verification/web-bot-auth/). - -Once Cloudflare approves a signed agent, it should appear on [Cloudflare Radar's bots and agents directory](https://radar.cloudflare.com/verified-bots). - ---- - -## Verification method - -The bot must be verified using [Web Bot Auth](/bots/reference/bot-verification/web-bot-auth/). \ No newline at end of file diff --git a/src/content/docs/bots/concepts/bot/signed-agents/policy.mdx b/src/content/docs/bots/concepts/bot/signed-agents/policy.mdx deleted file mode 100644 index 4b3a2e13bca..00000000000 --- a/src/content/docs/bots/concepts/bot/signed-agents/policy.mdx +++ /dev/null @@ -1,67 +0,0 @@ ---- -pcx_content_type: reference -title: Signed agents policy -description: Requirements an agent must meet to be listed as a Cloudflare signed agent. -products: - - bots -sidebar: - order: 3 - label: Policy - ---- - -In order to be listed by Cloudflare as a signed agent, your agent must conform to the below requirements. To provide the best possible protection to our customers, this policy may change in the future as we adapt to new bot behaviors. - -## Agent policy - -### Minimum zones - -Service must be made for a widespread use of zones. - -#### Example - -A bot crawling one site is not valid. - -### Agent identification - -The user-agent field is optional as it is not required for Web Bot Authentication. - -However, if you choose to provide a user-agent, it and the message signature must meet the following requirements: - -- Have at least five characters. -- Must not contain special characters. -- Must not include the same user-agent of another verified service. - -#### Example - -`cloudflare-browser-rendering` is a valid message signature. 
- -### Service purpose - -The purpose of the service should be benign or helpful to both the owner of a zone and the users of the service. The service cannot perform any of the following: - -- Bot tooling -- Scalpers -- Credential-stuffing -- Directory-traversal scanning -- Excessive data scraping -- DDoS botnets - -#### Example - -Price scraping direct e-commerce competitors is not a valid use case. - -### Public documentation - -The agent must have a publicly documented purpose and expected behavior. - ---- - -## Breach of policy - -If any of the requirements to validate are breached, a service will be removed from the signed agent list. - -The following are examples of breaches of policy: - -- The service has vulnerabilities that have not been patched. -- The disclosed purpose of the service does not reflect on the traffic. \ No newline at end of file diff --git a/src/content/docs/bots/concepts/bot/verified-bots/index.mdx b/src/content/docs/bots/concepts/bot/verified-bots/index.mdx index 229a0a51703..635f86d74b2 100644 --- a/src/content/docs/bots/concepts/bot/verified-bots/index.mdx +++ b/src/content/docs/bots/concepts/bot/verified-bots/index.mdx @@ -1,7 +1,7 @@ --- pcx_content_type: overview title: Verified bots -description: Bots confirmed by Cloudflare as legitimate, such as search engine crawlers. +description: Bots and agents confirmed by Cloudflare as legitimate, such as search engine crawlers and user-driven agents. products: - bots sidebar: @@ -14,46 +14,78 @@ learning_center: import { GlossaryTooltip } from "~/components"; -A verified bot is a bot that Cloudflare has confirmed as legitimate, such as search engine crawlers and monitoring services. +A Verified bot is a bot or agent that Cloudflare has confirmed is **transparent about who it is and what it does**: it represents itself honestly and does not abuse the access that honesty earns. Examples include search engine crawlers, monitoring services, and user-driven agents. 
-You can request for your bot to be added to Cloudflare's bots and agents directory by filling out an [online application](https://dash.cloudflare.com/?to=/:account/configurations/verified-bots) in the Cloudflare dashboard. +Being Verified means a bot or agent meets two bars: + +1. **Honest self-identification** — it declares who it is deterministically, through a cryptographic [Web Bot Auth](/bots/reference/bot-verification/web-bot-auth/) signature, a published IP list with a stable user-agent, or reverse DNS. +2. **Non-abusive behavior** — it obeys `robots.txt` and crawl directives, maintains reasonable request rates, and has not been observed evading website owner preferences or attacking sites. + +:::note[Signed agents are now Verified] +As of July 1, 2026, the distinction between a Verified bot and a signed agent is expressed by a new metadata field tracked in [BotBase](/bots/botbase/): Direct versus Intermediary access, which tracks who can operate the bot. +::: + +## Classification :::note -A bot cannot be registered as both a verified bot and a signed agent. Review Cloudflare's [signed agents](/bots/concepts/bot/signed-agents/) to determine how to identify your bot. +These updated classifications reflect the new taxonomy of bots used in [BotBase](/bots/botbase/). They are not individual fields in WAF custom rules, but have backwards-compatible categories from the original taxonomy of Verified bots (see [Legacy categories](#legacy-categories) below). ::: -## Verified bot requirement +Cloudflare classifies each tracked bot by its behavior — what the bot may do on your site. A single bot can have one or more of the following behaviors: -For a bot to be verified, it must meet the following requirements: +| Behavior | Description | +| --- | --- | +| Search | Crawling to build search indexes or RAG databases. | +| Agent | User-directed agents visiting a page on behalf of a human. | +| Training | Crawling to train or fine-tune models. 
| +| Transact | Checkout or other transaction actions on behalf of users. | +| Data Collection | Price scraping, competitive intelligence gathering, and third-party analytics. | +| Security Testing | Vulnerability scanning and penetration testing. | +| SEO | SEO crawling, site auditing, and accessibility checks. | +| Ads Verification | Ad placement verification and ad fraud detection. | +| Social / Link Preview | Link previews for social platforms and messaging apps. | +| Feed Fetching | RSS readers, podcast aggregators, and news feed bots. | +| Monitoring & Operations | Uptime monitoring, webhooks, and health checks. | -1. The bot must follow [verified bots policy](/bots/concepts/bot/verified-bots/policy/). -2. The bot must be verified using one of the following verification methods: - - [Web Bot Auth](/bots/reference/bot-verification/web-bot-auth/) - - [IP validation](/bots/reference/bot-verification/ip-validation/) +Search, Agent, and Training are also available as managed presets you can act on across all plans. For more information, refer to [AI bots](/bots/concepts/bot/#ai-bots). -Once Cloudflare approves a verified bot, it should appear on [Cloudflare Radar's bots and agents directory](https://radar.cloudflare.com/verified-bots). +Cloudflare also labels every Verified bot or agent by how it is operated. ---- +| Label | Description | +| --- | --- | +| **Direct** | Operated by a single, narrow operator — usually on the operator's own infrastructure. Only that operator can send requests that present as this bot. | +| **Intermediary** | An agentic service that a wide range of end users can operate. The operator runs the software, but each action is initiated by a different end user. | + +Because an **intermediary** acts on behalf of many different end users, the operator and the end user are not the same party. This introduces **transitive trust**: you may trust the intermediary operator, but not necessarily every end user driving it. 
Cloudflare is experimenting with forwarding information about the end user (using the `Forwarded` header defined in [RFC 7239](https://www.rfc-editor.org/info/rfc7239)) so that website owners can apply their preferences to the party ultimately responsible for a request. + +## Becoming a Verified bot + +You can request for your bot or agent to be added to Cloudflare's bots and agents directory by filling out an [online application](https://dash.cloudflare.com/?to=/:account/configurations/verified-bots) in the Cloudflare dashboard. -## Verification methods +Once Cloudflare approves a Verified bot, it should appear in [BotBase](/bots/botbase/), shared through [Cloudflare Radar's bots and agents directory](https://radar.cloudflare.com/verified-bots). -The bot must be verified using one of the following validation methods: +The bot must be Verified using one of the following validation methods: - [Web Bot Auth](/bots/reference/bot-verification/web-bot-auth/) - [IP validation](/bots/reference/bot-verification/ip-validation/) ---- +### Breach of policy -## Categories +If any of the requirements to validate are breached, a service will be removed from the global allowlist. -You can segment your verified bot traffic by its type and purpose by adding the Verified Bot Categories field `cf.verified_bot_category` as a filter criteria in [WAF Custom rules](/waf/custom-rules/), [Advanced Rate Limiting](/waf/rate-limiting-rules/), and Late Transform rules. +The following are examples of breaches of policy: -:::caution -The Verified Bot Categories field is not compatible with legacy Firewall rules. -::: +- Adding a set of IPs that are not solely used by Verified service. +- The service IPs are breached by an attacker. +- The service has vulnerabilities that have not been patched. +- A block of IPs not briefed on onboarding is added to the list. +- The disclosed purpose of the service does not reflect on the traffic. 
+- An AI Crawler that does not respect the crawl-delay directive in `robots.txt`. +## Legacy categories + +:::note +The original categories of Verified bots remain usable in WAF custom rules and continue to work as before, even as Cloudflare launches new capabilities based on the updated taxonomy. :::
@@ -124,6 +156,10 @@ Verified Bot Categories is available on all plans. **Definition**: Powers AI-driven search experiences. **Example**: OAI-SearchBot + +:::note +Under the taxonomy introduced on July 1, 2026, there is no longer a meaningful distinction between "AI Search" and traditional search — both are treated as **Search** behavior. The `AI Search` category value is retained for backward compatibility with existing rules, but new search crawlers are classified under **Search**. Refer to [AI bots](/bots/concepts/bot/#ai-bots). +:::
@@ -224,34 +260,8 @@ Verified Bot Categories is available on all plans. **Definition**: A dedicated category for bots that do not fit into the other classifications.
-Cloudflare reserves the right to re-assign verified bot categories if the bot's public documentation and observed behavior differ from the category listed in the bot submission form. - ---- - -## Inactive verified bots - -Once Cloudflare lists a bot as a verified bot, this entry is cached and may get delisted if no traffic is seen in the Cloudflare network coming from the bot for a defined period of time. - -It takes approximately 24 hours for an inactive IP to be removed as a verified bot. - ---- - -### Known issues - -The Yandex bot is classified as a Verified Bot, but traffic may occasionally be blocked by a [WAF Managed Rule](/waf/managed-rules/) (such as the rule with ID `...f6cbb163`). - -This typically occurs when Yandex updates its source IP address ranges. The new IPs are temporarily unrecognized by the WAF Managed Rules until the updated Verified Bot IP list is fully synchronized across the Cloudflare network. - -To restore Yandex traffic, deploy a [WAF exception](/waf/managed-rules/waf-exceptions/) that temporarily skips the managed rule with ID `` when a request is coming from the **Yandex IP** and the user-agent contains **Yandex**. This ensures that legitimate Yandex traffic bypasses the blocking rule without disabling security features for other traffic. - -You can also create a [WAF Custom Rule](/waf/custom-rules/skip/) with the _Skip_ action targeting the managed ruleset that contains the blocking rule. The rule expression should specifically match the request's Yandex IP and User-Agent. - -The issue is transient and will resolve automatically once the new Yandex IP addresses are fully propagated to Cloudflare's systems. This propagation typically takes up to 48 hours. If the bot remains blocked after 48 hours, contact [Cloudflare Support](/support/contacting-cloudflare-support/). 
- ---- +Cloudflare reserves the right to re-assign Verified bot categories if the bot's public documentation and observed behavior differ from the category listed in the bot submission form. ## Availability -Verified bots are excluded by default when [Bot Fight Mode](/bots/get-started/bot-fight-mode/) is enabled to block definite bots. - -[Super Bot Fight Mode](/bots/get-started/super-bot-fight-mode/) and [Enterprise Bot Management](/bots/get-started/bot-management/) customers have the option to block or allow verified bots. \ No newline at end of file +Historically, Verified bots have been excluded in default bot configurations across all plans. Now, all customers have the option to [configure AI bot policies](/bots/additional-configurations/block-ai-bots/) to define their block vs. allow expectations. diff --git a/src/content/docs/bots/concepts/bot/verified-bots/policy.mdx b/src/content/docs/bots/concepts/bot/verified-bots/policy.mdx deleted file mode 100644 index 0a353f69829..00000000000 --- a/src/content/docs/bots/concepts/bot/verified-bots/policy.mdx +++ /dev/null @@ -1,95 +0,0 @@ ---- -pcx_content_type: reference -title: Verified bots policy -description: Requirements a bot must meet to be listed as a Cloudflare verified bot. -products: - - bots -sidebar: - order: 5 - label: Policy - ---- - -import { GlossaryTooltip } from "~/components" - -In order to be listed by Cloudflare as a verified bot, your bot must conform to the below requirements. To provide the best possible protection to our customers, this policy may change in the future as we adapt to new bot behaviors. - -## Bot policy - -### Minimum traffic - -A bot or proxy must have a minimum amount of traffic for Cloudflare to be able to find it in the sampled data. The minimum traffic should have more than 1,000 requests per day across multiple domains. 
- -:::note -Minimum traffic is not a requirement if you are using [Web Bot Auth](/bots/reference/bot-verification/web-bot-auth/) as an authentication method. -::: - -### Minimum zones - -Service must be made for a widespread use of zones. - -#### Example - -A bot crawling one site is not valid. - -### Bot identification - -The user-agent or message signature with the following requirements: - -- Have at least five characters. -- Must not contain special characters. -- Must not include the same user-agent of another verified service. - -#### Example - -`GoogleBot/1.0` is a valid user-agent. - -### Domain owner consent - -Domains should only be crawled with the explicit or implicit consent of the zone's owner or terms of use. Search engines crawlers must read the `robots.txt` to exclude paths to crawl from the owner. - -#### Example - -A tool trying to scalp inventories from different websites might be breaking terms of use while a search engine bot indexing websites but complying with `robots.txt` is a valid service. - -### Service purpose - -The purpose of the service should be benign or helpful to both the owner of a zone and the users of the service. The service cannot perform any of the following: - -- Bot tooling -- Scalpers -- Credential-stuffing -- Directory-traversal scanning -- Excessive data scraping -- DDoS botnets - -#### Example - -Price scraping direct e-commerce competitors is not a valid use case. - -### Crawling etiquette - -The crawling etiquette should check `robots.txt` if crawling the whole website, and it should not attempt to crawl sensitive paths. - -#### Example - -If a search engine crawler skips `robots.txt`, it will be rejected. - -### Public documentation - -The bot must have publicly documented expected behavior or user-agent format. - ---- - -## Breach of Policy - -If any of the requirements to validate are breached, a service will be removed from the global allowlist. 
- -The following are examples of breaches of policy: - -- Adding a set of IPs that are not solely used by verified service. -- The service IPs are breached by an attacker. -- The service has vulnerabilities that have not been patched. -- A block of IPs not briefed on onboarding is added to the list. -- The disclosed purpose of the service does not reflect on the traffic. -- An AI Crawler that does not respect the crawl-delay directive in robots.txt. diff --git a/src/content/docs/bots/reference/bot-management-variables.mdx b/src/content/docs/bots/reference/bot-management-variables.mdx index 8b78f19b39b..a24abc0c09a 100644 --- a/src/content/docs/bots/reference/bot-management-variables.mdx +++ b/src/content/docs/bots/reference/bot-management-variables.mdx @@ -24,8 +24,8 @@ Bot Management provides access to several [new variables](/ruleset-engine/rules- - **Serves Static Resource** (`cf.bot_management.static_resource`): An identifier that matches [file extensions](/bots/additional-configurations/static-resources/) for many types of static resources. Use this variable if you send emails that retrieve static images. - **ja3Hash** (`cf.bot_management.ja3_hash`) and **ja4** (`cf.bot_management.ja4`): A [**JA3/JA4 fingerprint**](/bots/additional-configurations/ja3-ja4-fingerprint/) helps you profile specific SSL/TLS clients across different destination IPs, Ports, and X509 certificates. - **Bot Detection IDs** (`cf.bot_management.detection_ids`): List of IDs that correlate to the Bot Management heuristic detections made on a request (you can have multiple heuristic detections on the same request). -- **Signed Agent** (`cf.bot_management.signed_agent`): A boolean value that indicates whether the request originated from a known [signed agent](/bots/concepts/bot/signed-agents/). -- **Verified Bot Categories** (`cf.verified_bot_category`): A string that allows you to segment your verified bot traffic by its [type and purpose](/bots/concepts/bot/verified-bots/#categories). 
+- **Signed Agent** (`cf.bot_management.signed_agent`): A boolean value that indicates whether the request originated from a known agent that self-identifies with Web Bot Auth. Such agents are now classified as [verified bots and agents](/bots/concepts/bot/verified-bots/) labeled as intermediary. +- **Verified Bot Categories** (`cf.verified_bot_category`): A string that allows you to segment your verified bot traffic by its [type and purpose](/bots/concepts/bot/verified-bots/#legacy-categories). ## Workers variables diff --git a/src/content/docs/bots/reference/bot-verification/index.mdx b/src/content/docs/bots/reference/bot-verification/index.mdx index c074fd2973d..cee75ecc657 100644 --- a/src/content/docs/bots/reference/bot-verification/index.mdx +++ b/src/content/docs/bots/reference/bot-verification/index.mdx @@ -1,7 +1,7 @@ --- pcx_content_type: navigation title: Bot verification methods -description: Validation methods Cloudflare uses to verify bots and signed agents. +description: Validation methods Cloudflare uses to verify bots and agents. products: - bots sidebar: @@ -13,6 +13,6 @@ sidebar: import { DirectoryListing } from "~/components" -Refer to the following pages for more information on Cloudflare's validation methods for [Verified](/bots/concepts/bot/verified-bots/) and [Signed](/bots/concepts/bot/signed-agents/) bots. +Refer to the following pages for more information on Cloudflare's validation methods for [verified bots and agents](/bots/concepts/bot/verified-bots/). 
\ No newline at end of file diff --git a/src/content/docs/bots/reference/bot-verification/web-bot-auth.mdx b/src/content/docs/bots/reference/bot-verification/web-bot-auth.mdx index 552bcc76087..d1ae5555964 100644 --- a/src/content/docs/bots/reference/bot-verification/web-bot-auth.mdx +++ b/src/content/docs/bots/reference/bot-verification/web-bot-auth.mdx @@ -13,7 +13,7 @@ tags: import { GlossaryTooltip, Steps } from "~/components"; -Web Bot Auth is an authentication method that leverages cryptographic signatures in HTTP messages to verify that a request comes from an automated bot. Web Bot Auth is used as a verification method for [verified bots](/bots/concepts/bot/verified-bots/) and [signed agents](/bots/concepts/bot/signed-agents/). +Web Bot Auth is an authentication method that leverages cryptographic signatures in HTTP messages to verify that a request comes from an automated bot. Web Bot Auth is used as a verification method for [verified bots and agents](/bots/concepts/bot/verified-bots/). It relies on IETF drafts: a [directory draft](https://datatracker.ietf.org/doc/html/draft-meunier-http-message-signatures-directory-03) allowing the crawler to share their public keys, and a [protocol draft](https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-auth-architecture-02) defining how these keys should be used to attach the crawler's identity to HTTP requests. diff --git a/src/content/fields/index.yaml b/src/content/fields/index.yaml index 5c88bf72004..31018e78200 100644 --- a/src/content/fields/index.yaml +++ b/src/content/fields/index.yaml @@ -538,7 +538,7 @@ entries: keywords: [request, bots, client, visitor] summary: Provides the type and purpose of a verified bot. description: |- - For more details, refer to [Verified bot categories](/bots/concepts/bot/verified-bots/#categories). + For more details, refer to [Verified bot categories](/bots/concepts/bot/verified-bots/#legacy-categories). 
- name: cf.bot_management.score data_type: Number @@ -635,7 +635,7 @@ entries: categories: [Request, Bots] keywords: [request, bots, client, visitor, agent, web bot auth] plan_info_label: Enterprise add-on - summary: Indicates whether or not the request originated from a known [signed agent](/bots/concepts/bot/signed-agents/). + summary: Indicates whether or not the request originated from a known agent that self-identifies with Web Bot Auth, now classified as a [verified bot or agent](/bots/concepts/bot/verified-bots/) labeled as intermediary. description: |- Requires a Cloudflare Enterprise plan with [Bot Management](/bots/plans/bm-subscription/) enabled. diff --git a/src/content/glossary/bots.yaml b/src/content/glossary/bots.yaml index 4d5099fab45..efe5531ddb5 100644 --- a/src/content/glossary/bots.yaml +++ b/src/content/glossary/bots.yaml @@ -21,10 +21,22 @@ entries: general_definition: |- Static rules that are used to detect predictable bot behavior with no overlap with human traffic. + - term: direct + general_definition: |- + A label applied to a verified bot or agent operated by a single, narrow operator, usually on the operator's own infrastructure. Replaces the standalone "verified bot" classification used before July 1, 2026. + + - term: intermediary + general_definition: |- + A label applied to a verified agent that a wide range of end users can operate, such as a browser-use or agentic service. Replaces the "signed agent" classification used before July 1, 2026. + - term: JA3 fingerprint general_definition: |- JA3 and JA4 fingerprints profile specific SSL/TLS clients across different destination IPs, Ports, and X509 certificates. + - term: signed agent + general_definition: |- + A deprecated classification (retired July 1, 2026) for end-user-controlled agents that self-identify with Web Bot Auth. These agents are now verified bots labeled as intermediary. 
+ - term: verified bot general_definition: |- - Bots that are transparent about who they are and what they do. + A bot or agent that Cloudflare has confirmed is transparent about who it is and what it does: it represents itself honestly and does not abuse the access that honesty earns. diff --git a/src/content/partials/bots/bot-score-categories.mdx b/src/content/partials/bots/bot-score-categories.mdx index 72c29fdb05d..390ea0bbc7c 100644 --- a/src/content/partials/bots/bot-score-categories.mdx +++ b/src/content/partials/bots/bot-score-categories.mdx @@ -4,7 +4,7 @@ Cloudflare classifies bot traffic into categories based on bot scores and verification status: -- **Verified bots**: Crawlers and services that Cloudflare has confirmed as legitimate, such as Googlebot, Bingbot, and uptime monitors. Cloudflare maintains a [verified bot list](/bots/concepts/bot/verified-bots/policy/) with strict requirements. +- **Verified bots**: Crawlers and services that Cloudflare has confirmed as legitimate, such as Googlebot, Bingbot, and uptime monitors. Cloudflare maintains a [verified bot list](/bots/concepts/bot/verified-bots/) with strict requirements. - **Automated** (score 1): Cloudflare is quite certain the request is automated. - **Likely automated** (scores 2-29): Probably a bot. This category and Automated are the primary targets for security rules, including scrapers, credential stuffing tools, and spam submitters. - **Likely human** (scores 30-99): These requests appear to come from real users. Do not challenge or block this traffic. 
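Reviewer sketch (not part of the patch): the score bands listed in `bot-score-categories.mdx` above can be read as a simple lookup. The helper below is a minimal illustration, assuming that verification status is checked before the score and that scores fall in the 1 to 99 range the partial describes. The function name `classify_bot_traffic` and its parameters are hypothetical and are not a Cloudflare API.

```python
from typing import Optional


def classify_bot_traffic(score: Optional[int], verified: bool = False) -> str:
    """Map a bot score to the category names used in the partial.

    Hypothetical helper for illustration only; not a Cloudflare API.
    """
    # Assumption: verification status is checked first, because the partial
    # lists verified bots as their own category rather than as a score band.
    if verified:
        return "Verified bots"
    # The partial only describes scores 1 through 99.
    if score is None or not 1 <= score <= 99:
        raise ValueError("expected a bot score between 1 and 99")
    if score == 1:
        return "Automated"
    if score <= 29:
        return "Likely automated"
    return "Likely human"
```

For example, `classify_bot_traffic(12)` falls in the 2-29 band and returns `"Likely automated"`, while any request flagged as verified returns `"Verified bots"` regardless of its score.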
diff --git a/src/content/partials/bots/verified-bots.mdx b/src/content/partials/bots/verified-bots.mdx index 643ff4708ca..c020f6977f2 100644 --- a/src/content/partials/bots/verified-bots.mdx +++ b/src/content/partials/bots/verified-bots.mdx @@ -3,8 +3,8 @@ --- -Cloudflare maintains an internal directory of [verified bot](/bots/concepts/bot/verified-bots/) and [signed agents](/bots/concepts/bot/signed-agents/) that are associated with search engine optimization (SEO), website monitoring, and more. +Cloudflare maintains an internal directory of [verified bots and agents](/bots/concepts/bot/verified-bots/) that are associated with search engine optimization (SEO), website monitoring, user-driven automation, and more. You can use this directory to prevent any bot protection measures from impacting otherwise helpful bots and agents, such as search crawlers. -For a partial list of verified bots and signed agents, refer to [Cloudflare Radar](https://radar.cloudflare.com/verified-bots). +For a partial list of verified bots and agents, refer to [Cloudflare Radar](https://radar.cloudflare.com/verified-bots). diff --git a/src/content/release-notes/bots.yaml b/src/content/release-notes/bots.yaml index b6b046e0560..e9298b9f3c6 100644 --- a/src/content/release-notes/bots.yaml +++ b/src/content/release-notes/bots.yaml @@ -3,6 +3,14 @@ link: "/bots/changelog/" productName: Bots productLink: "/bots/" entries: + - publish_date: "2026-07-01" + title: New options to manage AI traffic + description: |- + All customers can now manage AI crawlers by behavior — [Search, Agent, and Training](/bots/concepts/bot/#ai-bots) — instead of a single Block AI bots toggle. Configure these options from [Block AI Bots](/bots/additional-configurations/block-ai-bots/). New defaults, in which Training and Agent are blocked on pages that display ads while Search remains allowed, take effect for new domains on September 15, 2026. 
+ - publish_date: "2026-07-01" + title: BotBase and Attribution Business Insights for Enterprise Bot Management + description: |- + Enterprise Bot Management customers can now use [BotBase](/bots/botbase/), a searchable directory of all tracked bots and agents with their behavior classification and detection IDs, and [Attribution Business Insights](/bots/attribution-business-insights/), a dashboard showing site-wide and per-operator crawl-to-referral ratios alongside bot traffic to your content. - publish_date: "2025-07-02" title: Managed robots.txt will prepend existing files description: Cloudflare will prepend our managed `robots.txt` before your existing `robots.txt`, combining both into a single response.