GEO & AI Search · Glossary · Updated Apr 2026

ClaudeBot

Definition

ClaudeBot is Anthropic's web crawler, identified by the user-agent `ClaudeBot`. It's used to gather public content for training Claude models. Block it in robots.txt with a `User-agent: ClaudeBot` group and a `Disallow: /` rule. Two earlier user-agents, `anthropic-ai` and `Claude-Web`, are deprecated but worth listing for safety.

Long definition

ClaudeBot is the active crawler Anthropic uses to collect web content for training Claude. It identifies itself with a user-agent string containing ClaudeBot and follows standard robots.txt rules. Anthropic publishes documentation and IP ranges at anthropic.com — verifying log hits against the published ranges is the cleanest way to confirm a request is real.
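The verification step can be sketched with the standard library. This is a minimal example, not an official tool: the CIDR ranges below are placeholders, so substitute whatever ranges Anthropic currently publishes before relying on it.

```python
# Sketch: check whether a request claiming to be ClaudeBot actually
# originates from a published Anthropic IP range.
import ipaddress

# Placeholder CIDRs (RFC 5737 documentation blocks), NOT Anthropic's
# real ranges -- replace with the published list.
PUBLISHED_RANGES = [
    "192.0.2.0/24",
    "198.51.100.0/24",
]

def is_real_claudebot(client_ip: str, ranges=PUBLISHED_RANGES) -> bool:
    """Return True if client_ip falls inside any published range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)
```

Run this over the client IPs of log lines whose user-agent contains `ClaudeBot`; anything outside the ranges is a spoofer, not Anthropic.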

Blocking it is straightforward:

User-agent: ClaudeBot
Disallow: /

History matters here because Anthropic has changed how it identifies its crawlers. The earlier user-agents anthropic-ai and Claude-Web are deprecated, but defensive operators include all three in a single block group:

User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Claude-Web
Disallow: /

Like OpenAI, Anthropic distinguishes between crawlers used for training and traffic generated when a user inside Claude.ai requests a web page in conversation. The Claude-User agent, when present, identifies live user-initiated fetches. Blocking ClaudeBot prevents future training use; it does not necessarily block live fetches a user explicitly initiates.
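In robots.txt terms, that split means blocking the training crawler and simply omitting Claude-User, which then falls back to your `User-agent: *` group, if any. This assumes Claude-User matches robots.txt groups in the standard way:

# Block training crawls; Claude-User has no group here,
# so user-initiated fetches are unaffected.
User-agent: ClaudeBot
Disallow: /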

Cloudflare's bot management dashboards show ClaudeBot among the most frequent AI training crawlers across the web, alongside GPTBot and CCBot. Volume is meaningful for large sites — operators have reported 10-100x higher ClaudeBot crawl rates than Googlebot during peak training windows in 2024-2025.

The decision is the same as for GPTBot: if you want to be cited in Claude's answers but not used as training data, block only ClaudeBot, which keeps you reachable in live-retrieval contexts where those exist. If you want a full opt-out, block the current agent and both legacy Anthropic agents, and consider also blocking CCBot, since Common Crawl is a known training input.
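A full opt-out along those lines might look like the following. Whether to include CCBot is a judgment call, since a Common Crawl block also affects non-AI reuse of that corpus:

User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Claude-Web
Disallow: /

User-agent: CCBot
Disallow: /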

Common misconceptions

  • "Blocking anthropic-ai is enough." It's deprecated. The active agent is ClaudeBot. A robots.txt that only mentions anthropic-ai does nothing for current crawls.
  • "ClaudeBot ignores robots.txt." Anthropic publicly commits to honoring robots.txt. Reports of ignored rules typically trace back to spoofers using the user-agent string — verify against the published IP ranges before assuming bad behavior.
  • "Blocking ClaudeBot removes my content from existing Claude models." Training data already used cannot be retroactively unlearned. Blocks affect future training cycles only.
  • "All AI bots can be blocked with a single rule." Each crawler has its own user-agent. A wildcard User-agent: * block hits everyone including Googlebot, which is rarely what you want. List AI agents explicitly.