Which AI Bots Are You Actually Blocking? (GPTBot, ClaudeBot, Perplexity & More)


"Block AI bots" sounds like one switch. It is not. Every AI company runs several crawlers with different jobs, and blocking the wrong one is how sites accidentally disappear from ChatGPT, Claude and Perplexity answers while believing they only opted out of model training. If you are going to make a decision about AI crawlers — and you should — make it the right one.

Every AI bot has one of three jobs

  • Training — collects content that may train future models. Blocking it affects whether you contribute to the model, not whether you show up in answers today.
  • Search / indexing — builds the live index the assistant retrieves from. Block this and you vanish from that product's cited answers.
  • User-fetch — grabs a specific URL when a human asks the assistant to read it. Block this and the assistant cannot open your page on request.

That distinction is the whole game. Here is who's who.

The 2026 AI crawler field guide

| Bot (user agent) | Operator | Job | What blocking it does |
| --- | --- | --- | --- |
| GPTBot | OpenAI | Training | Excludes you from training data. Does not remove you from ChatGPT search. |
| OAI-SearchBot | OpenAI | Search index | Removes you from ChatGPT's search results and citations. |
| ChatGPT-User | OpenAI | User-fetch | ChatGPT can't open your page when a user asks it to. |
| ClaudeBot | Anthropic | Training | Excludes you from Claude training data only. |
| Claude-SearchBot | Anthropic | Search index | Removes you from Claude's search-backed answers. |
| Claude-User | Anthropic | User-fetch | Claude can't fetch your URL on a user's request. |
| PerplexityBot | Perplexity | Search index | Removes you from Perplexity's indexed answers. |
| Perplexity-User | Perplexity | User-fetch | Live fetches for user questions (historically does not honor robots.txt; control it at the firewall). |
| Googlebot | Google | Search index | Removes you from Google Search and AI Overviews (which use the search index). Renders JavaScript. |
| Google-Extended | Google | Training token | A robots.txt token, not a crawler. Opts you out of Gemini/Vertex training. Does not affect Search or AI Overviews. |
| Bingbot | Microsoft | Search index | Removes you from Bing and Microsoft Copilot. |
| CCBot | Common Crawl | Training (open dataset) | Excludes you from the open dataset many AI trainers use. |

Sources: OpenAI's crawler docs and Anthropic's documented agents.
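
The field guide above can be turned into a small lookup, which is handy when scanning access logs. The sketch below is illustrative only: `BOT_JOBS` and `classify_user_agent` are hypothetical names, the sample header is made up, and matching by case-insensitive substring is an assumption about how these tokens appear in real User-Agent headers.

```python
from typing import Optional

# Hypothetical lookup built from the field guide above.
# Google-Extended is left out because it is a robots.txt token, not a crawler.
BOT_JOBS = {
    "GPTBot": "training",
    "OAI-SearchBot": "search",
    "ChatGPT-User": "user-fetch",
    "ClaudeBot": "training",
    "Claude-SearchBot": "search",
    "Claude-User": "user-fetch",
    "PerplexityBot": "search",
    "Perplexity-User": "user-fetch",
    "Googlebot": "search",
    "Bingbot": "search",
    "CCBot": "training",
}


def classify_user_agent(header: str) -> Optional[str]:
    """Return the job of the first known bot token found in the header, or None."""
    lowered = header.lower()
    for token, job in BOT_JOBS.items():
        if token.lower() in lowered:
            return job
    return None


if __name__ == "__main__":
    sample = "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"  # made-up header
    print(classify_user_agent(sample))
```

A header that names none of the listed bots returns `None`, so ordinary browsers and unknown crawlers fall through without being misclassified.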

The mistake almost everyone makes

Someone reads a scary headline about AI scraping, drops a blanket Disallow for "AI bots," or a security plugin/WAF starts 403'ing anything that looks like a bot. The result: you block OAI-SearchBot, Claude-SearchBot and PerplexityBot too — the exact crawlers that decide whether you get cited — while thinking you only said no to training. (For real, reproducible examples of big sites doing this, see The Forgotten HTML.)

The two robots.txt recipes that actually make sense

1. "I want AI citations, just not to train the models." Allow the search and user bots; block the training ones:

# Training bots: blocked
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Leave these ALLOWED (the default when no rule matches):
# OAI-SearchBot, ChatGPT-User, Claude-SearchBot, Claude-User, PerplexityBot, Googlebot, Bingbot

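2. "Block everything AI." Disallow the search and user-fetch bots as well as the training ones, but understand that you are opting out of AI-answer visibility entirely, and accept the traffic trade-off. Googlebot and Bingbot also feed regular search, so disallowing them removes you from Google and Bing results too.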

One caveat: robots.txt is a request, not a wall. Some user-fetch bots ignore it, and a firewall can override it in either direction. If it must be enforced, do it at the WAF/CDN.
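
Before relying on recipe 1, it can be parsed locally to confirm that it blocks only the training bots. This is a minimal sketch using Python's standard `urllib.robotparser`. The `check_agents` helper and the example.com URL are placeholders, and Python's parser may not interpret the file exactly the way each crawler does, so treat the result as a sanity check rather than proof.

```python
from urllib.robotparser import RobotFileParser

# Recipe 1: block the training bots, leave everything else at the default.
RECIPE_1 = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""

AGENTS = [
    "GPTBot", "ClaudeBot", "CCBot", "Google-Extended",   # training
    "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot",  # search
    "ChatGPT-User", "Claude-User",                         # user-fetch
    "Googlebot", "Bingbot",                                # classic search
]


def check_agents(robots_txt: str, agents: list, url: str = "https://example.com/") -> dict:
    """Map each agent name to True if the rules allow it to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network request
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    for agent, allowed in check_agents(RECIPE_1, AGENTS).items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

This only tests what the robots.txt file says. It cannot see a firewall or CDN rule that overrides the file.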

How to check what you're actually blocking

Test each user agent directly — a 403 means you're blocking it, content means you're not:

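# Replace https://yoursite.com/ with your own URL.
for ua in GPTBot OAI-SearchBot ChatGPT-User ClaudeBot Claude-SearchBot PerplexityBot; do
  # -A sets the User-Agent header; -w prints only the HTTP status code
  printf '%s: ' "$ua"
  curl -s -o /dev/null -w '%{http_code}\n' -A "$ua" https://yoursite.com/
done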

Then read your robots.txt, and check your CDN/WAF bot-management rules — that is where accidental blocks usually live.
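
The status codes from the loop above can be read against each bot's job to spot accidental blocks. Here is a rough sketch, assuming that any status of 400 or higher means the bot was refused (the text above only names 403). `find_accidental_blocks`, `VISIBILITY_BOTS`, and the sample results are hypothetical.

```python
# Bots whose blocking costs answer visibility (the search and user-fetch jobs).
VISIBILITY_BOTS = {
    "OAI-SearchBot",
    "ChatGPT-User",
    "Claude-SearchBot",
    "Claude-User",
    "PerplexityBot",
}


def find_accidental_blocks(status_by_agent: dict) -> list:
    """Return visibility-critical bots that received an error status, sorted by name."""
    return sorted(
        agent
        for agent, status in status_by_agent.items()
        if agent in VISIBILITY_BOTS and status >= 400  # assumption: 4xx/5xx = refused
    )


if __name__ == "__main__":
    # Made-up results, as if copied from the curl loop's output.
    sample = {
        "GPTBot": 403,
        "OAI-SearchBot": 403,
        "ChatGPT-User": 200,
        "ClaudeBot": 403,
        "Claude-SearchBot": 200,
        "PerplexityBot": 403,
    }
    print(find_accidental_blocks(sample))
```

Training bots such as GPTBot are ignored here on purpose: a 403 for them matches the "no training" intent, while a 403 for a search or user-fetch bot is the accidental block described earlier.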

