Robots.txt is a few lines of text with outsized power: get it wrong and you can wall off your whole site or quietly block the crawlers you most want. Here are the mistakes that actually cause damage.
The mistakes that hurt
- Disallowing CSS/JS. Blocking assets Google needs to render the page degrades how it understands and ranks you. Let rendering resources through.
- Using robots.txt to deindex. Disallow stops crawling, not indexing: a blocked URL can still appear in results (without a snippet). To remove a page, use noindex and let it be crawled.
- A stray sitewide block. A leftover Disallow: / from staging is the classic catastrophe. Always check after launches.
- Stacked, conflicting groups. Multiple User-agent: * blocks confuse maintenance and invite mistakes. Keep one clean group per agent.
- Blocking the AI crawlers you wanted. A blanket rule can exclude the search and user-fetch bots that drive AI citations. (See The Forgotten HTML.)
- Assuming it is enforcement. Robots.txt is a request. Misbehaving bots ignore it; for real enforcement use a firewall.
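Several of these mistakes can be caught mechanically. The Python sketch below is a rough lint pass, not a full robots.txt parser. The function names parse_groups and audit are made up for illustration, and the asset check is only a heuristic that looks for Disallow rules ending in .css or .js.

```python
from collections import Counter


def parse_groups(text):
    """Split robots.txt text into a list of (agents, rules) groups."""
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # rules already seen, so this line opens a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


def audit(text):
    """Return human-readable warnings for a few common robots.txt mistakes."""
    warnings = []
    groups = parse_groups(text)

    # Stacked groups: the same user agent named in more than one group.
    counts = Counter(agent for agents, _ in groups for agent in agents)
    for agent, n in counts.items():
        if n > 1:
            warnings.append(f"{n} separate groups for user-agent {agent!r}")

    for agents, rules in groups:
        for field, value in rules:
            if field != "disallow":
                continue
            # Stray sitewide block aimed at every crawler.
            if value == "/" and "*" in agents:
                warnings.append("sitewide block: Disallow: / for all agents")
            # Heuristic only: rules that target stylesheet or script files.
            if value.lower().rstrip("$").endswith((".css", ".js")):
                warnings.append(f"rendering assets blocked: Disallow: {value}")
    return warnings


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /\n\nUser-agent: *\nDisallow: /*.css$\n"
    for warning in audit(sample):
        print(warning)
```

A check like this only reads the file. It cannot tell you whether a blocked URL is already indexed, or whether a bot actually obeys the rules.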
How to get it right
- Default to open. Only disallow what genuinely should not be crawled (internal search, cart, admin).
- Keep it minimal and readable — one group per user agent, comments for intent.
- Reference your sitemap with a Sitemap: line.
- Test changes in GSC's robots.txt report before and after.
- Re-check after every launch or migration.
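As a complement to the GSC report, the checklist above can be turned into a small smoke test to run after each launch. This is a minimal sketch using Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later). The www.example.com URLs, the sample file, and the helper names load_rules and smoke_test are assumptions for illustration. Python's parser does plain prefix matching in file order and does not reproduce Google's wildcard or longest-match handling, so treat the result as a sanity check rather than a verdict.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical minimal file: open by default, one group, sitemap referenced.
ROBOTS_TXT = """\
# Default to open; block only what should not be crawled.
User-agent: *
Disallow: /search
Disallow: /cart/
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def smoke_test(text, must_allow, must_block, agent="Googlebot"):
    """Return a list of problems; an empty list means the checks passed."""
    parser = load_rules(text)
    problems = []
    for url in must_allow:
        if not parser.can_fetch(agent, url):
            problems.append(f"blocked but should be crawlable: {url}")
    for url in must_block:
        if parser.can_fetch(agent, url):
            problems.append(f"crawlable but should be blocked: {url}")
    if not parser.site_maps():  # None when no Sitemap: line was found
        problems.append("no Sitemap: line found")
    return problems


if __name__ == "__main__":
    problems = smoke_test(
        ROBOTS_TXT,
        must_allow=["https://www.example.com/", "https://www.example.com/products/widget"],
        must_block=["https://www.example.com/cart/checkout"],
    )
    print(problems or "robots.txt looks as intended")
```

To check the live file instead of a local string, RobotFileParser also offers set_url() and read(), which fetch the file over the network.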
Related: Crawl budget · What AI crawlers really see