Robots Meta Tag vs X-Robots-Tag HTTP Header: When to Use Each

June 13, 2020
Technical SEO

Both the robots meta tag and the X-Robots-Tag HTTP header tell crawlers the same things (noindex, nofollow, noarchive, and the rest), but they are delivered through completely different channels. The meta tag lives inside an HTML <head>; the header travels in the server response itself. That single distinction decides which tool can reach a given URL, and getting it wrong is one of the most common reasons unwanted files keep showing up in search results.

The core difference: where the directive lives

A robots meta tag is markup. It only exists if the response body is HTML and the crawler parses it:

<meta name="robots" content="noindex, nofollow"> (applies to all crawlers)
<meta name="googlebot" content="noindex"> (targets a single user agent)

The X-Robots-Tag is part of the HTTP response header, configured at the server level and sent before the body. It works on any resource the server delivers, regardless of file type:

X-Robots-Tag: noindex
X-Robots-Tag: googlebot: noindex, nofollow (scoped to one user agent)
X-Robots-Tag: noindex, nosnippet (multiple directives, comma-separated)

Google honors the identical vocabulary in both places. The header is simply the only option when there is no HTML head to write into, and the more scalable option when you need to govern thousands of URLs at once.
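To make the "same vocabulary, different channel" point concrete, here is a minimal Python sketch that reads directives from both places and normalizes them into the same set. The function and class names are illustrative, not part of any library, and the sketch deliberately ignores user-agent-scoped forms such as "googlebot: noindex".

```python
from html.parser import HTMLParser


def split_directives(value):
    """Split a comma-separated directive string into a lowercase set."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.directives |= split_directives(attr.get("content", ""))


def directives_from_html(html):
    """Channel 1: directives embedded in the HTML body."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def directives_from_headers(headers):
    """Channel 2: directives sent as (name, value) response header pairs."""
    found = set()
    for name, value in headers:
        if name.lower() == "x-robots-tag":
            found |= split_directives(value)
    return found
```

A PDF response has no body for directives_from_html to parse, which is why the header route is the only one that can return anything for it.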

A decision framework

Ask two questions, in order. The first one that returns "yes" points you to the header; if neither does, the meta tag is the default.

Is the resource non-HTML? PDFs, images, videos, spreadsheets, JSON feeds, plain-text files, and downloadable binaries have no <head>. The meta tag is physically impossible to embed. Use X-Robots-Tag. This is the single most important rule, because PDFs and images are exactly the assets that slip into the index unintentionally.
Are you controlling a bulk URL pattern? If the rule applies to an entire directory, a query-parameter signature, a file extension, or a generated section of the site, a server-level header lets you express it once and have it apply everywhere. Editing the head of every matching template, or worse, every static file, does not scale and drifts out of sync.
Otherwise, use the meta tag. For an individual HTML page where the content team or CMS owns the template, the meta tag is closer to the content, easier to audit in "view source," and doesn't require server access. This covers most thin pages, internal search results rendered as HTML, and paginated or filtered views.
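The framework above can be sketched as a small Python function. This is only an illustration of the ordering: the function name, the return strings, and the choice of which content types count as HTML are assumptions made for the example.

```python
# Assumption: these are the content types treated as "HTML with a <head>".
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def choose_mechanism(content_type, is_bulk_pattern=False):
    """Walk the checks in order and return the first mechanism that applies."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()

    # 1. Non-HTML resources have no <head>, so only the header can reach them.
    if media_type not in HTML_TYPES:
        return "X-Robots-Tag"

    # 2. Rules covering a directory, parameter signature, or extension
    #    are expressed once at the server level.
    if is_bulk_pattern:
        return "X-Robots-Tag"

    # 3. Otherwise, an individual HTML page: the meta tag is the default.
    return "meta tag"
```

For instance, choose_mechanism("application/pdf") lands on the header at the first check, while a single HTML page with no pattern rule falls through to the meta tag.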

What only the header can do

These are the scenarios where the meta tag is not merely inconvenient but unavailable:

PDFs and office documents. A white paper, price list, or manual that you want users to be able to open and download but that should stay out of the index.
Media files. Images you don't want appearing in image search, or video files served directly.
Generated and exported assets. CSV exports, .txt files, sitedata feeds, and API responses that are publicly reachable but shouldn't rank.
Staging or sensitive directories where you'd rather emit noindex across every file type than rely on per-page markup.

Implementation patterns

On Apache, target a file type across the whole site with a single block in .htaccess or the vhost config:

<FilesMatch "\.(pdf|docx|xlsx)$">
  Header set X-Robots-Tag "noindex, noarchive"
</FilesMatch>

On Nginx, the equivalent uses a location block:

location ~* \.(pdf|docx|xlsx)$ {
  add_header X-Robots-Tag "noindex, noarchive";
}

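A note on Nginx: add_header directives are inherited from the enclosing level only if the current block defines no add_header of its own, so a location block with its own add_header silently drops headers set higher up. By default the header is also added only to a fixed set of success and redirect status codes, unless the always parameter is used. Verify the header actually appears on a real request rather than assuming the config is live.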

You can also set the header dynamically in application code (PHP's header(), a middleware layer, a CDN edge worker, or your framework's response object), which is ideal when the indexing decision depends on logic, for example, emitting noindex on filtered listing URLs that carry more than one query parameter.
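As one way to picture the application-code route, here is a minimal WSGI middleware sketch in Python for the filtered-listing example. The class name and the "/products" path prefix are hypothetical, and "more than one query parameter" is interpreted here as more than one key=value pair in the query string.

```python
from urllib.parse import parse_qsl


class NoindexFilteredListings:
    """WSGI middleware: add X-Robots-Tag: noindex to filtered listing URLs."""

    def __init__(self, app, listing_prefix="/products"):
        self.app = app
        self.listing_prefix = listing_prefix  # hypothetical listing path

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        params = parse_qsl(environ.get("QUERY_STRING", ""), keep_blank_values=True)
        # The indexing decision depends on request logic, not on file type.
        should_noindex = path.startswith(self.listing_prefix) and len(params) > 1

        def patched_start_response(status, headers, exc_info=None):
            if should_noindex:
                # Replace any existing value so the response carries one clear rule.
                headers = [h for h in headers if h[0].lower() != "x-robots-tag"]
                headers.append(("X-Robots-Tag", "noindex"))
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Wrapping an existing WSGI application as NoindexFilteredListings(app) leaves unfiltered listing pages untouched and only marks URLs such as /products?color=red&size=m.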

The crawlability prerequisite, true for both

Neither mechanism works if the URL is blocked in robots.txt. This trips people up constantly: a crawler that is disallowed from fetching a URL never sees the response, and therefore never sees the noindex in the header or the head. A disallowed page can still be indexed from external links, shown with no snippet, precisely because the directive telling it to drop out was never read.

The fix is to allow crawling and apply noindex. Let the crawler in, let it read the directive, let it drop the URL. Only after the page has been recrawled and de-indexed should you consider blocking it in robots.txt to save crawl budget.
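A quick way to catch this conflict before it ships is to test the URL against your robots.txt rules. The sketch below uses Python's standard urllib.robotparser; the function name is illustrative, and the standard-library parser only approximates how a given search engine matches rules (it does plain prefix matching and does not handle wildcard patterns the way Google does), so treat the result as a first-pass check.

```python
from urllib.robotparser import RobotFileParser


def directive_can_be_read(robots_txt, url, user_agent="Googlebot"):
    """Return True if the crawler may fetch the URL and so can see its noindex."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # If fetching is disallowed, neither the header nor the meta tag is ever seen.
    return parser.can_fetch(user_agent, url)


robots_txt = """\
User-agent: *
Disallow: /private/
"""

# The noindex on this PDF would never be read, because the fetch is blocked.
print(directive_can_be_read(robots_txt, "https://example.com/private/report.pdf"))
```

Any URL that is meant to drop out of the index through noindex should return True here until it has actually been de-indexed.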

Common mistakes

Trying to noindex a PDF with a meta tag. There's nowhere to put it. The asset stays indexed until you switch to the header.
Combining Disallow in robots.txt with noindex. The two work against each other: the block prevents the directive from ever being seen.
Setting both a meta tag and a header with conflicting values. When a crawler sees two directives for the same signal, the more restrictive one generally wins, so a stray noindex anywhere can quietly de-index a page you wanted live. Audit for duplicates.
Assuming the header is set without checking. Confirm with curl -I https://example.com/file.pdf and look for the X-Robots-Tag line in the response. Server config caveats mean "I added it" and "it's being sent" are not the same thing.
Forgetting that directives are case-insensitive but vocabulary is fixed. Only documented values (noindex, nofollow, noarchive, nosnippet, noimageindex, unavailable_after, etc.) do anything. Invented values are ignored silently.
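The last point, that values are case-insensitive but the vocabulary is fixed, lends itself to a small lint check. The Python sketch below is a rough illustration: the KNOWN_DIRECTIVES set contains only the values named in this article and should be extended from each search engine's documentation, the function name is made up for the example, and a leading "name:" that is not a known directive is assumed to be a user-agent prefix.

```python
# Assumption: only the values listed in this article; not an exhaustive list.
KNOWN_DIRECTIVES = {
    "noindex", "nofollow", "noarchive", "nosnippet",
    "noimageindex", "unavailable_after",
}


def check_robots_value(value):
    """Return (user_agent, valid, unknown) for one X-Robots-Tag header value."""
    user_agent = None
    valid, unknown = [], []
    for i, token in enumerate(value.split(",")):
        token = token.strip()
        if not token:
            continue
        name, sep, rest = token.partition(":")
        name = name.strip().lower()  # directives are case-insensitive
        if i == 0 and sep and name not in KNOWN_DIRECTIVES:
            # A leading "googlebot:"-style prefix scopes the rule to one crawler.
            user_agent = name
            name = rest.strip().partition(":")[0].strip().lower()
        (valid if name in KNOWN_DIRECTIVES else unknown).append(name)
    return user_agent, valid, unknown
```

Anything that ends up in the unknown list is a value a crawler would silently ignore, for example a misspelling such as "no-index".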

FAQ

Does the header carry any ranking weight or speed penalty? No. It's a tiny string in a response you're already sending. There is no measurable performance cost to applying it across a directory.

Can I use the header on HTML pages too? Yes. It's perfectly valid, and it's the right call when you want to govern HTML pages by pattern without editing templates. The meta tag is just usually more convenient for one-off HTML pages.

How fast does removal happen? Only on the next crawl of that URL. To accelerate it, request indexing or resubmit the URL in your search console of choice, and make sure the page is internally linked enough to be recrawled promptly.

Do other search engines respect X-Robots-Tag? Google and Bing both support it. Treat compliance from smaller or non-mainstream crawlers as unreliable, and never use either mechanism as a security control: anything you truly need hidden belongs behind authentication, not a polite request to stay out of the index.

Claude Vincent

Claude Vincent is a technical SEO consultant focused on crawlability, rendering, and AI-search visibility. He writes the field guides and case studies at SEO ProCheck, with a bias toward the durable, unglamorous work that decides whether search engines and AI answer engines can actually read and cite a site.