X-Robots-Tag

Definition

X-Robots-Tag is an HTTP response header that delivers crawler directives at the server level. Functionally equivalent to a meta robots tag but works on any resource, including PDFs, images, and other non-HTML files where meta tags cannot be embedded. Supports the same directives: noindex, nofollow, etc.

Long definition

The meta robots tag works only inside HTML <head>. That leaves PDFs, images, videos, JSON feeds, and any other non-HTML resource without a way to opt out of indexing — until you reach for X-Robots-Tag, the HTTP header version. Google documents both at the robots meta tag reference.
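
Side by side, the same directive in its two delivery forms:

<meta name="robots" content="noindex">
X-Robots-Tag: noindex

The first exists only where you can edit the HTML head; the second rides on the HTTP response for any content type.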

Syntax is identical to meta robots, just delivered in the response header:

X-Robots-Tag: noindex
X-Robots-Tag: noindex, nofollow
X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: nofollow
X-Robots-Tag: noindex, unavailable_after: 25 Jun 2026 15:00:00 GMT
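
To check what a resource is actually sending, request just the headers, for example with curl (the URL is a placeholder):

curl -sI https://example.com/report.pdf | grep -i x-robots-tag

If nothing comes back, the header isn't set on that response and the directive isn't doing anything.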

When X-Robots-Tag is the right tool:

  • PDFs you don't want indexed — terms-of-service archives, internal docs accidentally exposed, generated reports.
  • Images you don't want in image search — proprietary product shots, logos behind a paywall.
  • API responses — JSON or XML endpoints that shouldn't appear in search results.
  • Bulk directives at the server level — applying noindex to an entire /staging/ path via Nginx config without modifying every page template (see the sketch after this list).
  • Non-HTML resources with unavailable_after — scheduled removal of seasonal PDFs, expiring documents.
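
A minimal sketch of that server-level case in Nginx, assuming the staging content lives under a /staging/ path prefix (adjust to your own layout):

location /staging/ {
    # "always" makes Nginx send the header on error responses too, not just 200s.
    add_header X-Robots-Tag "noindex, nofollow" always;
}

One location block covers every template, PDF, and image under the prefix, which is the point of doing this at the server rather than in markup.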

When meta robots is fine:

  • Standard HTML pages where you control the template.
  • Per-page directives managed in CMS metadata fields.
  • Anything with template-level conditional logic — you already render the meta tag conditionally.

Server configuration examples:

  • Nginx: add_header X-Robots-Tag "noindex" always;
  • Apache: Header set X-Robots-Tag "noindex" inside <Files> or <FilesMatch> (sketch after this list).
  • Cloudflare Workers / edge: set the header on response.
  • CDN rules: most CDNs support response header injection by URL pattern.
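
As a concrete instance of the Apache bullet above, a FilesMatch sketch that marks every PDF noindex (requires mod_headers; in practice you would scope the pattern to the directory that needs it):

<FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex"
</FilesMatch>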

Important: the URL must be crawlable for the directive to be honored. If you Disallow a path in robots.txt and also set X-Robots-Tag: noindex, Googlebot can't fetch the URL to read the header — and the URL may still appear in search results as an "indexed though blocked" entry. Pick one strategy: either let Googlebot crawl and read noindex, or block fully via robots.txt knowing the URL may still surface.
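
Concretely, this combination defeats itself (paths are illustrative):

# robots.txt
User-agent: *
Disallow: /reports/

# HTTP response for URLs under /reports/
X-Robots-Tag: noindex

Googlebot never fetches anything under /reports/, so the noindex header is never read. Dropping the Disallow line and keeping the header is what actually gets those URLs out of the index.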

Common misconceptions

  • "X-Robots-Tag is more powerful than meta robots." It's equivalent — same directives, same priority. Google and Bing honor both equally. The advantage is only that headers work on non-HTML resources.
  • "You can use X-Robots-Tag to noindex pages blocked in robots.txt." No. Blocked URLs are never fetched, so the header is never read. The two mechanisms are mutually exclusive — pick one.
  • "X-Robots-Tag is honored by all search engines." Google and Bing yes. Many AI bots, niche crawlers, and image scrapers ignore both meta robots and X-Robots-Tag. For hard exclusion, you need authentication or robots.txt.
  • "Setting X-Robots-Tag on every response is a good default." Not unless you mean to deindex everything. Misconfigured global headers (e.g. noindex on a staging environment that gets promoted to production) have caused real outages. Apply scoped to specific paths or content types.