Technical SEO · Glossary · Updated Apr 2026

Indexability

Definition

Indexability is whether a crawlable URL is eligible to appear in a search engine's index. It can be blocked by noindex directives, canonical tags pointing elsewhere, thin or duplicate content, or quality filters. Crawlability is a prerequisite, not a guarantee.

Long definition

Indexability sits one gate downstream of crawlability. Once Googlebot has fetched the page, the URL still has to clear several filters before it becomes eligible to appear in search results:

  • Directive filters: a <meta name="robots" content="noindex"> tag or an X-Robots-Tag: noindex header removes the page from eligibility, and a canonical tag pointing at a different URL asks Google to index that URL instead.
  • Content filters: thin content (too little substance for the claimed topic), exact duplicates of other URLs on the same site, and "soft 404" patterns (a 200 OK page saying "this product is gone") get filtered silently.
  • Quality filters: broader signals around site trust, spam patterns, and topical authority can demote or remove URLs even when directives and content look clean.
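
The directive filters are the only ones of the three that can be checked mechanically from a single response. A minimal sketch of such a check, using only the Python standard library, might look like the following. The function name directive_blockers and its return format are illustrative assumptions, not part of any SEO tool, and the parsing is simplified: it ignores user-agent-scoped header values such as "googlebot: noindex".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _HeadSignals(HTMLParser):
    """Collects robots meta directives and the first rel=canonical href."""

    def __init__(self):
        super().__init__()
        self.robots = []       # directive tokens from <meta name="robots">
        self.canonical = None  # first rel="canonical" href seen, if any

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "meta" and a.get("name", "").lower() in ("robots", "googlebot"):
            self.robots += [t.strip().lower() for t in a.get("content", "").split(",")]
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            if self.canonical is None:
                self.canonical = a.get("href")


def directive_blockers(url, html, headers):
    """Return directive-level reasons the URL may be ineligible for indexing.

    `headers` is a plain dict of response headers (hypothetical input shape).
    """
    parser = _HeadSignals()
    parser.feed(html)
    reasons = []

    # "none" is shorthand for "noindex, nofollow".
    if "noindex" in parser.robots or "none" in parser.robots:
        reasons.append("meta robots noindex")

    lowered = {k.lower(): v for k, v in headers.items()}
    tokens = [t.strip().lower() for t in lowered.get("x-robots-tag", "").split(",")]
    if "noindex" in tokens or "none" in tokens:
        reasons.append("X-Robots-Tag noindex")

    if parser.canonical:
        target = urljoin(url, parser.canonical)  # resolve relative hrefs
        if target.rstrip("/") != url.rstrip("/"):
            reasons.append(f"canonical points to {target}")

    return reasons
```

An empty list only means no directive is in the way; the content and quality filters can still keep the URL out of the index.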

Search Console's "Page indexing" report is the clearest diagnostic. The important distinction is between "Indexed" URLs (eligible to rank) and "Not indexed" URLs, each of which is listed with a specific reason: "Crawled - currently not indexed" (fetched, but failed an index-stage filter), or an explicit exclusion such as noindex, a canonical pointing elsewhere, or a robots.txt block.

Common misconceptions

  • "Indexable means it will rank." No. Indexability is the floor; ranking is a separate question, driven by relevance and authority signals for the query.
  • "Noindex + robots.txt disallow is belt-and-braces." The two are actually incompatible. A URL blocked by robots.txt can't be crawled, so Google never sees the noindex. The URL may stay in the index with a generic snippet until it drops out naturally.
  • "Self-canonicals guarantee indexability." They help Google understand your preference, but a self-canonical on a thin/duplicate page doesn't unblock it. Google can and does ignore canonicals when signals conflict.
  • "Removing a URL from the sitemap removes it from the index." It doesn't. To drop a URL, use noindex (then wait for a recrawl) or return a 410 Gone status code.
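
The noindex and robots.txt conflict described above can be caught before deployment. The sketch below uses Python's standard urllib.robotparser to test whether a crawler is allowed to fetch a URL that carries a noindex. The function name noindex_conflicts_with_robots is a hypothetical helper, and the standard-library parser does not reproduce every detail of Google's own robots.txt matching (wildcard patterns in particular), so treat the result as a first-pass warning rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def noindex_conflicts_with_robots(robots_txt, url, has_noindex, user_agent="Googlebot"):
    """Return True when a noindex on `url` can never be seen by the crawler.

    robots_txt  -- the robots.txt file contents as a string
    has_noindex -- whether the page (or its headers) carries a noindex directive
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    # A disallowed URL is never fetched, so its noindex is invisible.
    return has_noindex and not rules.can_fetch(user_agent, url)


robots = "User-agent: *\nDisallow: /private/\n"
if noindex_conflicts_with_robots(robots, "https://example.com/private/old", True):
    print("Lift the disallow until the URL has been recrawled and dropped.")
```

The usual order of operations is to allow crawling, serve the noindex until the URL has dropped out of the index, and only then add the disallow if crawl savings still matter.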