GEO & AI Search
A
-
AI Hallucination
An AI hallucination is a confident, plausible-sounding output from a language model that is factually wrong, fabricated, or unsupported by the retrieved sources. Common causes: thin retrieval, ambiguous prompts, and the model's tendency to fill gaps with its training-data priors.
-
AI Mode
AI Mode is Google's standalone AI-only search interface. It first appeared as a Search Labs experiment in March 2025 and was rolled out to US users at Google I/O on May 20, 2025. Unlike AI Overviews, which sit above the classic SERP, AI Mode replaces the entire results page with a Gemini-generated conversational answer. It is opt-in: users select a tab next to All, Images, News, etc.
-
AI Overview
AI Overview is the Gemini-generated answer block Google renders at the top of selected SERPs. Citations link to source pages. It graduated from the Search Generative Experience (SGE) preview to general availability in the US on May 14, 2024 (Google I/O), then expanded to 100+ countries through late 2024.
-
AI training opt-out
AI training opt-out is the bundle of mechanisms that prevent your content from being used to train language and image models: robots.txt blocks for named user-agents, meta `noai`/`noimageai` tags, `X-Robots-Tag` HTTP headers, and — in the EU — text-and-data-mining reservation under DSM Directive Article 4(3).
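As a rough illustration of the robots.txt part of this bundle, the Python sketch below builds a rules file for a few crawler tokens named elsewhere in this glossary and checks it with the standard library's `urllib.robotparser`. The helper name `build_optout_robots` and the example.com URL are hypothetical, and the token list is an assumption to adapt rather than a complete registry.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens taken from entries in this glossary; an illustrative,
# incomplete list, not an authoritative registry.
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "Bytespider"]


def build_optout_robots(bots):
    """Return robots.txt text that disallows every path for each named bot."""
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in bots]
    # Leave all other crawlers unrestricted (an empty Disallow allows everything).
    groups.append("User-agent: *\nDisallow:")
    return "\n\n".join(groups) + "\n"


robots_txt = build_optout_robots(TRAINING_BOTS)

# Parse the generated rules locally instead of fetching them over HTTP.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/article"  # hypothetical page
print(parser.can_fetch("GPTBot", url))     # False: named training crawler
print(parser.can_fetch("Googlebot", url))  # True: falls through to the * group
```

This only covers crawlers that choose to honor robots.txt; the meta tags, `X-Robots-Tag` headers, and legal reservations listed above are separate mechanisms.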
-
Answer Engine Optimization (AEO)
Answer Engine Optimization (AEO) is the practice of optimizing content to win direct-answer surfaces: featured snippets, People Also Ask, voice assistants, and FAQ rich results. The term emerged around 2017-2020, before GEO, and targets extractive answers, where the engine quotes a passage of yours verbatim.
B
-
Bing Copilot
Microsoft Copilot in Bing is Microsoft's AI answer engine inside Bing Search. It launched on February 7, 2023 as "Bing Chat", using GPT-4-class models with Bing-index grounding. Microsoft announced the unified Copilot brand on September 21, 2023 and renamed Bing Chat to "Copilot" on November 15, 2023, aligning it with the broader Copilot product family.
-
Brand visibility in LLMs
Brand visibility in LLMs is the practice of tracking when, where, and how your brand appears inside model-generated answers — named, attributed, or described in context. It's distinct from search rankings and is now the core measurement category for generative engine optimization.
-
Bytespider
Bytespider is the web crawler operated by ByteDance, the parent company of TikTok, used to gather training data for ByteDance's models including the Doubao family. It's known for aggressive request rates and frequently appears at the top of Cloudflare's "most-blocked AI bots" reports. It is expected to respect robots.txt rules addressed to its `Bytespider` user-agent, though reports of its compliance vary.
C
-
Citation rate
Citation rate is the percentage of AI-generated answers (Perplexity, ChatGPT Search, AI Overviews, Gemini) that cite your domain across a defined query set. It's the GEO equivalent of share-of-voice for keyword rankings — and the emerging benchmark for how visible you are inside model outputs.
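The calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `citation_rate`, the input shape (a mapping from each query to the URLs cited in its answer), and the sample data are all hypothetical, and collecting the answers from each engine is out of scope here.

```python
from urllib.parse import urlparse


def citation_rate(answers, domain):
    """Percentage of answers whose cited URLs include `domain` or a subdomain."""
    if not answers:
        return 0.0

    def cites_domain(urls):
        for url in urls:
            host = (urlparse(url).hostname or "").lower()
            if host == domain or host.endswith("." + domain):
                return True
        return False

    cited = sum(1 for urls in answers.values() if cites_domain(urls))
    return 100.0 * cited / len(answers)


# Hypothetical sample: query -> URLs cited in the AI answer for that query.
sample = {
    "what is geo": ["https://www.example.com/geo", "https://other.test/a"],
    "what is aeo": ["https://other.test/b"],
    "llms.txt spec": ["https://docs.example.com/llms"],
    "block gptbot": [],
}

print(citation_rate(sample, "example.com"))  # 2 of 4 answers cite the domain: 50.0
```

Counting each answer once, however many times it cites the domain, keeps the figure a share of answers rather than a share of citations.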
-
ClaudeBot
ClaudeBot is Anthropic's web crawler, identified by the user-agent `ClaudeBot`. It's used to gather public content for training Claude models. Block it via robots.txt with `User-agent: ClaudeBot`. Two earlier user-agents, `anthropic-ai` and `Claude-Web`, are deprecated but worth listing for safety.
-
Common Crawl
Common Crawl is a non-profit open-web archive that has crawled the public internet monthly since 2008. Its crawler is `CCBot`, and the resulting petabytes of HTML and text are the foundation of most public LLM training datasets, including GPT-3, LLaMA, and many others. Hosted at commoncrawl.org.
-
Conversational Search
Conversational search is multi-turn search where context from earlier queries persists into later ones. Instead of issuing a fresh query for each refinement, the user asks follow-ups like a chat. Standard in ChatGPT Search, Perplexity, Google AI Mode, and Bing Copilot.
E
G
-
Gemini Search
Gemini Search refers to the web-grounded answer mode inside Google Gemini (gemini.google.com), where the assistant uses the Google Search grounding tool to fetch live results and cite them. Distinct from AI Overviews (inside google.com SERPs) and AI Mode (the AI-only Search tab).
-
Generative Engine Optimization (GEO)
Generative Engine Optimization (GEO) is the practice of optimizing content so large language model answer engines — ChatGPT, Perplexity, Google AI Overviews, Bing Copilot — cite, quote, and reference your brand inside generated responses. The term was coined in a 2023 paper by Aggarwal et al. (arXiv:2311.09735).
-
Google-Extended
Google-Extended is an opt-out token for `robots.txt` that blocks your content from training Google's generative models (Gemini, Vertex AI generative APIs) without affecting Googlebot, Google Search ranking, or indexing. Introduced September 2023 as a separate dial from search inclusion.
-
GPTBot
GPTBot is OpenAI's web crawler, identified by the user-agent token `GPTBot` and used to gather public content for training future models. Blocking it in robots.txt opts your content out of training use but does not affect ChatGPT Search, which relies on separate user-agents: `OAI-SearchBot` for search indexing and `ChatGPT-User` for user-initiated page fetches.
L
-
LLM Grounding
LLM grounding is the practice of constraining a language model's output to retrieved sources or structured data, rather than letting it generate from parametric memory alone. Implemented via RAG, tool use, or system prompts that require citations. The mechanism behind inline citations in AI answers.
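One way to picture the citation-requiring variant is the Python sketch below: it numbers retrieved passages in a prompt, then checks that every bracketed citation in an answer points at a passage that was actually retrieved. The function names, prompt wording, and sample answer are hypothetical, and no model is called; this shows the shape of the check, not any vendor's implementation.

```python
import re


def build_grounded_prompt(question, sources):
    """Number the retrieved passages and instruct the model to cite them."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(sources, start=1))
    return (
        "Answer using only the sources below. "
        "Cite each claim with its source number in square brackets.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {question}"
    )


def invalid_citations(answer, source_count):
    """Return cited numbers that do not match any retrieved source."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return sorted(n for n in cited if not 1 <= n <= source_count)


sources = [
    "GPTBot is OpenAI's web crawler.",
    "ClaudeBot is Anthropic's web crawler.",
]
prompt = build_grounded_prompt("Who operates GPTBot?", sources)

# Hypothetical model output; no model is called in this sketch.
answer = "GPTBot is operated by OpenAI [1], as confirmed elsewhere [3]."
print(invalid_citations(answer, len(sources)))  # [3]
```

A citation to a source that was never retrieved, like `[3]` here, is a cheap signal that part of the answer came from parametric memory rather than the grounding set.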
-
llms.txt
llms.txt is a proposed standard introduced by Jeremy Howard (Answer.AI) in September 2024: a Markdown file at `/llms.txt` that gives LLMs a clean, prioritized index of a site's most useful content. Adoption as of early 2026 is real but not universal; spec lives at llmstxt.org.
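For a sense of what such a file contains, here is a small Python sketch that renders one. It assumes the layout described at llmstxt.org (an H1 title, a blockquote summary, then H2 sections of Markdown links with optional notes); the function name `render_llms_txt` and every URL below are placeholders.

```python
def render_llms_txt(title, summary, sections):
    """Render a Markdown index: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
    return "\n".join(lines) + "\n"


# Hypothetical site content; every URL here is a placeholder.
text = render_llms_txt(
    "Example Docs",
    "Reference material for the Example product.",
    {
        "Docs": [
            ("Quickstart", "https://example.com/quickstart.md", "Install and first run"),
            ("API", "https://example.com/api.md", ""),
        ],
    },
)
print(text)
```

The output would be served as a static file at `/llms.txt`; check the spec itself before relying on any detail of this layout.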
P
-
Perplexity
Perplexity AI is an answer engine that pairs every generated claim with numbered inline citations to web sources. The company was founded in 2022 by a team including Aravind Srinivas. It offers a free tier with usage limits, a Pro tier with model choice and unlimited Pro Search, Spaces for shared collections, and the Sonar API for developers.
-
PerplexityBot
PerplexityBot is the indexing crawler used by Perplexity AI to build its search index. `Perplexity-User` is a separate user-agent that fetches pages on demand when a user asks a question. Distinguishing the two lets you opt out of indexing while remaining citable in live answers.
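The distinction matters most when reading server logs. The Python sketch below sorts User-Agent headers by purpose using the tokens described in this glossary; the purpose labels, the function name `classify_user_agent`, and the simplified header strings are assumptions for illustration, and vendors may change their tokens over time.

```python
# Token -> purpose, as described in this glossary's entries. Illustrative
# and incomplete; not an authoritative registry.
CRAWLER_PURPOSE = {
    "GPTBot": "training",
    "ClaudeBot": "training",
    "CCBot": "open-web archive",
    "Bytespider": "training",
    "PerplexityBot": "indexing",
    "Perplexity-User": "on-demand fetch",
    "ChatGPT-User": "on-demand fetch",
}


def classify_user_agent(user_agent):
    """Return (token, purpose) for the first known token in the header, else None."""
    lowered = user_agent.lower()
    for token, purpose in CRAWLER_PURPOSE.items():
        if token.lower() in lowered:
            return token, purpose
    return None


# Simplified, hypothetical User-Agent strings; real headers are longer.
print(classify_user_agent("Mozilla/5.0 (compatible; PerplexityBot/1.0)"))
print(classify_user_agent("Mozilla/5.0 (compatible; Perplexity-User/1.0)"))
print(classify_user_agent("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"))
```

User-Agent headers can be spoofed, so a match here is a hint about intent rather than proof of who sent the request.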
R
S
-
Search Generative Experience (SGE)
Search Generative Experience (SGE) was Google's opt-in Labs preview of generative AI answers in Search, launched May 10, 2023 at Google I/O. It graduated to general availability as AI Overviews on May 14, 2024. SGE no longer exists as a separate product — references to it now point to AI Overviews.
-
SearchGPT
SearchGPT was OpenAI's prototype answer engine, announced July 25, 2024. It was integrated into ChatGPT as "ChatGPT Search" on October 31, 2024, and rolled out broadly (free tier included) on December 16, 2024. The standalone SearchGPT product no longer exists as a separate surface.