Vector search
Vector search is retrieval by similarity in a high-dimensional embedding space, typically using cosine similarity or dot product. It returns semantically related documents even when no keywords match, which makes it the retrieval layer beneath most RAG pipelines and modern semantic search products.
Long definition
Classic keyword search ranks documents by how their lexical terms match a query, weighted by frequency and rarity (TF-IDF, BM25). Vector search does something different: it represents both the query and every document as a vector (an embedding) in the same semantic space, then returns the documents whose vectors sit closest to the query vector.
The closeness measure is almost always one of the following (all three are sketched in code after the list):
- Cosine similarity — angle between vectors, ignoring magnitude. Default for most embedding models.
- Dot product — cosine times magnitudes. Equivalent to cosine on normalized vectors and faster to compute.
- Euclidean (L2) distance — straight-line distance. Used in some legacy vector setups; less common for normalized embeddings.
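For concreteness, here is a minimal numpy sketch of the three measures; the vectors are toy values standing in for real embeddings. The final assertion demonstrates the equivalence claim above: on unit-normalized vectors, cosine and dot product agree.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Angle-based: magnitude is divided out.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def dot(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine scaled by both magnitudes; cheapest to compute.
    return float(a @ b)

def l2(a: np.ndarray, b: np.ndarray) -> float:
    # Straight-line distance; lower means closer.
    return float(np.linalg.norm(a - b))

q = np.array([0.3, 0.9, 0.1])
d = np.array([0.2, 0.8, 0.3])

# On unit-normalized vectors, cosine and dot product are identical,
# and L2 distance becomes a monotonic function of them.
qn, dn = q / np.linalg.norm(q), d / np.linalg.norm(d)
assert abs(cosine(qn, dn) - dot(qn, dn)) < 1e-12
```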
The advantage is semantic. Query "low-impact running shoes for flat feet" and a document titled "stability trainers for fallen arches" will rank highly even though the lexical overlap is minimal. The disadvantage is the inverse: exact-match queries (product codes, error messages, proper nouns the model has never seen) underperform. Production systems almost always run hybrid retrieval — BM25 plus vector — and merge the two with a re-ranker.
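One common way to merge the two ranked lists before re-ranking is reciprocal rank fusion (RRF). The sketch below assumes you already have a BM25 ranking and a vector ranking as lists of document IDs; the IDs and the k=60 constant are illustrative, not from the text, and a cross-encoder re-ranker would typically rescore the fused list afterward.

```python
from collections import defaultdict

def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["doc7", "doc2", "doc9"]  # lexical ranking (hypothetical IDs)
vector_hits = ["doc2", "doc4", "doc7"]  # semantic ranking
print(rrf_merge([bm25_hits, vector_hits]))  # doc2 and doc7 rise to the top
```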
The tooling landscape splits four ways:
- Embedded in the database: the pgvector extension for Postgres, plus native vector types in MongoDB, Elasticsearch, Redis, and Cassandra. Cheapest path when you already run the DB (see the sketch after this list).
- Managed pure vector services: Pinecone, Weaviate Cloud, Qdrant Cloud, Vespa Cloud. Optimized for scale; integrate with most RAG frameworks out of the box.
- Self-hosted vector engines: Qdrant, Weaviate, Milvus, Vespa, FAISS as a library. Full control, ops cost.
- Hybrid search platforms: Typesense, Meilisearch, OpenSearch — keyword-first systems that added vector search.
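To illustrate the embedded-in-the-database path, here is a hedged pgvector sketch using psycopg 3. The connection string, table name, and the 3-dimension toy embedding are assumptions for brevity; `<=>` is pgvector's cosine-distance operator.

```python
import psycopg  # psycopg 3; connection string and schema are hypothetical

conn = psycopg.connect("dbname=app user=app")

# One-time setup: enable the extension and store embeddings next to the rows.
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
conn.execute("""
    CREATE TABLE IF NOT EXISTS docs (
        id bigserial PRIMARY KEY,
        body text,
        embedding vector(3)  -- real models use hundreds to thousands of dims
    )
""")

# Query: order by cosine distance to the query embedding (lower is closer).
query_embedding = "[0.3, 0.9, 0.1]"  # would come from your embedding model
rows = conn.execute(
    "SELECT id, body FROM docs ORDER BY embedding <=> %s::vector LIMIT 5",
    (query_embedding,),
).fetchall()
```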
For RAG, vector search is the retrieval step before the generation step. The LLM never sees your full corpus; it sees the top-k documents the vector search returned. Quality of retrieval caps quality of the answer — a hallucinating model often started with a bad top-k.
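To make the top-k point concrete, here is a brute-force retrieval sketch over random stand-in embeddings. Real systems replace the matrix multiply with an approximate-nearest-neighbor index, but the contract is the same: only the returned indices ever reach the prompt.

```python
import numpy as np

def top_k(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int) -> np.ndarray:
    """Brute-force retrieval: dot-product scores, then the k best indices."""
    scores = doc_matrix @ query_vec           # (n_docs,) similarity scores
    idx = np.argpartition(-scores, k)[:k]     # unordered top-k in O(n)
    return idx[np.argsort(-scores[idx])]      # sorted best-first

rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 384))       # synthetic stand-in embeddings
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
query = corpus[42] + 0.1 * rng.normal(size=384)
query /= np.linalg.norm(query)

context_ids = top_k(query, corpus, k=5)       # only these reach the LLM
```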
Practical scaling note: under 1M documents, almost any tool works. Above 100M, indexing strategy (HNSW vs IVF, quantization, sharding) starts to matter, and the choice between Pinecone-class managed services and self-hosted Qdrant or Vespa becomes load-bearing.
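A sketch of the two index families mentioned above, using FAISS as the library stand-in. The parameter values (M=32, nlist=256, nprobe=8) and the synthetic data are illustrative starting points, not recommendations from the text; the key structural difference is that IVF requires a training pass and HNSW does not.

```python
import numpy as np
import faiss  # assumes faiss-cpu is installed

d, n = 384, 20_000
xb = np.random.default_rng(0).normal(size=(n, d)).astype("float32")
faiss.normalize_L2(xb)  # normalized, so inner product == cosine

# HNSW: graph index, no training step, more memory, strong recall/latency.
hnsw = faiss.IndexHNSWFlat(d, 32)  # 32 = graph connectivity (M)
hnsw.add(xb)

# IVF: clusters vectors into nlist buckets, must be trained first;
# cheaper memory, recall depends on how many buckets you probe.
quantizer = faiss.IndexFlatIP(d)
ivf = faiss.IndexIVFFlat(quantizer, d, 256, faiss.METRIC_INNER_PRODUCT)
ivf.train(xb)
ivf.add(xb)
ivf.nprobe = 8  # buckets searched per query: the recall/speed dial

distances, ids = ivf.search(xb[:1], 10)  # top-10 for one query vector
```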
Common misconceptions
- "Vector search replaces keyword search." It doesn't. Hybrid retrieval (BM25 + vector + re-ranker) consistently outperforms pure vector search on real query distributions, especially for rare terms, codes, and proper nouns.
- "All vector databases are equivalent." They aren't. pgvector at 1M docs and Pinecone at 1M docs are similar; pgvector at 100M docs without good HNSW tuning is not. Pick by scale and ops profile.
- "Vector search is fuzzy keyword search." Different mechanism. Fuzzy keyword tolerates typos in lexical match. Vector search retrieves on meaning regardless of typing — "Maria's recipe for paella" can return a document titled "traditional Valencian rice dish" with no lexical overlap.
- "You only need vector search for RAG." RAG is the headline use case but not the only one. Internal site search, product recommendations, duplicate detection, content clustering, and content-gap analysis all run on the same primitive.