Indexing Vectors: HNSW & IVFFlat

Why an unindexed similarity search slows to a crawl, the two index types pgvector ships — HNSW and IVFFlat — and the knobs that govern each (m, ef_construction, ef_search vs lists, probes). You will build an index per distance operator with the correct opclass and learn to navigate the recall / speed / build-time tradeoff so you know which index wins for your data.


Why the Sequential Scan Gets Slow

On Day 3 you ran your first similarity search and it felt instant. That was a lie of small data. With no vector index, every ORDER BY embedding <=> $1 query is a sequential scan: Postgres reads every row in the table, computes the distance between the query vector and that row's vector, then sorts the lot to find the top K.
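To make the "read every row, compute a distance, keep the top K" loop concrete, here is a minimal Python sketch of what an exact scan does conceptually. It is an illustration, not how Postgres implements it: the names cosine_distance and exact_top_k and the toy 3-dim rows are assumptions made up for this example.

```python
import heapq
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the quantity <=> orders by.

    Assumes neither vector is all zeros; a zero vector would divide by zero.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_top_k(rows, query, k):
    """rows is an iterable of (id, vector) pairs.

    Every row is visited once, so the work grows with the row count.
    Returns the k nearest rows as (distance, id) pairs, nearest first.
    """
    scored = ((cosine_distance(vec, query), row_id) for row_id, vec in rows)
    return heapq.nsmallest(k, scored)


# Toy data: hypothetical ids and 3-dim vectors, for illustration only.
rows = [
    (1, [1.0, 0.0, 0.0]),
    (2, [0.0, 1.0, 0.0]),
    (3, [0.9, 0.1, 0.0]),
    (4, [-1.0, 0.0, 0.0]),
]
nearest = exact_top_k(rows, [1.0, 0.0, 0.0], k=2)
print([row_id for _, row_id in nearest])  # [1, 3]
```

The result is always the true top K because nothing is skipped, which is exactly why the cost cannot drop below one distance computation per row.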

The Cost Is Linear in Rows × Dimensions

A single cosine distance between two 1536-dim vectors takes about 1536 multiply-adds for the dot product, plus comparable work for the two vector norms. Do that for every row, then sort. The work scales as O(rows × dimensions):

Rows          Exact-scan latency (rough)
10,000        a few milliseconds
100,000       tens of milliseconds
1,000,000     hundreds of milliseconds to seconds
10,000,000    seconds to minutes (unusable for an API)

The exact numbers depend on dimension, hardware, and whether the vectors fit in RAM, but the shape is the trap: it is fine in development and falls over the moment real data arrives.
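The linear shape is easy to check with back-of-the-envelope arithmetic. The sketch below counts per-dimension steps rather than time, since real latency depends on the hardware and caching factors just mentioned. scan_work is a hypothetical helper introduced only for this illustration.

```python
def scan_work(rows, dims):
    """Per-dimension steps for one exact scan: every row times every dimension.

    This deliberately ignores constant factors (norms, sorting, I/O); it only
    shows how the work grows, not how long it takes.
    """
    return rows * dims


DIMS = 1536  # the embedding width used in the text above

for rows in (10_000, 100_000, 1_000_000, 10_000_000):
    print(f"{rows:>10,} rows -> {scan_work(rows, DIMS):>14,} steps per query")
```

Each tenfold increase in rows means ten times the work for every single query, which is why a table that feels instant at 10,000 rows can become unusable at 10,000,000.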

Confirm It With EXPLAIN

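Never guess whether an index is used; ask Postgres. EXPLAIN ANALYZE runs the query and shows the plan Postgres actually chose, with timings. Before you build an index you will see a Seq Scan; after, you should see an Index Scan, provided the planner decides the index is worth using.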

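-- The 3-dim literal is shortened for readability; a real query vector
-- must have the same number of dimensions as the embedding column.
EXPLAIN ANALYZE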
SELECT id FROM items
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;
-- Before index:  Seq Scan on items  (... rows=1000000 ...)
-- After index:   Index Scan using items_embedding_idx ...

Exact vs Approximate

The sequential scan is exact: it always returns the true K nearest neighbors. A vector index is approximate (ANN — approximate nearest neighbor). It trades a small amount of correctness for an enormous speedup, walking a smart data structure instead of every row. The measure of that correctness is recall: of the true top-K, what fraction did the index actually return? Recall of 0.95 means you got 95% of the genuinely-closest results. That tradeoff — speed for recall — is the entire topic of this lesson.
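Recall as defined above can be computed directly once you have both result lists for the same query: the ids from an exact scan and the ids from the index. The following is a minimal sketch; recall_at_k and the id lists are hypothetical and exist only for illustration.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true top-K ids that also appear in the approximate result."""
    truth = set(exact_ids)
    if not truth:
        raise ValueError("exact_ids must not be empty")
    return len(truth & set(approx_ids)) / len(truth)


# Hypothetical top-10 ids for one query. The approximate result misses id 35
# and returns id 99 in its place.
exact = [11, 42, 7, 93, 58, 3, 76, 21, 64, 35]
approx = [11, 42, 7, 93, 58, 3, 76, 21, 64, 99]

print(recall_at_k(exact, approx))  # 0.9
```

Order within the top K is ignored here; only membership counts. In practice you would average this over many sample queries rather than judge an index by a single one.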

When You Do NOT Need an Index

Under ~10,000 rows, a sequential scan is often as fast as an approximate index, and it is always exact. Building an index has a cost (time and storage), and on tiny tables the planner may ignore it anyway. Index when scans get slow, not before.

Key Takeaways
  • With no vector index, every similarity query is a sequential scan: O(rows × dimensions) — fine in dev, fatal in prod
  • A vector index is approximate (ANN): it trades a little recall for a huge speedup
  • Recall = fraction of the true top-K that the index actually returns; it's the number you tune against speed

