Most production RAG queries are not pure vector search; they are vector search plus a WHERE clause. Learn how Postgres combines metadata filters with pgvector's KNN operators, how to read an EXPLAIN ANALYZE plan, and how to avoid the recall cliff that filtered approximate search hides from you.
Day 1 showed how to combine dense vector similarity with sparse keyword matching. But in production, almost no RAG query is only a similarity search. It's a similarity search constrained by metadata: "find the chunks most similar to this question where tenant_id = 42 and doc_type = 'policy' and published_at > now() - interval '90 days'."
The metadata predicates come from the application, not the model.
In Postgres with pgvector this looks deceptively simple:
SELECT id, content
FROM chunks
WHERE tenant_id = 42 AND doc_type = 'policy'
ORDER BY embedding <=> $1
LIMIT 10;
The <=> operator is cosine distance (<-> is L2, <#> is negative inner product). The ORDER BY ... LIMIT is the KNN search. The WHERE is the filter. The hard part — and the subject of this whole day — is what the planner does with those two requirements together.
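To make the three operators concrete, here is a minimal sketch in plain Python of the quantities they are described as computing. The function names and the sample vectors are illustrative assumptions, not part of pgvector; the point is only that in all three cases a smaller value means "more similar", which is why ORDER BY ... LIMIT works as a nearest-neighbor search.

```python
import math

def dot(a, b):
    # Inner product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    # Euclidean distance: the quantity <-> orders by.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def negative_inner_product(a, b):
    # Inner product with the sign flipped, so that "larger inner product"
    # becomes "smaller value": the quantity <#> orders by.
    return -dot(a, b)

def cosine_distance(a, b):
    # 1 - cosine similarity: the quantity <=> orders by.
    # Assumes neither vector is all zeros.
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical 2-D "embeddings" chosen so the results are easy to check by hand.
query = [1.0, 0.0]
same_direction = [2.0, 0.0]   # same direction as the query, twice the length
orthogonal = [0.0, 3.0]       # at a right angle to the query

print(cosine_distance(query, same_direction))
print(cosine_distance(query, orthogonal))
print(l2_distance(query, same_direction))
print(negative_inner_product(query, same_direction))
```

Note that cosine distance ignores vector length (the longer vector pointing the same way is at distance zero), while L2 distance does not.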
Postgres can satisfy that query in fundamentally different ways:
1. Apply the WHERE, sort the survivors by distance, take the top 10. Always correct, but O(N): fine for thousands of rows, not millions.
2. Scan the vector index in distance order, checking the WHERE predicate on each candidate as it comes out.
The second is fast but introduces a subtlety that does not exist in unfiltered search: the index is ordered by distance, not by your filter. The vectors that pass your filter might be scattered deep in the index, and an approximate scan can give up before it finds enough of them. That is the filtered-ANN problem, and the rest of this day is about recognizing and fixing it.
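The difference between filtering before sorting and checking the WHERE predicate on candidates as they arrive in distance order can be seen in a toy sketch. This is not how pgvector or HNSW is implemented: the "index" here is just a fully sorted list, the embeddings are one-dimensional, and the `budget` parameter is a hypothetical stand-in for whatever limit makes an approximate scan stop early. The table contents and the 1% match rate are also assumptions chosen for illustration.

```python
def exact_filtered_knn(rows, query, k, predicate):
    # Strategy 1: apply the filter, sort the survivors by distance, take k.
    survivors = [r for r in rows if predicate(r)]
    survivors.sort(key=lambda r: abs(r["x"] - query))
    return survivors[:k]

def budgeted_index_scan(rows, query, k, predicate, budget):
    # Strategy 2 (toy model): candidates arrive in distance order and the
    # filter is checked on each one, but the scan gives up after `budget`
    # candidates even if fewer than k have passed the filter.
    in_distance_order = sorted(rows, key=lambda r: abs(r["x"] - query))
    hits = []
    for r in in_distance_order[:budget]:
        if predicate(r):
            hits.append(r)
            if len(hits) == k:
                break
    return hits

# Hypothetical table: 10,000 rows, 1-D "embedding" x, and 1% of rows
# belonging to tenant 42 (every 100th row).
rows = [
    {"id": i, "x": float(i), "tenant_id": 42 if i % 100 == 0 else 7}
    for i in range(10_000)
]
is_tenant_42 = lambda r: r["tenant_id"] == 42
query = 0.0

exact = exact_filtered_knn(rows, query, 10, is_tenant_42)
small_budget = budgeted_index_scan(rows, query, 10, is_tenant_42, budget=40)
large_budget = budgeted_index_scan(rows, query, 10, is_tenant_42, budget=1000)

print(len(exact), len(small_budget), len(large_budget))
```

With the small budget the scan runs out of candidates long before it has collected ten matching rows, and nothing in the result signals that rows are missing. With a budget large enough to cover the matching rows, it agrees with the exact strategy.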
Think of an HNSW index as a guided tour of the nearest vectors in distance order. Unfiltered, the tour hands you the 10 closest and stops. Filtered, the tour hands you the closest vectors but you reject the ones that fail your WHERE. If only 1% of rows match your filter, the tour may walk past hundreds of rejected neighbors before finding 10 keepers — or hit its internal search-budget limit first and return too few, or miss the truly-nearest matching rows entirely. Lost recall, silently.
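A rough back-of-the-envelope version of the tour analogy, as a sketch: if matching rows are spread evenly through the distance ordering (a simplifying assumption that real data often violates), the number of candidates the scan must visit grows as k divided by the filter's selectivity. The helper names below are hypothetical, and the budget of 40 is only an example value; check which search-budget setting and default apply to your index type and pgvector version.

```python
def expected_candidates_needed(k, matching_rows, total_rows):
    # Candidates to visit, on average, to collect k rows that pass the filter,
    # assuming matches are spread evenly through the distance ordering.
    # Integer ceiling of k * total_rows / matching_rows.
    return -(-k * total_rows // matching_rows)

def expected_matches_within_budget(budget, matching_rows, total_rows):
    # Average number of filter-passing rows among the first `budget` candidates,
    # under the same even-spread assumption.
    return budget * matching_rows / total_rows

# 1% of a 10,000-row table passes the filter, and the query asks for LIMIT 10.
needed = expected_candidates_needed(k=10, matching_rows=100, total_rows=10_000)
found = expected_matches_within_budget(budget=40, matching_rows=100, total_rows=10_000)

print(needed)  # candidates the tour has to walk past to find 10 keepers
print(found)   # keepers expected if the tour stops after 40 candidates
```

The same arithmetic explains why a filter that passes half the table rarely causes trouble, while a filter that passes 1% can leave a LIMIT 10 query with one row or none.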