Back to Courses

Metadata Filtering & Query Planning

Most production RAG queries are not pure vector search — they are vector search plus a WHEREclause. Learn how Postgres combines metadata filters with pgvector's KNN operators, how to read an EXPLAIN ANALYZE plan, and how to avoid the recall cliff that filtered approximate search hides from you.

Day 2 Progress0%

Filtered Vector Search: The Real Query

Day 1 showed how to combine dense vector similarity with sparse keyword matching. But in production, almost no RAG query is only a similarity search. It's a similarity search constrained by metadata: "find the chunks most similar to this question where tenant_id = 42 and doc_type = 'policy' and published_at > now() - interval '90 days'."

Why Filters Are Everywhere

The metadata predicates come from the application, not the model:

  • Multi-tenancy — never leak tenant A's documents into tenant B's results
  • Access control — only return rows the current user is allowed to see
  • Freshness — exclude stale or superseded versions
  • Faceting — language, source, category, region

In Postgres with pgvector this looks deceptively simple:

SELECT id, content
FROM chunks
WHERE tenant_id = 42 AND doc_type = 'policy'
ORDER BY embedding <=> $1
LIMIT 10;

The <=> operator is cosine distance (<-> is L2, <#> is negative inner product). The ORDER BY ... LIMIT is the KNN search. The WHERE is the filter. The hard part — and the subject of this whole day — is what the planner does with those two requirements together.

Two Ways to Run the Same Query

Postgres can satisfy that query in fundamentally different ways:

  1. Exact / sequential: scan rows, apply the WHERE, sort the survivors by distance, take the top 10. Always correct, but O(N) — fine for thousands of rows, not millions.
  2. Approximate index scan: walk an HNSW or IVFFlat index ordered by distance, and check the WHERE predicate on each candidate as it comes out.

The second is fast but introduces a subtlety that does not exist in unfiltered search: the index is ordered by distance, not by your filter. The vectors that pass your filter might be scattered deep in the index, and an approximate scan can give up before it finds enough of them. That is the filtered-ANN problem, and the rest of this day is about recognizing and fixing it.

The Mental Model

Think of an HNSW index as a guided tour of the nearest vectors in distance order. Unfiltered, the tour hands you the 10 closest and stops. Filtered, the tour hands you the closest vectors but you reject the ones that fail your WHERE. If only 1% of rows match your filter, the tour may walk past hundreds of rejected neighbors before finding 10 keepers — or hit its internal search-budget limit first and return too few, or miss the truly-nearest matching rows entirely. Lost recall, silently.

Key Takeaways
  • Production RAG queries are KNN search (ORDER BY embedding <=> $1 LIMIT k) plus a metadata WHERE clause
  • Postgres can answer them exactly (seq/sort, always correct, O(N)) or approximately (vector index scan, fast, recall-sensitive)
  • Combining a distance-ordered index with an unrelated filter is the filtered-ANN problem — the core challenge of the day

AI Learning Assistant

Powered by advanced LLM

Get personalized help with concepts, code examples, and explanations tailored to your learning pace.

Course Stats

Estimated Time
50 min
Lessons
5 sections