Pure vector search misses exact terms — product codes, error strings, rare names. This day combines Postgres full-text search (tsvector) with semantic search (pgvector) and fuses the two rankings with Reciprocal Rank Fusion, entirely in SQL.
If you finished the beginner track you can already store embeddings in a vector column and run ORDER BY embedding <=> query to get semantic neighbors. That's powerful — but on its own it quietly fails on a whole category of queries.
Vector search retrieves by meaning, which is exactly wrong when the user wants an exact token:
SKU-4471, ORA-00942, CVE-2024-3094. Embedding models tokenize these oddly and place near-duplicates far apart.Keyword search has the opposite failure mode: it nails exact terms but misses synonyms and paraphrases ("car" vs "automobile", "how do I cancel" vs "termination process").
Hybrid search runs both retrievers and fuses their results:
tsvector, ranked by ts_rank / ts_rank_cd (BM25-like term weighting).pgvector nearest-neighbor by cosine or L2 distance.Each retriever produces a ranked list. A fusion step merges the two lists into one final ranking. The result reliably beats either retriever alone on real-world query mixes, because the two cover each other's blind spots.
The nice part for a Postgres shop: you don't need a separate search cluster. A single table can carry both a tsvector column (with a GIN index) and a vector column (with an HNSW or IVFFlat index). Both retrievers, the fusion, and your business filters all live in one SQL query against one transactional store — no sync pipeline, no dual-write consistency problem.
tsvector, tsquery, ts_rank, and the GIN index.pgvector recap and why its scores aren't directly comparable to ts_rank.