
Hybrid Retrieval: Graph + Vector

Vector search finds passages that sound like the question; a knowledge graph knows how facts actually connect. This day fuses the two into a hybrid GraphRAG retriever — using embeddings to find the right entry points into the graph, then traversing edges to pull in the connected facts an LLM needs to answer multi-hop questions that pure vector search quietly gets wrong.


Why Combine Vector and Graph

By now you can extract entities and design an ontology, and the Beginner course showed you a first GraphRAG pipeline. This day asks the harder question: how do you actually retrieve the right context when you have both an embedding index over your text and a knowledge graph over your entities?

The Two Retrieval Modalities

A modern RAG stack usually has two indexes built from the same corpus:

  • A vector index — every chunk (or entity description) embedded into a dense vector, searched by cosine similarity. It answers "what text is semantically about this query?"
  • A knowledge graph — entities as nodes, relationships as typed edges, searched by traversal. It answers "what is connected to this thing, and how?"

They fail in opposite ways. Vector search is fuzzy and forgiving — it finds the relevant paragraph even when the wording differs — but it's flat: it has no idea that "Acme Corp" is owned by "Globex" which employs the person you asked about. The graph knows that connection exactly, but it can't find the entry node from a vague natural-language question on its own.
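To make the contrast concrete, here is a minimal sketch of the two indexes side by side. Everything in it is illustrative: the three-dimensional vectors are hand-written stand-ins for real embeddings, the chunk texts are invented, and "Dana Reyes" is a hypothetical person node. Only the Acme Corp / Globex relationship comes from the example above.

```python
import math

# Hypothetical toy data: hand-written 3-d vectors stand in for real embeddings.
CHUNKS = {
    "Refunds are issued within 30 days of purchase.": [0.9, 0.1, 0.0],
    "Acme Corp reported record revenue this year.": [0.1, 0.9, 0.2],
    "Globex completed its purchase of Acme Corp.": [0.0, 0.7, 0.7],
}

# Knowledge graph stored as (source node, relation) -> list of target nodes.
# "Dana Reyes" is a made-up person used only for illustration.
EDGES = {
    ("Globex", "owns"): ["Acme Corp"],
    ("Globex", "employs"): ["Dana Reyes"],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec, k=1):
    """Return the k chunks whose vectors are most similar to the query."""
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, CHUNKS[c]), reverse=True)
    return ranked[:k]


def neighbors(node, relation):
    """Return the nodes reached from `node` over edges of type `relation`."""
    return EDGES.get((node, relation), [])


# Vector search: fuzzy match on meaning, no notion of connections.
top_chunk = vector_search([1.0, 0.0, 0.0])

# Graph lookup: exact typed connections, but the start node must be known.
owned = neighbors("Globex", "owns")
staff = neighbors("Globex", "employs")
```

Note what each call needs as input: `vector_search` takes a query vector and needs no node name, while `neighbors` is exact but useless until something tells it to start at "Globex".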

What Each Modality Is Good At

Question type | Vector wins | Graph wins
"Summarize the refund policy" | Yes: find the relevant passage | No clear entry point
"Which suppliers does the parent of Acme use?" | Misses the multi-hop link | Yes: traverse owns → uses
"What did the CEO say about layoffs?" | Finds quotes | Resolves "the CEO" to a person node
"Themes across the whole dataset" | Returns scattered chunks | Yes: community structure

Multi-Hop Is the Killer Case

The clearest reason to add a graph is multi-hop reasoning. Consider: "Which products are made by companies that my manager's former employer acquired?" That's four hops across five nodes: you → manager → former employer → acquired companies → products. Pure vector search would need a single chunk that happens to state all four links in one place, and such a chunk almost never exists. The graph answers the question with a path query; vector search then ranks the connected facts by relevance to the question so the most useful ones are used to phrase the answer.
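The four-hop question can be sketched as a path query over typed edges. This is a minimal in-memory illustration, not a graph database query: the relation names (`reports_to`, `formerly_employed_by`, `acquired`, `makes`) and every entity except the question's structure are hypothetical.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object) for illustration only.
TRIPLES = [
    ("you", "reports_to", "Morgan"),
    ("Morgan", "formerly_employed_by", "Initech"),
    ("Initech", "acquired", "Acme Corp"),
    ("Initech", "acquired", "Birchwood Labs"),
    ("Acme Corp", "makes", "Rocket Skates"),
    ("Birchwood Labs", "makes", "Soil Sensor"),
    ("Globex", "makes", "Unrelated Gadget"),
]


def build_index(triples):
    """Index triples as (subject, relation) -> list of objects."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return index


def follow_path(index, start, relations):
    """Follow one relation per hop, keeping every node reached at each step."""
    frontier = [start]
    for relation in relations:
        next_frontier = []
        for node in frontier:
            for target in index.get((node, relation), []):
                if target not in next_frontier:
                    next_frontier.append(target)
        frontier = next_frontier
    return frontier


index = build_index(TRIPLES)
products = follow_path(
    index,
    "you",
    ["reports_to", "formerly_employed_by", "acquired", "makes"],
)
```

With this toy data, `products` contains the products of both acquired companies and excludes the Globex product, even though no single triple (or chunk) states the whole chain.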

The insight that drives the rest of this day: use embeddings to find where to start in the graph, then use the graph's structure to gather what's connected. Neither alone is enough.
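One way to sketch that idea: rank entity nodes by embedding similarity to pick entry points, then run a bounded breadth-first expansion to collect the connected facts. The vectors, the relation names, the `located_in` fact, and the `max_hops` parameter are assumptions made for this example; a real pipeline would use a trained embedding model and a graph store.

```python
import math
from collections import deque

# Hypothetical entity embeddings: hand-written vectors, not model output.
ENTITY_VECTORS = {
    "Acme Corp": [0.9, 0.1, 0.0],
    "Globex": [0.1, 0.9, 0.0],
    "Initech": [0.0, 0.1, 0.9],
}

# Hypothetical facts as (subject, relation, object) triples.
TRIPLES = [
    ("Acme Corp", "owned_by", "Globex"),
    ("Globex", "uses_supplier", "Initech"),
    ("Initech", "located_in", "Austin"),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def entry_nodes(query_vec, k=1):
    """Step 1: use embeddings to choose where to start in the graph."""
    ranked = sorted(
        ENTITY_VECTORS,
        key=lambda name: cosine(query_vec, ENTITY_VECTORS[name]),
        reverse=True,
    )
    return ranked[:k]


def expand(seeds, max_hops=2):
    """Step 2: breadth-first traversal collecting facts within max_hops."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    facts = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for subject, relation, obj in TRIPLES:
            if subject == node:
                facts.append((subject, relation, obj))
                if obj not in seen:
                    seen.add(obj)
                    queue.append((obj, depth + 1))
    return facts


def hybrid_retrieve(query_vec, k=1, max_hops=2):
    """Embeddings pick the entry points; the graph gathers what is connected."""
    return expand(entry_nodes(query_vec, k), max_hops)


# A query vector assumed to sit close to "Acme Corp" in embedding space.
context = hybrid_retrieve([1.0, 0.0, 0.0], k=1, max_hops=2)
```

The hop limit is the main design choice in this sketch: with `max_hops=2` the retriever returns the ownership and supplier facts but stops before the supplier's location, which keeps the context passed to the LLM small.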

Key Takeaways
  • Vector search is fuzzy and forgiving but flat; the graph knows exact, typed connections but needs an entry point
  • Their strengths are complementary — combine, don't choose
  • Multi-hop questions are the killer case: a path query answers what no single chunk states

Course Stats

Estimated time: 50 min
Lessons: 5 sections