Vector search finds passages that sound like the question; a knowledge graph knows how facts actually connect. This day fuses the two into a hybrid GraphRAG retriever — using embeddings to find the right entry points into the graph, then traversing edges to pull in the connected facts an LLM needs to answer multi-hop questions that pure vector search quietly gets wrong.
By now you can extract entities and design an ontology, and the Beginner course showed you a first GraphRAG pipeline. This day asks the harder question: how do you actually retrieve the right context when you have both an embedding index over your text and a knowledge graph over your entities?
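A modern RAG stack usually has two indexes built from the same corpus:

- an embedding (vector) index over the text chunks
- a knowledge graph over the entities and relationships extracted from that text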
They fail in opposite ways. Vector search is fuzzy and forgiving: it finds the relevant paragraph even when the wording differs. But it is flat: it has no idea that "Acme Corp" is owned by "Globex", which employs the person you asked about. The graph knows that connection exactly, but on its own it can't find the entry node from a vague natural-language question.
| Question type | Vector search | Knowledge graph |
|---|---|---|
| "Summarize the refund policy" | Yes — find the relevant passage | No clear entry point |
| "Which suppliers does the parent of Acme use?" | Misses the multi-hop link | Yes — traverse owns → uses |
| "What did the CEO say about layoffs?" | Finds quotes | Resolves "the CEO" to a person node |
| "Themes across the whole dataset" | Returns scattered chunks | Yes — community structure |
The clearest reason to add a graph is multi-hop reasoning. Consider: "Which products are made by companies that my manager's former employer acquired?" That's four hops: you → manager → former employer → acquired companies → products. Pure vector search would need a single chunk that happens to state all four links in one place, and such a chunk almost never exists. The graph answers the question with a path query; vector search then ranks the connected facts so the most relevant ones are used to phrase the answer.
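To make the four-hop example concrete, here is a minimal sketch of a path query over a toy in-memory graph. The triples, the person and company names, and the relation labels (`REPORTS_TO`, `FORMERLY_WORKED_AT`, `ACQUIRED`, `MAKES`) are all invented for illustration; a real system would more likely run an equivalent query in a graph database.

```python
from collections import defaultdict

# Hypothetical toy graph: (subject, relation, object) triples invented for this example.
TRIPLES = [
    ("you", "REPORTS_TO", "Dana"),
    ("Dana", "FORMERLY_WORKED_AT", "Globex"),
    ("Globex", "ACQUIRED", "Acme Corp"),
    ("Globex", "ACQUIRED", "Initech"),
    ("Acme Corp", "MAKES", "Rocket Skates"),
    ("Initech", "MAKES", "TPS Reporter"),
]


def build_index(triples):
    """Index outgoing edges by (subject, relation) for fast lookup."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return index


def follow_path(index, start, relations):
    """Follow one relation per hop, keeping every node reached at each step."""
    frontier = [start]
    for relation in relations:
        next_frontier = []
        for node in frontier:
            for neighbor in index.get((node, relation), []):
                if neighbor not in next_frontier:  # de-duplicate, keep order
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return frontier


INDEX = build_index(TRIPLES)
products = follow_path(
    INDEX, "you", ["REPORTS_TO", "FORMERLY_WORKED_AT", "ACQUIRED", "MAKES"]
)
print(products)  # ['Rocket Skates', 'TPS Reporter']
```

No single triple mentions both "you" and a product, which is the situation a chunk-level vector index struggles with; the traversal stitches the four separate facts together.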
The insight that drives the rest of this day: use embeddings to find where to start in the graph, then use the graph's structure to gather what's connected. Neither alone is enough.
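The two-step idea can be sketched in a few lines. This is an illustrative sketch only: the node descriptions and edges are made up, and `embed` is a bag-of-words stand-in for a real embedding model, so the similarity scores are much cruder than what a production index would give.

```python
import math
import re
from collections import Counter, deque

# Hypothetical node descriptions and edges, invented for this example.
NODE_TEXT = {
    "Acme Corp": "Acme Corp manufacturer of rocket skates",
    "Globex": "Globex holding company parent owner",
    "Initrode": "Initrode supplier of steel bearings",
    "Refund Policy": "refund policy returns within thirty days",
}
EDGES = {
    "Globex": [("OWNS", "Acme Corp"), ("USES", "Initrode")],
    "Acme Corp": [("OWNED_BY", "Globex")],
}


def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def entry_nodes(question, k=1):
    """Step 1: use similarity to pick where to start in the graph."""
    q = embed(question)
    ranked = sorted(
        NODE_TEXT, key=lambda n: cosine(q, embed(NODE_TEXT[n])), reverse=True
    )
    return ranked[:k]


def expand(starts, max_hops=2):
    """Step 2: breadth-first traversal collecting facts within max_hops."""
    seen = set(starts)
    queue = deque((s, 0) for s in starts)
    facts = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, neighbor in EDGES.get(node, []):
            facts.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts


def hybrid_retrieve(question, k=1, max_hops=2):
    return expand(entry_nodes(question, k), max_hops)


facts = hybrid_retrieve("Which suppliers does the parent of Acme use?")
for fact in facts:
    print(fact)
```

In this toy setup the question matches the "Acme Corp" node best, and the two-hop expansion surfaces the `("Globex", "USES", "Initrode")` fact even though the question never mentions Globex or Initrode. The hop limit is the main knob: too small and the connected fact is missed, too large and the context fills with unrelated neighbors.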