Hybrid Retrieval: Graph + Vector

Why Combine Vector and Graph

By now you can extract entities and design an ontology, and the Beginner course showed you a first GraphRAG pipeline. This day asks the harder question: how do you actually retrieve the right context when you have both an embedding index over your text and a knowledge graph over your entities?

The Two Retrieval Modalities

A modern RAG stack usually has two indexes built from the same corpus:

A vector index — every chunk (or entity description) embedded into a dense vector, searched by cosine similarity. It answers "what text is semantically about this query?"
A knowledge graph — entities as nodes, relationships as typed edges, searched by traversal. It answers "what is connected to this thing, and how?"

They fail in opposite ways. Vector search is fuzzy and forgiving — it finds the relevant paragraph even when the wording differs — but it's flat: it has no idea that "Acme Corp" is owned by "Globex" which employs the person you asked about. The graph knows that connection exactly, but it can't find the entry node from a vague natural-language question on its own.
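The two indexes and their opposite failure modes can be sketched with toy data. This is a minimal illustration, not a production setup: the three-dimensional vectors are hand-written stand-ins for a real embedding model, and the chunk ids, chunk texts, "Dana Lee", "BoltWorks", and the helper names `vector_search` and `neighbors` are all hypothetical.

```python
import math

# Toy vector index: chunk id -> (hand-written stand-in embedding, chunk text).
# A real system would get these vectors from an embedding model.
CHUNKS = {
    "refund-policy": ([0.9, 0.1, 0.0], "How refunds are requested and approved."),
    "acme-overview": ([0.1, 0.9, 0.2], "Acme Corp builds industrial robots."),
    "globex-news": ([0.0, 0.6, 0.8], "Globex announced a new acquisition."),
}

# Toy knowledge graph: (source, relation, target) triples with typed edges.
TRIPLES = [
    ("Globex", "owns", "Acme Corp"),
    ("Globex", "employs", "Dana Lee"),
    ("Acme Corp", "uses", "BoltWorks"),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec, k=1):
    """Return the ids of the k chunks most similar to the query vector."""
    ranked = sorted(CHUNKS, key=lambda cid: cosine(query_vec, CHUNKS[cid][0]), reverse=True)
    return ranked[:k]


def neighbors(node):
    """Return (relation, other_node) pairs for edges touching an exact node name."""
    outgoing = [(rel, dst) for src, rel, dst in TRIPLES if src == node]
    incoming = [(f"inverse:{rel}", src) for src, rel, dst in TRIPLES if dst == node]
    return outgoing + incoming
```

In this sketch `vector_search` always returns its nearest chunk, however loosely the query is worded, but the result carries no links to other entities. `neighbors("Acme Corp")` returns exact typed connections, while `neighbors("the parent of Acme")` returns an empty list because the graph lookup needs the exact node name as its entry point.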

What Each Modality Is Good At

Question type	Vector search	Graph
"Summarize the refund policy"	Yes — find the relevant passage	No clear entry point
"Which suppliers does the parent of Acme use?"	Misses the multi-hop link	Yes — traverse owns → uses
"What did the CEO say about layoffs?"	Finds quotes	Resolves "the CEO" to a person node
"Themes across the whole dataset"	Returns scattered chunks	Yes — community structure

Multi-Hop Is the Killer Case

The clearest reason to add a graph is multi-hop reasoning. Consider: "Which products are made by companies that my manager's former employer acquired?" That's four hops: you → manager → former employer → acquired companies → products. Pure vector search would need a single chunk that happens to state all four links in one place, which almost never exists. The graph answers the question with a path query; vector search can then rank the connected facts by relevance to the question before the answer is phrased.
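A path query of this kind can be sketched as a walk that follows one relation type per hop. The example below is only an illustration under assumed data: the people, companies, products, relation names, and the helpers `build_index` and `follow_path` are hypothetical, and a real graph database would express the same thing in its own query language.

```python
from collections import defaultdict

# Hypothetical triples: (source, relation, target).
EDGES = [
    ("me", "reports_to", "Priya"),
    ("Priya", "formerly_worked_at", "Globex"),
    ("Globex", "acquired", "Acme Corp"),
    ("Globex", "acquired", "Initech"),
    ("Acme Corp", "makes", "RoboArm"),
    ("Initech", "makes", "TPS Reporter"),
    ("Globex", "makes", "GlobexOS"),  # made by the employer itself, not by an acquired company
]


def build_index(edges):
    """Index targets by (source, relation) so each hop is a dictionary lookup."""
    index = defaultdict(list)
    for src, rel, dst in edges:
        index[(src, rel)].append(dst)
    return index


def follow_path(index, start, relations):
    """Follow one relation per hop from start; return the nodes reached at the end."""
    frontier = [start]
    for rel in relations:
        next_frontier = []
        for node in frontier:
            for dst in index.get((node, rel), []):
                if dst not in next_frontier:  # keep order, drop duplicates
                    next_frontier.append(dst)
        frontier = next_frontier
    return frontier


index = build_index(EDGES)
# Four hops: me -> manager -> former employer -> acquired companies -> products.
products = follow_path(index, "me", ["reports_to", "formerly_worked_at", "acquired", "makes"])
print(products)  # ['RoboArm', 'TPS Reporter']
```

No single triple mentions both "me" and "RoboArm"; the answer exists only as a chain of four edges, which is the situation a flat chunk index struggles with.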

The insight that drives the rest of this day: use embeddings to find where to start in the graph, then use the graph's structure to gather what's connected. Neither alone is enough.
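As a rough sketch of that two-step idea, the code below picks the entry node by embedding similarity and then expands outward through the graph. Everything here is an assumption made for illustration: the two-dimensional vectors stand in for real entity-description embeddings, and the entity names, relation names, and the functions `find_entry`, `expand`, and `hybrid_retrieve` are hypothetical rather than any library's API.

```python
import math
from collections import deque

# Stand-in embeddings of entity descriptions (a real system would use a model).
ENTITY_VECS = {
    "Acme Corp": [0.9, 0.1],
    "Globex": [0.2, 0.9],
    "BoltWorks": [0.6, 0.5],
}

# Hypothetical knowledge-graph triples: (source, relation, target).
TRIPLES = [
    ("Globex", "owns", "Acme Corp"),
    ("Acme Corp", "uses", "BoltWorks"),
    ("Globex", "employs", "Dana Lee"),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_entry(query_vec):
    """Step 1: use embeddings to choose the graph node to start from."""
    return max(ENTITY_VECS, key=lambda name: cosine(query_vec, ENTITY_VECS[name]))


def expand(entry, max_hops=1):
    """Step 2: breadth-first walk that collects triples within max_hops of entry."""
    seen = {entry}
    facts = []
    queue = deque([(entry, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for src, rel, dst in TRIPLES:
            if node not in (src, dst):
                continue
            if (src, rel, dst) not in facts:
                facts.append((src, rel, dst))
            other = dst if src == node else src
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return facts


def hybrid_retrieve(query_vec, max_hops=1):
    """Embeddings pick the start node; graph structure gathers what is connected."""
    entry = find_entry(query_vec)
    return entry, expand(entry, max_hops)
```

Calling `hybrid_retrieve([1.0, 0.0])` in this toy setup selects "Acme Corp" as the entry node and returns the two triples that touch it; raising `max_hops` to 2 also pulls in the "employs" fact one step further out. The hop limit is the main design knob: a larger value gathers more context but also more facts that may be irrelevant to the question.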

Key Takeaways

Vector search is fuzzy and forgiving but flat; the graph knows exact, typed connections but needs an entry point
Their strengths are complementary — combine, don't choose
Multi-hop questions are the killer case: a path query answers what no single chunk states
