Name: Production GraphRAG at Scale

From Static Retrieval to Agentic Traversal

The beginner course built a GraphRAG pipeline that does this: extract entities from the question, look them up in the graph, pull a fixed k-hop neighborhood, dump it into the prompt. That works for "who is the CEO of Acme?" It falls apart on "which suppliers of Acme's competitors had a recall in the last two years?" — a question whose answer lives several hops away along a path you can't know in advance.

Why a Fixed Neighborhood Isn't Enough

A static k-hop fetch has two failure modes that pull in opposite directions:

Too shallow. The answer is 4 hops away but you fetched 2. It's simply not in the context, and the LLM either hallucinates or says "I don't know."
Too deep. You fetch 3 hops "to be safe," and a single well-connected node (a country, a popular product category, a hub author) explodes the neighborhood into tens of thousands of nodes. This is supernode blow-up, and it wrecks your token budget and your latency at the same time.
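
Both failure modes can show up in the same fetch. The sketch below is a toy illustration, not the beginner course's pipeline: the graph, the node names, and the `k_hop` helper are all made up for the example. It runs a plain breadth-first k-hop fetch on a graph where the answer sits 4 hops from the seed and a hub node sits 1 hop away.

```python
from collections import deque

def k_hop(adj, seed, k):
    """Return every node within k hops of seed, using breadth-first search."""
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == k:
            continue  # do not expand past the hop limit
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, depth + 1))
    return seen

# Hypothetical toy graph: a 4-hop chain to the answer, plus a hub ("usa")
# directly attached to the seed with 10,000 neighbors of its own.
adj = {
    "acme": ["rival", "usa"],
    "rival": ["supplier"],
    "supplier": ["recall"],
    "recall": ["answer"],
    "usa": [f"company_{i}" for i in range(10_000)],
}

for k in (1, 2, 3, 4):
    hood = k_hop(adj, "acme", k)
    print(k, len(hood), "answer" in hood)
```

In this toy graph, k=2 already pulls in the hub's 10,000 neighbors while still missing the answer, so the same fetch is too deep and too shallow at once.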

You cannot pick one k that is right for every question. The fix is to stop pre-deciding the shape of the subgraph and instead let the LLM decide where to go next, one step at a time.

The Agentic Loop

Agentic (or iterative) GraphRAG reframes retrieval as a control loop:

Seed. Resolve the question's entities to graph nodes (this is the entity-linking step from earlier days). Those nodes are your starting frontier.
Observe. Summarize the current frontier — node labels, the relationship types available from here, maybe a one-line description per neighbor. This is what the LLM "sees."
Decide. The LLM, given the question and the observation, chooses an action: expand along a specific relationship type, follow a specific node, answer now, or give up.
Act. Execute the chosen graph operation, updating the frontier and the accumulated evidence.
Repeat until the LLM answers, or a budget (hops, nodes, tokens, time) is exhausted.
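
The loop above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the triples, the `SUPPLIED_BY` and `HAD_RECALL` relationship names, and every function name are hypothetical, and `scripted_decide` is a hard-coded stand-in for the LLM call so the example runs without a model. Only a hop budget is enforced here; node, token, and time budgets would be checked at the same point.

```python
from dataclasses import dataclass, field

# Hypothetical toy graph as (source, relationship type, target) triples.
TRIPLES = [
    ("Acme", "COMPETES_WITH", "Bolt"),
    ("Acme", "HEADQUARTERED_IN", "USA"),
    ("Bolt", "SUPPLIED_BY", "Cogs"),
    ("Cogs", "HAD_RECALL", "Recall-2023"),
]

def edge_types(node):
    """Relationship types leaving a node."""
    return sorted({rel for src, rel, _ in TRIPLES if src == node})

def expand(node, rel):
    """Targets reached from node along one relationship type."""
    return [dst for src, r, dst in TRIPLES if src == node and r == rel]

@dataclass
class State:
    frontier: list
    evidence: list = field(default_factory=list)

def observe(state):
    """The local view the model gets: frontier nodes and their edge types."""
    return {node: edge_types(node) for node in state.frontier}

def scripted_decide(question, observation, evidence):
    """Stand-in for the LLM: prefer a fixed order of relationship types."""
    for rel in ("COMPETES_WITH", "SUPPLIED_BY", "HAD_RECALL"):
        for node, rels in observation.items():
            if rel in rels:
                return ("expand", node, rel)
    return ("answer",) if evidence else ("give_up",)

def run(question, seeds, decide, max_hops=5):
    state = State(frontier=list(seeds))  # Seed
    for _ in range(max_hops):  # Repeat, bounded by the hop budget
        action = decide(question, observe(state), state.evidence)  # Observe, Decide
        if action[0] != "expand":
            return action[0], state.evidence  # "answer" or "give_up"
        _, node, rel = action
        targets = expand(node, rel)  # Act
        state.evidence += [(node, rel, t) for t in targets]
        state.frontier = targets  # simplification: the frontier is replaced
    return "budget_exhausted", state.evidence

status, evidence = run("Which suppliers had a recall?", ["Acme"], scripted_decide)
print(status, evidence)
```

Replacing the frontier on each step keeps the sketch short; a real traversal would more likely track visited nodes and allow backtracking. Swapping `scripted_decide` for a function that prompts a model with the same three arguments leaves the rest of the loop unchanged.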

The crucial difference from naive RAG: the LLM never sees the whole graph. It sees a local view and steers. This is the same shape as a ReAct agent, with the "tools" being graph operations (expand, get_neighbors, get_node) instead of web search.

Expose Relationship Types, Not Raw Neighbors

The single most important design choice: when you show the LLM the current frontier, show it the available relationship types ("Acme has 3 outgoing SUPPLIES edges, 12 COMPETES_WITH edges, 1 HEADQUARTERED_IN edge") rather than the raw list of neighbor nodes. The LLM picks an edge type to traverse; you then materialize only those neighbors. This keeps the per-step observation small even at a supernode, and it turns the LLM's job into typed query planning rather than scrolling a giant list.
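
One way to build that observation is to group edges by type and report only counts, materializing neighbors after a type is chosen. The sketch below assumes an in-memory index built from triples; `build_index`, `describe`, `materialize`, the `HOSTS` relationship, and the data are all illustrative, and a graph database would typically compute the same summary with an aggregation query.

```python
from collections import defaultdict

def build_index(triples):
    """Group edges as index[source][relationship type] -> list of targets."""
    index = defaultdict(lambda: defaultdict(list))
    for src, rel, dst in triples:
        index[src][rel].append(dst)
    return index

def describe(index, node):
    """One-line observation: edge counts per relationship type, no neighbors."""
    rels = index.get(node, {})
    if not rels:
        return f"{node} has no outgoing edges"
    parts = []
    for rel in sorted(rels):
        n = len(rels[rel])
        parts.append(f"{n} outgoing {rel} edge{'s' if n != 1 else ''}")
    return f"{node} has " + ", ".join(parts)

def materialize(index, node, rel):
    """Fetch neighbors only for the relationship type that was chosen."""
    return list(index.get(node, {}).get(rel, []))

# Hypothetical data matching the counts in the text, plus one supernode.
triples = (
    [("Acme", "SUPPLIES", f"customer_{i}") for i in range(3)]
    + [("Acme", "COMPETES_WITH", f"rival_{i}") for i in range(12)]
    + [("Acme", "HEADQUARTERED_IN", "USA")]
    + [("USA", "HOSTS", f"company_{i}") for i in range(50_000)]
)
index = build_index(triples)

print(describe(index, "Acme"))
print(describe(index, "USA"))  # stays one short line despite 50,000 edges
print(materialize(index, "Acme", "SUPPLIES"))
```

The observation for the supernode is the same length as for any other node, which is the point: the cost of a step depends on how many relationship types a node has, not on how many neighbors.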

Key Takeaways

Static k-hop retrieval is simultaneously too shallow for deep questions and too deep at supernodes — no single k is correct
Agentic GraphRAG turns retrieval into an observe→decide→act loop where the LLM steers traversal one hop at a time and never sees the whole graph
Expose available relationship TYPES to the LLM instead of raw neighbor lists — this keeps observations small and turns the LLM into a typed query planner
