The final capstone of the Knowledge Graphs track. Everything you've learned (scaling, temporal modeling, graph embeddings, graph neural networks, and production GraphRAG) converges into two end-to-end platform designs: an enterprise compliance and provenance graph, and a large consumer GraphRAG product. You'll walk through ingestion → extraction → entity resolution → storage and scale → hybrid retrieval → serving, with capacity planning, cost modeling, and a launch playbook for each.
This is the final capstone of the Knowledge Graphs track. The last five days each handed you one power tool — scaling (Day 1), temporal graphs (Day 2), graph embeddings (Day 3), graph neural networks (Day 4), and production GraphRAG (Day 5). Today we wire them into complete platforms by designing two real systems end-to-end and letting their differences teach the trade-offs.
The first system, ComplyGraph, serves a regulated bank that needs a graph of who approved what, when, and on whose authority. Entities: people, accounts, transactions, controls, policies, documents. The killer requirement is provenance and auditability: every edge must carry its source, the period during which it was valid, and who asserted it. Auditors run multi-hop questions like "show every transaction over \$10k approved by someone who later failed a controls review." Scale is modest (~50M nodes, ~400M edges), but correctness, lineage, and bitemporal history (when a fact was true, and when the graph recorded it) are non-negotiable.
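To make "every edge carries its source, its validity, and its asserter" concrete, here is a minimal sketch of a bitemporal edge record. The class name `ProvenancedEdge`, its field names, and the `visible` helper are illustrative assumptions, not a schema prescribed by this lesson or by any particular graph database.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ProvenancedEdge:
    """Hypothetical edge record: field names are assumptions for illustration."""
    src: str
    rel: str
    dst: str
    source_doc: str                 # where the fact came from
    asserted_by: str                # who asserted it
    valid_from: datetime            # valid time: when the fact started to hold
    valid_to: Optional[datetime]    # None means "still valid"
    recorded_at: datetime           # record time: when the graph learned the fact
    superseded_at: Optional[datetime] = None  # None means "current record"


def visible(edge: ProvenancedEdge, valid_at: datetime, known_at: datetime) -> bool:
    """True if the edge held at `valid_at` according to what was recorded by `known_at`."""
    in_valid = edge.valid_from <= valid_at and (
        edge.valid_to is None or valid_at < edge.valid_to
    )
    in_record = edge.recorded_at <= known_at and (
        edge.superseded_at is None or known_at < edge.superseded_at
    )
    return in_valid and in_record


# Example data is made up: Alice held approval authority for the first five months of 2024,
# and the graph only learned about it on 15 January.
edge = ProvenancedEdge(
    src="person:alice", rel="HAS_APPROVAL_AUTHORITY", dst="account:42",
    source_doc="policy-memo-17", asserted_by="person:carol",
    valid_from=datetime(2024, 1, 1), valid_to=datetime(2024, 6, 1),
    recorded_at=datetime(2024, 1, 15),
)
```

The two time axes answer different audit questions. `valid_at` asks "was this true in the world on that date?", while `known_at` asks "what did the graph say at the moment a decision was made?". Keeping both is what lets an auditor replay history without later corrections leaking in.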
The second system, AtlasRAG, is a consumer research assistant that answers natural-language questions over ~80M documents by combining a knowledge graph with retrieval. Entities are extracted automatically from messy text at huge volume. The killer requirements are throughput, freshness, and answer quality at low latency: 3,000 QPS at peak, p95 under 900 ms end-to-end, and a graph that grows by millions of nodes daily. Here a wrong-ish edge is tolerable; a slow or stale answer is not.
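The peak-load numbers above already support a first capacity estimate. The sketch below applies Little's law (mean requests in flight = arrival rate × mean time in system) and deliberately treats the 900 ms p95 target as a pessimistic stand-in for mean latency. The per-replica concurrency of 64 and the 30% headroom are made-up assumptions for illustration, not figures from this lesson.

```python
import math


def in_flight_requests(qps: float, latency_s: float) -> float:
    """Little's law: mean concurrency = arrival rate x mean time in system."""
    return qps * latency_s


def replicas_needed(qps: float, latency_s: float,
                    concurrency_per_replica: int, headroom: float = 0.3) -> int:
    """Replicas required to hold the in-flight load, padded by a safety margin."""
    base = in_flight_requests(qps, latency_s) / concurrency_per_replica
    return math.ceil(base * (1 + headroom))


peak_qps = 3000          # from the requirement: 3,000 QPS at peak
p95_latency_s = 0.9      # from the requirement: p95 under 900 ms

# Upper-bound style estimate of simultaneous requests at peak.
concurrent = in_flight_requests(peak_qps, p95_latency_s)

# Assumption: each serving replica comfortably handles 64 concurrent requests.
replicas = replicas_needed(peak_qps, p95_latency_s, concurrency_per_replica=64)
```

Under these assumptions the platform has to sustain on the order of 2,700 requests in flight at peak. The real mean latency should be lower than p95, so treat the result as a ceiling to size against rather than a forecast.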
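Every knowledge-graph platform, including both of ours, follows the same pipeline. Memorize it; it is the spine of the rest of this lesson: ingestion → extraction → entity resolution → storage and scale → hybrid retrieval → serving.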
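One way to internalize the spine is to see it as an ordered list of stages, each consuming the previous stage's output. The sketch below uses the lesson's stage names, but every function body is a toy placeholder (capitalized tokens as "entities", lowercase deduplication as "resolution", a dict as "storage"); none of it reflects how a real platform implements these steps.

```python
from typing import Any, Callable, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def ingest(raw_docs):
    # Toy ingestion: drop empty documents and trim whitespace.
    return [d.strip() for d in raw_docs if d.strip()]


def extract(docs):
    # Toy extraction: treat capitalized tokens as entity mentions.
    return [tok.strip(".,") for d in docs for tok in d.split() if tok[:1].isupper()]


def resolve(mentions):
    # Toy entity resolution: merge mentions that differ only by case.
    return sorted({m.lower() for m in mentions})


def store(entities):
    # Toy storage: assign each resolved entity an integer id.
    return {name: idx for idx, name in enumerate(entities)}


def retrieve(index):
    # Toy retrieval: return stored entities mentioned in the query text.
    return lambda query: [e for e in index if e in query.lower()]


def serve(search):
    # Toy serving layer: wrap retrieval results in a response payload.
    return lambda query: {"query": query, "entities": search(query)}


PIPELINE: List[Stage] = [
    ("ingestion", ingest),
    ("extraction", extract),
    ("entity resolution", resolve),
    ("storage and scale", store),
    ("hybrid retrieval", retrieve),
    ("serving", serve),
]


def run(stages: List[Stage], data: Any) -> Any:
    """Feed the output of each stage into the next, in order."""
    for _name, stage in stages:
        data = stage(data)
    return data


answer = run(PIPELINE, ["Alice approved the transfer.", "  ", "Bob reviewed Alice."])
response = answer("Who did Bob review?")
```

The point of the shape is that both case studies keep the same six slots and swap what goes inside each one.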
The two case studies make the same decisions differently. That contrast is the lesson: there is no universal best architecture, only an architecture that fits a requirement profile.
Before any boxes-and-arrows, write the numbers down. The requirement table is the single most useful artifact in a design review:
| Dimension | ComplyGraph | AtlasRAG |
|---|---|---|
| Scale | 50M nodes / 400M edges | 1B+ nodes, +millions/day |
| Read pattern | Deep multi-hop audits, low QPS | Shallow hybrid retrieval, 3k QPS |
| Freshness | Hours acceptable | Minutes |
| Consistency | Strong, bitemporal, auditable | Eventual is fine |
| Failure cost | A wrong audit = regulatory fine | A stale answer = mild annoyance |
Notice nothing here is about technology yet. Pick the requirement profile first; the stack falls out of it.
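If you want the requirement table to live next to the design rather than in a slide, one option is to keep it as data and compute the contrast. This is a minimal sketch: the class `RequirementProfile` and the helper `contrast` are hypothetical names, and the values are copied from the table above.

```python
from dataclasses import dataclass, fields
from typing import Dict, Tuple


@dataclass(frozen=True)
class RequirementProfile:
    """One column of the requirement table; field names mirror its rows."""
    name: str
    scale: str
    read_pattern: str
    freshness: str
    consistency: str
    failure_cost: str


COMPLY_GRAPH = RequirementProfile(
    name="ComplyGraph",
    scale="50M nodes / 400M edges",
    read_pattern="Deep multi-hop audits, low QPS",
    freshness="Hours acceptable",
    consistency="Strong, bitemporal, auditable",
    failure_cost="A wrong audit = regulatory fine",
)

ATLAS_RAG = RequirementProfile(
    name="AtlasRAG",
    scale="1B+ nodes, +millions/day",
    read_pattern="Shallow hybrid retrieval, 3k QPS",
    freshness="Minutes",
    consistency="Eventual is fine",
    failure_cost="A stale answer = mild annoyance",
)


def contrast(a: RequirementProfile, b: RequirementProfile) -> Dict[str, Tuple[str, str]]:
    """Return every dimension on which the two profiles disagree."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if f.name != "name" and getattr(a, f.name) != getattr(b, f.name)
    }


differences = contrast(COMPLY_GRAPH, ATLAS_RAG)
```

For these two systems every dimension differs, which is why they make such a useful pair: each design decision later in the lesson can be traced back to one of these entries.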