A knowledge graph is more than a queryable store of facts: it is a space you can learn from. Today you turn nodes and edges into vectors with random-walk methods (DeepWalk, node2vec) and knowledge graph embedding (KGE) models (translational TransE, bilinear DistMult, rotational RotatE), score missing links, contrast learned embeddings against ontological and rule-based inference, and use those embeddings to augment retrieval beyond what literal traversal can reach.
So far in this track you've treated the graph as something you query — you traverse edges, run Cypher, compute centralities. Today the graph becomes something you learn from. The goal of graph representation learning is to map every node (and often every edge type) to a dense vector — an embedding — such that the geometry of the embedding space reflects the structure of the graph.
A query like MATCH (a)-[:WORKS_FOR]->(c) RETURN a, c returns only edges that literally exist. But real knowledge graphs are radically incomplete: Freebase was estimated to be missing the place of birth for roughly 70% of the people it contained. Two questions traversal can't answer well:
Embeddings answer both because they place structurally/semantically similar nodes near each other in vector space, and because algebraic operations on the vectors approximate relational facts.
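To make "algebraic operations on the vectors approximate relational facts" concrete, here is a minimal sketch of TransE-style scoring, where a triple is plausible when head + relation lands close to tail. The 2-d vectors are hand-picked for illustration rather than learned, and the names `entity`, `relation`, `transe_score`, and `rank_tails` are hypothetical helpers, not part of any library.

```python
import numpy as np

# Toy 2-d embeddings chosen by hand for illustration; a real model learns these.
entity = {
    "paris": np.array([1.0, 2.0]),
    "france": np.array([2.0, 4.0]),
    "berlin": np.array([3.0, 1.0]),
    "germany": np.array([4.0, 3.0]),
}
relation = {"capitalOf": np.array([1.0, 2.0])}


def transe_score(head, rel, tail):
    """Negative L2 distance between (head + rel) and tail; higher is more plausible."""
    return -float(np.linalg.norm(entity[head] + relation[rel] - entity[tail]))


def rank_tails(head, rel):
    """Rank every other entity as a candidate tail for (head, rel, ?)."""
    candidates = [e for e in entity if e != head]
    return sorted(candidates, key=lambda t: transe_score(head, rel, t), reverse=True)
```

With these toy vectors, `rank_tails("paris", "capitalOf")` puts `"france"` first, even though no edge lookup is involved: the answer comes from vector arithmetic alone. This is the same mechanism that lets a trained model score edges the graph never recorded.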
People conflate these constantly, and it costs them:
A rough rule: if your graph is "who-knows-whom" with a single edge type, reach for node2vec. If it is a typed KG with many relation types (bornIn, worksFor, capitalOf), reach for a knowledge graph embedding (KGE) model such as TransE.
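The single-edge-type case can be sketched as follows: DeepWalk-style methods sample uniform random walks and treat each walk as a "sentence" for a skip-gram model. This sketch covers only the walk generation; the toy `graph` and the `random_walk` helper are assumptions for illustration, and node2vec's biased walks (its return and in-out parameters) are not shown.

```python
import random

# Toy untyped "who-knows-whom" graph as an adjacency list (illustrative data).
graph = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob", "dan"],
    "dan": ["cat"],
}


def random_walk(start, length, rng):
    """Uniform random walk of up to `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


rng = random.Random(42)  # seeded so the walks are reproducible
# Two walks of five nodes from every node; these would feed a skip-gram trainer.
walks = [random_walk(node, 5, rng) for node in graph for _ in range(2)]
```

Note that the walk never looks at an edge label. That is why this family of methods suits graphs where every edge means the same thing, and why a typed KG needs a model that represents relations explicitly.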
Traversal-based reasoning is closed-world: an unrecorded edge is treated as false. Embedding-based link prediction is open-world: an unrecorded edge is unknown, and the model produces a plausibility score. This distinction governs how you generate training data: because the graph stores only positives, you have to synthesize negative examples (covered in Section 3).
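One common way to synthesize negatives is to corrupt a known triple by swapping its tail for a random entity and discarding any result that is already a recorded fact. The sketch below assumes that approach; the toy `triples` and the `corrupt_tail` helper are illustrative, not a specific library's API. Under the open-world view, such negatives are only presumed false, since an unrecorded triple may still be true.

```python
import random

# Toy positive triples (head, relation, tail); illustrative data only.
triples = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "globex"),
    ("acme", "locatedIn", "berlin"),
]
entities = sorted({e for h, _, t in triples for e in (h, t)})
known = set(triples)


def corrupt_tail(triple, rng, max_tries=100):
    """Replace the tail with a random entity, skipping triples that are known positives."""
    h, r, _ = triple
    for _ in range(max_tries):
        candidate = (h, r, rng.choice(entities))
        if candidate not in known:  # also rules out the original triple
            return candidate
    raise RuntimeError("could not find a negative for %r" % (triple,))


rng = random.Random(0)  # seeded for reproducibility
negatives = [corrupt_tail(t, rng) for t in triples]
```

Each negative keeps the head and relation of its source triple, so the model is trained to separate the recorded tail from a presumed-wrong one for the same query.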