Ontology Design & Schema Evolution

Day 1 pulled entities and relations out of raw text. But a pile of triples is not yet a knowledge graph — it needs an ontology: the explicit model of which classes exist, how they relate, and what constraints hold. Today you'll design that model, learn how it differs from a flat taxonomy, and — the part nobody warns you about — how to evolve it as your data and questions change without shattering the queries already running in production.

Ontology vs Taxonomy

Day 1 gave you a heap of extracted triples such as (:Person {name:"Ada"})-[:WORKS_AT]->(:Company {name:"Acme"}). That heap is data. The ontology is the model that says a Person can be connected to a Company by WORKS_AT, that a Company has exactly one legal name, and that WORKS_AT is the inverse of EMPLOYS. Without it, your graph is just a denormalized dump with arrows.
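To make the data/model split concrete, here is a minimal Python sketch. It assumes triples are stored as ((label, name), relationship, (label, name)) tuples and that the ontology is just a set of allowed (source label, relationship, target label) patterns. The names ALLOWED and violations are invented for this illustration and do not come from any library.

```python
# Hypothetical mini-ontology: the label/relationship patterns the model allows.
ALLOWED = {
    ("Person", "WORKS_AT", "Company"),
    ("Company", "EMPLOYS", "Person"),
}


def violations(triples, allowed=ALLOWED):
    """Return the triples whose (source label, relationship, target label)
    pattern is not permitted by the ontology."""
    return [t for t in triples if (t[0][0], t[1], t[2][0]) not in allowed]


# The data: one well-formed triple and one with the direction reversed.
triples = [
    (("Person", "Ada"), "WORKS_AT", ("Company", "Acme")),
    (("Company", "Acme"), "WORKS_AT", ("Person", "Ada")),
]

bad = violations(triples)
```

Here bad holds only the second triple: nothing in the data itself marks it as wrong, and only the model can say so.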

A Taxonomy Is a Tree

A taxonomy is a hierarchy of categories — a single "is-a" tree. Biology's kingdom → phylum → class → order is a taxonomy. A retailer's Electronics → Phones → Smartphones is a taxonomy. It answers exactly one kind of question: "what broader/narrower category does this belong to?"

Taxonomies are cheap, intuitive, and limited. Every node has one parent (or, in a polyhierarchy, a few). The only relationship is subsumption — broader/narrower. You cannot express "this drug interacts with that drug" or "this author influenced that author" in a taxonomy; those aren't is-a relationships.
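As a rough sketch, a strict taxonomy fits in a single child-to-parent mapping, and the only thing you can ask of it is to walk up or down. The category names reuse the retailer example above; "Laptops", PARENT, broader, and is_a are assumptions made for this illustration.

```python
# Hypothetical retail taxonomy: each category maps to its single parent.
PARENT = {
    "Smartphones": "Phones",
    "Phones": "Electronics",
    "Laptops": "Electronics",
    "Electronics": None,  # root of the tree
}


def broader(category, parent=PARENT):
    """Walk up the tree and return every broader category, nearest first."""
    chain = []
    node = parent[category]
    while node is not None:
        chain.append(node)
        node = parent[node]
    return chain


def is_a(narrow, broad, parent=PARENT):
    """True if `broad` is an ancestor of `narrow` in the taxonomy."""
    return broad in broader(narrow, parent)
```

For example, broader("Smartphones") walks up through "Phones" to "Electronics". There is no place in this structure to record that one product is compatible with another; the mapping has exactly one slot per node, and it is already taken by the parent.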

An Ontology Is a Graph of Meaning

An ontology generalizes a taxonomy. It still has classes arranged in is-a hierarchies, but it also defines:

  • Relationship types between classes (a Drug TREATS a Condition; an Author WROTE a Book)
  • Properties on classes, with expected data types (birthDate is a date; isbn is a string)
  • Constraints: cardinality (a Book has exactly one isbn), domain/range (WROTE goes from Author to Book, never the reverse), uniqueness
  • Axioms / inference rules: if A SUBCLASS_OF B and B SUBCLASS_OF C, then A SUBCLASS_OF C; if x WORKS_AT y, then y EMPLOYS x (the two relationships are inverses)
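The constraint and inference items in this list can be sketched in a few lines of plain Python. This is an illustration under simple assumptions (facts as tuples, a hand-written inverse table), not how an OWL reasoner is actually implemented. All function names and sample values here are hypothetical.

```python
def subclass_closure(edges):
    """Transitive closure over (child, parent) SUBCLASS_OF pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # A SUBCLASS_OF B and B SUBCLASS_OF D  =>  A SUBCLASS_OF D
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical inverse table, declared in both directions.
INVERSE = {"WORKS_AT": "EMPLOYS", "EMPLOYS": "WORKS_AT"}


def add_inverses(triples, inverse=INVERSE):
    """Return the triples plus the inverse fact implied by each one."""
    inferred = set(triples)
    for s, rel, o in triples:
        if rel in inverse:
            inferred.add((o, inverse[rel], s))
    return inferred


def isbn_cardinality_violations(books):
    """Titles whose list of isbn values does not contain exactly one entry."""
    return [title for title, isbns in books.items() if len(isbns) != 1]
```

Given the single fact ("Ada", "WORKS_AT", "Acme"), add_inverses also yields ("Acme", "EMPLOYS", "Ada"). That second fact was never extracted from text; it exists only because the ontology says the two relationships are inverses.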

So: every taxonomy is a (very simple) ontology, but most ontologies are far richer than any taxonomy. The slogan worth remembering: a taxonomy classifies; an ontology describes.

Why the Distinction Matters in Practice

When stakeholders say "we need a taxonomy of our products," they often actually need an ontology — because the interesting questions ("which products are compatible with which accessories, made by which suppliers, recalled in which regions?") are relationship questions, not classification questions. Building a tree when you needed a graph means bolting relationships on later as ad-hoc properties, which is exactly the schema-drift mess this lesson exists to prevent.
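A small sketch shows why a question like this needs typed relationships. The product, accessory, supplier, and region names below are made up, and the edge list stands in for whatever graph store you actually use.

```python
# Hypothetical product graph as (source, relationship, target) edges.
edges = [
    ("PhoneX", "COMPATIBLE_WITH", "ChargerA"),
    ("PhoneX", "COMPATIBLE_WITH", "CaseB"),
    ("ChargerA", "MADE_BY", "SupplierOne"),
    ("CaseB", "MADE_BY", "SupplierTwo"),
    ("ChargerA", "RECALLED_IN", "EU"),
]


def targets(source, rel, edges=edges):
    """All nodes reachable from `source` over one edge of type `rel`."""
    return {o for s, r, o in edges if s == source and r == rel}


def recalled_accessories(product, region):
    """Map each accessory that is compatible with `product` and recalled
    in `region` to the set of suppliers that make it."""
    result = {}
    for accessory in targets(product, "COMPATIBLE_WITH"):
        if region in targets(accessory, "RECALLED_IN"):
            result[accessory] = targets(accessory, "MADE_BY")
    return result
```

Calling recalled_accessories("PhoneX", "EU") follows three different relationship types in one question. A category tree has no edges of this kind to follow.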

Standards You'll Encounter

  • RDFS / OWL — the W3C stack for RDF graphs. RDFS gives you classes, subclassing, domain/range. OWL adds rich axioms (inverse, transitive, functional properties, cardinality, disjointness) and a formal semantics that reasoners can act on.
  • SHACL — a constraint/validation language for RDF: "every Person must have exactly one birthDate of type date." Think of it as schema validation for graphs.
  • Property-graph schemas: Neo4j and similar engines are historically schema-optional; you enforce structure with constraints and conventions rather than a formal ontology language. Neo4j has added node-key and property-existence/type constraints to close the gap. Not every property-graph engine works this way: TigerGraph, for example, expects a declared schema before data is loaded.
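To give a feel for what a validation rule like the SHACL example checks, here is the same rule written as plain Python. This is not SHACL syntax and uses no RDF library. It assumes a node is a dict mapping each property name to a list of values, and check_person is a name invented for this sketch.

```python
from datetime import date


def check_person(node):
    """Return violation messages for one Person node.

    Rule being checked: exactly one birthDate, and it must be a date.
    """
    problems = []
    values = node.get("birthDate", [])
    if len(values) != 1:
        problems.append(f"expected exactly one birthDate, found {len(values)}")
    for v in values:
        if not isinstance(v, date):
            problems.append(f"birthDate {v!r} is not a date")
    return problems
```

An empty result means the node conforms. A real SHACL engine produces a structured validation report rather than a list of strings, but the shape of the check (count the values, then test their type) is the same idea.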

You do not need to adopt OWL to have an ontology. Even an informal, documented agreement — "these are our node labels, these are our relationship types, here are the rules" — is an ontology. The discipline matters more than the format.

Key Takeaways
  • A taxonomy is a single is-a hierarchy that only answers 'what category is this?'; an ontology adds typed relationships, properties, constraints, and inference rules
  • Most real requests for a 'taxonomy' are actually requests for an ontology, because the valuable questions are relationship questions
  • You don't need OWL to have an ontology — even a documented agreement on labels, relationship types, and rules counts; the discipline matters more than the format

Course Stats

  • Estimated Time: 50 min
  • Lessons: 5 sections