14  Entity-Relationship Diagrams

14.1 Learning Objectives

By the end of this chapter, students will be able to:

  • Read and interpret an entity-relationship (ER) diagram, identifying entities, attributes, and relationships.
  • Identify one-to-one, one-to-many, and many-to-many relationships in an ER diagram.
  • Explain how a many-to-many relationship is resolved using a junction (associative) table.
  • Apply crow’s foot notation to represent cardinality and optionality in a diagram.
  • Translate an ER diagram into a set of CREATE TABLE statements with appropriate primary keys, foreign keys, and constraints.
  • Construct a basic ER diagram for a given set of business requirements before writing any SQL.
  • Evaluate an existing ER diagram for missing relationships, incorrect cardinality, or structural problems.

14.2 Design Before You Build

Writing SQL to create a database without first sketching its structure is like framing walls without a blueprint. The tables might hold together, but you will likely discover expensive problems later: a relationship missing, columns in the wrong place, a structure that forces the kind of redundancy the normalization chapter describes. Changes to a schema after data is loaded are slow and disruptive.

An entity-relationship diagram (ER diagram, or ERD) is the standard tool for planning a database schema before writing a single line of SQL. It shows what entities exist, what attributes each entity has, and how entities relate to each other.

This chapter uses DBML (Database Markup Language) to define schemas and crow’s foot notation to render the resulting diagrams. DBML is a concise, SQL-like language designed specifically for describing database structure. The diagrams in this chapter render directly in the book via the quarto-dbml extension; the same DBML code works identically in dbdiagram.io, a free web tool you will use for the exercises.

14.3 Defining Entities: Table Blocks

In DBML, an entity is defined with a Table block. Each row inside the block defines one attribute (column), with the column name followed by its data type and any constraints enclosed in square brackets.

customers PK customer_id int NN customer varchar UN NN email varchar

The constraint annotations inside [...] map directly to SQL:

Annotation Meaning
pk Primary key
increment Auto-incrementing surrogate (SERIAL in PostgreSQL)
not null Column cannot be NULL
unique Values must be distinct across all rows
default: value Default value when none is supplied
note: 'text' Documentation only; not enforced by the database

A composite primary key cannot be declared inline. It goes in an Indexes block at the end of the table definition:

order_items NN order_id int NN product varchar NN quantity int

14.4 Defining Relationships: Ref Declarations

Relationships between entities are declared with Ref statements. Each Ref names two columns and the cardinality between them using one of four operators:

Operator Cardinality Example
> Many-to-one (left is the FK/many side) orders.customer_id > customers.customer_id
< One-to-many (right is the FK/many side) customers.customer_id < orders.customer_id
- One-to-one users.profile_id - profiles.id
<> Many-to-many students.id <> courses.id

> and < are mirrors of each other. Most schemas use > consistently so the FK column appears on the left: FK_table.FK_column > PK_table.PK_column.

14.4.1 How NOT NULL Shapes the Diagram

The Ref operator sets cardinality (one vs. many). Participation (mandatory vs. optional) is determined by whether the FK column carries [not null]:

  • A FK column declared [not null] means every child row must have a parent. The crow’s foot rendering shows a mandatory symbol (a bar, |) at the parent end.
  • A nullable FK column means the relationship is optional. The rendering shows an optional symbol (a circle, o) at the parent end.

The many end almost always shows zero-or-many (crow’s foot with a circle), because relational constraints generally cannot enforce “at least one child row must exist.” That rule lives in application logic, not in the schema itself.

14.5 Reading Crow’s Foot Diagrams

Each end of a relationship line in a crow’s foot diagram carries two symbols: one for cardinality (the maximum) and one for participation (the minimum).

Cardinality symbols:

Symbol Meaning
Single bar \| One (no more than one)
Crow’s foot { Many (no upper limit)

Participation symbols:

Symbol Meaning
Single bar \| Mandatory: at least one must exist
Circle o Optional: zero is permitted

The four practical combinations:

Symbols at one end Reads as
\|\| Exactly one
o\| Zero or one
\|{ One or many
o{ Zero or many

To read a relationship, start at one entity and read toward the other. The symbols at the far end answer the question “how many of that entity can relate to one of mine?”

14.6 The Orders Schema

Here is the complete orders schema from the normalization chapter expressed in DBML. Notice how closely it resembles the CREATE TABLE statements you will write in the next chapter.

Table customers {
  customer_id int     [pk, increment]
  customer    varchar [not null]
  email       varchar [not null, unique]
}

Table orders {
  order_id    int  [pk, increment]
  customer_id int  [not null]
  order_date  date [not null, default: `now()`]
}

Table order_items {
  order_id int     [not null]
  product  varchar [not null]
  quantity int     [not null, default: 1]

  Indexes {
    (order_id, product) [pk]
  }
}

Ref: orders.customer_id > customers.customer_id
Ref: order_items.order_id > orders.order_id
customers PK FK customer_id int NN customer varchar UN NN email varchar orders PK FK order_id int FK NN customer_id int NN order_date date order_items FK NN order_id int NN product varchar NN quantity int

Two things to observe in the rendered diagram:

CUSTOMERS to ORDERS: orders.customer_id is [not null], so every order must belong to a customer. The diagram shows a mandatory symbol at the CUSTOMERS end. The ORDERS end shows zero-or-many: a customer can exist without any orders.

ORDERS to ORDER_ITEMS: order_items.order_id is [not null], so every line item must belong to an order. The diagram shows a mandatory symbol at the ORDERS end. The ORDER_ITEMS end shows zero-or-many.

The normalization chapter described the ORDER_ITEMS side as “one or many” because an order with no items is not a meaningful business object. DBML and SQL constraints cannot enforce this at the schema level. The zero-or-many rendering is technically correct for what the schema can guarantee; the “at least one” rule lives in application logic.

14.7 The worldbank Schema

The worldbank database from the joins chapter has a richer structure. OBSERVATIONS is a junction table that resolves a many-to-many relationship between COUNTRIES and INDICATORS:

N 1 N 1 N 1 N 1 regions PK FK region_code varchar NN region_name varchar income_groups PK FK income_code varchar NN income_name varchar countries PK FK country_code varchar NN country_name varchar FK NN region_code varchar FK income_code varchar indicators PK FK indicator_code varchar NN indicator_name varchar observations FK NN country_code varchar FK NN indicator_code varchar NN year int value float

OBSERVATIONS sits at the center, connected to both COUNTRIES and INDICATORS. Each observation belongs to exactly one country and exactly one indicator, while each country and indicator can have many observations.

Notice that countries.income_code has no [not null] annotation. Some countries in the dataset have no income classification. The diagram renders an optional symbol (circle) on the INCOME_GROUPS end of that relationship.

14.8 Self-Referencing Relationships

A table can reference itself. The employee hierarchy example from the previous chapter uses a manager_id column that points back to employee_id in the same table:

N 1 employees PK FK employee_id int NN name varchar FK manager_id int

manager_id has no [not null], which is intentional: the person at the top of the hierarchy has no manager, so their manager_id is NULL. The optional symbol at the parent end reflects this.

14.9 DBML and dbdiagram.io

The DBML you write in this book renders identically in dbdiagram.io, a free browser-based tool. To use it:

  1. Go to dbdiagram.io and open the editor (no account required for basic use).
  2. Paste your DBML into the left panel.
  3. The diagram renders immediately on the right.
  4. You can drag tables to arrange the layout, export to PNG or PDF, and share a link.

dbdiagram.io is the tool to use for the exercises below and for designing schemas in your own work. Write the DBML first, render it to check the structure, then iterate before writing any SQL.

14.10 From Diagram to Schema

An ER diagram and a CREATE TABLE script are two representations of the same design. DBML makes the translation almost mechanical:

  • Each Table block becomes a CREATE TABLE statement.
  • Each column annotation ([not null], [unique], [pk], [default: ...]) becomes the corresponding SQL constraint.
  • Each Ref becomes a FOREIGN KEY declaration.
  • The Indexes { (col1, col2) [pk] } block becomes a table-level PRIMARY KEY (col1, col2).

The next chapter walks through this translation in full.

14.11 Exercises

14.11.1 Reflection

1. In DBML, what is the difference between a > and a < in a Ref declaration? Are they interchangeable, or do they mean different things?


2. A Ref line alone tells you the cardinality of a relationship. What additional information tells you whether participation is mandatory or optional, and where does that information live in the DBML definition?


3. The worldbank schema leaves countries.income_code nullable (no [not null]). What does this mean for the rendered diagram, and what real-world fact does this model?


14.11.2 Design

4. A regional library system tracks books, borrowers, and loans. The business rules are:

  • Each book has a title, an ISBN (unique), and an author.
  • Each borrower has a name and a library card number (unique).
  • A loan records which borrower checked out which book, along with the checkout date and due date.
  • The same book can be loaned to different borrowers at different times; a borrower can have multiple active loans.

Write the DBML for this schema. Include all appropriate constraint annotations. Paste it into dbdiagram.io to verify the diagram, then copy the final DBML here.


5. The following diagram is rendered from a DBML definition. Read it and answer the questions below.

N 1 N 1 doctors PK FK doctor_id int NN name varchar specialty varchar patients PK FK patient_id int NN name varchar NN date_of_birth date appointments PK appointment_id int FK NN doctor_id int FK NN patient_id int NN scheduled_at datetime notes text
  1. Describe each relationship in plain English, including cardinality and participation.
  2. Is APPOINTMENTS acting as a junction table? Explain.
  3. A new rule states that every appointment must have notes recorded before it can be saved. What change would you make to the DBML to enforce this?
  4. Another rule states that a doctor can have at most one appointment per time slot. What change would you make to the DBML to enforce this uniqueness constraint?

6. A podcast platform needs to store shows, episodes, listener subscriptions, and playlists. The rules are:

  • Each show has many episodes; each episode belongs to exactly one show.
  • Listeners can subscribe to many shows; a show can have many subscribers. The subscription records when the listener subscribed.
  • Each listener can have many playlists; each playlist belongs to exactly one listener.
  • Each playlist can contain many episodes; each episode can appear in many playlists.

Write the full DBML for this schema. Identify any junction tables your design requires, and explain why they are needed. Paste into dbdiagram.io to verify the structure before submitting.