By the end of this chapter, students will be able to:
WriteCREATE TABLE statements with appropriate column names, data types, and inline constraints.
ApplyNOT NULL, UNIQUE, DEFAULT, and CHECK constraints when defining columns.
UseALTER TABLE to add, drop, or modify columns and constraints on an existing table.
WriteDROP TABLE and TRUNCATE TABLE and explain the difference in behavior and reversibility.
UseCREATE TABLE AS SELECT to persist query results as a new permanent table.
Define an auto-incrementing primary key column using GENERATED ALWAYS AS IDENTITY.
Distinguish between a temporary table and a permanent table and identify appropriate use cases for each.
16.2 From Reading to Writing
Every query in this book so far has been a SELECT statement: reading data from tables that already exist. Creating and modifying tables is the domain of Data Definition Language (DDL), a subset of SQL distinct from the SELECT, INSERT, UPDATE, and DELETE statements that make up Data Manipulation Language (DML).
This chapter introduces CREATE TABLE, the fundamental DDL statement, in the context of building a university database from scratch. By the end, you will know how to create a database, define a table with appropriate constraints, prevent duplicate or invalid data at the schema level, and modify a table’s structure after the fact. The next chapter extends this foundation to multi-table schemas with foreign keys.
16.3 Creating a Database
In PostgreSQL, a database is the top-level container: an isolated environment with its own tables, schemas, users, and settings. A running server can host multiple databases simultaneously. The worldbank database you have been querying throughout this book is one such database on your local Docker container. Here, you will create a second one.
CREATEDATABASE university;
CREATE DATABASE must be run while connected to a different database — you cannot create a database while you are connected to it. The standard approach is to connect to the built-in postgres maintenance database first:
Once connected to university, every CREATE TABLE statement you run will place its tables there, completely separate from worldbank.
16.4 Schemas
Within a database, a schema is a named namespace that groups related objects — tables, views, functions — together. Think of a database as a building and schemas as the floors inside it.
Every PostgreSQL database starts with a schema called public. When you create a table without specifying a schema, it lands in public by default:
-- These two are equivalentCREATETABLE students (...);CREATETABLEpublic.students (...);
You can create additional schemas to separate concerns — for example, a registrar schema for enrollment data and an finance schema for billing tables. Tables outside the default schema are referenced with a schema-qualified name: schema_name.table_name. The search_path setting controls which schemas PostgreSQL searches when you use an unqualified name.
For the exercises in this chapter, the public schema of the university database is sufficient. Schemas become more important when a single database serves multiple applications or teams — a pattern covered in the chapter on privileges.
Each column is defined by its name, its data type (covered in the data types chapter), and any constraints that apply to it. Constraints tell the database what values are and are not acceptable, enforcing business rules at the storage level regardless of which application or user touches the data.
16.6 Column Constraints
16.6.1 NOT NULL
NOT NULL prevents a column from ever holding a NULL value. It is the most common constraint after the primary key.
CREATETABLE students ( student_id SERIAL PRIMARYKEY, first_name TEXT NOTNULL, last_name TEXT NOTNULL);
Any INSERT or UPDATE that leaves first_name or last_name as NULL will be rejected immediately by the database.
16.6.2 DEFAULT
A DEFAULT value is substituted automatically when an INSERT statement omits the column entirely.
CREATETABLE students ( student_id SERIAL PRIMARYKEY, first_name TEXT NOTNULL, last_name TEXT NOTNULL, enrolled_on DATENOTNULLDEFAULTCURRENT_DATE, is_active BOOLEANNOTNULLDEFAULTTRUE);
If a new row is inserted without specifying enrolled_on, the database records the current date. A default is not the same as NOT NULL: a column can have a default and still accept an explicitly supplied NULL, unless NOT NULL is also present.
16.6.3 UNIQUE
UNIQUE ensures no two rows share the same value in that column. A student’s email address is a natural candidate — each address should appear at most once.
CREATETABLE students ( student_id SERIAL PRIMARYKEY, first_name TEXT NOTNULL, last_name TEXT NOTNULL, email TEXT NOTNULLUNIQUE, enrolled_on DATENOTNULLDEFAULTCURRENT_DATE, is_active BOOLEANNOTNULLDEFAULTTRUE);
PRIMARY KEY already implies both uniqueness and non-nullability, so you do not add UNIQUE or NOT NULL to a primary key column separately.
PostgreSQL permits multiple NULL values in a UNIQUE column because NULL is not considered equal to NULL — no uniqueness violation occurs. If a non-key column must be both unique and present in every row, declare both UNIQUE and NOT NULL.
16.6.4 CHECK
A CHECK constraint specifies a boolean condition that every row must satisfy. GPA values, for example, should fall within a defined range:
CREATETABLE students ( student_id SERIAL PRIMARYKEY, first_name TEXT NOTNULL, last_name TEXT NOTNULL, email TEXT NOTNULLUNIQUE, enrolled_on DATENOTNULLDEFAULTCURRENT_DATE, is_active BOOLEANNOTNULLDEFAULTTRUE, gpa NUMERIC(3, 2) CHECK (gpa ISNULLOR (gpa >=0.0AND gpa <=4.0)));
CHECK constraints are evaluated on every INSERT and UPDATE. The gpa IS NULL clause is necessary because a CHECK constraint only fails when it evaluates to FALSE — a NULL result is treated as passing. Allowing NULL here is intentional: a newly enrolled student may not yet have a GPA on record.
16.7 Primary Keys
The keys and integrity chapter introduced primary keys as a design concept. In SQL, the PRIMARY KEY constraint is how you enforce that concept.
16.7.1 Single-Column Primary Key
The simplest form declares PRIMARY KEY inline with the column definition:
CREATETABLE students ( student_id SERIAL PRIMARYKEY,...);
PRIMARY KEY implies both NOT NULL and UNIQUE, so neither needs to be written separately. SERIAL is a PostgreSQL convenience type that creates an auto-incrementing integer column backed by a sequence — each new row automatically receives the next integer if no value is supplied. This is the standard way to create a surrogate primary key.
SERIAL is PostgreSQL-specific. The SQL standard equivalent, supported in PostgreSQL 10 and later, is:
student_id INTEGERGENERATED ALWAYS AS IDENTITY PRIMARYKEY
Both accomplish the same thing. SERIAL is still common in documentation and existing codebases.
16.7.2 Composite Primary Keys
When no single column uniquely identifies a row, two or more columns can serve as the primary key together. A composite key is written as a table-level constraint after the column definitions:
Inline PRIMARY KEY syntax cannot express a composite key. The full multi-table schema, including this enrollments table and the foreign keys that connect it to students and courses, is built out in the next chapter.
16.8 Putting It Together: The Students Table
Here is the complete students table definition, bringing all the constraints above into a single statement:
CREATETABLE students ( student_id SERIAL PRIMARYKEY, first_name TEXT NOTNULL, last_name TEXT NOTNULL, email TEXT NOTNULLUNIQUE, enrolled_on DATENOTNULLDEFAULTCURRENT_DATE, is_active BOOLEANNOTNULLDEFAULTTRUE, gpa NUMERIC(3, 2) CHECK (gpa ISNULLOR (gpa >=0.0AND gpa <=4.0)))
Priya’s GPA is NULL — she enrolled mid-year and has not yet completed any graded coursework. The CHECK constraint permits this because it was written to allow NULLs explicitly.
16.9 CREATE TABLE IF NOT EXISTS
Running CREATE TABLE against a table that already exists produces an error. The IF NOT EXISTS modifier suppresses that error, making the statement safe to run in a script that may execute more than once:
CREATETABLEIFNOTEXISTS students ( student_id SERIAL PRIMARYKEY, first_name TEXT NOTNULL, last_name TEXT NOTNULL, email TEXT NOTNULLUNIQUE, enrolled_on DATENOTNULLDEFAULTCURRENT_DATE, is_active BOOLEANNOTNULLDEFAULTTRUE, gpa NUMERIC(3, 2) CHECK (gpa ISNULLOR (gpa >=0.0AND gpa <=4.0)));
If students already exists, this statement does nothing and raises no error. This is the pattern used in setup scripts and database migrations.
16.10 Removing Tables
DROP TABLE removes a table and all of its data permanently:
DROPTABLE students;
DROP TABLE IF EXISTS is the complement to CREATE TABLE IF NOT EXISTS — it does nothing if the table does not exist rather than raising an error:
DROPTABLEIFEXISTS students;
When a table has foreign keys pointing to it from other tables, dropping it requires either dropping the dependent tables first or using CASCADE to remove the referencing foreign key constraints automatically:
DROPTABLE students CASCADE;
CASCADE removes the constraint definitions in the dependent tables; it does not drop those tables themselves. Use it carefully — in a complex schema it can remove constraints you did not intend to touch.
16.11 Modifying Existing Tables
After a table is created, ALTER TABLE can change its structure without dropping and recreating it.
Add a column:
ALTERTABLE studentsADDCOLUMN major TEXT;
New columns are added with NULL in every existing row unless a DEFAULT is specified. If the new column is NOT NULL, you must supply a DEFAULT so the database has a value to back-fill:
ALTERTABLE studentsADDCOLUMN major TEXT NOTNULLDEFAULT'Undeclared';
Change a column’s data type:
ALTERTABLE studentsALTERCOLUMN major TYPEVARCHAR(100);
The cast must be valid for all existing values. Changing TEXT to VARCHAR(100) succeeds; changing TEXT to INTEGER would fail unless every value can be parsed as a number.
Adding a CHECK constraint to a populated table causes PostgreSQL to validate every existing row. If any row fails the check, the statement is rejected.
Remove a column:
ALTERTABLE studentsDROPCOLUMN major;
ALTER TABLE modifies the live table immediately. On large tables, some operations — particularly adding a NOT NULL constraint without a default, or building a new index — can lock the table and block reads and writes for seconds or longer. In a learning environment the tables are small enough that this does not matter, but it is worth knowing before working on production systems.
16.12 Exercises
1. Connect to your university database and run \dt in psql to list its tables. Confirm that students exists. Then query information_schema.columns to display the column names, data types, and nullability for the students table.
2. Insert a row into students that violates the gpa CHECK constraint (for example, a GPA of 5.0). Observe the error message. Then insert a row with a NULL GPA and confirm it succeeds.
3. Insert two students with the same email address and observe the error. Explain why this constraint belongs in the database rather than only in the application that writes to it.
4. Add a phone column of type TEXT to the students table, allowing NULL. Then alter the column to add a CHECK constraint that requires the value to be at least 10 characters long (using char_length(phone) >= 10), or NULL. Test the constraint with a valid and invalid value.
5. Write a CREATE TABLE IF NOT EXISTS statement for a courses table with the following rules:
A natural primary key using a course code such as 'CS101' (not a surrogate integer).
A title that is required.
A credit hours column that must be a positive integer.
A department name that defaults to 'General'.
Run the statement twice and confirm the second execution produces no error.
6. Drop the students table, then recreate it using CREATE TABLE IF NOT EXISTS. Verify that the data is gone after the drop and that the table structure is restored after the recreate.