Database Normalization Explained — 1NF, 2NF, and 3NF with Examples
Normalization is the process of structuring a database schema to reduce data redundancy and improve integrity. A poorly normalized schema stores the same data in multiple places — meaning updates have to happen in multiple rows, and inconsistencies are inevitable. This guide walks through the first three normal forms with concrete before-and-after examples.
Why Normalization Matters
Consider a single orders table that stores everything:
| order_id | customer_name | customer_email | product | product_price |
|---|---|---|---|---|
| 1 | Alice | alice@example.com | Widget | 9.99 |
| 2 | Alice | alice@example.com | Gadget | 24.99 |
| 3 | Bob | bob@example.com | Widget | 9.99 |
Problems: Alice's email is stored twice — if it changes, you must update every row. The Widget price is stored twice — a price change requires finding every row that references it. This is the kind of redundancy normalization eliminates.
First Normal Form (1NF)
Rule: Every column must hold a single, atomic value. No repeating groups, no comma-separated lists in a cell.
Violation example
| order_id | products |
|---|---|
| 1 | Widget, Gadget, Sprocket |
✗ Not in 1NF
The products column contains multiple values. You can't query "all orders containing a Widget" without a LIKE hack.
Fixed
| order_id | product |
|---|---|
| 1 | Widget |
| 1 | Gadget |
| 1 | Sprocket |
✓ In 1NF
Each row holds one value. The table now has a composite primary key of (order_id, product).
Second Normal Form (2NF)
Rule: The table must be in 1NF, and every non-key column must depend on the entire primary key — not just part of it. This only applies to tables with composite primary keys.
Violation example
| order_id | product_id | quantity | product_name | product_price |
|---|---|---|---|---|
| 1 | 42 | 2 | Widget | 9.99 |
| 2 | 42 | 1 | Widget | 9.99 |
✗ Not in 2NF
The primary key is (order_id, product_id). But product_name and product_price depend only on product_id — not on the full composite key. They're stored redundantly in every order line.
Fixed — split into two tables
-- order_items: only order-specific data
order_id | product_id | quantity
-- products: product data lives here once
product_id | product_name | product_price
✓ In 2NF
Now product_name and product_price are stored once in products. A price change updates one row.
Third Normal Form (3NF)
Rule: The table must be in 2NF, and no non-key column should depend on another non-key column (no transitive dependencies).
Violation example
| employee_id | department_id | department_name |
|---|---|---|
| 1 | 10 | Engineering |
| 2 | 10 | Engineering |
| 3 | 20 | Marketing |
✗ Not in 3NF
department_name depends on department_id, not on employee_id. It's a transitive dependency through a non-key column. Renaming the department requires updating every employee row.
Fixed — extract the dependency
-- employees
employee_id | department_id
-- departments
department_id | department_name
✓ In 3NF
Department names live in one place. Renaming "Engineering" is a single row update.
When to Denormalize
3NF is the right default for transactional databases. But sometimes you'll deliberately break the rules for performance:
- Reporting and analytics — denormalized "wide" tables avoid expensive joins across many tables in read-heavy workloads.
- Caching derived values — storing a pre-computed
order_totalavoids summingorder_itemson every page load, at the cost of keeping it in sync. - Historical snapshots — sometimes you want to store the product price at the time of purchase, not the current price. Denormalization is correct here by design.
The key principle: normalize first, then denormalize deliberately and with documentation. Never denormalize out of laziness.
Visualize your normalized schema
SQL Designer makes it easy to split tables correctly and draw the foreign key relationships between them. Free, browser-based, no installation required.
Create a Free Account