Key Takeaways
- A B-tree index turns a lookup on a 1-million-row table into roughly 2-3 page reads, versus scanning every row sequentially (MySQL 8.0 Reference Manual).
- A composite index only serves queries that filter on its leftmost column: an index on (a, b, c) can't be used by a query filtering on b alone.
- When a query would touch a large share of a table's rows, MySQL's optimizer often chooses a full table scan over the index, since fewer random seeks are needed.
- MySQL 8.0's sys.schema_unused_indexes view lists indexes with zero recorded reads, making it the fastest way to find dead weight slowing down writes.
What Is a MySQL Index?
A MySQL index is a separate, sorted data structure that stores a copy of one or more column values alongside a pointer back to the full row. Instead of reading every row to find a match, the storage engine walks the much smaller, sorted index structure and jumps straight to the matching entries.
InnoDB, MySQL's default storage engine, implements primary keys and ordinary secondary indexes as B-trees; the exceptions are spatial indexes, which use R-trees, and FULLTEXT indexes, which use inverted lists. The MySQL 8.0 Reference Manual notes that a B-tree index traversal on a table with roughly a million rows typically takes only 2-3 page reads, compared with reading every single row in sequence. That gap is the entire reason indexes exist.
Without an index, MySQL has exactly one option for finding matching rows: read the table from start to end and check every row against the condition. That's a full table scan, and it's the default behaviour for any column without an index.
How B-Tree Indexes Work
A B-tree keeps its structure shallow by packing many keys into each 16 KB page (InnoDB's default page size), which gives it a high fanout: every internal page can point to dozens or even hundreds of child pages, depending on key size. That's why the tree stays only 3-4 levels deep even at millions of rows, and why a lookup costs a handful of page reads rather than scaling linearly with table size.
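The arithmetic behind that shallowness can be sketched in a few lines of Python. This is a back-of-the-envelope model, not a measurement of InnoDB: the function name btree_levels and the fanout values of 100 and 500 are illustrative assumptions, and real fanout depends on key size and how full each page is.

```python
def btree_levels(row_count: int, fanout: int) -> int:
    """Smallest number of levels whose capacity (fanout ** levels) covers row_count.

    Simplification: every page, leaf or internal, is assumed to hold
    exactly `fanout` entries.
    """
    if row_count < 1 or fanout < 2:
        raise ValueError("row_count must be >= 1 and fanout >= 2")
    levels = 1
    capacity = fanout
    while capacity < row_count:
        capacity *= fanout
        levels += 1
    return levels


# Hypothetical fanouts: 100 entries per page (wide keys), 500 (narrow keys).
for fanout in (100, 500):
    for rows in (1_000_000, 100_000_000):
        levels = btree_levels(rows, fanout)
        print(f"fanout={fanout} rows={rows:,} levels={levels}")
```

Under these assumptions, growing the table a hundredfold adds at most one level, which is the practical meaning of logarithmic lookup cost.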
In InnoDB specifically, the primary key is the table: data rows live directly in the primary key's B-tree, a design called a clustered index. Every secondary index (any index other than the primary key) stores the indexed column plus the primary key value. To fetch the rest of the row, InnoDB then does a second lookup into the clustered index using that primary key value. That's why a secondary index lookup that needs columns outside the index is technically two B-tree traversals, not one, though both are still fast.
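The two-step lookup can be pictured with a small Python model. It is only a sketch: plain dictionaries stand in for the two B-trees (so there is no ordering or paging), and the names clustered, secondary_user_id, and find_by_user are made up for illustration.

```python
# Clustered index: primary key -> full row (the row data lives here).
clustered = {
    1: {"id": 1, "user_id": 42, "status": "shipped"},
    2: {"id": 2, "user_id": 7, "status": "pending"},
    3: {"id": 3, "user_id": 42, "status": "pending"},
}

# Secondary index on user_id: indexed value -> primary key values only.
secondary_user_id = {}
for pk, row in clustered.items():
    secondary_user_id.setdefault(row["user_id"], []).append(pk)


def find_by_user(user_id):
    """Step 1: the secondary index yields primary keys.
    Step 2: each primary key is looked up in the clustered index."""
    primary_keys = secondary_user_id.get(user_id, [])
    return [clustered[pk] for pk in primary_keys]


print(find_by_user(42))
```

Note that the secondary structure never holds the status column; anything beyond user_id and the primary key has to come from the second lookup.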
The chart above isn't to scale — the indexed bar would be invisible if it were — but it makes the point: an index turns an O(n) scan into something close to O(log n). On small tables the difference doesn't matter. On a table with millions of rows, it's the difference between a query that returns instantly and one that times out.
Creating and Managing Indexes
MySQL gives you three equivalent ways to add an index: inline during CREATE TABLE, standalone with CREATE INDEX, or appended later with ALTER TABLE. All three produce the same on-disk structure.
Inline during table creation:
CREATE TABLE orders (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id INT UNSIGNED NOT NULL,
    status VARCHAR(20) NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id),
    INDEX idx_orders_user (user_id)
    -- idx_orders_status is added with ALTER TABLE below; defining it here too
    -- would make that statement fail with a duplicate key name error
);
Standalone, on an existing table:
CREATE INDEX idx_orders_created_at ON orders (created_at);
Added with ALTER TABLE:
ALTER TABLE orders ADD INDEX idx_orders_status (status);
To enforce uniqueness alongside the lookup speedup, use UNIQUE:
CREATE UNIQUE INDEX idx_users_email ON users (email);
Name your indexes with a consistent pattern, such as idx_table_column. It keeps SHOW INDEX FROM table_name readable and makes dropping a specific index straightforward: DROP INDEX idx_orders_status ON orders;. This mirrors the same naming discipline covered in the foreign key guide for constraint names.
To recap, an index can be defined inline in CREATE TABLE, standalone via CREATE INDEX, or after the fact with ALTER TABLE ... ADD INDEX, and all three produce an identical on-disk B-tree structure. Adding UNIQUE before INDEX enforces that no two rows share the same indexed value while retaining the same lookup speedup (MySQL 8.0 Reference Manual).
Composite Indexes and the Leftmost Prefix Rule
A composite index spans multiple columns, sorted first by the first column, then by the second within each group of matching first-column values, and so on. INDEX idx_orders_lookup (user_id, status, created_at) sorts first by user_id, then by status within each user, then by created_at within each status.
CREATE INDEX idx_orders_lookup ON orders (user_id, status, created_at);
That physical ordering is why the leftmost prefix rule exists. MySQL can use this index for a query filtering on user_id alone, on user_id and status, or on all three columns, because each of those is a contiguous slice of the sorted structure. It can't do an ordinary index lookup for a query filtering on status alone, or on created_at alone, because neither is a leftmost prefix. (MySQL 8.0.13 and later can sometimes fall back to a skip scan in such cases, but that is a narrow optimization rather than something to design around.)
| Query filters on | Can use idx_orders_lookup (user_id, status, created_at)? |
|---|---|
| user_id | Yes, leftmost column |
| user_id, status | Yes, leftmost two columns |
| user_id, status, created_at | Yes, full index |
| status only | No, not a leftmost prefix |
| created_at only | No, not a leftmost prefix |
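The "contiguous slice" idea can be checked with a sorted Python list standing in for the composite index. This is a sketch of the ordering argument only, not of how InnoDB stores or searches pages; the sample rows and the helper names prefix_range and positions_for_status are invented for illustration.

```python
from bisect import bisect_left

# Entries sorted the way an index on (user_id, status, created_at) would be.
entries = sorted([
    (7, "pending", "2026-01-05"),
    (7, "shipped", "2026-01-02"),
    (42, "pending", "2026-01-03"),
    (42, "shipped", "2026-01-01"),
    (42, "shipped", "2026-01-04"),
    (99, "shipped", "2026-01-06"),
])


def prefix_range(prefix):
    """Binary-search to the first entry matching a leftmost prefix,
    then read forward while the prefix still matches."""
    n = len(prefix)
    # A shorter tuple sorts before any longer tuple that it prefixes,
    # so bisect_left lands on the first matching entry.
    start = bisect_left(entries, prefix)
    end = start
    while end < len(entries) and entries[end][:n] == prefix:
        end += 1
    return entries[start:end]


def positions_for_status(status):
    """status is not a leftmost prefix: every entry has to be checked."""
    return [i for i, e in enumerate(entries) if e[1] == status]


print(prefix_range((42,)))              # one contiguous run
print(prefix_range((42, "shipped")))    # a narrower contiguous run
print(positions_for_status("shipped"))  # scattered positions
```

Filtering on user_id, or on user_id and status, lands on one unbroken run of entries. Filtering on status alone produces matches scattered through the list, so there is no single place to jump to.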
The practical implication is that column order in a composite index is a design decision, not an afterthought. Put the column your application filters on most often — usually a foreign key like user_id — first, and reserve the rightmost position for the column that benefits most from range queries, like a timestamp, since range conditions stop the leftmost-prefix matching at that point.
When MySQL Ignores Your Index
An index existing doesn't guarantee MySQL uses it. The query optimizer estimates the cost of each available access path and picks the cheapest one — and sometimes that's a full table scan, even with a matching index in place. Why would the optimizer skip a structure built specifically to speed up the query?
When the optimizer estimates a query will touch a large percentage of a table's rows, a table scan's sequential reads often beat an index's random reads, since random I/O is far more expensive per page than sequential I/O on spinning disks and still carries overhead on SSDs. Percona's benchmarking found that in a disk-bound scenario where the working set didn't fit in memory, a full table scan completed in about 4 seconds while a full index scan on the same query took about 30 seconds — the opposite of what most developers expect.
This is the benchmark result that surprises developers building a schema visually for the first time: adding an index isn't always a free win. Random reads scattered across a disk-bound table can cost far more than reading the whole table in sequence, so index design has to account for row count, memory, and how selective the condition actually is — not just "does a matching column have an index."
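A toy cost comparison shows why selectivity matters so much. This is not MySQL's cost model: the function cheaper_access_path, the 100 rows per page, and the 10x penalty for a random page read are all assumptions chosen to make the trade-off visible, and the model pessimistically charges one random read per matching row.

```python
def cheaper_access_path(total_rows, matching_rows, rows_per_page=100,
                        seq_page_cost=1.0, random_page_cost=10.0):
    """Compare a full scan against an index lookup under invented costs.

    Full scan: read every table page sequentially.
    Index path: one random page read per matching row (worst case).
    """
    table_pages = -(-total_rows // rows_per_page)  # ceiling division
    scan_cost = table_pages * seq_page_cost
    index_cost = matching_rows * random_page_cost
    if index_cost < scan_cost:
        return ("index", index_cost)
    return ("table scan", scan_cost)


# A selective filter on a 1-million-row table favours the index...
print(cheaper_access_path(1_000_000, 100))
# ...while a filter matching 30% of the rows favours the scan.
print(cheaper_access_path(1_000_000, 300_000))
```

The exact break-even point depends entirely on the assumed constants; the point of the sketch is only that the index path's cost grows with the number of matching rows while the scan's cost stays fixed.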
The optimizer also skips indexes on very small tables — typically under about ten rows — since a direct scan is cheaper than the overhead of a B-tree traversal. And as covered above, it ignores a composite index entirely once a query's filter conditions fall outside the leftmost prefix.
Reading EXPLAIN Output
EXPLAIN shows the execution plan MySQL chose for a query, including which index (if any) it used. Run it by prefixing any SELECT:
EXPLAIN SELECT * FROM orders WHERE user_id = 42 AND status = 'shipped';
The key column tells you which index MySQL actually chose — NULL means no index was used, which is your signal to investigate. The rows column estimates how many rows MySQL expects to examine; a number close to the table's total row count on a query that should be selective is a red flag. Extra showing Using filesort or Using temporary often points to a missing index on an ORDER BY or GROUP BY column.
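Those three checks are easy to automate once an EXPLAIN row has been fetched into a dictionary. The helper below is a hypothetical sketch: explain_warnings is not part of any MySQL client library, the dictionary shape assumes a driver that returns rows keyed by EXPLAIN's column names (key, rows, Extra), and the 50% threshold for "close to the whole table" is an arbitrary assumption.

```python
def explain_warnings(plan_row, table_row_count, scan_fraction=0.5):
    """Return warnings for one row of EXPLAIN output.

    plan_row: dict keyed by EXPLAIN column names ("key", "rows", "Extra").
    scan_fraction: assumed cut-off for "examines most of the table".
    """
    warnings = []
    if plan_row.get("key") is None:
        warnings.append("no index used (key is NULL)")
    rows = plan_row.get("rows") or 0
    if table_row_count > 0 and rows >= scan_fraction * table_row_count:
        warnings.append(
            f"estimated rows ({rows}) is close to the table size ({table_row_count})"
        )
    extra = plan_row.get("Extra") or ""
    for marker in ("Using filesort", "Using temporary"):
        if marker in extra:
            warnings.append(f"Extra contains '{marker}'")
    return warnings


suspect = {"key": None, "rows": 990_000, "Extra": "Using where; Using filesort"}
for warning in explain_warnings(suspect, table_row_count=1_000_000):
    print(warning)
```

A check like this is only a first filter; a flagged plan still needs a human to decide whether the scan or sort is actually a problem for that query.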
For a deeper look at actual (not estimated) execution timing, EXPLAIN ANALYZE (available since MySQL 8.0.18) runs the query and reports measured time and row counts for each plan node alongside the optimizer's estimates:
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 42 AND status = 'shipped';
To find indexes that exist but are never used — pure write overhead with no read benefit — query MySQL 8.0's built-in sys schema:
SELECT * FROM sys.schema_unused_indexes;
This view surfaces indexes with zero recorded reads in Performance Schema since the last restart or stats reset. Confirm over a reasonably long observation window before dropping anything — a monthly report query still needs its index even if it hasn't run in the last hour.
The sys.schema_unused_indexes view, built on top of Performance Schema's table_io_waits_summary_by_index_usage table, lists every index with zero recorded read operations since the last server restart or statistics reset, which makes it the standard starting point for finding indexes that only add write overhead.
Common Mistakes
- Indexing every column "just in case." MySQL must update every index on every INSERT, UPDATE, and DELETE. Add indexes to match queries you've actually observed via EXPLAIN or the slow query log, not speculatively.
- Wrong column order in a composite index. An index on (status, user_id) is nearly useless for a query that filters mostly on user_id, since user_id isn't the leftmost column. Match the order to your most common filter pattern.
- Functions wrapped around indexed columns. WHERE YEAR(created_at) = 2026 can't use an index on created_at, because MySQL must evaluate the function per row before it can compare. Rewrite as a range: WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01'.
- Ignoring cardinality. An index on a low-cardinality column like a boolean is_active flag rarely helps, since it can't narrow the row set enough to beat a table scan. Reserve indexes for columns with many distinct values.
- Never revisiting indexes after schema changes. Indexes that made sense a year ago can become dead weight once query patterns shift. Periodically check sys.schema_unused_indexes as part of routine maintenance, the same way you'd review foreign key constraints after a schema refactor.
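The function-wrapping mistake in particular is easy to see with a sorted list standing in for an index on created_at. The sketch below is illustrative only: the sample dates and the names rows_by_year_function and rows_by_range are made up, and a Python list is not how InnoDB stores an index.

```python
from bisect import bisect_left
from datetime import date

# Sorted created_at values, standing in for an index on created_at.
created_at_index = [
    date(2025, 11, 3), date(2025, 12, 31), date(2026, 1, 1),
    date(2026, 6, 15), date(2026, 12, 31), date(2027, 1, 1),
]


def rows_by_year_function(year):
    """Like WHERE YEAR(created_at) = year: the function runs on every entry."""
    checked = 0
    matches = []
    for d in created_at_index:
        checked += 1
        if d.year == year:
            matches.append(d)
    return matches, checked


def rows_by_range(start, end):
    """Like WHERE created_at >= start AND created_at < end:
    two binary searches bound one contiguous slice."""
    lo = bisect_left(created_at_index, start)
    hi = bisect_left(created_at_index, end)
    return created_at_index[lo:hi]


matches, checked = rows_by_year_function(2026)
print(len(matches), "matches after checking", checked, "entries")
print(rows_by_range(date(2026, 1, 1), date(2027, 1, 1)) == matches)
```

Both versions return the same rows, but only the range form can exploit the sort order; the function form has to visit every entry because the sort order of created_at says nothing directly about the function's output.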
Visualise Indexes Before Writing DDL
Index planning is easier once the schema's query paths are visible. A diagram makes it obvious which foreign key columns need composite indexes, and which lookup columns are worth indexing before you've written a single CREATE INDEX statement.
You can design your schema visually and export the correct MySQL DDL, including primary keys and foreign key indexes, directly from the diagram. The crow's foot notation guide explains how to read the relationship lines that typically indicate where a foreign key index belongs, and the database schema examples post shows complete indexed schemas across common application types.
Frequently Asked Questions
What is the syntax to create an index in MySQL?
CREATE INDEX idx_name ON table_name (column_name); creates a standalone index. You can also add one inline during CREATE TABLE, or with ALTER TABLE table_name ADD INDEX idx_name (column_name);. Use CREATE UNIQUE INDEX to also enforce uniqueness on the indexed column.
Why does MySQL ignore my index and do a full table scan?
The optimizer skips an index when it estimates the query will touch a large percentage of the table's rows, since a table scan needs fewer random seeks than an equivalent index lookup at that volume. It also ignores indexes on tables with very few rows, and skips a composite index entirely if the query doesn't filter on its leftmost column.
What is the leftmost prefix rule for composite indexes?
A composite index on (a, b, c) can serve queries filtering on a, on a and b, or on a, b, and c — but not queries that filter on b or c alone without a. MySQL can only use a composite index starting from its leftmost column, since that's the order the B-tree is physically sorted by.
How many indexes should a MySQL table have?
There's no fixed number — it depends on read/write ratio. Every index speeds up matching SELECT queries but adds overhead to every INSERT, UPDATE, and DELETE, since MySQL must update each index alongside the table data. Add indexes to match actual query patterns found via EXPLAIN or slow query logs, not preemptively.
How do I find unused indexes in MySQL?
Query sys.schema_unused_indexes, a view built into MySQL 8.0's sys schema that lists indexes with zero recorded reads since the last server restart or Performance Schema reset. Cross-check with a longer observation window before dropping anything, since seasonal or infrequent queries might rely on an index that looks unused over a short sample.