cs.thefarshad
medium

Database Internals

How databases store data on disk, find rows fast with indexes, and keep data correct with transactions.

A database has to do two hard things well: find data fast even with billions of rows, and stay correct even with many writers and sudden crashes. Indexes solve the first; transactions solve the second.

The index is almost always a B-tree. Insert keys below and watch nodes fill, split, and push a median upward — keeping every leaf at the same depth so lookups stay shallow. Then search for a key to trace the path the database walks.

Insert keys to build the B-tree.
1/35
Empty tree. Insert keys to watch nodes fill and split.
compare inserted split found
order 3 (max 2 keys/node) · 10 keys

Indexes: finding rows fast

Without an index, finding a row means scanning the whole table — O(n). An index is a separate, sorted structure that points into the table so lookups become O(log n).

Most relational databases use a B-tree (a balanced, high-fan-out tree kept on disk). Its shallow height means a lookup touches only a few disk pages. The trade-off: every index speeds up reads but slows down writes (each insert must update the index too) and uses space.

Write-heavy systems often use an LSM-tree instead: buffer writes in memory, flush them as sorted files, and merge in the background — great for ingest, at the cost of more work on reads.

Transactions and ACID

A transaction groups operations so they happen all-or-nothing. The guarantees are ACID:

  • Atomicity — all steps commit, or none do (no half-finished transfers).
  • Consistency — the database moves from one valid state to another.
  • Isolation — concurrent transactions don’t see each other’s partial work.
  • Durability — once committed, it survives a crash (via a write-ahead log flushed before acknowledging).

Isolation levels

Full isolation is expensive, so databases offer levels trading correctness for speed — from read committed (may see different data if you re-read) up to serializable (as if transactions ran one at a time). Picking the right level is a real engineering decision.

Takeaways

  • Indexes (usually B-trees) turn O(n) scans into O(log n) lookups — but cost write speed and space.
  • Transactions give ACID guarantees; the write-ahead log provides durability.
  • Isolation levels trade strictness for concurrency/performance.