
Cryptographic Hash Functions

One-way fingerprints for data: deterministic, collision-resistant, and so sensitive to input changes (the avalanche effect) that they power integrity checks and password storage.

A cryptographic hash function takes any input and produces a fixed-size digest — a short fingerprint of the data. The same input always yields the same digest, but you cannot run the function backwards to recover the input, and no one can feasibly find two inputs with the same digest.

Take a message, then change a single character. The digest transforms completely; that sensitivity is the avalanche effect. Under this page's toy 64-bit hash:

  message:                attack·at·dawn    digest: 79759aaa59d927df
  one character changed:  Attack·at·dawn    digest: 440a553c8592aac6

Changing 1 character changed 16/16 hex digits and 38/64 bits (about 59%).

The hash above is a deliberately simple toy so the avalanche is easy to see. Real functions like SHA-256 produce 256-bit digests and have withstood years of cryptanalysis.
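The same experiment can be repeated with a real function. The sketch below uses Python's standard `hashlib` module with SHA-256; the helper name `bit_difference` and the two sample messages are assumptions made for illustration, not part of the demo above.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two sample messages that differ only in the case of the first letter.
d1 = hashlib.sha256(b"attack at dawn").digest()
d2 = hashlib.sha256(b"Attack at dawn").digest()

changed = bit_difference(d1, d2)
total = len(d1) * 8  # a SHA-256 digest is 32 bytes, i.e. 256 bits
print(f"{changed}/{total} bits differ ({changed / total:.0%})")
```

For a well-behaved hash the printed fraction should land near one half, which is what the avalanche effect predicts.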

The four properties

A good cryptographic hash $H$ is:

  • Deterministic: $H(x)$ is always the same for the same $x$.
  • One-way (preimage resistant): given a digest $d$, finding any $x$ with $H(x) = d$ is infeasible.
  • Collision resistant: finding two different inputs $x \neq y$ with $H(x) = H(y)$ is infeasible.
  • Avalanche: flipping one input bit flips about half the output bits, so digests reveal nothing about how similar two inputs were.
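Some of these properties can be checked directly, and the role of digest size can be made visible. The following is a minimal sketch using `hashlib`: it confirms determinism and fixed output length for SHA-256, then deliberately weakens the hash by keeping only the first 2 bytes of the digest. That truncation, and the helper names `sha256_hex` and `find_truncated_collision`, are hypothetical choices for illustration only. With just 16 bits of output, a brute-force search finds a collision almost instantly, which is exactly what a full 256-bit digest is meant to rule out.

```python
import hashlib
from itertools import count


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Deterministic: hashing the same input twice gives the same digest.
assert sha256_hex(b"hello") == sha256_hex(b"hello")

# Fixed size: 64 hex characters (256 bits) regardless of input length.
assert len(sha256_hex(b"")) == len(sha256_hex(b"x" * 10_000)) == 64


def find_truncated_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Find two distinct messages whose SHA-256 digests share the first n_bytes.

    Truncation is an artificial weakening used only to make a collision
    findable; it is not how SHA-256 is used in practice.
    """
    seen: dict[bytes, bytes] = {}
    for i in count():
        msg = str(i).encode()
        tag = hashlib.sha256(msg).digest()[:n_bytes]
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg


a, b = find_truncated_collision()
print(a, b, "collide on the first 2 digest bytes")
```

Each extra byte kept in the tag makes the search roughly 16 times longer on average, so the cost grows far beyond reach well before the full digest length.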

Integrity: detecting tampering

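Publish a file alongside its SHA-256 digest. After downloading, you re-hash the file and compare. If even one byte changed in transit, the digests will differ: collision resistance makes it infeasible for a different file to share the digest, and the avalanche effect makes the mismatch impossible to miss. The check is only as trustworthy as the channel the published digest came through. This is how package managers and software releases verify they got exactly the intended bytes, and why Git identifies each commit by a hash of its contents.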
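The re-hash-and-compare step might look like the sketch below. It relies only on `hashlib`; the function names `sha256_file` and `matches_published_digest`, the chunk size, and the assumption that the published digest arrives as a hex string are all illustrative choices.

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the published one.

    Surrounding whitespace and letter case in the published value are ignored.
    """
    return sha256_file(path) == published_hex.strip().lower()
```

A caller would pass the downloaded file's path and the digest copied from the publisher, and refuse to use the file when the function returns `False`.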

Password storage: never store the password

Servers must never store raw passwords. Instead they store $H(\text{password})$. At login, the server hashes what you typed and compares digests. A breach then leaks only digests, not passwords.

But plain hashing is not enough. Attackers precompute digests of common passwords (rainbow tables). The fix is a salt: a unique random value stored per user and mixed in, so you store $H(\text{salt} \mathbin{\|} \text{password})$. Identical passwords now get different digests, defeating precomputation.
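To see the effect of a salt in isolation, here is a small sketch that concatenates a random salt with the password before hashing. SHA-256 is used purely to show how salting changes the digest; it is a fast hash and not a suitable password hash by itself. The function name `salted_digest`, the 16-byte salt length, and the sample password are assumptions for illustration.

```python
import hashlib
import os


def salted_digest(password: str, salt: bytes) -> str:
    """Return the hex digest of salt || password (illustration only)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Two users who happen to choose the same password.
same_password = "hunter2"
salt_alice = os.urandom(16)  # unique random salt, stored next to the digest
salt_bob = os.urandom(16)

print(salted_digest(same_password, salt_alice))
print(salted_digest(same_password, salt_bob))
```

The two printed digests differ even though the passwords match, so a table precomputed without the salts no longer applies, and an attacker has to attack each user separately.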

Crucially, password hashing should be slow. Functions like bcrypt, scrypt, and Argon2 are deliberately expensive (scrypt and Argon2 are also memory-hard), so each guess costs the attacker real time and an offline attack that tests billions of candidates becomes impractical. Fast hashes like SHA-256 are right for integrity but wrong for passwords.
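As a rough sketch of slow, salted password storage, the example below uses `hashlib.scrypt` from Python's standard library (available when Python is built against a recent OpenSSL). The cost parameters, the 16-byte salt, and the function names `hash_password` and `verify_password` are assumptions chosen for illustration, not tuned recommendations; a real deployment would pick parameters for its own hardware.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumptions, not recommendations).
# n is the CPU/memory cost, r the block size, p the parallelism.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, key); both are stored, the password itself never is."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the login attempt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected_key)
```

At signup the server would store the pair returned by `hash_password`; at login it would call `verify_password` with the stored salt and key. `hmac.compare_digest` is used so the comparison time does not depend on how many leading bytes match.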

Takeaways

  • A cryptographic hash is a deterministic, one-way, collision-resistant fingerprint with a strong avalanche effect.
  • Use hashes to verify integrity — re-hash and compare digests.
  • Store passwords as salted hashes using a slow algorithm (bcrypt, scrypt, Argon2), never in plaintext and never with a fast hash alone.
