A primer · five layers

The art of not asking twice.

Every fast computer you have ever used works because somewhere, someone decided not to fetch the same thing twice. The pattern is called caching, and it shows up at every layer of the stack — from the silicon inside your CPU to the database in front of your application. This is a tour of five places it lives, and why each one is worth the trouble.

The pattern
Keep a small, fast copy of frequently-needed data close to where you need it.
Why it works
Real-world access patterns are uneven. A few items are requested constantly; most are not.
Five examples
CPU caches, browser cache, browser DNS cache, OS DNS cache, Redis.
Why it exists

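A modern CPU core runs at roughly 3 GHz, so one clock cycle lasts about a third of a nanosecond. A read from main memory takes around 100 ns, roughly 100× longer than a hit in the fastest cache. Without caches, the processor would stall for hundreds of cycles on nearly every memory access. Each cache level trades capacity for speed: smaller and closer means faster.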

Diagram: CPU core (3 GHz) → L1 cache (~1 ns, 64 KB) → L2 cache (~3 ns, 1 MB) → L3 cache (~12 ns, 32 MB, shared) → main memory (~100 ns, several GB). Fastest to slowest is a gap of about 100×.
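To see how much the hierarchy buys you, here is a rough back-of-the-envelope sketch. It uses the latencies from the diagram and treats each one as the total cost of an access served at that level. The hit rates are invented for illustration and are not measurements of any real chip.

```python
# Latencies (ns) are the figures quoted in the diagram. The hit rates are
# made-up illustrative values, not measurements.
LEVELS = [
    ("L1", 1.0, 0.90),
    ("L2", 3.0, 0.80),
    ("L3", 12.0, 0.75),
]
MAIN_MEMORY_NS = 100.0


def average_access_time(levels, memory_ns):
    """Expected latency of one access, walking levels until one hits."""
    expected = 0.0
    reach_probability = 1.0  # chance the access gets as far as this level
    for _name, latency_ns, hit_rate in levels:
        expected += reach_probability * hit_rate * latency_ns
        reach_probability *= 1.0 - hit_rate
    # Whatever missed every level is served by main memory.
    return expected + reach_probability * memory_ns


if __name__ == "__main__":
    avg = average_access_time(LEVELS, MAIN_MEMORY_NS)
    print(f"with caches:    {avg:.2f} ns per access")
    print(f"without caches: {MAIN_MEMORY_NS:.2f} ns per access")
```

With these assumed hit rates the average lands near 1.8 ns, even though only 0.5% of accesses ever reach main memory. Change the hit rates and the picture shifts quickly, which is why locality matters so much.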

Why it exists

A typical webpage downloads dozens of files — HTML, CSS, JavaScript, images, fonts. Re-downloading them on every visit would waste bandwidth and slow page loads to a crawl. The browser keeps a copy on disk, controlled by HTTP cache headers like Cache-Control and ETag sent by the server.

Diagram: browser (renders the page) → disk cache (~5 ms, local SSD, on device; holds HTML, CSS, JS, images) → web server (100–800 ms, across the internet).
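The decision a browser makes for each file can be sketched in a few lines. This is a simplified model, not how any particular browser implements it: `CachedResponse` and `plan_request` are hypothetical names, and the sketch only looks at `max-age` and `ETag`, ignoring directives such as `no-store`, the `Expires` header, and heuristic freshness.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedResponse:
    body: bytes
    stored_at: float          # seconds, same clock as `now` below
    max_age: Optional[int]    # from Cache-Control: max-age=N, if present
    etag: Optional[str]       # from the ETag response header, if present


def parse_max_age(cache_control: str) -> Optional[int]:
    """Pull N out of a 'max-age=N' directive; None if it is absent."""
    match = re.search(r"(?:^|[,\s])max-age=(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def plan_request(entry: Optional[CachedResponse], now: float):
    """Decide what to do for one URL: returns (action, extra_request_headers)."""
    if entry is None:
        return "fetch", {}                # nothing cached: full download
    age = now - entry.stored_at
    if entry.max_age is not None and age < entry.max_age:
        return "use_cache", {}            # still fresh: no network at all
    if entry.etag is not None:
        # Stale, but the server can confirm it is unchanged with a 304.
        return "revalidate", {"If-None-Match": entry.etag}
    return "fetch", {}
```

The middle case is the interesting one: a stale copy with an ETag costs one small round trip, and if the server answers 304 Not Modified the browser reuses the body it already has.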

Why it exists

Every hostname like example.com must be resolved to an IP address before any data can flow. Even a quick OS-level DNS lookup costs a system call and inter-process work. Browsers keep their own in-process DNS cache so common hostnames resolve at essentially zero cost: a hash table read in the same memory as the page itself.

Diagram: browser process (needs an IP to open a connection) → in-process DNS cache (~0.01 ms, just a hash table read, same memory as the page) → OS resolver (syscall; slower, shared by all apps, crosses a process boundary).
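An in-process DNS cache really can be as small as a dictionary with expiry times. The sketch below is an illustration, not a browser's actual code: `DnsCache` is a hypothetical class, and the fixed 60-second TTL is an assumption, since `socket.getaddrinfo` does not report the record's real TTL.

```python
import socket
import time


class DnsCache:
    """Hostname -> (addresses, expiry) kept in a plain dict."""

    def __init__(self, resolve=None, ttl_seconds=60.0, clock=time.monotonic):
        self._resolve = resolve or self._system_resolve
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    @staticmethod
    def _system_resolve(hostname):
        # Falls through to the OS resolver. getaddrinfo does not expose the
        # record's real TTL, which is why this sketch uses a fixed one.
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
        return sorted({info[4][0] for info in infos})

    def lookup(self, hostname):
        now = self._clock()
        hit = self._entries.get(hostname)
        if hit is not None and hit[1] > now:
            return hit[0]                     # fast path: one dict read
        addresses = self._resolve(hostname)   # slow path: ask the next layer
        self._entries[hostname] = (addresses, now + self._ttl)
        return addresses
```

Calling `DnsCache().lookup("example.com")` twice in a row would hit the OS resolver only the first time. The `resolve` and `clock` parameters exist so the behaviour can be exercised without touching the network.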

Why it exists

When a DNS lookup misses every local cache, resolution can take 50–300 ms as the query travels through your router, your ISP, the root servers, the TLD servers, and finally the authoritative nameserver. The OS keeps a system-wide cache so a lookup done by your terminal warms the cache for your browser, your mail client, and everything else.

Diagram: OS resolver (request from any app: browser, mail, terminal) → OS DNS cache (~0.1 ms, system-wide and shared, on device) → DNS hierarchy (50–300 ms; router → ISP → root → TLD, over the public internet).
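The "one app warms the cache for another" effect is easy to model. The following toy simulation is not how any operating system implements its resolver: `SharedResolverCache`, `App`, and `fake_dns_hierarchy` are hypothetical stand-ins, and expiry is left out to keep the sharing behaviour in focus.

```python
class SharedResolverCache:
    """Stand-in for the OS-wide cache: one instance shared by every app."""

    def __init__(self, upstream):
        self._upstream = upstream      # callable: hostname -> address
        self._answers = {}
        self.upstream_queries = 0

    def resolve(self, hostname):
        if hostname not in self._answers:
            self.upstream_queries += 1             # the 50–300 ms path
            self._answers[hostname] = self._upstream(hostname)
        return self._answers[hostname]


class App:
    """A process that resolves names through the shared cache."""

    def __init__(self, name, os_cache):
        self.name = name
        self._os_cache = os_cache

    def connect(self, hostname):
        return self._os_cache.resolve(hostname)


def fake_dns_hierarchy(hostname):
    # Hypothetical upstream. 192.0.2.0/24 is reserved for documentation.
    return "192.0.2.10"


if __name__ == "__main__":
    os_cache = SharedResolverCache(fake_dns_hierarchy)
    terminal = App("terminal", os_cache)
    browser = App("browser", os_cache)
    terminal.connect("example.com")   # miss: goes upstream
    browser.connect("example.com")    # hit: warmed by the terminal
    print(os_cache.upstream_queries)  # prints 1
```

Two apps asked for the same name, but only one query left the machine. A per-app cache alone could not do that, which is the case for having both layers.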

Why it exists

Reading from a relational database means parsing SQL, planning a query, hitting disk, locking rows, and serialising results, which can easily add up to 50–200 ms for a heavy query. Redis keeps the answers to hot queries in RAM, so repeat reads return in about a millisecond and the database is freed up to do real work.

Diagram: app server (your application; handles the request, checks Redis first) → Redis (~1 ms; key → value, all in RAM) → database (50–200 ms; SQL, disk, locks; the source of truth).
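"Checks Redis first" usually means the cache-aside pattern, sketched below. Treat it as one common arrangement rather than the only way to use Redis. `get_user`, the `user:<id>` key format, and the 300-second TTL are assumptions for illustration, and `FakeRedis` is an in-memory stand-in so the example runs without a server. A redis-py client could be passed in its place, since it offers `get` and `setex` with the same argument order.

```python
import json


class FakeRedis:
    """In-memory stand-in exposing the two redis-py calls used below."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def setex(self, key, ttl_seconds, value):
        self._data[key] = value   # a real server would also expire the key


def get_user(cache, db_lookup, user_id, ttl_seconds=300):
    """Cache-aside read: try the cache, fall back to the database, then fill."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)         # hit: no database work
    row = db_lookup(user_id)              # miss: the slow path
    if row is not None:
        cache.setex(key, ttl_seconds, json.dumps(row))
    return row
```

The TTL is what keeps a stale answer from living forever. Writes to the database would also need to delete or overwrite the key, which this read-only sketch leaves out.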

The pattern, abstracted

Three roles, repeated.

Once you've stepped through the five examples, the underlying pattern is hard to miss. There's always a requester that needs data. There's always a cache that holds a fast copy. And there's always a slow source of truth that has the real answer when the cache doesn't.

Two design questions show up everywhere: how long is a cached value still valid (TTLs, ETags, hardware coherence protocols), and what gets evicted when the cache fills up (LRU, or a cheap approximation of it, is the most common answer). Get those two right, and a tiny amount of fast storage in front of a slow source can absorb the vast majority of traffic.
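The eviction half of that pair fits in a few lines. Here is a minimal LRU sketch built on the standard library's `OrderedDict`. `LRUCache` is a hypothetical class written for illustration, and real caches, hardware ones especially, tend to use cheaper approximations of the same idea.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = OrderedDict()   # oldest use first, newest use last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # drop the least recently used
```

With a capacity of two, putting `a` and `b`, reading `a`, then putting `c` evicts `b`, because `a` was touched more recently. For memoising plain function calls, Python's built-in `functools.lru_cache` applies the same policy.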

Role · 1
The requester asks for data.
Role · 2
The cache holds a fast copy.
Role · 3
The source of truth knows the real answer.