Caching — the single most effective way to make a slow system fast.

What is Caching and Why Does It Matter?

The Core Idea

  • In software: caching stores the result of an expensive operation in fast memory, so the next identical request gets the answer immediately without re-computing.
Without cache:  Request → Database query (100ms) → Response
                Every request hits the DB; the DB gets overwhelmed at scale.

With cache:     Request → Redis lookup (1ms) → Response
                Only cache misses hit the DB; the DB serves ~10% of the traffic.

Cache Hit vs Cache Miss

  • Cache Hit — The data you need IS in the cache. Fast response. ✅
  • Cache Miss — The data is NOT in cache. Must query the database. Slow first time. ❌
Cache Hit Rate = hits / (hits + misses) × 100%

A 90% hit rate = only 10% of requests hit the database
A 95% hit rate = half the DB load of a 90% hit rate (5% vs 10% of requests reach the DB)

Target: 80–95% hit rate for a well-tuned cache
A low hit rate means your cache keys are wrong or TTL is too short
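The hit-rate formula above is easy to track in application code. A minimal sketch (the class and counter names are illustrative, not from any library):

```python
class HitRateTracker:
    """Counts cache hits and misses and reports the hit rate as a percentage."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit: bool):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return 100.0 * self.hits / total if total else 0.0

tracker = HitRateTracker()
for found in [True] * 9 + [False]:   # 9 hits, 1 miss
    tracker.record(found)
print(tracker.hit_rate())  # 90.0
```

Wiring `record()` into your cache lookup path and exporting the rate as a metric is usually enough to spot bad keys or short TTLs.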

Cache Placement — Where Does the Cache Live?

Layers of Caching

  • Caching isn’t just one thing — there are multiple layers, each serving a different purpose:
Layer              Location                  What's Cached                Speed                             Example
Browser cache      User's browser            HTML, CSS, JS, images        Instant (no network)              Cache-Control: max-age=86400
CDN (edge cache)   Servers near the user     Static files, images, video  ~10–50ms (vs ~200ms from origin)  Cloudflare, CloudFront
App-level cache    Your app server's memory  Computed results             ~0.1ms                            Python dict, Node.js Map
Distributed cache  Dedicated cache server    DB query results, sessions   ~1–5ms                            Redis, Memcached
Database cache     Inside the database       Query results, indexes       Varies                            PostgreSQL shared_buffers
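The "app-level cache" row can be as simple as a dict plus expiry timestamps. A minimal in-process sketch (local to one server, unlike Redis — the class name is illustrative):

```python
import time

class LocalCache:
    """Tiny in-process cache: a dict mapping key -> (expires_at, value)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value, ttl_seconds):
        self._store[key] = (time.monotonic() + ttl_seconds, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]   # expired: evict lazily on read
            return None
        return value

cache = LocalCache()
cache.set("greeting", "hello", ttl_seconds=60)
```

This is fine for one process, but with 10 app servers each instance caches independently — which is exactly why the next row (a shared distributed cache) exists.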

How to Write Data to Cache — Write Strategies

Cache-Aside (Lazy Loading) — Most Common

  • How it works:
    1. App checks cache → hit? Return data immediately ✅
    2. App checks cache → miss? Query the database
    3. Store database result in cache (with a TTL expiry)
    4. Return data to user
    5. On data update: Delete (invalidate) the cache key
import json  # assumes `redis` (a redis-py client) and `db` (a DB handle) are already set up

def get_user(user_id):
    # 1. Check cache first
    data = redis.get(f"user:{user_id}")
    if data:
        return json.loads(data)          # Cache HIT ✅

    # 2. Cache miss — go to database
    user = db.query("SELECT * FROM users WHERE id=?", user_id)

    # 3. Store in cache for 1 hour, then return
    redis.setex(f"user:{user_id}", 3600, json.dumps(user))
    return user

def update_user(user_id, new_data):
    db.query("UPDATE users SET ... WHERE id=?", user_id)
    redis.delete(f"user:{user_id}")     # Invalidate stale cache
  • Pros: Simple; only caches what's actually needed.
  • Cons: The first request after a miss is slow (cold start). Stale data risk if invalidation is missed.

Write-Through — Always Fresh

  • How it works: Every write goes to cache AND database simultaneously. Cache is always in sync with the database.
Write request arrives:
  1. Write to Cache   ✅
  2. Write to Database ✅  (both in same operation)
  3. Next read → cache hit ✅

No stale data possible.
Cost: Every write is as slow as a DB write (no speed benefit for writes).
  • Best for: Data that is read frequently right after being written (e.g., user profile updates).
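The write-through flow above can be sketched like this (the dict-backed `cache` and `db` are stand-ins for Redis and a real database; the class name is illustrative):

```python
import json

class WriteThroughCache:
    """Every write updates the backing store AND the cache together."""

    def __init__(self, cache, db):
        self.cache = cache
        self.db = db

    def write(self, key, value):
        self.db[key] = value                  # 1. write to the database
        self.cache[key] = json.dumps(value)   # 2. write to the cache in the same operation

    def read(self, key):
        cached = self.cache.get(key)
        if cached is not None:                # always a hit right after a write
            return json.loads(cached)
        value = self.db.get(key)              # cold-start fallback
        if value is not None:
            self.cache[key] = json.dumps(value)
        return value

store = WriteThroughCache(cache={}, db={})
store.write("user:1", {"name": "Alice"})
```

Note that `write()` pays the full DB cost on every call — the trade-off stated above: reads are always fresh, but writes get no speed benefit.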

Write-Back (Write-Behind) — Fast Writes

  • How it works: Write to cache immediately. Sync to database asynchronously (later, in background).
Write request arrives:
  1. Write to Cache ✅ (responds immediately — very fast)
  2. Background worker → writes to database later

Risk: If cache crashes before background sync → DATA LOST! ⚠️
  • Best for: High write throughput where occasional data loss is acceptable (analytics counters, likes). Never for: Financial transactions, orders, anything that cannot afford data loss.
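Write-back can be sketched with a dirty-key buffer that a background worker would flush; here `flush()` is called manually for illustration (class and method names are assumptions, not a library API):

```python
class WriteBackCache:
    """Writes land in the cache immediately; the DB is synced later in batches."""

    def __init__(self, db):
        self.cache = {}
        self.dirty = set()        # keys written to cache but not yet persisted
        self.db = db

    def write(self, key, value):
        self.cache[key] = value   # fast: no DB round-trip on the write path
        self.dirty.add(key)

    def flush(self):
        # In production this runs in a background worker on a timer.
        # If the process dies BEFORE flush(), the dirty writes are LOST.
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()

db = {}
wb = WriteBackCache(db)
wb.write("likes:post:42", 1001)
assert "likes:post:42" not in db   # not persisted yet — the data-loss window
wb.flush()
```

The window between `write()` and `flush()` is exactly the data-loss risk called out above, which is why this pattern is reserved for tolerable losses like counters.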

Which Strategy to Use?

Situation                                    Strategy
General purpose (most cases)                 Cache-Aside
Must always serve fresh data                 Write-Through
Very high write throughput, loss tolerable   Write-Back
Write-once, rarely-read data (logs)          Write-Around (skip cache on write)

Cache Eviction — What Happens When Cache is Full?

The Problem

  • Memory is expensive and limited. Your cache can’t store everything forever. When cache fills up, it must evict (delete) some entries to make room. The question is: which entries to remove?

LRU — Least Recently Used (The Default)

  • Rule: Remove the item that was least recently accessed. The assumption: if you haven’t needed something in a while, you probably won’t need it soon.
Cache capacity: 3 items

Step 1: Access A → Cache: [A]
Step 2: Access B → Cache: [A, B]
Step 3: Access C → Cache: [A, B, C]  ← full
Step 4: Access D → Must evict! A was least recently used → [B, C, D]
Step 5: Access B → B is now most recent → [C, D, B]
Step 6: Access E → Must evict! C is LRU → [D, B, E]
  • Good for: General-purpose caching. Most common choice (Redis default).
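The six-step walkthrough above maps directly onto Python's OrderedDict, where insertion order doubles as recency order. A minimal LRU sketch:

```python
from collections import OrderedDict

class LRUCache:
    """Least-recently-used cache: evicts the oldest-accessed key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # ordered oldest -> most recently used

    def access(self, key):
        if key in self.data:
            self.data.move_to_end(key)         # mark as most recently used
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)      # evict the LRU entry (front)
            self.data[key] = None
        else:
            self.data[key] = None

cache = LRUCache(3)
for key in ["A", "B", "C", "D", "B", "E"]:     # the walkthrough above
    cache.access(key)
print(list(cache.data))  # ['D', 'B', 'E']
```

Running the same access sequence as steps 1–6 reproduces the final state [D, B, E] from the walkthrough.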

LFU — Least Frequently Used

  • Rule: Remove the item accessed the fewest number of times overall.
  • Good for: When some data is always popular (hot items) and should stay cached.
  • Downside: A newly added hot item might get evicted before it accumulates access counts.
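An LFU sketch using a simple frequency counter (illustrative only — production LFU implementations use more elaborate bookkeeping):

```python
from collections import Counter

class LFUCache:
    """Least-frequently-used: evicts the key with the lowest access count."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.counts = Counter()

    def access(self, key):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.data, key=lambda k: self.counts[k])  # least frequent
            del self.data[victim]
            del self.counts[victim]
        self.data[key] = None
        self.counts[key] += 1

cache = LFUCache(2)
cache.access("hot")
cache.access("hot")
cache.access("warm")
cache.access("new")   # cache full: "warm" (count 1) is evicted, not "hot" (count 2)
```

Note the downside from above in action: "new" enters with count 1, so it becomes the next eviction victim even if it is about to become hot.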

TTL — Time To Live (The Simplest)

  • Rule: Every cache entry has an expiry time. After that time, it’s automatically deleted.
redis.setex("user:123", 3600, data)  # Expires in 1 hour

No need to manually invalidate — it self-destructs.
Risk: May serve stale data for up to TTL duration before expiry.
  • Good for: Data that changes infrequently (product catalog, config, exchange rates).

Cache Invalidation — The Hard Problem

Why It’s Hard

  • The problem: your database is the truth, but your cache has a copy. When the database changes, the cache copy is now stale (wrong). How do you keep them in sync?

Common Approaches

  • TTL-Based (Simplest): Set a short expiry. Accept that data may be stale for up to TTL duration. Works for: Product prices, weather, exchange rates. Not for: Bank balances.
  • Delete on Write (Most Common): When data changes in DB → immediately delete the cache key. Next read will be a cache miss → fresh data loaded from DB → re-cached. Simple and safe. One request after each update will be slow (cold miss).
  • Event-Driven: DB change triggers an event → cache consumer deletes the key. More complex but works well at scale with Kafka/CDC.

Cache Key Design

  • Good key design makes invalidation easy:
Format:  {service}:{entity}:{id}:{variant}

Examples:
  user:profile:12345
  product:detail:abc-789:en-US
  feed:timeline:user:67890:page:1

Why this matters:
  You can find ALL product cache keys with:  SCAN MATCH "product:*"  (then DEL each match — SCAN only lists keys)
  You can delete ONE user's cache with:      DEL "user:profile:12345"
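The pattern-based invalidation above can be sketched with a dict stand-in (with redis-py you would do the same thing via `scan_iter(match=pattern)` plus `delete` per key; the function name here is illustrative):

```python
from fnmatch import fnmatch

def invalidate_pattern(cache: dict, pattern: str) -> int:
    """Delete every key matching the glob pattern; returns how many were removed."""
    victims = [k for k in cache if fnmatch(k, pattern)]
    for k in victims:
        del cache[k]
    return len(victims)

cache = {
    "user:profile:12345": "...",
    "product:detail:abc-789:en-US": "...",
    "product:detail:xyz-111:en-US": "...",
}
removed = invalidate_pattern(cache, "product:*")   # removes both product keys
```

This is why the `{service}:{entity}:{id}` prefix convention matters: a shared prefix turns "invalidate a whole entity type" into one glob match.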

Cache Stampede — When Your Cache Saves You… Until It Doesn’t

The Danger Scenario

Scenario:
  Popular cache key "top_products" has TTL = 60 seconds.
  At t=60s, it expires.
  At that exact moment, 10,000 users request the page.
  All 10,000 get a cache miss simultaneously.
  All 10,000 query the database at the same time.
  Database crashes under the load. 💀
  This is called a "Cache Stampede" or "Thundering Herd".

How to Prevent It

  • 1. Add TTL Jitter (Easiest fix): Instead of TTL = 3600, use TTL = 3600 + random(0, 300). Keys expire at slightly different times → stampede spreads out.
  • 2. Mutex / Distributed Lock: First cache-miss request grabs a lock → fetches from DB → populates cache. All other requests WAIT for the lock → then read from cache (no DB hit).
lock_key = "lock:top_products"
if redis.set(lock_key, 1, nx=True, ex=5):   # nx=True: only set if not exists
    # This request won the lock — rebuild the cache from the DB
    data = db.query("SELECT * FROM products ORDER BY sales DESC LIMIT 10")
    redis.setex("top_products", 3600, json.dumps(data))
    redis.delete(lock_key)                  # release early (ex=5 is the safety net)
else:
    # Another request is rebuilding — wait briefly, then retry the cache
    data = None
    for _ in range(50):                     # wait up to ~5s for the rebuild
        time.sleep(0.1)
        data = redis.get("top_products")
        if data is not None:
            break
  • 3. Stale-While-Revalidate: Serve the stale cache immediately (fast response to user) while refreshing in background. Cache-Control: stale-while-revalidate=60 — browser / CDN handle this automatically for HTTP.
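TTL jitter from fix #1 is a one-liner. A sketch (3600 and 300 are the example values above; the function name is illustrative):

```python
import random

def jittered_ttl(base_seconds: int = 3600, spread_seconds: int = 300) -> int:
    """Base TTL plus a random offset so keys don't all expire at the same instant."""
    return base_seconds + random.randint(0, spread_seconds)

# e.g. redis.setex("top_products", jittered_ttl(), payload)
ttl = jittered_ttl()
```

Because each key now expires somewhere in a 5-minute window instead of at one shared moment, the misses (and the DB rebuilds behind them) spread out instead of arriving as a herd.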

Redis — The Industry Standard Cache

Why Redis, Not Just a Dictionary in Memory?

  • An in-memory dictionary in your app is fast — but it’s local to ONE server. If you have 10 app servers, each has its own cache → inconsistent, wasteful. Redis is a shared cache — all your app servers connect to it.
  • Also, Redis data survives app restarts (persistence options), unlike in-process memory.

Redis vs Memcached — When to Use Which

Question                                                      If Yes → Use
Do you need data types beyond strings (lists, sorted sets)?   Redis
Do you need persistence (survive reboots)?                    Redis
Do you need pub/sub messaging?                                Redis
Do you need rate limiting, leaderboards, job queues?          Redis
Do you ONLY need simple key→value with max throughput?        Memcached

Useful Redis Commands

# Store with TTL
SET user:123 '{"name":"Alice"}' EX 3600
 
# Get
GET user:123
 
# Delete (invalidate)
DEL user:123
 
# Increment counter (atomic — for rate limiting, view counts)
INCR page:views:home
 
# Sorted set (leaderboard)
ZADD leaderboard 9500 "player:alice"
ZADD leaderboard 8200 "player:bob"
ZREVRANGE leaderboard 0 9 WITHSCORES  # Top 10
 
# Check TTL remaining
TTL user:123  # returns seconds remaining
 
# Set only if not exists (mutex lock)
SET lock:job:456 1 NX EX 30

CDN — Caching at the Edge (Global Scale)

What is a CDN?

  • CDN = globally distributed servers that cache your static content close to users.
Without CDN:
  User in India → Server in US Virginia → 200ms latency → poor experience

With CDN:
  User in India → CDN edge in Mumbai → 10ms latency → fast experience
  Edge served cached copy of your CSS/JS/images/video

HTTP Cache Headers (Control CDN + Browser Caching)

Cache-Control: public, max-age=86400
  → Cache in both browser AND CDN for 1 day (86400 seconds)
  → Use for: static files (images, JS, CSS) that don't change

Cache-Control: private, max-age=3600
  → Cache in browser only (CDN skips it) for 1 hour
  → Use for: user-specific pages, dashboards

Cache-Control: no-cache
  → Always check with server before using cached copy
  → Server may respond 304 Not Modified (fast) if unchanged

Cache-Control: no-store
  → Never cache anywhere (sensitive data: banking, medical records)

Cache-Control: stale-while-revalidate=60
  → Serve stale immediately, refresh in background
  → Use for: pages that change but where slight staleness is OK
