Caching Strategies Explained: Cache-Aside, Write-Through, and More

Caching speeds up reads. But there's more than one way to cache data — and picking the wrong strategy causes stale data, cache misses, or data loss.

Here are the four main caching strategies and when each one makes sense.

Why Strategy Matters

A cache sits between your application and your database. The strategy defines two things:

How data gets into the cache (on read? on write?)
How writes are handled (update cache first? DB first? both?)

Getting this wrong causes either stale data (cache has old values) or cold cache (frequent misses that hit the DB every time).

1. Cache-Aside (Lazy Loading)

The most common pattern. The application manages the cache directly.

Read:

Check cache first
Cache hit → return data
Cache miss → fetch from DB, store in cache, return data

Write:

Write to DB
Invalidate (delete) the cached value

python

import redis
import json
 
r = redis.Redis()
# db is a placeholder for your database client; it is not defined in this snippet
 
def get_user(user_id: int) -> dict:
    cache_key = f"user:{user_id}"
    
    # Step 1: check cache
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)
    
    # Step 2: cache miss — fetch from DB
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    
    # Step 3: store in cache with TTL
    r.setex(cache_key, 300, json.dumps(user))
    
    return user
 
def update_user(user_id: int, data: dict):
    # Write to DB
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    
    # Invalidate cache — next read will fetch fresh from DB
    r.delete(f"user:{user_id}")

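Pros: Only requested data gets cached (no waste). The application tolerates Redis restarts: the database stays the source of truth, so the worst case is a cache miss, not data loss.

Cons: The first request after a miss is slow (cache miss penalty). Data can be stale for a brief window between the DB write and the invalidation, and a read that overlaps the write can re-cache the old value until the TTL expires.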

Best for: Read-heavy workloads. General purpose. When not all data needs to be cached.

2. Read-Through

Similar to cache-aside, but the cache library handles the DB fetch automatically. The application only ever talks to the cache.

python

# ReadThroughCache is a conceptual class, not a real library API
cache = ReadThroughCache(
    redis=r,
    loader=lambda user_id: db.query("SELECT * FROM users WHERE id = %s", user_id),
    ttl=300
)
 
# Application only calls the cache, which handles misses internally.
# The argument is the same value the loader receives (the user ID).
user = cache.get(user_id)

Pros: Simpler application code — no manual miss handling.

Cons: The first load of each key is still slow. You get less control over loading logic than with cache-aside.

Best for: When you want cache logic abstracted away. Common in ORM-level caching.
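
To make the idea concrete, here is a minimal sketch of what such a wrapper might look like. It is an assumption-laden toy, not any real library's API: a plain dict stands in for Redis, and the names `ReadThroughCache`, `load_user`, and `fake_db` are invented for illustration.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Toy read-through cache: callers only use get(); misses call the loader."""

    def __init__(self, loader: Callable[[Hashable], Any], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl
        self._clock = clock
        # key -> (expires_at, value); a plain dict stands in for Redis here
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        entry = self._store.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]            # hit: entry is still fresh
        value = self._loader(key)      # miss or expired: load from the source
        self._store[key] = (self._clock() + self._ttl, value)
        return value


# Hypothetical stand-in for the database
fake_db = {1: {"id": 1, "name": "Ada"}}
loads = []


def load_user(user_id):
    loads.append(user_id)              # record each trip to the "database"
    return fake_db[user_id]


cache = ReadThroughCache(loader=load_user, ttl=300)
cache.get(1)   # miss: calls load_user
cache.get(1)   # hit: served from the in-memory store
```

The application code never mentions the database. All loading logic lives behind `get()`, which is the trade-off against cache-aside.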

3. Write-Through

Every write goes to cache AND database synchronously.

python

def update_user(user_id: int, data: dict):
    # Write to DB
    db.execute("UPDATE users SET name = %s WHERE id = %s", data["name"], user_id)
    
    # Also update cache immediately
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    r.setex(f"user:{user_id}", 300, json.dumps(user))

Pros: The cache is refreshed on every write, so reads rarely see stale data. Reads of a key are fast as soon as it has been written once.

Cons: Every write hits the DB AND cache — slightly slower writes. Infrequently-read data still gets cached (wasted memory).

Best for: Write-then-immediately-read patterns. When stale reads are unacceptable. Often combined with cache-aside: write-through for frequently-accessed keys, cache-aside for everything else.

4. Write-Behind (Write-Back)

Write to cache immediately. Write to DB asynchronously in the background.

python

import asyncio
 
# db is a placeholder for your database client
write_buffer = {}
 
def update_user(user_id: int, data: dict):
    cache_key = f"user:{user_id}"
    
    # Update cache immediately so the user sees the change right away
    r.setex(cache_key, 300, json.dumps(data))
    
    # Queue DB write for background processing; a newer write replaces an older one
    write_buffer[cache_key] = (user_id, data)
 
async def flush_writes():
    global write_buffer
    while True:
        await asyncio.sleep(1)  # Flush every second
        # Swap the buffer first so writes queued during the flush are not cleared
        pending, write_buffer = write_buffer, {}
        for key, (user_id, data) in pending.items():
            db.execute("UPDATE users SET ... WHERE id = %s", user_id)

Pros: Writes feel instant. Excellent for write-heavy workloads — DB gets batched writes.

Cons: Data is lost if the cache, or the process holding the write buffer, crashes before the flush. More complex. Harder to ensure consistency.

Best for: Write-heavy workloads where small data loss is acceptable (game scores, view counts, analytics). Never use for financial data.
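
The batching benefit comes from coalescing: many updates to the same key collapse into one pending write. The sketch below illustrates this with an in-memory buffer. `WriteBehindBuffer` and the key names are hypothetical, and the actual database write is left out.

```python
from typing import Any, Dict, List, Tuple


class WriteBehindBuffer:
    """Collects pending writes per key; a later flush persists only the latest value."""

    def __init__(self) -> None:
        self._pending: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # A newer write to the same key overwrites the older pending one
        self._pending[key] = value

    def drain(self) -> List[Tuple[str, Any]]:
        # Swap in a fresh dict so writes arriving during the flush are kept
        batch, self._pending = self._pending, {}
        return list(batch.items())


buffer = WriteBehindBuffer()
for views in range(1, 101):
    buffer.put("views:article:42", views)   # 100 updates to one counter
buffer.put("views:article:7", 3)

batch = buffer.drain()   # what a background flusher would write to the DB
print(len(batch))        # 2 database writes instead of 101
```

Everything still sitting in the buffer is what you stand to lose on a crash, so the flush interval is also the size of the loss window.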

Cache Invalidation Strategies

How you expire cache entries matters as much as the strategy.

TTL (Time-to-Live): Simplest. Set a time limit. Data expires automatically.

python

r.setex("user:123", 300, json.dumps(user))  # Expires in 5 minutes

Worst-case stale window = TTL. Short TTL = more DB load. Long TTL = more stale data. Pick based on how often data changes.
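
To put rough numbers on that trade-off, here is a back-of-the-envelope model. It assumes requests for a single key arrive randomly at a steady average rate (a Poisson process) and that the TTL is fixed from the moment of each miss. Real traffic is burstier, so treat the output as an estimate. The function names are illustrative.

```python
def expected_hit_ratio(requests_per_second: float, ttl_seconds: float) -> float:
    """Hit ratio for one key under Poisson arrivals and a fixed TTL.

    Each cycle has one miss followed by, on average, rate * ttl hits,
    so hit ratio = rate * ttl / (1 + rate * ttl).
    """
    hits_per_cycle = requests_per_second * ttl_seconds
    return hits_per_cycle / (1.0 + hits_per_cycle)


def db_reads_per_second(requests_per_second: float, ttl_seconds: float) -> float:
    """Requests for that key that still reach the database."""
    miss_ratio = 1.0 - expected_hit_ratio(requests_per_second, ttl_seconds)
    return requests_per_second * miss_ratio


for ttl in (1, 30, 300):
    ratio = expected_hit_ratio(requests_per_second=10, ttl_seconds=ttl)
    print(f"TTL {ttl:>3}s: hit ratio {ratio:.3f}")
```

Under this model, a key that is read often gets most of the benefit from even a short TTL. Long TTLs mainly help keys that are read rarely.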

Event-based invalidation: Explicitly delete cache entries on writes.

python

# When user updates their profile
def update_profile(user_id, data):
    db.update(user_id, data)
    r.delete(f"user:{user_id}")           # Invalidate specific user
    r.delete("user_list:all")             # Invalidate any list caches
    r.delete("user_count")                # Invalidate aggregate caches

Near-immediate consistency: stale entries are removed as soon as the write completes. But you must track every cache key derived from this data.
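
One possible way to keep track of related keys is a small registry that records which cache keys depend on which entity. This is only a sketch: `KeyRegistry` is a hypothetical helper, and a dict stands in for Redis. In a real system the registry itself could live in the cache, for example as a set per entity.

```python
from collections import defaultdict
from typing import Dict, Set


class KeyRegistry:
    """Remembers which cache keys were derived from which entity."""

    def __init__(self) -> None:
        self._keys_by_entity: Dict[str, Set[str]] = defaultdict(set)

    def register(self, entity: str, cache_key: str) -> None:
        self._keys_by_entity[entity].add(cache_key)

    def keys_to_invalidate(self, entity: str) -> Set[str]:
        # pop() so the registry does not keep keys that are about to be deleted
        return self._keys_by_entity.pop(entity, set())


# A dict stands in for Redis
cache = {"user:123": "...", "user_list:all": "...", "user_count": "...", "user:456": "..."}
registry = KeyRegistry()
for key in ("user:123", "user_list:all", "user_count"):
    registry.register("user:123", key)   # all three depend on user 123

# On update: delete every key registered for the changed entity
for key in registry.keys_to_invalidate("user:123"):
    cache.pop(key, None)
```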

Versioned keys: Change the key when data changes — no explicit invalidation needed.

python

# Store user version in DB or Redis
user_version = get_user_version(user_id)  # Increment on each update
cache_key = f"user:{user_id}:v{user_version}"
 
# Old versioned keys just expire via TTL
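
The full read and write flow for versioned keys might look like the sketch below. The dicts `versions`, `cache`, and `db` are in-memory stand-ins, and the function names are assumptions made for illustration.

```python
from typing import Dict

versions: Dict[int, int] = {}      # user_id -> current version (hypothetical store)
cache: Dict[str, dict] = {}        # stands in for Redis


def versioned_key(user_id: int) -> str:
    return f"user:{user_id}:v{versions.get(user_id, 0)}"


def read_user(user_id: int, db: Dict[int, dict]) -> dict:
    key = versioned_key(user_id)
    if key not in cache:
        cache[key] = dict(db[user_id])   # miss: copy the current row into the cache
    return cache[key]


def write_user(user_id: int, row: dict, db: Dict[int, dict]) -> None:
    db[user_id] = row
    # Bumping the version makes readers build a new key; the old entry is never read again
    versions[user_id] = versions.get(user_id, 0) + 1


db = {1: {"name": "Ada"}}
read_user(1, db)                        # caches under user:1:v0
write_user(1, {"name": "Grace"}, db)
print(read_user(1, db)["name"])         # Grace, read from user:1:v1
```

The cost is one extra version lookup per read, plus memory held by old versions until their TTL runs out.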

Common Mistakes

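Not setting TTL: The cache grows until it hits Redis's maxmemory limit (or exhausts the host's memory). Redis then either rejects new writes (the default noeviction policy) or evicts keys according to the configured maxmemory-policy. Always set a TTL.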

Caching too much: Every DB result in cache. Most of it is never read again. Cache selectively — hot data only.

Cache stampede: Cache entry expires. 1000 requests all miss at once, all hit DB simultaneously. Fix: mutex locks or probabilistic early expiration.

python

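# r is the redis.Redis() client created earlier; r.lock() needs no extra import
 
def get_user_safe(user_id: int) -> dict:
    cache_key = f"user:{user_id}"
    lock_key = f"lock:user:{user_id}"
    
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)
    
    # Only one request fetches from DB; the others block here until the lock is free.
    # timeout=5 auto-releases the lock if the holder crashes.
    with r.lock(lock_key, timeout=5):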
        cached = r.get(cache_key)  # Check again inside lock
        if cached:
            return json.loads(cached)
        
        user = db.query("SELECT * FROM users WHERE id = %s", user_id)
        r.setex(cache_key, 300, json.dumps(user))
        return user
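
The other fix, probabilistic early expiration, can be sketched as a small decision function. The version below follows the commonly described "XFetch" rule, in which each request may volunteer to refresh an entry shortly before it expires. The function name and parameters are assumptions for illustration. The caller would need to store the expiry time and the last recompute duration alongside the cached value.

```python
import math
import random
import time
from typing import Callable, Optional


def should_refresh_early(
    expires_at: float,
    recompute_seconds: float,
    beta: float = 1.0,
    now: Optional[float] = None,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Return True if this request should recompute the value before it expires.

    The chance of returning True grows as expiry approaches, so requests
    refresh the entry at different times instead of all missing together.
    """
    if now is None:
        now = time.time()
    # 1 - rng() lies in (0, 1], which keeps log() defined; -log(...) is >= 0
    head_start = -recompute_seconds * beta * math.log(1.0 - rng())
    return now + head_start >= expires_at
```

On a cache hit, a request would call `should_refresh_early(...)` and, if it returns True, reload the value and reset the TTL while other requests keep using the cached copy. A larger `beta` refreshes earlier. A slower recompute also pushes refreshes earlier.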

Not warming the cache: After a deploy or Redis restart, all cache is cold. Every request misses. Pre-warm critical data at startup.
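
A warm-up step can be as simple as looping over a list of known hot keys at startup. The sketch below uses a dict in place of Redis and a hypothetical `loader` callback in place of the database query. How you pick the hot keys (recent access logs, a fixed list) is up to you.

```python
from typing import Any, Callable, Dict, Iterable


def warm_cache(
    keys: Iterable[int],
    loader: Callable[[int], Any],
    cache: Dict[str, Any],
) -> int:
    """Load each key once and store it; returns how many entries were written."""
    warmed = 0
    for user_id in keys:
        cache_key = f"user:{user_id}"
        if cache_key in cache:
            continue                      # already warm, skip the DB read
        cache[cache_key] = loader(user_id)
        warmed += 1
    return warmed


hot_user_ids = [1, 2, 3]                  # hypothetical list of hot keys
cache: Dict[str, Any] = {"user:2": {"id": 2}}
count = warm_cache(hot_user_ids, lambda uid: {"id": uid}, cache)
print(count)                              # 2: user:2 was already cached
```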

Which Strategy to Pick

Scenario	Strategy
General read-heavy API	Cache-aside
Write-then-read same request	Write-through
Write-heavy, small loss ok	Write-behind
ORM-level, abstracted	Read-through
Most production systems	Cache-aside + TTL

Key Takeaways

Cache-aside: app checks cache, fetches DB on miss — most common, most flexible
Write-through: writes go to cache + DB together — always fresh, slightly slower writes
Write-behind: writes go to cache, DB updated async — fast writes, risk of data loss
Read-through: cache handles misses automatically — simpler app code
Always set TTL — never let cache grow unbounded
Cache stampede is real — use locks for hot keys
Start with cache-aside + TTL; add write-through for data that's read immediately after write

Caching is the cheapest way to speed up a slow system. Pick the strategy that matches how your data is written and read.
