How to Read This Post
This post is a visual reference for caching strategies, organized by increasing complexity. Each section contains a diagram followed by a brief explanation, with short code sketches where they help make a pattern concrete.
| Strategy | Type | Consistency | Latency | Complexity | Best For |
|---|---|---|---|---|---|
| Cache-Aside | Read | Eventual | Low (reads) | Low | General purpose |
| Read-Through | Read | Eventual | Low (reads) | Medium | Read-heavy workloads |
| Write-Through | Write | Strong | High (writes) | Medium | Consistency-critical writes |
| Write-Behind | Write | Eventual | Low (writes) | High | Write-heavy workloads |
| Write-Around | Write | Eventual | Low (writes) | Low | Rarely-read written data |
| LRU / LFU / TTL | Eviction | N/A | N/A | Low–Medium | Memory management |
| Consistent Hashing | Distribution | Varies | Low | High | Horizontal scaling |
| Multi-Layer Cache | Read | Eventual | Very low | High | High-traffic systems |
Level 1 — Foundations
1. No Cache Baseline
No Cache Baseline — Every request hits the database directly. Under load, the database becomes the bottleneck: latency climbs, connections exhaust, and throughput collapses. This is the problem caching solves.
2. Cache-Aside (Lazy Loading)
*Diagram: cache-aside read flow. A hit returns from the cache in ~2 ms; a miss falls through to the database, populates the cache, and returns in ~100 ms on first access.*
Cache-Aside (Lazy Loading) — The application owns all cache logic. On a miss, it reads from the database, stores the result in cache, and returns it. On a hit, the cached value is returned directly. The cache only contains data that has actually been requested, keeping memory usage efficient.
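The read path can be sketched in a few lines of Python. This is a minimal illustration, not a production client: a plain dict stands in for Redis/Memcached, and `db_query` is a hypothetical stand-in for your data layer.

```python
cache = {}

def db_query(key):
    # Pretend database lookup (stand-in for the real data layer).
    return {"id": key, "name": f"user-{key}"}

def get_user(key):
    value = cache.get(key)
    if value is not None:      # cache hit: return directly
        return value
    value = db_query(key)      # cache miss: fall back to the database
    cache[key] = value         # populate the cache for next time
    return value
```

Note that all of this logic lives in the application, which is exactly what distinguishes cache-aside from read-through.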
3. Read-Through Cache
Read-Through Cache — Unlike cache-aside, the cache itself is responsible for loading data on a miss. The application simply asks the cache for a key and the cache handles the database fetch transparently. This simplifies application code but couples the cache layer to the data source.
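A sketch of the same idea with the loader moved inside the cache, so callers never touch the database directly. The class name and loader are illustrative, not from any particular library.

```python
class ReadThroughCache:
    """The cache owns the loader; callers only ever talk to the cache."""

    def __init__(self, loader):
        self._store = {}
        self._loader = loader  # e.g. a function that queries the database

    def get(self, key):
        if key not in self._store:
            # Transparent database fetch on a miss.
            self._store[key] = self._loader(key)
        return self._store[key]

cache = ReadThroughCache(loader=lambda k: f"row-for-{k}")
```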
Level 2 — Write Strategies
4. Write-Through
*Diagram: write-through sequence. The write is applied to the cache and synchronously to the database before the client is acknowledged, so a later read of user:42 from the cache is always fresh.*
Write-Through — Every write goes to the cache first, and the cache synchronously writes to the database before acknowledging. This guarantees strong consistency between cache and database at the cost of higher write latency, since every write must wait for the database round-trip.
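A minimal sketch of the synchronous write path, with a dict standing in for the real database; the acknowledgement (the `put` call returning) only happens after both writes complete.

```python
class WriteThroughCache:
    def __init__(self, db):
        self._store = {}
        self._db = db            # stand-in for the real database

    def put(self, key, value):
        self._store[key] = value # 1. write to the cache
        self._db[key] = value    # 2. synchronous DB write before returning

    def get(self, key):
        # Always fresh: cache and DB never diverge.
        return self._store.get(key)
```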
5. Write-Behind (Write-Back)
*Diagram: write-behind sequence. The write is acknowledged from the cache without waiting for the database; writes are enqueued, batches accumulate, and an async batch write flushes them to the DB. Risk: a cache crash before the flush means data loss. ⚠️*
Write-Behind (Write-Back) — The application writes to the cache and returns immediately. Writes are queued and flushed to the database asynchronously in batches. This dramatically reduces write latency and database load, but introduces a window where data exists only in cache — a crash during that window means data loss.
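A toy sketch of the deferred write path. For clarity the flush is triggered synchronously when the batch fills; a real implementation would flush from a background worker on a timer, and the queue itself is the data-loss window the diagram warns about.

```python
from collections import deque

class WriteBehindCache:
    def __init__(self, db, batch_size=3):
        self._store, self._db = {}, db
        self._queue = deque()        # pending writes live only in memory (risk!)
        self._batch_size = batch_size

    def put(self, key, value):
        self._store[key] = value          # acknowledged after the cache write
        self._queue.append((key, value))  # DB write is deferred
        if len(self._queue) >= self._batch_size:
            self.flush()                  # in production this runs async

    def flush(self):
        # Batch-apply queued writes to the database.
        while self._queue:
            key, value = self._queue.popleft()
            self._db[key] = value
```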
6. Write-Around
Write-Around — Writes go directly to the database, completely bypassing the cache. The cache is only populated when data is subsequently read. This avoids polluting the cache with data that may never be read, making it ideal for write-heavy workloads where most written data is rarely accessed.
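The pattern is small enough to sketch in one function; the only subtlety is dropping any stale cached copy so the next read repopulates from the database.

```python
def write_around(db, cache, key, value):
    db[key] = value        # write goes straight to the database
    cache.pop(key, None)   # evict any stale copy; next read repopulates
```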
7. Cache-Aside + Write-Through Combined
Cache-Aside + Write-Through Combined — Reads follow the cache-aside pattern (check cache, fallback to DB, populate cache), while writes use write-through (update cache and DB synchronously). This combination delivers low read latency with strong write consistency, and is the most common caching architecture in production systems.
Level 3 — Eviction & Invalidation
8. LRU vs LFU vs TTL
*Diagram: three eviction strategies side by side. LRU tracks access order and evicts the entry not accessed for the longest time (good for temporal locality: recent = likely needed). LFU tracks access counts and evicts the entry with the fewest total accesses (good for popularity-based workloads: frequent = likely needed). TTL sets an expiry at write time and evicts when the current time exceeds it (good for freshness guarantees: stale data is auto-removed).*
LRU vs LFU vs TTL — Three fundamental eviction strategies. LRU evicts what has not been touched recently (favors recency), LFU evicts what is accessed least overall (favors popularity), and TTL evicts based on a fixed time-to-live regardless of access patterns. Most production systems combine TTL with either LRU or LFU.
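LRU is the easiest of the three to sketch, since `OrderedDict` gives us access ordering for free; this toy version is single-threaded and illustrative only.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self._data = OrderedDict()  # insertion order doubles as recency order
        self._capacity = capacity

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict the least recently used
```

An LFU variant would keep a counter per key instead of an ordering; a TTL variant would store an expiry timestamp alongside each value and check it on read.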
9. Cache Stampede Problem
*Diagram: cache stampede sequence. A TTL expires, and 100+ concurrent clients all see a MISS for product:99 simultaneously. Each issues the same `SELECT * FROM products WHERE id=99`, overwhelming the database with identical queries ⚠️, then each writes the same result back, producing 100 redundant cache SETs.*
Cache Stampede (Thundering Herd) — When a popular cache key expires, many concurrent requests simultaneously discover the miss and all hit the database with the same query. This can overload or crash the database, especially for expensive queries, turning a cache expiration into a cascading failure.
10. Cache Stampede Solutions
*Diagram: three stampede mitigations. Solution 1, mutex: the first request takes a lock, fetches from the DB, populates the cache, and releases; waiting requests get stale data in the meantime, then read from the fresh cache. Solution 2, probabilistic early expiration: with a 60 s TTL, each request near expiry (~50 s remaining) has a small random chance to refresh early, so one request wins the race and the key never actually expires for other clients. Solution 3, background refresh: a worker monitors popular keys and refreshes them before the TTL expires, so the cache stays warm and clients never see a miss on hot data.*
Cache Stampede Solutions — Three common mitigations. The mutex approach serializes DB fetches so only one request populates the cache. Probabilistic early expiration randomly refreshes keys before they expire, spreading the load. Background refresh proactively keeps hot keys warm so they never expire under real traffic.
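The mutex approach is the simplest to sketch. This single-process version uses a `threading.Lock`; a distributed cache would use a distributed lock (e.g. a `SET key value NX` in Redis) instead, but the double-check-after-acquire shape is the same.

```python
import threading

_lock = threading.Lock()
cache = {}

def get_with_mutex(key, fetch):
    value = cache.get(key)
    if value is not None:
        return value
    with _lock:                  # only one request fetches on a miss
        value = cache.get(key)   # re-check: another thread may have filled it
        if value is None:
            value = fetch(key)   # a single DB hit for the whole herd
            cache[key] = value
    return value
```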
Level 4 — Distributed Caching
11. Distributed Cache with Consistent Hashing
*Diagram: hash ring with Node A at position 1000, Node B at 4000, and Node C at 7000. Keys go to the next node clockwise: user:42 (hash 2500) → Node B, session:7 (hash 5500) → Node C, product:99 (hash 800) → Node A. Adding Node D at position 5000 moves only keys 4001–5000 from Node C to Node D; keys 5001–7000 stay on Node C, so only ~1/N of keys are remapped — minimal disruption.*
Distributed Cache with Consistent Hashing — Keys are mapped onto a hash ring and assigned to the next node clockwise. When a node is added or removed, only the keys between it and its predecessor need to move — roughly 1/N of total keys. This makes scaling the cache cluster far less disruptive than naive modular hashing, where adding a node would remap nearly every key.
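The ring itself is just a sorted list of hash positions plus a binary search. This toy version omits virtual nodes (which real implementations add to even out load), and MD5 is used only as a convenient well-distributed hash, not for security.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring: no virtual nodes, for clarity."""

    def __init__(self, nodes):
        # Each node sits at a fixed position on the ring.
        self._ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key):
        positions = [pos for pos, _ in self._ring]
        # Next node clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect_right(positions, self._hash(key)) % len(self._ring)
        return self._ring[i][1]
```

The key property: when a node is added, any key that moves can only move to the new node; everything else stays put.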
12. Multi-Layer Caching
*Diagram: cache waterfall. L1 in-process cache (~0.01 ms latency, small capacity, per-instance) → L2 Redis cluster (~1 ms, large capacity, shared across instances) → L3 CDN/edge cache (~5 ms, huge capacity, geographically distributed) → origin database (~50 ms). A request cascades down on misses, and the response is cached at each layer on the way back to the client.*
Multi-Layer Caching — A waterfall of caches, each with different latency and capacity characteristics. L1 (in-process) is fastest but smallest and per-instance. L2 (Redis) is shared across instances with millisecond latency. L3 (CDN) handles geographic distribution. Each layer absorbs misses from the layer above, so only a tiny fraction of requests ever reach the origin database.
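The lookup waterfall generalizes to any number of layers. In this sketch each layer is a dict ordered fastest-first, and `origin_fetch` is a hypothetical stand-in for the database call; on a hit, all faster layers are backfilled so the next request stops sooner.

```python
def multi_layer_get(key, layers, origin_fetch):
    """Check each cache layer in order; on a hit, backfill the faster layers."""
    for i, layer in enumerate(layers):
        if key in layer:
            value = layer[key]
            for faster in layers[:i]:  # backfill L1 .. L(i-1)
                faster[key] = value
            return value
    value = origin_fetch(key)          # every layer missed: go to origin
    for layer in layers:               # cache on the way back
        layer[key] = value
    return value
```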
13. Cache Invalidation Patterns
*Diagram: three invalidation patterns. Pattern A, event-driven: a row update flows through CDC or database triggers to an invalidation handler, which DELETEs the stale key from the cache. Pattern B, pub/sub: Service A publishes "invalidate user:42" to a message bus (Redis Pub/Sub, Kafka); Services B and C are notified and each DELETE the key from their own cache. Pattern C, versioned keys: reads use GET user:42:v7; a write increments the version and SETs user:42:v8, and the old user:42:v7 key expires via TTL.*
Cache Invalidation Patterns — Three approaches to the hardest problem in caching. Event-driven invalidation uses CDC or database triggers to delete stale keys when source data changes. Pub/Sub invalidation broadcasts invalidation messages across services so each can clear its own cache. Versioned keys sidestep invalidation entirely by embedding a version number in the key — old versions simply expire via TTL.
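The versioned-key pattern is compact enough to sketch. This illustrative version keeps the version counter in memory; in practice the counter would itself live in the cache (e.g. a Redis `INCR`), and old versioned keys would carry a TTL rather than being deleted.

```python
class VersionedKeyCache:
    """Versioned keys: bump a per-entity version instead of deleting keys."""

    def __init__(self):
        self._versions = {}  # entity -> current version number
        self._store = {}     # "entity:vN" -> value (old versions age out via TTL)

    def _key(self, entity):
        return f"{entity}:v{self._versions.get(entity, 0)}"

    def get(self, entity):
        return self._store.get(self._key(entity))

    def put(self, entity, value):
        # "Invalidation" is just a version bump; readers never see the old key.
        self._versions[entity] = self._versions.get(entity, 0) + 1
        self._store[self._key(entity)] = value
```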
TL;DR — When to Use What
| Scenario | Recommended Strategy | Why |
|---|---|---|
| General-purpose app | Cache-Aside + TTL | Simple, flexible, well-understood |
| Read-heavy, rarely updated | Read-Through + TTL | Cache handles loading, app stays clean |
| Consistency-critical writes | Write-Through | DB and cache always in sync |
| High write throughput | Write-Behind | Async batching reduces DB pressure |
| Write-heavy, rarely read | Write-Around | Avoid polluting cache |
| Hot keys under high concurrency | Mutex + Background Refresh | Prevent stampede |
| Horizontal scaling | Consistent Hashing | Minimal key redistribution |
| Global low latency | Multi-Layer (L1/L2/CDN) | Cascade absorbs load at each layer |
| Microservice cache coherence | Pub/Sub Invalidation | Cross-service consistency |
The right caching strategy depends on your read/write ratio, consistency requirements, and tolerance for complexity. Start simple with cache-aside and TTL. Add write-through if you need consistency. Graduate to distributed patterns only when scale demands it.