How to Read This Post
This post is a visual reference for caching strategies, organized by increasing complexity. Each section contains a diagram followed by a brief explanation, with short code sketches where they help make a pattern concrete.
| Strategy | Type | Consistency | Latency | Complexity | Best For |
|---|---|---|---|---|---|
| Cache-Aside | Read | Eventual | Low (reads) | Low | General purpose |
| Read-Through | Read | Eventual | Low (reads) | Medium | Read-heavy workloads |
| Write-Through | Write | Strong | High (writes) | Medium | Consistency-critical writes |
| Write-Behind | Write | Eventual | Low (writes) | High | Write-heavy workloads |
| Write-Around | Write | Eventual | Low (writes) | Low | Rarely-read written data |
| LRU / LFU / TTL | Eviction | N/A | N/A | Low–Medium | Memory management |
| Consistent Hashing | Distribution | Varies | Low | High | Horizontal scaling |
| Multi-Layer Cache | Read | Eventual | Very low | High | High-traffic systems |
Level 1 — Foundations
1. No Cache Baseline
No Cache Baseline — Every request hits the database directly. Under load, the database becomes the bottleneck: latency climbs, connections exhaust, and throughput collapses. This is the problem caching solves.
2. Cache-Aside (Lazy Loading)
*Diagram: cache-aside read flow. A hit returns from the cache in ~2 ms; a miss falls through to the database, populates the cache, and returns in ~100 ms on first access.*
Cache-Aside (Lazy Loading) — The application owns all cache logic. On a miss, it reads from the database, stores the result in cache, and returns it. On a hit, the cached value is returned directly. The cache only contains data that has actually been requested, keeping memory usage efficient.
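The read path can be sketched in a few lines of Python. This is a minimal illustration, not a production client: a plain dict stands in for Redis/Memcached, and `db_query` is a hypothetical stand-in for your data layer.

```python
cache = {}

def db_query(key):
    # Pretend database lookup (stand-in for the real data layer).
    return {"id": key, "name": f"user-{key}"}

def get_user(key):
    value = cache.get(key)
    if value is not None:      # cache hit: return directly
        return value
    value = db_query(key)      # cache miss: fall back to the database
    cache[key] = value         # populate the cache for next time
    return value
```

Note that all of this logic lives in the application, which is exactly what distinguishes cache-aside from read-through.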
3. Read-Through Cache
Read-Through Cache — Unlike cache-aside, the cache itself is responsible for loading data on a miss. The application simply asks the cache for a key and the cache handles the database fetch transparently. This simplifies application code but couples the cache layer to the data source.
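A sketch of the same idea with the loader moved inside the cache, so callers never touch the database directly. The class name and loader are illustrative, not from any particular library.

```python
class ReadThroughCache:
    """The cache owns the loader; callers only ever talk to the cache."""

    def __init__(self, loader):
        self._store = {}
        self._loader = loader  # e.g. a function that queries the database

    def get(self, key):
        if key not in self._store:
            # Transparent database fetch on a miss.
            self._store[key] = self._loader(key)
        return self._store[key]

cache = ReadThroughCache(loader=lambda k: f"row-for-{k}")
```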
Level 2 — Write Strategies
4. Write-Through
*Diagram: write-through sequence. The write is applied to the cache and synchronously to the database before the client is acknowledged, so a later read of user:42 from the cache is always fresh.*
Write-Through — Every write goes to the cache first, and the cache synchronously writes to the database before acknowledging. This guarantees strong consistency between cache and database at the cost of higher write latency, since every write must wait for the database round-trip.
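A minimal sketch of the synchronous write path, with a dict standing in for the real database; the acknowledgement (the `put` call returning) only happens after both writes complete.

```python
class WriteThroughCache:
    def __init__(self, db):
        self._store = {}
        self._db = db            # stand-in for the real database

    def put(self, key, value):
        self._store[key] = value # 1. write to the cache
        self._db[key] = value    # 2. synchronous DB write before returning

    def get(self, key):
        # Always fresh: cache and DB never diverge.
        return self._store.get(key)
```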
5. Write-Behind (Write-Back)
*Diagram: write-behind sequence. The write is acknowledged from the cache without waiting for the database; writes are enqueued, batches accumulate, and an async batch write flushes them to the DB. Risk: a cache crash before the flush means data loss. ⚠️*
Write-Behind (Write-Back) — The application writes to the cache and returns immediately. Writes are queued and flushed to the database asynchronously in batches. This dramatically reduces write latency and database load, but introduces a window where data exists only in cache — a crash during that window means data loss.
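A toy sketch of the deferred write path. For clarity the flush is triggered synchronously when the batch fills; a real implementation would flush from a background worker on a timer, and the queue itself is the data-loss window the diagram warns about.

```python
from collections import deque

class WriteBehindCache:
    def __init__(self, db, batch_size=3):
        self._store, self._db = {}, db
        self._queue = deque()        # pending writes live only in memory (risk!)
        self._batch_size = batch_size

    def put(self, key, value):
        self._store[key] = value          # acknowledged after the cache write
        self._queue.append((key, value))  # DB write is deferred
        if len(self._queue) >= self._batch_size:
            self.flush()                  # in production this runs async

    def flush(self):
        # Batch-apply queued writes to the database.
        while self._queue:
            key, value = self._queue.popleft()
            self._db[key] = value
```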
6. Write-Around
Write-Around — Writes go directly to the database, completely bypassing the cache. The cache is only populated when data is subsequently read. This avoids polluting the cache with data that may never be read, making it ideal for write-heavy workloads where most written data is rarely accessed.
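The pattern is small enough to sketch in one function; the only subtlety is dropping any stale cached copy so the next read repopulates from the database.

```python
def write_around(db, cache, key, value):
    db[key] = value        # write goes straight to the database
    cache.pop(key, None)   # evict any stale copy; next read repopulates
```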
7. Cache-Aside + Write-Through Combined
Cache-Aside + Write-Through Combined — Reads follow the cache-aside pattern (check cache, fallback to DB, populate cache), while writes use write-through (update cache and DB synchronously). This combination delivers low read latency with strong write consistency, and is the most common caching architecture in production systems.
Level 3 — Eviction & Invalidation
8. LRU vs LFU vs TTL
*Diagram: three eviction strategies side by side. LRU tracks access order and evicts the entry not accessed for the longest time (good for temporal locality: recent = likely needed). LFU tracks access counts and evicts the entry with the fewest total accesses (good for popularity-based workloads: frequent = likely needed). TTL sets an expiry at write time and evicts when the current time exceeds it (good for freshness guarantees: stale data is auto-removed).*
LRU vs LFU vs TTL — Three fundamental eviction strategies. LRU evicts what has not been touched recently (favors recency), LFU evicts what is accessed least overall (favors popularity), and TTL evicts based on a fixed time-to-live regardless of access patterns. Most production systems combine TTL with either LRU or LFU.
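LRU is the easiest of the three to sketch, since `OrderedDict` gives us access ordering for free; this toy version is single-threaded and illustrative only.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self._data = OrderedDict()  # insertion order doubles as recency order
        self._capacity = capacity

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict the least recently used
```

An LFU variant would keep a counter per key instead of an ordering; a TTL variant would store an expiry timestamp alongside each value and check it on read.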
9. Cache Stampede Problem
*Diagram: cache stampede sequence. A TTL expires, and 100+ concurrent clients all see a MISS for product:99 simultaneously. Each issues the same `SELECT * FROM products WHERE id=99`, overwhelming the database with identical queries ⚠️, then each writes the same result back, producing 100 redundant cache SETs.*
Cache Stampede (Thundering Herd) — When a popular cache key expires, many concurrent requests simultaneously discover the miss and all hit the database with the same query. This can overload or crash the database, especially for expensive queries, turning a cache expiration into a cascading failure.
10. Cache Stampede Solutions
*Diagram: three stampede mitigations. Solution 1, mutex: the first request takes a lock, fetches from the DB, populates the cache, and releases; waiting requests get stale data in the meantime, then read from the fresh cache. Solution 2, probabilistic early expiration: with a 60 s TTL, each request near expiry (~50 s remaining) has a small random chance to refresh early, so one request wins the race and the key never actually expires for other clients. Solution 3, background refresh: a worker monitors popular keys and refreshes them before the TTL expires, so the cache stays warm and clients never see a miss on hot data.*
Cache Stampede Solutions — Three common mitigations. The mutex approach serializes DB fetches so only one request populates the cache. Probabilistic early expiration randomly refreshes keys before they expire, spreading the load. Background refresh proactively keeps hot keys warm so they never expire under real traffic.
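The mutex approach is the simplest to sketch. This single-process version uses a `threading.Lock`; a distributed cache would use a distributed lock (e.g. a `SET key value NX` in Redis) instead, but the double-check-after-acquire shape is the same.

```python
import threading

_lock = threading.Lock()
cache = {}

def get_with_mutex(key, fetch):
    value = cache.get(key)
    if value is not None:
        return value
    with _lock:                  # only one request fetches on a miss
        value = cache.get(key)   # re-check: another thread may have filled it
        if value is None:
            value = fetch(key)   # a single DB hit for the whole herd
            cache[key] = value
    return value
```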
Level 4 — Distributed Caching
11. Distributed Cache with Consistent Hashing
*Diagram: hash ring with Node A at position 1000, Node B at 4000, and Node C at 7000. Keys go to the next node clockwise: user:42 (hash 2500) → Node B, session:7 (hash 5500) → Node C, product:99 (hash 800) → Node A. Adding Node D at position 5000 moves only keys 4001–5000 from Node C to Node D; keys 5001–7000 stay on Node C, so only ~1/N of keys are remapped — minimal disruption.*
Distributed Cache with Consistent Hashing — Keys are mapped onto a hash ring and assigned to the next node clockwise. When a node is added or removed, only the keys between it and its predecessor need to move — roughly 1/N of total keys. This makes scaling the cache cluster far less disruptive than naive modular hashing, where adding a node would remap nearly every key.
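The ring itself is just a sorted list of hash positions plus a binary search. This toy version omits virtual nodes (which real implementations add to even out load), and MD5 is used only as a convenient well-distributed hash, not for security.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring: no virtual nodes, for clarity."""

    def __init__(self, nodes):
        # Each node sits at a fixed position on the ring.
        self._ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key):
        positions = [pos for pos, _ in self._ring]
        # Next node clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect_right(positions, self._hash(key)) % len(self._ring)
        return self._ring[i][1]
```

The key property: when a node is added, any key that moves can only move to the new node; everything else stays put.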
12. Multi-Layer Caching
*Diagram: cache waterfall. L1 in-process cache (~0.01 ms latency, small capacity, per-instance) → L2 Redis cluster (~1 ms, large capacity, shared across instances) → L3 CDN/edge cache (~5 ms, huge capacity, geographically distributed) → origin database (~50 ms). A request cascades down on misses, and the response is cached at each layer on the way back to the client.*
Multi-Layer Caching — A waterfall of caches, each with different latency and capacity characteristics. L1 (in-process) is fastest but smallest and per-instance. L2 (Redis) is shared across instances with millisecond latency. L3 (CDN) handles geographic distribution. Each layer absorbs misses from the layer above, so only a tiny fraction of requests ever reach the origin database.
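The lookup waterfall generalizes to any number of layers. In this sketch each layer is a dict ordered fastest-first, and `origin_fetch` is a hypothetical stand-in for the database call; on a hit, all faster layers are backfilled so the next request stops sooner.

```python
def multi_layer_get(key, layers, origin_fetch):
    """Check each cache layer in order; on a hit, backfill the faster layers."""
    for i, layer in enumerate(layers):
        if key in layer:
            value = layer[key]
            for faster in layers[:i]:  # backfill L1 .. L(i-1)
                faster[key] = value
            return value
    value = origin_fetch(key)          # every layer missed: go to origin
    for layer in layers:               # cache on the way back
        layer[key] = value
    return value
```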
13. Cache Invalidation Patterns
*Diagram: three invalidation patterns. Pattern A, event-driven: a row update flows through CDC or database triggers to an invalidation handler, which DELETEs the stale key from the cache. Pattern B, pub/sub: Service A publishes "invalidate user:42" to a message bus (Redis Pub/Sub, Kafka); Services B and C are notified and each DELETE the key from their own cache. Pattern C, versioned keys: reads use GET user:42:v7; a write increments the version and SETs user:42:v8, and the old user:42:v7 key expires via TTL.*
Cache Invalidation Patterns — Three approaches to the hardest problem in caching. Event-driven invalidation uses CDC or database triggers to delete stale keys when source data changes. Pub/Sub invalidation broadcasts invalidation messages across services so each can clear its own cache. Versioned keys sidestep invalidation entirely by embedding a version number in the key — old versions simply expire via TTL.
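The versioned-key pattern is compact enough to sketch. This illustrative version keeps the version counter in memory; in practice the counter would itself live in the cache (e.g. a Redis `INCR`), and old versioned keys would carry a TTL rather than being deleted.

```python
class VersionedKeyCache:
    """Versioned keys: bump a per-entity version instead of deleting keys."""

    def __init__(self):
        self._versions = {}  # entity -> current version number
        self._store = {}     # "entity:vN" -> value (old versions age out via TTL)

    def _key(self, entity):
        return f"{entity}:v{self._versions.get(entity, 0)}"

    def get(self, entity):
        return self._store.get(self._key(entity))

    def put(self, entity, value):
        # "Invalidation" is just a version bump; readers never see the old key.
        self._versions[entity] = self._versions.get(entity, 0) + 1
        self._store[self._key(entity)] = value
```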
TL;DR — When to Use What
| Scenario | Recommended Strategy | Why |
|---|---|---|
| General-purpose app | Cache-Aside + TTL | Simple, flexible, well-understood |
| Read-heavy, rarely updated | Read-Through + TTL | Cache handles loading, app stays clean |
| Consistency-critical writes | Write-Through | DB and cache always in sync |
| High write throughput | Write-Behind | Async batching reduces DB pressure |
| Write-heavy, rarely read | Write-Around | Avoid polluting cache |
| Hot keys under high concurrency | Mutex + Background Refresh | Prevent stampede |
| Horizontal scaling | Consistent Hashing | Minimal key redistribution |
| Global low latency | Multi-Layer (L1/L2/CDN) | Cascade absorbs load at each layer |
| Microservice cache coherence | Pub/Sub Invalidation | Cross-service consistency |
The right caching strategy depends on your read/write ratio, consistency requirements, and tolerance for complexity. Start simple with cache-aside and TTL. Add write-through if you need consistency. Graduate to distributed patterns only when scale demands it.