Introduction

Choosing the right deployment strategy is critical for minimizing downtime and risk when releasing new versions of your application. Different strategies offer different trade-offs between speed, safety, and resource usage.

This guide visualizes three essential deployment strategies and then compares them:

  • Rolling Updates: Gradual replacement of instances
  • Blue-Green Deployments: Instant cutover between versions
  • Canary Deployments: Progressive rollout with traffic splitting
  • Comparison and Use Cases: When to use each strategy

Part 1: Rolling Update Deployment

Rolling updates replace old-version pods with new-version pods a few at a time, so some pods are always available to serve traffic.

Rolling Update Flow

%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#e5e7eb','secondaryTextColor':'#e5e7eb','tertiaryTextColor':'#e5e7eb','textColor':'#e5e7eb','nodeTextColor':'#e5e7eb','edgeLabelText':'#e5e7eb','clusterTextColor':'#e5e7eb','actorTextColor':'#e5e7eb'}}}%% flowchart TD Start([Deploy v2.0
kubectl apply -f deployment.yaml]) --> Current[Current State:
4 pods running v1.0
All healthy ✓] Current --> Config[Update Configuration:
maxSurge: 1
maxUnavailable: 1
Allows 1 extra pod
Tolerates 1 unavailable] Config --> Step1[Step 1: Create new pod
Pod-5 v2.0 starting] Step1 --> Health1{Pod-5
healthy?} Health1 -->|No| Rollback1[Rollback triggered
Delete Pod-5
Keep all v1.0 pods] Health1 -->|Yes| Traffic1[Step 2: Add Pod-5 to load balancer
Now: 5 pods total
4x v1.0 + 1x v2.0
20% traffic to v2.0] Traffic1 --> Terminate1[Step 3: Terminate old pod
Delete Pod-1 v1.0
Now: 4 pods total
3x v1.0 + 1x v2.0] Terminate1 --> Step2[Step 4: Create new pod
Pod-6 v2.0 starting] Step2 --> Health2{Pod-6
healthy?} Health2 -->|No| Rollback2[Rollback triggered] Health2 -->|Yes| Traffic2[Step 5: Add Pod-6 to LB
Now: 5 pods total
3x v1.0 + 2x v2.0
40% traffic to v2.0] Traffic2 --> Terminate2[Step 6: Terminate old pod
Delete Pod-2 v1.0] Terminate2 --> Continue[Continue process...] Continue --> Final[Final State:
4 pods running v2.0
0 pods running v1.0
Deployment complete ✓] style Health1 fill:#064e3b,stroke:#10b981 style Health2 fill:#064e3b,stroke:#10b981 style Final fill:#064e3b,stroke:#10b981 style Rollback1 fill:#7f1d1d,stroke:#ef4444 style Rollback2 fill:#7f1d1d,stroke:#ef4444

Rolling Update Visualization

%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#e5e7eb','secondaryTextColor':'#e5e7eb','tertiaryTextColor':'#e5e7eb','textColor':'#e5e7eb','nodeTextColor':'#e5e7eb','edgeLabelText':'#e5e7eb','clusterTextColor':'#e5e7eb','actorTextColor':'#e5e7eb'}}}%% gantt title Rolling Update Timeline: v1.0 → v2.0 dateFormat ss axisFormat %S s section Pod-1 v1.0 Running :done, p1, 00, 15s Terminating :crit, 15, 17s Deleted :17, 20s section Pod-2 v1.0 Running :done, p2, 00, 25s Terminating :crit, 25, 27s Deleted :27, 30s section Pod-3 v1.0 Running :done, p3, 00, 35s Terminating :crit, 35, 37s Deleted :37, 40s section Pod-4 v1.0 Running :done, p4, 00, 45s Terminating :crit, 45, 47s Deleted :47, 50s section Pod-5 v2.0 Creating :active, 10, 12s Starting :active, 12, 15s Running :done, 15, 60s section Pod-6 v2.0 Creating :active, 20, 22s Starting :active, 22, 25s Running :done, 25, 60s section Pod-7 v2.0 Creating :active, 30, 32s Starting :active, 32, 35s Running :done, 35, 60s section Pod-8 v2.0 Creating :active, 40, 42s Starting :active, 42, 45s Running :done, 45, 60s

Rolling Update Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max 1 extra pod during update (25% of 4)
      maxUnavailable: 1   # Max 1 pod can be unavailable (25% of 4)

  selector:
    matchLabels:
      app: myapp

  template:
    metadata:
      labels:
        app: myapp
        version: v2.0  # Updated version
    spec:
      containers:
      - name: myapp
        image: myapp:v2.0  # New image version
        ports:
        - containerPort: 8080

        # Critical: Health checks ensure new pods are ready
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
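
The effect of maxSurge and maxUnavailable on pod counts can be worked out by hand. The sketch below is one way to do that arithmetic, not part of any Kubernetes client: the helper names resolve and rollout_bounds are made up for illustration. It follows the documented rounding rule that a percentage maxSurge rounds up and a percentage maxUnavailable rounds down.

```python
import math


def resolve(value, replicas, round_up):
    """Turn an int or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        exact = replicas * int(value[:-1]) / 100
        return math.ceil(exact) if round_up else math.floor(exact)
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Pod-count limits that hold at every moment of a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


# The configuration above: 4 replicas, maxSurge 1, maxUnavailable 1.
print(rollout_bounds(4, 1, 1))
```

For the manifest above this gives at most 5 pods in total and at least 3 available pods at any point in the rollout, which matches the "5 pods total" steps in the flow diagram.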

Part 2: Blue-Green Deployment

Blue-green deployment maintains two identical environments (blue = current, green = new) and switches traffic instantly.

Blue-Green Deployment Flow

%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#e5e7eb','secondaryTextColor':'#e5e7eb','tertiaryTextColor':'#e5e7eb','textColor':'#e5e7eb','nodeTextColor':'#e5e7eb','edgeLabelText':'#e5e7eb','clusterTextColor':'#e5e7eb','actorTextColor':'#e5e7eb'}}}%% flowchart TD Start([Prepare v2.0 deployment]) --> BlueRunning[Blue Environment
Version: v1.0
Pods: 4 running
Traffic: 100%] BlueRunning --> CreateGreen[Create Green Environment
Version: v2.0
Deploy 4 new pods
No traffic yet] CreateGreen --> GreenStarting[Green pods starting...
Pulling images
Running init containers
Starting processes] GreenStarting --> HealthCheck{All green pods
healthy?} HealthCheck -->|No| FixIssues[Fix issues in green
Debug problems
Blue still serves 100%
Zero user impact] FixIssues -.->|Redeploy| CreateGreen HealthCheck -->|Yes| RunTests[Run smoke tests
on green environment
- API endpoints
- Database connections
- Critical features] RunTests --> TestResults{Tests
passed?} TestResults -->|No| Investigate[Investigate failures
Green has issues
Blue still serves 100%
Users unaffected] Investigate -.->|Fix and retry| CreateGreen TestResults -->|Yes| ReadyToSwitch[Green environment ready ✓
Version: v2.0
All tests passed
Standing by for cutover] ReadyToSwitch --> Switch{Execute
traffic switch} Switch --> UpdateService[Update Service selector:
version: v1.0 → version: v2.0
Instant cutover] UpdateService --> NewState[New State:
Blue v1.0: 0% traffic
Green v2.0: 100% traffic ✓
Cutover complete] NewState --> Monitor[Monitor green for issues
Watch metrics:
- Error rates
- Latency
- Resource usage] Monitor --> Success{Green
stable?} Success -->|Yes| Cleanup[Keep green running
Delete blue environment
Or keep blue for quick rollback] Success -->|No| Rollback[Instant rollback!
Switch service back to blue
version: v2.0 → version: v1.0
Downtime: ~1 second] Rollback --> BlueRestored[Blue environment restored
100% traffic on v1.0
Debug green offline] style BlueRunning fill:#1e3a8a,stroke:#3b82f6 style ReadyToSwitch fill:#064e3b,stroke:#10b981 style NewState fill:#064e3b,stroke:#10b981 style Rollback fill:#7f1d1d,stroke:#ef4444 style FixIssues fill:#78350f,stroke:#f59e0b

Blue-Green Architecture

%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#e5e7eb','secondaryTextColor':'#e5e7eb','tertiaryTextColor':'#e5e7eb','textColor':'#e5e7eb','nodeTextColor':'#e5e7eb','edgeLabelText':'#e5e7eb','clusterTextColor':'#e5e7eb','actorTextColor':'#e5e7eb'}}}%% graph TB Users([👥 Users]) subgraph LoadBalancer[Load Balancer / Service] LB[Service Selector:
app: myapp
version: v2.0] end subgraph BlueEnv[💙 Blue Environment IDLE] BlueLabel[Version: v1.0
Status: Standby
Traffic: 0%] B1[Pod 1
v1.0] B2[Pod 2
v1.0] B3[Pod 3
v1.0] B4[Pod 4
v1.0] end subgraph GreenEnv[💚 Green Environment ACTIVE] GreenLabel[Version: v2.0
Status: Active
Traffic: 100%] G1[Pod 1
v2.0] G2[Pod 2
v2.0] G3[Pod 3
v2.0] G4[Pod 4
v2.0] end subgraph Database[(Database)] DB[(Shared Database
Compatible with
both versions)] end Users --> LoadBalancer LB -.->|0% traffic| BlueEnv LB ==>|100% traffic| GreenEnv BlueEnv --> Database GreenEnv --> Database Note1[To rollback:
Update selector to version: v1.0
Instant switch to blue] style BlueEnv fill:#1e3a8a,stroke:#3b82f6,stroke-dasharray: 5 5 style GreenEnv fill:#064e3b,stroke:#10b981 style LoadBalancer fill:#1e3a8a,stroke:#3b82f6

Blue-Green with Kubernetes

# Blue deployment (current version v1.0)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-blue
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
      version: v1.0
  template:
    metadata:
      labels:
        app: myapp
        version: v1.0
    spec:
      containers:
      - name: myapp
        image: myapp:v1.0

---
# Green deployment (new version v2.0)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
      version: v2.0
  template:
    metadata:
      labels:
        app: myapp
        version: v2.0
    spec:
      containers:
      - name: myapp
        image: myapp:v2.0

---
# Service - controls which version receives traffic
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp
    version: v2.0  # Change this to switch versions
                   # v1.0 = blue, v2.0 = green
  ports:
  - port: 80
    targetPort: 8080

Switch traffic:

# Currently on blue (v1.0)
kubectl patch service myapp-service -p '{"spec":{"selector":{"version":"v2.0"}}}'
# Now on green (v2.0) - instant switch!

# Rollback to blue if needed
kubectl patch service myapp-service -p '{"spec":{"selector":{"version":"v1.0"}}}'
# Back to blue (v1.0) - instant rollback!
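
Why the patch acts as a cutover becomes clearer when the selector is modelled directly. The following is a toy model, assuming only that a Service routes to every pod whose labels contain all of the selector's key/value pairs; the pod names and the matches and endpoints helpers are hypothetical.

```python
def matches(selector, labels):
    """A selector matches when every one of its key/value pairs is in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Four blue pods (v1.0) and four green pods (v2.0), as in the manifests above.
pods = (
    [{"name": f"blue-{i}", "labels": {"app": "myapp", "version": "v1.0"}} for i in range(1, 5)]
    + [{"name": f"green-{i}", "labels": {"app": "myapp", "version": "v2.0"}} for i in range(1, 5)]
)


def endpoints(selector):
    """Names of the pods that would receive traffic for this selector."""
    return [pod["name"] for pod in pods if matches(selector, pod["labels"])]


selector = {"app": "myapp", "version": "v1.0"}
before = endpoints(selector)      # only blue pods
selector["version"] = "v2.0"      # stands in for the kubectl patch
after = endpoints(selector)       # only green pods
print(before, after)
```

No pod is restarted or rescheduled by the switch. Only the set of endpoints behind the Service changes, which is why rollback is the same operation in reverse.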

Part 3: Canary Deployment

Canary deployment gradually shifts traffic from old to new version, allowing you to validate changes with a small percentage of users first.

Canary Deployment Flow

%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#e5e7eb','secondaryTextColor':'#e5e7eb','tertiaryTextColor':'#e5e7eb','textColor':'#e5e7eb','nodeTextColor':'#e5e7eb','edgeLabelText':'#e5e7eb','clusterTextColor':'#e5e7eb','actorTextColor':'#e5e7eb'}}}%% flowchart TD Start([Deploy canary v2.0]) --> Baseline[Baseline: v1.0
Replicas: 10
Traffic: 100%
Serving all users] Baseline --> DeployCanary[Deploy Canary: v2.0
Replicas: 1
Traffic: ~9%
Testing with small subset] DeployCanary --> Monitor10[Monitor Phase 1
Duration: 10 minutes
Watch metrics:
- Error rates
- Latency p95/p99
- Business metrics] Monitor10 --> Check10{Metrics
healthy?} Check10 -->|No| Rollback1[❌ Rollback canary
Scale v2.0 to 0
Issue detected early
91% users unaffected] Check10 -->|Yes| Scale25[✓ Increase traffic
v1.0: 7 replicas 64%
v2.0: 4 replicas 36%
More confident] Scale25 --> Monitor25[Monitor Phase 2
Duration: 20 minutes
Larger sample size
More diverse traffic] Monitor25 --> Check25{Metrics
healthy?} Check25 -->|No| Rollback2[❌ Rollback canary
Scale v2.0 to 0
Issue found under load] Check25 -->|Yes| Scale50[✓ Increase to 50%
v1.0: 5 replicas
v2.0: 5 replicas
Even split] Scale50 --> Monitor50[Monitor Phase 3
Duration: 30 minutes
Half of traffic
Statistical significance] Monitor50 --> Check50{Metrics
healthy?} Check50 -->|No| Rollback3[❌ Rollback canary
Scale v2.0 to 0
Scale v1.0 to 10] Check50 -->|Yes| Scale100[✓ Complete rollout
v1.0: 0 replicas
v2.0: 10 replicas
Full deployment] Scale100 --> Final[Final monitoring
v2.0 serving 100%
Remove v1.0 deployment
Canary successful ✓] style Check10 fill:#064e3b,stroke:#10b981 style Check25 fill:#064e3b,stroke:#10b981 style Check50 fill:#064e3b,stroke:#10b981 style Final fill:#064e3b,stroke:#10b981 style Rollback1 fill:#7f1d1d,stroke:#ef4444 style Rollback2 fill:#7f1d1d,stroke:#ef4444 style Rollback3 fill:#7f1d1d,stroke:#ef4444

Canary Traffic Split Progression

%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#e5e7eb','secondaryTextColor':'#e5e7eb','tertiaryTextColor':'#e5e7eb','textColor':'#e5e7eb','nodeTextColor':'#e5e7eb','edgeLabelText':'#e5e7eb','clusterTextColor':'#e5e7eb','actorTextColor':'#e5e7eb'}}}%% gantt title Canary Deployment Traffic Distribution Over Time dateFormat HH:mm axisFormat %H:%M section v1.0 (Stable) 100% :done, v1_100, 00:00, 30m 91% :done, v1_90, 00:30, 40m 64% :done, v1_64, 01:10, 50m 50% :done, v1_50, 02:00, 60m 0% :crit, v1_0, 03:00, 30m section v2.0 (Canary) 0% :crit, v2_0, 00:00, 30m 9% :active, v2_9, 00:30, 40m 36% :active, v2_36, 01:10, 50m 50% :active, v2_50, 02:00, 60m 100% :done, v2_100, 03:00, 30m section Gates Deploy Canary :milestone, m1, 00:30, 0m Gate 1 Metrics OK :milestone, m2, 01:10, 0m Gate 2 Metrics OK :milestone, m3, 02:00, 0m Gate 3 Metrics OK :milestone, m4, 03:00, 0m Complete :milestone, m5, 03:30, 0m

Canary Deployment with Kubernetes

# Stable version (v1.0)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-stable
spec:
  replicas: 9  # ~90% of traffic (9 of 10 pods)
  selector:
    matchLabels:
      app: myapp
      track: stable
  template:
    metadata:
      labels:
        app: myapp
        track: stable
        version: v1.0
    spec:
      containers:
      - name: myapp
        image: myapp:v1.0

---
# Canary version (v2.0)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-canary
spec:
  replicas: 1  # ~10% of traffic (1 of 10 pods)
  selector:
    matchLabels:
      app: myapp
      track: canary
  template:
    metadata:
      labels:
        app: myapp
        track: canary
        version: v2.0
    spec:
      containers:
      - name: myapp
        image: myapp:v2.0

---
# Service selects both stable and canary
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp  # Matches both stable and canary
  ports:
  - port: 80
    targetPort: 8080

Progressive traffic increase:

# Start: 10% canary
kubectl scale deployment myapp-stable --replicas=9
kubectl scale deployment myapp-canary --replicas=1

# Increase to 30% canary
kubectl scale deployment myapp-stable --replicas=7
kubectl scale deployment myapp-canary --replicas=3

# Increase to 50% canary
kubectl scale deployment myapp-stable --replicas=5
kubectl scale deployment myapp-canary --replicas=5

# Complete rollout: 100% canary
kubectl scale deployment myapp-stable --replicas=0
kubectl scale deployment myapp-canary --replicas=10
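
With replica-based splitting, the canary's share of traffic is its share of the pods. The sketch below checks the percentages in the comments above. It assumes the Service spreads requests evenly across ready pods, which is only approximately true in practice, and canary_share is a hypothetical helper name.

```python
def canary_share(stable_replicas, canary_replicas):
    """Fraction of traffic the canary receives under an even per-pod split."""
    total = stable_replicas + canary_replicas
    if total == 0:
        raise ValueError("no pods are running")
    return canary_replicas / total


# The four scaling steps above as (stable, canary) replica pairs.
steps = [(9, 1), (7, 3), (5, 5), (0, 10)]
for stable, canary in steps:
    print(f"stable={stable} canary={canary} -> {canary_share(stable, canary):.0%} canary")
```

The same formula explains the "~9%" in the flow diagram: adding 1 canary pod next to 10 stable pods gives 1/11 of traffic, while scaling stable down to 9 first gives exactly 1/10.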

Part 4: Advanced Canary with Automated Analysis

Automated Canary Decision Flow

%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#e5e7eb','secondaryTextColor':'#e5e7eb','tertiaryTextColor':'#e5e7eb','textColor':'#e5e7eb','nodeTextColor':'#e5e7eb','edgeLabelText':'#e5e7eb','clusterTextColor':'#e5e7eb','actorTextColor':'#e5e7eb'}}}%% flowchart TD Start([Canary deployed
10% traffic]) --> Collect[Collect metrics from:
- Prometheus
- Application logs
- Business analytics] Collect --> Compare[Compare metrics:
Stable vs Canary] Compare --> ErrorRate{Error rate
comparison} ErrorRate --> |Canary > 2x Stable| AutoRollback1[🚨 Auto-rollback triggered
Error rate too high] ErrorRate --> |Acceptable| Latency{Latency p95
comparison} Latency --> |Canary > 1.5x Stable| AutoRollback2[🚨 Auto-rollback triggered
Latency degradation] Latency --> |Acceptable| Business{Business metrics
comparison} Business --> |Canary conversion
< 80% of Stable| AutoRollback3[🚨 Auto-rollback triggered
Business impact detected] Business --> |Acceptable| CustomMetrics{Custom metrics
check} CustomMetrics --> |Failed| AutoRollback4[🚨 Auto-rollback triggered
Custom threshold violated] CustomMetrics --> |Passed| AllGood[✅ All metrics healthy
Analysis passed] AllGood --> TimeCheck{Enough
time elapsed?} TimeCheck --> |No| WaitMore[Wait for more data
Minimum observation period] WaitMore -.-> Collect TimeCheck --> |Yes| Promote[🎯 Auto-promote canary
Increase traffic to next tier] Promote --> NextTier{Current
traffic %?} NextTier --> |10%| Scale25[Scale to 25%] NextTier --> |25%| Scale50[Scale to 50%] NextTier --> |50%| Scale100[Scale to 100%] Scale25 -.->|Monitor next tier| Collect Scale50 -.->|Monitor next tier| Collect Scale100 --> Complete[Deployment complete ✓] AutoRollback1 --> Rollback[Execute rollback
Scale canary to 0
Alert team] AutoRollback2 --> Rollback AutoRollback3 --> Rollback AutoRollback4 --> Rollback style AllGood fill:#064e3b,stroke:#10b981 style Complete fill:#064e3b,stroke:#10b981 style AutoRollback1 fill:#7f1d1d,stroke:#ef4444 style AutoRollback2 fill:#7f1d1d,stroke:#ef4444 style AutoRollback3 fill:#7f1d1d,stroke:#ef4444 style AutoRollback4 fill:#7f1d1d,stroke:#ef4444 style Rollback fill:#7f1d1d,stroke:#ef4444
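
The decision flow above can be expressed as a small function. This is a minimal sketch using the diagram's thresholds (error rate above 2x stable, p95 latency above 1.5x stable, conversion below 80% of stable). The Metrics class, analyze, and next_tier are illustrative names rather than the API of any canary tool, and the custom metrics check is left out.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    error_rate: float       # fraction of failed requests, e.g. 0.01 for 1%
    latency_p95_ms: float   # 95th percentile latency in milliseconds
    conversion_rate: float  # business metric, as a fraction


def analyze(stable, canary):
    """Return ("rollback", reason) or ("pass", None) for one analysis run."""
    if canary.error_rate > 2 * stable.error_rate:
        return "rollback", "error rate too high"
    if canary.latency_p95_ms > 1.5 * stable.latency_p95_ms:
        return "rollback", "latency degradation"
    if canary.conversion_rate < 0.8 * stable.conversion_rate:
        return "rollback", "business impact detected"
    return "pass", None


TIERS = [10, 25, 50, 100]  # traffic percentages from the diagram


def next_tier(current):
    """Traffic percentage to promote to after a passed analysis."""
    index = TIERS.index(current)
    return TIERS[min(index + 1, len(TIERS) - 1)]


stable = Metrics(error_rate=0.01, latency_p95_ms=200, conversion_rate=0.05)
canary = Metrics(error_rate=0.012, latency_p95_ms=220, conversion_rate=0.049)
print(analyze(stable, canary), next_tier(10))
```

One caveat with pure ratio thresholds: if the stable error rate is exactly zero, any canary error triggers a rollback, so a real setup would probably add a minimum absolute threshold and a minimum request count.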

Part 5: Comparison of Deployment Strategies

Strategy Decision Matrix

%%{init: {'theme':'dark', 'themeVariables': {'primaryTextColor':'#e5e7eb','secondaryTextColor':'#e5e7eb','tertiaryTextColor':'#e5e7eb','textColor':'#e5e7eb','nodeTextColor':'#e5e7eb','edgeLabelText':'#e5e7eb','clusterTextColor':'#e5e7eb','actorTextColor':'#e5e7eb'}}}%% flowchart TD Start([Choose deployment
strategy]) --> Resources{Have 2x
resources?} Resources -->|No| RollingOnly[Use Rolling Update
Most resource efficient] Resources -->|Yes| RiskTolerance{Risk
tolerance?} RiskTolerance -->|Low risk tolerance
Need instant rollback| BlueGreen[Use Blue-Green
Instant cutover & rollback
Full testing before switch] RiskTolerance -->|Can tolerate
some risk| UserImpact{Acceptable user
impact during
rollout?} UserImpact -->|All users can
see changes
immediately| Rolling[Use Rolling Update
Gradual rollout
Minimal resources] UserImpact -->|Need gradual
validation| Monitoring{Have good
monitoring &
metrics?} Monitoring -->|No| ManualCanary[Use Manual Canary
Manual traffic control
Manual validation] Monitoring -->|Yes| AutoCanary[Use Automated Canary
Automatic analysis
Auto-rollback on issues
Progressive delivery] style RollingOnly fill:#064e3b,stroke:#10b981 style BlueGreen fill:#064e3b,stroke:#10b981 style Rolling fill:#064e3b,stroke:#10b981 style ManualCanary fill:#064e3b,stroke:#10b981 style AutoCanary fill:#064e3b,stroke:#10b981
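
The decision matrix reduces to four yes/no questions. The sketch below encodes the same branches in the same order. The function and parameter names are hypothetical, and real decisions usually weigh more factors than these.

```python
def choose_strategy(has_2x_resources, low_risk_tolerance,
                    needs_gradual_validation, has_good_monitoring):
    """Walk the decision matrix above from top to bottom."""
    if not has_2x_resources:
        return "rolling update"       # most resource efficient
    if low_risk_tolerance:
        return "blue-green"           # instant cutover and rollback
    if not needs_gradual_validation:
        return "rolling update"       # all users can see changes immediately
    if has_good_monitoring:
        return "automated canary"     # automatic analysis and rollback
    return "manual canary"            # manual traffic control and validation


print(choose_strategy(True, False, True, True))
```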

Detailed Comparison Table

Aspect            Rolling Update                   Blue-Green                Canary
Downtime          Zero (if health checks work)     Zero                      Zero
Rollback Speed    Slow (reverse rolling)           Instant (switch back)     Fast (reduce traffic)
Resource Usage    Low (1-2 extra pods)             High (2x infrastructure)  Low to Medium
Risk              Medium (gradual exposure)        Low (test before switch)  Very Low (gradual validation)
Complexity        Low                              Medium                    High
Testing           Limited                          Full environment testing  Real user validation
User Impact       All users see changes gradually  All users switch at once  Gradual user exposure
Best For          Standard releases                Critical systems          High-risk changes
Monitoring Needs  Basic health checks              Smoke tests               Comprehensive metrics
Rollback Impact   All users affected               All users affected        Minimal users affected

Part 6: Deployment Strategy Recommendations

When to Use Each Strategy

Rolling Update - Default Choice

✅ Use When:
- Standard application updates
- Limited infrastructure budget
- Changes are low-risk
- Quick deployments needed
- Kubernetes native support desired

❌ Avoid When:
- Zero tolerance for partial rollouts
- Need instant rollback capability
- Database schema changes incompatible with old version

Blue-Green - Zero Downtime Critical Systems

✅ Use When:
- Deploying to production databases
- Regulatory compliance requirements
- Need full environment testing
- Instant rollback is critical
- Can afford 2x infrastructure

❌ Avoid When:
- Resource constrained
- Stateful applications (hard to maintain two copies of the state)
- Database migrations are complex

Canary - High-Risk Changes

✅ Use When:
- Major version upgrades
- Architecture changes
- Uncertain about performance impact
- Good monitoring infrastructure exists
- Can analyze real user metrics

❌ Avoid When:
- No good metrics to validate
- Can't split traffic easily
- Changes must be all-or-nothing
- Rapid rollout needed

Part 7: Best Practices

Health Checks are Critical

# Essential for any deployment strategy
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1

livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
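
On the application side, the two probe paths need to answer differently. Below is a minimal sketch of such endpoints using only the Python standard library. The state dictionary, health_status, and serve are hypothetical names, and a real service would derive readiness from its actual dependencies.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set to True once startup work (connections, caches) has finished.
state = {"ready": False}


def health_status(path, ready):
    """Map a probe path to the HTTP status code the probe should see."""
    if path == "/health/live":
        return 200                     # the process is up and can answer
    if path == "/health/ready":
        return 200 if ready else 503   # 503 keeps the pod out of rotation
    return 404


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(health_status(self.path, state["ready"]))
        self.end_headers()


def serve(port=8080):
    """Start the probe server; blocks until the process is stopped."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Keeping liveness independent of readiness matters here: a pod that is alive but not yet ready is taken out of rotation without being restarted.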

Database Compatibility

Always maintain backward compatibility, so that the old and new code versions can run against the same database during a rollout:

Deployment N-1:    Deployment N:      Deployment N+1:
Schema v1          Schema v1 + v2     Schema v2
Code reads v1      Code reads v1/v2   Code reads v2
                   Code writes v2     Code writes v2
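
The middle column (Deployment N) is the interesting one, because both schema versions coexist. The sketch below shows that stage with an in-memory SQLite database. The users table and the full_name and display_name columns are invented for illustration and stand in for "v1" and "v2".

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Deployment N-1: schema v1 only, with a row written by the old code.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO users (id, full_name) VALUES (1, 'Ada Lovelace')")

# Deployment N expands the schema: the v2 column is added, the v1 column stays.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def write_user(user_id, name):
    """Deployment N writes only the v2 column."""
    conn.execute("INSERT INTO users (id, display_name) VALUES (?, ?)", (user_id, name))


def read_user(user_id):
    """Deployment N reads v2 when present and falls back to v1."""
    row = conn.execute(
        "SELECT COALESCE(display_name, full_name) FROM users WHERE id = ?",
        (user_id,),
    ).fetchone()
    return row[0] if row else None


write_user(2, "Grace Hopper")
print(read_user(1), read_user(2))
```

Rows written only to the v2 column are invisible to any N-1 pods that are still running. Some teams therefore write both columns until the old version is fully retired, and drop the v1 column only in Deployment N+1.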

Monitoring Checklist

  • ✅ Error rates (4xx, 5xx)
  • ✅ Latency percentiles (p50, p95, p99)
  • ✅ Request throughput
  • ✅ Resource utilization (CPU, memory)
  • ✅ Business metrics (conversions, transactions)
  • ✅ Custom application metrics

Conclusion

Choosing the right deployment strategy depends on your:

  • Risk tolerance: How much user impact is acceptable?
  • Resources: Can you afford 2x infrastructure?
  • Monitoring: Do you have metrics to validate changes?
  • Rollback needs: How quickly must you recover from issues?

Quick recommendations:

  • Start with Rolling Updates: Built into Kubernetes, low overhead
  • Use Blue-Green for: Critical systems, major releases, zero-downtime requirements
  • Adopt Canary for: High-risk changes, user-facing features, when you have good metrics

The visual diagrams in this guide show how traffic flows and how rollbacks work in each strategy, making it easier to understand the trade-offs.

