Introduction
Choosing the right deployment strategy is critical for minimizing downtime and risk when releasing new versions of your application. Different strategies offer different trade-offs between speed, safety, and resource usage.
This guide visualizes three essential deployment strategies and then compares them:
- Rolling Updates: Gradual replacement of instances
- Blue-Green Deployments: Instant cutover between versions
- Canary Deployments: Progressive rollout with traffic splitting
- Comparison and Use Cases: When to use each strategy
Part 1: Rolling Update Deployment
Rolling updates gradually replace old version pods with new version pods, ensuring continuous availability.
Rolling Update Flow
- Trigger: kubectl apply -f deployment.yaml
- Current state: 4 pods running v1.0, all healthy ✓
- Update configuration: maxSurge: 1 (allows 1 extra pod), maxUnavailable: 1 (tolerates 1 unavailable pod)
- Step 1: Create a new pod. Pod-5 (v2.0) starts.
- Health check: is Pod-5 healthy?
  - No: rollback triggered. Delete Pod-5 and keep all v1.0 pods.
  - Yes: continue with Step 2.
- Step 2: Add Pod-5 to the load balancer. Now 5 pods total (4x v1.0 + 1x v2.0), with 20% of traffic going to v2.0.
- Step 3: Terminate an old pod. Delete Pod-1 (v1.0). Now 4 pods total (3x v1.0 + 1x v2.0).
- Step 4: Create a new pod. Pod-6 (v2.0) starts.
- Health check: is Pod-6 healthy?
  - No: rollback triggered.
  - Yes: continue with Step 5.
- Step 5: Add Pod-6 to the load balancer. Now 5 pods total (3x v1.0 + 2x v2.0), with 40% of traffic going to v2.0.
- Step 6: Terminate an old pod. Delete Pod-2 (v1.0).
- Continue the process until the final state: 4 pods running v2.0, 0 pods running v1.0. Deployment complete ✓
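In practice the rollback branch is often driven from a deploy script, since a Deployment whose new pods never become ready stalls rather than reverting on its own. The following is a minimal sketch of that idea, not a definitive pipeline. It assumes the Deployment is named myapp as in this guide; the function name `deploy_or_rollback` and the 120s default timeout are illustrative assumptions.

```shell
# deploy_or_rollback MANIFEST [DEPLOYMENT] [TIMEOUT]
# Applies a manifest, waits for the rollout, and undoes it on failure.
# The function name and the default timeout are hypothetical.
deploy_or_rollback() {
  local manifest="$1"
  local deployment="${2:-myapp}"
  local timeout="${3:-120s}"

  kubectl apply -f "$manifest" || return 1

  # `kubectl rollout status` exits non-zero when the rollout does not
  # finish within the timeout (for example, readiness probes never pass).
  if kubectl rollout status "deployment/$deployment" --timeout="$timeout"; then
    echo "rollout of $deployment complete"
  else
    echo "rollout of $deployment failed, rolling back" >&2
    kubectl rollout undo "deployment/$deployment"
    return 1
  fi
}
```

Usage: `deploy_or_rollback deployment.yaml myapp 180s`. The non-zero return code lets a CI job mark the release as failed after the undo.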
Rolling Update Visualization
Rolling Update Configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max 1 extra pod during update (25% of 4)
      maxUnavailable: 1  # Max 1 pod can be unavailable (25% of 4)
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v2.0  # Updated version
    spec:
      containers:
        - name: myapp
          image: myapp:v2.0  # New image version
          ports:
            - containerPort: 8080
          # Critical: Health checks ensure new pods are ready
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
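The two rollingUpdate settings bound how many pods exist at any moment during the update. As a quick check of that arithmetic, here is a small sketch; the helper name `rollout_bounds` is made up for illustration, and it handles absolute values only, not percentages.

```shell
# rollout_bounds REPLICAS MAX_SURGE MAX_UNAVAILABLE
# Prints the most pods that may exist and the fewest that stay available.
# Hypothetical helper; absolute numbers only.
rollout_bounds() {
  local replicas="$1" surge="$2" unavailable="$3"
  echo "max_pods=$(( replicas + surge )) min_available=$(( replicas - unavailable ))"
}

rollout_bounds 4 1 1   # prints: max_pods=5 min_available=3
```

With the configuration above, the cluster needs headroom for a fifth pod, and the application must cope with three pods serving all traffic.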
Part 2: Blue-Green Deployment
Blue-green deployment maintains two identical environments (blue = current, green = new) and switches traffic instantly.
Blue-Green Deployment Flow
- Starting state: blue running. Version: v1.0, 4 pods running, 100% of traffic.
- Create the green environment. Version: v2.0. Deploy 4 new pods; no traffic yet.
- Green pods start: pulling images, running init containers, starting processes.
- Health check: are all green pods healthy?
  - No: fix the issues in green and debug the problems. Blue still serves 100%, so there is zero user impact. Redeploy green.
  - Yes: run smoke tests on the green environment (API endpoints, database connections, critical features).
- Tests passed?
  - No: investigate the failures. Green has issues, blue still serves 100%, and users are unaffected. Fix and retry.
  - Yes: green environment ready ✓ (version v2.0, all tests passed, standing by for cutover).
- Execute the traffic switch: update the Service selector from version: v1.0 to version: v2.0. Instant cutover.
- New state: blue v1.0 at 0% traffic, green v2.0 at 100% traffic ✓. Cutover complete.
- Monitor green for issues. Watch error rates, latency, and resource usage.
- Green stable?
  - Yes: keep green running. Delete the blue environment, or keep blue for quick rollback.
  - No: instant rollback. Switch the Service back to blue (version: v2.0 → version: v1.0). Downtime: ~1 second. Blue is restored with 100% of traffic on v1.0; debug green offline.
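The smoke-test step in this flow can be sketched as a small script that probes a list of endpoints before cutover. This is only an illustration: it assumes green is reachable at some base URL that bypasses the live Service (for example through a port-forward), and the function name `smoke_test` and the example paths are hypothetical.

```shell
# smoke_test BASE_URL PATH...
# Probes each path in order and returns non-zero at the first failure.
# Hypothetical helper; base URL and paths are supplied by the caller.
smoke_test() {
  local base="$1"; shift
  local path
  for path in "$@"; do
    # -f: treat HTTP 4xx/5xx as failure; -sS: quiet, but still print errors
    if curl -fsS -o /dev/null --max-time 5 "$base$path"; then
      echo "ok   $path"
    else
      echo "FAIL $path" >&2
      return 1
    fi
  done
}
```

Usage: `smoke_test http://localhost:8080 /health /api/orders` (both paths are placeholders). A non-zero exit keeps blue serving 100% of traffic, matching the "No" branch of the flow.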
Blue-Green Architecture
- Users send requests to the load balancer (the Service), whose selector is app: myapp, version: v2.0.
- Blue environment (idle): version v1.0, status standby, 0% of traffic. Pods 1 to 4 run v1.0.
- Green environment (active): version v2.0, status active, 100% of traffic. Pods 1 to 4 run v2.0.
- Database: one shared database, compatible with both versions. Both environments connect to it.
- To roll back: update the selector to version: v1.0 for an instant switch to blue.
Blue-Green with Kubernetes
# Blue deployment (current version v1.0)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-blue
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
      version: v1.0
  template:
    metadata:
      labels:
        app: myapp
        version: v1.0
    spec:
      containers:
        - name: myapp
          image: myapp:v1.0
---
# Green deployment (new version v2.0)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
      version: v2.0
  template:
    metadata:
      labels:
        app: myapp
        version: v2.0
    spec:
      containers:
        - name: myapp
          image: myapp:v2.0
---
# Service - controls which version receives traffic
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp
    version: v2.0  # Change this to switch versions
                   # v1.0 = blue, v2.0 = green
  ports:
    - port: 80
      targetPort: 8080
Switch traffic:
# Currently on blue (v1.0)
kubectl patch service myapp-service -p '{"spec":{"selector":{"version":"v2.0"}}}'
# Now on green (v2.0) - instant switch!
# Rollback to blue if needed
kubectl patch service myapp-service -p '{"spec":{"selector":{"version":"v1.0"}}}'
# Back to blue (v1.0) - instant rollback!
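Because the patch takes effect immediately, it can help to refuse the switch unless the target Deployment is fully ready. The sketch below wraps the patch command from above with that guard. The function name `switch_traffic` is an assumption for illustration; the Deployment and Service names come from the manifests in this guide.

```shell
# switch_traffic VERSION DEPLOYMENT [SERVICE]
# Points the Service at VERSION only if DEPLOYMENT has all replicas ready.
# Hypothetical helper built around the patch command shown above.
switch_traffic() {
  local version="$1" deployment="$2" service="${3:-myapp-service}"
  local want ready

  want=$(kubectl get deployment "$deployment" -o jsonpath='{.spec.replicas}') || return 1
  ready=$(kubectl get deployment "$deployment" -o jsonpath='{.status.readyReplicas}') || return 1

  # readyReplicas is omitted from the status when no pod is ready.
  if [ "${ready:-0}" != "$want" ]; then
    echo "refusing to switch: $deployment has ${ready:-0}/$want pods ready" >&2
    return 1
  fi

  kubectl patch service "$service" \
    -p "{\"spec\":{\"selector\":{\"version\":\"$version\"}}}"
}
```

Usage: `switch_traffic v2.0 myapp-green` for the cutover and `switch_traffic v1.0 myapp-blue` for the rollback, which only works while blue is still running.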
Part 3: Canary Deployment
Canary deployment gradually shifts traffic from old to new version, allowing you to validate changes with a small percentage of users first.
Canary Deployment Flow
- Baseline: v1.0 with 10 replicas, 100% of traffic, serving all users.
- Deploy canary: v2.0 with 1 replica, ~9% of traffic. Testing with a small subset.
- Monitor phase 1 (10 minutes). Watch error rates, latency p95/p99, and business metrics.
  - Metrics unhealthy: ❌ roll back the canary (scale v2.0 to 0). Issue detected early; 91% of users unaffected.
  - Metrics healthy: ✓ increase traffic. v1.0: 7 replicas (64%), v2.0: 4 replicas (36%). More confident.
- Monitor phase 2 (20 minutes). Larger sample size, more diverse traffic.
  - Metrics unhealthy: ❌ roll back the canary (scale v2.0 to 0). Issue found under load.
  - Metrics healthy: ✓ increase to 50%. v1.0: 5 replicas, v2.0: 5 replicas. Even split.
- Monitor phase 3 (30 minutes). Half of traffic; statistical significance.
  - Metrics unhealthy: ❌ roll back the canary (scale v2.0 to 0, scale v1.0 to 10).
  - Metrics healthy: ✓ complete the rollout. v1.0: 0 replicas, v2.0: 10 replicas. Full deployment.
- Final monitoring: v2.0 serves 100%. Remove the v1.0 deployment. Canary successful ✓
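The percentages in this flow follow from replica counts, assuming the Service spreads requests roughly evenly across ready pods. A small sketch of that arithmetic (the helper name `canary_share` is illustrative):

```shell
# canary_share STABLE_REPLICAS CANARY_REPLICAS
# Prints the approximate percentage of traffic reaching the canary,
# rounded to the nearest whole percent. Hypothetical helper.
canary_share() {
  local stable="$1" canary="$2"
  local total=$(( stable + canary ))
  [ "$total" -gt 0 ] || { echo "no replicas" >&2; return 1; }
  echo $(( (canary * 100 + total / 2) / total ))
}

canary_share 10 1   # prints: 9
canary_share 7 4    # prints: 36
canary_share 5 5    # prints: 50
```

Replica-based splitting is coarse: with 11 pods, the smallest possible step is about 9%. Finer control usually needs a traffic-splitting layer in front of the pods.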
Canary Traffic Split Progression
Canary Deployment with Kubernetes
# Stable version (v1.0)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-stable
spec:
  replicas: 9  # 90% of traffic
  selector:
    matchLabels:
      app: myapp
      track: stable
  template:
    metadata:
      labels:
        app: myapp
        track: stable
        version: v1.0
    spec:
      containers:
        - name: myapp
          image: myapp:v1.0
---
# Canary version (v2.0)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-canary
spec:
  replicas: 1  # 10% of traffic
  selector:
    matchLabels:
      app: myapp
      track: canary
  template:
    metadata:
      labels:
        app: myapp
        track: canary
        version: v2.0
    spec:
      containers:
        - name: myapp
          image: myapp:v2.0
---
# Service selects both stable and canary
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp  # Matches both stable and canary
  ports:
    - port: 80
      targetPort: 8080
Progressive traffic increase:
# Start: 10% canary
kubectl scale deployment myapp-stable --replicas=9
kubectl scale deployment myapp-canary --replicas=1
# Increase to 30% canary
kubectl scale deployment myapp-stable --replicas=7
kubectl scale deployment myapp-canary --replicas=3
# Increase to 50% canary
kubectl scale deployment myapp-stable --replicas=5
kubectl scale deployment myapp-canary --replicas=5
# Complete rollout: 100% canary
kubectl scale deployment myapp-stable --replicas=0
kubectl scale deployment myapp-canary --replicas=10
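The paired scale commands above can be folded into one helper that derives both replica counts from a target percentage. This is a sketch under the assumption that the total stays fixed at 10 pods, as in the commands above; the function name `set_canary_percent` is hypothetical.

```shell
# set_canary_percent PERCENT [TOTAL]
# Splits TOTAL replicas between myapp-canary and myapp-stable.
# Hypothetical helper; TOTAL defaults to the 10 pods used in this guide.
set_canary_percent() {
  local percent="$1" total="${2:-10}"
  local canary=$(( (total * percent + 50) / 100 ))  # round to nearest pod
  local stable=$(( total - canary ))

  # When the percentage is increasing, scaling the canary first keeps
  # total capacity from dipping while the stable set shrinks.
  kubectl scale deployment myapp-canary --replicas="$canary"
  kubectl scale deployment myapp-stable --replicas="$stable"
}
```

Usage: `set_canary_percent 30` scales to 3 canary and 7 stable pods; `set_canary_percent 0` takes the canary out of rotation.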
Part 4: Advanced Canary with Automated Analysis
Automated Canary Decision Flow
- Start: canary at 10% traffic.
- Collect metrics from Prometheus, application logs, and business analytics.
- Compare metrics, stable vs canary, in this order:
  - Error rate: canary > 2x stable → 🚨 auto-rollback triggered (error rate too high).
  - Latency p95: canary > 1.5x stable → 🚨 auto-rollback triggered (latency degradation).
  - Business metrics: canary conversion < 80% of stable → 🚨 auto-rollback triggered (business impact detected).
  - Custom metrics check: failed → 🚨 auto-rollback triggered (custom threshold violated).
- All checks acceptable: ✅ all metrics healthy, analysis passed.
- Enough time elapsed?
  - No: wait for more data (minimum observation period), then collect metrics again.
  - Yes: 🎯 auto-promote the canary to the next traffic tier: 10% → 25%, 25% → 50%, 50% → 100%.
- After scaling to 25% or 50%, monitor the next tier by collecting metrics again. At 100%, the deployment is complete ✓
- Any auto-rollback executes the rollback: scale the canary to 0 and alert the team.
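The threshold checks in this flow reduce to a few comparisons. The sketch below encodes the three numeric rules (2x error rate, 1.5x p95 latency, 80% conversion) and the tier progression; it leaves out the custom-metrics check and the metric collection itself. The function names and argument order are assumptions for illustration, and the metric values are expected to be fetched elsewhere.

```shell
# canary_verdict S_ERR C_ERR S_P95 C_P95 S_CONV C_CONV
# Compares stable (S_*) and canary (C_*) metrics and prints either
# "promote" or "rollback: <reason>". awk handles the float arithmetic.
canary_verdict() {
  awk -v s_err="$1" -v c_err="$2" \
      -v s_p95="$3" -v c_p95="$4" \
      -v s_conv="$5" -v c_conv="$6" 'BEGIN {
    if (c_err + 0  > 2.0 * s_err)  { print "rollback: error rate too high"; exit }
    if (c_p95 + 0  > 1.5 * s_p95)  { print "rollback: latency degradation"; exit }
    if (c_conv + 0 < 0.8 * s_conv) { print "rollback: business impact detected"; exit }
    print "promote"
  }'
}

# next_tier CURRENT_PERCENT
# Prints the next traffic tier: 10 -> 25 -> 50 -> 100.
next_tier() {
  case "$1" in
    10) echo 25 ;;
    25) echo 50 ;;
    50) echo 100 ;;
    *)  echo "unknown tier: $1" >&2; return 1 ;;
  esac
}
```

Usage: `canary_verdict 0.01 0.03 200 210 0.05 0.05` prints `rollback: error rate too high`. Note that a stable error rate of exactly 0 makes any canary error trigger a rollback, so a small floor on the stable value may be worth adding.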
Part 5: Comparison of Deployment Strategies
Strategy Decision Matrix
- Have 2x resources?
  - No: use Rolling Update (most resource efficient).
  - Yes: what is your risk tolerance?
    - Low risk tolerance, need instant rollback: use Blue-Green (instant cutover and rollback, full testing before the switch).
    - Can tolerate some risk: what user impact is acceptable during rollout?
      - All users can see changes immediately: use Rolling Update (gradual rollout, minimal resources).
      - Need gradual validation: do you have good monitoring and metrics?
        - No: use Manual Canary (manual traffic control, manual validation).
        - Yes: use Automated Canary (automatic analysis, auto-rollback on issues, progressive delivery).
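The decision path above is small enough to write down as a function, which can be handy as a checklist in a runbook. This is a sketch only; the function name `choose_strategy` and the yes/no/low/some argument values are invented for illustration.

```shell
# choose_strategy HAS_2X_RESOURCES RISK GRADUAL_VALIDATION GOOD_MONITORING
#   HAS_2X_RESOURCES    yes|no
#   RISK                low|some   (low = instant rollback needed)
#   GRADUAL_VALIDATION  yes|no
#   GOOD_MONITORING     yes|no
# Hypothetical helper that mirrors the decision path in this guide.
choose_strategy() {
  local resources="$1" risk="$2" gradual="$3" monitoring="$4"
  if [ "$resources" = no ]; then echo "rolling-update"; return; fi
  if [ "$risk" = low ]; then echo "blue-green"; return; fi
  if [ "$gradual" = no ]; then echo "rolling-update"; return; fi
  if [ "$monitoring" = yes ]; then
    echo "automated-canary"
  else
    echo "manual-canary"
  fi
}

choose_strategy yes some yes no   # prints: manual-canary
```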
Detailed Comparison Table
| Aspect | Rolling Update | Blue-Green | Canary |
|---|---|---|---|
| Downtime | Zero (if health checks work) | Zero | Zero |
| Rollback Speed | Slow (reverse rolling) | Instant (switch back) | Fast (reduce traffic) |
| Resource Usage | Low (1-2 extra pods) | High (2x infrastructure) | Low to Medium |
| Risk | Medium (gradual exposure) | Low (test before switch) | Very Low (gradual validation) |
| Complexity | Low | Medium | High |
| Testing | Limited | Full environment testing | Real user validation |
| User Impact | All users see changes gradually | All users switch at once | Gradual user exposure |
| Best For | Standard releases | Critical systems | High-risk changes |
| Monitoring Needs | Basic health checks | Smoke tests | Comprehensive metrics |
| Rollback Impact | All users affected | All users affected | Minimal users affected |
Part 6: Deployment Strategy Recommendations
When to Use Each Strategy
Rolling Update - Default Choice
✅ Use When:
- Standard application updates
- Limited infrastructure budget
- Changes are low-risk
- Quick deployments needed
- Kubernetes native support desired
❌ Avoid When:
- Zero tolerance for partial rollouts
- Need instant rollback capability
- Database schema changes incompatible with old version
Blue-Green - Zero Downtime Critical Systems
✅ Use When:
- Deploying to production databases
- Regulatory compliance requirements
- Need full environment testing
- Instant rollback is critical
- Can afford 2x infrastructure
❌ Avoid When:
- Resource constrained
- Stateful applications (hard to maintain two)
- Database migrations are complex
Canary - High-Risk Changes
✅ Use When:
- Major version upgrades
- Architecture changes
- Uncertain about performance impact
- Good monitoring infrastructure exists
- Can analyze real user metrics
❌ Avoid When:
- No good metrics to validate
- Can't split traffic easily
- Changes must be all-or-nothing
- Rapid rollout needed
Part 7: Best Practices
Health Checks are Critical
# Essential for any deployment strategy
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1
livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
Database Compatibility
Always maintain backward compatibility:
| | Deployment N-1 | Deployment N | Deployment N+1 |
|---|---|---|---|
| Schema | v1 | v1 + v2 | v2 |
| Code reads | v1 | v1/v2 | v2 |
| Code writes | | v2 | v2 |
Monitoring Checklist
- ✅ Error rates (4xx, 5xx)
- ✅ Latency percentiles (p50, p95, p99)
- ✅ Request throughput
- ✅ Resource utilization (CPU, memory)
- ✅ Business metrics (conversions, transactions)
- ✅ Custom application metrics
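For the latency percentiles in this checklist, a quick way to sanity-check dashboard numbers is to compute them from raw samples. The sketch below uses the nearest-rank method and assumes one latency value per line on standard input; the function name `percentile` is illustrative, and a metrics system may use a different interpolation.

```shell
# percentile P
# Reads one number per line from stdin and prints the P-th percentile
# using the nearest-rank method. Hypothetical helper.
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = p * NR / 100
      idx = int(rank)
      if (idx < rank) idx++     # round up to the next whole rank
      if (idx < 1) idx = 1
      print v[idx]
    }'
}
```

Usage: `percentile 95 < latencies.txt`, where latencies.txt is a placeholder for a file of response times in a single unit such as milliseconds.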
Conclusion
Choosing the right deployment strategy depends on your:
- Risk tolerance: How much user impact is acceptable?
- Resources: Can you afford 2x infrastructure?
- Monitoring: Do you have metrics to validate changes?
- Rollback needs: How quickly must you recover from issues?
Quick recommendations:
- Start with Rolling Updates: Built into Kubernetes, low overhead
- Use Blue-Green for: Critical systems, major releases, zero-downtime requirements
- Adopt Canary for: High-risk changes, user-facing features, when you have good metrics
The visual diagrams in this guide show how traffic flows and how rollbacks work in each strategy, making it easier to understand the trade-offs.
Further Reading
- Kubernetes Deployment Strategies
- Flagger - Progressive Delivery Operator
- Argo Rollouts - Advanced Deployment Controller
- Martin Fowler - Blue-Green Deployment