Introduction
Kubernetes Pods are the smallest deployable units in Kubernetes, representing one or more containers that share resources. Understanding the Pod lifecycle is crucial for debugging, monitoring, and managing applications in Kubernetes.
This guide walks through the complete Pod lifecycle:
- Pod Creation: From YAML manifest to scheduling
- State Transitions: Pending → Running → Succeeded/Failed
- Init Containers: Pre-application setup
- Container Restart Policies: How Kubernetes handles failures
- Termination: Graceful shutdown process
Part 1: Pod Lifecycle Overview
Complete Pod State Machine
A Pod moves through five states. The key transitions:

- Pending → Running: the Pod is scheduled, images are pulled, and containers start
- Pending → Failed: fatal startup errors such as invalid config (a failed image pull normally keeps the Pod Pending with ImagePullBackOff)
- Running → Succeeded: all containers completed successfully (restartPolicy: Never/OnFailure)
- Running → Failed: a container failed and won't restart
- Running → Running: a container is restarted in place (restartPolicy: Always/OnFailure)
- Running → Terminating: a delete request is received while the Pod runs
- Terminating → Succeeded: graceful shutdown succeeds
- Terminating → Failed: force termination after the grace period
- Succeeded/Failed → cleanup: the Pod object is removed

What each state means:

- Pending: Pod accepted by the cluster; waiting for scheduling, pulling images, starting init containers, creating the container runtime
- Running: Pod is executing; at least one container is running (possibly starting or restarting), the application is serving traffic, health checks are active
- Succeeded: all containers terminated successfully with exit code 0; they will not be restarted; a Job/CronJob has completed
- Failed: Pod terminated in failure; non-zero exit code, OOMKilled, restart limit exceeded, or node failure
- Terminating: Pod is shutting down; SIGTERM sent, grace period active, endpoints removed, cleanup in progress
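You can read the current state straight from the Pod's status; a quick sketch (the pod name myapp-pod is illustrative):

```bash
# Print just the Pod phase (Pending, Running, Succeeded, Failed, or Unknown)
kubectl get pod myapp-pod -o jsonpath='{.status.phase}'

# List all Pods with their phase in a custom column
kubectl get pods -o custom-columns=NAME:.metadata.name,PHASE:.status.phase
```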
Pod Creation to Running Flow
1. A user submits the Pod manifest; the API server validates the YAML and writes the object to etcd.
2. The scheduler looks for a suitable node. If none qualifies, the Pod stays Pending with reason Unschedulable (insufficient resources, node selector mismatch, taints/tolerations).
3. Otherwise the Pod is assigned to a node (spec.nodeName is set) and the kubelet on that node receives the Pod spec.
4. The kubelet pulls the container images. If pulling fails (image doesn't exist, registry auth failed, network issues), the Pod stays Pending with reason ImagePullBackOff.
5. If init containers are defined, they run sequentially. A failing init container puts the Pod in Init:Error or Init:CrashLoopBackOff.
6. Once all init containers succeed (or none are defined), the kubelet creates the main containers, sets up networking, mounts volumes, and starts all containers in the Pod.
7. If a startup probe is defined, it runs first; a container that keeps failing it is not ready and eventually lands in CrashLoopBackOff. When the probe passes (or none is defined), the Pod is Running: containers ready, liveness and readiness probes active.
8. The Pod is added to Service endpoints and starts receiving traffic.
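To watch this flow happen on a live cluster, you can follow the Pod's status as it moves from Pending to Running (the pod name myapp-pod is illustrative):

```bash
# Watch status changes as the Pod is created
# (Pending -> ContainerCreating -> Running)
kubectl get pod myapp-pod --watch

# If the Pod is stuck, the Events section shows scheduling
# and image-pull problems
kubectl describe pod myapp-pod | sed -n '/Events:/,$p'
```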
Part 2: Pod Creation Sequence
API Server to Kubelet Communication
1. kubectl apply sends the manifest; the API server validates required fields, resource limits, and the security context.
2. The API server writes the Pod object to etcd with Status: Pending and an empty nodeName.
3. The scheduler evaluates nodes (CPU/memory available, affinity rules, taints/tolerations), picks the best one (node-1 in this example), and binds the Pod; the API server updates Pod.spec.nodeName = "node-1" in etcd.
4. The kubelet on node-1, watching for Pods bound to its node, fetches the Pod specification from the API server.
5. The kubelet asks the container runtime to pull the image (nginx:1.21); the runtime fetches the layers from the registry, then extracts and caches the image.
6. The kubelet has the runtime create a container from the Pod spec config and start it.
7. The kubelet updates the Pod status (Phase: Running, containerStatuses: ready), which the API server saves to etcd.
8. The kubelet starts health checks (startup, readiness, and liveness probes) and keeps monitoring the container from then on.
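After step 7, the Pod's status stanza recorded in etcd looks roughly like this (a trimmed sketch; the timestamp is illustrative):

```yaml
status:
  phase: Running
  conditions:
    - type: PodScheduled    # set once the scheduler binds the Pod
      status: "True"
    - type: Initialized     # all init containers completed
      status: "True"
    - type: ContainersReady # every container passed readiness
      status: "True"
    - type: Ready           # Pod may be added to Service endpoints
      status: "True"
  containerStatuses:
    - name: nginx
      ready: true
      restartCount: 0
      state:
        running:
          startedAt: "2024-01-01T00:00:00Z"
```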
Part 3: Init Containers
Init containers run before app containers and must complete successfully before the main containers start.
Init Container Execution Flow
1. Init container 1 (check-database) starts and loops until the database answers (nc -z db 5432, sleeping 2s between attempts). A non-zero exit code yields Init:Error and the container is retried with backoff; exit code 0 moves on.
2. Init container 2 (setup-config) copies templates into place (cp /config-template/* /config/ and chmod 600 /config/*). On failure it is likewise retried; on success the next init container starts.
3. Init container 3 (migration) runs ./run-migrations.sh. If its retries are exhausted, the Pod shows Init:CrashLoopBackOff.
4. When all init containers have completed successfully, the main containers start and the Pod status becomes Running.
Init Container Example
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
spec:
  initContainers:
    # Init container 1: Wait for database
    - name: check-database
      image: busybox:1.35
      command: ['sh', '-c']
      args:
        - |
          until nc -z postgres-service 5432; do
            echo "Waiting for database..."
            sleep 2
          done
          echo "Database is ready!"
    # Init container 2: Setup configuration
    - name: setup-config
      image: busybox:1.35
      command: ['sh', '-c']
      args:
        - |
          cp /config-template/app.conf /config/
          chmod 600 /config/app.conf
      volumeMounts:
        - name: config
          mountPath: /config
        - name: config-template
          mountPath: /config-template
    # Init container 3: Run migrations
    - name: run-migrations
      image: myapp:v1.0
      command: ['./migrate']
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: url
  containers:
    # Main application container
    - name: myapp
      image: myapp:v1.0
      ports:
        - containerPort: 8080
      volumeMounts:
        - name: config
          mountPath: /config
  volumes:
    - name: config
      emptyDir: {}
    - name: config-template
      configMap:
        name: app-config
```
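While the Pod initializes, each init container's progress can be checked individually (a sketch using the names from the manifest above):

```bash
# Show init container progress (e.g. Init:0/3, Init:1/3, Running)
kubectl get pod myapp-pod

# Logs from a specific init container
kubectl logs myapp-pod -c check-database

# Detailed init container states and exit codes
kubectl get pod myapp-pod -o jsonpath='{.status.initContainerStatuses[*].state}'
```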
Part 4: Container Restart Policies
Restart Policy Decision Tree
When a container exits, the kubelet checks the exit code and the Pod's restartPolicy:

- Exit code 0 (success):
  - Always: restart the container after a backoff delay
  - OnFailure / Never: no restart; the container stays terminated
- Non-zero exit code (failure):
  - Always / OnFailure: restart the container after a backoff delay
  - Never: no restart; the Pod status becomes Failed

Restarts use an exponential backoff: the delay starts at 10s and doubles on each crash (10s, 20s, 40s, ...), capped at five minutes, and resets after a container has run successfully for 10 minutes. While a container is waiting out this delay after repeated failures, the Pod shows CrashLoopBackOff. When no container will be restarted and all containers are done, the Pod's final status is Succeeded or Failed.
Restart Policy Comparison
| Policy | On Success (Exit 0) | On Failure (Exit ≠ 0) | Use Case |
|---|---|---|---|
| Always | Restart with backoff | Restart with backoff | Long-running services, web servers |
| OnFailure | No restart | Restart with backoff | Batch jobs that should retry on failure |
| Never | No restart | No restart | One-time tasks, completed jobs |
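The same policies appear in Job specs, where only OnFailure and Never are allowed and the retry budget moves up a level: a minimal sketch (job name and image are illustrative) in which the Job controller, not the kubelet, recreates failed Pods:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: report-job
spec:
  backoffLimit: 4          # Job controller retries the Pod up to 4 times
  template:
    spec:
      restartPolicy: Never # Jobs only allow Never or OnFailure
      containers:
        - name: report
          image: report-generator:v1
```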
Restart Policy Examples
```yaml
# Always restart - for services
apiVersion: v1
kind: Pod
metadata:
  name: web-server
spec:
  restartPolicy: Always # Default
  containers:
    - name: nginx
      image: nginx:1.21
---
# Restart on failure - for batch jobs
apiVersion: v1
kind: Pod
metadata:
  name: data-processor
spec:
  restartPolicy: OnFailure
  containers:
    - name: processor
      image: data-processor:v1
---
# Never restart - for one-time tasks
apiVersion: v1
kind: Pod
metadata:
  name: migration
spec:
  restartPolicy: Never
  containers:
    - name: migrate
      image: migrate:v1
```
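To see a restart policy in action, watch the restart count and the reason for the last termination (a sketch using the web-server Pod above):

```bash
# RESTARTS column increments on each restart;
# STATUS may show CrashLoopBackOff between attempts
kubectl get pod web-server --watch

# Restart count and last termination reason (e.g. Error, OOMKilled)
kubectl get pod web-server \
  -o jsonpath='{.status.containerStatuses[0].restartCount}{"\n"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}'
```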
Part 5: Health Checks (Probes)
Kubernetes uses three types of probes to check container health:
Probe Types and Execution Flow
1. Startup probe: if configured, it runs first, executing every periodSeconds. While it fails, the kubelet retries until failureThreshold is exceeded, at which point the container is killed and the restart policy applies. Once it passes, the container counts as initialized and the other probes take over. If no startup probe is configured, the other checks begin immediately.
2. Liveness probe (is the container alive?): runs continuously every periodSeconds. A success resets the failure count; once consecutive failures reach failureThreshold, the container is restarted as unhealthy.
3. Readiness probe (can the container accept traffic?): also runs continuously every periodSeconds. On success the Pod is marked Ready and added to Service endpoints; on failure it is marked Not Ready and removed from the endpoints so no traffic reaches it. Readiness failures never restart the container.

Liveness and readiness run in parallel for the life of the container.
Probe Configuration Examples
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-probes
spec:
  containers:
    - name: app
      image: myapp:v1.0
      ports:
        - containerPort: 8080
      # Startup probe - runs first, protects slow-starting apps
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 0
        periodSeconds: 5
        failureThreshold: 30 # 30 * 5 = 150 seconds to start
      # Liveness probe - restarts container if it fails
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 10
        failureThreshold: 3
        successThreshold: 1
        timeoutSeconds: 5
      # Readiness probe - controls traffic routing
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 3
        successThreshold: 1
        timeoutSeconds: 3
      # Example TCP probe
      # livenessProbe:
      #   tcpSocket:
      #     port: 8080
      #   periodSeconds: 10
      # Example exec probe
      # livenessProbe:
      #   exec:
      #     command:
      #       - cat
      #       - /tmp/healthy
      #   periodSeconds: 10
```
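To confirm the probes behave as configured, check the Pod's readiness and the Service endpoints (the Service name app-service is illustrative):

```bash
# READY column shows containers passing their readiness probe (e.g. 1/1)
kubectl get pod app-with-probes

# Probe configuration and any probe-failure events
kubectl describe pod app-with-probes

# A ready Pod appears in its Service's endpoints; a not-ready one is removed
kubectl get endpoints app-service
```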
Part 6: Pod Termination
Graceful Shutdown Process
1. A user (or controller) deletes the Pod; the API server marks it Terminating and starts the grace period (30s by default).
2. Two things happen in parallel: the endpoints controller removes the Pod from its Service so no new traffic is routed to it, and the kubelet is told to terminate the Pod.
3. The kubelet first runs the preStop hook, if one is defined (e.g., calling a /shutdown endpoint). The hook's runtime counts against the grace period.
4. The kubelet sends SIGTERM to the containers. The application should finish in-flight requests, close connections, save state, and release resources.
5. If the process exits before the grace period ends, the shutdown is clean: the kubelet removes the container and reports the Pod as terminated successfully.
6. If the grace period expires with the container still running, the kubelet sends SIGKILL and the process is killed immediately (forced termination).
7. The kubelet cleans up: volumes are removed, the network is released, the container is deleted. The API server removes the Pod object from etcd and reports the Pod as deleted.
Pod with PreStop Hook Example
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-shutdown
spec:
  terminationGracePeriodSeconds: 60 # Wait up to 60s
  containers:
    - name: app
      image: myapp:v1.0
      ports:
        - containerPort: 8080
      lifecycle:
        # Called before SIGTERM
        preStop:
          exec:
            # exec hooks take a single command list (no separate args field)
            command:
              - /bin/sh
              - -c
              - |
                # Notify application to stop accepting new requests
                curl -X POST http://localhost:8080/shutdown
                # Wait for in-flight requests to complete
                sleep 15
```
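The grace period can also be overridden at deletion time (a sketch; use --force with care, since it skips graceful shutdown):

```bash
# Delete with the Pod's configured grace period (60s here)
kubectl delete pod graceful-shutdown

# Override the grace period for this deletion only
kubectl delete pod graceful-shutdown --grace-period=10

# Skip graceful shutdown entirely: SIGKILL immediately (may drop requests)
kubectl delete pod graceful-shutdown --grace-period=0 --force
```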
Part 7: Common Pod Issues and Debugging
Pod Status Troubleshooting
Start from the Pod's status in kubectl get pods and branch on what you see:

- Pending
  - Check: kubectl describe pod, the Events section
  - Common causes: insufficient resources, no nodes match the selector, taints on nodes, volume mount issues
- ImagePullBackOff / ErrImagePull
  - Check: kubectl describe pod, look closely at the image name
  - Common causes: typo in the image name, image doesn't exist, registry auth needed, network issues
- CrashLoopBackOff
  - Check: kubectl logs pod, kubectl logs pod --previous, kubectl describe pod
  - Common causes: application crash, missing config/secrets, failed liveness probe, OOMKilled
- Running but not ready
  - Check: kubectl describe pod, the readiness probe, kubectl logs pod
  - Common causes: readiness probe failing, app not listening on the expected port, dependencies not ready, slow startup
- Error / Failed
  - Check: kubectl logs pod, kubectl describe pod, the exit code in the status
  - Common causes: application error, init container failed, invalid command, resource limits exceeded
Essential Debugging Commands
```bash
# Get pod status
kubectl get pods
kubectl get pods -o wide                      # Show node and IP

# Detailed pod information
kubectl describe pod <pod-name>

# View pod logs
kubectl logs <pod-name>
kubectl logs <pod-name> -c <container-name>   # Multi-container pod
kubectl logs <pod-name> --previous            # Previous container instance
kubectl logs <pod-name> --follow              # Stream logs

# Execute commands in pod
kubectl exec <pod-name> -- <command>
kubectl exec -it <pod-name> -- /bin/sh        # Interactive shell

# Check pod events
kubectl get events --sort-by=.metadata.creationTimestamp

# View pod YAML
kubectl get pod <pod-name> -o yaml

# Check resource usage
kubectl top pod <pod-name>

# Port forwarding for local access
kubectl port-forward <pod-name> 8080:80
```
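When a container image has no shell, an ephemeral debug container can be attached instead (requires a reasonably recent kubectl and cluster; the image and names are illustrative):

```bash
# Attach a temporary busybox container to a running Pod for inspection
kubectl debug -it <pod-name> --image=busybox:1.35 --target=<container-name>

# Or copy the Pod and override the command to debug a crashing container
kubectl debug <pod-name> -it --copy-to=debug-copy --container=<container-name> -- sh
```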
Part 8: Pod Lifecycle Best Practices
Configuration Checklist
- Resource Requests and Limits

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```

- Health Checks

```yaml
startupProbe: # For slow-starting apps
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
livenessProbe: # Restart if unhealthy
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
readinessProbe: # Control traffic routing
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
```

- Graceful Shutdown

```yaml
terminationGracePeriodSeconds: 60
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 15"]
```

- Security Context

```yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  capabilities:
    drop:
      - ALL
```
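Putting the checklist together, a production-leaning Pod spec might look like this (a sketch; the app name, image, and paths are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: production-app
spec:
  terminationGracePeriodSeconds: 60
  securityContext:          # Pod-level: applies to all containers
    runAsNonRoot: true
    runAsUser: 1000
  containers:
    - name: app
      image: myapp:v1.0
      ports:
        - containerPort: 8080
      resources:
        requests:
          memory: "256Mi"
          cpu: "250m"
        limits:
          memory: "512Mi"
          cpu: "500m"
      startupProbe:
        httpGet: { path: /healthz, port: 8080 }
        failureThreshold: 30
        periodSeconds: 10
      livenessProbe:
        httpGet: { path: /healthz, port: 8080 }
        periodSeconds: 10
      readinessProbe:
        httpGet: { path: /ready, port: 8080 }
        periodSeconds: 5
      securityContext:      # Container-level hardening
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 15"]
```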
Comparison: Pod Restart Policies
| Scenario | Always | OnFailure | Never |
|---|---|---|---|
| Success (Exit 0) | Restarts | Stays stopped | Stays stopped |
| Failure (Exit ≠ 0) | Restarts | Restarts | Stays stopped |
| Best for | Services, daemons | Batch jobs, tasks | One-time jobs |
| Example | Web server, API | Data processing | Database migration |
| Pod final status | Running (never reaches a terminal phase) | Succeeded/Failed | Succeeded/Failed |
Conclusion
Understanding the Kubernetes Pod lifecycle is essential for:
- Debugging: Quickly identify why Pods aren’t running
- Reliability: Configure proper health checks and restart policies
- Performance: Optimize startup and shutdown processes
- Observability: Know where to look when things go wrong
Key takeaways:
- Pods transition through well-defined states: Pending → Running → Succeeded/Failed
- Init containers prepare the environment before app containers start
- Restart policies determine how Kubernetes handles container failures
- Health probes (startup, liveness, readiness) ensure application health
- Graceful shutdown with preStop hooks prevents data loss
The step-by-step flows in this guide show how Kubernetes orchestrates containerized applications from creation to termination.
Further Reading
- Kubernetes Pod Lifecycle
- Configure Liveness, Readiness and Startup Probes
- Init Containers
- Pod Termination
Master the Pod lifecycle to build resilient Kubernetes applications!