Pods, ReplicaSets, and Deployments
1. PODS: The Atomic Unit
1.1 Internal Architecture
A Pod is not a process; it is an environment for processes. In Linux kernel terms, a Pod is a collection of namespaces and cgroups shared by a group of containers.
- The "Pause" Container: Every Pod acts as a logical host. When a Pod starts, Kubernetes first spins up a hidden "infra container" (often called the
pausecontainer). This container holds the network namespace (IP address) and IPC namespace. - Shared Context: All application containers in the Pod join the namespaces created by the pause container. This is why:
localhostworks between containers in the same Pod.- They share the same IP address.
- They share the same volume mounts.
1.2 The Lifecycle State Machine
Understanding the lifecycle is mandatory for debugging CrashLoopBackOff or stuck deployments.
| Phase | Internal State | Description & Debug Action |
|---|---|---|
| Pending | Scheduling | API Server has the object in Etcd, but the Scheduler has not found a node. Debug: kubectl get events (Look for "Insufficient CPU", "Taints", or "Unbound PVC"). |
| ContainerCreating | Pulling/Mounting | Node assigned. Kubelet is creating the sandbox, mounting CSI volumes, and pulling images. Debug: kubectl describe pod (Check "ImagePullBackOff", "ErrImagePull", "FailedMount"). |
| Running | Active | The process has started. Note: This does not mean the app is healthy; only that the PID exists. |
| Succeeded | Exit Code 0 | Process terminated gracefully. Normal for Batch Jobs. |
| Failed | Exit Code > 0 | Process crashed or was OOMKilled. Debug: kubectl logs -p (previous logs) or check memory limits. |
| Unknown | Node Lost | Controller Manager lost contact with the Node Kubelet (usually network partition or node crash). |
1.3 Container Lifecycle Hooks
Hooks allow execution of code at specific lifecycle points.
- `postStart`: Asynchronous execution. There is no guarantee this runs before the container ENTRYPOINT. Do not use this for database migrations or critical dependencies.
- `preStop`: Synchronous, blocking. Critical for graceful shutdowns.
  - The Problem: When a Pod is deleted, the Kubelet sends `SIGTERM`. Many apps (Nginx, Java) sever connections immediately.
  - The Solution: Use `preStop` to sleep (allowing Load Balancers to drain traffic) or issue a graceful shutdown command.
  - Constraint: Must complete within `terminationGracePeriodSeconds` (default 30s).
1.4 Production-Grade Pod Manifest
Do not use `kubectl run` for production definitions. Use this comprehensive reference.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: prod-payment-processor
  labels:
    app: payment-processor   # Used by Service selectors
    version: v1.2.0
    tier: backend
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9090"
spec:
  # GRACEFUL SHUTDOWN: Time for preStop + SIGTERM handling
  terminationGracePeriodSeconds: 45
  # SCHEDULING: Soft preference for specific nodes
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
  # SECURITY: Run as non-root user (Best Practice)
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: app
    image: enterprise/payment-api:1.2.0
    imagePullPolicy: IfNotPresent
    # PORTS: Informational only, does not open firewall
    ports:
    - containerPort: 8080
      name: http
    # RESOURCES: Mandatory for Scheduler and QoS classes
    resources:
      requests:
        memory: "512Mi"   # Guaranteed memory
        cpu: "250m"       # 1/4 core guaranteed
      limits:
        memory: "1Gi"     # OOMKill if exceeded
        cpu: "500m"       # Throttled if exceeded
    # PROBES: Self-healing configuration
    livenessProbe:        # Restart if dead
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
    readinessProbe:       # Remove from Load Balancer if failing
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
    # LIFECYCLE HOOKS
    lifecycle:
      preStop:
        exec:
          # Nginx example: quit gracefully, don't just kill
          # Generic app: sleep 5 to allow iptables propagation
          command: ["/bin/sh", "-c", "sleep 5; /usr/sbin/nginx -s quit"]
```
2. REPLICASET (RS)
2.1 The Controller Logic
The ReplicaSet is the pure "availability engine." Its reconciliation loop is simple:
- Check the current number of Pods with matching labels.
- Compare with `spec.replicas`.
- If `Current < Desired`: create a Pod.
- If `Current > Desired`: delete a Pod (usually the youngest first).
Note: In modern Kubernetes, you rarely manage RS directly. Deployments manage RS, and RS manages Pods.
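The loop above is simple enough to model in a few lines of Python. A toy sketch (the callback names are illustrative, not client-go or API types):

```python
def reconcile(current_pods, desired, create_pod, delete_pod):
    """Toy model of the ReplicaSet control loop.

    current_pods: mutable list of pod names matching the selector
    desired:      spec.replicas
    create_pod / delete_pod: callbacks standing in for API calls
    """
    diff = desired - len(current_pods)
    if diff > 0:
        for _ in range(diff):               # Current < Desired: scale up
            create_pod()
    elif diff < 0:
        # Current > Desired: scale down, youngest (last-created) first
        for pod in list(current_pods[desired:]):
            delete_pod(pod)

# Usage: start with 2 pods, converge to 4, then back down to 1
pods = ["a", "b"]
reconcile(pods, 4, lambda: pods.append(f"p{len(pods)}"), pods.remove)
```

The real controller watches the API server rather than polling, but the convergence logic is exactly this comparison of observed versus desired state.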
2.2 Selectors: Equality vs. Set-Based
ReplicaSets support complex selection logic, unlike the obsolete ReplicationController.
```yaml
spec:
  selector:
    matchExpressions:
    - {key: tier, operator: In, values: [frontend, api]}
    - {key: env, operator: NotIn, values: [dev]}
```
2.3 Debugging Pattern: Pod Quarantine
A powerful technique for debugging intermittent failures without affecting production capacity.
Scenario: One Pod in a set of 10 is throwing errors. You want to debug it, but if you kill the process or delete the Pod, the container restarts (or is replaced) and the evidence is gone.
The Fix (Label Hijacking):
- The RS tracks the Pod via the label `app=payment`.
- Overwrite the label on the broken Pod:
  `kubectl label pod payment-xyz-123 app=payment-debug --overwrite`
- Result:
  - The RS sees 9/10 pods. It immediately spins up a fresh replacement Pod to restore capacity.
  - The "broken" Pod (`payment-xyz-123`) is no longer managed by the RS. It stays running, isolated from the load balancer (Service), ready for you to `kubectl exec`, install debug tools (`strace`, `curl`), and analyze logs at your leisure.
3. DEPLOYMENT
The Deployment object is a higher-level abstraction that manages ReplicaSets to provide Declarative Updates.
3.1 Internals: How Updates Work
When you update a Deployment (e.g., change image tag), it does not patch existing Pods.
- The Deployment creates a New ReplicaSet.
- It ramps up the New RS (e.g., 0 -> 1 -> 2 replicas).
- It ramps down the Old RS (e.g., 10 -> 9 -> 8 replicas).
- This cross-scaling is controlled by the `strategy` field.
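The cross-scaling can be simulated step by step. A toy model assuming `maxUnavailable: 0` pacing (each batch of new Ready pods retires the same number of old pods; not actual controller code):

```python
def rolling_update(desired, max_surge):
    """Simulate the RS swap: surge new pods above desired, then
    retire old pods once the new ones are Ready. Returns the
    sequence of (old_rs_replicas, new_rs_replicas) states."""
    old, new = desired, 0
    history = [(old, new)]
    while old > 0:
        step = min(max_surge, desired - new)
        new += step        # New RS ramps up (temporarily exceeding desired)
        old -= step        # Old RS ramps down after new pods pass readiness
        history.append((old, new))
    return history
```

For example, `rolling_update(3, 1)` walks through `(3,0) -> (2,1) -> (1,2) -> (0,3)`: at every recorded state the total capacity never drops below the desired count.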
3.2 Deployment Strategies
A. Recreate
- Behavior: Scale the Old RS to `replicas: 0`, then the New RS to `replicas: N`.
- Result: Downtime.
- Use Case: Database schema changes where Version A and Version B cannot write to the DB simultaneously.
B. RollingUpdate (Default & Recommended)
Calculates the pace of the rollout to ensure availability.
- `maxSurge`: How many extra pods can we create above the desired count? (Can be % or integer.)
- `maxUnavailable`: How many pods can be missing from the desired count?
The Zero-Downtime Configuration: To guarantee that you never drop below 100% capacity during an update:
```yaml
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2         # Allow 12 pods total during update
      maxUnavailable: 0   # NEVER kill a pod until a new one is Ready
```
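The arithmetic behind these fields can be checked directly. Kubernetes resolves percentage values against `replicas`, rounding `maxSurge` up and `maxUnavailable` down; a small Python sketch of those rules:

```python
import math

def resolve(value, replicas, round_up):
    """Resolve an int-or-percentage field against the replica count."""
    if isinstance(value, str) and value.endswith("%"):
        frac = int(value[:-1]) / 100 * replicas
        return math.ceil(frac) if round_up else math.floor(frac)
    return value

def rollout_bounds(replicas, max_surge, max_unavailable):
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total": replicas + surge,          # pods allowed during update
        "min_available": replicas - unavailable, # floor on serving capacity
    }
```

With `replicas: 10`, `maxSurge: 2`, `maxUnavailable: 0`, this yields a ceiling of 12 pods and a floor of 10, matching the manifest comments above.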
3.3 Rollbacks and History
Kubernetes maintains a history of ReplicaSets to facilitate rollbacks.
1. Check History:
`kubectl rollout history deployment/web-app`

Sample Output:

```
REVISION  CHANGE-CAUSE
1         kubectl create deployment web-app --image=nginx:1.19 --record
2         kubectl set image deployment/web-app nginx=nginx:1.20 --record
3         kubectl set image deployment/web-app nginx=nginx:1.21 --record
```
2. Rollback:
The following command scales the ReplicaSet behind Revision 2 back up to the desired count and scales the Revision 3 RS down to 0.

`kubectl rollout undo deployment/web-app --to-revision=2`
3.4 Production Deployment Manifest
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mission-critical-api
  labels:
    app: api
spec:
  replicas: 3
  # REVISION HISTORY: Keep only last 5 RS to save Etcd space
  revisionHistoryLimit: 5
  selector:
    matchLabels:
      app: api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api-server
        image: my-registry/api:v4.5
        # ... (Insert Pod Spec from Section 1.4) ...
```
4. Advanced Commands & Troubleshooting
4.1 "Tough" Command Examples
1. Decode the internal reason for a Pod failure:
When kubectl get pod just says "CrashLoopBackOff", you need the Exit Code and the Reason.
`kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].state}'`

Sample Output:

```
{"terminated":{"containerID":"containerd://...","exitCode":137,"reason":"OOMKilled","startedAt":"2023-10-27T10:00:00Z"}}
```
(Exit Code 137 indicates 128 + 9 (SIGKILL), confirming an Out Of Memory kill).
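The 128 + signal convention can be decoded mechanically. A small helper (not part of kubectl, just the arithmetic):

```python
import signal

def decode_exit_code(code: int) -> str:
    """Map a container exit code to a human-readable cause.

    By POSIX shell convention, codes above 128 mean the process
    was killed by signal number (code - 128).
    """
    if code > 128:
        sig = signal.Signals(code - 128)
        return f"killed by {sig.name}"
    return f"exited with status {code}"
```

`decode_exit_code(137)` reports SIGKILL (the OOM killer's signal); 143 reports SIGTERM, the usual result of an unhandled graceful shutdown.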
2. Watch a Rolling Update in real-time: Don't just wait; watch the ReplicaSets swap over.
`kubectl get rs -l app=my-app --watch`

Sample Output:

```
NAME               DESIRED   CURRENT   READY   AGE
my-app-6b474c4b7   10        10        10      5d    (Old RS)
my-app-8f92739a2   0         0         0       0s    (New RS created)
my-app-8f92739a2   3         3         0       2s    (New RS scaling up)
my-app-6b474c4b7   8         8         8       2s    (Old RS scaling down)
```
3. Force Replace (The Nuclear Option): Sometimes a Deployment gets stuck because of immutable field conflicts.
`kubectl replace --force -f deployment.yaml`
Warning: This deletes the deployment and recreates it. Causes downtime unless carefully managed.
4.2 Common Pitfalls
- Missing `imagePullSecrets`: Pod hangs in `ImagePullBackOff`.
- The `latest` tag: Avoid using `:latest`. It breaks the immutability of rollbacks (rolling back to a previous revision won't help if the image behind `:latest` has changed).
- Mismatched Selectors: If the Deployment `selector` does not match `template.metadata.labels`, the API server rejects the Deployment at creation time.
- Zombie Pods: If a `preStop` hook hangs forever, the Pod sticks in `Terminating` state. Force delete if necessary: `kubectl delete pod <pod-name> --grace-period=0 --force`