Project Lab 05: Resource Governance & Elastic Scaling
In a shared cluster, "noisy neighbors" can starve other applications of CPU and memory. Without proper health probes, a failing application may still receive traffic, leading to 500 errors. This lab focuses on building a self-healing, self-scaling, and governed environment.
Reference Material:
- docs/04-resource-mgmt-probes/1-resource-limits-quotas.md
- docs/04-resource-mgmt-probes/2-hpa-vpa.md
- docs/04-resource-mgmt-probes/4-health-probes.md
1. OBJECTIVE: THE STABLE COMMERCE PLATFORM
The goal is to configure the e-commerce namespace so that:
- No team can deploy a Pod without resource limits.
- The entire namespace is capped to prevent cloud-bill spikes.
- The application scales horizontally when CPU exceeds 50%.
- The application survives a "heavy" startup phase (e.g., cache loading).
2. PHASE 1: NAMESPACE GOVERNANCE
Before deploying the app, we must set the "Rules of Engagement" for the namespace.
2.1 Create Namespace and ResourceQuota
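The `e-commerce` namespace itself must exist before the quota can be applied. A minimal manifest for it (the filename `00-namespace.yaml` is an assumption, not part of the lab's reference files):

```yaml
# 00-namespace.yaml (hypothetical filename)
apiVersion: v1
kind: Namespace
metadata:
  name: e-commerce
```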
The Quota acts as a hard ceiling for the aggregate resources in the namespace.
```yaml
# 01-quota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: checkout-dept-quota
  namespace: e-commerce
spec:
  hard:
    requests.cpu: "2"
    requests.memory: "2Gi"
    limits.cpu: "4"
    limits.memory: "4Gi"
    pods: "10" # Max 10 pods allowed in this namespace
```
2.2 Create LimitRange
The LimitRange ensures that every container has a default size if the developer forgets to specify one.
```yaml
# 02-limitrange.yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: checkout-defaults
  namespace: e-commerce
spec:
  limits:
  - default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 250m
      memory: 256Mi
    type: Container
```
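To see the LimitRange in action, submit a Pod spec that omits `resources` entirely; the admission controller injects the defaults above. The pod below is purely illustrative (its name and image are assumptions):

```yaml
# defaults-check.yaml (illustrative) — no resources specified.
# After admission, the container should carry the LimitRange defaults:
# requests 250m/256Mi, limits 500m/512Mi.
apiVersion: v1
kind: Pod
metadata:
  name: defaults-check
  namespace: e-commerce
spec:
  containers:
  - name: app
    image: busybox:1.28
    command: ["sleep", "3600"]
```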
3. PHASE 2: DEPLOYING THE RESILIENT WORKLOAD
We will deploy the checkout-api. It is designed to be slow to start (it takes ~30s to initialize) and CPU-intensive under load.
3.1 The Deployment Manifest (checkout-deploy.yaml)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-api
  namespace: e-commerce
spec:
  replicas: 2
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      containers:
      - name: api
        image: k8s.gcr.io/hpa-example # A lightweight image that allows CPU stress testing
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: "200m" # Baseline for HPA calculation
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
        # 1. STARTUP PROBE: Handles the 30s initialization
        startupProbe:
          httpGet:
            path: /
            port: 80
          failureThreshold: 30
          periodSeconds: 1 # Allows up to 30s of startup time
        # 2. READINESS PROBE: Ensures traffic only hits healthy pods
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
        # 3. LIVENESS PROBE: Restarts the container if the process deadlocks
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 20
```
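The Phase 4 load test targets `http://checkout-api`, which presupposes a ClusterIP Service of that name in front of the Deployment. A minimal sketch (this manifest is an assumption; it does not appear in the lab's reference files):

```yaml
# checkout-svc.yaml (assumed; needed so the load generator can
# resolve http://checkout-api in Phase 4)
apiVersion: v1
kind: Service
metadata:
  name: checkout-api
  namespace: e-commerce
spec:
  selector:
    app: checkout
  ports:
  - port: 80
    targetPort: 80
```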
4. PHASE 3: ELASTIC SCALING (HPA)
We want the checkout service to scale out to handle surges in traffic.
4.1 Define the HPA
```shell
kubectl autoscale deployment checkout-api \
  --cpu-percent=50 \
  --min=2 \
  --max=8 \
  -n e-commerce
```
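The same autoscaler can be declared as an `autoscaling/v2` manifest, which is easier to version-control; this is a sketch of the declarative equivalent (the filename is an assumption):

```yaml
# checkout-hpa.yaml (assumed filename) — declarative equivalent of
# the kubectl autoscale command above
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout-api
  namespace: e-commerce
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-api
  minReplicas: 2
  maxReplicas: 8
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```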
The Math: 50% of the 200m request is 100m. If the average CPU usage per pod exceeds 100m, the HPA triggers a scale-out.
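The scale-out target can be worked through numerically. A minimal sketch of the HPA formula, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), using a hypothetical 250% utilization reading:

```shell
# Hypothetical worked example of the HPA scaling formula:
#   desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
current_replicas=2
current_util=250   # observed average CPU, as a percent of the 200m request
target_util=50     # the --cpu-percent target
max_replicas=8     # the --max bound

# Integer ceiling division: (a + b - 1) / b
desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))
# The result is clamped to --max
[ "$desired" -gt "$max_replicas" ] && desired=$max_replicas
echo "desired=$desired"   # ceil(2 * 250 / 50) = 10, capped at 8
```

Note that the HPA also clamps to `--min` on scale-in and applies stabilization windows, so real behavior is smoother than this single-step arithmetic.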
5. PHASE 4: THE LOAD TEST (VALIDATION)
5.1 Monitor the Scaling
Open two terminals.
```shell
# Terminal 1: Watch HPA
kubectl get hpa checkout-api -n e-commerce -w

# Terminal 2: Watch Pods
kubectl get pods -n e-commerce -w
```
5.2 Trigger the Surge
Run a "Generator" pod to bombard the service with requests.
```shell
kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -n e-commerce \
  -- /bin/sh -c "while true; do wget -q -O- http://checkout-api; done"
```
Expected Observation:
- Terminal 1 will show CPU utilization climbing (e.g., `250% / 50%`).
- Terminal 2 will show new `checkout-api-xxxx` pods moving from `Pending` to `Running`.
- New pods will stay at `0/1 Running` for up to 30 seconds (Startup Probe) before moving to `1/1 Running` (Readiness Probe).
6. TROUBLESHOOTING & NINJA COMMANDS
6.1 Audit Quota Usage
If the HPA fails to create pods, check if you hit the Namespace Quota.
```shell
kubectl describe quota checkout-dept-quota -n e-commerce
```
Observation: If Used equals Hard for pods (10/10), the quota blocks the HPA from scaling any further.
6.2 Check Component Resource Usage
```shell
# Verify the Metrics Server is working
kubectl top pods -n e-commerce
```
6.3 Identify Probe Failures
If a pod keeps restarting, find out which probe failed:
```shell
kubectl describe pod <pod-name> -n e-commerce | grep -i "probe failed"
```
7. ARCHITECT'S KEY TAKEAWAYS
- Requests are for Scheduling: The HPA uses the `request` value as the denominator for its percentage calculation.
- Startup Probes save Liveness: Without a Startup probe, a slow-starting app might be killed by the Liveness probe before it ever finishes booting.
- Namespace Isolation: Quotas are the only way to prevent one team's auto-scaling from consuming the entire cluster's budget.
- Limits prevent Crashes: Memory limits are hard (exceeding one triggers an OOMKill); CPU limits are soft (exceeding one causes throttling). Always set both for production stability.