Project Lab 12: Zero-Trust Networking & DNS Hardening

In Kubernetes, networking is open by default: a compromised frontend pod can freely scan the entire cluster network for databases. This lab shows how to lock down a production environment using Layer 3/4 NetworkPolicies while keeping service discovery performant and secure.

Reference Material:

  • docs/11-networking/1-network-policies.md
  • docs/11-networking/2-dns.md
  • docs/11-networking/3-coredns.md

1. OBJECTIVE: THE FINANCIAL DATA PERIMETER

The goal is to deploy the payments application stack across two namespaces (frontend-zone and secure-zone) with the following constraints:

  1. Default Deny: All pods in both namespaces must block all traffic by default.
  2. Strict Flow: Web can only talk to API. API can only talk to Database.
  3. DNS Resilience: Policies must explicitly allow DNS resolution (Port 53), or the services will fail to find each other.
  4. Discovery Audit: Validate the use of SRV records for specialized service discovery.

2. PHASE 1: NAMESPACE & WORKLOAD SETUP

We will segregate the components into separate namespaces to simulate an enterprise boundary.

kubectl create namespace frontend-zone
kubectl create namespace secure-zone

# 1. Deploy Frontend (in frontend-zone)
kubectl run web-server --image=nginx -n frontend-zone --labels="app=web-server"

# 2. Deploy API and DB (in secure-zone)
kubectl run payment-api --image=nginx -n secure-zone --labels="app=payment-api"
kubectl run secure-db --image=redis -n secure-zone --labels="app=secure-db"
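
Note that kubectl run creates only the pods. The DNS names used later in this lab (secure-db.secure-zone, and the SRV query against payment-api) resolve Services, so we need those too. A minimal sketch; the port names below are assumptions chosen to match the Phase 4 SRV query:

```yaml
# services.yaml
apiVersion: v1
kind: Service
metadata:
  name: payment-api
  namespace: secure-zone
spec:
  selector:
    app: payment-api
  ports:
    - name: http        # a *named* port is what makes the _http._tcp SRV record exist
      protocol: TCP
      port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: secure-db
  namespace: secure-zone
spec:
  selector:
    app: secure-db
  ports:
    - name: redis
      protocol: TCP
      port: 6379
```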

3. PHASE 2: THE "DEFAULT DENY" POSTURE

We will apply a policy to the secure-zone that isolates all pods. After this step, even the API won't be able to talk to the DB.

3.1 Apply Default Deny All (default-deny.yaml)

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: secure-zone
spec:
  podSelector: {}   # Selects all pods in the namespace
  policyTypes:
    - Ingress
    - Egress

Note: We deny Egress as well as Ingress. If an attacker gets a shell in one of your pods, they cannot even ping Google or scan your internal network.
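
Objective 1 calls for default deny in both namespaces, so the same posture should be mirrored in frontend-zone. A sketch, including the egress rules web-server then needs (Objective 2: Web talks only to the API, plus DNS):

```yaml
# frontend-policies.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: frontend-zone
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-network-policy
  namespace: frontend-zone
spec:
  podSelector:
    matchLabels:
      app: web-server
  policyTypes:
    - Egress
  egress:
    # Allow only traffic to the API in secure-zone
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: secure-zone
          podSelector:
            matchLabels:
              app: payment-api
    # Allow DNS resolution
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```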


4. PHASE 3: OPENING THE FIREWALL (THE "ALLOW" LIST)

Now we surgically open the paths required for the application to function.

4.1 Allow API to DB and DNS

The classic trap: If you restrict Egress but forget Port 53, the API will fail to resolve secure-db.secure-zone.svc.cluster.local.

# api-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-network-policy
  namespace: secure-zone
spec:
  podSelector:
    matchLabels:
      app: payment-api
  policyTypes: ["Ingress", "Egress"]
  ingress:
    # Allow from the frontend namespace
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: frontend-zone
  egress:
    # 1. Allow traffic to the DB
    - to:
        - podSelector:
            matchLabels:
              app: secure-db
    # 2. Allow DNS (MANDATORY) -- CoreDNS answers on both UDP and TCP 53
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
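
The API's egress rule alone is not enough: under default-deny, secure-db still rejects the incoming connection until an Ingress rule on the DB side allows it. A minimal sketch (the port number assumes the Redis default):

```yaml
# db-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-network-policy
  namespace: secure-zone
spec:
  podSelector:
    matchLabels:
      app: secure-db
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: payment-api
      ports:
        - protocol: TCP
          port: 6379   # Redis default
```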

5. PHASE 4: DNS DISCOVERY AUDIT

5.1 Validating SRV Records

Kubernetes DNS publishes SRV records for every named port of a Service, letting clients discover the port number as well as the hostname. Many gRPC and database drivers rely on this.

# Start a debug pod
kubectl run -it --rm dns-test --image=tutum/dnsutils -n frontend-zone -- /bin/sh

# Query SRV records for the API service
# Format: _port-name._protocol.service.namespace.svc.cluster.local
dig SRV _http._tcp.payment-api.secure-zone.svc.cluster.local

Expected Observation: The output will contain an ANSWER SECTION showing the target hostname and the specific port (e.g., 80) assigned to that service.

5.2 The ndots Latency Check

Inside your dns-test pod, run:

cat /etc/resolv.conf

Architect's Audit: Look for options ndots:5.

  • Test: Run time nslookup google.com vs time nslookup google.com. (with a trailing dot).
  • Discovery: The trailing-dot query is significantly faster because the name is treated as absolute, so the resolver skips the Kubernetes search-domain expansion.
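
For reference, a typical pod resolv.conf looks like this (the nameserver IP and search list vary by cluster; the values below are illustrative):

```
search frontend-zone.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
```

With ndots:5, any name containing fewer than five dots (google.com has one) is first tried against each search suffix, producing several NXDOMAIN round trips before the absolute query is attempted. The trailing dot marks the name as absolute and skips that expansion entirely.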

6. VERIFICATION & TROUBLESHOOTING

6.1 Test the "Zero-Trust" Boundary

Try to reach the database directly from the frontend-zone (bypassing the API).

kubectl exec -it web-server -n frontend-zone -- curl --connect-timeout 2 secure-db.secure-zone

Expected Result: Connection timeout. Why? The secure-db is protected by the default-deny-all policy in secure-zone, and no specific Ingress rule allows traffic from the frontend-zone namespace directly to the database.

6.2 CoreDNS Log Audit

If DNS resolution fails, check the CoreDNS logs for circular loops or upstream errors.

kubectl logs -n kube-system -l k8s-app=kube-dns

Common Error: [FATAL] plugin/loop: Loop ... detected

  • Cause: This happens if the host's /etc/resolv.conf points back to CoreDNS itself (127.0.0.1 or the cluster DNS ClusterIP), so CoreDNS forwards queries to itself in a circle.
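
The loop usually originates in the forward plugin. A trimmed sketch of a typical Corefile (exact contents vary by distribution); the fix is to make forward target a real upstream resolver, or to point kubelet's --resolv-conf at the host's true resolver file:

```
.:53 {
    errors
    loop                        # abort on detected forwarding loops
    kubernetes cluster.local in-addr.arpa ip6.arpa
    forward . /etc/resolv.conf  # loops if this file points back to CoreDNS
    cache 30
}
```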

7. ARCHITECT'S KEY TAKEAWAYS

  1. Ingress + Egress Default Deny: This is the only way to achieve true isolation. Without blocking Egress, a compromised pod can still communicate with external Command & Control servers.
  2. DNS is the Lifeblood: Every Egress policy must account for Port 53 UDP/TCP to CoreDNS in the kube-system namespace.
  3. Namespace Selectors vs. Pod Selectors: Use namespaceSelector for broad team-level boundaries and podSelector for granular application-level security.
  4. Trailing Dots for Performance: In high-traffic applications making external API calls (e.g., to Stripe or AWS S3), use absolute FQDNs ending in a dot; this skips the ndots:5 search-domain expansion and can cut CoreDNS query volume several-fold.