L18. Audit Logging and Incident Response in Kubernetes

When an incident happens in Kubernetes, audit logs are your forensic trail. Learn how to configure comprehensive audit logging, investigate security incidents, and build a Kubernetes IR playbook.

Kubernetes Audit Logs

Audit logs record requests to the API server in chronological order, at the level of detail your audit policy sets. They answer the critical forensic questions: who did what, when, and from where.

Each audit event includes:

User: The authenticated identity (user, service account, or anonymous)
Verb: The operation (get, create, delete, patch)
Resource: What was affected (pod, secret, role)
Namespace: Where it happened
Source IP: Where the request came from
Response code: Whether it succeeded
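As a minimal sketch of how these fields can be read out of a single event, the Python below parses one hand-written sample. The field names (`user.username`, `verb`, `objectRef`, `sourceIPs`, `responseStatus.code`, `requestReceivedTimestamp`) follow the `audit.k8s.io/v1` Event schema; the sample values and the `summarize` helper are illustrative assumptions, not output from a real cluster.

```python
import json

# Hand-written sample event (assumption); real events carry more fields.
SAMPLE_EVENT = """
{"kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "Metadata",
 "stage": "ResponseComplete", "verb": "get",
 "user": {"username": "system:serviceaccount:production:web-app"},
 "sourceIPs": ["10.0.12.7"],
 "objectRef": {"resource": "secrets", "namespace": "production",
               "name": "db-credentials"},
 "responseStatus": {"code": 200},
 "requestReceivedTimestamp": "2024-01-01T12:00:00.000000Z"}
"""


def summarize(event: dict) -> dict:
    """Pull the who/what/when/where fields out of one audit event."""
    ref = event.get("objectRef") or {}
    return {
        "user": event.get("user", {}).get("username", "unknown"),
        "verb": event.get("verb"),
        "resource": ref.get("resource"),
        "namespace": ref.get("namespace"),
        "source_ips": event.get("sourceIPs", []),
        "code": event.get("responseStatus", {}).get("code"),
        "when": event.get("requestReceivedTimestamp"),
    }


summary = summarize(json.loads(SAMPLE_EVENT))
print(summary)
```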

Configuring Audit Policies

A good audit policy balances visibility with log volume:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log all secret operations with full request body
  - level: Request
    resources:
      - group: ""
        resources: ["secrets"]
  # Log RBAC changes with full details
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["clusterroles", "clusterrolebindings", "roles", "rolebindings"]
  # Log pod exec and port-forward
  - level: Request
    resources:
      - group: ""
        resources: ["pods/exec", "pods/portforward", "pods/attach"]
  # Log node and namespace operations
  - level: Metadata
    resources:
      - group: ""
        resources: ["nodes", "namespaces"]
  # Log everything else at Metadata level
  - level: Metadata
    omitStages: ["RequestReceived"]
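Audit policy rules are evaluated in order, and the first rule that matches a request decides its level. The Python below is a simplified sketch of that first-match behaviour for the policy in this section. It writes the rules as plain Python data, matches only on API group and resource name, and ignores users, verbs, namespaces, and wildcards; the `audit_level` helper is hypothetical and not part of Kubernetes.

```python
# The rules from this section's policy, written as plain Python data.
POLICY_RULES = [
    {"level": "Request",
     "resources": [{"group": "", "resources": ["secrets"]}]},
    {"level": "RequestResponse",
     "resources": [{"group": "rbac.authorization.k8s.io",
                    "resources": ["clusterroles", "clusterrolebindings",
                                  "roles", "rolebindings"]}]},
    {"level": "Request",
     "resources": [{"group": "",
                    "resources": ["pods/exec", "pods/portforward",
                                  "pods/attach"]}]},
    {"level": "Metadata",
     "resources": [{"group": "", "resources": ["nodes", "namespaces"]}]},
    {"level": "Metadata"},  # catch-all: no resource selector
]


def audit_level(group: str, resource: str, subresource: str = "") -> str:
    """Return the level of the first rule that matches the request."""
    full = f"{resource}/{subresource}" if subresource else resource
    for rule in POLICY_RULES:
        selectors = rule.get("resources")
        if selectors is None:
            return rule["level"]  # a rule without selectors matches everything
        for selector in selectors:
            if selector["group"] == group and full in selector["resources"]:
                return rule["level"]
    return "None"  # no rule matched: the request is not logged


print(audit_level("", "secrets"))
print(audit_level("", "pods", "exec"))
print(audit_level("apps", "deployments"))
```

Because matching stops at the first hit, the specific rules need to sit above the catch-all; moving the final `Metadata` rule to the top would shadow every rule below it.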

Log Shipping

Send audit logs to a centralized SIEM for analysis and alerting:

Managed clusters: EKS sends audit logs to CloudWatch (enable in cluster logging settings). AKS sends to Azure Monitor. GKE sends to Cloud Logging.
Self-managed: Ship logs from the audit log file to Elasticsearch, Splunk, or your SIEM using Fluentd or Vector.

Incident Response Playbook

When a security incident is detected in Kubernetes:

1. Contain

# Isolate the compromised pod with a deny-all network policy
kubectl label pod compromised-pod quarantine=true -n production

# Apply isolation policy
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine
  namespace: production
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes: ["Ingress", "Egress"]
EOF

2. Preserve evidence

# Capture pod details before deletion
kubectl get pod compromised-pod -n production -o yaml > pod-evidence.yaml
kubectl logs compromised-pod -n production --all-containers > pod-logs.txt
kubectl describe pod compromised-pod -n production > pod-describe.txt

# Copy files out of the container filesystem (here /tmp; kubectl cp needs tar in the image)
kubectl cp production/compromised-pod:/tmp ./evidence/tmp/

3. Investigate

Query audit logs for the compromised identity:

# KQL-style query for Sentinel/Elasticsearch
user.username == "system:serviceaccount:production:web-app"
AND verb in ("create", "patch", "delete")
AND objectRef.resource in ("secrets", "roles", "clusterroles")
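If the audit log is only available as a JSON-lines file rather than in a SIEM, the same filter can be sketched in Python. This assumes one JSON event per line (the API server's log backend format); the `matching_events` helper and the sample lines are illustrative assumptions.

```python
import json
from typing import Iterable, Iterator

WATCHED_USER = "system:serviceaccount:production:web-app"
WATCHED_VERBS = {"create", "patch", "delete"}
WATCHED_RESOURCES = {"secrets", "roles", "clusterroles"}


def matching_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield audit events that match the investigation query above."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupted lines
        if not isinstance(event, dict):
            continue
        ref = event.get("objectRef") or {}
        if (event.get("user", {}).get("username") == WATCHED_USER
                and event.get("verb") in WATCHED_VERBS
                and ref.get("resource") in WATCHED_RESOURCES):
            yield event


# Hand-written sample lines (assumption), including one non-match and one broken line.
sample_lines = [
    json.dumps({"verb": "delete", "user": {"username": WATCHED_USER},
                "objectRef": {"resource": "secrets", "namespace": "production"}}),
    json.dumps({"verb": "get", "user": {"username": WATCHED_USER},
                "objectRef": {"resource": "secrets", "namespace": "production"}}),
    '{"verb": "create", "user": ',
]

hits = list(matching_events(sample_lines))
for event in hits:
    print(event["verb"], event["objectRef"]["resource"])
```

For a real investigation, pass an open file handle instead of `sample_lines`; the file's location depends on the `--audit-log-path` setting of your API server.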

4. Eradicate

Rotate all secrets the compromised pod had access to
Revoke the service account's RBAC permissions
Delete and recreate the compromised pod from a known-good image
Scan the image for vulnerabilities and verify its signature
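Before revoking permissions, it helps to know which bindings name the compromised service account. The sketch below assumes you have exported bindings with `kubectl get rolebindings,clusterrolebindings --all-namespaces -o json` and loaded the `items` list; the helper name and the sample bindings (including `secret-reader`) are hypothetical. It only checks direct `ServiceAccount` subjects, so permissions granted through groups such as `system:serviceaccounts` would need a separate check.

```python
def bindings_for_service_account(bindings, namespace, name):
    """Return (kind, binding namespace, binding name, role name) per matching binding."""
    matches = []
    for binding in bindings:
        for subject in binding.get("subjects") or []:
            if (subject.get("kind") == "ServiceAccount"
                    and subject.get("name") == name
                    and subject.get("namespace") == namespace):
                meta = binding.get("metadata", {})
                matches.append((
                    binding.get("kind"),
                    meta.get("namespace", ""),  # empty for ClusterRoleBindings
                    meta.get("name"),
                    binding.get("roleRef", {}).get("name"),
                ))
                break  # one match per binding is enough
    return matches


# Hand-written sample bindings (assumption), shaped like the kubectl JSON output.
sample_bindings = [
    {"kind": "RoleBinding",
     "metadata": {"name": "web-app-secrets", "namespace": "production"},
     "roleRef": {"kind": "Role", "name": "secret-reader"},
     "subjects": [{"kind": "ServiceAccount", "name": "web-app",
                   "namespace": "production"}]},
    {"kind": "ClusterRoleBinding",
     "metadata": {"name": "metrics-view"},
     "roleRef": {"kind": "ClusterRole", "name": "view"},
     "subjects": [{"kind": "ServiceAccount", "name": "metrics",
                   "namespace": "monitoring"}]},
]

found = bindings_for_service_account(sample_bindings, "production", "web-app")
for row in found:
    print(row)
```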

5. Recover and improve

Tighten RBAC to remove over-permissions found during investigation
Add Network Policies to prevent the lateral movement path
Add Falco rules to detect the attack pattern
Update Pod Security Standards if the attack exploited weak security contexts

Key Detection Queries

What to Detect	Audit Log Signal
Secret enumeration	Multiple `get secrets` across namespaces
RBAC escalation	`create` or `patch` on clusterrolebindings
Pod exec abuse	`create pods/exec` on production pods
Token theft	`get secrets` for service account tokens
Namespace escape	`create pods` in kube-system by non-admin

Exam Focus Points

✓ Audit logs record every API server request: who (user/SA), what (verb/resource), when, and from where (source IP)
✓ Log secret operations at Request level and RBAC changes at RequestResponse level for forensic detail
✓ Contain compromised pods immediately with a deny-all NetworkPolicy using a quarantine label
✓ Preserve evidence (pod YAML, logs, filesystem snapshot) before deleting compromised pods
✓ Key detection signals: secret enumeration across namespaces, RBAC escalation, pod exec in production, and pods created in kube-system by non-admins

Knowledge Check

1. What is the first step when responding to a compromised pod in Kubernetes?

2. Which audit log signal indicates possible RBAC privilege escalation?

3. Why should you preserve pod evidence before deletion during incident response?
