Container Security in Azure: AKS + Defender for Containers Complete Guide
Most AKS clusters deployed between 2020 and 2022 have no Pod Security Admission, overly permissive RBAC, and Defender for Containers disabled. That combination is not a theoretical risk: a single privileged pod or unscanned image with a critical CVE is all it takes for a container escape to become a full cluster compromise. This guide covers the full security stack for production AKS workloads.

The Attack That Happened Because Nobody Checked the Pod Spec
In early 2025, a penetration test against a mid-size financial services firm found a CI/CD pipeline that deployed workloads to AKS with securityContext.privileged: true and hostPID: true set on the build agent pod. The cluster was running Kubernetes 1.26. PodSecurityPolicy had been removed in 1.25, and no replacement policy had been put in place. Defender for Containers was not enabled. The registry had no image signing.
Within 15 minutes of landing on the build agent pod, the tester had mounted the host filesystem via /proc/1/root, read the node's Azure Instance Metadata Service (IMDS) token at http://169.254.169.254/metadata/identity/oauth2/token, and used that managed identity to enumerate the Azure subscription. The managed identity had Contributor on the resource group. Game over.
This is not an edge case. The combination of no Pod Security Admission, permissive pod specs, and no runtime detection is the default state for AKS clusters set up in the 2020-2022 window and never revisited. Most of the security effort went into network-level controls: private cluster, NSGs, Azure Firewall. None of those controls stop a privileged container from escaping to the host and abusing the node's identity.
This guide covers the full stack of controls that actually prevent and detect that scenario.
Defender for Containers: What It Covers and What It Does Not
Defender for Containers is the Microsoft Defender plan that targets AKS and Arc-enabled Kubernetes clusters. It is not a firewall or a pod policy engine. Understanding what it actually does prevents the common mistake of treating it as a complete solution.
What Is Included
Image vulnerability assessment: Defender scans images in Azure Container Registry (ACR) and produces a per-image CVE list correlated against the OS package manifest and language runtime packages. Vulnerability assessment is powered by Microsoft Defender Vulnerability Management (MDVM), which replaced the earlier Qualys-based scanner, and covers both OS-level and application-level vulnerabilities. Scans trigger on push and run weekly for images already in the registry.
Admission-time scanning: With the Defender profile deployed as a DaemonSet on the AKS node pool, images are checked at admission time against the vulnerability database. Pods referencing images with critical CVEs can be blocked via a deny policy. This is separate from registry scanning and covers images pulled from non-ACR registries.
Kubernetes audit log analysis: Defender ingests the AKS audit log stream and applies detection rules for anomalous API calls: creation of privileged pods, modification of cluster-admin bindings, use of exec into running containers, creation of pods in the kube-system namespace by non-system accounts.
Node-level threat detection: The Defender sensor running as a DaemonSet monitors process trees, network connections, and filesystem events at the node level. It detects crypto miners, reverse shells, and container escape techniques such as mounting /proc or accessing the Docker socket.
Kubernetes control plane hardening assessment: Defender for Cloud surfaces CIS Kubernetes Benchmark recommendations, AKS-specific misconfigurations (anonymous authentication on the API server, overly permissive RBAC bindings), and network policy gaps.
What It Does NOT Cover
Defender for Containers does not enforce pod security policy. It alerts on privileged pods but does not block them unless you separately configure an Azure Policy deny effect. It does not replace Network Policy: you can have Defender fully deployed with no network segmentation between pods and it will not prevent lateral movement via the pod network. It does not sign images or enforce image provenance checks. That requires Notation with Azure Key Vault or a third-party admission webhook like Kyverno with Cosign.
The plan costs $7 per vCore per month (as of January 2026). For a 10-node cluster with 4 vCores per node, that is $280/month. Not trivial, but significantly less than the average cost of a container compromise incident.
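The arithmetic behind that estimate is easy to rerun for your own node pools. The snippet below is a minimal sketch: the function name defender_monthly_cost is hypothetical, it handles whole-dollar prices only, and the default of $7 per vCore is simply the figure quoted above, so check current pricing before relying on it.

```shell
# Estimate the monthly Defender for Containers charge for one cluster.
# Arguments: node count, vCores per node, price per vCore (defaults to the
# $7 figure quoted in this guide; an assumption, not an authoritative price).
defender_monthly_cost() {
  nodes=$1
  vcores_per_node=$2
  price_per_vcore=${3:-7}
  echo $(( nodes * vcores_per_node * price_per_vcore ))
}

# The 10-node, 4-vCore example from the text
defender_monthly_cost 10 4   # prints 280
```

Summing the result across clusters gives a rough subscription-level figure to weigh against incident cost.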
See the [CSPM comparison guide](/blog/best-cspm-tools-2026-defender-for-cloud-vs-wiz-vs-orca-vs-prisma-cloud) for how Defender for Containers stacks up against Wiz and Orca on container security coverage.
Enabling Defender for Containers via Azure CLI
# Enable Defender for Containers on the subscription
az security pricing create \
--name Containers \
--tier Standard
# Verify the Defender profile DaemonSet is running on the cluster
kubectl get daemonset microsoft-defender-collector-ds \
-n kube-system
# Check the sensor is reporting to Defender
kubectl logs -n kube-system \
-l app=microsoft-defender-collector \
--tail=20
The Defender sensor deploys automatically when the plan is enabled and the cluster has the --enable-defender flag set, or when the Azure Policy initiative "Enable Microsoft Defender for Cloud on your subscription" is assigned.
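A DaemonSet that exists is not the same as a sensor running on every node. One way to check coverage, sketched here under assumptions, is to compare the DaemonSet's desired and ready pod counts. The helper name daemonset_ready is hypothetical; the jsonpath fields in the commented kubectl call are standard DaemonSet status fields, and that call needs access to a live cluster.

```shell
# Succeeds only when every scheduled sensor pod is ready.
# Input: "<desiredNumberScheduled> <numberReady>" as one string.
daemonset_ready() {
  desired=${1%% *}   # text before the first space
  ready=${1##* }     # text after the last space
  [ -n "$desired" ] && [ "$desired" -gt 0 ] && [ "$desired" -eq "$ready" ]
}

# Against a live cluster (requires kubectl and cluster access):
#   counts=$(kubectl get daemonset microsoft-defender-collector-ds -n kube-system \
#     -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}')
#   daemonset_ready "$counts" || echo "sensor coverage gap"

# Offline demonstration with sample counts
daemonset_ready "10 10" && echo "sensor on every node"
daemonset_ready "10 8"  || echo "sensor coverage gap"
```

A mismatch usually points at a node pool with taints the DaemonSet does not tolerate, or at nodes that are still starting.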
Image Scanning and Supply Chain Security
Registry Scanning vs. Admission-Time Scanning
These two mechanisms cover different attack vectors and are not interchangeable.
| Mechanism | When It Runs | What It Catches | What It Misses |
|---|---|---|---|
| ACR registry scanning | On push + weekly | CVEs in images stored in ACR | Images from Docker Hub, GHCR, non-ACR registries |
| Defender admission scanning | At pod creation | CVEs in any image at deployment time | Images that are not yet deployed |
| OPA/Kyverno policy | At pod creation | Policy violations (e.g., no digest pinning) | Vulnerability content inside image |
| Notation signing check | At pod creation | Unsigned or tampered images | Signed images with vulnerabilities |
For a production cluster, you need all four layers. Registry scanning catches drift between your last scan and today's CVE database. Admission scanning catches images from external registries. Policy enforcement catches configuration mistakes like using latest tags or unpinned digests. Signing verification catches supply chain substitution.
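The policy layer is the easiest of the four to prototype. As a rough sketch of the kind of check an admission policy performs, the helpers below classify an image reference by registry and by whether it is pinned to a digest. The function names and the registry name myregistry are hypothetical, and the digest is a placeholder; a real policy engine such as Kyverno would express the same rule declaratively.

```shell
# True when the reference ends in @sha256: followed by exactly 64 hex characters.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[a-f0-9]{64}$'
}

# True when the registry host (text before the first slash) is an ACR login server.
is_acr_image() {
  registry_host=${1%%/*}
  case "$registry_host" in
    *.azurecr.io) return 0 ;;
    *) return 1 ;;
  esac
}

# Print one verdict per image reference.
check_image() {
  if ! is_acr_image "$1"; then
    echo "external-registry"
  elif ! is_digest_pinned "$1"; then
    echo "mutable-tag"
  else
    echo "pinned"
  fi
}

# Placeholder digest for illustration only, not a real image
digest=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
check_image "myregistry.azurecr.io/myapp:v1.2.3@sha256:${digest}"
check_image "myregistry.azurecr.io/myapp:latest"
check_image "docker.io/library/nginx:1.27"
```

Running a check like this in CI gives developers the same feedback before the manifest ever reaches the cluster.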
Enforcing Image Scanning Results with Azure Policy
# Assign the built-in policy to block containers with critical CVEs
az policy assignment create \
--name "block-critical-cve-images" \
--display-name "Block AKS pods with critical vulnerabilities" \
--policy "/providers/Microsoft.Authorization/policyDefinitions/13cd7ae3-5bc0-4ac4-a62d-4f7c120b9759" \
--scope "/subscriptions/<subscription-id>/resourceGroups/<rg-name>" \
--enforcement-mode Default \
--params '{"effect": {"value": "Deny"}}'
# Check compliance state for the cluster
az policy state list \
--resource "/subscriptions/<subscription-id>/resourceGroups/<rg-name>/providers/Microsoft.ContainerService/managedClusters/<cluster-name>" \
--query "[?complianceState=='NonCompliant'].{policy:policyDefinitionName,resource:resourceId}" \
--output table
Notation Image Signing with Azure Key Vault
Notation (the CNCF image signing standard, now at v1.1.0) integrates with Azure Key Vault for key storage and ACR for signature storage. The workflow: the CI pipeline signs the image digest after build, and an admission webhook (Ratify, maintained by Azure) verifies the signature at pod creation time.
# Install Notation CLI
curl -Lo notation.tar.gz https://github.com/notaryproject/notation/releases/download/v1.1.0/notation_1.1.0_linux_amd64.tar.gz
tar -xzf notation.tar.gz
sudo mv notation /usr/local/bin/
# Add the Azure Key Vault plugin (requires Notation v1.1.0+; --sha256sum is mandatory with --url)
notation plugin install \
--url https://github.com/Azure/notation-azure-kv/releases/download/v1.2.0/notation-azure-kv_1.2.0_linux_amd64.tar.gz \
--sha256sum <checksum-from-release-page>
# Sign an image using the AKV-backed signing key
# (add --plugin-config self_signed=true if the certificate is self-signed)
notation sign \
--id "https://<keyvault-name>.vault.azure.net/keys/<key-name>/<version>" \
--plugin azure-kv \
<acr-name>.azurecr.io/<image-name>@sha256:<digest>
# Verify signature (requires a trust policy and the signing certificate in a Notation trust store)
notation verify \
<acr-name>.azurecr.io/<image-name>@sha256:<digest>
Deploy Ratify as an admission webhook to enforce signature verification at the Kubernetes layer. All unsigned images are rejected at admission, regardless of where they originate.
Pod Security Admission: Replacing Deprecated PSP
PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25. AKS clusters running 1.25+ have no pod security enforcement unless you explicitly configure Pod Security Admission (PSA) or a third-party admission controller like Kyverno or OPA Gatekeeper.
The Three PSA Modes
Pod Security Admission defines three profiles (privileged, baseline, and restricted), each of which can be applied to a namespace in three modes:
- enforce: Policy violations reject the pod at admission. Use this in production namespaces.
- audit: Violations are logged to the Kubernetes audit log but the pod is allowed. Use this during migration.
- warn: Violations produce a warning in the API response (visible in kubectl output) but the pod is allowed. Use this for developer feedback.
The two relevant security profiles:
baseline: Prevents known privilege escalations. Blocks: privileged containers, host namespaces (hostPID, hostIPC, hostNetwork), hostPath volumes, and adding Linux capabilities beyond a default allowlist (NET_RAW and SYS_ADMIN, for example, cannot be added).
restricted: Enforces the full hardened posture. Requires: runAsNonRoot: true, seccompProfile: RuntimeDefault or Localhost, dropping all capabilities (drop: ["ALL"]), and disallowing privilege escalation (allowPrivilegeEscalation: false). It does not require a read-only root filesystem; set readOnlyRootFilesystem: true in the pod spec yourself, or enforce it with a policy engine such as Kyverno or OPA Gatekeeper.
Practical Namespace Labeling Strategy
Do not apply restricted to every namespace by default. Most legacy workloads will break. Use a tiered approach:
# Production application namespaces: enforce restricted
kubectl label namespace production \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/enforce-version=v1.29 \
pod-security.kubernetes.io/audit=restricted \
pod-security.kubernetes.io/audit-version=v1.29 \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/warn-version=v1.29
# Staging/migration namespaces: audit restricted, enforce baseline
kubectl label namespace staging \
pod-security.kubernetes.io/enforce=baseline \
pod-security.kubernetes.io/enforce-version=v1.29 \
pod-security.kubernetes.io/audit=restricted \
pod-security.kubernetes.io/audit-version=v1.29
# Infrastructure namespaces (monitoring, ingress): baseline enforce only
kubectl label namespace monitoring \
pod-security.kubernetes.io/enforce=baseline \
pod-security.kubernetes.io/enforce-version=v1.29
# Dry-run to preview what violations exist before enforcing restricted
kubectl label --dry-run=server --overwrite \
namespace <namespace> \
pod-security.kubernetes.io/enforce=restricted
The enforce-version label pins the policy to the profile definition from a specific Kubernetes version, preventing policy drift when the cluster is upgraded. Always pin to a specific version.
What Baseline Blocks That Matters Most
Baseline blocks the IMDS token theft scenario from the opening: hostPID: true is not permitted under baseline. A pod spec with hostPID or hostNetwork will be rejected at admission in an enforce-baseline namespace. It also blocks securityContext.privileged: true, which is the other primary container escape vector.
Restricted additionally requires seccompProfile: RuntimeDefault, which constrains the syscall surface available to the container. Combined with capability dropping, this significantly raises the cost of exploiting a container-level vulnerability to achieve host compromise.
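Before switching a namespace to enforce, it helps to know which manifests in your repository would be rejected. The following is a deliberately crude sketch that text-scans a manifest for the four settings discussed above. The function name baseline_red_flags and the sample pod are hypothetical, and a plain grep is no substitute for the server-side dry-run or a real policy engine; it only catches the literal key: true form.

```shell
# Crude text scan of a pod manifest for settings the baseline profile rejects.
# Reads YAML on stdin, prints each offending line with its line number,
# and returns 1 if any were found.
baseline_red_flags() {
  hits=$(grep -En '^[[:space:]-]*(privileged|hostPID|hostNetwork|hostIPC):[[:space:]]*true[[:space:]]*$' || true)
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    return 1
  fi
  return 0
}

# Sample manifest resembling the build agent pod from the opening scenario
baseline_red_flags <<'EOF' || echo "manifest would be rejected under enforce=baseline"
apiVersion: v1
kind: Pod
metadata:
  name: build-agent
spec:
  hostPID: true
  containers:
  - name: agent
    image: example.azurecr.io/build-agent:1.0
    securityContext:
      privileged: true
EOF
```

The same function can be pointed at rendered Helm output or at kubectl get pod -o yaml for workloads already running.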
See the [Kubernetes security best practices guide](/blog/kubernetes-security-best-practices-2026) for the full set of pod spec hardening recommendations beyond what PSA enforces.
AKS RBAC and Workload Identity
Why AAD Pod Identity Is a Lateral Movement Risk
AAD Pod Identity (the v1 solution, also called aad-pod-identity) worked by running a privileged DaemonSet that intercepted IMDS calls from pods and returned tokens scoped to an assigned managed identity. The interception mechanism used a hostNetwork: true pod with iptables rules.
Two problems: first, any pod that could manipulate iptables on the node could intercept IMDS calls from other pods. Second, the pod-to-identity binding used Kubernetes labels, which any user with pod edit permissions could apply to their own pod to assume a more privileged identity.
AKS Workload Identity (GA since AKS 1.27) eliminates both problems by using the OIDC issuer built into AKS and Kubernetes service account token projection. No privileged DaemonSet, no iptables manipulation, no label-based identity assignment.
Migrating from AAD Pod Identity to AKS Workload Identity
# Enable OIDC issuer and Workload Identity on an existing cluster
az aks update \
--name <cluster-name> \
--resource-group <rg-name> \
--enable-oidc-issuer \
--enable-workload-identity
# Get the OIDC issuer URL
OIDC_ISSUER=$(az aks show \
--name <cluster-name> \
--resource-group <rg-name> \
--query "oidcIssuerProfile.issuerUrl" \
--output tsv)
# Create a user-assigned managed identity for the workload
az identity create \
--name "workload-identity-myapp" \
--resource-group <rg-name>
CLIENT_ID=$(az identity show \
--name "workload-identity-myapp" \
--resource-group <rg-name> \
--query "clientId" \
--output tsv)
# Create federated credential linking the managed identity to a Kubernetes service account
az identity federated-credential create \
--name "myapp-federated-cred" \
--identity-name "workload-identity-myapp" \
--resource-group <rg-name> \
--issuer "$OIDC_ISSUER" \
--subject "system:serviceaccount:production:myapp-sa" \
--audience "api://AzureADTokenExchange"
Then annotate the Kubernetes service account and configure the pod spec:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp-sa
  namespace: production
  annotations:
    azure.workload.identity/client-id: "<client-id-from-above>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
spec:
  # apps/v1 Deployments require a selector that matches the pod template labels
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: myapp-sa
      containers:
        - name: myapp
          image: <acr-name>.azurecr.io/myapp:v1.2.3@sha256:<digest>
          securityContext:
            runAsNonRoot: true
            runAsUser: 1000
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
            seccompProfile:
              type: RuntimeDefault
The pod receives a projected service account token volume automatically. The Azure SDK reads the token from the well-known path and exchanges it with Entra ID for an access token scoped to the managed identity. No IMDS call leaves the pod. No privileged DaemonSet intercepts the request.
The subject format system:serviceaccount:<namespace>:<sa-name> is exact and namespace-scoped. A pod in a different namespace cannot claim the same identity. This is the fundamental security improvement over label-based pod identity.
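Because the match is an exact string comparison, it is worth generating the subject rather than typing it by hand. A small sketch follows; the helper name federated_subject is hypothetical, and the namespace and service account names are the examples used in this guide.

```shell
# Build the federated credential subject for a Kubernetes service account.
federated_subject() {
  namespace=$1
  service_account=$2
  printf 'system:serviceaccount:%s:%s\n' "$namespace" "$service_account"
}

prod=$(federated_subject production myapp-sa)
staging=$(federated_subject staging myapp-sa)
echo "$prod"
echo "$staging"

# Same service account name, different namespace: the subjects differ,
# so a token issued in staging does not match the production credential.
[ "$prod" != "$staging" ] && echo "subjects differ across namespaces"
```

The output of the helper can be passed straight to the --subject argument of az identity federated-credential create.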
See the [federated credentials guide](/blog/flexible-federated-identity-credentials-entra-github-terraform) for Bicep templates that provision the full federated credential chain, and the [non-human identity guide](/blog/non-human-identities-nhi-security-guide) for governance of workload identities at scale.
Provisioning Workload Identity with Bicep
param appName string
param location string
param oidcIssuerUrl string
param aksNamespace string
param serviceAccountName string
resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' existing = {
name: 'kv-${appName}'
}
// Managed identity for the workload
resource workloadIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
name: 'workload-identity-${appName}'
location: location
}
// Federated credential linking to Kubernetes service account
resource federatedCredential 'Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials@2023-01-31' = {
parent: workloadIdentity
name: '${appName}-federated-cred'
properties: {
issuer: oidcIssuerUrl
subject: 'system:serviceaccount:${aksNamespace}:${serviceAccountName}'
audiences: ['api://AzureADTokenExchange']
}
}
// Scoped role assignment: least-privilege to Key Vault secrets only
resource kvSecretUserRole 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
name: guid(workloadIdentity.id, keyVault.id, 'Key Vault Secrets User')
scope: keyVault
properties: {
roleDefinitionId: subscriptionResourceId(
'Microsoft.Authorization/roleDefinitions',
'4633458b-17de-408a-b874-0445c86b69e6' // Key Vault Secrets User
)
principalId: workloadIdentity.properties.principalId
principalType: 'ServicePrincipal'
}
}
output clientId string = workloadIdentity.properties.clientId
output principalId string = workloadIdentity.properties.principalId
Network Policy: Default Deny and Egress Restrictions
Calico vs Azure Network Policy
AKS supports two network policy engines: Azure Network Policy (managed by Microsoft, L4 only, limited to 250 nodes) and Calico (open source, full NetworkPolicy spec support, scales beyond 250 nodes, supports FQDN-based egress filtering in the enterprise version).
| Feature | Azure Network Policy | Calico OSS | Calico Enterprise |
|---|---|---|---|
| Max nodes | 250 | Unlimited | Unlimited |
| Egress DNS filtering | No | No | Yes (FQDN policy) |
| NetworkPolicy spec support | Full L4 | Full L4 | Full L4 + L7 |
| Pod-to-pod encryption | No | WireGuard (v3.14+) | WireGuard |
| Global network policies | No | Yes (CRDs) | Yes |
| Cost | Included | Included | Commercial license |
For clusters over 250 nodes or clusters requiring encrypted pod-to-pod communication, Calico is the correct choice. Specify the network plugin at cluster creation: --network-plugin azure --network-policy calico.
Implementing Default Deny
The Kubernetes default allows all pods to communicate with all other pods. If a pod is compromised, it has full network access to every other pod in the cluster. Default deny reverses this: all traffic is blocked unless explicitly permitted.
# Default deny all ingress and egress for a namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Allow egress to kube-dns only (required for DNS resolution)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    # Without a "to" clause this rule would allow port 53 to any destination
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - port: 53
          protocol: UDP
        - port: 53
          protocol: TCP
---
# Allow ingress from ingress controller to application pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 8080
          protocol: TCP
---
# Allow application egress to Azure SQL private endpoint
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-sql-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.4.5/32 # Private endpoint IP for Azure SQL
      ports:
        - port: 1433
          protocol: TCP
Order matters when rolling these out. Default deny takes effect as soon as it is applied, so pods are unreachable until the matching allow rules exist. Apply the allow rules in the same kubectl apply as default deny, or immediately before it, to avoid that window. With Calico, use GlobalNetworkPolicy CRDs to apply default deny across the entire cluster and let namespace-scoped policies add exceptions.
For a comprehensive [zero trust](/blog/what-is-zero-trust-security-complete-guide) network posture in AKS, default deny at the network layer is the foundation. Every allowed flow should be documented and justified.
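Rolling default deny out to many namespaces by hand invites drift. One possible approach, sketched below, is to generate the manifest from a list of namespaces and pipe it to kubectl. The function name default_deny_manifest is hypothetical, and the namespace names are examples.

```shell
# Emit one default-deny NetworkPolicy per namespace given as an argument.
default_deny_manifest() {
  for ns in "$@"; do
    cat <<EOF
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: ${ns}
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF
  done
}

# Preview the generated YAML for two namespaces
default_deny_manifest production staging

# To apply against a live cluster (requires kubectl and cluster access):
#   default_deny_manifest production staging | kubectl apply -f -
```

Keeping the namespace list in version control alongside the allow rules gives reviewers the documented, justified set of flows in one place.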
Runtime Threat Detection: Defender Alerts and KQL
Key Defender for Containers Alerts
Defender for Containers generates alerts in the SecurityAlert table in Log Analytics. The most actionable alerts for AKS:
| Alert Name | Severity | What It Detects |
|---|---|---|
| `K8S.NODE_CryptominerDetected` | High | Crypto miner process running in container |
| `K8S.NODE_PrivilegedContainerArtifacts` | High | Privileged container accessing host paths |
| `K8S.NODE_ContainerEscape` | Critical | Container escape technique detected at node level |
| `K8S.NODE_ReverseShell` | High | Outbound connection matching reverse shell pattern |
| `K8S.NODE_NewPrivilegedContainer` | Medium | New container started with privileged flag |
| `K8S_AUDIT.ClusterAdminBindingCreated` | High | New ClusterRoleBinding to cluster-admin created |
| `K8S_AUDIT.ExposedServiceAccountToken` | Medium | Service account token accessed from exec session |
| `K8S_AUDIT.AnonymousAccessToAPIServer` | Medium | Unauthenticated request to API server |
KQL Queries for AKS Threat Hunting
// Detect privileged pod creation in the last 7 days
SecurityAlert
| where TimeGenerated > ago(7d)
| where AlertType in (
"K8S.NODE_NewPrivilegedContainer",
"K8S.NODE_PrivilegedContainerArtifacts",
"K8S.NODE_ContainerEscape"
)
| extend Details = parse_json(ExtendedProperties)
| project
TimeGenerated,
AlertType,
AlertSeverity,
CompromisedEntity,
Details.ContainerName,
Details.PodName,
Details.Namespace,
Details.NodeName,
RemediationSteps
| order by TimeGenerated desc
// Detect cluster-admin binding creation (privilege escalation indicator)
AzureDiagnostics
| where TimeGenerated > ago(24h)
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create"
and AuditLog.objectRef.resource == "clusterrolebindings"
| extend
User = AuditLog.user.username,
RoleRef = AuditLog.requestObject.roleRef.name,
Subjects = AuditLog.requestObject.subjects
| where RoleRef == "cluster-admin"
| project TimeGenerated, User, RoleRef, Subjects
// Hunt for container exec sessions (potential lateral movement indicator)
AzureDiagnostics
| where TimeGenerated > ago(24h)
| where Category == "kube-audit"
| extend AuditLog = parse_json(log_s)
| where AuditLog.verb == "create"
and AuditLog.objectRef.subresource == "exec"
| extend
User = AuditLog.user.username,
Namespace = AuditLog.objectRef.namespace,
PodName = AuditLog.objectRef.name,
Command = AuditLog.requestObject.command
| project TimeGenerated, User, Namespace, PodName, Command
| order by TimeGenerated desc
// Images pulled from non-ACR registries (supply chain risk signal)
ContainerImageInventory
| where TimeGenerated > ago(24h)
| where Repository !contains ".azurecr.io"
and Repository !startswith "mcr.microsoft.com"
| project TimeGenerated, Computer, Repository, Image, ImageTag, Running
| order by TimeGenerated desc
Enable AKS audit log collection by sending the kube-audit and kube-audit-admin diagnostic categories to the Log Analytics workspace attached to Defender for Cloud. Without these categories, the audit log KQL queries return no results.
# Enable AKS diagnostic settings for audit logging
az monitor diagnostic-settings create \
--name "aks-audit-logs" \
--resource "/subscriptions/<subscription-id>/resourceGroups/<rg-name>/providers/Microsoft.ContainerService/managedClusters/<cluster-name>" \
--workspace "<log-analytics-workspace-id>" \
--logs '[
{"category": "kube-audit", "enabled": true},
{"category": "kube-audit-admin", "enabled": true},
{"category": "kube-controller-manager", "enabled": true},
{"category": "kube-scheduler", "enabled": true},
{"category": "cluster-autoscaler", "enabled": true}
]'
What Falco Catches That Defender Misses
Defender for Containers provides solid coverage for known-bad patterns but has latency in adding detections for novel techniques. Falco (CNCF graduated, v0.39.0 as of early 2026) runs as a DaemonSet with direct kernel access via eBPF and can be tuned with custom rules.
Specific gaps Falco fills in a Defender deployment:
- Sensitive file reads inside containers: Falco can alert on reads to /etc/shadow, /root/.ssh, or credential files within a running container. Defender detects escape attempts but not reconnaissance inside the container before the escape.
- Unexpected outbound connections by process name: Falco rules can fire when python3 or sh initiates a network connection that an nginx container would never normally make.
- Syscall-level container escape detection: Falco with the eBPF probe detects ptrace calls, namespace manipulation via unshare, and nsenter commands that indicate an attempted container escape before it succeeds.
- Modification of /etc/passwd or /etc/sudoers: Persistence mechanisms that write to these paths inside a container are caught by Falco rules but are not in Defender's default detection set.
The operational cost is rule maintenance. Falco generates significant noise in default configuration. Tune it against your specific workload before enabling alerting in production. Combine Falco alerts with Sentinel via the Falco Sidekick integration to correlate with Defender alerts in a single incident queue.
AKS Hardening Checklist
- [ ] Enable Defender for Containers on all subscriptions running AKS clusters (az security pricing create --name Containers --tier Standard)
- [ ] Deploy Defender sensor DaemonSet to all node pools and verify it is reporting (kubectl get daemonset microsoft-defender-collector-ds -n kube-system)
- [ ] Enable OIDC issuer and Workload Identity on all clusters; remove all AAD Pod Identity components (aad-pod-identity DaemonSet and CRDs)
- [ ] Migrate all workloads from AAD Pod Identity to AKS Workload Identity with namespace-scoped federated credentials
- [ ] Apply Pod Security Admission labels to all namespaces: enforce=restricted for production, enforce=baseline minimum for all others
- [ ] Audit all existing pods for privileged: true, hostPID: true, hostNetwork: true, and hostIPC: true in pod specs; remediate before enforcing PSA
- [ ] Deploy default-deny NetworkPolicy to all application namespaces; document and justify every allow rule
- [ ] Enable Notation image signing in CI/CD pipeline for all images pushed to ACR; deploy Ratify admission webhook to enforce signature verification at pod creation
- [ ] Enable ACR image vulnerability scanning and assign the Azure Policy deny effect for images with critical CVEs
- [ ] Enable AKS diagnostic settings for kube-audit and kube-audit-admin log categories sent to the Defender-linked Log Analytics workspace
- [ ] Create KQL alert rules for: cluster-admin binding creation, privileged pod creation, container exec into running pods, and images pulled from non-ACR registries
- [ ] Disable the Kubernetes dashboard if enabled; audit all ClusterRoleBinding objects bound to cluster-admin and remove any non-system service accounts
- [ ] Pin all pod images to digest (image@sha256:<digest>) rather than mutable tags in production deployments
- [ ] Configure private cluster (API server VNET integration) and restrict authorized IP ranges for all clusters accessible via public endpoint
- [ ] Review Defender for Cloud AKS recommendations weekly; remediate all High severity findings within 14 days per your patch SLA
Frequently Asked Questions
What is Pod Security Admission and why is it replacing Pod Security Policies in AKS?
Pod Security Admission (PSA) is a built-in Kubernetes admission controller that enforces security profiles (privileged, baseline, or restricted) at the namespace level using labels. Pod Security Policies (PSP) were a cluster-scoped resource that required complex RBAC bindings to work correctly and were removed from Kubernetes in version 1.25. PSA is simpler to operate because security requirements are declared on the namespace itself, making it immediately visible what enforcement level applies to each workload. AKS automatically supports PSA from Kubernetes 1.25 onward and Microsoft recommends the restricted profile for all production application namespaces.
How does AKS Workload Identity differ from the legacy AAD Pod Identity approach?
AAD Pod Identity used a DaemonSet that intercepted IMDS requests from pods and mapped them to Azure AD identities via CRDs. This architecture had known security issues including privilege escalation risk through the NMI (Node Managed Identity) component and race conditions in the identity binding. AKS Workload Identity is the replacement: it uses the Kubernetes service account token as an OIDC credential that is exchanged for an Azure AD access token using federated identity credentials. There is no daemon intercepting IMDS traffic, and the identity binding is scoped to a specific Kubernetes service account in a specific namespace with a specific OIDC issuer claim, giving it a much smaller blast radius.
What does Defender for Containers detect that a standard Kubernetes audit log review would miss?
Defender for Containers runs a threat intelligence engine against cluster activity that a manual audit log review would not catch at scale. Specific detections include cryptocurrency mining activity identified by process behavior rather than known hashes, container escape attempts using techniques like namespace manipulation and privileged container creation, exposed Kubernetes dashboards probed from external IPs, and lateral movement via service account token exfiltration. The sensor runs as a DaemonSet with direct node access and can detect syscall-level activity that the Kubernetes API server audit log does not record.
Why should container images be pinned to digest rather than using mutable tags in production?
A mutable image tag such as nginx:latest or app:v2 can be overwritten in the registry, meaning the image a pod pulls today may be different from what it pulled yesterday. An attacker who can push to your registry or who compromises an upstream registry can silently replace a tagged image with a malicious one. Pinning to digest (image@sha256:abc123...) guarantees that the exact byte-for-byte image that was built and scanned in CI is the one that runs in production. Notation image signing combined with the Ratify admission webhook provides a stronger guarantee by verifying a cryptographic signature at pod creation time.
What is the most commonly missed AKS security configuration in enterprise deployments?
Based on Defender for Cloud recommendations and real-world assessments, the most commonly missed configuration is leaving the Kubernetes API server accessible from the public internet without IP range restrictions. Many AKS clusters are deployed with the default public API server endpoint and never have authorized IP ranges configured, which means the API server accepts connections from any IP address. The second most common gap is missing Defender for Containers sensor coverage on clusters deployed before the sensor was available or in subscriptions where the Defender plan was not enabled at deployment time.
Microsoft Cloud Solution Architect
Cloud Solution Architect with deep expertise in Microsoft Azure and a strong background in systems and IT infrastructure. Passionate about cloud technologies, security best practices, and helping organizations modernize their infrastructure.