Infrastructure Drift: How to Detect It and What to Do About It

Infrastructure drift causes outages and security issues. Learn how to detect when your actual infrastructure differs from your code, and how to fix it.

Microsoft Cloud Solution Architect
Tags: Infrastructure as Code, Drift Detection, Terraform, Compliance, DevOps
Video transcript

Your Infrastructure as Code looks perfect in Git, but your actual cloud resources have drifted. How did that happen? And more importantly, how do you catch it before it causes an outage?

Infrastructure drift is the silent killer in DevOps teams. When your live resources stop matching your declared code, you lose compliance visibility, create security gaps, and invite manual configuration errors that nobody can trace. The cost is steep: unplanned downtime, audit failures, and late-night incident calls.

Think of drift like a ship wandering off course. You set your heading in Terraform, but manual clicks in the AWS console nudge the wheel. Soon you're miles off course. Drift detection tools continuously compare your declared state against reality, catching those nudges before they matter.

Automated remediation works like a co-pilot with authority. When drift is detected, your system can automatically roll back to the golden configuration. This eliminates the human lag between discovery and fix. Teams using automated remediation close gaps in minutes, not days.

Compliance scanning during drift detection ties your IaC directly to regulatory requirements. Every divergence from code becomes a compliance event, not just a technical hiccup. This transforms drift detection from a DevOps chore into a continuous compliance checkpoint.

Start today: pick one critical resource group and enable drift detection in your IaC tool. Watch what surfaces. You'll be amazed. Read the complete guide at protego.me.

The Drift Problem

You've got beautiful Terraform code, well-organized modules, everything documented. Then someone makes a "quick fix" in the AWS console, and suddenly your code doesn't match reality.

That's drift. It starts small and grows until you have no idea what's actually running.

Why Drift Happens

The Usual Suspects

  1. Emergency fixes: Production is down, someone fixes it manually
  2. Console convenience: It's faster to click than to write code
  3. Automated processes: Auto-scaling modifies resources
  4. Service integrations: AWS services create resources on your behalf
  5. Lack of access control: Too many people with console access

The Cost of Drift

  • Security gaps: Manually added rules (for example, security group rules) bypass IaC review
  • Outages: A later terraform apply reverts or replaces manual changes that production now depends on
  • Compliance failures: Auditors find undocumented changes
  • Lost time: Engineers debugging why environments differ

Detecting Drift

Terraform Plan

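Run terraform plan -detailed-exitcode regularly. Exit code 0 means the plan is empty, 1 means the plan failed, and 2 means the plan contains changes. If the code has not changed since the last apply, exit code 2 means drift. To see only out-of-band changes, terraform plan -refresh-only compares the state file against the real resources without proposing configuration changes.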
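As a minimal sketch of how a script might act on those exit codes, the Python below wraps the plan command and maps the result to a label. The function names, the labels, and the workdir parameter are illustrative assumptions, not part of Terraform.

```python
import subprocess

# Exit codes of `terraform plan -detailed-exitcode`.
# The labels on the right are our own naming, not Terraform's.
PLAN_EXIT_MEANINGS = {
    0: "clean",  # plan succeeded and is empty
    1: "error",  # plan failed
    2: "drift",  # plan succeeded and contains changes
}


def classify_plan_exit(code: int) -> str:
    """Map a plan exit code to a label; unknown codes count as errors."""
    return PLAN_EXIT_MEANINGS.get(code, "error")


def check_drift(workdir: str) -> str:
    """Run the plan in `workdir` (a hypothetical path) and classify it.

    A non-empty plan only means drift if the code in `workdir` matches
    what was last applied; pending code changes produce exit code 2 too.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

Treating unknown exit codes as errors keeps the check conservative: a failure to plan is never reported as a clean environment.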

Automated Drift Detection

Set up a scheduled pipeline that runs terraform plan every few hours and alerts on drift. For teams on Azure DevOps, our [Azure DevOps Pipelines guide](/blog/azure-devops-pipelines-beginners-guide) covers how to configure scheduled pipeline runs and alert integrations.
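One way to make the alert useful is to list the affected resources. The sketch below assumes the pipeline saves the plan with terraform plan -out=tfplan and converts it with terraform show -json tfplan. It then reads the resource_changes entries from that JSON. The function names and the message format are assumptions, and delivering the message to Slack, Teams, or email is left out.

```python
import json
from typing import List, Optional

# Actions that do not represent a change to a managed resource.
IGNORED_ACTIONS = (["no-op"], ["read"])


def summarize_drift(plan_json: str) -> List[str]:
    """Return one 'address: actions' line per resource the plan would change."""
    plan = json.loads(plan_json)
    lines = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if actions and actions not in IGNORED_ACTIONS:
            lines.append(f"{rc['address']}: {'/'.join(actions)}")
    return lines


def build_alert(plan_json: str, environment: str) -> Optional[str]:
    """Build an alert message, or return None when nothing would change."""
    changed = summarize_drift(plan_json)
    if not changed:
        return None
    header = f"Drift detected in {environment}: {len(changed)} resource(s)"
    return "\n".join([header, *changed])
```

Returning None for an empty plan lets the pipeline skip the notification step instead of sending an "all clear" message on every run.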

Third-Party Solutions

Tools like driftctl, Firefly, env0, and Spacelift add reporting and remediation workflows on top of basic drift detection.

Remediation Strategies

Option 1: Update Infrastructure to Match Code

Run terraform apply to revert manual changes, and review the plan output before confirming. Warning: Reverting a change that production now depends on might cause downtime.

Option 2: Update Code to Match Infrastructure

Update your Terraform code to include the change, then verify with terraform plan.

Option 3: Import Unmanaged Resources

Write the resource block, run terraform import with the resource address and the provider's ID for the resource, then adjust the block until terraform plan shows no changes.

Option 4: Remove from State

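Use terraform state rm to stop managing resources that should be managed elsewhere. This removes the resource from state only; the real resource is not destroyed. Remove the matching resource block from your code as well, or the next plan will try to create it again.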
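The four options above can be sketched as a small lookup that returns the commands to run for each one. The option names ("revert", "adopt", "import", "forget") and the function itself are hypothetical; only the terraform subcommands are real. Every sequence ends with a plan so the result can be verified.

```python
from typing import List, Optional

VERIFY = ["terraform", "plan", "-detailed-exitcode"]


def remediation_commands(
    option: str,
    address: Optional[str] = None,
    resource_id: Optional[str] = None,
) -> List[List[str]]:
    """Return the terraform commands for a remediation option, in order."""
    if option == "revert":
        # Option 1: save the plan, review it, then apply exactly that plan.
        return [
            ["terraform", "plan", "-out=tfplan"],
            ["terraform", "apply", "tfplan"],
            VERIFY,
        ]
    if option == "adopt":
        # Option 2: the code edit is manual; only the verification is a command.
        return [VERIFY]
    if option == "import":
        # Option 3: needs the resource address and the provider's ID for it.
        if not address or not resource_id:
            raise ValueError("import needs an address and a resource ID")
        return [["terraform", "import", address, resource_id], VERIFY]
    if option == "forget":
        # Option 4: drops the resource from state; the real resource stays.
        if not address:
            raise ValueError("forget needs an address")
        return [["terraform", "state", "rm", address], VERIFY]
    raise ValueError(f"unknown remediation option: {option}")
```

For example, remediation_commands("import", "aws_s3_bucket.logs", "my-logs-bucket") returns the import command followed by the verifying plan. The address and bucket name here are made up for illustration.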

Preventing Future Drift

Technical Controls

  • Restrict console access using IAM policies
  • Enforce tags that identify IaC-managed resources
  • Use Service Control Policies (SCPs) to block out-of-band changes across accounts
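To illustrate the tagging control, here is a minimal sketch that flags resources missing an IaC ownership tag. The tag key ManagedBy, the value terraform, and the shape of the resource dictionaries are all assumptions; in practice the inventory would come from your cloud provider's API or an export.

```python
from typing import Dict, Iterable, List

# Assumed tagging convention; use whatever your team has standardized on.
MANAGED_TAG_KEY = "ManagedBy"
MANAGED_TAG_VALUE = "terraform"


def find_unmanaged(resources: Iterable[Dict]) -> List[str]:
    """Return the IDs of resources that lack the IaC ownership tag."""
    unmanaged = []
    for resource in resources:
        tags = resource.get("tags") or {}  # tolerate a missing or null tag map
        if tags.get(MANAGED_TAG_KEY) != MANAGED_TAG_VALUE:
            unmanaged.append(resource["id"])
    return unmanaged
```

Anything this check returns was either created outside Terraform or created by Terraform code that does not apply the tag, and both cases are worth a look.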

Process Controls

  • Document break-glass procedures
  • Require PR review for all changes
  • Conduct regular drift audits
  • Train the team on why IaC matters

Drift Response Playbook

  1. Assess: Is this expected?
  2. Document: Who, what, when, why
  3. Decide: Update code or infrastructure?
  4. Remediate: Make the fix
  5. Verify: Confirm drift resolved
  6. Prevent: How do we stop this recurring?
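The playbook above is easier to follow consistently when each incident is captured in the same shape. The record below is one possible sketch; the class name, field names, and decision values are assumptions chosen to mirror the steps, not an established schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DriftRecord:
    """One drift incident, with fields that mirror the playbook steps."""

    resource: str          # Document: what drifted
    changed_by: str        # Document: who made the change
    reason: str            # Document: why it was made
    expected: bool         # Assess: was this change expected?
    decision: str = "undecided"  # Decide: "update_code" or "update_infrastructure"
    detected_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )                      # Document: when it was detected
    verified: bool = False  # Verify: a follow-up plan came back clean
    prevention: str = ""    # Prevent: the follow-up action agreed on

    def is_closed(self) -> bool:
        """Closed once a decision was made and the fix was verified."""
        return self.decision != "undecided" and self.verified
```

Stored in a ticket or a log, records like this also supply the who, what, when, and why for the audits mentioned earlier.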

Key Takeaways

  • Drift is inevitable; detecting it quickly is what matters
  • Automated scanning should run at least daily
  • Prevention through access control is better than detection
  • Document everything

Zero drift is unrealistic. Quick detection and consistent remediation? That's achievable.

Frequently Asked Questions

What is infrastructure drift in Terraform?

Infrastructure drift occurs when the actual state of cloud resources diverges from the desired state defined in your Terraform code. It happens when engineers make manual changes in the cloud console, when automated processes (auto-scaling, service integrations) modify resources, or when Terraform applies partial changes due to errors. Drift is detected by running terraform plan and observing that Terraform plans to make changes even though you haven't modified your code.

How do I detect Terraform infrastructure drift automatically?

Run terraform plan -detailed-exitcode on a scheduled basis in your CI/CD pipeline. Exit code 2 indicates drift was detected (a non-empty plan). Configure the pipeline to send an alert (Slack, Teams, or email) when drift is detected. Third-party tools like Driftctl, Firefly, Spacelift, and env0 provide more sophisticated drift detection with reporting and remediation workflows.

How do I fix infrastructure drift in Terraform?

You have four options depending on the situation. Run terraform apply to revert the infrastructure back to match your code (may cause downtime if the manual change was intentional). Update your Terraform code to match the current infrastructure state, then verify with terraform plan that the result is a clean no-op. Use terraform import to bring unmanaged resources under Terraform management. Use terraform state rm to remove resources from Terraform management if they should be managed by a different configuration.

How do I prevent infrastructure drift in the first place?

The primary control is restricting direct console access using IAM policies and Service Control Policies (SCPs) so engineers cannot make changes outside of Terraform. Enforce a PR review process for all Terraform changes. Use break-glass procedures for emergency changes that require immediate manual intervention, with a mandatory follow-up to codify the change in Terraform within 24 hours. Regular automated drift detection ensures any drift that does occur is caught and remediated quickly.

What causes infrastructure drift in AWS and Azure environments?

The most common causes are emergency production fixes applied directly in the console or via CLI, developers using the cloud console for convenience rather than updating IaC, auto-scaling groups modifying resource counts, managed service integrations that create resources automatically (AWS creating ENIs, security groups, or IAM roles), and incomplete Terraform runs that partially apply a change before failing.

Microsoft Cloud Solution Architect

Cloud Solution Architect with deep expertise in Microsoft Azure and a strong background in systems and IT infrastructure. Passionate about cloud technologies, security best practices, and helping organizations modernize their infrastructure.
