MCP Server Hardening Case Study: Locking Down a Corporate Dev Environment
Most teams treat MCP servers as developer tooling. They are infrastructure, and the incident logs prove it. This guide walks through network isolation, authenticated gateways, Azure Policy governance, and KQL detection for enterprise MCP deployments, drawn from a real post-incident remediation.
The Incident That Exposed the Architecture Gap
In Q1 2026, a 40-person engineering team at a financial services firm had been running Claude Code enterprise-wide for six weeks. MCP configuration was left to individual developers. Fifteen different MCP server configurations were running across developer workstations and CI/CD runners. Three of those configurations included a file system MCP server with root-level access. Two included an MCP server pointing to the internal secrets management API with no authentication beyond the developer's personal API key.
When a developer left the company, his CI/CD runner (still active, still running his MCP configuration) continued processing scheduled jobs. The runner had his credentials cached. Three weeks later, the firm's DLP system flagged an unusual pattern: 200 MB of files from the internal document repository had been accessed from a CI/CD runner at 2 AM. The runner was using the file system MCP server to read documents outside the project scope.
No malicious actor was involved. A scheduled pipeline had drifted. But the incident exposed what the [MCP server security guide](/blog/mcp-server-security-guide-2026) covers in theory: in a corporate environment, MCP servers are infrastructure, not developer tooling. They need the same controls as any privileged workload.
This article documents the architecture changes that team implemented over three weeks, including working Terraform, Azure API Management policy, Azure Policy definitions, and KQL detection queries.
---
The Target Architecture
The design goal: every MCP server in the corporate environment runs in a controlled, audited, network-isolated configuration. No developer runs an MCP server from a personal workstation with access to shared corporate resources.
The architecture has four layers:
- Network isolation: MCP servers run in Azure Container Instances inside a dedicated subnet, not on developer workstations
- Identity control: Each MCP server instance uses a dedicated user-assigned managed identity with minimal permissions, not developer personal credentials
- Tool scope enforcement: Azure API Management authenticates every tool call with JWT validation and enforces per-client rate limits
- Audit pipeline: All tool calls log to a central Log Analytics workspace, with KQL alerts for anomalous access patterns
---
Step 1: Network Isolation with Terraform
VNet Design
The MCP server subnet is isolated from both the developer VNet (where workstations and CI/CD runners are) and the production VNet (where APIs and databases live). MCP servers get controlled access to specific internal APIs via private endpoints. No broad network access.
resource "azurerm_virtual_network" "mcp" {
  name                = "vnet-mcp-${var.environment}"
  location            = var.location
  resource_group_name = var.resource_group_name
  address_space       = ["10.50.0.0/16"]
  tags                = var.common_tags
}

resource "azurerm_subnet" "mcp_servers" {
  name                 = "snet-mcp-servers"
  resource_group_name  = var.resource_group_name
  virtual_network_name = azurerm_virtual_network.mcp.name
  address_prefixes     = ["10.50.1.0/24"]

  # ACI can only deploy into a subnet delegated to container groups
  delegation {
    name = "container-instances"

    service_delegation {
      name    = "Microsoft.ContainerInstance/containerGroups"
      actions = ["Microsoft.Network/virtualNetworks/subnets/action"]
    }
  }
}

resource "azurerm_network_security_group" "mcp_servers" {
  name                = "nsg-mcp-servers"
  location            = var.location
  resource_group_name = var.resource_group_name

  # Only the CI/CD runner subnet may reach the MCP port
  security_rule {
    name                       = "AllowMCPFromCICDRunners"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "3000"
    source_address_prefix      = var.cicd_runner_subnet_cidr
    destination_address_prefix = "*"
  }

  # Lowest-priority catch-all: everything else inbound is dropped
  security_rule {
    name                       = "DenyAllInbound"
    priority                   = 4096
    direction                  = "Inbound"
    access                     = "Deny"
    protocol                   = "*"
    source_port_range          = "*"
    destination_port_range     = "*"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  # MCP servers never talk to the public internet
  security_rule {
    name                       = "DenyInternetOutbound"
    priority                   = 200
    direction                  = "Outbound"
    access                     = "Deny"
    protocol                   = "*"
    source_port_range          = "*"
    destination_port_range     = "*"
    source_address_prefix      = "*"
    destination_address_prefix = "Internet"
  }
}
Container Instance Per MCP Server Type
Each MCP server type gets its own container instance with a dedicated managed identity. This prevents a compromised MCP server from using another server's credentials.
resource "azurerm_user_assigned_identity" "mcp_github" {
  name                = "id-mcp-github-${var.environment}"
  location            = var.location
  resource_group_name = var.resource_group_name
}

resource "azurerm_container_group" "mcp_github" {
  name                = "aci-mcp-github-${var.environment}"
  location            = var.location
  resource_group_name = var.resource_group_name
  ip_address_type     = "Private"
  subnet_ids          = [azurerm_subnet.mcp_servers.id]
  os_type             = "Linux"
  restart_policy      = "Always"

  identity {
    type         = "UserAssigned"
    identity_ids = [azurerm_user_assigned_identity.mcp_github.id]
  }

  container {
    name   = "mcp-github"
    image  = "${var.acr_login_server}/mcp-github:${var.mcp_github_version}"
    cpu    = "0.5"
    memory = "0.5"

    ports {
      port     = 3000
      protocol = "TCP"
    }

    environment_variables = {
      "LOG_ENDPOINT"    = var.log_analytics_endpoint
      "MCP_SERVER_NAME" = "github"
      "ALLOWED_ORGS"    = var.allowed_github_orgs
    }
  }

  # Pull from ACR with the managed identity instead of a username/password
  image_registry_credential {
    server                    = var.acr_login_server
    user_assigned_identity_id = azurerm_user_assigned_identity.mcp_github.id
  }

  tags = merge(var.common_tags, { MCPServer = "true" })
}

# The identity needs AcrPull on the registry before the image pull can succeed
resource "azurerm_role_assignment" "mcp_github_acr_pull" {
  scope                = var.acr_resource_id
  role_definition_name = "AcrPull"
  principal_id         = azurerm_user_assigned_identity.mcp_github.principal_id
}
The MCPServer = "true" tag is required for the Azure Policy enforcement in Step 3. Every MCP container group must carry this tag; the policy targets it.
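The Terraform in this step defines the NSG but does not show it being attached to anything, and an NSG has no effect until it is associated with a subnet or network interface. A minimal sketch of that association, assuming it is not already declared elsewhere in the team's configuration (the resource label `mcp_servers` on the association is my own choice):

```hcl
# Sketch: bind the NSG to the MCP subnet so its rules are actually enforced.
# Both referenced resources are the ones defined earlier in Step 1.
resource "azurerm_subnet_network_security_group_association" "mcp_servers" {
  subnet_id                 = azurerm_subnet.mcp_servers.id
  network_security_group_id = azurerm_network_security_group.mcp_servers.id
}
```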
---
Step 2: APIM Gateway with JWT Enforcement
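Azure API Management sits in front of all MCP servers. It validates JWTs issued by Entra ID, applies per-client rate limits, and logs every tool call request. No MCP server should be reachable except through APIM, which means the Step 1 NSG rule that admits the CI/CD runner subnet on port 3000 has to be narrowed to the APIM subnet once the gateway is live.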
APIM Infrastructure
resource "azurerm_api_management" "mcp_gateway" {
  name                 = "apim-mcp-${var.environment}"
  location             = var.location
  resource_group_name  = var.resource_group_name
  publisher_name       = var.publisher_name
  publisher_email      = var.publisher_email
  sku_name             = "Developer_1"
  virtual_network_type = "Internal"

  virtual_network_configuration {
    subnet_id = azurerm_subnet.apim.id
  }

  identity {
    type = "SystemAssigned"
  }
}
Use Premium_1 in production. Developer_1 has no SLA and is not zone-redundant, and among the classic tiers only Developer and Premium support internal VNet mode, so Standard_1 would not work with virtual_network_type = "Internal".
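The APIM resource references `azurerm_subnet.apim`, which is not shown in this article. A minimal sketch of what that subnet could look like follows; the subnet name and the `10.50.2.0/24` prefix are assumptions chosen to fit inside the MCP VNet's address space, not values from the original deployment.

```hcl
# Sketch only: name and address prefix are hypothetical.
# APIM gets its own subnet, separate from the delegated ACI subnet.
resource "azurerm_subnet" "apim" {
  name                 = "snet-mcp-apim"
  resource_group_name  = var.resource_group_name
  virtual_network_name = azurerm_virtual_network.mcp.name
  address_prefixes     = ["10.50.2.0/24"]
}
```

An APIM instance injected into a VNet also needs its own NSG rules, for example inbound management traffic on TCP 3443 from the ApiManagement service tag, so the restrictive MCP server NSG should not simply be reused here.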
APIM Inbound Policy
The APIM policy validates the Entra ID JWT, enforces rate limits, and logs all tool calls to Event Hub:
<policies>
  <inbound>
    <!-- Validate the Entra ID token and keep the parsed JWT for the policies below -->
    <validate-jwt header-name="Authorization" failed-validation-httpcode="401"
                  output-token-variable-name="jwt">
      <openid-config url="https://login.microsoftonline.com/<tenant-id>/v2.0/.well-known/openid-configuration"/>
      <required-claims>
        <claim name="aud" match="any">
          <value><mcp-app-client-id></value>
        </claim>
        <!-- scp is one space-separated string, so split it before matching -->
        <claim name="scp" match="any" separator=" ">
          <value>mcp.tools.read</value>
          <value>mcp.tools.write</value>
        </claim>
      </required-claims>
    </validate-jwt>
    <!-- Rate limit per authenticated principal, not per source IP -->
    <rate-limit-by-key calls="50" renewal-period="60"
                       counter-key="@(((Jwt)context.Variables["jwt"]).Subject)"/>
    <log-to-eventhub logger-id="mcp-audit-logger">
      @{
        var jwt = (Jwt)context.Variables["jwt"];
        return new JObject(
          new JProperty("timestamp", DateTime.UtcNow),
          new JProperty("caller", jwt.Claims.GetValueOrDefault("oid", "unknown")),
          new JProperty("tool", context.Request.Url.Path),
          new JProperty("method", context.Request.Method),
          new JProperty("body", context.Request.Body.As<string>(preserveContent: true))
        ).ToString();
      }
    </log-to-eventhub>
  </inbound>
</policies>
Replace <tenant-id> and <mcp-app-client-id> with your Entra ID tenant and the app registration client ID for the MCP gateway. The rate limit is keyed on the validated token's subject so that it applies per principal, and the logged caller comes from the token's oid claim rather than a request header, which a client could set itself.
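The policy refers to a logger named `mcp-audit-logger`, which has to exist on the APIM instance before the policy can be saved. A minimal sketch of that logger in Terraform, assuming an Event Hub already exists; `var.audit_eventhub_name` and `var.audit_eventhub_connection_string` are hypothetical variables, and the connection string should be marked sensitive.

```hcl
# Sketch only: the two variables are hypothetical and not part of the original config.
resource "azurerm_api_management_logger" "mcp_audit" {
  # Must match the logger-id used in the log-to-eventhub policy
  name                = "mcp-audit-logger"
  api_management_name = azurerm_api_management.mcp_gateway.name
  resource_group_name = var.resource_group_name

  eventhub {
    name              = var.audit_eventhub_name
    connection_string = var.audit_eventhub_connection_string
  }
}
```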
Entra ID App Registration Scopes
The MCP gateway app registration in Entra ID defines scopes by tool category. Clients request only what they need:
| Scope | Tools Available | Who Gets It |
|---|---|---|
| `mcp.tools.read` | File read, repo view, code search | All developer clients |
| `mcp.tools.write` | File write, PR creation, issue creation | Approved developer clients |
| `mcp.tools.admin` | Repo settings, webhook management | DevOps service principals only |
| `mcp.tools.secrets` | Secrets management API access | CI/CD pipeline service principals only |
Enforce the secrets scope restriction with a Conditional Access policy: if the requesting principal is a user (not a service principal), and the requested scope includes mcp.tools.secrets, block the authentication. No developer should ever acquire the secrets scope interactively.
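The scope table can also be expressed in Terraform. The sketch below assumes the `azuread` and `random` providers are configured; the resource names, display names, and descriptions are hypothetical, and only the scope values come from the table. It shows one delegated scope and, as one possible way to keep `mcp.tools.secrets` away from interactive users, models that permission as an application role that only service principals can hold. Conditional Access policies are generally targeted at applications rather than individual scopes, so this is offered as a complementary control, not a description of what the team deployed.

```hcl
# Sketch only: names and consent text are hypothetical.
resource "random_uuid" "scope_read" {}
resource "random_uuid" "role_secrets" {}

resource "azuread_application" "mcp_gateway" {
  display_name = "mcp-gateway"

  api {
    # v2 tokens match the v2.0 metadata endpoint used by validate-jwt
    requested_access_token_version = 2

    # Delegated scope: shows up in the scp claim of user tokens.
    # mcp.tools.write would follow the same pattern.
    oauth2_permission_scope {
      id                         = random_uuid.scope_read.result
      value                      = "mcp.tools.read"
      type                       = "User"
      enabled                    = true
      admin_consent_display_name = "Read-only MCP tools"
      admin_consent_description  = "Call file read, repo view, and code search tools."
    }
  }

  # Application role: assignable to service principals only, so a user
  # cannot acquire it interactively. Shows up in the roles claim.
  app_role {
    id                   = random_uuid.role_secrets.result
    value                = "mcp.tools.secrets"
    allowed_member_types = ["Application"]
    enabled              = true
    display_name         = "MCP secrets tools"
    description          = "Call secrets management tools from CI/CD pipelines."
  }
}
```

Tokens issued to application-only callers carry `roles` instead of `scp`, so a gateway policy that requires `scp` would need an additional check on `roles` for pipeline principals.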
---
Step 3: Azure Policy for MCP Server Governance
Two Azure Policy definitions protect the MCP infrastructure from misconfiguration.
Policy 1: Require Managed Identity on All MCP Container Groups
resource "azurerm_policy_definition" "require_mcp_managed_identity" {
  name         = "require-mcp-managed-identity"
  policy_type  = "Custom"
  mode         = "All"
  display_name = "MCP server containers must use managed identity"

  policy_rule = jsonencode({
    if = {
      allOf = [
        {
          field  = "type"
          equals = "Microsoft.ContainerInstance/containerGroups"
        },
        {
          field  = "tags.MCPServer"
          exists = "true"
        },
        {
          # Deny when the identity block is missing or explicitly "None"
          anyOf = [
            {
              field  = "identity.type"
              exists = "false"
            },
            {
              field  = "identity.type"
              equals = "None"
            }
          ]
        }
      ]
    }
    then = {
      effect = "Deny"
    }
  })
}
A container group tagged MCPServer: true that lacks a managed identity gets denied at the ARM layer before it starts. This blocks the class of incident in the case study: a developer's local container configuration pointing at shared resources using personal credentials.
Policy 2: Deny MCP Servers Outside the Approved Subnet
resource "azurerm_policy_definition" "mcp_approved_subnet_only" {
  name         = "mcp-approved-subnet-only"
  policy_type  = "Custom"
  mode         = "All"
  display_name = "MCP server containers must run in approved subnet"

  policy_rule = jsonencode({
    if = {
      allOf = [
        {
          field  = "type"
          equals = "Microsoft.ContainerInstance/containerGroups"
        },
        {
          field  = "tags.MCPServer"
          exists = "true"
        },
        {
          field = "Microsoft.ContainerInstance/containerGroups/subnetIds[*].id"
          notIn = var.approved_mcp_subnet_ids
        }
      ]
    }
    then = {
      effect = "Deny"
    }
  })
}
This is the policy that would have blocked the original incident. Any MCP container group launched outside the approved subnet (for example, in the developer subnet where the CI/CD runner lived) is denied by policy before the container starts.
Assign both policies to the subscription scope, not just the MCP resource group. Developers may have access to other resource groups where they could attempt to launch containers.
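A policy definition does nothing until it is assigned. A minimal sketch of the subscription-scope assignments, assuming the Terraform run targets the subscription that hosts the MCP resources; the assignment names are my own and simply mirror the definition names.

```hcl
# Sketch only: assignment names are hypothetical.
# Resolves to the subscription the azurerm provider is authenticated against.
data "azurerm_subscription" "current" {}

resource "azurerm_subscription_policy_assignment" "require_mcp_managed_identity" {
  name                 = "require-mcp-managed-identity"
  policy_definition_id = azurerm_policy_definition.require_mcp_managed_identity.id
  subscription_id      = data.azurerm_subscription.current.id
}

resource "azurerm_subscription_policy_assignment" "mcp_approved_subnet_only" {
  name                 = "mcp-approved-subnet-only"
  policy_definition_id = azurerm_policy_definition.mcp_approved_subnet_only.id
  subscription_id      = data.azurerm_subscription.current.id
}
```

Both policies key off the MCPServer tag, so a container group deployed without the tag is not evaluated by either one.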
---
Step 4: Audit Logging and KQL Detection
Log Schema
Every tool call processed by the APIM gateway logs a structured JSON event to a Log Analytics workspace via an Event Hub connector. The schema:
{
  "timestamp": "2026-05-17T14:32:11Z",
  "caller_oid": "9f3a2b1c-4d5e-6f7a-8b9c-0d1e2f3a4b5c",
  "caller_upn": "developer@contoso.com",
  "mcp_server": "github",
  "tool_name": "create_pull_request",
  "scope_used": "mcp.tools.write",
  "resource_accessed": "repos/contoso/backend-api",
  "latency_ms": 312,
  "response_status": 200,
  "apim_request_id": "abc123def456"
}
Service principal callers have caller_oid but caller_upn is empty. That distinction is the key for separating pipeline activity from developer activity in KQL.
KQL: Write Tool Calls Outside Business Hours
MCPAuditLog_CL
| where TimeGenerated > ago(24h)
| extend Hour = datetime_part("hour", TimeGenerated)
| where Hour < 7 or Hour >= 20   // outside 07:00-19:59
| where tool_name_s startswith "create_"
    or tool_name_s startswith "update_"
    or tool_name_s startswith "delete_"
| where isnotempty(caller_upn_s) // human user, not service principal
| project TimeGenerated, caller_upn_s, mcp_server_s, tool_name_s,
          resource_accessed_s, Hour
| order by TimeGenerated desc
Alert threshold: any write tool call outside 7 AM to 8 PM from a human user principal. TimeGenerated is stored in UTC, so shift it to the team's local time zone before comparing if business hours are defined locally. Pipeline service principals legitimately run outside business hours; the isnotempty(caller_upn_s) filter separates them.
KQL: High-Volume File System Access
MCPAuditLog_CL
| where mcp_server_s == "filesystem"
| where tool_name_s in ("read_file", "list_directory", "search_files")
| summarize FilesAccessed = count(),
            UniqueDirectories = dcount(resource_accessed_s)
    by caller_oid_s, caller_upn_s, bin(TimeGenerated, 1h)
| where FilesAccessed > 100 or UniqueDirectories > 20
| project TimeGenerated, caller_upn_s, caller_oid_s,
          FilesAccessed, UniqueDirectories
| order by FilesAccessed desc
This is the detection that would have flagged the incident 3 weeks earlier than the DLP system. Alert threshold: more than 100 file reads or 20 unique directories in a 1-hour window. The 2 AM pipeline had read 847 files across 34 directories before the DLP caught it. This query fires at 101 files.
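A query only becomes a detection once it is scheduled. The sketch below wraps the file system query in a Log Analytics scheduled query alert; `var.log_analytics_workspace_id` and `var.security_action_group_id` are hypothetical variables, and the severity and 15-minute evaluation frequency are assumptions rather than the team's actual settings. The `bin(TimeGenerated, 1h)` grouping is dropped because the rule's one-hour window already bounds the time range.

```hcl
# Sketch only: variables, severity, and frequency are assumptions.
resource "azurerm_monitor_scheduled_query_rules_alert_v2" "mcp_fs_volume" {
  name                 = "alert-mcp-filesystem-volume"
  resource_group_name  = var.resource_group_name
  location             = var.location
  scopes               = [var.log_analytics_workspace_id]
  description          = "File system MCP reads exceed 100 files or 20 directories per hour"
  severity             = 2
  evaluation_frequency = "PT15M" # how often the query runs
  window_duration      = "PT1H"  # how far back each run looks

  criteria {
    query = <<-KQL
      MCPAuditLog_CL
      | where mcp_server_s == "filesystem"
      | where tool_name_s in ("read_file", "list_directory", "search_files")
      | summarize FilesAccessed = count(),
                  UniqueDirectories = dcount(resource_accessed_s)
          by caller_oid_s, caller_upn_s
      | where FilesAccessed > 100 or UniqueDirectories > 20
    KQL

    # Fire when the query returns at least one offending caller
    time_aggregation_method = "Count"
    operator                = "GreaterThan"
    threshold               = 0

    failing_periods {
      minimum_failing_periods_to_trigger_alert = 1
      number_of_evaluation_periods             = 1
    }
  }

  action {
    action_groups = [var.security_action_group_id]
  }
}
```

The other two queries in this step could be scheduled the same way, with the window adjusted to match each query's time bin.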
KQL: Repeated Authorization Failures (Scope Escalation Attempts)
MCPAuditLog_CL
| where response_status_d == 403
| summarize FailedAttempts = count()
    by caller_oid_s, caller_upn_s, tool_name_s, bin(TimeGenerated, 10m)
| where FailedAttempts > 5
| project TimeGenerated, caller_oid_s, caller_upn_s, tool_name_s, FailedAttempts
| order by FailedAttempts desc
Multiple 403s from the same caller on restricted tools indicate a client attempting to call tools beyond its authorized scope. Grouping by caller_oid_s as well as caller_upn_s keeps service principals, whose UPN is empty, from being merged into one row. In practice this fires for two reasons: a misconfigured MCP client (legitimate; the scope needs tuning) or a deliberate probe (needs investigation). Distinguish by looking at which tools are being probed: mcp.tools.admin or mcp.tools.secrets scope failures are higher severity. Note that the validate-jwt policy in Step 2 returns 401 on failed validation, so if scope checks happen only there, match 401 as well.
---
Step 5: CI/CD Integration Without Stored Credentials
Pipelines authenticate to the MCP gateway using workload identity federation, not stored secrets. This follows the [federated credentials pattern for GitHub Actions and Entra ID](/blog/flexible-federated-identity-credentials-entra-github-terraform).
GitHub Actions Workflow
name: Deploy via MCP Gateway

on:
  push:
    branches: [main]

# id-token: write lets the job request a GitHub OIDC token
permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Authenticate to Azure via OIDC
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Get MCP gateway token
        run: |
          MCP_TOKEN=$(az account get-access-token \
            --resource ${{ secrets.MCP_APP_CLIENT_ID }} \
            --query accessToken -o tsv)
          # Mask the token in logs before exporting it to later steps
          echo "::add-mask::${MCP_TOKEN}"
          echo "MCP_TOKEN=${MCP_TOKEN}" >> "$GITHUB_ENV"

      - name: Call MCP tool via authenticated gateway
        env:
          MCP_ENDPOINT: ${{ secrets.MCP_GATEWAY_ENDPOINT }}
        run: |
          curl -sf \
            -H "Authorization: Bearer ${MCP_TOKEN}" \
            -H "Content-Type: application/json" \
            "${MCP_ENDPOINT}/tools/create_pull_request" \
            -d '{"repo": "contoso/backend-api", "branch": "release/v2.1"}'
No client secrets are stored anywhere: the secrets.* values above are identifiers, not credentials. The pipeline identity authenticates through OIDC, and the resulting token carries only the mcp.tools.write permission granted to that identity.
Federated Credential Terraform Configuration
resource "azurerm_user_assigned_identity" "mcp_pipeline" {
  name                = "id-mcp-pipeline-${var.environment}"
  location            = var.location
  resource_group_name = var.resource_group_name
}

resource "azurerm_federated_identity_credential" "mcp_pipeline_main" {
  name                = "github-actions-main-branch"
  resource_group_name = var.resource_group_name
  audience            = ["api://AzureADTokenExchange"]
  issuer              = "https://token.actions.githubusercontent.com"
  parent_id           = azurerm_user_assigned_identity.mcp_pipeline.id
  subject             = "repo:<github-org>/<github-repo>:ref:refs/heads/main"
}
The subject constraint limits the federated credential to tokens issued for the main branch only. A pull request branch cannot acquire this identity, which prevents feature branch pipelines from getting production-level MCP access.
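If pull request pipelines also need to reach the gateway, one option is a second identity with its own federated credential rather than widening the subject on the main-branch one. This is a sketch under that assumption; the `mcp_pipeline_pr` names are hypothetical, and the identity would be granted read-only MCP permissions only. GitHub issues the subject `repo:<org>/<repo>:pull_request` for workflows triggered by pull request events.

```hcl
# Sketch only: a separate, lower-privilege identity for pull request workflows.
resource "azurerm_user_assigned_identity" "mcp_pipeline_pr" {
  name                = "id-mcp-pipeline-pr-${var.environment}"
  location            = var.location
  resource_group_name = var.resource_group_name
}

resource "azurerm_federated_identity_credential" "mcp_pipeline_pr" {
  name                = "github-actions-pull-request"
  resource_group_name = var.resource_group_name
  audience            = ["api://AzureADTokenExchange"]
  issuer              = "https://token.actions.githubusercontent.com"
  parent_id           = azurerm_user_assigned_identity.mcp_pipeline_pr.id
  # Matches only tokens minted for pull_request-triggered runs of this repo
  subject             = "repo:<github-org>/<github-repo>:pull_request"
}
```

Keeping the two identities separate means a compromised pull request workflow can never exchange its token for the main-branch identity's write access.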
---
What Changed After the Remediation
After implementing this architecture (three weeks elapsed: one week Terraform, one week APIM policy and Entra ID configuration, one week KQL tuning), the engineering team ran a 30-day comparison:
| Metric | Before | After |
|---|---|---|
| MCP servers running with personal credentials | 14 | 0 |
| MCP tool calls logged and queryable | 0% | 100% |
| Policy violations blocked at ARM layer | N/A | 3 (rogue container attempts) |
| Mean time to detect file system anomaly | 21 days (DLP) | 38 minutes (KQL alert) |
| Developer offboarding MCP cleanup steps | 0 | 4 (documented checklist) |
Three policy violation blocks in the first 30 days: two developers who tried to launch local MCP containers with the MCPServer tag pointing at a shared resource group, and one CI/CD runner template that hadn't been updated to the new subnet configuration.
The two KQL alerts that fired both turned out to be benign: one developer testing the file system MCP server on a personal repo (resolved by adding an allowlist entry), and one service principal with an expiring certificate attempting re-authentication (certificate rotation was shortened from 365 days to 90).
---
Developer Offboarding Checklist for MCP Access
When a developer leaves, four steps are now standard in the offboarding runbook:
1. Revoke the developer's Entra ID app registration consent for the MCP gateway app
2. Remove any federated credential subject entries referencing the developer's GitHub username from pipeline identities
3. Verify no active CI/CD runners carry the developer's personal API token as an environment variable (scan runner configs in GitHub Actions and Azure Pipelines)
4. Audit MCPAuditLog_CL for the developer's caller_upn for the past 90 days: confirm last activity matches expected patterns before their last day
None of these steps were in the original offboarding checklist. The incident audit revealed that step 3 is what allowed the drift: a cached environment variable on a runner that had never been cleaned up.
---
Hardening Checklist
- [ ] No MCP servers running on developer workstations with access to shared corporate resources or production APIs
- [ ] All MCP servers containerized in dedicated subnet (ACI or AKS) with user-assigned managed identity
- [ ] MCPServer: true tag applied to every MCP container group resource
- [ ] Azure Policy: require managed identity deployed and assigned at subscription scope
- [ ] Azure Policy: approved subnet only deployed and assigned at subscription scope for MCPServer-tagged resources
- [ ] APIM gateway deployed in front of all MCP servers enforcing JWT validation against Entra ID
- [ ] Entra ID scopes granular: read / write / admin / secrets as separate scopes
- [ ] Conditional Access policy: block human users from acquiring mcp.tools.secrets scope
- [ ] APIM rate limit: 50 tool calls per minute per principal
- [ ] All CI/CD pipelines use workload identity federation (OIDC) not stored client secrets
- [ ] Federated credential subject constraints scoped to specific branches only (not wildcard)
- [ ] All tool calls logging to Log Analytics workspace via Event Hub
- [ ] KQL alert: write operations outside business hours from human user principals
- [ ] KQL alert: file system access exceeding 100 reads or 20 unique directories per hour
- [ ] KQL alert: repeated 403 responses from same caller on restricted tool scopes
- [ ] Developer offboarding checklist includes MCP gateway consent revocation and runner credential audit
Microsoft Cloud Solution Architect
Cloud Solution Architect with deep expertise in Microsoft Azure and a strong background in systems and IT infrastructure. Passionate about cloud technologies, security best practices, and helping organizations modernize their infrastructure.