Securing AI Agents in Microsoft Environments: Prompt Injection, Shadow Agents, and the New Attack Surface
Microsoft Build 2026 shipped a coordinated set of controls to discover, govern, protect, and verify AI agents. Here is how they map to the real attack surface: prompt injection, shadow agents, MCP abuse, and SearchLeak-style data theft.

The Microsoft enterprise stack now ships AI agents the way it once shipped mailboxes: by the thousand, with mixed permissions, mixed owners, and very mixed levels of oversight. At Build 2026 Microsoft acknowledged this directly and announced a coordinated set of capabilities to discover, govern, protect, and verify agents across the development lifecycle. The same week, Varonis disclosed SearchLeak (CVE-2026-42824), a one-click data exfiltration attack chain against Microsoft 365 Copilot that pulled emails, files, and MFA codes out of SharePoint and OneDrive through indirect prompt injection.
An AI agent is a software system that uses a large language model to reason, plan, and take actions on enterprise data and tools, often with its own identity and standing permissions. Prompt injection is the class of attack where adversary-controlled text inside the model context overrides the developer or user intent and steers the agent into actions the operator never authorized. Together those two definitions describe most of the new attack surface this article covers.
This guide is written for security teams running Microsoft Entra, Defender XDR, Defender for Cloud, Purview, M365 Copilot, and Copilot Studio. It synthesizes the Build 2026 announcements with the real attack surface that emerged across 2025 and 2026, then gives a practical checklist you can take to your next agent review.
The new agent attack surface in Microsoft environments
Classic identity and endpoint defenses were built around a human at a keyboard. Agents break those assumptions: they hold credentials, they call tools, they ingest documents and emails as instructions, and they spawn other agents. The attack surface that emerges has four distinct shapes.
Direct prompt injection
A user (or attacker) types instructions that override the system prompt. In the Microsoft stack this is most common in Copilot Studio agents exposed as Teams bots, Power Platform automations, or web chat surfaces. Without an input guardrail layer the agent will follow the most recent confident instruction, which is exactly what an attacker counts on.
Indirect prompt injection through grounding data
M365 Copilot grounds responses in user mail, calendar items, Teams messages, and SharePoint or OneDrive documents. Any of those surfaces can carry attacker-controlled text. The SearchLeak chain disclosed in June 2026 is the canonical example: a victim clicked a crafted Microsoft link, Copilot parsed an attacker-controlled parameter as a prompt, then chained an HTML rendering race condition with a Bing SSRF to bypass the content security policy and exfiltrate emails, files, and MFA codes. Microsoft tracked it as CVE-2026-42824 and added AI Request Verification, a user-facing confirmation dialog for outbound data, in the June patch.
The lesson generalizes beyond SearchLeak. If your Copilot Studio agent reads from a shared mailbox, an inbound message can carry instructions. If it reads from a SharePoint library, a single uploaded document can do the same. Any agent that grounds on data a low-trust actor can write to is exposed to this class of attack.
Tool and MCP abuse
The Model Context Protocol (MCP) became the dominant standard for connecting agents to external tools in 2025. Microsoft now supports MCP across Foundry, Copilot Studio, and several first-party agents. The problem is that MCP cannot enforce trust at the protocol level: a poisoned tool description shipped inside an MCP server runs on every invocation, silently, across every session. A 2026 disclosure surfaced roughly 200,000 vulnerable MCP instances across IDEs, internal tools, and cloud services. An MCP server that aggregates credentials for multiple services also becomes a single point of failure for everything it can reach.
Shadow agents and identity sprawl
Copilot Studio shifted agent building into a low-code activity, which is the same governance pattern that produced shadow IT in the SaaS era. Enterprises that have built fifty or more custom agents typically have no canonical inventory of what exists, who owns each one, what permissions it holds, or which data it touches. Microsoft now calls this agent sprawl, and the practical risk is concrete: agents with leftover Microsoft Graph permissions, agents owned by employees who left, and agents that talk to each other and accumulate effective privilege across the conversation.
For deeper background on the broader non-human identity problem, see the NHI and AI agent security guide. Agent sprawl is the most acute current example of an NHI problem with poor lifecycle controls.
What Microsoft Build 2026 added: discover, govern, protect, verify
Microsoft organized the Build 2026 security announcements around four verbs that map cleanly onto the agent lifecycle. Each verb maps to a concrete product surface, and most of these capabilities went generally available or entered broad preview between May and June 2026.
Discover: find every agent in the tenant
Microsoft Agent 365 reached general availability on May 1, 2026 and ships an agent registry that produces a unified inventory of every agent in the tenant, including third-party agents. Defender Agent Security Posture Management (Agent SPM) extends this with detection of shadow agents, agent sprawl, and cumulative attack surface across Foundry, Copilot Studio, and custom frameworks. Defender for Cloud AI-SPM adds multi-cloud agent visibility across Microsoft Foundry, AWS Bedrock, and GCP Vertex AI, and rolls those workloads into a dedicated secure score. From July 1, 2026, agent discovery and posture for Foundry and third-party cloud agents move under the Agent 365 license rather than the Defender CSPM plan.
Govern: give every agent an identity
Microsoft Entra Agent ID reached general availability in April 2026 and treats every agent as a first-class identity. Assistive agents act on behalf of a signed-in user through delegated permissions; autonomous agents authenticate with their own client credentials flow. Conditional Access policies for agents enforce the same controls used for humans: MFA on the user the agent acts for, device compliance for the host, location and risk evaluation, and explicit block policies for high-risk agents. The Conditional Access Optimization Agent identifies agent identities with excessive or unused Microsoft Graph permissions and recommends least-privilege adjustments.
Microsoft Purview added AI Agent Governance policies that extend data classification, sensitivity labels, and DLP to agent-accessible content. The practical effect: an agent that grounds on a SharePoint library inherits the library's sensitivity labels, and Purview can block exfiltration of labeled content through an agent response just as it would through an email.
Protect: stop attacks at runtime
Microsoft Defender now provides runtime protection for AI agents. The enhanced Defender for AI Services plan, available from February 2026, inspects agent activity at three critical points: user prompts, pre-tool calls, and post-tool responses. It detects prompt injection, classifies dangerous actions, and can audit or block before the action executes. Coverage extends across the full agent loop: inputs, memory, reasoning, tool calls, actions, and model dependencies, not just the prompt and response surface.
Defender for Office 365 added a dedicated AI agent risk engine that screens autonomous email behaviors before they reach inboxes, which directly addresses indirect injection vectors like the one used in SearchLeak. Defender XDR correlates agent alerts with identity and endpoint signals in a single incident graph, so a suspicious Foundry agent action and the user identity that triggered it land in the same investigation queue.
Verify: red team before deployment
Azure AI Foundry shipped the AI Red Teaming Agent in public preview. It integrates Microsoft's open-source PyRIT toolkit and runs automated adversarial scans against models and application endpoints using more than twenty attack strategies, including multi-turn refinement against conversational agents with session memory. The output is a structured report of safety and security risks: prompt injection success rates, harmful content generation, privacy leakage, and robustness failures. Run it from the Foundry portal against a deployed endpoint, or run it locally with the Azure AI Evaluation SDK against a model under development.
The verification step is the one most security teams skip. For a structured set of risk categories to test against, the OWASP Top 10 for agentic AI provides the cleanest checklist, and the AI Red Teaming Agent attack strategies map well onto it.
Practical checklist for Microsoft and Azure security teams
This list assumes you already run Entra ID, Defender XDR, Defender for Cloud, and Purview. Treat it as a phased plan: discovery and identity first, then runtime protection, then verification cadence.
1. Inventory every agent in Entra Agent ID
Enable Entra Agent ID and pull the Agent 365 registry into your CMDB or asset platform. For each agent record owner, business purpose, classification of data accessed, and the list of tools or MCP servers it can call. Treat any agent without an owner as a candidate for retirement.
2. Apply Conditional Access policies to agents
Use the Microsoft-supplied policy template that blocks high-risk agent identities flagged by Entra ID Protection. Add a policy that requires compliant device for any agent that handles regulated data. For autonomous agents, scope client credential authentication to specific named locations or Conditional Access network locations rather than allowing any IP.
3. Apply Purview sensitivity labels to agent-accessible content
Any SharePoint library, OneDrive folder, or shared mailbox an agent grounds on must carry a sensitivity label that reflects its data. Configure Purview DLP to block agent responses that include content labeled Confidential or higher when the response destination is external. This is the single most effective control against M365 Copilot data exfiltration of the SearchLeak class.
4. Turn on Defender runtime protection for AI agents
Enable the enhanced Defender for AI Services plan against Foundry agents and connect alerts into Defender XDR. Configure runtime policies to block, not just audit, the two highest-confidence detections: prompt injection from grounding content and post-tool responses that match exfiltration signatures.
5. Allowlist MCP servers and audit tool descriptions
Maintain a tenant-level allowlist of MCP servers that agents may connect to. Block remote MCP endpoints by default and require sign-off to add a new one. For every approved server, store a copy of the tool descriptions in source control and diff them on every update; a silently changed description is the tool poisoning attack pattern.
6. Enforce AI Request Verification in M365 Copilot
Make the AI Request Verification dialog mandatory by policy for any Copilot action that sends data to an external endpoint. This is the specific control Microsoft shipped to close the SearchLeak class of indirect injection, and it is opt-in by default.
7. Run the AI Red Teaming Agent on a fixed cadence
Run the Foundry AI Red Teaming Agent on every production agent at deployment and at minimum quarterly afterward. Treat findings above a defined severity bar as release blockers in your agent CI pipeline, the same way you treat a critical SAST finding for traditional code.
8. Run the Conditional Access Optimization Agent monthly
Use the optimization agent to surface agent identities with unused or overprivileged Microsoft Graph permissions, and act on the recommendations. Agent permissions tend to accumulate the same way human role permissions do, but without a review cycle nobody enforces.
For a worked threat model of an autonomous agent in an Azure-native setting, the Azure Logic Apps autonomous agent threat model walks through how these controls combine in a concrete deployment.
What is still unsolved
The Build 2026 announcements close several large gaps but leave a handful of real ones open. Treat these as known-residual risks that need compensating controls and manual oversight.
No detector catches every indirect prompt injection. Defender runtime protection raises the bar but a determined attacker can still hide instructions inside content the agent must trust to function. Sensitivity labels and DLP at the egress side remain the durable control.
Agent-to-agent trust is not a solved problem. When one agent calls another, the second agent inherits the call context but may also pick up adversarial instructions buried in it. Microsoft has not published a clean per-hop authorization model for chained agents.
MCP trust is still an out-of-band decision. Microsoft can list trusted servers but cannot prove a remote MCP server is not silently rewriting its tool descriptions tomorrow. Source-of-truth tool descriptions in your own repo, with diff-based alerting, remains a human-driven control.
Licensing complexity is its own risk. Several of the most useful agent controls move under the Agent 365 license on July 1, 2026. Teams that do not budget for that license will lose Foundry agent discovery and posture in Defender for Cloud unless they migrate.
Custom agent frameworks outside Foundry and Copilot Studio get partial coverage. AI-SPM extends to AWS Bedrock and GCP Vertex AI, but a self-hosted open-source agent on a Linux VM gets identity coverage through Entra and endpoint coverage through Defender for Endpoint, with limited agent-specific runtime inspection.
Summary
Securing AI agents inside a Microsoft environment in 2026 is no longer a research problem. The pieces exist: Entra Agent ID for identity, Conditional Access for agent policy, Purview for data-level controls, Defender for runtime protection, Defender for Cloud AI-SPM for posture, and the Azure AI Foundry AI Red Teaming Agent for verification. SearchLeak proved that indirect prompt injection is real and weaponized in the wild, and shadow agents and MCP abuse are now mainstream risk categories rather than speculative ones. The work for security teams is to actually deploy these controls, prove them with adversarial testing, and treat agent identities with the same discipline they apply to human admin accounts.
Key actions at a glance:
Inventory every agent in Entra Agent ID and assign an accountable owner
Apply Conditional Access policies that block high-risk agent identities and enforce least privilege
Label all agent-accessible content in Purview and enforce DLP at agent response egress
Turn on Defender runtime protection for Foundry agents and block, not just audit, the highest-confidence detections
Allowlist MCP servers, version-control tool descriptions, and run the AI Red Teaming Agent at least quarterly
Get weekly security insights
Cloud security, zero trust, and identity guides โ straight to your inbox.
Microsoft Cloud Solution Architect
Cloud Solution Architect with deep expertise in Microsoft Azure and a strong background in systems and IT infrastructure. Passionate about cloud technologies, security best practices, and helping organizations modernize their infrastructure.
Share this article
Questions & Answers
Related Articles
Need Help with Your Security?
Our team of security experts can help you implement the strategies discussed in this article.
Contact Us