What is prompt injection in the context of AI agents?

Prompt injection is an attack where adversary-controlled text inside the model context overrides the developer or user intent and steers the agent into actions the operator never authorized. It comes in two flavors: direct injection (a user types malicious instructions into the agent) and indirect injection (the malicious instructions arrive inside grounding data the agent reads, like an email, calendar item, or SharePoint document).

What did Microsoft Build 2026 add for AI agent security?

Microsoft organized the Build 2026 agent security announcements around four pillars: discover (Agent 365 registry, Defender Agent SPM, Defender for Cloud AI-SPM), govern (Entra Agent ID, Conditional Access for agents, Purview AI agent governance), protect (Defender for AI runtime protection of prompts, pre-tool calls, and post-tool responses), and verify (Azure AI Foundry AI Red Teaming Agent with PyRIT).

What is Microsoft Entra Agent ID?

Microsoft Entra Agent ID is an identity framework that gives every AI agent its own first-class identity in Entra. Assistive agents act on behalf of a user through delegated permissions, while autonomous agents authenticate with the client credentials flow. It reached general availability in April 2026 and lets administrators apply Conditional Access, audit, and least-privilege policies to agent identities the same way they do to human users.

What is a shadow agent and why is it a security risk?

A shadow agent is an AI agent built inside the tenant without going through formal IT or security review, usually in Copilot Studio or another low-code platform. The risk is the same pattern as shadow IT: no canonical inventory, no clear owner, leftover permissions on Microsoft Graph, and agents that talk to each other and accumulate effective privilege. Microsoft Agent 365 and Defender Agent SPM are designed to detect these and bring them under governance.

What was the SearchLeak vulnerability in Microsoft 365 Copilot?

SearchLeak (CVE-2026-42824) was a critical one-click data exfiltration chain disclosed by Varonis in June 2026. It combined a parameter-to-prompt injection, an HTML rendering race condition, and a Bing SSRF that bypassed the content security policy. A single victim click on a crafted Microsoft link let Copilot exfiltrate emails, files, and even MFA codes from SharePoint and OneDrive. Microsoft patched it and added AI Request Verification, a confirmation dialog for outbound data.

Is Microsoft Defender for AI enough to secure AI agents on its own?

No. Defender for AI runtime protection catches a large share of prompt injection and dangerous tool-call patterns, but it cannot stop every indirect injection hidden in grounding content the agent must trust to do its job. Pair it with Purview sensitivity labels and DLP on the egress side, Entra Agent ID and Conditional Access on the identity side, an MCP server allowlist, and a regular AI Red Teaming Agent cadence on every production agent.

Cybersecurity10

Securing AI Agents in Microsoft Environments: Prompt Injection, Shadow Agents, and the New Attack Surface

Microsoft Build 2026 shipped a coordinated set of controls to discover, govern, protect, and verify AI agents. Here is how they map to the real attack surface: prompt injection, shadow agents, MCP abuse, and SearchLeak-style data theft.

Idan Ohayon

Microsoft Cloud Solution Architect

June 19, 2026

AI agent security in Microsoft environments diagram showing Entra Agent ID, Defender runtime protection, Purview DLP, and AI Red Teaming Agent verification

AI agentsMicrosoftprompt injectionshadow agentsEntra Agent IDMicrosoft Build 2026AI securityDefender for AI

The Microsoft enterprise stack now ships AI agents the way it once shipped mailboxes: by the thousand, with mixed permissions, mixed owners, and very mixed levels of oversight. At Build 2026 Microsoft acknowledged this directly and announced a coordinated set of capabilities to discover, govern, protect, and verify agents across the development lifecycle. The same week, Varonis disclosed SearchLeak (CVE-2026-42824), a one-click data exfiltration attack chain against Microsoft 365 Copilot that pulled emails, files, and MFA codes out of SharePoint and OneDrive through indirect prompt injection.

An AI agent is a software system that uses a large language model to reason, plan, and take actions on enterprise data and tools, often with its own identity and standing permissions. Prompt injection is the class of attack where adversary-controlled text inside the model context overrides the developer or user intent and steers the agent into actions the operator never authorized. Together those two definitions describe most of the new attack surface this article covers.

Recommended: Pluralsight

Level up your cybersecurity skills with expert-led courses and labs.

Accelerate your tech careerAffiliate link; we may earn a commission at no extra cost to you.

Card required for the free trial; cancel anytime before day 10 to avoid charges.

This guide is written for security teams running Microsoft Entra, Defender XDR, Defender for Cloud, Purview, M365 Copilot, and Copilot Studio. It synthesizes the Build 2026 announcements with the real attack surface that emerged across 2025 and 2026, then gives a practical checklist you can take to your next agent review.

The new agent attack surface in Microsoft environments

Classic identity and endpoint defenses were built around a human at a keyboard. Agents break those assumptions: they hold credentials, they call tools, they ingest documents and emails as instructions, and they spawn other agents. The attack surface that emerges has four distinct shapes.

Direct prompt injection

A user (or attacker) types instructions that override the system prompt. In the Microsoft stack this is most common in Copilot Studio agents exposed as Teams bots, Power Platform automations, or web chat surfaces. Without an input guardrail layer the agent will follow the most recent confident instruction, which is exactly what an attacker counts on.

Indirect prompt injection through grounding data

M365 Copilot grounds responses in user mail, calendar items, Teams messages, and SharePoint or OneDrive documents. Any of those surfaces can carry attacker-controlled text. The SearchLeak chain disclosed in June 2026 is the canonical example: a victim clicked a crafted Microsoft link, Copilot parsed an attacker-controlled parameter as a prompt, then chained an HTML rendering race condition with a Bing SSRF to bypass the content security policy and exfiltrate emails, files, and MFA codes. Microsoft tracked it as CVE-2026-42824 and added AI Request Verification, a user-facing confirmation dialog for outbound data, in the June patch.

The lesson generalizes beyond SearchLeak. If your Copilot Studio agent reads from a shared mailbox, an inbound message can carry instructions. If it reads from a SharePoint library, a single uploaded document can do the same. Any agent that grounds on data a low-trust actor can write to is exposed to this class of attack.

Tool and MCP abuse

The Model Context Protocol (MCP) became the dominant standard for connecting agents to external tools in 2025. Microsoft now supports MCP across Foundry, Copilot Studio, and several first-party agents. The problem is that MCP cannot enforce trust at the protocol level: a poisoned tool description shipped inside an MCP server runs on every invocation, silently, across every session. A 2026 disclosure surfaced roughly 200,000 vulnerable MCP instances across IDEs, internal tools, and cloud services. An MCP server that aggregates credentials for multiple services also becomes a single point of failure for everything it can reach.

Shadow agents and identity sprawl

Copilot Studio shifted agent building into a low-code activity, which is the same governance pattern that produced shadow IT in the SaaS era. Enterprises that have built fifty or more custom agents typically have no canonical inventory of what exists, who owns each one, what permissions it holds, or which data it touches. Microsoft now calls this agent sprawl, and the practical risk is concrete: agents with leftover Microsoft Graph permissions, agents owned by employees who left, and agents that talk to each other and accumulate effective privilege across the conversation.

For deeper background on the broader non-human identity problem, see the NHI and AI agent security guide. Agent sprawl is the most acute current example of an NHI problem with poor lifecycle controls.

What Microsoft Build 2026 added: discover, govern, protect, verify

Microsoft organized the Build 2026 security announcements around four verbs that map cleanly onto the agent lifecycle. Each verb maps to a concrete product surface, and most of these capabilities went generally available or entered broad preview between May and June 2026.

Discover: find every agent in the tenant

Microsoft Agent 365 reached general availability on May 1, 2026 and ships an agent registry that produces a unified inventory of every agent in the tenant, including third-party agents. Defender Agent Security Posture Management (Agent SPM) extends this with detection of shadow agents, agent sprawl, and cumulative attack surface across Foundry, Copilot Studio, and custom frameworks. Defender for Cloud AI-SPM adds multi-cloud agent visibility across Microsoft Foundry, AWS Bedrock, and GCP Vertex AI, and rolls those workloads into a dedicated secure score. From July 1, 2026, agent discovery and posture for Foundry and third-party cloud agents move under the Agent 365 license rather than the Defender CSPM plan.

Govern: give every agent an identity

Microsoft Entra Agent ID reached general availability in April 2026 and treats every agent as a first-class identity. Assistive agents act on behalf of a signed-in user through delegated permissions; autonomous agents authenticate with their own client credentials flow. Conditional Access policies for agents enforce the same controls used for humans: MFA on the user the agent acts for, device compliance for the host, location and risk evaluation, and explicit block policies for high-risk agents. The Conditional Access Optimization Agent identifies agent identities with excessive or unused Microsoft Graph permissions and recommends least-privilege adjustments.

Microsoft Purview added AI Agent Governance policies that extend data classification, sensitivity labels, and DLP to agent-accessible content. The practical effect: an agent that grounds on a SharePoint library inherits the library's sensitivity labels, and Purview can block exfiltration of labeled content through an agent response just as it would through an email.

Protect: stop attacks at runtime

Microsoft Defender now provides runtime protection for AI agents. The enhanced Defender for AI Services plan, available from February 2026, inspects agent activity at three critical points: user prompts, pre-tool calls, and post-tool responses. It detects prompt injection, classifies dangerous actions, and can audit or block before the action executes. Coverage extends across the full agent loop: inputs, memory, reasoning, tool calls, actions, and model dependencies, not just the prompt and response surface.

Defender for Office 365 added a dedicated AI agent risk engine that screens autonomous email behaviors before they reach inboxes, which directly addresses indirect injection vectors like the one used in SearchLeak. Defender XDR correlates agent alerts with identity and endpoint signals in a single incident graph, so a suspicious Foundry agent action and the user identity that triggered it land in the same investigation queue.

Verify: red team before deployment

Azure AI Foundry shipped the AI Red Teaming Agent in public preview. It integrates Microsoft's open-source PyRIT toolkit and runs automated adversarial scans against models and application endpoints using more than twenty attack strategies, including multi-turn refinement against conversational agents with session memory. The output is a structured report of safety and security risks: prompt injection success rates, harmful content generation, privacy leakage, and robustness failures. Run it from the Foundry portal against a deployed endpoint, or run it locally with the Azure AI Evaluation SDK against a model under development.

The verification step is the one most security teams skip. For a structured set of risk categories to test against, the OWASP Top 10 for agentic AI provides the cleanest checklist, and the AI Red Teaming Agent attack strategies map well onto it.

Practical checklist for Microsoft and Azure security teams

This list assumes you already run Entra ID, Defender XDR, Defender for Cloud, and Purview. Treat it as a phased plan: discovery and identity first, then runtime protection, then verification cadence.

1. Inventory every agent in Entra Agent ID

Enable Entra Agent ID and pull the Agent 365 registry into your CMDB or asset platform. For each agent record owner, business purpose, classification of data accessed, and the list of tools or MCP servers it can call. Treat any agent without an owner as a candidate for retirement.

2. Apply Conditional Access policies to agents

Use the Microsoft-supplied policy template that blocks high-risk agent identities flagged by Entra ID Protection. Add a policy that requires compliant device for any agent that handles regulated data. For autonomous agents, scope client credential authentication to specific named locations or Conditional Access network locations rather than allowing any IP.

3. Apply Purview sensitivity labels to agent-accessible content

Any SharePoint library, OneDrive folder, or shared mailbox an agent grounds on must carry a sensitivity label that reflects its data. Configure Purview DLP to block agent responses that include content labeled Confidential or higher when the response destination is external. This is the single most effective control against M365 Copilot data exfiltration of the SearchLeak class.

4. Turn on Defender runtime protection for AI agents

Enable the enhanced Defender for AI Services plan against Foundry agents and connect alerts into Defender XDR. Configure runtime policies to block, not just audit, the two highest-confidence detections: prompt injection from grounding content and post-tool responses that match exfiltration signatures.

5. Allowlist MCP servers and audit tool descriptions

Maintain a tenant-level allowlist of MCP servers that agents may connect to. Block remote MCP endpoints by default and require sign-off to add a new one. For every approved server, store a copy of the tool descriptions in source control and diff them on every update; a silently changed description is the tool poisoning attack pattern.

6. Enforce AI Request Verification in M365 Copilot

Make the AI Request Verification dialog mandatory by policy for any Copilot action that sends data to an external endpoint. This is the specific control Microsoft shipped to close the SearchLeak class of indirect injection, and it is opt-in by default.

7. Run the AI Red Teaming Agent on a fixed cadence

Run the Foundry AI Red Teaming Agent on every production agent at deployment and at minimum quarterly afterward. Treat findings above a defined severity bar as release blockers in your agent CI pipeline, the same way you treat a critical SAST finding for traditional code.

8. Run the Conditional Access Optimization Agent monthly

Use the optimization agent to surface agent identities with unused or overprivileged Microsoft Graph permissions, and act on the recommendations. Agent permissions tend to accumulate the same way human role permissions do, but without a review cycle nobody enforces.

For a worked threat model of an autonomous agent in an Azure-native setting, the Azure Logic Apps autonomous agent threat model walks through how these controls combine in a concrete deployment.

What is still unsolved

The Build 2026 announcements close several large gaps but leave a handful of real ones open. Treat these as known-residual risks that need compensating controls and manual oversight.

No detector catches every indirect prompt injection. Defender runtime protection raises the bar but a determined attacker can still hide instructions inside content the agent must trust to function. Sensitivity labels and DLP at the egress side remain the durable control.
Agent-to-agent trust is not a solved problem. When one agent calls another, the second agent inherits the call context but may also pick up adversarial instructions buried in it. Microsoft has not published a clean per-hop authorization model for chained agents.
MCP trust is still an out-of-band decision. Microsoft can list trusted servers but cannot prove a remote MCP server is not silently rewriting its tool descriptions tomorrow. Source-of-truth tool descriptions in your own repo, with diff-based alerting, remains a human-driven control.
Licensing complexity is its own risk. Several of the most useful agent controls move under the Agent 365 license on July 1, 2026. Teams that do not budget for that license will lose Foundry agent discovery and posture in Defender for Cloud unless they migrate.
Custom agent frameworks outside Foundry and Copilot Studio get partial coverage. AI-SPM extends to AWS Bedrock and GCP Vertex AI, but a self-hosted open-source agent on a Linux VM gets identity coverage through Entra and endpoint coverage through Defender for Endpoint, with limited agent-specific runtime inspection.

Summary

Securing AI agents inside a Microsoft environment in 2026 is no longer a research problem. The pieces exist: Entra Agent ID for identity, Conditional Access for agent policy, Purview for data-level controls, Defender for runtime protection, Defender for Cloud AI-SPM for posture, and the Azure AI Foundry AI Red Teaming Agent for verification. SearchLeak proved that indirect prompt injection is real and weaponized in the wild, and shadow agents and MCP abuse are now mainstream risk categories rather than speculative ones. The work for security teams is to actually deploy these controls, prove them with adversarial testing, and treat agent identities with the same discipline they apply to human admin accounts.

Key actions at a glance:

Inventory every agent in Entra Agent ID and assign an accountable owner
Apply Conditional Access policies that block high-risk agent identities and enforce least privilege
Label all agent-accessible content in Purview and enforce DLP at agent response egress
Turn on Defender runtime protection for Foundry agents and block, not just audit, the highest-confidence detections
Allowlist MCP servers, version-control tool descriptions, and run the AI Red Teaming Agent at least quarterly

Recommended: Pluralsight

Level up your cybersecurity skills with expert-led courses and labs.

Accelerate your tech careerAffiliate link; we may earn a commission at no extra cost to you.

Card required for the free trial; cancel anytime before day 10 to avoid charges.

Free download

Security Hardening Checklist

Essential security controls for cloud-native applications and infrastructure.

No spam. Unsubscribe anytime.

Continue Learning

SOC Analyst Level 1 Roadmap

Get job-ready for your first Security Operations Center role.

Start the Beginner Path10h · 4 topics · 10 quiz questions

Idan Ohayon

Microsoft Cloud Solution Architect

Cloud Solution Architect with deep expertise in Microsoft Azure and a strong background in systems and IT infrastructure. Passionate about cloud technologies, security best practices, and helping organizations modernize their infrastructure.

Share this article

X / Twitter LinkedIn

Questions & Answers

🔐

Cybersecurity

Why People Are Leaving Google Drive for Zero-Knowledge Storage

8 min read

🔐

Cybersecurity

I Ran an Autonomous AI Hacker on My Own Site: An Honest Strix Review

10 min read

🔐

Cybersecurity

Google Ads Disapproved for "Compromised Site": How to Fix It

8 min read

Need Help with Your Security?

Our team of security experts can help you implement the strategies discussed in this article.

Securing AI Agents in Microsoft Environments: Prompt Injection, Shadow Agents, and the New Attack Surface

The new agent attack surface in Microsoft environments

Direct prompt injection

Indirect prompt injection through grounding data

Tool and MCP abuse

Shadow agents and identity sprawl

What Microsoft Build 2026 added: discover, govern, protect, verify

Discover: find every agent in the tenant

Govern: give every agent an identity

Protect: stop attacks at runtime

Verify: red team before deployment

Practical checklist for Microsoft and Azure security teams

1. Inventory every agent in Entra Agent ID

2. Apply Conditional Access policies to agents

3. Apply Purview sensitivity labels to agent-accessible content

4. Turn on Defender runtime protection for AI agents

5. Allowlist MCP servers and audit tool descriptions

6. Enforce AI Request Verification in M365 Copilot

7. Run the AI Red Teaming Agent on a fixed cadence

8. Run the Conditional Access Optimization Agent monthly

What is still unsolved

Summary

Security Hardening Checklist

SOC Analyst Level 1 Roadmap

Idan Ohayon

Share this article

Questions & Answers

Ask a Question

Related Articles

Why People Are Leaving Google Drive for Zero-Knowledge Storage

I Ran an Autonomous AI Hacker on My Own Site: An Honest Strix Review

Google Ads Disapproved for "Compromised Site": How to Fix It

Need Help with Your Security?