Microsoft Purview Information Protection: Complete...

Editorial note: The financial-services scenario and its measurements are illustrative. They are included to explain the deployment approach, not to represent a published customer benchmark.

When DLP Fails Because the Data Was Never Labeled

A financial services team ran a DLP policy in Microsoft Purview that blocked email attachments containing credit card numbers. A contractor then exfiltrated a financial model spreadsheet by email. The spreadsheet contained no credit card numbers; it contained formulas, assumptions, and client names that were worth far more. DLP never fired because the policy could only match content patterns, and nothing in the file matched one. The data had never been classified, so DLP had no signal to act on.

Microsoft Purview Information Protection solves a different problem than DLP: it establishes classification at the point of content creation, so that DLP and other downstream controls have a reliable signal to act on. This guide covers a complete tenant setup: label taxonomy design, sensitivity label publishing, auto-labeling policies, and DLP policies that use those labels as conditions.

Designing the Label Taxonomy

The Most Common Mistake: Too Many Labels

Enterprise tenants routinely deploy 15-20 sensitivity labels and then find that users apply them inconsistently or not at all. In practice, adoption tends to drop sharply once users have to choose among more than five or six labels, because the distinctions between adjacent labels stop being obvious.

The taxonomy that works in practice:

Label	Color	Encryption	Use case
Public	Green	None	Marketing content, public docs
General	Blue	None	Internal business communications
Confidential	Yellow	None	Internal-only sensitive content
Confidential / All Employees	Yellow	Encrypt to org	Broad-access sensitive content
Confidential / Specific People	Orange	Encrypt to named group	Restricted project content
Highly Confidential	Red	Encrypt plus DRM restrictions	Trade secrets, M&A, regulated PII

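Sublabels under Confidential give you the granularity you need for DLP policy targeting without multiplying the top-level labels that users have to reason about. A sublabel is defined by its parent relationship (ParentId in PowerShell), not by a character in its name; the slash in the table is only notation, and the label picker renders the pair as a hierarchy. Once a label has sublabels, users can apply only the sublabels and not the parent itself, so the plain Confidential row acts as a grouping entry rather than a choice users make.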
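
A taxonomy like the one in the table can be sanity-checked before it is published. The following is a minimal Python sketch, not part of Purview: the `Label` tuple, the `validate_taxonomy` helper, and the limit of six are illustrative assumptions drawn from the guidance above.

```python
from typing import NamedTuple, Optional

class Label(NamedTuple):
    name: str                      # internal name, e.g. "ConfidentialAllEmployees"
    parent: Optional[str] = None   # internal name of the parent label, if a sublabel
    encrypted: bool = False

# Hypothetical representation of the six-row taxonomy from the table above.
TAXONOMY = [
    Label("Public"),
    Label("General"),
    Label("Confidential"),
    Label("ConfidentialAllEmployees", parent="Confidential", encrypted=True),
    Label("ConfidentialSpecificPeople", parent="Confidential", encrypted=True),
    Label("HighlyConfidential", encrypted=True),
]

def validate_taxonomy(labels, max_labels=6):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    names = [label.name for label in labels]
    if len(labels) > max_labels:
        problems.append(f"{len(labels)} labels exceeds the limit of {max_labels}")
    if len(set(names)) != len(names):
        problems.append("duplicate label names")
    top_level = {label.name for label in labels if label.parent is None}
    for label in labels:
        # Every sublabel must point at an existing top-level label.
        if label.parent is not None and label.parent not in top_level:
            problems.append(f"{label.name}: unknown parent {label.parent!r}")
    return problems

print(validate_taxonomy(TAXONOMY))  # []
```

Running a check like this in CI keeps a taxonomy held in source control from drifting past the label budget unnoticed.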

Label Settings That Matter for Security

Several label settings are invisible in the user experience but carry significant security implications:

Encryption with user-defined permissions: users can choose who gets access. This undermines audit trail completeness because the label event logs do not capture the recipients. Use predefined permissions with named groups instead.
Let users assign permissions when applying labels in Outlook: disable this so that a given label always means the same protection. If users can choose between Do Not Forward and Encrypt-Only, DLP rules that match on the label can no longer assume which restrictions are actually in force.
Content marking (headers, footers, watermarks): at Confidential and above, add visual markings. A watermark on a Highly Confidential document makes the classification hard to miss, which helps show that the user was aware of it before forwarding.
Auto-labeling settings on the label itself: distinct from org-wide auto-labeling policies. Label-level auto-labeling applies when a user manually creates content and the sensitive information type is detected in Office apps. This is client-side detection; it runs on save, not on upload.

Configuring Sensitivity Labels via PowerShell

The Purview portal is adequate for initial setup, but any production tenant needs label configuration in source control. Labels are managed through Security & Compliance PowerShell, which ships with the ExchangeOnlineManagement module:

# Connect to Security & Compliance PowerShell (requires the ExchangeOnlineManagement module)
Connect-IPPSSession -UserPrincipalName admin@contoso.com

# Create top-level Confidential label
# Label color is an advanced setting; display order is adjusted afterwards with Set-Label -Priority
New-Label `
  -Name 'Confidential' `
  -DisplayName 'Confidential' `
  -Tooltip 'Business content not meant for public distribution' `
  -AdvancedSettings @{color='#FFB900'}

# Create sublabel Confidential/All Employees with encryption
New-Label `
  -Name 'ConfidentialAllEmployees' `
  -DisplayName 'All Employees' `
  -ParentId 'Confidential' `
  -Tooltip 'Confidential content accessible to all employees' `
  -AdvancedSettings @{color='#FFB900'} `
  -EncryptionEnabled $true `
  -EncryptionProtectionType 'Template' `
  -EncryptionTemplateId '<rms-template-id-for-all-employees>'

# Create Highly Confidential with stricter controls
# EncryptionProtectionType 'Template' applies encryption; 'RemoveProtection' strips it
# <rms-template-id-for-highly-confidential> is a placeholder for your own template ID
New-Label `
  -Name 'HighlyConfidential' `
  -DisplayName 'Highly Confidential' `
  -Tooltip 'Highest protection: trade secrets, M&A, regulated PII' `
  -AdvancedSettings @{color='#D83B01'} `
  -EncryptionEnabled $true `
  -EncryptionProtectionType 'Template' `
  -EncryptionTemplateId '<rms-template-id-for-highly-confidential>' `
  -ApplyContentMarkingHeaderEnabled $true `
  -ApplyContentMarkingHeaderText 'HIGHLY CONFIDENTIAL' `
  -ApplyContentMarkingHeaderFontSize 12 `
  -ApplyContentMarkingHeaderFontColor '#D83B01' `
  -ApplyWaterMarkingEnabled $true `
  -ApplyWaterMarkingText 'HIGHLY CONFIDENTIAL - DO NOT DISTRIBUTE' `
  -ApplyWaterMarkingFontSize 24

After creating labels, publish them via a label policy scoped to the appropriate users:

# 'ConfidentialSpecificPeople' is assumed to have been created the same way as 'ConfidentialAllEmployees'
New-LabelPolicy `
  -Name 'corp-sensitivity-labels-all-users' `
  -Labels @('Public','General','Confidential','ConfidentialAllEmployees','ConfidentialSpecificPeople','HighlyConfidential') `
  -ExchangeLocation All `
  -ModernGroupLocation All `
  -SharePointLocation All

Auto-Labeling Policies

Service-Side vs. Client-Side Auto-Labeling

The label configuration above supports client-side auto-labeling: when a user has a document open and saves it, the Office app detects sensitive information types and recommends or applies a label. Service-side auto-labeling runs in the cloud, scanning content in SharePoint, OneDrive, and Exchange before users interact with it.

Service-side auto-labeling is more powerful because it operates on existing content and on content uploaded without an Office app. Configure it separately in the Purview auto-labeling policy blade.

A critical distinction: service-side auto-labeling policies run in simulation mode first. You must run the simulation, review the results, and explicitly enable enforcement. Skipping simulation in a production tenant and enabling enforcement immediately will generate false positives and user complaints.

Auto-Labeling Policy for Regulated PII

# Create an auto-labeling policy targeting financial PII in SharePoint and OneDrive
New-AutoSensitivityLabelPolicy `
  -Name 'autolabel-financial-pii' `
  -ApplySensitivityLabel 'HighlyConfidential' `
  -Mode 'TestWithoutNotifications' `
  -SharePointLocation All `
  -OneDriveLocation All `
  -ExchangeLocation All

# Add sensitive information type rules to the policy
New-AutoSensitivityLabelRule `
  -Name 'autolabel-financial-pii-rule' `
  -Policy 'autolabel-financial-pii' `
  -ContentContainsSensitiveInformation @(
    @{Name='Credit Card Number'; minCount=1; confidenceLevel='High'},
    @{Name='U.S. Social Security Number (SSN)'; minCount=1; confidenceLevel='High'},
    @{Name='International Banking Account Number (IBAN)'; minCount=1; confidenceLevel='Medium'}
  ) `
  -Workload 'OneDriveForBusiness,SharePoint,Exchange'

After running for 7 days in simulation mode, review the simulation results in the Purview portal. Look specifically for false positives: documents that would be labeled Highly Confidential but should not be. Common false positive sources include test data files, development seed data, and documentation that describes the format of regulated data rather than containing actual regulated data.
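
The false positive review can be partly pre-sorted before a human looks at it. The sketch below is a rough Python illustration under stated assumptions: it presumes the simulation matches have been exported to CSV with a `FilePath` column (a hypothetical column name, not a documented export format), and the keyword list is only a starting point to tune per tenant.

```python
import csv
import io
import re

# Path tokens that often indicate test, seed, or documentation content (illustrative).
SUSPECT_TOKENS = {"test", "tests", "seed", "fixtures", "sample", "samples", "docs", "documentation"}

def likely_false_positive(path: str) -> bool:
    """True when any path segment or word matches a suspect token."""
    tokens = set(re.split(r"[^a-z0-9]+", path.lower()))
    return bool(tokens & SUSPECT_TOKENS)

def triage(csv_text: str):
    """Split simulation matches into (review_first, probably_fine) lists of file paths."""
    review_first, probably_fine = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = review_first if likely_false_positive(row["FilePath"]) else probably_fine
        target.append(row["FilePath"])
    return review_first, probably_fine

# Hypothetical export rows, for demonstration only.
SAMPLE = """FilePath,MatchedType
/sites/dev/seed/customers.csv,Credit Card Number
/sites/finance/Q3/payroll.xlsx,U.S. Social Security Number (SSN)
/sites/eng/docs/ssn-format.md,U.S. Social Security Number (SSN)
"""

review, fine = triage(SAMPLE)
print(review)  # ['/sites/dev/seed/customers.csv', '/sites/eng/docs/ssn-format.md']
print(fine)    # ['/sites/finance/Q3/payroll.xlsx']
```

The heuristic only orders the review queue; every flagged item still needs a human decision before the policy moves to enforcement.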

To promote from simulation to enforcement:

Set-AutoSensitivityLabelPolicy `
  -Identity 'autolabel-financial-pii' `
  -Mode 'Enable'

DLP Policies Using Sensitivity Labels

Why Label-Based DLP Outperforms Pattern-Matching DLP Alone

Pattern-matching DLP looks for specific formats: 16-digit credit card numbers, US SSN formats, and similar recognizable patterns. It fails on:

Context-free sensitive data (the financial model scenario from the introduction)
Encrypted files where patterns cannot be read
Images of text without OCR scanning
Content whose sensitivity only a human author can judge

Label-based DLP conditions cover these cases because the label is applied where the content is created, before it moves anywhere. A DLP policy that matches on the Highly Confidential label acts on any labeled item regardless of encryption state or content type, provided the file type supports labels and the label has actually been applied.
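
The difference between the two conditions can be shown with a toy comparison. This is a simplified Python sketch, not how Purview implements sensitive information types: the regex, the Luhn check, and the `sit_match` and `label_match` helpers are illustrative assumptions.

```python
import re

# 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){15}\d\b")

def luhn_ok(digits: str) -> bool:
    """Luhn checksum, used to discard random 16-digit runs."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:        # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def sit_match(text: str) -> bool:
    """Pattern-based check: does the text contain a plausible card number?"""
    return any(luhn_ok(re.sub(r"\D", "", m.group())) for m in CARD_RE.finditer(text))

BLOCKED_LABELS = {"Confidential", "HighlyConfidential"}

def label_match(label) -> bool:
    """Label-based check: looks only at metadata, never at the content."""
    return label in BLOCKED_LABELS

invoice = {"text": "Card on file: 4111 1111 1111 1111", "label": None}
model = {"text": "WACC assumption 8.5%, client: Contoso Holdings", "label": "HighlyConfidential"}

for name, doc in (("invoice", invoice), ("model", model)):
    print(name, sit_match(doc["text"]), label_match(doc["label"]))
# invoice True False
# model False True
```

The unlabeled invoice is caught only by the pattern check and the financial model only by the label check, which is why the two conditions complement each other.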

DLP Policy for Exchange: Block External Sharing of Confidential Content

# Create DLP policy blocking external email of labeled confidential content
New-DlpCompliancePolicy `
  -Name 'dlp-block-confidential-external-email' `
  -Comment 'Block external email of Confidential and above content' `
  -ExchangeLocation All `
  -Mode 'Enable'

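# Sensitivity labels are matched through -ContentContainsSensitiveInformation (labels syntax)
# Confirm parameter names for your module version with Get-Help New-DlpComplianceRule
$labels = 'Confidential','ConfidentialAllEmployees','ConfidentialSpecificPeople','HighlyConfidential' |
  ForEach-Object { @{name=$_; type='Sensitivity'} }
$labelCondition = @{operator='And'; groups=@(@{operator='Or'; name='Default'; labels=@($labels)})}

New-DlpComplianceRule `
  -Name 'block-confidential-external-email-rule' `
  -Policy 'dlp-block-confidential-external-email' `
  -AccessScope 'NotInOrganization' `
  -ContentContainsSensitiveInformation $labelCondition `
  -BlockAccess $true `
  -NotifyUser 'LastModifier' `
  -NotifyUserType 'Email' `
  -GenerateIncidentReport 'SiteAdmin' `
  -IncidentReportContent @('All')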

For Highly Confidential, use -BlockAccessScope 'All' and do not allow overrides. For Confidential, allow override with a business justification (-NotifyAllowOverride 'WithJustification') and audit the override event.

DLP Policy for SharePoint: Restrict Access to Labeled Documents

# Block sharing of Highly Confidential documents externally in SharePoint
New-DlpCompliancePolicy `
  -Name 'dlp-block-hc-sharepoint-external' `
  -Comment 'Prevent external sharing of Highly Confidential content in SharePoint' `
  -SharePointLocation All `
  -OneDriveLocation All `
  -Mode 'Enable'

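# BlockAccessScope 'PerUser' blocks external users; the label is matched with the labels syntax
New-DlpComplianceRule `
  -Name 'block-hc-sharepoint-external-rule' `
  -Policy 'dlp-block-hc-sharepoint-external' `
  -AccessScope 'NotInOrganization' `
  -ContentContainsSensitiveInformation @{operator='And'; groups=@(@{operator='Or'; name='Default'; labels=@(@{name='HighlyConfidential'; type='Sensitivity'})})} `
  -BlockAccess $true `
  -BlockAccessScope 'PerUser'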

Endpoint DLP: Extending Protection to Managed Devices

Endpoint DLP extends label-based policies to device-level activities: copy to USB, print, upload to non-corporate cloud storage, and paste into unsanctioned applications. It is configured under the same DLP policy framework in Purview and requires devices to be onboarded; devices already onboarded to Microsoft Defender for Endpoint use the same onboarding and need no second agent.

Endpoint DLP is built into Windows 10 (1809 and later) and Windows 11. Microsoft Edge supports it natively, while Chrome and Firefox need the Microsoft Purview extension, which can be deployed via Intune.

Once Endpoint DLP is active, add device scope to existing label-based DLP policies:

# Add Endpoint DLP to the existing confidential email policy
Set-DlpCompliancePolicy `
  -Identity 'dlp-block-confidential-external-email' `
  -AddEndpointDlpLocation All

# Create a separate rule for device-specific activities
# Setting and Value names should be confirmed by running Get-DlpComplianceRule
# against a rule created in the portal before relying on them
New-DlpComplianceRule `
  -Name 'block-hc-usb-copy-endpoint' `
  -Policy 'dlp-block-confidential-external-email' `
  -ContentContainsSensitiveInformation @{operator='And'; groups=@(@{operator='Or'; name='Default'; labels=@(@{name='HighlyConfidential'; type='Sensitivity'})})} `
  -EndpointDlpRestrictions @(
    @{Setting='RemovableMedia'; Value='Block'},
    @{Setting='Print'; Value='Audit'},
    @{Setting='CloudEgress'; Value='Block'},
    @{Setting='NetworkShare'; Value='Audit'}
  )

The Audit setting for print and network share copy is intentional for initial rollout. Promote to Block after 30 days of audit data confirms that legitimate workflows are not being caught.

Unsanctioned App Restrictions

Endpoint DLP can block pasting or uploading labeled content to specific browser destinations and desktop applications. Define your unsanctioned cloud service domains list in the Purview portal under Settings > Endpoint DLP. Common additions:

dropbox.com
wetransfer.com
mega.nz
pastebin.com
Personal OneDrive tenants (not your corporate tenant)
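
When maintaining a list like this, it helps to be precise about what "matches a domain" means. The Python sketch below is an illustration of suffix matching on hostnames, not a description of how Purview evaluates service domains; `UNSANCTIONED` and `is_unsanctioned` are hypothetical names.

```python
from urllib.parse import urlsplit

# Domains taken from the list above; personal OneDrive is omitted because it
# cannot be expressed as a simple domain suffix.
UNSANCTIONED = {"dropbox.com", "wetransfer.com", "mega.nz", "pastebin.com"}

def is_unsanctioned(url: str, blocked=UNSANCTIONED) -> bool:
    """True when the URL's host is a blocked domain or one of its subdomains.

    Expects a full URL including the scheme; a bare hostname yields False.
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    # Compare on a dot boundary so that "notdropbox.com" does not match "dropbox.com".
    return any(host == domain or host.endswith("." + domain) for domain in blocked)

print(is_unsanctioned("https://www.dropbox.com/upload"))          # True
print(is_unsanctioned("https://notdropbox.com/upload"))           # False
print(is_unsanctioned("https://contoso.sharepoint.com/sites/x"))  # False
```

Matching on a dot boundary avoids both missed subdomains and accidental matches on lookalike names.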

Monitoring with KQL and the Compliance Portal

KQL: Label Activity from the Unified Audit Log

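These queries assume audit data is already flowing into a Log Analytics workspace, for example through the Microsoft Sentinel connector for Microsoft 365, which populates the OfficeActivity table. Depending on which connectors you enable, label and DLP events may land in a different table (such as MicrosoftPurviewInformationProtection) with different column names, so check the schema in your workspace before relying on them. Query label operations: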

OfficeActivity
| where Operation in ('SensitivityLabelApplied', 'SensitivityLabelUpdated', 'SensitivityLabelRemoved')
| where TimeGenerated > ago(30d)
| project TimeGenerated, UserId, Operation,
    FileName = tostring(OfficeObjectId),
    LabelName = extract('"SensitivityLabel":"([^"]+)"', 1, tostring(OfficeProperties)),
    ClientIP = ClientIP
| summarize LabelEvents = count() by UserId, Operation, LabelName
| order by LabelEvents desc

Pay attention to high volumes of SensitivityLabelRemoved from a single user. Label removal prompts for a justification only when the label policy requires one, and it is not blocked by default. Bulk label removal before sending files externally is a pre-exfiltration signal.
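
The "removal, then external send" pattern can be checked offline against exported events. This Python sketch assumes events have been reduced to (timestamp, user, operation, file) tuples; the `SharedExternally` operation name and the one-hour window are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta

def removed_then_shared(events, window=timedelta(hours=1)):
    """events: iterable of (timestamp, user, operation, file_id) tuples.

    Returns (user, file_id) pairs where a label removal was followed by an
    external share of the same file by the same user within `window`.
    """
    last_removal = {}   # (user, file_id) -> time of the most recent label removal
    hits = []
    for ts, user, op, file_id in sorted(events):
        key = (user, file_id)
        if op == "SensitivityLabelRemoved":
            last_removal[key] = ts
        elif op == "SharedExternally" and key in last_removal:
            if ts - last_removal[key] <= window:
                hits.append(key)
    return hits

t0 = datetime(2025, 1, 1, 14, 0)
events = [
    (t0, "dave", "SensitivityLabelRemoved", "model.xlsx"),
    (t0 + timedelta(minutes=10), "dave", "SharedExternally", "model.xlsx"),
    (t0, "erin", "SensitivityLabelRemoved", "notes.docx"),
    (t0 + timedelta(hours=5), "erin", "SharedExternally", "notes.docx"),
]
print(removed_then_shared(events))  # [('dave', 'model.xlsx')]
```

A short window keeps the signal specific; widening it trades precision for recall.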

KQL: DLP Policy Match Events

OfficeActivity
| where RecordType == 'ComplianceDLPSharePoint'
    or RecordType == 'ComplianceDLPExchange'
| where TimeGenerated > ago(7d)
| project TimeGenerated, UserId, Operation,
    PolicyName = tostring(PolicyDetails[0].PolicyName),
    RuleName = tostring(PolicyDetails[0].Rules[0].RuleName),
    Severity = tostring(PolicyDetails[0].Rules[0].Severity),
    ActionsTaken = tostring(PolicyDetails[0].Rules[0].ActionsTaken[0])
| summarize Violations = count() by UserId, PolicyName, ActionsTaken
| order by Violations desc

Alert when the number of events with ActionsTaken == 'Block' reaches 10 or more for the same user within 24 hours. Single-digit block counts are usually accidental policy violations. Ten or more in a day is unusual enough to investigate as a possible exfiltration attempt.
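
The threshold is a rolling-window count rather than a simple total. A minimal Python sketch of that logic follows, assuming events have been exported as (timestamp, user, action) tuples; `users_over_threshold` is a hypothetical helper and the sample data is invented for demonstration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def users_over_threshold(events, threshold=10, window=timedelta(hours=24)):
    """events: iterable of (timestamp, user, action) tuples.

    Returns the set of users with at least `threshold` Block actions
    inside any rolling `window`.
    """
    recent = defaultdict(deque)   # user -> timestamps of Block events still in the window
    flagged = set()
    for ts, user, action in sorted(events):
        if action != "Block":
            continue
        q = recent[user]
        q.append(ts)
        # Drop events that have aged out of the window ending at ts.
        while ts - q[0] >= window:
            q.popleft()
        if len(q) >= threshold:
            flagged.add(user)
    return flagged

base = datetime(2025, 1, 1, 9, 0)
events = (
    [(base + timedelta(minutes=30 * i), "alice", "Block") for i in range(10)]  # 10 in 4.5 hours
    + [(base + timedelta(hours=3 * i), "carol", "Block") for i in range(10)]   # 10 over 27 hours
    + [(base + timedelta(hours=i), "bob", "Block") for i in range(3)]          # 3 in total
)
print(users_over_threshold(events))  # {'alice'}
```

Carol has ten blocks in total but never ten inside a single 24-hour window, which is the distinction a plain count would miss.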

KQL: Detecting Overshared Labeled Content in SharePoint

OfficeActivity
| where Operation in ('SharingSet', 'SharingInvitationCreated')
| where TimeGenerated > ago(7d)
| where TargetUserOrGroupType == 'Guest'
| join kind=inner (
    OfficeActivity
    | where Operation == 'SensitivityLabelApplied'
    // Keep only the most recent label event per object to avoid duplicate rows
    | summarize arg_max(TimeGenerated, OfficeProperties) by OfficeObjectId
    | project OfficeObjectId, AppliedLabel = tostring(OfficeProperties)
  ) on OfficeObjectId
// 'Confidential' also matches 'HighlyConfidential'
| where AppliedLabel contains 'Confidential'
| project TimeGenerated, UserId, Operation, OfficeObjectId,
    TargetUserOrGroupType, AppliedLabel
| order by TimeGenerated desc

This surfaces labeled confidential content being shared with guest users. Review weekly.

Common Implementation Problems

Labels Not Appearing in Office Apps

The most common cause is label policy scope. If a label policy is scoped to a specific Microsoft 365 group or distribution list, users not in that group will not see the labels. Verify with:

Get-LabelPolicy -Identity 'corp-sensitivity-labels-all-users' |
  Select-Object Name, ExchangeLocation, SharePointLocation

If ExchangeLocation shows specific groups instead of All, expand the scope or create a second policy for the missing users.

Encryption Breaking M365 Features

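Encrypted files lose co-authoring, search indexing, and eDiscovery in SharePoint and OneDrive whenever the service cannot process the encrypted content. Two things determine that. First, the tenant must have sensitivity labels enabled for Office files in SharePoint and OneDrive; without it, the service treats encrypted files as opaque. Second, the label's encryption must use admin-defined permissions: labels that let users assign their own permissions, and labels that use Double Key Encryption (DKE), leave the service unable to open the file. When defining permissions, prefer Azure AD groups to named users so that access follows group membership. A customer-managed tenant key (BYOK) does not by itself block Search, eDiscovery, or Copilot for M365; DKE does. Reserve customer-held key configurations for the Highly Confidential label in regulated industries, and use Microsoft-managed keys for all other labels.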

Auto-Labeling Not Applying to Existing Content

Service-side auto-labeling does not treat all existing content the same way. In SharePoint and OneDrive it evaluates files at rest, but the service limits how many files it labels per day, so working through a large tenant can take days. In Exchange it applies only to messages in transit, so mail already sitting in mailboxes is not labeled retroactively. Monitor progress in the auto-labeling policy details page under the simulation or enforcement run status.

Integration with Purview for AI Governance

If your organization is deploying Microsoft Copilot for M365 or Azure AI Foundry, sensitivity labels are a primary control over what data those systems can access and surface. Copilot respects existing permissions, so an unlabeled SharePoint document that is shared too broadly may be surfaced to users who have access to it but would never have found it on their own. Run auto-labeling on your entire SharePoint estate before enabling Copilot for M365, not after. The [Purview AI governance guide](/blog/microsoft-purview-ai-governance-training-data) covers the AI-specific configuration in detail.

Comparison: Label-Based DLP vs. Sensitive Info Type DLP

Capability	Label-based DLP	SIT-based DLP
Works on encrypted files	Yes (label survives encryption)	No (cannot read ciphertext)
Works on non-text content (images)	Yes (label metadata)	Only with OCR scanning
Catches context-free sensitive data	Yes	No
Requires user training	Yes (label must be applied)	No (fully automatic)
False positive rate	Low (human applied)	Medium to high (pattern matching)
Coverage for legacy/unlabeled content	Only after auto-labeling runs	Immediate

The practical recommendation: deploy both. SIT-based DLP provides coverage before auto-labeling has processed all existing content, and catches new sensitive data types that were not included in the original label taxonomy. Label-based DLP provides the higher-fidelity signal once labeling coverage reaches 80% or more across the content estate.
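
The 80% figure is a simple ratio of labeled items to all items in scope. A small Python sketch of that calculation follows; `labeling_coverage` and `primary_signal` are hypothetical helpers, and the input is assumed to be a list of label names with `None` for unlabeled items.

```python
def labeling_coverage(items):
    """items: iterable of label names, with None for unlabeled items.

    Returns the labeled fraction as a value between 0 and 1.
    """
    items = list(items)
    if not items:
        return 0.0
    labeled = sum(1 for label in items if label is not None)
    return labeled / len(items)

def primary_signal(coverage, threshold=0.80):
    """Pick which DLP condition to treat as the higher-fidelity signal."""
    return "label-based" if coverage >= threshold else "SIT-based"

estate = ["General", "Confidential", None, "Public", "HighlyConfidential"]
coverage = labeling_coverage(estate)
print(coverage, primary_signal(coverage))  # 0.8 label-based
```

Both rule types stay deployed either way; the ratio only indicates which one to weight more heavily when triaging matches.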

Hardening Checklist

[ ] Label taxonomy finalized at 6 labels or fewer before publishing to production
[ ] Label policies published covering all Exchange, SharePoint, and OneDrive locations
[ ] Client-side auto-labeling configured on Confidential and Highly Confidential labels for PII sensitive info types
[ ] Service-side auto-labeling policy deployed in simulation mode before enforcement
[ ] Simulation results reviewed for false positives before enabling enforcement
[ ] Encryption configured with group-based permissions not named users, for all labels except Highly Confidential
[ ] Visual markings (header, footer, watermark) enabled for Confidential and above
[ ] DLP policy targeting labeled content deployed covering Exchange, SharePoint, OneDrive
[ ] DLP override with justification enabled for Confidential labels; no override for Highly Confidential
[ ] Endpoint DLP enabled for all Defender for Endpoint-onboarded devices
[ ] Unsanctioned cloud service domain list configured in Endpoint DLP settings
[ ] Label downgrade justification requirement enabled on all label policies
[ ] Activity Explorer review scheduled weekly for label downgrades and high-volume DLP block events
[ ] KQL alert deployed for DLP block events exceeding 10 per user per 24 hours
[ ] Labeled content guest sharing monitored via KQL alert
[ ] BYOK or Double Key Encryption restricted to the Highly Confidential label only
[ ] Auto-labeling run across existing SharePoint and OneDrive content completed before enabling Copilot for M365

Frequently Asked Questions

What is the difference between client-side and service-side auto-labeling in Microsoft Purview?

Client-side auto-labeling runs in Office apps on the user's device and prompts or automatically applies a label as the user creates or edits content. Service-side auto-labeling runs in the Microsoft 365 service as a background policy that scans files at rest in SharePoint and OneDrive and email in transit in Exchange. Service-side labeling is required to classify existing content at scale, but it can take days to process a large tenant and must be validated in simulation mode before enforcement is enabled.

How many sensitivity labels should an organization deploy?

In practice, end-user compliance tends to drop sharply beyond five or six labels because the distinctions become ambiguous. The recommended taxonomy uses four main labels (Public, General, Confidential, Highly Confidential) with two Confidential sublabels that differ in who can access the content. More granular classification should be handled through DLP policies and auto-labeling rules rather than by adding more label choices for users.

Can sensitivity labels be applied to content stored outside Microsoft 365?

Yes, but only for files that have been labeled by a label-aware application such as the Office apps or the Microsoft Purview Information Protection client (the successor to the Azure Information Protection unified labeling client). Sensitivity labels applied to Office files travel with the file as metadata, so a labeled Excel file uploaded to a non-Microsoft cloud storage service retains its label and encryption. However, Purview cannot scan or auto-label content stored natively in non-Microsoft data stores without third-party connectors.

Why does enabling encryption on a sensitivity label break co-authoring in SharePoint?

Co-authoring, indexing, version history, eDiscovery, and Copilot for M365 all depend on the service being able to process file contents. For encrypted files, that requires sensitivity labels to be enabled for Office files in SharePoint and OneDrive, and a label whose encryption uses admin-defined permissions. Labels that let users assign their own permissions, or that use Double Key Encryption, leave the service unable to open the file. Defining permissions with Azure AD groups rather than named individual users keeps access manageable as membership changes.

How does label-based DLP differ from sensitive information type DLP and when should you use both?

Label-based DLP uses the sensitivity label metadata as the policy condition, which means it works on encrypted files and any content where a human or auto-labeling policy has already applied a label. Sensitive information type DLP uses pattern matching (regex for credit card numbers, SSNs, etc.) on file content directly, so it provides immediate coverage for unlabeled data but cannot read encrypted files. The practical recommendation is to deploy both: SIT-based DLP gives coverage before auto-labeling has processed existing content, while label-based DLP provides higher-fidelity detection once labeling coverage reaches 80 percent or more.

Idan Ohayon
