Microsoft Purview Information Protection: Complete Setup Guide

Pattern-matching DLP fails when sensitive data has no recognizable format. This guide covers a complete Purview Information Protection deployment: label taxonomy design, service-side auto-labeling, DLP policies that use labels as conditions, and Endpoint DLP for managed devices.

Microsoft Cloud Solution Architect

When DLP Fails Because the Data Was Never Labeled

A financial services team ran a DLP policy in Microsoft Purview blocking email attachments with credit card numbers. A contractor exfiltrated a financial model spreadsheet by email. The spreadsheet contained no credit card numbers: it contained formulas, assumptions, and client names that were worth far more. DLP never fired because the policy matched only on credit card patterns, and nothing in the file fit one. The data was never classified, so DLP had nothing to match against.

Microsoft Purview Information Protection solves a different problem than DLP: it establishes classification at the point of content creation, so that DLP and other downstream controls have a reliable signal to act on. This guide covers a complete tenant setup: label taxonomy design, sensitivity label publishing, auto-labeling policies, and DLP policies that use those labels as conditions.

---

Designing the Label Taxonomy

The Most Common Mistake: Too Many Labels

Enterprise tenants routinely deploy 15-20 sensitivity labels and then find that users apply them inconsistently or not at all. In practice, adoption drops sharply once users have to choose among more than five or six labels.

The taxonomy that works in practice:

| Label | Color | Encryption | Use case |
| --- | --- | --- | --- |
| Public | Green | None | Marketing content, public docs |
| General | Blue | None | Internal business communications |
| Confidential | Yellow | None | Internal-only sensitive content |
| Confidential / All Employees | Yellow | Encrypt to org | Broad-access sensitive content |
| Confidential / Specific People | Orange | Encrypt to named group | Restricted project content |
| Highly Confidential | Red | Encrypt plus DRM restrictions | Trade secrets, M&A, regulated PII |

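Sublabels under Confidential give you the granularity you need for DLP policy targeting without multiplying the top-level labels that users have to reason about. The forward slash (/) in the names above denotes parent / sublabel, and the label picker renders the pair as a hierarchy. Be aware that once a label has sublabels, users choose one of the sublabels rather than the parent itself, so the bare Confidential entry acts mainly as the group heading.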
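A taxonomy like the one above can be checked before anything is published. The following is a minimal sketch, assuming the taxonomy is kept in source control as a simple list of objects; the function name `Test-LabelTaxonomy`, the `Name`/`Parent` property names, and the default limit of six are illustrative assumptions, not part of any Purview module.

```powershell
# Hypothetical taxonomy definition; names mirror the table above.
$taxonomy = @(
    [pscustomobject]@{ Name = 'Public';                       Parent = $null }
    [pscustomobject]@{ Name = 'General';                      Parent = $null }
    [pscustomobject]@{ Name = 'Confidential';                 Parent = $null }
    [pscustomobject]@{ Name = 'Confidential/All Employees';   Parent = 'Confidential' }
    [pscustomobject]@{ Name = 'Confidential/Specific People'; Parent = 'Confidential' }
    [pscustomobject]@{ Name = 'Highly Confidential';          Parent = $null }
)

function Test-LabelTaxonomy {
    param(
        [Parameter(Mandatory)] [object[]] $Labels,
        [int] $MaxLabels = 6   # assumed ceiling, taken from the guidance in this section
    )
    $problems = @()
    if ($Labels.Count -gt $MaxLabels) {
        $problems += "Taxonomy has $($Labels.Count) labels; the limit is $MaxLabels."
    }
    $names = $Labels | ForEach-Object { $_.Name }
    foreach ($label in $Labels) {
        # Every sublabel must point at a parent that is also in the list.
        if ($label.Parent -and ($names -notcontains $label.Parent)) {
            $problems += "Sublabel '$($label.Name)' references missing parent '$($label.Parent)'."
        }
    }
    foreach ($dup in ($names | Group-Object | Where-Object { $_.Count -gt 1 })) {
        $problems += "Duplicate label name '$($dup.Name)'."
    }
    $problems   # emits nothing when the taxonomy is clean
}

$issues = @(Test-LabelTaxonomy -Labels $taxonomy)
```

An empty `$issues` array means the list is within the size limit and every sublabel has a parent. Running a check like this in a pipeline is one way to keep the taxonomy from growing unnoticed.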

Label Settings That Matter for Security

Several label settings are invisible in the user experience but carry significant security implications:

  • Encryption with user-defined permissions: users can choose who gets access. This undermines audit trail completeness because the label event logs do not capture the recipients. Use predefined permissions with named groups instead.
  • Let users assign permissions when applying labels in Outlook: disabling this forces DLP to have consistent conditions. If users can choose Do Not Forward vs. Encrypt-Only, your DLP policies that match on label plus sensitivity may produce inconsistent results.
  • Content marking (headers, footers, watermarks): at Confidential and above, add visual markings. Watermarks on Highly Confidential documents make the classification hard to miss, which weakens any later claim that the user was unaware of it before forwarding.
  • Auto-labeling settings on the label itself: distinct from org-wide auto-labeling policies. Label-level auto-labeling applies when a user manually creates content and the sensitive information type is detected in Office apps. This is client-side detection; it runs on save, not on upload.

---

Configuring Sensitivity Labels via PowerShell

The Purview portal is adequate for initial setup, but any production tenant needs label configuration in source control. Labels are managed through Security & Compliance PowerShell, which ships in the ExchangeOnlineManagement module:

# Connect to Security and Compliance PowerShell
Connect-IPPSSession -UserPrincipalName admin@contoso.com

# Create the top-level Confidential label.
# Label color is set through the 'color' advanced setting.
New-Label `
  -Name 'Confidential' `
  -DisplayName 'Confidential' `
  -Tooltip 'Business content not meant for public distribution' `
  -AdvancedSettings @{color='#FFB900'}

# Create the sublabel Confidential / All Employees with encryption
New-Label `
  -Name 'ConfidentialAllEmployees' `
  -DisplayName 'All Employees' `
  -ParentId 'Confidential' `
  -Tooltip 'Confidential content accessible to all employees' `
  -AdvancedSettings @{color='#FFB900'} `
  -EncryptionEnabled $true `
  -EncryptionProtectionType 'Template' `
  -EncryptionTemplateId '<rms-template-id-for-all-employees>'

# Create Highly Confidential with stricter controls.
# 'Template' applies encryption; 'RemoveProtection' would strip it instead.
New-Label `
  -Name 'HighlyConfidential' `
  -DisplayName 'Highly Confidential' `
  -Tooltip 'Highest protection: trade secrets, M&A, regulated PII' `
  -AdvancedSettings @{color='#D83B01'} `
  -EncryptionEnabled $true `
  -EncryptionProtectionType 'Template' `
  -EncryptionTemplateId '<rms-template-id-for-highly-confidential>' `
  -ApplyContentMarkingHeaderEnabled $true `
  -ApplyContentMarkingHeaderText 'HIGHLY CONFIDENTIAL' `
  -ApplyContentMarkingHeaderFontSize 12 `
  -ApplyContentMarkingHeaderFontColor '#D83B01' `
  -ApplyWaterMarkingEnabled $true `
  -ApplyWaterMarkingText 'HIGHLY CONFIDENTIAL - DO NOT DISTRIBUTE' `
  -ApplyWaterMarkingFontSize 24

# Label order (priority) can be adjusted afterwards with Set-Label -Priority.
# Intended order here: Confidential < All Employees < Highly Confidential.

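After creating the labels (Public, General, and Confidential / Specific People are created the same way as the examples above), publish them via a label policy. Label policies are scoped to users and groups, so ExchangeLocation All publishes to every user: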

New-LabelPolicy `
  -Name 'corp-sensitivity-labels-all-users' `
  -Labels @('Public','General','Confidential','ConfidentialAllEmployees','ConfidentialSpecificPeople','HighlyConfidential') `
  -ExchangeLocation All `
  -ModernGroupLocation All
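The policy above names six labels, so it is worth confirming that every one of them exists before publishing. Here is a minimal sketch of that check; the helper name `Get-MissingPolicyLabel` is hypothetical, and the `$existing` list is hard-coded sample data standing in for what a live session might return from `(Get-Label).Name`.

```powershell
function Get-MissingPolicyLabel {
    param(
        [string[]] $PolicyLabels,
        [string[]] $ExistingLabels
    )
    # Return every label named in the policy that has not been created yet.
    $PolicyLabels | Where-Object { $ExistingLabels -notcontains $_ }
}

# Sample data; in a connected session this could be (Get-Label).Name instead.
$existing = @('Public', 'General', 'Confidential',
              'ConfidentialAllEmployees', 'HighlyConfidential')
$wanted   = @('Public', 'General', 'Confidential', 'ConfidentialAllEmployees',
              'ConfidentialSpecificPeople', 'HighlyConfidential')

$missing = @(Get-MissingPolicyLabel -PolicyLabels $wanted -ExistingLabels $existing)
if ($missing.Count -gt 0) {
    Write-Warning "Create these labels before publishing: $($missing -join ', ')"
}
```

With the sample data, the only gap is `ConfidentialSpecificPeople`. The comparison is case-insensitive, which matches how PowerShell treats label names passed as strings.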

---

Auto-Labeling Policies

Service-Side vs. Client-Side Auto-Labeling

The label configuration above supports client-side auto-labeling: when a user has a document open and saves it, the Office app detects sensitive information types and recommends or applies a label. Service-side auto-labeling runs in the cloud, scanning content in SharePoint, OneDrive, and Exchange before users interact with it.

Service-side auto-labeling is more powerful because it operates on existing content and on content uploaded without an Office app. Configure it separately in the Purview auto-labeling policy blade.

A critical distinction: service-side auto-labeling policies run in simulation mode first. You must run the simulation, review the results, and explicitly turn the policy on. Treating the simulation as a formality and enabling enforcement without reviewing the results will generate false positives and user complaints.

Auto-Labeling Policy for Regulated PII

# Create an auto-labeling policy targeting financial PII in SharePoint and OneDrive
New-AutoSensitivityLabelPolicy `
  -Name 'autolabel-financial-pii' `
  -ApplySensitivityLabel 'HighlyConfidential' `
  -Mode 'TestWithoutNotifications' `
  -SharePointLocation All `
  -OneDriveLocation All `
  -ExchangeLocation All

# Add sensitive information type rules to the policy.
# Rules are scoped per workload, so create one rule for each location.
$sensitiveTypes = @(
  @{Name='Credit Card Number'; minCount=1; confidenceLevel='High'},
  @{Name='U.S. Social Security Number (SSN)'; minCount=1; confidenceLevel='High'},
  @{Name='International Banking Account Number (IBAN)'; minCount=1; confidenceLevel='Medium'}
)

foreach ($workload in 'Exchange', 'SharePoint', 'OneDriveForBusiness') {
  New-AutoSensitivityLabelRule `
    -Name "autolabel-financial-pii-rule-$workload" `
    -Policy 'autolabel-financial-pii' `
    -ContentContainsSensitiveInformation $sensitiveTypes `
    -Workload $workload
}

After running for 7 days in simulation mode, review the simulation results in the Purview portal. Look specifically for false positives: documents that would be labeled Highly Confidential but should not be. Common false positive sources include test data files, development seed data, and documentation that describes the format of regulated data rather than containing actual regulated data.
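The false positive review described above can be partly scripted once the matched items are exported from the portal. This is a rough sketch under stated assumptions: the `FilePath` property, the sample rows, the path fragments, and the function name `Select-LikelyFalsePositive` are all hypothetical, and a real export will have its own column names.

```powershell
# Hypothetical rows standing in for an exported list of simulation matches.
$simulationHits = @(
    [pscustomobject]@{ FilePath = '/sites/finance/Shared Documents/Q3-payroll.xlsx' }
    [pscustomobject]@{ FilePath = '/sites/dev/Shared Documents/test-data/cards.csv' }
    [pscustomobject]@{ FilePath = '/sites/dev/Shared Documents/seed/customers.json' }
    [pscustomobject]@{ FilePath = '/sites/it/Shared Documents/docs/ssn-format-spec.docx' }
)

# Path fragments that often indicate test data, seed data, or format documentation.
$suspectPatterns = @('test-data', '/seed/', 'sample', 'format-spec')

function Select-LikelyFalsePositive {
    param([object[]] $Hits, [string[]] $Patterns)
    foreach ($hit in $Hits) {
        foreach ($pattern in $Patterns) {
            if ($hit.FilePath -like "*$pattern*") {
                $hit    # emit the hit once, then move on to the next file
                break
            }
        }
    }
}

$review = @(Select-LikelyFalsePositive -Hits $simulationHits -Patterns $suspectPatterns)
$rate   = [math]::Round([double](100 * $review.Count / $simulationHits.Count), 1)
'{0} of {1} matches ({2}%) need manual review' -f $review.Count, $simulationHits.Count, $rate
```

A path heuristic like this only narrows the list for a human reviewer. It does not replace opening a sample of the flagged files before deciding whether to exclude a site or tune the rule.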

To promote from simulation to enforcement:

Set-AutoSensitivityLabelPolicy `
  -Identity 'autolabel-financial-pii' `
  -Mode 'Enable'

---

DLP Policies Using Sensitivity Labels

Why Label-Based DLP Outperforms Pattern-Matching DLP Alone

Pattern-matching DLP looks for specific formats: 16-digit credit card numbers, US SSN formats, and similar recognizable patterns. It fails on:

  • Context-free sensitive data (the financial model scenario from the introduction)
  • Encrypted files where patterns cannot be read
  • Images of text without OCR scanning
  • Content already labeled by a human who understood the sensitivity

Label-based DLP conditions cover these cases because the label is applied before the data leaves the creation point and travels with the file as metadata. A DLP policy that matches on the Highly Confidential label evaluates the same way regardless of file format, encryption state, or content type, provided the content was labeled and the exfiltration channel is one the policy covers.
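The difference between the two approaches can be shown with a toy model. The sketch below is not how Purview evaluates content; `Test-PatternMatch` and `Test-LabelMatch` are hypothetical stand-ins, the regular expression is a deliberately simplified 16-digit check, and the two sample documents are invented to mirror the scenario from the introduction.

```powershell
# Simplified stand-in for pattern-matching DLP:
# 16 digits, optionally grouped in fours by spaces or dashes.
function Test-PatternMatch {
    param([string] $Content)
    $Content -match '\b(?:\d{4}[ -]?){3}\d{4}\b'
}

# Simplified stand-in for label-based DLP: only the label metadata is inspected.
function Test-LabelMatch {
    param([string] $Label, [string[]] $BlockedLabels)
    $BlockedLabels -contains $Label
}

$financialModel = [pscustomobject]@{
    Name    = 'financial-model.xlsx'
    Content = 'Client: Northwind. Assumed growth 4.2%. NPV = SUM(B2:B40)/(1+r)^n'
    Label   = 'HighlyConfidential'
}
$cardExport = [pscustomobject]@{
    Name    = 'card-export.txt'
    Content = 'Card on file: 4111 1111 1111 1111'
    Label   = $null
}
$blocked = @('HighlyConfidential')

$results = foreach ($doc in $financialModel, $cardExport) {
    [pscustomobject]@{
        Name       = $doc.Name
        PatternHit = Test-PatternMatch -Content $doc.Content
        LabelHit   = Test-LabelMatch -Label $doc.Label -BlockedLabels $blocked
    }
}
$results
```

The financial model is caught only by the label check and the unlabeled card export only by the pattern check, which is the argument for running both kinds of condition side by side.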

DLP Policy for Exchange: Block External Sharing of Confidential Content

# Create DLP policy blocking external email of labeled confidential content
New-DlpCompliancePolicy `
  -Name 'dlp-block-confidential-external-email' `
  -Comment 'Block external email of Confidential and above content' `
  -ExchangeLocation All `
  -Mode 'Enable'

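# Label condition: match any of the listed sensitivity labels.
# Use the label identifiers as returned by Get-Label, and verify the condition
# syntax against the New-DlpComplianceRule reference for your module version.
$labelCondition = @{operator='And'; groups=@(@{operator='Or'; name='Default'; labels=@(
  @{name='Confidential'; type='Sensitivity'},
  @{name='ConfidentialAllEmployees'; type='Sensitivity'},
  @{name='HighlyConfidential'; type='Sensitivity'}
)})}

New-DlpComplianceRule `
  -Name 'block-confidential-external-email-rule' `
  -Policy 'dlp-block-confidential-external-email' `
  -AccessScope 'NotInOrganization' `
  -ContentContainsSensitiveInformation $labelCondition `
  -BlockAccess $true `
  -NotifyUser 'LastModifier' `
  -NotifyUserType 'Email' `
  -GenerateIncidentReport 'SiteAdmin' `
  -IncidentReportContent @('All')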

For the highest-sensitivity labels, block with BlockAccessScope 'All' and do not offer an override. For Confidential, allow override with a business justification and audit every override event.

DLP Policy for SharePoint: Restrict Access to Labeled Documents

# Block sharing of Highly Confidential documents externally in SharePoint
New-DlpCompliancePolicy `
  -Name 'dlp-block-hc-sharepoint-external' `
  -Comment 'Prevent external sharing of Highly Confidential content in SharePoint' `
  -SharePointLocation All `
  -OneDriveLocation All `
  -Mode 'Enable'

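# Label condition for Highly Confidential (identifier as returned by Get-Label)
$hcCondition = @{operator='And'; groups=@(@{operator='Or'; name='Default'; labels=@(
  @{name='HighlyConfidential'; type='Sensitivity'}
)})}

# BlockAccessScope 'PerUser' blocks external users
New-DlpComplianceRule `
  -Name 'block-hc-sharepoint-external-rule' `
  -Policy 'dlp-block-hc-sharepoint-external' `
  -AccessScope 'NotInOrganization' `
  -ContentContainsSensitiveInformation $hcCondition `
  -BlockAccess $true `
  -BlockAccessScope 'PerUser'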

---

Endpoint DLP: Extending Protection to Managed Devices

Endpoint DLP extends label-based policies to device-level activities: copy to USB, print, upload to non-corporate cloud storage, and paste into unsanctioned applications. It requires Microsoft Defender for Endpoint onboarding and is configured under the same DLP policy framework in Purview.

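Endpoint DLP is built into Windows 10 1809 and later once the device is onboarded; devices already onboarded to Defender for Endpoint use the same onboarding. The Microsoft Purview browser extension is needed only to cover Chrome and Firefox (Edge is supported natively) and can be deployed via Intune as a required app.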

Once Endpoint DLP is active, add device scope to existing label-based DLP policies:

# Add Endpoint DLP to the existing confidential email policy
Set-DlpCompliancePolicy `
  -Identity 'dlp-block-confidential-external-email' `
  -AddEndpointDlpLocation All

# Create a separate rule for device-specific activities
New-DlpComplianceRule `
  -Name 'block-hc-usb-copy-endpoint' `
  -Policy 'dlp-block-confidential-external-email' `
  -ContentContainsSensitiveInformation @{operator='And'; groups=@(@{operator='Or'; name='Default'; labels=@(@{name='HighlyConfidential'; type='Sensitivity'})})} `
  -EndpointDlpRestrictions @(
    @{Setting='RemovableMedia'; Value='Block'},
    @{Setting='Print'; Value='Audit'},
    @{Setting='CloudEgress'; Value='Block'},
    @{Setting='NetworkShare'; Value='Audit'}
  )

Print and network share copy are deliberately audit-only for the initial rollout. Promote them to Block after 30 days of audit data confirms that legitimate workflows are not being caught.
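The audit-then-block rollout can be written down as a small decision rule so that promotion is not left to memory. The following is a sketch only: the function name `Get-EndpointActionMode`, the activity names, and the `LegitimateHits` counter are hypothetical and are not values accepted by any Purview cmdlet.

```powershell
function Get-EndpointActionMode {
    param(
        [Parameter(Mandatory)] [string]   $Activity,
        [Parameter(Mandatory)] [datetime] $RolloutStart,
        [datetime] $Now = (Get-Date),
        [int]      $AuditDays = 30,       # audit window from the text above
        [int]      $LegitimateHits = 0    # audited events confirmed as valid workflows
    )
    # Hypothetical activity names; USB copy and cloud upload block from day one.
    $blockFromDayOne = @('UsbCopy', 'CloudUpload')
    if ($blockFromDayOne -contains $Activity) { return 'Block' }

    # Everything else stays in audit until the window has passed
    # and no legitimate workflow was caught during it.
    $elapsedDays = ($Now - $RolloutStart).TotalDays
    if ($elapsedDays -ge $AuditDays -and $LegitimateHits -eq 0) { return 'Block' }
    return 'Audit'
}

$start = Get-Date '2024-01-01'
Get-EndpointActionMode -Activity 'Print' -RolloutStart $start -Now (Get-Date '2024-01-15')
Get-EndpointActionMode -Activity 'Print' -RolloutStart $start -Now (Get-Date '2024-02-05')
Get-EndpointActionMode -Activity 'NetworkShareCopy' -RolloutStart $start -Now (Get-Date '2024-02-05') -LegitimateHits 14
```

The three calls return Audit, Block, and Audit in that order: printing is promoted once 30 days pass without a legitimate hit, while network share copy stays in audit because real workflows were still being caught.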

Unsanctioned App Restrictions

Endpoint DLP can block pasting or uploading labeled content to specific browser destinations and desktop applications. Define your unsanctioned cloud service domains list in the Purview portal under Settings > Endpoint DLP. Common additions:

  • dropbox.com
  • wetransfer.com
  • mega.nz
  • pastebin.com
  • Personal OneDrive tenants (not your corporate tenant)
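Before pasting a domain list into the portal, it can help to sanity-check which URLs the list would and would not cover. The sketch below illustrates plain suffix matching on the host name; it is an assumption for illustration, not a description of how Purview evaluates destinations, and `Test-UnsanctionedDestination` is a hypothetical helper.

```powershell
$unsanctioned = @('dropbox.com', 'wetransfer.com', 'mega.nz', 'pastebin.com')

function Test-UnsanctionedDestination {
    param([string] $Url, [string[]] $Domains)
    $uri = $null
    # Reject anything that is not an absolute URL.
    if (-not [System.Uri]::TryCreate($Url, [System.UriKind]::Absolute, [ref] $uri)) {
        return $false
    }
    $hostName = $uri.Host.ToLowerInvariant()
    foreach ($domain in $Domains) {
        # Match the domain itself or any subdomain, but not look-alike names.
        if ($hostName -eq $domain -or $hostName.EndsWith(".$domain")) {
            return $true
        }
    }
    return $false
}

Test-UnsanctionedDestination -Url 'https://www.dropbox.com/upload' -Domains $unsanctioned
Test-UnsanctionedDestination -Url 'https://notdropbox.com/' -Domains $unsanctioned
Test-UnsanctionedDestination -Url 'https://contoso.sharepoint.com/sites/hr' -Domains $unsanctioned
```

Matching on a leading dot is what keeps `notdropbox.com` from being treated as a Dropbox subdomain. Personal OneDrive cannot be expressed as a simple domain suffix, which is why it needs separate handling from the rest of the list.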

---

Monitoring with KQL and the Compliance Portal

KQL: Label Activity from the Unified Audit Log

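The queries below assume that Microsoft 365 audit records are landing in the OfficeActivity table of a Log Analytics workspace, for example through the Microsoft 365 data connector in Microsoft Sentinel. Adjust table and column names to match your own ingestion path. Query label operations: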

OfficeActivity
| where Operation in ('SensitivityLabelApplied', 'SensitivityLabelChanged', 'SensitivityLabelRemoved')
| where TimeGenerated > ago(30d)
| project TimeGenerated, UserId, Operation,
    FileName = tostring(OfficeObjectId),
    LabelName = extract('"SensitivityLabel":"([^"]+)"', 1, tostring(OfficeProperties)),
    ClientIP = ClientIP
| summarize LabelEvents = count() by UserId, Operation, LabelName
| order by LabelEvents desc

Pay attention to high volumes of SensitivityLabelRemoved from a single user. Label removal can be made to require a justification (a label policy setting), but it is not blocked by default. Bulk label removal shortly before files are sent externally is a pre-exfiltration signal.

KQL: DLP Policy Match Events

OfficeActivity
| where RecordType == 'ComplianceDLPSharePoint'
    or RecordType == 'ComplianceDLPExchange'
| where TimeGenerated > ago(7d)
| project TimeGenerated, UserId, Operation,
    PolicyName = tostring(PolicyDetails[0].PolicyName),
    RuleName = tostring(PolicyDetails[0].Rules[0].RuleName),
    Severity = tostring(PolicyDetails[0].Rules[0].Severity),
    ActionsTaken = tostring(PolicyDetails[0].Rules[0].ActionsTaken[0])
| summarize Violations = count() by UserId, PolicyName, ActionsTaken
| order by Violations desc

Alert when a single user accumulates more than 10 events with ActionsTaken == 'Block' in 24 hours. Single-digit block counts are usually accidental policy violations. More than ten in a day is a strong enough signal of a possible exfiltration attempt to warrant investigation.
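The same threshold can be applied to exported events outside the workspace, for example when testing the rule against historical data. This is a minimal sketch, assuming each record carries `UserId`, `ActionsTaken`, and `TimeGenerated` properties like the projected columns above; the function name `Get-DlpBlockAlert` and the sample events are hypothetical.

```powershell
function Get-DlpBlockAlert {
    param(
        [object[]] $Events,
        [datetime] $Now = (Get-Date),
        [int]      $Threshold = 10
    )
    $windowStart = $Now.AddHours(-24)
    $Events |
        Where-Object {
            $_.ActionsTaken -eq 'Block' -and
            $_.TimeGenerated -gt $windowStart -and
            $_.TimeGenerated -le $Now
        } |
        Group-Object -Property UserId |
        Where-Object { $_.Count -gt $Threshold } |   # strictly more than the threshold
        ForEach-Object { [pscustomobject]@{ UserId = $_.Name; Blocks = $_.Count } }
}

# Invented sample data: 12 blocks for one user, 3 for another, all within a day.
$now = Get-Date '2024-05-01T12:00:00'
$sampleEvents = @(
    1..12 | ForEach-Object {
        [pscustomobject]@{ UserId = 'alice@contoso.com'; ActionsTaken = 'Block'
                           TimeGenerated = $now.AddMinutes(-10 * $_) }
    }
    1..3 | ForEach-Object {
        [pscustomobject]@{ UserId = 'bob@contoso.com'; ActionsTaken = 'Block'
                           TimeGenerated = $now.AddHours(-$_) }
    }
)

$alerts = @(Get-DlpBlockAlert -Events $sampleEvents -Now $now)
$alerts
```

With the sample data only the first user is returned, with 12 blocks. Passing `-Now` explicitly keeps the function deterministic, which makes it easier to replay old exports when tuning the threshold.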

KQL: Detecting Overshared Labeled Content in SharePoint

OfficeActivity
| where Operation in ('SharingSet', 'SharingInvitationCreated')
| where TimeGenerated > ago(7d)
| join kind=inner (
    OfficeActivity
    | where Operation == 'SensitivityLabelApplied'
    | project OfficeObjectId, AppliedLabel = tostring(OfficeProperties)
  ) on OfficeObjectId
| where AppliedLabel contains 'Confidential'  // also matches 'HighlyConfidential'
| project TimeGenerated, UserId, Operation, OfficeObjectId,
    TargetUserOrGroupType, AppliedLabel
| where TargetUserOrGroupType == 'Guest'
| order by TimeGenerated desc

This surfaces labeled confidential content being shared with guest users. Review weekly.

---

Common Implementation Problems

Labels Not Appearing in Office Apps

The most common cause is label policy scope. If a label policy is scoped to a specific Microsoft 365 group or distribution list, users not in that group will not see the labels. Verify with:

Get-LabelPolicy -Identity 'corp-sensitivity-labels-all-users' |
  Select-Object Name, ExchangeLocation, SharePointLocation

If ExchangeLocation shows specific groups instead of All, expand the scope or create a second policy for the missing users.

Encryption Breaking M365 Features

Some encryption configurations prevent SharePoint and OneDrive from processing file contents, which breaks co-authoring, search indexing, eDiscovery, and service-side DLP for those files. The main culprits are labels that let users assign their own permissions and labels that use Double Key Encryption (DKE). Prefer admin-defined permissions assigned to Microsoft Entra ID (formerly Azure AD) groups rather than to individually named users; groups also keep access current as people join and leave. Customer-managed tenant keys (BYOK) do not by themselves block these services, but DKE does, and that includes Copilot for M365. Reserve DKE for Highly Confidential content in regulated industries and use Microsoft-managed keys for all other labels.

Auto-Labeling Not Applying to Existing Content

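Service-side auto-labeling does label files already at rest in SharePoint and OneDrive, but it runs as a throttled background job and can take days or longer on large tenants. Exchange is different: policies label mail as it is sent or received, not items already sitting in mailboxes. If existing files stay unlabeled after enforcement is on, check whether they already carry a manually applied label (which auto-labeling will not replace), whether the file type is supported, and whether the site is actually in the policy's scope. Monitor progress in the auto-labeling policy details page under the simulation or enforcement run status.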

Integration with Purview for AI Governance

If your organization is deploying Microsoft Copilot for M365 or Azure AI Foundry, sensitivity labels are a key control over what data those systems can access and surface, alongside the underlying permissions. An unlabeled, overshared SharePoint document may be surfaced by Copilot to any user who technically has access, even if that user would never have found it on their own. Run auto-labeling on your entire SharePoint estate before enabling Copilot for M365, not after. The Purview AI governance guide covers the AI-specific configuration in detail.

---

Comparison: Label-Based DLP vs. Sensitive Info Type DLP

| Capability | Label-based DLP | SIT-based DLP |
| --- | --- | --- |
| Works on encrypted files | Yes (label survives encryption) | No (cannot read ciphertext) |
| Works on non-text content (images) | Yes (label metadata) | Only with OCR scanning |
| Catches context-free sensitive data | Yes | No |
| Requires user training | Yes (label must be applied) | No (fully automatic) |
| False positive rate | Low (human applied) | Medium to high (pattern matching) |
| Coverage for legacy/unlabeled content | Only after auto-labeling runs | Immediate |

The practical recommendation: deploy both. SIT-based DLP provides coverage before auto-labeling has processed all existing content, and catches new sensitive data types that were not included in the original label taxonomy. Label-based DLP provides the higher-fidelity signal once labeling coverage reaches 80% or more across the content estate.

---

Hardening Checklist

  • [ ] Label taxonomy finalized at 6 labels or fewer before publishing to production
  • [ ] Label policies published covering all Exchange, SharePoint, and OneDrive locations
  • [ ] Client-side auto-labeling configured on Confidential and Highly Confidential labels for PII sensitive info types
  • [ ] Service-side auto-labeling policy deployed in simulation mode before enforcement
  • [ ] Simulation results reviewed for false positives before enabling enforcement
  • [ ] Encryption configured with group-based permissions, not named users, for all labels except Highly Confidential
  • [ ] Visual markings (header, footer, watermark) enabled for Confidential and above
  • [ ] DLP policy targeting labeled content deployed covering Exchange, SharePoint, OneDrive
  • [ ] DLP override with justification enabled for Confidential labels; no override for Highly Confidential
  • [ ] Endpoint DLP enabled for all Defender for Endpoint-onboarded devices
  • [ ] Unsanctioned cloud service domain list configured in Endpoint DLP settings
  • [ ] Label downgrade justification requirement enabled on all label policies
  • [ ] Activity Explorer review scheduled weekly for label downgrades and high-volume DLP block events
  • [ ] KQL alert deployed for DLP block events exceeding 10 per user per 24 hours
  • [ ] Labeled content guest sharing monitored via KQL alert
  • [ ] Double Key Encryption (DKE), where used, restricted to the Highly Confidential label only
  • [ ] Auto-labeling run to completion across existing SharePoint and OneDrive content before enabling Copilot for M365