L8. Data Connectors: Ingesting Logs at Scale
Data connectors are the pipeline for getting security data into Microsoft Sentinel. This lesson covers connector types, the Azure Monitor Agent, CEF/Syslog ingestion, and custom log strategies for the SC-200 exam.
Data Connector Categories
Sentinel supports 300+ data connectors organized by type:
| Connector Type | Examples | Mechanism |
|---|---|---|
| Microsoft service-to-service | Microsoft 365, Entra ID, Defender XDR | Native API integration |
| Azure service | Azure Firewall, Key Vault, Network Security Groups | Diagnostic settings |
| Syslog/CEF | Linux hosts, network appliances, firewalls | Agent-based collection |
| REST API | Custom applications, third-party SaaS | API polling or webhook |
| Custom logs (DCR-based) | Any structured log source | Azure Monitor Agent + DCR |
| Codeless Connector Platform (CCP) | Partner-built connectors | Standardized API framework |
Microsoft Service Connectors
Microsoft first-party connectors are the simplest to configure. Key connectors:
- Microsoft Defender XDR: Ingests incidents, alerts, and raw event data from all Defender workloads
- Microsoft Entra ID: Sign-in logs, audit logs, provisioning logs, risky users, risk detections
- Office 365: Exchange, SharePoint, and Teams activity logs (free ingestion)
- Azure Activity: Subscription-level management operations (free ingestion)
Azure Monitor Agent (AMA)
The Azure Monitor Agent replaced the legacy Log Analytics Agent (MMA) for collecting logs from Windows and Linux machines. Key concepts:
- Data Collection Rules (DCR): Define which logs to collect, how to filter them, and where to send them
- Data Collection Endpoints (DCE): Network endpoints the agent connects to for uploading data
- Transformations: KQL-based transforms that filter or modify data during ingestion (before it hits the workspace)
DCR-based collection allows you to:
- Filter events at collection time (reducing ingestion costs)
- Route different log types to different workspaces
- Apply transformations to normalize or enrich data
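To make the DCR pieces concrete, here is a minimal sketch of a DCR body built as a Python dictionary. It shows the two filtering points described above: an XPath query that the agent applies at collection time, and a `transformKql` statement applied in the ingestion pipeline. The region, names, and resource ID are placeholder assumptions, and the filter values are illustrative rather than a recommended baseline.

```python
import json

# Hypothetical identifiers: replace with values from your own environment.
WORKSPACE_RESOURCE_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"
)
DESTINATION_NAME = "sentinelWorkspace"  # arbitrary label referenced by the data flow


def build_windows_event_dcr(workspace_resource_id: str) -> dict:
    """Return a DCR body that collects Windows events and drops informational ones."""
    return {
        "location": "eastus",  # assumed region
        "properties": {
            # What to collect
            "dataSources": {
                "windowsEventLogs": [
                    {
                        "name": "systemAndApplicationEvents",
                        "streams": ["Microsoft-Event"],
                        # Agent-side filter: levels 1-3 are Critical, Error, Warning
                        "xPathQueries": [
                            "System!*[System[(Level=1 or Level=2 or Level=3)]]",
                            "Application!*[System[(Level=1 or Level=2 or Level=3)]]",
                        ],
                    }
                ]
            },
            # Where to send it
            "destinations": {
                "logAnalytics": [
                    {
                        "name": DESTINATION_NAME,
                        "workspaceResourceId": workspace_resource_id,
                    }
                ]
            },
            # How to route and transform it
            "dataFlows": [
                {
                    "streams": ["Microsoft-Event"],
                    "destinations": [DESTINATION_NAME],
                    # Ingestion-time transformation: runs before data is stored
                    "transformKql": "source | where EventLevelName != 'Information'",
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_windows_event_dcr(WORKSPACE_RESOURCE_ID), indent=2))
```

In practice the XPath filter is usually the cheaper place to drop events because they never leave the machine; the transformation is useful when the filter depends on parsed columns or when the source cannot filter for itself.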
Syslog and CEF Collection
For network appliances, firewalls, and Linux hosts:
- Syslog: Standard Unix logging protocol. Collected by the AMA, either directly on a Linux host or on a Linux forwarder that relays logs from appliances
- CEF (Common Event Format): Structured format carried over syslog. Uses the same AMA-based collection but is parsed into the CommonSecurityLog table
Architecture for CEF/Syslog:
- Network appliance sends logs to a Linux log forwarder (rsyslog or syslog-ng)
- Azure Monitor Agent on the forwarder collects and sends to Sentinel
- Syslog data lands in the Syslog table; CEF data lands in CommonSecurityLog
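To illustrate why CEF data ends up in a structured table, the sketch below parses a CEF message locally and names each header field after the CommonSecurityLog column it populates. This is an illustration of the format only, not how the agent itself is implemented; the sample message, the `cef_version` key, and the simplified escape handling are assumptions made for the example.

```python
import re

# CEF header fields, in order, named after the CommonSecurityLog columns they populate.
CEF_HEADER_COLUMNS = [
    "DeviceVendor",
    "DeviceProduct",
    "DeviceVersion",
    "DeviceEventClassID",
    "Activity",
    "LogSeverity",
]


def parse_cef(line: str) -> dict:
    """Split a CEF message into header columns and extension key/value pairs."""
    start = line.find("CEF:")
    if start == -1:
        raise ValueError("not a CEF message")
    # Split on pipes that are not escaped with a backslash (simplified handling).
    parts = re.split(r"(?<!\\)\|", line[start:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    record = {"cef_version": parts[0][len("CEF:"):]}  # local key, not a table column
    for column, value in zip(CEF_HEADER_COLUMNS, parts[1:7]):
        record[column] = value.replace("\\|", "|")
    # Extension: space-separated key=value pairs; values may contain spaces.
    extension = parts[7] if len(parts) > 7 else ""
    record["Extension"] = dict(re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", extension))
    return record


if __name__ == "__main__":
    # Hypothetical message as a forwarder might receive it, syslog prefix included.
    sample = (
        "<134>Jan 15 10:00:00 fw01 "
        "CEF:0|Contoso|EdgeFirewall|1.0|100|Connection blocked|5|"
        "src=10.0.0.5 dst=203.0.113.7 spt=51514"
    )
    print(parse_cef(sample))
```

A plain syslog message has no such header, which is why it lands in the Syslog table with the message body largely unparsed.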
Custom Logs and the Logs Ingestion API
For sources without a built-in connector, use the Logs Ingestion API:
- Create a custom table in the Log Analytics workspace
- Define a DCR with the table schema and any transformations
- Send data via the REST API using Microsoft Entra authentication
This is the recommended approach for custom application logs, third-party tools without native connectors, and IoT devices.
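The final step can be sketched with the Python standard library. The function below only builds the HTTP request so the shape of the call is visible: a POST to the ingestion endpoint, addressed by the DCR's immutable ID and a stream name, carrying a JSON array of records. The endpoint, DCR ID, stream name, table columns, and token shown here are hypothetical placeholders, and the API version is the one assumed for this sketch.

```python
import json
import urllib.request

API_VERSION = "2023-01-01"  # assumed API version for this sketch


def build_ingestion_request(
    endpoint: str,
    dcr_immutable_id: str,
    stream_name: str,
    records: list,
    bearer_token: str,
) -> urllib.request.Request:
    """Build (but do not send) a Logs Ingestion API upload request."""
    url = (
        f"{endpoint.rstrip('/')}/dataCollectionRules/{dcr_immutable_id}"
        f"/streams/{stream_name}?api-version={API_VERSION}"
    )
    body = json.dumps(records).encode("utf-8")  # payload is a JSON array of records
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",  # Microsoft Entra access token
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    request = build_ingestion_request(
        endpoint="https://my-dce.eastus-1.ingest.monitor.azure.com",
        dcr_immutable_id="dcr-00000000000000000000000000000000",
        stream_name="Custom-AppEvents_CL",
        records=[
            {
                "TimeGenerated": "2024-01-15T10:00:00Z",
                "User": "alice",
                "Action": "login_failed",
            }
        ],
        bearer_token="<access-token>",
    )
    print(request.get_method(), request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)` once a real token is in hand. The token is typically requested for the `https://monitor.azure.com/.default` scope, and the calling identity needs permission on the DCR. Microsoft also publishes client libraries that wrap this call, which are usually preferable to hand-built requests in production.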
Connector Health
Monitor connector health through:
- Data connectors page: Shows last log received timestamp and status
- Health monitoring workbook: Visualizes ingestion latency and volume anomalies
- Heartbeat table: For agent-based connectors, tracks agent check-in status
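As a rough illustration of the Heartbeat check, the sketch below pairs a KQL query that lists agents with no recent check-in with a small Python function that applies the same rule to timestamps you already hold. The 15-minute threshold and the computer names are assumptions chosen for the example, not recommended values.

```python
from datetime import datetime, timedelta, timezone

# KQL you could run in the workspace: agents whose latest heartbeat is older than 15 minutes.
HEARTBEAT_STALENESS_KQL = """
Heartbeat
| summarize LastHeartbeat = max(TimeGenerated) by Computer
| where LastHeartbeat < ago(15m)
"""


def stale_agents(last_heartbeat: dict, now: datetime,
                 threshold: timedelta = timedelta(minutes=15)) -> list:
    """Return the computers whose last heartbeat is older than the threshold."""
    return sorted(
        computer
        for computer, seen in last_heartbeat.items()
        if now - seen > threshold
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical last-seen times, e.g. the result of the summarize step above.
    observed = {
        "cef-forwarder-01": now - timedelta(minutes=2),
        "dc01": now - timedelta(hours=1),
    }
    print(stale_agents(observed, now))
```

A silent log forwarder is worth alerting on in particular, because every appliance behind it stops reporting at the same moment.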
Key Takeaways
- ✓ The Defender XDR connector provides free alert/incident ingestion, but raw advanced hunting tables are billable.
- ✓ Azure Monitor Agent (AMA) replaced the legacy Log Analytics Agent (MMA) for log collection.
- ✓ Data Collection Rules (DCR) define what to collect, how to filter, and where to send logs.
- ✓ Ingestion-time transformations use KQL in the DCR to filter or modify data before storage, reducing costs.
- ✓ CEF logs land in the CommonSecurityLog table; standard Syslog lands in the Syslog table.
- ✓ The Logs Ingestion API enables custom log sources to send data to Sentinel via REST.
Knowledge Check
1. Which component defines what logs the Azure Monitor Agent collects and where it sends them?
2. A firewall appliance sends CEF-formatted logs to a Linux forwarder running the Azure Monitor Agent. Which Sentinel table receives these logs?
3. An organization wants to reduce Sentinel ingestion costs by filtering out informational Windows events before they reach the workspace. Which feature should they use?