Cyber Intelligence
Microsoft Sentinel · 50-55% of exam

L8. Data Connectors: Ingesting Logs at Scale

Data connectors are the pipeline for getting security data into Microsoft Sentinel. This lesson covers connector types, the Azure Monitor Agent, CEF/Syslog ingestion, and custom log strategies for the SC-200 exam.

Data Connector Categories

Sentinel supports 300+ data connectors organized by type:

Connector Type                    | Examples                                   | Mechanism
----------------------------------+--------------------------------------------+---------------------------
Microsoft service-to-service      | Microsoft 365, Entra ID, Defender XDR      | Native API integration
Azure service                     | Azure Firewall, Key Vault, NSG Flow Logs   | Diagnostic settings
Syslog/CEF                        | Linux hosts, network appliances, firewalls | Agent-based collection
REST API                          | Custom applications, third-party SaaS      | API polling or webhook
Custom logs (DCR-based)           | Any structured log source                  | Azure Monitor Agent + DCR
Codeless Connector Platform (CCP) | Partner-built connectors                   | Standardized API framework

Microsoft Service Connectors

Microsoft first-party connectors are the simplest to configure. Key connectors:

  • Microsoft Defender XDR: Ingests incidents, alerts, and raw event data from all Defender workloads
  • Microsoft Entra ID: Sign-in logs, audit logs, provisioning logs, risky users, risk detections
  • Office 365: Exchange, SharePoint, and Teams activity logs (free ingestion)
  • Azure Activity: Subscription-level management operations (free ingestion)

Exam tip: The Defender XDR connector can ingest raw event tables (DeviceProcessEvents, EmailEvents, etc.) into Sentinel. This raw data is called "advanced hunting data" and is billable, whereas alerts and incidents are ingested for free.

Azure Monitor Agent (AMA)

The Azure Monitor Agent replaced the legacy Log Analytics Agent (MMA) for collecting logs from Windows and Linux machines. Key concepts:

  • Data Collection Rules (DCR): Define which logs to collect, how to filter them, and where to send them
  • Data Collection Endpoints (DCE): Network endpoints the agent connects to for uploading data
  • Transformations: KQL-based transforms that filter or modify data during ingestion (before it hits the workspace)

DCR-based collection allows you to:

  • Filter events at collection time (reducing ingestion costs)
  • Route different log types to different workspaces
  • Apply transformations to normalize or enrich data

Exam tip: Ingestion-time transformations reduce costs by filtering out unwanted events before they are stored. The transformation is defined as a KQL query in the DCR.
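
To make the DCR concepts above concrete, here is a minimal sketch in Python that builds the core of a DCR as a plain dictionary. It is illustrative only: the function name, the data source name `systemEvents`, the destination name `sentinelWorkspace`, and the choice of filters are assumptions for this example, and a real DCR needs additional properties (such as location) before it can be deployed. It shows both places where filtering can happen: an XPath query that limits what the agent collects, and a `transformKql` statement that drops rows at ingestion time.

```python
import json


def build_dcr_fragment(workspace_resource_id: str) -> dict:
    """Return an illustrative DCR 'properties' fragment (not a complete DCR)."""
    return {
        "dataSources": {
            "windowsEventLogs": [
                {
                    "name": "systemEvents",  # hypothetical name
                    "streams": ["Microsoft-Event"],
                    # Collection-time filter: only Critical (1), Error (2),
                    # and Warning (3) events from the System log are collected.
                    "xPathQueries": [
                        "System!*[System[(Level=1 or Level=2 or Level=3)]]"
                    ],
                }
            ]
        },
        "destinations": {
            "logAnalytics": [
                {
                    "name": "sentinelWorkspace",  # hypothetical name
                    "workspaceResourceId": workspace_resource_id,
                }
            ]
        },
        "dataFlows": [
            {
                "streams": ["Microsoft-Event"],
                "destinations": ["sentinelWorkspace"],
                # Ingestion-time transformation: a KQL statement applied to
                # the incoming rows (the virtual table "source") before storage.
                "transformKql": "source | where EventLevelName != 'Information'",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder resource ID; substitute a real workspace ID when adapting this.
    fragment = build_dcr_fragment("<workspace-resource-id>")
    print(json.dumps(fragment, indent=2))
```

Either filter alone would reduce ingestion volume; they are combined here only to show where each one lives in the rule.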

Syslog and CEF Collection

For network appliances, firewalls, and Linux hosts:

  • Syslog: Standard Unix logging protocol. Messages are collected by the AMA on a Linux forwarder.
  • CEF (Common Event Format): A structured message format carried over Syslog. It uses the same AMA-based collection, but the messages are parsed into the CommonSecurityLog table.

Architecture for CEF/Syslog:

  1. Network appliance sends logs to a Linux log forwarder (rsyslog or syslog-ng)
  2. Azure Monitor Agent on the forwarder collects the messages and sends them to the Sentinel workspace
  3. Syslog data lands in the Syslog table; CEF data lands in CommonSecurityLog
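
The difference between the two tables comes down to structure: a CEF message carries a pipe-delimited header whose fields correspond to dedicated CommonSecurityLog columns. The sketch below parses that header in Python to show the idea. It is a simplified illustration, not the agent's actual parser: the function name is hypothetical, the extension is kept as a raw string under a made-up `extension` key, and an escaped backslash immediately before a pipe is not handled.

```python
import re
from typing import Optional

# CEF header fields after the version, in order, paired with the
# CommonSecurityLog columns they are commonly mapped to.
HEADER_COLUMNS = [
    "DeviceVendor",
    "DeviceProduct",
    "DeviceVersion",
    "DeviceEventClassID",
    "Activity",      # CEF "Name" field
    "LogSeverity",   # CEF "Severity" field
]


def parse_cef_header(message: str) -> Optional[dict]:
    """Parse the CEF header from a syslog payload; return None if not CEF."""
    start = message.find("CEF:")
    if start == -1:
        return None  # plain syslog message, no CEF header present
    # Split on pipes that are not escaped with a backslash:
    # version + 6 header fields + the extension as the remainder.
    parts = re.split(r"(?<!\\)\|", message[start:], maxsplit=7)
    if len(parts) < 7:
        return None  # malformed or truncated header
    fields = dict(zip(HEADER_COLUMNS, parts[1:7]))
    fields["extension"] = parts[7] if len(parts) > 7 else ""
    return fields


if __name__ == "__main__":
    sample = (
        "<134>Jan  1 00:00:00 fw01 "
        "CEF:0|Contoso|Firewall|1.0|100|Connection blocked|5|"
        "src=10.0.0.1 dst=10.0.0.2"
    )
    print(parse_cef_header(sample))
```

In a real deployment the destination table is determined by how the connector and its DCR are configured, not by inspecting each message as this toy parser does.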

Custom Logs and the Logs Ingestion API

For sources without a built-in connector, use the Logs Ingestion API:

  1. Create a custom table in the Log Analytics workspace
  2. Define a DCR with the table schema and any transformations
  3. Send data via the REST API using Microsoft Entra authentication

This is the recommended approach for custom application logs, third-party tools without native connectors, and IoT devices.
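
As a rough sketch of step 3, the Python below builds (but does not send) a Logs Ingestion API request using only the standard library. The function name and all values in the usage example are hypothetical placeholders. The token is assumed to have been obtained separately from Microsoft Entra ID for the Azure Monitor scope, and the stream name is assumed to follow the `Custom-<TableName>_CL` pattern declared in the DCR.

```python
import json
import urllib.request

API_VERSION = "2023-01-01"


def build_ingestion_request(
    dce_url: str,
    dcr_immutable_id: str,
    stream_name: str,
    records: list,
    bearer_token: str,
) -> urllib.request.Request:
    """Build a POST request for the Logs Ingestion API without sending it."""
    url = (
        f"{dce_url.rstrip('/')}/dataCollectionRules/{dcr_immutable_id}"
        f"/streams/{stream_name}?api-version={API_VERSION}"
    )
    # The body is a JSON array of records that match the stream schema
    # declared in the DCR.
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    request = build_ingestion_request(
        dce_url="https://example-dce.eastus-1.ingest.monitor.azure.com",
        dcr_immutable_id="dcr-00000000000000000000000000000000",
        stream_name="Custom-AppEvents_CL",
        records=[{"TimeGenerated": "2024-01-01T00:00:00Z", "Message": "login ok"}],
        bearer_token="<token-from-entra-id>",
    )
    print(request.get_method(), request.full_url)
    # Sending would be: urllib.request.urlopen(request)
```

In practice, the Azure Monitor ingestion client library can handle authentication and batching; the raw request is shown here only to make the moving parts visible.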

Connector Health

Monitor connector health through:

  • Data connectors page: Shows last log received timestamp and status
  • Health monitoring workbook: Visualizes ingestion latency and volume anomalies
  • Heartbeat table: For agent-based connectors, tracks agent check-in status
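
The Heartbeat check amounts to comparing each agent's most recent check-in against a tolerance. A minimal Python sketch of that logic follows, assuming the latest heartbeat per computer has already been retrieved (for example, by a query against the Heartbeat table). The function name and the 15-minute threshold are illustrative assumptions, not documented defaults.

```python
from datetime import datetime, timedelta, timezone


def find_stale_agents(
    last_heartbeat: dict,
    now: datetime,
    max_silence: timedelta = timedelta(minutes=15),  # assumed threshold
) -> list:
    """Return computers whose latest heartbeat is older than max_silence."""
    return sorted(
        computer
        for computer, seen_at in last_heartbeat.items()
        if now - seen_at > max_silence
    )


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    latest = {
        "fw-forwarder-01": now - timedelta(minutes=2),
        "fw-forwarder-02": now - timedelta(hours=3),
    }
    print(find_stale_agents(latest, now))
```

A stale forwarder is worth investigating first, since every appliance that sends through it goes silent at the same time.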

Exam Focus Points

  • The Defender XDR connector provides free alert/incident ingestion, but raw advanced hunting tables are billable.
  • Azure Monitor Agent (AMA) replaced the legacy Log Analytics Agent (MMA) for log collection.
  • Data Collection Rules (DCR) define what to collect, how to filter, and where to send logs.
  • Ingestion-time transformations use KQL in the DCR to filter or modify data before storage, reducing costs.
  • CEF logs land in the CommonSecurityLog table; standard Syslog lands in the Syslog table.
  • The Logs Ingestion API enables custom log sources to send data to Sentinel via REST.

Knowledge Check

1. Which component defines what logs the Azure Monitor Agent collects and where it sends them?

2. A firewall appliance sends CEF-formatted logs to a Linux forwarder running the Azure Monitor Agent. Which Sentinel table receives these logs?

3. An organization wants to reduce Sentinel ingestion costs by filtering out informational Windows events before they reach the workspace. Which feature should they use?