L20. Incident Response on Linux: First Steps When Something Goes Wrong

Learn to recognize signs of compromise on a Linux system, collect volatile data before it disappears, check for unauthorized users and persistence mechanisms, preserve evidence properly, and decide when to isolate versus investigate.

Recognizing the Signs of Compromise

Before you can respond to an incident, you need to recognize that something is wrong. Compromised Linux systems often show one or more of these warning signs:

Unexpected processes consuming CPU or memory
Unfamiliar network connections to external IP addresses
Modified system files (especially binaries in /usr/bin, /usr/sbin)
New user accounts or SSH keys you did not create
Unusual cron jobs or systemd timers
Log gaps where entries were deleted or logging was stopped
Files with recent modification timestamps in directories that rarely change

Not every anomaly is a breach, but each one deserves investigation. The cost of checking and finding nothing is far lower than the cost of missing a real compromise.
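
One of the signs above, a gap in the logs, can be checked mechanically. The following is a minimal sketch, assuming each input line begins with an epoch timestamp, which is the format `journalctl -o short-unix` produces. The function name `find_log_gaps` and the default one-hour threshold are illustrative choices, not part of any standard tool.

```shell
# find_log_gaps: read lines that start with an epoch timestamp and report
# every silence longer than the threshold (in seconds, default 3600).
# Function name and default threshold are illustrative.
find_log_gaps() {
  threshold="${1:-3600}"
  awk -v max="$threshold" '
    # Skip lines that do not start with a number (e.g. boot markers)
    $1 !~ /^[0-9]+(\.[0-9]+)?$/ { next }
    {
      ts = int($1)              # whole seconds are precise enough here
      if (seen && ts - prev > max)
        printf "gap of %.0f seconds between %.0f and %.0f\n", ts - prev, prev, ts
      prev = ts
      seen = 1
    }
  '
}
```

A typical call would be `sudo journalctl -o short-unix --since "-2 days" | find_log_gaps 3600`. A quiet host can produce long silences legitimately, so treat a reported gap as a prompt to look closer rather than as proof of tampering.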

Initial Triage: What to Do First

When you suspect a compromise, the order of your actions matters. Volatile data (running processes, active connections, logged-in users) is lost on reboot and can change at any moment, for example if the attacker notices your investigation and cleans up. Collect this data before making any changes to the system.

The Golden Rule

Do not reboot the system. Rebooting destroys volatile evidence in memory, clears temporary files, and may trigger attacker-installed persistence mechanisms that alter or erase the attacker's traces.

Step 1: Document the Current Time

# Record the exact time you started the investigation
date -u
uptime

This establishes a timeline anchor for everything you discover afterward.
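
The same idea extends to every action you take afterward. As a minimal sketch, a small helper can append a UTC-timestamped note to a case log each time you run a command or make a decision. The name `log_action` and the `CASE_LOG` path are hypothetical; ideally the log lives on storage outside the suspect host.

```shell
# Illustrative helper: append a UTC-timestamped note to a case log.
# CASE_LOG is a hypothetical path; override it to point at external storage.
CASE_LOG="${CASE_LOG:-./case_notes.log}"

log_action() {
  # ISO 8601 UTC timestamp, a tab, then the free-text note
  printf '%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$CASE_LOG"
}
```

For example, `log_action "Started investigation; uptime recorded"` followed by `log_action "Ran ps auxwwf"` leaves two ordered, timestamped lines that can later be merged into the incident timeline.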

Collecting Volatile Data

Work through these commands systematically. Copy the output to a file on a separate system if possible:

Running Processes

# Full process listing with command arguments
ps auxww

# Process tree showing parent-child relationships
ps auxwwf

# Look for suspicious processes
# Check for: unfamiliar names, processes running as root that should not be, high CPU usage
ps auxww --sort=-%cpu | head -20   # top CPU consumers
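
One concrete check that fits here is looking for processes whose executable no longer exists on disk, since malware sometimes deletes its own binary after starting. On Linux the kernel appends " (deleted)" to the target of `/proc/<PID>/exe` in that case. The sketch below assumes that behavior; the function name `list_deleted_exe` and its optional proc-root argument are illustrative (the argument exists only so the function can be exercised against a test directory).

```shell
# list_deleted_exe: print PID and link target for every process whose
# executable has been removed from disk. Defaults to scanning /proc.
list_deleted_exe() {
  proc_root="${1:-/proc}"
  for exe in "$proc_root"/[0-9]*/exe; do
    # Unreadable or vanished entries are skipped silently
    target=$(readlink "$exe" 2>/dev/null) || continue
    case "$target" in
      *" (deleted)")
        pid=${exe%/exe}      # strip the trailing /exe
        pid=${pid##*/}       # keep only the PID directory name
        printf '%s\t%s\n' "$pid" "$target"
        ;;
    esac
  done
}
```

Run it as root so that every `/proc/<PID>/exe` link is readable. A hit is not proof of compromise: a package upgrade leaves long-running daemons in the same state until they are restarted.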

Network Connections

# All connections with process info
sudo ss -tnpa

# Listening ports (what services are exposed?)
sudo ss -tlnp

# Established connections (who is connected?)
sudo ss -tnp state established
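
On a busy host the established-connection list can be long, and a per-peer summary makes unusual remote addresses stand out. This is a minimal sketch, assuming the input comes from `ss -Htn state established`: `-H` suppresses the header, and with a state filter `ss` omits the State column, so the peer address is the fourth field. The function name `count_peers` is illustrative.

```shell
# count_peers: summarize established TCP connections by remote address.
# Expects `ss -Htn state established` output on stdin
# (fields: Recv-Q Send-Q Local:Port Peer:Port).
count_peers() {
  awk '{
    peer = $4
    sub(/:[0-9]+$/, "", peer)   # drop the trailing :port; IPv6 stays bracketed
    count[peer]++
  }
  END { for (p in count) printf "%d %s\n", count[p], p }' |
    sort -k1,1nr -k2,2          # most connections first
}
```

Usage would look like `sudo ss -Htn state established | count_peers`. If your version of `ss` prints a different column layout, adjust the field number accordingly.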

Logged-In Users

# Who is logged in right now?
who

# Recent login history
last -20

# Failed login attempts
sudo lastb -20

# Currently active SSH sessions, inbound and outbound (assumes SSH uses port 22)
sudo ss -tnp state established '( sport = :22 or dport = :22 )'

Open Files and Network Sockets

# Files opened by a suspicious process
sudo lsof -p <PID>

# All network connections with associated processes
sudo lsof -i -nP

Checking for Unauthorized Access

Unauthorized User Accounts

# List all user accounts with shells (potential interactive users)
grep -v '/nologin\|/false' /etc/passwd
# Check for accounts with UID 0 (root-equivalent)
awk -F: '$3 == 0 {print $1}' /etc/passwd

# Recently modified user files
ls -la /etc/passwd /etc/shadow /etc/group
stat /etc/passwd
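
The UID 0 check is easier to act on when it prints only the accounts that should not be there. A minimal sketch, assuming `root` is the only legitimate UID 0 account on the host; the function name `extra_root_accounts` is illustrative.

```shell
# extra_root_accounts: print every account with UID 0 other than "root".
# Takes a passwd-format file as an optional argument; defaults to /etc/passwd.
extra_root_accounts() {
  awk -F: '$3 == 0 && $1 != "root" { print $1 }' "${1:-/etc/passwd}"
}
```

Empty output is the expected result. Any name printed here has full root privileges and needs an explanation.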

Unauthorized SSH Keys

# Check every user's authorized_keys file
for dir in /home/*; do
  echo "=== $dir ==="
  cat "$dir/.ssh/authorized_keys" 2>/dev/null
done

# Check root's authorized keys
cat /root/.ssh/authorized_keys 2>/dev/null
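
The loop above only covers accounts under /home. A variation, sketched here under the assumption that home directories are recorded accurately in the passwd file, reads the home directory field instead, which also catches service accounts and root. The function name `list_authorized_keys` is illustrative.

```shell
# list_authorized_keys: for each account in a passwd-format file, print the
# user, the authorized_keys path, and the number of keys it contains.
# Defaults to /etc/passwd.
list_authorized_keys() {
  while IFS=: read -r user pw uid gid gecos home shell; do
    keyfile="$home/.ssh/authorized_keys"
    [ -r "$keyfile" ] || continue
    # Count lines that are neither blank nor comments: one per key
    n=$(grep -cEv '^[[:space:]]*(#|$)' "$keyfile" || true)
    printf '%s\t%s\t%s\n' "$user" "$keyfile" "$n"
  done < "${1:-/etc/passwd}"
}
```

Run it as root so that every user's `.ssh` directory is readable, then compare the key counts and file modification times against what each account owner expects.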

Suspicious Cron Jobs

Cron jobs are a favorite persistence mechanism for attackers because they survive reboots and run on a schedule:

# List cron jobs for all users (reading another user's crontab requires root)
for user in $(cut -f1 -d: /etc/passwd); do
  echo "=== $user ==="
  sudo crontab -l -u "$user" 2>/dev/null
done

# Check system-wide cron directories
ls -la /etc/cron.d/
ls -la /etc/cron.daily/
ls -la /etc/cron.hourly/

# Check systemd timers
systemctl list-timers --all
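
Listing cron files shows that they exist; their contents still need review. As a rough first pass, the sketch below greps cron locations for commands that download or decode content or run from world-writable directories. The pattern list and the function name `scan_cron_dir` are illustrative and far from exhaustive, so expect both false positives and misses.

```shell
# scan_cron_dir: flag cron entries that fetch or decode remote content or
# reference world-writable locations. Pass one or more files or directories.
# The pattern list is an illustrative starting point only.
scan_cron_dir() {
  grep -rnE 'curl|wget|base64|/dev/tcp/|/tmp/|/dev/shm/' "$@" 2>/dev/null
}
```

From a root shell, a call such as `scan_cron_dir /etc/crontab /etc/cron.d /etc/cron.hourly /etc/cron.daily /var/spool/cron` covers the system-wide locations plus per-user crontabs. Each hit prints the file, line number, and matching line for manual review.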

Modified System Binaries

# Debian/Ubuntu: verify installed package file integrity
sudo debsums -c 2>/dev/null | head -20
# RHEL/Fedora: verify package files
sudo rpm -Va | head -20

# Check common binaries for unexpected modification times
ls -la /usr/bin/ssh /usr/bin/curl /usr/bin/wget /usr/sbin/sshd
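
The raw `rpm -Va` output mixes harmless changes (edited config files, touched timestamps) with the ones that matter. In rpm's verify output, the third character of the flag string is `5` when a file's digest no longer matches the package database, and a `c` before the path marks a config file. The sketch below filters on those two facts; the function name `changed_binaries` is illustrative.

```shell
# changed_binaries: reduce `rpm -Va` output to non-config files whose content
# digest differs from what the package installed.
changed_binaries() {
  # $1 is the flag string (e.g. S.5....T.); $2 is "c" only for config files
  awk 'substr($1, 3, 1) == "5" && $2 != "c" { print $NF }'
}
```

Usage would be `sudo rpm -Va | changed_binaries`. On Debian and Ubuntu, `debsums -c` already prints only the paths of changed files, so no filter is needed there. Keep in mind that an attacker with root access can also tamper with the package database, so a clean result is reassuring but not conclusive.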

Preserving Evidence

If this is a real incident, preserving evidence properly is critical. Poor evidence handling can make forensic analysis impossible and may affect any legal proceedings.

Saving Volatile Data

# Create an evidence directory (ideally on an external/mounted drive).
# Compute the path once so every file lands in the same directory,
# even if collection runs past midnight.
EVIDENCE_DIR="/mnt/evidence/$(hostname)_$(date +%Y%m%d)"
mkdir -p "$EVIDENCE_DIR"

# Dump process listing
ps auxwwf > "$EVIDENCE_DIR/processes.txt"

# Dump network connections (root is needed to see process info for all sockets)
sudo ss -tnpa > "$EVIDENCE_DIR/connections.txt"

# Dump logged-in users and login history
who > "$EVIDENCE_DIR/who.txt"
last > "$EVIDENCE_DIR/last.txt"

# Copy relevant log files, preserving timestamps
# (Debian/Ubuntu paths; RHEL-family systems use /var/log/secure and /var/log/messages)
sudo cp -p /var/log/auth.log /var/log/syslog "$EVIDENCE_DIR/"
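
Each saved file is more useful when it records how and when it was produced. A minimal sketch of a wrapper that does this follows; the function name `collect`, the header format, and the default `EVIDENCE_DIR` value are assumptions for illustration.

```shell
# collect: run a command and save its output under EVIDENCE_DIR, preceded by
# a header with the capture time (UTC) and the exact command line.
EVIDENCE_DIR="${EVIDENCE_DIR:-/mnt/evidence/$(uname -n)_$(date -u +%Y%m%d)}"

collect() {
  name="$1"; shift
  mkdir -p "$EVIDENCE_DIR" || return 1
  {
    printf '# captured: %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    printf '# command: %s\n' "$*"
    "$@" 2>&1          # keep error output too; it can be evidence
  } > "$EVIDENCE_DIR/$name.txt"
}
```

Calls then read as `collect processes ps auxwwf`, `collect connections ss -tnpa`, and `collect who who`, each producing a self-describing text file in the evidence directory.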

Hashing Evidence Files

# Create checksums of all evidence files for integrity verification
cd /mnt/evidence/$(hostname)_$(date +%Y%m%d)
sha256sum * > checksums.sha256
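
Checksums only help if they are verified later. The sketch below pairs a sealing step with a verification step; it uses `find` so that files in subdirectories are included and the manifest does not list itself. The function names and the manifest name `checksums.sha256` inside the helper are illustrative.

```shell
# seal_evidence: hash every regular file under a directory into a manifest.
seal_evidence() {
  ( cd "$1" &&
    find . -type f ! -name checksums.sha256 -exec sha256sum {} + \
      > checksums.sha256 )
}

# verify_evidence: re-check the manifest; exits non-zero if any file changed.
verify_evidence() {
  ( cd "$1" && sha256sum -c checksums.sha256 )
}
```

After `seal_evidence /mnt/evidence/<dir>`, copy the manifest to a second location so that it cannot be altered along with the evidence. Running `verify_evidence` on the same directory later prints one OK or FAILED line per file.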

Isolate vs Investigate: Making the Decision

One of the hardest decisions during incident response is whether to isolate the system immediately or continue investigating while it runs.

Approach	When to Use	Trade-off
Isolate immediately	Active data exfiltration, ransomware spreading, attacker is logged in	Stops the damage but may alert the attacker and lose volatile data
Investigate first	Suspected compromise without active damage, need to understand scope	Preserves evidence but the attacker may continue operating
Isolate network, keep running	Best middle ground for most incidents	Prevents lateral movement while preserving memory and processes

Network Isolation Without Shutdown

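# Allow your investigation session first, so the DROP rules cannot cut it off
sudo iptables -I INPUT 1 -s <your-ip> -j ACCEPT
sudo iptables -I OUTPUT 1 -d <your-ip> -j ACCEPT
# Drop everything else. Inserting at position 2 places the DROP ahead of any
# existing ACCEPT rules; appending with -A would leave those rules in effect.
sudo iptables -I INPUT 2 -j DROP
sudo iptables -I OUTPUT 2 -j DROP
# These rules cover IPv4 only; repeat with ip6tables if the host has IPv6 connectivity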

This keeps the system running (preserving volatile evidence) while preventing the attacker from communicating with command-and-control servers or moving laterally to other systems.
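
Because a mistyped address in these rules can lock you out of the host, it can help to generate the commands first and review them before anything is applied. The following is a minimal sketch of such a dry run. The function name `isolation_rules` and the backup path are hypothetical, and only IPv4 is handled.

```shell
# isolation_rules: print (do not run) the commands that would isolate a host
# while leaving one investigator IPv4 address reachable.
isolation_rules() {
  ip="$1"
  # Refuse anything that is not a dotted quad
  printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || {
    echo "usage: isolation_rules <investigator-ipv4>" >&2
    return 1
  }
  cat <<EOF
iptables-save > /root/iptables.before-isolation
iptables -I INPUT 1 -s $ip -j ACCEPT
iptables -I OUTPUT 1 -d $ip -j ACCEPT
iptables -I INPUT 2 -j DROP
iptables -I OUTPUT 2 -j DROP
EOF
}
```

Review the output of `isolation_rules 192.0.2.10`, record it in your case notes, and only then pipe it to a root shell. The first generated line saves the existing ruleset so it can be restored with `iptables-restore` once the investigation is complete.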

Basic Timeline Reconstruction from Logs

Once you have collected volatile data and secured the system, start building a timeline of what happened:

# Authentication events
sudo grep -E 'Accepted|Failed|session opened|session closed' /var/log/auth.log | tail -50

# Sudo usage
sudo grep 'sudo:' /var/log/auth.log | tail -20

# Recent file modifications in sensitive directories
sudo find /etc -mtime -7 -type f -ls
sudo find /usr/bin -mtime -7 -type f -ls
sudo find /tmp -mtime -3 -type f -ls

# Journal entries around a specific time
sudo journalctl --since "2026-06-19 02:00" --until "2026-06-19 04:00"
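
When the authentication log is large, a per-source count of failed logins quickly shows whether a brute-force attempt preceded a successful login. This sketch assumes the OpenSSH message format "Failed password for ... from <address> port <n>"; the function name `failed_logins_by_ip` is illustrative.

```shell
# failed_logins_by_ip: count failed SSH password attempts per source address.
# Reads sshd log lines on stdin and prints "<count> <address>", busiest first.
failed_logins_by_ip() {
  grep 'Failed password' |
    sed -n 's/.* from \([^ ]*\) port [0-9]*.*/\1/p' |
    sort | uniq -c | sort -k1,1nr -k2,2 |
    awk '{ print $1, $2 }'      # normalize the padding uniq -c adds
}
```

Usage would be `sudo cat /var/log/auth.log | failed_logins_by_ip | head`. An address with many failures followed by an `Accepted` entry from the same source is a strong candidate for the initial point of entry.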

Building the Narrative

As you review logs and evidence, build a timeline document that answers:

When did the suspicious activity start? (earliest log entry)
How did the attacker gain access? (SSH brute force, exploited service, stolen credentials)
What did they do? (new accounts, installed tools, accessed data)
What is still running? (persistence mechanisms, backdoors)
What else might be affected? (lateral movement to other systems)

This timeline becomes the foundation of your incident report and guides the remediation steps that follow.

Exam Focus Points

✓ Never reboot a compromised system: volatile data (processes, connections, memory) is destroyed on restart
✓ Collect volatile data first: ps, ss, who, last, lsof capture evidence that disappears quickly
✓ Check for persistence: unauthorized SSH keys, cron jobs, systemd timers, and modified system binaries
✓ Preserve evidence with checksums: hash all evidence files with sha256sum for integrity verification
✓ Network isolation without shutdown preserves evidence while stopping lateral movement and C2 communication

Knowledge Check

1. Why should you avoid rebooting a system you suspect has been compromised?

2. Which of the following is the best approach when you discover an active compromise but no data is currently being exfiltrated?

3. An attacker wants their access to survive a system reboot. Which persistence mechanism should you check for during incident response?
