Analyzing Random Crash Dumps: Forensic Steps When a Process-Roulette Tool Brings Down Windows

windows
2026-02-02 12:00:00

Practical forensics for process kills: capture dumps, analyze with WinDbg, and attribute terminations using Sysmon, ETW, and automation.

When process-roulette breaks Windows: a postmortem playbook for engineers

Random process terminations—whether from a malicious “process roulette” tool, an overzealous EDR, or flaky automation—are a nightmare for sysadmins and developers. You need to capture reliable evidence, attribute who (or what) pulled the plug, and extract an actionable root cause. This guide gives a practical, repeatable forensic workflow for capturing crash dumps, analyzing them with WinDbg, and correlating telemetry so you can confidently say what killed the process.

Why this matters in 2026

By late 2025 and into 2026, Windows deployments are more heterogeneous (Windows 10/11/Server combinations, containerized apps, and lightweight WinPE-like boot environments). EDR and cloud diagnostics are also more aggressive at process termination to prevent lateral movement. That increases false positives and intentional terminations. At the same time, WinDbg Preview and Time-Travel Debugging (TTD) matured, and Sysmon/ETW tracing became standard telemetry pipelines. Your forensic playbook needs to combine classic dump analysis with robust telemetry correlation and lightweight automation.

High-level forensic flow (inverted pyramid)

  1. Immediately preserve evidence: enable dump capture and collect logs.
  2. Capture the right dump: user-mode full, mini, or kernel depending on impact.
  3. Analyze the dump with WinDbg: automated triage then manual investigation.
  4. Attribute the termination: correlate Event Log, Sysmon, ETW, EDR, and driver/stack data.
  5. Automate and prevent recurrence: supervisor/debugger patterns, LocalDumps, Sysmon rules and creative automation.

1) Preserve evidence: what to enable immediately

If you suspect a targeted or random process kill, act before rebooting or changing config. The following should be enabled across affected hosts:

  • Local crash dumps (WER) for user-mode processes: the LocalDumps registry key tells Windows Error Reporting to write a dump whenever a process crashes. This is essential for reproducing crashes and helps when processes throw unhandled exceptions or terminate abnormally.
  • Kernel crash settings: ensure kernel dumps are configured (complete or kernel) so OS-level failures are preserved in %SystemRoot%\MEMORY.DMP.
  • Audit Process Creation/Termination: enable Windows Audit Policy for process creation and process termination events (Event IDs like 4688/4689) and collect logs centrally.
  • Deploy Sysmon (recommended): Sysmon captures rich process start/stop telemetry, command lines, parent process, and image hashes—crucial for attribution.
  • Collect EDR/AV logs: modern EDR tools often terminate processes they consider malicious; collect their telemetry and policy decisions.

Quick commands: enable LocalDumps and kernel dumps (PowerShell)

# LocalDumps for all processes (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps)
$ld = 'HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps'
# Create the key only if it is missing so existing settings are left alone
if (-not (Test-Path $ld)) { New-Item -Path $ld -Force | Out-Null }
New-ItemProperty -Path $ld -Name 'DumpFolder' -Value 'C:\Dumps' -PropertyType ExpandString -Force | Out-Null
New-ItemProperty -Path $ld -Name 'DumpCount' -Value 10 -PropertyType DWord -Force | Out-Null
New-ItemProperty -Path $ld -Name 'DumpType' -Value 2 -PropertyType DWord -Force | Out-Null  # 2 = full dump

# Kernel crash dump. CrashDumpEnabled: 1 = complete, 2 = kernel, 3 = small, 7 = automatic
reg add "HKLM\SYSTEM\CurrentControlSet\Control\CrashControl" /v CrashDumpEnabled /t REG_DWORD /d 2 /f
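
Once the settings are pushed, it helps to check that they actually landed. The following is a minimal sketch of such a check, assuming the values recommended in this guide. Get-DumpConfigIssue is a hypothetical helper name, not a built-in cmdlet, and it takes plain hashtables so it can be exercised without touching a live registry.

```powershell
# Sketch: report dump settings that differ from what this guide recommends.
# Get-DumpConfigIssue is a hypothetical helper, not a built-in cmdlet.
function Get-DumpConfigIssue {
    param(
        [hashtable]$LocalDumps = @{},    # values read from ...\Windows Error Reporting\LocalDumps
        [hashtable]$CrashControl = @{}   # values read from ...\Control\CrashControl
    )
    $issues = @()
    if (-not $LocalDumps.ContainsKey('DumpFolder')) {
        $issues += 'LocalDumps: DumpFolder is not set'
    }
    if ($LocalDumps['DumpType'] -ne 2) {
        $issues += 'LocalDumps: DumpType is not 2 (full dump)'
    }
    # CrashDumpEnabled: 0 = none, 1 = complete, 2 = kernel, 3 = small, 7 = automatic
    if ($CrashControl['CrashDumpEnabled'] -notin 1, 2, 7) {
        $issues += 'CrashControl: no complete, kernel, or automatic dump configured'
    }
    return ,$issues
}

# Usage on a Windows host (reads the live registry into a hashtable first):
# $ld = @{}
# (Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps').PSObject.Properties |
#     ForEach-Object { $ld[$_.Name] = $_.Value }
# $cc = @{ CrashDumpEnabled = (Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\CrashControl').CrashDumpEnabled }
# Get-DumpConfigIssue -LocalDumps $ld -CrashControl $cc
```

An empty result means both the user-mode and kernel settings match the recommendations above; anything else is a list of human-readable findings you can log per host.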
  

2) Which dump to collect and when

Not all dumps are equal. Choose based on the event:

  • Process crash (unhandled exception): user-mode full dump or mini-dump with heap gives exception context—use LocalDumps or ProcDump with -ma.
  • Process termination via TerminateProcess: there is no exception to trap. You need a supervised debugger or persistent ETW/Sysmon evidence because once the process exits, you can no longer dump its memory.
  • System crash/BSOD: kernel dump in %SystemRoot%\MEMORY.DMP is required.

Tools to have on hand

  • ProcDump (Sysinternals) – reliable for crash-triggered dumps and can act as a supervisor for unhandled exceptions.
  • WinDbg Preview – for in-depth analysis and TTD support.
  • Sysmon – for powerful telemetry to attribute who caused termination.
  • ETW/xperf/WPR – for system-wide traces and performance context.
  • PowerShell + P/Invoke helper – small supervisor to spawn and debug processes and write dumps on EXIT events.

3) Supervising processes so you can catch TerminateProcess

When attackers or tools call TerminateProcess, the process is forcibly ended and user-mode exception handling is bypassed. The solution is to run a lightweight debugger/supervisor that starts the target process with debugging enabled. The debugger receives exit events and can call MiniDumpWriteDump while the process state is still available.

Supervisor pattern: options

  • Debugging supervisor: CreateProcess with DEBUG_PROCESS (or DEBUG_ONLY_THIS_PROCESS) in a small native or .NET host. On EXIT_PROCESS_DEBUG_EVENT, use MiniDumpWriteDump to capture a full dump.
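  • ProcDump supervised launches: ProcDump attaches as a debugger, so if you control startup you can use it as the launcher or monitor. With -e it dumps on unhandled exceptions; with -t it also writes a dump when the process terminates, which covers many TerminateProcess cases without custom code. (Anything beyond that, such as custom logic at exit, still needs your own debugger.)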
  • Container/Job-object isolation: run critical apps in an environment where you can snapshot or restore if they are killed.
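
As a low-effort version of the supervisor options above, the sketch below builds a ProcDump command line that dumps both on unhandled exceptions and on process exit. New-ProcDumpArgument is a hypothetical helper, and the tool path and dump folder are placeholders. Verify in a lab that the dump taken at exit still contains the state you care about.

```powershell
# Sketch: build a ProcDump command line that dumps on crash and on exit.
# New-ProcDumpArgument is a hypothetical helper; paths are placeholders.
function New-ProcDumpArgument {
    param(
        [Parameter(Mandatory)][string]$ProcessName,
        [string]$DumpFolder = 'C:\Dumps'
    )
    # -accepteula  suppress the first-run EULA dialog
    # -ma          write a full memory dump
    # -e           dump on unhandled exception
    # -t           dump when the process terminates
    # -w           wait for the process to start if it is not running yet
    return @('-accepteula', '-ma', '-e', '-t', '-w', $ProcessName, $DumpFolder)
}

# Usage on a Windows host with ProcDump at an assumed location:
# Start-Process -FilePath 'C:\Tools\procdump.exe' `
#     -ArgumentList (New-ProcDumpArgument -ProcessName 'target.exe') -NoNewWindow
```

Passing a folder rather than a file name lets ProcDump pick its default dump name, which avoids collisions when the same process is dumped more than once.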

Minimal C# supervisor (concept)

Below is a conceptual PowerShell snippet that compiles a tiny C# debug-supervisor to start a process under a debugger and dump on exit. This is a template—test in lab before deployment.

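# PowerShell: compile and run a small C# supervisor that uses CreateProcess with DEBUG_ONLY_THIS_PROCESS
# Note: the closing '@ of the here-string must start at column 0.
$code = @'
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
// P/Invoke declarations for CreateProcess, WaitForDebugEvent, ContinueDebugEvent, MiniDumpWriteDump go here.
public class Supervisor {
  // Public so the type and its entry point are callable from PowerShell after Add-Type.
  public static void Main(string[] args) {
    // Omitted for brevity. Outline of the debug loop:
    //   1. CreateProcess(args[0]) with DEBUG_ONLY_THIS_PROCESS.
    //   2. Loop on WaitForDebugEvent / ContinueDebugEvent.
    //   3. On EXIT_PROCESS_DEBUG_EVENT, call MiniDumpWriteDump before continuing.
    throw new NotImplementedException("debug loop not implemented in this template");
  }
}
'@

Add-Type -TypeDefinition $code -Language CSharp
[Supervisor]::Main(@('C:\path\to\target.exe'))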
  

Tip: For production, use a vetted open-source supervisor or wrap ProcDump combined with Audit/Sysmon to reduce surface area.

4) WinDbg triage: automated then manual

When you have a dump (user-mode or kernel), use a two-stage WinDbg workflow:

  1. Automated triage: run !analyze -v and capture the top lines: exception code, faulting module, module base, and stack.
  2. Manual inspection: inspect stacks, loaded modules, PEB/TEB, handle lists, and thread context to determine whether termination was self-induced, crashed due to access violation, or externally terminated.

WinDbg starter checklist

  • Set symbol path: .sympath SRV*c:\symbols*https://msdl.microsoft.com/download/symbols
  • Reload symbols: .reload /f
  • Automated analysis: !analyze -v
  • Inspect exception context: .ecxr
  • List threads and stacks: ~* k for native stacks, or ~* e !clrstack for managed code once SOS is loaded
  • List loaded modules: lm, then lmvm suspiciousModule for version and path details on a specific module
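
The checklist above can also be run non-interactively, which is handy when you have a folder full of dumps. This is a sketch that assumes cdb.exe from the Debugging Tools for Windows is installed at the default SDK path shown; Get-TriageCommand and Invoke-DumpTriage are hypothetical helper names. The symbol path is passed with -y rather than .sympath because a semicolon inside .sympath is treated as a path separator, not a command separator.

```powershell
# Sketch: run the starter checklist against a dump with cdb.exe and save the output.
# Helper names and the default cdb path are assumptions; adjust for your install.
function Get-TriageCommand {
    # Commands are joined with ';' and end with 'q' so cdb exits when done.
    $steps = @('!analyze -v', '.ecxr', '~* k', 'lm', 'q')
    return ($steps -join '; ')
}

function Invoke-DumpTriage {
    param(
        [Parameter(Mandatory)][string]$DumpPath,
        [string]$CdbPath = 'C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\cdb.exe',
        [string]$SymbolPath = 'SRV*C:\symbols*https://msdl.microsoft.com/download/symbols'
    )
    # -z opens a dump file, -y sets the symbol path, -c runs the command string.
    & $CdbPath -z $DumpPath -y $SymbolPath -c (Get-TriageCommand) |
        Out-File -FilePath "$DumpPath.triage.txt"
}

# Usage on a Windows host:
# Get-ChildItem 'C:\Dumps\*.dmp' | ForEach-Object { Invoke-DumpTriage -DumpPath $_.FullName }
```

The text output is only a first pass. Anything that looks interesting still deserves an interactive session.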

Common signals in user-mode dumps

  • Access violation (0xC0000005): likely a bug in application/native module—inspect stack and heap.
  • Abort/terminate in runtime: look for calls to TerminateProcess in stacks or suspicious modules that call ExitProcess or TerminateProcess.
  • No exception and clean exit: likely intentional termination; supplement analysis with Event Log and Sysmon.

5) Attribution: who/what killed the process?

Dump analysis alone rarely tells you the actor. You need to correlate system telemetry. These are the most reliable sources:

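  • Sysmon (ProcessTerminate, Event ID 5): records UtcTime, ProcessGuid, ProcessId, and Image for the process that exited. It does not name the terminator. Join on ProcessGuid to the matching ProcessCreate (Event ID 1) for the command line, ParentImage, and hashes, and look at ProcessAccess (Event ID 10) events against the target: a SourceImage with a GrantedAccess mask that includes PROCESS_TERMINATE (0x1) points to the likely killer. ProcessAccess is only logged when your config includes it.
  • Security event logs (4688/4689): 4688 records process creation and 4689 records process exit. Use 4689 to confirm the exit time and status, and 4688 to see which account launched the suspected terminating tool.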
  • EDR/AV logs: check policy decisions—many enterprise EDRs log blockage and termination actions with reasons and signatures.
  • ETW trace: a system-wide ETW session (Process provider) can show high-fidelity timelines with thread stacks if you enabled stack capture (xperf/WPR).
  • Driver stacks and kernel event: if a kernel driver initiated the termination (rare, but possible), check kernel-mode stacks in a kernel dump.

Example correlation workflow

  1. Note the dump timestamp and PID from the dump (.time and | in a user-mode dump; !process in a kernel dump).
  2. Search centralized Event Logs/Sysmon events for that PID and time window.
  3. Identify a process that opened the target with terminate rights, or an EDR policy that triggered, in that window.
  4. If available, pull EDR telemetry for policy name, rule, and operator.
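
The workflow above boils down to filtering events by PID and time window. Here is a minimal sketch of that filter, assuming Sysmon events have first been flattened into objects with TimeCreated, Id, and the event data fields (ProcessId for Event IDs 1 and 5, TargetProcessId for Event ID 10). Both function names are hypothetical helpers.

```powershell
# Sketch: flatten Sysmon records and keep those touching one PID near the exit time.
# ConvertFrom-SysmonRecord and Select-EventNearExit are hypothetical helper names.
function ConvertFrom-SysmonRecord {
    param([Parameter(Mandatory)]$Record)
    $row = [ordered]@{ TimeCreated = $Record.TimeCreated; Id = $Record.Id }
    # Each <Data Name="..."> element in EventData becomes a property on the row.
    foreach ($d in ([xml]$Record.ToXml()).Event.EventData.Data) {
        $row[$d.Name] = $d.'#text'
    }
    [pscustomobject]$row
}

function Select-EventNearExit {
    param(
        [Parameter(Mandatory)][object[]]$Records,
        [Parameter(Mandatory)][int]$TargetPid,
        [Parameter(Mandatory)][datetime]$ExitTime,
        [int]$WindowSeconds = 30
    )
    $from = $ExitTime.AddSeconds(-$WindowSeconds)
    $to   = $ExitTime.AddSeconds($WindowSeconds)
    $Records |
        Where-Object { $_.TimeCreated -ge $from -and $_.TimeCreated -le $to } |
        Where-Object { $_.ProcessId -eq $TargetPid -or $_.TargetProcessId -eq $TargetPid } |
        Sort-Object TimeCreated
}

# Usage on a Windows host with Sysmon installed (PID and time taken from the dump):
# $raw  = Get-WinEvent -FilterHashtable @{ LogName = 'Microsoft-Windows-Sysmon/Operational'; Id = 1, 5, 10 }
# $rows = $raw | ForEach-Object { ConvertFrom-SysmonRecord $_ }
# Select-EventNearExit -Records $rows -TargetPid 4242 -ExitTime '2026-01-01T12:00:00'
```

PIDs are reused, so treat a match as a lead and confirm it with ProcessGuid before naming a culprit.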

6) Advanced techniques & 2026 updates

Recent tooling and OS changes give additional options:

  • TTD and WinDbg Preview (2024–2026): Time-Travel Debugging is more integrated. If you capture a TTD recording, you can replay the exact sequence leading to termination and inspect memory over time.
  • Sysmon v15+ features: improved hashing and process image logging makes attribution easier. Deploy updated Sysmon configurations that include Process Tampering and ImageLoad rules.
  • Cloud WER collection: Microsoft’s cloud-based diagnostic pipeline gives richer telemetry for OS-level failures—use it for large fleets when local dumps are insufficient. Consider integrating with cloud tooling like Bitbox.cloud or your existing diagnostic collectors.
  • EDR-enabled automated kill chains: be aware that many EDR products now perform kills and quarantine with justification metadata—collect that metadata as part of your standard forensic data set.
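
To make the Sysmon point above concrete, here is a deliberately narrow config sketch that logs ProcessTerminate and ProcessAccess events for a single image. It is an illustration, not the hardened config referred to elsewhere in this article: target.exe, the rule group names, and the schema version are all placeholders, and you should run sysmon64.exe -s to see which schema version your installed build expects.

```powershell
# Sketch: a narrow Sysmon config for one protected image, written from PowerShell.
# 'target.exe', the rule names, and schemaversion are placeholders (assumptions).
$sysmonConfig = @'
<Sysmon schemaversion="4.90">
  <EventFiltering>
    <RuleGroup name="target-exit" groupRelation="or">
      <ProcessTerminate onmatch="include">
        <Image condition="end with">target.exe</Image>
      </ProcessTerminate>
    </RuleGroup>
    <RuleGroup name="target-access" groupRelation="or">
      <ProcessAccess onmatch="include">
        <TargetImage condition="end with">target.exe</TargetImage>
      </ProcessAccess>
    </RuleGroup>
  </EventFiltering>
</Sysmon>
'@

# Fail early if the XML is malformed, then write it out for deployment.
$null = [xml]$sysmonConfig
$configPath = Join-Path ([IO.Path]::GetTempPath()) 'sysmon-target.xml'
Set-Content -Path $configPath -Value $sysmonConfig

# Apply on a host where Sysmon is already installed:
# sysmon64.exe -c $configPath
```

ProcessAccess can be noisy on busy hosts, which is why the sketch filters on TargetImage instead of logging every handle open.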

7) Practical scripts and automation you can deploy now

Below are two practical scripts: (A) enable LocalDumps across a fleet and (B) a watchdog that uses ProcDump to capture dumps on crashes and logs termination events via Sysmon/Windows Event Log.

A) Fleet LocalDumps registry push (PowerShell)

# Run as admin on target machines or via Group Policy / SCCM
$computers = Get-Content -Path .\hosts.txt
foreach ($c in $computers) {
  Invoke-Command -ComputerName $c -ScriptBlock {
    $ld = 'HKLM:\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps'
    # Create the key only if it is missing so existing settings are left alone
    if (-not (Test-Path $ld)) { New-Item -Path $ld -Force | Out-Null }
    New-ItemProperty -Path $ld -Name 'DumpFolder' -Value 'C:\Dumps' -PropertyType ExpandString -Force | Out-Null
    New-ItemProperty -Path $ld -Name 'DumpCount' -Value 10 -PropertyType DWord -Force | Out-Null
    New-ItemProperty -Path $ld -Name 'DumpType' -Value 2 -PropertyType DWord -Force | Out-Null  # 2 = full dump
  }
}
  

B) Lightweight watchdog combining ProcDump and Sysmon

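# This is a conceptual script: run one ProcDump instance per critical process
$names = @('target.exe','service.exe')
foreach ($n in $names) {
  # -w waits for the process to start, -e dumps on unhandled exceptions, -ma writes a full dump.
  # Passing a folder lets ProcDump use its default PROCESSNAME_YYMMDD_HHMMSS.dmp file name.
  Start-Process -FilePath 'C:\Tools\procdump.exe' -ArgumentList "-accepteula -ma -e -w $n C:\Dumps" -NoNewWindow
}

# Subscribe to Sysmon ProcessTerminate (Event ID 5) records and forward them to a central collector
$query = New-Object System.Diagnostics.Eventing.Reader.EventLogQuery('Microsoft-Windows-Sysmon/Operational', [System.Diagnostics.Eventing.Reader.PathType]::LogName, '*[System[(EventID=5)]]')
$watcher = New-Object System.Diagnostics.Eventing.Reader.EventLogWatcher($query)
Register-ObjectEvent -InputObject $watcher -EventName EventRecordWritten -Action {
  $record = $EventArgs.EventRecord
  # parse $record.ToXml(), forward to SIEM
} | Out-Null
$watcher.Enabled = $true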
  

Note: Replace the placeholders with tested, production-safe code. The watchdog captures crash dumps; to capture state on TerminateProcess, the process must run under a debugger-style supervisor. Use configuration management (Group Policy, SCCM, or your orchestration tooling) to roll these settings out.

8) Real-world case study (short)

In late 2025 we assisted a SaaS provider that reported pods occasionally failing with random service exits. Dump triage showed no exception and clean exits. Sysmon showed that a signed EDR helper process (with a known policy) issued a termination around the same timestamps. Correlation with EDR telemetry revealed an aggressive heuristic that killed any process exceeding memory thresholds during an upgrade. The fix was twofold: relax the heuristic during maintenance windows, and run service processes under the supervisor pattern during upgrades. Root cause: policy mismatch + ephemeral process restarts.

Checklist: What to collect after a random termination

  • Local user-mode dump (if available) in C:\Dumps or WER store
  • Kernel MEMORY.DMP if a BSOD occurred
  • Sysmon logs (Process Create/Terminate, ImageLoad, NetworkConnect)
  • Security event logs (4688/4689) for process creation/exit
  • EDR/AV logs and quarantine actions
  • WPR/xperf traces if you have them
  • Configuration of the system at time of event (installed drivers, hotfixes, policies)
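
Collected artifacts are easier to defend later if they are copied into one case folder and hashed at collection time. The following is a minimal sketch of that step; New-EvidenceManifest is a hypothetical helper and every path shown is a placeholder.

```powershell
# Sketch: copy artifacts into one case folder and record SHA-256 hashes in a manifest.
# New-EvidenceManifest is a hypothetical helper; all paths are placeholders.
function New-EvidenceManifest {
    param(
        [Parameter(Mandatory)][string[]]$SourcePath,
        [Parameter(Mandatory)][string]$CaseFolder
    )
    New-Item -ItemType Directory -Path $CaseFolder -Force | Out-Null
    $rows = @(foreach ($src in $SourcePath) {
        # Skip anything that is missing instead of failing the whole collection.
        if (-not (Test-Path -LiteralPath $src -PathType Leaf)) { continue }
        $copy = Copy-Item -LiteralPath $src -Destination $CaseFolder -PassThru
        $hash = Get-FileHash -LiteralPath $copy.FullName -Algorithm SHA256
        [pscustomobject]@{
            File         = $copy.Name
            SHA256       = $hash.Hash
            CollectedUtc = (Get-Date).ToUniversalTime().ToString('o')
        }
    })
    $rows | Export-Csv -Path (Join-Path $CaseFolder 'manifest.csv') -NoTypeInformation
    return $rows
}

# Usage on a Windows host: export the logs first, then collect everything.
# wevtutil epl Security C:\Case\raw\Security.evtx
# wevtutil epl Microsoft-Windows-Sysmon/Operational C:\Case\raw\Sysmon.evtx
# $files = @(Get-ChildItem 'C:\Dumps\*.dmp', 'C:\Case\raw\*.evtx' | ForEach-Object FullName)
# New-EvidenceManifest -SourcePath $files -CaseFolder 'C:\Case\evidence'
```

Hashing the copy rather than the original keeps the manifest consistent with what actually sits in the case folder.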

Wrap-up and actionable takeaways

  • Always enable LocalDumps and kernel dumps across critical hosts—don’t rely on chance.
  • Deploy Sysmon with a termination-aware config to get the telemetry you need to attribute kills.
  • Use a supervisor/debugger for critical processes you cannot lose; it’s the only reliable way to capture memory when TerminateProcess is used.
  • Correlate artifacts: dumps + Event Logs + Sysmon + EDR = attribution. One source rarely suffices.
  • Automate collection: use PowerShell, Group Policy, or your orchestration tool to ensure dumps and logs are preserved across reboots.

Forensic evidence is ephemeral. Collect aggressively, analyze methodically, and automate defensively.

Next steps (call to action)

Start by enabling LocalDumps and a Sysmon baseline for a pilot group. If you want the supervisor sample and a hardened Sysmon config we used in production (safe for 2026 fleets), download the companion scripts and step-by-step checklist on windows.page (search “process-roulette postmortem scripts”) or subscribe to get the diagnostic automation bundle. If you’re investigating an active incident and want a template WinDbg triage checklist or a review of your dumps, reach out via our contact channels for consulting.

2026-01-24T04:37:16.175Z