Crash Analysis and Exploitability Assessment

Prerequisites

Before starting this week, ensure you have:

  • A Windows VM (for WinDbg labs) and a Linux VM (for GDB/ASAN/CASR labs).

  • Completed Week 2 fuzzing labs, including running AFL++ or libFuzzer against at least one C/C++ target.

  • Completed (or skimmed) Week 3 patch diffing labs:

    • Familiar with Ghidriff/Diaphora diff reports and how to interpret changed functions

    • Understand how to extract Windows updates and Linux kernel patches

    • Reviewed at least one case study (CVE-2022-34718 EvilESP, CVE-2024-1086 nf_tables, or 7-Zip symlink bugs)

  • A working understanding from Week 1 of basic vulnerability classes (buffer overflow, UAF, integer bugs, info leaks) and their exploit primitives.

Crash Analysis Decision Tree

Use this decision tree to select the appropriate tools and workflow for any crash you encounter:

┌─────────────────────────────────────────────────────────────────────┐
│                        CRASH RECEIVED                               │
└─────────────────────────────────────────────────────────────────────┘


                    ┌───────────────────────┐
                    │ Source code available?│
                    └───────────────────────┘
                      │                    │
                     Yes                   No
                      │                    │
                      ▼                    ▼
        ┌─────────────────────┐   ┌──────────────────────────┐
        │ Recompile with      │   │ What platform?           │
        │ ASAN + UBSAN        │   └──────────────────────────┘
        │ (Day 2)             │     │         │         │
        └─────────────────────┘     │         │         │
                      │          Windows   Linux    Mobile
                      │             │         │         │
                      ▼             ▼         ▼         ▼
        ┌─────────────────────┐ ┌───────┐ ┌───────┐ ┌───────────┐
        │ Run crash input     │ │WinDbg │ │Pwndbg │ │ Tombstone │
        │ Get detailed report │ │+ TTD  │ │+ rr   │ │ + Frida   │
        └─────────────────────┘ │(Day 1)│ │(Day 1)│ │ (Future)  │
                      │         └───────┘ └───────┘ └───────────┘
                      │             │         │         │
                      └─────────────┴────┬────┴─────────┘


                    ┌─────────────────────────────────────┐
                    │ Crash requires special environment? │
                    └─────────────────────────────────────┘
                       │                              │
                      Yes                             No
                       │                              │
                       ▼                              │
        ┌─────────────────────────────┐               │
        │ Setup reproduction env:     │               │
        │ - Network (tcpdump, proxy)  │               │
        │ - Files (strace, procmon)   │               │
        │ - Services (docker, VM)     │               │
        └─────────────────────────────┘               │
                       │                              │
                       └──────────────┬───────────────┘


                            ┌─────────────────────┐
                            │ Crash type known?   │
                            └─────────────────────┘
                              │                 │
                             Yes                No
                              │                 │
                              ▼                 ▼
                ┌─────────────────────┐  ┌─────────────────────┐
                │ Run CASR for        │  │ Manual analysis:    │
                │ classification      │  │ - Examine registers │
                │ (Day 3)             │  │ - Check memory      │
                └─────────────────────┘  │ - Disassemble       │
                              │          │ (Day 3)             │
                              │          └─────────────────────┘
                              │                 │
                              └────────┬────────┘


                          ┌─────────────────────────┐
                          │ EXPLOITABILITY ASSESS   │
                          │ - Check mitigations     │
                          │ - Control analysis      │
                          │ - Reachability (Day 4)  │
                          └─────────────────────────┘


                          ┌─────────────────────────┐
                          │ Multiple crashes?       │
                          └─────────────────────────┘
                            │                    │
                           Yes                   No
                            │                    │
                            ▼                    ▼
              ┌─────────────────────┐   ┌─────────────────────┐
              │ Deduplicate (Day 5) │   │ Minimize (Day 5)    │
              │ - CASR cluster      │   │ - afl-tmin          │
              │ - Stack hash        │   │ - Manual reduction  │
              └─────────────────────┘   └─────────────────────┘
                            │                    │
                            └────────┬───────────┘


                        ┌─────────────────────────┐
                        │ Create PoC (Day 6)      │
                        │ - Python + pwntools     │
                        │ - Verify reliability    │
                        │ - Document findings     │
                        └─────────────────────────┘

Quick Reference - Tool Selection by Scenario:

| Scenario | Primary Tool | Secondary Tool | Sanitizer |
|----------|--------------|----------------|-----------|
| Linux binary, have source | GDB + Pwndbg | rr | ASAN + UBSAN |
| Linux binary, no source | GDB + Pwndbg | Ghidra | N/A |
| Windows binary, have source | WinDbg + TTD | Visual Studio | ASAN |
| Windows binary, no source | WinDbg + TTD | IDA/Ghidra | N/A |
| Fuzzer crash corpus | CASR | afl-tmin | ASAN |
| Non-deterministic crash | rr (Linux) / TTD (Windows) | Chaos mode | TSAN |
| Kernel crash (Linux) | crash utility | GDB + KASAN | KASAN |
| Kernel crash (Windows) | WinDbg kernel | Driver Verifier | N/A |
| Android app crash | Tombstone + ndk-stack | Frida | HWASan |
| Rust/Go crash | Native debugger | Sanitizer output | Built-in |

Day 1: Debugger Fundamentals and Crash Dump Analysis

Reproduction Fidelity

[!IMPORTANT] Before any crash analysis, ensure you can reproduce the crash reliably. A crash that only happens "sometimes" or "on the fuzzer's machine" is nearly impossible to analyze or exploit. This section establishes the mandatory checklist for achieving reproduction fidelity.

Reproduction Fidelity Checklist

Before analyzing any crash, verify these match between discovery and analysis environments:

Essential Environment Knobs

ASAN/UBSAN Options (Linux/macOS):

glibc Allocator Tuning (Linux):

Core Dump Configuration (Linux):

ASLR Control (Linux - for deterministic analysis):

Input Path Matching

The crash may behave differently depending on HOW input reaches the target:

Example: stdin vs file difference:

Quick Reproduction Test Script

Installing WinDbg and Symbol Support

WinDbg Preview (recommended - modern UI):

Windows SDK Debugging Tools (includes cdb.exe for command-line/batch analysis):

Configure Symbol Path:
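
The conventional value is shown below; the `C:\Symbols` cache directory is just a convention, while the Microsoft public symbol server URL is standard:

```
rem Persist for the current user (new shells pick it up):
setx _NT_SYMBOL_PATH "srv*C:\Symbols*https://msdl.microsoft.com/download/symbols"

rem Or set it from inside a WinDbg session:
.sympath srv*C:\Symbols*https://msdl.microsoft.com/download/symbols
.reload /f
```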

Linux Crash Dump Generation and Pwndbg Setup

[!HINT] While Windows uses WinDbg, Linux crash analysis uses GDB enhanced with Pwndbg. This section covers parallel Linux setup.

Installing Pwndbg:
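
One common install path, per pwndbg's README (verify against upstream): `git clone https://github.com/pwndbg/pwndbg && cd pwndbg && ./setup.sh`. The setup script appends a line like this to `~/.gdbinit` (the path depends on where you cloned):

```
# ~/.gdbinit (added by pwndbg's setup.sh; /home/user/pwndbg is a placeholder)
source /home/user/pwndbg/gdbinit.py
```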

[!WARNING] Pwndbg is installed per-user via ~/.gdbinit. If you run sudo gdb, GDB uses root's home directory and won't find your pwndbg config. For crash analysis of your own compiled test programs you typically don't need sudo; only use sudo when attaching to system processes or analyzing setuid binaries.

Configuring Core Dumps on Linux:

[!TIP] For the exercises in this course, you typically only need:

On modern Ubuntu/Debian with systemd, cores are handled by systemd-coredump even if you set ulimit. Use coredumpctl to list and debug them.

[!WARNING] Optional: Local core files in CWD (modifies system-wide settings)

If you specifically need core files in your working directory instead of systemd-coredump:

Additional kernel settings that affect core dumps:

  • kernel.core_uses_pid: Append PID to core filename

  • fs.suid_dumpable: Controls dumps for setuid binaries (0=disabled, 1=enabled, 2=suidsafe)

Building a Vulnerable Test Suite for Linux

Create these vulnerable C programs to generate real crashes:

vulnerable_suite.c - Save this file for testing multiple vulnerability types:

Build the test suite:

Generate your first crashes:

Using coredumpctl (systemd systems):

Configuring systemd-coredump (/etc/systemd/coredump.conf):

After editing, reload: sudo systemctl daemon-reload

ASAN and Core Dumps

[!NOTE] ASAN often exits via SIGABRT, not SIGSEGV. This can be confusing when trying to capture core dumps.

Building a Vulnerable Test Suite for Windows

Prerequisites:

  • Visual Studio 2022 (Community edition is free) or Build Tools for Visual Studio

  • Open "x64 Native Tools Command Prompt for VS 2022" for compilation

vulnerable_suite_win.c - Save this file for Windows crash analysis practice:

Build the Windows test suite:

Generate your first Windows crashes:

Using PowerShell to generate long strings:

Verify crashes are captured:

WER/ProcDump Dump Collection

Windows Error Reporting (WER) LocalDumps

WER is Windows' built-in crash reporting. Configure it to save dumps locally:

Enable LocalDumps via Registry:

Per-Application LocalDumps (configure for our test binary):

Verify WER is Enabled:

Sysinternals ProcDump

ProcDump provides more control than WER and catches crashes in real-time:

Basic Crash Capture (using our test binary):

Advanced ProcDump Usage:

ProcDump + Fuzzing Integration:

Batch Dump Triage with CDB

Analyze multiple dumps automatically:

Batch triage script (batch_triage.cmd):

PowerShell Batch Analysis:

Symbols and Symbolization (Linux Quick Reference)

Meaningful backtraces (GDB, CASR, ASAN reports) require symbols.

1. Build with debug info (preferred for labs):

2. Install debug symbols for system libraries (real-world targets):

3. Use debuginfod for "fetch symbols on demand" (when local symbols unavailable):

4. Symbolize raw addresses when you only have PCs:

Symbol Hygiene Best Practices

  • Symbols make or break crash analysis.

  • Without them, you're staring at hex addresses instead of function names.

  • This section provides best practices for both Windows and Linux.

Linux Symbol Management

1. debuginfod (Automatic Symbol Fetching):

debuginfod can automatically fetch debug symbols on-demand from public servers when you don't have them installed locally.

[!IMPORTANT] debuginfod vs local debug packages: debuginfod queries remote servers for symbols you don't have locally. If you install debug symbol packages (e.g., coreutils-dbgsym), the symbols are stored locally at /usr/lib/debug/ and GDB uses them directly without needing debuginfod.

Verification: Don't use debuginfod-find to verify your setup—it only queries remote servers. Instead, verify GDB can find symbols:

When to use debuginfod: debuginfod is useful when you're analyzing crashes in binaries where you haven't installed the -dbgsym package. GDB will automatically fetch symbols from the configured server.

2. Installing Debug Symbol Packages:

3. Symbolizing Addresses with addr2line:

4. Verifying Symbol Quality:

Windows Symbol Management

1. Configuring _NT_SYMBOL_PATH:

2. WinDbg Symbol Commands:

3. Troubleshooting Symbol Issues:

Cross-Platform Symbol Checklist

  • Linux

  • Windows

  • Both Platforms

Analyzing Crash in Pwndbg

WinDbg User Interface Overview

  • Command Window: type commands here

  • Registers Window: view CPU register state

  • Disassembly Window: view assembly code at the current instruction pointer

  • Memory Window: inspect memory contents

  • Call Stack Window: view the function call hierarchy

  • Locals/Watch Window: inspect variables

Essential Keyboard Shortcuts:

  • F5: Go (continue execution)

  • F10: Step over

  • F11: Step into

  • Shift+F9: Set/remove breakpoint

  • Shift+F11: Step out

  • Ctrl+Break: Break into debugger

Analyzing Stack Buffer Overflow Crashes

Crash Scenario: Stack buffer overflow in vulnerable application

Load Crash Dump:

Initial Analysis Commands:

Analyzing Heap Corruption Crashes

Using the vuln_win.exe test suite from the "Building a Vulnerable Test Suite for Windows" section, generate heap-related crashes:

Load and Analyze Heap Overflow Dump:

Heap Metadata Corruption Pattern (typical output):

Identifying UAF with vuln_win.exe:

Classification: Use-After-Free - object accessed after being freed.

Common Crash Patterns and Identification

1. Null Pointer Dereference:

2. Access Violation (Invalid Address):

3. Stack Cookie Violation:

4. Heap Corruption Detected:

Essential WinDbg Commands Reference

Memory Examination:

Disassembly:

Breakpoints:

Execution Control:

Searching Memory:

Modules and Symbols:

Heap Commands:

Linux (Pwndbg Equivalents):

Pwndbg Crash Analysis Commands

Essential Pwndbg Commands for Crash Analysis:

Stack Overflow Offset Mini-Lab

This mini-lab teaches you to find the exact offset needed to control RIP:
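
An illustrative pwndbg session (pattern bytes and the offset vary by target; pwndbg's cyclic command is assumed):

```
pwndbg> cyclic 200                   # print a 200-byte de Bruijn pattern
aaaaaaaabaaaaaaacaaaaaaadaaaaaaa...
pwndbg> run 1 aaaaaaaabaaaaaaaca...  # re-run the target with the pattern as input
Program received signal SIGSEGV
pwndbg> cyclic -l jaaaaaaa           # look up the 8 bytes found in the saved
                                     # return-address slot (or at RSP on crash)
Found at offset 72
```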

[!NOTE] The offset (72 in this example) is the number of bytes from the start of your input to the saved return address. In Week 5, you'll replace 0xdeadbeefcafebabe with actual exploit targets (ROP gadgets, shellcode addresses, etc.).

Time Travel Debugging (TTD)

What Is TTD?:

  • Time Travel Debugging (TTD) is Microsoft's record-and-replay debugging technology: it records program execution and allows stepping backward in time.

  • Unlike traditional debugging where you can only step forward, TTD captures the entire execution trace, enabling you to navigate to any point in the program's history.

Why TTD Matters for Crash Analysis:

  • No More "Oops, I stepped too far": Step backward to inspect the exact state before a crash

  • Perfect Reproducibility: Recorded traces can be replayed indefinitely with identical behavior

  • Non-deterministic Bug Analysis: Catches race conditions, timing issues, and heisenbug patterns

  • Offline Analysis: Record on one machine, analyze on another

  • Root Cause Discovery: Trace backward from crash to find where corruption originated

Example TTD Workflow with vuln_win.exe:

This example uses the stack overflow crash from our test suite:

TTD Data Model Queries:

TTD integrates with WinDbg's data model, enabling powerful queries:

Memory Access Queries:

Call Queries:

Example: Finding Where Return Address Was Overwritten:

Example: Tracing User Input Through vuln_win.exe:

Practical TTD Crash Analysis: Use-After-Free in vuln_win.exe:

This example demonstrates TTD's power for analyzing UAF bugs:

TTD Best Practices:

  1. Record Minimal Scope: Only record the crashing process to keep traces manageable

  2. Use Breakpoints Wisely: Set breakpoints before recording to stop at interesting points

  3. Leverage Data Model: TTD queries are more powerful than manual navigation

  4. Save Interesting Positions: Use !positions to bookmark important execution points

  5. Combine with Memory Analysis: Use TTD to find when corruption occurred, traditional commands to analyze it

  6. Enable PageHeap for Heap Bugs: TTD + PageHeap gives you allocation/free stacks AND time travel

TTD Limitations:

  • Trace Size: Long-running processes create large trace files (GBs)

  • Performance: Recording adds ~10-20x slowdown

  • Windows Only: No Linux equivalent (use rr instead - see Day 4)

  • No Kernel Mode: TTD is user-mode only

  • x64 Only: No 32-bit support in modern versions

  • WinDbg Preview Required: Classic WinDbg from Windows SDK doesn't include TTD

Black-Box Crash Analysis

[!IMPORTANT] In real-world vulnerability research, especially on Windows, you rarely have source code. The sanitizer-based techniques in Day 2 require recompilation. This section covers black-box techniques for when you can't recompile.

When to Use Black-Box Analysis:

  • Analyzing crashes in closed-source software (Microsoft, Adobe, etc.)

  • Third-party libraries shipped as binaries

  • Malware analysis

  • CTF challenges without source

  • Production crash dumps from customers

Setup: Creating a Symbol-less Binary for Practice:

Manual Crash State Analysis

Initial Crash Assessment:

When RIP is Invalid - Use TTD to Go Back:

Module and Section Analysis:

Reverse Engineering the Vulnerable Function:

Identifying Library Functions Without Symbols:

String Search for Context Clues:

Pattern Recognition Without Symbols:

WinDbg Scripting for Black-Box Analysis

Automated Crash Classification Script:

Usage:

Quick Black-Box Analysis Commands:

GDB/Pwndbg Black-Box Script:
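
A minimal sketch of a batch-mode triage wrapper; the command list is a starting point, not a complete script:

```shell
#!/bin/sh
# triage.sh - black-box crash triage with GDB in batch mode (sketch).
# Usage: ./triage.sh <binary> [args...]
if [ "$#" -lt 1 ]; then
    echo "usage: triage.sh <binary> [args...]"
    exit 0
fi
exec gdb --batch --quiet \
    -ex run \
    -ex 'info registers rip rsp rbp' \
    -ex 'x/8i $rip' \
    -ex 'bt 10' \
    --args "$@"
```

With pwndbg loaded via ~/.gdbinit, the same batch run also prints its context panes, which makes grepping a directory of crashes for "candidate-interesting" registers much faster.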

Lab: Root Cause ≠ Crash Site

The Problem:

  • Heap corruption crashes often occur in malloc()/free() consistency checks

  • The actual overflow/UAF happened earlier—sometimes thousands of instructions before

  • Without understanding this, you'll waste hours staring at allocator internals

Lab Setup: The Delayed Corruption Bug

vulnerable_delayed.c - A bug where corruption and crash are separated:

Exercise Part 1: Observe the Problem (Without ASAN)

What You'll See:

  • Crash occurs in add_entry() or cleanup() - NOT in process_data()!

  • The error message is malloc(): corrupted top size - heap corruption detected

  • Backtrace shows allocator functions (_int_malloc, malloc_printerr, etc.)

  • The actual vulnerable strcpy() in process_data() is NOT visible in the backtrace

  • Signal is SIGABRT (from allocator detecting corruption)

Example backtrace (notice process_data is NOT shown):

The crash is in add_entry() during a malloc() call - the allocator detected that heap metadata was corrupted. But the actual bug is in process_data() which overwrote heap structures with 'A's.

Exercise Part 2: Reproduce with ASAN

ASAN Output (shows TRUE root cause):

Exercise Part 3: Find Root Cause with Watchpoints/rr

When you can't use ASAN (closed-source binary, can't recompile):
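
Illustrative GDB and rr command sequences; the watched address is hypothetical (take it from the allocator's abort message or from inspecting the heap):

```
# GDB hardware watchpoint on the corrupted heap metadata:
$ gdb --args ./vulnerable_delayed "$(python3 -c "print('A'*64)")"
(gdb) break main
(gdb) run
(gdb) watch -l *(long *)0x55555555b2a0    # illustrative metadata address
(gdb) continue
Hardware watchpoint 2: ... new value = 0x4141414141414141
# -> stopped inside strcpy, called from process_data(): the true bug site

# Same idea with rr: record once, then replay backwards from the crash:
$ rr record ./vulnerable_delayed "$(python3 -c "print('A'*64)")"
$ rr replay
(rr) continue                              # run forward to the SIGABRT
(rr) watch -l *(long *)0x55555555b2a0
(rr) reverse-continue                      # lands on the corrupting write
```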

Exercise Part 4: Document the Difference

Create a comparison table of what you observed:

Lab Deliverables

  1. Screenshot/log of non-ASAN crash (showing misleading backtrace)

  2. Screenshot/log of ASAN crash (showing true root cause)

  3. GDB transcript showing watchpoint catching the overflow

  4. Written explanation (2-3 sentences) of why the crash and bug are in different locations

Success Criteria:

  • Understand that crash site ≠ bug site for heap corruption

  • Can use ASAN to find true root cause

  • Can use watchpoints/rr to trace corruption without ASAN

  • Can explain the delayed corruption phenomenon

Identifying Vulnerability Types Without Source

1. Recognizing Heap UAF in Closed-Source:

2. Recognizing Type Confusion:

3. Recognizing Logic Bugs:

Practical Exercise

[!NOTE] You should have already built the vulnerable test suite earlier in this section. If not, scroll up to "Building a Vulnerable Test Suite for Linux" and complete that setup before continuing.

Alternative: Pre-built Vulnerable Targets

If you want additional crash samples beyond the test suite:

Tasks

Task: Analyze 5 different crash types and classify each

Using the test suite you built above (or crashes from your Week 2 fuzzing), analyze each crash type.

Crash Types to Generate and Analyze (Linux):

  1. stack_overflow - Run: ./vuln_no_protect 1 $(python3 -c "print('A'*200)")

  2. heap_overflow - Run: ./vuln_asan 2 $(python3 -c "print('A'*100)")

  3. use_after_free - Run: ./vuln_asan 3

  4. double_free - Run: ./vuln_asan 4

  5. null_deref - Run: ./vuln_no_protect 5 0

Crash Types to Generate and Analyze (Windows with TTD):

  1. stack_overflow - Record with TTD: vuln_win.exe 1 AAAA...(200+ chars)

  2. heap_overflow - Enable PageHeap first, then: vuln_win.exe 2 AAAA...(100+ chars)

  3. use_after_free - Enable PageHeap first, then: vuln_win.exe 3

  4. double_free - Run: vuln_win.exe 4

  5. null_deref - Run: vuln_win.exe 5 0

For Each Crash (WinDbg):

  1. Load and Get Overview:

  2. Examine Crash State:

  3. For TTD Traces - Find Root Cause:

For Each Crash (GDB/Linux):

  1. Load and Get Overview:

  2. Examine Crash State:

Classify Bug Type:

  • What register/memory caused crash?

  • What operation was attempted?

  • What's the root cause?

Assess Exploitability:

  • Can attacker control crash address?

  • Is value being written controllable?

  • Are there mitigations active?

Document Findings:

Success Criteria:

  • All 5 dumps analyzed

  • Correct crash type identified for each

  • Root cause understood

  • Exploitability assessment provided

  • Findings documented clearly

Lab: PageHeap/AppVerifier for Windows

[!IMPORTANT] PageHeap is the Windows equivalent of ASAN for heap bugs—it surrounds allocations with guard pages and tracks allocation/free stacks.

What PageHeap Does

PageHeap (part of Application Verifier / gflags) modifies the Windows heap to:

  • Place each allocation on its own page boundary

  • Add inaccessible guard pages after allocations

  • Keep freed memory inaccessible (catches UAF immediately)

  • Record allocation and free stack traces

Lab Setup

[!TIP] You can also use vuln_win.exe from the "Building a Vulnerable Test Suite for Windows" section earlier in Day 1. The dedicated heap_vuln.c below is simpler and focused specifically on heap bugs for this lab.

1. Create Vulnerable Windows Program:

2. Compile the Test Program:

Step-by-Step PageHeap Lab

Step 1: Run WITHOUT PageHeap (observe the problem):

Step 2: Enable PageHeap:

Step 3: Reproduce with PageHeap (crashes immediately):

Step 4: Analyze in WinDbg:

For UAF (Use-After-Free) Analysis:

Step 5: Check Mitigations with PowerShell:

Step 6: Disable PageHeap After Analysis:

Lab Deliverables

  1. Screenshot: gflags showing PageHeap enabled

  2. WinDbg log: !heap -p -a output showing allocation stack

  3. Comparison: Document behavior with/without PageHeap

  4. PowerShell output: Get-ProcessMitigation results

Key Takeaways

  1. WinDbg is essential: Primary tool for Windows crash analysis

  2. Symbols are crucial: Without symbols, analysis is much harder

  3. Crash patterns are recognizable: Common patterns indicate specific bug types

  4. Context matters: Same crash can have different exploitability based on mitigations

  5. Practice builds speed: Analyzing many crashes makes patterns obvious

  6. Pattern recognition is essential: Learn to recognize crash signatures without symbols

  7. Registers tell the story: Systematic register analysis reveals control

  8. Scripts accelerate triage: Automate repetitive analysis tasks

  9. TTD is powerful: Time-travel debugging helps even without symbols

  10. Document methodology: Structured reports help track analysis

  11. PageHeap is essential: Windows heap bug detection requires it

Discussion Questions

  1. How do stack cookies change the exploitability of stack overflows?

  2. What information can be gained from a crash even if it's not directly exploitable?

  3. How does Page Heap help identify heap corruption root causes?

  4. How does Time Travel Debugging (TTD) change your approach to finding where memory corruption originated, compared to traditional forward-only debugging?

Day 2: AddressSanitizer and Memory Error Classification

Understanding AddressSanitizer

[!TIP] Ubuntu Quick Setup - Copy this environment block before running ASAN-compiled binaries:

Key options explained:

  • abort_on_error=1: Abort on first error (generates signal for debugging)

  • disable_coredump=0: Allow core dump generation even with ASAN

  • detect_leaks=1: Enable LeakSanitizer (LSan)

  • symbolize=1: Show source file/line in reports

Note on ASAN + core dumps: ASAN often calls abort() on errors, which generates SIGABRT (-6), not SIGSEGV (-11). Set disable_coredump=0 if you need core dumps for post-mortem analysis.

What is ASAN?:

  • Compiler instrumentation tool for detecting memory errors

  • Inserts runtime checks around memory operations

  • Uses "shadow memory" to track allocation state

  • Detects: buffer overflows, UAF, double-free, memory leaks, and more

How It Works:

  1. Shadow Memory: 1 shadow byte tracks 8 bytes of application memory

  2. Red Zones: Poisoned memory surrounding allocations

  3. Quarantine: Freed memory held before reuse to catch UAF

  4. Stack Instrumentation: Red zones around stack variables

Installing and Using ASAN (Linux)

With Clang:

With GCC:

ASAN Error Types and Reports

1. Heap Buffer Overflow:

Vulnerable Code:

ASAN Report:

Shadow Memory Interpretation:

  • fa = heap redzone (poison bytes around allocations)

  • 00 = 8 fully addressable bytes

  • 02 = 2 more addressable bytes (totaling the 10-byte allocation)

  • [02] bracket shows exactly where the overflow was detected

Analysis:

  • Error: heap-buffer-overflow

  • Operation: WRITE of size 18 (string "This is too long!" + null terminator)

  • Location: heap.c:6 (strcpy transformed to memcpy)

  • Allocation: 10-byte buffer allocated at line 5

  • Overflow: 8 bytes past end of allocation (detected at byte 10)

2. Stack Buffer Overflow:

Vulnerable Code:

ASAN Report:

Analysis:

  • Error: stack-buffer-overflow

  • Operation: WRITE of size 29 (28 'A' characters + null terminator)

  • Location: stack.c:5 (strcpy in vulnerable_function)

  • Buffer: 16-byte buffer 'buffer' at stack frame offset [32, 48)

  • Overflow: 13 bytes past end of allocation (access at offset 48, buffer ends at 48)

  • Shadow byte f1: Stack left redzone

  • Shadow byte f3: Stack right redzone (where overflow was detected)

3. Use-After-Free:

Vulnerable Code:

ASAN Report:

Analysis:

  • Error: heap-use-after-free

  • Operation: WRITE of size 4 (writing int value 43)

  • Location: uaf.c:8 (assignment *ptr = 43)

  • Allocation: 4-byte region allocated at line 5

  • Free: Memory freed at line 7

  • Use: Dangling pointer write at line 8

  • Shadow byte fd: Freed heap memory (quarantined by ASAN)

4. Double-Free:

Vulnerable Code:

ASAN Report:

Analysis:

  • Error: double-free (attempting to free already-freed memory)

  • Operation: Second free() call on same pointer

  • Location: df.c:7 (second free(ptr))

  • Allocation: 10-byte region allocated at line 5

  • First free: Memory freed at line 6

  • Second free: Invalid free attempt at line 7

  • Impact: Can corrupt heap metadata, potentially exploitable

5. Memory Leak:

Vulnerable Code:

ASAN Report (with leak detection enabled):

Analysis:

  • Error: Memory leak detected by LeakSanitizer (part of ASAN)

  • Type: Direct leak (pointer lost, not reachable)

  • Size: 100 bytes in 1 allocation

  • Location: ml.c:5 (malloc call)

  • Cause: Program exits without freeing allocated memory

  • Note: LeakSanitizer runs at program exit to detect unreachable allocations

ASAN Options and Configuration

Key Options:

Suppression File Example (asan_suppressions.txt):

Comparing ASAN with Traditional Debugging

ASAN Advantages:

  • Detects errors at point of occurrence (not later crash)

  • Provides exact allocation/free stack traces

  • Catches leaks without explicit testing

  • Red zones catch off-by-one errors

  • Quarantine catches some UAF that might not crash

Limitations:

  • Performance overhead limits production use

  • Doesn't catch all logic bugs

  • Can miss non-deterministic races

  • Requires recompilation

When to Use Each:

  • ASAN: During development and fuzzing for comprehensive testing

  • Traditional debugging: Production crashes, reverse engineering binaries

  • Both: Reproduce ASAN-found bug in debugger for detailed analysis

When ASAN Changes Behavior

[!WARNING] ASAN modifies heap layout and timing. A bug that crashes reliably under ASAN may behave completely differently (or not manifest at all) in a non-ASAN build. Always reproduce important bugs in both configurations.

Why ASAN Changes Crash Behavior:

  1. Heap Layout Changes:

    • ASAN adds red zones (padding) around allocations

    • Allocation sizes are rounded up

    • Heap addresses are completely different

    • Adjacent allocations that would overlap in normal builds are separated

  2. Quarantine Effects:

    • Freed memory is held in quarantine before reuse

    • UAF bugs may "disappear" because memory isn't immediately reallocated

    • Without ASAN, freed memory may be immediately reused

  3. Timing Differences:

    • ASAN instrumentation adds overhead

    • Race conditions may hide or manifest differently

    • Callback timing changes

Mini-Lab: Same Bug, Different Manifestation

uaf_timing.c - Demonstrates how UAF behavior differs with/without ASAN:

Exercise:

Key Observations:

  1. Without ASAN: malloc() immediately reused the freed slot

  2. With ASAN: Quarantine prevents reuse; UAF is detected

  3. The "bug" exists in both builds, but only ASAN catches it

Quarantine Tuning

Control ASAN's quarantine to understand timing effects:

Reproduction Best Practice

For any bug found with ASAN:

Other Sanitizers

  • While AddressSanitizer (ASAN) is the most widely-used sanitizer for spatial memory safety, the LLVM sanitizer family includes several complementary tools that detect different bug classes.

  • Understanding when to use each sanitizer—and which ones can be combined—is essential for comprehensive testing.

MemorySanitizer (MSAN): Detecting Uninitialized Memory

What MSAN Detects:

  • Use of uninitialized memory

  • Uninitialized variables passed to functions

  • Uninitialized memory in conditionals

  • Propagation of uninitialized data

Compilation:

Installing libc++ for MSAN from apt.llvm.org (Optional but recommended):

MSAN works best with an instrumented libc++. Without it, you may get false positives from uninstrumented stdlib calls. The LLVM project provides pre-built libc++ packages via apt.llvm.org.

Example MSAN Detection:

MSAN Report:

When to Use MSAN:

  • Logic errors from uninitialized variables

  • Information leaks via uninitialized stack/heap data

  • Parser bugs that rely on uninitialized state

  • Kernel-style code sensitive to info leaks

ThreadSanitizer (TSAN): Detecting Data Races

What TSAN Detects:

  • Data races between threads

  • Unsynchronized memory accesses

  • Use-after-free in multithreaded contexts

  • Deadlocks

  • Lock order violations

Example TSAN Detection:

Compilation:

TSAN Report:

When to Use TSAN:

  • Multithreaded applications

  • Server software with concurrent request handling

  • Race condition vulnerabilities

  • Non-deterministic crashes

  • Lock-free data structures

Lab: Race Condition Analysis with TSAN and Valgrind

Lab Target: Multithreaded UAF

race_uaf.c - A race condition leading to use-after-free:

Exercise Part 1: Reproduce with TSAN

Exercise Part 2: Detect Races with Helgrind

TSAN detects the race, but Helgrind (part of Valgrind) provides more detailed analysis and works in VMs without hardware PMU support:

Sample Helgrind Output:

Helgrind shows:

  • Which threads are racing (thread #2 vs #3)

  • Exact source locations (line 32 vs line 17)

  • The memory address and allocation origin

  • That no locks were held during access

Exercise Part 3: Analyze the Race Conditions

Use Helgrind output to answer these questions:

  1. What data is being raced on?

    Look for "Possible data race" messages - they show the address and what allocated it:

  2. Which threads are involved?

    Helgrind announces threads and shows their creation stack:

  3. What's the UAF pattern?

    Look for races where one thread writes/frees while another reads:

  4. Identify the strcpy UAF:

Lab Deliverables

  1. TSAN report showing the detected race

  2. The valgrind --tool=helgrind command you used and the data race report it produces

  3. Interleaving description: Which thread did what, in what order

  4. Root cause: One paragraph explaining the bug

Success Criteria:

  • Can detect race with TSAN

  • Can confirm the race with Valgrind (Helgrind)

  • Can explain the thread interleaving that causes the bug

  • Understand why normal runs often don't crash

UndefinedBehaviorSanitizer (UBSAN): Catching Undefined Behavior

What UBSAN Detects:

  • Integer overflow (signed)

  • Division by zero

  • Null pointer dereference

  • Misaligned pointer access

  • Array bounds violations (with bounds checking)

  • Type confusion (via vptr checks)

  • Shifts by invalid amounts

Example UBSAN Detection:

Compilation:

Compiler Warning (at compile time):

UBSAN Runtime Report:

Key Observations:

  • Integer overflow (line 7): Detected and recoverable — execution continues, showing wrapped value -2147483648

  • Division by zero (line 11): Detected but fatal — CPU raises SIGFPE (Floating Point Exception), program aborts regardless of halt_on_error setting

  • Without halt_on_error=0, UBSAN aborts on the first error (integer overflow)

When to Use UBSAN:

  • Integer overflow vulnerabilities

  • Arithmetic bugs in parsers

  • Type confusion detection

  • Undefined behavior that doesn't crash immediately

  • Hardening development builds

Sanitizer Combinations

Compatible Combinations:

Running Combined Sanitizers:

Incompatible Combinations (Cannot Use Together):

Combination
Reason

ASAN + MSAN

Both use shadow memory with conflicting layouts

ASAN + TSAN

Conflicting instrumentation and memory tracking

MSAN + TSAN

Conflicting instrumentation

Combination Best Practices:

  1. Default Fuzzing Setup: ASAN + UBSAN

    • Catches most memory corruption + arithmetic errors

    • Good performance trade-off (~2x slowdown)

    • Use: clang -fsanitize=address,undefined ...

  2. Dedicated MSAN Run: Separate build with MSAN + UBSAN

    • Run periodically to catch uninitialized memory

    • Requires instrumented libc++ (clang++ -stdlib=libc++)

    • Cannot combine with ASAN

  3. Dedicated TSAN Run: For multithreaded targets

    • Run separate TSAN build (cannot combine with ASAN/MSAN)

    • Higher overhead (~5-15x slowdown)

    • Use: gcc -fsanitize=thread -lpthread ...

Performance Comparison

Sanitizer
CPU Overhead
Memory Overhead
Use Case

ASAN

~2x

2-3x

Spatial memory safety (overflow, UAF)

MSAN

~3x

2-3x

Uninitialized memory reads

TSAN

5-15x

5-10x

Data races in multithreaded code

UBSAN

~1.2x

Minimal

Undefined behavior (overflow, div-by-zero)

ASAN+UBSAN

~2.2x

2-3x

Combined memory + arithmetic bugs

Performance Notes:

  • ASAN overhead is predictable and acceptable for fuzzing

  • TSAN overhead makes it impractical for long fuzzing campaigns

  • UBSAN adds minimal overhead—almost always worth enabling

  • MSAN requires instrumented standard library for full effectiveness

Advanced Sanitizers (Brief Overview)

Several newer sanitizer technologies address ASAN's limitations. These are covered in depth in later weeks but are important to know about for crash analysis:

HWASan (Hardware-assisted AddressSanitizer):

  • Uses ARM64 Top Byte Ignore (TBI) feature for memory tagging

  • CPU overhead similar to ASAN (~2x), but memory overhead is only ~15% extra versus ASAN's 2-3x

  • Essential for Android/ARM64 crash analysis

  • Detects same bug classes as ASAN with better memory efficiency

MTE (Memory Tagging Extension):

  • ARM hardware feature (ARMv8.5+, e.g., Pixel 8, server ARM64)

  • Near-zero overhead memory safety in production

  • Crashes from MTE-enabled binaries require understanding tag mismatch errors

  • Increasingly important as ARM64 adoption grows

GWP-ASan (Google-Wide Performance ASan):

  • Sampling-based allocator for production use

  • Catches ~1% of heap bugs with minimal overhead

  • Deployed in Chrome/Chromium and Android (platform- and version-specific), and available via allocator integrations (e.g., LLVM Scudo)

  • Useful for analyzing crashes from production telemetry

Frida for Dynamic Analysis:

  • Runtime instrumentation without recompilation

  • Essential for closed-source binary crash analysis

  • Can trace memory operations, hook functions, and dump state

  • Covered in detail in later weeks for mobile/binary analysis

These tools become relevant when analyzing crashes from production systems, mobile platforms, or closed-source binaries where traditional ASAN isn't available.

GWP-ASan: Production Crash Analysis

GWP-ASan (originally "Google-Wide Performance ASan") is a sampling-based heap error detector designed for production use.

Where GWP-ASan Runs:

  • Chrome/Chromium: Deployed in production (often via feature flags/field trials); used for crash telemetry

  • Android: Integrated into the platform allocator on many devices; configuration is platform-specific

  • LLVM/Scudo allocator: Includes GWP-ASan; the easiest way to try it locally is building with -fsanitize=scudo

  • Other allocators: Some allocators implement guarded sampling / GWP-ASan-style mechanisms

How GWP-ASan Works:

Analyzing GWP-ASan Crash Reports:

GWP-ASan reports look similar to ASAN but with sampling context:

Enabling GWP-ASan:

Reproducing GWP-ASan Crashes:

GWP-ASan crashes are non-deterministic (sampled). To reproduce:

GWP-ASan vs ASAN for Crash Analysis:

Aspect
GWP-ASan
ASAN

Overhead

~0.1%

~200%

Memory

Minimal

2-3x

Detection rate

~1% of bugs

All triggered bugs (supported classes)

Use case

Production

Development/fuzzing

Reproducibility

Low (sampling)

100%

Deployment

Safe for prod

Never in prod

Workflow: GWP-ASan Crash → Full Analysis:

Key Points for GWP-ASan Analysis:

  1. Sampling means incomplete view: The bug exists, but you only caught it by luck

  2. Allocation context is crucial: The allocation stack tells you what was sampled

  3. Use full ASAN to reproduce: Convert GWP-ASan report to ASAN-reproducible test

  4. Production-only bugs are real: Some bugs only manifest under real workloads

  5. Check telemetry frequency: Multiple GWP-ASan hits = higher severity bug

Practical Workflow

Step 1: Initial Fuzzing (ASAN + UBSAN):

Step 2: Periodic MSAN Check:

Step 3: Multithreaded Target TSAN Check:

Sanitizer Selection Guide:

Example: Combining Sanitizers

Scenario: Fuzzing a multithreaded HTTP server

Phase 1: ASAN + UBSAN fuzzing (24 hours)

Phase 2: MSAN validation (4 hours)

Phase 3: TSAN validation (4 hours)

Result: 8 unique bugs across 3 bug classes

Practical Exercise

Task: Identify and classify 10 ASAN-detected bugs

If you built and fuzzed real targets in Week 2 (for example, libWebP, GStreamer, or your own small parser/HTTP server), consider recompiling one of those exact targets with ASAN and running this workflow on the crashes you already found. The synthetic exercises below are fine to start with, but applying the same process to a familiar Week 2 target will make the connection between fuzzing and crash analysis very concrete.

Provided Test Programs (compile each with ASAN):

  1. heap_overflow.c - Heap buffer overflow

  2. stack_overflow.c - Stack buffer overflow

  3. uaf_read.c - Use-after-free (read)

  4. uaf_write.c - Use-after-free (write)

  5. double_free.c - Double-free

  6. memory_leak.c - Memory leak

  7. global_overflow.c - Global buffer overflow

  8. stack_use_after_return.c - Stack use-after-return

  9. initialization_order.c - Initialization order bug

  10. alloc_dealloc_mismatch.c - new/delete mismatch

For Each Program:

  1. Compile with ASAN:

  2. Run and Capture Output:

  3. Analyze Report:

    • What type of error was detected?

    • What line triggered it?

    • What was the allocation/free stack trace?

    • How many bytes were involved?

  4. Classify Exploitability:

    • Read vs Write access?

    • Controlled by attacker input?

    • How many bytes overflow?

    • What mitigations apply?

  5. Document:

Success Criteria:

  • All 10 programs analyzed

  • ASAN error types correctly identified

  • Stack traces interpreted

  • Exploitability assessed

  • Clear documentation of findings

Key Takeaways

  1. ASAN is powerful: Catches bugs at source, not just symptoms

  2. Detailed reports: Allocation and free stacks make root cause obvious

  3. Multiple error types: Different bugs have different ASAN signatures

  4. Essential for fuzzing: Turns crashes into actionable vulnerability reports

  5. Combine with debugging: ASAN finds bug, debugger analyzes exploit primitive

Discussion Questions

  1. Why does ASAN have lower false positive rate than traditional memory checkers like Valgrind?

  2. How does the quarantine mechanism help catch use-after-free bugs?

  3. When would you use MSAN vs ASAN vs TSAN for a multi-threaded program with suspected memory issues?

  4. Why can't ASAN and MSAN be combined in the same build, and how do you work around this limitation?

Day 3: Exploitability Assessment with Automated Tools

Quick Triage Checklist

Before diving into detailed analysis, run through this checklist for every crash:

Interactive Analysis and Mitigation Checks

Checking Binary Mitigations First

Always check mitigations before deep analysis - they determine exploitability:

Using checksec (pwntools):

Checking for CET (Control-flow Enforcement Technology):

Checking System-Wide Protections:

Enhanced GDB with Pwndbg

  • Modern crash analysis on Linux uses enhanced GDB plugins that provide significantly better crash context than vanilla GDB.

  • Pwndbg is the current standard for exploit development and crash analysis, replacing older tools like the now-unmaintained GDB exploitable plugin.

What Pwndbg Provides:

  • Automatic context display on every stop (registers, stack, code, backtrace)

  • Heap visualization and analysis (heap, bins, arena)

  • Memory search and pattern finding (search, telescope)

  • Exploit development helpers (cyclic, rop, checksec)

  • Enhanced memory display with smart dereferencing

Crash Analysis with Pwndbg:

Key Pwndbg Commands for Crash Analysis:

Automated Batch Analysis with Pwndbg:

Exploitability Assessment with Pwndbg:

CASR - Modern Crash Analyzer

What Is CASR?:

CASR (Crash Analysis and Severity Reporter) is a modern, Rust-based crash analysis framework developed by ISP RAS.

Key Features (v2.13+ / Latest: v2.14):

  • Multi-language support: C/C++, Rust, Go, Python, Java, JavaScript, C#

  • Multiple analysis backends: ASAN, UBSAN, TSAN, MSAN, GDB, core dumps

  • Fuzzer integration: AFL++, libFuzzer, Atheris (Python), honggfuzz

  • CI/CD ready: SARIF reports, DefectDojo integration, GitHub Actions support

  • 23+ severity classes: Precise exploitability assessment with modern patterns

  • Clustering: Automatic deduplication using stack trace similarity

  • TUI interface: Interactive crash browsing with filtering

  • LibAFL integration: Native support for Rust-based fuzzing (v2.14+)

Installation:

[!IMPORTANT] CASR severity is heuristic-based: CASR is a triage assistant, not an oracle. Its classifications (EXPLOITABLE, PROBABLY_EXPLOITABLE, NOT_EXPLOITABLE) are based on crash patterns and may not reflect actual exploitability. Always perform manual analysis on high-priority crashes. For example:

  • A "NOT_EXPLOITABLE" null deref might become exploitable with heap manipulation

  • An "EXPLOITABLE" crash might be blocked by mitigations CASR doesn't detect

  • Use CASR for prioritization, not final verdicts

CASR Tool Suite

casr-san: Analyze sanitizer output (ASAN/UBSAN/MSAN/TSAN)

casr-gdb: Analyze crashes via GDB (no sanitizer needed)

casr-core: Analyze core dumps

casr-cluster: Deduplicate and cluster crashes

casr-cli: TUI for browsing crash reports

AFL++ Fuzzing to CASR Triage

Timeouts and Hangs Are Bugs Too

Why Timeouts Matter

  • Denial of Service: A single malicious input causing 100% CPU for hours

  • Algorithmic Complexity: O(n²) or O(n!) behavior with crafted input

  • Deadlocks: Multithreaded code stuck waiting forever

  • Resource Exhaustion: Memory growth without bounds

Creating a Hang-Prone Test Program

First, let's create a program that can hang to practice these techniques:

Build the hang test program:

Collecting Stack Dumps from Hangs

CASR Classification for Hangs

CASR is designed for crash analysis, not hang detection. It requires the program to actually crash (receive a signal like SIGSEGV or SIGABRT from within the program):

Key insight: Hangs and timeouts are different from crashes:

  • Crash: Program receives a signal (SIGSEGV, SIGABRT) due to internal error

  • Hang: Program runs forever, must be killed externally

  • CASR: Only analyzes crashes, not externally-killed processes

For hang analysis, use the GDB attach method shown in Method 1 above.

When to use CASR: Use it for actual crashes from the Day 1-2 test binaries:

Simple Hang Bucketing

When you have many timeouts from fuzzing, bucket by stack signature:

Test the bucketing script:

Infinite Loop Detection Patterns

When analyzing hangs interactively, GDB helps identify the specific loop pattern. The key is distinguishing between a program waiting for input (blocked in read()) versus an actual infinite loop (spinning CPU).

Common Mistake: Blocking vs Spinning

Correct Approach: Provide Input First

Distinguishing Hang Types:

Testing Different Loop Patterns:

Identifying Algorithmic Hangs vs Infinite Loops:

Algorithmic Complexity Attack Detection

CASR Severity Classes

CASR classifies crashes into three main categories with 23 specific types:

EXPLOITABLE (High Severity):

  1. SegFaultOnPc: Instruction pointer controlled by attacker

  2. ReturnAv: Return address overwrite

  3. BranchAv: Branch target controlled

  4. CallAv: Call instruction with controlled target

  5. DestAv: Write-what-where primitive

  6. heap-buffer-overflow-write: Heap write overflow

PROBABLY_EXPLOITABLE (Medium Severity):

  1. SourceAv: Read from controlled address

  2. BadInstruction: Invalid opcode execution

  3. heap-use-after-free-write: UAF write access

  4. double-free: Double free corruption

  5. stack-buffer-overflow: Stack corruption

  6. heap-buffer-overflow: Heap read overflow

NOT_EXPLOITABLE (Low Severity):

  1. AbortSignal: Intentional abort

  2. null-deref: NULL pointer dereference

  3. SafeFunctionCheck: Security check triggered

Additional Severity Types:

  • stack-use-after-return: Stack address used after return

  • stack-use-after-scope: Stack variable used after scope

  • heap-use-after-free: UAF read

  • global-buffer-overflow: Global array overflow

  • container-overflow: STL container bounds violation

  • initialization-order-fiasco: Static init race

  • alloc-dealloc-mismatch: new/delete mismatch

  • signal: Uncaught signal (SIGABRT, SIGFPE, etc.)

Example CASR Report

Here's an actual CASR report from analyzing a stack buffer overflow:

Key Fields Explained:

  • CrashSeverity.Type: EXPLOITABLE / PROBABLY_EXPLOITABLE / NOT_EXPLOITABLE

  • CrashSeverity.ShortDescription: Specific bug class (e.g., stack-buffer-overflow(write))

  • Stacktrace: Full call stack with source locations (when symbols available)

  • CrashLine: Exact source file and line where crash occurred

  • Source: Context lines around the crash (with ---> marking the crash line)

  • AsanReport: Complete ASAN output including shadow memory visualization

Mitigation Context

When assessing exploitability, you must understand which mitigations are active. Modern systems have multiple layers of protection that affect whether a crash is weaponizable.

Checking Mitigations on Linux:

Checking Mitigations on Windows:

Modern Mitigation Impact on Exploitability:

Mitigation
What It Prevents
Bypass Complexity
Deployment Status

Stack Canaries

Stack buffer overflow → RIP control

Medium (info leak required)

Universal

NX/DEP

Execute shellcode on stack/heap

Medium (ROP/JOP required)

Universal

ASLR/PIE

Hardcoded addresses in exploits

Medium (info leak required)

Universal

RELRO

GOT overwrite

Full RELRO: High

Common (Full in hardened)

CFG/CFI

Arbitrary indirect calls

High (gadget constraints)

Windows default, Linux opt-in

CET Shadow Stack

ROP attacks

Very High (hardware enforced)

Windows 11+, Chrome, Edge

CET IBT

JOP/COP attacks

Very High (hardware enforced)

Emerging (Linux 6.2+)

ARM PAC

Pointer corruption

High (key required)

Apple Silicon, Android 12+

ARM BTI

Branch to arbitrary code

High (landing pads required)

ARMv8.5+, iOS/Android

ARM MTE

Spatial/temporal memory bugs

High (tag bypass required)

Pixel 8+, select ARM servers

CET (Control-flow Enforcement Technology):

Intel CET is a game-changer for exploitability assessment. Available on 11th Gen+ Intel and AMD Zen 3+:

ARM Pointer Authentication (PAC):

On Apple Silicon and ARMv8.3+ systems:

Exploitability Assessment Update:

When documenting crashes, always include mitigation context:

Key Questions for Exploitability:

  1. Is CET/PAC enabled? If yes, ROP/JOP may be blocked

  2. Is CFG/CFI present? Limits callable targets

  3. Is the binary sandboxed? (Chrome, iOS apps)

  4. What's the deployment context? (kernel, hypervisor, user-space)

  5. Are there adjacent info leak primitives?

Microsoft !exploitable (Windows)

What It Does:

  • WinDbg extension for exploitability analysis

  • Similar to GDB exploitable

  • Classifies Windows crashes

  • Essential for Windows fuzzing

Installation:

[!NOTE] The original Microsoft download (download ID 44445) is no longer available. The community-maintained build at the GitHub repository above provides the same functionality.

Usage:

Automated Batch Analysis (PowerShell):

Command-Line Quick Analysis:

Crash Deduplication Strategies

Why Deduplication Matters:

  • Fuzzing generates thousands of crashes

  • Many crashes are duplicates (same root cause)

  • Need to focus on unique bugs

  • Reduces manual analysis workload

Deduplication Methods:

1. Stack Hash:

2. Coverage Hash:

3. Exploitable Hash:

4. ASAN Report Hash:

Combining Tools for Best Results

Recommended Workflow:

  1. AFL++ Fuzzing: Generate crashes with coverage-guided fuzzing

  2. CASR triage: Initial deduplication and classification

  3. ASAN Analysis: Detailed classification of unique crashes

  4. CASR Clustering: Group similar bugs together

  5. Manual Review: Verify high-priority crashes

  6. Exploit Development: Focus on EXPLOITABLE crashes

Practical Exercise

Task: Triage 20 AFL++ crashes using CASR and automated tools

[!TIP] If you completed Week 2 fuzzing exercises (libWebP, GStreamer, json-c, or your own targets), use those real crashes here. The workflow is more meaningful with crashes you generated yourself.

Setup

Step 1: Generate CASR Reports

Step 2: Cluster Similar Crashes

Step 3: Prioritize by Exploitability

Step 4: Interactive Review with casr-cli

Step 5: Document Findings

Create a triage report following this template:

Success Criteria

Exercise: Black-Box Stripped Binary Analysis

In the real world, you often analyze crashes in binaries without symbols or source code. This exercise forces you to do crash analysis using only primitive tools.

Setup

Your Task

Analyze the crash without source code or symbols. Use only:

  • gdb / pwndbg for debugging

  • checksec for mitigations

  • objdump / readelf for binary info

Hints (use these Pwndbg commands):

Deliverable

Write a 1-page report answering:

  1. What signal/crash type occurred?

  2. What instruction caused the crash?

  3. Which registers contain attacker-controlled data?

  4. What's the likely vulnerability type?

  5. Is it exploitable? Why/why not?

Success Criteria:

Exercise: Realistic Corpus Pipeline (Week 2 → Week 4)

This exercise connects fuzzing (Week 2) to crash analysis (Week 4) and PoC development. Use AFL++ output from Week 2 if available.

Pipeline Overview

Your Task

Complete the full pipeline from raw crashes to a working PoC:

Step 1: Gather Crashes

Step 2: Triage with CASR

Step 3: Minimize Top Crash

Step 4: Write PoC

Deliverable

A short report documenting:

  1. Input: How many crashes, from what target

  2. Triage: EXPLOITABLE/PROBABLY_EXPLOITABLE/NOT_EXPLOITABLE counts

  3. Clusters: How many unique bugs found

  4. Selected crash: Which one and why

  5. Minimization: Original vs minimized size

  6. PoC: Does it reliably trigger the crash?

Success Criteria:

Standardized Triage Notes: The Crash Card

This one-page document captures everything needed to understand, reproduce, and prioritize the bug. It becomes your deliverable for professional crash analysis.

Crash Card Template

Example Filled-In Crash Card

Key Takeaways

  1. Automation is essential: Manual triage of thousands of crashes is impractical

  2. Multiple tools provide confidence: Agreement between classifiers increases confidence

  3. Deduplication saves time: Focus on unique bugs, not duplicate crashes

  4. Exploitability guides priority: EXPLOITABLE bugs warrant immediate attention

  5. Clustering reveals patterns: Multiple crashes often share root cause

  6. Standardized reports: Crash Cards make analysis professional and reproducible

Discussion Questions

  1. How reliable are automated exploitability assessments (CASR, Pwndbg checksec, !exploitable) compared to manual analysis?

  2. What are the limitations of stack-hash based deduplication used by these tools?

  3. Why might two crashes with different stack traces have the same root cause?

  4. When would you choose CASR batch analysis over interactive Pwndbg debugging?

Day 4: Reachability Analysis - Tracing Input to Crash

Understanding Reachability Analysis

What Is Reachability?:

  • Tracing how attacker-controlled input reaches vulnerable code

  • Answering: "Can an attacker trigger this bug?"

  • Essential for proving exploitability

Why It Matters:

  • Bug in reachable code = vulnerability

  • Bug in unreachable code = non-issue (for that attack surface)

  • Determines attack complexity and prerequisites

Methods:

  1. Static Analysis: Code review, call graph analysis

  2. Dynamic Analysis: Runtime tracing, instrumentation

  3. Symbolic Execution: Path exploration with constraints

  4. Hybrid: Combine static and dynamic

Coverage-Guided Reachability (DynamoRIO)

DynamoRIO + drcov:

  • Dynamic binary instrumentation framework

  • drcov module tracks code coverage

  • Generates .drcov files for Lighthouse

  • Works on binaries without source

Installation:

Collecting Coverage:

Visualizing in Lighthouse (IDA Pro / Binary Ninja):

Differential Coverage:

Intel Processor Trace (PT)

What Is Intel PT?:

  • Hardware-based execution tracing

  • Records all branches taken by CPU

  • Near-zero overhead (~5%)

  • Requires supported CPU (Broadwell+)

Check Support:

[!NOTE] Intel PT doesn't work inside VMs by default. For KVM/QEMU, the host kernel needs CONFIG_KVM_INTEL_PT=y and kvm_intel pt_mode=1. The VM also needs intel_pt=on in its CPU flags. If PT isn't available, use software-based alternatives like perf record with software events, or run PT workloads on bare metal.

Intel PT Example: Tracing Stack Overflow to Crash:

This example uses the vuln_no_protect binary from Day 1 to trace how input reaches the vulnerable stack_overflow() function:

Tracing Different Vulnerability Types:

Using libipt for Custom Analysis:

Frida-Based Tracing (Alternative for Closed-Source)

When DynamoRIO isn't available or you need cross-platform tracing, Frida provides dynamic instrumentation without recompilation. This is especially useful for analyzing crashes in binaries where you don't have source code.

Installation:

Basic Function Tracing with Lab Binaries:

[!NOTE] The functions in vuln_no_protect (like stack_overflow, heap_overflow, etc.) are not exported symbols - they're internal functions. Module.findExportByName() won't find them, but Frida can resolve them automatically using DebugSymbol.fromName() if the binary has symbols (not stripped).

[!TIP] For stripped binaries: If DebugSymbol.fromName() returns null addresses, the binary was compiled without symbols (-s flag) or stripped with strip. In that case, you'll need to get addresses manually with nm (before stripping) or reverse engineer them with Ghidra/IDA.

Running Frida Traces with Lab Binaries:

Key Lessons:

  1. DebugSymbol.fromName(): Resolves internal function symbols automatically (no manual nm needed)

  2. findExportByName(): Only works for dynamically exported symbols (libc, shared libs)

  3. Defer libc hooks with setTimeout: When using -f (spawn mode), libraries aren't loaded at script init time

  4. Stripped binaries: If symbols are stripped, you'll need manual address resolution via reverse engineering

Memory Access Tracing (Find what reads your input):

Complete Reachability Analysis Script:

Running the Reachability Script:

Record and Replay Debugging (rr)

What Is rr?:

  • Records program execution deterministically

  • Replays execution in GDB

  • Allows reverse execution (step backward!)

  • Perfect for analyzing non-deterministic bugs and tracing data flow

Installation:

Recording and Replaying Lab Binaries:

Tracing Stack Overflow with rr:

Tracing Use-After-Free with rr:

Tracing Double-Free with rr:

rr vs TTD: When to Use Which

Feature
rr (Linux)
TTD (Windows)

Platform

Linux only

Windows only

Recording overhead

~5-10x

~10-20x

Trace size

Moderate

Large (GBs for long runs)

Query capability

Basic (GDB commands)

Advanced (Data Model queries)

Reverse execution

Full support

Full support

Multi-threaded

Yes (chaos mode for races)

Yes

Kernel debugging

No

No (user-mode only)

ARM64 support

Yes (v5.6+)

No (x64 only)

IDE integration

VSCode (Midas), GDB

WinDbg Preview

Best for

Linux apps, race conditions

Windows apps, complex queries

Decision Guide:

  • Analyzing Linux crash? → Use rr

  • Analyzing Windows crash? → Use TTD

  • Need to query "when did X change"? → TTD's data model is more powerful

  • Hunting race conditions? → rr's chaos mode

  • Limited resources/VM? → rr has lower overhead

Don't use rr for:

  • Windows targets (use TTD instead)

  • Kernel debugging (use KGDB/crash instead)

  • Performance-sensitive recording (use Intel PT for lightweight tracing)

  • GUI applications (high overhead on X11/Wayland)

Taint Analysis Concepts

What Is Taint Analysis?:

  • Mark input data as "tainted"

  • Track taint propagation through execution

  • Identify if crash involves tainted data

Taint Sources (where data comes from):

  • Network input (recv, read from socket)

  • File input (read, fread)

  • User input (scanf, gets)

  • Command-line arguments (argv)

  • Environment variables (getenv)

Taint Sinks (where vulnerabilities occur):

  • Memory operations (memcpy, strcpy)

  • System calls (exec, system)

  • Control flow (indirect jumps, function pointers)

Manual Taint Tracking (with GDB):

Automated Taint Analysis (Advanced):

Tools like Triton, libdft, or QEMU-based taint trackers can automate this, but setup is complex; manual analysis is sufficient for most cases.

Call Graph Analysis (Static Approach)

Using IDA Pro:

Using Ghidra:

Scripting Call Graph (IDA Python):

  • Task: write a script that prints or visualizes the call graph leading to the crashing function

Ghidra Scripting for Crash Analysis

Ghidra's scripting capabilities are powerful for automating crash analysis tasks. Unlike IDA which requires a license, Ghidra is free and supports both Python (via Jython) and Java scripts.

Basic Crash Context Script (Python/Jython):

  • Task: fix the following script so it runs correctly

Find Similar Vulnerable Patterns:

  • Task: fix this script so it runs correctly

Trace Data Flow to Crash (Headless Mode):

  • Task: fix this script so it runs correctly

Key Ghidra APIs for Crash Analysis:

Task
API

Get function at address

getFunctionContaining(addr)

Get instruction

getInstructionAt(addr)

Find references

getReferencesTo(addr), getReferencesFrom(addr)

Decompile

DecompInterface().decompileFunction()

Search memory

findBytes(startAddr, pattern)

Get call graph

FunctionManager.getFunctions()

Symbol lookup

getSymbol(name, namespace)

Practical Exercise

Task: Trace HTTP request to crash in vulnerable web server

Setup:

You can treat this tiny HTTP server as a stand-in for the parser-style fuzz targets you worked with in Week 2 (for example, HTTP/JSON/image parsers) and for the kinds of functions you saw being fixed in Week 3 patch diffing (like Ipv6pReassembleDatagram in CVE-2022-34718, or the archive extraction logic in the 7-Zip case study). The goal is to bridge those earlier fuzzing and diffing exercises by following a single crashing request all the way from socket read to the vulnerable function and, ultimately, the patched code path. If you've completed the Week 3 capstone on CVE-2024-38063 or CVE-2024-1086, you can apply the same reachability analysis to trace network packets or syscall paths to the vulnerable kernel functions you identified in the diff.

Step 1: Identify Crash:

Step 2: Record Execution:

Step 3: Trace Data Flow:

Step 4: Visualize Path:

Step 5: Document Reachability:

Success Criteria:

  • Complete data flow traced from input to crash

  • Critical functions identified

  • Reachability confirmed

  • Attack vector documented

  • Exploitation prerequisites listed

Key Takeaways

  1. Reachability determines exploitability: Unreachable bugs aren't vulnerabilities

  2. Multiple approaches exist: Coverage, tracing, static analysis all valuable

  3. Automation speeds analysis: DynamoRIO + Lighthouse makes patterns obvious

  4. Replay debugging is powerful: rr enables time-travel debugging

  5. Document the path: Clear reachability proof essential for vulnerability reports

Reachability Proof Standard Template

[!IMPORTANT] Every exploitability claim needs a proof. Use this standardized template to document exactly how attacker-controlled input reaches the vulnerable code. This is your deliverable for Day 4.

The Reachability Proof Template

Lab: Network-Reachable Crash Analysis

Setup: A vulnerable HTTP server with a heap overflow in header parsing.

Step 1: Build and Test:

Step 2: Record and Trace with rr:

Step 3: Fill Out Proof Template:

Complete the Reachability Proof Template for this vulnerability:

  1. Input Source: read() from network socket (TCP port 8888)

  2. Parsing Boundary: parse_request() with sscanf()

  3. Sink: sscanf() writing to undersized req->path[64]

  4. Data Flow: accept()read()parse_request()sscanf() → heap overflow

  5. Evidence: rr trace, checkpoint/restart, ASan report showing heap-buffer-overflow

Deliverable: A completed Reachability Proof document following the template.

Success Criteria:

  • All template sections filled in with evidence

  • Dynamic trace shows complete path from socket to overflow

  • Attack surface correctly assessed (remote, unauthenticated)

  • PoC command that triggers crash remotely:

Discussion Questions

  1. How does attack surface (local vs remote) affect reachability assessment?

  2. What are the limitations of coverage-based reachability analysis with DynamoRIO/Lighthouse?

  3. How does rr's time-travel debugging change the approach to tracing input propagation compared to traditional forward-only debugging?

  4. When might static call graph analysis miss actual execution paths?

Day 5: Crash Deduplication and Corpus Minimization

Lab Setup: Building AFL-Instrumented Binary

For coverage-based deduplication and AFL tools (afl-tmin, afl-cmin), you need an AFL-instrumented build:

[!NOTE] If you don't have AFL++ installed, you can skip the coverage-based methods and use stack-hash or CASR-based deduplication instead.

Why Deduplication and Minimization Matter

The Problem:

  • Fuzzing generates thousands of crashes

  • Many are duplicates (same bug, different input)

  • Large inputs make analysis difficult

  • Need efficient prioritization

Benefits of Deduplication:

  • Focus on unique bugs, not symptoms

  • Reduce analysis time from days to hours

  • Better resource allocation

  • Clear bug count for tracking

Benefits of Minimization:

  • Smaller inputs easier to understand

  • Faster crash reproduction

  • Clearer root cause identification

  • Simpler exploit development

Crash Deduplication Strategies

Method 1: Stack Trace Hashing

Concept: Hash the call stack to identify unique crashes

Pros:

  • Fast and simple

  • Deterministic

  • No special tools needed

Cons:

  • Different stacks can be same bug

  • Non-deterministic bugs may vary

  • Address randomization affects hashing

Implementation:
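A minimal Python sketch of stack-trace hashing. It assumes GDB is on PATH and that each crash input is a file path the target accepts; adjust the `run` arguments for argv-style targets like vulnerable_suite. Stripping hex addresses before hashing (as in the Challenge 1 hints) makes the signature stable under ASLR.

```python
import hashlib
import re
import subprocess
from collections import defaultdict
from pathlib import Path

def stack_hash(backtrace: str, top_frames: int = 5) -> str:
    """Hash the top N frames with hex addresses stripped (ASLR-proof)."""
    frames = [re.sub(r"0x[0-9a-fA-F]+", "", ln).strip()
              for ln in backtrace.splitlines() if ln.startswith("#")]
    return hashlib.md5("\n".join(frames[:top_frames]).encode()).hexdigest()

def gdb_backtrace(binary: str, crash_file: str) -> str:
    """Run the target under GDB in batch mode and capture the backtrace."""
    out = subprocess.run(
        ["gdb", "-batch", "-ex", f"run {crash_file}", "-ex", "bt", binary],
        capture_output=True, text=True, timeout=30)
    return out.stdout

if __name__ == "__main__":
    buckets = defaultdict(list)
    for crash in sorted(Path("crashes").glob("*")):
        buckets[stack_hash(gdb_backtrace("./vuln_no_protect", str(crash)))].append(crash)
    for h, crashes in buckets.items():
        print(f"{h[:12]}  {len(crashes)} crash(es): {[c.name for c in crashes]}")
```

Hashing only the top five frames is a common heuristic: deep callers differ between entry points even when the bug is the same.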

Method 2: Coverage-Based Deduplication

Concept: Hash the code coverage path

Pros:

  • More accurate than stack traces

  • Captures execution flow

  • Works with non-deterministic crashes

Cons:

  • Requires instrumentation

  • Slower than stack hashing

  • May over-deduplicate

Implementation:
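A sketch of coverage hashing, assuming an AFL++-instrumented build with `afl-showmap` on PATH. The `edge_id:count` line format follows afl-showmap's default map output; dropping the hit count makes the signature stable across runs that merely loop more.

```python
import hashlib
import subprocess
import tempfile

def coverage_hash(edge_lines) -> str:
    """Hash the set of covered edge IDs, ignoring hit counts."""
    edges = sorted(line.split(":")[0] for line in edge_lines if ":" in line)
    return hashlib.md5("\n".join(edges).encode()).hexdigest()

def run_showmap(binary: str, crash_file: str):
    """Collect the edge map for one input (requires an AFL++ build)."""
    with tempfile.NamedTemporaryFile(mode="r", suffix=".map") as tmp:
        subprocess.run(["afl-showmap", "-o", tmp.name, "--", binary, crash_file],
                       capture_output=True, timeout=30)
        return tmp.read().splitlines()
```

Two crashes with identical edge sets almost certainly took the same path to failure, which catches duplicates that crash in different frames of the same bug.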

Method 3: CASR-Based Deduplication (Recommended)

Concept: Use CASR's semantic crash classification

Pros:

  • Semantically meaningful (23 severity types)

  • Built-in clustering algorithm

  • Modern, actively maintained

  • Considers crash type, location, and severity

Cons:

  • Requires ASAN build for best results

  • Some setup required

Implementation:
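A sketch that drives `casr-san` per crash and buckets reports by `CrashSeverity.Type` (the same field queried with jq in Challenge 2). The argv-style invocation assumes the vulnerable_suite crash files hold `test_number payload` pairs, per the warning later in this section.

```python
import json
import subprocess
from collections import defaultdict
from pathlib import Path

def casr_severity(report: dict) -> str:
    """Extract the severity class from a parsed .casrep (JSON) report."""
    return report.get("CrashSeverity", {}).get("Type", "UNKNOWN")

def generate_report(binary: str, args: list, out: str) -> dict:
    """Invoke casr-san on one crash and load the resulting report."""
    subprocess.run(["casr-san", "-o", out, "--", binary, *args],
                   capture_output=True, timeout=60)
    return json.loads(Path(out).read_text())

if __name__ == "__main__":
    clusters = defaultdict(list)
    for crash in sorted(Path("crashes").glob("*")):
        # Crash files hold "test_number payload"; split into argv elements
        args = crash.read_text().strip().split(maxsplit=1)
        rep = generate_report("./vuln_asan", args, f"{crash}.casrep")
        clusters[casr_severity(rep)].append(crash.name)
    for sev, names in clusters.items():
        print(f"{sev}: {names}")
```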

[!NOTE] The clerr cluster contains crashes that CASR couldn't fully classify (e.g., AbortSignal from ASAN reports without clear memory corruption). The DestAvNearNull clusters indicate potential NULL pointer dereferences.

Alternative: Pwndbg-Based Analysis (Interactive):

[!WARNING] The crash files in this lab contain test numbers and inputs formatted for the ASAN build. For GDB analysis, you need to pass arguments directly rather than via stdin.

Expected Output (Stack Overflow):

[!TIP] Analysis Notes:

  • Return address overwritten with 0x4141414141414141 ('AAAA...' in hex) = RIP control achieved

  • No stack canary + Executable stack + No PIE = Highly exploitable

  • The crash at vulnerable_suite.c:11 indicates the function epilogue (ret instruction)

Expected Output (Heap Overflow - No Crash):

[!WARNING] Why No Crash? Heap overflows often don't cause immediate crashes without sanitizers:

  • The overflow corrupts adjacent heap metadata/data silently

  • Crash may only occur later during free() or when corrupted data is accessed

  • Use ASAN build to detect: ./vuln_asan 2 "$HEAP_PAYLOAD" will report heap-buffer-overflow

  • This demonstrates why sanitizers are essential for finding heap corruption bugs

Expected Output (Use-After-Free - No Crash):

[!WARNING] Why No Crash? Use-after-free bugs are often silent without sanitizers:

  • The freed memory is accessed but returns garbage/stale data (notice empty UAF read)

  • Memory may still be mapped, just marked as "free" in the allocator

  • A crash only occurs if the page is unmapped or memory is reused with different data

  • Use ASAN build to detect: ./vuln_asan 3 will report heap-use-after-free

  • UAF bugs are highly exploitable - attacker can control what replaces the freed object

Expected Output (Double-Free - Crashes!):

[!TIP] Analysis Notes (Double-Free):

  • glibc tcache detection triggered: Modern glibc (2.26+) includes tcache double-free mitigation

  • Stack trace shows: double_free() → __libc_free() → _int_free() → malloc_printerr() → abort()

  • The error message "free(): double free detected in tcache 2" is the tcache key check

  • SIGABRT (signal 6) = program called abort() due to detected corruption

  • This mitigation can be bypassed in exploitation scenarios (e.g., filling tcache first)

Method 4: Combined Approach
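One way to combine the three signals, shown as a hedged sketch: a crash's composite key merges its normalized stack, its coverage hash, and its CASR severity, so two crashes merge only when all three agree. This trades a little over-splitting for far fewer false merges.

```python
import hashlib
import re

def normalized_stack(backtrace: str, frames: int = 3) -> str:
    """Top frames with hex addresses stripped, joined into one string."""
    lines = [re.sub(r"0x[0-9a-fA-F]+", "", ln).strip()
             for ln in backtrace.splitlines() if ln.startswith("#")]
    return "|".join(lines[:frames])

def combined_signature(backtrace: str, coverage_hash: str, severity: str) -> str:
    """Composite key: stack frames + coverage path + CASR severity class."""
    key = f"{normalized_stack(backtrace)}::{coverage_hash}::{severity}"
    return hashlib.md5(key.encode()).hexdigest()
```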

Differential Crash Analysis

Concept: Compare similar crashes to understand root cause variations and identify distinct bugs that appear similar.

When to Use:

  • Multiple crashes in same function but different behaviors

  • Crashes that look similar but have different exploitability

  • Understanding crash variants from the same bug class

Differential Analysis Workflow (for .casrep files):
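A sketch of a differential workflow over two `.casrep` files. The `Stacktrace` field name is an assumption (check your CASR version's JSON schema); the diff highlights which frames two crashes share and where they diverge.

```python
import difflib
import json

def load_frames(casrep_path: str):
    """Pull the raw stack trace lines from a .casrep report.
    (Field name 'Stacktrace' assumed; verify against your CASR output.)"""
    with open(casrep_path) as f:
        return json.load(f).get("Stacktrace", [])

def diff_crashes(frames_a, frames_b):
    """Unified diff of two crash stacks: shared frames vs divergent ones."""
    return list(difflib.unified_diff(frames_a, frames_b,
                                     fromfile="crash_a", tofile="crash_b",
                                     lineterm=""))

if __name__ == "__main__":
    a = load_frames("stack_overflow.casrep")
    b = load_frames("heap_overflow.casrep")
    print("\n".join(diff_crashes(a, b)) or "identical stacks")
```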

Usage Examples:

Alternative: Generate and Compare from Raw Inputs:

Usage:

Expected Output (Stack Overflow vs Heap Overflow):

[!TIP] Analysis Insight: Both crashes have strcpy at frame #0 (same dangerous function), but different vulnerability functions (stack_overflow vs heap_overflow). Same root cause pattern (unbounded copy), different memory corruption targets.

Crash Variant Discovery

Concept: Given a crash, find related crashes by mutating the input to explore the bug's attack surface.

Why Find Variants?:

  • Original crash might be DoS-only, variant might be RCE

  • Different variants may bypass different mitigations

  • Helps understand full scope of vulnerability

  • Variants with different severity may have different priority

Mutation-Based Variant Discovery:
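A self-contained sketch of the idea: `mutate()` imitates a few radamsa-style operators in pure Python, and `crash_signature` is a caller-supplied hook that in practice would run the target and parse the ASAN report (both the mutator and the hook shape are illustrative, not the course's exact script).

```python
import random

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Cheap radamsa-style mutations: bit flip, insert, delete, duplicate."""
    data = bytearray(data)
    op = rng.randrange(4)
    if op == 0 and data:                          # flip one bit
        i = rng.randrange(len(data)); data[i] ^= 1 << rng.randrange(8)
    elif op == 1:                                 # insert a random byte
        data.insert(rng.randrange(len(data) + 1), rng.randrange(256))
    elif op == 2 and data:                        # delete a byte
        del data[rng.randrange(len(data))]
    elif data:                                    # duplicate a short slice
        i = rng.randrange(len(data)); data[i:i] = data[i:i + 8]
    return bytes(data)

def find_variants(seed: bytes, crash_signature, rounds: int = 500, rng=None):
    """Mutate a crashing seed and keep one example per unique signature.

    crash_signature(candidate) -> str or None; None means "did not crash"."""
    rng = rng or random.Random(0)
    variants = {}
    for _ in range(rounds):
        candidate = mutate(seed, rng)
        sig = crash_signature(candidate)
        if sig is not None:
            variants.setdefault(sig, candidate)
    return variants
```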

[!NOTE] Why Only 1 Variant? The simple stack overflow always crashes at the same strcpy location regardless of payload content. To find different crash variants, you need inputs that trigger different code paths. The script above is useful when fuzzing complex parsers where mutations might reach different vulnerable functions.

Alternative: Multi-Vulnerability Variant Finder

For vulnerable_suite, use this version that explores different test cases:

Running the Multi-Vulnerability Finder:

Running the Variant Finder:

Targeted Variant Discovery:

[!TIP] Why deduplication matters: Without deduplication, you might see 30+ "crashes" that are all the same bug. With proper ASLR-normalized signatures, radamsa found 2 truly unique crash types:

  • stack-buffer: Original overflow from test case 1

  • use-after: Radamsa mutated the test number ("1" -> "3"), discovering UAF!

This demonstrates radamsa's power to explore beyond the original crash input.

Test Case Minimization with afl-tmin

What Is afl-tmin?:

  • AFL++ tool for minimizing crash inputs

  • Uses delta debugging algorithm

  • Removes bytes while preserving crash

  • Produces minimal reproducer

[!WARNING] Important for vulnerable_suite: afl-tmin with @@ passes a filename to the target, but vulnerable_suite expects command-line arguments (./vuln 1 AAAA). For this lab, use the Python-based minimizer below or CASR's casr-afl for minimization.

Basic Usage (for file-input targets):

Python-Based Minimizer (for command-line argument targets):
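A minimizer along these lines: a simplified delta-debugging loop that tries ever-smaller chunk deletions before single bytes. `still_crashes` shows one plausible crash check for the argv-style vulnerable_suite targets; the latin-1 round-trip is a sketch-level shortcut that cannot carry NUL bytes through argv.

```python
import subprocess
import sys

def still_crashes(binary: str, test_case: str, payload: bytes) -> bool:
    """Re-run the target; a negative return code means death by signal,
    and ASAN builds report the bug on stderr instead."""
    r = subprocess.run([binary, test_case, payload.decode("latin-1")],
                       capture_output=True, timeout=5)
    return r.returncode < 0 or b"AddressSanitizer" in r.stderr

def minimize(payload: bytes, crashes) -> bytes:
    """Delta-debugging lite: halve the deletion chunk until it is one byte."""
    chunk = len(payload) // 2
    while chunk >= 1:
        i = 0
        while i < len(payload):
            candidate = payload[:i] + payload[i + chunk:]
            if crashes(candidate):
                payload = candidate      # deletion kept the crash: keep it
            else:
                i += chunk               # deletion lost the crash: move on
        chunk //= 2
    return payload

if __name__ == "__main__":
    binary, test_case, crash_file = sys.argv[1:4]
    data = open(crash_file, "rb").read()
    minimal = minimize(data, lambda d: still_crashes(binary, test_case, d))
    print(f"{len(data)} -> {len(minimal)} bytes")
```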

Running the Minimizer:

[!TIP] The minimizer found that 64 bytes is the minimum payload to trigger the stack overflow. Why? The buffer is char buffer[64], and strcpy adds a null terminator (\0), so 64 chars + 1 null = 65 bytes written, overflowing by exactly 1 byte!

What Minimization Does:

Batch Minimization (Simple Approach):

[!TIP] Minimization Results Analysis:

  • Stack overflow (test 1): Reduced to 64-byte payload (exact buffer size)

  • Double-free (test 4): Reduced to 0-byte payload (crash is payload-independent)

  • NULL deref (test 5): Reduced to "0" (just needs trigger flag)

Tips for Effective Minimization:

  1. Use block deletion first: Much faster than byte-by-byte (O(n log n) vs O(n²))

  2. Set Appropriate Timeout: ASAN is slow, use 5+ seconds

  3. Verify After Minimization: Ensure crash still reproduces

  4. Know payload-independent crashes: UAF/double-free don't need payload minimization

Corpus Minimization with afl-cmin

What Is afl-cmin?:

  • Minimizes corpus while preserving coverage

  • Keeps smallest inputs that cover all edges

  • Essential for efficient continuous fuzzing

[!WARNING] Important for vulnerable_suite: Like afl-tmin, afl-cmin with @@ passes a filename, but vulnerable_suite expects command-line arguments. For this lab, we demonstrate the concept but note this requires file-input targets in practice.

Usage (for file-input targets):

Python-Based Corpus Minimization (for CLI argument targets):
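The corpus-minimization step can be sketched as a greedy set cover. Here `coverage` maps input name to its edge set (e.g., collected per input with afl-showmap) and `size` carries input lengths for tie-breaking toward smaller inputs; both mappings are assumed inputs to the sketch.

```python
def minimize_corpus(coverage: dict, size: dict) -> list:
    """Greedy set cover: keep the fewest inputs that cover all edges,
    preferring smaller inputs when two add the same number of new edges."""
    remaining = set().union(*coverage.values()) if coverage else set()
    keep = []
    while remaining:
        best = max(coverage,
                   key=lambda k: (len(coverage[k] & remaining), -size[k]))
        if not coverage[best] & remaining:
            break                        # no input adds coverage: done
        keep.append(best)
        remaining -= coverage[best]
    return keep
```

Greedy set cover is not guaranteed optimal, but it is the same approximation afl-cmin relies on and is close enough in practice.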

Running Corpus Minimization:

Practical Exercise

Task: Deduplicate and minimize crashes from the vulnerable_suite test cases

Setup:

Challenge 1: Stack Hash Deduplication

Write a script that:

  1. Runs each crash through ./vuln_no_protect with GDB

  2. Extracts the backtrace (bt command)

  3. Normalizes addresses (remove 0x... to handle ASLR)

  4. Computes MD5 hash of normalized stack

  5. Groups crashes by unique hash

Hints:

  • Use gdb -batch -ex "run ..." -ex "bt" -ex "quit"

  • sed 's/0x[0-9a-f]\+//g' removes hex addresses

  • Expected result: ~4-5 unique hashes (one per vulnerability type)

Challenge 2: CASR Classification

For each unique crash from Challenge 1:

  1. Run through casr-san with the ASAN build

  2. Extract CrashSeverity.Type from the JSON report

  3. Note which bugs CASR classifies as EXPLOITABLE

Hints:

  • casr-san -o output.casrep -- ./vuln_asan <args>

  • jq -r '.CrashSeverity.Type' output.casrep

  • Some vuln types (heap overflow, UAF) need ASAN to detect!

Challenge 3: Crash Minimization

Write a Python minimizer that:

  1. Takes a crash file and binary target as input

  2. Iteratively removes bytes while crash still reproduces

  3. Outputs the minimal crash that still triggers the bug

Hints:

  • Stack overflow should minimize to ~64 bytes (buffer size)

  • Double-free/NULL-deref are already minimal (just the test number)

  • Check subprocess.run() return code or ASAN output for crash detection

  • Binary search is faster than linear removal

Challenge 4: Variant Discovery

Find additional crash variants by:

  1. Mutating existing crashes with radamsa

  2. Running variants through your deduplication pipeline

  3. Identifying any new unique stack signatures

Success Criteria:

Key Takeaways

  1. Deduplication is essential: Analyzing 100 duplicates wastes time

  2. Multiple methods improve accuracy: Stack + coverage + CASR severity

  3. Minimization clarifies bugs: 42 bytes easier than 8KB to understand

  4. Automation enables scale: Manual triage doesn't scale past dozens of crashes

  5. Verification is critical: Always confirm minimized crash reproduces bug

Discussion Questions

  1. When might stack-based deduplication give false duplicates (different bugs, same stack)?

  2. How does ASLR affect crash deduplication strategies, and how does CASR handle this?

  3. What are the risks of over-aggressive test case minimization with afl-tmin (e.g., losing the root cause trigger)?

  4. When should you use afl-cmin (corpus minimization) vs afl-tmin (single test case minimization)?

Day 6: Creating PoC Reproducers and Automation

Why Reliable PoCs Matter

Uses of PoC Scripts:

  • Demonstrate vulnerability to stakeholders

  • Enable consistent reproduction for testing

  • Foundation for exploit development

  • Required for CVE submission

  • Facilitate regression testing

  • Aid in patch verification

Quality Criteria:

  1. Reliability: Works in ≥ 90% of attempts

  2. Clarity: Code is readable and commented

  3. Minimalism: No unnecessary complexity

  4. Portability: Works across similar environments

  5. Safety: Clearly marked as PoC, not weaponized

Building PoCs with Python

Why Python?:

  • Excellent libraries (pwntools, scapy, requests)

  • Clear syntax for security researchers

  • Easy byte manipulation

  • Cross-platform

  • Rapid prototyping

pwntools Installation (if not already done in Day 1):

PoC Example: Stack Buffer Overflow

Scenario: Stack buffer overflow in vulnerable_suite.c (Test Case 1)

Crash Analysis (from Day 1):

  • Buffer size: 64 bytes in stack_overflow()

  • Overflow at: strcpy(buffer, input)

  • Crash with 64+ bytes (buffer overflow)

  • Minimal crash payload: 64 bytes (exact buffer boundary)

[!NOTE] ASAN vs Non-ASAN Behavior

  • With ASAN: Crashes immediately at 64+ bytes (detects overflow)

  • Without ASAN: May need more bytes to corrupt return address

  • For reliable PoC, use ASAN build or 100+ byte payload

PoC Script:
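A stdlib-only sketch of the PoC (the course toolset is pwntools; the shape is the same with `process()`/`sendline()`). The 100-byte default pads past the 64-byte buffer and the saved registers so the non-ASAN build crashes too, per the note above.

```python
import os
import subprocess

BUFFER_SIZE = 64          # char buffer[64] in stack_overflow()

def build_payload(n: int = BUFFER_SIZE + 36) -> bytes:
    """100 bytes by default: overflows under ASAN and clobbers the saved
    return address on the plain build."""
    return b"A" * n

def run_poc(binary: str = "./vuln_asan") -> bool:
    """True if the target dies on a signal or ASAN flags the overflow."""
    r = subprocess.run([binary, "1", build_payload().decode()],
                       capture_output=True, timeout=5)
    return r.returncode < 0 or b"stack-buffer-overflow" in r.stderr

if __name__ == "__main__":
    if os.path.exists("./vuln_asan"):
        print("CRASH" if run_poc() else "NO CRASH")
    else:
        print("build vuln_asan first (see the lab setup)")
```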

Running the PoC:

Automated Crash-to-PoC Pipeline

Complete Automation Script:
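One plausible shape for the pipeline's final stage: stamp each verified, minimized crash into a standalone PoC script. The template and file layout here are illustrative sketches, not the course's exact automation script.

```python
from pathlib import Path

POC_TEMPLATE = '''#!/usr/bin/env python3
"""Auto-generated PoC for {vuln}: ./{binary} {test_case} <payload>"""
import subprocess
payload = {payload!r}
r = subprocess.run(["./{binary}", "{test_case}", payload.decode("latin-1")],
                   capture_output=True)
print("CRASH" if r.returncode < 0 or b"AddressSanitizer" in r.stderr else "NO CRASH")
'''

def emit_poc(vuln: str, binary: str, test_case: str, payload: bytes,
             out_dir: str = "pocs") -> Path:
    """Write a self-contained, executable PoC script for one minimized crash."""
    Path(out_dir).mkdir(exist_ok=True)
    poc = Path(out_dir) / f"{vuln}_poc.py"
    poc.write_text(POC_TEMPLATE.format(vuln=vuln, binary=binary,
                                       test_case=test_case, payload=payload))
    poc.chmod(0o755)
    return poc
```

In a full pipeline, this would be called once per deduplicated cluster after verification and minimization succeed.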

Running the Pipeline:

[!NOTE] Minimization Results

  • Stack overflow: 200 → 64 bytes (exact buffer size in stack_overflow())

  • Heap overflow: 100 → 32 bytes (exact buffer size in heap_overflow())

  • The minimizer finds the exact boundary where overflow occurs!

[!TIP] Reliability Note The ~80% crash rate is due to pwntools process() timeout/race conditions (shows "Stopped process" with exit code: None), not actual unreliability. These are deterministic bugs that crash 100% when run directly:

PoC Development for Network Services

Many real-world vulnerabilities are in network services. The vuln_http_server from Day 4 is a good example. These require socket-based PoCs rather than stdin-based.

Network Service PoC for vuln_http_server (from Day 4):
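A hedged sketch of such a PoC against vuln_http_server. The port (8888) and the oversized-path trigger come from the Day 4 reachability proof earlier in this week; the exact request grammar parse_request()'s sscanf expects is an assumption to adjust against the real source.

```python
import socket

def build_request(path_len: int = 200) -> bytes:
    """Oversized path intended to overflow req->path[64] via sscanf.
    (Request grammar is assumed; match it to parse_request().)"""
    return b"GET /" + b"A" * path_len + b" HTTP/1.1\r\n\r\n"

def send_poc(host: str = "127.0.0.1", port: int = 8888, timeout: float = 3.0):
    """Deliver the request; an empty or reset response suggests a crash."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(build_request())
        try:
            return s.recv(4096)
        except (ConnectionResetError, socket.timeout):
            return b""

if __name__ == "__main__":
    print(send_poc() or "no response, server likely crashed")
```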

Running the HTTP Server PoC:

Generic Network Service PoC Template:

HTTP Service PoC Template:

TCP Protocol PoC Template:

PoC Development for Rust and Go Programs

Modern memory-safe languages still crash—through panics, FFI bugs, or unsafe code blocks. When creating PoCs for Rust or Go targets, the workflow differs from C/C++.

Rust Crash Analysis and PoC

Rust Panic Backtraces:

Rust with Sanitizers (nightly):

Debugging Rust Crashes:

Analyzing FFI Crashes (Rust calling C):

Rust PoC Template (for Rust targets with unsafe code):

Go Crash Analysis and PoC

Go Panic Traces:

Go Race Detector (similar to TSAN):

Debugging Go with Delve:

Go CGo Crashes (Go calling C):

Crash Analysis Comparison

| Aspect | Rust | Go | C/C++ |
| --- | --- | --- | --- |
| Memory bugs in safe code | Panic (not exploitable) | Panic (not exploitable) | Crash (exploitable) |
| Unsafe/CGo crashes | ASAN-detectable | ASAN via CGo | ASAN native |
| Race conditions | Compiler prevents most | Race detector | TSAN required |
| Backtrace quality | Excellent (DWARF) | Good (Go symbols) | Varies (need symbols) |
| Debugger | rust-gdb/lldb | Delve | GDB/LLDB |
| Core dump analysis | Standard tools | go tool pprof | crash/GDB |

Practical Exercise

Task: Convert minimized crashes from Day 5 to reliable PoC scripts

Setup:

Step 1: Create Crash Inputs for Each Vulnerability Type:

Step 2: Run Automated Pipeline:

Step 3: Create Manual PoCs for UAF and Double-Free:

Since UAF and double-free are triggered by test case number alone (no payload needed), create simple PoCs.

[!WARNING] UAF requires ASAN build! The UAF vulnerability (test case 3) does NOT crash with vuln_no_protect — the memory is silently corrupted but execution continues. Always use vuln_asan for reliable UAF detection.
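A minimal sketch of what `pocs/uaf_poc.py` could contain, keyed on ASAN's `heap-use-after-free` report string (the detection logic, not the course's exact script):

```python
import os
import subprocess

def detect_uaf(stderr: bytes) -> bool:
    """ASAN prints 'heap-use-after-free' when the stale pointer is touched."""
    return b"heap-use-after-free" in stderr

def run_uaf_poc(binary: str = "./vuln_asan") -> bool:
    # Test case 3 needs no payload: the bug triggers on the case number alone
    r = subprocess.run([binary, "3"], capture_output=True, timeout=5)
    return detect_uaf(r.stderr)

if __name__ == "__main__":
    if os.path.exists("./vuln_asan"):
        print("UAF confirmed" if run_uaf_poc() else "no ASAN report (wrong build?)")
    else:
        print("build vuln_asan first")
```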

Save as pocs/uaf_poc.py and create similar for double-free (test case 4) and NULL deref (test case 5).

Step 4: Test All PoCs:

Step 5: Test PoC Reliability:

Expected Results:

| Vulnerability | Test Case | PoC File | Reliability | Notes |
| --- | --- | --- | --- | --- |
| Stack Overflow | 1 | stack_overflow_poc.py | 100% | Crashes with/without ASAN |
| Heap Overflow | 2 | heap_overflow_poc.py | 100% (ASAN) | Silent without ASAN! |
| Use-After-Free | 3 | uaf_poc.py | 100% (ASAN) | Silent without ASAN! |
| Double-Free | 4 | double_free_poc.py | 100% | Crashes with/without ASAN |
| NULL Deref | 5 | null_deref_poc.py | 100% | Crashes with/without ASAN |

[!WARNING] Critical: ASAN Required for Heap Bugs Heap overflow and UAF vulnerabilities do not crash without AddressSanitizer! Always test with vuln_asan build to detect these bug types.

Success Criteria:

  • PoC generated for each of the 5 vulnerability types in vulnerable_suite.c

  • Each PoC crashes target reliably (use ASAN build for heap overflow and UAF)

  • Code is documented with vulnerability type and test case number

  • Scripts can be run independently from ~/crash_analysis_lab

  • Pipeline runs end-to-end without manual intervention

Key Takeaways

  1. Reliable PoCs are essential: Foundation for exploit development and reporting

  2. Automation enables scale: Manual PoC creation doesn't scale past a few bugs

  3. Testing is critical: Verify PoC reliability before sharing

  4. Documentation matters: Clear comments make PoCs useful for others

  5. Python + pwntools is powerful: Standard toolset for security research

  6. Panics ≠ Vulnerabilities: Safe Rust/Go panics are DoS at worst

  7. Unsafe code is the attack surface: Focus analysis on unsafe blocks and FFI boundaries

  8. Race conditions matter: Go's race detector catches what safe code analysis misses

  9. FFI boundaries need ASAN: Sanitize both sides of language boundaries

  10. Tooling exists: Use rust-gdb, Delve—don't force C/C++ tools

Discussion Questions

  1. What are the ethical considerations when publishing PoC code?

  2. How does PoC reliability (e.g., 10/10 crash rate) affect vulnerability severity assessment?

  3. What pwntools features (p32/p64, tubes, ELF parsing) are most useful for PoC development?

  4. How can automated crash→minimize→PoC pipelines be integrated into continuous fuzzing workflows?

Capstone Project - The Crash Analysis Pipeline

  • Goal: Apply the week's techniques to process a batch of crashes into actionable vulnerability reports and reliable PoCs.

  • Activities:

    • Triage: Deduplicate crashes from the vulnerable_suite and vuln_http_server targets.

    • Analysis: Perform root cause analysis on the unique crashes.

    • Exploitability: Determine which crashes are weaponizable.

    • PoC: Develop stable Python PoCs for the critical bugs.

    • Reporting: Deliver a professional crash analysis report.

Capstone Scenario

You are a security researcher who has completed fuzzing sessions on the lab targets from this week. You have crashes from:

  • vulnerable_suite.c (test cases 1-5)

  • vuln_http_server.c (network-accessible)

Your manager wants a report identifying:

  1. How many actual unique bugs exist?

  2. Which ones are remotely exploitable?

  3. Proof-of-concept scripts for the highest severity issues.

Lab Setup for Capstone

vulnerable_suite_rop.c - Enhanced version with embedded ROP gadgets for exploitation exercises:

Build the enhanced binary:

Expected gadget output:

Verify with ropper:

[!NOTE] Ropper vs Binary Addresses Ropper may report slightly different addresses than the binary's built-in print_gadgets(). This is because ropper scans for byte patterns and may find gadgets at different offsets within the same instructions. Both addresses work - use the binary's output for consistency.

Execution Steps

Phase 1: Generate Crash Corpus

First, generate a diverse set of crashes from the lab targets:

Phase 2: Triage & Deduplication

Expected Triage Results:

| Cluster | Count | Crash Type | Severity |
| --- | --- | --- | --- |
| cl1 | 5 | double-free | NOT_EXPLOITABLE |
| cl2 | 5 | AbortSignal (stack overflow) | NOT_EXPLOITABLE |
| cl3 | 3 | DestAvNearNull (NULL deref) | PROBABLY_EXPLOITABLE |
| cl4 | 5 | AbortSignal (heap overflow) | NOT_EXPLOITABLE |
| cl5 | 5 | heap-use-after-free (write) | EXPLOITABLE |

[!NOTE] Cluster ordering may vary between runs. ASAN-caught crashes appear as "AbortSignal" because ASAN terminates the process before the actual crash. The UAF cluster is typically the highest priority for exploit development.

Phase 3: Deep Analysis

Select the most promising crash from each cluster and perform detailed analysis:

Verified RIP Control Analysis:

Finding ROP Gadgets:

Gadget Search Results (vuln_rop):

| Gadget | Purpose |
| --- | --- |
| `pop rdi; ret` | Set 1st argument (RDI) |
| `pop rsi; pop r15; ret` | Set 2nd argument (RSI) |
| `pop rdx; ret` | Set 3rd argument (RDX) |
| `pop rax; ret` | Set syscall number |
| `jmp rsp` | Jump to shellcode on stack |
| `syscall; ret` | Execute syscall |
| `ret` | Stack alignment / chain continuation |

Phase 4: Minimization

Phase 5: Exploitation PoC (vuln_rop)

Create working exploits using the ROP-friendly binary:

[!NOTE] Null Bytes in Payloads 64-bit addresses contain null bytes (e.g., 0x401256 packs to \x56\x12\x40\x00\x00\x00\x00\x00). Since C strings terminate at null bytes and pwntools rejects them in argv, this script writes payloads to a temp file and uses bash command substitution to pass binary data.

Save and run:

Expected Output:

[!NOTE] Null Byte Limitation The ROP chain exploit fails via argv because bash strips null bytes from command substitution. This is a real-world constraint - 64-bit addresses like 0x401952 contain null bytes when packed (\x52\x19\x40\x00\x00\x00\x00\x00). Real exploits use stdin, network sockets, or file input to bypass this limitation.

Manual ROP Chain Verification with GDB:

Expected GDB Output:

The ROP chain works when injected directly into memory, confirming the gadget addresses and chain structure are correct. The limitation is purely in the delivery mechanism (argv null bytes), not the exploit logic.

Phase 6: Reporting

Create the final vulnerability report:

Capstone Checklist

Expected Deliverables

Key Takeaways

  1. Triage is a Filter: The 28 crash inputs reduced to just 5 unique bugs - automation saves hours of manual analysis.

  2. Root Cause > Crash Location: ASAN shows where corruption is detected, but the bug is in the strcpy() call.

  3. Reproducibility is King: All PoCs achieve 100% reliability because the bugs are deterministic.

  4. Report for the Audience: The vulnerability report includes both technical details (for developers) and severity ratings (for management).

  5. Stack Overflow = RIP Control: The 72-byte offset gives direct control over the return address.

Discussion Questions

  1. Why does the stack overflow require 72 bytes to control RIP (not 64)?

  2. How would ASLR affect exploitation of the stack overflow in vuln_protected?

  3. Why does CASR rate the NULL pointer dereference cluster PROBABLY_EXPLOITABLE when user-space NULL derefs are typically DoS-only?

  4. How would you extend this analysis to include the vuln_http_server network target?

Bonus Challenge: Network Target Analysis

Extend the capstone to include the vuln_http_server from Day 4:

This adds a network-accessible vulnerability to your report and demonstrates an important lesson: sanitizers have blind spots - always use multiple detection methods.

Last updated