Bug Identification
Overview
Bug identification is the process of discovering potential vulnerabilities in software through various techniques including static analysis, dynamic analysis, and fuzzing. This document outlines methodologies and tools for effective vulnerability research.
For practical exploit development, see Exploit Development.
Vulnerability Research Methodology
Phase 1: Reconnaissance
Target Enumeration: Identify version, dependencies, configuration
Attack Surface Mapping: List all input vectors, APIs, protocols
Documentation Review: RFCs, specifications, developer docs
Prior Art Analysis: CVE database, exploit-db, bug trackers
Phase 2: Static Analysis
Source Review: If available, focus on parsing/validation code
Binary Analysis: Reverse engineering with Ghidra/IDA
Patch Diffing: Compare vulnerable vs patched versions
SBOM Analysis: Check third-party component vulnerabilities
Phase 3: Dynamic Analysis
Behavioral Analysis: Monitor syscalls, network, file I/O
Debugging: Trace execution paths with controlled input
Instrumentation: Coverage-guided exploration
Taint Analysis: Track input propagation
Phase 4: Fuzzing
Corpus Generation: Create valid seed inputs
Harness Development: Isolate target functionality
Coverage Monitoring: Identify untested code paths
Crash Triage: Classify and prioritize findings
Phase 5: Exploitation
Primitive Development: Convert bug to reliable primitives
Mitigation Bypass: Defeat ASLR, DEP, CFG, etc.
Payload Development: Create working exploit
Weaponization: Package for real-world use (if authorized)
Attack Surface Identification
Before diving into specific bug hunting techniques, it's essential to understand where to look for vulnerabilities.
Windows User Mode
Shared Memory
RPC
Named Pipes
File & Network IO
Windows Messages
For authentication-related vulnerabilities, see Windows Auth
Kernel
Device Drivers
Many third-party software with drivers to target
Can accept arbitrary user input via the
IOCTLinterfaceAlso performs actions when we
open,closehandles to it
OS
Drivers that handle hardware and user input
Intercepts/transitions from user to kernel
Modern Linux interfaces (hotspots)
io_uring: SQE size/offset confusions, submission/completion race windows, kernel copy?sizes derived from user buffers
userfaultfd: cross?thread write?what?where and TOCTOU primitives during fault handling
seccomp user?notifier: confused?deputy patterns in broker processes; notifier time?of?check vs time?of?use gaps
Hyper-V & VTL Interfaces � On many modern Windows 11 systems (especially 24H2 on supported hardware), Virtualization?Based Security and VTL1 are enabled or easily enabled by policy. Treat the hypervisor surface (e.g.,
hvix64.exeand synthetic MSRs) as a common kernel target, and verify VBS/HVCI status on the host before assuming defaults.
Drivers
DriverEntry: registers for any callbacks, setup structure, etc
I/O Handlers: handlers that get called when a process attempts to
open,close,etcthe driver,IOCTLallows driver functionality to be called from user processesPractical triage example (CVE?2025?8061):
IOCTL handlers that accept a fixed?size struct and pass a user?controlled
PHYSICAL_ADDRESSdirectly toMmMapIoSpacethen memcpy out/in mapped memory (sometimes via wrappers that swap src/dst) indicate physical memory read/write primitives.
Similarly, unguarded MSR read/write paths yield
RDMSR/WRMSRprimitives.
See the Lenovo
LnvMSRIO.syscase study in windows-kernel.md
eBPF & XDP
BPF helpers and verifier: pointer leaks, verifier bypass, JIT bugs
User?entry vectors:
bpf()syscall, privileged pods in Kubernetes, Cilium datapathTooling:
bpftool, verifier logs,bpftracescripts for quick triageCO?RE skeletons (
bpftool gen skeleton) simplify packaging portable tracing probes.BPF LSM hooks allow low?overhead coverage feedback on security?critical kernel paths; export events with
trace_pipe.
Container & Micro?VM Surface
Namespace/cgroup escapes, device?mapper abuse, races in snapshotting backends (e.g., overlayfs)
Micro?VM hypercalls in Firecracker, CloudHypervisor, Kata Containers
For detailed container exploitation techniques, see Container
Cloud?Native & IAM Bugs
Misconfigured IAM policies, privilege?escalating API actions (AWS
sts:AssumeRole, Azure Golden SAML)SSRF paths into metadata services (
169.254.169.254, IMDSv2 bypass techniques)Race conditions in managed control?plane components (Kubernetes API server, AWS Lambda workers)
Kubernetes Attack Vectors: look at kubernetes for a deeper checklist
Serverless Vulnerabilities:
Lambda layer poisoning
Function URL authentication bypass
Event injection through SQS/SNS/EventBridge
Cold start race conditions
Network / Transport Protocol Parsers
QUIC / HTTP/3: coalesced frames, reorder/timing corner cases; verify against RFC�9000 (QUIC) and RFC�9114 (HTTP/3)
HTTP/2: stream state machine desync; flow?control integer edge cases (RFC�7540)
gRPC / Protobuf: length truncation across language FFI, map/list coercion; see gRPC framing and protobuf varint rules
GraphQL: input coercion and resolver recursion limits; check GraphQL spec for type coercion semantics
WebAssembly Runtimes
WASM JIT optimization bugs in V8, Wasmtime, Wasmer
WASI sandbox escapes through host?call interfaces
Typed?Func?Refs, GC, Tail?calls, Memory64 expand type/bounds confusion surface. See the WebAssembly proposals status page for current rollout and engine adoption.
Checklist:
validate table element types/import signatures/hostcall marshalling
fuzz mixed 32/64-bit memories.
Fuzzing tip: compile native libs to WASM for fast, deterministic mutation cycles
Browser / JS Engine Exploitation
Modern V8 Architecture (2024-2025)
V8 now uses a multi-tier JIT pipeline with distinct exploitation characteristics:
Ignition (Interpreter): Bytecode interpreter; rarely targeted directly
Maglev (Mid-tier JIT): Introduced Chrome 115+; simpler IR than TurboFan
TurboFan (Optimizing JIT): Aggressive optimization; traditional exploitation target
Turboshaft: New IR replacing TurboFan internals; different optimization patterns create new bug classes
Type lattice changes affecting confusion bugs
Maglev ? Turboshaft transition paths expose state inconsistencies
Node-based to block-based IR transition
V8 Maglev Exploitation
Integer overflow in Maglev's fast-path arithmetic
Corrupted HeapNumber backing store via Maglev bounds check bypass
Map/ElementsKind confusion in polymorphic inline caches
WebAssembly JSPI (JavaScript Promise Integration)
Stack Heap Spray: Suspended WASM stacks allocated on heap; predictable layout
Type Confusion:
WebAssembly.Suspendingwrapper type mismatchInfo Leak: Stack pointers exposed through Promise resolution chains
Sandbox Escape: JSPI bridges JS/WASM boundary; bypass traditional WASM isolation
Spectre-BHB Browser Mitigations
Chrome 120+: Site Isolation per-frame; shared array buffer restrictions
Firefox 122+: Process-per-site with BHI fences in JIT trampolines
Safari 17.4+: WebKit JIT speculation guards on type checks
Site Isolation Plus
Frame-level process isolation: Each cross-origin frame in separate process
Cross-origin memory protection: Hardware-backed memory isolation
New IPC attack surface: Mojo interface exploitation required for escapes
Renderer ? Browser requirements: Need Mojo race or type confusion
New Info-Leak Requirements:
Traditional
SharedArrayBuffer + Atomicstiming attacks less reliableNeed alternative side-channels: CSS timing, WebGL shader execution, AudioContext
Cross-origin info leaks require chaining multiple primitives
Practical Browser Exploitation Workflow
Target Selection:
V8 Maglev for Chrome/Edge (faster development cycle = more bugs)
JSC for Safari (less scrutiny than V8)
SpiderMonkey for Firefox (IonMonkey/Warp still viable)
Primitive Development:
addrof: Leak object addresses (info leak)fakeobj: Craft fake object (type confusion)arbread/arbwrite: Arbitrary memory accessshellcode: RWX page or WASM JIT abuse
Sandbox Escape:
Mojo IPC race conditions
GPU process exploitation via WebGL
Utility process TOCTOU (Chrome's new architecture)
Post-Exploitation:
Chrome: Target browser process via Mojo
Safari: XPC service exploitation for sandbox escape
Firefox: Target parent process via IPC
Firmware & Embedded
UEFI DXE driver flaws, BMC web console auth bypass, ECU/CAN message injection
BLE & Zigbee stack overflows, heap exploits in
btstack,lwIP
macOS / Apple?Silicon Kernel
IOKit user?client input validation, IOMFB allocator corner?cases
Hypervisor.framework fuzzing with
hv_fuzz
Mobile Platforms (iOS/Android)
iOS 17+ Exploitation
PAC Bypass: Pointer Authentication Code bypass via signing gadgets
PPL Bypass: Page Protection Layer exploitation for kernel r/w
Secure Enclave: SEP exploitation via malformed Mach messages
Neural Engine: ANE kernel driver attack surface
Android 14+ Exploitation
MTE (Memory Tagging): Probabilistic bypass with tag collisions
GKI (Generic Kernel Image): Vendor hooks as attack surface
Scudo Hardening: Heap exploitation with hardened allocator
Hardware Attestation: Keymaster/StrongBox TEE attacks
Cross-Platform Mobile
Flutter: Dart VM type confusion, FFI boundary issues
React Native: JavaScript bridge serialization bugs
Unity: IL2CPP memory corruption, native plugin vulnerabilities
Supply Chain Attack Surface
Package Manager Vulnerabilities
Dependency Confusion: Internal vs public package name conflicts
Typosquatting: Similar package names (numpy vs numpi)
Manifest Manipulation: Lock file poisoning, version pinning bypass
Build-time Injection: Malicious install scripts, post-install hooks
CI/CD Pipeline Analysis
GitHub Actions: Workflow poisoning via PR from forked repos
Jenkins: Groovy script injection, plugin vulnerabilities
Docker: Build argument exploitation, base image substitution
Secrets Exposure: Environment variables in build logs, artifact leakage
AI & LLM Application Security
Prompt?injection, sandbox boundary escapes, hidden?channel data exfil
See AI Security for a deeper checklist
Confidential?Computing / TEE Surface
Intel TDX: diff
tdx.koortdx_psci.cbetween kernel LTS branches to spot new GPA?HPA validation checks.AMD SEV?SNP: look for unchecked
VMGEXITleafs in PSP firmware;sevtool --decodehelps locate IDA entry points.Arm CCA / RMM: analyze SMC handlers inside Realm Management Monitor (RMM) EL3 firmware.
Cloud offerings (Azure CCE, Google C3): focus on paravirtualised MMIO and attestation report flows exposed to guests.
For TEE-specific exploitation, see Secure Enclaves
GPU & vGPU Surface
HGX?HMC (verify CVE/advisories): research indicates malformed NVLINK?C2C packets can corrupt HMC register space; confirm against vendor advisories for the specific platform.
vGPU manager IOCTLs: diff
nvidia?vgpu?mgrmonthly; watchVGPU_PLUGIN_IOCTL_GET_STATEand similar calls for unchecked buffers.LeftoverLocals info?leak: contiguous VRAM allocations can leak data from prior tenants in multi?tenant AI clusters.
Hardware Security Attack Surface
Side-Channel Analysis
Power Analysis: DPA/SPA attacks on cryptographic operations
Electromagnetic (EM): Near-field probing of processor emissions
Timing Attacks: Cache timing, branch prediction analysis
Acoustic: Key extraction via CPU sound emissions
Fault Injection
Voltage Glitching: Brown-out attacks on secure boot
Clock Glitching: Skip instruction execution
Laser Fault Injection (LFI): Targeted bit flips
EM Pulse Injection: Wider area fault induction
Hardware Implants & Supply Chain
PCB Modification: Added components, trace rerouting
Firmware Backdoors: UEFI/BMC persistent implants
Hardware Trojans: Malicious logic in ICs
DMA Attacks: PCIe, Thunderbolt, FireWire exploitation
EDR Driver Vulnerability Research
Common vulnerability types in EDR drivers
Authorization bypass issues
Memory corruption in IOCTL handlers
Race conditions in driver communication
Improper input validation
For detailed EDR analysis techniques, see EDR
Research methodology
Identify accessible driver interfaces
Reverse engineer IOCTL/message handlers
Analyze authorization mechanisms
Test for input validation flaws
Look for race conditions and memory corruption
Tools for driver analysis
IDA Pro / Ghidra for reverse engineering
WinDbg for dynamic analysis
Process Monitor for behavior analysis
Custom fuzzing tools for interface testing
Quick triage rubric (post?crash)
Buffer overflow vs UAF: check access type, allocation lifetime, and red?zones (ASan/KASAN reports)
Integer issues: trace size/length and allocation math; look for truncation/casts
Logic bugs: unexpected state transitions without memory errors; validate auth/flags
Info?leaks: uninitialized reads, OOB reads, pointer/string formatters
Coverage?first recon checklist
Produce one baseline coverage run (e.g.,
drcov, Intel� PT, or Lighthouse import)Identify cold paths reachable from attacker inputs
Seed corpus: include minimal valid examples that traverse target parsers
Enable lightweight oracles (ASan/UBSan/KASAN) where feasible to maximize signal
Static Analysis Methods
Static analysis examines code without execution to identify potential vulnerabilities.
Manual Code Review
Installing the target application and examining its structure
Enumerating the ways to feed input to it
Examine the file formats and network protocols that the application uses
Locating logical vulnerabilities or memory corruptions
For Windows-specific techniques, see Windows Kernel
For Linux-specific techniques, see Linux
Patch Diffing
Patch diffing compares vulnerable and patched versions of binaries to identify security changes.
What is Patch Diffing
Patch diffing is a technique to identify changes across versions of binaries related to security patches. It compares a vulnerable version of a binary with a patched one to highlight the changes, helping to discover new, missing, and interesting functionality across versions.
Benefits
Single Source of Truth: Without a CVE blog post or sample POC, a patch diff can be the only source of information to determine changes and deduce the original issue.
Vulnerability Discovery: While understanding the original issue, you may discover additional vulnerabilities in the troubled code area.
Skill Development: Patch diffing provides focused practice in reverse engineering and helps build mental models for various vulnerability classes.
Challenges
Asymmetry: Small source code changes can drastically affect compiled binaries.
Finding Security-Related Changes: Security patches often include other changes like new features, bug fixes, and performance improvements.
Minimizing Noise:
Diff the correct binaries to avoid analyzing unrelated updates
Reduce the time delta between compared versions
Use binary symbols when available to add precision to comparisons
Tools
IDA Pro with plugins like DarunGrim and Diaphora
BinDiff Works with analysis output from IDA or Ghidra
Ghidriff: Ghidra binary diffing engine
Radare2 (radiff2)
Ghidra Version Tracking Tool
Ghidra 11 built-in Partial Match Correlator
Patch Diffing Workflow
The process of patch diffing typically follows these steps:
Preparation
Create a diffing session
Load binary versions (vulnerable and patched)
Ensure binaries pass preconditions
Run auto-analysis on both binaries
Evaluation
Run correlators to find similarities
Generate associations between binaries
Evaluate matches between functions
Accept matching functions
Analyze differences until sufficient understanding is reached
Function Analysis
Identify new functions: Functions in the patched binary with no match in the original
Identify deleted functions: Functions in the original binary with no match in the patched version
Identify changed functions: Functions that exist in both versions but have been modified
Focus on functions with security relevance (often indicated by their names or based on CVE descriptions)
Interpreting Results
New functions often indicate added security checks or validation
Changed functions may show modified logic for handling edge cases
Correlate changes with public CVE information when available
Remember that patches are not necessarily atomic - multiple issues may be fixed in one update
When using Ghidra's Version Tracking:
Use "Show Only Unmatched Functions" filter to identify new or deleted functions
Look for functions with a similarity score below 1.0 to find modified functions
Examine the modified functions to understand what security checks were added
Starting with Ghidra 11 (December 2024) a built-in Partial Match Correlator covers most PatchDiffCorrelator use-cases; install the plugin only if you need bulk-mnemonics scoring.
Case Study: 7?Zip Symlink Path Traversal
Target: 7?Zip 24.09 (vulnerable) ? 25.00 (fixed)
File of interest:
CPP/7zip/UI/Common/ArchiveExtractCallback.cppHigh?signal edits: absolute?path detection and link?path validation for WSL/Linux symlinks converted on Windows.
Minimal security?relevant diff (simplified):
Root cause (logic):
Linux/WSL symlink data containing a Windows?style path (e.g.,
C:\...) was treated as relative by the Linux absolute?path check, settinglinkInfo.isRelative = true.SetFromLinkPathprefixed the symlink�s zip?internal directory when buildingrelatPath, lettingIsSafePath(relatPath)pass despite an absolute Windows target.A subsequent �dangerous link� guard checked
_item.IsDir; non?directory symlinks skipped the validation.Result: symlink creation to arbitrary absolute Windows paths; extracted files written into the link target.
Practical triage checklist:
Search this file for:
IsSafePath,CLinkLevelsInfo::Parse,SetFromLinkPath,CloseReparseAndFile,FillLinkData,CLinkInfo::Parse,_ntOptions.SymLinks_AllowDangerous.Verify absolute?path detection across OS semantics (Linux vs Windows) and that relative/absolute status cannot be desynced by mixed?style paths.
Ensure �dangerous link� checks run for both files and directories; avoid
_item.IsDirshort?circuiting validation for file symlinks.Confirm
IsSafePathevaluates the final target path after concatenations; normalize before validation.
Quick repro (Windows, developer mode or elevated):
Create zip structure:
data/link? symlink toC:\Users\<USER>\Desktopdata/link\calc.exe? payload file
If
linkis extracted first, subsequent writes follow the symlink into the absolute target directory.
Apple Patch Diffing
Identify a CVE of interest
Download corresponding IPSW and update and N-1
use ipsw.me to download those files
convert the downloaded
.ipswto.zip
Determine changes for update
you can use IPSW tool to download and diff
Map binaries to CVE
Extract the related file(s)
Diff the binaries
In Ghidra set Decompiler Parameter ID to true
Leverage IDAObjectTypes
Root cause the vulnerability
Windows�11 Patch Diffing
This checklist mirrors the Apple IPSW workflow but uses Microsoft tooling and build numbers.
Identify the target update
Open Settings???Windows?Update???Update history or consult the Windows?Release?Health dashboard to note the KB and OS build numbers (e.g., KB5037778�? build?22631.3525).
Record the previous build you want to diff against (e.g., 22631.3447).
Collect the binaries
Fetch matching symbols
Load in the disassembler
Open tcpip.sys from both pre and post folders in IDA�8+ or Ghidra�11; ensure PDB symbols resolve.
Save the IDA databases (e.g.,
tcpip_3447.i64,tcpip_3525.i64).
Run the diff
BinDiff?7: Tools???BinDiff???Diff Database� and select the two IDBs to generate a
.BinDiffreport.Ghidriff (headless):
Triage the results
Sort by Similarity?% ascending; investigate anything below 95?%.
Focus on functions with names like
Validate,Parse,Copy,Check, or protocol?specific handlers (IppReceiveEsp,Ipv6pFragmentReassemble, etc.).Determine whether changes add bounds checks, size validations, or privilege checks.
Validate in a lab VM
Snapshot two Windows?11 VMs (build 3447 and 3525).
Attach WinDbg (kernel mode) using
bcdedit /dbgsettings net hostip:<IP> port:<PORT>.Reproduce the issue against the pre?patch VM; confirm no crash or breakpoint triggers in the post?patch VM.
Automate monthly
Schedule a PowerShell script that, every Patch Tuesday (second Tuesday), downloads the latest Cumulative Update, extracts changed PE files, retrieves symbols, and launches a headless Diaphora diff.
Email the generated HTML report to quickly spot new attack surface.
[!TIP] For large modules like ntoskrnl.exe, diff only the
.textsection to save RAM: bindiff --primary ntoskrnl_pre.i64 --secondary ntoskrnl_post.i64 --section .text
Linux Kernel Patch Diffing
Patch?diffing Linux kernels is often faster at the source level, but for binary?only targets (vendor kernels, modules) function?level diffing is still practical.
Identify target builds
Note distro and kernel build (e.g., Ubuntu
6.8.0-47-generic, RHEL5.14.0-503).Capture both pre and post versions (package changelogs or CVE bulletins help).
Fetch kernel images and debug info
Ubuntu/Debian:
Fedora/RHEL/CentOS:
Extract
vmlinuxIdentify changed modules quickly
Function?level binary diff
Open
vmlinux-<pre>andvmlinux-<post>in Ghidra 11/IDA 8 and run Diaphora/BinDiff/Ghidriff.For hot subsystems (e.g.,
io_uring,net/ipv6,fs/overlayfs), diff only the relevant.kopairs to reduce noise.
Source?level triage (when sources are available)
Symbolization and crash mapping (cheat?sheet)
[!TIP] For modern distros built with Clang: KCFI and fine?grained CFI thunks create many small stub changes; filter by real function body deltas to focus on security?relevant logic.
[!NOTE] Syzkaller routinely bisects kernel bugs; consult syzbot reports for reproducers and fix commits, then confirm your diff isolates the same region before deeper RE.
Kernel network parser identification heuristics (SMB2-inspired, broadly applicable)
Cross-field invariants (length/offset/next)
Always validate
(offset + length) <= remaining_bufferand<= total_bufferusing a widened type (e.g.,u64) before arithmetic; reject on overflow withcheck_add_overflow()/array_size()helpers.For chained entries with a
nextfield, assert:next >= sizeof(entry_header),next <= remaining_buffer, and that pointer advancement actually makes progress. For entries carrying sub-lengths (e.g.,name_len,value_len), assertheader + name_len + value_len <= next.Do not cast to a struct until the full header is present and aligned; gate recasts with a prior
buf_lencheck.
Fixed-size buffers vs variable-length payloads
Ban unbounded copies/crypto/decompression into fixed-size arrays. Require
len <= sizeof(array)(or clamp withmin_t()and bail) when writing into in-struct arrays.Crypto transforms are just writes with extra steps: if using ARC4/AES helpers that copy
lenbytes into a fixed buffer (e.g., session keys), boundlenagainst a named maximum constant and prefer allocating a buffer sized from validatedlen.
Type/width hazards
Normalize parser math to a wide unsigned type before comparisons; avoid truncating
u32/u64fields intou16for size checks. Favorsize_t/u64foroffset+lenarithmetic, then compare tobuf_lenof the same width.
Loop structure around next
Pattern to flag:
e = (struct entry *)((char *)e + next);without a preceding block that revalidatesbuf_lenand the entry�s internal sub-lengths.Ensure a break condition on exhaustion and reject zero/negative progress values to avoid infinite loops or pointer stagnation.
Allocation-size correlation
When parser-controlled
leninfluences a subsequent write into an object from a fixed SLUB cache (e.g.,kmalloc-512), ensure the write length is bounded by the destination object field, not just the incoming length.
Patch-diff signals to prioritize
Newly added guards like
if (len > CONST) return -EINVAL;,if (buf_len < sizeof(struct foo)) return -EINVAL;, or conversions tomin_t(size_t, len, sizeof(...)).Insertions of
check_add_overflow(offset, len, &sum)orarray_size(n, sz)helpers in hot parse paths.
Static query seeds (Semgrep/CodeQL), to tune per codebase
Unbounded copies into struct fields:
Dangerous
next-driven pointer arithmetic without bounds checks:Crypto/decompression writes to fixed arrays (seed with function names in your tree, e.g.,
*_crypt,*_decrypt,decompress_*).
Dynamic confirmation (cheap)
Grammar fuzz small invariants: send
next < header,next > remaining,name_len + value_len > next, andlen > MAX_CONSTvariants; expect-EINVAL/reject. If not, investigate.Use TUN/TAP + KCOV to drive packet/SMB request paths; enable KASAN/KMSAN to surface overflows/leaks early.
Reference (motivating example)
Lessons distilled from a 2025 ksmbd remote chain writeup combining a fixed-buffer overflow in NTLM auth with an EA
nextvalidation issue � see Will�s Root: Eternal?Tux: KSMBD 0?Click RCE (https://www.willsroot.io/2025/09/ksmbd-0-click.html).
Case Study: EvilESP Vulnerability (CVE-2022-34718)
This case study demonstrates real-world patch diffing to identify a Windows TCP/IP RCE vulnerability.
Vulnerability Overview
CVE-2022-34718: Critical RCE in
tcpip.sysdiscovered in September 2022An unauthenticated attacker could send specially crafted IPv6 packets to Windows nodes with IPsec enabled
Affects the handling of ESP (Encapsulating Security Payload) packets in IPv6 fragmentation
Patch Diffing Process
Binary Acquisition
Used Winbindex to obtain sequential versions of
tcpip.sys(pre-patch and post-patch)Loaded both files in Ghidra with PDB symbols
Diff Analysis
Used BinDiff to compare the binaries
Identified only two functions with less than 100% similarity:
IppReceiveEspandIpv6pReassembleDatagram
Code Analysis
Ipv6pReassembleDatagram: Added bounds check comparing
nextheader_offsetagainst the header buffer lengthIppReceiveEsp: Added validation for the Next Header field of ESP packets
Root Cause Identification
Found an out-of-bounds 1-byte write vulnerability
ESP Next Header field is located after the encrypted payload data
A malicious packet could cause
nextheader_offsetto exceed the allocated buffer size
(Update: Server 2022 build 20349.2300, May 2024, hardened this code path; the original PoC needs a 2-byte pad tweak to reproduce the crash.)
Exploitation
Required setting up IPsec security association on the victim
Created fragmented IPv6 packets encapsulated in ESP
Controlled the offset of the out-of-bounds write through payload and padding size
Value written is controllable via the Next Header field
Limited to writing to addresses that are 4n-1 aligned (where n is an integer)
Initially achieved DoS with potential for RCE through further exploitation
Lessons Learned
Binary patch diffing effectively identified the vulnerability location and nature
Understanding protocol specifications (ESP and IPv6 fragmentation) was critical
Simple buffer checks are still overlooked in complex networking code
Even limited primitives (single byte overwrite at constrained offsets) can be dangerous
For modern exploitation techniques, see Modern Samples
For mitigation bypass techniques, see Modern Mitigations
When applying patch diffing to networking protocols:
Understand the protocol specifications thoroughly
Look for missing bounds checks in data processing
Pay attention to buffer size calculations
Check for proper validation of protocol field values and locations
Consider evasion techniques for exploit deployment - see EDR
Specs: ESP (RFC 4303) and IPv6 (RFC 8200) are essential references when reasoning about header placement and bounds
Semi-Automatic Patch Diffing
You can also use Diaphora instead of BinDiff
Manual Patch Diffing
Microsoft releases patches on the second Tuesday of each month
For Windows you can go to update catalog and search for the product version (for example
2022-10 x64 "Windows 10" 22H2)Try to look for smaller updates
You can use Patch Extract instead
Open unpatched version in IDA as the primary and the second, after that use BinDiff add-on to find the differences between them then right click on a different matched function and see the visual diff in bin diff also you can uncheck proximity browsing to see the entire function look at red blocks and then yellow blocks
With patch clean script you can only see the actual changed files
Static Analysis Tools
IDA Pro and Rust Tools for Vulnerability Research
rhabdomancer: IDA Pro headless plugin that locates calls to potentially insecure API functions in binary files
Helps auditors backtrace from candidate points to find pathways allowing access from untrusted input
Generates JSON/SARIF reports containing vulnerable function calls and their details
Written in Rust using IDA Pro 9 idalib and Binarly's idalib Rust bindings
haruspex: IDA Pro headless plugin that extracts pseudo-code generated by IDA Pro's decompiler
Exports pseudo-code in a format suitable for IDEs or static analysis tools like Semgrep/weggli
Creates individual files for each function with their pseudo-code
Can be used as a library by third-party crates
augur: IDA Pro headless plugin that extracts strings and related pseudo-code from binary files
Stores pseudo-code of functions that reference strings in an organized directory tree
Helps trace how strings are used within the application
Complements other reverse engineering tools
Modern Static Analysis Tools
Semgrep Pro � Cloud-augmented SAST with custom rule sharing, LLM-assisted rule writing
CodeQL � GitHub's semantic code analysis, excellent for variant analysis
Weggli � Fast semantic search for C/C++ (better than grep for code patterns)
Joern � Code property graph analysis for vulnerability discovery
Ghidra 11.2+ � Built-in ML-powered function signature recognition
Binary Ninja 4.0 � Cloud collaboration, improved HLIL decompilation
Cutter � Rizin GUI with built-in decompiler integration
Deprecated Tools (Avoid)
Intel Pin ? Use DynamoRIO or Frida (Pin is sustain-only mode)
WinAFL ? Use AFL++ 4.x (integrated Windows support)
Old BinDiff ? Use BinDiff 8 or Ghidriff
For more details about these tools: Streamlining vulnerability research with IDA Pro and Rust
Dynamic Analysis Methods
Dynamic analysis examines code during execution to identify vulnerabilities in real-time operation.
Hybrid Reverse Engineering (dynamic)
Reverse engineering involves analyzing an application while it runs to understand its behavior.
Network Protocol Analysis Example
Identify ports used by a program
Use WinDbg to debug it
Set a breakpoint on
bp wsock32!recvWrite a python script to send data to that port
After your breakpoint hits, continue execution using
ptand checkraxfor result anddd rsp L5for input bufferlm m <exe_name>to identify its location on the diskOpen that executable with IDA and rebase module
Use the information printed by WinDbg
kcommand to find the location you need to investigateSet a hardware breakpoint on your input buffer, only check functions that touches it
Dump the callstack when your breakpoint got hit
Try to identify the correct packet structure
Then try to send proper malformed packets to crash it
Hypervisor Debugging and Analysis
Debugging Windows Hypervisor (Hyper-V) requires specialized setup and knowledge
Setup requirements:
Host: Windows 10/11 with WinDbg
Guest: Windows VM with Virtualization-based security (VBS) enabled
Configure debugging with
bcdedit /hypervisorsettingscommands
Analyzing Hypervisor components:
Target
hvix64.exe(Intel processor hypervisor core)Use bindiff with
winload.efiand older versions ofhvloader.dllfor insightLeverage Hypervisor Top Level Functional Specification (TLFS) documentation
Inspecting Secure Kernel (SK) Calls:
Monitor VTL calls (transitions from VTL0 to VTL1)
Use conditional breakpoints on
nt!VslpEnterIumSecureModeExamine secure service call numbers (SSCN)
Use tools like hvext to translate guest virtual addresses
Memory access techniques:
Translate Guest Virtual Address (GVA) to Guest Physical Address (GPA)
Translate GPA to Host Physical Address (HPA) when needed
Understand Second Level Address Translation mechanisms
For mitigation techniques, see Mitigation or Modern Mitigations
Binary Instrumentation
Binary instrumentation is a prerequisite for advanced dynamic analysis methods.
Intel Pin: is a dynamic binary instrumentation framework
It enables you to write modules that execute some program and insert code into it at execution time to track things
Can easily hook any function in the program and collect runtime data
Can be combined with Ghidra/IDA to provide more information
[!NOTE] Intel?Pin is now in sustain?only mode � no new ISA extensions will be added. Prefer DynamoRIO or Frida?Stalker for forward?looking projects.
DynamoRIO: more powerful and faster alternative to
Intel Pin, but harder to setupFrida: Slower, better for mobile applications
Dynamic Taint Analysis
Technique to determine how data flow through a given program and influence its state
Done by marking certain bytes of memory and tracking how it flows through program execution
Often implemented on top of dynamic binary instrumentation tools like Intel Pin
Steps:
Define taint sources: data that you want to track
Define taint sinks: program locations that we want to check if they get influenced by data from taint sources
Tracking input propagation: all instructions handling taint data need to be instrumented
Symbolic Execution
Transform the program into a mathematical equation that can be solved
Instructions are simulated to maintain a binary formula describing the satisfied conditions for each path
Scan types:
Static: S2E based on source code, not reliant on architecture, very hard to reason about kernel/librariesDynamic: combine concrete state with symbolic state, scales better
You can augment fuzzers via symbolic execution, driller is an example
Challenges with Symbolic Execution
Memory
Complex data structures are hard to symbolize
Environment
How are library and system calls handled
Symbolically executing massive library functions hurt performance
State/Path Explosion
Biggest issue, loops/nested conditionals exponentially increases execution state
Heuristics are often used to determine with paths to follow
Constraint Solving
Once the formula gets large enough, it gets hard to solve and impacts performance
eBPF?based Dynamic Tracing
bpftrace / BCC: attach uprobes/kprobes to collect heap, syscall, or scheduler events with minimal overhead
Ideal for pre?fuzz triage and live coverage measurement on kernels or containerised workloads
Coverage Recon (quick)
Load into Lighthouse (IDA) or lcov HTML to identify cold paths reachable from attacker input
Use results to select fuzz targets and build a minimal, representative seed corpus
Fuzzing
Fuzzing is a technique where you feed the application malformed inputs and monitor for crashes or unintended behaviors. See the dedicated Fuzzing document for more detailed techniques.
Fuzzing Overview
What is Fuzzing?
Target software parses controllable input
We create and/or mutate input and feed it into program
Find crashes
What a Fuzzer Does
Generic but platform/architecture specific
Handles Input Generation/Mutation/Saving (called Corpus)
Instrumenting Target (running, resetting, getting feedback)
Reporting Crashes
What a Harness Does
Target Specific
Handles Feeding the input into the target
Fuzzer vs Harness Relationship
Find Top Level Callable Functions
Use Harness to Call that Function and Feed Input to it
Fuzzer Generates and Sends the Input to Harness and Collects Coverage and Detects Crashes
Crash Detection Techniques
Paged Heap (heap overflows, UAF)
Guard pages between allocations
Address Sanitizer (overflows + more)
Shadow memory (by inserting red zones in-between stack objects)
Memory Sanitizer (uninitialized variable read)
Memory Leak, Used to Break ASLR
Cluster crashes with token?based Capstone?hashing or
gdb?script dedup.pybefore manual analysis.
Fuzzing Tools
Modern Fuzzing Frameworks
AFL++ 4.21+ � Unified cross-platform fuzzer with Windows support, CMPLOG, QEMU-mode
LibAFL 0.13+ � Rust-native, highly customizable, supports in-process and fork-server modes
Honggfuzz � Persistent-mode Windows support, hardware-based feedback
Nyx � Full-VM snapshot fuzzing with KVM acceleration
ICICLE � Fast Windows kernel fuzzing framework
Syzkaller � Kernel fuzzer with LLM-guided seed selection (2025)
Instrumentation & Coverage
DynamoRIO � Used by AFL++, faster than Intel Pin (Note: Intel Pin is now sustain-only)
Frida Stalker � Cross-platform dynamic instrumentation
Intel PT � Hardware-accelerated coverage collection
Emerald � Generates
drcovdata for coverage visualization
Specialized Fuzzers
ChatAFL / LLM-driller � LLM-guided corpus expansion (+30% coverage on complex targets)
Reads-From Fuzzer (RFF) � Concurrency fuzzer for race/TOCTOU bugs
LibFuzzer � In-process, coverage-guided fuzzer for source-available targets
Radamsa ? Replaced by LibAFL mutators (more coverage-aware)
Symbolic Execution Engines
Triton � Dynamic binary analysis framework
Angr � Binary analysis platform with symbolic execution
Manticore � Dynamic symbolic execution tool
S2E � Selective symbolic execution platform
Continuous-Integration Fuzzing
ClusterFuzzLite � GitHub Actions/CI runner that feeds corpora to AFL++, libFuzzer or honggfuzz and files issues automatically.
Snapshot Fuzzing
VMM Snapshot Fuzzing
QEMU Snapshots: Fast restoration for stateful targets
KVM Acceleration: Dirty page tracking for efficient resets
Persistent Mode: Memory-only reset without full VM restore
Fuzzing Types
Dumb Fuzzing
Just sending random data to the target
Smart Fuzzing
Mutation Based: Test cases are obtained by applying mutations to valid, known good samples (e.g., Radamsa)
Generation(Grammar) Based: Test cases are obtained by modeling files or protocol specifications based on models, templates, RFCs, or documentation (e.g., Peach Fuzzer)
Model Based: Test cases are obtained by modeling the target protocol/file format (when you want to test the target's ability to accept and process invalid sequences of data)
Differential Fuzzing: Comparing outputs of different implementations with the same input
Evolutionary Fuzzing
Test cases and inputs are generated based on the response from the program
AFLis an exampleOr Google ClusterFuzz
Concurrency Fuzzing
Systematically permutes thread scheduling to uncover data?race, atomicity, and TOCTOU vulnerabilities that traditional coverage?guided fuzzers miss
LLM?Guided Fuzzing
ChatAFL � integrates an LLM to propose protocol?aware mutations; boosts state coverage on network daemons by ~40?%.
SyzAgent / SyzLLM (2025?02) � schedules kernel
syzprograms suggested by an LLM fine?tuned on the Syzkaller corpus.NumSeed � leverages natural?language descriptions of inputs to seed generation for binary?only targets.
Combined Method
The most effective approach often combines multiple techniques:
Reverse engineer first to identify interesting parts
Fuzz those parts to find crashes
Investigate crashes to find exploitable vulnerabilities
For shellcode development, see Shellcode
AI/ML-Assisted Vulnerability Discovery
Modern vulnerability research increasingly leverages machine learning and large language models to accelerate discovery and analysis.
LLM-Powered Triage and Analysis
Automated Crash Analysis:
GPT-4/Claude for interpreting crash dumps and stack traces
Automated root-cause hypothesis generation from fuzzer output
Natural language queries against large codebases for vulnerability patterns
Decompilation Enhancement:
Ghidra's ML-powered function signature recognition (11.2+)
Binary Ninja's HLIL AI improvements for cleaner pseudocode
Automated variable and function renaming based on context
AI-Powered Vulnerability Scanners
ZeroScan: Deep learning model trained on CVE datasets to identify vulnerability patterns in binaries
BigCode/StarCoder Models: Fine-tuned on security-relevant code for pattern recognition
CodeQL with ML: GitHub's semantic analysis enhanced with machine learning classifiers
LLM-Assisted Fuzzing
ChatAFL: Uses LLMs to generate input grammars and seed corpus based on protocol documentation
HyLLFuzz: GPT/Llama-3 generates branch-targeted mutations achieving ~1.3� edge coverage improvement
Grammar Inference: Automatically derive input structure from example files using transformer models
Practical Integration
Limitations and Considerations
False Positives: AI models can hallucinate vulnerabilities; manual verification essential
Training Data Bias: Models trained on public CVEs may miss novel vulnerability classes
OPSEC: Avoid sending proprietary code to cloud LLM APIs; use local models (Llama, Mistral)
Context Windows: Most models have 4K-32K token limits; chunk large binaries/logs appropriately
Quick Reference: Tool Selection Guide
By Target Type
Linux Kernel
Syzkaller, AFL++
KASAN, KCOV, ftrace
Windows Kernel
ICICLE, WinAFL
Verifier, KFUZZ
Browsers
LibFuzzer, Domato
ClusterFuzz, Dharma
Network Services
AFL++, Boofuzz
Peach, Sulley
Mobile Apps
QARK, Frida
MobSF, Objection
Web Apps
Burp Suite, FFUF
Nuclei, Semgrep
Firmware
Binwalk, EMBA
FACT, Firmwalker
Containers
Trivy, Falco
Grype, Syft
By Technique
Coverage Fuzzing
AFL++ 4.21+
Cross-platform, CMPLOG support
Snapshot Fuzzing
Nyx, QEMU+AFL++
Stateful target support
Concurrency Fuzzing
RFF, ThreadSanitizer
Race condition detection
Symbolic Execution
Angr, Triton
Path exploration
Taint Analysis
DynamoRIO, Triton
Data flow tracking
Binary Diffing
BinDiff 8, Ghidriff
Patch analysis
Static Analysis
CodeQL, Semgrep
Pattern matching
Dynamic Analysis
Frida, DynamoRIO
Runtime instrumentation
Tool Migration Path
Intel Pin
DynamoRIO
Pin is sustain-only
WinAFL
AFL++ 4.x
Integrated Windows support
Radamsa
LibAFL mutators
Better coverage awareness
BinDiff 7
BinDiff 8/Ghidriff
Improved algorithms
IDA 7.x
IDA 8.x/Ghidra 11
Better decompilation
Last updated