4.12. XPath injection

4.12.1. XPath injection definition

An XPath injection attack exploits the loose input handling and fault tolerance of XPath parsers: the attacker attaches malicious XPath query code to URLs, form fields, or other input in order to gain access to privileged information and change it. XPath injection targets Web services that query XML data, and it allows an attacker to obtain the complete content of an XML document through XPath queries without prior knowledge of the XPath query the application uses.

4.12.2. Principle of XPath Injection Attack

XPath injection attacks work by constructing special inputs, usually fragments of XPath syntax. These inputs are passed to the web application as parameters, and when the application executes the resulting XPath query, it performs the operation the attacker intended. The login verification module below illustrates how an XPath injection attack works.

The login verification routine of a Web application generally takes two parameters: username and password. The program authenticates the user with the values submitted. If the authentication data is stored in an XML file, access is granted by looking up the submitted username and password in the user records and checking whether a matching entry is found.

The example user.xml file is as follows:

<users>
     <user>
         <firstname>Ben</firstname>
         <lastname>Elmore</lastname>
         <loginID>abc</loginID>
         <password>test123</password>
     </user>
     <user>
         <firstname>Shlomy</firstname>
         <lastname>Gantz</lastname>
         <loginID>xyz</loginID>
         <password>123test</password>
     </user>
</users>

A typical XPath query for this lookup is:

//users/user[loginID/text()='xyz' and password/text()='123test']
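A query like this is usually assembled from the submitted form values. The following is a minimal sketch of how a vulnerable application might build it, assuming plain string interpolation; the function name build_login_query is hypothetical and does not come from any particular framework.

```python
# Hypothetical helper, for illustration only: it mirrors the vulnerable
# pattern of pasting raw user input into an XPath expression.
def build_login_query(login_id: str, password: str) -> str:
    # The submitted values are placed between the quotes with no escaping,
    # so a quote character in the input can change the query's structure.
    return (
        f"//users/user[loginID/text()='{login_id}' "
        f"and password/text()='{password}']"
    )


query = build_login_query("xyz", "123test")
print(query)
# //users/user[loginID/text()='xyz' and password/text()='123test']
```

With well-behaved input the result is exactly the query shown above; the weakness is that nothing stops the input itself from containing quotes and XPath operators.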

However, an injection attack can bypass this authentication. If the user submits a valid login and password, for example loginID = 'xyz' and password = '123test', the query matches that user's node and the login succeeds. But if the attacker submits a string such as ' or 1=1 or ''=' in both fields, the query also returns a match, because the XPath query becomes:

//users/user[loginID/text()='' or 1=1 or ''='' and password/text()='' or 1=1 or ''='']
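The effect of the injected string can be traced with a short sketch. It assumes the same naive string interpolation as the vulnerable application (build_login_query is a hypothetical name), and, because Python's standard xml.etree.ElementTree does not support or/and inside XPath predicates, it mirrors the injected predicate by hand instead of running it through a real XPath engine.

```python
import xml.etree.ElementTree as ET

USERS_XML = """<users>
    <user>
        <firstname>Ben</firstname>
        <lastname>Elmore</lastname>
        <loginID>abc</loginID>
        <password>test123</password>
    </user>
    <user>
        <firstname>Shlomy</firstname>
        <lastname>Gantz</lastname>
        <loginID>xyz</loginID>
        <password>123test</password>
    </user>
</users>"""

PAYLOAD = "' or 1=1 or ''='"


def build_login_query(login_id: str, password: str) -> str:
    # Hypothetical vulnerable helper: raw input is pasted into the query.
    return (
        f"//users/user[loginID/text()='{login_id}' "
        f"and password/text()='{password}']"
    )


def injected_predicate(user: ET.Element) -> bool:
    # Hand-written mirror of the injected predicate. In XPath, as in
    # Python, `and` binds more tightly than `or`, so the condition groups
    # as: A or B or (C and D) or E or F.
    login_id = user.findtext("loginID")
    password = user.findtext("password")
    return (
        login_id == ""                     # loginID/text()=''
        or 1 == 1                          # 1=1
        or ("" == "" and password == "")   # ''='' and password/text()=''
        or 1 == 1                          # 1=1
        or "" == ""                        # ''=''
    )


injected_query = build_login_query(PAYLOAD, PAYLOAD)
root = ET.fromstring(USERS_XML)
matched = [u.findtext("loginID") for u in root.iter("user") if injected_predicate(u)]

print(injected_query)
print(matched)  # ['abc', 'xyz']
```

Every user node satisfies the condition, although no valid credentials were supplied. A real XPath 1.0 engine would evaluate the injected query to the same set of nodes.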

This condition is always true: in XPath, and binds more tightly than or, so the predicate is a chain of or terms that includes 1=1. The query therefore matches every user node regardless of the stored credentials, and the attacker is always granted access to the system. An attacker can also use injected XPath to navigate the application's XML documents dynamically. Once authentication has been bypassed, the attacker can use blind XPath injection techniques to obtain the most privileged account and other important document information.