MSC37-J. Prevent XPath Injection

Because of its platform independence, flexibility, and relative simplicity, the Extensible Markup Language (XML) has been adopted in a wide variety of applications, from remote procedure calls to data storage. However, because of its versatility, XML is vulnerable to attacks that change the structure of the document. These attacks can be broadly classified into two types: XML injection and XPath injection.

XPath injection can occur when an XML document is used for data storage in a manner similar to a relational database. In this respect, XPath injection is similar to SQL injection: an attacker includes query logic in a data field so that the conditional part of the query resolves as a tautology or otherwise gives the attacker access to information to which the attacker is not entitled.

XPath Injection Example

Consider the following XML document being used as a database:

<users>
  <user>
    <login>Utah</login>
    <password>test123</password>
  </user>
  <user>
    <login>Bohdi</login>
    <password>password</password>
  </user>
  <user>
    <login>Busey</login>
    <password>abc123</password>
  </user>
</users>

Unsafe code will attempt to retrieve a user from this file with an XPath statement constructed dynamically from user input:

str_query = "//users/user[login/text()='" & login &
            "' and password/text()='" & password & "']"

An attacker may therefore specify input such as login = "' or '1'='1" and password = "' or '1'='1", yielding the following query string:

//users/user[login/text()='' or '1'='1' and password/text()='' or '1'='1']

Because the final condition, '1'='1', is always true and is joined to the rest of the predicate by or, the predicate holds for every user node, so the query returns all of the records in the XML file.

Noncompliant Code Example

XPath injection may occur when:

Data is read from an untrusted source (such as user input).
Data is subsequently written to an XPath query string without proper sanitization.

In the following example, a login and password are read from the user and concatenated into the query string, which exposes the code to the attack illustrated above.

                
import java.io.IOException;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import javax.xml.parsers.*;
import javax.xml.xpath.*;

public class XpathInjectionExample {

  public boolean doLogin(String login, String password)
      throws ParserConfigurationException, SAXException, IOException,
             XPathExpressionException {

    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true);
    DocumentBuilder builder = domFactory.newDocumentBuilder();
    Document doc = builder.parse("users.xml");

    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    // Noncompliant: untrusted input is concatenated directly into the query
    XPathExpression expr = xpath.compile("//users/user[login/text()='" + login
        + "' and password/text()='" + password + "']");
    Object result = expr.evaluate(doc, XPathConstants.NODESET);
    NodeList nodes = (NodeList) result;

    // Print the text content of each matching user node to the console
    for (int i = 0; i < nodes.getLength(); i++) {
      System.out.println(nodes.item(i).getTextContent());
    }

    // Any matching node is treated as a successful login
    return nodes.getLength() >= 1;
  }
}

When an attacker supplies input such as that shown in the attack above, the call to evaluate() returns every user node in the XML file, so doLogin() returns true and authentication is bypassed.
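The effect can be reproduced without a users.xml file. The following is a minimal sketch, not part of the original example: the class name XPathInjectionDemo, the helper countMatches, and the in-memory copy of the document are assumptions made for illustration, and the injected string ' or '1'='1 is one illustrative tautology-producing input. The query is built by the same string concatenation as in the noncompliant code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathInjectionDemo {

  // In-memory copy of the users document (assumption: a real application
  // would read users.xml from disk instead).
  static final String USERS_XML =
      "<users>"
      + "<user><login>Utah</login><password>test123</password></user>"
      + "<user><login>Bohdi</login><password>password</password></user>"
      + "<user><login>Busey</login><password>abc123</password></user>"
      + "</users>";

  // Builds the query by string concatenation, as the noncompliant code does,
  // and returns the number of user nodes the query matches.
  static int countMatches(String login, String password) {
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(USERS_XML)));
      XPath xpath = XPathFactory.newInstance().newXPath();
      String query = "//users/user[login/text()='" + login
          + "' and password/text()='" + password + "']";
      NodeList nodes =
          (NodeList) xpath.evaluate(query, doc, XPathConstants.NODESET);
      return nodes.getLength();
    } catch (Exception e) {
      // Wrap checked parser and XPath exceptions to keep the sketch short.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Correct credentials, wrong password, then injected input.
    System.out.println(countMatches("Utah", "test123"));
    System.out.println(countMatches("Utah", "wrong"));
    System.out.println(countMatches("' or '1'='1", "' or '1'='1"));
  }
}
```

Running main should print 1, 0, and 3: the injected input matches all three user nodes even though it corresponds to no stored login or password.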

Compliant Solution

XPath injection can be prevented with many of the same methods used to prevent SQL injection and with input sanitization in general. These methods include the following:

Assume all input may include an attack.
When validating user input, verify the data type, length, format, and contents. For example, construct a regular expression that checks for XML tags and special characters in user input.
In a client-server application, perform validation at both the client and the server.
Extensively test applications that process user-supplied input.
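One way these measures might be combined is sketched below. This is an illustration under stated assumptions rather than a definitive solution: the class name SafeXPathLogin, the helpers parse, isValid, and matches, and the whitelist pattern (1 to 32 letters or digits) are hypothetical. The sketch validates input with a regular expression and then, by analogy with parameterized SQL queries, passes the values to the query as XPath variables through the standard javax.xml.xpath.XPathVariableResolver interface, so that user data is never parsed as part of the expression.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SafeXPathLogin {

  // Hypothetical whitelist: 1 to 32 letters or digits. Adjust the pattern to
  // the credential format the application actually allows.
  private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9]{1,32}");

  // Parses an XML string into a DOM document (helper for the sketch).
  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Checks length and contents of the input against the whitelist.
  static boolean isValid(String input) {
    return input != null && ALLOWED.matcher(input).matches();
  }

  // Runs a fixed query; user data is bound to $login and $password as plain
  // strings and is never concatenated into the expression text.
  static boolean matches(Document doc, String login, String password) {
    Map<String, String> vars = new HashMap<>();
    vars.put("login", login);
    vars.put("password", password);
    try {
      XPath xpath = XPathFactory.newInstance().newXPath();
      xpath.setXPathVariableResolver(
          (QName name) -> vars.get(name.getLocalPart()));
      NodeList nodes = (NodeList) xpath.evaluate(
          "//users/user[login/text()=$login and password/text()=$password]",
          doc, XPathConstants.NODESET);
      return nodes.getLength() >= 1;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  // Validates first, then evaluates the parameterized query.
  static boolean doLogin(Document doc, String login, String password) {
    return isValid(login) && isValid(password) && matches(doc, login, password);
  }
}
```

With this approach, an input such as ' or '1'='1 is rejected by isValid; even if it reached matches, it would be compared as a literal string against the stored values rather than interpreted as query logic.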