Because of its platform independence, flexibility, and relative simplicity, the extensible markup language (XML) has been adopted in a wide variety of applications, from remote procedure calls to data storage. However, because of its versatility, XML is vulnerable to attacks that change the structure of the document. One such attack is XML injection (MSC36-J. Prevent XML Injection), in which XML tags are injected directly into data fields. A variant of this is XPath injection, in which the attacker manipulates a query against an XML document.
XPath injection can occur when an XML document is used for data storage in a manner similar to a relational database. In this respect, XPath injection is similar to an SQL injection attack (MSC34-J. Prevent against SQL Injection), in which an attacker includes query logic in a data field so that the conditional part of the query resolves as a tautology or otherwise gives the attacker access to information to which the attacker is not entitled.
XPath Injection Example
Consider the following XML document being used as a database:
    <users>
      <user>
        <login>Utah</login>
        <password>test123</password>
      </user>
      <user>
        <login>Bohdi</login>
        <password>password</password>
      </user>
      <user>
        <login>Busey</login>
        <password>abc123</password>
      </user>
    </users>
Unsafe code will attempt to retrieve a user from this file with an XPath statement constructed dynamically from user input.
    str_query = "//users/user[login/text()='" & login & "' and password/text()='" & password & "']"
An attacker may therefore specify input such as login = "' or '1'='1" and password = "' or '1'='1", yielding the following query string:
    //users/user[login/text()='' or '1'='1' and password/text()='' or '1'='1']
Because and binds more tightly than or in XPath, the final '1'='1' term makes the predicate true for every user node, so the query reveals all the records in the XML file.
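The effect of the tautology can be reproduced with a small, self-contained sketch. It assumes an in-memory copy of the example document and the standard javax.xml.xpath API. The class name XPathTautologyDemo, its helper methods, and the payload ' or '1'='1 (chosen so that the quotes in the final query stay balanced) are illustrative assumptions, not part of the original example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathTautologyDemo {
    // In-memory copy of the example user database (assumption: same content as users.xml).
    static final String USERS_XML =
        "<users>"
        + "<user><login>Utah</login><password>test123</password></user>"
        + "<user><login>Bohdi</login><password>password</password></user>"
        + "<user><login>Busey</login><password>abc123</password></user>"
        + "</users>";

    // Deliberately unsafe: splices raw input into the query text.
    static String buildQuery(String login, String password) {
        return "//users/user[login/text()='" + login
            + "' and password/text()='" + password + "']";
    }

    // Returns how many <user> nodes the concatenated query selects.
    static int countMatches(String login, String password) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            buildQuery(login, password),
            new InputSource(new StringReader(USERS_XML)),
            XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countMatches("Utah", "test123"));             // prints 1
        System.out.println(countMatches("Utah", "wrong"));               // prints 0
        System.out.println(countMatches("' or '1'='1", "' or '1'='1"));  // prints 3
    }
}
```

The third call selects all three user nodes even though no valid credentials were supplied, which is the behavior an attacker relies on.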
Noncompliant Code Example
XPath injection may occur when:
- Data is read from an untrusted source (such as user input).
- Data is subsequently written to an XPath query string without proper sanitization.
Consider the following example in which a login and password are read from the user and used to construct the query string, in the context of the attack illustrated above.
    import java.io.IOException;
    import org.w3c.dom.*;
    import org.xml.sax.SAXException;
    import javax.xml.parsers.*;
    import javax.xml.xpath.*;

    public class XpathInjectionExample {
        public boolean doLogin(String loginID, String password)
                throws ParserConfigurationException, SAXException, IOException, XPathExpressionException {
            DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
            domFactory.setNamespaceAware(true);
            DocumentBuilder builder = domFactory.newDocumentBuilder();
            Document doc = builder.parse("users.xml");

            XPathFactory factory = XPathFactory.newInstance();
            XPath xpath = factory.newXPath();
            // Noncompliant: untrusted input is concatenated directly into the query text
            XPathExpression expr = xpath.compile("//users/user[login/text()='" + loginID
                    + "' and password/text()='" + password + "']");
            Object result = expr.evaluate(doc, XPathConstants.NODESET);
            NodeList nodes = (NodeList) result;

            // Print the login name of each matching user to the console
            for (int i = 0; i < nodes.getLength(); i++) {
                Element user = (Element) nodes.item(i);
                System.out.println(user.getElementsByTagName("login").item(0).getTextContent());
            }
            // Any match is treated as a successful login
            return nodes.getLength() >= 1;
        }
    }
When loginID and password carry injected query logic, as in the attack illustrated above, the evaluate call returns every user node in the XML file, so doLogin returns true and authentication is bypassed.
Compliant Solution
XPath injection can be prevented with many of the same methods used to prevent SQL injection, and input sanitization in general. These methods include:
- Assume all input may include an attack.
- When validating user input, verify the data type, length, format and contents. For example, construct a regular expression that checks for XML tags and special characters in user input.
- In a client-server application, perform validation at both the client and server.
- Extensively test applications which supply user input.
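The validation advice above can be sketched as an allowlist check applied before any query is built. The class name LoginInputValidator and the policy itself (1 to 32 ASCII letters, digits, or underscores) are assumptions made for illustration. A real application would choose a pattern that fits its own credential rules, and password fields in particular usually need a wider character set.

```java
import java.util.regex.Pattern;

public class LoginInputValidator {
    // Assumed policy: 1 to 32 ASCII letters, digits, or underscores.
    // Quotes, brackets, and other XPath metacharacters are rejected as a consequence.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_]{1,32}");

    // Returns true only if the whole input matches the allowlist pattern.
    static boolean isValid(String input) {
        return input != null && ALLOWED.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Utah"));          // prints true
        System.out.println(isValid("' or '1'='1"));   // prints false
    }
}
```

An allowlist is used here rather than a denylist of known-bad characters because it fails closed: anything not explicitly permitted is rejected.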
For similar vulnerabilities such as SQL injection, the best practice is a technique called parameterization, in which user-specified data is passed to an API as a parameter, and the API ensures that no data specified by the user is interpreted as executable logic. The javax.xml.xpath API has no direct counterpart to JDBC's PreparedStatement. However, this functionality can be emulated by using an interface such as XQuery, which enables the programmer to parameterize data by specifying the query statement in a separate file and supplying only the data at runtime. Consider the example illustrated above, together with the following query specified in a text file and the following source code.
Input File: login.qry
    declare variable $loginID as xs:string external;
    declare variable $password as xs:string external;
    //users/user[login = $loginID and password = $password]
Source Code
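    // Builder, Document, and Nodes are XOM classes (nu.xom); XQuery and XQueryFactory
    // are assumed here to come from the Nux library (nux.xom.xquery).
    Document doc = new Builder().build("users.xml");
    XQuery xquery = new XQueryFactory().createXQuery(new File("login.qry"));

    // Bind the external variables declared in login.qry; the names must match exactly.
    // In a real login routine these values would come from user input.
    Map<String, Object> queryVars = new HashMap<>();
    queryVars.put("loginID", "Utah");
    queryVars.put("password", "test123");

    Nodes results = xquery.execute(doc, null, queryVars).toNodes();
    for (int i = 0; i < results.size(); i++) {
        System.out.println(results.get(i).toXML());
    }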
Using this method, the data specified in loginID and password will not be interpreted as executable expressions at runtime.
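The same separation of query text and data can also be sketched with only the JDK, by referencing XPath variables in a fixed query and supplying their values through javax.xml.xpath.XPathVariableResolver. This is offered as an alternative illustration, not as the approach taken in the text. The class name XPathVariableLogin and the in-memory copy of the user document are assumptions.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathVariableLogin {
    // In-memory copy of the example user database (assumption: same content as users.xml).
    static final String USERS_XML =
        "<users>"
        + "<user><login>Utah</login><password>test123</password></user>"
        + "<user><login>Bohdi</login><password>password</password></user>"
        + "<user><login>Busey</login><password>abc123</password></user>"
        + "</users>";

    // The query text is a constant; user data never becomes part of it.
    static final String QUERY =
        "//users/user[login/text()=$loginID and password/text()=$password]";

    static boolean doLogin(String loginID, String password) throws XPathExpressionException {
        if (loginID == null || password == null) {
            return false; // the resolver must not hand back null values
        }
        Map<String, String> vars = new HashMap<>();
        vars.put("loginID", loginID);
        vars.put("password", password);

        XPath xpath = XPathFactory.newInstance().newXPath();
        // The resolver is asked for the value of each $variable in the query.
        xpath.setXPathVariableResolver((QName name) -> vars.get(name.getLocalPart()));

        NodeList nodes = (NodeList) xpath.evaluate(
            QUERY,
            new InputSource(new StringReader(USERS_XML)),
            XPathConstants.NODESET);
        return nodes.getLength() >= 1;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(doLogin("Utah", "test123"));              // prints true
        System.out.println(doLogin("' or '1'='1", "' or '1'='1"));   // prints false
    }
}
```

Because the variable values are compared as plain strings, a payload containing quotes or XPath operators simply fails to match any stored login.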
Risk Assessment
Failing to validate user input may result in a Java application being seriously compromised. Information disclosure is the most likely outcome: as in the example above, an attacker can retrieve every record in the XML document and bypass authentication. In certain cases, such as a document representing users and privileges, the attacker may be able to log in as a more privileged user and thereby run code with elevated privileges.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
MSC36-J-J | medium | medium | medium | P4 | L3 |
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this rule on the CERT website.