Because of its platform independence, flexibility, and relative simplicity, the extensible markup language (XML) has found use in applications ranging from remote procedure calls to the systematic storage, exchange, and retrieval of data. However, because of its versatility, XML is vulnerable to a wide spectrum of attacks. One such attack is called XML injection.
A user who has the ability to provide structured XML as input can override the contents of an XML document by injecting XML tags in data fields. These tags are interpreted and classified by an XML parser as executable content and, as a result, may cause certain data members to be overridden.
Consider the following XML code snippet from an online store application, designed primarily to query a back-end database. The user has the ability to specify the quantity of an item available for purchase.
<item> <description>Widget</description> <price>500.0</price> <quantity>1</quantity> </item>
A malicious user might input the following string instead of a simple number in the quantity
field.
1</quantity><price>1.0</price><quantity>1
Consequently, the XML resolves to the following block:
<item> <description>Widget</description> <price>500.0</price> <quantity>1</quantity><price>1.0</price><quantity>1</quantity> </item>
A Simple API for XML (SAX) parser (org.xml.sax
and javax.xml.parsers.SAXParser
) interprets the XML such that the second price field overrides the first, leaving the price of the item as $1. Even when it is not possible to perform such an attack, the attacker may be able to inject special characters, such as comment blocks and CDATA
delimiters, which corrupt the meaning of the XML.
Noncompliant Code Example
In this noncompliant code example, a client method uses simple string concatenation to build an XML query to send to a server. XML injection is possible because the method performs no input validation.
import java.io.BufferedOutputStream; import java.io.ByteArrayOutputStream; import java.io.IOException; public class OnlineStore { private static void createXMLStreamBad(final BufferedOutputStream outStream, final String quantity) throws IOException { String xmlString = "<item>\n<description>Widget</description>\n" + "<price>500</price>\n" + "<quantity>" + quantity + "</quantity></item>"; outStream.write(xmlString.getBytes()); outStream.flush(); } }
Compliant Solution (Input Validation)
Depending on the specific data and command interpreter or parser to which data is being sent, appropriate methods must be used to sanitize untrusted user input. This compliant solution validates that quantity
is an unsigned integer.
import java.io.BufferedOutputStream; import java.io.ByteArrayOutputStream; import java.io.IOException; public class OnlineStore { private static void createXMLStream(final BufferedOutputStream outStream, final String quantity) throws IOException, NumberFormatException { // Write XML string only if quantity is an unsigned integer (count). int count = Integer.parseUnsignedInt(quantity); String xmlString = "<item>\n<description>Widget</description>\n" + "<price>500</price>\n" + "<quantity>" + count + "</quantity></item>"; outStream.write(xmlString.getBytes()); outStream.flush(); } }
Compliant Solution (XML Schema)
A more general mechanism for checking XML for attempted injection is to validate it using a Document Type Definition (DTD) or schema. The schema must be rigidly defined to prevent injections from being mistaken for valid XML. Here is a suitable schema for validating our XML snippet:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="item"> <xs:complexType> <xs:sequence> <xs:element name="description" type="xs:string"/> <xs:element name="price" type="xs:decimal"/> <xs:element name="quantity" type="xs:integer"/> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
The schema is available as the file schema.xsd
. This compliant solution employs this schema to prevent XML injection from succeeding. It also relies on the CustomResolver
class to prevent XXE attacks. This class, as well as XXE attacks, are described in the subsequent code examples.
private void createXMLStream(BufferedOutputStream outStream, String quantity) throws IOException { String xmlString; xmlString = "<item>\n<description>Widget</description>\n" + "<price>500.0</price>\n" + "<quantity>" + quantity + "</quantity></item>"; InputSource xmlStream = new InputSource( new StringReader(xmlString) ); // Build a validating SAX parser using our schema SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI); DefaultHandler defHandler = new DefaultHandler() { public void warning(SAXParseException s) throws SAXParseException {throw s;} public void error(SAXParseException s) throws SAXParseException {throw s;} public void fatalError(SAXParseException s) throws SAXParseException {throw s;} }; StreamSource ss = new StreamSource(new File("schema.xsd")); try { Schema schema = sf.newSchema(ss); SAXParserFactory spf = SAXParserFactory.newInstance(); spf.setSchema(schema); SAXParser saxParser = spf.newSAXParser(); // To set the custom entity resolver, // an XML reader needs to be created XMLReader reader = saxParser.getXMLReader(); reader.setEntityResolver(new CustomResolver()); saxParser.parse(xmlStream, defHandler); } catch (ParserConfigurationException x) { throw new IOException("Unable to validate XML", x); } catch (SAXException x) { throw new IOException("Invalid quantity", x); } // Our XML is valid, proceed outStream.write(xmlString.getBytes()); outStream.flush(); }
Using a schema or DTD to validate XML is convenient when receiving XML that may have been loaded with unsanitized input. If such an XML string has not yet been built, sanitizing input before constructing XML yields better performance.
Risk Assessment
Failure to sanitize user input before processing or storing it can result in injection attacks.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
IDS16-J | High | Probable | Medium | P12 | L1 |
Automated Detection
Tool | Version | Checker | Description |
---|---|---|---|
Fortify | 1.0 | Missing_XML_Validation | Implemented |
Related Vulnerabilities
CVE-2008-2370 describes a vulnerability in Apache Tomcat 4.1.0 through 4.1.37, 5.5.0 through 5.5.26, and 6.0.0 through 6.0.16. When a RequestDispatcher
is used, Tomcat performs path normalization before removing the query string from the URI, which allows remote attackers to conduct directory traversal attacks and read arbitrary files via a .. (dot dot) in a request parameter.
Related Guidelines
CERT Perl Secure Coding Standard | IDS33-PL. Sanitize untrusted data passed across a trust boundary |
Injection [RST] | |
CWE-116, Improper encoding or escaping of output |
Bibliography
A Guide to Building Secure Web Applications and Web Services | |
[W3C 2008] | 4.4.3, "Included If Validating" |