Input sanitization is the elimination of unwanted characters from input by removing, replacing, encoding, or escaping them. Input must be sanitized both because an application may be unprepared to handle malformed input and because unsanitized input may conceal an attack vector.
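As an illustration of the escaping approach, the following minimal sketch replaces the five XML-special characters with their predefined entity references. The class name XmlEscaper and the method escapeXml are hypothetical and are not part of this guideline; the sketch also assumes the input contains no characters that are illegal in XML (such as most control characters), which a production sanitizer would have to handle separately.

```java
/** Hypothetical helper (not part of the guideline): escapes XML-special characters. */
final class XmlEscaper {
    private XmlEscaper() {}

    /** Returns a copy of input with &, <, >, " and ' replaced by XML entity references. */
    static String escapeXml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: &lt;script&gt;alert(&apos;XSS&apos;)&lt;/script&gt;
        System.out.println(escapeXml("<script>alert('XSS')</script>"));
    }
}
```

Escaping preserves the user's text while neutralizing markup; removal or rejection may be preferable when the field has no legitimate need for such characters.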
Noncompliant Code Example
This noncompliant code example uses a user-generated string, xmlString, which will be parsed by an XML parser; see guideline IDS08-J. Prevent XML Injection. The description node is a String, as defined by the XML schema. Consequently, it accepts all valid characters, including CDATA tags.
    xmlString = "<item>\n"
        + "<description><![CDATA[<]]>script<![CDATA[>]]> alert('XSS')<![CDATA[<]]>/script<![CDATA[>]]></description>\n"
        + "<price>500.0</price>\n"
        + "<quantity>1</quantity>\n"
        + "</item>";
This is insecure because an attacker may be able to inject an executable script into the XML representation, disguised using CDATA tags. When the document is processed, the XML parser removes the CDATA delimiters, yielding the executable script. This can result in a cross-site scripting (XSS) vulnerability if the text in the nodes is displayed back to the user.
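The effect of CDATA removal can be observed with the standard JAXP DOM parser. The following is a minimal sketch, not the guideline's own code: the class name CdataDemo and its members are hypothetical, and it assumes the default DocumentBuilderFactory implementation shipped with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo class: shows what the parser returns for CDATA-disguised input. */
final class CdataDemo {
    // The same document as in the noncompliant example.
    static final String SAMPLE = "<item>\n"
        + "<description><![CDATA[<]]>script<![CDATA[>]]> alert('XSS')<![CDATA[<]]>/script<![CDATA[>]]></description>\n"
        + "<price>500.0</price>\n"
        + "<quantity>1</quantity>\n"
        + "</item>";

    /** Parses xml and returns the text content of the first description element. */
    static String descriptionText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // getTextContent concatenates text and CDATA children, dropping the CDATA delimiters.
            return doc.getElementsByTagName("description").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints: <script> alert('XSS')</script>
        System.out.println(descriptionText(SAMPLE));
    }
}
```

If the returned text is written into an HTML page without output encoding, a browser would treat it as a script element.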
Similarly, if the XML tree is constructed on the server side from client inputs, comments of the form <!-- --> may be maliciously inserted in an attempt to override the server-side inputs. For instance, if the user can enter input into the description and quantity fields, it may be possible to override the price field set by the server. This can be achieved by entering the string "<!-- description" in the description field and the string "--></description> <price>100.0</price><quantity>1" in the quantity field (without the surrounding quotation marks in each case). The equivalent XML representation is:
    xmlString = "<item>\n"
        + "<description><!-- description</description>\n"
        + "<price>500.0</price>\n"
        + "<quantity>--></description> <price>100.0</price> <quantity>1</quantity>\n"
        + "</item>";
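Because the injected comment swallows the server-supplied price element, the parser sees only the attacker's price element. The user can thus override the price field, changing it from 500.0 to an arbitrary value (100.0 in this case).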
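The override can be reproduced with a short sketch that builds the item by string concatenation and then parses it. The class CommentInjectionDemo, its methods, and the fixed server-side price are assumptions made for illustration; they mirror the scenario described above rather than any particular application.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo class: unsanitized fields let a comment hide the server's price. */
final class CommentInjectionDemo {
    /** Builds the item XML by naive concatenation; the price is set by the server. */
    static String buildItem(String description, String quantity) {
        return "<item>\n"
            + "<description>" + description + "</description>\n"
            + "<price>500.0</price>\n"
            + "<quantity>" + quantity + "</quantity>\n"
            + "</item>";
    }

    /** Parses xml and returns the text of the first price element the parser sees. */
    static String parsedPrice(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("price").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = buildItem("<!-- description",
                               "--></description> <price>100.0</price><quantity>1");
        // Prints: 100.0 (the server's 500.0 is inside the comment)
        System.out.println(parsedPrice(xml));
    }
}
```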
Compliant Solution
This compliant solution creates a whitelist of permitted characters. It allows only word characters (letters, digits, and underscores) in the description node, consequently eliminating the possibility of injecting < and > characters.
    if (!xmlString.matches("[\\w]*")) {
        // String does not match whitelisted characters
        throw new IllegalArgumentException();
    }
    // Use the xmlString
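In practice, a whitelist check of this kind is typically applied to each user-supplied field before the field is placed into the XML document. The following sketch shows one way to do that, under stated assumptions: the class ItemXmlBuilder, the fixed server-side price, and the quantity pattern (one to nine decimal digits) are hypothetical choices, not requirements of this guideline.

```java
import java.util.regex.Pattern;

/** Hypothetical builder: validates each user field against a whitelist before use. */
final class ItemXmlBuilder {
    // Word characters only (ASCII letters, digits, underscore), as in the check above.
    private static final Pattern DESCRIPTION = Pattern.compile("\\w*");
    // Assumed format for quantity: 1 to 9 decimal digits.
    private static final Pattern QUANTITY = Pattern.compile("[0-9]{1,9}");

    private ItemXmlBuilder() {}

    /** Throws IllegalArgumentException unless value is non-null and fully matches allowed. */
    private static void requireMatch(Pattern allowed, String value, String fieldName) {
        if (value == null || !allowed.matcher(value).matches()) {
            throw new IllegalArgumentException("Invalid characters in " + fieldName);
        }
    }

    /** Builds the item XML only after both user-supplied fields pass validation. */
    static String buildItem(String description, String quantity) {
        requireMatch(DESCRIPTION, description, "description");
        requireMatch(QUANTITY, quantity, "quantity");
        return "<item>\n"
            + "<description>" + description + "</description>\n"
            + "<price>500.0</price>\n"
            + "<quantity>" + quantity + "</quantity>\n"
            + "</item>";
    }

    public static void main(String[] args) {
        System.out.println(buildItem("Widget_1", "2"));
        try {
            buildItem("<!-- description", "1");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Rejecting invalid input outright, rather than attempting to repair it, keeps the check simple; a description field that must accept punctuation would need a broader whitelist or output escaping instead.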
Risk Assessment
Failure to sanitize user input before processing or storing it can lead to injection of arbitrary executable content.
Guideline | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
IDS01-J | high | probable | medium | P12 | L1 |
Related Vulnerabilities
CVE-2008-2370 describes a vulnerability in Apache Tomcat 4.1.0 through 4.1.37, 5.5.0 through 5.5.26, and 6.0.0 through 6.0.16. When a RequestDispatcher is used, Tomcat performs path normalization before removing the query string from the URI, which allows remote attackers to conduct directory traversal attacks and read arbitrary files via a .. (dot dot) in a request parameter.
Search for other vulnerabilities resulting from the violation of this guideline on the CERT website.