...
Noncompliant Code Example (toUpperCase())
Many web apps, such as forum or blogging software, must accept HTML as input and present it as output. Displaying untrusted HTML can subject a web app to XSS (cross-site scripting) or HTML injection vulnerabilities. Therefore, it is vital that HTML be sanitized before sending it to a web browser.
One common step in sanitization is identifying tags that may contain malicious content. The <SCRIPT> tag is one such tag; it typically contains JavaScript code that is executed by a client's browser. Therefore, sanitizing HTML involves identifying <SCRIPT> tags and either converting them to something harmless or deleting them altogether. However, identifying <SCRIPT> tags is not as simple as it appears.
In HTML, tag names are case-insensitive and can therefore be specified in uppercase, lowercase, or any mixture of cases. This noncompliant code example uses the locale-dependent String.toUpperCase() method to convert an HTML tag name to uppercase so that it can be checked before further processing. The code must skip <SCRIPT> tags, because they indicate code that is to be discarded. In an English locale, "script" is converted to "SCRIPT". In the Turkish locale, however, "script" is converted to "SCRİPT" (with a dotted capital İ, U+0130), so the check fails to prune scripts from further processing.
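The locale sensitivity described here can be reproduced with a small sketch. The class name TagCheckDemo and the helper isScriptTag are hypothetical names introduced only for illustration. The locale is passed explicitly to String.toUpperCase(Locale) so that the result does not depend on the machine's default locale; the no-argument String.toUpperCase() behaves the same way using Locale.getDefault().

```java
import java.util.Locale;

// Hypothetical demo class; not part of the guideline's own example.
class TagCheckDemo {

    // Mirrors the flawed check: uppercase the tag name under the given
    // locale, then compare it with the ASCII string "SCRIPT".
    static boolean isScriptTag(String tag, Locale locale) {
        return tag.toUpperCase(locale).equals("SCRIPT");
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // English casing rules map 'i' to 'I', so the tag is recognized.
        System.out.println(isScriptTag("script", Locale.ENGLISH)); // prints true

        // Turkish casing rules map 'i' to U+0130, so the comparison fails
        // and the script tag would slip through.
        System.out.println(isScriptTag("script", turkish)); // prints false
    }
}
```

Under these assumptions, the same input is classified differently depending only on the locale in effect, which is the failure this noncompliant example illustrates.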
...