Many classes allow inclusion of escape sequences in character and string literals; examples include Pattern as well as classes that support XML- and SQL-based actions by passing String arguments to methods. According to the Java Language Specification [[JLS 2005]], Section 3.10.6, "Escape Sequences for Character and String Literals":
The character and string escape sequences allow for the representation of some nongraphic characters as well as the single quote, double quote, and backslash characters in character literals (§3.10.4) and string literals (§3.10.5).
Correct use of escape sequences in String literals depends on a correct understanding of how the escape sequences are interpreted. SQL statements written in Java, for example, sometimes require certain escape characters or sequences (e.g., sequences containing \t, \n, \r). When representing SQL queries in Java String form, every escape sequence intended for the SQL engine must be preceded by an extra backslash so that the Java compiler passes it through unchanged.
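As a minimal sketch of the SQL case, assume a hypothetical products table with a code column and a database that follows the standard LIKE ... ESCAPE syntax; none of these appear in the guideline itself. The query text needs single backslashes, so each one is written as a doubled backslash in the Java literal. The class and method names are illustrative only.

```java
// Sketch only: table name, column name, and SQL dialect are assumptions.
class SqlEscapeSketch {
    // Text the database should receive:
    //   SELECT name FROM products WHERE code LIKE '100\%' ESCAPE '\'
    // Each backslash meant for the database is doubled in the Java source.
    static String discountQuery() {
        return "SELECT name FROM products WHERE code LIKE '100\\%' ESCAPE '\\'";
    }

    public static void main(String[] args) {
        // Prints the query as the database would see it, with single backslashes.
        System.out.println(discountQuery());
    }
}
```

Printing the string is a quick way to check what actually leaves the Java program before it reaches the database.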
As another example, consider the Pattern class used in performing regular-expression-related tasks. A String literal used for pattern matching is compiled into an instance of the Pattern type. When the pattern to be matched contains a sequence of characters that is identical to one of the Java escape sequences ('\' 'n', for example), the Java compiler treats that portion of the string as a Java escape sequence and transforms it into a newline character. To prevent this, the programmer must precede the "\n" sequence with an additional backslash. The string constructed from the resulting literal "\\n" contains the two-character sequence '\' 'n', which is passed intact to Pattern and interpreted there as the regular expression escape for a newline.
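The difference between the two literals can be checked directly. The following sketch (class and constant names are illustrative, not part of the guideline) compares what the compiler stores for each spelling and confirms that Pattern interprets the two-character form itself.

```java
import java.util.regex.Pattern;

// Sketch: compares the single-backslash and double-backslash spellings.
class EscapeLengthSketch {
    static final String COMPILER_ESCAPE = "\n";   // one character: a line feed
    static final String REGEX_ESCAPE = "\\n";     // two characters: backslash, then n

    public static void main(String[] args) {
        System.out.println(COMPILER_ESCAPE.length());   // 1
        System.out.println(REGEX_ESCAPE.length());      // 2
        // Pattern receives the two-character text and resolves the escape itself.
        System.out.println(Pattern.compile(REGEX_ESCAPE).matcher("\n").matches());   // true
    }
}
```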
In general, for a particular escape character of the form '\X', the equivalent Java representation is:
"\\X"
Noncompliant Code Example
This noncompliant code example defines a method splitWords() that splits the input sequence around matches of a regular expression held in a String literal. The programmer believes that String literals can be used as is for regular expression patterns and consequently initializes the string WORDS to "\b", expecting the literal to hold the regular expression construct for matching a word boundary. However, the Java compiler treats "\b" as a Java escape sequence, and the string WORDS silently compiles to a backspace character.
import java.util.regex.Pattern;

public class BadSplitter {
  // The intent was to split on word boundaries, but "\b" is a backspace
  private final String WORDS = "\b";

  public String[] splitWords(String input) {
    Pattern p = Pattern.compile(WORDS);
    String[] input_array = p.split(input);
    return input_array;
  }
}
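The effect of the mistake can be observed with a small, hypothetical driver. This sketch assumes nothing beyond java.util.regex.Pattern; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Sketch: shows what a pattern built from the unescaped literal actually matches.
class BackspaceSplitSketch {
    static String[] splitOnBackspace(String input) {
        // The literal below is a single backspace character (U+0008), not a word boundary.
        return Pattern.compile("\b").split(input);
    }

    public static void main(String[] args) {
        // No backspace in the input, so nothing is split.
        System.out.println(splitOnBackspace("Hello, world").length);   // 1
        // The pattern does split on a literal backspace.
        System.out.println(splitOnBackspace("Hello\bworld").length);   // 2
    }
}
```

Because the pattern compiles without error, the defect shows up only as unexpected results at run time.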
Compliant Solution
This compliant solution shows the correctly escaped value of the String literal WORDS, which results in a regular expression designed to split on word boundaries.
import java.util.regex.Pattern;

public class GoodSplitter {
  // Allows splitting on word boundaries
  private final String WORDS = "\\b";

  public String[] split(String input) {
    Pattern p = Pattern.compile(WORDS);
    String[] input_array = p.split(input);
    return input_array;
  }
}
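A short, hypothetical driver illustrates the corrected behavior. The class and method names below are assumptions made for this sketch.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Sketch: splits a sample input with the correctly escaped word-boundary pattern.
class WordBoundarySplitSketch {
    static String[] splitOnBoundaries(String input) {
        // The doubled backslash reaches Pattern as the word-boundary construct.
        return Pattern.compile("\\b").split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnBoundaries("Hello, world")));
    }
}
```

On Java 8 and later, this call should yield the pieces "Hello", ", " and "world"; earlier releases may also return a leading empty string, because the handling of a zero-width match at the start of the input changed in Java 8.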
Risk Assessment
Incorrect use of escape characters in String literals can result in misinterpretation and potential corruption of data.
Guideline | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
IDS17-J | low | unlikely | high | P1 | L3 |
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this guideline on the CERT website.
Bibliography
[[API 2006]] Class Pattern "Backslashes, escapes, and quoting"
[[API 2006]] Package java.sql
[[JLS 2005]] 3.10.6 Escape Sequences for Character and String Literals