Many classes allow inclusion of escape sequences in character and string literals; examples include java.util.regex.Pattern as well as classes that support XML- and SQL-based actions by passing string arguments to methods. According to the Java Language Specification (JLS), §3.10.6, "Escape Sequences for Character and String Literals" [JLS 2013]:
The character and string escape sequences allow for the representation of some nongraphic characters as well as the single quote, double quote, and backslash characters in character literals (§3.10.4) and string literals (§3.10.5).
Correct use of escape sequences in string literals requires understanding how the escape sequences are interpreted by the Java compiler as well as how they are interpreted by any subsequent processor, such as a SQL engine. SQL statements may require escape sequences (for example, sequences containing \t, \n, \r) in certain cases, such as when storing raw text in a database. When representing SQL statements in Java string literals, each escape sequence must be preceded by an extra backslash for correct interpretation.
As another example, consider the Pattern class used in performing regular expression-related tasks. A string literal used for pattern matching is compiled into an instance of the Pattern type. When the pattern to be matched contains a sequence of characters identical to one of the Java escape sequences ("\" and "n", for example), the Java compiler treats that portion of the string as a Java escape sequence and transforms the sequence into an actual newline character. To insert the newline escape sequence, rather than a literal newline character, the programmer must precede the "\n" sequence with an additional backslash to prevent the Java compiler from replacing it with a newline character. The string constructed from the resulting sequence,

    "\\n"

consequently contains the correct two-character sequence '\' 'n' and correctly denotes the escape sequence for newline in the pattern.
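The difference between the two literals can be checked by inspecting the strings the compiler produces. The following is a minimal sketch; the class name EscapeLengths, its constants, and the helper matchesLineFeed() are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

public class EscapeLengths {
  // "\n" is compiled to one character: a line feed
  static final String NEWLINE_CHAR = "\n";
  // "\\n" is compiled to two characters: '\' followed by 'n'
  static final String NEWLINE_ESCAPE = "\\n";

  // Returns true if the regular expression in 'regex' matches a single line feed
  static boolean matchesLineFeed(String regex) {
    return Pattern.compile(regex).matcher("\n").matches();
  }

  public static void main(String[] args) {
    System.out.println(NEWLINE_CHAR.length());             // one character
    System.out.println(NEWLINE_ESCAPE.length());           // two characters
    System.out.println(NEWLINE_ESCAPE.charAt(0) == '\\');  // first one is a backslash
    System.out.println(matchesLineFeed(NEWLINE_ESCAPE));   // regex \n matches a line feed
  }
}
```

In this sketch the two-character string reaches the regular expression engine as the escape sequence \n, which the engine itself interprets as a line feed.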
In general, for a particular escape character of the form \X, the equivalent Java representation is

    "\\X"
Noncompliant Code Example (String Literal)
This noncompliant code example defines a method, splitWords(), that finds matches between the string literal (WORDS) and the input sequence. It is expected that WORDS would hold the escape sequence for matching a word boundary. However, the Java compiler treats the "\b" literal as a Java escape sequence, and the string WORDS silently compiles to a regular expression that checks for a single backspace character.
    import java.util.regex.Pattern;

    public class Splitter {
      // Interpreted as backspace
      // Fails to split on word boundaries
      private final String WORDS = "\b";

      public String[] splitWords(String input) {
        Pattern pattern = Pattern.compile(WORDS);
        String[] input_array = pattern.split(input);
        return input_array;
      }
    }
Compliant Solution (String Literal)
This compliant solution shows the correctly escaped value of the string literal WORDS that results in a regular expression designed to split on word boundaries:
    import java.util.regex.Pattern;

    public class Splitter {
      // Interpreted as two chars, '\' and 'b'
      // Correctly splits on word boundaries
      private final String WORDS = "\\b";

      public String[] splitWords(String input) {
        Pattern pattern = Pattern.compile(WORDS);
        String[] input_array = pattern.split(input);
        return input_array;
      }
    }
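The effect of the two WORDS values can be compared side by side. The following is a minimal sketch, assuming Java 8 or later (earlier releases handle a zero-width match at the start of the input differently); the class SplitterComparison, the helper splitWith(), and the sample input are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitterComparison {
  // Splits 'input' using 'regex' as the delimiter pattern
  static String[] splitWith(String regex, String input) {
    return Pattern.compile(regex).split(input);
  }

  public static void main(String[] args) {
    String input = "Hello World";
    // "\b" is a backspace character; the input contains none, so nothing is split
    System.out.println(Arrays.toString(splitWith("\b", input)));
    // "\\b" reaches the regex engine as \b, the word-boundary construct
    System.out.println(Arrays.toString(splitWith("\\b", input)));
  }
}
```

With the backspace pattern the input comes back as a single element. With the word-boundary pattern it is split into "Hello", a single space, and "World".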
Noncompliant Code Example (String Property)
This noncompliant code example uses the same method, splitWords(). This time the WORDS string is loaded from an external properties file.
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;
    import java.util.regex.Pattern;

    public class Splitter {
      private final String WORDS;

      public Splitter() throws IOException {
        Properties properties = new Properties();
        properties.load(new FileInputStream("splitter.properties"));
        WORDS = properties.getProperty("WORDS");
      }

      public String[] splitWords(String input) {
        Pattern pattern = Pattern.compile(WORDS);
        String[] input_array = pattern.split(input);
        return input_array;
      }
    }
...
In the properties file, the WORDS property is once again incorrectly specified as \b.
    WORDS=\b
This is read by the Properties.load() method as a single character b, which causes the split() method to split strings along the letter b. Although the string is interpreted differently than if it were a string literal, as in the previous noncompliant code example, the interpretation is incorrect.
Compliant Solution (String Property)
This compliant solution shows the correctly escaped value of the WORDS property:
    WORDS=\\b
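How Properties.load() treats the two property values can be observed without a file on disk. The following is a minimal sketch in which a StringReader stands in for splitter.properties; the class PropertyEscapes and the helper loadWords() are hypothetical. Note that the properties text is itself written as Java string literals here, so each backslash in the file content is doubled once more in the source.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropertyEscapes {
  // Parses 'fileText' as properties-file content and returns the WORDS value
  static String loadWords(String fileText) {
    Properties properties = new Properties();
    try {
      properties.load(new StringReader(fileText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return properties.getProperty("WORDS");
  }

  public static void main(String[] args) {
    // File content WORDS=\b : the backslash is dropped, leaving the letter b
    System.out.println(loadWords("WORDS=\\b"));
    // File content WORDS=\\b : yields the two characters '\' and 'b'
    System.out.println(loadWords("WORDS=\\\\b"));
  }
}
```

The second value is the two-character sequence that Pattern.compile() interprets as a word boundary.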
Applicability
Incorrect use of escape characters in string inputs can result in misinterpretation and potential corruption of data.
...
Risk Assessment
Guideline | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
IDS17-J | low | unlikely | high | P1 | L3 |
Automated Detection
Tool | Version | Checker | Description |
---|---|---|---|
The Checker Framework | | Tainting Checker | Trust and security errors (see Chapter 8) |
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this guideline on the CERT website.
Bibliography
[API 2013] | Class Pattern, "Backslashes, Escapes, and Quoting"; Package java.sql |
[JLS 2013] | §3.10.6, "Escape Sequences for Character and String Literals" |