Many classes allow inclusion of escape sequences in character and string literals; examples include java.util.regex.Pattern as well as classes that support XML- and SQL-based actions by passing string arguments to methods. According to the Java Language Specification (JLS), §3.10.6, "Escape Sequences for Character and String Literals" [JLS 2013]:
The character and string escape sequences allow for the representation of some nongraphic characters as well as the single quote, double quote, and backslash characters in character literals (§3.10.4) and string literals (§3.10.5).
...
In general, for a particular escape character of the form \X, the equivalent Java representation is

```
\\X
```
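As a minimal sketch of this rule, the regular expression escape \d (a digit) has to be written as "\\d" in Java source so that the compiled string holds the two characters \ and d. The class name EscapeRule and its members are hypothetical names chosen for this illustration.

```java
import java.util.regex.Pattern;

class EscapeRule {
    // The regex escape \d, written in Java source following the \X -> \\X rule.
    // Writing "\d" instead would not compile, because \d is not a Java escape.
    static final String DIGIT = "\\d";

    // Returns true if the whole input is a single digit.
    static boolean isDigit(String s) {
        return Pattern.matches(DIGIT, s);
    }

    public static void main(String[] args) {
        System.out.println(DIGIT.length()); // 2: the characters '\' and 'd'
        System.out.println(isDigit("7"));   // true
        System.out.println(isDigit("x"));   // false
    }
}
```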
Noncompliant Code Example (String Literal)
This noncompliant code example defines a method, splitWords(), that splits the input sequence around matches of the regular expression held in the string literal WORDS. It is expected that WORDS would hold the escape sequence for matching a word boundary. However, the Java compiler treats the "\b" literal as a Java escape sequence, and the string WORDS silently compiles to a regular expression that checks for a single backspace character.
```java
import java.util.regex.Pattern;

public class Splitter {
  // Interpreted as backspace
  // Fails to split on word boundaries
  private final String WORDS = "\b";

  public String[] splitWords(String input) {
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}
```
Compliant Solution (String Literal)
This compliant solution shows the correctly escaped value of the string literal WORDS that results in a regular expression designed to split on word boundaries:
```java
import java.util.regex.Pattern;

public class Splitter {
  // Interpreted as two chars, '\' and 'b'
  // Correctly splits on word boundaries
  private final String WORDS = "\\b";

  public String[] splitWords(String input) {
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}
```
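The difference between the two literals can be observed by running both patterns over the same input. The following is a small sketch; the class SplitComparison and its helper splitOn() are hypothetical names introduced only for this comparison.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitComparison {
    // Splits the input around matches of the given regular expression.
    static String[] splitOn(String regex, String input) {
        return Pattern.compile(regex).split(input);
    }

    public static void main(String[] args) {
        String input = "hello world";

        // "\b" is a single backspace character. The input contains none,
        // so the whole input comes back as one element.
        System.out.println(Arrays.toString(splitOn("\b", input)));

        // "\\b" reaches the regex engine as \b, a word boundary,
        // so the input is broken up at the edges of each word.
        System.out.println(Arrays.toString(splitOn("\\b", input)));
    }
}
```

An input that does contain a backspace, such as "a\bb", would be split by the first pattern, which shows that the noncompliant literal is a valid regular expression that simply matches the wrong thing.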
Noncompliant Code Example (String Property)
This noncompliant code example uses the same method, splitWords(). This time the WORDS string is loaded from an external properties file.
...
This is read by the Properties.load() method as a single character b, which causes the split() method to split strings along the letter b. Although the string is interpreted differently than if it were a string literal, as in the previous noncompliant code example, the interpretation is incorrect.
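The behavior of Properties.load() described here can be checked without a real file. The sketch below assumes the property is written with a single backslash, as WORDS=\b, and feeds that text through a StringReader in place of the external file. PropertyEscapeDemo and loadWords() are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertyEscapeDemo {
    // Loads the given text as if it were the contents of a properties file
    // and returns the value of the WORDS property.
    static String loadWords(String fileContents) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("WORDS");
    }

    public static void main(String[] args) {
        // Assumed file text: WORDS=\b
        // (the backslash is doubled here only because this is a Java literal)
        String value = loadWords("WORDS=\\b");
        System.out.println(value.length()); // 1
        System.out.println(value);          // b
    }
}
```

Properties.load() silently drops a backslash that precedes a character it does not recognize as an escape, so no error is reported when the value is loaded.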
Compliant Solution (String Property)
This compliant solution shows the correctly escaped value of the WORDS property:
```
WORDS=\\b
```
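To see the escaped property working end to end, the following sketch loads the compliant text from a string rather than from an external file and then splits an input with the resulting pattern. PropertySplitter, its constructor argument, and regex() are hypothetical names used for illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Properties;
import java.util.regex.Pattern;

class PropertySplitter {
    private final Pattern pattern;

    // propertiesText stands in for the contents of the external properties file.
    PropertySplitter(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pattern = Pattern.compile(props.getProperty("WORDS"));
    }

    // Returns the regular expression as it was loaded from the property.
    String regex() {
        return pattern.pattern();
    }

    String[] splitWords(String input) {
        return pattern.split(input);
    }

    public static void main(String[] args) {
        // File text is WORDS=\\b; each backslash is doubled again in this Java literal.
        PropertySplitter splitter = new PropertySplitter("WORDS=\\\\b");
        System.out.println(splitter.regex().length()); // 2: the characters '\' and 'b'
        System.out.println(Arrays.toString(splitter.splitWords("hello world")));
    }
}
```

Note that the same value needs two backslashes in the properties file but four when the file text itself is embedded in a Java string literal, because each layer consumes one level of escaping.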
Applicability
Incorrect use of escape characters in string inputs can result in misinterpretation and potential corruption of data.
Bibliography
[API 2013] | Class Pattern, "Backslashes, Escapes, and Quoting"; Package java.sql |
[JLS 2013] | §3.10.6, "Escape Sequences for Character and String Literals" |
...