...
In general, for a particular escape character of the form \X, the equivalent Java representation is

```
\\X
```
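As a quick check of this rule, the following minimal sketch uses the regex escape `\d` (a digit), written in Java source as `"\\d"`. The class name `EscapeRuleDemo`, the constant `DIGIT`, and the helper `isSingleDigit()` are hypothetical names chosen for illustration only.

```java
import java.util.regex.Pattern;

final class EscapeRuleDemo {
    // The regex escape \d, written as the Java string literal "\\d".
    static final String DIGIT = "\\d";

    // Returns true if the whole input is exactly one decimal digit.
    static boolean isSingleDigit(String input) {
        return Pattern.matches(DIGIT, input);
    }

    public static void main(String[] args) {
        System.out.println(DIGIT.length());      // 2: a backslash followed by 'd'
        System.out.println(isSingleDigit("7"));  // true
        System.out.println(isSingleDigit("d"));  // false
    }
}
```

The string holds two characters at run time; the regex engine, not the Java compiler, then interprets the backslash.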
Noncompliant Code Example (String Literal)
This noncompliant code example defines a method splitWords() that splits the input sequence on matches of the regular expression held in the string WORDS. WORDS is expected to hold the escape sequence for matching a word boundary. However, the Java compiler treats the "\b" literal as a Java escape sequence, so WORDS silently compiles to a string containing a single backspace character.
```java
import java.util.regex.Pattern;

public class Splitter {
  // Interpreted as backspace; fails to split on word boundaries
  private final String WORDS = "\b";

  public String[] splitWords(String input) {
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}
```
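The effect can be observed directly. The sketch below is illustrative only (the class name `BackspaceDemo` and the sample input are assumptions, not part of the original example); it inspects the compiled string and shows that no split takes place.

```java
import java.util.regex.Pattern;

final class BackspaceDemo {
    // Java escape sequence: a single backspace character (U+0008).
    static final String WORDS = "\b";

    static String[] splitWords(String input) {
        return Pattern.compile(WORDS).split(input);
    }

    public static void main(String[] args) {
        System.out.println(WORDS.length());                    // 1
        System.out.println((int) WORDS.charAt(0));             // 8 (backspace)
        // The input contains no backspace, so it comes back as one element.
        System.out.println(splitWords("Hello world").length);  // 1
    }
}
```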
Compliant Solution (String Literal)
This compliant solution shows the correctly escaped value of the string literal WORDS, which results in a regular expression designed to split on word boundaries.
```java
import java.util.regex.Pattern;

public class Splitter {
  // Interpreted as two chars, '\' and 'b'; correctly splits on word boundaries
  private final String WORDS = "\\b";

  public String[] splitWords(String input) {
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}
```
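A small usage sketch, assuming a hypothetical class `WordBoundaryDemo` and sample input, shows the escaped literal splitting on word boundaries. The exact handling of a zero-width match at the start of the input has varied between Java releases, so the sketch checks only that the words come out as separate elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class WordBoundaryDemo {
    // Two characters at run time: a backslash and 'b' (regex word boundary).
    static final String WORDS = "\\b";

    static String[] splitWords(String input) {
        return Pattern.compile(WORDS).split(input);
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList(splitWords("Hello world"));
        System.out.println(WORDS.length());           // 2
        System.out.println(parts.contains("Hello"));  // true
        System.out.println(parts.contains("world"));  // true
    }
}
```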
Noncompliant Code Example (String Property)
This noncompliant code example uses the same method splitWords(). This time the WORDS string is loaded from an external properties file.
...
In the properties file, the WORDS property is once again incorrectly specified as \b. The Properties.load() method reads this as the single character b, which causes the split() method to split strings along the letter b. Although the string is interpreted differently than it would be as a string literal, as in the previous noncompliant code example, it is still interpreted incorrectly.
```properties
WORDS: \b
```
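The behavior of Properties.load() can be reproduced without a file. The following is a sketch under stated assumptions: the class `PropertyEscapeDemo` and helper `loadWords()` are hypothetical, and a `StringReader` stands in for the external properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PropertyEscapeDemo {
    // Parses text in properties-file syntax and returns the WORDS value.
    static String loadWords(String fileContents) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("WORDS");
    }

    public static void main(String[] args) {
        // The Java literal "WORDS: \\b" is the file text:  WORDS: \b
        String words = loadWords("WORDS: \\b");
        System.out.println(words);                        // b
        // Splitting on the letter b rather than on word boundaries.
        System.out.println("cabin".split(words).length);  // 2 ("ca" and "in")
    }
}
```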
Compliant Solution (String Property)
This compliant solution shows the correctly escaped value of the WORDS property.
```properties
WORDS: \\b
```
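With the doubled backslash, the loaded value is the two-character regular expression for a word boundary. The sketch below assumes a hypothetical class `PropertyBoundaryDemo` and reads the property text from a `StringReader` in place of a real file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.regex.Pattern;

final class PropertyBoundaryDemo {
    // Loads WORDS from properties-file text and splits the input with it.
    static String[] splitWords(String fileContents, String input) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Pattern.compile(props.getProperty("WORDS")).split(input);
    }

    public static void main(String[] args) {
        // The Java literal "WORDS: \\\\b" is the file text:  WORDS: \\b
        List<String> parts =
            Arrays.asList(splitWords("WORDS: \\\\b", "Hello world"));
        System.out.println(parts.contains("Hello"));  // true
        System.out.println(parts.contains("world"));  // true
    }
}
```

Note that the Java literal in this sketch needs four backslashes only because the file text is embedded in source code; the properties file itself contains two.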
Applicability
Incorrect use of escape characters in string inputs can result in misinterpretation and potential corruption of data.
Bibliography
[API 2011] Class Pattern "Backslashes, escapes, and quoting"
[API 2011] Package java.sql
[JLS 2011] §3.10.6, Escape Sequences for Character and String Literals
...