Many classes allow inclusion of escape sequences in character and string literals; examples include java.util.regex.Pattern as well as classes that support XML- and SQL-based actions by passing string arguments to methods. According to the Java Language Specification [JLS 2011], Section 3.10.6, "Escape Sequences for Character and String Literals":

The character and string escape sequences allow for the representation of some nongraphic characters as well as the single quote, double quote, and backslash characters in character literals (§3.10.4) and string literals (§3.10.5).

Correct use of escape sequences in string literals requires understanding how the escape sequences are interpreted. SQL statements written in Java, for example, sometimes require certain escape characters or sequences (e.g., sequences containing \t, \n, \r). When representing SQL queries in Java string form, any escape sequence that must reach the SQL engine intact must be preceded by an extra backslash; otherwise the Java compiler interprets the sequence first.

As another example, consider the Pattern class used in performing regular expression-related tasks. A string literal used for pattern matching is compiled into an instance of the Pattern type. When the pattern to be matched contains a sequence of characters identical to one of the Java escape sequences ("\" and "n", for example), the Java compiler treats that portion of the string as a Java escape sequence and transforms the sequence into a newline character. To avoid inserting a newline character, the programmer must precede the "\n" sequence with an additional backslash to prevent the Java compiler from treating it as an escape sequence. The string constructed from the resulting sequence

Code Block
"\\n"

consequently contains the correct two-character sequence \n and correctly denotes a newline character in the pattern.

In general, for a particular escape character of the form \X, the equivalent Java representation is

Code Block
"\\X"
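The difference between the two levels of interpretation can be checked directly. The following is a minimal sketch, not part of the original guideline; the class name EscapeLengths and its matches() helper are hypothetical names introduced for illustration. It shows that "\n" is a single character by the time the program runs, whereas "\\n" is the two-character sequence that Pattern interprets for itself. Note that a sequence such as "\d" is not a valid Java escape at all and is rejected by the compiler, so "\\d" is the only way to write it.

```java
import java.util.regex.Pattern;

public class EscapeLengths {
  // Hypothetical helper: true if the entire input matches the regex.
  public static boolean matches(String regex, String input) {
    return Pattern.matches(regex, input);
  }

  public static void main(String[] args) {
    String javaNewline = "\n";   // one character (U+000A), produced by the Java compiler
    String regexNewline = "\\n"; // two characters, backslash and 'n', interpreted by Pattern

    System.out.println(javaNewline.length());  // 1
    System.out.println(regexNewline.length()); // 2

    // The two-character regex escape matches a real newline character.
    System.out.println(matches(regexNewline, "\n")); // true

    // "\\d+" reaches Pattern as \d+ and matches one or more digits.
    System.out.println(matches("\\d+", "2011")); // true
  }
}
```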

Noncompliant Code Example

This noncompliant code example defines a method, splitWords(), that uses the string literal WORDS as a regular expression to split the input sequence. The programmer believes that string literals can be used as is for regular expression patterns and consequently initializes the string WORDS to "\b", expecting that it will hold the escape sequence for matching a word boundary. However, the Java compiler treats the "\b" as a Java escape sequence, and the string WORDS silently compiles to a backspace character.

Code Block
import java.util.regex.Pattern;

public class BadSplitter {
  // The compiler turns "\b" into a single backspace character (U+0008)
  private final String WORDS = "\b"; // Fails to split on word boundaries

  public String[] splitWords(String input){
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}

...

Code Block
import java.util.regex.Pattern;

public class GoodSplitter {
  // "\\b" reaches Pattern as the two characters \b, the word-boundary matcher
  private final String WORDS = "\\b"; // Allows splitting on word boundaries

  public String[] splitWords(String input){
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}
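
The behavioral difference between the two literals can be observed with a short driver. This is an illustrative sketch only; the class SplitterDemo and its splitOn() helper are hypothetical and do not appear in the guideline. With "\b" the pattern is a literal backspace character, which does not occur in ordinary text, so the input comes back as a single element. With "\\b" the pattern is a word boundary, so the input is broken apart at the edges of each word (the exact handling of a zero-width match at the start of the input differs between Java 7 and later releases, so the sketch prints the arrays rather than asserting their full contents).

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitterDemo {
  // Hypothetical helper: compiles the given regex and splits the input with it.
  public static String[] splitOn(String regex, String input) {
    return Pattern.compile(regex).split(input);
  }

  public static void main(String[] args) {
    String input = "Hello world";

    // "\b" is one backspace character; nothing in the input matches it.
    String[] bad = splitOn("\b", input);
    System.out.println(bad.length + " " + Arrays.toString(bad));

    // "\\b" is the regex word-boundary construct; the input splits at word edges.
    String[] good = splitOn("\\b", input);
    System.out.println(good.length + " " + Arrays.toString(good));
  }
}
```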

...