Many classes allow inclusion of escape sequences in character and string literals; examples include java.util.regex.Pattern as well as classes that support XML- and SQL-based actions by passing string arguments to methods. According to the Java Language Specification (JLS), §3.10.6, "Escape Sequences for Character and String Literals" [JLS 2013],

The character and string escape sequences allow for the representation of some nongraphic characters as well as the single quote, double quote, and backslash characters in character literals (§3.10.4) and string literals (§3.10.5).

Correct use of escape sequences in string literals requires understanding how the escape sequences are interpreted by the Java compiler as well as how they are interpreted by any subsequent processor, such as a SQL engine. SQL statements may require escape sequences (for example, sequences containing \t, \n, or \r) in certain cases, such as when storing raw text in a database. When representing SQL statements in Java string literals, each escape sequence must be preceded by an extra backslash for correct interpretation.
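The two layers of interpretation can be observed without a database. The following is a minimal sketch, not a recommended way to build statements: the table name notes, the column name body, and the class and method names are all hypothetical, and how a particular SQL engine treats backslash sequences inside string constants depends on the product and its configuration. The sketch shows only which characters the Java compiler hands to the next processor.

```java
final class SqlEscapeDemo {
  // Hypothetical statement text; "notes" and "body" are made-up names.
  // The Java compiler replaces "\n" with one newline character before
  // any SQL engine sees the statement.
  static String compilerInterpreted() {
    return "INSERT INTO notes (body) VALUES ('line1\nline2')";
  }

  // The extra backslash makes the compiler emit the two characters
  // '\' and 'n', leaving the sequence for the SQL engine to interpret.
  static String passedThrough() {
    return "INSERT INTO notes (body) VALUES ('line1\\nline2')";
  }

  public static void main(String[] args) {
    // Does the statement text contain a real newline character?
    System.out.println(compilerInterpreted().indexOf('\n') >= 0); // true
    System.out.println(passedThrough().indexOf('\n') >= 0);       // false
    // The second string is exactly one character longer.
    System.out.println(passedThrough().length() - compilerInterpreted().length()); // 1
  }
}
```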

As another example, consider the Pattern class used in performing regular expression-related tasks. A string literal used for pattern matching is compiled into an instance of the Pattern type. When the pattern to be matched contains a sequence of characters identical to one of the Java escape sequences ("\" and "n", for example), the Java compiler treats that portion of the string as a Java escape sequence and transforms the sequence into an actual newline character. To insert the newline escape sequence, rather than a literal newline character, the programmer must precede the "\n" sequence with an additional backslash to prevent the Java compiler from replacing it with a newline character. The string constructed from the resulting sequence,

Code Block
\\n

consequently contains the correct two-character sequence \n and correctly denotes the escape sequence for newline in the pattern.

In general, for a particular escape character of the form '\X', the equivalent Java representation is:

Code Block

"\\X"

As an aside, this condition is particularly important in automatic exploit signature detection systems and filters that rely on pattern matching.
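The difference between the two spellings of the newline sequence can be checked directly. The sketch below is illustrative only; the class, constant, and method names are hypothetical. It compares what the compiler stores for each literal and how Pattern treats the result.

```java
import java.util.regex.Pattern;

final class NewlineEscapeDemo {
  // One character: the compiler has already produced a newline.
  static final String COMPILER_ESCAPE = "\n";
  // Two characters, '\' and 'n': the regex engine sees the escape.
  static final String REGEX_ESCAPE = "\\n";

  // Reports whether the given regular expression matches a lone newline.
  static boolean matchesNewline(String regex) {
    return Pattern.compile(regex).matcher("\n").matches();
  }

  public static void main(String[] args) {
    System.out.println(COMPILER_ESCAPE.length());        // 1
    System.out.println(REGEX_ESCAPE.length());           // 2
    System.out.println(matchesNewline(COMPILER_ESCAPE)); // true
    System.out.println(matchesNewline(REGEX_ESCAPE));    // true
  }
}
```

Both patterns happen to match a newline here, because a literal newline character in a pattern matches itself. That coincidence does not hold for every sequence: \b means backspace to the Java compiler but word boundary to the regular expression engine, so the unescaped form silently changes the meaning of the pattern.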

Noncompliant Code Example (String Literal)

This noncompliant code example defines a method, splitWords(), that finds matches between the string literal (WORDS) and the input sequence. It is expected that WORDS would hold the escape sequence for matching a word boundary. Instead, the Java compiler treats the "\b" literal as a Java escape sequence, and the string WORDS silently compiles to a regular expression that checks for a single backspace character.

Code Block
import java.util.regex.Pattern;

public class Splitter {
  // Interpreted as backspace
  // Fails to split on word boundaries
  private final String WORDS = "\b";

  public String[] splitWords(String input) {
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}

Compliant Solution (String Literal)

This compliant solution shows the correctly escaped value of the string literal WORDS that results in a regular expression designed to split on word boundaries:

Code Block
import java.util.regex.Pattern;

public class Splitter {
  // Interpreted as two chars, '\' and 'b'
  // Correctly splits on word boundaries
  private final String WORDS = "\\b";

  public String[] splitWords(String input) {
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}
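The behavioral difference between the two literals can be seen by running both patterns against the same input. This is a small sketch with hypothetical names (WordBoundaryDemo, splitWith); it is not part of the guideline's examples.

```java
import java.util.regex.Pattern;

final class WordBoundaryDemo {
  // Compiles the given regular expression and splits the input with it.
  static String[] splitWith(String regex, String input) {
    return Pattern.compile(regex).split(input);
  }

  public static void main(String[] args) {
    String input = "Hello world";

    // "\b" is a single backspace character; the input contains none,
    // so the input comes back as one unsplit element.
    System.out.println(splitWith("\b", input).length);      // 1

    // "\\b" reaches the regex engine as \b, the word-boundary matcher,
    // so the input is split into several pieces.
    System.out.println(splitWith("\\b", input).length > 1); // true

    // The backspace pattern only splits input that contains a backspace.
    System.out.println(splitWith("\b", "a\bb").length);     // 2
  }
}
```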

Noncompliant Code Example (String Property)

This noncompliant code example uses the same method, splitWords(). This time the WORDS string is loaded from an external properties file.

Code Block
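import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import java.util.regex.Pattern;

public class Splitter {
  private final String WORDS;

  public Splitter() throws IOException {
    Properties properties = new Properties();
    // Close the stream once the properties have been read
    try (InputStream in = new FileInputStream("splitter.properties")) {
      properties.load(in);
    }
    WORDS = properties.getProperty("WORDS");
  }

  public String[] splitWords(String input) {
    Pattern pattern = Pattern.compile(WORDS);
    String[] input_array = pattern.split(input);
    return input_array;
  }
}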

...

In the properties file, the WORDS property is once again incorrectly specified as \b:

Code Block
WORDS=\b

The Properties.load() method reads this value as the single character b, so the resulting pattern splits strings on the letter b. Although the string is interpreted differently than it would be as a string literal, as in the previous noncompliant code example, the interpretation is still incorrect.

Compliant Solution (String Property)

This compliant solution shows the correctly escaped value of the WORDS property:

Code Block
WORDS=\\b
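How Properties.load() treats the two spellings can be checked without touching the file system. The sketch below makes one assumption for convenience: it feeds the file text through a StringReader rather than the splitter.properties file used above. The class and method names are hypothetical. Note that each backslash in the file text must itself be doubled inside the Java string literals that stand in for the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class PropertyEscapeDemo {
  // Parses the given properties text and returns the WORDS value.
  static String loadWords(String fileText) throws IOException {
    Properties properties = new Properties();
    properties.load(new StringReader(fileText));
    return properties.getProperty("WORDS");
  }

  public static void main(String[] args) throws IOException {
    // The literal "WORDS=\\b" stands for the file text WORDS=\b.
    // load() drops the backslash, leaving the single letter b.
    System.out.println(loadWords("WORDS=\\b"));   // b

    // The literal "WORDS=\\\\b" stands for the file text WORDS=\\b.
    // load() yields the two characters '\' and 'b'.
    System.out.println(loadWords("WORDS=\\\\b")); // \b
  }
}
```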

Applicability

Incorrect use of escape characters in string inputs can result in misinterpretation and potential corruption of data.

...

Automated Detection

Tool | Version | Checker | Description
The Checker Framework | | Tainting Checker | Trust and security errors (see Chapter 8)

Risk Assessment

Rule | Severity | Likelihood | Remediation Cost | Priority | Level
IDS17-J | low | unlikely | high | P1 | L3

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this guideline on the CERT website.

Bibliography

[JLS 2005] §3.10.6, "Escape Sequences for Character and String Literals"
[API 2006] Class Pattern, "Backslashes, escapes, and quoting" (http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html)
[API 2006] Package java.sql (http://java.sun.com/javase/6/docs/api/java/sql/package-summary.html)
