
Many classes, including Pattern and those that support XML- and SQL-based actions by passing String arguments to methods, allow escape sequences in character and string literals. According to the Java Language Specification [[JLS 2005]], Section 3.10.6, "Escape Sequences for Character and String Literals":

The character and string escape sequences allow for the representation of some nongraphic characters as well as the single quote, double quote, and backslash characters in character literals (§3.10.4) and string literals (§3.10.5).

Using escape sequences correctly in String literals requires an understanding of how they are interpreted. For example, SQL statements written in Java sometimes require special escape characters or sequences (for instance, sequences containing \t, \n, or \r). When such a sequence is meant to be interpreted by the SQL engine rather than by the Java compiler, it must be preceded by an extra backslash in the Java string literal.
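As a minimal sketch of the SQL case, consider a LIKE clause that uses a backslash as its ESCAPE character. The table and column names (products, label) and the class name SqlEscapeSketch are hypothetical, and how a particular database treats backslashes inside its own string literals varies, so the SQL text itself is an assumption. No database is contacted; the code only shows which characters remain in the string after the Java compiler has processed the literal.

```java
// Hypothetical illustration: table, column, and class names are assumptions.
class SqlEscapeSketch {

  // Intended SQL text: SELECT * FROM products WHERE label LIKE '50\%%' ESCAPE '\'
  // Each backslash meant for the SQL engine is written as \\ in the Java literal.
  static final String QUERY =
      "SELECT * FROM products WHERE label LIKE '50\\%%' ESCAPE '\\'";

  // Counts the backslash characters that survive Java compilation.
  static int countBackslashes(String s) {
    int count = 0;
    for (int i = 0; i < s.length(); i++) {
      if (s.charAt(i) == '\\') {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(QUERY);                   // the text a driver would receive
    System.out.println(countBackslashes(QUERY)); // 2
  }
}
```

Writing the literal with single backslashes would not compile, because \% and \' followed by a closing quote are not what the Java compiler expects at those positions.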

As another example, consider the Pattern class, which is used extensively for regular expression tasks. A String literal used for pattern matching is compiled into an instance of the Pattern type. If the pattern contains a sequence that is also a Java escape sequence, such as \n, the Java compiler interprets it before the Pattern class ever sees it. To pass the escape sequence through to the Pattern class, the literal must be written with an additional backslash:

"\\n"

The resulting string contains the two characters \ and n, which the Pattern class interprets as the regular expression escape for a newline, instead of containing a literal newline character inserted by the compiler.

In general, for a particular escape character of the form '\X', the equivalent Java representation is:

"\\X"

As an aside, this condition is particularly important in automatic exploit signature detection systems and filters that rely on pattern matching.
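The general rule above, that a regular expression escape \X is written "\\X" in Java source, can be sketched as follows. The class name RegexEscapeSketch and the helper isAllDigits are hypothetical names chosen for illustration only.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of how many characters each literal really contains.
class RegexEscapeSketch {

  // One character: the Java compiler has already replaced \n with a line feed.
  static final String COMPILER_NEWLINE = "\n";

  // Two characters, backslash and 'n': Pattern interprets them as its newline escape.
  static final String REGEX_NEWLINE = "\\n";

  // "\\d+" reaches Pattern as \d+ (one or more digits). Writing "\d+" would not
  // compile, because \d is not a Java escape sequence.
  static final Pattern DIGITS = Pattern.compile("\\d+");

  static boolean isAllDigits(String s) {
    return DIGITS.matcher(s).matches();
  }

  public static void main(String[] args) {
    System.out.println(COMPILER_NEWLINE.length()); // 1
    System.out.println(REGEX_NEWLINE.length());    // 2
    // Both forms match a line feed, but only the second is a regex escape.
    System.out.println(Pattern.matches(COMPILER_NEWLINE, "\n")); // true
    System.out.println(Pattern.matches(REGEX_NEWLINE, "\n"));    // true
    System.out.println(isAllDigits("2024"));                     // true
  }
}
```

For \n the two spellings happen to behave the same in a pattern, which is part of why the mistake is easy to miss; for sequences such as \b the two spellings mean different things.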

Noncompliant Code Example

This noncompliant code example defines a method splitWords() that splits the input sequence wherever the pattern in the String literal WORDS matches. Because \b is the regular expression construct for a word boundary, the implementer may mistakenly assume that the literal "\b" can be used as is to split a string into individual words. Instead, the Java compiler silently translates "\b" into a backspace character, so the pattern matches backspace characters rather than word boundaries.

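import java.util.regex.Pattern;

public class BadSplitter {
  // The intent was to split on word boundaries, but "\b" is the
  // Java escape sequence for a single backspace character
  private final String WORDS = "\b";

  public String[] splitWords(String input) {
    Pattern p = Pattern.compile(WORDS); // matches backspace characters only
    String[] inputArray = p.split(input);
    return inputArray;
  }
}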

Compliant Solution

This compliant solution shows the correctly escaped value of the String literal WORDS, which results in a regular expression that splits on word boundaries.

import java.util.regex.Pattern;

public class GoodSplitter {
  // "\\b" reaches Pattern as \b, the word-boundary construct
  private final String WORDS = "\\b";

  public String[] split(String input) {
    Pattern p = Pattern.compile(WORDS); // matches at word boundaries
    String[] inputArray = p.split(input);
    return inputArray;
  }
}
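The difference between the two literals can be observed directly. The following is a minimal sketch rather than part of the guideline; the class name SplitterComparison and the helper splitOn are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical side-by-side comparison of the two WORDS literals.
class SplitterComparison {

  // Compiles the given regex and splits the input around its matches.
  static String[] splitOn(String regex, String input) {
    return Pattern.compile(regex).split(input);
  }

  public static void main(String[] args) {
    String input = "hello world";

    // "\b" is a single backspace character (U+0008). The input contains
    // none, so the whole input comes back as one element.
    String[] unsplit = splitOn("\b", input);
    System.out.println(unsplit.length); // 1

    // "\\b" reaches Pattern as \b, so the input is split at word boundaries.
    String[] words = splitOn("\\b", input);
    System.out.println(Arrays.toString(words));
  }
}
```

The exact contents of the second array can differ between Java releases because of how a zero-width match at the start of the input is handled, so the sketch prints the result instead of stating it.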

Risk Assessment

Incorrect use of escape characters in String literals can result in misinterpretation and potential corruption of data.

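| Guideline | Severity | Likelihood | Remediation Cost | Priority | Level |
|-----------|----------|------------|------------------|----------|-------|
| IDS17-J   | low      | unlikely   | high             | P1       | L3    |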

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this guideline on the CERT website.

Bibliography

[[JLS 2005]] 3.10.6 Escape Sequences for Character and String Literals
[[API 2006]] Class Pattern "Backslashes, escapes, and quoting"
[[API 2006]] Package java.sql

