...
This noncompliant code example attempts to trim leading letters from the string. However, this method may fail because methods that accept only a char value cannot support supplementary characters. According to the Java API [API 2014] class Character documentation:
They treat char values from the surrogate ranges as undefined characters. For example, Character.isLetter('\uD840') returns false, even though this specific value if followed by any low-surrogate value in a string would represent a letter.
The trim() method fails to accomplish this task because the char form of Character.isLetter() lacks support for supplementary characters. Because it examines only one character at a time, it will also separate combining character sequences.
```java
// Fails for supplementary or combining characters
public static String trim(String string) {
  char ch;
  int i;
  // Walks the string one UTF-16 code unit (char) at a time
  for (i = 0; i < string.length(); i += 1) {
    ch = string.charAt(i);
    // The char form of isLetter() treats a lone surrogate as a non-letter
    if (!Character.isLetter(ch)) {
      break;
    }
  }
  return string.substring(i);
}
```
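The difference between the char and int forms of Character.isLetter() can be checked directly. The following is a minimal sketch, not part of the original rule text: the class name SurrogateDemo and its helper methods are hypothetical, and U+20000 is chosen only as a convenient supplementary letter whose high surrogate is the '\uD840' value quoted above.

```java
// Illustrative sketch; SurrogateDemo and its helpers are hypothetical names.
final class SurrogateDemo {
    // U+20000 (a CJK ideograph) lies outside the BMP and is stored in a
    // String as the surrogate pair D840 DC00.
    static final String IDEOGRAPH = "\uD840\uDC00";

    // char form: sees only the high surrogate, which is not a letter by itself
    static boolean charFormSeesLetter() {
        return Character.isLetter(IDEOGRAPH.charAt(0));
    }

    // int form: sees the whole code point U+20000
    static boolean intFormSeesLetter() {
        return Character.isLetter(IDEOGRAPH.codePointAt(0));
    }

    public static void main(String[] args) {
        System.out.println(IDEOGRAPH.length());                                 // prints 2
        System.out.println(IDEOGRAPH.codePointCount(0, IDEOGRAPH.length()));    // prints 1
        System.out.println(charFormSeesLetter());                               // prints false
        System.out.println(intFormSeesLetter());                                // prints true
    }
}
```

Applied to the trim() method above, this means a string that begins with such an ideograph is returned untrimmed, because the loop stops at the high surrogate.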
...
This noncompliant code example corrects the problem with supplementary characters by using the integer form of the Character.isLetter() method, which accepts a Unicode code point as an int argument. Java library methods that accept an int value support all Unicode characters, including supplementary characters. However, this method still fails to handle combining characters because it examines only one code point at a time.
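A code-point-based variant might look like the following. This is a sketch under stated assumptions rather than the rule's own listing: the class name CodePointTrim is hypothetical, and the loop uses String.codePointAt() and Character.charCount() to step over whole code points.

```java
// Illustrative sketch; CodePointTrim is a hypothetical name.
final class CodePointTrim {
    // Handles supplementary characters, but still fails for combining characters
    static String trim(String string) {
        int i = 0;
        while (i < string.length()) {
            int cp = string.codePointAt(i);   // full code point, even if it spans two chars
            if (!Character.isLetter(cp)) {    // int form of isLetter()
                break;
            }
            i += Character.charCount(cp);     // advance by 1 or 2 code units
        }
        return string.substring(i);
    }

    public static void main(String[] args) {
        // A supplementary letter (U+20000) is now trimmed as a single unit.
        System.out.println(trim("\uD840\uDC00!").equals("!"));         // prints true
        // 'e' followed by U+0301 (combining acute accent): only 'e' is removed,
        // so the result starts with an orphaned combining mark.
        System.out.println(trim("e\u0301x").equals("\u0301x"));        // prints true
    }
}
```

The second call shows the remaining defect: the combining mark is not a letter, so the loop stops between the base letter and its accent and splits the combining character sequence.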
...