...
The trailing-byte ranges overlap the ranges of both single-byte and lead-byte characters. When a multibyte character is split across a buffer boundary, it can be interpreted differently than if it had been kept whole; the difference arises because its component bytes are ambiguous when read in isolation [Phillips 2005].
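The effect of a buffer boundary can be sketched in a few lines. This is a minimal illustration, assuming UTF-8 input and a hypothetical class `SplitDecodeDemo`; neither appears in the original text. In UTF-8 the trailing-byte range does not overlap the other ranges, so the split bytes decode to replacement characters (U+FFFD) rather than to different valid characters, but the underlying problem of decoding each buffer independently is the same.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original guideline.
class SplitDecodeDemo {
    // Decodes the complete byte sequence in a single step.
    static String decodeWhole(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Decodes the bytes before and after 'boundary' independently,
    // as code that reads fixed-size buffers might do by mistake.
    static String decodeInChunks(byte[] data, int boundary) {
        String first = new String(data, 0, boundary, StandardCharsets.UTF_8);
        String second = new String(data, boundary, data.length - boundary,
                                   StandardCharsets.UTF_8);
        return first + second;
    }

    public static void main(String[] args) {
        // U+00E9 is encoded in UTF-8 as the two bytes 0xC3 0xA9.
        byte[] data = "\u00E9".getBytes(StandardCharsets.UTF_8);
        String whole = decodeWhole(data);
        String chunked = decodeInChunks(data, 1); // boundary falls inside the character
        System.out.println(whole.equals(chunked)); // the two decodings differ
    }
}
```

Decoders that keep state across buffers, such as `java.nio.charset.CharsetDecoder` or an `InputStreamReader` wrapped around the stream, avoid this by carrying the incomplete byte sequence over to the next read.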
Supplementary Characters
According to the Java API [API 2006] documentation for class `Character` (Unicode Character Representations):
...
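As a concrete illustration of this representation: a Java `char` is a single 16-bit UTF-16 code unit, so a supplementary character (a code point above U+FFFF) occupies two `char` values, a surrogate pair. The sketch below uses U+10400 (DESERET CAPITAL LETTER LONG I) as an arbitrarily chosen example; the class name `SupplementaryDemo` is hypothetical.

```java
// Hypothetical demo class, not part of the original guideline.
class SupplementaryDemo {
    // U+10400 written as its UTF-16 surrogate pair.
    static final String DESERET = "\uD801\uDC00";

    public static void main(String[] args) {
        // Number of char code units versus number of code points.
        System.out.println(DESERET.length());
        System.out.println(DESERET.codePointCount(0, DESERET.length()));

        // The first char on its own is only a high surrogate, not a letter.
        System.out.println(Character.isHighSurrogate(DESERET.charAt(0)));
        System.out.println(Character.isLetter(DESERET.charAt(0)));

        // The int overload sees the full code point and recognizes a letter.
        System.out.println(Character.isLetter(DESERET.codePointAt(0)));
    }
}
```

Methods that take a `char` can therefore never give a correct answer for a supplementary character; only the overloads that take an `int` code point can.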
This noncompliant code example attempts to trim leading letters from `string`. It fails because `Character.isLetter(char)` examines a single UTF-16 code unit at a time: it cannot recognize a supplementary character, which occupies two code units, and it does not treat a combining character as part of the letter that precedes it [Hornig 2007].
```java
// Fails for supplementary or combining characters
public static String trim_bad1(String string) {
  char ch;
  int i;
  for (i = 0; i < string.length(); i += 1) {
    ch = string.charAt(i);
    if (!Character.isLetter(ch)) {
      break;
    }
  }
  return string.substring(i);
}
```
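The failure can be observed with a short driver. This is a sketch under stated assumptions: the class `TrimBad1Demo` and the sample inputs are hypothetical, and the method body repeats the logic of `trim_bad1` only so that the example is self-contained.

```java
// Hypothetical demo class, not part of the original guideline.
class TrimBad1Demo {
    // Same logic as trim_bad1: inspects one char (UTF-16 code unit) at a time.
    static String trimBad1(String string) {
        int i;
        for (i = 0; i < string.length(); i += 1) {
            char ch = string.charAt(i);
            if (!Character.isLetter(ch)) {
                break;
            }
        }
        return string.substring(i);
    }

    public static void main(String[] args) {
        // Plain BMP letters are trimmed as intended.
        System.out.println(trimBad1("abc123"));

        // The input starts with U+10400, a supplementary letter. Its high
        // surrogate is not a letter, so the loop stops at index 0 and
        // nothing is trimmed.
        String input = "\uD801\uDC00abc123";
        System.out.println(trimBad1(input).equals(input));
    }
}
```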
...
This noncompliant code example attempts to correct the problem by using the `String.codePointAt()` method, which returns the code point at the given index as an `int`, and by advancing the index by `Character.charCount()` code units. This works for supplementary characters but fails for combining characters [Hornig 2007].
```java
// Fails for combining characters
public static String trim_bad2(String string) {
  int ch;
  int i;
  for (i = 0; i < string.length(); i += Character.charCount(ch)) {
    ch = string.codePointAt(i);
    if (!Character.isLetter(ch)) {
      break;
    }
  }
  return string.substring(i);
}
```
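The remaining weakness shows up when a letter is followed by a combining mark. The driver below is a sketch with a hypothetical class `TrimBad2Demo` and hypothetical inputs; it restates the logic of `trim_bad2` as a `while` loop so that it stands alone.

```java
// Hypothetical demo class, not part of the original guideline.
class TrimBad2Demo {
    // Same logic as trim_bad2: inspects one code point at a time.
    static String trimBad2(String string) {
        int i = 0;
        while (i < string.length()) {
            int ch = string.codePointAt(i);
            if (!Character.isLetter(ch)) {
                break;
            }
            i += Character.charCount(ch);
        }
        return string.substring(i);
    }

    public static void main(String[] args) {
        // A supplementary letter (U+10400) is now handled correctly.
        System.out.println(trimBad2("\uD801\uDC00abc123"));

        // 'e' followed by U+0301 COMBINING ACUTE ACCENT: the mark is not a
        // letter, so trimming stops between the base letter and its accent
        // and the result begins with an orphaned combining mark.
        String result = trimBad2("e\u0301tude1");
        System.out.println(result.charAt(0) == '\u0301');
    }
}
```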
...
This compliant solution works both for supplementary and for combining characters [Hornig 2007]. According to the Java API [API 2006] documentation for class `java.text.BreakIterator`:
...
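One way to apply `java.text.BreakIterator` to the trimming task is sketched below. It is an illustration under stated assumptions and is not the compliant solution itself: the class `TrimGraphemeSketch` and the method `trimLeadingLetters` are hypothetical names, and the sketch classifies each user-perceived character by its first code point only.

```java
import java.text.BreakIterator;

// Hypothetical sketch, not the guideline's own compliant solution.
class TrimGraphemeSketch {
    // Trims leading user-perceived characters whose first code point is a letter.
    static String trimLeadingLetters(String string) {
        // Character-instance boundaries keep a base character together with
        // its combining marks and keep surrogate pairs intact.
        BreakIterator iter = BreakIterator.getCharacterInstance();
        iter.setText(string);

        int start = iter.first();
        for (int end = iter.next(); end != BreakIterator.DONE;
                start = end, end = iter.next()) {
            // 'start' is always a valid index here because 'end' is a later boundary.
            if (!Character.isLetter(string.codePointAt(start))) {
                break;
            }
        }
        return string.substring(start);
    }

    public static void main(String[] args) {
        System.out.println(trimLeadingLetters("e\u0301tude1"));
        System.out.println(trimLeadingLetters("\uD801\uDC00abc123"));
    }
}
```

Advancing boundary by boundary means the index passed to `substring()` can never fall between a base letter and its combining mark, or between the two halves of a surrogate pair.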
| [API 2006] | Classes |
| [Hornig 2007] | Problem Areas: Characters |
...