...

The trailing byte ranges overlap the range of both the single-byte and lead-byte characters. When a multibyte character is split across a buffer boundary, it can be interpreted differently than if it had been read whole, because its constituent bytes are ambiguous when examined in isolation \[[Phillips 2005|AA. References#Phillips 05]\].
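The buffer-boundary problem can be sketched as follows. This is a minimal illustration, not code from the guideline: it assumes UTF-8 rather than the overlapping-range encodings discussed above, and the class and method names ({{SplitDecodeSketch}}, {{decodePerChunk}}, {{decodeStreaming}}) are hypothetical. In UTF-8 a split character typically decodes to replacement characters rather than to a different valid character, but the cause is the same: each chunk is decoded without the bytes on the other side of the boundary.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

final class SplitDecodeSketch {
    // Decodes every chunk on its own. A multibyte character that straddles
    // a chunk boundary is decoded as two malformed fragments.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            sb.append(new String(data, off, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Uses one stateful decoder and carries any incomplete trailing bytes
    // over to the next chunk. Assumes chunkSize > 0 and well-formed input.
    static String decodeStreaming(byte[] data, int chunkSize) {
        if (data.length == 0) {
            return "";
        }
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        // Room for one chunk plus up to 3 leftover bytes of a partial sequence.
        ByteBuffer in = ByteBuffer.allocate(chunkSize + 4);
        // UTF-8 never needs fewer bytes than the number of UTF-16 code units.
        CharBuffer out = CharBuffer.allocate(data.length + 1);
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            in.put(data, off, len);
            in.flip();
            boolean last = off + len >= data.length;
            decoder.decode(in, out, last);
            in.compact(); // keep undecoded bytes for the next round
        }
        decoder.flush(out);
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodePerChunk(bytes, 4).equals("caf\u00e9"));
        System.out.println(decodeStreaming(bytes, 4).equals("caf\u00e9"));
    }
}
```

With a chunk size of 4, the two bytes of the final character land in different chunks, so only the streaming variant reproduces the original string.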

Supplementary Characters

According to the Java API \[[API 2006|AA. References#API 06]\] documentation for class {{Character}} (Unicode Character Representations):

...

This noncompliant code example attempts to trim leading letters from {{string}}. It fails because the {{Character.isLetter(char)}} overload examines a single UTF-16 code unit at a time, so it cannot recognize supplementary characters (which occupy two code units), and it does not classify combining characters as letters \[[Hornig 2007|AA. References#Hornig 07]\].

{code:bgColor=#FFcccc}
// Fails for supplementary or combining characters
public static String trim_bad1(String string) {
  char ch;
  int i;
  for (i = 0; i < string.length(); i += 1) {
    ch = string.charAt(i);
    if (!Character.isLetter(ch)) {
      break;
    }
  }
  return string.substring(i);
}
{code}
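The failure on supplementary characters can be seen in isolation. In this small sketch the class name and the choice of U+1D538 (MATHEMATICAL DOUBLE-STRUCK CAPITAL A) as the sample character are assumptions made for illustration. Each half of the surrogate pair is rejected by the {{char}} overload, while the {{int}} overload accepts the full code point.

```java
final class SurrogateLetterSketch {
    // U+1D538 is stored as the surrogate pair D835 DD38 in a Java String.
    static final String DOUBLE_STRUCK_A = "\uD835\uDD38";

    // Tests only the first UTF-16 code unit, as a charAt()-based loop does.
    static boolean firstCodeUnitIsLetter(String s) {
        return Character.isLetter(s.charAt(0));
    }

    // Tests the whole code point that starts at index 0.
    static boolean firstCodePointIsLetter(String s) {
        return Character.isLetter(s.codePointAt(0));
    }

    public static void main(String[] args) {
        System.out.println(DOUBLE_STRUCK_A.length());
        System.out.println(DOUBLE_STRUCK_A.codePointCount(0, DOUBLE_STRUCK_A.length()));
        System.out.println(firstCodeUnitIsLetter(DOUBLE_STRUCK_A));
        System.out.println(firstCodePointIsLetter(DOUBLE_STRUCK_A));
    }
}
```

Because the first code unit is a high surrogate rather than a letter, a loop built on {{charAt()}} stops immediately and returns such a string untrimmed.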

...

This noncompliant code example attempts to correct the problem by using the {{String.codePointAt()}} method, which returns the code point at the given index as an {{int}}, and passing that value to the {{Character.isLetter(int)}} overload. This works for supplementary characters but fails for combining characters \[[Hornig 2007|AA. References#Hornig 07]\].

{code:bgColor=#FFcccc}
// Fails for combining characters
public static String trim_bad2(String string) {
  int ch;
  int i;
  for (i = 0; i < string.length(); i += Character.charCount(ch)) {
    ch = string.codePointAt(i);
    if (!Character.isLetter(ch)) {
      break;
    }
  }
  return string.substring(i);
}
{code}
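To make the remaining failure concrete, the sketch below applies the same code-point loop to a letter followed by a combining accent. The class name, method name, and sample input are hypothetical. U+0301 (COMBINING ACUTE ACCENT) is a nonspacing mark rather than a letter, so the loop stops between the base letter and its accent and leaves the accent at the front of the result.

```java
final class CombiningMarkSketch {
    // "e", then U+0301 COMBINING ACUTE ACCENT, then the digit 1.
    // A reader sees this as an accented letter followed by a digit.
    static final String INPUT = "e\u0301" + "1";

    // Same logic as the code-point loop above: advances one code point
    // at a time and stops at the first code point that is not a letter.
    static String trimByCodePoint(String string) {
        int i = 0;
        while (i < string.length()) {
            int ch = string.codePointAt(i);
            if (!Character.isLetter(ch)) {
                break;
            }
            i += Character.charCount(ch);
        }
        return string.substring(i);
    }

    public static void main(String[] args) {
        String result = trimByCodePoint(INPUT);
        // The accent is stranded at the start of the result.
        System.out.println(result.codePointAt(0) == 0x0301);
        System.out.println(Character.getType(0x0301) == Character.NON_SPACING_MARK);
    }
}
```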

...

This compliant solution works for both supplementary and combining characters \[[Hornig 2007|AA. References#Hornig 07]\]. According to the Java API \[[API 2006|AA. References#API 06]\] documentation for class {{java.text.BreakIterator}}:

...
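As a rough illustration of the approach described above, the following sketch walks the string along the boundaries reported by {{BreakIterator.getCharacterInstance()}} and tests only the first code point of each user-perceived character. The class and method names are hypothetical, the use of {{Locale.ROOT}} is an assumption made to keep behavior independent of the default locale, and the code is not necessarily identical to the compliant solution.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class BreakIteratorTrimSketch {
    // Removes leading letters, treating a base character together with any
    // following combining marks (or a surrogate pair) as a single unit.
    static String trimLeadingLetters(String string) {
        BreakIterator iter = BreakIterator.getCharacterInstance(Locale.ROOT);
        iter.setText(string);
        int start = iter.first();
        // 'start' and 'end' delimit one user-perceived character per pass.
        for (int end = iter.next(); end != BreakIterator.DONE;
                start = end, end = iter.next()) {
            if (!Character.isLetter(string.codePointAt(start))) {
                break;
            }
        }
        // 'start' is the first non-letter unit, or string.length() if none.
        return string.substring(start);
    }

    public static void main(String[] args) {
        System.out.println(trimLeadingLetters("e\u0301" + "1"));
        System.out.println(trimLeadingLetters("\uD835\uDD38" + "1"));
    }
}
```

Checking only the code point at the start of each unit relies on the base character coming first, which holds for the accented-letter and surrogate-pair cases discussed in this section.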

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="3e9d028b22e3575e-1483388e-4b6f4ac9-96d99c6d-e3c3b3d1db59266c56e110ca"><ac:plain-text-body><![CDATA[

| [[API 2006|AA. References#API 06]] | Classes Character and BreakIterator |

]]></ac:plain-text-body></ac:structured-macro>

<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="4a863f2c7a2098f5-ae050552-4c4d4ee0-99dfb013-65325d5414868ef55debde08"><ac:plain-text-body><![CDATA[

| [[Hornig 2007|AA. References#Hornig 07]] | Problem Areas: Characters |

]]></ac:plain-text-body></ac:structured-macro>

...