...
A user character may be composed of more than one Unicode character. For example, the user character ü can be composed by combining the Unicode characters \u0075 (u) and \u0308 (the combining diaeresis, ¨). ... The character ü may also be represented by the single Unicode character \u00fc.
Do not split a string in the middle of a combining character sequence, that is, between a base character and the combining characters that follow it.
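The two representations of ü, and one way to avoid splitting inside a combining sequence, can be sketched in Python as follows. This is a minimal illustration rather than a complete solution: the helper name safe_split is hypothetical, and it only checks the canonical combining class reported by the standard unicodedata module, so it does not cover every grapheme cluster rule (for example, joiners or spacing marks).

```python
import unicodedata

decomposed = "u\u0308"   # base letter u followed by U+0308 COMBINING DIAERESIS
precomposed = "\u00fc"   # the single code point for ü

# Normalization converts between the two equivalent representations.
composed_form = unicodedata.normalize("NFC", decomposed)
decomposed_form = unicodedata.normalize("NFD", precomposed)


def safe_split(text, index):
    """Split text at index, moving the split point to the right
    until it no longer falls directly before a combining mark.

    Hypothetical helper: it relies only on the canonical combining
    class, which is a simplification of full grapheme segmentation.
    """
    while index < len(text) and unicodedata.combining(text[index]) != 0:
        index += 1
    return text[:index], text[index:]


# Asking for a split at index 1 would separate u from its diaeresis,
# so the helper moves the split point past the combining mark.
left, right = safe_split("u\u0308x", 1)
```

Here left holds the full two-code-point ü and right holds "x", so the user character stays intact.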
Multibyte Characters
Multibyte encodings are used for character sets that require more than one byte to uniquely identify each constituent character. For example, the Japanese encoding Shift-JIS (shown below) supports multibyte encoding where the maximum character length is two bytes (one leading and one trailing byte).
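As a rough sketch of what the leading and trailing byte structure means in practice, the following Python code walks a Shift-JIS byte string one character at a time. The lead-byte ranges used here (0x81 to 0x9F and 0xE0 to 0xFC) are an assumption based on the common Windows code page 932 variant, and the helper names is_lead_byte and split_characters are hypothetical.

```python
def is_lead_byte(value):
    """Return True if the byte value starts a two-byte character.

    Assumed ranges: 0x81-0x9F and 0xE0-0xFC (code page 932 variant).
    """
    return 0x81 <= value <= 0x9F or 0xE0 <= value <= 0xFC


def split_characters(data):
    """Split Shift-JIS bytes into one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        # A lead byte consumes the trailing byte that follows it, if present.
        width = 2 if is_lead_byte(data[i]) and i + 1 < len(data) else 1
        chars.append(data[i:i + width])
        i += width
    return chars


# "A" is a single byte; the hiragana letter "あ" needs a lead and a trail byte.
encoded = "Aあ".encode("shift_jis")
pieces = split_characters(encoded)
decoded = [piece.decode("shift_jis") for piece in pieces]
```

Because the three bytes hold only two characters, byte offsets and character offsets differ, and truncating the data after the second byte would cut "あ" in half.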
...