The char data type is based on the original Unicode specification, which defined characters as fixed-width 16-bit entities. The Unicode Standard has since been extended to allow for characters whose representation requires more than 16 bits. The range of legal Unicode code points is now U+0000 to U+10FFFF. The set of characters from U+0000 to U+FFFF is referred to as the Basic Multilingual Plane (BMP), while characters whose code points are greater than U+FFFF are called supplementary characters. Supplementary characters are generally rare, but some are used, for example, in Chinese and Japanese personal names. To support supplementary characters without changing the char primitive data type and breaking compatibility with existing Java programs, each supplementary character is represented by a pair of 16-bit char values (UTF-16 code units) called surrogates: a high surrogate followed by a low surrogate. According to the Java API [API 2014] documentation for class Character (Unicode Character Representations):
...
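The difference between char values and code points described above can be seen in a short example. The following is a minimal sketch, not taken from the Java API documentation: the class name SurrogateDemo and the helpers countCodePoints and hasSupplementary are hypothetical names introduced for illustration, and U+20BB7 is simply one sample code point from the supplementary range. The library calls themselves (Character.toChars, Character.isHighSurrogate, Character.isLowSurrogate, Character.toCodePoint, String.codePointCount) are standard methods.

```java
class SurrogateDemo {
    // Hypothetical helper: counts Unicode code points, not 16-bit char values.
    public static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Hypothetical helper: true if s contains at least one surrogate pair,
    // that is, at least one supplementary character. Assumes s is well-formed
    // UTF-16 (no unpaired surrogates).
    public static boolean hasSupplementary(String s) {
        return s.length() != s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Build "a" followed by the supplementary character U+20BB7.
        String s = "a" + new String(Character.toChars(0x20BB7));

        System.out.println(s.length());         // 3 char values
        System.out.println(countCodePoints(s)); // 2 code points

        // The supplementary character occupies two char positions.
        char high = s.charAt(1);
        char low = s.charAt(2);
        System.out.println(Character.isHighSurrogate(high)); // true
        System.out.println(Character.isLowSurrogate(low));   // true

        // Recombining the pair yields the original code point.
        int cp = Character.toCodePoint(high, low);
        System.out.println(Integer.toHexString(cp));         // 20bb7
    }
}
```

Because length() and charAt() operate on char values, code that assumes one char per character may split a surrogate pair. Under that assumption, methods that work on int code points, such as codePointAt and codePointCount, are the safer choice when supplementary characters can occur.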