The `char` data type is based on the original Unicode specification, which defined characters as fixed-width 16-bit entities. The Unicode Standard has since been changed to allow for characters whose representation requires more than 16 bits. The range of Unicode code points is now U+0000 to U+10FFFF. The set of characters from U+0000 to U+FFFF is referred to as the Basic Multilingual Plane (BMP), while characters whose code points are greater than U+FFFF are called supplementary characters. Such characters are generally rare, but some are used, for example, as part of Chinese and Japanese personal names. To support supplementary characters without changing the `char` primitive data type and causing incompatibility with previous Java programs, each supplementary character is represented by a pair of 16-bit Unicode code units called surrogates. According to the Java API [API 2014] class `Character` documentation (Unicode Character Representations):

...
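The surrogate-pair representation can be observed directly. The following is a minimal sketch, not part of the original rule text: the class name `SurrogateDemo`, the helper `fromCodePoint`, and the choice of U+1D11E (MUSICAL SYMBOL G CLEF) as the sample supplementary character are assumptions made for illustration. It uses only standard `String` and `Character` methods.

```java
final class SurrogateDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF: a supplementary character picked for illustration.
    static final int G_CLEF = 0x1D11E;

    // Builds a String holding exactly one code point (hypothetical helper).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(G_CLEF);

        // One code point, but two 16-bit code units.
        System.out.println(s.length());                              // 2
        System.out.println(s.codePointCount(0, s.length()));         // 1

        // The two code units are a high surrogate followed by a low surrogate.
        System.out.println(Character.isHighSurrogate(s.charAt(0)));  // true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));   // true

        // codePointAt reassembles the pair into the original code point.
        System.out.println(Integer.toHexString(s.codePointAt(0)));   // 1d11e
    }
}
```

Because `length()` and `charAt()` count code units rather than code points, any code that indexes a string one `char` at a time can split a surrogate pair.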

Unfortunately, the `trim()` method can fail because it uses the `char` form of the `Character.isLetter()` method. Methods that accept only a `char` value cannot support supplementary characters. According to the Java API [API 2014] class `Character` documentation:

...
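The difference between the `char` and `int` forms of `Character.isLetter()` can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the rule's own code: it assumes a trimming method that strips leading letters from a string, and the class name `LeadingLetterTrim`, the method names `trimByChar` and `trimByCodePoint`, and the sample character U+20000 (a CJK ideograph outside the BMP) are all hypothetical.

```java
final class LeadingLetterTrim {
    // Problematic variant: examines one 16-bit code unit at a time.
    // A surrogate code unit on its own is not a letter, so the loop
    // stops at the first supplementary character even if it is a letter.
    static String trimByChar(String s) {
        int i = 0;
        while (i < s.length() && Character.isLetter(s.charAt(i))) {
            i++;
        }
        return s.substring(i);
    }

    // Code-point-aware variant: reads a whole code point with codePointAt()
    // and advances by the number of code units that code point occupies.
    static String trimByCodePoint(String s) {
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (!Character.isLetter(cp)) {
                break;
            }
            i += Character.charCount(cp);
        }
        return s.substring(i);
    }

    public static void main(String[] args) {
        // U+20000 is a supplementary letter; it is followed by "a1".
        String input = new String(Character.toChars(0x20000)) + "a1";
        System.out.println(trimByChar(input).equals(input));    // true: nothing was stripped
        System.out.println(trimByCodePoint(input));             // 1
    }
}
```

For input made up only of BMP characters the two variants behave identically; they diverge only when a supplementary character is present.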

Rule	Severity	Likelihood	Remediation Cost	Priority	Level
STR01-J	low	unlikely	medium	P2	L3

Bibliography

[API 2014]	Classes `Character` and `BreakIterator`
[Tutorials 2008]	Character Boundaries
