Use visually distinct identifiers that are unlikely to be misread during development and review of code. Depending on the fonts used, certain characters are visually similar or even identical and can be misinterpreted. Consider the examples in the following table.
Misleading characters
Intended Character | Could Be Mistaken for This Character, and Vice Versa |
---|---|
0 (zero) | O (capital o) |
1 (one) | I (capital i) |
2 (two) | Z (capital z) |
5 (five) | S (capital s) |
8 (eight) | B (capital b) |
n (lowercase N) | h (lowercase H) |
rn (lowercase R, lowercase N) | m (lowercase M) |
The Java Language Specification (JLS) mandates that program source code be written using the Unicode character encoding [Unicode 2013]. Some distinct Unicode characters share identical glyph representation when displayed in many common fonts. For example, the Greek and Coptic characters (Unicode Range 0370–03FF) are frequently indistinguishable from the Greek-character subset of the Mathematical Alphanumeric Symbols (Unicode Range 1D400–1D7FF).
...
Code Block | ||
---|---|---|
| ||
public class Visual { public static void main(String[] args) { System.out.println(11111 + 1111l); } } |
Compliant Solution
This compliant solution uses an uppercase L
(long
) instead of lowercase l
to disambiguate the visual appearance of the second integer. Its behavior is the same as that of the noncompliant code example, but the programmer's intent is clear:
...
Detection of integer literals that have a leading zero is trivial. However, determining whether the programmer intended to use an octal literal or a decimal literal is infeasible. Accordingly, sound automated detection is also infeasible. Heuristic checks may be useful.
Automated Detection
Tool | Version | Checker | Description | ||||||
---|---|---|---|---|---|---|---|---|---|
PVS-Studio |
| V6061, V6097 | |||||||
SonarQube |
|
|
| S1314 S818 |
Bibliography
Puzzle 4, "It's Elementary" | |
[JLS 2013] | |
[Seacord 2009] | DCL02-C. Use visually distinct identifiers |
[Unicode 2013] |
...
...