Use visually distinct identifiers that are unlikely to be misread during development and review of code. Depending on the fonts used, certain characters are visually similar or even identical and can be misinterpreted. Consider the examples in the following table.
Misleading characters
Intended Character | Could Be Mistaken for This Character, and Vice Versa |
---|---|
0 (zero) | O (capital o) |
1 (one) | I (capital i) |
2 (two) | Z (capital z) |
5 (five) | S (capital s) |
8 (eight) | B (capital b) |
n (lowercase N) | h (lowercase H) |
rn (lowercase R, lowercase N) | m (lowercase M) |
The Java Language Specification (JLS) mandates that program source code be written using the Unicode character encoding [Unicode 2013]. Some distinct Unicode characters share identical glyph representation when displayed in many common fonts. For example, the Greek and Coptic characters (Unicode Range 0370–03FF) are frequently indistinguishable from the Greek-character subset of the Mathematical Alphanumeric Symbols (Unicode Range 1D400–1D7FF).
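As an illustration of this point, the following sketch compares a Greek small alpha (U+03B1) with the mathematical italic small alpha (U+1D6FC). The class name `Lookalikes` and the helper `sameAfterFolding` are hypothetical, and folding with NFKC compatibility normalization is only one possible way to reveal such pairs; it is not a technique prescribed by this guideline.

```java
import java.text.Normalizer;

public class Lookalikes {
    // Hypothetical helper: true if two strings become equal once
    // compatibility characters are folded to their base forms (NFKC).
    static boolean sameAfterFolding(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFKC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFKC));
    }

    public static void main(String[] args) {
        String greekAlpha = "\u03B1";                             // Greek and Coptic block
        String mathAlpha = new String(Character.toChars(0x1D6FC)); // Mathematical Alphanumeric Symbols block

        // The two strings are distinct to the compiler and the runtime...
        System.out.println(greekAlpha.equals(mathAlpha));           // false
        // ...but fold to the same character under NFKC.
        System.out.println(sameAfterFolding(greekAlpha, mathAlpha)); // true
    }
}
```

A reviewer or a pre-commit check could apply the same comparison to pairs of identifiers to find names that render alike but are different to the compiler.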
...
Do not use multiple identifiers that differ only in one or more visually similar characters. Also, make the initial portions of long identifiers distinct to aid recognition.
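One way to check a pair of identifiers against the table of misleading characters is to fold each look-alike to a single representative and compare the results. The following is a minimal sketch under that assumption. The class `VisualSkeleton`, its methods, and the folding of lowercase l together with 1 and I are hypothetical choices made for illustration, not part of any standard API.

```java
public class VisualSkeleton {
    // Hypothetical folding: map each character pair from the table of
    // misleading characters onto one representative.
    static String skeleton(String id) {
        return id.replace("rn", "m")   // rn reads as m
                 .replace('0', 'O')    // zero vs. capital o
                 .replace('1', 'I')    // one vs. capital i
                 .replace('l', 'I')    // lowercase ell (assumed addition)
                 .replace('2', 'Z')
                 .replace('5', 'S')
                 .replace('8', 'B')
                 .replace('h', 'n');
    }

    // Two different identifiers are suspicious if their skeletons match.
    static boolean confusable(String a, String b) {
        return !a.equals(b) && skeleton(a).equals(skeleton(b));
    }

    public static void main(String[] args) {
        System.out.println(confusable("stem", "stern")); // true
        System.out.println(confusable("bow", "stern"));  // false
    }
}
```

The check is only a heuristic: whether two names are actually confused depends on the font and on the reader.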
According to the JLS, §3.10.1, "Integer Literals" [JLS 2013],
An integer literal is of type long if it is suffixed with an ASCII letter L or l (ell); otherwise, it is of type int. The suffix L is preferred because the letter l (ell) is often hard to distinguish from the digit 1 (one).
Consequently, use L, not l, to clarify programmer intent when indicating that an integer literal is of type long.
Integer literals with leading zeros denote octal values, not decimal values. According to §3.10.1, "Integer Literals" of the JLS [JLS 2013],
An octal numeral consists of an ASCII digit 0 followed by one or more of the ASCII digits 0 through 7 interspersed with underscores, and can represent a positive, zero, or negative integer.
Misreading an octal literal as a decimal value may result in programming errors and is more likely to occur while declaring multiple constants and trying to enhance the formatting with zero padding.
Noncompliant Code Example
This noncompliant code example has two variables, stem and stern, within the same scope that can be easily confused and accidentally interchanged:

```java
int stem;  // Position near the front of the boat
/* ... */
int stern; // Position near the back of the boat
```
Compliant Solution
This compliant solution eliminates the confusion by assigning visually distinct identifiers to the variables:
```java
int bow;   // Position near the front of the boat
/* ... */
int stern; // Position near the back of the boat
```
Noncompliant Code Example
This noncompliant example prints the result of adding an int and a long value even though it appears that two integers 11111 are being added:
```java
public class Visual {
  public static void main(String[] args) {
    System.out.println(11111 + 1111l);
  }
}
```
Compliant Solution
This compliant solution uses an uppercase L (long) instead of lowercase l to disambiguate the visual appearance of the second integer. Its behavior is the same as that of the noncompliant code example, but the programmer's intent is clear:
```java
public class Visual {
  public static void main(String[] args) {
    System.out.println(11111 + 1111L);
  }
}
```
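The difference between what the expression computes and what a reader may think it computes can be made concrete with a short sketch. The class and method names below are hypothetical and exist only for this illustration.

```java
public class LongSuffixDemo {
    // What the code actually computes: int 11111 plus long 1111.
    static long actualSum() {
        return 11111 + 1111L;
    }

    // What a reader may believe is computed if the suffix is read as the digit 1.
    static int misreadSum() {
        return 11111 + 11111;
    }

    public static void main(String[] args) {
        Object boxed = 11111 + 1111L;              // int + long promotes to long
        System.out.println(boxed instanceof Long); // true
        System.out.println(actualSum());           // 12222
        System.out.println(misreadSum());          // 22222
    }
}
```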
Noncompliant Code Example
This noncompliant example mixes decimal values and octal values while storing them in an array:
```java
int[] array = new int[3];

void exampleFunction() {
  array[0] = 2719;
  array[1] = 4435;
  array[2] = 0042;
  // ...
}
```
The third element in array was likely intended to hold the decimal value 42. However, the decimal value 34 (corresponding to the octal value 42) is assigned.
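The value 34 follows from reading 42 in base 8: 4 × 8 + 2 = 34. The following sketch, with a hypothetical class name, confirms the arithmetic using standard library conversions.

```java
public class OctalDemo {
    // The zero-padded literal from the noncompliant example.
    static int paddedLiteral() {
        return 0042; // octal: 4 * 8 + 2
    }

    public static void main(String[] args) {
        System.out.println(paddedLiteral());           // 34
        System.out.println(Integer.parseInt("42", 8)); // 34, the same digits parsed as base 8
        System.out.println(Integer.toOctalString(34)); // 42
    }
}
```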
Compliant Solution
When integer literals are intended to represent a decimal value, avoid padding with leading zeros. Use another technique instead, such as padding with whitespace, to preserve digit alignment.
```java
int[] array = new int[3];

void exampleFunction() {
  array[0] = 2719;
  array[1] = 4435;
  array[2] =   42;
  // ...
}
```
Applicability
Failing to use visually distinct identifiers could result in the use of the wrong identifier and lead to unexpected program behavior.
...
Detection of integer literals that have a leading zero is trivial. However, determining whether the programmer intended to use an octal literal or a decimal literal is infeasible. Accordingly, sound automated detection is also infeasible. Heuristic checks may be useful.
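A heuristic check of this kind might be sketched as follows. The class `LeadingZeroScan`, its method, and the regular expression are assumptions made for this example rather than the behavior of any listed tool. The scan works on raw text, so it also reports matches inside comments and string literals, and it cannot tell whether an octal value was intended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LeadingZeroScan {
    // Hypothetical pattern: a 0 followed by at least one more octal digit,
    // not embedded in a longer number, identifier, or floating-point literal.
    // Hexadecimal (0x...) and binary (0b...) literals do not match.
    private static final Pattern LEADING_ZERO =
            Pattern.compile("(?<![\\w.])0[0-7_]*[0-7][lL]?(?![\\w.])");

    // Returns every token in the source text that looks like an octal literal.
    static List<String> findSuspects(String source) {
        List<String> hits = new ArrayList<>();
        Matcher m = LEADING_ZERO.matcher(source);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String snippet =
                "array[0] = 2719; array[1] = 4435; array[2] = 0042; int z = 0; int h = 0x1F;";
        System.out.println(findSuspects(snippet)); // [0042]
    }
}
```

Each reported literal still needs human review to decide whether octal was intended.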
Automated Detection
Tool | Version | Checker | Description |
---|---|---|---|
PVS-Studio | | V6061, V6097 | |
SonarQube | | S1314, S818 | |
Bibliography
Puzzle 4, "It's Elementary" | |
[JLS 2013] | |
[Seacord 2009] | DCL02-C. Use visually distinct identifiers |
[Unicode 2013] |
...