The C language provides several different kinds of constants: integer constants such as 10 and 0x1C, floating constants such as 1.0 and 6.022e+23, and character constants such as 'a' and '\x10'. C also provides string literals such as "hello, world" and "\n". These may all be referred to as literals.
When used in program logic, literals can reduce the readability of source code. As a result, literals in general, and integer constants in particular, are frequently referred to as magic numbers because their purpose is often obscured. Magic numbers may be constant values that represent either an arbitrary value (such as a determined appropriate buffer size) or a malleable concept (such as the age at which a person is considered an adult, which could change between geopolitical boundaries). Rather than embed literals in program logic, use appropriately named symbolic constants to clarify the intent of the code. In addition, if a specific value needs to be changed, redefining a symbolic constant once is more efficient and less error prone than replacing every instance of the value [[Saks 02]].
The C programming language has several mechanisms for creating named, symbolic constants: const-qualified objects, enumeration constants, and object-like macro definitions. Each of these mechanisms has associated advantages and disadvantages.
const-qualified Objects
Objects that are const-qualified have scope and can be type-checked by the compiler. Because these are named objects (unlike macro definitions), certain debugging tools can show the name of the object. The objects also consume memory, although this is rarely a significant concern.
A const-qualified object allows you to specify the exact type of the constant. For example:
unsigned int const buffer_size = 256;
defines buffer_size as a constant whose type is unsigned int.
Unfortunately, const-qualified objects cannot be used where compile-time integer constants are required, namely to define the
- size of a bit-field member of a structure
- size of an array (except in the case of variable length arrays)
- value of an enumeration constant
- value of a case constant
If any of these are required, then an integer constant (which would be an rvalue) must be used.
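The four contexts listed above can be sketched with an enumeration constant supplying the integer constant. This is only an illustration: every name in it (FIELD_BITS, flags, table, DOUBLE_BITS, table_length, classify) is hypothetical and does not come from the recommendation itself.

```c
/* Hypothetical names; only the four contexts are taken from the text. */
enum { FIELD_BITS = 4 };                  /* an integer constant expression */

struct flags {
    unsigned int mode : FIELD_BITS;       /* size of a bit-field member */
};

static int table[FIELD_BITS];             /* size of a (non-VLA) array */

enum { DOUBLE_BITS = FIELD_BITS * 2 };    /* value of an enumeration constant */

/* Number of elements in table, derived from the array itself. */
int table_length(void) {
    return (int)(sizeof(table) / sizeof(table[0]));
}

int classify(int n) {
    switch (n) {
    case FIELD_BITS:                      /* value of a case constant */
        return 1;
    default:
        return 0;
    }
}
```

If FIELD_BITS were instead declared as `const int field_bits = 4;`, none of these four uses would be valid in portable C99, although some compilers may accept a subset of them as an extension.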
const-qualified objects allow the programmer to take the address of the object.
const int max = 15;
int a[max]; /* invalid declaration outside of a function */
const int *p;
p = &max; /* a const-qualified object can have its address taken */
const-qualified objects are likely to incur some runtime overhead [[Saks 01b]]. Most C compilers, for example, allocate memory for const-qualified objects. const-qualified objects declared inside a function body have automatic storage duration. Consequently, if the compiler allocates storage for the object, it will be on the stack, and this storage will need to be allocated and initialized each time the containing function is invoked.
Enumeration Constants
Enumeration constants can be used to represent an integer constant expression that has a value representable as an int. Unlike const-qualified objects, enumeration constants do not require that storage be allocated for the value, so it is not possible to take the address of an enumeration constant.
enum { max = 15 };
int a[max]; /* OK */
const int *p;
p = &max; /* error: '&' on constant */
Enumeration constants do not allow the type of the value to be specified. An enumeration constant whose value can be represented as an int is always an int.
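The difference in type control can be observed with sizeof. The following is a minimal sketch; the names SMALL, small_object, and the two helper functions are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical names, used only to contrast the two mechanisms. */
enum { SMALL = 1 };                      /* fits in an int, so its type is int */
const unsigned char small_object = 1;    /* type chosen by the programmer */

/* Nonzero if the enumeration constant occupies the same size as an int. */
int enum_constant_is_int_sized(void) {
    return sizeof(SMALL) == sizeof(int);
}

/* Nonzero if the const-qualified object kept the narrower type it was given. */
int const_object_keeps_its_type(void) {
    return sizeof(small_object) == sizeof(unsigned char);
}
```

Both functions are expected to return a nonzero value: the enumeration constant is an int even though its value would fit in a single byte, whereas the const-qualified object has exactly the type in its declaration.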
Object-like Macros
A preprocessing directive of the form:
# define identifier replacement-list
defines an object-like macro that causes each subsequent instance of the macro name to be replaced by the replacement list of preprocessing tokens that constitute the remainder of the directive [[ISO/IEC 9899-1999]].
C programmers frequently define symbolic constants as object-like macros. For example, the code:
#define buffer_size 256
defines buffer_size as a macro whose value is 256. The preprocessor substitutes macros before the compiler does any other symbol processing. Later compilation phases never see macro symbols such as buffer_size; they see only the source text after macro substitution. Consequently, many compilers do not preserve macro names among the symbols they pass on to their debuggers.
Macro names do not observe the scope rules that apply to other names. Consequently, macros might substitute in unexpected places with unanticipated results.
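A minimal sketch of this scoping behavior follows. The names LIMIT, LOCAL_LIMIT, first, second, and third are hypothetical. A macro defined in the middle of one function body remains defined in the functions that follow it, whereas an enumeration constant declared inside a block is confined to that block.

```c
int first(void) {
#define LIMIT 10              /* written inside first(), but not scoped to it */
    return LIMIT;
}

int second(void) {
    return LIMIT;             /* the macro is still defined and expands to 10 */
}

int third(void) {
    enum { LOCAL_LIMIT = 20 };    /* an enumeration constant obeys block scope */
    return LOCAL_LIMIT;
}

/* LOCAL_LIMIT is not visible at this point; LIMIT still is. */
```

Had a later function declared its own variable or parameter named LIMIT, the preprocessor would have replaced that name with 10 as well, which typically produces a confusing compile error.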
Object-like macros do not consume memory, and consequently, it is not possible to create a pointer to one. Macros do not provide for type checking, as they are textually replaced by the preprocessor.
Macros may be passed as compile-time arguments.
Summary
The following table summarizes some of the differences between const-qualified objects, enumeration constants, and object-like macro definitions.
Method | Evaluated at | Consumes Memory | Viewable by Debuggers | Type Checking | Compile-time constant expression |
---|---|---|---|---|---|
Enumerations | compile time | no | yes | yes | yes |
const-qualified | run time | yes | yes | yes | no |
Macros | preprocessor | no | no | no | yes |
Non-Compliant Code Example
The meaning of the integer literal 18 is not clear in this example.
/* ... */
if (age >= 18) {
  /* Take action */
}
else {
  /* Take a different action */
}
/* ... */
Compliant Solution
The compliant solution replaces the integer literal 18 with the symbolic constant ADULT_AGE to clarify the meaning of the code.
enum { ADULT_AGE=18 };
/* ... */
if (age >= ADULT_AGE) {
  /* Take action */
}
else {
  /* Take a different action */
}
/* ... */
Non-Compliant Code Example
Integer literals are frequently used when referring to array dimensions, as shown in this non-compliant coding example.
char buffer[256];
/* ... */
fgets(buffer, 256, stdin);
This use of integer literals can easily result in buffer overflows if, for example, the buffer size is reduced but the integer literal used in the call to fgets() is not.
Compliant Solution (enum)
In this compliant solution the integer literal is replaced with an enumeration constant (see DCL00-A. Const-qualify immutable objects).
enum { BUFFER_SIZE=256 };
char buffer[BUFFER_SIZE];
/* ... */
fgets(buffer, BUFFER_SIZE, stdin);
Enumeration constants can safely be used anywhere a constant expression is required.
Compliant Solution (sizeof)
Frequently it is possible to obtain the desired readability by using a symbolic expression composed of existing symbols rather than by defining a new symbol. For example, a sizeof expression can work just as well as an enumeration constant (see EXP09-A. Use sizeof to determine the size of a type or variable).
char buffer[256];
/* ... */
fgets(buffer, sizeof(buffer), stdin);
Using the sizeof expression in this example reduces the total number of names declared in the program, which is generally a good idea [[Saks 02]]. The sizeof operator is almost always evaluated at compile time (except in the case of variable length arrays).
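One caveat is worth sketching: sizeof yields the size of the array only when it is applied to the array itself. Once the array has been passed to a function as a pointer, sizeof reports the size of the pointer. The function names below and the BUFFER_SIZE constant are assumptions made for illustration.

```c
#include <stddef.h>

enum { BUFFER_SIZE = 256 };    /* hypothetical size, matching the example */

/* sizeof applied where the array is declared: the size of the whole array. */
size_t size_seen_at_declaration(void) {
    char buffer[BUFFER_SIZE];
    return sizeof(buffer);
}

/* sizeof applied to a pointer parameter: the size of a pointer, not the array. */
size_t size_seen_through_pointer(char *buffer) {
    return sizeof(buffer);
}
```

Under these assumptions the first function returns 256, while the second returns sizeof(char *) regardless of how large the caller's array is. In that situation the size has to be passed separately, for example as a symbolic constant.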
Non-Compliant Code Example
In this non-compliant code example, the string literal "localhost" and integer constant 1234 are embedded directly in program logic, and are consequently difficult to change.
if ( (ld = ldap_init("localhost", 1234)) == NULL) {
  perror("ldap_init");
  return(1);
}
Compliant Solution
In this compliant solution, the host name and port number are both defined as object-like macros, so that they may be passed as compile-time arguments.
#ifndef PORTNUMBER /* might be passed on compile line */
# define PORTNUMBER 1234
#endif
#ifndef HOSTNAME /* might be passed on compile line */
# define HOSTNAME "localhost"
#endif
/* ... */
if ( (ld = ldap_init(HOSTNAME, PORTNUMBER )) == NULL) {
  perror("ldap_init");
  return(1);
}
Exceptions
DCL06-EX1: While replacing numeric constants with a symbolic constant is often a good practice, it can be taken too far. Remember that the goal is to improve readability. Exceptions can be made for constants that are themselves the abstraction you want to represent, as in this compliant solution.
x = (-b + sqrt(b*b - 4*a*c)) / (2*a);
Replacing numeric constants with symbolic constants in this example does nothing to improve the readability of the code, and may actually make the code more difficult to read:
enum { TWO = 2 }; /* a scalar */
enum { FOUR = 4 }; /* a scalar */
enum { SQUARE = 2 }; /* an exponent */
x = (-b + sqrt(pow(b, SQUARE) - FOUR*a*c))/ (TWO * a);
When implementing recommendations, it is always necessary to use sound judgment.
Note that this example does not check for invalid operations (such as taking the sqrt() of a negative number). See FLP32-C. Prevent or detect domain and range errors in math functions for more information.
Risk Assessment
Using numeric literals makes code more difficult to read and understand. Buffer overruns are frequently a consequence of a magic number being changed in one place (like an array declaration) but not elsewhere (like a loop through an array).
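The array-and-loop case mentioned above can be sketched as follows. The names SAMPLE_COUNT, samples, and fill_and_sum are hypothetical. Because the declaration and the loop bound share one symbolic constant, changing the array size cannot leave the loop bound out of date.

```c
enum { SAMPLE_COUNT = 8 };            /* hypothetical single point of change */

static int samples[SAMPLE_COUNT];     /* the declaration uses the constant */

/* Stores 0 .. SAMPLE_COUNT-1 in the array and returns their sum. */
int fill_and_sum(void) {
    int sum = 0;
    for (int i = 0; i < SAMPLE_COUNT; ++i) {   /* so does the loop bound */
        samples[i] = i;
        sum += samples[i];
    }
    return sum;
}
```

Had both places used the literal 8, reducing the array to 4 elements without updating the loop would write past the end of the array.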
Recommendation | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
DCL06-A | 1 (low) | 1 (unlikely) | 2 (medium) | P2 | L3 |
Automated Detection
The LDRA tool suite V 7.6.0 is able to detect violations of this recommendation.
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
References
[[Henricson 92]] Chapter 10, "Constants"
[[ISO/IEC 9899-1999]] Section 6.7, "Declarations"
[[ISO/IEC PDTR 24772]] "BRS Leveraging human experience"
[[Saks 01a]] Dan Saks. Symbolic Constants. Embedded Systems Design. November, 2001.
[[Saks 01b]] Dan Saks. Enumeration Constants vs. Constant Objects. Embedded Systems Design. November, 2001.
[[Saks 02]] Dan Saks. Symbolic Constant Expressions. Embedded Systems Design. February, 2002.
[[Summit 05]] Question 10.5b