C99 supports universal character names that may be used in identifiers, character constants, and string literals to designate characters that are not in the basic character set. The universal character name \Unnnnnnnn designates the character whose eight-digit short identifier (as specified by ISO/IEC 10646) is nnnnnnnn. Similarly, the universal character name \unnnn designates the character whose four-digit short identifier is nnnn (and whose eight-digit short identifier is 0000nnnn).
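The relationship between the four-digit and eight-digit forms can be illustrated with a minimal sketch. The function names below are hypothetical and chosen only for this illustration; the sketch assumes a compiler that accepts universal character names in wide character constants and wide string literals.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: returns nonzero if the four-digit and eight-digit
 * spellings of the same short identifier designate the same character. */
int ucn_forms_agree(void) {
    wchar_t four_digit  = L'\u0401';      /* \unnnn form */
    wchar_t eight_digit = L'\U00000401';  /* \Unnnnnnnn form, 0000 + nnnn */
    return four_digit == eight_digit;
}

/* Hypothetical helper: each universal character name in a wide string
 * literal contributes one wide character here. */
size_t ucn_wide_length(void) {
    return wcslen(L"\u0401\u0402");
}

/* Hypothetical self-check tying the two helpers together. */
void check_ucn_forms(void) {
    assert(ucn_forms_agree());
    assert(ucn_wide_length() == 2);
}
```

Both spellings name the same character, so the comparison in `ucn_forms_agree` holds regardless of how the implementation encodes wide characters.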
C99, Section 5.1.1.2, paragraph 4, says:
If a character sequence that matches the syntax of a universal character name is produced by token concatenation (6.10.3.3), the behavior is undefined.
In general, universal character names should be avoided in identifiers unless absolutely necessary. The basic character set should suffice for almost every identifier.
Noncompliant Code Example
This code example is noncompliant because it produces a universal character name by token concatenation.
    #define assign(uc1, uc2, uc3, uc4, val) \
      uc1##uc2##uc3##uc4 = val;

    int \u0401\u0401\u0401\u0402;
    assign(\u0401, \u0401, \u0401, \u0402, 4);
Implementation Details
Although noncompliant, this code produces the expected behavior: it assigns the value 4 to the variable with MSVC 2008 and with GCC 4.3 on Linux when compiled with -std=c99 -fextended-identifiers.
Compliant Solution
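This compliant solution passes the complete identifier to the macro as a single argument, so no universal character name is produced by token concatenation.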
    #define assign(ucn, val) ucn = val;

    int \u0401\u0401\u0401\u0402;
    assign(\u0401\u0401\u0401\u0402, 4);
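For readers who want to try the compliant approach in a complete translation unit, the following is a minimal sketch rather than a definitive implementation. The macro name `ASSIGN` and the function name `assign_without_concatenation` are hypothetical, and the sketch assumes a compiler with extended-identifier support (older GCC releases may need the -fextended-identifiers option mentioned under Implementation Details).

```c
#include <assert.h>

/* Hypothetical macro: the whole identifier arrives as one argument,
 * so the ## operator is never applied to a universal character name. */
#define ASSIGN(ucn, val) ((ucn) = (val))

/* Hypothetical wrapper so the assignment appears inside a function body. */
int assign_without_concatenation(void) {
    int \u0401\u0401\u0401\u0402 = 0;
    ASSIGN(\u0401\u0401\u0401\u0402, 4);
    assert(\u0401\u0401\u0401\u0402 == 4);
    return \u0401\u0401\u0401\u0402;
}
```

The macro body in this sketch is parenthesized and carries no trailing semicolon, which is a stylistic choice of the sketch and not part of the rule itself.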
Risk Assessment
Creating a universal character name through token concatenation results in undefined behavior.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
PRE30-C | low | unlikely | medium | P2 | L3 |
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
Other Languages
This rule appears in the C++ Secure Coding Standard as "PRE30-CPP. Do not create a universal character name through concatenation."
References
[ISO/IEC 10646-2003]
[ISO/IEC 9899:1999] Section 5.1.1.2, "Translation phases," Section 6.4.3, "Universal character names," and Section 6.10.3.3, "The ## operator"