C99 supports universal character names that may be used in identifiers, character constants, and string literals to designate characters that are not in the basic character set.
The universal character name \Unnnnnnnn designates the character whose eight-digit short identifier (as specified by ISO/IEC 10646) is nnnnnnnn. Similarly, the universal character name \unnnn designates the character whose four-digit short identifier is nnnn (and whose eight-digit short identifier is 0000nnnn).
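The equivalence between the four-digit and eight-digit forms can be checked with a small sketch. It assumes a compiler whose execution character set can represent U+00E9 (for example, GCC or Clang with their default UTF-8 execution character set). The names `short_form`, `long_form`, and `ucn_forms_match` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Both literals name the same character, U+00E9
   (LATIN SMALL LETTER E WITH ACUTE): once with the four-digit
   \u form and once with the eight-digit \U form. */
static const char short_form[] = "caf\u00E9";
static const char long_form[]  = "caf\U000000E9";

/* Hypothetical helper: returns 1 when the compiler encoded both
   literals to identical byte sequences, 0 otherwise. */
int ucn_forms_match(void)
{
    return sizeof short_form == sizeof long_form
        && memcmp(short_form, long_form, sizeof short_form) == 0;
}

/* Hypothetical self-check that aborts if the two forms differ. */
void check_ucn_forms(void)
{
    assert(ucn_forms_match());
}
```

The exact bytes stored for the character depend on the execution character set, so the sketch compares the two literals with each other rather than against a fixed encoding.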
If a character sequence that matches the syntax of a universal character name is produced by token concatenation, the behavior is undefined.
Non-Compliant Code Example
This code example is non-compliant because it produces a universal character name by token concatenation.
#define assign(uc1, uc2, uc3, uc4, val) uc1##uc2##uc3##uc4 = val

int \U00010401\U00010401\U00010401\U00010402;
/* Undefined behavior: ## produces universal character names by token concatenation. */
assign(\U00010401, \U00010401, \U00010401, \U00010402, 4);
Compliant Solution
This compliant solution passes the complete identifier to the macro as a single argument, so no token concatenation takes place.
#define assign(ucn, val) ucn = val

int \U00010401\U00010401\U00010401\U00010402;
assign(\U00010401\U00010401\U00010401\U00010402, 4);
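For readers who want something they can compile, the same idea can be sketched inside a function. This is an illustration under stated assumptions, not part of the rule: the macro `ASSIGN` and the function `assign_demo` are hypothetical names, and the identifier uses \u0401 instead of the characters above, on the assumption that U+0401 is permitted in identifiers by the compiler in use (it appears to fall within the Cyrillic ranges listed in C99 Annex D). A compiler that accepts universal character names in identifiers is also assumed, such as recent GCC or Clang.

```c
#include <assert.h>

/* Hypothetical macro: the whole identifier arrives as one argument,
   so the ## operator is never applied to a universal character name. */
#define ASSIGN(ucn, val) ((ucn) = (val))

/* Hypothetical demo: declares a variable whose name is spelled with a
   universal character name, assigns to it through the macro, and
   returns the stored value. */
int assign_demo(void)
{
    int \u0401 = 0;      /* identifier written as a UCN (U+0401) */
    ASSIGN(\u0401, 4);   /* expands to ((\u0401) = (4)) */
    assert(\u0401 == 4);
    return \u0401;
}
```

Parenthesizing the macro body and leaving the terminating semicolon to the caller keeps the expansion usable as an ordinary expression statement.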
Risk Assessment
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
PRE30-C | 1 (low) | 1 (unlikely) | 1 (low) | P1 | L3 |
References
- ISO/IEC 9899:1999, Section 5.1.1.2, "Translation phases"; Section 6.4.3, "Universal character names"; and Section 6.10.3.3, "The ## operator"