C99 supports universal character names that may be used in identifiers, character constants, and string literals to designate characters that are not in the basic character set. The universal character name \Unnnnnnnn designates the character whose eight-digit short identifier (as specified by ISO/IEC 10646) is nnnnnnnn. Similarly, the universal character name \unnnn designates the character whose four-digit short identifier is nnnn (and whose eight-digit short identifier is 0000nnnn).
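As a minimal sketch of the two spellings, the fragment below declares a local variable with the four-digit form and reads it back with the eight-digit form. The function name read_via_long_form is hypothetical, and the sketch assumes a compiler that accepts universal character names in identifiers (recent GCC and Clang releases do).

```c
/* Hypothetical helper, not part of the original rule text. */
int read_via_long_form(void) {
  int \u0401 = 4;      /* four-digit form of U+0401 */
  return \U00000401;   /* eight-digit form names the same identifier */
}
```

Because both spellings designate the same character, they name the same variable, and a call to read_via_long_form() returns the value stored through the short form.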

C99, Section 5.1.1.2, in its description of translation phase 4, says:

If a character sequence that matches the syntax of a universal character name is produced by token concatenation (6.10.3.3), the behavior is undefined.

(See also undefined behavior 3 of Annex J.)

In general, universal character names should be avoided in identifiers unless absolutely necessary. The basic character set should suffice for almost every identifier.
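One way to follow this advice, sketched here under assumptions, is to keep identifiers in the basic character set and confine the universal character name to a string literal. The names cyrillic_io and cyrillic_io_length are hypothetical, and the bytes stored in the literal depend on the implementation's execution character set.

```c
#include <stddef.h>
#include <string.h>

/* The identifier uses only basic characters; the UCN appears only in the literal. */
static const char cyrillic_io[] = "\u0401";

/* Hypothetical accessor: number of bytes the literal occupies, excluding
 * the terminating null. The exact count depends on the execution character set. */
size_t cyrillic_io_length(void) {
  return strlen(cyrillic_io);
}
```

With this arrangement the character is still available to the program, but no macro that pastes tokens together can ever be asked to build its name.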

Noncompliant Code Example

This code example is noncompliant because the ## operator pastes the fragments \u04 and 01 together, producing the universal character name \u0401 by token concatenation.

#define assign(uc1, uc2, val) uc1##uc2 = val

void func(void) {
  int \u0401;
  assign(\u04, 01, 4);  /* undefined behavior: ## forms \u0401 */
}

Implementation Details

This code compiles and runs on Microsoft Visual C++ 2008, assigning 4 to the variable as expected.

GCC 4.3 on Linux refuses to compile this code; it complains of a "stray \", referring to the universal character fragment in the invocation of the assign macro.

Compliant Solution

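This compliant solution passes the complete universal character name to the macro as a single argument, so no token concatenation is needed.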

#define assign(ucn, val) ucn = val

void func(void) {
  int \u0401;
  assign(\u0401, 4);  /* the complete name is passed as one argument */
}
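The quoted restriction covers only concatenations whose result matches the syntax of a universal character name. As a hedged illustration of that boundary, the sketch below pastes two basic-character tokens into an ordinary identifier, which is well defined. The macro MAKE_NAME and the function paste_basic_tokens are hypothetical names introduced for this example.

```c
/* Hypothetical macro: pastes two basic-character tokens into one identifier. */
#define MAKE_NAME(prefix, suffix) prefix##suffix

int paste_basic_tokens(void) {
  int counter01 = 0;
  MAKE_NAME(counter, 01) = 4;  /* expands to: counter01 = 4 */
  return counter01;
}
```

Here the pasted result, counter01, contains no backslash sequence, so it cannot match the \unnnn or \Unnnnnnnn forms.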

Risk Assessment

Creating a universal character name through token concatenation results in undefined behavior.

Rule | Severity | Likelihood | Remediation Cost | Priority | Level
PRE30-C | low | unlikely | medium | P2 | L3

Automated Detection

Tool | Version | Checker | Description
 | 9.7.1 | 573 S | Fully Implemented

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

CERT C++ Secure Coding Standard: PRE30-CPP. Do not create a universal character name through concatenation

ISO/IEC 10646:2003

ISO/IEC 9899:1999 Section 5.1.1.2, "Translation phases," Section 6.4.3, "Universal character names," and Section 6.10.3.3, "The ## operator"
