According to MISRA 2008, concatenation of wide and narrow string literals leads to undefined behavior. Through C90, this behavior was implicitly undefined [ISO/IEC 9899:1990]. However, C99 defined this behavior [ISO/IEC 9899:1999], and C11 further explains in subclause 6.4.5, paragraph 5 [ISO/IEC 9899:2011]:
In translation phase 6, the multibyte character sequences specified by any sequence of adjacent character and identically-prefixed string literal tokens are concatenated into a single multibyte character sequence. If any of the tokens has an encoding prefix, the resulting multibyte character sequence is treated as having the same prefix; otherwise, it is treated as a character string literal. Whether differently-prefixed wide string literal tokens can be concatenated and, if so, the treatment of the resulting multibyte character sequence are implementation-defined.
Nonetheless, all string literals in a concatenation should be of the same type, so that the code relies neither on implementation-defined behavior nor, when compiled on a platform that supports only C90, on undefined behavior.
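As a minimal sketch of the quoted rule, the following block concatenates only identically prefixed literals: one sequence in which every piece carries the `L` prefix and one in which no piece carries a prefix. The names `all_wide`, `all_narrow`, `all_wide_length`, `all_narrow_length`, and `check_concatenation` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Every piece carries the L prefix: the result is a wide string literal. */
static const wchar_t all_wide[] = L"wide " L"and " L"wide";

/* No piece carries a prefix: the result is a character string literal. */
static const char all_narrow[] = "narrow " "and " "narrow";

/* Hypothetical helpers: length in characters, excluding the terminator. */
size_t all_wide_length(void) { return wcslen(all_wide); }
size_t all_narrow_length(void) { return strlen(all_narrow); }

/* Self-check: the adjacent pieces were merged into one literal each. */
void check_concatenation(void) {
    assert(all_wide_length() == 13);   /* 5 + 4 + 4 characters */
    assert(all_narrow_length() == 17); /* 7 + 4 + 6 characters */
    /* One extra element for the terminating L'\0'. */
    assert(sizeof all_wide / sizeof all_wide[0] == 14);
}
```

Because neither sequence mixes prefixes, the sketch should not depend on which revision of the C standard the compiler implements.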
Noncompliant Code Example (C90)
This noncompliant code example concatenates wide and narrow string literals. Although the behavior is undefined in C90, the programmer probably intended to create a wide string literal.
    wchar_t *msg = L"This message is very long, so I want to divide it "
                   "into two parts.";
Compliant Solution (C90, Wide String Literals)
If the concatenated string needs to be a wide string literal, each element in the concatenation must be a wide string literal, as in this compliant solution:
    wchar_t *msg = L"This message is very long, so I want to divide it "
                   L"into two parts.";
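In practice, one piece of a wide concatenation sometimes comes from a macro that expands to a narrow string literal. The sketch below shows one common way to keep every piece wide in that case, by pasting the `L` prefix onto the expanded literal. It assumes the macro expands to exactly one string literal token; `PRODUCT_NAME`, `WIDEN`, `WIDEN_IMPL`, `get_banner`, and `check_banner` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical narrow macro, for example one supplied by a build system. */
#define PRODUCT_NAME "ExampleApp"

/* Two-level expansion: WIDEN expands its argument first, then WIDEN_IMPL
 * pastes the L prefix onto the resulting string literal token. */
#define WIDEN_IMPL(s) L##s
#define WIDEN(s) WIDEN_IMPL(s)

/* All three pieces are wide string literals after preprocessing. */
static const wchar_t banner[] = L"Welcome to " WIDEN(PRODUCT_NAME) L"!";

/* Hypothetical accessor for the assembled message. */
const wchar_t *get_banner(void) { return banner; }

/* Self-check: the pieces form the expected single wide string. */
void check_banner(void) {
    assert(wcscmp(get_banner(), L"Welcome to ExampleApp!") == 0);
}
```

If the macro expanded to several adjacent literals, only the first would receive the prefix, so this approach would reintroduce the mixed concatenation it is meant to avoid.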
Compliant Solution (C90, Narrow String Literals)
If wide string literals are unnecessary, it is better to use narrow string literals, as in this compliant solution:
    char *msg = "This message is very long, so I want to divide it "
                "into two parts.";
Risk Assessment
The concatenation of wide and narrow string literals could lead to undefined behavior.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
STR10-C | Low | Probable | Medium | P4 | L3 |
Automated Detection
Tool | Version | Checker | Description |
---|---|---|---|
Astrée | | encoding-mismatch | Fully checked |
Axivion Bauhaus Suite | | CertC-STR10 | |
ECLAIR | | CC2.STR10 | Fully implemented |
Helix QAC | | C0874 | |
LDRA tool suite | | 450 S | Fully implemented |
Parasoft C/C++test | | CERT_C-STR10-a | Narrow and wide string literals shall not be concatenated |
PC-lint Plus | | 707 | Fully supported |
SonarQube C/C++ Plugin | | NarrowAndWideStringConcat | |
RuleChecker | | encoding-mismatch | Fully checked |
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
Related Guidelines
MISRA C++:2008 | Rule 2-13-5 |
Bibliography
[ISO/IEC 9899:2011] | Section 6.4.5, "String Literals" |
[ISO/IEC 14882-2003] | Section 2.13.4, "String literals" |