...

Null-terminated byte strings are implemented as arrays of characters and are susceptible to the same problems as arrays. As a result, rules and recommendations for arrays should also be applied to null-terminated byte strings.

...

  • Suitable for small integer values.

Plain char

  • The type of each element of a string literal.
  • Used for character data from a limited character set (where signedness has little meaning) as opposed to integer data.

int

  • Used for data that can be either EOF (a negative value) or character data interpreted as unsigned char and then converted to int. As a result, it is returned by fgetc(), getc(), getchar(), and ungetc(). It is also accepted by the character-handling functions from <ctype.h> because they might be passed the result of fgetc(), and so on.
  • The type of a character constant; its value is that of a plain char converted to int.
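As a minimal sketch of the first use of int, the following hypothetical helper, count_bytes() (a name introduced here only for illustration), stores the result of fgetc() in an int so that every unsigned char value stays distinguishable from EOF. Storing the result in a plain char could make a byte such as 0xFF compare equal to EOF on some implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in a stream. */
size_t count_bytes(FILE *stream) {
    assert(stream != NULL);
    size_t n = 0;
    int c; /* int, not char: must hold every unsigned char value plus EOF */
    while ((c = fgetc(stream)) != EOF) {
        ++n;
    }
    return n;
}
```

This sketch does not distinguish end-of-file from a read error; a caller that needs to do so could check feof() and ferror() afterward.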

Note that the two different ways a character is used as an int (as an unsigned char + EOF or as a plain char converted to int) can lead to confusion. For example, isspace('\200') results in undefined behavior when char is signed.
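One common way to avoid this problem is to convert the character to unsigned char before passing it to a <ctype.h> function. The sketch below assumes a hypothetical helper, count_spaces(), which is not part of any library and is shown only to illustrate the cast.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts whitespace characters in a string. */
size_t count_spaces(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        /* Convert through unsigned char so the argument is never negative,
           even when plain char is signed and *s is, for example, '\200'. */
        if (isspace((unsigned char)*s)) {
            ++n;
        }
    }
    return n;
}
```

Without the cast, a string containing '\200' would pass a negative value to isspace() on implementations where plain char is signed.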

...

  • Used internally for string comparison functions even though these functions operate on character data; consequently, the result of a string comparison does not depend on whether plain char is signed.
  • Used when the object being manipulated might be of any type, and it is necessary to access all bits of that object, as with fwrite().
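To illustrate the string comparison point, here is a rough sketch of a comparison that interprets each character as unsigned char, which is how the C Standard specifies the sign of the strcmp() result. The function name compare_bytes() is hypothetical and is not how any particular library implements strcmp().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: compares two strings byte by byte, interpreting
   each character as unsigned char. Returns a negative, zero, or positive
   value with the same sign as strcmp(a, b). */
int compare_bytes(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;
    /* Skip the common prefix, stopping at the terminating null. */
    while (*ua != '\0' && *ua == *ub) {
        ++ua;
        ++ub;
    }
    /* Compare as unsigned values, so '\200' is 128 rather than -128. */
    return (*ua > *ub) - (*ua < *ub);
}
```

Assuming an ASCII-compatible execution character set, compare_bytes("\200", "a") is positive whether or not plain char is signed, whereas the expression '\200' > 'a' evaluated on plain char values depends on the implementation.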

Unlike other integer types, unsigned char has the unique property that

values stored in [ . . . ] objects of type unsigned char shall be represented using a pure binary notation (C Standard, subclause 6.2.6.1 [ISO/IEC 9899:2011])

where a pure binary notation is defined as the following:

A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position. A byte contains CHAR_BIT bits, and the values of type unsigned char range from 0 to 2^CHAR_BIT − 1. (subclause 6.2.6, footnote 49)

That is, objects of type unsigned char have no padding bits and consequently no trap representations. As a result, non-bit-field objects of any type may be copied into an array of unsigned char (for example, via memcpy()) and have their representation examined one byte at a time.
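The following sketch shows one way to examine an object's representation through an array of unsigned char. The helper every_byte_is() is hypothetical, and uint32_t is used only as a convenient example type that has no padding bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if every byte in the object
   representation of *value equals expected, and 0 otherwise. */
int every_byte_is(const uint32_t *value, unsigned char expected) {
    assert(value != NULL);
    unsigned char bytes[sizeof *value];
    /* Copy the object representation into an unsigned char array. */
    memcpy(bytes, value, sizeof bytes);
    for (size_t i = 0; i < sizeof bytes; ++i) {
        if (bytes[i] != expected) {
            return 0;
        }
    }
    return 1;
}
```

Because the check looks at every byte, it does not depend on byte order: a uint32_t holding UINT32_MAX has all bytes equal to the maximum unsigned char value on any conforming implementation that provides uint32_t.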

...

  • Wide characters are used for natural-language character data.

Risk Assessment

Understanding how to represent characters and character strings can eliminate many common programming errors that lead to software vulnerabilities.

| Recommendation | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| STR00-C | Medium | Probable | Low | P12 | L1 |

Automated Detection

| Tool | Version | Checker | Description |
|---|---|---|---|
| Astrée | | | Supported indirectly via MISRA C:2004 rule 6.1 and MISRA C:2012 rule 10.1. |
| CodeSonar | | MISC.NEGCHAR | Negative Character Value |
| LDRA tool suite | | 329 S, 432 S | Fully implemented |
| Parasoft C/C++test | | CERT_C-STR00-a | The plain char type shall be used only for the storage and use of character values |
| RuleChecker | | | Supported indirectly via MISRA C:2004 rule 6.1 and MISRA C:2012 rule 10.1. |
| SonarQube C/C++ Plugin | | S810 | |

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

Bibliography

[ISO/IEC 9899:2011] Subclause 6.2.6, "Representations of Types"
[Seacord 2013] Chapter 2, "Strings"
