...
- Suitable for small integer values.
"Plain" char
- The type of each element of a string literal.
- Used for character data from a limited character set (where signedness has little meaning) as opposed to integer data.
int
- Used for data that can be either EOF (a negative value) or character data interpreted as unsigned char and then converted to int. As a result, it is returned by fgetc(), getc(), getchar(), and ungetc(). It is also accepted by the character-handling functions from <ctype.h> because they might be passed the result of fgetc() and similar functions.
- The type of a character constant; its value is that of a plain char converted to int.
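As a minimal sketch of the first point, the function below keeps the result of fgetc() in an int so that EOF remains distinct from every valid character value. The name count_chars is hypothetical and chosen only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters remaining in stream. */
size_t count_chars(FILE *stream) {
  size_t count = 0;
  int c; /* int, not char: must hold EOF as well as any unsigned char value */

  while ((c = fgetc(stream)) != EOF) {
    count++;
  }
  return count;
}
```

If c were declared as a plain char on an implementation where char is signed, a byte with the value 0xFF could compare equal to EOF and end the loop early.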
Note that the two different ways a character is used as an int (as an unsigned char + EOF or as a plain char converted to int) can lead to confusion. For example, isspace('\200') results in undefined behavior when char is signed.
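One common way to avoid this problem is to convert each plain char to unsigned char before passing it to a <ctype.h> function. The following is a sketch under that assumption; the function name count_leading_space is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns the number of leading white-space
 * characters in the null-terminated string s. */
size_t count_leading_space(const char *s) {
  size_t n = 0;

  /* The cast keeps the argument in the range of unsigned char, which
   * isspace() requires, even when plain char is signed. */
  while (isspace((unsigned char)s[n])) {
    n++;
  }
  return n;
}
```

Passing s[n] without the cast would be undefined for any negative character value other than EOF when char is signed.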
...
- Used internally for string comparison functions even though these functions operate on character data; consequently, the result of a string comparison does not depend on whether plain char is signed.
- Used when the object being manipulated might be of any type, and it is necessary to access all bits of that object, as with fwrite().
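The comparison rule in the first point can be sketched as follows. This is an illustrative stand-in, not the implementation used by any particular C library, and the name compare_bytes is hypothetical.

```c
/* Hypothetical helper: compares two null-terminated strings the way
 * the string comparison functions are specified to, by interpreting
 * each character as unsigned char. Returns a negative, zero, or
 * positive value. */
int compare_bytes(const char *a, const char *b) {
  const unsigned char *ua = (const unsigned char *)a;
  const unsigned char *ub = (const unsigned char *)b;

  /* Advance while the strings agree and the end has not been reached. */
  while (*ua != '\0' && *ua == *ub) {
    ua++;
    ub++;
  }
  /* Sign of the difference between the first pair that differs. */
  return (*ua > *ub) - (*ua < *ub);
}
```

Because the bytes are compared as unsigned char, "\200" orders after "a" whether or not plain char is signed.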
Unlike other integer types, unsigned char has the unique property that
values stored in [...] objects of type unsigned char shall be represented using a pure binary notation (C Standard, subclause 6.2.6.1 [ISO/IEC 9899:2011])
where a pure binary notation is defined as the following:
A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position. A byte contains CHAR_BIT bits, and the values of type unsigned char range from 0 to 2^CHAR_BIT − 1. (subclause 6.2.6, footnote 49)
That is, objects of type unsigned char may have no padding bits and consequently no trap representation. As a result, non-bit-field objects of any type may be copied into an array of unsigned char (for example, via memcpy()) and have their representation examined one byte at a time.
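A minimal sketch of this technique follows. The function name count_nonzero_bytes and the MAX_INSPECT bound are assumptions made for the example, not part of the standard or of this recommendation.

```c
#include <stddef.h>
#include <string.h>

enum { MAX_INSPECT = 64 }; /* hypothetical size limit for this sketch */

/* Hypothetical helper: copies the object representation of the object
 * at obj into an array of unsigned char and returns how many of the
 * examined bytes are nonzero. */
size_t count_nonzero_bytes(const void *obj, size_t size) {
  unsigned char bytes[MAX_INSPECT];
  size_t nonzero = 0;

  if (size > sizeof bytes) {
    size = sizeof bytes; /* examine only the first MAX_INSPECT bytes */
  }
  memcpy(bytes, obj, size);

  /* Every value read through unsigned char is valid, so each byte can
   * be inspected safely. */
  for (size_t i = 0; i < size; i++) {
    if (bytes[i] != 0) {
      nonzero++;
    }
  }
  return nonzero;
}
```

A caller might pass the address and size of any non-bit-field object, for example count_nonzero_bytes(&value, sizeof value).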
...
- Wide characters are used for natural-language character data.
Risk Assessment
Understanding how to represent characters and character strings can eliminate many common programming errors that lead to software vulnerabilities.
Recommendation | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
STR00-C | medium | probable | low | P12 | L1 |
...
Bibliography
[ISO/IEC 9899:2011] | Subclause 6.2.6, "Representations of Types" |
[Seacord 2013] | Chapter 2, "Strings" |
...