Strings are a fundamental concept in software engineering, but they are not a built-in type in C. Null-terminated byte strings (NTBS) consist of a contiguous sequence of characters terminated by and including the first null character and are supported in C as the format used for string literals. The C programming language supports single-byte character strings, multibyte character strings, and wide-character strings. Single-byte and multibyte character strings are both described as null-terminated byte strings, which are also called narrow character strings.
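The relationship between a string literal's storage and its length can be illustrated with a short sketch. The array and function names below are hypothetical, chosen only for illustration; the sketch assumes nothing beyond the standard headers shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* A string literal initializes an array that includes the terminating
   null character, so each array below has four elements. */
static const char narrow[] = "abc";     /* 'a', 'b', 'c', '\0' */
static const wchar_t wide[] = L"abc";   /* L'a', L'b', L'c', L'\0' */

/* Storage size in elements: counts the terminating null character. */
size_t narrow_elements(void) { return sizeof narrow / sizeof narrow[0]; }
size_t wide_elements(void)   { return sizeof wide / sizeof wide[0]; }

/* String length: excludes the terminating null character. */
size_t narrow_length(void) { return strlen(narrow); }
size_t wide_length(void)   { return wcslen(wide); }

/* Illustrative self-check of the relationship described above. */
void check_literal_sizes(void) {
    assert(narrow_elements() == narrow_length() + 1);
    assert(wide_elements() == wide_length() + 1);
}
```

The one-element difference between storage and length is the terminating null character, which is why a buffer intended to hold a string of length n needs room for n + 1 elements.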

...

Null-terminated byte strings are implemented as arrays of characters and are susceptible to the same problems as arrays. As a result, rules and recommendations for arrays should also be applied to null-terminated byte strings.

The C Standard uses the following philosophy for choosing character types, though it is not explicitly stated in one place:

signed char and unsigned char

...

  • Used internally for string comparison functions even though these functions operate on character data. Consequently, the result of a string comparison does not depend on whether plain char is signed.
  • Used for situations when the object being manipulated might be of any type, and it is necessary to access all bits of that object, as with fwrite().
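The first point can be sketched as follows. The helper compare_as_unsigned() is hypothetical and is not how any particular library implements strcmp(); it only mirrors the rule that the differing characters are interpreted as unsigned char. The byte values 0x80 and 0x7f are chosen because they would order differently if compared as signed values.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: compares two strings byte by byte, interpreting
   each byte as unsigned char, and returns -1, 0, or 1. */
int compare_as_unsigned(const char *a, const char *b) {
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;
    while (*ua != '\0' && *ua == *ub) {
        ++ua;
        ++ub;
    }
    return (*ua > *ub) - (*ua < *ub);
}

/* Reduces any comparison result to -1, 0, or 1. */
int sign_of(int value) {
    return (value > 0) - (value < 0);
}

/* Illustrative check: 0x80 sorts after 0x7f whether or not plain char
   is signed, because the comparison is done on unsigned char values. */
void check_comparison(void) {
    const char high[] = "\x80";
    const char low[] = "\x7f";
    assert(sign_of(strcmp(high, low)) == 1);
    assert(compare_as_unsigned(high, low) == 1);
}
```

A comparison written directly on plain char values could order these two strings differently on an implementation where char is signed, which is the portability problem the unsigned char interpretation avoids.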

...

values stored in [...] objects of type unsigned char shall be represented using a pure binary notation. (C Standard, Section 6.2.6.1 [ISO/IEC 9899:2011])

where a pure binary notation is defined as the following:

A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position. A byte contains CHAR_BIT bits, and the values of type unsigned char range from 0 to 2^CHAR_BIT − 1. (Section 6.2.6.1, fn. 49)

That is, objects of type unsigned char cannot have padding bits and consequently cannot have trap representations. As a result, non-bit-field objects of any type may be copied into an array of unsigned char (for example, via memcpy()) and have their representation examined one byte at a time.
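A minimal sketch of copying an object's representation into an array of unsigned char is shown here. The function name copy_representation() and its capacity check are assumptions made for illustration, not part of any standard interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of the object at
   obj into bytes. Returns the number of bytes copied, or 0 if the
   destination is too small to hold the whole representation. */
size_t copy_representation(const void *obj, size_t size,
                           unsigned char *bytes, size_t capacity) {
    if (size > capacity) {
        return 0;
    }
    memcpy(bytes, obj, size);
    return size;
}

/* Illustrative check: the bytes can be inspected one at a time and,
   when copied back, reproduce the original value. */
void check_round_trip(void) {
    double original = 1.5;
    double restored = 0.0;
    unsigned char bytes[sizeof original];

    size_t copied = copy_representation(&original, sizeof original,
                                        bytes, sizeof bytes);
    assert(copied == sizeof original);

    memcpy(&restored, bytes, sizeof restored);
    assert(restored == original);
}
```

Each element of bytes can be read safely regardless of the source object's type. The values of the individual bytes depend on the implementation's representation of that type, so the sketch checks only the round trip.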

...

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

...

Bibliography

...

[...] Section 6.2.6, "Representations of Types"
[Seacord 2005a] Chapter 2, "Strings"