The C function strtok() is a string tokenization function that takes two arguments: an initial string to be parsed and a const-qualified string of delimiter characters. It returns a pointer to the first character of a token, or a null pointer if there is no token.
The first time strtok() is called, the string to be parsed and the delimiter are passed as arguments. The strtok() function parses the string up to the first instance of the delimiter character, replaces that character in place with a null byte ('\0'), and returns the address of the first character in the token. Subsequent calls to strtok(), made with a null pointer as the first argument, begin parsing immediately after the most recently placed null character.
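The in-place replacement described above can be observed directly on a writable buffer. The following is a minimal sketch, not part of the original recommendation; the helper name count_tokens_in_place is hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only.
 * Tokenizes buf in place with strtok() and returns the number of tokens.
 * After the call, each delimiter that ended a token has been overwritten
 * with '\0', so buf no longer holds the original string.
 */
size_t count_tokens_in_place(char *buf, const char *delims) {
    size_t count = 0;
    /* The first call passes the buffer; later calls pass NULL to continue. */
    for (char *token = strtok(buf, delims);
         token != NULL;
         token = strtok(NULL, delims)) {
        ++count;
    }
    return count;
}
```

Calling this helper on a buffer initialized as char buf[] = "a:b:c" leaves the bytes a, '\0', b, '\0', c, '\0' in the buffer, so printing buf as a string afterward shows only the first token.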
Because strtok() modifies the initial string to be parsed, the string is subsequently unsafe and cannot be used in its original form. If you need to preserve the original string, copy it into a buffer and pass the address of the buffer to strtok() instead of the original string.
...
Noncompliant Code Example
In this example, the strtok()
function is used to parse the first argument into colon-delimited tokens; it outputs each word from the string on a new line. Assume that PATH
is "/usr/bin:/bin:/usr/sbin:/sbin"
.
```c
char *token;
char *path = getenv("PATH");

token = strtok(path, ":");
puts(token);

while (token = strtok(0, ":")) {
  puts(token);
}

printf("PATH: %s\n", path);
/* PATH is now just "/usr/bin" */
```
After the while loop ends, path is modified as follows: "/usr/bin\0/bin\0/usr/sbin\0/sbin\0". This is an issue because the local path variable becomes /usr/bin and because the environment variable PATH has been unintentionally changed, which can have unintended consequences. (See ENV30-C. Do not modify the object referenced by the return value of certain functions.)
Compliant Solution
In this compliant solution, the string being tokenized is copied into a temporary buffer that is not referenced after the call to strtok():
```c
char *token;
const char *path = getenv("PATH");
/* PATH is something like "/usr/bin:/bin:/usr/sbin:/sbin" */

char *copy = (char *)malloc(strlen(path) + 1);
if (copy == NULL) {
  /* Handle error */
}
strcpy(copy, path);

token = strtok(copy, ":");
puts(token);

while (token = strtok(0, ":")) {
  puts(token);
}

free(copy);
copy = NULL;

printf("PATH: %s\n", path);
/* PATH is still "/usr/bin:/bin:/usr/sbin:/sbin" */
```
Another possibility is to provide your own implementation of strtok() that does not modify the initial arguments.
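One way such a replacement might look is sketched below. It is an illustration under stated assumptions, not an implementation prescribed by this recommendation: the name next_token and its cursor-and-length interface are hypothetical. Instead of writing null bytes, it reports each token as a start pointer plus a length, using the standard strspn() and strcspn() functions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a standard library function.
 * Finds the next token in the read-only string *cursor without writing to it.
 * On success, returns a pointer to the start of the token, stores the token
 * length in *len, and advances *cursor past the token.
 * Returns NULL when no tokens remain.
 */
const char *next_token(const char **cursor, const char *delims, size_t *len) {
    /* Skip any leading delimiters, as strtok() does. */
    const char *start = *cursor + strspn(*cursor, delims);
    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    /* The token runs up to the next delimiter or the end of the string. */
    *len = strcspn(start, delims);
    *cursor = start + *len;
    return start;
}

/*
 * Example use (tokens are not null-terminated, so print with a precision):
 *
 *   const char *cursor = path;
 *   size_t len;
 *   const char *tok;
 *   while ((tok = next_token(&cursor, ":", &len)) != NULL) {
 *       printf("%.*s\n", (int)len, tok);
 *   }
 */
```

Because the tokens are not null-terminated, callers must carry the length alongside each pointer. The caller-owned cursor also avoids the hidden static state that strtok() keeps between calls.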
Risk Assessment
The Linux Programmer's Manual (man) page on strtok(3) [Linux 2008] states:
Never use this function. This function modifies its first argument. The identity of the delimiting character is lost. This function cannot be used on constant strings.
The improper use of strtok() is likely to result in truncated data, producing unexpected results later in program execution.
Recommendation | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
STR06-C | Medium | Likely | Medium | P12 | L1 |
Automated Detection
Tool | Version | Checker | Description |
---|---|---|---|
CodeSonar | | (customization) | Users who wish to avoid using strtok() entirely can add a custom check for all uses of strtok(). |
Compass/ROSE | | | |
Helix QAC | | C5007 | |
LDRA tool suite | | 602 S | Enhanced Enforcement |
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
References
[ISO/IEC 9899-1999] Section 7.21.5.8, "The strtok function"
[Unix Man page] strtok(3)
Related Guidelines
SEI CERT C++ Coding Standard | VOID STR06-CPP. Do not assume that strtok() leaves the parse string unchanged |
MITRE CWE | CWE-464, Addition of data structure sentinel |
Bibliography
...