The C99 function strtok() is a string tokenization function that takes two arguments: an initial string to be parsed and a const-qualified string of delimiter characters. It returns a pointer to the first character of a token, or a null pointer if no token remains.

The first time you call strtok(), you pass the string to be parsed into tokens and the delimiter string. The strtok() function skips any leading delimiter characters, parses the string up to the next instance of a delimiter character, replaces that character in place with a null byte ('\0'), and returns the address of the first character in the token. Subsequent calls to strtok() pass a null pointer as the first argument and begin parsing immediately after the most recently placed null character.
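
The in-place replacement can be observed directly. The following is a minimal sketch, not taken from the standard: count_tokens_in_place() is a hypothetical helper name, and the ':' delimiter is an assumption chosen to match the PATH examples on this page.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: tokenizes buf in place on ':' with strtok(),
 * prints each token, and returns the number of tokens found.
 * buf must point to a modifiable array, never to a string literal.
 */
static size_t count_tokens_in_place(char *buf) {
  size_t count = 0;
  char *tok;

  /* First call passes the buffer; later calls pass NULL to continue. */
  for (tok = strtok(buf, ":"); tok != NULL; tok = strtok(NULL, ":")) {
    puts(tok);
    ++count;
  }
  return count;
}
```

Called on a modifiable array holding "/usr/bin:/bin", the helper reports two tokens, and afterwards the array reads as just "/usr/bin" because the first ':' has been overwritten with '\0'.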

Because strtok() modifies the string it parses, that string can no longer be relied on in its original form once tokenization has begun. If you need to preserve the original string, copy it into a buffer and pass the address of the buffer to strtok() instead of the original string.

Non-Compliant Code Example

char *path = getenv("PATH"); 
/* PATH is something like "/usr/bin:/bin:/usr/sbin:/sbin" */
char *token; 
 
token = strtok(path, ":"); 
puts(token); 
 
while (token = strtok(0, ":")) { 
  puts(token); 
} 
 
printf("PATH: %s\n", path); 
/* PATH is now just "/usr/bin" */

In this example, the strtok() function is used to parse the value of PATH into colon-delimited tokens; it outputs each directory from the string on a new line. However, after the while loop ends, the string that path points to will have been modified to look like this: "/usr/bin\0/bin\0/usr/sbin\0/sbin\0". This is an issue on several levels. If we check our local path variable, we will only see /usr/bin now. Even worse, we have modified the string returned by getenv(), which C99 says the program shall not modify; in practice this can change the environment variable PATH itself and cause unintended results.

Compliant Solution

One possible solution is to copy the string being tokenized into a temporary buffer which isn't referenced after the calls to strtok():

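char *path = getenv("PATH");
/* PATH is something like "/usr/bin:/bin:/usr/sbin:/sbin" */
if (path == NULL) {
  /* handle error: PATH is not set */
}

char *copy = malloc(strlen(path) + 1);
if (copy == NULL) {
  /* handle error: allocation failed */
}
strcpy(copy, path);
char *token;

/* Tokenize the copy; check for NULL before every use of token */
token = strtok(copy, ":");
while (token != NULL) {
  puts(token);
  token = strtok(NULL, ":");
}
free(copy);
copy = NULL;

printf("PATH: %s\n", path);
/* PATH is still "/usr/bin:/bin:/usr/sbin:/sbin" */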

Another possibility is to provide your own implementation of strtok() which does not modify the initial arguments.
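
As an illustration of that second approach, here is a minimal sketch of a tokenizer that never writes to its input. The names next_token() and print_tokens() are hypothetical and not part of any standard library. The sketch assumes the caller is willing to work with a pointer-and-length pair instead of a null-terminated token, and, like strtok(), it skips empty fields between consecutive delimiters.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: finds the next token in the string at *cursor
 * without modifying it. On success, returns a pointer to the token's
 * first character, stores the token length in *len, and advances
 * *cursor past the token. Returns NULL when no tokens remain.
 * The token is NOT null-terminated; always use *len.
 */
static const char *next_token(const char **cursor, const char *delims,
                              size_t *len) {
  /* Skip any leading delimiter characters. */
  const char *start = *cursor + strspn(*cursor, delims);
  if (*start == '\0') {
    *cursor = start;
    return NULL;
  }
  /* The token runs up to the next delimiter or the end of the string. */
  *len = strcspn(start, delims);
  *cursor = start + *len;
  return start;
}

/* Hypothetical usage: prints each token on its own line, returns the count. */
static size_t print_tokens(const char *s, const char *delims) {
  const char *cursor = s;
  const char *tok;
  size_t len = 0;
  size_t count = 0;

  while ((tok = next_token(&cursor, delims, &len)) != NULL) {
    /* Cast is safe only for tokens shorter than INT_MAX characters. */
    printf("%.*s\n", (int)len, tok);
    ++count;
  }
  return count;
}
```

Because the input is only read, print_tokens(getenv("PATH"), ":") could be called (after a null check on the getenv() result) without copying the string and without altering the environment. The trade-off is that callers needing a null-terminated token must copy it out themselves.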

Risk Assessment

To quote the Linux Programmer's Manual (man) page on strtok(3):

Never use this function. This function modifies its first argument. The identity of the delimiting character is lost. This function cannot be used on constant strings.

However, improper strtok() use will probably only result in truncated data, producing unexpected results later in program execution.

Rule     Severity  Likelihood    Remediation Cost  Priority  Level
STR06-A  1 (low)   2 (probable)  3 (low)           P6        L2

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

References

[ISO/IEC 9899-1999:TC2] Section 7.21.5.8, "The strtok function"
[Unix Man page] strtok(3)
