The C99 function strtok() is a string tokenization function that takes two arguments: the string to be parsed and a const-qualified string of delimiter characters. It returns a pointer to the first character of the next token, or a null pointer when no tokens remain.

The first time strtok() is called, the string to be parsed into tokens and the delimiter string are passed as arguments. The strtok() function parses the string up to the first instance of a delimiter character, replaces that character in place with a null byte ('\0'), and returns the address of the first character in the token. Subsequent calls to strtok(), made with a null pointer as the first argument, begin parsing immediately after the most recently placed null character.
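The call sequence described above can be sketched as follows. This is a minimal illustration, not part of the original guideline: the helper name split_in_place and its parameters are hypothetical, and the buffer passed in is assumed to be writable.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a standard library function): splits buf in place
 * on any character in delims and stores up to max_tokens token pointers in
 * tokens. Returns the number of tokens stored.
 * Assumption: buf is writable, because strtok() overwrites delimiters
 * with '\0'.
 */
size_t split_in_place(char *buf, const char *delims,
                      char **tokens, size_t max_tokens) {
  size_t count = 0;

  /* First call: pass the string to be parsed. */
  char *token = strtok(buf, delims);

  while (token != NULL && count < max_tokens) {
    tokens[count++] = token;
    /* Subsequent calls: pass a null pointer to continue in the same string. */
    token = strtok(NULL, delims);
  }
  return count;
}
```

Called on a writable array such as char buf[] = "a:bc:def" with ":" as the delimiter string, the helper stores pointers to "a", "bc", and "def". Every pointer refers to memory inside buf itself, which is why the original contents of buf are lost.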

Because strtok() modifies its argument, the string is subsequently unsafe and cannot be used in its original form. If you need to preserve the original string, copy it into a buffer and pass the address of the buffer to strtok() instead of the original string.

Non-Compliant Code Example

In this example, the strtok() function is used to parse its first argument, the value of PATH, into colon-delimited tokens; it outputs each token on a new line. Assume that PATH is "/usr/bin:/bin:/usr/sbin:/sbin".

char *path = getenv("PATH");
/* PATH is something like "/usr/bin:/bin:/usr/sbin:/sbin" */
char *token;

/* Noncompliant: strtok() writes null bytes into the environment string */
token = strtok(path, ":");
puts(token);

while ((token = strtok(NULL, ":")) != NULL) {
  puts(token);
}

printf("PATH: %s\n", path);
/* Prints only "/usr/bin": the first delimiter is now a null byte */

After the while loop ends, path has been modified to look like this: "/usr/bin\0/bin\0/usr/sbin\0/sbin\0". This is a problem on two levels. First, the local path variable now reads as only /usr/bin. Worse, the string returned by getenv() belongs to the environment, so the program has unintentionally changed the environment variable PATH, which could cause unintended results.
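The truncation can be observed directly. The sketch below is illustrative only: the helper name visible_length_after_tokenizing is hypothetical, and it is meant to be called on a local writable array rather than on the string returned by getenv(), so that the environment is left alone.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: tokenizes buf completely with strtok() and returns
 * the length that strlen() reports for buf afterward.
 * Assumption: buf is a writable, null-terminated array.
 */
size_t visible_length_after_tokenizing(char *buf, const char *delims) {
  char *token = strtok(buf, delims);
  while (token != NULL) {
    token = strtok(NULL, delims);
  }
  /* strlen() stops at the first null byte, which now ends the first token. */
  return strlen(buf);
}
```

For a 29-character buffer holding "/usr/bin:/bin:/usr/sbin:/sbin", the reported length drops to 8, the length of "/usr/bin". The remaining tokens are still in memory, but they are no longer reachable through ordinary string functions applied to the start of the buffer.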

Compliant Solution

In this solution, the string being tokenized is copied into a temporary buffer that is not referenced after the calls to strtok():

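char *path = getenv("PATH");
/* PATH is something like "/usr/bin:/bin:/usr/sbin:/sbin" */
if (path == NULL) {
  /* Handle error: PATH is not set */
}

char *copy = malloc(strlen(path) + 1);
if (copy == NULL) {
  /* Handle error: allocation failed */
}
strcpy(copy, path);

/* Tokenize the copy; the environment string is never written to */
char *token = strtok(copy, ":");
while (token != NULL) {
  puts(token);
  token = strtok(NULL, ":");
}

/* The token pointers refer into copy, so release it only after the loop */
free(copy);
copy = NULL;

printf("PATH: %s\n", path);
/* PATH is still "/usr/bin:/bin:/usr/sbin:/sbin" */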
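
The same copy-then-tokenize approach can be wrapped in a reusable function. The following is a sketch under stated assumptions rather than a definitive implementation: the name count_tokens_preserving and the choice to return -1 on allocation failure are hypothetical and do not come from the guideline.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: counts the tokens in s without modifying s.
 * Tokenization runs on a heap-allocated copy that is freed before returning.
 * Returns -1 if the copy cannot be allocated (an assumed convention).
 */
long count_tokens_preserving(const char *s, const char *delims) {
  size_t size = strlen(s) + 1;
  char *copy = malloc(size);
  if (copy == NULL) {
    return -1;
  }
  memcpy(copy, s, size);

  long count = 0;
  for (char *token = strtok(copy, delims); token != NULL;
       token = strtok(NULL, delims)) {
    ++count;
  }

  free(copy);
  return count;
}
```

Because the parameter is const-qualified and only the copy is passed to strtok(), the function can be called safely on the result of getenv() or on a string literal.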

...