Summary

The ISO/IEC 9899 function strtok() is a string tokenization function that takes two arguments: the string to be parsed and a const-qualified string of delimiter characters. It returns a pointer to the first character of the next token, or a null pointer when no tokens remain.

The first time you call strtok(), you pass the string to be parsed into tokens and the delimiter string. strtok() skips any leading delimiters, parses the string up to the next delimiter character, replaces that character in place with a null byte ('\0'), and returns the address of the first character of the token. Subsequent calls pass NULL as the first argument and resume parsing immediately after the most recently placed null character.
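The calling sequence described above can be sketched as follows. This is a minimal illustration, not part of the original example; count_tokens() is a hypothetical helper name, and the caller is assumed to pass a writable buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * count_tokens() is a hypothetical helper used only for illustration.
 * It walks a writable string with strtok() and returns the token count.
 * The buffer is modified: each delimiter that ends a token becomes '\0'.
 */
size_t count_tokens(char *text, const char *delims) {
    size_t count = 0;

    /* First call: pass the string itself. */
    char *token = strtok(text, delims);

    while (token != NULL) {
        count++;
        /* Later calls: pass NULL to continue in the same string. */
        token = strtok(NULL, delims);
    }
    return count;
}
```

After the function returns, the buffer passed as text reads as its first token only, because the delimiter that followed that token has been overwritten.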

Because strtok() modifies its first argument, the string can no longer be used in its original form afterward. Avoid strtok() if possible. If it is necessary, copy the original string into a buffer and pass the buffer to strtok() instead of the original string.

Non-Compliant Code Example

    char string[] = "Hello secure coding wiki!";
    char *token;

    /* The second argument is a string of delimiters, not a character. */
    token = strtok(string, " ");
    printf("%s\n", token);

    while ( (token = strtok(NULL, " ")) != NULL ) {
        printf("%s\n", token);
    }

    /* further string manipulation on string[] fails */

In this example, the strtok() function is used to parse the first argument into space-delimited tokens; it outputs each word from the string on a new line. However, after the while loop ends, string[] will have been modified to look like this: "Hello\0secure\0coding\0wiki!\0". Any further manipulation of string[] that assumes it is still whole will see only "Hello" instead of the expected string value.

Compliant Solutions

One possible solution is to copy the string being tokenized into a temporary buffer which isn't referenced after the calls to strtok():

    char string[] = "Hello secure coding wiki!";
    char buff[256];
    char *token;

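    /* Copy at most sizeof(buff) - 1 bytes and terminate explicitly. */
    strncpy(buff, string, sizeof(buff) - 1);
    buff[sizeof(buff) - 1] = '\0';
    token = strtok(buff, " ");
    printf("%s\n", token);

    while ( (token = strtok(NULL, " ")) != NULL ) {
        printf("%s\n", token);
    }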

    /* further string manipulation on string[] succeeds */

Another possibility is to provide your own implementation of strtok() which does not modify the initial arguments.
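One way such a replacement might look is sketched below. It is an assumption about a reasonable design, not a standard interface: next_token() is a hypothetical name, and it reports each token as a start pointer plus a length instead of writing null bytes into the string.

```c
#include <stddef.h>
#include <string.h>

/*
 * next_token() is a hypothetical helper, not a standard function.
 * It finds the next token in *cursor without writing to the string.
 * On success it returns a pointer to the start of the token, stores
 * the token length in *len, and advances *cursor past the token.
 * It returns NULL when no tokens remain.
 */
const char *next_token(const char **cursor, const char *delims, size_t *len) {
    /* Skip any leading delimiters. */
    const char *start = *cursor + strspn(*cursor, delims);

    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }

    /* Measure up to the next delimiter or the end of the string. */
    *len = strcspn(start, delims);
    *cursor = start + *len;
    return start;
}
```

Because the tokens are not null-terminated, a caller would print one with a bounded format such as printf("%.*s\n", (int)len, token). The original string, including string literals, is left untouched, and the caller keeps its own cursor instead of relying on hidden state.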

To quote the Linux Programmer's Manual (man) page on strtok(3):

Never use this function. This function modifies its first
argument. The identity of the delimiting character is
lost. This function cannot be used on constant strings.

References

Unix Man page strtok(3)

 A library function that enters the namespace from a linked-in library can have the same name as a #define'd macro. To prevent such a naming conflict, parenthesize the name of the library function when it is called.
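The parenthesized call can be sketched as follows. The C standard permits a library function to be additionally provided as a function-like macro, and enclosing the name in parentheses selects the real function. first_token() is a hypothetical wrapper introduced only for this illustration.

```c
#include <string.h>

/*
 * first_token() is a hypothetical wrapper used only for illustration.
 * Writing (strtok) instead of strtok keeps the preprocessor from
 * expanding any function-like macro of the same name, so the actual
 * library function is called.
 */
char *first_token(char *text, const char *delims) {
    return (strtok)(text, delims);
}
```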
