The C function strtok() is a string tokenization function that takes two arguments: an initial string to be parsed and a const-qualified string of delimiter characters. It returns a pointer to the first character of a token or a null pointer if there is no token.

The first time strtok() is called, it is passed the string to be parsed into tokens and the delimiter. The strtok() function parses the string up to the first instance of a delimiter character, replaces that character in place with a null byte ('\0'), and returns the address of the first character in the token. Subsequent calls to strtok() begin parsing immediately after the most recently placed null character.

Because strtok() modifies the initial string to be parsed, the string is subsequently unsafe and cannot be used in its original form. If you need to preserve the original string, copy it into a buffer and pass the address of the buffer to strtok() instead of the original string.
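The in-place modification described above can be observed directly. The following is a minimal sketch, not code from this guideline: count_tokens_in_place() is a hypothetical helper name, and it assumes the caller passes a modifiable, null-terminated array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration): tokenizes buf in
 * place with strtok() and returns the number of tokens found.
 * buf must be a modifiable, null-terminated array; it is overwritten.
 */
size_t count_tokens_in_place(char *buf, const char *delims) {
  assert(buf != NULL && delims != NULL);

  size_t count = 0;
  /* The first call passes the buffer; later calls pass NULL to continue. */
  for (char *tok = strtok(buf, delims); tok != NULL;
       tok = strtok(NULL, delims)) {
    ++count;
  }
  return count;
}
```

If this is called on an array initialized with "a:bc:def" and the delimiter ":", each colon in the array is overwritten with a null byte, so any later code that treats the array as a string sees only "a".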

...

Noncompliant Code Example

In this example, the strtok() function is used to parse the first argument into colon-delimited tokens; it outputs each word from the string on a new line. Assume that PATH is "/usr/bin:/usr/sbin:/sbin".

Code Block
bgColor#FFCCCC
langc

char *token;
char *path = getenv("PATH");

token = strtok(path, ":");
puts(token);

while (token = strtok(0, ":")) {
  puts(token);
}

printf("PATH: %s\n", path);
/* PATH is now just "/usr/bin" */

After the loop ends, path is modified as follows: "/usr/bin\0/usr/sbin\0/sbin\0". This is an issue because the local path variable now reads as just /usr/bin and because the environment variable PATH has been unintentionally changed, which can have unintended consequences. (See ENV30-C. Do not modify the object referenced by the return value of certain functions.)

Compliant Solution

In this compliant solution, the string being tokenized is copied into a temporary buffer that is not referenced after the calls to strtok():

Code Block
bgColor#ccccff
langc
char *token;
const char *path = getenv("PATH");
/* PATH is something like "/usr/bin:/bin:/usr/sbin:/sbin" */

char *copy = (char *)malloc(strlen(path) + 1);
if (copy == NULL) {
  /* Handle error */
}
strcpy(copy, path);
token = strtok(copy, ":");
puts(token);

while (token = strtok(0, ":")) {
  puts(token);
}

free(copy);
copy = NULL;

printf("PATH: %s\n", path);
/* PATH is still "/usr/bin:/bin:/usr/sbin:/sbin" */

Another possibility is to provide your own implementation of strtok() that does not modify the initial arguments.
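One way such a replacement might look is sketched below. This is an illustrative assumption rather than an implementation endorsed by this guideline: the name next_token() and its cursor-plus-length interface are hypothetical. It relies only on the standard strspn() and strcspn() functions and never writes to the input string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical non-modifying tokenizer (name and interface are
 * assumptions, not taken from this guideline).
 *
 * *cursor points at the unparsed remainder of the string. On success the
 * function returns a pointer to the start of the next token, stores the
 * token length in *len, and advances *cursor past the token. It returns
 * NULL when no tokens remain. The input is never written to, so tokens
 * are NOT null-terminated: callers must use *len to bound them.
 */
const char *next_token(const char **cursor, const char *delims, size_t *len) {
  assert(cursor != NULL && *cursor != NULL && delims != NULL && len != NULL);

  /* Skip any leading delimiter characters. */
  const char *start = *cursor + strspn(*cursor, delims);
  if (*start == '\0') {
    *cursor = start;
    return NULL; /* nothing left to parse */
  }

  /* The token runs up to the next delimiter or the end of the string. */
  *len = strcspn(start, delims);
  *cursor = start + *len;
  return start;
}
```

A caller would initialize a cursor to the start of the string and loop until next_token() returns NULL, printing each token with a bounded format such as printf("%.*s\n", (int)len, tok). Because the state lives in the caller's cursor rather than in hidden static storage, this sketch also avoids interference between interleaved tokenizations.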

Risk Assessment

The Linux Programmer's Manual (man) page on strtok(3) [Linux 2008] states:

Never use this function. This function modifies its first
argument. The identity of the delimiting character is
lost. This function cannot be used on constant strings.

The improper use of strtok() is likely to result in truncated data, producing unexpected results later in program execution.

Recommendation  Severity  Likelihood  Remediation Cost  Priority  Level
STR06-C         Medium    Likely      Medium            P12       L1

Automated Detection

Tool             Version  Checker          Description
CodeSonar                 (customization)  Users who wish to avoid using strtok() entirely can add a custom check for all uses of strtok().
Compass/ROSE
Helix QAC                 C5007
LDRA tool suite           602 S            Enhanced Enforcement

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

Bibliography


...

Library functions that enter the namespace from linked-in libraries can have the same name as a #define'd macro; to prevent such a naming conflict, parenthesize the name of the library function when it is called.