Copying data to a buffer that is not large enough to hold that data results in a buffer overflow. Buffer overflows occur frequently when manipulating strings [Seacord 2013]. To prevent such errors, either limit copies through truncation or, preferably, ensure that the destination is of sufficient size to hold the character data to be copied and the null-termination character (see STR03-C. Do not inadvertently truncate a string).

When strings live on the heap, this rule is a specific instance of MEM35-C. Allocate sufficient memory for an object.  Because strings are represented as arrays of characters, this rule is related to both ARR30-C. Do not form or use out-of-bounds pointers or array subscripts and ARR38-C. Guarantee that library functions do not form invalid pointers.

...

This noncompliant code example demonstrates an off-by-one error [Dowd 2006]. The loop copies data from src to dest. However, because the loop does not account for the null-termination character, it may be incorrectly written 1 byte past the end of dest.

Code Block
bgColor#FFCCCC
langc
#include <stddef.h>
 
enum { ARRAY_SIZE = 32 };
 
void func(void) {
  char dest[ARRAY_SIZE];
  char src[ARRAY_SIZE];
  size_t i;
 
  for (i = 0; src[i] && (i < sizeof(dest)); ++i) {
    dest[i] = src[i];
  }
  dest[i] = '\0';
}
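
One way to avoid the off-by-one error is to stop the loop one element early so that the terminator always fits. The following is a minimal sketch under that assumption, not necessarily the rule's own compliant solution; the helper name copy_bounded and its parameters are introduced here purely for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: copies at most dest_size - 1 characters from src
 * into dest and null-terminates the result whenever dest_size > 0.
 * dest_size must be the true size of the array that dest points to.
 * Returns the number of characters copied, excluding the terminator. */
size_t copy_bounded(char *dest, size_t dest_size, const char *src) {
  size_t i;

  if (dest_size == 0) {
    return 0;  /* No room even for the terminator */
  }
  /* Test the bound before reading src[i], and reserve one element */
  for (i = 0; i < dest_size - 1 && src[i]; ++i) {
    dest[i] = src[i];
  }
  dest[i] = '\0';
  return i;
}
```

A caller would pass sizeof(dest) for an array destination, for example copy_bounded(dest, sizeof(dest), src). A return value smaller than the source length indicates truncation, which STR03-C asks callers to treat as a deliberate decision.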

...

The gets() function, which was deprecated in the C99 Technical Corrigendum 3 and removed from C11, is inherently unsafe and should never be used because it provides no way to control how much data is read into a buffer from stdin. This noncompliant code example assumes that gets() will not read more than BUFFER_SIZE - 1 characters from stdin. This is an invalid assumption, and the resulting operation can result in a buffer overflow.

The gets() function reads characters from the stdin into a destination array until end-of-file is encountered or a newline character is read. Any newline character is discarded, and a null character is written immediately after the last character read into the array.

...

The fgets() function reads, at most, one less than the specified number of characters from a stream into an array. This solution is compliant because the number of characters copied from stdin to buf cannot exceed the allocated memory:

Code Block
bgColor#ccccff
langc
#include <stdio.h>
#include <string.h>
 
enum { BUFFERSIZE = 32 };
 
void func(void) {
  char buf[BUFFERSIZE];
  int ch;

  if (fgets(buf, sizeof(buf), stdin)) {
    /* fgets() succeeded; scan for newline character */
    char *p = strchr(buf, '\n');
    if (p) {
      *p = '\0';
    } else {
      /* Newline not found; flush stdin to end of line */
      while ((ch = getchar()) != '\n' && ch != EOF)
        ;
      if (ch == EOF && !feof(stdin) && !ferror(stdin)) {
        /* Character resembles EOF; handle error */
      }
    }
  } else {
    /* fgets() failed; handle error */
  }
}

The fgets() function is not a strict replacement for the gets() function because fgets() retains the newline character (if read) and may also return a partial line. It is possible to use fgets() to safely process input lines too long to store in the destination array, but this is not recommended for performance reasons. Consider using one of the following compliant solutions when replacing gets().

Compliant Solution (gets_s())

The gets_s() function reads, at most, one less than the number of characters specified from the stream pointed to by stdin into an array.

The C Standard, Annex K, subclause K.3.5.4.1 [ISO/IEC 9899:2011], states:

No additional characters are read after a new-line character (which is discarded) or after end-of-file. The discarded new-line character does not count towards number of characters read. A null character is written immediately after the last character read into the array.

...

The getline() function is similar to the fgets() function but can dynamically allocate memory for the input buffer. If passed a null pointer, getline() dynamically allocates a buffer of sufficient size to hold the input. If passed a pointer to dynamically allocated storage that is too small to hold the contents of the string, the getline() function resizes the buffer, using realloc(), rather than truncating the input. If successful, the getline() function returns the number of characters read, which can be used to determine if the input has any null characters before the newline. The getline() function works only with dynamically allocated buffers. Allocated memory must be explicitly deallocated by the caller to avoid memory leaks (see MEM31-C. Free dynamically allocated memory when no longer needed).

Code Block
bgColor#ccccff
langc
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
void func(void) {
  int ch;
  size_t buffer_size = 32;
  char *buffer = malloc(buffer_size);
 
  if (!buffer) {
    /* Handle error */
    return;
  }

  /* A declaration cannot appear inside an expression; declare size first */
  ssize_t size = getline(&buffer, &buffer_size, stdin);
  if (size == -1) {
    /* Handle error */
  } else {
    char *p = strchr(buffer, '\n');
    if (p) {
      *p = '\0';
    } else {
      /* Newline not found; flush stdin to end of line */
      while ((ch = getchar()) != '\n' && ch != EOF)
        ;
      if (ch == EOF && !feof(stdin) && !ferror(stdin)) {
        /* Character resembles EOF; handle error */
      }
    }
  }
  free (buffer);
}

Note that the getline() function uses an in-band error indicator, in violation of the recommendation ERR02-C. Avoid in-band error indicators.
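
The remark that the return value of getline() can reveal null characters in the input can be made concrete with a small check. This is only a sketch: the helper name line_has_embedded_nul is hypothetical, and it assumes the caller has already verified that getline() did not return -1 and has converted the result to size_t.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: line is a buffer filled by getline() and nread is
 * its non-negative return value converted to size_t. getline()
 * null-terminates the buffer after the characters it read, so strlen()
 * comes up short only if the input itself contained a null character. */
bool line_has_embedded_nul(const char *line, size_t nread) {
  return strlen(line) < nread;
}
```

When this check reports an embedded null character, string functions such as strchr() see only the part of the line before it, so the newline search in the solution above may not find a newline that is actually present.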

Noncompliant Code Example (getchar())

Reading one character at a time provides more flexibility in controlling behavior, though with additional performance overhead. This noncompliant code example uses the getchar() function to read one character at a time from stdin instead of reading the entire line at once. The stdin stream is read until end-of-file is encountered or a newline character is read. Any newline character is discarded, and a null character is written immediately after the last character read into the array. Similar to the noncompliant code example that invokes gets(), there are no guarantees that this code will not result in a buffer overflow.

Code Block
bgColor#FFCCCC
langc
#include <stdio.h>
 
enum { BUFFERSIZE = 32 };
 
void func(void) {
  char buf[BUFFERSIZE];
  char *p;
  int ch;
  p = buf;
  while ((ch = getchar()) != '\n' && ch != EOF) {
    *p++ = (char)ch;
  }
  *p++ = 0;
  if (ch == EOF) {
      /* Handle EOF or error */
  }
}

After the loop ends, if ch == EOF, the loop has read through to the end of the stream without encountering a newline character, or a read error occurred before the loop encountered a newline character. To conform to FIO34-C. Distinguish between characters read from a file and EOF or WEOF, the error-handling code must verify that an end-of-file or error has occurred by calling feof() or ferror().

Compliant Solution (getchar())

In this compliant solution, characters are no longer copied to buf once index == BUFFERSIZE - 1, leaving room to null-terminate the string. The loop continues to read characters until the end of the line, the end of the file, or an error is encountered. When chars_read > index, the input string has been truncated.

Code Block
bgColor#ccccff
langc
#include <stdio.h>
 
enum { BUFFERSIZE = 32 };
 
void func(void) {
  char buf[BUFFERSIZE];
  int ch;
  size_t index = 0;
  size_t chars_read = 0;
 
  while ((ch = getchar()) != '\n' && ch != EOF) {
    if (index < sizeof(buf) - 1) {
      buf[index++] = (char)ch;
    }
    chars_read++;
  }
  buf[index] = '\0';  /* Terminate string */
  if (ch == EOF) {
    /* Handle EOF or error */
  }
  if (chars_read > index) {
    /* Handle truncation */
  }
}

Noncompliant Code Example (fscanf())

...

Code Block
bgColor#ffcccc
langc
#include <stdio.h>
 
enum { BUF_LENGTH = 1024 };
 
void get_data(void) {
  char buf[BUF_LENGTH];
  if (1 != fscanf(stdin, "%s", buf)) {
    /* Handle error */
  }

  /* Rest of function */
}

Compliant Solution (fscanf())

...

Code Block
bgColor#ccccff
langc
#include <stdio.h>
 
enum { BUF_LENGTH = 1024 };
 
void get_data(void) {
  char buf[BUF_LENGTH];
  if (1 != fscanf(stdin, "%1023s", buf)) {
    /* Handle error */
  }

  /* Rest of function */
}
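
A hard-coded field width such as 1023 can silently drift out of step with BUF_LENGTH when the buffer size is later changed. One possible way to keep the two tied together is to derive both from a single macro by stringification. This is a sketch only; FIELD_WIDTH, STR, XSTR, token_format, and read_token are names assumed here for illustration and do not appear in the rule.

```c
#include <stdio.h>

/* Assumed names: FIELD_WIDTH and the STR/XSTR helpers are illustrative */
#define FIELD_WIDTH 1023
#define STR(x) #x
#define XSTR(x) STR(x)

/* Buffer size is the field width plus one for the null terminator */
enum { BUF_LENGTH = FIELD_WIDTH + 1 };

/* Expands to "%1023s" while FIELD_WIDTH is 1023 */
static const char token_format[] = "%" XSTR(FIELD_WIDTH) "s";

/* Reads one whitespace-delimited token of at most FIELD_WIDTH characters
 * from stream into buf, which must have room for BUF_LENGTH characters.
 * Returns nonzero on success and zero on failure. */
int read_token(FILE *stream, char buf[BUF_LENGTH]) {
  return 1 == fscanf(stream, token_format, buf);
}
```

The two-level STR/XSTR expansion is needed so that FIELD_WIDTH is replaced by its value before being turned into a string. The field width counts characters stored and does not include the null terminator, which is why the buffer is one element larger.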

Noncompliant Code Example (argv)

In a hosted environment, arguments read from the command line are stored in process memory. The function main(), called at program startup, is typically declared as follows when the program accepts command-line arguments:

...

Code Block
bgColor#FFcccc
langc
#include <string.h>
 
int main(int argc, char *argv[]) {
  /* Ensure argv[0] is not null */
  const char *const name = (argc && argv[0]) ? argv[0] : "";
  char prog_name[128];
  strcpy(prog_name, name);
 
  return 0;
}

Compliant Solution (argv)

The strlen() function can be used to determine the length of the strings referenced by argv[0] through argv[argc - 1] so that adequate memory can be dynamically allocated. Note that care must be taken to avoid assuming that argv[0] is non-null.

Code Block
bgColor#ccccff
langc
#include <stdlib.h>
#include <string.h>
 
int main(int argc, char *argv[]) {
  /* Ensure argv[0] is not null */
  const char *const name = (argc && argv[0]) ? argv[0] : "";
  char *prog_name = (char *)malloc(strlen(name) + 1);
  if (prog_name != NULL) {
    strcpy(prog_name, name);
  } else {
    /* Handle error */
  }
  free(prog_name);
  return 0;
}

Remember to add a byte to the destination string size to accommodate the null-termination character.

Compliant Solution (argv)

The strcpy_s() function provides additional safeguards, including accepting the size of the destination buffer as an additional argument (see STR07-C. Use the bounds-checking interfaces for remediation of existing string manipulation code). Do not assume that argv[0] is non-null.

Code Block
bgColor#ccccff
langc
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdlib.h>
#include <string.h>
 
int main(int argc, char *argv[]) {
  /* Ensure argv[0] is not null */
  const char *const name = (argc && argv[0]) ? argv[0] : "";

  char *prog_name;
  size_t prog_size;

  prog_size = strlen(name) + 1;
  prog_name = (char *)malloc(prog_size);

  if (prog_name != NULL) {
    if (strcpy_s(prog_name, prog_size, name)) {
      /* Handle error */
    }
  } else {
    /* Handle error */
  }
  /* ... */
  free(prog_name);
  return 0;
}

The strcpy_s() function can be used to copy data to or from dynamically allocated memory or a statically allocated array. If insufficient space is available, strcpy_s() returns an error.

Compliant Solution (argv)

If an argument will not be modified or concatenated, there is no reason to make a copy of the string. Not copying a string is the best way to prevent a buffer overflow and is also the most efficient solution. Care must be taken to avoid assuming that argv[0] is non-null.

Code Block
bgColor#ccccff
langc
int main(int argc, char *argv[]) {
  /* Ensure argv[0] is not null */
  const char * const prog_name = (argc && argv[0]) ?
                                   argv[0] : "";
  /* ... */
  return 0;
}

Noncompliant Code Example (getenv())

According to the C Standard, 7.22.4.6 [ISO/IEC 9899:2011]:

The getenv function searches an environment list, provided by the host environment, for a string that matches the string pointed to by name. The set of environment names and the method for altering the environment list are implementation-defined.

Environment variables can be arbitrarily large, and copying them into fixed-length arrays without first determining the size and allocating adequate storage can result in a buffer overflow.

...
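
To illustrate determining the size first and allocating adequate storage, the following sketch copies an environment variable into a buffer sized from strlen(). The helper name copy_env is an assumption made for this example, not part of the rule or of any library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of the environment
 * variable called name, or NULL if the variable is unset or allocation
 * fails. The caller owns the result and must free() it. */
char *copy_env(const char *name) {
  const char *value = getenv(name);
  char *copy;
  size_t len;

  if (value == NULL) {
    return NULL;
  }
  len = strlen(value) + 1;  /* Include the null terminator */
  copy = (char *)malloc(len);
  if (copy != NULL) {
    memcpy(copy, value, len);
  }
  return copy;
}
```

Because the allocation is sized from the actual value, the copy cannot overflow however large the variable is. Copying also protects the caller from later calls to getenv(), which are permitted to overwrite the string the earlier call returned.

...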

Noncompliant Code Example (sprintf())

In this noncompliant code example, name refers to an external string; it could have originated from user input, from the file system, or from the network. The program constructs a file name from the string in preparation for opening the file.

Code Block
bgColor#FFcccc
langc
#include <stdio.h>
 
void func(const char *name) {
  char filename[128];
  sprintf(filename, "%s.txt", name);
}

Because the sprintf() function makes no guarantees regarding the length of the generated string, a sufficiently long string in name could generate a buffer overflow.

Compliant Solution (sprintf())

The buffer overflow in the preceding noncompliant example can be prevented by adding a precision to the %s conversion specification. If the precision is specified, no more than that many bytes are written. The precision 123 in this compliant solution ensures that filename can contain the first 123 characters of name, the .txt extension, and the null terminator.
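
The arithmetic behind the precision can be checked in a short sketch: 123 bytes of name, 4 bytes for ".txt", and 1 byte for the null terminator total 128. The names FILENAME_SIZE and build_filename are assumptions introduced for this illustration.

```c
#include <stdio.h>

enum { FILENAME_SIZE = 128 };

/* Hypothetical helper: builds "<name>.txt" in filename, which must have
 * room for FILENAME_SIZE characters. At most 123 bytes of name are used:
 * 123 + 4 for ".txt" + 1 for the null terminator = 128.
 * Returns the number of characters written, excluding the terminator. */
int build_filename(char filename[FILENAME_SIZE], const char *name) {
  return sprintf(filename, "%.123s.txt", name);
}
```

Note that the precision silently truncates a long name, so a caller that must detect truncation needs to compare the length of name against 123 separately.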

...

CERT C Secure Coding Standard

STR03-C. Do not inadvertently truncate a string
STR07-C. Use the bounds-checking interfaces for remediation of existing string manipulation code
MSC24-C. Do not use deprecated or obsolescent functions
MEM00-C. Allocate and free memory in the same module, at the same level of abstraction
FIO34-C. Distinguish between characters read from a file and EOF or WEOF

CERT C++ Secure Coding Standard

STR31-CPP. Guarantee that storage for character arrays has sufficient space for character data and the null terminator

ISO/IEC TR 24772:2013

String Termination [CJM]
Buffer Boundary Violation (Buffer Overflow) [HCB]
Unchecked Array Copying [XYW]

ISO/IEC TS 17961:2013

Using a tainted value to write to an object using a formatted input or output function [taintformatio]
Tainted strings are passed to a string copying function [taintstrcpy]

MITRE CWE

CWE-119, Improper Restriction of Operations within the Bounds of a Memory Buffer
CWE-120, Buffer Copy without Checking Size of Input ("Classic Buffer Overflow")
CWE-193, Off-by-one Error

Bibliography

...

[Dowd 2006] Chapter 7, "Program Building Blocks" ("Loop Constructs," pp. 327–336)
[Drepper 2006] Section 2.1.1, "Respecting Memory Bounds"
[ISO/IEC 9899:2011] Subclause K.3.5.4.1, "The gets_s Function"
[Lai 2006]
[NIST 2006] SAMATE Reference Dataset Test Case ID 000-000-088
[Seacord 2013] Chapter 2, "Strings"
[xorl 2009] FreeBSD-SA-09:11: NTPd Remote Stack Based Buffer Overflows

...