Copying data to a buffer that is not large enough to hold that data results in a buffer overflow. Buffer overflows occur frequently when manipulating strings [Seacord 2013]. To prevent such errors, either limit copies through truncation or, preferably, ensure that the destination is of sufficient size to hold the data to be copied. C-style strings require a null character to indicate the end of the string, while the C++ std::basic_string template requires no such character.

Noncompliant Code Example

Because the input is unbounded, the following code could lead to a buffer overflow.

Code Block
bgColor#FFcccc
langcpp
#include <iostream>
 
void f() {
  char buf[12];
  std::cin >> buf;
}

Noncompliant Code Example

To solve this problem, it may be tempting to use the std::ios_base::width() method, but there still is a trap, as shown in this noncompliant code example.

Code Block
bgColor#ffcccc
langcpp
#include <iostream>
 
void f() {
  char bufOne[12];
  char bufTwo[12];
  std::cin.width(12);
  std::cin >> bufOne;
  std::cin >> bufTwo;
}

In this example, the first read will not overflow, but could fill bufOne with a truncated string. Furthermore, the second read still could overflow bufTwo. The C++ Standard, [istream.extractors], paragraphs 7–9 [ISO/IEC 14882-2014], describes the behavior of operator>>(basic_istream &, charT *) and, in part, states the following:

operator>> then stores a null byte (charT()) in the next position, which may be the first position if no characters were extracted. operator>> then calls width(0).

Consequently, it is necessary to call width() prior to each operator>> call passing a bounded array. However, this does not account for the input being truncated, which may lead to information loss or a possible vulnerability.

Compliant Solution

The best solution for ensuring that data is not truncated and for guarding against buffer overflows is to use std::string instead of a bounded array, as in this compliant solution.

Code Block
bgColor#ccccff
langcpp
#include <iostream>
#include <string>
 
void f() {
  std::string stringOne, stringTwo;
  std::cin >> stringOne >> stringTwo;
}

Noncompliant Code Example

In this noncompliant example, the unformatted input function std::basic_istream<T>::read() is used to read an unformatted character array of 32 characters from the given file. However, the read() function does not guarantee that the string will be null terminated, so the subsequent call of the std::string constructor results in undefined behavior if the character array does not contain a null terminator.

Code Block
bgColor#ffcccc
langcpp
#include <fstream>
#include <string>
 
void f(std::istream &in) {
  char buffer[32];
  try {
    in.read(buffer, sizeof(buffer));
  } catch (std::ios_base::failure &e) {
    // Handle error
  }
 
  std::string str(buffer);
  // ...
}

Compliant Solution

This compliant solution assumes that the input from the file is at most 32 characters. Instead of inserting a null terminator, it constructs the std::string object based on the number of characters read from the input stream. If the size of the input is uncertain, it is better to use std::basic_istream<T>::readsome() or a formatted input function, depending on need.

Code Block
bgColor#ccccff
langcpp
#include <fstream>
#include <string>

void f(std::istream &in) {
  char buffer[32];
  try {
    in.read(buffer, sizeof(buffer));
  } catch (std::ios_base::failure &e) {
    // Handle error
  }
  std::string str(buffer, in.gcount());
  // ...
}

Risk Assessment

Copying string data to a buffer that is too small to hold that data results in a buffer overflow. Attackers can exploit this condition to execute arbitrary code with the permissions of the vulnerable process.

| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| STR50-CPP | High | Likely | Medium | P18 | L1 |

Automated Detection

| Tool | Version | Checker | Description |
|---|---|---|---|
| Astrée | | stream-input-char-array | Partially checked + soundly supported |
| CodeSonar | | MISC.MEM.NTERM | No space for null terminator |
| | | LANG.MEM.BO | Buffer overrun |
| | | LANG.MEM.TO | Type overrun |
| Helix QAC | | C++5216; DF2835, DF2836, DF2839 | |
| Klocwork | | NNTS.MIGHT, NNTS.TAINTED, NNTS.MUST, SV.UNBOUND_STRING_INPUT.CIN | |
| LDRA tool suite | | 489 S, 66 X, 70 X, 71 X | Partially implemented |
| Parasoft C/C++test | | CERT_CPP-STR50-b | Avoid overflow due to reading a not zero terminated string |
| | | CERT_CPP-STR50-c | Avoid overflow when writing to a buffer |
| | | CERT_CPP-STR50-e | Prevent buffer overflows from tainted data |
| | | CERT_CPP-STR50-f | Avoid buffer write overflow from tainted data |
| | | CERT_CPP-STR50-g | Do not use the 'char' buffer to store input from 'std::cin' |
| Polyspace Bug Finder | | CERT C++: STR50-CPP | Checks for: use of dangerous standard function; missing null in string array; buffer overflow from incorrect string format specifier; destination buffer overflow in string manipulation; insufficient destination buffer size. Rule partially covered. |
| RuleChecker | | stream-input-char-array | Partially checked |
| SonarQube C/C++ Plugin | | S3519 | |

Functions that perform unbounded copies often rely on external input to be a reasonable size. Such assumptions may prove to be false, causing a buffer overflow to occur. For this reason, care must be taken when using functions that may perform unbounded copies.

Noncompliant Code Example

This example uses the getchar() function to read in a character at a time from stdin, instead of reading the entire line at once. The stdin stream is read until end-of-file is encountered or a new-line character is read. Any new-line character is discarded, and a null character is written immediately after the last character read into the array. Similar to the previous example, there are no guarantees that this code will not result in a buffer overflow. Note that BUFSIZ is a macro integer defined in cstdio which represents a suggested value for setbuf() and not the maximum size of such an input buffer.

Code Block
bgColor#FFCCCC
char buf[BUFSIZ], *p;
int ch;
p = buf;
while ( ((ch = getchar()) != '\n')
       && !feof(stdin)
       && !ferror(stdin))
{
  *p++ = ch;
}
*p++ = 0;

Compliant Solution

In this compliant solution, characters are no longer copied to buf once index == BUFFERSIZE - 1, leaving room to null-terminate the string. The loop continues to read through to the end of the line until the end of the file is encountered or an error occurs.

Code Block
bgColor#ccccff
unsigned char buf[BUFFERSIZE];
int ch;
int index = 0;
int chars_read = 0;
while ( ( (ch = getchar()) != '\n')
        && !feof(stdin)
        && !ferror(stdin) )
{
  if (index < BUFFERSIZE-1) {
    buf[index++] = (unsigned char)ch;
  }
  chars_read++;
} /* end while */
buf[index] = '\0';  /* terminate NTBS */
if (feof(stdin)) {
  /* handle EOF */
}
if (ferror(stdin)) {
  /* handle error */
}
if (chars_read > index) {
  /* handle truncation */
}

If at the end of the loop feof(stdin) != 0, the loop has read through to the end of the file without encountering a new-line character. If at the end of the loop ferror(stdin) != 0, a read error occurred before the loop encountered a new-line character. If at the end of the loop chars_read > index, the input string has been truncated. This solution also applies rule FIO34-CPP. Use int to capture the return value of character IO functions.

Reading a character at a time provides more flexibility in controlling behavior without additional performance overhead.

The following test for the while loop is normally sufficient.

Code Block

while ( ( (ch = getchar()) != '\n') && ch != EOF ) {

See FIO35-CPP. Use feof() and ferror() to detect end-of-file and file errors when sizeof(int) == sizeof(char) for the case where feof() and ferror() must be used instead.

Noncompliant Code Example ( gets() )

The gets() function is inherently unsafe and should never be used, because it provides no way to control how much data is read into a buffer from stdin. The following code assumes that gets() will not read more than BUFSIZ - 1 characters from stdin. This is an invalid assumption, and the resulting operation can cause a buffer overflow. Again, note that BUFSIZ is a macro from <cstdio> and does not represent the maximum size of an input buffer.

According to Section 7.19.7.7 of C99 [ISO/IEC 9899:1999], the gets() function reads characters from stdin into a destination array until end-of-file is encountered or a new-line character is read. Any new-line character is discarded, and a null character is written immediately after the last character read into the array.

Code Block
bgColor#FFCCCC
char buf[BUFSIZ];
if (gets(buf) == NULL) {
  /* Handle Error */
}

The gets() function is obsolescent and deprecated; it was removed from the C Standard in C11 and from the C++ Standard in C++14.

Compliant Solution ( fgets() )

The fgets() function reads, at most, one less than a specified number of characters from a stream into an array. This example is compliant because the number of bytes copied from stdin to buf cannot exceed the allocated memory.

Code Block
bgColor#ccccff
char buf[BUFFERSIZE];
int ch;
char *p;

if (fgets(buf, sizeof(buf), stdin)) {
  /* fgets succeeds, scan for newline character */
  p = strchr(buf, '\n');
  if (p) {
    *p = '\0';
  }
  else {
    /* newline not found, flush stdin to end of line */
    while (((ch = getchar()) != '\n')
          && !feof(stdin)
          && !ferror(stdin)
    );
  }
}
else {
  /* fgets failed, handle error */
}

The fgets() function, however, is not a strict replacement for the gets() function because fgets() retains the new-line character (if read) but may also return a partial line. It is possible to use fgets() to safely process input lines too long to store in the destination array, but this is not recommended for performance reasons. Consider using one of the following compliant solutions when replacing gets().

Compliant Solution ( gets_s() )

The gets_s() function reads at most one less than the number of characters specified from the stream pointed to by stdin into an array.

According to TR 24731 [ISO/IEC TR 24731-2006]:

No additional characters are read after a new-line character (which is discarded) or after end-of-file. The discarded new-line character does not count towards number of characters read. A null character is written immediately after the last character read into the array.

If end-of-file is encountered and no characters have been read into the destination array, or if a read error occurs during the operation, then the first character in the destination array is set to the null character and the other elements of the array take unspecified values.

Code Block
bgColor#ccccff
char buf[BUFFERSIZE];

if (gets_s(buf, sizeof(buf)) == NULL) {
  /* handle error */
}

Noncompliant Code Example ( scanf() )

The scanf() function is used to read and format input from stdin. Improper use of scanf() may result in an unbounded copy. In the code below, the call to scanf() does not limit the amount of data read into buf. If more than 9 characters are read, then a buffer overflow occurs.

Code Block
bgColor#FFCCCC
enum { CHARS_TO_READ = 9 };

char buf[CHARS_TO_READ + 1];
scanf("%s", buf);

Compliant Solution ( scanf() )

The number of characters read by scanf() can be bounded by using the format specifier supplied to scanf().

Code Block
bgColor#ccccff
#define STRING(n) STRING_AGAIN(n)
#define STRING_AGAIN(n) #n

#define CHARS_TO_READ 9

char buf[CHARS_TO_READ + 1];
/* The spaces keep C++11 and later from parsing STRING as a literal suffix. */
scanf("%" STRING(CHARS_TO_READ) "s", buf);

Noncompliant Code Example (operator>>())

Because the input is unbounded, the following code could lead to a buffer overflow.

Code Block
bgColor#FFcccc
char buf[12];
cin >> buf;

Noncompliant Solution 1 (operator>>())

To solve this problem, it may be tempting to use the width() method of the ios_base class, but there is still a trap.

Code Block
bgColor#ffcccc
char buf_one[12];
char buf_two[12];
cin.width(12);
cin >> buf_one;
cin >> buf_two;

In this example, the first read will not overflow, but the second still could because, as the C++ standard states, "operator>> extracts characters and stores them into successive locations of an array [...] operator>> then calls width(0)." Consequently, width() must be called before every use of operator>> with a bounded array.

Noncompliant Solution 2 (operator>>())

The following code does not suffer from the same problem as the previous example, but it still has one:

Code Block
bgColor#ffcccc
char buf_one[12];
char buf_two[12];
cin.width(12);
cin >> buf_one;
cin.width(12);
cin >> buf_two;

As the C++ standard states, "If width() is greater than zero, n is width() [...] n-1 characters are stored [...] Operator>> then stores a null byte (charT()) in the next position, which may be the first position if no characters were extracted." The input could therefore be truncated, leading to information loss and a possible vulnerability.

In this particular example, if the user enters a string longer than 11 characters (11 characters plus the null terminator automatically appended by operator>> equals 12 characters), the 12th and all subsequent characters are not stored in the buffer.

Compliant Solution (operator>>())

To avoid this truncation problem, it would be better to use an instance of the string class to store the input, as it is dynamically resized to fit the input.

Code Block
bgColor#ccccff
string input;
const char *buf_one;
const char *buf_two;
string string_one;
string string_two;
cin >> string_one;
cin >> string_two;
buf_one = string_one.c_str();
buf_two = string_two.c_str();

Pay special attention to the const qualifier, and see STR45-CPP for details on how to handle the output of c_str().

Risk Assessment

Copying data from an unbounded source to a buffer of fixed size may result in a buffer overflow.

| Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
|---|---|---|---|---|---|
| STR35-CPP | high | likely | medium | P18 | L1 |

Automated Detection

The LDRA tool suite Version 7.6.0 can detect violations of this rule.

Compass/ROSE can detect some violations of this rule.

...


Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Other Languages

This rule appears in the C Secure Coding Standard as STR35-C. Do not copy data from an unbounded source to a fixed-length array.

References

[Drepper 06] Section 2.1.1, "Respecting Memory Bounds"
[ISO/IEC 14882-2003] Sections 3.6.1, "Main function," and 18.7, "Other runtime support"
[ISO/IEC 9899:1999] Section 7.19, "Input/output <stdio.h>"
[ISO/IEC TR 24731-2006] Section 6.5.4.1, "The gets_s function"
[Lai 06]
[MITRE 07] CWE ID 120 (http://cwe.mitre.org/data/definitions/120.html), "Unbounded Transfer ('Classic Buffer Overflow')"
[NIST 06] SAMATE Reference Dataset Test Case ID 000-000-088
[Seacord 05a] Chapter 2, "Strings"

Related Guidelines

Bibliography

[ISO/IEC 14882-2014] Subclause 27.7.2.2.3, "basic_istream::operator>>"
[ISO/IEC 14882-2014] Subclause 27.7.2.3, "Unformatted Input Functions"
[Seacord 2013] Chapter 2, "Strings"


...
