The basic_string template class has unusual invalidation semantics. According to the C++ Standard, [string.require], paragraph 5 [ISO/IEC 14882-2014]:

References, pointers, and iterators referring to the elements of a basic_string sequence may be invalidated by the following uses of that basic_string object:

  • as an argument to any standard library function taking a reference to non-const basic_string as an argument.
  • Calling non-const member functions, except operator[], at, front, back, begin, rbegin, end, and rend.

Examples of standard library functions taking a reference to non-const std::basic_string are: std::swap(), ::operator>>(basic_istream &, string &), and std::getline().

Do not use a reference, pointer, or iterator that has been invalidated, as that results in undefined behavior. This rule is a specific instance of CTR32-CPP. Do not use iterators invalidated by container modification.
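As a minimal sketch of the rule stated above, the following helper obtains its iterator only after std::getline() has finished modifying the string, and the caller re-acquires it after every further read. The function name read_and_skip_spaces is hypothetical and is used here for illustration only.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line into `line` and returns an iterator to
// its first non-space character (or line.cend() if there is none).
std::string::const_iterator read_and_skip_spaces(std::istream &in,
                                                 std::string &line) {
  // std::getline() takes a reference to non-const std::string, so anything
  // obtained from `line` before this call must be treated as invalid.
  std::getline(in, line);

  // The iterator is acquired only after the modifying call has returned.
  std::string::const_iterator it = line.cbegin();
  while (it != line.cend() && *it == ' ') {
    ++it;
  }
  return it;
}
```

A caller that reads a second line into the same string should discard the previously returned iterator and use the one returned by the new call.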

Noncompliant Code Example

This noncompliant code example copies input into a std::string, replacing ';' characters with spaces. This example is noncompliant because the iterator loc is invalidated after the first call to insert(). The behavior of subsequent calls to insert() is undefined.

#include <string>

void f(const std::string &input) {
  std::string email;
  std::string::iterator loc = email.begin();

  // Copy input into email, converting ";" to " "
  for (auto I = input.begin(), E = input.end(); I != E; ++I, ++loc) {
    email.insert(loc, *I != ';' ? *I : ' '); // loc is invalid after the first call
  }
}

Compliant Solution (std::string::insert())

In this compliant solution, the value of the iterator loc is updated as a result of each call to insert() so that the invalidated iterator is never accessed. The updated iterator is then incremented at the end of the loop.

#include <string>

void f(const std::string &input) {
  std::string email;
  std::string::iterator loc = email.begin();

  // Copy input into email, converting ";" to " "
  for (auto I = input.begin(), E = input.end(); I != E; ++I, ++loc) {
    loc = email.insert(loc, *I != ';' ? *I : ' ');
  }
}
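The compliant solution above returns nothing, which makes it hard to check. The sketch below is a variant that returns the resulting string; the name sanitize_with_insert is hypothetical and not part of the original rule. It relies on the single-character overload of insert() returning an iterator to the inserted character.

```cpp
#include <cassert>
#include <string>

// Hypothetical variant of f() that returns the result so it can be checked.
std::string sanitize_with_insert(const std::string &input) {
  std::string email;
  std::string::iterator loc = email.begin();

  for (auto I = input.begin(), E = input.end(); I != E; ++I, ++loc) {
    // insert() returns an iterator to the inserted character. Assigning it
    // to loc replaces the old value, which the call may have invalidated.
    loc = email.insert(loc, *I != ';' ? *I : ' ');
  }
  return email;
}
```

Because loc is refreshed on every iteration, the increment in the loop header always operates on a valid iterator.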

Compliant Solution (std::replace())

In this compliant solution, the manual loop is replaced with a standard algorithm that performs the replacement. Using generic algorithms is generally preferable to inventing your own solution when possible.

#include <algorithm>
#include <string>

void f(const std::string &input) {
  std::string email{input};
  std::replace(email.begin(), email.end(), ';', ' ');
}

Noncompliant Code Example

In this noncompliant code example, the string s is initialized as "rcs" and the string iterator si is initialized to the beginning of the string. The size of s is three, and we'll assume the capacity is fifteen. The for loop appends 20 characters to the end of the string. As a result, the si iterator is invalidated because the capacity of the string is exceeded, requiring a reallocation, and the call to insert() results in undefined behavior.

#include <cstddef>
#include <string>

void g() {
  std::string s("rcs");
  std::string::iterator si = s.begin();

  for (std::size_t i = 0; i < 20; ++i) {
    s.push_back('x');
  }
  s.insert(si, '*'); // si may have been invalidated by push_back()
}

Compliant Solution

The relationship between size and capacity makes it possible to predict when a call to a non-const member function will cause a string to perform a reallocation. This in turn makes it possible to predict when an insertion will invalidate references, pointers, and iterators (to anything other than the end of the string).

In this compliant solution, the noncompliant code example is modified to append at most capacity - size characters to the string s. As a result, the call to push_back() no longer invalidates the iterator.

#include <cstddef>
#include <string>

void g() {
  std::string s("rcs");
  std::string::iterator si = s.begin();

  for (std::size_t i = 0; i < 20; ++i) {
    if (s.size() == s.capacity()) {
      break; // Stop before push_back() would force a reallocation
    }
    s.push_back('x');
  }
  s.insert(si, '*');
}

If instead of performing a push_back() the code were to insert into an arbitrary location in the string, all references, pointers, and iterators from the insertion point to the end of the string would be invalidated.

Exceptions

The intent of these iterator invalidation rules is to give implementors greater freedom in implementation techniques. Some implementations provide versions of these member functions that do not invalidate references, pointers, and iterators in all cases. Check the documentation for your implementation before attempting to access a (potentially) invalid iterator. For portability, document any violation of the semantics specified by the standard.
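As a rough illustration of how size and capacity relate to reallocation, the sketch below remembers a position as an index rather than an iterator, so the later insertion does not depend on whether push_back() reallocated the buffer. The helper names append_exceeds_capacity and append_then_mark are hypothetical and are not part of the original rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if appending `count` characters needs more room
// than the string currently has, which in practice forces a reallocation.
bool append_exceeds_capacity(const std::string &s, std::size_t count) {
  return count > s.capacity() - s.size();
}

// Hypothetical helper: appends `count` copies of 'x', then inserts '*' at the
// position that was the start of the string before the appends.
std::string append_then_mark(std::string s, std::size_t count) {
  const std::size_t pos = 0; // An index stays meaningful across reallocation
  for (std::size_t i = 0; i < count; ++i) {
    s.push_back('x');
  }
  s.insert(pos, 1, '*'); // Index-based overload; no stored iterator is used
  return s;
}
```

The actual capacity of a short string is implementation-defined, so a check like append_exceeds_capacity is best paired with an explicit call to reserve() when a specific amount of headroom is needed.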

Risk Assessment

Using an invalid reference, pointer or iterator to a string object could allow an attacker to run arbitrary code.

Rule       Severity  Likelihood  Remediation Cost  Priority  Level
STR38-CPP  High      Probable    High              P6        L2

Automated Detection

Tool  Version  Checker  Description

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

  

Bibliography

[ISO/IEC 14882-2003] 21.3, "Class template basic_string"
[ISO/IEC 14882-2014] 21.4.1, "basic_string General Requirements"
[Meyers 01] Item 43, "Prefer algorithm calls to hand-written loops"
