Since std::basic_string is a container of characters, this rule is a specific instance of CTR51-CPP. Use valid references, pointers, and iterators to reference elements of a container. As a container, it supports iterators just like other containers in the Standard Template Library. However, the std::basic_string template class has unusual invalidation semantics. The C++ Standard, [string.require], paragraph 5 [ISO/IEC 14882-2014], states the following:
References, pointers, and iterators referring to the elements of a basic_string sequence may be invalidated by the following uses of that basic_string object:
- As an argument to any standard library function taking a reference to non-const basic_string as an argument.
- Calling non-const member functions, except operator[], at, front, back, begin, rbegin, end, and rend.
Examples of standard library functions taking a reference to non-const std::basic_string are std::swap(), ::operator>>(basic_istream &, string &), and std::getline().
Do not use an invalidated reference, pointer, or iterator because doing so results in undefined behavior.
Noncompliant Code Example
This noncompliant code example copies input into a std::string, replacing semicolon (;) characters with spaces. This example is noncompliant because the iterator loc is invalidated after the first call to insert(). The behavior of subsequent calls to insert() is undefined.
Code Block | ||||
---|---|---|---|---|
| ||||
#include <string>

void f(const std::string &input) {
  std::string email;

  // Copy input into email converting ";" to " "
  std::string::iterator loc = email.begin();
  for (auto i = input.begin(), e = input.end(); i != e; ++i, ++loc) {
    // Noncompliant: loc is invalidated by the first call to insert()
    email.insert(loc, *i != ';' ? *i : ' ');
  }
}
|
Compliant Solution (std::string::insert())
In this compliant solution, the value of the iterator loc is updated as a result of each call to insert() so that the invalidated iterator is never accessed. The updated iterator is then incremented at the end of the loop.
Code Block | ||||
---|---|---|---|---|
| ||||
#include <string>

void f(const std::string &input) {
  std::string email;

  // Copy input into email converting ";" to " "
  std::string::iterator loc = email.begin();
  for (auto i = input.begin(), e = input.end(); i != e; ++i, ++loc) {
    // insert() returns a valid iterator to the inserted character
    loc = email.insert(loc, *i != ';' ? *i : ' ');
  }
}
|
Compliant Solution (std::replace())
This compliant solution uses a standard algorithm to perform the replacement. When possible, using a generic algorithm is preferable to inventing your own solution.
Code Block | ||||
---|---|---|---|---|
| ||||
#include <algorithm>
#include <string>

void f(const std::string &input) {
  std::string email{input};
  std::replace(email.begin(), email.end(), ';', ' ');
}
|
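The same idea can be sketched with std::replace_copy(), which writes the converted characters through a std::back_inserter so that no iterator into the destination string is held across a modification. The function name sanitized_copy and its returning the result by value are assumptions made for illustration; they are not part of the rule's own examples.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper: returns a copy of `input` with every ';' replaced
// by a space. Only iterators into `input` (which is never modified) are
// used; the destination is written through std::back_inserter.
std::string sanitized_copy(const std::string &input) {
  std::string email;
  email.reserve(input.size());  // optional: avoids repeated reallocation
  std::replace_copy(input.begin(), input.end(),
                    std::back_inserter(email), ';', ' ');
  return email;
}
```

Because std::back_inserter calls push_back() on the string itself, a reallocation of the destination cannot leave a stale iterator behind.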
...
Noncompliant Code Example
In this noncompliant code example, the string s is initialized as "rcs" and the string iterator si is initialized to the beginning of the string. The size of s is three, and we'll assume the capacity is fifteen. The for loop appends 20 characters to the end of the string. As a result, the si iterator is invalidated because the capacity of the string is exceeded, requiring a reallocation, and the subsequent call to insert() results in undefined behavior.
Code Block | ||||
---|---|---|---|---|
| ||||
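#include <cstddef>
#include <string>

void f() {
  std::string s("rcs");
  std::string::iterator si = s.begin();
  for (std::size_t i = 0; i < 20; ++i) {
    s.push_back('x');  // may reallocate, invalidating si
  }
  s.insert(si, '*');   // undefined behavior: si may no longer be valid
}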
|
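One possible way to make the push_back() example above well defined, offered as a sketch under assumptions rather than as an official compliant solution, is to obtain the iterator only after all of the appends have completed. The function name append_then_insert and its return value are hypothetical, added so that the result can be inspected.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant of the example above: the iterator is created after
// the loop, so no reallocation can occur between obtaining it and using it.
std::string append_then_insert() {
  std::string s("rcs");
  for (std::size_t i = 0; i < 20; ++i) {
    s.push_back('x');  // may reallocate; no iterators are held yet
  }
  std::string::iterator si = s.begin();  // obtained after the last append
  s.insert(si, '*');
  return s;
}
```

Calling s.reserve() before taking the iterator would be another option, but it depends on knowing the final size in advance.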
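Noncompliant Code Example
In this noncompliant code example, data is invalidated after the call to replace(), and so its use in g() is undefined behavior.
Code Block | ||||
---|---|---|---|---|
| ||||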
#include <iostream>
#include <string>
extern void g(const char *);
void f(std::string &exampleString) {
const char *data = exampleString.data();
// ...
exampleString.replace(0, 2, "bb");
// ...
g(data);
} |
Compliant Solution
In this compliant solution, the pointer to exampleString's internal buffer is not generated until after the modification from replace() has completed.
Code Block | ||||
---|---|---|---|---|
| ||||
#include <iostream>
#include <string>

extern void g(const char *);

void f(std::string &exampleString) {
  // ...
  exampleString.replace(0, 2, "bb");
  // ...
  g(exampleString.data());
}
|
If, instead of performing a push_back(), code were to insert into an arbitrary location in a string, all references, pointers, and iterators from the insertion point to the end of the string would be invalidated.
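If the callee needs the contents as they were before the modification, a different approach is required. The sketch below copies the string first and passes a pointer into the copy. The definition of g() here is a hypothetical stand-in that records its argument, and the variable last_seen is likewise an assumption made for illustration.

```cpp
#include <string>

// Hypothetical stand-in for g(): stores a copy of the text it receives so
// that the behavior can be observed.
static std::string last_seen;
void g(const char *p) { last_seen = p; }

void f(std::string &exampleString) {
  // Copy the contents while they are still valid; the copy owns its own
  // buffer, which replace() on exampleString cannot touch.
  const std::string before = exampleString;

  exampleString.replace(0, 2, "bb");

  // The pointer refers to `before`, not to exampleString's buffer.
  g(before.c_str());
}
```

The extra copy costs an allocation for longer strings, so this is a trade-off and only worthwhile when the earlier contents are actually needed.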
Risk Assessment
Using an invalid reference, pointer, or iterator to a string object could allow an attacker to run arbitrary code.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
STR52-CPP | High | Probable | High | P6 | L2 |
Automated Detection
Tool | Version | Checker | Description |
---|---|---|---|
CodeSonar | | ALLOC.UAF | Use After Free |
Helix QAC | | DF4746, DF4747, DF4748, DF4749 | |
Parasoft C/C++test | | CERT_CPP-STR52-a | Use valid references, pointers, and iterators to reference elements of a basic_string |
Polyspace Bug Finder | | CERT C++: STR52-CPP | Checks for use of invalid string iterator (rule partially covered). |
Related Vulnerabilities
Search for vulnerabilities resulting from the violation of this rule on the CERT website.
Related Guidelines
SEI CERT C++ Coding Standard | CTR51-CPP. Use valid references, pointers, and iterators to reference elements of a container |
Bibliography
[ISO/IEC 14882-2014] | Subclause 21.4.1, " |
[Meyers 2001] | Item 43, "Prefer Algorithm Calls to Hand-written Loops" |