Potentially exploitable undefined behavior can result from any of the following:
- Using pointer arithmetic so that the result does not point into or just past the end of the same object
- Using such pointers in arithmetic expressions
- Dereferencing pointers that do not point to a valid object in memory
- Using an array subscript so that the resulting reference does not refer to an element in the array
The C Standard identifies the following distinct situations in which undefined behavior (UB) can arise as a result of invalid pointer operations:
UB | Description | Example Code |
---|---|---|
46 | Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object. | Forming Out-of-Bounds Pointer |
47 | Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated. | Dereferencing Past the End Pointer, Using Past the End Index |
49 | An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]). | |
62 | An attempt is made to access, or generate a pointer to just past, a flexible array member of a structure when the referenced object provides no elements for that array. | |
...
In this noncompliant code example, the function f() attempts to validate the index before using it as an offset to the statically allocated table of integers. However, the function fails to reject negative index values. When index is less than zero, the behavior of the addition expression in the return statement of the function is undefined behavior 46. On some implementations, the addition alone can trigger a hardware trap. On other implementations, the addition may produce a result that, when dereferenced, triggers a hardware trap. Still other implementations may produce a dereferenceable pointer that points to an object distinct from table. Using such a pointer to access the object may lead to information exposure or cause the wrong object to be modified.
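The flaw described above, and one possible repair, can be sketched as follows. This is a minimal illustration, not the original listing: the constant TABLESIZE, its value, and the two function names are hypothetical, while table and index come from the description.

```c
#include <stddef.h>

enum { TABLESIZE = 100 };  /* Hypothetical size, chosen for illustration */

static int table[TABLESIZE];

/* Flawed check as described: only the upper bound is tested. */
int *f_upper_bound_only(int index) {
  if (index < TABLESIZE) {
    return table + index;  /* Undefined behavior when index < 0 */
  }
  return NULL;
}

/* One possible repair: reject negative values before doing arithmetic. */
int *f_checked(int index) {
  if (index >= 0 && index < TABLESIZE) {
    return table + index;
  }
  return NULL;
}
```

With the second version, the pointer addition is evaluated only when the result is known to point into table.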
...
Compliant Solution
Another slightly simpler and potentially more efficient compliant solution is to use an unsigned type to avoid having to check for negative values while still rejecting out-of-bounds positive values of index:
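A minimal sketch of this approach, assuming a table size constant named TABLESIZE (the name and value are hypothetical; f, table, and index come from the text):

```c
#include <stddef.h>

enum { TABLESIZE = 100 };  /* Hypothetical size, chosen for illustration */

static int table[TABLESIZE];

/* With an unsigned index, a single comparison covers both bounds. */
int *f(size_t index) {
  if (index < TABLESIZE) {
    return table + index;
  }
  return NULL;
}
```

A negative value passed by a caller is converted to a very large size_t value, so it fails the same comparison that rejects oversized positive indices.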
...
This noncompliant code example shows the flawed logic in the Windows Distributed Component Object Model (DCOM) Remote Procedure Call (RPC) interface that was exploited by the W32.Blaster.Worm. The error is that the while loop in the GetMachineName() function (used to extract the host name from a longer string) is not sufficiently bounded. When the character array pointed to by pwszTemp does not contain the backslash character among the first MAX_COMPUTERNAME_LENGTH_FQDN + 1 elements, the final valid iteration of the loop will dereference a past-the-end pointer, resulting in exploitable undefined behavior 47. In this case, the actual exploit allowed the attacker to inject executable code into a running program. Economic damage from the Blaster worm has been estimated to be at least $525 million [Pethia 2003].
For a discussion of this programming error in the Common Weakness Enumeration database, see CWE-119, "Improper Restriction of Operations within the Bounds of a Memory Buffer," and CWE-121, "Stack-based Buffer Overflow" [MITRE 2013].
```c
error_status_t _RemoteActivation(
      /* ... */, WCHAR *pwszObjectName, ... ) {
   *phr = GetServerPath(
              pwszObjectName, &pwszObjectName);
    /* ... */
}

HRESULT GetServerPath(
  WCHAR *pwszPath, WCHAR **pwszServerPath ){
  WCHAR *pwszFinalPath = pwszPath;
  WCHAR wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1];
  hr = GetMachineName(pwszPath, wszMachineName);
  *pwszServerPath = pwszFinalPath;
}

HRESULT GetMachineName(
  WCHAR *pwszPath,
  WCHAR wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1])
{
  pwszServerName = wszMachineName;
  LPWSTR pwszTemp = pwszPath + 2;
  /* Unbounded copy: stops only when a backslash is found */
  while (*pwszTemp != L'\\')
    *pwszServerName++ = *pwszTemp++;
  /* ... */
}
```
...
In this compliant solution, the while loop in the GetMachineName() function is bounded so that the loop terminates when a backslash character is found, the null-termination character (L'\0') is discovered, or the end of the buffer is reached. This code does not result in a buffer overflow even if no backslash character is found in wszMachineName.
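The three termination conditions can be sketched portably as shown here. Several details are assumptions made so the example compiles outside Windows: the function name get_machine_name, the use of wchar_t in place of WCHAR and LPWSTR, the size_t return value, the explicit null terminator, and the value 255 for MAX_COMPUTERNAME_LENGTH_FQDN. The sketch also assumes the path begins with two backslashes, as the pwszPath + 2 offset implies.

```c
#include <stddef.h>
#include <wchar.h>

/* Assumed value; the real constant is defined by the Windows headers. */
enum { MAX_COMPUTERNAME_LENGTH_FQDN = 255 };

/* Hypothetical stand-in for GetMachineName().
 * Copies the host name and returns the number of characters copied. */
size_t get_machine_name(
    const wchar_t *pwszPath,
    wchar_t wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN + 1]) {
  wchar_t *pwszServerName = wszMachineName;
  const wchar_t *pwszTemp = pwszPath + 2;  /* Skip the leading "\\\\" */
  wchar_t *end_addr = pwszServerName + MAX_COMPUTERNAME_LENGTH_FQDN;

  /* Stop at a backslash, at the terminator, or at the end of the buffer. */
  while (*pwszTemp != L'\\' &&
         *pwszTemp != L'\0' &&
         pwszServerName < end_addr) {
    *pwszServerName++ = *pwszTemp++;
  }
  *pwszServerName = L'\0';  /* Room is reserved by the +1 in the array size */
  return (size_t)(pwszServerName - wszMachineName);
}
```

Because end_addr is computed from the destination buffer, the copy cannot run past wszMachineName regardless of the input length.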
...
Third, the function violates INT30-C. Ensure that unsigned integer operations do not wrap when calculating the size of memory to allocate, which could lead to wrapping when 1 is added to pos or when size is multiplied by the size of int.
For a discussion of this programming error in the Common Weakness Enumeration database, see CWE-122, "Heap-based Buffer Overflow," and CWE-129, "Improper Validation of Array Index" [MITRE 2013].
```c
#include <stdlib.h>

static int *table = NULL;
static size_t size = 0;

int insert_in_table(size_t pos, int value) {
  if (size < pos) {
    int *tmp;
    size = pos + 1;
    tmp = (int *)realloc(table, sizeof(*table) * size);
    if (tmp == NULL) {
      return -1;   /* Failure */
    }
    table = tmp;
  }

  table[pos] = value;
  return 0;
}
```
...
This compliant solution correctly validates the index pos by using the <= relational operator, ensures the multiplication will not overflow, and avoids modifying size until it has verified that the call to realloc() was successful:
```c
#include <stdint.h>
#include <stdlib.h>

static int *table = NULL;
static size_t size = 0;

int insert_in_table(size_t pos, int value) {
  if (size <= pos) {
    int *tmp;
    /* Reject pos values for which pos + 1 or the byte count would wrap */
    if (pos == SIZE_MAX ||
        (pos + 1) > SIZE_MAX / sizeof(*table)) {
      return -1;
    }

    tmp = (int *)realloc(table, sizeof(*table) * (pos + 1));
    if (tmp == NULL) {
      return -1;
    }
    /* Modify size only after realloc() succeeds */
    size  = pos + 1;
    table = tmp;
  }

  table[pos] = value;
  return 0;
}
```
...
This noncompliant code example declares matrix to consist of 7 rows and 5 columns in row-major order. The function init_matrix then iterates over all 35 elements in an attempt to initialize each to the value given by the function argument x. However, because multidimensional arrays are declared in C in row-major order, the function iterates over the elements in column-major order, and when the value of j reaches the value COLS during the first iteration of the outer loop, the function attempts to access element matrix[0][5]. Because the type of matrix is int[7][5], the j subscript is out of range, and the access has undefined behavior 49.
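One way to avoid the out-of-range subscript is to keep each loop bound matched to the dimension it indexes. The following is a minimal sketch under that assumption; the macro name ROWS is hypothetical (the text names only COLS), and the loop variable i is likewise assumed.

```c
#include <stddef.h>

#define COLS 5
#define ROWS 7  /* Assumed name for the row count */

static int matrix[ROWS][COLS];

void init_matrix(int x) {
  /* The first subscript ranges over rows, the second over columns. */
  for (size_t i = 0; i < ROWS; i++) {
    for (size_t j = 0; j < COLS; j++) {
      matrix[i][j] = x;
    }
  }
}
```

Here j never reaches COLS inside the loop body, so every access stays within the int[7][5] array.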
...
In this noncompliant code example, the function find() attempts to iterate over the elements of the flexible array member buf, starting with the second element. However, because function g() does not allocate any storage for the member, the expression first++ in find() attempts to form a pointer just past the end of buf when there are no elements. This attempt is undefined behavior 62 (see MSC21-C. Use robust loop termination conditions for more information).
```c
#include <stdlib.h>

struct S {
  size_t len;
  char buf[];  /* Flexible array member */
};

const char *find(const struct S *s, int c) {
  const char *first = s->buf;
  const char *last  = s->buf + s->len;

  while (first++ != last) { /* Undefined behavior */
    if (*first == (unsigned char)c) {
      return first;
    }
  }
  return NULL;
}

void g(void) {
  struct S *s = (struct S *)malloc(sizeof(struct S));
  if (s == NULL) {
    /* Handle error */
  }
  s->len = 0;
  find(s, 'a');
}
```
...
```c
#include <stdlib.h>

struct S {
  size_t len;
  char buf[];  /* Flexible array member */
};

const char *find(const struct S *s, int c) {
  const char *first = s->buf;
  const char *last  = s->buf + s->len;

  while (first != last) { /* Avoid incrementing here */
    if (*first == (unsigned char)c) {
      return first;
    }
    /* Advance after the test so that first is never dereferenced
       when it equals last */
    ++first;
  }
  return NULL;
}

void g(void) {
  struct S *s = (struct S *)malloc(sizeof(struct S));
  if (s == NULL) {
    /* Handle error */
  }
  s->len = 0;
  find(s, 'a');
}
```
...
This function fails to check if the allocation succeeds, which is a violation of ERR33-C. Detect and handle standard library errors. If the allocation fails, then malloc() returns a null pointer. The null pointer is added to offset and passed as the destination argument to memcpy(). Because a null pointer does not point to a valid object, the result of the pointer arithmetic is undefined behavior 46.
An attacker who can supply the arguments to this function can exploit it to execute arbitrary code. This can be accomplished by providing an overly large value for block_size, which causes malloc() to fail and return a null pointer. The offset argument will then serve as the destination address to the call to memcpy(). The attacker can specify the data and data_size arguments to provide the address and length of the address, respectively, that the attacker wishes to write into the memory referenced by offset. The overall result is that the call to memcpy() can be exploited by an attacker to overwrite an arbitrary memory location with an attacker-supplied address, typically resulting in arbitrary code execution.
...
```c
#include <string.h>
#include <stdlib.h>

char *init_block(size_t block_size, size_t offset,
                 char *data, size_t data_size) {
  char *buffer = malloc(block_size);
  if (NULL == buffer) {
    /* Handle error */
  }
  /* The data fits only if offset leaves at least data_size bytes free */
  if (data_size > block_size || block_size - data_size < offset) {
    /* Data won't fit in buffer, handle error */
  }
  memcpy(buffer + offset, data, data_size);
  return buffer;
}
```
Risk Assessment
Writing to out-of-range pointers or array subscripts can result in a buffer overflow and the execution of arbitrary code with the permissions of the vulnerable process. Reading from out-of-range pointers or array subscripts can result in unintended information disclosure.
...
CVE-2008-1517 results from a violation of this rule. Before Mac OS X version 10.5.7, the XNU kernel accessed an array at an unverified user-input index, allowing an attacker to execute arbitrary code by passing an index greater than the length of the array and therefore accessing outside memory [xorl 2009].
...
ISO/IEC TR 24772:2013 | Arithmetic Wrap-around Error [FIF]; Unchecked Array Indexing [XYZ] |
ISO/IEC TS 17961 | Forming or using out-of-bounds pointers or array subscripts [invptr] |
MITRE CWE | CWE-119, Improper Restriction of Operations within the Bounds of a Memory Buffer; CWE-122, Heap-based Buffer Overflow; CWE-129, Improper Validation of Array Index; CWE-788, Access of Memory Location after End of Buffer |
Bibliography
[Finlay 2003] | |
[Microsoft 2003] | |
[Pethia 2003] | |
[Seacord 2013b] | Chapter 1, "Running with Scissors" |
[Viega 2005] | Section 5.2.13, "Unchecked Array Indexing" |
[xorl 2009] | "CVE-2008-1517: Apple Mac OS X (XNU) Missing Array Index Validation" |
...