...
UB | Description | Example Code |
---|---|---|
43 | Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object. | |
44 | Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated. | #Dereferencing Past The End Pointer, #Using Past The End Index |
46 | An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]). | #Apparently Accessible Out Of Range Index |
59 | An attempt is made to access, or generate a pointer to just past, a flexible array member of a structure when the referenced object provides no elements for that array. | |
103 | The pointer passed to a library function array parameter does not have a value such that all address computations and object accesses are valid. | |
...
In the following noncompliant code example, the function f() attempts to validate the index before using it as an offset to the statically allocated table of integers. However, the function fails to reject negative index values. When index is less than zero, the behavior of the addition expression in the return statement of the function is undefined behavior 43. On some implementations, the addition alone may trigger a hardware trap. On other implementations, using the result of the addition or dereferencing it may also trigger a hardware trap.
Code Block | ||
---|---|---|
| ||
enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int *f(int index) {
  if (index < TABLESIZE)  /* negative values of index are not rejected */
    return table + index;
  return NULL;
}
...
One compliant solution is to detect and reject invalid values of index if using them in the pointer arithmetic expression would result in the formation of an invalid pointer.
Code Block | ||
---|---|---|
| ||
enum { TABLESIZE = 100 };

static int table[TABLESIZE];

int *f(int index) {
  if (0 <= index && index < TABLESIZE)
    return table + index;
  return NULL;
}
Compliant Solution
Another, slightly simpler and potentially more efficient compliant solution is to use an unsigned type to avoid having to check for negative values while still rejecting out-of-bounds positive values of index.
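The unsigned-index approach described above might look like the following minimal sketch. The choice of size_t as the unsigned type is an assumption made for illustration; the table and the function name f() are carried over from the preceding example.

```c
#include <stddef.h>

enum { TABLESIZE = 100 };

static int table[TABLESIZE];

/* Assumption: size_t is used as the unsigned index type. A negative
   value supplied by a caller converts to a very large unsigned value,
   so the single upper-bound check below rejects it as well. */
int *f(size_t index) {
  if (index < TABLESIZE)
    return table + index;
  return NULL;
}
```

With this signature, a call such as f((size_t)-1) is rejected by the same comparison that rejects f(TABLESIZE).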
...
Noncompliant Code Example (Dereferencing Past The End Pointer)
The following noncompliant code example shows the flawed logic in the Windows Distributed Component Object Model (DCOM) Remote Procedure Call (RPC) interface that was exploited by the W32.Blaster.Worm. The error is that the while loop in the GetMachineName() function (used to extract the host name from a longer string) is not sufficiently bounded. When the character array pointed to by pwszTemp does not contain the backslash character among the first MAX_COMPUTERNAME_LENGTH_FQDN + 1 elements, the final valid iteration of the loop dereferences the past-the-end pointer, resulting in exploitable undefined behavior 44. In this case, the actual exploit allowed the attacker to inject executable code into a running program. Economic damage from the Blaster worm has been estimated to be at least $525 million [Pethia 03].
...
Code Block | ||
---|---|---|
| ||
error_status_t _RemoteActivation(
  /* ... */, WCHAR *pwszObjectName, ... ) {
  *phr = GetServerPath(pwszObjectName, &pwszObjectName);
  /* ... */
}

HRESULT GetServerPath(WCHAR *pwszPath, WCHAR **pwszServerPath) {
  WCHAR *pwszFinalPath = pwszPath;
  WCHAR wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1];
  hr = GetMachineName(pwszPath, wszMachineName);
  *pwszServerPath = pwszFinalPath;
}

HRESULT GetMachineName(
  WCHAR *pwszPath,
  WCHAR wszMachineName[MAX_COMPUTERNAME_LENGTH_FQDN+1]) {
  pwszServerName = wszMachineName;
  LPWSTR pwszTemp = pwszPath + 2;
  while ( *pwszTemp != L'\\' )  /* loop is bounded only by the backslash */
    *pwszServerName++ = *pwszTemp++;
  /* ... */
}
Compliant Solution
In the following compliant solution, the while loop in the GetMachineName() function is bounded so that the loop terminates when a backslash character is found, the null termination character (L'\0') is discovered, or the end of the buffer is reached. This code does not result in a buffer overflow, even if no backslash character is found in wszMachineName.
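The three termination conditions described above can be sketched in portable C as follows. This is an illustration only, not the original Windows code: the function name copy_machine_name(), the constant MAX_NAME (a small stand-in for MAX_COMPUTERNAME_LENGTH_FQDN), the use of wchar_t in place of WCHAR, and the returned character count are all assumptions.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for MAX_COMPUTERNAME_LENGTH_FQDN. */
enum { MAX_NAME = 8 };

/* Copies the host name that follows the first two characters of path
   into name, which has MAX_NAME + 1 elements, and null-terminates it.
   Returns the number of characters copied.
   Assumes path holds at least two characters before its terminator. */
size_t copy_machine_name(const wchar_t *path, wchar_t name[MAX_NAME + 1]) {
  wchar_t *dst = name;
  const wchar_t *src = path + 2;
  const wchar_t *end = name + MAX_NAME;  /* last element, kept for L'\0' */

  /* Stop at a backslash, at the null terminator, or at the end of name. */
  while (*src != L'\\' && *src != L'\0' && dst < end)
    *dst++ = *src++;

  *dst = L'\0';
  return (size_t)(dst - name);
}
```

Because the destination bound is tested on every iteration, an input without a backslash is truncated rather than written past the end of the buffer.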
...
This compliant solution is for illustrative purposes and is not necessarily the solution implemented by Microsoft. This particular "solution" may not be correct, because there is no guarantee that a backslash is found.
...
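Similar to the #Dereferencing Past The End Pointer error, the function insert_in_table() in the following noncompliant code example uses an otherwise valid index to attempt to store a value in an element just past the end of an array. Specifically, when pos is equal to size, the test size < pos is false, no reallocation takes place, and the assignment to table[pos] writes one element past the end of the array.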
...
Code Block | ||
---|---|---|
| ||
static int *table = NULL;
static size_t size = 0;

int insert_in_table(size_t pos, int value) {
  if (size < pos) {  /* does not catch pos == size */
    int *tmp;
    size = pos + 1;
    tmp = (int*)realloc(table, sizeof *table * size);
    if (NULL == tmp)
      return -1;
    table = tmp;
  }
  table[pos] = value;
  return 0;
}
Compliant Solution
The following compliant solution correctly validates the index pos by using the <= operator and avoids modifying size until it has verified that the call to realloc() was successful.
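A minimal sketch of the two changes described above is shown here, reusing the names from the noncompliant example. The overflow check on pos is an added precaution that the text does not mention and should be read as an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static int *table = NULL;
static size_t size = 0;

int insert_in_table(size_t pos, int value) {
  if (size <= pos) {  /* <= also catches pos == size */
    int *tmp;
    /* Added precaution (assumption): reject values of pos for which
       pos + 1 or the byte count would overflow. */
    if (pos >= SIZE_MAX / sizeof *table)
      return -1;
    tmp = (int *)realloc(table, sizeof *table * (pos + 1));
    if (NULL == tmp)
      return -1;
    /* Update size only after realloc() is known to have succeeded. */
    size = pos + 1;
    table = tmp;
  }
  table[pos] = value;
  return 0;
}
```

In this sketch, elements between the old size and pos are left uninitialized after the table grows; a caller that reads them would need to initialize them first.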
...
Noncompliant Code Example (Apparently Accessible Out Of Range Index)
The following noncompliant code example declares matrix to consist of 7 rows and 5 columns in row-major order. The function init_matrix then iterates over all 35 elements in an attempt to initialize each to the value given by the function argument x. However, because multidimensional arrays are declared in C in row-major order and the function iterates over the elements in column-major order, when the value of j reaches the value COLS during the first iteration of the outer loop, the function attempts to access element matrix[0][5]. Because the type of matrix is int[7][5], the j subscript is out of range and the access has undefined behavior 46.
Code Block | ||
---|---|---|
| ||
static const size_t COLS = 5;
static const size_t ROWS = 7;

static int matrix[ROWS][COLS];

void init_matrix(int x) {
  for (size_t i = 0; i != COLS; ++i)
    for (size_t j = 0; j != ROWS; ++j)
      matrix[i][j] = x;  /* j exceeds the column bound when j >= COLS */
}
Compliant Solution
The following compliant solution avoids using out-of-range indices by initializing matrix elements in the same row-major order as multidimensional objects are declared in C.
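The row-major traversal described above might be written as in the following sketch. Declaring the dimensions as enumeration constants instead of static const size_t objects is an assumption made so that the file-scope array declaration is a valid constant-size array in C.

```c
#include <stddef.h>

/* Assumption: enumeration constants are used for the dimensions. */
enum { COLS = 5, ROWS = 7 };

static int matrix[ROWS][COLS];

void init_matrix(int x) {
  /* The outer index ranges over rows and the inner index over columns,
     matching the declaration int matrix[ROWS][COLS]. */
  for (size_t i = 0; i != ROWS; ++i)
    for (size_t j = 0; j != COLS; ++j)
      matrix[i][j] = x;
}
```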
...
In the following noncompliant code example, the function find() attempts to iterate over the elements of the flexible array member buf, starting with the second element. However, because function g() does not allocate any storage for the member, the expression first++ in find() will attempt to form a pointer just past the end of buf when there are no elements. This attempt results in undefined behavior 59.
Code Block | ||
---|---|---|
| ||
struct S {
  size_t len;
  char buf[];  /* flexible array member */
};

char* find(const struct S *s, int c) {
  char *first = s->buf;
  char *last = s->buf + s->len;
  while (first++ != last)  /* undefined behavior here */
    if (*first == (unsigned char)c)
      return first;
  return NULL;
}

void g() {
  struct S *s = (struct S*)malloc(sizeof (struct S));
  s->len = 0;
  /* ... */
  char *where = find(s, '.');
  /* ... */
}
Compliant Solution
The following compliant solution avoids incrementing the pointer unless a value past the pointer's current value is known to exist.
Code Block | ||
---|---|---|
| ||
struct S {
  size_t len;
  char buf[];  /* flexible array member */
};

char* find(const struct S *s, int c) {
  char *first = s->buf;
  char *last = s->buf + s->len;
  /* increment only when not already at the end, and never
     dereference the past-the-end pointer */
  while (first != last && ++first != last)
    if (*first == (unsigned char)c)
      return first;
  return NULL;
}

void g() {
  struct S *s = (struct S*)malloc(sizeof (struct S));
  s->len = 0;
  /* ... */
  char *where = find(s, '.');
  /* ... */
}
...
In the following noncompliant code example, the function f() calls fread() to read nitems elements of type wchar_t, each size bytes in size, into an array of BUFSIZ elements, wbuf. However, the expression used to compute the value of nitems fails to account for the fact that, unlike the size of char, the size of wchar_t may be greater than 1. Thus, fread() may attempt to form pointers past the end of wbuf and use them to assign values to nonexistent elements of the array. Such an attempt results in undefined behavior 103. A likely manifestation of this undefined behavior is a classic buffer overflow, which is often exploitable by code injection attacks.
...
Code Block | ||
---|---|---|
| ||
void f(FILE *file) {
  wchar_t wbuf[BUFSIZ];
  const size_t size = sizeof *wbuf;
  const size_t nitems = sizeof wbuf;  /* byte count, not element count */
  size_t nread;

  nread = fread(wbuf, size, nitems, file);
  /* ... */
}
Compliant Solution
The following compliant solution correctly computes the maximum number of items for fread() to read from the file.
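One way to compute the item count described above is to divide the size of the array in bytes by the size of one element, as in the following sketch. Returning nread from f() is an assumption made here only so that the result can be observed; the noncompliant example declares f() as returning void.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Assumption: f() returns the number of items read, for illustration. */
size_t f(FILE *file) {
  wchar_t wbuf[BUFSIZ];
  const size_t size = sizeof *wbuf;
  /* Number of elements in wbuf, not the number of bytes. */
  const size_t nitems = sizeof wbuf / size;
  size_t nread;

  nread = fread(wbuf, size, nitems, file);
  /* ... */
  return nread;
}
```

Because nitems now equals BUFSIZ, fread() is asked for no more elements than wbuf can hold, whatever the size of wchar_t is on the implementation.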
...