When performing pointer arithmetic, the size of the value to add to a pointer is automatically scaled to the size of the type of the pointed-to object. For instance, when adding a value to the byte address of a 4-byte integer, the value is scaled by a factor of 4 and then added to the pointer. Failing to understand how pointer arithmetic works can lead to miscalculations that result in serious errors, such as buffer overflows.
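The scaling can be observed by measuring, in bytes, how far a pointer moves when a value is added to it. The following is a minimal sketch; the helper names byte_distance_int and byte_distance_char are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes between p and p + n for an int pointer.
 * The caller must ensure that p + n stays within, or one past, the same array. */
size_t byte_distance_int(const int *p, size_t n) {
  assert(p != NULL);
  const int *q = p + n; /* Advances n elements, not n bytes */
  return (size_t)((const char *)q - (const char *)p);
}

/* Hypothetical helper: the same measurement for a char pointer, where the
 * element size is 1 and the value is therefore not scaled. */
size_t byte_distance_char(const char *p, size_t n) {
  assert(p != NULL);
  return (size_t)((p + n) - p);
}
```

For an array of int, byte_distance_int(a, 3) yields 3 * sizeof(int), whereas byte_distance_char(c, 3) yields 3 for an array of char.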

Noncompliant Code Example

In this noncompliant code example, integer values returned by parseint(getdata()) are stored into an array of INTBUFSIZE elements of type int called buf [Dowd 2006]. If data is available for insertion into buf (which is indicated by havedata()) and buf_ptr has not been incremented past buf + sizeof(buf), an integer value is stored at the address referenced by buf_ptr. However, the sizeof operator returns the total number of bytes in buf, which is typically a multiple of the number of elements in buf. This value is scaled to the size of an integer and added to buf. As a result, the check to make sure integers are not written past the end of buf is incorrect, and a buffer overflow is possible.


Code Block
bgColor#FFCCCC
langc

int buf[INTBUFSIZE];
int *buf_ptr = buf;

/* Bug: sizeof(buf) is a byte count, but it is scaled again by sizeof(int) */
while (havedata() && buf_ptr < (buf + sizeof(buf))) {
  *buf_ptr++ = parseint(getdata());
}

Compliant Solution

In this compliant solution, the size of buf, INTBUFSIZE, is added directly to buf and used as an upper bound. The integer literal INTBUFSIZE is scaled to the size of an integer, and the upper bound of buf is checked correctly.

Code Block
bgColor#CCCCFF
langc
int buf[INTBUFSIZE];
int *buf_ptr = buf;

while (havedata() && buf_ptr < (buf + INTBUFSIZE)) {
  *buf_ptr++ = parseint(getdata());
}

An arguably better solution is to use the address of the nonexistent element following the end of the array, as follows:

Code Block
bgColor#CCCCFF
langc

int buf[INTBUFSIZE];
int *buf_ptr = buf;

while (havedata() && buf_ptr < &buf[INTBUFSIZE]) {
  *buf_ptr++ = parseint(getdata());
}

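This solution works because the C Standard guarantees that the address of buf[INTBUFSIZE], one past the last element, can be computed and compared even though no such element exists. That address must not be dereferenced.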
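When the bound has to be derived from sizeof, dividing the total size of the array by the size of one element converts the byte count back into an element count. The sketch below combines that conversion with a one-past-the-end bound. ARRAY_LEN and store_ints are hypothetical names, and the src array stands in for the parseint(getdata()) source used in the examples.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (not a pointer): total bytes / bytes per element */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: copy values from src into buf, storing at most n of them.
 * Returns the number of elements actually stored. */
size_t store_ints(int *buf, size_t n, const int *src, size_t src_len) {
  assert(buf != NULL && src != NULL);
  int *buf_ptr = buf;
  const int *end = buf + n; /* One past the last element; never dereferenced */
  size_t i = 0;

  while (i < src_len && buf_ptr < end) {
    *buf_ptr++ = src[i++];
  }
  return (size_t)(buf_ptr - buf);
}
```

Calling store_ints(buf, ARRAY_LEN(buf), src, src_len) stops after the last element of buf even when src holds more values than buf can take. ARRAY_LEN is meaningful only for an actual array; applied to a pointer, it would divide the size of the pointer itself.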

Noncompliant Code Example

This noncompliant code example is based on a flaw in the OpenBSD operating system. An integer, skip, is added as an offset to a pointer of type struct big. The adjusted pointer is then used as a destination address in a call to memset(). However, when skip is added to the struct big pointer, it is automatically scaled by the size of struct big, which is 32 bytes (assuming 4-byte integers, 8-byte long long integers, and no structure padding). This scaling results in the call to memset() writing to unintended memory.

Code Block
bgColor#FFCCCC
langc
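#include <stddef.h> /* offsetof, size_t */
#include <stdlib.h> /* malloc, free */
#include <string.h> /* memset */

struct big {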
  unsigned long long ull_1; /* Typically 8 bytes */
  unsigned long long ull_2; /* Typically 8 bytes */
  unsigned long long ull_3; /* Typically 8 bytes */
  int si_4; /* Typically 4 bytes */
  int si_5; /* Typically 4 bytes */
};
/* ... */
 
int f(void) {
  size_t skip = offsetof(struct big, ull_2);
  struct big *s = (struct big *)malloc(sizeof(struct big));
  if (!s) {
    return -1; /* Indicate malloc() failure */
  }

  /* Bug: skip is a byte offset, but s + skip advances skip * sizeof(struct big) bytes */
  memset(s + skip, 0, sizeof(struct big) - skip);
  /* ... */
  free(s);
  s = NULL;
  
  return 0;
}

A similar situation occurred in OpenBSD's make command [Murenin 2007].

Compliant Solution

To correct this example, the struct big pointer is cast to char *, which causes skip to be scaled by a factor of 1, so that it is treated as a byte offset:

Code Block
bgColor#CCCCFF
langc
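#include <stddef.h> /* offsetof, size_t */
#include <stdlib.h> /* malloc, free */
#include <string.h> /* memset */

struct big {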
  unsigned long long ull_1; /* Typically 8 bytes */
  unsigned long long ull_2; /* Typically 8 bytes */
  unsigned long long ull_3; /* Typically 8 bytes */
  int si_4; /* Typically 4 bytes */
  int si_5; /* Typically 4 bytes */
};
/* ... */
 
int f(void) {
  size_t skip = offsetof(struct big, ull_2);
  struct big *s = (struct big *)malloc(sizeof(struct big));
  if (!s) {
    return -1; /* Indicate malloc() failure */
  }

  memset((char *)s + skip, 0, sizeof(struct big) - skip);
  /* ... */
  free(s);
  s = NULL;

  return 0;
}
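The difference between the two forms can be checked without forming an out-of-bounds pointer. In the sketch below, scaled_advance computes arithmetically how many bytes s + skip would move, and clear_tail applies the byte-granular form. Both function names are hypothetical, and struct big is repeated so that the fragment is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct big {
  unsigned long long ull_1;
  unsigned long long ull_2;
  unsigned long long ull_3;
  int si_4;
  int si_5;
};

/* Bytes that (struct big *)s + skip would advance. Computed arithmetically,
 * because forming that out-of-bounds pointer would itself be undefined behavior. */
size_t scaled_advance(size_t skip) {
  return skip * sizeof(struct big);
}

/* Zero every member from ull_2 onward, leaving ull_1 untouched */
void clear_tail(struct big *s) {
  assert(s != NULL);
  size_t skip = offsetof(struct big, ull_2);
  memset((char *)s + skip, 0, sizeof(struct big) - skip);
}
```

Under the layout assumed in the text (8-byte long long, 32-byte structure), skip is 8, so the noncompliant expression would advance 256 bytes while the cast version advances 8.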

Risk Assessment

Failure to understand and properly use pointer arithmetic can allow an attacker to execute arbitrary code.

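| Recommendation | Severity | Likelihood | Remediation Cost | Priority | Level |
|----------------|----------|------------|------------------|----------|-------|
| EXP08-C | High | Probable | High | P6 | L2 |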

Automated Detection

| Tool | Version | Checker | Description |
|------|---------|---------|-------------|
| Astrée | | | Supported: Astrée reports potential runtime errors resulting from invalid pointer arithmetics. |
| CodeSonar | | LANG.STRUCT.PARITH | Pointer arithmetic |
| | | LANG.MEM.BO | Buffer overrun |
| | | LANG.MEM.BU | Buffer underrun |
| | | LANG.STRUCT.PBB | Pointer before beginning of object |
| | | LANG.STRUCT.PPE | Pointer past end of object |
| | | LANG.MEM.TBA | Tainted buffer access |
| | | LANG.MEM.TO | Type overrun |
| | | LANG.MEM.TU | Type underrun |
| | | LANG.STRUCT.CUP | Comparison of Unrelated Pointers |
| | | LANG.STRUCT.SUP | Subtraction of Unrelated Pointers |
| Helix QAC | | C0488, C2930, C2931, C2932, C2933 | |
| Klocwork | | ABV.ITERATOR, ABV.GENERAL, ABV.GENERAL.MULTIDIMENSION | |
| LDRA tool suite | | 45 D, 53 D, 54 D, 438 S, 576 S | Partially implemented |
| Parasoft C/C++test | | CERT_C-EXP08-a | Pointer arithmetic should not be used |
| | | CERT_C-EXP08-b | Avoid accessing arrays out of bounds |
| Parasoft Insure++ | | | Runtime analysis |
| PC-lint Plus | | 416 | Partially supported |
| Polyspace Bug Finder | | CERT C: Rec. EXP08-C | Checks for: pointer points outside array after arithmetic on pointer operand; subtraction between pointers to different arrays; incorrect pointer scaling. Rec. fully supported. |
| PVS-Studio | | V503, V520, V574, V600, V613, V619, V620, V643, V650, V687, V769, V1004 | |

How long is 4 yards plus 3 feet? It is obvious from elementary arithmetic that any answer involving 7 is wrong, because it does not take the units into account. The right method is to convert both numbers to the same unit: 4 yards is 12 feet, so the total is 15 feet.

The examples in this rule reflect both a correct and an incorrect way to handle comparisons of numbers representing different things (either single bytes or multibyte data structures). The noncompliant examples just add the numbers without regard to units, whereas the compliant solutions use type casts to convert one number to the appropriate unit of the other number.

ROSE can catch both noncompliant examples by searching for pointer arithmetic expressions involving different units. The "different units" is the tricky part, but you can try to identify an expression's units using some simple heuristics:

  • A pointer to a foo object has foo as the unit.
  • A pointer of type char * has byte as the unit.
  • Any sizeof or offsetof expression also has byte as the unit.
  • Any variable used in an index to an array of foo objects (e.g., foo[variable]) has foo as the unit.

In addition to pointer arithmetic expressions, you can also hunt for array index expressions, as array[index] is merely shorthand for *(array + index).
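The unit conversion described by these heuristics can be made explicit in code. The following sketch converts a quantity in bytes into a quantity in int elements before using it as an index or in pointer arithmetic. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert a byte offset (unit: byte) into an index of
 * int elements (unit: int). The offset must be a whole number of elements. */
size_t bytes_to_int_index(size_t byte_offset) {
  assert(byte_offset % sizeof(int) == 0);
  return byte_offset / sizeof(int);
}

/* Read the int located byte_offset bytes into arr using an index expression */
int read_by_index(const int *arr, size_t byte_offset) {
  assert(arr != NULL);
  return arr[bytes_to_int_index(byte_offset)];
}

/* The same access written as explicit pointer arithmetic */
int read_by_pointer(const int *arr, size_t byte_offset) {
  assert(arr != NULL);
  return *(arr + bytes_to_int_index(byte_offset));
}
```

Both functions read the same element, because the index expression and the pointer expression are equivalent once the offset is expressed in the unit of the pointed-to type.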

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

Related Guidelines

SEI CERT C++ Coding Standard
  • VOID EXP08-CPP. Ensure pointer arithmetic is used correctly
ISO/IEC TR 24772:2013
  • Pointer Casting and Pointer Type Changes [HFC]
  • Pointer Arithmetic [RVG]
ISO/IEC TS 17961
  • Forming or using out-of-bounds pointers or array subscripts [invptr]
MISRA C:2012
  • Rule 18.1 (required)
  • Rule 18.2 (required)
  • Rule 18.3 (required)
  • Rule 18.4 (advisory)
MITRE CWE
  • CWE-468, Incorrect pointer scaling

Bibliography

[Dowd 2006] Chapter 6, "C Language Issues"
[Murenin 2007]

