Error handling is critical to the success and security of your application. Adopt a single error handling policy that matches the goals and requirements of your application domain, and apply it consistently.

Non-Compliant Code Example (Memory Management)

This example, taken from "MEM32-C. Detect and handle critical memory allocation errors," demonstrates why checking the return value of memory allocation routines is critical. The buffer input_string is copied into dynamically allocated memory referenced by str. However, the result of malloc() is not checked before str is referenced. Consequently, if malloc() fails, str is a null pointer, and the call to strcpy() will typically terminate the program abnormally.

/* ... */
size_t size = strlen(input_string);
if (size == SIZE_MAX) {
  /* Handle Error */
}
str = malloc(size+1);
strcpy(str, input_string);
/* ... */
free(str);

Compliant Solution (Memory Management)

Upon failure, the malloc() function returns NULL. This solution checks the return value of malloc() before str is used. Failing to detect and handle this error condition can lead to abnormal and abrupt program termination.

/* ... */
size_t size = strlen(input_string);
if (size == SIZE_MAX) {
  /* Handle Error */
}
str = malloc(size+1);
if (str == NULL) {
  /* Handle Allocation Error */
}
strcpy(str, input_string);
/* ... */
free(str);
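
The fragment above can be folded into a function that reports failure to its caller. The following is a minimal sketch, not the original author's code: the helper name copy_string and its convention of returning NULL on failure are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the original example): returns a
 * heap-allocated copy of input_string, or NULL if the size computation
 * would wrap or the allocation fails. The caller owns the result and
 * must free() it.
 */
char *copy_string(const char *input_string) {
  size_t size = strlen(input_string);
  if (size == SIZE_MAX) {
    return NULL;              /* size + 1 would wrap around to 0 */
  }
  char *str = malloc(size + 1);
  if (str == NULL) {
    return NULL;              /* allocation failed; let the caller decide */
  }
  memcpy(str, input_string, size + 1);  /* also copies the terminating '\0' */
  return str;
}
```

A caller would test the result against NULL before using it, so the allocation failure is handled in one place according to the application's policy.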

Non-Compliant Code Example (File Operations)

In this example, fopen() is used to open a file for reading. If fopen() is unable to open the file, it returns a null pointer. Failing to detect and handle this error condition can lead to abnormal and abrupt program termination.

FILE *fptr = fopen("MyFile.txt","r");

Compliant Solution (File Operations)

To correct this example, the return value of fopen() should be checked for NULL.

FILE *fptr = fopen("MyFile.txt","r");
if (fptr == NULL) {
   /* Handle error condition */
}
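
As a fuller illustration of the same check, the sketch below opens a file, reports a failure, and propagates an error code to the caller. The function name count_bytes and its return convention are hypothetical. Reporting through errno assumes a POSIX-like system, because ISO C does not require fopen() to set errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: counts the bytes in the file at `path`.
 * Returns 0 on success and stores the count in *count; returns -1 if the
 * file cannot be opened or a read or close error occurs.
 */
int count_bytes(const char *path, long *count) {
  FILE *fptr = fopen(path, "r");
  if (fptr == NULL) {
    /* Assumption: errno was set by fopen(), as on POSIX systems. */
    fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    return -1;
  }
  long n = 0;
  while (fgetc(fptr) != EOF) {
    n++;
  }
  int failed = ferror(fptr);   /* distinguish a read error from end of file */
  if (fclose(fptr) == EOF || failed) {
    return -1;
  }
  *count = n;
  return 0;
}
```

The output parameter is written only on success, so a caller that ignores the return value by mistake does not see a partial count.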

...

ISO/IEC PDTR 24772 Section 6.47, "REU Termination strategy" says:

Expectations that a system will be dependable are based on the confidence that the system will operate as expected and not fail in normal use. The dependability of a system and its fault tolerance can be measured through the component part's reliability, availability, safety and security. Reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time IEEE 1990 glossary. Availability is how timely and reliable the system is to its intended users. Both of these factors matter highly in systems used for safety and security. In spite of the best intentions, systems will encounter a failure, either from internally poorly written software or external forces such as power outages/variations, floods, or other natural disasters. The reaction to a fault can affect the performance of a system and in particular, the safety and security of the system and its users.

When a fault is detected, there are many ways in which a system can react. The quickest and most noticeable way is to fail hard, also known as fail fast or fail stop. The reaction to a detected fault is to immediately halt the system. Alternatively, the reaction to a detected fault could be to fail soft. The system would keep working with the faults present, but the performance of the system would be degraded. Systems used in a high availability environment such as telephone switching centers, e-commerce, etc. would likely use a fail soft approach. What is actually done in a fail soft approach can vary depending on whether the system is used for safety critical or security critical purposes. For fail safe systems, such as flight controllers, traffic signals, or medical monitoring systems, there would be no effort to meet normal operational requirements, but rather to limit the damage or danger caused by the fault. A system that fails securely, such as cryptologic systems, would maintain maximum security when a fault is detected, possibly through a denial of service.
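
The difference between failing hard and failing soft can be sketched in a few lines. The example below is illustrative only: the quoted text defines no API, so the policy names, the read_sensor function, the use of a negative raw value to stand in for a detected fault, and the fallback value are all assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical policy names; the quoted text does not define an API. */
typedef enum { FAIL_HARD, FAIL_SOFT } fault_policy;

#define FALLBACK_READING 0.0   /* assumed safe default for degraded mode */

/*
 * Hypothetical sensor reader: raw < 0 stands in for a detected fault.
 * Sets *degraded to 1 when the fallback value is returned.
 */
double read_sensor(int raw, fault_policy policy, int *degraded) {
  *degraded = 0;
  if (raw < 0) {                        /* fault detected */
    if (policy == FAIL_HARD) {
      fprintf(stderr, "sensor fault: halting\n");
      abort();                          /* fail hard: stop immediately */
    }
    fprintf(stderr, "sensor fault: using fallback value\n");
    *degraded = 1;                      /* fail soft: keep running, degraded */
    return FALLBACK_READING;
  }
  return raw / 10.0;                    /* normal path: scale the raw value */
}
```

Whether a fallback value is acceptable depends on the system. A fail-safe or fail-secure design might instead move to a restricted state rather than continue with substitute data.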

ISO/IEC PDTR 24772 Section 6.47, "REU Termination strategy" also says:

The reaction to a fault in a system can depend on the criticality of the part in which the fault originates. When a program consists of several tasks, the tasks each may be critical, or not. If a task is critical, it may or may not be restartable by the rest of the program. Ideally, a task which detects a fault within itself should be able to halt leaving its resources available for use by the rest of the program, halt clearing away its resources, or halt the entire program. The latency of any such communication, and whether other tasks can ignore such a communication, should be clearly specified. Having inconsistent reactions to a fault, such as the fault reaction to a crypto fault, can potentially be a vulnerability.

ISO/IEC PDTR 24772 Section 6.47, "REU Termination strategy" describes the following mitigation strategies:

Software developers can avoid the vulnerability or mitigate its ill effects in the following ways:

  • A strategy for fault handling should be decided. Consistency in fault handling should be the same with respect to critically similar parts.
  • A multi-tiered approach of fault prevention, fault detection and fault reaction should be used.
  • System-defined components that assist in uniformity of fault handling should be used when available. For one example, designing a "runtime constraint handler" (as described in ISO/IEC TR 24731-1) permits the application to intercept various erroneous situations and perform one consistent response, such as flushing a previous transaction and re-starting at the next one.
  • When there are multiple tasks, a fault-handling policy should be specified whereby a task may:
    • halt, and keep its resources available for other tasks (perhaps permitting restarting of the faulting task)
    • halt, and remove its resources (perhaps to allow other tasks to use the resources so freed, or to allow a recreation of the task)
    • halt, and signal the rest of the program to likewise halt.
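
One way to keep fault reactions consistent is to route every detected fault through a single function that applies a reaction chosen in advance for each task. The sketch below models the three reactions listed above. It is a simplified, single-threaded illustration, and the type names, the task structure, the handle_fault function, and the global halt flag are all hypothetical. It does not use the runtime constraint handler interface of ISO/IEC TR 24731-1.

```c
#include <stdio.h>
#include <stdlib.h>

/* The three reactions listed above, as hypothetical policy values. */
typedef enum {
  HALT_KEEP_RESOURCES,     /* halt; leave resources for other tasks or a restart */
  HALT_RELEASE_RESOURCES,  /* halt and free the task's resources */
  HALT_PROGRAM             /* halt and signal the rest of the program to halt */
} fault_reaction;

/* Hypothetical task record; a real system would hold thread handles, etc. */
typedef struct {
  const char *name;
  void *buffer;             /* resource owned by the task, from malloc() */
  int running;
  fault_reaction on_fault;  /* decided once, according to task criticality */
} task;

/* Assumed convention: the other tasks poll this flag and stop when it is set. */
static int program_halt_requested = 0;

/* Single entry point, so every fault receives the specified reaction. */
void handle_fault(task *t, const char *what) {
  fprintf(stderr, "fault in task %s: %s\n", t->name, what);
  t->running = 0;
  switch (t->on_fault) {
  case HALT_KEEP_RESOURCES:
    break;                        /* buffer stays valid for reuse */
  case HALT_RELEASE_RESOURCES:
    free(t->buffer);
    t->buffer = NULL;
    break;
  case HALT_PROGRAM:
    program_halt_requested = 1;
    break;
  }
}
```

In a multithreaded program the flag and the task records would need synchronization, and the latency with which other tasks observe the halt request should be specified, as the quoted text notes.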

Risk Analysis

Failing to detect error conditions can result in unexpected program behavior and possibly abnormal program termination, resulting in a denial-of-service condition.

Recommendation | Severity   | Likelihood   | Remediation Cost | Priority | Level
ERR00-A        | 2 (medium) | 2 (probable) | 2 (medium)       | P8       | L2

Automated Detection

...

Related Vulnerabilities

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

...