...

Expectations that a system will be dependable are based on confidence that the system will operate as expected and will not fail in normal use. The dependability of a system and its fault tolerance can be measured through its components' reliability, availability, safety, and security. Reliability is the ability of a system or component to perform its required functions under stated conditions for a specified period of time (IEEE 1990 glossary). Availability is the degree to which the system is accessible to its intended users in a timely and reliable way. Both of these factors matter highly in systems used for safety and security. In spite of the best intentions, systems will encounter failures, whether from internal causes such as poorly written software or from external forces such as power outages or variations, floods, or other natural disasters. The reaction to a fault can affect the performance of a system and, in particular, the safety and security of the system and its users.

Effective error handling (which includes error reporting, report aggregation, analysis, response, and recovery) is a central aspect of the design, implementation, maintenance, and operation of systems that exhibit survivability under stress. Survivability is the capability of a system to fulfill its mission, in a timely manner, despite an attack, accident, or other stress that is outside the bounds of normal operation \[[Lipson 00|AA. C References#Lipson 00]\]. If full services cannot be maintained under a given stress, survivable systems degrade gracefully, continue to deliver essential services, and recover full services as conditions permit.
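Graceful degradation can be illustrated with a minimal sketch. Everything named here (`service_level`, `service_state`, `handle_request`, and the two health flags) is hypothetical and chosen only for illustration; it assumes the caller has some way of probing whether the optional and essential subsystems are currently usable.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical service levels, for illustration only. */
typedef enum { SERVICE_FULL, SERVICE_ESSENTIAL } service_level;

typedef struct {
  service_level level; /* level delivered by the most recent request */
} service_state;

static const char *level_name(service_level level) {
  return (level == SERVICE_FULL) ? "full" : "essential";
}

/*
 * Handle one request under the current conditions.
 * optional_ok:  assumed health flag for non-essential features.
 * essential_ok: assumed health flag for the essential service.
 * Returns 0 if at least essential service was delivered, -1 otherwise.
 */
int handle_request(service_state *st, bool optional_ok, bool essential_ok) {
  service_level new_level;

  if (st == NULL) {
    return -1;
  }
  if (!essential_ok) {
    /* Nothing left to degrade to: report the failure to the caller. */
    fprintf(stderr, "error: essential service unavailable\n");
    return -1;
  }

  /* Degrade when optional features fail; recover when they return. */
  new_level = optional_ok ? SERVICE_FULL : SERVICE_ESSENTIAL;
  if (new_level != st->level) {
    fprintf(stderr, "service level changed: %s -> %s\n",
            level_name(st->level), level_name(new_level));
    st->level = new_level;
  }
  return 0;
}
```

In this sketch, a failed optional subsystem lowers the service level instead of failing the request, and the level returns to full on the first request made after the subsystem is healthy again. Only the loss of the essential service is reported to the caller as an error.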

Error reporting and error handling play a central role in the engineering and operation of survivable systems. Survivability is an emergent property of a system as a whole \[[Fisher 99|AA. C References#Fisher 99]\] and depends on the behavior of all of the system's components and the interactions among them. From the viewpoint of error handling, every system component, down to the smallest routine, can be considered to be a sensor capable of reporting on some aspect of the health of the system. Any error (i.e., anomaly) ignored, or improperly handled, could threaten delivery of essential system services and thus put at risk the organizational or business mission that the system supports.
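As one possible illustration of a small routine acting as a "sensor," the sketch below wraps the standard `strtol()` function so that every anomaly it can detect is reported to the caller rather than silently ignored. The `parse_status` codes and the `parse_long()` name are hypothetical and are not taken from any standard or library.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes; each distinct anomaly gets its own report. */
typedef enum {
  PARSE_OK = 0,
  PARSE_NULL_ARG,
  PARSE_NOT_A_NUMBER,
  PARSE_TRAILING_CHARS,
  PARSE_OUT_OF_RANGE
} parse_status;

/*
 * Convert a decimal string to long.
 * On success, stores the value in *out and returns PARSE_OK.
 * On any failure, leaves *out untouched and returns a specific status.
 */
parse_status parse_long(const char *text, long *out) {
  char *end = NULL;
  long value;

  if (text == NULL || out == NULL) {
    return PARSE_NULL_ARG;
  }

  errno = 0; /* strtol() sets errno only on error, so clear it first */
  value = strtol(text, &end, 10);

  if (end == text) {
    return PARSE_NOT_A_NUMBER;   /* no digits were consumed */
  }
  if (errno == ERANGE) {
    return PARSE_OUT_OF_RANGE;   /* value does not fit in a long */
  }
  if (*end != '\0') {
    return PARSE_TRAILING_CHARS; /* unexpected input after the number */
  }

  *out = value;
  return PARSE_OK;
}
```

A caller would check the returned status on every call and decide whether to log, retry, or fall back, for example `if (parse_long(arg, &n) != PARSE_OK) { /* report and recover */ }`. Returning a distinct code for each anomaly lets higher layers aggregate and analyze the reports.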

The key characteristics of survivability include the 3 Rs: resistance, recognition, and recovery. Resistance refers to measures that "harden" a system against particular stresses; recognition refers to situational awareness with respect to instances of stress and their impact on the system; and recovery is the ability of a system to restore services after (and possibly during) an attack, accident, or other event that has disrupted those services. Comprehensive error reporting and handling can

...

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

References

\[[Fisher 99|AA. C References#Fisher 99]\]
\[[Horton 90|AA. C References#Horton 90]\] Section 11 p. 168, Section 14 p. 254
\[[ISO/IEC 9899-1999|AA. C References#ISO/IEC 9899-1999]\] Sections 7.1.4, 7.9.10.4, and 7.11.6.2
\[[ISO/IEC PDTR 24772|AA. C References#ISO/IEC PDTR 24772]\] "REU Termination strategy", "NZN Returning error status"
\[[Koenig 89|AA. C References#Koenig 89]\] Section 5.4 p. 73
\[[Lipson 00|AA. C References#Lipson 00]\]
\[[Lipson 06|AA. C References#Lipson 06]\]
\[[MISRA 04|AA. C References#MISRA 04]\] Rule 16.1
\[[Summit 05|AA. C References#Summit 05]\] C-FAQ Question 20.4
1 Howard Lipson & David Fisher. "Survivability—A New Technical and Business Perspective on Security," 33-39. Proceedings of the 1999 New Security Paradigms Workshop. Caledon Hills, Ontario, Canada, Sept. 22-24, 1999. New York: Association for Computing Machinery, 2000.
2 David Fisher & Howard Lipson, "Emergent Algorithms - A New Method for Enhancing Survivability in Unbounded Systems," Proceedings of the 32nd Annual Hawaii International Conference on System Sciences (HICSS-32). Maui, HI, January 5-8, 1999.
4 Howard Lipson, Evolutionary Systems Design: Recognizing Changes in Security and Survivability Risks, SEI Technical Note, CMU/SEI-2006-TN-027, September 2006.

...
