Errors during floating point operations are often neglected by applications; instead, most of the effort usually goes into validating the operands before an operation. Such errors are admittedly difficult to detect and diagnose, but the benefits of checking for them may outweigh the costs. This recommendation suggests ways to capture errors during floating point operations.

The following code has undefined behavior:

Code Block
int j = 0;
int iResult = 1 / j;

Running the code above results in undefined behavior. On most implementations, integer division by zero is a terminal error, commonly printing a diagnostic message and aborting the program.
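For integer division, the operands can be validated before the operation, which is the approach most applications already take. The following is a minimal sketch of that idea; the helper name checked_int_div and its return convention are assumptions made for illustration and are not part of this recommendation.

```c
#include <limits.h>

/*
 * Hypothetical helper: stores num / den in *result and returns 0,
 * or returns -1 without dividing when the operation would have
 * undefined behavior.
 */
int checked_int_div(int num, int den, int *result) {
  if (den == 0) {
    return -1;  /* division by zero */
  }
  if (num == INT_MIN && den == -1) {
    return -1;  /* quotient is not representable in int */
  }
  *result = num / den;
  return 0;
}
```

A caller tests the return value before using *result, for example `if (checked_int_div(1, j, &iResult) != 0) { /* handle error */ }`.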

...

|| Operating System || How to handle floating point errors ||
| Linux \\ Solaris 10 \\ Mac OS X 10.5 | Use the C99 floating point exception functions. |
| Windows | Either use the C99 floating point exception functions or structured exception handling through {{_fpieee_flt}} \[[MSDN|AA. C References#MSDN]\] |

...

Compliant Solution 2 (Windows)

Microsoft Visual Studio 2008 and earlier versions do not support the C99 functions for handling floating point errors. Windows provides an alternative method for obtaining the floating point exception code using _statusfp(), _statusfp2(), and _clearfp().

...

Compliant Solution 3 (Windows)

Microsoft Visual Studio 2008 also supports structured exception handling (SEH) for floating point errors. SEH allows the programmer to change the result of the floating point operation that caused the error condition, and it also provides more information about that condition.

Code Block
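#include <excpt.h>
#include <float.h>
#include <fpieee.h>
#include <math.h>
#include <stdio.h>

void unmask_fpsr(void);
int fpieee_handler(_FPIEEE_RECORD *ieee);

void fp_usingSEH(void) {
  /* ... */
  double a = 1e-40, b, c = 0.1;
  float x = 0, y;
  unsigned int rv;

  /* Unmask the floating point exceptions so that they trap */
  unmask_fpsr();

  __try {
    /* Store into y is inexact and underflows */
    y = a;

    /* divide by zero operation */
    b = y / x;

    /* inexact */
    c = sin(30) * a;
  }
  __except (_fpieee_flt(
              GetExceptionCode(),
              GetExceptionInformation(),
              fpieee_handler))
  {
    printf("fpieee_handler: EXCEPTION_EXECUTE_HANDLER\n");
  }

  /* ... */
}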

void unmask_fpsr(void) {
  unsigned int u;
  unsigned int control_word;
  _controlfp_s(&control_word, 0, 0);
  u = control_word & ~(_EM_INVALID 
                     | _EM_DENORMAL 
                     | _EM_ZERODIVIDE 
                     | _EM_OVERFLOW 
                     | _EM_UNDERFLOW 
                     | _EM_INEXACT);
  _controlfp_s(&control_word, u, _MCW_EM);
}

int fpieee_handler(_FPIEEE_RECORD *ieee) {
  /* ... */

  switch (ieee->RoundingMode) {
    case _FpRoundNearest:
      /* ... */
      break;

    /* Other rounding modes include _FpRoundMinusInfinity,
     * _FpRoundPlusInfinity, _FpRoundChopped */

    /* ... */
  }

  switch (ieee->Precision) {
    case _FpPrecision24:
      /* ... */
      break;

    /* Other precisions include _FpPrecision53 */
    /* ... */
  }

  switch (ieee->Operation) {
    case _FpCodeAdd:
      /* ... */
      break;

    /* Other operations include _FpCodeSubtract, _FpCodeMultiply,
     * _FpCodeDivide, _FpCodeSquareRoot, _FpCodeCompare,
     * _FpCodeConvert, _FpCodeConvertTrunc */
    /* ... */
  }

  /* process the bitmap ieee->Cause */
  /* process the bitmap ieee->Enable */
  /* process the bitmap ieee->Status */
  /* process the Operand ieee->Operand1, 
   * evaluate format and Value */
  /* process the Operand ieee->Operand2, 
   * evaluate format and Value */
  /* process the Result ieee->Result, 
   * evaluate format and Value */
  /* The result should be set according to the operation 
   * specified in ieee->Cause and the result format as 
   * specified in ieee->Result */

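  /* ... */

  /* Return a filter disposition; EXCEPTION_EXECUTE_HANDLER matches
   * the message printed by the handler block in fp_usingSEH() */
  return EXCEPTION_EXECUTE_HANDLER;
}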

...

Wiki Markup
\[[IEEE 754|AA. C References#IEEE 754 2006]\]
\[[Intel 01|AA. C References#Intel 01]\]
\[[Keil 08|AA. C References#Keil 08]\]
\[[MSDN|AA. C References#MSDN]\] "[fpieee_flt (CRT)|http://msdn.microsoft.com/en-us/library/te2k2f2t(VS.80).aspx]"
\[[Open Group 04|AA. C References#Open Group 04]\] "[{{fenv.h}} - Floating point environment|http://www.opengroup.org/onlinepubs/009695399/basedefs/fenv.h.html]"
\[[SecurityFocus 07|AA. C References#SecurityFocus 07]\] 

...
