Exceptional Floating Point

Floating-point math has an answer for everything, but sometimes that’s not what you want. Sometimes instead of getting an answer to the question sqrt(-1.0) (it’s NaN) it’s better to know that your software is asking imaginary questions.

The IEEE standard for floating-point math defines five exceptions that shall be signaled when certain conditions are detected. Normally the flags for these exceptions are raised (set), a default result is delivered, and execution continues. This default behavior is often desirable, especially in a shipping game, but during development it can be useful to halt when an exception is signaled.

Halting on exceptions can be like adding an assert to every floating-point operation in your program, and can therefore be a great way to improve code reliability, and find mysterious behavior at its root cause.

This article is part of a series on floating-point.

Let’s get it started again

The five exceptions mandated by the IEEE floating-point standard are:

  1. Invalid operation: this is signaled if there is no usefully definable result, such as zero divided by zero, infinity minus infinity, or sqrt(-1). The default result is a NaN (Not a Number).
  2. Division by zero: this is signaled when dividing a non-zero number by zero. The result is a correctly signed infinity.
  3. Overflow: this is signaled when the rounded result won’t fit. The default result is a correctly signed infinity.
  4. Underflow: this is signaled when the result is non-zero and between -FLT_MIN and FLT_MIN. The default result is the rounded result.
  5. Inexact: this is signaled any time the result of an operation is not exact. The default result is the rounded result.
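As a rough illustration of the list above, the following sketch uses the standard C++ <cfenv> interface (rather than the VC++-specific functions used later in this article) to observe which flags each kind of operation raises in the default, non-halting mode. The helper name flagsRaisedBy and the global variable names are made up for this example, and the volatile qualifiers are only there to stop the compiler from evaluating the arithmetic at build time.

```cpp
#include <cfenv>
#include <cfloat>

// Hypothetical helper: clears all flags, runs one operation, and returns
// the set of exception flags that the operation raised.
template <typename Op>
int flagsRaisedBy(Op op)
{
    std::feclearexcept(FE_ALL_EXCEPT);
    op();
    return std::fetestexcept(FE_ALL_EXCEPT);
}

// volatile keeps the compiler from folding the expressions at build time.
static volatile float g_zero = 0.0f;
static volatile float g_one = 1.0f;
static volatile float g_three = 3.0f;
static volatile float g_max = FLT_MAX;
static volatile float g_min = FLT_MIN;
static volatile float g_sink;

// One representative operation for each of the five exceptions.
int invalidFlags()   { return flagsRaisedBy([] { g_sink = g_zero / g_zero; }); }
int divByZeroFlags() { return flagsRaisedBy([] { g_sink = g_one / g_zero; }); }
int overflowFlags()  { return flagsRaisedBy([] { g_sink = g_max * g_max; }); }
int underflowFlags() { return flagsRaisedBy([] { g_sink = g_min / g_three; }); }
int inexactFlags()   { return flagsRaisedBy([] { g_sink = g_one / g_three; }); }
```

Each function returns a bit set that can be tested against FE_INVALID, FE_DIVBYZERO, FE_OVERFLOW, FE_UNDERFLOW, and FE_INEXACT. Overflow and underflow normally come with the inexact flag set as well, since the delivered result is rounded.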

The underflow exception is usually not of interest to game developers – it happens rarely, and usually doesn’t detect anything of interest. The inexact result is also usually not of interest to game developers – it happens frequently (although not always, and it can be useful to understand what operations are exact) and usually doesn’t detect anything of interest.

That leaves invalid operation, division by zero, and overflow. In the context of game development these are usually truly exceptional. They are rarely done intentionally, so they usually indicate a bug. In many cases these bugs are benign, but occasionally these bugs indicate real problems. From now on I’ll refer to these first three exceptions as being the ‘bad’ exceptions and assume that game developers would like to avoid them, if only so that the exceptions can be enabled without causing crashes during normal game play.

When can divide by zero be useful?

While the ‘bad’ exceptions typically represent invalid operations in the context of games, this is not necessarily true in all contexts. The default result (infinity) of division by zero can allow a calculation to continue and produce a valid result, and the default result (NaN) of invalid operation can sometimes allow a fast algorithm to be used and, if a NaN result is produced, a slower and more robust algorithm to be used instead.
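One possible shape for the fast-then-robust pattern is sketched below. The formula (a two-input normalized exponential weight) and the function names are assumptions chosen purely for illustration; they are not taken from any particular codebase. The fast path produces infinity divided by infinity, and therefore NaN, when the inputs are large, and the wrapper only pays for the more careful path in that case.

```cpp
#include <algorithm>
#include <cmath>

// Fast path: direct evaluation of exp(a) / (exp(a) + exp(b)).
// For large inputs both exponentials overflow to infinity and the
// quotient is infinity / infinity, which is NaN.
double weightFast(double a, double b)
{
    const double ea = std::exp(a);
    const double eb = std::exp(b);
    return ea / (ea + eb);
}

// Robust path: subtract the larger input first so that neither
// exponential can overflow.
double weightRobust(double a, double b)
{
    const double m = std::max(a, b);
    const double ea = std::exp(a - m);
    const double eb = std::exp(b - m);
    return ea / (ea + eb);
}

// Try the fast path and fall back only if it produced a NaN.
double weight(double a, double b)
{
    const double fast = weightFast(a, b);
    return std::isnan(fast) ? weightRobust(a, b) : fast;
}
```

This only works if the invalid operation exception is left masked, which is exactly the trade-off being discussed: code written this way is not ‘exception clean’.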

The classic example of the value of the division by zero behavior is calculation of parallel resistance. The formula for this for two resistors with resistance R1 and R2 is:

R = 1 / (1/R1 + 1/R2)

Because division by zero gives a result of infinity, and because infinity plus another number gives infinity, and because a finite number divided by infinity gives zero, this formula yields the correct parallel resistance of zero when either R1 or R2 is zero. Without this behavior the code would need to check whether R1 or R2 is zero and handle that case specially.

In addition, this calculation will give a result of zero if R1 or R2 is very small (smaller than the reciprocal of FLT_MAX or DBL_MAX), because its reciprocal then overflows to infinity. This zero result is not technically correct. If a programmer needs to distinguish between these scenarios then the overflow and division by zero flags will need to be monitored.
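A minimal sketch of the parallel resistance calculation with flag monitoring might look like this. It uses the standard <cfenv> functions; the struct and function names are invented for the example, and the volatile temporaries are there only to force the arithmetic to happen at run time.

```cpp
#include <cfenv>

struct ParallelResult
{
    float ohms;          // combined resistance
    bool dividedByZero;  // an input was exactly zero
    bool overflowed;     // a reciprocal was too large to represent
};

ParallelResult parallelResistance(float r1, float r2)
{
    // volatile forces the arithmetic to happen at run time so that the
    // flags reflect these exact operations.
    volatile float a = r1;
    volatile float b = r2;

    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float sum = 1.0f / a + 1.0f / b;
    volatile float result = 1.0f / sum;

    ParallelResult out;
    out.ohms = result;
    out.dividedByZero = std::fetestexcept(FE_DIVBYZERO) != 0;
    out.overflowed = std::fetestexcept(FE_OVERFLOW) != 0;
    return out;
}
```

Calling parallelResistance(0.0f, 5.0f) gives zero ohms with dividedByZero set, while a tiny but non-zero input such as 1e-39f also gives zero ohms but sets overflowed instead, which is how the two scenarios can be told apart.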

The interpretation of divide-by-zero as infinity bothers some as can be seen in this official interpretation request/response, which explains the decision quite well.

Resistance is futile

Assuming that we are not trying to make use of the divide-by-zero behavior we need a convenient way of turning on the ‘bad’ floating-point exceptions. And, since we have to coexist with other code (calling out to physics libraries, D3D, and other code that may not be ‘exception clean’) we also need a way of temporarily turning off all floating-point exceptions.

The appropriate way to do this is with a pair of classes whose constructors and destructors do the necessary magic. Here are some classes that do that, for VC++:

#include <float.h>  // _controlfp_s, _clearfp, _MCW_EM, _EM_* constants

// Declare an object of this type in a scope in order to suppress
// all floating-point exceptions temporarily. The old exception
// state will be reset at the end.
class FPExceptionDisabler
{
public:
    FPExceptionDisabler()
    {
        // Retrieve the current state of the exception masks. This
        // must be done before changing them. Passing a mask of zero
        // reads the control word without modifying it.
        _controlfp_s(&mOldValues, 0, 0);
        // Set all of the exception mask bits, which suppresses FP
        // exceptions on the x87 and SSE units. _MCW_EM is a bit
        // mask representing all available exception masks.
        _controlfp_s(0, _MCW_EM, _MCW_EM);
    }
    ~FPExceptionDisabler()
    {
        // Clear any pending FP exceptions. This must be done
        // prior to enabling FP exceptions since otherwise there
        // may be a 'deferred crash' as soon as the exceptions are
        // enabled.
        _clearfp();

        // Reset (possibly enabling) the exception status.
        _controlfp_s(0, mOldValues, _MCW_EM);
    }

private:
    unsigned int mOldValues;

    // Make the copy constructor and assignment operator private
    // and unimplemented to prohibit copying.
    FPExceptionDisabler(const FPExceptionDisabler&);
    FPExceptionDisabler& operator=(const FPExceptionDisabler&);
};

// Declare an object of this type in a scope in order to enable a
// specified set of floating-point exceptions temporarily. The old
// exception state will be reset at the end.
// This class can be nested.
class FPExceptionEnabler
{
public:
    // Overflow, divide-by-zero, and invalid-operation are the FP
    // exceptions most frequently associated with bugs.
    FPExceptionEnabler(unsigned int enableBits = _EM_OVERFLOW | _EM_ZERODIVIDE | _EM_INVALID)
    {
        // Retrieve the current state of the exception masks. This
        // must be done before changing them. Passing a mask of zero
        // reads the control word without modifying it; a non-zero
        // mask would return the value after the change instead.
        _controlfp_s(&mOldValues, 0, 0);

        // Make sure no non-exception flags have been specified,
        // to avoid accidental changing of rounding modes, etc.
        // _MCW_EM is a bit mask representing all available
        // exception masks.
        enableBits &= _MCW_EM;

        // Clear any pending FP exceptions. This must be done
        // prior to enabling FP exceptions since otherwise there
        // may be a 'deferred crash' as soon as the exceptions are
        // enabled.
        _clearfp();

        // Zero out the specified bits, leaving other bits alone.
        _controlfp_s(0, ~enableBits, enableBits);
    }
    ~FPExceptionEnabler()
    {
        // Reset the exception state.
        _controlfp_s(0, mOldValues, _MCW_EM);
    }

private:
    unsigned int mOldValues;

    // Make the copy constructor and assignment operator private
    // and unimplemented to prohibit copying.
    FPExceptionEnabler(const FPExceptionEnabler&);
    FPExceptionEnabler& operator=(const FPExceptionEnabler&);
};

The comments explain a lot of the details, but I’ll mention a few here as well.

_controlfp_s is the secure version of the portable version of the old _control87 function. _controlfp_s controls exception settings for both the x87 and SSE FPUs. It can also be used to control rounding directions on both FPUs, and on the x87 FPU it can be used to control the precision settings. These classes use the mask parameter to ensure that only the exception settings are altered.
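To make the role of the mask parameter concrete, here is a small model of the update rule that Microsoft documents for _control87 and its relatives: bits selected by the mask are taken from the new value, and all other bits are left alone. The function names are hypothetical, and the constants are stand-ins written out for this sketch (an assumption; real code should use the definitions from <float.h>).

```cpp
// Illustrative stand-ins for the VC++ <float.h> constants. The values are
// assumptions made for this sketch; use the real header in real code.
const unsigned int kEmInexact    = 0x00000001;
const unsigned int kEmUnderflow  = 0x00000002;
const unsigned int kEmOverflow   = 0x00000004;
const unsigned int kEmZeroDivide = 0x00000008;
const unsigned int kEmInvalid    = 0x00000010;
const unsigned int kEmDenormal   = 0x00080000;
const unsigned int kMcwEm        = 0x0008001f;  // all exception mask bits
const unsigned int kRcUp         = 0x00000200;  // a rounding-mode setting
const unsigned int kMcwRc        = 0x00000300;  // all rounding-control bits

// Model of the update rule: only the bits selected by mask are taken
// from newValue; every other bit keeps its old value.
unsigned int applyControl(unsigned int oldWord, unsigned int newValue, unsigned int mask)
{
    return (oldWord & ~mask) | (newValue & mask);
}

// The update performed by an enabler-style constructor, applied to the model.
// A set mask bit means "exception suppressed", so enabling clears bits.
unsigned int enableExceptions(unsigned int oldWord, unsigned int enableBits)
{
    enableBits &= kMcwEm;  // ignore anything that is not an exception bit
    return applyControl(oldWord, ~enableBits, enableBits);
}
```

Starting from a control word with every exception masked and a non-default rounding mode, enabling the three ‘bad’ exceptions clears just those three bits; the other exception masks and the rounding mode pass through unchanged.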

The floating-point exception flags are sticky, so when an exception flag is raised it will stay set until explicitly cleared. This means that if you choose not to enable floating-point exceptions you can still detect whether any have happened. And – not so obviously – if the exception associated with a flag is enabled after the flag is raised then an exception will be triggered on the next FPU instruction, even if that is several weeks after the operation that raised the flag. Therefore it is critical that the exception flags be cleared each time before exceptions are enabled.
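The stickiness of the flags can be seen with a few lines of standard C++. This is only a sketch using <cfenv> rather than the VC++ functions above, the function names are invented for the example, and volatile is used to keep the compiler from optimizing the arithmetic away.

```cpp
#include <cfenv>

static volatile float g_zero = 0.0f;
static volatile float g_sink;

// Returns true if the divide-by-zero flag is still set after a long run
// of unrelated, well-behaved arithmetic.
bool flagSurvivesLaterWork()
{
    std::feclearexcept(FE_ALL_EXCEPT);
    g_sink = 1.0f / g_zero;          // raises the divide-by-zero flag

    volatile float sum = 0.0f;
    for (int i = 0; i < 1000; ++i)
        sum = sum + 1.0f;            // exact additions, no new exceptions

    return std::fetestexcept(FE_DIVBYZERO) != 0;
}

// Returns true if an explicit clear removes the stale flag.
bool flagGoneAfterClear()
{
    flagSurvivesLaterWork();         // leaves the flag raised
    std::feclearexcept(FE_DIVBYZERO);
    return std::fetestexcept(FE_DIVBYZERO) == 0;
}
```

The stale flag left behind by the first function is the kind of state that would cause a deferred crash if exceptions were enabled without clearing the flags first.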

Typical usage

The floating-point exception flags are part of the processor state, which means that they are per-thread settings. Therefore, if you want exceptions enabled everywhere you need to do it in each thread, typically in main/WinMain and in your thread start function, by dropping an FPExceptionEnabler object at the top of these functions.

When calling out to D3D or any code that may use floating-point in a way that triggers these exceptions you need to drop in an FPExceptionDisabler object.

Alternately, if most of your code is not FP exception clean then it may make more sense to leave FP exceptions disabled most of the time and then enable them in particular areas, such as particle systems.

Because there is some cost associated with changing the exception state (the FPU pipelines will be flushed at the very least), and because making your code more crashy is probably not what you want for your shipping game, you should put #ifdefs in the constructors and destructors so that these objects become NOPs in your retail builds.
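One way the compile-out pattern could look is sketched below. To keep the example portable it is built on the standard feholdexcept and fesetenv functions instead of _controlfp_s, so it is a stand-in for the idea rather than a drop-in replacement for the classes above. The macro name FP_EXCEPTION_GUARDS and the class name are assumptions made for this example.

```cpp
#include <cfenv>

// Hypothetical build switch: define FP_EXCEPTION_GUARDS as 1 in development
// builds and as 0 in retail builds.
#ifndef FP_EXCEPTION_GUARDS
#define FP_EXCEPTION_GUARDS 1
#endif

// Portable stand-in for an exception-suppressing guard, built on standard
// <cfenv> calls instead of _controlfp_s.
class ScopedFPExceptionSuppressor
{
public:
    ScopedFPExceptionSuppressor()
    {
#if FP_EXCEPTION_GUARDS
        // Save the whole floating-point environment, clear the flags,
        // and switch to non-stop (no trap) mode.
        std::feholdexcept(&mSavedEnv);
#endif
    }
    ~ScopedFPExceptionSuppressor()
    {
#if FP_EXCEPTION_GUARDS
        // Restore the saved environment, which discards any flags
        // raised inside the scope.
        std::fesetenv(&mSavedEnv);
#endif
    }

    ScopedFPExceptionSuppressor(const ScopedFPExceptionSuppressor&) = delete;
    ScopedFPExceptionSuppressor& operator=(const ScopedFPExceptionSuppressor&) = delete;

private:
#if FP_EXCEPTION_GUARDS
    std::fenv_t mSavedEnv;
#endif
};

static volatile float g_zero = 0.0f;

// Example of 'exception dirty' code wrapped in the guard.
float divideByZeroQuietly()
{
    ScopedFPExceptionSuppressor quiet;
    volatile float result = 1.0f / g_zero;
    return result;
}
```

With the switch set to 0 the constructor and destructor are empty, so call sites do not need their own #ifdefs and the guard costs nothing in retail builds.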

There have been various cases in the past (printer drivers from a manufacturer who shall not be named) where third-party code would enable floating-point exceptions and leave them enabled, meaning that some perfectly legitimate software would start crashing after calling into that code (such as after printing). Having somebody’s hapless code crash after calling a function in your code is a horrible experience, so be particularly careful if your code may end up injected into other processes. In that situation you must not leave floating-point exceptions enabled when you return, and you may need to be tolerant of being called with floating-point exceptions enabled.

Performance implications of exceptions

Raising the exception flags (triggering a floating-point exception) should have no performance implications. These flags are raised frequently enough that any CPU designer will make sure that doing so is free. For example, the inexact flag is raised on virtually every floating-point instruction.

However, having exceptions enabled can be expensive. Delivering precise exceptions on superscalar CPUs can be challenging, and some CPUs implement this by disabling FPU parallelism when floating-point exceptions are enabled. This hurts performance. The PowerPC CPU used in the Xbox 360 (and presumably the one used in the PS3) slows down significantly when any floating-point exceptions are enabled. This means that when using this technique on these processors you should only enable FPU exceptions on an as-needed basis.

Sample code

The sample code below calls TryDivByZero() three times – once in the default environment, once with the three ‘bad’ floating-point exceptions enabled, and once with them suppressed again. TryDivByZero does a floating-point divide-by-zero inside a Win32 __try/__except block in order to catch exceptions, print a message, and allow the tests to continue. This type of structured exception handling block should not (repeat not) be used in production code, except possibly to record crashes and then exit. I hesitate to demonstrate this technique because I fear it will be misused. Continuing after unexpected structured exceptions is pure evil.

With that said, here is the code:

#include <windows.h>  // PEXCEPTION_POINTERS, DWORD, STATUS_FLOAT_* codes
#include <float.h>    // _clearfp
#include <stdio.h>    // printf

int __cdecl DescribeException(PEXCEPTION_POINTERS pData, const char *pFunction)
{
    // Clear the exception or else every FP instruction will
    // trigger it again.
    _clearfp();

    DWORD exceptionCode = pData->ExceptionRecord->ExceptionCode;
    const char* pDescription = NULL;
    switch (exceptionCode)
    {
    case STATUS_FLOAT_INVALID_OPERATION:
        pDescription = "float invalid operation";
        break;
    case STATUS_FLOAT_DIVIDE_BY_ZERO:
        pDescription = "float divide by zero";
        break;
    case STATUS_FLOAT_OVERFLOW:
        pDescription = "float overflow";
        break;
    case STATUS_FLOAT_UNDERFLOW:
        pDescription = "float underflow";
        break;
    case STATUS_FLOAT_INEXACT_RESULT:
        pDescription = "float inexact result";
        break;
    case STATUS_FLOAT_MULTIPLE_TRAPS:
        // This seems to occur with SSE code.
        pDescription = "float multiple traps";
        break;
    default:
        pDescription = "unknown exception";
        break;
    }

    void* pErrorOffset = 0;
#if defined(_M_IX86)
    void* pIP = (void*)pData->ContextRecord->Eip;
    pErrorOffset = (void*)pData->ContextRecord->FloatSave.ErrorOffset;
#elif defined(_M_X64)
    void* pIP = (void*)pData->ContextRecord->Rip;
#else
#error Unknown processor
#endif

    // DWORD is an unsigned long, so print it with %lx.
    printf("Crash with exception %lx (%s) in %s at %p!\n",
        exceptionCode, pDescription, pFunction, pIP);

    if (pErrorOffset)
    {
        // Float exceptions may be reported in a delayed manner, so
        // report the actual instruction as well.
        printf("Faulting instruction may actually be at %p.\n", pErrorOffset);
    }

    // Return this value to execute the __except block and continue as if
    // all was fine, which is a terrible idea in shipping code.
    return EXCEPTION_EXECUTE_HANDLER;
    // Return this value to let the normal exception handling process
    // continue after printing diagnostics/saving crash dumps/etc.
    //return EXCEPTION_CONTINUE_SEARCH;
}

static float g_zero = 0;

void TryDivByZero()
{
    __try
    {
        float inf = 1.0f / g_zero;
        printf("No crash encountered, we successfully calculated %f.\n", inf);
    }
    __except (DescribeException(GetExceptionInformation(), __FUNCTION__))
    {
        // Do nothing here; DescribeException() has already done
        // everything that is needed.
    }
}

int main(int argc, char* argv[])
{
#if _M_IX86_FP == 0
    const char* pArch = "with the default FPU architecture";
#elif _M_IX86_FP == 1
    const char* pArch = "/arch:sse";
#elif _M_IX86_FP == 2
    const char* pArch = "/arch:sse2";
#else
#error Unknown FP architecture
#endif
    // sizeof yields a size_t, so print it with %zu.
    printf("Code is compiled for %zu bits, %s.\n", sizeof(void*) * 8, pArch);

    // Do an initial divide-by-zero.
    // In the registers window if display of Floating Point
    // is enabled then the STAT register will have 4 ORed
    // into it, and the floating-point section's EIP register
    // will be set to the address of the instruction after
    // the fdiv.
    printf("\nDo a divide-by-zero in the default mode.\n");
    TryDivByZero();
    {
        // Now enable the default set of exceptions. If the
        // enabler object doesn't call _clearfp() then we
        // will crash at this point.
        FPExceptionEnabler enabled;
        printf("\nDo a divide-by-zero with FP exceptions enabled.\n");
        TryDivByZero();
        {
            // Now let's disable exceptions and do another
            // divide-by-zero.
            FPExceptionDisabler disabled;
            printf("\nDo a divide-by-zero with FP exceptions disabled.\n");
            TryDivByZero();
        }
    }

    return 0;
}

Typical output is:

[Image: console output of the sample program]

When generating SSE code I sometimes see STATUS_FLOAT_MULTIPLE_TRAPS instead of STATUS_FLOAT_DIVIDE_BY_ZERO. This is slightly less helpful, but the root cause should be straightforward to determine.

That said, determining the root cause can be slightly tricky. On the x87 FPU, floating-point exception reporting is delayed. Your program won’t actually crash until the next floating-point instruction after the problematic one. In the example below the fdiv does the divide by zero, but the crash doesn’t happen until the fstp after.

011A10DD fdiv        dword ptr [__fmode+4 (11A3374h)]
011A10E3 fstp        dword ptr [ebp-1Ch]

Normally it is easy enough to look back one instruction to find the culprit, but sometimes the gap can be long enough to cause confusion. Luckily the CPU records the address of the actual faulting instruction and this can be retrieved from the exception record. This value is printed out when applicable in my exception handler, or you can see it in the Visual Studio registers window.

The sample code can be downloaded as a Visual C++ 2010 project (32-bit and 64-bit) from here:

https://www.cygnus-software.com/ftp_pub/floatexceptions.zip

Handle and continue

If you want to get really crazy/sophisticated then it is possible to catch a floating-point exception with __try/__except, handle it in some domain specific way (handling overflow by scaling down the result and recording that you did that) and then resume. This is sufficiently esoteric that I have no more to say about it – consult the documentation for _fpieee_flt if this sounds interesting.

SIMD

SSE and its SIMD instructions throw a few wrinkles into the mix. One thing to be aware of is that instructions like reciprocal estimate (rcpps) never trigger divide-by-zero exceptions – they just silently generate infinity. Therefore they are a way that infinity can be generated even when the ‘bad’ exceptions are enabled.

Additionally, many common patterns for SIMD instructions only use some components of the four-wide registers. This could be because the code is operating on a three-float vector, or it could be because the code is operating on an array of floats whose length is not a multiple of four. Either way, the ‘unused’ component or components in the registers may end up triggering floating-point exceptions. These exceptions are false positives (they don’t indicate a bug), but they must be dealt with in order to allow floating-point exceptions to be enabled. The best way to deal with this is to ensure that the unused components are filled with valid data, at least in the development builds where floating-point exceptions are enabled. Filling them with one or zero is generally good enough.

Filling the unused components with valid values may also improve performance. Some CPUs drop to microcode when they encounter some ‘special’ numbers (NaNs, infinities, and/or denormals) and using well behaved values avoids that risk.
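The padding idea can be sketched without any actual SIMD intrinsics. In the example below a plain loop stands in for a four-wide reciprocal, the function names are made up for illustration, and the standard <cfenv> flags are used to check whether the padding lanes caused trouble. Which fill value is safe depends on the operation: for a reciprocal, one is fine but zero is not.

```cpp
#include <cfenv>
#include <cstddef>
#include <vector>

// Pads the array with a harmless value until its length is a multiple of
// four, so that every four-wide group holds valid data.
void padToMultipleOfFour(std::vector<float>& values, float fill = 1.0f)
{
    while (values.size() % 4 != 0)
        values.push_back(fill);
}

// Scalar stand-in for a four-wide SIMD reciprocal: it always processes
// whole groups of four, including any lanes past the caller's real data.
// The buffer length must already be a multiple of four.
void reciprocalInGroupsOfFour(std::vector<float>& values)
{
    for (std::size_t i = 0; i + 4 <= values.size(); i += 4)
    {
        for (std::size_t lane = 0; lane < 4; ++lane)
        {
            // volatile keeps the division from being folded or removed.
            volatile float x = values[i + lane];
            volatile float r = 1.0f / x;
            values[i + lane] = r;
        }
    }
}

// Returns true if processing the padded data raised a 'bad' flag.
bool raisesBadFlags(std::vector<float> values, float fill)
{
    padToMultipleOfFour(values, fill);
    std::feclearexcept(FE_ALL_EXCEPT);
    reciprocalInGroupsOfFour(values);
    return std::fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW) != 0;
}
```

For a three-element input such as {2, 4, 8}, padding with 1.0f keeps the flags clean, while padding with 0.0f raises the divide-by-zero flag from a lane that holds no real data.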

Practical experience

On some projects I have been able to enable these three floating-point exceptions, fix all of the accidental-but-unimportant exceptions, and then find a few crucial bugs hidden in the weeds. On these projects, enabling floating-point exceptions during development was crucial. On other projects – big messy projects with a lot of history and large teams – I was unable to get the team to buy off on the concept, so it ultimately didn’t work.

Your mileage may vary, but as with asserts of any type, enabling them early, and ensuring that violations get fixed promptly, is the trick to getting value from floating-point exceptions. Adding them to a large existing codebase is trickier, but can be dealt with by only enabling them in particular parts of the code where their value exceeds their cost.

Practical experience, hot off the presses

I’ve been trying to improve the usability of debug builds on my current project and one persistent problem was a NaN that would show up in the particle system early on, triggering many different asserts. I couldn’t tell where this NaN was being generated so I knew I had to enable floating-point exceptions, using the classes described above. This project was not designed to have floating-point exceptions enabled so there were several challenges. The process was:

  • Enable floating-point exceptions in three key functions that called out to all of the particle system code
  • Disable floating-point exceptions in one child function that had a by-design floating-point overflow
  • Pad one of our arrays of particle system data with valid data to the next multiple of four so that the unused SIMD lanes wouldn’t trigger spurious exceptions
  • Find and fix five bugs that were causing floating-point exceptions

It worked. All of the bugs were worth fixing, and one of them was the source of the NaNs. After most of a day of investigation the crucial fix was to change one letter – from ‘e’ to ‘t’ – and this was enough to prevent us from dividing zero by zero. Now our debug builds are significantly more usable, and a genuine bug that was causing (apparently unnoticed) glitches is gone.

Homework

The summary is that while floating-point exceptions, even the ‘bad’ ones, aren’t necessarily bad, you can often find bugs in your code by treating them as errors. By using the classes shown above, with appropriate #ifdefs so that they go away in retail builds, you can enable floating-point exceptions in most parts of your code, and thereby improve reliability and avoid unexpected behavior.

But please, don’t use the __try/__except block, in debug or in retail code. It is an ugly and dangerous hack that should only be used in specialized demonstration code.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x as fast. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And sled hockey. And juggle. And worry about whether this blog should have been called randomutf-8. 2010s in review tells more: https://twitter.com/BruceDawson0xB/status/1212101533015298048
This entry was posted in AltDevBlogADay, Floating Point, Programming.

25 Responses to Exceptional Floating Point

  1. Chad says:

    Great stuff.
    Do you happen to know of a way to guarantee/enable signaling on QNaNs (or a way to make invalid ops generate SNaNs instead?) I did something similar to your suggestion, but missed some NaNs, even with exceptions enabled, since I was looking at QNaNs set while exceptions were disabled. (And also, uninitialized memory.) QNaNs/SNaNs is a whole box of floating point craziness I haven’t seen you touch on yet!

    • brucedawson says:

      I don’t know how to handle the issues you mention with QNaNs. We do try to initialize our uninitialized memory (you know what I mean) to patterns that are SNaNs.

  2. Dealing with floating point state can be maddening on Windows where there’s all kinds of poorly-written third-party software getting loaded into your process. We’ve dealt with multiple problems at Mozilla stemming from third-party software doing bad things, including:
    * Crashes from Cisco VPN software mucking up floating point state in their kernel driver: https://bugzilla.mozilla.org/show_bug.cgi?id=435756#c81
    * Crashes from plugins/etc enabling FP exceptions and causing crashes later in our code: https://bugzilla.mozilla.org/show_bug.cgi?id=533035 (“Fixed” by filtering out floating point exceptions in a chaining exception handler: http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsSigHandlers.cpp#368 )

    • brucedawson says:

      I feel your pain. Having kernel drivers mucking up floating-point state is particularly heinous. In general I’ve tried to deal with that sort of thing by disabling FP exceptions on return from the bad code, but if there are lots of possible plugin points that quickly gets crazy. Reporting the bugs is also important, even though it often doesn’t work.

  3. Pingback: That’s Not Normal–the Performance of Odd Floats | Random ASCII

  4. Pingback: When Even Crashing Doesn’t Work | Random ASCII

  5. So when enabling these signals, do you still need __try and __except to actually catch these exceptions? You’d say they’re useful in debug mode (or perhaps tucked away deeper with an #ifdef), but you recommend not using that code.

    • brucedawson says:

      I only enable floating-point exceptions during development. Normally I’m running under the debugger so when an exception is thrown I drop into the debugger. No __try/__except needed.

      For production the floating-point exceptions are disabled.

      The one remaining case is if I am running a development build but not under the debugger. In that case you need some way of ensuring that you get a crash dump. See this post for some options:

      More Adventures in Failing to Crash Properly

      • I’m optionally turning this on during Release builds as well, but the only problem currently is that my stack trace (I’m using StackWalker) ends up with a stack trace in the exception handler, rather than where the exception occurred. I’ll have to read that article, see if it’s easy to circumvent.

        I like keeping such options in Release code, since that runs faster and sometimes behaves slightly different from Debug runs. I warn about performance-decreasing options in a log file, so they don’t end up in production code by mistake. It still helps often to be able to turn some more heavy debugging features on in Release code, once the code is out there.

        • brucedawson says:

          What is StackWalker?

          I know that if you save a minidump from inside an exception handler then windbg will by default show the context of the code that saved the minidump, rather than the exception context. You have to type .ecxr to go to the exception context. I don’t know if that is relevant.

          • StackWalker is available at http://stackwalker.codeplex.com/; it creates a stacktrace plus accompanying file/line info. For example, when I get a crash (like ‘char *p=0; *p=1;’) I get this (I have a console command in my exe which actively crashes):

            FATAL: Exception 0xC0000005, flags 0, Address 0x0041D093
            (this dialog text is stored in QLOG.txt)

            OS-Version: 6.1.7601 (Service Pack 1) 0x100-0x1

            0x0041D093 d:\source\trunk\dev\src\libs\qlib\qdebug.cpp (line 430): QCrash()
            0x004B0F0B d:\source\trunk\dev_racer\src\lib\rscript.cpp (line 1391): RScriptInterpret()
            0x00476536 d:\source\trunk\dev_racer\src\lib\rconsole.cpp (line 777): RConsole::EndInput()
            0x00477062 d:\source\trunk\dev_racer\src\lib\rconsole.cpp (line 711): RConsole::ProcessEvent()
            0x00402ED8 d:\source\trunk\dev_racer\src\mrun.cpp (line 323): rrGameEvent()
            0x00576F17 d:\source\trunk\dev\src\libs\license\qapp.cpp (line 945): QApp::Run1()
            0x00403F8C d:\source\trunk\dev_racer\src\mrun.cpp (line 2203): Run()
            0x004016AA d:\source\trunk\dev_racer\src\main.cpp (line 436): main()
            0x00401733 d:\source\trunk\dev_racer\src\main.cpp (line 443): WinMain()
            0x005C0FCB f:\dd\vctools\crt_bld\self_x86\crt\src\crt0.c (line 263): __tmainCRTStartup()
            0x762133CA [kernel32]: (filename not available): BaseThreadInitThunk
            0x77469ED2 [ntdll]: (filename not available): RtlInitializeExceptionChain
            0x77469EA5 [ntdll]: (filename not available): RtlInitializeExceptionChain

            Note that QCrash() is defined like this:

            void QCrash(cstring reason)
            {
            qinfo("Going to crash, reason: %s",reason);
            int *p=0;
            *p=12345;
            }

            This helps quite a bit, although I must supply my .pdb file with the exe (which is quite big).

            • brucedawson says:

              That looks handy for things like logging of memory allocations, however it doesn’t seem like a good idea for crashes. For crashes I find it far more useful to save a minidump file. It doesn’t take much code, and it can be recorded without access to symbols (great for on customer machines) and it can then be loaded into a debugger to see the state of the process when it crashed — all threads, registers, stack contents, etc. Heap as well if you want it, although that bloats the crash dump considerably. Together with a symbol server:

              Symbols the Microsoft Way


              and source indexing:

              Source Indexing is Underused Awesomeness


              it’s pretty darned magical and makes a lot more bugs diagnosable.

  6. Pingback: Game Developer Magazine Floating Point | Random ASCII

  7. Pingback: Float Precision Revisited: Nine Digit Float Portability | Random ASCII

  8. Pingback: Comparing Floating Point Numbers, 2012 Edition | Random ASCII

  9. alex says:

    i’ve seen this code to check for pending exceptions:
    __asm fwait;

    • brucedawson says:

      Yep, that should do the trick. But since any floating-point instruction should trigger pending exceptions I haven’t found the need for fwait. In general the joyful thing about floating-point exceptions is that you can enable them and then you magically get stricter testing of millions of lines of code, without having to change any of those millions of lines of code.

  10. Wyatt says:

    Is it just me or did all of the code blocks get collapsed to single lines of text in scrolling boxes by a rambunctious HTMLifier? Because that’s what I’m seeing…

    • brucedawson says:

      Thanks for pointing that out. Windows Live Writer and the code formatting plugins don’t get along. I’ve fixed the code and I’ve given up on using the code formatting plugin. So, the code isn’t beautiful, but at least it’s there now.

  11. Pingback: There’s Only Four Billion Floats–So Test Them All! | Random ASCII

  12. Cesar says:

    I’ve been reading your floating point articles and they’re very useful. Thanks a lot!!

    There’s a subject, however, for which I haven’t found a definite answer anywhere: Checking if a floating point division is “safe” in advance (where “safe” means having a numerical result –no overflow, no NaN, but underflow isn’t usually a worry). Yes, by reading this article I know this can be checked with exceptions, but… what would be a good way of doing it without exceptions?

    I find this is a recurring scenario when writing games or demos: you have a denominator, but you don’t want to use it unless you know it won’t generate a non-numerical result (i.e.: you don’t want to normalize a vector unless you’re confident you won’t get an infinity vector). Maybe this would be complex enough for being a subject for a whole article, but I find it’s not well discussed anywhere on the Internet, so I believe this would be a great subject to talk about.

    • brucedawson says:

      In most cases the best thing to do is to just do the divide, and then check for infinity. Any other check would be extremely complicated and basically equivalent to doing the divide. If you have divide-by-zero exceptions enabled then you would need to check against zero before dividing. If you have overflow exceptions enabled then you have a much trickier problem.

      But in general the best answer is “do it and then see if the result is finite”.

      Anybody else have any wisdom?

  13. Thomas says:

    Think your code above in the constructor of FPExceptionDisabler has an issue, please correct me if I’m wrong.
    Your code:
    FPExceptionDisabler()
    {
    // Retrieve the current state of the exception flags. This
    // must be done before changing them. _MCW_EM is a bit
    // mask representing all available exception masks.
    _controlfp_s(&mOldValues, _MCW_EM, _MCW_EM);

    With following signature of _controlfp_s:
    errno_t _controlfp_s(
    unsigned int *currentControl,
    unsigned int newControl,
    unsigned int mask
    );

    –> The microsoft documentation says (https://docs.microsoft.com/de-de/cpp/c-runtime-library/reference/controlfp-s?view=vs-2019):

    If mask is nonzero, a new value for the control word is set … In this scenario, currentControl is set to the value after the change completes; it is not the old control-word bit value.

    –> To me, the instruction to read the current control bits into mOldValues should rather be:
    _controlfp_s(&mOldValues, 0, 0);

    –> Means that using the code above results in all FP Exceptions being permanently disabled, even after the destructor has been called.
    Am I right?

    Anyway, very nice blog, many thanks!

    • brucedawson says:

      Huh. Yeah, I think you’re right. And that was pointed out to me years ago and (I think) fixed but somehow the fix got lost. I’ll correct it.

      It’s not as terrible as it might be because disabled is the normal state, but it does mean that the objects nest poorly, and it is wrong.

      Fixing it messed up the code formatting, unfortunately, because WordPress and code don’t get along and posting code snippets was a bad idea. Oh well.
