For days I have been struggling hard (but in vain) with exception masks.
I have developed an application that performs heavy floating-point calculations on hundreds of thousands of records. Obviously the code must be able to handle exceptions, especially those related to floating-point calculations: Overflow, ZeroDivide, etc.
The application runs correctly under Windows 7 (32-bit or 64-bit) on many different types of processors: if an error occurs, the condition is properly handled, an exception is raised and the record is discarded.
Unfortunately, the problems start when I launch the application exactly where it is intended to run: on a dedicated server with an Intel Xeon E5-2640 v2 CPU and Windows Server 2003 R2. Here the exceptions are not raised: records with errors are not discarded, so the results are polluted by the numerical values with which the machine represents +INF or -INF.
The problem is that on the server the default settings of the exception masking are different from those found on Windows 7. In particular, calling GetExceptionMask on the server, by default I find exZeroDivide in the returned set, while calling GetExceptionMask on Windows 7 this exception is not masked. The result is what I said: running the application on the server, these exceptions are not raised but are handled by the processor, which returns the extreme, "polluting" numerical values. (A quick way to compare the defaults on the two machines is sketched below.)
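For example, a minimal console program along these lines (a sketch; the program name is just an example) can be run on both machines to compare the defaults:

    program CheckMask;  { hypothetical test program }

    {$APPTYPE CONSOLE}

    uses
      Math;

    begin
      { Compare this output on Windows 7 versus the server. }
      if exZeroDivide in GetExceptionMask then
        Writeln('exZeroDivide is masked: x/0 silently yields +INF or -INF')
      else
        Writeln('exZeroDivide is unmasked: x/0 raises EZeroDivide');
    end.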
OK, do not panic, I thought: you simply call SetExceptionMask (e.g. in an initialization section) excluding exZeroDivide. But it does not work. Or better: although just after calling SetExceptionMask the exception exZeroDivide is no longer masked, by the time the floating-point code is executed the TArithmeticExceptionMask set returned by GetExceptionMask contains exZeroDivide again, and so if an error occurs the exception is not raised. The kind of call I make is sketched below.
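For clarity, this is roughly what I do (a sketch; the unit name FpuInit is just an example):

    unit FpuInit;  { hypothetical unit dedicated to FPU setup }

    interface

    implementation

    uses
      Math;

    initialization
      { Remove exZeroDivide from the mask so that a division by zero
        raises EZeroDivide instead of silently producing +INF or -INF. }
      SetExceptionMask(GetExceptionMask - [exZeroDivide]);

    end.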
Can anyone tell me the correct way to call SetExceptionMask?
And what is the reason why the default masking may differ from one computer to another: the operating system or the type of processor?
Thanks.
The usual cause of this is that you are calling third party code that clears the masks. It may be a library that you are knowingly using but more likely it is something that you aren't particularly aware of calling. A common example of this is printer drivers. These are notorious for changing the floating point control flags.
The next step is to identify the part of the code that changes the control flags. I suggest you add debug trace logging. Calls to OutputDebugString would suffice, but you would do well to use a more advanced logging library. Log the state of the control flags as your program executes (a sketch of such logging follows at the end of this answer). You'll need a few cycles of adding logging calls, running, and reading the log before you are able to locate the culprit. Once you've found the external code that changes the flags, make sure you restore them after that external code executes.

This is a tricky area, I'm afraid. It's not easy to get right. External code does sometimes play fast and loose with the control flags as if that code were the only code in existence. The Delphi RTL isn't the greatest at handling control flags either. It's perhaps not well known that Set8087CW is not threadsafe, for instance.

I've personally been through your struggles with my own floating point app. But you should be able to solve such problems. Good luck!
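Here is a minimal sketch of what I mean; all the names (FpuTrace, LogExceptionMask, SomeExternalCall, CallExternalCodeSafely) are examples, not an established API:

    unit FpuTrace;  { hypothetical tracing unit }

    interface

    procedure LogExceptionMask(const Where: string);
    procedure CallExternalCodeSafely;

    implementation

    uses
      Windows, Math;

    { Write the currently masked exceptions to the debug output so the
      log shows exactly when some call changes the flags. }
    procedure LogExceptionMask(const Where: string);
    const
      Names: array[TArithmeticException] of string =
        ('exInvalidOp', 'exDenormalized', 'exZeroDivide',
         'exOverflow', 'exUnderflow', 'exPrecision');
    var
      E: TArithmeticException;
      S: string;
    begin
      S := Where + ': masked =';
      for E := Low(TArithmeticException) to High(TArithmeticException) do
        if E in GetExceptionMask then
          S := S + ' ' + Names[E];
      OutputDebugString(PChar(S));
    end;

    { Placeholder for the offending call once you have found it,
      e.g. a printer driver entry point. }
    procedure SomeExternalCall;
    begin
    end;

    { Save/restore pattern: undo whatever the external code does. }
    procedure CallExternalCodeSafely;
    var
      Saved: TArithmeticExceptionMask;
    begin
      Saved := GetExceptionMask;
      try
        LogExceptionMask('before external call');
        SomeExternalCall;
        LogExceptionMask('after external call');
      finally
        SetExceptionMask(Saved);
      end;
    end;

    end.

Sprinkle calls to LogExceptionMask at suspect points; once the log pinpoints the culprit, wrap it as in CallExternalCodeSafely.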