determine the current precision/epsilon of a floating point variable

362 views Asked by At

I am trying to separate two doubles of equal value by the minimum amount. The context is an event simulation. I don't want events to occur at the same time, so I increment the time that a new event is set to occur, by a minimal amount.

(This happens annoyingly often (self-implemented RNG), so I actually need to probe until I find a time when there isn't an event occurring.)

Currently, my code wants to do something like this (not tested, treat as pseudocode):

typedef struct event 
 { 
   double time;
   struct event* nexteventinqueue;
 } event;
//insert newevent into the queue:
void add(event *newevent)
{
static event *firsteventinqueue = NULL;
if(firsteventinqueue == NULL)
{
    firsteventinqueue = newevent;
    return;
}
event *currentevent = firsteventinqueue;
event *temp;
//find an event point in the queue that does not precede the new event.
while((currentevent->time) < (newevent -> time))
{
    temp = currentevent;
    currentevent = currentevent->nexteventinqueue;
}
if(currentevent == NULL)//no such event found, so tag it onto the end.
{
    temp->next = newevent;
    return;
}
//handle coincidences by delaying the new event
while((currentevent->time) == (newevent->time))
{
    double d = (maximally precise increment of a double precision floating point number);
    while((currentevent->time) == (newevent.time)) //loop I want to get rid of
    {
        newevent.time += d;
        d *= 2;
    }
    temp = currentevent;
    currentevent = currentevent.nexteventinqueue;
}
temp.nexteventinqueue = newevent;
newevent.nexteventinqueue = currentevent;
return;
}

Now there are quite a few issues with this, but I want to somehow get rid of the while loop in the middle. Most of my times aren't even close to holding the maximum precision that a double floating point can muster, so it's a waste of time to start by assuming they do, and because my RNG isn't especially random, this loop must execute quite frequently.

Is there a way to either (1) directly increment the fractional part of a double floating point variable, or to (2) figure out how precise a given floating point variable x in less than O(log(x))?

2

There are 2 answers

13
rici On BEST ANSWER

Use nextafter (from <math.h>) to get the immediately next largest value after a double-precision value. (Or nextafterf for a single-precision value).

For more information, man nextafter, also available here.

9
David C. Rankin On

Following @Brendan's comment and @rici's comment, it may be useful to see why an integer representation will ultimately decide what the smallest nextafter value is and why it is the integer precision that controls. Floats/doubles are stored in IEEE-754 single/double precision floating point format in memory. For every float or double there is an integer representation that corresponds to the floating point representation. The next available increment, or nextafter value is an increment of the value at bit-0. The following example illustrates the point for the number 123.456 (example of float shown for space saving, double is simply a larger number of bits involved):

Example of next value for   : 123.456

    The float value entered : 123.456001

    As unsigned integer     : 1123477881

    binary value in memory  : 01000010-11110110-11101001-01111001

    next larger int value   : 1123477882

    binary value in memory  : 01000010-11110110-11101001-01111010

    next larger float value : 123.456009

In order to find the next possible incremental value, the value in memory is incremented by 1 (or decremented by 1 if the most significant bit is 1 (indicating a negative value)). The results are then translated back into floating point to provide the next possible floating point value. Brendan's point is well taken. By using an integer value, you avoid the necessity of the nextafter calculation. But regardless of whether you are using floating point or an integer value, the smallest increment in memory is still the increment (or decrement) of 1 to bit-0. For the example above, it is the difference between 123.456001 and 123.456009 (in single precision) as shown below:

IEEE-754 Single Precision Floating Point Representations

  0 1 0 0 0 0 1 0 1 1 1 1 0 1 1 0 1 1 1 0 1 0 0 1 0 1 1 1 1 0 0 1
  |- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -|
  |s|      exp      |                  mantissa                   |

  0 1 0 0 0 0 1 0 1 1 1 1 0 1 1 0 1 1 1 0 1 0 0 1 0 1 1 1 1 0 1 0
  |- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -|
  |s|      exp      |                  mantissa                   |

Hopefully, the visual along with the explanation adds to the understanding. Note: there is no actual 'translation' or 'conversion' involved. It is just looking at the same value in memory as either a floating point number, or integer. The only difference to your code is whether you add 1 or add a few more lines of code to find out what the next floating point value would be after you add 1. None of it is processor or time intensive.