std::fmod abysmal double precision

1.9k views Asked by At

fmod(1001.0, 0.0001) is giving 0.00009999999995, which seems like a very low precision (10-5), given the expected result of 0.

According to cppreference, fmod() is implementable using remainder(), but remainder(1001.0, 0.0001) gives -4.796965775988316e-14 (still far from double precision, but much better than 10-5).

Why does fmod precision depend on input arguments that much? Is it normal?

MCVE:

#include <cmath>
#include <iomanip>
#include <iostream>
using namespace std;

int main() {
    double a = 1001.0, b = 0.0001;
    cout << setprecision(16);
    cout << "fmod:      " << fmod(a, b) << endl;
    cout << "remainder: " << remainder(a, b) << endl;
    cout << "actual:    " << a-floor(a/b)*b << endl;
    cout << "a/b:       " << a / b << endl;
}

Output:

fmod:      9.999999995203035e-05
remainder: -4.796965775988316e-14
actual:    0
a/b:       10010000

(same result with GCC, Clang, MSVC, with and without optimization)

Live demo

3

There are 3 answers

0
Alan Birtles On BEST ANSWER

If we modify your program to:

#include <cmath>
#include <iomanip>
#include <iostream>

int main() {
    double a = 1001.0, b = 0.0001;
    std::cout << std::setprecision(32) << std::left;
    std::cout << std::setw(16) << "a:" << a << "\n"; 
    std::cout << std::setw(16) << "b:" << b << "\n"; 
    std::cout << std::setw(16) << "fmod:" << fmod(a, b) << "\n";
    std::cout << std::setw(16) << "remainder:" << remainder(a, b) << "\n";
    std::cout << std::setw(16) << "floor a/b:" << floor(a/b) << "\n";
    std::cout << std::setw(16) << "actual:" << a-floor(a/b)*b << "\n";
    std::cout << std::setw(16) << "a/b:" << a / b << "\n";
    std::cout << std::setw(16) << "floor 10009999:" << floor(10009999.99999999952) << "\n";
}

It outputs:

a:              1001
b:              0.00010000000000000000479217360238593
fmod:           9.9999999952030347032290447106817e-05
remainder:      -4.796965775988315527911254321225e-14
floor a/b:      10010000
actual:         0
a/b:            10010000
floor 10009999: 10010000

we can see that 0.0001 is not representable as a double so b is actually set to 0.00010000000000000000479217360238593.

This results in a/b being 10009999.9999999995203034224 which therefore means fmod should return 1001 - 10009999*0.00010000000000000000479217360238593 which is 9.99999999520303470323e-5.

(numbers calculated in speedcrunch so may not match exactly IEEE double values)

The reason your "actual" value is different is that floor(a/b) returns 10010000 not the exact value used by fmod which is 10009999, this is itself due to 10009999.99999999952 not being representable as a double so it is rounded to 10010000 before being passed to floor.

0
Eric Postpischil On

fmod produces exact results, with no error.

Given the C++ source code fmod(1001.0, 0.0001) in an implementation using IEEE-754 binary64 (the most commonly used format for double), the source text 0.0001 is converted to the double value 0.000100000000000000004792173602385929598312941379845142364501953125.

Then 1001 = 10009999• 0.000100000000000000004792173602385929598312941379845142364501953125 + 0.000099999999952030347032290447106817055100691504776477813720703125, so fmod(1001, 0.0001) is exactly 0.000099999999952030347032290447106817055100691504776477813720703125.

The only error occurs in converting the decimal numeral in the source text to the binary-based double format. There is no error in the fmod operation.

0
David K On

The fundamental issue here (the IEEE-754 representation of 0.0001) has been well-established already, but just for kicks, I copied the implementation of fmod using std::remainder from https://en.cppreference.com/w/cpp/numeric/math/fmod and compared it with std::fmod.

#include <iostream>
#include <iomanip>
#include <cmath>

// Possible implementation of std::fmod according to cppreference.com
double fmod2(double x, double y)
{
#pragma STDC FENV_ACCESS ON
    double result = std::remainder(std::fabs(x), (y = std::fabs(y)));
    if (std::signbit(result)) result += y;
    return std::copysign(result, x);
}

int main() {
    // your code goes here
    double b = 0.0001;
    std::cout << std::setprecision(25);
    std::cout << "              b:" << std::setw(35) << b << "\n"; 
    
    double m = 10010000.0;
    double c = m * b;
    double d = 1001.0 - m * b;
    std::cout << std::setprecision(32);
    std::cout << "     10010000*b:" << std::setw(6) << c << "\n"; 
    std::cout << std::setprecision(25);
    std::cout << "1001-10010000*b:" << std::setw(6) << d << "\n";
    
    long double m2 = 10010000.0;
    long double c2 = m2 * b;
    long double d2 = 1001.0 - m2 * b;
    std::cout << std::setprecision(32);
    std::cout << "     10010000*b:" << std::setw(35) << c2 << "\n"; 
    std::cout << std::setprecision(25);
    std::cout << "1001-10010000*b:" << std::setw(35) << d2 << "\n";
    
    std::cout << "      remainder:" << std::setw(35) << std::remainder(1001.0, b) << "\n"; 
    std::cout << "           fmod:" << std::setw(35) << std::fmod(1001.0, b) << "\n"; 
    std::cout << "          fmod2:" << std::setw(35) << fmod2(1001.0, b) << "\n"; 
    std::cout << " fmod-remainder:" << std::setw(35) <<
                 std::fmod(1001.0, b) - std::remainder(1001.0, b) << "\n"; 
    return 0;
}

The results are:

              b:     0.0001000000000000000047921736
     10010000*b:  1001
1001-10010000*b:     0
     10010000*b:  1001.0000000000000479616346638068
1001-10010000*b:    -4.796163466380676254630089e-14
      remainder:    -4.796965775988315527911254e-14
           fmod:     9.999999995203034703229045e-05
          fmod2:     9.999999995203034703229045e-05
 fmod-remainder:     0.0001000000000000000047921736

As illustrated by the last two lines of output, the actual std::fmod (at least in this implementation) matches the implementation suggested on the cppreference page, at least for this example.

We also see that 64 bits of IEEE-754 is not enough precision to show that 10010000 * 0.0001 differs from an integer. But if we go to 128 bits, the fractional part is clearly represented, and when we subtract this from 1001.0 we find that the remainder is approximately the same as the return value of std::remainder. (The difference is presumably due to std::remainder being computed with fewer than 128 bits; it may be using 80-bit arithmetic.)

Finally, note that std::fmod(1001.0, b) - std::remainder(1001.0, b) turns out to be equal to the 64-bit IEEE-754 value of 0.0001. That is, both functions are returning results that are congruent to the same value modulo 0.0001000000000000000047921736, but std::fmod chooses the smallest positive value, while std::remainder chooses the value closest to zero.