The following 3 lines give imprecise results with "gcc -Ofast -march=skylake":
int32_t i = -5;
const double sqr_N_min_1 = (double)i * i;
1. - ((double)i * i) / sqr_N_min_1
Obviously, sqr_N_min_1
gets 25.
, and in the 3rd line (-5 * -5) / 25
should become 1.
so that the overall result from the 3rd line is exactly 0.
. Indeed, this is true for compiler options "gcc -O3 -march=skylake".
But with "-Ofast" the last line yields -2.081668e-17
instead of 0.
and with other i
than -5
(e.g. 6
or 7
) it gets other very small positive or negative random deviations from 0.
.
My question is: Where exactly is the source of this imprecision?
To investigate this, I wrote a small test program in C:
#include <stdint.h> /* int32_t */
#include <stdio.h>
#define MAX_SIZE 10
double W[MAX_SIZE];
int main( int argc, char *argv[] )
{
volatile int32_t n = 6; /* try 6 7 or argv[1][0]-'0' */
double *w = W;
int32_t i = 1 - n;
const int32_t end = n - 1;
const double sqr_N_min_1 = (double)i * i;
/* Here is the crucial part. The loop avoids the compiler replacing it with constants: */
do {
*w++ = 1. - ((double)i * i) / sqr_N_min_1;
} while ( (i+=2) <= end );
/* Then, show the results (only the 1st and last output line matters): */
w = W;
i = 1 - n;
do {
fprintf( stderr, "%e\n", *w++ );
} while ( (i+=2) <= end );
return( 0 );
}
Godbolt shows me the assembly produced by an "x86-64 gcc9.3" with the option "-Ofast -march=skylake" vs. "-O3 -march=skylake". Please, inspect the five columns of the website (1. source code, 2. assembly with "-Ofast", 3. assembly with "-O3", 4. output of 1st assembly, 5. output of 2nd assembly):
Godbolt site with five columns
As you can see the differences in the assemblies are obvious, but I can't figure out where exactly the imprecision comes from. So, the question is, which assembler instruction(s) are responsible for this?
A follow-up question is: Is there a possibility to avoid this imprecision with "-Ofast -march=skylake" by reformulating the C-program?
From the comments, it seems that, for
-O3
, the compiler computes1. - ((double)i * i) / sqr_N_min_1
:i
to double and square it.sqr_N_min_1
.and, for
-Ofast
, computes it:sqr_N_min_1
.i
to double and square it.The latter improves speed because it calculates the division only once, and multiplication is much faster than division in the target processors. On top of that, the fused operation is faster than a separate multiplication and subtraction.
The error occurs because the reciprocal operation introduces a rounding error that is not present in the original expression (1/25 is not exactly representable in a binary format, while 25/25 of course is). This is why the compiler does not make this optimization when it is attempting to provide strict floating-point semantics.
Additionally, simply multiplying the reciprocal by 25 would erase the error. (This is somewhat by “chance,” as rounding errors vary in complicated ways.
1./25*25
produces 1, but1./49*49
does not.) But the fused operation produces a more accurate result (it produces the result as if the product were computed exactly, with rounding occurring only after the subtraction), so it preserves the error.