The "magic" method of computing the inverse-square root, dating back to the Quake game apparently, is described in many sources. Wikipedia has a nice article on it: https://en.wikipedia.org/wiki/Fast_inverse_square_root
I particularly found the following to be a very nice write-up and analysis of the algorithm: https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf
I'm trying to replicate some of these results in this paper, but having an issue with the accuracy. The algorithm, coded in C, is given as follows:
#include <math.h>
#include <stdio.h>
float Q_rsqrt(float number) {
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = *(long *) &y;
i = 0x5f3759df - (i >> 1);
y = *(float *) &i;
y = y * (threehalfs - (x2 * y * y));
// y = y * (threehalfs - (x2 * y * y));
return y;
}
The paper states that the relative error is at most 0.0017522874
for all positive normal floats. (See Appendix 2 for the code, and the discussion in Section 1.4.)
However, when I "plug-in" the number 1.4569335e-2F
, the error I get is larger than this predicted tolerance:
int main ()
{
float f = 1.4569335e-2F;
double tolerance = 0.0017522874;
double actual = 1.0 / sqrt(f);
float magic = Q_rsqrt(f);
double err = fabs (sqrt(f) * (double) magic - 1);
printf("Input : %a\n", f);
printf("Actual : %a\n", actual);
printf("Magic : %a\n", magic);
printf("Err : %a\n", err);
printf("Tolerance: %a\n", tolerance);
printf("Passes : %d\n", err <= tolerance);
return 0;
}
The output is:
Input : 0x1.dd687p-7
Actual : 0x1.091cc953ea828p+3
Magic : 0x1.08a5dcp+3
Err : 0x1.cb5b716b7b6p-10
Tolerance: 0x1.cb5a044e0581p-10
Passes : 0
So, this particular input seems to violate the claim made in that paper.
I'm wondering if this is an issue with the paper itself, or if I made a mistake in my coding. I'd appreciate any feedback!
Let's try a little piece of code to recompute the bound on the relative error and shows that it's slightly larger than the one in the thesis of Matthew Robertson. Indeed, as noticed first in the answer of @squeamishossifrage and noted in the thesis of Matthew Robertson, this implementation is the one that was disclosed in the source of Quake III. In particular, the original value of the constant of Quake III can be found in the source of Quake III, in file q_math.c on line 561.
First, the code needs to be adapted to work on 64bit plateforms. The only thing that may have to be modified is the integer type:
long
is not plateform-independent. On my linux computer,sizeof(long)
returns 8... As updated in the paper on page 49, the typeuint32_t
will ensure that the type of the integer is of the same size as afloat
.Here goes the code, to be compiled by
gcc main.c -o main -lm -Wall
and ran by./main
:For the bound, I obtained
0.0017523386721 == 0x1.cb5d752717ep-10
. As you noticed, it is slightly larger than the one reported in the paper (0.001752287
). Evaluating the error usingfloat
instead ofdouble
does not change much the outcome.