Why is math.sqrt() incorrect for large numbers?

3.9k views Asked by At

Why does the math module return the wrong result?

First test

A = 12345678917
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)

Result

A = 12345678917
B = 12345678917

Here, the result is correct.

Second test

A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)

Result

A = 123456758365483459347856
B = 123456758365483467538432

Here the result is incorrect.

Why is that the case?

4

There are 4 answers

0
willeM_ Van Onsem On BEST ANSWER

Because math.sqrt(..) first casts the number to a floating point and floating points have a limited mantissa: it can only represent part of the number correctly. So float(A**2) is not equal to A**2. Next it calculates the math.sqrt which is also approximately correct.

Most functions working with floating points will never be fully correct to their integer counterparts. Floating point calculations are almost inherently approximative.

If one calculates A**2 one gets:

>>> 12345678917**2
152415787921658292889L

Now if one converts it to a float(..), one gets:

>>> float(12345678917**2)
1.5241578792165828e+20

But if you now ask whether the two are equal:

>>> float(12345678917**2) == 12345678917**2
False

So information has been lost while converting it to a float.

You can read more about how floats work and why these are approximative in the Wikipedia article about IEEE-754, the formal definition on how floating points work.

0
Rory Daulton On

The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."

Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.

However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.

If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually do a variation of the Newton-Raphson method to iterate and eventually end at the correct answer. Be aware that this is significantly slower that the math module's sqrt function.

Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.

def isqrt(x):
    """Return the integer part of the square root of x, even for very
    large values."""
    if x < 0:
        raise ValueError('square root not defined for negative numbers')
    n = int(x)
    if n == 0:
        return 0
    a, b = divmod(n.bit_length(), 2)
    x = (1 << (a+b)) - 1
    while True:
        y = (x + n//x) // 2
        if y >= x:
            return x
        x = y
0
MaxNoe On

If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:

import sympy

num = sympy.Integer(123456758365483459347856)

print(int(num) == int(sympy.sqrt(num**2)))
0
Isaac Saffold On

The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:

>>> A = Decimal(12345678917)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856

I use version 3.6, which has no hardcoded limit on the size of integers. I don't know if, in 2.7, casting B as an int would cause overflow, but decimal is incredibly useful regardless.