I was trying to use numpy.divmod with very large integers and I noticed a strange behaviour. At around 2**63 ≈ 9.2e18 (the limit of a signed 64-bit integer, which I assumed is the usual memory representation of int in python 3.5+), this happens:
from numpy import divmod

test = 10**6
for i in range(15, 25):
    x = 10**i
    print(i, divmod(x, test))
15 (1000000000, 0)
16 (10000000000, 0)
17 (100000000000, 0)
18 (1000000000000, 0)
19 (10000000000000.0, 0.0)
20 ((100000000000000, 0), None)
21 ((1000000000000000, 0), None)
22 ((10000000000000000, 0), None)
23 ((100000000000000000, 0), None)
24 ((1000000000000000000, 0), None)
Somehow, the quotient and remainder work fine up to 2**63; beyond that, something is different.
My guess is that the int representation is "vectorized" (i.e. like BigInt in Scala: a little-endian Seq of Long). But then I'd expect divmod(array, test) to return a pair of arrays: the array of quotients and the array of remainders.
I have no clue about this behaviour. It does not happen with the built-in divmod, where everything works as expected.
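For comparison, here is the same loop with the plain built-in divmod (no numpy import, so nothing is shadowed):

```python
test = 10**6
for i in range(15, 25):
    x = 10**i
    # always a plain (quotient, remainder) pair of Python ints,
    # no matter how large x gets
    print(i, divmod(x, test))
```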
Why does this happen? Does it have something to do with the internal representation of int?
Details: numpy version 1.13.1, python 3.6
The problem is that np.divmod will convert the arguments to arrays, and what happens then is easy to see: you will get an object array for 10**i with i > 19; in the other cases it will be a "real NumPy array".

And, indeed, it seems like
object arrays behave strangely with np.divmod: I guess in this case the normal Python built-in divmod calculates the first returned element and all remaining items are filled with None, because NumPy delegated to Python's function.

Note that
object arrays often behave differently than native dtype arrays: they are a lot slower and often delegate to Python functions (which is often the reason for different results).