Numba corrupts data by affecting in-place


Numba and NumPy don't execute the following foo function in the same way:

from numba import jit
import numpy as np

def foo(a):
    a[:] = a[::-1] # reverse the array

a = np.array([0, 1, 2])
foo(a)
print(a)

With NumPy (without @jit) it prints [2, 1, 0], while with Numba (with @jit) it prints [2, 1, 2]. It looks like Numba modifies the array in-place while still reading from it, which leads to data corruption. It is easy to work around by making a copy of the array:

a[:] = a[::-1].copy()

But is this the desired behavior? Shouldn't Numba and NumPy give the same result?

I am using Numba v0.26.0 in Python 3.5.2.


There are 2 answers


This is a known issue, and it was fixed in Numba 0.27. Following NumPy's behavior, the fix detects overlap and makes temporary copies to avoid corrupting the data.
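The overlap check described above can be sketched in plain NumPy. This is only an illustration of the idea, not Numba's actual implementation: the helper name assign_with_overlap_check is hypothetical, and the explicit loop stands in for the element-by-element copy that compiled code would perform.

```python
import numpy as np

def assign_with_overlap_check(dst, src):
    """Hypothetical helper: copy src into dst element by element,
    buffering src first if the two arrays share memory."""
    if np.shares_memory(dst, src):
        src = src.copy()  # temporary copy breaks the overlap
    for i in range(dst.shape[0]):
        dst[i] = src[i]

a = np.array([0, 1, 2])
assign_with_overlap_check(a, a[::-1])
print(a)
```

Without the shares_memory branch, the loop would read elements that it has already overwritten, which is the corruption reported in the question.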

hpaulj

Your jit has the same sort of in-place problems that this Python loop does.

In [718]: x=list(range(3))
In [719]: for i in range(3):
     ...:     x[i] = x[2-i]
In [720]: x
Out[720]: [2, 1, 2]

The x[:] = x[::-1] assignment is buffered, not because numpy recognizes that something special is happening, but because it always uses some sort of buffering when doing assignments.

The Python interpreter translates [] notation into calls to __setitem__ and __getitem__. So 681 and 682 do the same thing (each one reverses x, so after both the array is back in its original order):

In [680]: x=np.arange(3)
In [681]: x[:] = x[::-1]
In [682]: x.__setitem__(slice(None), x.__getitem__(slice(None,None,-1)))
In [683]: x
Out[683]: array([0, 1, 2])

That means that x[::-1] is evaluated in full, to a temporary array object, before being copied to x[:]. Now x[::-1] is a view, not a copy, so the setitem step must do some sort of buffered copy.
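A quick way to confirm that the reversed slice is a view on the same buffer, rather than an independent copy, is to inspect its base and flags. This is a small sketch using standard NumPy attributes; the variable name rev is only for illustration.

```python
import numpy as np

x = np.arange(3)
rev = x[::-1]

print(rev.base is x)             # the view refers back to x
print(rev.flags['OWNDATA'])      # the view does not own its data
print(np.shares_memory(x, rev))  # both arrays use the same buffer
```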

Another way to do this copy is with

np.copyto(x, x[::-1])

Checking the x.__array_interface__ I see that the data buffer address remains the same. So it is doing a copy, not just changing the data buffer address. But it's in low level compiled code.
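The address check mentioned above can be reproduced along these lines. This is a minimal sketch; addr_before and addr_after are illustrative names, and the first element of the 'data' entry in __array_interface__ is the buffer address.

```python
import numpy as np

x = np.arange(3)
addr_before = x.__array_interface__['data'][0]  # address of the data buffer
np.copyto(x, x[::-1])
addr_after = x.__array_interface__['data'][0]

print(addr_before == addr_after)  # same buffer, so the values were copied in place
print(x)
```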

Usually buffering is just an implementation issue that users don't need to worry about. is designed to deal with cases where that buffering creates problems. This topic comes up periodically; search for


Note that Python lists behave the same way. The translation to 'get/setitem' is the same.

In [699]: x=list(range(3))
In [700]: x[:] = x[::-1]
In [701]: x
Out[701]: [2, 1, 0]


I'm not sure whether this is relevant, but since I tested these ideas I'll document them. np.nditer has been suggested as a stepping stone for implementing iterative tasks in Cython.

A first stab at using nditer is:

In [769]: x=np.arange(5)
In [770]: it = np.nditer((x,x[::-1]), op_flags=[['readwrite'], ['readonly']])
In [771]: for i,j in it:
     ...:     print(i,j)
     ...:     i[...] = j
0 4
1 3
2 2
3 3
4 4
In [772]: x
Out[772]: array([4, 3, 2, 3, 4])

This produces the same sort of overlapping result as numba.

Adding a copy makes for a clean reversal.

it = np.nditer((x,x[::-1].copy()), op_flags=[['readwrite'], ['readonly']])
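For completeness, here is the copy variant as a full runnable snippet. It is a sketch that assumes a NumPy version in which nditer can be used as a context manager (1.15 or later); the with block is not required for this particular case, but it is the usual pattern when an operand is writable.

```python
import numpy as np

x = np.arange(5)
# x[::-1].copy() is an independent buffer, so writes to x cannot affect it
with np.nditer((x, x[::-1].copy()),
               op_flags=[['readwrite'], ['readonly']]) as it:
    for i, j in it:
        i[...] = j  # write the reversed value into x

print(x)
```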

If I add the external_loop flag I also get a clean reversal:

In [781]: x=np.arange(5)
In [782]: it = np.nditer((x,x[::-1]), op_flags=[['readwrite'], ['readonly']],
     ...:     flags=['external_loop'])
In [783]: for i,j in it:
     ...:     print(i,j)
     ...:     i[...] = j
[0 1 2 3 4] [4 3 2 1 0]
In [784]: x
Out[784]: array([4, 3, 2, 1, 0])