What is wrong with the numpy.where function?


I have a NumPy array (a list of 2-element lists), given below, and a 2-element list [30.94, 0.] that I would like to find in it.

When I do the following, I don't get the desired result. Why?

import numpy as np
a = np.array([[  5.73,   0.  ],
              [ 57.73,  10.  ],
              [ 57.73,  20.  ],
              [ 30.94,   0.  ],
              [ 30.94,  10.  ],
              [ 30.94,  20.  ],
              [  4.14,   0.  ],
              [  4.14,  10.  ]])

np.where(a==np.array([30.94, 0.]))

But I get

(array([0, 3, 3, 4, 5, 6]), array([1, 0, 1, 0, 0, 1]))

which is not what I expected.


There are 2 answers

Dietrich Epp (best answer)

As Divakar hinted, a == np.array([30.94, 0.]) does not do what you expect. The 1-D array is broadcast against every row of a, and the comparison is done elementwise, so you get one boolean per element rather than one per row. Here is the result:

array([[False,  True],
       [False, False],
       [False, False],
       [ True,  True],
       [ True, False],
       [ True, False],
       [False,  True],
       [False, False]], dtype=bool)
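To connect this mask to the output in the question, here is a minimal sketch that pairs up the two index arrays np.where returns for a 2-D boolean array. The names mask, rows, cols and true_positions are only illustrative and do not appear in the original post.

```python
import numpy as np

a = np.array([[  5.73,   0.  ],
              [ 57.73,  10.  ],
              [ 57.73,  20.  ],
              [ 30.94,   0.  ],
              [ 30.94,  10.  ],
              [ 30.94,  20.  ],
              [  4.14,   0.  ],
              [  4.14,  10.  ]])

# Elementwise comparison: the 1-D array is broadcast against each row of a.
mask = a == np.array([30.94, 0.])

# For a 2-D mask, np.where returns (row_indices, column_indices)
# of every True entry.
rows, cols = np.where(mask)

# Pair the indices up as (row, column) positions of individual True elements.
true_positions = list(zip(rows.tolist(), cols.tolist()))
print(true_positions)
# [(0, 1), (3, 0), (3, 1), (4, 0), (5, 0), (6, 1)]
```

Each pair marks a single matching element, not a matching row. Row 3 is the only row that appears with both column 0 and column 1.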

However, we can get what we want with np.all:

>>> np.all(a==np.array([30.94, 0.]), axis=-1)
array([False, False, False,  True, False, False, False, False], dtype=bool)
>>> np.where(_)
(array([3]),)

So you can see that row 3 matches, as expected. Note that the usual caveats to using == with floating-point numbers will apply here.
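If exact equality is too strict for your data (for example, when the values come out of a computation rather than being typed in), one possible variation is to swap == for np.isclose, which compares within a tolerance. This is only a sketch: it relies on the default tolerances of np.isclose, which may not suit every dataset, and the names target, row_matches and matching_rows are illustrative.

```python
import numpy as np

a = np.array([[  5.73,   0.  ],
              [ 57.73,  10.  ],
              [ 57.73,  20.  ],
              [ 30.94,   0.  ],
              [ 30.94,  10.  ],
              [ 30.94,  20.  ],
              [  4.14,   0.  ],
              [  4.14,  10.  ]])
target = np.array([30.94, 0.])

# Tolerant elementwise comparison (default rtol/atol assumed to be adequate),
# then require every column of a row to match.
row_matches = np.all(np.isclose(a, target), axis=-1)

# Indices of the rows where the whole row matched.
matching_rows = np.flatnonzero(row_matches)
print(matching_rows)
# [3]
```

np.flatnonzero is used here instead of np.where only to get a plain 1-D index array rather than a one-element tuple.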

kmario23

Here is yet another solution, but be aware that it will be a bit slower than Dietrich's solution, particularly for large arrays.

In [1]: cond = np.array([30.94, 0.])
In [2]: arr = np.array([[  5.73,   0.  ],
                       [ 57.73,  10.  ],
                       [ 57.73,  20.  ],
                       [ 30.94,   0.  ],
                       [ 30.94,  10.  ],
                       [ 30.94,  20.  ],
                       [  4.14,   0.  ],
                       [  4.14,  10.  ]])

In [3]: [idx for idx, el in enumerate(arr) if np.array_equal(el, cond)]
Out[3]: [3]