I'm migrating from Python 2 to Python 3 and I'm facing an issue, which I have simplified to this:
import numpy as np
a = np.array([1, 2, None])
(a > 0).nonzero()
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: '>' not supported between instances of 'NoneType' and 'int'
In reality I'm processing NumPy arrays with millions of elements and really need to keep the vectorized NumPy operation for performance. In Python 2 this worked fine and returned what I expected, since Python 2 is less strict about comparing values of different types. What is the best approach for migrating this?
One way to achieve the desired result is to use a lambda function with np.vectorize.
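A minimal sketch of the np.vectorize idea, assuming the lambda should treat None as "not greater than zero" (the exact predicate and the name is_positive are illustrative choices, not necessarily what was originally posted):

```python
import numpy as np

a = np.array([1, 2, None])  # object-dtype array, as in the question

# Wrap a None-safe comparison so it can be applied element-wise.
# otypes=[bool] tells np.vectorize the output dtype up front.
is_positive = np.vectorize(lambda x: x is not None and x > 0, otypes=[bool])

idx = np.where(is_positive(a))
print(idx)  # (array([0, 1]),)
```

Note that np.vectorize is essentially a Python-level loop, so it is convenient rather than fast.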
Of course, if the array doesn't contain negative integers, you could just use np.where(a), as both None and 0 would evaluate to False.
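A short sketch of that shortcut, with a second made-up array (with_negative) added only to show why the "no negative integers" condition matters:

```python
import numpy as np

a = np.array([1, 2, None, 0])  # no negative values

# With a single argument, np.where returns the indices of truthy elements;
# None and 0 are both falsy, so they are skipped.
idx = np.where(a)
print(idx)  # (array([0, 1]),)

# Caveat: a negative number is truthy, so it would be selected as well.
with_negative = np.array([1, -2, None, 0])
idx_neg = np.where(with_negative)
print(idx_neg)  # (array([0, 1]),)  index 1 holds -2, which is not > 0
```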
Another way this can be solved is by first converting the array to use the float dtype, which has the effect of converting None to np.nan. Then np.where(a > 0) can be used as normal.
Time comparison:
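For reference, the float conversion approach being compared here can be sketched as follows. The variable name a_float is an illustrative choice, and the sketch assumes every non-None element is numeric:

```python
import numpy as np

a = np.array([1, 2, None])  # object-dtype array

# Casting to float turns None into nan.
a_float = a.astype(float)

# Any comparison with nan is False, so the former None entries drop out.
idx = np.where(a_float > 0)
print(idx)  # (array([0, 1]),)
```

Keep in mind that the cast allocates a new float array, and very large integers may lose precision once stored as float64.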
So Bob's approach, while not as easy on the eyes, is about twice as fast as the np.vectorize approach, and slightly slower than the float conversion approach.
Code to reproduce:
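A minimal benchmark sketch for the np.vectorize and float conversion approaches, using timeit. The array size, the share of None values, and the function names via_vectorize and via_float are assumptions made for illustration, and Bob's approach is not included because it is not shown here. Absolute timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np

# Build an object array of Python ints with every 10th entry set to None.
rng = np.random.default_rng(0)
a = rng.integers(-10, 10, size=100_000).astype(object)
a[::10] = None

# Created once, outside the timed functions.
_is_positive = np.vectorize(lambda x: x is not None and x > 0, otypes=[bool])


def via_vectorize(arr):
    """Indices of elements > 0, using a None-safe vectorized lambda."""
    return np.where(_is_positive(arr))


def via_float(arr):
    """Indices of elements > 0, after casting None to nan."""
    return np.where(arr.astype(float) > 0)


for fn in (via_vectorize, via_float):
    seconds = timeit.timeit(lambda: fn(a), number=3) / 3
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```

Both functions should return the same indices, which is worth checking before comparing their speed.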