I'm migrating from Python 2 to Python 3 and I'm facing an issue, which I have simplified to this:
import numpy as np
a = np.array([1, 2, None])
(a > 0).nonzero()
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: '>' not supported between instances of 'NoneType' and 'int'
In reality I'm processing NumPy arrays with millions of elements and really need to keep the NumPy operation for performance. In Python 2 this worked fine and returned what I expected, since Python 2 allows ordering comparisons between None and integers. What is the best approach for migrating this?
One way to achieve the desired result is to use a lambda function with np.vectorize:
Of course, if the array doesn't contain negative integers, you could just use np.where(a), as both None and 0 would evaluate to False:
Another way this can be solved is by first converting the array to use the float dtype, which has the effect of converting None to np.nan. Then np.where(a > 0) can be used as normal.
Time comparison:
So Bob's approach, while not as easy on the eyes, is about twice as fast as the np.vectorize approach, and slightly slower than the float conversion approach.
Code to reproduce:
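The original snippets are not shown here, so what follows is only a minimal sketch of the three approaches described in this answer (np.vectorize with a lambda, plain np.where(a), and conversion to float). The sample array, the variable names, and the timing harness are assumptions made for illustration; Bob's approach from the other answer is not reproduced because its code is not available here, and no timings are claimed.

```python
import timeit

import numpy as np

# Hypothetical sample data: includes None, a negative value and a zero
# so the differences between the approaches are visible.
a = np.array([1, 2, None, -3, 0], dtype=object)

# Approach 1: np.vectorize with a lambda that guards against None.
# otypes=[bool] fixes the output dtype instead of letting NumPy infer it.
is_positive = np.vectorize(lambda x: x is not None and x > 0, otypes=[bool])
idx_vectorize = is_positive(a).nonzero()[0]

# Approach 2: np.where(a) keeps every truthy element. None and 0 are
# dropped, but negative numbers are kept, so this matches "a > 0" only
# when the array contains no negative values.
idx_truthy = np.where(a)[0]

# Approach 3: convert to float, which turns None into np.nan.
# Comparisons with nan are False; errstate silences the warning that
# some NumPy versions emit for such comparisons.
a_float = a.astype(float)
with np.errstate(invalid="ignore"):
    idx_float = np.where(a_float > 0)[0]

print(idx_vectorize, idx_truthy, idx_float)

# Rough timing harness on a larger array (the size is an arbitrary choice).
big = np.tile(a, 20_000)
for label, fn in [
    ("np.vectorize", lambda: is_positive(big).nonzero()),
    ("np.where(a)", lambda: np.where(big)),
    ("float conversion", lambda: np.where(big.astype(float) > 0)),
]:
    print(label, timeit.timeit(fn, number=3))
```

Note that the float conversion only works when every non-None element can be represented as a float, and that very large integers may lose precision in the process.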