Migrating python2 mixed-type np.array operations to python3

Question

77 views Asked by Bob At 10 February 2020 at 09:24

I'm migrating from python2 to python3 and I'm facing an issue which I have simplified to this:

import numpy as np
a = np.array([1, 2, None])
(a > 0).nonzero()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: '>' not supported between instances of 'NoneType' and 'int'

In reality I'm processing NumPy arrays with millions of elements and really need to keep the NumPy operations for performance. In Python 2 this worked fine and returned what I expected, since Python 2 allows ordering comparisons between None and integers (None compares as smaller than any number). What is the best approach for migrating this?

There are 2 answers

Bob On 10 February 2020 at 10:21

To conclude, with the help of @CDJB and @DeepSpace, the best solution I found is to replace the None values with a value suitable for the specific operation (here 0, which fails the > 0 test). I also copy the array first so the original data is not modified.

import numpy as np
a = np.array([1, None, 2, None])
deep_copy = np.copy(a)
deep_copy[deep_copy == None] = 0
result = (deep_copy > 0).nonzero()[0]
print(result)
[0 2]
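
If no single replacement value suits the operation, a boolean mask of the non-None entries could be used instead. The following is only a minimal sketch, assuming every non-None entry is numeric; the helper name positive_indices is made up for illustration.

```python
import numpy as np

def positive_indices(arr):
    """Return indices of entries that are not None and greater than 0."""
    arr = np.asarray(arr, dtype=object)
    # Elementwise comparison against None (intentional, so `is` is not used).
    valid = arr != None  # noqa: E711
    valid_idx = np.flatnonzero(valid)
    # Only the numeric entries are compared, so None never reaches `>`.
    return valid_idx[arr[valid].astype(float) > 0]

a = np.array([1, None, 2, None, -3], dtype=object)
print(positive_indices(a))
```

This avoids choosing a sentinel such as 0, at the cost of one extra indexing step.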

**CDJB** · Accepted Answer · 2020-02-10T10:22:56+00:00

One way to achieve the desired result is to use a lambda function with np.vectorize:

>>> a = np.array([1, 2, None, 4, -1])
>>> f = np.vectorize(lambda t: t and t>0)
>>> np.where(f(a))
(array([0, 1, 3], dtype=int64),)
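
One caveat: when otypes is not given, np.vectorize infers the output type from the result for the first element, and `t and t>0` returns 0 or None rather than a boolean for such inputs. A slightly more defensive variant, sketched here under the assumption that a plain boolean mask is wanted, tests for None explicitly and passes otypes (the name is_positive is illustrative):

```python
import numpy as np

# Explicit None test plus otypes=[bool], so the result is always a boolean
# array regardless of which value happens to come first.
is_positive = np.vectorize(lambda t: t is not None and t > 0, otypes=[bool])

a = np.array([0, None, 3, -1, 2], dtype=object)
mask = is_positive(a)
idx = np.flatnonzero(mask)
print(idx)
```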

Of course, if the array doesn't contain negative integers, you could just use np.where(a), as both None and 0 would evaluate to False:

>>> a = np.array([1, 2, None, 4, 0])
>>> np.where(a)
(array([0, 1, 3], dtype=int64),)

Another way to solve this is to first convert the array to the float dtype, which converts None to np.nan. Because any ordering comparison with np.nan is False, np.where(a > 0) can then be used as normal.

>>> a = np.array([1, 2, None, 4, -1])
>>> np.where(a.astype(float) > 0)
(array([0, 1, 3], dtype=int64),)
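
To make the float conversion more concrete, the sketch below shows where the NaN ends up and that it fails both `> 0` and `<= 0`. It assumes every non-None entry can be converted to float; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, None, 4, -1], dtype=object)
as_float = a.astype(float)       # None becomes nan
missing = np.isnan(as_float)     # marks the former None entries

positive = as_float > 0          # nan > 0 is False
non_positive = as_float <= 0     # nan <= 0 is False as well

idx = np.flatnonzero(positive)
print(idx)
```

Since NaN fails both comparisons, negating the mask is not the same as testing `<= 0`; the former None entries would be included in `~positive`.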

Time comparison:

So Bob's approach, while not as easy on the eyes, is about twice as fast as the np.vectorize approach, and slightly slower than the float conversion approach.

Code to reproduce:

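import perfplot
import numpy as np

# Falsy for None, 0 and negative numbers; True for positive numbers.
f = np.vectorize(lambda t: t and t>0)

# Integers from -10 to 10 plus None, so the sampled arrays have object dtype.
choices = list(range(-10,11)) + [None]

def cdjb(arr):
    # np.vectorize approach
    return np.where(f(arr))

def cdjb2(arr):
    # float conversion approach (None becomes nan)
    return np.where(arr.astype(float) > 0)

def Bob(arr):
    # replace None with 0 on a copy, then compare
    deep_copy = np.copy(arr)
    deep_copy[deep_copy == None] = 0
    return (deep_copy > 0).nonzero()[0]

perfplot.show(
    setup=lambda n: np.random.choice(choices, size=n),
    n_range=[2**k for k in range(25)],
    kernels=[cdjb, cdjb2, Bob],
    xlabel='len(a)',
    )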
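
For a quick check without perfplot, the three approaches can be compared for agreement and timed with the standard library's timeit. This is a rough sketch, not the benchmark used above: the function names, array size and repeat count are arbitrary choices, and the vectorized kernel uses an explicit None test with otypes so that it does not depend on the first element.

```python
import timeit
import numpy as np

is_positive = np.vectorize(lambda t: t is not None and t > 0, otypes=[bool])

def via_vectorize(arr):
    return np.flatnonzero(is_positive(arr))

def via_float(arr):
    return np.flatnonzero(arr.astype(float) > 0)

def via_fill(arr):
    filled = np.copy(arr)
    filled[filled == None] = 0  # noqa: E711 (elementwise comparison)
    return np.flatnonzero(filled > 0)

# Test data: integers from -10 to 10 plus None, kept as object dtype.
choices = np.array(list(range(-10, 11)) + [None], dtype=object)
rng = np.random.default_rng(0)
sample = choices[rng.integers(0, len(choices), size=10_000)]

# All three kernels should return the same indices.
expected = via_float(sample)
for kernel in (via_vectorize, via_fill):
    assert np.array_equal(kernel(sample), expected)

# Average time per call over a few repetitions.
for kernel in (via_vectorize, via_float, via_fill):
    seconds = timeit.timeit(lambda: kernel(sample), number=5)
    print(f"{kernel.__name__}: {seconds / 5:.6f} s per call")
```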
