Rationale for numpy.split returning a list and not an array

Question

2.7k views Asked by kilojoules At 03 August 2018 at 17:59

I was surprised that numpy.split yields a list and not an array. I would have thought it would be better to return an array, since numpy has put a lot of work into making arrays more useful than lists. Can anyone justify numpy returning a list instead of an array? Why would that be a better programming decision for the numpy developers to have made?

Original Q&A

There are 2 answers

akki On 30 June 2019 at 16:44

Actually, you are right: it returns a list.

import numpy as np

a = np.random.randint(1, 30, (2, 2))
b = np.hsplit(a, 2)  # split the 2x2 array into two column blocks
type(b)

This reports type(b) as list, so there is nothing wrong with the documentation. I also thought at first that the documentation was wrong because it doesn't return an array, but when I checked

type(b[0])
type(b[1])

it returned type as ndarray.

This means it returns a list of ndarrays.

**hpaulj** · Accepted Answer · 2018-08-03T18:32:51+00:00

A comment pointed out that if the split is uneven, the result can't be an array, at least not one that has the same dtype. At best it would be an object dtype array.
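To make the uneven case concrete, here is a minimal sketch. The variable names (`pieces`, `as_objects`, `split_failed`) are only for illustration, and the object array is built by hand to show what "at best an object dtype" would look like.

```python
import numpy as np

x = np.arange(10)

# np.split refuses an integer section count that does not divide the length.
try:
    np.split(x, 3)
    split_failed = False
except ValueError:
    split_failed = True

# np.array_split allows it, so the pieces end up with different lengths.
pieces = np.array_split(x, 3)
lengths = [len(p) for p in pieces]

# Unequal pieces cannot form a regular 2-D integer array. The closest
# thing is a 1-D object array whose elements are the subarrays.
as_objects = np.empty(len(pieces), dtype=object)
for i, p in enumerate(pieces):
    as_objects[i] = p

print(split_failed, lengths, as_objects.dtype, as_objects.shape)
```

An object array like this loses most of the benefits of a numeric ndarray, so it offers little over a plain list.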

But let's consider the case of equal-length subarrays:

In [124]: x = np.arange(10)
In [125]: np.split(x,2)
Out[125]: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]
In [126]: np.array(_)     # make an array from that
Out[126]: 
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

But we can get the same array without split - just reshape:

In [127]: x.reshape(2,-1)
Out[127]: 
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

Now look at the code for split. It just passes the task to array_split. Ignoring the details about alternative axes, it just does

sub_arys = []
for i in range(Nsections):
    # st and end come from consecutive entries of div_points
    st, end = div_points[i], div_points[i + 1]
    sub_arys.append(sary[st:end])
return sub_arys

In other words, it just steps through the array and returns successive slices. Those are (often) views of the original.
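A quick way to check the view behaviour for a simple 1-D case (a sketch; `parts` and `shares` are illustrative names):

```python
import numpy as np

x = np.arange(10)
parts = np.split(x, 2)

# The pieces share memory with x rather than holding copies.
shares = np.shares_memory(parts[0], x)

# Writing through a piece is therefore visible in the original array.
parts[1][0] = -1

print(shares, x[5])
```

Because the pieces are slices of the same buffer, split itself copies no data; a copy only happens if you later build a new array from the list.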

So split is not that sophisticated a function. You could generate such a list of subarrays yourself without a lot of numpy expertise.
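As a rough illustration of how little is involved, here is a hand-rolled version for the equal-sections case along the first axis. `simple_split` is a hypothetical helper written for this answer, not part of numpy, and it ignores the `axis` argument and the index-list form that the real function supports.

```python
import numpy as np

def simple_split(arr, n_sections):
    """Split arr into n_sections equal slices along the first axis."""
    length = arr.shape[0]
    if length % n_sections != 0:
        raise ValueError("array length is not divisible by n_sections")
    step = length // n_sections
    # Each element is a basic slice of arr, collected in a plain list.
    return [arr[i * step:(i + 1) * step] for i in range(n_sections)]

x = np.arange(12)
mine = simple_split(x, 3)
theirs = np.split(x, 3)
same = all(np.array_equal(m, t) for m, t in zip(mine, theirs))
print(same)
```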

Another point: the documentation notes that split can be reversed with an appropriate stack. concatenate (and family) takes a list of arrays. If given an array of arrays, or a higher-dimensional array, it effectively iterates on the first dimension, e.g. concatenate(arr) => concatenate(list(arr)).
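Both halves of that point can be checked in a few lines (a sketch with illustrative variable names):

```python
import numpy as np

x = np.arange(10)

# split followed by concatenate gives back the original values.
round_trip = np.concatenate(np.split(x, 2))

# A 2-D array is treated as a sequence of its rows, so these two agree.
arr = x.reshape(2, -1)
from_array = np.concatenate(arr)
from_list = np.concatenate(list(arr))

print(round_trip, from_array, from_list)
```

Since the functions that consume the pieces accept a list just as readily as an array, returning a list costs nothing on the way back.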
