Why is there a difference in the output of the same indexing selection inside a NumPy array?


Let's assume that I have a 2-dimensional NumPy array that looks like this, and I want to extract the bottom-right square (2x2):

arr_2d = [[ 5,10,15],
          [20,25,30],
          [35,40,45]]

Why is there a difference between this way:

arr_2d[row,col]

and this way:

arr_2d[row][col]

I'm saying that there is a difference because I got different outputs while trying to do this:

arr_2d[1:3,1:3] #output was: [[25,30],
                              [40,45]]

arr_2d[1:3][1:3] #output was: [[35, 40, 45]]

If I'm wrong in my question, can you tell me why, please?

Thanks in advance!

There are 3 answers

JohanC On BEST ANSWER

Supposing arr_2d is declared as numpy array:

import numpy as np
arr_2d = np.array([[5, 10, 15],
                   [20, 25, 30],
                   [35, 40, 45]])

Then, arr_2d[1:3, 1:3] will return the submatrix made of indices 1 and 2 along each dimension, that is, rows 1 and 2 and columns 1 and 2 (note that Python starts indexing at 0).

arr_2d[1:3][1:3] is interpreted as indexing two times:

  • First arr_2d[1:3] takes rows 1 and 2: rows_1_2 = np.array([[20, 25, 30], [35, 40, 45]])

  • Then, that result is indexed again with [1:3], so rows_1_2[1:3] which would give rows 1 and 2 of rows_1_2. As row 2 doesn't exist in that array, only row 1 is returned, so [[35, 40, 45]]. Note that this is a 1x3 array.
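
The two steps above can be traced directly. The following is a minimal sketch that reuses the arr_2d array defined earlier; the names chained and direct are only illustrative.

```python
import numpy as np

arr_2d = np.array([[5, 10, 15],
                   [20, 25, 30],
                   [35, 40, 45]])

# Step 1: slice the rows only; the result still has all 3 columns.
rows_1_2 = arr_2d[1:3]
print(rows_1_2.shape)   # (2, 3)

# Step 2: the second [1:3] slices the rows again. Only index 1 exists,
# and out-of-range slice bounds are clipped instead of raising an error.
chained = rows_1_2[1:3]
print(chained.shape)    # (1, 3)

# A single index with two slices selects rows and columns together.
direct = arr_2d[1:3, 1:3]
print(direct.shape)     # (2, 2)
```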

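In general, it is strongly recommended to index all dimensions in one step, as in arr_2d[1:3, 1:3]. Chained indexing applies the second slice to the rows again rather than to the columns, and creating the intermediate result can be needlessly slow for large arrays.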
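
If the selection really has to be done in two steps, the second index must address the column axis explicitly. The sketch below assumes the same arr_2d as above; one_step and two_steps are illustrative names.

```python
import numpy as np

arr_2d = np.array([[5, 10, 15],
                   [20, 25, 30],
                   [35, 40, 45]])

# Preferred form: rows and columns in a single index.
one_step = arr_2d[1:3, 1:3]

# Chained form that does reach the columns: the second index keeps
# all remaining rows (:) and slices the second axis.
two_steps = arr_2d[1:3][:, 1:3]

print(np.array_equal(one_step, two_steps))  # True
```

Both forms give the same values here; the one-step form is shorter and avoids the intermediate result.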

Note that with standard Python lists, to obtain a similar sub-matrix, you'd need to write it as:

list_2d = [[5, 10, 15],
           [20, 25, 30],
           [35, 40, 45]]
[row[1:3] for row in list_2d[1:3]] # result: [[25, 30], [40, 45]]

This is both harder to read and much slower for large lists. But note that standard Python lists can hold sublists of different types and lengths, while a regular NumPy array needs every row to have the same length and every element to have the same type.
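
As a quick check of how plain lists differ, the sketch below tries both index styles from the question on a nested list. It assumes only the standard library; error_message and chained are illustrative names.

```python
list_2d = [[5, 10, 15],
           [20, 25, 30],
           [35, 40, 45]]

# A comma-separated index is not supported by lists at all.
error_message = None
try:
    list_2d[1:3, 1:3]
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Chained slicing works, but slices the outer list twice,
# exactly like the NumPy case in the question.
chained = list_2d[1:3][1:3]
print(chained)  # [[35, 40, 45]]
```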

Slicing and broadcasting are what make Python with NumPy very suitable for numeric manipulations and calculations.

Gigioz On

There is an order in slicing: when I do

arr_2d[1:3]

I get [[20 25 30],[35 40 45]]

and so the second time I use it

arr_2d[1:3][1:3]

I get [[35 40 45]]

kuco 23 On

You have to understand that when you index an object with [], you are calling the __getitem__ method defined in the object's class. NumPy handles two kinds of keys there. In your first case, the comma-separated index is passed as a single tuple of two slices, which NumPy applies to the rows and the columns of the matrix together, as in

arr_2d[0:2, 0:2]
# returns [[5, 10], [20,25]]

In the second case, each [] passes a single slice, which behaves pretty much like normal list slicing and applies to the first axis only. You are slicing the array two times, as illustrated below

a1 = arr_2d[1:3] # gets [[20,25,30], [35,40,45]]
a1[1:3] # returns [[35,40,45]]
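
To see what [] actually hands to __getitem__, a small probe class can help. This is only a sketch: ShowKey is a hypothetical helper written for this illustration, not part of NumPy, and it simply returns the key it receives.

```python
class ShowKey:
    """Hypothetical helper: returns whatever key [] passes to __getitem__."""

    def __getitem__(self, key):
        return key


probe = ShowKey()

# One slice inside the brackets arrives as a slice object.
single = probe[1:3]
print(single)   # slice(1, 3, None)

# Two comma-separated slices arrive as ONE tuple of slices.
pair = probe[1:3, 1:3]
print(pair)     # (slice(1, 3, None), slice(1, 3, None))
```

So arr_2d[1:3, 1:3] is one call that receives both slices at once, while arr_2d[1:3][1:3] is two separate calls, the second one made on the result of the first.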