What does -1 mean in numpy reshape?


A 2D array can be reshaped into a 1D array using .reshape(-1). For example:

>>> a = numpy.array([[1, 2, 3, 4], [5, 6, 7, 8]])
>>> a.reshape(-1)
array([1, 2, 3, 4, 5, 6, 7, 8])

Usually, array[-1] means the last element. But what does -1 mean here?

12

There are 12 answers

0
Dinesh Kumar On
numpy.reshape(a, newshape, order='C')

See the documentation for more information: https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html

In the example below, -1 tells NumPy to infer the length from the number of elements. Because numpy.matrix is always two-dimensional, the result is a single row of shape (1, 8):

a = numpy.matrix([[1, 2, 3, 4], [5, 6, 7, 8]])
b = numpy.reshape(a, -1)

output:

matrix([[1, 2, 3, 4, 5, 6, 7, 8]])

this can be explained more precisely with another example:

b = np.arange(10).reshape((-1,1))

output (a 2-D array with a single column, shape (10, 1)):

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])

or

b = np.arange(10).reshape((1,-1))

output (a 2-D array with a single row, shape (1, 10)):

array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
3
falsetru On

According to the documentation:

newshape : int or tuple of ints

The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.
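
As a small illustration of the quoted rule (the array below is made up for the example), the inferred value is whatever makes the total element count match:

```python
import numpy as np

a = np.arange(12)          # 12 elements, shape (12,)

b = a.reshape(3, -1)       # -1 is inferred as 12 / 3 = 4
c = a.reshape(-1, 6)       # -1 is inferred as 12 / 6 = 2
d = a.reshape(12)          # a plain integer gives a 1-D array of that length

print(b.shape, c.shape, d.shape)   # (3, 4) (2, 6) (12,)
```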

0
iacob On

"Infer this dimension given all other dimensions have been specified."

Explicitly, the -1 dimension becomes the quotient of the original array's total number of elements (the product of its dimensions) by the product of the newly specified dimensions. If this quotient is not an integer, NumPy raises a ValueError.

E.g. for an array of shape (2,3,5), the following are all equivalent, each producing shape (3, 2, 5):

a = np.random.rand(2, 3, 5)

np.reshape(a, (-1,  2,  5))
np.reshape(a, ( 3, -1,  5))
np.reshape(a, ( 3,  2, -1))
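
The arithmetic can be sketched as follows. Note that infer_missing_dim is a hypothetical helper written for this illustration, not part of NumPy; it only mirrors the quotient rule described above.

```python
import math
import numpy as np

def infer_missing_dim(original_shape, new_shape):
    """Hypothetical helper: compute the value that replaces -1."""
    total = math.prod(original_shape)                    # all elements
    known = math.prod(d for d in new_shape if d != -1)   # specified dims
    if total % known != 0:
        raise ValueError(f"cannot reshape {original_shape} into {new_shape}")
    return total // known

a = np.random.rand(2, 3, 5)                          # 30 elements
inferred = infer_missing_dim(a.shape, (-1, 2, 5))    # 30 // (2 * 5) = 3
print(inferred, np.reshape(a, (-1, 2, 5)).shape)     # 3 (3, 2, 5)

# 30 is not divisible by 4, so NumPy rejects this shape.
try:
    np.reshape(a, (4, -1))
except ValueError as err:
    print("ValueError:", err)
```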
0
Sida Zhou On

I didn't manage to understand what np.reshape() does until I read this article.

Mechanically it is clear what reshape() does. But how do we interpret the data before and after reshape?

The missing piece for me was:

When we train a machine learning model, the nesting levels of arrays have precisely defined meaning.

This means that the reshape operation has to be keenly aware of both points below before it has any meaning:

  • the data it operates on (what the reshape input looks like)
  • how the algorithm/model expects the reshaped data to be (what the reshape output looks like)

For example:

The external array contains observations/rows. The inner array contains columns/features. This causes two special cases when we have either an array of multiple observations of only one feature or a single observation of multiple features.
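
A minimal sketch of those two special cases, using made-up values (the variable names are illustrative only):

```python
import numpy as np

values = np.array([1.5, 2.0, 3.5, 4.0])   # a flat, 1-D array of 4 numbers

# Four observations of a single feature: one column, rows inferred.
many_observations = values.reshape(-1, 1)   # shape (4, 1)

# A single observation with four features: one row, columns inferred.
one_observation = values.reshape(1, -1)     # shape (1, 4)

print(many_observations.shape, one_observation.shape)   # (4, 1) (1, 4)
```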

For more advanced example: See this stackoverflow question


EDIT: added much more detailed example, see below.

Scenario

We have 3 groups/copies of the following:

fig

(figure illustrates 1 group)

Everything is flattened, so the embeddings of 3 src nodes, with emb_size=32, have shape torch.Size([3, 32]), and the embeddings of 6 tgt nodes have shape torch.Size([6, 32]).

fig2

Goal

We want to reshape the data so that each src corresponds to 2 tgt nodes, so we do:

fig3

Now, for i-th src node, we have:

  • source_embs[i,:]
  • with the corresponding target_embs[i,:,:]
  • This is the whole point: the data is now neatly organized; without reshaping we can't do this simple indexing.

Details

Looking at shape of target_embs:

  • before reshaping, shape is [6,32]
  • we start from the rightmost dim, dim1=32; it isn't changed in the reshape, so ignore it
  • we view the shape as [6,*], so the rightmost remaining dim is dim0=6; in effect we ignore dim1 and view the shape as [6]
  • When we reshape [6] into [3,2], we always fill the rightmost dim first, so we take 2 elements, then change row, then 2 elements, then change row, and so on
  • As prior knowledge, we know [6,*] corresponds to [src1_tgt1, src1_tgt2, src2_tgt1, src2_tgt2, src3_tgt1, src3_tgt2] (this input has to be in this format, or else we need to rearrange the input into this format)
  • hence we know the output is formatted correctly: [3,2] will correspond to what we want: [[src1_tgt1,src1_tgt2],[src2_tgt1, src2_tgt2],[src3_tgt1, src3_tgt2]]
  • So reshaping [6,32] into [3,2,32] is now complete
  • What if we want to reshape [6,32] into [4,3,16]? torch can do this, because the total element counts match (6*32 = 4*3*16 = 192), but the result is useless for our purposes
  • What if we want to have [32,2,3] in the end instead of [3,2,32]? Do we just do reshape(input6x32,(32,2,3))? No, because the data will be scrambled and meaningless. What we can do is get to [3,2,32] first, and then use transpose() to get [32,2,3].
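
The steps above can be checked with a small sketch. It uses NumPy rather than torch (an assumption made to keep the example self-contained), and the data is a stand-in in which row k of target_embs is filled with the value k so the grouping is easy to see.

```python
import numpy as np

# Stand-in data: 6 target rows, emb_size=32; row k contains only the value k.
target_embs = np.repeat(np.arange(6), 32).reshape(6, 32)

# Group consecutive pairs of rows: [6, 32] -> [3, 2, 32].
grouped = target_embs.reshape(3, -1, 32)
print(grouped[:, :, 0].tolist())    # [[0, 1], [2, 3], [4, 5]]

# To end up with [32, 2, 3], transpose the grouped array...
transposed = grouped.transpose(2, 1, 0)
# ...instead of reshaping directly, which scrambles the grouping.
scrambled = target_embs.reshape(32, 2, 3)

print(transposed[0].tolist())   # [[0, 2, 4], [1, 3, 5]]
print(scrambled[0].tolist())    # [[0, 0, 0], [0, 0, 0]]
```

Both results have shape (32, 2, 3), but only the transposed one still pairs each source with its own two targets.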

summary (for basic usage)

  • Reshape 2 consecutive dimensions at a time, and only 2. This way it's much more understandable.
  • If you want to reshape non-consecutive dimensions, transpose before reshaping.
  • There are probably more advanced usages, but this is the only way I have managed to understand what reshape() is doing.
0
Scott On
import numpy as np
x = np.array([[2,3,4], [5,6,7]]) 

# Convert any shape to 1D shape
x = np.reshape(x, -1) # Flattening to 1-D -> (6,)

# When you don't care about rows and just want to fix number of columns
x = np.reshape(x, (-1, 1)) # Making it 1 column -> (6, 1)
x = np.reshape(x, (-1, 2)) # Making it 2 column -> (3, 2)
x = np.reshape(x, (-1, 3)) # Making it 3 column -> (2, 3)

# When you don't care about columns and just want to fix number of rows
x = np.reshape(x, (1, -1)) # Making it 1 row -> (1, 6)
x = np.reshape(x, (2, -1)) # Making it 2 row -> (2, 3)
x = np.reshape(x, (3, -1)) # Making it 3 row -> (3, 2)
1
Shawyan Azdam On

Long story short: you set some dimensions and let NumPy work out the remaining one.

(userDim1, userDim2, ..., -1) -->>

(userDim1, userDim2, ..., TOTAL_SIZE / (userDim1 * userDim2 * ...))

where TOTAL_SIZE is the total number of elements in the array.
0
deepjyoti22 On

The conversion always preserves the number of elements: the final array has the same number of elements as the initial array or data frame.

-1 corresponds to the unknown count of rows or columns. We can think of it as x (unknown). x is obtained by dividing the number of elements in the original array by the other value in the pair that contains -1.

Examples:

12 elements with reshape(-1,1) corresponds to an array with x=12/1=12 rows and 1 column.


12 elements with reshape(1,-1) corresponds to an array with 1 row and x=12/1=12 columns.

0
F0rge1cE On

The -1 stands for "unknown dimension" which can be inferred from another dimension. In this case, if you set your matrix like this:

a = numpy.matrix([[1, 2, 3, 4], [5, 6, 7, 8]])

Modify your matrix like this:

b = numpy.reshape(a, -1)

Because a is a numpy.matrix, which is always 2-D, this returns a matrix with a single row, of shape (1, 8). With a regular numpy.array the same call would return a 1-D array of shape (8,).

However, I don't think it is a good idea to use code like this. Why not try:

b = a.reshape(1, -1)

For the numpy.matrix above it gives the same result, and it is clearer for readers: b is another shape of a. We don't know how many columns it should have (so set that to -1), but we want a single row (so set the first parameter to 1).
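
One caveat worth checking: on a regular numpy.array the two calls are not interchangeable. A quick sketch, using the array from the question:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

flat = a.reshape(-1)       # 1-D result
row = a.reshape(1, -1)     # 2-D result with a single row

print(flat.shape, flat.ndim)   # (8,) 1
print(row.shape, row.ndim)     # (1, 8) 2
```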

5
Julu Ahamed On

The criterion to satisfy for providing the new shape is that 'The new shape should be compatible with the original shape'

NumPy allows us to give one of the new shape parameters as -1 (e.g. (2,-1) or (-1,3), but not (-1, -1)). It simply means that the dimension is unknown and we want NumPy to figure it out. NumPy does this by looking at the 'length of the array and remaining dimensions' and making sure the result satisfies the criterion above.

Now see the example.

z = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])
z.shape
(3, 4)

Now reshape with (-1). The resulting shape is (12,), which is compatible with the original shape (3, 4):

z.reshape(-1)
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

Now reshape with (-1, 1). We have provided 1 as the number of columns and left the rows unknown, so the resulting shape is (12, 1), again compatible with the original shape (3, 4):

z.reshape(-1,1)
array([[ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12]])

The above is consistent with the scikit-learn advice/error message to use reshape(-1,1) for a single feature, i.e. a single column:

Reshape your data using array.reshape(-1, 1) if your data has a single feature

New shape (-1, 2): rows unknown, 2 columns. The resulting shape is (6, 2):

z.reshape(-1, 2)
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10],
       [11, 12]])

Now keep the columns unknown. New shape (1,-1): 1 row, columns unknown. The resulting shape is (1, 12):

z.reshape(1,-1)
array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]])

The above is consistent with the scikit-learn advice/error message to use reshape(1,-1) for a single sample, i.e. a single row:

Reshape your data using array.reshape(1, -1) if it contains a single sample

New shape (2, -1): 2 rows, columns unknown. The resulting shape is (2, 6):

z.reshape(2, -1)
array([[ 1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12]])

New shape (3, -1): 3 rows, columns unknown. The resulting shape is (3, 4):

z.reshape(3, -1)
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

And finally, if we try to leave both dimensions unknown, i.e. new shape (-1,-1), NumPy raises an error:

z.reshape(-1, -1)
ValueError: can only specify one unknown dimension
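
The failure cases can be probed without crashing the session. In this sketch try_reshape is a hypothetical helper written for the illustration, not a NumPy function:

```python
import numpy as np

z = np.arange(1, 13).reshape(3, 4)   # same values as z above, 12 elements

def try_reshape(arr, shape):
    """Hypothetical helper: return the new shape, or None if NumPy refuses."""
    try:
        return arr.reshape(shape).shape
    except ValueError:
        return None

print(try_reshape(z, (-1, 2)))    # (6, 2)
print(try_reshape(z, (-1, -1)))   # None: only one unknown dimension is allowed
print(try_reshape(z, (-1, 5)))    # None: 12 is not divisible by 5
```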
0
Lucas On

When you use -1 (or, in my tests, any other negative integer, although the documentation only describes -1) in

b = numpy.reshape(a, -1)

you are simply asking numpy.reshape to calculate the size of the vector (rows x columns) automatically and rearrange the data into a 1-D vector of that length. If you wanted to reshape to 1-D by passing a positive integer instead, the call would only work if you entered exactly the value rows x columns. Being able to pass -1 makes the process easier.
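
A short comparison of the two approaches, using the array from the question as an example:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Explicit length: we must spell out rows x columns = 2 * 4 = 8 ourselves.
explicit = a.reshape(a.shape[0] * a.shape[1])

# Inferred length: NumPy computes the 8 for us.
inferred = a.reshape(-1)

print(np.array_equal(explicit, inferred))   # True
```

Passing any positive integer other than 8 here would raise a ValueError, which is the inconvenience that -1 avoids.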

0
lonewolf On

It simply means that you are not sure how many rows or columns to give, and you are asking NumPy to work out the number of rows or columns for the reshaped array.

The last example in the NumPy documentation demonstrates -1: https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html

Check the code below and its output to better understand (-1):

CODE:-

import numpy
a = numpy.matrix([[1, 2, 3, 4], [5, 6, 7, 8]])
print("Without reshaping  -> ")
print(a)
b = numpy.reshape(a, -1)
print("HERE We don't know about what number we should give to row/col")
print("Reshaping as (a,-1)")
print(b)
c = numpy.reshape(a, (-1,2))
print("HERE We just know about number of columns")
print("Reshaping as (a,(-1,2))")
print(c)
d = numpy.reshape(a, (2,-1))
print("HERE We just know about number of rows")
print("Reshaping as (a,(2,-1))")
print(d)

OUTPUT :-

Without reshaping  -> 
[[1 2 3 4]
 [5 6 7 8]]
HERE We don't know about what number we should give to row/col
Reshaping as (a,-1)
[[1 2 3 4 5 6 7 8]]
HERE We just know about number of columns
Reshaping as (a,(-1,2))
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
HERE We just know about number of rows
Reshaping as (a,(2,-1))
[[1 2 3 4]
 [5 6 7 8]]
1
Anuj Gupta On

Say we have a 3 dimensional array of dimensions 2 x 10 x 10:

r = numpy.random.rand(2, 10, 10) 

Now we want to reshape to 5 x 5 x 8:

numpy.reshape(r, (5, 5, 8))

will do the job.

Note that once you fix the first dim = 5 and the second dim = 5, you don't need to work out the third dimension. To assist your laziness, NumPy gives the option of using -1:

numpy.reshape(r, (5, 5, -1))

will give you an array of shape = (5, 5, 8).

Likewise,

numpy.reshape(r, (50, -1))

will give you an array of shape = (50, 4)

You can read more at http://anie.me/numpy-reshape-transpose-theano-dimshuffle/