Input dimension reshape in a TensorFlow convolutional network


In the expert MNIST tutorial on the TensorFlow website, there is a line like this:

x_image = tf.reshape(x, [-1,28,28,1])

I know that the reshape has the form

tf.reshape(input,[batch_size,width,height,channel])

Q1: Why is batch_size equal to -1? What does the -1 mean?

Further down in the code there is one more thing I cannot understand:

W_fc1 = weight_variable([7 * 7 * 64, 1024])

Q2: What does image_size * 64 mean?


There are 2 answers

chris On BEST ANSWER

Q1: Why is batch_size equal to -1? What does the -1 mean?

-1 means "figure this part out for me". For example, if I run:

reshape([1, 2, 3, 4, 5, 6, 7, 8], [-1, 2])

It creates two columns, and whatever number of rows it needs to get everything to fit:

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
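The same rule explains the tutorial's x_image line. Here is a minimal sketch using NumPy, whose reshape treats -1 the same way as tf.reshape. The batch of 5 zero-filled images is made up for illustration; in the tutorial the batch size is whatever gets fed in at run time.

```python
import numpy as np

# Hypothetical batch: 5 flattened 28x28 images (784 values each).
x = np.zeros((5, 784))

# -1 asks reshape to infer the batch dimension:
# 5 * 784 elements / (28 * 28 * 1) elements per image = 5 images.
x_image = x.reshape(-1, 28, 28, 1)

print(x_image.shape)  # (5, 28, 28, 1)
```

Because the batch dimension is inferred, the same line works unchanged for a batch of 50 or 10,000 images.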

Q2: What does image_size * 64 mean?

64 is the number of filters in the preceding convolutional layer, i.e. the number of channels in its output, and 7 * 7 is the height and width of that output. Shapes of filters in conv layers follow the format [height, width, # of input channels (number of filters in the previous layer), # of filters].

Mark McDonald On

When you pass -1 as a dimension in tf.reshape, TensorFlow infers that dimension from the tensor's total size and the other dimensions you specify. From the docs:

If one component of shape is the special value -1, the size of that dimension is computed so that the total size remains constant. In particular, a shape of [-1] flattens into 1-D. At most one component of shape can be -1.
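To make the quoted rule concrete, here is a rough pure-Python sketch of how the -1 could be resolved. The helper resolve_shape is hypothetical, written only for illustration; it is not part of TensorFlow and not how the library implements it internally.

```python
from math import prod

def resolve_shape(total_size, shape):
    """Hypothetical helper: replace a single -1 in `shape` so that the
    product of all dimensions equals `total_size`."""
    if shape.count(-1) > 1:
        raise ValueError("at most one component of shape can be -1")
    if -1 not in shape:
        if prod(shape) != total_size:
            raise ValueError("shape does not match the total size")
        return list(shape)
    # Product of the dimensions that were given explicitly.
    known = prod(d for d in shape if d != -1)
    if known == 0 or total_size % known != 0:
        raise ValueError("total size is not divisible by the known dimensions")
    return [total_size // known if d == -1 else d for d in shape]

print(resolve_shape(8, [-1, 2]))                  # [4, 2]
print(resolve_shape(100 * 784, [-1, 28, 28, 1]))  # [100, 28, 28, 1]
print(resolve_shape(24, [-1]))                    # [24]
```

The last call shows the flattening case from the docs: with nothing else specified, the single -1 takes the whole size.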

The 7 * 7 * 64 arises because the convolutional and pooling layers applied before this line have reduced each image to a shape of [7, 7, 64]. The input to the fully connected layer that follows must be a single dimension per example, so in the next line of the tutorial the tensor is reshaped from [7, 7, 64] to [7 * 7 * 64] so that it can connect to the FC layer.
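The arithmetic can be checked by hand. The sketch below assumes the architecture I recall from the tutorial: two convolutional layers that keep height and width unchanged (SAME padding, stride 1), each followed by 2x2 max pooling with stride 2, with 64 filters in the last conv layer. If your network differs, the numbers will differ too.

```python
# Assumption: each conv layer preserves height/width and each 2x2
# max-pooling step halves them.
size = 28                # MNIST images are 28x28
for _ in range(2):       # two conv + pool stages
    size //= 2           # 28 -> 14 -> 7

channels = 64            # number of filters in the last conv layer
flat_size = size * size * channels

# The fully connected weight matrix maps the flattened features to 1024 units.
w_fc1_shape = [flat_size, 1024]

print(size, flat_size, w_fc1_shape)  # 7 3136 [3136, 1024]
```

So weight_variable([7 * 7 * 64, 1024]) is a 3136 x 1024 matrix: one row per value in the flattened [7, 7, 64] feature map.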

For more info on how convolution and max pooling work, the Wikipedia page has some helpful graphics, e.g. of the network architecture (image: cnn architecture) and of pooling (image: cnn pooling).