Error: Invalid argument: ConcatOp : Dimensions of inputs should match

Question

333 views Asked by Terry At 02 December 2020 at 05:42

I'm trying to add an attention layer to a Seq2Seq model, but I get an InvalidArgumentError when fitting on the training set. The error comes from the concatenation step, where the decoder output is concatenated with the attention output.

The error mentioned:

Dimensions of inputs should match: shape[0] = [32,15,300] vs. shape[1] = [32,32,300]

My understanding is that the first 32 is the batch size, the second item is the sequence length, and 300 is the number of units. But why does shape[1] also have 32 as its second item?

Below is my code; any insights would be very helpful.

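import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers import Input, LSTM, Dense, Attention, Concatenate, TimeDistributed
from tensorflow.keras.optimizers import Adam

# num_tokens, the embedding layers, and the training arrays are defined elsewhere
WORD2VEC_DIMS = 50
DICTIONARY_SIZE = num_tokens
units = 300
ADAM = Adam(learning_rate=0.00005)
MAX_LEN = 15
drop_out_rate = 0.2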

encoder_inputs_att = Input(shape=( MAX_LEN , ))
encoder_embedding_att = embedding_layer_encoder(encoder_inputs_att)
encoder_embedding_att=layers.SpatialDropout1D(drop_out_rate)(encoder_embedding_att)
encoder_outputs_att , state_h_att , state_c_att = LSTM( units , return_state=True )( encoder_embedding_att )
encoder_states_att = [ state_h_att , state_c_att ]

decoder_inputs_att = Input(shape=( MAX_LEN ,  ))
decoder_embedding_att = embedding_layer_decoder(decoder_inputs_att)
decoder_lstm_att = LSTM( units , return_state=True , return_sequences=True )
decoder_outputs_att , state_dec_h_att , state_dec_c_att = decoder_lstm_att ( decoder_embedding_att , initial_state=encoder_states_att )

# add attention
attn_layer_att = Attention(name='attention_layer', causal = True)
attn_out_att = attn_layer_att([encoder_outputs_att, decoder_outputs_att])

#decoder_outputs_att = tf.keras.layers.GlobalAveragePooling1D()(decoder_outputs_att)
#attn_out_att = tf.keras.layers.GlobalAveragePooling1D()(attn_out_att)

decoder_concat_input_att = Concatenate(axis=-1, name='concat_layer')([decoder_outputs_att, attn_out_att])

decoder_dense_att = Dense( DICTIONARY_SIZE , activation="softmax" ) 

# add time distributed
dense_time_att = TimeDistributed(decoder_dense_att, name='time_distributed_layer')

output_att = dense_time_att ( decoder_concat_input_att )
#output = tf.cast(tf.keras.backend.argmax(output), tf.float64)
output_att = tf.cast(output_att,tf.float64)

model_att = tf.keras.models.Model([encoder_inputs_att, decoder_inputs_att], output_att )

model_att.compile(optimizer=ADAM, loss='sparse_categorical_crossentropy')

model_att.summary()

model_att.fit([x_train, y_train], y_train_decoded, batch_size = 32, epochs = 50, validation_split=0.1, shuffle=True)


**Jindřich** · Answer 1 · 2020-12-02T08:49:13+00:00

You passed the arguments to the attention layer in the wrong order: the query (the decoder outputs) comes first, followed by the value (the encoder outputs). It should be:

attn_out_att = attn_layer_att([decoder_outputs_att, encoder_outputs_att])

From the tf.keras.layers.Attention documentation, inputs is a list of the following tensors:

query: Query Tensor of shape [batch_size, Tq, dim].
value: Value Tensor of shape [batch_size, Tv, dim].
key: Optional key Tensor of shape [batch_size, Tv, dim]. If not given, will use value for both key and value, which is the most common case.
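
The shape rules quoted above can be checked with a small NumPy sketch of dot-product attention. This illustrates the general technique and is not the Keras implementation (no scaling, masking, or dropout); the function name dot_product_attention and the sizes (Tq = 15, Tv = 12) are made up for the example. The point is that the output always has the query's time length Tq, whatever Tv is.

```python
import numpy as np

def dot_product_attention(query, value, key=None):
    """Plain (unscaled, unmasked) dot-product attention.

    query: [batch, Tq, dim]; value: [batch, Tv, dim]; key defaults to value.
    Returns (output [batch, Tq, dim], weights [batch, Tq, Tv]).
    """
    if key is None:
        key = value
    # Similarity of every query step with every key step: [batch, Tq, Tv]
    scores = np.matmul(query, np.swapaxes(key, -1, -2))
    # Softmax over the Tv axis (subtract the max for numerical stability)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Weighted sum of the values for each query step: [batch, Tq, dim]
    return np.matmul(weights, value), weights

rng = np.random.default_rng(0)
batch, Tq, Tv, dim = 4, 15, 12, 8
decoder_states = rng.normal(size=(batch, Tq, dim))  # queries
encoder_states = rng.normal(size=(batch, Tv, dim))  # values (and keys)

output, weights = dot_product_attention(decoder_states, encoder_states)
print(output.shape)   # (4, 15, 8)
print(weights.shape)  # (4, 15, 12)
```

Each row of weights is a probability distribution over the Tv value positions, so every query step gets back a mixture of the values.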

In the case of a seq2seq model, you can think of attention as a probabilistic retrieval of information from the encoder by the decoder. At every decoding step, the decoder collects what is relevant from the encoder. Therefore, the decoder states are used as the queries and the encoder states are the retrieved values.
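
As for why the second dimension in the error was 32: one likely explanation, which I have not traced through TensorFlow itself, is matmul broadcasting. The encoder LSTM in the question sets return_state=True but not return_sequences=True, so its output has shape [batch, units] rather than [batch, MAX_LEN, units]. If that 2-D tensor is used as the query, a batched matmul broadcasts it across the batch axis and the batch size shows up twice. The NumPy sketch below reproduces the shapes under that assumption; attention_output is a hypothetical helper, not a Keras function, and the random arrays stand in for the real LSTM outputs.

```python
import numpy as np

def attention_output(query, value):
    """Unscaled dot-product attention with key = value; returns only the output."""
    scores = np.matmul(query, np.swapaxes(value, -1, -2))
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.matmul(weights, value)

batch, max_len, units = 32, 15, 300
rng = np.random.default_rng(0)

# Assumption: without return_sequences=True the encoder LSTM yields only
# its last step, shape [batch, units].
encoder_last = rng.normal(size=(batch, units))
decoder_seq = rng.normal(size=(batch, max_len, units))

# Order used in the question: encoder output as query, decoder output as value.
# The 2-D query is broadcast against the batch axis, so batch appears twice.
swapped = attention_output(encoder_last, decoder_seq)
print(swapped.shape)  # (32, 32, 300)

# Order from the answer, with a hypothetical full encoder sequence
# (what return_sequences=True on the encoder would give).
encoder_seq = rng.normal(size=(batch, max_len, units))
fixed = attention_output(decoder_seq, encoder_seq)
print(fixed.shape)  # (32, 15, 300)

# Concatenating on the last axis now works because the time axes agree.
concat = np.concatenate([decoder_seq, fixed], axis=-1)
print(concat.shape)  # (32, 15, 600)
```

If this reading is right, swapping the arguments fixes the shape mismatch, and adding return_sequences=True to the encoder LSTM is probably also needed so that the decoder attends over encoder time steps rather than over a single final state.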
