I'm having problems integrating a BERT embedding layer into a BiLSTM model for a text classification task.
My dataset has two columns per row: text and polarity
text = a string (tweet)
polarity = 0 or 1
So the shape of the training data is (1500, 2).
I am generating the BERT embeddings following this code: https://github.com/strongio/keras-bert/blob/master/keras-bert.ipynb
I want to add a Bi-LSTM between the BERT layer and the Dense layer. I have done it like this:
# Build model
def build_model(max_seq_length):
    embedding_size = 768
    in_id = tf.keras.layers.Input(shape=(max_seq_length,), name="input_ids")
    in_mask = tf.keras.layers.Input(shape=(max_seq_length,), name="input_masks")
    in_segment = tf.keras.layers.Input(shape=(max_seq_length,), name="segment_ids")
    bert_inputs = [in_id, in_mask, in_segment]

    bert_output = BertLayer(n_fine_tune_layers=3, pooling="mean")(bert_inputs)
    bert_output = tf.keras.layers.Reshape((max_seq_length, embedding_size))(bert_output)
    bilstm = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128, dropout=0.2, recurrent_dropout=0.2, return_sequences=True))(bert_output)
    output = tf.keras.layers.Dense(1, activation="softmax")(bilstm)

    model = tf.keras.models.Model(inputs=bert_inputs, outputs=output)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    return model
def initialize_vars(sess):
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())
    sess.run(tf.tables_initializer())
    K.set_session(sess)
model = build_model(max_seq_length)
# Instantiate variables
initialize_vars(sess)
model.fit(
    [train_input_ids, train_input_masks, train_segment_ids],
    train_labels,
    validation_data=([test_input_ids, test_input_masks, test_segment_ids], test_labels),
    epochs=1,
    batch_size=32
)
It gives an error: ValueError: A target array with shape (1500, 1) was passed for an output of shape (None, 256, 1) while using as loss `binary_crossentropy`. This loss expects targets to have the same shape as the output.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling Orthogonal.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_ids (InputLayer) [(None, 256)] 0
__________________________________________________________________________________________________
input_masks (InputLayer) [(None, 256)] 0
__________________________________________________________________________________________________
segment_ids (InputLayer) [(None, 256)] 0
__________________________________________________________________________________________________
bert_layer (BertLayer) (None, 768) 110104890 input_ids[0][0]
input_masks[0][0]
segment_ids[0][0]
__________________________________________________________________________________________________
reshape (Reshape) (None, 256, 768) 0 bert_layer[0][0]
__________________________________________________________________________________________________
bidirectional (Bidirectional) (None, 256, 256) 918528 reshape[0][0]
__________________________________________________________________________________________________
dense (Dense) (None, 256, 1) 257 bidirectional[0][0]
==================================================================================================
Total params: 111,023,675
Trainable params: 22,182,401
Non-trainable params: 88,841,274
__________________________________________________________________________________________________
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-28-827856e3678d> in <module>()
9 validation_data=([test_input_ids, test_input_masks, test_segment_ids], test_labels),
10 epochs=1,
---> 11 batch_size=32
12 )
3 frames
/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/keras/engine/training_utils.py in check_loss_and_target_compatibility(targets, loss_fns, output_shapes)
739 raise ValueError('A target array with shape ' + str(y.shape) +
740 ' was passed for an output of shape ' + str(shape) +
--> 741 ' while using as loss `' + loss_name + '`. '
742 'This loss expects targets to have the same shape '
743 'as the output.')
ValueError: A target array with shape (1500, 1) was passed for an output of shape (None, 256, 1) while using as loss `binary_crossentropy`. This loss expects targets to have the same shape as the output.
What can I do to resolve this? Does it have something to do with the activation or loss being used? How can the shapes be matched?
Any help will be appreciated.
The loss function you specify (binary cross-entropy, mean squared error, logarithmic losses, and so on; the optimizer such as Adam is separate) is computed against the output shape, so you could try changing the loss function, but the real fix is to make the model's output shape match the target shape. That means adding one layer after the Bi-LSTM that collapses the per-token outputs into a single prediction per sample, while keeping the existing BERT layer you load.
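Below is a minimal sketch of that change, assuming the BertLayer class from the linked keras-bert notebook and max_seq_length = 256. The GlobalMaxPooling1D layer and the switch from softmax to sigmoid are my illustrative choices, not necessarily the exact layer used originally; any layer that collapses the (None, 256, 256) Bi-LSTM output to one vector per sample (for example an LSTM without return_sequences=True) works the same way.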
[ Sample ]:
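import tensorflow as tf

# Sketch: same inputs and BERT layer as in the question, with an extra
# pooling layer after the Bi-LSTM and a sigmoid output unit.
def build_model(max_seq_length):
    embedding_size = 768
    in_id = tf.keras.layers.Input(shape=(max_seq_length,), name="input_ids")
    in_mask = tf.keras.layers.Input(shape=(max_seq_length,), name="input_masks")
    in_segment = tf.keras.layers.Input(shape=(max_seq_length,), name="segment_ids")
    bert_inputs = [in_id, in_mask, in_segment]

    # Unchanged from the question: BertLayer comes from the linked notebook.
    bert_output = BertLayer(n_fine_tune_layers=3, pooling="mean")(bert_inputs)
    bert_output = tf.keras.layers.Reshape((max_seq_length, embedding_size))(bert_output)

    # The Bi-LSTM still returns one vector per token: shape (None, 256, 256).
    bilstm = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(128, dropout=0.2, recurrent_dropout=0.2,
                             return_sequences=True))(bert_output)

    # Added layer: pool over the time dimension so each sample yields a
    # single 256-dim vector, shape (None, 256). Alternatively, drop
    # return_sequences=True on the LSTM instead of pooling.
    pooled = tf.keras.layers.GlobalMaxPooling1D()(bilstm)

    # One unit with sigmoid (softmax over a single unit always outputs 1.0),
    # giving output shape (None, 1) to match the (1500, 1) targets.
    output = tf.keras.layers.Dense(1, activation="sigmoid")(pooled)

    model = tf.keras.models.Model(inputs=bert_inputs, outputs=output)
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    model.summary()
    return model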
[ Output ]:
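With this head, model.summary() should report the final Dense layer with output shape (None, 1), which matches the (1500, 1) target array, so binary_crossentropy no longer raises the shape mismatch error.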