CRF layer ValueError: Dimensions must be equal, but are 75 and 8 for


I am using a BiLSTM-CRF for an NER problem. The layers build and the model summary is generated successfully, but when I try to train the model it raises a dimension error. It was working fine when I was using the Keras and keras-contrib packages; however, those packages don't work in Python 3.8, so I had to move to TensorFlow for the BiLSTM and tensorflow-addons for the CRF. Unfortunately, these packages give me errors I don't understand. I have been trying for about three weeks and couldn't find a solution. Please help me.

The following are the layers for my code:

from keras.models import Model, Input
from keras.layers import LSTM, Embedding, Dense, TimeDistributed, Dropout, Bidirectional
from tensorflow_addons.layers import CRF


input = Input(shape=(max_len,))
model = Embedding(input_dim=n_words + 1, output_dim=20, input_length=max_len, mask_zero=True)(input)  # 20-dim embedding
model = Bidirectional(LSTM(units=75, return_sequences=True, recurrent_dropout=0.1))(model)
model = TimeDistributed(Dense(75, activation="relu"))(model)
crf = CRF(n_tags)  # CRF layer
out = crf(model)  # output
model = Model(input,out)
model.compile('rmsprop', loss='mean_absolute_error', metrics=['accuracy'])
model.summary()

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 75)]              0         
                                                                 
 embedding_1 (Embedding)     (None, 75, 20)            259240    
                                                                 
 bidirectional_1 (Bidirectio  (None, 75, 150)          57600     
 nal)                                                            
                                                                 
 time_distributed_1 (TimeDis  (None, 75, 75)           11325     
 tributed)                                                       
                                                                 
 crf_1 (CRF)                 [(None, 75),              688       
                              (None, 75, 8),                     
                              (None,),                           
                              (8, 8)]                            
                                                                 
=================================================================
Total params: 328,853
Trainable params: 328,853
Non-trainable params: 0
_________________________________________________________________

import numpy as np
history = model.fit(X_tr, np.array(y_tr), batch_size=22, epochs=20, validation_split=0.1, verbose=1)

**Error:**
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
C:\Users\BLACKP~1\AppData\Local\Temp/ipykernel_11164/2422502856.py in <module>
----> 1 history = model.fit(X_tr, np.array(y_tr), batch_size=22, epochs=20, validation_split=0.1, verbose=1)

c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\tensorflow\python\framework\func_graph.py in autograph_handler(*args, **kwargs)
   1127           except Exception as e:  # pylint:disable=broad-except
   1128             if hasattr(e, "ag_error_metadata"):
-> 1129               raise e.ag_error_metadata.to_exception(e)
   1130             else:
   1131               raise

ValueError: in user code:

    File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\training.py", line 878, in train_function  *
        return step_function(self, iterator)
    File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\training.py", line 867, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\training.py", line 860, in run_step  **
        outputs = model.train_step(data)
    File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\training.py", line 809, in train_step
        loss = self.compiled_loss(
    File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\engine\compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\losses.py", line 141, in __call__
        losses = call_fn(y_true, y_pred)
    File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\losses.py", line 245, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "c:\users\blackpearl\appdata\local\programs\python\python38\lib\site-packages\keras\losses.py", line 1332, in mean_absolute_error
        return backend.mean(tf.abs(y_pred - y_true), axis=-1)

    ValueError: Dimensions must be equal, but are 75 and 8 for '{{node mean_absolute_error/sub}} = Sub[T=DT_INT32](model_1/crf_1/ReverseSequence_1, mean_absolute_error/Cast)' with input shapes: [?,75], [?,75,8].

1 Answer

Answered by AudioBubble:

It seems like a problem with the input shape.

I modified your code; here is a working sample:

import tensorflow as tf
from tensorflow_addons.layers import CRF

inputs = tf.keras.Input(shape=(10, 128))
dense_layer = tf.keras.layers.Dense(64, activation="relu")
outputs = tf.keras.layers.TimeDistributed(dense_layer)(inputs)
layer = CRF(4)
# The CRF layer returns four tensors, not a single output: the decoded tag
# sequence, the per-tag potentials, the sequence lengths, and the transition
# (chain) kernel.
decoded_sequence, potentials, sequence_length, chain_kernel = layer(outputs)
print(decoded_sequence.shape)
print(potentials.shape)
print(sequence_length.shape)
print(chain_kernel.shape)

Output

(None, 10)
(None, 10, 4)
(None,)
(4, 4)
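
Note that in your model the error comes from the loss: mean_absolute_error compares the CRF's decoded sequence (shape (None, 75)) against one-hot labels of shape (None, 75, 8). Below is a minimal sketch, not your exact code, of one common way to train a BiLSTM-CRF with tensorflow-addons: compute the negative CRF log-likelihood from the layer's potentials, sequence lengths, and transition kernel inside a custom train_step. Names such as BiLstmCrf and the values for max_len, n_words, and n_tags are illustrative assumptions.

import tensorflow as tf
import tensorflow_addons as tfa

max_len, n_words, n_tags = 75, 12961, 8  # illustrative values only

class BiLstmCrf(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(n_words + 1, 20)
        self.bilstm = tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(75, return_sequences=True))
        self.dense = tf.keras.layers.TimeDistributed(
            tf.keras.layers.Dense(75, activation="relu"))
        self.crf = tfa.layers.CRF(n_tags)

    def call(self, inputs):
        x = self.embedding(inputs)
        x = self.bilstm(x)
        x = self.dense(x)
        # Returns (decoded_sequence, potentials, sequence_length, chain_kernel)
        return self.crf(x)

    def train_step(self, data):
        x, y = data  # y: integer tag ids of shape (batch, max_len), not one-hot
        y = tf.cast(y, tf.int32)
        with tf.GradientTape() as tape:
            decoded, potentials, seq_lens, kernel = self(x, training=True)
            log_likelihood, _ = tfa.text.crf_log_likelihood(
                potentials, y, seq_lens, kernel)
            loss = -tf.reduce_mean(log_likelihood)
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        return {"loss": loss}

model = BiLstmCrf()
model.compile(optimizer="rmsprop")
# history = model.fit(X_tr, y_tr_ids, batch_size=22, epochs=20)

With this setup the targets are integer tag indices of shape (n_samples, max_len) rather than one-hot vectors, and the shape mismatch disappears because the loss is computed from the CRF potentials instead of comparing the decoded sequence against the labels with mean_absolute_error.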