Keras Dense Model ValueError: logits and labels must have the same shape ((None, 200, 1) vs (None, 1, 1))


I'm new to machine learning and I'm trying to train a model. I'm using this official Keras example as a guide to set up my dataset and feed it into the model: https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence

From the training data I have created sliding windows over a single column, and the labels are binary classes (1 or 0).

This is the model creation code:

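import math
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout

n = 200  # window length (time steps per sample); dropout is defined elsewhere
hidden_units = n
dense_model = Sequential()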
dense_model.add(Dense(hidden_units, input_shape=([200,1])))
dense_model.add(Activation('relu'))
dense_model.add(Dropout(dropout))
print(hidden_units)

while hidden_units > 2:
    hidden_units = math.ceil(hidden_units/2)
    dense_model.add(Dense(hidden_units))
    dense_model.add(Activation('relu'))
    dense_model.add(Dropout(dropout))
    print(hidden_units)
dense_model.add(Dense(units = 1, activation='sigmoid'))

This is the function I'm using to compile and fit the model:

def compile_and_fit(model, window, epochs, patience=2):
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                      patience=patience,
                                                      mode='min')
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
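    # Keep the History object so it can be returned, and actually use the callback.
    history = model.fit(window.train, epochs=epochs, callbacks=[early_stopping])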
    return history

This is the model training:

break_batchs = find_gaps(df_train, 'date_diff', diff_int_value)
for keys, values in break_batchs.items():
    dense_window = WindowGenerator(data=df_train['price_var'],
                                   data_validation=df_validation['price_var'],
                                   data_test=df_test['price_var'],
                                   input_width=n,
                                   shift=m,
                                   start_index=values[0],
                                   end_index=values[1], 
                                   class_labels=y_buy,
                                   class_labels_train=y_buy_train,
                                   class_labels_test=y_buy_test,
                                   label_width=1,
                                   label_columns=None,
                                   classification=True,
                                   batch_size=batch_size,
                                   seed=None)
    history = compile_and_fit(dense_model, dense_window)

And these are the shapes of the batches:

(TensorSpec(shape=(None, 200, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None, 1, 1), dtype=tf.float64, name=None))
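
The label spec above has one more axis than a single sigmoid output would produce. As a rough sketch, assuming window.train is a tf.data.Dataset (the TensorSpec output suggests this, but it is not confirmed by the post), the extra label axis could be dropped with a map step. The dummy tensors and the helper name squeeze_labels are made up for illustration.

```python
import tensorflow as tf

# Dummy stand-ins with the same per-batch shapes as in the question.
features = tf.zeros((16, 200, 1), dtype=tf.float32)
labels = tf.zeros((16, 1, 1), dtype=tf.float64)
ds = tf.data.Dataset.from_tensor_slices((features, labels)).batch(4)

def squeeze_labels(x, y):
    # Hypothetical helper: (batch, 1, 1) -> (batch, 1), cast to float32.
    return x, tf.cast(tf.squeeze(y, axis=1), tf.float32)

ds_fixed = ds.map(squeeze_labels)
print(ds_fixed.element_spec)
```

With labels of shape (None, 1), a model whose final output is also (None, 1) can be compared against them directly.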

The problem (I guess) is that, according to the model summary, the model is operating on the last dimension when it should be operating on the second one:

dense_model.summary()

Model: "sequential_21"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
                                          |Model is being applied here
                                          |
                                          v
dense_232 (Dense)            (None, 200, 200)          400       
_________________________________________________________________
                                     |When it should be applied here
                                     |
                                     v
activation_225 (Activation)  (None, 200, 200)          0         
_________________________________________________________________
dropout_211 (Dropout)        (None, 200, 200)          0         
_________________________________________________________________
dense_233 (Dense)            (None, 200, 100)          20100     
_________________________________________________________________
activation_226 (Activation)  (None, 200, 100)          0         
_________________________________________________________________
dropout_212 (Dropout)        (None, 200, 100)          0         
_________________________________________________________________
dense_234 (Dense)            (None, 200, 50)           5050      
_________________________________________________________________
activation_227 (Activation)  (None, 200, 50)           0         
_________________________________________________________________
dropout_213 (Dropout)        (None, 200, 50)           0         
_________________________________________________________________
dense_235 (Dense)            (None, 200, 25)           1275      
_________________________________________________________________
activation_228 (Activation)  (None, 200, 25)           0         
_________________________________________________________________
dropout_214 (Dropout)        (None, 200, 25)           0         
_________________________________________________________________
dense_236 (Dense)            (None, 200, 13)           338       
_________________________________________________________________
activation_229 (Activation)  (None, 200, 13)           0         
_________________________________________________________________
dropout_215 (Dropout)        (None, 200, 13)           0         
_________________________________________________________________
dense_237 (Dense)            (None, 200, 7)            98        
_________________________________________________________________
activation_230 (Activation)  (None, 200, 7)            0         
_________________________________________________________________
dropout_216 (Dropout)        (None, 200, 7)            0         
_________________________________________________________________
dense_238 (Dense)            (None, 200, 4)            32        
_________________________________________________________________
activation_231 (Activation)  (None, 200, 4)            0         
_________________________________________________________________
dropout_217 (Dropout)        (None, 200, 4)            0         
_________________________________________________________________
dense_239 (Dense)            (None, 200, 2)            10        
_________________________________________________________________
activation_232 (Activation)  (None, 200, 2)            0         
_________________________________________________________________
dropout_218 (Dropout)        (None, 200, 2)            0         
_________________________________________________________________
dense_240 (Dense)            (None, 200, 1)            3         
=================================================================
Total params: 27,306
Trainable params: 27,306
Non-trainable params: 0
_________________________________________________________________


And because of that I'm getting: ValueError: logits and labels must have the same shape ((None, 200, 1) vs (None, 1, 1))

How can I tell Keras to apply the training in the second dimension and not the last one?
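
The shapes and the first parameter count in the summary can be reproduced with plain NumPy, on the understanding that a Dense layer multiplies the last axis of its input by a kernel of shape (input_features, units) and adds a bias. This is only an illustration of the arithmetic, not Keras's actual implementation; the batch size of 4 is arbitrary.

```python
import numpy as np

batch, window, features, units = 4, 200, 1, 200

x = np.zeros((batch, window, features))  # one batch of windows: (4, 200, 1)
kernel = np.zeros((features, units))     # Dense kernel: (1, 200)
bias = np.zeros(units)                   # Dense bias: (200,)

# The matrix product acts on the last axis only; the window axis is carried along.
out = x @ kernel + bias
print(out.shape)                         # (4, 200, 200)

n_params = kernel.size + bias.size
print(n_params)                          # 400, matching the first Dense layer above
```

Because the window axis of length 200 is never reduced, it survives all the way to the final layer, which is why the output is (None, 200, 1).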

EDIT

[Image: diagram explanation]

This is what I understand is happening. Is this right? How can I fix it?

Edit 2

I tried to modify as suggested, using:

dense_model.add(Dense(hidden_units, input_shape=(None,200,1)))

but I'm getting the following warning:

WARNING:tensorflow:Model was constructed with shape (None, None, 200, 1) for input KerasTensor(type_spec=TensorSpec(shape=(None, None, 200, 1), dtype=tf.float32, name='dense_315_input'), name='dense_315_input', description="created by layer 'dense_315_input'"), but it was called on an input with incompatible shape (None, 200, 1, 1).
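
The extra None in the warning is consistent with Keras adding the batch axis on its own: the shape passed for the input describes one sample only. A small check, using tf.keras.Input instead of the input_shape argument (a substitution made here for illustration):

```python
import tensorflow as tf

# shape describes ONE sample; Keras prepends the batch axis itself.
per_sample = tf.keras.Input(shape=(200, 1))
print(len(per_sample.shape))        # 3 axes: batch, 200, 1

# Writing None explicitly adds a second unknown axis on top of the batch axis.
with_extra_none = tf.keras.Input(shape=(None, 200, 1))
print(len(with_extra_none.shape))   # 4 axes: batch, None, 200, 1
```

So (200, 1) already corresponds to batches of shape (None, 200, 1), which matches the feature spec of the dataset.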

There is 1 answer

Answer by roman_ka

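The 200 you are pointing at is the window (time step) axis of each sample, not the batch size. In Keras, input_shape does not include the batch dimension, so the line below declares inputs of shape [batch_size, 200, 1]:

dense_model.add(Dense(hidden_units, input_shape=([200,1])))

A Dense layer is applied to the last axis only and leaves the other axes untouched. So your model is outputting 200 values per sample (shape (None, 200, 1)), but the target label you are comparing against has only one value per sample (shape (None, 1, 1)).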
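
One possible way to get a single prediction per window is to put a Flatten layer in front of the first Dense layer, so that each 200-step window becomes one flat feature vector. This is a minimal sketch and not the asker's actual model: the layer sizes, the random data, and the label shape (batch, 1) are assumptions chosen for illustration.

```python
import numpy as np
import tensorflow as tf

n = 200  # window length, as in the question

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n, 1)),                    # per-sample shape, no batch axis
    tf.keras.layers.Flatten(),                       # (None, 200, 1) -> (None, 200)
    tf.keras.layers.Dense(100, activation="relu"),   # (None, 100)
    tf.keras.layers.Dense(1, activation="sigmoid"),  # (None, 1)
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Random stand-in data: 8 windows, one binary label per window.
x = np.random.rand(8, n, 1).astype("float32")
y = np.random.randint(0, 2, size=(8, 1)).astype("float32")

history = model.fit(x, y, epochs=1, verbose=0)
print(model.output_shape)
```

Labels that arrive as (batch, 1, 1) would still need to be reshaped to (batch, 1) to match this output.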