I'm new to machine learning and I'm trying to train a model. I'm using this official Keras example as a guide to set up my dataset and feed it into the model: https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence
From the training data I have sliding windows created from a single column, and for the labels I have a binary classification (1 or 0).
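For context, this is roughly the data layout (toy arrays, just for illustration; my real data comes from the WindowGenerator shown below):

import numpy as np

# Toy data, for illustration only: one column, one 0/1 label per row.
prices = np.random.rand(1000).astype('float32')
labels = np.random.randint(0, 2, size=1000)

n = 200  # window length
X = np.stack([prices[i:i + n] for i in range(len(prices) - n)])[..., None]
y = labels[n:]
print(X.shape, y.shape)  # (800, 200, 1) (800,)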
This is the model creation code:
import math
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout

n = 200
hidden_units = n
dense_model = Sequential()
dense_model.add(Dense(hidden_units, input_shape=(200, 1)))
dense_model.add(Activation('relu'))
dense_model.add(Dropout(dropout))  # dropout rate is defined elsewhere
print(hidden_units)
while hidden_units > 2:
    hidden_units = math.ceil(hidden_units / 2)
    dense_model.add(Dense(hidden_units))
    dense_model.add(Activation('relu'))
    dense_model.add(Dropout(dropout))
    print(hidden_units)
dense_model.add(Dense(units=1, activation='sigmoid'))
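For reference, the halving loop produces this width sequence (it matches the layer sizes in the summary further down):

import math

# Reproduces the widths printed by the loop above.
w = 200
widths = [w]
while w > 2:
    w = math.ceil(w / 2)
    widths.append(w)
print(widths)  # [200, 100, 50, 25, 13, 7, 4, 2]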
This is the function I'm using to compile and fit the model:
def compile_and_fit(model, window, epochs, patience=2):
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                      patience=patience,
                                                      mode='min')
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    # capture the History object and actually pass the callback to fit()
    history = model.fit(window.train, epochs=epochs,
                        callbacks=[early_stopping])
    return history
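One thing to note: since EarlyStopping monitors val_loss, fit() would also need validation data, something like the following, assuming the window object exposes a validation split (the window.val name here is hypothetical):

# Hypothetical: assumes the generator provides a validation set as window.val
history = model.fit(window.train,
                    validation_data=window.val,
                    epochs=epochs,
                    callbacks=[early_stopping])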
This is the model training:
break_batchs = find_gaps(df_train, 'date_diff', diff_int_value)

for keys, values in break_batchs.items():
    dense_window = WindowGenerator(data=df_train['price_var'],
                                   data_validation=df_validation['price_var'],
                                   data_test=df_test['price_var'],
                                   input_width=n,
                                   shift=m,
                                   start_index=values[0],
                                   end_index=values[1],
                                   class_labels=y_buy,
                                   class_labels_train=y_buy_train,
                                   class_labels_test=y_buy_test,
                                   label_width=1,
                                   label_columns=None,
                                   classification=True,
                                   batch_size=batch_size,
                                   seed=None)
    # epochs is defined alongside n, m and batch_size
    history = compile_and_fit(dense_model, dense_window, epochs=epochs)
and those are the shapes of the batches:
(TensorSpec(shape=(None, 200, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None, 1, 1), dtype=tf.float64, name=None))
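To double-check what Dense does with a 3-D input, here is a standalone snippet (shapes taken from the spec above; the batch size of 32 is arbitrary):

import tensorflow as tf

x = tf.zeros((32, 200, 1))   # (batch, window, features)
layer = tf.keras.layers.Dense(200)
print(layer(x).shape)        # (32, 200, 200): Dense maps only the last axis,
                             # the window axis is left untouched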
The problem, I guess, is that (judging from the model summary) the model is being applied along the last dimension when it should be working on the second one:
dense_model.summary()
Model: "sequential_21"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
|Model is being applied here
|
v
dense_232 (Dense) (None, 200, 200) 400
_________________________________________________________________
|When it should be applied here
|
v
activation_225 (Activation) (None, 200, 200) 0
_________________________________________________________________
dropout_211 (Dropout) (None, 200, 200) 0
_________________________________________________________________
dense_233 (Dense) (None, 200, 100) 20100
_________________________________________________________________
activation_226 (Activation) (None, 200, 100) 0
_________________________________________________________________
dropout_212 (Dropout) (None, 200, 100) 0
_________________________________________________________________
dense_234 (Dense) (None, 200, 50) 5050
_________________________________________________________________
activation_227 (Activation) (None, 200, 50) 0
_________________________________________________________________
dropout_213 (Dropout) (None, 200, 50) 0
_________________________________________________________________
dense_235 (Dense) (None, 200, 25) 1275
_________________________________________________________________
activation_228 (Activation) (None, 200, 25) 0
_________________________________________________________________
dropout_214 (Dropout) (None, 200, 25) 0
_________________________________________________________________
dense_236 (Dense) (None, 200, 13) 338
_________________________________________________________________
activation_229 (Activation) (None, 200, 13) 0
_________________________________________________________________
dropout_215 (Dropout) (None, 200, 13) 0
_________________________________________________________________
dense_237 (Dense) (None, 200, 7) 98
_________________________________________________________________
activation_230 (Activation) (None, 200, 7) 0
_________________________________________________________________
dropout_216 (Dropout) (None, 200, 7) 0
_________________________________________________________________
dense_238 (Dense) (None, 200, 4) 32
_________________________________________________________________
activation_231 (Activation) (None, 200, 4) 0
_________________________________________________________________
dropout_217 (Dropout) (None, 200, 4) 0
_________________________________________________________________
dense_239 (Dense) (None, 200, 2) 10
_________________________________________________________________
activation_232 (Activation) (None, 200, 2) 0
_________________________________________________________________
dropout_218 (Dropout) (None, 200, 2) 0
_________________________________________________________________
dense_240 (Dense) (None, 200, 1) 3
=================================================================
Total params: 27,306
Trainable params: 27,306
Non-trainable params: 0
_________________________________________________________________
And because of that I'm getting:
ValueError: logits and labels must have the same shape ((None, 200, 1) vs (None, 1, 1))
How can I tell Keras to apply the model along the second dimension and not the last one?
EDIT
This is what I understand is happening. Is this right? And how can I fix it?
Edit 2
I tried modifying the input shape as suggested, using:
dense_model.add(Dense(hidden_units, input_shape=(None,200,1)))
but I'm getting the following warning:
WARNING:tensorflow:Model was constructed with shape (None, None, 200, 1) for input KerasTensor(type_spec=TensorSpec(shape=(None, None, 200, 1), dtype=tf.float32, name='dense_315_input'), name='dense_315_input', description="created by layer 'dense_315_input'"), but it was called on an input with incompatible shape (None, 200, 1, 1).
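If I understand the warning correctly, input_shape never includes the batch axis, so (None, 200, 1) declares a 4-D input and Keras prepends yet another batch dimension. A quick check:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(200, input_shape=(None, 200, 1)),
])
print(model.input_shape)  # (None, None, 200, 1): a batch axis is prepended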
The first dimension that you are pointing at is the batch size, as you specified in your input layer (the input shape is
[batch_size, input_dim]
as can be seen here). So your model is outputting 200 values because your batch size is 200, but the target label you are comparing against only has one value.
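Following that reasoning, one way to get a single output per window is to flatten the (200, 1) window first, so the Dense stack sees the whole window instead of being applied per time step. A minimal sketch (the dropout rate here is a placeholder, not the original value):

import math
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten

dropout = 0.2  # placeholder rate

model = Sequential()
model.add(Flatten(input_shape=(200, 1)))   # (None, 200, 1) -> (None, 200)
hidden_units = 200
model.add(Dense(hidden_units, activation='relu'))
model.add(Dropout(dropout))
while hidden_units > 2:
    hidden_units = math.ceil(hidden_units / 2)
    model.add(Dense(hidden_units, activation='relu'))
    model.add(Dropout(dropout))
model.add(Dense(1, activation='sigmoid'))  # output (None, 1)

The labels would then also need to be squeezed from (None, 1, 1) to (None, 1) so that logits and labels line up.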