I am trying to train an LSTM with Keras (TensorFlow backend) on toy data and am getting this error:

    ValueError: Error when checking target: expected dense_39 to have 2 dimensions, but got array with shape (996, 1, 1)
The error occurs immediately upon calling `model.fit`; nothing seems to run. It seems to me that Keras is checking dimensions, but ignoring the fact that it should be taking batches of my target with each batch of my input. The error shows the full dimension of my target array, which implies to me that it's never split into batches by Keras, at least while checking dimensions. For the life of me I can't figure out why this would be or anything else that might help.
My network definition with expected layer output shapes in comments:
    from keras.layers import Input, LSTM, Dense, TimeDistributed
    from keras.models import Model
    from keras.optimizers import Nadam

    batch_shape = (8, 5, 1)
    x_in = Input(batch_shape=batch_shape, name='input')          # (8, 5, 1)
    seq1 = LSTM(8, return_sequences=True, stateful=True)(x_in)    # (8, 5, 8)
    dense1 = TimeDistributed(Dense(8))(seq1)                      # (8, 5, 8)
    seq2 = LSTM(8, return_sequences=False, stateful=True)(dense1) # (8, 8)
    dense2 = Dense(8)(seq2)                                       # (8, 8)
    out = Dense(1)(dense2)                                        # (8, 1)

    model = Model(inputs=x_in, outputs=out)
    optimizer = Nadam()
    model.compile(optimizer=optimizer, loss='mean_squared_error')
    model.summary()
The model summary, shapes as expected:
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    input (InputLayer)           (8, 5, 1)                 0
    _________________________________________________________________
    lstm_28 (LSTM)               (8, 5, 8)                 320
    _________________________________________________________________
    time_distributed_18 (TimeDis (8, 5, 8)                 72
    _________________________________________________________________
    lstm_29 (LSTM)               (8, 8)                    544
    _________________________________________________________________
    dense_38 (Dense)             (8, 8)                    72
    _________________________________________________________________
    dense_39 (Dense)             (8, 1)                    9
    =================================================================
    Total params: 1,017
    Trainable params: 1,017
    Non-trainable params: 0
    _________________________________________________________________
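As a sanity check on the summary, the parameter counts can be reproduced by hand. The sketch below assumes the standard LSTM and Dense parameter formulas with biases enabled (the Keras defaults); the helper names `lstm_params` and `dense_params` are made up for illustration and are not Keras functions.

```python
def lstm_params(input_dim, units):
    # Four gates, each with an input kernel, a recurrent kernel and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # One weight per input/unit pair plus one bias per unit.
    return input_dim * units + units


counts = [
    lstm_params(1, 8),   # lstm_28
    dense_params(8, 8),  # time_distributed_18 (wraps Dense(8))
    lstm_params(8, 8),   # lstm_29
    dense_params(8, 8),  # dense_38
    dense_params(8, 1),  # dense_39
]
print(counts, sum(counts))  # [320, 72, 544, 72, 9] 1017
```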
My toy data: the target is just a line decreasing from 100 to 0, and the input is just an array of zeros. I want to do one-step-ahead prediction, so I create rolling windows of my input and target using the `rolling_window()` function defined below:
    import numpy as np

    target = np.linspace(100, 0, num=1000)
    target_rolling = rolling_window(target[4:], 1)[:, :, None]
    target_rolling.shape  # (996, 1, 1)  <-- this seems to be the array that's causing the error

    x_train = np.zeros((1000,))
    x_train_rolling = rolling_window(x_train, 5)[:, :, None]
    x_train_rolling.shape  # (996, 5, 1)
    def rolling_window(arr, window):
        # Build overlapping windows along the last axis as a strided view (no copy).
        shape = arr.shape[:-1] + (arr.shape[-1] - window + 1, window)
        strides = arr.strides + (arr.strides[-1],)
        return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
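To make the windowing concrete, here is a small demo of `rolling_window()` on a six-element array. The helper is repeated so the snippet runs on its own; the `demo` array is just an illustrative input, not part of my data.

```python
import numpy as np


def rolling_window(arr, window):
    # Same helper as above, repeated so this snippet is self-contained.
    shape = arr.shape[:-1] + (arr.shape[-1] - window + 1, window)
    strides = arr.strides + (arr.strides[-1],)
    return np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)


demo = np.arange(6)                # [0 1 2 3 4 5]
windows = rolling_window(demo, 3)  # each row is one window of length 3
print(windows.shape)               # (4, 3)
print(windows.tolist())            # [[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5]]
```

An array of length `n` yields `n - window + 1` windows, which is why 1000 inputs with a window of 5 give 996 rows.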
And my training loop:
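    from keras.callbacks import LambdaCallback

    # on_epoch_end receives (epoch, logs); the two arguments need distinct names.
    reset_state = LambdaCallback(on_epoch_end=lambda epoch, logs: model.reset_states())
    callbacks = [reset_state]
    history = model.fit(x_train_rolling, target_rolling,
                        batch_size=8,
                        epochs=100,
                        validation_split=0.,
                        callbacks=callbacks)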
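For reference, here is a NumPy-only sketch of the shapes involved. The target check compares the number of dimensions of the whole target array, batch axis included, with the output layer's shape, so a `(batch, 1)` output from `Dense(1)` corresponds to a 2-D target. Whether dropping the trailing axis is the right fix for this model is an assumption here, not something I have confirmed. The trimming step is also an assumption: as far as I know, stateful Keras models expect the sample count to be a multiple of the batch size, and 996 is not a multiple of 8.

```python
import numpy as np

batch_size = 8
target = np.linspace(100, 0, num=1000)

# Shape produced by the windowing above: one extra trailing axis.
target_3d = target[4:].reshape(-1, 1, 1)
print(target_3d.shape)       # (996, 1, 1)

# Drop the trailing axis so the target has the same rank as the (batch, 1) output.
target_2d = target_3d.reshape(-1, 1)
print(target_2d.shape)       # (996, 1)

# Assumption: keep only full batches, since stateful models want
# a sample count divisible by batch_size.
n_usable = (len(target_2d) // batch_size) * batch_size
target_trimmed = target_2d[:n_usable]
print(target_trimmed.shape)  # (992, 1)
```

If the target is trimmed this way, the input windows would need the same slice (`[:n_usable]`) so that both arrays keep the same number of samples.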
I have tried:
- Non-stateful LSTM, but I really need stateful for the eventual application. Same error.
- `return_sequences=True` in the second LSTM with a `Flatten` layer after. Same error.
- `return_sequences=True` in the second LSTM without a `Flatten` layer. This gives a different error because it is expecting a target with the same shape as the output, which at that point is `(batch_size, 5, 1)` and not `(batch_size, 1, 1)`.
- Running the same architecture on the whole sequence at once (batch size of 1), without rolling windows. This works, but just learns to approximate the mean of my target and is useless for my purposes.
Note that none of these questions seem to directly answer mine, although I was really hopeful on a couple:
- Error when checking target: expected time_distributed_5 to have 3 dimensions, but got array with shape (14724, 1)
- LSTM and CNN: ValueError: Error when checking target: expected time_distributed_1 to have 3 dimensions, but got array with shape (400, 256)
- ValueError: Error when checking target: expected lstm_27 to have 2 dimensions, but got array with shape (1, 11, 1)
- expected dense_218_input to have 2 dimensions, but got array with shape (512, 28, 28, 1)
- expected dense_1 to have 2 dimensions, but got array with shape (308, 1, 6)