I have a 2D sequence that I want to use to predict (map to) another 2D sequence, e.g. x = [6 7; 8 9; 9 0] and y = [4 5; 3 4; 5 6]. I was using a sliding-window approach, where a window of 5 timesteps of x predicts the single y value at the 5th timestep. After reshaping x to ([], 5, 2) and y to ([], 1, 2), this is how I was building the model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, LSTM, Bidirectional

model = Sequential()
# Input layer - specify the shape of the inputs: (timesteps, features)
model.add(Input(shape=(xtrain.shape[1], xtrain.shape[2]), name='Input-Layer'))
model.add(Bidirectional(LSTM(128, activation='relu',
                             return_sequences=True)))
model.add(Dense(units=10, activation='relu'))
model.add(Dense(units=2, activation='linear'))
model.compile(loss="mse", optimizer="Adam")
model.summary()
I don't understand why the model only trains well when return_sequences=True and gets stuck at a high loss when return_sequences=False. Also, when I fit with return_sequences=True, shouldn't it raise an error, since the model outputs 5 y values per sample instead of the 1 expected? And what do the other four values correspond to?
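For context on why I expected an error: with return_sequences=True the final Dense layer is applied per timestep, so the model's output shape should be (batch, 5, 2) while my y is (batch, 1, 2). My current guess (which I'd like confirmed) is that the MSE loss silently broadcasts the singleton timestep axis of y against all 5 outputs, NumPy-style. A minimal NumPy sketch of that assumption, with made-up values:

```python
import numpy as np

# Hypothetical shapes matching my setup: predictions from
# return_sequences=True vs. my single-step targets.
pred = np.random.rand(4, 5, 2)  # (batch, timesteps, features)
y = np.random.rand(4, 1, 2)     # (batch, 1, features)

# No error is raised: broadcasting stretches y's timestep axis from
# 1 to 5, so each of the 5 per-timestep outputs is compared against
# the same single target value.
loss = np.mean((pred - y) ** 2)
print(np.broadcast(pred, y).shape)  # (4, 5, 2)
```

If that is what Keras is doing internally, it would explain both the missing error and why only one of the five output steps is meaningfully supervised.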
I would really appreciate an explanation on this matter.