training loss during LSTM training is higher than validation loss

896 views Asked by At

I am training an LSTM to predict a time series. I have tried an encoder-decoder, without any dropout. I divided my data n 70% training and 30% validation. The total points in the training set and validation set are around 107 and 47 respectively. However, the validation loss is always greater than training loss. below is the code.

 seed(12346)
 tensorflow.random.set_seed(12346)
 Lrn_Rate=0.0005
 Momentum=0.8
 sgd=SGD(lr=Lrn_Rate, decay = 1e-6, momentum=Momentum, nesterov=True)
 adam=Adam(lr=Lrn_Rate, beta_1=0.9, beta_2=0.999, amsgrad=False)
 optimizernme=sgd
 optimizernmestr='sgd'
 callbacks= EarlyStopping(monitor='loss',patience=50,restore_best_weights=True)

 train_X1 = numpy.reshape(train_X1, (train_X1.shape[0], train_X1.shape[1], 1))
 test_X1 = numpy.reshape(test_X1, (test_X1.shape[0], test_X1.shape[1], 1))

 train_Y1 = train_Y1.reshape((train_Y1.shape[0],  train_Y1.shape[1], 1))
 test_Y1= test_Y1.reshape((test_Y1.shape[0],  test_Y1.shape[1], 1))

 model = Sequential()
 Hiddenunits=240
 DenseUnits=100
 n_features=1
 n_timesteps= look_back
 model.add(Bidirectional(LSTM(Hiddenunits, activation='relu', return_sequences=True,input_shape= 
 (n_timesteps, n_features))))#90,120 worked for us uk 
 model.add(Bidirectional(LSTM( Hiddenunits, activation='relu',return_sequences=False)))
 model.add(RepeatVector(1)) 
 model.add(Bidirectional(LSTM( Hiddenunits, activation='relu',return_sequences=True)))
 model.add(Bidirectional(LSTM(Hiddenunits, activation='relu', return_sequences=True)))

 model.add(TimeDistributed(Dense(DenseUnits, activation='relu'))) 
 model.add(TimeDistributed(Dense(1)))
 model.compile(loss='mean_squared_error', optimizer=optimizernme)
 
 
history=model.fit(train_X1,train_Y1,validation_data(test_X1,test_Y1),batch_size=batchsize,epochs=250,
              callbacks=[callbacks,TqdmCallback(verbose=0)],shuffle=True,verbose=0)

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss'+ modelcaption)
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

the training loss is coming greater than validation loss. training loss =0.02 and validation loss are approx 0.004 please the attached picture. I tried many things including dropouts and adding more hidden units but it did not solve the problem. Any comments suggestion is appreciated enter image description here

1

There are 1 answers

5
Derek O On BEST ANSWER

Better performance on a validation set compared to a training set is an indication that your model is overfitting to the training data. It sounds like you have tried creating the model architecture with and without a dropout layer (which combats overfitting) but the results are similar.

One possibility is data leakage, which is when information outside of the training data is used in the creation of the training data set. For example, if you normalized or standardized all of the data at once, instead of separately for the training and validation sets, then you are implicitly using the validation data in your training data, which would lead to the model overfitting.