What exactly does the loss value represent while training an LSTM model in Keras?


I have an LSTM model that predicts tomorrow's water outflow volume based on today's outflow volume, temperature, and precipitation.

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model = Sequential()
model.add(LSTM(units=24, return_sequences=True,
               input_shape=(X_Train.shape[1], X_Train.shape[2])))
model.add(Dropout(0.2))
model.add(LSTM(units=50))
model.add(Dropout(0.2))
model.add(Dense(20, activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(optimizer='adam', loss='mean_squared_error')
history = model.fit(X_Train, Y_Train, epochs=8,
                    validation_data=(X_Test, Y_Test))

While training I got:

Epoch 1/8
4638/4638 [==============================] - 78s 17ms/step - loss: 1.9951e-04 - val_loss: 1.5074e-04
Epoch 2/8
4638/4638 [==============================] - 77s 17ms/step - loss: 9.6735e-05 - val_loss: 1.0922e-04
Epoch 3/8
4638/4638 [==============================] - 78s 17ms/step - loss: 6.5202e-05 - val_loss: 5.9079e-05
Epoch 4/8
4638/4638 [==============================] - 77s 17ms/step - loss: 5.1011e-05 - val_loss: 4.9478e-05
Epoch 5/8
4638/4638 [==============================] - 77s 17ms/step - loss: 4.3992e-05 - val_loss: 5.1148e-05
Epoch 6/8
4638/4638 [==============================] - 77s 17ms/step - loss: 3.9901e-05 - val_loss: 4.2351e-05
Epoch 7/8
4638/4638 [==============================] - 74s 16ms/step - loss: 3.6884e-05 - val_loss: 4.0763e-05
Epoch 8/8
4638/4638 [==============================] - 74s 16ms/step - loss: 3.5287e-05 - val_loss: 3.6736e-05

But when I manually calculate the mean squared error, I get a very different result.

Manual calculation:

mean_square_root = mean_squared_error(predicted_y_values_unnor, Y_test_actual)
130.755469707972

I want to know why the validation loss reported during training is so different from the loss I calculate manually. How is the loss calculated while training?


There are 2 answers

Abhi25t

The loss you have chosen is mean_squared_error in the line

model.compile(optimizer = 'adam', loss = 'mean_squared_error')

That is the loss your LSTM model is minimizing.

Mean squared error, or MSE, is the default loss to use for regression problems. It is calculated as the average of the squared differences between the predicted and actual values. The result is always non-negative regardless of the sign of the predicted and actual values, and a perfect value is 0.0. The squaring means that larger mistakes contribute disproportionately more error, so the model is punished more heavily for making large mistakes.
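
For concreteness, here is a minimal NumPy sketch of that computation; the arrays are made-up illustrative values, not the asker's data:

import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])   # actual values
y_pred = np.array([2.5, 0.0, 2.0, 8.0])    # predicted values

mse = np.mean((y_true - y_pred) ** 2)      # average of squared differences
print(mse)                                 # 0.375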

LSTM is a general model and you can choose from many different loss functions. Here is the list of Keras built-in losses: https://keras.io/api/losses/ You need to select a loss function appropriate for your problem.
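
For example, switching to mean absolute error, another of the built-in losses on that page, is a one-line change (a sketch reusing the model defined above):

model.compile(optimizer='adam', loss='mean_absolute_error')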

sergio urquijo

I was having the exact same problem and I solved it with the flatten() function. When you call mean_squared_error(predicted_y_values_unnor, Y_test_actual), Y_test_actual is a one-dimensional array, for example array([297, 290, 308, 308, 214]), but the other argument is a two-dimensional array in which one of the dimensions is 1, for example array([[300.53693], [299.9416], [295.61334], [218.2563], [219.74983]], dtype=float32).

You just have to call flatten() on predicted_y_values_unnor.
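
To illustrate what that mismatch does, here is a sketch with made-up values, assuming the error is computed with plain NumPy: subtracting a (5,) array from a (5, 1) array broadcasts to a (5, 5) matrix, so the "MSE" is averaged over every pairwise difference instead of the five matched pairs.

import numpy as np

y_true = np.array([297, 290, 308, 308, 214])                      # shape (5,)
y_pred = np.array([[300.5], [299.9], [295.6], [218.3], [219.7]])  # shape (5, 1)

bad = np.mean((y_pred - y_true) ** 2)              # broadcasts to (5, 5): inflated
good = np.mean((y_pred.flatten() - y_true) ** 2)   # shapes agree: correct MSE

print(bad, good)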