I am training a neural network model compiled with:

```
model.compile(
    loss="mean_squared_error",
    optimizer=optimizer,
    metrics=[tf.keras.metrics.MeanSquaredError()],
)
```

What troubles me is the difference between the value of the loss and the value of the metric in the training history, since at first glance both should compute the same thing.

```
Epoch 4/30
15484/15484 [==============================] - 135s 9ms/step - loss: 0.1212 - mean_squared_error: 0.1188 - val_loss: 0.1146 - val_mean_squared_error: 0.1120
Epoch 5/30
15484/15484 [==============================] - 135s 9ms/step - loss: 0.1198 - mean_squared_error: 0.1170 - val_loss: 0.1138 - val_mean_squared_error: 0.1109
Epoch 6/30
15484/15484 [==============================] - 131s 8ms/step - loss: 0.1187 - mean_squared_error: 0.1157 - val_loss: 0.1132 - val_mean_squared_error: 0.1101
```

Are they computing MSE over different things (per batch vs. the whole epoch), or are they implemented differently? I have searched through the documentation and done a bit of googling, but didn't find the answer.
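To illustrate what I mean by batch-level vs. whole-epoch aggregation, here is a NumPy sketch (this is just my hypothesis, not Keras internals): if the last batch is smaller and the per-batch MSE values are averaged without weighting by batch size, the result differs from the MSE computed over the whole epoch at once.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(size=20)
y_pred = rng.normal(size=20)

# Whole-epoch MSE: a single mean over all 20 samples.
epoch_mse = np.mean((y_true - y_pred) ** 2)

# Batch size 8 gives batches of 8, 8, and a smaller final batch of 4.
batches = [slice(0, 8), slice(8, 16), slice(16, 20)]
batch_mses = [np.mean((y_true[b] - y_pred[b]) ** 2) for b in batches]

# Unweighted average of the per-batch MSE values.
batch_avg = np.mean(batch_mses)

# Weighting each batch MSE by its sample count recovers the epoch MSE.
weighted_avg = sum(
    m * (b.stop - b.start) for m, b in zip(batch_mses, batches)
) / len(y_true)

print(epoch_mse, batch_avg, weighted_avg)
```

With unequal batch sizes the unweighted average generally does not match the whole-epoch value, while the sample-weighted average does, so which aggregation the loss display and the metric use could explain a discrepancy like the one above.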

What is the difference between these two?