I have written a custom model in TensorFlow (version 2.4, Python 3.9.5) so that I can implement my own custom loss function. I noticed that when I use the mean squared error as the loss in my custom model (via the self.compute_loss method, called in self.train_step), the mean_squared_error metric reports a different value from the loss.
The shape of the input data is (1000, 18) and the output data is (1000, 1).
See the code below:
import tensorflow as tf
from tensorflow.keras.losses import mean_squared_error


def mse_func(y_true, y_pred):
    # without tf.reduce_mean we get one value per sample instead of a scalar
    # mse = mean_squared_error(y_true, y_pred)
    mse = tf.reduce_mean(mean_squared_error(y_true, y_pred))
    return mse
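To illustrate the commented-out line: tf.keras.losses.mean_squared_error only reduces over the last axis, so it returns one value per sample, and tf.reduce_mean collapses that vector to a single scalar. A quick check (the batch of 128 zeros/ones is just an arbitrary example):

y_true = tf.zeros((128, 1))
y_pred = tf.ones((128, 1))
per_sample = mean_squared_error(y_true, y_pred)
print(per_sample.shape)            # (128,) -- one value per sample
print(tf.reduce_mean(per_sample))  # tf.Tensor(1.0, shape=(), dtype=float32)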
class CustomModel(tf.keras.Model):
    def train_step(self, data):
        x, y_true = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value
            loss = self.compute_loss(y_true, y_pred)
        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        self.compiled_metrics.update_state(y_true, y_pred)
        # Return a dict mapping metric names to current values (to work
        # with TensorBoard)
        metrics_dict = {
            'loss': loss,
        }
        metrics_dict.update({m.name: m.result() for m in self.metrics})
        return metrics_dict

    def compute_loss(self, y_true, y_pred):
        mse = mse_func(y_true, y_pred)
        return mse
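For completeness, this is roughly how I build and fit the model; the architecture, random data, and batch size are placeholders (batch size 128 gives the 8 batches per epoch seen below), since the exact values shouldn't matter for the issue. Note that no loss is passed to model.compile() here, because train_step calls self.compute_loss directly:

import numpy as np
from tensorflow.keras.optimizers import Adam

# placeholder data matching the shapes described above
x_train = np.random.rand(1000, 18).astype('float32')
y_train = np.random.rand(1000, 1).astype('float32')

# placeholder architecture
inputs = tf.keras.Input(shape=(18,))
hidden = tf.keras.layers.Dense(32, activation='relu')(inputs)
outputs = tf.keras.layers.Dense(1)(hidden)

model = CustomModel(inputs=inputs, outputs=outputs)
model.compile(optimizer=Adam(), metrics=['mean_squared_error'])
model.fit(x_train, y_train, batch_size=128, epochs=10, verbose=2)

Training then logs: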
Epoch 1/10
8/8 - 0s - loss: 0.0053 - mean_squared_error: 0.0113
Epoch 2/10
8/8 - 0s - loss: 0.0060 - mean_squared_error: 0.0063
Epoch 3/10
8/8 - 0s - loss: 0.0018 - mean_squared_error: 0.0032
Epoch 4/10
8/8 - 0s - loss: 0.0024 - mean_squared_error: 0.0023
...
However, when I pass the model's compute_loss method as the loss in the model.compile() call, and replace loss = self.compute_loss(y_true, y_pred) with loss = self.compiled_loss(y_true, y_pred, regularization_losses=self.losses), the loss and metric values match (which is the expected behaviour):
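In other words, the only change inside train_step for the working version is the loss line:

with tf.GradientTape() as tape:
    y_pred = self(x, training=True)  # Forward pass
    # use the compiled loss rather than calling compute_loss directly
    loss = self.compiled_loss(y_true, y_pred, regularization_losses=self.losses)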
# when the loss is passed into the model.compile call, the metric matches the loss
# e.g. there is no problem with this code:
model = CustomModel(inputs=inputs, outputs=outputs)
model.compile(
    optimizer=Adam(),
    metrics=['mean_squared_error'],
    loss=model.compute_loss,
)
Epoch 1/10
8/8 - 0s - loss: 0.0119 - mean_squared_error: 0.0119
Epoch 2/10
8/8 - 0s - loss: 0.0060 - mean_squared_error: 0.0060
Epoch 3/10
8/8 - 0s - loss: 0.0040 - mean_squared_error: 0.0040
Epoch 4/10
8/8 - 0s - loss: 0.0021 - mean_squared_error: 0.0021
Epoch 5/10
8/8 - 0s - loss: 0.0019 - mean_squared_error: 0.0019
...
I'm not sure I understand what is going on under the hood. Is there something different going on when using a non-compiled loss in the train_step function that would cause the difference?