I am training a binary detection model using TensorFlow 2.2 and Keras. Previously, this worked when I loaded the data in the same script that trains the model. However, when I use a larger dataset (6x more samples, same ratio of positive to negative samples), I now get the errors below. Each run trains for a few epochs (5-10, across multiple attempts) before failing:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (dense_1/Sigmoid:0) = ] [[[nan][nan][nan]]...] [y (Cast_4/x:0) = ] [0]
[[{{node assert_greater_equal/Assert/AssertGuard/else/_1/Assert}}]]
[[gradient_tape/point_conv_fp_1/ScatterNd/_192]]
(1) Invalid argument: assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (dense_1/Sigmoid:0) = ] [[[nan][nan][nan]]...] [y (Cast_4/x:0) = ] [0]
[[{{node assert_greater_equal/Assert/AssertGuard/else/_1/Assert}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_14820]
Here is the architecture (see the attached image), and here is the code related to the layer where the error appears:
# initialisation
..
# point_conv_sa layers
..
self.dense4 = keras.layers.Dense(128, activation=tf.nn.elu)
self.bn4 = keras.layers.BatchNormalization()
self.dropout4 = keras.layers.Dropout(0.5)
# This line corresponds to 'dense_1' in the image
self.dense_fin = keras.layers.Dense(self.num_classes, activation=tf.nn.sigmoid, bias_initializer=self.initial_bias)
# training step
..
# point_conv_fp layers
..
net = self.dense4(points)
net = self.bn4(net)
net = self.dropout4(net)
pred = self.dense_fin(net)
return pred
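(For context, self.initial_bias is the bias initializer passed to the final sigmoid layer above; it is set during initialisation. Purely as a point of reference, a common way to build such an initializer for an imbalanced binary problem is a constant log-odds bias; the pos/neg counts below are hypothetical placeholders, not my actual numbers:)
import numpy as np
import tensorflow as tf

# Hypothetical sample counts standing in for the real dataset statistics.
pos, neg = 1_000, 6_000
# sigmoid(log(pos/neg)) == pos / (pos + neg), so the untrained model starts
# out predicting the base rate of positive samples instead of 0.5.
initial_bias = tf.keras.initializers.Constant(np.log(pos / neg))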
Could it have to do with the loss function I'm using? With keras.losses.BinaryCrossentropy() there was no problem on either the small or the large dataset. Then I switched to a focal loss based on https://github.com/mkocabas/focal-loss-keras, and it fails on the large dataset:
import tensorflow as tf
from tensorflow.keras import backend as K

def focal_loss(gamma=2., alpha=.25):
    def focal_loss_fixed(y_true, y_pred):
        # pt_1: predicted probability where the label is 1; pt_0: where it is 0
        pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
        pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
        return -K.mean(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) \
               - K.mean((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0))
    return focal_loss_fixed
....
from tensorflow.keras.metrics import Precision, Recall, AUC

model.compile(
    optimizer=keras.optimizers.Adam(config['lr']),
    loss=focal_loss(alpha=config['fl_alpha'], gamma=config['fl_gamma']),
    metrics=[Precision(), Recall(), AUC()]
)
Let me know if more information is needed.
Cheers
Updating to TensorFlow 2.10 should work fine: https://github.com/keras-team/keras/issues/15715#issuecomment-1100795008
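If upgrading is not an option: the NaNs most likely start in the loss, because K.log() is applied to raw sigmoid outputs, which can saturate to exactly 0 or 1 in float32; log(0) gives inf, the gradients become NaN, and the metric assertion then reports NaN predictions on the next batch. A commonly used workaround (my own sketch, not part of the linked fix) is to clip the probabilities before taking the logs:
import tensorflow as tf
from tensorflow.keras import backend as K

def focal_loss(gamma=2., alpha=.25):
    def focal_loss_fixed(y_true, y_pred):
        # Keep predictions strictly inside (0, 1) so the logs stay finite.
        y_pred = K.clip(y_pred, K.epsilon(), 1. - K.epsilon())
        pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
        pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
        return -K.mean(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) \
               - K.mean((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0))
    return focal_loss_fixed
With the clipping in place the loss stays finite even when the sigmoid saturates, which is usually enough to keep training stable on the larger dataset.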