How to create weighted cross entropy loss?


I have to deal with highly imbalanced data. As I understand it, I need to use a weighted cross entropy loss.

I tried this:

import numpy as np
import tensorflow as tf

weights = np.array([<values>])
def loss(y_true, y_pred):
    # weights.shape = (63,)
    # y_true.shape = (64, 63)
    # y_pred.shape = (64, 63)
    return tf.reduce_mean(tf.nn.weighted_cross_entropy_with_logits(y_true, y_pred, weights))

model.compile('adam', loss=loss, metrics=['acc'])

But there's an error:

ValueError: Creating variables on a non-first call to a function decorated with tf.function

How can I create this kind of loss?


There are 3 answers

Timbus Calin (best answer)

As a first step, I suggest using class_weight from Keras.

class_weight is a dictionary of the form {label: weight}.

For example, if you have 20 times more examples of label 1 than of label 0, then you can write

# Assign 20 times more weight to label 0
model.fit(..., class_weight={0: 20, 1: 1})

In this way you don't need to worry about implementing weighted CCE on your own.
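
If you do still want to implement the weighted loss yourself, here is a minimal sketch of a per-class weighted categorical cross-entropy, assuming your model ends in a softmax over 63 classes; the class_weights vector below is a hypothetical placeholder you would replace with your real values:

import numpy as np
import tensorflow as tf

# Hypothetical placeholder weights; substitute your real per-class values.
class_weights = tf.constant(np.ones(63), dtype=tf.float32)  # shape (63,)

def weighted_cce(y_true, y_pred):
    # y_pred: softmax probabilities, shape (batch, 63)
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)
    # Scale each class's log-likelihood by its weight, then sum per example.
    per_example = -tf.reduce_sum(class_weights * y_true * tf.math.log(y_pred), axis=-1)
    return tf.reduce_mean(per_example)

# Compile as usual: model.compile('adam', loss=weighted_cce, metrics=['acc'])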

Additional note: in your model.compile() do not forget to use weighted_metrics=['accuracy'] so that the reported accuracy reflects the class weights.

model.compile(..., weighted_metrics=['accuracy'])
model.fit(..., class_weight={0: 20, 1: 1})
Gerry P

class_weight is a dictionary that compensates for imbalance in the data set. For example, if you had a data set of 1000 dog images and 100 cat images, your classifier would be biased toward the dog class: if it predicted dog every time, it would be correct 90 percent of the time. To compensate for the imbalance, the class_weight dictionary lets you weight samples of cats 10 times higher than samples of dogs when calculating the loss. One way to build it is with the compute_class_weight utility from sklearn, as shown below.

from sklearn.utils import class_weight
import numpy as np

class_weights = class_weight.compute_class_weight(
    class_weight='balanced',
    classes=np.unique(train_generator.classes),
    y=train_generator.classes)
# Keras expects a {label: weight} dict, not an array
class_weights = dict(enumerate(class_weights))
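
As a quick check of the 1000-dog / 100-cat example above, here is a minimal sketch with a hypothetical labels array (0 = dog, 1 = cat); 'balanced' weights each class by n_samples / (n_classes * count_per_class):

import numpy as np
from sklearn.utils import class_weight

# Hypothetical labels matching the example above: 1000 dogs (0), 100 cats (1)
labels = np.array([0] * 1000 + [1] * 100)

weights = class_weight.compute_class_weight(
    class_weight='balanced', classes=np.unique(labels), y=labels)

# dog: 1100 / (2 * 1000) = 0.55, cat: 1100 / (2 * 100) = 5.5
print(dict(enumerate(weights)))  # {0: 0.55, 1: 5.5}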
bebbieyin

If you are working with imbalanced classes, you should use class weights. For example, if you have two classes where class 0 has twice as much data as class 1:

class_weight = {0: 1, 1: 2}

When you compile, use weighted_metrics instead of just metrics; otherwise the model won't take the class weights into account when calculating accuracy, and the reported accuracy will be unrealistically high.

model.compile(loss="binary_crossentropy", optimizer='adam', weighted_metrics=['accuracy'])

# fit_generator is deprecated; model.fit accepts generators directly.
# validation_split does not work with generators; pass a separate
# validation generator via validation_data if you need validation metrics.
hist = model.fit(train, epochs=20, class_weight=class_weight)