I was wondering: is there an equivalent PyTorch loss function for TensorFlow's softmax_cross_entropy_with_logits?
PyTorch equivalence for softmax_cross_entropy_with_logits
is there an equivalent PyTorch loss function for TensorFlow's softmax_cross_entropy_with_logits?
Yes: torch.nn.functional.cross_entropy. It takes logits as inputs (performing log_softmax internally). Here, "logits" are just values that are not probabilities (i.e. not necessarily in the interval [0, 1]).
But logits are also the values that will be converted to probabilities. If you consider the name of the TensorFlow function, you will see that it is a pleonasm (since the with_logits part already implies that softmax will be applied).
In PyTorch, the implementation looks like this:
loss = F.cross_entropy(x, target)
which is equivalent to:
lp = F.log_softmax(x, dim=-1)
loss = F.nll_loss(lp, target)
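A quick sanity check of that equivalence (a minimal sketch; the shapes and values are made up):

import torch
import torch.nn.functional as F

# Random logits for 4 samples and 3 classes, plus integer class targets.
x = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 2])

loss_a = F.cross_entropy(x, target)
loss_b = F.nll_loss(F.log_softmax(x, dim=-1), target)
print(torch.allclose(loss_a, loss_b))  # True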
It is not F.binary_cross_entropy_with_logits, because that function assumes multi-label classification:
F.sigmoid + F.binary_cross_entropy = F.binary_cross_entropy_with_logits
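As a quick check of that identity (again just a sketch with made-up shapes; torch.sigmoid is the current spelling of F.sigmoid):

import torch
import torch.nn.functional as F

# Multi-label setup: 4 samples, 5 independent binary labels.
logits = torch.randn(4, 5)
labels = torch.randint(0, 2, (4, 5)).float()

loss_a = F.binary_cross_entropy_with_logits(logits, labels)
loss_b = F.binary_cross_entropy(torch.sigmoid(logits), labels)
print(torch.allclose(loss_a, loss_b))  # True (the fused version is more numerically stable)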
It is not torch.nn.functional.nll_loss either, because that function takes log-probabilities (after log_softmax()), not logits.
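One caveat: TensorFlow's softmax_cross_entropy_with_logits takes its labels as a full (one-hot or soft) probability distribution, whereas F.cross_entropy has traditionally taken integer class indices. A manual sketch for soft targets, which on PyTorch 1.10+ can also be passed to F.cross_entropy directly:

import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)
# Soft targets: each row is a probability distribution over the 3 classes.
soft_targets = F.softmax(torch.randn(4, 3), dim=-1)

# Manual cross-entropy against a distribution, matching the TF semantics.
loss_manual = -(soft_targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()

# On PyTorch >= 1.10 the distribution can be passed to cross_entropy directly.
loss_builtin = F.cross_entropy(logits, soft_targets)
print(torch.allclose(loss_manual, loss_builtin))  # True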
@Blade Here's the solution I came up with!
import torch
import torch.nn as nn
import torch.nn.functional as F
class masked_softmax_cross_entropy_loss(nn.Module):
    r"""My version of a masked tf.nn.softmax_cross_entropy_with_logits."""

    def __init__(self, weight=None):
        super(masked_softmax_cross_entropy_loss, self).__init__()
        self.register_buffer('weight', weight)

    def forward(self, input, target, mask):
        if not target.is_same_size(input):
            raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
        # Per-sample cross-entropy between the softmax of the logits and the one-hot target.
        # (F.log_softmax would be the more numerically stable choice here.)
        probs = F.softmax(input, dim=1)
        loss = -torch.sum(target * torch.log(probs), 1)
        loss = torch.unsqueeze(loss, 1)
        # Rescale the mask so that masking does not change the overall scale of the mean loss.
        mask = mask / torch.mean(mask)
        mask = torch.unsqueeze(mask, 1)
        loss = torch.mul(loss, mask)
        return torch.mean(loss)
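A hypothetical usage sketch of the class above (the shapes, labels, and mask values are made up for illustration; it assumes the imports and class definition above):

# 4 nodes, 3 classes; only the first two nodes contribute to the loss.
logits = torch.randn(4, 3)
onehot_target = F.one_hot(torch.tensor([0, 2, 1, 1]), num_classes=3).float()
mask = torch.tensor([1.0, 1.0, 0.0, 0.0])

criterion = masked_softmax_cross_entropy_loss()
loss = criterion(logits, onehot_target, mask)
print(loss)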
Btw: I needed this loss function at the time (Sept 2017) because I was attempting to translate Thomas Kipf's GCN (see https://arxiv.org/abs/1609.02907) code from TensorFlow to PyTorch. However, I now notice that Kipf has done this himself (see https://github.com/tkipf/pygcn), and in his code, he simply uses the built-in PyTorch loss function, the negative log likelihood loss, i.e.
loss_train = F.nll_loss(output[idx_train], labels[idx_train])
Hope this helps.
~DV
Following the pointers in several threads, I ended up with the following conversion. I will post my solution here in case anyone else stumbles onto this thread. It is modified from here, and behaves as expected within this context.
# pred is the prediction with shape [H*W, C]
# gt is the target with shape [H*W]
# idx is the boolean array on H*W used for masking

# TensorFlow version
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=tf.boolean_mask(pred, idx),
    labels=tf.boolean_mask(gt, idx))

# PyTorch version
logp = torch.nn.functional.log_softmax(pred[idx], dim=1)
logpy = torch.gather(logp, 1, gt[idx].view(-1, 1))
loss = -(logpy).mean()
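On current PyTorch versions, the same masked loss can also be written more compactly, since F.cross_entropy applies log_softmax internally (a sketch with made-up shapes, assuming pred is [H*W, C] and gt holds integer class labels):

import torch
import torch.nn.functional as F

# Made-up shapes: 6 pixels, 3 classes, and a boolean mask over the pixels.
pred = torch.randn(6, 3)
gt = torch.tensor([0, 1, 2, 0, 1, 2])
idx = torch.tensor([True, True, False, True, False, True])

# Equivalent to the manual log_softmax + gather version above.
loss = F.cross_entropy(pred[idx], gt[idx])
print(loss)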
A solution, starting with a onehot function: