I'm currently interested in using Cross Entropy Error when performing the BackPropagation algorithm for classification, where I use the Softmax Activation Function in my output layer.
From what I gather, with Cross Entropy and Softmax you can drop the derivative term, so the output-layer error looks like this:
Error = targetOutput[i] - layerOutput[i]
This differs from the Mean Squared Error case, where:
Error = Derivative(layerOutput[i]) * (targetOutput[i] - layerOutput[i])
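To make the two rules concrete, here is a minimal NumPy sketch of the output-layer error term computed both ways. The array names and the 3-class example values are just illustrative assumptions, not taken from any particular library:

import numpy as np

def softmax(z):
    # numerically stable softmax
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

# hypothetical pre-activations and one-hot target for a 3-class example
z = np.array([1.0, 2.0, 0.5])
target = np.array([0.0, 1.0, 0.0])

# Softmax output + Cross Entropy: the activation derivative cancels,
# so the error is just the plain difference
layer_output = softmax(z)
delta_ce = target - layer_output                      # targetOutput[i] - layerOutput[i]

# Tanh output + Mean Squared Error: the activation derivative stays
layer_output = np.tanh(z)
tanh_deriv = 1.0 - layer_output ** 2                  # derivative of tanh at this output
delta_mse = tanh_deriv * (target - layer_output)      # Derivative(...) * (target - output)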
So, can you only drop the derivative term when your output layer uses the Softmax Activation Function for classification with Cross Entropy? For instance, if I were to do regression using the Cross Entropy Error (with, say, a TANH activation function), I would still need to keep the derivative term, correct?
I haven't been able to find an explicit answer to this, and I haven't attempted to work out the math either (as I am rusty).
You do not use the derivative term in the output layer because there you get the 'real' error (the difference between your output and your target); in the hidden layers you have to calculate an approximate error using backpropagation.
What we are doing there is an approximation: we take the derivative of the next layer's error with respect to the current layer's weights, instead of using the current layer's own error (which is unknown).
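If it helps, here is a sketch of the usual derivation (in standard notation, not tied to your code) showing why the derivative term disappears specifically in the softmax + cross-entropy case:

For a general output activation $y_i = f(z_i)$ and loss $E$, the output-layer delta is
$$\delta_i = \frac{\partial E}{\partial z_i} = \sum_j \frac{\partial E}{\partial y_j}\,\frac{\partial y_j}{\partial z_i},$$
which in general keeps the activation derivative. For softmax outputs $y_i = e^{z_i}/\sum_k e^{z_k}$ combined with cross-entropy $E = -\sum_j t_j \log y_j$, those factors cancel and the sum collapses to
$$\frac{\partial E}{\partial z_i} = y_i - t_i,$$
i.e. exactly the plain difference (up to the sign convention used in your pseudocode). With a tanh output or a squared-error loss this cancellation does not occur, so the $f'(z_i)$ factor has to stay.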
Best regards,