How do I convert my backward propagation derivatives into matrix equations?


I'm learning neural networks and am stuck on the backward propagation step. I know how to get the right derivatives and have done so, but I calculated them in scalar form, element by element, and I don't know how to express them with matrices. For example, one of my derivatives is:

(((1-p)*y-p*(1-y))*e**(-z)*w2*w3*x1)/(p*(1-p)*(1+e**(-z))**(2))

My problem is that most of these variables are matrices and therefore require matrix multiplication. They're all of different shapes as well, and the output needs to be one particular shape. I just don't know how to go about framing the same equation in terms of matrix multiplication.

This isn't that relevant to the question, but if you are curious, here are the matrix shapes:

p:(1,13)
y:(1,13)
z:(1,13)
w2:(50,50)
w3:(1,50)
x1:(50,76800)
e is just the constant (2.718281828459045),
and the desired shape at the end is (50,76800).

If it's of any help, the neural network is a binary classifier that distinguishes between cats and dogs. x1 is the input layer; w2 and w3 are the weights of the second and third layers; p is the predicted output; z is the predicted output before the sigmoid function is applied; y holds the truth values. This derivative is for the weights of the first layer (w1).
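To make the shapes concrete, here is one forward pass consistent with that description. The hidden layers are taken as linear, since the derivative above contains no hidden-activation factors, and the raw input shape (76800, 13) is my own assumption (13 samples of 76800 pixels each), chosen so the listed shapes compose; all names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed input shape: 13 samples of 76800 pixels each.
x  = rng.standard_normal((76800, 13))
w1 = rng.standard_normal((50, 76800))  # first-layer weights (shape of the desired gradient)
w2 = rng.standard_normal((50, 50))
w3 = rng.standard_normal((1, 50))

a1 = w1 @ x                      # (50, 13)
a2 = w2 @ a1                     # (50, 13)
z  = w3 @ a2                     # (1, 13)  output before the sigmoid
p  = 1.0 / (1.0 + np.exp(-z))    # (1, 13)  predicted probabilities
```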

I'd like to add that this was my best attempt:

np.dot(np.dot(weights_2.T, np.dot(weights_3.T, dldp(a3, truth))), data.T)
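For reference, here is one way that scalar derivative might translate into matrix form. Note that the expression simplifies considerably: (1-p)*y - p*(1-y) reduces to y - p, and e**(-z)/(1+e**(-z))**2 equals p*(1-p), which cancels the p*(1-p) in the denominator, leaving (y - p)*w2*w3*x1. This is a sketch under my own assumptions (linear hidden layers, input of shape (76800, 13) so the product comes out (50, 76800)); all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes from the question; the input shape (76800, 13) is an
# assumption chosen so the final gradient is (50, 76800).
x  = rng.standard_normal((76800, 13))
w2 = rng.standard_normal((50, 50))
w3 = rng.standard_normal((1, 50))
p  = rng.uniform(0.01, 0.99, size=(1, 13))         # predicted probabilities
y  = rng.integers(0, 2, size=(1, 13)).astype(float)  # truth labels

# ((1-p)*y - p*(1-y)) * sigmoid'(z) / (p*(1-p)) simplifies to y - p,
# because the first factor reduces to y - p and sigmoid'(z) = p*(1-p)
# cancels the denominator.
dz = y - p                       # (1, 13)

# Chain back through the layers, then take the outer product with the
# inputs: (50,50) @ ((50,1) @ (1,13)) @ (13,76800) -> (50, 76800).
dw1 = w2.T @ (w3.T @ dz) @ x.T   # (50, 76800), the desired shape
```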