I'm working on a Keras/TensorFlow-based neural network, and I am trying to do something a bit different.
Normally, the output layer of the network produces an output tensor (i.e. a list of numbers). Then those numbers are compared directly to a target list of training data (the labels) using a loss function such as mean squared error.
However, I would instead like the output layer of the network to be a list of numbers that serve as function parameters. The function operates on these parameters to produce a new list of numbers. The loss function then becomes the MSE between the function output and the labels (instead of, as would normally be the case, the MSE between the output layer and the labels).
I understand that I need to write a Keras custom loss function that computes the values of the target function from the output layer values, then computes and returns the MSE between the target function output and the labels. I also realize that all of this needs to be done within the TensorFlow graph, and that the target function needs to be differentiable so that gradients can be computed. I believe I understand all of this well enough.
Here's what I can't wrap my head around. Let's say there are four neurons in the output layer - call them a, b, c, d. Each of them is a separate parameter to the target function, which also takes a step index: F(a, b, c, d, t). Let's say I evaluate F at t = 1, 2, ..., 20 and get a set of 20 values. That is, F(a, b, c, d, 1); F(a, b, c, d, 2); etc. Then I just want to take the MSE between these 20 values and the 20 values in the corresponding label tensor. That will be the loss function.
I just don't understand the Keras/TensorFlow backend well enough to know how to obtain the individual elements of the output tensor. How do I address the zeroth, first, second, etc. elements of this tensor so that I can use them to compute the function values? I know how to perform operations on whole tensors, but I don't understand how to address individual tensor elements.
I hope I've explained the issue sufficiently clearly.
Thanks for your help!
Since the predicted result and the labels must have the same shape, we should build an entire model that contains the function you want (instead of leaving the function to the loss function); then a standard MSE loss works as-is. Later we can take the output of an earlier layer, which will be the desired parameters.
So, suppose you've got your model prepared up to the layer that outputs the parameters (a Dense(4), most likely, which will output 4 parameters for each input sample). Let's add two Lambda layers after it: the first repeats the parameter vector once for each of the 20 steps, and the second evaluates the function, say a*sin(b*x + c) + d, at each step x.
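A minimal sketch of that construction. The input size (8), the number of steps (20), and the example function a*sin(b*x + c) + d are all assumptions for illustration; note how the individual parameters a, b, c, d are addressed by slicing the last axis of the tensor.

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

steps = 20  # how many times the function is evaluated per sample

inputs = Input(shape=(8,))   # hypothetical input size
params = Dense(4)(inputs)    # outputs the 4 parameters a, b, c, d per sample

# Lambda 1: repeat the parameter vector once per step -> shape (batch, 20, 4)
repeated = Lambda(lambda p: tf.tile(tf.expand_dims(p, 1), [1, steps, 1]))(params)

# Lambda 2: evaluate a*sin(b*x + c) + d at x = 1..20 -> shape (batch, 20).
# Individual "elements" are obtained by slicing the last axis.
def evaluate(p):
    a, b, c, d = p[..., 0], p[..., 1], p[..., 2], p[..., 3]
    x = tf.range(1.0, steps + 1.0)  # [1., 2., ..., 20.], broadcast over the batch
    return a * tf.sin(b * x + c) + d

outputs = Lambda(evaluate)(repeated)

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')  # plain MSE against the 20 labels
```

Because the function lives inside the model, model.fit(x_train, y_train) with y_train holding the 20 target values per sample needs no custom loss at all.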
Now you can train this model, passing the 20-value labels.
When you want to retrieve the four parameters, you need another model that ends earlier, at the layer holding the repeated parameters:
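A sketch of that second model, again assuming the hypothetical input size of 8 from before. Only the construction up to the repeat layer is shown; the function Lambda and the full trainable model would be built on top of these same layer objects, so the two models share weights.

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda
from tensorflow.keras.models import Model

steps = 20
inputs = Input(shape=(8,))   # hypothetical input size
params = Dense(4)(inputs)    # the parameter layer
repeated = Lambda(lambda p: tf.tile(tf.expand_dims(p, 1), [1, steps, 1]))(params)

# A second model over the same layers, ending at the repeated parameters
param_model = Model(inputs, repeated)
```

After training the full model, param_model.predict(x) returns an array of shape (samples, 20, 4); since all 20 rows are identical, reading row 0 gives the four parameters for each sample.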
Per sample, the result will be shaped like (20, 4), but all 20 rows will be identical copies of the same four parameters, so just read off the first row.