Keras neural network outputting function parameters / how to construct loss function?


I'm working on a Keras/TensorFlow based neural network. I am trying to do something a bit different.

Normally, the output layer of the network produces an output tensor (i.e. a list of numbers). Then those numbers are compared directly to a target list of training data (the labels) using a loss function such as mean squared error.

However, I would instead like the output layer of the network to be a list of numbers that serve as function parameters. The function operates on these parameters to produce a new list of numbers. The loss function then becomes the MSE between the function output and the labels (instead of, as would normally be the case, the MSE between the output layer and the labels).

I understand that I need to write a Keras custom loss function that computes the values of the target function from the output layer values, then computes and returns the MSE between the target function output and the labels. I also realize that all of this needs to be done within the TensorFlow graph, and that the target function needs to be differentiable so that gradients can be computed. I believe I understand all of this well enough.

Here's what I can't wrap my head around. Let's say there are four neurons in the output layer; call them a, b, c, d. Each of them is a separate parameter to the target function F. Let's say I evaluate F at 20 successive iteration indices and get a set of 20 values: F(a, b, c, d, 1), F(a, b, c, d, 2), and so on. Then I just want to take the MSE between these 20 values and the 20 values in the corresponding label tensor. That will be the loss.

I just don't understand the Keras/TensorFlow backend well enough to know how to obtain the individual elements of the output tensor. How do I address the zeroth, first, second, etc. elements of this tensor so that I can use them to compute the function values? I know how to perform operations on whole tensors, but I don't understand how to address individual tensor elements.

I hope I've explained the issue sufficiently clearly.

Thanks for your help!

Answer by Daniel Möller (best answer)

Since the predicted result and the labels must have the same shape, the function should be part of the model itself rather than part of the loss. The model's final output is then the function value, and an ordinary MSE loss compares it with the labels.

Afterwards, the desired parameters can be read from the output of an earlier layer.

So, suppose you've got your model prepared up to the layer that outputs the parameters (most likely a Dense(4), which outputs 4 parameters for each input sample).

Let's add two Lambda layers after it.

  • One that outputs a single set of 4 parameters, shared by every sample in the batch, because you will want to retrieve them later
  • One that computes the actual function, a*sin(b*x + c) + d, where x is the iteration index

So:

from keras.layers import Lambda

# add them to your model the usual way
model.add(Lambda(getParameters, output_shape=(4,), name='paramLayer'))
model.add(Lambda(yourFunction, output_shape=(1,), name='valueLayer'))

Where:

import keras.backend as K

def getParameters(x):
    # x arrives as a batch with shape (20, 4); 20 is the batch size and can be any other value

    # condense x into a single sample, because we want only 4 values, not 20*4
    # keepdims=True keeps the batch axis so the result can be broadcast back below
    xCondensed = K.mean(x, axis=0, keepdims=True)

    # expand again to the original batch size, repeating the 4 values 20 (or more) times,
    # because Keras expects the output to have as many samples as the input
    return K.ones_like(x) * xCondensed



def yourFunction(x):
    # each row of x now holds the 4 parameters (assuming a Dense(4) before these Lambda layers)
    a = x[:,0]
    b = x[:,1]
    c = x[:,2]
    d = x[:,3]

    # create the 20 (or more) iteration indices from each row's position in the batch
    ones = K.ones_like(x[:,0])
    iterationsStartingAt1 = K.cumsum(ones)
    iterationsStartingAt0 = iterationsStartingAt1 - 1
    iterations = iterationsStartingAt1  # or iterationsStartingAt0; choose one

    return (a * K.sin((b*iterations) + c)) + d
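
To see what the two functions do to the shapes, here is a NumPy sketch that mirrors them outside of Keras. NumPy only stands in for the backend (np.mean, np.ones_like and np.cumsum play the roles of K.mean, K.ones_like and K.cumsum), the names get_parameters_np and your_function_np are hypothetical, the batch values are made up, and the iteration counter is assumed to start at 1.

```python
import numpy as np

def get_parameters_np(x):
    # Average over the batch axis: (batch, 4) -> (1, 4)
    condensed = np.mean(x, axis=0, keepdims=True)
    # Broadcast back to (batch, 4), so every row holds the same 4 values
    return np.ones_like(x) * condensed

def your_function_np(x):
    # Column slices pick out one parameter each, shape (batch,)
    a, b, c, d = x[:, 0], x[:, 1], x[:, 2], x[:, 3]
    # 1, 2, ..., batch: the position of each row within the batch
    iterations = np.cumsum(np.ones_like(x[:, 0]))
    return a * np.sin(b * iterations + c) + d

# Hypothetical Dense(4) output for a batch of 20 samples
rng = np.random.default_rng(0)
batch = rng.normal(size=(20, 4))

params = get_parameters_np(batch)
values = your_function_np(params)

print(params.shape)  # (20, 4)
print(values.shape)  # (20,)
```

The column slice x[:, 0] is also the answer to "how do I address individual elements": it selects element 0 of every sample in the batch at once.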

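Now you can train this model, passing the labels as targets. Because the iteration index comes from each sample's position in the batch, every batch has to hold the labels in iteration order, so use a batch size equal to the number of iterations and turn shuffling off (shuffle=False in fit).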
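
To check a trained model, it can help to compute the same loss by hand. The sketch below is plain Python and not part of the Keras model. The helper names target_values and mse are hypothetical, the parameter values are made up, and the iteration counter is assumed to start at 1.

```python
import math

def target_values(a, b, c, d, n_steps=20):
    # F(a, b, c, d, i) = a*sin(b*i + c) + d for i = 1..n_steps
    return [a * math.sin(b * i + c) + d for i in range(1, n_steps + 1)]

def mse(predicted, labels):
    # Mean of the squared differences between two equal-length sequences
    return sum((p - y) ** 2 for p, y in zip(predicted, labels)) / len(labels)

# Made-up parameters and labels, for illustration only
labels = target_values(2.0, 0.5, 0.1, 1.0)
shifted = target_values(2.0, 0.5, 0.1, 1.5)  # same curve with d moved by 0.5

print(mse(labels, labels))   # 0.0
print(mse(shifted, labels))  # about 0.25 (a constant offset of 0.5, squared)
```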

When you want to retrieve the four parameters, you need a second model that ends earlier, at the parameter layer:

from keras.models import Model

paramModel = Model(model.inputs,model.get_layer('paramLayer').output)
params = paramModel.predict(testOrTrainData)

The result will have shape (20, 4), but all 20 rows will be identical.
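
One row is therefore enough. A small sketch, using a stand-in array in place of the real paramModel.predict(...) result (the numbers are made up):

```python
import numpy as np

# Stand-in for the (20, 4) array returned by paramModel.predict(...)
params = np.tile(np.array([2.0, 0.5, 0.1, 1.0]), (20, 1))

# getParameters averages within each batch, so rows that came from
# different batches can differ; check before collapsing to one row
assert np.allclose(params, params[0]), "rows differ"

a, b, c, d = params[0]
print(a, b, c, d)
```

If the check fails, the data was probably split into several batches during predict, and passing a batch_size that covers all samples is one way to get a single averaged set.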