I have an LSTM model that takes some number of timesteps of an 'n'-dimensional variable and predicts the next timestep (so the output is an 'n'-dimensional vector). I was trying to use TensorFlow Probability to predict a distribution over predictions; specifically, I am interested in the mean and variance of the predicted distribution, so that I can use them to construct a confidence interval. The model looks something like this:
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

tfd = tfp.distributions

# Negative log-likelihood loss function
neg_log_lik = lambda y, rv_y: -rv_y.log_prob(y)

# The LSTM architecture
model_lstm = Sequential()
model_lstm.add(LSTM(units=125, activation="tanh", input_shape=(n_steps, n), return_sequences=False))
model_lstm.add(Dense(units=n * 2, activation="relu"))
model_lstm.add(tfp.layers.DistributionLambda(
    lambda t: tfd.Normal(loc=t[..., :n],
                         scale=0.01 * tf.math.softplus(t[..., n:]))))

# Compiling the model
model_lstm.compile(optimizer="RMSprop", loss=neg_log_lik)
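To be clear about what I mean by the loss: my understanding is that `neg_log_lik` evaluates the negative log-density of the target under the predicted Normal. Here is a minimal NumPy sanity check of that closed form (the helper name is my own, not part of my model):

```python
import numpy as np

def normal_neg_log_lik(y, loc, scale):
    # -log N(y; loc, scale) written out in closed form
    return 0.5 * np.log(2 * np.pi * scale**2) + (y - loc)**2 / (2 * scale**2)

# at the mean of a standard Normal this should equal 0.5*log(2*pi)
val = normal_neg_log_lik(0.0, 0.0, 1.0)
print(val)  # ≈ 0.9189
```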
I was hoping it would return 'n' means and 'n' variances (hence the units=n*2 in the Dense layer before the DistributionLambda layer), but it seems to just return an 'n'-dimensional vector when I call model_lstm.predict(). Interestingly, the model summary shows an output shape like ((None, n), (None, n)), yet the actual prediction shape is just (batch_size, n).
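For reference, this is how I understand the DistributionLambda's slicing of the Dense layer's 2n outputs, sketched in plain NumPy (the array `t` here just stands in for the Dense output; the softplus helper is my own):

```python
import numpy as np

def softplus(x):
    # numerically stable log(1 + exp(x))
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

n = 3
batch_size = 4
# stand-in for the Dense layer's output, shape (batch_size, 2n)
t = np.random.randn(batch_size, 2 * n)

loc = t[..., :n]                      # first n entries -> means
scale = 0.01 * softplus(t[..., n:])   # last n entries -> positive scales

print(loc.shape, scale.shape)  # (4, 3) (4, 3)
```

So I expected to be able to recover both `(batch_size, n)` arrays from a prediction, but I only get one.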
I was wondering why this is the case, and whether there is a way to get the mean and variance of each dimension of the output?
I am quite new to TensorFlow Probability, so do let me know if I'm not explaining things clearly.
Thank you for your help!