DenseVariational layer produces deterministic predictions even though it should be random


I am trying to build a toy example: a network consisting of a single DenseVariational layer with one unit, from tensorflow_probability.

However, when I train it and run a prediction, the predicted values are always the same, even though they should change randomly because of the learned distribution over the weights.

When I inspect the parameters of the learned posterior over the weight and bias, I get the following:

  • weight: mean = 1.4353443, scale = 0.06307282
  • bias: mean = 3.0068393, scale = 0.08853491

The variance is not huge, but in my opinion it should be large enough to produce visible variation in the predictions.
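As a rough sanity check on that intuition, the sketch below estimates how much a prediction should vary if the weight and bias were sampled independently from normal distributions with the means and scales listed above. It is plain Python rather than TFP code, the function names are only illustrative, and it assumes the model is simply y = w * x + b with independent w and b.

```python
import math
import random

# Posterior parameters reported above (mean, scale).
W_MEAN, W_SCALE = 1.4353443, 0.06307282
B_MEAN, B_SCALE = 3.0068393, 0.08853491


def predictive_std(x):
    """Analytic std of w * x + b for independent normal w and b."""
    return math.sqrt((x * W_SCALE) ** 2 + B_SCALE ** 2)


def sample_prediction(x, rng):
    """One prediction with a freshly sampled weight and bias."""
    w = rng.gauss(W_MEAN, W_SCALE)
    b = rng.gauss(B_MEAN, B_SCALE)
    return w * x + b


rng = random.Random(0)
draws = [sample_prediction(1.0, rng) for _ in range(5)]
print("analytic std at x=1:", round(predictive_std(1.0), 4))
print("five sampled predictions at x=1:", [round(d, 3) for d in draws])
```

Under these assumptions the standard deviation at x = 1 is roughly 0.11, so repeated predictions would be expected to differ in the first or second decimal place.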

Here is the data generating process:

import numpy as np

x_train_100 = np.linspace(-1, 1, 100).reshape((100, 1))
y_train_100 = 1.5 * x_train_100 + 3 + 0.35 * np.random.standard_normal((100, 1))

Here are the prior and posterior I am using:

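import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfpl = tfp.layers

def prior(kernel_size, bias_size, dtype=None):
    n = kernel_size + bias_size
    prior_model = tf.keras.Sequential(
        [
            # Non-trainable standard-normal prior over kernel and bias
            tfpl.DistributionLambda(
                lambda t: tfd.MultivariateNormalDiag(
                    loc=tf.zeros(n), scale_diag=tf.ones(n)
                )
            )
        ]
    )
    return prior_model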

def posterior_mean_field(kernel_size, bias_size=0, dtype=None):
    n = kernel_size + bias_size
    # Offset chosen so that softplus(c) == 1
    c = np.log(np.expm1(1.))
    return tf.keras.Sequential([
        # Trainable variables: n locations followed by n raw (unconstrained) scales
        tfp.layers.VariableLayer(2 * n, dtype=dtype),
        tfp.layers.DistributionLambda(lambda t: tfd.Independent(
            tfd.Normal(
                loc=t[..., :n],
                scale=1e-5 + 0.01 * tf.nn.softplus(c + t[..., n:])),
            reinterpreted_batch_ndims=1)),
    ])
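
To make the scale parameterization above concrete, here is a small plain-Python sketch of the same arithmetic, `1e-5 + 0.01 * softplus(c + t)`. It does not use TFP, the helper names are illustrative, and evaluating it at t = 0 assumes the raw variable starts at zero, which is an assumption about the `VariableLayer` initializer rather than something verified here.

```python
import math


def softplus(x):
    """Softplus, log(1 + exp(x)), adequate for moderate x."""
    return math.log1p(math.exp(x))


# Same constant as in posterior_mean_field: softplus(C) == 1.
C = math.log(math.expm1(1.0))


def posterior_scale(t):
    """Scale produced by the posterior for a raw (unconstrained) value t."""
    return 1e-5 + 0.01 * softplus(C + t)


def raw_value_for_scale(scale):
    """Invert posterior_scale: the raw t needed to obtain a given scale."""
    target = (scale - 1e-5) / 0.01           # required softplus output
    return math.log(math.expm1(target)) - C  # inverse softplus, shifted by C


print("scale at t = 0:", posterior_scale(0.0))
print("raw t for scale 0.063:", raw_value_for_scale(0.063))
```

With t = 0 the scale is about 0.01, and because of the 0.01 factor the raw variable has to grow to somewhere between 5 and 6 before the scale reaches 0.063.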

This is the model specification:

# Build, compile and fit small BNN
model_bnn_100 = tf.keras.Sequential(
    [
        tfpl.DenseVariational(
            units=1,
            input_shape=(1,),
            make_prior_fn=prior,
            make_posterior_fn=posterior_mean_field,
            kl_weight=1/100,
            kl_use_exact=False,
        )
    ]
)
model_bnn_100.compile(loss="mse", optimizer="sgd")
history_bnn_100 = model_bnn_100.fit(x_train_100, y_train_100, epochs=100, verbose=0)

When I finally call the model to get predictions, the results are always the same. I have not set a seed or anything similar beforehand.

model_bnn_100(x_train_100)
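
The behaviour expected from repeated calls can be sketched with a plain-Python stand-in rather than the actual layer. It assumes that each forward pass draws a single (weight, bias) sample from the posterior and applies it to every input in the batch; `forward_pass` and the hard-coded posterior parameters are illustrative only.

```python
import random

# Posterior parameters reported earlier in the question (mean, scale).
W_MEAN, W_SCALE = 1.4353443, 0.06307282
B_MEAN, B_SCALE = 3.0068393, 0.08853491


def forward_pass(xs, rng):
    """Stand-in for one model call: sample (w, b) once, apply to all inputs."""
    w = rng.gauss(W_MEAN, W_SCALE)
    b = rng.gauss(B_MEAN, B_SCALE)
    return [w * x + b for x in xs]


rng = random.Random()  # unseeded, as in the question
xs = [-1.0, 0.0, 1.0]
first = forward_pass(xs, rng)
second = forward_pass(xs, rng)

# Largest difference between the two calls; nonzero when sampling happens per call.
max_diff = max(abs(a - b) for a, b in zip(first, second))
print("first call: ", first)
print("second call:", second)
print("max abs difference:", max_diff)
```

If the real model behaved like this stand-in, two consecutive calls on the same inputs would return visibly different values, which is the check being made with `model_bnn_100(x_train_100)` above.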