I am trying to implement a minimal example of a Bayesian neural network consisting of a single DenseVariational layer with one neuron, using tensorflow_probability.
However, when I train it and then run predictions, the predicted values are always identical, even though they should vary from call to call because the layer samples from the learned distribution over the weights.
When I access the parameters of the learned posterior over the weights, I get the following:
- weight: mean = 1.4353443, scale = 0.06307282
- bias: mean = 3.0068393, scale = 0.08853491
The means are close to the true slope (1.5) and intercept (3) of the data-generating process below, and while the scales are not huge, in my opinion they should still be large enough to produce visible variation in the predictions.
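For reference, this is roughly how I read those numbers off the trained model (a sketch; it assumes the layer's only variable is the VariableLayer from posterior_mean_field below, which holds the concatenated locs and raw scales):

# Sketch: recover (loc, scale) from the posterior's VariableLayer,
# assuming the model further down has already been trained
raw = model_bnn_100.layers[0].weights[0].numpy()       # shape (2*n,), n = 2 here
n = raw.shape[0] // 2
c = np.log(np.expm1(1.))
locs = raw[:n]                                         # posterior means
scales = 1e-5 + 0.01 * np.log1p(np.exp(c + raw[n:]))   # same softplus transform as below
print(locs, scales)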
Here is the data-generating process (imports included for completeness):

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfpl = tfp.layers

x_train_100 = np.linspace(-1, 1, 100).reshape((100, 1))
y_train_100 = 1.5 * x_train_100 + 3 + 0.35 * np.random.standard_normal(100).reshape((100, 1))
Here are the prior and posterior I am using:

def prior(kernel_size, bias_size, dtype=None):
    n = kernel_size + bias_size
    prior_model = tf.keras.Sequential(
        [
            # Non-trainable distribution: fixed standard normal over all n weights
            tfpl.DistributionLambda(
                lambda t: tfd.MultivariateNormalDiag(
                    loc=tf.zeros(n), scale_diag=tf.ones(n)
                )
            )
        ]
    )
    return prior_model
def posterior_mean_field(kernel_size, bias_size=0, dtype=None):
    n = kernel_size + bias_size
    c = np.log(np.expm1(1.))  # softplus inverse of 1, so the scale starts near 0.01
    return tf.keras.Sequential([
        # 2*n trainable parameters: n locs followed by n raw scales
        tfp.layers.VariableLayer(2 * n, dtype=dtype),
        tfp.layers.DistributionLambda(lambda t: tfd.Independent(
            tfd.Normal(
                loc=t[..., :n],
                scale=1e-5 + 0.01 * tf.nn.softplus(c + t[..., n:])),
            reinterpreted_batch_ndims=1)),
    ])
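Sampling from this posterior head on its own is straightforward (a sketch; the VariableLayer ignores its input, so any dummy tensor works):

post = posterior_mean_field(kernel_size=1, bias_size=1)
dist = post(tf.zeros([1]))           # dummy input; returns a distribution object
print(dist.sample(), dist.sample())  # two independent draws over (kernel, bias)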
This is the model specification:
# Build, compile and fit a small BNN with a single stochastic neuron
model_bnn_100 = tf.keras.Sequential(
    [
        tfpl.DenseVariational(
            units=1,
            input_shape=(1,),
            make_prior_fn=prior,
            make_posterior_fn=posterior_mean_field,
            kl_weight=1 / 100,  # 1 / number of training examples
            kl_use_exact=False,
        )
    ]
)
model_bnn_100.compile(loss="mse", optimizer="sgd")
history_bnn_100 = model_bnn_100.fit(x_train_100, y_train_100, epochs=100, verbose=0)
So when I finally call the model to predict, the results are always exactly the same across repeated calls. I have not set a random seed or anything similar beforehand.
model_bnn_100(x_train_100)
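To make the problem concrete, here is the check I use (repeated forward passes, which I would expect to differ because the weights should be resampled on every call):

preds_a = model_bnn_100(x_train_100).numpy()
preds_b = model_bnn_100(x_train_100).numpy()
print(np.allclose(preds_a, preds_b))  # prints True, although I expected fresh weight samples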