Variational bayes method - Unreasonable result from posterior distribution

84 views Asked by At

I'm trying to implement the basic example from https://en.wikipedia.org/wiki/Variational_Bayesian_methods#A_basic_example in order to find the posterior distribution over the mean and variance for the input data(which I generate in the beginning).

It is my understanding that the approximated posterior should be given by the product of q(mu)*q(tau) so I thought I could get it by simple multiplying each distribution for each point in the grid and then plot it. Although I can't see any error with my distributions, the gamma distribution produces extremely small values while the Gaussian distributions only has one non-zero element. My guess is that there is something wrong in the end where I multiply the two distributions for each point in the grid produced by meshgrid but I just wrote both distributions straight from Wikipedia. Why are my posterior probabilities so small/nan and what can I do to fix it?

Here is my code:

# First Exact solution 
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
from scipy.special import factorial

# Produce N data points from known gaussian distribution
real_mu = 2
real_tau = 1
x = np.arange(-3,7,0.1)
N = len(x)
nrv = stats.norm.pdf(x, real_mu, real_tau)

## Approximate posterior distribution over mean and covariance ##
x = nrv     # N data points from "unknown distribution"
N = len(x) # 


# The Algorithm 

#hyperparameters - can be set arbitrarily but smaller positive values indicates larger prior distributions over mu and tau
lambda_0 = 0.05
mu_0 = 0.05
a_0 = 0.05
b_0 = 0.05
## 
x_sum = np.sum(x)
x_ave = np.sum(x)/N
x_squared = np.dot(x, x)
mu_n = (lambda_0*mu_0 + N*x_ave)/(lambda_0 + N)
a_N = a_0 + (N + 1)/2
lambda_N = 1 # Initialize lambda to some arbitrary value
difference = 9999999
while difference > 0.0000001:
    b_N = b_0 + 0.5*((lambda_0 + N)*((1/lambda_N) + mu_n**2) - 2*(lambda_0*mu_0 + x_sum)*mu_n + (x_squared) + lambda_0*mu_0*mu_0)
    new_lambda_N = (lambda_0 + N)*a_N/b_N
    difference_1 = new_lambda_N - lambda_N
    lambda_N = new_lambda_N
    difference = np.absolute(difference_1)


# Calulate the approximated posterior from these parameters: q(mu, tau) = q(mu)q(tau)
t = np.arange(-2,2,0.01)
#qmu = stats.norm.pdf(t, mu_n, 1/lambda_N)
#qtau = gamma.pdf(t, a_N, loc=0, scale=b_N) #scale=1/b_N)
def gaussian(x):
    return (1/(np.sqrt(2*np.pi*sigma*sigma)))*np.exp(-(x-mu_n)*(x-mu_n)/(2*sigma*sigma)) 

def gamma(x):
     return ((b_N**a_N)*(x**(a_N-1))*np.exp(-x*b_N))/(factorial(a_N-1))

sigma = 1/lambda_N

xx, yy = np.meshgrid(t, t)
# First part in zz is from Gaussian distribution over mu and second a gamma distribution over tau
# Same as the two defined functions above
zz = ((1/(np.sqrt(2*np.pi*sigma*sigma)))*np.exp(-(xx-mu_n)*(xx-mu_n)/(2*sigma*sigma)))*((b_N**a_N)*(yy**(a_N-1))*np.exp(-yy*b_N))/(factorial(a_N-1))  

plt.xlabel("mu")
plt.ylabel("tau")
plt.contourf(t,t,zz)
plt.show()
0

There are 0 answers