I am struggling to implement Bayesian model comparison in a specific case and would greatly appreciate some help. I want to compare multiple models, which are represented by normal distributions. Each model has informative prior distributions for both the mean and standard deviation, and these priors are based on domain knowledge.

In my scenario, I aim to compare the estimates provided by different experts. Each expert's estimate corresponds to a prior distribution, and I want to incorporate these estimates in the model comparison process.

Any guidance or suggestions on how to implement this Bayesian model comparison with informative priors would be greatly appreciated. Thank you!

I have attempted to implement this using the scipy and pymc libraries, but I can't seem to find relevant resources to help me out. While I am aware that MCMC sampling would be a promising solution, I am also curious whether there is a way to exploit the fact that my priors are conjugate. Unfortunately, I haven't been able to make progress with either approach.
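On the conjugacy point: note that independent Normal priors on both the mean and the standard deviation (as in the pymc code below) are not actually conjugate for a normal likelihood. The conjugate family for a normal with unknown mean and variance is the Normal-Inverse-Gamma, and under it the marginal likelihood is available in closed form, so Bayes factors between experts follow directly. Here is a sketch, assuming each expert's estimate can be re-expressed as an NIG prior; `log_marginal_likelihood_nig` and the two example priors are illustrative, not part of any library:

```python
import numpy as np
from scipy.special import gammaln

def log_marginal_likelihood_nig(data, mu0, kappa0, alpha0, beta0):
    """Log marginal likelihood of i.i.d. normal data under a
    Normal-Inverse-Gamma(mu0, kappa0, alpha0, beta0) prior on (mean, variance)."""
    x = np.asarray(data, dtype=float)
    n = x.size
    xbar = x.mean()
    ss = np.sum((x - xbar) ** 2)  # sum of squared deviations
    # Standard conjugate updates for the NIG posterior parameters
    kappa_n = kappa0 + n
    alpha_n = alpha0 + n / 2.0
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappa_n)
    # Closed-form log evidence (ratio of normalising constants)
    return (
        gammaln(alpha_n) - gammaln(alpha0)
        + alpha0 * np.log(beta0) - alpha_n * np.log(beta_n)
        + 0.5 * (np.log(kappa0) - np.log(kappa_n))
        - 0.5 * n * np.log(2.0 * np.pi)
    )

np.random.seed(123)
data = np.random.normal(loc=40, scale=10, size=100)

# Two hypothetical expert priors: both expect a variance near 100
# (alpha0=3, beta0=200 gives prior mean variance beta0/(alpha0-1) = 100),
# but expert A centres the mean at 40 and expert B at 30.
log_ml_a = log_marginal_likelihood_nig(data, mu0=40.0, kappa0=1.0, alpha0=3.0, beta0=200.0)
log_ml_b = log_marginal_likelihood_nig(data, mu0=30.0, kappa0=1.0, alpha0=3.0, beta0=200.0)
print("log Bayes factor (A vs B):", log_ml_a - log_ml_b)
```

The Bayes factor is then `np.exp(log_ml_a - log_ml_b)`; working on the log scale avoids overflow for larger data sets.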

I tried the following approach in pymc, but I find the LOO value less interpretable than the Bayes factor.

import arviz as az
import numpy as np
import pymc as pm

np.random.seed(123)
data = np.random.normal(loc=40, scale=10, size=100)

# Define the models with different informative priors
with pm.Model() as model_a:
    # Prior for the mean: normal with mean 40 and standard deviation 5
    mean = pm.Normal('mean', mu=40, sigma=5)
    # Prior for the standard deviation: truncated at zero so sigma stays positive
    sigma = pm.TruncatedNormal('sigma', mu=10, sigma=5, lower=0)
    
    # Likelihood function: normal distribution with unknown mean and standard deviation
    likelihood = pm.Normal('likelihood', mu=mean, sigma=sigma, observed=data)
    # Run the MCMC sampler to obtain samples from the posterior distribution
    trace_a = pm.sample(5000)

with pm.Model() as model_b:
    # Prior for the mean: normal with mean 40 and standard deviation 1
    mean = pm.Normal('mean', mu=40, sigma=1)
    # Prior for the standard deviation: truncated at zero so sigma stays positive
    sigma = pm.TruncatedNormal('sigma', mu=10, sigma=1, lower=0)
    
    # Likelihood function: normal distribution with unknown mean and standard deviation
    likelihood = pm.Normal('likelihood', mu=mean, sigma=sigma, observed=data)
    # Run the MCMC sampler to obtain samples from the posterior distribution
    trace_b = pm.sample(5000)

# Calculate BIC manually
def calculate_bic(log_likelihood, num_parameters, num_data_points):
    bic = -2 * log_likelihood + num_parameters * np.log(num_data_points)
    return bic

with model_a:
    pm.compute_log_likelihood(trace_a)

with model_b:
    pm.compute_log_likelihood(trace_b)

a_loo = az.loo(trace_a)

df_comp_loo = az.compare({"A": trace_a, "B": trace_b})

az.plot_compare(df_comp_loo, insample_dev=False);

# Calculate BIC for each model.  BIC needs the maximised log-likelihood;
# model.logp() returns a symbolic tensor in recent PyMC versions, so instead
# take the largest total log-likelihood across the posterior draws stored
# by pm.compute_log_likelihood as a pragmatic approximation.
ll_a = trace_a.log_likelihood["likelihood"]
ll_b = trace_b.log_likelihood["likelihood"]
log_likelihood_a = float(ll_a.sum([d for d in ll_a.dims if d not in ("chain", "draw")]).max())
log_likelihood_b = float(ll_b.sum([d for d in ll_b.dims if d not in ("chain", "draw")]).max())

num_parameters_a = len(model_a.free_RVs)
num_parameters_b = len(model_b.free_RVs)

num_data_points = len(data)

bic_a = calculate_bic(log_likelihood_a, num_parameters_a, num_data_points)
bic_b = calculate_bic(log_likelihood_b, num_parameters_b, num_data_points)

print('BIC for model A:', bic_a)
print('BIC for model B:', bic_b)
