In GPflow, one can set priors on the parameters of the model, as explained in the documentation:
```python
import gpflow
import tensorflow as tf
import tensorflow_probability as tfp

long_term_kernel = gpflow.kernels.SquaredExponential()
long_term_kernel.variance.prior = tfp.distributions.LogNormal(
    tf.math.log(gpflow.utilities.to_default_float(280_000.0)), 1.0
)
long_term_kernel.lengthscales.prior = tfp.distributions.LogNormal(
    tf.math.log(gpflow.utilities.to_default_float(140.0)), 0.05
)
```
I was wondering how this affects the optimization process. If I am not mistaken, GPflow finds the best parameters by maximizing the log marginal likelihood. How does this work when a parameter has a probability distribution as its prior? Is a regularization term (associated with the prior) added to the objective function? Is the log likelihood computed in a different way? If so, how?
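To make concrete what I mean by "a regularization term added to the objective", here is a toy example in plain Python (deliberately not using GPflow, and not claiming this is what GPflow does internally): MAP estimation of a Gaussian mean, where the log prior is simply added to the log likelihood. All names here (`log_likelihood`, `log_prior`, `objective`) are my own illustrative choices:

```python
import math

# Toy data, assumed drawn from N(mu, 1); we want to estimate mu.
data = [2.1, 1.9, 2.4, 2.0, 2.2]

# Gaussian prior on mu: mu ~ N(prior_mean, prior_sd^2)
prior_mean, prior_sd = 0.0, 1.0

def log_likelihood(mu):
    # Sum of Gaussian log densities with unit variance.
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2
               for x in data)

def log_prior(mu):
    return (-0.5 * math.log(2 * math.pi * prior_sd ** 2)
            - 0.5 * ((mu - prior_mean) / prior_sd) ** 2)

def objective(mu):
    # MAP objective: log likelihood plus log prior -- the prior
    # acts as an additive "regularization" term on the objective.
    return log_likelihood(mu) + log_prior(mu)

# Maximize over a fine grid (crude but dependency-free).
grid = [i / 10000 for i in range(0, 30000)]
mu_map = max(grid, key=objective)

# For this conjugate model the MAP estimate has a closed form,
# so we can check the numerical maximizer against it.
n, xbar = len(data), sum(data) / len(data)
closed_form = ((n * xbar + prior_mean / prior_sd ** 2)
               / (n + 1 / prior_sd ** 2))

print(mu_map, closed_form)  # the two should agree closely
```

Note how the MAP estimate is pulled from the sample mean (2.12) toward the prior mean (0.0); with a flat prior the objective would reduce to the plain log likelihood. My question is whether GPflow's optimizers use an objective of this additive form when priors are set.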
Thank you very much.