Why do the trace values have periods of (unwanted) stability?

Question

Why do the trace values have periods of (unwanted) stability?

172 views Asked by jrmxn At 06 September 2017 at 15:33

I have a fairly simple test data set I am trying to fit with pymc3.

The result generated by traceplot looks something like this. Essentially the trace of all parameter look like there is a standard 'caterpillar' for 100 iterations, followed by a flat line for 750 iterations, followed by the caterpillar again.

The initial 100 iterations happen after 25,000 ADVI iterations, and 10,000 tune iterations. If I change these amounts, I randomly will/won't have these periods of unwanted stability.

I'm wondering if anyone has any advice about how I can stop this from happening - and what is causing it?

Thanks.

The full code is below. In brief, I am generating a set of 'phases' (-pi -> pi) with a corresponding set of values y = a(j)*sin(phase) + b(j)*sin(phase). a and b are drawn for each subject j at random, but are related to each other. I then essentially try to fit this same model.

Edit: Here is a similar example, running for 25,000 iterations. Something goes wrong around iteration 20,000.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm
%matplotlib inline

np.random.seed(0)


n_draw = 2000
n_tune = 10000
n_init = 25000
init_string = 'advi'
target_accept = 0.95

##
# Generate some test data
# Just generates:
# x a vector of phases
# y a vector corresponding to some sinusoidal function of x
# subject_idx a vector corresponding to which subject x is

#9 Subjects
N_j = 9

#Each with 276 measurements
N_i = 276
sigma_y = 1.0
mean = [0.1, 0.1]
cov = [[0.01, 0], [0, 0.01]]  # diagonal covariance

x_sub = np.zeros((N_j,N_i))
y_sub = np.zeros((N_j,N_i))
y_true_sub = np.zeros((N_j,N_i))
ab_sub = np.zeros((N_j,2))
tuning_sub = np.zeros((N_j,1))
sub_ix_sub = np.zeros((N_j,N_i))
for j in range(0,N_j):
    aj,bj = np.random.multivariate_normal(mean, cov)
    #aj = np.abs(aj)
    #bj = np.abs(bj)
    xj = np.random.uniform(-1,1,size = (N_i,1))*np.pi
    xj = np.sort(xj)#for convenience
    yj_true = aj*np.sin(xj) + bj*np.cos(xj)
    yj = yj_true + np.random.normal(scale=sigma_y, size=(N_i,1))

    x_sub[j,:] = xj.ravel()
    y_sub[j,:] = yj.ravel()
    y_true_sub[j,:] = yj_true.ravel()

    ab_sub[j,:] = [aj,bj]
    tuning_sub[j,:] = np.sqrt(((aj**2)+(bj**2)))
    sub_ix_sub[j,:] = [j]*N_i

x = np.ravel(x_sub)
y = np.ravel(y_sub)
subject_idx = np.ravel(sub_ix_sub)
subject_idx = np.asarray(subject_idx, dtype=int)

##
# Fit model
hb1_model = pm.Model()
with hb1_model:
#    Hyperpriors
    hb1_mu_a = pm.Normal('hb1_mu_a', mu=0., sd=100)
    hb1_sigma_a = pm.HalfCauchy('hb1_sigma_a', 4)
    hb1_mu_b = pm.Normal('hb1_mu_b', mu=0., sd=100)
    hb1_sigma_b = pm.HalfCauchy('hb1_sigma_b', 4)

    # We fit a mixture of a sine and cosine with these two coeffieicents
    # allowed to be different for each subject
    hb1_aj = pm.Normal('hb1_aj', mu=hb1_mu_a, sd=hb1_sigma_a, shape = N_j)
    hb1_bj = pm.Normal('hb1_bj', mu=hb1_mu_b, sd=hb1_sigma_b, shape = N_j)

    # Model error
    hb1_eps = pm.HalfCauchy('hb1_eps', 5)

    hb1_linear = hb1_aj[subject_idx]*pm.math.sin(x) + hb1_bj[subject_idx]*pm.math.cos(x)
    hb1_linear_like = pm.Normal('y', mu = hb1_linear, sd=hb1_eps, observed=y)

with hb1_model:
    hb1_trace = pm.sample(draws=n_draw, tune = n_tune,
                      init = init_string, n_init = n_init,
                     target_accept = target_accept)

pm.traceplot(hb1_trace)

Original Q&A

There are 2 answers

jrmxn On 06 September 2017 at 19:50

To partially answer my own question: After playing with this for a while, it looks like the problem might be due to the hyperprior standard deviation going to 0. I am not sure why the algorithm should get stuck there though (testing a small standard deviation can't be that uncommon...).

In any case, two solutions that seem to alleviate the problem (although they don't remove it entirely) are:

1) Add an offset to the definitions of the standard deviation. e.g.:

offset = 1e-2
hb1_sigma_a = offset + pm.HalfCauchy('hb1_sigma_a', 4)

2) Instead of using a HalfCauchy or HalfNormal for the SD prior, use a logNormal distribution set so that 0 is unlikely.

**altroware** · Accepted Answer · 2017-09-07T10:53:01+00:00

altroware On 07 September 2017 at 10:53 BEST ANSWER

I'd look at the divergencies, as explained in notes and literature on Hamiltonian Monte Carlo, see, e.g., here and here.

with model:
    np.savetxt('diverging.csv', hb1_trace['diverging'])

As a dirty solution, you can try to increase target_accept, perhaps.

Good luck!

TechQA.

Why do the trace values have periods of (unwanted) stability?

There are 2 answers

Related Questions in PYMC

Related Questions in PYMC3

Related Questions in MCMC

Popular Questions

Trending Questions