Is there a way to run GLM.from_formula without the intercept (PyMC3)?


This may be a dumb question, but I've searched through the PyMC3 docs and forums and can't find the answer. I'm trying to fit a linear regression model to a dataset that I know a priori should not have an intercept. Currently my implementation looks like this:

import pymc3 as pm

# df is a pandas DataFrame with columns 'Y', 'X1' and 'X2'
formula = 'Y ~ ' + ' + '.join(['X1', 'X2'])

# Context for the model
with pm.Model() as model:
    # Set distributions for the priors
    priors = {'X1': pm.Wald.dist(mu=0.01),
              'X2': pm.Wald.dist(mu=0.01)}

    family = pm.glm.families.Normal()

    # Creating the model requires a formula and data; the data must
    # contain every column named in the formula, including the response Y
    pm.GLM.from_formula(formula, data=df, family=family, priors=priors)

    # Perform Markov Chain Monte Carlo sampling
    trace = pm.sample(draws=4000, cores=2, tune=1000)

As I said, I know I shouldn't have an intercept but I can't seem to find a way to tell GLM.from_formula() to not look for one. Do you all have a solution? Thanks in advance!

There is 1 answer

Best answer, by merv

I'm actually puzzled that it runs with an intercept, since GLM.from_formula passes intercept=False to the constructor. The likely reason is that patsy, which parses the formula, adds an intercept term by default.

Either way, one can explicitly include or exclude an intercept via the patsy formula, namely with 1 or 0, respectively. That is, you want:

formula = 'Y ~ 0 + ' + ' + '.join(['X1', 'X2'])
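If the formula is built programmatically, the same idea can be wrapped in a small helper. This is only a sketch: `build_formula` and its `intercept` flag are hypothetical names of my own, not part of PyMC3 or patsy. It writes the intercept term explicitly as `1` or `0` so the result does not depend on patsy's default.

```python
def build_formula(response, predictors, intercept=False):
    """Build a patsy-style formula string (hypothetical helper, not a library API)."""
    # Spell out the intercept term: '1' includes it, '0' excludes it
    terms = ["1" if intercept else "0"] + list(predictors)
    return f"{response} ~ " + " + ".join(terms)


# Formula without an intercept, as in the answer above
no_intercept = build_formula("Y", ["X1", "X2"])
# Formula with an explicit intercept, for comparison
with_intercept = build_formula("Y", ["X1", "X2"], intercept=True)

print(no_intercept)
print(with_intercept)
```

The resulting string can be passed to GLM.from_formula in place of the hand-built formula.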