How to put a restriction on factor coefficients in lm()


I am fitting a standard lm() model in R with numeric variables and factors. For each factor, R reports a coefficient for every level but one, the reference level, whose coefficient is fixed at 0.

Is it possible to choose this level?

For example, here is the output of my model:

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)           9.847e+00  1.499e-02 656.984   <2e-16 ***
base$km              -3.343e-06  5.669e-08 -58.974   <2e-16 ***
log(base$nbJour + 1)  2.395e-02  1.743e-03  13.738   <2e-16 ***
id_boite2            -5.980e-02  4.741e-03 -12.615   <2e-16 ***
cylindre2.0           1.125e-01  8.174e-03  13.762   <2e-16 ***
cylindre2.7           2.291e-01  1.056e-02  21.692   <2e-16 ***
cylindre3.0           3.393e-01  1.061e-02  31.970   <2e-16 ***

The variable id_boite can take two values, 1 or 2. By default R has fixed the coefficient of id_boite1 at 0 and estimated id_boite2 at -5.980e-02. Is it possible to force R to fix the other level at 0 instead or, more generally, to fix the level with the most negative effect at 0, so that all my coefficients are positive?


There are 2 answers

josliber On

I think you're looking for the relevel() function, which moves a chosen level to the front so that it becomes the reference. Before fitting your linear model (assuming a data frame named df), you would run:

df$id_boite <- relevel(factor(df$id_boite), ref = "2")
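
As a minimal sketch on made-up data, the following shows the reference level moving from 1 to 2 and the coefficient changing sign. The data frame `toy`, the response `price`, and the simulated effect size are all hypothetical; they only stand in for the real data.

```r
# Hypothetical toy data: level "2" has a lower mean response than level "1"
set.seed(1)
toy <- data.frame(id_boite = factor(rep(c(1, 2), each = 50)))
toy$price <- 10 - 0.06 * (toy$id_boite == "2") + rnorm(100, sd = 0.01)

# Default: the first level ("1") is the reference, so lm() reports id_boite2
fit_default <- lm(price ~ id_boite, data = toy)

# Make "2" the reference; naming the level as a string avoids any ambiguity
toy$id_boite <- relevel(toy$id_boite, ref = "2")
fit_releveled <- lm(price ~ id_boite, data = toy)

coef(fit_default)    # contains a negative id_boite2
coef(fit_releveled)  # contains a positive id_boite1 of the same magnitude
```

The fitted values are identical in both models; only the parameterisation changes, so the intercept absorbs the difference.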
Ben Bolker On

You could use df <- transform(df, id_boite = reorder(id_boite, response_var)) (assuming your data are in a data frame df and that response_var is the response variable), which sets the factor levels in order of increasing (marginal) mean response, so the level with the lowest mean becomes the reference. This doesn't guarantee positive coefficients in a more complex regression model, where the conditional mean associated with each level can differ from its marginal mean, but it might work reasonably well in general.
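
A minimal sketch of ordering levels by mean response with base R's reorder(), again on made-up data: the data frame `toy`, the response `y`, and the per-level effects are hypothetical and chosen only so that the alphabetically first level is not the one with the lowest mean.

```r
# Hypothetical toy data: three cylinder sizes with different mean responses
set.seed(42)
toy <- data.frame(cylindre = factor(rep(c("2.0", "2.7", "3.0"), each = 40)))
effect <- c("2.0" = 0.3, "2.7" = 0.0, "3.0" = 0.1)
toy$y <- 5 + unname(effect[as.character(toy$cylindre)]) + rnorm(120, sd = 0.02)

levels(toy$cylindre)  # "2.0" "2.7" "3.0": alphabetical, so "2.0" is the reference

# reorder() sorts the levels by mean(y) within each level (FUN = mean by default)
toy <- transform(toy, cylindre = reorder(cylindre, y))
levels(toy$cylindre)  # the lowest-mean level now comes first and is the reference

fit <- lm(y ~ cylindre, data = toy)
coef(fit)[-1]  # differences from the lowest-mean level, hence non-negative here
```

With a single factor as the only predictor, the coefficients are just differences between group means, so this ordering makes them non-negative. Once other predictors enter the model, that is no longer guaranteed, as noted above.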