glmnet: how to set reference category for multinomial logit


Following up on my Cross Validated question, "which is the reference category or class in multinomial regression?", can someone explain how to set the reference category in glmnet for multinomial logistic regression?

glmnet is built for fitting models with shrinkage penalties (ridge, lasso, etc.), but neither its documentation nor the glmnet forums I have found answer this question.

Thank you in advance

Answer by LyzandeR (best answer)

No, glmnet itself has no option for this, but you can set the reference category easily just before calling it, by building the dummy variables with model.matrix:

a <- factor(rep(c("cat1", "cat2", "cat3", "no-cat"), 50)) # make a factor
a <- relevel(a, ref = "no-cat") # move "no-cat" to the first level, because
# model.matrix always treats the first level as the reference category.
# (Assigning to levels(a) would rename the labels instead of reordering them.)
df <- data.frame(a) # put the factor in a data frame

dummy_a <- model.matrix(~a, data = df) # make dummies for the factor.
# Note that the first level in levels(a) is excluded, i.e.
# it becomes the reference category

cat_dummified <- dummy_a[, 2:4] # the first column is the intercept, i.e. a column of 1s,
# which we exclude here

> head(cat_dummified)
  acat1 acat2 acat3
1     1     0     0
2     0     1     0
3     0     0     1
4     0     0     0
5     1     0     0
6     0     1     0

> class(cat_dummified)
[1] "matrix"

cat_dummified is a matrix (on R 4.0 and later, class() reports "matrix" "array"), so it is ready to pass to the glmnet function. This way you have only 3 dummies, and their coefficients are interpreted relative to the no-cat category.
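As a rough end-to-end sketch of the idea above, the snippet below rebuilds the dummies and passes them to glmnet. The three-class response y is simulated purely for illustration (it is not part of the original example), and the value s = 0.01 is an arbitrary choice. The fit is guarded so the snippet still runs if glmnet is not installed.

```r
# Hypothetical toy data: the four-level predictor from above and a made-up response
set.seed(1)
a <- factor(rep(c("cat1", "cat2", "cat3", "no-cat"), 50))
a <- relevel(a, ref = "no-cat")   # "no-cat" becomes the first (reference) level
y <- factor(sample(c("A", "B", "C"), length(a), replace = TRUE))  # simulated, assumption

# Treatment-coded dummies; drop the intercept column by name rather than by position
x <- model.matrix(~ a, data = data.frame(a))
x <- x[, colnames(x) != "(Intercept)", drop = FALSE]

# Sanity check: rows that belong to the reference level are all zeros
ref_rows_all_zero <- all(x[a == "no-cat", ] == 0)

# Fit only when glmnet is available
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(x, y, family = "multinomial")
  cf <- coef(fit, s = 0.01)  # a list with one sparse coefficient matrix per class of y
  print(names(cf))
}
```

Dropping the intercept by name instead of with dummy_a[, 2:4] is only a convenience: it keeps working if the number of levels changes. Note that this sets the reference level of the predictor; for the response, glmnet returns one coefficient vector per class.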

Hope this helps!