Split factor column into several columns in R

I want to multiply regression coefficients by the actual variables for every observation. Without factors, I can do this by multiplying the matrix of variables by the vector of coefficients:

v_coef <- as.matrix(vars) %*% as.matrix(coef)

However, one of my variables is a factor, so the regression returns several coefficients for its dummy variables (one for every year, with one year excluded as the reference level). The line of code above therefore no longer works, because several coefficients are associated with a single column in the matrix of variables.

### Working example
# Make up a data frame
df <- data.frame(matrix(rnorm(6 * 1000, 1, .5), ncol = 6))
# Make up some years (3)
df$year <- c(rep(1, 333), rep(2, 333), rep(3, 334))
# Regress something with years as a factor
model <- lm(X1 ~ X2 + X3 + X4 + X5 + X6 + factor(year), data = df)
# This does not work: the 3 years get 2 dummy coefficients but occupy only 1 column
# (df has 7 columns, while the model has 8 coefficients, so the dimensions do not conform)
m_coef <- as.matrix(df) %*% as.matrix(model$coefficients)

I see two solutions but cannot figure out how to implement either. One is to split the factor column into several columns, each containing 1 for observations that fall within the corresponding year and 0 otherwise. The other is to change the matrix multiplication so that each coefficient is applied to the matching value of the factor.
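
Both ideas can be sketched in base R. This is only a sketch under assumptions, not a definitive answer: it rebuilds the example data with an arbitrary seed, uses `model.matrix()` to expand `factor(year)` into 0/1 dummy columns, and uses a lookup vector (the name `year_coef` is made up for illustration) to assign the year coefficients by factor value. It also assumes the intercept column should be part of the product.

```r
# Rebuild the example data (the seed is an arbitrary choice, for reproducibility)
set.seed(1)
df <- data.frame(matrix(rnorm(6 * 1000, 1, .5), ncol = 6))
df$year <- c(rep(1, 333), rep(2, 333), rep(3, 334))
model <- lm(X1 ~ X2 + X3 + X4 + X5 + X6 + factor(year), data = df)

# Idea 1: split the factor into dummy columns.
# model.matrix() returns the design matrix used by lm(): an intercept column,
# X2..X6, and one 0/1 column per non-reference year.
X <- model.matrix(model)
b <- coef(model)

# Matrix product: one value per observation (sum of coefficient * variable)
m_coef <- X %*% b

# Element-wise product: one column per coefficient, one row per observation
contrib <- sweep(X, 2, b, "*")

# Idea 2: assign the year coefficients by factor value.
# Year 1 is the reference level, so its coefficient is 0.
year_coef <- c(0, b[c("factor(year)2", "factor(year)3")])
year_effect <- year_coef[df$year]  # df$year holds 1, 2, 3, so it can index directly
```

Here `m_coef` should coincide with `fitted(model)`, since the design matrix includes the intercept; `contrib` keeps the individual terms apart if the per-variable products are what is needed.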
