I'm trying to use model.matrix inside a function (I won't show the whole function, just the relevant part), but I notice that model.matrix gives a different result when it is called inside a function. Here's the code:
df <- data.frame(a=1:4, b=5:8, c= 9:12)
model.matrix(a~.,data=df)
#The outcome is:
(Intercept) b c
1 1 5 9
2 1 6 10
3 1 7 11
4 1 8 12
attr(,"assign")
[1] 0 1 2
#Using model.matrix inside a function
#The function's arguments are a data frame and a dependent variable.
fun1 <- function(DF,vdep){
model.matrix(vdep ~.,data=DF)
}
fun1(df,df$a)
(Intercept) a b c
1 1 1 5 9
2 1 2 6 10
3 1 3 7 11
4 1 4 8 12
attr(,"assign")
[1] 0 1 2 3
#As you can see, the outcome includes the dependent variable (a).
Why do these outcomes differ? Thanks.
First, you are "regressing" (for lack of a better term) a against everything else. Inside the function, you are regressing vdep against everything else, including a. Your function is essentially just doing model.matrix(1:4 ~ ., data=df). The formula argument behaves like a "string" and doesn't recognize variables the way you see them: the . expands to every column of the data that is not named in the formula, and since vdep is not a column name, a stays on the right-hand side. You could modify your function as follows:
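To make the difference concrete, here is a minimal sketch that may or may not match the modification intended above. It assumes the dependent variable is passed as a column name (a character string) rather than as a vector. The name fun2 is hypothetical, and building the formula with as.formula(paste(...)) is just one possible approach.

```r
df <- data.frame(a = 1:4, b = 5:8, c = 9:12)

# What the original function effectively computes: the response is an
# unnamed vector, so the dot still expands to all columns, including a.
mm_anon <- model.matrix(1:4 ~ ., data = df)
colnames(mm_anon)
# "(Intercept)" "a" "b" "c"

# Hypothetical variant: take the dependent variable as a column name and
# build the formula from it, so the dot excludes that column.
fun2 <- function(DF, vdep) {
  model.matrix(as.formula(paste(vdep, "~ .")), data = DF)
}

mm_named <- fun2(df, "a")
colnames(mm_named)
# "(Intercept)" "b" "c"
```

Called as fun2(df, "a"), this should give the same columns as model.matrix(a ~ ., data = df) at the top level, because the formula now names a explicitly.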