Different outcome using model.matrix for a function in R

I'm trying to use model.matrix inside a function (I won't show the whole function, just the part of interest), but I noticed that model.matrix gives a different result when it is called inside a function. Here's the code:

df <- data.frame(a=1:4, b=5:8, c= 9:12)

model.matrix(a~.,data=df)
#The outcome is:
(Intercept) b  c
1           1 5  9
2           1 6 10
3           1 7 11
4           1 8 12
attr(,"assign")
[1] 0 1 2
#Using model.matrix inside a function
#The function takes a data frame and a dependent variable.
fun1 <- function(DF, vdep) {
  model.matrix(vdep ~ ., data = DF)
}

fun1(df,df$a)
  (Intercept) a b  c
1           1 1 5  9
2           1 2 6 10
3           1 3 7 11
4           1 4 8 12
attr(,"assign")
[1] 0 1 2 3    
#As you can see, the outcome includes the dependent variable (a).

Why do these outcomes differ? Thanks.


There is 1 answer

Roman Luštrik On BEST ANSWER

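In your first call you are "regressing" (for lack of a better term) a against everything else, so the dot expands to b and c. Inside the function you are regressing vdep against everything else, and because vdep is not the name of a column in DF, "everything else" includes a. Your function is essentially doing model.matrix(1:4 ~ ., data = df). The formula is not evaluated when you write it: vdep stays a symbol in the formula and is never replaced by the column you passed in, so model.matrix cannot tell that it stands for a.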
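To see how the dot is expanded in each case, you can inspect the term labels directly. This is only a small sketch to illustrate the point; the standalone vdep variable below is my own stand-in for the function argument, not something from your code.

```r
df <- data.frame(a = 1:4, b = 5:8, c = 9:12)

# Stand-in for the function argument: a plain vector, not a column name
vdep <- df$a

# terms() expands `.` to the columns of `data` not already named in the formula
labels_a    <- attr(terms(a ~ ., data = df), "term.labels")
labels_vdep <- attr(terms(vdep ~ ., data = df), "term.labels")

labels_a     # a is on the left-hand side, so it is left out of the dot
labels_vdep  # vdep is not a column of df, so a stays among the predictors
```

The second result should list a next to b and c, which matches the extra column you get from fun1.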

You could modify your function as follows, passing the name of the dependent variable as a character string:

fun2 <- function(DF,vdep){
  model.matrix(as.formula(paste(vdep, "~ .")), data = DF)
}  

fun2(df, "a")

  (Intercept) b  c
1           1 5  9
2           1 6 10
3           1 7 11
4           1 8 12
attr(,"assign")
[1] 0 1 2
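If you would rather not paste strings together, reformulate() from the stats package should give the same formula. The sketch below assumes, as in fun2, that vdep is passed as a character string; the name fun3 is just for illustration.

```r
df <- data.frame(a = 1:4, b = 5:8, c = 9:12)

# fun3 is a hypothetical name; vdep is the response column name as a string
fun3 <- function(DF, vdep) {
  # reformulate(".", response = "a") builds the formula a ~ .
  model.matrix(reformulate(".", response = vdep), data = DF)
}

res <- fun3(df, "a")
res
```

Either way, the key point is that the formula has to contain the actual column name on the left-hand side, so that the dot can leave it out.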