Do centered variables have to stay in matrix form when using them in a regression equation?
I have centered a few variables using the scale function with center=T and scale=F. I then converted those variables to numeric so that I can manipulate the data frame for other purposes. However, when I run an ANOVA, I get slightly different F values for that variable only; everything else is the same.
Edit:
What's the difference between these two:
df$A <- scale(df$A, center=TRUE, scale=FALSE)
which will embed a matrix within your data.frame,
AND
df$A <- scale(df$A, center=TRUE, scale=FALSE)
df$A = as.numeric(df$A)
which makes variable A numeric and removes the matrix notation within the variable?
Example of what I am trying to do, but the example doesn't cause the problem I am having:
library(car)
library(MASS)
mtcars$wt_c <- scale(mtcars$wt, center=TRUE, scale=FALSE)
mtcars$gear <- as.factor(mtcars$gear)
mtcars1 <- as.data.frame(mtcars)
# Part 1
rlm.mpg <- rlm(mpg~wt_c+gear+wt_c*gear, data=mtcars1)
anova.mpg <- Anova(rlm.mpg, type="III")
# Part 2
# Make wt_c Numeric
mtcars1$wt_c <- as.numeric(mtcars1$wt_c)
rlm.mpg2 <- rlm(mpg~wt_c+gear+wt_c*gear, mtcars1)
anova.mpg2 <- Anova(rlm.mpg2, type="III")
I'll attempt to answer both of your questions.
I'm not sure what you mean by this, but you can strip the center and scale attributes you get back from scale() if that is what you are referring to. You get the same answer whether it is in 'matrix form' or not.
From the help file for scale() we see that it returns the centered, scaled matrix, with the centering and scaling values used (if any) attached as the attributes "scaled:center" and "scaled:scale". So you are getting back a matrix that carries attributes recording the centering and scaling that were applied.
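To make the return value concrete, here is a minimal sketch that inspects what scale() hands back. It assumes the built-in mtcars data and uses AA as an illustrative name for the result; neither choice affects the behaviour.

```r
# Center (but do not rescale) a plain numeric vector.
AA <- scale(mtcars$wt, center = TRUE, scale = FALSE)

is.matrix(AA)               # the result is a one-column matrix, not a vector
dim(AA)                     # number of rows by one column
names(attributes(AA))       # includes "dim" and "scaled:center"
attr(AA, "scaled:center")   # the mean that was subtracted from wt

# With scale = FALSE no scaling is applied, so no "scaled:scale" attribute is set.
is.null(attr(AA, "scaled:scale"))
```

The dim attribute is what makes the column show up in 'matrix form' inside a data frame.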
as.numeric(AA), where AA is the matrix returned by scale(), strips off those attributes, which is the difference between your first and second method. c(AA) does the same thing. I would guess as.numeric() either calls c() (through as.double()) or uses the same method it does. lm() seems to return the same thing either way, so it appears the two forms are equivalent.
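The comparison described above can be sketched as follows. This is an illustration under assumptions, not the exact code from the question: it uses plain lm() on mtcars (not rlm() with Anova()), and the names d, wt_m, wt_v, fit_m and fit_v are made up for the example.

```r
d <- mtcars
d$wt_m <- scale(d$wt, center = TRUE, scale = FALSE)  # one-column matrix with attributes
d$wt_v <- as.numeric(d$wt_m)                         # plain numeric vector

# as.numeric() and c() both drop the dim and "scaled:center" attributes.
identical(as.numeric(d$wt_m), c(d$wt_m))

# Fit the same model with the matrix version and the vector version.
fit_m <- lm(mpg ~ wt_m, data = d)
fit_v <- lm(mpg ~ wt_v, data = d)

# Compare estimates and F statistics, ignoring the differing term names.
all.equal(unname(coef(fit_m)), unname(coef(fit_v)))
all.equal(anova(fit_m)$`F value`[1], anova(fit_v)$`F value`[1])
```

If all three comparisons come back TRUE, the two representations carry the same numbers, and any difference in a fitted model would have to come from somewhere else in the workflow.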