Multilevel models in R using nmle package

2k views Asked by At

I am using the nlme package to learn multilevel models, and following examples from the textbook "Discovering Statistics Using R" when it happened.

Mixed Models Code

The data set is Honeymoon Period.dat, also downloadable under their companion website.

Data Set - Multilevel Models

require(nlme)
require(reshape2)
satisfactionData = read.delim("Honeymoon Period.dat",  header = TRUE)

restructuredData<-melt(satisfactionData, id = c("Person", "Gender"), measured = c("Satisfaction_Base", "Satisfaction_6_Months", "Satisfaction_12_Months", "Satisfaction_18_Months"))
names(restructuredData)<-c("Person", "Gender", "Time", "Life_Satisfaction")


#print(restructuredData)
#restructuredData.sorted<-restructuredData[order(Person),]

intercept <-gls(Life_Satisfaction~1, data = restructuredData, method = "ML", na.action = na.exclude)
randomIntercept <-lme(Life_Satisfaction ~1, data = restructuredData, random = ~1|Person, method = "ML",  na.action = na.exclude, control = list(opt="optim"))
anova(intercept, randomIntercept)

timeRI<-update(randomIntercept, .~. + Time)
timeRS<-update(timeRI, random = ~Time|Person)
ARModel<-update(timeRS, correlation = corAR1(0, form = ~Time|Person))

The error occured at this moment, when I am trying to update "timeRS" model. The error message is as follows:

Error in as.character.factor(X[[i]], ...) : malformed factor

Any stats people/programmers here who knows what this means?

1

There are 1 answers

1
user3585829 On BEST ANSWER

I have looked at this book. It appears that the coding is wrong. The error you are getting is because your time variable is a factor and you need it to be numeric. In the author's first figure in the book he represents time as numeric (0 - 3) but his code for the models is incorrect. I've recoded it for you:

## First, Time needs to be recoded as a numeric

restructuredData$Time.Numeric <- with(restructuredData, ifelse(Time == "Satisfaction_Base", 0, 
        ifelse(Time == "Satisfaction_6_Months", 1,
        ifelse(Time == "Satisfaction_12_Months", 2,
        ifelse(Time == "Satisfaction_18_Months", 3, NA)))))

## Baseline Model

intercept <-gls(Life_Satisfaction~1, data = restructuredData, method = "ML", na.action = na.exclude)

summary(intercept)

## Model where intercept can vary for Individuals

randomIntercept <- lme(Life_Satisfaction ~ 1, data = restructuredData, random = ~1|Person, method = "ML", na.action = na.exclude, control = list(opt = "optim"))

summary(randomIntercept)

## Add time as a fixed effect

timeRI <- lme(Life_Satisfaction ~ Time.Numeric, data = restructuredData, random = ~1|Person, method = "ML", na.action = na.exclude, control = list(opt = "optim"))

summary(timeRI)

## Add a random slope to the model by nesting the Individual within the test time

timeRS <- lme(Life_Satisfaction ~ Time.Numeric, data = restructuredData, random = ~Time.Numeric|Person, method = "ML", na.action = na.exclude, control = list(opt = "optim"))

summary(timeRS)


## Modeling the covariance structure structure of the errors with a first-order autoregressive covariance structure

ARModel <- update(timeRS, correlation = corAR1(0, form = ~Time.Numeric|Person))

summary(ARModel)

anova(intercept, randomIntercept, timeRI, timeRS, ARModel)

The anova read out for model comparisons is now exactly as shown in the book.