model diagnosis in GLMM model of binary outcome variable

65 views Asked by At

Dear Scientists and researchers, Greetings

I am trying to do a research article on morbidity, which is categorical, i.e., yes or no, and I also consider fixed and random effects, given that all of the fixed effect covariates are categorical. Here is the model.

twoRIM<glmer(morbidity~co_fuel+wealth+san_class+residencey+education+marital+age+parity+Electricity_avaliablity +education*age+(1|region)+(1|participant_id),family=binomial,data=M).

Hence, I try to compare these two random intercept model models to a model that has a single random intercept model based on AIC, and model "twoRIM" is the best model because it has the lowest AIC. Then I tried to check the model diagnosis of this model, mode "twoRIM", by running the following r codes, and I got the following results or plots: And the plots I got—the residual and other related plots—are quite different from other forms of model diagnosis I knew before, and I faced a bit more difficulty interpreting and understanding the nuance of these plots. Maybe I took the codes and arguments wrongly? So, dear researchers, subject matter specialists, professors, or anybody who can help me, I cannot thank you enough for your valuable idea about it. I am waiting for your kind suggestions about these issues.

  1. scatter.smooth(fitted(twoRIM),sqrt(abs(resid(twoRIM))),col=6)

    enter image description here

  2. qqline(resid(twoRIM))

    enter image description here

  3. plot(twoRIM)

    enter image description here

  4. qqnorm(resid(twoRIM),main="Residual normal plot",col=4,adj=0.1)

    enter image description here

  5. qqnorm(ranef(twoRIM)$"region"[[1]],main="Regional level random effects",col=2,adj=0.1)

  6. enter image description here6.qqnorm(ranef(twoRIM)$"participant_id"[[1]],main="Cluster level random effects",col=6,adj=0.1) enter image description here

7 plot(fitted(twoRIM),resid(twoRIM),col=4)

enter image description here

8.qqnorm(resid(twoRIM))

enter image description here

my questions is about model diagnosis in GLMM

1

There are 1 answers

0
Daniel On

You could use the performance-package to assess model fit, inspect fit indices and test models for statistically significant differences.

The performance package (part of the easystats-framework, simply install.packages("easystats")) can deal with many different model classes. There are some documentations for the different functions, for example here how to interpret the diagnostic plots.

For logistic regression models, the residual plots are often less useful. Instead, either use the DHARMa package for more appropriate plots, or using check_models(), focus on the binned residuals and posterior predictive checks. The help files (see ?binned_residuals, or online here) provides more background and literature.

library(performance)
library(lme4)
#> Loading required package: Matrix

model1 <- glmer(vs ~ wt + (1 | gear), data = mtcars, family = "binomial")
model2 <- glmer(vs ~ wt + mpg + (1 | gear), data = mtcars, family = "binomial")

# check model assumption
check_model(model2)


# simple comparison of fit indicess
model_performance(model1)
#> # Indices of model performance
#> 
#> AIC    |   AICc |    BIC | R2 (cond.) | R2 (marg.) |   ICC |  RMSE | Sigma | Log_loss | Score_log | Score_spherical
#> -------------------------------------------------------------------------------------------------------------------
#> 35.723 | 36.580 | 40.120 |      0.627 |      0.459 | 0.310 | 0.352 | 1.000 |    0.382 |   -11.180 |           0.080
compare_performance(model1, model2)
#> # Comparison of Model Performance Indices
#> 
#> Name   |    Model | AIC (weights) | AICc (weights) | BIC (weights) | R2 (cond.) | R2 (marg.) |   ICC |  RMSE | Sigma | Log_loss | Score_log | Score_spherical
#> -------------------------------------------------------------------------------------------------------------------------------------------------------------
#> model1 | glmerMod |  35.7 (0.224) |   36.6 (0.282) |  40.1 (0.375) |      0.627 |      0.459 | 0.310 | 0.352 | 1.000 |    0.382 |   -11.180 |           0.080
#> model2 | glmerMod |  33.2 (0.776) |   34.7 (0.718) |  39.1 (0.625) |      0.703 |      0.643 | 0.170 | 0.341 | 1.000 |    0.347 |   -15.116 |           0.098

# test for statistical significant difference
test_likelihoodratio(model1, model2)
#> # Likelihood-Ratio-Test (LRT) for Model Comparison (ML-estimator)
#> 
#> Name   |    Model | df | df_diff | Chi2 |     p
#> -----------------------------------------------
#> model1 | glmerMod |  3 |         |      |      
#> model2 | glmerMod |  4 |       1 | 4.49 | 0.034
test_performance(model1, model2)
#> Name   |    Model |   BF
#> ------------------------
#> model1 | glmerMod |     
#> model2 | glmerMod | 1.67
#> Models were detected as nested (in terms of fixed parameters) and are compared in sequential order.

Created on 2023-11-05 with reprex v2.0.2