Can someone help me edit my code so that all levels of the discrete variables are included in the analysis? By default, gls takes the first level of each discrete variable as the reference level, and this is something I would like to avoid.
I found that I can use the contrasts() function on the factor variables before fitting the model, but it doesn't really do the job for all the discrete variables in my data set.
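To illustrate what I mean by setting contrasts, here is a minimal sketch on a hypothetical toy data frame (the names toy, f1, f2 and z are made up for this example). It assumes sum-to-zero contrasts via contr.sum, which is only one possible choice. It changes how the coefficients are interpreted, but the model still estimates k - 1 coefficients for a factor with k levels.

```r
# Toy data frame (hypothetical); the discrete columns are stored as factors
set.seed(1)
toy <- data.frame(
  f1 = factor(rep(0:2, length.out = 30)),
  f2 = factor(rep(0:1, length.out = 30)),
  z  = rnorm(30)
)
toy$y <- rnorm(30)

# Apply sum-to-zero contrasts to every factor column in one pass
is_fac <- vapply(toy, is.factor, logical(1))
for (nm in names(toy)[is_fac]) {
  contrasts(toy[[nm]]) <- contr.sum(nlevels(toy[[nm]]))
}

# Inspect the resulting design matrix: still k - 1 columns per factor,
# but no single level is coded as an all-zero baseline any more
design <- model.matrix(y ~ f1 + f2 + z, data = toy)
head(design)
```

With this coding, the effect of the last level of each factor is not reported directly; it equals minus the sum of the other levels' coefficients.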
Here is some dummy code:
# Install and load the necessary packages
if (!requireNamespace("nlme", quietly = TRUE)) {
  install.packages("nlme")
}
if (!requireNamespace("ape", quietly = TRUE)) {
  install.packages("ape")
}
library(nlme)
library(ape)
# Generate a dummy phylogenetic tree with one tip per sample
set.seed(42)
n_samples <- 100
phylo_tree <- rcoal(n_samples)
# Generate a dummy dataset with 9 predictors (4 discrete) and a target variable
data <- data.frame(
  x1 = sample(0:2, n_samples, replace = TRUE), # Discrete predictor
  x2 = sample(0:1, n_samples, replace = TRUE), # Discrete predictor
  x3 = rnorm(n_samples),
  x4 = sample(0:3, n_samples, replace = TRUE), # Discrete predictor
  x5 = rnorm(n_samples),
  x6 = rnorm(n_samples),
  x7 = sample(0:1, n_samples, replace = TRUE), # Discrete predictor
  x8 = rnorm(n_samples),
  x9 = rnorm(n_samples)
)
data$y <- with(data, 2 * x1 + 3 * x2 + 1.5 * x3 + 0.5 * x4 + 4 * x5 +
  0.8 * x6 + 2.5 * x7 + 1 * x8 + 0.5 * x9 + rnorm(n_samples))
# Convert the discrete predictors to factors (done after computing y,
# which needs their numeric values); otherwise gls treats them as numeric
discrete_vars <- c("x1", "x2", "x4", "x7")
data[discrete_vars] <- lapply(data[discrete_vars], factor)
# Link each row of the data to one tip of the tree
data$species <- phylo_tree$tip.label
# Fit GLS model with a Brownian-motion correlation structure from the tree
# (the form argument requires a recent version of ape)
correlation_structure <- corBrownian(value = 1, phy = phylo_tree, form = ~species)
model <- gls(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9,
  data = data,
  correlation = correlation_structure)
# Print summary of the model
summary(model)
Any hint would be helpful, thanks in advance!