I am trying to build a GLMM to fit my data but for some reason all my random effects come back as "not defined because of singularities".
I understand that this would indicate that they are perfectly predicted by another variable, but these variables are time of day, date, and individual ID and are not easily correlated with each other or any other variable. I have been adding them to the model as ...+ (1|randomeffect).
I have tried just including one and not the others, but I get this error regardless. The rest of the model runs fine.
Here is the model and the output:
Call:
glm(formula = df$Sex ~ df$`Low Freq (KHz)` + df$`Full Song Duration` +
    (1 | df$Individual) + (1 | df$TOD) + (1 | df$DATER), family = binomial(link = "logit"),
    data = df)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.95539  -0.18003   0.02514   0.10766   2.16469

Coefficients: (3 not defined because of singularities)
                          Estimate Std. Error z value Pr(>|z|)
(Intercept)                 4.2354     1.0846   3.905 9.42e-05 ***
df$`Low Freq (KHz)`        -0.7999     0.3923  -2.039   0.0414 *
df$`Full Song Duration`     5.2124     1.2008   4.341 1.42e-05 ***
1 | df$IndividualTRUE           NA         NA      NA       NA
1 | df$TODTRUE                  NA         NA      NA       NA
1 | df$DATERTRUE                NA         NA      NA       NA

Your problem is that you're not actually fitting a GLMM: that's not what glm() does. You probably wanted a function that fits mixed models, such as glmer() from the lme4 package.
glm() interprets terms like 1|TOD as a literal "or" statement: in this context, 0 is treated as FALSE and any other number as TRUE, so 1|x is always TRUE. You therefore ended up with several extra columns of 1s (converted back from TRUE) in your model, which are all collinear with the intercept ...
Some slightly tangential suggestions:
- Don't use df$... inside GLM(M) formulae; R knows enough to take these variables from the data frame provided via the data argument.
- Consider renaming your variables to something like low_freq and full_duration (but this is admittedly a matter of taste).
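
To make the explanation above concrete, here is a minimal sketch in R. It uses a small simulated data frame (toy, with made-up columns sex, low_freq and individual), not your data, so treat every name in it as an assumption. The first part uses only base R to show how glm() turns (1 | individual) into a constant column that is collinear with the intercept. The second part, which only runs if lme4 is installed, shows the same formula passed to glmer(), where the term is read as a random intercept.

```r
# Part 1 (base R): `|` is element-wise logical OR, so 1 | x is TRUE for any x.
x <- c(0, 3, NA)
or_result <- 1 | x
print(or_result)

# Toy data, simulated purely for illustration (NOT the asker's data).
set.seed(1)
toy <- data.frame(
  sex        = rbinom(40, 1, 0.5),      # binary response
  low_freq   = rnorm(40),               # numeric predictor
  individual = rep(1:8, each = 5)       # numeric ID, as in a typical ID column
)

# In a glm() formula, (1 | individual) is evaluated as a logical vector of TRUEs,
# which becomes one extra column of 1s in the design matrix.
X <- model.matrix(sex ~ low_freq + (1 | individual), data = toy)
print(colnames(X))
print(qr(X)$rank)   # lower than ncol(X): the extra column duplicates the intercept

# The redundant column gets an NA coefficient
# ("not defined because of singularities").
fit_glm <- glm(sex ~ low_freq + (1 | individual), family = binomial, data = toy)
print(coef(fit_glm))

# Part 2 (needs lme4): glmer() reads (1 | individual) as a random intercept.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_glmm <- lme4::glmer(sex ~ low_freq + (1 | individual),
                          family = binomial, data = toy)
  print(lme4::VarCorr(fit_glmm))
}

# For the model in the question, the analogous call would look roughly like this
# (assuming the columns have been renamed to low_freq and full_duration):
# lme4::glmer(Sex ~ low_freq + full_duration +
#               (1 | Individual) + (1 | TOD) + (1 | DATER),
#             family = binomial, data = df)
```

Because the toy response is pure noise, glmer() may report a singular fit (a random-effect variance estimated at zero); that is a property of the simulated data, not of the approach.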