Need help determining the zero-inflated distribution of my data to be able to use it in a GLMM in R


I am working in R with a data set that includes a column of Shannon index values, one per location code. I would like to fit a GLMM using this column as the response variable. Treatment would be my fixed effect, and I have a couple of random effects as well. The issue is that the Shannon index values have an awkward distribution:

[image 1: distribution of the Shannon index values]

The data are zero-inflated and, ignoring the zeros, the distribution looks bimodal. Below is the histogram of the non-zero subset of this data:

[image 2: histogram of the non-zero subset]
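
For reference, the share of zeros and the non-zero subset can be computed along the following lines. This is only a minimal sketch on simulated data: the data frame `dat`, the columns `shannon`, `treatment`, and `location`, and all simulated values are hypothetical stand-ins, not the real data set.

```r
# Simulated stand-in data; all names and values here are hypothetical
set.seed(1)
n <- 200
dat <- data.frame(
  location  = factor(sample(paste0("loc", 1:10), n, replace = TRUE)),
  treatment = factor(sample(c("control", "treated"), n, replace = TRUE))
)

# Each observation is set to zero with probability 0.3;
# positive values come from two normal components to mimic a bimodal shape
is_zero  <- rbinom(n, 1, 0.3) == 1
positive <- ifelse(runif(n) < 0.6, rnorm(n, 0.7, 0.15), rnorm(n, 1.3, 0.2))
dat$shannon <- ifelse(is_zero, 0, pmax(positive, 0.05))

# Share of exact zeros in the response
zero_share <- mean(dat$shannon == 0)

# Non-zero subset, i.e. the part of the data shown in the second histogram
nonzero <- dat$shannon[dat$shannon > 0]

# Cross-tabulate zeros against treatment to see whether zeros depend on it
zero_by_treatment <- table(dat$treatment, dat$shannon == 0)
print(zero_share)
print(zero_by_treatment)
```

Splitting the response this way keeps the zero part and the positive part separate, so each can be inspected on its own before deciding on a distribution.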

Would this be a zero-inflated Gaussian distribution?

I tried fitting a Gaussian mixture model to the non-zero subset of this data:

> library(fitdistrplus)  # provides descdist()
> descdist(subset_Shan_lepi, discrete = FALSE, boot = 500)
summary statistics
------
min:  0.2337917   max:  1.819511 
median:  0.7143846 
mean:  0.8885969 
estimated sd:  0.2991915 
estimated skewness:  0.6367288 
estimated kurtosis:  3.174011

[image 3: plot produced by descdist()]

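> library(mixtools)  # provides normalmixEM()
> gaussian_mix_model <- normalmixEM(subset_Shan_lepi, k = 2)
number of iterations= 23 
> hist(subset_Shan_lepi, probability = TRUE, col = "lightgray", main = "Histogram with Fitted Mixture Model")
> # density() of the two component means is not the fitted curve; evaluate the mixture density on a grid instead
> x_grid <- seq(min(subset_Shan_lepi), max(subset_Shan_lepi), length.out = 200)
> mix_dens <- gaussian_mix_model$lambda[1] * dnorm(x_grid, gaussian_mix_model$mu[1], gaussian_mix_model$sigma[1]) + gaussian_mix_model$lambda[2] * dnorm(x_grid, gaussian_mix_model$mu[2], gaussian_mix_model$sigma[2])
> lines(x_grid, mix_dens, col = 2, lty = 2, lwd = 2)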

[image 4: histogram of the non-zero subset with the fitted mixture model]

It seems like a good fit, but I have no idea how to fit a GLMM to data with this mixture distribution, let alone how to include the zero-inflated part.

Any help welcome, thanks for reading!
