How to fit a multivariate normal distribution in R?

2.4k views Asked by At

I need to fit a multivaraite normal distribution to each specie in the Iris dataset in R. I saw the mvtnormpackage might be useful; however, i want to use the maximum likelihood estimation and not sure how to do so in R. Any ideas?

2

There are 2 answers

2
kangaroo_cliff On BEST ANSWER

It seems you are looking for a mixture discriminant analysis (since class labels are known). In this case, you can use the MclustDA from the mclust package.

 model= MclustDA(data = iris[,1:4], class = iris$Species)
 summary(model)

However, if you are looking to cluster the data using by fitting multivariate Gaussian mixtures then you can use the Mclust function.

  fit = Mclust(data = iris[,1:4], G=3)
  table(fit$classification, iris$Species)

 #   setosa versicolor virginica
 # 1     50          0         0
 # 2      0         45         0
 # 3      0          5        50
0
dcarlson On

If you just want to fit a distribution to each species, you might want mvnorm.mle in the Rfast package:

install.packages("Rfast")
library(Rfast)
iris.split <- split(iris[, 1:4], iris$Species)
iris.mvnorm <- lapply(iris.split, function(x) mvnorm.mle(as.matrix(x)))
iris.mvnorm[["setosa"]]
# $loglik
# [1] 44.91657
#
# $mu
# [1] 5.006 3.428 1.462 0.246
#
# $sigma
#              Sepal.Length Sepal.Width Petal.Length Petal.Width
# Sepal.Length     0.121764    0.097232     0.016028    0.010124
# Sepal.Width      0.097232    0.140816     0.011464    0.009112
# Petal.Length     0.016028    0.011464     0.029556    0.005948
# Petal.Width      0.010124    0.009112     0.005948    0.010884

The other species are stored in iris.mvnorm[["versicolor"]] and iris.mvnorm[["virginica"]].