Estimating parameters of a mixture distribution of 2 normals given quantile values

I have a mixture distribution of two normals with known weights 0.6 and 0.4.

I know the true values of the parameters (in this case the first component is normal with mean = 10030 and sd = 2, and the second is normal with mean = 10000 and sd = 1), but I want to be able to estimate them from quantile values.

Suppose I'm given the following 23 quantile levels

0.010 0.025 0.050 0.100 0.150 0.200 0.250 0.300 0.350 0.400 0.450 0.500 0.550 0.600 0.650
0.700 0.750 0.800 0.850 0.900 0.950 0.975 0.990

and their values

9998.040  9998.466  9998.850  9999.326  9999.681 10000.000 10000.319 10000.674 10001.150
10004.895 10027.234 10028.065 10028.651 10029.139 10029.579 10030.000 10030.421 10030.861
10031.349 10031.935 10032.766 10033.463 10034.256
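For reference, the link between the quantile levels and their values is the mixture CDF: each level should equal the weighted sum of the two normal CDFs evaluated at the corresponding value. A minimal sketch of that check in R, assuming the hypothetical names p, x and mix_cdf (none of them appear in the original post) and using the true parameters stated above:

```r
# Quantile levels and their values, copied from the question
p <- c(0.010, 0.025, 0.050, 0.100, 0.150, 0.200, 0.250, 0.300, 0.350, 0.400,
       0.450, 0.500, 0.550, 0.600, 0.650, 0.700, 0.750, 0.800, 0.850, 0.900,
       0.950, 0.975, 0.990)
x <- c(9998.040, 9998.466, 9998.850, 9999.326, 9999.681, 10000.000,
       10000.319, 10000.674, 10001.150, 10004.895, 10027.234, 10028.065,
       10028.651, 10029.139, 10029.579, 10030.000, 10030.421, 10030.861,
       10031.349, 10031.935, 10032.766, 10033.463, 10034.256)

# CDF of a two-component normal mixture with weights w (illustrative helper)
mix_cdf <- function(x, w, mu, sigma) {
  w[1] * pnorm(x, mu[1], sigma[1]) + w[2] * pnorm(x, mu[2], sigma[2])
}

# Evaluate the mixture CDF at the quantile values, using the true parameters
p_check <- mix_cdf(x, w = c(0.6, 0.4), mu = c(10030, 10000), sigma = c(2, 1))

# Largest discrepancy between the recomputed and the given levels
max_abs_diff <- max(abs(p_check - p))
print(round(p_check, 3))
print(max_abs_diff)
```

Any remaining discrepancy should come only from the quantile values being rounded to three decimals.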

What is the best way to estimate in R the mean and variance parameters for each distribution?

I have tried estimating using least squares with the nls function

nls(quantiles ~ weights[1]*pnorm(rvals, mean1, sd1) + weights[2]*pnorm(rvals, mean2, sd2),
    start = list(mean1 = startm1, sd1 = startsd1, mean2 = startm2, sd2 = startsd2))

I've also tried root finding with rootSolve::multiroot().

I've tried solving for one parameter at a time and for all four at once. So far, the only way I get good estimates is by giving starting values very close to the true parameters.
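One way to reduce the dependence on hand-picked starting values is to derive them from the data. The sketch below is not the nls or multiroot approach described above; it swaps in optim() to minimise the squared difference between the given levels and the mixture CDF, with standard deviations fitted on the log scale so they stay positive. The names p, x, w, mix_cdf and sse are illustrative assumptions, and the starting-value rule (split the values at the largest gap, then match the lower cluster to the weight nearest the level at the gap) is a heuristic that assumes well-separated components.

```r
# Quantile levels and their values, copied from the question
p <- c(0.010, 0.025, 0.050, 0.100, 0.150, 0.200, 0.250, 0.300, 0.350, 0.400,
       0.450, 0.500, 0.550, 0.600, 0.650, 0.700, 0.750, 0.800, 0.850, 0.900,
       0.950, 0.975, 0.990)
x <- c(9998.040, 9998.466, 9998.850, 9999.326, 9999.681, 10000.000,
       10000.319, 10000.674, 10001.150, 10004.895, 10027.234, 10028.065,
       10028.651, 10029.139, 10029.579, 10030.000, 10030.421, 10030.861,
       10031.349, 10031.935, 10032.766, 10033.463, 10034.256)
w <- c(0.6, 0.4)  # known weights of component 1 and component 2

# Mixture CDF; par = (mean1, log(sd1), mean2, log(sd2))
mix_cdf <- function(x, par) {
  w[1] * pnorm(x, par[1], exp(par[2])) + w[2] * pnorm(x, par[3], exp(par[4]))
}

# Least-squares objective: given levels versus model CDF at the given values
sse <- function(par) sum((p - mix_cdf(x, par))^2)

# Heuristic starting values: split the sorted values at the largest gap
gap <- which.max(diff(x))
lo <- x[seq_len(gap)]
hi <- x[(gap + 1):length(x)]

# The lower cluster holds roughly p[gap] of the mass; match it to the
# component whose known weight is closest to that level
lo_is_first <- abs(w[1] - p[gap]) < abs(w[2] - p[gap])
c1 <- if (lo_is_first) lo else hi
c2 <- if (lo_is_first) hi else lo
start <- c(mean(c1), log(sd(c1)), mean(c2), log(sd(c2)))

fit <- optim(start, sse, method = "BFGS",
             control = list(maxit = 500, reltol = 1e-12))

# Back-transform the log standard deviations
est <- c(mean1 = fit$par[1], sd1 = exp(fit$par[2]),
         mean2 = fit$par[3], sd2 = exp(fit$par[4]))
print(est)
print(fit$value)  # remaining sum of squared differences
```

If the components overlapped heavily there would be no clear gap, and the starting values would need another source, such as a few random restarts.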

Any suggestions would help.

Thanks

There is 1 answer

dcarlson On

There are several R packages that perform mixture analysis. Here is an example using mixtools:

library(mixtools)
mix <- normalmixEM(rvals)   # Using all default parameters
mix
# number of iterations= 5 
# $x
#  [1]  9998.040  9998.466  9998.850  9999.326  9999.681 10000.000 10000.319 10000.674 10001.150 10004.895 10027.234 10028.065 10028.651 10029.139 10029.579 10030.000 10030.421
# [18] 10030.861 10031.349 10031.935 10032.766 10033.463 10034.256
# 
# $lambda
# [1] 0.4347826 0.5652174
# 
# $mu
# [1] 10000.14 10030.59
# 
# $sigma
# [1] 1.836207 2.034118
# 
# $loglik
# [1] -63.6896
# 
# $posterior
#    . . . 
plot(mix, which=2)

You can see that the estimated weights (lambda) are 0.435 and 0.565, the estimated means (mu) are 10,000.14 and 10,030.59, and the standard deviations (sigma) are 1.836 and 2.034, which are fairly close to your true values. Note that the components are listed in the opposite order from your question. Reading the package vignette and fine-tuning the parameters will probably get you closer.

[Figure: Mixture Plot]