How to use fitdist when the paramters are already known (Pareto distribution)

1.1k views Asked by At

I am fitting a Pareto distribution to some data and have already estimated the maximum likelihood estimates for the data. Now I need to create a fitdist (fitdistrplus library) object from it, but I am not sure how to do this. I need a fitdist object because I would like to create qq, density etc. plots with the function such as denscomp. Could someone help?

The reason I calculated the MLEs first is because fitdist does not do this properly - the estimates always blow up to infinity, even if I give the correct MLEs as the starting values (see below). If the earlier option of manually giving fitdist my parameters is not possible, is there an optimization method in fitdist that would allow the pareto parameters to be properly estimated?

I don't have permission to post the original data, but here's a simulation using MLE estimates of a gamma distribution/pareto distribution on the original.

library(fitdistrplus)
library(actuar)

sim <- rgamma(1000, shape = 4.69, rate = 0.482)
fit.pareto <- fit.dist(sim, distr = "pareto", method = "mle", 
                       start = list(scale = 0.862, shape = 0.00665))
#Estimates blow up to infinity
fit.pareto$estimate
1

There are 1 answers

3
user2554330 On BEST ANSWER

If you look at the ?fitdist help topic, it describes what fitdist objects look like: they are lists with lots of components. If you can compute substitutes for all of those components, you should be able to create a fake fitdist object using code like

fake <- structure(list(estimate = ..., method = ..., ...),
                  class = "fitdist")

For the second part of your question, you'll need to post some code and data for people to improve.

Edited to add:

I added set.seed(123) before your simulation of random data. Then I get the MLE from fitdist to be

   scale    shape 
87220272  9244012

If I plot the log likelihood function near there, I get this:

loglik <- Vectorize(function(shape, scale) sum(dpareto(sim, shape, scale, log = TRUE)))
shape <- seq(1000000, 10000000, len=30)
scale <- seq(10000000, 100000000, len=30)
surface <- outer(shape, scale, loglik)
contour(shape, scale, surface)
points(9244012, 87220272, pch=16)

screenshot

That looks as though fitdist has made a somewhat reasonable choice, though there may not actually be a finite MLE. How did you find the MLE to be such small values? Are you sure you're using the same parameters as dpareto uses?