I would like to fit a gamma distribution to a dataset made of 338 elements, with a fixed low threshold (I'm using R). To represent the low limit I thought to use a gamma with three parameters, adding the location. Here's my code:
library(FAdist)
library(fitdistrplus)
fit <- fitdist(mydata,"gamma3",start=list(1,1,mythreshold))
Everytime i run the code, I got the same error:
<simpleError in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data, gr = gradient, ddistnam = ddistname, hessian = TRUE, method = meth, lower = lower, upper = upper, ...): non-finite finite-difference value [3]>
Error in fitdist(Hs, "gamma3", start = list(1, 1, 3)) :
the function mle failed to estimate the parameters,
with the error code 100
So I tried to change the starting values, using the ones obtained by the fit of the gamma with two parameters, which gave this result:
fit <- fitdist(mydata,"gamma")
fit
Fitting of the distribution ' gamma ' by maximum likelihood
Parameters:
estimate Std. Error
shape 21.417503 1.6348313
rate 5.352422 0.4133735
But it still doesn't work..I would understand it if even the gamma with two parameters didn't work, but this is not the case and I can't give myself an explanation. Moreover, the QQ-plot and the ecdf for the gamma with two parameters are not so good...but if i fit the distribution on the original dataset scaled with respect to the low threshold the fit looks perfect:
fit <- fitdist(mydata-mythreshold,"gamma")
fit
Fitting of the distribution ' gamma ' by maximum likelihood
Parameters:
estimate Std. Error
shape 1.059540 0.07212832
rate 1.058007 0.09117620
But then I don't know if it is correct to do like that...the parameters are very different and I need them to calculate the return periods related to my data ! that's why I thought about the gamma with the location parameter.
p.s. i did not attach the data because they are too many, but i can report the summary of them:
summary(mydata)
Min. 1st Qu. Median Mean 3rd Qu. Max.
3.003 3.282 3.753 4.001 4.444 8.087
Set up a reproducible example;
For values of the threshold parameter greater than the minimum value in the data set the likelihood is infinite (since these values are supposed to be impossible/have a value of zero under the truncated distribution). We can make things work by setting an upper bound for this parameter:
Note (1) under maximum likelihood this kind of threshold parameter will always end up at the largest value it can take (i.e., the minimum value in the data set); (2) the standard errors are
NA
because Wald standard errors can't be computed when the parameters hit a boundary.Alternatively you can fix the threshold parameter by defining wrapper function: