I have a simple x, y
data.frame.
mydata <- data.frame(days = 1:96, risk = c(5e-09, 5e-09, 5e-09, 1e-08, 4e-08, 6e-08, 9e-08, 1.5e-07, 4.2e-07,
7.2e-07, 1.02e-06, 1.32e-06, 1.66e-06, 2.19e-06, 2.76e-06, 3.32e-06,
3.89e-06, 4.55e-06, 5.8e-06, 7.16e-06, 8.51e-06, 9.85e-06, 1.138e-05,
1.396e-05, 1.672e-05, 1.947e-05, 2.222e-05, 2.521e-05, 2.968e-05,
3.439e-05, 3.909e-05, 4.378e-05, 4.894e-05, 5.697e-05, 6.546e-05,
7.392e-05, 8.236e-05, 9.16e-05, 0.00010573, 0.00012063, 0.00013547,
0.00015025, 0.00016642, 0.00019127, 0.00021743, 0.00024343, 0.00026924,
0.00029818, 0.00034681, 0.00039832, 0.00044932, 0.00049976, 0.0005451,
0.00056293, 0.00057586, 0.00058838, 0.0006005, 0.00061562, 0.00065079,
0.00068845, 0.00072508, 0.00076062, 0.00079763, 0.00084886, 0.00090081,
0.0009507, 0.00099844, 0.00104427, 0.00108948, 0.00113175, 0.00117056,
0.00120576, 0.00123701, 0.00126253, 0.00128269, 0.00129757, 0.00130716,
0.00131291, 0.00132079, 0.0013216, 0.00131392, 0.00129806, 0.00127247,
0.00122689, 0.00117065, 0.00110696, 0.00103735, 0.00095951, 0.00085668,
0.0007517, 0.00065083, 0.000556, 0.0004669, 0.00037675, 0.00029625,
0.00093289))
I think Weibull(3, 0.155)
is a pretty good fit to my data, judging from the plot below.
plot(1:96, dweibull(mydata$risk, shape = 3, scale = 0.155), type = "l", xlab = "days", ylab = "risk")
lines(mydata, type = "l", col = "grey")
legend("topleft", c("Data", "Estimate"), col = c("black", "grey"), lty = c(1, 1))
I write a function that calculates the negative log-likelihood, which will be passed into mle
.
estimate <- function(kappa, lambda){
-sum(dweibull(mydata$y, shape = kappa, scale = lambda, log = TRUE))
}
I call mle
, provide my initial parameter estimates, and obtain the following error.
> mle(estimate, start = list(kappa = 3, lambda = 0.155))
Error in optim(start, f, method = method, hessian = TRUE, ...) :
non-finite finite-difference value [2]
In addition: There were 50 or more warnings (use warnings() to see the first 50)
What went wrong here?
What do you want to do? From what I can tell, you have a dataset of 96 values of "risk", and you want to fit it's distribution with weibull. Note that "days" is not relevant at all if this is the case. You have an unordered vector of values.
The figure above is misleading. You calculate
dweibull()
for risk values. The figure indicates thatdweibull(risk)
is roughly equal to risk. This is a rather different claim than weibull with given parameters being a good fit.for instance, here is the distribution of your data:
hist(mydata$risk, breaks=15)
while the weibull density with your parameters in the relevant range looks like this:curve((function(x) dweibull(x, shape=3, scale=0.155))(x), 0, 0.0014)
Hence these distributions are very different. I would say your empirical distributions is uniform plus mass at zero, not weibull.
Now to your last problem: as the distributions do not fit well, the optimizer runs into numerical singularities. I don't know
mle()
too well, but with a little tweaksmaxLik::maxLik()
will show the problem:gives you
Note that I did one major change: removing
sum
from your log-likelihood, and using BHHH optimizer. This typically more stable than optimizing based on a single summed likelihood. You should also seriously consider writing analytic derivatives for the estimation.You can check that the distributions look a lot more similar now.