Update: for a minimal example scroll down (does not reproduce the same error but different results)
a word of caution before my actual question: I am quite new to R so I hope I make sense. I have tried figuring out the problem using google and the debug functions of R ( although I am still not too familiar with them).
I have read amongst others:
Question on when to use %in% and when == while subsetting
subset vs adressing rows and cols via []
which brought me to the very good blog of Hadley Wickham, where I will further look into the debugging features of R, his article on: Non standard evaluation
My problem
I have a data frame called data containing data on corporate bonds amongst others yield to maturity, life and rating of a bond.
I want to use the mle2 function from the bbmle package ,which is a wrapper for the optim function, to estimate a model for the term structure, see code below. Whether or not the assumption of normally distributed residuals is justified or not is open to question, nevertheless:
Approach 1: "manually" subsetting works just fine:
require("bbmle")
lifeBBB <- data$life[data$Rating == "BBB"]
yieldBBB <- data$yield[data$Rating == "BBB"]
LL <- function(b0, b1, b2, lambda, mu, sigma){
Score= yieldBBB - (b0+b1*((1-exp(-lambda*lifeBBB))/(lambda*lifeBBB))+ b2*((1-exp(-lambda*lifeBBB))/(lambda*lifeBBB)-exp(-lambda*lifeBBB)))
Score = suppressWarnings(dnorm(Score, mu, sigma, log=TRUE))
-sum(Score)
}
fit <- mle2(LL, start = list(b0 = 108, b1 = -85, b2=168, lambda= 0.12, mu = -10, sigma=70),control=list(maxit = 5000))'
Call:
mle2(minuslogl = LL, start = list(b0 = 108, b1 = -85, b2 = 168,
lambda = 0.12, mu = -10, sigma = 70), control = list(maxit = 5000))
Coefficients:
b0 b1 b2 lambda mu
110.2355968 -82.7010072 167.5960478 0.1410541 -7.7644032
sigma
69.4846302
Log-likelihood: -1018.8
Approach 2: When I try to call subset from within the mle2 function an error occurs:
LL2 <- function(b0, b1, b2, lmbd, mu, sigma){
Score= yield - (b0+b1*((1-exp(-lmbd*life))/(lmbd*life))+ b2*((1-exp(-lmbd*life))/(lmbd*life)-exp(-lmbd*life)))
Score = suppressWarnings(dnorm(Score, mu, sigma, log=TRUE))
-sum(Score)
}
fit1 <- mle2(minuslogl=LL2, start = list(b0 = 108, b1 = -85, b2=168, lmbd= 0.12, mu = -10, sigma=70),method="BFGS", data=data, subset= Rating=="BBB", control=list(maxit=5000))
Error in optim(par = c(108, -85, 168, 0.12, -10, 70), fn = function (p) :
initial value in 'vmmin' is not finite
So since approach 1 works and approach 2 should be a more convenient way of doing the exact same thing without having to introduce new variables for every rating, I figured that something in my code has to be wrong. As I said I tried to use the debugging method and ended up somewhere deep in the machine room of R with an entirely different error.
On the error produced I found this from the R Mailing list, the problem here was that fractions instead of integers were provided as the size parameter of a dbinom distribution. But I can't see how this is an issue in my code.
Thanks in advance
B. Lör
On request I tried to come up with a minimal example which uses exemplary data. The minimal example does not produce an error but two different set of estimates. This is peculiar since in my understanding both functions should do the same
#### minmial example
require("bbmle")
## approach 1:
set.seed(23456)
Rating <- c(rep("A",38),rep("BBB",39) )
yield <- rnorm(77,0.5,15)
life <- runif(77,1,20)
exdata <- data.frame(Rating,yield,life)
lifeBBB <- exdata$life[exdata$Rating == "BBB"]
yieldBBB <- exdata$yield[exdata$Rating == "BBB"]
LL <- function(b0, b1, b2, lambda, mu, sigma){
Score= yieldBBB - (b0+b1*((1-exp(-lambda*lifeBBB))/(lambda*lifeBBB))+ b2*((1-exp(-lambda*lifeBBB))/(lambda*lifeBBB)-exp(-lambda*lifeBBB)))
Score = suppressWarnings(dnorm(Score, mu, sigma, log=TRUE))
-sum(Score)
}
fit <- mle2(LL, start = list(b0 = 100, b1 = -80, b2=160, lambda= 0.12, mu = 1, sigma=14),method ="BFGS",control=list(maxit = 5000))
### approach 2:
LL2 <- function(b0, b1, b2, lmbd, mu, sigma){
Score= exdata$yield - (b0+b1*((1-exp(-lmbd*exdata$life))/(lmbd*exdata$life))+ b2*((1-exp(-lmbd*exdata$life))/(lmbd*exdata$life)-
exp(-lmbd*exdata$life)))
Score = suppressWarnings(dnorm(Score, mu, sigma, log=TRUE)) ## assumption is that residuals are normally distributed
-sum(Score)
}
fit1 <- mle2(minuslogl=LL2, start = list(b0 = 100, b1 = -80, b2=160, lmbd= 0.12, mu = 1, sigma=14),method="BFGS", data=exdata, subset= Rating=="BBB", control=list(maxit=5000))
Call(fit):
Call:
mle2(minuslogl = LL, start = list(b0 = 100, b1 = -80, b2 = 160,
lambda = 0.12, mu = 1, sigma = 14), method = "BFGS", control = list(maxit = 5000))
Coefficients:
b0 b1 b2 lambda mu sigma
94.34166416 -85.35582080 159.76952349 -0.00283408 -4.65833584 12.94283927
Log-likelihood: -155.2
Call (fit1):
all:
mle2(minuslogl = LL2, start = list(b0 = 100, b1 = -80, b2 = 160,
lmbd = 0.12, mu = 1, sigma = 14), method = "BFGS", data = exdata,
subset = Rating == "BBB", control = list(maxit = 5000))
Coefficients:
b0 b1 b2 lmbd mu sigma
9.269496e+01 -8.600709e+01 1.586543e+02 -1.186371e-04 -6.305040e+00 1.458118e+01
Log-likelihood: -315.6