how to use glmrob with MT methods?

311 views Asked by At

Here is my code, but I have this error ..

model <- glmrob(x ~ as.factor(i) + as.factor(j), 
  family = poisson(link = "log"), data = myData,method="MT")

error message:

Error in optim(start, sumaConPesos, method = "L-BFGS-B", x = x, y = y,  : non-finite value supplied by optim
1

There are 1 answers

2
Ben Bolker On

I strongly suspect this is an example of Andrew Gelman's folk theorem of statistical computing:

When you have computational problems, often there’s a problem with your model

If I replicate your example 100 times I get errors every time, but different errors different times:

library(robustbase)
data("GenInsLong",package="ChainLadder")
names(GenInsLong)[3] <- "claims"
tt <-
    replicate(100,
        {cat(".")
         try(robustbase::glmrob(claims~factor(accyear)+factor(devyear),
                   family=poisson,method="MT", data=GenInsLong),
           silent=TRUE)
         })
table(tt)

gives 80 instances of your "non-finite value supplied by optim" error and 20 cases of "Error in solve.default(de/n): Lapack routine dgesv: system is exactly singular: U[x,x] = 0" for a variety of values of x.

I've dug down within the function, using debug on various levels. The problem seems to be that the initialization procedure (glmrobMT -> beta0IniCP -> betaExacto) randomly samples (without replacement) a number of rows from the data equal to the number of columns of the model matrix (19) and tries to fit a poisson GLM to it. Because of the structure of your data, these subsamples are almost always multicollinear, leading to NA values in the coefficients, which screws things up down the line.

I don't know how to fix this, but I would ask the following general modeling questions:

  • are you sure it makes sense to fit a model with 19 parameters to a data set with 55 observations? The usual liberal rule of thumb is N/p > 10, this is N/p approx 2 ...
  • are you sure it makes sense to fit a Poisson model, even a robust one, to a response with a mean value of about 2.5 million? The implied coefficient of variation for a Poisson model with mean 2.5 million is sqrt(V)/mean = 1/sqrt(mean) = 0.0006. Even though these are counts, they're so large that I'd suggest a model based on a continuous distribution ...