Why am I getting a log-likelihood of -Inf when fitting a positive Poisson regression to a large dataset in R?


I need to run a regression on a count dependent variable that contains only positive integers (no zeros), with a mean of 2.31 and a variance of 3.86. Because the mean and variance are fairly close, I am fitting a positive (zero-truncated) Poisson regression with the vglm() function from the VGAM package in R. I have 30 independent variables (some with 60% NA values) and around 12 million rows of data.

When I take a random subset of 50,000 rows, the positive Poisson regression runs without any warnings and returns a finite log-likelihood. However, when I fit the model to the entire dataset (12 million rows), the log-likelihood is -Inf and I get the following warning messages:

1: In vglm.fitter(x = x, y = y, w = w, offset = offset, Xm2 = Xm2, : iterations terminated because half-step sizes are very small
2: In vglm.fitter(x = x, y = y, w = w, offset = offset, Xm2 = Xm2, : some quantities such as z, residuals, SEs may be inaccurate due to convergence at a half-step
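For reference, here is a minimal sketch of the kind of fit described above. It assumes the `pospoisson()` family from VGAM, and the data are simulated stand-ins: the variable names `x1`, `x2` and all coefficient values are hypothetical, not taken from the real dataset.

```r
library(VGAM)  # provides vglm() and the pospoisson() family

set.seed(42)
n  <- 5000
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.4)

# Hypothetical true rates on the log scale
lambda <- exp(0.6 + 0.2 * x1 - 0.3 * x2)

# Inverse-CDF draw from a Poisson truncated at zero, so every y is >= 1
y <- qpois(runif(n, min = dpois(0, lambda), max = 1), lambda)
dat <- data.frame(y, x1, x2)

# trace = TRUE prints the log-likelihood at each iteration
fit <- vglm(y ~ x1 + x2, family = pospoisson(), data = dat, trace = TRUE)

logLik(fit)          # log-likelihood of the fitted model
range(predict(fit))  # range of the fitted linear predictors (log scale)
```

On well-behaved data like this the fit behaves the way my 50,000-row subset does; the problem only shows up on the full data.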

I have the following questions:

  1. I understand that a log-likelihood of -Inf can occur when the dependent variable contains non-integers, because a Poisson distribution assigns zero probability to non-integer values. However, my dependent variable contains only positive integers, and the problem only occurs when I use the entire dataset. How should I interpret the -Inf value? Is it safe to ignore it and use the results of the regression?
  2. Should I be concerned about the warnings that R produces, and how do I fix them? I get these warnings even after dropping the independent variables with a high percentage of NA values. I am mainly interested in the coefficients of the independent variables and their p-values.
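One possible mechanism behind question 1, offered as a sketch rather than a diagnosis: with a log link, the rate is exp(linear predictor), which overflows to Inf in double precision once the linear predictor exceeds roughly 709.78. A single such row is enough to make the summed log-likelihood -Inf, and 12 million rows are far more likely to contain an extreme covariate value than a 50,000-row sample. The helper `pospois_loglik` below is hypothetical and hand-written in base R; it is not VGAM's internal code, and all values are made up.

```r
# Log-likelihood of a zero-truncated (positive) Poisson with a log link.
# Hypothetical helper for illustration; eta is the linear predictor.
pospois_loglik <- function(y, eta) {
  lambda <- exp(eta)  # overflows to Inf once eta exceeds about 709.78
  sum(y * eta - lambda - lgamma(y + 1) - log1p(-exp(-lambda)))
}

# A few well-behaved rows: the log-likelihood is finite
y   <- c(1, 2, 1, 3, 2)
eta <- c(0.2, 0.8, 0.1, 1.1, 0.6)
ll_ok <- pospois_loglik(y, eta)

# The same rows plus one row with an extreme linear predictor
ll_bad <- pospois_loglik(c(y, 2), c(eta, 800))

print(is.finite(ll_ok))  # TRUE
print(ll_bad)            # -Inf
```

If something like this is happening, checking `range()` of each predictor and of the fitted linear predictors, and rescaling predictors with very large magnitudes, would be natural first steps.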
