marginaleffects package gives "NaN" estimates when used with fixest package on large dataset

115 views Asked by At

I am working on a 2sls IV model with interacting fixed effects, where the endogenous variable interacts with a binary variable. I am able to do this without any problems using feols() from the fixest package. I then try to get predictions for different sub-groups using the marginaleffects package, given its compatibility with fixest. However, this only works when I:

  1. limit my dataset to 50,000 observations, or
  2. do not use fixed effect interactions.

Unfortunately my dataset has over a million observations and I need to use the fixed effect interactions. With this the marginaleffects package results in NaN or NAs. I would be grateful for any advise on how I could proceed!

I used the below code:

#The main model
m1 = feols(y ~ x1 | fe1^fe2 |x2_end*x3_end ~ x2_iv*x3_end, data)

#Calculate marginal effects
avg_predictions(m1)
summary(marginaleffects(m1, variables = "x1"))
predictions(m1, newdata = datagrid(x2_end=unique, x3_end=unique))
predictions(m1, newdata = datagrid(x2_end=unique))
predictions(m1, newdata = datagrid(x3_end=unique))

#Alternative model spec
m2 = feols(y ~ x1 | fe1^fe2 |x2_end*x3_end ~ x2_iv*x3_end, data[1:50000,])
m3 = feols(y ~ x1 | fe1+fe2 |x2_end*x3_end ~ x2_iv*x3_end, data)

All the marginal effects command produced NaN or NA results for m1, but work for m2 & m3.

1

There are 1 answers

1
Vincent On BEST ANSWER

I have updated the development version of the marginaleffects package to return more informative error messages. You can install it with:

remotes::install_github("vincentarelbundock/marginaleffects")

Restart R completely for the changes to take effect.

Then, you’ll hopefully see that the error message tells you exactly what you must do to fix the problem:

library(marginaleffects)
library(fixest)
ptdata = readRDS("ptdata_vab.rds")

baseint_fe = feols(inventor_foc_appNtk5 ~ prior_rev_n | examiner_art_unit^appYear |
    ioc * gender_code ~ ear * gender_code, ptdata)

predictions(baseint_fe, by = c("ioc", "gender_code"))
# Error: Unable to compute predicted values with this model. You can try to
#   supply a different dataset to the `newdata` argument.  This error was
#   also raised:
#   
#   Error in predict.fixest(object = model, newdata = newdata, type = type)
#   : 
#   You cannot use predict() based on the initial regression since the
#   fixed-effect 'examiner_art_unit^appYear' was combined using an algorithm
#   dropping the FE values (but fast). Please re-run the regression using
#   the argument 'combine.quick=FALSE'.
# 
#   
#   Bug Tracker:
#   https://github.com/vincentarelbundock/marginaleffects/issues

Follow the error message instructions and fit the model again with combine.quick=FALSE:

baseint_fe = feols(inventor_foc_appNtk5 ~  prior_rev_n |examiner_art_unit^appYear|
                   ioc*gender_code ~ ear*gender_code, ptdata,
    combine.quick=FALSE)
predictions(baseint_fe, by = c("ioc","gender_code"))
# 
#           ioc gender_code Estimate Std. Error      z Pr(>|z|)    S 2.5 % 97.5 %
#  Successful             F     3.00     0.0115 261.15   <0.001  Inf  2.97   3.02
#  Successful             M     3.06     0.0665  45.93   <0.001  Inf  2.93   3.19
#  Unsuccessful           F     1.66     0.2709   6.13   <0.001 30.1  1.13   2.19
#  Unsuccessful           M     2.26     0.2071  10.91   <0.001 89.6  1.85   2.66
# 
# Columns: ioc, gender_code, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
# Type:  response