Compute and display mean of dependent variable in modelsummary output tables in R

845 views Asked by At

I am using the feols command from the fixest package in R to estimate fixed effect regressions, in most cases with weights. In order to output my results, I am mainly using the modelsummary command from the modelsummary package. I have figured out how to use the add_rows feature to add further information about the models (in my cases, I'm using it to add indicators for the presence of control variables in a given model, in a similar fashion to how modelsummary can natively indicate the presence of fixed effects in a given model).

However, I would also like to add, to each model, the mean of the dependent variable of the model, estimated on the same observations as the given model is estimated (i.e. accounting for that some observations drop because of missing values, and that this can differ from model to model depending on the structure of missing values), and preferably only for observations with my treatment variable set to == 0. How could I achieve this?

The basic regression structure I have is as follows (with some specifications being simpler, e.g. without the Municipality.ID):

model = feols(depvar ~ i(Year.factor, Treatment.dummy, ref ='2007') + Year.factor + Treatment.dummy + Control.var, data = subset(data.frame, condition < limit), weights = Weight.var, panel.id = c(Year.factor, Municipality.ID), cluster = ~Municipality.ID)

I loop these estimations to the list of models mdls. The modelsummary output code is more or less as follows:

mdl.stats <- list(
  list("raw" = "nobs", "clean" = "Observations", "fmt" = 0),
  list("raw" = "r.squared", "clean" = "R Squared", "fmt" = 3),
  list("raw" = "r2", "clean" = "R Squared", "fmt" = 3),
  list("raw" = "r2.within", "clean" = "R2", "fmt" = 3),
  list("raw" = "within.r.squared", "clean" = "R2 (within)", "fmt" = 3),
  list("raw" = "r.squared.within", "clean" = "R2 (within)", "fmt" = 3),
  list("raw" = "FE: TERYT.f", "clean" = "Municipality FE", "fmt" = 0))
cntrls <- rep(c("", "X", ""), times=length(mdls)/3)
cntrls <- as.data.frame(t(c("Controls", cntrls)))
cntrls <- set_names(cntrls, c("Coefficients", names(mdls)))

modelsummary::modelsummary(mdls, gof_map = mdl.stats, stars = TRUE, output = "latex", booktabs = TRUE, add_rows = cntrls)
2

There are 2 answers

3
Rafael Bressan On BEST ANSWER

Sorry I cannot comment on Vincent's answer. Fixest provides the "mean of dependent variable" fitstat under "my", such that your glance_custom function can be simplified to:

glance_custom.fixest <- function(x, ...) {
  out <- data.frame("Mean(DV)" = as.numeric(fitstat(x, type = "my")))
  return(out)
}

and there is no need to depend on insight.

6
Vincent On

You can of course create your own rows manually and add them to the table with the add_rows argument. This just requires a bit of creativity with base R functions in order to create the appropriate data frame.

If you are looking for a more “automated” strategy, the Customizing Existing Models section of the modelsummary website describes a very powerful mechanism to add new information to the bottom of the table. In a nutshell: create a function called glance_custom.fixest which takes a single model object as input and returns a data.frame with only one row and each of the new statistics in different columns.

In this example, I use the find_response() and get_data() functions from the insight package to make it easier to find the dependent variables, extract the data, and compute the mean:

library(modelsummary)
library(fixest)
library(insight)

glance_custom.fixest <- function(x, ...) {
    dv <- insight::find_response(x)
    dat <- insight::get_data(x)
    out <- data.frame("Mean(DV)" = mean(dat[[dv]]), check.names = FALSE)
    return(out)
}

mod <- list(
    feols(mpg ~ hp, data = mtcars[mtcars$cyl == 4,]),
    feols(mpg ~ hp, data = mtcars[mtcars$cyl == 6,]),
    feols(mpg ~ hp, data = mtcars[mtcars$cyl == 8,]))

modelsummary(mod, output = "markdown")
Model 1 Model 2 Model 3
(Intercept) 35.983 20.674 18.080
(5.201) (3.304) (2.988)
hp -0.113 -0.008 -0.014
(0.061) (0.027) (0.014)
Num.Obs. 11 7 14
R2 0.274 0.016 0.080
R2 Adj. 0.193 -0.181 0.004
AIC 63.8 27.9 67.8
BIC 64.6 27.8 69.1
Log.Lik. -29.891 -11.954 -31.920
Std.Errors IID IID IID
Mean(DV) 26.664 19.743 15.100