Does purrr::map change an object type?

I've noticed something strange while doing some regression analysis. When I estimate a regression on its own, and then estimate that same regression inside purrr::map and extract the element, the two resulting objects are not identical. My question is why this is the case, and whether it SHOULD be the case.

The main reason I ask this is because some packages are having issues pulling information from estimations that are extracted from purrr::map, but not when I estimate them individually. Here is a small example with some nonsensical regressions:

library(fixest)
library(tidyverse)

## creating a formula for a regression example
formula <- as.formula(paste0(
  "mpg", "~",
  paste("cyl", collapse = "+"),
  paste("|"), paste(c("gear", "carb"), collapse = "+")))

## estimating the regression and saving it
mtcars_formula <- feols(formula, cluster = "gear", data = mtcars)

## estimating the same regression twice, but using map
mtcars_list_map <- map(list("gear", "gear"), ~ feols(formula, cluster = ., data = mtcars))

## extracting the first element of the list
is_identical_1 <- mtcars_list_map %>% 
  pluck(1)


## THESE ARE NOT IDENTICAL
identical(mtcars_formula, is_identical_1)

I am tagging this with fixest package as well, only because this may be package specific...

Best answer, by langtang

The differences largely come down to differences in environment. For example, the third element of each object (i.e. of mtcars_formula and is_identical_1) is the formula mpg ~ cyl, and in fact mtcars_formula[[3]] == is_identical_1[[3]] returns TRUE. However, you will see that the two formulas are associated with different environments, and identical() takes those into account:

> mtcars_formula[[3]] == is_identical_1[[3]]
[1] TRUE
> environment(mtcars_formula[[3]])
<environment: 0x560a2490ef40>
> environment(is_identical_1[[3]])
<environment: 0x560a2554d810>
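
The same effect can be reproduced without fixest. The following is a minimal base R sketch (the names f_global, f_local and f_aligned are made up for illustration): two formulas with the same text but different environments fail identical(), and pass once the environments are aligned.

```r
# Two formulas with the same text, created in different environments
f_global <- mpg ~ cyl
f_local  <- local(mpg ~ cyl)   # local() evaluates in a fresh environment

# The deparsed text of the two formulas is the same ...
same_text <- identical(deparse(f_global), deparse(f_local))

# ... but identical() also compares attributes, including the environment
same_object <- identical(f_global, f_local)

# Point a copy at the same environment and compare again
f_aligned <- f_local
environment(f_aligned) <- environment(f_global)
same_after_align <- identical(f_global, f_aligned)

print(c(same_text = same_text,
        same_object = same_object,
        same_after_align = same_after_align))
```

Here same_text and same_after_align come out TRUE while same_object is FALSE, which suggests the environment alone is enough to make identical() fail.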

Whether you consider these differences "trivial" depends on your use case, but you can collect all of the elements that differ like this:

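# collect every element that differs between the two objects, keyed by name
differences <- list()
for (i in seq_along(mtcars_formula)) {
  if (!identical(mtcars_formula[[i]], is_identical_1[[i]])) {
    differences[[names(mtcars_formula)[i]]] <- list(mtcars_formula[[i]], is_identical_1[[i]])
  }
}
# names(differences) then lists the elements that are not identical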

One element that is indeed different is the reported call (the 4th element):

> mtcars_formula[[4]] == is_identical_1[[4]]
[1] FALSE
> c(mtcars_formula[[4]], is_identical_1[[4]])
[[1]]
feols(fml = formula, data = mtcars, cluster = "gear")

[[2]]
feols(fml = formula, data = mtcars, cluster = .)

This may have something to do with the error you report in the comments above, associated with fwildclusterboot::boottest(). Note that the call stored in the object created using map() records cluster = . instead of cluster = "gear", so code that later re-evaluates the stored call outside the lambda would likely not find a value for the dot.
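
To see how the placeholder ends up in the stored call, here is a small base R sketch. It assumes the model function records its call with match.call(); fit() is a hypothetical stand-in for feols(), and lapply() with an explicit argument .x stands in for map() with a ~ lambda.

```r
# Hypothetical stand-in for a model function that stores its own call
fit <- function(cluster) {
  list(call = match.call(), value = cluster)
}

# Called directly: the literal "gear" is recorded in the call
direct <- fit(cluster = "gear")

# Called inside a lambda: the unevaluated argument name is recorded instead
mapped <- lapply(list("gear"), function(.x) fit(cluster = .x))[[1]]

print(direct$call)
print(mapped$call)

# The evaluated values agree, only the recorded calls differ
print(identical(direct$value, mapped$value))
print(identical(direct$call, mapped$call))
```

The stored call holds whatever expression was written at the call site, not the value it evaluated to, which is why the two objects carry the same estimates but different calls.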

One way to get around this would be to do something like this:

mtcars_list_map <- map(list("gear", "gear"), function(x) {
  # create the model
  model = feols(formula, cluster = x, data = mtcars)
  # manipulate the call object
  model$call$cluster=x
  # return the model
  model
})
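
An alternative to patching the call afterwards is to substitute the value into the call before it is evaluated, using base R's bquote(). The sketch below again uses a hypothetical fit() stand-in rather than feols(), so whether the resulting fixest object satisfies fwildclusterboot::boottest() is an assumption that has not been checked here.

```r
# Hypothetical stand-in for a model function that stores its own call
fit <- function(cluster) {
  list(call = match.call(), value = cluster)
}

# bquote() replaces .(x) with the value of x, so the evaluated call
# contains the literal string rather than the symbol x
models <- lapply(list("gear", "carb"), function(x) {
  eval(bquote(fit(cluster = .(x))))
})

print(models[[1]]$call)
print(models[[2]]$call)

# Untested assumption: the analogous feols() version would be
# eval(bquote(feols(formula, cluster = .(x), data = mtcars)))
```

Only .(x) is substituted, so symbols such as formula and mtcars would stay as names in the recorded call instead of being expanded inline.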