Weighted dataset after IPTW using weightit?


I'm trying to get a weighted dataset after IPTW using weightit. Unfortunately, I'm not even sure where to start. Any help would be appreciated.

library(WeightIt)
library(cobalt)
library(survey)

W.out <- weightit(treat ~ age + educ + race + married + nodegree + re74 + re75,
                  data = lalonde, estimand = "ATT", method = "ps")

bal.tab(W.out)

# pre-weighting dataset
lalonde

# post-weighting dataset??

There is 1 answer

Answered by socialscientist (best answer)

The weightit() function estimates balancing weights. With method = "ps", it fits a propensity score model and transforms the estimated propensity scores into weights for the requested estimand (here, the ATT). See ?method_ps for details on how the weights are computed (in newer WeightIt releases the same method is documented under ?method_glm). You can extract the weights and store them as a column in a data.frame with data.frame(w = W.out[["weights"]]). The result is a numeric vector with one weight per row of the data you passed in (lalonde has no missing values), so you can also attach it to the original data with lalonde$w <- W.out[["weights"]].
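As a rough sketch of the usual workflow, the weights can be added to the original data as one extra column and then passed to the outcome model. The column name ipw and the outcome model re78 ~ treat are assumptions made for illustration, not part of the question; svydesign() and svyglm() come from the survey package the question already loads.

```r
library(WeightIt)
library(cobalt)   # provides the lalonde data used in the question
library(survey)

data("lalonde", package = "cobalt")

W.out <- weightit(treat ~ age + educ + race + married + nodegree + re74 + re75,
                  data = lalonde, estimand = "ATT", method = "ps")

# Keep the original rows untouched and add the weights as a new column.
# "ipw" is a hypothetical column name chosen for this example.
lalonde_w <- lalonde
lalonde_w$ipw <- W.out[["weights"]]

# Pass the weights to the analysis instead of altering the covariates.
# The outcome model below (re78 on treat) is only an illustration.
design <- svydesign(ids = ~1, weights = ~ipw, data = lalonde_w)
fit <- svyglm(re78 ~ treat, design = design)
summary(fit)
```

Here lalonde_w is probably the closest thing to a "weighted dataset" in practice: the same rows as lalonde plus a weight column that downstream functions can use.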

What you mean by "weighted dataset" is ambiguous for two reasons. First, analyses that use these weights typically do not produce a new dataset. Instead, the weights scale each row's contribution to the likelihood (or to the estimating equations). This is substantively different from analyzing a dataset in which each row's values have been multiplied by its weight, and the two approaches give different results for many models. Second, you are asking for a weighted version of a dataset that contains non-numeric columns. For example, lalonde$race is categorical (black, hispan, white), and multiplying 5 * "black" doesn't make sense.
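The first point can be illustrated with a small base R sketch. The toy data and weights below are made up for illustration and have nothing to do with lalonde; the point is only that a weighted fit and a fit on weight-multiplied data are different things.

```r
# Toy data; all values and weights here are hypothetical
toy <- data.frame(y = c(1, 2, 4, 8), x = c(0, 1, 2, 3))
w   <- c(1, 1, 2, 4)

# Weights enter the estimation: weighted least squares
fit_weighted <- lm(y ~ x, data = toy, weights = w)

# Every value multiplied by its row's weight, followed by an unweighted fit.
# A vector of length nrow(toy) is recycled down each column.
toy_scaled <- toy * w
fit_scaled <- lm(y ~ x, data = toy_scaled)

# The two sets of coefficients differ
coef(fit_weighted)
coef(fit_scaled)

# The same distinction for a simple mean
weighted.mean(toy$y, w)   # weighted mean of y
mean(toy$y * w)           # mean of the scaled values, a different quantity
```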

If you are indeed intent on multiplying every value in every row of your data by that row's weight, you first need to convert race to numeric indicator columns and drop the original race column. Then you can apply sweep():

library(dplyr)
# Replace the categorical race column with 0/1 indicator columns
df <- lalonde %>%
  mutate(black  = if_else(race == "black", 1, 0),
         hispan = if_else(race == "hispan", 1, 0),
         white  = if_else(race == "white", 1, 0)) %>%
  select(-race)
   

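# MARGIN = 1 applies one weight per row (MARGIN = 2 would expect one value per column)
sweep(df, MARGIN = 1, STATS = W.out[["weights"]], FUN = `*`)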