In many scientific papers, covariate balance is presented in Table 1 before and after weighting.
Continuous variables, for example, are presented using the mean and the standard deviation, and binary variables using frequency and proportions.
I do not know how to conveniently display the frequencies and proportions after weighting, as bal.tab(..., disp = c("means", "sds")) is limited to "means" and "sds".
Example data:
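library(cobalt)
library(dplyr)

# Load the lalonde data shipped with cobalt (614 rows)
data("lalonde", package = "cobalt")

set.seed(123)
# Add a simulated binary event indicator and a follow-up time
lalonde <- cbind(lalonde,
                 event = sample(c(0, 1), size = 614, replace = TRUE, prob = c(0.84, 0.16)),
                 time = runif(614, min = 10, max = 365))

formula <- treat ~ age + educ + race + married + nodegree + re74 + re75 + re78

# Estimate the propensity score (PS) with logistic regression
lalonde$pscore <- glm(formula, data = lalonde,
                      family = binomial(link = "logit"))$fitted.values

# Calculate weights: min(ps, 1 - ps) divided by the probability of the
# treatment each unit actually received
lalonde$weight <- ifelse(lalonde$treat == 1,
                         pmin(lalonde$pscore, 1 - lalonde$pscore) / lalonde$pscore,
                         pmin(lalonde$pscore, 1 - lalonde$pscore) / (1 - lalonde$pscore))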
I used bal.tab to display means and SDs for continuous variables. IMO, for binary variables, the weighted mean (e.g. M.0.Adj) is a weighted rate.
bal.tab(formula, data = lalonde, thresholds = c(m = .1), un = TRUE, disp = c("means", "sds"), weights = lalonde$weight)
which results, for example, in:
          Type  M.0.Adj  SD.0.Adj  M.1.Adj  SD.1.Adj  Diff.Adj
married Binary   0.2475         .   0.2581         .    0.0105
Is there a way to obtain the frequencies and proportions for binary variables after weighting?
The weighted mean of a binary variable is the weighted proportion of units with that characteristic. It doesn't make sense to request a weighted frequency. I describe why in this answer. There is no principled way to compute one, and it is not useful information anyway, so you should not attempt to report it. I don't understand what you mean by saying the weighted mean is a weighted rate and not a weighted proportion.

Use bal.tab() and report the weighted mean as a weighted proportion. This is best practice and standard in papers that use IPW.
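To illustrate why the weighted mean of a 0/1 variable can be read directly as a weighted proportion, here is a minimal sketch in base R. The data frame d and its values are made up for illustration and are not taken from lalonde; the same calculation should apply to lalonde$married with lalonde$weight.

```r
# Toy data (hypothetical values, for illustration only)
d <- data.frame(
  treat   = c(0, 0, 0, 1, 1, 1),
  married = c(1, 0, 0, 1, 1, 0),
  w       = c(0.5, 1.0, 0.25, 0.75, 1.0, 0.5)
)

# Weighted mean of the 0/1 indicator within each treatment group
wm <- sapply(split(d, d$treat),
             function(g) weighted.mean(g$married, g$w))

# Weighted proportion: share of each group's total weight
# carried by units with married == 1
wp <- sapply(split(d, d$treat),
             function(g) sum(g$w[g$married == 1]) / sum(g$w))

# The two quantities coincide
all.equal(wm, wp)  # TRUE

# Express as percentages for a Table 1 style display
round(100 * wm, 1)
```

The two calculations are algebraically the same, so the adjusted means that bal.tab() prints for binary covariates can be multiplied by 100 and reported as weighted percentages.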