Balance table with frequencies and proportions after weighting

In many scientific papers, covariate balance is presented in Table 1 before and after weighting.
Continuous variables, for example, are presented using the mean and the standard deviation, and binary variables using frequency and proportions.

I do not know how to conveniently display frequencies and proportions after weighting, because the disp argument in bal.tab(..., disp = c("means", "sds")) is limited to "means" and "sds".

Example data:

library(cobalt)
library(dplyr)

set.seed(123)
lalonde <- cbind(lalonde,
                 event = sample(c(0,1), size=614, replace=TRUE, prob=c(0.84,0.16)),
                 time = runif(614, min=10, max=365))

formula <- treat ~ age + educ + race + married + nodegree + re74 + re75 + re78

# PS
lalonde$pscore <- glm(formula, data = lalonde,
                      family = binomial(link = "logit"))$fitted.values

# Calculate weights
lalonde$weight <- ifelse(lalonde$treat == 1,
                         pmin(lalonde$pscore, 1 - lalonde$pscore) / lalonde$pscore,
                         pmin(lalonde$pscore, 1 - lalonde$pscore) / (1 - lalonde$pscore))

I used bal.tab() to display means and SDs for continuous variables. IMO, for binary variables, the weighted mean (e.g. M.0.Adj) is a weighted rate.

bal.tab(formula, data = lalonde, thresholds = c(m = .1), un = TRUE, disp = c("means", "sds"), weights = lalonde$weight)

which results, for example, in:

             Type         M.0.Adj  SD.0.Adj   M.1.Adj  SD.1.Adj Diff.Adj
married      Binary       0.2475         .    0.2581         .   0.0105

Is there a solution to derive the frequency and proportions for binary variables?

There are 3 answers

Noah On BEST ANSWER

The weighted mean of a binary variable is the weighted proportion of units with that characteristic. It doesn't make sense to request a weighted frequency. I describe why in this answer. There is no principled way to do it, and it is not useful information anyway, so you should not attempt to report it. I don't understand what you mean by saying the weighted mean is a weighted rate and not a weighted proportion. Use bal.tab() and report the weighted mean as a weighted proportion. This is best practice and what all papers that use IPW do.
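To make the equivalence concrete, here is a minimal base-R sketch. The vectors married and w are made-up toy values (they are not taken from lalonde); the point is only that the weighted mean of a 0/1 indicator equals the share of the total weight carried by the units coded 1.

```r
# Toy data: hypothetical values for illustration only
married <- c(1, 0, 0, 1, 1, 0)                 # binary covariate
w       <- c(0.5, 1.0, 0.25, 0.75, 1.0, 0.5)   # arbitrary positive weights

# Weighted mean of the 0/1 indicator
wm <- weighted.mean(married, w)

# Weighted proportion: weight among married units / total weight
wp <- sum(w[married == 1]) / sum(w)

# The two quantities are the same number
all.equal(wm, wp)
```

So the M.0.Adj and M.1.Adj columns that bal.tab() prints for a binary variable can be read directly as weighted proportions.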

geek45 On

---------------UPDATE 13.11.2023
Maybe it's just like this?

library(survey)
library(gtsummary)
library(dplyr)

survey::svydesign(~1, data = as.data.frame(lalonde), weights = ~weight) %>%
tbl_svysummary(by = treat, percent = "col", include = c(married)) %>% 
print()

# Result
Characteristic  0, N = 109  1, N = 110
married            27 (25%) 28 (26%)

Still confused.
How do the authors get the frequency and proportions in Table1 for the weighted sample?
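As far as I can tell, the "N" and the counts in a survey-weighted table like the one above are sums of weights rather than numbers of people. A minimal base-R sketch with made-up toy values (treat, married and w below are hypothetical, not the lalonde data) shows the arithmetic I assume is happening:

```r
# Toy data: hypothetical values for illustration only
treat   <- c(0, 0, 0, 1, 1, 1)
married <- c(1, 0, 0, 1, 1, 0)
w       <- c(0.5, 1.0, 0.5, 0.75, 0.25, 1.0)

# Weighted "N" per group: the sum of the weights in that group
n_w <- tapply(w, treat, sum)

# Weighted "count" of married units: sum of weights where married == 1
k_w <- tapply(w * married, treat, sum)

# Weighted proportion per group
p_w <- k_w / n_w

# Rescaling the weights changes the pseudo-counts but not the proportions
k_w10 <- tapply(10 * w * married, treat, sum)
p_w10 <- k_w10 / tapply(10 * w, treat, sum)

cbind(n_w, k_w, p_w, k_w10, p_w10)
```

The last two columns illustrate why the weighted "frequency" is arbitrary: multiplying every weight by a constant multiplies the count by the same constant, while the proportion stays the same.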

jay.sf On

Perhaps something simple and effective.

# Uses the data frame tb defined under "Data:" below
par(mar=c(5, 5, 4, 2))
plot(tb$Diff.Adj, seq_len(nrow(tb)), xlim=c(-.1, .1), ylim=c(.5, nrow(tb) + .5), 
     yaxt='n', ylab='', xlab='adj. dif', pch=19, main='Balance')
axis(2, seq_len(nrow(tb)), labels=rownames(tb), las=1)
abline(v=0)


Data:

tb <- data.frame(
  Type = c("Binary", "Binary"),
  M.0.Adj = c(0.2475, 0.2675),
  SD.0.Adj = c(".", "."),
  M.1.Adj = c(0.2581, 0.2581),
  SD.1.Adj = c(".", "."),
  Diff.Adj = c(0.0105, -0.0094),
  row.names = c("married", "single")
)