R Function for Rounding Imputed Binary Variables


There is an ongoing discussion about which methods for rounding imputed binary variables are reliable. Still, the so-called Adaptive Rounding Procedure developed by Bernaards and colleagues (2007) is currently the most widely accepted solution.

The Adaptive Rounding Procedure relies on a normal approximation to the binomial distribution. That is, the imputed values of a binary variable are assigned either 0 or 1, based on the threshold given by the formula below, where x is the imputed binary variable and mean(x) is its mean:

threshold <- mean(x) - qnorm(mean(x))*sqrt(mean(x)*(1-mean(x)))

To the best of my knowledge, major R packages for imputation (such as Amelia or mice) have yet to include functions that help with the rounding of binary variables. This is a particular problem for researchers who intend to use the imputed values in logistic regression, where the dependent variable must be coded as binary.

Therefore, it makes sense to write an R function for the Bernaards formula above:

bernaards <- function(x)
{
mean(x) - qnorm(mean(x))*sqrt(mean(x)*(1-mean(x)))
}

With this function, it is much easier to calculate the threshold for an imputed binary variable with a mean of, say, .623:

bernaards(.623)
[1] 0.4711302

After calculating the threshold, the usual next step is to round the imputed values in variable x.

My question is: how can the above function be extended to include that task as well?

In other words, one can do all of the above in R with three lines of code:

threshold <- mean(df$x) - qnorm(mean(df$x))*sqrt(mean(df$x)*(1-mean(df$x)))
df$x[df$x > threshold] <- 1
df$x[df$x <= threshold] <- 0
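
For a single variable, the threshold and the recoding can be folded into one function. The following is only a minimal sketch: the name adaptive_round is hypothetical, the example values are made up, and sending values exactly equal to the threshold to 0 is my assumption rather than something taken from Bernaards et al.

```r
# Hypothetical helper: adaptive rounding for one imputed binary variable.
adaptive_round <- function(x) {
  m <- mean(x, na.rm = TRUE)
  # qnorm() is only finite when the mean lies strictly between 0 and 1
  stopifnot(m > 0, m < 1)
  threshold <- m - qnorm(m) * sqrt(m * (1 - m))
  # Assumption: values exactly at the threshold are rounded down to 0
  as.integer(x > threshold)
}

# Made-up imputed values, for illustration only
x <- c(0.10, 0.90, 0.35, 0.80, 1.05, -0.02)
rounded <- adaptive_round(x)
```

Because the comparison is vectorized, no explicit loop over the values is needed, and any remaining NA stays NA.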

It would be best if the function included the above recoding/rounding, as repeating the same process for each binary variable would be time-consuming, especially when working with large data sets. With such a function, one could simply run an extra line of code (as below) after imputation, and continue with the analyses:

bernaards(dummy1, dummy2, dummy3)
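
One possible shape for such a call, as a rough sketch rather than a tested solution: the wrapper below takes a data frame and a vector of column names instead of bare variable names. The function name bernaards_round, its arguments, and the example data are all hypothetical.

```r
# Hypothetical wrapper: apply adaptive rounding to several columns at once.
bernaards_round <- function(data, cols) {
  data[cols] <- lapply(data[cols], function(x) {
    m <- mean(x, na.rm = TRUE)
    threshold <- m - qnorm(m) * sqrt(m * (1 - m))
    # Assumption: values exactly at the threshold are rounded down to 0
    as.integer(x > threshold)
  })
  data
}

# Made-up data standing in for an imputed data set
df <- data.frame(
  dummy1 = c(0.10, 0.90, 0.35, 0.80),
  dummy2 = c(1.20, 0.60, -0.10, 0.70),
  age    = c(34, 51, 27, 45)
)
df <- bernaards_round(df, c("dummy1", "dummy2"))
```

Columns not listed in cols (here age) are left untouched, so the result can go straight into the subsequent analysis.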