averaging imputation of missing values


I have a few questions that I couldn't find answered in the documentation, unless I'm missing something or don't understand the imputation process/logic.

Basically, the most important one is this: since the 'imputed' values sometimes differ between imputations, I'd like to take the average if the variable is numeric, or the mode if it is categorical.

All the examples that I see use "complete(miced_model, 1)". If I'm running the mice model with 5 or 10 different imputations (m = 5 or m = 10), I don't see the point in just picking one. I'd like the average of all of them.

Can anyone show me how to do this?

set.seed(2016)
library(mice)
nhanes # this is the dataset
nhanes[5,1]=NA  # setting up some categorical examples
nhanes[1,1]=NA
nhanes$age = as.factor(nhanes$age)
imputed_values = mice(nhanes, m = 5, method='rf',maxit = 3)
new_nhanes = complete(imputed_values, 'long') # or repeated? or what?

# new_nhanes_fixed = ???  # wanted: new data frame with averaged values imputed rather than just the arbitrary '1st' imputation

THANKS!!


There are 2 answers

wissem

It sounds like you want to pool the results of your analysis: you run the analysis on every imputed data set and then combine the estimates into one. Read more on pooling here: https://www.r-bloggers.com/imputing-missing-data-with-r-mice-package/
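As a minimal sketch of that workflow, assuming the unmodified nhanes data that ships with mice and a purely illustrative regression model (bmi on age and chl is an assumption, not something from the question):

```r
library(mice)

# Create m = 5 imputed data sets (default imputation methods, fixed seed).
imp <- mice(nhanes, m = 5, maxit = 3, seed = 2016, printFlag = FALSE)

# Fit the same model on each imputed data set.
# The model formula is only an example, pick whatever your analysis needs.
fit <- with(imp, lm(bmi ~ age + chl))

# Combine the m sets of estimates into one using Rubin's rules.
pooled <- pool(fit)
summary(pooled)
```

The pooled standard errors account for the variation between the imputed data sets, which is what would be lost by picking a single completed data set.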

Steffen Moritz

You should look at the comment of SimonG

You are completely on the wrong track. The whole point of multiple imputation is that you end up with several different imputed datasets, and you perform your analysis on each of them.
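For completeness, here is a rough sketch of the collapse the question asks for (mean for numeric columns, mode for factors). It throws away the between-imputation variability, so it is at best something for exploration, not for inference. The helper collapse_values is hypothetical, and the default mice methods are used here instead of the question's method = 'rf':

```r
library(mice)

# Recreate the question's setup: age as a factor with two extra missing values.
dat <- nhanes
dat[c(1, 5), "age"] <- NA
dat$age <- as.factor(dat$age)

# Assumption: default imputation methods (the question used method = 'rf').
imp <- mice(dat, m = 5, maxit = 3, seed = 2016, printFlag = FALSE)
long <- complete(imp, action = "long")  # stacked data with .imp and .id columns

# Hypothetical helper: mean for numeric columns, most frequent level for factors
# (ties go to the first level in level order).
collapse_values <- function(x) {
  if (is.numeric(x)) return(mean(x))
  counts <- table(x)
  factor(names(counts)[which.max(counts)], levels = levels(x))
}

vars <- names(dat)
# Group the stacked rows by original row, keeping the original row order.
by_row <- split(long[vars], factor(long$.id, levels = unique(long$.id)))
averaged <- do.call(rbind, lapply(by_row, function(rows) {
  data.frame(lapply(rows, collapse_values))
}))
```

Observed values are identical in every completed data set, so only the cells that were missing are actually averaged.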

If you don't need multiple imputation, you can directly use single imputation methods (for example the kNN or irmi functions from the VIM package).
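A minimal sketch of the single-imputation route, assuming the VIM package is installed. The sleep data set bundled with VIM is used only as a stand-in example, and k = 5 is an arbitrary choice:

```r
library(VIM)

# Example data with missing values that ships with VIM (not from the question).
data(sleep, package = "VIM")

# k-nearest-neighbour imputation; imp_var = FALSE drops the extra
# indicator columns that flag which cells were imputed.
sleep_imputed <- kNN(sleep, k = 5, imp_var = FALSE)

# One completed data set comes back, with the same shape as the input.
colSums(is.na(sleep_imputed))
```

This gives exactly one filled-in data frame, so there is nothing to average or pool afterwards.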