I have been doing some fairly simple ROC curve analysis: creating ROC curves, calculating the AUC with a 95% CI (2000 bootstrap replicates), and then thresholding each curve at 95% sensitivity, returning the threshold, TN, FN, FP, TP, sensitivity and specificity. This is straightforward with a single dataset, as below:
BIOMroc1 <- pROC::roc(response = DF$CT, predictor = DF$Biom.1, ci = TRUE)
pROC::auc(BIOMroc1)
pROC::ci.auc(BIOMroc1, conf.level = 0.95, method = "bootstrap", boot.n = 2000)
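For completeness, the thresholding step described above can be sketched roughly as follows. This is only one way to do it, under a few assumptions: `DF` here is a small simulated stand-in (not the real data), the outcome is coded 0/1 with higher biomarker values indicating cases, and "95% sensitivity" is taken to mean the observed cut-off with the highest specificity among those reaching at least 95% sensitivity.

```r
# Hypothetical stand-in for DF: binary outcome CT and one continuous biomarker
set.seed(1)
DF <- data.frame(CT = rep(0:1, each = 50))
DF$Biom.1 <- rnorm(100, mean = ifelse(DF$CT == 1, 1, 0))

# Assumption: 0 = control, 1 = case, and cases tend to have higher values
BIOMroc1 <- pROC::roc(response = DF$CT, predictor = DF$Biom.1,
                      levels = c(0, 1), direction = "<", quiet = TRUE)

# Coordinates at every observed threshold, one row per threshold
all_pts <- as.data.frame(pROC::coords(
  BIOMroc1, x = "all",
  ret = c("threshold", "tn", "fn", "fp", "tp", "sensitivity", "specificity"),
  transpose = FALSE
))

# Keep cut-offs with sensitivity >= 0.95, then take the most specific one
ok <- all_pts[all_pts$sensitivity >= 0.95, ]
pt <- ok[order(-ok$specificity, -ok$sensitivity), ][1, ]
print(pt)
```

Selecting from the observed thresholds, rather than asking for exactly 0.95 sensitivity, avoids relying on interpolation between points on the curve.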
However, I now want to use an estimated variable as the predictor, and forming it required imputation, so there are 10 imputed datasets. I would like to calculate the AUC and 95% CI for each dataset, as above, and then combine them. Following this, I'd like to somehow create a ROC curve and threshold it at 95% sensitivity. I am a bit lost as to where to start! Code for a replica dataset is below (it uses 5 imputations of 20 rows each).
set.seed(123)
num_imputations <- 5
num_rows <- 20

# Simulate one dataset: binary outcome CT plus six uniform biomarkers.
# Note: sd_range is accepted but not used.
generate_data <- function(num_rows, mean_range, sd_range) {
  data <- data.frame(
    CT = sample(0:1, num_rows, replace = TRUE),
    matrix(runif(num_rows * 6, min = mean_range[1], max = mean_range[2]), ncol = 6)
  )
  colnames(data)[-1] <- paste0("Biomarkers_", 1:6)
  return(data)
}

# One simulated dataset per imputation
imputed_datasets <- lapply(1:num_imputations, function(i) {
  generate_data(num_rows, c(5, 10), c(2, 4))
})

# Tag each dataset with its imputation number
imputed_datasets <- lapply(1:num_imputations, function(i) {
  data <- imputed_datasets[[i]]
  data$Imputation <- i
  return(data)
})

# Stack all imputations into one long data frame
combined_data <- do.call(rbind, imputed_datasets)
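To make the "calculate per imputation, then combine" idea concrete, here is a rough sketch of one possible approach: fit a ROC curve in each imputed dataset and pool the AUCs with Rubin's rules. Several things here are assumptions rather than part of my actual analysis: the list `imps` is a small simulated stand-in, the within-imputation variance comes from `pROC::var()` with its default DeLong method (swapped in for the bootstrap to keep the example fast), and pooling is done on the raw AUC scale rather than after a transformation.

```r
# Hypothetical imputed datasets: same outcome, predictor differs per imputation
set.seed(42)
m <- 5
CT <- rep(0:1, each = 40)
base <- rnorm(80, mean = CT)
imps <- lapply(seq_len(m), function(i) {
  data.frame(CT = CT, Biomarkers_1 = base + rnorm(80, sd = 0.3))
})

# One ROC curve per imputed dataset (assumes 1 = case, higher values = case)
fits <- lapply(imps, function(d) {
  pROC::roc(response = d$CT, predictor = d$Biomarkers_1,
            levels = c(0, 1), direction = "<", quiet = TRUE)
})

# Per-imputation AUC and its variance (DeLong by default)
auc_i <- vapply(fits, function(r) as.numeric(pROC::auc(r)), numeric(1))
var_i <- vapply(fits, function(r) as.numeric(pROC::var(r)), numeric(1))

# Rubin's rules
q_bar <- mean(auc_i)                 # pooled AUC
w     <- mean(var_i)                 # within-imputation variance
b     <- stats::var(auc_i)           # between-imputation variance
t_var <- w + (1 + 1 / m) * b         # total variance
df    <- (m - 1) * (1 + w / ((1 + 1 / m) * b))^2

# 95% CI from the t distribution, truncated to [0, 1]
half   <- stats::qt(0.975, df) * sqrt(t_var)
pooled <- c(auc = q_bar,
            lower = max(0, q_bar - half),
            upper = min(1, q_bar + half))
print(pooled)
```

With the replica data above, the same steps would apply to `imputed_datasets` with one of the `Biomarkers_` columns as the predictor. Whether the raw scale is appropriate when the AUC is close to 1 is something I am unsure about.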