I'm doing a homework assignment where I'm asked to use the bootstrap on a support vector machine to estimate class probabilities. I've managed that. Next, I am asked to use these probabilities and the true test set labels to plot an ROC curve for this SVM model (using the packages e1071 and ROCR). What I struggle with is how to use these probabilities to construct an ROCR::prediction object, which I need in order to construct an ROCR::performance object, which in turn I need to plot the ROC curve.
I feel really stuck. Do I need to use these bootstrapped class probabilities to fit a new SVM? If so, how? If not, how do I get from these class probabilities to an ROC curve?
A reproducible example:
set.seed(123)
library(e1071)
library(ROCR)
library(purrr)
### make some data
category_labels <- sample(c(-1, 1), 1000, replace = TRUE)
predictor1 <- rnorm(1000, 0, 0.1)
predictor2 <- rnorm(1000, 0, 0.1)
my_df <- as.data.frame(cbind(category_labels, predictor1, predictor2))
### 50/50 training/testing split
train <- sample(nrow(my_df), 500)
df_train <- my_df[train,]
df_test <- my_df[-train,]
### make 200 bootstrap datasets
df_train_boot <- replicate(200, df_train[sample(500, 500, replace = TRUE),], simplify = FALSE)
### make helper function for bootstrap
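calculate_class_prob <- function(x){
  # category_labels is numeric, so svm() would default to regression;
  # request classification explicitly so predict() returns class labels
  tmp_fit <- svm(category_labels ~ ., data = x, kernel = "radial", cost = 0.1,
                 type = "C-classification")
  tmp_pred <- predict(tmp_fit, newdata = df_test)
  return(tmp_pred)
}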
### Run bootstrap
bootstrap_class_prob <- map_dfc(.x = df_train_boot, .f = calculate_class_prob)
### Get class probability
minusones <- sum(unlist(lapply(lapply(bootstrap_class_prob, table), "[[", 1)))/200/NROW(bootstrap_class_prob)
ones <- sum(unlist(lapply(lapply(bootstrap_class_prob, table), "[[", 2)))/200/NROW(bootstrap_class_prob)
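For context on what ROCR expects: ROCR::prediction takes one score per test observation together with the matching true labels, not a single overall proportion. Below is a minimal, self-contained sketch of that step. It is not the assignment's intended solution. The matrix `votes` is a hypothetical, simulated stand-in for the bootstrap predictions (one row per test observation, one column per bootstrap fit), and `p_agree`, `prob_one`, `rocr_pred` and `rocr_perf` are names invented here for illustration.

```r
library(ROCR)

set.seed(1)

# Hypothetical stand-in for the bootstrap output (simulated, NOT real SVM fits):
# one row per test observation, one column per bootstrap fit.
n_test <- 100
n_boot <- 200
true_labels <- sample(c(-1, 1), n_test, replace = TRUE)

# Each observation gets its own chance that a bootstrap fit votes for its true label.
p_agree <- runif(n_test, 0.3, 0.9)
votes <- sapply(seq_len(n_boot), function(b) {
  ifelse(runif(n_test) < p_agree, true_labels, -true_labels)
})

# Per-observation estimate of P(class = 1): share of bootstrap fits voting 1.
prob_one <- rowMeans(votes == 1)

# ROCR needs one score per observation plus the matching true labels.
rocr_pred <- prediction(predictions = prob_one, labels = true_labels)
rocr_perf <- performance(rocr_pred, measure = "tpr", x.measure = "fpr")
plot(rocr_perf)

# Area under the curve, extracted from the S4 performance object.
auc <- performance(rocr_pred, measure = "auc")@y.values[[1]]
```

With the objects from the example above, the analogue of `prob_one` would presumably be the per-row share of bootstrap predictions equal to 1 in bootstrap_class_prob, paired with df_test$category_labels as the labels.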