How to use bootstrapped SVM probabilities to construct an ROC curve?

I'm doing a homework assignment where I'm asked to use the bootstrap on a support vector machine to estimate the class probability. I've managed that. Next, I am asked to use these probabilities and the true test set labels to plot an ROC curve for this SVM model (using the packages e1071 and ROCR). What I struggle with is how to use these probabilities to construct a ROCR::prediction object, which I will need to construct an ROCR::performance object, which I will need to plot the ROC curve.

I feel like I'm really stuck. Will I need to use these bootstrapped class probabilities to create a new SVM? If so, how? If not, how do I get from these class probabilities to an ROC curve?

A reproducible example:

set.seed(123)
library(e1071)
library(ROCR)
library(purrr)


### make some data

category_labels <- sample(c(-1, 1), 1000, replace = TRUE)
predictor1 <- rnorm(1000, 0, 0.1)
predictor2 <- rnorm(1000, 0, 0.1)

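# store the labels as a factor so that svm() fits a classifier rather than a regression
my_df <- data.frame(category_labels = factor(category_labels), predictor1, predictor2)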

### 50/50 training/testing split 

train <- sample(nrow(my_df), 500)
df_train <- my_df[train,]
df_test <- my_df[-train,]

### make 200 bootstrap datasets

df_train_boot <- replicate(200, df_train[sample(500, 500, replace = TRUE),], simplify = FALSE)

### make helper function for bootstrap

calculate_class_prob <- function(x){
  tmp_fit <- svm(category_labels ~ ., data = x, kernel = "radial", cost = 0.1)
  tmp_pred <- predict(tmp_fit, newdata = df_test)
  return(tmp_pred)
}

### Run bootstrap

bootstrap_class_prob <- map_dfc(.x = df_train_boot, .f = calculate_class_prob)

### Get class probability

minusones <- sum(unlist(lapply(lapply(bootstrap_class_prob, table), "[[", 1)))/200/NROW(bootstrap_class_prob)
ones <- sum(unlist(lapply(lapply(bootstrap_class_prob, table), "[[", 2)))/200/NROW(bootstrap_class_prob)
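
One possible way to get from bootstrap predictions to an ROC curve, as a minimal sketch rather than a definitive answer: ROCR::prediction expects one score per test observation, so instead of a single overall proportion one could take, for each test row, the share of bootstrap replicates that predicted class 1. The names n_test, n_boot, true_labels, boot_pred, and prob_one below are hypothetical, and the predictions are simulated so the block runs on its own. With the objects from the question, the analogous inputs would presumably be rowMeans(bootstrap_class_prob == 1) and df_test$category_labels.

```r
library(ROCR)

set.seed(1)

# Hypothetical stand-ins for the objects in the question:
#   true_labels ~ df_test$category_labels
#   boot_pred   ~ bootstrap_class_prob (rows = test observations,
#                 columns = bootstrap replicates)
n_test <- 50
n_boot <- 200
true_labels <- sample(c(-1, 1), n_test, replace = TRUE)

# Simulated replicate-level predictions (an assumption for illustration):
# each replicate predicts the true class with probability 0.7
p_one <- ifelse(true_labels == 1, 0.7, 0.3)
boot_pred <- matrix(
  ifelse(runif(n_test * n_boot) < rep(p_one, times = n_boot), 1, -1),
  nrow = n_test, ncol = n_boot
)

# Per-observation score: share of replicates that predicted class 1
prob_one <- rowMeans(boot_pred == 1)

# label.ordering = c(negative, positive) makes 1 the positive class
pred_obj <- prediction(predictions = prob_one, labels = true_labels,
                       label.ordering = c(-1, 1))
roc_perf <- performance(pred_obj, measure = "tpr", x.measure = "fpr")
auc <- performance(pred_obj, measure = "auc")@y.values[[1]]

plot(roc_perf)
abline(a = 0, b = 1, lty = 2)  # chance line for reference
```

Under this reading no second SVM is needed: the bootstrap only supplies the per-observation scores, and ROCR sweeps a threshold over them. Note that the predictors in the reproducible example are independent of the labels, so the curve for that data would be expected to sit near the diagonal.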