I wrote this code:
library(caret)
set.seed(100)
options(warn=-1)
subsets <- c(1:5, 10, 15, 18)
ctrl <- rfeControl(functions = rfFuncs,
                   method = "repeatedcv",
                   repeats = 5,
                   verbose = FALSE)
lmProfile <- rfe(x = train_clean[trainIndex, ], y = train_clean_price$SalePrice[trainIndex],
                 sizes = subsets,
                 rfeControl = ctrl)
train_clean contains 80 variables and 1200 lines. This has been running for 30 minutes. How can I find out how long the code will take to execute?
First, some guidelines when it comes to figuring out where or why R code takes time to execute.
Read up on Rprof, the profiling tool in the utils package. It samples the call stack and will show you roughly how much time each function call is taking. Then you could try the microbenchmark package for timing small pieces of code.
But last, and probably most important: when you find yourself waiting forever for completion, kill the process and start again with the smallest possible subset of your input data. For example, try setting up
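mini_train <- train_clean[1:5,]
and see what happens, or even
mini_train <- train_clean[1:5,1:20]
Remember to subset the response vector to the same rows, or rfe will complain that x and y differ in length. Side note: I'm assuming your "80 variables and 1200 lines" means it's a 1200-by-80 data array.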
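To make the "start small" advice concrete, here is a minimal sketch in base R that times a fitting function on growing row subsets so you can see how the run time scales before committing to the full data. The helper name time_on_subsets, the simulated data, and the lm stand-in model are all my own illustration, not part of your code; for your case you would pass a wrapper around your rfe call instead, as shown in the comment.

```r
# Hypothetical helper: time fit_fun on the first n rows for each n in sizes.
time_on_subsets <- function(fit_fun, x, y, sizes) {
  elapsed <- vapply(sizes, function(n) {
    idx <- seq_len(n)
    # system.time() returns user, system and elapsed (wall-clock) seconds
    system.time(fit_fun(x[idx, , drop = FALSE], y[idx]))[["elapsed"]]
  }, numeric(1))
  data.frame(rows = sizes, seconds = elapsed)
}

# Stand-in data and model so the sketch runs without caret (assumption:
# simulated 1200-by-20 predictors and a plain linear model).
set.seed(100)
x <- as.data.frame(matrix(rnorm(1200 * 20), nrow = 1200))
y <- rnorm(1200)
fit_fun <- function(x, y) lm(y ~ ., data = cbind(x, y = y))

# For the code in the question, the wrapper would look something like:
# fit_fun <- function(x, y) rfe(x = x, y = y, sizes = subsets, rfeControl = ctrl)

timings <- time_on_subsets(fit_fun, x, y, sizes = c(100, 200, 400))
print(timings)
```

Watch how the seconds column grows as the row count doubles; that gives you a rough basis for extrapolating to 1200 rows, though the growth need not be linear. Also keep in mind the amount of work you are asking for: as far as I understand caret's defaults, repeatedcv uses 10 folds, so repeats = 5 means 50 resamples, and each resample fits a random forest on all predictors plus once per entry in subsets (8 sizes), which is on the order of 450 forest fits. Timing one run with fewer repeats or fewer sizes and multiplying up is another cheap estimate.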