Why is tidymodels with a ranger engine so much slower than ranger?


I'm taking a first look at tidymodels. The alternative for my current project would be calling ranger directly. In a test run, fitting a classification random forest through a tidymodels workflow with the ranger engine takes roughly ten times as long as calling ranger() directly, on 600 rows resampled from the classic iris dataset. Why is that?

library(tidymodels)
library(ranger)

# Make example data
data("iris")
mydata <- iris[sample(nrow(iris), 600, replace = TRUE), ]

# Recipe 
myrecipe <- mydata %>% recipe( Species ~ . )

# Setting a Ranger RF model
myRF <- rand_forest( trees = 300, mtry = 3, min_n = 1) %>% 
  set_mode("classification") %>% 
  set_engine("ranger")

# Setting a workflow
myworkflow <- workflow() %>% 
  add_model(myRF) %>% 
  add_recipe(myrecipe)

# Compare base ranger and tidy setup

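# Time the direct ranger call
time <- Sys.time()
fit_ranger <- ranger(Species ~ ., data = mydata, probability = TRUE,
                     mtry = 3, num.trees = 300, min.node.size = 1)
# units must be named: the third positional argument of difftime() is tz
ranger_time <- difftime(Sys.time(), time, units = "secs")

# Time the tidymodels workflow fit
time <- Sys.time()
fit_tidy <- myworkflow %>% 
  fit(data = mydata)
tidy_time <- difftime(Sys.time(), time, units = "secs")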

tidy_time
ranger_time
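
One thing I have not ruled out is that the two calls run with a different number of threads, which would make the comparison unfair. Below is a minimal sketch of a more controlled version. It assumes that passing num.threads through set_engine() forwards it to ranger(), and the value of 1 (stored in n_threads, a name I made up for this example) is only there to make both runs comparable. It is still a single timing per approach, so it is a rough check rather than a proper benchmark.

```r
library(tidymodels)
library(ranger)

# Same example data as above, with a seed so the resample is reproducible
set.seed(1)
mydata <- iris[sample(nrow(iris), 600, replace = TRUE), ]

# Assumption: forcing the same thread count on both sides removes one
# possible source of the gap; 1 is an arbitrary choice for comparability
n_threads <- 1

# Model spec with the thread count passed as an engine argument
rf_spec <- rand_forest(trees = 300, mtry = 3, min_n = 1) %>%
  set_mode("classification") %>%
  set_engine("ranger", num.threads = n_threads)

wf <- workflow() %>%
  add_model(rf_spec) %>%
  add_recipe(recipe(Species ~ ., data = mydata))

# system.time() reports elapsed wall-clock seconds for each fit
ranger_time <- system.time(
  fit_ranger <- ranger(Species ~ ., data = mydata, probability = TRUE,
                       mtry = 3, num.trees = 300, min.node.size = 1,
                       num.threads = n_threads)
)["elapsed"]

tidy_time <- system.time(
  fit_tidy <- fit(wf, data = mydata)
)["elapsed"]

# Elapsed seconds side by side
c(ranger = unname(ranger_time), tidymodels = unname(tidy_time))
```

If the gap shrinks a lot with matched thread counts, the difference is mostly about defaults. If it stays, it is more likely fixed overhead from the recipe and workflow steps, which would matter less on larger data.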