How to save a parsnip model fit (from ranger)?


I have a parsnip model (from ranger), roughly from here:

# install.packages("tidymodels")
library(tidymodels)

data(cells, package = "modeldata")

rf_mod <- 
  rand_forest(trees = 100) %>% 
  set_engine("ranger") %>% 
  set_mode("classification")

set.seed(123)
cell_split <- initial_split(cells %>% select(-case), strata = class)

cell_train <- training(cell_split)

rf_fit <- 
  rf_mod %>% 
  fit(class ~ ., data = cell_train)
> class(rf_fit)
[1] "_ranger"   "model_fit"

How do I save it to disk so that I can load it at a later time?

I tried dput, and that gets an error:

dput(rf_fit, file="rf_fit.R")
rf_fit2 <- dget("rf_fit.R")
Error in missing_arg() : could not find function "missing_arg"

Sure enough, the rf_fit.R file contains a couple of missing_arg() calls, which appear to be a way of marking missing arguments. That's a side issue, though: I don't need to use dput, I just want to be able to save and load a model.


There are 3 answers

Duck (best answer)

Try this option. The save() and load() functions let you store the model and then bring it back later. Here is the code:

library(tidymodels)

data(cells, package = "modeldata")

rf_mod <- 
  rand_forest(trees = 100) %>% 
  set_engine("ranger") %>% 
  set_mode("classification")

set.seed(123)
cell_split <- initial_split(cells %>% select(-case), strata = class)

cell_train <- training(cell_split)

rf_fit <- 
  rf_mod %>% 
  fit(class ~ ., data = cell_train)

# Export option 1
save(rf_fit, file = 'Mymod.RData')
# Restore the object (load() recreates it under its original name, rf_fit)
load('Mymod.RData')

The other option is to use saveRDS() to save the model and readRDS() to load it. Unlike load(), readRDS() returns the object, so its result has to be assigned to a name:

#Export option 2
saveRDS(rf_fit, file = "Mymod.rds")
# Restore the object
rf_fit <- readRDS(file = "Mymod.rds")
hnagaty

As Duck mentioned, saveRDS() and readRDS() can be used to save/load any R object. Also save() & load() can be used for the same purpose. There are many online discussions/blogs comparing the two approaches.
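
To make the difference between the two approaches concrete, here is a minimal base-R sketch. It uses an lm fit on mtcars as a stand-in for the parsnip model, and temporary files as hypothetical paths; the only point is that load() restores objects under their original names, while readRDS() hands the object back for you to name.

```r
# Stand-in model (an assumption for illustration); any R object behaves the same way
fit <- lm(mpg ~ wt, data = mtcars)

# Hypothetical file locations
rdata_path <- tempfile(fileext = ".RData")
rds_path   <- tempfile(fileext = ".rds")

# save() stores the object together with its name
save(fit, file = rdata_path)
# saveRDS() stores only the object itself
saveRDS(fit, file = rds_path)

rm(fit)

# load() recreates `fit` in the current environment, overwriting any existing
# object with that name, and invisibly returns the names it restored
restored_names <- load(rdata_path)

# readRDS() returns the object, so you choose the name
fit_from_rds <- readRDS(rds_path)

# Both routes should give back the same coefficients
all.equal(coef(fit), coef(fit_from_rds))
```

In practice this means readRDS() is usually the safer choice when you only need one object and do not want to risk clobbering something already in your workspace.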

Simon Couch

For others that may come across this post in the future:

Some model fits in R, including several that parsnip supports as a modeling backend, require native serialization methods to be saved and reloaded properly in a new R session. saveRDS() and readRDS() will do the trick most of the time, but they fall short for model objects from packages that require native serialization.

The folks from the tidymodels team put together a new package, bundle, to provide a consistent interface for native serialization of model objects. The bundle() verb prepares a model object for serialization. You can then safely saveRDS() the bundle, readRDS() it in another R session, and call unbundle() there to get a usable model back. With a parsnip model fit mod:

library(bundle)

mod_bundle <- bundle(mod)
saveRDS(mod_bundle, file = "path/to/file.rds")

# in a new R session:
library(bundle)

mod_bundle <- readRDS("path/to/file.rds")
mod_new <- unbundle(mod_bundle)
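
One way to check that the round trip worked is to compare predictions from the original and the restored fit. The sketch below is only an illustration: it assumes the parsnip, bundle and ranger packages are installed, fits a small random forest on iris rather than the cells data from the question, and writes to a temporary file as a hypothetical path.

```r
library(parsnip)  # the "ranger" engine also needs the ranger package installed
library(bundle)

# Small illustrative fit (an assumption, not the model from the question)
spec <- set_mode(set_engine(rand_forest(trees = 100), "ranger"), "classification")
mod  <- fit(spec, Species ~ ., data = iris)

# Bundle the fit and write it to a hypothetical location
path <- tempfile(fileext = ".rds")
saveRDS(bundle(mod), file = path)

# Normally the rest would run in a fresh R session
mod_new <- unbundle(readRDS(path))

# Predictions from the original and the restored fit should agree
preds_old <- predict(mod, new_data = head(iris))
preds_new <- predict(mod_new, new_data = head(iris))
identical(preds_old, preds_new)
```

If the comparison comes back FALSE, or predict() fails on the restored object, that is a sign the model did not survive serialization intact.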