Mysterious error with tidymodels recipe using "step_interact"

I have been trying to create a recipe to train a model on the ames data set, but I get an error when I try to fit the model and I can't work out what causes it. Here is a minimal working example (MWE):

library(tidyverse)
library(tidymodels)


# Load dataset
data("ames")

ames <- ames |> 
    mutate(Sale_Price = log10(Sale_Price))

# Split the data frame into train/test
set.seed(123)
ames_split <- initial_split(ames, prop = 0.80)
ames_train <- training(ames_split)
ames_test <- testing(ames_split)

# Model Specification
model_spec <- linear_reg() |> 
    set_engine("lm")

# Recipe
ames_rec <- recipe(Sale_Price ~ ., data = ames_train) |> 
    step_log(Gr_Liv_Area, base = 10) |> 
    step_dummy(all_nominal_predictors()) |> 
    step_interact( ~ Gr_Liv_Area : starts_with("Bldg_Type")) |> 
    step_zv(all_numeric_predictors()) |> 
    step_normalize(all_numeric_predictors()) |> 
    step_pca(matches("(SF$)|(^Bsmt)|(^Garage)"), num_comp = 5) |> 
    prep()

# Workflow
ames_wflow <- workflow() |> 
    add_model(model_spec) |> 
    add_recipe(ames_rec)

# Train the model
model_fit <-  fit(ames_wflow, ames_train)

When I run this code it gives me the following error:

Error in `step_interact()`:
Caused by error in `str2lang()`:
! <text>:2:0: unexpected end of input
1: ~
   ^
Run `rlang::last_trace()` to see where the error occurred.

Can you explain what I am doing wrong?

Answer by topepo (best answer):

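Without the prep(), it appears to work. A workflow preps the recipe itself when it is fit, so add_recipe() should be given the untrained recipe: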

library(tidymodels)

# Load dataset
data("ames")

ames <- ames |> 
  mutate(Sale_Price = log10(Sale_Price))

# Split the data frame into train/test
set.seed(123)
ames_split <- initial_split(ames, prop = 0.80)
ames_train <- training(ames_split)
ames_test <- testing(ames_split)

# Model Specification
model_spec <- linear_reg() |> 
  set_engine("lm")

# Recipe
ames_rec <- recipe(Sale_Price ~ ., data = ames_train) |> 
  step_log(Gr_Liv_Area, base = 10) |> 
  step_dummy(all_nominal_predictors()) |> 
  step_interact( ~ Gr_Liv_Area : starts_with("Bldg_Type")) |> 
  step_zv(all_numeric_predictors()) |> 
  step_normalize(all_numeric_predictors()) |> 
  step_pca(matches("(SF$)|(^Bsmt)|(^Garage)"), num_comp = 5)

# Workflow
ames_wflow <- workflow() |> 
  add_model(model_spec) |> 
  add_recipe(ames_rec)

# Train the model
model_fit <-  fit(ames_wflow, ames_train)
class(model_fit)
#> [1] "workflow"

Created on 2023-11-28 with reprex v2.0.2
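
If you still want to look at the trained recipe or the processed training data, one option is to leave the recipe in the workflow untrained and inspect things separately. The following is a minimal sketch, not part of the original answer: it assumes a reduced recipe (only Gr_Liv_Area and Bldg_Type as predictors) to keep it short, and the names small_rec, small_wflow, small_fit, trained_rec and baked are made up for illustration.

```r
library(tidymodels)

data("ames")
ames <- ames |>
  mutate(Sale_Price = log10(Sale_Price))

set.seed(123)
ames_split <- initial_split(ames, prop = 0.80)
ames_train <- training(ames_split)

# Reduced recipe (an assumption for this sketch); note there is no prep() here
small_rec <- recipe(Sale_Price ~ Gr_Liv_Area + Bldg_Type, data = ames_train) |>
  step_log(Gr_Liv_Area, base = 10) |>
  step_dummy(all_nominal_predictors()) |>
  step_interact(~ Gr_Liv_Area:starts_with("Bldg_Type"))

# The untrained recipe goes into the workflow; fit() preps it internally
small_wflow <- workflow() |>
  add_model(linear_reg() |> set_engine("lm")) |>
  add_recipe(small_rec)

small_fit <- fit(small_wflow, ames_train)

# Pull the trained recipe back out of the fitted workflow
trained_rec <- extract_recipe(small_fit)

# Prep a separate copy purely for inspection; this does not touch the workflow
baked <- small_rec |>
  prep() |>
  bake(new_data = NULL)
names(baked)
```

Here prep() is only used on a copy of the recipe for looking at the result of bake(), while the workflow always receives the unprepped small_rec.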