Using tuneGrid and Controls in ctree (Caret)


I am running into an issue when using the tuneGrid and controls options in caret. In this example, I would like to tune mincriterion and maxdepth, and also set the minimum bucket size (minbucket). The error seems to occur whenever any options are passed to ctree_control().

I get the error:

In eval(expr, envir, enclos) :
  model fit failed for Fold1: mincriterion=0.95, maxdepth=7
Error in (function (cl, name, valueClass) :
  assignment of an object of class “numeric” is not valid for @‘maxdepth’ in an object of class “TreeGrowControl”; is(value, "integer") is not TRUE

This can be reproduced by running:

library(caret)
library(party)  # provides ctree_control()
data("GermanCredit")

trainCtrl <- trainControl(method = 'cv', number = 2, sampling = 'down',
                          verboseIter = FALSE, allowParallel = FALSE,
                          classProbs = TRUE,
                          summaryFunction = twoClassSummary)

# seq(5, 10, 2) yields doubles (5, 7, 9), which triggers the error below
tune <- expand.grid(.mincriterion = .95, .maxdepth = seq(5, 10, 2))

ctree_fit <- train(Class ~ ., data = GermanCredit,
                   method = 'ctree2', trControl = trainCtrl, metric = "Sens",
                   tuneGrid = tune, controls = ctree_control(minbucket = 10))

I am trying this approach based on the answer posted here: "Run cforest with controls = cforest_unbiased() using caret package".

Judging by the error, it has something to do with how caret passes maxdepth to ctree, but I'm not sure whether there is any way around this. Running ctree directly with ctree_control works fine.

Any help is greatly appreciated.

Answer by thie1e (best answer)

This looks like a possible bug to me. You can make it work if you use as.integer():

tune <- expand.grid(.mincriterion = .95, 
                    .maxdepth = as.integer(seq(5, 10, 2)))
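
As a quick sanity check, the column types of the two grids can be compared before calling train(). This is a minimal sketch in base R; the names tune_numeric and tune_integer are illustrative and do not appear in the original question.

```r
# Grid as written in the question: seq(5, 10, 2) yields doubles.
tune_numeric <- expand.grid(.mincriterion = .95,
                            .maxdepth = seq(5, 10, 2))

# Grid with the as.integer() fix applied.
tune_integer <- expand.grid(.mincriterion = .95,
                            .maxdepth = as.integer(seq(5, 10, 2)))

# expand.grid() keeps the storage type of each input vector.
class(tune_numeric$.maxdepth)  # "numeric"
class(tune_integer$.maxdepth)  # "integer"
```

Only the second grid carries values of class integer, which is what the maxdepth slot requires.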

Reason: if you use the controls argument, caret does the following:

theDots$controls@tgctrl@maxdepth <- param$maxdepth
theDots$controls@gtctrl@mincriterion <- param$mincriterion
ctl <- theDots$controls

If we take a look at the TreeControl class, it looks like this:

Formal class 'TreeControl' [package "party"] with 4 slots
  ..@ varctrl  :Formal class 'VariableControl' [package 
  ..@ tgctrl   :Formal class 'TreeGrowControl' [package "party"] with 4 slots

[left stuff out]

  .. .. ..@ stump         : logi FALSE
  .. .. ..@ maxdepth      : int 0
  .. .. ..@ savesplitstats: logi TRUE
  .. .. ..@ remove_weights: logi FALSE 

So the slot expects maxdepth to be of class integer, but caret assigns a numeric (a double such as 7, which is integer-valued but not of class integer). This only happens when controls is specified.
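
The same kind of failure can be reproduced without party or caret. The sketch below defines a hypothetical S4 class, DemoGrowControl, as a stand-in for TreeGrowControl (it is not the real class, just one integer slot), and shows that assigning a double to the slot fails while an as.integer() value is accepted.

```r
library(methods)

# Hypothetical stand-in for party's TreeGrowControl: a single integer slot.
setClass("DemoGrowControl", representation(maxdepth = "integer"))
ctl <- new("DemoGrowControl", maxdepth = 0L)

depths <- seq(5, 10, 2)  # doubles, not integers

# Assigning a double to an integer slot is rejected by the S4 slot check.
res <- tryCatch({
  ctl@maxdepth <- depths[1]
  "assigned"
}, error = function(e) "error")

# Coercing first satisfies the slot's declared class.
ctl@maxdepth <- as.integer(depths[1])
```

Here res ends up as "error" for the uncoerced value, and the coerced assignment goes through.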

If you don't specify controls, it does this instead:

ctl <- do.call(getFromNamespace("ctree_control", "party"),
               list(maxdepth = param$maxdepth,
                    mincriterion = param$mincriterion))

...and continues from there in a way I don't fully understand from reading the source alone. Have a look at https://github.com/topepo/caret/blob/master/models/files/ctree2.R if you're interested.