problem
I am trying to run a multilabel classification in R using the mlr package. I used makeMultilabelClassifierChainsWrapper (https://www.rdocumentation.org/packages/mlr/versions/2.19.0/topics/makeMultilabelClassifierChainsWrapper) to implement it, but I also need hyperparameter tuning, and that seems to create all kinds of problems. I followed the example at https://mlr.mlr-org.com/articles/tutorial/tune.html for tuning parameters. tuneParams requires the argument resample, and that is where I get stuck.
example data
age <- c(round(rnorm(120,mean = 50,sd = 10)))
sex <- c(round(rnorm(120,mean = 0.5,sd = 0.2)))
l1 <- as.logical(c(round(rnorm(120,mean = 0.5,sd = 0.2))))
l2 <- as.logical(c(round(rnorm(120,mean = 0.5,sd = 0.2))))
l3 <- as.logical(c(round(rnorm(120,mean = 0.5,sd = 0.2))))
l4 <- as.logical(c(round(rnorm(120,mean = 0.5,sd = 0.2))))
data <- as.data.frame(cbind(age,sex,l1,l2,l3,l4))
In reality I have 12 labels, but I left the others out to keep the example readable. The idea is that l1 through l4 are logical vectors. Somehow they do not end up as logical columns in data, so I hope you can fix that as well, but be aware that this is not my main question.
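A likely cause of the non-logical columns, shown as a minimal sketch: cbind() first builds a single numeric matrix, so the logical vectors are coerced to 0/1 before as.data.frame() ever sees them. Building the data frame directly with data.frame() keeps each column's own type. The seed below is an arbitrary assumption added only for reproducibility; everything else mirrors the example data above.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible

age <- round(rnorm(120, mean = 50, sd = 10))
sex <- round(rnorm(120, mean = 0.5, sd = 0.2))
l1 <- as.logical(round(rnorm(120, mean = 0.5, sd = 0.2)))
l2 <- as.logical(round(rnorm(120, mean = 0.5, sd = 0.2)))
l3 <- as.logical(round(rnorm(120, mean = 0.5, sd = 0.2)))
l4 <- as.logical(round(rnorm(120, mean = 0.5, sd = 0.2)))

# data.frame() keeps column types; as.data.frame(cbind(...)) would make
# every column numeric because cbind() returns a numeric matrix
data <- data.frame(age, sex, l1, l2, l3, l4)

str(data)  # l1..l4 should now be reported as logical
```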
code
task <- makeMultilabelTask(data = data, target = label_bact)
ps <- makeParamSet(
  makeDiscreteParam("ntree", values = c(50, 100, 150, 200, 300, 500, 550)),
  makeDiscreteParam("mtry", values = c(1, 2, 3, 4, 5))
)
ctrl <- makeTuneControlGrid()
rdesc <- makeResampleDesc(method = "CV", iters = 5, predict = "test",
                          stratify.cols = c(l1,l2,l3,l4))
measure <- acc
learner <- "classif.randomForest"
lrn <- makeLearner(learner)
lrn <- makeMultilabelClassifierChainsWrapper(lrn, order = NULL)
lrn <- setPredictType(lrn,"prob")
res <- tuneParams(lrn,task = task,resample = rdesc, par.set = ps,control = ctrl)
error
The error that I get:
Error in tuneParams(lrn, task = task, resample = rdesc, par.set = ps, :
Assertion on 'resample.fun' failed: Must be a function, not 'CVDesc/ResampleDesc'.
So I added the code line:
r <- resample(learner = lrn,task = task,rdesc)
and this tells me that
Error in makeResampleInstance(resampling, task = task) :
Stratification for tasks of type 'multilabel' not supported
check
This is confirmed by:
>rdesc
Resample description: cross-validation with 5 iterations.
Predict: test
Stratification: FALSE
questions
- First question: how can I get stratification (in the makeResampleDesc function) to work for multiple outcome labels?
- Second question: how can I make the tuneParams function work?
- Related question: is there a way to skip the resample argument, since I already do CV and stratification outside these functions?
Thanks in advance!
Try this:
It works for me! I hope it'll work for you.
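As a hedged sketch of one way the tuning call could be made to run (not necessarily the code this answer originally contained), the version below assumes mlr 2.19 with the randomForest package installed. It rests on three assumptions that differ from the question's code: the resampling description is passed through the `resampling` argument of tuneParams rather than `resample`; stratification is dropped, because mlr reports it as unsupported for multilabel tasks; and a multilabel measure (`multilabel.hamloss`) replaces `acc`. The seed, the shortened grid, and the holdout split indices are arbitrary choices for illustration.

```r
library(mlr)  # the randomForest package must be installed as well

set.seed(42)  # arbitrary seed, for reproducibility only
n <- 120
data <- data.frame(
  age = round(rnorm(n, mean = 50, sd = 10)),
  sex = rbinom(n, size = 1, prob = 0.5),
  l1 = rbinom(n, size = 1, prob = 0.5) == 1,
  l2 = rbinom(n, size = 1, prob = 0.5) == 1,
  l3 = rbinom(n, size = 1, prob = 0.5) == 1,
  l4 = rbinom(n, size = 1, prob = 0.5) == 1
)

# Target columns are given by name and must be logical
labels <- c("l1", "l2", "l3", "l4")
task <- makeMultilabelTask(data = data, target = labels)

lrn <- makeLearner("classif.randomForest")
lrn <- makeMultilabelClassifierChainsWrapper(lrn, order = NULL)
lrn <- setPredictType(lrn, "prob")

# Shortened grid so the sketch runs quickly; extend as needed
ps <- makeParamSet(
  makeDiscreteParam("ntree", values = c(50, 100)),
  makeDiscreteParam("mtry", values = c(1, 2))
)
ctrl <- makeTuneControlGrid()

# No stratification: mlr does not support it for multilabel tasks
rdesc <- makeResampleDesc("CV", iters = 5)

# Note the argument name `resampling`, and a multilabel measure
res <- tuneParams(lrn, task = task, resampling = rdesc,
                  measures = multilabel.hamloss,
                  par.set = ps, control = ctrl)
res$x  # best hyperparameter combination found

# If the split is already made elsewhere, a fixed instance can be
# passed instead of a description (indices here are made up)
rin <- makeFixedHoldoutInstance(train.inds = 1:90, test.inds = 91:120,
                                size = n)
res2 <- tuneParams(lrn, task = task, resampling = rin,
                   measures = multilabel.hamloss,
                   par.set = ps, control = ctrl)
```

tuneParams always needs some resampling strategy, so the argument cannot simply be skipped; a fixed instance such as `rin` is the closest option when the train/test split is decided outside mlr.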