How to use sample weights for a random forest classifier in Orange?


I am trying to train a random forest classifier on a very imbalanced dataset with 2 classes (benign/malign).

I followed the code from a previous question (How to set up and use sample weight in the Orange python package?) and tried assigning various higher weights to the minority-class data instances, but the classifiers I get behave exactly the same.

My code:

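import Orange

data = Orange.data.Table(filename)  # filename: path to the data file
st = Orange.classification.tree.SimpleTreeLearner(min_instances=3)
forest = Orange.ensemble.forest.RandomForestLearner(learner=st, trees=40, name="forest")

# Register a meta attribute holding the per-instance weight (default 1.0)
weight = Orange.feature.Continuous("weight")
weight_id = -10
data.domain.add_meta(weight_id, weight)
data.add_meta_attribute(weight, 1.0)

# Give the minority class a higher weight
for inst in data:
    if inst[data.domain.class_var] == 'malign':
        inst[weight] = 100

# Pass the id of the weight meta attribute to the learner
classifier = forest(data, weight_id)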

Am I missing something?

1 answer
Answer by JanezD (best answer)

The simple tree learner (SimpleTreeLearner) is exactly that, simple: it is optimized for speed and does not support weights, so the weight argument you pass is ignored and the resulting classifiers are the same. I guess learning algorithms in Orange that do not support weights should raise an exception if the weight argument is specified.

If you need weights only to change the class distribution, replicate data instances instead: create a new data table and add 100 copies of each malignant instance.
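
The replication idea can be sketched in plain Python, independent of Orange. This is only an illustration: the `oversample` helper, its parameters, and the toy rows are hypothetical names, not part of the Orange API. With Orange, the replicated instances would presumably be collected into a new table built on the original domain, which is an assumption not shown here.

```python
def oversample(rows, label_of, minority_label, copies):
    """Return a new list in which every minority-class row appears `copies` times.

    rows           -- any iterable of data instances
    label_of       -- function that extracts the class label from a row
    minority_label -- label of the class to replicate
    copies         -- total number of times each minority row should appear
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    result = []
    for row in rows:
        repeat = copies if label_of(row) == minority_label else 1
        # The copies are references to the same row object, which is fine
        # as long as the rows are only read during training.
        result.extend([row] * repeat)
    return result


# Toy stand-in for a data table: (feature vector, class label) pairs.
rows = [
    ([1.0, 2.0], "benign"),
    ([0.5, 1.5], "benign"),
    ([3.0, 4.0], "malign"),
]

balanced = oversample(rows, label_of=lambda r: r[1],
                      minority_label="malign", copies=100)
```

If you evaluate with cross-validation, replicate only the training data; otherwise copies of the same instance end up in both the training and the test set.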