How do I set my own probability threshold in a random forest?


I use Python to run a random forest on an imbalanced dataset with a binary target class. I want to change the default probability threshold of 0.5 to another value to raise recall and precision. So far I cannot find any built-in method or class for this. Could anyone please advise on a method, or does this mean I should code it myself? Cheers


There is 1 answer

René On

The RandomForestClassifier of scikit-learn has no fixed threshold for assigning a class to a sample. As you can see in the source code of RandomForestClassifier.predict, it simply returns the most likely class, i.e. the one with the highest predicted probability averaged over the trees. Of course, you can use the approach suggested by @thiom, but I can hardly imagine that this will improve precision and recall.

For instance, if your chosen threshold is 0.7 and the class probabilities are 0.6 and 0.4, which class do you assign? None at all?
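In the binary case, a custom threshold is usually read as a cutoff on the positive-class probability alone, with everything below it falling back to the negative class. Here is a minimal sketch of that reading, not a definitive recipe. The synthetic dataset, the label 1 as the positive class, and the value 0.3 are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic imbalanced binary data (assumption: roughly 90% class 0, 10% class 1).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Class probabilities averaged over the trees; columns follow clf.classes_.
proba = clf.predict_proba(X_test)

# What predict() does: pick the class with the highest probability.
default_pred = clf.classes_[np.argmax(proba, axis=1)]

# Custom cutoff on the positive class only (0.3 is an illustrative value).
threshold = 0.3
pos_col = list(clf.classes_).index(1)  # column holding P(class == 1)
custom_pred = (proba[:, pos_col] >= threshold).astype(int)
```

Lowering the cutoff below 0.5 can only add positive predictions, so it tends to trade precision for recall rather than raise both. Any cutoff should be chosen on a validation set, not on the test data.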

As an alternative, you can try to use the class_weight option of RandomForestClassifier to put more weight on your underrepresented class.
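A rough sketch of the class_weight route, again on synthetic data that is only assumed for illustration. The "balanced" setting weights classes inversely to their frequency in the training labels. Whether it helps on a given dataset has to be checked empirically.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced binary data (assumption: roughly 90% class 0, 10% class 1).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# "balanced" reweights classes inversely proportional to their frequencies.
# An explicit dict such as {0: 1, 1: 10} is also accepted; those numbers
# would be an arbitrary choice, not a recommendation.
weighted_clf = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)
weighted_clf.fit(X_train, y_train)

y_pred = weighted_clf.predict(X_test)
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Comparing these scores against an unweighted forest on the same split shows whether the reweighting actually moves recall on the minority class.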