Setting the cost of False Positives to be much higher than the cost of False Negatives in LightGBM


I am faced with a situation where False Positives are much more costly than False Negatives.

Imagine a model used to decide whether you should undergo a very painful and dangerous surgery right away, or instead explore other options first, such as consulting more doctors and trying alternative therapies.

While you would not object to the dangerous and painful surgery if it were absolutely necessary, you would be furious if the model advised it without a very serious reason. After all, you could always have the surgery a month later, after a more thorough medical investigation of your particular case.

So in that case a False Positive costs much more than a False Negative, at least to you.

So I would like to somehow tell the model that FPs are much worse than FNs.

I did some research on Stack Overflow, and the answers there propose two solutions:

  1. Play with the threshold when converting probabilities to classes (this does not affect training)
  2. Increase the weight of the negative class (this affects training and the shape of the decision boundary).
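To make option 1 concrete, here is a minimal sketch of how a threshold could be derived from the two costs. It assumes the predicted probabilities are reasonably calibrated, and the 10:1 cost ratio is made up for illustration. Predicting positive costs (1 - p) * cost_fp in expectation and predicting negative costs p * cost_fn, so predicting positive is only cheaper when p > cost_fp / (cost_fp + cost_fn). The function names are hypothetical helpers, not part of LightGBM.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability cutoff above which predicting positive has the lower expected cost."""
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def classify(probabilities, threshold):
    """Convert predicted positive-class probabilities to 0/1 labels."""
    return [1 if p > threshold else 0 for p in probabilities]


# Illustrative assumption: a false positive is 10 times as costly as a false negative.
threshold = cost_threshold(cost_fp=10.0, cost_fn=1.0)

# Made-up predicted probabilities; in practice these would come from the trained model.
probs = [0.20, 0.60, 0.95]
labels = classify(probs, threshold)
```

With these assumed costs the cutoff is 10/11, so only the case scored 0.95 is flagged as positive.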

The assumption is that by increasing the weight of the negative class, you dissuade the model from making a mistake when it comes to negative examples, i.e. misclassifying them as positive.

Conversely, the model becomes less reluctant to misclassify positive examples as negative.
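As a sketch of what option 2 could look like in practice, the weights can be built as one value per training row. The helper name and the weight of 5 are assumptions for illustration only. In LightGBM's scikit-learn interface, such a list can be passed through the `sample_weight` argument of `fit`; the `class_weight` constructor parameter is another way to get a similar effect.

```python
def negative_class_weights(labels, negative_weight):
    """Return one weight per row: negative_weight for label 0, 1.0 for label 1."""
    return [negative_weight if y == 0 else 1.0 for y in labels]


# Toy labels for illustration; 0 = negative class, 1 = positive class.
y_train = [0, 0, 1, 0, 1]
weights = negative_class_weights(y_train, negative_weight=5.0)

# The weights would then be supplied at training time, e.g.
#   LGBMClassifier().fit(X_train, y_train, sample_weight=weights)
# (X_train is not defined here; this line is only a pointer to the call site.)
```

Note that training with weights distorts the predicted probabilities, so a threshold chosen for an unweighted model would not carry over unchanged.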

But when the positive class is a small fraction of the overall data (e.g. 3%), upweighting the negative class like this will most probably lead the classifier to always opt for the negative class, a Catch-22 kind of situation.
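The size of this effect can be checked with simple arithmetic. The sketch below computes the share of total training weight carried by the positive class after reweighting; the negative weight of 10 is an assumed value, and the function is a hypothetical helper.

```python
def effective_positive_share(positive_fraction, negative_weight, positive_weight=1.0):
    """Fraction of the total sample weight that belongs to the positive class."""
    pos = positive_fraction * positive_weight
    neg = (1.0 - positive_fraction) * negative_weight
    return pos / (pos + neg)


# 3% positives, negatives weighted 10x (illustrative assumption).
share = effective_positive_share(0.03, negative_weight=10.0)
```

Under these assumptions the positives go from 3% of the data to roughly 0.3% of the total weight, which is why the concern about the model ignoring the positive class is plausible.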

What would be your advice?
