use catboost for ranking task


I'd like to know how to configure CatBoost for a ranking task. The CatBoost homepage suggests that it can be used for ranking tasks. However, documentation for ranking seems to be scarce: https://tech.yandex.com/catboost/doc/dg/concepts/cli-reference_train-model-docpage/

and all of the tutorials are focused on classifying individual instances: https://github.com/catboost/catboost/tree/master/catboost/tutorials

Ideally there would be some documentation or examples similar to LightGBM for ranking: https://github.com/Microsoft/LightGBM/tree/master/examples/lambdarank

Has anyone used CatBoost for ranking?


There is 1 answer

Anna Veronika Dorogush

Starting from version 0.9, CatBoost supports several ranking modes. To use a ranking mode you need to build a dataset that contains groups of objects (use group_id for that). The algorithm will try to find the best order within each group.

There are two pairwise modes in CatBoost: PairLogit and PairLogitPairwise. For a pairwise mode you need to provide pairs as part of your dataset. PairLogit is much faster, but PairLogitPairwise might give better quality on large datasets.

There are two ranking modes, YetiRank and YetiRankPairwise. To use them you need to have labels in your dataset. The trade-off between them is the same as for the pairwise modes: YetiRankPairwise is more computationally expensive, but might give better results.

There is also a mix between ranking and regression (QueryRMSE), a mix between ranking and classification (QueryCrossEntropy), and a QuerySoftMax loss.