Hyperparameter tuning with XGBRanker


I am trying to figure out how to tune my hyperparameters with RandomizedSearchCV and an XGBRanker model.

I can split the data into groups, feed it into the model, and make predictions. However, I am not sure how to set up the search object, specifically two things: how to inform it about the groups, and what kind of score I need to supply.

import xgboost as xg

model = xg.XGBRanker(
    tree_method='exact',
    booster='gbtree',
    objective='rank:pairwise',
    random_state=42,
    learning_rate=0.06,
    max_depth=5,
    n_estimators=700,
    subsample=0.75,
    #colsample_bytree=0.9,
    #subsample=0.75
    min_child_weight=0.06
    )

model.fit(x_train, y_train, group=train_groups, verbose=True)

This works fine. This is where I need some help:

from scipy import stats
from sklearn.model_selection import RandomizedSearchCV

param_dist = {'n_estimators': stats.randint(40, 1000),
              'learning_rate': stats.uniform(0.01, 0.59),
              'subsample': stats.uniform(0.3, 0.6),
              'max_depth': [3, 4, 5, 6, 7, 8, 9],
              'colsample_bytree': stats.uniform(0.5, 0.4),
              'min_child_weight': [0.05, 0.1, 0.02]
              }
clf = RandomizedSearchCV(model,
                         param_distributions=param_dist,
                         cv=5,
                         n_iter=5,  
                         scoring=???, #
                         error_score=0,
                         verbose=3,
                         n_jobs=-1)
#also what about the groups?

There is 1 answer

Antulii On

I tried something similar. For scoring, I used ndcg_score from sklearn, wrapped with make_scorer:

scoring = sklearn.metrics.make_scorer(sklearn.metrics.ndcg_score, greater_is_better=True)
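One thing to watch: as far as I know, ndcg_score is defined over 2-D inputs with one row per query, while a ranker's labels and predictions are flat 1-D arrays covering many queries, so the score usually has to be computed per query and then averaged. Here is a minimal pure-Python sketch of that idea, assuming linear gain and a log2 discount. The names `dcg`, `mean_ndcg`, and `query_ids` are made up for illustration and are not part of scikit-learn or XGBoost.

```python
import math
from collections import defaultdict


def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def mean_ndcg(y_true, y_score, query_ids):
    """Average NDCG over queries; the three inputs are flat and row-aligned."""
    by_query = defaultdict(list)
    for rel, score, qid in zip(y_true, y_score, query_ids):
        by_query[qid].append((score, rel))

    ndcgs = []
    for rows in by_query.values():
        # Order the true relevances by predicted score, best first.
        # Tied scores keep their input order (no tie averaging).
        ranked = [rel for _, rel in sorted(rows, key=lambda r: r[0], reverse=True)]
        ideal = dcg(sorted(ranked, reverse=True))
        if ideal > 0:  # queries with no relevant rows are skipped (a design choice)
            ndcgs.append(dcg(ranked) / ideal)
    return sum(ndcgs) / len(ndcgs) if ndcgs else 0.0
```

Skipping queries without any relevant rows is only one possible convention; counting them as 0 is equally defensible, so pick whichever matches how you report results.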

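For the groups, you can pass them as fit parameters. Recent scikit-learn versions no longer accept fit_params in the RandomizedSearchCV constructor, so forward them through fit() instead. Keep in mind that group holds query sizes for the whole training set, so it has to be made consistent with the rows of each CV fold.

# XGBRanker.fit takes `group`; a "model__" prefix is only needed when the
# ranker is a Pipeline step named "model".
fit_params = {"group": group}

clf = RandomizedSearchCV(model,
                         param_distributions=param_dist,
                         cv=5,
                         n_iter=5,
                         scoring=scoring,
                         error_score=0,
                         verbose=3,
                         n_jobs=-1)

# Fit parameters are forwarded to the estimator's fit() for every candidate.
clf.fit(x_train, y_train, **fit_params)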
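A caveat on the groups: an array of group sizes describes the full training set, so it cannot be sliced fold by fold the way row-aligned arrays can. One way around this is to keep one query id per row instead. The sketch below is plain Python with hypothetical helper names and made-up sizes, and it assumes the rows of each query are contiguous. It expands sizes into per-row ids, selects the rows of one fold, and collapses them back into sizes.

```python
from itertools import chain, groupby


def sizes_to_qid(group_sizes):
    """Expand group sizes, e.g. [2, 3], into per-row query ids [0, 0, 1, 1, 1]."""
    return list(chain.from_iterable([q] * n for q, n in enumerate(group_sizes)))


def qid_to_sizes(qid):
    """Collapse contiguous per-row query ids back into group sizes."""
    return [len(list(rows)) for _, rows in groupby(qid)]


train_groups = [2, 3, 1]           # made-up sizes for three queries
qid = sizes_to_qid(train_groups)   # one id per training row

# Rows of one hypothetical CV fold that keeps whole queries 0 and 2 together.
fold_rows = [i for i, q in enumerate(qid) if q in {0, 2}]
fold_sizes = qid_to_sizes([qid[i] for i in fold_rows])
```

A row-aligned id array is also what group-aware splitters such as scikit-learn's GroupKFold take through their groups argument, which keeps a query from being split across folds. Recent XGBoost releases also accept per-row ids via the qid argument of XGBRanker.fit, but check that against the version you have installed.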