Here is my code. It is a binary classification problem and the evaluation criterion is the AUC score. I looked at one solution on Stack Overflow and implemented it, but it did not work and I am still getting an error.
param_grid = {
    'n_estimators': [1000, 10000],
    'boosting_type': ['gbdt'],
    'num_leaves': [30, 35],
    #'learning_rate': [0.01, 0.02, 0.05],
    #'colsample_bytree': [0.8, 0.95],
    'subsample': [0.8, 0.95],
    'is_unbalance': [True, False],
    #'reg_alpha': [0.01, 0.02, 0.05],
    #'reg_lambda': [0.01, 0.02, 0.05],
    'min_split_gain': [0.01, 0.02, 0.05]
}
lgb = LGBMClassifier(random_state=42, early_stopping_rounds=10, eval_metric='auc', verbose_eval=20)
grid_search = GridSearchCV(lgb, param_grid=param_grid,
                           scoring='roc_auc', cv=5, n_jobs=-1, verbose=1)
grid_search.fit(X_train, y_train, eval_set=(X_val, y_val))
best_model = grid_search.best_estimator_
start = time()
best_model.fit(X_train, y_train)
Train_time = round(time() - start, 4)
The error happens at best_model.fit(X_train, y_train).
Answer
This error is caused by the fact that you used early stopping during grid search, but decided not to use early stopping when fitting the best model over the full dataset.
Some keyword arguments you pass into LGBMClassifier are added to the params in the model object produced by training, including early_stopping_rounds.
To disable early stopping, you can use update_params().
More Details
I made some assumptions to turn your question into a minimal reproducible example. In the future, I recommend doing that when you ask questions here. It will help you get better, faster help.
I installed lightgbm 3.1.0 with pip install lightgbm==3.1.0. I'm using Python 3.8.3 on Mac.
Things I changed from your example to make it an easier-to-use reproduction
[10, 100] and num_leaves to [8, 10] so training would run much faster
reproducible example