I'm running a grid search on AdaBoost with a DecisionTreeClassifier as its base learner to find the best parameters for both AdaBoost and the decision tree.
The search on a dataset of shape (130000, 22) has been running for 18 hours, so I'm wondering whether this is just a typical long training run or whether there might be an issue with the setup.
Are the base learner, grid search, parameters, and training set up correctly?
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV

# scaled_X_train, y_train and the cross-validation splitter kf are defined elsewhere
ada_params = {
    "base_estimator__criterion": ["gini", "entropy"],
    "base_estimator__splitter": ["best", "random"],
    "base_estimator__min_samples_leaf": [*np.arange(100, 1500, 100)],
    "base_estimator__max_depth": [5, 10, 13, 15],
    "base_estimator__max_features": [5, 10, 15],
    "n_estimators": [500, 700, 1000, 1500],
    "learning_rate": [0.001, 0.01, 0.1, 0.3],
}

dt_base_learner = DecisionTreeClassifier(random_state=42, max_features="auto", class_weight="balanced")
ada_clf = AdaBoostClassifier(base_estimator=dt_base_learner)
ada_search = GridSearchCV(ada_clf, param_grid=ada_params, scoring="f1", cv=kf)
ada_search.fit(scaled_X_train, y_train)
If I am not mistaken, your grid search tests 2 * 2 * 14 * 4 * 3 * 4 * 4 = 10,752 different model configurations, each fitted once per fold of the cross-validation (the number of splits in kf isn't shown). You should definitely try to reduce the number of combinations in the GridSearchCV, or go for RandomizedSearchCV or BayesSearchCV from skopt.
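As a minimal sketch of the RandomizedSearchCV alternative, reusing the ada_clf, ada_params, kf, scaled_X_train and y_train from the question (n_iter=50, random_state=42, n_jobs and verbose are illustrative choices, not values from the original post):

from sklearn.model_selection import RandomizedSearchCV

# Sample 50 random configurations from ada_params instead of fitting all 10,752.
# n_iter and random_state are arbitrary example values.
ada_random_search = RandomizedSearchCV(
    ada_clf,
    param_distributions=ada_params,
    n_iter=50,
    scoring="f1",
    cv=kf,
    random_state=42,
    n_jobs=-1,   # parallelize fits across all available cores
    verbose=1,   # print progress so long runs are visible
)
ada_random_search.fit(scaled_X_train, y_train)
print(ada_random_search.best_params_, ada_random_search.best_score_)

Even with random sampling, each sampled configuration can still train up to 1,500 boosted trees on 130,000 rows, so trimming n_estimators (and max_depth) in the search space will likely do the most to bring the runtime down.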