RandomizedSearchCV is useful because it doesn't try every combination of the parameters you list. Instead, it samples a fixed number of combinations (controlled by n_iter, 10 by default) and evaluates each one to see which performs best.
But how can I know which parameter combinations were actually tested?
For instance, in the script below, which combinations of n_estimators, max_features, and max_depth were tested? Was n_estimators = 10 tested? Was n_estimators = 100 tested?
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

rf = RandomForestRegressor()
n_estimators = [int(x) for x in np.linspace(start=10, stop=2000, num=200)]
max_features = ["auto", "sqrt", "log2"]  # note: "auto" was removed for forests in scikit-learn 1.3
max_depth = [int(x) for x in np.linspace(5, 500, num=100)]
random_grid = {
    "n_estimators": n_estimators,
    "max_features": max_features,
    "max_depth": max_depth,
}
randomsearch = RandomizedSearchCV(rf, param_distributions=random_grid, cv=5)
randomsearch.fit(X_train, y_train)
A lot of information about the search is stored in the cv_results_ attribute. Loading that dict into a DataFrame gives you one row per hyperparameter combination tested, with the sampled hyperparameter values, per-fold and mean test scores, optionally training scores, fit times, and so on.
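As a minimal self-contained sketch (using a toy dataset from make_regression and a small grid instead of your real data), this shows how to pull the sampled combinations out of cv_results_:

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Toy data standing in for X_train, y_train
X, y = make_regression(n_samples=100, n_features=5, random_state=0)

random_grid = {
    "n_estimators": [10, 50, 100],
    "max_features": ["sqrt", "log2"],
    "max_depth": [3, 5, 10],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=random_grid,
    n_iter=5,        # number of combinations sampled (default is 10)
    cv=3,
    random_state=0,  # makes the sampling reproducible
)
search.fit(X, y)

# One row per sampled combination: exactly which values were tested
results = pd.DataFrame(search.cv_results_)
print(results[["param_n_estimators", "param_max_features",
               "param_max_depth", "mean_test_score", "rank_test_score"]])
```

The DataFrame has n_iter rows, so you can see at a glance whether, say, n_estimators = 10 was among the sampled combinations. The single best combination is also available directly as search.best_params_.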