I am using the following code to train an XGBoost model:
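import xgboost as xgb
from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV
from sklearn.utils.class_weight import compute_sample_weight

# Calculate class weights for cost-sensitive learning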
class_weights = compute_sample_weight(class_weight='balanced', y=y_train)
# Define a parameter distribution for randomized search
param_dist = {
    'eta': uniform(0.01, 0.3),  # uniform(loc, scale) samples from [loc, loc + scale], here [0.01, 0.31]
    'max_depth': randint(3, 10),
    'min_child_weight': randint(1, 10),
    'subsample': uniform(0.5, 0.5),  # [0.5, 1.0]; subsample must not exceed 1
    'colsample_bytree': uniform(0.5, 0.5),  # [0.5, 1.0]; colsample_bytree must not exceed 1
    'n_estimators': randint(50, 1000),  # integers from 50 to 999
    'objective': ['binary:logistic'],
    'eval_metric': ['auc']
}
# Create an XGBoost classifier
xgb_classifier = xgb.XGBClassifier()
# Perform randomized search with cross-validation to find optimal parameters
random_search = RandomizedSearchCV(estimator=xgb_classifier,
                                   param_distributions=param_dist, n_iter=60, cv=5,
                                   scoring='roc_auc', n_jobs=-1, random_state=10)
random_search.fit(X_train, y_train, sample_weight=class_weights)
# Get the best parameters from the randomized search
best_params = random_search.best_params_
print("Best Parameters:", best_params)
# Use the best parameters to fit the XGBoost model
best_xgb_model = xgb.XGBClassifier(**best_params)
best_xgb_model.fit(X_train, y_train)
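For context on what class_weights holds: it is one weight per training row, not a model hyperparameter. Below is a minimal pure-Python sketch of the 'balanced' formula as I understand it from the scikit-learn documentation, n_samples / (n_classes * count of that class). The helper name balanced_sample_weights and the toy labels are hypothetical, made up for illustration only.

```python
from collections import Counter


def balanced_sample_weights(y):
    """Return one weight per sample: n_samples / (n_classes * class_count).

    Hypothetical helper that mirrors the 'balanced' heuristic; it is not
    part of scikit-learn or XGBoost.
    """
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[label]) for label in y]


# Toy imbalanced labels (assumed data): 8 negatives, 2 positives
y_example = [0] * 8 + [1] * 2
weights = balanced_sample_weights(y_example)

# Each majority-class row gets 0.625, each minority-class row gets 2.5
print(weights[0], weights[-1])
```

Because these values are data passed to fit rather than keys in param_dist, I would not expect them to appear in best_params_.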
I passed the weights to random_search.fit. My question is: are those weights also applied when I fit best_xgb_model, or do I need to pass sample_weight=class_weights to best_xgb_model.fit as well?
Please help! Thanks!