I am performing parameter selection using GridSearchCV (from the sklearn package in Python), where the model is an Elastic Net with a logistic loss (i.e. a logistic regression with L1- and L2-norm regularization penalties). I am using SGDClassifier to implement this model. There are two parameters whose optimal values I want to search for: alpha (the constant that multiplies the regularization term) and l1_ratio (the Elastic Net mixing parameter). My data set has ~300,000 rows. I initialize the model as follows:
sgd_ela = SGDClassifier(alpha=0.00001, fit_intercept=True, l1_ratio=0.1, loss='log', penalty='elasticnet')
and the search function as follows:
GridSearchCV(estimator=sgd_ela, cv=8, param_grid=tune_para)
with tuning parameters:
tune_para = [{'l1_ratio': np.linspace(0.1, 1, 10).tolist(), 'alpha': [0.00001, 0.0001, 0.001, 0.01, 0.1, 1]}]
I get the best_params_ (for alpha and l1_ratio) when I run the code. However, repeated runs do not give the same set of best parameters. Why is this the case and, if possible, how can I overcome it?
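Try setting the random seed if you want to get the same result each time. By default, SGDClassifier shuffles the training data after each epoch (shuffle=True), so every fit can end at slightly different coefficients and the cross-validation scores change from run to run. Passing a fixed random_state to SGDClassifier makes the fits, and therefore the selected parameters, reproducible.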
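As a minimal sketch of that suggestion, the snippet below runs the same grid search twice with a fixed random_state and compares the results. The synthetic data from make_classification, the reduced grid, cv=3, the seed value 42 and the helper name run_search are all assumptions made to keep the example small and fast; they are not the setup from the question. The version check is there only because the logistic loss is named 'log_loss' in scikit-learn 1.1 and later and 'log' in older releases.

```python
import sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

# The logistic loss is 'log_loss' from scikit-learn 1.1 on; older releases use 'log'.
_version = tuple(int(p) for p in sklearn.__version__.split(".")[:2])
LOG_LOSS = "log_loss" if _version >= (1, 1) else "log"


def run_search(seed):
    """Hypothetical helper: one grid search with a fixed seed on toy data."""
    # Synthetic stand-in for the real data set (assumption).
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

    # Same model as in the question, plus random_state to fix the shuffling.
    sgd_ela = SGDClassifier(alpha=0.00001, fit_intercept=True, l1_ratio=0.1,
                            loss=LOG_LOSS, penalty='elasticnet',
                            random_state=seed)

    # Reduced grid and fewer folds than in the question, for speed (assumption).
    tune_para = [{'l1_ratio': [0.1, 0.5, 1.0], 'alpha': [0.0001, 0.01, 1]}]

    search = GridSearchCV(estimator=sgd_ela, cv=3, param_grid=tune_para)
    search.fit(X, y)
    return search.best_params_, search.best_score_


first = run_search(seed=42)
second = run_search(seed=42)
print(first)
print(first == second)
```

With an integer cv and a classifier, GridSearchCV builds stratified folds without shuffling, so the splits are already the same on every run; the seed on SGDClassifier is what removes the remaining randomness. If the best parameters still change a lot when you try different seeds, several grid points probably score almost equally, and comparing the mean scores in cv_results_ is more informative than best_params_ alone.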