scikit-learn linear regression K fold cross validation

Question

Asked by Syed Humayun on 10 October 2020 at 04:02 · 3.1k views

I want to run linear regression with K-fold cross-validation, using the sklearn library, on my training data to obtain the best regression model. I then plan to use the predictor with the lowest mean error on my test set.

For example, the piece of code below gives me an array of 20 results with different negative mean absolute errors. I am interested in finding the predictor that gives the least error, and then using that predictor on my test set.

sklearn.model_selection.cross_val_score(LinearRegression(), trainx, trainy, scoring='neg_mean_absolute_error', cv=20)

There is 1 answer

**Sergey Bushmanov** · Accepted Answer · 2020-10-10T05:14:11+00:00

There is no such thing as a "predictor which gives me this (least) error" in cross_val_score; all the estimators in

sklearn.model_selection.cross_val_score(LinearRegression(), trainx, trainy, scoring='neg_mean_absolute_error', cv=20)

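are the same LinearRegression specification, each fit on a different subset of the training data, and the function returns only their scores, not the fitted models.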
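One common way to act on this, shown as a minimal sketch rather than the only approach: treat the 20 fold scores as an estimate of how the model generalizes, then fit a single LinearRegression on all of the training data and evaluate that on the test set. The synthetic data from make_regression, the split sizes, and the names testx, testy, and final_model are assumptions for illustration and are not from the question.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data stands in for the asker's trainx/trainy (assumption).
X, y = make_regression(n_samples=400, n_features=10, noise=10.0, random_state=0)
trainx, testx, trainy, testy = train_test_split(X, y, test_size=0.25, random_state=0)

# Each of the 20 scores comes from the same LinearRegression specification,
# fit on a different 19/20 of the training data and scored on the rest.
scores = cross_val_score(LinearRegression(), trainx, trainy,
                         scoring='neg_mean_absolute_error', cv=20)
cv_mae = -scores.mean()  # scores are negated MAE, so flip the sign

# Fit one final model on all of the training data and evaluate it once
# on the held-out test set.
final_model = LinearRegression().fit(trainx, trainy)
test_mae = mean_absolute_error(testy, final_model.predict(testx))
print(f"CV MAE: {cv_mae:.2f}, test MAE: {test_mae:.2f}")
```

The spread of the fold scores reflects how the data happened to be split rather than which fold produced a "better" model, which is why the per-fold models are usually not compared with each other.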

You may wish to look at GridSearchCV, which does search through different sets of hyperparameters and returns the best estimator:

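from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV

X, y = datasets.make_regression()
lr_model = LinearRegression()
# 'normalize' was removed from LinearRegression in scikit-learn 1.2,
# so 'fit_intercept' is searched here instead
parameters = {'fit_intercept': [True, False]}
clf = GridSearchCV(lr_model, parameters, refit=True, cv=5)
clf.fit(X, y)
# fit() returns the GridSearchCV object itself; the refit model is best_estimator_
best_model = clf.best_estimator_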

Note the refit=True parameter, which ensures that the best model is refit on the whole dataset and made available as clf.best_estimator_.
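To connect this back to the question, here is a hedged sketch of using the refit model on a held-out test set. The synthetic data, the train/test split, the 'fit_intercept' grid (chosen because 'normalize' is not available in scikit-learn 1.2 and later), and the variable names are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data and split sizes are placeholders (assumption).
X, y = make_regression(n_samples=400, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Search a small grid with 5-fold CV, scoring by negated MAE as in the question.
search = GridSearchCV(LinearRegression(), {'fit_intercept': [True, False]},
                      scoring='neg_mean_absolute_error', refit=True, cv=5)
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print("best CV MAE:", -search.best_score_)

# With refit=True, best_estimator_ has been refit on all of X_train.
best_model = search.best_estimator_
test_mae = mean_absolute_error(y_test, best_model.predict(X_test))
print("test MAE:", test_mae)
```

Calling search.predict(X_test) gives the same predictions, since the fitted search object delegates predict to best_estimator_.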
