Python raises an AttributeError when methods on the sklearn Pipeline object are called

Problem

I am calling the fit_transform() and transform() methods on a Pipeline object, but Python is raising an AttributeError whenever I try to do so. Here is what I'm trying to run, with imports. (Note: train/test splitting has been done already)

from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

pipe = Pipeline([('mean_impute', SimpleImputer()), 
                 ('norm',        StandardScaler()), 
                 ('sklearn_lm',  LinearRegression())])

pipe.fit_transform(x_train, y_train)  #<-- error here

x_transform = pipe.transform(x_test)  #<-- and here if previous line is absent

The text of the error is as follows:

AttributeError: This 'Pipeline' has no attribute 'fit_transform'

What went wrong? I'm sure it's something simple.

Things I have tried:

There are 2 answers

e-motta On BEST ANSWER

The documentation for sklearn.pipeline.Pipeline.fit_transform states that it is "[o]nly valid if the final estimator either implements fit_transform or fit and transform." The wording is a bit ambiguous, but it covers two cases: (i) the final estimator implements fit_transform, or (ii) the final estimator implements both fit and transform.

Your final estimator is sklearn.linear_model.LinearRegression, which implements fit but neither transform nor fit_transform, so the pipeline does not expose those methods either. This is why the error is raised.

Bronze_Chaos On

The documentation says that this method is only valid if the final estimator either implements fit_transform, or implements both fit and transform. I do not know what a final estimator is, but that might be your problem (I know this wasn't very helpful, but I tried).