Afternoon all, I am trying to run my pipeline, but I keep running into the error 'numpy.ndarray' object has no attribute 'fit' and I cannot figure out why. Below is the code for the transformer, the pipeline, and the dataframes. Any tips are appreciated:
import pandas as pd
import numpy as np
data = pd.read_csv("CustomTransformerData.csv")
data
from sklearn.base import BaseEstimator, TransformerMixin
#column index
x1_ix, x2_ix, x3_ix, x4_ix, x5_ix = 0,1,2,3,4
class Assignment4Transformer(BaseEstimator, TransformerMixin):
    def __init__(self, add_x6 = True, y = None):
        self.add_x6 = add_x6
    def fit(self, data, y=None):
        return self
    def transform(self, data):
        if self.add_x6:
            y=[]
            x4 = (x1_ix**3) / (x5_ix)
            y.append(x4)
            x1 = (x1_ix**3) / (x5_ix)
            y.append(x1)
            x2 = (x1_ix**3) / (x5_ix)
            y.append(x2)
            x3 = (x1_ix**3) / (x5_ix)
            y.append(x3)
            x5 = (x1_ix**3) / (x5_ix)
            y.append(x5)
            x6 = (x1_ix**3) / (x5_ix)
            y.append(x6)
        return y
attr_adder = Assignment4Transformer(add_x6 = True)
assignment4_extra_attribs = attr_adder.transform(data)
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
num_pipeline=Pipeline([
    ('imputer',SimpleImputer(strategy="mean")),
    ('attribs_adder', Assignment4Transformer),
    ('std_scaler', StandardScaler())
])
data_num = data.drop("x3", axis = 1)
data_cat = data.drop(["x1", "x2", "x4", "x5"], axis = 1)
data_num_transformed = num_pipeline.fit_transform(data_num)
Output of last line:
error 'numpy.ndarray' object has no attribute 'fit'
The 4th line contains an additional data variable. Remove it and run the code again. Also, the return statement in the transform function should be indented to match the previous line.
Check these two points and share the results.
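One more thing that may be worth checking, since it commonly produces this exact message: the pipeline step is given the class Assignment4Transformer rather than an instance, Assignment4Transformer(). When a step is a class, older scikit-learn versions end up calling fit_transform with the array in place of self, which would explain why the error complains about numpy.ndarray. Below is a minimal sketch of a working version, not a definitive fix. It assumes the intended feature is x6 = x1**3 / x5 and that the numeric columns arrive in the order x1, x2, x4, x5 once x3 is dropped. The names AddX6Transformer, X1_IX and X5_IX are hypothetical, and the small dataframe is made up to stand in for CustomTransformerData.csv.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Column positions in the numeric array after "x3" is dropped
# (assumed order: x1, x2, x4, x5).
X1_IX, X5_IX = 0, 3


class AddX6Transformer(BaseEstimator, TransformerMixin):
    """Hypothetical variant of Assignment4Transformer.

    Assumes the new feature is x6 = x1**3 / x5.
    """

    def __init__(self, add_x6=True):
        self.add_x6 = add_x6

    def fit(self, X, y=None):
        return self  # nothing to learn from the data

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        if not self.add_x6:
            return X
        # Compute the feature from the data columns, not from the index constants.
        x6 = X[:, X1_IX] ** 3 / X[:, X5_IX]
        return np.c_[X, x6]  # append x6 as a new last column


# Made-up stand-in for CustomTransformerData.csv.
data = pd.DataFrame({
    "x1": [1.0, 2.0, 3.0, 4.0],
    "x2": [10.0, np.nan, 30.0, 40.0],
    "x3": ["a", "b", "a", "c"],
    "x4": [0.5, 1.5, 2.5, 3.5],
    "x5": [2.0, 4.0, 6.0, 8.0],
})

num_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("attribs_adder", AddX6Transformer()),  # note the parentheses: an instance, not the class
    ("std_scaler", StandardScaler()),
])

data_num = data.drop("x3", axis=1)
data_num_transformed = num_pipeline.fit_transform(data_num)
print(data_num_transformed.shape)  # (4, 5)
```

The transform here returns a 2-D array instead of a Python list, so StandardScaler receives one row per sample and one column per feature. If your column order or the x6 formula is different, adjust X1_IX, X5_IX and the expression accordingly.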