I have the following dataframes:

train_x:

col1 col2 col3
1      4    89
0.4    1.6  14
100    678  970

train_y:

target
0
0
1

I want to convert an XGBoost model to PMML, using the pipeline below:

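from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn2pmml import sklearn2pmml, PMMLPipeline
from sklearn_pandas import DataFrameMapper
from xgboost.sklearn import XGBClassifier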
pipeline = PMMLPipeline([("mapper", DataFrameMapper([
                                    ([num_features,SimpleImputer(strategy='median')],
                                     [num_features,StandardScaler()],
                                     [cat_features,SimpleImputer(strategy='constant', fill_value='missing')],
                                     [cat_features,OneHotEncoder(sparse=False, handle_unknown='ignore')])
                                     ])),
                         ("classifier", XGBClassifier(**best_params,n_jobs=-1))
])

and fit the pipeline

pipeline.fit(train_x, train_y)

but I get the error below:

TypeError: _build_feature() takes from 2 to 3 positional arguments but 4 were given

1 Answer

user1808924 On

This TypeError is raised by the DataFrameMapper.fit method, because you've specified invalid column-to-transformer mappings.
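
The argument count in the message can be reproduced in plain Python. The sketch below uses a hypothetical stand-in, build_feature, with the same two-to-three positional-argument shape that the error message reports for _build_feature. It assumes the mapper unpacks each mapping tuple into that call, which is a guess about the library internals rather than something shown in the question; the column names and the strings standing in for transformers are also made up.

```python
def build_feature(columns, transformers, options=None):
    """Hypothetical stand-in for _build_feature: accepts 2 to 3 positional arguments."""
    return (columns, transformers, options or {})


# Same shape as the question: ONE tuple holding FOUR lists.
# Strings stand in for the real transformer objects.
bad_mapping = (
    [["col1"], "imputer"],
    [["col1"], "scaler"],
    [["col2"], "imputer"],
    [["col2"], "encoder"],
)

try:
    # Assumed behaviour: each mapping entry is star-unpacked into the builder,
    # so a four-element tuple becomes four positional arguments.
    build_feature(*bad_mapping)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

A two-element entry such as (["col1"], "imputer") unpacks into the stand-in without complaint, which is why the mapper expects one tuple per column group.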

You should be specifying a list of two-element tuples ([(), ()]), but right now you're supplying a singleton list, which contains a tuple, which contains four lists ([([], [], [], [])]).
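
As a rough sketch of the expected shape, the snippet below regroups the question's four flat entries into two-element (columns, transformers) tuples, one per column group. The helper regroup and the column lists are hypothetical, and strings stand in for the real transformer objects so the example runs without any third-party packages.

```python
def regroup(flat_entries):
    """Merge consecutive [columns, transformer] pairs that share the same
    columns into (columns, [transformers]) two-element tuples."""
    grouped = []
    for columns, transformer in flat_entries:
        if grouped and grouped[-1][0] == columns:
            grouped[-1][1].append(transformer)  # chain onto the previous group
        else:
            grouped.append((columns, [transformer]))
    return grouped


# Hypothetical column groups; the question does not show how these are defined.
num_features = ["col1", "col2"]
cat_features = ["col3"]

# The four entries from the question, with strings as transformer placeholders.
flat = [
    [num_features, "SimpleImputer(median)"],
    [num_features, "StandardScaler()"],
    [cat_features, "SimpleImputer(constant)"],
    [cat_features, "OneHotEncoder()"],
]

mappings = regroup(flat)
for columns, transformers in mappings:
    print(columns, "->", transformers)
```

With the real objects, the same shape would read DataFrameMapper([(num_features, [SimpleImputer(strategy='median'), StandardScaler()]), (cat_features, [SimpleImputer(strategy='constant', fill_value='missing'), OneHotEncoder(sparse=False, handle_unknown='ignore')])]). When the second element of a tuple is a list, sklearn_pandas applies those transformers to the selected columns in order.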