LightGBM does not accept the dtypes of my data


I'm trying to use LGBMClassifier, and for some reason it does not accept the dtypes of my data (I tested the features, and none of them are accepted).

The output of pd.DataFrame.info() shows that all dtypes are either category, float or int:

dtypes: category(275), float64(115), int64(9)

When I then try to train my LGBMClassifier, I get the following error:

ValueError: Series.dtypes must be int, float or bool

Does anyone have an idea what is wrong?


There are 2 answers

0
MGCHEM On BEST ANSWER

I figured out that the error:

ValueError: Series.dtypes must be int, float or bool

refers in my case to the label, which is the only Series passed to the lgb.train() method. My label had the dtype category, which lgb.train() cannot handle. I had to change the dtype from 'category' to 'int'. Then it worked.
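A minimal sketch of that dtype change, assuming the label is a pandas Series with dtype category. The Series `y` and its values are made up for illustration and are not from the original question:

```python
import pandas as pd

# Hypothetical label column stored as a pandas categorical.
y = pd.Series(["no", "yes", "yes", "no"], dtype="category")

# If the categories are text, map each category to its integer code.
# (If the categories are already numbers, y.astype("int64") is enough.)
y_int = y.cat.codes.astype("int64")

print(y.dtype, "->", y_int.dtype)
```

Note that `cat.codes` assigns -1 to missing values, so it is worth checking the label for NaN before converting.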

0
Julio Reckin On

Features with dtype category are handled separately in LightGBM. When you create the dataset for training, you pass these features via the keyword categorical_feature. For example, you can first store the names of all features with dtype category in a list:

categoricals = ["feature1", "feature2",...]

Then you use the list when creating the training data set for the LGBM model:

lgb_train = lgb.Dataset(train_X,train_y,categorical_feature=categoricals)

You can do the same for the test data set:

lgb_test = lgb.Dataset(test_X,test_y,categorical_feature=categoricals)