I am using the LightGBM 2.0.6 Python API. My training data has around 80K samples and 400 features, and I am training a multi-class classification model (10 classes) with ~2000 iterations. After training finishes, calling model.feature_importance() causes a segmentation fault.
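For reference, a minimal sketch of what I am doing (the parameter values below are placeholders, not my exact hyperparameters):

```python
import numpy as np
import lightgbm as lgb

# Shape matches my real data: ~80K samples, 400 features, 10 classes
X = np.random.rand(80000, 400)
y = np.random.randint(0, 10, size=80000)

params = {
    "objective": "multiclass",
    "num_class": 10,
    # ... my other hyperparameters ...
}
train_set = lgb.Dataset(X, label=y)
model = lgb.train(params, train_set, num_boost_round=2000)

importances = model.feature_importance()  # segfaults here with my real data
```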
I tried generating artificial data for testing (with the same number of samples, classes, iterations, and hyperparameters), and there I can successfully obtain the list of feature importances. Therefore I suspect that whether the problem occurs depends on the training data.
I would like to know whether someone else has encountered this problem and, if so, how it was overcome. Thank you.
This is a bug in LightGBM; 2.0.4 doesn't have this issue. It is also fixed in LightGBM master. So either downgrade to 2.0.4 (e.g. pip install lightgbm==2.0.4), wait for the next release, or build LightGBM from master.
The problem indeed depends on the training data: feature_importance() segfaults only when there are "constant" trees in the trained ensemble, i.e. trees with a single leaf and no splits.
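If you want to check whether your trained model contains such constant trees, one way is to inspect the output of Booster.dump_model(). This is a sketch; the JSON key names ("tree_info", "tree_structure", "split_index") are assumptions based on the current dump format, so verify them against your version's output:

```python
# Inspect the dumped model for single-leaf ("constant") trees.
dump = model.dump_model()

constant_trees = [
    i for i, tree in enumerate(dump["tree_info"])
    # A tree with at least one split has a "split_index" at its root;
    # a constant tree's structure is just a bare leaf.
    if "split_index" not in tree["tree_structure"]
]
print("constant trees:", constant_trees)
```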