I would like to compute Partial Dependence Plots (PDPs) on some clinical data with a large number of features (say, 300). Because I'd also like to look at pairwise feature interactions, the number of combinations could be far too large to run in practice.
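For scale, here is a quick count of the feature pairs involved. It is plain arithmetic on the 300 and 20 used in this question; the variable names are only illustrative.

```python
from math import comb

n_features_all = 300   # full feature set from the question
n_features_top = 20    # size of the reduced subset being considered

# Number of unordered feature pairs, i.e. candidate two-way PDPs.
n_pairs_all = comb(n_features_all, 2)
n_pairs_top = comb(n_features_top, 2)

print(n_pairs_all)  # 44850
print(n_pairs_top)  # 190
```

So restricting to 20 features would cut the pairwise plots from 44,850 to 190.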

Would it be valid to train and test a model in order to select a small number of highly predictive features (e.g. the top 20), and then compute PDPs on this subset only?

My concern is that the top predictive features would be selected using the same data that the PDPs are computed on, which doesn't feel right to me.
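To make the question concrete, here is a minimal sketch of one variant I am considering: rank the features on one split and compute the PDPs on a held-out split. Everything in it is an assumption for illustration: scikit-learn, synthetic data standing in for the clinical data, a random forest with impurity-based importances as the ranking, and a 50/50 split. Whether holding out data like this actually addresses the concern is part of what I'm asking.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the clinical data (hypothetical: 300 features, 10 informative).
X, y = make_regression(n_samples=400, n_features=300, n_informative=10, random_state=0)

# One split for ranking/fitting, one held out for computing the PDPs.
X_sel, X_pdp, y_sel, y_pdp = train_test_split(X, y, test_size=0.5, random_state=0)

# Rank features with a model fitted on the selection split only.
ranker = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_sel, y_sel)
top_k = 20
top_features = np.argsort(ranker.feature_importances_)[::-1][:top_k]

# Refit on the reduced feature set, still using only the selection split.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_sel[:, top_features], y_sel)

# All pairs among the selected features (column positions in the reduced matrix).
pairs = list(combinations(range(top_k), 2))

# Two-way partial dependence for the first pair, averaged over the held-out rows.
# method="brute" makes the averaging use the data passed in (X_pdp).
first_pair = pairs[0]
pd_result = partial_dependence(
    model,
    X_pdp[:, top_features],
    features=list(first_pair),
    grid_resolution=5,
    method="brute",
    kind="average",
)
surface = pd_result["average"][0]  # 2-D grid of averaged predictions
print(len(pairs), surface.shape)
```

Looping over `pairs` instead of taking only `first_pair` would give all of the pairwise surfaces for the selected subset.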

Thanks!
