Display the dropped dummy when using PDPBox


I am currently working on how machine learning models can be interpreted, and I found the "pdp_plot" function from the PDPBox package very useful for showing how the predicted outcome is affected by changes in explanatory variables. However, I have not found a way to show all dummy variables, including the one that was dropped in the data pre-processing step.

In my initial dataset I had an explanatory variable called "Area" with 6 unique values: A, B, C, D, E, F. After creating dummy variables and dropping the first column, the dataset used for training my XGB model included Area_B, Area_C, Area_D, Area_E, Area_F.
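To make the setup concrete, here is a minimal sketch of that encoding step, assuming it was done with pandas `get_dummies` and `drop_first=True` (the toy data below is made up for illustration):

```python
import pandas as pd

# Toy data standing in for the original dataset (values are made up).
df = pd.DataFrame({"Area": ["A", "B", "C", "D", "E", "F", "A", "C"]})

# drop_first=True removes the first level (Area_A), which becomes the baseline.
dummies = pd.get_dummies(df["Area"], prefix="Area", drop_first=True)
dummy_cols = list(dummies.columns)

# Rows from area A are exactly those where every remaining dummy is 0.
is_baseline = dummies.sum(axis=1) == 0

print(dummy_cols)
print(is_baseline.tolist())
```

So area A is not lost: it is encoded implicitly as the row pattern in which Area_B through Area_F are all 0.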

When I use the 'pdp_isolate' and then 'pdp_plot' functions from PDPBox, the plot shows the case where Area_B = 1, then the case where Area_C = 1, then the case where Area_D = 1, and so on, but it does not show the case where all of these dummy variables are 0 (that is, Area A). Does anyone know how to display this case as well?
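For reference, the quantity I am after can be computed by hand without PDPBox. The sketch below is only an illustration under assumptions: `one_hot_partial_dependence` is a hypothetical helper (not part of PDPBox), and `toy_predict` is a stand-in for a fitted model's `predict` method. For each category it forces that category on every row, with all dummies set to 0 for the dropped one, and averages the predictions:

```python
import numpy as np
import pandas as pd


def one_hot_partial_dependence(predict, X, dummy_cols, baseline_label):
    """Mean prediction when each category, including the dropped one,
    is forced on every row of X. Hypothetical helper, not a PDPBox API."""
    results = {}
    for label in [baseline_label] + list(dummy_cols):
        X_mod = X.copy()
        X_mod[dummy_cols] = 0      # switch every dummy off ...
        if label != baseline_label:
            X_mod[label] = 1       # ... then switch on the one being evaluated
        results[label] = float(np.mean(predict(X_mod)))
    return results


# Stand-in for a fitted model's predict method (assumption for this example).
def toy_predict(X):
    return 10.0 + 2.0 * X["Area_B"] - 1.0 * X["Area_C"] + 0.5 * X["Age"]


X = pd.DataFrame({
    "Area_B": [1, 0, 0],
    "Area_C": [0, 1, 0],
    "Age": [2.0, 4.0, 6.0],
})

pd_values = one_hot_partial_dependence(
    toy_predict, X, ["Area_B", "Area_C"], baseline_label="Area_A"
)
print(pd_values)
```

With a real model, `predict` would presumably be something like the XGBoost model's own `predict`, and the resulting dictionary could be drawn as a bar chart next to the PDPBox output. What I would like is for PDPBox to include this baseline case directly.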

Thanks a lot for your time. Hope the answer will also help the community. Please reach out if clarification is needed!
