Top N features that are responsible for the local SHAP value

I've been using SHAP values with my ML model to understand how much each feature contributes to an individual (local) prediction. As I understand it, the SHAP values of all features for one sample sum to the difference between that sample's prediction and the baseline (expected) value. This lets us decompose a prediction in a graph like this:

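import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_train)
i = 400  # positional row index into shap_values
# use iloc so the feature row matches shap_values[i] by position
shap.force_plot(explainer.expected_value, shap_values[i], features=X_train.iloc[i], feature_names=X_train.columns)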

Is there a way to get the top 3 features that contribute positively, and the top 3 that contribute negatively, to the prediction in my example? For instance:

  1. LSTAT, PTRATIO and INDUS help push the value to the right
  2. RM, TAX and RAD push in the other direction

I need these features as an array or a DataFrame so I can perform further operations on them.

There is 1 answer

Sergey Bushmanov

Top 3 features that contribute positively for your example:

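import numpy as np

i = 400
features = X_train.columns
# argsort is ascending: most negative SHAP values first, most positive last
id_sorted = np.argsort(shap_values[i])
# last three indices, reversed, so the largest contribution comes first
top3_positive = features[id_sorted[:-4:-1]]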

Top 3 features that contribute negatively:

top3_negative = features[id_sorted[:3]]
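
Note that the slices above take the three largest and three smallest values without checking their sign, so if a row has fewer than three positive (or negative) contributions, features of the opposite sign can slip in. Since the question asks for a DataFrame, here is a minimal sketch that filters by sign first and keeps the SHAP values alongside the names. The helper name `top_contributors` and the sample numbers are hypothetical, made up for illustration; they do not come from the question's model.

```python
import numpy as np
import pandas as pd


def top_contributors(shap_row, feature_names, n=3):
    """Return (positive, negative) DataFrames for a single sample.

    shap_row: 1-D sequence of SHAP values for one row.
    feature_names: feature names in the same order as shap_row.
    Hypothetical helper, not part of the shap library.
    """
    contrib = pd.Series(np.asarray(shap_row, dtype=float), index=list(feature_names))
    # Strictly positive contributions, largest first.
    positive = contrib[contrib > 0].sort_values(ascending=False).head(n)
    # Strictly negative contributions, most negative first.
    negative = contrib[contrib < 0].sort_values(ascending=True).head(n)

    def to_frame(series):
        # Turn the Series into a two-column DataFrame: feature, shap_value.
        return series.rename("shap_value").rename_axis("feature").reset_index()

    return to_frame(positive), to_frame(negative)


# Made-up SHAP values for illustration only.
names = ["LSTAT", "PTRATIO", "INDUS", "RM", "TAX", "RAD", "CRIM"]
values = [2.1, 0.9, 0.4, -1.8, -0.7, -0.3, 0.05]
top_pos, top_neg = top_contributors(values, names)
print(top_pos)
print(top_neg)
```

With the variables from the question this would be called as `top_contributors(shap_values[i], X_train.columns)`, assuming `shap_values` is a 2-D array with one row per sample, as it is for the regression model here.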