Do we permute columns in the test set when running Permutation Importance?

I've been reading the documentation and tutorials on permutation importance, but none of them state clearly which dataset is actually being permuted.

Just to clarify, would the step-by-step process be as follows?

  1. Split dataset into X_train, X_val and X_test

  2. Train the model on X_train, using X_val to e.g. find the best epoch

  3. Run the trained model on X_test, taking note of the metric we are measuring

  4. Permute a single feature (column) in X_test, and run the same model on this permuted X_test dataset

  5. Take note of the same metric and compare the two

  6. Repeat for each feature, without retraining or otherwise changing the model.
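
To make steps 3 to 6 concrete, here is a minimal sketch of what I have in mind. It is plain Python, not any library's implementation, and everything in it is an assumption for illustration: `toy_predict` stands in for the already-trained model, `accuracy` is just an example metric, and `n_repeats` (averaging over several shuffles of the same column) is my own addition to smooth out the randomness of a single permutation.

```python
import random


def permutation_importance(predict, metric, X_test, y_test, n_repeats=5, seed=0):
    """Mean drop in the metric when one column of X_test is shuffled.

    predict: callable mapping a list of rows to a list of predictions
             (stand-in for the trained model, which is never refitted here)
    metric:  callable(y_true, y_pred) -> float, where higher is better
    """
    rng = random.Random(seed)
    baseline = metric(y_test, predict(X_test))  # step 3: score on untouched X_test
    importances = []
    for j in range(len(X_test[0])):  # step 6: one feature at a time
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X_test]
            rng.shuffle(column)  # step 4: permute only column j
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_test, column)]
            # step 5: compare the permuted score with the baseline
            drops.append(baseline - metric(y_test, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def toy_predict(X):
    # Hypothetical "trained model": it only ever looks at feature 0.
    return [1 if row[0] > 0.5 else 0 for row in X]


# Synthetic test set: the label depends on feature 0, feature 1 is pure noise.
data_rng = random.Random(42)
X_test = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y_test = [1 if row[0] > 0.5 else 0 for row in X_test]

baseline, importances = permutation_importance(toy_predict, accuracy, X_test, y_test)
print(baseline, importances)
```

In this toy setup, shuffling feature 1 cannot change the predictions, so its importance is exactly zero, while shuffling feature 0 should cost a large share of the accuracy. The labels `y_test` are left in their original order throughout; only one input column is shuffled at a time.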

Aside question: would it be worth repeating this whole permutation process several times, with a different X_train, X_val and X_test split on each repeat? I know the resulting model will be different each time, but I want a broad view of how the general model (with fixed hyperparameters) behaves when trained on different datasets, since keeping X_test fixed may skew the perceived importance of certain features.
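
For the aside, this is roughly the loop I am picturing, again as a sketch under assumptions rather than a reference implementation. `fit_stump` is a hypothetical toy trainer standing in for my real training routine, and I have left out X_val because the toy trainer has nothing to tune. Each repeat draws a fresh split, retrains, computes the importances on that split's test portion, and the per-feature mean and standard deviation are reported at the end.

```python
import random
import statistics


def fit_stump(X, y):
    """Toy trainer: pick the single feature whose 0.5 threshold fits y best."""
    def train_acc(j):
        return sum((row[j] > 0.5) == bool(t) for row, t in zip(X, y)) / len(y)
    best = max(range(len(X[0])), key=train_acc)
    return lambda rows: [1 if row[best] > 0.5 else 0 for row in rows]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def importance_on_split(predict, X_test, y_test, rng):
    """Accuracy drop per feature after shuffling that column of X_test once."""
    baseline = accuracy(y_test, predict(X_test))
    drops = []
    for j in range(len(X_test[0])):
        column = [row[j] for row in X_test]
        rng.shuffle(column)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_test, column)]
        drops.append(baseline - accuracy(y_test, predict(X_perm)))
    return drops


def repeated_importance(X, y, n_splits=10, test_fraction=0.3, seed=0):
    rng = random.Random(seed)
    n_test = int(len(X) * test_fraction)
    per_split = []
    for _ in range(n_splits):
        idx = list(range(len(X)))
        rng.shuffle(idx)  # new random train/test split on every repeat
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        predict = fit_stump([X[i] for i in train_idx], [y[i] for i in train_idx])
        per_split.append(importance_on_split(
            predict, [X[i] for i in test_idx], [y[i] for i in test_idx], rng))
    columns = list(zip(*per_split))  # regroup the results by feature
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.stdev(c) for c in columns]
    return means, stds


# Synthetic data: the label depends on feature 0, feature 1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(300)]
y = [1 if row[0] > 0.5 else 0 for row in X]

means, stds = repeated_importance(X, y)
print(means, stds)
```

The spread in `stds` would then mix two sources of variation, the retrained model and the changing test set, which is exactly the part I am unsure how to interpret.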

There are 0 answers