Getting feature importances out of an AdaBoosted linear regression


I have the following code:

from sklearn.ensemble import AdaBoostRegressor
from sklearn.linear_model import LinearRegression

modelClf = AdaBoostRegressor(base_estimator=LinearRegression(), learning_rate=2, n_estimators=427, random_state=42)

modelClf.fit(X_train, y_train)

While trying to interpret and improve the results, I wanted to see the feature importances, but I get an error saying that linear regressions don't really do that kind of thing.

Alright, sounds reasonable, so I tried using .coef_ instead, since it should work for linear regressions, but that, in turn, turned out to be incompatible with the AdaBoost regressor.

Is there any way to find the feature importances, or is it impossible when AdaBoost is used on a linear regression?
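
For reference, a minimal sketch of the two failing lookups on the fitted model (error messages paraphrased):

modelClf.feature_importances_  # AttributeError: the LinearRegression base estimator has no such attribute
modelClf.coef_                 # AttributeError: AdaBoostRegressor itself has no coef_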


There are 2 answers

Abhishek

Checked with the code below; there is an attribute for feature importances:

import pandas as pd
import random
from sklearn.ensemble import AdaBoostRegressor

# Toy data: y depends only on x2
df = pd.DataFrame({'x1': random.choices(range(0, 100), k=10),
                   'x2': random.choices(range(0, 100), k=10)})

df['y'] = df['x2'] * .5

X = df[['x1', 'x2']].values
y = df['y'].values

# Default base estimator (a decision tree), so feature_importances_ is available
regr = AdaBoostRegressor(random_state=0, n_estimators=100)
regr.fit(X, y)

regr.feature_importances_

Output: you can see that feature 2 is more important, since y is just half of x2 (the data was created that way). Note that this works because the default base estimator is a decision tree, which has a feature_importances_ attribute; a LinearRegression base estimator, as in the question, does not.
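
Since the screenshot is gone, a quick way to see the same thing (a sketch; exact values depend on the random draw):

print(regr.feature_importances_)           # two nonnegative values summing to 1
print(regr.feature_importances_.argmax())  # 1, i.e. x2 dominates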

Ben Reiniger

Issue #12137 suggests adding support for this using the coef_ values, although a choice needs to be made about how to normalize negative coefficients. There's also the question of when coefficients are really good representatives of importance (you should at least scale your data first). And then there's the question of when adaptive boosting helps a linear model in the first place.
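
On the scaling point, a minimal sketch (assuming the X_train and y_train from the question): standardize the features first so the coefficient magnitudes are comparable.

from sklearn.ensemble import AdaBoostRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Standardize so the coefficients live on a comparable scale
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

modelClf = AdaBoostRegressor(base_estimator=LinearRegression(), random_state=42)
modelClf.fit(X_train_scaled, y_train)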

One way to do this quickly is to subclass LinearRegression:

from sklearn.linear_model import LinearRegression

class MyLinReg(LinearRegression):
    @property
    def feature_importances_(self):
        return self.coef_  # assuming one output; negative coefficients still need handling

modelClf = AdaBoostRegressor(base_estimator=MyLinReg(), ...)
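
This works because AdaBoostRegressor's feature_importances_ is just the estimator_weights_-weighted average of each fitted copy's feature_importances_. If you'd rather not subclass, here is a sketch of the same averaging applied to absolute coefficients (taking magnitudes is one answer to the negative-coefficient question; assumes the fitted modelClf from the question):

import numpy as np

# Weighted average of |coef_| across the boosted copies of LinearRegression
ests = modelClf.estimators_
weights = modelClf.estimator_weights_[: len(ests)]  # boosting may stop early
abs_coefs = np.array([np.abs(est.coef_) for est in ests])
importances = np.average(abs_coefs, axis=0, weights=weights)
importances /= importances.sum()  # normalize to sum to 1, like tree importances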