Interpreting logistic regression coefficients of scaled features


I'm using logistic regression to estimate the probability of scoring a goal in soccer/football. I have 5 features. My target values are 1 (goal) or 0 (no goal).

As is usually recommended, I scaled my features before fitting the model. I used MinMaxScaler, which maps each feature to the range [0, 1] as follows: X_scaled = (x - x_min) / (x_max - x_min)

The coefficients of my logistic regression model are the following:

coef = [[-2.26286643 4.05722387 0.74869811 0.20538172 -0.49969841]]

My first thought is that the second feature is the most important, followed by the first. Is this always true?

I read on this site that "In other words, for a one-unit increase in the 'the second feature', the expected change in log odds is 4.05722387.", but there the features were normalized with a mean of 50 and some standard deviation.

If I do not scale my features, the coefficients of the model are the following:

coef = [[-0.04743728 0.04394143 -0.00247654 0.23769469 -0.55051824]]

And now it seems that the first feature is more important than the second one. I read in the literature on my topic that this is indeed true. So this confuses me, of course.

My questions are:

  • Which of my features is the most important, and what is the best methodology to find it (and why)?
  • How can I interpret the meaning of the scaled coefficients? E.g. what does an increase of 1 meter in feature 1 mean? Can I throw 1 meter into the MinMaxScaler, see what comes out and use that as 'the one-unit increase'?
  • Is it true that the final probability will be computed as y = 1/(1 + exp(-fx)) with fx = intercept + feature1*coef1 + feature2*coef2 + ... (with all features scaled)?

There is 1 answer

KM_83

Which of my features is the most important, and what is the best methodology to find it (and why)?

Look at several versions of marginal effects calculations. For example, see the overview/discussion in a blog, Stata's example, and the resources for R.
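As one illustration of what a marginal effects calculation can look like, here is a minimal sketch of the derivative-based average marginal effect (AME) for a logistic model, where AME_j is the mean over observations of p_i * (1 - p_i), multiplied by coef_j. The function name, the data rows, the coefficients and the intercept are all made up for illustration; none of them come from the question.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def average_marginal_effects(X, coefs, intercept):
    """Derivative-based AME per feature: mean_i[p_i * (1 - p_i)] * coef_j."""
    weights = []
    for row in X:
        p = sigmoid(intercept + sum(c * x for c, x in zip(coefs, row)))
        weights.append(p * (1.0 - p))
    mean_weight = sum(weights) / len(weights)
    return [mean_weight * c for c in coefs]

# Hypothetical data: three observations, two features (already scaled to [0, 1]).
X = [[0.1, 0.8], [0.5, 0.5], [0.9, 0.2]]
coefs = [-2.0, 4.0]   # hypothetical coefficients
intercept = -1.0      # hypothetical intercept

ame = average_marginal_effects(X, coefs, intercept)
print(ame)
```

Each effect is expressed per unit of the feature as it was fed to the model, so on min-max scaled data it describes a move across the feature's full observed range.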

How can I interpret the meaning of the scaled coefficients? E.g. what does an increase of 1 meter in feature 1 mean? Can I throw 1 meter into the MinMaxScaler, see what comes out and use that as 'the one-unit increase'?

The interpretation depends on which marginal effects you calculate. You just need to account for the scaling when you describe how a one-unit increase or decrease in X changes the probability, the odds ratio, etc.
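To make "accounting for the scaling" concrete, here is a small sketch for the min-max case. Since X_scaled = (x - x_min)/(x_max - x_min), a 1 meter increase in the raw feature moves the scaled feature by 1/(x_max - x_min), so the change in log odds per meter is the scaled coefficient divided by that range. The coefficient is the one from the question; the values of x_min and x_max are hypothetical, because the question does not give them. With a fitted scikit-learn MinMaxScaler they could be read from its `data_min_` and `data_max_` attributes.

```python
import math

# Coefficient of feature 1 on the min-max scaled data (from the question).
coef_scaled = -2.26286643

# Hypothetical raw range of feature 1 in meters (not given in the question).
x_min, x_max = 2.0, 52.0
data_range = x_max - x_min

# A 1 m increase moves the scaled feature by 1 / data_range,
# so the change in log odds per meter is coef_scaled / data_range.
coef_per_meter = coef_scaled / data_range
odds_ratio_per_meter = math.exp(coef_per_meter)

# Passing the value 1 m through the scaler gives a position, not a step.
scaled_position_of_1m = (1.0 - x_min) / data_range  # depends on x_min
scaled_step_of_1m = 1.0 / data_range                # does not depend on x_min

print(coef_per_meter, odds_ratio_per_meter)
print(scaled_position_of_1m, scaled_step_of_1m)
```

Under these assumptions, feeding 1 meter into the scaler only gives the right step size when x_min happens to be 0; dividing by the range is the safer route.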

Is it true that the final probability will be computed as y = 1/(1 + exp(-fx)) with fx = intercept + feature1*coef1 + feature2*coef2 + ... (with all features scaled)?

Yes; the only difference is that the features x are measured on the scaled range.
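The formula can be sketched by hand as follows. The coefficients are the scaled-model ones from the question; the intercept, the per-feature minima and maxima, and the example shot are hypothetical, since the question does not state them. With scikit-learn, the fitted model's `coef_` and `intercept_` attributes would supply the real values.

```python
import math

def predict_probability(raw_features, mins, maxs, coefs, intercept):
    """Min-max scale the raw features, then apply y = 1 / (1 + exp(-fx))."""
    scaled = [(x - lo) / (hi - lo) for x, lo, hi in zip(raw_features, mins, maxs)]
    fx = intercept + sum(c * s for c, s in zip(coefs, scaled))
    return 1.0 / (1.0 + math.exp(-fx))

# Coefficients of the scaled model, as given in the question.
coefs = [-2.26286643, 4.05722387, 0.74869811, 0.20538172, -0.49969841]

# Everything below is hypothetical and only serves the illustration.
intercept = -1.0
mins = [0.0, 0.0, 0.0, 0.0, 0.0]
maxs = [50.0, 90.0, 10.0, 5.0, 3.0]
shot = [12.0, 35.0, 4.0, 1.0, 0.0]

p_goal = predict_probability(shot, mins, maxs, coefs, intercept)
print(p_goal)
```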