I am currently taking a Udemy course, and in the SVR class the lecturer said that feature scaling has to be applied separately to X and y, because their means and standard deviations are different. Below is a screenshot of the code and the dataset; X is level and y is salary.
[Screenshot: code for feature scaling]
In the data-preprocessing class, the lecturer used a different dataset with more than one independent variable. However, he did not scale those variables independently, as shown in the code. This confuses me, because the independent variables all have different means and standard deviations as well. So why do we not scale them separately? The code and dataset are below.
[Screenshot: code]
[Screenshot: dataset for pre-processing class]
Btw this code is by Kirill Eremenko
Feature scaling normalizes the data to a common range. Several commonly used model classes apply feature scaling internally, so you do not have to do it yourself. The SVR class is less commonly used and does not, so we have to apply feature scaling explicitly.
Scaling the inputs avoids the situation where one or several features dominate the others in magnitude. Without it, the model barely picks up the contribution of the smaller-scale variables, even if they are strong predictors.
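To make this concrete, here is a minimal sketch, assuming scikit-learn's `StandardScaler` is the scaler in question. The age/salary numbers are made up for illustration and are not the course dataset. It shows that a single `fit_transform` call on a multi-column X learns one mean and one standard deviation per column, so each feature ends up on a comparable scale.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: column 0 is age, column 1 is salary (very different scales).
X = np.array([
    [44.0, 72000.0],
    [27.0, 48000.0],
    [30.0, 54000.0],
    [38.0, 61000.0],
])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# One mean and one standard deviation are learned per column.
print(scaler.mean_)
print(scaler.scale_)

# After scaling, every column has mean ~0 and standard deviation ~1.
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```

So the columns of X are not being scaled with shared statistics: each column is standardized with its own mean and standard deviation, even though only one scaler object is used.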
If we don't scale them separately, the distribution and magnitude of the dependent variable might be affected. It is common practice to normalize the dependent and independent variables separately.
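As a rough illustration of scaling X and y separately for SVR, here is a sketch assuming scikit-learn. The level/salary values are invented, and the names `sc_X`, `sc_y` and `regressor` are my own choices, not necessarily the ones in the course code. Each scaler learns its own statistics, and the y scaler is used afterwards to convert the prediction back to the original salary units.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical level/salary data (not the course dataset).
X = np.arange(1, 11, dtype=float).reshape(-1, 1)  # levels 1..10
y = np.array([30, 35, 42, 55, 70, 95, 130, 180, 260, 400], dtype=float) * 1000

# Separate scalers, so X and y each get their own mean and standard deviation.
sc_X = StandardScaler()
sc_y = StandardScaler()
X_scaled = sc_X.fit_transform(X)
# StandardScaler expects 2D input, so reshape y and flatten it again afterwards.
y_scaled = sc_y.fit_transform(y.reshape(-1, 1)).ravel()

regressor = SVR(kernel="rbf")
regressor.fit(X_scaled, y_scaled)

# Predict for a new level: scale the input with sc_X,
# then map the scaled prediction back to salary units with sc_y.
level = np.array([[6.5]])
pred_scaled = regressor.predict(sc_X.transform(level))
pred = sc_y.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()
print(pred)
```

Keeping a dedicated scaler for y is what makes the `inverse_transform` step possible. If y had been scaled with the statistics of X, there would be no clean way to get back to the original units.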