Say I have a dataframe with features and labels:
f1 f2 label
-1000 -100 1
-5 3 2
0 4 3
1 5 1
3 6 1
1000 100 2
I want to filter outliers from columns f1 and f2 to get:
f1 f2 label
-5 3 2
0 4 3
1 5 1
3 6 1
I know that I can do something like this:
data = data[(data > data.quantile(.05)) & (data < data.quantile(.95))]
But the 'label' column will also be filtered. How can I exclude certain columns from the filtering? I don't want to filter each column manually because there are dozens of them. Thanks.
What about the following approach:
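A minimal sketch, assuming the goal is to drop every row in which any feature falls outside its column's 5%-95% quantile range while leaving 'label' untouched. The names `feature_cols`, `low`, `high`, `mask`, and `filtered` are illustrative and do not come from the question:

```python
import pandas as pd

# Sample data from the question.
data = pd.DataFrame({
    "f1": [-1000, -5, 0, 1, 3, 1000],
    "f2": [-100, 3, 4, 5, 6, 100],
    "label": [1, 2, 3, 1, 1, 2],
})

# Every column except the ones that must not be filtered (assumed: only 'label').
feature_cols = data.columns.difference(["label"])
features = data[feature_cols]

# Per-column quantile bounds, computed on the feature columns only.
low = features.quantile(0.05)
high = features.quantile(0.95)

# Keep a row only if all of its feature values lie strictly inside the bounds.
mask = ((features > low) & (features < high)).all(axis=1)
filtered = data[mask]

print(filtered)
```

With dozens of feature columns, only the list passed to `difference` needs to name the columns to skip. Indexing with the row-wise boolean Series `mask` drops whole rows, whereas indexing with a boolean DataFrame, as in the one-liner from the question, keeps every row and sets the masked cells to NaN.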