Filter outliers from Pandas dataframe from all columns except one

Question

1.5k views Asked by shda At 09 January 2017 at 23:06

Say I have a dataframe with features and labels:

f1    f2   label
-1000 -100 1
-5    3    2
0     4    3
1     5    1
3     6    1
1000  100  2

I want to filter outliers from columns f1 and f2 to get:

f1    f2   label
-5    3    2
0     4    3
1     5    1
3     6    1

I know that I can do something like this:

data = data[(data > data.quantile(.05)) & ( data < data.quantile(.95))]

But the 'label' column will also be filtered. How can I exclude certain columns from the filtering? I don't want to filter each column manually because there are dozens of them. Thanks.
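For reference, here is a minimal sketch of what the one-liner above does on the sample data. The frame is rebuilt by hand from the table in the question. Note that indexing with a boolean DataFrame masks non-matching cells with NaN rather than dropping rows, and the 'label' column is masked along with the features:

```python
import pandas as pd

# Sample data rebuilt from the table in the question
data = pd.DataFrame({
    "f1": [-1000, -5, 0, 1, 3, 1000],
    "f2": [-100, 3, 4, 5, 6, 100],
    "label": [1, 2, 3, 1, 1, 2],
})

# Element-wise mask computed over every column, 'label' included
mask = (data > data.quantile(.05)) & (data < data.quantile(.95))

# Boolean-DataFrame indexing keeps all rows and sets failing cells to NaN
masked = data[mask]
print(masked)
```

All six rows are still present, but most of the 'label' values have been replaced by NaN, which is the side effect to avoid.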

**MaxU - stand with Ukraine** · Accepted Answer · 2017-01-09T23:15:29+00:00

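What about the following approach? Drop the 'label' column, build the quantile mask on the remaining columns only, and keep the rows of the original frame where every remaining column passes: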

In [306]: x = data.drop('label', axis=1)

In [307]: x.columns
Out[307]: Index(['f1', 'f2'], dtype='object')

In [308]: data[((x > x.quantile(.05)) & (x < x.quantile(.95))).all(axis=1)]
Out[308]:
   f1  f2  label
1  -5   3      2
2   0   4      3
3   1   5      1
4   3   6      1
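
The same idea can be wrapped in a small helper when several columns need to be left out. This is only a sketch: the function name `filter_outliers` and its `exclude`, `lower` and `upper` parameters are made up for illustration and are not part of pandas, and the sample frame is rebuilt by hand from the question.

```python
import pandas as pd

def filter_outliers(df, exclude, lower=0.05, upper=0.95):
    """Keep rows whose non-excluded columns all lie strictly
    between the given per-column quantiles (hypothetical helper)."""
    # Columns that take part in the outlier test
    features = df.drop(columns=exclude)
    lo = features.quantile(lower)
    hi = features.quantile(upper)
    # A row is kept only if every feature column passes
    keep = ((features > lo) & (features < hi)).all(axis=1)
    # Index the original frame so the excluded columns are returned untouched
    return df[keep]

# Sample data rebuilt from the table in the question
data = pd.DataFrame({
    "f1": [-1000, -5, 0, 1, 3, 1000],
    "f2": [-100, 3, 4, 5, 6, 100],
    "label": [1, 2, 3, 1, 1, 2],
})

filtered = filter_outliers(data, exclude=["label"])
print(filtered)
```

On the sample data this keeps rows 1 through 4, matching the output above. Because the boolean mask is a Series with one value per row, indexing with it drops whole rows instead of filling cells with NaN.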
