Set values outside a defined set to a given value (e.g. NaN) for a column in a pandas data frame

Question

Asked by Krzysztof Słowiński on 08 August 2018 at 08:45

Given a set of valid values, every value in a pandas data frame column that is not in the set should be replaced with a given value, e.g. NaN. The values in the set and in the data frame can be assumed to be numeric.

Given the following set of valid values and data frame:

import pandas as pd
valid = {5, 22}
df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})

    a   b
0   5  12
1   1   3
2   7  10
3  22   9

Restricting column a to the valid values would result in:

      a   b
0   5.0  12
1   NaN   3
2   NaN  10
3  22.0   9

There are 2 answers

U13-Forward On 08 August 2018 at 08:54

Why not use isin:

df.loc[~df['a'].isin(valid), 'a'] = np.nan

Example:

import numpy as np
import pandas as pd
valid = {5, 22}
df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
# pd.np was removed in pandas 2.0, so take NaN from numpy directly
df.loc[~df['a'].isin(valid), 'a'] = np.nan
print(df)

Output:

      a   b
0   5.0  12
1   NaN   3
2   NaN  10
3  22.0   9
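
The question asks for "a given value", not only NaN. As a minimal sketch, assuming a sentinel of -1 (the name fill_value and the choice of -1 are illustrative and not part of the original answer), the same isin mask can write any replacement. An integer sentinel also leaves the column's integer dtype unchanged:

```python
import pandas as pd

valid = {5, 22}
fill_value = -1  # hypothetical sentinel, chosen only for illustration

df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})

# Rows whose 'a' value is NOT in the valid set
invalid_rows = ~df['a'].isin(valid)

# Overwrite only those rows in column 'a'; column 'b' is left alone
df.loc[invalid_rows, 'a'] = fill_value

print(df)
```

Whether a sentinel or NaN is preferable depends on how later steps treat missing data; NaN is skipped by most pandas aggregations, while a sentinel is not.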

**jpp** · Accepted Answer · 2018-08-08T08:48:24+00:00

You can use pd.Series.where:

df['a'].where(df['a'].isin(valid), inplace=True)

print(df)

      a   b
0   5.0  12
1   NaN   3
2   NaN  10
3  22.0   9

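A few points to note:

pd.Series.isin works more efficiently with a list than with a set. See also "Pandas pd.Series.isin performance with set versus array".
With the default integer dtype, your series will be converted to float, since NaN is a float value.
With inplace=True the result does not need to be assigned back to a variable. This relies on df['a'] being a view of the frame; with Copy-on-Write enabled, the chained in-place call leaves df unchanged, so assigning the result back (df['a'] = df['a'].where(...)) is the safer form.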
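
The points above can be sketched as follows. This is one possible variant, not the answer's original code: it assigns the result of where back to the column instead of using inplace=True, passes isin a list, and, as an optional extra, uses the nullable Int64 dtype to avoid the conversion to float. The names mask, df_float and df_nullable are illustrative.

```python
import pandas as pd

valid = {5, 22}
df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})

# Boolean mask of rows to keep; isin is given a list rather than a set
mask = df['a'].isin(list(valid))

# Variant 1: assign the result back instead of relying on inplace=True.
# Values where mask is False become NaN, so the column turns into float64.
df_float = df.copy()
df_float['a'] = df_float['a'].where(mask)

# Variant 2: convert to the nullable Int64 dtype first.
# Masked values become <NA> and the remaining values stay integers.
df_nullable = df.copy()
df_nullable['a'] = df_nullable['a'].astype('Int64').where(mask)

print(df_float)
print(df_nullable)
```

The nullable dtype is worth considering only if downstream code handles pd.NA; otherwise the float column with NaN is the more widely compatible result.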
