I want to check that all the dtypes of a dataframe are contained in a subset of dtypes which I specify.
I thought this might be a good way to do that, but I am getting unexpected results:
import pandas as pd
df = pd.DataFrame({'A': [1., 2., 3.], 'B': [100, 101, 102], 'C': ['a', 'b', 'c']})
assert df.dtypes.A == 'float64'
assert df.dtypes.B == 'int64'
assert df.dtypes.C == 'object'
print(df.dtypes.isin(['int64', 'float64']))
Output:
A True
B False
C False
dtype: bool
In fact, the results seem to vary each time I run the script. Sometimes I get this:
A False
B False
C False
dtype: bool
Sometimes this:
A False
B True
C False
dtype: bool
Clearly, df.dtypes.isin was not meant to be used this way.
The following works as expected, so I suspect this is something to do with my use of strings in place of the dtype objects:
import numpy as np
df.dtypes.isin([np.dtype('float64'), np.dtype('int64')])
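For completeness, here is a minimal self-contained sketch of that working variant. It assumes NumPy is imported as np, and the helper name dtypes_in is hypothetical (it is my own wrapper, not part of pandas). It only converts the dtype names to np.dtype objects before calling isin, then reduces the result to a single boolean for the "are all dtypes allowed?" check.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1., 2., 3.], 'B': [100, 101, 102], 'C': ['a', 'b', 'c']})


def dtypes_in(frame, allowed_names):
    """Hypothetical helper: flag which columns have a dtype in allowed_names.

    The names are converted to np.dtype objects first, so isin compares
    dtype objects with dtype objects rather than with plain strings.
    """
    allowed = [np.dtype(name) for name in allowed_names]
    return frame.dtypes.isin(allowed)


result = dtypes_in(df, ['int64', 'float64'])
print(result)

# Single answer to "are all dtypes in the allowed subset?"
all_allowed = bool(result.all())
print(all_allowed)
```

With the frame above I would expect columns A and B to be flagged and column C not, so the overall check should fail because of the string column.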
(I realize I could use select_dtypes if I wanted to select the columns of a certain type or is_numeric_dtype if I wanted to check all columns are a numeric type, but that is not what I want to do.)
Additional Info
People in the comments are saying it is not reproducible. Here is a copy of my console output.
(base) Mac-mini-2:stackoverflow username$ python pandas_dtypes_isin.py
A False
B False
C False
dtype: bool
(base) Mac-mini-2:stackoverflow username$ python pandas_dtypes_isin.py
A True
B False
C False
dtype: bool
(base) Mac-mini-2:stackoverflow username$ python pandas_dtypes_isin.py
A True
B False
C False
dtype: bool
(base) Mac-mini-2:stackoverflow username$ python pandas_dtypes_isin.py
A True
B True
C False
dtype: bool
(base) Mac-mini-2:stackoverflow username$ cat pandas_dtypes_isin.py
import pandas as pd
df = pd.DataFrame({'A': [1., 2., 3.], 'B': [100, 101, 102], 'C': ['a', 'b', 'c']})
assert df.dtypes.A == 'float64'
assert df.dtypes.B == 'int64'
assert df.dtypes.C == 'object'
print(df.dtypes.isin(['int64', 'float64']))