Why is df.dtypes.isin not giving expected results when passed a list of strings?

I want to check that all the dtypes of a dataframe are contained in a subset of dtypes which I specify.

I thought this might be a good way to do that, but I am getting unexpected results:

import pandas as pd
df = pd.DataFrame({'A': [1., 2., 3.], 'B': [100, 101, 102], 'C': ['a', 'b', 'c']})
assert df.dtypes.A == 'float64'
assert df.dtypes.B == 'int64'
assert df.dtypes.C == 'object'
print(df.dtypes.isin(['int64', 'float64']))

Output

A     True
B    False
C    False
dtype: bool

In fact, the results seem to vary each time I run the script. Sometimes I get this:

A    False
B    False
C    False
dtype: bool

Sometimes this:

A    False
B     True
C    False
dtype: bool

Clearly, df.dtypes.isin was not meant to be used this way.

The following works as expected, so I suspect this is something to do with my use of strings in place of the dtype objects:

import numpy as np
df.dtypes.isin([np.dtype('float64'), np.dtype('int64')])

(I realize I could use select_dtypes if I wanted to select the columns of a certain type or is_numeric_dtype if I wanted to check all columns are a numeric type, but that is not what I want to do.)
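
A minimal sketch of the workaround above, assuming the allowed types are supplied as strings: each string is converted with np.dtype first, so isin only ever compares dtype objects with dtype objects. The names allowed_names, allowed, mask and all_allowed are illustrative only, and this is one possible approach rather than a confirmed explanation of the behaviour.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1., 2., 3.], 'B': [100, 101, 102], 'C': ['a', 'b', 'c']})

# Allowed dtypes, written as strings for convenience (illustrative names).
allowed_names = ['int64', 'float64']

# Convert each string to an actual numpy dtype object before the lookup.
allowed = [np.dtype(name) for name in allowed_names]

# Boolean Series indexed by column name: is this column's dtype allowed?
mask = df.dtypes.isin(allowed)

# Single answer to "are all dtypes in the allowed subset?"
all_allowed = bool(mask.all())

print(mask)
print(all_allowed)
```

Here column C holds strings, so all_allowed should come out False for this dataframe.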

Additional Info

People in the comments are saying it is not reproducible. Here is a copy of my console output.

(base) Mac-mini-2:stackoverflow username$ python pandas_dtypes_isin.py
A    False
B    False
C    False
dtype: bool
(base) Mac-mini-2:stackoverflow username$ python pandas_dtypes_isin.py
A     True
B    False
C    False
dtype: bool
(base) Mac-mini-2:stackoverflow username$ python pandas_dtypes_isin.py
A     True
B    False
C    False
dtype: bool
(base) Mac-mini-2:stackoverflow username$ python pandas_dtypes_isin.py
A     True
B     True
C    False
dtype: bool
(base) Mac-mini-2:stackoverflow username$ cat pandas_dtypes_isin.py
import pandas as pd
df = pd.DataFrame({'A': [1., 2., 3.], 'B': [100, 101, 102], 'C': ['a', 'b', 'c']})
assert df.dtypes.A == 'float64'
assert df.dtypes.B == 'int64'
assert df.dtypes.C == 'object'
print(df.dtypes.isin(['int64', 'float64']))