Filter a DataFrame by values from another DataFrame


I have 2 pandas DataFrames: users and interactions.

I need to filter the first one so that it keeps only the rows whose users['user_id'] value appears in interactions['user_id']:

users = users[users.user_id.isin(interactions['user_id'])]

I get the following DataFrame:

        Unnamed: 0  user_id         age        income sex  kids_flg
0                0   973171   age_25_34  income_60_90   М         1
1                1   962099   age_18_24  income_20_40   М         0
3                3   721985   age_45_54  income_20_40   Ж         0
4                4   704055   age_35_44  income_60_90   Ж         0
5                5  1037719   age_45_54  income_60_90   М         0
...            ...      ...         ...           ...  ..       ...
818672      840184   529394   age_25_34  income_40_60   Ж         0
818674      840186    80113   age_25_34  income_40_60   Ж         0
818676      840188   312839  age_65_inf  income_60_90   Ж         0
818677      840189   191349   age_45_54  income_40_60   М         1
818678      840190   393868   age_25_34  income_20_40   М         0

[566772 rows x 6 columns]

Now let's count the number of users['user_id'] values that are not in interactions['user_id']:

print(users['user_id'].size - interactions['user_id'].unique().size)
>> 98359
print(users['user_id'].size)  # number of values in users['user_id']
>> 818683

Note that 818683 - 98359 = 720324, which is not equal to 566772.

What am I doing wrong?

I don't know where the problem is. Can you help me?
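
A minimal sketch of one possible cause, using small made-up frames (the ids below are hypothetical, not the real data). The subtraction above only matches the filter result if every id in interactions['user_id'] also exists in users['user_id'] and users['user_id'] has no duplicates. If interactions contains ids that are absent from users, they are still subtracted, so the two numbers diverge. Counting directly with isin avoids that assumption:

```python
import pandas as pd

# Toy data (hypothetical): ids 9 and 10 exist only in interactions.
users = pd.DataFrame({"user_id": [1, 2, 3, 4, 5]})
interactions = pd.DataFrame({"user_id": [2, 2, 3, 9, 10]})

# Boolean mask: True where the user's id appears in interactions.
in_both = users["user_id"].isin(interactions["user_id"])

kept = int(in_both.sum())        # rows that survive the filter
dropped = int((~in_both).sum())  # rows removed by the filter

# The subtraction from the question: it also subtracts ids 9 and 10,
# which never matched any row in users.
naive = users["user_id"].size - interactions["user_id"].unique().size

print(kept, dropped, naive)  # 2 3 1
```

Here the filter drops 3 rows, while the subtraction reports 1. On the real data, the same check would be (~users['user_id'].isin(interactions['user_id'])).sum(), computed on the unfiltered users frame.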

