I have two string columns, and I'd like to write a function that removes the rows where full_name contains part_name:
import pandas as pd

df = pd.DataFrame({'city_code': ['34', '36', '89', '34'],
                   'full_name': ['WXYZ(24)', 'ZYXW', 'YZWX', 'WXYZ(24)'],
                   'part_name': ['WXYZ', 'ABCD', 'YZWX', 'ABCD']})
print(df)
city_code  full_name  part_name
34         WXYZ(24)   WXYZ
36         ZYXW       ABCD
89         YZWX       YZWX
34         WXYZ(24)   ABCD
The output I want is:
city_code  full_name  part_name
36         ZYXW       ABCD
34         WXYZ(24)   ABCD
These are the only rows where part_name is not contained within full_name. I tried the following and received this error:
df = df[~df['full_name'].str.contains(df['part_name'])]
TypeError: unhashable type: 'Series'
I've seen similar questions on this, but the resolution there was to use a dictionary. As far as I can tell, that isn't suitable here, because I need to remove rows based on the two values within each row.
Please let me know if I can provide any further detail.
Code
Although vectorized operations might be possible, here is a non-vectorized solution that should work for now.
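The code for this answer did not survive, so what follows is only a minimal sketch of one non-vectorized approach, not necessarily the original answerer's. It pairs the two columns row by row and uses Python's `in` operator for a literal substring test; the names `contained` and `out` are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'city_code': ['34', '36', '89', '34'],
                   'full_name': ['WXYZ(24)', 'ZYXW', 'YZWX', 'WXYZ(24)'],
                   'part_name': ['WXYZ', 'ABCD', 'YZWX', 'ABCD']})

# Row-wise substring test: True where part_name occurs inside full_name.
# `contained` is a hypothetical helper name, not part of pandas.
contained = [part in full
             for full, part in zip(df['full_name'], df['part_name'])]

# Build a boolean mask aligned to df's index, then keep only the rows
# where part_name is NOT contained in full_name.
mask = pd.Series(contained, index=df.index)
out = df[~mask]
print(out)
```

With the sample data, the rows at index 1 and 3 remain. Because `in` compares plain strings, characters such as the parentheses in 'WXYZ(24)' are matched literally rather than as a regular expression. The original attempt fails because `str.contains` expects a single pattern string, not a Series of per-row patterns.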
out
Example Code
Your example code has many typos.