I was wondering whether there is a way to check for, and then drop, rows that are not unique.
My data frame looks something like this:
   ID1  ID2  weight
0    2    4     0.5
1    3    7     0.8
2    4    2     0.5
3    7    3     0.8
4    8    2     0.5
5    3    8     0.5
EDIT: I added a couple more rows to show that other unique rows that may have the same weight should be kept.
I think that when I use pandas drop_duplicates(subset=['ID1', 'ID2', 'weight'], keep=False), it compares each row column by column and so does not recognise that rows 0 and 2, and rows 1 and 3, in fact hold the same values with ID1 and ID2 swapped.
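A small reproduction of the behaviour described, assuming the frame is built as shown above (the variable names `df` and `result` are only illustrative). Because the comparison is positional, (2, 4) and (4, 2) count as different rows and nothing is dropped:

```python
import pandas as pd

# Frame from the question.
df = pd.DataFrame({
    "ID1": [2, 3, 4, 7, 8, 3],
    "ID2": [4, 7, 2, 3, 2, 8],
    "weight": [0.5, 0.8, 0.5, 0.8, 0.5, 0.5],
})

# drop_duplicates compares values column by column, so swapped IDs
# are not seen as duplicates and every row survives.
result = df.drop_duplicates(subset=["ID1", "ID2", "weight"], keep=False)
print(len(result))  # 6
```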
Sort the dataframe along axis=1, then use np.unique with the optional parameter return_index=True to get the indices of the unique elements. An alternative approach was suggested by @anky.
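A minimal sketch of the sort-then-np.unique steps, assuming the frame from the question and assuming that only the ID1/ID2 pair decides whether two rows match (weight is carried along, not compared). The `duplicated`-based variant at the end is one common way to get the same result; it is an assumption here, not a reproduction of the suggested alternative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID1": [2, 3, 4, 7, 8, 3],
    "ID2": [4, 7, 2, 3, 2, 8],
    "weight": [0.5, 0.8, 0.5, 0.8, 0.5, 0.5],
})

# Sort each row's ID pair so that (2, 4) and (4, 2) become the same key.
pairs = np.sort(df[["ID1", "ID2"]].to_numpy(), axis=1)

# return_index=True yields the position of the first occurrence of each
# unique row; sorting those positions restores the original row order.
_, first_idx = np.unique(pairs, axis=0, return_index=True)
deduped = df.iloc[np.sort(first_idx)]

# Possible alternative (assumed): flag repeated sorted pairs with duplicated().
mask = pd.DataFrame(pairs, index=df.index).duplicated()
alt = df[~mask]

print(deduped)
```

Both versions keep the first row of each swapped pair. To drop every copy instead, as keep=False does in the question, `duplicated(keep=False)` can be used for the mask.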