How to remove duplicate rows in pandas with multiple conditions

Question

147 views · Asked by Lion_YY on 23 August 2022 at 14:26

import pandas as pd

df = pd.DataFrame(
    [
        ['China', 'L', '08/06/2022 20:00', '08/10/2022 20:00'],
        ['China', 'L', '8/13/2022 00:54', '8/14/2022 00:54'],
        ['China', 'M', '8/14/2022 00:54', '8/14/2022 12:54'],
        ['United Kingdom', 'L', '8/27/2022 06:36', '8/31/2022 21:08'],
        ['United Kingdom', 'L', '9/01/2022 21:08', '09/02/2022 21:38'],
        ['China', 'D', '09/04/2022 21:38', '09/06/2022 21:38']
    ],
    columns=['Country', 'Function', 'Arrival', 'Departure']
)

In this case, I want to collapse each run of consecutive duplicate rows into a single row and replace its departure time with the value from the last duplicate in the run, subject to two conditions:

Do not remove duplicates that are not consecutive.
If the 'Function' column changes, do not treat the rows as duplicates, even if they are consecutive.

So it should look like this:

df = pd.DataFrame(
    [
        ['China', 'L', '08/06/2022 20:00', '8/14/2022 00:54'],
        ['China', 'M', '8/14/2022 00:54', '8/14/2022 12:54'],
        ['United Kingdom', 'L', '8/27/2022 06:36', '09/02/2022 21:38'],
        ['China', 'D', '09/04/2022 21:38', '09/06/2022 21:38']
    ],
    columns=['Country', 'Function', 'Arrival', 'Departure']
)

Original Q&A

There is 1 answer

**mozway** · Answer 1 · 2022-08-23T14:48:24+00:00

You can use groupby.idxmax to keep, for each Country/Function pair, the row with the latest Departure:

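# Parse Departure as datetimes, then find, for each (Country, Function) pair,
# the index label of the row with the latest departure.
idx = (pd.to_datetime(df['Departure'])
         .groupby([df['Country'], df['Function']], sort=False)
         .idxmax()
       )

# Keep only those rows.
out = df.loc[idx]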

Output:

          Country Function           Arrival         Departure
1           China        L   8/13/2022 00:54   8/14/2022 00:54
2           China        M   8/14/2022 00:54   8/14/2022 12:54
4  United Kingdom        L   9/01/2022 21:08  09/02/2022 21:38
5           China        D  09/04/2022 21:38  09/06/2022 21:38
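
Note that the output above keeps the Arrival of the last row in each group rather than the first, and it groups every row that shares a Country and Function whether or not the rows are adjacent. Below is a minimal sketch that appears closer to the expected result in the question, assuming "consistent" there means "consecutive": label each run of adjacent rows with the same Country and Function, then take the first Arrival and the last Departure per run. The names keys and run_id are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ['China', 'L', '08/06/2022 20:00', '08/10/2022 20:00'],
        ['China', 'L', '8/13/2022 00:54', '8/14/2022 00:54'],
        ['China', 'M', '8/14/2022 00:54', '8/14/2022 12:54'],
        ['United Kingdom', 'L', '8/27/2022 06:36', '8/31/2022 21:08'],
        ['United Kingdom', 'L', '9/01/2022 21:08', '09/02/2022 21:38'],
        ['China', 'D', '09/04/2022 21:38', '09/06/2022 21:38']
    ],
    columns=['Country', 'Function', 'Arrival', 'Departure']
)

# A new run starts whenever Country or Function differs from the previous row
# (the first row always starts a run because shift() yields NaN there).
keys = df[['Country', 'Function']]
run_id = keys.ne(keys.shift()).any(axis=1).cumsum()

# Collapse each run: first Arrival, last Departure, in original row order.
out = (
    df.groupby(run_id, sort=False)
      .agg(Country=('Country', 'first'),
           Function=('Function', 'first'),
           Arrival=('Arrival', 'first'),
           Departure=('Departure', 'last'))
      .reset_index(drop=True)
)

print(out)
```

Because runs are detected by comparing each row with the one before it, a later China/L row that is not adjacent to the first China/L run would stay as a separate row. The sketch relies on row order only, so it does not need to parse the date strings.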
