Drop consecutive rows if condition is met in pandas df


I have a data set in which certain rows are repeated, and I need to drop them.

For example:

Column A Column B
john 1
next nan
nan nan
123 nan
smith 2
Pete 3
next nan
nan nan
123 nan
Angie 2
tom 3

So every time 'next' appears in 'Column A', 3 rows need to be deleted: the row that contains 'next' and the 2 rows after it.

How can I solve this?


There is 1 answer

mozway On

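Assuming you don't simply want to call dropna on Column B, you could take a rolling max over a boolean mask of Column A. The mask marks each 'next' row, and with a window of N the max stays at 1 for that row and the N-1 rows after it, so those are the rows to drop: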

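N = 3  # rows to drop per match, counting the 'next' row itself
# Flag each 'next' row and the N-1 rows after it, then keep only the unflagged rows
out = df[df['Column A'].eq('next').rolling(N, min_periods=1).max().eq(0)]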

Output:

   Column A  Column B
0      john       1.0
4     smith       2.0
5      Pete       3.0
9     Angie       2.0
10      tom       3.0
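
To try this end to end, here is a minimal self-contained sketch. It assumes the question's table is loaded as shown below (the literal values, including 123 stored as a string, are my guess at the real data), and it casts the mask to int before rolling as a precaution rather than because the original one-liner needs it:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example data from the question
df = pd.DataFrame({
    "Column A": ["john", "next", np.nan, "123", "smith", "Pete",
                 "next", np.nan, "123", "Angie", "tom"],
    "Column B": [1, np.nan, np.nan, np.nan, 2, 3,
                 np.nan, np.nan, np.nan, 2, 3],
})

N = 3  # rows to drop per match, counting the 'next' row itself

# True where Column A is exactly 'next' (NaN compares as False)
is_next = df["Column A"].eq("next")

# Rolling max over a trailing window of N rows: 1 on each 'next' row
# and on the N-1 rows that follow it, 0 elsewhere
flagged = is_next.astype(int).rolling(N, min_periods=1).max().eq(1)

# Keep only the rows that are not flagged
out = df[~flagged]
print(out)
```

Note that eq('next') is case-sensitive, so a value such as 'Next' would not be matched unless the column is normalized first.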

Intermediates:

   Column A  Column B  eq('next')  rolling_max  eq(0)
0      john       1.0       False          0.0   True
1      next       NaN        True          1.0  False
2       NaN       NaN       False          1.0  False
3       123       NaN       False          1.0  False
4     smith       2.0       False          0.0   True
5      Pete       3.0       False          0.0   True
6      next       NaN        True          1.0  False
7       NaN       NaN       False          1.0  False
8       123       NaN       False          1.0  False
9     Angie       2.0       False          0.0   True
10      tom       3.0       False          0.0   True
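
The intermediate columns above can be rebuilt step by step, which may help when adapting N or the match condition. This is only a sketch: the names col, is_next, rolling_max, keep and steps are mine, and only Column A is reproduced since Column B plays no part in the filter.

```python
import numpy as np
import pandas as pd

# Column A from the example (assumed values)
col = pd.Series(["john", "next", np.nan, "123", "smith", "Pete",
                 "next", np.nan, "123", "Angie", "tom"], name="Column A")

N = 3

# Step 1: boolean mask of the rows that start a block to drop
is_next = col.eq("next")

# Step 2: trailing rolling max spreads each True over the next N-1 rows
rolling_max = is_next.astype(float).rolling(N, min_periods=1).max()

# Step 3: rows whose rolling max is 0 are the ones to keep
keep = rolling_max.eq(0)

# Assemble the steps side by side for inspection
steps = pd.concat(
    [col,
     is_next.rename("eq('next')"),
     rolling_max.rename("rolling_max"),
     keep.rename("eq(0)")],
    axis=1,
)
print(steps)
```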