Create sequential event id for groups of consecutive ones


I have a df like so:

Period  Count
1       1
2       0
3       1
4       1
5       0
6       0
7       1
8       1
9       1
10      0

I want to add an 'Event_ID' column that numbers each run of two or more consecutive 1s in Count sequentially (1 for the first such run, 2 for the second, and so on) and is 0 everywhere else. My desired output would then be:

Period  Count  Event_ID
1       1      0
2       0      0
3       1      1
4       1      1
5       0      0
6       0      0
7       1      2
8       1      2
9       1      2
10      0      0

I have researched and found solutions that let me flag consecutive groups of equal values (e.g. 1), but I haven't come across what I need yet. I would also like to be able to use this method with any minimum number of consecutive occurrences, not just 2. For example, sometimes I need at least 10 consecutive occurrences; I only use 2 in the example here.

There are 2 answers

G.G
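# mark the first row of every run of non-1 values, count those runs with cumsum,
# then multiply by Count so that rows where Count is 0 end up as 0
col1 = (df1['Count'].diff().ne(0) & df1['Count'].ne(1)).cumsum().mul(df1['Count'])
df1.assign(flag=col1)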

out:

   Period  Count  flag
0       1      1   0.0
1       2      0   0.0
2       3      1   1.0
3       4      1   1.0
4       5      0   0.0
5       6      0   0.0
6       7      1   2.0
7       8      1   2.0
8       9      1   2.0
9      10      0   0.0
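Note that the one-liner above numbers runs of 1s by how many runs of 0s precede them, so it does not check the run length: on the input [0, 1, 0, 1, 1] the isolated 1 would get ID 1 and the pair would get ID 2. The following is a minimal sketch of one way to enforce a minimum run length, not the answerer's method. The function name `label_runs` and the parameter `min_run` are made up for illustration.

```python
import pandas as pd

def label_runs(counts: pd.Series, min_run: int = 2) -> pd.Series:
    """Number each run of at least `min_run` consecutive 1s; all other rows get 0.

    `label_runs` and `min_run` are hypothetical names used only in this sketch.
    """
    # A new run starts wherever the value differs from the previous row.
    run_id = counts.ne(counts.shift()).cumsum()
    # Length of the run that each row belongs to.
    run_len = run_id.map(run_id.value_counts())
    # Rows that are 1 and sit in a run that is long enough.
    qualifies = counts.eq(1) & run_len.ge(min_run)
    # First row of each qualifying run; cumsum turns these into 1, 2, 3, ...
    starts = qualifies & ~qualifies.shift(fill_value=False)
    return starts.cumsum().where(qualifies, 0)

df = pd.DataFrame({"Period": range(1, 11),
                   "Count": [1, 0, 1, 1, 0, 0, 1, 1, 1, 0]})
df["Event_ID"] = label_runs(df["Count"], min_run=2)
print(df["Event_ID"].tolist())  # [0, 0, 1, 1, 0, 0, 2, 2, 2, 0]
```

Calling `label_runs(df["Count"], min_run=10)` would apply the same logic with a threshold of 10 consecutive occurrences.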
Gerd

This will do the job:

import itertools
import operator

ones = df.groupby('Count').groups[1].tolist()
# list of the index labels where Count is 1: [0, 2, 3, 6, 7, 8]
# (assumes the default 0..n-1 index, since the labels are used as list positions below)
event_id = [0] * len(df.index)
# list of length 10 for Event_ID, initialised to all 0

min_run = 2   # minimum number of consecutive 1s that counts as an event
next_id = 1   # ID to assign to the next qualifying run
# find consecutive numbers in the list of ones (yields [0], [2, 3] and [6, 7, 8]):
for k, g in itertools.groupby(enumerate(ones), lambda ix: ix[0] - ix[1]):
  sublist = list(map(operator.itemgetter(1), g))
  if len(sublist) >= min_run:
    for i in sublist:
      event_id[i] = next_id
    next_id += 1
# event_id is now [0, 0, 1, 1, 0, 0, 2, 2, 2, 0]

df['Event_ID'] = event_id

The for loop is adapted from this example (using itertools, other approaches are possible too).
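As one of those other approaches, `itertools.groupby` can also be applied to the values themselves instead of to the index positions, which avoids any dependence on the DataFrame index. This is only a sketch under the assumption that the column fits comfortably in a Python list. The names `event_ids` and `min_run` are hypothetical.

```python
from itertools import groupby

def event_ids(values, min_run=2):
    """Return a list of sequential IDs for runs of at least `min_run` ones.

    `event_ids` and `min_run` are illustrative names, not part of any library.
    """
    ids = []
    next_id = 1
    # groupby yields one (value, run) pair per stretch of equal adjacent values.
    for value, run in groupby(values):
        length = len(list(run))
        if value == 1 and length >= min_run:
            ids.extend([next_id] * length)
            next_id += 1
        else:
            # Zeros and runs of ones that are too short get ID 0.
            ids.extend([0] * length)
    return ids

print(event_ids([1, 0, 1, 1, 0, 0, 1, 1, 1, 0]))  # [0, 0, 1, 1, 0, 0, 2, 2, 2, 0]
```

With a DataFrame this could be used as `df['Event_ID'] = event_ids(df['Count'].tolist())`.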