Assuming a df as follows

col1    col2
1   1
1   2
1   4
1   6
1   7
1   8
1   24
1   23
1   24
1   1
1   1
1   2
1   3
1   1
1   3
1   2
2   2
2   3
2   4
2   5
2   5
2   6
2   9
2   15
2   16
2   19
2   24
2   1
2   3
2   2
2   1
2   2
2   2
2   3
2   3

I would like to do kind of groupby on col1 and check if in col2 numbers 1, 2, 3 occur after 24. If yes, the values related to these must be stored as separate dataframes, preferably as follows:

df1:

col1    col2
1   1
1   1
1   1
2   1
2   1

df2:

col1    col2
1   2
1   2
2   2
2   2
2   2

df3:

col1    col2
1   3
1   3
2   3
2   3
2   3

The dataframes: df1, df2, df3 have been created from the values that occur after 24 in col2

Edit 1: In the df, there is an instance where in the col2 a 23 is present between two 24s. In such a case as well, it must check that value, and if it's either 1, 2, or 3, it must be assigned to their respective dataframes

1 Answers

0
Shijith On Best Solutions
  1. Iterate through each group of groupby for _, group in df.groupby('col1')

  2. Find the original index for the first occurrence of 24 on each group using group.index.get_loc(group[group.col2.eq(24)].index[0])

  3. Subset each group from index found on previous step to the end group[indexfound: ]

  4. From the subsetted data frame find the occurrences of 1,2 and 3 [group.col2.eq(1/2/3)] and save each to separate data frames

df1=pd.DataFrame(columns=['col1','col2'])
df2= df1.copy()
df3 = df1.copy()
for _, group in df.groupby('col1'):
    df1=df1.append(group[group.index.get_loc(group[group.col2.eq(24)].index[0]):][group.col2.eq(1)])
    df2=df2.append(group[group.index.get_loc(group[group.col2.eq(24)].index[0]):][group.col2.eq(2)])
    df3=df3.append(group[group.index.get_loc(group[group.col2.eq(24)].index[0]):][group.col2.eq(3)])