Here is a sample row that I have in my dataframe:
{
"sessionId" : "454ec8b8-7f00-40b2-901c-724c5d9f5a91",
"useCaseId" : "3652b5d7-55b8-4bee-82b6-ab32d5543352",
"timestamp" : "1559403699899",
"endFlow" : "true"
}
I group the dataframe by 'sessionId', which gives me a group like this:
Row 1:
{
"sessionId" : "454ec8b8-7f00-40b2-901c-724c5d9f5a91",
"useCaseId" : "usecaseId1",
"timestamp" : "1559403699899",
"endFlow" : "false"
},
Row 2:
{
"sessionId" : "454ec8b8-7f00-40b2-901c-724c5d9f5a91",
"useCaseId" : "usecaseId1",
"timestamp" : "1559403699899",
"endFlow" : "false"
},
Row 3:
{
"sessionId" : "454ec8b8-7f00-40b2-901c-724c5d9f5a91",
"useCaseId" : "usecaseId2",
"timestamp" : "1559403699899",
"endFlow" : "true"
},
Row 4:
{
"sessionId" : "454ec8b8-7f00-40b2-901c-724c5d9f5a91",
"useCaseId" : "usecaseId1",
"timestamp" : "1559403699899",
"endFlow" : "false"
},
Row 5:
{
"sessionId" : "454ec8b8-7f00-40b2-901c-724c5d9f5a91",
"useCaseId" : "usecaseId1",
"timestamp" : "1559403699899",
"endFlow" : "true"
}
Taking the above group as an example: after grouping the dataframe by 'sessionId', I want to loop through runs of consecutive rows that share the same 'useCaseId'. In the group above there are three such runs:
Row1-Row2, Row3, Row4-Row5
From these runs (each of which has a single 'useCaseId'),
I want to count the runs whose rows have an 'endFlow' value of only 'false'.
So, for the example group above, the expected outcome is:
1 (since Row1-Row2, with the same useCaseId 'usecaseId1', has endFlow only 'false', while Row3 and Row4-Row5 each contain an endFlow of 'true')
How can I achieve this?
Updates:
df.head():
    sessionId   useCaseId      timestamp endFlow
0  sessionId1  useCaseId1  1559403699899   false
1  sessionId1  useCaseId1  1559403699899   false
2  sessionId1  useCaseId2  1559403699899    true
3  sessionId1  useCaseId1  1559403699899   false
4  sessionId1  useCaseId1  1559403699899    true
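To make the example reproducible, the frame shown by df.head() can be rebuilt roughly like this. This is only a sketch: it assumes every column, including 'timestamp' and 'endFlow', holds strings, as in the JSON sample row; the real data may use other dtypes.

```python
import pandas as pd

# Rebuild the five rows shown by df.head().
# Assumption: all columns are strings, so endFlow is "true"/"false", not a boolean.
df = pd.DataFrame(
    {
        "sessionId": ["sessionId1"] * 5,
        "useCaseId": ["useCaseId1", "useCaseId1", "useCaseId2", "useCaseId1", "useCaseId1"],
        "timestamp": ["1559403699899"] * 5,
        "endFlow": ["false", "false", "true", "false", "true"],
    }
)

print(df)
```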
What I tried:
I tried grouping the dataframe by both 'sessionId' and 'useCaseId', but that does not work: it puts all rows with the same 'useCaseId' into a single group whether or not they are consecutive, which is not what I want. I want to loop through the consecutive rows with the same 'useCaseId' within each 'sessionId' group, and then count the runs of consecutive rows whose 'endFlow' is only 'false'.
Expected output:
After grouping by 'sessionId', I want the number of runs of consecutive rows with the same 'useCaseId' whose 'endFlow' is only 'false'.
For the example group above, the expected outcome is 1 (since Row1-Row2, with the same useCaseId 'usecaseId1', has endFlow only 'false', while Row3 and Row4-Row5 each contain an endFlow of 'true').
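To show the difference concretely, here is a minimal sketch of the attempt next to one possible way to label consecutive runs instead. It is not a confirmed solution. It assumes 'endFlow' holds the strings "true"/"false" and that the rows are already in the order in which runs should be detected (the sample timestamps are all identical, so they cannot be used for sorting). The names run_id, run_all_false and false_only_runs are made up for illustration.

```python
import pandas as pd

# Sample rows from df.head(); endFlow is assumed to be a string column.
df = pd.DataFrame(
    {
        "sessionId": ["sessionId1"] * 5,
        "useCaseId": ["useCaseId1", "useCaseId1", "useCaseId2", "useCaseId1", "useCaseId1"],
        "timestamp": ["1559403699899"] * 5,
        "endFlow": ["false", "false", "true", "false", "true"],
    }
)

# The attempt: grouping by both keys merges rows 0-1 and rows 3-4 into one group,
# so only 2 groups come out instead of the 3 consecutive runs.
collapsed_groups = df.groupby(["sessionId", "useCaseId"]).ngroups

# A new run starts whenever sessionId or useCaseId differs from the previous row.
new_run = (df["sessionId"] != df["sessionId"].shift()) | (
    df["useCaseId"] != df["useCaseId"].shift()
)
# The cumulative sum of the run starts gives each run its own label (hypothetical column).
df["run_id"] = new_run.cumsum()

# A run qualifies only if every endFlow value in it is "false".
run_all_false = (
    df["endFlow"].eq("false").groupby([df["sessionId"], df["run_id"]]).all()
)

# Count the qualifying runs per session.
false_only_runs = run_all_false.astype(int).groupby(level="sessionId").sum()

print(collapsed_groups)
print(false_only_runs)
```

The comparison with shift() is what separates this from a plain groupby on 'useCaseId': the label changes every time the value changes, so a 'useCaseId' that reappears later in the session gets a new run rather than being merged with its earlier rows.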