I have an output data frame like the one below, and I want to reformat it so that I can use it in the next stage of my pipeline.
A few notes about the data frame:
1) This data frame holds the weekly workload data for employees.
2) load0, load30, load100, etc. each represent a half-hour slot. Each load column is one half-hour shift.
3) The first "1" in a row marks the start of a shift, and "BREAK" marks a break slot.
For example, in row 1, the shift for employee 1234 starts at 12:00 and ends at 2:30, with a break from 1:00 to 1:30.
employee  date       store  load0  load30  load100  load130  load200  load230  load300
1234      2021-12-1  450    1      1       BREAK    1        1        0        0
1234      2021-12-2  450    0      1       1        BREAK    1        1        0
5678      2021-12-1  650    0      0       0        0        1        1        0
5678      2021-12-2  650    0      0       1        1        BREAK    1        0
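To make the column naming concrete, here is a minimal sketch of how a slot label could be decoded into a time offset. It assumes the digits after "load" read as hours followed by two-digit minutes (so load130 means 1 h 30 min after the first slot); the helper name `slot_offset` is hypothetical and only used for illustration.

```python
from datetime import timedelta


def slot_offset(column: str) -> timedelta:
    """Decode a slot label such as 'load130' into an offset from the first slot.

    Assumption: the digits are H followed by MM, e.g. 'load130' -> 1 h 30 min.
    """
    digits = int(column[len("load"):])   # 'load130' -> 130
    hours, minutes = divmod(digits, 100)  # 130 -> (1, 30)
    return timedelta(hours=hours, minutes=minutes)


columns = ["load0", "load30", "load100", "load130", "load200", "load230", "load300"]
offsets = {name: slot_offset(name) for name in columns}
```

Each offset can then be added to the date of the row plus whatever clock time load0 stands for.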
For the example above, the output should look something like:
Start          End           Segment type
date+12:00:00  date+1:00:00  Regular segment
date+1:00:00   date+1:30:00  Break segment
date+1:30:00   date+2:30:00  Regular segment
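As a rough illustration of the reshaping I am after, the sketch below merges consecutive slots with the same value into one segment. It is plain Python over row dictionaries rather than a finished data frame solution, and it rests on several assumptions: load0 is the slot starting at 12:00 on a 24-hour clock (so 1:00 becomes 13:00), "0" means no work, the "Z" suffix is just copied from the format I want, and the names `BASE_HOUR`, `slot_start`, `row_segments`, and `to_segments` are hypothetical.

```python
from datetime import datetime, timedelta
from itertools import groupby

SLOT = timedelta(minutes=30)  # each load column covers half an hour
BASE_HOUR = 12                # assumption: load0 is the slot starting at 12:00


def slot_start(date_str, column):
    """Start time of a slot, e.g. ('2021-12-1', 'load130') -> 2021-12-01 13:30."""
    hours, minutes = divmod(int(column[len("load"):]), 100)
    day = datetime.strptime(date_str, "%Y-%m-%d")
    return day + timedelta(hours=BASE_HOUR + hours, minutes=minutes)


def row_segments(row):
    """Yield (start, end, type) for one row, merging adjacent equal slots."""
    slots = [(col, str(val)) for col, val in row.items() if col.startswith("load")]
    for value, run in groupby(slots, key=lambda slot: slot[1]):
        run = list(run)
        if value == "0":  # assumption: 0 means the employee is not scheduled
            continue
        kind = "BREAK_SEGMENT" if value == "BREAK" else "REGULAR_SEGMENT"
        start = slot_start(row["date"], run[0][0])
        end = slot_start(row["date"], run[-1][0]) + SLOT
        yield start, end, kind


def to_segments(rows):
    """Flatten all rows into one list of segment records."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return [
        {"employee": row["employee"], "store": row["store"],
         "Start": start.strftime(fmt), "End": end.strftime(fmt),
         "SegmentType": kind}
        for row in rows
        for start, end, kind in row_segments(row)
    ]


rows = [
    {"employee": 1234, "date": "2021-12-1", "store": 450,
     "load0": 1, "load30": 1, "load100": "BREAK", "load130": 1,
     "load200": 1, "load230": 0, "load300": 0},
]
result = to_segments(rows)
```

If the data is in a pandas DataFrame, `df.to_dict("records")` should give rows in this shape, and the resulting list of dictionaries can be passed back to `pd.DataFrame(...)`.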
P.S. There are around 350 employees, and each employee has a schedule like this for fewer than 7 days per week.
I want the output to look like this:
   employee  store  Start                End                  SegmentType
0  1234      450    2021-12-1T12:00:00Z  2021-12-1T12:30:00Z  REGULAR_SEGMENT
1  1234      450    2021-12-1T1:00:00Z   2021-12-1T1:30:00Z   BREAK_SEGMENT
2  1234      450    2021-12-1T1:30:00Z   2021-12-1T2:00:00Z   REGULAR_SEGMENT
3  1234      450    2021-12-2T12:30:00Z  2021-12-2T1:00:00Z   REGULAR_SEGMENT
4  1234      450    2021-12-2T1:30:00Z   2021-12-2T2:20:00Z   BREAK_SEGMENT
5  1234      450    2021-12-2T2:00:00Z   2021-12-2T2:30:00Z   REGULAR_SEGMENT
6  5678      650    2021-12-1T2:00:00Z   2021-12-1T2:30:00Z   REGULAR_SEGMENT
7  5678      650    2021-12-2T1:00:00Z   2021-12-1T2:30:00Z   REGULAR_SEGMENT
8  5678      650    2021-12-2T2:00:00Z   2021-12-2T2:00:00Z   BREAK_SEGMENT
9  5678      650    2021-12-2T2:30:00Z   2021-12-2T2:30:00Z   REGULAR_SEGMENT
I hope this will work!