I have a pandas DataFrame with timedeltas and, in a separate column, a cumulative sum of those deltas expressed in milliseconds. An example is provided below:
Transaction_ID Time TimeDelta CumSum[ms]
1 00:00:04.500 00:00:00.000 000
2 00:00:04.600 00:00:00.100 100
3 00:00:04.762 00:00:00.162 262
4 00:00:05.543 00:00:00.781 1043
5 00:00:09.567 00:00:04.024 5067
6 00:00:10.654 00:00:01.087 6154
7 00:00:14.300 00:00:03.646 9800
8 00:00:14.532 00:00:00.232 10032
9 00:00:16.500 00:00:01.968 12000
10 00:00:17.543 00:00:01.043 13043
I would like to be able to provide a maximum value for CumSum[ms] after which the cumulative sum would start over again at 0. For example, if the maximum value was 3000 in the above example, the results would look like so:
Transaction_ID Time TimeDelta CumSum[ms]
1 00:00:04.500 00:00:00.000 000
2 00:00:04.600 00:00:00.100 100
3 00:00:04.762 00:00:00.162 262
4 00:00:05.543 00:00:00.781 1043
5 00:00:09.567 00:00:04.024 0
6 00:00:10.654 00:00:01.087 1087
7 00:00:14.300 00:00:03.646 0
8 00:00:14.532 00:00:00.232 232
9 00:00:16.500 00:00:01.968 2200
10 00:00:17.543 00:00:01.043 0
I have explored using the modulo operator, but it only resets to zero when the cumulative sum is an exact multiple of the limit (e.g. a CumSum[ms] of 500 with a limit of 500 gives 500 % 500 = 0).
Thanks in advance for any thoughts you may have, and please let me know if I can provide any more information.
Here's an example of how you might do this by iterating over each row in the dataframe. I created new data for the example for simplicity:
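The sample data from the original answer is not shown here. As a stand-in, this minimal sketch rebuilds the question's TimeDelta values as integer milliseconds; the column names `delta_ms` and `cumsum_ms` are assumptions for illustration only.

```python
import pandas as pd

# Assumed stand-in data: the question's TimeDelta column, in milliseconds.
df = pd.DataFrame({
    "Transaction_ID": range(1, 11),
    "delta_ms": [0, 100, 162, 781, 4024, 1087, 3646, 232, 1968, 1043],
})

# Plain cumulative sum, matching the question's first CumSum[ms] column.
df["cumsum_ms"] = df["delta_ms"].cumsum()
```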
So let's do an accumulator loop with your desired 3000 max:
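A minimal sketch of such a loop, run on a plain list of the question's millisecond deltas. The function name `reset_cumsum` and the strict `>` comparison are assumptions: the question does not say what should happen when the sum lands exactly on the limit.

```python
def reset_cumsum(deltas, limit):
    """Running sum that restarts at 0 whenever it would exceed `limit`."""
    newcom = []
    running = 0
    for delta in deltas:
        running += delta
        if running > limit:
            # Over the limit: start over at 0, as in the question's expected output.
            running = 0
        newcom.append(running)
    return newcom

deltas_ms = [0, 100, 162, 781, 4024, 1087, 3646, 232, 1968, 1043]
newcom = reset_cumsum(deltas_ms, 3000)
print(newcom)  # [0, 100, 262, 1043, 0, 1087, 0, 232, 2200, 0]
```

Each value depends on whether the previous one was reset, which is why a plain modulo of the cumulative sum does not give the same result.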
Then put the newcom list into the dataframe:
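One way to do that is sketched below. The column name `CumSum[ms]` follows the question, and the hard-coded `newcom` list stands in for the loop's output with a 3000 ms limit, so the snippet runs on its own.

```python
import pandas as pd

# Assumed stand-in frame with the question's ten transactions.
df = pd.DataFrame({"Transaction_ID": range(1, 11)})

# Values the accumulator loop yields for the question's deltas and a limit of 3000.
newcom = [0, 100, 262, 1043, 0, 1087, 0, 232, 2200, 0]

# A list is assigned by position, so it must have exactly one entry per row.
df["CumSum[ms]"] = newcom
```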