I have the following data:
date unit status
2023-04-30 unit1 1
2023-05-31 unit1 1
2023-08-31 unit1 1
2023-09-30 unit1 1
2023-11-30 unit1 1
2023-12-31 unit1 1
2024-01-31 unit1 1
2024-02-28 unit1 1
For a reference date I would like to know the length of the first upcoming "streak" (on MSSQL, used for production, and sqlite, used for unit tests)
Example 1:
For date 2023-05-15 my desired output is:
unit streak
unit1 3
The reason for this is that the first month with status=1 after 2023-05 is 2023-08, and then I just count for each consecutive month.
Example 2:
For date 2023-11-01 my desired output is:
unit streak
unit1 3
The reason is that the first month with status=1 after 2023-11 is 2023-12, and the streak ends on 2024-02 as months with status=0 are not recorded, and the next month with status=1 is more that a month away.
This is a
gaps and islands
problem that can be resolved by (value minus row_number), since that is invariant within a consecutive sequence. The start and end dates are just the MIN() and MAX() of the group :NB : The giving date have been converted to the first day of the month so the GroupingSet can be matched withing same year/month !
Demo here
———
Using the row_number method, we can easily obtain the top streak for each unit when considering several units:
Demo here