This is a sample of the data I have:
id,date_from,date_to,contract_type
308,01.01.2023,28.09.2023,1
308,04.03.2023,15.07.2023,2
308,01.10.2023,31.07.2024,1
477,02.04.2023,30.08.2023,1
477,01.06.2023,31.12.2023,2
And this is the result I would like to get:
id,date_from,date_to,contract_type
308,01.01.2023,03.03.2023,1
308,04.03.2023,15.07.2023,2
308,16.07.2023,28.09.2023,1
308,01.10.2023,31.07.2024,1
477,02.04.2023,31.05.2023,1
477,01.06.2023,31.12.2023,2
Basically, I have two T-SQL tables: a pre-stage table and a stage table. The pre-stage table contains overlapping date periods of employee contracts. I want to split the overlapping periods into non-overlapping intervals and, for each interval where contracts overlap, keep the higher contract_type. This is important.
Can anyone help me? Should I expand each range into individual dates and then group them by id and contract_type? Or is there a method to detect and split date ranges with a condition?
All of this should happen in the procedure that loads the stage table. I can write that part myself, but I thought I should mention it.
Thanks to anyone in advance!
I don't know if there is a simpler way to do this that explicitly recognizes and handles specific overlap cases, but the following is a generalized divide, conquer, and merge approach.
For each id:
When doing date-range calculations, it is often easier to work with date ranges that have an exclusive end date (00:00 the next day) instead of an inclusive end date (implicitly 23:59:59.999* the last day). This avoids the constant off-by-one adjustments when comparing start vs end dates.
A standard date-range overlap comparison would then be
start1 < end2 AND start2 < end1
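As a small illustration of that comparison, here is a sketch that converts two inclusive ranges to exclusive end dates and tests them for overlap. The variable names are made up for the example, and the values are taken from the first two sample rows for id 308; the columns are assumed to be of type date.

```sql
-- Two inclusive ranges (hypothetical variables, values from the sample data).
DECLARE @from1 date = '20230101', @to1 date = '20230928';
DECLARE @from2 date = '20230304', @to2 date = '20230715';

-- Convert the inclusive end dates to exclusive ones (start of the next day).
DECLARE @end1 date = DATEADD(day, 1, @to1);
DECLARE @end2 date = DATEADD(day, 1, @to2);

-- With exclusive ends, the overlap test needs no +1/-1 adjustments.
SELECT CASE WHEN @from1 < @end2 AND @from2 < @end1
            THEN 'overlap'
            ELSE 'no overlap'
       END AS result;
```

With exclusive end dates, two ranges that merely touch (one ends on 28.09, the next starts on 29.09) do not count as overlapping, because the second start is equal to, not less than, the first exclusive end.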
Code:
Results:
See this db<>fiddle for a demo.
NOTE: One side effect of this processing is that any adjacent or overlapping contracts of the same type in the original data may be merged into a single row in the final result.
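A divide, conquer, and merge query along these lines could be sketched roughly as follows. Treat it as a minimal sketch under assumptions, not a definitive implementation: the table variable @contracts and the CTE names are hypothetical, the columns are assumed to be of type date, and LAG/LEAD require SQL Server 2012 or later. The idea is to collect every start and exclusive end date as a split point, build elementary slices between consecutive split points, pick the highest contract_type covering each slice, and then merge adjacent slices of the same type.

```sql
-- Hypothetical stand-in for the pre-stage table (names and types are assumptions).
DECLARE @contracts TABLE (
    id            int  NOT NULL,
    date_from     date NOT NULL,
    date_to       date NOT NULL,   -- inclusive end date
    contract_type int  NOT NULL
);

INSERT INTO @contracts (id, date_from, date_to, contract_type)
VALUES (308, '20230101', '20230928', 1),
       (308, '20230304', '20230715', 2),
       (308, '20231001', '20240731', 1),
       (477, '20230402', '20230830', 1),
       (477, '20230601', '20231231', 2);

WITH ranges AS (
    -- Switch to exclusive end dates.
    SELECT id, date_from AS start_date,
           DATEADD(day, 1, date_to) AS end_date, contract_type
    FROM @contracts
),
boundaries AS (
    -- Divide: every start and every exclusive end is a potential split point.
    SELECT id, start_date AS boundary FROM ranges
    UNION
    SELECT id, end_date FROM ranges
),
slices AS (
    -- Consecutive split points delimit elementary, non-overlapping slices.
    SELECT id, boundary AS start_date,
           LEAD(boundary) OVER (PARTITION BY id ORDER BY boundary) AS end_date
    FROM boundaries
),
typed AS (
    -- Conquer: keep slices covered by at least one contract; highest type wins.
    SELECT s.id, s.start_date, s.end_date,
           MAX(r.contract_type) AS contract_type
    FROM slices AS s
    JOIN ranges AS r
      ON r.id = s.id
     AND r.start_date < s.end_date
     AND s.start_date < r.end_date
    GROUP BY s.id, s.start_date, s.end_date
),
flagged AS (
    -- Flag slices that do not directly continue a slice of the same type.
    SELECT *,
           CASE WHEN LAG(end_date) OVER (PARTITION BY id, contract_type
                                         ORDER BY start_date) = start_date
                THEN 0 ELSE 1 END AS is_new
    FROM typed
),
grouped AS (
    -- A running total of the flags numbers each run of adjacent slices.
    SELECT *,
           SUM(is_new) OVER (PARTITION BY id, contract_type
                             ORDER BY start_date
                             ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
-- Merge: collapse each run and convert back to inclusive end dates.
SELECT id,
       MIN(start_date)                 AS date_from,
       DATEADD(day, -1, MAX(end_date)) AS date_to,
       contract_type
FROM grouped
GROUP BY id, contract_type, grp
ORDER BY id, date_from;
```

For the sample data this yields the six rows listed as the desired output in the question. The gap between 28.09.2023 and 01.10.2023 for id 308 is preserved because the slice covering it matches no contract and is dropped by the join.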