Using OVER PARTITION BY in higher levels

194 views Asked by At

I've been struggling with this problem. I have a table that looks like this:

Date Location SkU Model Type Qty
2020-01-01 01A AB1 One A 10
2020-01-01 01A AB1 One B 3
2020-01-01 01A AB2 One A 5
2020-01-01 01A AB2 One B 0
2020-01-02 01A AB1 One A 8
2020-01-02 01A AB1 One B 2
2020-01-02 01A AB2 One A 1
2020-01-02 01A AB2 One B 0
2020-01-01 01A CD1 Two A 3
2020-01-01 01A CD1 Two B 9
2020-01-01 01A CD2 Two A 0
2020-01-01 01A CD2 Two B 1
2020-01-02 01A CD1 Two A 7
2020-01-02 01A CD1 Two B 4
2020-01-02 01A CD2 Two A 1
2020-01-02 01A CD2 Two B 5

As you can see, I have dates, locations, skus , models and types, and what I want to do is to calculate a running total between the current date and 3 days before by Location and Model. However, when I do this in my current code, since I have several rows for those combinations, it's giving me incorrect results. This is my code:

SUM(Qty) OVER (PARTITION BY Location, Model ORDER BY date ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)

The expected output should look like this:

Date Location SkU Model Type Qty RunningTotal
2020-01-01 01A AB1 One A 10 18
2020-01-01 01A AB1 One B 3 18
2020-01-01 01A AB2 One A 5 18
2020-01-01 01A AB2 One B 0 18
2020-01-02 01A AB1 One A 8 29
2020-01-02 01A AB1 One B 2 29
2020-01-02 01A AB2 One A 1 29
2020-01-02 01A AB2 One B 0 29
2020-01-01 01A CD1 Two A 3 13
2020-01-01 01A CD1 Two B 9 13
2020-01-01 01A CD2 Two A 0 13
2020-01-01 01A CD2 Two B 1 30
2020-01-02 01A CD1 Two A 7 30
2020-01-02 01A CD1 Two B 4 30
2020-01-02 01A CD2 Two A 1 30
2020-01-02 01A CD2 Two B 5 30
1

There are 1 answers

0
Brett On

This seems to do the trick. You may want to test with a larger data set.

I pulled out the 'higher level' of data into a CTE, apply the lag and then join this back to the original set of data.

What I don't think this solves - but you didn't expressly ask for - is the case where there are gaps in the dates within the data. You might not get your expected results in this case (e.g. it doesn't lag by three calendar days, but rather by three records).

declare @src table ([Date] DateTime, [Location] varchar(5), [SkU] varchar(5), Model varchar(10), [Type] varchar(1), Qty int)

insert into @src values
('2020-01-01', '01A', 'AB1', 'One', 'A',    10),
('2020-01-01', '01A', 'AB1', 'One', 'B',    3 ),
('2020-01-01', '01A', 'AB2', 'One', 'A',    5 ),
('2020-01-01', '01A', 'AB2', 'One', 'B',    0 ),
('2020-01-02', '01A', 'AB1', 'One', 'A',    8 ),
('2020-01-02', '01A', 'AB1', 'One', 'B',    2 ),
('2020-01-02', '01A', 'AB2', 'One', 'A',    1 ),
('2020-01-02', '01A', 'AB2', 'One', 'B',    0 ),
('2020-01-01', '01A', 'CD1', 'Two', 'A',    3 ),
('2020-01-01', '01A', 'CD1', 'Two', 'B',    9 ),
('2020-01-01', '01A', 'CD2', 'Two', 'A',    0 ),
('2020-01-01', '01A', 'CD2', 'Two', 'B',    1 ),
('2020-01-02', '01A', 'CD1', 'Two', 'A',    7 ),
('2020-01-02', '01A', 'CD1', 'Two', 'B',    4 ),
('2020-01-02', '01A', 'CD2', 'Two', 'A',    1 ),
('2020-01-02', '01A', 'CD2', 'Two', 'B',    5 )

-- Create a CTE with the quantity totals by Date/Location/Model
; with dailyTotals as
(
    select [Date], [Location], [Model], sum(Qty) QtyForDay
    from @src
    group by [Date], [Location], [Model]
)
-- then generate the lag using the daily total
, dailyTotalsWithLag as
(
    select [Date], [Location], [Model], RunningTotalForDay = SUM(QtyForDay)  OVER (PARTITION BY Location, Model ORDER BY date ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
    from dailyTotals
)
-- and finally join the lagged totals back to the source records to get the desired result.
select s.*, t.RunningTotalForDay
 from @src s
 join dailyTotalsWithLag t
 on s.[Date] = t.[Date] and s.Location = t.Location and s.Model = t.Model