SQL Select - adding field to Select is changing the results

613 views Asked by At

I'm stumped by this SQL problem that I suspect will be easy pickings for someone out there.

I have a table that contains rows representing several daily lists of ranked items. The relevent fields are as follows: ID, ListID, ItemID, ItemName, ItemRank, Date.

I have a query that returns the items that were on a list yesterday but not today (Items Off List) as follows:

Select ItemID, ListID, ItemName, convert(varchar(10),MAX(date),101) as date, COUNT(ItemName) as days_on_list 
From Table 
Group By ItemID, ListID, ItemName
Having Max(date) = DATEADD("d",-1,convert(varchar(10),getdate(),101)) and ListID = 1
Order By ListID, ItemName, COUNT(ItemName)

Basically I'm looking for records where the max date is yesterday. It works fine and shows the number of days each item was previously on the list (although not necessarily consecutively, but that's fine for now).

The problem is when I try to add ranking to see what yesterday's rank was. I tried the following:

Select ItemID, ListID, ItemName, ranking, convert(varchar(10),MAX(date),101) as date, COUNT(ItemName) as days_on_list 
From Table
Group By ItemID, ListID, ItemName, ranking
Having Max(date) = DATEADD("d",-1,convert(varchar(10),getdate(),101)) and ListID = 1
Order By ListID, ItemName, ranking, COUNT(ItemName)

This returns a great deal more records than the previous query so something isn't right with it. I want the same number of records, but with the ranking included. I can get the rank by doing a self-join with a subquery and getting records where the ItemID occurs yesterday but not today - but then I don't know how to get the Count any more.

Appreciation in advance for any help with this.

======== SOLVED ==============

Select ItemID, ListID, ItemName, ranking, convert(varchar(10),MAX(date),101) as date, (Select COUNT(ItemName) From Table T3 Where T3.ItemID = T.ItemID and T3.ListID = 1) as days_on_list
from Table T
Where date = DATEADD("d",-1,convert(varchar(10),getdate(),101)) and ListID = 1 and T.ItemID Not In
(select T.ItemID from Table T
    join Table T2 on T.ItemID = T2.ItemID and T.ListID = T2.ListID
    where T.date = DATEADD("d",-1,convert(varchar(10),getdate(),101)) and T2.date = convert
(varchar(10),getdate(),101) and T.ListID = 1)
Group by ItemID, ListID, ItemName, ranking

Basically, what I did was create a subquery that finds all items that appear in both days, and finds items that appeared yesterday but are not in the set of items that appeared both days. Then I was able to do the aggregate function and grouping correctly. I would NOT be surprised if this is more convoluted than necessary but I understand it and can modify it as needed and performance doesn't seem to be an issue.

Thanks to all again!

4

There are 4 answers

3
David MÃ¥rtensson On BEST ANSWER

The problem probably is that ranking is different for items that would in the first question get grouped.

So when you add ranking to the group by there are more groups than before.

If this is the case and you still want the same amount of rows you have to include ranking after grouping witn a wrapping question and then you also have to decide if you like min ranking or max ranking from the matching rows.

2
TheJubilex On

I'm not sure if this is your problem, and I wish I had the reputation to comment this, but is the "ranking" item in the group by statement necessary? Given this context, it doesn't look like it should be.

3
Jimbo On

Your initial query without ranking groups by three fields. Your query, with the ranking added, groups by four fields. So your grouping is by a finer grain and will therefore most likely give you more records.

I would need to see a sample dataset and sample desired values before suggestion a query solution.

5
Thomas On

You could calculate the Count without using a Group By by the Over clause.

With Items As
    (
    Select ItemID, ListID, ItemName, Ranking, [Date]
        , Row_Number() Over( Partition By ItemID, ListID, ItemName Order By [Date] Desc ) As Num
        , Count(ItemName) Over ( Partition By ItemId, ListId, ItemName ) As DaysOnList
    From Table As T
    Where T.[Date] = DateAdd(d, -1, CURRENT_TIMESTAMP) 
        And Not Exists  (
                        Select 1
                        From Table As T2
                        Where T2.ItemID = T.ItemID
                            And T2.ListId = T.ListId
                            And T2.[Date] > T.[Date]
                        )
    )
Select ItemID, ListID, ItemName, LastDate, Ranking
From Items
Where Num = 1
Order By ListID, ItemName, DaysOnList