Using SQL Server I have two tables, below sample Table #T1 in DB has well over a million rows, Table #T2 has 100 rows. Both tables are in Column format and I need to Pivot to rows and join both.

Can I get it all in one query with Cross Apply and remove the cte?

#T1

#T2

Output

This is my code, I have correct output but is this the most efficient way to do this considering number of rows?

with cte_sizes
as
(
 select SizeRange,Size,ColumnPosition
 from #T2
 cross apply (
              values(Sz1,1),(Sz2,2),(Sz3,3),(Sz4,4)
             ) X (Size,ColumnPosition)
 )

 select a.ProductID,a.SizeRange,c.Size,isnull(x.Qty,0) as Qty
 from #T1 a
 cross apply (
              values(a.Sale1,1),(a.Sale2,2),(a.Sale3,3),(a.Sale4,4)
              ) X (Qty,ColumnPosition)
 inner join cte_sizes c 
 on c.SizeRange = a.SizeRange 
 and c.ColumnPosition = x.ColumnPosition

I have also code and considered this but is this the CROSS APPLY a better method?

with cte_sizes
as
(
   select 1 as SizePos
  union all
   select SizePos + 1 as SizePos
   from cte_sizes
   where SizePos < 4
)

select a.ProductID
      ,a.SizeRange
      ,(case when b.SizePos = 1 then c.Sz1
             when b.SizePos = 2 then c.Sz2
             when b.SizePos = 3 then c.Sz3
             when b.SizePos = 4 then c.Sz4 end
       ) as Size
      ,isnull((case when b.SizePos = 1 then a.Sale1
                    when b.SizePos = 2 then a.Sale2
                    when b.SizePos = 3 then a.Sale3
                    when b.SizePos = 4 then a.Sale4 end
              ),0) as Qty

 from #T1 a
 inner join #T2 c on c.SizeRange = a.SizeRange
 cross join cte_sizes b

1 Answers

1
Shnugo On Best Solutions

This is wild guessing, but my magic crystall ball told me, that you might be looking for something like this:

For this we do not need your table #TS at all.

WITH Unpivoted2 AS
(
    SELECT t2.SizeRange,A.* FROM #t2 t2
    CROSS APPLY(VALUES(1,t2.Sz1)
                     ,(2,t2.Sz2)
                     ,(3,t2.Sz3)
                     ,(4,t2.Sz4)) A(SizePos,Size)
) 
SELECT t1.ProductID
      ,Unpivoted2.SizeRange
      ,Unpivoted2.Size
      ,Unpivoted1.Qty
FROM #t1 t1
CROSS APPLY(VALUES(1,t1.Sale1)
                 ,(2,t1.Sale2)
                 ,(3,t1.Sale3)
                 ,(4,t1.Sale4)) Unpivoted1(SizePos,Qty)
LEFT JOIN Unpivoted2 ON Unpivoted1.SizePos=Unpivoted2.SizePos AND t1.SizeRange=Unpivoted2.SizeRange
ORDER BY t1.ProductID,Unpivoted2.SizeRange;

The result:

+-----------+-----------+------+------+
| ProductID | SizeRange | Size | Qty  |
+-----------+-----------+------+------+
| 123       | S-XL      | S    | 1    |
+-----------+-----------+------+------+
| 123       | S-XL      | M    | 12   |
+-----------+-----------+------+------+
| 123       | S-XL      | L    | 13   |
+-----------+-----------+------+------+
| 123       | S-XL      | XL   | 14   |
+-----------+-----------+------+------+
| 456       | 8-14      | 8    | 2    |
+-----------+-----------+------+------+
| 456       | 8-14      | 10   | 22   |
+-----------+-----------+------+------+
| 456       | 8-14      | 12   | NULL |
+-----------+-----------+------+------+
| 456       | 8-14      | 14   | 24   |
+-----------+-----------+------+------+
| 789       | S-L       | S    | 3    |
+-----------+-----------+------+------+
| 789       | S-L       | M    | NULL |
+-----------+-----------+------+------+
| 789       | S-L       | L    | 33   |
+-----------+-----------+------+------+
| 789       | S-L       | XL   | NULL |
+-----------+-----------+------+------+

The idea in short:

The cte will return your #T2 in an unpivoted structure. Each name-numbered column (something you should avoid) is return as a single row with an index indicating the position.

The SELECT will do the same with #T1 and join the cte against this set.

UPDATE: After a lot of comments...

If I get this (and the changes to the initial question) correctly, the approach above works perfectly well, but you want to know, what was best in performance.

The first answer to "What is the fastest approach?" is Race your horses by Eric Lippert.

Good to know 1: A CTE is nothing more then syntactic sugar. It will allow to type a sub-query once and use it like a table, but it has no effect to the way how the engine will work this down.

Good to know 2: It is a huge difference whether you use APPLY or JOIN. The first will call the sub-source once per row, using the current row's values. The second will have to create two sets first and will then join them by some condition. There is no general "what is better"...

For your issue: As there is one very big set and one very small set, all depends on when you reduce the big set usig any kind of filter. The earlier the better.

And most important: It is - in any case - a sign of bad structures - when you find name numbering (something like phone1, phone2, phoneX). The most expensive work will be to transform your 4 name-numbered columns to some dedicated rows. This should be stored in normalized format...

If you still need help, I'd ask you to start a new question.