Lot of full table scans (around 600) because of a simple IN statement

92 views Asked by At

I have 2 CTE in my query. In the end of the query I simply join them and write the result to a page.

On the page I have filter options so when I have filters I have to put a trivial IN statement to the end of the query.

When I don't have the where condition the query is fast enough about (5 seconds) for more then 5 k result.

But when I have the trivial where condition the query takes about 3-4 minutes which is weird.

So I profilled it in SQL MS and I checked the actual execution plan. I realized that without the where condition I have only one full table scan but with the where condition around the number of the result.

After that I simply put the query into an inline table ()x and I used the where condition outside of it and the result is around 1 second.

See below the three query. Can you describe me why it's happening and how can I prevent these kind of situations?

/* Takes about 5 second 6k result*/

WITH First_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
WITH Second_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
SELECT *
FROM First_CTE AS t1
LEFT JOIN Second_CTE AS t2 ON t1.COLUMN2 = t2.COLUMN2

/* Takes about 4 minutes 600 result*/

WITH First_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
WITH Second_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
SELECT *
FROM First_CTE AS t1
LEFT JOIN Second_CTE AS t2 ON t1.COLUMN2 = t2.COLUMN2
WHERE t2.SomeColumn IN ( 22,23,24) -- 2 or more value

/* Takes about 1 second 600 result */

WITH First_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
WITH Second_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
SELECT * FROM (
    SELECT *
    FROM First_CTE AS t1
    LEFT JOIN Second_CTE AS t2 ON t1.COLUMN2 = t2.COLUMN2
)x 
WHERE x.SomeColumn IN ( 22,23,24) -- 2 or more value

2

There are 2 answers

4
Raj On

Can you try this and see if it makes a difference?

WITH First_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
WITH Second_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
SELECT *
FROM First_CTE AS t1
LEFT JOIN Second_CTE AS t2 ON t1.COLUMN2 = t2.COLUMN2
AND t2.SomeColumn IN ( 22,23,24) -- 2 or more value
0
paparazzo On

A CTE is just syntax (it is not materialized)
Pretty sure the slow IN created a loop where the CTE is evaluated many times

In the last that runs in 1 second most like each CTE is evaluated once then the where is applied to the results

You do know that t2.SomeColumn IN ( 22,23,24) negates the left
You might as well use a a join

Try this

SELECT *
FROM First_CTE AS t1
JOIN Second_CTE AS t2 
ON t1.COLUMN2 = t2.COLUMN2
AND t2.SomeColumn IN ( 22,23,24) -- 2 or more value

Better yet move the t2.SomeColumn IN ( 22,23,24) up into the CTE to give the query optimizer even less opportunity to get stupid

WITH First_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
)
WITH Second_CTE(ID,COLUMN2,COLUMN3,COLUMN4,COLUMN5)
AS
(
    ....
    WHERE SomeColumn IN ( 22,23,24)
)
SELECT *
FROM First_CTE  AS t1
JOIN Second_CTE AS t2 
ON t1.COLUMN2 = t2.COLUMN2