I have a table with hierarchical data (contains ~5.000.000 records). I use the following query to retrieve all children records of a specific record:
with TEMP_PACKAGES as
(
select ID AS PACKAGE_ID,PARENT_PACKAGE_ID,1 as LEVEL from PACKAGES where PACKAGES.ID = 5405988
union all
select A.ID AS PACKAGE_ID, B.PARENT_PACKAGE_ID,B.LEVEL+1 AS LEVEL
from PACKAGES as A join TEMP_PACKAGES B on (A.PARENT_PACKAGE_ID = B.PACKAGE_ID)
)
select PACKAGE_ID,PARENT_PACKAGE_ID,LEVEL from TEMP_PACKAGES
so far so good, the above query executed instantly (0 ms).
Now, when I add one more field (name:RESERVED) on the select, the query execution time goes from 0ms to 15000ms (15") (!):
with TEMP_PACKAGES as
(
select ID AS PACKAGE_ID,PARENT_PACKAGE_ID,RESERVED,1 as LEVEL from PACKAGES where PACKAGES.ID = 5405988
union all
select A.ID AS PACKAGE_ID, B.PARENT_PACKAGE_ID,A.RESERVED,B.LEVEL+1 AS LEVEL
from PACKAGES as A join TEMP_PACKAGES B on (A.PARENT_PACKAGE_ID = B.PACKAGE_ID)
)
select PACKAGE_ID,PARENT_PACKAGE_ID,RESERVED,LEVEL from TEMP_PACKAGES
Note that:
- All the appropriate indexes exists (ID,PARENT_PACKAGE_ID)
- The type of RESERVED field is bit(NULL)
Any ideas why this happening?
Also note that if I modify the query as this:
with TEMP_PACKAGES as
(
select ID AS PACKAGE_ID,PARENT_PACKAGE_ID,1 as LEVEL from PACKAGES where PACKAGES.ID = 5405988
union all
select A.ID AS PACKAGE_ID, B.PARENT_PACKAGE_ID,B.LEVEL+1 AS LEVEL
from PACKAGES as A join TEMP_PACKAGES B on (A.PARENT_PACKAGE_ID = B.PACKAGE_ID)
)
select P.ID,P.PARENT_PACKAGE_ID,P.RESERVED,TP.LEVEL
from TEMP_PACKAGES as TP join PACKAGES as P on TP.PACKAGE_ID=P.ID
the performance is also instantly (0ms), as the first query.
Update (2022.04.13)
thank you for your answers. I attached both execution plans (fast query & slow query) as many of you have requested.
Also, the SQL server edition is 2008 64bit (SP3).
You should include the Reserved column as part of the index on ID column. Before adding the reserved column, you query only used the index and did not touch the table for any I/O.
As soon as you added reserved column, every CTE iteration needed to look up the reserved value from the table using ID.
If you cover the reserved column with an index on ID, you will get the performance you seek.
See MS documentation on this here