How to sum a column in SQL Server recursive cte for optimization?

1.6k views Asked by At

I have following table with hierarchical data:

FolderId ParentFolderId NumberOfAffectedItems
---------------------------------------------
1           NULL        2
2           1           3
3           2           5
4           2           3
5           1           0

I want to find number of affected items under each folders and all of its children. I can write a recursive cte, which can produce following result, after that by doing group by I can find out what I want.

Normal recursive CTE:

WITH FolderTree AS
(
    SELECT
        fsa.FolderId AS ParentFolderId,
        fsa.FolderId AS ChildFolderId,          
        fsa.NumberOfReportsAffected
    FROM
        FoldersWithNumberOfReportsAffected fsa

    UNION ALL

    SELECT
        ft.ParentFolderId,
        fsa.FolderId AS ChildFolderId,                  
        fsa.NumberOfReportsAffected
    FROM
        FoldersWithNumberOfReportsAffected fsa
    INNER JOIN
        FolderTree ft ON fsa.ParentFolderId = ft.ChildFolderId          
  )

Result:

ParentFolderId ChildFolderId NumberOfAffectedItems
--------------------------------------------------
1               1           2
1               2           3
1               3           5
1               4           3
1               5           0
2               2           3
2               3           5
2               4           3
3               3           5
4               4           3
5               5           0

But I want to optimize it, I want to start from the leaf child, while moving through the CTE itself, I want to compute NumberOfAffectedItems.

Expected CTE

WITH FolderTree AS
(
    SELECT
        fsa.FolderId AS LeafChildId,
        fsa.FolderId AS ParentFolderId,         
        fsa.NumberOfReportsAffected
    FROM
        FoldersWithNumberOfReportsAffected fsa
    LEFT JOIN
        FoldersWithNumberOfReportsAffected f ON fsa.folderid = f.ParentfolderId
    WHERE
        f.ParentfolderId is null -- this is finding leaf child

    UNION ALL

    SELECT
        ft.LeafChildId,
        fsa.FolderId AS ParentFolderId,                 
        fsa.NumberOfReportsAffected + ft.NumberOfReportsAffected AS [ComputedResult]
    FROM
        FoldersWithNumberOfReportsAffected fsa
    INNER JOIN 
        FolderTree ft ON fsa.FolderId = ft.ParentFolderId
  )

Result:

LeafChildId ParentFolderId ComputedNumberOfAffectedItems
---------------------------------------------------------
3           3               5
3           2               8
3           1               10
4           4               3
4           2               5
4           1               7
5           5               0
5           1               2

If I group by ParentFolderId, I will get a wrong result, the reason is while doing computing in CTE, the same parent folder is visited from multiple children, hence results in a wrong result. I want to find out is there anyway we can compute the result while going through the CTE itself.

1

There are 1 answers

6
Tyron78 On

Please check the following solution. I used your cte as basis and added the calculation (as column x) to it:

DECLARE @t TABLE(
  FolderID INT
 ,ParentFolderID INT
 ,NumberOfAffectedItems INT
);

INSERT INTO @t VALUES (1           ,NULL        ,2)
                     ,(2           ,1           ,3)
                     ,(3           ,2           ,5)
                     ,(4           ,2           ,3)
                     ,(5           ,1           ,0);


WITH FolderTree AS
(
    SELECT 1lvl,
        fsa.FolderId AS LeafChildId,
        fsa.ParentFolderId AS ParentFolderId,
        fsa.NumberOfAffectedItems
    FROM
        @t fsa
    LEFT JOIN
        @t f ON fsa.folderid = f.ParentfolderId
    WHERE
        f.ParentfolderId is null -- this is finding leaf child

    UNION ALL

    SELECT lvl + 1,
        ft.LeafChildId,
        fsa.ParentFolderId,                 
        fsa.NumberOfAffectedItems
    FROM
        FolderTree ft
    INNER JOIN @t fsa
        ON fsa.FolderId = ft.ParentFolderId
  )
SELECT  LeafChildId,
        ISNULL(ParentFolderId, LeafChildId) ParentFolderId,
        NumberOfAffectedItems,
        SUM(NumberOfAffectedItems) OVER (PARTITION BY LeafChildId ORDER BY ISNULL(ParentFolderId, LeafChildId) DESC) AS x
  FROM FolderTree
  ORDER BY 1, 2 DESC
  OPTION (MAXRECURSION 0)

Result:

LeafChildId ParentFolderId  NumberOfAffectedItems   x
3           3               2                       2
3           2               5                       7
3           1               3                       10
4           4               2                       2
4           2               3                       5
4           1               3                       8
5           5               2                       2
5           1               0                       2