I have been researching methods to improve a hierarchical look through (currently carried out using a C# task) on a universe of financial portfolios. The purpose is to determine the ultimate holdings of the top parent portfolio by apportioning out holdings of children portfolios accordingly.
The current set of data takes around 30 minutes to run. This is for about 1000 financial portfolios, and each portfolio can link up to about 4 portfolios (4 deep, with around 40 line items on average each).
The data is stored in SQL Server so this was my first avenue, and initially attempted a CTE approach. The CTE approach shortened the run to around 7 minutes, which is a huge improvement. However when ran against a larger set of data, it then stalls and takes even longer than the C# task.
I then turned to a set-base while loop (LAHP) approach after reading the following article: https://www.simple-talk.com/sql/performance/the-performance-of-traversing-a-sql-hierarchy-/
I don't want to go into too much detail into how it works as this question is already long enough, however it uses a combination of 2 temp tables, and then a while loop to insert into alternating tables, side stepping the Halloween protection.
The current LAHP is even faster than the CTE and seems to be better at handling larger sets of portfolios, and deeper look throughs, however some of the results (financial holding value of the instrument (child/security) within the portfolio) are coming out exactly half of what they should be.
Please see query below, this is used in a stored proc format and then takes each root portfolio as a variable.
DECLARE @root AS VARCHAR(50) = 'Portfolio1';
CREATE TABLE #T1
(
ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
lvl INT NOT NULL,
PortfolioCode varchar(MAX) NULL,
InstrumentCode varchar(MAX) NULL,
LinkedID varchar(MAX) NULL,
TradedMarketValue NUMERIC(28, 8) NOT NULL
);
CREATE NONCLUSTERED INDEX IDX_Pcode ON #T1(ID)
INCLUDE (lvl)
CREATE TABLE #T2
(
ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
lvl INT NOT NULL,
PortfolioCode varchar(MAX) NULL,
InstrumentCode varchar(MAX) NULL,
LinkedID varchar(MAX) NULL,
TradedMarketValue NUMERIC(28, 8) NOT NULL
);
CREATE NONCLUSTERED INDEX IDX_Pcode ON #T2(ID)
INCLUDE (lvl)
DECLARE @lvl AS INT = 0;
-- insert root node
INSERT INTO #T1
(
lvl,
LinkedID,
PortfolioCode,
InstrumentCode,
TradedMarketValue
)
SELECT
@lvl,
I.LinkedID,
P.PortfolioCode,
I.Code,
P.TradedMarketValue
FROM PortfolioHolding P
join Instrument I on I.Code = P.InstrumentCode
WHERE P.PortfolioCode = @root
WHILE @@ROWCOUNT > 0
BEGIN
SET @lvl += 1;
-- insert children of nodes in prev level
IF @lvl % 2 = 1
INSERT INTO #T2
(
lvl,
LinkedID,
PortfolioCode,
InstrumentCode,
TradedMarketValue
)
SELECT
@lvl,
I.LinkedID,
P.PortfolioCode,
I.Code,
--calculate ratio of holding and apply to holding
cast(cast(cast(ISNULL(T.TradedMarketValue, 0) as float) /
NULLIF(sum(cast(ISNULL(P.TradedMarketValue, 0) as float)) over (partition by P.PortfolioCode) , 0) as float)
* ISNULL(P.TradedMarketValue, 0) as float)
FROM #T1 AS T
INNER JOIN PortfolioHolding AS P
ON T.lvl = @lvl - 1
AND P.PortfolioCode = T.LinkedID
JOIN Instrument I on I.Code = P.InstrumentCode;
ELSE
INSERT INTO #T1
(
lvl,
LinkedID,
PortfolioCode,
InstrumentCode,
TradedMarketValue
)
SELECT
@lvl,
I.LinkedID,
P.PortfolioCode,
I.Code,
--calculate ratio of holding and apply to holding
(cast(cast(cast(ISNULL(T.TradedMarketValue, 0) as float) /
NULLIF(sum(cast(ISNULL(P.TradedMarketValue, 0) as float)) over (partition by P.PortfolioCode) , 0) as float)
* ISNULL(P.TradedMarketValue, 0) as float))
FROM #T2 AS T
INNER JOIN PortfolioHolding AS P
ON T.lvl = @lvl - 1
AND P.PortfolioCode = T.LinkedID
JOIN Instrument I on I.Code = P.InstrumentCode;
END
SELECT
@root, InstrumentCode ,sum(tradedmarketvalue) As TradedMarketValue
FROM
(SELECT * FROM #T1 UNION ALL SELECT * FROM #T2) AS U
WHERE LinkedID is null
group by InstrumentCode
DROP TABLE #T1, #T2;
As you can see from the query the PortfolioCode (ParentID
) holds InstrumentCodes (ChildId
), and the LinkedId
matches an InstrumentCode
to its relevant PortfolioCode
.
The TradedMarketValue
is summed where LinkedID
is null
to get all the leaf nodes.
As previously mentioned some of the TradedMarketValues are coming in at half of what they should be. Your help will be greatly appreciated.
Well well well... After spending a good 2 hours creating sample data to post with my question and running through it the answer presented itself!
Please see sample data below:
Instrument Table:
Portfolio Table:
So as an example, if we wanted to explode Portfolio 1
Port1
, this would end up 3 portfolios deep, first intoPort2
then into a 3rd level ofPort4
andPort5
See diagram:
This whole system works and the query provides the correct results. The problem comes in when two different instruments in one portfolio point to the same
LinkedId
or 'PortfolioCode'.See diagram:
What then occurs is this section of the code:
that calculates the ratio of the TradedMarketValue (TMV) of the parent portfolio to the total value of the child portfolio is summing twice, causing the denominator to double.
To resolve this issue, one can change the above section of code to
adding an extra variable to partition by (group by)
T.InstrumentCode
which then stops the above occurring.I hope this helps someone down the line.