tree - how to figure out using mysql if the descendants of a node are complete


I asked a similar question before, but no one replied, so I've rethought the issue and am asking a revised version of the question.

I have a process that starts with a parent/root (top) app. The root app spawns child apps, which can in turn spawn their own descendant apps. This can continue for multiple levels. Each app is then either a node or a leaf: a node has spawned children/descendant apps, a leaf has spawned none.

At the start of the process, the app knows the number of levels. The process is also structured so each child app is able to update a tbl when it completes, with its own ID, as well as the parentID.

So, when the entire process runs, the resulting data is a hierarchical tree.

I'm trying to figure out how to be able to look at a given item/node in the tree, and to determine if the descendant apps are complete.

I'm trying to accomplish this in mysql. I'm not that familiar with stored procedures/sub-selects. I've seen a number of online papers/sites that discuss this, but nothing that appears to be on point for my problem.

Looking for a mysql guru to help me get clarity on this issue.

Thanks!

---------------------------------

The sample tree would look like:

spawn
3 levels
a - 3 copies of b
b - 3 copies of c


                               a(1)
                                |
       ---------------------------------------------------
       |b(1)                    |b(2)                    |b(3)
-------------------      -------------------      -------------------
|c(1)  |c(2)  |c(3)      |c(1)  |c(2)  |c(3)      |c(1)  |c(2)  |c(3)


so we have a total of 12 spawned crawls/fetches (3 b + 9 c), or 13 counting the root a

the levels
a
b
c

the (parent/child) levelRelationships
"",a
a,b
b,c

start level
a (parent/top)
end level
c (leaf)


operational process:

an app spawns either no child app, a single child app, or multiple child app(s)
an app that spawns children is a node
an app that spawns no children is a leaf
 there is no guarantee that an app at a given level will stop operation 
 before an app at a lower level started by its parent
each child app can set a tbl with a status when it completes 
 when each child app is complete, it generates a "level/complete" status
  which is stored in a levelStatusTBL

at the start of the root/top level process:
-the tree can have multiple/unknown levels
-each child app can spawn an unknown number of children


issue...
 how to algorithmically determine when all the descendants of a root/top level function have completed?
 how to algorithmically determine when all the descendants of a node have completed?

The sample tbls that I'm considering are:

CREATE TABLE `crawlNodeChildrenCountTBL` (
  `rootID` varchar(100) NOT NULL DEFAULT '',
  `uCrawlID` varchar(100) NOT NULL DEFAULT '',
  `childCount` int(5) NOT NULL DEFAULT 0,
  `ID` int(10) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE `EdgeNodeCheckTBL` (
  `CollegeID` varchar(100) NOT NULL DEFAULT '',
  `rootID` varchar(100) NOT NULL DEFAULT '',
  `parentLevel` varchar(100) NOT NULL DEFAULT '',
  `Level` varchar(100) NOT NULL DEFAULT '',
  `nodeType` int(5) NOT NULL DEFAULT 0,     
  `masterParseInputUUID` varchar(100) NOT NULL DEFAULT '',
  `parentSetupPreComboID` varchar(100) NOT NULL DEFAULT '',
  `SetupPreComboChildStatusID` varchar(100) NOT NULL DEFAULT '',
  `ID` int(10) NOT NULL AUTO_INCREMENT,
  UNIQUE KEY `ID` (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;


EdgeNodeCheckTBL.SetupPreComboChildStatusID is the node's own ID (the baseID)
EdgeNodeCheckTBL.parentSetupPreComboID is the ID of that node's parent

Together these are used to implement the standard child/parent relationship tbl.

There is 1 answer

Answered by staticsan

This is really a data-structure and algorithm question. The basic problem is that storing the parent ID in each child record of a relational database means that finding all the descendants of a node is going to require recursive queries. If you are okay with doing that, either in a stored procedure or in another language, then this is a valid approach.
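To make the recursive approach concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs on its own; the same `WITH RECURSIVE` form should carry over to MySQL 8.0 or later, while older versions would need a stored procedure or a loop in application code. The `status` table, its columns, and the `descendants_complete` helper are all hypothetical simplifications of the tbls in the question. The assumption is that each app writes one row when it completes, recording its parent and how many children it spawned, so a subtree is complete when every node in it has as many reported children as it says it spawned.

```python
import sqlite3

# Hypothetical, simplified table: one row per app that has reported completion.
SCHEMA = """
CREATE TABLE status (
    id          TEXT PRIMARY KEY,
    parent_id   TEXT,              -- NULL for the root app
    child_count INTEGER NOT NULL   -- children this app spawned (0 for a leaf)
)
"""

# Walk down from the given node, then count the nodes in that subtree whose
# number of reported children differs from the number they spawned.
INCOMPLETE_NODES = """
WITH RECURSIVE subtree(id, child_count) AS (
    SELECT id, child_count FROM status WHERE id = ?
    UNION ALL
    SELECT s.id, s.child_count
    FROM status AS s
    JOIN subtree AS t ON s.parent_id = t.id
)
SELECT COUNT(*)
FROM subtree AS t
WHERE t.child_count <> (SELECT COUNT(*) FROM status AS c WHERE c.parent_id = t.id)
"""


def descendants_complete(conn, node_id):
    """Return True if node_id and everything below it have reported completion."""
    if conn.execute("SELECT 1 FROM status WHERE id = ?", (node_id,)).fetchone() is None:
        return False  # the node itself has not reported yet
    (short,) = conn.execute(INCOMPLETE_NODES, (node_id,)).fetchone()
    return short == 0


def sample_rows():
    """The a/b/c tree from the question: 1 root, 3 b nodes, 9 c leaves."""
    rows = [("a1", None, 3)]
    for b in (1, 2, 3):
        b_id = f"a1.b{b}"
        rows.append((b_id, "a1", 3))
        for c in (1, 2, 3):
            rows.append((f"{b_id}.c{c}", b_id, 0))
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    # Pretend one leaf has not finished yet.
    done = [r for r in sample_rows() if r[0] != "a1.b2.c3"]
    conn.executemany("INSERT INTO status VALUES (?, ?, ?)", done)
    print(descendants_complete(conn, "a1"))     # False
    print(descendants_complete(conn, "a1.b1"))  # True
```

Storing the spawned-child count is what lets the check cope with the case where a parent finishes before its children: a child that has not reported yet shows up as a mismatch at its parent.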

A better way to store trees in a relational database is to use a nested set model. The idea is that each node has a Left Id and a Right Id, both taken from a single counter that is incremented during a depth-first traversal of the tree. The Left Id is set when the traversal first reaches the node, going down the tree away from the top node; the Right Id is set when the traversal leaves the node, going back up the tree towards the top node. You also have to update the numbers whenever the tree is modified.
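The numbering itself can be sketched in a few lines of Python. This is only an illustration of the traversal, not a full implementation: `number_nested_set` and the `children` mapping are hypothetical names, and the tree is assumed to be fully known and small enough for plain recursion.

```python
def number_nested_set(children, root):
    """Return {node: (left, right)} for the tree rooted at `root`.

    `children` maps each node to the list of its child nodes;
    nodes missing from the mapping are treated as leaves.
    """
    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter              # assigned on the way down
        for child in children.get(node, []):
            visit(child)
        counter += 1
        bounds[node] = (left, counter)  # right is assigned on the way back up

    visit(root)
    return bounds


if __name__ == "__main__":
    # The a/b/c tree from the question.
    tree = {"a1": ["b1", "b2", "b3"]}
    for b in ("b1", "b2", "b3"):
        tree[b] = [f"{b}.c{i}" for i in (1, 2, 3)]
    bounds = number_nested_set(tree, "a1")
    print(bounds["a1"])  # (1, 26)
    print(bounds["b1"])  # (2, 9)
```

With 13 nodes the counter runs from 1 to 26, and every node's pair sits strictly inside its parent's pair.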

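The advantage this structure gives you is that you don't need recursive queries to examine or update the tree. For example, the descendants of a node are exactly the rows whose Left Id falls between that node's Left Id and Right Id. It also encourages you to isolate modifications to the tree to one place so that you can update the Left and Right Ids correctly.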
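Applied to the question, the "are all descendants complete?" check becomes a single non-recursive query. The sketch below again uses Python's sqlite3 as a stand-in for MySQL; the `nested` table, its `lft`/`rgt`/`done` columns, and the helper name are hypothetical. It also assumes, unlike the tbls in the question, that a row is inserted (and the tree renumbered) when an app is spawned, with `done` flipped to 1 on completion.

```python
import sqlite3

# Hypothetical table: one row per spawned app, numbered as a nested set.
SCHEMA = """
CREATE TABLE nested (
    id   TEXT PRIMARY KEY,
    lft  INTEGER NOT NULL,
    rgt  INTEGER NOT NULL,
    done INTEGER NOT NULL DEFAULT 0   -- 1 once the app has completed
)
"""

# Descendants of n are the rows whose lft lies strictly between n.lft and n.rgt.
PENDING_DESCENDANTS = """
SELECT COUNT(*)
FROM nested AS n
JOIN nested AS d ON d.lft > n.lft AND d.lft < n.rgt
WHERE n.id = ? AND d.done = 0
"""

# The a/b/c tree from the question, numbered by a depth-first walk.
SAMPLE = [
    ("a1", 1, 26),
    ("b1", 2, 9), ("b1.c1", 3, 4), ("b1.c2", 5, 6), ("b1.c3", 7, 8),
    ("b2", 10, 17), ("b2.c1", 11, 12), ("b2.c2", 13, 14), ("b2.c3", 15, 16),
    ("b3", 18, 25), ("b3.c1", 19, 20), ("b3.c2", 21, 22), ("b3.c3", 23, 24),
]


def descendants_complete(conn, node_id):
    """Return True if no descendant of node_id is still pending.

    A leaf (or an unknown id) has no descendants, so this returns True for it.
    """
    (pending,) = conn.execute(PENDING_DESCENDANTS, (node_id,)).fetchone()
    return pending == 0


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO nested (id, lft, rgt) VALUES (?, ?, ?)", SAMPLE)
    # Everything except one leaf has finished.
    conn.execute("UPDATE nested SET done = 1 WHERE id <> 'b2.c3'")
    print(descendants_complete(conn, "a1"))  # False
    print(descendants_complete(conn, "b1"))  # True
```

The trade-off is that every spawn has to shift the Left and Right Ids of the rows that follow it, so this suits trees that are read far more often than they change.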

The Wikipedia article on the nested set model has more details.