Genealogy tree Algorithm

2.5k views Asked by At

I'm new in this domain and like to write an application managing genealogical data. My main concern is how to store and retreive these data from MySQL. I know that DB like Oracle are optimised for recursive queries, but maybe I can find an alternative solution to use MySQL which I undestand is not supporting "CONNECT" . PS. I know that there are thousands of existing Open source solutions, but consider that these data will be a limited part of the functionality, and I need to keep control of the full code.

I had a quick look on the web and found some interesting approach, for instance Interval-based algo which is perfect for queries but not satisfactory for update / deletion.

I'm going to have a look on Prefix-based(Dewey) approach, but one may know an efficient and proven approach to share ?

thanks

gilles

2

There are 2 answers

13
dani herrera On BEST ANSWER

First problem, design data schema: I keep hierarchis with a foreign key to parent row. It is simply.

Second problem, retrieve ascendants/descendants: As you explain, problems comes with select: select some people and all descendants os ascendants. To solve this you should to create a new tree table. This table contains pairs: al combination to a person with all they ancestors (and itself):

people( id, name, id_parent)
people_tree( id, id_ancestor, distance )

Noticie that with this structure is easy to query hierarchies. Sample: all descendants of somebody:

select people.*, distance
from 
  people p
    inner join 
  people_tree t 
    on ( p.id = t.id)
where
  id_ancesor = **sombody.id **

You can play with distance to get only grandparents, grandwchildren, etc. ...

Last problem, keep tree: tree must be all time up to data. You should automatize this: a trigger over people or a store procedure for CRUD operations,

EDITED

Because this is a Genealogy tree, each person must to have both references, parent and mother:

people( id, name, id_parent, id_mother)

Then, 2 trees are needed:

parent_ancestors_tree( id, id_ancestor, distance )
mother_ancestors_tree( id, id_ancestor, distance )

David ask for sample data:

people: id    name    id_parent    id_mother
         1    Adam         NULL      NULL
         2    Eva          NULL      NULL
         3    Cain            1         2
        ..    ...
         8    Enoc            3         5

parent_ancestors_tree id    id_ancestor  distance
              (Adam)   1              1         0
              (Eva)    2              2         0
              (Cain)   3              3         0
                       3              1         1
              (Enoc)   8              8         0
                       8              3         1
                       8              1         2

mother_ancestors_tree id    id_ancestor  distance
              (Adam)   1              1         0
              (Eva)    2              2         0
              (Cain)   3              3         0
                       3              2         1
              (Enoc)   8              8         0
                  -- here ancestors of Enoc's mother --

regards.

0
Micromega On

I would also recommend an adjacent tree model and for the more complex logic I suggest to use simple mysql queries (Joins). Most likely it's more important to create the tree. You can do more data mining when the application is finished and everything went ok.