MySql sorting hierarchical data in a closure table that has repeated nodes

947 views Asked by At

I have a hierarchy that I have represented as a closure table, as described by Bill Karwin. I am trying to write a query that will return the nodes sorted as a depth-first traversal. This reply would solve my problem, except that in my structure some nodes appear more than once because they have multiple parents.

My sample data looks like this:

  • 1
    • 2
      • 5
    • 3
      • 5
    • 4
      • 6
      • 2
        • 5

As you can see, node 2 appears twice, both as a child and a grandchild of the root. Node 5 appears twice as a grandchild of the root (each time with a different parent), and then again as a great-grandchild because its parent, node 2, is repeated.

This will set up the data as a closure table:

CREATE TABLE ancestor_descendant (
  ancestor int NOT NULL,
  descendant int NOT NULL,
  path_length int NOT NULL
);
INSERT INTO ancestor_descendant (ancestor, descendant, path_length) VALUES 
    (1,1,0),(2,2,0),(3,3,0),(4,4,0),(5,5,0),(6,6,0),(1,2,1),(1,3,1),(1,4,1),
    (2,5,1),(3,5,1),(4,6,1),(4,2,1),(1,5,2),(1,6,2),(1,2,2),(1,5,3),(4,5,2);

or as an adjacency list:

CREATE TABLE parent_child (
  parent int NOT NULL,
  child int NOT NULL
);
INSERT INTO parent_child (parent, child) VALUES 
    (1,2),(1,3),(1,4),(2,5),(3,5),(4,2),(4,6);

I can produce a breadth-first traversal (although 5 only appears as a grandchild once):

SELECT CONCAT(LPAD('', path_length, '-'), ' ', descendant)
FROM ancestor_descendant
WHERE ancestor = 1
ORDER BY path_length;

1
- 2
- 3
- 4
-- 5
-- 6
-- 2
--- 5

but my attempt at a depth-first traversal using breadcrumbs fails (it shows the repeated nodes only once because of the GROUP BY a.descendant):

SELECT a.descendant, GROUP_CONCAT(b.ancestor ORDER BY b.path_length DESC) AS breadcrumbs
FROM ancestor_descendant a 
INNER JOIN ancestor_descendant b ON (b.descendant = a.descendant) 
WHERE a.ancestor = 1
GROUP BY a.descendant 
ORDER BY breadcrumbs;

1   1
2   1,1,4,1,4,1,2,2
5   1,1,4,1,4,1,3,2,3,2,5,5
3   1,3
4   1,4
6   1,4,6

Is it possible to output a depth-first traversal using a closure table representation?

Should I use an alternative representation? I can't use recursive CTEs, because I'm restricted to MySql (which doesn't implement them).

1

There are 1 answers

1
Gordon Linoff On BEST ANSWER

I would suggest splitting the node id into two concepts. One would be a unique id that is used for the graph properties (i.e. ancestor_descendant list). The second is what you show on output.

  • 1
    • 2
      • 5
    • 3
      • 50
    • 4
      • 6
      • 20
        • 51

Then create a mapping table:

Id      Value
 1        1
 2        2
20        2
 3        3
 4        4
 5        5
50        5
51        5
 6        6

You can then get what you want by joining back to the mapping table and using the value column instead of the id column.