Why do I need a list fold to actually deconstruct a tree with this tree catamorphism?

126 views Asked by At

A catamorphism can either deconstruct a value

[1,2,3].reduce((acc, x) => acc + x, 0); // 6

or maintain the structure and act like the identity of the underlying type:

[1,2,3].reduce((acc, x) => acc.concat([x]), []); // [1,2,3]

With lists (or arrays in JS) the catamorphism and the fold (of a foldable container) coincide.

However, with trees they do not:

const treeCata = f => ([x, xs]) =>
  f(x) (arrMap(treeCata(f)) (xs));

const arrMap = f => xs =>
  xs.map((x, i) => f(x, i));

const arrFold = f => acc => xs =>
  xs.reduce((acc, x) => f(acc) (x), acc);

const log = x => (console.log(x), x);

const Node = (x, xs) => ([x, xs]);
const Node_ = x => xs => ([x, xs]);

const tree = Node(1, [
  Node(2, [
    Node(3, []),
    Node(4, [])
  ]),
  Node(5, [])
]);

const foo = treeCata(Node_) (tree);

const bar = treeCata(x => xs => x + arrFold(y => z => y + z) (0) (xs)) (tree);

log(foo);
log(bar);

The role as identity works as expected. The deconstruction, however, is a bit more involved. As a matter of fact I need a list fold to conduct it. I can actually see that there is one layer of deconstruction, because otherwise a list fold wouldn't be enough for a non-linear tree. But still having to use another fold seems odd.

I can only guess that this behavior is related to the terse definition of the used catamorphism. It only takes a single case into account the product Tree a = Node a [Tree a]. Maybe it would be more promising if I'd fully embrace the algebraic structure of the type and distingiush Node/Array constructor and the empty array case.

But does this mean treeCata is not a proper catamorphism? What is it lacking?

2

There are 2 answers

7
Bergi On

I think your treeCata is fine, and is a valid catamorphism for the type Tree a = Node a [Tree a] you defined. The underlying functor is TreeF a b = Node a [b], the structure which all your algebras will have to deal with. The catamorphism only handles the recursion for you, not the list structure of your node children. If it did, it would hide information from you, and you would not be able to reconstruct the tree using the initial algebra.

Of course, you might want to define a few helper functions on TreeF. A particularly interesting one, due to the non-emptiness of your nodes, might be

// fold :: Semigroup a => TreeF a a -> a
// fold1 :: (a -> a -> a) -> TreeF a a -> a
const fold1 = f => ([x, xs]) => arrFold(f)(x)(xs)

With this, you can indeed define your sum function as

const treeSum = treeCata(fold1((x, y) => x+y));

As @Aadit remarks, this is the function that lets us derive the traversal fold from the structural fold.

const treeCata = f => ([x, xs]) =>
  f([x, arrMap(treeCata(f))(xs)]);

const arrMap = f => xs =>
  xs.map((x, i) => f(x, i));

const arrFold = f => acc => xs =>
  xs.reduce((acc, x) => f(acc) (x), acc);

const treeF_fold1 = f => ([x, xs]) => arrFold(f)(x)(xs);

const treeSum = treeCata(treeF_fold1(x => y => x+y));

const Node = (x, xs) => ([x, xs]);
const Node_ = x => xs => ([x, xs]);

const tree = Node(1, [
  Node(2, [
    Node(3, []),
    Node(4, [])
  ]),
  Node(5, [])
]);

console.log(treeCata(Node_)(tree));
console.log(treeSum(tree));

0
Aadit M Shah On

There are two kinds of folds.

  1. Structural folds which are used to perform induction on a data structure.
  2. Traversal folds which are used to traverse and summarize the elements of a data structure.

Structural folds intuitively replace each data constructor of a data structure with a given function.

Node_(1)([
    Node_(2)([
        Node_(3)([]),
        Node_(4)([])
    ]),
    Node_(5)([])
])

/*
          |
          | treeCata(fNode)
          v
*/

fNode(1)([
    fNode(2)([
        fNode(3)([]),
        fNode(4)([])
    ]),
    fNode(5)([])
])

However, traversal folds traverse the individual elements of a data structure and summarize them.

Node_(1)([
    Node_(2)([
        Node_(3)([]),
        Node_(4)([])
    ]),
    Node_(5)([])
])

/*
          |
          | foldrTree(add)(0)
          v
*/

[1, 2, 3, 4, 5].reduceRight(add, 0);

In Haskell, the Foldable type class implements the traversal folds. The minimum complete definition for the Foldable type class is the foldr method. So, let's implement it for trees.

// Node : a -> [Tree a] -> Tree a
const Node_ = value => children => ({ value, children });

// treeCata : (a -> [b] -> b) -> Tree a -> b
const treeCata = fNode => function fold({ value, children }) {
    return fNode(value)(children.map(fold));
};

// treeElems : Tree a -> [a]
const treeElems = treeCata(x => xss => [x].concat(...xss));

// foldrTree : ((b, a) -> b) -> b -> Tree a -> b
const foldrTree = cons => empty => tree =>
    treeElems(tree).reduceRight(cons, empty);

// add : (Number, Number) -> Number
const add = (x, y) => x + y;

// tree : Tree Number
const tree = Node_(1)([
    Node_(2)([
        Node_(3)([]),
        Node_(4)([])
    ]),
    Node_(5)([])
]);

console.log(foldrTree(add)(0)(tree));

Rules of thumb:

  1. Use a structural fold when you want to perform induction on a data structure.
  2. Use a traversal fold when you want to summarize the elements of a data structure.