Why does the weighted quick union algorithm consider the sizes of the tree instead of their heights?

5k views Asked by At

I was watching Robert Sedgewick's video on improvements to quick union. (https://youtu.be/sEo6LlPxpHE?t=267)

There he uses size of the tree rather than the height. Actually the problem is finding the root node . finding will be difficult if the height is high. so we should find a way to mitigate the effect of height. Only if we compare heights will it not work as expected ? will connecting shorter tree to taller tree not solve the problem instead doing :connecting tree with smaller number of nodes to tree with larger number of nodes ?

what about the following case ? enter image description here

according to the logic in video:

size of A tree = 4

size of B tree = 7

and if you connect A to B . actually we are making the resulting tree taller(height 4). But had we done it based on tree heights we could have solved it by connecting tree B to A . and so the resulting tree will have height 3.

Am I right ? If wrong where am I wrong ?

3

There are 3 answers

2
templatetypedef On

Keep in mind that most implementations of a disjoint-set forest apply the path compression optimization, in which during every lookup, you change the parent pointers of all the nodes in the chain so that they all point to their representative.

It turns out that if you use path compression and keep track of the heights of all the nodes prior to compressing them, then the asymptotic performance of linking by height (usually called union-by-rank, where the "rank" is the height prior to any compressions) and linking by weight are identical. Both give the inverse Ackermann time complexity. This result comes from this paper, which is highly technical but does prove both results.

Even if you don't do this, though, there's another way to see that these two approaches are (roughly) equal to one another. Notice that if you have a tree of height one, it has to have at least two nodes in it. Why? Well, the only way you can make a tree of height one is to merge to trees of height zero, each of which must have at least one node. Using similar logic, you can see that if you have a tree of height two, then it must have at least four nodes, since to form it you had to merge two trees of height one together. More generally, you can show that a tree of height n must have at least 2n nodes in it. Therefore, merging by the height is essentially the same as merging by weight, since there's a close connection between the heights and sizes of the trees.

Hope this helps!

0
Valar Morghulis On

First, as others mentioned, both size and height have similar performance.

But still, why we choose size over height? I think it's because the height will change during path compression while the size will not.

This is my own opinion since I didn't find it anywhere. Hope it's true.

1
S.K. On

Weighted quick-union by size or by height are roughly the same, they have the same upper bounds, as mentioned by @templatetypedef before.

Let me share a few more details.

In the strategy weighted by size, let a tree T has a single node x, its depth(or height) h is 1. h increase by 1 only if T is merged to another tree whose size is equal or larger than T. so,

  • for h = 1, tree size N = 1
  • h = 2, N >= 2
  • h = 3, N >= 4
  • h = 4, N >= 8
  • ....

In another words, the upper bound of h is lgN + 1( h <= (lgN + 1) ).

What if union-find is weighted by height? It's obvious that when tree depth h = 2, at least 2 nodes are required. When h = 3, the minimal numbers of nodes are 4(two trees of depth 2 both with minimal numbers of nodes merged ). The situation is the same as above,

  • for h=1, tree size N = 1
  • h = 2, N >= 2
  • h = 3, N >= 4
  • h = 4, N >= 8
  • ....

For union-find weighted by height, we have the conclusion: h <= (lgN + 1). That is, no matter weighted by height or by size, their depths have the same upper bound.

With regard to the confusing case you mentioned, that's caused by the randomness of merging, we have find out the upper bound of the depth, let's say it's 5, but it's possible the outcome are 4 or 3, or any numbers not exceeding 5, base on the sequence of merging.

One more words, the textbook Algorithms(4th edition) do mention weighed quick-union by height, at the Creative Problems of 1.5.14, pages 237.