How to merge sets in better then O(len(set))?

324 views Asked by At

I have a list of sets called groups. groups[i] is a set of labels (integers) that are part of the same group as i. For example,

# idx:     0    1    2        3    4      5        6        7
groups = [{0}, {1}, {2,5,6}, {3}, {4,7}, {2,5,6}, {2,5,6}, {4,7}]

indicates that

  • 0 belongs to the group containing just 0
  • 1 belongs to the group containing just 1
  • 2, 5, and 6 each belong to the same group, namely the group containing 2, 5 and 6
  • 3 belongs to the group containing just 1
  • 4 and 7 each belong to the same group, namely the group containing 4 and 7

Sometimes it turns out that two groups are the same. For example, say we found out that 4 and 5 belong to the same group. Then we have to merge group {2,5,6} with group {4,7}. After this merge, groups should look like:

# idx:     0    1    2            3    4            5            6            7
groups = [{0}, {1}, {2,4,5,6,7}, {3}, {2,4,5,6,7}, {2,4,5,6,7}, {2,4,5,6,7}, {2,4,5,6,7}]

Naively, this would require updating groups[2], groups[4], groups[5], groups[6] and groups[7] or, in general, when merging groupA with groupB, it would require len(groupA) + len(groupB) updates.


A more efficient approach, would be if instead of having multiple copies of the same set in groups, we had a reference to the same set

# idx:     0    1   2        3   4       5       6       7
groups = [{0}, {1}, groupA, {3}, groupB, groupA, groupA, groupB]

where groupA == {2,5,6} and groupB == {4,7}

Then, to merge groupA and groupB would require at most min(len(groupA), len(groupB)) updates:

if len(groupA) < len(groupB):
  groupA, groupB = groupB, groupA
# len(groupA) >= len(groupB)
groupA |= groupB
for label in groupB:
  groups[label] = groupA

which would result in

# idx:     0    1   2        3   4       5       6       7
groups = [{0}, {1}, groupA, {3}, groupA, groupA, groupA, groupA]

where groupA == {2,4,5,6,7}


Is there a more efficient way to merge groups? Is Union-Find (Disjoin Set Union) data structure the way?

0

There are 0 answers