Asymptotic complexity of binary heaps implemented with sets vs. priority queues

369 views Asked by At

I'm reviewing for a final and have stumbled upon the following problem:

Binary heaps have priority queue semantics by default:If we insert some element n times, we need to remove it n times before it “disappears.” Imagine you want to implement binary heaps with set semantics instead. How fast can the insert and remove operations be in this case?

(a) insert: O(n) remove: O(logn)

(b) insert: O(log n) remove: O(n)

(c) insert: O(log n) remove: O(log n)

(d) insert: O(1) remove: O(n2)

(e) None of the above.

To insert into a heap implemented with priority queue semantics, it will be O(logn), and so will the removal. But with sets, insertions and removals depend on many factors of the set itself. What do you think the answer should be?

2

There are 2 answers

4
A Sdi On

I would say insert: O(n) remove: O(log n). In a set duplicated item is not allowed and attempt to insert a duplicate item should be ignored and since a heap is a binary tree that is completely filled, (with a exception of the bottom) , If you want to insert to a set you search(item) if you find it you do nothing else insert so insertion may take O(n) for both find and insert (you may percolate up at most log n times). deleteMin has o(log n) time complexity

0
meriton On

It is not perfectly clear what set semantics refer to, but the author likely means the property

A ∪ {x} ∪ {x} = A ∪ {x}

i.e. adding an element that is already in the set should have no (observable) effect.

A straightforward way to do this is to check for duplicates upon insertion. However, in a normal binary heap, removing an arbitrary element takes O(n), which is clearly rather slow.

A smarter idea might be to eliminate duplicates upon removal of the smallest element, by repeating removal until we see a new value. This takes O(k log n), where k is the number of duplicates for the retrieved node. If k is known to be constant, this is O(log n). In the worst case though, k = n, making the worst case time complexity O(n log n). However, if we measure amortized time complexity instead, this algorithm achieves O(log n) for insertion and removal.

It is possible to achieve worst case time complexity O(log n) for insertion and removal by using a balanced binary search tree instead of a heap. Whether this qualifies as "heap" is not clear though.

In summary, possible answers are:

  • O(n), O(log n) : somewhat naive heap
  • O(log n), O(n log n) : tuned real heap, worst case time complexity
  • O(log n), O(log n) : tuned real heap, amortized time complexity
  • O(log n), O(log n) : balanced binary search tree masquerading as a heap

That is, depending on how we interpret the question, different answers are correct.