Heap sort, Understanding the basics

367 views Asked by At

As a disclaimer I am new to this site and therefore, do not know very well how to ask questions. Please don't be too harsh because I really am trying to just understand how some of these concepts work. If i am missing understanding near the beginning, please just tell me that so I can start from there and not waste your time with the rest. Here goes nothing. Because I think my understanding might be flawed, I poised some questions about how heaps would act in different areas, and then tried to answer them.

First, I would like help understanding how a random set of numbers added to an empty heap would look. lets say for example, I have 9, 4, 5, 3, 2, 7, 8, 7. After adding it to the heap, what would the heap look like? I can visually understand this (I think) the 9 being the root, 4 being the first left child and so on and so forth, but since this isn't a tree specifically, and is a heap, would it sort the numbers by switching them (see paragraph "if my understanding is correct") so that they are sorted in either min or max order?

Now lets say we removed the 9 from the heap (I believe the 9 would be the root), how would we respond to this change and then what would then be put into the root? I think here if 9 is the root, we would take the next largest number and copy it into the slot of the nine, while if this was a min heap and we where just removing a node at the bottom, it would just be removed no problem.

Along similar lines, what would a formula to get the parent of the heap item in the array? --I think I understand this, If parent is at i, the left child would be at i*2 and the right child would be at i*2+1. And therefore going to find the parent, we would have to divide i/2 to find the parent. For example if we where at i=7 the parent would be i=3 because 3.5 would be truncated and if we where at point i=6 the parent would also be i=3. From this example the child at i = 7 would be right child of i = 3 while i=6 would be the left child of i = 3.

If my understanding of this is correct, then to reheapify after a new term has been added to the root I would compare the child to parent and if the child is larger, switch the terms. BUT I would need to compare the two children (if there are two) to see which one is bigger to decide which one needs to swap. This would be for a max heap and would go the other direction for a min heap.

Finally, if I where to add the root element, how would it reheapify?

2

There are 2 answers

1
Jason Mitrovich On

To begin with your first question about how the heap would look. It will take on the structure of a complete binary tree. We could just walk down the list and update the tree as we see it but this will ruin the run time so there is a more clever way to do it. We want to first start by linearly going through the array and adding it to the left most open slot where the first entry in the array is the root. Then once you have an array, we want to fix the heap from the ground up. This involves looking at the highest depth of the heap and fixing it by making a swap so that the minimum is the parent. Then move up one in the depth of the tree and make the swap if either child is less than the new parent. If this is true then make the swap, however we may have broken the min property and so we must recursively move down the heap to fix the property. Once we recursively move towards the top and fix the heap at the top then we will have made the min Heap desired. Note that through some nice algebra, we can show that this will run in O(n) time.

The second question about removing 9 is not correct (as it is not the root anymore) so let's focus on removing the root node. When the root node is removed (from the tree or the first entry of the array) then we need to place something there for the tree structure and we place the left most node of the tree or the last element in the array, but as you might be thinking, this may have ruined the min-property and you are right. So once you move the left most to the root node, we have to check its children and if it is smaller than both, then we are good. Otherwise, we need to swap with the smaller and repeat this for the next set of children until it is smaller than both its children.

In an array, it is correct that we use 2i and 2i+1 as the index so just dividing by 2 will not be sufficient. We note that 2i is even and 2i+1 is odd and so we should focus on whether the index we are looking at is even or odd. However, it is correct that truncating would given the correct answer for the parent and that the decimal would result in the decision for the left and right child.

To address your final concern, we should note that when you add something to a heap that it is a complete binary tree and should be added to the left most slot and not the root. When you add something to the left most (for a min heap), we need to check if it is smaller than its parents and move it towards the root.

Additionally, building your heap with O(n) is efficient when needing to run prim's algorithm or Dijkstra's Shortest Path Algorithm.

Hope this helps - Jason

0
hopper On

After 9 is deleted, nothing becomes the root. The heapsort algorithm goes to the left child for sorting (you said 4.) Then the right child (or 5), etc. If the number is being checked is the root (we have different implementations) then 4 becomes the root, then 5, etc. If you are confused, look at this definition of heapsort, written in javascript:

var heapSort = function(array) {
  var swap = function(array, firstIndex, secondIndex) {
    var temp = array[firstIndex];
    array[firstIndex] = array[secondIndex];
    array[secondIndex] = temp;
  };
  var maxHeap = function(array, i) {
    var l = 2 * i;
    var r = l + 1;
    var largest;
    if (l < array.heapSize && array[l] > array[i]) {
      largest = l;
    } else {
      largest = i;
    }
    if (r < array.heapSize && array[r] > array[largest]) {
      largest = r;
    }
    if (largest !== i) {
      swap(array, i, largest);
      maxHeap(array, largest);
    }
  };
  var buildHeap = function(array) {
    array.heapSize = array.length;
    for (var i = Math.floor(array.length / 2); i >= 0; i--) {
      maxHeap(array, i);
    }
  };
  buildHeap(array);
  for (var i = array.length-1; i >= 1; i--) {
    swap(array, 0, i);
    array.heapSize--;
    maxHeap(array, 0);
  }
  array.heapMaximum = function(){
      return this[0];
  };
  array.heapExtractMax = function(){
      if(this.heapSize < 1){
          throw new RangeError("heap underflow");
      }
      var max = this[0];
      this[0] = this[this.heapSize - 1];
      this.heapSize--;
      maxHeap(this, 1);
      return max;
  };
  array.heapIncreaseKey = function(i, key){
      if(key < this[i]){
          throw new SyntaxError("new key is smaller than current key");
      }
      this[i] = key;
      while(i > 1 && this[Math.floor(i / 2)] < this[i]){
          swap(this, i, Math.floor(i / 2));
          i = Math.floor(i / 2);
      }
  };
  array.maxHeapInsert = function(key){
      this.heapSize--;
      this[this.heapSize] = -Infinity;
      this.heapIncreaseKey(this.heapSize, key);
  };
};
var a = [Math.floor(Math.random() * 100), Math.floor(Math.random() * 100), Math.floor(Math.random() * 100), Math.floor(Math.random() * 100), Math.floor(Math.random() * 100)];
heapSort(a);
document.writeln(a);
*{
  font-family:monospace;
}

I actually don't know how it would reheapify, but you can see the snippet to find out.