Heap Sort
Heapsort is a comparison-based sorting algorithm that uses a binary heap data structure. Like mergesort, heapsort has a running time of \(O(n\log n),\) and like insertion sort, heapsort sorts in-place, so no extra space is needed during the sort.
The binary heap data structure allows the heapsort algorithm to take advantage of the heap's heap properties and the heapsort algorithm makes use of the efficient running time for inserting to and deleting from the heap.
Contents
Heap Data Structure
The binary heap data structure is heap implementation. These are often shown as an array object that can be viewed as nearly complete binary tree built out of a given set of data. The heap data structure is also used in the construction of a priority queue. The complete binary tree maps the binary tree structure into array indices, as shown in the figure below. Each array index represents a node.
The node's parent, left, and right child can be expressed as
- parent\((i)\rightarrow \big\lfloor \frac { i }{ 2 } \big\rfloor \)
- leftchild\((i)\rightarrow 2*i\)
- rightchild\((i)\rightarrow 2*i +1.\)
There are two kinds of binary heaps: max-heap and min-heap. Both types of heaps satisfy a certain heap property.
Max-heap Property:
If \(A\) is an array representation of a heap, then in max-heap
\[A[\text{parent}(i)]\geq A[i],\]
which means that a node can't have a greater value than its parent. In a max-heap, the largest element is stored at the root, and the minimum elements are in the leaves.
Min-heap Property:
Similarly, if \(A\) is an array representation of a heap, then in min-heap
\[A[\text{parent}(i)]\leq A[i],\]
which means that a parent node can't have a greater value than its children. Thus, the minimum element is located at the root, and the maximum elements are located in the leaves.
Both min-heap and max-heap can be used to implement a heapsort, but this wiki will discuss heapsort in terms of max-heaps. To convert to min-heap, just change the problem around to use min-heaps and ensure that the min-heap property holds.
Show that the array \( [100, 19, 36, 6, 3, 25, 1, 2, 7]\) doesn't satisfy the max-heap property by drawing the heap.
We have in a max-heap \(A[\text{parent}(i)]\geq A[i]\):
But we can see it is not the case here, since \(6<7\). \(_\square\)
Maintaining a Max-heap
In order to maintain the max-heap property, heapsort uses a procedure called max_heapify(A,i)
. It takes an array \(A\) and an index in the array \(i\) as input. Maintaining the max-heap property is a vital part of the heapsort algorithm.
Here is a Python implementation of max_heapify
:
1 2 3 4 5 6 7 8 9 10 11 |
|
Essentially, if an element \(A[i]\) violates the max-heap property, max_heapify
will correct it by trickling the element down the tree, until the subtree rooted at index \(i\) is a max-heap (and therefore the violation is corrected).
Heapsort Algorithm
The heapsort algorithm has two main parts (that will be broken down further below): building a max-heap and then sorting it. The max-heap is built as described in the above section. Then, heapsort produces a sorted array by repeatedly removing the largest element from the heap (which is the root of the heap), and then inserting it into the array. The heap is updated after each removal. Once all elements have been removed from the heap, the result is a sorted array.
The heapsort algorithm uses the max_heapify
function, and all put together, the heapsort algorithm sorts a heap array \(A\) like this:
- Build a max-heap from an unordered array.
- Find the maximum element, which is located at \(A[0]\) because the heap is a max-heap.
- Swap elements \(A[n]\) and \(A[0]\) so that the maximum element is at the end of the array where it belongs.
- Decrement the heap size by one (this discards the node we just moved to the bottom of the heap, which was the largest element). In a manner of speaking, the sorted part of the list has grown and the heap (which holds the unsorted elements) has shrunk.
- Now run
max_heapify
on the heap in case the new root causes a violation of the max-heap property. (Its children will still be max-heaps.) - Return to step 2.
Implementation of Heapsort
Here is a Python implementation of the heapsort algorithm.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
|
Complexity of Heapsort
Heapsort has a running time of \(O(n\log n)\).
Building the max-heap from the unsorted list requires \(O(n)\) calls to the max_heapify
function, each of which takes \(O( \log n)\) time. Thus, the running time of build_heap
is \(O(n \log n)\).
Note: While it is true that build_heap
has a running time of \(O(n \log n)\), a tighter bound of \(O(n)\) can be proved by analyzing the height of the tree where max_heapify
is called. However, this does not change the overall running time of heapsort, and since the explanation of this is quite involved, it has been omitted​.
Heapsort has a running time of \(O(n\log n)\) since the call to build_heap
takes \(O(n \log n)\) time, and each of the \(O(n)\) calls to max_heapify
takes \(O(\log n)\) time.
Heapsort has a worst- and average-case running time of \(O(n \log n)\) like mergesort, but heapsort uses \(O(1)\) auxiliary space (since it is an in-place sort) while mergesort takes up \(O(n)\) auxiliary space, so if memory concerns are an issue, heapsort might be a good, fast choice for a sorting algorithm. Quicksort has an average-case running time of \(O(n \log n)\) but has notoriously better constant factors, making quicksort faster than other \(O(n \log n)\)-time sorting algorithms. However, quicksort has a worst-case running time of \(O(n^2)\) and a worst-case space complexity of \(O(\log n\)), so if it is very important to have a fast worst-case running time and efficient space usage, heapsort is the best option. Note, though, that heapsort is slower than quicksort on average in most cases.
Pros and Cons
Pros
- Time-efficient with time complexity of \(O(n \log n)\)
- Less memory usage
- Consistent performance: it performs equally well in best-, average-, and worst-case scenarios
Cons
- Unstable sort
- External sorting not possible with heapsort
See Also
References
- , S. Heapsort Example. Retrieved June 7, 2016, from https://en.wikipedia.org/wiki/File:Heapsort-example.gif
- Demaine, E., & Devadas, S. Lecture 4: Heaps and Heap Sort. Retrieved May 23, 2016, from http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/MIT6_006F11_lec04.pdf