Heapsort is a comparison-based sorting algorithm that uses a binary heap data structure. Like mergesort, heapsort has a running time of and like insertion sort, heapsort sorts in-place, so no extra space is needed during the sort.
The binary heap data structure allows the heapsort algorithm to take advantage of the heap's heap properties and the heapsort algorithm makes use of the efficient running time for inserting to and deleting from the heap.
The binary heap data structure is heap implementation. These are often shown as an array object that can be viewed as nearly complete binary tree built out of a given set of data. The heap data structure is also used in the construction of a priority queue. The complete binary tree maps the binary tree structure into array indices, as shown in the figure below. Each array index represents a node.
The node's parent, left, and right child can be expressed as
There are two kinds of binary heaps: max-heap and min-heap. Both types of heaps satisfy a certain heap property.
If is an array representation of a heap, then in max-heap which means that a node can't have a greater value than its parent. In a max-heap, the largest element is stored at the root, and the minimum elements are in the leaves.
Similarly, if is an array representation of a heap, then in min-heap which means that a parent node can't have a greater value than its children. Thus, the minimum element is located at the root, and the maximum elements are located in the leaves.
Both min-heap and max-heap can be used to implement a heapsort, but this wiki will discuss heapsort in terms of max-heaps. To convert to min-heap, just change the problem around to use min-heaps and ensure that the min-heap property holds.
Show that the array doesn't satisfy the max-heap property by drawing the heap.
We have in a max-heap :
But we can see it is not the case here, since .
In order to maintain the max-heap property, heapsort uses a procedure called
max_heapify(A,i). It takes an array and an index in the array as input. Maintaining the max-heap property is a vital part of the heapsort algorithm.
Here is a Python implementation of
1 2 3 4 5 6 7 8 9 10 11
Essentially, if an element violates the max-heap property,
max_heapify will correct it by trickling the element down the tree, until the subtree rooted at index is a max-heap (and therefore the violation is corrected).
The heapsort algorithm has two main parts (that will be broken down further below): building a max-heap and then sorting it. The max-heap is built as described in the above section. Then, heapsort produces a sorted array by repeatedly removing the largest element from the heap (which is the root of the heap), and then inserting it into the array. The heap is updated after each removal. Once all elements have been removed from the heap, the result is a sorted array.
The heapsort algorithm uses the
max_heapify function, and all put together, the heapsort algorithm sorts a heap array like this:
- Build a max-heap from an unordered array.
- Find the maximum element, which is located at because the heap is a max-heap.
- Swap elements and so that the maximum element is at the end of the array where it belongs.
- Decrement the heap size by one (this discards the node we just moved to the bottom of the heap, which was the largest element). In a manner of speaking, the sorted part of the list has grown and the heap (which holds the unsorted elements) has shrunk.
- Now run
max_heapifyon the heap in case the new root causes a violation of the max-heap property. (Its children will still be max-heaps.)
- Return to step 2.
Here is a Python implementation of the heapsort algorithm.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Heapsort has a running time of .
Building the max-heap from the unsorted list requires calls to the
max_heapify function, each of which takes time. Thus, the running time of
build_heap is .
Note: While it is true that
build_heap has a running time of , a tighter bound of can be proved by analyzing the height of the tree where
max_heapify is called. However, this does not change the overall running time of heapsort, and since the explanation of this is quite involved, it has been omitted.
Heapsort has a running time of since the call to
build_heap takes time, and each of the calls to
max_heapify takes time.
Heapsort has a worst- and average-case running time of like mergesort, but heapsort uses auxiliary space (since it is an in-place sort) while mergesort takes up auxiliary space, so if memory concerns are an issue, heapsort might be a good, fast choice for a sorting algorithm. Quicksort has an average-case running time of but has notoriously better constant factors, making quicksort faster than other -time sorting algorithms. However, quicksort has a worst-case running time of and a worst-case space complexity of ), so if it is very important to have a fast worst-case running time and efficient space usage, heapsort is the best option. Note, though, that heapsort is slower than quicksort on average in most cases.
- Time-efficient with time complexity of
- Less memory usage
- Consistent performance: it performs equally well in best-, average-, and worst-case scenarios
- Unstable sort
- External sorting not possible with heapsort
- , S. Heapsort Example. Retrieved June 7, 2016, from https://en.wikipedia.org/wiki/File:Heapsort-example.gif
- Demaine, E., & Devadas, S. Lecture 4: Heaps and Heap Sort. Retrieved May 23, 2016, from http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/MIT6_006F11_lec04.pdf