Heaps
Heaps are tree-based data structures constrained by a heap property. Heaps are used in many famous algorithms such as Dijkstra’s algorithm for finding the shortest path, the heap sort sorting algorithm, implementing priority queues, and more. Essentially, heaps are the data structure you want to use when you want to be able to access the maximum or minimum element very quickly.
There are many variations of heaps, each offering advantages and tradeoffs.
Contents
General Structure
A heap is a data structure that is usually implemented with an array that can be thought of as a tree (they can be implemented using pointers, however). For example, in the image below, notice how the elements in the array map to the tree to create a max-heap (a heap where the parent node has a larger value than its children).
In terms of the tree, the root of the heap is the top most element. In the image below, the root is \(16\). The height of a given node in the tree is defined by the longest path from it to a leaf, where a leaf is a node at the bottom of the tree.
There are two general versions of the heap property: a min-heap property and a max-heap property.
Max-heap property
If \(A\) is an array representation of a heap, then in Max-heap:
\[A[\text{parent}(i)]\geq A[i],\]
which means that a node can't have a greater value than its parent. In a max-heap, the largest element is stored at the root, and the minimum elements are in the leaves.
Min-heap property
Similarly, if \(A\) is an array representation of a heap then, in Min-heap:
\[A[\text{parent}(i)]\leq A[i],\]
which means that a parent node can't have a greater value than its children. Thus, the minimum element is located at the root, and the maximum elements are located in the leaves.
Depending on the type of heap used, the heap property may have additional requirements.
In order to maintain the max-heap property (or min-heap property), heapsort uses a procedure called max_heapify(A,i)
. It takes an array \(A\) and an index in the array \(i\) as input. This can easily be adapted to a min-heapify
function.
Here is a Python implementation of max_heapify
:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
Essentially, if an element \(A[i]\) violates the max-heap property, max_heapify
will correct it by trickling the element down the tree, until the subtree rooted at index \(i\) is a max heap (and therefore the violation is corrected).
Minimum Functionalities of Heaps
There are several operations that all heaps should implement. Depending on the specific type of heap used, the way that these operations are implemented may vary. No matter what operation is done in any given heap implementation, the heap properties associated with the implementation must be satisfied when the operation is complete. For example, if performing an operation on a min-heap implemented using a binary heap violates the min-heap property, the violation must be fixed before the operation is complete. Here are a few basic heap operations.
Build Heap: It is important to be able to construct a heap.
Build-Heap
is usually implemented using theInsert
andHeapify
function repeatedly. So starting from an empty heap, nodes are added withInsert
and thenHeapify
is called to make sure the heap maintains the heap properties at each step.Heapify: Used to maintain the heap properties (described in above sections).
Insert: It is important to be able to add elements to the heap.
Remove: It is important to be able to delete elements from the heap.
Find Minimum/Maximum and Extract Minimum/Maximum: Depending on the purpose of the heap, the largest or smallest elements are often of interest so these operations are useful to have.
Decrease/Increase Key: This is used to change the key of a particular node. The key determines the place in the heap where the node will be. So adjusting the key allows the algorithm to rearrange parts of the heap.
Merge: Sometimes called meld, the merge function is a useful operation to have to combine heaps.
Types of Heaps
There are several different types of heaps, each with a different implementation and various advantages and disadvantages. However, each heap type satisfies the heap property and can be used for the same types of tasks.
Type of Heap | Insert | Delete | Decrease Key | Merge | Extract Min |
Binary Heap | \(O(\log n)\) | \(O(\log n)\) | \(O(\log n)\) | \(O(n)\) | \(\Theta(\log n)\) |
Fibonacci Heap* | \(O(1)\) | \(O(\log n)\) | \(O(1)\) | \(O(1)\) | \(O(\log n)\) |
Binomial Heap | \(O(1)\) | \(O(\log n)\) | \(O(\log n)\) | \(O(\log n)\) | \(O(\log n)\) |
Pairing Heap* | \(O(1)\) | \(O(\log n)\) | \(O(\log n)\) | \(O(\log n)\) | \(O(\log n)\) |
*Amortized running times
Python Implementation of a Heap
This is partial Python code from Python's documentation site describing the internal configuration of heaps in the Python language. This is just a sketch of how heap operations work. There are parts of the code that are missing here, for the sake of brevity, and this code is meant as a learning tool to get an idea of how Python implements a few basic heap operations. [3]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 |
|
The full implementation can be found on the Python documentation website here.
Applications
Queues
Heaps can be used to implement priority queues where the first object in is the first object to come out of the queue.
Consider a supermarket line. How can you model a line of customers waiting to pay for their items as a max-heap?
One way to model this problem is to build a heap where each person in the line is represented by a node. The node’s value will correspond to the amount of time the person has stood in the line — people who have been in the line longer, and are therefore towards the head of the line, will have a larger key value. The root node of the heap will be the person who has been in the line longest. After the person has paid for their items, their node, the maximum value, can be extracted from the heap, and then the heap can be heapifyied to indicate the new line after the person has left.
Heap Sort
Heaps are used in the heapsort sorting algorithm. Heapsort is a fast and space efficient sorting algorithm. It works by maintaining heap properties and taking advantage of the ordered nature of min and max heaps.
Here is an animation that shows heapsort. Notice how the heap is built up from the list and how the max-heap property is enforced.
See Also
References
- , E. Max-Heap.svg. Retrieved June 5, 2016, from https://en.wikipedia.org/wiki/File:Max-Heap.svg
- Demaine, E., & Devadas, S. Lecture 4: Heaps and Heap Sort. Retrieved May 23, 2016, from http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/MIT6_006F11_lec04.pdf
- Panter , M. heapq.py. Retrieved June 7, 2016, from https://hg.python.org/cpython/file/2.7/Lib/heapq.py
- , S. Heapsort Example. Retrieved June 7, 2016, from https://en.wikipedia.org/wiki/File:Heapsort-example.gif