AVL Tree
An AVL tree is a variant of the binary search tree. Like a binary search tree, it is made up of a "root" and "leaf" nodes. Every node has at most two children, where the left child is less than the parent and the right child is greater.
But binary search trees can either be unbalanced or balanced. A tree is balanced if the depths of its left subtree and right subtree differ by at most 1. An AVL tree balances itself after every operation. An AVL has much faster operations because it's always balanced. AVL trees were the first self-balancing tree structures. They have been generally overtaken by red-black trees in popularity since.
Contents
Overview
An AVL tree operates very similarly to the binary search trees. When an element is inserted, we can quickly find its desired location by using binary search. Starting with the root, a node is compared to the new element. If the new element is smaller, the same process happens for the node's left child. If it is larger, the right child is used. In the following GIF, the element being inserted is a 1. Starting with the root node, a 5, the 1 is compared with a specific node. In this case, it is smaller, so it goes left to the 3. It is also smaller than the 3, so it finds its desired location as the left child of the 3.
In this GIF, the depth of each node is written next to the node. Remember that an AVL tree must remain balanced. This means that the depth of a node's left subtree and its right subtree can differ by at most 1.
Now let's look at a situation where the AVL tree needs to balance itself after an insert. The number 15 will be inserted. This will cause the left subtree of 8 (a tree with depth 0) and the right subtree of 8 (a tree with depth 2) to have a difference in height greater than 1. So, it will need to change its structure to reflect that change.
What happened there? There are two kinds of rotations in AVL trees, a right rotation and a left rotation. When to use these rotations can be tricky, but the types themselves are graphical in nature. Watch the tree in the above GIF restructure itself. The rotation is happening on node 8. A left rotation is performed, and we call it that because the nodes are moving in a counterclockwise manner.
Type of Rotations
As discussed above, there are two types of rotations, left and right. However, the trick is knowing which rotation to employ and knowing which node you rotate about. There are generally 4 cases:
Right-Right Case
This is case we saw in the second GIF. In this case we have an unbalanced tree that looks something like the following picture:
In this case, all we need is a simple left rotation on the root node of the unbalanced subtree, in this case 1. A left rotation has the following protocol:
- Node 2 becomes the new root.
- Node 1 takes ownership of Node 2's left-child as its right-child (in this case null).
- Node 2 takes ownership of Node 1 as its left child.
Now, the tree is balanced around Node 2.
Left-Left Case
The left-left case is a mirror image of the right-right case. In this scenario, a right rotation is needed on the root node of the unbalanced tree. You can imagine a left-left case with Node 3 as the root, Node 2 as Node 3's left child, and Node 1 as Node 2's left child. The protocol is as follows:
- Node 2 becomes the new root.
- Node 3 takes ownership of Node 1's right-child as its left-child (in this case null).
- Node 2 takes ownership of Node 3 as its right child.
Right-Left Case
The following two cases are slightly trickier. A right-left case looks like the following image.
As you can see, from the root, there is a right-child and a left-child (hence the name). This tree is unbalanced. However, a simple left-rotation won't do us any good because it will simply flip the tree around and still be unbalanced. Instead we have to do two things to make this balanced.
- Perform a right rotation on the right-child of the root
- Perform a left rotation on the root
First a right rotation is performed on Node 3. Now, we're back to our Right-Right case. Then, as before, a left rotation is performed on the root, Node 1. Now our tree is balanced.
Left-Right Case
This case is a mirror image of the previous case. The protocol is as follows:
- Perform a left rotation on the left-child of the root
- Perform a right rotation on the root
In this case, Node 3 would be the root, Node 1 would be its left-child, and Node 2 would be Node 1's right child. A left rotation on Node 1 would leave us in the Left-Left case, and a right rotation on Node 3 would balance the AVL tree.
Complexity
AVL trees perform very well due to their balanced nature. In search, AVL trees perform the exact same way as binary search trees. Finding an item takes \(O(\log_2(n))\) time.
You might think that traversing an AVL tree can be time consuming. After all, if you want to get to the next biggest node, you might have to backtrack all the way up the tree to the root and go down the other subtree. However, using amortized analysis, the AVL can be shown to have linear \(O(n)\) time. Why? To traverse an entire tree, you have to enter and exit each edge one time each. Since there are \(n - 1\) edges in a balanced tree, that means there will be \((2 \cdot (n - 1))/n\) amortized traversals in total, or 2.
For insertion and deletion, the act of finding the correct space for the key as well as the re-balancing process must be taken into account. Since it is a binary tree, it will take \(O(\log_2(n))\) to either find the correct spot for the new node or find the node we wish to delete. From there, we must retrace our steps all the way through the root to make sure that the invariant of the AVL tree has not been violated--that at no point, a node's left and right subtrees have heights that differ by more than one. However, we have a hard limit on how many times we have to retrace; it's also \(O(\log_2(n))\)--we're just pulling out of the tree the same way we came in. The rotation itself is a constant time operation, so it does factor into asymptotic complexity.
See also: big O notation.
Average | |
Space | \(O(n)\) |
Search | \(O(\log_2(n))\) |
Traversal | *\(O(n)\) |
Insert | \(O(\log_2(n))\) |
Delete | \(O(\log_2(n))\) |
*amortized analysis