Binary Search Trees
Binary search trees (also binary trees or BSTs) contain sorted data arranged in a tree-like structure. A binary tree consists of "root" and "leaf" data points, or nodes, that branch out in two directions.
Binary trees store "items" (such as numbers, names, etc.) in memory, allowing fast lookup, addition, and removal of items. They can be used to implement either dynamic sets of items or lookup tables that allow finding an item by its key. Binary trees and related variations have many practical uses such as in databases, for parsing syntax in a programming language, and in file compression.
Contents
Overview
Each tree starts with a single root node that contains data as well as references to left and right nodes. Each left and right nodes can also contain data and references to its own left and right nodes, respectively. This pattern continues recursively, forming a tree.
Operations
All of the code examples in this section are using a common class for the nodes of the binary tree:
1 2 3 4 5 6 7 8 |
|
Tree Traversal:
There are two primary ways to traverse, or iterate through, a tree: depth-first and breadth-first. In the depth-first method, the left subtree is searched first, then the right subtree is searched. In the breadth-first method, the search progresses through the nodes at each height level of the tree, or in other words, the root node first, then level 1 nodes, then level 2 nodes, and so forth.
Depth-first
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
|
Breadth-first
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
|
Insertion:
To insert a node, find the leftmost leaf node that is less than the value being inserted. Add a child node to that node containing the inserted value.
1 2 3 4 5 6 7 8 9 |
|
Deletion:
Deletion is less straightforward. There are three cases to consider:
- Case 1: If the deleted node has no children, remove the node from its parent.
- Case 2: If the deleted node has one child, replace the node with its child.
- Case 3: If the deleted node has two children, do the following:
- Start with the current node's right child.
- Find that child's leftmost child.
- Replace the current node's value with the leftmost child's value.
- Call the deletion function on the current node's right child.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
|
Tree Shape and Balancing
Inserting and deleting nodes in a binary tree affects the shape of the tree, and the shape of a tree can affect the efficiency of operations on the tree. A "balanced" tree has the smallest possible height, or node depth, and is the most efficient configuration for the tree for the set of all possible searches.
In the example below, both trees contain the same data, but only one is balanced:
Since the efficiency of operations is inversely proportional to the height of the tree, binary trees must be periodically rebalanced to maintain efficiency. Some variations of binary trees, such as red-black trees, AVL trees, and scapegoat trees, are self-balancing, in that when a node is inserted or deleted, the tree is modified to improve its shape.