Trees
Introduction
A tree is an abstract data structure that stores elements based on hierarchy. With the exception of the top element (also called the root element), each element in a tree has a parent element, while some or all elements may contain children.
A tree is usually defined by a set of nodes which contain parent and children variables. For example, for \( \bf{\color{pink} {\text{pink}}}\), the root has no parents but has children \( \bf{\color{yellow} {\text{yellow}}}\), \( \bf{\color{blue} {\text{blue}}} \) and \( \bf{\color{red} {\text{red}}} \). These variables are actually pointers to other nodes. Even though trees are technically graphs, we can differentiate between the two by noting that in a tree abstract data type, any two vertices are connected by one and only one path. Two nodes that are children of the same parent are siblings (i.e. \( \bf{\color{purple} {\text{purple}}}\) and \( \bf{\color{grey} {\text{grey}}}\)). A node is external if it has no children (i.e. \( \bf{\color{green} {\text{green}}} \)). Conversely, a node is internal if it has one or more children. External nodes are also known as leaves. The depth or level of a given node is the length of the simple path from the root to that node (i.e. the level of the \( \bf{\color{orange} {\text{orange}}}\) is 2 because that is the number of edges from it to the root) .
There are multiple ways that the height of a rooted tree is defined. Many sources, including Brilliant's course on binary trees define the height to be the maximum level number that occurs. This means graph has a height of 3. But be careful: other people define the height of the tree to be the number of nodes along the longest simple path; the height ends up being one greater under this definition.
How many external nodes (leaves) does the tree below contain?
Remember the leaves are the nodes without children. Since \(4, 5, 6,\) and \(7\) are leaves in the tree above, there are four leaves. \(_\square\)
What is the height of the tree in the previous example?
Inspecting the graph, we see that \(2\) is the longest path from any individual node in our graph. Thus \(2\) is the height of the tree. \(_\square\)
Java Implementation of a Tree
Let us build an informal Java tree ADT. We begin by representing a position in a tree as a Node
. This object will contain a parent
, left
and right
variables. The tree will then be the root node along with its respective pointers.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
|
Write a method/s that prints the value of the tree
1 2 3 4 5 6 7 8 9 10 11 12 13private void print(Node p){ if (p != null){ print(p.left); System.out.println(p.value); print(p.right); } } public void print() { // prints the values in the tree // add something here print(root); }
Here we built a basic recursive solution, that actually prints the tree in order if its a balanced binary search tree. The main idea is to consider our base case when
p.value == null
at which point nothing will be done.
Write a method that computes the height of a given tree.
1 2 3 4 5 6 7 8 9private int height(Node p) { // returns the height of the tree if (p == null) return -1; return Math.max(height(p.left), height(p.right) ) + 1; } public int height(){ return height(root); }
Huffman Codes
See full article: Huffman Code.
The most common way to represent characters internally in a computer is to use fixed-length bit strings. Fore example ASCII, represents each character by a string of seven bits. Examples are shown below:
Huffman codes, which represent characters by variable-length bit strings, provide alternatives to ASCII and other fixed length codes. The idea is to use short bit strings to represent more frequently used characters, such as text and programs, in less space than if ASCII were used. Because of limited memory, some hand-held computers have used Huffman codes.
A Huffman code is easily defined by a tree. To decode a bit string we begin at the root and move down the tree until a character
is encountered. The bit, 0 or 1, tells us whether to move right or left. As an example, let us decode the string \[01010111.\] We begin at the root. Since the first bit is \(0\), the first move is right. Next we move left and then right. At this point we encounter the first character R. To decode the next character, we begin again at the root. The next bit is 1, so we move left and encounter the next character A. The last bits 0111 decode as T. Therefore the bit string represents the word \(RAT\).
Huffman gave an algorithm to construct a Huffman code from a table giving the frequency of occurrence of the characters to be represented so that the code constructed represents strings of characters in minimal space.