Tries

Tries (also known as radix trees or prefix trees) are tree-based data structures that are typically used to store associative arrays where the keys are usually strings. Since they also implement associative arrays, tries are often compared to hash tables. There are important pros and cons to consider when deciding whether to use a trie or a hash table, and it often comes down to how the associative array will be used.

Tries are interesting for many reasons. For one, nodes in the tree do not store keys. Instead, they each store parts of keys. Traversing down from the root node to a leaf allows you to build the key as you progress. Also, there doesn't need to be a value at every node. In fact, values are typically only associated with leaf nodes. Building keys as you go is useful for specific applications, notably auto-complete.

For example, below is a graphic that is a representation of a trie that contains the following associative array. Remember that an associative array is an abstract data type, so a trie is one way to implement that ADT.

map = { 'ape': 32, 'ball': 2, 'atom': 16, 'ate': 18, 'bait': 5 }

A trie A trie

As you can see, the key is built as you descend down from the root to a leaf. The word "ape" is built by traversing the "a" edge, the "p" edge, and the "e" edge. When we reach a leaf, we can extract the value, which is 32.

Tries have many advantages over their counterpart, the hash table. They are used in many string search applications such as auto-complete, text search, and prefix matching. Radix trees, a kind of trie, are often used in IP routing.

Overview

A trie is, like other tree-based data structures, made up of a set of nodes connected by pointers. These pointers indicate a parent-child relationship between the nodes. Parents are above their children in the tree. Typically the root node of the tree is an "empty" node so that it can point to all members of the alphabet the trie is using to store.

Unlike other tree-based data structure like the AVL tree, the trie is not necessarily balanced. It is totally dependent on the contents of the tree.

Nodes in the trie can hold either single members of the alphabet or the entire word so far. For example, the tree could look like the graphic above, or it could look like this:

A different kind of trie^[1]

Design Considerations

Tries are often used as replacements for hash tables. To that end, there are no collisions, so the worst-case performance of a trie is better than a poorly implemented hash table. Plus, there is no need for hash functions. Also, tries have the ability to order their information, alphabetically for instance. This can be useful in some applications.

Deciding how to design your trie has a lot to do with the application of your trie. For instance, if you wanted to use a trie to implement English autocomplete, it needs to store every word in the English language (plus some slang terms). It wouldn't make sense for you to have pointers going from the root a string of 7 consecutive 'z' nodes. There's no word spelled with 7 consecutive z's. So, that would be a huge waste of space (and computation time). This is especially wasteful if your entire tree is the height of the longest word in the English language. The longest non-technical term is Antidisestablishmentarianism, with 27 characters. It's much more efficient for the data structure to be unbalanced in this case! A balanced tree is a tree where all paths from the root to all leaves differ in length by at most 1, like in an AVL tree. An unbalanced tree has no such restriction. The unbalanced tree would clip off unnecessary strings like the 7 consecutive z's.

The language one uses for their trie is very important. For example, while storing strings (human language) is a common application of tries, one could also store sequences of bits. In the case of the English language, each node can have up to 26 children nodes. However, using bits limits the number of possible children to 2. Furthermore, a programmer can limit the alphabet of their language by compressing it. Any word that can be represented using $n$ bytes can also be represented using $2n$ 4-bit units. When this compression is used, lookups can visit up to twice as many nodes in the worst case, but the storage requirements are reduced by a factor of 8.

Complexity

The time complexity of making a trie depends heavily on the representation of the language being stored in the trie. As stated earlier, small changes to a language's alphabetic representation can have a large impact on both storage and operation time complexity.

Here are the worst-case times, where $m$ is the length of the longest word, and $n$ is the number of words in the trie. $a$ is the length of the word you are searching for. Note that to change these times to the average or best case, you would simply swap out $m$ for the average length of word or the length of the shortest word, respectively.

Each operation is predicated on the complexity of the lookup operation. Anything from creating the trie to searching it later to deleting an element will require you to perform $m$ lookups on each of the $n$ words.

See also: big O notation.

Trie Operation	Worst
Create	$O(m \cdot n)$
Lookup	$O(a \cdot n)$
Insert	$O(a \cdot n)$
Delete	$O(a \cdot n)$

These operational complexities are mirrored in the hash table with the following values. Looking up entire words is easier with a hash table. However, tries allow you to look up words by their prefixes, something that hash tables cannot do because their keys are not split.

Hash Table Operation	Average	Worst
Lookup	$O(1)$	$O(n)$
Insert	$O(1)$	$O(n)$
Delete	$O(1)$	$O(n)$

Sample Python Implementation

The following python code is an example of an object-oriented approach to the trie. It has basic functionality and uses the English language as its alphabet.

class Node:
    def __init__(self):
        self.key = None
        self.value = None
        self.children = {}

class Trie:
    def __init__(self):
        self.root = Node()

    def insert(self, word, value):
        currentWord = word
        currentNode = self.root
        while len(currentWord) > 0:
            if currentWord[0] in currentNode.children:
                currentNode = currentNode.children[currentWord[0]]
                currentWord = currentWord[1:]
            else:
                newNode = Node()
                newNode.key = currentWord[0]
                if len(currentWord) == 1:
                    newNode.value = value
                currentNode.children[currentWord[0]] = newNode
                currentNode = newNode
                currentWord = currentWord[1:]

    def lookup(self, word):
        currentWord = word
        currentNode = self.root
        while len(currentWord) > 0:
            if currentWord[0] in currentNode.children:
                currentNode = currentNode.children[currentWord[0]]
                currentWord = currentWord[1:]
            else:
                return "Not in trie"
        if currentNode.value == None:
            return "None"
        return currentNode.value

    def printAllNodes(self):
        nodes = [self.root]
        while len(nodes) > 0:
            for letter in nodes[0].children:
                nodes.append(nodes[0].children[letter])
            print nodes.pop(0).key

def makeTrie(words):
    trie = Trie()
    for word, value in words.items():
        trie.insert(word, value)
    return trie

Here it is in action.

>>> trie = makeTrie({'hello': 5, 'hat': 7, 'her': 1})
>>> trie.lookup('nope')
'Not in trie'
>>> trie.lookup('hello')
5
>>> trie.printAllNodes()
None
h
a
e
t
r
l
l
o

The root node prints out a key of None, as expected. Notice how there's only 1 node with a key of 'h'? That's because both 'hello' and 'hat' use that node. That node acts as the root for all words that begin with the letter 'h'.

Applications

There are many cool applications of tries. Ever used autocomplete on your cellphone? Well, that's maybe the most common application of the trie. Once you've typed in a letter, the tree of potential words is greatly reduced, allowing the program to easily enumerate what kinds of strings are possible.

If all you wanted to do was to store words, a state machine might be easier. However, the trie can store additional information for each word. For example, the trie can store how popular a word might be. That's why when you type "ye", "yes" is suggested before "yelling".

Tries are also useful for approximate matching algorithms, like those used in spell check. A slightly misspelled word will have a similar path from the root of the tree when compared with the correctly spelled word. This is especially true if certain modifications are made to the trie, like branch merging.

String matching is another use case where tries and approximation algorithms are useful. Longest suffix or prefix matching uses these methods.

Tries can also be used for sorting. Using a trie to do this is similar to radix sort. A pre-order traversal of the tree will result in an output that is in increasing order, allowing sorting.

References

Bazookaj, B. Trie. Retrieved June 22, 2016, from https://en.wikipedia.org/wiki/Trie

Contents