Depth-First Search (DFS)

Depth-first search (DFS) is an algorithm for searching a graph or tree data structure. The algorithm starts at the root (top) node of a tree and goes as far as it can down a given branch (path), then backtracks until it finds an unexplored path, and then explores it. The algorithm does this until the entire graph has been explored. Many problems in computer science can be thought of in terms of graphs. For example, analyzing networks, mapping routes, scheduling, and finding spanning trees are graph problems. To analyze these problems, graph-search algorithms like depth-first search are useful.

Depth-first searches are often used as subroutines in other more complex algorithms. For example, the matching algorithm, Hopcroft–Karp, uses a DFS as part of its algorithm to help to find a matching in a graph. DFS is also used in tree-traversal algorithms, also known as tree searches, which have applications in the traveling-salesman problem and the Ford-Fulkerson algorithm.

How do you solve a maze?

Depth-first search is a common way that many people naturally approach solving problems like mazes. First, we select a path in the maze (for the sake of the example, let's choose a path according to some rule we lay out ahead of time) and we follow it until we hit a dead end or reach the finishing point of the maze. If a given path doesn’t work, we backtrack and take an alternative path from a past junction, and try that path. Below is an animation of a DFS approach to solving this maze.

DFS is a great way to solve mazes and other puzzles that have a single solution.

Depth-first Search

The main strategy of depth-first search is to explore deeper into the graph whenever possible. Depth-first search explores edges that come out of the most recently discovered vertex, \(s\). Only edges going to unexplored vertices are explored. When all of \(s\)’s edges have been explored, the search backtracks until it reaches an unexplored neighbor. This process continues until all of the vertices that are reachable from the original source vertex are discovered. If there are any unvisited vertices, depth-ﬁrst search selects one of them as a new source and repeats the search from that vertex. The algorithm repeats this entire process until it has discovered every vertex. This algorithm is careful not to repeat vertices, so each vertex is explored once. DFS uses a stack data structure to keep track of vertices.

Here are the basic steps for performing a depth-first search:

Visit a vertex \(s\).

Mark \(s\) as visited.

Recursively visit each unvisited vertex attached to \(s\).

This animation illustrates the depth-first search algorithm:

Note: This animation does not show the marking of a node as "visited," which would more clearly illustrate the backtracking step.

Fill out the following graph by labeling each node 1 through 12 according to the order in which the depth-first search would visit the nodes:

source:wikipedia

Solution: source:wikipedia

Implementing Depth-first Search

Below are examples of pseudocode and Python code implementing DFS both recursively and non-recursively. This algorithm generally uses a stack in order to keep track of visited nodes, as the last node seen is the next one to be visited and the rest are stored to be visited later.

Pseudocode^[1]

Initialize an empty stack for storage of nodes, S.
For each vertex u, define u.visited to be false.
Push the root (first node to be visited) onto S.
While S is not empty:
    Pop the first element in S, u.
    If u.visited = false, then:
        U.visited = true
        for each unvisited neighbor w of u:
            Push w into S.
End process when all nodes have been visited.

Python Implementation without Recursion

 def depth_first_search(graph):
    visited, stack = set(), [root]
    while stack:
        vertex = stack.pop()
        if vertex not in visited:
            visited.add(vertex)
            stack.extend(graph[vertex] - visited)
    return visited

DFS can also be implemented using recursion, which greatly reduces the number of lines of code.

Python Implementation Using Recursion

def depth_first_search_recursive(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    for next in graph[start] - visited:
        depth_first_search_recursive(graph, next, visited)
    return visited

It is common to modify the algorithm in order to keep track of the edges instead of the vertices, as each edge describes the nodes at each end. This is useful when one is attempting to reconstruct the traversed tree after processing each node. In case of a forest or a group of trees, this algorithm can be expanded to include an outer loop that iterates over all trees in order to process every single node.

There are three different strategies for implementing DFS: pre-order, in-order, and post-order.

Pre-order DFS works by visiting the current node and successively moving to the left until a leaf is reached, visiting each node on the way there. Once there are no more children on the left of a node, the children on the right are visited. This is the most standard DFS algorithm.

Instead of visiting each node as it traverses down a tree, an in-order algorithm finds the leftmost node in the tree, visits that node, and subsequently visits the parent of that node. It then goes to the child on the right and finds the next leftmost node in the tree to visit.

A post-order strategy works by visiting the leftmost leaf in the tree, then going up to the parent and down the second leftmost leaf in the same branch, and so on until the parent is the last node to be visited within a branch. This type of algorithm prioritizes the processing of leaves before roots in case a goal lies at the end of a tree.

Complexity of Depth-first Search

Depth-first search visits every vertex once and checks every edge in the graph once. Therefore, DFS complexity is \(O(V + E)\). This assumes that the graph is represented as an adjacency list.

DFS vs BFS

Breadth-first search is less space-efficient than depth-first search because BFS keeps a priority queue of the entire frontier while DFS maintains a few pointers at each level.

If it is known that an answer will likely be found far into a tree, DFS is a better option than BFS. BFS is good to use when the depth of the tree can vary or if a single answer is needed—for example, the shortest path in a tree. If the entire tree should be traversed, DFS is a better option.

BFS always returns an optimal answer, but this is not guaranteed for DFS.

Here is an example that compares the order that the graph is searched in when using a BFS and then a DFS (by each of the three approaches).^[2]

Breadth First Search : 1 2 3 4 5

Depth First Search

Pre-order: 1 2 4 5 3

In-order : 4 2 5 1 3

Post-order : 4 5 2 3 1

Applications

Depth-first search is used in topological sorting, scheduling problems, cycle detection in graphs, and solving puzzles with only one solution, such as a maze or a sudoku puzzle.

Other applications involve analyzing networks, for example, testing if a graph is bipartite. Depth-first search is often used as a subroutine in network flow algorithms such as the Ford-Fulkerson algorithm.

DFS is also used as a subroutine in matching algorithms in graph theory such as the Hopcroft–Karp algorithm.

Depth-first searches are used in mapping routes, scheduling, and finding spanning trees.

References

Heap, D. Depth-first search (DFS). Retrieved December 16th, 2002, from http://www.cs.toronto.edu/~heap/270F02/node36.html
Gupta, D. BFS vs DFS for Binary Tree. Retrieved July 20, 2016, from http://www.geeksforgeeks.org/bfs-vs-dfs-binary-tree/

Retiring Dec 20: Data Structures

Contents