Depth-first search (DFS)
Depth-first Search (DFS) is an algorithm for searching a graph or tree data structure. The algorithm starts at the root (top) node of a tree and goes as far as it can down a given branch (path), and then backtracks until it finds an unexplored path, and then explores it. The algorithm does this until the entire graph has been explored. Many problems in computer science can be thought of in terms of graphs. For example, analyzing networks, mapping routes, scheduling, and finding spanning trees are graph problems. To analyze these problems, graph search algorithms like depth-first search are useful.
Depth-first searches are often used as subroutines in other more complex algorithms. For example, the matching algorithm, Hopcroft–Karp uses a DFS as part of its algorithm to help find a matching in a graph. DFS is also used in tree traversal algorithms, also known as tree searches, which have applications in the the travelling salesman problem and the Ford Fulkerson’s algorithm.
How do you solve a maze?
Depth-first search is a common way that many people naturally approach solving problems like mazes. First, we select a path in the maze (for the sake of the example, let's choose a path according to some a rule we lay out ahead of time) and we follow it until we hit a dead end or reach the finishing point of the maze. If a given path doesn’t work, we backtrack and take an alternative path from a past junction, and try that path. Below is an animation of a DFS approach to solving this maze.
Contents
Depth-first search
The main strategy of depth-first search is to explore deeper into the graph whenever possible. Depth-first search explores edges that come out of the most recently discovered vertex, \(s\). Only edges going to unexplored vertices are explored. When all of \(s\)’s edges have been explored, the search backtracks until it reaches an unexplored neighbor. This process continues until all of the vertices that are reachable from the original source vertex are discovered. If there are any unvisited vertices, depth-ﬁrst search selects one of them as a new source and repeats the search from that vertex. The algorithm repeats this entire process until it has discovered every vertex. This algorithm is careful not to repeat vertices, so each vertex is explored once. DFS uses a stack data structure to keep track of vertices.
Here are the basic steps for performing a depth-first search:
- Visit a vertex \(s\).
- Mark \(s\) as visited.
- Recursively visit each unvisited vertex attached to \(s\).
This animation illustrates the depth-first search algorithm:
Note: this animation does not show the marking of a node as "visited" which would more clearly illustrate the backtracking step.
Fill out the following graph by labeling each node 1 through 12 according to the order in which the depth-first search would visit the nodes.
Solution:
Implementing Depth First Search
Below are examples of pseudocode and Python code implementing DFS both recursively and non-recursively. This algorithm generally uses a stack in order to keep track of visited nodes, as the last node seen is the next one to be visited and the rest are stored to be visited later.
Pseudocode^{[1]}
1 2 3 4 5 6 7 8 9 10Initialize an empty stack for storage of nodes, S. For each vertex u, define u.visited to be false. Push the root (first node to be visited) onto S. While S is not empty: Pop the first element in S, u. If u.visited = false, then: U.visited = true for each unvisited neighbor w of u: Push w into S. End process when all nodes have been visited.
Python Implementation without Recrusion
1 2 3 4 5 6 7 8def depth_first_search(graph): visited, stack = set(), [root] while stack: vertex = stack.pop() if vertex not in visited: visited.add(vertex) stack.extend(graph[vertex] - visited) return visited
DFS can also be implemented using recursion, which greatly reduces the number of lines of code.
Python Implementation Using Recursion
1 2 3 4 5 6 7def depth_first_search_recursive(graph, start, visited=None): if visited is None: visited = set() visited.add(start) for next in graph[start] - visited: depth_first_search_recursive(graph, next, visited) return visited
It is common to modify the algorithm in order to keep track of the edges instead of the vertices, as each edge describes the nodes at each end. This is useful when one is attempting to reconstruct the traversed tree after processing each node. In case of a forest, or a group of trees, this algorithm can be expanded to include an outer loop that iterates over all trees in order to process every single node.
Which answer shows one possible order that the vertices will be discovered by a depth-first search on the graph below?
There are three different strategies for implementing DFS: pre-order, in-order, and post-order.
Pre-order DFS works by visiting the current node, and successively moving to the left until a leaf is reached, visiting each node on the way there. Once there are no more children on the left of a node, the children on the right are visited. This is the most standard DFS algorithm.
Instead of visiting each node as it traverses down a tree, n in-order algorithm finds the left-most node in the tree, visits that node, and subsequently visit the parent of that node. It then goes to the child on the right and finds the next left-most node in the tree to visit.
A post-order strategy works by visiting the leftmost leaf in the tree, then going up to the parent and down the second left-most leaf in the same branch, and so on until the parent is the last node to be visited within a branch. This type of algorithm prioritizes the processing of leafs before roots in case a goal lies at the end of a tree.
Complexity of Depth First Search
Depth-first search visits every vertex once and checks every edge in the graph once. Therefore, DFS complexity is \(O(V + E)\). This assumes that the graph is represented as an adjacency list.
DFS vs BFS
Breadth-first search is less space efficient than depth-first search because BFS keeps a priority queue of the entire frontier while DFS maintains a few pointers at each level.
If it is known that an answer will likely be found far into a tree, DFS is a better option than BFS. BFS is good to use when the depth of the tree can vary or if a single answer is needed — for example, the shortest path in a tree. If the entire tree should be traversed, DFS is a better option.
BFS always returns an optimal answer, but this is not guaranteed for DFS.
Here is an example that compares the order that the graph is searched in when using a BFS and then a DFS (by each of the three approaches). ^{[2]}
Breadth First Search : 1 2 3 4 5
Depth First Search
Pre-order: 1 2 4 5 3
In-order : 4 2 5 1 3
Post-order : 4 5 2 3 1
Applications
Depth-first search is used in topological sorting, scheduling problems, cycle detection in graphs, and solving puzzles with only one solution, such as a maze or a sudoku puzzle.
Other applications involve analyzing networks, for example, testing if a graph is bipartite. Depth-first search is often used as a subroutine in network flow algorithms such as the Ford-Fulkerson algorithm.
DFS is also used as a subroutine in matching algorithms in graph theory such as the Hopcroft–Karp algorithm.
Depth-first searches are used in mapping routes, scheduling, and finding spanning trees.
References
- Heap, D. Depth-first search (DFS). Retrieved December 16th, 2002, from http://www.cs.toronto.edu/~heap/270F02/node36.html
- Gupta, D. BFS vs DFS for Binary Tree. Retrieved July 20, 2016, from http://www.geeksforgeeks.org/bfs-vs-dfs-binary-tree/