Ford-Fulkerson Algorithm

The Ford-Fulkerson algorithm is an algorithm that tackles the max-flow min-cut problem. That is, given a network with vertices and edges between those vertices that have certain weights, how much "flow" can the network process at a time? Flow can mean anything, but typically it means data through a computer network.

It was discovered in 1956 by Ford and Fulkerson. This algorithm is sometimes referred to as a method because parts of its protocol are not fully specified and can vary from implementation to implementation. An algorithm typically refers to a specific protocol for solving a problem, whereas a method is a more general approach to a problem.

The Ford-Fulkerson algorithm assumes that the input will be a graph, \(G\), along with a source vertex, \(s\), and a sink vertex, \(t\). The graph is any representation of a weighted graph where vertices are connected by edges of specified weights. There must also be a source vertex and sink vertex to understand the beginning and end of the flow network.

Ford-Fulkerson has a complexity of \(O\big(|E| \cdot f^{*}\big),\) where \(f^{*}\) is the maximum flow of the network. The Ford-Fulkerson algorithm was eventually improved upon by the Edmonds-Karp algorithm, which does the same thing in \(O\big(V^2 \cdot E\big)\) time, independent of the maximum flow value.

Intuition

The intuition behind the algorithm is quite simple (even though the implementation details can obscure this). Imagine a flow network that is just a traffic network of cars. Each road can hold a certain number of cars. This can be illustrated by the following graphic.

Flow Network

The intuition goes like this: as long as there is a path from the source to the sink that can take some flow the entire way, we send it. This path is called an augmenting path. We keep doing this until there are no more augmenting paths. In the image above, we could start by sending 2 cars along the topmost path (because only 2 cars can get through the last portion). Then we might send 3 cars along the bottom path for a total of 5 cars. Finally, we can send 2 more cars along the top path for two edges, send them down to bottom path and through to the sink. The total number of cars sent is now 7, and it is the maximum flow.

Why could we only send 2 cars in the last augmenting path in the previous example?

We had already sent two cars along the top path for the first augmenting path. That means that the first two edges along that top path are already at \(2/4\) capacity. So, those edges can't fit any more than 2 cars, so they are the limiting factors in the last augmenting path.

This is why calculating augmenting paths gets tricky for larger and more complicated flow networks. To handle this, residual graphs are generated after each augmenting path is found and maximized. Residual graphs show the differences between various paths' capacity and their current flow and allow future augmenting paths to be calculated accurately. \(_\square\)

Algorithm Pseudo-code

The pseudo-code for this method is quite short; however, there are some functions that bear further discussion. The simple pseudo-code is below.

This pseudo-code is not written in any specific computer language. Instead, it is an informal, high-level description of the algorithm.

Ford-Fulkerson Algorithm \((\)Graph \(G\), source \(s\), sink \(t):\)

initialize flow to 0
path = findAugmentingPath(G, s, t)
while path exists:
    augment flow along path                 #This is purposefully ambiguous for now
    G_f = createResidualGraph()
    path = findAugmentingPath(G_f, s, t)
return flow

Basically, what this simplified version says is that as long as there is a path from the source to the sink that can handle more flow, send that flow. Here is a version of the pseudo-code that explains the flow augmentation more in depth:

flow = 0
for each edge (u, v) in G:
    flow(u, v) = 0
while there is a path, p, from s -> t in residual network G_f:
    residual_capacity(p) = min(residual_capacity(u, v) : for (u, v) in p)
    flow = flow + residual_capacity(p)
    for each edge (u, v) in p:
        if (u, v) is a forward edge:
            flow(u, v) = flow(u, v) + residual_capacity(p)
        else:
            flow(u, v) = flow(u, v) - residual_capacity(p)
return flow

This algorithm is more well-defined. The only ambiguous point is the 'foward edge' terminology on line 8. When a residual graph, \(G_f\), is created, edges can be created that go in the opposite direction when compared to the original graph. An edge is a 'forward edge' if the edge existed in the original graph, \(G\). If it is a reversal of an original edge, it is called a 'backwards edge.'

Residual Graphs

Residual graphs are an important middle step in calculating the maximum flow. As noted in the pseudo-code, they are calculated at every step so that augmenting paths can be found from the source to the sink. To understand how these are created and how they are used, we can use the graph from the intuition section.

However, one important attribute is important to understand before looking at residual graphs. Residual capacity is a term used in the above pseudo-code, and it plays an important role in residual graph creation. Residual capacity is defined as the new capacity after a given flow has been taken away. In other words, for a given edge \((u, v)\), the residual capacity, \(c_f\) is defined as

\[c_f(u, v) = c(u, v) - f(u, v).\]

However, there must also be a residual capacity for the reverse edge as well. The max-flow min-cut theorem states that flow must be preserved in a network. So, the following equality always holds:

\[f(u, v) = -f(v, u).\]

With these tools, it is possible to calculate the residual capacity of any edge, forward or backward, in the flow network. Then, those residual capacities are used to make a residual network, \(G_f\).

Taking the original intuition network and adding labels to the vertices, we now have this:

Flow network before Ford-Fulkerson

It was shown that 2 units of flow can be pushed along the top-most path initially. When this happens, only three edges are affected: (S, A), (A, B), and (B, T). (S, A) and (A, B) are affected in the same way because they have the same capacity. Two things happen:

In the forward direction, the edges now have a residual capacity equal to \(c_f(u, v) = c(u, v) - f(u, v)\). The flow is equal to 2, so the residual capacity of (S, A) and (A, B) is reduced to 2, while the edge (B, T) has a residual capacity of 0.
In the backward direction, the edges now have a residual capacity equal to \(c_f(v, u) = c(v, u) - f(v, u)\). Because of flow preservation, this can be written as \(c_f(v, u) = c(v, u) + f(u, v)\). And since the capacity of those backward edges was initially 0, all of the backward edges (T, B), (B, A), and (A, S) now have a residual capacity of 2.

When a new residual graph is constructed with these new edges, any edges with a residual capacity of 0—like (B, T)—are not included.

Residual graph after 1 round

Now, a new augmenting path must be found \((\)the top-most path can never be used again because the edge (B, T) was erased\().\) The bottom path can be chosen and flow of 3 can be sent along it. The resulting graph is as follows:

Residual graph after 2 rounds

Finally, a flow of 2 can be sent along the path [(S, A), (A, B), (B, C), (C, D), (D, T)] because the minimum residual capacity along that path is 2. The final residual graph after this is done is as follows:

Residual graph after 3 rounds

There are not more paths from the source to the sink, so there can be no more augmenting paths. Therefore, the loop is complete. The flow, 7, is a maximum flow.

Sample Python Implementation

The Ford-Fulkerson algorithm is straightforward to implement once the concept of the residual graphs is understood. As noted in the introduction to this wiki, this algorithm is often called a method because the method of finding augmenting paths in a residual graph is not fully specified. This implementation is one way of doing this.

The following implementation is not battle-tested, and is only meant to be a learning tool.

class Vertex:
    def __init__(self, name, source=False, sink=False):
        self.name = name
        self.source = source
        self.sink = sink

class Edge:
    def __init__(self, start, end, capacity):
        self.start = start
        self.end = end
        self.capacity = capacity
        self.flow = 0
        self.returnEdge = None

class FlowNetwork:
    def __init__(self):
        self.vertices = []
        self.network = {}

    def getSource(self):
        for vertex in self.vertices:
            if vertex.source == True:
                return vertex
        return None

    def getSink(self):
        for vertex in self.vertices:
            if vertex.sink == True:
                return vertex
        return None

    def getVertex(self, name):
        for vertex in self.vertices:
            if name == vertex.name:
                return vertex

    def vertexInNetwork(self, name):
        for vertex in self.vertices:
            if vertex.name == name:
                return True
        return False

    def getEdges(self):
        allEdges = []
        for vertex in self.network:
            for edge in self.network[vertex]:
                allEdges.append(edge)
        return allEdges

    def addVertex(self, name, source=False, sink=False):
        if source == True and sink == True:
            return "Vertex cannot be source and sink"
        if self.vertexInNetwork(name):
            return "Duplicate vertex"
        if source == True:
            if self.getSource() != None:
                return "Source already Exists"
        if sink == True:
            if self.getSink() != None:
                return "Sink already Exists"
        newVertex = Vertex(name, source, sink)
        self.vertices.append(newVertex)
        self.network[newVertex.name] = []

    def addEdge(self, start, end, capacity):
        if start == end:
            return "Cannot have same start and end"
        if self.vertexInNetwork(start) == False:
            return "Start vertex has not been added yet"
        if self.vertexInNetwork(end) == False:
            return "End vertex has not been added yet"
        newEdge = Edge(start, end, capacity)
        returnEdge = Edge(end, start, 0)
        newEdge.returnEdge = returnEdge
        returnEdge.returnEdge = newEdge
        vertex = self.getVertex(start)
        self.network[vertex.name].append(newEdge)
        returnVertex = self.getVertex(end)
        self.network[returnVertex.name].append(returnEdge)

    def getPath(self, start, end, path):
        if start == end:
            return path
        for edge in self.network[start]:
            residualCapacity = edge.capacity - edge.flow
            if residualCapacity > 0 and not (edge, residualCapacity) in path:
                result = self.getPath(edge.end, end, path + [(edge, residualCapacity)])
                if result != None:
                    return result

    def calculateMaxFlow(self):
        source = self.getSource()
        sink = self.getSink()
        if source == None or sink == None:
            return "Network does not have source and sink"
        path = self.getPath(source.name, sink.name, [])
        while path != None:
            flow = min(edge[1] for edge in path)
            for edge, res in path:
                edge.flow += flow
                edge.returnEdge.flow -= flow
            path = self.getPath(source.name, sink.name, [])
        return sum(edge.flow for edge in self.network[source.name])

The class defined on line 1 is a simple Vertex class. It only has a name, which we use for edge creation (and debugging), whether or not the vertex is a source or a sink. On line 7, the Edge class is defined. This class has a starting vertex (just the name of the vertex, not the class instance), an ending vertex, a capacity, a flow, and a pointer to the return edge which is used in the residual graph. The capacity of an edge is the total weight it can carry whereas an edge's flow is the weight it is currently carrying.

The FlowNetwork class defined on line 15 has two attributes. The vertices attribute is a list of all of the instances of vertices in the graph. This list holds entire objects, linking a vertex's name to its source and sink attributes. The network dictionary is a data structure that maps every vertex's name to all of the edges coming out of the corresponding vertex. By simply having the vertex's name act as the dictionary key, we can easily switch back and forth between 'vertices' and 'network' to suit the needs of the algorithm.

Lines 20-48 contain various helper functions that make later functions easier and help reduce code duplication. For example, the function on line 32, getVertex() takes in a vertex name and goes to find the entire Vertex object in FlowNetwork's vertices attribute.

addVertex() on line 50 adds a vertex to the graph after it checks various error cases to make sure the vertex can be added. It adds the entire vertex to the vertices list on line 62 and adds the vertex name to an empty list of edges on line 63. addEdge() on line 65 first checks that the starting and ending vertices are both in the graph and that they are not the same vertex. Then, it creates the new edge and a corresponding returning edge with capacity 0. It then adds the new edge to the network map on line 77 and adds the return edge to the same map on line 79.

The 'getPath()' function on line 81 does the important work in this class. It is a recursive function that walks through the flow network starting at a given vertex and calculates residual capacity for each edge. This residual capacity is used to decide how much flow to send along a given path in the next function. The path is grown until it reaches the end of the flow network; at that point the base case for the recursion is called on line 82 and the function ends.

calculateMaxFlow() on line 91 is what calls getPath with the correct parameters and adds up the max flow. It first finds the source and sink of the network. Then, it calculates an initial augmenting path on line 96. Then, we start the important Ford-Fulkerson method and continue to calculate flow while there still is a path. While there is a path, an augmenting capacity is in each edge of the path, so a flow can be calculated on line 98. That flow is added to the forward edges and subtracted from the reverse edges. Then, another path is calculated and the process resumes. Finally, we only need to calculate the total flow coming out of the source because this is the exact amount of flow through the entire network (since we only have one source).

Using the Implementation

To illustrate the above python implementation, we can construct the exact same graph from the Intuition section.

    fn = FlowNetwork()
    fn.addVertex('s', True, False)
    fn.addVertex('t', False, True)
    map(fn.addVertex, ['a', 'b','c','d'])
    fn.addEdge('s', 'a', 4)
    fn.addEdge('a', 'b', 4)
    fn.addEdge('b', 't', 2)
    fn.addEdge('s', 'c', 3)
    fn.addEdge('c', 'd', 6)
    fn.addEdge('d', 't', 6)
    fn.addEdge('b', 'c', 3)

The flow network is created, so its vertices and network can be inspected. However, these attributes contain full objects—the vertices attribute contains Vertex objects and the network attribute contains Edge objects. So we need to clean it up a little.

>>> [vertex.name for vertex in fn.vertices]
['s', 't', 'a', 'b', 'c', 'd']
>>> ['%s -> %s' % (e.start, e.end) for e in fn.getEdges()]
['a -> s', 'a -> b', 'c -> s', 'c -> d', 'c -> b', 'b -> a', 'b -> t', 'b -> c', 'd -> c', 'd -> t', 's -> a', 's -> c', 't -> b', 't -> d']

Remember that the edges in getEdges() also contains all of the reverse edges. Now, this class can calculate its own maximum flow.

>>> fn.calculateMaxFlow()
7

This is the same answer we found by inspection earlier.

What happens to the 'capacity' and 'flow' attributes of the edges in 'getEdges()' when 'fn.calculateMaxFlow()' is called?

Hint: This answers for forward edges and backward edges are a little different.
Once this procedure has been completed, all forward edges are filled up as much as they can be, and the reverse edges reflect that filled-up status. As the max-flow min-cut theorem states, the total flow of the network must be preserved. So, for a given edge with endpoints \((u, v)\), the flows of the forward and backward edges will sum to 0, just as when the process began.

Here is the difference before and after calling fn.calculateMaxFlow():
1
2
3
4
5
6
>>> ['%s -> %s; %s/%s' % (e.start, e.end, e.flow, e.capacity) for e in fn.getEdges()]
['a -> s; -4/0', 'a -> b; 4/4', 'c -> s; -3/0', 'c -> d; 5/6', 'c -> b; -2/0', 'b -> a; -4/0', 'b -> t; 2/2', 'b -> c; 2/3', 'd -> c; -5/0', 'd -> t; 5/6', 's -> a; 4/4', 's -> c; 3/3', 't -> b; -2/0', 't -> d; -5/0']
>>> fn.calculateMaxFlow()
7
>>> ['%s -> %s; %s/%s' % (e.start, e.end, e.flow, e.capacity) for e in fn.getEdges()]
['a -> s; -4/0', 'a -> b; 4/4', 'c -> s; -3/0', 'c -> d; 5/6', 'c -> b; -2/0', 'b -> a; -4/0', 'b -> t; 2/2', 'b -> c; 2/3', 'd -> c; -5/0', 'd -> t; 5/6', 's -> a; 4/4', 's -> c; 3/3', 't -> b; -2/0', 't -> d; -5/0']
If you draw out all of the edges, you will see that the graph you've created is the same that was made in the residual graphs section.

Complexity

The analysis of Ford-Fulkerson depends heavily on how the augmenting paths are found. The typical method is to use breadth-first search to find the path. If this method is used, Ford-Fulkerson runs in polynomial time.

If all flows are integers, then the while loop of Ford-Fulkerson is run at most \(|f^{*}|\) times, where \(f^{*}\) is the maximum flow. This is because the flow is increased, at worst, by 1 in each iteration.

Finding the augmenting path inside the while loop takes \(O\big(V + E^{'}\big),\) where \(E^{'}\) is the set of edges in the residual graph. This can be simplified to \(O(E)\). So, the runtime of Ford-Fulkerson is \[O\big(E \cdot |f^{*}|\big).\]

Contents