Flow Network
A Flow network is a directed graph where each edge has a capacity and a flow. They are typically used to model problems involving the transport of items between locations, using a network of routes with limited capacity. Examples include modeling traffic on a network of roads, fluid in a network of pipes, and electricity in a network of circuit components.
For example, a company might want to ship packages from Los Angeles to New York City using trucks to transport between intermediate cities. If there is only one truck for the route connecting a pair of cities and each truck has a maximum load, then the graph describing the transportation options will be a flow network. Each node will represent a city and each edge will represent a truck route between those cities (e.g. a highway). The capacity for a particular route will be the maximum load the truck for that route can carry. Using this model, the company can decide how to split their packages between trucks so that the packages can reach their destination using the available routes and trucks. The number of packages the company decides to ship along a particular truck route is the flow for that route.
Motivation
Imagine a courier service that wants to ship as many widgets as possible from city $s$ to city $t$. Unfortunately, there is no way to ship widgets directly from $s$ to $t$, so the courier service must ship the widgets using the intermediate cities $a$, $b$, $c$, and $d$. Particular pairs of cities are connected by flights, which allow the transport of widgets between those cities. This transportation network can be represented by the following directed graph, where nodes represent cities and directed edges represent flights between those cities.
Obviously, any realistic airplane can't carry an unlimited number of widgets. Therefore, every flight has a maximum number of widgets that it can carry, dependent on the size of its cargo bay. This maximum is called the capacity for that flight. It is a number associated with each edge in the graph above and denotes the maximum number of widgets that can be transported between cities. The graph below is the graph above plus the corresponding capacities. For example, the flight from $b$ to $c$ can carry a maximum of $9$ widgets, so edge $\vec{bc}$ has capacity $9$.
Now that the courier service has a representation of the flights it can use to transport widgets, as well as the maximum number of widgets that can be moved between cities, it can start the task of deciding how many widgets to transport on each flight. The number of widgets transported along each flight is known as the flow for that flight. The naive solution would be to just assign the maximum number of widgets possible to each flight. However, this violates the common sense constraint that the number of widgets flown into an intermediate city must equal the number of widgets being flown out of that city. Otherwise, widgets are being left behind (in the case that there are more flown in than flown out) or created out of thin air (in the case there are more flown out than flow in), two scenarios which are not in the courier's interest or power. Indeed, assigning the maximum flows possible leaves $14$ widgets entering city $d$ and only $11$ widgets leaving, meaning $3$ widgets of vanished.
So, for every node in the graph other than $s$ and $t$, the total flow leaving that node must be equal to the total flow entering that node. One possible assignment that respects these two constraints (an edge's flow may not exceed its capacity and the total flow entering a node must equal the total flows leaving that node) is shown below.
A natural question is to ask whether this is the assignment that maximizes the number of widgets that can ship from city $s$ to city $t$. Since the number of widgets being shipped to city $t$ is simply the number of widgets leaving $s$ or entering $t$, summing either one yields that this flow assignment allows the shipment of $9$ widgets. In fact, an optimal assignment, known as the maximum flow, is $23$ widgets, which is shown in the graph below. This is one flow assignment (it is not necessarily unique) that maximizes the courier service's stated objective of maximizing the number of widgets to ship from $s$ to $t$. Other problems and objectives are described in the section Flow Network Problems below.
Definition
Formally, a flow network is defined as
A flow network $N$ is a tuple $N=(G, c, s, t)$ where
$G = (V, E)$ is a directed graph of vertices $V$ and directed edges $E$
$c: E \rightarrow \mathbb{R}_0^+$ is a mapping from the edges to the nonnegative reals, and $c(e)$ is called the capacity of edge $e$
$s, t \in V$ are special vertices of $G$ called the source and sink, respectively
An admissible flow. or, for short, a flow on N, is defined as
A flow $f$ for a flow network $N$ is a mapping $f: E \rightarrow \mathbb{R}_0^+$ satisfying the following two constraints:
$0 \le f(e) \le c(e)$
for all edges $e \in E$
$\sum_{e^+ = v}f(e) = \sum_{e^- = v}f(e)$
for each vertex $v \neq s, t$, where $e^-$ and $e^+$ denote the start and end vertex of edge $e$, respectively
The first constraint (called the feasibility condition) simply says that the flow $f(e)$ along each edge $e$ must be nonnegative and may not exceed the capacity $c(e)$ for that edge. The second constraint (called the flow conservation condition) says that the flows into a vertex must equal the flows out of that vertex, except for the source and sink vertices.
Given vertices $s, a, b, t$ with source $s$ and sink $t$, draw the flow network with the following capacities, making sure to label each edge with its corresponding capacity.
$c(\vec{sa}) = 1 \quad c(\vec{sb}) = 1 \quad c(\vec{ab}) = 2 \quad c(\vec{at}) = 1 \quad c(\vec{bt}) = 1$
One possible layout of the flow network is shown to the left. Note that, since edges are directed in a flow network, edge $\vec{ba}$ is not included in the graph even though edge $\vec{ab}$ is included. Also, since $s$ is a source, it has no edges entering it. Similarly, sink $t$ has no edges leaving it.
Given the flow network from the previous example, is the mapping (with each edge label equal to that edge's flow) below an admissible flow?
No, the mapping is not an admissible flow. While the mapping satisfies the feasibility condition (check that the flow for each edge is between zero and the capacity for that edge), the flow conservation condition is not satisfied. Specifically, the total flow exiting node $b$ is $1$, while the total flow entering node $b$ is $1.5$. Changing the flow of edge $\vec{sb}$ from $.75$ to $.25$ will make this an admissible flow.
Useful Concepts
While many different problems can be formulated as flow networks, a few concepts are common to almost all of them. Some are used to transform a given problem statement into the canonical form $N = (G, s, t, c)$ as defined in the previous section, while others are used as intermediate calculations in algorithmic solutions.
Residual Capacity
The residual capacity $r$ of a flow network $N$ and a flow $f$ is the difference between the capacity $c$ and the flow $f$. Formally, this is defined as
The residual capacity for a flow network $N$ and flow $f$ is a mapping $r: E \rightarrow \mathbb{R}_0^+$ such that
$r(e) = c(e) - f(e)$
for all edges $e \in E$
Setting each edge's weight to its residual capacity forms a residual network for a flow network $N$ and flow $f$. This network models the available capacity for all edges, and is used as an intermediate calculation in the Ford-Fulkerson algorithm for solving the maximum flow problem.
Augmenting Paths
An augmenting path is a directed path in the residual network with unused capacity for every edge in that path. Formally, this is defined as
An augmenting path for a flow network $N$ and flow $f$ is a directed path $p$ starting at the source $s$ and ending at the sink $t$ such that the residual capacity $r$ for all edges on that path is nonzero. Specifically,
$p = (u_1, u_2, \dots, u_{k-1}, u_k)$ is a directed path with $u_1 = s$ and $u_k = t$ such that
$r(e) = c(e) - f(e) \gt 0$
for all edges $e = (u_i, u_{i+1})$ for $i = 1, \dots, k - 1$ in the path $p$.
Actually, it turns out that a flow network $N$ with flow $f$ is a maximum flow if and only if there is no augmenting path in the residual network. The 'if' part of this theorem, that a maximum flow has no augmenting paths, is clear since the existence of such a path implies the existence of a more maximal flow (by increasing the flow for all edges in the path by the minimum residual capacity along that path). The 'only if' part of this theorem, that no augmenting paths implies maximality, is less intuitive, and is the key insight behind the Ford-Fulkerson algorithm.
Supernodes
Often, a given problem will require modeling multiple sources and/or multiple sinks. Since the canonical definition of a flow network $N = (G, s, t, c)$ specifies only a single source and a single sink, sometimes the addition of supernodes to the flow network is needed. Specifically, multiple sources requires the addition of a supersource, with edges from the supersource to each individual source, while multiple sinks requires the addition of a supersink, with edges from each individual sink to the supersink. Then, the flow network's source $s$ is set to the supersource and the flow network's sink $t$ is set to the supersink.
Since the addition of supernodes shouldn't influence any calculations on the graph, the capacity $c$ for the newly created edges is set to infinity, so that the flow along those edges is only constrained to be nonnegative. For example, the figure to the right is a flow network with $n$ sources $s_1, \dots, s_n$ and $m$ sinks $t_1, \dots, t_m$, with supersource $S$ and supersink $T$.
Flow Network Problems
There are several different problems that can be modeled as flow networks. A few in particular are common and show up in very diverse fields. Since flow network problems have been well studied, each is only briefly described here, along with some example applications. More in depth discussions are linked to in each problem description.
Maximum Flow
The maximum flow problem is the problem of finding the maximum admissible flow through a single source, single sink flow network. It was originally formulated in 1954 by mathematicians attempting to model Soviet railway traffic flow. Well known solutions for the maximum flow problem include the Ford-Fulkerson algorithm, Edmonds-Karp algorithm, and Dinic's algorithm.
Maximum flow algorithms have an enormous range of applications. Examples include airline flight crew scheduling, the circulation-demand problem (where goods with location dependent demand must be transported using routes with limited capacity), and determining when during a sports season to eliminate losing teams.
Minimum Cost Flow
Minimum cost flow is the problem of finding the cheapest possible way to send a certain amount of flow through a network. This requires extending the flow network so that each edge $e = (u, v)$ now also has an associated cost $a(e)$ per unit of flow per edge. The total cost of a flow is then $sum_(e \in E)$a(e) \cdot f(e)). Minimum cost flow can be solved using linear programming since the objective function is linear, as are the constraints.
One common application of minimum cost flow is in determining the cheapest way to transport items from point A to point B, given some routes with limited capacity and associated cost. For example, truck routes may have greater capacity, but higher cost (in terms of time), while plane routes may have lesser capacity, but lower cost. Thus, a company looking to minimize the cost of transport would seek to model this trade-off between capacity and cost using minimum cost flow.
Maximal Bipartite Matching
Maximal bipartite matching is the problem of determining the maximal matching for a bipartite graph. If the graph is modeled as a flow network (flow from one set of nodes to the other), various flow algorithms can be used to solve it. For example, the Ford-Fulkerson algorithm can solve bipartite matching in unweighted graphs, as can the Hopcroft–Karp algorithm, which does so more efficiently since it is designed specially for bipartite graphs. For the weighted graph case (called the assignment problem), the best known algorithm is the Hungarian algorithm.
Most interesting maximal bipartite matching is framed in terms of the assignment problem. The most common formulation of the assignment problem is that given a certain number of agents and a certain number of tasks, as well as a cost/benefit for each agent on each task, assign each agent to exactly one task such that the cost/benefit is minimized/maximized. Many common problems cans be formulated in terms of the assignment problem, so the applications are extremely diverse. Applications include scheduling, resource allocation, and transportation segmentation.
Transportation Problem
A more general version of the assignment problem is the transportation problem. It turns out the transportation problem is also a more specific version of the minimum cost problem. The transportation problem is similar to the assignment problem, but instead of each agent being assigned to exactly one task, each agent can split their time among multiple tasks and each task may by attended to by multiple agents. Thus, some agents and tasks may not be assigned.
The transportation problem, like its name suggests, is sometimes described in terms of mines and factories. If mines supply factories with ore, then the transportation problem reduces to determining the most cost effective way to transport ore from individual mines to individual factories. Thus, ore from one mine may be sent to multiple factories, while one factory may use the ore from many mines. In this case, the cost function is dependent on the Euclidean distance between each mine and factory, as well as the cost of available transportation methods between the two. Obviously, mining isn't the only industry whose transportation concerns can be formulated in this fashion, so the problem itself is very general, as are its applications.
References
- , J. Network_Flow_Reversed_Edge_1. Retrieved January 23, 2013, from https://en.wikiversity.org/wiki/File:Network_Flow_Reversed_Edge_1.svg
- Lee, C. Multi-source_multi-sink_flow_problem. Retrieved September 7, 2009, from https://commons.wikimedia.org/wiki/File:Multi-source_multi-sink_flow_problem.svg