I'm sure you've all heard of buzz-words like NP-Completeness and the like. They sound hard and unapproachable, so today let's solve one of the non-classical NP-Hard problems (in exponential time of course).
A graph, , is basically a set of points (called vertices) on a plane floating around. However, there are also a set of lines joining the various points together in order to relate them together somehow. These are called nodes and edges, to imagine them, you can think of the cliche social network graph with the dense set of edges connecting you to possibly thousands of friends and acquantences. Typically, graphs aren't really that dense and well-connected, and typically you'll only see very few edges floating around. More formally, we describe a graph by a pair of sets: , standinf for vertices; standing for edges. Each edge is composed of a line and two vertices, and we may denote the edge going from to (both vertices) as . (Note in this presentation, edges do not have a direction.)
[missing image: http://www.networkweaving.com/blog/uploaded_images/ArtTechNet-701564.gif]
Now, let's consider a subset of these graph structures known as bipartite graphs that have two distinct sides (let's imagine left and right), with equal number of vertices/nodes on either side, and in which the edges and the lines only ever connect vertices from one side into the other. Such a graph can be intuitive imagined as the underlying structure behind some kind of a matching problem. For example, suppose you have 6 buyers and 6 houses, each buyer has a list of houses that he is willing to buy; in this example, the left hand side might be the buyers and the right hand side would be the houses, you would draw a line from a buyer vertex to a house vertex if the buyer deems the house acceptable. If you're a real-estate agent, you would probably want to sell off all of the houses, so you would look for a set of 6 edges that matches each house to one and only one buyer. This set of edges is called a matching if none of the edges will link a buyer to multiple houses or a house to multiple buyers; a matching (or a set of edges) is called a perfect matching if its edges matched up every house and every buyer in our graph (that is, a maximum matching that is also complete).
[missing image: bipartite graph: https://d2o58evtke57tz.cloudfront.net/wp-content/uploads/maximum_matching1.png]
Now, the problem of constructing a perfect matching turns out to be relatively easy to do with pen and paper, and even more so by a computer (we won't go into that right now, wait until your first algorithm course). However, not satisfied with just one perfect matching, I want you to tell me how many perfect matchings there are in a bipartite graph! As far as we know, this is a really hard problem.
How would you solve this problem? A natural starting point would be to brute-force the solution; since the problem is hard, we can't expect a sub-exponential algorithm anyways! Fair point, let's start with a brute-force algorithm. Suppose your bipartite graph has edges and nodes (where is even by definition of being bipartite), then a perfect matching must be a set of edges. One way to solve this problem then is to enumerate all of the such edges, test each one individually to see if it is a perfect matching, and count how many were actually perfect matchings. How long would this algorithm take? Well, suppose for (because otherwise we can't have a perfect matching) and since there are such edges we have to consider, then it's possible to show that the number of edge sets we have to consider is approximately ! For many applications, it's very likely that isn't small. For example, consider the realestate market, here, it's not at all unlikely that the average number of houses considered per potential buyer is greater than 4! Can we do better?
Since in this problem, the number of edges could be potentially up to , it's a very desirable property to construct an algorithm that is completely independent of the number of edges! It turns out that there's a really elegant algorithm (that can be further improved in fact) which runs in time , no matter how many edges there are in our graph. Here's how to do it.
Consider the inclusion-exclusion principle, and let's consider a generalization of it. Suppose we have a function that assigns a natural value to each subset of , then we let the zeta-transform of to be . A generalization of the Mobius Inversion Theorem (lifted out of number theory and applied to complete partial orders) states that which can be seen as a generalization of the inclusion/exclusion principle on arbitrary predicate function $f(X)$. Informally, you can easily see the relationship from the mobius inversion theorem to the inclusion-exclusion theorem, which states that
Now, let's take a leap of faith together. Let be the right half of the vertices; define a semi-leftmatch to be a set of edges, each of which starts at exactly one vertex on the left side (so it is a perfect matching for only the left hand side). For example, this is a semi-leftmatch
Now, define to be the number of semi-leftmatches whose edges do not enter the right hand side vertices in . For example, the above semi-leftmatch is also included in . Then it turns out that is the zeta transform of that returns the number of perfect matchings when , that is To see this is true, look at the following example:
Here, we have = 3. Now, considers every possible semi-leftmatch, of which there are possibilities (the first left node has 3 edges, the second has 2, and the last also has 2). Of course, we also need to subtract off the non-matching semimatches. Notice that we have the invariant: if we exclude a node out of the right hand side for matching, then none of the semimatches generated are true matchings, because it's impossible to match left hand side nodes with right hand side nodes. Now, if you actually look closely, there are only 3 perfect matchings. If you do the mobius inversion on , which you can compute by counting the number of semimatches (shown in the figure), you'll also get 12 - 4 - 2 - 4 + 1 + 0 + 0 + 0 = 3! (I neglected to include the cases where there are zero semimatches).
Therefore, in order to compute the number of perfect matchings, we just need computations of , which is easy:
function g(G,X) remove from E all edges pointing to each node of X remove from R all of X return the product of the number of remaining edges coming out of each of the node on the left end
that is, collapse into by removing all of the nodes and edges connected to , then where is the function that returns the number of edges connected to a vertex . We can then use the mobius inversion transform above to compute in time, irregardless of how many edges are present.
This concludes another presentation on using clever mathematics to solve problems in computer science. The algorithm derived here came from relatively recent research, and as of late, inclusion-exclusion has been an actively researched area in algorithm design. For those interested, I would recommend Fast polynomial-space algorithms using M ̈obius inversion: Improving on Steiner Tree and related problems