Combinatorial Optimization
Combinatorial optimization is an emerging field at the forefront of combinatorics and theoretical computer science that aims to use combinatorial techniques to solve discrete optimization problems. A discrete optimization problem seeks to determine the best possible solution from a finite set of possibilities.
From a computer science perspective, combinatorial optimization seeks to improve an algorithm by using mathematical methods either to reduce the size of the set of possible solutions or to make the search itself faster. From a combinatorics perspective, it interprets complicated questions in terms of a fixed set of objects about which much is already known: sets, graphs, polytopes, and matroids. Combinatorial optimization refers primarily to the methods used to approach such problems and, for the most part, does not provide guidelines on how to turn real-world problems into abstract mathematical questions, or vice versa.
This subject originally grew out of graph theoretic concerns like edge colorings in undirected graphs and matchings in bipartite graphs. Much of the initial progress is due entirely to theorems in graph theory and their duals. With the advent of linear programming, these methods were applied to problems including assignment, maximal flow, and transportation. In the modern era, combinatorial optimization is useful for the study of algorithms, with special relevance to artificial intelligence, machine learning, and operations research.
The Marriage Problem, in its traditional formulation, seeks to find a set of couples in a set of men and women such that each couple contains one man and one woman and no two people would prefer each other to their own partners. Each woman has an ordering of preferences for the men, and each man has an ordering of preferences for the women.
Given a set of preferences for the men and women, does there exist a solution? What is a good way to find such a solution?
It turns out there will always exist a solution, and this is known as Hall's Marriage Theorem. This general sort of problem is known as a matching problem, and it is one of a few central types of problems in combinatorial optimization.
Fundamentals
This subject encompasses the study of partially ordered sets, graphs, and matroids.
A partially ordered set, abbreviated to poset, is a nonempty set together with a partial order on such that
- For all , .
- For all , and implies .
- For all , and implies .
In a poset , two distinct are comparable if or . A chain of is a subset of in which any two elements are comparable. An antichain of is a subset of in which any two elements are not comparable.
A graph is a set (denoted vertices) together with a set (denoted edges) of element subsets of . A bipartite graph is a graph such that there exist and satisfying and for all , , and . Vertex and edge are said to be incident if .
A vertex matching of graph is a set of edges such that no two edges are incident to the same vertex. A vertex cover of is a set of vertices such that every edge is incident to at least one element of the vertex cover.
A path is a finite sequence of distinct edges such that for each , . The path is said to start at vertex and end at .
A collection of paths is edge-disjoint if no pair of them contain the same edge. An edge-separating set of two vertices and (or more generally, sets of vertices and ) is a set of edges such that any path from to (or, from a vertex of to a vertex of ) contains at least one of those edges.
A matriod is a pair of sets such that
- contains the emptyset .
- If and , then .
- If and , then there is an in such that .
Min-Max Relations
The tools of interest in combinatorics suggest that there may be broader themes of optimality. In fact, intuitively speaking, the interplay between minimality and maximality rises from the construction of the objects themselves. For instance, in graph theory, trees are graphs that are, equivalently, maximally acyclic and minimally connected. Computational problems on the defining features of graphs may often be solved through the use of a greedy algorithm (and such an algorithm inherently requires some maximal or minimal feature); for instance, in the solution to the minimal spanning tree problem, Kruskal's algorithm encodes the definition of a tree as "maximally acyclic" while Prim's Algorithm takes advantage of the characterization as "minimally connected".
Given a poset , the maximal size of an antichain of is equal to the minimal size of a partition of into chains.
Proceed by induction on the size of the poset: the theorem holds for the poset of one element.
Now suppose it holds for all posets of size at most , and let be a element poset with some maximal element . Then, (with the inherited partial order) has a partition of minimal size into chains . By the inductive hypothesis, the maximal size of an antichain is , and since any two elements of an antichain may not be in the same chain, we may exhibit antichains of maximal size simply by taking for each and considering ; let be the antichain of maximal size in which each is the maximal element of (guaranteed to be unique since all elements of a chain are comparable). Then either is not comparable to any element of , in which case is a partition of into chains and is an antichain of length . Or is comparable to an element for some fixed , and is a partition of into chains and is still a maximal antichain of . So satisfies Dilworth's Theorem, and by the principle of mathematical induction, the proof is complete.
Mirsky's Theorem.
Given a poset , the maximal size of a chain of is equal to the minimal size of a partition of into antichains.
Suppose is a chain of maximal size of and is a cover of antichains. Then, since for all , it follows that Then, note that there is a cover of antichains where for each , Note immediately that is indeed an antichain; for if distinct are comparable (say, ), then is the maximal element of a chain of length , given by together with the chain of length and maximal element (it follows from transitivity that this is indeed a chain). Additionally, the form a partition of , since the maximal length of a chain of , so every element of is the maximal element of some chain of length either . Thus, they necessarily cover and therefore is a cover of by antichains. It follows that . We conclude the theorem as stated.
Given a bipartite graph , the maximal size of a vertex matching of is equal to the minimal size of a vertex cover of .
Suppose is a bipartite graph, and is a vertex cover of . Let . Then, consider the poset on with partial order such that for any and , iff is connected to . The result is then precisely Dilworth's Theorem.
Given a set and collection , there exists a set of distinct elements such that for each iff for any , the inequality holds.
Menger's Theorem.
Given a graph and vertices and , the maximal size of a set of edge-disjoint paths from to is equal to the minimal size of an edge-separating set of and .
Linear Programming
In its mathematical formulation, a linear program takes as input an matrix and column vectors of length and of length . Its goal is the following:
Find a column vector of length satisfying such that is maximal. Otherwise, decide that there is no satisfying that constraint or that there is no maximal and is unbounded.
If such an exists, it is known as an optimal solution.
The problem is known as infeasible if there is no solution satisfying , and it is known as unbounded if for all real numbers , there is a column vector such that .
Any linear program that is not infeasible and not unbounded must have an optimal solution.
In essence, this follows from theorems of linear algebra.
Linear programs are often discussed in terms of polyhedra, and when the polyhedra is bounded, polytopes. Graphically, the points satisfying create a polyhedron, and its dimension is By moving around on the faces of the polyhedron, it is possible to find an optimal solution. This gives rise to the simplex method, which while effective (and polynomial) in general has very poor (exponential) performance in the worst case.
All linear programs can be reduced to the "standard form" where and .
For instance, suppose a linear program has the following constraints on positive , , and :
Then, the "slack variables" and (each ) should be introduced such that
It follows that whatever the linear program seeks to maximize can be maximized with the introduction of the 's as well.
Algorithms
Prim's algorithm provides a method for solving one of the simplest problems of combinatorial optimization: finding a minimum spanning tree on a (weighted) graph. It takes advantage of the fact that tress are minimally connected graphs and that graphs have a matroid structure (and therefore are susceptible to certain implementations of the greedy algorithm). It works as follows:
- Select a vertex on the graph.
- Of the edges connecting the selected vertices to the rest of the graph, select the one with minimal weight (and the associated vertex).
- Repeat step 2 until all vertices are in the tree.
The algorithm manages to iteratively take a local feature (the minimal weight in some neighborhood of the graph) and turn it into a global feature (a minimum spanning tree of the graph). This is a greedy algorithm. With proper implementation, this algorithm can have complexity
Kruskal's algorithm tackles the same problem and takes advantage of the fact that trees are maximally acyclic graphs. It works as follows:
- Order the edges from minimal weight to maximal weight, and start with an empty set of edges.
- Add the edge of minimal weight that does not make the set contain any cycles.
- Repeat step 2 until the set determines a spanning tree.
This algorithm works because any other spanning tree could be made from the same process-- but would be later in the sequence. It follows that the spanning tree in question is minimal. The algorithm has complexity .