Catalan Numbers
The Catalan numbers are a sequence of positive integers that appear in many counting problems in combinatorics. They count certain types of lattice paths, permutations, binary trees, and many other combinatorial objects. They satisfy a fundamental recurrence relation, and have a closed-form formula in terms of binomial coefficients.
The Catalan numbers are given by the formula
The first few Catalan numbers are
It is not immediately clear from the definition that the are integers, but this will follow from many of the theorems proved below.
Contents
Dyck Paths and Acceptable Sequences
The number of valid parenthesis expressions that consist of right parentheses and left parentheses is equal to the Catalan number. For example, and there are 5 ways to create valid expressions with 3 sets of parenthesis:
- ( ) ( ) ( )
- ( ( ) ) ( )
- ( ) ( ( ) )
- ( ( ( ) ) )
- ( ( ) ( ) )
Considering right parenthesis to be +1s, and left -1s, we can write this more formally as follows:
The number of sequences of terms that can be formed by using copies of 's and copies 's whose partial sums satisfy
for equals the Catalan number.
Let us call a sequence of copies of 's and copies of 's to be acceptable if it satisfies the above condition, and unacceptable otherwise. We denote the number of acceptable sequences by and the number of unacceptable sequences by .
The total number of sequences of 's and 's can be regarded as a permutation of objects with two different types with different objects of one type and of the other. So the total number of such sequences is
So and the next step is to count .
Consider an unacceptable sequence of copies of 's and copies of 's. Then there is a smallest for which the partial sum is negative. By minimality of there must be an equal number of 's and 's preceding So we must have odd, and . So among the terms there are copies of 's and copies of 's. Among the remaining terms there are copies of 's and copies of 's.
Now consider the effect of reversing the sign of each of the first terms, that is, replacing with for and leaving the rest unchanged. Now this new sequence obtained, say will have copies of 's and copies of 's (since they add up to 2).
This gives a function from the set of unacceptable sequences to the set of sequences of terms that add up to . To see that is a bijection, construct its inverse: let be a sequence of terms that add up to . Let be the first index for which there are more 's in than 's. (There must be such a since the sum of all the terms is positive.) Then define to be the (unacceptable) sequence . It is clear that and are inverses.
Now the number of sequences of copies of 's and copies of 's is . By the bijection, this is also equal to the number of unacceptable sequences. So the total number of unacceptable sequences is
So the number of acceptable sequences is
This completes the proof.
Notes:
(1) These sequences also describe Dyck paths of length , which are sequences of equally-spaced northeast and southeast walks starting at the origin, ending on the -axis, and never going below the -axis. The bijection between acceptable sequences and Dyck paths assigns a to a northeast walk and a to a southeast walk, and the acceptability condition is equivalent to the path never dropping below the starting and finishing line.
The five Dyck paths for
(2) This shows that the Catalan numbers are integers, although this also follows from the identity which comes up in the above proof.
Recurrence Relation; Generating Function
A useful tool in proofs involving the Catalan numbers is the recurrence relation that describes them.
The Catalan numbers satisfy the recurrence relation
There are several ways to prove this, but perhaps the most elegant is by appealing to Dyck paths of length , which we saw above that counts. Given a Dyck path of length let be the first nonzero -coordinate where the path hits the -axis, then .
The path breaks up into two pieces, the part to the left of and the part to its right. The part to the right is a Dyck path of length , so it is counted by . The part to the left is a northeast step, then a Dyck path of length , and then a southeast step. The middle path is a Dyck path "on stilts"; it never dips below its starting point because it cannot hit the -axis earlier than There are of these. So there are a total of paths which hit the -axis first at , and summing these terms up gives , which is the recurrence relation.
If , then counts the five Dyck paths pictured above:
- Path 1 has , counted in .
- Path 2 has , counted in .
- Path 3 has , counted in .
- Path 4 has , counted in .
- Path 5 has , counted in .
The middle path of length in paths 1 and 2, and the top half of the left peak of path 3, are the Dyck paths on stilts referred to in the proof above.
This recurrence is useful because it can be used to prove that a sequence of numbers is the Catalan numbers. If the sequence call it satisfies the same recurrence as the Catalan numbers, and has the same initial condition then by induction.
The generating function for the Catalan numbers is
Let . Then
So by the quadratic formula, , but the minus sign is the only choice that makes sense, because the plus sign leads to an expression that blows up at . Choosing the minus sign leads to an expression that has a limit as by L'Hopital's rule:
(Exercise: prove the theorem instead by using the binomial theorem on the right side.)
Other Appearances of the Catalan Numbers
Restricted random walks:
A man walks blocks north and blocks west of his residence. Every day he has blocks to walk. How many routes are possible if he never crosses the diagonal line from home to office?
![]()
Clearly, each acceptable route is either above the diagonal or below the diagonal and both of these paths are symmetric. So we calculate the number of paths below the diagonal and multiply it by 2. Each route is a sequence of northerly blocks and westerly blocks. We identify west with and north with . Each route corresponds to a sequence of copies of 's and copies of 's. But in order for the route not to go above the diagonal, we must have
for . So the number of acceptable routes above the diagonal equals the Catalan number, and the total number of acceptable routes is
The above diagram shows the routes below the diagonal for . Notice that the number of such routes is equal to . So the answer is
Binary trees:
A rooted binary tree is a tree with one root node, where each node has either zero or two branches descending from it. A node is internal if it has two nodes coming from it. How many rooted binary trees are there with internal nodes?
Rooted binary trees with internal nodes.[1]
This is a good illustration of proving that a sequence equals the Catalan numbers by using the recurrence relation. Let be the number of rooted binary trees with internal nodes. Then as in the picture. Now for suppose there is a rooted binary tree with internal nodes. The root node is internal. There are internal nodes on the left and internal nodes on the right, for some Looking at the left and right sides gives two rooted binary trees with and internal nodes, respectively. So
which is the same recurrence relation as the Catalan numbers. So for all
Polygon triangulations:
How many ways are there to cut an -gon into triangles, by drawing non-crossing lines between vertices of the polygon?
All the triangulations for [2]
Let be the number of such triangulations, and define . Consider , which counts triangulations of an -gon. The idea is to pick an edge of that -gon and classify triangulations by which triangle that edge ends up in.
Number the vertices . Suppose our edge is . If the triangle this edge ends up in is , then this triangle splits the polygon into two remaining polygons that must be triangulated.
- If , then what is left is an -gon. The number of triangulations like this is .
- If , there is a triangle on one side and an -gon on the other side. The number of triangulations like this is .
- If , there is a quadrilateral on one side and an -gon on the other side. The number of triangulations like this is .
Proceeding in this way,
Also . This is the same recurrence relation and initial condition that the Catalan numbers satisfy, so it must be the case that for all .
Let 18 equally spaced points be chosen on the circumference of a circle.
How many ways can these points be connected in pairs such that the resulting 9 segments do not intersect?
Note: Rotations are reflections are considered distinct arrangements.
For more examples, see the wiki on proofs using bijections.
References
- Berkeley Math Circle, . Binary Tree. Retrieved September 9, 2014, from https://commons.wikimedia.org/wiki/File:Binary_Tree.png
- Harvey, D. Catalan Hexagons. Retrieved April 8, 2009, from https://commons.wikimedia.org/wiki/File:Catalan-Hexagons-example.svg