The lambda calculus is an abstract mathematical theory of computation, centered on functions. The lambda calculus can be thought of as the theoretical foundation of functional programming. It is a Turing complete language; that is to say, any machine which can compute the lambda calculus can compute everything a Turing machine can (and vice versa).

The notation is based on function abstraction and application, variable binding, and substitution. If you have done some basic programming already, you might be familiar with lambda functions or anonymous functions. They are featured in Haskell, Mathematica, Python, and recent versions of C++, too. In this wiki, we will mostly stick to Haskell.
Consider the function \(f(x) = x^2,\) implemented in Haskell as
f x = x^2
Another way to write this function is \(x \mapsto x^2,\) which in Haskell would be
(\ x -> x^2)
Notice that we're just stating the function without naming it. This is the essence of the lambda calculus. In lambda calculus notation, we would write this as \(\lambda x.\, x^2.\) We could apply this function to another expression (a variable, or another function), as in \((\lambda x.\, x^2)\, 7,\) which actually refers to \(49.\)
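In Haskell, this application can be checked directly; `square` is just a name we give the anonymous function for convenience:

```haskell
-- The anonymous squaring function from above, bound to a name for reuse
square :: Int -> Int
square = \x -> x ^ 2

-- Applying it to 7, as in (λx. x²) 7
answer :: Int
answer = square 7  -- 49
```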
To compute a lambda expression is to transform it into a canonical form using a set of transformation rules/reductions, as we will see. We will see that abstractions and applications (along with variables) are themselves powerful enough to express any computable function.
The simplest function abstraction I can think of that can be expressed in the lambda calculus is the identity function: \(\lambda x.\, x.\)
Expressions can be thought of as programs in the language of lambda calculus.
Given the notion of a variable, usually denoted by \(x, y, z, \ldots,\) we define an expression inductively in terms of abstractions (anonymous functions) and applications as follows:

Let \(\Lambda\) be the set of expressions.
- Variables: If \(x\) is a variable, then \(x \in \Lambda.\)
- Abstractions: If \(x\) is a variable and \(M \in \Lambda\), then \(\lambda x.\, M \in \Lambda.\)
- Applications: If \(M \in \Lambda\) and \(N \in \Lambda\), then \((M\, N) \in \Lambda.\)
Here are two important conventions:
- The common convention is that function application is left associative, unless stated otherwise by parentheses, i.e. \(M\, N\, P\) means \(\big((M\, N)\, P\big).\)
- Consecutive abstractions can be contracted, i.e. \(\lambda x.\, \lambda y.\, \lambda z.\, M\) can be abbreviated as \(\lambda xyz.\, M.\)
- The body of the abstraction extends to the right as far as possible: \(\lambda x.\, M\, N\) means \(\lambda x.\, (M\, N),\) not \((\lambda x.\, M)\, N.\)
In an abstraction like \(\lambda x.\, x^2,\) the variable \(x\) is something that has no original meaning but is a placeholder. We say that \(x\) is a variable bound to the \(\lambda.\)

On the other hand, in \(\lambda x.\, y,\) i.e. a function which always returns \(y\) whatever it takes, \(y\) is a free variable since it has an independent meaning by itself.
The fact that a variable is bound in some sub-expression (its scope) does not mean it is bound everywhere. For example, this is a perfectly valid expression (an example of an application, by the way): \((\lambda x.\, x)\,(x\, y).\) Here the \(x\) in the second parentheses has nothing to do with the one in the first.
Let us define these concepts formally:
- \(x\) is free in the expression \(x.\)
- \(x\) is free in the expression \(\lambda y.\, M\) if \(x \neq y\) and \(x\) is free in \(M.\)
- \(x\) is free in \(M\, N\) if \(x\) is free in \(M\) or if it is free in \(N.\)
- \(x\) is bound in the expression \(\lambda x.\, M.\)
- \(x\) is bound in \(M\, N\) if \(x\) is bound in \(M\) or if it is bound in \(N.\)
Notice that a variable can be both bound and free in the same expression, but the two occurrences represent different things, as we discussed in the example above.
An expression with no free variables is called a closed expression. (Compare this with how we define propositions in predicate logic.)
\(\alpha\)-equivalence states that any bound variable is a placeholder and can be replaced (renamed) with a different variable, provided there are no clashes.

\(\lambda x.\, x\) and \(\lambda y.\, y\) are equivalent.

However, this is not always that simple. Consider the expression \(\lambda x.\, \lambda y.\, x.\) It is equivalent to \(\lambda t.\, \lambda y.\, t\) but not to \(\lambda y.\, \lambda y.\, y.\)
Also, \(\alpha\)-conversion cannot result in a variable getting captured by a different abstraction, as in the example above. \(\alpha\)-conversion is not something that happens only in the lambda calculus. Remember that \(\int f(x)\, dx\) and \(\int f(t)\, dt\) mean the same thing, too.
Let us move forward and formalize this idea.
For any expression \(\lambda x.\, M\) and any variable \(y\) such that

- \(x\) and \(y\) are not bound in \(M\), and \(y\) is not free in \(M,\)

we have \(\lambda x.\, M \;\to_\alpha\; \lambda y.\, M[x := y].\) That is to say, all instances of \(x\) will be replaced by \(y.\)
If \(\alpha\)-conversion were to be generalized, one would have to define \(\lambda x.\, M \to \lambda N.\, M[x := N]\) for all \(\lambda\)-expressions \(N\). But replacing all instances of \(x\) by \(N\) wouldn't make any sense, since \(\lambda N\) may not be defined (in case \(N\) is not a variable).

Therefore, it can be tempting to replace all occurrences of \(x\) by \(N\) except those directly following a \(\lambda.\)
But in this case, if \(N\) is not a single variable, the fact that \(N\) doesn't occur in \(M\) is not sufficient anymore, as demonstrated in the following example: replacing \(x\) by \(y\, z\) in \(\lambda y.\, x\) this way would give \(\lambda y.\, y\, z,\) where the free \(y\) of \(N\) has been captured.
This is why we have to ensure that

- \(x\) is not bound in \(M\), and no free variable of \(N\) is bound in \(M.\)
\(\beta\)-reduction is the central idea of the \(\lambda\)-calculus. It tells us how simplifications of abstractions work.

Let's apply the identity function to a free variable, say, \(y\): \((\lambda x.\, x)\, y.\) The next thing that should happen is that the function should act on the argument, which is what a \(\beta\)-reduction is: \((\lambda x.\, x)\, y \to_\beta y.\) This could also be written as \((\lambda x.\, x)\, y = x[x := y] = y.\)

Let us take another example: \((\lambda x.\, x)\,(\lambda y.\, y) \to_\beta \lambda y.\, y.\) That's simple; the identity function applied to the identity function is the identity function itself.
An important feature to be mentioned here is that a \(\beta\)-reduction cannot result in the capture of a free variable by another abstraction. Consider, for example, \((\lambda x.\, \lambda y.\, x)\, y.\)

Our first step is to plug in the second expression as \(x\) in the first one. But do we do it like this: \((\lambda x.\, \lambda y.\, x)\, y \to \lambda y.\, y?\)

No! Because we're mixing up the original free \(y\) and the \(y\) bound by the inner abstraction. Going farther down the wrong route would turn our expression into the identity function, which it is not.

This is the right way to go: first rename the bound \(y\) by \(\alpha\)-conversion, \((\lambda x.\, \lambda t.\, x)\, y,\) and then reduce to \(\lambda t.\, y.\)
Let us formalize this with the notion of (capture-avoiding) substitution of free variables.
The capture-avoiding substitution of \(N\) for free occurrences of \(x\) in \(M\) -- \(M[x := N]\) in symbols -- is defined as follows:

- \(x[x := N] = N\)
- \(y[x := N] = y\) if \(x \neq y\)
- \((M_1\, M_2)[x := N] = \big(M_1[x := N]\big)\,\big(M_2[x := N]\big)\)
- \((\lambda x.\, M)[x := N] = \lambda x.\, M\)
- \((\lambda y.\, M)[x := N] = \lambda y.\, M[x := N]\) if \(x \neq y\) and \(y\) is not free in \(N\)
Please notice that bound variables cannot be substituted, as we discussed before.
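To make the definition concrete, here is a small Haskell sketch of capture-avoiding substitution over a toy syntax tree for lambda terms. The datatype and helper names (`Term`, `subst`, `freeVars`, `fresh`) are our own choices for illustration, not part of any standard library:

```haskell
import Data.List (delete, nub)

-- A toy syntax tree for lambda terms
data Term = Var String | Lam String Term | App Term Term
  deriving (Eq, Show)

-- Free variables of a term
freeVars :: Term -> [String]
freeVars (Var x)   = [x]
freeVars (Lam x m) = delete x (nub (freeVars m))
freeVars (App m n) = nub (freeVars m ++ freeVars n)

-- Produce a variable name not occurring in the given list, by priming
fresh :: String -> [String] -> String
fresh x used = head [v | v <- iterate (++ "'") x, v `notElem` used]

-- Capture-avoiding substitution: subst x n m computes m[x := n]
subst :: String -> Term -> Term -> Term
subst x n (Var y)
  | y == x    = n
  | otherwise = Var y
subst x n (App m1 m2) = App (subst x n m1) (subst x n m2)
subst x n (Lam y m)
  | y == x                 = Lam y m              -- x is bound here; stop
  | y `notElem` freeVars n = Lam y (subst x n m)  -- no capture possible
  | otherwise =                                   -- rename y to avoid capture
      let y' = fresh y (freeVars n ++ freeVars m)
      in Lam y' (subst x n (subst y (Var y') m))
```

For instance, `subst "x" (Var "y") (Lam "y" (App (Var "x") (Var "y")))` renames the bound `y` before substituting, yielding `Lam "y'" (App (Var "y") (Var "y'"))` rather than capturing the free `y`.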
Two terms that can be transformed into each other by zero or more \(\beta\)-reductions (or their inverses) are called \(\beta\)-equivalent.
Principle of Extensionality: Two functions are identical if they do the same thing, i.e. define the same mapping.

Consider the function \(\lambda x.\, f\, x\). What happens when we apply this to an arbitrary variable \(y\)?

By \(\beta\)-reduction, we'll get \(f\, y\). Isn't that just the same thing as applying \(f\) to \(y?\) Instead of applying \(\lambda x.\, f\, x,\) we might as well have applied \(f.\)

We could say that \(\lambda x.\, f\, x\) and \(f\) are the same function by the principle of extensionality.
That brings us to \(\eta\)-reduction.

If \(x\) is a variable and \(x\) does not appear free in \(f,\) then \(\lambda x.\, f\, x \to_\eta f.\)
haskell (\x -> abs x)
is better written as just
Because the \(\lambda\)-calculus is a programming language, we'll gradually develop the machinery to write out programs and encode data structures in this language.

To begin with, let us define True as choosing the first element and False as choosing the second element: \(T = \lambda x.\, \lambda y.\, x, \quad F = \lambda x.\, \lambda y.\, y.\)
But what does that mean? Well, that does not matter as long as we can get them to work like Booleans.
Recall the regular if-then-else statements used in functional or imperative programming, or the ternary operator in C++. In Haskell, they are of the following form: `if condition then this else that`.
This means that we pick this if the condition is \(T,\) and if the condition is \(F,\) we pick that. That (no pun intended) brings us to the following implementation of if: \(\text{if} = \lambda c.\, \lambda x.\, \lambda y.\, c\, x\, y.\)
This works because of the way we defined \(T\) and \(F,\) that is, picking the first and the second element, respectively.

Now we could define the rest of the logic gates in terms of if.
not can be defined as \(\text{if } x \text{ then } F \text{ else } T,\) which can be written as \(\text{not} = \lambda x.\, x\, F\, T.\)
and returns \(T\) only when both variables are \(T\). Thus, we could say that if the first variable is \(F,\) the result is definitely \(F;\) else the result is whatever the second variable was. In Haskell, `and x y = if x then y else False`.

Or in the \(\lambda\)-calculus, \(\text{and} = \lambda x.\, \lambda y.\, x\, y\, F.\)
Similarly, we could proceed with or: in Haskell, `or x y = if x then True else y`.

Or in the \(\lambda\)-calculus, \(\text{or} = \lambda x.\, \lambda y.\, x\, T\, y.\)
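The Boolean encodings above can be transcribed into Haskell almost verbatim. `toBool` is a helper we add to read a Church Boolean back as a native `Bool`; the primed names avoid clashing with the Prelude:

```haskell
-- Church Booleans: T picks the first argument, F the second
true, false :: a -> a -> a
true  = \x -> \y -> x
false = \x -> \y -> y

-- Gates, transcribed from the lambda calculus definitions above
not' b   = b false true   -- not = λx. x F T
and' p q = p q false      -- and = λx. λy. x y F
or'  p q = p true q       -- or  = λx. λy. x T y

-- Helper to read a Church Boolean back as a native Bool
toBool b = b True False
```

For example, `toBool (and' true false)` evaluates to `False`, exactly as picking elements dictates.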
Prove that \(\lambda x.\, \lambda y.\, x\, (\text{not}\, y)\, y\) represents the exclusive or (XOR) gate.
Can you implement the other logic gates?
Any Turing complete language should be able to work with some representation of natural numbers. That is what Church numerals do in the \(\lambda\)-calculus. These numerals are inspired by the Peano axioms.
We define \(0\) as \(\lambda f.\, \lambda x.\, x,\) which is the same as \(F\). Then we define the rest of the Church numerals as \(n = \lambda f.\, \lambda x.\, f^n\, x;\) for example, \(1 = \lambda f.\, \lambda x.\, f\, x\) and \(2 = \lambda f.\, \lambda x.\, f\,(f\, x).\)

To obtain the successor of \(n\), we wrap an extra \(f\) around the numeral. This is not very difficult: \(\text{succ} = \lambda n.\, \lambda f.\, \lambda x.\, f\,(n\, f\, x).\)

Defining natural numbers the way we did, \(n\, f\, x\) is actually applying a function \(f\), \(n\) times, to some starting point \(x\). That lets us define addition as \(\text{plus} = \lambda m.\, \lambda n.\, \lambda f.\, \lambda x.\, m\, f\, (n\, f\, x).\)
(If the notation seems unfamiliar, review the conventions in the Expressions part of this wiki.)
Using a similar argument, we could define multiplication as follows: \(\text{mult} = \lambda m.\, \lambda n.\, m\, (\text{plus}\, n)\, 0.\)

This is simple: we add \(n\) to \(0\), \(m\) times. Notice that we're using just one argument on plus when we say \(\text{plus}\, n\). This is called a partial application.
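The numerals and their arithmetic can also be written down in Haskell. `toInt` is a read-back helper we add for checking results, and `suc` is our spelling of succ (to avoid the Prelude name):

```haskell
-- Church numerals: n f x applies f to x exactly n times
zero = \f -> \x -> x
suc n = \f -> \x -> f (n f x)        -- succ: wrap one extra f

plus m n = \f -> \x -> m f (n f x)   -- apply f n times, then m more times
mult m n = m (plus n) zero           -- add n to zero, m times

-- Helper to read a numeral back as an ordinary Int
toInt n = n (+ 1) (0 :: Int)

two   = suc (suc zero)
three = suc two
```

For example, `toInt (mult two three)` evaluates to `6`.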
Can you use the above ideas to implement \(\text{exp}\) such that \(\text{exp}\, m\, n = m^n?\)
The implementation of the predecessor function is rather involved. In fact, Church thought for a long time that it might not be possible, until his student Kleene found it. There is a legend that Kleene conceived the idea while visiting his dentist, which is why the trick is called the wisdom tooth trick.

We'll skip the exact implementation until we introduce pairs, but let us assume that this has already been done and let's call it \(\text{pred},\) such that \(\text{pred}\, n = n - 1\) for positive \(n\) and \(\text{pred}\, 0 = 0.\)

Subtraction can be implemented in the same way as addition or multiplication: \(\text{sub} = \lambda m.\, \lambda n.\, n\, \text{pred}\, m,\) i.e. we take the predecessor of \(m\), \(n\) times.
Of course, the simplest ways of trying to implement an ordered pair do not allow retrieval of the two values. Instead, we'll implement pairs like this: \(\text{pair} = \lambda x.\, \lambda y.\, \lambda f.\, f\, x\, y.\)

That is, the ordered pair of \((a, b)\) is \(\lambda f.\, f\, a\, b\). This is helpful because we could use \(T\) and \(F\) to extract the first and the second elements, respectively called the left projection \(\big(\pi_1 = \lambda p.\, p\, T\big)\) and the right projection \(\big(\pi_2 = \lambda p.\, p\, F\big).\)

Why does this work? It's because of the way we defined \(T\) and \(F:\) for instance, \(\text{pair}\, a\, b\, T = T\, a\, b = a.\)
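Church pairs translate directly into Haskell; `pair`, `pfst`, and `psnd` are our own names, and the selector lambdas are exactly \(T\) and \(F\):

```haskell
-- A Church pair stores x and y by waiting for a selector function f
pair x y = \f -> f x y

-- Projections: feed in a selector that keeps one component
pfst p = p (\x _ -> x)   -- applying the pair to T
psnd p = p (\_ y -> y)   -- applying the pair to F
```

For example, `pfst (pair 3 5)` evaluates to `3` and `psnd (pair 3 5)` to `5`.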
Wisdom Tooth Trick
Consider the following function \(\Phi\) which transforms the pair \((n, m)\) to \((n+1, n)\): \(\Phi = \lambda p.\, \text{pair}\, \big(\text{succ}\, (\pi_1\, p)\big)\, \big(\pi_1\, p\big).\) We claim that applying \(\Phi\) to \((0, 0)\) once gives \((1, 0),\) and that applying it \(n\) times gives \((n, n-1)\) for \(n \geq 1.\)

We can easily check that the first statement is true: \(\Phi\, (\text{pair}\, 0\, 0) = \text{pair}\, 1\, 0.\)

Let the second statement be true for all values up to \(n.\)

Then, by hypothesis, \(\Phi^n\, (\text{pair}\, 0\, 0) = \text{pair}\, n\, (n-1).\)

Operating with \(\Phi\) on both sides, \(\Phi^{n+1}\, (\text{pair}\, 0\, 0) = \Phi\, \big(\text{pair}\, n\, (n-1)\big) = \text{pair}\, (n+1)\, n.\)
Hence, by induction, we have proven our claim.
Implementing the predecessor function is now simple: apply \(\Phi\), \(n\) times, to \((0, 0)\) and pick the second element: \(\text{pred} = \lambda n.\, \pi_2\, \big(n\, \Phi\, (\text{pair}\, 0\, 0)\big).\)

Note that, according to this implementation, the predecessor of \(0\) is \(0\) itself. This will be useful, as we'll notice in the next section.
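As a sanity check, the wisdom tooth trick can be modeled in Haskell with native integers and tuples standing in for Church numerals and pairs (`phi` and `pred'` are our own names for this sketch):

```haskell
-- phi maps (n, m) to (n + 1, n): the second slot always lags one step behind
phi :: (Int, Int) -> (Int, Int)
phi (n, _) = (n + 1, n)

-- Apply phi n times starting from (0, 0), then take the second component
pred' :: Int -> Int
pred' n = snd (iterate phi (0, 0) !! n)
```

Note that `pred' 0` is `0`, matching the remark above that the predecessor of \(0\) is \(0\) itself.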
It is indeed possible to encode linked lists, negative numbers, and/or rational numbers in the \(\lambda\)-calculus. However, we do not discuss that in this wiki.
The \(\text{isZero}\) function returns \(T\) if the argument is \(0\) and \(F\) if it isn't: \(\text{isZero} = \lambda n.\, n\, (\lambda x.\, F)\, T.\)

Recall that the definitions of \(0\) and \(F\) we used are equivalent.

For some \(n\), \(\text{isZero}\, n\) applies \(\lambda x.\, F\) to \(T\), \(n\) times: zero applications leave \(T\) untouched, while one or more applications return \(F.\)

\(m\) is less than or equal to \(n\) if \(\text{sub}\, m\, n\) is \(0\). This is because the predecessor of \(0\) is still \(0,\) as we noted in the previous section: \(\text{leq} = \lambda m.\, \lambda n.\, \text{isZero}\, (\text{sub}\, m\, n).\)

We'll use the property that \(m = n\) exactly when \(m \leq n\) and \(n \leq m\): \(\text{eq} = \lambda m.\, \lambda n.\, \text{and}\, (\text{leq}\, m\, n)\, (\text{leq}\, n\, m).\)
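In the same Haskell transcription as before (with the Church Booleans and numerals repeated here so the block is self-contained, and `toBool` as our read-back helper), the zero test is a one-liner:

```haskell
-- Church Booleans and numerals, as defined earlier in this wiki
true  = \x -> \y -> x
false = \x -> \y -> y
zero  = \f -> \x -> x
suc n = \f -> \x -> f (n f x)

-- isZero = λn. n (λx. F) T:
-- zero applies (\_ -> false) no times and returns true;
-- any other numeral applies it at least once and returns false
isZero n = n (\_ -> false) true

-- Helper to read a Church Boolean back as a native Bool
toBool b = b True False
```

`leq` and `eq` from the text would be built on top of this together with `sub`; we omit them here since they need the full `pred` machinery.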
How would you implement strict less-than and greater-than?
A fixed point of a function \(f\) is a point \(x\) that maps back to itself, i.e. \(f(x) = x.\)
The point into which the cobweb converges is the fixed point of the cosine function. That means this point satisfies the equation \(\cos x = x.\)
Not all real functions have a fixed point. However, in the lambda calculus, everything is a function, and every function has a fixed point, which we shall constructively prove here.

In the \(\lambda\)-calculus, every expression has a fixed point.

It turns out that there are functions called fixed point combinators that generate the fixed point for any expression \(F\).
Take for instance Curry's paradoxical combinator, or the Y-combinator: \(Y = \lambda f.\, \big(\lambda x.\, f\,(x\, x)\big)\, \big(\lambda x.\, f\,(x\, x)\big).\)

We claim that \(Y\, F\) is a fixed point of any \(F\), i.e. \(F\,(Y\, F) = Y\, F\). Here's how: \(Y\, F \to_\beta \big(\lambda x.\, F\,(x\, x)\big)\,\big(\lambda x.\, F\,(x\, x)\big) \to_\beta F\,\Big(\big(\lambda x.\, F\,(x\, x)\big)\,\big(\lambda x.\, F\,(x\, x)\big)\Big) =_\beta F\,(Y\, F).\)
There are also other fixed-point combinators, such as the Turing combinator \(\Theta = \big(\lambda x.\, \lambda y.\, y\,(x\, x\, y)\big)\,\big(\lambda x.\, \lambda y.\, y\,(x\, x\, y)\big).\)

As an exercise, convince yourself that \(\Theta\) is indeed a fixed point combinator by writing out a proof.
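Haskell's type system rejects the self-application `\x -> f (x x)` directly (it would need an infinite type), but the Y combinator can still be written with a wrapper newtype. `Rec`, `Roll`/`unroll`, and `y` are our own names for this standard trick, and `factorial` shows the fixed point tying a recursive knot:

```haskell
-- A type that can be applied to itself, once wrapped
newtype Rec a = Roll { unroll :: Rec a -> a }

-- Curry's Y combinator, with Roll/unroll mediating the self-application
y :: (a -> a) -> a
y f = (\x -> f (unroll x x)) (Roll (\x -> f (unroll x x)))

-- Recursion without naming the function inside its own body
factorial :: Int -> Int
factorial = y (\rec n -> if n == 0 then 1 else n * rec (n - 1))
```

Here `factorial 5` evaluates to `120`, even though the lambda passed to `y` never refers to `factorial` by name.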
We know that in the standard interpretation of natural numbers, the successor function has no fixed point. However, in the \(\lambda\)-calculus, it does. What interpretation can we assign to that?

Notice that applying the \(Y\) combinator to \(\text{succ}\) does not yield a Church numeral in any finite number of \(\alpha\), \(\beta\), or \(\eta\) reductions. Instead, we get something like this: \(\text{succ}\,\big(\text{succ}\,(\text{succ}\,(\cdots))\big).\)
One could think of it as an undefined value or infinity, in either of which cases a fixed point is understandable.
In Haskell, `fix` (exported by Control.Monad.Fix) is a fixed point combinator, implemented in the following way: `fix f = let x = f x in x`.
Our discussion of the fixed point combinators up to this point has been very abstract. As an attentive reader, you might have wondered if these expressions serve any purpose beyond sheer mathematical beauty. Actually, they do. Since our functions are anonymous, there is no way we can refer to a function itself within its body using its name. To do so, we need to solve functional "equations."

Recursion is a central theme in functional programming. Since we do not have loops, recursion comes to our rescue.

Let us try to implement a function that returns the sum of the first \(n\) natural numbers, where \(n\) is an argument.

Let's call this function \(g\). We'll use recursion to define it first:
- base case: \(g\, 0 = 0\)
- recursion formula: \(g\, n = n + g\,(n-1)\)

In terms of the ideas we've already defined, \(g = \lambda n.\; \text{if}\; (\text{isZero}\; n)\; 0\; \big(\text{plus}\; n\; (g\; (\text{pred}\; n))\big).\)

By an inverse beta reduction on the right-hand side, we get \(g = \Big(\lambda f.\; \lambda n.\; \text{if}\; (\text{isZero}\; n)\; 0\; \big(\text{plus}\; n\; (f\; (\text{pred}\; n))\big)\Big)\, g.\)

Does this look familiar? We've established that \(g\) is a fixed point of \(G = \lambda f.\; \lambda n.\; \text{if}\; (\text{isZero}\; n)\; 0\; \big(\text{plus}\; n\; (f\; (\text{pred}\; n))\big).\)

Hence, we could rewrite the equation as follows: \(g = \Theta\, G,\)

where \(\Theta\) is Turing's fixed point combinator.
Here is the Haskell code for the same with some syntactic sugar:
```haskell
Prelude> let fix f = let x = f x in x
Prelude> fix (\g n -> if (n == 0) then 0 else n + g (n - 1)) 5
15
```
Can you write a program in the \(\lambda\)-calculus to denote the factorial function?
The \(\lambda\)-calculus is a Turing complete language. This means that a universal Turing machine (UTM) can be effectively simulated using the \(\lambda\)-calculus and that a UTM can evaluate functions described in the \(\lambda\)-calculus.

A function \(f\) is computable if there is a corresponding lambda expression \(F\) such that \(F\, x =_\beta f(x)\) for every input \(x\) (encoded, say, as a Church numeral). Because the \(\lambda\)-calculus is Turing equivalent, this is the same definition of computability as the one provided in the context of Turing machines. According to the Church-Turing hypothesis, anything that is physically computable at all falls under this definition.
One of the undecidable things about the \(\lambda\)-calculus is the equivalence of two lambda expressions. This means that there is no algorithm that can always correctly decide whether two given lambda expressions can be reduced to one another. This is an application of Rice's theorem.

Another notable undecidable property of lambda expressions, closely related to the halting problem, is that there is no computable function that can tell whether a given expression has a normal form.

A normal form of a lambda expression is an equivalent form in which no further \(\beta\) or \(\eta\) reductions are possible. For example, \(\lambda x.\, x\) is a normal form, but \((\lambda x.\, x\, x)\,(\lambda x.\, x\, x)\) does not have any normal form, since \(\beta\)-reduction always results in the same expression, with another reduction still available.
Throughout this wiki, we have only discussed the untyped \(\lambda\)-calculus.

Notice that though we define Booleans, integers, functions on them, and so on, there is no formal notion of invalidity of the application of one kind of function to another. In practical programming, it is a good idea to distinguish between types, so as to effectively reason about the data we are working with. It also makes it difficult for a programmer to write a meaningless program in a statically typed language.
A large number of functional programming languages, including Haskell, are built upon ideas from the typed lambda calculus and category theory.

The Curry-Howard-Lambek isomorphism is an interesting mathematical property that relates typed lambda calculus, or programming in general, to propositional logic. Informally stated, it says that to every proposition there is a corresponding type, which, if inhabited (i.e. there is some value of that type), proves that the proposition is true; similarly, to every proof of a proposition there is a corresponding program.