Lebesgue Integration

Lebesgue integration is an alternative way of defining the integral in terms of measure theory that is used to integrate a much broader class of functions than the Riemann integral or even the Riemann-Stieltjes integral. The idea behind the Lebesgue integral is that instead of approximating the total area by dividing it into vertical strips, one approximates the total area by dividing it into horizontal strips. This corresponds to asking "for each $y$ -value, how many $x$ -values produce this value?" as opposed to asking "for each $x$ -value, what $y$ -value does it produce?"

Because the Lebesgue integral is defined in terms of a measure rather than the interval structure of $\mathbb{R}$, it can integrate many functions that have no Riemann integral. Furthermore, the same definition works on any abstract measure space, which is the setting on which modern probability theory is built.

Intuition

The Lebesgue integral works by calculating the value of an integral based on $y$ -values instead of $x$ -values.

Let

$f(x)=\begin{cases} \frac{1}{4} \text{ if } 0\leq x\leq \frac{3}{4}\\\\ \frac{1}{2}\text{ if } \frac{3}{4}<x\leq 1. \end{cases}$

What is the value of $\int_0^1 f(x)\, dx$ ?

This graph consists of two line segments, so the area under it can be thought of as two rectangles, so the integral has value $\frac{3}{4}\cdot \frac{1}{4}+\frac{1}{4}\cdot \frac{1}{2}=\frac{5}{16}.$ When we use the Riemann integral though, we're actually thinking about this slightly differently: we're drawing many smaller rectangles, and using them to "approximate" the large rectangles, although in this case the approximation is exact.

The Lebesgue integral thinks about this problem in a different way: the function $f$ takes only the values $\frac14$ and $\frac12$ , so we consider the size of the sets on which $f$ takes those values. They are $\frac34$ and $\frac14$ respectively, so the total area must be $\frac{1}{4}\cdot \frac{3}{4}+\frac{1}{2}\cdot \frac{1}{4}=\frac{5}{16}.\ _\square$
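
The two viewpoints on this step function can be compared numerically. The following is a minimal sketch, not a general integrator: the names `riemann_sum` and `lebesgue_sum` are illustrative, the Riemann sum uses right endpoints with equal strips, and the lengths of the level sets are entered by hand from the example.

```python
from fractions import Fraction as F

def f(x):
    """Step function from the example: 1/4 on [0, 3/4], 1/2 on (3/4, 1]."""
    return F(1, 4) if x <= F(3, 4) else F(1, 2)

def riemann_sum(n):
    """Right-endpoint Riemann sum with n equal vertical strips on [0, 1]."""
    width = F(1, n)
    return sum(f(k * width) * width for k in range(1, n + 1))

def lebesgue_sum():
    """Sum of (value) * (length of the set where f takes that value)."""
    level_sets = {
        F(1, 4): F(3, 4),  # f = 1/4 on [0, 3/4], a set of length 3/4
        F(1, 2): F(1, 4),  # f = 1/2 on (3/4, 1], a set of length 1/4
    }
    return sum(value * length for value, length in level_sets.items())

riemann = riemann_sum(8)   # strips line up with the jump at 3/4, so this is exact
lebesgue = lebesgue_sum()
```

Both computations give $\frac{5}{16}$; the Riemann sum needs a partition, while the Lebesgue sum needs only the two values and the sizes of their level sets.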

In this case, the distinction between the two ways of thinking about the area is meaningless, but as the following example shows, this is not always the case.

Let

$f(x)=\begin{cases} 1\text{ if } x\text{ is rational}\\\\ 0\text{ if }x\text{ is irrational}. \end{cases}$

What is the value of $\int_0^1 f(x)\, dx$ ?

If we try to use the Riemann integral here, we run into trouble: every interval contains both rational and irrational numbers, so on every subinterval of every partition the supremum of $f$ is 1 and the infimum is 0. Every upper sum is therefore 1 and every lower sum is 0, so the Riemann integral does not exist. But from the perspective of the Lebesgue integral, since $f$ takes only two values, 0 and 1, all we need to do is find the size of the sets on which it takes those values and multiply by the corresponding values.

The rationals are countable, and a countable set has measure 0, so the measure of the rationals in $(0,1)$ is 0 and the measure of the irrationals in $(0,1)$ is $1-0=1$. Since $f$ takes the value 1 on the rationals, they contribute $1\cdot 0=0$ to the integral; similarly, the irrationals contribute $0\cdot 1=0$. Thus the value of the integral is 0. $_\square$

In essence, the Lebesgue integral is looking at how often a function achieves a certain value rather than the value of a function at a particular point. According to Reinhard Siegmund-Schultze [1], Lebesgue himself explained this idea in a letter to Paul Montel, writing:

"I have to pay a certain sum, which I have collected in my pocket. I take the bills and coins out of my pocket and give them to the creditor in the order I find them until I have reached the total sum. This is the Riemann integral. But I can proceed differently. After I have taken all the money out of my pocket I order the bills and coins according to identical values and then I pay the several heaps one after the other to the creditor. This is my integral."

Lebesgue Measure

To define the Lebesgue integral formally, the notion of the "size of a set" must be formalized. This can be done with the concept of the Lebesgue measure.

The Lebesgue measure of the interval $(a,b)$ is $\mu\big((a,b)\big)=b-a$ .

Since any open set in $\mathbb{R}$ is a countable union of disjoint open intervals, this definition can be extended to any open set.

The Lebesgue measure of an open set $G=\bigcup (a_n,b_n)$, written as a union of disjoint open intervals, is the sum of the lengths of these intervals, i.e. $\mu(G)=\sum (b_n-a_n)$.
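
For a finite union of open intervals, this definition can be carried out directly. The sketch below assumes each interval is given as a pair `(a, b)` with `a < b`; the function name `measure_of_open_set` is illustrative. Overlapping intervals are first merged so that each point is counted only once.

```python
from fractions import Fraction as F

def measure_of_open_set(intervals):
    """Total length of a finite union of open intervals (a, b) with a < b."""
    total = F(0)
    start = end = None
    for a, b in sorted(intervals):
        if end is None or a >= end:
            # (a, b) does not overlap the current component:
            # close that component and start a new one.
            if end is not None:
                total += end - start
            start, end = a, b
        else:
            # (a, b) overlaps the current component: extend it.
            end = max(end, b)
    if end is not None:
        total += end - start
    return total

overlapping = measure_of_open_set([(0, 1), (F(1, 2), 2), (3, 4)])
touching = measure_of_open_set([(0, 1), (1, 2)])
```

Here $(0,1)\cup(\frac12,2)\cup(3,4)=(0,2)\cup(3,4)$, so the first call sums the lengths of two disjoint intervals.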

Finally, for an arbitrary set $A$ , the Lebesgue measure is defined by approximating $A$ with open sets.

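The Lebesgue measure of a set $A$ is $\mu(A)=\inf_{\stackrel{A\subset G}{G\text{ open}}} \mu(G).$ Strictly speaking, this infimum is the Lebesgue outer measure of $A$; it behaves as a genuine measure (in particular, it is countably additive) only on the measurable sets, a class that includes every set used in the examples here.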

The Lebesgue measure of a single point $A=\{a\}$ is 0, because the open set $G_{\epsilon}=(a-\epsilon, a+\epsilon)$ always contains $A$ . The measure of $G_{\epsilon}$ is $2\epsilon$ , so letting $\epsilon\to 0$ shows that $\mu(A)=0$ .
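
The same covering argument extends from a single point to a countable set such as the rationals: cover the $n$-th point with an open interval of length $\epsilon/2^n$, so that the total length used is at most $\epsilon$. The sketch below checks this bound for the first few rationals in $(0,1)$; the enumeration order and the helper names are assumptions made for illustration.

```python
from fractions import Fraction as F

def rationals_in_unit_interval(count):
    """First `count` distinct rationals in (0, 1), listed by increasing denominator."""
    seen, out = set(), []
    q = 2
    while len(out) < count:
        for p in range(1, q):
            r = F(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover_length(points, eps):
    """Total length when the n-th point gets an open interval of length eps / 2**n."""
    return sum(eps / 2**n for n, _ in enumerate(points, start=1))

eps = F(1, 1000)
points = rationals_in_unit_interval(50)
total = cover_length(points, eps)  # eps * (1 - 2**-50), always below eps
```

However many points are covered, the total length stays below $\epsilon$, which is why a countable set has measure 0.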

Lebesgue Integration on $\mathbb{R}$

With the Lebesgue measure in hand, the Lebesgue integral can be defined. The first class of functions for which it is defined is the class of positive simple functions.

A simple function is a (measurable) function that takes on only finitely many distinct values.

The function

$f(x)=\begin{cases} 1\text{ if } x\text{ is rational}\\ 0\text{ if }x\text{ is irrational} \end{cases}$

is simple because it only takes on the values 0 and 1, but the function $f(x)=\lfloor x\rfloor$ is not, because it takes on any value in $\mathbb Z$ , of which there are infinitely many.

Any positive simple function $f$ can be written as a linear combination of characteristic functions of sets, say $f=c_1\chi_{A_1}+\cdots+c_n\chi_{A_n}.$ Then, defining the integral of such a function with the idea of the previous sections is straightforward.

For the positive simple function $f=c_1\chi_{A_1}+\cdots+c_n\chi_{A_n}$, the Lebesgue integral of $f$ is $\int_{\mathbb{R}} f\, d\mu=c_1\mu(A_1)+\cdots +c_n\mu(A_n).$

That is, the size of each set is multiplied by the value $f$ takes on that set, and these are summed up to give the total integral.
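
As a minimal sketch of this definition, a simple function can be represented by its list of pairs $(c_i, \mu(A_i))$. The representation and the name `integrate_simple` are assumptions made for illustration, and the measures are entered by hand.

```python
from fractions import Fraction as F

def integrate_simple(terms):
    """Integral of f = sum of c_i * chi_{A_i}, given as (c_i, mu(A_i)) pairs."""
    return sum(c * measure for c, measure in terms)

# Step function from the Intuition section: 1/4 on a set of length 3/4,
# 1/2 on a set of length 1/4.
step = integrate_simple([(F(1, 4), F(3, 4)), (F(1, 2), F(1, 4))])

# Indicator of the rationals, restricted to [0, 1]: value 1 on a set of
# measure 0, value 0 on a set of measure 1.
dirichlet = integrate_simple([(1, 0), (0, 1)])
```

The two results match the values $\frac{5}{16}$ and $0$ computed in the Intuition section.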

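This idea can be extended to a negative simple function $f$: write $-f=c_1\chi_{A_1}+\cdots+c_n\chi_{A_n}$ with each $c_i>0$ and set $\int_{\mathbb{R}} f\, d\mu=-\big(c_1\mu(A_1)+\cdots +c_n\mu(A_n)\big).$ If $f$ takes on both positive and negative values, it can be broken into a positive and a negative part, $f=f^+-f^-$ with $f^+=\max(f,0)$ and $f^-=\max(-f,0)$ both nonnegative, and these parts can be integrated separately: $\int_{\mathbb{R}} f\, d\mu=\int_{\mathbb{R}} f^+\, d\mu-\int_{\mathbb{R}} f^-\, d\mu.$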
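
For a simple function with coefficients of both signs, the splitting can be sketched as follows. This assumes the sets $A_i$ are pairwise disjoint, so that the terms with positive coefficients make up the positive part and the terms with negative coefficients make up the negative part; the name `split_and_integrate` is illustrative.

```python
def split_and_integrate(terms):
    """Integrate the positive and negative parts of a simple function separately.

    `terms` is a list of (c_i, mu(A_i)) pairs with the A_i pairwise disjoint.
    Returns (integral of the positive part,
             integral of the negative part, taken as a nonnegative number,
             integral of f).
    """
    positive = sum(c * m for c, m in terms if c > 0)
    negative = sum(-c * m for c, m in terms if c < 0)
    return positive, negative, positive - negative

# f = 2 on (0, 1) and f = -3 on (1, 3), zero elsewhere.
pos, neg, total = split_and_integrate([(2, 1), (-3, 2)])
```

Here the positive part integrates to 2, the negative part to 6, and the integral of $f$ is $2-6=-4$.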

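Finally, the Lebesgue integral of an arbitrary nonnegative function $f$ is defined by approximating $f$ from below by simple functions.

For a nonnegative (measurable) function $f$, the Lebesgue integral of $f$ is $\int_{\mathbb{R}} f\, d\mu =\sup_{\stackrel{0\leq g\leq f}{g\text{ simple}}} \int_{\mathbb{R}} g\, d\mu.$ A function that takes both signs is again integrated by splitting it into its positive and negative parts.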
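
To see the supremum at work, consider $f(x)=x$ on $[0,1)$ and the simple functions $g_n(x)=\lfloor 2^n x\rfloor/2^n$, which round $f$ down to the nearest multiple of $2^{-n}$. This particular choice of $f$ and $g_n$ is an illustrative assumption, not the only possible one. Each $g_n$ lies below $f$ and takes the value $k/2^n$ on a set of length $2^{-n}$.

```python
from fractions import Fraction as F

def lower_simple_integral(n):
    """Integral of g_n = floor(2**n * x) / 2**n over [0, 1)."""
    levels = 2**n
    width = F(1, levels)  # measure of {x : k/2**n <= x < (k+1)/2**n}
    return sum(F(k, levels) * width for k in range(levels))

approximations = [lower_simple_integral(n) for n in range(1, 6)]
# [1/4, 3/8, 7/16, 15/32, 31/64]
```

The integrals increase toward $\frac12$, which is the value of $\int_0^1 x\, dx$.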

To recap, the Lebesgue integral is defined in the intuitive way for simple functions, and then extended to general functions by approximating them with simple functions.

How would you define the Lebesgue integral over a set $X$ , as opposed to over all of $\mathbb{R}$ , for a simple function $f$ ? Think about how to capture the idea of the size of the set where $f$ is taking a certain value, but only on $X$ .

For the simple function $f=c_1\chi_{A_1}+\cdots+c_n\chi_{A_n}$ , you would let $\int_X f\, d\mu= c_1\mu(A_1\cap X)+\cdots+c_n\mu(A_n\cap X).\ _\square$
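
A small sketch of this restricted integral, under the simplifying assumption that every $A_i$ and the set $X$ are single open intervals, so that $\mu(A_i\cap X)$ is just the length of the overlap. The helper names are illustrative.

```python
from fractions import Fraction as F

def interval_overlap(a, b):
    """Length of the intersection of the open intervals a = (a0, a1) and b = (b0, b1)."""
    return max(F(0), min(a[1], b[1]) - max(a[0], b[0]))

def integrate_simple_over(terms, X):
    """Integral over X of f = sum of c_i * chi_{A_i}, given (c_i, A_i) pairs."""
    return sum(c * interval_overlap(A, X) for c, A in terms)

# Step function from the Intuition section (endpoints do not affect the measure).
terms = [
    (F(1, 4), (F(0), F(3, 4))),
    (F(1, 2), (F(3, 4), F(1))),
]
over_right_half = integrate_simple_over(terms, (F(1, 2), F(1)))
over_unit_interval = integrate_simple_over(terms, (F(0), F(1)))
```

Over $X=(\frac12,1)$ the result is $\frac14\cdot\frac14+\frac12\cdot\frac14=\frac{3}{16}$, and over $(0,1)$ it recovers $\frac{5}{16}$.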

The Lebesgue integral can also be defined over a general measure space $(X, \Sigma, \mu)$ , with the exact same ideas.

Properties of the Lebesgue Integral

The Lebesgue integral satisfies several nice properties:

Linearity: $\int (c_1f_1+c_2f_2)\, d\mu =c_1\int f_1\, d\mu+c_2\int f_2\, d\mu.$
Monotone convergence theorem: If $0\leq f_1\leq f_2\leq\cdots$ and $f_n\to f$ pointwise, then $\int f_n\, d\mu \to \int f\, d\mu.$
Dominated convergence theorem: If $|f_n|\leq g$ for all $n$ for some integrable function $g$, and $f_n\to f$ pointwise, then $\int f_n\, d\mu \to \int f\, d\mu.$
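
The hypotheses in the two convergence theorems matter. A standard example is the sequence of spikes $f_n=n\chi_{(0,1/n)}$: it converges to 0 at every point, yet every $f_n$ has integral 1. The sequence is not monotone, and no integrable $g$ dominates all of the $f_n$, so neither theorem applies. The sketch below checks this; the function names are illustrative.

```python
from fractions import Fraction as F

def integral_of_spike(n):
    """Integral of f_n = n * chi_{(0, 1/n)}: the value n times the measure 1/n."""
    return n * F(1, n)

def spike(n, x):
    """Value of f_n at the point x."""
    return n if 0 < x < F(1, n) else 0

integrals = [integral_of_spike(n) for n in range(1, 11)]

# At any fixed x > 0 the spike eventually moves past x, so f_n(x) becomes 0.
x = F(1, 3)
tail_values = [spike(n, x) for n in range(3, 11)]
```

The integrals are all 1 while the pointwise limit is the zero function, whose integral is 0.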

References

[1] Gowers, T., Leader, I., & Barrow-Green, J. The Princeton Companion to Mathematics. Princeton University Press.
