Stone-Weierstrass Theorem

The Stone-Weierstrass theorem is an approximation theorem for continuous functions on closed intervals. It says that every continuous function on the interval $[a,b]$ can be approximated as accurately as desired by a polynomial function. Polynomials are far easier to work with than general continuous functions, so they let mathematicians and computers quickly compute accurate approximations of more complicated functions.

Specifically, let $\mathcal{C}[a,b]$ denote the ring of continuous functions $f: [a,b] \to \mathbb{R}$. The Stone-Weierstrass theorem classifies certain subrings of $\mathcal{C}[a,b]$ that are dense in $\mathcal{C}[a,b]$. In particular, it implies that any continuous function in $\mathcal{C}[a,b]$ may be approximated arbitrarily well, uniformly on $[a,b]$, by polynomial functions. This corollary is known as the Weierstrass approximation theorem.

Approximating $y = \ln(x)$ with a cubic polynomial $y = p(x)$ on the interval $[0.5, 1.5]$; the maximum error on this interval is $\max_{x\in [0.5, 1.5]} |p(x) - \ln(x)| \approx 0.02$.
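The size of this error can be checked numerically. The sketch below assumes that the cubic $p$ is the degree-3 Taylor polynomial of $\ln(x)$ about $x = 1$; the caption does not say which cubic was used, so this choice (and the helper names `taylor_ln_cubic` and `max_error`) are illustrative assumptions only.

```python
import math

def taylor_ln_cubic(x):
    """Degree-3 Taylor polynomial of ln(x) about x = 1 (an assumed choice of p)."""
    u = x - 1.0
    return u - u**2 / 2 + u**3 / 3

def max_error(p, f, lo, hi, samples=2001):
    """Largest |p(x) - f(x)| over an evenly spaced grid on [lo, hi]."""
    step = (hi - lo) / (samples - 1)
    return max(abs(p(lo + i * step) - f(lo + i * step)) for i in range(samples))

err = max_error(taylor_ln_cubic, math.log, 0.5, 1.5)
print(f"max error on [0.5, 1.5]: {err:.4f}")
```

Under this assumption the largest error occurs at the left endpoint $x = 0.5$ and is roughly $0.026$, in line with the figure quoted above.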

Motivation and Proof for Weierstrass Approximation

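Many infinitely differentiable (a.k.a. smooth) functions $f: [a,b] \to \mathbb{R}$, such as $e^x$ and $\sin x$, are arbitrarily well-approximated by their Taylor polynomials. That is, if $T_n (x)$ denotes the $n^\text{th}$ degree Taylor polynomial of such an $f$, then for any $\epsilon > 0$, there exists $N$ such that $\max_{x\in [a,b]} |T_n (x) - f(x)| < \epsilon$ for all $n \ge N$. (This requires the Taylor series to converge uniformly to $f$ on $[a,b]$, which holds for these examples but not for every smooth function.)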

Unfortunately, not all continuous functions are smooth (or even differentiable), so one cannot always approximate using Taylor polynomials. Nonetheless, it is reasonable to hope that any continuous function can be arbitrarily well-approximated by smooth functions. If so, then continuous functions are well-approximated by polynomials whenever those smooth approximations are.

To see why continuous functions should be well-approximated by smooth functions, suppose $g: [a,b] \to \mathbb{R}$ is an arbitrary continuous function. If $h: \mathbb{R} \to \mathbb{R}$ is a smooth function, then \[T_h (x) = \int_{a}^{b} h(x-t) g(t) \, dt\] is itself smooth, by differentiation under the integral sign. One can think of this $h$ as assigning infinitesimal weights to $g(t)$ at each $t\in [a,b]$. The function $T_h$ is called the convolution of $h$ and $g$, and is usually denoted $h\ast g$. A particularly useful property of convolution is that, by a change of variables, one can see that $h\ast g = g\ast h$.

Since $h\ast g$ is essentially a weighted average of $g$, one could hope to find smooth functions $h: \mathbb{R} \to \mathbb{R}$ that make $h\ast g$ very close to $g$ by concentrating the weight of $h$ more and more tightly around $0$, so that $(h\ast g)(x)$ only averages values of $g$ near $x$. This idea is made rigorous in the following proof:

Suppose $h: \mathbb{R} \to \mathbb{R}$ is a smooth function that is positive on $(a,b)$ and zero elsewhere, which also satisfies \[\int_{a}^{b} h(x) \, dx = 1.\] This is a function with "total weight" 1, whose weight is entirely concentrated on the interval $[a,b]$. After translating the interval if necessary, we may assume $a < 0 < b$. Set \[h_s (x) = s \, h(sx);\] each $h_s$ is zero outside $\left[\frac as, \frac bs\right]$ and, for $s \ge 1$, also has $\int_{a}^{b} h_s (x) \, dx = 1$, but the weight becomes more tightly concentrated around $0$ as $s\to\infty$ (i.e., the graphs of $h_s$ become taller and narrower).

Choose $\epsilon > 0$. Since $[a,b]$ is a closed and bounded interval, $g$ is uniformly continuous on $[a,b]$, so there exists $\delta > 0$ such that $|g(x) - g(y)| < \epsilon$ whenever $x, y \in [a,b]$ and $|x-y| < \delta$.

Thus, we have (implicitly using the fact that $h\ast g = g\ast h$, and the triangle inequality) \[\big|(h_s \ast g)(x) - g(x)\big| = \left \vert \int_{a}^{b} h_s (t) g(x-t) \, dt - g(x) \int_{a}^{b} h_s (t) \, dt \right\vert \le \int_{a}^{b} \big|h_s (t)\big| \big|g(x-t) - g(x)\big| \, dt.\] Here $g$ is understood to be extended continuously beyond $[a,b]$ (for instance by the constant values $g(a)$ and $g(b)$), so that $g(x-t)$ is always defined and the same $\delta$ still works. Since $h_s$ is zero outside $\left[\frac as, \frac bs\right]$, this is really an integral over the interval $\left[\frac as, \frac bs\right]$, i.e. \[ \int_{a}^{b} \big|h_s (t)\big| \big|g(x-t) - g(x)\big| \, dt = \int_{a/s}^{b/s} \big|h_s (t)\big| \big|g(x-t) - g(x)\big| \, dt.\] If we take $s$ large enough so that $\frac{\max(|a|, |b|)}{s} < \delta$, then $|t| < \delta$ and hence $|g(x-t) - g(x)| < \epsilon$ for all $t\in \left[\frac as, \frac bs\right]$. We conclude \[|(h_s \ast g)(x) - g(x)| \le \int_{a/s}^{b/s} |h_s (t)| |g(x-t) - g(x)| \, dt < \epsilon \int_{a/s}^{b/s} |h_s (t)| \, dt = \epsilon \int_{a}^{b} h_s (t) \, dt = \epsilon\] for all $x\in [a,b]$. Hence $h_s \ast g$ converges to $g$ uniformly on $[a,b]$ as $s \to \infty$.
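The estimate above can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are not in the text: the kernel is a symmetric bump supported on $[-1, 1]$ (rather than on a general $[a,b]$), the test function is $g(x) = |x|$ sampled on $[-1, 1]$, and the integrals are approximated with a midpoint rule. All function names are hypothetical.

```python
import math

def bump(x):
    """Smooth bump: positive on (-1, 1), zero elsewhere (not yet normalized)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def integrate(f, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    w = (hi - lo) / n
    return w * sum(f(lo + (i + 0.5) * w) for i in range(n))

# Numerical value of the bump's integral, used to give the kernel total weight 1.
BUMP_MASS = integrate(bump, -1.0, 1.0)

def h_s(t, s):
    """Rescaled kernel s * h(s * t): total weight 1, supported on [-1/s, 1/s]."""
    return s * bump(s * t) / BUMP_MASS

def g(x):
    """Continuous but not differentiable at 0; defined on all of R."""
    return abs(x)

def smoothed(x, s):
    """(h_s * g)(x), computed as the integral of h_s(t) g(x - t) over the support of h_s."""
    return integrate(lambda t: h_s(t, s) * g(x - t), -1.0 / s, 1.0 / s)

def sup_error(s, points=41):
    """Largest |(h_s * g)(x) - g(x)| over a grid of x values in [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    return max(abs(smoothed(x, s) - g(x)) for x in xs)

errors = [sup_error(s) for s in (2, 8, 32)]
print(errors)
```

In this sketch the largest error sits at the corner $x = 0$ and shrinks roughly in proportion to $1/s$, which is the behavior the inequality predicts as the kernel's weight concentrates near $0$.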

It remains to construct $h$. Define \[j(x) = \begin{cases} e^{-1/(x-1)^2} \cdot e^{-1/(x+1)^2} & \text{if } x \in (-1, 1), \\ 0 & \text{otherwise}. \end{cases}\] One can check $j$ is smooth, positive on $(-1,1)$, and zero elsewhere. By translating and scaling $j$ and dividing by its integral, we obtain the desired $h$. $_\square$
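One way to make the translating and scaling step explicit is sketched here; the constant $c$ is notation introduced only for this illustration. Writing $c = \int_{-1}^{1} j(u) \, du > 0$, set \[h(x) = \frac{2}{c\,(b-a)} \, j\!\left(\frac{2x - a - b}{b-a}\right).\] The map $x \mapsto u = \frac{2x-a-b}{b-a}$ sends $(a,b)$ onto $(-1,1)$, so $h$ is smooth, positive on $(a,b)$, and zero elsewhere. The substitution $du = \frac{2}{b-a} \, dx$ then gives \[\int_{a}^{b} h(x) \, dx = \frac{2}{c\,(b-a)} \cdot \frac{b-a}{2} \int_{-1}^{1} j(u) \, du = 1.\]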

Statement of Stone-Weierstrass Theorem

The Weierstrass approximation theorem states that the polynomials are dense in $\mathcal{C}[a,b]$. More generally, the Stone-Weierstrass theorem gives a classification of all the subrings of $\mathcal{C}[a,b]$ that are dense in $\mathcal{C}[a,b]$.

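A subring $R \subset \mathcal{C}[a,b]$ is called a subalgebra if it is also closed under multiplication by real scalars: for any $f \in R$ and $c \in \mathbb{R}$, the function $cf$ is also in $R$. (Being a subring, $R$ is already closed under sums and products.)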

A subring $R \subset \mathcal{C}[a,b]$ is said to separate points if, for any distinct $x$, $y \in [a,b]$, there is some $f\in R$ such that $f(x) \neq f(y)$.

(Stone-Weierstrass) Suppose $R \subset \mathcal{C}[a,b]$ is a subalgebra containing some nonzero constant function. Then, $R$ is dense in $\mathcal{C}[a,b]$ if and only if $R$ separates points.
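To see why separating points cannot be dropped, consider the following small example (an illustration added here, with $[a,b] = [-1,1]$ chosen for convenience). Let $R$ be the set of even polynomials, i.e. polynomials in $x^2$. Then $R$ is a subalgebra containing the constants, but it does not separate $x$ from $-x$. Take $f(x) = x$. Every $p \in R$ satisfies $p(1) = p(-1)$, so \[\big|p(1) - f(1)\big| + \big|p(-1) - f(-1)\big| = \big|p(1) - 1\big| + \big|p(1) + 1\big| \ge 2,\] and therefore \[\max_{x\in [-1,1]} \big|p(x) - f(x)\big| \ge 1.\] No even polynomial comes within distance $1$ of $f$, so $R$ is not dense in $\mathcal{C}[-1,1]$.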

The following example illustrates the power of Stone-Weierstrass, showing how to recover the Weierstrass approximation theorem from it.

Recover the Weierstrass Approximation theorem from the Stone-Weierstrass Theorem.

Let $\mathcal{P} \subset \mathcal{C}[a,b]$ denote the subalgebra of polynomials. Certainly $\mathcal{P}$ contains a nonzero constant function, since all constant functions are degree-zero polynomials.

To see that $\mathcal{P}$ separates points, choose distinct $x$, $y \in [a,b]$. Then $p(t) = t-x$ is a polynomial with \[p(x) = 0 \neq y-x = p(y).\] Thus, by the Stone-Weierstrass theorem, we conclude $\mathcal{P}$ is dense in $\mathcal{C}[a,b]$. $_\square$
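The density of $\mathcal{P}$ can also be observed numerically. The sketch below uses Bernstein polynomials, a classical explicit construction that is different from both the convolution argument and the Stone-Weierstrass argument in this article; it is included only as an illustration. It assumes the interval $[0,1]$ and the test function $f(x) = |x - \tfrac12|$, which is continuous but not differentiable at $x = \tfrac12$. The function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

def bernstein(f, n, x):
    """Value at x in [0, 1] of the degree-n Bernstein polynomial of f."""
    return sum(
        f(k / n) * math.comb(n, k) * x**k * (1.0 - x)**(n - k)
        for k in range(n + 1)
    )

def f(x):
    """Continuous on [0, 1] with a corner at x = 1/2."""
    return abs(x - 0.5)

def sup_error(n, points=201):
    """Largest |B_n f(x) - f(x)| over an evenly spaced grid on [0, 1]."""
    xs = [i / (points - 1) for i in range(points)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in xs)

bernstein_errors = [sup_error(n) for n in (10, 40, 160)]
print(bernstein_errors)
```

The errors decrease as the degree grows, although slowly near the corner, which is consistent with the theorem: it guarantees that polynomial approximations exist for every tolerance but says nothing about how fast a particular construction converges.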
