# Taylor's Theorem (with Lagrange remainder)

The **Taylor series** of a function is extremely useful in applications, and is fundamental to the whole *theory of functions*. Recall that, if \( f(x) \) is infinitely differentiable at \(x=a\), the Taylor series of \(f(x)\) at \(x=a\) is defined by

\[\sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!} (x-a)^n = f(a) + f'(a) (x-a) + \frac{f''(a)}{2}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \cdots.\]

For \( f(x)=\sin x\) around \( x=0 \), it's easy to compute all the \( f^{(n)}(0) \) and see that the Taylor series converges for all \( x\in\mathbb R \) (by the ratio test), but it's by no means obvious that it should converge to \( \sin x \). After all, the derivatives at \( x=0 \) only depend on the values of \( \sin x \) close to \( x=0. \) Why should you expect that somehow it "knows" the values of the function far away?
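As a quick sanity check, the series can be compared against a computer algebra system; this sketch assumes the third-party `sympy` library is available:

```python
import sympy as sp

x = sp.symbols('x')
# Taylor polynomial of sin(x) at 0 through degree 7, computed by sympy
series = sp.sin(x).series(x, 0, 8).removeO()
# coefficients f^{(n)}(0)/n! follow the repeating pattern 0, 1, 0, -1
expected = x - x**3/6 + x**5/120 - x**7/5040
assert sp.simplify(series - expected) == 0
```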

That the Taylor series converges to the function itself must be a nontrivial fact. Most calculus textbooks would invoke a so-called **Taylor's theorem (with Lagrange remainder)**, and would probably mention that it is a generalization of the mean value theorem. The proof of Taylor's theorem in its full generality is short but not very illuminating. Fortunately a very natural derivation based only on the fundamental theorem of calculus (and a little bit of multi-variable thinking) is all one would need for most functions. In fact, the Taylor series itself falls out of this derivation, along with the various "forms" of the remainder.

## Derivation from FTC

We start with the **fundamental theorem of calculus** (FTC) in what should be its most natural form:

\[ f(x) = f(a) + \int_a^x {\color{red}f'(x_1)}\, dx_1.\]

The expression naturally requires that \( f \) is differentiable \((\)i.e. \( f' \) exists\()\) and \( f' \) is continuous between \( a \) and \( x \)--we shall say \(f\) is *continuously differentiable* for short (or \(f\in C^1\)). You could allow \(f'\) to have some jump discontinuities, but we'll soon see that more differentiability will come up, not less. For the sake of definiteness, imagine that \( x \) is bigger than \( a \) and \(x_1\) is a variable running from \(a\) to \(x\).

If, furthermore, \( f' \) is continuously differentiable (we say \(f\) is *twice continuously differentiable*, or \(f\in C^2\)), we can apply the FTC to \(f'\) on the interval \([a, x_1]\):

\[ \color{red} f'(x_1) = f'(a) + \int_a^{x_1} {\color{green}f''(x_2)}\, dx_2.\]

Putting this into the expression for \( f(x) \), we have

\[ \begin{align*} f(x) &= f(a) + \int_a^x \left( {\color{red} f'(a) + \int_a^{x_1} {\color{green}f''(x_2)}\, dx_2} \right) dx_1 \\ &= f(a) + f'(a) (x-a) + \int_a^x \int_a^{x_1} {\color{green}f''(x_2)}\,dx_2\, dx_1. \end{align*}\]

Playing this game again, if \(f''\) is continuously differentiable (i.e., \(f\in C^3\)), we could write

\[ \color{green} f''(x_2) = f''(a) + \int_a^{x_2} {\color{orange}f'''(x_3)}\, dx_3,\]

so now

\[ \begin{align*} f(x) &= f(a) + f'(a) (x-a) + \int_a^x \int_a^{x_1} \left( {\color{green}f''(a) + \int_a^{x_2} {\color{orange}f'''(x_3)}\, dx_3 }\right) \,dx_2\, dx_1 \\ &= f(a) + f'(a) (x-a) + f''(a)\frac{(x-a)^2}{2} + \int_a^x \int_a^{x_1} \int_a^{x_2} {\color{orange}f'''(x_3)}\, dx_3\, dx_2\, dx_1. \end{align*}\]

This clearly generalizes as follows:

If \(f(x)\) is \( n+1 \) times continuously differentiable (\(f\in C^{n+1}\)) on an interval containing \(a\), then \[ f(x) = \sum_{k=0}^n \frac{f^{(k)}(a)}{k!} (x-a)^k + R_n(x), \] where the remainder is given by \[ R_n(x) = \int_a^x \int_a^{x_1} \ldots \int_a^{x_n} f^{(n+1)}(x_{n+1})\,dx_{n+1}\ldots dx_2\, dx_1.\]

Verify it for \(f(x)=\sin x\), \(a=0\), and \(n=3\).
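This verification can also be carried out by machine. A sketch assuming the `sympy` library: the quadruple iterated integral of \(f^{(4)}=\sin\) is computed innermost-first, then added to the degree-\(3\) Taylor polynomial.

```python
import sympy as sp

x, x1, x2, x3, x4 = sp.symbols('x x1 x2 x3 x4')

# R_3(x): iterated integral of f'''' = sin over 0 <= x4 <= x3 <= x2 <= x1 <= x
R3 = sp.integrate(sp.sin(x4), (x4, 0, x3))
R3 = sp.integrate(R3, (x3, 0, x2))
R3 = sp.integrate(R3, (x2, 0, x1))
R3 = sp.integrate(R3, (x1, 0, x))

partial = x - x**3/6   # sum of f^{(k)}(0) x^k / k! for k = 0..3
assert sp.simplify(partial + R3 - sp.sin(x)) == 0
```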

## The Remainder

The remainder \(R_n(x) \) as given above is an *iterated integral*, or a multiple integral that one would encounter in multi-variable calculus. This may have contributed to the fact that Taylor's theorem is rarely taught this way.

For \(n=1\) it is a double integral, and one could imagine it as the (signed) volume under the surface defined by (the graph of) a function of two variables. For our case the integrand only depends on the variable \(x_2\), so it would be easier if we could integrate over the \(x_1\) variable first. Indeed we could do so, with the help of Fubini's theorem:
\[\begin{align*}
R_1(x) &=\int_a^x \int_a^{x_1} f''(x_2)\,dx_2\,dx_1 \\
&= \int_a^x \int_{x_2}^x f''(x_2)\, dx_1\, dx_2\\
&= \int_a^x f''(x_2) (x-x_2)\, dx_2. \end{align*} \]
Note that the limits of integration were changed in accordance with the relative positions of the two variables, namely \(a\leq x_2\leq x_1\leq x\). In fact, one should regard the integration as over a (right-angled) triangle in the \(x_1 x_2\)-plane. For the general case of \(R_n(x)\), the region of integration of the \((n+1)\)-fold integral is defined by \(a\leq x_{n+1}\leq x_n\leq \cdots \leq x_1\leq x\), and by performing the integration over \(x_1, \ldots , x_n\) (with \(x_{n+1}\) fixed), we obtain the volume of a right-angled "\(n\)-simplex" with side length \(x-x_{n+1}\), which is
\[\frac{(x-x_{n+1})^n}{n!},\]
and we have the **integral form** of the remainder:

Under the same condition, \[ R_n(x) = \int_a^x f^{(n+1)}(\xi) \frac{(x-\xi)^n}{n!}\,d\xi. \]
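The integral form is easy to confirm numerically. A plain-Python sketch, with the illustrative choices \(f=\exp\), \(a=0\), \(x=1.5\), \(n=4\): the remainder integral, approximated by the midpoint rule, should match \(f(x)\) minus the Taylor polynomial.

```python
import math

def taylor_partial(x, a, n):
    # Taylor polynomial of exp at a, degree n (every derivative of exp is exp)
    return sum(math.exp(a) * (x - a)**k / math.factorial(k) for k in range(n + 1))

def remainder_integral(x, a, n, steps=10000):
    # integral form: R_n(x) = int_a^x e^t (x - t)^n / n! dt, via the midpoint rule
    h = (x - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += math.exp(t) * (x - t)**n / math.factorial(n)
    return total * h

x, a, n = 1.5, 0.0, 4
lhs = math.exp(x) - taylor_partial(x, a, n)   # f(x) minus the Taylor polynomial
rhs = remainder_integral(x, a, n)             # integral form of R_n
assert abs(lhs - rhs) < 1e-7
```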

By the "real" mean value theorem (i.e. the mean value theorem for integrals), this integral equals its "mean value"--the value of the integrand at some point \(\xi\in (a,x)\), attained by continuity--multiplied by the length \(x-a\). Thus we obtain the remainder in the **form of Cauchy**:

(Cauchy) \[ R_n(x) = \frac{f^{(n+1)}(\xi)}{n!} (x-\xi)^n (x-a) \quad \text{for some }\ \xi\in(a, x). \]

Finally, to obtain the **form of Lagrange**, we simply go back to the original iterated integral and apply the multi-variable version of the "real" mean value theorem: a multiple integral of a continuous integrand over a bounded, *connected* region equals its "mean value," attained at some point in the domain by continuity of the integrand, multiplied by the "volume" of the region of integration. Here the region is the simplex \(a\leq x_{n+1}\leq\cdots\leq x_1\leq x\), whose volume is \(\frac{(x-a)^{n+1}}{(n+1)!}\). Thus we have

(Lagrange) \[ R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} (x-a)^{n+1} \quad \text{for some }\ \xi\in(a, x). \]

Note that this is in general a different \(\xi\) from the one in the Cauchy remainder. The Lagrange remainder is easy to remember since it is the same expression as the next term in the Taylor series, except that \(f^{(n+1)}\) is evaluated at a point \(\xi\) instead of at \(a\).
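For \(f=\exp\) the Lagrange identity can even be solved for \(\xi\) explicitly, which makes a convenient numerical check (the choices \(x=1\), \(n=3\), \(a=0\) are arbitrary):

```python
import math

# Lagrange form for f = exp at a = 0:
#   R_n(x) = e^xi * x^(n+1) / (n+1)!   for some xi in (0, x)
x, n = 1.0, 3
partial = sum(x**k / math.factorial(k) for k in range(n + 1))
R = math.exp(x) - partial                               # actual remainder
xi = math.log(R * math.factorial(n + 1) / x**(n + 1))   # solve the identity for xi
assert 0 < xi < x
```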

One could also obtain other forms of the remainder by integrating some but not all of the \(x_1,\ldots, x_n\) variables, and apply the mean value theorem to the other variables.

It should also be mentioned that the integral form is typically obtained by successive application of integration by parts, which avoids multiple integrals. However, it could be considered the same proof (up to homotopy, in some sense) because integration by parts, in essence, is saying that one could compute a certain area by integrating over the \(x\) variable or over the \(y\) variable.

## Convergence of Taylor Series

When \(f(x)\) is infinitely differentiable (\(f\in C^\infty\)), we have the full Taylor series with a remainder \(R_n(x)\) for each \(n\), in any of the forms above, *regardless* of whether the Taylor series converges at all. Proving that a particular Taylor series converges to the function amounts to giving a bound on \(R_n(x)\) that tends to \(0\) as \(n\to\infty\), with \(x\) fixed.

\(f(x)=\sin x\) is infinitely differentiable, and all the derivatives \(f^{(n)}(x)\) are one of four possibilities, namely \(\pm\cos x\) and \(\pm\sin x\). Therefore, in any of the forms of the remainder, we could bound \(\big|f^{(n+1)}(\xi)\big|\) by \(1\), so that (using the Lagrange form) \[\big|R_n(x)\big| \leq \frac{|x-a|^{n+1}}{(n+1)!} \to 0 \quad \text{as }\ n\to\infty\] for any \(x\in\mathbb R\). Therefore, the Taylor series of \(\sin x\), generated at any point \(a\in\mathbb R\), indeed converges to \(\sin x\) for all \(x\in\mathbb R\).
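This bound is easy to watch numerically. A plain-Python sketch (the test point \(x=10\), deliberately far from \(a=0\), is an arbitrary choice): the actual error of each partial sum stays below the Lagrange bound, and both shrink rapidly once \(n\) exceeds \(|x|\).

```python
import math

def sin_taylor(x, n):
    # Taylor polynomial of sin at a = 0, degree n
    total = 0.0
    for k in range(n + 1):
        c = (0, 1, 0, -1)[k % 4]   # derivatives of sin at 0 cycle through 0, 1, 0, -1
        total += c * x**k / math.factorial(k)
    return total

x = 10.0   # far from the expansion point a = 0
for n in (5, 15, 25, 35):
    err = abs(sin_taylor(x, n) - math.sin(x))
    bound = abs(x)**(n + 1) / math.factorial(n + 1)   # Lagrange bound with |f^(n+1)| <= 1
    assert err <= bound
```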

Now it would seem natural to apply this argument to as many functions as possible, and to hope for a general theorem describing, perhaps by a simple criterion or test, which functions enjoy the property that the Taylor series, whenever it converges, converges to the function itself. Unfortunately not all \(C^\infty\) functions have this property. The famous (counter)example is
\[\phi(x)=\begin{cases} e^{-1/x} & x>0 \\ 0 & x\leq 0 \end{cases}\]
for which all the derivatives vanish at \(x=0\), so the Taylor series does not converge to \(\phi(x)\) for \(x>0\). The existence of this particular function is highly significant, as it gives rise to a rich reservoir of *smooth* functions on \(\mathbb R^n\) that can have any desired support (often called *bump functions* in the study of smooth manifolds, and *test functions* in the theory of generalized functions or distributions).
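A numerical illustration in plain Python: \(\phi\) is positive for \(x>0\), yet \(e^{-1/x}\) decays faster than any power of \(x\) as \(x\to 0^+\), which is why every derivative at \(0\) vanishes.

```python
import math

def phi(x):
    # smooth everywhere, but not analytic at 0
    return math.exp(-1.0 / x) if x > 0 else 0.0

assert phi(0.1) > 0   # not identically zero for x > 0
# e^{-1/x} is crushed by every power of x near 0, so all Taylor coefficients vanish
h = 1e-2
for k in (1, 2, 3):
    assert phi(h) / h**k < 1e-30
```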

But as far as Taylor series are concerned, these smooth functions are bad. The "good" functions, characterized precisely by the property that their Taylor series always converge to the functions themselves, are called **(real) analytic**, sometimes denoted \(f\in C^\omega\) to suggest that this is stronger than being \(C^\infty\). The best way to *explain* why \(\phi(x)\) is not analytic is to go into the complex domain: even though the two pieces "stitch" together smoothly along the real axis, it is not possible to extend \(\phi\) to a holomorphic function on any neighborhood of \(0\) in the complex plane, however small. In fact, the behavior of \(e^{-1/x}\) for small complex \(x\) is extremely wild (see Picard's theorem on essential singularities). The following theorem, rarely mentioned in calculus as it is considered "outside the scope" of a real-variable course, gives *the* natural criterion for analyticity that bypasses Taylor's theorem and the trouble of estimating the remainder.

If \(f(x)\) is a (real- or complex-valued) function on an open set \(I\subseteq\mathbb R\), and it extends to (i.e. agrees with) a holomorphic function on a complex neighborhood \(U\subseteq \mathbb C\) of \(I\), then the Taylor series of \(f(x)\) at any point \(a\in I\) converges to \(f(x)\) within its radius of convergence. In fact, the radius of convergence is the largest \(r\) such that \(f\) admits a holomorphic extension whose domain includes the open disk \( \{z\in\mathbb C: |z-a|<r\} \).

See analytic continuation for more details.

## Relaxing the Condition

The condition in Taylor's theorem (with Lagrange remainder) can be relaxed a little, so that \( f^{(n+1)}\) is no longer assumed to be continuous (and the derivation above breaks down), but merely to exist on the open interval \( (a, x) \). This is akin to the mean value theorem, which in its integral guise states that \[ \int_a^b f'(x)\,dx = f'(c) (b-a) \quad \text{for some }\ c\in (a, b) \] under the condition that \(f'\) is continuous, but is (slightly) generalized so that \(f'\) is merely assumed to exist, with the integral on the left-hand side replaced by \(f(b)-f(a)\). One might wonder, for good reason, whether a derivative can fail to be continuous. Alas, it can. The classic example is \[ f(x)=\begin{cases} x^2\sin\dfrac{1}{x} & x\neq 0 \\ 0 & x=0 \end{cases} \] for which \(f'\) exists at \(x=0\) but is not continuous there. (Here \(f'\) remains bounded and is still Riemann integrable; replacing \(\sin\frac1x\) by \(\sin\frac1{x^2}\) makes the derivative unbounded near \(0\), hence not even Riemann integrable.)
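A plain-Python sketch of this behavior: the difference quotient at \(0\) is bounded by \(|h|\), so \(f'(0)=0\) exists, while \(f'(x) = 2x\sin\frac1x - \cos\frac1x\) keeps oscillating between values near \(\pm 1\) arbitrarily close to \(0\).

```python
import math

def f(x):
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

# f'(0) exists and equals 0: |f(h)/h| = |h sin(1/h)| <= |h|
for h in (1e-3, 1e-6, 1e-9):
    assert abs(f(h) / h) <= abs(h)

# away from 0, f'(x) = 2x sin(1/x) - cos(1/x); at x = 1/(k*pi) this is about -(-1)^k
def fprime(x):
    return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

vals = [fprime(1.0 / (k * math.pi)) for k in range(100, 110)]
assert max(vals) > 0.9 and min(vals) < -0.9   # no limit as x -> 0
```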

The stronger mean value theorem is proved in an entirely different way, ultimately relying on the completeness of the real numbers--and it is in fact an essential ingredient in the proof of the fundamental theorem of calculus.

The stronger version of Taylor's theorem (with Lagrange remainder), as found in most books, is proved directly from the mean value theorem. For a more illuminating exposition, see Timothy Gowers' blog post.

**Cite as:** Taylor's Theorem (with Lagrange remainder). *Brilliant.org*. Retrieved from https://brilliant.org/wiki/taylors-theorem-with-lagrange-remainder/