The Taylor series of a function is extremely useful in all sorts of applications and, at the same time, it is fundamental in pure mathematics, specifically in (complex) function theory. Recall that, if $f(x)$ is infinitely differentiable at $x=a$, the Taylor series of $f(x)$ at $x=a$ is by definition
$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots.$$
The expression (and its tremendous utility) comes from its being the best polynomial approximation (up to any given degree) near the point $x=a$. For $f(x)=\sin x$ and $a=0$, it's easy to compute all the $f^{(n)}(0)$ and to see that the Taylor series converges for all $x\in\mathbb{R}$ (by the ratio test), but it's by no means obvious that it should converge to $\sin x$. After all, the derivatives at $x=0$ only depend on the values of the function very close to $x=0$. Why should you expect that somehow it "knows" the values of the function far away?
That the Taylor series does converge to the function itself must be a non-trivial fact. Most calculus textbooks invoke Taylor's theorem (with Lagrange remainder), and probably mention that it is a generalization of the mean value theorem. The proof of Taylor's theorem in its full generality may be short but is not very illuminating. Fortunately, a very natural derivation based only on the fundamental theorem of calculus (and a little bit of multi-variable perspective) is all one needs for most functions.
We start with the fundamental theorem of calculus (FTC) in what should be its most natural form:
$$f(x) = f(a) + \int_a^x f'(x_1)\,dx_1.$$
The expression naturally requires that $f$ be differentiable (i.e. $f'$ exists) and that $f'$ be continuous between $a$ and $x$; we shall say $f$ is continuously differentiable for short (or $f\in C^1$). You could allow $f'$ to have some jump discontinuities, but we'll soon see that more differentiability will come up, not less. For the sake of definiteness, imagine that $x$ is bigger than $a$ and $x_1$ is a variable running from $a$ to $x$.
If, furthermore, $f'$ is continuously differentiable (we say that $f$ is twice continuously differentiable, or $f\in C^2$), we can apply the FTC to $f'$ on the interval $[a,x_1]$:
$$f'(x_1) = f'(a) + \int_a^{x_1} f''(x_2)\,dx_2.$$
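Putting this into the expression for $f(x)$, we have
$$f(x) = f(a) + \int_a^x \left( f'(a) + \int_a^{x_1} f''(x_2)\,dx_2 \right) dx_1 = f(a) + f'(a)(x-a) + \int_a^x\!\int_a^{x_1} f''(x_2)\,dx_2\,dx_1.$$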
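Playing this game again, if $f''$ is continuously differentiable (i.e. $f\in C^3$), we could write $f''(x_2) = f''(a) + \int_a^{x_2} f'''(x_3)\,dx_3$ and obtain
$$f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2}(x-a)^2 + \int_a^x\!\int_a^{x_1}\!\int_a^{x_2} f'''(x_3)\,dx_3\,dx_2\,dx_1.$$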
This clearly generalizes as follows:
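If $f$ is $n+1$ times continuously differentiable ($f\in C^{n+1}$) on an interval containing $a$, then
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + R_n(x), \qquad R_n(x) = \int_a^x\!\int_a^{x_1}\!\cdots\int_a^{x_n} f^{(n+1)}(x_{n+1})\,dx_{n+1}\cdots dx_2\,dx_1,$$
where $R_n(x)$ is known as the remainder.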
In some sense, we have pushed as much information about the value of $f(x)$ to the point $a$ as possible, and what remains is a single "complicated-looking" term.
Verify it for a concrete choice of $f$, $a$, and $n$, for instance $f(x)=\sin x$, $a=0$, and $n=2$.
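A quick numerical check can make the iterated-integral remainder less abstract. The following is only a sketch under stated assumptions: it picks $f(x)=\sin x$, $a=0$, $n=2$ and $x=1$ for illustration, and the helper names `simpson` and `iterated_remainder` are made up here. It evaluates the $(n+1)$-fold integral as nested one-dimensional Simpson sums and compares the Taylor polynomial plus remainder against $\sin x$.

```python
import math

def simpson(g, lo, hi, panels=40):
    """Composite Simpson's rule for the integral of g over [lo, hi] (panels must be even)."""
    if hi == lo:
        return 0.0
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def iterated_remainder(deriv, a, x, depth):
    """Integrate deriv over a <= x_depth <= ... <= x_1 <= x as nested 1-D integrals."""
    if depth == 0:
        return deriv(x)
    return simpson(lambda t: iterated_remainder(deriv, a, t, depth - 1), a, x)

# Illustrative choices (assumptions, not prescribed by the text): f = sin, a = 0, n = 2.
a, x, n = 0.0, 1.0, 2
# Degree-2 Taylor polynomial of sin at a: f(a) + f'(a)(x-a) + f''(a)(x-a)^2/2.
taylor_poly = math.sin(a) + math.cos(a) * (x - a) - math.sin(a) * (x - a) ** 2 / 2
# The third derivative of sin is -cos; the remainder is a 3-fold iterated integral.
remainder = iterated_remainder(lambda t: -math.cos(t), a, x, n + 1)

print("sin(x)          =", math.sin(x))
print("T_2(x) + R_2(x) =", taylor_poly + remainder)
```

The two printed values should agree up to the quadrature error of the nested Simpson sums.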
Note that if there is a bound for $f^{(n+1)}$ over the interval $(a,x)$, we can easily deduce the so-called Lagrange's error bound, which suffices for most applications (such as the convergence of Taylor series; see below). The actual Lagrange (or other) remainder appears to be a "deeper" result that could be dispensed with.
(Lagrange's Error Bound)
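If, in addition, $f^{(n+1)}$ is bounded by $M$ over the interval $(a,x)$, i.e. $\left|f^{(n+1)}(y)\right| \le M$ for all $y\in(a,x)$, then
$$|R_n(x)| \le \frac{M}{(n+1)!}\,|x-a|^{n+1}.$$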
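As a sanity check of the error bound, here is a minimal sketch assuming $f(x)=\sin x$ and $a=0$, for which every derivative is bounded by $M=1$. The function names `sin_taylor` and `lagrange_bound` are illustrative only. It compares the actual error of the degree-$n$ Taylor polynomial with $M|x-a|^{n+1}/(n+1)!$.

```python
import math

def sin_taylor(x, n):
    """Degree-n Taylor polynomial of sin at a = 0 (only odd powers contribute)."""
    total = 0.0
    for k in range(1, n + 1, 2):
        total += (-1) ** (k // 2) * x ** k / math.factorial(k)
    return total

def lagrange_bound(x, n, M=1.0):
    """M * |x - a|^(n+1) / (n+1)! with a = 0."""
    return M * abs(x) ** (n + 1) / math.factorial(n + 1)

# Collect (x, n, actual error, bound) for a few sample points.
results = []
for x in (0.5, 2.0, 5.0):
    for n in (1, 3, 5, 9):
        error = abs(math.sin(x) - sin_taylor(x, n))
        results.append((x, n, error, lagrange_bound(x, n)))

for x, n, error, bound in results:
    print(f"x={x}, n={n}: error={error:.3e}, bound={bound:.3e}")
```

In every row the actual error should sit at or below the bound; for fixed $x$ the bound itself shrinks once $n$ exceeds $|x|$, because the factorial eventually outgrows the power.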
The remainder as given above is an iterated integral, or a multiple integral, that one would encounter in multi-variable calculus. This may have contributed to the fact that Taylor's theorem is rarely taught this way.
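For $n=1$, the remainder
$$R_1(x) = \int_a^x\!\int_a^{x_1} f''(x_2)\,dx_2\,dx_1$$
is a "double integral," where the integrand in general could depend on both variables $x_1$ and $x_2$. In our case, the integrand only depends on $x_2$, so it would be easier if we could integrate over the variable $x_1$ first. Indeed we could do so (with a little help from Fubini's theorem):
$$R_1(x) = \int_a^x\!\int_{x_2}^{x} f''(x_2)\,dx_1\,dx_2 = \int_a^x f''(x_2)\,(x-x_2)\,dx_2.$$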
Note that the limits of integration were changed to keep the relative positions of the two variables, namely $a \le x_2 \le x_1 \le x$. In fact, the integral should be regarded as over a right-angled triangle in the $x_1x_2$-plane, and it computes the (signed) volume under the surface $z = f''(x_2)$. This makes it intuitively clear that interchanging the order of integration ought not to affect the final result.
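For the general case of $n$, the region of integration is an $(n+1)$-dimensional "simplex" defined by $a \le x_{n+1} \le x_n \le \cdots \le x_1 \le x$, and performing the integration over $x_1,\ldots,x_n$ with $x_{n+1}$ fixed yields the volume of a right-angled "$n$-simplex". To wit,
$$R_n(x) = \int_a^x f^{(n+1)}(x_{n+1})\,\frac{(x-x_{n+1})^n}{n!}\,dx_{n+1},$$
and this is known as the integral form of the remainder.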
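The single-integral form is easy to test numerically. The sketch below assumes $f(x)=e^x$ with $a=0$ (so every derivative at $a$ equals $1$) and uses a hand-rolled Simpson rule; the names `simpson` and `integral_remainder` are illustrative and the sample values of $x$ and $n$ are arbitrary. It compares the integral with the directly computed difference between $f(x)$ and its Taylor polynomial.

```python
import math

def simpson(g, lo, hi, panels=200):
    """Composite Simpson's rule for the integral of g over [lo, hi] (panels must be even)."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def integral_remainder(deriv_np1, a, x, n):
    """Integral of f^(n+1)(t) * (x - t)^n / n! for t from a to x."""
    return simpson(lambda t: deriv_np1(t) * (x - t) ** n / math.factorial(n), a, x)

# Illustrative choices (assumptions): f = exp, a = 0, x = 1.5, n = 4.
a, x, n = 0.0, 1.5, 4
taylor_poly = sum((x - a) ** k / math.factorial(k) for k in range(n + 1))
direct = math.exp(x) - taylor_poly                 # f(x) minus its Taylor polynomial
via_integral = integral_remainder(math.exp, a, x, n)

print("f(x) - T_n(x)    =", direct)
print("integral form R_n =", via_integral)
```

The two numbers should match to within the quadrature error.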
Under the same condition,
$$R_n(x) = \int_a^x \frac{f^{(n+1)}(t)}{n!}\,(x-t)^n\,dt.$$
By the "real" mean value theorem, this integral can be replaced by the "mean value," attained at some point $\xi\in(a,x)$, multiplied by the length $x-a$. Thus we obtain the remainder in the form of Cauchy:
$$R_n(x) = \frac{f^{(n+1)}(\xi)}{n!}\,(x-\xi)^n\,(x-a).$$
Finally, to obtain the form of Lagrange, we simply need to look at the original $(n+1)$-fold integral, and apply the multi-variable version of the "real" mean value theorem: a multiple integral over a bounded, connected region is equal to its "mean value," attained at some point in the domain by continuity of the integrand, multiplied by the "volume" of the region of integration. (One can prove this by a simple application of the extreme value theorem and the intermediate value theorem.) Thus we have
$$R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x-a)^{n+1}.$$
Note that it almost certainly is a different $\xi$ from the one in the Cauchy remainder, and in both cases we can't know where exactly $\xi$ is without more information on the function $f$. The Lagrange remainder is easy to remember since it is the same expression as the next term in the Taylor series, except that $f^{(n+1)}$ is being evaluated at the point $\xi$ instead of at $a$.
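To see concretely that the two intermediate points differ, one can solve for them in a simple case. This is a sketch assuming $f(x)=e^x$, $a=0$, $x=1$ and $n=2$, using the standard Lagrange form $f^{(n+1)}(\xi)(x-a)^{n+1}/(n+1)!$ and Cauchy form $f^{(n+1)}(\xi)(x-\xi)^n(x-a)/n!$. The helper `bisect` and the variable names are illustrative, not part of the original text.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Root of g on [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid      # root lies in the upper half
        else:
            hi = mid                   # root lies in the lower half
    return (lo + hi) / 2

# Illustrative choices (assumptions): f = exp, a = 0, x = 1, n = 2.
a, x, n = 0.0, 1.0, 2
taylor_poly = sum((x - a) ** k / math.factorial(k) for k in range(n + 1))
R = math.exp(x) - taylor_poly          # exact remainder R_2(1) = e - 2.5

# Lagrange form: exp(xi) * (x - a)^(n+1) / (n+1)! = R
xi_lagrange = bisect(
    lambda xi: math.exp(xi) * (x - a) ** (n + 1) / math.factorial(n + 1) - R, a, x)

# Cauchy form: exp(xi) * (x - xi)^n * (x - a) / n! = R
xi_cauchy = bisect(
    lambda xi: math.exp(xi) * (x - xi) ** n * (x - a) / math.factorial(n) - R, a, x)

print("xi (Lagrange) =", xi_lagrange)
print("xi (Cauchy)   =", xi_cauchy)
```

Both roots lie strictly between $a$ and $x$, but they are not the same point.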
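One could also obtain other forms of the remainder by integrating some but not all of the variables, and applying the mean value theorem to the remaining variables. With a bit of careful analysis, one has, for each integer $0 \le k \le n$,
$$R_n(x) = \frac{f^{(n+1)}(\xi)}{k!\,(n+1-k)!}\,(x-\xi)^{k}\,(x-a)^{n+1-k}$$
for some $\xi$ between $a$ and $x$ (the cases $k=0$ and $k=n$ recover the Lagrange and Cauchy forms, respectively).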
This is very close to, but not quite the same as, the Roche-Schlömilch form of the remainder.
It should also be mentioned that the integral form is typically derived by successive applications of integration by parts, which avoids ever mentioning multiple integrals. However, it may be considered the same proof (up to homotopy, in some sense) because integration by parts, in essence, is saying that one could compute a certain area either by integrating over the $x$ variable or over the $y$ variable.
In addition to giving an error estimate for approximating a function by the first few terms of the Taylor series, Taylor's theorem (with Lagrange remainder) provides the crucial ingredient to prove that the full Taylor series converges exactly to the function it's supposed to represent. A few examples are in order.
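$f(x)=\sin x$ is infinitely differentiable and all the derivatives are one of four possibilities, namely $\pm\sin x$ and $\pm\cos x$. Therefore, in any of the forms of $R_n(x)$ above, we can simply bound $\left|f^{(n+1)}(\xi)\right|$ by $1$, so that (using the Lagrange form, say)
$$|R_n(x)| \le \frac{|x-a|^{n+1}}{(n+1)!} \to 0 \quad \text{as } n\to\infty$$
for any fixed $a$ and $x$. Therefore, the Taylor series of $\sin x$, centered at any point $a$, indeed converges to $\sin x$ for all $x\in\mathbb{R}$.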
Now it would be natural to apply this kind of argument to as many functions as possible, and preferably to have some general theorem describing which functions, with a simple criterion or test, enjoy the property that their Taylor series always converge to the right function wherever they converge. That would be a theorem more deserving of the name of Taylor's theorem (in the sense of the theorem concerning Taylor series, not to attribute it to Brook Taylor). Unfortunately, the natural criterion of being $C^\infty$ throughout an interval is not enough. The famous (counter)example is
$$f(x) = \begin{cases} e^{-1/x} & \text{if } x > 0, \\ 0 & \text{if } x \le 0, \end{cases}$$
for which all the derivatives at $x=0$ exist and are equal to $0$, so its Taylor series centered at $0$, or at any $a < 0$, does not converge to $f(x)$ for $x > 0$. The existence of this seemingly innocent function is highly significant, as it gives rise to a rich reservoir of smooth functions on $\mathbb{R}^n$ that can have any desired support (often called bump functions in the study of smooth manifolds, and test functions in the theory of distributions).
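A small numerical experiment illustrates why such a function defeats its own Taylor series. The sketch assumes the standard form of the counterexample, $f(x)=e^{-1/x}$ for $x>0$ and $f(x)=0$ for $x\le 0$; the name `flat` is illustrative. It shows that $f(h)/h^k$ collapses toward $0$ as $h \to 0^+$ for each sampled power $k$, even though $f$ is clearly positive away from the origin.

```python
import math

def flat(x):
    """Assumed counterexample: exp(-1/x) for x > 0 and 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

# f(h) / h^k shrinks rapidly as h -> 0+ for each sampled k, consistent with
# f vanishing faster than any power of h at the origin.
ratios = {}
for k in (1, 5, 10):
    ratios[k] = [flat(h) / h ** k for h in (0.1, 0.01, 0.001)]
    print(f"k={k}:", ratios[k])

# The Taylor series at 0 is identically zero, yet f is positive for x > 0.
print("Taylor series value at 0.5: 0.0, actual f(0.5):", flat(0.5))
```

The ratios fall off extremely quickly with $h$, while `flat(0.5)` equals $e^{-2}$, which no partial sum of the zero series can approximate.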
But, as far as Taylor series are concerned, these smooth functions are bad. The "good" functions, characterized by the very property that their Taylor series always converge to the functions themselves, are called (real) analytic, sometimes denoted $C^\omega$ to suggest that this is stronger than being $C^\infty$. The best way to "explain" why the above $f$ is not analytic is by going into the complex domain: even though the two sides "stitch" together smoothly (at the origin) along the real axis, it is not possible to extend it into the complex plane as a (single) holomorphic function, not even in a small neighborhood of $0$. In fact, the behavior of $e^{-1/z}$ for small complex $z$ is extremely wild (see Picard's theorem on essential singularities). The following theorem, rarely mentioned in calculus as it is considered "outside the scope" of a real-variable course, provides the natural criterion for analyticity that bypasses Taylor's theorem and the difficulty with estimating the remainder.
If $f$ is a (real- or complex-valued) function on an open interval $I\subset\mathbb{R}$, and it extends to (i.e. agrees with) a holomorphic function on a complex domain (an open connected subset of $\mathbb{C}$) containing $I$, then the Taylor series of $f$ at any point $a\in I$ converges to $f$ wherever it converges (i.e., $f$ is analytic). Furthermore, the radius of convergence is the largest $R$ such that $f$ admits a holomorphic extension over a domain containing the open disk $\{z\in\mathbb{C} : |z-a| < R\}$.
The vast majority of functions that one encounters — including all elementary functions and their antiderivatives, and more generally solutions to (reasonable) ordinary differential equations — satisfy this criterion, and thus are analytic. For more about analytic functions on the complex domain, see the wiki Analytic Continuation.
The condition in Taylor's theorem (with Lagrange remainder) can be relaxed a little bit, so that $f^{(n+1)}$ is no longer assumed to be continuous (and the derivation above breaks down) but merely exists on the open interval $(a,x)$. The same happens to the mean value theorem, which originally must refer to the fact that
$$\int_a^b f'(x)\,dx = f'(\xi)\,(b-a) \quad \text{for some } \xi\in(a,b)$$
under the condition that $f'$ is continuous, but is (slightly) generalized so that $f'$ is no longer assumed to be continuous but merely to exist, and the integral on the left-hand side is replaced by $f(b)-f(a)$. One might wonder, for good reasons, whether such functions exist. Alas, they do. The classic example is
$$f(x) = \begin{cases} x^2 \sin\!\left(\dfrac{1}{x^2}\right) & \text{if } x \ne 0, \\ 0 & \text{if } x = 0, \end{cases}$$
for which $f'$ exists at $x=0$ but is not continuous there. The discontinuity is so bad (the derivative is unbounded near $0$) that $f'$ is not (Riemann) integrable.
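The pathology is easy to observe numerically. The sketch below assumes the example is $f(x)=x^2\sin(1/x^2)$ with $f(0)=0$, which is the usual choice when the derivative is meant to be non-integrable; the names `f` and `fprime` are illustrative. It shows the difference quotients at $0$ shrinking (so $f'(0)=0$), while $f'$ evaluated along the points $x_k = 1/\sqrt{2\pi k}$ grows without bound in magnitude.

```python
import math

def f(x):
    """Assumed example: x^2 * sin(1/x^2) for x != 0, and 0 at x = 0."""
    return x * x * math.sin(1.0 / (x * x)) if x != 0 else 0.0

def fprime(x):
    """Derivative: product and chain rule for x != 0; f'(0) = 0 from the limit definition."""
    if x == 0:
        return 0.0
    u = 1.0 / (x * x)
    return 2 * x * math.sin(u) - (2.0 / x) * math.cos(u)

# Difference quotients at 0 are bounded by |h|, so they tend to 0.
quotients = [(h, (f(h) - f(0.0)) / h) for h in (1e-1, 1e-2, 1e-3)]
for h, q in quotients:
    print(f"h={h}: (f(h) - f(0)) / h = {q:.3e}")

# Along x_k = 1/sqrt(2*pi*k), sin vanishes and cos equals 1 (up to rounding),
# so f'(x_k) is close to -2*sqrt(2*pi*k).
samples = [(k, fprime(1.0 / math.sqrt(2 * math.pi * k))) for k in (1, 100, 10000)]
for k, d in samples:
    print(f"k={k}: f'(x_k) = {d:.3f}")
```

The derivative exists everywhere, yet its values near the origin are arbitrarily large in magnitude, which is what rules out Riemann integrability.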
The stronger mean value theorem found an entirely different proof — ultimately relying on properties of the real numbers — and in fact is an essential ingredient in the proof of the fundamental theorem of calculus itself. The stronger version of Taylor's theorem (with Lagrange remainder), as found in most books, is proved directly from the mean value theorem. That this is not the best approach for pedagogy is well argued in Thomas Tucker's Rethinking Rigor in Calculus: The Role of the Mean Value Theorem. For a more illuminating exposition, see Timothy Gowers' blog post.
It should also be noted that the condition in the integral form of the remainder can likewise be relaxed, so that $f^{(n+1)}$ is no longer assumed to be continuous, but only that $f^{(n)}$ be absolutely continuous, which implies that $f^{(n+1)}$ exists almost everywhere and is (Lebesgue) integrable ($f^{(n+1)}\in L^1$). In some sense, this is the most general setting for the fundamental theorem of calculus and for integration by parts.