A basic result from calculus is that the derivative of a one-term polynomial in $x$ is:

$$\frac{d}{dx} x^n = n x^{n-1}$$
One proof of this is based on the limit definition of the derivative:

$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$
The intuition behind this is that the derivative of a curve should give you information about the "slope" of the curve, or the rate at which $y$ changes as $x$ varies. A derivative is intrinsic to a curve, and so can be calculated from it alone. This can be done by choosing two points on the curve and bringing them arbitrarily close together. The limit definition above just captures this process in mathematical language. Plugging $f(x) = x^n$ into the definition gives

$$\frac{d}{dx} x^n = \lim_{h \to 0} \frac{(x+h)^n - x^n}{h}$$
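There are a few ways to go from here. A direct way is to invoke the binomial theorem:

$$(x+h)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} h^k = x^n + n x^{n-1} h + \binom{n}{2} x^{n-2} h^2 + \cdots + h^n$$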
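Taking the limit as $h \to 0$ simplifies things significantly:

$$\lim_{h \to 0} \frac{(x+h)^n - x^n}{h} = \lim_{h \to 0} \left( n x^{n-1} + \binom{n}{2} x^{n-2} h + \cdots + h^{n-1} \right) = n x^{n-1}$$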
Here, the terms of order $h^2$ and higher in the expansion of $(x+h)^n$ are effectively ignored. This was a contentious issue for a while: people knew that setting high-order powers to $0$ wasn't valid for conventional numbers, so this led to the idea of "infinitesimal" numbers. While proposed very early on, they were not rigorously established until the middle of the 20th century. This left a long gap during which their status as valid mathematical objects was unknown.
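Before moving on to infinitesimals, the limit argument above can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `difference_quotient` and `power` are made up for illustration, and the values $n = 5$, $x = 2$ are arbitrary choices.

```python
def difference_quotient(f, x, h):
    """Slope of the line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def power(n):
    """Return the one-term polynomial x -> x**n."""
    return lambda x: x ** n


n, x = 5, 2.0
exact = n * x ** (n - 1)  # n * x^(n-1) = 80.0 for these values

# Shrinking h brings the two points together; the slope approaches `exact`.
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = difference_quotient(power(n), x, h)
    print(f"h = {h:g}: slope = {approx:.6f}, error = {abs(approx - exact):.2e}")
```

The error shrinks roughly in proportion to $h$, which is the numerical shadow of the higher-order terms being dropped in the limit.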
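A basic model of infinitesimals is the real numbers, $\mathbb{R}$, extended with an extra element $\varepsilon$ such that $\varepsilon^2 = 0$ (but $\varepsilon \neq 0$). This forms a ring, $\mathbb{R}[\varepsilon]$, of elements of the form:

$$a + b\varepsilon, \qquad a, b \in \mathbb{R}$$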
This is a system known as the dual numbers. This model ignores the foundational issues of infinitesimals in favor of looking at what algebra done with infinitesimals typically looks like. So we don't need to show that $\varepsilon$ exists or construct it explicitly; we just take its existence as given and see what happens. The same proof above can be done in this number system: simply replace $h$ with $\varepsilon$. The added benefit of using dual numbers is that ignoring higher-order terms is now valid, which will allow us to use a weaker argument than the full binomial theorem.
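To make the arithmetic concrete, here is a small sketch of dual numbers as pairs $(a, b)$ standing for $a + b\varepsilon$. The class name `Dual` and its fields are illustrative assumptions, not something taken from a library; only addition and multiplication are implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number a + b*eps, stored as the pair (a, b)."""
    a: float  # real part
    b: float  # coefficient of eps

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps; the bd eps^2 term is zero.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)


eps = Dual(0, 1)
print(eps * eps)                # Dual(a=0, b=0): eps squared vanishes
print(Dual(2, 3) * Dual(5, 7))  # Dual(a=10, b=29)
```

Note that `eps` itself is not zero even though its square is, which is exactly the property that conventional numbers cannot have.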
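Let's sketch it out: we can manually expand $(x+\varepsilon)^n$ as

$$(x+\varepsilon)^n = \underbrace{(x+\varepsilon)(x+\varepsilon)\cdots(x+\varepsilon)}_{n \text{ brackets}}$$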
Multiplying these brackets consists of choosing one member ($x$ or $\varepsilon$) from each bracket and multiplying them all together, then adding up all the different ways to do so. To start, imagine multiplying only the $x$ terms. This gives us the leading term, $x^n$. So we can write

$$(x+\varepsilon)^n = x^n + \cdots$$
We can calculate the next term by multiplying all but one of the $x$ terms with one of the $\varepsilon$ terms. Since there are $n$ brackets, we have $n$ "different" epsilon terms to choose from. This means that

$$(x+\varepsilon)^n = x^n + n x^{n-1} \varepsilon + \cdots$$
The next term has the form $c\,\varepsilon^2$ for some coefficient $c$, which is trivially $0$ in the dual numbers, so the computation can stop here. Otherwise, for an ordinary variable $h$ in place of $\varepsilon$, we can continue the argument: we can choose two different $h$ terms from $n$ brackets. There are $n(n-1)$ ways to do this, as the term in a given bracket can't be chosen twice. However, given any pair of brackets there are two different ways to "choose" them: one before the other and vice versa. In our case, we don't want to count these as separate instances, so we need to correct by a factor of $\tfrac{1}{2}$. Doing this gives

$$(x+h)^n = x^n + n x^{n-1} h + \frac{n(n-1)}{2} x^{n-2} h^2 + \cdots$$
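The counting argument can be checked by brute force. The sketch below enumerates every way of picking one member from each of $n$ brackets and tallies how many picks contain exactly $k$ epsilons; the function name `count_terms` and the choice $n = 5$ are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import comb


def count_terms(n):
    """Count, for each k, the ways to pick eps from exactly k of n brackets."""
    counts = Counter()
    # Each bracket contributes either "x" or "eps"; enumerate all 2**n choices.
    for choice in product(("x", "eps"), repeat=n):
        counts[choice.count("eps")] += 1
    return counts


n = 5
counts = count_terms(n)
print(counts[0], counts[1], counts[2])  # 1 5 10

# The tallies agree with n, n(n-1)/2 and, in general, the binomial coefficients.
assert counts[1] == n
assert counts[2] == n * (n - 1) // 2
assert all(counts[k] == comb(n, k) for k in range(n + 1))
```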
Following this argument for higher powers will yield the binomial theorem. In the dual numbers, we may terminate at the second step, where we got $n x^{n-1} \varepsilon$. This means that

$$(x+\varepsilon)^n = x^n + n x^{n-1} \varepsilon$$
This hints at a general way to calculate derivatives in the dual numbers:

$$f(x+\varepsilon) = f(x) + f'(x)\,\varepsilon$$
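For example, given a function $f(x) = x^2$, we can calculate its derivative indirectly by first finding

$$f(x+\varepsilon) = (x+\varepsilon)^2 = x^2 + 2x\varepsilon + \varepsilon^2 = x^2 + 2x\varepsilon$$

In this case, reading off the coefficient of $\varepsilon$ gives $f'(x) = 2x$, as expected.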
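This recipe translates directly into code: evaluate the function at $x + \varepsilon$ and read the derivative off the $\varepsilon$ coefficient. The sketch below is a minimal illustration under stated assumptions: the `Dual` class and the `derivative` helper are hypothetical names, and only addition and multiplication are supported, so it handles polynomial expressions only.

```python
class Dual:
    """Dual number a + b*eps with eps**2 == 0 (illustrative, not a library type)."""

    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    @staticmethod
    def lift(value):
        # Treat plain numbers as dual numbers with zero eps part.
        return value if isinstance(value, Dual) else Dual(value)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate f at x + eps and read f'(x) off the eps coefficient."""
    return f(Dual(x, 1.0)).b


print(derivative(lambda x: x * x, 3.0))              # 6.0
print(derivative(lambda x: x * x * x + 2 * x, 2.0))  # 14.0
```

No limit is taken and no step size is chosen: the derivative falls out of ordinary arithmetic, which is why the result is exact rather than approximate for these polynomials.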
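We can go further with this idea to prove the product rule. Given the functions $f$ and $g$, how do we calculate $(fg)'$? With dual numbers, this amounts to finding $f(x+\varepsilon)\,g(x+\varepsilon)$:

$$\begin{aligned}
f(x+\varepsilon)\,g(x+\varepsilon) &= \big(f(x) + f'(x)\varepsilon\big)\big(g(x) + g'(x)\varepsilon\big) \\
&= f(x)g(x) + \big(f'(x)g(x) + f(x)g'(x)\big)\varepsilon + f'(x)g'(x)\varepsilon^2 \\
&= f(x)g(x) + \big(f'(x)g(x) + f(x)g'(x)\big)\varepsilon
\end{aligned}$$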
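This confirms that $(fg)' = f'g + fg'$. The chain rule, likewise, can be found by expanding $f(g(x+\varepsilon))$:

$$f(g(x+\varepsilon)) = f\big(g(x) + g'(x)\varepsilon\big) = f(u + \delta) = f(u) + f'(u)\,\delta = f(g(x)) + f'(g(x))\,g'(x)\,\varepsilon$$

Here the substitutions $u = g(x)$ and $\delta = g'(x)\varepsilon$ simplify the calculation; since $\delta^2 = 0$, $\delta$ behaves just like $\varepsilon$ in the expansion of $f$. Reading off the coefficient of $\varepsilon$ gives $(f \circ g)'(x) = f'(g(x))\,g'(x)$.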
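Both rules can be spot-checked numerically. The sketch below represents a dual number as a plain `(real, eps)` tuple and uses two example polynomials, $f(x) = x^3$ and $g(x) = x^2 + 1$; these functions, the helper names `add` and `mul`, and the test point $x = 2$ are all assumptions chosen for illustration.

```python
def add(p, q):
    """Add dual numbers given as (real, eps) pairs."""
    return (p[0] + q[0], p[1] + q[1])


def mul(p, q):
    """Multiply dual numbers given as (real, eps) pairs, using eps**2 == 0."""
    return (p[0] * q[0], p[0] * q[1] + p[1] * q[0])


def f(z):
    """f(x) = x^3, so f'(x) = 3x^2."""
    return mul(mul(z, z), z)


def g(z):
    """g(x) = x^2 + 1, so g'(x) = 2x."""
    return add(mul(z, z), (1, 0))


x = 2
z = (x, 1)  # the dual number x + eps

# Product rule: the eps part of f*g equals f'g + fg'.
product_eps = mul(f(z), g(z))[1]
assert product_eps == (3 * x**2) * (x**2 + 1) + (x**3) * (2 * x)

# Chain rule: the eps part of f(g(x + eps)) equals f'(g(x)) * g'(x).
chain_eps = f(g(z))[1]
assert chain_eps == 3 * (x**2 + 1) ** 2 * (2 * x)

print(product_eps, chain_eps)  # 92 300
```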