Dual numbers and derivatives

A basic result from calculus is the derivative of a one-term polynomial in \(x\):

\[ \frac{d}{dx} x^n = n x^{n-1} \]

One proof of this is based on the limit definition of the derivative:

\[ \frac{d}{dx} f(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \]

The intuition behind this is that the derivative of a curve should give you information about the "slope" of the curve, or the rate at which \(f(x)\) changes as \(x\) varies. A derivative is intrinsic to a curve, and so can be calculated from it alone. This can be done by choosing two points on the curve and bringing them arbitrarily close together. The limit definition above just captures this process in mathematical language. Plugging in \(f(x) = x^n\) gives

\[ \frac{d}{dx} x^n = \lim_{h \to 0} \frac{(x+h)^n - x^n}{h} \]

There are a few ways to go from here. A direct way is to invoke the binomial theorem:

\[ \frac{(x+h)^n - x^n}{h} = \sum_{k=0}^{n-1} \binom{n}{k} x^{k}h^{n-k-1} \]

Taking the limit as \(h \to 0\) simplifies things significantly, since every term except \(k = n-1\) still carries a factor of \(h\):

\[ \frac{d}{dx} x^n = \lim_{h \to 0} \sum_{k=0}^{n-1} \binom{n}{k} x^{k}h^{n-k-1} = \binom{n}{n-1}x^{n-1} = nx^{n-1} \]

Here the terms of \((x+h)^n\) in \(h^2, h^3, h^4, \ldots\) are effectively ignored. This was a contentious issue for a while - people knew that setting higher powers of a nonzero number to \(0\) wasn't valid for conventional numbers, so this led to the idea of "infinitesimal" numbers. While proposed very early on, they were not rigorously established until the middle of the 20th century, so for a long time their status as valid mathematical objects was unsettled.
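The limit above can be checked numerically. The following is only a sketch: the helper name `difference_quotient` and the sample values of \(x\), \(n\) and \(h\) are illustrative choices, not part of the original argument. It evaluates the binomial sum for shrinking \(h\) and compares it with \(n x^{n-1}\).

```python
from math import comb


def difference_quotient(x: float, n: int, h: float) -> float:
    """((x + h)**n - x**n) / h, written as the binomial sum over k = 0..n-1."""
    return sum(comb(n, k) * x**k * h ** (n - k - 1) for k in range(n))


x, n = 2.0, 5  # illustrative sample point and exponent
for h in (1e-1, 1e-3, 1e-6):
    # As h shrinks, only the k = n - 1 term contributes noticeably.
    print(h, difference_quotient(x, n, h))

print("n * x**(n - 1) =", n * x ** (n - 1))
```

The printed quotients approach \(n x^{n-1}\) as \(h\) shrinks, because every term other than \(k = n-1\) is multiplied by a positive power of \(h\).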

A basic model of infinitesimals is the real numbers, \(\mathbb{R}\), extended with an extra element \(\epsilon\) such that \(\epsilon^2 = 0\) (but \(\epsilon \ne 0\)). This forms a ring, \(\mathbb{R}[\epsilon]\), of elements of the form:

\[ a + b \epsilon \in \mathbb{R}[\epsilon] \qquad a, b \in \mathbb{R} \]

This is a system known as the dual numbers. This model ignores the foundational issues of infinitesimals in favor of looking at what algebra done with infinitesimals typically looks like. So we don't need to show that \(\epsilon\) exists or construct it explicitly; we just take its existence as given and see what happens. The same proof as above can be carried out in this number system: simply replace \(h\) with \(\epsilon\). The added benefit of using dual numbers is that ignoring higher-order terms is now valid, which will allow us to use a weaker argument than the full binomial theorem.
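To make the arithmetic concrete, here is a minimal sketch of dual numbers in Python. The class name `Dual` and its field names `a` and `b` are assumptions made for illustration; the only rule built in is \(\epsilon^2 = 0\).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number a + b*eps, where eps**2 == 0."""

    a: float  # real part
    b: float  # coefficient of eps

    def __add__(self, other: "Dual") -> "Dual":
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other: "Dual") -> "Dual":
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps**2 = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)


eps = Dual(0.0, 1.0)
print(eps * eps)                        # the eps**2 term vanishes
print(Dual(2.0, 3.0) * Dual(5.0, 7.0))  # real part 2*5, eps part 2*7 + 3*5
```

Note that `eps` itself is not zero, yet its square is, which is exactly the property the argument below relies on.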

Let's sketch the proof out: we can manually expand \((x + \epsilon)^n\) as

\[ (x + \epsilon)^n = (x + \epsilon) \cdot (x + \epsilon) \cdots \text{n brackets} \cdots (x + \epsilon) \]

Multiplying these brackets consists of choosing one member (\(x\) or \(\epsilon\)) from each bracket and multiplying them all together, then adding up all the different ways to do so. To start, imagine that we multiply only the \(x\) terms. This gives us the leading term, \(x^n\). So we can write

\[ (x + \epsilon) \cdot (x + \epsilon) \cdots \text{n brackets} \cdots (x + \epsilon) = x^n + \text{other terms} \]

We can calculate the next term by multiplying all but one of the \(x\) terms with one of the \(\epsilon\) terms. Since there are \(n\) brackets, we have \(n\) "different" epsilon terms to choose from. This means that

\[ (x + \epsilon) \cdot (x + \epsilon) \cdots \text{n brackets} \cdots (x + \epsilon) = x^n + n x^{n-1} \epsilon + \text{other terms} \]

The next term has the form \(x^{n-2}\epsilon^2\), which is trivially \(0\) in the dual numbers - so the computation can stop here. Outside the dual numbers, we can continue the argument: we choose two different epsilon terms from \(n\) brackets. There are \(n(n-1)\) ordered ways to do this, as an epsilon in a bracket can't be chosen twice. However, given any pair of brackets there are two different ways to "choose" them: one before the other and vice versa. In our case, we don't want to count these as separate instances, so we need to correct by a factor of \(1/2\). Doing this gives

\[ (x + \epsilon) \cdot (x + \epsilon) \cdots \text{n brackets} \cdots (x + \epsilon) = x^n + n x^{n-1} \epsilon + \frac{n(n-1)}{2} x^{n-2} \epsilon^2 + \text{other terms} \]
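The counting argument above can be checked by brute force. This is a sketch under the assumption that each bracket is modeled as a choice between the characters `"x"` and `"e"`; the function name `expansion_coefficients` is hypothetical. It enumerates every way of picking one member per bracket and tallies how many picks contain exactly \(j\) epsilons.

```python
from collections import Counter
from itertools import product
from math import comb


def expansion_coefficients(n: int) -> Counter:
    """For each j, count the ways to pick eps from exactly j of n brackets."""
    counts = Counter()
    for choice in product("xe", repeat=n):  # one pick per bracket
        counts[choice.count("e")] += 1
    return counts


n = 6  # illustrative number of brackets
counts = expansion_coefficients(n)

# Coefficients of x**n, x**(n-1) * eps and x**(n-2) * eps**2 respectively.
print(counts[0], counts[1], counts[2])
print(n, n * (n - 1) // 2)  # the values predicted by the argument above
print(all(counts[j] == comb(n, j) for j in range(n + 1)))
```

The tallies agree with \(1\), \(n\) and \(n(n-1)/2\), and more generally with the binomial coefficients.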

Following this argument for even higher powers of \(\epsilon\) will yield the binomial theorem. In the dual numbers, we may terminate at the second step, where we got \((x + \epsilon)^n = x^n + n x^{n-1} \epsilon\). This means that

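\[ \frac{d}{dx} x^n = \frac{(x + \epsilon)^n - x^n}{\epsilon} = \frac{x^n + n x^{n-1} \epsilon - x^n}{\epsilon} = n x^{n-1} \]

Strictly speaking, \(\epsilon\) has no multiplicative inverse in the dual numbers, so "dividing by \(\epsilon\)" here is shorthand for reading off the coefficient of \(\epsilon\).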

This hints at a general way to calculate derivatives in the dual numbers:

\[ f(x + \epsilon) - f(x) = \epsilon \frac{d}{dx} f(x) \]

For example, given a function \(p(x) = x^3 - 4x\), we can calculate its derivative indirectly by first finding \(p(x + \epsilon)\):

\[ p(x + \epsilon) = (x + \epsilon)^3 - 4(x + \epsilon) = x^3 - 4x + (3x^2 - 4) \epsilon \]

Reading off the coefficient of \(\epsilon\) gives \(p'(x) = 3x^2 - 4\), as expected.
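The same calculation can be carried out mechanically. The sketch below assumes a small hypothetical `Dual` class (with a `lift` helper so plain numbers can be mixed in) and a hypothetical `derivative` function; it evaluates \(p\) at \(x + \epsilon\) and reads off the coefficient of \(\epsilon\).

```python
class Dual:
    """Dual number a + b*eps with eps**2 == 0."""

    def __init__(self, a, b=0.0):
        self.a, self.b = a, b  # real part, coefficient of eps

    @staticmethod
    def lift(v):
        """Treat a plain number c as the dual number c + 0*eps."""
        return v if isinstance(v, Dual) else Dual(v)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.a - other.a, self.b - other.b)

    def __mul__(self, other):
        other = Dual.lift(other)
        # The eps**2 term is dropped because eps**2 == 0.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__


def p(x):
    return x * x * x - 4 * x


def derivative(f, x):
    """Coefficient of eps in f(x + eps)."""
    return f(Dual(x, 1.0)).b


print(derivative(p, 2.0))  # compare with 3 * 2.0**2 - 4
```

Because `p` is written with ordinary `*` and `-`, the same function works on both plain numbers and dual numbers, which mirrors how the algebra above never needed a special rule for \(p\).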

We can go further with this idea to prove the product rule. Given the functions \(f(x)\) and \(g(x)\), how do we calculate \(\frac{d}{dx} f(x)g(x)\)? With dual numbers, this amounts to finding \(f(x + \epsilon)g(x + \epsilon)\):

\[ \begin{aligned} f(x + \epsilon)g(x + \epsilon) &= \left(f(x) + \epsilon \frac{d}{dx} f(x) \right) \left(g(x) + \epsilon \frac{d}{dx} g(x) \right) \\ &= f(x)g(x) + \epsilon \left( g(x) \frac{d}{dx} f(x) + f(x) \frac{d}{dx} g(x) \right) \end{aligned} \]

This confirms that \(\frac{d}{dx} f(x)g(x) = g(x) \frac{d}{dx} f(x) + f(x) \frac{d}{dx} g(x)\). The chain rule, likewise, can be found by expanding \(f(g(x + \epsilon))\):

\[ \begin{aligned} f(g(x + \epsilon)) &= f \left(g(x) + \epsilon \frac{d}{dx} g(x) \right) = f(y + \eta) \\ &= f(y) + \eta \left( \frac{d}{dy} f(y) \right) \\ &= f(g(x)) + \epsilon \left(\frac{d}{dx} g(x) \cdot \frac{d}{dg(x)} f(g(x)) \right) \end{aligned} \]

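Here the substitutions \(y = g(x)\) and \(\eta = \epsilon \frac{d}{dx} g(x)\) simplify the calculation. Since \(\eta^2 = 0\), the rule \(f(y + \eta) = f(y) + \eta \frac{d}{dy} f(y)\) applies to \(\eta\) just as it does to \(\epsilon\).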
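Both rules can be spot-checked numerically. The sketch below is illustrative only: the `Dual` class, the helper `dual_sin`, and the choices \(f = \sin\), \(g(x) = x^2\) and the sample point are all assumptions made for this example. `dual_sin` applies \(f(y + \eta) = f(y) + \eta \frac{d}{dy} f(y)\) with \(f = \sin\).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number a + b*eps, where eps**2 == 0."""

    a: float  # real part
    b: float  # coefficient of eps

    def __mul__(self, other: "Dual") -> "Dual":
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)


def dual_sin(u: Dual) -> Dual:
    # f(y + eta) = f(y) + eta * f'(y), with f = sin and f' = cos
    return Dual(math.sin(u.a), math.cos(u.a) * u.b)


def square(u: Dual) -> Dual:
    return u * u


x0 = 1.5           # illustrative sample point
x = Dual(x0, 1.0)  # x0 + eps

# Product rule: d/dx [x**2 * sin(x)] = 2x sin(x) + x**2 cos(x)
product_rule = (square(x) * dual_sin(x)).b
print(product_rule, 2 * x0 * math.sin(x0) + x0**2 * math.cos(x0))

# Chain rule: d/dx sin(x**2) = 2x cos(x**2)
chain_rule = dual_sin(square(x)).b
print(chain_rule, 2 * x0 * math.cos(x0**2))
```

In each printed pair, the first value comes from the coefficient of \(\epsilon\) and the second from the closed-form derivative; the two should agree up to floating-point rounding.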

Note by Levi Walker
3 months ago
