Hensel's Lemma
Hensel's lemma is a result that stipulates conditions for roots of polynomials modulo powers of primes to be "lifted" to roots modulo higher powers. The lifting method outlined in the proof is reminiscent of Newton's method for solving equations.
The lemma is useful for finding and classifying solutions of polynomial equations modulo powers of primes with a minimum of computational difficulty.
Find all the solutions to \( x^2 \equiv 14\) mod \( 625\).
Rather than searching mod \( 625\), it is easier to build up the solution modulo smaller powers of \( 5\). First, the square roots of \( 14 \) mod \( 5 \) are \( 2,3\). Now try to find a solution \( x = 2+5t \) mod \( 25: \)
\[\begin{align} (2+5t)^2 &\equiv 14 \pmod{25} \\ 4+20t &\equiv 14 \pmod{25} \\ 20t &\equiv 10 \pmod{25} \\ 4t &\equiv 2 \pmod{5}, \end{align}\]
so \( t \equiv 3 \) mod \( 5 \); so \( x\equiv 17 \) mod \(25\). Similarly, starting with \( x=3+5t \) yields \( t\equiv 1\), so \( x\equiv 8 \) mod \( 25\).
The key was that when we expanded the square, some of the terms vanished mod \( 25 \) \((\)in this case \( 25t^2)\), so the resulting equation was linear in \( t \).
Next, try to lift these two solutions to solutions mod \( 625\). Start with \(x=17+25t:\)
\[\begin{align} (17+25t)^2 &\equiv 14 \pmod{625} \\ 289+850t &\equiv 14 \pmod{625} \\ 225t &\equiv 350 \pmod{625} \\ 9t &\equiv 14 \pmod{25}, \end{align}\]
which gives \( t \equiv 21 \) mod \( 25 \), so \( x\equiv 542\). Similarly, lifting \( x=8+25t \) yields \( t\equiv 3\) mod \( 25\), so \( x \equiv 83 \). So there are two solutions, \( x \equiv \pm 83 \) mod \( 625\). \(_\square\)
The process is straightforward, but it leads to several questions.
Are lifts always guaranteed to exist?
How many lifts can there be?
How many steps are required? \((\)The modulus stepped from \( 5 \) to \( 25 \) to \( 625.)\)
Does this work for any polynomial equation?
The answers to these questions are contained in the statement of Hensel's lemma.
Statement of the Lemma
Let \( f(x) \) be a polynomial with integer coefficients. Let \( k \) be a positive integer, and \( r \) an integer such that \( f(r) \equiv 0 \) mod \( p^k \). Suppose \( m \le k \) is a positive integer. Then if \( f'(r) \not\equiv 0 \) mod \( p \), there is an integer \( s \) such that \( f(s) \equiv 0 \) mod \( p^{k+m} \) and \( s \equiv r \) mod \( p^k \). So \( s \) is a "lifting" of \( r \) to a root mod \( p^{k+m} \). Moreover, \( s\) is unique mod \( p^{k+m} \).
The restriction that \( m \le k \) is not particularly important in practice since the lemma can be applied recursively, to lift a solution to solutions modulo higher and higher powers of \( p \). That is, given \( r\) as in the lemma, with the assumptions in the lemma, there is a unique solution \( s_N \) to \( f(x) \equiv 0 \) mod \( p^N \) such that \( s_N \equiv r \) mod \( p^k \), for any \( N \ge k \).
Let \( f(x) = x^3-3 \) and \( p = 5 \). Note that \( f(2) = 0 \) mod \( 5 \) and \( f'(2) = 3(2)^2 \ne 0 \) mod \( 5 \). Then Hensel's lemma implies that there is a unique lift \( s = 2+5t \) mod \( 25 \) such that \( f(s) \equiv 0 \) mod \( 25 \). The proof supplies a formula for computing \( s\), but for now we can find \( s = 12 \) by inspection.
The process can be repeated: \( s=12 \) has a unique lift to a root mod \( 5^3 \), which turns out to be \( 87 \). The conditions of Hensel's lemma apply indefinitely, so there are unique lifts to a root mod \( 5^k \) for all \( k \).
We can write the solution mod \( 5^k \) as a series \( 2 + 2 \cdot 5 + 3 \cdot 5^2 + \cdots \). In the right context, this can be viewed as a kind of power series.
Let \( f(x) = x^3+2 \) and \( p = 3 \). Then \( r = 1 \) is the unique root of \( f \) mod \( 3 \), but in fact there are no roots of \( f \) mod \( 9 \). The conditions of Hensel's lemma are not satisfied because \( f'(r) = 3r^2 \equiv 0 \) mod \( p \).
Note that if \( g(x) = x^3-1 \) and \( p = 3\), the root \( r= 1 \) lifts to three different roots \( 1,4,7 \) mod \( 9 \).
In general, if \( f'(r) \equiv 0 \) mod \( p \), the lifting behavior of roots congruent to \( r \) is somewhat unpredictable. Sometimes there are no lifts and sometimes there are multiple lifts.
Let \( f(x) = x^3-x-2 \). How many roots does \( f(x) \) have mod \( 2^{10}?\)
Both \( 0 \) and \( 1 \) are roots mod \( 2 \). Since \( f'(0) \not\equiv 0 \) mod \( 2\), Hensel's lemma immediately implies a unique lift of this root modulo any higher power of \( 2 \).
On the other hand, \( f'(1) \equiv 0 \) mod \( 2\). The next section will discuss what can be done in this situation in general, but for this particular polynomial it is enough to note that \( f(1) \) and \( f(3) \) are both \( 2 \) mod \( 4 \), so there is no root mod \( 4 \) that lifts \( 1 \) mod \( 2 \), hence no such roots modulo any higher power.
So the answer is that there is exactly one root mod \( 2^{10} \). \(_\square\)
Here is a problem where Hensel's lemma rules out all but a few possibilities, which should be checked in detail:
Proof
The polynomial \( f\big(r+p^kt\big)\) has a Taylor series expansion
\[f\big(r+p^kt\big) = f(r) + f'(r) p^k t + \frac{f''(r)}{2!} p^{2k} t^2 + \cdots. \]
Note that this is in fact a finite sum since \( f \) is a polynomial, so there are no questions of convergence to worry about. Also note that \( \frac{f^{(d)}(r)}{d!} \) is actually an integer for all \( d \) \(\big(\)use the power rule and the fact that the binomial coefficient \( \binom{s}{d} \) is always an integer\(\big).\)
Since \( m\le k\), all the terms but the first two vanish mod \( p^{k+m}\), so
\[f\big(r+p^kt\big) \equiv f(r) +f'(r) p^k t \pmod{p^{k+m}}.\]
Setting \( f\big(r+p^kt\big) \equiv 0 \), we can solve for \( t: \)
\[\begin{align} f(r) + f'(r) p^k t &\equiv 0 \pmod{p^{k+m}} \\\\ p^k t &\equiv -\frac{f(r)}{f'(r)} \pmod{p^{k+m}} \\\\ t &\equiv -\frac{ \hspace{1.5mm} \frac{f(r)}{p^k} \hspace{1.5mm} }{f'(r)} \pmod{p^m}, \end{align}\]
which gives rise to a unique solution mod \( p^{k+m} \). Note that the numerator of this expression is an integer because \( f(r) \) is divisible by \( p^k \), and the quotient makes sense because \( f'(r) \) has a multiplicative inverse mod \( p^m \); this is where the assumptions about \( r\) are being used. \(_\square \)
A Stronger Version of the Lemma
There are stronger versions of Hensel's lemma that can be applied to cases when \( f'(r) \equiv 0 \) mod \( p \). Here is a more general statement:
Let \( f(x) \) be a polynomial with integer coefficients. Let \( k \) be a positive integer, and \( r \) an integer such that \( f(r) \equiv 0 \) mod \( p^k \), \( f(r) \not\equiv 0\) mod \( p^{k+1} \). Suppose \( d < \frac k2 \) is a positive integer. Then if \( f'(r) \equiv 0\) mod \( p^d \) but \( f'(r) \not\equiv 0 \) mod \( p^{d+1} \), then the following holds:
For every positive integer \( N>k-d \), there is a unique integer \( s_N \) such that \( f(s_N) \equiv 0 \) mod \( p^N \) and \( s_N \equiv r \) mod \( p^{k-d} \). Also, \( s_M \equiv s_N \) mod \( p^{\text{min}(M,N)} \).
Note that \( s_N \) may not be congruent to \( r \) mod \( p^k \); that is, the original \( r \) may not lift to a root modulo higher powers. But the theorem says that some \( s_{k-d}\) congruent to \( r \) mod \( p^{k-d}\) will be liftable to a root modulo higher powers.
Let \( f(x) = x^2-17 \) and \( r = 1 \). Then we can take \(k = 4 \) and \( d = 1 \). Since \( d<\frac k2 \) as desired, the stronger form of Hensel's lemma applies. Even though \( f(1) \equiv 0 \) mod \( 2^4 \), there is no lift of \( 1 \) to a root mod \( 2^5 \). But there is a lift of \( 9 \) to a root mod \( 2^4, 2^5, \) and so on. \(\big(\)Similarly, note that \( 9 \) is a root mod \( 2^6 \) but there is no lift of \( 9 \) to a root mod \( 2^7 \); but there is a lift of 41...\(\big)\)
So the lifting process is more complicated and often requires some "backtracking" in practice when \( d \ne 0 \).
The \(p\)-adic Integers
Note that the formula for the lift \( s \) of \( r \) simplifies to
\[s = r - \frac{f(r)}{f'(r)},\]
which is exactly the formula employed in Newton's method for approximating real roots of polynomials. This is not a coincidence!
(The following paragraph can safely be skipped by anyone but the very curious.)
There is a set of \(p\)-adic integers that is a sort of limit (an "inverse limit") of the sets \( {\mathbb Z}_{p^k} \). Elements of the set can be written as power series of the form \( a_0 + a_1 p + a_2 p^2 + \cdots, \) where an appropriate topology is put on the set so that \( p \) is "small" and higher powers of \( p \) are "smaller" (in the same way that the powers of \( x \) in typical power series are small). In this context, the process of Hensel lifting is in fact just Newton's method: solutions mod \( p^k \) for higher powers of \( p \) are better and better approximations to a \( p \)-adic solution.