The so-called Pell equation $x^2 - ny^2 = 1$ (wrongly attributed to Pell
by Euler) is one of the oldest equations in mathematics and it is
fundamental to the study of quadratic Diophantine equations. The
Greeks studied the special case $x^2 - 2y^2 = 1$ because they realized
that its natural number solutions throw light on the nature of $\sqrt{2}$.
There is a similar connection between the natural number solutions
of $x^2 - ny^2 = 1$ and $\sqrt{n}$ when $n$ is any nonsquare natural number.
The irrationality of $\sqrt{n}$ when $n$ is nonsquare causes strange behavior
in the solutions of $x^2 - ny^2 = 1$. Nevertheless, the irrationality of
5.1 Side and diagonal numbers
$$\frac{x_i^2}{y_i^2} = 2 + \frac{1}{y_i^2} \to 2 \quad \text{as } y_i \to \infty.$$
$$d_1 = 3, \quad s_1 = 2,$$
$$d_{i+1} = d_i + 2s_i, \quad s_{i+1} = d_i + s_i.$$
$$d_1^2 - 2s_1^2 = 1, \qquad d_{i+1}^2 - 2s_{i+1}^2 = -(d_i^2 - 2s_i^2).$$
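The recurrence for the diagonal and side numbers is easy to run by machine. The following is a minimal Python sketch, not part of the original text; the function name `side_diagonal` is an illustrative choice. It lists the first few pairs and the values of $d_i^2 - 2s_i^2$, which should alternate in sign as the last identity predicts.

```python
def side_diagonal(count):
    """Return the first `count` (diagonal, side) pairs, starting from (3, 2)."""
    d, s = 3, 2  # d_1 and s_1 as given in the text
    pairs = []
    for _ in range(count):
        pairs.append((d, s))
        # d_{i+1} = d_i + 2 s_i,  s_{i+1} = d_i + s_i
        d, s = d + 2 * s, d + s
    return pairs

pairs = side_diagonal(6)
# Each value is the negative of the one before it, starting from 1.
signs = [d * d - 2 * s * s for d, s in pairs]
print(pairs)
print(signs)
```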
then $a_1 = a_2$ and $b_1 = b_2$.
5 The Pell equation
Exercises
In the sections that follow we use numbers of the form $x_i + y_i\sqrt{n}$ to encode
solution pairs of $x^2 - ny^2 = 1$. To give a taste of how this works, the following
two exercises use numbers of the form $a + b\sqrt{2}$ to encode (diagonal, side) pairs.
5.1.1 Check that $(1 + \sqrt{2})^2 = 3 + 2\sqrt{2}$ and that
$$(x + y\sqrt{2})(1 + \sqrt{2}) = x + 2y + (x + y)\sqrt{2}.$$
5.1.2 Use induction to show from Exercise 5.1.1 that $(1 + \sqrt{2})^{n+1} = d_n + s_n\sqrt{2}$.
When $n$ is an integer square, the equation $x^2 - ny^2 = 1$ is not so interesting,
so we dispose of it right now.
5.1.3 By factorizing the left-hand side of $x^2 - y^2 = 1$, show that it has only two
integer solutions.
5.1.4 Show similarly that $x^2 - ny^2 = 1$ has only two integer solutions when $n$ is a
square positive integer.
To show that this rule gives a new solution we first calculate $x_3$ and
$y_3$. Expanding the left-hand side, and collecting its rational and irrational
parts, we find that
$$x_3 = x_1x_2 + 2y_1y_2, \qquad y_3 = x_1y_2 + y_1x_2.$$
It can then be checked by multiplication that
$$(x_1x_2 + 2y_1y_2)^2 - 2(x_1y_2 + y_1x_2)^2 = (x_1^2 - 2y_1^2)(x_2^2 - 2y_2^2) = 1 \times 1 = 1.$$
Hence $x_3^2 - 2y_3^2 = 1$, as required. $\Box$
Examples. Composing the solution (3, 2) with itself, we get a new solution
$(x_3, y_3)$, where
$$x_3 + y_3\sqrt{2} = (3 + 2\sqrt{2})^2 = 9 + 8 + 12\sqrt{2} = 17 + 12\sqrt{2}.$$
Equating rational and irrational parts, $x_3 = 17$, $y_3 = 12$, which is indeed
another solution. If we then compose (17, 12) with (3, 2) we get
$$(17 + 12\sqrt{2})(3 + 2\sqrt{2}) = 51 + 48 + (36 + 34)\sqrt{2} = 99 + 70\sqrt{2},$$
hence another solution is (99, 70), and so on. By this process we can obtain
infinitely many integer solutions, but it is not clear how close we are
to finding all integer solutions. The situation becomes clearer when we
observe that a group structure is present.
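The composition rule for $n = 2$ can be tried out directly. Below is a small Python sketch (the names `compose`, `base`, and `solutions` are illustrative, not from the text) that composes (3, 2) with itself repeatedly and checks that each result still satisfies $x^2 - 2y^2 = 1$.

```python
def compose(p, q):
    """Compose two solutions of x^2 - 2y^2 = 1 by multiplying
    (x1 + y1*sqrt(2)) and (x2 + y2*sqrt(2)) and reading off both parts."""
    x1, y1 = p
    x2, y2 = q
    return (x1 * x2 + 2 * y1 * y2, x1 * y2 + y1 * x2)

base = (3, 2)
solutions = [base]
for _ in range(4):
    solutions.append(compose(solutions[-1], base))

# Every composed pair should again be a solution.
for x, y in solutions:
    assert x * x - 2 * y * y == 1
print(solutions)
```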
Exercises
Another way to arrive at the composition rule is to use the irrational factorization
$$x^2 - 2y^2 = (x - y\sqrt{2})(x + y\sqrt{2}). \tag{*}$$
We suppose that $1 = x_1^2 - 2y_1^2$ and $1 = x_2^2 - 2y_2^2$, so that
$$1 = 1 \times 1 = (x_1^2 - 2y_1^2)(x_2^2 - 2y_2^2). \tag{**}$$
5.2.1 Apply the factorization (*) to each factor on the right-hand side of (**),
then combine the factors in a different way to show that
$$1 = [x_1x_2 + 2y_1y_2 - (x_1y_2 + y_1x_2)\sqrt{2}] \times [x_1x_2 + 2y_1y_2 + (x_1y_2 + y_1x_2)\sqrt{2}].$$
In Section 5.4 we generalize this method to find a composition rule for
solutions of $x^2 - ny^2 = 1$.
understand this group we first focus on the subgroup of positive numbers
$x + y\sqrt{2}$ where $x^2 - 2y^2 = 1$.
Structure of positive solutions. The group of positive $x + y\sqrt{2}$, where
$(x, y)$ is an integer solution of $x^2 - 2y^2 = 1$, is the infinite cyclic group of
powers of $3 + 2\sqrt{2}$.
To see why, apply the log function to all the positive numbers $x + y\sqrt{2}$
where $x, y$ are integers such that $x^2 - 2y^2 = 1$. Since $\log(ab) = \log a + \log b$,
the resulting numbers $\log(x + y\sqrt{2})$ then form a group under $+$.
This group has a least positive element, $\log(3 + 2\sqrt{2})$, because
• $3 + 2\sqrt{2}$ is the least $x + y\sqrt{2}$ corresponding to solutions $(x, y)$ with
$x, y > 0$,
But any such group of numbers consists of the integer multiples of its
least positive element $m$: if any element $k$ lies between multiples of $m$,
Exercises
Suppose we define integer pairs $(u_k, v_k)$ by the equation
$$u_k + v_k\sqrt{2} = (3 + 2\sqrt{2})^k \quad \text{for all integers } k.$$
Then what we have just proved is that the pairs $(u_k, v_k)$ are all the integer solutions
$(x, y)$ of $x^2 - 2y^2 = 1$ with $x$ positive. It is now quite easy to express $u_k$ and $v_k$ as
explicit functions of $k$, though (not surprisingly) these functions involve $\sqrt{2}$.
5.3.1 Given that $(3 + 2\sqrt{2})^k = u_k + v_k\sqrt{2}$, what is $(3 - 2\sqrt{2})^k$?
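5.3.2 Deduce from Exercise 5.3.1 that
$$u_k = \frac{1}{2}\left[(3 + 2\sqrt{2})^k + (3 - 2\sqrt{2})^k\right], \qquad v_k = \frac{1}{2\sqrt{2}}\left[(3 + 2\sqrt{2})^k - (3 - 2\sqrt{2})^k\right].$$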
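The claim that the pairs $(u_k, v_k)$ exhaust the solutions with $x$ positive can be spot-checked on a finite range. The sketch below is only an illustration in Python; `power_solution` and the search bound 600 are assumptions made here, not part of the text. It computes the pairs for $-4 \le k \le 4$ with exact integer arithmetic and compares them with a brute-force search.

```python
def power_solution(k):
    """Return (u_k, v_k) with u_k + v_k*sqrt(2) = (3 + 2*sqrt(2))**k, for any integer k."""
    # For negative k multiply by the inverse 3 - 2*sqrt(2),
    # since (3 + 2*sqrt(2)) * (3 - 2*sqrt(2)) = 1.
    step = (3, 2) if k >= 0 else (3, -2)
    u, v = 1, 0  # k = 0 gives the trivial solution 1 + 0*sqrt(2)
    for _ in range(abs(k)):
        u, v = u * step[0] + 2 * v * step[1], u * step[1] + v * step[0]
    return u, v

powers = {power_solution(k) for k in range(-4, 5)}

# Brute-force search for all integer solutions with 0 < x <= 600.
brute = {(x, y) for x in range(1, 601) for y in range(-600, 601)
         if x * x - 2 * y * y == 1}

print(sorted(brute))
print(brute == powers)
```

Negative exponents give the solutions with negative $y$, which is why the comparison uses all integers $k$ and not only $k \ge 0$.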
5.4 The general Pell equation and $\mathbb{Z}[\sqrt{n}]$
If $n$ is a nonsquare integer we define
$$\mathbb{Z}[\sqrt{n}] = \{x + y\sqrt{n} : x, y \in \mathbb{Z}\}.$$
Just as we used the numbers $x + y\sqrt{2}$ to study $x^2 - 2y^2 = 1$ we use the
numbers $x + y\sqrt{n}$ to study $x^2 - ny^2 = 1$.
In fact, $x^2 - ny^2$ is what we call the norm of $x + y\sqrt{n}$ in $\mathbb{Z}[\sqrt{n}]$, the
product of $x + y\sqrt{n}$ by its conjugate $x - y\sqrt{n}$:
$$\operatorname{norm}(x + y\sqrt{n}) = (x - y\sqrt{n})(x + y\sqrt{n}) = x^2 - ny^2.$$
Thus finding solutions of the Pell equation is the same as finding elements
of $\mathbb{Z}[\sqrt{n}]$ with norm 1.
The advantage of searching in $\mathbb{Z}[\sqrt{n}]$, rather than among pairs $(x, y)$ of
integers, is that we can use algebra on numbers in $\mathbb{Z}[\sqrt{n}]$.
This generalizes the “composition” rule used for $n = 2$ in Section 5.2 and
it may be proved as follows, using factorization in $\mathbb{Z}[\sqrt{n}]$.
Since $(x_1, y_1)$ and $(x_2, y_2)$ are solutions,
Therefore
$$(2 \times 2 + 3 \times 1 \times 1,\ 2 \times 1 + 1 \times 2) = (7, 4),$$
$$(2 \times 7 + 3 \times 1 \times 4,\ 2 \times 4 + 1 \times 7) = (26, 15),$$
and so on. These solutions correspond to the powers of $2 + \sqrt{3}$.
coefficients $x, y$ but also with rational coefficients, that is, quotients of
integers. We use the symbol $\mathbb{Q}$ (“quotients”) for the rational numbers and
make the natural generalization of $\mathbb{Z}[\sqrt{n}]$ to
$$\mathbb{Q}[\sqrt{n}] = \{x + y\sqrt{n} : x, y \in \mathbb{Q}\}.$$
This set of numbers is the set of quotients of elements of $\mathbb{Z}[\sqrt{n}]$ and it is a
number field, that is, closed under $+$, $-$, $\times$, and $\div$ (by nonzero members).
The closure properties are easily checked by calculation (exercises).
We extend the definition of norm to $\mathbb{Q}[\sqrt{n}]$ by the same formula
$$\operatorname{norm}(x + y\sqrt{n}) = x^2 - ny^2.$$
This formula remains meaningful because each element of $\mathbb{Q}[\sqrt{n}]$ is uniquely
expressible as $x + y\sqrt{n}$ with $x, y \in \mathbb{Q}$, by the argument of Section 5.1.
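Multiplicative property of the norm. For any $\alpha$ and $\beta$ in $\mathbb{Q}[\sqrt{n}]$,
$$\operatorname{norm}(\alpha)\operatorname{norm}(\beta) = \operatorname{norm}(\alpha\beta).$$
Proof. Let $\alpha = x_1 + y_1\sqrt{n}$ and $\beta = x_2 + y_2\sqrt{n}$. Then
$$\begin{aligned}
\operatorname{norm}(\alpha)\operatorname{norm}(\beta) &= (x_1^2 - ny_1^2)(x_2^2 - ny_2^2) \\
&= (x_1x_2 + ny_1y_2)^2 - n(x_1y_2 + y_1x_2)^2 \quad \text{by the calculation above} \\
&= \operatorname{norm}(\alpha\beta). \qquad \Box
\end{aligned}$$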
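For readers who like to experiment, here is a minimal Python sketch of arithmetic in $\mathbb{Q}[\sqrt{n}]$ using exact fractions. The class name `QuadraticNumber` and the sample values are assumptions chosen for illustration. It checks the multiplicative property of the norm on one pair of elements and shows that an element times its inverse is 1.

```python
from fractions import Fraction

class QuadraticNumber:
    """x + y*sqrt(n) with rational x, y; n is a fixed nonsquare integer."""

    def __init__(self, x, y, n):
        self.x, self.y, self.n = Fraction(x), Fraction(y), n

    def __mul__(self, other):
        n = self.n
        return QuadraticNumber(self.x * other.x + n * self.y * other.y,
                               self.x * other.y + self.y * other.x, n)

    def norm(self):
        return self.x ** 2 - self.n * self.y ** 2

    def inverse(self):
        # 1/(x + y*sqrt(n)) = (x - y*sqrt(n)) / norm.
        # The norm is nonzero unless x = y = 0, because n is not a square.
        nm = self.norm()
        return QuadraticNumber(self.x / nm, -self.y / nm, self.n)

alpha = QuadraticNumber(Fraction(1, 2), Fraction(3, 4), 3)
beta = QuadraticNumber(2, Fraction(-5, 7), 3)
product = alpha * beta
print(alpha.norm() * beta.norm() == product.norm())

unit = alpha * alpha.inverse()
print(unit.x, unit.y)
```

Using `Fraction` rather than floating point keeps every comparison exact, so the check is a genuine instance of the identity and not a numerical approximation.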
Exercises
5.4.1 Show that sums, differences, and products of numbers in $\mathbb{Q}[\sqrt{n}]$ are
themselves numbers in $\mathbb{Q}[\sqrt{n}]$.
5.4.2 Show that $1/(x + y\sqrt{n})$ for $x, y \in \mathbb{Q}$ (not both zero) is of the form $x' + y'\sqrt{n}$
for $x', y' \in \mathbb{Q}$. Deduce that $\mathbb{Q}[\sqrt{n}]$ is closed under $\div$ by nonzero members.
The multiplicative property of the norm can be restated as follows.
5.4.3 If $(x_1, y_1)$ satisfies $x^2 - ny^2 = k_1$ and $(x_2, y_2)$ satisfies $x^2 - ny^2 = k_2$, show
that $(x_1x_2 + ny_1y_2,\ x_1y_2 + y_1x_2)$ satisfies $x^2 - ny^2 = k_1k_2$.
Brahmagupta used this fact to solve $x^2 - ny^2 = 1$ via easier equations $x^2 - ny^2 = k$.
His method is most convenient when there is an obvious solution of $x^2 - ny^2 = -1$.
5.4.4 Find a nontrivial solution of $x^2 - 17y^2 = -1$ by inspection, and use it to find
a nontrivial solution of $x^2 - 17y^2 = 1$.
5.4.5 Similarly find a nontrivial solution of $x^2 - 37y^2 = 1$.
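The composition in Exercise 5.4.3 can be sketched in a few lines of Python. The helper names `compose` and `value` and the inputs for $n = 5$ and $n = 10$ are illustrative choices made here (they are deliberately not the values asked for in the exercises). Composing a solution of $x^2 - ny^2 = -1$ with itself should give a solution of $x^2 - ny^2 = 1$.

```python
def compose(p, q, n):
    """Combine solutions of x^2 - n*y^2 = k1 and x^2 - n*y^2 = k2 into one for k1*k2."""
    x1, y1 = p
    x2, y2 = q
    return (x1 * x2 + n * y1 * y2, x1 * y2 + y1 * x2)

def value(p, n):
    """Return x^2 - n*y^2 for the pair p = (x, y)."""
    x, y = p
    return x * x - n * y * y

# Illustrative inputs (not from the text): small solutions of x^2 - n*y^2 = -1.
examples = {5: (2, 1), 10: (3, 1)}
for n, p in examples.items():
    squared = compose(p, p, n)
    print(n, p, value(p, n), "->", squared, value(squared, n))
```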
two of the numbers. The difference between these two numbers, which is
of the form $a - b\sqrt{n}$ for some integers $a$ and $b$, is therefore irrational and
such that
$$|a - b\sqrt{n}| < \frac{1}{B}.$$
Also, $b < B$ because $b$ is the difference of two positive integers less than $B$. $\Box$
The next few steps are short and directed towards applications of the
infinite pigeonhole principle.
and therefore
$$|a^2 - nb^2| \le \frac{1}{b} \cdot 3b\sqrt{n} = 3\sqrt{n}.$$
Hence there are infinitely many $a - b\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ with norm of size
$\le 3\sqrt{n}$.
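The bound on the norm can be observed numerically. The following Python sketch is an illustration only: the function `good_approximations`, the choice $n = 7$, and the search limit 200 are assumptions, and the test $|a - b\sqrt{n}| < 1/b$ is done in floating point. It collects pairs $(a, b)$ passing that test and lists the norms $a^2 - nb^2$ that occur, all of which should have absolute value at most $3\sqrt{n}$.

```python
import math

def good_approximations(n, max_b):
    """Pairs (a, b), 1 <= b <= max_b, with |a - b*sqrt(n)| < 1/b (floating-point check)."""
    root = math.sqrt(n)
    found = []
    for b in range(1, max_b + 1):
        a = round(b * root)  # nearest integer to b*sqrt(n)
        if abs(a - b * root) < 1 / b:
            found.append((a, b))
    return found

n = 7
approx = good_approximations(n, 200)
# Norms are computed with exact integer arithmetic.
norms = [a * a - n * b * b for a, b in approx]
print(approx[:6])
print(sorted(set(norms)))
```

Only a handful of distinct norm values appear, which is what makes the pigeonhole argument in the next steps possible.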
• $a_1 \equiv a_2 \pmod{N}$,
• $b_1 \equiv b_2 \pmod{N}$.
The final step uses the quotient $a - b\sqrt{n}$ of the two numbers just found.
Its norm $a^2 - nb^2$ is clearly 1 by the multiplicative property of norm. It
is not so clear that $a$ and $b$ are integers, but this now follows from the
congruence conditions in step 4.
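The whole argument can be turned into a (very inefficient) search procedure. The Python sketch below is an assumption-laden illustration, not the method of the text: `pell_by_pigeonhole` uses a floating-point approximation test, a finite search bound `max_b`, and simply takes the first two elements it meets with the same norm $N$ and the same residues mod $N$. Their quotient then has norm 1 and integer coefficients.

```python
import math

def pell_by_pigeonhole(n, max_b=500):
    """Search for a nontrivial solution of x^2 - n*y^2 = 1 using two
    congruent elements a - b*sqrt(n) of equal norm N."""
    root = math.sqrt(n)
    seen = {}  # (N, a mod |N|, b mod |N|) -> first pair (a, b) found with that key
    for b in range(1, max_b + 1):
        a = round(b * root)
        if abs(a - b * root) >= 1 / b:
            continue
        N = a * a - n * b * b  # nonzero because n is not a square
        key = (N, a % abs(N), b % abs(N))
        if key in seen:
            a1, b1 = seen[key]
            # (a1 - b1*sqrt(n)) / (a - b*sqrt(n))
            #   = (a1 - b1*sqrt(n)) * (a + b*sqrt(n)) / N
            x_num = a1 * a - n * b1 * b
            y_num = a1 * b - b1 * a
            assert x_num % N == 0 and y_num % N == 0  # guaranteed by the congruences
            return abs(x_num // N), abs(y_num // N)
        seen[key] = (a, b)
    return None  # no collision found within the search bound

for n in (2, 3, 5, 7):
    print(n, pell_by_pigeonhole(n))
```

The existence proof guarantees a collision eventually, but gives no useful bound on how far one must search; practical methods are discussed in Section 5.9.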
The first congruence follows from the fact that $a_1^2 - nb_1^2 = N$, which
implies
1. $f(kv) = k^2 f(v)$,
Also
Hence
$$\begin{aligned}
f(v_1 + v_2) + f(v_1 - v_2) &= 2Ax_1^2 + 2Bx_1y_1 + 2Cy_1^2 + 2Ax_2^2 + 2Bx_2y_2 + 2Cy_2^2 \\
&= 2[f(v_1) + f(v_2)] \qquad \Box
\end{aligned}$$
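Both identities are easy to test on sample data. The Python sketch below is illustrative: the helper names, the random seed, and the example form $x^2 - 3y^2$ are choices made here. It checks $f(kv) = k^2 f(v)$ and $f(v_1 + v_2) + f(v_1 - v_2) = 2[f(v_1) + f(v_2)]$ on 100 random integer vectors.

```python
import random

def make_form(A, B, C):
    """Return f(v) = A*x^2 + B*x*y + C*y^2 as a function of a vector v = (x, y)."""
    return lambda v: A * v[0] ** 2 + B * v[0] * v[1] + C * v[1] ** 2

def add(v, w):
    return (v[0] + w[0], v[1] + w[1])

def sub(v, w):
    return (v[0] - w[0], v[1] - w[1])

random.seed(0)  # fixed seed so the run is repeatable
f = make_form(1, 0, -3)  # the form x^2 - 3y^2, chosen only as an example
checks = []
for _ in range(100):
    v1 = (random.randint(-9, 9), random.randint(-9, 9))
    v2 = (random.randint(-9, 9), random.randint(-9, 9))
    k = random.randint(-5, 5)
    prop1 = f((k * v1[0], k * v1[1])) == k * k * f(v1)
    prop2 = f(add(v1, v2)) + f(sub(v1, v2)) == 2 * (f(v1) + f(v2))
    checks.append(prop1 and prop2)
print(all(checks))
```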
A simple consequence of Property 1 is that $f(-v) = f(v)$, so a quadratic
form makes no distinction between a vector $v$ and its negative. Property 1
also says that $f(kv)$ is a multiple of $f(v)$; in particular $f(v)$ is prime (or
1) only for vectors $v = (x, y)$ that are not integer multiples of other integer
vectors, that is, for $(x, y)$ with relatively prime $x$ and $y$. We call these
primitive vectors.
In Section 2.8 we found a map of all the primitive vectors with positive
$x$ and $y$. We also found that the latter vectors are generated from i = (1, 0)
and j = (0, 1) by the processes $(v_1, v_2) \to (v_1 + v_2, v_2)$ and $(v_1, v_2) \to
(v_1, v_1 + v_2)$. In the next section we see that vectors with $x$ and $y$ of opposite
sign are similarly generated from (0, −1) and (1, 0). Then Property 2
shows that there is a simple relation between the values of $f$ at successive
stages in these processes. This leads to a “map” of the values of $f$.
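The two generating processes can be simulated to see which vectors they produce. This is a rough Python sketch; the function `generate` and the depth 6 are assumptions for illustration. It starts from the pair (i, j), applies both processes repeatedly, and collects every vector that appears; all of them should be primitive.

```python
from math import gcd

def generate(depth):
    """Apply (v1, v2) -> (v1 + v2, v2) and (v1, v2) -> (v1, v1 + v2)
    up to `depth` times, starting from the pair (i, j)."""
    i, j = (1, 0), (0, 1)
    pairs = [(i, j)]
    vectors = {i, j}
    for _ in range(depth):
        next_pairs = []
        for v1, v2 in pairs:
            s = (v1[0] + v2[0], v1[1] + v2[1])  # the vector sum v1 + v2
            vectors.add(s)
            next_pairs.extend([(s, v2), (v1, s)])
        pairs = next_pairs
    return vectors

vectors = generate(6)
print(sorted(vectors)[:10])
print(all(gcd(x, y) == 1 for x, y in vectors))
```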
Equivalent forms
Another view of a quadratic form $f$, related to the one described above,
surveys all equivalent forms $f^*(x, y) = f(px + qy, rx + sy)$, obtained by
replacing the row vector $(x \;\; y)$ by
$$(px + qy \;\;\; rx + sy) = (x \;\; y)\begin{pmatrix} p & r \\ q & s \end{pmatrix} = (x \;\; y)M,$$
where the matrix $M$ and its inverse $M^{-1}$ both have integer entries. When $M$
satisfies these conditions, the pairs $(px + qy, rx + sy)$ run through the set $\mathbb{Z}^2$
of all integer pairs when $(x, y)$ does. Indeed, if $(x', y')$ is any integer pair,
we have
$$(x' \;\; y') = (x \;\; y)M \iff (x \;\; y) = (x' \;\; y')M^{-1}.$$
Thus equivalent forms have the same set of values. Examples are $x^2 + y^2$
and $x^2 + 2xy + 2y^2$, the latter obtained from $x^2 + y^2$ when $(x, y)$ is replaced
by $(x + y, y)$.
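The change of variables can be carried out on the coefficients directly. In the Python sketch below, a form is stored as a triple `(A, B, C)` and `substitute` expands $f(px + qy, rx + sy)$; both names, and the integer-valued helper `four_det` for $4(AC - B^2/4)$, are illustrative assumptions. The example reproduces $x^2 + 2xy + 2y^2$ from $x^2 + y^2$.

```python
def substitute(form, p, q, r, s):
    """Coefficients of f(px + qy, rx + sy) for f = A*x^2 + B*x*y + C*y^2."""
    A, B, C = form
    A_new = A * p * p + B * p * r + C * r * r                    # equals f(p, r)
    B_new = 2 * A * p * q + B * (p * s + q * r) + 2 * C * r * s  # coefficient of xy
    C_new = A * q * q + B * q * s + C * s * s                    # equals f(q, s)
    return (A_new, B_new, C_new)

def four_det(form):
    """4 * (AC - B^2/4), kept as an integer."""
    A, B, C = form
    return 4 * A * C - B * B

sum_of_squares = (1, 0, 1)                         # x^2 + y^2
shifted = substitute(sum_of_squares, 1, 1, 0, 1)   # replace (x, y) by (x + y, y)
print(shifted, four_det(sum_of_squares), four_det(shifted))
```

The two printed determinants agree, as expected for equivalent forms.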
When $M$ and $M^{-1}$ both have integer entries, then $\det M$ and $\det M^{-1}$
are both integers. Since
$$MM^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
$$\det M \cdot \det M^{-1} = 1$$
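So it follows from what we have just seen that any equivalent form is
obtained by replacing
$$\begin{pmatrix} A & B/2 \\ B/2 & C \end{pmatrix} \quad \text{by} \quad M\begin{pmatrix} A & B/2 \\ B/2 & C \end{pmatrix}M^{\mathrm{T}},$$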
Exercises
Although equivalent forms have the same determinant, the converse is not always
true. It so happens that the form $x^2 + y^2$ is equivalent to all other forms with
determinant 1, but $x^2 + 5y^2$ is not equivalent to all other forms with determinant 5.
5.6.1 Show that $13x^2 + 16xy + 5y^2$ has determinant 1, and that it is equivalent to
$x^2 + y^2$ via the matrix $M = \begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix}$.
5.6.2 Show that $2x^2 + 2xy + 3y^2$ has the same determinant as $x^2 + 5y^2$, but that
it is not equivalent to $x^2 + 5y^2$, by showing that $x^2 + 5y^2$ does not take the
value 7.
5.6.3 More generally, show that $x^2 + 5y^2$ takes no values $\equiv 3$ or $7 \pmod{20}$, by
working out the possible values of $x^2 + 5y^2 \pmod{20}$.
Also in the right half of the figure we have the schematic vector sum
rule that generates all the labels from (1, 0) and (0, 1), and in the left half
the mirror image rule that obviously applies there.
5.7* The map of primitive vectors
We put these two maps side by side because we want to join them
together, but we seem prevented from doing so by the incompatible labels,
(0, 1) and (0, −1), in the upper central region. The conflict can be resolved
by giving each label a ± sign. This yields Figure 5.2, which we call the
(complete) map of primitive vectors, for the obvious reason that it contains
every primitive vector. The ± labelling fuses the two vector sum rules into
the single vector difference/sum rule shown at the bottom of the Figure.
[Figure 5.3: the regions ±(1, 1), ±(1, 2), ±(2, 3), and ±(3, 5) in the map, shown beside the schematic difference/sum rule.]
Figure 5.3 shows how lines may be deformed to conform with the
schematic diagram for the difference/sum rule (in particular, the edge
common to regions ±(1, 2) and ±(2, 3) is not really horizontal) within bounds
that preserve the meanings of “above”, “below”, “right end”, and “left end”
for the edge common to the regions ±(1, 2) and ±(2, 3). Here the choice
so at the right end $\pm(3, 5) = \pm(v_1 + v_2)$ and at the left end $\pm(1, 1) =
\pm(v_1 - v_2)$, as required.
It follows from the vector sum rules in the separate left and right maps
in Figure 5.1 that the vector difference/sum rule holds in the complete map.
This is proved by a finite number of simple checks, similar to the example
above but more general. The details are left to the exercises.
The sign ambiguity ±(x, y) has no effect on the value of a quadratic
form because
Hence the map of primitive vectors gives an unambiguous map of all values
of the quadratic form $f(x, y) = Ax^2 + Bxy + Cy^2$ for relatively prime $x$ and
$y$, obtained by entering each value $f(a, b)$ in the region ±(a, b). Moreover,
it is possible to see some pattern in this map, thanks to the parallel between
the vector difference/sum rule and Property 2 of quadratic forms proved
in the previous section. We show this in the next section, assisted by the
invariance of the determinant $AC - B^2/4$ under change of variables. The
complete map also displays such changes, as we are about to see.
(x, y) = x(1, 0) + y(0, 1) and (px + qy, rx + sy) = x(p, r) + y(q, s),
this amounts to replacing the vectors (1, 0) and (0, 1) by the new vectors
(p, r) and (q, s). We call the pair of vectors (1, 0) and (0, 1) an integral
basis of $\mathbb{Z}^2$ because any integer vector (x, y) is a linear combination of
them with integer coefficients, namely x(1, 0) + y(0, 1).
ps − rq = ±1.
It is easily seen that this property extends to the complete map of Figure
5.2. Thus each edge in the map of primitive vectors represents an integral
basis of $\mathbb{Z}^2$, namely the pair of labels on the regions that meet along the
edge. The ± signs on the labels give four different bases, but they are
essentially the same. Since the edges of the map form a tree, and each
edge is associated in this way with an integral basis (up to sign), we call
the edge complex of the map of primitive vectors the tree of integral bases.
As the name suggests, the tree represents all integral bases. We do not
need this fact. However, it is easy to prove using the vector difference/sum
rule to implement a kind of Euclidean algorithm (see exercises).
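The “kind of Euclidean algorithm” mentioned here can be sketched as repeated subtraction. The Python function below is an illustration under stated assumptions: the name `reduce_basis` is made up, the vectors are assumed to have nonnegative coordinates with $ps - rq = \pm 1$, and “larger” is judged by the second coordinate with ties broken by the first. It records each pair visited until one vector has second coordinate 0.

```python
def reduce_basis(v, w):
    """Repeatedly subtract the smaller vector from the larger until one of
    them has second coordinate 0.

    Assumes nonnegative coordinates and v[0]*w[1] - v[1]*w[0] equal to +1 or -1.
    """
    steps = [(v, w)]
    while v[1] != 0 and w[1] != 0:
        # "Larger" is judged by second coordinate, then first (an assumed tie-break).
        if (v[1], v[0]) >= (w[1], w[0]):
            v = (v[0] - w[0], v[1] - w[1])
        else:
            w = (w[0] - v[0], w[1] - v[1])
        steps.append((v, w))
    return steps

for pair in reduce_basis((35, 3), (23, 2)):
    print(pair)
```

Each subtraction corresponds to moving along one edge of the tree, so reading the printed list backwards gives a path from the final pair to the starting one.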
Exercises
To prove that the vector difference/sum rule holds in the complete map of primitive
vectors we check that it holds in the middle and in “general position” on the right
and left.
5.7.1 Verify that the difference/sum rule holds in the middle of the map (Figure
5.4) by choosing v1 = (0, 1) and v2 = (1, 0).
[Figure 5.4: the middle of the map, with region ±(0, 1) above, ±(1, 0) below, ±(1, −1) on the left, and ±(1, 1) on the right.]
5.7.2 Figure 5.5 shows one “general position” on the right side of the complete
map. By choosing v1 = u1 and v2 = u1 + u2 , verify that the difference/sum
holds here.
[Figure 5.5: a general position on the right side of the map, with regions $\pm u_1$, $\pm u_2$, $\pm(u_1 + u_2)$, and $\pm(2u_1 + u_2)$.]
5.7.3 Work out which other general positions occur on the right and on the left
and verify that the difference/sum rule holds for each of them.
5.7.4 The “vector sum/difference rule” shown in Figure 5.6 is also valid. Why?
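[Figure 5.6: the vector sum/difference rule, with regions $\pm v_1$, $\pm v_2$, $\pm(v_1 + v_2)$, and $\pm(v_1 - v_2)$ around an edge.]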
To prove that the tree in the complete map represents all integral bases we
use the difference/sum and sum/difference rules to trace a path from a given basis
{(p, r), (q, s)} back to {(1, 0), (0, 1)}. Exercise 5.7.5 is an example, and Exercises
5.7.6–5.7.8 show why such a path can always be found.
5.7.5 By repeatedly subtracting the “smaller” vector from the “larger”, reduce the
pair {(35, 3), (23, 2)} to the pair {(1, 0), (11, 1)}. The latter pair is
represented in the tree (why?), hence so is the former (why?).
5.7.6 Show that if
$$(p', r') = (p + q, r + s), \quad (q', s') = (q, s)$$
or
$$(p', r') = (p, r), \quad (q', s') = (p + q, r + s)$$
then $ps - qr = \pm 1 \iff p's' - q'r' = \pm 1$.
5.7.7 By repeatedly adding or subtracting one vector from the other, show that
any pair $\{(p, r), (q, s)\}$ with $ps - qr = \pm 1$ reduces to a pair of the form
$\{(p', 0), (q', s')\}$. (Hint: gcd(r, s) = 1. Why?) Deduce from Exercise 5.7.6
that $p' = \pm 1$, $s' = \pm 1$.
5.7.8 Deduce that $\{(p', 0), (q', s')\}$ in Exercise 5.7.7 is represented by an edge in
the tree, and hence so is $\{(p, r), (q, s)\}$.
5.8* Periodicity in the map of $x^2 - ny^2$
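[Figures: values of $x^2 - 3y^2$ entered on the map of primitive vectors: (1, 0) ↦ 1, (0, 1) ↦ −3, (1, 1) ↦ −2, (2, 1) ↦ 1, (3, 1) ↦ 6, (3, 2) ↦ −3, (1, 2) ↦ −11, (2, 3) ↦ −23, (1, 3) ↦ −26, (26, 15) ↦ 1; and a schematic edge with region U above, D below, L at the left end, and R at the right end.]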
1. L, U + D, R is an arithmetic progression.
2. If (p, r) and (q, s) respectively are the regions above and below the
edge, then the common difference in this progression is the coefficient
of xy in the quadratic form f (px + qy, rx + sy).
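Both statements can be checked on the example $x^2 - 3y^2$. The Python sketch below is illustrative: the helper names are made up, and it assumes (following the difference/sum rule) that the region at the left end is labelled by the difference of the labels above and below, and the region at the right end by their sum.

```python
def f(v, A=1, B=0, C=-3):
    """Value of A*x^2 + B*x*y + C*y^2 at v = (x, y); the defaults give x^2 - 3y^2."""
    x, y = v
    return A * x * x + B * x * y + C * y * y

def around_edge(above, below):
    """Return (L, U + D, R) for the edge between regions above = (p, r) and below = (q, s)."""
    (p, r), (q, s) = above, below
    U, D = f(above), f(below)
    L = f((p - q, r - s))  # region at the left end: above - below
    R = f((p + q, r + s))  # region at the right end: above + below
    return L, U + D, R

def xy_coefficient(above, below, A=1, B=0, C=-3):
    """Coefficient of xy in f(px + qy, rx + sy)."""
    (p, r), (q, s) = above, below
    return 2 * A * p * q + B * (p * s + q * r) + 2 * C * r * s

progression = around_edge((1, 1), (1, 0))
print(progression, xy_coefficient((1, 1), (1, 0)))
```

The three printed values form an arithmetic progression whose common difference equals the printed coefficient.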
where $v_1$ and $v_2$ are the regions above and below the middle edge. It then
follows from Property 2 of quadratic forms (Section 5.6) that
Since $AC - B^2/4$ is constant, it follows that $|A|$ and $|C|$ (the absolute values
of $D$ and $U$) are bounded as required. $\Box$
Exercises
The “Pell quadratic forms” $x^2 - ny^2$ are by no means the only indefinite forms.
Another interesting example is $x^2 + xy - y^2$, which is related to the golden ratio
$\frac{1 + \sqrt{5}}{2}$ and the Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, . . ..
5.8.1 Show that $x^2 + xy - y^2 = \left(x + y\,\frac{1 + \sqrt{5}}{2}\right)\left(x + y\,\frac{1 - \sqrt{5}}{2}\right)$ and deduce from this
that the form $x^2 + xy - y^2$ is indefinite.
5.8.2 Construct enough of the river for $x^2 + xy - y^2$ to show that its period looks
like Figure 5.10.
[Figure 5.10: one period of the river for $x^2 + xy - y^2$, with adjacent regions marked −1 and 1.]
5.8.3 Show that the positive labels $(x_i, y_i)$ alternately below and above the river
(in the regions marked alternately 1 and −1) satisfy
5.8.4 Deduce from Exercise 5.8.3 that the natural number pairs satisfying the
equation $x^2 + xy - y^2 = 1$ are $(F_{2n+1}, F_{2n+2})$ for $n = 0, 1, 2, 3, \ldots$, where
$F_1 = F_2 = 1$ and $F_i + F_{i-1} = F_{i+1}$ (the Fibonacci sequence).
Periodicity in the shape of the river leads naturally to recurrence relations
between the vectors labelling riverside regions. The Fibonacci relation arising
from $x^2 + xy - y^2$ is the simplest example of such a recurrence relation. Another
is the relation for $x^2 - 3y^2$, whose river was constructed above.
5.8.5 Use two successive periods in the river for $x^2 - 3y^2$ to show that the
nonnegative solutions $(x_i, y_i)$ of $x^2 - 3y^2 = 1$ satisfy
The river also shows why certain equations do not have solutions.
5.8.6 Explain why the equation $x^2 - 3y^2 = -1$ has no integer solution.
5.9 Discussion
The Pell equation $x^2 - ny^2 = 1$ is one of the oldest and most important
quadratic Diophantine equations. Probably its only rival is the Pythagorean
equation $x^2 + y^2 = z^2$. The Pell equation also dates back to the time of the
Pythagoreans (around 500 BCE), who studied the special case $x^2 - 2y^2 = 1$
(158070671986249, 15140424455100).
His English rivals Wallis and Brouncker rose to the challenge with a method
that solves the Pell equation, not unlike the method of Bhaskara II (see Weil
(1984), p. 94). In the 18th century these methods morphed into the simpler
and more elegant continued fraction algorithm, which can be viewed as the
Euclidean algorithm applied to the pair $(\sqrt{n}, 1)$.
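For concreteness, here is one common way to organize the continued fraction algorithm in Python. It is a sketch under assumptions, not a method given in this chapter: the function name `pell_fundamental` and the particular integer recurrences for the partial quotients of $\sqrt{n}$ are the standard textbook ones, and the loop simply stops at the first convergent $p/q$ with $p^2 - nq^2 = 1$.

```python
from math import isqrt

def pell_fundamental(n):
    """Smallest positive solution (x, y) of x^2 - n*y^2 = 1 for nonsquare n,
    found from the convergents of the continued fraction of sqrt(n)."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0        # state of the expansion: current term is (sqrt(n) + m) / d
    p_prev, p = 1, a0         # numerators of successive convergents
    q_prev, q = 0, 1          # denominators of successive convergents
    while p * p - n * q * q != 1:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d     # next partial quotient
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return p, q

for n in (2, 3, 7, 61):
    print(n, pell_fundamental(n))
```

Because only integers are involved, the same loop handles cases such as $n = 109$, where the smallest solution is the 15-digit pair quoted earlier in this section.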
All of these methods are based on the observation of periodicity in
certain computations. It is likely that the ancient Greeks observed periodicity
in the Euclidean algorithm, because simple geometric arguments show its
periodicity on pairs such as $(\sqrt{2}, 1)$ and $(\sqrt{3}, 1)$ (see, for example, Stillwell
(1998), p. 268, or Artmann (1999), p. 242). However, while many could
use periodicity to solve instances of the Pell equation, the first to prove
that periodicity always occurs was Lagrange (1768). He thereby showed
that the continued fraction method always works. He underlined the
importance of this result by showing that solving the Pell equation leads to
the solution of all quadratic Diophantine equations in two variables.
Conway’s visual approach, expounded in Sections 5.6–5.8, is certainly
related to the old approaches to the Pell equation. But it is essentially
simpler in that it replaces a process (the Euclidean algorithm) by a picture
(the map of primitive vectors). I have attempted to make this as clear as
possible by deriving the map of primitive vectors and its properties directly
from properties of the Euclidean algorithm, before imprinting the values
of a quadratic form on it. (Conway assumes the simplest properties of the
map, or sketches topological proofs, and proves others with the help of
quadratic forms.) For further insights obtainable from Conway’s approach,
see the book Conway (1997) or his related video $ax^2 + hxy + by^2$ available
from the American Mathematical Society.
http://www.springer.com/978-0-387-95587-2