Four Square Theorem
Abstract. The Four Square Theorem was proved by Lagrange in 1770: every positive integer is the sum of at most four squares of positive integers, i.e. $n = A^2 + B^2 + C^2 + D^2$ with $A, B, C, D \in \mathbb{Z}$. An interesting proof is presented here based on Hurwitz integers, a subset of the quaternions that act like integers in four dimensions and have the prime divisor property. The theorem is analogous to Fermat's Two Square Theorem, which says that positive primes of the form $4k+1$ can be written as the sum of the squares of two positive integers. Representing integers as sums of squares can be considered a special case of Waring's problem: for every $k$, is there a number $g(k)$ such that every positive integer is representable as a sum of at most $g(k)$ $k$th powers?
Contents
1. Quaternions and Hurwitz Integers
2. Four Square Theorem
3. Discussion
References
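\[
\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix}
= \begin{pmatrix} a + di & b + ci \\ -b + ci & a - di \end{pmatrix}
= a \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
+ b \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
+ c \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}
+ d \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}
= a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}
\]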
2 JIA HONG RAY NG
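Definition 1.2. The norm of a quaternion $q$, written $\|q\|$, is defined to be the square root of the determinant of its matrix; hence $\|q\|^2$ is
\[
\det \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} = \alpha\bar\alpha + \beta\bar\beta = |\alpha|^2 + |\beta|^2,
\]
that is,
\[
\det \begin{pmatrix} a + di & b + ci \\ -b + ci & a - di \end{pmatrix} = a^2 + b^2 + c^2 + d^2.
\]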
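The quaternions $\mathbf{1}, \mathbf{i}, \mathbf{j}, \mathbf{k}$ have norm 1 and satisfy the following relations:
\begin{align*}
\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 &= -\mathbf{1}, \\
\mathbf{i}\mathbf{j} = \mathbf{k} &= -\mathbf{j}\mathbf{i}, \\
\mathbf{j}\mathbf{k} = \mathbf{i} &= -\mathbf{k}\mathbf{j}, \\
\mathbf{k}\mathbf{i} = \mathbf{j} &= -\mathbf{i}\mathbf{k}.
\end{align*}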
Non-zero quaternions form a group under multiplication, but the product of quaternions is generally non-commutative: $q_1 q_2 \neq q_2 q_1$. Quaternions also form an abelian group under addition, and obey the left and right distributive laws:
\[
q_1 (q_2 + q_3) = q_1 q_2 + q_1 q_3, \qquad (q_2 + q_3)\, q_1 = q_2 q_1 + q_3 q_1.
\]
Because determinants are multiplicative, $\det(q_1) \det(q_2) = \det(q_1 q_2)$, the norm is multiplicative as well, i.e.
\[
\|q_1\| \, \|q_2\| = \|q_1 q_2\|.
\]
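The relations and the multiplicative property can be checked numerically. The following is a minimal sketch, assuming a quaternion $a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ is stored as the tuple (a, b, c, d); the helper names qmul, norm_sq and neg are illustrative and do not come from the text.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # k part
    )


def norm_sq(q):
    """Squared norm a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)


def neg(q):
    """Additive inverse of a quaternion."""
    return tuple(-x for x in q)


ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# One of Hamilton's relations: ij = k = -ji
print(qmul(I, J) == K, qmul(J, I) == neg(K))  # True True

# Arbitrary illustrative values: the product does not commute,
# but the squared norm is still multiplicative.
q1, q2 = (1, 2, 3, 4), (5, -6, 7, 8)
print(qmul(q1, q2) == qmul(q2, q1))  # False
print(norm_sq(qmul(q1, q2)) == norm_sq(q1) * norm_sq(q2))  # True
```

Working with integer tuples keeps every check exact, which is why the squared norm is compared rather than the norm itself.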
It is interesting to note that $n = 1, 2, 4, 8$ are the only values of $n$ for which $\mathbb{R}^n$ has a multiplication that distributes over vector addition and a multiplicative norm. They correspond to $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{O}$ (the octonions, $n = 8$) respectively.
Theorem 1.3. Four Square Identity: If two numbers can each be written as the sum of four squares, then so can their product.
Proof. Let
\[
x_1 = a_1^2 + b_1^2 + c_1^2 + d_1^2 = \|a_1\mathbf{1} + b_1\mathbf{i} + c_1\mathbf{j} + d_1\mathbf{k}\|^2 = \|q_1\|^2
\]
and
\[
x_2 = a_2^2 + b_2^2 + c_2^2 + d_2^2 = \|a_2\mathbf{1} + b_2\mathbf{i} + c_2\mathbf{j} + d_2\mathbf{k}\|^2 = \|q_2\|^2.
\]
Then by the multiplicative property of norms,
\[
x_1 x_2 = \|q_1\|^2 \|q_2\|^2 = \|q_1 q_2\|^2.
\]
Therefore $x_1 x_2 = a^2 + b^2 + c^2 + d^2$, where $a, b, c, d$ are the components of $q_1 q_2$, so that $\|q_1 q_2\|^2 = a^2 + b^2 + c^2 + d^2$.
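Writing out the components of $q_1 q_2$ makes the identity explicit. The expansion below follows from the relations among $\mathbf{i}, \mathbf{j}, \mathbf{k}$; the numbers 7 and 15 are illustrative choices made here, not values from the text.

```latex
\begin{align*}
(a_1^2 + b_1^2 + c_1^2 + d_1^2)(a_2^2 + b_2^2 + c_2^2 + d_2^2)
  &= (a_1 a_2 - b_1 b_2 - c_1 c_2 - d_1 d_2)^2 \\
  &\quad + (a_1 b_2 + b_1 a_2 + c_1 d_2 - d_1 c_2)^2 \\
  &\quad + (a_1 c_2 - b_1 d_2 + c_1 a_2 + d_1 b_2)^2 \\
  &\quad + (a_1 d_2 + b_1 c_2 - c_1 b_2 + d_1 a_2)^2.
\end{align*}
% Illustrative check: 7 = 2^2 + 1^2 + 1^2 + 1^2 and 15 = 3^2 + 2^2 + 1^2 + 1^2,
% i.e. q_1 = 2 + i + j + k and q_2 = 3 + 2i + j + k, so q_1 q_2 = 2 + 7i + 6j + 4k.
\[
7 \cdot 15 = 2^2 + 7^2 + 6^2 + 4^2 = 4 + 49 + 36 + 16 = 105.
\]
```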
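Definition 1.4. The conjugate of any quaternion $q = a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ is $\bar q = a - b\mathbf{i} - c\mathbf{j} - d\mathbf{k}$. Conjugation has the following easily proved properties:
\begin{align*}
q\bar q &= \|q\|^2, \\
\overline{q_1 + q_2} &= \bar q_1 + \bar q_2, \\
\overline{q_1 - q_2} &= \bar q_1 - \bar q_2, \\
\overline{q_1 q_2} &= \bar q_2\, \bar q_1
\end{align*}
(the order of the factors reverses because quaternion multiplication is non-commutative).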
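Definition 1.5. The Hurwitz integers are the quaternions that are integer combinations of $\frac{\mathbf{1}+\mathbf{i}+\mathbf{j}+\mathbf{k}}{2}, \mathbf{i}, \mathbf{j}, \mathbf{k}$. The set is denoted by $\mathbb{Z}[\mathbf{h}, \mathbf{i}, \mathbf{j}, \mathbf{k}]$, where $\mathbf{h} = \frac{\mathbf{1}+\mathbf{i}+\mathbf{j}+\mathbf{k}}{2}$. It contains the set $\mathbb{Z}[\mathbf{i}, \mathbf{j}, \mathbf{k}] = \{a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} : a, b, c, d \in \mathbb{Z}\}$. There are 24 units (which form a non-abelian group): the 8 units $\pm\mathbf{1}, \pm\mathbf{i}, \pm\mathbf{j}, \pm\mathbf{k}$ of $\mathbb{Z}[\mathbf{i}, \mathbf{j}, \mathbf{k}]$, and the 16 mid-points $\pm\frac{1}{2} \pm \frac{\mathbf{i}}{2} \pm \frac{\mathbf{j}}{2} \pm \frac{\mathbf{k}}{2}$. The Hurwitz integers are closed under addition and multiplication, and the square of the norm is always an ordinary integer.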
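The count of 24 units can be confirmed by direct enumeration. This is a small sketch under one assumption made here for convenience: a Hurwitz integer is stored in "doubled" coordinates, as the tuple (a, b, c, d) standing for $(a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k})/2$, so that membership reduces to all four entries having the same parity. The function names are illustrative.

```python
from itertools import product


def is_hurwitz(doubled):
    """True if (a, b, c, d)/2 is a Hurwitz integer: entries all even or all odd."""
    return len({x % 2 for x in doubled}) == 1


def norm_sq(doubled):
    """Squared norm of (a, b, c, d)/2, an ordinary integer for Hurwitz integers."""
    total = sum(x * x for x in doubled)
    assert total % 4 == 0, "not a Hurwitz integer"
    return total // 4


# A unit has squared norm 1, so every doubled coordinate lies in [-2, 2].
units = [q for q in product(range(-2, 3), repeat=4)
         if is_hurwitz(q) and norm_sq(q) == 1]

integer_units = [q for q in units if all(x % 2 == 0 for x in q)]  # +-1, +-i, +-j, +-k
mid_points = [q for q in units if all(x % 2 == 1 for x in q)]     # (+-1 +-i +-j +-k)/2

print(len(units), len(integer_units), len(mid_points))  # 24 8 16
```

The same norm_sq helper returns 17 for the doubled tuple (5, 5, 3, 3), an ordinary prime.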
QUARTERNIONS AND THE FOUR SQUARE THEOREM 3
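For example, consider $\frac{5+5\mathbf{i}+3\mathbf{j}+3\mathbf{k}}{2} = 5 \cdot \frac{\mathbf{1}+\mathbf{i}+\mathbf{j}+\mathbf{k}}{2} - \mathbf{j} - \mathbf{k}$. The square of its norm is
\[
\frac{5^2 + 5^2 + 3^2 + 3^2}{4} = \frac{68}{4} = 17.
\]
Since 17 is an ordinary prime, $\frac{5+5\mathbf{i}+3\mathbf{j}+3\mathbf{k}}{2}$ is not the product of Hurwitz integers of smaller norm. Therefore, it is a Hurwitz prime.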
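Definition 1.7. The Euclidean Algorithm is a method to find the greatest common divisor (gcd) of any two natural numbers, and can be applied to Hurwitz integers as well. The steps are as follows:
Suppose that $|\alpha| < |\beta|$, and let $\alpha_1 = \alpha$ and $\beta_1 = \beta$.
Then for each pair $(\alpha_i, \beta_i)$, produce the next pair by the rule
\[
\alpha_{i+1} = \beta_i, \qquad \beta_{i+1} = \text{remainder when } \alpha_i \text{ is divided by } \beta_i.
\]
(The remainder $\beta_{i+1}$ has smaller norm than the divisor $\beta_i$, by the division property described in Definition 1.5.)
Now if $\alpha$ and $\beta$ have a common right divisor $\delta$, then
\[
\alpha = \gamma\delta, \qquad \beta = \varepsilon\delta \qquad \text{for some } \gamma, \varepsilon.
\]
Therefore, writing $\mu$ for the quotient, the remainder is
\[
\rho = \alpha - \mu\beta = \gamma\delta - \mu\varepsilon\delta = (\gamma - \mu\varepsilon)\,\delta.
\]
This means that a common right divisor of $\alpha$ and $\beta$ is also a right divisor of the remainder $\rho$ when $\alpha$ is divided on the right by $\beta$. Therefore
\[
\operatorname{right\,gcd}(\alpha_1, \beta_1) = \operatorname{right\,gcd}(\alpha_2, \beta_2) = \cdots
\]
The algorithm stops when $\beta_k$ divides $\alpha_k$, so that $g = \operatorname{right\,gcd}(\alpha, \beta) = \operatorname{right\,gcd}(\alpha_k, \beta_k) = \beta_k$.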
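The steps above can be sketched in code. This is one possible realization, not the text's own: it assumes quaternions are stored as 4-tuples of exact fractions, and it takes the quotient $\mu$ to be the Hurwitz integer nearest to $\alpha\beta^{-1}$, which guarantees a remainder of smaller norm. All function names and the sample values are illustrative.

```python
from fractions import Fraction
from math import floor


def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm_sq(q):
    return sum(x * x for x in q)


def nearest_hurwitz(q):
    """Closest Hurwitz integer to q: compare the integer lattice with its half-shift."""
    ints = tuple(Fraction(floor(x + Fraction(1, 2))) for x in q)
    halves = tuple(Fraction(floor(x)) + Fraction(1, 2) for x in q)

    def dist(c):
        return norm_sq(tuple(x - y for x, y in zip(q, c)))

    return min(ints, halves, key=dist)


def right_divmod(alpha, beta):
    """Return (mu, rho) with alpha = mu*beta + rho and norm(rho) < norm(beta)."""
    n = norm_sq(beta)
    exact = tuple(Fraction(x) / n for x in qmul(alpha, conj(beta)))  # alpha * beta^-1
    mu = nearest_hurwitz(exact)
    rho = tuple(x - y for x, y in zip(alpha, qmul(mu, beta)))
    return mu, rho


def right_gcd(alpha, beta):
    """Euclidean algorithm with division on the right; result is unique up to a unit."""
    while any(beta):
        _, rho = right_divmod(alpha, beta)
        alpha, beta = beta, rho
    return alpha


# Illustrative values: both inputs share the right divisor delta = 1 + 2i.
delta = (1, 2, 0, 0)
alpha = qmul((1, 1, 1, 0), delta)   # squared norm 3 * 5
beta = qmul((1, 1, 0, 0), delta)    # squared norm 2 * 5
g = right_gcd(alpha, beta)
print(norm_sq(g))  # 5
```

Because the two cofactors have coprime squared norms 3 and 2, the right gcd here is $\delta$ up to multiplication on the left by a unit, so its squared norm is 5.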
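Theorem 1.8. Linear Representation of Greatest Common Divisor: If $g$ is the greatest common divisor of $\alpha$ and $\beta$, then $g = \gcd(\alpha, \beta) = \mu\alpha + \nu\beta$ for some integers $\mu$ and $\nu$ (Hurwitz integers $\mu$ and $\nu$ when $\alpha$ and $\beta$ are Hurwitz integers and $g$ is their right gcd).
The theorem can be easily proved by induction on the pairs obtained from the Euclidean algorithm, starting with
\[
\alpha_1 = 1 \times \alpha + 0 \times \beta, \qquad \beta_1 = 0 \times \alpha + 1 \times \beta.
\]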
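The induction amounts to carrying a pair of coefficients alongside each member of the current pair. The sketch below does this for ordinary non-negative integers only, a simplification chosen here; in the Hurwitz case the same bookkeeping applies with the order of multiplication preserved. The function name linear_gcd is illustrative.

```python
def linear_gcd(a, b):
    """Return (g, mu, nu) with g = gcd(a, b) = mu*a + nu*b for integers a, b >= 0."""
    # Invariant throughout: x = mx*a + nx*b and y = my*a + ny*b.
    x, mx, nx = a, 1, 0   # alpha_1 = 1*a + 0*b
    y, my, ny = b, 0, 1   # beta_1  = 0*a + 1*b
    while y != 0:
        q = x // y
        # The next pair is (y, x - q*y); update the coefficients the same way.
        x, mx, nx, y, my, ny = y, my, ny, x - q * y, mx - q * my, nx - q * ny
    return x, mx, nx


g, mu, nu = linear_gcd(240, 46)
print(g, mu, nu)  # 2 -9 47
```

Here $2 = (-9) \cdot 240 + 47 \cdot 46$, which is the linear representation the theorem asserts.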
Theorem 1.9. Prime Divisor Property of Hurwitz Integers: if p is a real Hurwitz
prime and divides a Hurwitz integer product αβ, then p divides α or p divides β.
Proof. Assume that the prime $p$ divides $\alpha\beta$ but does not divide $\alpha$. Then according to Theorem 1.8,
\[
1 = \operatorname{right\,gcd}(p, \alpha) = \mu p + \nu\alpha.
\]
Multiplying both sides on the right by $\beta$,
\[
\beta = \mu p \beta + \nu\alpha\beta.
\]
Note that $p$ is both a right and left divisor of whatever number it divides, since reals commute with quaternions. Therefore $p$ divides both $\mu p \beta$ and $\nu\alpha\beta$ (the latter because $p$ divides $\alpha\beta$ by assumption). Consequently, $p$ divides $\beta$, as required by the statement.
With respect to the first factor, the product of $\omega$ and $\bar\omega$ plus the even terms gives 1 plus integer terms; hence the first factor is $A + B\mathbf{i} + C\mathbf{j} + D\mathbf{k}$ for some $A, B, C, D \in \mathbb{Z}$. The second factor is its conjugate, thus
\[
p = A^2 + B^2 + C^2 + D^2 \quad \text{with } A, B, C, D \in \mathbb{Z}.
\]
The following lemma is useful for showing that no odd prime is a Hurwitz prime, which, combined with Theorem 2.1, implies that every odd prime can be expressed as the sum of four squares.
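Lemma 2.2. If $p = 2n + 1$ is an odd prime, then there are $l, m \in \mathbb{Z}$ such that $p$ divides $1 + l^2 + m^2$.
Proof. The squares $x^2, y^2$ of any two distinct numbers $x, y$ among $0, 1, 2, \ldots, n$ are incongruent mod $p$, because
\begin{align*}
x^2 \equiv y^2 \pmod p &\Rightarrow x^2 - y^2 \equiv 0 \pmod p \\
&\Rightarrow (x - y)(x + y) \equiv 0 \pmod p \\
&\Rightarrow x \equiv y \text{ or } x + y \equiv 0 \pmod p,
\end{align*}
and $x + y \not\equiv 0 \pmod p$ since $0 < x + y < p$, while $x \equiv y \pmod p$ is impossible for distinct $x, y$ in this range. Therefore the $n + 1$ different numbers $l = 0, 1, 2, \ldots, n$ give $n + 1$ incongruent values of $l^2$ mod $p$. Similarly, there are $n + 1$ incongruent values of $m^2$ mod $p$, and hence of $-1 - m^2$ mod $p$, where $m = 0, 1, 2, \ldots, n$. But there exist only $2n + 1$ incongruent values mod $p = 2n + 1$, whereas the two lists together contain $2n + 2$ values. Therefore, by the pigeonhole principle, for some $l$ and $m$,
\begin{align*}
l^2 &\equiv -1 - m^2 \pmod p, \\
1 + l^2 + m^2 &\equiv 0 \pmod p.
\end{align*}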
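The pigeonhole argument is constructive enough to search directly. The following sketch, with an illustrative function name, looks for $l$ and $m$ in the range $0, \ldots, n$ used in the proof; it assumes its argument is an odd prime and does not check primality.

```python
def find_l_m(p):
    """For an odd prime p = 2n + 1, return (l, m) with 0 <= l, m <= n and p | 1 + l^2 + m^2."""
    n = (p - 1) // 2
    # The residues l^2 mod p for l = 0..n are pairwise distinct, as in the proof.
    square_of = {(l * l) % p: l for l in range(n + 1)}
    for m in range(n + 1):
        target = (-1 - m * m) % p
        if target in square_of:
            return square_of[target], m
    raise ValueError("no pair found; p is assumed to be an odd prime")


print(find_l_m(7))  # (3, 2), since 1 + 9 + 4 = 14 = 2 * 7
```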
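Theorem 2.3. Four Square Theorem: Every natural number is the sum of four squares.
Proof. Let $p$ be an odd prime. Then by Lemma 2.2 we can find $l$ and $m$ such that $p$ divides $1 + l^2 + m^2$. Factorizing it,
\[
1 + l^2 + m^2 = (1 + l\mathbf{i} + m\mathbf{j})(1 - l\mathbf{i} - m\mathbf{j}),
\]
where the cross terms cancel because $\mathbf{i}\mathbf{j} = -\mathbf{j}\mathbf{i}$. If $p$ were a Hurwitz prime, then, according to the prime divisor property (Theorem 1.9), $p$ would divide either $1 + l\mathbf{i} + m\mathbf{j}$ or $1 - l\mathbf{i} - m\mathbf{j}$. But neither case is possible, since neither
\[
\frac{1}{p} + \frac{l}{p}\mathbf{i} + \frac{m}{p}\mathbf{j} \qquad \text{nor} \qquad \frac{1}{p} - \frac{l}{p}\mathbf{i} - \frac{m}{p}\mathbf{j}
\]
is a Hurwitz integer (the real part $\frac{1}{p}$ is neither an integer nor half of an odd integer). Hence the arbitrary odd prime $p$ is not a Hurwitz prime, and by Theorem 2.1,
\[
p = A^2 + B^2 + C^2 + D^2 \quad \text{with } A, B, C, D \in \mathbb{Z}.
\]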
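Since $1 = 1^2 + 0^2 + 0^2 + 0^2$ and $2 = 1^2 + 1^2 + 0^2 + 0^2$, and every natural number greater than 1 is a product of primes, the four square identity (Theorem 1.3) shows that every natural number is the sum of four squares.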
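The statement can be spot-checked by exhaustive search. The sketch below is a brute-force search, not the constructive route through Hurwitz integers taken in the proof; the function name four_squares is illustrative.

```python
from math import isqrt


def four_squares(n):
    """Return (A, B, C, D) with A >= B >= C >= D >= 0 and A^2 + B^2 + C^2 + D^2 == n."""
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            for c in range(min(b, isqrt(n - a * a - b * b)), -1, -1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d <= c and d * d == rest:
                    return a, b, c, d
    # Unreachable for natural numbers if the theorem holds.
    raise AssertionError(f"no representation found for {n}")


print(four_squares(7))  # (2, 1, 1, 1)
```

The number 7 needs all four squares to be non-zero, which shows that four cannot be lowered to three in general.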
3. Discussion
The Four Square Theorem was first conjectured in Bachet's 1621 edition of Diophantus, and the first proof was given by Lagrange in 1770. Fermat claimed to have come up with a proof, but did not publish it. There is a similar theorem, Fermat's Two Square Theorem, to which the Four Square Theorem is analogous.
References
[1] J. Stillwell. Elements of Number Theory. Springer-Verlag New York Inc., 2003.
[2] P. Erdős, J. Surányi. Topics in the Theory of Numbers (2nd Ed). Springer-Verlag New York Inc., 2003.
[3] I. Niven, H. S. Zuckerman. Introduction to the Theory of Numbers (2nd Ed). John Wiley & Sons, Inc., 1966.
[4] E. W. Weisstein. "Waring's Problem." From MathWorld, a Wolfram Web Resource. http://mathworld.wolfram.com/WaringsProblem.html