Sturm
Bachelor’s thesis
Eric Spreen
University of Groningen
Supervisors:
Prof. Dr. J. Top
University of Groningen
Dr. R. Dyer
University of Groningen
Abstract
1 Introduction
4 Sturm’s Theorem
4.1 Variations in sign
4.2 Systems of equations, inequations and inequalities
4.3 Sturm’s Theorem Parametrized
4.3.1 Tarski’s Principle
Chapter 1
Introduction
$$V(x) = \tfrac{1}{2}kx^2,$$
which is a second-degree polynomial in one variable. A similar case occurs
in higher-dimensional systems (in 3D, and with multiple bodies). Another
example is the hydrogen atom in quantum mechanics, where the radial
wavefunctions take on the form [3]:
is the analog of the intermediate value theorem for polynomials. It can be
shown that this result, and various other key theorems from real analysis
hold for polynomials in real closed fields.
After the discussion of Sturm’s Theorem, we will discuss an extension
of Sturm’s Theorem that allows us to simplify the problem of the existence
of a zero in a certain interval for a whole family of polynomials. The result
will be a finite set of systems of polynomial equations, inequations and
inequalities with integer coefficients, any one of which may be satisfied by
the parameters of the family for the resulting polynomial to have a zero in
the interval. From this we can quickly establish criteria for the existence
of a zero of a whole family of polynomials. A secondary result is that if a
polynomial with rational coefficients has a zero in one real closed field, it
will have a zero in every other real closed field.
As a high school student I have often wondered whether it would be
possible to form an equation of which the solvability is undecidable, in par-
ticular when I was unable to solve a particular problem. At the very end we
will touch on this question.
A significant portion of this report follows [4] and [5]. If no citations
have been provided, these are the sources. We will assume that the reader
has a basic understanding of algebraic structures, such as monoids, groups
and rings.
Chapter 2
Before we can begin our study of real closed fields, we will develop the theory
of polynomial rings over a field to some extent. This chapter will give basic
results on arbitrary polynomial rings, and applications on extension rings
and fields. Most of this chapter will follow [4].
We will say that a subring $R$ of a ring $S$ is generated by a set $A \subseteq S$,
if $R$ is the smallest subring that contains $A$. Also, if $u_1, \dots, u_n \in S$ (and
$\forall r \in R: u_i r = r u_i$, $1 \leq i \leq n$), then we denote the ring that is generated by
$R \cup \{u_1, \dots, u_n\}$ by $R[u_1, \dots, u_n]$. We can readily note that
$R[u_1, u_2] = (R[u_1])[u_2]$ by definition. The existence of such a subring follows from the
observation that any arbitrary intersection of subrings of $S$ is again a subring
of $S$.
and 0 and 1 as obvious. The elements of $R[x]$ are called polynomials with
coefficients in $R$, and the function values of a polynomial are called the
coefficients of a polynomial.
1. There exists an injective homomorphism $R \to R[x]$, so that $R$ may be
regarded as a subring of $R[x]$.
2.2 Degree arithmetic
We have seen that for any polynomial $0 \neq f(x) = \sum_{i=0}^n a_i x^i$ there is some
$k \in \mathbb{N}$ such that $a_k \neq 0$, but $a_i = 0$ if $i > k$. This observation is a strong
tool that we will use often in further arguments. We therefore define
Definition 2.2.1. If $R$ is a ring and $0 \neq f(x) = \sum_{i=0}^n a_i x^i \in R[x]$, then the
degree of $f(x)$ is the largest $k \in \mathbb{N}$ such that $a_k \neq 0$. If $f(x) = 0$, then the
degree is $-\infty$. We will denote the degree of $f(x)$ by $\deg(f)$ or $\deg(f(x))$.
Furthermore, the leading coefficient of $f(x)$ is $a_{\deg(f)}$ if $f(x) \neq 0$ and $0$
if $f(x) = 0$. This will be denoted by $\mathrm{lc}(f)$ or $\mathrm{lc}(f(x))$. A polynomial $f(x)$
will be called monic if $\mathrm{lc}(f) = 1$.
The following two lemmas can be proven quickly by the definition of the
degree and considering the leading coefficients of $f(x) + g(x)$ and $f(x)g(x)$
respectively.
Lemma 2.2.1. If $R$ is a ring, then for any $f(x), g(x) \in R[x]$:
$\deg(f(x) + g(x)) \leq \max(\deg(f), \deg(g))$.
Lemma 2.2.2. If $D$ is a domain, then $D[x]$ is also a domain, and for all
$f(x), g(x) \in D[x]$ we have $\deg(fg) = \deg(f) + \deg(g)$.¹ Also, the units of
$D[x]$ will be the units of $D$.
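The two degree lemmas can be checked on small examples. The following is a minimal Python sketch, assuming polynomials over $\mathbb{Z}$ are stored as coefficient lists with the lowest degree first; the helper names `degree`, `poly_add` and `poly_mul` are illustrative and do not come from the text. The degree of the zero polynomial is represented by `-math.inf`, so that the convention $-\infty + a = -\infty$ holds automatically.

```python
import math

def degree(p):
    """Degree of a coefficient list (lowest degree first); -inf for the zero polynomial."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return -math.inf

def poly_add(f, g):
    """Coefficient-wise sum, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Product by direct convolution of the coefficient lists."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f = [1, 0, 2]   # 1 + 2x^2
g = [3, 1]      # 3 + x
zero = []       # the zero polynomial

# Lemma 2.2.1: deg(f + g) <= max(deg f, deg g)
assert degree(poly_add(f, g)) <= max(degree(f), degree(g))
# Lemma 2.2.2 over the domain Z: deg(fg) = deg f + deg g
assert degree(poly_mul(f, g)) == degree(f) + degree(g)
# The convention for the zero polynomial: -inf + a = -inf
assert degree(poly_mul(f, zero)) == degree(f) + degree(zero)
```

Over a ring with zero divisors the second assertion can fail, which is why Lemma 2.2.2 is stated for domains only.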
Corollary 2.3.2. Let $F$ be a field and $f(x), g(x) \in F[x]$ with $g(x) \neq 0$.
Then there exist unique $q(x), r(x) \in F[x]$ such that:
$$f(x) = q(x)g(x) + r(x), \qquad \deg(r) < \deg(g). \qquad (2.2)$$
Proof. We can find some $k \in \mathbb{N}$ and $q(x), r(x) \in F[x]$ such that
$b_m^k f(x) = q(x)g(x) + r(x)$ and $\deg(r) < \deg(g)$, where $b_m = \mathrm{lc}(g)$. Now since
$g(x) \neq 0$, we have $b_m \neq 0$ and thus $f(x) = (b_m^{-k} q(x))g(x) + (b_m^{-k} r(x))$.
Also, since $F$ is a domain:
$\deg(b_m^{-k} r(x)) = \deg(b_m^{-k}) + \deg(r(x)) = \deg(r(x)) < \deg(g(x))$.
Now let $q_1(x), r_1(x) \in F[x]$ also satisfy (2.2). Then
$(q(x) - q_1(x))g(x) = r_1(x) - r(x)$. Without loss of generality we may assume that
$\deg(r) \geq \deg(r_1)$. It then follows that
$\deg(g) > \deg(r) \geq \deg(r_1 - r) = \deg(q - q_1) + \deg(g)$ and this is only possible if
$\deg(q - q_1) = -\infty$. Then $q(x) = q_1(x)$ and thus $r_1(x) = r(x)$.
¹ It is to be understood here that $-\infty + a = -\infty$ for any $a \in \mathbb{N} \cup \{-\infty\}$.
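Let $R$ be a commutative ring and $f(x), g(x) \in R[x]$ with $f(x), g(x) \neq 0$.
Also let $m = \deg(g) \in \mathbb{N}$ and $0 \neq b \in R$ the leading coefficient of $g(x)$.
Define the following three coupled sequences:
$$f_0(x) = f(x), \qquad n_0 = \deg(f_0), \qquad a_0 = \mathrm{lc}(f_0),$$
$$f_{i+1}(x) = \begin{cases} b f_i(x) - a_i x^{n_i - m} g(x) & n_i \geq m \\ 0 & n_i < m \end{cases} \qquad n_{i+1} = \deg(f_{i+1}), \qquad a_{i+1} = \mathrm{lc}(f_{i+1}).$$
Then there exists a $k \in \mathbb{N}$ such that $f_k(x) \neq 0$, $f_{k+1}(x) = 0$ and $n_k < m$.
Also²:
$$b^k f(x) = \sum_{l=0}^{k-1} a_l b^{k-l-1} x^{n_l - m} g(x) + f_k(x). \qquad (2.3)$$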
Proof. Let $i \in \mathbb{N}$. We then see that
$\deg(f_{i+1}) = \deg(b f_i(x) - a_i x^{n_i - m} g(x)) < \deg(f_i(x))$, since the leading
coefficients of $b f_i(x)$ and $a_i x^{n_i - m} g(x)$ are both $a_i b$. This shows that the
degree strictly decreases each step, and since $f_0(x) \neq 0$, there exists some
$k \in \mathbb{N}$ such that $f_k(x) \neq 0$, $\deg(f_k) = n_k < m$ and thus $f_{k+1}(x) = 0$.
We may see this $k$ as the terminal step of the algorithm, since from this point on only
zero polynomials will be produced.
We will now prove the following: for any $i \in \mathbb{N}$ such that $i \leq k$ we have
$b^i f(x) = f_i(x) + \sum_{l=0}^{i-1} a_l b^{i-l-1} x^{n_l - m} g(x)$. For $i = 0$ this is
clear, so pick $i \in \mathbb{N}$ with $0 < i \leq k$ and assume this holds for $i - 1$. Then:
$$\begin{aligned}
b^i f(x) = b\, b^{i-1} f(x) &= b f_{i-1}(x) + b \sum_{l=0}^{i-2} a_l b^{i-l-2} x^{n_l - m} g(x) \\
&= f_i(x) + a_{i-1} x^{n_{i-1} - m} g(x) + \sum_{l=0}^{i-2} a_l b^{i-l-1} x^{n_l - m} g(x) \\
&= f_i(x) + \sum_{l=0}^{i-1} a_l b^{i-l-1} x^{n_l - m} g(x).
\end{aligned}$$
This proves our claim. We can then set $i = k$ to obtain our final formula
for $f(x)$, which concludes the proof and also proves proposition 2.3.1, since
$\deg(f_k) = n_k < m = \deg(g)$.
² We understand here, that if $k = 0$, the sum evaluates to 0.
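The coupled sequences above amount to division without fractions. The following Python sketch is one possible reading of that procedure over $\mathbb{Z}$, assuming coefficient lists with the lowest degree first; the names `degree` and `pseudo_divide` are hypothetical. It accumulates the quotient alongside the $f_i$, so that on return $b^k f = q\,g + r$ with $\deg(r) < \deg(g)$. Unlike the statement above, the loop simply stops once the remainder has degree below $m$, which also covers a zero remainder.

```python
def degree(p):
    """Degree of a coefficient list (lowest degree first); None for the zero polynomial."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return None

def pseudo_divide(f, g):
    """Return (k, q, r) with b**k * f == q*g + r and deg(r) < deg(g), where b = lc(g)."""
    m = degree(g)
    if m is None:
        raise ZeroDivisionError("g must be a non-zero polynomial")
    b = g[m]
    q = [0] * max(len(f) - m, 1)   # room for every power x^(n_i - m)
    r = list(f)                    # plays the role of f_i
    k = 0
    while True:
        n = degree(r)
        if n is None or n < m:
            return k, q, r
        a = r[n]                   # a_i = lc(f_i)
        # q_{i+1} = b*q_i + a_i x^(n_i - m)
        q = [b * c for c in q]
        q[n - m] += a
        # f_{i+1} = b*f_i - a_i x^(n_i - m) g
        r = [b * c for c in r]
        for j in range(m + 1):
            r[n - m + j] -= a * g[j]
        k += 1

# 3x^3 + x + 1 divided by 2x + 1: three steps, so the identity holds for b^3 = 8
k, q, r = pseudo_divide([1, 1, 0, 3], [1, 2])
print(k, q, r)   # 3 [7, -6, 12] [1, 0, 0, 0]
```

Here $8(3x^3 + x + 1) = (12x^2 - 6x + 7)(2x + 1) + 1$, and no division in the coefficient ring was needed.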
2.3.1 Polynomial factors
The Euclidean division algorithm can be used to prove an array of useful
facts. The first of these will concern factors of polynomials. Since we will
almost exclusively be concerned with commutative rings from this point on,
R will denote a commutative ring in the rest of this chapter.
Definition 2.3.1. If $f(x), g(x) \in R[x]$, then $g(x)$ is a factor of $f(x)$,
denoted as $g(x) \mid f(x)$, if and only if there exists an $h(x) \in R[x]$ such that
$f(x) = g(x)h(x)$.
Also, a polynomial $f(x) \in R[x]$ of positive degree will be called reducible
if there exist $g(x), h(x) \in R[x]$ of positive degree such that $f(x) = g(x)h(x)$.
Otherwise, $f(x)$ will be called irreducible.³
The following two results characterize the zeroes of a polynomial. They
will be used a couple of times in the next chapters.
Lemma 2.3.3. If $f(x) \in R[x]$ and $a \in R$, then there exists a unique $q(x) \in R[x]$
such that $f(x) = (x - a)q(x) + f(a)$.
Proof. By the Euclidean division algorithm we may pick $q(x), r(x) \in R[x]$
with $\deg(r) < \deg(x - a) = 1$ and $f(x) = (x - a)q(x) + r(x)$. We then
immediately see that $f(a) = (a - a)q(a) + r(a) = r(a)$, and since $\deg(r) < 1$
we must have $r(x) = f(a)$. Also, since $r(x)$ is fixed in this way, if $q_1(x)$ also
satisfies $f(x) = (x - a)q_1(x) + r(x)$, then $(x - a)(q(x) - q_1(x)) = 0$. Now,
since the leading coefficient of $x - a$ is 1, which is not a zero divisor, we get
$q(x) = q_1(x)$.
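The decomposition $f(x) = (x - a)q(x) + f(a)$ can be computed with Horner's scheme, where the intermediate values are exactly the coefficients of $q(x)$. A minimal Python sketch, assuming integer coefficient lists with the lowest degree first; the function name `divide_by_linear` is illustrative:

```python
def divide_by_linear(f, a):
    """Return (q, f(a)) such that f(x) = (x - a) q(x) + f(a).

    f and q are coefficient lists with the lowest degree first.
    """
    if not f:
        return [], 0
    acc = 0
    partial = []
    # Horner's scheme, running from the leading coefficient downwards
    for c in reversed(f):
        acc = acc * a + c
        partial.append(acc)
    value = partial.pop()          # the last partial result is f(a)
    return partial[::-1], value    # the earlier ones are the coefficients of q

# f(x) = x^3 - 2x + 5 and a = 2
q, value = divide_by_linear([5, -2, 0, 1], 2)
print(q, value)   # [2, 2, 1] 9
```

So $x^3 - 2x + 5 = (x - 2)(x^2 + 2x + 2) + 9$, and in particular $a$ is a zero of $f(x)$ exactly when the returned value is 0.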
Lemma 2.3.5. Let $F$ be a field and $f(x), g(x) \in F[x]$ with $g(x) \neq 0$, and
$q(x), r(x) \in F[x]$ such that $\deg(r) < \deg(g)$ and $f(x) = q(x)g(x) + r(x)$.
Then for every $h(x) \in F[x]$, $h(x) \mid f(x)$ and $h(x) \mid g(x)$ if and only if
$h(x) \mid g(x)$ and $h(x) \mid r(x)$.
Proof. Let $h(x) \in F[x]$. If $h(x) \mid f(x)$ and $h(x) \mid g(x)$, then there
exist $\alpha(x), \beta(x) \in F[x]$ such that $f(x) = \alpha(x)h(x)$ and
$g(x) = \beta(x)h(x)$. Then
$r(x) = f(x) - q(x)g(x) = \alpha(x)h(x) - q(x)\beta(x)h(x) = (\alpha(x) - q(x)\beta(x))h(x)$,
so that $h(x) \mid r(x)$ and $h(x) \mid g(x)$.
Conversely, let $h(x) \mid g(x)$ and $h(x) \mid r(x)$. Then there exist
$\gamma(x), \rho(x) \in F[x]$ so that $g(x) = \gamma(x)h(x)$ and $r(x) = \rho(x)h(x)$, so that
$f(x) = (q(x)\gamma(x) + \rho(x))h(x)$ and thus $h(x) \mid f(x)$ and $h(x) \mid g(x)$.
Proof. Let $d(x) \in F[x]$ be a common divisor of $f(x)$ and $g(x)$. Then by
repeated use of lemma 2.3.5 we see that this is the case if and only if $d(x)$ is
a common divisor of $h_{s-1}(x)$ and $h_s(x)$. Therefore, every common divisor
of $f(x)$ and $g(x)$ will be a factor of $h_s(x)$. Also, since
$h_s(x) \mid q_s(x)h_s(x) = h_{s-1}(x)$ and $h_s(x)$ is a factor of itself, $h_s(x)$ is a
common factor of $f(x)$ and $g(x)$, and thus a greatest common factor.
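The argument above is the usual Euclidean algorithm: the last non-zero remainder is a greatest common factor. The following Python sketch illustrates this over $\mathbb{Q}$ using exact `Fraction` arithmetic, assuming coefficient lists with the lowest degree first. The names `trim`, `poly_divmod` and `poly_gcd` are illustrative, and normalising the result to be monic is a choice made here, not something the text prescribes.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients, so the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and deg(r) < deg(g), over the rationals."""
    f = trim(Fraction(c) for c in f)
    g = trim(Fraction(c) for c in g)
    if not g:
        raise ZeroDivisionError("g must be a non-zero polynomial")
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]            # cancels the leading term of r
        q[shift] = coeff
        for j, c in enumerate(g):
            r[shift + j] -= coeff * c
        r = trim(r)
    return trim(q), r

def poly_gcd(f, g):
    """Monic greatest common factor: the last non-zero Euclidean remainder."""
    a = trim(Fraction(c) for c in f)
    b = trim(Fraction(c) for c in g)
    while b:
        _, r = poly_divmod(a, b)
        a, b = b, r
    return [c / a[-1] for c in a] if a else a

# (x - 1)(x + 2) and (x - 1)(x - 3) share the factor x - 1
print(poly_gcd([-2, 1, 1], [3, -4, 1]) == [-1, 1])   # True
```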
If $S \subseteq F$, then a subfield $K$ of $F$ is said to be generated by $S$ if it is the
smallest subfield containing $S$.
If $E$ is a field extension over $F$, and $S \subseteq E$, we denote the subfield
generated by $S \cup F$ by $F(S)$. If $S = \{u_1, \dots, u_n\}$ is finite, we denote it by
$F(u_1, \dots, u_n)$.
Lemma 2.4.3. Let $F$ be a field and $f(x) \in F[x]$ of positive degree and
irreducible. Then $F[x]/(f(x))$ is a field. Furthermore, $F[x]/(f(x)) = F(u)$,
where $u = x \bmod (f(x))$ is a zero of $f(x)$, when regarded as a polynomial
with coefficients in $F[x]/(f(x))$.
Proof. Let $J' \subseteq F[x]/I$ be an ideal, where $I = (f(x))$. Then there exists
an ideal $J \subseteq F[x]$ such that $I \subseteq J$ and $J' = J/I$. Since $F[x]$ is a principal
ideal domain, we can find a $g(x) \in F[x]$ such that $J = (g(x))$. Therefore,
there exists an $h(x) \in F[x]$ such that $f(x) = h(x)g(x)$. Now, since $f(x)$ is
irreducible, we have either $h(x) \in F$ or $g(x) \in F$. In the first case, we get
$J = I$, so that $J' = \{0 \bmod I\}$. In the second case we get $J = F[x]$, so
that $J' = F[x]/I$. This shows that $F[x]/I$ has only the trivial ideals and
thus it is a field.
Now set $K = F[x]/(f(x))$ and let $u = x \bmod (f(x)) \in K$. Then clearly
$f(u) = f(x) \bmod (f(x)) = 0 \bmod (f(x))$. We also have $F(u) \subseteq K$. Now
let $a \in K$. Then there exists some $b(x) \in F[x]$ so that $a = b(x) \bmod (f(x))$.
But, there also exist $q(x), r(x) \in F[x]$ with $\deg(r) < \deg(f)$ and
$b(x) = q(x)f(x) + r(x)$. Therefore $a = r(x) \bmod (f(x)) = r(u)$. Now write
$r(x) = \sum_{i=0}^m r_i x^i$. Then $a = r(u) = \sum_{i=0}^m r_i u^i \in F(u)$. We therefore
conclude that $K = F(u)$.
Proof. Let $u$ be algebraic over $F$ and $f(x) \in F[x]$ its minimum polynomial.
Now let $a \in F(u)$ be arbitrary. Then there exists a
$g(x) = \sum_{i=0}^{n-1} a_i x^i \in F[x]$
$a_1, \dots, a_n \in F \subseteq E$ such that $\sum_{i=1}^n a_i u_i = a$. Therefore,
$\{u_1, \dots, u_n\}$ spans $K$ as a vector space over $E$. We conclude that
$[K : E] \leq n < \infty$.
Now let $[K : E], [E : F] < \infty$ and pick bases $(u_1, \dots, u_m) \subseteq E$ and
$(v_1, \dots, v_n) \subseteq K$ for $E$ over $F$ and $K$ over $E$ respectively. Pick any $a \in K$.
Then there exist $a_1, \dots, a_n \in E$ such that $a = \sum_{i=1}^n a_i v_i$. Also, for
$1 \leq i \leq n$ there exist $b_{i1}, \dots, b_{im} \in F$ such that $a_i = \sum_{j=1}^m b_{ij} u_j$.
This gives us $a = \sum_{i=1}^n \sum_{j=1}^m b_{ij} u_j v_i$, so that the set
$\{u_j v_i \mid 1 \leq i \leq n \wedge 1 \leq j \leq m\}$ spans $K$ as a vector space over $F$.
Now let $c_{11}, \dots, c_{nm} \in F$ such that $\sum_{i=1}^n \sum_{j=1}^m c_{ij} u_j v_i = 0$.
Then clearly for $1 \leq i \leq n$: $d_i = \sum_{j=1}^m c_{ij} u_j = 0$.
This implies that for $1 \leq i \leq n$ and $1 \leq j \leq m$ we have $c_{ij} = 0$. Therefore
we have obtained a base for $K$ over $F$ and
$[K : F] = nm = [K : E][E : F] < \infty$.
Definition 2.4.5. Let $F$ be a field and $f(x) \in F[x]$ monic. Then a splitting
field of $f(x)$ over $F$ is a field extension $E$ over $F$, such that in $E[x]$ we can
write $f(x) = \prod_{i=1}^n (x - r_i)$ for $r_1, \dots, r_n \in E$ and $E = F(r_1, \dots, r_n)$.
we have $f(x) = \prod_{i=1}^l g_i(x)$, where for $1 \leq i \leq l$, $g_i(x) \in K[x]$ are the
1. $(f + g)'(x) = f'(x) + g'(x)$
3. $x' = 1$.
As in real analysis we may quickly derive all the familiar algebraic prop-
erties of polynomial derivatives. We can now state the following proposition,
the proof of which can be found in [4, Sec. 4.4].
2.4.4 Galois Theory
Galois theory is one of the pearls of modern mathematics. It allows one
to study solutions of algebraic equations in a purely algebraic way. At the
heart of the theory is the connection between the solutions of such equations
and group theory. We will state the fundamental results without proof for
later use. An extensive treatment of this subject may (again) be found in
[4, Ch. 4]. Throughout this subsection, F denotes a field.
Definition 2.4.8. $f(x) \in F[x]$ is called separable if and only if its irreducible
factors have distinct zeroes in any splitting field.
An algebraic field extension $E$ over $F$ is called separable over $F$ if and
only if the minimum polynomial over $F$ of every element of $E$ is separable.
Also, $E$ is called normal over $F$ if and only if every irreducible polynomial
in $F[x]$ that has a zero in $E$ splits into factors of degree 1.
• $\Sigma$ is the set of subfields $K \subseteq E$ such that $F \subseteq K$.
• $\gamma: \Gamma \to \Sigma: H \mapsto \mathrm{Inv}(H)$.
• $\sigma: \Sigma \to \Gamma: K \mapsto \mathrm{Gal}(E/K)$.
Then γ and σ are inverse bijections, and we have the following properties:
Chapter 3
Definition 3.1.1 of an ordered field is not so intuitive at first glance, but it
becomes more transparent when we recall that $P$ is meant to be the set of positive
elements and we consider the following:
Proposition 3.1.2. Any ordered field $(F, P)$ induces a strict total order $>$
by:
$$\forall a, b \in F: a > b \iff a - b \in P, \qquad (3.1)$$
with the following properties:
1. $\forall a \in F: [a > 0 \iff \forall b \in F: a + b > b]$,
2. $\forall a, b \in F: a > 0 \wedge b > 0 \implies ab > 0$.
Conversely, if $>$ is a strict total order with the properties above, then
$P = \{x \in F \mid x > 0\}$ defines an ordered field $(F, P)$.
Proof. Let $(F, P)$ be an ordered field and define $>$ as above. We shall first
show that $>$ is a strict total order. Let $a, b, c \in F$ such that $a > b$ and $b > c$.
Then $a - b \in P$ and $b - c \in P$. We then have $a - c = a - b + b - c \in P$, so
$a > c$, which shows that $>$ is transitive. Then, if we take $a, b \in F$, we see
by lemma 3.1.1 that either $a = b$, $a - b \in P$, or $b - a \in P$. Therefore, either
$a = b$, $a > b$ or $b > a$, so $>$ is trichotomous and a strict total order.
To prove property 1, let $a \in F$. If $a > 0$, then $\forall b \in F: (a + b) - b = a \in P$,
so $\forall b \in F: a + b > b$. Conversely, if $\forall b \in F: a + b > b$, then this is in
particular the case for $b = 0$: $a > 0$.
To prove property 2, let $a, b \in F$ with $a > 0$ and $b > 0$. Then $a, b \in P$
and thus $ab \in P$, which shows that $ab > 0$.
Let us now suppose that we are given a strict total order $>$ on $F$ satisfying
properties 1 and 2. Define $P = \{x \in F \mid x > 0\}$. Then clearly $0 \notin P$,
since otherwise $0 > 0$, which is the first property of an ordered field. Also,
by the trichotomy of $>$, we have for any $a \in F$: $a > 0$, $a = 0$, or $0 > a$. This
means: $a = a - 0 \in P$, $a = 0$, or $-a = 0 - a \in P$, so $P$ also satisfies the
second property. Now let $a, b \in P$. Then by property 1 and the transitivity
of $>$: $a > 0 \implies a + b > b > 0 \implies a + b \in P$. Also, by property 2:
$a > 0 \wedge b > 0 \implies ab > 0 \implies ab \in P$, which finally shows that $(F, P)$ is
an ordered field.
Proof. Let $a \in F^*$. Then either $a \in P$ or $-a \in P$, so that $a^2 = (-a)^2 \in P$,
since $P$ is closed under multiplication.
We state the following lemma without proof, as it is simply proven by
considering the various possibilities of the signs of a and b.
Lemma 3.1.4 (Triangle inequality). If $(F, P)$ is an ordered field, then for
any $a, b \in F$:
$$|a + b| \leq |a| + |b|. \qquad (3.2)$$
We will now go on to prove a nice characterization of an ordered field
in terms of sums of squares. The following definition is due to Artin and
Schreier [1]¹.
Definition 3.1.2. A formally real field is a field $F$ that satisfies the following
property:
$$\forall n \in \mathbb{N}\ \forall a_1, \dots, a_n \in F: \left[ \sum_{i=1}^n a_i^2 = 0 \implies \forall i \in \{1, \dots, n\}: a_i = 0 \right],$$
i.e. the zero of the field is not the sum of non-zero squares, or the vanishing
of a sum of squares implies the vanishing of all the individual squares.
The following lemma illustrates a different characterization².
The converse of the foregoing lemma is a theorem that was proved by
Artin and Schreier [1, Satz 1.], and gives the definite answer on the connec-
tion between the sums of squares in a field and orderings. We follow a proof
of Jean-Pierre Serre [6]³. We first prove the following
Lemma 3.1.7. If $P_0$ is a subgroup of the multiplicative group $F^*$ of a field,
such that $P_0$ is closed under addition and contains all non-zero squares, and
if $a \in F^*$ such that $-a \notin P_0$, then
$$P_1 = P_0 + P_0 a = \{\, p \in F \mid \exists x, y \in P_0: p = x + ya \,\}$$
By Zorn’s Lemma we may now pick a maximal element $P_L \in T$. We
claim that this $P_L$ makes $L$ an ordered field that extends $K$. To see this,
let $a \in L^*$. If both $a$ and $-a \in P_L$, then $0 \in P_L$, which is a contradiction,
so $a$ and $-a$ cannot be simultaneously in $P_L$. Now, if $-a \notin P_L$, define
$P' = \{\, x + ya \mid x, y \in P_L \,\}$. Since $P_L$ certainly contains all non-zero squares
($1 \in P$), we can conclude by lemma 3.1.7 that $P'$ also is a subgroup of $L^*$ that
is closed under addition. Furthermore, take $p \in P$ and $x \in L^*$. Then
$px^2 = px^2(1 + a)(1 + a)^{-1} = (px^2 + px^2 a)(1 + a)^{-1} \in P'$, so $P' \in T$. Also, if
$x \in P_L$, then $x = x(1 + a)(1 + a)^{-1} = (x + xa)(1 + a)^{-1} \in P'$, so $P_L \subseteq P'$.
Because we took $P_L$ to be maximal in $T$ we can now conclude that $P_L = P'$.
Lastly, $a = a(1 + a)(1 + a)^{-1} = (a^2 + a)(1 + a)^{-1} \in P' = P_L$, since $P_L$
contains all non-zero squares. We can now conclude that either $a \in P_L$ or
$-a \in P_L$ exclusively.
The above showed that $(L, P_L)$ is an ordered field. Now, if $p \in P \subseteq K$,
then $p = p \cdot 1^2 \in P_L$, so $P \subseteq P_L$ and the order extends the order on $K$.
This is now a sum of integer multiples of squares, and thus simply a sum
of squares. Since $F$ is formally real, we can conclude that all $x_i = 0$. By
proposition 3.1.8 we can therefore conclude that there exists an order on F
that extends the standard order on Q.
As our last result on formally real/ordered fields we will give the following
lemma, which provides us with bounds on the zeroes of a monic polynomial.
Lemma 3.1.10. Let $F$ be an ordered field, $f(x) = x^n + \sum_{i=0}^{n-1} a_i x^i \in F[x]$
monic and of positive degree, and $c \in F$. Define $M = \max(1, \sum_{i=0}^{n-1} |a_i|)$.
Then $|c| > M$ implies that $|f(c)| > 0$. Conversely, if $f(c) = 0$, then
$-M \leq c \leq M$.
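Proof. Let $c \in F$ with $|c| > M$. We first note $c \neq 0$, so that
$1 = c^{-n} f(c) - \sum_{i=0}^{n-1} a_i c^{i-n}$. Also, $|c^{-n}| < 1$, and for
$i \in \{0, \dots, n-1\}$ we have $|c^{i-n}| \leq |c^{-1}|$. From this, and the triangle
inequality, it follows that:
$$\begin{aligned}
1 &= \left| c^{-n} f(c) - \sum_{i=0}^{n-1} a_i c^{i-n} \right| \\
&\leq |c^{-n}||f(c)| + \sum_{i=0}^{n-1} |a_i||c^{i-n}| \\
&\leq |f(c)| + |c^{-1}| \sum_{i=0}^{n-1} |a_i| \\
&< |f(c)| + M^{-1} M \\
&= |f(c)| + 1,
\end{aligned}$$
so that $|f(c)| > 0$.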
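The bound of Lemma 3.1.10 is easy to compute. The sketch below works in the ordered field $\mathbb{Q}$ with exact `Fraction` arithmetic; the function names `root_bound` and `evaluate_monic` are illustrative. A monic polynomial $x^n + \sum a_i x^i$ is passed as the list of its lower coefficients $a_0, \dots, a_{n-1}$.

```python
from fractions import Fraction

def root_bound(lower):
    """M = max(1, sum |a_i|) for the monic polynomial x^n + sum a_i x^i."""
    total = sum((abs(Fraction(a)) for a in lower), Fraction(0))
    return max(Fraction(1), total)

def evaluate_monic(lower, c):
    """Evaluate x^n + sum a_i x^i at c with Horner's scheme."""
    acc = Fraction(1)                 # the leading coefficient
    for a in reversed(lower):
        acc = acc * c + a
    return acc

# x^2 - 3x + 2 = (x - 1)(x - 2): lower coefficients a_0 = 2, a_1 = -3
lower = [2, -3]
M = root_bound(lower)
print(M)                              # 5

# Both zeroes lie in [-M, M], and f does not vanish outside that interval
assert all(abs(z) <= M for z in (1, 2))
assert evaluate_monic(lower, M + 1) != 0
assert evaluate_monic(lower, -M - 1) != 0
```

The bound is usually far from sharp, but it only needs to give some interval that is guaranteed to contain every zero.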
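$$\gamma \sum_{\nu=1}^n \alpha_\nu^2 + \sum_{\nu=1}^n \beta_\nu^2 + 2\sqrt{\gamma} \sum_{\nu=1}^n \alpha_\nu \beta_\nu = \sum_{\nu=1}^n (\alpha_\nu \sqrt{\gamma} + \beta_\nu)^2 = -1.$$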
If $\sum_{\nu=1}^n \alpha_\nu \beta_\nu \neq 0$, then $\sqrt{\gamma} \in F$, which leads to a
contradiction, so that this sum vanishes. Also, if $\sum_{\nu=1}^n \alpha_\nu^2 = 0$, then $-1$
would be a sum of squares in $F$, which is also a contradiction, so that sum does not
vanish. We can then conclude that $\gamma$ is not a sum of squares in $F$, since otherwise
$-1$ would be a sum of squares in $F$. Negating this statement leads to the first property.
By the first property we may now pick $\alpha, \beta \in F$ such that:
$$\alpha^2 = \sum_{\nu=1}^n \alpha_\nu^2, \qquad \beta^2 = 1 + \sum_{\nu=1}^n \beta_\nu^2$$
conclude that $g(x)$ has odd degree less than or equal to $2n - 1$. Therefore
$g(x)$ has a zero $\rho \in F$. However, then:
$$-1 = \sum_{\nu=1}^r (q_\nu(\rho))^2 + f(\rho)g(\rho) = \sum_{\nu=1}^r (q_\nu(\rho))^2.$$
Lemma 3.2.2. If a field $F$ is real closed, there exists one and only one
$P \subset F$ such that $(F, P)$ is an ordered field. I.e. a real closed field can be
uniquely ordered.
Note
From now on, when we speak of a real closed field, we will implicitly assume
that it is equipped with this unique order.
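so that $\exists \alpha \in F: a^2 + b^2 = \alpha^2$. Also, $\alpha^2 \geq a^2$ so that
$|\alpha| \geq |a|$ and hence $\exists c_1, c_2 \in F$, where we can pick $c_1 c_2$ with the
same sign as $b$, such that
$$c_1^2 = \frac{a + |\alpha|}{2}, \qquad c_2^2 = \frac{-a + |\alpha|}{2}.$$
Also:
$$(2 c_1 c_2)^2 = 4 \cdot \frac{a + |\alpha|}{2} \cdot \frac{-a + |\alpha|}{2} = -a^2 + (a^2 + b^2) = b^2.$$
We can therefore conclude that $(c_1 + c_2 i)^2 = c_1^2 - c_2^2 + 2 c_1 c_2 i = a + bi$. This
shows that there exists no algebraic extension field $E$ of $C$ with $[E : C] = 2$
(since any quadratic equation is reducible).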
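The formulas for $c_1$ and $c_2$ can be tried out numerically. The sketch below uses floating-point numbers as a stand-in for a real closed field, which is only an approximation of the exact argument; the function name `sqrt_parts` is illustrative.

```python
import math

def sqrt_parts(a, b):
    """Return (c1, c2) with (c1 + c2*i)^2 = a + b*i, following the formulas above."""
    alpha = math.hypot(a, b)                     # |alpha|, with alpha^2 = a^2 + b^2
    c1 = math.sqrt(max(0.0, (alpha + a) / 2))    # c1^2 = (a + |alpha|) / 2
    c2 = math.sqrt(max(0.0, (alpha - a) / 2))    # c2^2 = (-a + |alpha|) / 2
    if b < 0:
        c2 = -c2                                 # c1*c2 gets the same sign as b
    return c1, c2

for a, b in [(3, 4), (3, -4), (-3, 4), (-4, 0)]:
    c1, c2 = sqrt_parts(a, b)
    z = complex(c1, c2)
    # (c1 + c2 i)^2 = c1^2 - c2^2 + 2 c1 c2 i should reproduce a + b i
    assert abs(z * z - complex(a, b)) < 1e-12

print(sqrt_parts(3, 4))    # (2.0, 1.0)
```

The `max(0.0, ...)` guard only protects against rounding; in an ordered field $|\alpha| \geq |a|$ makes both radicands non-negative exactly.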
With the foregoing in mind, we now let $f(x) \in F[x]$ be a monic polynomial
of even degree. We define $E$ to be a splitting field over $F$ of $f(x)(x^2 + 1)$,
such that $C \subseteq E$. Then $E$ is Galois over $F$ (since $F$ is of characteristic 0
and thus any polynomial in $F[x]$ is separable; hence $E$ is the splitting field
of a separable polynomial). We write $|\mathrm{Gal}\,E/F| = 2^e m$, where $m$ is odd.
By Sylow’s theorem, $\mathrm{Gal}\,E/F$ contains a subgroup $H$ with $|H| = 2^e$. Let
$\mathcal{H}$ be the subfield of $E$ containing $F$ corresponding to $H$ under the Galois
pairing. Therefore, $2^e m = [E : F] = [E : \mathcal{H}][\mathcal{H} : F] = 2^e [\mathcal{H} : F]$ so that
$[\mathcal{H} : F] = m$. But since every polynomial of odd degree in $F$ has a zero in
$F$, $F$ has no proper odd-dimensional algebraic extension fields. Therefore,
$m = 1$, $H = \mathrm{Gal}\,E/F$, and $\mathcal{H} = F$. We can now conclude that because
$|\mathrm{Gal}\,E/F|$ is even, we can obtain $E$ by repeatedly adjoining square roots.
However, since we obtained $C$ by adjoining a square root and $C$ contains all
possible square roots, we must have that $C = E$. Therefore, $C$ is a splitting
field of $f(x)(x^2 + 1)$ and hence contains all zeroes of $f(x)$⁴. This shows that
every polynomial in $F[x]$ has a zero in $C$. By the reasoning above we can
then conclude that $C$ is algebraically closed.
We will now go on to show the converse. Let $F$ be a field that is not
algebraically closed, but let $C = F(i) = F[x]/(x^2 + 1)$ be algebraically closed.
3.3 The Intermediate Value Theorem
In this section we will discuss a very important theorem for real continuous
and differentiable functions that holds in the context of polynomials with
coefficients in a real closed field. This is the familiar intermediate value
theorem, and it will be the key to our success in the next chapter.
Theorem 3.3.1 (Intermediate Value Theorem). Let $F$ be a real closed field,
$f(x) \in F[x]$, $a, b \in F$ and $a < b$. Then if $f(a)f(b) < 0$, there exists a $c \in F$
such that $a < c < b$ and $f(c) = 0$.
Proof. From theorem 3.2.3 we already know that the only irreducible
polynomials in $F[x]$ are going to be those of degree 1 or 2. Furthermore, a
polynomial $x^2 + \alpha x + \beta \in F[x]$ is going to be irreducible if and only if
$\alpha^2 - 4\beta < 0$. This follows in the same way as for second degree polynomials
with real coefficients.
Now let us pick $f(x) \in F[x]$ to be monic and of positive degree. The
general case then follows quickly by dividing out the leading coefficient and
by noting that the premise cannot hold for polynomials of degree zero. We
can write $f(x)$ in terms of its irreducible factors as:
$$f(x) = \prod_{i=1}^m (x - r_i) \prod_{j=1}^s g_j(x),$$
The key property in the proof above was that every positive element of
R can be written as a square, which is a characteristic property of real closed
fields. It turns out that analogues of several other important theorems in
real analysis, such as Rolle’s Theorem and the Mean Value Theorem, hold
for polynomials in a real closed field as well.
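The intermediate value theorem is an existence statement, but a sign change also gives a practical way to narrow down where a zero lies. The sketch below is a plain bisection over $\mathbb{Q}$ with exact `Fraction` arithmetic, assuming coefficient lists with the lowest degree first; the names `evaluate` and `bisect_sign_change` are illustrative. The zero itself need not be rational; it lives in a real closed field containing $\mathbb{Q}$, and the code only shrinks a rational interval around it.

```python
from fractions import Fraction

def evaluate(p, x):
    """Evaluate a coefficient list (lowest degree first) at x with Horner's scheme."""
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def bisect_sign_change(p, a, b, width):
    """Shrink [a, b] with f(a) f(b) < 0 to an interval of length <= width
    that still has a sign change at its endpoints (or is an exact zero)."""
    a, b = Fraction(a), Fraction(b)
    fa, fb = evaluate(p, a), evaluate(p, b)
    if fa * fb >= 0:
        raise ValueError("need f(a) f(b) < 0")
    while b - a > width:
        mid = (a + b) / 2
        fm = evaluate(p, mid)
        if fm == 0:
            return mid, mid          # hit a rational zero exactly
        if fa * fm < 0:
            b, fb = mid, fm          # sign change is in the left half
        else:
            a, fa = mid, fm          # sign change is in the right half
    return a, b

# x^2 - 2 changes sign on [1, 2]; its zero there is irrational
lo, hi = bisect_sign_change([-2, 0, 1], 1, 2, Fraction(1, 1000))
assert lo * lo < 2 < hi * hi
assert hi - lo <= Fraction(1, 1000)
```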
Chapter 4
Sturm’s Theorem
In this chapter we will study the classical method for determining the num-
ber of zeroes of a polynomial with real coefficients that are contained within
an open interval, which is based on a theorem by J.C.F. Sturm, published in
1829 [7]. In particular, this method allows us to symbolically locate the ze-
roes of a polynomial up to an arbitrary precision. We will study this method
in the context of real closed fields, which we have shown to encompass the
real number system.
We will give two versions of the theorem. The first gives a decision
method in terms of variations in sign of a sequence of numbers. The second
answers when a parametrized family of polynomials has a zero in a certain
interval, by reducing it to a set of polynomial equations and inequations
for the parameters of the family, where the equations and inequations have
integer coefficients. From the last theorem we can then quickly show that if
a polynomial with rational coefficients has a zero in one real closed field, it
will have a zero in any real closed field.
Throughout this chapter, R will denote a real closed field, equipped with
the strict total order $>$. Also, if $a, b \in R$ and $a < b$ we will use the notations
$[a, b] = \{x \in R \mid a \leq x \leq b\}$ and $]a, b[ = \{x \in R \mid a < x < b\}$ for closed and
open intervals respectively.
Most of this chapter draws from [4], but several definitions and theorems
have been modified to streamline the discussion and to get some more general
results.
$$\left| \{\, i \in \{1, \dots, n'\} \mid c'_{i-1} c'_i < 0 \,\} \right|,$$
where $(c'_0, \dots, c'_{n'})$ is the subsequence obtained by dropping the zero elements
of the original sequence.
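Counting variations in sign is a short computation: drop the zeroes, then count neighbouring pairs whose product is negative. A minimal Python sketch (the function name `sign_variations` is illustrative):

```python
def sign_variations(seq):
    """Number of sign changes in seq after dropping its zero elements."""
    nonzero = [c for c in seq if c != 0]
    # A pair of neighbours contributes exactly when their product is negative
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if u * v < 0)

print(sign_variations([1, 0, -2, 3, 0, 0, 4]))   # 2
print(sign_variations([0, 0, 0]))                # 0
```

In the first call the zero-free subsequence is $(1, -2, 3, 4)$, which changes sign twice.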
1. $f_0(a) f_0(b) \neq 0$,
4. If $c \in [a, b]$ and $f(c) = 0$, there exist open intervals $]c_1, c[, ]c, c_2[ \subseteq R$
such that $\forall u \in ]c_1, c[: f_0(u) f_1(u) < 0$ and $\forall u \in ]c, c_2[: f_0(u) f_1(u) > 0$.
In the proposition below we will show that a Sturm sequence can be used
to calculate the number of distinct (i.e. not counting multiplicity) zeroes of
the polynomial that lie in some open interval.
Proof. Since the number of zeroes of all the $f_i(x)$ within $[a, b]$ is finite, we
can write them down as $a = a_0 < a_1 < \dots < a_m = b$ so that no $f_j(x)$
has a zero in any of the open intervals $]a_{i-1}, a_i[$, $1 \leq i \leq m$. Now pick for
$1 \leq i \leq m$: $c_i \in ]a_{i-1}, a_i[$.
First we see that no $f_j(x)$ has a zero in $]a_0, c_1[$. Then by the negation
of theorem 3.3.1 we have $f_j(a_0) f_j(c_1) > 0$ for $j \in \{0, \dots, s\}$. Now let
$k \in \{0, \dots, s\}$ with $f_k(a_0) = 0$. Then clearly $0 < k < s$, since
$f_0(a_0) \neq 0 \neq f_s(a_0)$, and so $f_{k-1}(a_0) f_{k+1}(a_0) < 0$. Then
$f_{k-1}(a_0) f_{k+1}(a_0) f_{k-1}(c_1) f_{k+1}(c_1) > 0$ implies that
$f_{k-1}(c_1) f_{k+1}(c_1) < 0$. Taking into account all such $k$, we get
$V_{a_0} = V_{c_1}$. In exactly the same way we may prove that $V_{c_m} = V_{a_m}$.
We now let $i \in \{1, \dots, m - 1\}$. Then if $f(a_i) \neq 0$, we can carry through
the same argument to get $V_{c_i} - V_{c_{i+1}} = 0$. If $f(a_i) = 0$, we note that
(possibly by repicking our $c_i$ and $c_{i+1}$ to comply with property 4 of a Sturm
sequence) $f_0(c_i) f_1(c_i) < 0$ and $f_0(c_{i+1}) f_1(c_{i+1}) > 0$. Furthermore, the
argument above again shows that if $1 < j < s$, then $f_{j-1}(c_i), f_j(c_i), f_{j+1}(c_i)$ and
$f_{j-1}(c_{i+1}), f_j(c_{i+1}), f_{j+1}(c_{i+1})$ have the same number of variations in sign.
Therefore in this case $V_{c_i} - V_{c_{i+1}} = 1$.
We can now write:
$$V_a - V_b = (V_a - V_{c_1}) + \sum_{i=1}^{m-1} (V_{c_i} - V_{c_{i+1}}) + (V_{c_m} - V_{a_m}) = \sum_{i=1}^{m-1} \delta_i,$$
where $\delta_i = 1$ if $f(a_i) = 0$ and $\delta_i = 0$ if $f(a_i) \neq 0$. Now since all of the zeroes
of $f(x)$ that lie within $]a, b[$ per definition are one of the $a_i$, we have counted
all the zeroes. Therefore, $V_a - V_b$ is the total number of distinct zeroes of
$f(x)$ that lie within $]a, b[$.
$$\begin{aligned}
f_0(x) &= f(x) \\
f_1(x) &= f'(x) \\
f_{i-1}(x) &= q_i(x) f_i(x) - f_{i+1}(x), \qquad \deg(f_{i+1}) < \deg(f_i), \quad 1 \leq i \leq s
\end{aligned} \qquad (4.1)$$
where $q_i(x) \in R[x]$. Then $(f_0(x), \dots, f_s(x))$ is called the standard sequence
of $f(x)$.
We notice that if $(f_0(x), \dots, f_s(x))$ is the standard sequence for some
$f(x) \in R[x]$, then $f_s(x)$ is a common factor of $f(x)$ and $f'(x)$ and all $f_i(x)$,
and any such common factor will be a factor of $f_s(x)$. Temporarily passing
to the field of fractions of $R[x]$, we can then define a derived sequence
$(g_0(x), \dots, g_s(x))$ by setting $g_i(x) = f_i(x) f_s(x)^{-1}$ for $0 \leq i \leq s$ and
observing that each $g_i(x) \in R[x]$.
Lemma 4.1.2. Let $f(x) \in R[x]$ be of positive degree and $(f_0(x), \dots, f_s(x))$
be its standard sequence. Define the derived sequence of $f(x)$ as
$(g_0(x), \dots, g_s(x))$, where $g_i(x) = f_i(x) f_s(x)^{-1} \in R(x)$¹ for $0 \leq i \leq s$. Then
each $g_i(x) \in R[x]$, and the derived sequence is a Sturm sequence for $g_0(x)$ on every
interval $[a, b]$ such that $g_0(a) g_0(b) \neq 0$.
Furthermore, $\forall c \in R: f(c) = 0 \iff g_0(c) = 0$.
¹ $R(x)$ denotes the field of fractions of $R[x]$.
Proof. We showed above that $f_s(x)$ is a common factor of all the $f_i(x)$.
Therefore, for every $0 \leq i \leq s$ we have some $h_i(x) \in R[x]$ such that
$f_i(x) = h_i(x) f_s(x)$ and thus $g_i(x) = h_i(x) f_s(x) f_s(x)^{-1} = h_i(x) \in R[x]$.
We will now show that the derived sequence is a Sturm sequence. Let
$a, b \in R$ with $a < b$ and $g_0(a) g_0(b) \neq 0$. Then clearly property 1 holds.
Furthermore, $g_s(x) = 1$, so that $g_s(x)$ has no zeroes in $R$ and hence not in
$[a, b]$. We now use the definition of the standard sequence to see that for
$1 \leq i \leq s$ (where it is understood that $g_{s+1}(x) = 0$):
$(x - a)$ nor $(x - b)$ are common factors of $f(x)$ and $f'(x)$. It then follows
that $f_s(a) \neq 0 \neq f_s(b)$ and thus the sequences
have the same variations in sign as the $g_i(a)$ and $g_i(b)$ respectively. Now, by
the foregoing proposition and the observation above, the number of distinct
zeroes of $f(x)$ in $]a, b[$ is equal to the number of distinct zeroes of $g_0(x)$ in
the interval, which is $V_a - V_b$.
We can use the foregoing result to form a useful algorithm, that runs in
polynomial time with respect to the degree of the polynomial in question.
Algorithm 3: Calculating the total number of zeroes of a polynomial
Let $f(x) = \sum_{i=0}^n a_i x^i \in R[x]$ be monic and of positive degree. Define
$\mu = 1 + \max(1, \sum_{i=0}^{n-1} |a_i|)$. Calculate the standard sequence
$(f_0(x), \dots, f_s(x))$ of $f(x)$ by repetitive use of algorithm 1. For $c \in R$, let $V_c$
denote the number of variations in sign of the sequence $(f_0(c), \dots, f_s(c))$. Then
the total number of distinct zeroes of $f(x)$ in $R$ is $V_{-\mu} - V_\mu$.
Proof. We have found in lemma 3.1.10 that all zeroes of $f(x)$ are contained in
the interval $[-M, M]$, where $M = \max(1, \sum_{i=0}^{n-1} |a_i|)$. Therefore, all zeroes of
$f(x)$ are certainly contained in the open interval $]-\mu, \mu[$, where $\mu = 1 + M$.
If we combine this with Sturm’s theorem, we get $V_{-\mu} - V_\mu$ as the total
number of distinct zeroes of $f(x)$.
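Algorithm 3 can be sketched in a few lines of Python for polynomials with rational coefficients, using exact `Fraction` arithmetic and coefficient lists with the lowest degree first. All function names are illustrative, and ordinary division over $\mathbb{Q}$ is used in place of algorithm 1, which is a simplification made here. The test polynomials below, including $x^3 + 3x - 1$, are chosen for this sketch.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients, so the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def remainder(f, g):
    """Remainder of f on division by the non-zero polynomial g."""
    r = trim(f)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]
        for j, c in enumerate(g):
            r[shift + j] -= coeff * c
        r = trim(r)
    return r

def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def standard_sequence(f):
    """f_0 = f, f_1 = f', and f_{i+1} = -(remainder of f_{i-1} by f_i)."""
    f = trim(Fraction(c) for c in f)
    seq = [f, derivative(f)]
    while seq[-1]:
        seq.append([-c for c in remainder(seq[-2], seq[-1])])
    return seq[:-1]                      # drop the final zero polynomial

def evaluate(p, x):
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def sign_variations(values):
    nz = [v for v in values if v != 0]
    return sum(1 for u, v in zip(nz, nz[1:]) if u * v < 0)

def count_distinct_zeroes(f):
    """V_{-mu} - V_mu for the monic normalisation of f, with mu = 1 + M."""
    f = trim(Fraction(c) for c in f)
    if len(f) < 2:
        raise ValueError("f must have positive degree")
    f = [c / f[-1] for c in f]           # make f monic
    mu = 1 + max(Fraction(1), sum(abs(c) for c in f[:-1]))
    seq = standard_sequence(f)
    v = lambda c: sign_variations([evaluate(p, c) for p in seq])
    return v(-mu) - v(mu)

print(count_distinct_zeroes([-1, 3, 0, 1]))   # x^3 + 3x - 1 -> 1
print(count_distinct_zeroes([1, -3, 0, 1]))   # x^3 - 3x + 1 -> 3
print(count_distinct_zeroes([2, -3, 0, 1]))   # (x - 1)^2 (x + 2) -> 2
```

The last call shows that multiple zeroes are counted once, in line with the statement about distinct zeroes.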
Example. We let f pxq x3 3x 1 P Rrxs. Then f 1 pxq 3x2 3 and the
Euclidean sequence of f pxq and f 1 pxq (and thus the standard sequence of f pxq is:
f0 pxq x3 3x 1
f1 pxq 3x 2
3
f2 pxq p2x 1q
f3 pxq .
15
4
We observe that all zeroes of $f(x)$ will lie in the interval $]-M - 1, M + 1[$, where
$M = \max(1, 4) = 4$. We therefore evaluate the standard sequence at $-5$ and $5$.
$$f_3(-5) = -\tfrac{15}{4} < 0, \qquad f_3(5) = -\tfrac{15}{4} < 0$$
From this we see that $V_{-5} - V_5 = 2 - 1 = 1$, so $f(x)$ has 1 distinct zero in any real
closed field.
4.2 Systems of equations, inequations and inequalities
This section serves as a preamble to the next section. We will now develop
the notion of a system of equations, inequations and inequalities, which
are expressions $v(t_1, \dots, t_r) = 0$, $v(t_1, \dots, t_r) \neq 0$, and $v(t_1, \dots, t_r) > 0$
respectively, where $v \in \mathbb{Z}[t_1, \dots, t_r]$ for indeterminates $t_i$, $1 \leq i \leq r$. Note
that we will write $v(t_i)$ for $v(t_1, \dots, t_r)$ if it is more convenient. We can consider
any ordered field $F$, which will contain $\mathbb{Z}$ as a subring. We then have
an evaluation homomorphism $\mathbb{Z}[t_1, \dots, t_r] \to F$ induced by the inclusion
homomorphism, that sends $\mathbb{Z}$ to $\mathbb{Z}$ and $t_i$ to some $c_i \in F$. In this way we
can look for solutions of such an expression in the extension field $F$.
We further note, that if $v(c_1, \dots, c_r) \neq 0$ and $w(c_1, \dots, c_r) \neq 0$, then
since the solutions of these two inequations are in a field $F$, we can rewrite
this equivalently as $v(c_1, \dots, c_r) w(c_1, \dots, c_r) \neq 0$. So, any finite set of
inequations can be replaced by a single inequation. We can now state the
following definition.
Definition 4.2.1. An r-system (of equations, inequations and inequalities)
is a triple
$v_1(c_i) = \dots = v_s(c_i) = 0$,
$v_{\neq}(c_i) \neq 0$,
$v_{>1}(c_i), \dots, v_{>u}(c_i) > 0$.
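To make the three kinds of conditions concrete, the sketch below checks whether a point satisfies a system. It assumes, purely for illustration, that a system is stored as a triple (list of equations, one inequation, list of inequalities), with each polynomial given as a Python function with integer coefficients; this representation and the name `satisfies` are not taken from the text.

```python
from fractions import Fraction

def satisfies(system, point):
    """Check all equations (= 0), the inequation (!= 0) and all inequalities (> 0)."""
    equations, inequation, inequalities = system
    return (all(v(*point) == 0 for v in equations)
            and inequation(*point) != 0
            and all(v(*point) > 0 for v in inequalities))

# A hypothetical 2-system in the indeterminates t1, t2:
#   t1 - t2^2 = 0,   t2 != 0,   t1 - 1 > 0
system = (
    [lambda t1, t2: t1 - t2 * t2],
    lambda t1, t2: t2,
    [lambda t1, t2: t1 - 1],
)

print(satisfies(system, (Fraction(4), Fraction(2))))   # True
print(satisfies(system, (Fraction(1), Fraction(1))))   # False, since t1 - 1 = 0
print(satisfies(system, (Fraction(0), Fraction(0))))   # False, since t2 = 0
```

Because a finite set of inequations can be multiplied into one, a single inequation slot is enough in this representation.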
Also, a refinement of an r-cover $\gamma$ is an r-cover $\delta$, such that for any
ordered field $F$ of $K$: $\forall \Delta \in \delta\ \exists \Gamma \in \gamma: \Delta(F) \subseteq \Gamma(F)$.
We will give the following lemmas without proof, as they are quite
straightforward if you just write out the definitions.
Since A is a commutative ring, we can perfectly well perform Euclidean
polynomial division in Arxs. If we now make the connection with the eval-
uation in pc1 , . . . , cr q we can make the following important observation.
Lemma 4.3.1. Let $F(t_i; x), G(t_i; x) \in A[x]$ with $G(t_i; x) \neq 0$ and $v_m(t_i)$ the
leading coefficient of $G$. Then there exists an even $e \in \mathbb{N}$ and
$Q(t_i; x), R(t_i; x) \in A[x]$ with $\deg(R) < \deg(G)$ and:
$$v_m(t_i)^e F(t_i; x) = Q(t_i; x) G(t_i; x) + R(t_i; x).$$
Also, if $(c_1, \dots, c_r) \in R^{(r)}$ and $v_m(c_i) \neq 0$, then the $q(x), r(x) \in R[x]$ with
$F(c_i; x) = q(x) G(c_i; x) + r(x)$ and $\deg(r) < \deg(G(c_i))$ differ from $Q(c_i; x)$
and $R(c_i; x)$ by a common positive multiplier.
We also note that the choice of the $Q(t_i; x)$, $R(t_i; x)$ and $e$ are independent
of which real closed field we use.
Proof. The existence of some (not necessarily even) $e \in \mathbb{N}$ and of the $Q(t_i; x), R(t_i; x) \in A[x]$
follows from the Euclidean division algorithm. However, if $e$ is odd, we may
multiply the entire equation by $v_m(t_i)$ and so obtain a new $\tilde{Q}(t_i; x)$ and
$\tilde{R}(t_i; x)$ and an even $\tilde{e}$ so that the equation still holds.
Now, if $(c_1, \dots, c_r) \in R^{(r)}$ is such that $v_m(c_i) \neq 0$, then since $e$ is even we
have $v_m(c_i)^e > 0$. Then evaluating the equation in the $c_i$ and dividing by
$v_m(c_i)^e$, we obtain:
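Assuming the identity of the lemma has the form $v_m(t_i)^e F(t_i; x) = Q(t_i; x)G(t_i; x) - R(t_i; x)$ (the minus sign follows the convention of the standard sequence and is an assumption here), evaluating in the $c_i$ and dividing by $v_m(c_i)^e$ would give:

```latex
\[
  F(c_i; x) \;=\; \frac{Q(c_i; x)}{v_m(c_i)^e}\, G(c_i; x) \;-\; \frac{R(c_i; x)}{v_m(c_i)^e},
\]
```

so that $q(x) = Q(c_i; x)/v_m(c_i)^e$ and $r(x) = R(c_i; x)/v_m(c_i)^e$ differ from $Q(c_i; x)$ and $R(c_i; x)$ by the common positive multiplier $v_m(c_i)^{-e}$.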
Proof. We consider any $k \in \{0, \dots, m\}$ with $v_k(t_i) \neq 0$ (or equivalently
$\Gamma_k(R) \neq \emptyset$), for else $G_k(t_i; x) = G_j(t_i; x)$ for some $j < k$ and $\Gamma_k$ would not
be contributing to the cover $\gamma$. We can then just as well omit $\Gamma_k$ in our
refinement. Now find $Q_k(t_i; x), R_k(t_i; x) \in A[x]$ as in the foregoing lemma.
We have to consider two cases.
If $R_k(t_i; x) = 0$, we can take the sequence $(F, G, 0)$ and $\Gamma_k$ as the
corresponding system. This suffices because if $(c_1, \dots, c_r) \in \Gamma_k(R)$, then
$G(c_i; x) = G_k(c_i; x)$ and thus the Euclidean sequence would be $(F(c_i), G_k(c_i))$.
Note that we will use this case as an induction basis in the next case.
Now let $R_k(t_i; x) \neq 0$. If $k = m > \deg(F)$, then we see that $R_k(t_i; x) = -F(t_i; x)$.
We may then obtain the result for $G(t_i; x)$ and $R_k(t_i; x)$, by going
through the argument again and seeing that this case is then excluded.
Otherwise $\deg(R_k) + \deg(G_k) < \deg(F) + \deg(G)$, so by induction on the
sum of the degrees, we may obtain a cover $\delta_k = \{\Delta_{k0}, \dots, \Delta_{kh_k}\}$ and $h_k$
sequences $(F_{kl0}(t_i; x), \dots, F_{kls_{kl}}(t_i; x))$ so that the required property holds
for $G_k(t_i; x)$ and $R_k(t_i; x)$. We now define the join $\Gamma_{kl} = \Gamma_k \sqcup \Delta_{kl}$ for $l \in \{0, \dots, h_k\}$.
Then, if $(c_i) \in \Gamma_{kl}(R) \subseteq \Gamma_k(R)$, we have $G_k(c_i; x) = G(c_i; x)$. Also, since
$F_{kl0}(c_i; x) = G_k(c_i; x) = G(c_i; x)$ and $F_{kl1}(c_i; x) = R_k(c_i; x)$, we can take
the sequences $(F(c_i; x), F_{kl0}(c_i; x), \dots, F_{kls_{kl}}(c_i; x))$, whose terms differ from
the Euclidean sequence of $F(c_i; x)$ and $G(c_i; x)$ by a positive multiplier, and
pair these with the respective $\Gamma_{kl}$.
If we now let $\delta$ consist of the systems obtained above, and pair these with
their respective sequences, including $\Gamma_\infty$ with $(F, 0, 0)$, we have obtained a
refinement of $\gamma$ that satisfies our requirements. We also note that the
choice of the systems and sequences did not depend on the real closed field
in question, so that the property holds for any real closed field.
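Example. Let $F(p, q; x) = x^2 + px + q$ and $G(p, q; x) = 2x + p$. We then have
$\Gamma_\infty(R) = \Gamma_0(R) = \emptyset$ and $\Gamma_1(R) = R^{(2)}$. We therefore consider only $k = 2$,
$G_2(p, q; x) = G(p, q; x)$. We first observe that:
$\Gamma_1 : p^2 - 4q = 0 \leftrightarrow (F, G)$
$\Gamma_2 : p^2 - 4q \neq 0 \leftrightarrow (F, G, R_2)$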
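The division behind these two systems can be checked by hand. A sketch, assuming the convention $v_m^e F = QG - R$ with leading coefficient $v_m = 2$ and $e = 2$:

```latex
\[
  4\,(x^2 + px + q) \;=\; (2x + p)(2x + p) \;-\; (p^2 - 4q),
\]
```

so that $Q = 2x + p$ and $R = p^2 - 4q$. The remainder vanishes exactly when the discriminant does, which is what separates the two systems.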
If we now recall that the standard sequence of a polynomial $f(x)$ is
simply the Euclidean sequence of $f(x)$ and its formal derivative, we can
quickly prove the following theorem, which is our second main result. Note
that this version is more general than the one in [4], as the systems we
obtain include requirements on the bounds of our interval.
Theorem 4.3.3 (Parametrized version of Sturm’s Theorem). Let $F(t_i; x) \in A[x]$.
Then there exists a finite set of $(r+2)$-systems$^5$ $\omega$ in $K$, which we
can obtain in a finite number of steps, such that for every $(c_1, \dots, c_r, a, b) \in R^{(r+2)}$
with $a < b$, $F(c_i; x)$ has a zero in $[a, b]$ if and only if either $F(c_i; a)F(c_i; b) = 0$,
or $F(c_i; a)F(c_i; b) \neq 0$ and there is some $\Omega \in \omega$ such that
$(c_1, \dots, c_r, a, b) \in \Omega(R)$.
We can restate this theorem as follows: Let $F(t_i; x)$ be a family of polynomials
whose coefficients are parametrized by polynomials with integer coefficients.
Then for any interval $]a, b[$ we can obtain a finite set of systems
of equations, inequations and inequalities, so that $F(c_i; x)$ has a zero in that
interval if and only if the coefficient parameters $(c_i)$ and the boundaries $a$
and $b$ satisfy one of those systems (provided that $F(c_i; a)F(c_i; b) \neq 0$).
possible $(r+2)$-systems on $K$ that can be formed by the elements of those
sequences (which is finite), single out the ones that lead to a difference
in the number of variations in sign between the sequences, and for
each take the join with $\Delta_j$$^6$ to form the set of systems $\omega_j$. Then, if
$(c_i; a, b) \in \Omega(R)$ for some $\Omega \in \omega_j$, then $(c_i) \in \Delta_j(R)$, so that the above applies,
and there is a difference between the variation in sign in the sequences
$(\alpha_{jl}(c_i, a, 0))$ and $(\beta_{jl}(c_i, 0, b))$, so that $F(c_i; x)$ has a zero in $]a, b[$, provided
that $F(c_i; a)F(c_i; b) \neq 0$. We also observe that if $F(c_i; a)F(c_i; b) = 0$, then
$F(c_i; x)$ has a zero in $[a, b]$.
If we now let $\omega = \bigcup_{j=0}^{h} \omega_j$, we obtain the set of $(r+2)$-systems we require,
since $\delta$ is a cover of $K$.
Example. In our last example we obtained a 2-cover and the corresponding
sequences for $F(p, q; x) = x^2 + px + q$ and $F'(p, q; x) = 2x + p$. We write:
$\Delta_1 : p^2 - 4q = 0 \leftrightarrow (F, F')$
$\Delta_2 : p^2 - 4q \neq 0 \leftrightarrow (F, F', p^2 - 4q)$.
For $\Delta_1$ we get the sole system
$\Omega_{11} = ((p^2 - 4q),\ 1,\ ((x_a^2 + p x_a + q)(2x_a + p),\ (x_b^2 + p x_b + q)(2x_b + p)))$,
that is, $\alpha_{1l}$ will change sign, but $\beta_{1l}$ will not.
For $\Delta_2$, we have to consider three cases: the $\alpha_{2l}$ change sign once, and the $\beta_{2l}$
don’t (one zero), the $\alpha_{2l}$ change sign twice, and the $\beta_{2l}$ don’t (two zeroes), or the
$\alpha_{2l}$ change sign twice and the $\beta_{2l}$ change sign once (one zero). These cases can all
occur in several ways, and so we end up with a whole pile of systems.
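The three cases for $\Delta_2$ can be observed numerically. The following minimal sketch evaluates the sequence $(F, F', p^2 - 4q)$ at both interval bounds and compares the numbers of variations in sign. The function names are hypothetical, and the count is only meaningful under the proviso $F(a)F(b) \neq 0$ used in the theorem.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def sign_variations(values):
    """Count sign changes in a sequence, ignoring zero entries."""
    signs = [sign(v) for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def quadratic_sequence(p, q, x):
    """Values at x of (F, F', p^2 - 4q) for F = x^2 + p*x + q."""
    return [x * x + p * x + q, 2 * x + p, p * p - 4 * q]


def zeros_between(p, q, a, b):
    """Number of distinct zeros of F between a and b, assuming F(a)F(b) != 0."""
    return (sign_variations(quadratic_sequence(p, q, a))
            - sign_variations(quadratic_sequence(p, q, b)))
```

For $F = x^2 - 1$, the interval from $0$ to $2$ gives one variation at the left bound and none at the right (one zero), the interval from $-2$ to $2$ gives two and none (two zeroes), and the interval from $-2$ to $0$ gives two and one (one zero).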
where $\mu$ is formed from $(k+1)$ and $\sum_{\nu=0}^{m-1} u_\nu(c_i)^2 / u_k(c_i)^2$. We may from that point on let
$a(t_i), b(t_i) \in A$ depend on the parameters and modify the $\alpha(t_i)$ and $\beta(t_i)$
accordingly, and finish the argument in the same way. We can therefore
state the following corollary.
Corollary 4.3.4. Let $F(t_i; x) \in A[x]$. Then we can construct a finite set of
$r$-systems $\omega$ in $K$ such that for any real closed field $R$ and $(c_1, \dots, c_r) \in R^{(r)}$:
$F(c_i; x)$ has a zero in $R$ if and only if $(c_1, \dots, c_r) \in \Omega(R)$ for some $\Omega \in \omega$.
Restated: Let $F(t_i; x) \in A[x]$ be a family of polynomials whose coefficients
are parametrized by polynomials with integer coefficients. Then we
can construct a finite set of systems of polynomial equations, inequations and
inequalities with integer coefficients, independent of the real closed field in
question, so that for any choice $(c_i)$ of the parameters, $F(c_i; x) \in R[x]$
has a zero in $R$ if and only if the $(c_i)$ satisfy one of the constructed systems.
Now let $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$ and let $F(t_i; x) = \sum_{i=0}^{n} t_i x^i$. We then
see that $F(a_i; x) = f(x)$. Suppose that all the $a_i \in \mathbb{Q} \subseteq R$ and that $f(x)$ has
a zero in $R$. Then by corollary 4.3.4 we can construct a set of $(n+1)$-systems in
$\mathbb{Z}$ such that the $(a_i)$ satisfy one of those systems. Now let $R'$ be another real
closed field. Then clearly all $a_i \in R'$ (by an isomorphism of the prime fields)
and they still satisfy one of those systems. Therefore, the corresponding
polynomial in $R'[x]$ will also have a zero in $R'$.
Corollary 4.3.5. If a polynomial $f(x)$ with rational coefficients has a zero
in one real closed field, it will have a zero in any real closed field.
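One way to make this corollary tangible: the number of distinct real zeroes of a polynomial with rational coefficients can be decided with rational arithmetic alone, so the answer cannot depend on the ambient real closed field. The sketch below builds the standard sequence with exact fractions and compares the variations in sign at $-\infty$ and $+\infty$. It is an illustration under these assumptions, not the construction used in the proofs above, and the function names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def derivative(p):
    return trim([i * c for i, c in enumerate(p)][1:])


def negated_remainder(f, g):
    """Return -(f mod g), computed with exact rational arithmetic."""
    r = trim([Fraction(c) for c in f])
    while len(r) >= len(g):
        factor = r[-1] / g[-1]
        shift = len(r) - len(g)
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        r = trim(r)
    return [-c for c in r]


def standard_sequence(f):
    """Standard sequence f, f', -rem(f, f'), ... without the final zero."""
    f = trim([Fraction(c) for c in f])
    if not f:
        raise ValueError("the zero polynomial has no standard sequence")
    seq = [f, derivative(f)]
    while seq[-1]:
        seq.append(negated_remainder(seq[-2], seq[-1]))
    return seq[:-1]


def variations(signs):
    nonzero = [s for s in signs if s != 0]
    return sum(1 for s, t in zip(nonzero, nonzero[1:]) if s != t)


def count_real_zeros(f):
    """Number of distinct zeroes, read off from signs at -inf and +inf."""
    seq = standard_sequence(f)
    at_plus = [(p[-1] > 0) - (p[-1] < 0) for p in seq]
    at_minus = [s * (-1) ** (len(p) - 1) for s, p in zip(at_plus, seq)]
    return variations(at_minus) - variations(at_plus)
```

For instance, `count_real_zeros([-2, 0, 1])` reports two zeroes for $x^2 - 2$ although neither zero is rational; only the coefficients enter the computation.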
This last corollary is, among other things, of great importance for computer
calculations. For example, the computable numbers, described by Turing as “the numbers
whose expressions as a decimal are calculable by a machine” [9], can be shown
to be real closed [2]. This gives the result that a polynomial with rational
coefficients has a zero in the real numbers if and only if it has a zero
in the computable numbers. Therefore, for any polynomial with rational
coefficients, we are in principle able to compute all its real zeroes with a
computer (or any realization of a Turing machine).
This method has an important application in the field of metamathematics,
where the properties of mathematics itself are studied. In
particular, it implies that every “elementary” sentence in the logic of a real
closed field is decidable. This was shown by Tarski in 1948 for the real
numbers [8]. Note that by the logic of a real closed field we mean the first-order
logic that remains when only the axioms of the field itself are assumed.
Set-theoretic sentences are not allowed. To quote Tarski, the result nevertheless
“gives the mathematician the assurance that he will be able to solve every
such problem (an elementary problem in a real closed field) by working at it
long enough.” And with that assurance we can continue to make algebraic
exercises for high school students.
Epilogue
Sturm’s Theorem has provided us with a very simple way to determine the
number of zeroes of a polynomial that lie within a certain interval. It is interesting to
note that despite the simplicity of this method, it is not widely taught in
undergraduate calculus courses. Perhaps this can be attributed to the
inefficiency of the algorithm compared to more modern root-finding methods,
the amount of algebra involved, or simply its age (almost 200 years!). In
any case I would like to express my hope that the tide may turn in
this respect.
Nevertheless, the theorem not only provides us with this calculation
method, it also leads to several important theoretical implications. As ex-
amples we have seen the decidability of the theory of real closed fields (in
metamathematics), and the fact that if a polynomial with rational coeffi-
cients is going to have a zero in one real closed field, then it is going to
have one in every real closed field. The last result finds an application in
computer science, where we can conclude that we can compute every zero of
a polynomial with rational (even computable!) coefficients with a computer
program.
I have personally enjoyed this project very much due to the large amount
of new algebra I have come to learn, and the discovery of an obscure, but fun
and useful result. I know that I will definitely have use for Sturm’s Theorem
in the future.
Lastly, I would like to acknowledge Prof. Dr. Jaap Top and Dr. Ramsay
Dyer for their support during the course of this project. Prof. Top
recommended this project, and both provided me with very useful
feedback on the report, for which I am very grateful.
Bibliography
[6] J.P. Serre. Extensions de corps ordonnés. In Comptes rendus des séances
de l’Académie des Sciences, pages 576–577, September 1949.
[7] J.C.F. Sturm. Mémoire sur la résolution des équations numériques.
Bulletin des Sciences de Férussac, 11:419–425, 1829.