Sturm’s Theorem: determining the number of zeroes of polynomials in an open interval.

Bachelor’s thesis

Eric Spreen
University of Groningen

July 12, 2014

Supervisors:
Prof. Dr. J. Top
University of Groningen

Dr. R. Dyer
University of Groningen
Abstract

A review of the theory of polynomial rings and extension fields is presented,
followed by an introduction to ordered, formally real, and real closed fields.
This theory is then used to prove Sturm’s Theorem, a classical result that
enables one to find the number of roots of a polynomial that are contained
within an open interval, simply by counting the number of sign changes in
two sequences. This result can be extended to decide the existence of a
root of a family of polynomials, by evaluating a set of polynomial equations,
inequations and inequalities with integer coefficients.
Contents

1 Introduction

2 Polynomials and Extensions
2.1 Polynomial rings
2.2 Degree arithmetic
2.3 Euclidean division algorithm
2.3.1 Polynomial factors
2.4 Field extensions
2.4.1 Simple Field Extensions
2.4.2 Dimensionality of an Extension
2.4.3 Splitting Fields
2.4.4 Galois Theory

3 Real Closed Fields
3.1 Ordered and Formally Real Fields
3.2 Real Closed Fields
3.3 The Intermediate Value Theorem

4 Sturm’s Theorem
4.1 Variations in sign
4.2 Systems of equations, inequations and inequalities
4.3 Sturm’s Theorem Parametrized
4.3.1 Tarski’s Principle

Chapter 1

Introduction

In many of the natural sciences, polynomials and polynomial systems occur
as useful approximations to real-world phenomena. As an example of this
we give the harmonic oscillator, which is used in physics to approximate
dynamical systems that are very close to an equilibrium point. The potential
energy of such a (one-dimensional) system takes the form

$$V(x) = \tfrac{1}{2} k x^2,$$

which is a second-degree polynomial in one variable. A similar case occurs
in higher-dimensional systems (in 3D, and with multiple bodies). Another
example is the hydrogen atom in quantum mechanics, where the radial
wavefunctions take the form [3]

$$R_{nl}(r) = \frac{1}{r} e^{-\rho} \rho^{l+1} v(\rho), \qquad \rho = \frac{r}{an},$$

where $v(\rho)$ is a polynomial, $n \in \mathbb{N}$, $n > 0$ and $l \in \mathbb{N}$.$^1$ It is a crucial
problem to find the zeroes of these functions in order to determine the
electronic structure of the atom, which can be done by finding the zeroes of
the polynomial $\rho^{l+1} v(\rho)$.
It is clear from these examples that finding solutions of polynomial equations
is a fundamental problem in applied mathematics. A classical result
that enables us to do this numerically is Sturm’s Theorem, named after
Jacques Charles François Sturm. This theorem gives the number of zeroes
of a polynomial that are contained within a certain open interval, enabling
us to determine the zeroes (by partitioning the number line appropriately,
up to machine precision) of a polynomial numerically.
We will discuss Sturm’s Theorem in the context of real closed fields,
an abstraction of the real number system that has significantly different
realizations. The key to the success of Sturm’s Theorem in real closed fields
is the analog of the intermediate value theorem for polynomials. It can be
shown that this result, and various other key theorems from real analysis,
hold for polynomials in real closed fields.

$^1$ We use the convention that $0 \in \mathbb{N}$.

After the discussion of Sturm’s Theorem, we will discuss an extension
of Sturm’s Theorem that allows us to simplify the problem of the existence
of a zero in a certain interval for a whole family of polynomials. The result
will be a finite set of systems of polynomial equations, inequations and
inequalities with integer coefficients, any one of which may be satisfied by
the parameters of the family for the resulting polynomial to have a zero in
the interval. From this we can quickly establish criteria for the existence
of a zero of a whole family of polynomials. A secondary result is that if a
polynomial with rational coefficients has a zero in one real closed field, it
will have a zero in every other real closed field.
As a high school student I often wondered whether it would be
possible to form an equation whose solvability is undecidable, in particular
when I was unable to solve a particular problem. At the very end we
will touch on this question.
A significant portion of this report follows [4] and [5]. If no citations
have been provided, these are the sources. We will assume that the reader
has a basic understanding of algebraic structures, such as monoids, groups
and rings.

Chapter 2

Polynomials and Extensions

Before we can begin our study of real closed fields, we will develop the theory
of polynomial rings over a field to some extent. This chapter will give basic
results on arbitrary polynomial rings, and applications to extension rings
and fields. Most of this chapter will follow [4].
We will say that a subring $R$ of a ring $S$ is generated by a set $A \subset S$
if $R$ is the smallest subring that contains $A$. Also, if $u_1, \ldots, u_n \in S$ (and
$\forall r \in R : u_i r = r u_i$, $1 \le i \le n$), then we denote the ring that is generated by
$R \cup \{u_1, \ldots, u_n\}$ by $R[u_1, \ldots, u_n]$. We can readily note that
$R[u_1, u_2] = (R[u_1])[u_2]$ by definition. The existence of such a subring follows from the
observation that any arbitrary intersection of subrings of $S$ is again a subring
of $S$.

2.1 Polynomial rings


Definition 2.1.1. Given a ring $R$, its polynomial ring $R[x]$ is the ring of
functions $f : \mathbb{N} \to R$ such that there exists a $k \in \mathbb{N}$ with
$\forall n \in \mathbb{N} : n \ge k \implies f(n) = 0$. Addition and multiplication in $R[x]$ are
defined as follows; for $f(x), g(x) \in R[x]$ and any $n \in \mathbb{N}$:

$$(f + g)(n) = f(n) + g(n), \qquad (fg)(n) = \sum_{i=0}^{n} f(i)\, g(n - i),$$

and 0 and 1 are defined in the obvious way. The elements of $R[x]$ are called
polynomials with coefficients in $R$, and the function values of a polynomial are
called the coefficients of a polynomial.
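
To make Definition 2.1.1 concrete, the following Python sketch models a polynomial as the finite list of its function values $f(0), f(1), \ldots$ with trailing zeroes dropped. The choice $R = \mathbb{Z}$ and the helper names `trim`, `poly_add` and `poly_mul` are assumptions made for illustration only; they do not come from [4].

```python
from typing import List

Poly = List[int]  # [f(0), f(1), ...]; the empty list is the zero polynomial


def trim(f: Poly) -> Poly:
    """Drop trailing zero coefficients so every polynomial has one representation."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_add(f: Poly, g: Poly) -> Poly:
    """(f + g)(n) = f(n) + g(n)."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])


def poly_mul(f: Poly, g: Poly) -> Poly:
    """(f g)(n) = sum_{i=0}^{n} f(i) g(n - i), computed term by term."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)


x = [0, 1]  # the indeterminate: x(1) = 1 and x(n) = 0 otherwise

print(poly_mul([1, 1], [1, 1]))   # (1 + x)^2 -> [1, 2, 1]
print(poly_add([1, 2], [-1, -2])) # f + (-f)  -> []
```

The nested loop is exactly the convolution sum of the definition, reindexed by $i + j = n$.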

We will establish the basic properties of polynomial rings. The proofs
will be omitted and can be found in [4, Sec.2.10].

Proposition 2.1.1 (Properties of polynomial rings). Let $R$ be a ring and
$R[x]$ its polynomial ring. Then:

1. There exists an injective homomorphism $R \to R[x]$, so that $R$ may be
regarded as a subring of $R[x]$.

2. Let $x \in R[x]$ be the polynomial with $x(1) = 1$ and $x(n) = 0$ if
$n \in \mathbb{N} \setminus \{1\}$. Then $R[x]$ is generated by $R \cup \{x\}$. Furthermore, all elements
of $R$ commute with $x$.

3. For any $f(x) \in R[x]$, there exists an $n \in \mathbb{N}$ and unique $a_0, \ldots, a_n \in R$
such that $f(x) = \sum_{i=0}^{n} a_i x^i$.

4. If $R$ is commutative, then so is $R[x]$.

Proposition 2.1.2 (Evaluation homomorphism). If $R$ and $S$ are rings,
$\varphi : R \to S$ is a homomorphism, $u \in S$ and $\forall r \in R : \varphi(r)u = u\varphi(r)$, then
there exists a unique homomorphism $\psi : R[x] \to S$ such that $\psi|_R = \varphi$ and
$\psi(x) = u$. Also, the kernel of $\psi$ is an ideal $I \subseteq R[x]$ such that $I \cap R = \ker(\varphi)$.
This homomorphism is called the evaluation homomorphism in $u$.
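
As a small sanity check of Proposition 2.1.2, the sketch below evaluates integer polynomials (coefficient lists, lowest degree first) with Horner's rule and confirms on one example that evaluation respects products. We assume $R = S = \mathbb{Z}$ and $\varphi$ the identity; `evaluate` and `poly_mul` are hypothetical helper names, and a single numerical check is of course no proof.

```python
def evaluate(f, u):
    """Image of f(x) = sum a_i x^i under the evaluation homomorphism in u (Horner's rule)."""
    acc = 0
    for a in reversed(f):
        acc = acc * u + a
    return acc


def poly_mul(f, g):
    """Product of two coefficient lists (lowest degree first)."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


f = [1, 0, 2]   # 1 + 2x^2
g = [-3, 1]     # -3 + x
u = 5

# psi(f g) = psi(f) psi(g): both sides equal 102 here
print(evaluate(poly_mul(f, g), u), evaluate(f, u) * evaluate(g, u))
```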

Note: Evaluation in an overring

From the proposition above, it follows immediately that if we let $R$ be a
subring of $S$ and $\varphi$ the inclusion homomorphism, the kernel of every
evaluation homomorphism in an element of $S$ will be an ideal $I$ of $R[x]$ with
$I \cap R = \{0\}$.

We can note that if $R$ is a ring, then the following property is “universal”
for the polynomial ring $R[x]$ (in the sense that any rings that have this
property are isomorphic): if $S$ is any other ring, and $\varphi : R \to S$ is a
homomorphism, $u \in S$ and $\forall r \in R : \varphi(r)u = u\varphi(r)$, then there exists an
$x \in R[x]$ and a unique homomorphism $\psi : R[x] \to S$ so that $\psi|_R = \varphi$,
$\psi(x) = u$ and $R[x]$ is generated by $R \cup \{x\}$. [4, p.124]

Note: Notation of polynomials

From now on, we will denote a polynomial with coefficients in a ring $R$ as
$f(x)$ and its value under an evaluation homomorphism in some $u$ as $f(u)$.
Also, if $f(u) = 0$, then $u$ is called a zero of $f(x)$ in $S$.

Corollary 2.1.3. If $R$ and $S$ are rings, and $\varphi : R \to S$ is a homomorphism,
then there exists a unique homomorphism $\psi : R[x] \to S[x']$ such that $\psi|_R = \varphi$
and $\psi(x) = x'$.

We will also use the notion of a polynomial in multiple indeterminates.
We can formalize this notion by (for any $n \in \mathbb{N}$, $n > 1$) defining the ring
$R[x_1, \ldots, x_n] := R[x_1] \cdots [x_n]$. By induction we can then get an evaluation
homomorphism in multiple variables.

2.2 Degree arithmetic

We have seen that for any polynomial $0 \ne f(x) = \sum_{i=0}^{n} a_i x^i$ there is some
$k \in \mathbb{N}$ such that $a_k \ne 0$, but $a_i = 0$ if $i > k$. This observation is a strong
tool that we will use often in further arguments. We therefore define

Definition 2.2.1. If $R$ is a ring and $0 \ne f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$, then the
degree of $f(x)$ is the largest $k \in \mathbb{N}$ such that $a_k \ne 0$. If $f(x) = 0$, then the
degree is $-\infty$. We will denote the degree of $f(x)$ by $\deg(f)$ or $\deg(f(x))$.
Furthermore, the leading coefficient of $f(x)$ is $a_{\deg(f)}$ if $f(x) \ne 0$ and $0$
if $f(x) = 0$. This will be denoted by $\operatorname{lc}(f)$ or $\operatorname{lc}(f(x))$. A polynomial $f(x)$
will be called monic if $\operatorname{lc}(f) = 1$.

The following two lemmas can be proven quickly by the definition of the
degree and considering the leading coefficients of $f(x) + g(x)$ and $f(x)g(x)$
respectively.

Lemma 2.2.1. If $R$ is a ring, then for any $f(x), g(x) \in R[x]$:
$\deg(f(x) + g(x)) \le \max(\deg(f), \deg(g))$.

Lemma 2.2.2. If $D$ is a domain, then $D[x]$ is also a domain, and for all
$f(x), g(x) \in D[x]$ we have $\deg(fg) = \deg(f) + \deg(g)$.$^1$ Also, the units of
$D[x]$ will be the units of $D$.
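
The domain hypothesis in Lemma 2.2.2 matters. The sketch below compares the same product over $\mathbb{Z}$ (a domain) and over $\mathbb{Z}/6\mathbb{Z}$ (not a domain, since $2 \cdot 3 = 0$). The helpers `degree` and `mul` are hypothetical names, and the modulus 6 is an example chosen here, not one taken from the text.

```python
import math


def degree(f):
    """Largest k with a_k != 0, or -infinity for the zero polynomial."""
    for k in range(len(f) - 1, -1, -1):
        if f[k] != 0:
            return k
    return -math.inf


def mul(f, g, modulus=None):
    """Product of coefficient lists; coefficients are reduced mod `modulus` if given."""
    out = [0] * max(len(f) + len(g) - 1, 0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return [c % modulus for c in out] if modulus else out


f = [1, 2]  # 1 + 2x
g = [1, 3]  # 1 + 3x

print(degree(mul(f, g)))      # over Z: 2 = deg(f) + deg(g)
print(degree(mul(f, g, 6)))   # over Z/6Z: 1, because the x^2 coefficient 6 vanishes
print(degree([]) + 3)         # -inf, matching the convention -inf + a = -inf
```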

2.3 Euclidean division algorithm


The proof of the following proposition will be given when we prove algorithm 1.

Proposition 2.3.1. Let $R$ be a commutative ring, and $f(x), g(x) \in R[x]$
with $g(x) \ne 0$, $m = \deg(g)$ and $b_m$ the leading coefficient of $g(x)$. Then
there exist $k \in \mathbb{N}$, $q(x), r(x) \in R[x]$ such that:

$$b_m^k f(x) = q(x) g(x) + r(x) \;\wedge\; \deg(r) < \deg(g). \tag{2.1}$$

Corollary 2.3.2. Let $F$ be a field and $f(x), g(x) \in F[x]$ with $g(x) \ne 0$.
Then there exist unique $q(x), r(x) \in F[x]$ such that:

$$f(x) = q(x) g(x) + r(x) \;\wedge\; \deg(r) < \deg(g). \tag{2.2}$$

Proof. We can find some $k \in \mathbb{N}$ and $q(x), r(x) \in F[x]$ such that
$b_m^k f(x) = q(x)g(x) + r(x)$ and $\deg(r) < \deg(g)$, where $b_m = \operatorname{lc}(g)$. Now since $g(x) \ne 0$,
we have $b_m \ne 0$ and thus $f(x) = (b_m^{-k} q(x)) g(x) + (b_m^{-k} r(x))$. Also, since $F$
is a domain: $\deg(b_m^{-k} r(x)) = \deg(b_m^{-k}) + \deg(r(x)) = \deg(r(x)) < \deg(g(x))$.
Now let $q_1(x), r_1(x) \in F[x]$ also satisfy (2.2). Then $(q(x) - q_1(x)) g(x) = r_1(x) - r(x)$.
Without loss of generality we may assume that $\deg(r) \ge \deg(r_1)$. It then follows that
$\deg(g) > \deg(r) \ge \deg(r_1 - r) = \deg(q - q_1) + \deg(g)$
and this is only possible if $\deg(q - q_1) = -\infty$. Then $q(x) = q_1(x)$ and
thus $r_1(x) = r(x)$.

$^1$ It is to be understood here that $-\infty + a = -\infty$ for any $a \in \mathbb{N} \cup \{-\infty\}$.

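Algorithm 1: Euclidean Division Algorithm

Let $R$ be a commutative ring and $f(x), g(x) \in R[x]$ with $f(x), g(x) \ne 0$.
Also let $m = \deg(g) \in \mathbb{N}$ and $0 \ne b \in R$ the leading coefficient of $g(x)$.
Define the following three coupled sequences:

$$f_0(x) = f(x), \qquad n_0 = \deg(f_0), \qquad a_0 = \operatorname{lc}(f_0),$$

$$f_{i+1}(x) = \begin{cases} b f_i(x) - a_i x^{n_i - m} g(x), & n_i \ge m \\ 0, & n_i < m \end{cases} \qquad n_{i+1} = \deg(f_{i+1}), \qquad a_{i+1} = \operatorname{lc}(f_{i+1}).$$

Then there exists a $k \in \mathbb{N}$ such that $f_k(x) \ne 0$, $f_{k+1}(x) = 0$ and $n_k < m$.
Also:$^2$

$$b^k f(x) = \sum_{l=0}^{k-1} a_l b^{k-l-1} x^{n_l - m} g(x) + f_k(x). \tag{2.3}$$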
Proof. Let $i \in \mathbb{N}$. We then see that $\deg(f_{i+1}) = \deg(b f_i(x) - a_i x^{n_i - m} g(x)) < \deg(f_i(x))$,
since the leading coefficients of $b f_i(x)$ and $a_i x^{n_i - m} g(x)$ are both
$a_i b$. This shows that the degree strictly decreases each step, and since
$f_0(x) \ne 0$, there exists some $k \in \mathbb{N}$ such that $f_k(x) \ne 0$, $\deg(f_k) = n_k < m$
and thus $f_{k+1}(x) = 0$. We may see this $k$ as the terminal step of the
algorithm, since from this point on only zero polynomials will be produced.
We will now prove the following: for any $i \in \mathbb{N}$ such that $i \le k$ we have

$$b^i f(x) = f_i(x) + \sum_{l=0}^{i-1} a_l b^{i-l-1} x^{n_l - m} g(x).$$

For $i = 0$ this is clear, so pick $i \in \mathbb{N}$ with $0 < i \le k$ and assume this holds for $i - 1$. Then:

$$\begin{aligned}
b^i f(x) = b\, b^{i-1} f(x) &= b f_{i-1}(x) + b \sum_{l=0}^{i-2} a_l b^{i-l-2} x^{n_l - m} g(x) \\
&= f_i(x) + a_{i-1} x^{n_{i-1} - m} g(x) + \sum_{l=0}^{i-2} a_l b^{i-l-1} x^{n_l - m} g(x) \\
&= f_i(x) + \sum_{l=0}^{i-1} a_l b^{i-l-1} x^{n_l - m} g(x).
\end{aligned}$$

This proves our claim. We can then set $i = k$ to obtain our final formula
for $f(x)$, which concludes the proof and also proves proposition 2.3.1, since
$\deg(f_k) = n_k < m = \deg(g)$.

$^2$ We understand here, that if $k = 0$, the sum evaluates to 0.
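
The recursion of Algorithm 1 translates almost line by line into code. The following Python sketch works over $R = \mathbb{Z}$ with coefficient lists (lowest degree first); the function names `trim` and `pseudo_divide` are our own, and the loop simply stops as soon as the current $f_i$ has degree below $m$, which includes the case where it has become zero.

```python
def trim(f):
    """Drop trailing zeroes; the empty list is the zero polynomial."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def pseudo_divide(f, g):
    """Return (k, q, r) with b**k * f == q*g + r and deg(r) < deg(g), where b = lc(g)."""
    r, g = trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("g must be non-zero")
    m, b = len(g) - 1, g[-1]
    q, k = [], 0
    while len(r) - 1 >= m:            # r plays the role of f_i; continue while n_i >= m
        n, a = len(r) - 1, r[-1]      # n_i and a_i
        # q <- b*q + a_i x^(n_i - m), which accumulates the sum in (2.3)
        q = [b * c for c in q]
        q += [0] * (n - m + 1 - len(q))
        q[n - m] += a
        # f_{i+1} = b f_i - a_i x^(n_i - m) g
        shifted = [0] * (n - m) + [a * c for c in g]
        r = trim([b * c - s for c, s in zip(r, shifted)])
        k += 1
    return k, trim(q), r


# f = x^3 + 1, g = 2x + 1: three steps give 8 f = (4x^2 - 2x + 1) g + 7
print(pseudo_divide([1, 0, 0, 1], [1, 2]))   # (3, [1, -2, 4], [7])
```

Multiplying by $b$ in every step is what keeps the computation inside $R[x]$; over a field one would instead divide by $b$, as in Corollary 2.3.2.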

2.3.1 Polynomial factors
The Euclidean division algorithm can be used to prove an array of useful
facts. The first of these will concern factors of polynomials. Since we will
almost exclusively be concerned with commutative rings from this point on,
$R$ will denote a commutative ring in the rest of this chapter.
Definition 2.3.1. If $f(x), g(x) \in R[x]$, then $g(x)$ is a factor of $f(x)$
(denoted as $g(x) \mid f(x)$) if and only if there exists an $h(x) \in R[x]$ such that
$f(x) = g(x)h(x)$.
Also, a polynomial $f(x) \in R[x]$ of positive degree will be called reducible
if there exist $g(x), h(x) \in R[x]$ of positive degree such that $f(x) = g(x)h(x)$.
Otherwise, $f(x)$ will be called irreducible.$^3$
The following two results characterize the zeroes of a polynomial. They
will be used a couple of times in the next chapters.
Lemma 2.3.3. If $f(x) \in R[x]$ and $a \in R$, then there exists a unique $q(x) \in R[x]$
such that $f(x) = (x - a)q(x) + f(a)$.
Proof. By the Euclidean division algorithm we may pick $q(x), r(x) \in R[x]$
with $\deg(r) < \deg(x - a) = 1$ and $f(x) = (x - a)q(x) + r(x)$. We then
immediately see that $f(a) = (a - a)q(a) + r(a) = r(a)$, and since $\deg(r) < 1$
we must have $r(x) = f(a)$. Also, since $r(x)$ is fixed in this way, if $q_1(x)$ also
satisfies $f(x) = (x - a)q_1(x) + r(x)$, then $(x - a)(q(x) - q_1(x)) = 0$. Now,
since the leading coefficient of $x - a$ is 1, which is not a zero divisor, we get
$q(x) = q_1(x)$.

Corollary 2.3.4. Let $f(x) \in R[x]$ and $a \in R$. Then $a$ is a zero of $f(x)$ if and
only if $(x - a) \mid f(x)$.
Proof. By the previous lemma there is a $q(x) \in R[x]$ such that
$f(x) = (x - a)q(x) + f(a)$. So, if $f(a) = 0$, then $(x - a) \mid f(x)$. Conversely, if
$(x - a) \mid f(x)$, there exists some $h(x) \in R[x]$ such that $f(x) = (x - a)h(x)$.
But then $f(a) = (a - a)h(a) = 0$.
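
Lemma 2.3.3 and Corollary 2.3.4 can be illustrated with synthetic division (Horner's scheme), which produces $q(x)$ and the remainder $f(a)$ in one pass. The sketch assumes integer coefficients stored lowest degree first; `divide_by_linear` is a hypothetical helper name.

```python
def divide_by_linear(f, a):
    """Return (q, r) with f(x) = (x - a) q(x) + r, where r = f(a)."""
    if not f:
        return [], 0
    acc, out = 0, []
    for c in reversed(f):       # Horner: the partial sums are the coefficients of q
        acc = acc * a + c
        out.append(acc)
    r = out.pop()               # the last partial sum is f(a)
    return list(reversed(out)), r


f = [-6, 11, -6, 1]             # x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)

print(divide_by_linear(f, 2))   # ([3, -4, 1], 0): 2 is a zero, so (x - 2) | f(x)
print(divide_by_linear(f, 4))   # ([3, -2, 1], 6): f(4) = 6, so (x - 4) does not divide f(x)
```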

We can also apply the Euclidean division algorithm to determine a greatest
common factor of two polynomials with coefficients in a field $F$. By a
greatest common factor (or divisor) of a pair of polynomials $(f(x), g(x))$
we mean a polynomial $h(x)$ such that $h(x) \mid f(x)$, $h(x) \mid g(x)$ and if
$d(x) \in F[x]$ is such that $d(x) \mid f(x)$ and $d(x) \mid g(x)$, then $d(x) \mid h(x)$. Degree
considerations quickly show that two greatest common factors differ by a
unit factor in $F$. Now, for any two polynomials $f(x), g(x) \in F[x]$ we then
define $\gcd(f, g) \in F[x]$ to be the unique monic greatest common divisor.

$^3$ Several other definitions are possible. For example, a polynomial may be called
irreducible if it is not a unit, and if it can be written as a product of two polynomials, one
of them must be a unit. However, in polynomial rings over a field, this leads to the same
concept. Hence we adopt this definition.

Lemma 2.3.5. Let $F$ be a field and $f(x), g(x) \in F[x]$ with $g(x) \ne 0$, and
$q(x), r(x) \in F[x]$ such that $\deg(r) < \deg(g)$ and $f(x) = q(x)g(x) + r(x)$.
Then for every $h(x) \in F[x]$, $h(x) \mid f(x)$ and $h(x) \mid g(x)$ if and only if
$h(x) \mid g(x)$ and $h(x) \mid r(x)$.
Proof. Let $h(x) \in F[x]$. If $h(x) \mid f(x)$ and $h(x) \mid g(x)$, then there
exist $\alpha(x), \beta(x) \in F[x]$ such that $f(x) = \alpha(x)h(x)$ and $g(x) = \beta(x)h(x)$.
Then $r(x) = f(x) - q(x)g(x) = \alpha(x)h(x) - q(x)\beta(x)h(x) = (\alpha(x) - q(x)\beta(x))h(x)$,
so that $h(x) \mid r(x)$ and $h(x) \mid g(x)$.
Conversely, let $h(x) \mid g(x)$ and $h(x) \mid r(x)$. Then there exist $\gamma(x), \rho(x) \in F[x]$
so that $g(x) = \gamma(x)h(x)$ and $r(x) = \rho(x)h(x)$, so that
$f(x) = (q(x)\gamma(x) + \rho(x))h(x)$ and thus $h(x) \mid f(x)$ and $h(x) \mid g(x)$.

Algorithm 2: The GCD and Euclidean sequence of two polynomials over a field

Let $f(x), g(x) \in F[x]$ where $F$ is a field and $g(x) \ne 0$, and define the
following sequence:

$$h_0(x) = f(x), \qquad h_1(x) = g(x),$$

$$h_{i+2}(x) = \begin{cases} q_{i+1}(x) h_{i+1}(x) - h_i(x), & \text{if } h_{i+1}(x) \ne 0 \\ 0, & \text{if } h_{i+1}(x) = 0 \end{cases} \qquad \deg(h_{i+2}) < \deg(h_{i+1}), \quad i \in \mathbb{N}.$$

(It is understood here that $-\infty < -\infty$.) Then there exists a $1 \le s \in \mathbb{N}$ such
that $h_s(x) \ne 0$, but $h_{s+1}(x) = 0$. Furthermore, $h_s(x)$ is a greatest common
factor of $f(x)$ and $g(x)$. The finite sequence (terminating at $h_s(x)$) thus
defined is called the Euclidean sequence of $f(x)$ and $g(x)$.

Proof. Let $d(x) \in F[x]$. By repeated use of lemma 2.3.5 we see that $d(x)$ is a
common divisor of $f(x)$ and $g(x)$ if and only if $d(x)$ is
a common divisor of $h_{s-1}(x)$ and $h_s(x)$. Therefore, every common divisor
of $f(x)$ and $g(x)$ will be a factor of $h_s(x)$. Also, since
$0 = h_{s+1}(x) = q_s(x)h_s(x) - h_{s-1}(x)$ and $h_s(x)$ is a factor of itself, $h_s(x)$ is a
common factor of $h_{s-1}(x)$ and $h_s(x)$, hence of $f(x)$ and
$g(x)$, and thus a greatest common factor.
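
Algorithm 2 can be sketched in Python for $F = \mathbb{Q}$ using exact rational arithmetic. Note the sign convention: since $h_i = q_{i+1} h_{i+1} - h_{i+2}$, each new term is the negative of the usual remainder. The helper names (`poly`, `trim`, `remainder`, `euclidean_sequence`, `monic`) are assumptions introduced for this illustration.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zeroes; the empty list is the zero polynomial."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly(*coeffs):
    """Polynomial over Q from its coefficients, lowest degree first."""
    return trim([Fraction(c) for c in coeffs])


def remainder(f, g):
    """Remainder r of f divided by a non-zero, trimmed g: f = q g + r, deg(r) < deg(g)."""
    r = trim(f)
    while len(r) >= len(g):
        c = r[-1] / g[-1]
        shift = len(r) - len(g)
        r = trim([rc - c * gc for rc, gc in zip(r, [Fraction(0)] * shift + g)])
    return r


def euclidean_sequence(f, g):
    """h_0 = f, h_1 = g, h_{i+2} = q_{i+1} h_{i+1} - h_i; returns [h_0, ..., h_s]."""
    seq = [trim(f), trim(g)]
    if not seq[1]:
        raise ValueError("g must be non-zero")
    while seq[-1]:
        seq.append([-c for c in remainder(seq[-2], seq[-1])])
    return seq[:-1]             # drop the terminating zero polynomial h_{s+1}


def monic(f):
    """Scale a non-zero polynomial so that its leading coefficient is 1."""
    return [c / f[-1] for c in f]


# f = (x - 1)(x - 2), g = (x - 1)(x + 1)
seq = euclidean_sequence(poly(2, -3, 1), poly(-1, 0, 1))
print(monic(seq[-1]) == [-1, 1])   # True: gcd(f, g) = x - 1
```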

2.4 Field extensions


In the theory of fields, the main subject of study is a field extension.
Informally, this is a field that contains some other field. This notion is particularly
important for the study of polynomial equations. We will discuss field
extensions that are generated by finitely many elements, which can be seen as
fields that are obtained by adjoining some elements.
Definition 2.4.1. Let $F$ be a field. A field extension over $F$ is then a field
$E$ such that $F$ is a subfield of $E$.
If $S \subseteq F$, then a subfield $K$ of $F$ is said to be generated by $S$ if it is the
smallest subfield containing $S$.
If $E$ is a field extension over $F$, and $S \subseteq E$, we denote the subfield
generated by $S \cup F$ by $F(S)$. If $S = \{u_1, \ldots, u_n\}$ is finite, we denote it by
$F(u_1, \ldots, u_n)$.

2.4.1 Simple Field Extensions


We will first consider the structure of simple field extensions. That is, field
extensions over $F$ of the form $F(u)$. The following proposition will be crucial
in our discussion. In this discussion we take a slightly different route than
in [4].
Proposition 2.4.1. Let $F$ be a field. Then $F[x]$ is a principal ideal domain.
Proof. It is clear that the trivial ideal $\{0\}$ is a principal ideal. Therefore,
let $\{0\} \ne I \subseteq F[x]$ be an ideal of $F[x]$ and $f(x) \in I$. We may also
clearly pick a non-zero $g(x) \in I$ of minimal degree. Then there exist
$q(x), r(x) \in F[x]$ such that $f(x) = q(x)g(x) + r(x)$ with $\deg(r) < \deg(g)$.
But then $r(x) = f(x) - q(x)g(x) \in I$. Since $g(x)$ was of minimal degree, we
must have that $r(x) = 0$. Therefore $f(x) = q(x)g(x)$ for some $q(x) \in F[x]$
and we conclude that $I = (g(x))$ is principal. We have also already seen
that, since $F$ is a domain, $F[x]$ is a domain. This concludes the proof.

Definition 2.4.2. If $R$ is a subring of a commutative ring $S$, and $u \in S$,
we will call $u$ algebraic over $R$ if there exists a $0 \ne f(x) \in R[x]$ such that
$f(u) = 0$. Otherwise, we will call $u$ transcendental over $R$.
A field extension $E$ over $F$ will be called algebraic if and only if every
element of $E$ is algebraic over $F$.
Lemma 2.4.2. If $R$ is a subring of the commutative ring $S$ and $u \in S$ is
transcendental over $R$, then $R[x] \cong R[u]$.
Proof. If $u$ is transcendental over $R$, then the kernel of the evaluation
homomorphism $\rho : R[x] \to S$ in $u$ is $\{0\}$. This shows that $R[x] \cong A = \rho(R[x])$.
Since $A$ contains $R$ and $u$, we obtain $R[u] \subseteq A$. Also, if $y \in A$, then there
exists some $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$ such that $y = f(u) = \sum_{i=0}^{n} a_i u^i \in R[u]$.
Therefore, $A \subseteq R[u]$ and we conclude that $R[x] \cong A = R[u]$.

Lemma 2.4.3. Let $F$ be a field and $f(x) \in F[x]$ of positive degree and
irreducible. Then $F[x]/(f(x))$ is a field. Furthermore, $F[x]/(f(x)) = F(u)$,
where $u = x \bmod (f(x))$ is a zero of $f(x)$, when regarded as a polynomial
with coefficients in $F[x]/(f(x))$.
Proof. Let $J' \subseteq F[x]/I$ be an ideal, where $I = (f(x))$. Then there exists
an ideal $J \subseteq F[x]$ such that $I \subseteq J$ and $J' = J/I$. Since $F[x]$ is a principal
ideal domain, we can find a $g(x) \in F[x]$ such that $J = (g(x))$. Therefore,
there exists an $h(x) \in F[x]$ such that $f(x) = h(x)g(x)$. Now, since $f(x)$ is
irreducible, we have either $h(x) \in F$ or $g(x) \in F$. In the first case, we get
$J = I$, so that $J' = \{0 \bmod I\}$. In the second case we get $J = F[x]$, so
that $J' = F[x]/I$. This shows that $F[x]/I$ has only the trivial ideals and
thus it is a field.
Now set $K = F[x]/(f(x))$ and let $u = x \bmod (f(x)) \in K$. Then clearly
$f(u) = f(x) \bmod (f(x)) = 0 \bmod (f(x))$. We also have $F(u) \subseteq K$. Now
let $a \in K$. Then there exists some $b(x) \in F[x]$ so that $a = b(x) \bmod (f(x))$.
But, there also exist $q(x), r(x) \in F[x]$ with $\deg(r) < \deg(f)$ and
$b(x) = q(x)f(x) + r(x)$. Therefore $a = r(x) \bmod (f(x)) = r(u)$. Now write
$r(x) = \sum_{i=0}^{m} r_i x^i$. Then $a = r(u) = \sum_{i=0}^{m} r_i u^i \in F(u)$. We therefore conclude that
$K = F(u)$.
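
For a concrete instance of Lemma 2.4.3, take $F = \mathbb{Q}$ and $f(x) = x^2 - 2$ (our choice of example, not one from the text). Every element of $K = \mathbb{Q}[x]/(x^2 - 2)$ has a unique representative $a + bu$ with $u^2 = 2$, which the sketch below stores as a pair `(a, b)`. The helper names `mul` and `inverse` are hypothetical.

```python
from fractions import Fraction


def mul(p, q):
    """(a + b u)(c + d u), reduced with the relation u^2 = 2."""
    (a, b), (c, d) = p, q
    return (a * c + 2 * b * d, a * d + b * c)


def inverse(p):
    """Inverse of a non-zero a + b u, namely (a - b u) / (a^2 - 2 b^2)."""
    a, b = p
    norm = a * a - 2 * b * b    # non-zero, because x^2 - 2 has no zero in Q
    return (a / norm, -b / norm)


u = (Fraction(0), Fraction(1))  # u = x mod (x^2 - 2)
p = (Fraction(1), Fraction(1))  # 1 + u

print(mul(u, u) == (2, 0))            # True: u^2 - 2 = 0, so u is a zero of f
print(mul(p, inverse(p)) == (1, 0))   # True: every non-zero element is a unit
```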

Proposition 2.4.4. Let $E$ be a field extension over $F$ and $u \in E$. If $u$ is
transcendental over $F$, then $F(u) \cong F(x)$, the field of fractions of $F[x]$. If
$u$ is algebraic over $F$, there exists some irreducible $g(x) \in F[x]$ such that
$F(u) \cong F[x]/(g(x))$. Moreover, this $g(x)$ is unique up to a unit multiplier.

Proof. If $u$ is transcendental, then $F[x] \cong F[u]$. Now, since $F[u] \subseteq F(u)$,
and the field of fractions of $F[u]$ is the smallest field containing $F[u]$ (and
the fields of fractions of two isomorphic integral domains are isomorphic),
we get $F(u) \cong F(x)$.
Now suppose that $u$ is algebraic over $F$. Then, since $F[x]$ is a principal
ideal domain, there exists some $g(x) \in F[x]$ such that the kernel of
the evaluation homomorphism $\rho : F[x] \to E$ in $u$ is $I = (g(x))$ and thus
$\rho(F[x]) \cong F[x]/I$, by the first isomorphism theorem of rings. We claim that
$F[x]/I$ is a field.
Suppose that there exist $f(x), h(x) \in F[x]$ such that $g(x) = f(x)h(x)$.
Now, if $f(x) \in I$, then there exists a $k(x) \in F[x]$ such that $f(x) = k(x)g(x)$
and hence $g(x) = h(x)k(x)g(x)$. This would imply that $\deg(h) = 0$, and
hence $h(x) \in F$. Similarly, if $h(x) \in I$, then $f(x) \in F$. Now let both
$f(x), h(x) \notin I$. Then $f(u) \ne 0 \ne h(u)$, but $f(u)h(u) = g(u) = 0$, so then
$E$ would contain non-zero zero-divisors. From this we conclude that $g(x)$ is
irreducible, and by lemma 2.4.3 we conclude that $F[x]/I$ is a field.
We now observe that $F \subseteq \rho(F[x])$ (since $I \cap F = \{0\}$) and $u \in \rho(F[x])$,
so that $F(u) \subseteq \rho(F[x])$. But if $y \in \rho(F[x])$, then there exists an
$f(x) = \sum_{i=0}^{n} a_i x^i \in F[x]$ with $y = f(u) = \sum_{i=0}^{n} a_i u^i \in F(u)$. Therefore,
$F(u) = \rho(F[x]) \cong F[x]/(g(x))$.

Now let $\{0\} \ne I \subseteq F[x]$ be an ideal and $f(x), g(x) \in F[x]$ such
that $I = (f(x)) = (g(x))$. Then, there exist $h(x), k(x) \in F[x]$ such that
$f(x) = h(x)g(x)$ and $g(x) = k(x)f(x)$, so that $f(x) = h(x)k(x)f(x)$. Degree
considerations then show that $0 \ne h(x), k(x) \in F$. Therefore, $f(x) = a g(x)$
for some unit $a \in F$.

Definition 2.4.3. If $E$ is a field extension over $F$, and $u \in E$ is algebraic
over $F$, we call the unique monic polynomial $g(x) \in F[x]$ such that
$F(u) \cong F[x]/(g(x))$ the minimum polynomial of $u$ over $F$.
Also, if $E = F(u)$, we call $E$ a simple field extension over $F$ with
generator $u$, and $u$ is called a primitive element of $E$.

2.4.2 Dimensionality of an Extension


If we have a field extension $E$ over $F$, we may regard $E$ as a vector space
over $F$. In this vector space, the addition is the normal addition of the
field, and the scalar multiplication is the normal multiplication in $E$, where
the scalars lie in $F$. In particular, in vector spaces we have the notion of a
dimension. This dimension turns out to be of critical importance.

Definition 2.4.4. If $E$ is a field extension over $F$, the dimensionality (or
degree) of $E$ over $F$ is the dimensionality of $E$ regarded as a vector space
over $F$, which shall be denoted as $[E : F]$.

Proposition 2.4.5. Let $E$ be a field extension over $F$ and $u \in E$. Then $u$ is
algebraic over $F$ if and only if $[F(u) : F] < \infty$. Moreover, if $u$ is algebraic,
then $[F(u) : F]$ is the degree of the minimum polynomial of $u$.

Proof. Let $u$ be algebraic over $F$ and $f(x) \in F[x]$ its minimum polynomial.
Now let $a \in F(u)$ be arbitrary. Then there exists a $g(x) = \sum_{i=0}^{n-1} a_i x^i \in F[x]$
such that $a = g(u) = \sum_{i=0}^{n-1} a_i u^i$, where $n = \deg(f)$ and $a_i \in F$ for $0 \le i \le n - 1$.
Therefore, $(1, u, \ldots, u^{n-1})$ spans the vector space $F(u)$ over $F$. Now let
$b_0, \ldots, b_{n-1} \in F$ such that $\sum_{i=0}^{n-1} b_i u^i = 0$. Then $h(x) = \sum_{i=0}^{n-1} b_i x^i \in (f(x))$.
But since $\deg(h) < \deg(f)$ we get $h(x) = 0$, so that $b_0 = \cdots = b_{n-1} = 0$.
This shows that $(1, u, \ldots, u^{n-1})$ is a base for the vector space $F(u)$ over $F$,
and hence $[F(u) : F] = n < \infty$.
Now let $u$ be transcendental over $F$. Then let $n \in \mathbb{N}$ and $a_0, \ldots, a_n \in F$.
We recall that $F[x] \subseteq F(x) \cong F(u)$. Therefore, if $\sum_{i=0}^{n} a_i u^i = 0$, then the
corresponding polynomial $\sum_{i=0}^{n} a_i x^i$ is zero, which
implies that $a_0 = \cdots = a_n = 0$. This shows that there exists no finite base
for $F[x]$ as a vector space over $F$. Now, since $F[x]$ is a subspace of $F(u)$,
we then certainly have that $[F(u) : F] = \infty$. By negation of this argument,
we get that if $[F(u) : F] < \infty$, then $u$ must be algebraic over $F$.

Proposition 2.4.6 (Dimensionality formula). Let $K$ be a field extension
over $E$, which is in turn a field extension over $F$. Then $K$ is a field extension
over $F$ and $[K : F] < \infty$ if and only if $[K : E], [E : F] < \infty$. If $[K : F] < \infty$,
then:

$$[K : F] = [K : E][E : F] \tag{2.4}$$

Proof. It is trivial that $K$ is a field extension over $F$.
If $[K : F] < \infty$, then $[E : F] < \infty$, since $E$ is a subspace of $K$. Now
let $\{u_1, \ldots, u_n\} \subset K$ be a base for $K$ over $F$. Then for every $a \in K$ there exist
$a_1, \ldots, a_n \in F \subseteq E$ such that $\sum_{i=1}^{n} a_i u_i = a$. Therefore, $\{u_1, \ldots, u_n\}$ spans
$K$ as a vector space over $E$. We conclude that $[K : E] \le n < \infty$.
Now let $[K : E], [E : F] < \infty$ and pick bases $(u_1, \ldots, u_m) \subseteq E$ and
$(v_1, \ldots, v_n) \subseteq K$ for $E$ over $F$ and $K$ over $E$ respectively. Pick any $a \in K$.
Then there exist $a_1, \ldots, a_n \in E$ such that $a = \sum_{i=1}^{n} a_i v_i$. Also, for
$1 \le i \le n$ there exist $b_{i1}, \ldots, b_{im} \in F$ such that $a_i = \sum_{j=1}^{m} b_{ij} u_j$. This gives
us $a = \sum_{i=1}^{n} \sum_{j=1}^{m} b_{ij} u_j v_i$, so that the set $\{u_j v_i \mid 1 \le i \le n \wedge 1 \le j \le m\}$
spans $K$ as a vector space over $F$. Now let $c_{11}, \ldots, c_{nm} \in F$ such that
$\sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} u_j v_i = 0$. Then clearly for $1 \le i \le n$: $d_i = \sum_{j=1}^{m} c_{ij} u_j = 0$.
This implies that for $1 \le i \le n$ and $1 \le j \le m$ we have $c_{ij} = 0$. Therefore
we have obtained a base for $K$ over $F$ and $[K : F] = nm = [K : E][E : F] < \infty$.
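
As a worked instance of the dimensionality formula, consider the tower $\mathbb{Q} \subseteq \mathbb{Q}(\sqrt{2}) \subseteq \mathbb{Q}(\sqrt{2}, \sqrt{3})$. This example is ours and is not taken from [4]; the base of $K$ over $F$ is formed from the products $u_j v_i$ exactly as in the proof above.

```latex
\begin{align*}
  F &= \mathbb{Q}, \qquad E = \mathbb{Q}(\sqrt{2}), \qquad K = \mathbb{Q}(\sqrt{2}, \sqrt{3}), \\
  [E : F] &= 2, \quad \text{base } (1, \sqrt{2}), \quad \text{minimum polynomial } x^2 - 2, \\
  [K : E] &= 2, \quad \text{base } (1, \sqrt{3}), \quad \text{since } x^2 - 3 \text{ has no zero in } E, \\
  [K : F] &= [K : E][E : F] = 4, \quad \text{base } (1, \sqrt{2}, \sqrt{3}, \sqrt{6}).
\end{align*}
```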

The following corollary is immediate and will be used later on.

Corollary 2.4.7. Let $K$ be a field extension of $F$ with $[K : F] < \infty$. Then
for any field extension $E \subseteq K$ of $F$, $[E : F] \mid [K : F]$. Also, if $[K : F]$ is
prime, then the only subfields of $K$ that contain $F$ are $K$ and $F$ themselves.

2.4.3 Splitting Fields


We have seen before that if we have a field extension $E$ over $F$ and some
monic polynomial $f(x) \in F[x]$, then $u \in E$ is a zero of $f(x)$ if and only
if $(x - u) \mid f(x)$, where we regard $f(x)$ as a polynomial in $E[x]$ (by the
induced inclusion homomorphism). It would of course be great if we could
find a field extension $E$ over $F$ where we could write $f(x) = \prod_{i=1}^{n} (x - r_i)$
for $r_1, \ldots, r_n \in E$. Also, taking into account the results we obtained for the
dimensionalities of field extensions, we would like that $E = F(r_1, \ldots, r_n)$,
since then $[E : F] = \prod_{i=1}^{n} [F(r_1, \ldots, r_i) : F(r_1, \ldots, r_{i-1})]$ (where the first
term in the product is understood to be $[F(r_1) : F]$). Also, if $r \in E$ is then
a zero of $f(x)$, then $0 = f(r) = \prod_{i=1}^{n} (r - r_i)$, so that $r = r_i$ for some
$1 \le i \le n$. We will call such a field extension a splitting field of $f(x)$ over
$F$.

Definition 2.4.5. Let $F$ be a field and $f(x) \in F[x]$ monic. Then a splitting
field of $f(x)$ over $F$ is a field extension $E$ over $F$, such that in $E[x]$ we can
write $f(x) = \prod_{i=1}^{n} (x - r_i)$ for $r_1, \ldots, r_n \in E$ and $E = F(r_1, \ldots, r_n)$.

Proposition 2.4.8. If $F$ is a field and $f(x) \in F[x]$ is monic, then there
exists a splitting field $E$ of $f(x)$ over $F$.

Proof. Let $f(x) = \prod_{i=1}^{k} f_i(x)$ where for $1 \le i \le k$, $f_i(x) \in F[x]$ is monic
and irreducible. Then $k \le n = \deg(f)$. If $n = k$, $F$ itself is a splitting
field of $f(x)$. Now let $n - k > 0$. Then for some $j \in \{1, \ldots, k\}$ we have
$\deg(f_j) > 1$. Set $K = F[x]/(f_j(x))$, which is a field that contains $F$, and
$r = x \bmod (f_j(x))$, so that $K = F(r)$ and $f_j(r) = 0$. Then in $K[x]$
we have $f(x) = \prod_{i=1}^{l} g_i(x)$, where for $1 \le i \le l$, $g_i(x) \in K[x]$ are the
irreducible factors of $f(x)$ in $K[x]$. Since these factors can be obtained
by taking the irreducible factors of the $f_i(x)$ and $(x - r) \in K[x]$ is an
irreducible factor of $f_j(x)$, we obtain $n \ge l > k$, so that $n - l < n - k$.
By induction we then obtain an extension field $E = K(r_1, \ldots, r_n)$ such that
$f(x) = \prod_{i=1}^{n} (x - r_i)$, where $r = r_i$ for some $1 \le i \le n$. This shows that
$E = F(r)(r_1, \ldots, r_n) = F(r_1, \ldots, r_n)$ is a splitting field of $f(x)$ over $F$.

We state the following proposition without proof, which can be found in [4].

Proposition 2.4.9. Let $F$ be a field, $f(x) \in F[x]$ monic and of positive
degree, and $E$ and $E'$ splitting fields of $f(x)$ over $F$. Then $E \cong E'$.

We may now quickly see that the splitting of a monic polynomial in
factors of degree one is unique. Therefore, the zeroes are unique and the
following definition is consistent over every possible splitting field.

Definition 2.4.6. Let $F$ be a field, $f(x) \in F[x]$ monic and of positive
degree, and $E$ a splitting field of $f(x)$ over $F$. Write $f(x) = \prod_{i=1}^{m} (x - r_i)^{k_i}$,
where $r_i \ne r_j$ if $i \ne j$. We then call $k_i$ the multiplicity of $r_i$. Also, a zero
$r_i$ is called a simple zero if and only if it has multiplicity 1. Otherwise, it is
called a multiple zero.

We lastly make the connection between the derivative of a polynomial
and the character of its zeroes. Informally, we will see that a zero (in a
splitting field) has multiplicity greater than 1 if and only if the polynomial
and its derivative have a common factor of positive degree. For this we
define the following map on a polynomial ring of a field.

Definition 2.4.7. Let $F$ be a field. We then define the standard derivation
in $F[x]$ as the unique function $F[x] \to F[x] : f(x) \mapsto f'(x)$ so that for any
$f(x), g(x) \in F[x]$:

1. $(f + g)'(x) = f'(x) + g'(x)$

2. $(fg)'(x) = f'(x)g(x) + f(x)g'(x)$

3. $x' = 1$.
As in real analysis we may quickly derive all the familiar algebraic
properties of polynomial derivatives. We can now state the following proposition,
the proof of which can be found in [4, Sec. 4.4].

Proposition 2.4.10. Let $F$ be a field, $f(x) \in F[x]$ monic and of positive
degree, and $E$ any splitting field of $f(x)$ over $F$. Then all zeroes of $f(x)$ in
$E$ are simple if and only if $\gcd(f, f') = 1$.
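
Proposition 2.4.10 gives a test for multiple zeroes that never leaves $F[x]$. The sketch below runs it for $F = \mathbb{Q}$ with exact rational arithmetic; the helper names (`trim`, `derivative`, `remainder`, `gcd`, `has_only_simple_zeroes`) and the two sample polynomials are assumptions made for illustration.

```python
from fractions import Fraction


def trim(f):
    """Drop trailing zeroes; the empty list is the zero polynomial."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def derivative(f):
    """Formal derivative: (sum a_i x^i)' = sum i a_i x^(i-1)."""
    return trim([i * c for i, c in enumerate(f)][1:])


def remainder(f, g):
    """Remainder of f divided by a non-zero, trimmed g over Q."""
    r = trim(f)
    while len(r) >= len(g):
        c = r[-1] / g[-1]
        shift = len(r) - len(g)
        r = trim([rc - c * gc for rc, gc in zip(r, [Fraction(0)] * shift + g)])
    return r


def gcd(f, g):
    """Monic greatest common divisor of f and g (not both zero) over Q."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, remainder(f, g)
    return [c / f[-1] for c in f]


def has_only_simple_zeroes(f):
    """True if and only if gcd(f, f') = 1."""
    return gcd(f, derivative(f)) == [1]


f1 = [Fraction(c) for c in (2, -3, 0, 1)]   # x^3 - 3x + 2 = (x - 1)^2 (x + 2)
f2 = [Fraction(c) for c in (-2, 0, 1)]      # x^2 - 2

print(has_only_simple_zeroes(f1))   # False: gcd(f1, f1') = x - 1
print(has_only_simple_zeroes(f2))   # True
```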

2.4.4 Galois Theory
Galois theory is one of the pearls of modern mathematics. It allows one
to study solutions of algebraic equations in a purely algebraic way. At the
heart of the theory is the connection between the solutions of such equations
and group theory. We will state the fundamental results without proof for
later use. An extensive treatment of this subject may (again) be found in
[4, Ch. 4]. Throughout this subsection, F denotes a field.

Definition 2.4.8. $f(x) \in F[x]$ is called separable if and only if its irreducible factors have distinct zeroes in any splitting field.
An algebraic field extension $E$ over $F$ is called separable over $F$ if and only if the minimum polynomial over $F$ of every element of $E$ is separable. Also, $E$ is called normal over $F$ if and only if every irreducible polynomial in $F[x]$ that has a zero in $E$ splits into factors of degree 1 in $E[x]$.

Lemma 2.4.11. Any algebraic field extension $E$ over $F$ of characteristic 0 is separable.

Definition 2.4.9 (The Galois group). Let $E$ be a field extension over $F$. The Galois group of $E$ over $F$ is then the group $\mathrm{Gal}(E/F)$ of automorphisms of $E$ that reduce to the identity when restricted to $F$.
Also, if $G$ is any subgroup of the group of automorphisms of $E$, then $\mathrm{Inv}\, G \subseteq E$ is the subfield of elements that are invariant under all automorphisms in $G$.⁴

Definition 2.4.10. A field extension $E$ over $F$ is called a Galois field extension if and only if $E$ is a splitting field of $f(x)$ over $F$ for some separable $f(x) \in F[x]$.

Lemma 2.4.12. If $E$ is a splitting field of $f(x)$ over $F$ for some monic separable $f(x) \in F[x]$, then $|\mathrm{Gal}(E/F)| = [E : F]$.

Proposition 2.4.13. Let $E$ be a field extension over $F$. Then the following statements are equivalent:

• $E$ is a Galois field extension over $F$.

• $F = \mathrm{Inv}\, G$ for some finite subgroup $G$ of $\mathrm{Aut}\, E$.

• $[E : F] < \infty$, and $E$ is normal and separable over $F$.
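Theorem 2.4.14 (Fundamental Theorem of Galois Theory). Let $E$ be a Galois field extension over $F$, write $G = \mathrm{Gal}(E/F)$, and define:

• $\Gamma$ is the set of subgroups of $\mathrm{Gal}(E/F)$.
• $\Sigma$ is the set of subfields $K \subseteq E$ such that $F \subseteq K$.
• $\gamma : \Gamma \to \Sigma : H \mapsto \mathrm{Inv}(H)$.
• $\sigma : \Sigma \to \Gamma : K \mapsto \mathrm{Gal}(E/K)$.

Then $\gamma$ and $\sigma$ are inverse bijections, and we have the following properties:

1. $\forall H_1, H_2 \in \Gamma : H_1 \subseteq H_2 \iff \mathrm{Inv}\, H_1 \supseteq \mathrm{Inv}\, H_2$,
2. $\forall H \in \Gamma : |H| = [E : \mathrm{Inv}\, H] \wedge [G : H] = [\mathrm{Inv}\, H : F]$,
3. $\forall H \in \Gamma$: $H$ is a normal subgroup of $\mathrm{Gal}(E/F)$ if and only if $\mathrm{Inv}\, H$ is a normal field extension over $F$. In this case $\mathrm{Gal}((\mathrm{Inv}\, H)/F) \cong G/H$.

⁴ It follows that $G = \mathrm{Gal}(E/F)$ is the subgroup of $\mathrm{Aut}\, E$ such that $\mathrm{Inv}\, G = F$.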

Chapter 3

Real Closed Fields

In this chapter we will develop the framework in which we prove Sturm’s Theorem. We will begin with a discussion of ordered fields, showing that a field can be (compatibly) ordered if and only if the field is formally real (meaning that zero is not a sum of non-zero squares). We will then discuss real closed fields and some of their key properties, going on to investigate several equivalent characterizations of real closed fields. This serves to illustrate the importance of real closed fields in applications. Again we will follow [4] in our discourse.

3.1 Ordered and Formally Real Fields


Definition 3.1.1. An ordered field is a pair $(F, P)$ where $F$ is a field, and $P \subset F$ such that
1. $0 \notin P$,
2. $\forall a \in F : a = 0 \vee a \in P \vee -a \in P$,
3. $\forall a, b \in P : a + b \in P \wedge ab \in P$.
We call the elements of $P$ the positive elements of $F$.
We also say that a field $F$ can be ordered if and only if a $P \subset F$ exists so that $(F, P)$ is an ordered field.
Lemma 3.1.1. If $(F, P)$ is an ordered field, define the set of negative elements $N = \{ x \in F \mid \exists p \in P : x = -p \}$. Then $P$, $N$ and $\{0\}$ are pairwise disjoint and $F = P \cup \{0\} \cup N$.
Proof. We first note that $0 \notin P$ by property 1. This implies that $0 = -0 \notin N$. Therefore $P \cap \{0\} = \emptyset = N \cap \{0\}$. Now suppose that $P \cap N \neq \emptyset$ and let $a \in P \cap N$. Then $-a \in P$, so by property 3: $0 = a + (-a) \in P$. This contradiction with property 1 shows that $P \cap N = \emptyset$.
Now let $a \in F \setminus \{0\}$. Then by property 2, $a \in P$ or $a \in N$, so $a \in P \cup N$ and $F = P \cup \{0\} \cup N$.

Definition 3.1.1 of an ordered field is not very intuitive at first glance, but it becomes more transparent when we recall that $P$ is the set of positive elements and consider the following:
Proposition 3.1.2. Any ordered field $(F, P)$ induces a strict total order $>$ by:
$$\forall a, b \in F : a > b \iff a - b \in P, \tag{3.1}$$
with the following properties:
1. $\forall a \in F : [a > 0 \iff \forall b \in F : a + b > b]$,
2. $\forall a, b \in F : a > 0 \wedge b > 0 \implies ab > 0$.
Conversely, if $>$ is a strict total order with the properties above, then $P = \{ x \in F \mid x > 0 \}$ defines an ordered field $(F, P)$.
Proof. Let $(F, P)$ be an ordered field and define $>$ as above. We shall first show that $>$ is a strict total order. Let $a, b, c \in F$ such that $a > b$ and $b > c$. Then $a - b \in P$ and $b - c \in P$. We then have $a - c = (a - b) + (b - c) \in P$, so $a > c$, which shows that $>$ is transitive. Then, if we take $a, b \in F$, we see by lemma 3.1.1 (applied to $a - b$) that exactly one of $a - b = 0$, $a - b \in P$, or $b - a \in P$ holds. Therefore, exactly one of $a = b$, $a > b$ or $b > a$ holds, so $>$ is trichotomous and a strict total order.
To prove property 1, let $a \in F$. If $a > 0$, then $\forall b \in F : (a + b) - b = a \in P$, so $\forall b \in F : a + b > b$. Conversely, if $\forall b \in F : a + b > b$, then this is in particular the case for $b = 0$: $a > 0$.
To prove property 2, let $a, b \in F$ with $a > 0$ and $b > 0$. Then $a, b \in P$ and thus $ab \in P$, which shows that $ab > 0$.
Let us now suppose that we are given a strict total order $>$ on $F$ satisfying properties 1 and 2. Define $P = \{ x \in F \mid x > 0 \}$. Then clearly $0 \notin P$, since otherwise $0 > 0$, which contradicts the irreflexivity of $>$; this is the first property of an ordered field. Also, by the trichotomy of $>$, we have for any $a \in F$: $a > 0$, $a = 0$, or $0 > a$. This means: $a = a - 0 \in P$, $a = 0$, or $-a = 0 - a \in P$, so $P$ also satisfies the second property. Now let $a, b \in P$. Then by property 1 and the transitivity of $>$: $a > 0 \implies a + b > b > 0 \implies a + b \in P$. Also, by property 2: $a > 0 \wedge b > 0 \implies ab > 0 \implies ab \in P$, which finally shows that $(F, P)$ is an ordered field.
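To make the role of the set $P$ concrete, the following sketch checks the three properties of Definition 3.1.1 on a finite sample of elements of $\mathbb{Q}(\sqrt{2})$, for two different candidate sets of positive elements. The field $\mathbb{Q}(\sqrt{2})$, the pair representation and all names are assumptions made for illustration; a finite sample can of course only support, not prove, that the properties hold.

```python
from fractions import Fraction
from itertools import product

# An element a + b*sqrt(2) of Q(sqrt(2)) is stored as the pair (a, b).

def sign_standard(x):
    """Sign of a + b*sqrt(2) as a real number, computed exactly."""
    a, b = x
    if a >= 0 and b >= 0:
        return 0 if a == 0 and b == 0 else 1
    if a <= 0 and b <= 0:
        return -1
    # a and b have opposite signs: compare a^2 with 2*b^2.
    # Equality cannot occur, because sqrt(2) is irrational.
    if a > 0:
        return 1 if a * a > 2 * b * b else -1
    return 1 if 2 * b * b > a * a else -1

def in_P1(x):
    """Positive elements for the ordering in which sqrt(2) is positive."""
    return sign_standard(x) > 0

def in_P2(x):
    """Positive elements obtained via the conjugation sqrt(2) -> -sqrt(2)."""
    a, b = x
    return sign_standard((a, -b)) > 0

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def neg(x):
    return (-x[0], -x[1])

def satisfies_axioms(in_P, samples):
    """Check properties 1-3 of Definition 3.1.1 on a finite sample."""
    zero = (Fraction(0), Fraction(0))
    if in_P(zero):
        return False
    for x in samples:
        if x != zero and in_P(x) == in_P(neg(x)):
            return False  # exactly one of x, -x must be positive
    positives = [x for x in samples if in_P(x)]
    return all(in_P(add(x, y)) and in_P(mul(x, y))
               for x, y in product(positives, repeat=2))

values = [Fraction(n, 2) for n in range(-4, 5)]
samples = list(product(values, values))
```

Both candidate sets pass the check on the sample, yet $\sqrt{2}$, stored as `(0, 1)`, is positive for the first and negative for the second. This illustrates that an ordering is extra structure on a field and need not be unique.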

Note: Notation of ordered fields

From now on, if we speak of an ordered field $(F, P)$ and use the symbol $>$, this will denote the induced strict total order. Also, the symbol $\geq$ will denote the total order induced by $>$ (defined by $a \geq b \iff a > b \vee a = b$). Similarly, for $a \in F$ we write $|a| = a$ if $a = 0$ or $a > 0$, and $|a| = -a$ if $a < 0$. If the set $P$ is not used, we may also just write: “the ordered field $F$”.

Lemma 3.1.3. If $(F, P)$ is an ordered field, then for any $a \in F^*$: $a^2 \in P$. In particular we see that $1 = 1^2 \in P$.

Proof. Let $a \in F^*$. Then either $a \in P$ or $-a \in P$, so that $a^2 = (-a)^2 \in P$, since $P$ is closed under multiplication.
We state the following lemma without proof, as it is simply proven by considering the various possibilities of the signs of $a$ and $b$.
Lemma 3.1.4 (Triangle inequality). If $(F, P)$ is an ordered field, then for any $a, b \in F$:
$$|a + b| \leq |a| + |b|. \tag{3.2}$$
We will now go on to prove a nice characterization of ordered fields in terms of sums of squares. The following definition is due to Artin and Schreier [1]¹.
Definition 3.1.2. A formally real field is a field $F$ that satisfies the following property:
$$\forall n \in \mathbb{N}\ \forall a_1, \ldots, a_n \in F : \left[ \sum_{i=1}^{n} a_i^2 = 0 \implies \forall i \in \{1, \ldots, n\} : a_i = 0 \right],$$
i.e. the zero of the field is not a sum of non-zero squares, or the vanishing of a sum of squares implies the vanishing of all the individual squares.
The following lemma illustrates a different characterization².

Lemma 3.1.5. A field $F$ is formally real if and only if $\nexists a_1, \ldots, a_n \in F$ such that $\sum_{i=1}^{n} a_i^2 = -1$.
Proof. Let $F$ be formally real. Now suppose that there exist $a_1, \ldots, a_n \in F$ such that $\sum_{i=1}^{n} a_i^2 = -1$. Then $\sum_{i=1}^{n} a_i^2 + 1^2 = -1 + 1^2 = 0$, which is forbidden, so no such $\{a_i\}$ exist.
Conversely, let there exist no $a_1, \ldots, a_n \in F$ such that $\sum_{i=1}^{n} a_i^2 = -1$ and take $b_0, \ldots, b_m \in F$ such that $\sum_{i=0}^{m} b_i^2 = 0$, and suppose that $b_0 \neq 0$ (i.e. one of the $b_i$ is non-zero). Then $b_0^2 \neq 0$ and $\sum_{i=1}^{m} b_i^2 = -b_0^2 \implies \sum_{i=1}^{m} \left( \frac{b_i}{b_0} \right)^2 = -1$, which gives us a contradiction. Therefore, $F$ is formally real.
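Lemma 3.1.5 gives a quick way to see that a field is not formally real: it suffices to exhibit squares summing to $-1$. As a hedged illustration (the example and the function name are not taken from the sources followed here), the sketch below searches the prime field $\mathbb{F}_p$ of integers modulo a prime $p$ for such a representation with two squares.

```python
def minus_one_as_sum_of_two_squares(p):
    """Return (a, b) with a*a + b*b == -1 modulo the prime p, if found."""
    for a in range(p):
        for b in range(a, p):
            if (a * a + b * b + 1) % p == 0:
                return a, b
    return None
```

For $p = 5$ the search returns `(0, 2)`, reflecting $0^2 + 2^2 = 4 \equiv -1 \pmod 5$, so $\mathbb{F}_5$ is not formally real.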
Lemma 3.1.6. Any ordered field $(F, P)$ is formally real.
Proof. Let $a \in F \setminus \{0\}$. Then either $a > 0$ or $-a > 0$ and thus $a^2 = (-a)^2 > 0$. We will now show by induction that any sum of non-zero squares is strictly greater than zero.
The induction basis was the first step of the proof. Therefore, let $k \in \mathbb{N}$, $k > 0$, $a_1, \ldots, a_{k+1} \in F \setminus \{0\}$ and $\sum_{i=1}^{k} a_i^2 > 0$. Then $a_{k+1}^2 > 0$ and thus $\sum_{i=1}^{k+1} a_i^2 > \sum_{i=1}^{k} a_i^2 > 0$.
Therefore, if $a_1, \ldots, a_n \in F$ and $\sum_{i=1}^{n} a_i^2 = 0$, we must have that $a_1 = \cdots = a_n = 0$ and thus $F$ is formally real.
¹ Artin and Schreier chose this as one of the key properties of the real number system, in an effort to characterize the real numbers in a purely algebraic way.
² This was actually the original definition of Artin and Schreier.
The converse of the foregoing lemma is a theorem that was proved by Artin and Schreier [1, Satz 1], and gives the definitive answer on the connection between sums of squares in a field and orderings. We follow a proof of Jean-Pierre Serre [6]³. We first prove the following lemma.
Lemma 3.1.7. If $P_0$ is a subgroup of the multiplicative group $F^*$ of a field $F$, such that $P_0$ is closed under addition and contains all non-zero squares, and if $a \in F^*$ is such that $-a \notin P_0$, then
$$P_1 = P_0 + P_0 a = \{ p \in F \mid \exists x, y \in P_0 : p = x + ya \}$$
is a subgroup of $F^*$ that is closed under addition.

Proof. Let $p_1 = x_1 + y_1 a$, $p_2 = x_2 + y_2 a \in P_1$, where $x_1, y_1, x_2, y_2 \in P_0$. Then $p_1 + p_2 = (x_1 + x_2) + (y_1 + y_2)a \in P_1$, since $P_0$ is closed under addition. Also, $p_1 p_2 = (x_1 + y_1 a)(x_2 + y_2 a) = (x_1 x_2 + y_1 y_2 a^2) + (x_1 y_2 + x_2 y_1)a \in P_1$, since $a^2 \in P_0$ and $P_0$ is closed under addition and multiplication.
If $0 \in P_1$, then $\exists x, y \in P_0 : x + ya = 0$, so $-a = xy^{-1} \in P_0$, which leads to a contradiction. This shows that $0 \notin P_1$ and thus $P_1 \subseteq F^*$.
Lastly, let $p = x + ya \in P_1$, $x, y \in P_0$. Then $p^{-1} = (x + ya)^{-1} = (x + ya)(x + ya)^{-2} = [x((x + ya)^{-1})^2] + [y((x + ya)^{-1})^2]a \in P_1$, since $x + ya \neq 0$ and $P_0$ contains all non-zero squares. Therefore, $P_1 \subseteq F^*$ is a subgroup of the multiplicative group of $F$ that is closed under addition.

Proposition 3.1.8 (Serre). If $L$ is an extension field of an ordered field $(K, P)$, then $L$ can be ordered as $(L, P_L)$ with $P \subseteq P_L$ (i.e. the ordering on $L$ extends that on $K$) if and only if for all $p_1, \ldots, p_n \in P \subseteq K$ and $x_1, \ldots, x_n \in L$: $\sum_{i=1}^{n} p_i x_i^2 = 0 \implies x_1 = \cdots = x_n = 0$.

Proof. Let $(L, P_L)$ be an ordered extension field of an ordered field $(K, P)$ with $P \subseteq P_L$. Now take $p_1, \ldots, p_n \in P$ and $x_1, \ldots, x_n \in L$. We first see that for every $x_i$ either $x_i = 0$, in which case $x_i^2 = 0$, or $x_i > 0$ or $-x_i > 0$ so that $x_i^2 = (-x_i)^2 > 0$. Therefore, since each $p_i > 0$ and the positive elements are closed under addition and multiplication, if one of the $x_i \neq 0$: $\sum_{i=1}^{n} p_i x_i^2 > 0$. So, $\sum_{i=1}^{n} p_i x_i^2 = 0$ implies that all $x_i = 0$.
Conversely, suppose that the condition on sums holds. Define $T$ as the set of subgroups of $L^*$ that are closed under addition and contain all elements of the form $px^2$ where $p \in P$ and $x \in L^*$.
Clearly the set $P_0 = \{ \sum_{i=1}^{n} p_i x_i^2 \mid p_i \in P \wedge x_i \in L^* \}$ is closed under addition. Now let $x = \sum_{i=1}^{n} p_i x_i^2$, $y = \sum_{j=1}^{m} q_j y_j^2 \in P_0$, where all $p_i, q_j \in P$ and $x_i, y_j \in L^*$. Then $xy = \sum_{i=1}^{n} \sum_{j=1}^{m} p_i q_j (x_i y_j)^2 \in P_0$, so $P_0$ is closed under multiplication. Also, if $x \in P_0$, then $x^{-1} = x \cdot x^{-2} \in P_0$, because $x \neq 0$ (by the hypothesis) and thus $x^{-2} = (x^{-1})^2 \in P_0$. This shows that $P_0 \in T$, and thus $T$ is non-empty.
³ It is entertaining to note that although this argument was thought up by Serre, it was presented at a seminar by Élie Cartan.
By Zorn’s Lemma we may now pick a maximal element $P_L \in T$. We claim that this $P_L$ makes $L$ an ordered field that extends $K$. To see this, let $a \in L^*$. If both $a$ and $-a \in P_L$, then $0 = a + (-a) \in P_L$, which is a contradiction, so $a$ and $-a$ cannot be simultaneously in $P_L$. Now, if $-a \notin P_L$, define $P' = \{ x + ya \mid x, y \in P_L \}$. Since $P_L$ certainly contains all non-zero squares ($1 \in P$), we can conclude by lemma 3.1.7 that $P'$ also is a subgroup of $L^*$ that is closed under addition. Furthermore, take $p \in P$ and $x \in L^*$. Then $px^2 = px^2(1 + a)(1 + a)^{-1} = (px^2 + px^2 a)(1 + a)^{-1} \in P'$, so $P' \in T$. Also, if $x \in P_L$, then $x = x(1 + a)(1 + a)^{-1} = (x + xa)(1 + a)^{-1} \in P'$, so $P_L \subseteq P'$. Because we took $P_L$ to be maximal in $T$ we can now conclude that $P_L = P'$. Lastly, $a = a(1 + a)(1 + a)^{-1} = (a^2 + a)(1 + a)^{-1} \in P' = P_L$, since $P_L$ contains all non-zero squares. We can now conclude that exactly one of $a \in P_L$ and $-a \in P_L$ holds.
The above shows that $(L, P_L)$ is an ordered field. Now, if $p \in P \subseteq K$, then $p = p \cdot 1^2 \in P_L$, so $P \subseteq P_L$ and the order extends the order on $K$.

Now we are ready to prove

Theorem 3.1.9. A field $F$ can be ordered if and only if it is formally real.

Proof. We already saw that if a field $F$ can be ordered, then it is formally real. Conversely, let $F$ be a formally real field. Then its characteristic is 0 (for if it has characteristic $n \geq 1$, then $\sum_{i=1}^{n} 1^2 = 0$, which is not the case), and thus it contains $\mathbb{Q}$ as a subfield.
Let $0 < \frac{p_1}{q_1}, \ldots, \frac{p_n}{q_n} \in \mathbb{Q}$, where all $p_i \in \mathbb{Z}$ and $q_i \in \mathbb{Z} \setminus \{0\}$ (which we may take to be positive), and $x_1, \ldots, x_n \in F$ with $\sum_{i=1}^{n} \frac{p_i}{q_i} x_i^2 = 0$. Let us multiply with $q_1 \cdots q_n$:
$$0 = q_1 \cdots q_n \sum_{i=1}^{n} \frac{p_i}{q_i} x_i^2 = \sum_{i=1}^{n} \Big( \prod_{j=1,\, j \neq i}^{n} q_j \Big) p_i x_i^2.$$
This is now a sum of positive integer multiples of squares, and thus simply a sum of squares. Since $F$ is formally real, we can conclude that all $x_i = 0$. By proposition 3.1.8 we can therefore conclude that there exists an order on $F$ that extends the standard order on $\mathbb{Q}$.
As our last result on formally real/ordered fields we will give the following lemma, which provides us with bounds on the zeroes of a monic polynomial.
Lemma 3.1.10. Let $F$ be an ordered field, $f(x) = x^n + \sum_{i=0}^{n-1} a_i x^i \in F[x]$ monic and of positive degree, and $c \in F$. Define $M = \max(1, \sum_{i=0}^{n-1} |a_i|)$. Then $|c| > M$ implies that $|f(c)| > 0$. Conversely, if $f(c) = 0$, then $-M \leq c \leq M$.

Proof. Let $c \in F$ with $|c| > M$. We first note that $c \neq 0$, so that $1 = c^{-n} f(c) - \sum_{i=0}^{n-1} a_i c^{i-n}$. Also, $|c^{-n}| < 1$, and for $i \in \{0, \ldots, n-1\}$ we have $|c^{i-n}| \leq |c^{-1}|$. From this, and the triangle inequality, it follows that:
$$\begin{aligned}
1 &= \Big| c^{-n} f(c) - \sum_{i=0}^{n-1} a_i c^{i-n} \Big| \\
&\leq |c^{-n}|\,|f(c)| + \sum_{i=0}^{n-1} |a_i|\,|c^{i-n}| \\
&\leq |f(c)| + |c^{-1}| \sum_{i=0}^{n-1} |a_i| \\
&< |f(c)| + M^{-1} M = |f(c)| + 1,
\end{aligned}$$
from which we can conclude that $|f(c)| > 0$.
Taking the contrapositive, $f(c) = 0$ implies that $|c| \leq M$, i.e. $-M \leq c \leq M$.
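The bound of Lemma 3.1.10 is straightforward to compute. The sketch below does so over $\mathbb{Q}$; the convention of passing only the lower coefficients $a_0, \ldots, a_{n-1}$ of the monic polynomial, and the function names, are assumptions made for this illustration.

```python
from fractions import Fraction

def zero_bound(lower_coeffs):
    """M = max(1, sum |a_i|) for f = x^n + a_{n-1} x^{n-1} + ... + a_0.

    `lower_coeffs` is [a_0, ..., a_{n-1}]; the leading coefficient 1
    is implicit.
    """
    total = sum((abs(Fraction(a)) for a in lower_coeffs), Fraction(0))
    return max(Fraction(1), total)

def evaluate_monic(lower_coeffs, c):
    """Evaluate the monic polynomial at c by Horner's rule."""
    value = Fraction(1)
    for a in reversed(lower_coeffs):
        value = value * c + a
    return value
```

For $f(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)$ this gives $M = 23$, and the zeroes 1, 2 and 3 indeed lie in $[-23, 23]$.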

3.2 Real Closed Fields


Artin and Schreier defined a refinement of formally real fields in an attempt to capture the characteristic algebraic properties of the real numbers. There are several useful examples of real closed fields, which include the real numbers, the real numbers that are algebraic over $\mathbb{Q}$, the hyperreal numbers and the computable numbers. Let us state the definition.
Definition 3.2.1. A field $F$ is called real closed if and only if $F$ is formally real and no proper algebraic extension field of $F$ is formally real.
This definition and the foregoing discussion of formally real fields show that a real closed field $F$ is closed in the sense that it can be ordered, but no proper algebraic extension of it can be ordered. We will go on to find some more useful characterizations. We first observe the following very useful facts, where we follow the proof in [1].
Lemma 3.2.1. If $F$ is a real closed field, then:
• Every sum of squares in $F$ can also be written as a single square.
• $\forall x \in F\ \exists y \in F : x = y^2 \vee x = -y^2$.
• Every polynomial $f(x) \in F[x]$ of odd degree has a zero in $F$.
Proof. Let $\gamma \in F$ not be a square. Then the polynomial $x^2 - \gamma \in F[x]$ is irreducible, so $F(\sqrt{\gamma}) = F[x]/(x^2 - \gamma)$ is a proper algebraic field extension of $F$, hence it is not formally real. This shows that there exist $\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_n \in F$ such that
$$\gamma \sum_{\nu=1}^{n} \alpha_\nu^2 + \sum_{\nu=1}^{n} \beta_\nu^2 + 2\sqrt{\gamma} \sum_{\nu=1}^{n} \alpha_\nu \beta_\nu = \sum_{\nu=1}^{n} (\alpha_\nu \sqrt{\gamma} + \beta_\nu)^2 = -1.$$
If $\sum_{\nu=1}^{n} \alpha_\nu \beta_\nu \neq 0$, then $\sqrt{\gamma} \in F$, which leads to a contradiction, so that this sum vanishes. Also, if $\sum_{\nu=1}^{n} \alpha_\nu^2 = 0$, then $-1$ would be a sum of squares in $F$, which is also a contradiction, so that sum does not vanish. We can then conclude that $\gamma$ is not a sum of squares in $F$, since otherwise $-1$ would be a sum of squares in $F$. Taking the contrapositive of this statement gives the first property.
By the first property we may now pick $\alpha, \beta \in F$ such that:
$$\alpha^2 = \sum_{\nu=1}^{n} \alpha_\nu^2, \qquad \beta^2 = 1 + \sum_{\nu=1}^{n} \beta_\nu^2$$
(observe that $1 = 1^2$) and thus:
$$-\gamma = \frac{1 + \sum_{\nu=1}^{n} \beta_\nu^2}{\sum_{\nu=1}^{n} \alpha_\nu^2} = \frac{\beta^2}{\alpha^2} = \left( \frac{\beta}{\alpha} \right)^2.$$
From this we can conclude that either $\gamma$ is a square, or $-\gamma$ is a square, which shows the second property.
Now let us pick any polynomial $f(x) \in F[x]$ with $\deg(f) = 2n + 1$, where $n \in \mathbb{N}$. Without loss of generality we may assume $f$ to be monic, since $F$ is a field. The third statement can then be proven by induction with respect to $n$.
If $n = 0$, then the polynomial is of first degree and thus of the form $f(x) = x - a$, where $a \in F$ is a zero of $f(x)$.
Now let $n \geq 1$ and the statement be true for all $g(x) \in F[x]$ with $\deg(g) = 2k + 1$, $k \in \mathbb{N}$ and $k < n$. If $f(x)$ is reducible, then it can be written as $f(x) = g(x)h(x)$, where $g(x), h(x) \in F[x]$ are monic and of positive degree strictly smaller than $2n + 1$, and one of them (say $g(x)$) must be of odd degree, since $\deg(f) = \deg(g) + \deg(h)$. By the induction hypothesis, $g(x)$ then has a zero in $F$, and hence so does $f(x)$.
If $f(x)$ is irreducible, we can form the proper field extension $F(\alpha) = F[x]/(f(x))$, where $\alpha \in F(\alpha)$ is a zero of $f(x)$. We then know that $F(\alpha)$ is not formally real, and thus there exist $q_1(x), \ldots, q_r(x) \in F[x]$ with degree smaller than $2n + 1$ such that:
$$\sum_{\nu=1}^{r} (q_\nu(\alpha))^2 = -1 \in F.$$
This then shows that there exists some $g(x) \in F[x]$ such that:
$$\sum_{\nu=1}^{r} (q_\nu(x))^2 + f(x)g(x) = -1.$$
Now, the degree of the $q_\nu(x)^2$ must be even, and therefore the degree of the sum must be even and positive and strictly less than $4n + 2$. We therefore conclude that $g(x)$ has odd degree less than or equal to $2n - 1$. Therefore $g(x)$ has a zero $\rho \in F$. However, then:
$$-1 = \sum_{\nu=1}^{r} (q_\nu(\rho))^2 + f(\rho)g(\rho) = \sum_{\nu=1}^{r} (q_\nu(\rho))^2.$$
I.e. $-1$ is a sum of squares in $F$, leading to a contradiction. Therefore $f(x)$ must be reducible and the third statement has been proven.

Lemma 3.2.2. If a field $F$ is real closed, there exists one and only one $P \subseteq F^*$ such that $(F, P)$ is an ordered field. I.e. a real closed field can be uniquely ordered.

Proof. If $F$ is formally real, then we know that it can be ordered. Let $P \subset F$ be the positive elements of such an ordering. Then we know that any non-zero square $x^2$, $x \in F^*$, must be positive.
Now, if $F$ is real closed, it is formally real and can thus be ordered. Also, for every $x \in F^*$ there exists a $y \in F^*$ such that $x = y^2$, in which case $x$ must be positive, or $x = -y^2$, in which case $-x$ must be positive, in any ordering. Since this covers all non-zero elements of $F$, there exists only one ordering, namely the one where exactly the non-zero squares are positive.

Note
From now on, when we speak of a real closed field, we will implicitly assume
that it is equipped with this unique order.

The following result is the analog of the classical Fundamental Theorem of Algebra, and shows that real closed fields capture the important property that we may obtain an algebraically closed field by adjoining a single square root. In particular, this shows that the real numbers $\mathbb{R}$ form a real closed field, since $\mathbb{C}$ is obtained by adjoining $\sqrt{-1}$ and is algebraically closed.

Theorem 3.2.3. A field $F$ is real closed if and only if it is not algebraically closed, and $C = F(\sqrt{-1}) = F[x]/(x^2 + 1)$ is algebraically closed.

Proof. Let $F$ be a real closed field. Then we see that $x^2 + 1$ has no zeroes in $F$, since otherwise $-1$ would be a square in $F$, and hence it is irreducible. We can then define the field $C = F[x]/(x^2 + 1)$. We first define the automorphism $z = a + bi \mapsto \bar{z} = a - bi$ of $C$, where $i \in C$ denotes a zero (any one of the two) of $x^2 + 1$. This induces an automorphism $f(x) \mapsto \bar{f}(x)$ of $C[x]$. We then see that if $f(x) \in C[x]$, then $f(x)\bar{f}(x) \in F[x]$. Also, if $f(x)\bar{f}(x)$ has a zero $r$ in $C$, then $f(r)\bar{f}(r) = 0$, so that $f(r) = 0$ or $f(\bar{r}) = 0$, and hence $f(x)$ has a zero in $C$.
We now show that every element of $C$ can be written as a square. To this end, let $z = a + bi \in C$. Then $z\bar{z} = a^2 + b^2 \in F$ is non-negative, so that $\exists \alpha \in F : a^2 + b^2 = \alpha^2$. Also, $\alpha^2 \geq a^2$ so that $|\alpha| \geq |a|$ and hence $\exists c_1, c_2 \in F$, where we can pick $c_1 c_2$ with the same sign as $b$, such that
$$c_1^2 = \frac{a + |\alpha|}{2}, \qquad c_2^2 = \frac{-a + |\alpha|}{2}.$$
Also:
$$(2 c_1 c_2)^2 = 4 \cdot \frac{a + |\alpha|}{2} \cdot \frac{-a + |\alpha|}{2} = -a^2 + (a^2 + b^2) = b^2.$$
We can therefore conclude that $(c_1 + c_2 i)^2 = c_1^2 - c_2^2 + 2 c_1 c_2 i = a + bi$. This shows that there exists no algebraic extension field $E$ of $C$ with $[E : C] = 2$ (since any quadratic polynomial over $C$ is reducible).
With the foregoing in mind, we now let $f(x) \in F[x]$ be a monic polynomial of even degree. We define $E$ to be a splitting field over $F$ of $f(x)(x^2 + 1)$, such that $C \subseteq E$. Then $E$ is Galois over $F$ (since $F$ is of characteristic 0 and thus any polynomial in $F[x]$ is separable; hence $E$ is the splitting field of a separable polynomial). We write $|\mathrm{Gal}(E/F)| = 2^e m$, where $m$ is odd. By Sylow’s theorem, $\mathrm{Gal}(E/F)$ contains a subgroup $H$ with $|H| = 2^e$. Let $\mathcal{H}$ be the subfield of $E$ containing $F$ corresponding to $H$ under the Galois pairing. Therefore, $2^e m = [E : F] = [E : \mathcal{H}][\mathcal{H} : F] = 2^e [\mathcal{H} : F]$ so that $[\mathcal{H} : F] = m$. But since every polynomial of odd degree in $F[x]$ has a zero in $F$, $F$ has no proper odd-dimensional algebraic extension fields. Therefore, $m = 1$, $H = \mathrm{Gal}(E/F)$, and $\mathcal{H} = F$. We can now conclude that because $|\mathrm{Gal}(E/F)|$ is a power of 2, we can obtain $E$ by repeatedly adjoining square roots. However, since we obtained $C$ by adjoining a square root and $C$ contains all possible square roots, we must have that $C = E$. Therefore, $C$ is a splitting field of $f(x)(x^2 + 1)$ and hence contains all zeroes of $f(x)$⁴. This shows that every polynomial in $F[x]$ has a zero in $C$. By the reasoning above we can then conclude that $C$ is algebraically closed.
We will now go on to show the converse. Let $F$ be a field that is not algebraically closed, but let $C = F(i) = F[x]/(x^2 + 1)$ be algebraically closed. We then clearly see that $\sqrt{-1} \notin F$, since otherwise $x^2 + 1$ would be reducible and $C$ would not be a field. Now let $a, b \in F$. Since $C$ is algebraically closed, every element of $C$ can be written as a square, and so we pick $z \in C$ such that $z^2 = a + bi$. Then $a^2 + b^2 = (a + bi)(a - bi) = z^2 \bar{z}^2 = (z\bar{z})^2$ and $z\bar{z} \in F$. This shows that every sum of squares in $F$ can be written as a square. In particular, $-1$ is not a square, and hence not a sum of squares, so that $F$ is formally real. We can also see that $C$ is an algebraic closure of $F$ (since $C$ is generated by $i = \sqrt{-1}$ with minimum polynomial $x^2 + 1$, so that $[C : F] = 2 < \infty$ and $C$ is algebraically closed), so that every algebraic extension of $F$ is contained within $C$. But then, $C$ is the only proper algebraic extension field, and $C$ is not formally real (as $i^2 = -1$), so that $F$ is real closed.
⁴ We note that $x^2 + 1$ splits in $C[x]$ as $(x - i)(x + i)$.
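The explicit square root constructed in the first half of the proof can be traced numerically. The sketch below takes $F = \mathbb{R}$, approximated by floating-point numbers, which is an assumption made purely for illustration; the function name is likewise hypothetical.

```python
import math

def sqrt_in_C(a, b):
    """Return (c1, c2) with (c1 + c2*i)^2 = a + b*i, following the proof."""
    alpha = math.sqrt(a * a + b * b)           # alpha^2 = a^2 + b^2, alpha >= 0
    c1 = math.sqrt(max((a + alpha) / 2, 0.0))  # c1^2 = (a + |alpha|) / 2
    c2 = math.sqrt(max((-a + alpha) / 2, 0.0)) # c2^2 = (-a + |alpha|) / 2
    if b < 0:
        c2 = -c2                               # give c1*c2 the sign of b
    return c1, c2
```

For $z = 3 + 4i$ we get $\alpha = 5$, $c_1^2 = 4$ and $c_2^2 = 1$, so the construction yields $2 + i$, and indeed $(2 + i)^2 = 3 + 4i$.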

3.3 The Intermediate Value Theorem
In this section we will discuss a very important theorem about continuous real functions that also holds for polynomials with coefficients in a real closed field. This is the familiar intermediate value theorem, and it will be the key to our success in the next chapter.
Theorem 3.3.1 (Intermediate Value Theorem). Let $F$ be a real closed field, $f(x) \in F[x]$, $a, b \in F$ and $a < b$. Then if $f(a)f(b) < 0$, there exists a $c \in F$ such that $a < c < b$ and $f(c) = 0$.
Proof. From theorem 3.2.3 we already know that the only irreducible polynomials in $F[x]$ are those of degree 1 or 2. Furthermore, a polynomial $x^2 + \alpha x + \beta \in F[x]$ is irreducible if and only if $\alpha^2 - 4\beta < 0$. This follows in the same way as for second degree polynomials with real coefficients.
Now let us pick $f(x) \in F[x]$ to be monic and of positive degree. The general case then follows quickly by dividing out the leading coefficient and by noting that the premise cannot hold for polynomials of degree zero. We can write $f(x)$ in terms of its irreducible factors as:
$$f(x) = \prod_{i=1}^{m} (x - r_i) \prod_{j=1}^{s} g_j(x),$$
where $r_1, \ldots, r_m \in F$ and $g_1(x), \ldots, g_s(x) \in F[x]$ with:
$$g_j(x) = x^2 + a_j x + b_j, \quad a_j^2 - 4b_j < 0, \quad 1 \leq j \leq s.$$
For $j \in \{1, \ldots, s\}$ we can, by lemma 3.2.1, find $0 < c_j \in F$ such that $c_j^2 = \frac{1}{4}(4b_j - a_j^2)$. We can then write:
$$g_j(x) = \left( x + \frac{a_j}{2} \right)^2 + c_j^2,$$
so that for all $u \in F$, $g_j(u) > 0$.
We first rule out the case that $f(x)$ has no irreducible factors of first degree. If this were the case, then $f(a)f(b) = \prod_{j=1}^{s} g_j(a)g_j(b) > 0$, contradicting our hypothesis.
Now suppose that no $r_i$ lies in $]a, b[$. Since $f(a)f(b) \neq 0$, no $r_i$ equals $a$ or $b$, so for every $i \in \{1, \ldots, m\}$ either $a < r_i \wedge b < r_i$ or $a > r_i \wedge b > r_i$, and in both cases $(a - r_i)(b - r_i) > 0$. Then $f(a)f(b) = \prod_{i=1}^{m} (a - r_i)(b - r_i) \prod_{j=1}^{s} g_j(a)g_j(b) > 0$, again contradicting our hypothesis. We conclude that there exists an $i \in \{1, \ldots, m\}$ such that $a < r_i < b$ and $f(r_i) = 0$, which concludes the proof.
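In computations, the sign condition of the theorem is what drives bisection. The following is a minimal sketch over $\mathbb{Q}$, which is ordered but not real closed, so the zero itself may be irrational and the routine only returns a rational interval that brackets it. The coefficient-list representation and the function names are assumptions of this example.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed from lowest to highest degree."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value

def isolate_zero(coeffs, a, b, width):
    """Shrink [a, b] around a sign change of f, given f(a) * f(b) < 0."""
    a, b = Fraction(a), Fraction(b)
    fa = evaluate(coeffs, a)
    if fa * evaluate(coeffs, b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > width:
        mid = (a + b) / 2
        fm = evaluate(coeffs, mid)
        if fm == 0:
            return mid, mid
        if fa * fm < 0:
            b = mid          # sign change lies in the left half
        else:
            a, fa = mid, fm  # sign change lies in the right half
    return a, b
```

Applied to $x^2 - 2$ on $[1, 2]$, the returned endpoints satisfy $\mathrm{lo}^2 < 2 < \mathrm{hi}^2$, bracketing $\sqrt{2}$ to the requested width.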

The key property in the proof above was that every positive element of $F$ can be written as a square, which is a characteristic property of real closed fields. It turns out that analogues of several other important theorems in real analysis, such as Rolle’s Theorem and the Mean Value Theorem, hold for polynomials over a real closed field as well.

Chapter 4

Sturm’s Theorem

In this chapter we will study the classical method for determining the number of zeroes of a polynomial with real coefficients that are contained within an open interval, which is based on a theorem by J.C.F. Sturm, published in 1829 [7]. In particular, this method allows us to symbolically locate the zeroes of a polynomial up to an arbitrary precision. We will study this method in the context of real closed fields, which we have shown to encompass the real number system.
We will give two versions of the theorem. The first gives a decision method in terms of variations in sign of a sequence of numbers. The second answers when a parametrized family of polynomials has a zero in a certain interval, by reducing the question to a set of polynomial equations and inequations for the parameters of the family, where the equations and inequations have integer coefficients. From the last theorem we can then quickly show that if a polynomial with rational coefficients has a zero in one real closed field, it will have a zero in any real closed field.
Throughout this chapter, $R$ will denote a real closed field, equipped with the strict total order $>$. Also, if $a, b \in R$ and $a < b$ we will use the notations $[a, b] = \{ x \in R \mid a \leq x \leq b \}$ and $]a, b[ = \{ x \in R \mid a < x < b \}$ for closed and open intervals respectively.
Most of this chapter draws from [4], but several definitions and theorems have been modified to streamline the discussion and to get some more general results.

4.1 Variations in sign


Definition 4.1.1. Let $(c_0, \ldots, c_n) \in R^{n+1}$ be a sequence of numbers in $R$. Then the number of variations in sign of this sequence is defined to be
$$\left| \{ i \in \{1, \ldots, n'\} \mid c'_{i-1} c'_i < 0 \} \right|,$$
where $(c'_0, \ldots, c'_{n'})$ is the subsequence obtained by dropping the zero elements of the original sequence.
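Definition 4.1.1 translates directly into a few lines of code. The function name below is an illustrative choice.

```python
def sign_variations(values):
    """Count sign changes in a sequence after dropping its zeros."""
    nonzero = [v for v in values if v != 0]
    return sum(1 for prev, cur in zip(nonzero, nonzero[1:]) if prev * cur < 0)
```

For example, the sequence $(1, 0, -2, 3, 0, 0, 4)$ reduces to $(1, -2, 3, 4)$, which has two variations in sign.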

Definition 4.1.2. Let $f(x) \in R[x]$ and $a, b \in R$ with $a < b$. Then a Sturm sequence for $f(x)$ on $[a, b]$ is a sequence of polynomials $(f_0(x), \ldots, f_s(x)) \in R[x]^{s+1}$ such that $f_0(x) = f(x)$ and:

1. $f_0(a)f_0(b) \neq 0$,

2. $\forall c \in [a, b] : f_s(c) \neq 0$ (i.e. $f_s(x)$ has no zeroes in $[a, b]$),

3. If $c \in [a, b]$ and $f_j(c) = 0$ for some $j \in \{1, \ldots, s - 1\}$, then $f_{j-1}(c)f_{j+1}(c) < 0$,

4. If $c \in [a, b]$ and $f(c) = 0$, there exist open intervals $]c_1, c[$, $]c, c_2[ \subset R$ such that $\forall u \in ]c_1, c[ : f_0(u)f_1(u) < 0$ and $\forall u \in ]c, c_2[ : f_0(u)f_1(u) > 0$.

In the proposition below we will show that a Sturm sequence can be used to calculate the number of distinct (i.e. not counting multiplicity) zeroes of the polynomial that lie in some open interval.

Proposition 4.1.1. Let $f(x) \in R[x]$ be of positive degree, $a, b \in R$ with $a < b$, and $(f_0(x), \ldots, f_s(x))$ a Sturm sequence for $f(x)$ on $[a, b]$. For any $c \in [a, b]$, denote the number of variations in sign of $(f_0(c), \ldots, f_s(c))$ as $V_c$. Then the number of distinct zeroes of $f(x)$ within $]a, b[$ is $V_a - V_b$.

Proof. Since the number of zeroes of all the $f_i(x)$ within $[a, b]$ is finite, we can write them down, together with the endpoints, as $a = a_0 < a_1 < \cdots < a_m = b$ so that no $f_j(x)$ has a zero in any of the open intervals $]a_{i-1}, a_i[$, $1 \leq i \leq m$. Now pick for $1 \leq i \leq m$: $c_i \in ]a_{i-1}, a_i[$.
First we see that no $f_j(x)$ has a zero in $]a_0, c_1]$. Then by the contrapositive of theorem 3.3.1 we have $f_j(a_0)f_j(c_1) \geq 0$ for $j \in \{0, \ldots, s\}$, with strict inequality whenever $f_j(a_0) \neq 0$. Now let $k \in \{0, \ldots, s\}$ with $f_k(a_0) = 0$. Then clearly $0 < k < s$, since $f_0(a_0) \neq 0 \neq f_s(a_0)$, and so $f_{k-1}(a_0)f_{k+1}(a_0) < 0$. Then $f_{k-1}(a_0)f_{k+1}(a_0)f_{k-1}(c_1)f_{k+1}(c_1) > 0$ implies that $f_{k-1}(c_1)f_{k+1}(c_1) < 0$, so the triple $f_{k-1}, f_k, f_{k+1}$ contributes exactly one variation in sign at both $a_0$ and $c_1$. Taking into account all such $k$, we get $V_{a_0} = V_{c_1}$. In exactly the same way we may prove that $V_{c_m} = V_{a_m}$.
We now let $i \in \{1, \ldots, m - 1\}$. Then if $f(a_i) \neq 0$, we can carry through the same argument to get $V_{c_i} - V_{c_{i+1}} = 0$. If $f(a_i) = 0$, we note that (possibly by repicking our $c_i$ and $c_{i+1}$ to comply with property 4 of a Sturm sequence) $f_0(c_i)f_1(c_i) < 0$ and $f_0(c_{i+1})f_1(c_{i+1}) > 0$. Furthermore, the argument above again shows that if $1 < j < s$, then $f_{j-1}(c_i), f_j(c_i), f_{j+1}(c_i)$ and $f_{j-1}(c_{i+1}), f_j(c_{i+1}), f_{j+1}(c_{i+1})$ have the same number of variations in sign. Therefore in this case $V_{c_i} - V_{c_{i+1}} = 1$.
We can now write:
$$V_a - V_b = (V_a - V_{c_1}) + \sum_{i=1}^{m-1} (V_{c_i} - V_{c_{i+1}}) + (V_{c_m} - V_{a_m}) = \sum_{i=1}^{m-1} \delta_i,$$
where $\delta_i = 1$ if $f(a_i) = 0$ and $\delta_i = 0$ if $f(a_i) \neq 0$. Now since all of the zeroes of $f(x)$ that lie within $]a, b[$ are by definition among the $a_i$, we have counted all the zeroes. Therefore, $V_a - V_b$ is the total number of distinct zeroes of $f(x)$ that lie within $]a, b[$.

Now that we have a method of determining how many distinct zeroes a polynomial has in some open interval, given a Sturm sequence, we need a method to actually produce a Sturm sequence. Once we have this, we have a full-blown algorithm to count the zeroes of a polynomial in some interval. Even better, if we can find a bound on the absolute values of the zeroes of a polynomial and strategically dissect the resulting interval, we can locate the zeroes numerically up to an arbitrary precision! It turns out that we can construct a Sturm sequence in a formalized way, using the Euclidean division algorithm.
Definition 4.1.3. Let $f(x) \in R[x]$ be of positive degree and $f'(x) \in R[x]$ its formal derivative. Then define the following sequence, terminating when $f_{s+1}(x) = 0$:
$$\begin{aligned}
f_0(x) &= f(x) \\
f_1(x) &= f'(x) \\
f_{i-1}(x) &= q_i(x)f_i(x) - f_{i+1}(x), \quad \deg(f_{i+1}) < \deg(f_i), \quad 1 \leq i \leq s,
\end{aligned} \tag{4.1}$$
where $q_i(x) \in R[x]$. Then $(f_0(x), \ldots, f_s(x))$ is called the standard sequence of $f(x)$.

Note: Existence and uniqueness

The polynomials $f_{i+1}(x)$ and $q_i(x)$ exist and are unique by corollary 2.3.2. Note however that we have picked $f_{i+1}(x) = -r(x)$, where $r(x)$ is the remainder of the division of $f_{i-1}(x)$ by $f_i(x)$. This is the key in producing a Sturm sequence.
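The construction of Definition 4.1.3 is the Euclidean algorithm with the sign of every remainder flipped. A minimal sketch over $\mathbb{Q}$ follows; the coefficient-list representation (lowest degree first) and the function names are assumptions of this illustration.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (the zero polynomial becomes [])."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    """Formal derivative of a coefficient list (lowest degree first)."""
    return trim([i * c for i, c in enumerate(p)][1:])

def poly_rem(a, b):
    """Remainder of a divided by b (b non-zero) by long division."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor, shift = a[-1] / b[-1], len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= factor * c
        a = trim(a)
    return a

def standard_sequence(f):
    """Return [f0, ..., fs] with f0 = f, f1 = f' and
    f_{i+1} = -(remainder of f_{i-1} divided by f_i)."""
    seq = [trim(f), derivative(f)]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:            # f_{s+1} = 0: the sequence terminates
            return seq
        seq.append([-c for c in r])
```

For $f(x) = x^3 - 3x + 2$ the sketch yields $f_0 = x^3 - 3x + 2$, $f_1 = 3x^2 - 3$ and $f_2 = 2x - 2$; the last term is a multiple of $x - 1$, the common factor of $f$ and $f'$.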

We notice that if $(f_0(x), \ldots, f_s(x))$ is the standard sequence for some $f(x) \in R[x]$, then $f_s(x)$ is a common factor of $f(x)$ and $f'(x)$ and of all $f_i(x)$, and any such common factor will be a factor of $f_s(x)$. Temporarily passing to the field of fractions of $R[x]$, we can then define a derived sequence $(g_0(x), \ldots, g_s(x))$ by setting $g_i(x) = f_i(x)f_s(x)^{-1}$ for $0 \leq i \leq s$ and observing that each $g_i(x) \in R[x]$.
Lemma 4.1.2. Let $f(x) \in R[x]$ be of positive degree and $(f_0(x), \ldots, f_s(x))$ be its standard sequence. Define the derived sequence of $f(x)$ as $(g_0(x), \ldots, g_s(x))$, where $g_i(x) = f_i(x)f_s(x)^{-1} \in R(x)$¹ for $0 \leq i \leq s$. Then each $g_i(x) \in R[x]$, and the derived sequence is a Sturm sequence for $g_0(x)$ on every interval $[a, b]$ such that $g_0(a)g_0(b) \neq 0$.
Furthermore, $\forall c \in R : f(c) = 0 \iff g_0(c) = 0$.
¹ $R(x)$ denotes the field of fractions of $R[x]$.
Proof. We showed above that $f_s(x)$ is a common factor of all the $f_i(x)$.
Therefore, for every $0 \le i \le s$ we have some $h_i(x) \in R[x]$ such that
$f_i(x) = h_i(x) f_s(x)$ and thus $g_i(x) = h_i(x) f_s(x) f_s(x)^{-1} = h_i(x) \in R[x]$.
We will now show that the derived sequence is a Sturm sequence. Let
$a, b \in R$ with $a < b$ and $g_0(a) g_0(b) \neq 0$. Then clearly property 1 holds.
Furthermore, $g_s(x) = 1$, so that $g_s(x)$ has no zeroes in $R$ and hence not in
$[a, b]$. We now use the definition of the standard sequence to see that for
$1 \le i \le s$ (where it is understood that $g_{s+1}(x) = 0$):

$$g_{i-1}(x) = f_{i-1}(x) f_s(x)^{-1} = (q_i(x) f_i(x) - f_{i+1}(x)) f_s(x)^{-1} = q_i(x) g_i(x) - g_{i+1}(x).$$

Suppose that $c \in [a, b]$ and $g_j(c) = 0$ for some $0 < j < s$. Then
$g_{j-1}(c) g_{j+1}(c) = q_j(c) g_j(c) g_{j+1}(c) - (g_{j+1}(c))^2 = -(g_{j+1}(c))^2 \le 0$.
Also, $g_{j-1}(c) = -g_{j+1}(c)$, so if $g_{j+1}(c) = 0$, then $g_j(c) = 0 = g_{j+1}(c)$
and by induction we can then show that $g_s(c) = 0$, which is not the case.
Therefore property 3 holds.
Lastly, suppose that $c \in [a, b]$ and $g_0(c) = 0$. Then $f(c) = g_0(c) f_s(c) = 0$,
so there exist $h(x) \in R[x]$ and $e \in \mathbb{N}$ such that $f(x) = (x - c)^e h(x)$,
$e > 0$ and $h(c) \neq 0$. Also, $f'(x) = e (x - c)^{e-1} h(x) + (x - c)^e h'(x)$.
Therefore, $(x - c)^{e-1}$ is a common factor of $f(x)$ and $f'(x)$ and hence a
factor of $f_s(x)$. It follows that there exists a $k(x) \in R[x]$ such that
$f_s(x) = (x - c)^{e-1} k(x)$ and $k(c) \neq 0$. Then $h(x) = k(x) l(x)$ and
$h'(x) = k(x) m(x)$ for some $l(x), m(x) \in R[x]$ with $l(c) \neq 0 \neq m(c)$.
Then $g_0(x) = (x - c) l(x)$ and $g_1(x) = (x - c) m(x) + e\, l(x)$ and thus
$g_1(c) = e\, l(c) \neq 0$. We may then choose² an interval $[c_1, c_2]$ such that
$c \in [c_1, c_2]$ and the interval contains no zeroes of $g_1(x)$ nor $l(x)$. Then
by theorem 3.3.1, $g_1(x) l(x) > 0$ on this interval, so that for
$\gamma \in [c_1, c_2]$: $g_0(\gamma) g_1(\gamma) = (\gamma - c) g_1(\gamma) l(\gamma)$,
which has the same sign as $\gamma - c$ and thus is negative when
$\gamma \in\, ]c_1, c[$ and positive when $\gamma \in\, ]c, c_2[$. Hence property 4
holds and the derived sequence is a Sturm sequence for $g_0(x)$ in $[a, b]$.
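
As a small illustration of the lemma (the polynomial below is chosen here for convenience and is not taken from the text), consider $f(x) = (x - 1)^2 (x + 2)$, which has a double zero. A hand computation under the convention $f_{i+1} = -(\text{remainder})$ would run as follows:

```latex
\begin{align*}
f_0(x) &= x^3 - 3x + 2, \qquad f_1(x) = f'(x) = 3x^2 - 3,\\
f_0(x) &= \tfrac{x}{3}\, f_1(x) - f_2(x) \quad\text{with}\quad f_2(x) = 2x - 2,\\
f_1(x) &= \left(\tfrac{3}{2}x + \tfrac{3}{2}\right) f_2(x) - 0, \quad\text{so } s = 2 \text{ and } f_s(x) = 2(x - 1).\\[4pt]
g_0(x) &= \frac{f_0(x)}{f_2(x)} = \tfrac{1}{2}(x - 1)(x + 2), \qquad
g_1(x) = \frac{f_1(x)}{f_2(x)} = \tfrac{3}{2}(x + 1), \qquad
g_2(x) = 1.
\end{align*}
```

Here $g_0$ has the same distinct zeroes as $f$ (namely $1$ and $-2$), each of them simple, and $g_1(1) = 3 \neq 0$, in line with the statement of the lemma.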

By combining the foregoing lemma and proposition, we may now prove
the main result of this section.
Theorem 4.1.3 (Sturm’s Theorem). Let $f(x) \in R[x]$ be of positive degree
and $(f_0(x), \ldots, f_s(x))$ its standard sequence. For all $c \in R$, let $V_c$ be
the number of variations in sign of $(f_0(c), \ldots, f_s(c))$. Then, if $a, b \in R$,
$a < b$ and $f(a) f(b) \neq 0$, the number of distinct zeroes of $f(x)$ in the
interval $]a, b[$ is $V_a - V_b$.
Proof. Let $(g_0(x), \ldots, g_s(x))$ be the derived sequence of $f(x)$. We have seen
that $f(x)$ and $g_0(x)$ have the same distinct zeroes, so the derived sequence
is a Sturm sequence for $g_0(x)$ on $[a, b]$. Also, since $f(a) \neq 0 \neq f(b)$,
neither $(x - a)$ nor $(x - b)$ is a common factor of $f(x)$ and $f'(x)$. It then
follows that $f_s(a) \neq 0 \neq f_s(b)$ and thus the sequences

$$f_i(a) = g_i(a) f_s(a) \quad \text{and} \quad f_i(b) = g_i(b) f_s(b)$$

have the same variations in sign as the $g_i(a)$ and $g_i(b)$ respectively. Now, by
the foregoing proposition and the observation above, the number of distinct
zeroes of $f(x)$ in $]a, b[$ is equal to the number of distinct zeroes of $g_0(x)$ in
the interval, which is $V_a - V_b$.

² E.g. by choosing a random such interval and then filtering out the zeroes of
$g_1(x)$ and $l(x)$ by taking the ones closest to $c$ and averaging with $c$.
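
Once a standard sequence is known, the count $V_a - V_b$ is straightforward to mechanise. The following Python sketch is illustrative only: the names `variations` and `zeros_between` are not from the text, zero entries are skipped when counting sign changes (an assumption about the convention behind $V_c$), and the standard sequence of the sample polynomial $x^2 - 1$, namely $(x^2 - 1,\ 2x,\ 1)$, is hard-coded rather than computed.

```python
from fractions import Fraction


def variations(values):
    """Count sign changes in a sequence, ignoring zero entries (assumed convention)."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# Standard sequence of f(x) = x^2 - 1, worked out by hand:
# f_0 = x^2 - 1, f_1 = f' = 2x, f_2 = -(remainder of f_0 by f_1) = 1.
STANDARD_SEQUENCE = [
    lambda x: x * x - 1,
    lambda x: 2 * x,
    lambda x: Fraction(1),
]


def zeros_between(a, b):
    """Return V_a - V_b for the hard-coded sequence; needs a < b and f(a) f(b) != 0."""
    v_a = variations([f(a) for f in STANDARD_SEQUENCE])
    v_b = variations([f(b) for f in STANDARD_SEQUENCE])
    return v_a - v_b


print(zeros_between(Fraction(-2), Fraction(2)))  # interval contains both -1 and 1
print(zeros_between(Fraction(0), Fraction(2)))   # interval contains only 1
```

Exact rational arithmetic (`Fraction`) is used so that the sign tests are not affected by rounding.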

We can use the foregoing result to form a useful algorithm that runs in
polynomial time with respect to the degree of the polynomial in question.

Algorithm 3: Calculating the total number of zeroes of a polynomial

Let $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$ be monic and of positive degree. Define
$\mu = 1 + \max(1, \sum_{i=0}^{n-1} |a_i|)$. Calculate the standard sequence
$(f_0(x), \ldots, f_s(x))$ of $f(x)$ by repeated use of algorithm 1. For $c \in R$,
let $V_c$ denote the number of variations in sign of the sequence
$(f_0(c), \ldots, f_s(c))$. Then the total number of distinct zeroes of $f(x)$ in
$R$ is $V_{-\mu} - V_{\mu}$.

Proof. We have found in lemma 3.1.10 that all zeroes of $f(x)$ are contained in
the interval $[-M, M]$, where $M = \max(1, \sum_{i=0}^{n-1} |a_i|)$. Therefore, all
zeroes of $f(x)$ are certainly contained in the open interval $]-\mu, \mu[$, where
$\mu = 1 + M$. If we combine this with Sturm’s theorem, we get $V_{-\mu} - V_{\mu}$
as the total number of distinct zeroes of $f(x)$.
Example. We let $f(x) = x^3 + 3x + 1 \in R[x]$. Then $f'(x) = 3x^2 + 3$ and the
Euclidean sequence of $f(x)$ and $f'(x)$ (and thus the standard sequence of $f(x)$) is:

$$f_0(x) = x^3 + 3x + 1, \qquad f_1(x) = 3x^2 + 3, \qquad f_2(x) = -(2x + 1), \qquad f_3(x) = -\tfrac{15}{4}.$$

We observe that all zeroes of $f(x)$ will lie in the interval $]-M - 1, M + 1[$, where
$M = \max(1, 4) = 4$. We therefore evaluate the standard sequence at $-5$ and $5$.

$$\begin{aligned}
f_0(-5) &= -139 < 0, & f_0(5) &= 141 > 0,\\
f_1(-5) &= 78 > 0, & f_1(5) &= 78 > 0,\\
f_2(-5) &= 9 > 0, & f_2(5) &= -11 < 0,\\
f_3(-5) &= -\tfrac{15}{4} < 0, & f_3(5) &= -\tfrac{15}{4} < 0.
\end{aligned}$$

From this we see that $V_{-5} - V_5 = 2 - 1 = 1$, so $f(x)$ has 1 distinct zero in any real
closed field.
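
Algorithm 3 and the example above can be reproduced with a short program. The sketch below is one possible reading, not the thesis's own implementation: it works over the rationals with Python's `fractions.Fraction`, stores a polynomial as a list whose entry `i` is the coefficient of $x^i$, and all function names (`trim`, `remainder`, `standard_sequence`, `count_real_zeros`) are chosen here for illustration. The inline division loop stands in for "algorithm 1", which is not shown in this section.

```python
from fractions import Fraction


def trim(p):
    """Drop zero leading coefficients; p[i] is the coefficient of x^i."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def remainder(f, g):
    """Remainder of f on division by a nonzero g, computed over the rationals."""
    r, g = trim(f), trim(g)
    while len(r) >= len(g):
        shift, coeff = len(r) - len(g), Fraction(r[-1]) / g[-1]
        for i, c in enumerate(g):
            r[i + shift] -= coeff * c
        r = trim(r)
    return r


def standard_sequence(f):
    """f_0 = f, f_1 = f', then f_{i+1} = -(remainder of f_{i-1} by f_i)."""
    f = trim(f)
    seq = [f, [i * c for i, c in enumerate(f)][1:]]
    while True:
        r = remainder(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])


def evaluate(p, x):
    """Horner evaluation of p at x."""
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc


def variations(values):
    """Count sign changes, ignoring zero entries (assumed convention)."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_real_zeros(f):
    """Distinct zeroes of a monic f of positive degree, as V_{-mu} - V_{mu}."""
    f = trim(f)
    mu = 1 + max(1, sum(abs(c) for c in f[:-1]))
    seq = standard_sequence(f)

    def v(c):
        return variations([evaluate(p, c) for p in seq])

    return v(-mu) - v(mu)


print(count_real_zeros([1, 3, 0, 1]))  # f(x) = x^3 + 3x + 1 from the example
```

For the example polynomial this sketch computes $\mu = 5$ and the same sequence $(f_0, f_1, f_2, f_3)$ as in the text.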

4.2 Systems of equations, inequations and inequalities
This section serves as a preamble to the next section. We will now develop
the notion of a system of equations, inequations and inequalities, which
are expressions $v(t_1, \ldots, t_r) = 0$, $v(t_1, \ldots, t_r) \neq 0$, and
$v(t_1, \ldots, t_r) > 0$ respectively, where $v \in \mathbb{Z}[t_1, \ldots, t_r]$ for
indeterminates $t_i$, $1 \le i \le r$. Note that we will write $v(t_i)$ for
$v(t_1, \ldots, t_r)$ if it is more convenient. We can consider any ordered field
$F$, which will contain $\mathbb{Z}$ as a subring. We then have an evaluation
homomorphism $\mathbb{Z}[t_1, \ldots, t_r] \to F$ induced by the inclusion
homomorphism, that sends $\mathbb{Z}$ to $\mathbb{Z}$ and $t_i$ to some $c_i \in F$. In
this way we can look for solutions of such an expression in the extension field $F$.
We further note that if $v(c_1, \ldots, c_r) \neq 0$ and $w(c_1, \ldots, c_r) \neq 0$, then
since the solutions of these two inequations are in a field $F$, we can rewrite
this equivalently as $v(c_1, \ldots, c_r) w(c_1, \ldots, c_r) \neq 0$. So, any finite set
of inequations can be replaced by a single inequation. We can now state the
following definition.
Definition 4.2.1. An r-system (of equations, inequations and inequalities)
is a triple

$$\Gamma = ((v_1, \ldots, v_s),\ v_{\neq},\ (v_{>1}, \ldots, v_{>u})) \in \bigcup_{i=1}^{\infty} \mathbb{Z}[t_1, \ldots, t_r]^{(i)} \times \mathbb{Z}[t_1, \ldots, t_r] \times \bigcup_{i=1}^{\infty} \mathbb{Z}[t_1, \ldots, t_r]^{(i)}.$$

Moreover, if $(F, P)$ is an ordered field, then the solution set of $\Gamma$ is the set
$\Gamma(F)$ of $(c_1, \ldots, c_r) \in F^{(r)}$ such that:

$$v_1(c_i) = \cdots = v_s(c_i) = 0, \qquad v_{\neq}(c_i) \neq 0, \qquad v_{>1}(c_i), \ldots, v_{>u}(c_i) > 0.$$

If we wish to specify a system without equalities, we can specify the
trivial equality $0 = 0$. Similarly, we can adjoin the trivial inequation $1 \neq 0$
and inequality $1 > 0$. In this chapter, we shall not use inequalities much,
and when we do not need them, we shall drop the last term in the triple,
assuming the trivial inequality is to be adjoined. Also, when no inequation
(the second element in the triple) has been specified, we assume that the
trivial inequation must be adjoined.
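
To make the definition concrete, membership of a point in a solution set $\Gamma(F)$ can be tested directly. In the sketch below, which is only an illustration, an r-system is modelled as a Python triple `(equations, inequation, inequalities)` of callables standing in for polynomials in $\mathbb{Z}[t_1, \ldots, t_r]$; the function name `in_solution_set` and the sample system `double_root` are hypothetical and do not appear in the text.

```python
def in_solution_set(system, point):
    """Decide whether `point` lies in the solution set of an r-system.

    `system` is a triple (equations, inequation, inequalities); each entry is
    a callable (or tuple of callables) taking the r coordinates of the point.
    """
    equations, inequation, inequalities = system
    return (
        all(v(*point) == 0 for v in equations)
        and inequation(*point) != 0
        and all(v(*point) > 0 for v in inequalities)
    )


# A sample 2-system in the parameters (p, q): p^2 - 4q = 0, p != 0, q > 0.
double_root = (
    (lambda p, q: p * p - 4 * q,),
    lambda p, q: p,
    (lambda p, q: q,),
)

print(in_solution_set(double_root, (2, 1)))  # 4 - 4 = 0, 2 != 0, 1 > 0
print(in_solution_set(double_root, (0, 0)))  # the inequation p != 0 fails
```

Integer points are used here for simplicity; any exact ordered number type (for instance `fractions.Fraction`) would work in the same way.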
We can now ask when a set of systems covers all possible cases. The
following definition will make this formal.
Definition 4.2.2. An r-cover is a finite set of r-systems
$\delta = \{\Delta_1, \ldots, \Delta_s\}$ such that for any ordered field $F$:

$$\bigcup_{\Delta \in \delta} \Delta(F) = F^{(r)}.$$

Also, a refinement of an r-cover $\gamma$ is an r-cover $\delta$ such that for any
ordered field $F$: $\forall \Delta \in \delta\ \exists \Gamma \in \gamma : \Delta(F) \subseteq \Gamma(F)$.

Definition 4.2.3. If $\Gamma$ and $\Delta$ are r-systems, their join is defined to be the
r-system $\Gamma \sqcap \Delta$³ that has as its equalities and inequalities both those of
$\Gamma$ and $\Delta$, and as inequation the product of the inequations of $\Gamma$ and $\Delta$.

We will give the following lemmas without proof, as they are quite
straightforward if you just write out the definitions.

Lemma 4.2.1. Let $\Gamma$ and $\Delta$ be r-systems. Then:

$$(\Gamma \sqcap \Delta)(F) = \Gamma(F) \cap \Delta(F),$$

for any ordered field $F$.

Lemma 4.2.2. If $\Gamma$ is an r-system and $\delta = \{\Delta_1, \ldots, \Delta_s\}$ is a finite
r-cover, and we define $\Gamma_j = \Gamma \sqcap \Delta_j$ for $1 \le j \le s$, then
$\bigcup_{j=1}^{s} \Gamma_j(F) = \Gamma(F)$ for every ordered field $F$.

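Lemma 4.2.3. Let $\gamma = \{\Gamma_1, \ldots, \Gamma_u\}$ and $\delta = \{\Delta_1, \ldots, \Delta_s\}$
be r-covers and define $\Gamma'_j = \Gamma_1 \sqcap \Delta_j$. Then
$\gamma' = \{\Gamma'_1, \ldots, \Gamma'_s, \Gamma_2, \ldots, \Gamma_u\}$ is again an
r-cover, and a refinement of $\gamma$.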
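
The join and the first two lemmas can be spot-checked numerically. The Python sketch below is not a proof: it only tests Lemma 4.2.1 and Lemma 4.2.2 on a finite grid of integer points, for one hypothetical 2-system `gamma` and one hypothetical 2-cover `cover`. Systems are again modelled as triples of callables, and the names `solves` and `join` are introduced here for illustration.

```python
from itertools import product


def solves(system, point):
    """True if `point` satisfies all equations, the inequation and all inequalities."""
    equations, inequation, inequalities = system
    return (
        all(v(*point) == 0 for v in equations)
        and inequation(*point) != 0
        and all(v(*point) > 0 for v in inequalities)
    )


def join(gamma, delta):
    """Join of two systems: pool equalities and inequalities, multiply inequations."""
    eq_g, ne_g, gt_g = gamma
    eq_d, ne_d, gt_d = delta
    return (eq_g + eq_d, lambda *c: ne_g(*c) * ne_d(*c), gt_g + gt_d)


ONE = lambda p, q: 1                 # trivial inequation / inequality
ZERO = lambda p, q: 0                # trivial equality
DISC = lambda p, q: p * p - 4 * q    # sample polynomial p^2 - 4q

gamma = ((ZERO,), ONE, (lambda p, q: q,))   # q > 0
delta_1 = ((DISC,), ONE, (ONE,))            # p^2 - 4q = 0
delta_2 = ((ZERO,), DISC, (ONE,))           # p^2 - 4q != 0
cover = [delta_1, delta_2]

grid = list(product(range(-4, 5), repeat=2))

# Lemma 4.2.1 on the grid: the join solves exactly the common solutions.
lemma_421 = all(
    solves(join(gamma, d), c) == (solves(gamma, c) and solves(d, c))
    for d in cover
    for c in grid
)

# Lemma 4.2.2 on the grid: the joins with a cover recover gamma's solutions.
lemma_422 = all(
    any(solves(join(gamma, d), c) for d in cover) == solves(gamma, c)
    for c in grid
)

print(lemma_421, lemma_422)
```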

4.3 Sturm’s Theorem Parametrized


We will now consider a family of polynomials in a formally real field R whose
coefficients are parametrized as multivariate polynomials over its prime ring
Z. That is, the family of polynomials is represented by a polynomial in
Zrt1 , . . . , tr srxs. The ti represent parameters, and the x represents a variable
we wish to solve for. Using Sturm’s Theorem we will show that we can,
algorithmically, obtain a cover of systems in Z such that a member of this
family has a zero in a certain interval if and only if the parameters and
boundaries satisfy one of those systems. This method could be extended
to parametrize the systems that the coefficients have to satisfy with respect
to the boundaries of the system, but that extension will not be considered
here.
In order to get to our main result, we first let $K = \mathbb{Z}$ and $R$ be a real
closed field. We also let $r \in \mathbb{N}$, $r \ge 1$ and define $A = K[t_1, \ldots, t_r]$,
where the $t_i$, $1 \le i \le r$, are indeterminates. Now, if we pick
$(c_1, \ldots, c_r) \in R^{(r)}$, we have a homomorphism $A \to R$ that extends the
inclusion homomorphism $K \to R$ and sends $t_i \mapsto c_i$. Therefore, we have an
extension of this homomorphism $A[x] \to R[x]$ that maps each parametrized
polynomial to a polynomial with coefficients in $R$: $F(t_i; x) \mapsto F(c_i; x)$.

³ This is not standard notation, but it proves intuitive given lemma 4.2.1.
Since A is a commutative ring, we can perfectly well perform Euclidean
polynomial division in Arxs. If we now make the connection with the eval-
uation in pc1 , . . . , cr q we can make the following important observation.

Lemma 4.3.1. Let $F(t_i; x), G(t_i; x) \in A[x]$ with $G(t_i; x) \neq 0$ and $v_m(t_i)$ the
leading coefficient of $G$. Then there exist an even $e \in \mathbb{N}$ and
$Q(t_i; x), R(t_i; x) \in A[x]$ with $\deg(R) < \deg(G)$ and:

$$v_m(t_i)^e F(t_i; x) = Q(t_i; x) G(t_i; x) - R(t_i; x).$$

Also, if $(c_1, \ldots, c_r) \in R^{(r)}$ and $v_m(c_i) \neq 0$, then the $q(x), r(x) \in R[x]$
with $F(c_i; x) = q(x) G(c_i; x) - r(x)$ and $\deg(r) < \deg(G(c_i))$ differ from
$Q(c_i; x)$ and $R(c_i; x)$ by a common positive multiplier.
We also note that the choice of the $Q(t_i; x)$, $R(t_i; x)$ and $e$ is independent
of which real closed field we use.

Proof. The existence of an arbitrary $e \in \mathbb{N}$ and the $Q(t_i; x), R(t_i; x) \in A[x]$
follows from the Euclidean division algorithm. However, if $e$ is odd, we may
multiply the entire equation by $v_m(t_i)$ and so obtain a new $\tilde{Q}(t_i; x)$ and
$\tilde{R}(t_i; x)$ and an even $\tilde{e}$ so that the equation still holds.
Now, if $(c_1, \ldots, c_r) \in R^{(r)}$ such that $v_m(c_i) \neq 0$, then since $e$ is even we
have $v_m(c_i)^e > 0$. Then evaluating the equation in the $c_i$ and dividing by
$v_m(c_i)^e$, we obtain:

$$F(c_i; x) = v_m(c_i)^{-e} Q(c_i; x) G(c_i; x) - v_m(c_i)^{-e} R(c_i; x) = q(x) G(c_i; x) - r(x),$$

where the $q(x), r(x) \in R[x]$ are as above. And since such $q(x)$ and $r(x)$ are
unique in the polynomial ring of a field, we have $Q(c_i; x) = v_m(c_i)^e q(x)$ and
$R(c_i; x) = v_m(c_i)^e r(x)$.
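
The division used in the lemma avoids dividing by the leading coefficient, which is what makes it work over a ring such as $A$. The following Python sketch illustrates the idea for polynomials with plain integer coefficients (that is, with the parameters already specialised to integers, which is an assumption made here to keep the code short); the function name `even_pseudo_division` is hypothetical. It returns $e$, $Q$ and $R$ with $v^e F = Q G - R$ and $e$ even, where $v$ is the leading coefficient of $G$.

```python
def trim(p):
    """Drop zero leading coefficients; p[i] is the coefficient of x^i."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def even_pseudo_division(F, G):
    """Return (e, Q, R) with lc(G)^e * F = Q*G - R, deg R < deg G and e even.

    Only ring operations (+, -, *) are applied to the coefficients, so no
    division by the leading coefficient of G is ever needed.
    """
    F, G = trim(F), trim(G)
    if not G:
        raise ValueError("G must be nonzero")
    v, m = G[-1], len(G) - 1
    q = [0] * max(len(F) - m, 1)
    r, e = F, 0
    while len(r) - 1 >= m:
        lead, shift = r[-1], len(r) - 1 - m
        # Scale by v so that the leading term of r can be cancelled exactly.
        q = [v * c for c in q]
        q[shift] += lead
        r = [v * c for c in r]
        for i, g in enumerate(G):
            r[i + shift] -= lead * g
        r = trim(r)
        e += 1
    if e % 2 == 1:  # make the exponent even, as in the proof
        q, r, e = [v * c for c in q], [v * c for c in r], e + 1
    return e, trim(q), [-c for c in r]


# F = x^2 + 3x + 1 and G = F' = 2x + 3, i.e. x^2 + px + q with p = 3, q = 1.
print(even_pseudo_division([1, 3, 1], [3, 2]))  # (2, [3, 2], [5])
```

In this instance $R = 5 = p^2 - 4q$, so the identity reads $2^2 F = (2x + 3) G - 5$.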

We are now ready to state the following proposition, that allows us to
use Sturm’s theorem on the parametrized polynomials.

Proposition 4.3.2. Let $F(t_i; x), G(t_i; x) \in A[x]$ with
$G(t_i; x) = \sum_{j=0}^{m} v_j(t_i) x^j \neq 0$. Define
$G_k(t_i; x) = \sum_{j=0}^{k} v_j(t_i) x^j$ and the r-systems
$\Gamma_k = ((v_j, j > k), v_k)$ for $0 \le k \le m$ and
$\Gamma_{-\infty} = ((v_0, \ldots, v_m), 1)$⁴. Then we can obtain, in
a finite number of steps, an r-cover $\delta = \{\Delta_1, \ldots, \Delta_h\}$ that is a
refinement of the cover $\gamma = \{\Gamma_{-\infty}, \Gamma_0, \ldots, \Gamma_m\}$ and $h$
sequences of polynomials $(F_{j0}(t_i; x), \ldots, F_{j s_j}(t_i; x))$ in $A[x]$ such that,
if $(c_1, \ldots, c_r) \in \Delta_j(R)$, then the terms of
$(F_{j0}(c_i; x), \ldots, F_{j s_j}(c_i; x))$ differ from the terms of the Euclidean
sequence of $F(c_i; x)$ and $G(c_i; x)$ by a positive multiplier.
Also, if this property holds in one real closed field, then it holds for any
real closed field.

⁴ The $k$ and $-\infty$ correspond to the degree of $G(c_i; x)$ if $(c_i) \in \Gamma_k(R)$.
Proof. We consider any $k \in \{0, \ldots, m\}$ with $v_k(t_i) \neq 0$ (or equivalently
$\Gamma_k(R) \neq \emptyset$), for else $G_k(t_i; x) = G_j(t_i; x)$ for some $j < k$ and
$\Gamma_k$ would not be contributing to the cover $\gamma$. We can then just as well
omit $\Gamma_k$ in our refinement. Now find $Q_k(t_i; x), R_k(t_i; x) \in A[x]$ as in the
foregoing lemma. We have to consider two cases.
If $R_k(t_i; x) = 0$, we can take the sequence $(F, G, 0)$ and $\Gamma_k$ as the
corresponding system. This suffices because if $(c_1, \ldots, c_r) \in \Gamma_k(R)$, then
$G(c_i; x) = G_k(c_i; x)$ and thus the Euclidean sequence would be
$(F(c_i), G_k(c_i))$. Note that we will use this case as an induction basis in the
next case.
Now let $R_k(t_i; x) \neq 0$. If $k = m > \deg(F)$, then we see that
$R_k(t_i; x) = -F(t_i; x)$. We may then obtain the result for $G(t_i; x)$ and
$R_k(t_i; x)$, by going through the argument again and seeing that this case is
then excluded. Otherwise $\deg(R_k) + \deg(G_k) < \deg(F) + \deg(G)$, so by
induction on the sum of the degrees, we may obtain a cover
$\delta_k = \{\Delta_{k0}, \ldots, \Delta_{k h_k}\}$ and $h_k$ sequences
$(F_{kl0}(t_i; x), \ldots, F_{kl s_{kl}}(t_i; x))$ so that the required property holds
for $G_k(t_i; x)$ and $R_k(t_i; x)$. We now define $\Gamma_{kl} = \Gamma_k \sqcap \Delta_{kl}$ for
$l \in \{0, \ldots, h_k\}$. Then, if $(c_i) \in \Gamma_{kl}(R) \subseteq \Gamma_k(R)$, we have
$G_k(c_i; x) = G(c_i; x)$. Also, since $F_{kl0}(c_i; x) = G_k(c_i; x) = G(c_i; x)$ and
$F_{kl1}(c_i; x) = R_k(c_i; x)$, we can take the sequences
$(F(c_i; x), F_{kl0}(c_i; x), \ldots, F_{kl s_{kl}}(c_i; x))$, whose terms differ from
the Euclidean sequence of $F(c_i; x)$ and $G(c_i; x)$ by a positive multiplier, and
pair these with the respective $\Gamma_{kl}$.
If we now let $\delta$ consist of the systems obtained above, and pair these with
their respective sequences, including $\Gamma_{-\infty}$ with $(F, 0, 0)$, we have
obtained a refinement of $\gamma$ that satisfies our requirements. We also note now
that the choice of the systems and sequences did not depend on the real closed
field in question, so that the property holds for any real closed field.
Example. Let $F(p, q; x) = x^2 + px + q$ and $G(p, q; x) = 2x + p$. We then have
$\Gamma_{-\infty}(R) = \Gamma_0(R) = \emptyset$ and $\Gamma_1(R) = R^{(2)}$. We therefore
consider only $k = 1$, $G_1(p, q; x) = G(p, q; x)$. We first observe that:

$$2^2 F(p, q; x) = (2x + p) G(p, q; x) - (p^2 - 4q).$$

We therefore set $R_1(p, q; x) = p^2 - 4q$. Now, since $R_1(p, q; x) \in A$, another step
(only possible if $p^2 - 4q \neq 0$) will give us the 0 polynomial. We therefore have the
2-cover and corresponding sequences:

$$\begin{aligned}
\Delta_1 &: p^2 - 4q = 0 \ \leftrightarrow\ (F, G)\\
\Delta_2 &: p^2 - 4q \neq 0 \ \leftrightarrow\ (F, G, R_1)
\end{aligned}$$
If we now recall that the standard sequence of a polynomial f pxq is
simply the Euclidean sequence of f pxq and its formal derivative, we can
quickly prove the following theorem, which is our second main result. Note
that this version is more general than the one in [4], as the systems we have
to obtain include requirements on the bounds of our interval.

Theorem 4.3.3 (Parametrized version of Sturm’s Theorem). Let $F(t_i; x) \in A[x]$.
Then there exists a finite set of $(r+2)$-systems⁵ $\omega$ in $K$, which we
can obtain in a finite number of steps, such that for every
$(c_1, \ldots, c_r, a, b) \in R^{(r+2)}$ with $a < b$, $F(c_i; x)$ has a zero in $[a, b]$ if
and only if either $F(c_i; a) F(c_i; b) = 0$, or $F(c_i; a) F(c_i; b) \neq 0$ and there is
some $\Omega \in \omega$ such that $(c_1, \ldots, c_r, a, b) \in \Omega(R)$.
We can restate this theorem as follows: Let $F(t_i; x)$ be a family of
polynomials whose coefficients are parametrized by polynomials with integer
coefficients. Then for any interval $]a, b[$ we can obtain a finite set of systems
of equations, inequations and inequalities, so that $F(c_i; x)$ has a zero in that
interval if and only if the coefficient parameters $(c_i)$ and the boundaries $a$
and $b$ satisfy one of those systems (provided that $F(c_i; a) F(c_i; b) \neq 0$).

Proof of the parametrized version of Sturm’s Theorem, 4.3.3. Let
$F(t_i; x) = \sum_{\nu=0}^{n} u_\nu(t_i) x^\nu$, where $u_\nu(t_i) \in A$, and $u_n(t_i)$
is the leading coefficient. Now, if $F'(t_i; x) = 0$, then $F(t_i; x) = u_0(t_i)$ is
constant and we can take the sole system $((u_0))$.
We can therefore now assume that
$0 \neq F'(t_i; x) = \sum_{\nu=1}^{n} \nu u_\nu(t_i) x^{\nu-1}$.
Then by proposition 4.3.2 we can obtain a cover $\delta = \{\Delta_0, \ldots, \Delta_h\}$ and
corresponding sequences $(F_{j0}(t_i; x), \ldots, F_{j s_j}(t_i; x))$ ($0 \le j \le h$) such
that if $(c_1, \ldots, c_r) \in \Delta_j$, then the terms of
$(F_{j0}(c_i; x), \ldots, F_{j s_j}(c_i; x))$ differ from the terms of the standard
sequence of $F(c_i; x)$ by positive multipliers. In particular, we see that at any
point, the number of sign changes is the same.
Now pick any $j \in \{0, \ldots, h\}$. If we let $\gamma$ be the same cover as in
proposition 4.3.2, then $\delta$ is a refinement of $\gamma$. Therefore, if
$(c_i) \in \Delta_j(R)$, we have either $u_n(c_i) = \cdots = u_1(c_i) = 0$ (in which case
$F(c_i; x)$ has a zero if and only if $u_0(c_i) = 0$) or there is some
$k \in \{1, \ldots, n\}$ such that $u_k(c_i) \neq 0$ but $u_l(c_i) = 0$ for $l > k$. In the
first case we set $\omega_j = \{((u_0))\}$: the sole equation $u_0 = 0$. In the latter
case we may construct the following two sequences:

$$\alpha_{jl}(t_i; x_a, x_b) = u_m(t_i)^{2 n_l} F_{jl}(t_i; x_a) \in A[x_a, x_b],$$
$$\beta_{jl}(t_i; x_a, x_b) = u_m(t_i)^{2 n_l} F_{jl}(t_i; x_b) \in A[x_a, x_b],$$

for $0 \le l \le s_j$, where $n_l = \deg(F_{jl})$ and $x_a$ and $x_b$ are new
indeterminates. Then all the $\alpha_{jl}(c_i; a, b)$ and $\beta_{jl}(c_i; a, b)$ differ from
$F_{jl}(c_i; a)$ and $F_{jl}(c_i; b)$ respectively (and hence from the standard sequence
of $F(c_i; x)$ at those points) by a positive multiplier. If
$F(c_i; a) F(c_i; b) \neq 0$, we now conclude by Sturm’s Theorem that $F(c_i; x)$ has a
zero in $]a, b[$ if and only if the numbers of variations in sign of the sequences
$(\alpha_{j0}(c_i; a, 0), \ldots, \alpha_{j s_j}(c_i; a, 0))$ and
$(\beta_{j0}(c_i; 0, b), \ldots, \beta_{j s_j}(c_i; 0, b))$ are not equal. We therefore now
take all possible $(r+2)$-systems on $K$ that can be formed by the elements of those
sequences (of which there are finitely many), select the ones that lead to a
difference in the number of variations in sign between the sequences, and for
each take the join with $\Delta_j$⁶, to form the set of systems $\omega_j$. Then, if
$(c_i; a, b) \in \Omega(R)$ for some $\Omega \in \omega_j$, then $(c_i) \in \Delta_j(R)$, so that the
above applies, and there is a difference between the variation in sign in the
sequences $(\alpha_{jl}(c_i; a, 0))$ and $(\beta_{jl}(c_i; 0, b))$, so that $F(c_i; x)$ has a
zero in $]a, b[$, provided that $F(c_i; a) F(c_i; b) \neq 0$. We also observe that if
$F(c_i; a) F(c_i; b) = 0$, then $F(c_i; x)$ has a zero in $[a, b]$.
If we now let $\omega = \bigcup_{j=0}^{h} \omega_j$, we obtain the set of $(r+2)$-systems we
require, since $\delta$ is a cover of $K$.

⁵ The 2 extra parameters in the systems of $\omega$ correspond to the bounds of our
interval. They serve to keep the set of systems independent of the real closed
field that we choose to use.
Example. In our last example we obtained a 2-cover and the corresponding
sequences for $F(p, q; x) = x^2 + px + q$ and $F'(p, q; x) = 2x + p$. We write:

$$\begin{aligned}
\Delta_1 &: p^2 - 4q = 0 \ \leftrightarrow\ (F, F')\\
\Delta_2 &: p^2 - 4q \neq 0 \ \leftrightarrow\ (F, F', p^2 - 4q).
\end{aligned}$$

We can then define the corresponding $\alpha_{jl}$ and $\beta_{jl}$ as follows:

$$\begin{aligned}
\alpha_{10}(p, q; x_a, x_b) &= x_a^2 + p x_a + q, & \beta_{10}(p, q; x_a, x_b) &= x_b^2 + p x_b + q,\\
\alpha_{11}(p, q; x_a, x_b) &= 2 x_a + p, & \beta_{11}(p, q; x_a, x_b) &= 2 x_b + p,\\[4pt]
\alpha_{20}(p, q; x_a, x_b) &= x_a^2 + p x_a + q, & \beta_{20}(p, q; x_a, x_b) &= x_b^2 + p x_b + q,\\
\alpha_{21}(p, q; x_a, x_b) &= 2 x_a + p, & \beta_{21}(p, q; x_a, x_b) &= 2 x_b + p,\\
\alpha_{22}(p, q; x_a, x_b) &= p^2 - 4q, & \beta_{22}(p, q; x_a, x_b) &= p^2 - 4q.
\end{aligned}$$

For $\Delta_1$ we get the sole system
$\Omega_{11} = ((p^2 - 4q),\ 1,\ (-(x_a^2 + p x_a + q)(2 x_a + p),\ (x_b^2 + p x_b + q)(2 x_b + p)))$,
that is, the $\alpha_{1l}$ will change sign, but the $\beta_{1l}$ will not.
For $\Delta_2$, we have to consider three cases: the $\alpha_{2l}$ change sign once, and
the $\beta_{2l}$ don’t (one zero), the $\alpha_{2l}$ change sign twice, and the $\beta_{2l}$
don’t (two zeroes), or the $\alpha_{2l}$ change sign twice and the $\beta_{2l}$ change sign
once (one zero). These cases can all occur in several ways, and so we end up with
a whole pile of systems.
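
Rather than enumerating every system by hand, one can evaluate the sequences of this example at concrete parameter values and compare sign variations, which is what membership in one of the systems encodes. The Python sketch below does this for the quadratic family $x^2 + px + q$; it is an illustration under the assumption that zero entries are skipped when counting variations, and the names `sequence_at` and `zeros_in_open_interval` are invented here.

```python
from fractions import Fraction


def variations(values):
    """Count sign changes, ignoring zero entries (assumed convention)."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def sequence_at(p, q, x):
    """Evaluate at x the sequence attached to the system that (p, q) satisfies."""
    disc = p * p - 4 * q
    terms = [x * x + p * x + q, 2 * x + p]   # F and F'
    if disc != 0:                            # system Delta_2; Delta_1 uses only (F, F')
        terms.append(disc)
    return terms


def zeros_in_open_interval(p, q, a, b):
    """Distinct zeroes of x^2 + p x + q in ]a, b[; needs a < b and F(a) F(b) != 0."""
    return variations(sequence_at(p, q, a)) - variations(sequence_at(p, q, b))


print(zeros_in_open_interval(0, -1, -2, 2))            # x^2 - 1 on ]-2, 2[
print(zeros_in_open_interval(0, 1, -2, 2))             # x^2 + 1 on ]-2, 2[
print(zeros_in_open_interval(-2, 1, 0, 3))             # (x - 1)^2 on ]0, 3[
print(zeros_in_open_interval(0, -1, Fraction(1, 2), 2))
```

The three-term sequence $(F, F', p^2 - 4q)$ is four times the last term of the standard sequence of $F$, so it differs from the standard sequence only by positive multipliers and gives the same variation counts.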

Note: Existence of a zero


We have introduced two extra indeterminates in our systems in order to
account for the zeroes of our polynomial. However, if we choose to
investigate the problem of the existence of a zero in the entire field, we can
drop those two parameters. This can be done by noting that we do not
have to consider $a$ and $b$ up to the point that we define the sequences
$(\alpha_{jl}(c_i))$ and $(\beta_{jl}(c_i))$. In particular, at that point we can see that
for any $(c_i) \in R^{(r)}$, if $\rho \in R$ is to be a zero of $F(c_i; x)$, then necessarily
$-\mu < \rho < \mu$, where $\mu = (k + 1) + \sum_{\nu=0}^{m-1} u_\nu(c_i)^2 u_k(c_i)^{-2}$.
We may from that point on let $a(t_i), b(t_i) \in A$ depend on the parameters and
modify the $\alpha(t_i)$ and $\beta(t_i)$ accordingly, and finish the argument in the
same way. We can therefore state the following corollary.

⁶ Technically, we now have to transform the r-system $\Delta_j$ to an $(r+2)$-system by
using the inclusion homomorphism $A \to A[x_a, x_b]$ on all the elements of the system.

Corollary 4.3.4. Let $F(t_i; x) \in A[x]$. Then we can construct a finite set of
r-systems $\omega$ in $K$ such that for any real closed field $R$, and
$(c_1, \ldots, c_r) \in R^{(r)}$: $F(c_i; x)$ has a zero in $R$ if and only if
$(c_1, \ldots, c_r) \in \Omega(R)$ for some $\Omega \in \omega$.
Restated: Let $F(t_i; x) \in A[x]$ be a family of polynomials whose
coefficients are parametrized by polynomials with integer coefficients. Then we
can construct a finite set of systems of polynomial equations, inequations and
inequalities with integer coefficients, independent of the real closed field in
question, so that for some choice $(c_i)$ of the parameters, $F(c_i; x) \in R[x]$
has a zero in $R$ if and only if the $(c_i)$ satisfy one of the constructed systems.

Now let $f(x) = \sum_{i=0}^{n} a_i x^i \in R[x]$ and let $F(t_i; x) = \sum_{i=0}^{n} t_i x^i$.
We then see that $F(a_i; x) = f(x)$. Suppose that all the $a_i \in \mathbb{Q} \subset R$ and
that $f(x)$ has a zero in $R$. Then by corollary 4.3.4 we can construct a set of
$(n+1)$-systems in $\mathbb{Z}$ such that the $(a_i)$ satisfy one of those systems. Now let
$R'$ be another real closed field. Then clearly all $a_i \in R'$ (by an isomorphism
of the prime fields) and they still satisfy one of those systems. Therefore, the
corresponding polynomial in $R'[x]$ will also have a zero in $R'$.
Corollary 4.3.5. If a polynomial f pxq with rational coefficients has a zero
in one real closed field, it will have a zero in any real closed field.
This last corollary is, among other things, of the utmost importance for
computer calculations. For example, the computable numbers, described by Turing
as “the numbers whose expressions as a decimal are calculable by a machine” [9],
can be shown to be real closed [2]. This then gives the result that a polynomial
with rational coefficients has a zero in the real numbers if and only if it has a
zero in the computable numbers. Therefore, for any polynomial with rational
coefficients, we are in principle able to compute all its real zeroes with a
computer (or any realization of a Turing machine).

4.3.1 Tarski’s Principle


The question now naturally arises whether we can generalize this procedure
to families of polynomials in multiple indeterminates. The answer turns out
to be positive. The idea we can pursue is to replace an equation in multiple
indeterminates by a set of equations in one fewer indeterminate. We may go
on with this procedure to eventually obtain a set of equations that have to
be satisfied for the original equation to be solvable. If we then invoke the
parametrized version of Sturm’s Theorem for each of these, we obtain a set
of systems that will have to be satisfied by the parameters for our equation
to be solvable. [4, Sec. 5.6]

This method has an important application in the field of metamathematics,
where the properties of mathematics itself are studied. In particular, it
implies that every “elementary” sentence in the logic of a real closed field is
decidable. This was shown by Tarski in 1948 for the real numbers [8]. Note that
by the logic of a real closed field, we mean the first-order logic that remains
when only the axioms of the field itself are assumed. Set-theoretic sentences
are not allowed. In Tarski’s words, however, it “gives the mathematician the
assurance that he will be able to solve every such problem [an elementary
problem in a real closed field] by working at it long enough.” And with that
assurance we can continue to make algebraic exercises for high school students.

Epilogue

Sturm’s Theorem has provided us with a very simple way to determine the
number of zeroes of a polynomial that lie within a certain interval. It is
interesting to note that despite the simplicity of this method, it is not widely
taught in undergraduate calculus courses. Perhaps this can be attributed to the
inefficiency of the algorithm compared to more modern root-finding methods,
the amount of algebra involved, or simply its age (almost 200 years!). In
any case I would like to express my hopes that the tides could change in
this respect.
Nevertheless, the theorem not only provides us with this calculation
method, it also leads to several important theoretical implications. As ex-
amples we have seen the decidability of the theory of real closed fields (in
metamathematics), and the fact that if a polynomial with rational coeffi-
cients is going to have a zero in one real closed field, then it is going to
have one in every real closed field. The last result finds an application in
computer science, where we can conclude that we can compute every zero of
a polynomial with rational (even computable!) coefficients with a computer
program.
I have personally enjoyed this project very much due to the large amount
of new algebra I have come to learn, and the discovery of an obscure, but fun
and useful result. I know that I will definitely have use for Sturm’s Theorem
in the future.
Lastly, I would like to acknowledge Prof. Dr. Jaap Top and Dr. Ramsay
Dyer for their support during the course of this project. Prof. Top has
recommended this project, and they have both provided me with very useful
feedback on the report, for which I am very grateful.

Bibliography

[1] E. Artin and O. Schreier. Algebraische Konstruktion reeller Körper.
Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg,
5(1):85–99, December 1927. Conference proceedings from June 1926.

[2] M. Braverman. On the complexity of real functions. In Proceedings of
the 2005 46th Annual IEEE Symposium on Foundations of Computer
Science. IEEE, 2005.

[3] D.J. Griffiths. Introduction to Quantum Mechanics. Pearson Education,
2nd edition, 2005.

[4] N. Jacobson. Basic Algebra, volume 1. Dover, dover edition, 2009.

[5] N. Jacobson. Basic Algebra, volume 2. Dover, dover edition, 2009.

[6] J.P. Serre. Extensions de corps ordonnés. In Comptes rendus des séances
de l’Académie des Sciences, pages 576–577, September 1949.

[7] J.C.F. Sturm. Mémoire sur la résolution des équations numériques.
Bulletin des Sciences de Férussac, 11:419–425, 1829.

[8] A. Tarski. A Decision Method for Elementary Algebra and Geometry.
RAND Corporation, 1948.

[9] A.M. Turing. On computable numbers, with an application to the
Entscheidungsproblem. Proceedings of the London Mathematical Society,
42:230–265, 1937.
