
2  Polynomials over a field

A polynomial over a field F is a sequence

(a_0, a_1, a_2, . . . , a_n, . . .)

where a_i ∈ F for all i, with a_i = 0 from some point on. a_i is called the i-th coefficient of f.


We define three special polynomials:
0 = (0, 0, 0, . . .)
1 = (1, 0, 0, . . .)
x = (0, 1, 0, . . .).
The polynomial (a_0, 0, 0, . . .) is called a constant and is written simply as a_0.
Let F[x] denote the set of all polynomials in x.
If f ≠ 0, then the degree of f, written deg f, is the greatest n such that a_n ≠ 0. Note that the polynomial 0 has no degree. a_n is called the leading coefficient of f.
F[x] forms a vector space over F if we define scalar multiplication by
λ(a_0, a_1, . . .) = (λa_0, λa_1, . . .), λ ∈ F.
DEFINITION 2.1
(Multiplication of polynomials)
Let f = (a_0, a_1, . . .) and g = (b_0, b_1, . . .). Then fg = (c_0, c_1, . . .), where

c_n = a_0 b_n + a_1 b_{n-1} + · · · + a_n b_0 = Σ_{i=0}^{n} a_i b_{n-i} = Σ_{i+j=n} a_i b_j,

the last sum being over all i ≥ 0, j ≥ 0 with i + j = n.

EXAMPLE 2.1
x^2 = (0, 0, 1, 0, . . .), x^3 = (0, 0, 0, 1, 0, . . .).
More generally, an induction shows that x^n = (a_0, a_1, . . .), where a_n = 1 and all other a_i are zero.
If deg f = n, we have f = a_0 1 + a_1 x + · · · + a_n x^n.

THEOREM 2.1 (Associative Law)


f(gh) = (fg)h

PROOF Take f, g as above and h = (c_0, c_1, . . .). Then f(gh) = (d_0, d_1, . . .), where

d_n = Σ_{i+j=n} f_i (gh)_j = Σ_{i+j=n} f_i Σ_{u+v=j} g_u h_v = Σ_{i+u+v=n} f_i g_u h_v.

Likewise (fg)h = (e_0, e_1, . . .), where

e_n = Σ_{i+j=n} (fg)_i h_j = Σ_{i+j=n} h_j Σ_{u+v=i} f_u g_v = Σ_{u+v+j=n} f_u g_v h_j.

Hence d_n = e_n for all n, so f(gh) = (fg)h.

Some properties of polynomial arithmetic:

fg = gf
0f = 0
1f = f
f(g + h) = fg + fh
f ≠ 0 and g ≠ 0 ⇒ fg ≠ 0 and deg(fg) = deg f + deg g.

The last statement is equivalent to

fg = 0 ⇒ f = 0 or g = 0.

Then we deduce that

fh = fg and f ≠ 0 ⇒ h = g.

2.1  Lagrange Interpolation Polynomials

Let P_n[F] denote the set of polynomials a_0 + a_1 x + · · · + a_n x^n, where a_0, . . . , a_n ∈ F. Then a_0 + a_1 x + · · · + a_n x^n = 0 implies that a_0 = 0, . . . , a_n = 0.
P_n[F] is a subspace of F[x], and 1, x, x^2, . . . , x^n form the standard basis for P_n[F].

If f ∈ P_n[F] and c ∈ F, we write
f(c) = a_0 + a_1 c + · · · + a_n c^n.
This is the value of f at c. This symbol has the following properties:
(f + g)(c) = f(c) + g(c)
(λf)(c) = λ(f(c))
(fg)(c) = f(c)g(c)
DEFINITION 2.2
Let c_1, . . . , c_{n+1} be distinct members of F. Then the Lagrange interpolation polynomials p_1, . . . , p_{n+1} are polynomials of degree n defined by

p_i = Π_{j=1, j≠i}^{n+1} (x − c_j)/(c_i − c_j),  1 ≤ i ≤ n + 1.

EXAMPLE 2.2
p_1 = ((x − c_2)/(c_1 − c_2)) ((x − c_3)/(c_1 − c_3)) · · · ((x − c_{n+1})/(c_1 − c_{n+1})),
p_2 = ((x − c_1)/(c_2 − c_1)) ((x − c_3)/(c_2 − c_3)) · · · ((x − c_{n+1})/(c_2 − c_{n+1})),
etc.

We now show that the Lagrange polynomials also form a basis for P_n[F].
PROOF Noting that there are n + 1 elements in the standard basis above, we see that dim P_n[F] = n + 1, and so it suffices to show that p_1, . . . , p_{n+1} are linearly independent.
We use the following property of the polynomials p_i:

p_i(c_j) = δ_{ij} = 1 if i = j, 0 if i ≠ j.

Assume that
a_1 p_1 + · · · + a_{n+1} p_{n+1} = 0,
where a_i ∈ F, 1 ≤ i ≤ n + 1. Evaluating both sides at c_1, . . . , c_{n+1} gives
a_1 p_1(c_1) + · · · + a_{n+1} p_{n+1}(c_1) = 0
. . .
a_1 p_1(c_{n+1}) + · · · + a_{n+1} p_{n+1}(c_{n+1}) = 0,
that is,
a_1 · 1 + a_2 · 0 + · · · + a_{n+1} · 0 = 0
a_1 · 0 + a_2 · 1 + · · · + a_{n+1} · 0 = 0
. . .
a_1 · 0 + a_2 · 0 + · · · + a_{n+1} · 1 = 0.
Hence a_i = 0 for all i, as required.
COROLLARY 2.1
If f ∈ P_n[F], then
f = f(c_1)p_1 + · · · + f(c_{n+1})p_{n+1}.
Proof: We know that
f = λ_1 p_1 + · · · + λ_{n+1} p_{n+1}
for some λ_i ∈ F. Evaluating both sides at c_1, . . . , c_{n+1} then gives
f(c_1) = λ_1, . . . , f(c_{n+1}) = λ_{n+1},
as required.
COROLLARY 2.2
If f ∈ P_n[F] and f(c_1) = 0, . . . , f(c_{n+1}) = 0, where c_1, . . . , c_{n+1} are distinct, then f = 0. (I.e. a non-zero polynomial of degree n can have at most n roots.)
COROLLARY 2.3
If b_1, . . . , b_{n+1} are any scalars in F, and c_1, . . . , c_{n+1} are again distinct, then there exists a unique polynomial f ∈ P_n[F] such that
f(c_1) = b_1, . . . , f(c_{n+1}) = b_{n+1};
namely
f = b_1 p_1 + · · · + b_{n+1} p_{n+1}.

EXAMPLE 2.3
Find the quadratic polynomial
f = a_0 + a_1 x + a_2 x^2 ∈ P_2[R]
such that
f(1) = 8, f(2) = 5, f(3) = 4.
Solution: f = 8p_1 + 5p_2 + 4p_3, where
p_1 = (x − 2)(x − 3)/((1 − 2)(1 − 3))
p_2 = (x − 1)(x − 3)/((2 − 1)(2 − 3))
p_3 = (x − 1)(x − 2)/((3 − 1)(3 − 2))
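Corollary 2.3 can be checked computationally. Below is a minimal Python sketch (the helper names such as `lagrange` are our own, not from the text) that builds f = 8p_1 + 5p_2 + 4p_3 of Example 2.3 with exact rational arithmetic:

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply coefficient lists [a0, a1, ...] (lowest degree first)."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += Fraction(a) * b
    return r

def poly_eval(p, c):
    return sum(a * c**i for i, a in enumerate(p))

def lagrange(points):
    """f = sum_i b_i p_i through the points (c_i, b_i), as in Corollary 2.3."""
    f = [Fraction(0)] * len(points)
    for i, (ci, bi) in enumerate(points):
        pi = [Fraction(bi)]                    # start from the scalar b_i
        for j, (cj, _) in enumerate(points):
            if j != i:                         # multiply by (x - c_j)/(c_i - c_j)
                pi = poly_mul(pi, [Fraction(-cj, ci - cj), Fraction(1, ci - cj)])
        f = [a + b for a, b in zip(f, pi)]
    return f

f = lagrange([(1, 8), (2, 5), (3, 4)])
print(f)                                       # coefficients of x^2 - 6x + 13
print([poly_eval(f, c) for c in (1, 2, 3)])    # f(1) = 8, f(2) = 5, f(3) = 4
```

Expanding the three fractions by hand gives the same answer, f = x^2 − 6x + 13.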

2.2  Division of polynomials

DEFINITION 2.3
If f, g ∈ F[x], we say f divides g if ∃ h ∈ F[x] such that
g = fh.
For this we write f | g, and f ∤ g denotes the negation: f does not divide g.
Some properties:
f | g and g ≠ 0 ⇒ deg f ≤ deg g,
and thus of course
f | 1 ⇒ deg f = 0.
2.2.1  Euclid's Division Theorem

Let f, g ∈ F[x] and g ≠ 0. Then ∃ q, r ∈ F[x] such that

f = qg + r,   (3)

where r = 0 or deg r < deg g. Moreover q and r are unique.
Outline of Proof:
If f = 0 or deg f < deg g, (3) is trivially true (taking q = 0 and r = f).
So assume deg f ≥ deg g, where
f = a_m x^m + a_{m-1} x^{m-1} + · · · + a_0,
g = b_n x^n + · · · + b_0,
and we have a long division process: subtract (a_m/b_n) x^{m-n} g from f to cancel the leading term of f, then repeat on the difference, and so on, until a remainder which is 0 or of degree less than n is reached.
(See S. Perlis, Theory of Matrices, p. 111.)


2.2.2  Euclid's Division Algorithm

f = q_1 g + r_1            with deg r_1 < deg g
g = q_2 r_1 + r_2          with deg r_2 < deg r_1
r_1 = q_3 r_2 + r_3        with deg r_3 < deg r_2
. . .
r_{n-2} = q_n r_{n-1} + r_n   with deg r_n < deg r_{n-1}
r_{n-1} = q_{n+1} r_n

Then r_n = gcd(f, g), the greatest common divisor of f and g; i.e. r_n is a polynomial d with the property that
1. d | f and d | g, and
2. ∀ e ∈ F[x], e | f and e | g ⇒ e | d.
(This defines gcd(f, g) uniquely up to a constant multiple.)
We select the monic (i.e. leading coefficient = 1) gcd as the gcd.
Also, ∃ u, v ∈ F[x] such that
r_n = gcd(f, g) = uf + vg;
we find u and v by forward substitution in Euclid's algorithm; viz.
r_1 = f + (−q_1)g
r_2 = g + (−q_2)r_1
    = g + (−q_2)(f + (−q_1)g)
    = (−q_2)f + (1 + q_1 q_2)g
. . .
r_n = (· · ·)f + (· · ·)g,
the two bracketed polynomials being u and v.

In general, r_k = s_k f + t_k g for −1 ≤ k ≤ n, where
r_{−1} = f, r_0 = g, s_{−1} = 1, s_0 = 0, t_{−1} = 0, t_0 = 1
and
s_k = −q_k s_{k-1} + s_{k-2},  t_k = −q_k t_{k-1} + t_{k-2}
for 1 ≤ k ≤ n. (Proof by induction.)
The special case gcd(f, g) = 1 (i.e. f and g are relatively prime) is of great importance: here ∃ u, v ∈ F[x] such that
uf + vg = 1.
EXERCISE 2.1
Find gcd(3x2 + 2x + 4, 2x4 + 5x + 1) in Q[x] and express it as uf + vg
for two polynomials u and v.
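The recursion for s_k and t_k above can be sketched in Python. This is our own illustration of the extended Euclidean algorithm over Q[x] (polynomials as coefficient lists, lowest degree first), applied to the two polynomials of Exercise 2.1:

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading (highest-degree) coefficients."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def poly_mul(a, b):
    r = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] += Fraction(x) * y
    return trim(r)

def poly_add(a, b):
    n = max(len(a), len(b))
    return trim([Fraction(a[i] if i < len(a) else 0) +
                 (b[i] if i < len(b) else 0) for i in range(n)])

def poly_sub(a, b):
    return poly_add(a, [-Fraction(c) for c in b])

def poly_divmod(f, g):
    """f = q*g + r with r = 0 or deg r < deg g."""
    q = [Fraction(0)] * max(1, len(f) - len(g) + 1)
    r = trim([Fraction(c) for c in f])
    while len(r) >= len(g) and any(r):
        c = r[-1] / g[-1]
        d = len(r) - len(g)
        q[d] = c
        for i, b in enumerate(g):
            r[i + d] -= c * Fraction(b)
        r = trim(r)           # the leading term has been cancelled
    return trim(q), r

def ext_gcd(f, g):
    """Monic d = gcd(f, g) plus u, v with u*f + v*g = d."""
    r0, r1 = trim(list(f)), trim(list(g))
    s0, s1 = [Fraction(1)], [Fraction(0)]
    t0, t1 = [Fraction(0)], [Fraction(1)]
    while any(r1):
        qq, rr = poly_divmod(r0, r1)
        r0, r1 = r1, rr
        s0, s1 = s1, poly_sub(s0, poly_mul(qq, s1))   # s_k = -q_k s_{k-1} + s_{k-2}
        t0, t1 = t1, poly_sub(t0, poly_mul(qq, t1))
    lead = r0[-1]                                     # make the gcd monic
    return ([Fraction(c) / lead for c in r0],
            [Fraction(c) / lead for c in s0],
            [Fraction(c) / lead for c in t0])

f1 = [4, 2, 3]          # f = 3x^2 + 2x + 4
g1 = [1, 5, 0, 0, 2]    # g = 2x^4 + 5x + 1
d, u, v = ext_gcd(f1, g1)
print(d)                # [Fraction(1, 1)], i.e. gcd(f, g) = 1
```

Printing u and v then exhibits the Bezout identity uf + vg = 1 asked for in the exercise.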

2.3  Irreducible Polynomials

DEFINITION 2.4
Let f be a non-constant polynomial. Then, if

g | f ⇒ g is a constant or g = constant × f,

we call f an irreducible polynomial.

Note: (Remainder theorem)
f = (x − a)q + f(a), where a ∈ F. So f(a) = 0 iff (x − a) | f.

EXAMPLE 2.4
f(x) = x^2 + x + 1 ∈ Z_2[x] is irreducible, for f(0) = f(1) = 1 ≠ 0, and hence there are no polynomials of degree 1 which divide f.
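For small p and low degree, irreducibility can be tested by brute-force trial division; the following sketch (our own helper names, a direct translation of the definition rather than an efficient algorithm) works in Z_p[x] with polynomials as coefficient lists, lowest degree first:

```python
from itertools import product

def poly_rem(f, g, p):
    """Remainder of f on division by g in Z_p[x]."""
    r = f[:]
    inv = pow(g[-1], -1, p)            # inverse of the leading coefficient mod p
    while len(r) >= len(g) and any(r):
        c = (r[-1] * inv) % p
        d = len(r) - len(g)
        for i, b in enumerate(g):
            r[i + d] = (r[i + d] - c * b) % p
        while len(r) > 1 and r[-1] == 0:
            r.pop()                    # drop the cancelled leading term
    return r

def is_irreducible(f, p):
    """Trial division by every monic g with 1 <= deg g <= deg(f)//2."""
    n = len(f) - 1
    for deg in range(1, n // 2 + 1):
        for tail in product(range(p), repeat=deg):
            g = list(tail) + [1]       # monic candidate divisor
            if poly_rem(f, g, p) == [0]:
                return False
    return True

print(is_irreducible([1, 1, 1], 2))    # x^2 + x + 1 over Z_2: True
print(is_irreducible([1, 0, 1], 2))    # x^2 + 1 = (x + 1)^2 over Z_2: False
```

Checking divisors only up to degree deg(f)/2 suffices, since any factorization of f has a factor of at most that degree.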

THEOREM 2.2
Let f be irreducible. Then if f ∤ g, gcd(f, g) = 1 and ∃ u, v ∈ F[x] such that
uf + vg = 1.
PROOF Suppose f is irreducible and f ∤ g. Let d = gcd(f, g), so
d | f and d | g.
Then either d = cf for some constant c, or d = 1. But if d = cf, then
f | d and d | g ⇒ f | g,
a contradiction. So d = 1, as required.
COROLLARY 2.4
If f is irreducible and f | gh, then f | g or f | h.
Proof: Suppose f is irreducible, f | gh and f ∤ g. We show that f | h.
By the above theorem, ∃ u, v such that
uf + vg = 1
⇒ ufh + vgh = h.
Now f | ufh, and f | gh ⇒ f | vgh; hence f | h.

THEOREM 2.3
Any non-constant polynomial is expressible as a product of irreducible polynomials, where the representation is unique up to the order of the irreducible factors.
Some examples:
(x + 1)^2 = x^2 + 2x + 1 = x^2 + 1 in Z_2[x]
(x^2 + x + 1)^2 = x^4 + x^2 + 1 in Z_2[x]
(2x^2 + x + 1)(2x + 1) = x^3 + x^2 + 1 = (x^2 + 2x + 2)(x + 2) in Z_3[x].
PROOF

Existence of factorization: If f ∈ F[x] is not a constant polynomial, then f being irreducible implies the result.
Otherwise, f = f_1 F_1, with 0 < deg f_1, deg F_1 < deg f. If f_1 and F_1 are irreducible, stop. Otherwise, keep going. Eventually we end with a decomposition of f into irreducible polynomials.
Uniqueness: Let
c f_1 f_2 · · · f_m = d g_1 g_2 · · · g_n
be two decompositions into products of constants (c and d) and monic irreducibles (f_i, g_j). Now
f_1 | f_1 f_2 · · · f_m ⇒ f_1 | g_1 g_2 · · · g_n,
and since the f_i, g_j are irreducible, we can cancel f_1 against some g_j.
Repeating this for f_2, . . . , f_m, we eventually obtain m = n and c = d; in other words, each expression is simply a rearrangement of the factors of the other, as required.
THEOREM 2.4
Let F_q be a field with q elements. Then if n ∈ N, there exists an irreducible polynomial of degree n in F_q[x].
PROOF First we introduce the idea of the Riemann zeta function:

ζ(s) = Σ_{n=1}^{∞} 1/n^s = Π_{p prime} 1/(1 − p^{−s}).

To see the equality of the latter expressions, note that

1/(1 − x) = Σ_{i=0}^{∞} x^i = 1 + x + x^2 + · · ·

and so

R.H.S. = Π_{p prime} Σ_{i=0}^{∞} 1/p^{is}
       = (1 + 1/2^s + 1/2^{2s} + · · ·)(1 + 1/3^s + 1/3^{2s} + · · ·) · · ·
       = 1 + 1/2^s + 1/3^s + 1/4^s + · · · ;

note for the last step that the terms will be of the form

1/(p_1^{a_1} · · · p_R^{a_R})^s

up to some prime p_R, with a_i ≥ 0 for i = 1, . . . , R, and as R → ∞ the prime factorizations p_1^{a_1} · · · p_R^{a_R} map onto the natural numbers N.
We let N_m denote the number of monic irreducibles of degree m in F_q[x]. For example, N_1 = q, since the x + a, a ∈ F_q, are the irreducible polynomials of degree 1.
Now let |f| = q^{deg f}, and |0| = 0. Then we have
|fg| = |f| |g|, since deg fg = deg f + deg g,
and, because of the uniqueness of factorization theorem,

Σ_{f monic} 1/|f|^s = Π_{f monic and irreducible} 1/(1 − 1/|f|^s).

Now the left hand side is

Σ_{n=0}^{∞} Σ_{f monic, deg f = n} 1/|f|^s = Σ_{n=0}^{∞} q^n/q^{ns}   (there are q^n monic polynomials of degree n)
= Σ_{n=0}^{∞} 1/q^{n(s−1)} = 1/(1 − 1/q^{s−1}),

and

R.H.S. = Π_{n=1}^{∞} (1 − 1/q^{ns})^{−N_n}.

Equating the two, we have

1/(1 − 1/q^{s−1}) = Π_{n=1}^{∞} (1 − 1/q^{ns})^{−N_n}.   (4)

We now take logs of both sides, and then use the fact that

log(1/(1 − x)) = Σ_{n=1}^{∞} x^n/n   if |x| < 1;

so the left side of (4) becomes

log(1/(1 − q^{−(s−1)})) = Σ_{k=1}^{∞} 1/(k q^{(s−1)k}) = Σ_{k=1}^{∞} q^k/(k q^{sk}),

and the right side becomes

Σ_{n=1}^{∞} N_n log(1/(1 − 1/q^{ns})) = Σ_{n=1}^{∞} N_n Σ_{m=1}^{∞} 1/(m q^{mns})
= Σ_{n=1}^{∞} N_n Σ_{m=1}^{∞} n/(mn q^{mns}) = Σ_{k=1}^{∞} (Σ_{mn=k} n N_n) 1/(k q^{ks}).

Putting x = q^{−s}, we have

Σ_{k=1}^{∞} (q^k/k) x^k = Σ_{k=1}^{∞} (Σ_{mn=k} n N_n) x^k/k,

and since both sides are power series, we may equate coefficients of x^k to obtain

q^k = Σ_{mn=k} n N_n = Σ_{n|k} n N_n.   (5)

We can deduce from this that N_n > 0 for all n (see Berlekamp's Algebraic Coding Theory).
Now note that N_1 = q, so if k is a prime, say k = p, then (5) gives
q^p = N_1 + pN_p = q + pN_p
⇒ N_p = (q^p − q)/p > 0, as q > 1 and p ≥ 2.

This proves the theorem for n = p, a prime. But what if k is not prime? Equation (5) also tells us that
q^k ≥ k N_k.
Now let k ≥ 2. Then

q^k = k N_k + Σ_{n|k, n≠k} n N_n
    ≤ k N_k + Σ_{n|k, n≠k} q^n       (as n N_n ≤ q^n)
    ≤ k N_k + Σ_{n=1}^{⌊k/2⌋} q^n    (proper divisors of k are at most k/2)
    < k N_k + Σ_{n=0}^{⌊k/2⌋} q^n    (adding 1)
    = k N_k + (q^{⌊k/2⌋+1} − 1)/(q − 1)   (sum of geometric series).

But
(q^{t+1} − 1)/(q − 1) < q^{t+1}   if q ≥ 2,
so
q^k < k N_k + q^{⌊k/2⌋+1}
⇒ N_k > (q^k − q^{⌊k/2⌋+1})/k ≥ 0 if q^k ≥ q^{⌊k/2⌋+1}.
Since q > 1 (we cannot have a field with a single element, since the additive and multiplicative identities cannot be equal, by one of the axioms), the latter condition is equivalent to
k ≥ ⌊k/2⌋ + 1,
which is true, and the theorem is proven.
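Equation (5) determines the N_k recursively, since N_k appears with coefficient k and every other term involves a proper divisor of k. A short sketch (our own code, not from the text) solving (5) numerically, which also confirms N_k > 0 in the range computed:

```python
def monic_irreducible_counts(q, kmax):
    """Solve q^k = sum over n | k of n*N_n (equation (5)) for N_1..N_kmax."""
    N = {}
    for k in range(1, kmax + 1):
        lower = sum(n * N[n] for n in range(1, k) if k % n == 0)
        N[k] = (q**k - lower) // k     # (5) forces this value
        assert N[k] > 0                # the conclusion of Theorem 2.4
    return N

print(monic_irreducible_counts(2, 5))
# {1: 2, 2: 1, 3: 2, 4: 3, 5: 6}: e.g. N_2 = 1 matches x^2 + x + 1 being
# the only monic irreducible quadratic over F_2
```

For q = 2 the values agree with a direct enumeration of monic irreducibles over Z_2.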
2.4  Minimum Polynomial of a (Square) Matrix

Let A ∈ M_{n×n}(F), and g = ch_A. Then g(A) = 0 by the Cayley–Hamilton theorem.
DEFINITION 2.5
Any nonzero polynomial g of minimum degree satisfying g(A) = 0 is called a minimum polynomial of A.
Note: If f is a minimum polynomial of A, then f cannot be a constant polynomial. For if f = c, a constant, then 0 = f(A) = cI_n implies c = 0.
THEOREM 2.5
If f is a minimum polynomial of A and g(A) = 0, then f | g. (In particular, f | ch_A.)
PROOF Let g(A) = 0 and f be a minimum polynomial. Then
g = qf + r,
where r = 0 or deg r < deg f. Hence
g(A) = q(A) · 0 + r(A) ⇒ 0 = r(A).
So if r ≠ 0, the inequality deg r < deg f would contradict the definition of f. Consequently r = 0 and f | g.
Note: It follows that if f and g are minimum polynomials of A, then f | g and g | f, and consequently f = cg, where c is a scalar. Hence there is a unique monic minimum polynomial, and we denote it by m_A.
EXAMPLES (of minimum polynomials):
1. A = 0 ⇒ m_A = x
2. A = I_n ⇒ m_A = x − 1
3. A = cI_n ⇒ m_A = x − c
4. A^2 = A and A ≠ 0 and A ≠ I_n ⇒ m_A = x^2 − x.
EXAMPLE 2.5
F = Q and

A = [  5   6   6 ]
    [  1   4   2 ]
    [ −3  −6  −4 ].

Now A ≠ c_0 I_3 for any c_0 ∈ Q, so m_A ≠ x − c_0. However
A^2 = 3A − 2I_3,
so m_A = x^2 − 3x + 2.
This is a special case of a general algorithm:
(Minimum polynomial algorithm) Let A ∈ M_{n×n}(F). Then we find the least positive integer r such that A^r is expressible as a linear combination of the matrices
I_n, A, . . . , A^{r−1},
say
A^r = c_0 I_n + c_1 A + · · · + c_{r−1} A^{r−1}.
(Such an integer must exist, as I_n, A, . . . , A^{n^2} form a linearly dependent family in the vector space M_{n×n}(F), and this latter space has dimension equal to n^2.)
Then m_A = x^r − c_{r−1} x^{r−1} − · · · − c_1 x − c_0.
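The algorithm above can be sketched in Python with exact rational arithmetic. The function names and the generic Gaussian-elimination solver are our own, not from the text; applied to the matrix of Example 2.5, the sketch recovers m_A = x^2 − 3x + 2:

```python
from fractions import Fraction

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def solve(cols, target):
    """Exact solution c of sum_i c_i * cols[i] = target over Q, or None."""
    m, r = len(target), len(cols)
    aug = [[Fraction(cols[j][i]) for j in range(r)] + [Fraction(target[i])]
           for i in range(m)]
    row, piv = 0, []
    for col in range(r):
        p = next((k for k in range(row, m) if aug[k][col] != 0), None)
        if p is None:
            continue
        aug[row], aug[p] = aug[p], aug[row]
        pivot = aug[row][col]
        aug[row] = [x / pivot for x in aug[row]]
        for k in range(m):
            if k != row and aug[k][col] != 0:
                factor = aug[k][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[row])]
        piv.append(col)
        row += 1
    if any(aug[k][r] != 0 for k in range(row, m)):
        return None                    # inconsistent: A^r not in the span yet
    sol = [Fraction(0)] * r
    for i, col in enumerate(piv):
        sol[col] = aug[i][r]
    return sol

def min_poly_coeffs(A):
    """Least r with A^r = c_0 I + ... + c_{r-1} A^(r-1); returns [c_0, ..., c_{r-1}],
    so that m_A = x^r - c_{r-1} x^(r-1) - ... - c_0."""
    n = len(A)
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    powers, P = [I], A
    while True:
        vec = lambda M: [M[i][j] for i in range(n) for j in range(n)]
        sol = solve([vec(M) for M in powers], vec(P))
        if sol is not None:
            return sol
        powers.append(P)
        P = mat_mul(P, A)

A = [[Fraction(c) for c in row]
     for row in [[5, 6, 6], [1, 4, 2], [-3, -6, -4]]]
print(min_poly_coeffs(A))   # [Fraction(-2, 1), Fraction(3, 1)]: m_A = x^2 - 3x + 2
```

The loop terminates by the dimension argument in the text: the powers of A cannot stay linearly independent beyond n^2 of them.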
THEOREM 2.6
If f = x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 ∈ F[x], then m_{C(f)} = f, where

C(f) = [ 0  0  · · ·  0  −a_0     ]
       [ 1  0  · · ·  0  −a_1     ]
       [ 0  1  · · ·  0  −a_2     ]
       [ .  .         .  .        ]
       [ 0  0  · · ·  1  −a_{n−1} ].

PROOF For brevity denote C(f) by A. Then post-multiplying A by the respective unit column vectors E_1, . . . , E_n gives
AE_1 = E_2
AE_2 = E_3 ⇒ A^2 E_1 = E_3
. . .
AE_{n−1} = E_n ⇒ A^{n−1} E_1 = E_n
AE_n = −a_0 E_1 − a_1 E_2 − · · · − a_{n−1} E_n
     = −a_0 E_1 − a_1 AE_1 − · · · − a_{n−1} A^{n−1} E_1,
and AE_n = A(A^{n−1} E_1) = A^n E_1, so
f(A)E_1 = 0 ⇒ first column of f(A) is zero.
Now although matrix multiplication is not commutative, multiplication of two matrices, each of which is a polynomial in a given square matrix A, is commutative. Hence f(A)g(A) = g(A)f(A) if f, g ∈ F[x]. Taking g = x gives
f(A)A = Af(A).
Thus
f(A)E_2 = f(A)AE_1 = Af(A)E_1 = 0,
and so the second column of f(A) is zero. Repeating this for E_3, . . . , E_n, we see that
f(A) = 0
and thus m_A | f.
To show m_A = f, we assume deg m_A = t < n; say
m_A = x^t + b_{t−1} x^{t−1} + · · · + b_0.
Now
m_A(A) = 0
⇒ A^t + b_{t−1} A^{t−1} + · · · + b_0 I_n = 0
⇒ (A^t + b_{t−1} A^{t−1} + · · · + b_0 I_n)E_1 = 0,
and recalling that AE_1 = E_2 etc., and t < n, we have
E_{t+1} + b_{t−1} E_t + · · · + b_1 E_2 + b_0 E_1 = 0,
which is a contradiction: since the E_i are independent, the coefficient of E_{t+1} cannot be 1.
Hence m_A = f.
Note: It follows that ch_A = f, because both ch_A and m_A have degree n and moreover m_A divides ch_A.
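The first half of the proof, f(C(f)) = 0, is easy to confirm numerically. A sketch (our own helper names), using f = x^3 + x + 1, the polynomial of the later examples, over the integers:

```python
def companion(a):
    """C(f) for monic f = x^n + a_{n-1}x^{n-1} + ... + a_0, a = [a_0, ..., a_{n-1}]."""
    n = len(a)
    return [[1 if i == j + 1 else 0 for j in range(n - 1)] + [-a[i]]
            for i in range(n)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(a, A):
    """f(A) = A^n + a_{n-1}A^{n-1} + ... + a_0 I for the same coefficient list a."""
    n = len(A)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    total = [[a[0] * x for x in row] for row in I]     # a_0 * I
    P = I
    for coeff in list(a[1:]) + [1]:                    # a_1, ..., a_{n-1}, then the leading 1
        P = mat_mul(P, A)
        total = [[t + coeff * q for t, q in zip(tr, pr)]
                 for tr, pr in zip(total, P)]
    return total

A = companion([1, 1, 0])              # f = x^3 + x + 1
print(A)                              # [[0, 0, -1], [1, 0, -1], [0, 1, 0]]
print(poly_of_matrix([1, 1, 0], A))   # the zero matrix: f(C(f)) = 0
```

The same check works for any monic f; only the coefficient list changes.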
EXERCISE 2.2
If A = J_n(a) for a ∈ F, an elementary Jordan matrix of size n, show that m_A = (x − a)^n, where

A = J_n(a) = [ a  0  · · ·  0  0 ]
             [ 1  a  · · ·  0  0 ]
             [ 0  1  · · ·  0  0 ]
             [ .     . .       . ]
             [ 0  0  · · ·  1  a ]

(i.e. A is an n × n matrix with a's on the diagonal and 1's on the subdiagonal).
Note: Again, the minimum polynomial happens to equal the characteristic polynomial here.
DEFINITION 2.6
(Direct Sum of Matrices)
Let A_1, . . . , A_t be matrices over F. Then the direct sum of these matrices is defined as follows:

A_1 ⊕ A_2 ⊕ · · · ⊕ A_t = [ A_1  0    · · ·  0   ]
                           [ 0    A_2         .   ]
                           [ .          . .   .   ]
                           [ 0    · · ·       A_t ].

Properties:
1. (A_1 ⊕ · · · ⊕ A_t) + (B_1 ⊕ · · · ⊕ B_t) = (A_1 + B_1) ⊕ · · · ⊕ (A_t + B_t)
2. If λ ∈ F,
λ(A_1 ⊕ · · · ⊕ A_t) = (λA_1) ⊕ · · · ⊕ (λA_t)
3. (A_1 ⊕ · · · ⊕ A_t)(B_1 ⊕ · · · ⊕ B_t) = (A_1 B_1) ⊕ · · · ⊕ (A_t B_t)
4. If f ∈ F[x] and A_1, . . . , A_t are square,
f(A_1 ⊕ · · · ⊕ A_t) = f(A_1) ⊕ · · · ⊕ f(A_t)
DEFINITION 2.7
If f_1, . . . , f_t ∈ F[x], we call f ∈ F[x] a least common multiple (lcm) of f_1, . . . , f_t if
1. f_1 | f, . . . , f_t | f, and
2. ∀ e ∈ F[x], f_1 | e, . . . , f_t | e ⇒ f | e.
This uniquely defines the lcm up to a constant multiple, and so we take the lcm to be the monic lcm.
EXAMPLES 2.1
If fg ≠ 0, lcm(f, g) | fg.
(Recursive property)
lcm(f_1, . . . , f_{t+1}) = lcm(lcm(f_1, . . . , f_t), f_{t+1}).
THEOREM 2.7
m_{A_1 ⊕ · · · ⊕ A_t} = lcm(m_{A_1}, . . . , m_{A_t}).
Also
ch_{A_1 ⊕ · · · ⊕ A_t} = Π_{i=1}^{t} ch_{A_i}.
PROOF Let f = L.H.S. and g = R.H.S. Then
f(A_1 ⊕ · · · ⊕ A_t) = 0
⇒ f(A_1) ⊕ · · · ⊕ f(A_t) = 0 ⊕ · · · ⊕ 0
⇒ f(A_1) = 0, . . . , f(A_t) = 0
⇒ m_{A_1} | f, . . . , m_{A_t} | f
⇒ g | f.
Conversely,
m_{A_1} | g, . . . , m_{A_t} | g
⇒ g(A_1) = 0, . . . , g(A_t) = 0
⇒ g(A_1) ⊕ · · · ⊕ g(A_t) = 0 ⊕ · · · ⊕ 0
⇒ g(A_1 ⊕ · · · ⊕ A_t) = 0
⇒ f = m_{A_1 ⊕ · · · ⊕ A_t} | g.
Thus f = g.
EXAMPLE 2.6
Let A = C(f) and B = C(g). Then m_{A ⊕ B} = lcm(f, g).

Note: If
f = c p_1^{a_1} · · · p_t^{a_t},
g = d p_1^{b_1} · · · p_t^{b_t},
where c, d ≠ 0 are in F and p_1, . . . , p_t are distinct monic irreducibles, then
gcd(f, g) = p_1^{min(a_1,b_1)} · · · p_t^{min(a_t,b_t)},
lcm(f, g) = p_1^{max(a_1,b_1)} · · · p_t^{max(a_t,b_t)}.
Note
min(a_i, b_i) + max(a_i, b_i) = a_i + b_i,
so
gcd(f, g) lcm(f, g) = fg.
EXAMPLE 2.7
If A = diag(λ_1, . . . , λ_n), then m_A = (x − c_1) · · · (x − c_t), where c_1, . . . , c_t are the distinct members of the sequence λ_1, . . . , λ_n.
PROOF. For A is the direct sum of the 1 × 1 matrices λ_1, . . . , λ_n, having minimum polynomials x − λ_1, . . . , x − λ_n. Hence
m_A = lcm(x − λ_1, . . . , x − λ_n) = (x − c_1) · · · (x − c_t).

We know that m_A | ch_A. Hence if
ch_A = p_1^{a_1} · · · p_t^{a_t},
where a_1 > 0, . . . , a_t > 0, and p_1, . . . , p_t are distinct monic irreducibles, then
m_A = p_1^{b_1} · · · p_t^{b_t},
where 0 ≤ b_i ≤ a_i, i = 1, . . . , t.
We soon show that each b_i > 0, i.e. if p | ch_A and p is irreducible, then p | m_A.

2.5  Construction of a field of p^n elements (where p is prime and n ∈ N)

Let f be a monic irreducible polynomial of degree n in Z_p[x] (that is, F_q = Z_p here).
For instance,
n = 2, p = 2: f = x^2 + x + 1;
n = 3, p = 2: f = x^3 + x + 1 or f = x^3 + x^2 + 1.
Let A = C(f), the companion matrix of f. Then we know f(A) = 0.
We assert that the set of all matrices of the form g(A), where g ∈ Z_p[x], forms a field consisting of precisely p^n elements. The typical element is
b_0 I_n + b_1 A + · · · + b_t A^t,
where b_0, . . . , b_t ∈ Z_p.
We need only show existence of a multiplicative inverse for each element except 0 (the additive identity), as the remaining axioms clearly hold.
So let g ∈ Z_p[x] be such that g(A) ≠ 0. We have to find h ∈ Z_p[x] satisfying
g(A)h(A) = I_n.
Note that g(A) ≠ 0 ⇒ f ∤ g, since
f | g ⇒ g = f f_1
and hence
g(A) = f(A)f_1(A) = 0 f_1(A) = 0.
Then since f is irreducible and f ∤ g, there exist u, v ∈ Z_p[x] such that
uf + vg = 1.
Hence u(A)f(A) + v(A)g(A) = I_n and v(A)g(A) = I_n, as required.
We now show that our new field is a Z_p-vector space with basis consisting of the matrices
I_n, A, . . . , A^{n−1}.
Firstly the spanning property: By Euclid's division theorem,
g = fq + r,
where q, r ∈ Z_p[x] and r = 0 or deg r < deg f = n. So let
r = r_0 + r_1 x + · · · + r_{n−1} x^{n−1},
where r_0, . . . , r_{n−1} ∈ Z_p. Then
g(A) = f(A)q(A) + r(A) = 0 q(A) + r(A) = r(A) = r_0 I_n + r_1 A + · · · + r_{n−1} A^{n−1}.
Secondly, linear independence over Z_p: Suppose that
r_0 I_n + r_1 A + · · · + r_{n−1} A^{n−1} = 0,
where r_0, r_1, . . . , r_{n−1} ∈ Z_p. Then r(A) = 0, where
r = r_0 + r_1 x + · · · + r_{n−1} x^{n−1}.
Hence m_A = f divides r. Consequently r = 0, as deg f = n whereas deg r < n if r ≠ 0.
Consequently, there are p^n such matrices g(A) in the field we have constructed.
Numerical Examples
EXAMPLE 2.8
Let p = 2, n = 2, f = x^2 + x + 1 ∈ Z_2[x], and A = C(f). Then

A = [ 0  −1 ] = [ 0  1 ]
    [ 1  −1 ]   [ 1  1 ],

and
F_4 = { a_0 I_2 + a_1 A | a_0, a_1 ∈ Z_2 } = { 0, I_2, A, I_2 + A }.
We construct addition and multiplication tables for this field, with B = I_2 + A (as an exercise, check these):

  +  |  0    I_2  A    B
  0  |  0    I_2  A    B
 I_2 |  I_2  0    B    A
  A  |  A    B    0    I_2
  B  |  B    A    I_2  0

  ×  |  0    I_2  A    B
  0  |  0    0    0    0
 I_2 |  0    I_2  A    B
  A  |  0    A    B    I_2
  B  |  0    B    I_2  A

EXAMPLE 2.9
Let p = 2, n = 3, f = x^3 + x + 1 ∈ Z_2[x]. Then

A = C(f) = [ 0  0  −1 ]   [ 0  0  1 ]
           [ 1  0  −1 ] = [ 1  0  1 ],
           [ 0  1   0 ]   [ 0  1  0 ]

and our eight-member field F_8 (usually denoted by GF(8); GF stands for Galois Field, in honour of Galois) is
F_8 = { a_0 I_3 + a_1 A + a_2 A^2 | a_0, a_1, a_2 ∈ Z_2 }
    = { 0, I_3, A, A^2, I_3 + A, I_3 + A^2, A + A^2, I_3 + A + A^2 }.
Now find (A^2 + A)^{−1}.
Solution: use Euclid's algorithm:
x^3 + x + 1 = (x + 1)(x^2 + x) + 1.
Hence, working in Z_2,
x^3 + x + 1 + (x + 1)(x^2 + x) = 1
⇒ A^3 + A + I_3 + (A + I_3)(A^2 + A) = I_3
⇒ (A + I_3)(A^2 + A) = I_3,
since A^3 + A + I_3 = f(A) = 0. Hence (A^2 + A)^{−1} = A + I_3.
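The same computation generalises: to invert g(A) we invert g modulo f over Z_p with the extended Euclidean algorithm. A sketch (our own helper names; polynomials as coefficient lists, lowest degree first) reproducing (A^2 + A)^{−1} = I_3 + A:

```python
def trim(r):
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r

def divmod_p(f, g, p):
    """f = q*g + r in Z_p[x]."""
    q = [0] * max(1, len(f) - len(g) + 1)
    r = trim(f[:])
    inv = pow(g[-1], -1, p)
    while len(r) >= len(g) and any(r):
        c = (r[-1] * inv) % p
        d = len(r) - len(g)
        q[d] = c
        for i, b in enumerate(g):
            r[i + d] = (r[i + d] - c * b) % p
        trim(r)
    return trim(q), r

def mul_p(a, b, p):
    r = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] = (r[i + j] + x * y) % p
    return trim(r)

def sub_p(a, b, p):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([(x - y) % p for x, y in zip(a, b)])

def inverse_mod(g, f, p):
    """v with v*g = 1 (mod f) in Z_p[x], assuming gcd(f, g) = 1."""
    r0, r1 = trim(f[:]), trim(g[:])
    t0, t1 = [0], [1]
    while any(r1):
        q, r = divmod_p(r0, r1, p)
        r0, r1 = r1, r
        t0, t1 = t1, sub_p(t0, mul_p(q, t1, p), p)
    c = pow(r0[-1], -1, p)            # scale so the gcd is exactly 1
    return trim([(x * c) % p for x in t0])

# x^2 + x represents A^2 + A; invert it modulo f = x^3 + x + 1 over Z_2
print(inverse_mod([0, 1, 1], [1, 1, 0, 1], 2))    # [1, 1], i.e. 1 + x, so I_3 + A
```

The answer [1, 1] encodes the polynomial 1 + x, in agreement with the hand computation above.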
THEOREM 2.8
Every finite field has precisely p^n elements for some prime p, the least positive integer with the property that
1 + 1 + · · · + 1 (p terms) = 0.
p is then called the characteristic of the field.
Also, if x ∈ F, a field of q elements, it can be shown that if x ≠ 0, then
x^{q−1} = 1.
In the special case F = Z_p, this reduces to Fermat's Little Theorem:
x^{p−1} ≡ 1 (mod p)
if p is a prime not dividing x.

2.6  Characteristic and Minimum Polynomial of a Transformation

DEFINITION 2.8
(Characteristic polynomial of T : V → V)
Let β be a basis for V and A = [T]_β.
Then we define ch_T = ch_A. This polynomial is independent of the basis β:
PROOF (ch_T is independent of the basis.)
If γ is another basis for V and B = [T]_γ, then we know A = P^{−1}BP, where P is the change of basis matrix [I_V]. Then

ch_A = ch_{P^{−1}BP} = det(xI_n − P^{−1}BP)   (where n = dim V)
     = det(P^{−1}(xI_n)P − P^{−1}BP)
     = det(P^{−1}(xI_n − B)P)
     = det P^{−1} ch_B det P
     = ch_B.

DEFINITION 2.9
If f = a_0 + · · · + a_t x^t, where a_0, . . . , a_t ∈ F, we define
f(T) = a_0 I_V + · · · + a_t T^t.
Then the usual properties hold:
f, g ∈ F[x] ⇒ (f + g)(T) = f(T) + g(T) and (fg)(T) = f(T)g(T) = g(T)f(T).
LEMMA 2.1
f ∈ F[x] ⇒ [f(T)]_β = f([T]_β).
Note: The Cayley–Hamilton theorem for matrices says that ch_A(A) = 0. Then if A = [T]_β, we have by the lemma
[ch_T(T)]_β = ch_T(A) = ch_A(A) = 0,
so ch_T(T) = 0_V.
DEFINITION 2.10
Let T : V → V be a linear transformation over F. Then any polynomial f of least positive degree such that
f(T) = 0_V
is called a minimum polynomial of T.
We have results for polynomials in a transformation T corresponding to those for polynomials in a square matrix A:
g = qf + r ⇒ g(T) = q(T)f(T) + r(T).
Again, there is a unique monic minimum polynomial of T; it is denoted by m_T and called the minimum polynomial of T.
Also note that, because of the lemma,
m_T = m_{[T]_β}.
For (with A = [T]_β):
(a) m_A(A) = 0, so m_A(T) = 0_V. Hence m_T | m_A.
(b) m_T(T) = 0_V, so [m_T(T)]_β = 0. Hence m_T(A) = 0, and so m_A | m_T.
EXAMPLES 2.2
T = 0_V ⇒ m_T = x.
T = I_V ⇒ m_T = x − 1.
T = cI_V ⇒ m_T = x − c.
T^2 = T and T ≠ 0_V and T ≠ I_V ⇒ m_T = x^2 − x.
2.6.1  M_{n×n}(F[x]): Ring of Polynomial Matrices

Example:

[ x^2 + 2   x^5 + 5x + 1 ]  ∈ M_{2×2}(Q[x])
[ x + 3     1            ]

  = x^5 [ 0 1 ] + x^2 [ 1 0 ] + x [ 0 5 ] + [ 2 1 ];
        [ 0 0 ]       [ 0 0 ]     [ 1 0 ]   [ 3 1 ]

we see that any element of M_{n×n}(F[x]) is expressible as
x^m A_m + x^{m−1} A_{m−1} + · · · + A_0,
where A_i ∈ M_{n×n}(F). We write the coefficient x^i before A_i, to distinguish these entities from the corresponding objects of the following ring.

2.6.2  M_{n×n}(F)[y]: Ring of Matrix Polynomials

This consists of all polynomials in y with coefficients in M_{n×n}(F).
Example:

[ 0 1 ] y^5 + [ 1 0 ] y^2 + [ 0 5 ] y + [ 2 1 ]  ∈ M_{2×2}(F)[y].
[ 0 0 ]       [ 0 0 ]       [ 1 0 ]     [ 3 1 ]
THEOREM 2.9
The mapping
θ : M_{n×n}(F)[y] → M_{n×n}(F[x])
given by
θ(A_0 + A_1 y + · · · + A_m y^m) = A_0 + xA_1 + · · · + x^m A_m,
where A_i ∈ M_{n×n}(F), is a 1–1 correspondence and has the following properties:
θ(X + Y) = θ(X) + θ(Y)
θ(XY) = θ(X)θ(Y)
θ(tX) = t θ(X) ∀ t ∈ F.
Also
θ(I_n y − A) = xI_n − A ∀ A ∈ M_{n×n}(F).
THEOREM 2.10 ((Left) Remainder theorem for matrix polynomials)
Let B_m y^m + · · · + B_0 ∈ M_{n×n}(F)[y] and A ∈ M_{n×n}(F). Then
B_m y^m + · · · + B_0 = (I_n y − A)Q + R,
where
R = A^m B_m + · · · + AB_1 + B_0
and Q = C_{m−1} y^{m−1} + · · · + C_0,
where C_{m−1}, . . . , C_0 are computed recursively:
B_m = C_{m−1}
B_{m−1} = −AC_{m−1} + C_{m−2}
. . .
B_1 = −AC_1 + C_0.

PROOF. First we verify that R = AC_0 + B_0:

R = A^m B_m + A^{m−1} B_{m−1} + · · · + AB_1 + B_0
  = A^m C_{m−1} + A^{m−1}(−AC_{m−1} + C_{m−2}) + · · · + A(−AC_1 + C_0) + B_0
  = AC_0 + B_0   (the sum telescopes).

Then
(I_n y − A)Q + R = (I_n y)(C_{m−1} y^{m−1} + · · · + C_0) − A(C_{m−1} y^{m−1} + · · · + C_0) + AC_0 + B_0
= C_{m−1} y^m + (C_{m−2} − AC_{m−1})y^{m−1} + · · · + (C_0 − AC_1)y − AC_0 + AC_0 + B_0
= B_m y^m + B_{m−1} y^{m−1} + · · · + B_1 y + B_0.
Remark: There is a similar right remainder theorem.
THEOREM 2.11
If p is an irreducible polynomial dividing ch_A, then p | m_A.
PROOF (From Burton Jones, Linear Algebra.)
Let m_A = x^t + a_{t−1} x^{t−1} + · · · + a_0 and consider the matrix polynomial in y

θ^{−1}(m_A I_n) = I_n y^t + (a_{t−1} I_n)y^{t−1} + · · · + (a_0 I_n)
= (I_n y − A)Q + A^t I_n + A^{t−1}(a_{t−1} I_n) + · · · + a_0 I_n
= (I_n y − A)Q + m_A(A)
= (I_n y − A)Q.

Now take θ of both sides to give
m_A I_n = (xI_n − A)θ(Q),
and taking determinants of both sides yields
{m_A}^n = ch_A det θ(Q).
So letting p be an irreducible polynomial dividing ch_A, we have p | {m_A}^n and hence p | m_A.
Alternative simpler proof (MacDuffee):
m_A(x) − m_A(y) = (x − y)k(x, y), where k(x, y) ∈ F[x, y]. Hence
m_A(x)I_n = m_A(xI_n) − m_A(A) = (xI_n − A)k(xI_n, A).
Now take determinants to get
m_A(x)^n = ch_A(x) det k(xI_n, A).
Exercise: If d(x) is the gcd of the elements of adj(xI_n − A), use the equation (xI_n − A) adj(xI_n − A) = ch_A(x)I_n and an equation above to deduce that m_A(x) = ch_A(x)/d(x).
EXAMPLES 2.3
With A = 0 ∈ M_{n×n}(F), we have ch_A = x^n and m_A = x.
With A = diag(1, 1, 2, 2, 2) ∈ M_{5×5}(Q),
ch_A = (x − 1)^2 (x − 2)^3 and m_A = (x − 1)(x − 2).

DEFINITION 2.11
A matrix A ∈ M_{n×n}(F) is called diagonable over F if there exists a nonsingular matrix P ∈ M_{n×n}(F) such that
P^{−1}AP = diag(λ_1, . . . , λ_n),
where λ_1, . . . , λ_n belong to F.
THEOREM 2.12
If A is diagonable, then m_A is a product of distinct linear factors.
PROOF If P^{−1}AP = diag(λ_1, . . . , λ_n) (with λ_1, . . . , λ_n ∈ F), then
m_A = m_{P^{−1}AP} = m_{diag(λ_1,...,λ_n)} = (x − c_1)(x − c_2) · · · (x − c_t),
where c_1, . . . , c_t are the distinct members of the sequence λ_1, . . . , λ_n.
The converse is also true, and will (fairly) soon be proved.

EXAMPLE 2.10
A = J_n(a). We saw earlier that m_A = (x − a)^n, so if n ≥ 2 we see that A is not diagonable.
DEFINITION 2.12
(Diagonable LTs)
T : V → V is called diagonable over F if there exists a basis β for V such that [T]_β is diagonal.
THEOREM 2.13
A is diagonable ⇔ T_A is diagonable.
PROOF (Sketch)
Suppose P^{−1}AP = diag(λ_1, . . . , λ_n). Pre-multiplying by P and letting P = [P_1| · · · |P_n], we see that
T_A(P_1) = AP_1 = λ_1 P_1
. . .
T_A(P_n) = AP_n = λ_n P_n,
and we let β be the basis P_1, . . . , P_n of V_n(F). Then
[T_A]_β = diag(λ_1, λ_2, . . . , λ_n).
Reverse the argument and use Theorem 1.17.


THEOREM 2.14
Let A ∈ M_{n×n}(F). Then if λ is an eigenvalue of A with multiplicity m (that is, (x − λ)^m is the exact power of x − λ which divides ch_A), we have
nullity(A − λI_n) ≤ m.
REMARKS. (1) If m = 1, we deduce that nullity(A − λI_n) = 1, for the inequality
1 ≤ nullity(A − λI_n)
always holds.
(2) The integer nullity(A − λI_n) is called the geometric multiplicity of the eigenvalue λ, while m is referred to as the algebraic multiplicity of λ.
PROOF. Let v_1, . . . , v_r be a basis for N(A − λI_n), where λ is an eigenvalue of A having multiplicity m. Extend this linearly independent family to a basis v_1, . . . , v_r, v_{r+1}, . . . , v_n of V_n(F). Then the following equations hold:
Av_1 = λv_1
. . .
Av_r = λv_r
Av_{r+1} = b_{11}v_1 + · · · + b_{n1}v_n
. . .
Av_n = b_{1,n−r}v_1 + · · · + b_{n,n−r}v_n.
These equations can be combined into a single matrix equation:

A[v_1| · · · |v_r|v_{r+1}| · · · |v_n] = [Av_1| · · · |Av_n]
= [v_1| · · · |v_n] [ λI_r  B_1 ]
                    [ 0     B_2 ].

Hence if P = [v_1| · · · |v_n], we have

P^{−1}AP = [ λI_r  B_1 ]
           [ 0     B_2 ].

Then
ch_A = ch_{P^{−1}AP} = ch_{λI_r} ch_{B_2} = (x − λ)^r ch_{B_2},
and because (x − λ)^m is the exact power of x − λ dividing ch_A, it follows that
nullity(A − λI_n) = r ≤ m.
THEOREM 2.15
Suppose that ch_T = (x − c_1)^{a_1} · · · (x − c_t)^{a_t}. Then T is diagonable if
nullity(T − c_i I_V) = a_i
for 1 ≤ i ≤ t.

PROOF. We first prove that the subspaces Ker(T − c_i I_V) are independent.
(Subspaces V_1, . . . , V_t are called independent if
v_1 + · · · + v_t = 0, v_i ∈ V_i, i = 1, . . . , t ⇒ v_1 = 0, . . . , v_t = 0.
Then dim(V_1 + · · · + V_t) = dim V_1 + · · · + dim V_t.)
Assume that
v_1 + · · · + v_t = 0,
where v_i ∈ Ker(T − c_i I_V) for 1 ≤ i ≤ t. Then
T(v_1 + · · · + v_t) = T(0)
⇒ c_1 v_1 + · · · + c_t v_t = 0.
Similarly we deduce that
c_1^2 v_1 + · · · + c_t^2 v_t = 0
. . .
c_1^{t−1} v_1 + · · · + c_t^{t−1} v_t = 0.
We can combine these t equations into a single matrix equation

[ 1        · · ·  1        ] [ v_1 ]   [ 0 ]
[ c_1      · · ·  c_t      ] [ .   ] = [ . ]
[ .               .        ] [ .   ]   [ . ]
[ c_1^{t−1} · · · c_t^{t−1}] [ v_t ]   [ 0 ].

However the coefficient matrix is the Vandermonde matrix, which is nonsingular as c_i ≠ c_j if i ≠ j, so we deduce that v_1 = 0, . . . , v_t = 0. Hence with V_i = Ker(T − c_i I_V), we have

dim(V_1 + · · · + V_t) = Σ_{i=1}^{t} dim V_i = Σ_{i=1}^{t} a_i = dim V.

Hence
V = V_1 + · · · + V_t.
Then if β_i is a basis for V_i for 1 ≤ i ≤ t and β = β_1 ∪ · · · ∪ β_t, it follows that β is a basis for V. Moreover

[T]_β = (c_1 I_{a_1}) ⊕ · · · ⊕ (c_t I_{a_t})

and T is diagonable.
EXAMPLE. Let

A = [ 5 2 2 ]
    [ 2 5 2 ]
    [ 2 2 5 ].

We find that ch_A = (x − 3)^2 (x − 9). Next we find bases for each of the eigenspaces N(A − 3I_3) and N(A − 9I_3).
First we solve (A − 3I_3)X = 0. We have

A − 3I_3 = [ 2 2 2 ]    [ 1 1 1 ]
           [ 2 2 2 ] →  [ 0 0 0 ]
           [ 2 2 2 ]    [ 0 0 0 ].

Hence the eigenspace consists of vectors X = [x, y, z]^t satisfying x = −y − z, with y and z arbitrary. Hence

X = [ −y − z ]     [ −1 ]     [ −1 ]
    [ y      ] = y [  1 ] + z [  0 ],
    [ z      ]     [  0 ]     [  1 ]

so X_{11} = [−1, 1, 0]^t and X_{12} = [−1, 0, 1]^t form a basis for the eigenspace corresponding to the eigenvalue 3.
Next we solve (A − 9I_3)X = 0. We have

A − 9I_3 = [ −4  2  2 ]    [ 1 0 −1 ]
           [  2 −4  2 ] →  [ 0 1 −1 ]
           [  2  2 −4 ]    [ 0 0  0 ].

Hence the eigenspace consists of vectors X = [x, y, z]^t satisfying x = z and y = z, with z arbitrary. Hence

X = [ z ]     [ 1 ]
    [ z ] = z [ 1 ],
    [ z ]     [ 1 ]

and we can take X_{21} = [1, 1, 1]^t as a basis for the eigenspace corresponding to the eigenvalue 9.
Then P = [X_{11}|X_{12}|X_{21}] is nonsingular and

P^{−1}AP = [ 3 0 0 ]
           [ 0 3 0 ]
           [ 0 0 9 ].
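The diagonalisation can be verified without inverting P: it suffices to check AP_j = λ_j P_j column by column, which is equivalent to AP = P diag(3, 3, 9). A small sketch (our own code) for the example above:

```python
A = [[5, 2, 2],
     [2, 5, 2],
     [2, 2, 5]]

# columns are X11 = [-1, 1, 0]^t, X12 = [-1, 0, 1]^t, X21 = [1, 1, 1]^t
P = [[-1, -1, 1],
     [ 1,  0, 1],
     [ 0,  1, 1]]
eigenvalues = [3, 3, 9]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# A P = P diag(3, 3, 9), column by column
for j, lam in enumerate(eigenvalues):
    col = [P[i][j] for i in range(3)]
    assert mat_vec(A, col) == [lam * x for x in col]
print("P^(-1) A P = diag(3, 3, 9)")
```

Each assertion checks one eigenvector equation, e.g. A[−1, 1, 0]^t = 3[−1, 1, 0]^t.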

THEOREM 2.16
If
m_T = (x − c_1) · · · (x − c_t)
for c_1, . . . , c_t distinct in F, then T is diagonable, and conversely. Moreover there exist unique linear transformations T_1, . . . , T_t satisfying
I_V = T_1 + · · · + T_t,
T = c_1 T_1 + · · · + c_t T_t,
T_i T_j = 0_V if i ≠ j,
T_i^2 = T_i, 1 ≤ i ≤ t.
Also rank T_i = a_i, where ch_T = (x − c_1)^{a_1} · · · (x − c_t)^{a_t}.
Remarks.
1. T_1, . . . , T_t are called the principal idempotents of T.
2. If g ∈ F[x], then g(T) = g(c_1)T_1 + · · · + g(c_t)T_t. For example
T^m = c_1^m T_1 + · · · + c_t^m T_t.
3. If c_1, . . . , c_t are nonzero (that is, the eigenvalues of T are nonzero), then T^{−1} is given by
T^{−1} = c_1^{−1} T_1 + · · · + c_t^{−1} T_t.
Formulae 2 and 3 are useful in the corresponding matrix formulation.
PROOF


Suppose m_T = (x − c_1) · · · (x − c_t), where c_1, . . . , c_t are distinct. Then ch_T = (x − c_1)^{a_1} · · · (x − c_t)^{a_t}. To prove T is diagonable, we have to prove that nullity(T − c_i I_V) = a_i, 1 ≤ i ≤ t.
Let p_1, . . . , p_t be the Lagrange interpolation polynomials based on c_1, . . . , c_t, i.e.

p_i = Π_{j=1, j≠i}^{t} (x − c_j)/(c_i − c_j), 1 ≤ i ≤ t.

Then
g ∈ P_{t−1}[F] ⇒ g = g(c_1)p_1 + · · · + g(c_t)p_t.
In particular,
taking g = 1: 1 = p_1 + · · · + p_t,
and taking g = x: x = c_1 p_1 + · · · + c_t p_t.
Hence with T_i = p_i(T),
I_V = T_1 + · · · + T_t,
T = c_1 T_1 + · · · + c_t T_t.

Next,
m_T = (x − c_1) · · · (x − c_t) | p_i p_j if i ≠ j
⇒ (p_i p_j)(T) = 0_V if i ≠ j
⇒ p_i(T)p_j(T) = 0_V, or T_i T_j = 0_V, if i ≠ j.
Then T_i^2 = T_i(T_1 + · · · + T_t) = T_i I_V = T_i.
Next,
0_V = m_T(T) = (T − c_1 I_V) · · · (T − c_t I_V).
Hence

dim V = nullity 0_V ≤ Σ_{i=1}^{t} nullity(T − c_i I_V) ≤ Σ_{i=1}^{t} a_i = dim V.

Consequently nullity(T − c_i I_V) = a_i, 1 ≤ i ≤ t, and T is therefore diagonable.


Next we prove that rank T_i = a_i. From the definition of p_i, we have

nullity p_i(T) ≤ Σ_{j=1, j≠i}^{t} nullity(T − c_j I_V) = Σ_{j≠i} a_j = dim V − a_i.

Also p_i(T)(T − c_i I_V) = 0, so Im(T − c_i I_V) ⊆ Ker p_i(T). Hence
dim V − a_i ≤ nullity p_i(T),
and consequently nullity p_i(T) = dim V − a_i, so rank p_i(T) = a_i.
We next prove the uniqueness of T_1, . . . , T_t. Suppose that S_1, . . . , S_t also satisfy the same conditions as T_1, . . . , T_t. Then
T_i T = T T_i = c_i T_i,
S_j T = T S_j = c_j S_j,
and
T_i(T S_j) = T_i(c_j S_j) = c_j T_i S_j, while (T_i T)S_j = c_i T_i S_j,
so (c_j − c_i)T_i S_j = 0_V and T_i S_j = 0_V if i ≠ j. Hence

T_i = T_i I_V = T_i (Σ_{j=1}^{t} S_j) = T_i S_i,
S_i = I_V S_i = (Σ_{j=1}^{t} T_j) S_i = T_i S_i.

Hence T_i = S_i.
Conversely, suppose that T is diagonable and let β be a basis of V such that
A = [T]_β = diag(λ_1, . . . , λ_n).
Then m_T = m_A = (x − c_1) · · · (x − c_t), where c_1, . . . , c_t are the distinct members of the sequence λ_1, . . . , λ_n.
COROLLARY 2.5
If
ch_T = (x − c_1) · · · (x − c_t)
with the c_i distinct members of F, then T is diagonable.
Proof: Here m_T = ch_T and we use Theorem 2.16.
EXAMPLE 2.11
Let

A = [ 0 a ],   a, b ∈ F, ab ≠ 0, 1 + 1 ≠ 0.
    [ b 0 ]

Then A is diagonable if and only if ab = y^2 for some y ∈ F.
For ch_A = x^2 − ab, so if ab = y^2,
ch_A = x^2 − y^2 = (x + y)(x − y),
which is a product of distinct linear factors, as y ≠ −y here.
Conversely, suppose that A is diagonable. Then as A is not a scalar matrix, it follows that m_A is not linear and hence
m_A = (x − c_1)(x − c_2),
where c_1 ≠ c_2. Also ch_A = m_A, so ch_A(c_1) = 0. Hence
c_1^2 − ab = 0, or ab = c_1^2.
For example, take F = Z_7 and let a = 1 and b = 3. Then ab ≠ y^2 for any y ∈ Z_7, and consequently A is not diagonable.
