Linear Algebra
THIRD EDITION
JOHN B. FRALEIGH
RAYMOND A. BEAUREGARD
University of Rhode Island
Addison Wesley Longman
Many of the designations used by manufacturers and sellers to distinguish their products are
claimed as trademarks. Where those designations appear in this book, and Addison-Wesley
was aware of a trademark claim, the designations have been printed in initial cap or all
caps.
Copyright © 1995 by Addison-Wesley Publishing Company, Inc. All rights reserved. No part
of this publication may be reproduced, stored in a retrieval system, or transmitted, in any
form or by any means, electronic, mechanical, photocopying, recording, or otherwise,
without the prior written permission of the publisher. Printed in the United States of
America.
PREFACE
Our text is designed for use in a first undergraduate course in linear algebra.
Because linear algebra provides the tools for dealing with problems in fields
ranging from forestry to nuclear physics, it is desirable to make the subject
accessible to students from a variety of disciplines. For the mathematics
major, a course in linear algebra often serves as a bridge from the typical
intuitive treatment of calculus to more rigorous courses such as abstract
algebra and analysis. Recognizing this, we have attempted to achieve an
appropriate blend of intuition and rigor in our presentation.
SUPPLEMENTS
• Instructor’s Solutions Manual: This manual, prepared by the authors, is
available to the instructor from the publisher. It contains complete solu-
tions, including proofs, for all of the exercises.
• Student’s Solutions Manual: Prepared by the authors, this manual contains
the complete solutions, including proofs, from the Instructor’s Solutions
Manual for every third problem (1, 4, 7, etc.) in each exercise set.
• LINTEK: This PC software, discussed above, is included with each copy of
the text.
ACKNOWLEDGMENTS
Reviewers of text manuscripts perform a vital function by keeping authors
in touch with reality. We wish to express our appreciation to all the re-
viewers of the manuscript for this edition, including: Michael Ecker, Penn
State University; Steve Pennell, University of Massachusetts, Lowell; Paul-
ine Chow, Harrisburg Area Community College; Murray Eisenberg, Univer-
sity of Massachusetts, Amherst; Richard Blecksmith, Northern Illinois Uni-
versity; Michael Tangredi, College of St. Benedict; Ronald D. Whittekin,
Metropolitan State College; Marvin Zeman, Southern Illinois University at
Carbondale; Ken Brown, Cornell University; and Michal Gass, College of St.
Benedict.
In addition, we wish to acknowledge the contributions of reviewers of the
previous editions: Ross A. Beaumont, University of Washington; Paul Blan-
chard, Boston University; Lawrence O. Cannon, Utah State University; Henry
Cohen, University of Pittsburgh; Sam Councilman, California State Univer-
sity, Long Beach; Daniel Drucker, Wayne State University; Bruce Edwards,
University of Florida; Murray Eisenberg, University of Massachusetts, Am-
herst; Christopher Ennis, University of Minnesota; Mohammed Kazemi,
University of North Carolina, Charlotte; Robert Maynard, Tidewater Com-
munity College; Robert McFadden, Northern Illinois University; W. A.
McWorter, Jr., Ohio State University; David Meredith, San Francisco State
University; John Morrill, DePauw University; Daniel Sweet, University of
Maryland; James M. Sobota, University of Wisconsin—LaCrosse; Marvin
Zeman, Southern Illinois University.
We are very grateful to Victor Katz for providing the excellent historical
notes. His notes are not just biographical information about the contributors
to the subject; he actually offers insight into its development.
Finally, we wish to thank Laurie Rosatone, Cristina Malinn, and Peggy
McMahon of Addison-Wesley, and copy editor Craig Kirkpatrick for their
help in the preparation of this edition.
DEPENDENCE CHART
[The chart shows prerequisite dependencies among sections. The central core runs through 1.1-1.6, 2.1-2.3, 3.1-3.4, 4.1-4.3, 5.1-5.2, 6.1-6.3, and 7.1-7.2; the remaining sections (1.7, 1.8, 2.4, 2.5, 3.5, 4.4, 5.3, 6.4-6.5, 8.1-8.4, 9.1-9.4, and 10.1-10.3) branch from this core.]
CONTENTS
PREFACE iii
CHAPTER 1
CHAPTER 2
CHAPTER 3
VECTOR SPACES 179
3.1 Vector Spaces 180
3.2 Basic Concepts of Vector Spaces 190
3.3 Coordinatization of Vectors 204
3.4 Linear Transformations 213
3.5 Inner-Product Spaces (Optional) 229
CHAPTER 4
DETERMINANTS 238
4.1 Areas, Volumes, and Cross Products 238
4.2 The Determinant of a Square Matrix 250
4.3 Computation of Determinants and Cramer’s Rule 263
4.4 Linear Transformations and Determinants (Optional) 273
CHAPTER 5
CHAPTER 6
ORTHOGONALITY 326
6.1 Projections 326
6.2 The Gram-Schmidt Process 338
6.3 Orthogonal Matrices 349
6.4 The Projection Matrix 360
6.5 The Method of Least Squares 369
CHAPTER 7
CHAPTER 8
CHAPTER 9
CHAPTER 10
APPENDICES A-1
A. Mathematical Induction A-1
B. Two Deferred Proofs A-6
INDEX A-53
CHAPTER 1
VECTORS, MATRICES, AND LINEAR SYSTEMS
1.1 VECTORS IN EUCLIDEAN SPACES
Let R be the set of all real numbers. We can regard R geometrically as the
Euclidean line—that is, as Euclidean 1-space. We are familiar with rectangular
x,y-coordinates in the Euclidean plane. We consider each ordered pair (a, b)
of real numbers to represent a point in the plane, as illustrated in Figure 1.1.
The set of all such ordered pairs of real numbers is Euclidean 2-space, which
we denote by R^2, and often call the plane.
To coordinatize space, we choose three mutually perpendicular lines as
coordinate axes through a point that we call the origin and label 0, as shown in
Figure 1.2. Note that we represent only half of each coordinate axis for clarity.
The coordinate system in this figure is called a right-hand system because,
when the fingers of the right hand are curved in the direction required to rotate
the positive x-axis toward the positive y-axis, the right thumb points up the
z-axis, as shown in Figure 1.2. The set of all ordered triples (a, b, c) of real
numbers is Euclidean 3-space, denoted R^3, and often simply referred to as
space.
Although a Euclidean space of dimension four or more may be difficult for
us to visualize geometrically, we have no trouble writing down an ordered
quadruple of real numbers such as (2, —3, 7, 7) or an ordered quintuple such
as (—0.3, 3, 2, —5, 21.3), etc. Indeed, it can be useful to do this. A household
budget might contain nine categories, and the expenses allowed per week in
each category could be represented by an ordered 9-tuple of real numbers.
Generalizing, the set R^n of all ordered n-tuples (x_1, x_2, . . . , x_n) of real numbers
is Euclidean n-space. Note the use of just one letter with consecutive integer
subscripts in this n-tuple, rather than different letters. We will often denote an
element of R^2 by (x_1, x_2) and an element of R^3 by (x_1, x_2, x_3).
[FIGURE 1.1 The point (a, b) in the plane.  FIGURE 1.2 A right-hand coordinate system and the point (a, b, c) in space.]
HISTORICAL NOTE THE IDEA OF AN n-DIMENSIONAL SPACE FOR n > 3 reached acceptance
gradually during the nineteenth century; it is thus difficult to pinpoint a first “invention” of this
concept. Among the various early uses of this notion are its appearances in a work on the
divergence theorem by the Russian mathematician Mikhail Ostrogradskii (1801-1862) in 1836,
in the geometrical tracts of Hermann Grassmann (1809-1877) in the early 1840s, and in a brief
paper of Arthur Cayley (1821-1895) in 1846. Unfortunately, the first two authors were virtually
ignored in their lifetimes. In particular, the work of Grassmann was quite philosophical and
extremely difficult to read. Cayley’s note merely stated that one can generalize certain results to
dimensions greater than three “without recourse to any metaphysical notion with regard to the
possibility of a space of four dimensions.” Sir William Rowan Hamilton (1805-1865), in an 1841
letter, also noted that “it must be possible, in some way or other, to introduce not only triplets but
polyplets, so as in some sense to satisfy the symbolical equation
a = (a_1, a_2, . . . , a_n);
a being here one symbol, as indicative of one (complex) thought; and a_1, a_2, . . . , a_n denoting real
numbers, positive or negative.”
Hamilton, whose work on quaternions will be mentioned later, and who spent much of his
professional life as the Royal Astronomer of Ireland, is most famous for his work in dynamics. As
Erwin Schrödinger wrote, “the Hamiltonian principle has become the cornerstone of modern
physics, the thing with which a physicist expects every physical phenomenon to be in conformity.”
in which the force is acting, and with the length of the arrow representing the
magnitude of the force. Such an arrow is a force vector.
Using a rectangular coordinate system in the plane, note that if we
consider a force vector to start from the origin (0, 0), then the vector is
completely determined by the coordinates of the point at the tip of the arrow.
Thus we can consider each ordered pair in R^2 to represent a vector in the plane
as well as a point in the plane. When we wish to regard an ordered pair as a
vector, we will use square brackets, rather than parentheses, to indicate this.
Also, we often will write vectors as columns of numbers rather than as rows,
and bracket notation is traditional for columns. Thus we speak of the point
(1, 2) in R^2 and of the vector [1, 2] in R^2. To represent the point (1, 2) in the
plane, we make a dot at the appropriate place, whereas if we wish to represent
the vector [1, 2], we draw an arrow emanating from the origin with its tip at the
place where we would plot the point (1, 2). Mathematically, there is no
distinction between (1, 2) and [1, 2]. The different notations merely indicate
different views of the same member of R^2. This is illustrated in Figure 1.3. A
similar observation holds for 3-space. Generalizing, each n-tuple of real
numbers can be viewed both as a point (x_1, x_2, . . . , x_n) and as a vector
[x_1, x_2, . . . , x_n] in R^n. We use boldface letters such as a = [a_1, a_2], v = [v_1, v_2, v_3],
and x = [x_1, x_2, . . . , x_n] to denote vectors. In written work, it is customary to
place an arrow over a letter to denote a vector, as in a, v, and x. The ith entry x_i
in such a vector is the ith component of the vector. Even the real numbers in R
can be regarded both as points and as vectors. When we are not regarding a
real number as either a point or a vector, we refer to it as a scalar.
Two vectors v = [v_1, v_2, . . . , v_n] and w = [w_1, w_2, . . . , w_m] are equal if n = m
and v_i = w_i for each i.
A vector containing only zeros as components is called a zero vector and
is denoted by 0. Thus, in R^2 we have 0 = [0, 0], whereas in R^4 we have 0 =
[0, 0, 0, 0].
When denoting a vector v in R” geometrically by an arrow in a figure, we
say that the vector is in standard position if it starts at the origin. If we draw an
FIGURE 1.3
Two views of the same member of R^2: (a) the point (1, 2); (b) the vector v = [1, 2].
[FIGURE 1.4 The vector v in standard position, and v translated to the point P.]
arrow having the same length and parallel to the arrow representing v but
starting at a point P other than the origin, we refer to the arrow as v translated
to P. This is illustrated in Figure 1.4. Note that we did not draw any coordinate
axes; we only marked the origin 0 and drew the two arrows. Thus we can
consider Figure 1.4 to represent a vector v in R^2, R^3, or indeed in R^n for n ≥ 2.
We will often leave out axes when they are not necessary for our understand-
ing. This makes our figures both less cluttered and more general.
Vector Algebra
Physicists tell us that if two forces corresponding to force vectors F_1 and F_2 act
on a body at the same time, then the two forces can be replaced by a single
force, the resultant force, which has the same effect as the original two forces.
The force vector for this resultant force is the diagonal of the parallelogram
having the force vectors F_1 and F_2 as edges, as illustrated in Figure 1.5. It is
natural to consider this resultant force vector to be the sum F_1 + F_2 of the two
original force vectors, and it is so labeled in Figure 1.5.
HISTORICAL NOTE Tue cONcEPT OF A VECTOR in its earliest manifestation comes from
physical considerations. In particular, there is evidence of velocity being thought of as a vector—a
quantity with magnitude and direction—in Greek times. For example, in the treatise Mechanica
by an unknown author in the fourth century B.C. is written: “When a body is moved in a certain
ratio (i.e., has two linear movements in a constant ratio to one another), the body must move in a
straight line, and this straight line is the diagonal of the parallelogram formed from the straight
lines which have the given ratio.” Heron of Alexandria (first century A.D.) gave a proof of this result
when the directions were perpendicular. He showed that if a point A moves with constant velocity
over a line AB while at the same time the line AB moves with constant velocity along the parallel
lines AC and BD so that it always remains parallel to its original position, and that if the time A
takes to reach B is the same as the time AB takes to reach CD, then in fact the point A moves along
the diagonal AD.
This basic idea of adding two motions vectorially was generalized from velocities to physical
forces in the sixteenth and seventeenth centuries. One example of this practice is found as
Corollary 1 to the Laws of Motion in Isaac Newton’s Principia, where he shows that “a body acted
on by two forces simultaneously will describe the diagonal of a parallelogram in the same time as it
would describe the sides by those forces separately.”
We can visualize two vectors with different directions and emanating from
a point P in Euclidean 2-space or 3-space as determining a plane. It is
pedagogically useful to do this for n-space for any n ≥ 2 and show helpful
figures on our pages. Motivated by our discussion of force vectors above, we
consider the sum of two vectors v and w starting at a point P to be the vector
starting at P that forms the diagonal of the parallelogram with a vertex at P and
having edges represented by v and w, as illustrated in Figure 1.6, where we take
the vectors in R^n in standard position starting at 0. Thus we have a geometric
understanding of vector addition in R^n. We have labeled as translated v and
translated w the sides of the parallelogram opposite the vectors v and w.
Note that arrows along opposite sides of the parallelogram point in the
same direction and have the same length. Thus, as a force vector, the
translation of v is considered to be equivalent to the vector v, and the same is
true for w and its translation. We can think of obtaining the vector v + w by
drawing the arrow v from 0 and then drawing the arrow w translated to start
from the tip of v as shown in Figure 1.6. The vector from 0 to the tip of the
translated w is then v + w. This is often a useful way to regard v + w. To add
three vectors u, v, and w geometrically, we translate v to start at the tip of u and
then translate w to start at the tip of the translated v. The sum u + v + w then
begins at the origin where u starts, and ends at the tip of the translated w, as
indicated in Figure 1.7.
The difference v - w of two vectors in R^n is represented geometrically by
the arrow from the tip of w to the tip of v, as shown in Figure 1.8. Here v - w is
the vector that, when added to w, yields v. The dashed arrow in Figure 1.8
shows v — w in standard position.
If we are pushing a body with a force vector F and we wish to “double the
force” —that is, we want to push in the same direction but twice as hard—
then it is natural to denote the doubled force vector by 2F. If instead we want
to push the body in the opposite direction with one-third the force, we denote
the new force vector by -(1/3)F. Generalizing, we consider the product rv of a
scalar r times a vector v in R^n to be represented by the arrow whose length is |r|
times the length of v and which has the same direction as v if r > 0 but the
opposite direction if r < 0. (See Figure 1.9 for an illustration.) Thus we have a
geometric interpretation of scalar multiplication in R^n—that is, of multiplica-
tion of a vector in R^n by a scalar.
[FIGURES 1.6-1.9 Geometric representations of v + w, of u + v + w, of v - w, and of scalar multiples of v such as 2v and -(1/3)v.]
Taking a vector v = [v_1, v_2] in R^2 and any scalar r, we would like to be able to
compute rv algebraically as an element (ordered pair) in R^2, and not just
represent it geometrically by an arrow. Figure 1.9 shows the vector 2v, which
points in the same direction as v but is twice as long, and shows that we have 2v
= [2v_1, 2v_2]. It also indicates that if we multiply all components of v by -1/3, the
resulting vector has direction opposite to the direction of v and length equal to
1/3 the length of v. Similarly, if we take two vectors v = [v_1, v_2] and w = [w_1, w_2] in
R^2, we would like to be able to compute v + w algebraically as an element
(ordered pair) in R^2. Figure 1.10 indicates that we have v + w = [v_1 + w_1,
v_2 + w_2]—that is, we can simply add corresponding components. With these
figures to guide us, we formally define some algebraic operations with vectors
in R^n.
Let v = [v_1, v_2, . . . , v_n] and w = [w_1, w_2, . . . , w_n] be vectors in R^n. The
vectors are added and subtracted as follows:
Vector addition: v + w = [v_1 + w_1, v_2 + w_2, . . . , v_n + w_n]
Vector subtraction: v - w = [v_1 - w_1, v_2 - w_2, . . . , v_n - w_n]
If r is any scalar, the vector v is multiplied by r as follows:
Scalar multiplication: rv = [rv_1, rv_2, . . . , rv_n]
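These componentwise formulas translate directly into MATLAB, which is introduced in the exercises at the end of this section. The following short sketch, using the vectors that appear in Example 1 below, is only an illustration of the definitions:

    v = [-3 5 -1];
    w = [4 10 -7];
    v + w          % componentwise sum, [1 15 -8]
    v - w          % componentwise difference, [-7 -5 6]
    5*v - 3*w      % scalar multiples combine the same way, [-27 -5 16]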
X2
A
]
V2 + WwW +
W2
v» +
vt+w
vj
Ww J av]
v v) Ww v| + Wy
FIGURE 4.10
Coniputation of v + w in R?.
EXAMPLE 1 Let v = [-3, 5, -1] and w = [4, 10, -7] in R^3. Compute 5v - 3w.
SOLUTION We compute
5v - 3w = 5[-3, 5, -1] - 3[4, 10, -7] = [-15, 25, -5] - [12, 30, -21] = [-27, -5, 16].
EXAMPLE 2 For vectors v and w in R^2 pointing in different directions from the origin,
represent geometrically 5v - 3w.
SOLUTION This is done in Figure 1.11.
The analogues of many familiar algebraic laws for addition and multipli-
cation of scalars also hold for vector addition and scalar multiplication. For
convenience, we gather them in a theorem.
FIGURE 1.11
5v - 3w in R^2.
THEOREM 1.1
Let u, v, and w be any vectors in R^n, and let r and s be any scalars in R.
Properties of Vector Addition
A1  (u + v) + w = u + (v + w)    An associative law
A2  v + w = w + v                A commutative law
A3  0 + v = v                    0 as additive identity
A4  v + (-v) = 0                 -v as additive inverse of v
Properties Involving Scalar Multiplication
S1  r(v + w) = rv + rw           A distributive law
S2  (r + s)v = rv + sv           A distributive law
S3  r(sv) = (rs)v                An associative law
S4  1v = v                       Preservation of scale
The eight properties given in Theorem 1.1 are quite easy to prove, and we
leave most of them as exercises. The proofs in Examples 3 and 4 are typical.
Parallel Vectors
EXAMPLE 5 Determine whether the vectors v = [2, 1, 3, -4] and w = [6, 3, 9, -12] are
parallel.
SOLUTION We put v = rw and try to solve for r. This gives rise to four component
equations: 2 = 6r, 1 = 3r, 3 = 9r, and -4 = -12r. Every one of them is satisfied
by r = 1/3, so v = (1/3)w and the vectors are parallel.
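In MATLAB (introduced at the end of this section), one quick way to carry out this kind of check is to look at the componentwise ratios; the snippet below is just an illustration of that idea:

    v = [2 1 3 -4];
    w = [6 3 9 -12];
    v ./ w          % every component is 0.3333, so v = (1/3)w and v is parallel to w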
Given vectors v_1, v_2, . . . , v_k in R^n and scalars r_1, r_2, . . . , r_k in R, the
vector
r_1 v_1 + r_2 v_2 + · · · + r_k v_k
is a linear combination of the vectors v_1, v_2, . . . , v_k.
The vectors [1, 0] and [0, 1] play a very important role in R^2. Every vector
b in R^2 can be expressed as a linear combination of these two vectors in a
unique way—namely, b = [b_1, b_2] = r_1[1, 0] + r_2[0, 1] if and only if r_1 = b_1 and
r_2 = b_2. We call [1, 0] and [0, 1] the standard basis vectors in R^2. They are
often denoted by i = [1, 0] and j = [0, 1], as shown in Figure 1.12(a). Thus in
R^2, we may write the vector [b_1, b_2] as b_1 i + b_2 j. Similarly, we have three
standard basis vectors in R^3—namely,
i = [1, 0, 0],  j = [0, 1, 0],  and  k = [0, 0, 1],
FIGURE 1.12
(a) Standard basis vectors in R^2; (b) standard basis vectors in R^3.
In R^n, the standard basis vectors e_1, e_2, . . . , e_n are defined by
e_i = [0, 0, . . . , 0, 1, 0, . . . , 0],
where the single entry 1 appears as the ith component. We then have
b = [b_1, b_2, . . . , b_n] = b_1 e_1 + b_2 e_2 + · · · + b_n e_n
for every vector b in R^n.
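A compact way to see these vectors in MATLAB is to take the columns of the identity matrix; the vector b below is made up purely for illustration:

    n = 4;
    E = eye(n);                 % E(:,i) is the standard basis vector e_i
    b = [3; -1; 4; 2];
    b(1)*E(:,1) + b(2)*E(:,2) + b(3)*E(:,3) + b(4)*E(:,4)   % reproduces b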
FIGURE 1.13
(a) The line along v in R^2; (b) the line along v in R^3.
EXAMPLE 6 Referring to Figure 1.15(a), estimate scalars r and s such that rv + sw = b for
the vectors v, w, and b all lying in the plane of the paper.
SOLUTION We draw the line along v, the line along w, and parallels to these lines through
the tip of the vector b, as shown in Figure 1.15(b). From Figure 1.15(b), we
estimate that b = 1.5v - 2.5w.
FIGURE 1.14
The plane spanned by v and w.
FIGURE 1.15
(a) Vectors v, w, and b; (b) finding r and s so that b = rv + sw.
EXAMPLE 7 Let v = [1, 3] and w = [-2, 5] in R^2. Find scalars r and s such that rv + sw =
[-1, 19].
SOLUTION Because rv + sw = r[1, 3] + s[-2, 5] = [r - 2s, 3r + 5s], we see that rv + sw =
[-1, 19] if and only if both equations
r - 2s = -1
3r + 5s = 19
are satisfied. Multiplying the first equation by -3 and adding the result to the
second equation, we obtain
0 + 11s = 22,
so s = 2. The first equation then gives r = -1 + 2s = 3, so rv + sw = [-1, 19] for
r = 3 and s = 2.
We note that the components -1 and 19 of the vector [-1, 19] appear on
the right-hand side of the system of two linear equations in Example 7. If we
replace -1 by b_1, and 19 by b_2, the same operations on the equations will enable
us to solve for the scalars r and s in terms of b_1 and b_2 (see Exercise 42). This
shows that all linear combinations of v and w do indeed fill the plane R^2.
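For a quick numerical check of Example 7 in MATLAB, the two vectors can be placed as the columns of a matrix and the resulting system solved with the backslash operator (the backslash solver is not discussed in this section's MATLAB exercises; it is used here only as an illustration):

    A = [1 -2; 3 5];     % columns are v = [1; 3] and w = [-2; 5]
    b = [-1; 19];
    A \ b                % returns [3; 2], that is, r = 3 and s = 2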
Example 7 indicates that an attempt to express a vector b as a linear
combination of given vectors corresponds to an attempt to find a solution of a
system of linear equations. This parallel is even more striking if we write our
vectors as columns of numbers rather than as ordered rows of numbers—that
is, as column vectors rather than as row vectors. For example, if we write the
vectors v and w in Example 7 as columns so that
        [1]              [-2]
    v = [3]    and   w = [ 5],
and also rewrite [—1, 19] as a column vector, then the row-vector equation
rv + sw = [—1, 19] in the statement of Example 7 becomes
    r [1]  +  s [-2]  =  [-1]
      [3]       [ 5]     [19].
Notice that the numbers in this column-vector equation are in the same
positions relative to each other as they are in the system of linear equations
r - 2s = -1
3r + 5s = 19
that we solved in Example 7. Every system of linear equations can be rewritten
in this fashion as a single column-vector equation. Exercises 35-38 provide
practice in this. Finding scalars r_1, r_2, . . . , r_k such that r_1 v_1 + r_2 v_2 + · · · + r_k v_k
= b for given vectors v_1, v_2, . . . , v_k and b in R^n is a fundamental computation in
linear algebra. Section 1.4 describes an algorithm for finding all possible such
scalars r_1, r_2, . . . , r_k.
The preceding paragraph indicates that often it will be natural for us to
think of vectors in R^n as column vectors rather than as row vectors.
The transpose of a row vector v is defined to be the corresponding column
vector, and is denoted by v^T. Similarly, the transpose of a column vector is the
corresponding row vector. For example,
[-1, 4, 15, -7]^T is the column vector with entries -1, 4, 15, -7, and the
transpose of the column vector with entries 2, -30, 45 is the row vector
[2, -30, 45].
Note that for all vectors v we have (v^T)^T = v. As illustrated following Example
7, column vectors are often useful. In fact, some authors always regard every
vector v in R^n as a column vector. Because it takes so much page space to write
column vectors, these authors may describe v by giving the row vector v^T. We
do not follow this practice; we will write vectors in R^n as either row or column
vectors depending on the context.
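In MATLAB (see Exercise M8 in Section 1.2), the apostrophe plays the role of the transpose for vectors with real entries; a minimal illustration:

    v = [-1 4 15 -7];    % a row vector
    v'                    % its transpose, a column vector
    (v')'                 % transposing twice returns the original row vector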
Continuing our geometric discussion, we expect that if u, v, and w are three
nonzero vectors in R^n such that u and v are not parallel and also w is not a
vector in the plane spanned by u and v, then the set of all linear combinations
of u, v, and w will fill a three-dimensional portion of R^n—that is, a portion of
R^n that looks just like R^3. We consider the set of these linear combinations to be
spanned by u, v, and w. We make the following definition.
Let v_1, v_2, . . . , v_k be vectors in R^n. The span of these vectors is the set of
all linear combinations of them and is denoted by sp(v_1, v_2, . . . , v_k). In
set notation,
sp(v_1, v_2, . . . , v_k) = {r_1 v_1 + r_2 v_2 + · · · + r_k v_k | r_1, r_2, . . . , r_k ∈ R}.
It is important to note that sp(v_1, v_2, . . . , v_k) in R^n may not fill what we
intuitively consider to be a k-dimensional portion of R^n. For example, in R^2 we
see that sp([1, -2], [-3, 6]) is just the one-dimensional line along [1, -2],
because [-3, 6] = -3[1, -2] already lies in sp([1, -2]). Similarly, if v_3 is a
vector in sp(v_1, v_2), then sp(v_1, v_2, v_3) = sp(v_1, v_2), and so sp(v_1, v_2, v_3) is not
three-dimensional. Section 2.1 will deal with this kind of dependency among
vectors. As a result of our work there, we will be able to define dimensionality.
SUMMARY
Euclidean n-space R^n consists of all ordered n-tuples of real numbers. Each
n-tuple x can be regarded as a point (x_1, x_2, . . . , x_n) and represented
graphically as a dot, or regarded as a vector [x_1, x_2, . . . , x_n] and
represented by an arrow. The n-tuple 0 = [0, 0, . . . , 0] is the zero vector. A
real number r ∈ R is called a scalar.
Vectors v and w in R^n can be added and subtracted, and each can be
multiplied by a scalar; all of these operations are performed on corresponding
components.
EXERCISES

In Exercises 1-4, compute v + w and v - w for the given vectors v and w. Then
draw coordinate axes and sketch, using your answers, the vectors v, w, v + w,
and v - w.

In Exercises 5-8, let u = [-1, 3, -2], v = [4, 0, -1], and w = [-3, -1, 2].
Compute the indicated vector.

In Exercises 9-12, compute the given linear combination of u = [1, 2, 1, 0],
v = [-2, 0, 1, 6], and w = [3, -5, 1, -2].

9. u - 2v + 4w
10. 3u + v - w
11. 4u - 2v + 4w
12. -u + 5v + 3w

In Exercises 13-16, reproduce on your paper those vectors in Figure 1.16 that
appear in the exercise, and then draw an arrow representing each of the
following linear combinations. All of the vectors are assumed to lie in the same
plane. Use the technique illustrated in Figure 1.7 when all three vectors are
involved.

13. 2u + 3v
14. -3u + 2w
15. u + v + w
16. 2u - v + (1/2)w

In Exercises 17-20, reproduce on your paper those vectors in Figure 1.17 that
appear in the exercise, and then use the technique illustrated in Example 6 to
estimate scalars r and s such that the given equation is true. All of the vectors
are assumed to lie in the same plane.

17. x = ru + sv
18. y = ru + sv
19. u = rx + sv
20. y = ru + sx

[FIGURES 1.16 AND 1.17 show the plane vectors u, v, w, x, and y used in
Exercises 13-20.]

In Exercises 21-30, find all scalars c, if any exist, such that the given statement
is true. Try to do some of these problems without using pencil and paper.

21. The vector [2, 6] is parallel to the vector [c, -3].
22. The vector [c^2, -4] is parallel to the vector [1, -2].
23. The vector [c, -c, 4] is parallel to the vector [-2, 2, 20].
24. The vector [c^2, c^2, c^3] is parallel to the vector [1, -2, 4] with the same
    direction.
25. The vector [13, -15] is a linear combination of the vectors [1, 5] and [3, c].
26. The vector [-1, c] is a linear combination of the vectors [-3, 5] and [6, -11].
27. i + cj - 3k is a linear combination of i + j and j + 3k.
28. i + cj + (c - 1)k is in the span of i + 2j + k and 3i + 6j + 3k.
29. The vector 3i - 2j + ck is in the span of i + 2j - k and j + 3k.
30. The vector [c, -2c, c] is in the span of [1, -1, 1], [0, 1, -3], and [0, 0, 1].

In Exercises 31-34, find the vector which, when translated, represents
geometrically an arrow reaching from the first point to the second.

31. From (-1, 3) to (4, 2) in R^2
32. From (-3, 2, 5) to (4, -2, -6) in R^3
33. From (2, 1, 5, -6) to (3, -2, 1, 7) in R^4
34. From (1, 2, 3, 4, 5) to (-5, -4, -3, -2, -1) in R^5

35. Write the linear system
        3x - 2y + 4z = 10
         x -  y - 3z =  0
        2x +  y - 3z = -3
    as a column-vector equation.
36. Write the linear system
         x_1 - 3x_2 + 2x_3 = -6
        3x_1 - 4x_2 + 5x_3 = 12
    as a column-vector equation.
37. Write the row-vector equation
        p[-3, 4, 6] + q[0, -2, 5] - r[4, -3, 2] + s[6, 0, 7] = [8, -3, 1]
    as
    a. a linear system,
    b. a column-vector equation.
38. Write the given column-vector equation as a linear system.
39. Mark each of the following True or False.
    a. The notion of a vector in R^n is useful only if n = 1, 2, or 3.
    b. Every ordered n-tuple in R^n can be viewed both as a point and as a vector.
    c. It would be impossible to define addition of points in R^n because we only
       add vectors.
    d. If a and b are two vectors in standard position in R^n, then the arrow from
       the tip of a to the tip of b is a translated representation of the vector a - b.
    e. If a and b are two vectors in standard position in R^n, then the arrow from
       the tip of a to the tip of b is a translated representation of the vector b - a.
    f. The span of any two nonzero vectors in R^2 is all of R^2.
    g. The span of any two nonzero, nonparallel vectors in R^2 is all of R^2.
    h. The span of any three nonzero, nonparallel vectors in R^3 is all of R^3.
    i. If v_1, v_2, . . . , v_k are vectors in R^2 such that sp(v_1, v_2, . . . , v_k) = R^2,
       then k ≥ 2.
    j. If v_1, v_2, . . . , v_k are vectors in R^3 such that sp(v_1, v_2, . . . , v_k) = R^3,
       then k ≥ 3.
40. Prove the indicated property of vector addition in R^n, stated in Theorem 1.1.
    a. Property A1
    b. Property A3
    c. Property A4
41. Prove the indicated property of scalar multiplication in R^n, stated in
    Theorem 1.1.
    a. Property S1
    b. Property S3
    c. Property S4
42. Prove algebraically that the linear system
        r - 2s = b_1
        3r + 5s = b_2
    has a solution for all numbers b_1, b_2 ∈ R, as asserted in the text.
43. Option 1 of the routine VECTGRPH in the software package LINTEK gives
    graphic quizzes on addition and subtraction of vectors in R^2. Work with this
    Option 1 until you consistently achieve a score of 80% or better on the
    quizzes.
44. Repeat Exercise 43 using Option 2 of the routine VECTGRPH. The quizzes
    this time are on linear combinations in R^2, and are quite similar to
    Exercises 17-20.
MATLAB
The MATLAB exercises are designed to build some familiarity with this widely used
software as you work your way through the text. Complete information can be
obtained from the manual that accompanies MATLAB. Some information is
available on the screen by typing help followed by a space and then the word or
symbol about which you desire information.
The software LINTEK designed explicitly for this text does not require a
manual, because all information is automatically given on the screen. However,
MATLAB is a professionally designed program that is much more powerful than
LINTEK. Although LINTEK adequately illustrates material in the text, the
prospective scientist would be well advised to invest the extra time necessary to
acquire some facility with MATLAB.
Access MATLAB according to the directions for your installation. The MATLAB
prompt, which is its request for instructions, looks like >. Vectors can be entered by
entering the components in square brackets, separated by spaces; do not use commas.
Enter vectors a, b, c, u, v, and w by typing the lines displayed below. After you type
each line, press the Enter key. The vector components will be displayed, without
brackets, using decimal notation. Proofread each vector after you enter it. If you have
made an error, strike the up-arrow key and edit in the usual fashion to correct your
error. (If you do not want a vector displayed for proofreading after entry, you can
achieve this by typing ; after the closing square bracket.) If you ever need to continue
on the next line to type data, enter at least two periods .. and immediately press the
Enter key and continue the data.
a = [2 -4 5 7]
b = [-1 6 7 3]
c = [13 -21 5 39]
u = [2/3 3/5 1/7]
v = [3/2 -5/6 11/3]
w = [5/7 3/4 -2/3]
Now enter u (that is, type u and press the Enter key) to see u displayed again. Then
enter format long and then u to see the components of u displayed with more decimal
places. Enter format short and then u to return to the display with fewer decimal
places. Enter rat(u, ’s'), which displays rational approximations that are accurate
when the numbers involved are fractions with sufficiently small denominators, to see u
displayed again in fraction (rational) format. Addition and subtraction of vectors can
be performed using + and —, and scalar multiplication using *. Enter a + b to see
this sum displayed. Then enter -3*a to see this scalar product displayed. Using what
you have discovered about MATLAB, work the following exercises. Entering who at
any time displays a list of variables to which you have assigned numerical, vector, or
matrix values. When you have finished, enter quit or exit to leave MATLAB.
M1. Compute 2a - 3b - 5c.
M2. Compute 3c — 4(2a — b).
M3. Attempt to compute a + u. What happens, and why?
M4. Compute u + v in
a. short format
b. long format
c. rational (fraction) format.
M5. Repeat Exercise M4 for 2u - 3v + w by first entering x = 2*u - 3*v + w
and then looking at x in the different formats.
M6. Repeat Exercise M4 for (1/3)a - (3/7)b, entering the fractions as (1/3) and (3/7),
and using the time-saving technique of Exercise M5.
M7. Repeat Exercise M4 for 0.3u — 0.23w, using the time-saving technique of
Exercise M5.
M8. The transpose v^T of a vector v is denoted in MATLAB by v'. Compute
a' - 3c' and (a - 3c)'. How do they compare?
M9. Try to compute u + u’. What happens, and why?
M10. Enter help : to see some of the things that can be done using the colon. Use
the colon in a statement starting v1 = to generate the vector v1 = [-3 —2
-1 0 1 2], and then generate v2 = [1 4 7 10 13 16] similarly. Compute
v1 + 5*v2 in rational format. (We use v1 and v2 in MATLAB where we
would use v_1 and v_2 in the text.)
M11. MATLAB can provide graphic illustrations of the component values of a
vector. Enter plot(a) and note how the figure drawn reflects the components
of the vector a = [2 —4 5 7]. Press any key to clear the screen, and repeat
the experiment with bar(a) and finally with stairs(a) to see two other ways to
illustrate the component values of the vector a.
M12. Using the plot command, we can plot graphs of functions in MATLAB. The
command plot(x, y) will plot the x vector against the y vector. Enter x =
-1: .5: 1. Note that, using the colons, we have generated a vector of
x-coordinates starting at -1 and stopping at 1, with increments of 0.5. Now
enter y = x .* x . (A period before an operator, such as .* or .^, will cause
that operation to be performed on each component of a vector. Enter help .
to see MATLAB explain this.) You will see that the y vector contains the
squares of the x-coordinates in the x vector. Enter plot(x, y) to see a crude
plot of the graph of y = x^2 for -1 ≤ x ≤ 1.
a. Use the up-arrow key to return to the colon statement and make the
increment .2 rather than .5 to get a better graph. Use the up-arrow key to
get to the y = x .* x command, and press the Enter key to generate the
new y vector with more entries. Then get to the plot(x, y) command and
press Enter to see the improved graph of y = x^2 for -1 ≤ x ≤ 1.
b. Proceed as in part (a) to graph y = x^2 for -3 ≤ x ≤ 3 with increments of
0.2. This time, put a semicolon after the command that defines the vector
x before pressing the Enter key, so that you don’t see the x-coordinates
printed out. Similarly, put a semicolon after the command defining the
vector y.
c. Plot the graph of y = sin(x) for -4π ≤ x ≤ 4π. The number π can be
entered as pi in MATLAB. Remember to use * for multiplication.
d. Plot the graph of y = 3 cos(2x) - 2 sin(x) for -4π ≤ x ≤ 4π. Remember
to use * for multiplication.
EXAMPLE 1 Represent the vector v = [3, -4] geometrically, and find its magnitude.
SOLUTION The vector [3, -4] has magnitude
||v|| = √(3^2 + (-4)^2) = √25 = 5,
and is shown in Figure 1.19.
In Figure 1.20, the magnitude ||v|| of a vector v = [v_1, v_2, v_3] in R^3 appears as
the length of the hypotenuse of a right triangle whose altitude is |v_3| and whose
base in the x_1,x_2-plane has length √(v_1^2 + v_2^2). Using the Pythagorean theorem,
we obtain
||v|| = √(v_1^2 + v_2^2 + v_3^2).   (2)
EXAMPLE 2 Represent the vector v = [2, 3, 4] geometrically, and find its magnitude.
SOLUTION The vector v = [2, 3, 4] has magnitude ||v|| = √(2^2 + 3^2 + 4^2) = √29 and is
represented in Figure 1.21.
[FIGURES 1.19-1.21 The vector v = [3, -4] in R^2 with its magnitude ||v||, and the vector v = [2, 3, 4] in R^3 with the right triangle used to compute ||v||.]
The magnitude of a vector is also called the norm or the length of the
vector. As suggested by Eqs. (1) and (2), we define the norm ||v|| of a vector v in
R^n as follows.
Let v = [v_1, v_2, . . . , v_n] be a vector in R^n. The norm or magnitude of v is
||v|| = √(v_1^2 + v_2^2 + · · · + v_n^2).
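The definition is easy to evaluate numerically; the short MATLAB sketch below, with a made-up vector, computes the norm both from the definition and with MATLAB's built-in norm function (used in the exercises at the end of this section):

    v = [3 -4 12];
    sqrt(sum(v .^ 2))    % the definition: sqrt(9 + 16 + 144) = 13
    norm(v)              % MATLAB's built-in function gives the same value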
FIGURE 1.22
The triangle inequality.
Unit Vectors
A vector in R^n is a unit vector if it has magnitude 1. Given any nonzero vector v
in R^n, a unit vector having the same direction as v is given by (1/||v||)v.
EXAMPLE 4 Find a unit vector having the same direction as v = [2, 1, —3], and find a vector
of magnitude 3 having direction opposite to v.
SOLUTION Because ||v|| = √(2^2 + 1^2 + (-3)^2) = √14, we see that u = (1/√14)[2, 1, -3] is
the unit vector with the same direction as v, and -3u = (-3/√14)[2, 1, -3]
is the other required vector.
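A MATLAB check of Example 4 takes one line per vector:

    v = [2 1 -3];
    u = v/norm(v)       % the unit vector with the same direction as v
    -3*u                % magnitude 3, direction opposite to v
    norm(-3*u)          % confirms the magnitude is 3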
The two-component unit vectors are precisely the vectors that extend from
the origin to the unit circle x^2 + y^2 = 1 with center (0, 0) and radius 1 in R^2. (See
Figure 1.23(a).) The three-component unit vectors extend from (0, 0, 0) to the
unit sphere in R^3, as illustrated in Figure 1.23(b).
Note that the standard basis vectors i and j in R^2, as well as i, j, and k in R^3,
are unit vectors. In fact, the standard basis vectors e_1, e_2, . . . , e_n for R^n are unit
vectors, because each has zeros in all components except for one component of
1. For this reason, these standard basis vectors are also called unit coordinate
vectors.
FIGURE 1.23
(a) Typical unit vector in R^2; (b) typical unit vector in R^3.
The dot product of two vectors is a scalar that we will encounter as we now
try to define the angle θ between two vectors v = [v_1, v_2, . . . , v_n] and w =
[w_1, w_2, . . . , w_n] in R^n, shown symbolically in Figure 1.24. To motivate the
definition of θ, we will use the law of cosines for the triangle symbolized in
Figure 1.24. Using our definition of the norm of a vector in R^n to compute the
lengths of the sides of the triangle, the law of cosines yields
||v||^2 + ||w||^2 = ||v - w||^2 + 2||v|| ||w|| (cos θ),
or
v_1^2 + · · · + v_n^2 + w_1^2 + · · · + w_n^2
= (v_1 - w_1)^2 + · · · + (v_n - w_n)^2 + 2||v|| ||w|| (cos θ).   (4)
After computing the squares on the right-hand side of Eq. (4) and simplifying,
we obtain
v_1 w_1 + v_2 w_2 + · · · + v_n w_n = ||v|| ||w|| (cos θ).   (5)
FIGURE 1.24
The angle between v and w.
The dot product of vectors v = [v_1, v_2, . . . , v_n] and w = [w_1, w_2, . . . , w_n]
in R^n is the scalar given by
v · w = v_1 w_1 + v_2 w_2 + · · · + v_n w_n.   (6)
The dot product is sometimes called the inner product or the scalar
product. To avoid possible confusion with scalar multiplication, we shall never
use the latter term.
In view of Definition 1.6, we can write Eq. (5) as
v · w = ||v|| ||w|| (cos θ).
EXAMPLE 5 Find the angle θ between the vectors [1, 2, 0, 2] and [-3, 1, 1, 5] in R^4.
SOLUTION We have
cos θ = ([1, 2, 0, 2] · [-3, 1, 1, 5]) / (||[1, 2, 0, 2]|| ||[-3, 1, 1, 5]||)
      = (-3 + 2 + 0 + 10) / (√(1^2 + 2^2 + 0^2 + 2^2) √((-3)^2 + 1^2 + 1^2 + 5^2))
      = 9 / (3 · 6) = 1/2.
Thus, θ = 60°.
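The same computation in MATLAB, with the angle converted to degrees, is a two-line check of Example 5:

    v = [1 2 0 2];  w = [-3 1 1 5];
    costheta = sum(v .* w)/(norm(v)*norm(w))    % 0.5000
    acos(costheta)*180/pi                       % 60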
where we can consider the integral of f(x)g(x) taken from a to b to be the inner product of the functions f(x), g(x) in the
vector space of continuous functions on [a, b]. Bunyakovsky served as vice-president of the St.
Petersburg Academy of Sciences from 1864 until his death. In 1875, the Academy established a
mathematics prize in his name in recognition of his 50 years of teaching and research.
Schwarz stated the inequality in 1884. In his case, the vectors were functions φ, χ of two
variables in a region 7 in the plane, and the inner product of these functions was given by
Schwarz’s proof is similar to the one given in the text (page 29). Schwarz was the leading
mathematician in Berlin around the turn of the century; the work in which the inequality appears
is devoted to a question about minimal surfaces.
EXAMPLE 6 Verify the positivity property D4 of Theorem 1.3.
SOLUTION We let v = [v_1, v_2, . . . , v_n], and we find that
v · v = v_1^2 + v_2^2 + · · · + v_n^2.
A sum of squares is nonnegative and can be zero if and only if each summand
is zero. But a summand v_i^2 is itself a square, and will be zero if and only if v_i =
0. This completes the demonstration.
||v||^2 = v · v.   (11)
EXAMPLE 7 Show that the sum of the squares of the lengths of the diagonals of a
parallelogram in R^n is equal to the sum of the squares of the lengths of the
sides. (This is the parallelogram relation.)
SOLUTION We take our parallelogram with vertex at the origin and with vectors v and w
emanating from the origin to form two sides, as shown in Figure 1.25. The
lengths of the diagonals are then ||v + w|| and ||v - w||. Using Eq. (11) and
properties of the dot product, we have
||v + w||^2 + ||v - w||^2 = (v + w) · (v + w) + (v - w) · (v - w)
  = (v · v) + 2(v · w) + (w · w) + (v · v) - 2(v · w) + (w · w)
  = 2(v · v) + 2(w · w)
  = 2||v||^2 + 2||w||^2,
which is what we wished to prove.
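A numerical spot-check of this identity is immediate in MATLAB; the two vectors below are made up for illustration:

    v = [1 -2 3];  w = [4 0 -5];
    norm(v + w)^2 + norm(v - w)^2     % 110 (up to rounding)
    2*norm(v)^2 + 2*norm(w)^2         % 110 as well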
The definition of the angle θ between two vectors v and w in R^n leads naturally
to this definition of perpendicular vectors, or orthogonal vectors as they are
usually called in linear algebra.
FIGURE 1.25
The parallelogram has v + w and
v - w as vector diagonals.
EXAMPLE 8 Determine whether the vectors v = [4, 1, -2, 1] and w = [3, -4, 2, -4] are
perpendicular.
SOLUTION We have
v · w = (4)(3) + (1)(-4) + (-2)(2) + (1)(-4) = 12 - 4 - 4 - 4 = 0,
so v and w are perpendicular.
EXAMPLE 9 Suppose that a ketch is sailing at 8 knots, following a course of 010° (that is, 10°
east of north), on a bay that has a 2-knot current setting in the direction 070°
(that is, 70° east of north). Find the course and speed made good. (The
expression made good is standard navigation terminology for the actual course
and speed of a vessel over the bottom.)
SOLUTION The velocity vectors s for the ketch and c for the current are shown in Figure
1.26, in which the vertical axis points due north. We find s and c by using a
calculator and computing
s = [8 cos 80°, 8 sin 80°] = [1.39, 7.88]
and
c = [2 cos 20°, 2 sin 20°] = [1.88, 0.68].
The velocity vector made good is then v = s + c = [3.27, 8.56], so the speed made
good is ||v|| ≈ 9.16 knots, and the course made good is
90° - arctan(8.56/3.27) ≈ 90° - 69° = 21°.
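The arithmetic in this solution is easy to reproduce in MATLAB; the sketch below converts degrees to radians by hand, since the computation uses only cos, sin, and atan:

    deg = pi/180;
    s = [8*cos(80*deg)  8*sin(80*deg)];   % ketch: 8 knots on course 010
    c = [2*cos(20*deg)  2*sin(20*deg)];   % current: 2 knots setting 070
    v = s + c;
    norm(v)                               % speed made good, about 9.16 knots
    90 - atan(v(2)/v(1))/deg              % course made good, about 21 degrees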
EXAMPLE 10 Suppose the captain of our ketch realizes the importance of keeping track of
the current. He wishes to sail in 5 hours to a harbor that bears 120° and is 35
nautical miles away. That is, he wishes to make good the course 120° and the
speed 7 knots. He knows from a tide and current table that the current is
setting due south at 3 knots. What should be his course and speed through the
water?
[FIGURES 1.26 AND 1.27 Vector diagrams for Examples 9 and 10, with north along the vertical axis; in Figure 1.26, s (||s|| = 8, course 010°) and c (||c|| = 2, direction 070°) add to the vector v with ||v|| = 9.16.]
SOLUTION In a vector diagram (see Figure 1.27), we again represent the course and speed
to be made good by a vector v and the velocity of the current by c. The correct
course and speed to follow are represented by the vector s, which is obtained
by computing
s = v - c
  = [7 cos 30°, -7 sin 30°] - [0, -3]
  = [6.06, -3.5] - [0, -3] = [6.06, -0.5].
Thus the captain should steer course 90° - arctan(-0.5/6.06) ≈ 90° + 4.7° =
94.7° and should proceed at
||s|| ≈ √((6.06)^2 + (-0.5)^2) ≈ 6.08 knots.
The proofs of the Schwarz and tnangle inequalities illustrate the use of
algebraic properties of the dot product in proving properties of the norm.
Recall Eq. {11): for a vector v in R’, we have
lv! = vey.
PROOF Because the norm of a vector is a real number and the square of a real
number is nonnegative, for any scalars r and s we have
||rv + sw||^2 ≥ 0.   (12)
Expanding ||rv + sw||^2 = (rv + sw) · (rv + sw) by the properties of the dot product,
we see that
r^2(v · v) + 2rs(v · w) + s^2(w · w) ≥ 0
for all scalars r and s. In particular, setting r = w · w and s = -(v · w), we obtain
(w · w)^2(v · v) - 2(w · w)(v · w)^2 + (v · w)^2(w · w)
= (w · w)^2(v · v) - (w · w)(v · w)^2 ≥ 0.
Factoring out (w · w), we see that
(w · w)[(w · w)(v · v) - (v · w)^2] ≥ 0.   (13)
If w · w = 0, then w = 0 by the positivity property in Theorem 1.3, and the
Schwarz inequality is then true because it reduces to 0 ≤ 0. If ||w||^2 = w · w ≠ 0,
then the expression in square brackets in relation (13) must also be nonnega-
tive—that is,
(w · w)(v · v) - (v · w)^2 ≥ 0,
and so
(v · w)^2 ≤ (v · v)(w · w) = ||v||^2 ||w||^2. Taking square roots yields |v · w| ≤ ||v|| ||w||.
The Schwarz inequality can be used to prove the triangle inequality that
was illustrated in Figure 1.22.
PROOF Using properties of the dot product, as well as the Schwarz inequality,
we have
||v + w||^2 = (v + w) · (v + w)
  = (v · v) + 2(v · w) + (w · w)
  ≤ (v · v) + 2||v|| ||w|| + (w · w)
  = ||v||^2 + 2||v|| ||w|| + ||w||^2
  = (||v|| + ||w||)^2.
The desired relation follows at once, by taking square roots.
SUMMARY
Let v = [v_1, v_2, . . . , v_n] and w = [w_1, w_2, . . . , w_n] be vectors in R^n.
EXERCISES
9. u · v
10. u · (v + w)
11. (u + v) · w
12. The angle between u and v
13. The angle between u and w
14. The value of x such that [x, -3, 5] is perpendicular to u
15. The value of y such that [-3, 1, 10] is perpendicular to u
16. A nonzero vector perpendicular to both u and v
17. A nonzero vector perpendicular to both u and w

opposite direction
30. [4, 1, 2, 1, 6] and [8, 2, 4, 2, 3]
31. The distance between points (v_1, v_2, . . . , v_n) and (w_1, w_2, . . . , w_n) in R^n
    is the norm ||v - w||, where v = [v_1, v_2, . . . , v_n] and w = [w_1, w_2, . . . , w_n].
    Why is this a reasonable definition of distance?

In Exercises 32-35, use the definition given in Exercise 31 to find the indicated
distance.

32. The distance from (-1, 4, 2) to (0, 8, 1) in R^3
33. The distance from (2, -1, 3) to (4, 1, -2) in R^3

    along the two halves of the rope at the eyelet must be an upward vertical
    vector of magnitude 100.]
38. a. Answer Exercise 37 if each half of the rope makes an angle of θ with the
       vertical at the eyelet.
    b. Find the tension in the rope if both halves are vertical (θ = 0).
    c. What happens if an attempt is made to stretch the rope out straight
       (horizontal) while the 100-lb weight hangs on it?
39. Suppose that a weight of 100 lb is suspended by two different ropes tied at
    an eyelet on top of the weight, as shown in Figure 1.29. Let the angles the
    ropes make with the vertical be θ_1 and θ_2, as shown in the figure. Let the
    tensions in the ropes be T_1 for the right-hand rope and T_2 for the left-hand
    rope.
    a. Show that the force vector F_1 shown in Figure 1.29 is T_1(sin θ_1)i + T_1(cos θ_1)j.
    b. Find the corresponding expression for F_2 in terms of T_2 and θ_2.
    c. If the system is in equilibrium, F_1 + F_2 = 100j, so F_1 + F_2 must have
       i-component 0 and j-component 100. Write two equations reflecting this
       fact, using the answers to parts (a) and (b).
    d. Find T_1 and T_2 if θ_1 = 45° and θ_2 = 30°.
40. Mark each of the following True or False.
    a. Every nonzero vector in R^n has nonzero magnitude.
    b. Every vector of nonzero magnitude in R^n is nonzero.
    c. The magnitude of v + w must be at least as large as the magnitude of
       either v or w.
    d. Every nonzero vector v in R^n has exactly one unit vector parallel to it.
    e. There are exactly two unit vectors parallel to any given nonzero vector in R^n.
    f. There are exactly two unit vectors perpendicular to any given nonzero
       vector in R^2.
    g. The angle between two nonzero vectors in R^n is less than 90° if and only if
       the dot product of the vectors is positive.
    h. The dot product of a vector with itself yields the magnitude of the vector.
    i. For a vector v in R^n, the magnitude of r times v is r times the magnitude of v.
    j. If v and w are vectors in R^n of the same magnitude, then the magnitude of
       v - w is 0.
41. Prove the indicated property of the norm stated in Theorem 1.2.
    a. The positivity property
    b. The homogeneity property
||v|| + ||w|| ≥ ||v - w||
MATLAB
MATLAB has a built-in function norm(x) for computing the norm of a vector x. It
has no built-in command for finding a dot product or the angle between two vectors.
Because one purpose of these exercises is to give practice at working with MATLAB,
we will show how the norm of a vector can be computed without using the built-in
function, as well as how to compute dot products and angles between vectors.
It is important to know how to enter data into MATLAB. In Section 1.1, we
showed how to enter a vector. We have created M-files on the LINTEK disk that
can be used to enter data automatically for our exercises, once practice in manual
data entry has been provided. If these M-files have been copied into your MATLAB,
you can simply enter fbc1s2 to create the data vectors a, b, c, u, v, and w for the
exercises below. The name of the file containing them is FBC1S2.M, where the
FBC1S2 stands for “Fraleigh/Beauregard Chapter 1 Section 2.” To view this data
file so that you can create data files of your own, if you wish, simply enter type
fbc1s2 when in MATLAB. In addition to saving time, the data files help prevent
wrong answers resulting from typos in data entry.
Access MATLAB, and either enter fbc1s2 or manually enter the following vectors.
a=[-2135]] u = [2/3 -4/7 8/5]
b = [4-1235]    v = [-1/2 13/3 17/11]
c= [-103064] w = [22/7 15/2 —8/3]
If you enter these vectors manually, be sure to proofread your data entry for accuracy.
Enter help . to see what can be done with the period. Enter a .* b and compare the
resulting vector with the vectors a and b. (Be sure to get in the habit of putting a space
before the period so that MATLAB will never interpret it as a decimal point in this
context.) Now enter c .^ 3 and compare with the vector c. The symbol ^ is used to
denote exponentiation. Then enter sum(a) and note how this number was obtained
from the vector a. Using the . notation, the sum function sum(x), the square root
function sqrt(x), and the arccosine function acos(x), we can easily write formulas for
computing norms of vectors, dot products of vectors, and the angle between two
vectors.
M1. Enter x = a and then enter normx = sqrt(sum(x .* x)) to find ||a||. Compare
your answer with the result obtained by entering norm(a).
M2. Find ||b|| by entering x = b and then pressing the upward arrow until
the equation normx = sqrt(sum(x .* x)) is at the cursor, and then pressing the
Enter key.
M3. Using the technique outlined in Exercise M2, find ||u||.
M4. Using the appropriate MATLAB commands, compute the dot product v · w in
(a) short format and (b) rational format.
M5. Repeat Exercise M4 for (2u - 3v) · (4u - 7v).
NOTE. If you are working with your own personal MATLAB, you can add a function angl(x, y) for
finding the angle between vectors x and y having the same number of components to MATLAB’s
supply of available functions. First, enter help angl to be sure that MATLAB does not already have
a command with the name you will use; otherwise you might delete an existing MATLAB
function. Then, assuming that MATLAB has created a subdirectory C:\MATLAB\MATLAB on
your disk, get out of MATLAB and either use a word processor that will create ASCII files
or skip to the next paragraph. Using a word processor, create an ASCII file designated as
C:\MATLAB\MATLAB\ANGL.M by entering each of the following lines.
function z = angl(x, y)
% ANGL ANGL (x, y) is the radian angle between vectors x and y.
z = acos(sum(x .* y)/(norm(x)*norm(y)))
Then save the file. This creates a function angl(x, y) of two variables x and y in place of the name
anglxy we use in M6. You will now be able to compute the angle between vectors x and y in
MATLAB simply by entering angl(x, y). Note that the file name ANGL.M concludes with the .M
suffix. MATLAB comes with a number of these M-files, which are probably in the subdirectory
MATLAB\MATLAB of your disk. Remember to enter help angl from MATLAB first, to be sure
there is not already a file with the name ANGL.M.
If you do not have a word processor that writes ASCII files, you can still create the file from
DOS if your hard disk is the default c-drive. First enter CD C:\MATLAB\MATLAB. Then enter
COPY CON ANGL.M and proceed to enter the three lines displayed above. When you have
pressed Enter after the final line, press the F6 key and then press Enter again.
After creating the file, access MATLAB and test your function angl(x, y) by finding the angle
between the vectors [1, 0] and [-1, 0]. The angle should be π ≈ 3.1416. Then enter help angl and
you should see the explanatory note on the line of the file that starts with % displayed on the
screen. Using these directions as a model, you can easily create functions of your own to add to
MATLAB.
M6. Enter x = a, enter y = b, and then enter
    anglxy = acos(sum(x .* y)/(sqrt(sum(x .* x))*sqrt(sum(y .* y))))
to compute the angle (in radians) between a and b. You should study this
formula until you understand why it provides the angle between a and b.
M7. Compute the angle between b and c using the technique suggested in Exercise
M2. Namely, enter x = b, enter y = c, and then use the upward arrow until
the cursor is at the formula for anglxy and press Enter.
M8. Move the cursor to the formula for anglxy and edit the formula so that the
angle will be given in degrees rather than in radians. Recall that we multiply
by 180/π to convert radians to degrees. The number π is available as pi in
MATLAB. Check your editing by computing the angle between the vectors
[1, 0] and [0, 1]. Then find the angle between u and w in degrees.
M9. Find the angle between 3u - 2w and 4v + 2w in degrees.
The Notation Ax = b
     x_1 - 2x_2 = -1
    3x_1 + 5x_2 = 19
can be written in the form
    [ 1  -2 ] [ x_1 ]   [ -1 ]
    [ 3   5 ] [ x_2 ] = [ 19 ]    (3)
        A        x         b
Let us denote by A the bracketed array on the left containing the coefficients of
the linear system. This array A is followed by the column vector x of
unknowns, and let the column vector of constants after the equal sign be
denoted by b. We can then symbolize the linear system as
Ax = b. (4)
There are several reasons why notation (4) is a convenient way to write a linear
system. It is much easier to denote a general linear system by Ax = b than to
write out several linear equations with unknowns x_1, x_2, . . . , x_n, subscripted
letters for the coefficients of the unknowns, and constants b_1, b_2, . . . , b_m to the
right of the equal signs. [Just look ahead at Eq. (1) on page 54.] Also, a single
linear equation in just one unknown can be written in the form ax = b (2x = 6,
for example), and the notation Ax = b is suggestively similar. Furthermore,
we will see in Section 2.3 that we can regard such an array A as defining a
function whose value at x we will write as Ax, much as we write sin x. Solv-
ing a linear system Ax = b can thus be regarded as finding the vector x
such that this function applied to x yields the vector b. For all of these rea-
sons, the notation Ax = b for a linear system is one of the most useful nota-
tions in mathematics.
It is very important to remember that
We now introduce the usual terminology and notation for an array of numbers
such as the coefficient array A in Eq. (3).
A matrix is an ordered rectangular array of numbers, usually enclosed in
parentheses or square brackets. For example,
        [ 1  -2 ]              [ -1   0   3 ]
    A = [ 3   5 ]    and   B = [  2   1   4 ]
                               [  4   5  -6 ]
                               [ -3  -1  -1 ]
are matrices. The first subscript gives the number of the row (counting from the top); the
second subscript gives the number of the column (counting from the left).
Thus an m × n matrix A may be written as
        [ a_11  a_12  · · ·  a_1n ]
    A = [ a_21  a_22  · · ·  a_2n ]
        [   ·     ·            ·  ]
        [ a_m1  a_m2  · · ·  a_mn ]
If we want to express the matrix B on page 36 as [b_ij], we would have b_11 = -1,
b_21 = 2, b_32 = 5, and so on.
Matrix Multiplication
We are going to consider the expression Ax shown in Eq. (3) to be the product
of the matrix A and the column vector x. Looking back at Eq. (5), we see that
Such a product of a matrix A with a column vector x should be the linear
combination of the column vectors of A having as coefficients the components
in the vector x. Here is a nonsquare example in which we replace the vector x of
unknowns by a specific vector of numbers.
[ 2 -3 ; “
[-1 4 -7]] 3
tL “J
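A small MATLAB sketch of this idea, with a made-up 2 x 3 matrix and column vector: the product A*x equals the linear combination of the columns of A whose coefficients are the components of x.

    A = [2 -3 1; -1 4 -7];
    x = [5; 3; -2];
    A*x                              % [-1; 21]
    5*A(:,1) + 3*A(:,2) - 2*A(:,3)   % the same linear combination, [-1; 21]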
HISTORICAL NOTE THE TERM MATRIX is first mentioned in mathematical literature in an 1850
paper of James Joseph Sylvester (1814-1897). The standard nontechnical meaning of this term is
‘a place in which something is bred, produced, or developed.” For Sylvester, then, a matrix, which
was an “oblong arrangement of terms,” was an entity out of which one could form various square
pieces to produce determinants. These latter quantities. formed from squaie matrices, were quite
well known by this time.
James Sylvester (his original name was James Joseph) was born into a Jewish family in
London, and was to become one of the supreme algebraists of the nineteenth century. Despite
having studied for several years at Cambridge University, he was not permitted to take his degree
there because he “professed the faith in which the founder of Christianity was educated.”
Therefore, he received his degrees from Trinity College, Dublin. In 1841 he accepted a
professorship at the University of Virginia; he remained there only a short time, however, his
horror of slavery preventing him from fitting into the academic community. In 1871 he returned
to the United States to accept the chair of mathematics at the newly opened Johns Hopkins
University. In between these sojourns, he spent about 10 years as an attomey, during which time
he met Arthur Cayley (see the note on p. 3), and 15 years as Professor of Mathematics at the Royal
Military Academy, Woolwich. Sylvester was an avid poet, prefacing many of his mathematical
papers with examples of his work. His most renowned example was the “Rosalind” poem, a
400-line epic, each line of which rhymed with “Rosalind.”
The product AB is the matrix whose jth column is the product of A
with the jth column vector of B.
Because B has s columns, C has s columns. The comments after Example 1
indicate that the ith entry in the jth column of AB is the dot product of the ith
row of A with the jth column of B. We give a formal definition.
We illustrate the choice of row i from A and column k from B to find the
element cᵢₖ in AB, according to Definition 1.8, by the equation

                [ a₁₁  ···  a₁ₙ ]
                [  ⋮         ⋮  ]   [ b₁₁  ···  b₁ₖ  ···  b₁ₛ ]
AB = [cᵢₖ] =    [ aᵢ₁  ···  aᵢₙ ]   [  ⋮         ⋮         ⋮  ],
                [  ⋮         ⋮  ]   [ bₙ₁  ···  bₙₖ  ···  bₙₛ ]
                [ aₘ₁  ···  aₘₙ ]

where

cᵢₖ = aᵢ₁b₁ₖ + aᵢ₂b₂ₖ + ··· + aᵢₙbₙₖ.
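A short MATLAB check of this entry-by-entry description may help; the matrices below are small examples of our own, not ones from the text.

A = [1 2 3; 4 5 6];          % a 2 x 3 matrix
B = [7 8; 9 10; 11 12];      % a 3 x 2 matrix
C = A*B;                     % the 2 x 2 product
A(1,:)*B(:,2)                % dot product of row 1 of A with column 2 of B ...
C(1,2)                       % ... agrees with the (1,2) entry of C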
EXAMPLE 2   Let A be a 2 × 3 matrix, and let B be a 3 × 5 matrix. Find the sizes of AB and
BA, if they are defined.
SOLUTION   Because the second size-number, 3, of A equals the first size-number, 3, of B,
we see that AB is defined; it is a 2 × 5 matrix. However, BA is not defined,
because the second size-number, 5, of B is not the same as the first
size-number, 2, of A.
-» 3 nl 4-7! 2 5
4 6-o{| 3 9 1 UL
-2 3 5-3
SOLUTION The product is defined, because the left-hand matrix is 2 x 3 and the
right-hand matrix is 3 x 4; the product will have size 2 x 4. The entry in the
first row and first column position of the product is obtained by taking the dot
product of the first row vector of the left-hand matrix and the first column
vector of the right-hand matrix, as follows:
The entry in the second row and third column of the product is the dot product
of the second row vector of the left-hand matrix and the third column vector of
the right-hand one:
and so on, through the remaining row and column positions of the product.
Eight such computations show that
2 3 4 oo 7 le[3) 8 9-2
4 6-2/55 §3| 138-10
4 32
Examples 2 and 3 show that sometimes AB is defined when BA is not. Even
if both AB and BA are defined, however, it need not be true that AB = BA:
EXAMPLE 4 Let
transforms F’ into a form F” in x”, y", then the composition of the substitutions, found by
replacing x’, y’ in (i) by their values in (ii), gives a substitution transforming F into F”:
x = (ae + bg)x” + (af + bh)y” y = (ce + dg)x” + (ef + dh)y”. (iii)
The coefficient matrix of substitution (iii) is the product of the coefficient matrices of substitutions
(i) and (ii). Gauss performed an analogous computation in his study of substitutions in forms in
three variables, which produced the rule for multiplication of 3 x 3 matrices.
Gauss, however, did not explicitly refer to this idea of composition as a "multiplication."
That was done by his student Ferdinand Gotthold Eisenstein (1823-1852), who introduced the
notation S x T to denote the substitution composed of S and T. About this notation Eisenstein
wrote, "An algorithm for calculation can be based on this; it consists of applying the usual rules for
the operations of multiplication, division, and exponentiation to symbolical equations between
linear systems; correct symbolical equations are always obtained, the sole consideration being that
the order of the factors may not be altered."
_[ 410 13°55
AB =| 4 x8 and BA = ' 29}
Of course, for a square matrix A, we denote AA by A², AAA by A³, and so
on. It can be shown that matrix multiplication is associative; that is,

A(BC) = (AB)C

whenever the product is defined. This is not difficult to prove from the
definition, although keeping track of subscripts can be a bit challenging. We
leave the proof as Exercise 33, whose solution is given in the back of this text.
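Readers with MATLAB at hand can verify both facts numerically—associativity holds, while commutativity generally fails; the random matrices here are merely illustrative.

A = rand(2,3); B = rand(3,4); C = rand(4,2);   % random matrices of compatible sizes
norm(A*(B*C) - (A*B)*C)                        % essentially zero: A(BC) = (AB)C
D = rand(3); E = rand(3);
norm(D*E - E*D)                                % generally nonzero: DE and ED differ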
The n × n identity matrix is

    [ 1  0  0  ···  0 ]
    [ 0  1  0  ···  0 ]              [ 1        0 ]
I = [ 0  0  1  ···  0 ]      or      [    ⋱       ],
    [ ⋮             ⋮ ]              [ 0        1 ]
    [ 0  0  0  ···  1 ]
where the large zeros above and below the diagonal in the second matrix
indicate that each entry of the matrix in those positions is 0. If A is any m x n
matrix and B is any n X s matrix, we can show that
AI=A and IB=B.
We can understand why this is so if we think about how the columns of AI and the rows of IB are computed: the jth column of AI is A times the jth column of I, which is the jth column of A, and similarly each row of IB is the corresponding row of B.
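A one-line MATLAB check of AI = A and IB = B, using an illustrative 2 × 3 matrix of our own:

A = [1 2 3; 4 5 6];          % any m x n matrix will do
isequal(A*eye(3), A)          % AI = A
isequal(eye(2)*A, A)          % IB = B (with B = A here)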
Let A = [aᵢⱼ] and B = [bᵢⱼ] be two matrices of the same size m × n. The
sum A + B of these two matrices is the m × n matrix C = [cᵢⱼ], where
cᵢⱼ = aᵢⱼ + bᵢⱼ.
That is, the sum of two matrices of the same size is the matrix of that
size obtained by adding corresponding entries.
EXAMPLE 5 Find
EXAMPLE 6 Find
1-3 -5§ 4 6
; Aba 307 ‘|
SOLUTION The sum is undefined, because the matrices are not the same size. a
For every m × n matrix A and the m × n matrix O all of whose entries are 0, we have

A + O = O + A = A.

The matrix O is called the m × n zero matrix; the size of such a zero matrix is
made clear by the context.
Let A = [aᵢⱼ], and let r be a scalar. The product rA of the scalar r and the
matrix A is the matrix B = [bᵢⱼ] having the same size as A, where
bᵢⱼ = raᵢⱼ.
EXAMPLE 7 Find
| 72 |
3 -S|
For matrices A and B of the same size, we define the difference A − B to be
A − B = A + (−1)B.
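The three operations just defined—sum, scalar multiple, and difference—are entrywise, as a brief MATLAB sketch with matrices of our own choosing shows:

A = [1 -3 2; 4 0 -1];
B = [5  2 -2; 1 7  3];
A + B                        % add corresponding entries
2*A                          % multiply every entry by the scalar 2
A - B                        % the same as A + (-1)*B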
EXAMPLE 8 If
_{3 -l 4
A=|9 _ and B=|_/-1 1 0 4]
1
find 2A - 3B.
SOLUTION We find that
_f 9-2 -7
24 3B =| 5 10 3h .
We introduced the transpose operation to change a row vector to a column
vector, or vice versa, in Section 1.1. We generalize this operation for
application to matrices, changing all the row vectors to column vectors, which
results in all the column vectors becoming row vectors.
EXAMPLE 9   Find Aᵀ if

A = [  1  4  5 ]
    [ -3  2  7 ].

SOLUTION   We have

     [ 1 -3 ]
Aᵀ = [ 4  2 ].
     [ 5  7 ]
EXAMPLE 10   Fill in the missing entries in

[  5  -6       8 ]
[      3      11 ]
[ -2   1   0     ]
[     11   4  -1 ]

to make it symmetric.

SOLUTION   A symmetric matrix must satisfy aᵢⱼ = aⱼᵢ, so each missing entry is the entry in the
mirror-image position across the main diagonal:

[  5  -6  -2   8 ]
[ -6   3   1  11 ]
[ -2   1   0   4 ]
[  8  11   4  -1 ].
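In MATLAB the transpose is written with a prime, which makes both Example 9 and the symmetry test easy to check:

A = [1 4 5; -3 2 7];                              % the matrix of Example 9
A'                                                % its transpose, a 3 x 2 matrix
S = [5 -6 -2 8; -6 3 1 11; -2 1 0 4; 8 11 4 -1];
isequal(S, S')                                    % true: S equals its transpose, so S is symmetric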
For handy reference, we box the properties of matrix algebra and of the
transpose operation. These properties are valid for all vectors, scalars,
and matrices for which the indicated quantities are defined. The exercises
ask for proofs of most of them. The proofs of the properties of matrix
algebra not involving matrix multiplication are essentially the same as the
proofs of the same properties presented for vector algebra in Section 1.1.
We would expect this because those operations are performed just on corresponding entries, and every vector can be regarded as either a 1 × n or an n × 1 matrix.
EXAMPLE 11 Prove that A(B + C) = AB + AC for any m X n matrix A and any n X s matrices
B and C.
SOLUTION   Let A = [aᵢⱼ], B = [bⱼₖ], and C = [cⱼₖ]. Note the use of j, which runs from 1 to n, as
both the second index for entries in A and the first index for the entries in B
and C. The entry in the ith row and kth column of A(B + C) is

aᵢ₁(b₁ₖ + c₁ₖ) + aᵢ₂(b₂ₖ + c₂ₖ) + ··· + aᵢₙ(bₙₖ + cₙₖ)
   = (aᵢ₁b₁ₖ + aᵢ₂b₂ₖ + ··· + aᵢₙbₙₖ) + (aᵢ₁c₁ₖ + aᵢ₂c₂ₖ + ··· + aᵢₙcₙₖ),

which is precisely the entry in the ith row and kth column of AB + AC.
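A numerical spot-check of the distributive law proved in Example 11, with random matrices of compatible sizes chosen only for illustration:

A = rand(2,3); B = rand(3,4); C = rand(3,4);
norm(A*(B + C) - (A*B + A*C))        % essentially zero, as the proof predicts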
SUMMARY
1. An m × n matrix is an ordered rectangular array of numbers containing m rows and n columns.
2. An m × 1 matrix is a column vector with m components, and a 1 × n matrix is a row vector with n components.
3. The product Ab of an m × n matrix A and a column vector b with components b₁, b₂, ..., bₙ is the column vector equal to the linear combination of the column vectors of A, where the scalar coefficient of the jth column vector of A is bⱼ.
4. The product AB of an m × n matrix A and an n × s matrix B is the m × s matrix C whose jth column is A times the jth column of B. The entry cᵢⱼ in the ith row and jth column of C is the dot product of the ith row vector of A and the jth column vector of B. In general, AB ≠ BA.
5. If A = [aᵢⱼ] and B = [bᵢⱼ] are matrices of the same size, then A + B is the matrix of that size with entry aᵢⱼ + bᵢⱼ in the ith row and jth column.
6. For any matrix A and scalar r, the matrix rA is found by multiplying each entry in A by r.
7. The transpose of an m × n matrix A is the n × m matrix Aᵀ, which has as its kth row vector the kth column vector of A.
8. Properties of the matrix operations are given in boxed displays on page 45.
") EXERCISES
In Exercises 1-16. let
3A 9. (24)(5C) ,
=
a. Find A’.
OB
A (ODYAB) b. Find A’.
RYN
A+B
i. 4 , 18. Let
B+C 2. (AC) 0 0 —|
4A —- 2B 14. ADB 2 0 O
AB 15. (A™)A a. Find A’,
SY
19, Consider the row and column vectors vector c and a matrix A as a linear
combination of vectors. {Hint: Consider
4
x = {-2, 3, -l] andy = |
((eA)")".]
3
Compute the matrix products xy and yx. In Exercises 25-34, prove that the given relation
40. Fill in the missing entries in the 4 x 4 holds for all vectors, matrices, and scalars for
matrix which the expressions are defined.
1 -1 5
25. A+B=Bt+A
4 8
2-7 -1 26. (A+ BY+ C=A+(Bt+C)
6 3 27. (r+ s\A=rA+ SA
so that the matrix is symmetric. 28. (rs)A = r(sA)
29. A(B + C) = AB + AC
21. Mark each of the following True or False.
The statements involve matrices A, B, and 30. (A?) =
C that are assumed to have appropriate 31. {4 + B)’ = A’ + B
$1ze. 32. (AB)? = BTAT
a. IfA = B, then AC = BC.
b. if AC = BC, then A = B. 33. (AB)C = A(BC)
c. If AB = O, thenA = Oor B= 0. 34. (rA)B = A(rB) = r(AB)
d. ifA+C=B+C, ihenA=8B 35. If Bis anm X n matrix and ifB = A’, find
e. If A? = J, thenA = +1. the size of
f. If B = A? and ifA isn x n and a. A,
symmetric, then b, = 0 for b. AA?,
i=1,2,...,n. c. ATA.
__.g. If AB = C and if two of the matrices are
36. Let v and w be column vectors in R". What
square, then so Is the third.
is the size of vw”? What relationships hold
__h. If AB = C and if C is a column vector,
between yw’ and wv’?
then so is B.
—— i. If A? = J, then A’ = / for all integers 37. The Hilbert matrix H, is the n X n matrix
n= 2. [hy], where h, = 1/(i + j — 1). Prove that the
____ j. If A? = J, then A” = / for all even integers matrix H, is symmetric.
n= 2. 38. Prove that, if A is a square matrix, then the
22. a. Prove that, if A is a matrix and x is a row matrix A + A’ is symmetric.
vector, then xA (if defined) is again a row 39. Prove that, if A is a matrix, then the matrix
vector. AA’ is symmetric.
b. Prove that, if A is a matrix and y isa 40. a. Prove that, if A is a square matrix, then
column vector, then Ay (if defined) is (A*\7 = (AT) and (42)" = (A7). [Hint:
again a column vector. Don’t try to show that the matrices have
23. Let A be an m X n matrix and let b and c be equal entries; instead use Exercise 32.]
column vectors with n components. Express b. State the generalization of part (a), and
the dot product (Ab) : (Ac) as a product of give a proof using mathematical
matrices. induction (see Appendix A).
24. The product Ab of a matrix and a column 41. a. Let A bean m X n matnx, and let e, be
vector is equal to a linear combination of the n x | column vectcr whose jth
columns of A where the scalar coefficient of component is 1 and whose other
the jth column of A is 8;. In a similar components are 0. Show that Ae, is the
fashion, describe the product c4 of a row jth column vector of A.
b. Let .A and B be matrices of the same size. 46. Find all values of r for which
i. Prove that, if Ax = 0 (the zero vector) 2001 101
for all x, then A = OQ, the zero maitiix. 070 commutes with 010
{Hint: Use part (a).] 002 101 ‘
ii. Prove that, if Ax = Bx for all x, then
A = B. {Hint: Consider A - B.] = The software LINTEK includes a routine,
42. Let A and B be square matrices. |s MATCOMP, that performs the matrix operations
(A + BY =A? + 24B + B? described in this section. Let
ar . 4 6 0 1-9
If so, prove it; if not, give a counterexample 211 $$ 2-5
and state under what conditions the =|-1 2-4 5 7
equation is true. 012-8 4 3
43. Let A and B be square matrices. Is 10 4 6 2-5
(A + B)(A — B) = A? — BY? and
If so, prove it; if not, give a counterexample -8 15 4-11
and state under what conditions the 3 5 6 -2
equation is true. B=| 0-1 12 5}.
44. Ann Xn matrix C is skew symmetric if 113-15 7
C7 = -C. Prove that every square matrix A L6-8 0 -5!
can be written uniquely as A = B + C where
B is symmetric and C is skew symmetric. Use MATCOMP in LINTEK to enter and store
these matrices, and then compute ihe matrices in
o, _ Exercises 47-54, if they are defined. Write down
Matrix A commutes with matrix B if AB = BA. to hand in, if requested, the entry in the 3rd row,
. 4th column of the matrix.
45. Find all values of r for which
2001 CooL 47, AA+A 50. BA? 53. (2A) — AS
MATLAB
To enter a matrix in MATLAB, start with a left bracket [ and then type the entries
across the rows, separating the entries by spaces and separating the rows by
semicolons. Conclude with a right bracket ]. To illustrate, we would enter the matrix
     [ -1   5 ]
A =  [ 13  -4 ]       as       A = [-1 5; 13 -4; 7 0]
     [  7   0 ]
and MATLAB would then print it for us to proofread. Recall that to avoid having
data printed again on the screen, we type a semicolon at the end of the data before
pressing the Enter key. Thus if we enter
A = [-1 5; 13 -4; 7 0];
the matrix A will not be printed for proofreading.
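Once a matrix has been entered, the operations of this section are typed much as they are written; here is a brief sketch (the matrix B below is our own illustrative choice):

A = [-1 5; 13 -4; 7 0];      % the 3 x 2 matrix entered above
B = [2 1; 0 3; -4 6];        % another 3 x 2 matrix
A + B                        % matrix sum
3*A                          % scalar multiple
A'                           % transpose, a 2 x 3 matrix
A'*B                         % a 2 x 2 matrix product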
In MATLAB, A(i, j) is the entry in the ith row and jth column of A, while A(k) is
the kth entry in A where entries are numbered consecutively starting at the upper
left and proceeding down the first column, then down the second column, etc.
Access your MATLAB, and enter the matrices A and B given before Exercises 47-54.
(We ask you to enter them manually this time to be sure you know how to enter
matrices.) Proofread the data for A and B. If you find, for example, that you entered 6
rather than 5 for the entry in the 2nd row, 3rd column of A, you can correct your
error by entering A(2,3) = 5.
M1. Exercises 47-54 are much easier to do with MATLAB than with LINTEK,
because operations in LINTEK must be specified one at a time. Find the
element in the 3rd row, 4th column of the given matrix.
a. B(2A)
b. AB(AB)"
c. (2A) — A
M2. Enter B(8). What answer did you get? Why did MATLAB give that answer?
M3. Enter help : to review the uses of the colon with matrices. Mastery of use of
the colon is a real timesaver in MATLAB. Use the colon to set C equal to the
5 x 3 matrix consisting of the 3rd through the 5th columns of A. Then
compute CᵀC and write down your answer.
M4. Form a 5 x 9 matrix D whose first five columns are those of A and whose last
four columns are those of B by
a. entering D = [A B], which works when A and B have the same number of
rows,
b. first entering D = A and then using the colon to specify that columns 6
through 9 of D are equal to B. Use the fact that A(:, j) gives the jth column
of A.
Write down the entry in the 2nd row, 5th column of DᵀD.
M5. Form a matrix E consisting of B with two rows of zeros put at the bottom
and write down the entry in the 2nd row, 3rd column of EᵀE by
a. entering Z = zeros(2,4) and then E = [B; Z], which works when B and Z
have the same number of columns,
b. first entering E = B and then using the colon to specify that rows 6
through 7 of E are equal to Z. Use the fact that A(i, :) gives the ith row
of A.
M6. In mathematics, "mean" stands for "average," so the mean of the numbers 2,
4, and 9 is their average (2 + 4 + 9)/3 = 5. In MATLAB, enter help mean to
see what that function gives, and then enter mean(A). Figure out a way to
have MATLAB find the mean (average) of all 25 numbers in the matrix A.
Find that mean and write down your answer.
M7. If F is a 10 x 10 matrix with random entries from 0 to 1, approximately
what would you expect the mean value of those entries to be? Enter help rand,
read what it says, and then generate such a matrix F. Using the idea in
Exercise M6, compute the mean of the entries in F, and write down your
answer. Repeat this several times.
M8. Enter help ones and read what it says. Write down a statement that you could
enter in MATLAB to form from the matrix F of Exercise M7 a 10 x 10
matrix G that has random entries from —4 to 4. Using the ideas in Exercise
M6, find and write down the mean of the entries in the matrix G. Repeat this
several times.
M9. In MATLAB, entering mesh(X) will draw a three-dimensional picture
indicating the relative values of the entries in a matrix X, much as entering
plot(a) draws a two-dimensional picture for the entries in a vector a. Enter
I = eye(16); mesh(I) to see a graphic for the 16 x 16 identity matrix. Then
enter mesh(rot90(I)). Enter help rot90 and help triu. Enter X = triu(ones(14));
mesh(X). Then enter mesh(rot90(X)) and finally mesh(rot90(X,-1)).
Enter now

[X, Y] = meshgrid(-3:1:3, -2:0.5:2);                              (8)

then MATLAB will create two matrices X and Y containing, respectively, the
x-coordinates and y-coordinates of a grid of points in the region -3 ≤ x ≤ 3 and
-2 ≤ y ≤ 2. Because the x-increment is 1 and the y-increment is 0.5, we see that
both X and Y will be 9 x 7 matrices.
Enter now

Z = X.*X + Y.*Y;                                                  (9)

and the entry of Z corresponding to the grid point (x, y) will be x² + y². Entering

mesh(Z)                                                           (10)

will then create the mesh graph of z = x² + y² over the region -3 ≤ x ≤ 3, -2 ≤ y ≤ 2.
M10. Using the up arrow, modify Eq. (8) to make both the x-increment and the
y-increment 0.2. After pressing the Enter key, use the up arrow to get Eq. (9)
and press the Enter key to form the larger matrix Z for these new grid
points. Then create the mesh graph using Eq. (10).
M11. Modify Eq. (9) and create the mesh graph for z = x² - y².
M12. Change Eq. (8) so that the region will be -3 ≤ x ≤ 3 and -3 ≤ y ≤ 3, still
with 0.2 for increments. Form the mesh graph for z = 9 - x² - y².
M13. Mesh graphs for cylinders are especially nice. Draw the mesh graphs for
a. z = x²
b. z = y².
M14. Change the mesh domain to -4π ≤ x ≤ 4π, -3 ≤ y ≤ 3 with x-increment
0.2 and y-increment 6. Recall that π can be entered as pi. Draw the mesh
graphs for
a. the cylinder z = sin(x),
b. the cylinder z = x sin(x), remembering to use .*, and
c. the function z = y sin(x), which is not a cylinder but is pretty.
1.4 SOLVING SYSTEMS OF LINEAR EQUATIONS

FIGURE 1.31   The plane x + y + z = 1.
We know that a single linear equation in two unknowns has a line in the
plane as its solution set. Similarly, a single linear equation in three unknowns
has a plane in space as its solution set. The solution set of x + y + z = 1 is the
plane sketched in Figure 1.31. This geometric analysis can be extended to an
equation that has more than three variables, but it is difficult for us to
represent the solution set of such an equation graphically.
Two lines in the plane usually intersect at a single point; here the word
usually means that, if the lines are selected in some random way, the chance
HISTORICAL NOTE Systems of Linear Equations are found in ancient Babylonian and
Chinese texts dating back well over 2000 years. The problems are generally stated in real-life
terms, but it is clear that they are artificial and designed simply to train students in mathematical
procedures, As an example of a Babylonian problem, consider the following, which has been
slightly modified from the original found on a clay tablet from about 300 8.c.: There are two fields
whose total area is | 800 square yards. One produces grain at a rate of = bushel per square yard, the
other at a rate of= bushel per square yard. The total yield of the two fields is 1100 bushels. What is
the size of each field? This problem leads to the system
x+ y= 1800
x + Jy = 1100,
A typical Chinese problem, taken from the Han dynasty text Nine Chapters of the
Mathematical Art (about 200 B.c.), reads as follows: There are three classes of corn, of which three
bundles of the first class, two of the second, and one of the third make 39 measures. Two of the
first, three of the second, and one of the third make 34 measures. And one of the first, two of the
second, and three of the third make 26 measures. How many measures of grain are contained in
one bundle of each class? The system of equations here is
3x + 2y +  z = 39
2x + 3y +  z = 34
 x + 2y + 3z = 26.
FIGURE 1.32   2x - 3y = 4 is parallel to 2x - 3y = 6.
FIGURE 1.33   2x - 3y = 4 and -4x + 6y = -8 are the same line.
that they either are parallel (have empty intersection) or coincide (have an
infinite number of points in their intersection) is very small. Thus we see that a
system of two randomly selected equations in two unknowns can be expected
to have a unique solution. However, it is possible for the system to have no
solutions or an infinite number of solutions. For example, the equations

2x - 3y = 4
2x - 3y = 6

correspond to distinct parallel lines, as shown in Figure 1.32, and the system
consisting of these equations has no solutions. Moreover, the equations

 2x - 3y =  4
-4x + 6y = -8

correspond to the same line, as shown in Figure 1.33. All points on this line are
solutions of this system of two equations. And because it is possible to have
any number of lines in the plane—say, fifty lines—pass through a single point,
it is possible for a system of fifty equations in only two unknowns to have a
unique solution.
Similar illustrations can be made in space, where a linear equation has as
its solution set a plane. Three randomly chosen planes can be expected to have
a unique point in common. Two of them can be expected to intersect in a line
(see Figure 1.34), which in turn can be expected to meet the third plane at a
single point. However, it is possible for three planes to have no point in
common, giving rise to a linear system with no solutions. It is also possible for
all three planes to contain a common line, in which case the corresponding
linear system will have an infinite number of solutions.
FIGURE 1.34
Two planes intersecting in a line.
a₁₁x₁ + a₁₂x₂ + ··· + a₁ₙxₙ = b₁
a₂₁x₁ + a₂₂x₂ + ··· + a₂ₙxₙ = b₂
  ⋮                           ⋮                                   (1)
aₘ₁x₁ + aₘ₂x₂ + ··· + aₘₙxₙ = bₘ

[ a₁₁  a₁₂  ···  a₁ₙ | b₁ ]
[ a₂₁  a₂₂  ···  a₂ₙ | b₂ ]
[  ⋮                 |  ⋮ ]                                       (3)
[ aₘ₁  aₘ₂  ···  aₘₙ | bₘ ]
The author then instructs the reader to multiply the middle column by 3 and subsequently to
subtract the right column "as many times as possible"; the same is to be done to the left column.
The new diagrams are then
 1   0   3            0   0   3
 2   5   2            4   5   2
 3   1   1    and     8   1   1
26  24  39           39  24  39
The next instruction is to multiply the left column by 5 and then to subtract the middle column as
many times as possible. This gives
 0   0   3
 0   5   2
36   1   1
99  24  39
The system has thus been reduced to the system 3x + 2y + z = 39, 5y + z = 24, 36z = 99, from
which the complete solution is easily found.
the ith equation by —s and adding it to the new jth equation (an R3 operation),
we see that the original system has all the solutions of the new one. Hence R3,
too, does not alter the solution set of system (1).
These procedures applied to system (1) correspond to elementary row
operations applied to augmented matrix (3). We list these in a box together
with a suggestive notation for each.
Row-Echelon Form
1. All rows containing only zeros appear below rows with nonzero
entries.
2. The first nonzero entry in any row appears in a column to the right
of the first nonzero entry in any preceding row.
For such a matrix, the first nonzero entry in a row is the pivot for that
row.
Cow >
oor ht
Wr
—
Oo
SOLUTION   Matrix A is not in row-echelon form, because the second row (consisting of all
zero entries) is not below the third row (which has a nonzero entry).
Matrix B is not in row-echelon form, because the first nonzero entry in the
second row does not appear in a column to the right of the first nonzero entry
in the first row.
Matrix C is in row-echelon form, because both conditions of Definition
1.12 are satisfied. The pivots are —1 and 3.
Matrix D satisfies both conditions as well, and is in row-echelon form. The
pivots are the entries 1.
Solutions of Hx = c
|-3
|-2
x=] 61,
or equivalently, x, = —3,x,=6,x%,=-2. os
Xs= |.
We solve each equation for the variable corresponding to the colored pivot in
the inatrix. Thus we obtain
xX, = 3x, — 5x, + 4
xX; = 1.
Xy r
X=!x,;|}=| —2s—7 | for any scalars rand s. (5)
X4
Xs |
We call x, and x, free variables, and we refer to Eq. (5) as the general solution of
the system. We obtain particular solutions by setting r and s equal to specific
values. For example, we obtain
2 [ 25
| 2
—9|forr=s=1 and |-1]forr
= 2,5 = —3. =
| | 3
| l
Gauss Reduction of Ax = b to Hx = c
reduction with back substitution. In the box below, we give an outline for
reducing a matrix A to row-echelon form.
—1
0-1 2
0-1 2
  [ 1 -2  1 -1 ]
~ [ 0  0  1 -2 ]    Add row 2 to rows 3 and 4, to obtain the final matrix.
  [ 0  0 -1  2 ]
  [ 0  0 -1  2 ]

    R₃ → R₃ + 1R₂
    R₄ → R₄ + 1R₂

  [ 1 -2  1 -1 ]
~ [ 0  0  1 -2 ]
  [ 0  0  0  0 ]
  [ 0  0  0  0 ]

This last matrix is in row-echelon form, with both pivots equal to 1.
2x₁ + 3x₂ -  x₃ =  7
4x₁ + 5x₂ - 2x₃ = 10.
HISTORICAL NOTE   THE GAUSS SOLUTION METHOD is so named because Gauss described it in a
paper detailing the computations he made to determine the orbit of the asteroid Pallas. The
parameters of the orbit had to be determined by observations of the asteroid over a 6-year period
from 1803 to 1809. These led to six linear equations in six unknowns with quite complicated
coefficients. Gauss showed how to solve these equations by systematically replacing them with a
new system in which only the first equation had all six unknowns, the second equation included
five unknowns, the third equation only four, and so on, until the sixth equation had but one. This
last equation could, of course, be easily solved; the remaining unknowns were then found by back
substitution.
From the last augmented matrix, we could proceed to write the corresponding
equations (as in Example 2) and to solve in succession for x₃, x₂, and x₁ by back
substitution. However, it makes sense to keep using our shorthand, without
writing out variables, and to do our back substitution in terms of augmented
matrices. Starting with the final augmented matrix in the preceding set, we
obtain
[ 2 3 -1 |  7 ]   [ 2 3 0 | 10 ]     R₁ → R₁ + 1R₃;  R₂ → R₂ + 3R₃
[ 0 1 -3 | -5 ] ~ [ 0 1 0 |  4 ]
[ 0 0  1 |  3 ]   [ 0 0 1 |  3 ]     (This shows that x₂ = 4.)

[ 2 3 0 | 10 ]   [ 2 0 0 | -2 ]      R₁ → R₁ - 3R₂
[ 0 1 0 |  4 ] ~ [ 0 1 0 |  4 ]
[ 0 0 1 |  3 ]   [ 0 0 1 |  3 ]      (This shows that x₁ = -1.)
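The back substitution just carried out can also be left to MATLAB's rref command; here is a minimal sketch using the augmented matrix reached above:

Ab = [2 3 -1 7; 0 1 -3 -5; 0 0 1 3];   % the row-echelon augmented matrix of this example
rref(Ab)                               % last column gives x1 = -1, x2 = 4, x3 = 3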
EXAMPLE 7   Determine whether the vector b = [1, -7, -4] is in the span of the vectors v =
[2, 1, 1] and w = [1, 3, 2].
SOLUTION   We know that b is in sp(v, w) if and only if b = x₁v + x₂w for some scalars x₁ and
x₂. This vector equation is equivalent to the linear system

[ 2 1 ]          [  1 ]
[ 1 3 ] [x₁]  =  [ -7 ].
[ 1 2 ] [x₂]     [ -4 ]
Reducing the appropriate augmented matrix, we obtain

[ 2 1 |  1 ]    [ 1 3 | -7 ]    [ 1  3 | -7 ]    [ 1 0 |  2 ]
[ 1 3 | -7 ] ~  [ 2 1 |  1 ] ~  [ 0 -5 | 15 ] ~  [ 0 1 | -3 ].
[ 1 2 | -4 ]    [ 1 2 | -4 ]    [ 0 -1 |  3 ]    [ 0 0 |  0 ]

     R₁ ↔ R₂         R₂ → R₂ - 2R₁     R₂ → -(1/5)R₂
  (to avoid          R₃ → R₃ - 1R₁     R₁ → R₁ - 3R₂
   fractions)                          R₃ → R₃ + 1R₂
The left side of the final augmented matrix is in reduced row-echelon form.
From the solution x₁ = 2 and x₂ = -3, we see that b = 2v - 3w, which is indeed
in sp(v, w).
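For readers following along in MATLAB, the same span test can be done with rref:

v = [2; 1; 1];  w = [1; 3; 2];  b = [1; -7; -4];
rref([v w b])            % last column reads 2, -3, 0: so b = 2v - 3w is in sp(v, w)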
The linear system Ax = b displayed in Eq. (1) can be written in the form
    [ a₁₁ ]      [ a₁₂ ]            [ a₁ₙ ]   [ b₁ ]
x₁  [ a₂₁ ] + x₂ [ a₂₂ ] + ··· + xₙ [ a₂ₙ ] = [ b₂ ].
    [  ⋮  ]      [  ⋮  ]            [  ⋮  ]   [  ⋮ ]
    [ aₘ₁ ]      [ aₘ₂ ]            [ aₘₙ ]   [ bₘ ]
A matrix in row-echelon form with all pivots equal to 1 and with zeros
above as well as below each pivot is said to be in reduced row-echelon form.
Thus the Gauss—Jordan method consists of using elementary row operations
on an augmented matrix [A | b] to bring the coefficient matrix A into reduced
HISTORICAL NOTE   THE JORDAN HALF OF THE GAUSS-JORDAN METHOD is essentially a syste-
matic technique of back substitution. In this form, it was first described by Wilhelm Jordan
(1842-1899), a German professor of geodesy, in the third (1888) edition of his Handbook of
Geodesy. Although Jordan’s arrangement of his calculations is different from the one presented
here, partly because he was always applying the method to the symmetric system of equations
arising out of a least-squares application in geodesy (see Section 6.5), Jordan’s method uses the
same arithmetic and arrives at the same answers for the unknowns.
Wilhelm Jordan was prominent in his field in the late nineteenth century, being involved in
several geodetic surveys in Germany and in the first major survey of the Libyan desert. He was the
founding editor of the German geodesy journal and was widely praised as a teacher of his subject.
His interest in finding a systematic method of solving large systems of linear equations stems from
their frequent appearance in problems of triangulation.
PROOF   If [H | c] has an ith row with all entries 0 to the left of the partition and
a nonzero entry cᵢ to the right of the partition, the corresponding ith equation
in the system Hx = c is 0x₁ + 0x₂ + ··· + 0xₙ = cᵢ, which has no solutions;
therefore, the system Ax = b has no solutions, by Theorem 1.6. The next
paragraph shows that, if H contains no such row, we can find a solution to
the system. Thus the system is inconsistent if and only if H contains such
a row.
Assume now that [H | c] has no row with all entries 0 to the left of the
partition and a nonzero entry to the right. If the ith row of [H | c] is a zero row
vector, the corresponding equation 0x₁ + 0x₂ + ··· + 0xₙ = 0 is satisfied for
all values of the variables xⱼ, and thus it can be deleted from the system Hx = c.
Assume that this has been done wherever possible, so that [H | c] has no zero
row vectors. For each j such that the jth column has no pivot, we can set xⱼ
equal to any value we please (as in Example 4) and then, starting from the last
remaining equation of the system and working back to the first, solve in
succession for the variables corresponding to the columns containing the
pivots. If some column j has no pivot, there are an infinite number of solutions,
because xⱼ can be set equal to any value. On the other hand, if every column
has a pivot (as in Examples 2, 6, and 7), the value of each xⱼ is uniquely
determined.
With reference to item (3) of Theorem 1.7, the number of free variables in
the solution set of a system Ax = b depends only on the system, and not on the
way in which the matrix A is reduced to row-echelon form. This follows from
the uniqueness of the reduced row-echelon form. (See Exercise 33 in Section
2.3.)
Elementary Matrices
The elementary row operations we have performed can actually be carried out
by means of matrix multiplication. Although it is not efficient to row-reduce a
matrix by multiplying it by other matrices, representing row reduction as a
product of matrices 1s a useful theoretical tool. For example, we use elemen-
tary matrices in Section 1.5 to show that, for square matrices A and C, if AC =
I, then CA = I. We use them again in Section 4.2 to demonstrate the
multiplicative property of determinants, and again in Section 10.2 to exhibit a
factorization of some square matrices A into a product LU of a lower-
triangular matrix L and an upper-triangular matrix U.
If we interchange its second and third rows, the 3 x 3 identity matrix

    [ 1 0 0 ]                      [ 1 0 0 ]
I = [ 0 1 0 ]     becomes    E =   [ 0 0 1 ].
    [ 0 0 1 ]                      [ 0 1 0 ]
EXAMPLE 8 Let
     [ 0 1 -3 ]
A =  [ 2 3 -1 ].
     [ 4 5 -2 ]
Find a matrix C such that CA is a matrix in row-echelon form that is row
equivalent to A.
SOLUTION   We row reduce A to row-echelon form H and write down, for each row
operation, the elementary matrix obtained by performing the same operation
on the 3 x 3 identity matrix.
14 § -2, I-20]
> 3-4] ‘1 0 0
~ 0 l —3 R,—> R, + IR, E, = 0 1 0
10-1 0] lO td
7 3-1
~10 1 -3/=H.
0 0-3,
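As a small MATLAB sketch of the idea, the elementary matrix for one natural first step—interchanging rows 1 and 2 of the matrix A of Example 8—can be built from the identity and applied by multiplication; the remaining steps of the reduction would be handled in the same way.

A = [0 1 -3; 2 3 -1; 4 5 -2];          % the matrix of Example 8
E1 = eye(3);
E1([1 2],:) = E1([2 1],:);             % elementary matrix: interchange rows 1 and 2 of I
E1*A                                   % performs that same row interchange on A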
SUMMARY
EXERCISES
In Exercises 1-6, reduce the matrix to (a) row-echelon form, and (b) reduced row-echelon form. Answers to (a) are not unique, so your answer may differ from the one at the back of the text.

In Exercises 7-12, describe all solutions of a linear system whose corresponding augmented matrix can be row-reduced to the given matrix. If requested, also give the indicated particular solution, if it exists.
Ww
wr
3 4 —2 5
13. 2x- y= 8 |.
6x — Sy = 32 26. b= 4}
[14 2 fz
14. 4x, - 3x, = 10
5 [1 2 [-3
moe) ie| Zhe] Sine |d
8x, — xX, = 10
15. y+ z= 6
3x -—yt z= -7 | a13, | 0 5 ~8
x+y-3z=
-13
16. 2x+ y-3z= 0 v,= 0
=|
6x+3y-8z= 9 —4|
2x- yt5z=-4 [ 2 1 -3 I
17, x,-— 2x,= 3 28. b= atta ™ he 7
3x, — xX, = 14
| 7 4 -9 4
(4
x, — 1x,
= -2
2
18 x,-— 3x, + x, =2
0
3x, — 8x, + 2x, = 5 * 10
19. x, +4x,-2x,= 4
29. Mark each of the following True or False.
2x,
+ 71x, - xX;
= -2
___ a. Every linear system with the same
2x, + 9X,- 7x; = 1
number of equations as unknowns has a
20. x, _ 3x; + 2x; —_ X4 = 8 unique solution.
—_._ b. Every linear system with the same
number of equations as unknowns has at
least one solution.
In Exercises 21-24, find ail solutions of the iinear
___ ¢. A linear system with more equations than
system, using the Gauss-Jordan method. unknowns may have an infinite number
of solutions.
21. 3x, — 2x, = -8 __. d. A linear system with fewer equations
than unknowns may have no solution.
4x, + 5x, = -3
___e. Every matrix is row equivalent to a
22. 2x, + 8x, = 16 unique matrix in row-echelon form.
5x, — 4x, = -8 __ f. Every matrix is row equivalent to a
23. Xx, —2x,+ x, =6 unique matrix in reduced row-echelon
form.
2X,- x,+ «4, -3x,=0 ___ g. If [A | b] and [B | c] are row-equivalent
9x, — 3x,- x, - 7x,= 4 partitioned matrices, the linear systems
Ax = b and Bx = c have the same 40. Determine all values of the 5, that make the
solution set. linear system
. A linear system with a square coefficient
matrix A has a unique solution if and xX, +X, —- x; = 5,
only if A is row equivalent to the identity 2x, + x3= b,
matrix. X, — X; = b;
i. A linear system with coefficient matrix A
has an infinite number of solutions if and consistent.
only if A can be row-reduced to an 41. Determine all values 5,, b,, and 5, such that
echelon matrix that includes some b = (b,, b,, 55] lies in the span of v, =
column containing no pivot. (1, l, 0], VY, = [3, -l, 4], and y=
. Acconsistent linear system with coefficient {-1, 2, —3].
matrix A has an infinite number of 42. Find an elementary matrix £ such that
solutions if and only if A can be
row-reduced io an echelon matnx that yaa 1 3 1 4
includes some column containing no E012 1);=)0 1 2 1].
pivot. 3451) {0 -5 2-1)
43. Find an elementary matrix £ such that
35.
[ =I} In Exercises 46-51, let A bea 4 x 4 matrix. Find
36.
afd a matrix C such that the result of applying the
given sequence of elementary row operations to A
can also be found by computing the product CA.
37.
Ee} 3)=[o
38. Determine all values of the b, that make the
46. Interchange row | and row 2.
47, Interchange row ! and row 3; multiply row 3
linear system by 4.
x, + 2x, = b, 48. Multiply row | by 5; interchange rows 2 and
3; add 2 times row 3 to row 4.
3x, + 6x. = dB,
49. Add 4 times row 2 to row 4; multiply row 4
consistent. by —3; add 5 times row 4 to row |.
39. Determine all values 5, and 6, such that b = 50. Interchange rows | and 4; add 6 times row 2
[b,, 5,] is a linear combination of v, = [1, 3] to row 1; add —3 times row | to row 3; add
and V5 = (5, —1}. —2 times row 4 to row 2.
51. Add 3 times row 2 to row 4; add —2 times 0. where the meaning of ‘sufficiently small" must
row 4 to row 3; add 5 times row 3 to row |; be specified in terms of the size of the nonzero
add —4 times row | to row 2. entries in the original matrix. The routine
YUREDUCE in LINTEK provides drill on the
steps involved in reducing a matrix without
Exercise 24 in Section 1.3 is useful for the irext requiring burdensome computation. The program
three exercises. computes the smallest nonzero coefficient
magnitude m and asks the user to enter a number
52. Prove Theorem 1.8 for the row-interchange r (for ratio), all computed entries of magnitude
operation. less than rm produced during reduction of the
coefficient matrix will be set equal to zero. In
53. Prove Theorem 1.8 for the row-scaling
Exercises 59-64, use the routine YUREDUCE,
operation.
specifying r = 0.0001, to solve the linear system.
54. Prove Theorem 1.8 for the row-addition
operation.
59, 3x, - x, =-10
55. Prove that row equivalence ~ is an
equivalence relation by verifying the 1x, +2x%,= 7
following for m x n matrices A, B, and C. 2x, — 5x, = -37
a. A~ A. (Reflexive Property) 60. 5x, -— 2x, = 11
b. IfA ~ B, then
B ~ A.
8x, + x= 3
(Symmetric Property)
6x, — 5x, = —4
c. IfA ~ Band B~ C, thenA ~ C.
(Transitive Property) 61. 7x,-2x,+ x, =-14
56. Find a, b, and c such that the parabola y = —4x, + 5x, - 3x, = 17
ax’ + bx + c passes through the points 5x, - X,+2x,;= —-7
(1, —4), (-1, 0), and (2, 3). 62. —3x,+ 5x,+2x,= 12
57. Find a, b, c, and d such that the quartic 5x, — 7X, + 6x, = —16
curve y = ax’ + bx’ + cx’ + d passes
through (1, 2), (—1, 6), (-2, 38), and (2, 6). llx, — 17x, + 2x, = —40
58. Let A be an m X n matrix, and letc bea 63. x, -~24,+ X,- xX, +2x,= |
column vector such that Ax = c has a unique 2x,+ X,- 4x; - x, + 5x,= 16
solution. 8x,- x, +3x,- m4 - X= 1
a. Prove that m= n.
4x, — 2x, + 3x; — 8x, + 2x, = —5
b. If m = n, must the system Ax = b be
Sx, + 3x, — 4x, + 7x4, - 6x, = 7
consistent for every choice of b?
c. Answer part (b) for the case where 64. x,-2x,+ x,- x +2x,= |
m>n 2x,+ X,- 4x, - xX, + Sx,= 10
8x,- x, +3x,- XY- X= 5
h
4x, ~ 2x, + 3x; 7 8x, + 2X; = -3
& A problem we meet when reducing a matrix with
the aid of a computer involves determining when 5x, + 3x, — 4x, + 7x, -— 6x, = 1
a computed entry should be 0. The computer
might give an entry as 0.00000001, because of
roundoff error, when it should be 0. If the The routine MATCOMP in LINTEK can also be
computer uses this entry as a pivot in a future used to find the solutions of a linear system.
Step, the result is chaotic! For this reason, it is MATCOMP will bring the left portion of the
common practice to program the computer to augmented matrix to reduced row-echelon form
replace all sufficiently small computed entries with and display the result on the screen. The user can
then find the solutions. Use MATCOMP in the 66. Solve the linear system in Exercise 61.
remaining exercises. 67. Solve the linear system in Exercise 62.
65. Find the reduced row-echelon form of the 68. Solve the linear system in Exercise 63.
matrix in Exercise 6, by taking it as a
coefficient matrix for zero systems.
MATLAB
When reducing a matrix X to reduced row-echelon form, we may need to swap row
i with row k, to divide row i by its pivot entry in column j, and to add a multiple of
row i to row k. When we have made pivots 1 and wish to make the entry in row k, column j equal
to zero using the pivot in row i, column j, we always multiply row i by the negative
of the entry that we wish to make zero, and add the result to row k. In MATLAB,
these three basic commands have the form

X = ones(4);
i = 1; j = 2; k = 3;
X([i k],:) = X([k i],:)
X(i,:) = X(i,:)/X(i,j)
X(k,:) = X(k,:) - X(k,j)*X(i,:)

which you can then access using the up-arrow key and edit repeatedly to row-reduce a
matrix X. MATLAB will not show a partition in X—you have to supply the partition
mentally. If your installation contains the data files for our text, enter fcl1s4 now. We
will be asking you to work with some of the augmented matrices used in the exercises
for this section. In our data file, the augmented matrix for Exercise 63 is called E63,
etc. Solve the indicated system by setting X equal to the appropriate matrix and
reducing it using the up-arrow key and editing repeatedly the three basic commands
above. In MATLAB, only the commands executed most recently can be accessed by
using the up-arrow key. To avoid losing the command to interchange rows, which is
seldom necessary, execute it at least once in each exercise even if it is not needed.
(Interchanging the same rows twice leaves a matrix unchanged.) Solve the indicated
exercises listed below.
The command rref(A) in MATLAB will reduce the matrix A to reduced row-echelon
form. Use this command to solve the following exercises.
M6. Exercise6 M8. Exercise 63
M7. Exercise 24 M9. Exercise 64
1.5 INVERSES OF SQUARE MATRICES

A linear system of n equations in n unknowns can be written in matrix form as

Ax = b,                                                           (1)

where A is the n x n coefficient matrix, x is the n x 1 column vector with ith
entry xᵢ, and b is an n x 1 column vector with constant entries. The analogous
equation using scalars is

ax = b                                                            (2)

for scalars a and b. If a ≠ 0, we usually think of solving Eq. (2) for x by dividing
by a, but we can just as well think of solving it by multiplying by 1/a. Breaking
the solution down into small steps, we have

(1/a)(ax) = (1/a)b
((1/a)a)x = (1/a)b
       1x = (1/a)b
        x = (1/a)b.        Property of 1
Let us see whether we can solve Eq. (1) similarly if A is a nonzero matrix.
Matrix multiplication is associative, and the n x n identity matrix I plays the
same role for multiplication of n x 1 matrices that the number 1 plays for
multiplication of numbers. For example, the matrices

A = [ 2  9 ]        and        C = [ -4   9 ]
    [ 1  4 ]                       [  1  -2 ]

satisfy CA = I = AC.
Unfortunately, it is not true that, for each nonzero n x n matrix A, we can
find an n x n matrix C such that CA = AC = I. For example, if the first column
of A has only zero entries, then the first column of CA also has only zero entries
for any matrix C, so CA ≠ I for any matrix C. However, for many important
n x n matrices A, there does exist an n x n matrix C such that CA = AC = I.
Let us show that when such a matrix exists, it is unique.
D(AC) = (DA)C.
Therefore, C = D.
Now suppose that AC = CA = I, and let us show that C is the unique
matrix with this property. To this end, suppose also that AD = DA = I. Then
we have AC = I = DA, so D = C, as we just showed.
Although A⁻¹ plays the same role arithmetically as a⁻¹ = 1/a (as we showed
at the start of this section), we will never write A⁻¹ as 1/A. The powers of an
invertible n x n matrix A are now defined for all integers. That is, for m > 0,
Aᵐ is the product of m factors A, and A⁻ᵐ is the product of m factors A⁻¹. We
consider A⁰ to be the n x n identity matrix I.
HISTORICAL NOTE   THE NOTION OF THE INVERSE OF A MATRIX first appears in an 1855 note of
Arthur Cayley (1821-1895) and is made more explicit in an 1858 paper entitled "A Memoir on
the Theory of Matrices." In that work, Cayley outlines the basic properties of matrices, noting that
most of these derive from work with sets of linear equations. In particular, the inverse comes from
the idea of solving a system

X = ax + by + cz
Y = a'x + b'y + c'z
Z = a''x + b''y + c''z

for x, y, z in terms of X, Y, Z. Cayley gives an explicit construction for the inverse in terms of the
determinants of the original matrix and of the minors.
In 1842, Arthur Cayley graduated from Trinity College, Cambridge, but could not find a
suitable teaching post. So, like Sylvester, he studied law and was called to the bar in 1849. During
his 14 years as a lawyer, he wrote about 300 mathematical papers; finally, in 1863 he became a
professor at Cambridge, where he remained until his death. It was during his stint as a lawyer that
he met Sylvester; their discussions over the next 40 years were extremely fruitful for the progress
of algebra. Over his lifetime, Cayley produced about 1000 papers in pure mathematics, theoretical
dynamics, and mathematical astronomy.
E,'E, = E,E£,' = 1,
so E, is invertible, with inverse E£,’.
Finally, let E be an elementary row-addition matrix, obtained from I by
the addition of r times row i to row k. If E' is obtained from I by the addition
of -r times row i to row k, then

E'E = EE' = I.
We have established the following fact:
Inverses of Products
PROOF   By assumption, there exist matrices A⁻¹ and B⁻¹ such that AA⁻¹ = A⁻¹A
= I and BB⁻¹ = B⁻¹B = I. Making use of the associative law for matrix
multiplication, we find that
(E,... E,E,E,)A,
A Commutativity Property
We are now in position to show that if CA = J, then AC = J. First we prove a
lemma (a result preliminary to the main result).
PROOF   Let b be any column vector in Rⁿ and let the augmented matrix
[A | b] be row-reduced to [H | c], where H is in reduced row-echelon form.
If H is the identity matrix I, then the linear system Ax = b has the solution
x = c.
For the converse, suppose that reduction of A to reduced row-echelon
form yields a matrix H that is not the identity matrix. Then the bottom
row of H must have every entry equal to 0. Now there exist elementary
matrices E₁, E₂, ..., Eₜ such that (Eₜ ··· E₂E₁)A = H. Recall that every
elementary matrix is invertible, and that a product of elementary matrices
is invertible. Let b = (Eₜ ··· E₂E₁)⁻¹eₙ, where eₙ is the column vector
with 1 in its nth component and zeros elsewhere. Reduction of the aug-
mented matrix [A | b] can be accomplished by multiplying both A and b on
the left by Eₜ ··· E₂E₁, so the reduction will yield [H | eₙ], which represents
a system of equations with no solution, because the bottom row has entries 0
to the left of the partition and 1 to the right of the partition. This shows that
if H is not the identity matrix, then Ax = b does not have a solution for
some b ∈ Rⁿ.
Computation of Inverses
Let A = [aᵢⱼ] be an n x n matrix. To find A⁻¹, if it exists, we must find an n x n
matrix X = [xᵢⱼ] such that AX = I—that is, such that
[ a₁₁  a₁₂  ···  a₁ₙ ] [ x₁₁  x₁₂  ···  x₁ₙ ]   [ 1  0  ···  0 ]
[ a₂₁  a₂₂  ···  a₂ₙ ] [ x₂₁  x₂₂  ···  x₂ₙ ] = [ 0  1  ···  0 ]
[  ⋮              ⋮  ] [  ⋮              ⋮  ]   [ ⋮          ⋮ ]      (3)
[ aₙ₁  aₙ₂  ···  aₙₙ ] [ xₙ₁  xₙ₂  ···  xₙₙ ]   [ 0  0  ···  1 ]
which is a square system of equations. Equating the entries of the first columns on
each side of Eq. (3), we obtain n equations in the n unknowns xᵢ₁ for i = 1, 2, ..., n,
namely the linear system

    [ x₁₁ ]   [ 1 ]
A   [ x₂₁ ] = [ 0 ]                                               (4)
    [  ⋮  ]   [ ⋮ ]
    [ xₙ₁ ]   [ 0 ]

There are also n equations involving the n unknowns xᵢ₂ for i = 1, 2, ..., n; and so on.
In addition to solving system (4), we must solve the systems

    [ x₁₂ ]   [ 0 ]                 [ x₁ₙ ]   [ 0 ]
A   [ x₂₂ ] = [ 1 ],    ...,    A   [ x₂ₙ ] = [ ⋮ ],              (5)
    [  ⋮  ]   [ ⋮ ]                 [  ⋮  ]   [ 0 ]
    [ xₙ₂ ]   [ 0 ]                 [ xₙₙ ]   [ 1 ]
where each system has the same coefficient matrix A. Whenever we want to
solve several systems Ax = bᵢ with the same coefficient matrix but different
vectors bᵢ, we solve them all at once, rather than one at a time. The main job in
solving a linear system is reducing the coefficient matrix to row-echelon or
reduced row-echelon form, and we don't want to repeat that work over and
over. We simply reduce one augmented matrix, where we line up all the vectors
bᵢ to the right of the partition. Thus, to solve all the linear systems in Eqs. (4)
and (5), we form the augmented matrix
[ a₁₁  a₁₂  ···  a₁ₙ | 1  0  ···  0 ]
[ a₂₁  a₂₂  ···  a₂ₙ | 0  1  ···  0 ]
[  ⋮              ⋮  | ⋮          ⋮ ]                             (6)
[ aₙ₁  aₙ₂  ···  aₙₙ | 0  0  ···  1 ]
which we abbreviate by [A | I]. The matrix A is to the left of the partition, and
the identity matrix I is to the right. We then perform a Gauss–Jordan
reduction on this augmented matrix. By Theorem 1.9, we know that if A⁻¹
exists, it is unique, so that every column in the reduced row-echelon form of A
has a pivot. Thus, A⁻¹ exists if and only if the augmented matrix (6) can be
reduced to
[ 1  0  ···  0 | c₁₁  c₁₂  ···  c₁ₙ ]
[ 0  1  ···  0 | c₂₁  c₂₂  ···  c₂ₙ ]
[ ⋮          ⋮ |  ⋮              ⋮  ]
[ 0  0  ···  1 | cₙ₁  cₙ₂  ···  cₙₙ ]
Computation of A⁻¹

To find A⁻¹, if it exists, proceed as follows:
Step 1   Form the augmented matrix [A | I].
Step 2   Apply the Gauss–Jordan method to attempt to reduce [A | I]
to [I | C]. If the reduction can be carried out, then A⁻¹ = C.
Otherwise, A⁻¹ does not exist.
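The boxed procedure translates directly into MATLAB; here is a minimal sketch using the 2 × 2 matrix of the next example:

A = [2 9; 1 4];
R = rref([A eye(2)]);        % Steps 1 and 2: reduce [A | I]
Ainv = R(:, 3:4)             % the right half is the inverse, here [-4 9; 1 -2]
norm(Ainv - inv(A))          % zero: agrees with MATLAB's built-in inverse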
EXAMPLE 2   For the matrix A = [ 2 9 ; 1 4 ], compute the inverse we exhibited at the start of this
section, and use this inverse to solve the linear system

2x + 9y = -5
 x + 4y =  7.

SOLUTION   Reducing the augmented matrix [A | I], we obtain

[ 2 9 | 1 0 ]      [ 1 0 | -4  9 ]
[ 1 4 | 0 1 ]  ~   [ 0 1 |  1 -2 ].

Therefore,

A⁻¹ = [ -4  9 ]
      [  1 -2 ],

and the solution of the system is

[ x ]         [ -5 ]   [ -4(-5) + 9(7) ]   [  83 ]
[ y ] = A⁻¹   [  7 ] = [  1(-5) - 2(7) ] = [ -19 ].
PROOF   Step 2 in the box preceding Example 2 shows that parts (i) and (ii) of
Theorem 1.12 are equivalent. For the equivalence of (ii) with (iii), Lemma 1.1
shows that Ax = b has a solution for each b ∈ Rⁿ if and only if (ii) is true. Thus,
(ii) and (iii) are equivalent. The equivalence of (iii) and (v) follows from the
box on page 63.
Turning to the equivalence of parts (ii) and (iv), we know that the matrix A
is row equivalent to I if and only if there is a sequence of elementary matrices
E₁, E₂, ..., Eₜ such that Eₜ ··· E₂E₁A = I; and this is the case if and only if A is
expressible as a product A = E₁⁻¹E₂⁻¹ ··· Eₜ⁻¹ of elementary matrices.
. 1 0 : .
Ey = 7] Add -2 times row 1 to row 2.
A= —- EB; 'E,
RP-ip-ip-) = i0 1}/1
ol O;io
]1 4
if ;
Example 3 illustrates the following boxed rule for expressing an invertible
matrix A as a product of elementary matrices.
     [  1  3 -2 ]
A =  [  2  5 -3 ]
     [ -3  2 -4 ]

[  1  3 -2 | 1 0 0 ]    [ 1   3  -2 |  1 0 0 ]
[  2  5 -3 | 0 1 0 ] ~  [ 0  -1   1 | -2 1 0 ]
[ -3  2 -4 | 0 0 1 ]    [ 0  11 -10 |  3 0 1 ]

   [ 1 0  1 |  -5  3 0 ]    [ 1 0 0 |  14 -8 -1 ]
~  [ 0 1 -1 |   2 -1 0 ] ~  [ 0 1 0 | -17 10  1 ],
   [ 0 0  1 | -19 11 1 ]    [ 0 0 1 | -19 11  1 ]

so

       [  14  -8  -1 ]
A⁻¹ =  [ -17  10   1 ].
       [ -19  11   1 ]
A=|2 10 01 0 010
00 1}/-3 0 7
l 0 0)/ i > l -
x 10 1 0 3 l 0 0 1 4 ,
0: 1 1 {lo G I 0 ¢ 1 u
EXAMPLE 6   Determine whether the span of the vectors [1, -2, 1], [3, -5, 4], and [4, -3, 9]
is all of R³.
SOLUTION Let
     [  1  3  4 ]
A =  [ -2 -5 -3 ].
     [  1  4  9 ]

We have

[  1  3  4 ]    [ 1 3 4 ]    [ 1 3 4 ]
[ -2 -5 -3 ] ~  [ 0 1 5 ] ~  [ 0 1 5 ].
[  1  4  9 ]    [ 0 1 5 ]    [ 0 0 0 ]
We do not have a pivot in the row 3, column 3 position, so we are not able
to reduce A to the identity matrix. By Theorem 1.12, the span of the given
vectors is not all of R³.
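A quick MATLAB confirmation of Example 6:

A = [1 3 4; -2 -5 -3; 1 4 9];
rref(A)          % the third row is all zeros, so A is not row equivalent to I
rank(A)          % 2, not 3: the columns do not span R^3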
SUMMARY
EXERCISES
In Exercises 1-8, (a) find the inverse of the square In Exercises 11 and 12. determine whether the
matrix, if it exists, and (b} express each invertible span of the column vectors of the given matrix is
matrix as a product of elementary matrices. Re.
“By 3 6
BS
67
rT 1 (0 1 1]
0-1-3
1 O-!
4
2
3.
a
! 0 1
«To1] -3
rT 1-2
0 0-1]
| O
5.10 1 1 6. | 2 0 3 5 0 2
Ut
12.
0 1 2-4
w
lo oO -1 3 |
f-1 2 4 -2|
2 1 4 -1
2 13. a. Show that the matrix
7.1432 «5 8. | 2 -3
0-1 1 1 0 2 -3
[5 7
A=
Wiis
A “0 1]. such that C4 = B.
l4 1 2
—
23. Mark each of the following True or False.
If possible, find a matrix C such that The statements involve matrices 4, B, and
a
A= 0.
NW
242
[4| B]~ |X).
lL r 3 That is. if the matrix A is reduced to the
112 identity matrix J, then the matrix B will be
reduced to A7'B.
Is invertible.
MATLAB
Access MATLAB and, if the data files for our text are accessible, enter focisS.
Otherwise, enter these four matrices by hand. [In MATLAB, ln(x) is denoted by
log(x).]
As you work the problems, write down the entry in the 2nd row, 3rd column position
of the answer, with four-significant-figure accuracy, to hand in.
Enter help inv, read the information, and then use the function inv to work
problems M1 through M4.
M1. Compute C°.
M2. Compute A°B°?°C.
M3. Find the matrix X such that XB = C.
M4. Find the matrix X such that B’XC = A.
Enter help / and then help \, read the information, and then use / and \ rather than
the function inv to work problems M5 through M8.
M5. Compute A'B’C"'B.
M6. Compute B-?CA-3B>.
M7. Find the matrix X such that CX = B°?.
M8. Find the matrix X such that AXC’ = B.
1.6 HOMOGENEOUS SYSTEMS, SUBSPACES, AND BASES

Notice how easy it was to write down the proof of Theorem 1.13 in matrix
notation. What a chore it would have been to write out the proof using
equations with their subscripted variables and coefficients to denote a general
m X n homogeneous system!
Although we stated and proved Theorem 1.13 for just two solutions of
Ax = 0, either induction or the same proof using k solutions shows that:
Subspaces
The solution set of a homogeneous system Ax = 0 in n unknowns is an
example of a subset W of Rⁿ with the property that every linear combination of
vectors in W is again in W. Note that W contains all linear combinations of its
vectors if and only if it contains every sum of two of its vectors and
every scalar multiple of each of its vectors. We now give a formal definition
of a subset of Rⁿ having such a self-contained algebraic structure. Rather
than phrase the definition in terms of linear combinations, we state it in
terms of the two basic vector operations, vector addition and scalar multi-
plication.
A subset W of Rⁿ is closed under vector addition if for all u, v ∈ W the
sum u + v is in W. If rv ∈ W for all v ∈ W and all scalars r, then W is
closed under scalar multiplication. A nonempty subset W of Rⁿ that is
closed under both vector addition and scalar multiplication is a
subspace of Rⁿ.
Theorem 1.13 shows that the solution set of every homogeneous system
with n unknowns is a subspace of Rⁿ. We give an example of a subset of R² that
is a subspace and an example of a subset that is not a subspace.
Note that the set {0} consisting of just the zero vector in Rⁿ is a subspace of
Rⁿ, because 0 + 0 = 0 and r0 = 0 for all scalars r. We refer to {0} as the zero
subspace. Of course, Rⁿ itself is a subspace of Rⁿ, because it is closed under
vector addition and scalar multiplication. The two subspaces {0} and Rⁿ
represent extremes in size for subspaces of Rⁿ. The next theorem shows one
way to form subspaces of various sizes.
FIGURE 1.35   The shaded subset is not closed under addition.
Let W = sp(w₁, w₂, ..., wₖ) be the span of k > 0 vectors in Rⁿ. Then W
is a subspace of Rⁿ.

PROOF   Let u = s₁w₁ + s₂w₂ + ··· + sₖwₖ and v = t₁w₁ + t₂w₂ + ··· + tₖwₖ be any
two vectors in W. Then u + v = (s₁ + t₁)w₁ + ··· + (sₖ + tₖ)wₖ is again in W, so W
is closed under vector addition. Also, for any scalar r, the vector
rv = (rt₁)w₁ + (rt₂)w₂ + ··· + (rtₖ)wₖ
is again in W—that is, W is closed under scalar multiplication. Because k > 0,
W is also nonempty, so W is a subspace of Rⁿ.

We say that the vectors w₁, w₂, ..., wₖ span or generate the subspace
sp(w₁, w₂, ..., wₖ) of Rⁿ.
We will see in Section 2.1 that every subspace in Rⁿ can be described as the
span of at most n vectors in Rⁿ. In particular, the solution set of a homoge-
neous system Ax = 0 can always be described as a span of some of the solution
vectors. We illustrate how to describe the solution set this way in an example.
 x₁ - 2x₂ +  x₃ -  x₄ = 0
2x₁ - 3x₂ + 4x₃ - 3x₄ = 0
3x₁ - 5x₂ + 5x₃ - 4x₄ = 0

Row reduction shows that x₃ = r and x₄ = s are free variables and that the general solution is

    [ x₁ ]     [ -5 ]     [ 3 ]
x = [ x₂ ]  =  [ -2 ] r + [ 1 ] s.                                (1)
    [ x₃ ]     [  1 ]     [ 0 ]
    [ x₄ ]     [  0 ]     [ 1 ]

Thus the solution set is

     [ -5 ]   [ 3 ]
sp(  [ -2 ] , [ 1 ]  ).
     [  1 ]   [ 0 ]
     [  0 ]   [ 1 ]

We chose these two generating vectors from Eq. (1) by taking r = 1, s = 0 for
the first and r = 0, s = 1 for the second.
The preceding example indicates how we can always express the entire
solution set of a homogeneous system with k free variables as the span of k
solution vectors.
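MATLAB can produce such spanning vectors directly; here is a sketch for the system of Example 3 (the 'r' option of null, which returns a rational basis computed from the reduced row-echelon form, is assumed to be available in your version):

A = [1 -2 1 -1; 2 -3 4 -3; 3 -5 5 -4];   % coefficient matrix of Example 3
rref(A)                                  % shows that x3 and x4 are free
null(A, 'r')                             % columns [-5 -2 1 0]' and [3 1 0 1]' generate the solution set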
Given an m × n matrix A, there are three natural subspaces of Rⁿ or Rᵐ
associated with it. Recall (Theorem 1.14) that a span of vectors in Rᵐ is always
a subspace of Rᵐ. The span of the row vectors of A is the row space of A, and is
of course a subspace of Rⁿ. The span of the column vectors of A is the column
space of A and is a subspace of Rᵐ. The solution set of Ax = 0, which we have
been discussing, is the nullspace of A and is a subspace of Rⁿ. For example, if

A = [ 1  0  3 ]
    [ 0  1 -1 ],
we see that

the row space of A is sp([1, 0, 3], [0, 1, -1]) in R³,

the column space of A is sp( [ 1 ], [ 0 ], [  3 ] ) in R², and
                             [ 0 ]  [ 1 ]  [ -1 ]

the nullspace of A is sp( [ -3 ] ) in R³.
                          [  1 ]
                          [  1 ]
The nullspace of A was readily found because A is in reduced row-echelon
form.
In Section 1.3, we emphasized that for an m × n matrix A and x ∈ Rⁿ, the
vector Ax is a linear combination of the column vectors of A. In Section 1.4 we
saw that the system Ax = b has a solution if and only if b is equal to some linear
combination of the column vectors of A. We rephrase this criterion for
existence of a solution of Ax = b in terms of the column space of A.
We have discussed the significance of the nullspace and the column space
of a matrix. The row space of A is significant because the row vectors of A are
orthogonal to the vectors in the nullspace of A, as the ith equation in the
system Ax = 0 shows. This observation will be useful when we compute
projections in Section 6.1.
Bases
We have seen how the solution set of a homogeneous linear system can be
expressed as the span of certain selected solution vectors. Look again at
Eq. (1), which shows the solution set of the linear system in Example 3 to be
sp(w,, W,) for
      [ -5 ]              [ 3 ]
w₁ =  [ -2 ]   and  w₂ =  [ 1 ].
      [  1 ]              [ 0 ]
      [  0 ]              [ 1 ]
The last two components of these vectors are 1, 0 for w₁ and 0, 1 for w₂. These
components mirror the vectors i = [1, 0] and j = [0, 1] in the plane. Now the
vector [r, s] in the plane can be expressed uniquely as a linear combination of i
and j—namely, as ri + sj. Thus we see that every solution vector in Eq. (1) of
the linear system in Example 3 is a unique linear combination of w₁ and
w₂—namely, rw₁ + sw₂. We can think of (r, s) as being coordinates of the
solution relative to w₁ and w₂. Because we regard all ordered pairs of numbers
as filling a plane, this indicates how we might regard the solution set of this
system as a plane in R⁴. We can think of {w₁, w₂} as a set of reference vectors for
the plane. We have depicted this plane in Figure 1.36.
More generally, if every vector w in the subspace W = sp(w₁, w₂, ..., wₖ)
of Rⁿ can be expressed as a linear combination w = c₁w₁ + c₂w₂ + ··· + cₖwₖ
in a unique way, then we can consider the ordered k-tuple (c₁, c₂, ..., cₖ) in Rᵏ
to be the coordinates of w. The set {w₁, w₂, ..., wₖ} is considered to be a set of
reference vectors for the subspace W. Such a set is known as a basis for W, as
the next definition indicates.
Let W be a subspace of Rⁿ. A subset {w₁, w₂, ..., wₖ} of W is a basis for
W if every vector in W can be expressed uniquely as a linear
combination of w₁, w₂, ..., wₖ.
In particular, the vectors w₁ and w₂ displayed above form a basis for the solution space of the homogeneous system in Example 3.
If {w₁, w₂, ..., wₖ} is a basis for W, then we have W = sp(w₁, w₂, ..., wₖ) as
well as the uniqueness requirement. Remember that we called e₁, e₂, ..., eₙ the
standard basis vectors for Rⁿ. The reason for this is that every element of Rⁿ can
be expressed uniquely as a linear combination of these vectors eᵢ. We call
{e₁, e₂, ..., eₙ} the standard basis for Rⁿ.
FIGURE 1.36   The plane sp(w₁, w₂).
The set {w₁, w₂, ..., wₖ} is a basis for W = sp(w₁, w₂, ..., wₖ) in Rⁿ if
and only if the zero vector is a unique linear combination of the
wⱼ—that is, if and only if r₁w₁ + r₂w₂ + ··· + rₖwₖ = 0 implies that
r₁ = r₂ = ··· = rₖ = 0.
PROOF   If {w₁, w₂, ..., wₖ} is a basis for W, then the expression for every
vector in W as a linear combination of the wⱼ is unique, so, in particular, the
linear combination that gives the zero vector must be unique. Because
0w₁ + 0w₂ + ··· + 0wₖ = 0, it follows that r₁w₁ + r₂w₂ + ··· + rₖwₖ = 0
implies that each rⱼ must be 0.
Conversely, suppose that 0w₁ + 0w₂ + ··· + 0wₖ is the only linear
combination giving the zero vector. If we have two linear combinations

w = c₁w₁ + c₂w₂ + ··· + cₖwₖ
w = d₁w₁ + d₂w₂ + ··· + dₖwₖ
EXAMPLE 4  Determine whether the vectors v₁ = [1, 1, 3], v₂ = [3, 0, 4], and v₃ = [1, 4, −1] form a basis for R³.
SOLUTION  We must see whether the matrix A having v₁, v₂, and v₃ as column vectors is row equivalent to the identity matrix. We need only create zeros below pivots to determine if this is the case. We obtain

    A = | 1  3  1 |   | 1  3  1 |   | 1  3  1 |
        | 1  0  4 | ~ | 0 −3  3 | ~ | 0  1 −1 |.
        | 3  4 −1 |   | 0 −5 −4 |   | 0  0 −9 |

There is no point in going further. We see that we will be able to get the identity matrix, so {v₁, v₂, v₃} is a basis for R³.
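As a quick machine check, here is a minimal MATLAB sketch of the same row reduction (assuming MATLAB or a compatible rref is available):

    A = [1 3 1; 1 0 4; 3 4 -1];   % columns are v1, v2, v3
    rref(A)                       % returns the 3 x 3 identity matrix,
                                  % so {v1, v2, v3} is a basis for R^3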
zero rows. For example, if m = 5 and n = 3, the reduced row-echelon form for A in this unique solution case must be

    | 1  0  0 |
    | 0  1  0 |
    | 0  0  1 |.
    | 0  0  0 |
    | 0  0  0 |

We summarize these observations as a theorem.
EXAMPLE 5  Determine whether the vectors w₁ = [1, 2, 3, −1], w₂ = [−2, −3, −5, 1], and w₃ = [−1, −3, −4, 2] form a basis for the subspace sp(w₁, w₂, w₃) in R⁴.
SOLUTION  By Theorem 1.17, we need to determine whether the reduced row-echelon form of the matrix A with w₁, w₂, and w₃ as column vectors consists of the 3 × 3 identity matrix followed by a row of zeros. Again, we can determine this using just the row-echelon form, without creating zeros above the pivots. We obtain

    A = |  1 −2 −1 |   | 1 −2 −1 |   | 1 −2 −1 |
        |  2 −3 −3 | ~ | 0  1 −1 | ~ | 0  1 −1 |.
        |  3 −5 −4 |   | 0  1 −1 |   | 0  0  0 |
        | −1  1  2 |   | 0 −1  1 |   | 0  0  0 |

We cannot obtain the 3 × 3 identity matrix. Thus the vectors do not form a basis for the subspace, which is the column space of A.
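The same conclusion can be reached with a short MATLAB sketch (again assuming rref):

    A = [1 -2 -1; 2 -3 -3; 3 -5 -4; -1 1 2];   % columns are w1, w2, w3
    rref(A)   % the third column contains no pivot, so w1, w2, w3
              % do not form a basis for sp(w1, w2, w3)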
PROOF  If m < n in Theorem 1.17, the reduced row-echelon form of A cannot contain the n × n identity matrix, so we cannot be in the unique solution case. Because we are assuming that the system is consistent, there are an infinite number of solutions.
The next corollary follows at once from Corollary 1 and Theorem 1.17.
EXAMPLE 6  Show that a basis for Rⁿ cannot contain more than n vectors.
SOLUTION  If {v₁, v₂, . . . , vₖ} is a basis for Rⁿ, then, by Theorem 1.15, the only linear combination of the vⱼ equal to the zero vector is the one for which the coefficient of each vⱼ is the scalar 0. In terms of a matrix equation, the homogeneous linear system Ax = 0, where vⱼ is the jth column vector of the n × k matrix A, must have only the trivial solution. If k > n, then this linear system has fewer equations than unknowns, and therefore a nontrivial solution by Corollary 2. Consequently, we must have k ≤ n.
SUMMARY
1. A linear system Ax = b is homogeneous if b = 0.
2. Every linear combination of solutions of a homogeneous system Ax = 0 is again a solution of the system.
3. A subset W of Rⁿ is closed under vector addition if the sum of two vectors in W is again in W. The subset W is closed under scalar multiplication if every scalar multiple of every vector in W is in W. If W is nonempty and closed under both operations, then W is a subspace of Rⁿ.
EXERCISES
In Exercises 1-10, determine whether the 12. Let a, b, and c be scalars such that abc # 0.
indicated subset is a subspace of the given Prove that the plane ax + by + cz = Oisa
Euclidean space R’. subspace of R’.
13. a. Give a geometric description of all
1. {{(r, -r]) |r E Rin R? subspaces of R’. ;
2. {[x, x + 1] | x © R}in R? b. Repeat part (a) for R’. oR ons th
3. {{n, m] | n and m are integers} in R? 14. Prove that every subspace of R” contains the
4. {[x, y] | xy € R and x,y = 0} (the first Zero vector.
» WS UT MY > = ‘ 15. Is the zero vectora basis for the subspace {0}
quadrant of R’) .
of R"? Why or why noi?
5. {[x, y, z] | x,y,z € R and z = 3x + 2}in R?
6. {[x, y, z] | x,y,z © R and x = 2y + z}in R’ In Exercises 16-21, find a basis for the solution
7. {[x, y, z] | ~y,z © R and z= 1, y = 2x} in RB? set of the given homogeneous linear system.
8. {[2x,
{[2x, xx ++ y,ys y]VI} 1 xyxy ۩ RhR} in
in k R? 16. x- y=0
9. {[2x,, 3x,, 4x5, 5x.) | x; € R} in R' dx -2=0
10. {[x,, %, ..- Xa) | x, ER, x, = O} in R" 1. 4x aan n=0
1}. Prove that the line y = mx is a subspace of , >?
R?. [Hint: Write the line as 6x, + 2x, + 2x; = 0
W = {[x, mx] | x € R}.] —9x, “ 3x, 7 3x; = 0
18. x, - %+x- x, =0 32. Find a basis for the nullspace of the matrix
Xy + Xy =0 {357
x, + 2x, — x, + 3x, = 0 2042I'.
19. 2x,+ m+ x4+ 1, =0 3287
xX, — 6xX1+ XX; =0 33. Let v,, v.,....¥, and W,, Wo, .... Wy, be
3x, — 5X, + 2x; + x, =0 vectors in a vector space V. Give a necessary
and sufficient condition, involving linear
Sx, — 4x; + 3x; + 2x, = 0 combinations. for
20. 2x, +X. + xX, + x, =0
Sp(V,, Va... . > Vy) = SPp(W), Wa... W,,):
3x, +X ~ =X; + 2x, =0
X, +X, + 3x; =0
X, — %) — 7x; + 2x, = 0 In Excrcises 34-37, solve the given linear system
and express the solution set in a form that
2. x - Xt 6X, + Xy- x, =0 illustrates Theorem 1.18.
3x, + 2x, - 3x, + 2x, + Sx; = 0
4x, + 2x,.- 4x; + 3x,- x; =0 34. x, — 2x, + x, + 5x, =7
3x, — 2x, + 14x, + x, - 8x, = 35. 2x, - X2+ 3d = -3
2x, — X,+ 8x, + 2x, - 7x, =0 4x, + 2x, -x,= |
36. x,-—2x,+ x+ x,
= 4
In Exercises 22-30, determine whether the set of 2x,+ %-—3x,- m= 6
vectors is a basis for the subspace of R" that the
vectors span. x, 7 7X5 7 6x; + 2N, = 6
ws
22. {[-1, 1], [1. 2]} in ¥, a X T 2x; + Xy =
Oo
|
23. {{-1, 3, 1], (2, 1, 4]} in RF
Wm
4x, - XxX + 7X; + 2X, =
24. {{-1, 3, 4), [1, 5. 1}. [f. £3, 2)} in R? 7X, 7 2x2 - X3 + xX, = -5
25. {[2. 1, -3], [4, 0. 2], (2, -1, 3]} in R? 38. Mark each of the following True or False.
26. {{2, 1, 0, 2], [2. —3. 1. 0}. [3, 2, 0, O}} in R* . — a. A linear system with fewer equations
than unknowns has an infinite number of
27. The set of row vectors of the matrix
solutions.
2-6 | ___ b. A consistent linear system with fewer
1-3 4] equations than unknowns has an infinite
28. The set of column vectors of the matrix in number of solutions.
Exercise 27. c. Jf a square linear system Ax = b has a
solution for every choice of column
29. The set of row vectors of the matrix
vector b, then the solution is unique for
-| ol each b.
=
— 68 . The sum of two solution vectors of any 45. Let v, and v, be vectors in R*. Prove the
homogeneous linear system is also a following set equalities by showing that each
solution vector of the system. of the spans is contained in the other.
h. A scalar multiple of a solution vector of a. sp(v,, Vv.) = sp(v,, 2v, + Va)
any homogeneous linear system is also a b. sp(V,, ¥) = sPtV, + ¥,, Vy — ¥)
solution vector of the system. 46. Referring to Exercise 45, prove that if {¥,, ¥;}
i. Every line in R? is a subspace of R? is a basis for sp(v,, v,), then
generated by a single vector. a. {v,, 2v, + v,} is also a basis.
j. Every line through the origin in R? is a b. {v, + v,, v, — v,} is also a basis.
subspace of R’ generated by a single c. {¥, + V2, V, — V2, 2v, — 3y,} is not a basis.
vector.
47. Let W, and W, be two subspaces of R".
39. We have defined a linear system to be Prove that their intersection W, M W, is also
underdetermined if it has an infinite number a subspace.
of solutions. Explain why this is a reasonable
term to use for such a system. fai In Exercises 48-51, use LINTEK or MATLAB to
40. A linear system is overdetermined if it has determine whether the given vectors form a basis
more equations than unknowns. Explain why for the subspace of RX" that they span.
this is a reasonabie term to use for such a
system. 48. a, = [1, 1, -1, 0}
a, = (5, 1, 1, 2]
41. Referring to Exercises 39 and 40, give an
example of an overdetermined a, = [5, -3, 2, -1]
underdetermined linear system! a, = [9, 3, 0, 3]
42. Use Theorem !.13 to explain why a 49. b, = [3, —4, 0, 0, 1]
homogeneous system of linear equations has b, = [4, C, 2, —6, 2]
either a unique solution or an infinite
b, = (0, 1, 1, —3, 0]
number of solutions.
b, = [1, 4, —1, 3, 0]
. Use Theorem 1.18 ts explain why no system
of linear equations can have exactly two . Vv, = (4, -1,2, 1
solutions. v, = [10, —2, 5, 1]
MATLAB
Access MATLAB and enter fbc1s6 if your text data files are available; otherwise, enter the vectors in Exercises 48-51 by hand. Use MATLAB matrix commands to form the necessary matrix and reduce it in problems M1-M4.
M1. Solve Exercise 48.    M3. Solve Exercise 50.
M2. Solve Exercise 49.    M4. Solve Exercise 51.
M5. What do you think the probability would be that if n vectors in Rⁿ were selected at random, they would form a basis for Rⁿ? (The probability of an event is a number from 0 to 1. An impossible event has probability 0, a certain event has probability 1, an event as likely to occur as not to occur has probability 0.5, etc.)
M6. As a way of testing your answer to the preceding exercise, you might experiment by asking MATLAB to generate "random" n × n matrices for some value of n, and reducing them to see if their column vectors form a basis for Rⁿ. Enter rand(8) to view an 8 × 8 matrix with "random" entries between 0 and 1. The column vectors cannot be considered to be random vectors in R⁸, because all their components lie between 0 and 1. Do you think the probability that such column vectors form a basis for R⁸ is the same as in the preceding exercise? As an experimental check, execute the command rref(rand(8)) ten times to row-reduce ten such 8 × 8 matrices, examining each reduced matrix to see if the column vectors of the matrix generated by rand(8) did form a basis for R⁸.
M7. Note that 4*rand(8) − 2*ones(8) will produce an 8 × 8 matrix with "random" entries between −2 and 2. Again, its column vectors cannot be regarded as random vectors in R⁸, but at least the components of the vectors need not all be positive, as they were in the preceding exercise. Do you think the probability that such column vectors form a basis for R⁸ is the same as in Exercise M5? As an experimental check, row-reduce ten such matrices.
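One way to organize the experiment of Exercises M6 and M7 is sketched below; the loop and the use of rank in place of inspecting rref output are choices made here for convenience, not prescribed by the exercises.

    count = 0;
    for k = 1:10
        A = 4*rand(8) - 2*ones(8);   % "random" entries between -2 and 2
        if rank(A) == 8              % eight pivots: the columns form a basis for R^8
            count = count + 1;
        end
    end
    count    % number of the ten matrices whose columns form a basis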
Population States
Our populations will often consist of people, but this is not essential. For
example, at any moment we can classify the population of cars as operational
or not operational.
We are interested in how the distribution of a population between (or among) states may change over a period of time. Matrices and their multiplication can play an important role in such considerations.
Transition Matrices
              poor   mid   rich
    T = | .80    .15    .05 |  poor
        | .19    .75    .30 |  mid
        | .01    .10    .65 |  rich
We have labeled the columns and rows with the names of the states. Notice
that an entry of the matrix gives the proportion of the population in the state
above the entry that moves to the state at the right of the entry during one
20-year period. =
t₁₁p₁ + t₁₂p₂ + t₁₃p₃.
This is precisely the first entry in the column vector given by the product

    Tp = | t₁₁  t₁₂  t₁₃ | | p₁ |
         | t₂₁  t₂₂  t₂₃ | | p₂ |.
         | t₃₁  t₃₂  t₃₃ | | p₃ |
In a similar fashion, we find that the second and third components of Tp give
the proportions of population in state 2 and in state 3 after one time period.
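For instance, with the poor/middle/rich matrix T displayed earlier, one time period is a single matrix-vector product. In the MATLAB sketch below, the initial distribution p is hypothetical, chosen only for illustration:

    T = [.80 .15 .05; .19 .75 .30; .01 .10 .65];
    p = [.30; .50; .20];     % assumed initial population distribution vector
    Tp = T*p                 % population distribution vector after one 20-year period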
Markov Chains
    T = | 0  0  1 |
        | 1  0  0 |
        | 0  1  0 |
is a transition matrix for a three-state Markov chain, and explain the
significance of the zeros and the ones.
SOLUTION The entries are all nonnegative, and the sum of the entries in each column is 1.
Thus the matrix is a transition matrix for a Markov chain.
At least for finite populations, a transition probability tᵢⱼ = 0 means that there is no movement from state j to state i over the time period. That is,
HISTORICAL NOTE  Markov chains are named for the Russian mathematician Andrei
Andreevich Markov (1856-1922), who first defined them in a paper of 1906 dealing with the Law
of Large Numbers and subsequently proved many of the standard results about them. His interest
in these sequences stemmed from the needs of probability theory; Markov never dealt with their
applications to the sciences. The only real examples he used were from literary texts, where the
two possible states were vowels and consonants. To illustrate his results, he did a statistical study
of the alternation of vowels and consonants in Pushkin’s Eugene Onegin.
Andrei Markov taught at St. Petersburg University from 1880 to 1905, when he retired to
make room for younger mathematicians. Besides his work in probability, he contributed to such
fields as number theory, continued fractions, and approximation theory. He was an active
participant in the liberal movement in the pre-World War I era in Russia; on many occasions he
made public criticisms of the actions of state authorities. In 1913, when as a member of the
Academy of Sciences he was asked to participate in the pompous ceremonies celebrating the 300th
anniversary of the Romanov dynasty, he instead organized a celebration of the 200th anniversary
of Jacob Bernoulli’s publication of the Law of Large Numbers.
transition from state j to state i over the time period is impossible. On the other hand, if tᵢⱼ = 1, the entire population of state j moves to state i over the time period. That is, transition from state j to state i in the time period is certain.
For the given matrix, we see that, over one time period, the entire population of state 1 moves to state 2, the entire population of state 2 moves to state 3, and the entire population of state 3 moves to state 1.
If Tᵐ has no zero entries, then Tᵐ⁺¹ = (Tᵐ)T has no zero entries, because the entries in any column vector of T are nonnegative with at least one nonzero entry. In determining whether a transition matrix is regular, it is not necessary to compute the entries in powers of the matrix. We need only determine whether or not they are zero.
is regular.
SOLUTION We compute configurations of high powers of T as rapidly as we can, because
once a power has no zero entries, all higher powers must have nonzero entries.
We find that
(zero/nonzero patterns of successive powers of T; the last power shown has no zero entries, so T is regular)
If

    p = | p₁ |              | s₁ |
        | p₂ |   and   s =  | s₂ | ,
        | p₃ |              | s₃ |

then Theorem 1.19 tells us that Tᵐp is approximately s for large m.
Thus, after many time periods, the population distribution vector is approxi-
mately equal to the steady-state vector s for any choice of initial population
distribution vector p.
There are two ways we can attempt to compute the steady-state vector s of
a regular transition matrix T. If we have a computer handy, we can simply
raise T to a sufficiently high power so that all column vectors are the same, as
far as the computer can print them. The software LINTEK or MATLAB can be
used to do this. Alternatively, we can use part (2) of Theorem 1.19 and solve
for s in the equation
Ts =s. (1)
In solving Eq. (1), we will be finding our first eigenvector in this text. We will
have a lot more to say about eigenvectors in Chapter 5.
Using the identity matrix I, we can rewrite Eq. (1) as

    Ts = Is
    Ts − Is = 0
    (T − I)s = 0.
EXAMPLE 6  Use the routine MATCOMP in LINTEK, and raise the transition matrix to powers to find the steady-state distribution vector for the Markov chain in Example 1, having states labeled

              no meat   meat
    T = | 1/2       1 |  no meat
        | 1/2       0 |  meat
Because T² has no zero entries, the Markov chain is regular. We solve (T − I)s = 0, or

    | −1/2    1 | | s₁ |   | 0 |
    |  1/2   −1 | | s₂ | = | 0 |.

We reduce the augmented matrix:

    | −1/2    1 | 0 |     | 1  −2 | 0 |
    |  1/2   −1 | 0 |  ~  | 0   0 | 0 |.

Thus we have

    s = | s₁ | = | 2r |   for some scalar r.
        | s₂ |   |  r |
steady-state vector remains [2/3, 1/3] in either case.
If we were to solve Example 7 by reducing an augmented matrix with a computer, we should add a new row corresponding to the condition s₁ + s₂ = 1 for the desired steady-state vector. Then the unique solution could be seen at once from the reduction of the augmented matrix. This can be done using pencil-and-paper computations just as well. If we insert this as the first condition on s and rework Example 7, our work appears as follows:

    |    1     1 | 1 |     | 1    1   |   1  |     | 1  0 | 2/3 |
    | −1/2     1 | 0 |  ~  | 0   3/2  |  1/2 |  ~  | 0  1 | 1/3 |.
    |  1/2    −1 | 0 |     | 0  −3/2  | −1/2 |     | 0  0 |  0  |

Again, we obtain the steady-state vector [2/3, 1/3].
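Both approaches are easy to carry out by machine. The following MATLAB sketch (assuming only basic MATLAB) reproduces the steady-state vector just found:

    T = [1/2 1; 1/2 0];
    T^20                                % both columns are approximately [2/3; 1/3]
    s = [T - eye(2); 1 1] \ [0; 0; 1]   % solve (T - I)s = 0 together with s1 + s2 = 1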
SUMMARY
1. A transition matrix for a Markov chain is a square matrix with nonnegative entries such that the sum of the entries in each column is 1.
2. The entry in the ith row and jth column of a transition matrix is the proportion of the population in state j that moves to state i during one time period of the chain.
3. If the column vector p is the initial population distribution vector between states in a Markov chain with transition matrix T, the population distribution vector after one time period of the chain is Tp.
4. If T is the transition matrix for one time period of a Markov chain, then Tᵐ is the transition matrix for m time periods.
5. A Markov chain and its associated transition matrix T are called regular if there exists an integer m such that Tᵐ has no zero entries.
6. If T is a regular transition matrix for a Markov chain,
   a. The columns of Tᵐ all approach the same probability distribution vector s as m becomes large;
   b. s is the unique probability distribution vector satisfying Ts = s; and
   c. As the number of time periods increases, the population distribution vectors approach s regardless of the initial population distribution vector p. Thus s is the steady-state population distribution vector.
EXERCISES
In Exercises 1-8, determine whether the given In Exercises 13-18, determine whether the given
matrix is a transition matrix. If it is, determine transition matrix with the indicated distribution
py
whether it is regular. of zero entries and nonzero X entries is regular.
ria 02
2 2 4 os Ix 0 x
13 3 4 13.|0 x x 14.|0
0 x
10x x ix x 0
.2 1 3 ri. 2
3./.4 .5 1 4.|.3 .4 ‘Ox 0 [x 0 x
_X,
14 4 8 16 4 15. |x
lo x
x x
0
16. |*
x
*
0 x
*
KKK
1.500 5 3.205
rox x x LX 0 x
5 [5.5 0 0 g |-4 20.1
"lo 5 50 "l1 212 7, |% 0% x TO 0 x
KK
“10 0x x 18, |% 9 0
10 0 5 55 2.402 10 0 x x 0x 0
fo .5 22.1 fo 100.9
sx
10 0 0
30.1 8 .5 0200.11
71004 0.1 810.301 0
7510.1 1.1000 In Exercises 19-24, find the steady-state
10 0 2 0 22] 10 3 100 distribution vector for the given transition matrix
of a Sfarkov chain.
| 3.7 4
19. |? 20.173
3 42
In Exercises 9-12, let
T =|.4 .2 .1| be the
3 1.5 RY3 0 i 42
transition matrix for a Markov chain, and let p = - "
3 033 133
2 be the initial population distribution vector. 2. |0 24 22. Jo L0
5
104 1 O1411
NOANESALVE,
. Find the population distribuuion vector after -—- b. Every matrix whose entries are all
tau time pericds. NuNNegalive IS a transition matrix.
c. The sum of all the entries in an n X n 31. If the initial population distribution vector
transition matrix is 1. 4
d. The sum of all the entries in ann x n for all the women is|.5|, find the popuiation
transition matrix is n. dl
e. If a transition matrix contains no zero distribution vector for the next generation.
entries, it is regular.
32. Repeat Exercise 31, but find the population
f. If a transition matrix is regular, it
distribution vector for the following (third)
contains no nonzero entries.
generation.
. Every power of a transition matrix is
again a transition matrix. 33. Show that this Markov chain is regular, and
h. If a transition matrix is regular, its square find the steady-state probability distribution
has equal column vectors. vector.
i. If a transition matrix T is regular, there
exists a unique vector s such that 7s = s.
j. If a transition matrix 7 is regular, there
Exercises 34-39 deal with a simple genetic model
exists a unique population distribution
involving just two types of genes, G and g.
vector s such that 7s = s.
Suppose that a physical trait, such as eye color, is
26. Estimate A', if A is the matrix in Exercise controlled by a pair of these genes, one inherited
20. from each parent. A person may be classified as
27. Estimate A'™, if A is the matrix in Exercise being in one of three states:
23.
Dominant (type GG), Hybrid (type Gg),
Recessive (type gg).
37. If initially the entire population is hybrid, A in that time period. Argue that, by starting
find the population distribution vector in the in any state in the original chain, you are
next generation. more likely to reach an absorbing state in m
38. If initially the population is evenly divided time periods than you are by starting from
Go
among the three states, find the population state F in the new chain and going for r time
distribution vector in the third generation periods. Using the fact that large powers of a
(after two time periods). positive number less than | are almost 0,
show that for the two-state chain, the
39. Show that this Markov chain is regular, and
population distribution vector approaches
find the steady-state population distribution
vector. } as the number of time periods increases,
. A state in a Markov chain is calied absorbing regardless of the initial populaticn
if it is impossible to leave that state over the distribution vector.]
next iime period. What characterizes the
44. Let Tbe an n X n transition matrix. Show
transition matrix of a Merkov chain with an
that, if every row and every column have
absorbing state? Can a Markov chain with
fewer than 7/2 zero entries, the matrix is
an absorbing state be regular?
regular.
41. Consider the genetic model for Exercises
34-39. Suppose that, instead of always
crossing with hybrids to produce offspring,
we always cross with recessives. Give the
transition matrix for this Markov chain, and In Exercises 45-49, find the steady-state
show that there is an absorbing state. (See population distribution vector for the given
Exercise 40.) transition matrix. See the comment following
Example 7.
42. A Markov chain is termed absorbing if it
° fe ;
contains at least one absorbing state (see
Exercise 40) and if it is possible to get from
V3 Hi j0
4
any state to an absorbiag state in some 45. 46. 41.
number of time periods.
a. Give an example of a transition matrix
for a three-state absorbing Markov chain.
ory
48.;503 49,
[gas ]
$03
b. Give an example of a transition matrix
for a three-state Markov chain that is not HE ea
absorbing but has an absorbing state.
43. With reference to Exercise 42, consider an
absorbing Markov chain with transition
matrix 7 and a single absorbing state. Argue = In Exercises 50-54, find the steady-state
that, for any initial distribution vector p, the population distribution vector by (a) raising the
vectors 7"p for large n approach the vector matrix to a power and (b) solving a linear system.
containing 1 in the component that Use LINTEK or MATLAB.
corresponds to the absorbing state and zeros
elsewhere. [Succestion: Let m be such that 1.3.4 3.3 1
it is possible to reach the absorbing state 50.|.2 0 2 §1.].1 .3 .5
from any state in m time periods, and let g 1774 6 .4 4
be the smallest entry in the row of 7” [50009
corresponding to the absorbing state. Form a eos 5 soeel
vw chain with just two states, Absorbing 32.4 2.5 40
53.]/0
6 .3 0 0
(A) and Free (F), which has as time period m 35 20 00.72 0
time periods of the original chain, and with ov 10 0 0 8 LL
probability g of moving from state F to state 54. The matrix in Exercise 8
AaBbZz1234567890,;?*&#!}+-/'"
The 256 characters are assigned numbers from 0 to 255. For example, S is assigned the number 83 (decimal), which is 1010011 (binary) because, reading 1010011 from left to right, we see that

    1(2⁶) + 0(2⁵) + 1(2⁴) + 0(2³) + 0(2²) + 1(2¹) + 1(2⁰) = 83.

The ASCII code number for the character 7 is 55 (decimal), which is 110111 (binary). Because 2⁸ = 256, each character in the ASCII code can be represented by a sequence of eight 0's or 1's; the S is represented by 01010011 and 7 by 00110111. This discussion makes it clear that all information can be encoded using just the binary alphabet B = {0, 1}.
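Such binary representations are easy to check by machine; a minimal sketch, assuming MATLAB's dec2bin:

    dec2bin(83, 8)    % returns '01010011', the eight-character word for S
    dec2bin(55, 8)    % returns '00110111', the eight-character word for 7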
ILLUSTRATION 1  Suppose we wish to send 1011 as the binary message word. To detect any single-error transmission, we could send each character twice; that is, we could encode 1011 as 11001111 when we send it. If a single error is made in transmission of the code word 11001111 and the recipient knows the encoding scheme, then the error will be detected. For illustration, if the received code word is 11001011, the recipient knows there is an error because the fifth and sixth characters are different. Of course, the recipient does not know whether 0 or 1 is the correct character. But note that not all double-error transmissions can be detected. For example, if the received code word is 11000011, the recipient perceives no error, and obtains 1001 upon decoding, which was not the message sent.
ILLUSTRATION 2  Suppose again that we wish to transmit a four-character word on the alphabet B. Let us denote the word symbolically by x₁x₂x₃x₄, where each xᵢ is either 0 or 1. We make use of modulo 2 arithmetic on B, where

    0 + 0 = 0,   1 + 0 = 0 + 1 = 1,   and   1 + 1 = 0   (modulo 2 sums)

and where subtraction is the same as addition, so that 0 − 1 = 0 + 1 = 1. Multiplication is as usual: 1·0 = 0·1 = 0 and 1·1 = 1. We append to the word x₁x₂x₃x₄ the modulo 2 sum
word is odd. Note that the result is a five-character code word x₁x₂x₃x₄x₅ definitely containing an even number of 1's. Thus we have

    x₁ + x₂ + x₃ + x₄ + x₅ = 0   (modulo 2).
Terminology
Equation (1) in Illustration 2 is known as a parity-check equation. In general, starting with a message word x₁x₂ · · · xₖ of k characters, we encode it as a code word x₁x₂ · · · xₖ · · · xₙ of n characters. The first k characters are the information portion of the encoded word, and the final n − k characters are the redundancy portion or parity-check portion.
We introduce more notation and terminology to make our discussion easier. Let Bⁿ be the set of all binary words of n consecutive 0's or 1's. A binary code C is any subset of Bⁿ. We can identify a vector of n components with each word in Bⁿ, namely, the vector whose ith component is the ith character in the word. For example, we can identify the word 1101 with the row vector [1, 1, 0, 1]. It is convenient to denote the set of all of these row vectors with n components by Bⁿ also. This notation is similar to the notation Rⁿ for all n-component vectors of real numbers. On occasion, we may find it convenient to use column vectors rather than row vectors.
The length of a word u in Bⁿ is n, the number of its components. The Hamming weight wt(u) of u is the number of components that are 1. Given two binary words u and v in Bⁿ, the distance between them, denoted by d(u, v), is the number of components in which the entries in u and v are different, so that one of the words has a 0 where the other has a 1.
ILLUSTRATION 3  Consider the binary words u = 11010011 and v = 01110111. Both words have length 8. Also, wt(u) = 5, whereas wt(v) = 6. The associated vectors differ in the first, third, and sixth components, so d(u, v) = 3.
We can define addition on the set Bⁿ by adding modulo 2 the characters in the corresponding positions. Remembering that 1 + 1 = 0, we add 0011101010 and 1010110001 as follows.

      0011101010
    + 1010110001
      1001011011

We refer to this operation as word addition. Word subtraction is the same as word addition. Exercise 17 shows that Bⁿ is closed under word addition. A binary group code is any nonempty subset C of Bⁿ that is closed under word addition. It can be shown that C has precisely 2ᵏ elements for some integer k where 0 ≤ k ≤ n. We refer to such a code as an (n, k) binary group code.
which we will show accomplish our goal, also satisfy this condition.
Let us see how to get a distance of at least 3 between each pair of the 16 code words. Of course, the distance between any two of the original 16 four-character message words is at least 1 because they are all different. Suppose now that two message words w and v differ in just one component, say xᵢ. A single parity-check equation containing xᵢ then yields a different character for w than for v. This shows that if each xᵢ in our original message word appears in at least two parity-check equations, then any message words at a distance of 1 are encoded into code words of distance at least 3. Note that the three parity-check equations [Eqs. (2)] satisfy this condition. It remains to ensure that two message words at a distance of 2 are encoded to increase this distance by at least 1. Suppose two message words u and v differ in only the ith and jth components. Now a parity-check equation containing both xᵢ and xⱼ will create the same parity-check character for u as for v. Thus, for each such combination i, j of positions in our message word, we need some parity-check equation to contain either xᵢ but not xⱼ or xⱼ but not xᵢ. We see that this condition is satisfied by the three parity-check equations [Eqs. (2)] for all possible combinations i, j, namely,
    [x₁, x₂, x₃, x₄] | 1  0  0  0  1  1  0 |
                     | 0  1  0  0  1  0  1 |
                     | 0  0  1  0  1  1  1 |
                     | 0  0  0  1  0  1  1 |
To see this, note that the first four columns of G give the 4 × 4 identity matrix I, so the first four entries in the encoded word will yield precisely the message word x₁x₂x₃x₄. In columns 5, 6, and 7, we put the coefficients of x₁, x₂, x₃, and x₄ as they appear in the parity-check equations defining x₅, x₆, and x₇, respectively. Table 1.1 shows the 16 message words and the code words obtained using this generator matrix G. Note that the message words 0011 and 0111, which are at distance 1, have been encoded as 0011100 and 0111001, which are at distance 3. Also, the message words 0101 and 0011, which are at distance 2, have been encoded as 0101110 and 0011100, which are at distance 3.
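Encoding with the generator matrix is a single matrix multiplication modulo 2. In the MATLAB sketch below, G is the matrix displayed above and the message word 1011 is chosen only as an example:

    G = [1 0 0 0 1 1 0;
         0 1 0 0 1 0 1;
         0 0 1 0 1 1 1;
         0 0 0 1 0 1 1];
    x = [1 0 1 1];            % message word 1011
    c = mod(x*G, 2)           % code word 1 0 1 1 0 1 0, as in Table 1.1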
ILLUSTRATION 4  Suppose the Hamming (7, 4) code shown in Table 1.1 is used. If the received word is 1011010, then the decoded message word consists of the first four characters 1011, because 1011010 is a code word. However, suppose the word 0110101 is received. This is not a code word. The closest code word is 0100101, which is at distance 1 from 0110101. Thus we decode 0110101 as
TABLE 1.1
The Hamming (7, 4) code

    0000    0000000
    0001    0001011
    0010    0010111
    0011    0011100
    0100    0100101
    0101    0101110
    0110    0110010
    0111    0111001
    1000    1000110
    1001    1001101
    1010    1010001
    1011    1011010
    1100    1100011
    1101    1101000
    1110    1110100
    1111    1111111
HISTORICAL NOTE  Richard Hamming (b. 1915) had his interest in the question of coding stimulated in 1947 when he was using an early Bell System relay computer on weekends only (because he did not have priority use of the machine). During the week, the machine sounded an alarm when it discovered an error so that an operator could attempt to correct it. On weekends, however, the machine was unattended and would dump any problem in which it discovered an error and proceed to the next one. Hamming's frustration with this behavior of the machine grew when errors cost him two consecutive weekends of work. He decided that if the machine could discover errors (it used a fairly simple error-detecting code) there must be a way for it to correct them and proceed with the solution. He therefore worked on this idea for the next year and discovered several different methods of creating error-correcting codes. Because of patent considerations, Hamming did not publish his solutions until 1950. A brief description of his (7, 4) code, however, appeared in a paper of Claude Shannon (b. 1916) in 1948.
Hamming, in fact, developed some of the parity-check ideas discussed in the text as well as the geometric model in which the distance between code words is the number of coordinates in which they differ. He also, in essence, realized that the set of actual code words embedded in B⁷ was a four-dimensional subspace of that space.
0100, which differs from the first four characters of the received word. On the other hand, if we receive the noncode word 1100111, we decode it as 1100, because the closest code word to 1100111 is 1100011.
x₂ + x₃ + x₄ + x₇ = 0.

We form the parity-check matrix H whose ith row contains the seven coefficients of x₁, x₂, x₃, x₄, x₅, x₆, x₇ in the ith equation, namely,

    H = | 1  1  1  0  1  0  0 |
        | 1  0  1  1  0  1  0 |.
        | 0  1  1  1  0  0  1 |
Let w be a received word, written as a column vector. Exercise 26 shows that w is a code word if and only if Hw is the zero column vector, where we are always using modulo 2 arithmetic. If w resulted from a single-error transmission in which the character in the jth position was changed, then Hw would have 1 in its ith component if and only if xⱼ appeared in the ith parity-check equation, so that the column vector Hw would be the jth column of H. Thus we can decode a received word w in the Hamming (7, 4) code of Illustration 4, and be confident of detecting and correcting any single-position error, as follows.
ILLUSTRATION 5  Suppose the Hamming (7, 4) code shown in Table 1.1 on page 120 is used and the word w = 0110101 is received. We compute that

                                  | 0 |
                                  | 1 |
         | 1  1  1  0  1  0  0 |  | 1 |     | 1 |
    Hw = | 1  0  1  1  0  1  0 |  | 0 |  =  | 1 |.
         | 0  1  1  1  0  0  1 |  | 1 |     | 1 |
                                  | 0 |
                                  | 1 |

Because this is the third column of H, we change the third character in the message portion 0110 and decode as 0100. Note that this is what we obtained in Illustration 4 when we decoded this word using Table 1.1.
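The steps of Illustration 5 can be expressed in a few MATLAB lines; this is a sketch of exactly that computation, not a general decoder:

    H = [1 1 1 0 1 0 0;
         1 0 1 1 0 1 0;
         0 1 1 1 0 0 1];
    w = [0 1 1 0 1 0 1]';          % received word 0110101 as a column vector
    Hw = mod(H*w, 2)               % equals the third column of H
    w(3) = mod(w(3) + 1, 2);       % correct the third character
    decoded = w(1:4)'              % decoded message word 0 1 0 0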
Just as we did after Illustration 4, we point out that if two errors are made in transmission, the preceding outline may lead to incorrect decoding. If the code word v = 0001011 is transmitted and received as w = 0011001 with two errors, then

    Hw = | 1 |
         | 0 |,
         | 1 |

which is column 2 of the matrix H. Thus, decoding by the steps above leads to the incorrect message word 0111. Note that item 4 in the list above does not say that if more than one error has been made, then Hw is neither the zero vector nor the jth column of H.
| EXERCISES
1. Let the binary numbers 1 through 15 stand Exercise 4. (Recall that in the case where
for the letters ABCDEFGHIJKLM more than one code word is at minimum
N O, in that order. Using Table 1.1 for the distance from the received word, a code
Hamming (7, 4) code and letting 0000 stand word of minimum distance is selected
for a space between words, encode the arbitrarily.)
message A GOOD DOG. a. 110111
. With the same understanding as in the b. 001011
preceding exercise, use nearest-neighbcr c. 111011
decoding to decode this received message. d. 101010
e. i00i0i
0111001 1104111 1010100 0101110
0000000 1100110 1i11111 1101000 . Give the parity-check matrix for this code.
1101110 . Use the parity-check matrix to decode the
received words in Exercise 7.
In Exercises 3 through 11, consider the (6, 3)
linear code C with standard generator matrix 10. cee = 1101010111 and v = 0111001110.
in
100110 a. wt()
G=|0 1010 1}. b. wt(v)
001011 couty
; d. d(u, v)
. one the pasity-check equations for this 11. Show that for word addition of binary words
°° °. u and v of the same length, we have u + v =
. List the code words in C. u-v.
- How many errors can always be detected 12. If a binary code word u is transmitted and
using this code? the received word is w, then the sum uv + w
. How many errors can always be corrected given by word addition modulo 2 is called
using this code? the error pattern. Explain why this is a
. Assuining that the given word has been descriptive name for this sum.
received, decode it using nearest-neighbor 13. Show that for two binary words of the same
decoding. using your list of code words in length, we have d(u, v) = wt(" ~ »).
14. Prove the following properties of the if i # j, Then use the fact that m must be
distance function for binary words u, v, and large enough so that B” contains C and all
w of the same length. words whose distance from some word in C
2. d(u. v) = Oif and only ifu = v is 1]
b. A(u, v) = d(v, u) (Symmetrv) 22. Show that if the minimum distance between
c. d(u, w) = d(u, v) + d(v, w) tne words in an (m, k) binary group code C is
(Triangle inequality) at least 5, then we must have
d. d(u, v) = d(u + w,v+ w)
(Invariance under translation) arte lint =),
15. Show that 8” is closed under word addition. [Hint: Proceed as suggested by the hint in
16. Recall that we call a nonempty subset C of Exercise 21 to count the words at distance |
B" a binary group code if C is closed under and at distance 2 from some word in C.]}
addition. Show that the Hamming (7, 4) —
23. Using the formulas in Exercises 21 and 22,
code is a group code. [Hint: To show closure find a lower bound for the number of
under word addition, use the fact that the pafity-check equations necessary to encode
words in the Hamming (7, 4) code can be the 2* words in B* so that the minimum
formed from those in B‘ by multiplying by a distance between different code words is at
generator matrix.] least m for the given values of m and k.
17. Show that in a binary group code C, the (Note that k = 8 would allow us to encode
minimum distance between code words is all the ASCII characters, and that m = 5
equal to the minimum weight of the nonzero would allow us to detect and correct all
code words. single-error and double-error transmissions
18. Suppose that you want to be able to using nearest-neighbor decoding.)
recognize that a received word is incorrect a. k=2,m=3 d. k=2,m=5
when m or fewer of its characters have been b. k= 4, m = 3 e k=4,m=5
changed during transmission. What must be c. k=8,m=3 f. k=8,m=5
the minimum distance between code words 24. Find parity-check equations for encoding the
to accomplish this? 32 words in B° into an (n, 5) linear code that
19. Suppose that you want to be able to find a can be used to detect and correct any
unique nearest neighbor for a received word single-error transmission of a code word.
that has been transmitted with m or fewer of (Recall that each character x, must appear in
its characters changed. What must be the two parity-check equations, and that for
minimum distance between code words to each pair x,, x, some equation must contain
accomplish this? one of them but not the other.) Try to make
20. Show that if the minimum nonzero weight of the number of parity-check equations as
code words in a group code C is at least small as possible; see Exercise 21. Give the
2t + :, then the code can detect any 2¢ standard generator matrix for your code.
errors and correct any ¢ errors. (Compare the . The 256 ASCII characters are numbered
result stated in this exercise with your from 0 to 255, and thus can be represented
answers to the two preceding ones.) by the 256 binary words in B®. Find n — 8
21. Show that if the minimum distance between parity-check equations that can be used to
the words in an (n, k) binary group code C is form an (n, 8) linear code that can be used
at least 3, we must have to detect and correct any single-error
transmission of a code word. Try to make 7
aka Ltn. the value found in part (c) of Exercise 23.
uni: Let e, be the word in B" with | in the 26. Let C be an (n, k) linear code with
ih pesition and 0’s elsewhere. Show that e, parity-check matrix H. We know that Hc = 9
is not in C and that, for any two distinct for all c € C. Show conversely thai if
words vand win C, wehavev + e, # wt 2, w € B" and Hw = 0, thenweE C.
CHAPTER 2
DIMENSION, RANK, AND LINEAR TRANSFORMATIONS
Given a finite set S of vectors that generate a subspace W of Rⁿ, we would like to delete from S any superfluous vectors, obtaining as small a subset B of S as we can that still generates W. We tackle this problem in Section 2.1. In doing so, we encounter the notion of an independent set of vectors. We discover that such a minimal subset B of S that generates W is a basis for W, so that every vector in W can be expressed uniquely as a linear combination of vectors in B. We will see that any two bases for W contain the same number of vectors; the dimension of W will be defined to be this number. Section 2.2 discusses the relationships among the dimensions of the column space, the row space, and the nullspace of a matrix.
In Section 2.3, we discuss functions mapping Rⁿ into Rᵐ that preserve, in a sense that we will describe, both vector addition and scalar multiplication. Such functions are known as linear transformations. We will see that for a linear transformation, the image of a vector x in Rⁿ can be computed by multiplying the column vector x on the left by a suitable m × n matrix. Optional Section 2.4 then applies matrix techniques in describing geometrically all linear transformations of the plane R² into itself.
As another application to geometry, optional Section 2.5 uses vector techniques to generalize the notions of line and plane to k-dimensional flats in Rⁿ.
Using Eq. (1), we can express each of w₁, w₂, and w₃ as a linear combination of the other two. For example, we have

We claim that we can delete w₁ from our list w₁, w₂, . . . , wₖ and the remaining wⱼ will still span W. The space spanned by the remaining wⱼ, which is contained in W, will still contain w₁ because w₁ = −6w₂ + 15w₃, and we have seen that W is the smallest space containing all the wⱼ. Thus the vector w₁ in the original list is not needed to span W.
The preceding illustration indicates that we can find a basis for W = sp(w₁, w₂, . . . , wₖ) by repeatedly deleting from the list w₁, w₂, . . . , wₖ one vector that appears with a nonzero coefficient in a linear combination giving the zero vector, such as Eq. (1), until no such nontrivial linear combination for 0 exists. The final list of remaining wⱼ will still span W and be a basis for W by Theorem 1.15.
EXAMPLE 1  Find a basis for W = sp([2, 3], [0, 1], [4, −6]) in R².
SOLUTION  The presence of the vector [0, 1] allows us to spot that [4, −6] = 2[2, 3] − 12[0, 1], so we have a relation like Eq. (1), namely,
Let {w₁, w₂, . . . , wₖ} be a set of vectors in Rⁿ. A dependence relation in this set is an equation of the form

    r₁w₁ + r₂w₂ + · · · + rₖwₖ = 0,   with at least one rᵢ ≠ 0.

The set is linearly dependent if such a dependence relation exists; it is linearly independent if no such relation exists.

For convenience, we will often drop the word linearly from the terms linearly dependent and linearly independent, and just speak of a dependent or independent set of vectors. We will sometimes drop the words set of and refer to dependent or independent vectors w₁, w₂, . . . , wₖ.
Two nonzero vectors in Rⁿ are independent if and only if one is not a scalar multiple of the other (see Exercise 29). Figure 2.1(a) shows two independent vectors w₁ and w₂ in R². A little thought shows why r₁w₁ + r₂w₂ in this figure can be the zero vector if and only if r₁ = r₂ = 0. Figure 2.1(b) shows three independent vectors w₁, w₂, and w₃ in R³. Note how w₃ ∉ sp(w₁, w₂). Similarly, w₁ ∉ sp(w₂, w₃) and w₂ ∉ sp(w₁, w₃).
Using our new terminology, Theorem 1.15 shows that {w₁, w₂, . . . , wₖ} is a basis for a subspace W of Rⁿ if and only if the vectors w₁, w₂, . . . , wₖ span W and are independent. This is taken as a definition of a basis in many texts. We chose the "unique linear combination" characterization in Definition 1.17 because it is the most important property of bases and was the natural choice arising from our discussion of the solution set of Ax = 0. We state this alternative characterization as a theorem.
FIGURE 2.1  (a) Independent vectors w₁ and w₂; (b) independent vectors w₁, w₂, and w₃.
Let W be a subspace of Rⁿ. A subset {w₁, w₂, . . . , wₖ} of W is a basis for W if and only if the following two conditions are met:
1. The vectors w₁, w₂, . . . , wₖ span W.
2. The vectors are linearly independent.
Hence we can delete those two vectors and retain the remaining three as a basis for W. In order to be systematic, we have chosen to keep precisely the vectors wⱼ such that the jth column of H contains a pivot. Thus we don't really have to obtain reduced row-echelon form with zeros above pivots to do this; row-echelon form is enough. We have hit upon the following elegant technique.
    |  1  2  0  3  2   5 |   | 1  2  0  3  2   5 |
    | -1  1 -3  3  4   7 |   | 0  3 -3  6  6  12 |
    |  0 -2  2 -4  1  -3 | ~ | 0 -2  2 -4  1  -3 |
    |  2  0  4 -2  0  -2 |   | 0 -4  4 -8 -4 -12 |
    |  1  0  2 -1  1   0 |   | 0 -2  2 -4 -1  -5 |

      | 1  2  0  3  2  5 |   | 1  2  0  3  2  5 |
      | 0  1 -1  2  2  4 |   | 0  1 -1  2  2  4 |
    ~ | 0  0  0  0  5  5 | ~ | 0  0  0  0  1  1 |.
      | 0  0  0  0  4  4 |   | 0  0  0  0  0  0 |
      | 0  0  0  0  3  3 |   | 0  0  0  0  0  0 |

Because there are pivots in columns 1, 2, and 5 of the row-echelon form, the vectors w₁, w₂, and w₅ are retained and are independent. We obtain {w₁, w₂, w₅} as a basis for W.
EXAMPLE 3  Determine whether the vectors v₁ = [1, 2, 3, 1], v₂ = [2, 2, 1, 3], and v₃ = [−1, 2, 7, −3] in R⁴ are independent.
SOLUTION  Reducing the matrix with jth column vector vⱼ, we obtain

    | 1  2 -1 |   | 1  2 -1 |   | 1  0  3 |
    | 2  2  2 | ~ | 0 -2  4 | ~ | 0  1 -2 |.
    | 3  1  7 |   | 0 -5 10 |   | 0  0  0 |
    | 1  3 -3 |   | 0  1 -2 |   | 0  0  0 |

We see that the vectors are not independent. In fact, v₃ = 3v₁ − 2v₂.
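A quick machine check of Example 3 (a MATLAB sketch, assuming rref) shows the dependence at once:

    A = [1 2 -1; 2 2 2; 3 1 7; 1 3 -3];   % columns are v1, v2, v3
    rref(A)      % third column has no pivot, so the vectors are dependent;
                 % its entries 3 and -2 show v3 = 3*v1 - 2*v2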
PROOF  Let us suppose that k < m. We will show that the vectors v₁, v₂, . . . , vₘ are dependent, contrary to hypothesis. Because the vectors w₁, w₂, . . . , wₖ span W, there exist scalars aᵢⱼ such that

    v₁ = a₁₁w₁ + a₂₁w₂ + · · · + aₖ₁wₖ,
    v₂ = a₁₂w₁ + a₂₂w₂ + · · · + aₖ₂wₖ,          (3)
etc., and then adding the equations. Now the resulting sum is sure to be the zero vector if the total coefficient of each wᵢ on the right-hand side in the sum after adding is zero, that is, if we can make

    a₁₁x₁ + a₁₂x₂ + · · · + a₁ₘxₘ = 0,
      .  .  .
    aₖ₁x₁ + aₖ₂x₂ + · · · + aₖₘxₘ = 0.
PROOF  Suppose that both a set B with k vectors and a set B′ with m vectors are bases for W. Then both B and B′ are independent sets of vectors, and the vectors in either set span W. Regarding B as a set of k vectors spanning W and regarding B′ as a set of m independent vectors in W, Theorem 2.2 tells us that m ≤ k. Switching around and regarding B′ as a set of m vectors spanning W and regarding B as a set of k independent vectors in W, the theorem tells us that k ≤ m. Therefore, k = m.
Let W be a subspace of Rⁿ. The number of elements in a basis for W is the dimension of W, and is denoted by dim(W).
spanning set can always be cut down (if necessary) to form a basis using the technique boxed before Example 2. Theorem 2.2 also tells us that we cannot find a set containing more than n independent vectors in Rⁿ. The same observations hold for any subspace W of Rⁿ using the same arguments. If dim(W) = k, then W cannot be spanned by fewer than k vectors, and an independent set of vectors in W can contain at most k elements. Perhaps you just assumed that this would be the case; it is gratifying now to have justification for it.
EXAMPLE 4  Find the dimension of the subspace W = sp(w₁, w₂, w₃, w₄) of R³ where w₁ = [1, −3, 1], w₂ = [−2, 6, −2], w₃ = [2, 1, −4], and w₄ = [−1, 10, −7].
SOLUTION  Clearly, dim(W) is no larger than 3. To determine its value, we form the matrix

    A = |  1  -2   2  -1 |
        | -3   6   1  10 |.
        |  1  -2  -4  -7 |

We reduce the matrix A to row-echelon form, obtaining pivots in the first and third columns only. Thus dim(W) = 2, and the vectors

    |  1 |       |  2 |
    | -3 |  and  |  1 |
    |  1 |       | -4 |

form a basis for W.
In Section 1.6, we stated that we would show that every subspace W of Rⁿ is of the form sp(w₁, w₂, . . . , wₖ). We do this now by showing that every subspace W ≠ {0} has a basis. Of course, {0} = sp(0). To construct a basis for W ≠ {0}, choose any nonzero vector w₁ in W. If W = sp(w₁), we are done. If not, choose a vector w₂ in W that is not in sp(w₁). Now the vectors w₁, w₂ must be independent, for a dependence relation would allow us to express w₂ as a multiple of w₁, contrary to our choice of w₂ not in sp(w₁). If sp(w₁, w₂) = W, we are done. If not, choose w₃ ∈ W that is not in sp(w₁, w₂). Again, no dependence relation can exist for w₁, w₂, w₃ because none exists for w₁, w₂ and because w₃ cannot be a linear combination of w₁ and w₂. Continue in this fashion. Now W cannot contain an independent set with more than n vectors because no independent subset of Rⁿ can have more than n vectors (Theorem 2.2). The process must stop with W = sp(w₁, w₂, . . . , wₖ) for some k ≤ n, which demonstrates our goal. In order to be able to say that every subspace of Rⁿ has a basis, we define the basis of the zero subspace {0} to be the empty set. Note that although sp(0) = {0}, the zero vector is not a unique linear com-
EXAMPLE 5  Enlarge the independent set {[1, 1, −1], [1, 2, −2]} to a basis for R³.
SOLUTION  Let v₁ = [1, 1, −1] and v₂ = [1, 2, −2]. We know a spanning set for R³, namely {e₁, e₂, e₃}. We write R³ = sp(v₁, v₂, e₁, e₂, e₃) and apply the technique of Example 2 to find a basis. As long as we put v₁ and v₂ first as columns of the matrix to be reduced, pivots will occur in those columns, so v₁ and v₂ will be retained in the basis. We obtain

    |  1  1  1  0  0 |   | 1  1  1  0  0 |   | 1  1  1  0  0 |
    |  1  2  0  1  0 | ~ | 0  1 -1  1  0 | ~ | 0  1 -1  1  0 |.
    | -1 -2  0  0  1 |   | 0 -1  1  0  1 |   | 0  0  0  1  1 |

We see that the pivots occur in columns 1, 2, and 4. Thus a basis containing v₁ and v₂ is {[1, 1, −1], [1, 2, −2], [0, 1, 0]}.
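The computation in Example 5 can be reproduced with the following MATLAB sketch, with the columns arranged as described in the solution:

    A = [1 1 1 0 0; 1 2 0 1 0; -1 -2 0 0 1];   % columns v1, v2, e1, e2, e3
    rref(A)    % pivots appear in columns 1, 2, and 4, so
               % {v1, v2, e2} = {[1,1,-1], [1,2,-2], [0,1,0]} is a basis for R^3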
SUMMARY
1. A set of vectors {w₁, w₂, . . . , wₖ} in Rⁿ is linearly dependent if there exists a dependence relation

       r₁w₁ + r₂w₂ + · · · + rₖwₖ = 0,   with at least one rᵢ ≠ 0.

   The set is linearly independent if no such dependence relation exists, so that a linear combination of the wᵢ is the zero vector only if all of the scalar coefficients are zero.
2. A set B of vectors in a subspace W of Rⁿ is a basis for W if and only if the set is independent and the vectors span W. Equivalently, each vector in W can be written uniquely as a linear combination of the vectors in B.
3. If W = sp(w₁, w₂, . . . , wₖ), then the set {w₁, w₂, . . . , wₖ} can be cut down, if necessary, to a basis for W by reducing the matrix A having wⱼ as the jth column vector to row-echelon form H, and retaining wⱼ if and only if the jth column of H contains a pivot.
4. Every subspace W of Rⁿ has a basis, and every independent set of vectors in W can be enlarged (if necessary) to a basis for W.
5. Let W be a subspace of Rⁿ. All bases of W contain the same number of vectors. The dimension of W, denoted by dim(W), is the number of vectors in any basis for W.
6. Let W be a subspace of Rⁿ and let dim(W) = k. A subset S of W containing exactly k vectors is a basis for W if either
   a. S is an independent set, or
   b. S spans W.
   That is, it is not necessary to check both conditions in Theorem 2.1 for a basis if S has the right number of elements for a basis.
EXERCISES
. Give a geometric criterion for a set of two 6. Argue geometrically that every set of four
distinct nonzero vectors in R? to be distinct vectors in R? is dependent.
dependent.
. Argue geometrically that any set of three
distinct vectors in R? is dependent. In Exercises 7-11, use the technique of Example
2, described in the box on page 129, to find a
. Give a geometric criterion for a set of two basis for the subspace spanned by the given
distinct nonzero vectors in R? to be
vectors.
dependent.
. Give a geometric description of the subspace 7. sp([-3, 1), (6, 4]) in R?
of R} generated by an independent set of two
8. sp({[—3, 1], (9, —3]) in R?
vectors.
. Give a geometric criterion for a set of three 9. sp([2, 1], [-6, —3], [1, 4]) in R?
172)
distineL nonzero vectors in R’ to be 10. sp([-—2, 3, 1], [3, -1, 2], [1, 2, 3], [-1, 5, 4)
dependent. in R?
1. sp([!, 2, 1, 2], (2, 1, 0, -1], [-1. 4, 3, 8], —— b. Ifa set of nonzero vectors in R" is
(0, 3, 2, 5]) in R* dependent, then aay two vectors in tlie
12. Find a basis for the column space of the set are parallel.
matrix . Every subset of three vectors in R? is
dependent.
2 . Every subset of two vectors in R? is
—
Ww
_|5 independent.
A=T,
~)
—_—
th
. Ifa Subset of two vectors in R? spans R’,
no
6- then the subset is independent.
eo
NR
other.
25. {{-2, 3, 1), (3, —1, 2], U), 2, 3], [-1, 5, 4]} in
R3 30. Let v,, v,, ¥; be independent vectors in R’,
Prove that w, = 3v,, W, = Zv, — ¥;,,
and w, = v, + v; are also independent.
In Exercises 26 and 27, enlarge the given
independent set to a basis for the entire space R’. 31. Let v,, V,, ¥; be any vectors in R”. Prove that
WwW, = 2v, + 3y,, W, = Vv, — 2v;,
26. {[1, 2, 1]} in R? and w, = —v, — 3v, are dependent.
27. {[2, I, 1, 1), (1, 0, l, 1}} in RS
32. Find all scalars s, if any exist, such that
22. Let S = {v,, v,,..., Vt be aset of vectors in [1, 0, 1], (2, s, 3], (2, 3, !] are independent.
R°. Mark each of the following True or False.
_— a. A subset of R’ containing two nonzero 33. Find all scalars s, if any exist, such that
distinct parallel vectors is dependent. [1, 0. 1]. [2. s, 3]. [1, -s, 0} are independent.
34. Let v and w be independent column vectors ~ a In Exercises 39-42, use LINTEK to find a basis
in R?, and let A be an invertible 3 x 3 for the space spanncd by the given vectors in R’.
matrix. Prove that the vectors Av and Aw are 39. v, = (5, 4, 3], v, = [6, 1, 4]
independent. -_ (2, 1, 6], v5 = 1,1,]]
35. Give an example showing that the (4, 5, -12],
conclusion of the preceding exercise need
not hold if A is nonzero but singular. Can “0.3 [ OA 7) a, = {-1, 4, 6, 11],
you also find specific independent vectors v [-3 Bt as 1, 1, 3],
and w and 2 singular matrix A such that Ay 1, = (3,7, 3, 9]
and Aw are still independent? “1. [3,
36. Let v and w be column vectors in R’, and let (3,
A be an mn X n matrix. Prove that, if Av and [3,
Aw are independent, y and w are [1,
independent. [7,
37. Generalizing Exercise 34, let v,, ¥,,..., %; (2,
be independent column vectors in R’, and let 42. w, =
C be an invertible n x n matrix. Prove that W, =
the vectors Cv,, Cv.,..., Cv, are W; =
independent. W, =
38. Prove that if W is a subspace of R’ and W; =
dim(W) = n, then W = R’. W, =
MATLAB
Access MATLAB and work the indicated exercise. M1. Exercise 39
If the data files for the text are available, enter M2. Exercise 40
fbe2s1 for the vector data. Otherwise, enter the M3. Exercise 41
vector data by hand. ~ EXETCISe
M4. Exercise 42
We know that a basis for the column space of A consists of the columns of
A giving rise to pivots in a row-echelon form of A. We saw how to find a basis
for the nullspace of A in Section 1.6. We would like to be able to find a basis for
the row space. We could work with the transpose of A, but this would require
HISTORICAL NOTE  The rank of a matrix was defined in 1879 by Georg Frobenius (1849-1917) as follows: If all determinants of the (r + 1)st degree vanish, but not all of the rth degree, then r is the rank of the matrix. Frobenius used this concept to deal with the questions of canonical forms for certain matrices of integers and with the solutions of certain systems of linear congruences.
The nullity was defined by James Sylvester in 1884 for square matrices as follows: The nullity of an n × n matrix is i if every minor (determinant) of order n − i + 1 (and therefore of every higher order) equals 0 and i is the largest such number for which this is true. Sylvester was interested here, as in much of his mathematical career, in discovering invariants, properties of particular mathematical objects that do not change under specified types of transformations. He proceeded to prove what he called one of the cardinal laws in the theory of matrices, that the nullity of the product of two matrices is not less than the nullity of any factor or greater than the sum of the nullities of the factors.
EXAMPLE 1  Find the rank, a basis for the row space, a basis for the column space, and a basis for the nullspace of the matrix

    A = | 1  3  0 -1  2 |
        | 0 -2  4 -2  0 |
        | 3 11 -4 -1  6 |.
        | 2  5  3 -4  0 |

SOLUTION  We reduce A all the way to reduced row-echelon form, because we also want to find a basis for the nullspace of A. We obtain

        | 1  3  0 -1  2 |   | 1  3  0 -1  2 |   | 1  0  6 -4  2 |
    A = | 0 -2  4 -2  0 | ~ | 0 -2  4 -2  0 | ~ | 0  1 -2  1  0 | ~
        | 3 11 -4 -1  6 |   | 0  2 -4  2  0 |   | 0  0  0  0  0 |
        | 2  5  3 -4  0 |   | 0 -1  3 -2 -4 |   | 0  0  1 -1 -4 |

        | 1  0  0  2  26 |
        | 0  1  0 -1  -8 |
        | 0  0  1 -1  -4 | = H.
        | 0  0  0  0   0 |

Because the reduced form H contains three pivots, we see that rank(A) = 3. As a basis for the row space of A, we take the nonzero row vectors of H, obtaining

    {[1, 0, 0, 2, 26], [0, 1, 0, −1, −8], [0, 0, 1, −1, −4]}.

Notice that the next to the last matrix in the reduction shows that the first three row vectors of A are dependent, so we must not take them as a basis for the row space.
Now the columns of A in which pivots appear in H form a basis for the column space, and from H we see that the solution of Ax = 0 is

        | -2r - 26s |       | -2 |     | -26 |
        |   r +  8s |       |  1 |     |   8 |
    x = |   r +  4s | =  r  |  1 | + s |   4 |.
        |         r |       |  1 |     |   0 |
        |         s |       |  0 |     |   1 |

Thus we have the following bases:

    Column space:  | 1 |   |  3 |   |  0 |        Nullspace:  | -2 |   | -26 |
                   | 0 | , | -2 | , |  4 |                    |  1 |   |   8 |
                   | 3 |   | 11 |   | -4 |                    |  1 | , |   4 |
                   | 2 |   |  5 |   |  3 |                    |  1 |   |   0 |
                                                              |  0 |   |   1 |
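For comparison, the following MATLAB sketch (assuming MATLAB's rref, rank, and null) reproduces the results of Example 1:

    A = [1 3 0 -1 2; 0 -2 4 -2 0; 3 11 -4 -1 6; 2 5 3 -4 0];
    rref(A)          % the reduced row-echelon form H shown above
    rank(A)          % returns 3
    null(A, 'r')     % columns [-2 1 1 1 0]' and [-26 8 4 0 1]', a nullspace basis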
Our work has given us still another criterion for the invertibility of a
square matrix.
SUMMARY
1. Let A be an m × n matrix. The dimension of the row space of A is equal to the dimension of the column space of A, and is called the rank of A, denoted by rank(A). The rank of A is equal to the number of pivots in a row-echelon form H of A. The nullity of A, denoted by nullity(A), is the dimension of the nullspace of A, that is, of the solution set of Ax = 0.
2. Bases for the row space, the column space, and the nullspace of a matrix A can be found as described in a box in the text.
3. (Rank Equation) For an m × n matrix A, we have

       rank(A) + nullity(A) = n.
11. Mark each of the following True or False.
    a. The number of independent row vectors in a matrix is the same as the number of independent column vectors.
    b. If H is a row-echelon form of a matrix A, then the nonzero column vectors in H form a basis for the column space of A.
    c. If H is a row-echelon form of a matrix A, then the nonzero row vectors in H are a basis for the row space of A.
    d. If an n x n matrix A is invertible, then rank(A) = n.
    e. For every matrix A, we have rank(A) > 0.
    f. For all positive integers m and n, the rank of an m x n matrix might be any number from 0 to the maximum of m and n.
    g. For all positive integers m and n, the rank of an m x n matrix might be any number from 0 to the minimum of m and n.
    h. For all positive integers m and n, the nullity of an m x n matrix might be any number from 0 to n.
    i. For all positive integers m and n, the nullity of an m x n matrix might be any number from 0 to m.
    j. For all positive integers m and n, with m >= n, the nullity of an m x n matrix might be any number from 0 to n.
12. Prove that, if A is a square matrix, the nullity of A is the same as the nullity of A^T.
13. Let A be an m x n matrix, and let b be a column vector in R^m. Prove that the system of equations Ax = b has a solution for x if and only if rank(A) = rank(A | b), where rank(A | b) represents the rank of the associated augmented matrix [A | b] of the system.

In Exercises 14-16, let A and C be matrices such that the product AC is defined.

14. Prove that the column space of AC is contained in the column space of A.
15. Is it true that the column space of AC is contained in the column space of C? Explain.
16. State the analogue of Exercise 14 concerning the row spaces of A and C.
17. Give an example of a 3 x 3 matrix A such that rank(A) = 2 and rank(A^3) = 0.

In Exercises 18-20, let A and C be matrices such that the product AC is defined.

18. Prove that rank(AC) <= rank(A).
19. Give an example where rank(AC) < rank(A).
20. Is it true that rank(AC) = rank(C)? Explain.

It can be shown that rank(A^T A) = rank(A) (see Theorem 6.10). Use this result in Exercises 21-23.

21. Let A be an m x n matrix. Prove that rank(A(A^T)) = rank(A).
22. If a is an n x 1 vector and b is a 1 x m vector, prove that ab is an n x m matrix of rank at most one.
23. Let A be an m x n matrix. Prove that the column space and row space of (A^T)A are the same.
24. Suppose that you are using computer software, such as LINTEK or MATLAB, that will compute and print the reduced row-echelon form of a matrix but does not indicate any row interchanges it may have made. How can you determine what rows of the original matrix form a basis for the row space?

In Exercises 25 and 26, use LINTEK or MATLAB to request a row reduction of the matrix, without seeing intermediate steps. Load data files as usual if they are available. (a) Give the rank of the matrix, and (b) use the software as suggested in Exercise 24 to find the lowest numbered rows, in consecutive order, of the given matrix that form a basis for its row space.

25. A = [ 2  -3    0   1   4 ]
        [ 1   4   -6   3  -2 ]
        [ 0  11  -12   5  -8 ]
        [ 4  -1    5   3   7 ]

26. B = [ -1   1    3   -6    8   -2 ]
        [ -3   &    3    |    4    8 ]
        [  1  -3    3  -13   12  -12 ]
        [  0   2   -6   19  -20   14 ]
        [  5  13  -21    3   11    6 ]
From properties 1 and 2, it follows that T(ru + sv) = rT(u) + sT(v) for all u, v in R^n and all scalars r and s. (See Exercise 32.) In fact, this equation can be
FIGURE 2.2
(a) The image of H under f; (b) the inverse image of K under f.

FIGURE 2.3
The linear transformation T(x) = Ax.
EXAMPLE 2  Determine whether T: R^2 -> R^3 defined by T([x1, x2]) = [x1, x2 - x1, 2x1 + x2] is a linear transformation.
SOLUTION  To test for preservation of addition, we let u = [u1, u2] and v = [v1, v2], and compute

    T(u + v) = T([u1 + v1, u2 + v2])
             = [u1 + v1, u2 + v2 - u1 - v1, 2u1 + 2v1 + u2 + v2]
             = [u1, u2 - u1, 2u1 + u2] + [v1, v2 - v1, 2v1 + v2]
             = T(u) + T(v),

and so vector addition is preserved. To test for preservation of scalar multiplication, we compute

    T(ru) = T([ru1, ru2]) = [ru1, ru2 - ru1, 2ru1 + ru2] = r[u1, u2 - u1, 2u1 + u2] = rT(u),

and so scalar multiplication is preserved as well. Thus T is a linear transformation.
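The following sketch is not part of the text; it shows how the two linearity conditions for the transformation of Example 2 could be spot-checked numerically with NumPy. A few random trials do not prove linearity, but they catch computational mistakes quickly.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    def T(x):
        x1, x2 = x
        return np.array([x1, x2 - x1, 2*x1 + x2])

    rng = np.random.default_rng(0)
    for _ in range(5):
        u, v = rng.random(2), rng.random(2)
        r = rng.random()
        assert np.allclose(T(u + v), T(u) + T(v))   # additivity
        assert np.allclose(T(r*u), r*T(u))          # homogeneity
    print("linearity checks passed")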
EXAMPLE 3  Let A be an m x n matrix, and let T_A: R^n -> R^m be defined by T_A(x) = Ax for each column vector x in R^n. Show that T_A is a linear transformation.
SOLUTION  This follows from the distributive and scalars-pull-through properties of matrix multiplication stated in Section 1.3--namely, for any vectors u and v and for any scalar r, we have

    T_A(u + v) = A(u + v) = Au + Av = T_A(u) + T_A(v)   and   T_A(ru) = A(ru) = r(Au) = rT_A(u).
PROOF  Let v be any vector in R^n. We know that because B is a basis, there exist unique scalars r1, r2, ..., rn such that

    v = r1 b1 + r2 b2 + ... + rn bn.

Theorem 2.7 shows that if two linear transformations have the same value at each basis vector b_i, then the two transformations have the same value at each vector in R^n, and thus they are the same transformation.
PROOF  Recall that for any matrix A, Ae_j is the jth column of A. This shows at once that if A is the matrix described in Eq. (2), then Ae_j = T(e_j), and so T and the linear transformation T_A given by T_A(x) = Ax agree on the standard basis {e1, e2, ..., en} of R^n. By Theorem 2.7, and the comment following this theorem, we know that then T(x) = T_A(x) for every x in R^n--that is, T(x) = Ax for every x in R^n.

The matrix A in Eq. (2) is the standard matrix representation of the linear transformation T.
SOLUTION Equation (2) for the standard matrix representation shows that
-
3! [2x. + 3x.
SOLUTION  We compute

    T(e1) = T([1, 0, 0, 0]) = [0, 2, 8],      T(e2) = T([0, 1, 0, 0]) = [1, -1, -4],
    T(e3) = T([0, 0, 1, 0]) = [-3, 0, 3],     T(e4) = T([0, 0, 0, 1]) = [0, 3, -1].

Using Eq. (2), we find that

        [ 0   1  -3   0 ]
    A = [ 2  -1   0   3 ].
        [ 8  -4   3  -1 ]
Perhaps you noticed in Example 6 that the first row of the matrix A consists of the coefficients of x1, x2, x3, and x4 in the first component x2 - 3x3 of T([x1, x2, x3, x4]). The second and third rows can be found similarly. If Eq. (3) is written in column-vector form, the matrix A jumps out at you immediately. Try it! This is often a fast way to write down the standard matrix representation when the transformation is described by a formula as in Example 6. Be sure to remember, however, the Eq. (2) formulation for the standard matrix representation.
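The sketch below is not from the text; it illustrates the Eq. (2) recipe in code: apply the transformation to each standard basis vector and use the results as columns. The particular function T is the one from Example 6, re-expressed as a Python function for illustration only.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    def T(x):
        x1, x2, x3, x4 = x
        return np.array([x2 - 3*x3, 2*x1 - x2 + 3*x4, 8*x1 - 4*x2 + 3*x3 - x4])

    n = 4
    A = np.column_stack([T(e) for e in np.eye(n)])   # jth column is T(e_j)
    print(A)
    x = np.array([1.0, 2.0, 3.0, 4.0])
    print(np.allclose(A @ x, T(x)))                  # T(x) = Ax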
We give another example indicating how a linear transformation is determined, as in the proof of Theorem 2.7, if we know its values on a basis for its domain. Note that the vectors u = [-1, 2] and v = [3, -5] are two nonparallel vectors in the plane, and so they form a basis for R^2.
EXAMPLE 7  Let u = [-1, 2] and v = [3, -5] be in R^2, and let T: R^2 -> R^3 be a linear transformation such that T(u) = [-2, 1, 0] and T(v) = [5, -7, 1]. Find the standard matrix representation A of T and compute T([-4, 3]).
SOLUTION  To find the standard matrix representation of T, we need to find T(e1) and T(e2) for e1, e2 in R^2. Following the argument in the proof of Theorem 2.7, we express e1 and e2 as linear combinations of the basis vectors u and v for R^2, where we know the action of T. To express e1 and e2 as linear combinations of u and v, we solve the two linear systems Ax = e1 and Ax = e2, where the coefficient matrix A has u and v as its column vectors. Because both systems have the same coefficient matrix, we can solve them both at once as follows:

    [ -1   3 |  1   0 ]   [ 1  -3 | -1   0 ]   [ 1   0 |  5   3 ]
    [  2  -5 |  0   1 ] ~ [ 0   1 |  2   1 ] ~ [ 0   1 |  2   1 ].
Thus e1 = 5u + 2v and e2 = 3u + v, so

    T(e1) = 5T(u) + 2T(v) = 5[-2, 1, 0] + 2[5, -7, 1] = [0, -9, 2],
    T(e2) = 3T(u) + T(v) = 3[-2, 1, 0] + [5, -7, 1] = [-1, -4, 1],

and therefore

        [  0  -1 ]
    A = [ -9  -4 ].
        [  2   1 ]

Finally, T([-4, 3]) = A[-4, 3]^T = [-3, 24, -5].
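Not part of the text: a short NumPy sketch of the same computation, finding the coordinates of e1 and e2 in the basis {u, v} by solving linear systems and then assembling the standard matrix of T.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    u, v = np.array([-1.0, 2.0]), np.array([3.0, -5.0])
    Tu, Tv = np.array([-2.0, 1.0, 0.0]), np.array([5.0, -7.0, 1.0])

    B = np.column_stack([u, v])            # coefficient matrix with u, v as columns
    C = np.linalg.solve(B, np.eye(2))      # column j holds the coordinates of e_j in {u, v}
    TB = np.column_stack([Tu, Tv])         # values of T on the basis, as columns
    A = TB @ C                             # column j is T(e_j)
    print(A)                               # [[ 0, -1], [-9, -4], [ 2, 1]]
    print(A @ np.array([-4.0, 3.0]))       # T([-4, 3]) = [-3, 24, -5]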
EXAMPLE 8  Find the kernel of the linear transformation T: R^3 -> R^2 where T([x1, x2, x3]) = [x1 - 2x2, x2 + 4x3].
The dimension of range(T) is called the rank of T, and the dimension of ker(T) is called the nullity of T.
Also, matrix multiplication and matrix inversion have very significant analogues in terms of transformations. Let T: R^n -> R^m and T': R^m -> R^k be two linear transformations. We can consider the composite function (T' o T): R^n -> R^k where (T' o T)(x) = T'(T(x)) for x in R^n. Figure 2.4 gives a graphic illustration of this composite map.
Now suppose that A is the m x n matrix associated with T and that B is the k x m matrix associated with T'. Then we can compute T'(T(x)) as

    T'(T(x)) = T'(Ax) = B(Ax).

But B(Ax) = (BA)x, and so the standard matrix representation of T' o T is the product BA.
FIGURE 2.4
The composite map T' o T.
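Not from the text: a quick numerical spot check, with NumPy, that composing two linear maps corresponds to multiplying their standard matrices, (T' o T)(x) = (BA)x.

    # A minimal sketch, assuming NumPy is available; the matrices are arbitrary examples.
    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.random((3, 4))        # standard matrix of T : R^4 -> R^3
    B = rng.random((2, 3))        # standard matrix of T': R^3 -> R^2
    x = rng.random(4)

    print(np.allclose(B @ (A @ x), (B @ A) @ x))   # True: B(Ax) = (BA)x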
ILLUSTRATION 1  (The Double Angle Formulas)  It is shown in the next section that rotation of the plane R^2 counterclockwise about the origin through an angle θ is a linear transformation T: R^2 -> R^2 with standard matrix representation

    [ cos θ   -sin θ ]
    [ sin θ    cos θ ].        Counterclockwise rotation through θ        (3)
Thus, T applied twice--that is, T o T--rotates the plane through 2θ. Replacing θ by 2θ in matrix (3), we find that the standard matrix representation for T o T must be

    [ cos 2θ   -sin 2θ ]
    [ sin 2θ    cos 2θ ].                                                  (4)
On the other hand, we know that the standard matrix representation for the composition T o T must be the square of the standard matrix representation for T, and so matrix (4) must be equal to

    [ cos θ   -sin θ ] [ cos θ   -sin θ ]   [ cos^2 θ - sin^2 θ     -2 sin θ cos θ      ]
    [ sin θ    cos θ ] [ sin θ    cos θ ] = [ 2 sin θ cos θ          -sin^2 θ + cos^2 θ ].   (5)
Comparing the entries in matrix (4) with the final result in Eq. (5), we obtain the double angle trigonometric identities

    cos 2θ = cos^2 θ - sin^2 θ     and     sin 2θ = 2 sin θ cos θ.
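Not from the text: a one-line numerical spot check, with NumPy, of the matrix identity R(θ)R(θ) = R(2θ) behind these formulas.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    def R(t):
        return np.array([[np.cos(t), -np.sin(t)],
                         [np.sin(t),  np.cos(t)]])

    t = 0.7
    print(np.allclose(R(t) @ R(t), R(2*t)))   # True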
FIGURE 2.5
T^{-1} o T is the identity transformation.
EXAMPLE 9  Show that the linear transformation T: R^3 -> R^3 defined by T([x1, x2, x3]) = [x1 - 2x2 + x3, x2 - x3, 2x2 - 3x3] is invertible, and find a formula for its inverse.
SOLUTION  Using column-vector notation, we see that T(x) = Ax, where

        [ 1  -2   1 ]
    A = [ 0   1  -1 ].
        [ 0   2  -3 ]
Next, we find the inverse of A:

    [ 1  -2   1 | 1  0  0 ]   [ 1   0  -1 | 1   2  0 ]   [ 1  0  0 | 1  4  -1 ]
    [ 0   1  -1 | 0  1  0 ] ~ [ 0   1  -1 | 0   1  0 ] ~ [ 0  1  0 | 0  3  -1 ].
    [ 0   2  -3 | 0  0  1 ]   [ 0   0  -1 | 0  -2  1 ]   [ 0  0  1 | 0  2  -1 ]

Therefore, T^{-1}(y) = A^{-1}y, so that T^{-1}([x1, x2, x3]) = [x1 + 4x2 - x3, 3x2 - x3, 2x2 - x3].
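Not from the text: a quick NumPy check of the inverse found in Example 9 and of the fact that T^{-1}(T(x)) = x.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    A = np.array([[1.0, -2.0,  1.0],
                  [0.0,  1.0, -1.0],
                  [0.0,  2.0, -3.0]])
    A_inv = np.linalg.inv(A)
    print(A_inv)                                # [[1, 4, -1], [0, 3, -1], [0, 2, -1]]
    x = np.array([2.0, -1.0, 5.0])
    print(np.allclose(A_inv @ (A @ x), x))      # T^{-1}(T(x)) = x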
SUMMARY

1. A function T: R^n -> R^m is a linear transformation if T(u + v) = T(u) + T(v) and T(ru) = rT(u) for all vectors u, v in R^n and all scalars r.
2. If A is an m x n matrix, then the function T_A: R^n -> R^m given by T_A(x) = Ax for all x in R^n is a linear transformation.
3. A linear transformation T: R^n -> R^m is uniquely determined by T(b1), T(b2), ..., T(bn) for any basis {b1, b2, ..., bn} of R^n.
4. Let T: R^n -> R^m be a linear transformation and let A be the m x n matrix whose jth column vector is T(e_j). Then T(x) = Ax for all x in R^n; the matrix A is the standard matrix representation of T. The kernel of T is the nullspace of A, and the range of T is the column space of A.
5. Let T: R^n -> R^m and T': R^m -> R^k be linear transformations with standard matrix representations A and B, respectively. The composition T' o T of the two transformations is a linear transformation, and its standard matrix representation is BA.
6. If y = T(x) = Ax where A is an invertible n x n matrix, then T is invertible and the transformation T^{-1} defined by T^{-1}(y) = A^{-1}y is the inverse of T. Both T^{-1} o T and T o T^{-1} are the identity transformation of R^n.
EXERCISES
1. Is T([4, Xy X5]) = Py + 2%, %1 — 3x) a 7. If T((1, 0, O}) = [3, 1, 2], T((O, 1, C}) =
linear transformation of R? into R?? Why or [2, -1, 4], and T((0, 0, 1]) = [6, 0, 1], find
why not? T((2, —5, 1)).
+ Is T([%, 2 X]) = [0, 0, 0, 0] a linear . If T({1, 0, O}) = [-3, 1], T((0, 1, 0) =
transformation of R? into R‘? Why or why [4, -1]}, and T({0, -1, 1]) = [3, —5], find
not?
T([-1, 4, 2]).
~ Is T([%, X,, X5]) = [1, 1, 1, 1] a linear
transformation of R? into R*? Why or why . If T[-1, 2) = [1, 0, 0] aad 72, ij) =
not? [0, 1, 2], find 7([0, 10]).
- Ts F(D4, ©) = [4 — 22 + 1, 3x, — 2) a 10. If T([-1, 1]) = (2, 1, 4] and T((A, 1) =
linear traasformation of R? into R?? Why or {—6, 3, 2], find T(Lx, y)).
why not?
11. If T([1, 2, —3]) = {1, 0, 4, 2], T((3, 5, 2]) =
In Exercises 5—12, assume that T is a linear {—8, 3, 0, 1], and T([—-2, -3, -4)) =
transformation. Refer to Example 7 for Exercises {0, 2, —1, 0], find 7([5, —1, 4]).
9-12, if necessary. [Computational aid: See Example 4 in
Section 1.5.]
5. If T([1, 0]) = (3, —!] and 7((0, 1]) =
12. If T((2, 3, 0]) = 8, T({1, 2, -1]) = —S, and
[—2, 5]. find T([4, -61). T([4, 5, l]) = 17, find 7({[-3, tl, -4)).
6. If T({-1, OP = (2, 3] and T({C. 1]) = [5, 4]. [Computational aid: See the answer to
find T([-3, —5}). Exercise 7 in Section 1.5.]
In Exercises 13-18, the given formula defines a ___e. An invertible linear iransformation
linear transformation. Give its standard matrix mapping R’ into itself has a unique
representation. inverse.
ff. The same matrix may be the standard
13. [x,, ,]) = [x + Xy YY — 3x] matrix representation for several different
14. linear transformations.
[X1, X2]) = [2x, — x, Xx, + %, x, + 3x]
. A linear transformation having an m Xx n
15. (1, %, X3]) = [2% + XQ © XG, X) + Xy, Xi] matrix as standard matrix representation
16. [X1, X2, X3]) = [2x + xy +X, x +X, + 35] maps R’ into R”.
17. [X1, X25 X3]) = [%) — Xp + 3X5, xy +. + Xs, 4] . If T and T' are different linear
transformations mapping R? into R”, then
18. [X1, Xp, X3]) = 1 + + %
we may have 7(e,) = T’(e,) for some
19. If T: R? — R} is defined by T([x,, x,]) = standard basis vector e; of R’.
[2x, + x), xX), X; — X,] and T’: R? > R? is i. If T and T’ are different linear
defined by 7’([x,, x, X3]) = transformations mapping R’ into R*, then
[x, — xX, + X;, xX, + xX], find the standard we may have 7(e,) = T'(e,) for all
matrix representation for the linear standard basis vectors e; of R’.
transformation T” ° T that carries R? intc R’. . IfB = {b,, b,, ..., b,} is a basis for R?
Find a formula for (T' ° T\([x,, x,]). and 7 and T” are linear transformations
20. Referring to Exercise 19, find the standard mapping R’ into R”, then T(x) = T'(x)
matrix representation for the linear for all x € R’ if and only if T(b,)) = T'(b,)
transformation T° 7’ that carries R? into R’. fori=1,2,...,7.
Find a2 formula for (T° T')([x,, X, 3). . Verify that 7~'(7(x)) = x for the linear
transformation T in Example 9 of the text.
31. Let T: R? > R" and T': R" > R be linear
transformations. Prove directly from
In Exercises 21-28, determine whether. the Definition 2.3 that (T’ ° T) R" > R¥ is also a
indicated linear transformation T is invertible. If linear transformation.
it is, find a formula for T-'(x) in row notation. If 32. Let T: R? > R” be a linear transformation.
it is not, explain why it is not. Prove from Definition 2.3 that T(ru + sv)
= rT(u) + sT(vy) for all u, v € R’ and all
21. The transformation in Exercise 13. scalars r and s.
22. The transformation in Exercise 14,
23. The transformation in Exercise 15. Exercise 33 shows that the reduced row-echelon
24. The transformation in Exercise 16. form of a matrix is unique.
&
Je The transformation in Exercise i7.
33. Let A be an m X nm matrix with row-echelon
26. The transformation in Exercise 18.
form H, and let V be the row space of A (and
27. The transformation in Exercise 19. thus of H). Let W, = sp(e, @,..., eg
28. The transformation in Exercise 20. be the subspace of R” generated by the first k
29. Mark each of the following True or False. rows of the n x n identity matrix. Consider
—_— a. Every linear transformation is a function. T,: V— W, defined by
__— b. Every function mapping R” into R” is a
linear transformation.
T,((%, Xq, + -» Xl)
= (x1, X2) . ..,Xp, 0,..., 0).
. Composition of linear transformations
corresponds to multiplication of their a. Show that 7, is a linear transformation of
standard matrix representations. V into W, and that 7,[V] =
. Function composition is associative. {T,(v) | vin V} is a subspace of W,..
    [ 1  0 ]      [ 0  0 ]      [ -1  0 ]       [ 1  -3 ]
    [ 0  0 ]      [ 0  1 ]      [  1  0 ]       [ 2  -6 ]                (1)
    Projection    Projection    Collapse        Collapse
    on x-axis     on y-axis     onto y = -x     onto y = 2x
The first two of these matrices produce projections on the coordinate axes, as
labeled. Projection of the plane on a line L through the origin maps each vector v onto a vector p represented geometrically by the arrow starting at the origin and having its tip at the point on L that is closest to the tip of v. The line through the tips of v and p must be perpendicular to the line L; phrased entirely in terms of vectors, the vector v - p must be orthogonal to p. This is illustrated in Figure 2.6. Projection on the x-axis is illustrated in Figure 2.7; we see that we have T([x, y]) = [x, 0], in accord with our labeling of the first matrix in (1). Similarly, the second matrix in (1) gives projection on the y-axis. We refer to such matrices as projection matrices. The third and fourth matrices map the plane onto the
indicated lines, as we readily see by examination of their column vectors. The
transformations represented by these matrices are not projections onto those
lines, however. Note that when projecting onto a line, every vector along the
line is left fixed—that is, it is carried into itself. Now [3, —3] is a vector along
the line y = —x, but
    [ -1  0 ] [  3 ]   [ -3 ]
    [  1  0 ] [ -3 ] = [  3 ],
which shows that the third matrix in (1) is not a projection matrix. A similar
computation shows that the final matrix in (1) is not a projection matrix. (See
Exercise 1.) Chapter 6 discusses larger projection matrices.
EXAMPLE 1  Explain geometrically why T: R^2 -> R^2, which rotates the plane counterclockwise through an angle θ, is a linear transformation, and find its standard matrix representation. An algebraic proof is outlined in Exercise 23.
SOLUTION  We must show that for all u, v, w in R^2 and all scalars r, we have T(u + v) = T(u) + T(v) and T(rw) = rT(w). Figure 2.8(a) indicates that the parallelogram that defines u + v is carried into the parallelogram defining T(u) + T(v) by T, and Figure 2.8(b) similarly shows the lines illustrating rw and T(rw). Thus T preserves addition and scalar multiplication. Figure 2.9 indicates that
FIGURE 2.8
    T(e1) = [cos θ, sin θ]     and     T(e2) = [-sin θ, cos θ].

FIGURE 2.9
Counterclockwise rotation of e1 and e2 through the angle θ.
Thus

    [ cos θ   -sin θ ]
    [ sin θ    cos θ ]        Counterclockwise rotation through θ        (2)

is the standard matrix representation of this transformation.
Another type of rigid motion T of the plane consists of “turning the plane over” around a line L through the origin. Turn the plane by holding the ends of the line L and rotating 180°, as you might hold a pencil by the ends with the “No. 2” designation on top and rotate it 180° so that the “No. 2” is on the underside. In analogy with the rotation in Figure 2.8, the parallelogram defining u + v is carried into one defining T(u) + T(v), and similarly for the arrows defining the scalar product rw. This type of rigid motion of the plane is called a reflection in the line L, because if we think of holding a mirror perpendicular to the plane with its bottom edge falling on L, then T(v) is the reflection of v in this mirror, as indicated in Figure 2.10. Every vector w along L is carried into itself. As indicated in Figure 2.11, the reflection T of the plane
in the x-axis is defined by T([x, y]) = [x, -y]. Because e1 is left fixed and e2 is carried into -e2, we see that the standard matrix representation of this reflection in the x-axis is

    [ 1   0 ]
    [ 0  -1 ].        Reflection in the x-axis        (3)
It can be shown that the rigid motions of the plane carrying the origin into itself are precisely the linear transformations T: R^2 -> R^2 that preserve lengths of all vectors in R^2--that is, such that ||T(x)|| = ||x|| for all x in R^2. We will discuss such ideas further in Exercises 17-22.
Thinking for a moment, we can see that every rigid motion of the plane leaving the origin fixed is either a rotation or a reflection followed by a rotation. Namely, if the plane is not turned over, all we can do is rotate it about the origin. If the plane has been turned over, we can achieve its final position by reflection in the x-axis, turning it over horizontally, followed by a rotation about the origin to obtain the desired position. We will use this last fact in the second solution of the next example. (Actually, every rigid motion leaving the origin fixed and turning the plane over is a reflection in some line through the origin, although this is not quite as easy to see.) The first solution of the next example illustrates that bases for R^2 other than the standard basis can be useful.
EXAMPLE 2  Find the standard matrix representation A for the reflection of the plane in the line y = 2x.
SOLUTION 1  Let b1 = [1, 2], which lies along the line y = 2x, and let b2 = [-2, 1], which is orthogonal to b1 because b1 · b2 = 0. These vectors are shown in Figure 2.12. If T: R^2 -> R^2 is reflection in the line y = 2x, then we have

    T(b1) = b1     and     T(b2) = -b2.

FIGURE 2.12
Reflection in the line y = 2x.
Now {b1, b2} is a basis for R^2, and Theorem 2.7 tells us that T is completely determined by its action on this basis. To find T(e1) and T(e2) for the column vectors in the standard matrix representation A of T, we first express e1 and e2 as linear combinations of b1 and b2. To do this, we solve the two linear systems with e1 and e2 as column vectors of constants and b1 and b2 as columns of the coefficient matrix, as follows:
    [ 1  -2 | 1  0 ]   [ 1  -2 |  1   0 ]   [ 1  0 |  1/5   2/5 ]
    [ 2   1 | 0  1 ] ~ [ 0   5 | -2   1 ] ~ [ 0  1 | -2/5   1/5 ].
Thus we have

    e1 = (1/5)b1 - (2/5)b2     and     e2 = (2/5)b1 + (1/5)b2.

Consequently,

    T(e1) = (1/5)T(b1) - (2/5)T(b2) = (1/5)b1 + (2/5)b2 = (1/5)[1, 2] + (2/5)[-2, 1] = [-3/5, 4/5]

and

    T(e2) = (2/5)T(b1) + (1/5)T(b2) = (2/5)b1 - (1/5)b2 = (2/5)[1, 2] - (1/5)[-2, 1] = [4/5, 3/5],

so that

        [ -3/5   4/5 ]
    A = [  4/5   3/5 ].
SOLUTION 2  The three parts of Figure 2.13 show that we can attain the reflection of the plane in the line y = 2x as follows: First reflect in the x-axis, taking us from part (a) to part (b) of the figure, and then rotate counterclockwise through the angle 2θ, where θ is the angle from the x-axis to the line y = 2x, measured counterclockwise. Using the double angle formulas derived in Section 2.3, we see from the right triangle in Figure 2.13(a) that

    sin 2θ = 2 sin θ cos θ = 2(2/√5)(1/√5) = 4/5

and

    cos 2θ = cos^2 θ - sin^2 θ = 1/5 - 4/5 = -3/5.

Replacing θ by 2θ in the matrix in Example 1, we see that the standard matrix representation for rotation through the angle 2θ is

    [ -3/5  -4/5 ]
    [  4/5  -3/5 ].
FIGURE 2.13
(a) The vector v; (b) reflected; (c) rotated.

Thus

        [ -3/5  -4/5 ] [ 1   0 ]   [ -3/5   4/5 ]
    A = [  4/5  -3/5 ] [ 0  -1 ] = [  4/5   3/5 ].
           Rotate        Reflect
In Example 2, note that because we first reflect in the x-axis and then rotate through 2θ, the matrix for the reflection is the one on the right, which acts on a vector v in R^2 first when computing Av.*

*This right-to-left order for composite transformations occurs because we write functions on the left of the elements of the domain on which they act, writing f(x) rather than (x)f. From a pedagogical standpoint, writing functions on the left must be regarded as a peculiarity in the development of mathematical notations in a society where text is read from left to right. If we wrote functions on the right side, then we would take the transpose of Ax and write
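Not part of the text: a short NumPy check of the reflection matrix found in Example 2; vectors along y = 2x are fixed, orthogonal vectors are negated, and reflecting twice is the identity.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    A = np.array([[-3/5, 4/5],
                  [ 4/5, 3/5]])
    print(np.allclose(A @ np.array([1.0, 2.0]), [1.0, 2.0]))    # b1 on the line is fixed
    print(np.allclose(A @ np.array([-2.0, 1.0]), [2.0, -1.0]))  # b2 orthogonal to the line is negated
    print(np.allclose(A @ A, np.eye(2)))                        # reflecting twice gives the identity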
EXAMPLE 3  Describe geometrically the effect on the plane of the linear transformation T_E, where E is an elementary matrix obtained by multiplying a row of the 2 x 2 identity matrix I by -1.
SOLUTION  The matrix obtained by multiplying the second row of I by -1 is

    E = [ 1   0 ]
        [ 0  -1 ],

which is the matrix given as matrix (3). We saw there that T_E is the reflection in the x-axis and T_E([x, y]) = [x, -y]. Similarly, we see that the elementary matrix obtained by multiplying the first row of I by -1 represents the transformation that changes the sign of the first component of a vector, carrying [x, y] into [-x, y]. This is the reflection in the y-axis.
EXAMPLE 4  Describe geometrically the effect on the plane of the linear transformation T_E, where E is an elementary matrix obtained by interchanging the rows of the 2 x 2 identity matrix I.
SOLUTION  Here we have

    E = [ 0  1 ]
        [ 1  0 ].

In row-vector notation, we have T_E([x, y]) = [y, x]. Figure 2.14 indicates that this transformation, which interchanges the components of a vector in the plane, is the reflection in the line y = x.
FIGURE 2.14
Reflection in the line y = x.
for some nonzero scalar r. We discuss the first case and leave the second as Exercise 8. The transformation is given by

    T([ x ]) = [ r  0 ] [ x ] = [ rx ]
      ([ y ])   [ 0  1 ] [ y ]   [ y  ],

or, in row notation, T([x, y]) = [rx, y]. The second component of [x, y] is unchanged. However, the first component is multiplied by the scalar r, resulting in a horizontal expansion if r > 1 or in a horizontal contraction if 0 < r < 1. In Figure 2.15, we illustrate the effect of such a horizontal expansion or contraction on the points of the unit circle. If r < 0, we have an expansion or contraction followed by a reflection in the y-axis. For example,

    [ -2  0 ]   [ -1  0 ]  [ 2  0 ]
    [  0  1 ] = [  0  1 ]  [ 0  1 ].
                Reflection  Horizontal expansion
    [ 1  0 ]        [ 1  r ]
    [ r  1 ]   or   [ 0  1 ]

for some nonzero scalar r. We discuss the first case, and leave the second as Exercise 10. The transformation is given by
FIGURE 2.15
(a) T([x, y]) = [(1/3)x, y] contracts horizontally; (b) T([x, y]) = [3x, y] expands horizontally.
addition of rx. For example, [1, 0] is carried onto [1, r], and [1, 1] is carried onto [1, 1 + r], while [0, 0] and [0, 1] are carried onto themselves. Notice that every vector along the y-axis remains fixed. Figure 2.16 illustrates the effect of this transformation. The squares shaded in black are carried onto the parallelograms shaded in color. This transformation is called a vertical shear. Exercise 10 deals with the case of a horizontal shear.
FIGURE 2.16
(a) The vertical shear T([x, y]) = [x, rx + y], r > 0;
(b) the vertical shear T([x, y]) = [x, rx + y], r < 0.
EXAMPLE 7  Illustrate the result relating to the boxed description above for the invertible linear transformation T([x, y]) = [x + 2y, 3x + 4y].
SOLUTION  We reduce the standard matrix representation A of T, obtaining

        [ 1  2 ]   [ 1   2 ]   [ 1  2 ]   [ 1  0 ]
    A = [ 3  4 ] ~ [ 0  -2 ] ~ [ 0  1 ] ~ [ 0  1 ],
          R2 -> R2 - 3R1  R2 -> -(1/2)R2  R1 -> R1 - 2R2
               E1               E2              E3

and so

        [ 1  0 ] [ 1   0 ] [ 1  2 ]
    A = [ 3  1 ] [ 0  -2 ] [ 0  1 ]  =  E1^{-1} E2^{-1} E3^{-1}.
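Not part of the text: a quick NumPy check that the three elementary factors above multiply back to A, with each factor interpreted geometrically as in this section.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    E1_inv = np.array([[1, 0], [3, 1]])     # undoes R2 -> R2 - 3R1   (a vertical shear)
    E2_inv = np.array([[1, 0], [0, -2]])    # undoes R2 -> -(1/2)R2   (vertical expansion, then reflection)
    E3_inv = np.array([[1, 2], [0, 1]])     # undoes R1 -> R1 - 2R2   (a horizontal shear)

    print(E1_inv @ E2_inv @ E3_inv)         # [[1, 2], [3, 4]], the matrix A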
SUMMARY

3. Rotation of the plane counterclockwise about the origin through the angle θ has standard matrix representation

    [ cos θ   -sin θ ]
    [ sin θ    cos θ ].
EXERCISES
the rotation of the plane counterclockwise
about the origin through an angle of
a. 30°,
b. 90°, corresponds to a horizontal shear of the
c. 135°. plane.
. Give the standard matrix representation of
the rotation of the plane clockwise about the In Exercises 11-15, express the standard matrix
origin through an angle of representation of the given invertible
a. 45°, transformation of R’ into itself as a product of
b. 60°, elementary matrices. Use this expression to
c. 150°. describe the transformation as a product of one or
. Use the rotation matrix in item 3 of the more reflections, horizontal or vertical expansions
Summary to derive trigonometric identities or contractions, and shears.
for sin 38 and cos 36 in terms of sin @ and
cos @. (See Illustration 1, Section 2.3.) 11. T([x, yi) = [-y, »] (Rotation
. Use the rotation matrix in item 3 of the counterclockwise through 90°)
Summary to derive trigonometric identities 12. T((x, y]) = (2x, 2y] (Expansion away from
for sin(@ + @) and cos (6 + ¢) in terms of the origin by a factor of 2)
sin 6, sin ¢, cos @, and cos ¢. (See
Illustration 1, Section 2.3.) 13. T([x, yl) = [-x, —y] (Rotation through
180°)
. Find the general matrix representation for
the reflection of the plane in the line y = 14. T([x, yl) = [x + y, 2x — y]
mx, using the method for the case m = 2 in 15. T([x, yl) = [x + y, 3x + Sy]
Solution | of Example 2 in the text. 16. Mark each of the following True or False.
. Repeat Exercise 6, but use the method for — a. Every rotation of the plane is a linear
the case m = 2 in Solution 2 of Example 2 transformation.
in the text. . Every rotation of the plane about the
. Show that the linear transformation origin is a linear transformation.
. Every reflection of the plane in a line Z is
r( x — {1 0|{x a rigid motion of the plane.
yy {0 rily . Every reflection of the plane in a line Z is
a linear transformation of the plane.
affects the plane R? as follows: . Every rigid motion of the plane that
(1) A vertical expansion, if r > 1; carries the origin into itself is a linear
transformation.
(ii) A vertical contraction, if 0 < r< 1;
f. Every invertible linear transformation of
(ii1) A vertical expansion followed by a the plane is a rigid motion.
reflection in the x-axis, if r< —]; . Ifa linear transformation T: R? > R? is a
(iv) A vertical contraction followed by a ngid motion of the plane, then
reflecticn in ihe x-axis, if -] <7 <0. I 7yx)H] = |x|] for all x © R?.
—_—h. The geometric effect of all invertible 19. Express both the length of a vector vy € R?
linear transformations of R? into itself and the angle between two nonzero vectors
can be described in terms of the u, v € R’ in terms of the dot product only.
geometric effect of the linear (From this we may conclude that if a linear
transformations of R? having elementary transformation T: R? — R? preserves the dot
matrices as standard matnx product, then it preserves length and angle.)
representations. 20. Suppose that 7,,: R? > R? preserves both
. i. Every linear transformation of the plane length and angle. Prove that the two column
into itself can be achieved through a vectors of the matrix A are orthogonal unit
succession of reflections. expansions, vectors.
contractions, and shears.
. Every invertible linear transformation of 21. Prove that the two column vectors of a
2 < 2 matrix A are orthogonal unit vectors if
the plane into itself can be achieved
and only if (A’)A = I. Demonstrate that the
through a succession of reflections,
matrix representations for the rigid motions
expansions, contractions, and shears.
given in Examples | and 2 satisfy this
condition.
22. Let A be a2 X 2 matrix such that (A7)A = I.
A linear transformation T: R? > R? Prove that the linear transformation T,
preserves length if ||7(x)|] = ||x|| for all preserves the dot product, and hence also
x & R’. It preserves angle if the angle preserves length and angle. [Hint: Note that
between u and v is the same as the angle the dot product of two column vectors
between 7(u) and 7(v) for ali u, v E R’. It u, v & R" is the entry in the 1 x 1! matrix
preserves the dot product if T(u) - 7(v) = (u7)v. Compute the dot product 7,(u) - T,(v)
uv for all u,v & R?. by computing (Au)"(Av).]
23. This exercise outlines an algebraic proof that
rotation of the plane about the origin is a
linear transformation. Let T: R? > R’ be the
We recommend that Exercises 17-22 be worked function that rotates the plane
sequentially, or at least be read sequentially. counterclockwise through an angle 6 as in
Example 1.
17. Use the familiar equation that describes the a. Prove algebraically that each vector
dot product u - v geometrically to prove that v & R? can be written in the polar form
if a linear transformation T: R? > R? v = r[cos a, sin a]. [Hint: Each unit
preserves both length and angle, then it also vector has this form with r = 1.]
preserves the dot product. b. For v = r[cos a, sin aj, express T(y) in
18. Use algebraic properties of the dot product this polar form.
to compute ||u — v|? = (u — v) - fu — v), and c. Using column-vector notation and .
prove from the resulting equatisn that a appropriate trigonometric identities, find
linear transformation T: R? > R? that a matrix A such that 7(v) = Av. The
preserves length also preserves the dot existence of such a matrix A proves that
product. T is a linear transformation.
2.5 LINES, PLANES, AND OTHER FLATS (OPTIONAL)
We turn to geometry in this section, and generalize the notions of a /ine in the
plane or in space and of a plane in space. Our work in the preceding sections
will enable us to describe geometrically the solution set of any consistent linear
system.
SOLUTION  The subset S of R^2 is a disk with center at the origin and a radius of 2. As shown in Figure 2.20, its translate by [3, -4] is the disk with center at the point (3, -4) and a radius of 2.
    x1 = td1 + a1
    x2 = td2 + a2        Component equations of L
     .
     .
    xn = tdn + an.
2.5 LINES, PLANES, AND OTHER FLATS (OPTIONAL) 16
In classical geometry, component equations for L are also called parametric equations for L, and the variable t is a parameter. Of course, this same parameter t appears in the vector equation also.
EXAMPLE 2  Find a vector equation and component equations for the line in R^2 through (2, 1) having direction vector [3, 4]. Then find the point on the line having -4 as its x1-coordinate.
SOLUTION  The line can be characterized as the translate of sp([3, 4]) by the vector [2, 1], and so a vector equation of the line is

    [x1, x2] = t[3, 4] + [2, 1].
EXAMPLE 3  Find parametric equations of the line in R^3 that passes through the points (2, -1, 3) and (1, 3, 5).
SOLUTION  We arbitrarily choose a = [2, -1, 3] as the translation vector corresponding to the point (2, -1, 3) on the line. A direction vector is given by

    d = [1, 3, 5] - [2, -1, 3] = [-1, 4, 2],
FIGURE 2.21
The line passing through (2, -1, 3) and (1, 3, 5).
Line Segments

Consider the line in R^n that passes through the two points (a1, a2, ..., an) and (b1, b2, ..., bn). Letting a and b be the vectors corresponding to these points, we see as in Example 3 that d = b - a is a direction vector for the line. The vector equation

    x = td + a = t(b - a) + a                                            (1)

for the line in effect presents the line as a t-axis whose origin is at the point (a1, a2, ..., an) and on which a one-unit change in t corresponds to ||d|| units distance in R^n. This is illustrated in Figure 2.22.
As illustrated in Figure 2.23, each point in R^n on the line segment that joins the tip of a to the tip of b lies at the tip of a vector x obtained in Eq. (1) for some value of t for which 0 <= t <= 1. Note that t = 0 yields the point at the tip of a and t = 1 yields the point at the tip of b. By choosing t between 0 and 1 appropriately, we can find the coordinates of any point on this line segment. In particular, the coordinates of the midpoint of the line segment are the components of the vector

    x = a + (1/2)(b - a) = (1/2)(a + b).
EXAMPLE 4  Find the points that divide into five equal parts the line segment that joins (1, 2, 1, 3) to (2, 1, 4, 2) in R^4.
SOLUTION  We obtain d = [2, 1, 4, 2] - [1, 2, 1, 3] = [1, -1, 3, -1] as a direction vector for the line through the two given points. The corresponding vector equation of the line is x = a + td with a = [1, 2, 1, 3], and the required points are at the tips of the vectors

    a + (1/5)d,     a + (2/5)d,     a + (3/5)d,     a + (4/5)d.
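Not part of the text: a few lines of NumPy that evaluate these division points from the vector equation x = a + td.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    a = np.array([1.0, 2.0, 1.0, 3.0])
    b = np.array([2.0, 1.0, 4.0, 2.0])
    d = b - a                                   # direction vector [1, -1, 3, -1]
    for t in (0.2, 0.4, 0.6, 0.8):              # t = 1/5, 2/5, 3/5, 4/5
        print(a + t*d)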
TABLE 2.1
Flats in R’
Just as a line is a translate of a one-dimensional subspace in R^n, a plane in R^n is a translate of a two-dimensional subspace sp(d1, d2), where d1 and d2 are nonzero, nonparallel vectors in R^n. A plane appears as a flat piece of R^n, as illustrated in Figure 2.24. We have no word analogous to “straight” or “flat” in our language to denote that R^3 is not “curved.” We borrow the term “flat” when generalizing to higher dimensions, and describe a translate of a k-dimensional subspace of R^n for k < n as being “flat.” Let us give a formal definition.
FIGURE 2.24
Planes, or 2-flats, in R^3.
is the vector equation of the k-flat. (We use the letter d because W determines
the direction of the k-flat as being parallel to W.) The corresponding
component equations are again called parametric equations for the k-flat.
EXAMPLE 5  Find parametric equations of the plane in R^4 passing through the points (1, 1, 1, 1), (2, 1, 1, 0), and (3, 2, 1, 0).
SOLUTION  We arbitrarily choose a = [1, 1, 1, 1] as the translation vector corresponding to the point (1, 1, 1, 1) on the desired plane. Two vectors that (when translated) start at this point and reach to the other two points are

    d1 = [1, 0, 0, -1]     and     d2 = [2, 1, 0, -1],

so a vector equation of the plane is

    [x1, x2, x3, x4] = s[1, 0, 0, -1] + t[2, 1, 0, -1] + [1, 1, 1, 1].
HISTORICAL NOTE  THE EQUATION OF A PLANE IN R^3 appears as early as 1732 in a paper of Jacob Hermann (1678-1733). He was able to determine the plane's position by using intercepts, and he also noted that the sine of the angle between the plane and the one coordinate plane he dealt with (what we call the x1, x2-plane) was

    √(d1^2 + d2^2) / √(d1^2 + d2^2 + d3^2).

In his 1748 Introduction to Infinitesimal Analysis, Leonhard Euler (1707-1783) used, instead, the cosine of this angle, d3 / √(d1^2 + d2^2 + d3^2).
At the end of the eighteenth century, Gaspard Monge (1746-1818), in his notes for a course on solid analytic geometry at the Ecole Polytechnique, related the equation of a plane to all three coordinate planes and gave the cosines of the angles the plane made with each of these (the
so-called direction cosines). He also presented many of the standard problems of solid analytic
geometry, examples of which appear in the exercises. For instance, he showed how to find the
plane passing through three given points, the line passing through a point perpendicular to a plane,
the distance between two parallel planes, and the angle between a line and a plane.
Known as “‘the greatest geometer of the eighteenth century,” Monge developed new graphical
geometric techniques as a student and later as a professor at a military school. The first problem he
solved had to do with a procedure enabling soldiers to make quickly a fortification capable of
shielding a position from both the view and the firepower of the enemy. Monge served the French
revolutionary government as minister of the navy and later served Napoleon in various scientific
offices. Ultimately, he was appointed senator for life by the emperor.
FIGURE 2.25
A 2-flat in R^4.
    x1 = s + 2t + 1
    x2 = t + 1
    x3 = 1
    x4 = -s - t + 1.
EXAMPLE 6  Show that the linear equation c1 x1 + c2 x2 + c3 x3 = b, where at least one of c1, c2, c3 is nonzero, represents a plane in R^3.
SOLUTION  Let us assume that c1 ≠ 0. A particular solution of the given equation is a = [b/c1, 0, 0]. The corresponding homogeneous equation has a solution space generated by d1 = [c3, 0, -c1] and d2 = [c2, -c1, 0]. Thus the solution set of the linear equation is a 2-flat in R^3 with equation x = s d1 + t d2 + a--that is, a plane in R^3.
shown that a k-flat in R^n is the solution set of some system of n - k linear equations in n unknowns. That is, a k-flat in R^n is the intersection of n - k hyperplanes. Thus there are two ways to view a k-flat in R^n:

1. As a translate of a k-dimensional subspace of R^n, described using parametric equations;
2. As an intersection of n - k hyperplanes, described with a system of linear equations.
EXAMPLE 8  Describe the line (1-flat) in R^3 that passes through (2, -1, 3) and (1, 3, 5) in terms of
(1) parametric equations, and
(2) a system of linear equations.
SOLUTION  (1) In Example 3, we found the parametric equations for the line:

    x1 = -t + 2,     x2 = 4t - 1,     x3 = 2t + 3.                        (3)

(2) In order to describe the line with a system of linear equations, we eliminate the parameter t from Eqs. (3):

    4x1 + x2 = 7         Add four times the first to the second.          (4)
    x2 - 2x3 = -7        Subtract twice the third from the second.

This system describes the line as an intersection of two planes. The line can be represented as the intersection of any two distinct planes, each containing the line. This is illustrated by the equivalent systems we have at the various stages in the Gauss reduction of system (4) to obtain solution (3).
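Not part of the text: a short NumPy check that every point of the parametric line in Example 8 satisfies both equations of system (4), illustrating the two descriptions of the same 1-flat.

    # A minimal sketch, assuming NumPy is available.
    import numpy as np

    for t in np.linspace(-2.0, 2.0, 5):
        x1, x2, x3 = -t + 2, 4*t - 1, 2*t + 3
        print(np.isclose(4*x1 + x2, 7), np.isclose(x2 - 2*x3, -7))   # True True on every line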
SUMMARY

1. The translate of a subset S of R^n by a vector a in R^n is the set of all vectors in R^n of the form x + a for x in S, and is denoted by S + a.
2. A k-flat in R^n is a translate of a k-dimensional subspace and has the form a + sp(d1, d2, ..., dk), where a is a vector in R^n and d1, d2, ..., dk are independent vectors in R^n. The vector equation of the k-flat is x = a + t1 d1 + t2 d2 + ... + tk dk for scalars ti in R.
3. A line in R^n is a 1-flat. The line passing through the point a with parallel vector d is given by x = a + td, where t runs through all scalars. Parametric equations of the line are the component equations xi = ai + di t for i = 1, 2, ..., n.
4. Let a and b be vectors in R^n. Vectors to points on the line segment from the tip of a to the tip of b are vectors of the form x = t(b - a) + a for 0 <= t <= 1.
5. A plane in R^n is a 2-flat; a hyperplane in R^n is an (n - 1)-flat.
| EXERCISES
In, keeping with classical geometry, many of the 11. For each pair of points, find parametric
exercises that follow are phrased in terms of equations of the line containing them.
points rather than in terras of vectors. a. (-2, 4) and (3, —1) in R?
In Exercises 1-6, sketch the indicated b. (3, —1, 6) and (0, —3, —1) in R?
translate of the subset of R" in an appropriate c. (2, 0, 4) and (-—1, 5, —8) in R?
figure. . For each of the giveu pairs of lines in R’,
determine whether the lines intersect. If they
do intersect, find the point of intersection,
. The translate of the line x, = 2x, + 3 in R? and determine whether the lines are
by the vector [—3, 0] orthogonal.
. The translate of {(t, 2) | ¢ € R} in R? by the ax=H=4tt x%,=2- 31,
vector [-1, —2] x,=-3+5
. The translate of {x € R?| |x| < 1 for i= 1, 2} and
by the vector [1, 2} xX =114+3s, «x, =-9
- 4s,
. The translate of {x € R? | ||x|| = 3} by the x;= -4- 3s
vector {2, 3| bh x= 1!+34, 4 =-3-4,
. The translate of {x € R? | ||x|| <= 1} by the x,=4+ 3t
vector [2, 4, 3] and
. The translate of the plane x, + x, = 2 in R? xX=6-25, xm=-2+5,
by the vector [—1, 2, 3} x;= -15 + Ts
. Give parametric equations for the line in R? 13. Find all points in common to the lines in R?
through (3, —3) with direction vector given by x, = 5 ~ 3t,x, = -1 + tand
d = [-—8, 4]. Sketch the line in an x, = -7 + 65, x, = 3 — 2s.
appropriate figure. 14, Find parametric equations for the line in R°
. Give parametric equations for the line in R? through (—1, 2, 3) that is orthogonal to each
through (—1, 3, 0) with direction vector of the two lines having parametric equations
d = [-2, -1, 4]. Sketch the line in an xX, = -2 + 3t, x, = 4, x; = i — tand
appropriate figure. X=7-6%,=2+ 34%,=4 +1.
. Consider the line in R? that is given by the . Find the midpoint of the line segment
equation dx, + d,x, = c for numbers d,, d,, joining each pair of points.
and c in R, where d, and d, are not both a. (—2, 4) and (3, -1) in R?
zero. Find parametric equations of the line. b. (3, —1, 6) and (0, —3, —1) in R?
. Find parametric equations for the line in R? c. (0, 4, 8) and (-4, S, 9) in R?
through (5, —1) and orthogonal to the line 16. Find the point in R? on the line segment
“ith parametric equations x, = 4 — 21, joining (—t, 3) and (2, 5) tnaz is twice as
Wwe? te closc 1:0 (—!. 3) as to (2, 5).
17. Find the point in R? on the line segment 29. Find a linear system with two equations
joining (—2, 1, 3) and (0, —5, 6) that is in four variables whose solution set is the
one-fourth of the way from (—2, 1, 3) to plane in Exercise 28. [See the hint for
(0, —5, 6). Exercise 23.]
18. Find the points that divide the line segment 30. Find a vector equation of the hyperplane
between (2, 1, 3, 4) and (—1, 2, 1, 3) in R* that passes through the points (1, 2, 1, 2, 3),
into three équal parts. (0, 1, 2, 1, 3), (0, 0, 3, 1, 2), (0, 0, 0, 1, 4),
19. Find the midpoint of the line segment and (0, 0, 0, 0, 2) in R°.
between (2, !, 3, 4, 0) and (1, 2, —1, 3, -1)
31. Find a single linear equation in five variables
in R?.
whose solution set is the hyperplane in
20. Find the intersection in R? of the line given Exercise 30. [See the hint for Exercise 23.]
by
32. Find a vector equation of the hyperplane in
X= 5+8 x, = —3t, x,=-2+4 R° through the endpoints of e,, e,, . . . , &
and the piane with equation x, — 3x, + 2x, 33. Find a single linear equation in six variables
= —25, whose solution set is the hyperplane in
21. Find the intersection in R? of the line given Exercise 32. [See the hint for Exercise 23.]
by
X=2, »x=5-12, X;
= 20
In Exercises 34-42, solve the given systern of
and the plane with equation x, + 2x, = 10. linear equations and write the solution set asa
22. Find parametric equations of the plane that k-flat.
passes through the unit coordinate points
(1, 0, 0}, (0, 1, 0), and (0, 0, 1) in R?. 34. xX, ~ 2X, = 3
3 VECTOR SPACES
For the sake of efficiency, mathematicians often study objects just in terms of
their mathematical structure, deemphasizing such things as particular sym-
bols used, names of things, and applications. Any properties derived exclu-
sively from mathematical structure will hold for all objects having that
structure. Organizing mathematics in this way avoids repeating the same
arguments in different contexts. Viewed from this perspective, linear algebra is
the study of all objects that have a vector-space structure. The Euclidean spaces
R" that we treated in Chapters 1 and 2 serve as our guide.
Section 3.1 defines the general notion of a vector space, motivated by the
familiar algebraic structure of the spaces R^n. Our examples focus mainly on spaces other than R^n, such as function spaces. Unlike the first two chapters in
our text, this chapter draws on calculus for many of its illustrations.
Section 3.2 explains how the linear-algebra terminology introduced in
Chapter | for R" carries over to a general vector space V. The definitions given
in Chapters 1 and 2 for linear combinations, spans, subspaces, bases,
dependent vectors, independent vectors, and dimension can be left mostly
unchanged, except for replacing “R””’ by “a vector space V.” Indeed, with this
replacement, many of the theorems and proofs in Chapter i have word-for-
word validity for general vector spaces.
Section 3.3 shows that every finite-dimensional (real) vector space can be
coordinatized to become algebraically indistinguishable from one of the spaces R^n. This coordinatization allows us to apply the matrix techniques developed
in Chapters 1 and 2 to any finite-dimensional vector space for such things as
determining whether vectors are independent or form a basis.
Linear transformations of one vector space into another are the topic of
Section 3.4. We will see that some of the basic operations of calculus, such as
differentiation, can be viewed as linear transformations.
To conclude the chapter, optional Section 3.5 describes how we try to
access such geometric notions as length and angle even in infinite-dimensional
vector spaces.
For example, we know how to add the two functions x^2 and sin x, and we know how to multiply them by a real number. We require that, whenever addition or scalar multiplication is performed with elements in V, the answers obtained lie again in V. That is, we require that V be closed under vector addition and closed under scalar multiplication. This notion of closure under an operation is familiar to us from Chapter 1.
HISTORICAL NOTE ALTHOUGH THE OBJECTS WE CALL VECTOR SPACES were well known in the late
nineteenth century, the first mathematician to give an abstract definition of a vector space was
Giuseppe Peano (1858-1932) in his Calcolo Geometrico of 1888. Peano’s aim in the book, as the
title indicates, was to develop a geometric calculus. Such a calculus “consists of a system of
operations analogous to those of algebraic calculus but in which the objects with which the
calculations are performed are, instead of numbers, geometrical objects.” Much of the book
consists of calculations dealing with points, lines, planes, and volumes. But in the ninth chapter,
Peano defines what he called a /inear system. This was a set of objects that was provided with
operations of addition and scalar multiplication. These operations were to satisfy axioms Al-A4
and S1-S4 presented in this section. Peano also defined the dimension of a linear system to be the
maximum number of linearly independent objects in the system and noted that the set of
polynomial functions in one variable forms a linear system of infinite dimension.
Curiously, Peano’s work had no immediate effect on the mathematical community. The
definition was even forgotten. It only entered the mathematical mainstream through the book
Space-Time-Matter (1918) by Hermann Weyl (1885-1955). Weyl wrote this book as an
introduction to Einstein’s general theory of relativity. In Chapter 1 he discusses the nature of a
Euclidean space and, as part of that discussion, formulates the same standard axioms as Peano did
earlier. He also gives a philosophic reason for adopting such a definition:
Not only in geometry, but to a still more astonishing degree in physics, has it becomc more and more
evident that as soon as we have succeeded in unraveling fully the natural laws ‘vhich govern reality, we find
them to be expressible by mathematical relations of surpassing simplicity and architectonic perfection
... Analytical geometry [ihe axiom system which he presented] . . . conveys an idea, even if inadequate,
of this perfection of form.
EXAMPLE 1  Show that the set M_mn of all m x n matrices is a vector space, using as vector addition and scalar multiplication the usual addition of matrices and multiplication of a matrix by a scalar.
SOLUTION  We have seen that addition of m x n matrices and multiplication of an m x n matrix by a scalar again yield an m x n matrix. Thus, M_mn is closed under vector addition and scalar multiplication. We take as zero vector in M_mn the usual zero matrix, all of whose entries are zero. For any matrix A in M_mn, we consider -A to be the matrix (-1)A. The properties of matrix arithmetic on page 45 show that all eight properties A1-A4 and S1-S4 required of a vector space are satisfied.
The preceding example introduced the notation M_mn for the vector space of all m x n matrices. We use M_n for the vector space of all square n x n matrices.
EXAMPLE 2  Show that the set P of all polynomials in the variable x with coefficients in R is a vector space, using for vector addition and scalar multiplication the usual addition of polynomials and multiplication of a polynomial by a scalar.
SOLUTION  Let p and q be polynomials

    p = a0 + a1 x + a2 x^2 + ... + an x^n

and

    q = b0 + b1 x + b2 x^2 + ... + bm x^m.

If m <= n, the sum of p and q is given by

    p + q = (a0 + b0) + (a1 + b1)x + ... + (am + bm)x^m + a_{m+1} x^{m+1} + ... + an x^n.
EXAMPLE 3  Let F be the set of all real-valued functions with domain R; that is, let F be the set of all functions mapping R into R. The vector sum f + g of two functions f and g in F is defined in the usual way to be the function whose value at any x in R is f(x) + g(x); that is,

    (f + g)(x) = f(x) + g(x).

For any scalar r in R and function f in F, the product rf is the function whose value at x is rf(x), so that

    (rf)(x) = rf(x).
Show that F with these operations is a vector space.
SOLUTION  We observe that, for f and g in F, both f + g and rf are functions mapping R into R, so f + g and rf are in F. Thus, F is closed under vector addition and under scalar multiplication. We take as zero vector in F the constant function whose value at each x in R is 0. For each function f in F, we take as -f the function (-1)f in F.
There are four vector-addition properties to verify, and they are all easy. We illustrate by verifying condition A4. For f in F, the function f + (-f) = f + (-1)f has as its value at x in R the number f(x) + (-1)f(x), which is 0. Consequently, f + (-f) is the zero function, and A4 is verified.
The scalar multiplication properties are just as easy to verify. For example, to verify S4, we must compute 1f at any x in R and compare the result with f(x). We obtain (1f)(x) = 1·f(x) = f(x), and so 1f = f.
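Not part of the text: the pointwise operations of Example 3, written as higher-order Python functions, with a spot check of axiom A4 at a few points. The names add, scale, and zero are introduced here only for this illustration.

    # A minimal sketch in plain Python.
    import math

    def add(f, g):
        return lambda x: f(x) + g(x)      # (f + g)(x) = f(x) + g(x)

    def scale(r, f):
        return lambda x: r * f(x)         # (rf)(x) = r f(x)

    zero = lambda x: 0.0                  # the zero vector of F
    f = lambda x: x**2
    g = math.sin

    s = add(f, g)                                       # the vector sum f + g
    print(s(1.0) == f(1.0) + g(1.0))                    # pointwise definition
    h = add(f, scale(-1.0, f))                          # f + (-1)f
    print(all(h(x) == zero(x) for x in (-2.0, 0.0, 3.5)))   # A4: f + (-f) is the zero function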
EXAMPLE 4  Show that the set P_∞ of formal power series in x of the form

    a0 + a1 x + a2 x^2 + ... = Σ an x^n     (summed over n = 0, 1, 2, ...),

with addition and scalar multiplication given by

    (Σ an x^n) + (Σ bn x^n) = Σ (an + bn) x^n     and     r(Σ an x^n) = Σ (r an) x^n,

is a vector space.*
SOLUTION  The reasoning here is precisely the same as for the space P of polynomials in Example 2. The zero series is Σ 0 x^n, and the additive inverse of Σ an x^n is Σ (-an) x^n. All the other axioms follow from the associative, commutative, and distributive properties of the real numbers.
The examples of vector spaces that we have presented so far are all based on algebraic structures familiar to us--namely, the algebra of matrices, polynomials, and functions. You may be thinking, “How can anything with an

*We can add only a finite number of vectors in a vector space. Thus we do not regard these formal power series as infinite sums in P_∞ of monomials. Also, we are not concerned with questions of convergence or divergence, as studied in calculus. This is the significance of the word formal.
EXAMPLE 5  Let R^2 have the usual operation of addition, but define scalar multiplication by r[x, y] = [0, 0]. Determine whether R^2 with these operations is a vector space.
SOLUTION  Because conditions A1-A4 of Definition 3.1 do not involve scalar multiplication, and because addition is the usual operation, we need only check conditions S1-S4. We see that all of these hold, except for condition S4: because 1[x, y] = [0, 0], the scale is not preserved. Thus, R^2 is not a vector space with these particular two operations.
EXAMPLE 6  Let R^2 have the usual scalar multiplication, but let addition ⊕ be defined on R^2 by the formula

    [x, y] ⊕ [z, w] = [x + z, 2y + 2w].

Determine whether R^2 with these operations is a vector space. (We use the symbol ⊕ for the warped addition of vectors in R^2, to distinguish it from the usual addition.)
SOLUTION  We check the associative law for ⊕:

    ([x1, y1] ⊕ [x2, y2]) ⊕ [x3, y3] = [x1 + x2, 2y1 + 2y2] ⊕ [x3, y3] = [x1 + x2 + x3, 4y1 + 4y2 + 2y3],
    [x1, y1] ⊕ ([x2, y2] ⊕ [x3, y3]) = [x1, y1] ⊕ [x2 + x3, 2y2 + 2y3] = [x1 + x2 + x3, 2y1 + 4y2 + 4y3].

Because the coefficients of y1 in the two results are not equal, we expect that ⊕ is not associative. We can find a specific violation of the associative law by choosing y ≠ 0; for example,

    ([0, 1] ⊕ [0, 0]) ⊕ [0, 0] = [0, 4],

whereas
axioms, we can then use it in the proofs of other things. The properties that appear in Theorem 3.1 below are listed in a convenient order for proof; for example, we will see that it is convenient to know property 3 in order to prove property 4. Property 4 states that in a vector space V, we have 0v = 0 for all v in V. Students often attempt to prove this by saying,
This is a fine argument if V is R^n, but we have now expanded our concept of vector space, and we can no longer assume that v in V is some n-tuple of real numbers. Instead, axiom S2 gives

    0v = (0 + 0)v = 0v + 0v.

By the additive identity axiom A3 and the commutative law A2, we know that

    0v = 0 + 0v = 0v + 0.

Therefore,

    0v + 0v = 0v + 0,
EXAMPLE 7  Let F be the set of all real-valued functions on a (nonempty) set S; that is, let F be the set of all functions mapping S into R. For f, g in F, let the sum f + g of two functions f and g in F be defined by (f + g)(x) = f(x) + g(x) for each x in S.

For instance, we can view a 2 x 3 matrix

    [ a1  a2  a3 ]
    [ a4  a5  a6 ]

as the function f: {1, 2, 3, 4, 5, 6} -> R, where f(i) = ai. Addition of functions as defined in Example 7 again corresponds to addition of matrices, and the same is true for scalar multiplication.
The vector space P of all polynomials in Example 2 is not quite as easy to present as a function space, because not all the polynomials have the same number of terms. However, we can view the vector space P_∞ of formal power series in x as the space of all functions mapping {0, 1, 2, 3, ...} into R. Namely, if f is such a function and if f(n) = an for n in {0, 1, 2, 3, ...}, then we can denote this function symbolically by Σ an x^n. We see that function addition and multiplication by a scalar will produce precisely the addition and multiplication of power series defined in Example 4. We will show in the next section that we can view the vector space P of polynomials as a subspace of P_∞.
We have now freed the domain of our function spaces from having to be
the set R. Let’s see how much we can free the codomain. The definitions
a scalar is again a vector space. Note that R itself is a vector space, so the set of
functions f: S > R in Example 7 is a special case of this construction. For
another example, the set of all functions mapping R? into R’ has a vector space
structure. In third-semester calculus, we sometimes speak of a “vector-valued
function,” meaning a function whose codomain is a vector space, although we
usually don’t talk about vector spaces there.
Thus, starting with a set S, we could consider the vector space V1 of all functions mapping S into R, and then the vector space V2 of all functions mapping S into V1, and then the vector space V3 of all functions mapping S into V2, and then--oops! We had better stop now. People who do too much of
this stuff are apt to start climbing the walls. (However, mathematicians do
sometimes build cumulative structures in this fashion.)
SUMMARY
1. A vector space is a nonempty set V of objects called vectors, together with rules for adding any two vectors v and w in V and for multiplying any vector v in V by any scalar r in R. Furthermore, V must be closed under this vector addition and scalar multiplication so that v + w and rv are both in V. Moreover, the following axioms must be satisfied for all vectors u, v, and w in V and all scalars r and s in R:

    A1  (u + v) + w = u + (v + w)
    A2  v + w = w + v
    A3  There exists a zero vector 0 in V such that 0 + v = v for all v in V.
    A4  Each v in V has an additive inverse -v in V such that v + (-v) = 0.
    S1  r(v + w) = rv + rw
    S2  (r + s)v = rv + sv
    S3  r(sv) = (rs)v
    S4  1v = v
EXERCISES
In Exercises 1-8, decide whether or not the given set, together with the indicated operations of addition and scalar multiplication, is a (real) vector space.
1. The set R², with the usual addition but with scalar multiplication defined by r[x, y] = [ry, rx].
2. The set R², with the usual scalar
10. The set of all 2 × 2 matrices of the form
        [x  1]
        [1  x],
    where each x may be any scalar.
11. The set of all diagonal n × n matrices.
12. The set of all 3 × 3 matrices of the form
        [x  0  x]
        [0  x  0],
   f. The zero vector is the only vector that is its own additive inverse.
   g. Multiplication of two scalars is of no concern in the definition of a vector space.
   h. One of the axioms for a vector space relates addition of scalars, multiplication of a vector by scalars, and addition of vectors.
   i. Every vector space has at least two vectors.
   j. Every vector space has at least one vector.

Exercises 26-29 are based on the optional subsection on the universality of function spaces.

26. Using the discussion of function spaces at the end of this section, explain how we can view the Euclidean vector space Rᵐⁿ and the vector space M_{mn} of all m × n matrices as essentially the same vector space with just a different notation for the vectors.
27. Repeat Exercise 26 for the vector space M_{2,6} of 2 × 6 matrices and the vector space M_{3,4} of 3 × 4 matrices.
19. Prove property 2 of Theorem 3.1.
20. Prove property 3 of Theorem 3.1.
21. Prove property 5 of Theorem 3.1.
22. Prove property 6 of Theorem 3.1.
23. Let V be a vector space. Prove that, if v is in V and if r is a scalar and if rv = 0, then either r = 0 or v = 0.
24. Let V be a vector space and let v and w be nonzero vectors in V. Prove that, if v is not a scalar multiple of w, then v is not a scalar multiple of v + w.
25. Let V be a vector space, and let v and w be vectors in V. Prove that there is a unique vector x in V such that x + v = w.
28. If you worked Exercise 16 correctly, you found that the polynomials in x of degree at most n do form a vector space Pₙ. Explain how Pₙ and Rⁿ⁺¹ can be viewed as essentially the same vector space, with just a different notation for the vectors.
29. Referring to the three preceding exercises, list the vector spaces R¹², R⁴, R⁶, P₁₁, P₃, M_{2,6}, and M_{3,4} in two or more columns in such a way that any two vector spaces listed in the same column can be viewed as the same vector space with just different notation for the vectors, but two vector spaces that appear in different columns cannot be so viewed.
No finite set of polynomials can span the space P of all polynomials in x, because a finite set of polynomials cannot contain polynomials of arbitrarily high degree. Because we can add only a finite number of vectors, our definition of a linear combination in Chapter 1 will be unchanged. However, surely we want to consider the space P of all polynomials to be spanned by the monomials in the infinite set {1, x, x², x³, ...}, because every polynomial is a linear combination of these monomials. Thus we must modify the definition of the span of vectors to include the case where the number of vectors may be infinite.
Given vectors v₁, v₂, ..., vₖ in a vector space V and scalars r₁, r₂, ..., rₖ in R, the vector
    r₁v₁ + r₂v₂ + ··· + rₖvₖ
is a linear combination of v₁, v₂, ..., vₖ.

DEFINITION 3.2  Span
Let X be a subset of a vector space V. The span of X is the set of all linear combinations of vectors in X, and is denoted by sp(X). If X is a finite set, so that X = {v₁, v₂, ..., vₖ}, then we also write sp(X) as sp(v₁, v₂, ..., vₖ). If W = sp(X), the vectors in X span or generate W. If V = sp(X) for some finite subset X of V, then V is finitely generated.
ILLUSTRATION 1 Let P be the vector space of all polynomials, and let M = {1, x, x², x³, ...} be the subset of monomials. Then P = sp(M). Our remarks above Definition 3.2 indicate that P is not finitely generated. ∎
ILLUSTRATION 2 Let M_{mn} be the vector space of all m × n matrices, and let E be the set consisting of the matrices E_{ij}, where E_{ij} is the m × n matrix having entry 1 in the ith row and jth column and entries 0 elsewhere. There are mn of these matrices in the set E. Then M_{mn} = sp(E), and so M_{mn} is finitely generated. ∎
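Illustration 2 can be checked in a few lines; the sketch below (ours, with the hypothetical helper E) writes an arbitrary 2 × 3 matrix as the combination Σ aᵢⱼE_{ij}.

import numpy as np

def E(i, j, m, n):
    M = np.zeros((m, n))
    M[i, j] = 1.0
    return M

m, n = 2, 3
A = np.array([[1.0, -2.0, 4.0], [0.0, 3.0, 7.0]])
combo = sum(A[i, j] * E(i, j, m, n) for i in range(m) for j in range(n))
print(np.array_equal(combo, A))        # True: A is a linear combination of the E_ij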
That is, if W is nonempty and is closed under addition and scalar multiplica-
tion, it is sure to be a vector space in its own right. We have arrived at an
efficient test for determining whether a subset is a subspace of a vector space.
Condition (2) of Theorem 3.2 with r = 0 shows that the zero vector lies in
every subspace. Recall that a subspace of R” always contains the origin.
The entire vector space V satisfies the conditions of Theorem 3.2. That is,
Vis a subspace of itself. Other subspaces of V are called proper subspaces. One
such subspace is the subset {0}, consisting of only the zero vector. We call {0}
the zero subspace of V.
Note that if V is a vector space and X is any nonempty subset of V, then
sp(X) is a subspace of V, because the sum of two linear combinations of
vectors in X is again a linear combination of vectors in X, as is any scalar
multiple of such a linear combination. Thus the closure conditions of
Theorem 3.2 are satisfied. A moment of thought shows that sp(X) is the
smallest subspace of V containing all the vectors in X.
ILLUSTRATION 3 The space P of all polynomials in x is a subspace of the vector space P∞ of power series in x, described in Example 4 of Section 3.1. Exercise 16 in Section 3.1 shows that the set consisting of all polynomials in x of degree at most n, together with the zero polynomial, is a vector space Pₙ. This space Pₙ is a subspace both of P and of P∞. ∎
ILLUSTRATION 4 The set of invertible n X n matrices is not a subspace of the vector space M, of
all n X n matrices, because the sum of two invertible matrices may not be
invertible; also, the zero matrix is not invertible.
ILLUSTRATION 6 Let F be the vector space of all functions mapping R into R. Because sums and scalar multiples of continuous functions are continuous, the subset C of F consisting of all continuous functions mapping R into R is a subspace of F. Because sums and scalar multiples of differentiable functions are differentiable, the subset D of F consisting of all differentiable functions mapping R into
EXAMPLE 1 Let F be the vector space of all functions mapping R into R. Show that the set S of all solutions in F of the differential equation
    f″ + f = 0
is a subspace of F.
SOLUTION We note that the zero function in F is a solution, and so the set S is nonempty. If f and g are in S, then f″ + f = 0 and g″ + g = 0, and so (f + g)″ + (f + g) = f″ + g″ + f + g = (f″ + f) + (g″ + g) = 0 + 0 = 0, which shows that S is closed under addition. Similarly, (rf)″ + rf = rf″ + rf = r(f″ + f) = r0 = 0, so S is closed under scalar multiplication. Thus S is a subspace of F. ∎
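The closure argument can be checked symbolically; here is a small sketch (ours, using SymPy) with the particular solutions f = sin x and g = cos x, which are not singled out in the example but do lie in S.

import sympy as sp

x, r, s = sp.symbols('x r s')
f, g = sp.sin(x), sp.cos(x)
print(sp.simplify(sp.diff(f, x, 2) + f))       # 0, so f is in S
print(sp.simplify(sp.diff(g, x, 2) + g))       # 0, so g is in S
h = r * f + s * g
print(sp.simplify(sp.diff(h, x, 2) + h))       # 0 for all r and s: S is closed under these operations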
Independence
We wish to extend the notions of dependence and independence that were
given in Chapter 2. We restricted our consideration to finite sets of vectors in
Chapter 2 because we can’t have more than n vectors in an independent subset
of R". In this chapter, we have to worry about larger sets, because it may take
an infinite set to span a vector space V. Recall that the vector space P of all
polynomials cannot be spanned by a finite set of vectors. We make the
following slight modification to Definition 2.1.
ILLUSTRATION 7 The subset {1, x, x², ..., xⁿ, ...} of monomials in the vector space P of all polynomials is an independent set. ∎
ILLUSTRATION 8 The subset {sin²x, cos²x, 1} of the vector space F of all functions mapping R into R is dependent. A dependence relation is
    sin²x + cos²x − 1 = 0,
where the right-hand side is the zero function. ∎
EXAMPLE 2 Show that {sin x, cos x} is an independent set of functions in the space F of all functions mapping R into R.
SOLUTION We show that there is no dependence relation of the form
    r(sin x) + s(cos x) = 0,        (1)
where the 0 on the right of the equation is the function that has the value 0 for all x. If Eq. (1) holds for all x, then setting x = 0 and x = π/2, we obtain the linear system
    r(0) + s(1) = 0        Setting x = 0
    r(1) + s(0) = 0,       Setting x = π/2
whose only solution is r = s = 0. Thus Eq. (1) holds only if r = s = 0, and so the functions sin x and cos x are independent. ∎
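A numerical version of this argument (ours, not part of the text) evaluates sin x and cos x at the two chosen points and checks that the resulting coefficient matrix has full rank.

import numpy as np

xs = [0.0, np.pi / 2]
M = np.array([[np.sin(x), np.cos(x)] for x in xs])   # coefficient matrix of the linear system
print(M)                                             # approximately [[0, 1], [1, 0]]
print(np.linalg.matrix_rank(M))                      # 2, so the only solution is r = s = 0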
EXAMPLE 3 Show that the functions eˣ and e²ˣ are independent in the vector space F of all functions mapping R into R.
SOLUTION We set up the dependence-relation format
    reˣ + se²ˣ = 0.
If this is to hold for all x, then setting x = 0 gives r + s = 0, and setting x = ln 2 gives 2r + 4s = 0. We obtain the linear system
    r + s = 0
    r + 2s = 0,
which has only the trivial solution r = s = 0. Thus the functions are independent. ∎
Recall that we defined a subset {w₁, w₂, ..., wₖ} to be a basis for the subspace W = sp(w₁, w₂, ..., wₖ) of Rⁿ if every vector in W can be expressed uniquely as a linear combination of w₁, w₂, ..., wₖ. Theorem 1.15 shows that to demonstrate this uniqueness, we need only show that the only linear combination that yields the zero vector is the one with all coefficients 0; that is, the uniqueness condition can be replaced by the condition that the set {w₁, w₂, ..., wₖ} be independent. This led us to an alternate characterization of a basis for W (Theorem 2.1) as a subset of W that is independent and that spans W. It is this alternate description that is traditional for general vector spaces. The uniqueness condition then becomes a theorem; it remains the most important aspect of a basis and forms the foundation for the next section of our text. For a general vector space, we may need an infinite set of vectors to form a basis; for example, a basis for the space P of all polynomials is the set
ILLUSTRATION 9 The set X = {1, x, x², ..., xⁿ, ...} of monomials is a basis for the vector space P of all polynomials. It is not a basis for the vector space P∞ of formal power series in x. ∎
Dimension
PROOF The proof is the same, word for word, as the proof of Theorem 2.2. ∎
It is not surprising that the proof of the preceding theorem is the same as
that of Theorem 2.2. The next section will show that we can expect Chapter 2
arguments to be valid whenever we deal just with finitely generated vector
spaces.
The same arguments as those in the corollary to Theorem 2.2 give us the
following corollary to Theorem 3.4.
ILLUSTRATION 11 The set E of matrices E_{ij} in Illustration 2 is a basis for the vector space M_{mn} of all m × n matrices, so dim(M_{mn}) = mn. ∎
By the same arguments that we used for Rⁿ (page 132), we see that for a finitely generated vector space V, every independent set of vectors in V can be enlarged, if necessary, to a basis. Also, if dim(V) = k, then every independent set of k vectors in V is a basis for V, and every set of k vectors that spans V is a basis for V. (See Theorem 2.3 on page 133.)
EXAMPLE 4 Determine whether S = {1 − x, 2 − 3x², x + 2x²} is a basis for the vector space P₂ of polynomials of degree at most 2, together with the zero polynomial.
SOLUTION We know that dim(P₂) = 3 because {1, x, x²} is a basis for P₂. Thus S will be a basis if and only if S is an independent set. We can rewrite the dependence-relation format r(1 − x) + s(2 − 3x²) + t(x + 2x²) = 0 as the linear system
     r + 2s      = 0
    −r      +  t = 0
        −3s + 2t = 0.
Reducing the coefficient matrix, we obtain
    [ 1  2  0]   [1  2  0]   [1  2   0 ]
    [−1  0  1] ~ [0  2  1] ~ [0  2   1 ].
    [ 0 −3  2]   [0 −3  2]   [0  0  7/2]
We see at once that the homogeneous system with this coefficient matrix has only the trivial solution. Thus no dependence relation exists, and so S is an independent set with the necessary number of vectors, and is thus a basis. ∎
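The same conclusion can be reached numerically. The sketch below (ours) places the coordinate vectors of 1 − x, 2 − 3x², and x + 2x² relative to (1, x, x²) as columns and checks that the matrix is invertible.

import numpy as np

A = np.array([[1.0,  2.0, 0.0],    # constant coefficients
              [-1.0, 0.0, 1.0],    # x coefficients
              [0.0, -3.0, 2.0]])   # x^2 coefficients
print(np.linalg.det(A))            # approximately 7, nonzero, so the set is independent
print(np.linalg.matrix_rank(A))    # 3: three independent vectors in P2, hence a basis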
EXAMPLE 5 Find a basis for the vector space P₃ (polynomials of degree at most 3, and 0) containing the polynomials x³ + 1 and x³ − 1.
SOLUTION First we observe that the two given polynomials are independent, because neither is a scalar multiple of the other. The vectors
    x³ + 1, x³ − 1, 1, x, x², x³
generate P₃, because the last four of these form a basis for P₃. We can reduce this list of vectors to a basis by deleting any vector that is a linear combination of others in the list, being sure to retain the first two. For this example, it is actually easier to notice that surely x is not in sp(x³ + 1, x³ − 1) and x² is not in sp(x³ − 1, x³ + 1, x). Thus the set {x³ + 1, x³ − 1, x, x²} is independent. Because dim(P₃) = 4 and this independent set contains four vectors, it must be a basis for P₃. Alternatively, we could have deleted 1 and x³ by noticing that
    1 = ½(x³ + 1) − ½(x³ − 1)  and  x³ = ½(x³ + 1) + ½(x³ − 1). ∎
SUMMARY
1. A subset W of a vector space V is a subspace of V if and only if it is nonempty and satisfies the two closure properties:
   v + w is contained in W for all vectors v and w in W, and
   rv is contained in W for all vectors v in W and all scalars r.
2. Let X be a subset of a vector space V. The set sp(X) of all linear combinations of vectors in X is a subspace of V called the span of X, or the subspace of V generated by the vectors in X. It is the smallest subspace of V containing all the vectors in X.
3. A vector space V is finitely generated if V = sp(X) for some set X = {v₁, v₂, ..., vₖ} containing only a finite number of vectors in V.
4. A set X of vectors in a vector space V is linearly dependent if there exists a dependence relation
    r₁v₁ + r₂v₂ + ··· + rₖvₖ = 0,  at least one rᵢ ≠ 0,
   where each vᵢ ∈ X and each rᵢ ∈ R. The set X is linearly independent if no such dependence relation exists, and so a linear combination of vectors in X is the zero vector only if all scalar coefficients are zero.
5. A set B of vectors in a vector space V is a basis for V if B spans V and is independent.
6. A subset B of nonzero vectors in a vector space V is a basis for V if and only if every nonzero vector in V can be expressed as a linear combination of vectors in B in a unique way.
7. If X is a finite set of vectors spanning a vector space V, then X can be reduced, if necessary, to a basis for V by deleting in turn any vector that can be expressed as a linear combination of those remaining.
8. If a vector space V has a finite basis, then all bases for V have the same number of vectors. The number of vectors in a basis for V is the dimension of V, denoted by dim(V).
9. The following are equivalent for n vectors in a vector space V where dim(V) = n.
   a. The vectors are linearly independent.
   b. The vectors generate V.
EXERCISES

In Exercises 1-6, determine whether the indicated subset is a subspace of the given vector space.
1. The set of all polynomials of degree greater than 3, together with the zero polynomial, in the vector space P of all polynomials with coefficients in R
2. The set of all polynomials of degree 4, together with the zero polynomial, in the vector space P of all polynomials in x
3. The set of all functions f such that f(0) = 1, in the vector space F of all functions mapping R into R
4. The set of all functions f such that f(1) = 0, in the vector space F of all functions mapping R into R
5. The set of all functions f in the vector space
14. {sin x, cos x}
15. {1, x, x²}
16. {sin x, sin 2x, sin 3x}
17. {sin x, sin(−x)}
18. {eˣ, e²ˣ, e³ˣ}
19. {1, eˣ + e⁻ˣ, eˣ − e⁻ˣ}
26. Let V be a vector space. Mark each of the following True or False.
   a. Every independent set of vectors in V is a basis for the subspace the vectors span.
   b. If {v₁, v₂, ..., vₖ} generates V, then each v ∈ V is a linear combination of the vectors in this set.
   sp([1, 0, 1], [3, 0, −1]) in R³. Find a set of generating vectors for W₁ ∩ W₂.
29. Let V be a vector space with basis {v₁, v₂, v₃}. Prove that {v₁, v₁ + v₂, v₁ + v₂ + v₃} is also a basis for V.
30. Let V be a vector space with basis {v₁, v₂, ..., vₙ}, and let W = sp(v₃, v₄, ..., vₙ). If w = r₁v₁ + r₂v₂ is in W, show that w = 0.
31. Let {v₁, v₂, v₃} be a basis for a vector space V. Prove that the vectors w₁ = v₁ + v₂, w₂ = v₂ + v₃, w₃ = v₁ − v₃ do not generate V.
32. Let {v₁, v₂, v₃} be a basis for a vector space V. Prove that, if w is not in sp(v₁, v₂), then {v₁, v₂, w} is also a basis for V.
33. Let {v₁, v₂, ..., vₙ} be a basis for a vector space V, and let w = r₁v₁ + ··· + rₙvₙ with rₖ ≠ 0. Prove that
    {v₁, v₂, ..., v_{k−1}, w, v_{k+1}, ..., vₙ}
   is also a basis for V.
   technique to find a basis for the subspace
    sp(x² + 1, x² + x − 1, 3x − 6, x³ + x² + 1, x³)
   of the polynomial space P.
39. We once watched a speaker in a lecture derive the equation f(x) sin x + g(x) cos x = 0, and then say, "Now everyone knows that sin x and cos x are independent functions, so f(x) = 0 and g(x) = 0." Was the statement correct or incorrect? Give a proof or a counterexample.
40. A homogeneous linear nth-order differential equation has the form
    fₙ(x)y⁽ⁿ⁾ + f_{n−1}(x)y⁽ⁿ⁻¹⁾ + ··· + f₁(x)y′ + f₀(x)y = 0.
   Show that the set of all solutions of this equation that lie in the space F of all functions mapping R into R is a subspace of F.
41. Referring to Exercise 40, suppose that the differential equation
    fₙ(x)y⁽ⁿ⁾ + f_{n−1}(x)y⁽ⁿ⁻¹⁾ + ··· + f₁(x)y′ + f₀(x)y = g(x)
   does have a solution y = p(x) in the space F of all functions mapping R into R. By analogy with Theorem 1.18 on p. 97, describe the structure of the set of solutions of this equation that lie in F.
42. Solve the differential equation y′ = 2x and describe your solution in terms of your answer to Exercise 41, or in terms of the answer in the back of the text.

It is a theorem of differential equations that if the functions fᵢ(x) of the differential equation in Exercise 40 are all constant, then all the solutions of the equation lie in the vector space F of all functions mapping R into R and form a subspace of F of dimension n. Thus every solution can be written as a linear combination of n independent functions in F that form a basis for the solution space.

In Exercises 43-45, use your knowledge of calculus and the solution of Exercise 41 to describe the solution set of the given differential equation. You should be able to work these problems without having had a course in differential equations, using the hints.
43. a. y″ + y = 0 [Hint: You need to find two independent functions such that when you differentiate twice, you get the negative of the function you started with.]
    b. y″ + y = x [Hint: Find one solution by experimentation.]
44. a. y″ − 4y = 0 [Hint: What two independent functions, when differentiated twice, give 4 times the original function?]
    b. y″ − 4y = x [Hint: Find one solution by experimentation.]
45. a. y⁽³⁾ − 9y′ = 0 [Hint: Try to find values of m such that y = eᵐˣ is a solution.]
    b. y⁽³⁾ − 9y′ = x² + 2x [Hint: Find one solution by experimentation.]
46. Let S be any set and let F be the set of all functions mapping S into R. Let W be the subset of F consisting of all functions f ∈ F such that f(s) = 0 for all but a finite number of elements s in S.
    a. Show that W is a subspace of F.
    b. What condition must be satisfied to have W = F?
47. Referring to Exercise 46, describe a basis B for the subspace W of F. Explain why B is not a basis for F unless F = W.
Ordered Bases
The vector [2, 5] in R² can be expressed in terms of the standard basis vectors as 2e₁ + 5e₂. The components of [2, 5] are precisely the coefficients of these basis vectors. The vector [2, 5] is different from the vector [5, 2], just as the point (2, 5) is different from the point (5, 2). We regard the standard basis vectors as having a natural order, e₁ = [1, 0] and e₂ = [0, 1]. In a nonzero vector space V with a basis B = {b₁, b₂, ..., bₙ}, there is usually no natural order for the basis vectors. For example, the vectors b₁ = [−1, 5] and b₂ = [3, 2] form a basis for R², but there is no natural order for these vectors. If we want the vectors to have an order, we must specify their order. By convention, set notation does not denote order; for example, {b₁, b₂} = {b₂, b₁}. To describe order, we use parentheses, ( ), in place of set braces, { }; we are used to paying attention to order in the notation (b₁, b₂). We denote an ordered basis of n vectors in V by B = (b₁, b₂, ..., bₙ). For example, the standard basis {e₁, e₂, e₃} of R³ gives rise to six different ordered bases, namely,
    (e₁, e₂, e₃)  (e₂, e₁, e₃)  (e₃, e₁, e₂)  (e₁, e₃, e₂)  (e₂, e₃, e₁)  (e₃, e₂, e₁).
These correspond to the six possible orders for the unit coordinate vectors. The ordered basis (e₁, e₂, e₃) is the standard ordered basis for R³, and in general, the basis E = (e₁, e₂, ..., eₙ) is the standard ordered basis for Rⁿ.
Coordinatization of Vectors
Let V be a finite-dimensional vector space, and let B = (b₁, b₂, ..., bₙ) be an ordered basis for V. By Theorem 3.3, every vector v in V can be expressed in the form
    v = r₁b₁ + r₂b₂ + ··· + rₙbₙ
for unique scalars r₁, r₂, ..., rₙ. We associate the vector [r₁, r₂, ..., rₙ] in Rⁿ with v. This gives us a way of coordinatizing V.
The vector [r₁, r₂, ..., rₙ] is the coordinate vector of v relative to the ordered basis B, and is denoted by v_B.
ILLUSTRATION 1 Let Pₙ be the vector space of polynomials of degree at most n. There are two natural choices for an ordered basis for Pₙ, namely,
    B = (1, x, x², ..., xⁿ)  and  B′ = (xⁿ, xⁿ⁻¹, ..., x², x, 1).
Taking n = 4, we see that for the polynomial p(x) = −x + x³ + 2x⁴ we have
    p(x)_B = [0, −1, 0, 1, 2]  and  p(x)_{B′} = [2, 1, 0, −1, 0]. ∎
EXAMPLE 1 Find the coordinate vectors of [1, −1] and of [−1, −8] relative to the ordered basis B = ([1, −1], [1, 2]) of R².
SOLUTION We see that [1, −1]_B = [1, 0], because
    [1, −1] = 1[1, −1] + 0[1, 2].
To find [−1, −8]_B, we must find r₁ and r₂ such that [−1, −8] = r₁[1, −1] + r₂[1, 2]. Equating components of this vector equation, we obtain the linear system
     r₁ +  r₂ = −1
    −r₁ + 2r₂ = −8,
whose solution is r₁ = 2, r₂ = −3. Thus
    [−1, −8]_B = [2, −3]. ∎
EXAMPLE 2 Find the coordinate vector of [1, 2, −2] relative to the ordered basis B = ([1, 1, 1], [1, 2, 0], [1, 0, 1]) in R³.
SOLUTION We must express [1, 2, −2] as a linear combination of the basis vectors in B. Working with column vectors, we must solve the equation
       [1]      [1]      [1]   [ 1]
    r₁ [1] + r₂ [2] + r₃ [0] = [ 2]
       [1]      [0]      [1]   [−2]
for r₁, r₂, and r₃. We find the unique solution by a Gauss–Jordan reduction:
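The reduction itself can be carried out numerically; the following short sketch (ours, using NumPy rather than hand reduction) solves the same system.

import numpy as np

B = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0],
              [1.0, 0.0, 1.0]]).T      # basis vectors as columns
v = np.array([1.0, 2.0, -2.0])
coords = np.linalg.solve(B, v)
print(coords)                          # [-4.  3.  2.], that is, [1, 2, -2]_B = [-4, 3, 2]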
Renaming each vector in V by its coordinate vector relative to an ordered basis B = (b₁, b₂, ..., bₙ), we now verify the relations
    (v + w)_B = v_B + w_B  and  (tv)_B = t(v_B)        (1)
for all vectors v and w in V and for all scalars t in R. To do this, suppose that
    v = r₁b₁ + r₂b₂ + ··· + rₙbₙ  and  w = s₁b₁ + s₂b₂ + ··· + sₙbₙ,
so that v_B = [r₁, r₂, ..., rₙ] and w_B = [s₁, s₂, ..., sₙ]. Because
    v + w = (r₁ + s₁)b₁ + (r₂ + s₂)b₂ + ··· + (rₙ + sₙ)bₙ,
we have
    (v + w)_B = [r₁ + s₁, r₂ + s₂, ..., rₙ + sₙ] = v_B + w_B,
which is the sum of the coordinate vectors of v and of w. Similarly, for any scalar t, we have
    (tv)_B = [tr₁, tr₂, ..., trₙ] = t(v_B).
This completes the demonstration of relations (1). These relations tell us that, when we rename the vectors in V by coordinates relative to B, the resulting vector space of coordinates, namely Rⁿ, has the same vector-space structure as V. Whenever the vectors in a vector space V can be renamed to make V appear structurally identical to a vector space W, we say that V and W are isomorphic vector spaces. Our discussion shows that every real vector space having a basis of n vectors is isomorphic to Rⁿ. For example, the space Pₙ of all polynomials of degree at most n is isomorphic to Rⁿ⁺¹, because Pₙ has an ordered basis B = (xⁿ, xⁿ⁻¹, ..., x², x, 1) of n + 1 vectors. Each polynomial
    aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀
has coordinate vector [aₙ, aₙ₋₁, ..., a₁, a₀] relative to B.
EXAMPLE 3 Determine whether the polynomials x² − 3x + 2, 3x² + 5x − 4, and 7x² + 21x − 16 in P₂ are independent.
SOLUTION We take B = (x², x, 1) as an ordered basis for P₂. The coordinate vectors relative to B of the given polynomials, placed as the columns of a matrix, reduce as follows:
    [ 1   3    7]   [1   3    7]   [1  3  7]
    [−3   5   21] ~ [0  14   42] ~ [0  1  3].
    [ 2  −4  −16]   [0 −10  −30]   [0  0  0]
Because the third column in the echelon form has no pivot, these three coordinate vectors in R³ are dependent, and so the three polynomials are dependent. ∎
Continuing the reduction to
    [1  0  −2]
    [0  1   3]
    [0  0   0]
and imagining a partition between the second and third columns, we see that the third coordinate vector is −2 times the first plus 3 times the second. Thus
    7x² + 21x − 16 = −2(x² − 3x + 2) + 3(3x² + 5x − 4).
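The dependence can also be detected numerically from the matrix of coordinate vectors above; the sketch below (ours) computes its rank and recovers the coefficients of the dependence relation.

import numpy as np

M = np.array([[1.0,  3.0,   7.0],
              [-3.0, 5.0,  21.0],
              [2.0, -4.0, -16.0]])
print(np.linalg.matrix_rank(M))                      # 2, so the three columns are dependent
c = np.linalg.lstsq(M[:, :2], M[:, 2], rcond=None)[0]
print(c)                                             # [-2.  3.]: col3 = -2*col1 + 3*col2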
EXAMPLE 4 It can be shown that the set {1, sin x, sin 2x, ..., sin nx} is an independent subset of the vector space F of all functions mapping R into R. Find a basis for the subspace of F spanned by the functions
Because the first, second, and fourth columns have pivots, we should keep the first, second, and fourth of the original column vectors, so that the set {f₁(x), f₂(x), f₄(x)} is a basis for the subspace W. ∎
has a graph that passes through the origin if a₀ = 0. If both a₀ and a₁ are zero, then the graph not only goes through the origin, but is quite flat there; indeed, if a₂ ≠ 0, then the function behaves approximately like a₂x² very near x = 0, because for very small values of x, such as 0.00001, the value of x² is much greater than the values of x³, x⁴, and other higher powers of x. If a₂ = 0 also, but a₃ ≠ 0, then the function behaves very much like a₃x³ for such values of x very close to 0, etc. If instead of studying a polynomial function near x = 0, we want to study it near some other x-value, say near x = a, then we would like to express the polynomial as a linear combination of powers (x − a)ⁱ, that is, in the form b₀ + b₁(x − a) + b₂(x − a)² + ··· + bₙ(x − a)ⁿ.
Both B = (xⁿ, ..., x², x, 1) and B′ = ((x − a)ⁿ, ..., (x − a)², x − a, 1) are ordered bases for the space Pₙ of polynomials of degree at most n. (We leave the demonstration that B′ is a basis as Exercise 20.) We give an example illustrating a method for expressing the polynomial x³ + x² − x − 1 as a linear combination of (x + 1)³, (x + 1)², x + 1, and 1.
EXAMPLE 5 Express p(x) = x³ + x² − x − 1 as a linear combination of (x + 1)³, (x + 1)², x + 1, and 1; that is, find p(x)_{B′} for the ordered basis B′ = ((x + 1)³, (x + 1)², x + 1, 1) of P₃.
SOLUTION Relative to the standard ordered basis B = (x³, x², x, 1), the coordinate vectors of (x + 1)³, (x + 1)², x + 1, and 1 are
    [1, 3, 3, 1],  [0, 1, 2, 1],  [0, 0, 1, 1],  and  [0, 0, 0, 1].
Reducing the augmented matrix corresponding to the associated linear system, we obtain
    [1 0 0 0 |  1]   [1 0 0 0 |  1]   [1 0 0 0 |  1]
    [3 1 0 0 |  1] ~ [0 1 0 0 | −2] ~ [0 1 0 0 | −2].
    [3 2 1 0 | −1]   [0 2 1 0 | −4]   [0 0 1 0 |  0]
    [1 1 1 1 | −1]   [0 1 1 1 | −2]   [0 0 0 1 |  0]
Thus the required coordinate vector is p(x)_{B′} = [1, −2, 0, 0], and so
    x³ + x² − x − 1 = (x + 1)³ − 2(x + 1)². ∎
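The coordinate vector can be obtained in one line numerically; here is a short sketch (ours, using NumPy) that solves the same triangular system.

import numpy as np

A = np.array([[1.0, 0.0, 0.0, 0.0],
              [3.0, 1.0, 0.0, 0.0],
              [3.0, 2.0, 1.0, 0.0],
              [1.0, 1.0, 1.0, 1.0]])    # columns: (x+1)^3, (x+1)^2, x+1, 1 relative to (x^3, x^2, x, 1)
p = np.array([1.0, 1.0, -1.0, -1.0])    # coordinates of x^3 + x^2 - x - 1
print(np.linalg.solve(A, p))            # [ 1. -2.  0.  0.], as found in Example 5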
Linear algebra is not the only tool that can be used to solve the problem in
Example 5. Exercise 13 suggests a polynomial algebra solution, and Exercise
16 describes a calculus solution.
SUMMARY
"| EXERCISES -
In Exercises 1-10, find the coordinate vector of the given vector relative to the indicated ordered basis.
1. [−1, 1] in R² relative to ([0, 1], [1, 0])
.  x³ + x² − 2x + 4 in P₃ relative to (1, x², x, x³)
.  x³ + 3x² − 4x + 2 in P₃ relative to (x, x² − 1, x³, 2x²)
11. ... the vector space P₃ of polynomials of degree at most 3. Use the method illustrated in Example 5.
12. Find the coordinate vector of the polynomial 4x³ − 9x² + x relative to the ordered basis B′ = ((x − 1)³, (x − 1)², (x − 1), 1) of the vector space P₃ of polynomials of degree at most 3. Use the method illustrated in Example 5.
.  f(x) = 1 − 2 sin x + 4 cos x − sin 2x − 3 cos 2x
.  f(x) = 2 − 3 sin x − cos x + 4 sin 2x + 5 cos 2x
.  f(x) = 5 − 8 sin x + 2 cos x + 7 sin 2x + 7 cos 2x
.  f(x) = −1 + 14 cos x − 11 sin 2x − 19 cos 2x
13. Example 5 showed how to use linear algebra to rewrite the polynomial p(x) = x³ + x² − x − 1 in powers of x + 1 rather than in powers of x. This exercise indicates a polynomial algebra solution to this problem. Replace x in p(x) by [(x + 1) − 1], and expand using the binomial theorem, keeping the (x + 1) intact. Check your answer with that in Example 5.
14. Repeat Exercise 11 using the polynomial algebra method indicated in Exercise 13.
15. Repeat Exercise 12 using the polynomial algebra method indicated in Exercise 13.
16. Example 5 showed how to use linear algebra to rewrite the polynomial p(x) = x³ + x² − x − 1 in powers of x + 1 rather than in powers of x. This exercise indicates a calculus solution to this problem. Form the equation
    x³ + x² − x − 1 = b₃(x + 1)³ + b₂(x + 1)² + b₁(x + 1) + b₀.
   Find b₀ by substituting x = −1 in this equation. Then equate the derivatives of both sides, and substitute x = −1 to find b₁. Continue differentiating both sides and substituting x = −1 to find b₂ and b₃. Check your answer with that in Example 5.
17. Repeat Exercise 11 using the calculus method indicated in Exercise 16.
18. Repeat Exercise 12 using the calculus method indicated in Exercise 16.
20. Prove that for every positive integer n and every a ∈ R, the set
    {(x − a)ⁿ, (x − a)ⁿ⁻¹, ..., (x − a)², x − a, 1}
   is a basis for the vector space Pₙ of polynomials of degree at most n.
21. Find the polynomial in P₂ whose coordinate vector relative to the ordered basis B = (x + x², x − x², 1 + x) is [3, 1, 2].
22. Let V be a nonzero finite-dimensional vector space. Mark each of the following True or False.
   a. The vector space V is isomorphic to Rⁿ for some positive integer n.
   b. There is a unique coordinate vector associated with each vector v ∈ V.
   c. There is a unique coordinate vector associated with each vector v ∈ V relative to a basis for V.
   d. There is a unique coordinate vector associated with each vector v ∈ V relative to an ordered basis for V.
   e. Distinct vectors in V have distinct coordinate vectors relative to the same ordered basis B for V.
   f. The same vector in V cannot have the same coordinate vector relative to different ordered bases for V.
   g. There are six possible ordered bases for R³.
   h. There are six possible ordered bases for R³ consisting of the standard unit coordinate vectors in R³.
Linear Transformations T: V → V′
EXAMPLE 1 Let F be the vector space of all functions f: R → R, and let D be its subspace of all differentiable functions. Show that differentiation is a linear transformation of D into F.
SOLUTION Let T: D → F be defined by T(f) = f′, the derivative of f. Using the familiar rules (f + g)′ = f′ + g′ and (rf)′ = rf′ from calculus, we see that T preserves addition and scalar multiplication, and so T is a linear transformation. ∎
EXAMPLE 2 Let F be the vector space of all functions f: R → R, and let c be in R. Show that the evaluation function T: F → R defined by T(f) = f(c), which maps each function f in F into its value at c, is a linear transformation.
HISTORICAL NOTE THE CONCEPT OF A LINEAR SUBSTITUTION dates back to the eighteenth century. But it was only after physicists became used to dealing with vectors that the idea of a function of vectors became explicit. One of the founders of vector analysis, Oliver Heaviside (1850-1925), introduced the idea of a linear vector operator in one of his works on electromagnetism in 1885. He defined it using coordinates: B comes from H by a linear vector operator if, when B has components B₁, B₂, B₃ and H has components H₁, H₂, H₃, there are numbers μᵢⱼ, for i, j = 1, 2, 3, where
    B₁ = μ₁₁H₁ + μ₁₂H₂ + μ₁₃H₃
    B₂ = μ₂₁H₁ + μ₂₂H₂ + μ₂₃H₃
    B₃ = μ₃₁H₁ + μ₃₂H₂ + μ₃₃H₃.
In his lectures at Yale, which were published in 1901, J. Willard Gibbs called this same transformation a linear, vector function. But he also defined this more abstractly as a continuous function f such that f(v + w) = f(v) + f(w). A fully abstract definition, exactly like Definition 3.9, was given by Hermann Weyl in Space-Time-Matter (1918).
Oliver Heaviside was a self-taught expert on mathematical physics who played an important role in the development of electromagnetic theory and especially its practical applications. In 1901 he predicted the existence of a reflecting ionized region surrounding the earth; the existence of this layer, now called the ionosphere, was soon confirmed.
SOLUTION We show that T preserves addition and scalar multiplication. If f and g are functions in the vector space F, then, evaluating f + g at c, we obtain
    T(f + g) = (f + g)(c) = f(c) + g(c) = T(f) + T(g),
and for any scalar r we have T(rf) = (rf)(c) = r·f(c) = rT(f). Thus T is a linear transformation. ∎
EXAMPLE 3 Let C_{ab} be the vector space of all continuous functions mapping the closed interval a ≤ x ≤ b of R into R. Show that T: C_{ab} → R defined by T(f) = ∫ₐᵇ f(x) dx is a linear transformation.
SOLUTION From properties of the definite integral, we know that for f, g ∈ C_{ab} and for any scalar r, we have
    ∫ₐᵇ (f + g)(x) dx = ∫ₐᵇ f(x) dx + ∫ₐᵇ g(x) dx  and  ∫ₐᵇ (rf)(x) dx = r ∫ₐᵇ f(x) dx,
so T preserves addition and scalar multiplication and is a linear transformation. ∎
EXAMPLE 4 Let C be the vector space of all continuous functions mapping R into R. Let a ∈ R and let Tₐ: C → C be defined by Tₐ(f) = ∫ₐˣ f(t) dt. Show that Tₐ is a linear transformation.
SOLUTION This follows from the same properties of the integral that we used in the solution of the preceding example. From calculus, we know that the range of Tₐ is actually a subset of the vector space of differentiable functions mapping R into R. (Theorem 3.7, which follows shortly, will show that the range is actually a subspace.) ∎
EXAMPLE 5 Let D∞ be the space of functions mapping R into R that have derivatives of all orders, and let a₀, a₁, a₂, ..., aₙ ∈ R. Show that T: D∞ → D∞ defined by T(f) = aₙf⁽ⁿ⁾(x) + ··· + a₂f″(x) + a₁f′(x) + a₀f(x) is a linear transformation.
SOLUTION This follows from the fact that the ith derivative of a sum of functions is the sum of their ith derivatives, that is, (f + g)⁽ⁱ⁾(x) = f⁽ⁱ⁾(x) + g⁽ⁱ⁾(x), and that the ith derivative of rf(x) is rf⁽ⁱ⁾(x), together with the fact that T(f) is defined to be a linear combination of these derivatives. (We consider f(x) to be the 0th derivative.) ∎
EXAMPLE 6 Let D be the vector space of all differentiable functions and F the vector space of all functions mapping R into R. Determine whether T: D → F defined by T(f) = 2f″(x) + 3f′(x) + x² is a linear transformation.
SOLUTION Because for the zero constant function we have T(0) = 2(0″) + 3(0′) + x² = 2(0) + 3(0) + x² = x², and x² is not the zero constant function, we see that T does not preserve zero, and so it is not a linear transformation. ∎
PROOF Let T and T̄ be two linear transformations such that T(bᵢ) = T̄(bᵢ) for each vector bᵢ in B. Let v ∈ V. Then there exist vectors b₁, b₂, ..., bₖ in B and scalars r₁, r₂, ..., rₖ such that
    v = r₁b₁ + r₂b₂ + ··· + rₖbₖ.
We then have
    T(v) = r₁T(b₁) + ··· + rₖT(bₖ) = r₁T̄(b₁) + ··· + rₖT̄(bₖ) = T̄(v). ∎
The next theorem also generalizes results that appear in the text and exercises of Section 2.3 for linear transformations mapping Rⁿ into Rᵐ.
    rT(v₁) = T(rv₁),
and rT(v₁) is in W′. Thus, rv₁ is also in T⁻¹[W′]. This shows that T⁻¹[W′] is closed under addition and under scalar multiplication, and so T⁻¹[W′] is a subspace of V. ∎
The proof is essentially the same as that of Theorem 1.18; we ask you to write it out for this case in Exercise 46. This boxed result shows that if ker(T) = {0}, then T(x) = b has at most one solution, and so T is one-to-one, meaning that T(v₁) = T(v₂) implies that v₁ = v₂. Conversely, if T is one-to-one, then T(x) = 0′ has only one solution, namely 0, so ker(T) = {0}. We box this fact.
    y = c₁e²ˣ + c₂e⁻²ˣ − 7x − 2. ∎
Invertible Transformations
This follows at once from the fact that, for T⁻¹ with the properties in Definition 3.10 and for v′ in V′, we have T⁻¹(v′) = v for some v in V. But then
    v′ = (T ∘ T⁻¹)(v′) = T(T⁻¹(v′)) = T(v).
A transformation T: V → V′ satisfying property (7) is onto V′; in this case the range of T is all of V′. We have thus proved half of the following theorem.*
PROOF We have just shown that, if T is invertible, it must be one-to-one and onto V′.
Suppose now that T is one-to-one and onto V′. Because T is onto V′, for each v′ in V′ we can find v in V such that T(v) = v′. Because T is one-to-one, this vector v in V is unique. Let T⁻¹: V′ → V be defined by T⁻¹(v′) = v, where v is the unique vector in V such that T(v) = v′. Then we must show that
    T⁻¹(v₁′ + v₂′) = T⁻¹(v₁′) + T⁻¹(v₂′)  and  T⁻¹(rv₁′) = rT⁻¹(v₁′)
for all v₁′ and v₂′ in V′ and for all scalars r. Let v₁ and v₂ be the unique vectors in V such that T(v₁) = v₁′ and T(v₂) = v₂′. Remembering that T⁻¹ ∘ T is the identity map, we have
    T⁻¹(v₁′ + v₂′) = T⁻¹(T(v₁) + T(v₂)) = T⁻¹(T(v₁ + v₂)) = v₁ + v₂ = T⁻¹(v₁′) + T⁻¹(v₂′).
Similarly,
    T⁻¹(rv₁′) = T⁻¹(rT(v₁)) = T⁻¹(T(rv₁)) = rv₁ = rT⁻¹(v₁′). ∎
The proof of Theorem 3.8 shows that, if T: V → V′ is invertible, the linear transformation T⁻¹: V′ → V described in Definition 3.10 is unique. This transformation T⁻¹ is the inverse transformation of T.
ILLUSTRATION 2 Let D be the vector space of differentiable functions and F the vector space of all functions mapping R into R. Then T: D → F, where T(f) = f′, the derivative of f, is not an invertible transformation. Specifically, we have T(x) = T(x + 17) = 1, showing that T is not one-to-one. Notice also that the kernel of T is not {0}, but consists of all constant functions. ∎
ILLUSTRATION 3 Let P be the vector space of all polynomials in x and let W be the subspace of all polynomials in x with constant term 0, so that q(0) = 0 for q(x) ∈ W. Let T: P → W be defined by T(p(x)) = xp(x). Then T is a linear transformation. Because xp₁(x) = xp₂(x) if and only if p₁(x) = p₂(x), we see that T is one-to-one. Every polynomial q(x) in W contains no constant term, and so it can be factored as q(x) = xp(x); because T(p(x)) = q(x), we see that T maps P onto W. Thus T is an invertible linear transformation. ∎
Isomorphism
An isomorphism is a linear transformation T: V → V′ that is one-to-one and onto V′. Theorem 3.8 shows that isomorphisms are precisely the invertible linear transformations T: V → V′. If an isomorphism T exists, then it is invertible and its inverse is also an isomorphism. The vector spaces V and V′ are said to be isomorphic in this case. We view isomorphic vector spaces V and V′ as being structurally identical in the following sense. Let T: V → V′ be an isomorphism. Rename each v in V by the v′ = T(v) in V′. Because T is one-to-one, no two different elements of V get the same name from V′, and because T is onto V′, all names in V′ are used. The renamed V and V′ then appear identical as sets. But they also have the same algebraic structure as vector spaces, as Figure 3.2 illustrates. We discussed a special case of this concept before Example 3 in Section 3.3, indicating informally that every finite-dimensional vector space is structurally the same as Rⁿ for some n. We are now in a position to state this formally.
PROOF Equation 1 in Section 3.3 shows that T preserves addition and scalar multiplication. Moreover, T is one-to-one, because the coordinate vector v_B of v uniquely determines v, and the range of T is all of Rⁿ. Therefore, T is an isomorphism. ∎
The isomorphism of V with Rⁿ, described in Theorem 3.9, is by no means unique. There is one such isomorphism for each choice of an ordered basis B of V.
Let V and V′ be vector spaces of dimensions n and m, respectively. By Theorem 3.9, we can choose ordered bases B for V and B′ for V′, and essentially convert V into Rⁿ and V′ into Rᵐ by renaming vectors by their coordinate vectors relative to these bases. Then each linear transformation T: V → V′ corresponds to a linear transformation T̄: Rⁿ → Rᵐ in a natural way, and we can answer questions about T by studying T̄ instead. But we can study T̄ in turn by studying its standard matrix representation A, as we will
FIGURE 3.2
(a) The equal sign shows that the renaming preserves vector addition.
(b) The equal sign shows that the renaming preserves scalar multiplication.
illustrate shortly. This matrix changes if we change the ordered bases B or B′. A good deal of the remainder of our text is devoted to studying how to choose B and B′ when m = n so that the square matrix A has a simple form that illuminates the structure of the transformation T. This is the thrust of Chapters 5 and 7.
We summarize in a theorem.
The matrix A in Eq. (8) and described in Theorem 3.10 is the matrix representation of T relative to B,B′.
EXAMPLE 8 Let V be the subspace sp(sin x cos x, sin²x, cos²x) of the vector space D of all differentiable functions mapping R into R. Differentiation gives a linear transformation T of V into itself. Find the matrix representation A of T relative to B,B′, where B = B′ = (sin x cos x, sin²x, cos²x). Use A to compute the derivative of
    f(x) = 3 sin x cos x − 5 sin²x + 7 cos²x.
SOLUTION We compute
    T(sin x cos x) = −sin²x + cos²x,  so  T(sin x cos x)_{B′} = [0, −1, 1],
    T(sin²x) = 2 sin x cos x,         so  T(sin²x)_{B′} = [2, 0, 0],  and
    T(cos²x) = −2 sin x cos x,        so  T(cos²x)_{B′} = [−2, 0, 0].
Using Eq. (8), we have
    A = [ 0  2  −2]
        [−1  0   0].
        [ 1  0   0]
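A quick numerical finish of this computation (ours, using NumPy) applies A to the coordinate vector of f relative to B.

import numpy as np

A = np.array([[0.0,  2.0, -2.0],
              [-1.0, 0.0,  0.0],
              [1.0,  0.0,  0.0]])
f_B = np.array([3.0, -5.0, 7.0])      # f = 3 sinx cosx - 5 sin^2 x + 7 cos^2 x
print(A @ f_B)                        # [-24.  -3.   3.]: f' = -24 sinx cosx - 3 sin^2 x + 3 cos^2 x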
EXAMPLE 9 Letting Pₙ be the vector space of polynomials of degree at most n, we note that T: P₂ → P₃ defined by T(p(x)) = (x + 1)p(x − 2) is a linear transformation. Find the matrix representation A of T relative to the ordered bases B = (x², x, 1) and B′ = (x³, x², x, 1) for P₂ and P₃, respectively. Use A to compute T(p(x)) for p(x) = 5x² − 7x + 18.
SOLUTION We compute
    T(x²) = (x + 1)(x − 2)² = x³ − 3x² + 4,
    T(x) = (x + 1)(x − 2) = x² − x − 2,
    T(1) = x + 1.
Thus T(x²)_{B′} = [1, −3, 0, 4], T(x)_{B′} = [0, 1, −1, −2], and T(1)_{B′} = [0, 0, 1, 1]. Consequently,
        [ 1  0  0]                   [ 1  0  0] [  5]   [  5]
    A = [−3  1  0],  so  A(p(x)_B) = [−3  1  0] [ −7] = [−22].
        [ 0 −1  1]                   [ 0 −1  1] [ 18]   [ 25]
        [ 4 −2  1]                   [ 4 −2  1]         [ 52]
Thus T(p(x)) = 5x³ − 22x² + 25x + 52. ∎
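The recipe of summary item 11, building A column by column from the images of the basis vectors, can also be carried out symbolically. The following sketch (ours, using SymPy) reproduces the matrix of Example 9.

import sympy as sp

x = sp.symbols('x')
T = lambda p: sp.expand((x + 1) * p.subs(x, x - 2))
cols = [[T(b).coeff(x, k) for k in (3, 2, 1, 0)] for b in (x**2, x, sp.Integer(1))]
A = sp.Matrix(cols).T
print(A)                              # Matrix([[1, 0, 0], [-3, 1, 0], [0, -1, 1], [4, -2, 1]])
print(A * sp.Matrix([5, -7, 18]))     # Matrix([[5], [-22], [25], [52]]), i.e., 5x^3 - 22x^2 + 25x + 52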
Let V and V′ be n-dimensional vector spaces with ordered bases B and B′, respectively. Theorem 3.8 tells us that a linear transformation T: V → V′ is invertible if and only if it is one-to-one and onto V′. Under these circumstances, we might expect the following to be true:
This is indeed the case in view of Theorem 3.10 and the fact that linear transformations of Rⁿ have this property. (See the box on page 151.) Exercise 20 gives an illustration of this.
It is important to note that the matrix representation of the linear transformation T: V → V′ depends on the particular bases B for V and B′ for V′. We really should use some notation such as A_{B,B′} to denote this dependency. Such notations appear cumbersome and complicated, so we avoid them for now. We will use such a notation in Chapter 7, where we will consider this dependency in more detail.
SUMMARY
Let V and V′ be vector spaces, and let T be a function mapping V into V′.
1. T is a linear transformation if
    T(v₁ + v₂) = T(v₁) + T(v₂)  and  T(rv₁) = rT(v₁)
   for all vectors v₁ and v₂ in V and for all scalars r in R.
2. If T is a linear transformation, then T(0) = 0′ and also T(v₁ − v₂) = T(v₁) − T(v₂).
3. A function T: Rⁿ → Rᵐ is a linear transformation if and only if T has the form T(x) = Ax for some m × n matrix A.
4. The matrix A in summary item 3 is the standard matrix representation of the transformation T.
5. A linear transformation T: V → V′ is invertible if and only if it is one-to-one and onto V′. Such transformations are isomorphisms.
6. Every nonzero finite-dimensional real vector space V is isomorphic to Rⁿ, where n = dim(V).
8. The range of T is the subspace {T(v) | v in V} of V′, and the kernel ker(T) is the subspace {v ∈ V | T(v) = 0′} of V.
9. T is one-to-one if and only if ker(T) = {0}. If T is one-to-one and has as its range all of V′, then T⁻¹: V′ → V is well-defined and is a linear transformation. In this case, both T and T⁻¹ are isomorphisms.
10. If T: V → V′ is a linear transformation and T(p) = b, then the solution set of T(x) = b is {p + h | h ∈ ker(T)}.
11. Let B = (b₁, b₂, ..., bₙ) and B′ = (b₁′, b₂′, ..., bₘ′) be ordered bases for V and V′, respectively. The matrix representation of T relative to B,B′ is the m × n matrix A having T(bⱼ)_{B′} as its jth column vector. We have T(v)_{B′} = A(v_B) for all v ∈ V.
"| EXERCISES
In Exercises 1-5, let F be the vector space of all 9. Note that one solution of the differential
functions mapping R into R. Determine whether equation y” — 4y = sin x is y = —1 sin x.
the given function T is a linear transformation. If
Use summary itera 10 and Illustration | to
it ts a linear transformation, describe the kernel of
describe all solutions of this equation.
T and determine whether the transformation is
invertible. Let D, be the vector space of functions mapping
R into R that have derivatives of all orders. It
1. T: F— R defined by T(f) = f(-4) can be shown that the kernel of a linear
2. T: F > R defined by 7(f) = f(5) transformation 7: D, > D, of the form 7(/) =
afM+--+++af' + afwherea, # 0 is an
3. T: F > F defined by 7(/) = f+f
n-dimensional subspace of D..
4. T: F > F defined by 7(f) = f+ 3, where3
is the constant function with value 3 for all
xER. In Exercises 10-15, use the preceding
5. T: F > F defined by T(f) = —f information, summary item 10, and your
knowledge of calculus to find all solutions in D, of
6. Let Cy, be the space of continuous functions the given differential equation. See Illustration I
mapping the interval 0 = x = 2 into R. Let in the text.
T: Cy. — R be defined by 7(f) = f? f(x) dx.
See Example 3. If possible, give three
different functions in ker(T). 10. y’ = sin 2x 13. yp" + 4y=x?
7. Let C be the space of all continuous 11. y” = -cos x 14. y’ + y' = 3e*
tunctions mapping R into R, and let 12. y -y=x 15. yO - 2y" =x
T: C—> Che defined by 71f) = Ji f(o) dt. 16. Let V and V’' be vector spaces having
See Example 4. If possible, give three
ordered bases B = (b,, b,, b,) and B’ =
different functions in ker(7).
(bj, b;, b3, bi), respectively. Let 7: VV’
8. Let F be the vector space of all functions be a linear transformation such that
mapping R into R, and let 7: F— Fbea
linear transformation such that T(e*) = x’, T(b,) = 3b; + bj} + 4b) — by
T(e**) = sin x, and 7(1) = cos 5x. Find the T(b,) b) + 2b; — bj + 2b;
following, fit is determined by this data. T(b,) = —2b; — bs + 2b!
a ey c. T(3e")
b. ( + Se* )
7(3 d. 7(e+2e")
et+ 2e% Find the matrix representation A of T relativ
to B,B’.
In Exercises 17-19, let V and V′ be vector spaces with ordered bases B = (b₁, b₂, b₃) and B′ = (b₁′, b₂′, b₃′, b₄′), respectively, and let T: V → V′ be the linear transformation having the given matrix A as matrix representation relative to B,B′. Find T(v) for the given vector v.
17. A = ..., v = b₁ + b₂ + b₃
20. ...
    a. Use A to find T(v)_{B′} if v_B = [2, −5, 1].
    c. Show that T is invertible, and find the matrix representation of T⁻¹ relative to B′,B.
    d. Find T⁻¹(v′)_B if v′_{B′} = [−1, 1, 3].
    e. Express T⁻¹(b₁′), T⁻¹(b₂′), and T⁻¹(b₃′) as linear combinations of the vectors in B.
21. Let T: P₃ → P₃ be defined by T(p(x)) = D(p(x)), the derivative of p(x). Let the ordered bases for P₃ be B = B′ = (x³, x², x, 1).
    a. Find the matrix A.
    b. Use A to find the derivative of 4x³ − 5x² + 10x − 13.
    c. Noting that T ∘ T = D², find the second derivative of −5x³ + 8x² − 3x + 4 by multiplying a column vector by an appropriate matrix.
22. Let T: P₃ → P₃ be defined by T(p(x)) = x·D(p(x)), and let the ordered bases B and B′ be as in Exercise 21.
    a. Find the matrix representation A relative to B,B′.
    b. Working with the matrix A and
23. ... by taking second derivatives, so T = D², and let B = B′ = (x²eˣ, xeˣ, eˣ). Find the matrix A by
    a. following the procedure in summary item 11, and
    b. finding and then squaring the matrix A₁ that represents the transformation D corresponding to taking first derivatives.
24. Let T: P₃ → P₂ be defined by T(p(x)) = p′(2x + 1), where p′(x) = D(p(x)), and let B = (x³, x², x, 1) and B′ = (x², x, 1).
    a. Find the matrix A.
    b. Use A to compute T(4x³ − 5x² + 4x − 7).
25. Let V = sp(sin²x, cos²x) and let T: V → V be defined by taking second derivatives.
Taking B = B′ = (sin²x, cos²x), find A in two ways.
    a. Compute A as described in summary item 11.
    b. Find the space W spanned by the first derivatives of the vectors in B, choose an ordered basis for W, and compute A as a product of the two matrices representing the differentiation map from V into W followed by the differentiation map from W into V.
26. Let T: P₃ → P₃ be the linear transformation defined by T(p(x)) = D²(p(x)) − 4D(p(x)) + p(x). Find the matrix representation A of T relative to B, where B = (x, 1 + x, x + x², x³).
27. Let W = sp(eˣ, e²ˣ, e³ˣ) be the subspace of the vector space of all real-valued functions with domain R, and let B = (eˣ, e²ˣ, e³ˣ). Find the matrix representation A relative to B,B of the linear transformation T: W → W defined by T(f) = D²(f) + 2D(f) + f.
28. For W and B in Exercise 27, find the matrix representation A of the linear transformation T: W → W defined by T(f) = ∫ f(t) dt.
29. For W and B in Exercise 27, find T(aeˣ + be²ˣ + ce³ˣ) for the linear transformation T whose matrix representation relative to B,B is
33. For W and B in Exercise 31, find T(a sin 2x + b cos 2x) for T: W → W whose matrix representation is
34. Let V and V′ be vector spaces. Mark each of the following True or False.
    a. A linear transformation of vector spaces preserves the vector-space operations.
    b. Every function mapping V into V′ relates the algebraic structure of V to that of V′.
    c. A linear transformation T: V → V′ carries the zero vector of V into the zero vector of V′.
    d. A linear transformation T: V → V′ carries a pair v, −v in V into a pair v′, −v′ in V′.
    e. For every vector b′ in V′, the function T_{b′}: V → V′ defined by T_{b′}(v) = b′ for all v in V is a linear transformation.
    f. The function T₀: V → V′ defined by T₀(v) = 0′, the zero vector of V′, for all v in V is a linear transformation.
    g. The vector space P₁₀ of polynomials of degree ≤ 10 is isomorphic to R¹⁰.
    h. There is exactly one isomorphism T: P₁₀ → R¹¹.
    i. Let V and V′ be vector spaces of
39. Let v and w be independent vectors in V, and let T: V → V′ be a one-to-one linear transformation of V into V′. Prove that T(v) and T(w) are independent vectors in V′.
40. Let V and V′ be vector spaces, let B = {b₁, b₂, ..., bₙ} be a basis for V, and let c₁, c₂, ..., cₙ ∈ V′. Prove that there exists a linear transformation T: V → V′ such that T(bᵢ) = cᵢ for i = 1, 2, ..., n.
41. State and prove a generalization of Exercise 40 for any vector spaces V and V′, where V has a basis B.
42. If the matrix representation of T: Rⁿ → Rⁿ relative to B,B is a diagonal matrix, describe the effect of T on the basis vectors in B.

Exercises 43 and 44 show that the set L(V, V′) of all linear transformations mapping a vector space V into a vector space V′ is a subspace of the vector space of all functions mapping V into V′. (See summary item 5 in Section 3.1.)
43. Let T₁ and T₂ be in L(V, V′), and let (T₁ + T₂): V → V′ be defined by
    (T₁ + T₂)(v) = T₁(v) + T₂(v)
   for each vector v in V. Prove that T₁ + T₂ is again a linear transformation of V into V′.
44. Let T be in L(V, V′), let r be any scalar in R, and let rT: V → V′ be defined by
    (rT)(v) = r(T(v))
   for each vector v in V. Prove that rT is again a linear transformation of V into V′.
45. If V and V′ are the finite-dimensional spaces in Exercises 43 and 44 and have ordered bases B and B′, respectively, describe the matrix representations of T₁ + T₂ in Exercise 43 and rT in Exercise 44 in terms of the matrix representations of T₁, T₂, and T relative to B,B′.
46. Prove that if T: V → V′ is a linear transformation and T(p) = b, then the solution set of T(x) = b is
    {p + h | h ∈ ker(T)}.
47. Prove that, for any five linear transformations T₁, T₂, T₃, T₄, T₅ mapping R² into R², there exist scalars c₁, c₂, c₃, c₄, c₅ (not all of which are zero) such that T = c₁T₁ + c₂T₂ + c₃T₃ + c₄T₄ + c₅T₅ has the property that T(x) = 0 for all x in R².
48. Let T: Rⁿ → Rⁿ be a linear transformation. Prove that, if T(T(x)) = T(x) + T(x) + 3x for all x in Rⁿ, then T is a one-to-one mapping of Rⁿ into Rⁿ.
49. Let V and V′ be vector spaces having the same finite dimension, and let T: V → V′ be a linear transformation. Prove that T is one-to-one if and only if range(T) = V′. [Hint: Use Exercise 36 in Section 3.2.]
50. Give an example of a vector space V and a linear transformation T: V → V such that T is one-to-one but range(T) ≠ V. [Hint: By Exercise 49, what must be true of the dimension of V?]
51. Repeat Exercise 50, but this time make range(T) = V for a transformation T that is not one-to-one.
In Section 1.2, we introduced the concepts of the length of a vector and the angle between vectors in Rⁿ. Length and angle are defined and computed in Rⁿ using the dot product of vectors. In this section, we discuss these notions for more general vector spaces. We start by recalling the properties of the dot product in Rⁿ, listed in Theorem 1.3, Section 1.2.
There are many vector spaces for which we can define useful dot products satisfying properties D1-D4. In phrasing a general definition for such vector spaces, it is customary to speak of an inner product rather than of a dot product, and to use the notation ⟨v, w⟩ in place of v·w. One such example would be ⟨[2, 3], [4, −1]⟩ = 2(4) + 3(−1) = 5.
EXAMPLE 1 Determine whether R² is an inner-product space if, for v = [v₁, v₂] and w = [w₁, w₂], we define
    ⟨v, w⟩ = 2v₁w₁ + 5v₂w₂.
SOLUTION We check each of the four properties in Definition 3.12.
P1: Because ⟨v, w⟩ = 2v₁w₁ + 5v₂w₂ and because ⟨w, v⟩ = 2w₁v₁ + 5w₂v₂, the first property holds.
P2: We compute
    ⟨u, v + w⟩ = 2u₁(v₁ + w₁) + 5u₂(v₂ + w₂),
    ⟨u, v⟩ = 2u₁v₁ + 5u₂v₂,
    ⟨u, w⟩ = 2u₁w₁ + 5u₂w₂.
The sum of the right-hand sides of the last two equations equals the right-hand side of the first equation. This establishes property P2.
P3: We compute
EXAMPLE 2 Determine whether R² is an inner-product space when, for v = [v₁, v₂] and w = [w₁, w₂], we define
EXAMPLE 3 Determine whether the space P₀₁ of all polynomial functions with real coefficients and domain 0 ≤ x ≤ 1 is an inner-product space if for p and q in P₀₁ we define
    ⟨p, q⟩ = ∫₀¹ p(x)q(x) dx.
SOLUTION We check the properties for an inner product.
P1: Clearly ⟨p, q⟩ = ⟨q, p⟩, because p(x)q(x) = q(x)p(x) for all x.
P3: We have
    ⟨rp, q⟩ = ∫₀¹ (rp)(x)q(x) dx = ∫₀¹ r(p(x)q(x)) dx = r ∫₀¹ p(x)q(x) dx = r⟨p, q⟩,
so P3 holds.
P4: Because ⟨p, p⟩ = ∫₀¹ p(x)² dx and because p(x)² ≥ 0 for all x, we have ⟨p, p⟩ = ∫₀¹ p(x)² dx ≥ 0. Now p(x)² is a continuous nonnegative polynomial function and can be zero only at a finite number of points unless p(x) is the zero polynomial. It follows that
    ⟨p, p⟩ = ∫₀¹ p(x)² dx > 0
unless p(x) is the zero polynomial. This establishes P4.
Because all four properties of Definition 3.12 hold, the space P₀₁ is an inner-product space with the given inner product. ∎
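The integral inner product of Example 3 is easy to experiment with; here is a short symbolic sketch (ours, using SymPy) for two sample polynomials.

import sympy as sp

x = sp.symbols('x')
inner = lambda p, q: sp.integrate(p * q, (x, 0, 1))
p, q = x + 1, x**2
print(inner(p, q), inner(q, p))       # 7/12 and 7/12, illustrating symmetry (P1)
print(inner(p, p))                    # 7/3, which is positive (P4)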
We mention that Example 3 is a very famous inner product, and the same definition
    ⟨f, g⟩ = ∫₀¹ f(x)g(x) dx
gives an inner product on the space C₀₁ of all continuous real-valued functions with domain the interval 0 ≤ x ≤ 1. The hypothesis of continuity is essential for the demonstration of P4, as is shown in advanced calculus. Of course, there is nothing unique about the interval 0 ≤ x ≤ 1. Any interval a ≤ x ≤ b can be used in its place. The choice of interval depends on the application.
Magnitude
The condition ⟨v, v⟩ ≥ 0 in the definition of an inner-product space allows us to define the magnitude of a vector, just as we did in Section 1.2 using the dot product.
EXAMPLE 4 In the inner-product space P₀₁ of all polynomial functions with real coefficients and domain 0 ≤ x ≤ 1, and with inner product defined by
    ⟨p, q⟩ = ∫₀¹ p(x)q(x) dx,
(a) find the magnitude of the polynomial p(x) = x + 1, and (b) compute the distance d(x², x) from x² to x.
SOLUTION For part (a), we have
    ‖x + 1‖² = ⟨x + 1, x + 1⟩ = ∫₀¹ (x² + 2x + 1) dx = [x³/3 + x² + x]₀¹ = 7/3,
so ‖x + 1‖ = √(7/3) = √21/3.
The inner product used in Example 4 was not contrived just for this illustration. Another measure of the distance between the functions x and x² over 0 ≤ x ≤ 1 is the maximum vertical distance between their graphs over the interval 0 ≤ x ≤ 1, which calculus easily shows to be ¼, attained at x = ½. In contrast, the inner product in this example uses an integral to measure the distance between the functions, not only at one point of the interval, but over the interval as a whole. Notice that the distance 1/√30 we obtained is less than the maximum distance ¼ between the graphs, reflecting the fact that the graphs are less than ¼ unit apart over much of the interval. The functions are shown in Figure 3.3. The notion (used in Example 4) of distance between functions over an interval is very important in advanced mathematics, where it is used in approximating a complicated function over an interval as closely as possible by a function that is easier to handle.
FIGURE 3.3
Graphs of x and x² over 0 ≤ x ≤ 1.
It is often best, when working with the norm of a vector v, to work with ‖v‖² = ⟨v, v⟩ and to introduce the radical in the final stages of the computation.
EXAMPLE 5 Show that
    ‖rv‖ = |r| ‖v‖
for any vector v in V and for any scalar r.
SOLUTION We have
    ‖rv‖² = ⟨rv, rv⟩ = r⟨v, rv⟩ = r²⟨v, v⟩ = r²‖v‖².
On taking square roots, we have ‖rv‖ = |r| ‖v‖. ∎
EXAMPLE 6 Using the standard inner product in Rⁿ, find the magnitude of the vector
    θ = arccos( v·w / (‖v‖ ‖w‖) ).
The validity of this definition rested on the fact that for v, w ∈ Rⁿ, we have
    −1 ≤ v·w / (‖v‖ ‖w‖) ≤ 1.
The Schwarz inequality with v·w replaced by ⟨v, w⟩ holds in any inner-product space.
We now define the angle between two vectors v and w in any inner-product space to be
    θ = arccos( ⟨v, w⟩ / (‖v‖ ‖w‖) ).
In particular, we define v and w to be orthogonal (or perpendicular) if ⟨v, w⟩ = 0.
Recall that another important inequality in Rⁿ that follows readily from the Schwarz inequality is the triangle inequality ‖v + w‖ ≤ ‖v‖ + ‖w‖.
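The Schwarz inequality and the angle formula can be tested in the inner-product space of Example 3. The following sketch (ours, using SymPy) does this for f = x and g = x².

import sympy as sp

x = sp.symbols('x')
inner = lambda f, g: sp.integrate(f * g, (x, 0, 1))
norm = lambda f: sp.sqrt(inner(f, f))
f, g = x, x**2
print(inner(f, g), float(norm(f) * norm(g)))   # 1/4 and about 0.258, so |<f, g>| <= ||f|| ||g||
theta = sp.acos(inner(f, g) / (norm(f) * norm(g)))
print(float(theta))                            # about 0.253 radians, the angle between x and x^2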
SUMMARY

EXERCISES
In Exercises 1-9, determine whether or not the indicated product satisfies the conditions for an inner product in the given vector space.
1. In R², let ⟨[x₁, x₂], [y₁, y₂]⟩ = x₁y₁ − x₂y₂.
2. In R², let ⟨[x₁, x₂], [y₁, y₂]⟩ = x₁x₂ + y₁y₂.
3. In R², let ⟨[x₁, x₂], [y₁, y₂]⟩ = 2x₁y₁ + x₂y₂.
4. In R², let ⟨[x₁, x₂], [y₁, y₂]⟩ = x₁y₂.
5. In R³, let ⟨[x₁, x₂, x₃], [y₁, y₂, y₃]⟩ = x₁y₁.
6. In R³, let ⟨[x₁, x₂, x₃], [y₁, y₂, y₃]⟩ = x₁ + y₁.
7. In the vector space M₂ of all 2 × 2 matrices, with A = [a₁₁ a₁₂; a₂₁ a₂₂] and B = [b₁₁ b₁₂; b₂₁ b₂₂], let
    ⟨A, B⟩ = a₁₁b₁₁ + a₁₂b₁₂ + a₂₁b₂₁ + a₂₂b₂₂.
8. Let C₋₁,₁ be the vector space of all continuous functions mapping the interval −1 ≤ x ≤ 1 into R, and let ⟨f, g⟩ = ∫₋₁¹ f(x)g(x) dx.
9. Let C₋₁,₁ be as in Exercise 8, and let ⟨f, g⟩ = f(0)g(0).
10. Let C_{ab} be the vector space of all continuous real-valued functions with domain a ≤ x ≤ b. Prove that ⟨ , ⟩, defined in C_{ab} by ⟨f, g⟩ = ∫ₐᵇ f(x)g(x) dx, is an inner product in C_{ab}.
11. Let C₀₁ be the vector space of all continuous real-valued functions with domain 0 ≤ x ≤ 1, with ⟨ , ⟩ defined in C₀₁ by ⟨f, g⟩ = ∫₀¹ f(x)g(x) dx.
    a. Find ⟨x + 1, x⟩.
    b. Find ‖x‖.
    c. Find ‖x² − x‖.
    d. Find ‖sin πx‖.
12. Prove that sin x and cos x are orthogonal functions in the vector space C₀,₂π of Exercise 10, with the inner product defined there.
13. Let ⟨ , ⟩ be defined in C₀₁ as in Exercise 11. Find a set of two independent functions in C₀₁, each of which is orthogonal to the constant function 1.
14. Let u and v be vectors in an inner-product space, and suppose that ‖u‖ = 3 and ‖v‖ = 5. Find ⟨u + 2v, u − 2v⟩.
15. Suppose that the vectors u and v in Exercise 14 are perpendicular. Find ⟨u + 2v, 3u + v⟩.
16. Let V be an inner-product space. Mark each of the following True or False.
20. Let V be an inner-product space, and let S be a subset of V. Prove that
    S⊥ = {v ∈ V | v is orthogonal to each vector in S}
   is a subspace of V.
4 DETERMINANTS
Each square matrix has associated with it a number called the determinant of the matrix. In Section 4.1 we introduce determinants of 2 × 2 and 3 × 3 matrices, motivated by computations of area and volume. Section 4.2 discusses determinants of n × n matrices and their properties.
Section 4.3 opens with an efficient way to compute determinants and then presents Cramer's rule as well as a formula for the inverse of an invertible square matrix in terms of determinants. Cramer's rule expresses, in terms of determinants, the solution of a square linear system having a unique solution. This method is primarily of theoretical interest, because the methods presented in Chapter 1 are much more efficient for solving a square system with more than two or three equations. Because references and formulas involving Cramer's rule appear in advanced calculus and other fields, we believe that students should at least read the statement of Cramer's rule and look at an illustration.
The chapter concludes with optional Section 4.4, which discusses the significance of the determinant of the standard matrix representation of a linear transformation mapping Rⁿ into Rⁿ. The ideas in that section form the foundation for the change-of-variable formulas for definite integrals of functions of one or more variables.
4.1 AREAS, VOLUMES, ANO CROSS PRODUCTS 239
~{4 %
aT ot
and is denoted by !A] or det(A), so that
det(4) = (2! %
1 2
—p x
FIGURE 4.1
The parallelogram determined by a and b.
240 CHAPTER 4 DETERMINANTS
That ts, if
alee
_ {4
then
We can remember this formula for the determinant by taking the product of
the black entries on the main diagonal of the matrix, minus the product of the
colored entries on the other diagonal.
[2 3
{1 4}
SOLUTION We have
r23)4] _= = _ 5.
4) - GA) .
EXAMPLE 2. Find the area of the parallelogram in R? with vertices (1, 1), (2, 3), (2, 1), (3, 3).
SOLUTION The paratlelogram is sketched in Figure 4.2. The sides having (1, 1) as common
vertex can be regarded as the vectors
a=(2,1]-[1, 1] =[1,0]
and
x2
J
3 4 -
2+ b
] ——_>
a
+--+ +—— x)
0 l 2 3
FIGURE 4 2
The parallelogram determined by a = [1, 0] and b = [1, 2}.
4.1 AREAS, VOLUMES, AND CROSS PRODUCTS 244
as shown in the figure. Therefore, the area of the parallelogram is given by the
determinant
3] = 2) - ON)= 2.
0
p= d, by in b, b, j+ b, b, k (3)
C, Cy C, & C, Cy
is a vector perpendicular to both b and c. (See Exercise 5.) This can be seen by
computing p- b = p-c = 0. The vector p in formula (3) is known as the cross
product of b and c, and is denoted p = b x ¢.
There is a very easy way to remember formula (3) for the cross product
b x c. Form the 3 x 3 symbolic matrix
ij k
b, 6, 6,).
Ci Cy Cy
HISTORICAL NOTE Tue NOTION OF A CROSS PRODUCT grew out of Sir William Rowan
Hamilton's attempt to develop a multiplication for “triplets” —that is, vectors in R’. He wanted
this multiplication to satisfy the associative and commutative properties as well as the distributive
law. He wanted division to be always possible, except by 0. And he wanted the lengths to
multiply—that is, if (a, 4, a) (b,, b,, b,) = (c, Cy, C3), then [I(a,, a, a,)|| Ilo. b,, b,)|| = Iker, Cy ¢,)|]-
After struggling with this problem for 13 years, Hamilton finally succeeded in solving it on
October 16, 1843, although not in the way he had hoped. Namely, he discovered an analogous
result—-not for triples, but for quadruples. As he walked that day in Dublin, he wrote, he could not
“resist the impulse . . . to cut with a knife on a stone of Brougham Bridge .. . the fundamental
formula with the symbols i, j, 44 namely, i? = j? = k? = ijk = --1.” This formula symbolized his
discovery of guaternions, elements of the form Q = w + xi + pj + zk with w, x, y, z real numbers,
whose multiplication obeys the laws just given, as well as the other laws Hamilton desired, except
for the commutative ‘aw.
Hamilton noted the convenience of wnting a quaternion Q in two parts: the scalar part wand
the vector part x: + yj + zk. Then the product of two quaternions a = xi+ yj + zk and B = x'i+
y’j + z'k with scalar parts 0 is given as
aB = (—xx' — yy’ — 22') + (yz — zy')i + (zx! — x2’)j + (xy’ — yx')k.
The vector part of this product is our modern cross product of the “vectors” a and B, while the
scalar part is the negative of the modern dot product (Section 1.2).
Although Hamilton and others pushed for the use of quaternions in physics, physicists
realized by the end of the nineteenth century that the only parts of the subject necessary for their
work were the two types of products of vectors. It was Josiah Willard Gibbs (1839-1903),
professor of mathematical physics at Yale, who introduced our modern notation for both the dot
product and the cross product and developed their properties in detail in his classes at Yale and
finally in his Vector Analysis of 1901.
242 CHAPTER 4 DETERMINANTS
Formula (3) can be obtained from this matrix in a simple way. Multiply the
vector i by the determinzat of the 2 X 2 matrix obtained by crossing out the
row and column containing i, as in
| hb
1 & &
Similarly, multiply (—j) by the determinant of the matrix obtained by crossing
out the row and column in which j appears. Finally, multiply k by the
determinant of the matrix obtained by crossing out the row and column
containing k, and add these multiples of i, j, and k to obtain formula (3).
aw,
we
Ce
me
NO
—
Ww
N
=
1 I, re 2 i,
(2,1
x [1,2,
,13)=],
] gi- |) lit |] 9
=i -5j+3k=[l,
-5, 3].
The cross product p = b x c as defined in Eq. (3) not only is perpendicular
to both b and c but points in the direction determined by the familiar
right-hand rule: when the fingers of the right hand curve in the direction fromb
to c, the thumb points in the direction of b x c. (See Figure 4.3.) We do not
attempt to prove this.
FIGURE 4.3
The area of the parallelogram determined by b and ¢ is ||b’x cl}.
4.1 AREAS, VOLUMES, AND CROSS PRODUCTS 243
Again, pencil and paper are needed to check this last equality. Taking square
roots, we obtain
[|b x cl] = Area of the parallelogram in R’ determined by b and c.
EXAMPLE 4 Find the area of the parallelogram in R’ determined by the vectors b =[3, i, 0]
and ¢c = [l, 3, Z].
SOLUTION From the symbolic matrix
i j kl
3| |
11 3 2
we find that
_ {1 Oj, {3 0]. , [3 1]
bxe=|3 of ; i+ |i 3K
= [2, —6, 8].
Therefore, the area of the parallelogram shown in Figure 4.3 is
\lb x el] = 2V1 +9 + 16 = 20/26. 5
EXAMPLE 5 Find the area of the triangle in R? with vertices (—1, 2, 0), (2, 1, 3), and
(, 1, -1).
SOLUTION We think of (—1, 2. 0) as a local origin, and we take translated vectors starting
there and reaching to (2, 1, 3) and to (1, 1, —-1)—namely,
(2, 1, 3)
(--1, 2, 0)
The cross product is useful in finding the volume of the box, or parailelepiped, -
determined by the three nonzero vectors a = [a,, @,, a,], b = [0,, 5, b,], and ¢=
[C,, G, ¢] in R’, as shown in Figure 4.5. The volume of the box can be
computed by multiplying the area of the base by the altitude A.
ee
that is,
x yz
got xy 2,
xyz
Lagrange was born in Turin, but spent most of his mathematical career in Berlin and in Pac
He contributed important results to such varied fields as the calculus of variation, celest!
mechanics, number theory, and the theory of equations. Among his most famous works are thé
Treatise on Analytical Mechanics (1788), in which he presented the various principles
mechanics from a single point of view, and the Theory of Analytic Hunctions (1797). in which BE
attempted to base the differential calculus on the theory of power scries.
4.1 AREAS, VOLUMES, AND CROSS PRODUCTS 245
We have just seen that the area of the base of the box 1s equal to ||b x cll,
and the altitude can be found by computing
- _ [Ib x el|fall
eos | _ |(b xc)-al
n= |lallleos |= "Ty x] bx
The absolute value is used in case cos @ is negative. This would be the case if
the direction of b x ¢ were opposite to that shown in Figure 4.5. Thus,
b xc):
Volume = (Area of base)h = ||b x eae = |(b x c)- al.
That is, referring to formula (3), which defines b x c, we see that
2 a, a;
A= b, b, b,
C, Cy Cy
and is denoted by
a, a, a,
det(A) = |6, b, 5,).
C, Cy C3
It can be computed as
2 1 3
A=|4 1 2).
1 2 -3
SOLUTION Using formula (5), we have
2 | 3
| _ al 2] f4 2 41
4 1 ; = 2), 4 iF 4 +3 |
1 2-
= 2(-7) - (-14) + 3(7) = 21. .
EXAMPLE 7 Find the volume of the box with vertex at the origin determined by the vectors
a= (4.1, 1]. b = [2, i, 0). and ec = [0, 2, 3], and sketch the box in a figure.
246 CHAPTER 4 DETERMINANTS
x]
FIGURE 4.6
The box determined by a, b, and c.
SOLUTION The box is shown in Figure 4.6. Its volume is given by the absolute value of the
determinant
14 Gi. |G Gl. [a G
b, bt (6, bh* 16, 81%
A simple computation shows that interchanging the rows of a 2 X 2 matrix
having a determinant d gives a matrix with determinant —d (see Exercise i).
Comparison of the preceding formula for c x b with the formula for b x c in
Eq. (3) then shows that b x c= -(c x b).
| SUMMARY
ab = ad — be.
cd
A third-order determinant is defined by Eq. (5).
2. The area of the parallelogram with vertex at the origin determined by
nonzero vectors a and b in R? is the absolute value of the determinant of
the matrix having row vectors a and b.
3. The cross product of vectors b and c in R? can be computed by using the
symbolic determinant
i j k
b, by db;
Cy Oy C;
EXERCISES
Eq. (3) 1s perpendicular to both b and c. ___b. If two rows of a3 x 3 matrix are
interchanged, the sign of the determinant
is changed.
In Exercises 5-9, find the indicated determinant.
___ ce. The determinant of a 3 < 3 matrix is
zero it two rows of the matrix are parallel
! 4 4 2-5 3 vectors in R’.
6. 5 13 O 7} 1 3 4 ___ d. In order for the determinant of a 3 x 3
p -1 3 —2 3 ~=7 matrix to be zero, two rows of the matrix
must be parallel vectors in R?.
1-2 7 z~-l J __. e. The determinant of a 3 x 35 matrix 15
8. |0 1 4 9 i-l 0 3 zero if the pcints in R? given by the rows
1 0 3 2 | -4 of the matnx lie in a plane.
16. Show by direct computation that: f. The determinant of a 3 x 3 matrix is
Qa, a a zero if the points in R? given by the rows
a. |@, Q@, a;| = 0; of the matrix lie in a plane through the
origin.
Cr Cf, C;
__ g. The parallelogram in R? determined by
nonzero vectors a and b is a square if and
b. b, b, b, = 0, only ifa- b= 0.
a, @, a, ___h. The box in R} determined by vectors a, ),
11. Show by direct computation that and cis a cube if and only ifa- b=
a‘c=b-c=Qanda-a=b-b=c°:e.
Qa, a& b, by j. If the angle between vectors a and b in R?
Q, a, is 7/4, then |la x bl| = |a - bj.
12. Show by direct computation that ____ j. For any vector a in R®, we have jja x aij =
[all
a a, a; a, a, a,
b, b, b, =~ Cc Cy Cy. In Exercises 20-24, find the area of the
C) Cr Cy b, 6, by parallelogram with vertex at the origin and with
the given vectors as edges
in Exercises 13-18, finda x b.
20. -i+ 4jand 2i + 3j
Ia - ti jo Shob si + Qj 21. -5i + 3jandi+ 7j
id. a= —S5i+jt
4k, b=2i+j - 3k 22. i + 3j — 5k and 21 + 4j — k
4.1 AREAS, VOLUMES, AND CROSS PRODUCTS 249
93. 2-j+ kandi +3j-k volume of the box having the same three vectors
94. i- 4j + kand 2i + 3j - 2k as adjacent edges.)
jn Exercises 25-32, find the arca of the given Al. (-3, 0, 1), (4, 2, 1), (0, 1, 7), (1 1, I)
geometric covfiguration. 42. (0, 1, 1), (8, 2, -7), (3, 1, 6), (—4, -2, 0)
43. (-1, 1, 2), (3, 1, 4% (-1, 6, 0), (2, -1, 5)
25. The triangle with vertices (—1, 2), (3, -1), 44. (-1, 2, 4), (2, -3; 0), (-4, 2, -1), (0, 3, —2)
and (4, 3)
26. The triangle with vertices (3, —4), (1, 1) and In Exercises 45-48, use a determinant to
(5, 7) ascertain whether the given points lie or. a line in
27. The triangle with vertices (2, 1, —3), R?, [Hur: What is the area of a “parallelogram”
(3, 0, 4), and (1, 0, 5) with collinear vertices?|
28. The triangle with vertices (3, 1, —2),
- (1, 4, 5), and (2, 1, -4) 45. (0, 0), (3, 5), (6, 9
29. The triangle in the piane R? bounded by the 46. (0, 0), (4, 2), (6, -3)
lines y = x, y = —3x+ 8, and 3y + 5x =0 47. (1, 5), (3, 7), (-3, 1)
30. The parallelogram with vertices (1. 3), 48. (2, 3), (1, —4), (6, 2)
(—2, 6), (1, 11), and (4, 8)
31. The parallelogram with vertices (1, 0, 1), In Exercises 49-52, use a determinant to
(3, 1, 4), (0, 2, 9), and (-2, 1, 6) ascertain whether the given points lie in a plane
32. The parallelogram in the plane R? bounded in R?, [Hinr: What is the “volume” of a box with
by the linesx — 2y = 3,x — 2y = 10, coplanar vertices?|
2x+ 3y = -1, and 2x+ 3y = —-8
49. (0, 0, 0), (1, 4, 3), (2, 5, 8), (1, 2, -5)
In Exercises 33-36, find a + (b x c) and 50. (0, 0, 0), (2, 1, 1), (3, 2, 1), (-1, 2, 3)
a X (b X c).
51. (i, -1, 3), (4, 2, 3), (3, 1, -2), (5, 5, -5)
33, a=i+ 2j — 3k,b = 4i-jt+ 2k,c = 3i+k §2. (1, 2, 1), (3, 3, 4), (2, 2, 2), (4, 3, 5)
34. a=~-i+j+ 2k, b=i+k,
c = 3i — 2j + 5k Let a, b, and c be any vectors in R>. In Exercises
53-56, simplify the given expression.
35. a=i— 3k,b=-i+ 4j,c=i+jt+k
36. a = 4i — j + 2k, b = 31 + 5j — 2k,
53. a- (a x b)
c=i-3jt+k
54. (b x c) — (c X b)
In Exercises 37-40, find the volume of the box 55. {la x bil? + (a- bP
having the given vectors as adjacent edges. 56. aX (bX c) + bx (¢c xX a) +c xX (aX b)
57. Prove property (2) of Theorem 4.1.
37. -i + 4j + 7k, 3i— 2j—k, 4i + 2k 58. Prove property (3) of Theorem 4.1.
38. 2i + j — 4k, 3i-j + 2k, i+ 3j — 8k 5 9. Prove property (6) of Theorem 4.1.
39. -21 + j, 3i- 4j+ ki — 2k a 60. Option 7 of the routine VECTGRPH in
40. 31 — 5+ 4k, i — 2j + 7k, Si — 3j + 10k LINTEK provides drill on the determinant
of a2 X 2 matrix as the area of the
In Exercises 41-44, find the volume of the parallelogram determined by its row veciors,
‘etrahedron having the given vertices. (Consider with an associated plus or minus sign. Run
how the volume of a tetrahedron having three this option until you can regularly achieve a
vectors from one point as edges is related to the score of 80% or better.
250 CHAPTER 4 DETERMINANTS
MATLAB has a function det(A) which gives the 61. -i + 7j + 3k, 41 + 23j — 13k,
determinant of a matrix A. In Exercises 61-63, 12i — 17) — 31k
use the routine MATCOMP in LINTEX or 2. 4.ii — 2.3k, 5.3) — 2.1%, 6.11 + 5.7j
MATLAB to find the volume of the box having the
given vectors in PB as adjacent edges. (W have 63. 2.131 + 4.71j — 3.62k, 5i — 3.2j + 6.32k,
rot supplied matrix files for these problems ) 8.31 — 0.457 + 1.13k
MATLAB
M1. Enter the data vectorsx =[1 5 7] andy =[-3 2 4] into MATLAB.
Then enter a line crossxy = | ], which wil! compute the cross product
x X y of vectors [x(1)_ x(2) x(3)} and [y(@1) y(2) y(3)I. (Hint: The first
componen: in [ ] will be x(2)*y(3)— y(2)*x(3).] Be sure you use no spaces
excepi one between the vector components. Check that the value given
for crossxy is the correct vector 61 — 25j + 17k for the data vectors
entered.
M2. Usz the norm function in MATLAB to find the area of the parallelogram in R?
having the vectors x and y in the preceding exercise as adjacent edges.
M3. Enter the vectors x = 4.2i — 3.7j + 5.6k and y = —7.3i + 4.5j + 11.4k.
a. Using the up-arrow key to access your line defining crossxy, find x x y.
b. Find the area of the parallelogram in R? having x and y as adjacent
edges.
M4. Find the area of the triangle in R? having vertices (—1.2, 3.4, —6.7),
(2.3, —5.2, 9.4), and (3.1, 8.3, —3.6). [Hint: Enter vectors a, b, and c from the.
origin to these points and set x and y equal to appropriate differences of
them.]
The Definition
We defined a third-order determinant in terms of second-order determinants
in Eq. (5) on page 245. A second-order determinant can be defined in termsof
first-order determinants if we interpret the determinant ofa 1 x 1 matrix to}
its sole entry. We define an nth-order determinant in terms of determinants
order +7 — |. In order to facilitate this, we introduce the minor matrix Ajj
A> nmatriv.d = (a,j: itis the (2 — 1) xX (a — 1) matrix obtained by crossin
4.2 THE DETERMINANT OF A SQUARE MATRIX 251
out the ith row and jth column of A. The 1.inor matnx is the portion shown in
color shading in the matrix
jth column
Using |A;| as notation for the determinant of the minor matrix A, we can
express the determinant of a 3 X 3 matnx A as
Q, Ay Gy
Ay Ay Ay = Ay|Ay| — Ap/Ay| + @43[A)3l-
ay, ay) as,
The numbers a), = |A,,|, a1. = —|A,,, and ai, = |A,;| are appropriately called
the cofactors of a,,, @,,, and a,;. We now proceed to define the determinant of
any square matrix, using mathematical induction. (See Appendix A for a
discussion of mathematical induction.)
HISTORICAL NOTE = THE FIRST APPEARANCE OF THE DETERMINANT OF A SQUARE MATRIX in Western
Europe occurred in a 1683 letter from Gottfned von Leibniz (1646-1716) to the Marquis de
L’H6pital (1661-1704). Leibniz wrote a system of three equations in two unknowns with abstract
“‘numerical” coefficients,
10+ tix + 12y=0
20 + 2Ix + 22y =0
30 + 31x+ 32y =0,
in which he noted that each coefficient number has “two characters, the first marking in which
equation it occurs, the second marking which letter it belongs to.” He then proceeded to eliminate
first y and then x to show that the criterion for the system of equations to have a solution is that
10-21-32 + 11-22-30+ 12-20-31 = 10-22-31 + 11-20-32 +4 12-21 30.
This is equivalent to the moder condition that the determinant of the matzix of coefficients must
be zero.
Determinants also appeared in the contemporaneous work of the Japanese mathematician
Seki Takakazu (1642-1708). Seki’s manuscript of 1683 includes his detailed calculations of
determinants of 2 x 2, 3 x 3, and 4 X 4 matrices—although his version was the negative of the
version used today. Seki applied the determinant to the solving of certain types of equations, but
evidently not to the solving cf systems oi linear equations. Seki spent mosi of his life as an
accountant working for two feudal lords, Tokugawa Tsunashige and Tokugawa Tsunatoyo, in
Kofu, a city in the prefecture of Yamanashi, west of Tokyo.
252 CHAPTER 4 DETERIAINANTS
Q, A, °° * a,
Q, 4, **°* Ay
det(A} =
ay an se. ann
= ©
212
N=
W
A=
01 4:
£&
021
m
SOLUTION Because 3 is in the row 2, column | position of A, we cross out the second row
and first column of A and find the cofactor of 3 to be
101 14) | | |o1 |
ai, = (-if*' 10lori
1 4) = - 2H 102
= -(-7- 0) =7.
EXAMPLE 2 Use Eq. (3) in Definition 4.1 to find the determinant of the matnx
5-2 4-1
01 52
A=| 1 2 0 1f
-3 1-1 1
4.2 THE DETERMINANT OF A SQUARE MATRIX 253
SOLUTION We have
5 -2 4-]
d=] 1 2 0 1
i—3 «61 -l 61
1 5 2 0 5 2
> 5-17 ]2 0 I+ (-2y-1)3} 1 0 1
1-1! 1 —-3 -! 1
0 1 2 0 1 5
+ 4-1) 1 2 1+(-1y-1)) 1 2 OL.
- 1 1 -—3 1-1
Computing the third-order determinants, we have
1 5 2
2 0 y= ft 1 3/1 | + 3ti a
I-11
= 1(1) - 5(1) + 2(-2) = 8;
052
1 0 =o] 1 -5|_3 | +23 4
3-1 1
= O(1) — 5(4) + 2(-1) = -22:
012
—3
1 2 =o} |
1 l a
- 1] 3 a 4]
+ 2] 3 Fj
4
Let A be an n X n matrix, and let rand s be any selections from the list
of numbers 1, 2,..., 2. Then
Equation (4) is the expansion of det(A) by minors on the rth row of A, and
Eq. (5) is the expansion of det(A) by minors on the sth column of A. Theorem 4.2
thus says that det(A) can be found by expanding by minors on any row or on
any column of A.
3 2 0 1 3
—2 4 1 2 #41
A=| 0-1 0 1 -S}.
-! 2 0-1 2
000 0 2
SOLUTION A recursive computation such as the one in Definition 4.1 is still the only waY
we have of computing det(A) at the moment, but we can expedite ¢
4.2 THE DETERMINANT OF A SQUARE MATRIX 255
~1 2 0-1
3 2 1
= (21-17) 0 -1 1 Expanding on column 3
-l1 2-1
=
= 213-5
1+] —1
_ 3l 7 1(—1) 3+!
11
21
Expanding
:
on column 1
¢ Unn
det(U) = uy, . ° ° = Uy Uy
0 0 -:: 4&,, O +: 4,
St = yy? * Ung. .
PROOF Again we find that the proof is trivial for the case n = 2. Assume that
n > 2, and that this row-interchange property holds for matrices of size smaller
than n X n. Let A be ann x n matrix, and let B be the matrix obtained from4
by interchanging the ith and rth rows, leaving the other rows unchanged.
Because n > 2, we can choose a kth row for expansion by minors, where kis
different from both r and /. Consider the cofactors
PROOF Let B be the matrix obtained from A by interchanging the two equal
rows of A. By the row-interchange property, we have det(B) = —det(A). On the
other hand, B = A, so det(A) = —det(A). Therefore, det(A) = 0. a
PROOF Let r be any scalar, and let B be the matrix obtained from A by
replacing the kth row [dy,, dy, . . . » Qn] Of A by [ray,, Ay, . . . , FAgy). Since the
rows of B are equal to those of A except possibly for the kth row, it follows that
the minor matrices A, and B,, are equal for each j. Therefore, aj, = 5;,, and
computing det(B) by expanding by minors on the kth row, we have
HISTORICAL NOTE THe THEORY OF DETERMINANTS grew from the efforts of many mathemat:-
cians of the late eighteenth and early nineteenth centuries. Besides Gabriel Cramer, whose work
we wilt discuss in the note on page 267, Etienne Bezout (1739-1783) in 1764 and Alexandre-
Theophile Vandermonde (1735-1796) in 1771 gave various methods for computing determi-
nants. In a work on integral calculus, Pierre Simon Laplace (1749-1827) had to deal with systems
of linear equations. He repeated the work of Cramer, but he also stated and proved the rule that
interchanging two adjacent columns of the determinant changes the sign and showed that a
determinant with two equal columns will be 0.
The most complete of the early works on determinants is that of Augustin-Louis Cauchy
(1789-1857) in 1812. In this work, Cauchy introduced the name determinant to replace several
older terms, used our current double-subscript notation for a square array of numbers, defined the
array of adjoints (or minors) to a given array, and showed that one can calculate the determinant
by expanding on any row or column. In addition, Cauchy re-proved many of the standard
theorems on determinants that had been more or less known for the past 50 years.
Cauchy was the most prolific mathematician of the nineteenth century, contributing to such
areas as complex analysis, calculus, differential equations, and mechanics. In particular, he wrote
the first calculus text using our modern ¢,5-approach to continuity. Politically he was a
conservative; when the July Revolution of 1830 replaced the Bourbon king Charles X with the
Orleans king Louis-Philippe, Cauchy refused to take the oath of allegiance, thereby forfeiting his
chairs at the Ecole Polytechnique and the Collége de France and going into exile in Turin and
Prague.
258 CHAPTER 4 DETERMINANTS
Yo
=—_AN
NAA
Wo
hf
—
WN
my
I}
ame
=
NR
SOLUTION Wenote that the third row of A is three times the first row. Therefore, we have
21342
62141
det(A) = 3}2 1 3 4 2! Property4
213441
14211
= 3(0)= 6. Property 3 iz
PROOF Let a, = [@,,, 4p, . . . , @;,] be the ith row of A. Suppose that ra; is added
to the kth row a, of A, where r is any scalar and k # i. We obtain a matrix B
whose rows are the same as the rows of A except possibly for the kth row,
which is
Clearly the minor matrices A, and B,, are equal for each j. Therefore, aj, = Dip
and computing det(B) by expanding by minors on the kth row, we have
where c is the matrix obtained from A by replacing the kth row of A with the
ith row of A. Because C has two equal rows, its determinant is zero, so det(B) =
det(A). a
We now know how the three types of elementary row operations affect the
determinant of a matrix A. In particular, if we reduceA to an echelon form H
and avoid the use of row scaling, then det(A) = +det(H), and det(#) is the
product of its diagonal entries. (See Example 4.) We know that an echelon
form of A has only nonzero entries on its main diagonal if and only if 4 is
invertible. Thus, det(A) # 0 if and only if A is invertible. We state this new
condition for invertibility as a theorem.
PROOF First we note that, if A is a diagonal matrix, the result follows easily,
because the product
ay by byt: Bh,
Qy, by bys Bay
has its ith row equal to a, times the ith row of B. Using the scalar-
multiplication property in each of these rows, we obtain
det(AB) = (4,4, * °° Gy.) - det(B) = det(A) - det(B).
(see Exercise 30); so both A and AB have a zero determinant, by Theorem 4.3,
and det(A) - det(B) = 0, too.
If we assume that A is invertibie, it can be row-reduced through row-
interchange and row-addition operations to an upper-triangular matrix with
nonzero entries on the diagonal. We continue such row reduction analogous to
the Gauss—Jordan method but without making pivots 1, and finally we reduce
A toa diagonal matrix D with nonzero diagonal entries. We can write D = EA,
where E is the product of elementary matrices corresponding to the row
interchanges and row additions used to reduce A to D. By the properties
of determinants, we have det(A) = (—1)’ - det(D), where r is the number
of row interchanges. The same sequence of steps will reduce the matrix
AB to the matrix E(AB) = (EA)B = DB, so det(AB) = (—1) - det(DB).
Therefore,
20 O}]1 2 3
A=}13 OVO L 2].
L
42 1/0 0 2
so det(A7') = 3] a
det(A~')
1
~ det(A)’
4.2 THE DETERMINANT OF A SQUARE MATRIX 261
| SUMMARY
1. The cofactor of an element a, in a square matrix A is (—1)'A;|, where A, is
the matrix obtained from A by deleting the ith row and the jth column.
2. The determinant of an n X n matrix may be defined inductively by
expansion by minors on the first row. The determinant can be computed
by expansion by minors on any row or on any column; it is the sum of the
products of the entries in that row or column by the cofactors of the
entries. For large matrices, such a computation is hopelessly long.
3. The elementary row operations have the following effect on the determi-
nant of a square matrix A.
a. If two different rows of A are interchanged, the sign of the determinant
is changed.
b. If a single row of A is multiplied by a scalar, the determinant is
multiplied by the scalar.
c. If a multiple of one row is added to a different row, the determinant is
not changed.
4. We have det(A) = det(A’). As a consequence, the properties just listed for
elementary row operations are also true for elementary column opera-
tions.
5. If two rows or two columns of a matrix are the same, the determinant of
the matnix is zero.
6. The determinant of an upper-triangular matrix or of a lower-tnangular
matrix is the product of the diagonal entries.
7. Ann X n matrix A is invertible if and only if det(A) 4 0.
8. If A and Bare n X n matrices, then det(AB) = det(A) - det(B).
EXERCISES
for 324
141
a -1
4-1
2 #1
2
ere oe BOT
1 2
1 2 0-1 2 4
: 14 621 1 O 1 8 1 5
3./2 3 1 6./0 4 1
14) 00 i
262 CHAPTER. 4 DETERMINANTS
31. (Application to permutation theory) Consider skeptical of our assertion that the solar
an arrangement of n objects, lined up in a system would be dead long before a
column. A rearrangement of the order of the present-day computer could find the
objects is called a permutation of the objects. determinant of a 50 x 50 matrix using just
Every such permutation can be achieved bv Definition 4.1 with expansion by minors.
successively swapping the positions of pairs a. Recall that n! = n(n — 1) - + - (3)(2X(1).
of the objects. For example, the first swap Show by induction that expansion of an
might be to interchange the first object with n X n matrix by minors requires at least
whatever one you want to be first in the new n! multiplications for n > 1.
arrangement, and then continuing this
procedure with the second, the third, etc.
WH b. Run the routine: EBYMTIME
; in LINTEK'
However, there are many possible sequences and find the time required to perform n
multiplications for n = 8, 12, 16, 20, 25,
of swaps that will achieve a given permutation.
Use the theory of determinants to prove that 30, 40, 50, 70, and 100.
it is impossible to achieve the same 39. Use MATLAB or the routine MATCOMP in
permutation using both an even number and LINTEK to check Example 2 and Exercises
an odd number of swaps. [Hint: It doesn’t 5-10. Load the appropriate file of matrices
matter what the objects actually are—think of if it is accessible. The determinant of a
them as being the rows of an n x n matrix.] matrix A is found in MATLAB using the
38. This exercise is for the reader who is command det(A).
Computation of a Determinant
The determinant of an 7 X m matrix A can be computed as follows:
1. Reduce A to an echelon form, using only row addition and row
interchanges.
2. If any of the matrices appearing in the reduction contains a row of
zeros, then det(A) = 0.
64 CHAPTER 4 DETERMINANTS
3. Otherwise,
When doing a computation with pencil and paper rather than with a
computer, we often use row scaling to make pivots 1, in order to ease
calculations. As you study the following example, notice how the pivots
accumulate as factors when the scalar-multiplication property of determinants
's repeatedly used.
2 204
2322
A=\9 132
2021
2204 1102
3322 3322 o
0132/7 2 0132 Scalar-multiplication property
2021, 2021
1 1 0 2
0 0 2-4
=2 013 2 Row-addition property twice
0-2 2-3
1 1 0 2
0 1 3 2 ;
= -2 002-4 Row-interchange property
0-2 2-3
1 | 0 2
0 1 3 2 .
= -2 00 2-4 Row-addition property
0 0 8 #1
1 1 0 2
0 1 3 2
= (—2)(2)lg Q 1 —2| Scalar-multiplication property
0 0 8 #1
43 COMPUTATION OF DETERMINANTS AND CRAMER'S RULE 265
1 1 0 2
0 1 3 2
= (—2)(2) 0 0 1-2 Row-addilion properly
0 0 017
In our written work, we usually don’t write out the shaded portion of the
computation in the preceding example.
Row reduction offers an efficient way to program a computer to compute a
determinant. If we are using pencil and paper, a further modification is more
practical. We can use elementary row or column operations and the properties
of determinants to reduce the computation to the determinant of a matrix
having some row or column with a sole nonzero entry. A computer program
generally modifies the matrix so that the first column has a single nonzero
entry, but we can look at the matrix and choose the row or column where this
can be achieved most easily. Expanding by minors on that row or column
reduces the computation to a determinant of order one less, and we can
continue the process until we are left with the computation of a determinant of
a2 xX 2 matnix. Here is an illustration.
2-1 3 °«5
201 40
A=) 6 1 3 4
-7 3-2 8|
SOLUTION It is easiest to create zeros in the second row and then expand by minors on
that row. We start by adding —2 times the third column to the first column,
and we continue in this fashion:
Cramer’s Rule
x, b,
x=|°]}, and b=
Xn D, =)
_ det(B,)
for k=1,...,n, (1)
*e “Get(A)
where B, is the matrix obtained from 4 by replacing the kth-column
vector of A by the column vector b.
X,=1000
--- x00 --: Of
5 1 1 5-2 I
det(B,)=|3 3 Of = —-15, det(B,)=|3 2 3) = —20.
1 0-1 1 1 @
Hence,
xewel
! —] 3?
m=
-15
yg =i,
3
= 4
—] ~ 3 a
ij
a, a,2 . Ann
Then
_ fdet(A) if i = j
det(4,.) = (5 if i xj.
If we expand det(4,_.,) by minors on the jth row, we have
nt
~ _ {det(A) if i =j, 2)
py Ais ~ ( ifi Aj. (
The term on the left-hand side in Eq. (2) is the entry in the ith row and jth
column in the product A(A’)’, where A’ = [a;] is the matrix whose entries are
the cofactors of the entries of A. Thus Eq. (2) can be written in matrix form 4
A(A')! = (det(A))/,
4.3 COMPUTATION OF DETERMINANTS AND CRAMER'S RULE 269
where Jis the n X nidentity matrix. Similarly, replacing the ith column of A by
the jth column and by expanding on the /th column, we have
A det(A)
if i =j
2, diay = \
r=)
ifi # ji (3)
Relation (3) yields (A’)"A = (det(A))I.
fhe matrix (A’)’ is called the adjoint of A and is denoted by adj(A). We
have established an important relationship between a matrix and its adjoint.
A‘ = =—,adj(A),
TaD
on
te
©
od NO
—m—N
a= (-I9| 20 f= 2 a = ry] 2 ,
= -2,
z 2 ' 01
Ay = (-1) 3 l =—4, Qa, = (-1) l 1 = l,
; 41 | 40
on =o 1 a one Cos 1 a
270 CHAPTER 4 DETERMINANTS
' 0 ; ; ;|4 1
ay, = (—1)"), = -2, ay = (-1) 20 =2
a, = (-1)]5 52 = 8.
Hence,
2-2 -4 2 | -2
A’=[aJ=| 1 1-4), so adj(A)=|}-2 1 2
-2 2 8 -4 -4 8
and
toot
2 1-2 ; 44
A! = adj(A)=—|-2 1 2/=| 72 4 2}.
wet) -4-4 8] [-1-1 2 .
The method described in Section 1.5 for finding the inverse of an
invertible matrix is more efficient thax the method illustrated in the preceding
example, especially if the matrix is large. The corollary is often used to find ihe
inverse of a 2 x 2 matrix. We see that if ad— bc # 0, then
la ey ee d —b
le d| ~ad- a ak
“| SUMMARY
1. A computationally feasible algcrithm for finding the determinant of a
matrix is te reduce the matrix to echelon form, using just row-addition and
row-interchange operations. If a row of zeros is formed during the process,
the determinant is zero. Otherwise, the determinant of the original matrix
is found by computing (—1)' - (Product of pivots) in the echelon form,
where r is the number of row interchanges performed. This is one way to
program a computer to find a determinant.
2. The determinant of a matrix can be found by row or column reduction of
the matrix to a matnx having a sole nonzero entry in some column or row.
One then expands by minors on that column or row, and continues this
process. If a matrix having a zero row or column is encountered, the
determinant is zero. Otherwise, one continues until the computation is
reduced to the determinant of a 2 X 2 matrix. This is a good way to find
determinant when working with pencil and paper.
3. IfA is invertible, the linear system Ax = b has the unique solution x whos¢
kth component is given explicitly by the formula
_ det(B,)
\ det(A)
where the matrix B, is obtained from matrix A by replacing the kth colum®@
of A by b
4.3 COMPUTAT lON OF DETERMINANTS AND CRAMER'S RULE 271
4. The methods of Chapter | are far more efficient than those described in
this section for actual computation of both the inverse of A and the
solution of the system Ax = b.
Let A be ann x n matrix, and let A’ be its matrix of cofactors. The adjoint
un
adj(A) is the matrix (A’)’ and satisfies (adj(A))A = A(adj(A)) = (det(A))/,
where J is the n xX n identity matrix.
6. The inverse of an invertible matrix A is given by the explicit formula
l
Al = det(d) 204).
EXERCISES
3, |2 3-1 2 10./-1 2 4 0 0O
° 3 —4 3 7 3 1 -2 0 0
1-10 4 . 5 1 5 0 0
- 11. The matrices in Exercises 8 and 9 have zero
| 3-5-1 7 entries except for entries in anr X r
4.| 9 3 1-6 submatrix R and a separate ¢ x 5 submatrix
2-5-1 8 S whose main diagonals lie on the main
(8 8 2-9 diagonal of the whole n x n matrix, and
2 1 0 0 0 where r + 5 = n. Prove that, 1f A is such a
3-! 2 0 0 matrix with submatrices & and S, then
510 4 1-1 2 det(A) = det(R) - det(S).
0 0-3 2 4 12. The matrix A in Exercise 10 has a structure
0 O O-1 3 similar to that discussed in Exercise 1 1,
r3 2 00 90 except that the square submatrices R and S
-1 4 1 0 0 lie along the other diagonal. State and prove
6 | 0-3 5 2 =O a result similar to that in Exercise 11 for
000 i: 4 such a matrix.
| 0 0 O-! 2| 13. State and prove a generalization of the result
fo 0 03 4 in Exercise 11, when the matrix A has zero
002 0-3 entries except for entries in k submatrices
72/02 1 0 0 positioned along the diagonal.
5-3 2 0 0
3 4 0 0 0| In Exercises 14-19, use the corollary to Theorem
72-1 0 0 4.6 to find A” if A is invertible.
gs ‘lo|4 5
03 0 0
6 Maat | Is. A=[) 1
10 0-4 2 1 -1 2 |
272 CHAPTER 4 DETERMINANTS
det(A) = +1, prove that A~! also has the a, b, c, da, | 3b,|
same properties. a, dy cy a, | 3d,
46. Prove that the inverse of a nonsingular aefault roundoff control ratio r. Start
upper-triangular matrix is upper triangular. computing determinants of powers of A.
37. Prove that a square matrix is invertible if
Find the smallest positive integer m such
and only if its adjoint is an invertible that det(A”) # 1, according 9 MATCOMP.
matrix. How bad is the error? What does
MATCOMP give for det(A™)? At what
4g. Let A be an n X n matrix. Prove that integer exponent does the break occur
det(adj(A)) = det(A)""'. between the incorrect value 0 and incorrect
49. Let A be an invertible 1 x n matrix with values of large magnitude?
n> 1. Using Exercises 37 and 38, prove that Repeat the above, taking zero for
adj(adj(A)) = (det(A))""7A. roundoff control ratio r. Try to explain why
the results are different and what is
Bl he routine YUREDUCE in LINTEK has a happening in each case.
menu option D that will compute and display . Using MATLAB, find the smallest positive
the product of the diagonal elements of a integer m such that det(A”) # 1, according
square matrix. The routine MATCOMF has a
to MATLAB, for the matrix A in Exercise
menu cption D to compute a determinant.
43.
Use YUREDUCE or MATLAB to compute
the determinant of the matrices in Exercises
40-42. Write down your results. If you used In Exercises 45-47, use MATCOMP in LINTEK
YUREDUCE, use MATCOMP to compute or MATLAB and the corollary of Theorem 4.6 to
the determinants of the same matrices again find the matrix of cofactors of the given matrix.
and compare the answers.
li -9 28 13 -15 33 f} 2 -3
40. }32 -24 21 41.;-15 25 40 45. 2 3 0
10 13 -19 12 -33 27 3 1 «4
2 3]
34
A =
continuous functions of its entries; that is,
changing an entry by a very slight amount
has determinant |, so every power of it will change a cofactor only slightly. Change
should have determinant i. Use MATCOMPF some entry just a bit to make the
with single-precision printing and with the determinant nonzero.]
thatA is a square matrix, then det(A) is defined. But what is the meaning of the
number det(A) for the transformation 7? We now tackle this question, and the
answer is sO important that it merits a section all to itself. The not‘on of the
determinant associated with a linear transformation 7 mapping R” into R" lies
at the heart of variable substitution in integral calculus. This section presents
an informal and intuitive explanation of this notion.
In Section 4.1, we saw that the area of the parallelogram (or 2-box} in R?
determined by vectors a, and a, is the absolute value |det(A)| of the determi-
nant of the 2 x 2 matrix A having a, and a, as column vectors.* We also saw
that the volume of the parallelepiped (or 3-ox) in R? determined by vectors a,,
a), and a, is |det(A)| for the 3 x 3 matrix A whose column vectors are a, a), a3.
We wish to extend these notions by defining an n-box in R™ for m = n and
finding its “volume.”
If the vectors a,, a,,..., a, in Definition 4.2 are dependent, the set
described is a degenerate n-box.
EXAMPLE 1 Describe geometrically the 1-box determined by the “vector” 2 in R and the
I-box determined by a nonzero vector a in R”.
SOLUTION The 1-box determined by the “vector” 2 in R consists of all numbers 4(2) for
0 <¢< 1, which is simply the closed interval 0 < x < 2. Similarly, the 1-box in
R™ determined by a nonzero vector a is the line segment joining the origin to
the tipofa. s
Notice that our boxes need not be rectangular boxes; perhaps we should
have used the term skew box to make this clear.
*Anticipating our work in this section, we arrange the vectors a, as columns rather than as rows
of A. Recall that det(A) = det(A’).
44 LINEAR TRANSFORMATIONS AND DETERMINANTS (OPTIONAL) 275
ay
From Eqs. (1) and (2), we might guess that the square of the volume of an
n-box in R” is det((a; - a). Of course, we must define what we mean by the
volume of such a box, but with the natural definition, this conjecture is true. If
A is the matrix with jth column vector a, then A’ is the matrix with ith row
vector a, and the n X n matrix [a, - a] is A7A, so Eq. (2) can be written as
(Area) = \det(A7A).
Wc have an intuitive idea of the volume of an 7-box in R” for n < m. For
example, the 3-box in R™ determined by independent vectors a,, a,, a; has a
volume equal to the altitude of the box times the volume (that is, area) of the
base, as shown in Figure 4.10. Roughly speaking, the volume of an n-box 1s
equal to the altitude of the box times the volume of the base, which is an
(n — 1)-box. This notion of the altitude of a box can be made precise after we
develop projections in Chapter 6. The formal definition of the volume of an
n-box appears in Appendix B, as does the proof of our main result on volumes
(Theorem 4.7). For the remainder of this section, we are content to proceed
with our intuitive notion of volume.
PROOF By Theorem 4.7, the square of the volume of the n-box is det(A7A).
But because A is ann X n matrix, we have
EXAMPLE 3 Find the area of the parallelogram in R* determined by the vectors [2, |, — 1, 3]
and [0, 2, 4, —1].
SOLUTION If
2 0
1 2
A=|_1 4),
3 -1
then
2 ;
2 1-1 3] 1 2]_f15 -5
AA E 2 4 aye 4 i i}
r = =
3-1
By Theorem 4.7, we have
15 -5
(Area) 2—
2 71 =
290.
EXAMPLE 4 Find the volume of the parallelepiped in ik’ determined by the vectors
[1, 0, -1], [-1, 1, 3], and [2, 4, 1].
SOLUTION We compute the determinant
1-1 2) [1-1 2
0 1 4=lo 1 = |) 3 = -3.
-1 3 of jo 2 3 |
Applying the corollary of Theorem 4.7, the volume of the parallelepiped is
therefore 5 =
Comparing Theorem 4.7 and its corollary, we see that the formula for the
volume of an n-box in a space of larger dimension m involves 2 square root,
278 CHAPTER 4 DETERMINANTS
whereas the formula for the volume of a box in a space of its own dimension
does not involve a square root. The student of calculus discovers that the
calculus formulas used to find the length of a curve ‘which 1s one-dimensional)
in the plane or in space involve a square root. The same is true of the formulas
used to find the area of a surface (two-dimensional) in space. However, the
calculus formulas for finding the area of part of the plane or the volume of
some part of space do not involve square roots. Theorem 4.7 and its corollary
lie at the heart of this difference in the calculus formulas.
LL
AB =|Ab, Ab, «++ Ab,|.
Find the volume of the image box when T acts on the cube determined by the
vectors ce,, ce,, and ce, for c > 0.
44 LINEAR TRANSFORMATIONS AND DETERMINANTS (OPTIONAL) 279
y
A
y= 3x
3¢
Volume change
factor = |3|
> Xx
c
FIGURE 4.11
The volume-change factor of T(x) = 3x.
¢ Volume of the
Volume of the image box = c|det(A)|
x cube = c? x,
FIGURE 4.12
The volume-change factor of T(x) = Ax.
280 CHAPTER 4 DETERMINANTS
and the volume-change factor of Tis |det(A)| = 6. Therefore, the image box has
volume 6c’. This volume can also be computed by taking the determinant of
the matrix having as column vectors 7{ce,), T(ce,), and T(ce,). This matrix is
cA, and has determinant c - det(A) = 6c.
Application to Calculus
We can get an intuitive feel for the connection between the volume-change
factor of T and integral calculus. The definition of an integral involves
summing products of the form
(Function value at some point of a box)(Volume of the box). (3)
f{x) dx
Under a change of variables—say, from x-variables to t-variables—the boxes
in the dx-space are replaced by boxes in dt-space via an invertible linear
transformation—namely, the differential of the variable substitution function.
Thus the second factor in piuduct (3) must be expressed in terms of volumes of
boxes in the dt-space. The determinant of the differential tiansformation must
play a role, because the volume of the box in dx-space is the volume of the
corresponding box in dt-space multiplied by the absolute value of the
determinant.
Let us look at a one-dimensional example. In making the substitution x =
sin ¢ in an integral in single-variable calculus, we associate with each f-value 4
the linear transformation of dt-space into dx-space given by the equation dx =
(cos f,)dt. The determinant of this linear transformation is cos f). A little 1-box
of volume (length) dt and containing the point ¢, is carried by this lineaf
transformation into a little 1-box in the dx-space of volume (length) |cos t,|dt.
Having conveyed a rough idea of this topic and its importance, we leave 1ts
further development to an upper-level course in analysis.
\det(A)| - V.
That is, the volume of a region is multiplied by |det(A)| when the region
is transformed by T. This result has important applications to integral cal
culus.
The volume of a sufficiently nice region G in R" may be approximated by
adding the volumes of small n-cubes contained in G and having edges of length
c parallel to the coordinate axes. Figure 4.13(a) illustrates this situation for a
plane region in R’, where a grid of small squares (?-cubes) is placed on the
region. As the length c of the edges of the squares approaches zero, the sum of
the areas of the colored squares inside the region approaches the area of the
region. These squares inside G are mapped by T into parallelograms of area
c’|det(A)| inside the image of G under T. (See the colored parallelograms in
Figure 4.13(b). As c approaches zero, the sum of the areas of these parallelo-
grams approaches the area of the image of G under 7, which thus must be the
area of G multiplied by |det(A)|. A similar construction can be made with a
grid of n-cubes for a region G in R". Each such cube is mapped by 7 into an
n-box of volume c’|det(A)|. Adding the volumes of these n-boxes and taking
the limiting value of the sum as c approaches zero, we see that the voiume of
the image under T of the region G is given by
Volume of image of G = |det(A)| - (Volume of G). (4)
We summarize this work in a theorem on the following page.
€2
U e;
c
FIGURE 4.13
282 CHAPTER 4 - DETERMINANTS
EXAMPLE6 Let T: R? > R’ be the linear transformation of the plane given by 7([x, y]) =
[2x — y, x + 3}}. Find the area of the image under T of the disk x? + y= 4 in
the domain of T.
SOLUTION The disk x? + y = 4 has radius 2 and area 42. It is mapped by 7 into a region
(actually bounded by an ellipse) of area
Vdet(A7A) - V.
We summarize this generalization in a theorem.
Vdet(A7A) - V.
EXAMPLE7 Let T: R? > R? be given by T([x, y]) = [2x + 3y, x — y, 2y]. Find the area of the
image in R? under T of the disk x7 + 7 s 4 in R’.
SOLUTION The standard matrix representation A of Tis
2 3
A=|1 -1
0 2
44 LINEAR TRANSFORMATIONS AND DETERMINANTS (OPTIONAL) 283
and
fz
ATA =}5 _y
1 of?
al!
3 455yah
-1=]5
0 2
Thus,
| SUMMARY
where
0 <= f,= 1 fori=1,2,...,7n.
2. A |-box in R” is a line segment, and its “volume” is its length.
3. A 2-box in R” is a parallelogram determined by two independent vectors,
and the “volume” of the 2-box is the area of the parallelogram.
A 3-box in iR” is a skewed box (parallelepiped) in the usual sense, and its
fo.
| exercises
1. Find the area of the parallelogram in R? 2. Find the area of the parallelogram in R°
determined by the vectors [0, !, 4] and determined by the vectors [1, 0, 1, 2, —1] and
[—1,3, -2]. (0, 1, -1, 1, 3}.
284 CHAPTER 4 DETERMINANTS
3. Find the volume of the 3-box in R‘ [4x — 2y, 2x + 3y]. Find the area of the image
determined by the vectors [—1, 2, 0, 1], under T of each of the given regions in R?.
[0, 1, 3, 0], and [0, 0, 2, —1}.
4. Find the volume of the 4-box in R° 18. The squaareOSx51,0s ys]
determined by the vectors [1, 1, 1, 0, 1], 19. The rectangle -l=x<t,l=sy=2
(0, 1, 1. 0, 0], [3, 0, 1, 0, 0], and 20. The parallelogram determined by 2e, + 3e,
[1, —1, 0, 0, 1). and 4e, — e,
. £1, 0,0, 1), (2, -1, 3, O}, (0, !, 3, 4], 23. TheboxO =<x<2,-ls ys3,252<5
Oo
eoetric argument showing that at least —_. @ If the image under 7 of an n-box B in R*
|det(AB)| = |det(A)] - |det(B)|. [Hint: Use the has volume 12, the box B has volume
fact that, if A is the standard matrix 12/|det(A)].
representation of 7: R” > R" and B is the . fn = 2, the image under 7 of the unit
standard matrix representation of disk x + y’ = | has area |det(A)|.
T’: R’ — R’, then AB is the standard matrix
. The linear transformation 7 is an
representation T° T’.]
isomorphism.
35. Let T: R? — R" be a linear transformation of
. The image under 7° T of an n-box in R’
rank n with standard matnx representation
of volume V is a box in R’ of volume
A. Mark each of the following True or False.
det(A’) - V.
_ &. The image under T of a box in R’ is
again a box in R’. i. The image under 7° Te T of an n-box in
__ b. The image under 7 of an n-box in R" of R’ of volume V is a box in R’ of volume
volume V is a box in R’ of volume det(A’) - V.
det(A)- V. —_—j. The image under 7 of a nondegenerate
__ c. The image under T of an n-box in R’ of 1-box is again nondegenerate.
volume > 9 is a box in R’ of volume > 0.
__ 4. If the image under 7 of an n-box B in R" 36. Prove Eq. (1); that is, prove that the square
has volume 12, the box B has volume of the length of the line segment determined
|det(A)| - 12. by a, in R" is |la,||? = det([a, - a,]).
CHAPTER
5 EIGENVALUES AND
J — EIGENVECTORS
EXAMPLE 1 (Fibonacci’s rabbits) Suppose that newly born pairs of rabbits produ
offspring curing the first month of their lives, but each pair produces one
pair each subsequent month. Starting with F, = | newly born pair in the
286
5.1 EIGENVALUES AND EIGENVECTORS 287
month, find the number F; of pairs in the Ath month, assuming that 20 rabbit
dies.
SOLUTION In the Ath month, the number of pairs of rabbits is
F, = (Number of pairs alive the preceding month)
+ (Number of newly born pairs for the kth month).
Because our rabbits do not produce offspring during the first month of their
lives, we see that the number of newly born pairs for the kth month is the
number F,_, of pairs alive two months before. Thus we can write the equation
above as
F, _|! 1] | F,,
F,., 1 0} LF yo}
Thus, if we set
_ |F fil
x, = i and A= i of
we find that
X, = AX;,.;. (3)
288 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
Applying Eq. (3) repeatedly, we see that
and in general
Thus we can compute the kth Fibonacci number F, by finding A‘' and
multiplying it on the right by the column vector x,. Raising a matrix to a power
is also a bit of a job, but the routine MATCOMP (in LINTEK) or MATLAB
can easily find F,, for us. (See Exercise 45.)
Both Markov chains and the Fibonacci sequence lead us to computations
ef the form A‘x. Other examples leading to A*x abound in the physical and
social sciences.
Av = Av (5)
for some scalar A. Geometrically, Eq. (5) asserts that Av is a vector parallel to ¥.
From Av = Av, we obtain
Ab,
= Ab; for i= 1,2,....n.
To compute A‘x, we first express x as a linear combination of these basis
eigenvectors:
Computing Eigenvalues
In this section we show how a determinant can be used to find eigenvalues of
ann X nmatrix A; the computational technique is practical only for relatively
small matrices.
We write the equation Av = Av as Av — Av = 0, or as Av — Alv = 0, where /is
the n x nidentity matrix. This last equation can be written as (A — A/)v = 0, so
y must be a solution of the homogeneous linear system
(A — AZ)x = 9. (7)
An eigenvalue of A is thus a scalar A for which system (7) has a nontrivial
solution v. (Reca!] that an eigenvector is nonzero by definition.) We know that
system (7) has a nontrivial solution precisely when the determinant of the
coefficient matnx is zero—that is, if and only if
det(A - Al) = 0. (8)
HISTORICAL NOTE THE CONCEPT OF AN EIGENVALUE, in its origins and its later development,
was independent of matrix theory. In fact, its original context was that of the solution of systems
of linear differential equations with constant coefficients (see Section 5.3). Jean Le Rond
D’Alembert (1717-1783), in his work in the 1740s and 1750s on the motion of a string loaded
with a finite, number of masses, considered the system
3
2
I> ay =0, t= 1,2, 3.
dt get
(Here the number of masses is restricted to 3 for simplicity.) To solve this system, D’Alembert
multiplied the ith equation by a constant v;(to be determined) for each i and added the equations
together to obtain
3 3
= qd?
au aa + > va4y, = 0.
i=l ik= J
If the v, are chosen so that ¥, va,;, + Av, = 0 fork = 1, 2, 3—that is, if the vector [v,, V2. Ya] is
eigenvector corresponding to the eigenvalue —A for the matrix A = [a,), then the substitution # ~
Viv, + vat. + v3), converts the original system to the single differential equation
This equation can now casily be solved and leads to solutions for the y;. It is not difficult to show
that A is determined by a cubic equation that has three roots. .
Eigenvalues also appeared in other situations involving svstems of differential equations:
including physical situations studied by Euler and Lagrange.
5.1 EIGENVALUES AND EIGENVECTORS 291
a,-A ay me Qi,
Ay, A2—A Q2, =6 (9)
a, ano ° Gan A
A=|2
_ {32
A
SOLUTION The characteristic polynomial of A is
— yn f3-A 2 oy ay
det(A AN =| 7 j=) 3A — 4.
EXAMPLE 3 Show that A, = | is an eigenvalue of the transition matrix for any Markov
chain.
SOLUTION Let T be ann X n transition matrix for a Markov chain; that is, all entries in 7
are nonnegative and the sum of the entries in each column of T is 1. We easily
see that the sum of the eniries in any column of 7 — J must be zero. Thus the
*Many authors define the characteristic polynomial of A to be p(A) = det(A/ — A) rather than
P(A) = det(A — AI). Because AJ — A = (—1(A — AJ), we see that for an n x n matrix A, we have
det(Al — A) = (—1)*det(4 — A). Thus the two definitions differ only when n is an odd integer, in
which case the two polynomials differ only in sign. The definition det(A/ — A) has the advantage
that the term of highest degree is always A’, rather than tcing —A" when n is odd. We use the
definition p(A) = det(A — A/) in this first course because for our pencil-and-paper computations, it
is easier to write “— A” after each diagonal entry than to change the sign of every entry and then
write “A +” before the diagonal entries. However, the command poly(A) in MATLAB will produce
the coefficients of the polynomial det(A/ — A).
292 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
sum ol the row vectors of 7 — / 1s the zero vector, so the rows of T — J are
linearly dependent. Consequently, rank(7 — /) < an, and (7 — /)x = 0 hasa
nontrivial solution, so A, = | is an eigenvalue of 7.
2-Aa 1 0
-’A 1 -! 1
PAyY=}-1 A | =@-A) 3 al - 4 1 1-A
l 3. I-A
3.1 EIGENVALUES AND EIGENVECTORS 293
Computation of Eigenvectors
We turn to the computation of the eigenvectors corresponding to an eigen-
value A of a matrix A. Having found the eigenvalue, we substitute it in
homogeneous system (7) and solve to find the nontrivial solutions of the
system. We will obtain an infinite number of nontrivial solutions, each of
which is an eigenvector corresponding to the eigenvalue A.
2 0
—
A=|-1 1}.
&
1 1
Ww
SOLUTION The eigenvalues of A were found to be A, = —1 and A, = A, = 2. We substitute
each of these values in the homogeneous system (7). The eigenvectors are
obtained by reducing the coefficicnt matrix A — AJ in the augmented matrix for
the system. For A, = —1, we obtain
[A — AJ | 0] = [A+ J] 0]
(In the future, we will drop the column of zeros to the right of the partition
when solving for eigenvectors.) The solution of the homogeneous system 1s
given by
r/4
—3r/4 for any scalar r.
r
Therefore,
pAlw ale
rl4
vy, = |—3r/4| = r}—3| for any nonzero scalar r
r
—
294 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
r l
vy, = -s =r 5 for any nonzero scalar r.
L
4r 4|
For A, = 2, we obtain
[A - Af] = 1A - 2]
0 1 O 1 3-1
=|-1 -—2 ~10
1 Ol.
1 3-1] [0 0 0
This time we find that
s l
vy, = i = for any nonzero scalar s
5 1
21 0\[s] [2s s
AV, = —1 0 l 0 = 0 = 0 = 2v, = dV).
13 jis} [2s s|
1 0 0
A=|-8 4 -6].
8 I 9
il-A 0 0 A -6
His =~ a=|-8 4-A -6 9 -A =(1- aly
| 8 1 9-A
= (1 — AA? — 13A + 42) = (1 — AMA — 6)(A — 7).
00 0; 0 0 0
A-AI=A-I=|-8 3 -6/~|0 4 2
8 1 8 {8 1 8
5.1 EIGENVALUES AND EIGENVECTORS 295
I 15
8 1 3) |. ? | 9 6
~lo 2 I~] ! a}~]O1 a),
00 0; |0 0 O| j0 0 0
sO
—15r/16 “i
vVi=|] -r/2 |=r -3 for any nonzero scalarr
r
1
[Sr 15
8ri=r| 8| for any nonzero scalar r.
~—l6r —16
For A, = 6, we have
[--5 0 o] f1 0 oO] ft 0 O
A-AJ=A-6I1=|-8 -2 -6]~]0 -2 -6/~|0 1 3},
8 1 3
sO
0 0
vy, = - = i for any nonzero scalar s
s
—
-6 0 0 1 0 0 1 0 0
A-A,j =A-7I1=|-8 -3 -6|~|0 -3 -6|~|0 1 2],
8 1 2); |O t 2] 10 0 @
so
0 0
vy, = I = [3 for anv nonzero scalar ¢
| ¢ | 1
is an eigenvector. sm
Let A be an n X n matrix.
1. If A is an eigenvalue of A with v as a corresponding eigenvector,
then A* is an eigenvalue of A*, again with v as a corresponding
eigenvector, for any positive integer k.
2. If A is an eigenvalue of an invertible matrix A with v as a
corresponding eigenvector, then A # 0 and 1/A is an eigenvalue of
A~', again with v as a corresponding eigenvector.
3. If A is an eigenvalue of A, inen the set E, consisting of the zero
vector together with ail eigenvectors of A for this eigenvalue A is a
subspace of n-space, the eigenspace of A.
Av = dv
T(v) = Av.
Thus, the linear transformation 7 maps the vector v onto a vector that is
parallel to v. (See Fig. 5.1.) We present a definition of eigenvalues and
eigenvectors for linear transformations that is more general than for matrices,
in that we need not restrict ourselves to a finite-dimensional vector space.
vA 7(v) = 2v
T(v) = -3y
(a) (b)
FIGURE 5.1
(a) T has eigenvalue A = —2; (b) 7 has eigenvalue A = 2.
5.1 EIGENVALUES AND EIGENVECTORS = 297
DEFINITION 5.2 Eigenvaiues and Eigenvectors
ILLUSTRATION 1 Not every linear transformation has eigenvectors. Rotation of the plane
counterclockwise through a positive angle 6 is a linear transformation (see
page 156). If 0 < @< 180°, then no vector is mapped onto one parallel to
it—that is, no vector is an eigenvector. If @ = 180°, then every nonzero vector
is an eigenvector and they all have the same associated eigenvalue A,=—1. ©
ILLUSTRATION 2 The linear transformation T: R? > R? that zeflects vectors in the line x + 2 = 0
maps the vector [2, — 1] onto itself and maps the vector [1, 2] onto [-—1, —2], as
indicated in Figure 5.2. The equations 7((2, —1]) = (2, —1] and 7([1, 2]) =
[—1, —2] show that (2, —1] and [1, 2} are eigenvectors of T with corresponding
eigenvalues i and —i, respectively. *
~
72, — 1)
(C1, 2))
FIGURE 5.2
Reflection in the line x + 2y = 0.
298 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
EXAMPLE 8 Let D. be the vector space of all functions mapping R into R and having
derivatives of all orders. Let 7: D, — D, be the differentiation map, so that
T(f) =f", the derivative of {/ Describe ail eigenvalues and eigenvectors of T.
(We have seen in Section 3.4 that differentiation does give a linear transforma-
tion.)
SOLUTION We must find scalars A and nonzero functions fsuch that 7({) = Af—that is,
such that f’ = Af, We consider two cases: if A = 0, and if A # 0.
If 4 = 0, we are trying to solve the differential equation f’ = 0, or to use
Leibniz notation, dy/dx = 0. We know from calculus that the only soluticns of
this equation are the constant functions. Thus the nonzero constant functions
are eigenvectors corresponding to the eigenvalue 0.
If A 4 0, the differential equation becomes f' = Af or dy/dx = Ay. It 1s
readily checked that y = e* is a solution of this equation, so f(x) = ke* is aa
eigenvector for every nonzero scalar k. To see that these are the only solutions,
we can solve the differential equation by separating variables, which yields tne
equation
dy
—=A) dx.
y_
Integrating both sides of the equation yields
In|y| = Ax +c.
Solving for y, we obtain y = +e‘e* = ke**, so the only solutions of the
differential equation are indeed of the form y = ke*. a
ILLUSTRATION 3 There are several physical situations in which a point of a vibrating body
subject to a restoring force proportional to its displacement from its positis
at rest. One such situation is the vibration of a body suspended by a spring,8
5.1 EIGENVALUES AND EIGENVECTORS 299
illustrated in Figure 5.3. We let f(t) be the displacement of the body at time ¢.
The acceieration of the body at time ¢ is then given by /"(t). Using Newton’s
law F = ma, we see that we can express the relation between the restoring force
and the displacement as mf"(t) = —cf(t), where c is a positive constant.
Dividing by m, we can rewrite this differential equation as
FD = -kf(9) (10)
for some constant k. Now differentiating twice is a linear transformation of the
vector space of all infinitely differentiable function into itself, and we see that
the equation f"() = —kf(é) asserts that the function fis an eigenvector of this
transformation with eigenvalue —k’. We can easily check that the functions sin
kt and cos kt are solutions of Eq. (10). By property 3 of Theorem 5.1, every
linear combination
S(Q = asin kt + bcos kt
is also a solution. Note that for this vibration situation, the eigenvalue —k?
determines the frequency of the vibration. To illustrate, suppose in the case of
the spring in Figure 5.3 we have f(t) = 2 and f'(t) = 0 when ¢ = 0. It is readily
determined that we then must have b = 2 anda = 0, so f(t) = 2 cos kt. Thus the
frequency of vibration, which is the reciprocal of the period, is k/(27). =
SUMMARY
EXERCISES 7
i -1 -i 3-2 0 8. 4 , “ ’ 3 O |
A,=|-1 1 -1|,4,=|-2 3 0 L
-1-1 1 0 0 5 rT 1 0 0 -2 0 0
10.|-8 4 -5 Wl. |-5S -2 -5
and the vectors 8 0 9 5 0 3
0 -! [4 0 0 [1 O 1
V=l1],%=/0),%=] 1, W2.|-7 2 -1 13./-7 2 5
I I 0 7 0 3 . 3 0 TD
! -| [460 0 ro 0 |
v¥,=]1], v5 =| OF. 14.;8 4 8 1§.|-2 -2 1
0 I 10 0 4 | 2 0-1!
In Exercises 2-16, find the characteristic In Exercises 17-22, find the eigenvalues A, 27d
polynomial, athe real eigenvalues,a and the . the corresponding eigenvectors v; of the linear
corresponding eigenvectors of the given matrix. transformation T.
transformation 7: V ~ V with corresponding 46. The first two terms of a sequence are a, = 0
eigenvalues A, and A,, respectively. Prove and a, = 1. Subsequent terms are generated
that, if A, * A), then v, and v, are using the relation
independent vectors.
a, = 2a,.,+ a. for k=2.
42. The analogue of Exercise 41 for a list of r
eigenvectors in V having distinct eigenvalues a. Write the terms of the sequence through
is also true; that is, the vectors are . dy.
independent. See if you can prove it. [HiNT: b. Find a matnx that can be used to
Suppose that the vectors are dependent; generate the sequence, as the matrix A in
consider the first vector in the list that is a Exercise 31 can be used to generate the
linear combination of its predecessors, and Fibonacci sequence.
apply T.] c. Use MATCOMP in LINTEK, or
43. State the result for matrices corresponding MATLAB, to find ap.
to Exercise 42. Explain why successful 47, Repeat Exercise 46 for a sequence where
completion of Exercise 42 gives a proof of a, = 0, a, = 1, a, = 2, and a, =
this statement for matrices. 2a,-, — 3ay-2 + ay-3 for k = 3.
5.1 EIGENVALUES AND EIGENVECTORS 303
a square matrix A, .£he MATLAB command Copy down the equation, and then use
gig(A) produces the eigenvalues (both real and ALLROOTS to find all eigenvaiues of the
plex) of A. The command [V, D] = eig(A) matrix.
ces a matrix V whose column vectors are
haps complex) eigenvectors of A and a -l 4 6 10 -13 8
‘agonal matrix D whose entry d,, is the .| 2 7 9 53. 3 -20 5
eigenvalue for the eigenvector in the ith column of -3 Il 13 -11 7 -6
y, In Exercises 48-51, use either MATLAB or the
poutine MATCOMP in LINTEK to find the real [ -7 t1 -7 10
eigenvalues and corresponding eigenvectors of the 5 8 -13 3
piven matrix. TS 8 -9 2
3-4 20 -6
710 6 “1000 21-8 O 32
—
-14 17 -6 9
#)2-15 -60 49,| -33290
020 55.
15 11-13 16
I-31 0 3! I-18 30 4331
00-2 2 . Use the routine MATCOMP in LINTEK, or
-34-3 3 MATLAB, to illustrate the Cayley-Hamilton
01409 2 4 theorem for the matna
20 2 4
-2 4 6 -1
4 00 O 5-8 3 2
1-6 16-6 6 11 -3 7 1}
-)_16 0 20 16 0-5 9 10
-16 0 16 20
(See Exercise 39.)
The routine ALLROOTS in LINTEK can be used 57. The eigenvalue option of the routine
to find both real and complex roots of a VECTGRPH in LINTEK is designed to
polynomial. Ihe program uses Newton’s method, illustrate graphically, for a linear
which finds a solution by successive transformation of R? into itself having real
approximations of the polynomial function by a eigenvalues, how repeatedly transforming a
linear one. ALLROOTS is designed so that the vector makes it swing in the direction of an
user can watch the approximations approach a eigenvector having eigenvalue of maximum
solution. Of course, a program such as MATLAB, magnitude. By finding graphically a vector
which is designed for research, simply spits out the whose transform is parallel to it, one can
answers. In Exercises 52-55, either estimate from the graph an eigenvector and
&. use the command eig(A) in MATLAB to find the corresponding eigenvalue. Read the
all eigenvalues of the matrix or directions for this option, and then work
b. first use MATCOMP in LINTEK to find the with it until you can reliably achieve a score
characteristic equation of the given matrix. of 85% or better. -
MATLAB
The command v = poly(A) in MATLAB produces a vector v whose components are
the coefficients of the characteristic polynomial p(A) of A, appearing in order of
decreasing powers of A. The command roots(v) then produces the solutions of
Pia) = 0.
304 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
Mi. €nter the command format long and then use thc commands jusi eaplained to
find the characteristic polynomial and both the real and the complex
eigenvalues of
a. the matrix in Exercise 53
b. the matrix in Exercise 54.
{Recall that in MATLAB, the characteristic polynomial of an n x n matrix A is
det(A/ — A), rather than det(A — AJ). Thus the coefficient of A” should be | rather
than (—1)". We enter the long format because otherwise, with the displays in
scientific notation, it might seem from MATLAB that the coefficient of A” is 0.]
Let A be ann X rn matrix. In the introduction to this section, we considered the
case where there is a basis {b,, b,, . . . , b,} for R" consisting of eigenvectors of A, and
where, for one of the eigenvectors, say b,, the corresponding eigenvalue A, is of
greater magnitude than each of A,,..., A,. Also ifx = db, + d,b, +--+ + d,b,
with d, # 9, then for sufficiently !arge values of k, we saw that
Ablx = A,(440,
so that this eigenvalue A, of maximum magnitude might be estimated by taking the
ratio of components of A‘*'x to components of A*x. Recall that a period before an
arithmetic operation symbol, such as .« or ./, will cause that operation to be
performed component by component for vectors or matrices. Thus the MATLAB
command
r = (A*(k+1)«x) ./ (A*k«x)
will cause the desired ratios of components to be printed. If these ratios agree for all
components, we expect we have computed the eigenvalue of maximum magnitude
accurate to all figures shown.
M2. Enterk = 1;x = [1 23)’; A = [3 5 —11; 5 -8 —3; -11 —3 14]; in MATLAB.
(Recall that the final semicolon keeps the data from being displayed.) Then
enter r = (A‘(k+1)«x) ./ (A*k«x) to see the ratios displayed. Now enter k =
5; and use the up-arrow key to have the ratios computed for that &. Continue
using the up-arrow key, setting k successively equal to 10, 15, 20, 25, . . . until
the ratios agree to all places shown. Then change to format long, and continue
until you have found the eigenvalue of maximum magnitude accurate to all
figures shown. Copy it down as your answer to this problem. Check it using
the command eig(A).
M3. Property 2 of Theorem 5.1 indicates that the eigenvalue of minimum
magnitude of A is the reciprocal of the eigenvalue of maximum magnitude of
A~', Continuing the preceding exercise, enter B = A; to save A, and then enter
A = inv(A);. Now enter s = ones(3,1) / r. (Enter heip ones to learn the effect
of the ones(3,1) statement.) Using the up-arrow key to access statements
defining k and computing r and s, compute the eigenvalue of A of minimum
magnitude accurate to all figures shown in the long format. Copy down your
answer. and check it using the eig(B) command.
M4. Raising a matrix A to a high power can generate error, and can cause overflow
(numbers too large to handle) in a computer. Continuing the preceding two
exercises. let us avoid this by only raising A to a low power, say the Sth
power, and replace x by A°x after each iteration. Repeating this m times
should have the same effect as raising A to the power 5m. Of course, now the
entries in the vector x may get large. We compensate for this by norming x
5.2 DIAGONALIZATION 305
be of magnitude | before the next iteration. This does not change the
direction of x, which should swing to parallel the eigenvector b, having
eigenvalue A, of maximum magnitude as the iterations are performed. To
execute this procedure, enter A = B; to recover A as in Exercise M2, and then
enter
x = A%Sex; x = (1/norm(x))}«x; r = (Aex) / x
to counpute 4°x, normalize x, and compute the ratios of components of Ax to
those of x. One repetition of the up-arrow key followed by the Enter key
executes these iterations rapidly. Establish your result in Exercise M2 again,
and then, replacing A by 4~', establish your result in Exercise M3 again.
MS. Explain why the final vector x obtained when finding the eigenvalue of either
maximum or of minimum magnitude in Exercise M4 should be an
eigenvector corresponding to that eigenvalue. Check that this is so using the
[V, D] = eig(A) command explained before Exercise 48.
| 41 $ -3 -4 7 ®-ll 5
-2 6 8 1-2-2 6 7-3 0 2 10
M6. | 6-3 21 M7] 5 4 _3 (M8 | 8 0 2 1 4
8 2-4 5 30g oo -1 2 10S =9
5 10 4 -9 3
DIAGONALIZATION
Recall that a square matrix is called diagonal if all entnes not on the main
diagonal are zero. In the preceding section we indicated the importance of
being able to compute A*x for an m X m matrix A and a column vector x in R’.
In this section, we show that, if A has distinct eigenvalues, then computation
of A‘ can be essentially replaced by computation of D*, where D is a diagonal
matrix with the eigenvalues of A as diagonal entries. Notice that D* is the
diagonal matrix obtained from D by raising each diagonal entry to the power
k, For example,
2 0 of |8 0 O
0-1 oO} =|o-1 Ol.
0 0-2 0 0 -8|
306 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
The theorem that follows shows how to summarize the action of a square
matrix on its eigenvectors in a single matrix equation. This theorem is the first
step toward our goal of diagonalizing a matrix. Theorems stated are valid for
matrices and vectors with complex entries and complex scalars unless we use
the adjective real. We sometimes refer to n-space in this section, meaning
either R" or C’.
|” 0
0 an
Then AC = CD if and only if A,, A, . . . , A, are eigenvalues of A and vy,
is an eigenvector of A corresponding to A, for j = 1, 2,...,n.
PROOF We have
CD=lv,
|
| v, °*° ¥,
Ih. Q
r
AC
= Alv, Vv, -*:+ V,|= Av, Av, *°- Av,}.
COROLLARY 2 Computation of A‘
PROOF Suppose that the conclusion is false, so the eigenvectors V,, V,,..., V,
are linearly dependent. Then one of them is a linear combination of its
predecessors. (See Exercise 37, page 203.) Let v, be the first such vector, so that
SOLUTION We compute
A-- a=" 2
5S 5} jl -l
2/ |0 oP
which yields an eigenvector
-f
which yields an eigenvector
A diagonalization of A is given by
s-coer-[! 8 of
ee |
Wiha
ofan
Wie
Lvl
|
(We omit the computation of C~'.) Thus,
f | 2k 0 5 | _ {noe + § 5(2*) z |
Ak=
LL 2} iy + 2)’
373] 3[-2 £2 5024
the colored sign being used only when k is odd. «
Diagonalization has uses other than computing A‘. The next section
presents an application to differential equations. Section 8.1 gives another
application. Here is an application to geometry in R’.
EXAMPLE 2 Find a formula for the linear transformation T: R’ — R’ that reflects vectors in
the line x + 2y = 0.
SOLUTION We saw in Illustration 2 of Section 5.1 that T has eigenvectors [1, 2]
and [2, —1] with corresponding eigenvalues —1 and 1, respectively. Let
A be the standard matrix representation of 7. Using column-vector nota-
tion, we have
a-coc=[ a) | i 1-4-3]
1 2)f-t O75 5) _ af 3 -4
Thus
r= Ab) = s-a— 3
x}\_ ,f[x|)_ 1] 3x - 4y
1 0 0
A=|-8 4 -6
81 9
of Example 6 in Section 5.1.
SOLUTION Taking r = 5 = t = | in Example 6 of Section 5.1, we see that eigenvalues and
corresponding eigenvectors of A are given by
A=1 4 =6, A =7,
15 0 ay
v= 8], v,=/-3], v,=|]-2).
—16 1 l
If we let
f 15 0 0
C= 8 -3 -2|,
-16 1 #1
then Theorem 5.3 tells us that C is invertible. Theorem 5.2 then shows that
100
C'AC= D=|0 6 O}.
007
—_——_—_—_—”
HISTORICAL NOTE Tue 1DEA OF SIMILARITY, like many matrix notions, appears without 4
definition in works from as early as the 1820s. In fact, in his 1826 work on quadratric forms (
note on page 409), Cauchy showed that if two quadratic forms (polynomials) are related by §
change of variables—that is, if their matrices are similar—then their characteristic equations4
the same. But like the concept of orthogonality, that of similarity was first formally defined af
discussed by Georg Frobenius in 1878. Frobenius began by discussing the general case: he called
two matrices A, D equivalent if there existed invertible matrices P, Q such that D = PAQ. The faster
matrices were called the substitutions through which A was transformed into D.
Frobenius then dealt with the special cases where P = Q7 (the two matrices were then calted
congruent) and where P = Q™' (the similarity case of this section). Frobenius went on to
many results on similarity, including the useful theorem that, if 4 is similar to D, then fA
similar to f(D), where fis any polynomial matrix function.
5.2 DIAGONALIZATION 311
Thus Dis similar to A. We are not eager to check that C"'AC = D; however, it is
easy to check the equivalent statement:
5 0 0
AC = CD= 8 —i8 —14).
-16 6 7 .
1-3 3
A=|0 -5 6}.
0-3 4
SOLUTION We find that the characteristic equation of A is
(1 — A)(-5 — A\(4 — A) + 18) = (1 — AYA? + A — 2)
= (1 — AXA + 2) — 1) = 0.
Thus, the eigenvalues of A are A, = 1, A, = 1, and A, = —2. Notice that 1 isa
root of multiplicity 2 of the characteristic equation; we say that the eigenvalue
1 has algebraic multiplicity 2.
Reducing A — J, we obtain
fo -3 31 fo 1-11
A-I=}]0-6 61~|0 O Oj}.
0-3 3} {@ 0 0
We see that the eigenspace E£, (that 1s, the nullspace of A — J) has dimension 2
and consists of vectors of the form
5
r for any scalars r and s.
r
l 0
v, =|0| and vy, =]1
0 |
corresponding to the eigenvalues A, = A, = |.
Reducing A + 2/, we find that
3 -3 3; |3 0-3
A+27={0-3 6|~/0 1 -2).
0-3 6] {0 0 0
312 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
Therefore, if we take
we should have
1 Q -2
AC
= CD= 0 1 —-4|.
0 1 -2
As we indicated in Example 4, the algebraic multiplicity of an eigenvalue A;
of A is its multiplicity as a root of the characteristic equation of A. Its geometric
multiplicity is the dimension of the eigenspace E,. Of course, the geometric
multiplicity of each eigenvalue must be at least 1, because there always exists a
nonzero eigenvector in the eigenspace. However, it is possible for the algebraic
multiplicity to be greater than the geometric multiplicity.
EXAMPLE 5 Referring back to Examples 4 and 5 in Section 5.1, find the algebraic and
geometric multiplicities of the eigenvalue 2 of the matrix
2 1 0
A=|-1 0 1).
1 3 1
SOLUTION Example 4 on page 292 shows that the characteristic equation of A is
—(A — 2)(A + 1) = 0, so 2 isan eigenvalue of algebraic multiplicity 2. Example
5 on page 293 shows that the reduced form of A — 21 is
1 3-1
0 1 QO}.
00 0
Thus, the eigenspace E,, which is the set of vectors of the form
r
O|forr ER,
r
Section 9.4 describes a technique for finding J. This Jordan canonical form is
as close to a diagonalization of A as we can come. The Jordan canonical form
has applications to the solution of systems of differential equations.
To conclude this discussion, we state a result whose proof requires an
excursion into complex numbers. The proof is given in Chapter 9. Thus far, we
have seen nothing to indicate that symmetric matrices play any significant role
in linear algebra. The following theorem immediately elevates them into a
position of prominence.
“| SUMMARY
a |
Let A be ann Xx n matrix.
EXERCISES
jn Exe srcises 1-8, find the eigenvalues A, and the i. If an n X n matrix A is diagonalizable,
gorresponding eigenvectors v, cf the given matrix there is a unique diagonal matrix D that
A and also find an invertible matrix Cand a is similar to 4.
CAC.
diagonal matrix D such that D = — j. If A and B are similar square matrices.
then det(A) = det(B).
-3 4 32 14. Give two different diagonal matrices that are
2. A=
LA
=
| 4 i ; ‘ i
similar n E1
to the matnx 34 .
6 3-3
4.Az=|-2 -1 2 . Prove that, if a matrix is diagonalizable, so
3, A= 7 |
van 16 8 -7 is its transpose.
16. Let P, Q, and R be n X n matrices.
(-3 10 -6| -3 5 -208 Recall that P is similar to Q if there
-6 6. A= 2 0
5. A -| 0 7
exists an invertible n x n matrix C such that
C"'PC = Q. This exercise shows that
form 4@,,.,, @,, OF a,_,,. Show that if a,,_, and this section, give the best justification you
a,_,; are both positive, both negative, or both can for this statement.
zero for i = 2, 3,..., ”, then A has real
eigenvalues. [HINT: Shew that a diagonal
[Bl 29. The MATLAB command
22 that, if A and B are similar square diagonalizable. (Load the matrices from a matrix
matrices, each eigenvalue of A has the same file if it is accessible.)
geometric multiplicity for A that it has for B.
(See Exercise 18 for the corresponding 18 25 -25 8.3 8.0 —6.0
statement on algebraic multiplicities.) 30.); 1 6 -1 31. |-2.0 0.3 3.0
24, Prove that, if A,, A,,... ,A,, are distinct real 18 34 -25 0.0 00 43
eigenvalues of an n X n real matrix A and if
B; is a basis for the eigenspace E,, then the 24.55 46.60 46.60
union of the bases B; is an independent set 32. |-4.66 -8.07 -9.32
of vectors in R*. [HinT: Make use of Theorem -~9.32 -18.64 -17.39
5.3.]
0.8 -1.6 1.8]
25. Let 7: V > V be a linear transformation of a 33. | -0.6 -3.8 3.4
vector space V into itself. Prove that, if -~20.6 -1.2 9.6
V1, V2... , ¥, are eigenvectors of T
corresponding to distinct nonzero [ 7-20 -5 5
eigenvalues A,, A,,.. . ,A;,, then the set 5 -13 -5 0
{7(v,), T{v,), . . . ,7(v,)} is independent. 4/5 10 75
| § -10 -5 -3
26. Show that the set {e"", 2", ... , “}, where
the A, are distinct, is independent in the 2 § -9 10
vector space W of all functions mapping R 4 9 8 -3
into R and having derivatives of all orders. 3-13 2 012
27. Using Exercise 26, show that the infinite set 7 -6 3 2
{e* | k © R} is an independent set in the -22.7 -26.9 -6.3 —46.5
vector space W described in Exercise 26.
[HinT: How many vectors are involved in any
36, |759-7 —40.9 20.9 -99.5
dependence relation?]
"| 15.9 9.6 -8.4 26.5
43.8 36.5 -7.3 78.2
28. In Section 5.1, we stated that if we allow
complex numbers, then an n x n matrix 66.2 58.0 -11.6 116.0
with entries chosen at random has 37, | 120.6 89.6 -42.6 201.0
eigenvectors that form a basis for n-space "1-210 -150 7.6 35.0
with probability 1. In light of our work in -99.6 -79.0 28.6 -169.4
5.3 TWO APPLICATIONS 317
-253 -232 -96 1088 280] T2513 596 -414 ~2583 1937
213. 204 +93 -879 -225 127-32. 33 132 8!
3. | 90 -90 -47 360 90 40.| -421 94 -83 -434 306
-38 -36 -18 162 40 2610 -615 443 2684 -1994
62 64 42 -251 —57| | 90 -19 29 494 -50
154 -24 -36 —1608 —336 2-4 6 21
-126 16 18 1314 270 6 3-8 12
99.| 54 0 4 -540 -108 4./-4 0 5 12
2440 O -236 —48 4 3 S-IL 3
-42 -12 -18 366 70 16679 #2 3
oe
5.3 | TWO APPLICATIONS
| |
In this section, A will always denote a matrix with real-number entries.
In Section 5.1, we were motivated to introduce eigenvalues by our desire
to compute A*x. Recall that we regard x as an initial information vector of
some process, and A as a matrix that transforms, by left multiplication, an
information vector at.any stage of the process into the information vector at
the next stage. In our first application, we examine this computation of A*x and
the significance of the eigenvalues of A. As illustration, we determine the
behavior of the terms F, of the Fibonacci sequence for large values of n. Our
second appiication shows how diagonalization of matrices can be used to solve
some linear systems of differential equations.
» 0
A,
CAC = D= () .
" a
For any vector x in R’, let d = [d,, d,, . . . , d,] be its coordinate vector relative
to the basis B. Thus,
x=dy,+dy,+-::+dyv n
318 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
Then
EXAMPLE 1 Show that a diagonalizable transition matmx 7 for a Markov chain has no
eigenvalues of magnitude > 1.
SOLUTION Example 3 in Section 5.1 siiows that 1 is an eigenvalue for every transition
matrix of a Markov chain. For every choice of population distribution vector
p, the vector T*p is again a vector with nonnegative entries having sum 1. The
preceding discussion shows that all eigenvalues of 7 must have magnitude = 1;
otherwise, entries in some 7*p would have very large magnitude as k
increases. @
EXAMPLE 2 Find the order of magnitude of the term F, of the Fibonacci sequence Fo, F;,
F,, F;,... —that is, the sequence
0, 1, 1, 2,3, 5, 8, 13,...
for large values of k.
SOLUTION We saw in Section 5.i that, if we let
then
xX, = Ve
1 0
We compute relation (1) for
11 l
A=|i A and x= x= (9)
5.3 TWO APPLICATIONS =—-3319
The characteristic equation of A is
(1 -—AY-A)-
1 =A-A-1=0.
Using the quadratic formula, we find the eigenvalues
1+-V5 1-V5
A, = 2 and A=):
1-V5
A-Al~ 2 I) SO wale |
0 0
n=|Via |
is an eigenvector corresponding to A,. Thus we take
cele 1 vee
To find the coordinate vector d of x, relative to the basis (v,, v,). we observe that
= Cd. We find that
cia, [Vot} A
4V5l1-V5 2)
thus,
Cavs 2)[vai a} @
For large k, the kth power of the eigenvalue A, = (1 + VV5)/2 dominates, so A‘x,
is approximately equal to the shaded portion of Eq. (2). Computing the second
component of A*x,, we find that
n=val(-z) -(a)}
Lyla Vsye (L- V5
(3)
320 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
Thus,
1 lt V5\e
F, = val a} for large k. (4)
Indeed, because jA,| = |(1 — /5)/2| < 1, we see that the contribution from this
second eigenvalue io the right-hand side of Eq. (3) approaches zero as k
increases. Because ]A,‘/5| < 5 for k = 1 and hence for all F, we see that F, can
be characterized as the closest integer to (1/\/5)((1 + 5)/2)* for all k,
Approximation (4) verifies that F, increases exponentially with k, as expected
for a population of rabbits. #
in simple rate of growth problems involving the time derivative of the amount
X present of a single quantity. We may also write this equation as x’ =
where we understand that x is a function of the time variable ¢, In mor
complex growth situations, m quantities may be present in amounts X,, x, --*’
x,. The rate of change of x, may depend not only on the amount of x; present,
but also on the amounts of the other n — 1 quantities present at time /. we
5.3 TWO APPLICATIONS 321
(5)
wilere
x,(t) oy
X(t) x,(d)
x=| .- |, x’=] . |, and A = [a,).
x,(0) x(t)
If the matrix A is a diagonal matrix so that a, = 0 for i # j, then system (8)
reduces to a system of nm equations, each like Eq. (6), namely:
X, = aX
X_ = AyX;
(9)
Xq = AngXy-
The general solution is given by
x,] [ke
X,| | k,e%22!
x=|-]=
Xn k eon!
322 CHAPTER 5 EIGENVALUES AND EIGENVECTORS
A,
D = CAC= ()
") n
x = Cy. (11)
(If we let x = Cy, we can confirm that x’ = Cy’.) The genera! solution of system
(10) is
X= X- XX;
X, = -X, +X- X;
x; = —X, — X, + X;.
-] -]
v,=| I] and v,=| OJ.
0 l
A diagonalizing matnx is then
1-1-1
C=]1 1 O|,
1 0 1
so that
-1 0 0
D=C'AC=| 0 2 O|.
0 0 2
Ni ke"!
Yy| = | ke
V3 ke "|
~~ |
"| SUMMARY
1. Let A be diagonalizable by a matrix C, let x be any column vector, and let
d = C''x. Then
A‘x = dA,*y, + d,A,*v, treet dA. V 3
EXERCISES
y. Let the sequence a, a, a),... be given by In Exercises 6-13, solve the given system of linear
a = 9, a, = I, and a, = (a,_, + aj-,)/2 for differential equations as outlined in the summary.
k2 2.
6. x = 3x, _~ 5X;
a. Find the matnx A that can be used to x, = 2X,
generate this sequence as we used a
matrix to generate the Fibonacci xi= xX, + 4,
sequence in Section 5.1. x = 3x,
6 ORTHOGONALITY
|e: PROJECTIONS
326
6.1 PROJECTIONS 327
Projection p of b on sp(a) in R’
=”
re (1)
<)
EXAMPLE 1 Find the projection p of the vector [1, 2, 3] on sp({2, 4, 3]) in R®.
SOLUTION We let a = [2, 4, 3] and b = [1, 2, 3] in formula (1), obtaining
_b-a,_ 2+8+9 19
Pr aa? 44 16t9%
2944 3h
The Concept of Projection
Let W* be the set of all vectors in R’ that are perpendicular to every vector in
W. Properties of W* will appear shortly in Theorem 6.1. Figure 6.3 gives a
symbolic illustration of this decomposition of b into a sum of a vector in W
and a vector orthogonal to W. Once we have demonstrated the existence of
this decomposition, we will define the projection ofb on W to be the vector by.
The projection b,, of b on W is the vector w in W that is closest to b. That
is, W = by minimizes the distance ||b — w|| from b to W for all w in W. This
seems reasonable, because we have b = by + by» and because b,, is
orthogonal to every vector in W. We can demonstrate algebraically that for any
w € W, we have ||b — wil = ||b — by/|. We work with ||b — wil? so we can use the
dot product. Because the dot product of any vector in Wand any vector in W*
is 0, we obtain, for all w € W,
[|b — wi’ = (b — w) + (b — w)
= ((b — by) + (by— w)) - ((b — by) + (by~ w))
= (b — by) + (b — by) + 2(b — by) + (by — w) + (by — W) + (by — ¥)
in W* in W
= |[b — by? + [lbw — wil = |[b — byl’.
FIGURE 6.3
The decomposition b = b, + by:
6.1 PROJECTIONS 329
ILLUSTRATION 2 ‘Suppose that you are pushing a box across a floor by pushing forward and
downward on the top edge. (See Figure 6.4.) We take as origin the point where
the force is applied to the box and as W the plane through that origin paralic!
to the floor. If F is the force vector applied at the origin, then F,,is the portion
of the force vector that actually moves the body along the floor, and Fy:
(which is directed straight down) is the portion of the force vector that
attempts to push the box into the floor and thereby increases the friction
between the box and the floor. =
FIGURE 6.4
The decomposition of a force vector.
330 CHAPTER 6 ORTHOGONALITY
A =
vy
.
| j
Vy |
Thus, W is the row space of A. Now the nullspace of A consists of all vectors x
in R* that are solutions of the homogeneous system Ax = 0. But Ax = 0 if and
only ifv,-x =Ofori=1,2,...,. Therefore, the nullspace of A is the set of
all vectors x in R" that are orthogonal to each of the rows of A, and hence to the
row space of A. In other words, the orthogonal complement of the row space of
A is the nullspace of A. Thus we have found W+. We summarize this procedure
in a box. Notice that this procedure marks one of the rare occasions in this text
when vectors must be placed as rows of a matrix.
a= [}22 4
Reducing A, we have
|; 22 I [ 2 2 \~(¢ 0 -2 ,
342 3 0-2 -4 0 0 1 2 OF
Therefore, the nullspace of A, which is the orthogonal complement of VW, is the
set of vectors of the form
[2r — s, —2r, r,s] for any scalars r and s.
ILLUSTRATION 3 Note that if Wis the (m — 1}-dimensional solution space in R” of = single linear
equation a,x, + a,x, + +--+ + a,x, = 0, then the vector [a,, a),..-., 4,] of
coefficients is orthogonal to W, because the equation can be written as
PROOF We may assume W # {0}. Let dim(W) = k, and let {v,, v,,..., v,} bea
basis for W. Let A be the & X m matrix having v, as its ith row vector for
i=1,...,k4
For property 1, we have seen that W" is the nullspace of the matrix A, so it
is a subspace of R’.
For property 2, consider the rank equation of A:
rank(A) + nullity(A) = 2.
For property 4, let {v..;, Via2, . - - > ¥,¢ be a basis for W+. We claim that the
set
The sum on the left-hand side is in W, ana the sum on the right-hand side is in
W". Because these sums are equal, they represent a vector that is 11 both W
and W+ and must be orthogonal to itself. The only such vector is the zero
vector, so both sides of Eq. (3) musi equal 0. Because the v, are independent for
| =iskand the v,are independent for x + | =7 =n, we see that ail the scalars
r, and 5, are zero. This shows that set (2) is independent; because it contains n
vectors, it must be a basis for R*. Therefore, for every vector b in R" we can
express bh in the form
by bya
EXAMPLE 3 Fird the projection of b = [2, 1, 5}on the subspace 'V = sp((I, 2, 1], (2, 1, - 1)).
SOLUTION We follow the boxed procedure.
Step 1: Because v, = [1, 2, \] and v, = (2, 1, —1] are independent, they form
a basis for W.
Step 2: A basis for W* can be found by obtaining the nullspace of the
matrix
_{l 2 1
a=() ; 1}
An echelon form of A is
1 2 1
0 -3 -3P
and the nuilspace of A is the set of vectors [7, -7,r], where ris any scalar. I et us
take v, = [1, —1, 1] to form the basis {v,} of W*. (Alternatively, we could havc
computed v, X v, to find a suitable v;.)
Step 3: To find the coordinate vector r of b relative to the ordered
basis (v,, ¥,, ¥;), we proceed as described in Section 3.3, and perform the
reduction
1 2 1/2) f1 2 1] 2] ft 0 1] 4]
2 i -l ~|0 -3 -3 | -3}~]O 1 Oj -1
(1-1 1|5{ [0-3 O| 3} [0 0 -3 | -6
vy. Vv ¥; +b
1 0 0 2
~10 1 0 —1f.
0 0 1 2
Thus, r = (2, -1, 2].
Step 4: The projection of b on Wis
by = 2v, — v, = 2[1, 2, 1] — [2, 1, -1] = [0, 3, 3}.
As acheck, notice that 2v,; = [2, —2, 2] 1s the projection of b on W4, and b =
by + by = [0, 3, 3] + [2, —2, 2] = (2, 1, 5]. 7
334 CHAPTER 6 ORTHOGONALITY
Our next example shows that the procedure described in the box preceding
Example 3 yields formula (1) for the projection of a vector b on sp(a).
EXAMPLE 4 Let a # 0 and b be vectors in R*. Find the projection of b on sp(a), using the
same boxed procedure we have been applying.
SOLUTION We project b on the suuspace W = sp(a) of R”. Let {v,, v5, .. . , v,} be a basis for
W*, and let r = [r,, ,..-, 7,] be the coordinate vector of b relative to the
ordered basis (a, v,,..., V,)- Then
b=ratnytec+
+ryvan
EXAMPLE 5S Find the projection of the vector [3, —1, 2] on the plane x + y + z= O through
the origin in R?>.
SOLUTION Let W be the subspace of R? given by ihe plane x + y + z = 0. Then WM is
one-dimensional, and a generating vector for W* is a = [1, 1, 1), obtained by
taking the coefficients of x, y, and z in this equation. (See Illustration 3.) Let
b = [3, —1, 2]. By formula (1), we have
bus _bra_
= a= _ Taal
3-142 _4
b= fh 1 Ib
Thus by=—_ b — by» —= (3, -1, 2] - 4{1,
4 1, N=— 15 _7 q, Ar2 .
— %, a) (4)
P ~ Ya, a)*
The fact that the projection by; of a vector b on a subspace W is the vector Wi”
W that minimizes ||b — w(|| is useful in many contexts.
For an example involving function spaces, suppose that fis a complicated
function and that W consists of functions that are easily handled, such 4
6.1 PROJECTIONS 335
EXAMPLE 6 Let the inner product of two polynomials p(x) and q(x) in the space P,, of
polynomial functions with domain 0 = x = | be defined by
I, (9)
ea) [car i=(T)E=3
The projection of x on sp(1)" is obtained by subtracting the projection on sp(1)}
from x. We obtain x — 5. As a check, we should then have (!, x — ) =0O0,anda
computation of Jo Oe — })dx shows that this is indeed so. ®
"| SUMMARY
1. The projection of b in R’ on sp(a) for a nonzero vector a in R’ is given by
((b - a)/(a - a))a.
2. The orthogonal complement W* of a subspace W of R’ is the set cf all
vectors in R” that are orthogonal to every vector in W. Further, W is a
subspace of R" of dimension n — dim(W), and (W*)' = W
3. The row space and the nullspace of an m X n matrix A are orthogonal
complements of each other. In particular, W* can be computed as the
nullspace of a matrix A having as its row vectors the vectors in a generating
set for W.
4. Let Wbea subspace of R’. Each vector b in R’ can be expressed uniquely in
the form b = b, + by for by in W and by» in M4,
S. The vectors by and by: are the projections of b on W and on H%,
respectively. They can be computed by means of the boxed procedure on
page 333.
336 CHAPTER 6 ORTHOGONALITY
EXERCISES
In Exercises 1-6, find the indicated projection. 18. The projection of [-1, 0, 1] on the plane
xt+y=0inR
1. The projection of (2, 1] on sp([3, 4]) in i 19. The projection of [0, 0, 1] on the plane
2. The projection of [3, 4] on sp({2, 1]) in R’ 2x-y-z=0inR
3. The projection of [1, 2, 1] on each of the 20. The projection in R* of [—2, 1, 3, —5] on
unit coordinate vectors in R? a. the subspace sp(e,)
4. The projection of [1, 2, 1] on the line with b. the subspace sp(e,, 4)
c. the subsnace sp(e,, €3, €4)
parametric equations x = 3, y= ,z = 21
in R? d. RY
21. The projection of [1, 0, —1, i] on the
5. The projection of [—1, 2, 0, 1] on
subspace sp([!, 0, 0, 0], [0, 1, 1, OJ,
sp((2, —3, 1, 2]) in R*
(0, 0, 1, 1]) in R*
o. The projection of [2, —1, 3, —5] on che line
22. The projection of [0, 1, —1, 0} on the
sp({1, 6, —1, 2)} in R* subspace (hyperplane) x, — x, + x; + x, =0
in & (Hint: See Example 5.]
In Exercises 7~12, find the orthogunal
23. Assume that a, b, and c are vectors in R” and
complement of the given subspace. that W is a subspace of R*. Mark each of the
following True or False.
7. The subspace sp([!, 2, —1]) in R? ——.a. The projection of b on sp(a) is a scalar
8. The line sp((2, —1, 0, —3]) in R* multiple of b.
9. The subspace sp({1, 3, 0], (2, 1, 4]) in R? ___ b. The projection of b on sp(a) is a scalar
multiple of a.
10. The plane 2x + y + 3z = 0in R?
—c. The set of all vectors in R’ orthogonal to
11. The subspace sp{[2, 1, 3, 4], [1, 0, —2, 1]) in every vector in W is a subspace of R’.
R4
___ d. The vector w G W that minimizes
12, The subspace (hyperplane) ax, + bx, + cx, llc — w|| is cy.
+ dx, = 0 in R* (Hint: See Illustration 3.] ___ e. If the projection of b on W is b itself,
13. Find a nonzero vector in R? perpendicular to then b is orthogonal to every vector in W.
[1, 1, 2] and (2, 3, 1] by _— f. Ifthe projection of b on W is b itself,
a. the methods of tiis section, then b is in W.
b. computing a determinant. ___ g. The vector b is orthogonal to every
vector in W if and only if b, = 0.
14. Find a nonzero vector in R‘ perpendicular to _—h. The intersection of W and W* is empty.
{1, 0, ~1, 1), (, 0, —1, 2], and (2, -1, 2, 0] __ i. If b and c have the same projection on
by W, then b = c.
a. the methods of this section,
—— j. Ifb and c have the same projection on
b. computing a determinant. every subspace of R”, then b = c.
24. Let a and b be nonzero vectors in R’, and let
In Exercises 15-22, find the indicated projection. 6 be the angle between a and b. The scalar
||b|] cos @ is called the scalar component of b
15. The projection of [1, 2, 1] on the subspace along a. Interpret this scalar graphically (se¢
sp((3, 1, 2], [1, 0, 1]) in R? Figures 6.1 and 6.2), and give a formula for
16. The projection of [1, 2, |] on the plane it in terms of the dot product.
x+y+z=0i0® 25. Let W be a subspace of R" and let b be a
17. The projection of [1, 0, 0] on the subspace vector in R". Prove that there is one and
sp((2, 1, 1], (1, 0, 2]) in R? only one vector p in W such that b — pis
6.1 PROJECTIONS 337
perpendicular to every vector in W. (Hint: 30. Find the distance from the point (2, 1, 3, 1)
Suppose that p, and p, are two such vectors, in R‘ to the plane sp({1, 0, 1, 0],
aod show that p, — p, is in W*+.] fl, —1, 1, 1]). (Hint: See Exercise 29.]
36. Let A be an m X n matrix.
a. Prove that the set W of row vectors x in
RR” such that xA = 0 is a subspace of R”.
p. Prove that the subspace W in part (a) In Exercises 31-36, use the idea in Exercise 29 to
and the column space of A are orthogonal find the distance from the tip of a to the giver:
complements. one-dimensional subspace (line). [NoTE: To
calculate |lay-||, first calculate ||ay\| and then use
47. Subspaces U and Wof R" are orthogonal if
Exercise 28.}
u-w = 0 for all uin U and all win W. Let
U and W be orthogonal subspaces of R", and
let dim(U) = n — dim(W). Prove that each 31. a=[l, 2, 1],
subspace is the orthogonal complement of W = sp({2, 1, 0]) in &
the other.
3z. a={2,—-1, 3],
98. Let W be a subspace of R’ with orthogonal
W = sp({I,2, 4]) in R?
complement W". Wniting a = a, + a,/1, as
in Theorem 6.1, prove that 33. a = [I, 2, -1, 0),
W = sp((3, 1, 4, —1)) in R*
all = Vflaud? + [anal
(Hint: Use the formula |lal\? = a - a.] 34. a= (2, 1, 1, 2],
29. (Distance from a point to a subspace) Let W W = sp((I,2, 1, 3]) in R*
be a subspace of R". Figure 6.5 suggcsts that 35. a=([l, 2, 3, 4, 5],
the distance from the tip of a in R" to the
W = sp(fl, 1, 1, 1, 1)) in R?
subspace W is equal to the magnitude of the
projection of the vector a on the orthogonal 36. a=[I,0,1,0, 1,0, 1),
complement of W. Find the distance from W = sp({i, 2, 3, 4, 3, 2, 1) in R?
the point (1, 2, 3) in R? to the subspace
(plane) sp([2, 2, 1], [1, 2, 1)).
wt
37. Referring to Example 6, find the projection
of f(x) = 1 on sp(x) in P,.
| | 38. Referring to Example 6, find the projection
!
of f(x) = x on sp(I + x).
|
4 39. Let S and T be nonempty subsets of an
\
inner-product space V with the property that
aw. a : llay:l|
|
every vector in S is orthogonal to every
|
vector in T. Prove that the span of S and the
J span of T are orthogonal subspaces of V.
l
Let {v,,¥), ... , ¥,¢ be an orthogonal set of nonzero vectors in R”. Then
this set is independent and consequently is a basis for the subspace
SP(V1, Ya. + +» Ya):
PROOF To show that the orthogonal set {v,, v,, ... , v,} is independent, let us
suppose that
Taking the dot product of both sides of this equation with v, yields v, - v, = 0,
which contradicts the hypothesis that v, # 0. Thus, no v,is a linear combination
of its predecessors, so {v,, v,,..-, ¥,} iS independent and thus is a basis for
Sp(¥,, ¥.,..., ¥,). (See Exercise 37 on page 203.) a
rN; = boi
V; ° Vv;
Vis
EXAMPLE 2 Find the projection ofb = (3, —2, 2] on the plane zx — y + z = 0 in R?*.
SOLUTION In Example 1, we found an orthogonal basis for the given plane, consisting of
the vectors v, = [—1, 0, 2] and v, = (2, 5, 1]. Thus, the plane may be expressed
as W = sp(v,, v,). Using Eq. (3), we have
“yy b-v
by = ——ly,
b- +——? ",
- Vv; v,° Vv,
- Ie 3
0, 2] + (-Zo)l2 5,1 = y3l-1, 0, 2) ~ 7912, 5,1]
-{1 _1 3 3 37 31 a
340 CHAPTER 6 ORTHOGONALITY
Let Wbea subspace of R". A basis {q,, q,, . . . ,q¢ for Wis orthonormal
if
1. q;° q,)= Ofori # J, Mutually perpendicular
2. q;°
q; = I. Length1
The standard basis ror R" is just one of many orthonormal bases for R’ if
i > |. For example, any two perpendicular vectors v, and v, on the wnit circle
(i!lustrated in Figure 6.6) form an orthonormal basis for R?.
For the projection of a vector on a subspace that has a known orthonormal
basis, Eq. (3) in Theorem 6.3 assumes a simpler form:
vy
FIGURE 6.6
One of many orthonormal bases for R,.
6.2 THE GRAM-SCHMIDT PROCESS 341
SOLUTION Wesee that v,*¥2 = vq ¥3 =v, +¥, = 0. Because [lvl] = Iv = [il] = 2, let a, = 3y
for i = 1, 2, 3, to obtain an orthonormal basis {q,, q,, q,} for W, so that
q, ?2?2??> 2: q = ??? 2: q3 ? ? ? OI
Let W be a subspace of R’, let {a,, a,, .. . , a,} be any basis for W, and
let
W,= sp(ay, a,...,8) for j=1,2,...,k.
Then there is an orthonormal basis {q,, q,, . . . , q,} for W such that
W; = sp(q; G,---35 q).
PROOF Let v, = a,. Forj=2,...,k,let p,be the projection of a,on W,_,, and
let v; = a, — p,. That is, v;is obtained by subtracting from a; its projection on the
subspace generated by its predecessors. Figure 6.7 gives a symbolic illustra-
tion. The decomposition
a, = Pj + (a; ~ P,)
FIGURE 6.7
The vector v; in the Gram-—Schmidt construction.
342 CHAPTER 6 ORTHOGONALITY
s the unique expression for a; as the sum of the vector p,;in W,_, and the vector
~ p,;in(W,_,)*, described in Theorem 6.1. Because a; is in Wi and because p.
is in W,_,, which is itself contained in W,, we see that v; = a; — p; lies in the
subspace W,. Now v, is perpendicular to every vector in W;_, . Consequently, y
is perpendicular to v,, v,,..-, V4. We conclude that each vector in the set
{v,,¥.,...,¥) (5)
1s perpendicular to each of its predecessors. Thus the set (5) of vectors consists
of j mutually perpendicular nonzero vectors in the j-dimensional subspace W,,
and so the set constitutes an orthogonal basis for W,. It follows that, if we set q,
= (1/|lv{)v, for i= 1,2,...,j, then W;= sp(q,,q,,... , q)). Taking
j = k, we see
that
{q,, ,---, 4d
Gram-Schmidt Process
To find an orthonormal basis for 2 subspace W of Re:
1. Find a basis [a, a,,..., aj) for 7
2. Let v, = a. Fors — Peek, compute in succession the yectory
given by subtracting from 2,is projection on the subspace:penerat-
ed by its predecessors
3. They,so obtained form an mnogoneats for W and they may be
normalized to yield an orthpxormal basis,
One may no--nalize the v, forming the vector q, = (1/|lvj|)v, to obtain a vector
of length | at each step of the construction. In that case, formula (6) can be
replaced by the fotlowing simple form:
The arithmetic using formula (6) and that using formula (7) are similar, but
formula (6) postpones the introduction of the radicals from normalizing until
the entire orthogonal basis is obtained. We shall use formula (6) in our work.
However, a computer will generate less error if it normalizes as it goes along.
This is indicated in the next section.
EXAMPLE 4 Find an orthonormal basis for the subspace W = sp([1, 0, 1], (1, 1, 1]) of R?.
SOLUTION We use the Gram-Schmidi process with formula (6), finding first an orthogo-
nal basis for W. We take v, = [1, 0, 1}. From formula (6) with v, = [1, 0, 1] and
a, = {1, 1, 1], we have
An orthogonal basis for W is {[{1, 0, 1], [0, 1, OJ}, and an orthonormal basis is
{{1/V/2, 0, 1/2], [0, 1, O}}. =
Referring to the proof of Theorem 6.4, we see that the sets {v,, v,,... , v}
and {q,, 4, ... , q} are both bases for the subspace W; = spf{a,, a),..., a).
Consequently, the vector a; can be expressed as a linear combination
HISTORICAL NOTE Tue Gram-Scumipt processis named for the Danish mathematician
Jorgen P. Gram (1850-1916) and the German Erhard Schmidt (1876-1959). It was first published
by Gram in 1883 in a paper entitled “Series Development Using the Method of Least Squares.”” It
was published again with a careful proof by Schmidt in 1907 in a work on integral equations. In
fact, Schmidt even referred to Gram’s result. For Schmidt, as for Gram, the vectors were
continuous functions defined on an interval |a, 5] with the inner product of two such functions
¢, # being given as (¢¢{x)y{x) dx. Schmidt was more explicit than Gram, however, writing out the
process in great detail and proving that the set of functions ¢, derived from his original set ¢, was in
fact an orthonormal set.
Schmidt, who was at the University of Berlin from 1917 until his death, is best known for his
definitive work on Hilbert spaces —spaces of square summable sequences of complex numbers. In
fact, he applied the Gram-—Schmidt process to sets of vectors in these spaces to help develop
necessary and sufficient conditions for such sets to be linearly independent.
344 CHAPTER 6 ORTHOGONALITY
| Mori tt Mg
Py Vy
a 4 *** a/=!}q, G °° () ,
| | | Vex
A Q R
COROLLARY 1° Q&-Factorization
EXAMPLE 5 Let
—_—
A=
—>
nea
and solve QR = A for the matrix R. That is, we solve the matrix equation
liv 0
for the entries r,,, r,, and 3). This corresponds to two linear systems of threé
equations each, but by inspection we see that r,, = V2, r,. = V2, and rn = L.
Thus,
11] firvz 0
A=|01{=| 0 fy \’] = oR, .
11) [in0
6.2 THE GRAM-SCHMIDT PROCESS 345
1 !
‘fo ease computations, we replace v, by the parallel vector 3v,, which serves just
as well, obtaining v, = [4, —1, 3, —i]. Finally, we subtract from a, = (1, 0, 1, 1]
its projection on the subspace sp(v,, v,), obtaining
a,°Vv My a.:V
ac,
vy;
= a; —
YVrv v2
° V2
As you can see, the arithmetic involved in the Gram~Schmidt process can
be a bit tedious with pencil and paper, but it is very easy to implement the
process on a computer.
We know that any independent set of vectors in R” can be extended to a
basis for R”. Using Theorem 6.4, we can prove a similar result for orthogonal
sets.
EXAMPLE 7 Expand {f1, 1, 0], [1, —1, 1]} to an oruiogonal basis for R’, and then transform
this to an orthonormai basis for R?.
SOLUTION First we expand the given set to a basis {a,, a), aj} for R?. We take a, = (1, 1, OJ,
a, = [1, —1, 1], and a, = [1, 0, 0], which we can see form a basis for R*. (See
Theorem 2.3.)
Now we use the Gram—Schmidt process with formula (6). Because a, and
a, are perpendicular, we let v, = a, = [1, 1, O] and vy, = a, = [1, —1, 1]. From
formula (6), we have
a,°, a.-yv
Vv, =a; - 3 ly +3 2y,
y° yy Vv,
° Va
-110.0-[564)-[b-$-4
Multiplying this vector by —6, we replace v, by [-1, !. 2]. Thus we have
expanded the given set to an orthogonal basis
{{1, 1, 0], (1, -1, 1], [-1, 1, 2]}
of R’. Normalizing these vectors to unit length, we obtain
{((1/V42, 12, 0], V3, - 1/3, 13), [- 1/6, 1/6, 2-6 J}
as an orthonormal basis. ss
EXAMPLE 8 Find an orthogonal basis for the subspace sp(1, x, x) of the vector space Coi
of continuous functions with domain 0 = x = 1, where (J, g) = [j f(x)g(x) 4
SOLUTION We let v, = | and compute
wa VE ED 1)
(Vx, 2 ve tev
Vx dx 2/3
- Bee -F2
62 THE GRAM-SCHMIDT PROCESS 347
[x ax [ xavx- 2) dx
vy = xX = | - [= 3 Vx - 2)
I, 1 dx | BV x — 2) ax
eye 1/2 -
8-1 yg ya x-to2ovend
_,§
=x 5VX +753
| SUMMARY
| EXERCISES
In Exercises 1-4, verify that the generating set of 3. W=sp((!, -1, -1, 1), U1, 1, 1, 1),
the given subspace W is orthogonal, and find the [-1, 0, 0, 1); b = [2, 1, 3, 1]
Projection of the given vector b on W. 4. W=sp([l, -1, 1, 1, [-1, 1,1, 1),
(1,1, —1, 1);b = [1, 4, 1, 2]
1. W = sp([2, 3, 1], [-1, 1, -1);> = (2, 1,4] 5. Find an orthonormal basis for the plane
2. W = sp((-1, 0, 1], [1 1, sb = 0, 2, 3] 2x t+ iy+z=0.
348 CHAPTER 6 ORTHOGONALITY
6. Find an orthonormal basis for the subspace 22. Find an orthogonal basis for sp({1, 2, 1, 2],
(2, 1, 2, 0]) that contains j1, 2, 1, 2].
W = {[x,, X), X5, Xa] | x, = xX) + 2;,
23. Find an orthogonal basis for sp([2, 1, -1, 1] ’
Xy = —X, + x5}
[1, 1, 3, 0}, [1, 1, 1, 1) that contains
of R‘.
(2, 1, -1, 1] and [I, 1, 3, 0).
(spb
. Find an orthonormal basis for ihe subspace
24. Let B be the ordered orthonormal basis
—
1 -]
O
ft
29. Let A be an 7 X k matrix. Prove that the 35. Find an orthonormal basis for sp(1, e*) for
column vectors of A are orthonormal if and O0=x=1 if the inner product is defined by
only if A7A = 1. SL 8) = Sof(xdg(x) dx.
. Let A be ann X nm matnix. Prove that A has
orthonormal column vectors if and only if A a The routine QRFACTOR in LINTEK allows the
is invertiole with inverse A7'! = A’.
user to enter k independent row vectors in R" for n
and k at most 10. The program can then be used
31. Let A be an n X n matrix. Prove that the to find an orthonormal set of vectors spanning the
column vectors of A are orthonormal if and same subspace. It will also exhibit a
only if the row vectors of A are orthonormal. QR-factorization of the n x k matrix A having
[Hint: Use Exercise 30 and the fact that 4 the entered vectors as column vectors.
commutes with its inverse.] For ann X k matrixA of rank k, the
command {Q,R] = qr(A) in MATLAB produces
ann Xn matrix Q whose columns form an
orthonormal basis for R" andann x k
Exercises 32-35 invoive inner-product spaces. upper-triangular matrix R (that is, with 1, = 0 for
i > j) such that A = QR. The first k columns of Q
comprise the n X k mairix Q described in
32. Let V be an inner-product space of
Corollary I of Theorem 6.4, and R is thek x k
dimension 7, and let B be an ordered
matrix R described in Corollary 1 with n — k
orthonormal basis for V. Prove that, for any
rows of zeros supplied at the bottom to make it
vectors a and b in V, the inner product (a, b)
the same size as A.
is equal to the dot product of the coordinate
vectors of a and b relative to B. (See Exercise In Exercises 36-38, use MATLAB or
24 for an illustration.) LINTEK as just described to check the answers
you gave for the indicated preceding exercise.
33. Find an orthonomnnal basis for sp(sin x, (Note that the order you took for the vectors in the
cos x) for 0 = x = wif the inner product is Gram-Schmidt process in those exercises must be
defined by (f, g) = JUs()g() ax. the same as the order in which you supply them
34. Find an orthonormai basis for sp(1, x, x2) in the Software to be able to check your answers.)
for —1 = x < 1 if the inner product is 36. Exercises 7-12 37. Exercises 17, 20~23
defined by (f, g) = J' fg) dx. 38. Exercises 26-28
Let .4 be the n X n matrix with column vectors a,, a), . . . ,4,. Recall that these
vectors form an orthonormal basis for R" if and only if
a; = ? if i#j, ala,
a,° a;
1 if i=j llall=1
Because
a; a, eo « e
350 CHAPTER 6 ORTHOGONALITY
has a, * a; in the ith row and jth column, we see that the columns of A form an
orthonormal basis of R" if and only if
ATA = I. (1)
In computations with matrices using a computer, it is desirable to use matrices
satisfying Eq. (1) as much as possible, as we discuss later in this section.
6 2 -3
is an orthogonal matrix, and find A™'.
SOLUTION We have
2 3 6//2 3 6 100
ATA = 75/3 —6 2\|3 -6 2/=/0
1 O}.
6 2 -3}|/6 2 -3} |001
6.3 ORTHOGONAL MATRICES 351
PROOF For property 1, we need only recall that the dot product x - y of two
column vectors can be found by using tne matrix multiplication (x")y. Because
A is orthogonal, we know that A7A = J, so
such that Zy? = Ex. He found that the coefficients of the substitution must satisfy the
orthogonality property
One can even trace orthogonal systems of coefficients back to seventeenth- and eighteenth-
century works in analytic geometry, when rotations of the plane or of 3-space are given in order to
transform the equations of curves or surfaces. Expressed as matrices, these rotations would give
orthogonal ones.
The formal definition of an orthogonal matrix, however, and a comprehensive discussion
appeared in an 1878 paper of Georg Ferdinand Frobenius (1849-1917) entitled “On Linear
Substitutions and Bilinear Forms.” In particular, Frobenius dealt with the eigenvalues of such a
matrix.
Frobenius, who was a futl professor in Zurich and laier in Berlin, made his major
mathematical contribution in the area of group theory. He was instrumental in developing the
concept of an abstract group, as well as in investigating the theory of finite matnx groups and
group characters.
352 CHAPTER 6 ORTHOGONALITY
For property 2, the length of a vector can be defined in terms of the dot
product—namely, ||x|]]| = x - x. Because multiplication by A preserves the
dot product, it must preserve the length.
For property 3, the angle between nonzero vectors x and y can be defined
in terms of the dot product—namely, as
_jf{—_X Ye
COS Vx x wr}
Because multiplication by A preserves the dot product, it must preserve
angles. a
EXAMPLE 2 Let v be 2 vector in R? with coordinate vector [2, 3, 5] relative to some ordered
orthonormal basis (a, a), a) of R?. Find |v].
SOLUTION We have v = 2a, + 3a, + 5a;, which can be expressed as
| 2
v= Ax, where A=|a, a, a,{ and x ={|3).
| 5
A(x + e) = Ax + Ae.
The new error vector is Ae. If the matrix A is orthogonal, we know that
\|Ael| = lel], so the magnitude of the error vector remains the same under
multiplication by A. We express this important fact as follows:
If A is not orthogonal, ||Ae|| can be a great deal larger than |lell. Repeated
mv'tiplication by nonorthogonal matrices can cause the error vector to blow
up to such an extent that the final answer is meaningless. To take advantas°
6.3 ORTHOGONAL MATRICES 353
(Av,)"v, = (v,7A7)v,.
Therefore,
fA,(V, ° ¥,)] = v,7ATv,. (2)
The results stated for real symmetric matrices in Section 5.2 tell us that an
n X n real symmetric matnx A has only real numbers as the roots of its
characteristic polynomial, and that the algebraic multiplicity of each eigen.
value is equal to its geometric multiplicity; therefore, we can find a basis for R"
consisting of eigenvectors of A. Using the Gram~Schmidt process, we can
modify the vectors of the basis in each eigenspace to be an orthonormal set.
Theorem 6.7 then tells us that the basis vectors from different eigenspaces are
also perpendicular, so we obtain a basis of mutually perpendicular real
eigenvectors of unit length. We can take as the diagonalizing matrix C, such
that C-'AC = D, an orthogonal matrix whose column vectors consist of the
vectors in this orthonormal basis for R’. We summarize our discussion in a
theorem.
a=)
_ {12
4
SOLUTION The eigenvalues of A are the roots of the characteristic equation
HSA
1-A
2 Jew sina
24 2 gL
a-a=a-[}3]-|
which yields the eigenspace
For A, = 5, we have
a)
which yields the eigenspace
6-year
Thus,
B )-say
is a diagonalization of A, and
[2aL 4
as a
is an orthogonal diagonalization of A. 8
A, = -1, Ea ?
—ty; j-1
A,=A,;=2, E,=sp of ’ ;
VY. V3
Notice that vectors v, and v, in E, are orthogonal to the vector v, in £,, as must
be the case for this symmetric matrix A. The vectors v, and v, in £, are not
orthogonal, but we can use the Gram-—Schmidt process to find an orthogonal
basis for E,. We replace v, by
. -1]- ,{-!] [-1 -1
-HoMy, = 01-5 1}=|—1/2], or by {-I11.
V2 ° V2 1 0 l 2
Thus {[1, 1, 1], [—1, 1, 0], [-1, —1, 2]} is an orthogonal basis for R? of
eigenvectors of A. An orthogonal diagonalizing matrix C is obtained by
normalizing these vectors and taking the vectors in the resulting orthonormal
basis as column vectors in C. We obtain
V3 1/2 - 1/6
C= WWV3 WN - V/V.
W330 216
356 CHAPTER 6 ORTHOGONALITY
Orthogonal Linear Transformations
For example, the linear transformation that reflects the plane R? in a line
containing the origin clearly preserves both the angle @ between vectors u and v
and the magnitudes of the vectors. Because the dot product in R’ satisfies
EXAMPLE 5 Show that the linear transformation 7: R? > R? defined by T([x,, %2. %3]) 7
[x [V2 + xfV72, x, —x,{W2 + 4/2] is orthogonal.
6.3 ORTHOGONAL MATRICES 357
SOLUTION The orthogonality of the transformation follows from the fact that the
standard matrix representation
Ww72 0 WN
Az=| 0 1 4
-WV2 0 Wnv?2
is an orthogonal matrix. =
SUMMARY
1. A square n × n matrix A is orthogonal if it satisfies any one (and hence all) of these three equivalent conditions:
   a. The rows of A form an orthonormal basis for Rⁿ.
   b. The columns of A form an orthonormal basis for Rⁿ.
   c. The matrix A is invertible, and A⁻¹ = Aᵀ.
2. Multiplication of column vectors in Rⁿ on the left by an n × n orthogonal matrix preserves length, dot product, and the angle between vectors. Such multiplication is computationally stable.
3. A linear transformation of Rⁿ into itself is orthogonal if and only if it preserves the dot product, or (equivalently) if and only if its standard matrix representation is orthogonal, or (equivalently) if and only if it maps unit vectors into unit vectors.
4. The eigenspaces of a symmetric matrix A are mutually orthogonal, and A has n mutually perpendicular eigenvectors.
5. A symmetric matrix A is diagonalizable by an orthogonal matrix C. That is, there exists an orthogonal matrix C such that D = C⁻¹AC is a diagonal matrix.
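Summary items 4 and 5 are easy to check numerically. The following sketch is an illustration only (it is not part of the text and assumes the NumPy library); the symmetric matrix A is a hypothetical one chosen just for the demonstration. It verifies that the matrix C of normalized eigenvectors is orthogonal and that CᵀAC is diagonal.

    import numpy as np

    # A hypothetical symmetric matrix, chosen only for illustration.
    A = np.array([[2.0, 1.0, 1.0],
                  [1.0, 2.0, 1.0],
                  [1.0, 1.0, 2.0]])

    # eigh is intended for symmetric matrices: it returns real eigenvalues
    # and an orthonormal set of eigenvectors as the columns of C.
    eigenvalues, C = np.linalg.eigh(A)

    # C is orthogonal: its transpose is its inverse.
    print(np.allclose(C.T @ C, np.eye(3)))        # True

    # C^(-1) A C = C^T A C is the diagonal matrix D of eigenvalues.
    D = C.T @ A @ C
    print(np.allclose(D, np.diag(eigenvalues)))   # True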
"| EXERCISES
In Exercises 1-4, verify that the given matrix is 12. Let (a,, a,, a,, a,) be an ordered orthonormal
orthogonal, and find its inverse. basis for R‘, and let (2, 1, 4, —3] be the
coordinate vector of a vector b in 1X‘ relative
to this basis. Find |{bjl.
u
ulw
via
2&2
1. wv 1 2.|_
oOo
ule
ote In Exercises 13-18, find a matrix C such that
SC
SO
—
D = C'AC is an orthogonal diagonalization of
I-11 1 the given syinmetric matrix A.
2-3 6 gil boat
|=
3. 3 ¢2 23 “ar a -tot
6 12. 12 32
- Lod ot-t
~~
A 4. | Ol
If A and D are square :natrices, D is diagonal, 2 10 joi
and AD is orthogonal, then (AD)! = {AD)' and 15.|1 2! 16.1321
D“'A-' = DTA’ so that A“' = DD™A? = D?A™. In 10 1 2 l1 10
Exercises 5~8, find the inverse of each matrix A
by first finding a diagonal matrix D so that AD Oo 1 Lt O 1-1 0 0
has column vectors of length 1, and then applying 1-2 2 1 -!1 1 0 0
the formula A“! = D’A’. oS | 18. 00 1 3
oO ft tt O 00 3 #41
3 0 8
s.[ -1
| 3]3 6. |-4 0 6 19. Mark each of the following True or False.
4 010 ___ a. A square matrix is orthogonal if its
column vectors are orthogonal.
7.
-12
4-3
6 6
2
623 8
212 1-3 #1
_— b. Every orthogonal matrix has nullispace
{0}.
_—.c. If A? is orthogonal, then A is orthogonal.
4 2 1 3-1 _—d. If A is an n X n symmetric orthogonal
matrix, then A? = J.
9. Supply a third column vector so that the
___e. IfA is ann X mn symmetric matrix such
matrix
that A? = J, then A is orthogonal.
V3 V2 ___ f. If A and B are orthogonal n x n matrices,
V3 0 is orthogonal. then AB is orthogonal.
UWV3 -1nW2 _—. g. Every orthogonal linear transformation
catries every unit vector into a unit
10. Repeat Exercise 9 for the matnx vector.
2 V5 ___h. Every linear transformation that carries
each unit vector into a unit vector is
3 -2V13 orthogonal.
___ i. Every map of the plane into itself that is
é7 0 an isometry (that is, preserves distance
between points) is given by an orthogonal
11. Let (a,, a), a,) be an ordered orthonormal linear transformation.
basis for R’, and let b be a unit vector with _— j. Every map of the plane into itself that
coordinate vector ; i q relative to this an isometry and that leaves the origin
fixed is given by an orthogonal linear
basis. Find all possible values for c. transformation.
explained in Section 6.1, we see that the projection vector p is the unique vector satisfying the following two properties. (See Figure 6.10 for an illustration when dim(W) = 2.)
FIGURE 6.10  The projection of b on sp(a₁, a₂).
x = [x₁, x₂, ..., x_k]   for any scalars x₁, x₂, ..., x_k,
r = [r₁, r₂, ..., r_k]   for some scalars r₁, r₂, ..., r_k.
In other words, the dot product of the vectors x and Aᵀb − AᵀAr must be zero for all vectors x. This can happen only if the vector Aᵀb − AᵀAr is itself the zero vector. (See Exercise 41 in Section 1.3.) Therefore, we have
b_W = A(AᵀA)⁻¹Aᵀb.   (3)
We leave as Exercise 12 the demonstration that the formula in Section 6.1 for projecting b on sp(a) can be written in the form of formula (3), using matrices of appropriate size.
Our first example reworks Example 1 in Section 6.1, using formula (3).
EXAMPLE 1  Using formula (3), find the projection of the vector b on the subspace W = sp(a), where
b = [1, 2, 3]   and   a = [2, 4, 3].
SOLUTION  Let A be the 3 × 1 matrix whose single column is a. Then AᵀA = [29]. Putting this into formula (3), we find the projection b_W = A(AᵀA)⁻¹Aᵀb of b on W.
For a subspace W = sp(a₁, ..., a_k) of Rⁿ, with A the n × k matrix having the aⱼ as column vectors, the projection matrix for W is
P = A(AᵀA)⁻¹Aᵀ.
EXAMPLE 2  Find the projection matrix for the x₂,x₃-plane in R³.
SOLUTION  The x₂,x₃-plane is the subspace W = sp(e₂, e₃), where e₂ and e₃ are the column vectors of the matrix
A = [ 0 0
      1 0
      0 1 ].
We find that AᵀA = I, the 2 × 2 identity matrix, and that the projection matrix is
P = A(AᵀA)⁻¹Aᵀ = [ 0 0; 1 0; 0 1 ][ 0 1 0; 0 0 1 ] = [ 0 0 0
                                                        0 1 0
                                                        0 0 1 ].
Notice that P projects each vector b = [b₁, b₂, b₃] in R³ onto Pb = [0, b₂, b₃].
EXAMPLE 3  Find the matrix that projects vectors in R³ on the plane 2x − y − 3z = 0. Also, find the projection of a general vector b in R³ on this plane.
SOLUTION  We observe that the given plane contains the zero vector and can therefore be written as the subspace W = sp(a₁, a₂), where a₁ and a₂ are any two nonzero and nonparallel vectors in the plane. We choose
a₁ = [0, 3, −1]   and   a₂ = [1, 2, 0],
so that
A = [ 0 1
      3 2
     −1 0 ].
Then
AᵀA = [ 10 6; 6 5 ]   and   (AᵀA)⁻¹ = (1/14)[ 5 −6; −6 10 ],
so
P = A(AᵀA)⁻¹Aᵀ = (1/14)[ 10  2  6
                          2 13 −3
                          6 −3  5 ].
Each vector b = [b₁, b₂, b₃] in R³ projects onto the vector b_W = Pb.
We can use property 2 as a partial check for errors in long computations that
lead to P, as in Example 3. These two properties completely characterize the
projection matrices, as we now show.
In Example 2, we saw that the projection matrix for the x₂,x₃-plane in R³ has the very simple description
P = [ 0 0 0
      0 1 0
      0 0 1 ].
The usually complicated formula P = A(AᵀA)⁻¹Aᵀ was simplified when we used the standard unit coordinate vectors in our basis for W. Namely, we had
A = [ 0 0; 1 0; 0 1 ],   so   AᵀA = [ 0 1 0; 0 0 1 ][ 0 0; 1 0; 0 1 ] = [ 1 0; 0 1 ] = I.
Thus, (AᵀA)⁻¹ = I, too, which simplifies what is normally the worst part of the computation in our formula for P. This simplification can be made in computing P for any subspace W of Rⁿ, provided that we know an orthonormal basis {a₁, a₂, ..., a_k} for W. If A is the n × k matrix whose columns are a₁, a₂, ..., a_k, then we know that AᵀA = I. The formula for the projection matrix becomes
P = A(AᵀA)⁻¹Aᵀ = AIAᵀ = AAᵀ.
We box this result for easy reference.
P = AAᵀ,   (4)
where A is the n × k matrix having column vectors a₁, a₂, ..., a_k.
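The formulas P = A(AᵀA)⁻¹Aᵀ and P = AAᵀ are simple to verify numerically. The following sketch is only an illustration (it is not from the text and assumes the NumPy library); the plane and the basis vectors are the ones used in Example 3 above, and the orthonormal basis needed for formula (4) is produced here by a QR factorization.

    import numpy as np

    # Basis for the plane 2x - y - 3z = 0, as chosen in Example 3.
    A = np.array([[ 0.0, 1.0],
                  [ 3.0, 2.0],
                  [-1.0, 0.0]])

    # General projection-matrix formula: P = A (A^T A)^(-1) A^T.
    P = A @ np.linalg.inv(A.T @ A) @ A.T
    print(np.round(14 * P))          # integer entries of 14P, as in Example 3

    # P is symmetric and idempotent, as a projection matrix must be.
    print(np.allclose(P, P.T), np.allclose(P @ P, P))

    # With an orthonormal basis for W (columns of Q), formula (4) applies:
    # P = Q Q^T yields the same projection matrix.
    Q, _ = np.linalg.qr(A)
    print(np.allclose(Q @ Q.T, P))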
EXAMPLE 4  Find the projection matrix for the subspace W = sp(a₁, a₂) of R³ if
a₁ = [1/√3, −1/√3, 1/√3]   and   a₂ = [1/√2, 1/√2, 0].
Find the projection of each vector b in R³ on W.
SOLUTION  Let
A = [ 1/√3  1/√2
     −1/√3  1/√2
      1/√3    0  ].
Then AᵀA = I, the 2 × 2 identity matrix, so
P = AAᵀ = [ 5/6  1/6  1/3
            1/6  5/6 −1/3
            1/3 −1/3  1/3 ].
Each column vector b = [b₁, b₂, b₃] in R³ projects onto
b_W = Pb = [ (5b₁ + b₂ + 2b₃)/6, (b₁ + 5b₂ − 2b₃)/6, (b₁ − b₂ + b₃)/3 ].
Alternatively, we can compute b_W directly, using boxed Eq. (4) in Section 6.2. We obtain
b_W = (b · a₁)a₁ + (b · a₂)a₂ = (1/3)(b₁ − b₂ + b₃)[1, −1, 1] + (1/2)(b₁ + b₂)[1, 1, 0],
which is the same vector.
SUMMARY
Let {a₁, a₂, ..., a_k} be a basis for a subspace W of Rⁿ, and let A be the n × k matrix having aⱼ as jth column vector, so that W is the column space of A.
EXERCISES
In Exercises 1-8, find the projection matrix for the given subspace, and find the projection of the indicated vector on the subspace.
1. [1, 2, 1] on sp([2, 1, −1]) in R³
   [1, 2, 1] on sp([3, 0, 1], [1, 1, 1]) in R³
8. [1, 1, 2, 1] on sp([1, 1, 1, 1], [1, −1, 1, −1], [−1, 1, 1, −1]) in R⁴
9. Find the projection matrix for the x₁,x₂-plane in R³.
10. Find the projection matrix for the x₁,x₃-coordinate subspace of R⁴.
11. Find the projection matrix for the x₁,x₂,x₃-coordinate subspace of R⁴.
12. Show that boxed Eq. (3) of this section reduces to Eq. (1) of Section 6.1 for projecting b on sp(a).
13. Give a geometric argument indicating that every projection matrix is idempotent.
14. Let a be a unit column vector in Rⁿ. Show that aaᵀ is the projection matrix for the subspace sp(a).
15. Mark each of the following True or False.
   a. A subspace W of dimension k in Rⁿ has associated with it a k × k projection matrix.
   b. Every subspace W of Rⁿ has associated with it an n × n projection matrix.
   c. Projection of Rⁿ on a subspace W is a linear transformation of Rⁿ into itself.
   d. Two different subspaces of Rⁿ may have the same projection matrix.
   e. Two different matrices may be projection matrices for the same subspace of Rⁿ.
   f. Every projection matrix is symmetric.
   g. Every symmetric matrix is a projection matrix.
   h. An n × n symmetric matrix A is a projection matrix if and only if A² = I.
   i. Every symmetric idempotent matrix is the projection matrix for its column space.
   j. Every symmetric idempotent matrix …
16. Show that the projection matrix P = … in R⁴.
17. What is the projection matrix for the subspace Rⁿ of Rⁿ?
18. Let U be a subspace of W, which is a subspace of Rⁿ. Let P be the projection matrix for W, and let R be the projection matrix for U. Find PR and RP. [Hint: Argue geometrically.]
19. Let P be the projection matrix for a k-dimensional subspace of Rⁿ.
   a. Find all eigenvalues of P.
   b. Find the algebraic multiplicity and the geometric multiplicity of each eigenvalue found in part (a).
   c. Explain how we can deduce that P is diagonalizable, without using the fact that P is a symmetric matrix.
20. Show that every symmetric matrix whose only eigenvalues are 0 and 1 is a projection matrix.
21. Find all invertible projection matrices.
In Exercises 22-28, find the projection matrix for the subspace W having the given orthonormal basis. The vectors are given in row notation to save space in printing.
22. W = sp(a₁, a₂) in R³, where a₁ = [1/√2, 0, −1/√2] and a₂ = [1/√3, −1/√3, 1/√3]
23. W = sp(a₁, a₂) in R³, where a₁ = … and a₂ = [0, 0, 1]
PROBLEM 1  According to Hooke's law, the distance that a spring stretches is proportional to the force applied. Suppose that we attach four different weights a₁, a₂, a₃, and a₄ in turn to the bottom of a spring suspended vertically. We measure the four lengths b₁, b₂, b₃, and b₄ of the stretched spring, and the data in Table 6.1 are obtained. Because of Hooke's law, we expect the data points (aᵢ, bᵢ) to be close to some line with equation
y = f(x) = r₀ + r₁x,
where r₀ is the unstretched length of the spring and r₁ is the spring constant. That is, if our measurements were exact and the spring ideal, we would have bᵢ = r₀ + r₁aᵢ for specific values r₀ and r₁.
TABLE 6.1
aᵢ = Weight    2     4     5     6
bᵢ = Length    6.5   8.5   11.0  12.5
In Problem 1, we have only the two unknowns r₀ and r₁; and in theory, just two measurements should suffice to find them. In practice, however, we expect to have some error in physical measurements. It is standard procedure to make more measurements than are theoretically necessary in the hope that the errors will roughly cancel each other out, in accordance with the laws of probability. Substitution of each data point (aᵢ, bᵢ) from Problem 1 into the equation y = r₀ + r₁x gives a single linear equation in the two unknowns r₀ and r₁. The four data points of Problem 1 thus give rise to a linear system of four equations in only two unknowns. Such a linear system with more equations than unknowns is called overdetermined, and one expects to find that the system is inconsistent, having no actual solution. It will be our task to find values for the unknowns r₀ and r₁ that will come as close as possible, in some sense, to satisfying all four of the equations.
We have used the illustration presented in Problem 1 to introduce our goal in this section, and we will solve the problem in a moment. We first present two more hypothetical problems, which we will also solve later in the section.
PROBLEM 2  At a recent boat show, the observations listed in Table 6.2 were made relating the prices bᵢ of sailboats and their weights aᵢ. Plotting the data points (aᵢ, bᵢ), as shown in Figure 6.12, we might expect a quadratic function of the form
y = f(x) = r₀ + r₁x + r₂x²
to fit the data fairly well.
TABLE 6.2
aᵢ = Weight in tons              2   4   5   8
bᵢ = Price in units of $10,000   1   3   5   12
PROBLEM 3  A population of rabbits on a large island was estimated each year from 1991 to 1994, giving the data in Table 6.3. Knowing that population growth is exponential in the absence of disease, predators, famine, and so on, we expect an exponential function
y = f(x) = re^(sx)
to provide the best representation of these data. Notice that, by using logarithms, we can convert this exponential function into a form linear in x:
TABLE 6.3
Year                               1991  1992  1993  1994
Estimated population (thousands)   3     4.5   8     17
FIGURE 6.12  Problem 2 data: price (units of $10,000) versus weight (tons).
or simply as b ≈ Ar. We try to find an optimal solution vector r for the system (1) of approximations. For each vector r, the error vector Ar − b measures how far our system (1) is from being a system of equations with solution vector r. The absolute values of the components of the vector Ar − b represent the vertical distances dᵢ = |r₀ + r₁aᵢ − bᵢ| shown in Figure 6.13.
We want to minimize, in some sense, our error vector Ar − b. A number of different methods for minimization are very useful. For example, one might want to minimize the maximum of the distances dᵢ. We study just one sense of
FIGURE 6.13  The distances dᵢ.
minimization, the one that probably seems most natural at this point. We will find r to minimize the length ||Ar − b|| of our error vector. Minimizing ||Ar − b|| can be achieved by minimizing ||Ar − b||², which means minimizing the sum
d₁² + d₂² + ··· + dₘ²
of the squares of the distances in Figure 6.13. Hence the name method of least squares given to this procedure.
If a₁ and a₂ denote the columns of A in system (1), the vector Ar = r₀a₁ + r₁a₂ lies in the column space W = sp(a₁, a₂) of A. From Figure 6.14, we see geometrically that, of all the vectors Ar in W, the one that minimizes ||Ar − b|| is the projection b_W = Ar of b on W. Recall that we proved this minimization property of b_W algebraically on page 328. Formula (3) in Section 6.4 shows that then Ar = A(AᵀA)⁻¹Aᵀb. Thus our optimal solution vector r is
r = (AᵀA)⁻¹Aᵀb.   (2)
The 2 × 2 matrix AᵀA is invertible as long as the columns of A are independent, as shown by Theorem 6.10. For our matrix A shown in system (1), this just means that not all the values aᵢ are the same. Geometrically, this corresponds to saying that our data points in the plane do not all lie on a vertical line.
Note that the equation r = (AᵀA)⁻¹Aᵀb can be rewritten as (AᵀA)r = Aᵀb. Thus r appears as the unique solution of the consistent linear system
FIGURE 6.14  The length ||Ar − b||.
Least-Squares Solution of Ar = b
Let A be a matrix with independent column vectors. The least-squares solution r of Ar ≈ b can be computed in either of the following ways:
1. Compute r = (AᵀA)⁻¹Aᵀb.
2. Solve (AᵀA)r = Aᵀb.
When a computer is being used, the second method is more efficient.
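Both boxed methods can be sketched in a few lines. The following is an illustration only (not part of the text, assuming the NumPy library); the data values are the Table 6.1 values used in Example 1 below, and the two methods are checked against each other.

    import numpy as np

    # Data points (a_i, b_i) from Table 6.1.
    a = np.array([2.0, 4.0, 5.0, 6.0])
    b = np.array([6.5, 8.5, 11.0, 12.5])

    # Design matrix for the line y = r0 + r1 x.
    A = np.column_stack([np.ones_like(a), a])

    # Method 1: r = (A^T A)^(-1) A^T b.
    r1 = np.linalg.inv(A.T @ A) @ A.T @ b

    # Method 2 (preferred on a computer): solve (A^T A) r = A^T b.
    r2 = np.linalg.solve(A.T @ A, A.T @ b)

    print(np.allclose(r1, r2))
    print(r2)   # about [3.13, 1.53]; Example 1 rounds this to y = 1.5x + 3.1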
EXAMPLE 1  Find the least-squares fit to the data points in Problem 1 by a straight line, that is, by a linear function y = r₀ + r₁x.
SOLUTION  We form the system b ≈ Ar in system (1) for the data in Table 6.1:
[ 6.5, 8.5, 11.0, 12.5 ]ᵀ ≈ [ 1 2; 1 4; 1 5; 1 6 ][ r₀, r₁ ]ᵀ.
We have
AᵀA = [ 4 17; 17 81 ]   and   (AᵀA)⁻¹ = (1/35)[ 81 −17; −17 4 ],
so
r = (AᵀA)⁻¹Aᵀb = (1/35)[ 47 13 −4 −21; −9 −1 3 7 ][ 6.5, 8.5, 11.0, 12.5 ]ᵀ = (1/35)[ 109.5, 53.5 ]ᵀ ≈ [ 3.1, 1.5 ]ᵀ.
Therefore, the linear function that best fits the data in the least-squares sense is y ≈ f(x) = 1.5x + 3.1. This line and the data points are shown in Figure 6.15.
FIGURE 6.15  The least-squares fit of data points.
EXAMPLE 2  Use the method of least squares to fit the data in Problem 3 by an exponential function y = f(x) = re^(sx).
SOLUTION  We use logarithms and convert the exponential equation to an equation that is linear in x:
ln y = ln r + sx.
TABLE 6.4
x = aᵢ       1      2      3      4
y = bᵢ       3      4.5    8      17
z = ln(bᵢ)   1.10   1.50   2.08   2.83
HISTORICAL NOTE A TECHNIQUE VERY CLOSE TO THAT OF LEAST SQUARES was developed by
Roger Cotes (1682-1716), the gifted mathematician who edited the second edition of Isaac
Newton’s Principia, in a work dealing with errors in astronomical observations, written around 1715.
The complete principle, however, was first formulated by Carl Gauss at around the age of 16
while he was adjusting approximations in dealing with the distribution of prime numbers. Gauss
later stated that he used the method frequently over the years—for example, when he did
calculations concerning the orbits of asteroids. Gauss published the method in 1809 and gave a
definitive exposition 14 years later.
On the other hand, it was Adrien-Marie Legendre (1752-1833), founder of the theory of
elliptic functions, who first published the method of least squares, in an 1806 work on determining
the orbits of comets. After Gauss’s 1809 publication, Legendre wrote to him, censuring him for
claiming the method as his own. Even as late as 1827, Legendre was still berating Gauss for
“appropriating the discoveries of others.” In fact, the problem lay in Gauss’s failure to publish
many of his discoveries promptly; he mentioned them only after they were published by others.
From Table 6.3, we obtain the data in Table 6.4 to use for our logarithmic equation. We obtain
A = [ 1 1; 1 2; 1 3; 1 4 ],   AᵀA = [ 4 10; 10 30 ],
(AᵀA)⁻¹ = (1/20)[ 30 −10; −10 4 ] = [ 3/2 −1/2; −1/2 1/5 ],
and
(AᵀA)⁻¹Aᵀ = [ 1  1/2  0  −1/2; −3/10  −1/10  1/10  3/10 ].
Multiplying this last matrix on the right by
[ 1.10, 1.50, 2.08, 2.83 ]ᵀ,
we obtain, from Eq. (3),
[ ln r, s ] = [ .435, .577 ].
Thus r = e^.435 ≈ 1.54, and we obtain y = f(x) = 1.54e^(.577x) as a fitting exponential function.
The graph of the function and of the data points in Table 6.5 is shown in Figure 6.16. On the basis of the function f(x) obtained, we can project the population of rabbits on the island in the year 2000 to be about
f(10) · 1000 ≈ 494,000 rabbits,
unless predators, disease, or lack of food interferes.
TABLE 6.5
aᵢ    bᵢ     f(aᵢ)
1     3      2.7
2     4.5    4.9
3     8      8.7
4     17     15.5
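The log-linear procedure of Example 2 can be reproduced in a few lines. The following sketch is only an illustration (it is not the text's LINTEK routine and assumes the NumPy library); it uses the rabbit-population data of Table 6.3 and recovers r ≈ 1.54 and s ≈ 0.577.

    import numpy as np

    # Rabbit-population data: x = year index, y = estimated population (thousands).
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([3.0, 4.5, 8.0, 17.0])

    # Fit ln y = ln r + s x by ordinary least squares.
    A = np.column_stack([np.ones_like(x), x])
    ln_r, s = np.linalg.solve(A.T @ A, A.T @ np.log(y))

    r = np.exp(ln_r)
    print(round(r, 2), round(s, 3))    # about 1.54 and 0.577

    # Projected value at x = 10, in the same units as y (thousands).
    print(r * np.exp(s * 10))          # roughly 494, i.e. about 494,000 rabbits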
FIGURE 6.16  Data points and the exponential fit y = 1.54e^(.577x).
are obtained from measurements on the ith experiment. For example, the data values aᵢⱼ might be ones that can be controlled and the value bᵢ measured. Of course, there may be errors in the controlled as well as in the measured values. Suppose that we have reason to believe that the data obtained for each experiment should theoretically satisfy the same linear relation
y = r₀ + r₁x₁ + r₂x₂ + ··· + rₙxₙ.   (4)
[ b₁        [ 1  a₁₁  a₁₂  ···  a₁ₙ     [ r₀
  b₂    ≈     1  a₂₁  a₂₂  ···  a₂ₙ       r₁
  ⋮           ⋮                           ⋮
  bₘ ]        1  aₘ₁  aₘ₂  ···  aₘₙ ]     rₙ ].   (5)
The m data points thus give rise to a linear system of the form of system (5), where the matrix A is given by
A = [ 1  a₁₁  ···  a₁ₙ
      1  a₂₁  ···  a₂ₙ
      ⋮
      1  aₘ₁  ···  aₘₙ ].
EXAMPLE 3  Use a computer to find the least-squares fit to the data in Problem 2 by a parabola, that is, by a quadratic function
y = r₀ + r₁x + r₂x².
SOLUTION  We write the data in the form b ≈ Ar, where A has the form of matrix (7):
[ 1        [ 1 2  4      [ r₀
  3    ≈     1 4 16        r₁
  5          1 5 25        r₂ ].
 12 ]        1 8 64 ]
Solving the normal equations (AᵀA)r = Aᵀb, we obtain
r ≈ [ .207, .010, .183 ].
Thus, the quadratic function that best approximates the data in the least-squares sense is
y = f(x) = .207 + .010x + .183x².
TABLE 6.6
aᵢ    bᵢ    f(aᵢ)
2     1     .959
4     3     3.17
5     5     4.83
8     12    12.0
FIGURE 6.17  The graph and data points for Example 3.
EXAMPLE 4  Show that the least-squares linear fit to data points (aᵢ, bᵢ) for i = 1, 2, ..., m by a constant function f(x) = r₀ is achieved when r₀ is the average y-value
(b₁ + b₂ + ··· + bₘ)/m
of the data values bᵢ.
SOLUTION  Such a constant function y = f(x) = r₀ has a horizontal line as its graph. In this situation, we have only the one unknown r₀, and system (5) becomes
[ b₁, b₂, ..., bₘ ]ᵀ ≈ [ 1, 1, ..., 1 ]ᵀ[ r₀ ].
Thus A is the m × 1 matrix with all entries 1. We easily find that AᵀA = [m], so (AᵀA)⁻¹ = [1/m]. We also find that Aᵀb = b₁ + b₂ + ··· + bₘ. Thus, the least-squares solution [Eq. (6)] is
r₀ = (b₁ + b₂ + ··· + bₘ)/m,
as asserted.
The overdetermined system Ax = b has
A = [ 1 0  1
      2 1  1
      1 3  0
      0 2  1
     −1 2 −1 ]   and   b = [ 1, 0, 1, 2, 0 ].
Multiplying both sides of Ax ≈ b by Aᵀ gives the consistent normal system (AᵀA)x = Aᵀb:
[ 7  3 4     [ x₁       [ 2
  3 18 1       x₂   =     7
  4  1 4 ]     x₃ ]       3 ],
whose solution is
x ≈ [ −.614, .421, 1.259 ],
accurate to three decimal places. This is the least-squares approximate solution x of the given overdetermined system.
A = [ 1  a₁
      1  a₂
      ⋮   ⋮
      1  a_k ]   (10)
has first column vector of length √k. However, it may happen that the column vectors in A are mutually perpendicular; we can see then that
AᵀA = [ k   0          so   (AᵀA)⁻¹ = [ 1/k     0
        0  a·a ],                        0   1/(a·a) ].
The matrix A has this property if the x-values a₁, a₂, ..., a_k for the data are symmetrically positioned about zero. We illustrate with an example.
EXAMPLE 6  Find the least-squares linear fit of the data points (−3, 8), (−1, 5), (1, 3), and (3, 0).
SOLUTION  Here
A = [ 1 −3
      1 −1
      1  1
      1  3 ].
We can see why the symmetry of the x-values about zero causes the column vectors of this matrix to be orthogonal. We find that
AᵀA = [ 4 0; 0 20 ],   so   (AᵀA)⁻¹ = [ 1/4 0; 0 1/20 ].
Then
r = (AᵀA)⁻¹Aᵀb = [ 1/4 0; 0 1/20 ][ 16, −26 ]ᵀ = [ 4, −1.3 ]ᵀ,
so the least-squares linear fit is y = 4 − 1.3x.
Suppose now that the x-values in the data are not symmetrically positioned about zero, but that there is a number x = c about which they are symmetrically located. If we make the variable transformation t = x − c, then the t-values for the data are symmetrically positioned about t = 0. We can find the t,y-equation for the least-squares linear fit, and then replace t by x − c to obtain the x,y-equation. (Exercises 14 and 15 refine this idea still further.)
EXAMPLE 7 The number of sales of a particular item by a manufacturer in each of the first
four months of 1995 is given in Table 6.7. Find the least-squares linear fit of
these data, and use it to project sales in the fifth month.
TABLE 6.7
aᵢ = Month            1     2     3     4
bᵢ = Thousands sold   2.5   3     3.8   4.5
SOLUTION  Our x-values aᵢ are symmetrically located about c = 2.5. If we take t = x − 2.5, the data points (t, y) become (−1.5, 2.5), (−.5, 3), (.5, 3.8), (1.5, 4.5). Then
r = (AᵀA)⁻¹Aᵀb = [ 1/4 0; 0 1/5 ][ 13.8, 3.4 ]ᵀ = [ 3.45, .68 ]ᵀ.
Thus, the t,y-equation is y = 3.45 + .68t. Replacing t by x − 2.5, we obtain y = .68x + 1.75, which projects sales of about .68(5) + 1.75 = 5.15 thousand in the fifth month.
To find the least-squares solution vector R⁻¹Qᵀb once Q and R have been found, we first multiply b by Qᵀ, which is a stable computation because Q is orthogonal. We then solve the upper-triangular system
Rx = Qᵀb.   (12)
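The QR approach can be sketched briefly. The following is only an illustration (it is not the QRFACTOR routine and assumes the NumPy library); it uses the overdetermined system of Example 5 above, factors A, forms Qᵀb, and solves the triangular system (12).

    import numpy as np

    # Overdetermined system from Example 5 (least-squares sense).
    A = np.array([[ 1.0, 0.0,  1.0],
                  [ 2.0, 1.0,  1.0],
                  [ 1.0, 3.0,  0.0],
                  [ 0.0, 2.0,  1.0],
                  [-1.0, 2.0, -1.0]])
    b = np.array([1.0, 0.0, 1.0, 2.0, 0.0])

    # Reduced QR factorization: Q is 5 x 3 with orthonormal columns,
    # R is 3 x 3 and upper triangular, and A = QR.
    Q, R = np.linalg.qr(A)

    # Solve R x = Q^T b (Eq. 12).  A production code would use back
    # substitution; np.linalg.solve is used here only for brevity.
    x = np.linalg.solve(R, Q.T @ b)
    print(np.round(x, 3))   # approximately [-0.614, 0.421, 1.259]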
SUMMARY
1. A set of k data points (aᵢ, bᵢ) whose first coordinates are not all equal can be fitted with a polynomial function or with an exponential function, using the method of least squares illustrated in Examples 1 through 3.
2. Suppose that the x-values aᵢ in a set of k data points (aᵢ, bᵢ) are symmetrically positioned about zero. Let A be the k × 2 matrix with first column vector having all entries 1 and second column vector a. The columns of A are orthogonal, and
AᵀA = [ k   0
        0  a·a ].
Computation of the least-squares linear fit for the data points is then simplified. If the x-values of the data points are symmetrically positioned about x = c, then the substitution t = x − c gives data points with t-values symmetrically positioned about zero, and the above simplification applies. See Example 7.
4. The least-squares solution of an overdetermined system Ax = b, where the columns of A are independent, is
x = (AᵀA)⁻¹Aᵀb.
Geometrically, Ax is the projection of b on the column space of A.
5. An alternative to using the formula for x in summary item 4 is to convert the overdetermined system Ax = b to the consistent system (AᵀA)x = Aᵀb by multiplying both sides of Ax = b by Aᵀ, and then to find its unique solution, which is the least-squares solution x of Ax = b.
EXERCISES
1. Let the length b of a spring with an attached weight a be determined by measurements, as shown in Table 6.8.
   a. Find the least-squares linear fit, in accordance with Hooke's law.
   b. Use the answer to part (a) to estimate the length of the spring if a weight of 5 ounces is attached.
   TABLE 6.8
   aᵢ = Weight in ounces   1     2     4     6
   bᵢ = Length in inches   3     4.1   5.9   8.2
. A company had profits (in units of $10,000) …
   b. Use the answer to part (a) to estimate the profit if the sales force is reduced to 25.
   c. Does the profit obtained using the answer to part (a) for a sales force of 0 people seem in any way plausible?
In Exercises 5-7, find the least-squares fit to the given data by a linear function f(x) = r₀ + r₁x. Graph the linear function and the data points.
5. (0, 1), (2, 6), (3, 11), (4, 12)
6. (1, 1), (2, 4), (3, 6), (4, 9)
7. (0, 0), (1, 1), (2, 3), (3, 8)
8. Find the least-squares fit to the data in Exercise 6 by a parabola (a quadratic
   c. … degree at most k − 1 with graph passing through any k points in R² having distinct first coordinates.
   d. The least-squares solution of Ax = b is … and unique.
   e. The least-squares solution of Ax = b can be an actual solution only if A is a square matrix.
   f. The least-squares solution vector of Ax = b is the projection of b on the column space of A.
   g. The least-squares solution vector of Ax = b is the vector x such that Ax is the projection of b on the column space of A.
   h. The least-squares solution vector of Ax = b is the vector x that minimizes the magnitude of Ax − b.
   i. A linear system has a least-squares solution only if the number of equations is greater than the number of unknowns.
   j. Every linear system has a least-squares solution.
15. Repeat Exercise 14, but do not assume that Σᵢ aᵢ = 0. Show that the least-squares linear fit of the data is given by y = r₀ + r₁(x − c), where c = (Σᵢ aᵢ)/m and r₀ and r₁ have the values given in Exercise 14.
In Exercises 16-20, find the least-squares solution of the given overdetermined system Ax = b by converting it to a consistent system and then solving, as illustrated in Example 5.
The routine YOUFIT in LINTEK can be used to illustrate graphically the fitting of data points by linear, quadratic, or exponential functions. In Exercises 27-31, use YOUFIT to try visually to fit the given data with the indicated type of graph. When this is done, enter the zero data suggested to see the computer's fit. Run twice more with the same data points but without trying to fit the data visually, and determine whether the data are best fitted by a linear, quadratic, or (logarithmically fitted) exponential function, by comparing the least-squares sums for the three cases.
27. Fit (1, 2), (4, 6), (7, 10), (10, 14), (14, 19) by a linear function.
28. Fit (2, 2), (6, 10), (10, 12), (16, 2) by a quadratic function.
29. Fit (1, 1), (10, 8), (14, 12), (16, 20) by an exponential function. Try to achieve a lower squares sum than the computer obtains with its least-squares fit that uses logarithms of y-values.
30. Repeat Exercise 29 with data (1, 9), (5, 1), (6, .5), (9, .01).
31. Fit (2, 9), (4, 6), (7, 1), (8, .1) by a linear function.
The routine QRFACTOR in LINTEK has an option to use a QR-factorization of A to find the least-squares solution of a linear system Ax = b, executing back substitution on Rx = Qᵀb as in Eq. (12).
Recall that in MATLAB, if A is n × k, then Q is n × n and R is n × k. Cutting Q and R down to the sizes n × k and k × k, respectively, we can use the command lines
[Q R] = qr(A); [n, k] = size(A);
rref([R(1:k,1:k) Q(:,1:k)'*b])
to compute the solution of Rx = Qᵀb. Use LINTEK or MATLAB in this fashion for Exercises 32-37. You must supply the matrix A and the vector b.
32. Find the least-squares linear fit for the data points (−3, 10), (−2, 8), (−1, 7), (0, 6), (1, 4), (2, 5), (3, 6).
33. Find the least-squares quadratic fit for the data points in Exercise 32.
34. Find the least-squares cubic fit for the data points in Exercise 32.
35. Find the least-squares quartic fit for the data points in Exercise 32.
36. Find the quadratic polynomial function whose graph passes through the points (1, 4), (2, 15), (3, 32).
37. Find the cubic polynomial function whose graph passes through the points (−1, 13), (0, −5), (2, 7), (3, 25).
CHAPTER 7
CHANGE OF BASIS
Recall from Section 3.4 that a linear transformation is a mapping of one vector
space into another that preserves addition and scalar multiplication. Two
vector spaces V and V' are isomorphic if there exists a linear transformation of V onto V' that is one-to-one. Of particular importance are the coordinatization isomorphisms of an n-dimensional vector space V with Rⁿ. One chooses an ordered basis B for V and defines T: V → Rⁿ by taking T(v) = v_B, the coordinate vector of v relative to B, as described in Section 3.3. Such an isomorphism describes V and Rⁿ as being virtually the same as vector spaces. Thus much of the work in finite-dimensional vector spaces can be reduced to computations in Rⁿ. We take full advantage of this feature as we continue our study of linear transformations in this chapter.
We have seen that bases other than the standard basis E of Rⁿ can be useful. For example, suppose A is an n × n matrix. Changing from the standard basis of Rⁿ to a basis of eigenvectors (when this is possible) facilitates computation of the powers Aᵏ of A using the more easily computed powers of a diagonal matrix, as described in Section 5.2. Our work in the preceding chapter gives
another illustration of a desirable change of basis: recall the convenience that
an orthonormal basis can provide. We expect that changing to a new basis will
change coordinate vectors and matrix representations of linear transforma-
tions.
In Section 7.1 we review the notion of the coordinate vector v_B relative to
an ordered basis B, and consider the effect of changing the ordered basis from
B to B'. With this backdrop we turn to matrix representations of a linear
transformation in Section 7.2, examining the relationship between matrix
representations of a linear transformation T: V > V' relative to different
ordered bases of V and of V’.
Let B = (b₁, b₂, ..., bₙ) be an ordered basis for a vector space V. Recall that if v is a vector in V and v = r₁b₁ + r₂b₂ + ··· + rₙbₙ, then the coordinate vector of v relative to B is
v_B = [r₁, r₂, ..., rₙ].
For example, the coordinate vector of cos²x relative to the ordered basis (1, sin²x) in the vector space sp(1, sin²x) is [1, −1]. If this ordered basis is changed to some other ordered basis, the coordinate vector may change. In this section we consider the relationship between two coordinate vectors v_B and v_B' of the same vector v.
We begin our discussion with Rⁿ, which we will view as a space of column vectors. Let v be a vector in Rⁿ with coordinate vector v_B = [r₁, r₂, ..., rₙ], so that
v = r₁b₁ + r₂b₂ + ··· + rₙbₙ.   (1)
Let M_B be the matrix having the vectors in the ordered basis B as column vectors; this is the basis matrix for B, which we display as
M_B = [ b₁  b₂  ···  bₙ ].   (2)
Equation (1), which expresses v as a vector in the column space of the matrix M_B, can be written in the form
M_B v_B = v.   (3)
If B' is another ordered basis for Rⁿ, we can similarly obtain
M_B' v_B' = v.   (4)
Equations (3) and (4) together yield
M_B' v_B' = M_B v_B.   (5)
Equation (5) is easy to remember, and it turns out to be very useful. Both M_B and M_B' are invertible, because they are square matrices whose column vectors are independent. Thus, Eq. (5) yields
v_B' = M_B'⁻¹ M_B v_B.   (6)
Equation (6) shows that, given any two ordered bases B and B' of Rⁿ, there exists an invertible matrix C, namely
C = M_B'⁻¹ M_B,   (7)
such that
v_B' = C v_B.   (8)
If we know this matrix C, we can conveniently convert coordinates relative to B into coordinates relative to B'. The matrix C in Eq. (7) is the unique matrix satisfying Eq. (8). This can be seen by assuming that D is also a matrix satisfying Eq. (8), so that v_B' = D v_B; then C v_B = D v_B for all vectors v_B in Rⁿ. Exercise 41 in Section 1.3 shows that we must have C = D.
The matrix C in Eq. (7) and Eq. (8) is computed in terms of the ordered bases B and B', and we will now change to the subscripted notation C_{B,B'} to suggest this dependency. Thus Eq. (8) becomes
v_B' = C_{B,B'} v_B,
and the subscripts on C, read from left to right, indicate that we are changing from coordinates relative to B to coordinates relative to B'.
Because every n-dimensional vector space is isomorphic to Rⁿ, all our results here are valid for coordinates with respect to ordered bases B and B' in any finite-dimensional vector space. We phrase the definition that follows in these terms.
To compute C_{B,B'}, we form the augmented matrix
[ M_B' | M_B ]
and reduce it (by using elementary row operations) to the form [I | C]. We can regard this reduction as solving n linear systems, one for each column vector of M_B, and all having the same coefficient matrix M_B'. From this perspective, the matrix M_B' times the jth column vector of C (that is, the jth "solution vector") must yield the jth column vector of M_B. This shows that M_B' C = M_B. Consequently, we must have C = M_B'⁻¹ M_B = C_{B,B'}. We now have a convenient procedure for finding a change-of-coordinates matrix.
Finding the Change-of-Coordinates Matrix from B to B' in Rⁿ
Let B and B' be ordered bases of Rⁿ. The change-of-coordinates matrix C_{B,B'} from B to B' is found as follows:
1. Form the augmented matrix [M_B' | M_B].
2. Reduce this matrix, using elementary row operations, to the form [I | C_{B,B'}].
EXAMPLE 1  Let B = ([1, 1, 0], [2, 0, 1], [1, −1, 0]) and let E = (e₁, e₂, e₃) be the standard ordered basis of R³. Find the change-of-coordinates matrix C_{E,B} from E to B, and use it to find the coordinate vector v_B of
v = [2, 3, 4]
relative to B.
SOLUTION  Following the boxed procedure, we place the "new" basis matrix to the left and the "old" basis matrix to the right in an augmented matrix, and then we proceed with the row reduction:
[ 1 2  1 | 1 0 0        [ 1  2  1 |  1 0 0                 [ 1 0 0 | 1/2  1/2 −1
  1 0 −1 | 0 1 0    ~     0 −2 −2 | −1 1 0    ~  ···  ~      0 1 0 |  0    0   1
  0 1  0 | 0 0 1 ]        0  1  0 |  0 0 1 ]                 0 0 1 | 1/2 −1/2 −1 ].
Thus
C_{E,B} = [ 1/2  1/2 −1
             0    0   1
            1/2 −1/2 −1 ].
The coordinate vector of v = [2, 3, 4] relative to B is
v_B = C_{E,B} v = [ −3/2, 4, −9/2 ].
In the solution to Example 1, notice that C_{E,B} is the inverse of the matrix appearing to the left of the partition in the original augmented matrix; that is, C_{E,B} = M_B⁻¹. Taking the inverse of both matrices, we see that C_{B,E} = M_B. Of course, this is true in general, because by our boxed procedure, we find C_{B,E} by reducing the matrix [M_E | M_B] = [I | M_B], and this augmented matrix is already in reduced row-echelon form.
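The boxed row-reduction procedure amounts to solving M_B' C = M_B. The following sketch is only an illustration (it is not from the text and assumes the NumPy library); it uses the bases of Example 1 as reconstructed above, where the "old" basis is E, so that C_{E,B} is simply M_B⁻¹.

    import numpy as np

    # Basis matrix M_B for B = ([1,1,0], [2,0,1], [1,-1,0]).
    M_B = np.array([[1.0, 2.0,  1.0],
                    [1.0, 0.0, -1.0],
                    [0.0, 1.0,  0.0]])

    # Row-reducing [M_B | I] to [I | C] is the same as solving M_B C = I,
    # so C_{E,B} is the inverse of M_B.  (For two general bases one would
    # solve M_Bprime C = M_B, e.g. with np.linalg.solve(M_Bprime, M_B).)
    C_EB = np.linalg.inv(M_B)
    print(np.round(C_EB, 3))

    # Coordinate vector of v = [2, 3, 4] relative to B.
    v = np.array([2.0, 3.0, 4.0])
    print(C_EB @ v)          # [-1.5, 4.0, -4.5]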
In order to find a change-of-coordinates matrix C_{B,B'} for bases B and B' in an n-dimensional vector space V other than Rⁿ, we choose a convenient ordered basis for V and coordinatize the vectors in V relative to that basis, making V look just like Rⁿ. We illustrate this procedure with the vector space P₂ of polynomials of degree at most 2, showing how the work can be carried out with coordinate vectors in R³.
EXAMPLE 2  Let B = (x², x, 1) and B' = (x² − x, 2x² − 2x + 1, x² − 2x) be ordered bases of the vector space P₂. Find the change-of-coordinates matrix from B to B', and use it to find the coordinate vector of 2x² + 3x − 1 relative to B'.
SOLUTION  Let us use the ordered basis B = (x², x, 1) to coordinatize polynomials in P₂. Identifying each polynomial a₂x² + a₁x + a₀ with its coordinate vector [a₂, a₁, a₀] relative to the basis B, we obtain the following correspondence from P₂ to R³:
[  1  2  1 | 1 0 0        [ 1  2  1 | 1 0 0                 [ 1 0 0 |  2  1 −2
  −1 −2 −2 | 0 1 0    ~     0  0 −1 | 1 1 0    ~  ···  ~      0 1 0 |  0  0  1
   0  1  0 | 0 0 1 ]        0  1  0 | 0 0 1 ]                 0 0 1 | −1 −1  0 ],
with the new basis matrix on the left of the partition and the old basis matrix on the right. Thus C_{B,B'} is the matrix to the right of the partition in the final augmented matrix, and the coordinate vector of 2x² + 3x − 1 relative to B' is
C_{B,B'}[2, 3, −1] = [9, −1, −5].
As a check, we have
2x² + 3x − 1 = 9(x² − x) − 1(2x² − 2x + 1) − 5(x² − 2x).
There is another way besides M_B'⁻¹M_B to express the change-of-coordinates matrix C_{B,B'}. Recall that the coordinate vector (bⱼ)_B' of bⱼ relative to B' is found by reducing the augmented matrix [M_B' | bⱼ]. Thus all n coordinate vectors (bⱼ)_B' can be found at once by reducing the augmented matrix [M_B' | M_B]. But this is precisely what our boxed procedure for finding C_{B,B'} calls for. We conclude that
C_{B,B'} = [ (b₁)_B'  (b₂)_B'  ···  (bₙ)_B' ].   (9)
Because V and Rⁿ are isomorphic, we see (as in Example 2) that this matrix is also the change-of-coordinates matrix from B to B'.
There are times when it is feasible to use Eq. (9) to find the change-of-coordinates matrix, just noticing how the vector bⱼ in B can be expressed as a linear combination of vectors in B' to determine the jth column vector of C_{B,B'}, rather than actually coordinatizing V and working within Rⁿ. The next example illustrates this.
EXAMPLE 3  Use Eq. (9) to find the change-of-coordinates matrix C_{B,B'} from the basis B = (x² − 1, x² + 1, x² + 2x + 1) to the basis B' = (x², x, 1) in the vector space P₂ of polynomials of degree at most 2.
SOLUTION  Relative to B' = (x², x, 1), we see at once that the coordinate vectors of x² − 1, x² + 1, and x² + 2x + 1 are [1, 0, −1], [1, 0, 1], and [1, 2, 1], respectively. Changing these to column vectors, we obtain
C_{B,B'} = [ 1 1 1
             0 0 2
            −1 1 1 ].
SUMMARY
1. The vector v_B = [r₁, r₂, ..., rₙ] is the coordinate vector of v relative to B. Associating with each vector v its coordinate vector v_B coordinatizes V, so that V is isomorphic to Rⁿ.
2. Let B and B' be ordered bases of V. There exists a unique n × n matrix C_{B,B'} such that C_{B,B'}v_B = v_B' for all vectors v in V. This is the change-of-coordinates matrix from B to B'. It can be computed by coordinatizing V and then applying the boxed procedure on page 391. Alternatively, it can be computed using Eq. (9).
EXERCISES
8. Let V be a vector space with ordered bases B and B'. If
   C_{B,B'} = [ 3 1 2; 4 1 …; −1 2 1 ]   and   v_B = [2, 5, …],
   find the coordinate vector v_B'.
9. Let V be a vector space with ordered bases B and B'. If
   …
   find the coordinate vector v_B.
In Exercises 10-14, find the change-of-coordinates matrix (a) from B to B', and (b) from B' to B. Verify that these matrices are inverses of each other.
10. B = ([1, 1], [1, 0]) and B' = ([0, 1], [1, 1]) in R²
11. B = ([2, 3, 1], [1, 2, 0], [2, 0, 3]) and B' = ([1, 0, 0], [0, 1, 0], [0, 0, 1]) in R³
12. B = ([1, 0, 1], [1, 1, 0], [0, 1, 1]) and B' = ([0, 1, 1], [1, 1, 0], [1, 0, 1]) in R³
13. B = ([1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]) and the standard ordered basis B' = E for R⁴
14. B = (sinh x, cosh x) and B' = (eˣ, e⁻ˣ) in sp(sinh x, cosh x)
15. Find the change-of-coordinates matrix from B' to B for the bases B = (x², x, 1) and B' = (x² − x, 2x² − 2x + 1, x² − 2x) of P₂ in Example 2. Verify that this matrix is the inverse of the change-of-coordinates matrix from B to B' found in that example.
In Exercises 18-21, use Eq. (9) in the text to find the change-of-coordinates matrix from B to B' for the given vector space V with ordered bases B and B'.
18. V = P₃, B is the basis in Exercise 3, B' = (x³, x², x, 1)
19. … the basis in Exercise 5 …
21. V is the subspace sp(sin²x, cos²x) of the vector space F of all real-valued functions with domain R, B = (cos 2x, 1), B' = (sin²x, cos²x)
22. Find the change-of-coordinates matrix from B' to B for the vector space V and ordered bases B and B' of Exercise 21.
23. Let B and B' be ordered bases for Rⁿ. Mark each of the following True or False.
   a. Every change-of-coordinates matrix is square.
   b. Every n × n matrix is a change-of-coordinates matrix relative to some bases of Rⁿ.
   c. If B and B' are orthonormal bases, then C_{B,B'} is an orthogonal matrix.
   d. If C_{B,B'} is an orthogonal matrix, then B and B' are orthonormal bases.
   e. If C_{B,B'} is an orthogonal matrix and B is an orthonormal basis, then B' is an orthonormal basis.
   f. For all choices of B and B', we have det(C_{B,B'}) = 1.
   g. For all choices of B and B', we have det(C_{B,B'}) ≠ 0.
   h. det(C_{B,B'}) = 1 if and only if B = B'.
   i. C_{B,B'} = I if and only if B = B'.
   j. Every invertible n × n matrix is the change-of-coordinates matrix C_{B,B'} for some ordered bases B and B' of Rⁿ.
24. For ordered bases B and B' in Rⁿ, explain how the change-of-coordinates matrix from B to B' is related to the change-of-coordinates matrices from B to E and from E to B'.
25. Let B, B', and B'' be ordered bases for Rⁿ. Find the change-of-coordinates matrix from B to B'' in terms of C_{B,B'} and C_{B',B''}. [HINT: For a vector v in Rⁿ, what matrix times v_B gives v_B''?]
Let V and V' be vector spaces with ordered bases B = (b₁, b₂, ..., bₙ) and B' = (b₁', b₂', ..., bₘ'), respectively. If T: V → V' is a linear transformation, then Theorem 3.10 shows that the matrix representation of T relative to B,B', which we now denote as R_{B,B'}, is given by
R_{B,B'} = [ T(b₁)_B'  T(b₂)_B'  ···  T(bₙ)_B' ],   (1)
where T(bⱼ)_B' is the coordinate vector of T(bⱼ) relative to B'. Furthermore, R_{B,B'} is the unique matrix satisfying
T(v)_B' = R_{B,B'} v_B   for all v in V.   (2)
Let us denote by M_{T,B} the m × n matrix whose jth column vector is T(bⱼ). From Eq. (1), we see that we can compute R_{B,B'} by row-reducing the augmented matrix [M_B' | M_{T,B}] to obtain [I | R_{B,B'}]. The discussion surrounding Theorem 3.10 shows how computations involving T: V → V' can be carried out in Rⁿ and Rᵐ using R_{B,B'} and coordinates of vectors relative to the basis B of V and B' of V', in view of the isomorphisms of V with Rⁿ and of V' with Rᵐ that these coordinatizations provide.
It is the purpose of this section to study the effect that choosing different bases for coordinatization has on the matrix representations of a linear transformation. For simplicity, we shall derive our results in terms of the vector spaces Rⁿ. They can then be carried over to other finite-dimensional vector spaces using coordinatization isomorphisms.
Let T: Rⁿ → Rᵐ and T': Rᵐ → Rᵏ be linear transformations. Section 2.3 showed that composition of linear transformations corresponds to multiplication
Rⁿ --T--> Rᵐ --T'--> Rᵏ   (bases B, B', B'')   (3)
The following vector and matrix counterpart diagram shows the action of the transformations on vectors:
v_B --R_{B,B'}--> T(v)_B' --R_{B',B''}--> T'(T(v))_B''.
The matrix R_{B,B'} under the first arrow transforms the vector on the left of the arrow into the vector on the right by left multiplication, as indicated by Eq. (2). The matrix under the second arrow acts in a similar way. Thus the matrix product R_{B',B''}R_{B,B'} transforms v_B into T'(T(v))_B''. However, the matrix representation R_{B,B''} of T' ∘ T relative to B,B'' is the unique matrix that carries v_B into T'(T(v))_B'' for all v_B in Rⁿ. Thus we must have
R_{B,B''} = R_{B',B''}R_{B,B'}.   (4)
Notice that the order of the matrices in this product is opposite to the order in which they appear in diagram (3). (See the footnote on page 160 to see why this is so.)
Equation (4) holds equally well for matrix representations of linear transformations of general finite-dimensional vector spaces, as shown in the diagram
V --T--> V' --T'--> V''   (bases B, B', B'').
EXAMPLE 1  Let B = (x⁴, x³, x², x, 1), which is an ordered basis for P₄, and let T: P₄ → P₄ be the differentiation transformation. Find the matrix representation R_{B,B} of T relative to B,B, and illustrate how it can be used to differentiate the polynomial function
p(x) = 3x⁴ − 5x³ + 7x² − 8x + 2.
SOLUTION  Because T(xᵏ) = kxᵏ⁻¹, we see that the matrix representation R_{B,B} in Eq. (1) is
R_{B,B} = [ 0 0 0 0 0
            4 0 0 0 0
            0 3 0 0 0
            0 0 2 0 0
            0 0 0 1 0 ].
Now the coordinate vector of p(x) = 3x⁴ − 5x³ + 7x² − 8x + 2 relative to B is [3, −5, 7, −8, 2], and
T(p(x))_B = R_{B,B} p(x)_B = R_{B,B}[3, −5, 7, −8, 2] = [0, 12, −15, 14, −8].
Thus, p'(x) = T(p(x)) = 12x³ − 15x² + 14x − 8.
The discussion surrounding Eq. (4) shows that the linear transformation of P₄ that computes the fifth derivative has matrix representation (R_{B,B})⁵. Because the fifth derivative of any polynomial in P₄ is zero, we see that (R_{B,B})⁵ = O.
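The matrix representation of Example 1 is easy to build and apply numerically. The following sketch is only an illustration (it is not from the text and assumes the NumPy library); the coordinate order matches the basis B = (x⁴, x³, x², x, 1).

    import numpy as np

    # R_{B,B} for differentiation on P_4, relative to B = (x^4, x^3, x^2, x, 1):
    # column j holds the B-coordinates of the derivative of the j-th basis polynomial.
    R = np.zeros((5, 5))
    for i, k in enumerate([4, 3, 2, 1, 0]):
        if k > 0:
            R[i + 1, i] = k          # derivative of x^k is k x^(k-1)

    p = np.array([3.0, -5.0, 7.0, -8.0, 2.0])   # 3x^4 - 5x^3 + 7x^2 - 8x + 2
    print(R @ p)                                 # [0, 12, -15, 14, -8]

    # The fifth derivative of any polynomial in P_4 is zero, so R^5 = O.
    print(np.allclose(np.linalg.matrix_power(R, 5), 0))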
R_B' = C_{B,B'} R_B C_{B',B}.   (5)
Reading the matrix product in Eq. (5) from right to left, we see that in order to compute T(v)_B' from v_B', if we know R_B, we:
1. change from B' to B coordinates,
2. compute the transformation relative to B coordinates,
3. change back to B' coordinates.
Equation (5) makes this procedure easy to remember.
The matrices C_{B,B'} and C_{B',B} are inverses of each other. If we drop subscripts for a moment and let C = C_{B',B}, then Eq. (5) becomes
R_B' = C⁻¹R_B C.   (6)
Recall that two n × n matrices A and R are similar if there exists an invertible n × n matrix C such that R = C⁻¹AC. (See Definition 5.4.) We have shown that matrix representations of the same transformation relative to different bases are similar. We state this as a theorem in the context of a general finite-dimensional vector space.
R_B' = C⁻¹R_B C,
where C = C_{B',B} is the change-of-coordinates matrix from B' to B. Consequently, R_B' and R_B are similar matrices.
SOLUTION  Here the standard ordered basis E plays the role that the basis B played in Theorem 7.1, and the basis B here plays the role played by B' in that theorem. Of course, the standard matrix representation of T is
A = [ 1 1 1
      1 1 0
      0 0 1 ].
To find R_B, we row-reduce the augmented matrix whose columns are b₁, b₂, b₃ and T(b₁), T(b₂), T(b₃):
[ 1 1 0 | 2 2 2        [ 1  1 0 | 2  2  2                  [ 1 0 0 | 2 1 1
  1 0 1 | 2 1 1    ~     0 −1 1 | 0 −1 −1    ~  ···  ~       0 1 0 | 0 1 1
  0 1 1 | 0 1 1 ]        0  1 1 | 0  1  1 ]                  0 0 1 | 0 0 0 ].
Thus,
R_B = [ 2 1 1
        0 1 1
        0 0 0 ],
and
C = M_B = [ 1 1 0
            1 0 1
            0 1 1 ]
in this case. The matrices A and R_B are similar matrices, both representing the given linear transformation. As a check, we could compute that AC = CR_B.
T(x²) = (x − 1)² = x² − 2x + 1,
T(x) = x − 1,
T(1) = 1,
Diagonalization
Let T: V → V be a linear transformation of an n-dimensional vector space into itself. Suppose that there exists an ordered basis B = (b₁, b₂, ..., bₙ) of V composed of eigenvectors of T. Let the eigenvalue corresponding to bᵢ be λᵢ. Then the matrix representation of T relative to B has the simple diagonal form
D = [ λ₁ 0 ··· 0; 0 λ₂ ··· 0; ⋮ ; 0 0 ··· λₙ ].
FIGURE 7.1  Reflection in the line y = mx.
Then
Tᵏ(v) = d₁λ₁ᵏb₁ + d₂λ₂ᵏb₂ + ··· + dₙλₙᵏbₙ.   (8)
EXAMPLE 5  Consider the vector space P₂ of polynomials of degree at most 2, and let B' be the ordered basis (1, x, x²) for P₂. Let T: P₂ → P₂ be the linear transformation such that
T(1) = 3 + 2x + x²,   T(x) = 2,   T(x²) = 2x².
Find T⁴(x + 2).
SOLUTION  The matrix representation of T relative to B' is
R_B' = [ 3 2 0
         2 0 0
         1 0 2 ].
Using the methods of Chapter 5, we easily find the eigenvalues and eigenvectors of R_B' and of T given in Table 7.1.
TABLE 7.1
Eigenvalue    Eigenvector of R_B'    Eigenvector of T
λ₁ = −1       [−3, 6, 1]             −3 + 6x + x²
λ₂ = 2        [0, 0, 1]              x²
λ₃ = 4        [2, 1, 1]              2 + x + x²
Let B be the ordered basis (−3 + 6x + x², x², 2 + x + x²) consisting of these eigenvectors. We can find the coordinate vector d of x + 2 relative to the basis B by inspection. Because
x + 2 = 0(−3 + 6x + x²) − 1(x²) + 1(2 + x + x²),
we see that
d₁ = 0,   d₂ = −1,   and   d₃ = 1.
By Eq. (8), for any positive integer k we have Tᵏ(x + 2) = −2ᵏx² + 4ᵏ(2 + x + x²). In particular,
T⁴(x + 2) = −16x² + 256(x² + x + 2) = 240x² + 256x + 512.
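The eigenvalue computation in Example 5 can be double-checked by raising R_B' to the fourth power directly. The following sketch is only an illustration (it is not from the text and assumes the NumPy library); it applies (R_B')⁴ to the coordinate vector of x + 2 relative to B' = (1, x, x²).

    import numpy as np

    # Matrix representation of T relative to B' = (1, x, x^2), from Example 5.
    R = np.array([[3.0, 2.0, 0.0],
                  [2.0, 0.0, 0.0],
                  [1.0, 0.0, 2.0]])

    # Coordinate vector of x + 2 relative to B' is [2, 1, 0].
    v = np.array([2.0, 1.0, 0.0])

    # T^4(x + 2) has coordinate vector R^4 v relative to B'.
    print(np.linalg.matrix_power(R, 4) @ v)   # [512, 256, 240], i.e. 512 + 256x + 240x^2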
SUMMARY
EXERCISES
In Exercises 1-14, find the matrix representations R_B and R_B' and an invertible matrix C such that R_B' = C⁻¹R_B C for the linear transformation T of the given vector space with the indicated ordered bases B and B'.
1. T: R² → R² defined by T([x, y]) = [x − y, x + 4y]; B = ([1, 1], [2, 1]), B' = …
2. T: R² → R² defined by T([x, y]) = [2x + 3y, x + 2y]; B = ([1, −1], [1, 1]), B' = ([2, 3], [1, 2])
3. T: R³ → R³ defined by T([x, y, z]) = [x + y, x + z, y − z]; B = ([1, 1, 1], [1, 1, 0], [1, 0, 0]), B' = E
4. T: R³ → R³ defined by T([x, y, z]) = [5x, 2y, 3z]; B and B' as in Exercise 3
5. T: R³ → R³ defined by T([x, y, z]) = [z, 0, x]; B = ([3, 1, 2], [1, 2, 1], [2, −1, 0]), B' = ([1, 2, 1], [2, 1, −1], [5, 4, 1])
6. T: R² → R² defined as reflection of the plane through the line 5x = 3y; B = ([3, 5], [5, −3]), B' = E
… the corresponding eigenspaces of the linear transformation T. Determine whether the linear transformation is diagonalizable.
17. T defined on R² by T([x, y]) = [2x − 3y, −3x + 2y]
18. T defined on R² by T([x, y]) = [x − y, −x + y]
19. T defined on R³ by T([x₁, x₂, x₃]) = [x₁ + 2x₃, …, x₁ + x₃]
… having the same eigenvalues of the same algebraic multiplicities are similar.
24. Prove statement 2 of Theorem 7.2.
25. Prove statement 3 of Theorem 7.2.
26. Let A and R be similar matrices. Prove in two ways that A² and R² are similar matrices: using a matrix argument, and using a linear transformation argument.
27. Give a determinant proof that similar matrices have the same eigenvalues.
CHAPTER 8
EIGENVALUES: FURTHER APPLICATIONS AND COMPUTATIONS
This chapter deals with further applications of eigenvalues and with the
computation of eigenvalues. In Section 8.1, we discuss quadratic forms and
their diagonalization. The principal axis theorem (Theorem 8.1) asserts that
every quadratic form can be diagonalized. This is probably the most impor-
tant result in the chapter, having applications to the vibration of elastic bodies,
to quantum mechanics, and to electric circuits. Presentations of such applica-
tions are beyond the scope of this text, and we have chosen to present a more
accessible application in Section 8.2: classification of conic sections and
quadric surfaces. Although the facts about conic sections may be familiar to
you, their easy derivation from the principal axis theorem should be seen and
enjoyed.
Section 8.3 applies the principal axis theorem to local extrema of
functions and discusses maximization and minimization of quadratic forms
on unit spheres. The latter topic is again important in vibration problems; it
indicates that eigenvalues of maximum and of minimum magnitudes can be
found for symmetric matrices by using techniques from advanced calculus for
finding extrema of quadratic forms on unit spheres.
In Section 8.4, we sketch three methods for computing eigenvalues: the
power method, Jacobi’s method, and the QR method. We attempt to make as
intuitively clear as we can how and why each method works, but proofs and
discussions of efficiency are omitted.
This chapter contains some applications to geometry and analysis that are
usually phrased in terms of points in Rⁿ rather than in terms of vectors. We will be studying these applications using our work with vectors. To permit a natural and convenient use of the classical terminology of points while working with vectors, we relax for this chapter our convention that the boldface x always be a vector [x₁, x₂, ..., xₙ], and allow the same notation to represent the point (x₁, x₂, ..., xₙ) as well.
Quadratic Forms
where not all uᵢⱼ are zero. To illustrate, the general quadratic form in x₁, x₂, and x₃ is
u₁₁x₁² + u₁₂x₁x₂ + u₁₃x₁x₃ + u₂₂x₂² + u₂₃x₂x₃ + u₃₃x₃².   (2)
HISTORICAL NOTE _ IN 1826, Cauchy DISCUSSED QUADRATIC FORMS IN THREE VARIABLES— forms
of the type Ax² + By² + Cz² + 2Dxy + 2Exz + 2Fyz. He showed that the characteristic equation
formed from the determinant
A D E
D B F
E F C
remains the same under any change of rectangular axes, what we would call an orthogonal
coordinate change. Furthermore, he demonstrated that one could always find axes such that the
new form has only the square terms. Three years later, Cauchy generalized the result to quadratic
forms in n variables. (The matrices of such forms are » X n symmetric matrices.) He showed that
the roots A,, A;,..., A, of the characteristic equation are all real, and he showed how to find the
linear substitution that converts the original form to the form λ₁x₁² + λ₂x₂² + ··· + λₙxₙ². In
modern terminology, Cauchy had proved Theorem 5.5, that every real symmetric matrix is
diagonalizable. In the two-variable case, Cauchy’s proof amounts to finding the maximum and
minimum of the quadratic form f(x, y) = Ax² + 2Bxy + Cy² subject to the condition that x² + y² =
1. In geometric terms, the point at which the extreme value occurs is that point on the unit circle
which also lies on the end of one axis of one of the family of ellipses (or hyperbolas) described by
the quadratic form. If one then takes the line from the origin to that point as one of the axes and
the perpendicular to that line as the other, the equation in relation to those axes will have only the
squares of the variables, as desired. To determine an extreme point subject to a condition, Cauchy
uses what is today called the principle of Lagrange multipliers.
[x₁, x₂, x₃] [ u₁₁x₁ + u₁₂x₂ + u₁₃x₃
                       u₂₂x₂ + u₂₃x₃
                               u₃₃x₃ ]
= x₁(u₁₁x₁ + u₁₂x₂ + u₁₃x₃) + x₂(u₂₂x₂ + u₂₃x₃) + x₃(u₃₃x₃)
= u₁₁x₁² + u₁₂x₁x₂ + u₁₃x₁x₃ + u₂₂x₂² + u₂₃x₂x₃ + u₃₃x₃²,
which is again form (2). We can verify that the term involving uᵢⱼ for i ≤ j in the expansion of
[x₁, x₂, ..., xₙ] [ u₁₁ u₁₂ ··· u₁ₙ      [ x₁
                     0  u₂₂ ··· u₂ₙ        x₂
                     ⋮                      ⋮
                     0   0  ··· uₙₙ ]      xₙ ]
is precisely uᵢⱼxᵢxⱼ. Thus, matrix product (3) is a 1 × 1 matrix whose sole entry is equal to sum (1).
[x, y, z] [ 1 −2 6      [ x
            0  0 0        y
            0  0 1 ]      z ].
Just think of x as the first variable, y as the second, and z as the third. The summand −2xy, for example, gives the coefficient −2 in the row 1, column 2 position.
EXAMPLE 2  Expand
[x, y, z] [ −1 3 1      [ x
             2 1 0        y
            −2 2 4 ]      z ]
and find the upper-triangular coefficient matrix for the quadratic form.
SOLUTION  We find that
[x, y, z] [ −1 3 1; 2 1 0; −2 2 4 ][ x; y; z ] = [x, y, z][ −x + 3y + z; 2x + y; −2x + 2y + 4z ]
= x(−x + 3y + z) + y(2x + y) + z(−2x + 2y + 4z)
= −x² + 3xy + xz + 2xy + y² − 2xz + 2yz + 4z²
= −x² + 5xy + y² − xz + 2yz + 4z².
The upper-triangular coefficient matrix is
U = [ −1 5 −1
       0 1  2
       0 0  4 ].
All the nice things that we will prove about quadratic forms come from the fact that any quadratic form f(x) can be expressed as f(x) = xᵀAx, where A is a symmetric matrix.
EXAMPLE 3  Find the symmetric coefficient matrix of the form x² − 2xy + 6xz + z² discussed in Example 1.
SOLUTION  Rewriting the form as
x² − xy − yx + 3xz + 3zx + z²,
we see that the symmetric coefficient matrix is
A = [ 1 −1 3
     −1  0 0
      3  0 1 ].
EXAMPLE 4  Find a substitution x = Ct that diagonalizes the form 3x₁² + 10x₁x₂ + 3x₂², and find the corresponding diagonalized form.
SOLUTION  The symmetric coefficient matrix for the quadratic form is
A = [ 3 5
      5 3 ],
whose eigenvalues are λ₁ = −2 and λ₂ = 8. Row reduction gives
A − λ₁I = [ 5 5; 5 5 ] ~ [ 1 1; 0 0 ]   and   A − λ₂I = [ −5 5; 5 −5 ] ~ [ 1 −1; 0 0 ].
Thus eigenvectors are v₁ = [−1, 1] and v₂ = [1, 1]. Normalizing them to length 1 and placing them in the columns of the substitution matrix C, we obtain the diagonalizing substitution
[ x₁; x₂ ] = C[ t₁; t₂ ] = [ −1/√2  1/√2; 1/√2  1/√2 ][ t₁; t₂ ],
that is,
x₁ = (1/√2)(−t₁ + t₂),   x₂ = (1/√2)(t₁ + t₂).
Substituting these expressions for x₁ and x₂ in the form 3x₁² + 10x₁x₂ + 3x₂² will give the diagonal form
−2t₁² + 8t₂².
In any particular situation, we would have to balance the desire for arithmetic simplicity against the desire to have the new orthogonal basis actually be orthonormal.
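The diagonalization in Example 4 can also be reproduced numerically. The following sketch is only an illustration (it is not from the text and assumes the NumPy library); it finds an orthogonal substitution matrix for 3x₁² + 10x₁x₂ + 3x₂² and confirms the diagonal form −2t₁² + 8t₂².

    import numpy as np

    # Symmetric coefficient matrix of the form 3x1^2 + 10 x1 x2 + 3x2^2.
    A = np.array([[3.0, 5.0],
                  [5.0, 3.0]])

    # Orthonormal eigenvectors (columns of C) and the eigenvalues.
    eigenvalues, C = np.linalg.eigh(A)
    print(eigenvalues)                 # [-2.  8.]

    # Under the substitution x = C t, the form becomes t^T (C^T A C) t,
    # and C^T A C is the diagonal matrix giving -2 t1^2 + 8 t2^2.
    print(np.round(C.T @ A @ C, 10))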
As an orthogonal matrix, C has determinant ±1. (See Exercise 22 in Section 6.3.) We state without proof the significance of the sign of det(C). If det(C) = 1, then the new ordered orthonormal basis B given by the column vectors of C has the same orientation in Rⁿ as the standard ordered basis E; while if det(C) = −1, then B has the opposite orientation to E. In order for B = (b₁, b₂) in R² to have the same orientation as (e₁, e₂), there must be a rotation of the plane, given by a matrix transformation x = Ct, which carries both e₁ to b₁ and e₂ to b₂. The same interpretation in terms of rotation is true in R³, with the
FIGURE 8.1  (a) Rotation of axes, hence (b₁, b₂) and (e₁, e₂) have the same orientation; (b) not a rotation of axes, hence (b₁, b₂) and (e₁, e₂) have opposite orientations.
For the basis shown in Figure 8.1(a),
det(C) = | −1/√2  −1/√2
            1/√2  −1/√2 | = 1.
Counterclockwise rotation of the plane through an angle of 135° carries E into B, preserving the order of the vectors. However, we see that the basis (b₁, b₂) = ([0, −1], [−1, 0]) shown in Figure 8.1(b) does not have the same orientation as E, and this time
det(C) = |  0  −1
           −1   0 | = −1.
EXAMPLE 5  Find a variable substitution that diagonalizes the form 2xy + 2xz, and give the resulting diagonal form.
SOLUTION  The symmetric coefficient matrix for the given quadratic form is
A = [ 0 1 1
      1 0 0
      1 0 0 ].
|A − λI| = | −λ  1  1
              1 −λ  0
              1  0 −λ | = −λ³ + 2λ = −λ(λ² − 2),
so the eigenvalues are λ₁ = 0, λ₂ = √2, and λ₃ = −√2. For λ₂ = √2,
A − √2I = [ −√2   1   1         [ 1 −√2  0
              1 −√2   0    ~      0   1 −1
              1   0 −√2 ]         0   0  0 ],
which yields the eigenvector [√2, 1, 1]; similar computations for λ₁ = 0 and λ₃ = −√2 yield eigenvectors [0, −1, 1] and [−√2, 1, 1]. Normalizing these eigenvectors and taking them as the columns of C, we obtain
C = [   0    √2/2  −√2/2
     −√2/2   1/2    1/2
      √2/2   1/2    1/2 ],
so the substitution x = Ct is
x = (√2/2)(t₂ − t₃),
y = (1/2)(−√2 t₁ + t₂ + t₃),
z = (1/2)(√2 t₁ + t₂ + t₃),
and the resulting diagonal form is √2 t₂² − √2 t₃².
SUMMARY
EXERCISES
In Exercises 9-16, find an orthogonal substitution that diagonalizes the given quadratic form, and find the diagonalized form.
17. Find a necessary and sufficient condition on a, b, and c such that the quadratic form ax² + 2bxy + cy² can be orthogonally diagonalized to kt².
18. Repeat Exercise 17, but require also that k = 1.
In Exercises 19-24, use the routine MATCOMP in LINTEK, or MATLAB, to find a diagonal form into which the given form can be transformed by an orthogonal substitution. Do not give the substitution.
19. 3x² + 4xy − 5y²
20. x² − 8xy + y²
21. 3x² + y² − 2z² − 4xy + 6yz
22. y² − 8z² + 3xy − 4xz + 7yz
23. x₁² − 3x₁x₄ + 5x₂² − 8x₂x₃
24. x₁² − 8x₁x₂ + 6x₃² − 4x₃x₄
Conic Sections in R²
Figure 8.2 shows three different types of plane curves obtained when a double right-circular cone is cut by a plane. These conic sections are the ellipse, the hyperbola, and the parabola. Figure 8.3(a) shows an ellipse in standard position, with center at the origin. The equation of this ellipse is
x²/a² + y²/b² = 1.   (1)
The ellipse in Figure 8.3(b) with center (h, k) has the equation
FIGURE 8.2  Sections of a cone: (a) an elliptic section; (b) a hyperbolic section; (c) a parabolic section.
(x − h)²/a² + (y − k)²/b² = 1.   (2)
FIGURE 8.3  (a) Ellipse in standard position; (b) ellipse centered at (h, k).
x² + 3y² − 4x + 6y = −1.
SOLUTION  Completing the square, we obtain
(x − 2)²/6 + (y + 1)²/2 = 1,
which is of the form of Eq. (2). This is the equation of an ellipse with center (h, k) = (2, −1). Setting x̄ = x − 2 and ȳ = y + 1, we obtain the equation
x̄²/6 + ȳ²/2 = 1.
x²/a² − y²/b² = 1   or   −x²/a² + y²/b² = 1,
as shown in Figure 8.4. The dashed diagonal lines y = ±(b/a)x shown in the figure are the asymptotes of the hyperbola. By completing squares and
FIGURE 8.4  (a) The hyperbola x²/a² − y²/b² = 1; (b) the hyperbola −x²/a² + y²/b² = 1.
translating axes, we can reduce Eq. (4) to one of the standard forms shown in Figure 8.4, in variables x̄ and ȳ, unless the constant in the final equation reduces to zero. In that case we obtain
x̄²/a² − ȳ²/b² = 0,   or   ȳ = ±(b/a)x̄,
a pair of intersecting lines.
FIGURE 8.5  (a) The parabola x² = ay, a > 0; (b) the parabola y² = ax, a < 0.
Classification of Second-Degree Curves
We can now apply our work in Section 8.1 on diagonalizing quadratic forms to
classification of the plane curves described by ari equation of the type
We make a substitution
xX} Alf _
* =C i where det(C) = 1, (9)
EXAMPLE 2 Use rotation and translation of axes to sketch the curve 2xy + 2V/2x = 1.
SOLUTION The symmetric coefficient matrix of the quadratic form 2xy is
4=|t
_ {01
9h
We easily find that the eigenvalues are A, = 1 and A, = —1, and that
c= [Na IW
8.2 APPLICATIONS TO GEOMETRY 423
-~ty-
x ~V5 ty)
i
y= Blt + b)
then yields
t?- 6) + 2t - 26 =
—
|
:
(f, + ly ~~ (t, + ly l,
Quadric Surfaces
An equation in three variables of the form
C7 + Cy? + 0527 + GX + Cy + Gz = d, (11)
where at least one of c,, c,, or c, is nonzero, describes a quuuric surface in space,
which again might be degenerate or empty. Figures 8.7 through 8.15 show
some of the quadric surfaces in standard position.
424 CHAPTER 8 EIGENVALUES: FURTHER APPLICATIONS AND COMPUTATIONS
z
b
HISTORICAL NOTE THE EARLIEST CLASSIFICATION OF QUADRIC SURFACES was given by Leonhard
Euler, in his precalculus text Introduction to Infinitesimal Analysis (1748). Euler’s classification
was similar to the conic-section classification of De Witt. Euler considered the second-degree
equation in three variables Ap? + Bq? + Cr? + Dog + Epr + For + Gp + Hqa+ir+K=0Oas
representing a surface in 3-space. As did De Witt, he gave canonical forms for these surfaces and
showed how to rotate and translate the axes to reduce any given equation to a standard form such
as Ap’ + Bg? + Cr? + K = 0. An analysis of the signs of the new coefficients determined whether
the given equation represented an ellipsoid, a hyperboloid of one or two sheets, an elliptic or
hyperbolic paraboloid, or one of the degenerate cases. Euler did not, however, make explicit use of
eigenvalues. He gave a ger.cral formula for rotation of axes in 3-space as functions of certain angles
and then showed how to choose the angles to make the coefficients D, E, and F all zero.
Euler was the most prolific mathematician of all time. His collected works fill over 70
large volumes. Euler was born in Switzerland, but spent his professional life in St. Peters-
burg and Berlin. His texts on precalculus, differential calculus, and integral calculus (1748,
1755, 1768) had immense influence and became the bases for such texts up to the present. He stan-
dardized much of our current notation, introducing the numbers e, 7, and i, as well as
giving our current definitions for the trigonometric functions. Even though he was blind for
the last 17 years of his life, he continued to produce mathematical papers almost up to the day
he died.
82 APPLICATIONS TO GEOMETRY 425
FIGURE 8.11
FIGURE 8.12
a
The elliptic paraboloid. z = xy?
3 + be The hyperbolic paraboloid
Z eo
~ pt at
FIGURE 8.13
FIGURE 8.14
a x? 2
The elliptic cone z?yea= 3 + be The hyperboloid of two sheets
426 CHAPTER 8 EIGENVALUES FURTHER APPLICATIONS AND COMPUTATIONS
FIGURE 8.15
2 2 y?
The hyperboloid of one sheet 5 +1= 5 + be
Again, degenerate and empty cases are possible. For example, the equation
x + 2y + z? = —4 gives an empty ellipsoid. The elliptic cone, hyperboloid of
two sheets, and hyperboloid of one sheet in Figures 8.13 through 8.15 differ
only in whether a constant in their equations is zero, negative, or positive.
Now consider a general second-degree polynomial equation
x
yl=C|t|, where det(C) = 1, (13)
z
that orthogonally diagonalizes the quadratic-form portion of Eq. (12) (the
portion in color), we obtain
TABLE 8.1
If you try to verify Table 8.1, you will wonder about an equation of the
form ax? + by + cz = din the last case given. Exercise 11 indicates that, by
means of a xotaticn of axes, this equation can be reduced to at,’ + rt, = d,
which can be written as at,’ + r(t, — d/r) = 0, and consequently describes a
parabolic cylinder. We conclude with four examples.
x = lt — )
ye 5(-v% + t, + ts]
z= a(v2e +t, + |
transforms 2xy + 2xzinto V2t,? — V/2t,2. It can be checked that the matrix C
corresponding to this substitution has determinant |. Thus, the substitution
corresponds to a rotation of axes and transforms the equation 2xy + 2xz = |
into V2t, — V2t,? = 1, which we recognize as a hyperbolic cylinder. =
EXAMPLE 4. Classify the quadric surface 2xy + 2xz = y + I.
428 CHAPTER 8 —_ EIGENVALUES: FURTHER APPLICATIONS AND COMPUTATIONS
SOLUTION Using the sane substitution as we did in Example 3, we obtain the equation
We find that
2-A -1 0
det(4-AN=| -1 -3-a 2
0 2 1-A
ony [-3-A 2 | Jel 2
= A) | 2 12+ 1 al
= (2 - AA? + 2A-7)+A-1
= —-' + 12A — 15.
8.2 APPLICATIONS TO GEOMETRY 429
[ |
SUMMARY |
—
1. Given an equation ax + bxy + cy? + dx + ey + f= 0, let A be the
symmetric coefficient matrix of the quadratic-form portion of the equation
(the portion in color), let A, and A, be the eigenvalues of A, and let C be an
orthogonal matrix of determinant | with eigenvectors of A for columns.
The substitution corresponding to inatrix C followed by translation of axes
reduces the given equation to a standard form for the equation of 2 conic
section. In particular, the equation describes a (possibly degenerate or
empty)
ellipse if A,A, > 0,
hyperbola if A,A, < 0,
parabola. if A,A, = 0.
2. Proceeding in a way analogous to that described in summary item (1) but
for the equation
ax’ + by? + cz? + dxy + exz + fyzt+ pxtqtzt+s=0,
one obtains a standard form for the equation of a (possibly degenerate or
empty) quadnic surface in space. Table 8.1 lists the information that can be
obtained from the three eigenvalues A,, A,, and A, alone.
| EXERCISES
| | — off}
yl |b
complete squares if necessary, and sketch the
4. x2 -— Ixy
t y?+ 4V2x= 4
9. Show that the plane curve ax* + Axy + cy? + In Exercises 12-20, classify the quadric surface
dx + ey + f= 01s a (possibly degenerate or with the given equation as one of the (possibly
empty) empty or degenerate) types illustrated in Figurcs
. 8.7-8.15.
ellipse ifP — 4ac < 0,
hyperbola if& — 4ac > 0,
12. 2x° + 2p + 6yz+ 102
= 9
parabola if b? — 4ac = 0.
13. 2xv+ 2774 1=0
[Hint: Diagonalize ax’ + bxy + cy’, using 2
the quadratic formula, and check the signs 14. 2xz+ yet dv t+ 1 =0
of the eigenvalues.} 15. x8 +2yz t+ 4x +1 =0
10. Use Exercise 5 to classify the conic section 16. 3x2 + 2y? + 6xz + 32?
with the given equation.
a . 2x0 + Biy + BY — 3x + 2) = 13 17. x — Bxy + 16y* — 3z' = §
b. yt 4ay ~ Sx? — Bx = 12 18. x + 49 — 4xz + 42 = 8
ce. -x?+ Sxy-— Ty - 4p +11 =0
d. xy + 4x - 3p - 8 19. —3x°
+ 2y + Byz + 162?= 10
e. 2x’ — 3xy + ¥ — 8x + Sy = 30 20. e+ y+ 27 - 2xy + 2xz- 2yz = 9
f. x? + Oxy + 9 - 2x+ 14y= 10
. 4 — Ixy — By ft 8x - Sy =!7
r 8x2 + éxy + oy —~ Sx = 3 = In Exercises 21-27, use the routine MATCOMP
i. x2 — Ixy + 4x - Sy =6 in LINTEK, or MATLAB, to classify the quadric
. Ix? - 3xy + 2p - By = 15 surface according to Table 8.].
j
11. Show that the equation ax? + by + cz =d
>a ye ye
can be transformed into an equation of the 21. x+y +z? — 2xy — 2xz - Dz +
forin at,? + rt, = d by a rotation of axes in 3x — 32 = 8
space. [HInT: 22. 3x2 + 2p + 522 + 4xy + 2ps - 3x +
x t 10y = 4
oon —
l 5
cos x= | - ax? + axt - Gt to (1d)
1 11
lata @
+ «8 6
I I 1.
cos(2x — y) =1 - alex - yy + qil2x — yj - 62% —yfo+tr++. (2)
lo — r= lige ayy t p
ail x y) ~ 4( x xy y’)
is a form of degree 4.
Consider a function g(x), X,,..., X,) Of n vanables, which we denote as
usual by g(x). For many of the most common functions g(x), it can be shown
that, ifx is near 0, so that all x, are smal! in magnitude, then
yar tl
FIGURE 8.16
The graph of y = g(x) = x? + 1.
Now if x is really close to 0, ail the forms fi(x), f(x), 4(x), . . . in Eq. (3)
have very small values, so the constant c, if c # 0, is the dominant term of
Eq. (3) near zero. Notice that, for g(x) = 1 + x’ in Figure 8.16, the function has
values close to the constant | for x close to zero.
After a nonzero constant, the form in Eq. (3) that contributes the most to
g(x) for ||x|| close to 0 is the nonzero form f{x) of lowest degree, because the
lower the degree of a term of a polynomial, the mure it contributes when the
variables are all near zero. For example, x is greater than x near zero; if x =
joo then x? is only rp5p5- If fi(x) = 0 and f(x) # 0, then f,(x) is the dominant
form of Eq. (3) after a nonzero constant c, and so on.
We claim that, if f(x) # 0 in Eg. (3), then g(x) does not have a local
maximum or minimum at x = 0. Suppose that
and g(x) has a local minimum of c at 0. On the other hand, if f,(x) < 0 for all
such x, then we expect that
g(x) = c — (Little bit)
8.3 APPLICATIONS TO EXTREMA 433
for these values x, and g(x) has a local maximum of cat 0. This is proved in an
advanced calculus course.
We know that we can orthogonally diagonalize the form f(x) with a
substitution x = Ct to become
At? + At? treet Ant, (4)
where the A; are the eigenvalues of the symmetric coefficient matrix of f(x).
Form (4) is > 0 for all nonzero t, and hence f,(x) > 0 for all nonzero x, if and
only if we have all A; > 0. Similarly, f(x) < 0 for ail nonzero x if and only if all
A; < 0. It is also clear that, if some A; are positive and some are negative, then
form (4) and hence /5(x) assume both positive and negative values arbitrarily
close to zero.
A quadratic form f(x) is positive definite if f(x) > 0 for all nonzero x in
R*, and it is negative definite if f(x) < 0 for all such nonzero x.
Our work in Section 8.1 and the preceding statement give us the following
theorem and corollary.
Let g(x) be a function of n variables given by Eq. (3). Suppose that the
function f(x) in Eq. (3) is zero. If f,(x) is positive definite, then g(x) has
a local minimum of c at x = 0, whereas if f(x) is negative definite, then
g(x) has a local maximum of c at x = 0. If f(x) assumes both positive
and negative values, then g(x) has no local extremum at x = 0.
Exercises 17 through 22 illustrate some of the things that can occur if step
5 is the case. We shall not tackle steps 1 or 2, because they require calculus, but
will simply start with equations of the form of Eqs. (3) or (5) that meet the
requirements in steps | and 2.
EXAMFLE 1 Let
SOLUTION The symmetric coefficient matrix for the quadratic-form portion 2x? — 4xy +
4y? of g(x, y) is
We find that
a _ All = i a) =\N- 6A + 4.
6+ V36 — 16
A= ENS.
A, = A; aot Se A,-
On the unit sphere with Eq. (7), we see that formula (6) can be written as
A(L -— t? — +++ — £2) + Ag? +--+ + At,?
=A — (A, - Ay)ty? Tr (Am An. (8)
Let f(x) be a quadratic form, and let A,, A,,... , A, be the eigenvalues
of the symmetric coefficient matrix of f(x). The maximum value
assumed by f(x) on the unit sphere ||x|| = 1 is the maximum of the 4,
and the minimum value assumed is the minimum of the A,. Each
extremum Is assumed at any eigenvector of length | corresponding to
the eigenvalue that gives the extremum.
EXAMPLE 3 Find the maximum and minimum vaiues assumed by 2xy + 2xz on the unit
sphere x? + y + z? = 1, and find all points where these extrema are assumed.
SOLUTION. From Example 5 in Section 8.1, we see that the eigenvalues of the symmetric
coeficient matrix A and an associated eigenvector are given by
SUMMARY
1. Let g(x) =c+ f(x) + f(x) +--+ -* + f(x) + +++ near x = 0, where f(x) is
a form of degree i or is zero.
a. If f(x) = 0 and f(x) is positive definite, then g(x) has a local minimum
of cat x = 0.
8.3 APPLICATIONS TO EXTREMA 437
b. Iff\(x) = 0 and f(x) is negative definite, then g(x) hasa local maximum
ofcatx = 9.
c. If f(x) # 0 or if the symmetric coefficient matrix of (x) ha’ both
positive and negative eigenvalues, then g(x) has no local extremum at
x = 0.
2. The natural analogue of summary item i olds at x = h; just translate the
axes to the point h, and replace x, by xX, = x; — A..
3. A quadratic form in n variables has as maximum (minimum) value on the
unit sphere ||x|] = 1 in R” the maximum (minimum) of the eigenvalues of
the symmetric coefficient matrix of the form. The maximum (minimum) is
assumed at each corresponding eigenvector of length 1.
EXERCISES
In Exercises 1-15, assume that g(x), or g(X), ts 13. e(x yz) = 54+ (42+ Axyt yY—-— z+
described by the given formula for values of x, or y + -::
— 4xyz)
(x
x, near zero. Draw whatever conclusions are 14. e(x, y,
= 44+ (+ yt Z)
2? - 2xz -
possible concerning local extrema of the Ixy — 2yz) + GXI-D)+---,
function g. x=x+lyp=y-2,z=z+5
15. g(x, y,z)=4- 37+
1. B(x, y) = —7 + (3x° — bxy + 4y’) +
(©
- 4y) (2x° — 2xy + 3xz + Sp? +z) +
(xyz -ZD)t--+,X=xXx-TyV=yt G,
. B(x, y) = 8 — (2 — Bxyp + By) + Z=2
(2x7y — y) 16. Define the notion ofa local minimum ¢ of a
~ B(%, y) = 4 -— 3x + (2 - Qxy t+ y) + function f(x) of n variables at a point h.
Qxy+ yt
» B(x, y) = 5 — (8 — b6xy + 2y) + In Exercises 17-22, let
(40 = xy)
++ -- ax y=act fx yt haytflayt---
» B(x, y) = 3 — (4x? — Bxy + Sp’) +
QQx*y-—p)yters x=x+5, yay in accordance with the notation we have been
» 8% y) = 24+ (8 + day t+ p+ using. Let d, and A, be the eigenvalues of the
Co + 5xy) symmetric coefficient matrix of f,(x, y).
» &(x, y) = 5+ 3x + 10xy + TP) +
(Ixy
- y) i7. Give an example ofa polynomial function
g(x, y) for which f(x, y) = 0, A, = 0,
» &(x, y) = 4 - Oe - bxy + Dy?) + |A.| = 1, and g(x, y) has a local minimum of
(Oy) too 10 at the origin.
- B(x, y) = 3 + (2x? + Bxy + By?) + 18. Repeat Exercise 17, but make g(x, y) have a
(4x? -xy*)y)+-->,x=x-3,p=y-l local maximum of —5 at the origin.
10. &(xX, y) = 4 - (+ 3xy - y+ . Repeat Exercise 17, but make g(x, y) have
(4— x’ x=xt+1yp=y-7
Sxy),p no local extremum at the origin.
Il. &(x, y, Z) = 4-0 + 4xy + Sy + 327) +
20. Give an example of a polynomial function
Gd
— xyz) g(x, y) such that f(r, ») = Ale y) = 9.
12. &(x, y, Z) = 3 + (2 + 6xz — p + 52%) + having a local maximum of | at the
(x°z — y?z) + ces origin.
438 CHAPTER 8 EIGENVALUES: FURTHER APPLICATIONS AND COMPUTATIONS
21. Repeat Exercise 20, but make g(x, y) have a and prove that this minimum value is
local minimum of 40 at the origin. assumed at any corresponding eigenvector of
22. Repeat Exercise 20, but make g(x, y) have length 1.
no local extremum at the origin. 33. Let f(x) be a quadratic form in n vanables.
Descnbe the maximum and minimum
In Exercises 23-31, find the maximum and values assumed by f(x) on the sphere
minimum values of the quadratic form on the unit \|x|| = a? for any a > 0. Describe the points
circle in R?, or unit sphere in R’, and find all of the sphere at which these extrema are .
points or the unit circle or sphere where the assumed. [Hint: For any & in R, how does
extrema are assumed. f(kx) compare with f(x)?]
If s is large, the quotients (A/A,) for i> 1 are close to zero, because |A/A,| < I.
Thus, if c, # O and s is large enough, A‘w, is very nearly parallel to A,'c,b,, which
is an eigenvector of A corresponding to the eigenvalue A,. This suggests that we
can approximate an eigenvector of A corresponding to the dominant eigen-
value A, by multiplying an appropriate initial approximation vector w,
repeatedly by A.
A few comments are in order. In a practical application, we may have a
rough idea of an eigenvector for A, and be able to choose a reasonably good
first approximation w,. In any case, w, should not be in the subspace of R’
440 CHAPTER 8 EIGENVALUES: FURTHER APPLICATIONS AND COMPUTATIONS
AXX*X7X AX
K°X
Ly, (4)
The quotient (Ax : x)/(x + x) is called a Rayleigh quotient. As we compute the w,,
the Rayleigh quotients (Aw, - w,)/(w; + w,) should approach A,.
This power method for finding the dominant eigenvector should, mathe-
matically, break down if we choose the initial approximation w, in Eq. (1) in
such a way that the coefficient c, of b, is zero. However, due to roundoff error,
it often happens that a nonzero component of b, creeps into the w, as they are
computed, and the w, then start swinging toward an cigenvector for A, as
desired. This is one case whcre roundoff error is he!pful!
Equation (3) indicates that the ratio |A,/A,|, which is the maximum of
the magnitudes |Aj/A,| for ¢ > 1, should control the speed of convergence
of the w; to an eigenvector. If |A,/A,{ is close to 1, convergence may be quite
SiOwW,
We summarize the steps of the power method in the following box.
HISTORICAL NOTE Tue RAYLEIGH QUOTIENT is named for John William Strutt, the third
Baron Rayleigh (1842-1919). Rayleigh was a hereditary peer who surprised his family by pursuing
a scientific career instead of contenting himself with the life of a country gentleman. He set up @
laboratory at the family seat in Terling Place, Essex, and spent most of his life there pursuing his
research into many aspects of physics—in particular, sound and optics. He is especially famous
for his resolution of the long-standing question in optics as to why the sky i is blue, as well as for his
codiscovery of the element argon, for which he won the Nobel prize in 1904. When he received
the British Order of Merit in 1902 he said that “the only merit of which he personally was con-
scious was that of having pleased himself by his studies, and any results that may have bee?
due to his researches were owing to the fact that it had been a pleasure to him to become 4
physicist."
Rayleigh used the Rayleigh quotient early in his career in an 1873 ork in which he needed to
evaluate approximately the normal modes of a complex vibrating system. He subsequently used it
and related methods in his classic text The Theory of Sound (1877).
8.4 COMPUTING EIGENVALUES AND EIGENVECTORS 441
A= —?3-2 0 s tarting
Ing W1 with Wi -| Tey | P
am = {3 oli] =[-a}
_{| 3-27 1]_] 5
' 1]
W,> 5AM = 3}
Then
AW,"W, 5 _ 115 _
w,'W, 29 29
Finally,
442 CHAPTER 8 EIGENVALUES: FURTHER APPLICATIONS AND COMPUTATIONS
EXAMPLE 2 Use the routine POWER in LINTEK with the data in Example 1, and give the
vectors w, and the Rayleigh quotients until stabilization to all decimal place;
printed occurs.
SOLUTION POWER pnitts fewer significant figures for the vectors to save space. The datz
obtained using POWER are shown in Table 8.2. It is easy to solve the
characteristic equation A? — 3A — 4 = 0 of A, and tosee that the eigenvalues are
really A, = 4 and A, = —1. POWER found the dominant eigenvalue 4, and it
shows that [1, ~0.5] is an eigenvector for this eigenvalue.
ifA has an eigenvalue of nonzero magnitude smailer than that of any othe:
eigenvalue, the power method can be used with A“! to find this smalles
eigenvalue. The eigenvalues of A“! are the reciprocals of the eigenvalues of A.
and the eigenvectors are the same. To illustrate, if 5 is the dominant eigenvalue
of A“' and v is an associated eigenvector, then ; is the eigenvalue of A of
smallest magnitude, and v is stil] an associated eigenvector.
TABLE 8.2
3 -2]
Power Method for A = | _ 2 9|
(1, -1]
(1, -.4] 3.5
(1, —.5263158] 3.9655172413793i
[1, —.4935065] 3.997830802603037
(1, —.5016286] 3.9998 64369998644
[1, -.4995932] 3.99999 1522909338
[1, —.5001018) 3.999999470180991
[1, —.4999746] 3.999999966886309
[1, —.5000064] 3.999999997930394
[1, -.4999984] 3.99999999987065
[1, —.5000004] 3.99999999999 1916
[1, —.4999999] 3.999999999999495
[1, -.5] 3.999999999999968
[1, -.5] 3.999999999999998
(1, —.5] 4.
{1, -.5] 4
——r
| lr.
AB=\|c, G& «°° ¢
=
,,
where c, is the jth column vector ofA and r, is the ith row vector of B. We claim
that
AB = cn, + er, + > ° + + ¢,F,, (5)
But c,r, contributes precisely a,b, to the ith row and jth column of the sum in
Eq. (5), while c,r, contributes a5, and so on. This establishes Eq. (5).
Let A be an n X n symmetric matrix with eigenvalues A,, A... ,A,, Where
A= CDC! = CDC"
A, —— b,’ ——
FQ
A, b,7
=|b, b, --: b, , (6)
_ A,b,7 ———
A,b,7
A= |b, b, --- b, (7)
| —_ A,b,7 TT
Using Eq. (5) and writing the b, as column vectors, we see that A can be written
in the following form:
444 CHAPTER 8 EIGENVALUES: FURTHER APPLICATIONS AND COMPUTATIONS
| Spectral Decomposition of A |
A-[ 39 _| 3-2
of Example 1.
SOLUTION Computation shows that the matrixA has eigenvalues A, = 4 and A, = —1, with
corresponding eigenvectors
| 1
o !
v,=]_1/ and vy, =} 2].
A,b,b,” T
+ A,b,b, T—
| TA]
2s RIVS _
W/WV5] _ Ws
aval HVS, 25)
4.2 12
_ 5 S}_ 4j5 5
2 1 24
5 5 15 5
16 _8 1 2
_| 5 s/_|s5 3 oo .
-8 4/1/24 2 0
5 5 5 §
HISTORICAL NOTE Tue TERM SPECTRUM Was coined around 1905 by David Hilbert (1862-
1943) for use in dealing with the eigenvalues of quadratic forms in infinitely many variables. Th¢
notion of such forms came out of his study of certain linear operators in spaces of functions:
Hilbert was struck by the analogy one could make between these operators and quadratic forms 10
finitely many variables. Hilbert’s approach was greatly expanded and generalized during the next
decade by Erhard Schmidt, Frigyes Riesz (1880-1956), and Hermann Weyl. Interestingly enoug!
in the 1920s physicists called on spectra of certain linear operators to explain optical spectra. -
David Hilbert was the most influential mathematician of the early twentieth century. He
made major contributions in many fields, including algebraic forms, algebraic number theory:
integral equations, the foundations of geometry, theoretical physics, and the foundations ®
mathematics. His speech at the 2nd International Congress of Mathematicians in Paris in 1909,
outlining the important mathematical problems of the day, proved extremely significant in
providing the direction for twentieth-century mathematics.
84 COMPUTING EIGENVALUES AND EIGENVECTORS 4AS
4-13
_| 3 -2
of Example 3.
4-2) |12
A = djb,b,7 + Ayb,b,7= 4 3 F}-]3 3h.
5 5) |5 5
Thus,
-l_)y -2
5 5 | _
2 -4_,(/= + A) = AA
+ 1),
5 5
EXAMPLE 5 Use the routine POWER in LINTEK to find the eigenvalues and eigenvectors,
using deflation, for the symmetric matrix
2~8 §!
-g 0 10),
5 10 -6|
SOLUTION Using POWER with deflation and findiiig eigenvalues as accurately as the
printing on the screen permits, we obtain the eigenvectors and eigenvalues
v, = [-.6042427, ~.8477272, 1], A, = —17.49848531152027,
v, = [—.760632, 1, 3881208}, Ay = 9.96626448890372,
v, = [1, 3958651, 9398283], A, = 3.532220822616553.
Using MATCOMP as a check, we find the same eigenvalues and eigen-
vectors. §&
Ba cn (11)
Dap gq
From matrix (10), we would form the matrix consisting of the portion shown
in color—namely,
0 10
10 -6]'
Let C = [c,] be a 2 x 2 orthogonal matrix that diagonalizes matrix (11).
(Recall that C can always be chosen to correspond to a rotation of the plane,
although this is not essential in order for Jacobi’s method to work.) Now form
ann X nmatrix R, the same size as the matrix A, which looks like the identity
matrix except that 7, = C11, Myq = Cia Top = Co1) ANG yg = Cy. For matrix (10),
where 10 has maximum magnitude above the diagonal, we would have
1 0 0
R= {0 cy ey).
0 Cy Cy
84 COMPUTING EIGENVALUES AND EIGENVECTORS 447
This matrix R will be an orthogonal matrix, with det(R) = det(C). Now form
the new symmetric matrix B, = R’AR, which has zero entries in the row p,
column q position and in the row q, column p position. Other entries in B, can
also be changed from thuse in 4, but it can be shown that the maxinium
magnitude of off-diagonal entiies has been reduced, assuming that no other
above-diagonal entry in A had the magnitude of a,. Then repeat this process,
Starting with B, instead of A, to obtain another symmetric matrix B,, and
su on. It can be shown that the maximum magnitude of off-diagonai entries
in the matrices B, approaches zero as / increases. Thus the sequence of
matrices
B, B,, B,, oe
will approach a diagonal matrix J whose eigenvalues d,, dy, .. . , d,, are the
same as those of A.
If one is going to use Jacobi’s method much, one should find the 2 x 2
matrix C that diagonalizes a general 2 x 2 symmetric matnx
bch
b c}’
that is, one should find formuias for computing C in terms of the entries a, b,
and c. Exercise 23 develops such formulas.
Rather than give a tedious pencil-and-paper example of Jacobi’s method,
we choose to present data generated by the routine JACOBI in LINTEK
for matrix (10)—which is the same matrix as that in Example 5, where we
used the power method. Observe how, in each step, the colored entnes of
maximum magnitude off the diagonal are reduced to “zero.” Although
they may not remain zero in the next step, they never return to their orig-
inal size.
2-8 5
-8 0 10).
5 10 -6
2-8 5
-8 0 10
5 10 -6
2 8.786909 3.433691
8.786909 —13.44031 0
3.433691 0 7.440307|
448 CHAPTER 8 EIGENVALUES FURTHER APPLICATIONS AND COMPUTATIONS
-17.41676 0 — 1.415677
0 5.97645 —3.128273
— 1.415677 —3.128273 7.440307
ine ~ 8789265 0
—.8789265 3.495621 3.560333E-02
0 3.560333E-02 9.966067
—17.49849 0 |.489243E-03
0 3.532418 3.55721 7E-02
| 1.489243E-03 3.557217E-02 9.956067 |
~17.49849 8.233765E-06 —1.48922E-03
8.233765E-06 3.53222! 0
—1.48922E-03 0 9.966265
—17.49849 8.233765E-06 0
8.233765E-06 3.532221 —4.464509E-10
0 ~4,464508E-10 9.966265
The off-diagonal entries are now quite small, and we obtain the same
eigenvalues from the diagonal that we did in Example 5 using the power
method. s
QR Algorithm
At the present time, an algorithm based on the QR factorization of an
invertible matrix, discussed in Section 6.2, is often used by professionals to
find eigenvalues of a matrix. A full treatment of the QR algorithm is beyond
the scope of this text, but we give a brief description of ihe method. As with the
Jacobi method, pencil-and-paper computations of eigenvalues using the
QR algorithm are too cumbersome to include in this text. The routine
QRFACTOR in LINTEK can be used to illustrate the features of the
QR algorithm that we now describe.
Let A be a nonsingular matrix. The QR algorithm generates a sequence of
matrices A,, A,, A;, Ay... , all having the same eigenvalues as A. To generat¢
this sequence, let A, = A, and factor A, = Q,R,, where Q, is the orthogon@
matrix and R, is the upper-triangular matrix described in Section 6.2. Then let
A, = R,Q,, factor A, into Q,R,, and set A, = R,Q,. Continue in this fash!”
factoring A, into Q,R, and setting A,,, = R,Q,. Under fairly gene’
84 COMPUTING EIGENVALUES AND EIGENVECTORS 449
IX X X X X X X]
XxX XX X X X;
00 XxX xX X xX
00 XXX xX xX
0000xX xX x}
0000 x xX xX
01
1 0
generates this same matrix repeatediy, even though the eigenvalues are 1 and
— | rather than complex numbers. This is an example of a matrix A for which
450 CHAPTER 8 EIGENVALUES: FURTHER APPLICATIONS AND COMPUTATIONS
t 3}
which generates a sequence that quickly converges to
lo H
O -.1]
Subtracting the scalar .9 from the eigenvalues |.9 and —.1 of this last matrix,
we obtain the eigenvalues | and —! of the original matnx.
Shifts can also be used to speed convergence, which is quite fast when the
ratios of magnitudes of eigenvalues are large. The routine QRFACTOR
Cisplays the matrices A; as they are generated, and it allows shifts. If we notice
that we are going to obtain an eigenvalue whose decimal expansion starts with
17.52, then we can speed convergence greatly by adding the shift (—17.52)/.
The resulting eigenvalue will be near zero, and the ratios of the magnitudes of
other eigenvalues to its magnitude will probably be large. Using this technique
with QRFACTOR, it is quite easy to find all eigenvalues, both real and
complex, of most matrices of reasonable size that can be displayed conve-
niently.
Professional programs make many further improvements in the algonthm
we have presented, in order to speed the creation of zeros in the !ower part of
the matnx.
SUMMARY
3. Let A be ann X n symmetric matrix with eigenvalues A, such that |A,| >
|A,| 2 -- + 2 |A,|. Tf b, is a unit eigenvector corresponding to A,, then
A — A,b,b,’ has eigenvalues A,, A;,... , A,, 0, and if |A,| > |A,|, then A, can
be found by applying the power method to A — A,b,b,7. This deflation can
be continued with A — A,b,b,” — A,b,b,7 if |A,| > |A,], and so on, to find
more eigenvalucs.
4. In the Jacobi method for diagonalizing a symmetric matrix A, one
generates a sequence of symmetnic matrices, starting with A. Each matrix
of the sequence is obtained from the preceding one by multiplying it on the
left by R’ and on the right by R, where R is an orthogonal “rotation”
matrix designed to annihilate the two (symmetrically located) entries of
maximum magnitude off the diagonal. The matrices in the sequence
approach a diagonal matnx D having the same eigenvalues as A.
5. Inthe QR method, one begins with an invertible matrix A and generates a
sequence of matrices A, by setting A, = A and A,,, = R,Q,, where the OR
factorization of A, is Q,R,. The matrices A, approach almost upper-
triangular matrices having the reai eigenvalues of A on the diagonal and
pairs of complex conjugate eigenvalues of A as eigenvalues of 2 x 2 blocks
appeanng along the diagonal. Shifts may be used to speed convergence or
to find eigenvalues of a singular matrix.
EXERCISES
-['}
wal! 8.|-1
an 1-1
and compute w,, W,, and w,. Also find the three [Hinr: Use Example 4 in Section 6.3.]
Rayleigh quotients. Then find the exact
eigenvalues, for comparison, using the In Exercises 9-12, find the matrix obtained by
characteristic equation. deflation after the (exact) eigenvalue of maximum
magnitude and a corresponding eigenvector are
if 3-3 2, [3 -3 found. -
-5 { 4 -5
3, [-3 10 4 [-49 9. The matrix in Exercise 5
3 8 2 5 10. The matrix in Exercise 6
11. The matrix in Exercise 7
In Exercises 5-8, find the spectral decomposition
: 12. The matnx in Exercise 8
(8) of the given symmetric matrix.
= In Exercises 13-15, use the routine POWER in
5, 3 | 6. 3 4 LINTEK, or review Exercises M2-M5 in Section
3 2 5 3 5.1 and use MATLAB as in M4 and M5, to find
452 CHAPTER 8 EIGENVALUES FURTHER. APPLICATIONS AND COMPUTATIONS
In Exercises 18-21, use the routine POWER (If b < 0 and a rotation matrix is desired,
in change the sign of a column vector in C.)
LINTEK and deflation to find all eigenvalues and
corresponding eigenvectors for the given Prove this algorithm by proving the
symmetric matrix Always continue the method following.
before deflating until as much stabilization as a. The eigenvalues of A are
possible is achieved. Note the relationship between - a+c+V(a-
cP + 4b?
ratios of magnitudes of eigenvalues and the speed A= 5 ,
of convergence.
[Hint: Use the quadratic formula.}
3 5 -7 b. The first row vector of A — A/ is
18. 5 10 11 (g + Vg? + b’, bd).
-7 Il Oo c. The columns of the orthogonal matrix C
0-1 4 in step 5 can be found using part b.
19. |-] 2-1 d. The parenthetical statement following
4-1 Q step 5 is valid.
24. Use the algorithm in Exercise 23 to find an
}3 1 -2 i
>a
3
1-3
7-2
5
6
2
“ld
ae
7 4411 3
te 5-8 0
|6 3 0 6
22. The eigenvalue option of the routine In Exercises 25-28, use the routine JACOBI in
VECTGRPH in LINTEK is designed to LINTEK to find the eigenvalues of the matrix by
Jacobi's method.
84 COMPUTING EIGENVALUES AND EIGENVECTORS 453
[
see approximately what an eigenvalue will be. 3 6 4-2 {6
21 -33 5S 8 -12
2 -l1
3
9
§
-6
0 -Ii
7
8 | 32. 15
—18
[-22
-21
12
31
13
4
14
4 = 20
8
9 10
3
CHAPTER
Po
3 COMPLEX SCALARS
The Number /
Numbers exist only in our minds. There is no physical entity that is the
number 1. If there were, 1 would be in a place of high honor in some great
museum of science, and past it would file a steady stream of mathematicians,
gazing at | in wonder and awe. As mathematics developed, new numbers were
invented (some prefer to say discovered) to fulfill algebraic needs that arose. If
we start just with the positive integers, which are the most natural numbers,
inventing zero and negative integers enables us not merely to add any two
integers, but also to subtract any two integers. Inventing rational numbers
(fractions) enables us to divide an integer by any nonzero integer. It can be
shown that no rational number squared is equal to 2, so irrational numbers
have to be invented to solve the equation x* = 2, and our familiar decimal
notation comes into use. This decimal notation provides us with other num-
bers, such as 7 and e, that are of practical use. The numbers (positive, nega
tive, and zero) that we customarily write using decimal notation have unfor-
tunately become known as the real numbers; they are no more real than any
other numbers we may invent, because all numbers exist only in our minds.
454
9.1 ALGEBRA OF COMPLEX NUMBERS 455
The real numbers are still inadequate to provide solutions of even certain
polynomial equations. The simple equation x? + | = 0 has no real number as a
solution. In terms of our needs in this text, the matrix
id
0-1
FIGURE 9.1
The complex number z.
456 CHAPTER 9 COMPLEX SCALARS
w=ctdi
LW
z=atbi, ,
y
4
y bt
A a
z=atbi
22 x
—
: pe Xx
Z=a- bi
32 —pt
| _/ 1 \(3=4i) 3-41 1 yy 3 4;
344 \344i)\3-4i) 25 ~ 25 25 25°
FIGURE 9.6
Polar form of z.
The proofs of the properties in Theorem 9.1 are very easy. We prove
property 3 as an example and leave the rest as exercises. (See Exercises 12 and
13.)
HISTORICAL NOTE Complex NUMBERS MADE THEIR INITIAL APPEARANCE on the mathematical
scene in The Great Art, or On the Rules of Algebra (1545) by the 16th century Italian
mathematician and physician, Gerolamo Cardano (1501-1576). It was in this work that Cardano
presented an algebraic method of solution of cubic and quartic equations. But it was the quadratic
problem of dividing 10 into two parts such that their product is 40 to which Cardano found the
solution 5 + \/—15 and 5 — \/—15 by standard techniques. Cardano was not entirely happy with
this answer, as he wrote, “So progresses anthmetic subtlety the end of which, as is said, is as refined
as it is useless.”
Twenty-seven years later the engineer Raphael Bombelli (1526-1572) published an algebra
text in which he dealt systematically with complex numbers. Bombelli wanted to clarify Cardano’s
cubic formula, which under certain circumstances could express a correct real solution of a cubic
equation as the sum of two expressions each involving the square root of a negative number. Thus
he developed our modern rules for operating with expressions of the form a + b\/—1, including
methods for determining cube roots of such numbers. Thus, for example, he showed that
W247i = 2 + \/—1 and therefore that the solution to the cubic equation x? = !5x + 4,
which Cardano’s formula gave as x = VY 2+ V-121 + Y 2 - V/-121, could be wnite~
simply as x = (2 + \/—1) + (2 - V/—1}), or as x = 4, the obvious answer.
9.1 ALGEBRA OF COMPLEX NUMBERS 459
EXAMPLE 4 rane principal argument and the polar form of the complex number z =
—~V3-i.
SOLUTION Because |z| = (-3)? + (-1)? = V4 = 2, we have cos @ = —V3/2 and
sin 6 = —3, as indicated in Figure 9.7. The principal argument of z is the
angle @ between —7z and 7 satisfying these two conditions—that is, @ =
~5a/€. The required polar form is z = 2(cos(—52/6) + isin(—S7/6)). =
the principal arguments may have to be adjusted to lie between —a and 7z.
(See Figure 9.8.)
y
y j
A
2 Z1
v3 KL > X Arg(z2)
1
11 r=2 *_] /
(5m Arg(z,) + Arg(z2) \ Arta
! x
\ 6 /
X Arg(z 322)
| “! 212) lz122) = |zileal
Because 2,/z, is the complex number w such that z,w = z,, we see that, in
division, one divides the moduli and subtracts the arguments. (See Figure 9.9.)
EXAMPLE 5 Illustrate relations (4) and (5) for the complex numbers z, = V3 + iand z, =
L+.
= eos 2 + isin =
z, = 2(cos 6 + isin 6)"
Z, = V2(cos 4 + isin my
Thus,
lzil
1 (zlzy 1
(21/22) NG Arg(z,/z)
Se / Z2
Arg(z) \ MArg(za) x .
I FIGURE 9.9
Representation of z,/z).
91 ALGEBRA OF COMPLEX NUMBERS 461
and
z .
no Fhloosl & ~ 4) + isin — 3)
= V7
~*_(cos(-2)
12 + i sin(
sin(-—1): .
where
fork =0,1,2,...,"- 1. ;
This illustrates the Fundamental Theorem of Algebra for the equation w" = z
of degree n that has n solutions in C.
EXAMPLE 6 Find the fourth roots of 16, and draw their vector representations in R?.
462 CHAPTER 9 COMPLEX SCALARS
FIGURE 9.10
The fourth roots of 16.
SOLUTION The polar form ofz = 16 is z = 16(cos 0 + isin 0), where Arg(z) = 0. Apply-
ing formula (7), with n = 4, we find the following fourth roots of 16 (see
Figure 9.10):
k 16°{cos “a + isin 2
of 2(cos 0 + i sin 0) = 2
] acs 3 + ?sin 4 = 21
2 2(cos 7 + isin m) = —2
3 a{co 2 + isin | = -2i
| SUMMARY
[|
1. A complex number is a number of the form z = a + bi, where a and b are
real numbers and i = V-1.
2. The modulus of the complex number z = a + bi is
jz. = Va
+ B.
3. The complex number z = a + bi has the polar form z = |z|(cos @ + i sin 6),
and 6 = Arg(z) is the principal argument of z if -7 < 6S a.
91 ALGEBRA OF COMPLEX NUMBERS 463
| EXERCISES
|
I. Find the sum z + wand the product zw if 12. Prove properties 1, 2, and 5 of Theorem 9.1
a Z=1+2iw=3 -i, 13. Prove property 4 of Theorem 9.1.
b z=3+i,w=i. 14. Illustrate Eqs. (5) in the text for z, =
. Find the sum z + wand the product zw if 3 + iand z, = -1 + V3i.
a z=2+3iw=5-4, . Illustrate Eqs. (5) in the text for z, = 2 + 27
b z=1+2iw=2-i. andz,=1+ MV3i.
. Find |z| and z, and verify that zz = |z|’, if . If2 = 16, find |z!.
a. Zz=3+ 2i, . Mark each of the following True or False.
b z=4-i.
a. The existence of complex numbers is
. Find |z| and z, and verify that zz = |z|’, if more doubtful than the oxistence of real
a z=2 +1, numbers.
b. z= 3 - 41. b. Pencil-and-paper computations with
Show that z is a real number if and only if complex numbers are more cumbersome
z=Z, than with real numbers.
c. The square of every complex number is a
. Express z”' in the form a + bi, for a and b
positive real number.
real numbers, if
d. Every complex number has two distinct
a. z= -—I +18, square roots in C.
b z= 3+ 41. e. Every nonzero complex number has two
Express z/w in the form a + bi, for a and 6 distinct square roots in C.
real numbers, if f. The Fundamental Theorem of Algebra
a z=1+2iw=1+i,
asserts that the algebraic operations of
addition, subtraction, multiplication, and
b z=3+i,w=3
+ 41.
division are possible with any two
Find the modulus and principal argument complex numbers, as long as we do not
for divide by zero.
a. 3-4, g. The product of two complex numbers
b. -V3 -i. cannot be a real number unless both
Find the moduius and principal argument numbers are themselves real or unless
for both are of the form bi, where b is a real
number.
a. —2 + 2i,
h. If(a+ bi)= 8, then a’ + b= 4.
b. -2 - 2i.
i. If Arg(z) = 32/4 and Arg(w) = —7/2,
10 Express (3 + i)° in the form a + bi for a then Arg(z/w) = 52/4.
and 6 real numbers. (HINT: Write the given j. If z +z = 2z, then z is a real number.
number in polar form.]
18. Find the three cube roots of 8.
11. Express ‘1 + i)* in the form a + bi for a and
b real numbers. (Hint: Write the given 19. Find the four fourth roots of — 16.
number in polar form.] 29. Find the three cube roots of —27.
464 CHAPTER 9 COMPLEX SCALARS
21. Find the four fourth roots of 1. 27. Show that the infinite list of values ¢ =
Z- 2+(1
+ dz, = l
Iz, — 21z, + Iz,=2- 1
Iz, — (1 + dz, = 1 + 27.
1 2: iti
A=\|1 31 !
Olt+i: -l
i-4i 4) 1-3
Ati=
C” has the same standard basis {e,, €,, . . . ,e,} as R’, but of course now the field
of scalars is C. Thus, C” is an n-dimensional complex vector space—that is, if
we use complex scalars. In particular, C = C! is a one-dimensional complex
vector space, even though we can regard it geometrically as a plane. That plane
is a two-dimensional rea/ vector space, but is a one-dimensional complex
vector space. In general, C” can be regarded geometrically in a natural way as a
2n-dimensiona! real vector space, as we ask you to explain in Exercise 1.
All of our work in Chapters 1, 2, and 3 regarding subspaces, generating
sets, independence, and bases carries over tu complex vector spaces, and the
proofs are identical.
EXAMPLE 4 Find the coordinate vector v, in C? of the vector v = [i, 2 — i, 1 + 2i] relative to
the ordered basis
110
+ 2i
vp=| 5+ 47 |.
-1+4i .
Let uo = [u,, u,,..., u,] and v = [v, v,,..., v,] be vectors in C". The
Euclidean inner product of u and v is
Notice that Definition 9.1 gives {{1, i], [1, i) = (J)() + (-)() = 14+ 1 = 2.
Because |a + il? = (a — bia + bi), we see at once that, for v € C", we have
WV = [yP + [WP t ++ + [val (1)
just as in the R" case.
We list properties of the Euclidean inner product in C" as a theorem,
leaving the proofs as exercises. (See Exercises 16 through 19.)
Property | of Theorem 9.2 and Eq. (1) preceding the theorem suggest that
we define the magnitude or norm of a vector v in C” as
The Euclidean inner product given in Definition 9.1 reduces to the usual
dot product of vectors if the vectors are in R”. However, we have to watch one
feature very carefully:
w=u
_ wv,Wy, uw)vy).
1s orthogonal to v, because
We must not use u — wy, whose inner product with v is generally not zero.
1-1 lt+i
3 vy. 2 V2
l . . l . ,
= [1, 0. N-s-Z1+h1+)-s+h01-9
= Fl -i,-2- 2,1 +9
9.2 MATRICES AND VECTOR SPACES WITH COMPLEX SCALARS 469
Poioiti
A=|2 0 i
2i | i-i
SOLUTION We form the transpose while taking the conjugate oi each ciement, obtaining
| 2 -2i
A*=!| -1 0 I
1-i -i Itt .
Following are some properties of the conjugate transpose that can easily be
verified. (See Exercise 32.)
EXAMPLE 8 Using the properties in Theorem 9.3, show that, for any 2 x n matnx A, we
have (A + A*)* = A + A®*.
SOLUTION Using properties 1 and 2 of Theorem 9.3, we have
EXAMPLE 9 Let 7: C"—» C’ be a linear transformation having a unitary matrix U as mat rix
representation with respect to the standard basis. Show that ||7(z)|] = ||z|| for all
zEC’
SOLUTION We know that 7(z) = Uz, because U is the standard matrix representation of 7
Because ||v||’ = (v, v) = v*v, we find by using property 4 of Theorem 9.5 that
[|Uz|P = (U2)* Uz = 2*(U*U)z = z*Iz = 22 = lel’.
Taking square roots, we find that ||Uz|| = ||z\|, so ||7(z)|| = |lz\l. =
EXAMPLE 10 Example 8 shows that, for any square matnx A, the matnx A + A* is
Hermitian. Illustrate that the entries on the diagonal of A + A* are real, using
the matrix A in Example 7.
SOLUTION Example 7 shows that, for the matnx
A=|2 9 id,
23 1 i-i
we have
/1 2 -2i
A* =| -1 0 |
l-i -i ltt
Thus,
2 2+1 1-1
AtA*® =|2- i 0 1 + if,
l+i |l-i 2
- SUMMARY
|
1. C"1s an n-dimensional complex vector space.
2. Ifu=[u,,u,...,u,)andv=[v,, v,,..., v,] are vectors in C’, then (u, v) =
U\V, + Uv, + ++ * + u,v, is the Euclidean inner product of u and v and
satisfies the preperties in Theorem 9.2. [n general, (u, v) # (vy, u).
V,u) . ,
3. For u, v € C’, the vector u — oy 1S perpendicular to v.
EXERCISES
B=i2+i 1 1-i i
i | i 9. Solve the linear system Az =] 1 + i Als
4. Find A2and A‘ if A -| 1 o1t |: i
“tsa ! the matrix in Exercise 7.
9.2 MATRICES AND VECTOR SPACES WITH COMPLEX SCALARS 473
Li
b [il +ail-Z14+g I+7 | l
e {l+742+i7,34+]
|
V2 0
da f,jlt+il-igig V3i 0 V3
ef{ltil-ii1-Q -i = -2 1
21. Determine whether the given pairs of vectors
are parallel, perpendicular, or neither. i! i l-i
a. [1,4], [i 1 d. 4 i 3
b. [1 +ié41-i,(f-31,-1- 9 il-i 1 2
e. (1 +72 - 9, (33,3 + 9 32. Prove the properties of the conjugate
d. {l,4,1- a, {1-41 +4, 2) transpose operation given in Theorem 9.3.
e.(l+i,1-,1,{,1-%4-3-] (Hint: From Section 1.3, we know that
474 CHAPTER 9 COMPLEX SCALARS
analogous properties of the transpose 34. Prove that, for vectors v,, v,,...,¥, in C’,
operation hold for real matrices and real {V,, Vv), .... Va} iS a basis for C* if and only if
scalars and can be derived using just field {v,,¥>,-.., V,} iS a basis for C’.
properties of R, so they are also tiue for . Prove that an n x n matrix U is unitary if
matrices with complex entries. Thus we can and only if the rows ot U form an
focus on the effect of the conjugation. From orthoncrmal basis for C’.
Theorem 9.1, we know that z + w =z + w,
36. Prove that, ifA is a square matrix, then AA*
zw =2zw, and z =z, forz, wE C. Use
is a Hermitian matrix.
these properties of conjugation to complete
the proof of Theorem 9.3.] 37. Prove that the product of two commuting
n X n Hermitian matrices is also a
33. Mark each of the following True or False.
Hermitian matrix. What can you say about
Assume that all matrices and scalars are
the sum of two Hermitian matrices?
complex.
38. Prove that the product of two m X n umitary
a. The definition ofa determinant, matrices is aiso a unitary matrix. What
properties of determinants (the transpose about the sum?
property, the row-interchange property,
39. Let 7: C? > C’ be a linear transformation
and so on), and techniques for computing whose standard matrix representation is a
them are developed using only field unitary matrix U. Show that (7(2), T(v)) =
properties of R in Chapter 4, and thus {u, v) for all u, v € C”. [Hinr: Remember that
they remain equally valid for square (u, v) = u*y.]
complex matrices.
40. Prove that for u, v € C", we have (u*v)* =
. Cramer’s rule is valid for square linear
systems with complex coefficients.
u*v = vu = u’V.
. IfA is any square matrix and det(A) # 0, 41. Describe the unitary diagonal matrices.
then det(iA) # det(A). . Prove that, if U is unitary, then U, U’, and
. If U is unitary, then U~! = UT. U* are unitary matrices also.
. If U is unitary, then (U)"' = U’. 43. A square matrix A is normal if A*A = AA’.
. The Euclidean inner product in C" is not a. Show that every Hermitian matnx is
commutative. normal.
. For u,v € C’, we have (u, v) = (v, u) if b. Show that every unitary matrix is
and only if (u, v) is a real number. normal.
c. Show that, if A* = —A, then A is normal.
. For a square matrix A, we have det(A) =
det(A). 44, Let A be an n X n matrix. Referring to
i. For a square matnx A, we have det(A*) = Exercise 43, prove that, if A is normal, then
dei{A). ||Az|| = ||A*2|| for all z € C’.
- Hf ("1s a unitary matrix, then 45. Prove the converse of the statement in
dof *) = +1. Exercise 44.
MATLAB
MATLAB can work with complex numbers. When typing a complex number a + 0!
as a component of a vector or as an entry of a matrix, be sure to type at+b«i with
the * denoting multiplication and with no spaces before or aiter the + or the *. A
space before the + would cause MATLAB to interpret a +bsi as two numbers, the
real number a followed by another entry containing the complex number Lv.
93 EIGENVALUES AND DIAGONALIZATION 4/5
Our main goal in this section is to extend this result to complex matrices, as
follows:
We will prove this theorem, which has the theorem for real symmetric matrices
as an easy corollary.
110i
A={ 0 2 0}.
-101
1-A 0 i
det(A-AN=| 0 2-A O 2)
f=(2-ay(lL—-aypt
-i OO I-A
= (2 — AL — 2A + — 1) = -A(2 — AY.
The three roots of —A(2 — A)’ = O are A, = 0, A, = A, = 2.
For the eigenvalue A, = 0, we have
Loz 10 i
A-AJ=| 02 0/~/0
1 Ol,
-i01]; {090
9.3 EIGENVALUES AND DIAGONALIZATION 477
—i
which gives the eigenspace FE, = sp | ; . For the double root a, = A; = 2, we
have ]
-! 0O #8 1 0O-i
A-27/=; 0 0 O;~]0 O 9},
-i 0-1 0 0 0
it |0
which gives the two-dimensional eigenspace E, = sp ‘} " .
1} |0
0 i ;
——
=_
EXAMPLE 2 Let
A = 2 0|. Find a matrix C such that C7'AC is a diagonal matnx.
Co
0 1
=
Ena a
l
478 CHAPTER 9 COMPLEX SCALARS
and that it has the double eigenvalue A, = A, = 2 with eigenspace
1} |0
£, = sp] ]O},]1]).
\ULJ 10);
Thus, the algebraic multipiicity of each eigenvalue is equal to its geometric
multiplicity. and the vectors shown form a basis for C”. Therefore, the matrix
-—1 10
C=)|001
1 10
is invertible and diagonalizes A; and we must also have C~'AC = D, where
000
D=|0 2 0}.
00 2 .
jie 1
A=!0
i 21
loo1
is diagonalizable.
SOLUTION The eigenvalues of the upper-triangular matrix A are A, = A, = and A, = 1. We
focus our attention on the eigenvalue / of algebraic multiplicity 2. For A to be
diagonalizable, its eigenspace E, must have geometric multiplicity 2. The
elgenspace is the nullspace of the matrix
0 ¢ I
A-As=|0 0 2: |,
10 O1l-i
and the nullspace has dimension 2 if and only if the matrix has rank 1, which is
the case if and onlyifc=0.
that B = U~'AU. Because the inverse of 2 unitary matrix is unitary and be-
cause a product of unitary matrices is unitary, we can show that unitary
equivalence is an equivalence relation. Thus, 4 is unitarily equivalent
to itself; and if A is unitarily equivalent to B, then B is unitarily equivalent
to A; if furthermore B is unitarily equivalent to C. then -4 is unitarily equis-
alent to C.
Now we achieve the main goal of this section: to prove that Hermitian
matrices are unitarily equivalent to a diagonal matrix. That is, a Hermitian
matrix can be diagonalized using a unitary matrix. This follows from a very
important result known as Schur’s lemma (or Schur’s unitary triangularization
theorem), which we state, deferring the proof until the end of this section.
PROOF By Schur’s lemma, there exists a unitary matrix U such that U~'AU is
an upper-triangular matrix. Because U 1s unitary, we have U*U = J,so U"! =
U*; and because A is Hermitian, we also kwew that A* = A. Thus, we have
-ii
=| 0 0
1 1
diagonalizes A. Notice that the inner product of any two distinct column
vectors of this matrix is zero, so the column vectors are orthogonal. We need
only normalize them to length | in order to obtain a unitary matrix that
diagonalizes A. Thus, such a matrix is
,{-i i 1
U=| 0 0 V2I.
vat Oj .
—1 i jlti
A=] -i 1 0
1-7 0 l
-l-AzA i l+i
|A -— All = —j 1—A 0
l-i 0 1-A
7 y _ yllb mA 0 | | -) 0
=Cl al 9 1 Nt-1 t-alt
-i 1-A
(1 + Al, i 0 |
= (-1 —AXl — AP - (-DA -— A+ (lL + (- DU — O00 - A)
= (1 -ay(’- 1-1-2) =(1 - Aa’? - 4).
Thus, the eigenvalues are A, = 1, A, = 2, and A; = —2. To find U, we need only
compute one eigenvector of length | for each of these three distinct
eigenvalues. The three eigenvectors we obtain must form an orthonommal set.
according to Theorem 9.6.
For A, = 1, we find that
—2 @« Iti
A-I=| -i 0 Of,
1-i 0 0
SO an eigenvector 1s
0
CHAPTER 9 COMPLEX SCALARS
3 i iti] fl -i Oo] fi -i 0
A-2={ -i -t 0 {~]O -2% 1+ij~]0 1 G- 12),
l-i 0 -1 O1+i -1] [0 0 0
| | i 1+il l on
At+2=-!' -— 3 O'!~|0 2 -1+i ~ I
1-i 0 3 0 -l1-i 1 | [0 0
and a corresponding eigenvector is
-3-3i
vy,=/ l-t
2
We normalize the vectors v,, v,, and v, and form the column vectors in U from
the resulting vectors of magnitude i, obtaining
Exercises 25 and 26 ask you to prove the following theorem, which gives ‘
criterion for A to be unitarily diagonalizable in terms of matrix multiplicatio”
93 EIGENVALUES AND DIAGONALIZATION 483
ll
EXAMPLE 6 Determine all values of a such that the matrix
is unitarily diagonalizable.
SOLUTION In order for the matrix to be unitarily diagonalizable, we must have AA* = A*A
so that
| i ae 3
= 3 “ill
if
Equating entnes, we obtain
row |,column 1: 1+ |a?=1+4, — so |al = 2,
row l,column2: 2i- ai = —ai + 2i,
row 2,column 1: —2i + @i= ai — 21,
row 2,column2: 4+1= Ila’ +1, so fal =2.
Clearly these conditions are satisfied as long as |a| = 2, so a can be any number
of the form x + yi, wherex?+y=4.
A,
0
U,*\A\v,) =
484 CHAPTER 9 COMPLEX SCALARS
A,X Xr X
0
U*AG, = ) (1)
0
where we have denoted the (n — 1) X (n — 1) submatrix in the lower right-hand
comer of U,*AU, by A,. By our induction hypothesis, there exists an (m — 1) x
(1 — 1) unitary matrix C such that C*A,C= B, where Bis upper triangular. Let
[1 00---0
U, =]. (2)
lo |
where we have tised a symbolic notation similar tc that in Eq. (1). Because Cis
unitary, it is clear that U, is a unitary matrix. Now let U = U,U,. Because
U*U = U,*(U,*U,)U, = UTU, = U,*U, = I, we see that Uis a unitary matrix.
Now
U*AU = U,*(U, *AU,)U,. (3)
The matrix in parentheses in Eq. (3) 1s the matrix displayed in Eq. (1). From
our definition of U, in Eq. (2), we see that the (n — 1) < (m — 1) block in the
lower right-hand corner of U*AU is C*A,C = B. Thus, we have
“) SUMMARY
1. Ann X nmatrix A is diagonalizable if and only if C’ has a basis consisting
of eigenvectors of A. Equivalently, each eigenvalue has algebraic multi-
plicity equal to its geometric multiplicity.
2. Every Hermitian matrix is diagonalizable by a unitary matrix.
Every Hermitian matrix has real eigenvalues.
93 EIGENVALUES AND DIAGONALIZATION 485
EXERCISES
_—
1 37
In Exercises 1-12, find a unitary matrix U and a is unitarily diagonalizable.
diagonal matrix D such that D = U~AU for the . Find all a, 6 € C such that the matrix
aN
—
given matrix A.
E
4 is unitarily diagenalizable.
Laat ] 2 A=| |-2i 1| i
—-i i 7. Prove that every 2 x 2 real matrix that is
_—
unitarily diagonalizable las one of the
3. A=_
[¢ H
l l+i 4.A4=_{ 9 3-1
Li | A ls +i 0 || following forms: [@ 4 ab f
|b d]’
OQ i 0 1 -i 0
a,b,dER.
5 A=|-i 0 0 6 A=l|i 1 0
10 0 1 00 2 i-!l l
8. Determine whether the inatrix} | -/ -!
_—
l 2-21 0 -| } I
7.A=|2 + 2i -!1 0
is unitarily diagonalizable.
0 0 3
9. Mark each of the following True or False.
_—
HL — i normal.
Oo
21. Argue directly from Theorem 9.5 that b. Prove that ann X n normal
eigenvectors from different eigenspaces of a upper-tniangular matnx B must be
Hermitian matnx are orthogonal. diagonal. [Hint: Let C = B*B = BB*.
22. Suppose that A is ann X n matrix such that Equating the computations of c,, from
A* = —A. Show that B*B and from BB*, show that b,, = 0 for
| <j <n. Then equate the computations
a. A has eigenvalues of the form ri, where
reR,
of c,, from B*B and from BB* to show
that 5, = 0 for 2 <j = n. Continue this
b. A is diagonalizable by a unitary matrix.
process to show that B is lower
{Hitt FOR BOTH PaRTs: Work with iA.]
triangular.]}
23. Prove that an n X m matrix A is unitarily c. Deduce from parts a and b that a normal
diagonalizable if and only if |[Av|| = ||A*v|| matnix is unitarily diagonalizabie.
for allv € C’,
24. Prove that a nomnal matnix is Hermitian if
and only if all its eigenvalues are in R.
25. s. Prove that a diagonai matrix is normal. a In Exercises 27 and 28, use the command
b. Prove that, if A is normal and B is [U, D] = eig(A) in MATLAB to work the
unitarily equivalent to A, then B is indicated exercise. If your MATLAB answer U for
normal. the unitary matrix differs from the U that we
c. Deduce from parts a and b that a found using pencil and paper and put in the
unitarily diagonalizable matrix is normal. answers at the end of our text, explain how you
26. 2. Prove that every normal matrix A is can get from one answer to tie other.
unitanily equivalent to a normal
upper-triangular matnx B. (Use Schur’s 27. Exercise 9
lemma and part b of Exercise 25.} 28. Exercise 1}
Jordan Blocks
5 10
J=|050
005
is not diagonalizable.
94 JORDAN CANONICAL FORM 487
SOLUTION We see that 5 is the only eigenvalue of ¥ and that it has algebraic multiplicity
3. However,
010
J- 57=]|9 0 0},
vo 0 6
which shows that the eigenspace E, has dimension 2 and has basis {e,, e;}.
Thus, the geometric multiplicity of the eigenvalue 5 is only 2, and J is not
diagonalizable. We cannot find a basis for R? (or even C’) consisting entirely of
eigenvectors of J. "
Let us examine the matnx Jin Example | a bit more. Notice that, although
J — 5I has a nullspace of dimension 2, the matrix (J — 5J)’ is the zero matrix
and has all of C? as nullspace. Moreover. multiplication on the leit by J — SJ
carnies e, into e, and carries both e, and e, into 0. We say that J — 57 annihilates
e, and e,. The action of J — 5/ on these standard basis vectors is denoted
schematically by the two strings
e,->e,- 0, (1)
J-— SI e, > 0.
Diagram (1) also shows that (J — 5/)? maps each of these basis vectors into 0.
Because (J — Se, = e,, we have Je, = Se, + e,, whereas Je, = Se, and Je, = 5Se;.
EXAMPLE 2 Let
41000
0A100
J=|00A 1 0. (2)
000A1
O000A
Discuss the action of J — AJ on the standard basis vectors, drawing a schematic
diagram similar to diagram (1). Describe also the action of J on the vectors in
the standard basis.
SOLUTION We find that
610900
00100
J--A=|000
1 Ol.
00001
00000
J-d’Al e7e7e7e,7e,
> 0. (3)
488 § CHAPTER9 = COMPLEX SCALARS
Left-multiplication by J yields
Thus, the matnx Jin Example 2 1s a Jordan block. However, the matrix in
Example | is not a Jordan biock, since the entry 5 at the bottom of the diagonal
does not have a I just above 1. A Jordan block has the properties described in
the next theorem. These properties were illustrated in Example 2, and we leave
a formal proof to you if you desire one. Notice that, foran m x m Jordan block
A 10 00
OA] 00
J=
000 A 1
000 OA
we have just one string:
f-i 1000009
0-i 1000 0 0
00-000 0 0
00 0-i 100 0
J=19 9 0 0 -i 0 0 OF (4)
000002 0 0
00000051
00000005
As the shading indicates, this matrix J is comprised of four Jordan blocks,
placed cormer-to-corner along the diagonal.
EXAMPLE 4 Describe the effect of matrix J in Eq. (4) on each of the standard basis vectors
in C*. Then give the eigenvalues and eigenspaces of J. Finally, find the
dimension of the nullspace of (J — AJ) for each eigenvalue A of J and for each
positive integer k.
490 CHAPTER 9 COMPLEX SCALARS
The eigenvalues ofJ are —i, 2, and 5, which have algebraic multiplicities of 5,
|, and 2, respectively. The eigenspaces ofJ are E_, = sp(e,, e,), E, = sp(e,), and
E; = sp(e,), aS you can easily check.
The effect ofJ — (—i)J on the first five standard basis vectors 1s given by the
two strings
HISTORICAL NOTE Tue JoRDAN CANONICAL FORM appears in the Treatise on Substitutions and
Algebraic Equations, the chief work of the French algebraist Camille Jordan (1838-1921). This
text, which appeared in 1870, incorporated the author’s group-theory work over the preceding
decade and became the bible of the field for the remainder of the nineteenth century. The theorem
containing the canonical form actually deals not with matrices over the real numbers, but with
matrices with entries from the finite field of order p. And as the title of the book indicates, Jordan
was not considering matrices as such, but the linear substitutions that they represented.
Camille Jordan, a brilliant student, entered the Ecole Polytechnique in Paris at the age of !7
and practiced engineering from the time of his graduation until 1885. He thus had ample time for
mathematical research. From 1873 until 1912, he taught at both the Ecole Polytechnique and the
College de France. Besides doing seminal work on group theory, he is known for important
discoveries in modern analysis and topology.
94 JORDAN CANONICAL FORM =—s_« 491
EXAMPLE 5 Suppose a 9 x 9 Jordan canonical form J has the following properties:
1. (J — 3if) has rank 7 fork = 1, rank 5 fork = 2, and rank 4 fork = 3.
2. (J + fy has rank 6 forj = | and rank § forj = 2.
Find the Jordan blocks that appear in J.
SOLUTION Because the rank ofJ ~ 3i/ is 7, the dimension of its nullspace is 9 — 7 = 2, so
3i is an eigenvalue of geometric multiplicity 2. It must give rise to two Jordan
blocks. In addition, J — 3i/ must annihilate two eigenvectors e, and e, in the
standard basis. Because the rank of (J — 3i/)? is 5, its nullspace must have
dimension 4, so ina diagram of the effect of J — 3i/ on the standard basis, we
must have (J — 3i/)e,,, = e, and (J— 3il)e,,, = e,. Because (J — 3i7)‘ has rank 4
for k = 3, its nullity is 5, and we have just one more standard basis
vector—either e,., or e,,,—that is annihilated by (J — 3i/)’. [hus, the two
Jordan blocks in J that have 3i on the diagonal are
3i 1 «<0 35 1
J,=|0 3i 1] and J,= S sil
00 3
Because J + 1/ has rank 6, its nullspace has dimension 9 — 6 = 3, so —1 is an
eigenvalue of geometric multiplicity 3 and gives nse to three Jordan blocks.
Because (J + J) has rank 5 for j 2 2, its nullspace has dimension 4, so (J + J)
annihilates a total of four standard basis vectors. Thus, just one of these
Jordan blocks is 2 x 2, and tne other two are 1 x |. The Jordan blocks arising
from the eigenvalue —1 are then
-!| 1
4 =| 0 a and J, = J; = (-1).
The matrix J might have these blocks in any order down its diagonal.
Symbolically, we might have
Jordan Bases
ooo &
on
OOO
ON
om
iI
|
ooo
oOo
om
94 JORDAN CANONICAL FORM 493
0 50 0 1
0 00 0 0
A-AI=A-21=|0 0-3 0 QO
0 0 0-3 Q
0 0 0 0-3
has rank 4 and consequently has a nullspace of dimension 1. We find that
0 0 0 0-3
0 0 0 0 O
(A-27y’7=|0 0 9 O ODO,
0 0 0 9 QO
0 0 0 0 9
which has rank 3 and therefore has a nullspace of dimension 2. Furthermore,
00
0 0 9
00 0 0 0
(4-217 =|0 0-27 0 0
0 0 0-27 0
0 0 0 0-27
has the same rank and nullity as (A — 2/)*. Thus we have Ab, = 2b, and
Ab, = 2b, + b, for some Jordan basis B = (b,, b,, b;, b,, bs) for A. There is just
one Jordan block associated with A, = 2—namely,
9 30003
09000
(4+17=|0 0000
00000
00000
494 CHAPTER 9 COMPLEX SCALARS
again has rank 2 and nullity 3. Thus, a Jordan canonical form for A is
21 0 0 0
02 0 0 0
J=|0 0-1 0O 0
00 0-1 0
00 0 0-1 .
[0 0 —1]
0 0 OO ©
b, =/}1], b,=|0}, and b,=
0 1
0 L05
Ww
b> b> 0,
b, — 0.
To find the first and longest of these strings, we compute a basis {v,, v,, . . . » Ys
for the nullspace N, of (A — AI)’. The preceding strings show that multiplica-
tion of all of the vectors in N; on the left by (A — AI) yields a space of di-
mension 1, so at least one of the vectors v; has the property that (A — AI)*v, # 0.
9.4 JORDAN CANONICAL FORM 495
A=[f 2c‘}
If c = 10-'™, then the Jordan canonical form of A has | as its entry in the upper
nght-hand corner; but if c = 0, that entry is 0.
PROOF We use a proof due to Filippov. First we note that it sullices to prove
the theorem for matrices A having 0 as an eigenvalue. Observe that, if A is an
eigenvalue of A, then 0 is an eigenvalue of A — A/. Now if we can find C such
that C"'(A — AI)C = Jis a Jordan canonical form, then C"'AC = J + Al is also
a Jordan canonical form. Thus, we restrict ourselves to the case where .4 has an
eigenvalue of 0.
In order to find a Jordan canonica! fcrm for A, it is useful to consider also
the linear transformation T: C" — C’, where 7(z) = Az; a Jordan basis for A is
considered to be a Jordan basis for T. We will prove the existence of a Jordan
basis for any such linear transformation by induction on the dimension of the
domain of the transformation.
If T is a linear transformation of a one-dimensional vector space sp(z),
then 7(z) = Azfor some A € C, and {z} is the required Jordan basis. (The matrix
of 7 with respect to this ordered basis is the 1 X 1 matrix [A], which is already a
Jordan canonical form.)
Now suppose that there exist Jordan bases for linear transformations on
subspaces of C” of dimension less than n, and let T(z) = Az for z € C’ and an
n X nmatnx A. As noted, we can assume that zero is an eigenvalue of A. Then
rank(A) < n; let r = rank(A). Now T maps C’ onto the column space of A that 1s
of dimension r < n. Let T’ be the induced linear transformation of the column
space of A into itself, defined by 7'(v) = 7(v) for vin the column space of A. By
our induction hypothesis, there is a Jordan basis
B’ = (u,, uh, ..., U,)
Because the vector at the beginning of the jth string is in the column space of A,
it must have the form Aw, for some vector w, in C”. Thus we obtain the vectors
w,, W,..., W, illustrated in Figure 9.11 for s = 2.
Finally, the nullspace ofA has dimension n — r, and we can expand the set
ofs independent vectors in S to a basis for this nullspace. This gives rise to
n—r-—Ss more vectors ¥,, Vi,...,V,-r-~ Of course, each v, is an eigenvector
with corresponding eigenvalue 0.
“e claim that
can be reordered to become a Jordan basis B forA (and of course for T). We
reorder it by moving the vectors w,, tucking each one in so that it starts the
appropriate string in B’ that was used to define it. For the situation in Figure
9.11, we obtain
(U,, U,, Us, W,, Uy, Us, W,, Us, - U,V, ~~ = 5 Mapp)
as Jordan basis. From our construciion, we see that B is a Jordan basis for A if
it is a basis for C". Because there are r + 5 + (n — r — s) = n vectors in all, we
need only show that they are independent.
Suppose that
n-r-
i jel k=I
Because each Au, is either of the form Au; or of the form Au, + u,_,, we see
that the first sum is a linear combination of vectors u;. Moreover, these vectors
Column space of A
Nullspace of A
*
c"
FIGURE 9.11
Construction of a Jordan basis for A (s = 2).
498 CHAPTER 9 COMPLEX SCALARS
Au; do not begin any siring in B’. Now the vectors Aw, in the second sum
are vectors u; that appear at tne start of the s strings in 3’ that end in S. Thus
they do not appear in the first sum. Because B’ is an independent set,
all the coefficients c, in Eq. (7) must be zero. Equation (6) can then be
written as
a-r~—-s
2a a; = » —d,y,. (8)
Now the vector on the left-hand side of this equation lies in the column
space of A, whereas the vector on the right-hand side is in the nullspace of A.
Consequently, this vector lies in S and is a linear combination of the s basis
vectors u; in S. Because the v, were cbtained by extending these s vectors to
a basis for the nullspace of A, the vector 0 is the only linear combination of
the v, that lies in S. Thus, the vector on Loth sides of Eq. (8) is 0. Because
the v, are independent, we see that all d, are zero. Because the u, are inde-
pendent, it follows that the a; are all zero. Therefore, B is an independent
set of n vectors and is thus a basis for C". We have seen.that, by our construc-
tion, it must be a Jordan basis. This completes the induction part of our
proof, demonstrating the existence of a Jordan canonical form for every
square matrix A.
Our work prior to this theorem makes clear that the Jordan blocks
constituting a Jordan canonical form for A are completely dctermined by the
ranks of the matrices (A — AJ)‘ for all eigenvalues A of 4 and for all positive
integers k. Thus, a Jordan canonical form J for A is unique except as to the
order in which these blocks appear along the diagonal of J. a
SUMMARY
1. A Jordan block is a square matrix with all diagonal entries equal, all entries
immediately above diagonal entries equal to 1, and all other entries equal
to 0.
2. Properties of a Jordan block are given in Theorem 9.8.
3. A square matrix is a Jordan canonical form if it consists of Jordan blocks
placed cornertocorner along its main diagonal, with entries elsewhere equal
to 0.
4. A Jordan basis (see Definition 9.8) for an n X n matrix A gives rise to a
Jordan canonical form J that is similar to A.
5. A Jordan canonical form similar to an n X n matrix A can be computed if
we know the eigenvalues A, of A and if we know the rank of (A — AJ)‘ for
each A, and for all positive integers k.
6. Every square matrix has a Jordan canonical form; that is, it is similar to a
Jordan canonical form.
9.4 JORDAN CANONICAL FORM 499
| EXERCISES
~.
oooo~.O0O0©
6 «0 ©
Oo «~ =
Qonooceoecoe
OOOO CSO
O00
matrix 1S @ Jordan canonical form.
—————
oo oo
oo oo oceo
‘0 00 3 00 10.
oo
14.|9 00 2.}0
3 1
ooo.
10 0 0 6 03
eoo
[3100 1000
N=
el
0310 4 [0200
=
3- Jo 021 ‘10030
10 002 10004 In Exercises 11-14, find a Jordan canonical form
i100 for A from the given datu.
0-i 0 0
5-J00 31 11. Ais 5 X 5,A — 32 has nullity 2, (A — 3/)°
100 0 3 has nullity 3, (A — 37) has nullity 4,
(A — 31} has nullity 5 for k = 4.
‘2 1 0 O
5 |9 2 9 0 12. Ais 7 x 7,A + [has nullity 3, (A + J)‘ has
“10 0 ¢ O nullity 5 for k = 2; A + has nullity 1,
10 0 0-1 (A + if} has nullity 2 for j = 2.
13. Ais 8 x 8,A — [has nullity 2, (A — /)* has
nullity 4, (A — 1)‘ has nullity 5 for k = 3;
(A + 2I) has oullity 3 for j = 1.
In Exercises 7-10:
a) Find the eigenvalues of the given matrix J. 14. Ais 8 x 8;A + if has rank 4, (A + il) has
rank 2, (A + iJ) has rank 1, (A + i)‘ = O
b) Give the rank and nullity of (J — A} for each
for k = 4.
eigenvalue d of J and for every positive integer k.
¢) Draw schemata of the strings of vectors in the
standard basis arising from the Jordan blocks in J. In Exercises 15-22, find a Jordan canonical form
d) For each standard basis vector e,, express Je, as and a Jordan basis for the given matrix.
a linear combination of vectors in the standard
basis. -10 1a4
15. EE 16. 35 -41
-2 1 0 0 bid 3 0 |
| 9-2 1 6 17. |2 1 3 18.{ 2-2 1
0 0-2 1 Iso 4 -1 0 -1|
| 0 0 0-2
2 5 0 0 0
ri 0 0 0 0 0200 0
0 i 10 0 19./0 0-1 0-1
810 0 i 0 0 0 0 0-1 0
0 0 0-2 0 0 0 0 O-t.
0 0 0 0-2 fi 0 0 0 0
1 0 0 0 1 0 i 0 0 0
02100 2.10 0 2 0 0
%9!/ 002 0 0 000 2 0
0002 1 > 0-1 0 2!
0 0002
500 CHAPTER 9 COMPLEX SCALARS
2 00 0 1 20000
0 2 0 0 0 02100
21./0 0 2 0 1 27. Let
A =|0 0 2 0 O|. Compute
00 0 2 9 0003 i
oO 8 0 0 2? 100090 3]
rro2 06) (A - 2194 - 3/7). Compare with Exercise 26.
Oo | 0 0 9 ig000
22,.;/0 0 1 0 | 071100
0 0 0 t 2 28. LetA=|0 0 ¢ 0 O|. Find
a polynomial in
0 0 0 0 1 00020
23. Mark each of the fellowing True or False. 10 000 2]
A (that is, a sum of terms @,4’ with a term
___. a. Every Jordan block matrix has just one aol) that gives the zero matrix. (See Exercises
eigenvalue. 24-27.) r
___ b. Every matnx having a unique eigenvalue
is a Jordan block. 29. Repeat Exercise 28 for the matrix A =
___¢. Every diagonal matrix is a Jordan -! 1 0 0 =O
canonical form. 6-1 1 0 0
___d. Every square matrix is similar to a 0 0-1 O O}.
Jordan canonical form. 0 ok Od
___ e. Every square matrix is similar to a 000 0 it
unique Jordan canonical form. . The Cayley—-Hamilton theorem states that, if
__. f. Every 1 x | matrix is similar to a unique P(A) = a,A" + +++ + aA + ay is the
Jordan canonical form. characteristic polynomial of a mainx A, then
___ g. There is a Jordan basis for every square p(A) = a,A"+ +++ +a,A + af = O, the
matrix A. zero matrix. Prove it. [Hin1: Consider
__.h. There is a unique Jordan basis for every (A — AJ)"'b, where b is a vector in a Jordan
square matrix A. basis corresponding to A,.] [n view of
—_— i. Every 3 X 3 diagonalizable matrix 1s Exercises 24-29, explain why you expect
similar to exactly six Jordan cancnical PJ) to be O, where J is a Jordan canonical
forms. form for A. Deduce that p{A) = O.
_— j. Every 3 X 3 matrix is similar to exactly 31. Let T: C" > C’ be a linear transformaiion. A
six Jordan canonical forms. subspace W of C" is invariant under T if
T(w) © W for all w € W. Let A be the
0100 standard matrix representation of T.
24. LetA
_|90001
010 . Compute a. Describe the one-dimensional invariant
subspaces of T.
0000 b. Show that every eigenspace E, of T is
A’, A}, and A‘.
invariant under T-
25. Let A be an n X n upper-triangular -natrix c. Show that the vectors in any string in a
with all diagonal entries 0. Compute A” for Jordan basis for A generate an invariant
all positive integers m = n. (See Exercise subspace of T.
24.) Prove that your answer is correct. d. Is it true that, if S is a subspace of a
subspace W that is invariant under 7,
then S is also invariant under 7? Jf not,
give a counterexample.
e. Is it true that every subspace of R’
invariant under 7 is contained in the
nullspace of (A — AJ)’, where A is some
eigenvalue of T? If not, give a
counterexample.
9.4 JORDAN CANONICAL FORM 501
VN=AN
+ Oe
Vo = Ayy. + Crys, T-1 | QO
J=| 0-1 9,
10 0 4
HAPTER
The Gauss and Gauss—Jordan methods presented in Chapter | are fine for
solving very small linear systems with pencil and paper. Some applied
problems—in particular, those requiring numerica! solutions of differential
equations—can lead to very large linear systems, involving thousands of
equations in thousands of unknowns. Of course, such large linear systems
must be solved by the use of computers. That is the primary concern of this
chapter. Although a computer can work tremendously faster than we can with
pencil and paper, each individual arithmetic operation does take some time,
and additional time is used whenever the value of a variable is stored or
retrieved. In addition, indexing in subscripted arrays requires time. When
solving a large linear system with a computer, we must use as efficient a
computational algorithm as we can, so that the number of operations required
is as small as practically possible.
We begin this chapter with a discussion of the time required for a
computer to execute operations and a comparison of the efficiency of the
Gauss method including back substitution with that of the Gauss—Jordan
method.
Section 10.2 presents the LU (lower- and upper-triangular) factorization of
the coefficient matrix of a square linear system. This factorization will appear
as we develop an efficient algorithm for solving, by repeated computer runs,
many systems all having the same coefficient matrix.
Section 10.3 deals with problems of roundoff and discusses ill-conditioned
matrices. We will see that there are actually very small linear systems,
consisting of only two equations in two unknowns, for which good computer
programs may give incorrect solutions.
502
10.1 CONSIDERATIONS OF TIME S03
where we had earlier set N = 10,000. We obtained the data shown in the top
row of Table 10.1 We also had the computer execute loop (3) with + replaced
by — (subtraction), then by * (multiplication), and finally by / (division). We
similarly timed the execution of 10,000 flops, using
TABLE 10.1
Time (in Seconds) for Executing 10,000 Operations
Addition
{using (3)) 37 44 8 9
Subtraction
[using (3) with —] 37 48 8 9
Multiplication
[using (3) with +] 39 53 9 1!
Division
{using (3) with /] 47 223 9 15
Flops
[using (4)] 143 162 15 18
Point 2 Division took the mosi time of the four arithmetic operations. Indeed,
our PC found double-precision division in interpretive BASIC very
time-consuming. We should try to minimize divisions as much as
possible. For example, when computing
Counting Operations
We turn to counting the flops required to solve a square linear system AX = b
that has a unique solution. We assume that no row interchanges are necessary,
which is frequently the case.
Suppose that we solve the system Ax = b, using the Gauss method with
back substitution. Form the augmented matrix
Qa, a, *** a, | O,
Q, Ay *** A, | b,
[A | b] =
For the moment let us neglect the flops performed on b and count just the flops
performed on A. In reducing the n x n matrix A to upper-triangular form U, we
execute n — i flops in adding a multiple of the first row of [A | b] to the second
row. (We do not need to compute the zero entry at the beginning of our new
second row; we know it will be zero.) We similarly use 7 — | flops to obtain the
new row 3, and so on. This gives a total of (n — 1) flops executed using the
pivot in row 1. The pivot in row 2 is then used for the (n — 1) x (n — 1) matrix
obtained by crossing off the first row and column of the modified coefficient
matrix. By the count just made for the n x n matrix, we realize that using this
pivot in the second row will result in execution of (n — 2) flops. Continuing in
this fashion, we see that reducing A to upper-triangular form U will require
(n-1p+(n-2P+(n-3P t+ 41 (5)
flops, together with some divisions. Let’s count the divisions. We expect to use
just one division each time a row is multiplied by a constant and added to
another row (see point 2). There will be n — | divisions involving the pivot in
row |, n — 2 involving the pivot in row 2, and so on, for a total of
and
2 n(n + 1)(2n + 1)
P+2+3P+es-s
tn 5 (8)
506 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
"7 _h (11
divisions. For large n, this is of order of magnitude n’/2, which is inconsequen
tial in comparison with the order of magnitude n3/3 for the flops. Exercise
shows that the number of flops performed on the column vector b in reducin
[A | b] is again given by Eq. (11), so it can be neglected for large n in view of th
order of magnitude 7’/3. The result is shown in the following box.
_ (n+ In win 2
1+2+3+4+---4n 2 27 5 (12)
for the flops in this back substitution. Again, this is of lower order of
magnitude than the order of magnitude n°/3 for the flops required to reduce
[A | b}] to [Uj cl].
Combining the results shown in the tast two boxes, we arrive at the
following flop count.
Flop Count for Solving Ax = b, Using the Gauss Method with Back
Substitution
If A is an n X n matrix, the number of flops required to solve the
system Ax = b using the Gauss method with back substitution is of
order of magnitude 77/3 for large n.
EXAMPLE 1 _For the computer that produced the execution times shown in Table 10.1, find
the approximate time required to solve a system of 100 equations in 100
unknowns, using single-precision arithmetic and (a) interpretive BASIC.
(b) compiled BASIC.
SOLUTION From the flop count for the Gauss method, we see that solving such a system
with n = 100 requires about 100°/3 = 1,000,000/3 flops. In interpretive
BASIC, the time required for 10,000 flops in single precision was about 143
seconds. Thus the 1,000,000/3 flops require about (1,000,000/30,000)143 =
14,300/3 seconds, or about | hour and 20 minutes. In compiled BASIC, where
10,000 flops took about 15 seconds, we find that the time is approximately
(1,000,000/30,000)!5 = 1500/3 = 500 seconds, or about 8 minutes. The PC
used for Table 10.1 is regarded today as terribly slow. «
508 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
EXAMPLE 2 Find the number of flops required to compute the dot product of two vectors,
each with m components.
SOLUTION The preceding discussion shows that the dot product uses n flops of form (13),
corresponding to the values 1, 2,3,...,aforK. s
fo
SUMMARY
Lo
5. The formulas
1+2+3t:-- n= Meth
and
EXERCISES
all of these exercises, assume that no row 3, B+C 4. 4 5. BA
ctors are interchanged in a matrix. 6. A} 7. A‘ 8. A?
. 5. A$ . A% 11. A”
. Let A be ann X n matrix. Show that, in 7. A 10. A
reducing [A | bj to [Uj c] using the Gauss |
method, the number of flops performed on b 12. Which of the following is more efficient with
is of order of magnitude 7/2 for large n. a computer?
1. Let A be ann X n matrix. Show that, in a. Solving a square system Ax = b by the
solving Ax = b using the Gauss—Jordan Gauss—Jordan method, which makes each
method, the number of flops has order of pivot | before creating the zeros in the
magnitude n°/2 for large n. column containing it.
b. Using a similar technique to reduce the
| Exercises 3-11, let A be ann X n matrix, and system to a diagonal system Dx = c,
t Band C be m X n matrices. For each matrix, where the entries in D are not necessarily
1d the number of flops required for efficient all 1, and then dividing by these entries
»mputation of the matrix. to obtain the solution.
510 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
LINTEK contains a routine TIMING that can be 23. Run the routine TIMING in LINTEK to
used to time algebraic operations and flops. The time the Gauss method and the
program can also be used to time the solution of Gauss—Jordan method, starting with small
square systems Ax = b by the Gauss method with values of n and increasing them until a few
back substitution and by the Gauss-Jordan seconds’ difference in times for the two
methed. For a user-specified integer n = 80, the methods is obtained. Does the time for the
program generates the n X n matrix A and Gauss-Jordan method seem to be about 3
culumn vector b, where all entries are in the the time for the Gauss method with back’
interval [—20, 20]. Use TIMING for Exercises substitution? If not, why not?
22-24. 24. Continuing Exercise 23, increase the size of
n until the solutions take 2 or 3 minutes.
22. Experiment with TIMING to see roughly Now does the Gauss~Jordan method seem
how many of the indicated operations your to take about 3 times as iong? If this ratio is
computer can do in 5 seconds. significantly different from tuat obtained n
a. Additions Exercise 23, explain why. (The two ratios
b. Subtractions may or may not appear to be approximately
. Muttiplications the same, depending on the speed of the
. Divisions
ean
MATLAB
The command clock in MATLAB returns a row vector
that gives the date and time in decimal form. To calculate the elapsed time for a
computation, we can
set t0 = clock,
execute the computation,
give the command etime(clock,t0), which returns elapsed time since t0.
The student edition of MATLAB will not accept a value of i greater than 1024 in a
“FOR i = 1 ton” loop, so to calculate the time for more than 1024 operations ina
loop, we use a double
FOR g = ! toh, FORi=
ton
loop with which we can find the time required for up to 1024- operations. Note below
that the syntax of the MATLAB loops is a bit different from that just displayed.
Recall that the colon can be used with the meaning “through.” The first two lines
below define data to be used in the last line. The last line is jammed together so that
it will all fit. on one screen line when you modify the c=a+b portion in Exercise M5
to time flops. As shown below, the last line returns the elapsed time for h - n repeated
additions of two numbers a aid b between 0 and |: ~
A = rand(6,6); a = rand; b = rand; r = rand; j = 2; k = 4; m = 3;
= 100; n = 50;
t0=clock;for g=1:h,for i= l:n,c=a+b;end;end;etime(clock,t0)
512 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
M1. Enter in the three lines shown above. Put spaces in the last line only after
“for”. Using the up-arrow key and changing the value of n (and A if
necessary), find roughly how many additions MATLAB can perform in 5
seconds on your computer. (Remember, the time given is for 4 - n additions.)
M2. Modify the c=a+b portion and repeat M1 for subtractions.
M3. Modify the c=a+b portion and repeat M1 for multiplications
M4. Modify the c=atb portion and repeat M1 for divisions.
M5. Modify the c=a+b portion and =2peat M1 for flops A(k, j)=A(k,j)+r*A(m, j).
Delete c=atb first, and don’t insert any spaces!
From Section 10.1, we know that the magnitude of the number of flops
required to reduce A to an upper-tnangular matrix U is n3/3 for large n. We
want to avoid having to repeat all this work done in reducing A after the first
computer run.
We assume that A can be reduced to U without interchanging rows—that
is, that (nonzero) pivots always appear where we want them as we reduce A
to U. This means that only elementary row-addition operations (those that add
a multiple of a row vector to another row vector) are used. Recall that
a row-addition operation can be accomplished by multiplying on the left by an
n X n elementary matrix £, where E is obtained by applying the same
row-addition operation to the identity matrix. There is a sequence E£,, E,, . . - 9
1 3-1
A=| 2 8 4
-!1 3 4
EXAMPLE 2 Use the record in L in Example | to solve the linear system Ax = b given by
x, + 3x, —_ X3 = —4
4|
— 4]
_ Add —2 times row 1 to ~
f= 2 row 2. 10
e,, = 3 Add -3
ac 3 titimes row 2 2 to ~| “410l=e
— 30
10.2 THE LU-FACTORIZATION 515
If we put this result together with the matrix U obtained in Example |, the
reduced augmented matrix for the linear system becomes
1 3 -1 -
[U|jc]=|0 2 6 10}.
0 0 -15 | -30
Back substitution then yields
X=
—305 = 2
10
- 6(2) _ -2
me Ed,
x,=-44+3+2=1. a
EXAMPLE 3 Let
1-2 0 3 I!
—-2 3 #1 -6 —21
A=\-| 4-4 3) and |S
5-8 4 6G 23
Generate the matnx L while reducing the matrix A to U. Then use U and the
record in L to solve Ax = b.
SOLUTION We work in two columns again. This time we fix up a whole column of A in
each step:
Reduction of A Generation of L
fT 1-2 0 3 11000
_|-2 3 1 -6 01 0 8
A=!_1; 4-4 3 [= ° 01
5-8 4 0 0001
‘1-2 0 3] 1 00 0
0-1 1 O -2 1 0 0
~10 2-4 6 -1 0 | 0
10 2 4-15, 5 0 0 |,
‘1-2 0 31 1 0 0 0]
0-1 1 0 —2 i 0 0
~10 0-2 6 -1-2 1 0
10 0 6 15, | §-2 0 L
‘1-2 0 3 1 00 0
0-1 1 0 —2 1 0 0
~lo 0 -2 6/7 YU L£=)-1
-2 1. OF
10 0 (0 3] 1 5-2-3 J
516
c
CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
We now apply the record below the diagonal in L to the vector b, worki:
under the main diagonal down each column of the record in L in turn.
i] fa 11 11
—21 1 1 1
First column ol! L: b = ~11}~{-1])7}101 7 | 101
23{ | 23} {23} [-32
ti] Fou 11
1 1 1
second column of L: iol~} 1217} 121
— 32! 39 —30
li] fal
. l l
Third column of L: 1217 |12/-
— 30, L 6 |
1-2 0 3) 11
0-1 1 0 ]
0 0-2 64 12
0 0 0 3 6
n= 12 ~ 12 = 0,
—2
1-0
xX, = _]_ = ~l,
x= 11-2-6=3.
position where a zero is being created! The computer already has space
reserved for an entry there. Just remember that the final matrix contains the
desired entries of U on and above the diagonal and the record for L or —L
below the diagonal. We will always use —r rather than r as record entry in this
texi. Thns, for the 4 x 4 matrix in Example 3, we obtain
1-2 0 3
—2 -! 1 0
~| —2 -2 6|> Combined L\U display (2)
5-2 -3 3
where the black entries on or above the main diagonal give the essential data
for U, and the color entries are the essential data for L.
Let us examine the efficiency of solving a system Ax = b if U and L are
already known. Each entry in the record in L requires one flop to execute when
applying this record to reduce a column vector b. The number of entries is
n-1|)n pw aA
l+24---4¢n— 12! Wn mn
which is of order of magnitude n’/2 for large n. We saw in Section 10.1 that
oack substitution requires about n’/2 flops, too, giving a total of n’ flops for
large n to solve Ax = b, once U and L are known. If instead we computed A7!
and found x = A~'b, the product A~'b would also require n? flops. But there are
at least two advantages in using the LU-technique. First, finding U requires
about n°/3 flops for large n, whereas finding A! requires n? flops. (See Exercise
15 in Section 10.1.) If m = 1000, the difference in computer time is
considerable. Second, more computer memory is used in reducing [A | /] to
[7 | A~'] than is used in the efficient way we record L as we find U, illustrated in
the combined L\U display (2).
We give a specific illustration in which keeping the record L is useful.
EXAMPLE 4 Let
1 2-1 9
A=|-2 -5 3] and b=]-17}.
~!| -3 0 ~44
Solve the linear system 42x = b.
SOLUTION We view A>x = bas A(A’x) = b, and substitute y = Ax to obtain Ay = b. We can
solve this equation for y. Then we write A’x = y as A(Ax) = y or Az = y, where z
= Ax. We then solve Az = y for z. Finally, we solve Ax = z for the desired x.
Because we are using the same coefficient matrix A each time, it is efficient to
find the matrices L and U and then to proceed as in Example 3 to find y, z, and
X in turn.
We find that the matrices U and L are given by
1 2-1] 100
U=/0 -1 '; and L=!/-21 0}.
0 0-2 -I ll
518 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
9
b= 4 : a : . 6
LI.
\—7
y=| 17}.
18
To solve Az = y, we apply the record in L toy:
3
=|-7}.
—4
Finaliy, to solve Ax = z, we vn the record in L toz:
aH[1
The routine LUFACTOR in LINTEK can be used to find the matrices L
and U; it has an option for iteration to solve a system A”x = b, as we did in
Example 4.
The Factorization A = LU
: i O
=
Oo 2
KE
etc.
fwON
we
|
~1 00
i |
NO
1 0 0| -4l
2 1 O} 2
~l 3 41 4
c= -4,
2¢, +0, = 2, G =2+8=
10.
—c, + 3c, +c, = 4,
C4, =¢c, — 3c, + 4= -4- 30+ 4 = —30.
Notice that this is the same c as was obtained in Example 2. The back
substitution with Ux = c of Example 2 then yields the same solution x. &
uy,
Uy)
D=
Q Unn
10.2 THE LU-FACTORIZATION 521
PROOF Suppose thatA = L,D,U, andA = L,D,U, are two such factonzations.
Observe that both L,~' and L,"' are also lower triangular, D,~' and D,"' are
both diagonal, and U,~! and U,-' are both still upper triangular. Just think how
the matnx reductions of [L, | J] or [D, | 2] or [U, | 7] to find the inverses look.
Furthermore, L,~', £,-!, U,-', and U,-" have all their main diagonal entries equal
to I.
Now from L,D,U, = L,D,U,, we obtain
but now that some of these £,"' interchange rows, their product may not |
lower triangular. However, after we discover which row interchanges a
necessary, we could start over, and make these necessary row interchanges |
the matrix A before we start creating zeros below the diagonal. We can see th.
the upper-triangular matrix U obtained would still be the same. Suppose, fi
example, that to obtain a nonzero element in pivot position in the ith row v
interchange this ith row with a kth sow farther down in the matrix. The ne
ith row will be the same as though it had been put in the ith row position befor
the start of the reduction; in either case, it has been modified during tt
reduction only by the addition of multiples of rows above the ith row positio1
As multiples of it are now added to rows below the ith row position, the sam
rows {except possibly for order) below the ith row position are create:
whether row interchange is performed during reduction or is completed befor
reduction starts. .
Interchanging some rows before the start of the reduction amounts t
multiplying A on the left by a sequence of elementary row-interchang
matrices. Any product of elementary row-interchange matrices is called
permutation matrix. Thus we can form PA for a permutation matrix P, and P
will then admit a factorization PA = LU. We state this 1s a theorem.
1 3 2
A=|-2 -6 1
2 5 7
fi 3 2) fl 3 2
-2-6 1I/~]0 0 5|,
2 5 7! 10-1 3
and we now find it necessary to interchange rows 2 and 3, which will ther
produce the desired U. Thus we take
0
o-- SO
P=
oo;
0 >
l
10.2 THE LU-FACTORIZATION 523
and we have
1 3 2 1 3 2
PA=| 2 5 TI~j0O -I 3/=v.
—2 -6 1 6 0 5
1 0 0
L=| 2 1 OQ,
-2 0 1
1 3 2 1 0 olf) 3 2
PA={| 2 5 7T7l=; 2 1 OVO0-1 3/=L2U.
2-6 | > 0 11/0 0 5 .
Suppose that, when solving a large square linear system Ax = b with a
computer, keeping the record L, we find that row interchanges are advisable.
(See the discussion of partial pivoting in Section 10.3.) We could keep going
and make the row interchanges to find an upper-triangular matrix; we could
then start over, make those row interchanges first, just as in Example 7, to
obtain a matrix PA, and then obtain U and L for the matrix PA instead. Of
course, we would first have to compute Pb when solving Ax = b, because the
system would then become PA x = Pb, before we could proceed with the record
L and back substitution. This is an undesirable procedure, because it requires
reduction of both A and FA, taking a total of about 27/3 flops, assuming that A
is n X n and nis large. Surely it would be better to devise a method of
record-keeping that would also keep track of any row interchanges as they
occurred, and would then apply this improved record to b and use back
substitution to find the solution. We toss this suggestion out for enthusiastic
programmers to consider on their own.
~) SUMMARY
1. If A is an n X n invertible matrix that can be row-reduced to an
upper-triangular matrix U without row interchanges, there exists a lower-
triangular n X n matrix L such that A = LU.
2. The matrix Z in summary item | can be found as follows. Start with the
n X nidentity matrix /. If during the reduction of A to U, r times row 1 is
added to row k, replace the zero in row k and column / of the identity
matrix by —r. The final result obtained from the identity matrix is the
matrix L.
3. Once A has been reduced to Uand L has been found, a computer can find
the solution of Ax = b for a new column vector b, using about 1’ flops for
large n.
524 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
EXERCISES
1. Discuss briefly ihe need to worry about tue answer, using :natrix multiplication. Then solve
time required to create the record matnx L the system Ax = b, using P, L, and U.
when solving
Ax = by, D,...,d,
6 -5 [32
for a large n X n matrix A. [2 41 -3 6|
2. Is there any practical value in creating the 9. 4=|16 3 -8|,b=|17
record matnx L when one needs to solve 2-15 0
only a single square linear system Ax = b, as
rt | 3 -13
we did in Example 3?
10.4=|0 1 I/,b=|] 6
3-1 1 -7
In Exercises 3-7, find the solution oj Ax = b from
the given combined L\U display of the matrix A | -3
and the given vector b W.A=/3 7 2/,b=] 1
14-2 1 -2
3. L\U = 5 s}
-2
= 2]4 ‘1 -4 | -2 -9
10 2-1 I}, _] 6
12 A=\, 7 -2 1] = 10
4. L\U= | 72 5. =| 1 0 3 0-4 —16
~21 -7
fT 1 2-3 0 8
1-3 4 , 2|
5. 1\U=|-2 -1 9|,b= 3 _| 2 5-6 Of] ,_]17
0 1 -6 8 3. A=) yy yf =]W8
410-9 1 33
1-4 2 3]
6. L\U=|0 2 -1|,b=] 2 In Exercises 14-17, proceed as in Example 4 to
3 2-2 | 3{ solve the given system.
1 0 0 1 4
_}oO-14
7. INU=]_) 2J tt.>=|_g
3) } 7 14. Solve 4’x = bifA = 3 of b= i.
2 1-1 3 14
15. Solve A*x = bifA =|7! 3}
In Exercises 8-13, let A be the given matrix. Find 2 -3
a permutation matrix P, if necessary, and bal 144]
matrices L and U such that PA = LU. Check the -233
10.2 THE £U-FACTORIZATION 525
12 -| 2 1-3 4 =O
1 4 6-1 §
-! 0O 1
22. A=!|-2 0 3 321 = 4);
Solve
x =bifA=| 3 1 Ol,
6 1 2 8 -3
-2 0 4 4 1 3 2 1
27
=) 29). —20
122 96| fori= 1,
=), , 9} and] 53| 2, and 3,
5| respectively.
Exercises 18-20, find the unique Jactorization M1 L 26
1U for the given matrix A.
1 3-5 2 {ft
The matnx in Exercise 8 4-6 10 8 3
23. A=| 3 6-1 4 =7);
The matrix in Exercise 10
2 1 11-3 =#13
The matnx in Exercise | 1 -§ 3 1 4 2)3
Mark each of the following True or False.
0} |-48 87
_a. Every matrix 4 has an LU-factorization.
20| | 218 -151
_b. Every square matnx A has an
b, =| —9], Q|, and} 102
LU-factonzation.
—31 100 — 46
. ¢. Every square matrix A has a factonzation
P~'LU for some permutation matrix P. 43] | 143 223]
_d. If an LU-factonization of ann x « aatnx for i = i, 2, and 3, respectively.
A 1s known, using this factorization to -_
28. 4*x = b forA and b in Exercise 17 large n, and the solution for each column vector
29. A*x = b, for A and b, in Exercise 24 should require about r° flops. In Exercises 30-3
use the routine TIMING and see if the ratios of
The routine TIMING in LINTEK has an option times obtained seem to conform to the ratios of
to time the forraation of the combined L\U the numbers of flops required, at least as n
display from a matrix A. The user specifies the increases. Compare the time required to solve o:
system, incliding the computation of L\U, with
size n for ann X » linear system Ax = d. The
program then geiierates ann X n matrix A with the time required to solve a system of the same
entries in the interval [--20, 20] and creates the size using the Gauss method with back
L\U display shown in matrix (2) in the text. The substitution. The times should be about the sam
time required for the reduction is indicated. The
user may then specify a number s for solution of 30. n=4,5= 1,3. 12
Ax = b,, by... , b,. The computer generates a 31. n= 10:5= 1.5, 10
column vector and solves s systems, using the 32. n= 16:5= 1.8. 16
record in L and back substitution; the time to find
33. n= 24,5 = 1.12, 24
these s solutions is also indicated. Recall that the
reduction of A should require ubout n’/3 flops for 34. n= 30:5 = 1.15, 30
MATLAB
MATLAB uses LU-factorization, for example, in its computation of A\ B. (Recall
that A\B = A"'B if A is a square matrix.) In finding an LU-factorization of A,
MATLAB uses partial pivoting (row swapping to make pivots as large as possible)
for increased accuracy (see Section 10.3). It obtains a lower-triangular matnx L ar
an upper-triangular matrix U such that LU = PA, where P is a permutation matrix
as illustrated in Example 7. The MATLAB command
{L,U,P}] = lu(A)
produces these matrices L, U, and P. In pencil-and-paper work. we like small pivo
rather than large ones, and we swapped rows in Exercises §- 13 only if we
encountered a zero pivot. Thus this MATLAB command cannot be used to cneck
our answers to Exercises 8-13.
45.1 + .0725.
SOLUTION Because it can handle only three significant figures, our computer represents
the actual sum 45.1725 as 45.1. In other words, the second summanid .0725
might as well be zero, so far as our computer 1s concerned. The datum .0725 1s
completely lost. a
2 665
3 1000"
SOLUTION The actual difference is
Partial Pivoting
In row reduction of a matnx to echelon form, a technique called part
pivoting is often used. In partial pivoting, one interchanges the row in whi
the pivot is to occur with a row farther down, if necessary, so that the pis
becomes as large in absolute value as possible. To illustrate, suppose that r
reduction cf a matrix tc echelon form leads to an intermediate matrix
2 8-1 3 4
0-2 3-5 6
0 4 1 2 OF
0-7 3 1 4
Using partial pivoting, we would then interchange rows 2 and 4 to use t
entry —7 of maximum magnitude among the possibilities —2, 4, and —7 !
pivots in the second column. That is, we would form the matrix
2 8-13 4
0-73 1 4
EXAMPLE 3 Find the actual solution of linear system (1). Then compare the result with th
obtained by a three-figure computer using the Gauss method with ba
substitution, but without partial pivoting.
10.3 PIVOTING, SCALING, AND ILL-CONDITIONED MATRICES 52°
x, = 1,000,100
2“ 1,000,200”
_ _ 100,010,000 _ 1,000,000
x= (100 1,000,200 100 = + 000,200"
Thus, x, = .9998 and x, = .9999. On the other hand, our three-figure computer
obtains
x, = 1,
x, = (100 — 160)100 = 0.
The x,-parts of the solution are very different. Our three-figure computer
completely lost the second-row data entries 200 and 100 in the matrix when it
added 10,000 times the first row to the second row. «
EXAMPLE 4_ Find the solution to system (1) that our three-figure computer would obtain
using partial pivoting.
SOLUTION Using partial pivoting, the three-figure computer obtains
x, x,
71,000,000 100 000,000
200 —100 100]"
Exercise 2 illustrates that row reduction of matrix (3) by our three-figt
computer gives a reasonable solution of system (2). Notice, however, that '
now have to do some bookkeeping and must remember that the entries
column | of matrix (3) are really the coefficients of x,, not of x,. Fora matrix
any size, the search for elements of maximum magnitude and the bookkeep!
required in full pivoting take a lot of computer time. Partial pivoting
10.3 PIVOTING, SCALING, AND ILL-CONDITIONED MATRICES 531
Scaling
We display again system (2):
Taking this into account, onc usually programs row-echelon reduction so that
entries of unexpectedly small size are changed to zero. MATCOMP finds the
smallest nonzero magnitude m among ! and all the coefficient data suppised
for the linear system, and sets E = rm, where r is specified by the user. (Default
r is .0001.) In reduction of the coefficient matrix, a computed entry of
magnitude less than E is replaced by zero. The same procedure is followed in
YUREDUCE. Whatever computed number we program a computer to choose
for E in a program such as MATCOMP, we will be able to devise some linear
system for which E£ 1s either too large or too small.to give the correct result.
A procedure equivalent to the one in MATCOMP that we just outlined is
to scale the original data for the linear system in such a way that the smallest
nonzero entry is of magnitude roughly 1, and then always use the same value,
perhaps 107‘, for E.
532 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
ed cd
ax + by=r
cxtdy=s
1lO’x + Oy=1
Ox + 10°y = |
are actually perpendicular, the first one being vertical and the seco
honzontal. This is as far away from parallel as one can get! It annoyed t
author greatly to have the matrix
ie ,
0 10°
called “nearly singular.” The inverse is obviously
ie m
0 10°)
A scaling routine was promptly written to be executed before calling|
inversion routine. The matrix was multiplied by a constant that would bn
the smallest nonzero magnitude to at least 1, and then the inversion subré
tine was used, and the result rescaled to provide the inverse of the origi!
matrix. For example, applied to the matrix just discussed, this procedt
becomes
10-8 9 —
0 on multiplied by 10’ 5 becomes 10 0
| 0 1}
. 1 0
inverted becomes [is ,
\ll-Cconditioned Matrices
The line x + y = 100 in the plane has x-intercept 100 and y-intercept | 00, as
shown in Figure 10.1. The line x + .9y = 100 also has x-intercept 100, but it
has y-intercept larger than 100. The two lines are almost parallel. The common
x-intercept shows that the solution of the linear system
x+ y= 100
4
x+ Sy = 100 ©)
is x = 100, y = 0, as illustrated in Figure 10.2. Now the line .9x + y = 100 has
y-intercept 100 but x-intercept larger than 100, sc the linear system
x+y= 100
5
9x + y = 100 ©)
y y
4 t
x+y= 100
x+y = 100 Intersection
/
yp Xx —p- X
0 1008 0 - 100%
IGURE 10.1 FIGURE 10.2
he line x + y = 100. The lines are almost parallel.
534 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
has the very different solution x = 0, y = 100, as shown in Figure 10.3. Syste
(4) and (5) are examples of ill-conditioned or unstable systems: small chang
in the coefficients or in the constants on the right-hand sides can produce ve
great changes in the solutions. We say that a matrix A is ill-conditioned ii
linear system Ax = b having A as coefficient matrix is ill-conditioned. For ty
equations in two unknowns, solving an ill-conditioned system corresponds
finding the intersection of two nearly parallel lines, as shown in Figure 10
Changing a coefficient of x or y slightly in one equation changes the slope
that line only slightly, but it may generate a big change in the location of t
point of intersection of the two lines.
Computers have a lot of trouble finding accurate solutions of i
conditioned systems such as systems (4) and (5), because the small round
errors created by the computer can produce large changes in the solutio
Pivoting and scaling usually don’t help the situation; the systems are basical
unstable. Notice that the coefficients of x and y in systems (4) and (5) are
comparable magnitude.
Among the most famous ill-conditioned matrices are the Hilbert matrice
These are very bad matrices named after a very good mathematician, Dav
Hilbert! (See the historical note on page 444.) The entry in the ith row and j
column of a Hilbert matrix is I/(i + j — 1). Thus, if we let H, be the n x
Hilbert matrix, we have
11
; 153
Ll 5 tid
A,=), 4} A, =|2 3 4), and so on.
2 3! aid
34 5
y Y
A I
L- Intersection
ae
—
Intersection
> >.
It can be shown that H, is invertible for all n, so a square linear system H,x = b
has a unique solution, but the solution may be very hard to find. When the
matrix is reduced to echelon form, entries of surprisingly small magnitude
appear. Scaling of a row in which all entries are close to zero may help a bit.
Bad as the Hiibert matuiices are, poweis of them are even worse. The
software we make available includes a routine called HILBERT, which is
modeled on YUREDUCE. The computer generates a Hilbert matrix of the
size we specify, up to 10 x 10. It will then raise the matrix to the power 2, 4, 8,
or 16 if we so request. We may then proceed roughly as in YUREDUCE.
Routines such as YUREDUCE and HILBERT should help us understand this
section, because we can watch and see just what is happening as we reduce a
matrix. MATCOMP, which simply spits out answers, may produce an absurd
result, but we have no way of knowing exactly where things went wrong.
SUMMARY
1. Addition of numbers of very different magnitudes by a computer can cause
loss of some or all of the significant figures in the number of smaller
magnitude.
2. Subtraction of nearly equal numbers by a computer can cause loss of
significant figures.
3. Due to roundoff error, a computer may obtain a nonzero value for a
number that should be zero. To attempt to handle this problem, a
computer program might assign the value zero to certain computed
numbers whenever the numbers have a magnitude less than some predeter-
mined small posttive number.
4. In partial pivoting, the pivot in each column is created by swapping rows,
if necessary, so that the pivot has at least the maximum magnitude of any
entry below it in that column. Partial pivoting may be helpful in avoiding
the problem stated in summary item 1.
5. In full ptvoting, columns are interchanged as well as rows, if necessary, to
create pivots cf maximum possible magnitude. Full pivoting requires
much more computer time than does partial pivoting, and bookkeeping ts
necessary to keep track of the relationshtp between the columns and the
variables. Partial pivoting is more commonly used.
6. Scaling, which is multiplication of a row by a nonzero constant, can be
used to reduce the sizes of entries that threaten to domtnate entries 1n
lower rows, or to increase the sizes of entries in a row where some entries
are very small.
7. Alinear system Ax = b is ill-conditioned or unstable if small changes in the
numbers can produce large changes in the solution. The matrix 4 is then
also called ill-conditioned.
536 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
8. Hilbert matrices, which are square matrices with entry I/(i + j — 1) inth
ith row and jth column, are examples of ill-conditioned matrices.
9. With present technology, it appears hopeless to write a computer prograr
that will successfully handle every linear system involving even very sma
coefficient matrices— say, of size at most 10 x 10.
EXERCISES
. Find the solution by a three-figure computer d. Multiply the matnx obtained in part c b
ant,
3. Using MATCOMP, use tne scaling routine —_— j. The entry in the ith row and jth column
suggested in Exercise 7 to find the inverse of of a Hilbert matrix is 1/{i + J).
the matrix
.00000: -.000003 .000011 .000006
—.000002 .000013 .000007 .000010 Use the routine HILBERT in LINTEK in
000009 -—.000011 0 —.000005 |" Exercises 15-22. Because the Hilbert matrices are
000014 -—.000008 -—.000002 .000003 nonsingular, diagonal entries computed during the
elimination should never be zero. Except in
Then see if the same answer is obtained Exercise 22, enter 0 for r when r is requested
without using the scaling routine. Check the during a run of HILBERT.
inverse in each case by multiplying by the
original matrix.
Bl Bl
15. Solve H,x = b, c, where
4. Mark each of the following True or False.
— a. Addition and subtraction never cause any
problem when executed by a computer.
—b. Given any present-day computer, cne can
find two positive numbers whose sum the Use just one run of HILBERT; that is, solve
computer will represent as the larger of both systems at once. Notice that the
the two numbers. components of b and c differ by just 1. Find
- A computer may have trouble the difference in the components of the two
representing as accurately as desired the
solution vectors.
sum of two numbers of extremely 16. Repeat Exercise 15, changing the coefficient
different magnitudes. matrix to H,'.
- Acomputer may have trouble 17. Repeat Exercise 15, using as coefficient
representing as accurately as desired the matnx H, with
sum of two numbers of essentially the
saine magnitude.
- A computer may have trouble
representing as accurately as desired the
sum of a and 8, ‘vhere a is approximately
2b or —20. 18. Repeat Exercise 15, using as coefficient
Partial pivoting handles all problems matrix H,!.
resulting from roundoff error when a
linear system is being solved by the 19. Find the inverse of H,, and use the / menu
Gauss method. option to test whether the computed inverse
Full pivoting handles roundoff error is correct.
problems better than partial pivoting, but . Continue Exercise 19 by trying to find the
it is generally not used because of the inverses of H, raised to the powers 2, 4, 8,
extra computer time required to and 16. Was it always possible to reduce the
implement it. power of the Hilbert matrix to diagonal
. Given any present-day computer, one can form? If not, what happened? Why did it
find a system of two equations in two happen? Was scaling (1 an entry) of any
unknowns that the computer cannot solve help? For those cases in which reduction to
accurately using the Gauss method with diagonal form was possible, did the
back substitution. computed inverse seem to be reasonably
i. A linear system of two equations in two accurate when tested?
unknowns is unstable if the lines 21. Repeat Exercises 19 and 20, using H, and
represented by the two equations are the various powers of it. Are problems
extremely close to being parallel. encountered for lower powers this
time?
538 CHAPTER 10 SOLVING LARGE LINEAR SYSTEMS
MATLAB
The command hilb(n) in MATLAB creates the » X m Hilbert matrix, and the
command invhilb(n) attempts to create its inverse. For a matrix X in MATLAB, t!
command max(X) produces a row vector whose jth entry is the maximum of the
entries in the jth column of X, whereas for a row vector v, the command max(y)
returns the maximum of the components of v. Thus the command max(max(X))
returns the maximum of the eniries of X. The functions min(X) and min(v) return
analogous minimum values. Thus the command line
n = 5; A = hi!b(n)+invhilb(n); b = [max(max(A)) min(min(A))]
will return the two-component row vector b whose first component is the maximt
of the entries of A and whose second component is the minimum of those entries
Of course, if invhilb(n) is computed accurately, this vector should be [1, 0].
in MATLAB. Using the up-arrow key to change the value of n, copy down |
vector b with entries to three significant figures for the values 5, 10, 15, 20,
25, and 30 of n.
M2. Recall that the command X = rand(n,n) generates a random ” X n matrix
with entries between 0 and 1; the Hilbert matrices also have entries betwee!
and 1. Enter the command
X = rand(30,30); A = Xsinv(X); b = [max(max(A)) min(min(A))|
Sometimes we want to prove that a statement about positive integers 1s true for
all positive integers or perhaps for some finite or infinite sequence of
consecutive integers. Such proofs are accomplished using mathematical
induction. The validity of the method rests on the following axiom of the
positive integers. The set of all positive integers is denoted by Z’.
———_I
induction Axiom
Let S be a subset of Z* satisfying
iL 1eS, |
2. IfkES,then(k+ IES.
Then S = Z*. |
L ___|
Mathematical Induction
Let P(n) be a statement concerning the positive integer n. Suppose
that
1. P(1) is true,
2. If P(k) is true, then P(k + 1) is true.
Then P(n) is true for all n € Z*.
A-l
A-2 APPENDIX A’ MATHEMATICAL INDUCTION
Most of the time, we want to show that P(n) holds for all n € Z*. If we wis
only to show that it holds forr,r + 1,r+2,...,5— 1,5, then we show tha
P(r) is true and that P(k) implies P(k + 1) forr=k=s-— 1. Noticethat rma
be any integer—positive, negative, or zero.
+ 2b eet k= Me)
=e skee2_ Es kr)
Thus, P(k + 1) holds, and formula (A.1) is true for alln € Z*. «
EXAMPLE A.2 Show that a set of n elements has exactly 2” subsets for any nonnegative inte-
ger n.
SOLUTION This time we start the induction with n = 0. Let S be a finite set having 7
elements. We wish to show
P(n): S has 2” subsets. (A.2)
If n = 0, then S is the empiy set and has only one subset— namely, the empty
set itself. Because 2° = 1, we see that P(0) is true.
Suppose that P(X) is true. Let Shave k + 1 elements, and let one element of
Sbec. Then S — {c} has k elements, and hence 2‘ subsets. Now every subset of
S either contains c or does not contain c. Those not containing c are subsets of
S — {c}, so there are 2* of them by the induction hypothesis. Each subset
containing c consists of one of the 2‘ subsets not containing c, with c adjoined.
There are 2* such subsets also. The total number of subsets of S is then
2k + 2k = 2h(2) = 2K,
so P(k + 1) is true. Thus, P(n) is true for all nonnegative integers n. &
APPENDIX A: MATHEMATICAL INDUCTION A-3
AMPLE A.3 Let x € R with x > -1 and x # 0. Show that (1 + x)" > 1 + mx for every
positive integer n = 2.
SOLUTION We let P(m) be the statement
Again, we are trying to show that P(k + 1) is true, knowing that P(k) is true.
But if we have reached the stage of induction where P(k) has been proved, we
know that P(m) is true for 1 = m = k, so the strengthened hypothesis in the
second statement is permissible.
AMPLE A.4 Recall that the set of all polynomials with real coefficients is denoted by P.
Show that every polynomial in P of degree n € Z* either is irreducible itself or
is a product of irreducible polynomials in P. (An irreducible polynomial is one
that cannot be factored into polynomials in P all of lower degree.)
SOLUTION We will use complete induction. Let P(n} de the statement that is to be proved.
Clearly P(1) is true, because a polynomial of degree 1 is already irreducible.
Let k be a positive integer. Our induction hypothesis is then: every
polynomial in P of degree less than k + 1 either is irreducible or can be
factored into irreducible polynomials. Let f(x) be a polynomial of degree
k + 1. If f(x) is irreducible, we have nothing more to do. Otherwise, we may
factor f(x) into polynomials g(x) and A(x) of lower degree than k + 1, obtain-
ing f(x) = g({x)h(x). The induction hypothesis indicates that each of g{x) and
h(x) can be factored into irreducible polynomials, thus providing such a
factorization of f(x). This proves P(k + 1). It follows that P(n) is true for
allne€ Z*. o
A-4 APPENDIX A’ MATHEMATICAL INDUCTION
EXERCISES
before 8 a.m., contrary to the judge’s her class that she will give one more quiz
sentence. Thus I can’t be executed on on one day during the final full week of
January (31 — (& + 1)), so P(k + 1) is classes, but that the students will not know
true. Therefore, I can’t be executed in for sure that the quiz will be that day until
January. (Of course, the serial killer was they come to the classroom. What is the
executed on January 17.) last day of the week on which she can give
the quiz in order to satisfy these
. An instructor teaches a class five days a conditions?
week, Monday through Friday. She tells
TWO DEFERRED PROOFS
and
Q, Ay 4:3
Oy, Ay, ys) = (141142433) + (—1)(G 12343) + (1)(G124234n) + (= 14124214)
Q3, Gy) Gs;
+ (1)(G)34y1@52) + (— 1)(4)342243).
Notice that each determinant appears as a sum of products, each with an
associated sign given by (1) or (—1), which is determined by the formula
(—1)'*/ as we expand the determinant across the first row. Furthermore, each
product contains exactly one factor from each row and exactly one factor from
each column of the matnx. That is, the row indices in each product run
through all row numbers, and the column indices run through all column
numbers. This is an illustration of a general theorem, which we now prove by
induction.
A-6
APPENDIX 8 TWO DEFERRED PROOFS A-7
PROOF  We consider the expansion of det(A) on the first row and give a proof
by induction. We have just shown that our result is true for determinants of
orders 2 and 3. Let n > 3, and assume that our result holds for all square
matrices of size smaller than n × n. Let A be an n × n matrix. When we expand
det(A) by minors across the first row, the only expression involving a_{1j} is
(−1)^{1+j} a_{1j} |A_{1j}|. We apply our induction hypothesis to the determinant |A_{1j}| of
order n − 1: it is a sum of signed products, each of which has one factor from
each row and column of A except for row 1 and column j. As we multiply this
sum term by term by a_{1j}, we obtain a sum of products having a_{1j} as the factor
from row 1 and column j, and one factor from each other row and from each
other column. Thus an expression of the stated form is indeed obtained as we
expand det(A) by minors across the first row.
   It is clear that essentially the same argument shows that expansion across
any row or down any column yields the same type of sum of signed
products. ▪
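The pattern just proved can be checked numerically. The following short MATLAB fragment (an added illustration, not part of the original text, written in the style of Appendix D) expands the determinant of a sample matrix by minors across the first row and compares the result with MATLAB's built-in det function; the particular matrix A is arbitrary.

    A = [2 -1 3; 0 5 12; 4 -2 9];      % any square matrix will do
    n = size(A,1);
    d = 0;
    for j = 1:n
        M = A(2:n, [1:j-1, j+1:n]);    % minor matrix: delete row 1 and column j
        d = d + (-1)^(1+j) * A(1,j) * det(M);
    end
    [d det(A)]                         % the two values agree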
PROOF OF THEOREM 4.2  We first prove Eq. (B.1) for any choice of r from 1 to
n. Clearly, Eq. (B.1) holds for n = 1 and n = 2. Proceeding by induction, let
n > 2 and assume that determinants of order less than n can be computed by
using an expansion on any row. Let A be an n × n matrix. We show that
expansion of det(A) by minors on row r is the same as expansion on row i for
i < r. From Theorem B.1, we know that each of the expansions gives a sum of
signed products, where each product contains a single factor from each row
and from each column of A. We will compare the products containing as
factors both a_{ij} and a_{rs} in each of the expansions. We consider two cases, as
illustrated in Figures B.1 and B.2.
   If det(A) is expanded on the ith row, the sum of signed products containing
a_{ij}a_{rs} is part of (−1)^{i+j} a_{ij} |A_{ij}|. In computing |A_{ij}| we may, by our induction
assumption, expand along the row of A_{ij} coming from row r of A. For j < s, terms of |A_{ij}| involving a_{rs} are
then (−1)^{(r−1)+(s−1)} a_{rs} d, where d is the determinant of the matrix obtained from A
by crossing out rows i and r and columns j and s, as shown in Figure B.1. The
exponent (r − 1) + (s − 1) occurs because a_{rs} is in row r − 1 and column s − 1
of A_{ij}. Thus, the part of our expansion of det(A) across the ith row that contains
a_{ij}a_{rs} is equal to

    (−1)^{i+j} (−1)^{(r−1)+(s−1)} a_{ij} a_{rs} d     for j < s.     (B.3)

   For j > s, we consult Figure B.2 and use similar reasoning to see that the
part of our expansion of det(A) across the ith row which contains a_{ij}a_{rs} is
equal to
For j > 1, we expand |A_{1j}| on the first column, using our induction assumption,
and obtain |A_{1j}| = Σ_{i=2}^{n} (−1)^{(i−1)+1} a_{i1} d, where d is the determinant of the matrix
obtained from A by crossing out rows 1 and i and columns 1 and j. Thus the
terms in the expansion of det(A) containing a_{1j}a_{i1} are found in

    a_{11}|A_{11}| + Σ_{j=2}^{n} (−1)^{1+j} a_{1j} |A_{1j}|.

For i > 1, expanding |A_{i1}| on the first row, using our induction assumption, shows
that |A_{i1}| = Σ_{j=2}^{n} (−1)^{1+(j−1)} a_{1j} d. This results in

    (−1)^{i+j+1} a_{i1} a_{1j} d

as the part of the expansion of det(A) containing the product a_{i1}a_{1j}, and this agrees
with the expression in formula (B.7). This concludes our proof. ▪
    a_1 = b + p,     (B.5)
Let a_1, a_2, . . . , a_n be vectors in R^m, and let A be the m × n matrix with
jth column vector a_j. Let B be the m × n matrix obtained from A by
replacing the first column of A by the vector

    b = a_1 − r_2 a_2 − · · · − r_n a_n.
PROOF OF THEOREM 4.7  Because our volume was defined inductively, we give
an inductive proof. The theorem is valid if n = 1 or 2, by Eqs. (1) and (2),
respectively, in Section 4.4. Let n > 2, and suppose that the theorem is proved
for all k-boxes for k ≤ n − 1. If we write a_1 = b + p, as in Eq. (B.5), then,
because p lies in sp(a_2, . . . , a_n), we have

    p = r_2 a_2 + · · · + r_n a_n.

Let B be the matrix obtained from A by replacing the first column vector a_1 of
A by the vector b, as in Theorem B.2. Because b is orthogonal to each of the
vectors a_2, . . . , a_n, which determine the base of our box, we obtain
\[
B^{T}B =
\begin{bmatrix}
b \cdot b & 0 & \cdots & 0 \\
0 & a_2 \cdot a_2 & \cdots & a_2 \cdot a_n \\
\vdots & \vdots & & \vdots \\
0 & a_n \cdot a_2 & \cdots & a_n \cdot a_n
\end{bmatrix},
\qquad\text{so}\qquad
\det(B^{T}B) = \|b\|^{2}
\begin{vmatrix}
a_2 \cdot a_2 & \cdots & a_2 \cdot a_n \\
\vdots & & \vdots \\
a_n \cdot a_2 & \cdots & a_n \cdot a_n
\end{vmatrix}.
\]
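As a numerical illustration of the role this factorization plays (an added example, not part of the original text, written in the style of Appendix D): if, as the surrounding proof suggests, the volume of the box determined by the columns of A equals sqrt(det(A'*A)), then it should also equal the volume of the base times the height ||b||. The vectors below are arbitrary.

    a1 = [1; 2; 2];  a2 = [3; 0; 1];     % arbitrary edge vectors in R^3
    A  = [a1 a2];
    vol = sqrt(det(A'*A));               % volume (area) of the 2-box
    p  = (dot(a1,a2)/dot(a2,a2)) * a2;   % projection of a1 on sp(a2)
    b  = a1 - p;                         % component of a1 orthogonal to the base
    [vol  norm(a2)*norm(b)]              % base times height; the two values agree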
Below are listed the names and brief descriptions of the routines that make up
the computer software LINTEK designed for this text.
GENERAL INFORMATION
MATLAB prints >> on the screen as its prompt when it is ready for your next
command.
MATLAB is case sensitive—that is, if you have set n = 3, then N is
undefined. Thus you can set X equal to a 3 × 2 matrix and x equal to a row
vector.
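For example (an added illustration, not from the original text):

    n = 3            % assigns 3 to the lowercase variable n
    N = [1 2; 3 4]   % a different variable entirely; n is still 3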
To view on the screen the value, vector, or matrix currently assigned to a
variable such as A or x, type the variable and press the Enter key.
For information on a command, enter help followed by a space and the
name of the command or function. For example, entering help * will bring the
response that X * Y is the matrix product of X and Y, and entering help eye will
inform us that eye(n) is the n × n identity matrix.
The most recent commands you have given MATLAB are kept in a stack,
and you can move back and forth through them using the up-arrow and
down-arrow keys. Thus if you make a mistake in typing a command and get an
error message, you can correct the error by striking the up-arrow key to put the
command by the cursor and then edit it to correct the error, avoiding retyping
the entire command. The exercises frequently ask you to execute a command
you recently gave, and you can simply use the up-arrow key until the command
is at the cursor, and then press the Enter key to execute it again. This saves a lot
of typing. If you know you will be using a command again, don’t let it get too
far back before asking to execute it again, or it may have been removed from
the command stack. Executing it puts it at the top of the stack.
DATA ENTRY
Numerical, vector, and matrix data can be entered by the variable name for
the data followed by an equal sign and then the data. For example, entering
x = 3 assigns the value 3 to the letter x, and entering y = sin(x) will then
assign to y the value sin(3). Entering y = sin(x) before a value has been
assigned to x will produce an error message. Values must have been assigned to
a variable before the variable can be used in a function or computation. If you
do not assign a name to the result of a computation, then it is assigned the
temporary name “ans.” Thus this reserved name ans should not be used in
your work because its value will be changed at the next computation having no
assigned name.
The constant π = 4 tan⁻¹(1) can be entered as pi.
Data for a vector or a matrix should be entered between square brackets
with a space between numbers and with rows separated by a semicolon. For
example, the matrix

         2  -1   3   6
    A =  0   5  12  -7
         4  -2   9  11

can be entered as

    A = [2 -1 3 6; 0 5 12 -7; 4 -2 9 11],
which will produce the response
    A =
         2    -1     3     6
         0     5    12    -7
         4    -2     9    11
from MATLAB. If you do not wish your data entry to be echoed in this fashion
for proofreading, put a semicolon after your data entry line. The semicolon
can also be used to separate several data items or commands all entered on one
line. For example, entering
    x = 3; y = sin(x); v = [1 -3 4]; C = [2 -1; 3 5];

will result in the indicated variable assignments, and no data will be echoed. If the final semicolon were omitted, the data for
the matrix C would be echoed. If you run out of space on a line and need to
continue data on the next line, type at least two periods .. and then
immediately press Enter and continue entry on the next line.
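For example (an added illustration, not from the original text; three periods are typed here, which satisfies the at-least-two-periods rule):

    v = [10 20 30 40 ...
         50 60 70 80]

MATLAB treats the two physical lines as the single entry v = [10 20 30 40 50 60 70 80].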
Matrices can be glued together to form larger matrices using the same form
of matrix entry, provided the gluing will still produce a rectangular array. For
example, if A is a 3 × 4 matrix, B is 3 × 5, and C is 2 × 4, then entering
D = [A B] will produce a 3 × 9 matrix D consisting of the four columns of A
followed at the right by the five columns of B, whereas entering E = [A; C] will
produce a 5 × 4 matrix consisting of the first three rows of A followed below by
the two rows of C.
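As a concrete illustration with matrices of the sizes just mentioned (an added example, not from the original text):

    A = ones(3,4);  B = 2*ones(3,5);  C = zeros(2,4);
    D = [A B];      % size(D) returns [3 9]
    E = [A; C];     % size(E) returns [5 4]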
NUMERIC OPERATIONS
Use + for addition, - for subtraction, * for multiplication, / for division, and ^
for exponentiation. Thus entering 6^3 produces the result ans = 216.
Remember that all variables must have already been assigned numeric, vector,
or matrix values. Following is a summary of the commands used in the
MATLAB exercises of our text, together with a brief description of what each
one does.
CHAPTER 1
Section 1.1
15. l
7 e V7 1, hy 3, Ys 4 4] 9. ._ 7 -3 11. ° 3
_,__1l 43
13. cos t_____
364 = 54.8
§4 8° 15. —233
3 0} [-4] fe 8 n= Ay b=Ay
b. | +q 3 +r 3 +s q = 3 M1. |lal] ~ 6.3246; the two results agree.
mt
39. FT FFTFTFFT
MS. a. 485.1449
M1. [—58, 79, -36, — 190]
b. Not found by MATLAB
M3. Error using + because a and u have
different numbers of components. M7. Angle ~ 9.9499 radians
MS. a. [—2.4524, 4.4500, —11.3810] M9. Angle = 147.4283°
b. [—2.45238095238095,
4.45000000000000,
— 11.38095238095238}
c, |-103 89 -7 Section 1.3
42°20? 21
M7. a. [0.0357, 0.0075, 0.1962]
{76 3 9 3/2 2 1
b. [0.03571428571429, 12 0 -3 9-1 2
0.00750000000000, 0.196190476 19048) _
c, [LE & 183 6 -3
"28? 400° 525 5.{|-3 1 7. Impossible
M9. Error using + because wu is a row vector 2 §
and u? is a column vector.
9. | —130
110
140
“0
;
11. Impossible
( J=1
x arb = k=1\j=1
2 ( > (ape)
Further, the gth column of BC has Section 1.4
components
ot
&
r
OuUw
oOo
OON
oot
oon
Bi Lis Dulin °°
Ne
Md
=
&
’ > blige
O-
x Hg Py 2g k=1
|
oor Oo
—
l
OO
OW ©
S2Oorn
oor.
SGOn-
ooo
r r
=
eS
(SoS
a,( 2 but + an x
or
So
i
1
4
WO
Ow
Awth-
ooo
OO
OK
—
q
-O°C°.
ow
OG
must be equal.
oo
or
ee
4
Nw
a5
NA
>
_
1
*
1 - 2r a) 0
MI. x = 3) M3. Inconsistent
9x= -2-r-—3s 1 2
x r | 3 Il. x _ 5
Ss —2 2 1-— Ils
13. x = 2, y= —4 M5. x ~=|3 - 7s
nfl
So)
15. x= -3,y=2,z=4 .
[13 — 2p + 145
19. Inconsistent _ r
M7. x = 5455
L ¢
—8
21.x =
—2 —23 -— 5s . 0.0345
* | 1 XT.x= 4s -0.5370
2s M9. x ~ | —1.6240
25. Yes 27. No 0.1222
| 0.8188
29. FFT TFTTTFT
31. x, = -l,x, =3
33. x, = 2, x), = —3
35. x, = 1,x,= -1,x%,=1, x,= 2 Section 1.5
37. x, = —3, x, = 5, x, = 2, x, = -3
la At=|! y
39. All 6, and 8, 0 1
41. All },, b,, 6, such that 5, + b, — 6, =0
b. The matrix A itself is an elementary
‘100 010 matrix.
43.)2 10 45.10
01 3. Not invertible
10 0 1 100
1 Oo 1
pose 1 -60 0 -15] §.a. A'=)0 1 1
0100 0 10 0 0 0-1!
714000 Plo oO | ;
10001 0-12 0 -3 1 0 O11 0 O;}1 0 1
-A=|0 1 OF,0 1 1/0 1 0
o
63.x =|-
—_,
-7 / b. * = subspace
13. a. A= | [37]
-5§ 2 X5 ~26| . Not a subspace . Nota
~]
15.
x = —7r + 5s+ 31, y = 3r — 25 — 21, subspace
z=3r-—2s-1 . A subspace
\o
46 33 30 321 . a. Every subspace of R? is either the
17.|39 29 26 19.|0 3 2 ufigin, a line through the origin, or all
99 68 63 134 of R’.
21. The matrix is invertible for any value of r b. Every subspace of R? is either the
except r = 0. origin, a line through the origin, a plane
through the origin, or all of R?.
23. TTTFTTTFFF
. No, because 0 = r0 for all rE R, so 0 is
25. a. No; b. Yes not a unique linear combination of 0.
27. a. Notice that A(A7'B) = (A4"')B = IB = . {[-1, 3, 0], [-1, 0, 3]}
B, so X = A~'B 1s a solution. To show . {{(-7, 1, 13, 0], [-6, —1, 0, 13]}
uniqueness, suppose that AY = B. Then
. {{(- 60, 137, 33, 0, 1]} 23. Not a basis
A-AX) = A'B, (A'A)X = A'B, IX =
A7'B, and X = A7'B; therefore, this is . A basis 27. A basis
the only solution. . Not a basis 31. {(-1, -3, 11}}
b. Let E,, E,, ..., E, be elementary . Sp(V,, V2,.-- ȴ,) = sp(W,, W2, --- > Mm) if
matrices that reduce [A | B] to (7 | X], and only if each y, is a linear combination
and let C = E,F,., -++ £,E,. Then of the w, and each w, is a linear
CA = I and CB = X. Thus, C = A"! combination of the v,.
[pa
and so X = A™'B.
37/4 + s/8
“trol
29. a, [9 9
35.
3r/2 + s/4
an
4
0
T 0.355 -0.0645 0.161 0
39.|-0.129 0.387 0.0323
1
[*:
—(5r + s)/3
. 0.0275 —0.0296 -0.0269 0.0263 37. (r+ 29/3
I
~~
|—0.0180 0.0947 0.00385 0.0257 39. In this case, a solution is not uniquely
43. See answer to Exercise 9.
determined. The system is viewed as
underdetermined because it is insufficient
45. See answer to Exercise 41. to determine a unique solution.
41. x- yl DHER
2x - 2y=2 0 00;D
3x - 3y = 3 41.T=|1 5 0) H.
M1. Not a basis -M3. A basis The recessive state 1s absorbing, because
MS. The probability is actually 1. there is an entry | in the row 3, column 3
position.
M7. Yes, we obtained the expected result.
: 2 4037
45. |4 47. 4 49. | .3975
Section 1.7 5 L9 1988
: yO
l 12 TL!1191 0111001
4
19. |) 21. |3 23. | 35 Xq = My +X, Xs = Xp + Ay, Xe =X + X;
ww
0111
101010 =
35. i 37. lorr1001fi9°°!
trsfm
8 11110
de tm
11101
a. Because the first column of the product is the fourth column of the parity-check matrix, we change the fourth position of the received word 110111 to get the code word 110011 and decode it as 110.
b. Because the second column of the product is the zero vector, the received word 001011 is a code word and we decode it as 001.
c. Because the third column of the product is the third column of the parity-check matrix, we change the third position of the received word 111011 to get the code word 110011 and decode it as 110.
d. Because the fourth column of the product is not the zero vector and not a column of the parity-check matrix, there are at least two errors, and we ask for retransmission.
e. Because the fifth column of the product is the third column of the parity-check matrix, we change the third position of the received word 100101 to get the code word 101101 and decode it as 101.
   On the other hand, if w is a nonzero code word of minimum weight, then wt(w) is the distance from w to the zero code word, showing the opposite inequality, so we have the equality stated in the exercise.
. The triangle inequality in Exercise 14 shows that if the distance from received word v to code word u is at most m and the distance from v to code word w is at most m, then d(u, w) ≤ 2m. Thus, if the distance between code words is at least 2m + 1, a received word v with at most m incorrect components has a unique nearest-neighbor code word. This number 2m + 1 can't be improved, because if d(u, v) = 2m, then a possible received word w at distance m from both u and v can be constructed by having its components agree with those of u and v where the components of u and v agree, and by making m of the 2m components of w in which u and v differ opposite from the components of u and the other components of w opposite from those of v.
. Let e_i be the word in B^n with 1 in the ith
bd
ok
wafil
r;(—v, — 3v,) = 0. Then (27, — r,)v, + c. The set consisting of 3h a
(3r, + 1,)v, + (-2r, — 37,)v,; = 0. We try
setting these scalar coefficients to zero and
solve the linear system
2r,
3r, ry
- rn=0,
0 0 0/0
This system has the nontrivial solution r; = d. The set consisting of the vector 7
2,7, = —3,and 7, = 1. Thus, (2v, + 3v,) — 0
3(v, — 2v,) + 2(—v, — 3v,) = 0 for all choices
of vectors v,, V2, ¥; & V, so W,, W2, W; are 5. a. 3
dependent. (Notice that, if the system had b. {[6, 0, 0, 5}, [0, 3, 0, 1], [0, 0, 3, 1]}
only the trivial solution, then any choice of
O} | 1] 42
independent vectors v,, v,, v, would show . The set consisting of 2} | ql or the
this exercise to be false.)
a
Oj 12] Ll
33. No such scalars s exist. set of column vectors {e,, e,, e;}
Ww
6 . —X, + 3, — 2x) x, + x, - 2x,
:. 4 . 4 .
The rank ts 3; therefore, the matrix is not
~~
s
invertible.
- Yes. (7' ° T)"'((x,, 2) = Lz =3% +> 203
23]
The rank is 3; therefore, the matrix is
w
.
invertible.
~TFTTTFTTFT
~TFTTFFTFFT
. a. Because
. No. If A is m x nand Cisn X 5s, then AC
we
and
. LetA = and C= |! | Then T,(ru) = [74, me, ..., my, 0,..., 0)
\o
tL -3 0
Xx; T,,. [VY], we have
Example 3 then shows that 7 is a linear d,.xd,=d,s rar <d,=n.
transformation.
If the jth column of H has a pivot but
. No. T(O[1, 2, 3]) = 70, 0, 0}) = [1, 1, 1, 1), the (j + 1)st column has no pivot, then .
whereas 07((1, 2, 3)) = O[1, 1, 1, 1] = d;,, = d;. However, if the jth and
[0, 0, 0, 0}. (j + 1)st columns both have pivots,
5. (24, —34] 7. (2, 7, -15] then d,,, = d; + 1. See parts c and d for
illustrations.
9. [4, 2, 4]
| c. If d, = d,= | and d, = d, = 2, then H
[802, —477, 398, 57] 13 | -3| must havc the form
34 = 1]
eed
e _ 5 e
Lilt fi -1 3
pXxxxl
00 pxX
-{1 1.0 7W}t oft
0000
100 1 0 0
0000
. The matrix associated with 7" T is
CCF
choices of H. Suppose that the pivot in 11. In column-vector notation, we have
the kth row of H is in column j. Consider
now a nonzero row vector in R’ that has
y x 1 O}LY
entries zero in all components
corresponding to columns of H containing / 1 4 |? 1 Sh which represents a
pivots except for the jth component, LO 1jtl OF LY
where the entry is 1; entries in reflection in the line y = x followed by a
reflection in the y-axis.
CCE
components not corresponding to pivot
column locations in H may be arbitrary. 13. In column-vector notation, we have
We claim that there is a unigue such
218 phe
vector in the row space of A—namely, the
Ath row vector of H. Such a vector must
be a linear combination of the nonzero 01
rows of H, which span the row space of A. reflection in the x-axis followed by a
The fact that there are zeros above reflection in the y-axis.
x)
dic d,c
a+
=~.
d? dt, x;
d?+dp “
> a t
15.
{4 1 4s\
17. —=, — —| 19. 3 3 l, 7 l
2 2 4, 22 2 2
27. x, + 4x, + x, = 10
29. xX, + x; = 3, ~— 3x, — 7x, + 8x, + 3x, = 0
1
35. The 0-flat x = i
| 2
| 2
37. The !-flatx ={-1l}/+¢/1
| 0 l
| -8 0
39. The I-flat x = ao +1 7
0 2
. © vector space 1}. A vector space ¥, = (—2)y, + I(2¥, + v2), showing that
oOo
7. One basis B for ti’ consists of those /, for 3. A linear transformation. ker(7) is the zero
ain S defined by {(a) = | and f(s) = 0 for function, invertible
s#ainS.1FfE Wand f(s) # 0 only for . A linear transformation, ker(7) is the zero
Wa
s€ {aa . ..a,}. then f= f(a,) at function, invertible
L(4No, + +> * + f(a,)f,. Now the linear
combination g = Cs, tof, torn + . The zero function is the only functioi: in
~
Cm, 18 a function satisfying g(b) = c, for ker(7).
1.
J=1,2, ..., mrand g(s) = 0 for all other .{ce" + oe — zsin x | c¢,, c, € R}
5
s € S. Thus g € W, so B spans only W.
11. {cx + Cc, + cos x |G, € R}
(The crucial thing is that all linear , l
eombinations are sums of onlv finite 13. {c, sin 2x + ¢, cos 2x + Gx — lc, ¢. € R}
numbers of vectors.) 2 ly
. {ce + x +c, - 5x — 3 | c,, G, ¢; E R}
17. T(v) = 4b) + 4b, + 7b; + 6b,
‘tion 3.3 19. T(v) = —17b, + 4b, + 13b; — 3b,
0000
1. (1. -1] 3.(2,6,-4] 5. (2, 1, 3] _13 000
Zl. a. A= 0200
7.14, i, -2, 1] 9. [Lt 1-0
001 0
1. (1,2, -1, 5]
4 0
3. p(x) = [e+ 1) - IP +(e 4 1) - IF —5 = 12 ,
—[(x+1)-1)-1 b. A 10 19) - 12% 12x? — 10x ++ 10
=(x+ 1p
- 304+ IP+ 3v4+ 1) -1 —13} , 10
+ (xt lp - 2+ 1)+1 c. The second derivative is given by
— (xt+i)t+i -5 0
— | A 2
38 = 0
39° oo
30x - + 16
=(v + 1)'- 2x +1) + Ov + 1) +0
| 4) [ 16
so the coordinate vector is [1, —2. 0, 0}. 23. a. Die’) = et + 2xes Die) = rer +
1S. [4, 3, -5, -4] 4xe" + 2es D(xe") = xe’ + & D(xe’) =
17. Let x? — 402 + 3x + 7 = B(x — 2) + xe + 2e; Die’) = es De) = e
b,(x — 2)? + bx — 2) + by. Setting x = 2. 100
we find that 8 - 16+6+7=h,s0 b= A=|4 1 0
5. Differentiating and setiing x = 2, we 221,
obtain 3x? — 8x + 3 = 3b,x — 2) + 2b(x
b. From the computations in part a, we
~ 2)+5;i2-16+3=5,5,=-1.
Differentiating again and setting x = 2, we 100]
have 6x — 8 = 65,(x — 2) + 26,,12 - 8 = have A, =|2 | 0}, and computation
2b,; b, = 2. Differentiating again, we (0 i iy
obtain 6 = 65,, so b, = 1. The coordinate shows that A,? = A.
vector is {1, 2, —1, 5].
25. We obtain A = Fe | in both part a
19. b. {f(x). (9) 21. 2x7 + 6x + 2
and part b
9 0 0
ction 3.4 27.}0 25 0
0 0 81
1. A linear transformation, ker(7) =
{f€ F | f{-4) = 0}, not invertible
29. (a + c)e* + be + (a + ce 3 [-34-4
3
C, Cy C;
15. 77
(b,c; _~ b,c, + (b,c, ~ b,c,)k. Thus.
oly
15. 4
17. 54 19. 5
15.
21.FTTFTTFTFT
23. 0 25. 6 27. 5,- 2 29. 1,6, -2 if! 4 Hf
33. Let A have column vectors a,, a),..., a, 17. A'=-—=| 7 -6 -1l1
Let 'T)§ -3 3
ar 5
19. A'=-—| 4-1 -8
I3}_) -3 2
[-7 -1 4
2 - adj(4) = |-3 2 -8
i 6 -4 -1
i
b,a,|.
| so the system cannot be solved using
Cramer’s rule.
Then AB = b,a, b,a, oe 27. x, =0,x,=1
29. x= 3.x, = -txy=
31. x, = 0,%,=3,% = 0
Applying the column scaling property n = 4
33.
times, we have * * 59
35. FTTFTTFTTF
41. 19095. The answers are the same.
43. With default roundoff, the smallest such
det(4B)
= b,|a, ba,
- °° ba, value of m is 6, which gave 0 rather than |
as the value of the determinant. Although
this wrong result would lead us to believe
that A® is singular, the error is actually
quite small compared with the size of the
entries in A®, We obtained 0 as det(A”) for
6 = m = 17, and we obtained
= 5,b,/a, a, bya, os ba, — 1.470838 x 10'° as det(A'*). MATCOMP
gave us — 1.598896 x 10" as det(A”).
With roundoff control 0, we obtained
det(A‘) = 1 and det(A’) = 0.999998, which
= -++ =(bb, -+ + 6.) det(A)
has an error of only 0.000002. The error
= det(B) det(A}. became significantly Jarger with m = 12,
no
entry on the diagonal, which produces a Eigenvalues: A, = 1, A, = 3
—r
ralse calculation of 0 for the determinant. Eigenvectors: for A, = 1: ¥, = | "} r#0,
With roundoff zero, this does not happen.
But when #7 get sufficiently large. roundoff
error in the computation of 4” and in its for A, = — Lash 5 #0
Loo
- 2§
wn
determinant, no matter what roundoff Eigenvalues: There are no real eigenvalues.
control nuinber is taken. Notice, however, ~l. Characteristic polvnomial. —A? + 247 +
that the value given for the determinant 1s A-2
always small compared with the size of the Eigenvalues: 4, = —1, A. = 1, A, = 2
entries 1n.4”.
Eigenvectors: for 4. = —\:v, =] °|rj.r 4 0,
45.
I2
-I!
-8
43
-7
5 0
9 -6 -l | 0
for A, = | a |s40
(/-4827 114 -2211 2409 5
47.| 73218 76 -1474 1606 =
“| 8045 -190 3685 -4015 for A, = 2: v, =|-2|,¢ 40
| 1609 -38 737 -803 LJ
Section 4.4
9. Characteristic polynomial: −λ³ + 8λ² + λ − 8
Eigenvalues: λ₁ = −1, λ₂ = 1, λ₃ = 8
1.V213 3.V300- 5. 11 7. 38 0
V6 5 Eigenvectors: for 4. = —-\:v, =| r|,r 490.
9.2 MN. 13. = 0
. Let a, b, c. and d be the vectors from the 0
origin to the four points. Let A be the for A, = 1: v, =|-s], 5 40
n X 3 matrix with column vectors b — a, 5
c — a, and d — a. The points are coplanar —t|
if and only if det(A7A) = 0. " for A; = 8: ¥, = ipte
17. The points are not coplanar. 19. 32 t
0
for Az, Ay = 2:¥,=| 5|,5 #0 . a. Work with the matrix A + 10/, whose
%
eigenvalues are approximately 30, 12,
15. Characteristic polynomial. —a* - 37 + 4 7, and —9.5. (Othe: answers are
Eigenvalues: A, = A, = —2,A,= 1 possible.).
0 b. Work with the matrix A — 10/, whose
Eigenvectors: for A,. Ay = —2:¥, =| rj, r #0, eigenvalues are approximately —9.5,
0 —8, —13, and —30. (Other answers are
35 possible.)
for A; = I: v, =|-s|,5 #0
35 37. When the determinant of an n X n matrix
A 1s expanded according to the definition
17. Eigenvalues: A, = —1, A, = 5
. r
using expansion by minors, a sum of n!
ELigenvectors. for A, = —1l:¥, = | |: r#0, (signed) terms is obtained, each containing
a product of one element from each row
forA, = 5: ¥, = ss #0 and from each column. (This is readily
proved by induction on 7.) One of the ji!
19. Eigenvalues: 4, = 0, A, = 1, A; = 2 terms obtained by expanding |A — Adj to
obtain p(A) is
on
Eigenvectors: for A, =
RK
Oo
Oo
-“
~
I
:
the coeffictent of A”~' must contain at least
n — 1 of the factors in the preceding
forA, = 2:¥, =|0|,140 product; the other factor must be from the
l remaining row and column, and hence it
21. Eigenvalues: A, = —2,4,; = 1,4, = 5 must be the remaining factor from the
diagonal of 4 — Al. Computing the
Eigenvectors: for A, = —2: ¥, =| |, coefficient of A’~' in the preceding product,
we find that, when —A is chosen from all
but the factor a, — Ain expanding the
product, the resulting contribution to the
for A, = |: ¥) =
coefficient of A”' im p(A) is (—1)""!a,. Thus,
the coefficient of A”~' in p(A) is
for A, = S:v,=|—-Se], 14 0 (-1)"""(a,, + ay + Qj); + °° ° + ,,)-
Now p(A) = (=1)"(A ~ A)(A = A:)(A ~ da)
23. FFTTFTFTFT > ++ (A — A,). and computation of this
a)
A, = 16, ¥, = oft #05 a = 36, v=
Ise
—
(—1yrt(y + Ay + As ts + AY).
Thus, tr(A) = Ay +A, + AV tees +A,
0|
oe
39. Because {2 ~ > vi fex- 5A
+ 7, we SAO AR AH Ay = ,fand u
compute | wu
not both zero
A - a+ m=|? 7) “yt
1 3 1 3 53. Characteristic polynomial. —\? — 16d? +
48\ — 261
7 (1 0J_/3
|= . 5],f-10
+ 5],[70
+|
io!; [5 8 -5 -15| |07 Eigenvalues. d, = 1.6033 + 3.3194i,
_19 4 A, = 1.6033 — 3.31947, A, = —19.2067
lo 0}’ 55. Characteristic polynomial: s* — 56.3 +
illustrating the Cayley-Hamulton ineorem. 210A? + 22879 + 678658
4 . The desired result is stated as Theorem 5.3
Eigenvalues: A, = — 12.8959 + 13.3087i,
u
Lad
N
the standard matrix representation of the Ay = 40.8959 + 17.4259i,
linear transformation T: R" — R" and the A, = 40.8959 — 17.4259:
eigenvalues and eigenvectors of the
transformation are the same as those of Ml. a. A? + 16A? — 48A + 261;
the matrix, proof of Exercise 42 for A, = —19.2067,
transformations implies the result for A, = 1.6033 + 3.3194i,
matrices, and vice versa. A, = 1.6033 — 3.3194;
45. a. F, = 21; b. A‘ + 1443 -- 131A? + 739A — 21533;
b. Fy) =832040; A, = ~22.9142,
c. Fy = 12586269025; A, = 10.4799,
A, = --0.7828 + 9.43701,
d. Fy, = 5527939700884757;
II
A, = —0.7828 — 9.43707
e. Fisy ~ 9.969216677189304 x 10”
M3. A of minimum magnitude =
47. a. O, 1, 2, 1, -3, -7, —4, 10, 25; —2.68877026277667
2-3 1
vi7. A of maximum magnitude ~ 19.2077
b|1 O 0};
dA of minimum magnitude ~ 8.0) 47
0 1 0
e. ax = — {91694
Section 5.2
Foo fg!
f
0 s
for A, = sm |ail.s¥ 6
S#0;A, =A, =2,¥,= "| ¢and u not
t
both zero
fife
2mx + on - OM
23. If A and B are similar square matrices,
-l -z -1 0
then B = C"'AC for some invertible matrix
C. Let A be an eigenvalue of A, and let
5. Eigenvalues: 4, = —3, A, = 1,4, =7 {¥,, ¥2,.-., Vy be a basis for E,. Then
Exercise 22 shows that C"'v,, C''v,,.. .,
r
C-'v, are eigenvectors of B corresponding
Eigenvectors: for A, = —3:¥, = | r#0,
to A, and Exercise 37 in Section 2.1 shows
0
that these vectors are independent. Thus
5 the elgenspace of A relative to B has
for A, = rns lfor 0, dimension at least as great as the
s eigenspace of A relative to A. By the
symmetiy of the similarity relation (see
Exercise 16), the eigenspace of A relative te
A has dimension at least as great as the
eigenspace of A relative io B. Thus the
dimensions of these eigenspaces are
equal—that is, the geometric multiplicity of
A 1s the same relative to A as relative to B.
7. Eigenvalues: A, = —1,A, = 1,A,;=2 31. Diagonalizable
33. Complex (but not real) diagonalizable
—-r
35. Complex (but not real) diagonalizable
Eigenvectors. for 4, = —1: y, -| 0|,7 #40,
37. Not diagonalizable
tL "J
39. Real diagonalizable
—S§
41. Complex (but not real) diagonalizable
for A, = cn] a
Section 5.3
0
Cerne
pote
Noi
b. Neutrally stable;
-l1 -1 0 -100
C= 0 0 1;,,D=| 010
oO
1 3 0 002 er
ws (2) ACL
11. Yes, the eigenvalues are distinct.
IW COlA
1 2} 1 1 }-1
eive™
3 —_
13.FTTFTFFTFT
3
15. Assume that A is a square matrix such that
which checks..
D = C"'AC is a diagonal matrix for some
invertible matrix C. Because CC7' = J, we d. For large k, we have |! a = 2 =
have (CC)? = (C"')'C’ = fF =I, so(C’)'
= (C')". Then D? = (CAC) =
Led
2
CTA™C")"', so A’ is similar to the diagonal
wi No WI
» S04 ~ 3:
matrix D’.
CHAPTER 6
Section 6.1
Lj
? >
Ww
- I
3 5. p= 42, —-3, 1, 2]
se |
8
and 4'{tl=
sli) aa al
Ld
i 9
1 3 I
19. = (2, -1, 5 21. = (3, -2, -1, 1]
4-|
1
i
0
b. Unstable; 3
L
23. FTTTFTTEFT 29. =
* ‘ 3 _l _1\f
c. 2 4 4
31. is 33. vil6l
— 35. V10 37.
3
. The sequence starts 0, 1, 12 >
5h © seq 22
Section 6.2
27 [3 ae1 l x32
and 4 0 32 2 ° = 1
3 l == =
|
so the generating set is orthogonal.
[Ue
=
1-1-1+1=0,
Ags — l 3 ‘ 3
d. For large k, we have a, | = (3) I fl, -1, -1, 1)-[-1,
0, 0, 1]=
-1+0+0+1=0, and
sO a, * kel 1?
and a, approaches « as k
[1, 1, 1, 1] [-1,
0, 0, 1) =
approaches «. -1+0+0+1=0,
7 Xx; _ —k,e*! + 4k,e" so the generating set is orthogonal; b,, =
" |X ~ k, e! + 3k, e {2, 2, 2, 1].
xX —2k, e' + ke 1 ! - art
5. {rll 0, —2], 79 | 7), 3]
k, e! + ke"
XY —k,e"' ~ ke!
9 9 4 1 5 tot 1 32-3
13 |=, -3,= 5. |-,0, --,= 2 2 2
5 | I EF 0 3 | ; 1 49|-3 6 2
|
17. | atl
|
0. 1, 0), waar 2, 1, 0),
6 6 6 2 3
=|
1 “<4 ; 11. 23
Vl 1, —1, 0), (0, 0. 0, u| 6| | 6
19. {(3, -—2, 0, 1), [-9, —8, 14, 11)}
13. l -1 1 | V5 2 A
1 1 Al 1 1 IS. 5 |7 ‘ ) ‘
2). | el 1, ty], V0 —1, u|
0 1 1v2 t1
23. {{2, 1, —1, 1], (1, 1, 3, 0], (-24, 9, 5, 44}}
1
25. FTTFFTFITTT ~ V2 ~3 0 2
17. i 1
WV -11V3 16
27.Q0=| 0 V3 aw, 0 1 ow. 5
WW. V3 “Ue
W2V2 V2
R=(|0 V3 -1N3 19. FITITTTTET 23. i i
0 0 46,
27. An orthogonal matnx A gives rise to an
Ts
i} 371 2 1|2
if 2 3 -6],[ 2-3 6 5. Pre -!l 5 2 » projection = 5 8
ATA==|-3
T4 = —|—- 6 27)~ 3 6 2|= 2 2 2 5
6 2 3) [-6 2 3
[10 -1 3 10
1/49 9 90 100
_tli-1 19 6 -1 ontinn cx
49 049 O;=/0
1 0}, PRT 3 6. 3 3 |» Projection
0 049) (001
10 -1 3 10
so A is orthogonal and d A“! = A? = 41
2 3 -6 1 | 40
7|73 6 2). 21|27
6 2 3 4
sn
Lad
WwW
5. +
oi |
I
‘=<
a. QO, | 35°
Wi
b. 0 has geometric and algebraic
multiplicity n — k,
<
[
1 has geometric and algebraic
”
multiplicity k.
T
q
c. Because the algebraic and geometric
multiplicities of each eigenvalue are Pe dk T
T
positive integer n.
T T
[2 129
25 25
in
_ 33,35° 102357
T
E 0 1
13 -18 0 -12 -++++-++++—
> x
—
2 4
25.
1/-18 36 12 0
49 0 12 %1I3 -18
7. y= —-0.9 + 2.6x
-12 0-18 36
(3 0 0 i 10 14 y
27.
To 2-11 _ 0
0 1 12 14
33. Referring to Figure 6.11, we see that, for
+++
oo
17. X= 19.x=| 2
Ot
I
eee
a
H
&
J
Oe
21. FFTTFFTTFF
\
23. See answer to Exercise 17.
—
25. See answer to Exercise 19.
o-=—=
27. The computer gave the fit
vr
—--o0
leasi-squares sum of 0.0389105i.
17.
29. We achieved a least-squares sum of
5.838961 with the expoential fit
y = 0.8e°*, The computer achieved a
ms oeCS
&
|
-_—
—
I
JAPTER 7
ction 7.1
1. (-1, 1] 3. [-4,
-2, 1, 5]
4° (2 —4
4
NI
$365 3 0
5. Rp =|—> —3 ~2], Rs =|—4 —16 21. A, = —2, A, = Ay = 5; 2_, = spl] |
ILA
3)
~9
3 73 72 12
! 3 }
oF
Ih
0
caf BB
_ 3 2
E, = sp -3 ; hot diagonalizable
2
024 2
0001
10010
C=l0100 1 = Pye “VION TE nee
Pid “Ly UV10 3/10
2 1-3 x) finn val
i5./-1 0 1 S PI iW irV3) Lop Wt 5
00 1
AVA, = -1,A4,=5, FL, = o({1}}
x] [v3 -1rV2. 11% |
15. z v3 InV2 -1rVot|!
WV3 0 26 a
E,= so({-1]} diagonalizable
—t? + 2t? + 21,
1
19. A, = 0,A,= == Oj): Watc=kac=b
I 19. —5.4721361,2 + 3.4721361,
0 ] 21. —4.021597¢,? + 1.3230571,? + 4.69854
E,= sp|| 1 , £, = sp f ; diagonalizable
0 l 23. —4t? + tt? + 40? + Lg
y
t
9
S|
re
— ——)- Xx
1
-V2
10x? + 6xy + 2y2 = 4
2xy = |
x+2xy ty? =4
>
a-r b/2 g has no local extremum at (0, 0).
deti4 —M)= | p72 ¢-al =
YY
g has a local maximum of 3 ut (—5, 0).
NV — (a+ c)d + ae — b/4. g has no Jocal extremum at (0, 0).
en
The eigenvalues are given by A = The behavior of g at (3, 1) is not
() (a totVatoth- 4ac); they determined.
1 . g has a local maximum of 4 at (0, 0, 0).
—_
are real numbers because A is symmetric;
they have tue same algebraic sign if 13. g has no local extremum at (0, 0, 0).
b’ — 4ac < 0, and they have the opposite 15. g has no local extremum at (7, —6, 0).
algebraic sign if b? — 4ac > 0. One of them
17. (xy y=y +10 19. (x, y= ye +
is zero if b? — 4ac = 0. We obtain a
(possibly degenerate) ellipse, hyperbola, or 21. g(x, y) = xi + y+ 40
parabola accordingly. 23. The maximum is .5 at +(1/2)(1, 1). T.
(1 0 O minimum is —.5 at +(1/V2)(1, —1).
11. Let C7’=|0 O/r cfr|, where 25. The maximum is 9 at +(1/W/10~1, 3).
0 -c/r bir The mi imum is —1! at +(1/V 10)(3, 1).
27. The maximum is 6 at +(1/V2\(1, —1).
r= VP + c. Then C is an orthogonal
The minimum is 0 at +(1/2\(1, 1).
matnx such that det(C) = 1 and
29. The maximum is 3 at +(1/V/3)(1, -1, 1
t, x x The minimum is 0 at
L/=t=cC ly = | (by + cz)fr | +(1/V 2a? + 2h — 2abXa — 3B, a, b).
l, z (-cy + bz)r|
fo
31. The maximum is 2 at
Thus, (+1/V 2a + 2a, b, b, —a). The
minimum is -2 at +(1/2X0, ~-1, 1, 0)
L,
(bt, — “| 33. The local maximum of fis A,a’, where A
(ct, + bt,)/r the maximum eigenvalue of the symmet)
coefficient matrix of the form f; it is
represents a rotation of axes that assumed at any eigenvector correspondil
transforms the equation ax’ + by + cz=d to A, and of length a. An analogous
into the form at,’ + rt, = d. statement holds for the local minimum
13. Hyperboloid of two sheets of f-
15. Hyperboloid of one sheet 35. g has a local maximum of 5 at (0, 0, 0).
17. Hyperbolic cylinder 37. g has no local extremum at (0, 0, 0).
19. Hyperboloid of one sheet
21. Circular cone or hyperboloid of one or two Section 8.4
tme[ ths [ an [I
sheets
23. Elliptic cone or hyperboloid of one or two
sheets
25. Parabolic cylinder or two parallel planes or
one plane or empty Rayleigh quotients: —2, 1, 5.2
27. Hyperbolic paraboloid or hyperbolic Maximum eigenvalue 6, eigenvector E
cylinder or two intersecting planes
l l l
a=Z (ates Var oF= Mac - 2)
/Wr=15/,
Wy =l19], Wa =| 65
tod
7 29 103
= 5 (ates Va~ ot aB).
Rayleigh quotients. 6,6, 28
yleigh quotients: “4 = 4, 4222
0D = 3.5
b. If we use part a, the first row vector of
A - Alis
=—
—__—
Maximum eigenvalue 3, eigenvector
Ue
[a — A, } =
Wi
ie-es (a —c) + 48°), 7
1
bo l=
Ni—N]—
al
Nl
= (p= VETS, bh
to T= bof
re
aA
|
Nie
Ni
|
—
Wu
wu
Go fm
IN
wl
Dl Al—
+ Ob,b,"
Go fm
(-b,g = h)
Wl
NO
bt + (g + hy’
a bm
Al—
haa fm
eo fm
Wl
cc
|e
wo f— Geo
e
Ni=
wd |
column of C.
A = 12, v = [-.7059, 1, -.4118]
A = 6, ¥ = [-.9032, 1, -.4194] d. det(C) = He + ae) = 26h,
eo Ww
rs
A = .1883, v = [1, 2893, 3204] because A, 7, s = 0, we see that the
. A, = 4.732050807568877, algebraic sign of det(C) is the same as
v, = r[l, —.7320508, 1} that of 0.
wd
MN. !,i,-1, -i
—
-I +i -1-2i i -6i
23. 2,2 + Vi, 21, -V2 +V/2i, -2.
M7. 2
—-V2 -V2i, -21, V2 - Vi
M9. Eniering [Q, R] = qr(A), where A is the
matrix having the given vectors a,, a2, a; i
Section 9.2 column vectors, returns a matrix Q havin
as column vectors an orthonormal basis
—3+2i 2i 2i {q,, dh, 4}; where
3. AB = 2
2+3i -lt+i
2i 1
2+i
|,
q, = [-0.5774, -0.5774i, -0.5774i],
—2+2i i 2-1 q, = [-0.4695 — 0.17193,
BA=|2+3i I+3i 0 ~0.4695 — 0.17193, 0.2977 + 0.6414i],
21 -Il+i 0 = [0.4695 + 0.4291, 0.0726— 0.6414:
0.3703 + 0.1719
- If 2+: -i
~3/-1-i 1 To check, using MATLAB, the Student’s
Solutions Manual’s answers
| 9= 3 1 + 3i -4 + 8i
7. -3+i 3-f -2-61 Vv, = a, V) = a, V; = [1 — 31, -3 + i, -2
10} 944i 2-41 2-4i for an orthogonal basis, enter
3a +i ((1-1)/Q(1,1))*Q¢,1) to check v,, enter
-Z=— 9-31 11. sp||1 + 37 (1/Q(1,2))#Q(:,2) to check v,, and enter
10| 6 -2i 2 ((1—3+i)/Q(1,3))*Q(:,3) to check v,.
13. 3 18. a. (u, v) = 0, wv, u) = 0 MII. a. 0/274, 4476, 458, and V/353 for
b. uv) = 5 - 34,0, =5 + 3 rows |, 2, 3, and 4, respectively.
23. i d. 31 + 147
aQNnoo
(J + 2/) has rank | and nullity 3, 21. , fe, + €3, ©c, Cr, Cg, Cy — e;}
CHAPTER 10 | 0 0 1 2 -]
W.L=|3 1 OU=|0 1 5
Section 10.1 4-10 1 0 0 55|
1. There are 1 — ! flops performed on b
while the first column of A is being fixed
up, 2 — 2 flops while the second columa of
A is being fixed up, and so on. The total
number is(n — |) +(n-2}+-+-+ 42+ 1 00 0 | .2 -3
1 = n(n — 1)/2, which has order of 13.2 _| 2
or 1 0 00 ,U _|0-
0 1 9Q
0-9
magnitude n?/2 for large n.
3. mn, if we call each indexed addition a flop 4 27 | 0 0 0
5. mn? 7. 2n} 9. 3n3 11. 6n3 3
13. 30/2 15. 2 17. 2n x =
~|
19. wn, counting each final division as a flop L 2
21. FTFFTTFRTF
15. | 4 a
23. (No text answer data are possible for this -| 17.) 2
probicm, since different computers run at 2)
different speeds. However, for n large
enough to require 6 seconds or more to 1 0 O1 O Olf: 4 -
solve an n X mn system, our computer did 19.£DU=|0 1 O}}0 | Oo 4
require roughly 50% more time when using 3-4 1410 0 14]l0 9g
the Gauss—Jordan method than when
using the Gauss method with back 21.FFTFTFTFFT
substitution.)
25. See answer to Exercise 23. 23. Display:
1 3 ~-5 2
4-18 39 0 -}
Section 10.2 3 1666667 9 -2 4.16€
2. 2777778 1.407407 - 4.185185 5.42
1. It is not significant; no arithmetic ~6 1.166667 6666667 —4.141593 45.47¢
operations are involved, just storing of
l 5
ei
indexed values.
-| -8
For b,x =|} 0}; forb,x=|] 9]; for b,x =
2 i
~2 4
8125
25. |_| 1593 | nf ia
7 —.3125
~ 09131459
— 08786787
—.4790014
29. | "01573263
08439226
| 2712737
ction 10.3
21. We were able to complete a reduction for
1. xX, = 0,x,= l inverses of the Ist, 2nd, 4th, 8th, and 16th
powcrs of H,. The inverse checks for H,
3. X, = 1, x, = .9999. Yes, it is reasouably
and H,? were pretty good, and we expect
accurate. 10x, + 1000000x, = 1000000,
that
—10x, + 20x, = 10 is a system that can’t
be solved without pivoting by a five-fizure 16 -120 240 -140
computer. -| = —120 1200 -2700 1680
5. a, + 10'%x, = 10", —x, + 2x, = | Ay 240 -2700 6480 —4200
-140 1680 —4200 2800
. We need to show that (m(nA)"')A = I. It is
~~!
_ | -36
have x = | 361
M1. For n = 5, b = [I, 0);
The two components of b differ from the for n = 10, b = [I, 0};
corresponding ones of c by 10 and by 18. for n = 15, b = 10°[1.68, —2.90];
32 for n = 20, b = 10'[6.265, — 6.688];
17. For b, we have x =
240 for n = 25, b = 10'92.09, — 3.09};
and for c, we
— 1500)
1400 for n = 30, b = 10*[-6.56, 6.97].
1. Let P(n) be the equation to be proved. Clearly P(1) is true, because

    1 = [1(1 + 1)(2 + 1)]/6.

Assume that P(k) is true. Then

    1² + 2² + 3² + · · · + k² + (k + 1)²
       = [k(k + 1)(2k + 1)]/6 + (k + 1)²
       = [(k + 1)(2k² + k + 6k + 6)]/6
       = [(k + 1)(k + 2)(2k + 3)]/6.

Thus, P(k + 1) is true, and P(n) holds for all n ∈ Z⁺.

. Let P(n) be the equation to be proved. We see that P(1) is true. Assume that P(k) is true. Then

    1 + 3 + 5 + · · · + (2k − 1) + (2k + 1) = k² + 2k + 1 = (k + 1)²,

as required. Therefore, P(n) holds for all n ∈ Z⁺.

. … see that P(1) is true, because

    a(1 − r²)/(1 − r) = a(1 + r) = a + ar.

Assume that P(k) is true. Then

    a + ar + ar² + · · · + ar^k + ar^{k+1}
       = a(1 − r^{k+1})/(1 − r) + ar^{k+1}
       = a(1 − r^{k+1} + r^{k+1}(1 − r))/(1 − r)
       = a(1 − r^{k+2})/(1 − r),

which establishes P(k + 1). Therefore, P(n) is true for all n ∈ Z⁺.

. … has not been made precise; it is not well defined. Moreover, we work in mathematics with two-valued logic: a statement is either true or false, but not both. The assertion that not having an interesting property would be an interesting property seems to contradict this two-valued logic. We would be saying that the integer both has and does not have an interesting property.
INDEX
absorbing Markov chain, 114 closure, 89, 192 dependent set, 127, 194
absorbing state, 114 code Descartes, René, 419
addition binary group, 118 determinant, 239, 241, 245. 252
of matrices, 42 Hamming (7, 4), 119 computation of, 263
preservation of, 142,213 code word, 117 expansion of, 254
of vectors, 6, 7, 180 codomain, 142, 213 properties of, 256, 257
of words, 118 coefficient matnx, 54 De Witt, Jan, 419, 424
adjoint, Hermitian, 469 coefficient, scalar, 10, 191 diagonal, main, 41
adjoint matrix, 269 cofactor, 252 diagonal matrix, 41, 305
algebraic multiplicity, 312, 403 column space, 91, 136 diagonalizable matrix, 307, 477
angle column vector, 13 diagonalizable transformation, 404
between vectors, 24, 235 commuting matrices, 48 diagonalization, 307
preservation of, 166 complete induction, A-3 orthogonal, 354
argument complex conjugate, 457 diagonalizing a quadratic form,
of a complex number, 459 complex number, 455 415
principal, 459 argument of, 459 diagonalizing substitution, 410
asymptotes, 420 conjugate of, 457 difference of matrices, 43
augmented matrix, 54 imaginary part of, 455 differential, 280
axiom of choice, 198 magnitude of, 456 differential equation, linear, 216,
back substitution, 58 modulus of, 456 320
band matrix, 510 polar form of, 459 dimension
band width, 510 real part of, 455 of a subspace, 131
basis complex vector space, 465 of a vector space, 199
Jordan, 492 component of a vector, 4 direction vector, 168
ordered, 205 composite function, 149, 213 distance, 31
orthonormal, 340 composite transformation, 149, 214 between vectors, 233
standard, 10 conic section, 418 between words, 117
standard ordered, 205 conjugate domain, 142
of a subspace, 93 of a complex number, 457 dominant eigenvalue, 289
of a vector space, 197 of a matrix, 469 dot product, 24
basis matrix, 389 conjugate transpose, 469 preservation of, 166
Bezout, Etienne, 257 consistent linear system, 58 properties of, 25
binary alphabet, 115 contraction, 162 echelon form, 57
binary group code, 118 coordinate vector, 205, 389 reduced, 63
binary word, 116 unit, 22 eigenspace, 296, 476
biorthogonality, 301 coordinatization, 180 eigenvalue(s), 288, 476
Bombelli, Raphael, 458 Cotes, Roger, 375 algebraic multiplicity of, 312,
box, 274 Cramer, Gabriel. 257, 267 403
volume of, 276, A-9— Cramer’s rule, 266 eigenspace ot, 296
Bunyakovsky, Viktor Yakovlevich, cross product, 241 geometric multiplicity of, 312,
25 properties of, 246 403
Cardano, Gerolamo, 458 D’Alembert, Jean Le Rond, 290 left, 301
Cauchy, Augustin-Louis, 25, 257, decoding of a linear transformation, 297
310, 409 maximum-likelihood, 120 properties of, 296
Cayley, Arthur, 3, 37, 75 nearest-neighbor, 120 eigenvector(s), 109, 288, 476
Cayley-Hamilton theorem, 302 parity-check matrix, 122 left, 301
chain, Markov, 114 deflation, 442 of a linear transformation, 297
change-of-coordinates matrix, 390 degenerate ellipse, 420 properties of, 296
characteristic equation, 291 degenerate hyperbola, 420 Eisenstein, Ferdinand Gotthold, 40
characteristic polynomial, 291 degenerate n-box, 274 elementary column operations, 67
characteristic value, 288 degenerate parabola, 421 elementary matrix, 65
characteristic vector, 288 dependence relation, 127. 194 elementary row operations, 56
ellipse. 418 Gauss reduction with back information vector, 288, 318
degenerate, 420 substitution, 60 inner product, 24, 230
ellipsoid, 424, 427 Gauss-Jordan method. 62 product, Euclidean, 467
elliptic cone, 425, 427 general solution. 59, 97 inner-product space, 230
elliptic paraboloid, 425, 427 general solutiun vector, 91! invariant subspace, 500
equal vectors, 4 generating veciors, 90, 191 inverse
equation geometric multiplicity, 312, 403 additive, 180
characteristic, 79| Gibbs, J. Willard, 214, 241 of a matnx, 75
panty-check, 117 Gram, Jorgen P., 343 of a product, 77
zank, 139 Gram—Schmidt process, 342 ofa transformation, 151 221
equivalence relation, 315 Grassman, Hermann, 3, 191 inverse image, 142, 213
Euclidean inner product, 467 group code, binary, 118 invertible matrix, 75
Euclidean space, 2 Hamilton, William Rowan, 3, 241 invertible transformation, 151, 21
subspace of, 89 Hamming, Richard, 120 irrational number, 454
Euler, Leonhard, 172, 267, 290, 424 Hamming (7. 4) code, 119 irreducible polynomial, A-3
expansion, 162 Hamming weight, 117 isomorphic vector spaces, 208, 22
by minors, 254 Heaviside, Oliver, 214 isomorphism, 221
Fermat, Pierre, 419 Hermann, Jacpb, 172 Jacobi, Carl Gustav, 351
Fibonacci sequence, 287 Hermitian adjoini, 469 Jacobi’s method, 436
field, 464 Hermitian matnx, 471 Jordan, Camille, 490
finite-dimensional vector space, 199 spectral theorem for, 479 Jordan basis, 492
finitely generated vector space, i91, Heron, of Alexandna, 5 Jordan block, 488
198 Hilbert, David, 444, 534 Jordan canonical form, 313, 324,
flat, 171 dilbert matnx, 47, 534 489, 491
parametnic equations of, 172 homogeneous linear system, 88 sordan, Wilhelm, 63
vector equation of, 172 nontrivial solution of, 88 kernel, 148, 218
flop, 503 trivial solution of, 88 Lagrange, Joseph Louis, 244, 290
force, resultant, 5 Hooke’s law, 370 Laplace, Pierre Simon, 257
force vector, 3 Householder matnx, 359 LDU-factonzation, 521
torm hyperbola, 420 least squares, 373
Jordan canonical, 313, 324, 489, asymptotes of, 420 Legendre, Adnen-Mane, 375
491 denegerate, 421 Leibniz, Gottfned von, 251
negative definite, 433 hyperbolic cylinder, 424, 427 length
polar, 459 hyperbolic paraboloid 475, 427 preservation of, 166
positive definite, 433 hyperboloid, 425, 426, 427 ofa vector, 21
quadratic, 409 hyperplane, 171, 331 of a word, 117
free vanables, 59 idempotent matrix, 86, 365 l'H6pital, Marquis de, 251
Frobenius, Georg Ferdinand, 137. identity matrix, 41 line, 168, 171
310, 35! identity transformation, 150 along a vector, 11
full matrix, 510 ill-conditioned matrix, 534 parametric equations of, 169
full pivong, 530 image, 142, 213 vector equation of, 168
function, | 42 inverse, 142, 213 linear combination, 10, 191
codomain of, 142, 213 imaginary axis. 455 linearly dependent, 127, 194
composite, 149, 215 imaginary number, 455 linearly independent, 127, 194
demain of, 142 inconsistent linear system, 58 linear system(s), 14, 51
local maximum of, 431 independent set, 127. 194 consistent, 58
locai minimum of, 433 independent vectors, 127, 194 general solution of, 59
one-to-one, 219 induction, A-1 homogeneous, 88
onto, 219 complete, A-3 history of, 52
scalar multiplication of, 183 induction axiom, A-1 inconsistent, 58
range of, 142, 213 induction hypothesis, A-2 overdetermined, 101, 370
sum of, 182 inequality particular solution of, 59
weight, 237 Schwartz, 24, 29 solution of, 51
fundamental theorem of algebra, triangle, 21, 30 square, 95
455 infinite series, 43] underdetermined, 96
Gauss, Carl Friedrich, 40, 66, 375 information portion, 117 unique soluiion case, 96