Matrix Methods - Applied Linear Algebra 3rd Ed - Bronson, Costa PDF
Matrix Methods - Applied Linear Algebra 3rd Ed - Bronson, Costa PDF
Matrix Methods - Applied Linear Algebra 3rd Ed - Bronson, Costa PDF
MATRIX METHODS:
Applied Linear Algebra
Third Edition
Richard Bronson
Fairleigh Dickinson University
Teaneck, New Jersey
Gabriel B. Costa
United States Military Academy
West Point, New York
To Evy...again.
R.B.
Contents
Preface
xi
xiii
Acknowledgments
xv
Matrices
1.1
1.2
1.3
1.4
1.5
1.6
1.7
Basic Concepts
1
Problems 1.1
3
Operations
6
Problems 1.2
8
Matrix Multiplication
9
Problems 1.3
16
Special Matrices
19
Problems 1.4
23
Submatrices and Partitioning
29
Problems 1.5
32
Vectors
33
Problems 1.6
34
The Geometry of Vectors
37
Problems 1.7
41
43
Linear Systems
43
Problems 2.1
45
Solutions by Substitution
50
Problems 2.2
54
Gaussian Elimination
54
Problems 2.3
62
vii
viii
Contents
2.4
2.5
2.6
2.7
2.8
The Inverse
3.1
3.2
3.3
3.4
3.5
3.6
88
93
Introduction
93
Problems 3.1
98
Calculating Inverses
101
Problems 3.2
106
Simultaneous Equations
109
Problems 3.3
111
Properties of the Inverse
112
Problems 3.4
114
LU Decomposition
115
Problems 3.5
121
Final Comments on Chapter 3
124
An Introduction to Optimization
4.1
4.2
4.3
4.4
4.5
Pivoting Strategies
65
Problems 2.4
70
Linear Independence
71
Problems 2.5
76
Rank
78
Problems 2.6
83
Theory of Solutions
84
Problems 2.7
87
Final Comments on Chapter 2
Graphing Inequalities
127
Problems 4.1
130
Modeling with Inequalities
131
Problems 4.2
133
Solving Problems Using Linear Programming
135
Problems 4.3
140
An Introduction to The Simplex Method
140
Problems 4.4
147
Final Comments on Chapter 4
147
Determinants
5.1
5.2
5.3
5.4
127
149
Introduction
149
Problems 5.1
150
Expansion by Cofactors
152
Problems 5.2
155
Properties of Determinants
157
Problems 5.3
161
Pivotal Condensation
163
Problems 5.4
166
ix
Contents
5.5
5.6
5.7
173
Inversion
167
Problems 5.5
169
Cramers Rule
170
Problems 5.6
173
Final Comments on Chapter 5
177
Denitions
177
Problems 6.1
179
Eigenvalues
180
Problems 6.2
183
Eigenvectors
184
Problems 6.3
188
Properties of Eigenvalues and Eigenvectors
Problems 6.4
193
Linearly Independent Eigenvectors
194
Problems 6.5
200
Power Methods
201
Problems 6.6
211
Matrix Calculus
190
213
7.1
Well-Dened Functions
213
Problems 7.1
216
7.2 CayleyHamilton Theorem
219
Problems 7.2
221
7.3 Polynomials of MatricesDistinct Eigenvalues
Problems 7.3
226
7.4 Polynomials of MatricesGeneral Case
228
Problems 7.4
232
7.5 Functions of a Matrix
233
Problems 7.5
236
238
7.6 The Function e At
Problems 7.6
240
7.7 Complex Eigenvalues
241
Problems 7.7
244
245
7.8 Properties of e A
Problems 7.8
247
7.9 Derivatives of a Matrix
248
Problems 7.9
253
7.10 Final Comments on Chapter 7
254
222
Contents
Fundamental Form
257
Problems 8.1
261
Reduction of an nth Order Equation
263
Problems 8.2
269
Reduction of a System
269
Problems 8.3
274
Solutions of Systems with Constant Coefcients
Problems 8.4
285
Solutions of SystemsGeneral Case
286
Problems 8.5
294
Final Comments on Chapter 8
295
257
275
297
310
315
327
355
411
357
Preface
It is no secret that matrices are used in many elds. They are naturally present
in all branches of mathematics, as well as, in many engineering and science elds.
Additionally, this simple but powerful concept is readily applied to many other
disciplines, such as economics, sociology, political science, nursing and psychology.
The Matrix is a dynamic construct. New applications of matrices are still
evolving, and our third edition of Matrix Methods: Applied Linear Algebra
(previously An Introduction) reects important changes that have transpired since
the publication of the previous edition.
In this third edition, we added material on optimization and probability theory.
Chapter 4 is new and covers an introduction to the simplex method, one of the
major applied advances in the last half of the twentieth century. Chapter 9 is
also new and introduces Markov Chains, a primary use of matrices to probability
applications. To ensure that the book remains appropriate in length for a one
semester course, we deleted some of the subject matter that is more advanced;
specically, chapters on the Jordan Canonical Form and on Special Matrices (e.g.,
Hermitian and Unitary Matrices). We also included an Appendix dealing with
technological support, such as computer algebra systems. The reader will also nd
that the text contains a considerable modeling avor.
This edition remains a textbook for the student, not the instructor. It remains
a book on methodology rather than theory. And, as in all past editions, proofs are
given in the main body of the text only if they are easy to follow and revealing.
For most of this book, a rm understanding of basic algebra and a smattering
of trigonometry are the only prerequisites; any references to calculus are few and
far between. Calculus is required for Chapter 7 and Chapter 8; however, these
chapters may be omitted with no loss of continuity, should the instructor wish
to do so. The instructor will also nd that he/she can mix and match chapters
depending on the particular course requirements and the needs of the students.
xi
xii
Preface
In closing, we would like to acknowledge the many people who helped to make
this book a reality. These include the professors, most notably Nicholas J. Rose,
who introduced us to the subject matter and instilled in us their love of matrices.
They also include the hundreds of students who interacted with us when we passed
along our knowledge to them. Their questions and insights enabled us to better
understand the underlying beauty of the eld and to express it more succinctly.
Special thanks go to the Most Reverend John J. Myers, Archbishop of Newark,
as well as to the Reverend Monsignor James M. Cafone and the Priest Community
at Seton Hall University. Gratitude is also given to the administrative leaders
of Seton Hall University, and to Dr. Joan Guetti and to the members of the
Department of Mathematics and Computer Science. Finally, thanks are given to
Colonel Michael Phillips and to the members of the Department of Mathematical
Sciences of the United States Military Academy.
Richard Bronson
Teaneck, NJ
Gabriel B. Costa
West Point, NY and South Orange, NJ
xiii
Acknowledgments
Many readers throughout the country have suggested changes and additions to
the rst edition, and their contributions are gratefully acknowledged. They include
John Brillhart, of the University of Arizona; Richard Thornhill, of the University
of Texas; Ioannis M. Roussos, of the University of Southern Alabama; Richard
Scheld and James Jamison, of Memphis State University; Hari Shankar, of Ohio
University; D.J. Hoshi, of ITT-West; W.C. Pye and Jeffrey Stuart, of the University
of Southern Mississippi; Kevin Andrews, of Oakland University; Harold Klee,
of the University of Central Florida; Edwin Oxford, Patrick ODell and Herbert
Kasube, of Baylor University; and Christopher McCord, Philip Korman, Charles
Groetsch and John King, of the University of Cincinnati.
Special thanks must also go to William Anderson and Gilbert Steiner, of Fairleigh Dickinson University, who were always available to me for consultation
and advice in writing this edition, and to E. Harriet, whose assistance was instrumental in completing both editions. Finally, I have the opportunity to correct a
twenty-year oversight: Mable Dukeshire, previously Head of the Department of
Mathematics at FDU, now retired, gave me support and encouragement to write
the rst edition. I acknowledge her contribution now, with thanks and friendship.
xv
1
Matrices
1.1
Basic Concepts
Denition 1 A matrix is a rectangular array of elements arranged in horizontal
rows and vertical columns. Thus,
1 3
5
,
(1)
2 0 1
4
3
0
1
2
4
1
1,
2
(2)
and
2
19.5
(3)
a1n
a2n
a3n
,
..
.
a11
a21
A = a31
..
.
a12
a22
a32
..
.
a13
a23
a33
..
.
ap1
ap2
ap3
apn
(4)
Chapter 1
Matrices
which is often abbreviated to [aij ]p n or just [aij ]. In this notation, aij represents
the general element of the matrix and appears in the ith row and the jth column.
The subscript i, which represents the row, can have any value 1 through p, while
the subscript j, which represents the column, runs 1 through n. Thus, if i = 2 and
j = 3, aij becomes a23 and designates the element in the second row and third
column. If i = 1 and j = 5, aij becomes a15 and signies the element in the rst
row, fth column. Note again that the row index is always given before the column
index.
Any element having its row index equal to its column index is a diagonal
element. Thus, the diagonal elements of a matrix are the elements in the 11
position, 22 position, 33 position, and so on, for as many elements of this type
that exist. Matrix (1) has 1 and 0 as its diagonal elements, while matrix (2) has 4,
2, and 2 as its diagonal elements.
If the matrix has as many rows as columns, p = n, it is called a square matrix;
in general it is written as
a11
a21
a31
..
.
an1
a12
a22
a32
..
.
an2
a1n
a2n
a3n .
..
.
a13
a23
a33
..
.
an3
(5)
ann
In this case, the elements a11 , a22 , a33 , . . . , ann lie on and form the main (or
principal) diagonal.
It should be noted that the elements of a matrix need not be numbers; they
can be, and quite often arise physically as, functions, operators or, as we shall see
later, matrices themselves. Hence,
(t + 1)dt t
3t 2 ,
sin
cos
cos
,
sin
and
x2
ex
d
ln x
dx
x+2
1.1
Basic Concepts
Problems 1.1
1. Determine the orders of the following matrices:
3
1 2 4 7
1 2 3
2
5 6 5 7
A =
, B = 0 0 0,
0
3
1 2 0
4 3 2
3 5
2 2 2
3
t
1 2
3
4
4
t2
t
8, D =
C = 5 6 7
t+2
3t
10 11 12 12
2t 3 5t 2
1
1 1
1
313
5
2 3
2
4
, F = 10, G =
E =
2 3
46.3
5
0
2 5
3 5
6
4
0 0
H=
, J = [1 5 30].
0 0
t2
6t
1
2t 5
0
5
,
2
3t 2
505
18
,
1.043
2. Find, if they exist, the elements in the 13 and the 21 positions for each of
the matrices dened in Problem 1.
3. Find, if they exist, a23 , a32 , b31 , b32 , c11 , d22 , e13 , g22 , g23 , and h32 for the
matrices dened in Problem 1.
4. Construct the 2 2 matrix A having aij = (1)i + j .
5. Construct the 3 3 matrix A having aij = i/j.
6. Construct the n n matrix B having bij = n i j. What will this matrix be
when specialized to the 3 3 case?
7. Construct the 2 4 matrix C having
i when i = 1,
cij =
j when i = 2.
8. Construct the 3 4 matrix D having
i + j
dij = 0
ij
when i > j,
when i = j,
when i < j.
9. Express the following times as matrices: (a) A quarter after nine in the morning. (b) Noon. (c) One thirty in the afternoon. (d) A quarter after nine in the
evening.
10. Express the following dates as matrices:
(a) July 4, 1776
(c) April 23, 1809
Chapter 1
Matrices
11. A gasoline station currently has in inventory 950 gallons of regular unleaded
gasoline, 1253 gallons of premium, and 98 gallons of super. Express this
inventory as a matrix.
12. Store 1 of a three store chain has 3 refrigerators, 5 stoves, 3 washing machines,
and 4 dryers in stock. Store 2 has in stock no refrigerators, 2 stoves, 9 washing
machines, and 5 dryers, while store 3 has in stock 4 refrigerators, 2 stoves, and
no washing machines or dryers. Present the inventory of the entire chain as a
matrix.
13. The number of damaged items delivered by the SleepTight Mattress Company
from its various plants during the past year is given by the matrix
80
50
90
12
40
10
16
16.
50
The rows pertain to its three plants in Michigan,Texas, and Utah. The columns
pertain to its regular model, its rm model, and its extra-rm model, respectively. The companys goal for next year to is to reduce by 10% the number
of damaged regular mattresses shipped by each plant, to reduce by 20% the
number of damaged rm mattresses shipped by its Texas plant, to reduce by
30% the number of damaged extra-rm mattresses shipped by its Utah plant,
and to keep all other entries the same as last year. What will next years matrix
be if all goals are realized?
14. A person purchased 100 shares of AT&T at $27 per share, 150 shares of
Exxon at $45 per share, 50 shares of IBM at $116 per share, and 500 shares of
PanAm at $2 per share. The current price of each stock is $29, $41, $116, and
$3, respectively. Represent in a matrix all the relevant information regarding
this persons portfolio.
15. On January 1, a person buys three certicates of deposit from different institutions, all maturing in one year. The rst is for $1000 at 7%, the second is
for $2000 at 7.5%, and the third is for $3000 at 7.25%. All interest rates are
effective on an annual basis.
(a) Represent in a matrix all the relevant information regarding this persons
holdings.
(b) What will the matrix be one year later if each certicate of deposit is
renewed for the current face amount and accrued interest at rates one
half a percent higher than the present?
16. (Markov Chains, see Chapter 9) A nite Markov chain is a set of objects,
a set of consecutive time periods, and a nite set of different states such
that
(i) during any given time period, each object is in only state (although
different objects can be in different states), and
1.1
Basic Concepts
(ii) the probability that an object will move from one state to another state
(or remain in the same state) over a time period depends only on the
beginning and ending states.
A Markov chain can be represented by a matrix P = pij where pij represents
the probability of an object moving from state i to state j in one time period.
Such a matrix is called a transition matrix.
Construct a transition matrix for the following Markov chain: Census gures show a population shift away from a large mid-western metropolitan
city to its suburbs. Each year, 5% of all families living in the city move to
the suburbs while during the same time period only 1% of those living in the
suburbs move into the city. Hint: Take state 1 to represent families living in
the city, state 2 to represent families living in the suburbs, and one time period
to equal a year.
17. Construct a transition matrix for the following Markov chain: Every four
years, voters in a New England town elect a new mayor because a town
ordinance prohibits mayors from succeeding themselves. Past data indicate
that a Democratic mayor is succeeded by another Democrat 30% of the time
and by a Republican 70% of the time. A Republican mayor, however, is
succeeded by another Republican 60% of the time and by a Democrat 40%
of the time. Hint: Take state 1 to represent a Republican mayor in ofce, state
2 to represent a Democratic mayor in ofce, and one time period to be four
years.
18. Construct a transition matrix for the following Markov chain: The apple
harvest in New York orchards is classied as poor, average, or good. Historical data indicates that if the harvest is poor one year then there is a 40%
chance of having a good harvest the next year, a 50% chance of having an average harvest, and a 10% chance of having another poor harvest. If a harvest
is average one year, the chance of a poor, average, or good harvest the next
year is 20%, 60%, and 20%, respectively. If a harvest is good, then the chance
of a poor, average, or good harvest the next year is 25%, 65%, and 10%,
respectively. Hint: Take state 1 to be a poor harvest, state 2 to be an average
harvest, state 3 to be a good harvest, and one time period to equal one year.
19. Construct a transition matrix for the following Markov chain. Brand X and
brand Y control the majority of the soap powder market in a particular region,
and each has promoted its own product extensively. As a result of past advertising campaigns, it is known that over a two year period of time 10% of
brand Y customers change to brand X and 25% of all other customers change
to brand X. Furthermore, 15% of brand X customers change to brand Y and
30% of all other customers change to brand Y. The major brands also lose customers to smaller competitors, with 5% of brand X customers switching to a
minor brand during a two year time period and 2% of brandY customers doing
likewise. All other customers remain loyal to their past brand of soap powder.
Hint: Take state 1 to be a brand X customer, state 2 a brand Y customer, state
3 another brand customer, and one time period to be two years.
Chapter 1
1.2
Matrices
Operations
The simplest relationship between two matrices is equality. Intuitively one feels
that two matrices should be equal if their corresponding elements are equal. This
is the case, providing the matrices are of the same order.
Denition 1 Two matrices A = [aij ]pn and B = [bij ]pn are equal if they have
the same order and if aij = bij (i = 1, 2, 3, . . . , p; j = 1, 2, 3, . . . , n). Thus, the
equality
5x + 2y
7
=
x 3y
1
implies that 5x + 2y = 7 and x 3y = 1.
The intuitive denition for matrix addition is also the correct one.
Denition 2 If A = [aij ] and B = [bij ] are both of order p n, then A + B is
a p n matrix C = [cij ] where cij = aij + bij (i = 1, 2, 3, . . . , p; j = 1, 2, 3, . . . , n).
Thus,
5
7
2
1
6
3
5 + (6)
3 + 2 1 =
7+2
1
4
1
(2) + 4
and
t2
3t
1
5
+
t
0
5
1
2
1+3
1
3 + (1) = 9
(1) + 1
2
2
6
t +1
=
t
4t
0
0
1
and
6
1
4
2
0
1
;
t
2
1
1.2
Operations
=
.
3 2
4 1
7 3
Another simple operation is that of multiplying a scalar times a matrix. Intuition guides one to perform the operation elementwise, and once again intuition
is correct. Thus, for example,
7
1
3
2
4
7
21
14
28
and
1
3
0
t
=
2
3t
0
.
2t
Find 5A 21 B if
4
0
A=
1
3
and
B=
6
18
20
8
Solution
4
0
5A 21 B = 5
=
20
0
6 20
1
21
18
8
3
5
3 10
17 15
=
.
15
9
4
9 11
It is not difcult to show that if 1 and 2 are scalars, and if A and B are matrices
of identical order, then
(S1) 1 A = A1 ,
(S2) 1 (A + B) = 1 A + 1 B,
(S3) (1 + 2 )A = 1 A + 2 A,
(S4) 1 (2 A) = (1 2 )A.
The reader is cautioned that there is no such operation as matrix division. We
will, however, dene a somewhat analogous operation, namely matrix inversion, in
Chapter 3.
Chapter 1
Matrices
Problems 1.2
In Problems 1 through 26, let
2
,
4
1
A=
3
3
1
D=
3
2
6
1
0
, C=
,
8
3 3
2
2
0
0 2
1
E=
5 3, F = 0
5
1
2
5
B=
7
1
2
,
2
6
1
0
.
0
2
1. Find 2A.
2. Find 5A.
3. Find 3D.
4. Find 10E.
5. Find F.
6. Find A + B.
7. Find C + A.
8. Find D + E.
9. Find D + F.
10. Find A + D.
11. Find A B.
12. Find C A.
13. Find D E.
14. Find D F.
18. Find 2E + F.
19. Find X if A + X = B.
20. Find Y if 2B + Y = C.
21. Find X if 3D X = E.
22. Find Y if E 2Y = F.
24. Find S if 3F 2S = D.
2
A=
4
2 1
1/
and
2 1
B=
3/
6
.
3 + 2 + 1
32. (a) Mr. Jones owns 200 shares of IBM and 150 shares of AT&T. Determine a
portfolio matrix that reects Mr. Jones holdings.
(b) Over the next year, Mr. Jones triples his holdings in each company. What
is his new portfolio matrix?
(c) The following year Mr. Jones lists changes in his portfolio as 50 100 .
What is his new portfolio matrix?
1.3
Matrix Multiplication
1.3
Matrix Multiplication
Matrix multiplication is the rst operation we encounter where our intuition fails.
First, two matrices are not multiplied together elementwise. Secondly, it is not
always possible to multiply matrices of the same order while it is possible to multiply certain matrices of different orders. Thirdly, if A and B are two matrices for
which multiplication is dened, it is generally not the case that AB = BA; that is,
matrix multiplication is not a commutative operation. There are other properties
of matrix multiplication, besides the three mentioned that defy our intuition, and
we shall illustrate them shortly. We begin by determining which matrices can be
multiplied.
Rule 1 The product of two matrices AB is dened if the number of columns
of A equals the number of rows of B.
10
Chapter 1
Matrices
6 1
1 2
0
1
and
1
B = 3
4
0
2
1
1 0
2 1,
1 0
(6)
then the product AB is dened since A has three columns and B has three rows.
The product BA, however, is not dened since B has four columns while A
has only two rows.
When the product is written AB, A is said to premultiply B while B is said to
postmultiply A.
Rule 2 If the product AB is dened, then the resultant matrix will have the same
number of rows as A and the same number of columns as B.
Thus, the product AB, where A and B are given in (6), will have two rows and
four columns since A has two rows and B has four columns.
An easy method of remembering these two rules is the following: write the
orders of the matrices on paper in the sequence in which the multiplication is to
be carried out; that is, if AB is to be found where A has order 2 3 and B has
order 3 4, write
(2 3)(3 4)
(7)
If the two adjacent numbers (indicated in (7) by the curved arrow) are both equal
(in the case they are both three), the multiplication is dened. The order of the
product matrix is obtained by canceling the adjacent numbers and using the two
remaining numbers. Thus in (7), we cancel the adjacent 3s and are left with 2 4,
which in this case, is the order of AB.
As a further example, consider the case where A is 4 3 matrix while B is
a 3 5 matrix. The product AB is dened since, in the notation (4 3)(3 5),
the adjacent numbers denoted by the curved arrow are equal. The product will
be a 4 5 matrix. The product BA however is not dened since in the notation
(3 5)(4 3) the adjacent numbers are not equal. In general, one may schematically state the method as
(k n)(n p) = (k p).
Rule 3 If the product AB = C is dened, where C is denoted by [cij ], then the
element cij is obtained by multiplying the elements in ith row of A by the corresponding elements in the jth column of B and adding. Thus, if A has order k n,
and B has order n p, and
a
21 a22 a2n b21 b22 b2p c21 c22 c2p
.
=
..
.. ..
..
.. ..
..
..
,
.
.
.
. .
.
. .
.
.
ak1 ak2 akn
bn1 bn2 bnp
ck1 ck2 ckp
1.3
11
Matrix Multiplication
then c11 is obtained by multiplying the elements in the rst row of A by the
corresponding elements in the rst column of B and adding; hence,
c11 = a11 b11 + a12 b21 + + a1n bn1 .
The element c12 is found by multiplying the elements in the rst row of A by the
corresponding elements in the second column of B and adding; hence.
c12 = a11 b12 + a12 b22 + + a1n bn2 .
The element ckp is obtained by multiplying the elements in the kth row of A by
the corresponding elements in the pth column of B and adding; hence,
ckp = ak1 b1p + ak2 b2p + + akn bnp .
Example 1
Find AB and BA if
1
A=
4
2
5
3
6
7
3
9
6
0
8
10
11
and
7
B = 9
0
8
10.
11
Solution
1 2
4 5
1(7) + 2(9) + 3(0) 1(8) + 2(10) + 3(11)
=
4(7) + 5(9) + 6(0) 4(8) + 5(10) + 6(11)
7 + 18 + 0
8 + 20 33
11 21
=
=
,
28 + 45 + 0 32 + 50 66
17 48
7 8
1 2 3
9
10
BA =
4 5 6
0 11
(7)1 + (8)4
(7)2 + (8)5
(7)3 + (8)6
9(2) + 10(5)
9(3) + 10(6)
= 9(1) + 10(4)
0(1) + (11)4
0(2) + (11)5
0(3) + (11)6
7 32 14 40 21 48
39 54 69
18 + 50
27 + 60 = 49
68
87.
= 9 + 40
0 44
0 55
0 66
44 55 66
AB =
The preceding three rules can be incorporated into the following formal
denition:
12
Chapter 1
Matrices
Find AB if
1
0
1
2
A = 1
3
and
3
B=
4
1
2
5
1
1
.
0
Solution
2 1
3
1 5 1
AB = 1 0
4 2 1
0
3 1
2(3) + 1(4)
2(1) + 1(2)
2(5) + 1(1)
2(1) + 1(0)
= 1(3) + 0(4) 1(1) + 0(2) 1(5) + 0(1) 1(1) + 0(0)
3(3) + 1(4)
3(1) + 1(2)
3(5) + 1(1)
3(1) + 1(0)
0 11 2
10
1.
= 3 1 5
13
1 16 3
Example 3
Find AB and BA if
2
A=
1
1
3
and
4
B=
1
0
.
2
Solution
2 1 4 0
2(4) + 1(1)
2(0) + 1(2)
9 2
AB =
=
=
;
1 3 1 2
1(4) + 3(1) 1(0) + 3(2)
1 6
4 0
2 1
4(2) + 0(1) 4(1) + 0(3)
8 4
BA =
=
=
.
1 2 1 3
1(2) + 2(1) 1(1) + 2(3)
0 7
This, therefore, is an example where both products AB and BA are dened but
unequal.
Example 4
Find AB and BA if
3
A=
0
1
4
and
1
B=
0
1
.
2
1.3
13
Matrix Multiplication
Solution
3 1 1 1
3 5
=
,
0 4 0 2
0 8
1 1 3 1
3 5
BA =
=
.
0 2 0 4
0 8
AB =
This, therefore, is an example where both products AB and BA are dened and
equal.
In general, it can be shown that matrix multiplication has the following
properties:
A(BC) = (AB)C
A(B + C) = AB + AC
(B + C)A = BA + CA
(M1)
(M2)
(M3)
(Associative Law)
(Left Distributive Law)
(Right Distributive Law)
providing that the matrices A, B, C have the correct order so that the above
multiplications and additions are dened. The one basic property that matrix
multiplication does not possess is commutativity; that is, in general, AB does not
equal BA (see Example 3). We hasten to add, however, that while matrices in
general do not commute, it may very well be the case that, given two particular
matrices, they do commute as can be seen from Example 4.
Commutativity is not the only property that matrix multiplication lacks. We
know from our experiences with real numbers that if the product xy = 0, then
either x = 0 or y = 0 or both are zero. Matrices do not possess this property as the
following example shows:
Example 5
Find AB if
A=
4
2
2
1
and
B=
3
6
4
.
8
Solution
4 2
3
AB =
2 1 6
0 0
=
.
0 0
4
4(3) + 2(6)
=
8
2(3) + 1(6)
4(4) + 2(8)
2(4) + 1(8)
14
Chapter 1
Matrices
Find AB and AC if
2
,
1
4
2
A=
B=
1
2
1
,
1
C=
2
0
2
.
1
Solution
4
AB =
2
4
AC =
2
2
1
2
1
1
2
2
0
1
4(1) + 2(2) 4(1) + 2(1)
8 6
=
=
;
1
2(1) + 1(2) 2(1) + 1(1)
4 3
2
4(2) + 2(0) 4(2) + 2(1)
8 6
=
=
.
1
2(2) + 1(0) 2(2) + 1(1)
4 3
The reader has no doubt wondered why this seemingly complicated procedure
for matrix multiplication has been introduced when the more obvious methods
of multiplying matrices termwise could be used. The answer lies in systems of
simultaneous linear equations. Consider the set of simultaneous linear equations
given by
5x 3y + 2z = 14,
x + y 4z = 7,
7x
3z = 1.
(8)
This system can easily be solved by the method of substitution. Matrix algebra,
however, will give us an entirely new method for obtaining the solution.
Consider the matrix equation
Ax = b
(9)
where
5
A = 1
7
3
1
0
2
4,
3
x
x = y,
z
and
14
b = 7.
1
Here A, called the coefcient matrix, is simply the matrix whose elements are the
coefcients of the unknowns x, y, z in (8). (Note that we have been very careful to put all the x coefcients in the rst column, all the y coefcients in the
second column, and all the z coefcients in the third column. The zero in the
(3, 2) entry appears because the y coefcient in the third equation of system (8)
1.3
15
Matrix Multiplication
is zero.) x and b are obtained in the obvious manner. One note of warning: there
is a basic difference between the unknown matrix x in (9) and the unknown variable x. The reader should be especially careful not to confuse their respective
identities.
Now using our denition of matrix multiplication, we have that
5 3
2
x
(5)(x) + (3)(y) + (2)(z)
1 4 y = (1)(x) + (1)(y) + (4)(z)
Ax = 1
z
(7)(x) + (0)(y) + (3)(z)
7
0 3
5x 3y + 2z
14
= x + y 4z = 7.
7x
3z
1
(10)
Using the denition of matrix equality, we see that (10) is precisely system (8).
Thus (9) is an alternate way of representing the original system. It should come
as no surprise, therefore, that by redening the matrices A, x, b, appropriately, we
can represent any system of simultaneous linear equations by the matrix equation
Ax = b.
Example 7
y + z + w = 5,
+ y z
= 4,
+ 2y
+ 2w = 0,
2y + 3z + 4w = 1.
Solution Dene
1
2
A =
3
1
1
1
2
2
1 1
1 0
,
0 2
3 4
x
y
x =
z ,
w
5
4
b =
0.
1
Unfortunately, we are not yet in a position to solve systems that are in matrix
form Ax = b. One method of solution depends upon the operation of inversion,
and we must postpone a discussion of it until the inverse has been dened. For
the present, however, we hope that the reader will be content with the knowledge that matrix multiplication, as we have dened it, does serve some useful
purpose.
16
Chapter 1
Matrices
Problems 1.3
1. The order of A is 2 4, the order of B is 4 2, the order of C is 4 1, the
order of D is 1 2, and the order of E is 4 4. Find the orders of
(a) AB,
(e) CD,
(i) ABC,
(b) BA,
(f) AE,
(j) DAE,
(c) AC,
(g) EB,
(k) EBA,
(d) CA,
(h) EA,
(l) EECD.
1
1
2,
D = 1
2 2
6
1
0 1
, C=
,
8
3 2 1
2
2
1
0
1 2
E = 0 2 1, F = 1 1 0,
1
2 3
1
0
1
X = [1
Y = [1
A=
1
3
2
,
4
B=
2],
5
7
1].
2. Find AB.
3. Find BA.
4. Find AC.
5. Find BC.
6. Find CB.
7. Find XA.
8. Find XB.
9. Find XC.
20. Find AB if
A=
2
3
6
9
and
B=
3
1
6
.
2
2
,
0
2
B=
1
4
,
2
1
3
2
4
x
.
y
1
C=
3
6
.
4
1.3
17
Matrix Multiplication
1
3
1
1
x
1 y.
0
z
0
1
3
a12
a22
x
.
y
b11
b21
b12
b22
b13
b23
2
1.
3
1
4
A=
2
.
3
3
2
A=
5
.
4
2
A = 3
0
1
2
0
1
1.
1
2
A=
1
1
,
3
0
B=
1
1
,
4
5
C=
2
1
.
1
18
Chapter 1
Matrices
+
+
+
+
2z +
y+
2y +
2z +
4w = 5,
w = 0,
2z = 3,
3w = 4.
35. The price schedule for a Chicago to Los Angeles ight is given by P =
[200 350 500], where the matrix elements pertain, respectively, to coach
tickets, business-class tickets, and rst-class tickets. The number of tickets
purchased in each category for a particular ight is given by
130
N = 20.
10
Compute the products (a) PN, and (b) NP, and determine their signicance.
36. The closing prices of a persons portfolio during the past week are given by
the matrix
40 21 40 78 41 41
40
1
5
1
7 ,
P=
3
3
3
4
3
8
2
8
4
10
9 43 10 18 10
9 58
where the columns pertain to the days of the week, Monday through Friday,
and the rows pertain to the prices of Orchard Fruits, Lion Airways, and
Arrow Oil. The persons holdings in each of these companies are given by
the matrix H = [100 500 400]. Compute the products (a) HP, and (b) PH,
and determine their signicance.
37. The time requirements for a company to produce three products is given by
the matrix
1.4
19
Special Matrices
10.50
W = 14.00.
12.25
Compute the product TW and determine its signicance.
38. Continuing with the data given in the previous problem, assume further that
the number of items on order for lamp bases, cabinets, and tables, respectively,
is given by the matrix O = [1000 100 200]. Compute the product OTW,
and determine its signicance.
39. The results of a u epidemic at a college campus are collected in the matrix
0.20
F = 0.10
0.70
0.20
0.30
0.50
0.15
0.30
0.55
0.15
0.40.
0.45
1050
950
1100 1050
.
C=
360
500
860 1000
Compute the product FC, and determine its signicance.
1.4
Special Matrices
There are certain types of matrices that occur so frequently that it becomes advisable to discuss them separately. One such type is the transpose. Given a matrix A,
the transpose of A, denoted by AT and read A-transpose, is obtained by changing
all the rows of A into columns of AT while preserving the order; hence, the rst
row of A becomes the rst column of AT , while the second row of A becomes
the second column of AT , and the last row of A becomes the last column of AT .
Thus if
1 4 7
1 2 3
A = 4 5 6, then AT = 2 5 8
3 6 9
7 8 9
20
Chapter 1
Matrices
and if
A=
1
5
2
6
3
7
4
,
8
1
2
T
then A =
3
4
5
6
.
7
8
(AT )T = A,
(A)T = AT where represents a scalar,
(A + B)T = AT + BT ,
(A + B + C)T = AT + BT + CT ,
(AB)T = BT AT ,
(ABC)T = CT BT AT
Transposes of sums and products of more than three matrices are dened in the
obvious manner. We caution the reader to be alert to the ordering of properties (5)
and (6). In particular, one should be aware that the transpose of a product is not
the product of the transposes but rather the commuted product of the transposes.
Example 1
Solution
3
4
0
1
and
B=
1
3
2
1
1
.
0
3 1
7;
(AB) = 6
3
4
1
3
3 1
3 4
7.
= 6
BT AT = 2 1
0 1
1
0
3
4
3
AB =
1
6 3
,
7 4
A zero row in a matrix is a row containing only zeros, while a nonzero row is
one that contains at least one nonzero element. A matrix is in row-reduced form
if it satises four conditions:
(R1) All zero rows appear below nonzero rows when both types are present in
the matrix.
(R2) The rst nonzero element in any nonzero row is unity.
1.4
21
Special Matrices
(R3) All elements directly below ( that is, in the same column but in succeeding
rows from) the rst nonzero element of a nonzero row are zero.
(R4) The rst nonzero element of any nonzero row appears in a later column
(further to the right) than the rst nonzero element in any preceding row.
Such matrices are invaluable for solving sets of simultaneous linear equations and
developing efcient algorithms for performing important matrix operations. We
shall have much more to say on these matters in later chapters. Here we are simply
interested in recognizing when a given matrix is or is not in row-reduced form.
Example 2
1
0
A=
0
0
1
C = 0
0
1
0
0
0
2
0
1
2 4
6 5
0 0
0 0
3 4
1 2,
0 5
7
7
,
0
0
1
B = 0
0
1
D= 0
0
3
0,
1
2
0
0
2
0
0
3
1
1
3
3.
0
Solution Matrix A is not in row-reduced form because the rst nonzero element
of the second row is not unity. This violates (R2). If a23 had been unity instead of
6, then the matrix would be in row-reduced form. Matrix B is not in row-reduced
form because the second row is a zero row and it appears before the third row
which is a nonzero row. This violates (R1). If the second and third rows had been
interchanged, then the matrix would be in row-reduced form. Matrix C is not in
row-reduced form because the rst nonzero element in row two appears in a later
column, column 3, than the rst nonzero element of row three. This violates (R4).
If the second and third rows had been interchanged, then the matrix would be in
row-reduced form. Matrix D is not in row-reduced form because the rst nonzero
element in row two appears in the third column, and everything below d23 is not
zero. This violates (R3). Had the 33 element been zero instead of unity, then the
matrix would be in row-reduced form.
For the remainder of this section, we concern ourselves with square matrices;
that is, matrices having the same number of rows as columns. A diagonal matrix is
a square matrix all of whose elements are zero except possibly those on the main
diagonal. (Recall that the main diagonal consists of all the diagonal elements
a11 , a22 , a33 , and so on.) Thus,
5
0
0
1
and
3
0
0
0
3
0
0
0
3
22
Chapter 1
Matrices
are both diagonal matrices of order 2 2 and 3 3 respectively. The zero matrix
is the special diagonal matrix having all the elements on the main diagonal equal
to zero.
An identity matrix is a diagonal matrix worthy of special consideration. Designated by I, an identity is dened to be a diagonal matrix having all diagonal
elements equal to one. Thus,
1 0 0 0
0 1 0 0
1 0
and
0 0 1 0
0 1
0 0 0 1
are the 2 2 and 4 4 identities respectively. The identity is perhaps the most
important matrix of all. If the identity is of the appropriate order so that the
following multiplication can be carried out, then for any arbitrary matrix A,
AI = A
and
IA = A.
A symmetric matrix is a matrix that is equal to its transpose while a skew symmetric
matrix is a matrix that is equal to the negative of its transpose. Thus, a matrix A
is symmetric if A = AT while it is skew symmetric if A = AT . Examples of each
are respectively
1 2 3
0
2 3
2 4 5 and 2
0
1.
3 5 6
3 1
0
A matrix A = [aij ] is called lower triangular if aij = 0 for j > i (that is, if all the
elements above the main diagonal are zero) and upper triangular if aij = 0 for i > j
(that is, if all the elements below the main diagonal are zero).
Examples of lower and upper triangular matrices are, respectively,
5 0 0 0
1 2 4
1
1 2 0 0
0 1 3 1
.
0 1 3 0 and 0 0 2
5
2 1 4 1
0 0 0
5
Theorem 1 The product of two lower (upper) triangular matrices is also lower
(upper) triangular.
Proof. Let A and B both be n n lower triangular matrices. Set C = AB. We
need to show that C is lower triangular, or equivalently, that cij = 0 when i < j.
Now,
cij =
n
k=1
aik bkj =
j1
k=1
aik bkj +
n
k=j
aik bkj .
1.4
23
Special Matrices
We are given that aik = 0 when i < k, and bkj = 0 when k < j, because both A and
B are lower triangular. Thus,
j1
aik bkj =
k=1
j1
aik (0) = 0
k=1
n
(0)bkj = 0
aik bkj =
k=j
k=j
Thus, if
A=
1 2
,
1
3
then A2 =
1
1
2
3
1
1
2
1
=
3
4
8
.
7
Problems 1.4
1. Verify that (A + B)T = AT + BT where
1 5 1
6
3 and B = 2
A = 2 1
0 7 8
1
2. Verify that (AB)T = BT AT , where
t t2
3
A = 1 2t and B =
t
1 0
t
2t
1
0
7
3
1.
2
t+1
t2
0
.
t3
24
Chapter 1
Matrices
(b) AT + (A + BT )T ,
(d) ((AB)T + C)T ,
4].
6. Find XT AX when
2
A=
3
3
4
x
and X =
.
y
0 1
0 0
A =
0 0
0 0
1 1
0 1
C =
0 0
0 0
2 2
E = 0 2
0 0
0 0
H = 0 1
0 0
2 0
L = 0 2
0 0
0 1
Q=
,
1 0
0
0
0
0
4
1
0
0
0
0
0
0
4
1
0
1
2
2,
2
0
0,
0
0
0,
0
7
2
,
1
0
7
2
,
1
5
0
F = 0
0
0
J = 1
0
M = 0
R=
1
0
7
2
,
1
5
1 0 4 7
0 0 0
0
,
0 0 0
1
0 0 0
0
0
1 2 3
0,
G = 0 0 1,
0
1 0 0
1
1
0 2
2,
K = 0 1 1,
0
0
0 0
1
1 0 0
3
1 ,
N = 0 0 1,
4
0 0 0
1
1 0
1 12
S=
,
T=
.
1 0
0
1
1
0
B =
0
0
0
0
D =
0
0
0
0
0
1
0
0
1
2
0 0
1
,
0
1
1
0
0
0
0
1
0
4
1
0
1
1.4
25
Special Matrices
1
A= 0
0
0
3
0
5
0
0 and B = 0
1
0
0
0.
2
0
3
0
13. Prove that if A and B are diagonal matrices of the same order, then AB = BA.
14. Does a 2 2 diagonal matrix commute with every other 2 2 matrix?
15. Compute the products AD and BD for the matrices
1
A = 1
1
1
1
1
1
1,
1
0
B = 3
6
1
4
7
2
5,
8
2
D = 0
0
0
3
0
0
0.
5
0
.
0
1
D=
0
1
A = 0
0
0
2
0
0
0.
3
21. Using the results of Problems 19 and 20 as a guide, what can be said about Dn
if D is a diagonal matrix and n is a positive integer?
22. Prove that if D = [dij ] is a diagonal matrix, then D2 = [dij2 ].
26
Chapter 1
Matrices
0
D = 0
0
0
1
0
0
0.
1
1 1 3
A = 5 2 6
2
1
3
is nilpotent of index 3.
25. Show that
0
0
A =
0
0
1
0
0
0
0
1
0
0
0
0
1
0
1.4
27
Special Matrices
0.1
0.4
0.9
.
0.6
28
Chapter 1
Matrices
Figure 1.1
Figure 1.2
41. (a) Construct an adjacency matrix M for the graph shown in Figure 1.2.
(b) Calculate M2 , and use that matrix to determine the number of paths
consisting of two arcs that connect node 1 to node 5.
(c) Calculate M3 , and use that matrix to determine the number of paths
consisting of three arcs that connect node 2 to node 4.
Figure 1.3
42. Figure 1.3 depicts a road network linking various cities. A traveler in city 1
needs to drive to city 7 and would like to do so by passing through the least
1.5
29
1.5
1 2
3
4
5
6
7
8
, B = 10 12 , and C = [2 3 4],
A =
(11)
9 10 11 12
14 16
13 14 15 16
then B and C are both submatrices of A. Here B was obtained by removing from
A the rst and second rows together with the rst and third columns, while C was
obtained by removing from A the second, third, and fourth rows together with the
rst column. By removing no rows and no columns from A, it follows that A is a
submatrix of itself.
A matrix is said to be partitioned if it is divided into submatrices by horizontal
and vertical lines between the rows and columns. By varying the choices of where
to put the horizontal and vertical lines, one can partition a matrix in many different
ways. Thus,
1 2
4
3 4
1 2 3
5 6
5
8
6 7
7 8
9 10 11 12 and 9 10 11 12
13 14 15 16
13 14 15 16
are examples of two different partitions of the matrix A given in (11).
If partitioning is carried out in a particularly judicious manner, it can be a great
help in matrix multiplication. Consider the case where the two matrices A and B
are to be multiplied together. If we partition both A and B into four submatrices,
respectively, so that
G H
C D
and B =
A=
E F
J K
where C through K represent submatrices, then the product AB may be obtained
by simply carrying out the multiplication as if the submatrices were themselves
elements. Thus,
CG + DJ CH + DK
,
(12)
AB =
EG + FJ EH + FK
providing the partitioning was such that the indicated multiplications are dened.
It is not unusual to need products of matrices having thousands of rows and
thousands of columns. Problem 42 of Section 1.4 dealt with a road network connecting seven cities. A similar network for a state with connections between all
30
Chapter 1
Matrices
cities in the state would have a very large adjacency matrix associated with it, and
its square is then the product of two such matrices. If we expand the network
to include the entire United States, the associated matrix is huge, with one row
and one column for each city and town in the country. Thus, it is not difcult to
visualize large matrices that are too big to be stored in the internal memory of any
modern day computer. And yet the product of such matrices must be computed.
The solution procedure is partitioning. Large matrices are stored in external
memory on peripheral devices, such as disks, and then partitioned. Appropriate
submatrices are fetched from the peripheral devices as needed, computed, and the
results again stored on the peripheral devices. An example is the product given in
(12). If A and B are too large for the internal memory of a particular computer,
but C through K are not, then the partitioned product can be computed. First,
C and G are fetched from external memory and multiplied; the product is then
stored in external memory. Next, D and J are fetched and multiplied. Then, the
product CG is fetched and added to the product DJ. The result, which is the rst
partition of AB, is then stored in external memory, and the process continues.
Example 1
Find AB if
3
A = 1
3
1
4
1
2
1
2
and
1
B = 1
0
3
0
1
2
1.
1
3 1
1
4
A=
3 1
2
1 3
1 and B = 1 0
2
0 1
2
1;
1
then,
3
1
1
4
AB =
1
1
1
1
2 9
0
2
+
3 3
0 1
2 9 + 0 2
2 11
= 3 2
2 11
3
2
0 1
+
0
1
3
+ 2 0 1
0
9
2
5 = 3
2
9
3
1
7
2
+
6
1
7 + 2
11
2
11
9
5.
9
2
2
1
+
1
1
2
1
+ 2 1
1
1
4
1.5
31
Example 2
Find AB if
3
2
A =
0
0
0
Solution
AB =
0
0
1
0
and
AB =
5
4
=
0
0
0
5
4
0
0
0
4
2
3
1
0
4
0 0
+
2
0 0
0
0 3
+
0
0 1
0 + 0 0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
5
4
0
3
= 0
1 0
0
0
0
0
0
2 1
B = 1 1
0 1
1
0
0
0
0
0
0
0
0
0
0
0
0 0 0
+
0
0 0 0
0 0
0 0 3
+
0 0
0 0 1
0 0 + 0 0 0
0
0
4
2
3
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0.
1
0
0
0 0 1
+
0
0
0
3
0 0 1
+
0
1
0
+ 0 0 0 1
0
0
0
3
.
1
0
A1
A2
A3
0
A=
.
..
An
is called block diagonal. Such matrices are particularly easy to multiply because
in partitioned form they act as diagonal matrices.
32
Chapter 1
Matrices
Problems 1.5
1. Which of the following are submatrices of the given A and why?
1 2 3
A = 4 5 6
7 8 9
(a)
1
7
3
9
(b) 1
(c)
1 1
A = 3
0
0
1
1
8
2
9
(d)
4
7
6
.
9
b
.
d
2
4,
2
5
2
B = 1 1
0
1
0
3
1
2
1.
4
4. Partition the given matrices A and B and, using the results, nd AB.
4 1 0 0
3 2 0
0
2 2 0 0
1 1 0
0
.
A =
0 0 1 0, B = 0 0 2
1
0 0 1 2
0 0 1 1
5. Compute A2 for the matrix A given in Problem 4 by partitioning A into block
diagonal form.
6. Compute B2 for the matrix B given in Problem 4 by partitioning B into block
diagonal form.
7. Use partitioning to compute A2 and A3 for
1 0 0 0
0 2 0 0
0 0 0 1
A =
0 0 0 0
0 0 0 0
0 0 0 0
0
0
0
1
0
0
0
0
0
.
0
1
0
1.6
33
Vectors
0
1
A =
0
0
0
0
1
0
0
0
0
0
0
0
0
2
1
1
0
0
0
0
2
3
2
0
0
0
0
4
4
3
0
0
0
0
0
0
0
1
0
0
0
0
.
0
0
1
1.6
Vectors
Denition 1 A vector is a 1 n or n 1 matrix.
A 1 n matrix is called a row vector while an n 1 matrix is a column vector. The
elements are called the components of the vector while the number of components
in the vector, in this case n, is its dimension. Thus,
1
2
3
is an example of a 3-dimensional column vector, while
2t
(y1 )2 + (y2 )2 + + (yn )2 .
34
Chapter 1
Matrices
Example 1
Solution
Find y if y =
4 .
y = (1)2 + (2)2 + (3)2 + (4)2 = 30.
Find z if
1
z = 2.
3
Solution
z =
1].
15
15
15
1
.
15
Problems 1.6
1. Find p if 5x 2y = b, where
1
x = 3,
0
2
y = p,
1
and
1
b = 13.
2
1.6
35
Vectors
2. Find x if 3x + 2y = b, where
3
1
y=
6
0
and
2
1
b =
4.
1
3. Find y if 2x 5y = b, where
x= 2
and
b= 1
1 .
(b) ybT ,
(c) yT b,
(d) bT y.
(b) xbT ,
(c) xT b,
(d) bT b.
5
1
1
1
1
,
,
(h)
(i) 1 0 1 1 .
(g)
1
3
2
6
3
1
1
7. Find y if
(a) y = 1 1 ,
(b) y = 3 4 ,
(c) y = 1 1 1 ,
(d) y = 21 21
(e) y = 2 1 1 3 , (f) y = 0 1
8. Find x if
1
1
1
(a) x =
, (b) x =
, (c) x = 1,
1
2
1
1
2
,
3
2 .
36
Chapter 1
Matrices
1
1
1
1
2
0
(d) x =
1, (e) x = 3, (f) x = 1.
1
4
0
9. Find y if
(a) y = 2
3,
(b) y = 0
2 .
1
2
1
x
1 2
3
5
3 y = 11
3
1
z
5
1
1
2
3
x 2 + y 5 + z 3 = 11.
1
3
1
5
12. Convert the following system of equations into a vector equation:
2x + 3y = 10,
4x + 5y = 11.
13. Convert the following system of equations into a vector equation:
3x + 4y + 5z + 6w = 1,
y 2z + 8w = 0,
x + y + 2z w = 0.
14. Using the denition of matrix multiplication, show that the jth column of
(AB) = A (jth column of B).
15. Verify the result of Problem 14 by showing that the rst column of the product
AB with
1
1
1 2 3
0
A=
and B = 1
4 5 6
2 3
is
1
A 1,
2
1.7
37
1
A 0.
3
16. A distribution row vector d for an N-state Markov chain (see Problem 16 of
Section 1.1 and Problem 34 of Section 1.4) is an N-dimensional row vector
having as its components, one for each state, the probabilities that an object in
the system is in each of the respective states. Determine a distribution vector
for a three-state Markov chain if 50% of the objects are in state 1, 30% are in
state 2, and 20% are in state 3.
17. Let d(k) denote the distribution vector for a Markov chain after k time periods.
Thus, d(0) represents the initial distribution. It follows that
d(k) = d(0) Pk = P(k1) P,
where P is the transition matrix and Pk is its kth power.
Consider the Markov chain described in Problem 16 of Section 1.1.
(a) Explain the physical signicance of saying d(0) = [0.6
0.4].
0.5
0.1].
1.7
38
Chapter 1
Matrices
Figure 1.4
units parallel to the vertical axis. We can then draw an arrow beginning at the
origin and ending at the point (a, b). This arrow or directed line segment, as
shown in Figure 1.4, represents the vector geometrically. It follows immediately
from Pythagorass theorem and Denition 2 of Section 1.6 that the length of the
directed line segment is the magnitude of the vector. The angle associated with a
vector, denoted by in Figure 1.4, is the angle from the positive horizontal axis to
the directed line segment measured in the counterclockwise direction.
Example 1 Graph the vectors v = [2
magnitude and angle of each.
4] and u = [1
Solution The vectors are drawn in Figure 1.5. Using Pythagorass theorem and
elementary trigonometry, we have, for v,
v =
4
= 2,
2
and
= 63.4 .
1
= 1, and
1
= 135 .
tan =
tan =
1.7
39
Figure 1.5
Figure 1.6
represents the sum u + v. The process is depicted in Figure 1.6 for the two vectors
dened in Example 1.
To construct the difference of two vectors u v geometrically, graph both u
and v normally and construct an arrow from the terminal point of v to the terminal
point of u. This arrow geometrically represents the difference u v. The process
is depicted in Figure 1.7 for the two vectors dened in Example 1. To measure the
magnitude and direction of u v, translate it so that its initial point is at the origin,
40
Chapter 1
Matrices
Figure 1.7
being careful to preserve both its magnitude and direction, and then measure the
translated vector.
Both geometrical sums and differences involve translations of vectors. This
suggests that a vector is not altered by translating it to another position in the
plane providing both its magnitude and direction are preserved.
Many physical phenomena such as velocity and force are completely described
by their magnitudes and directions. For example, a velocity of 60 miles per
hour in the northwest direction is a complete description of that velocity, and
it is independent of where that velocity occurs. This independence is the rationale behind translating vectors geometrically. Geometrically, vectors having
the same magnitude and direction are called equivalent, and they are regarded
as being equal even though they may be located at different positions in the
plane.
A scalar multiplication ku is dened geometrically to be a vector having length
k times the length of u with direction equal to u when k is positive, and opposite
to u when k is negative. Effectively, ku is an elongation of u by a factor of k
when k is greater than unity, or a contraction of u by a factor of k when k
is less than unity, followed by no rotation when k is positive, or a rotation of 180
degrees when k is negative.
Example 2
Solution To construct 2u, we double the length of u and then rotate the resulting vector by 180 . To construct 21 v we halve the length of v and effect no rotation.
These constructions are illustrated in Figure 1.8.
1.7
41
Figure 1.8
Problems 1.7
In Problems 1 through 16, geometrically construct the indicated vector operations
for
u = [3
1],
v = [2
3
x=
,
5
5],
w = [4
4],
and
0
y=
.
2
1. u + v.
2. u + w.
3. v + w.
4. x + y.
5. x y.
6. y x.
7. u v.
8. w u.
9. u w.
13.
1
2 u.
10. 2x.
14. 21 u.
11. 3x.
15.
1
3 v.
12. 2x.
16. 41 w.
2
Simultaneous Linear Equations
2.1
Linear Systems
Systems of simultaneous equations appear frequently in engineering and scientic
problems. Because of their importance and because they lend themselves to
matrix analysis, we devote this entire chapter to their solutions.
We are interested in systems of the form
a11 x1 + a12 x2 + + a1n xn = b1 ,
a21 x1 + a22 x2 + + a2n xn = b2 ,
..
.
(1)
(2)
43
44
Chapter 2
where
a1n
a2n
.. ,
.
a11
a21
A = .
..
a12
a22
..
.
am1
am2
amn
x1
x2
x = . ,
..
xn
b1
b2
b = . .
..
bm
Solution
1
A=
1
Example 2
2
3
1
2
1
,
4
x
y
x=
z ,
w
4
b=
.
9
Solution
1
4
A=
2
1
2
1
,
1
1
x
x=
,
y
9
9
b=
7.
1
(3)
(4)
2.1
45
Linear Systems
x + y = 0,
2x + 2y = 0.
(5)
Equation (3) has no solutions, (4) admits only the solution x = y = 21 , while (5)
has solutions x = y for any value of y.
Denition 2 A system of simultaneous linear equations is consistent if it
possesses at least one solution. If no solution exists, the system is inconsistent.
Equation (3) is an example of an inconsistent system, while (4) and (5)
represent examples of consistent systems.
Denition 3 A system given by (2) is homogeneous if b = 0 (the zero vector). If b = 0 (at least one component of b differs from zero) the system is
nonhomogeneous.
Equation (5) is an example of a homogeneous system.
Problems 2.1
In Problems 1 and 2, determine whether or not the proposed values of x, y, and z
are solutions of the given systems.
1. x + y + 2z = 2,
(a) x = 1, y = 3, z = 2.
x y 2z = 0,
(b) x = 1, y = 1, z = 1.
x + 2y + 2z = 1.
2.
x + 2y + 3z = 6,
(a) x = 1, y = 1, z = 1.
x 3y + 2z = 0,
(b) x = 2, y = 2, z = 0.
3x 4y + 7z = 6.
(c) x = 14, y = 2, z = 4.
46
Chapter 2
2.1
47
Linear Systems
supplier. The company has three mines which it can work. Mine A produces
8000 tons of low-grade ore, 5000 tons of medium-grade ore, and 1000 tons of
high-grade ore during each day of operation. Mine B produces 3000 tons of
low-grade ore, 12,000 tons of medium-grade ore, and 3000 tons of high-grade
ore for each day it is in operation. The gures for mine C are 1000, 10,000,
and 2000, respectively. Show that the problem of determining how many days
each mine must be operated to meet contractual demands without surplus
is equivalent to solving a set of three equations in A, B, and C, where the
unknowns denote the number of days each mine will be in operation.
15. A pet store has determined that each rabbit in its care should receive 80 units
of protein, 200 units of carbohydrates, and 50 units of fat daily. The store
carries four different types of feed that are appropriate for rabbits with the
following compositions:
Feed
Protein
units/oz
Carbohydrates
units/oz
Fat
units/oz
A
B
C
D
5
4
8
12
20
30
15
5
3
3
10
7
The store wants to determine a blend of these four feeds that will meet the
daily requirements of the rabbits. Show that this problem is equivalent to
solving three equations in the four unknowns A, B, C, and D, where each
unknown denotes the number of ounces of that feed in the blend.
16. A small company computes its end-of-the-year bonus b as 5% of the net prot
after city and state taxes have been paid. The city tax c is 2% of taxable income,
while the state tax s is 3% of taxable income with credit allowed for the city
tax as a pretax deduction. This year, taxable income was $400,000. Show that
b, c, and s are related by three simultaneous equations.
17. A gasoline producer has $800,000 in xed annual costs and incurs an additional
variable cost of $30 per barrel B of gasoline. The total cost C is the sum of
the xed and variable costs. The net sales S is computed on a wholesale price
of $40 per barrel. (a) Show that C, B, and S are related by two simultaneous
equations. (b) Show that the problem of determining how many barrels must
be produced to break even, that is, for net sales to equal cost, is equivalent to
solving a system of three equations.
18. (Leontief Closed Models) A closed economic model involves a society in
which all the goods and services produced by members of the society are consumed by those members. No goods and services are imported from without
and none are exported. Such a system involves N members, each of whom produces goods or services and charges for their use. The problem is to determine
the prices each member should charge for his or her labor so that everyone
48
Chapter 2
breaks even after one year. For simplicity, it is assumed that each member
produces one unit per year.
Consider a simple closed system consisting of a farmer, a carpenter, and a
weaver. The farmer produces one unit of food each year, the carpenter produces one unit of nished wood products each year, and the weaver produces
one unit of clothing each year. Let p1 denote the farmers annual income
(that is, the price she charges for her unit of food), let p2 denote the carpenters annual income (that is, the price he charges for his unit of nished
wood products), and let p3 denote the weavers annual income. Assume on
an annual basis that the farmer and the carpenter consume 40% each of the
available food, while the weaver eats the remaining 20%. Assume that the
carpenter uses 25% of the wood products he makes, while the farmer uses
30% and the weaver uses 45%. Assume further that the farmer uses 50% of
the weavers clothing while the carpenter uses 35% and the weaver consumes
the remaining 15%. Show that a break-even equation for the farmer is
0.40p1 + 0.30p2 + 0.50p3 = p1 ,
while the break-even equation for the carpenter is
0.40p1 + 0.25p2 + 0.35p3 = p2 .
What is the break-even equation for the weaver? Rewrite all three equations
as a homogeneous system.
19. Paul, Jim, and Mary decide to help each other build houses. Paul will spend
half his time on his own house and a quarter of his time on each of the houses of
Jim and Mary. Jim will spend one third of his time on each of the three houses
under construction. Mary will spend one sixth of her time on Pauls house, one
third on Jims house, and one half of her time on her own house. For tax purposes each must place a price on his or her labor, but they want to do so in a way
that each will break even. Show that the process of determining break-even
wages is a Leontief closed model comprised of three homogeneous equations.
20. Four third world countries each grow a different fruit for export and each
uses the income from that fruit to pay for imports of the fruits from the other
countries. Country A exports 20% of its fruit to country B, 30% to country
C, 35% to country D, and uses the rest of its fruit for internal consumption.
Country B exports 10% of its fruit to country A, 15% to country C, 35% to
country D, and retains the rest for its own citizens. Country C does not export
to country A; it divides its crop equally between countries B and D and its
own people. Country D does not consume its own fruit. All of its fruit is for
export with 15% going to country A, 40% to country B, and 45% to country C.
Show that the problem of determining prices on the annual harvests of fruit
so that each country breaks even is equivalent to solving four homogeneous
equations in four unknowns.
2.1
49
Linear Systems
21. (Leontief InputOutput Models) Consider an economy consisting of N sectors, with each producing goods or services unique to that sector. Let xi denote
the amount produced by the ith sector, measured in dollars. Thus xi represents
the dollar value of the supply of product i available in the economy. Assume
that every sector in the economy has a demand for a proportion (which may
be zero) of the output of every other sector. Thus, each sector j has a demand,
measured in dollars, for the item produced in sector i. Let aij denote the proportion of item js revenues that must be committed to the purchase of items
from sector i in order for sector j to produce its goods or services. Assume
also that there is an external demand, denoted by di and measured in dollars,
for each item produced in the economy.
The problem is to determine how much of each item should be produced to meet external demand without creating a surplus of any item. Show
that for a two sector economy, the solution to this problem is given by the
supply/demand equations
supply
demand
x1 = a11 x1 + a12 x2 + d1 ,
x2 = a21 x1 + a22 x2 + d2 .
Show that this system is equivalent to the matrix equations
x = Ax + d
and
(I A)x = d.
50
Chapter 2
of 20 cents on energy costs, 10 cents on transportation, and 30 cents on construction. Each dollar of income gotten by the tourism sector requires the
expenditure of 20 cents on tourism (primarily in the form of complimentary
facilities for favored customers), 15 cents on energy, 5 cents on transportation, and 30 cents on construction. Each dollar of income from transportation
requires the expenditure of 40 cents on energy and 10 cents on construction;
while each dollar of income from construction requires the expenditure of 5
cents on construction, 25 cents on energy, and 10 cents on transportation. The
only external demand is for tourism, and this amounts to $5 million dollars a
year. Show that the problem of determining how much energy, tourism, transportation, and construction is required to supply the external demand without
a surplus is equivalent to solving a Leontief inputoutput model. What are A
and d?
25. A constraint is often imposed on each column of the consumption matrix of
a Leontief inputoutput model, that the sum of the elements in each column
be less than unity. Show that this guarantees that each sector in the economy
is protable.
2.2
Solutions by Substitution
Most readers have probably encountered simultaneous equations in high school
algebra. At that time, matrices were not available; hence other methods were
developed to solve these systems, in particular, the method of substitution. We
review this method in this section. In the next section, we develop its matrix
equivalent, which is slightly more efcient and, more importantly, better suited
for computer implementations.
Consider the system given by (1):
a11 x1 + a12 x2 + + a1n xn = b1 ,
a21 x1 + a22 x2 + + a2n xn = b2 ,
..
.
am1 x1 + am2 x2 + + amn xn = bm .
The method of substitution is the following: take the rst equation and solve for
x1 in terms of x2 , x3 , . . . , xn and then substitute this value of x1 into all the other
equations, thus eliminating it from those equations. (If x1 does not appear in the
rst equation, rearrange the equations so that it does. For example, one might
have to interchange the order of the rst and second equations.) This new set of
equations is called the rst derived set. Working with the rst derived set, solve
the second equation for x2 in terms of x3 , x4 , . . . , xn and then substitute this value
of x2 into the third, fourth, etc. equations, thus eliminating it. This new set is the
2.2
51
Solutions by Substitution
second derived set. This process is kept up until the following set of equations is
obtained:
x1 = c12 x2 +c13 x3 + c14 x4 + + c1n xn + d1 ,
x2 =
x3 =
..
.
xm =
c34 x4 + + c3n xn + d3 ,
(6)
where the cij s and the di s are some combination of the original aij s and bi s.
System (6) can be quickly solved by back substitution.
Example 1
Solution By solving the rst equation for r and then substituting it into the
second and third equations, we obtain the rst derived set
r = 3 2s t,
s 3t = 12,
8s 7t = 11.
By solving the second equation for s and then substituting it into the third equation,
we obtain the second derived set
r = 3 2s t,
s = 12 3t,
17t = 85.
By solving for t in the third equation and then substituting it into the remaining
equations (of which there are none), we obtain the third derived set
r = 3 2s t,
s = 12 3t,
t = 5.
Thus, the solution is t = 5, s = 3, r = 4.
52
Chapter 2
Example 2
Example 3
2.2
53
Solutions by Substitution
5c
=
= + + d .
c
0 1
0
c
d
d
0
0
1
Note that while c and d are arbitrary, once they are given a particular value, a and
b are automatically determined. For example, if c is chosen as 1 and d as 4, a
solution is a = 22, b = 8, c = 1, d = 4, while if c is chosen as 0 and d as 3, a
solution is a = 2, b = 3, c = 0, d = 3.
Example 4
y = 1,
3x + 2y = 5,
5x + 15y = 20.
Solution The rst derived set is
x = 4 3y,
7y = 7,
7y = 7,
0 = 0.
54
Chapter 2
Problems 2.2
Use the method of substitution to solve the following systems:
1.
x + 2y 2z = 1,
2.
2x + y + z = 5,
x + y z = 2.
3.
x + y z = 0,
3x + 2y + 4z = 0.
x + 3y = 4,
2x y = 1,
2x 6y = 8,
4x 9y = 5,
4. 4r 3s + 2t = 1,
r + s 3t = 4,
5r 2s t = 5.
6x + 3y = 3.
5. 2l m + n p =
1,
6. 2x + y z = 0,
l + 2m n + 2p = 1,
l 3m + 2n 3p =
7.
x + 2y z = 5,
2x y + 2z = 1,
2x + 2y z = 7,
x + 2y + z = 3.
2.3
x + 2y + z = 0,
3x y + 2z = 0.
2.
8.
x + 2y + z 2w = 1,
2x + 2y z w = 3,
2x 2y + 2z + 3w = 3,
3x + y 2z 3w = 1.
Gaussian Elimination
Although the method of substitution is straightforward, it is not the most efcient
way to solve simultaneous equations, and it does not lend itself well to electronic
computing. Computers have difculty symbolically manipulating the unknowns
2.3
55
Gaussian Elimination
1
A=
4
3
6
2
5
and
7
b=
,
8
then
1
A =
4
b
2
5
3
6
7
,
8
while if
1
A = 4
7
2
5
8
3
6
9
and
1
b = 2,
3
then
1
Ab = 4
7
2
5
8
3
6
9
1
2.
3
5.
1 1 2
x
3
2 5
3 y 11,
1 3
1
z
5
56
Chapter 2
1
Ab = 2
1
1 2
5
3
3
1
3
11.
5
A second striking feature to the method of substitution is that every derived set
is different from the system that preceded it. The method continues creating new
derived sets until it has one that is particularly easy to solve by back-substitution.
Of course, there is no purpose in solving any derived set, regardless how easy it
is, unless we are assured beforehand that it has the same solution as the original
system. Three elementary operations that alter equations but do not change their
solutions are:
(i) Interchange the positions of any two equations.
(ii) Multiply an equation by a nonzero scalar.
(iii) Add to one equation a scalar times another equation.
If we restate these operations in words appropriate to an augmented matrix,
we obtain the elementary row operations:
(E1) Interchange any two rows in a matrix.
(E2) Multiply any row of a matrix by a nonzero scalar.
(E3) Add to one row of a matrix a scalar times another row of that same matrix.
Gaussian elimination is a matrix method for solving simultaneous linear equations. The augmented matrix for the system is created, and then it is transformed
into a row-reduced matrix (see Section 1.4) using elementary row operations. This
is most often accomplished by using operation (E3) with each diagonal element
in a matrix to create zeros in all columns directly below it, beginning with the
rst column and moving successively through the matrix, column by column. The
system of equations associated with a row-reduced matrix can be solved easily by
back-substitution, if we solve each equation for the rst unknown that appears in
it. This is the unknown associated with the rst nonzero element in each nonzero
row of the nal augmented matrix.
2.3
57
Gaussian Elimination
Example 2
1
2
3
5
3
4
1
1
.
2
5
15 20
3
7
2
15
4
by adding to the
7
second row (2) times
5
the rst row
20
Then,
1
2
3
5
3
4
1
1
1
0
3
2
5
15 20
5
1
0
0
5
1
0
0
0
1
0
0
0
1
0
0
0
3
4
7 7
7 7
15 20
3
7
7
0
4
7
7
0
3
1
7
0
4
1
7
0
3
1
0
0
4
1
.
0
0
by adding to the
third row (3) times
by adding to the
fourth row (5) times
by multiplying the
1
second row by
7
by adding to the
second row (7) times
58
Chapter 2
3,
2r + 3s t = 6,
3r 2s 4t = 2.
Solution The augmented matrix for this system is
2
3
2
1
3
1 6.
4 2
2
1
1 3
2 4
3
12
2
by adding to the
second row (2) times
2
1
8
1
3
7
3
12
11
by adding to the
third row (3) times
2
1
8
1
3
7
3
12
11
1
2
3
Then,
1
2
3
2
3
2
1
3
1
1 6 0
4 2
3
1
0
0
1
0
0
by multiplying the
second row by (1)
2.3
59
Gaussian Elimination
1
0
0
1
0
0
2
1
0
1
3
by adding to the
3 12
third row (8) times
17 85
the second row
2
1
0
1
3
3 12.
1
5
by multiplying
the
third row by
1
17
The system of equations associated with this last augmented matrix in rowreduced form is
r + 2s + t = 3,
s + 3t = 12,
t = 5.
Solving the third equation for t, then the second equation for s, and, lastly, the rst
equation for r, we obtain r = 4, s = 3, and t = 5, which is also the solution to the
original set of equations. Compare this solution with Example 1 of the previous
section.
Whenever one element in a matrix is used to cancel another element to zero by
elementary row operation (E3), the rst element is called the pivot. In Example
3, we rst used the element in the 11 position to cancel the element in the
21 position, and then to cancel the element in the 31 position. In both of
these operations, the unity element in the 11 position was the pivot. Later, we
used the unity element in the 22 position to cancel the element 8 in the 32
position; here, the 22 element was the pivot.
While transforming a matrix into row-reduced form, it is advisable to adhere
to three basic principles:
As a consequence of this last principle, one never involves the ith row of a
matrix in an elementary row operation after the ith column has been transformed
into its required form. That is, once the rst column has the proper form, no pivot
element should ever again come from the rst row; once the second column has
the proper form, no pivot element should ever again come from the second row;
and so on.
When an element we want to use as a pivot is itself zero, we interchange rows
using operation (E1).
60
Chapter 2
Example 4
a + b + 2c
= 0.
0
1
1
0
0
1
2
3
2
4
2.
0
3
1
0
Normally, we would use the element in the 11 position to cancel to zero the
two elements directly below it, but we cannot because it is zero. To proceed with
the reduction process, we must interchange the rst row with either of the other
two rows. The choice is arbitrary.
0 0 2 3 4
1 0 3 1 2
by interchanging the
1 0 3 1 2 0 0 2 3 4
rst row with the
1 1 2 0 0
1 1 2 0 0
second row
1
0
0
0
0
1
3
2
1
2
4.
2
1
3
1
by adding to the
third row (1) times
Next, we would like to use the element in the 22 position to cancel to zero the
element in the 32 position, but we cannot because that prospective pivot is zero.
We use elementary row operation (E1) once again. The transformation yields
1
0
0
1
0
0
0
1
0
3
1
1 1
2
3
2
2
4
0
1
0
3
1
1 1
1 1.5
2
2.
2
by interchanging the
second row with the
third row
by multiplying the
third row by (0.5)
The system of equations associated with this last augmented matrix in rowreduced form is
a
+ 3c +
b c
d=
2,
d = 2,
c + 1.5d =
2.
2.3
61
Gaussian Elimination
We use the third equation to solve for c, the second equation to solve for b, and
the rst equation to solve for a, because these are the unknowns associated with
the rst nonzero element of each nonzero row in the nal augmented matrix. We
have no dening equation for d, so this unknown remains arbitrary. The solution
is, a = 4 + 3.5d, b = 0.5d, c = 2 1.5d, and d arbitrary, or in vector form
a
4 + 3.5d
4
b 0.5d 0
=
c 2 1.5d = 2 +
d
d
0
d
1.
23
2
5x
2
3
5
4
4
0
3
4
1
8
3.
12
Then,
2
3
5
4
4
0
3
4
1
8
1
2
3 3 4
12
5
0
1
0
5
2
10
0
1.5
4
1
4
3
12
1.5
4
8.5 9
1 12
by multiplying
the
rst row by
1
2
by adding to the
second row (3) times
62
Chapter 2
1
0
0
1
0
0
1
0
0
2
10
10
1.5
8.5
8.5
2
1.5
1
0.85
10 8.5
2
1
0
1.5
0.85
0
by adding to the
third row (5) times
4
9
8
4
0.9
8
4
0.9.
1
by multiplying
the
second row by 1
10
by adding to the
third row (10) times
The system of equations associated with this last augmented matrix in rowreduced form is
x + 2y + 1.5z =
4,
y + 0.85z = 0.9,
0=
1.
Since no values of x, y, and z can make this last equation true, this system, as well
as the original one, has no solution.
Finally, we note that most matrices can be transformed into a variety of rowreduced forms. If a row-reduced matrix has two nonzero rows, then a different
row-reduced matrix is easily constructed by adding to the rst row any nonzero
constant times the second row. The equations associated with both augmented
matrices, however, will have identical solutions.
Problems 2.3
In Problems 1 through 5, construct augmented matrices for the given systems of
equations:
1.
x + 2y = 3,
3x + y =
3.
1.
a + 2b = 5,
2.
x + 2y z = 1,
2x 3y + 2z =
4. 2r + 4s
4.
= 2,
3a + b = 13,
3r + 2s + t = 8,
4a + 3b = 0.
5r 3s + 7t = 15.
5. 2r + 3s 4t = 12,
3r 2s
= 1,
8r s 4t = 10.
2.3
63
Gaussian Elimination
In Problems 6 through 11, write the set of equations associated with the given
augmented matrix and the specied variables.
1 2 5
6. Ab =
variables: x and y.
0 1 8
1 2
3 10
1 5 3
variables: x, y, and z.
7. Ab = 0
0
0
1
4
1 3 12
40
1 6 200
8. Ab = 0
variables: r, s, and t.
0
0
1
25
1 3 0 8
2
9. Ab = 0 1 4
variables: x, y, and z.
0 0 0
0
1 7
2 0
1 1 0
variables: a, b, and c.
10. Ab = 0
0
0
0 0
1 1
0
1
0
1 2
2
11. Ab =
variables: u, v, and w.
0
0
1 3
0
0
0
1
12. Solve the system of equations dened in Problem 6.
13. Solve the system of equations dened in Problem 7.
14. Solve the system of equations dened in Problem 8.
15. Solve the system of equations dened in Problem 9.
16. Solve the system of equations dened in Problem 10.
17. Solve the system of equations dened in Problem 11.
In Problems 18 through 24, use elementary row operations to transform the given
matrices into row-reduced form:
18.
1
3
1
21. 1
2
2 5
.
7 8
2
1
3
3
2
0
19.
4
3.
0
4
2
24
11
20
.
8
0
22. 1
2
1
3
3
20.
2
2
1
4
1.
2
0
2
1
7
6
.
5
64
Chapter 2
1
1
23.
2
2
3
2
4
3
0 1
1
4
0
1
.
3
2
2
24. 5
3
3
8
3
4 6 0
15 1 3
5 4 4
10
40.
20
2.4
65
Pivoting Strategies
2.4
Pivoting Strategies
Gaussian elimination is often programmed for computer implementation. Since all
computers round or truncate numbers to a nite number of digits (e.g., the fraction
1/3 might be stored as 0.33333, but never as the innite decimal 0.333333 . . .)
roundoff error can be signicant. A number of strategies have been developed to
minimize the effects of such errors.
The most popular strategy is partial pivoting, which requires that a pivot element always be larger in absolute value than any element below it in the same
column. This is accomplished by interchanging rows whenever necessary.
66
Chapter 2
Example 1
1
2
5
2
12
26
4 18
2
9.
5 14
Normally, the unity element in the 11 position would be the pivot. With partial
pivoting, we compare this prospective pivot to all elements directly below it in the
same column, and if any is larger in absolute value, as is the case here with the
element 5 in the 31 position, we interchange rows to bring the largest element
into the pivot position.
1
2
5
2
4 18
5
12 2
9 2
26
5 14
1
5 14
2
9.
4 18
26
12
2
by interchanging the
rst and third rows
Then,
1
2
1
1
0
1
1
0
0
5.2
12
2
5.2
1.6
2
5.2
1.6
3.2
1
2
4
1
4
4
1
4
3
2.8
9
18
2.8
3.4
18
2.8
3.4.
15.2
by multiplying the
rst row by 15
by adding to the
second row (2) times
by adding to the
third row (1) times
The next pivot would normally be the element 1.6 in the 22 position. Before
accepting it, however, we compare it to all elements directly below it in the same
column. The largest element in absolute value is the element 3.2 in the 32
position. Therefore, we interchange rows to bring this larger element into the
pivot position.
Note. We do not consider the element 5.2 in the 12 position, even though it
is the largest element in its column. Comparisons are only made between a
prospective pivot and all elements directly below it. Recall one of the three basic
2.4
67
Pivoting Strategies
principles of row-reduction: never involve the rst row of matrix in a row operation
after the rst column has been transformed into its required form.
1
0
0
1
0
0
1
0
0
1
0
0
5.2
3.2
1.6
5.2
1
1.6
5.2
1
0
5.2
1
0
1
3
4
2.8
15.2
3.4
by interchanging the
second and third rows
1
2.8
0.9375 4.75
4
3.4
1
2.8
0.9375 4.75
2.5
11
1
2.8
0.9375 4.75
1
4.4
by multiplying the
1
second row by 3.2
by adding to the
third row (1.6) times
by multiplying the
1
third row by 2.5
z=
2.8,
y 0.9375z = 4.75,
z = 4.4,
which has as its solution x = 53.35, y = 8.875, and z = 4.4.
1
2
5
2
12
26
4 18
2
9.
5 14
Normally, we would use the element in the 11 position as the pivot. With
scaled pivoting, however, we rst compare ratios between elements in the rst
68
Chapter 2
column to the largest elements in absolute value in each row, ignoring the last
column. The ratios are
2
= 0.1667,
12
1
= 0.25,
4
and
5
= 0.1923.
26
The largest ratio in absolute value corresponds to the unity element in the 11
position, so that element remains the pivot. Transforming the rst column into
reduced form, we obtain
1
0
0
2
8
16
18
27.
76
4
10
15
Normally, the next pivot would be the element in the 22 position. Instead, we
consider the ratios
8
= 0.8
10
16
= 1,
16
and
which are obtained by dividing the pivot element and every element directly below
it by the largest element in absolute value appearing in their respective rows, ignoring elements in the last column. The largest ratio in absolute value corresponds to
the element 16 appearing in the 32 position. We move it into the pivot position
by interchanging the second and third rows. The new matrix is
1
0
0
18
76.
27
2
4
16 15
8 10
0
0
2
1
0
4
0.9375
1
18
4.75.
4.4
4z = 18,
y 0.9375z = 4.75,
z = 4.4.
The solution is, as before, x = 53.35, y = 8.875, and z = 4.4.
2.4
69
Pivoting Strategies
Complete pivoting compares prospective pivots with all elements in the largest
submatrix for which the prospective pivot is in the upper left position, ignoring
the last column. If any element in this submatrix is larger in absolute value than
the prospective pivot, both row and column interchanges are made to move this
larger element into the pivot position. Because column interchanges rearrange the
order of the unknowns, a book keeping method must be implemented to record
all rearrangements. This is done by adding a new row, designated as row 0, to the
matrix. The entries in the new row are initially the positive integers in ascending
order, to denote that column 1 is associated with variable 1, column 2 with variable
2, and so on. This new top row is only affected by column interchanges; none of
the elementary row operations is applied to it.
Example 3 Use complete pivoting with Gaussian elimination to solve the
system given in Example 1.
Solution The augmented matrix for this system is
1
2
3
- - - - - - - - - - - - - - - -
1
2
4 18
2
12
2
9
5 26
5 14
Normally, we would use the element in the 11 position of the coefcient matrix
A as the pivot. With complete pivoting, however, we rst compare this prospective
pivot to all elements in the submatrix shaded below. In this case, the element 26 is
the largest, so we interchange rows and columns to bring it into the pivot position.
1
2
3
---------------1
2
4 18
2 12 2
9
5
26
5 14
1
2
3
---------------5 26
5 14
2 12 2
9
1
2
4 18
2 1
3
---------------26 5
5 14
.
12 2 2
9
by interchanging the
rst and third rows
by interchanging the
rst and second columns
18
2
1
3
-------------------------------1
0.1923
0.1923
0.5385
0 0.3077 4.3077
2.5385
0
0.6154
3.6154
16.9231
70
Chapter 2
Normally, the next pivot would be 0.3077. Instead, we compare this number
in absolute value to all the numbers in the submatrix shaded above. The largest
such element in absolute value is 4.3077, which we move into the pivot position
by interchanging the second and third column. The result is
2
3
1
-------------------------------1
0.1923
0.1923
0.5385
.
0 4.3077 0.3077
2.5385
0
3.6154
0.6154
16.9231
2
3
1
----------------------------1 0.1923 0.1923
0.5385
.
0
1
0.0714 0.5893
0
53.35
0.5385,
z + 0.0714x = 0.5893,
x = 53.35.
Its solution is, x = 53.35, y = 8.8749, and z = 4.3985, which is within round-off
error of the answers gotten previously.
Complete pivoting generally identies a better pivot than scaled pivoting
which, in turn, identies a better pivot than partial pivoting. Nonetheless, partial pivoting is most often the strategy of choice. Pivoting strategies are used to
avoid roundoff error. We do not need the best pivot; we only need to avoid bad
pivots.
Problems 2.4
In Problems 1 through 6, determine the rst pivot under (a) partial pivoting,
(b) scaled pivoting, and (c) complete pivoting for given augmented matrices.
2.5
1
4
1.
3.
71
Linear Independence
3
8
35
.
15
8 15
.
4 11
2
3
4
6
7
8.
10 11 12
1
3
5. 5
9
2.
1
5
2
3
5
.
85
2
8 3 100
5
4
75.
4. 4
3 1
2 250
0 2
3
4
0
7. Solve Problem 3 of Section 2.3 using Gaussian elimination with each of the
three pivoting strategies.
8. Solve Problem 4 of Section 2.3 using Gaussian elimination with each of the
three pivoting strategies.
9. Solve Problem 5 of Section 2.3 using Gaussian elimination with each of the
three pivoting strategies.
10. Computers internally store numbers in formats similar to the scientic
notation 0, E, representing the number 0. multiplied by the power of 10
signied by the digits following E. Therefore, 0.1234E06 is 123,400 while
0.9935E02 is 99.35. The number of digits between the decimal point and E
is nite and xed; it is the number of signicant gures. Arithmetic operations in computers are performed in registers, which have twice the number
of signicant gures as storage locations.
Consider the system
0.00001x + y = 1.00001,
x + y = 2.
Show that when Gaussian elimination is implemented on this system by a computer
limited to four signicant gures, the result is x = 0 and y = 1, which is incorrect.
Show further that the difculty is resolved when partial pivoting is employed.
2.5
Linear Independence
We momentarily digress from our discussion of simultaneous equations to develop
the concepts of linearly independent vectors and rank of a matrix, both of which
will prove indispensable to us in the ensuing sections.
Denition 1 A vector V1 is a linear combination of the vectors V2 , V3 , . . . , Vn
if there exist scalars d2 , d3 , . . . , dn such that
V1 = d2 V2 + d3 V3 + + dn Vn.
72
Chapter 2
Example 1
[0 0 1].
Solution [1
Show that [1
3] is a linear combination of [2
3] = 21 [2
0] + 3[0
1].
0] and
Referring to Example 1, we could say that the row vector [1 2 3] depends linearly on the other two vectors or, more generally, that the set of vectors {[1 2 3],
[2 4 0], [0 0 1]} is linearly dependent. Another way of expressing this dependence would be to say that there exist constants c1 , c2 , c3 not all zero such that
c1 [1 2 3] + c2 [2 4 0] + c3 [0 0 1] = [0 0 0]. Such a set would be
c1 = 1, c2 = 21 , c3 = 3. Note that the set c1 = c2 = c3 = 0 is also a suitable set.
The important fact about dependent sets, however, is that there exists a set of
constants, not all equal to zero, that satises the equality.
Now consider the set given by V1 = [1 0 0] V2 = [0 1 0] V3 = [0 0 1].
It is easy to verify that no vector in this set is a linear combination of the other two.
Thus, each vector is linearly independent of the other two or, more generally, the
set of vectors is linearly independent. Another way of expressing this independence
would be to say the only scalars that satisfy the equation c1 [1 0 0] + c2 [0 1 0]
+ c3 [0 0 1] = [0 0 0] are c1 = c2 = c3 = 0.
Denition 2 A set of vectors {V1 , V2 , . . . , Vn }, of the same dimension, is linearly dependent if there exist scalars c1 , c2 , . . . , cn , not all zero, such that
c 1 V1 + c 2 V2 + c 3 V3 + + c n Vn = 0
(7)
The vectors are linearly independent if the only set of scalars that satises (7) is
the set c1 = c2 = = cn = 0.
Therefore, to test whether or not a given set of vectors is linearly independent,
rst form the vector equation (7) and ask What values for the cs satisfy this
equation? Clearly c1 = c2 = = cn = 0 is a suitable set. If this is the only set of
values that satises (7) then the vectors are linearly independent. If there exists a
set of values that is not all zero, then the vectors are linearly dependent.
Note that it is not necessary for all the cs to be different from zero for a
set of vectors to be linearly dependent. Consider the vectors V1 = [1, 2], V2 =
[1, 4], V3 = [2, 4]. c1 = 2, c2 = 0, c3 = 1 is a set of scalars, not all zero, such that
c1 V1 + c2 V2 + c3 V3 = 0. Thus, this set is linearly dependent.
Example 2
2.5
73
Linear Independence
Is the set
2
3
8
6, 1, 16
2
2
3
linearly independent?
Solution
2
3
8
0
c1 6 + c2 1 + c3 16 = 0.
2
2
3
0
This equation can be rewritten as
2c1
3c2
8c3
0
6c1 + c2 + 16c3 = 0
2c1
2c2
3c3
0
or as
2c1 + 3c2 + 8c3
0
6c1 + c2 + 16c3 = 0.
0
2c1 + 2c2 3c3
(8)
74
Chapter 2
Example 4
Is the set
1
,
2
5
,
7
3
1
!
linearly independent?
Solution
2.5
75
Linear Independence
c2
c3
cn
V2 V3 Vn .
c1
c1
c1
76
Chapter 2
Problems 2.5
In Problems 1 through 19, determine whether or not the given set is linearly independent.
1. {[1
0], [0
3. {[2
4], [3
1
5.
,
2
1]}.
6]}.
!
3
.
4
2. {[1
1], [1
1]}.
4. {[1
3], [2
1], [1
1
1
,
,
1
1
6.
1
7. 0,
1
1,
0
0
1 .
1
9. 0,
1
1,
1
1
11. 2,
1]}.
!
1
.
2
1
8. 0,
1
0,
2
2
0 .
1
1 .
0
10. 0,
3
2,
1
2
1 .
3
2,
1
2
1 .
1
12. 2,
3
2,
1
2
1,
3
4
13. 5,
3
0 ,
2
1
1 .
3], [3
15. {[1
14. {[1
0], [1
0]}.
6 9]}.
16. {[10
20
20], [10
10
10], [10
20
10]}.
17. {[10
20
20], [10
10
10], [10
20
10], [20
1
1
, , 1 .
19.
1 2 4
3
1
5
2]}.
10
20]}.
1
2 .
2.5
77
Linear Independence
as a linear combination of
1
1,
1
0,
1
1
1 .
24. A set of vectors S is a spanning set for another set of vectors R if every vector
in R can be expressed as a linear combination of the vectors in S. Show that
the vectors given in Problem 1 are a spanning set for all two-dimensional row
vectors. Hint: Show that for any arbitrary real numbers a and b, the vector
[a b] can be expressed as a linear combination of the vectors in Problem 1.
25. Show that the vectors given in Problem 2 are a spanning set for all twodimensional row vectors.
26. Show that the vectors given in Problem 3 are not a spanning set for all twodimensional row vectors.
27. Show that the vectors given in Problem 3 are a spanning set for all vectors of
the form [a 2a], where a designates any real number.
28. Show that the vectors given in Problem 4 are a spanning set for all twodimensional row vectors.
29. Determine whether the vectors given in Problem 7 are a spanning set for all
three-dimensional column vectors.
30. Determine whether the vectors given in Problem 8 are a spanning set for all
three-dimensional column vectors.
31. Determine whether the vectors given in Problem 8 are a spanning set for
vectors of the form [a 0 a]T , where a denotes an arbitrary real number.
32. A set of vectors S is a basis for another set of vectors R if S is a spanning set
for R and S is linearly independent. Determine which, if any, of the sets given
in Problems 1 through 4 are a basis for the set of all two dimensional row
vectors.
33. Determine which, if any, of the sets given in Problems 7 through 12 are a basis
for the set of all three dimensional column vectors.
34. Prove that the columns of the 3 3 identity matrix form a basis for the set of
all three dimensional column vectors.
78
Chapter 2
35. Prove that the rows of the 4 4 identity matrix form a basis for the set of all
four dimensional row vectors.
36. Finish the proof of Theorem 1. (Hint: Assume that V1 can be written as a
linear combination of the other vectors.)
37. Prove Theorem 4.
38. Prove Theorem 5.
39. Prove that the set of vectors {x, kx} is linearly dependent for any choice of the
scalar k.
40. Prove that if x and y are linearly independent, then so too are x + y and x y.
41. Prove that if the set {x1 , x2 , . . . , xn } is linearly independent then so too is the
set {k1 x1 , k2 x2 , . . . , kn xn } for any choice of the non-zero scalars k1 , k2 , . . . , kn .
42. Let A be an n n matrix and let {x1 , x2 , . . . , xk } and {y1 , y2 , . . . , yk } be two
sets of n-dimensional column vectors having the property that Axi = yi =
1, 2, . . . , k. Show that the set {x1 , x2 , . . . , xk } is linearly independent if the set
{y1 , y2 , . . . , yk } is.
2.6
Rank
If we interpret each row of a matrix as a row vector, the elementary row operations
are precisely the operations used to form linear combinations; namely, multiplying
vectors (rows) by scalars and adding vectors (rows) to other vectors (rows). This
observation allows us to develop a straightforward matrix procedure for determining when a set of vectors is linearly independent. It rests on the concept of
rank.
Denition 1 The row rank of a matrix is the maximum number of linearly independent vectors that can be formed from the rows of that matrix, considering each
row as a separate vector. Analogically, the column rank of a matrix is the maximum
number of linearly independent columns, considering each column as a separate
vector.
Row rank is particularly easy to determine for matrices in row-reduced form.
Theorem 1 The row rank of a row-reduced matrix is the number of nonzero
rows in that matrix.
Proof. We must prove two facts: First, that the nonzero rows, considered as
vectors, form a linearly independent set, and second, that every larger set is linearly
dependent. Consider the equation
c1 v1 + c2 v2 + + cr vr = 0,
(9)
2.6
79
Rank
where v1 is the rst nonzero row, v2 is the second nonzero row, . . . , and vr is the
last nonzero row of a row-reduced matrix. The rst nonzero element in the rst
nonzero row of a row-reduced matrix must be unity. Assume it appears in column
j. Then, no other rows have a nonzero element in that column. Consequently,
when the left side of Eq. (9) is computed, it will have c1 as its jth component.
Since the right side of Eq. (9) is the zero vector, it follows that c1 = 0. A similar
argument then shows iteratively that c2 , . . . , cr , are all zero. Thus, the nonzero
rows are linearly independent.
If all the rows of the matrix are nonzero, then they must comprise a maximum
number of linearly independent vectors, because the row rank cannot be greater
than the number of rows in the matrix. If there are zero rows in the row-reduced
matrix, then it follows from Theorem 3 of Section 2.5 that including them could
not increase the number of linearly independent rows. Thus, the largest number
of linearly independent rows comes from including just the nonzero rows.
Example 1
1
0
A=
0
0
0
0
2
5
1 4
0
1
0
3
1
.
0
0
Solution A is in row-reduced form. Since it contains three nonzero rows, its row
rank is three.
The following two theorems, which are proved in the Final Comments to this
chapter, are fundamental.
Theorem 2 The row rank and column rank of a matrix are equal.
For any matrix A, we call this common number the rank of A and denote it by
r(A).
Theorem 3 If B is obtained from A by an elementary row (or column) operation,
then r(B) = r(A).
Theorems 1 through 3 suggest a useful procedure for determining the rank of
any matrix: Simply use elementary row operations to transform the given matrix
to row-reduced form, and then count the number of nonzero rows.
Example 2
1
2
A=
3
5
3
4
1
1
.
2
5
15 20
80
Chapter 2
Solution In Example 2 of Section 2.3, we transferred this matrix into the rowreduced form
1 3 4
0 1 1
0 0 0.
0 0 0
This matrix has two nonzero rows so its rank, as well as that of A, is two.
Example 3
1
B = 2
3
2
1
3
3 1 6.
2 4 2
Solution In Example 3 of Section 2.3, we transferred this matrix into the rowreduced form
1 2 1
3
0 1 3 12.
0 0 1
5
This matrix has three nonzero rows so its rank, as well as that of B, is three.
6, 1, 16
2
2
3
is linearly independent.
Solution We consider the matrix
2
3
8
6
1
16
2
2.
3
2.6
81
Rank
0
0
3
1
0
58 .
0
This matrix has two nonzero rows, so its rank is two. Since this is less than the
number of vectors in the given set, that set is linearly dependent.
We can say even more: The original set of vectors contains a subset of two
linearly independent vectors, the same number as the rank. Also, since no row
interchanges were involved in the transformation to row-reduced form, we can
conclude that the third vector is linear combination of the rst two.
Example 5
is linearly independent.
Solution We consider the matrix
0 1
1 3
2 6
4 0
2
1
1
1
3 0
2 1
,
3 1
0 2
which can be reduced (after the rst two rows are interchanged) to the rowreduced form
1 3 1
2
1
0 1
2
3
0
1 7 1
0 0
27
0 0
0
1 175
This matrix has four nonzero rows, hence its rank is four, which is equal to the
number of vectors in the given set. Therefore, the set is linearly independent.
Example 6
82
Chapter 2
2
?
4
and
3
2
A=
6
4
3
0
6
,
0
which has rank one; hence A has just one linearly independent row vector. In
contrast, the matrix
1
B = 3
2
1
6
4
1
0
0
1
1,
0
which has rank two; hence B has two linearly independent row vectors. Since B
is precisely A with one additional row, it follows that the additional row [1, 1]T
is independent of the other two and, therefore, cannot be written as a linear
combination of the other two vectors.
We did not have to transform B in Example 6 into row-reduced form to determine whether the three-vector set was linearly independent. There is a more direct
approach. Since B has only two columns, its column rank must be less than or equal
to two (why?). Thus, the column rank is less than three. It follows from Theorem
3 that the row rank of B is less than three, so the three vectors must be linearly
dependent. Generalizing this reasoning, we deduce one of the more important
results in linear algebra.
Theorem 4
dependent.
2.6
83
Rank
Problems 2.6
In Problems 15, nd the rank of the given matrix.
4 1
1 2
0
1.
.
2. 2 3.
3 1 5
2 2
1 2 4
1
4 2
8 4.
3. 2
4. 1 1 3
1 4
2
1 2 4
1 7 0
5. 0 1 1.
1 1 0
2
2.
2
In Problems 6 through 22, use rank to determine whether the given set of vectors
is linearly independent.
6. {[1
0], [0
8. {[2
4], [3
1
10.
,
2
1]}.
6]}.
!
3
.
4
7. {[1
1], [1
1]}.
9. {[1
3], [2
1], [1
11.
1
,
1
1]}.
!
1
1
,
.
1
2
1
1
12. 0, 1,
1
0
0
1 .
1
1
13. 0, 0,
1
2
2
0 .
1
14. 0,
1
1,
1
1
1 .
3
0
15. 0, 2,
0
1
2
1 .
1
16. 2,
3
2,
1
2
1 .
3
2
1
2
2
1,
,
17.
,
3
1
3
18. {[1
0], [1 1
19. {[1
3], [3
0]}.
9]}.
20. {[10
20
20], [10
10
10], [10
20
10]}.
21. {[10
20
20], [10
10
10], [10
20
10], [20
22. {[2
1], [3 1
1
2 .
4], [1
10
20]}.
2]}.
84
Chapter 2
24. Can the vector [1 1 1]T be expressed as a linear combination of the vectors
given in (a) Problem 12, (b) Problem 13, or (c) Problem 14?
25. Can the vector [2 0 3]T be expressed as a linear combination of the vectors
given in Problem 13?
26. Can [3
2] and [3
2]?
27. Can [3
2] and [4
8]?
28. Find a maximal linearly independent subset of the vectors given in Problem 9.
29. Find a maximal linearly independent subset of the vectors given in Problem 13.
30. Find a maximal linearly independent subset of the set.
[1
0], [2
0], [1
1], [4
2], [4
3].
2.7
Theory of Solutions
Consider once again the system Ax = b of m equations and n unknowns given in
Eq. (2). Designate the n columns of A by the vectors V1 , V2 , . . . , Vn . Then Eq. (2)
can be rewritten in the vector form
x1 V1 + x2 V2 + + xn Vn = b.
Example 1
(10)
Solution
x
1
2
3
7
+y
+z
=
4
5
6
8
2.7
85
Theory of Solutions
Solution
1
A=
1
1
1
1
,
1
1
b=
,
0
1
A =
1
b
1
1
1
1
1
.
0
Example 3
3,
2x + 2y + 2w =
6,
x y w = 3.
Solution
1
A= 2
1
1
2
1
1
2,
1
3
b = 6,
3
1
Ab = 2
1
1
2
1
1
2
1
3
6.
3
Here r(A) = r(Ab ) = 1; hence, the system is consistent. In this case, n = 3 and
k = 1; thus, the solutions are expressible in terms of 3 1 = 2 arbitrary unknowns.
Using Gaussian elimination, we nd that the solution is x = 3 y w where y and
w are both arbitrary.
86
Chapter 2
Example 4
2,
2x + y 3z =
3.
Solution
2
A = 1
2
1
2,
3
3
1
1
1
b = 2,
3
2
A b = 1
2
3
1
1
1
2
3
1
2.
3
Here r(A) = r(Ab ) = 3, hence the system is consistent. Since n = 3 and k = 3, the
solution will be in n k = 0 arbitrary unknowns. Thus, the solution is unique (none
of the unknowns are arbitrary) and can be obtained by Gaussian elimination as
x = y = 2, z = 1.
Example 5
Solution
1
2
A=
3
4
1
1
2
2
2
1
,
1
2
1
2
b=
3,
4
2
Ab =
3
4
1
1
2
2
2 1
1 2
.
1 3
2 4
Here r(A) = r(Ab ) = 2. Thus, the system is consistent and the solutions will be in
terms of 3 2 = 1 arbitrary unknowns. Using Gaussian elimination, we nd that
the solution is x = 1 3z, y = 5z, and z is arbitrary.
In a consistent system, the solution is unique if k = n. If k = n, the solution will
be in terms of arbitrary unknowns. Since these arbitrary unknowns can be chosen
to be any constants whatsoever, it follows that there will be an innite number of
solutions. Thus, a consistent system will possess exactly one solution or an innite
number of solutions; there is no inbetween.
2.7
87
Theory of Solutions
(11)
(12)
Since Eq. (12) is a special case of Eq. (2) with b = 0, all the theory developed
for the system Ax = b remains valid. Because of the simplied structure of a
homogeneous system, one can draw conclusions about it that are not valid for a
nonhomogeneous system. For instance, a homogeneous system is always consistent. To verify this statement, note that x1 = x2 = = xn = 0 is always a solution
to Eq. (12). Such a solution is called the trivial solution. It is, in general, the nontrivial solutions (solutions in which one or more of the unknowns is different from
zero) that are of the greatest interest.
It follows from Theorem 2, that if the rank of A is less than n(n being the
number of unknowns), then the solution will be in terms of arbitrary unknowns.
Since these arbitrary unknowns can be assigned nonzero values, it follows that
nontrivial solutions exist. On the other hand, if the rank of A equals n, then the
solution will be unique, and, hence, must be the trivial solution (why?). Thus, it
follows that:
Theorem 3 The homogeneous system (12) will admit nontrivial solutions if and
only if r(A) = n.
Problems 2.7
In Problems 19, discuss the solutions of the given system in terms of consistency
and number of solutions. Check your answers by solving the systems wherever
possible.
1. x 2y = 0,
x + y = 1,
2x y = 1.
3.
x
x
3x
+ y
y
+ y
2. x + y = 0,
2x 2y = 1,
x y = 0.
+ z
+
z
+ 3z
=
=
=
1,
2,
4.
4. x
2x
3y
y
+
+
2z w = 2,
z + w = 3.
88
Chapter 2
+
+
5. 2x
x
x
y
2y
y
z =
z =
z =
0,
4,
1.
2z
0,
2x
2x
+
+
3y
7y
z =
7z =
0,
0.
9. x
2y
y
y
6. 2x
x
7.
2.8
3z + 3w =
2z + 2w =
3z + 9w =
8.
=
=
3y
4y
0,
0,
2z
0,
2x
2x
3y
7y
z =
9z =
0,
0.
0,
0,
0.
(13)
(14)
We shall assume that the column rank of A is greater than the column rank of B
and show that this assumption leads to a contradiction. It will then follow that the
reverse must be true, which is precisely what we want to prove.
Denote the column rank of A as a and the column rank of B as b. We assume
that a > b. Since the column rank of A is a, there must exist a columns of A that
2.8
89
Final Comments
are linearly independent. If these columns are not the rst a columns, rearrange
the order of the columns so they are. Lemma 1 guarantees such reorderings do
not alter the column rank. Thus, A1 , A2 , . . . , Aa are linearly independent. Since
a is assumed greater than b, we know that the rst a columns of B are not linearly independent. Since they are linearly dependent, there must exist constants
c1 , c2 , . . . , ca not all zero such that
c1 B1 + c2 B2 + + ca Ba = 0.
It then follows that
c1 B1 + c2 B2 + + ca Ba + 0Ba+1 + + 0Bn = 0,
from which we conclude that
x1 = c1 ,
x2 = c2 ,
. . . , xa = ca ,
xa+1 = 0,
. . . , xn = 0.
is a solution of Eq. (14). Since every solution to Eq. (14) is also a solution to Eq.
(12), we have
c1 A1 + c2 A2 + + ca Aa + 0Aa+1 + + 0An = 0,
or more simply
c1 A1 + c2 A2 + + ca Aa = 0,
where all the cs are not all zero. But this implies that the rst a columns of A
are linearly dependent, which is a contradiction of the assumption that they were
linearly independent.
Lemma 3 If Ax = 0 and Bx = 0 have the same set of solutions, then A and B
have the same column rank.
Proof. If follows from Lemma 2 that the column rank of A is less than or equal
to the column rank of B. By reversing the roles of A and B, we can also conclude
from Lemma 2 that the column rank of B is less than or equal to the column rank
of A. As a result, the two column ranks must be equal.
Theorem 1 An elementary row operation does not alter the column rank of a
matrix.
Proof. Denote the original matrix as A, and let B denote a matrix obtained
by applying an elementary row operation to A; and consider the two homogeneous systems Ax = 0 and Bx = 0. Since elementary row operations do not alter
solutions, both of these systems have the same solution set. Theorem 1 follows
immediately from Lemma 3.
90
Chapter 2
Lemma 4 The column rank of a matrix is less than or equal to its row rank.
Proof. Denote rows of A by A1 , A2 , . . . Am , the column rank of matrix A by c
and its row rank by r. There must exist r rows of A which are linearly independent.
If these rows are not the rst r rows, rearrange the order of the rows so they are.
Theorem 1 guarantees such reorderings do not alter the column rank, and they
certainly do not alter the row rank. Thus, A1 , A2 , . . . , Ar are linearly independent.
Dene partitioned matrices R and S by
A1
A2
R = .. and
.
Ar+1
Ar+2
S = .. .
.
Ar
An
R
.
S
tr+1,1
tr+2,1
T = ..
.
tn,1
tr+1,2
tr+2,2
..
.
tn,2
tr+1,n
tr+2,n
..
.. .
.
.
tn,n
2.8
Final Comments
91
3
The Inverse
3.1
Introduction
Denition 1 An inverse of an n n matrix A is a n n matrix B having the
property that
AB = BA = I.
(1)
Determine whether
B=
1
1
3
1
2
1
4
C=
or
2
3
2
21
1
3
2
.
4
AB =
1
1
3
1
2
1
4
5
3
13
3
1
5
2
=
93
94
Chapter 3
The Inverse
while
1
AC =
3
2
4
2
3
2
1
21
1
=
0
2
0
=
3
1
2
21
1
3
2
= CA.
4
D =
2
3
..
then
D1
1
1
1
2
1
3
..
1
n
It is easy to show that if any diagonal element in a diagonal matrix is zero, then
that matrix is singular. (See Problem 57.)
An elementary matrix E is a square matrix that generates an elementary row
operation on a matrix A (which need not be square) under the multiplication
EA. Elementary, matrices are constructed by applying the desired elementary
row operation to an identity matrix of appropriate order. The appropriate order
3.1
95
Introduction
for both I and E is a square matrix having as many columns as there are rows
in A; then, the multiplication EA is dened. Because identity matrices contain
many zeros, the process for constructing elementary matrices can be simplied
still further. After all, nothing is accomplished by interchanging the positions of
zeros, multiplying zeros by nonzero constants, or adding zeros to zeros.
(i) To construct an elementary matrix that interchanges the ith row with the jth
row, begin with an identity matrix of the appropriate order. First, interchange
the unity element in the i i position with the zero in the j i position, and
then interchange the unity element in the j j position with the zero in the
i j position.
(ii) To construct an elementary matrix that multiplies the ith row of a matrix by
the nonzero scalar k, replace the unity element in the i i position of the
identity matrix of appropriate order with the scalar k.
(iii) To construct an elementary matrix that adds to the jth row of a matrix k times
the ith row, replace the zero element in the j i position of the identity matrix
of appropriate order with the scalar k.
Example 2 Find elementary matrices that when multiplied on the right by any
4 3 matrix A will (a) interchange the second and fourth rows of A, (b) multiply
the third row of A by 3, and (c) add to the fourth row of A 5 times its second row.
Solution
1
0
(a)
0
0
0
0
0
1
0
0
1
0
0
1
,
0
0
1
0
(b)
0
0
0
1
0
0
0
0
3
0
0
0
,
0
1
1 0
0 1
(c)
0 0
0 5
0
0
1
0
0
0
.
0
Example 3 Find elementary matrices that when multiplied on the right by any
3 5 matrix A will (a) interchange the rst and second rows of A, (b) multiply the
third row of A by 0.5, and (c) add to the third row of A 1 times its second row.
Solution
(a) 1
0
1
0
0
0
0,
1
(b) 0
0
0
1
0
0
0 ,
0.5
(c) 0
0
0
1
1
0
0.
1
The inverse of an elementary matrix that interchanges two rows is the matrix
itself, it is its own inverse. The inverse of an elementary matrix that multiplies
one row by a nonzero scalar k is gotten by replacing k by 1/k. The inverse of
96
Chapter 3
The Inverse
an elementary matrix which adds to one row a constant k times another row is
obtained by replacing the scalar k by k.
Example 4
Example 2.
Solution
1
0
(a)
0
0
Example 5
Example 3.
0
0
0
1
0
0
1
0
0
1
,
0
1
0
(b)
0
0
1
0
0
1
3
0
0
,
0
1
0
5
0
1
0
(c)
0
.
0
1
Solution
(a) 1
0
1
0
0
0,
1
(b) 0
0
0
1
0
0,
2
(c) 0
0
0
1
1
0.
1
A =
A1
A2
A3
..
An
A11
A1
0
A21
A31
..
.
An1
3.1
97
Introduction
Example 6
2
0
A = 0
0
0
0
5
0
0
0
0
0
0
0
1
4
0
0
0
0
0
0
1
0
0
0
0
0
0
0
1
0
0
0
0
0.
1
0
0
0
0
0
0
0
1
Solution Set
0
,
5
2
A1 =
0
0
,
1
1
A2 =
4
1
A3 = 0
0
and
0
0
1
0
1;
0
A1
A =
0
A2
A3
A11 =
1
2
1
5
A21 =
1
4
0
,
1
and
A31
1
= 0
0
and
1
2
A1
=
0
0
0
1
5
1 0
4 1
0 0
0 0
0 0
0
0
1
0
0
0
0
0
0
1
0
0
0
0
0
0.
1
0
0
0
1
0
1,
0
98
Chapter 3
The Inverse
Problems 3.1
1. Determine if any of the following matrices are inverses for
1
2
A=
1
2
1
3
,
1
9
(a)
(c)
3
:
9
23
1
3
1
2
(b)
(d)
3
,
9
1
B=
1
1
1
(a)
(c)
1
1
1
,
1
1
,
1
1
:
1
1
(b)
1
(d)
2
1
1
,
1
1
.
2
2
.
3
b
.
d
8
A=
5
Hint: Dene
B=
a
c
Calculate AB, set the product equal to I, and then solve for the elements of B.
4. Use the procedure described in Problem 3 to calculate the inverse of
1
C=
2
2
.
1
1
D=
1
1
.
1
3.1
99
Introduction
a
A=
c
when ad bc = 0 is
A1 =
b
d
1
d
ad bc c
b
.
a
1 21
.
1
1
2
100
Chapter 3
The Inverse
2
0
1
31. 0
0
1
34. 0
0
1
0
37.
0
0
0
0
40.
0
1
0
,
1
28.
0
1
0
0
0 ,
1
3
0,
1
0
1
0
0
0
0
0
1
0
0
,
1
0
0
1
0
0
0
0
1
0
1
0
,
0
0
0
2
0
1
0
2
,
1
29.
0
32. 1
0
1
35. 0
0
1
0
38.
0
0
0
1
0
0
0
0,
1
0
2,
1
0 0
0 7
,
1 0
0 1
0
1
1
0
0
0
1
0
1
0
0
0
1
0
1
0
41.
0
0
1
3
0
0
,
0
1
0
,
1
30.
1
1
0
,
1
1 0 0
33. 0 1 0,
3 0 1
1 0
0
0,
36. 0 1
0 0 4
1 0 0 0
0 1 0 0
39.
3 0 1 0,
0 0 0 1
1
0
42.
0
0
0
1
0
0
0
0
21
0
0
0
,
0
1
In Problems 43 through 55, nd the inverses, if they exist, of the given diagonal or
block diagonal matrices.
1
0
2 0
1 0
3
0
2
43.
,
44.
,
45.
,
46.
,
0 3
0 0
0 3
0 23
10
47. 0
0
1
0
50.
0
0
0
5
0
2
1
0
0
0
0,
5
0
0
1
2
0
0
,
0
1
1
48. 0
0
2
0
51.
0
0
1
1
0
0
0,
1
0
3
0
0
0
0
1
0
0
0
,
3
1
0
49.
0
4
0
52.
0
0
0
2
0
0
5
0
0
0
0
6
0
0
0 ,
3
5
0
0
,
0
1
3.2
101
Calculating Inverses
0
1
53.
0
0
1
0
0
0
0
0
0
1
0
0
,
1
0
0
0
54.
1
0
0
1
0
0
1
0
0
0
0
0
,
0
7
4
0
55.
0
0
0
5
0
0
0
0
1
0
0
0
,
6
1
56. Prove that a square zero matrix does not have an inverse.
57. Prove that if a diagonal matrix has at least one zero on its main diagonal, then
that matrix cannot have an inverse.
58. Prove that if A2 = I, then A1 = A.
3.2
Calculating Inverses
In Section 2.3, we developed a method for transforming any matrix into rowreduced form using elementary row operations. If we now restrict our attention
to square matrices, we may say that the resulting row-reduced matrices are upper
triangular matrices having either a unity or zero element in each entry on the
main diagonal. This provides a simple test for determining which matrices have
inverses.
Theorem 1 A square matrix has an inverse if and only if reduction to rowreduced form by elementary row operations results in a matrix having all unity
elements on the main diagonal.
We shall prove this theorem in the Final Comments to this chapter as
Theorem 2 An n n matrix has an inverse if and only if it has rank n.
Theorem 1 not only provides a test for determining when a matrix is invertible,
but it also suggests a technique for obtaining the inverse when it exists. Once a
matrix has been transformed to a row-reduced matrix with unity elements on the
main diagonal, it is a simple matter to reduce it still further to the identity matrix.
This is done by applying elementary row operation (E3)adding to one row of
a matrix a scalar times another row of the same matrixto each column of the
matrix, beginning with the last column and moving sequentially toward the rst
column, placing zeros in all positions above the diagonal elements.
Example 1
matrix
1
A = 0
0
to the identity matrix.
2
1
0
1
3
1
102
Chapter 3
Solution
The Inverse
1
0
0
2
1
0
1
1
3 0
1
0
1
0
0
1
0
0
2
1
0
1
0
1
by adding to
the second row (3)
2
1
0
0
0
1
by adding to
the rst row (1)
0
1
0
0
0
1
by adding to
the rst row (2)
To summarize, we now know that a square matrix A has an inverse if and only
if it can be transformed into the identity matrix by elementary row operations.
Moreover, it follows from the previous section that each elementary row operation
is represented by an elementary matrix E that generates the row operation under
the multiplication EA. Therefore, A has an inverse if and only if there exist a
sequence of elementary matrices. E1 , E2 , . . . , Ek such that
Ek Ek1 E3 E2 E1 A = I.
But, if we denote the product of these elementary matrices as B, we then have
BA = I, which implies that B = A1 . That is, the inverse of a square matrix A of
full rank is the product of those elementary matrices that reduce A to the identity
matrix! Thus, to calculate the inverse of A, we need only keep a record of the
elementary row operations, or equivalently the elementary matrices, that were
used to reduce A to I. This is accomplished by simultaneously applying the same
elementary row operations to both A and an identity matrix of the same order,
because if
Ek Ek1 E3 E2 E1 A = I,
then
(Ek Ek1 E3 E2 E1 )I = Ek Ek1 E3 E2 E1 = A1 .
We have, therefore, the following procedure for calculating inverses when they
exist. Let A be the n n matrix we wish to invert. Place next to it another n n
matrix B which is initially the identity. Using elementary row operations on A,
transform it into the identity. Each time an operation is performed on A, repeat
the exact same operation on B. After A is transformed into the identity, the matrix
obtained from transforming B will be A1 .
If A cannot be transformed into an indentity matrix, which is equivalent to
saying that its row-reduced from contains at least one zero row, then A does not
have an inverse.
3.2
103
Calculating Inverses
Example 2
Invert
2
.
4
1
3
A=
Solution
#
2 ## 1
4 # 0
1
3
#
0
1
2 ## 1
1
0 2 # 3
1
0
2
1
#
# 1
#
# 3
# 2
0
1
0
21
by adding to
the second row (3)
by multiplying
the second row by ( 21 )
A has been transformed into row-reduced form with a main diagonal of only unity
elements; it has an inverse. Continuing with transformation process, we get
1
0
by adding to
the rst row (2)
#
1
0 ## 2
.
#
1 # 23 21
Thus,
Example 3
3
2
21
5
A = 0
4
8
2
3
1
1.
1
Solution
5
0
4
8
2
3
#
1 ## 1
1 ## 0
1 # 0
0
1
0
0
1
0 0
1
4
1.6
2
3
#
0.2 ## 0.2
1 ## 0
1 # 0
0
1
0
#
1 1.6 0.2 ## 0.2
2
1 ## 0
0
0 3.4 1.8 # 0.8
0
0
1
0
1
0
by multiplying the
rst row by (0.2)
0
by adding to the
0
third row (4)
1
times the rst row
104
Chapter 3
The Inverse
#
1 1.6 0.2 ## 0.2 0
1
0.5 ## 0 0.5
0
0 3.4 1.8 # 0.8 0
0
0
1
by multiplying the
second row by (0.5)
#
1.6 0.2 ## 0.2 0
1
0.5 ## 0 0.5
0 0.1 # 0.8 1.7
0
0
1
by adding to the
third row (3.4)
1
0
0
1
0
0
1.6
1
0
0.2 ## 0.2 0
0
0.5 ## 0 0.5
0.
1 # 8 17 10
by multiplying the
third row by (0.1)
A has been transformed into row-reduced form with a main diagonal of only unity
elements; it has an inverse. Continuing with the transformation process, we get
1
0
0
1
0
0
1
0
0
1.6
1
0
#
0.2 ## 0.2
0 ## 4
1 # 8
1.6
1
0
#
0 ## 1.4
0 ## 4
1# 8
0
1
0
#
0 ## 5
0 ## 4
1# 8
by adding to the
second row (0.5)
0
0
9
5
17 10
3.4
9
17
by adding to the
rst row (0.2)
2
5
10
by adding to the
rst row (1.6)
11
6
9
5.
17 10
Thus,
A1
Example 4
11
6
9
5.
17 10
5
= 4
8
0 1
A = 1 1
1 1
Solution
0 1
1 1
1 1
1
1
3
#
# 1
#
# 0
#
# 0
0
1
0
0
1 1
0 0 1
1
1 1
1
1
3
#
# 0
#
# 1
#
# 0
1
1.
3
1
0
0
0
0
1
by interchanging the
rst and second rows
3.2
105
Calculating Inverses
1
0
0
0
0
0
0
0
0
0
0
1
1
0
1
1
2
1
1
0
1
1
1
1
1
0
1
0
1
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
0
1
0
0
0
1
#
# 1
1
#
# 1
1
#
2
#
# 0 1
2
by adding to the
the third row (1)
0
0
1
1 0
0 0
0
1
0
1
0
1
0
1
0
21
0
1
1
2
1
2
1
0
by adding to the
second row (1)
21
3
2
1
2
21
by multiplying the
third row by ( 21 )
21
1
2
21
by adding to the
rst row (1)
21
1
2
by adding to the
rst row (1)
21
.
1
2
Thus,
A1 = 1
0
Example 5
21
.
1
2
21
1
2
Invert
2
.
4
1
A=
2
Solution
1
2
#
2 ## 1
4 # 0
0
1
1
0
#
2 ## 1
0 # 2
0
.
1
by adding to
the second row (2)
A has been transformed into row-reduced form. Since the main diagonal contains
a zero element, here in the 22 position, the matrix A does not have an inverse. It
is singular.
106
Chapter 3
The Inverse
Problems 3.2
In Problems 120, nd the inverses of the given matrices, if they exist.
1.
1
3
2
4.
3
1
7. 1
0
1
10. 4
7
3
13. 4
3
2
16. 3
5
1
0
19.
0
0
1
,
4
2.
1
,
4
5.
1
0
1
0
1,
1
2
5
8
3
6,
9
2
0
9
1
1,
2
4
4
0
1
1
0
0
2
1
0
8. 1
0
2
11. 5
4
1
14. 2
1
3
4,
1
1
1
2
0
8
5
2
1
,
3
2
5
17. 2
2
1
2
20.
4
3
1
,
2
3.
4
4
3
,
2
1
0,
0
0
1
1
0
0,
1
1
2
2
9. 0
3
0
1
1
1
2,
1
1
3
0
5
1,
2
2
12. 0
0
1
1,
3
0
1
3
1
2,
1
0 0
1 0
6 2
2 4
2
0
1
1
2
1
3
6.
0
0
1
4
,
4
1
15. 3
2
3
18. 1
2
2
2
3
1
4,
1
1
1
3 1,
3 1
0
0
.
0
1
21. Use the results of Problems 11 and 20 to deduce a theorem involving inverses
of lower triangular matrices.
22. Use the results of Problems 12 and 19 to deduce a theorem involving the
inverses of upper triangular matrices.
23. Matrix inversion can be used to encode and decode sensitive messages for
transmission. Initially, each letter in the alphabet is assigned a unique positive
integer, with the simplest correspondence being
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
3.2
107
Calculating Inverses
19
19
18
0.
1
A=
2
2
,
3
1
2
1
2
2
3
2
3
19
35
=
,
8
62
5
5
=
,
0
10
47
9
=
,
19
75
etc.,
62
10
47
75
....
Note an immediate benet from the scrambling: the letter S, which was originally always coded as 19 in each of its three occurrences, is now coded as a
35 the rst time and as 75 the second time. Continue with the scrambling, and
determine the nal code for transmitting the above message.
24. Scramble the message SHE IS A SEER using, matrix
2
A=
4
3
.
5
25. Scramble the message AARON IS A NAME using the matrix and steps
described in Problem 23.
108
Chapter 3
The Inverse
26. Transmitted messages are unscrambled by again packaging the received message into 2-tuples and multiplying each vector by the inverse of A. To decode
the scrambled message
18
31
44
72
2
,
1
3
2
and then
3
2
2
1
3
2
2
1
18
8
=
,
31
5
44
12
=
.
72
16
12
16
43
40
60
18
31
28
51.
27. Use the decoding procedure described in Problem 26, but with the matrix A
given in Problem 24, to decipher the transmitted message
16
120
39
131
27
45
38
76
51
129
28
56.
28. Scramble the message SHE IS A SEER by packaging the coded letters into
3-tuples and then multiplying by the 3 3 invertible matrix
A = 0
1
0
1
1
1.
0
Add as many zeros as necessary to the end of the message to generate complete
3-tuples.
3.3
3.3
109
Simultaneous Equations
Simultaneous Equations
One use of the inverse is in the solution of systems of simultaneous linear
equations. Recall, from Section 1.3 that any such system may be written in the form
Ax = b,
(2)
where A is the coefcient matrix, b is a known vector, and x is the unknown vector
we wish to nd. If A is invertible, then we can premultiply (2) by A1 and obtain
A1 Ax = A1 b.
But A1 A = 1, therefore
Ix = A1 b
or
x = A1 b.
(3)
Solution
2.
Dene
1
A=
3
2
,
1
x
x=
,
y
9
b=
;
2
2
1
9
5
1
= 15
=
.
2
25
5
Using the denition of matrix equality (two matrices are equal if and only if their
corresponding elements are equal), we have that x = 1 and y = 5.
110
Chapter 3
Example 2
The Inverse
2,
2y + z = 1,
4x + 3y z =
3.
Solution
5
A = 0
4
8
2
3
1
1,
1
x
x = y,
z
2
b = 1.
3
11
6
9
5.
17 10
5
4
8
Thus,
5
x
y = x = A1 b = 4
8
z
hence x = 3, y = 2, and z = 3.
11
6
2
3
9
51 = 2,
17 10
3
3
Not only does the invertibility of A provide us with a solution of the system
Ax = b, it also provides us with a means of showing that this solution is unique
(that is, there is no other solution to the system).
Theorem 1 If A is invertible, then the system of simultaneous linear equations
given by Ax = b has one and only one solution.
Proof. Dene w = A1 b. Since we have already shown that w is a solution to
Ax = b, it follows that
Aw = b.
(4)
Assume that there exists another solution y. Since y is a solution, we have that
Ay = b.
(5)
Aw = Ay.
(6)
3.3
111
Simultaneous Equations
Problems 3.3
In Problems 1 through 12, use matrix inversion, if possible, to solve the given
systems of equations:
1.
x + 2y = 3,
3x + y = 1.
2.
a + 2b = 5,
3a + b = 13.
3. 4x + 2y = 6,
2x 3y = 7.
4. 4l p = 1,
5l 2p = 1.
5. 2x + 3y = 8,
6x + 9y = 24.
6.
7.
2x + 3y z = 4,
x 2y + z = 2,
3x y
= 2.
9. 2r + 4s
= 2,
3r + 2s + t = 8,
5r 3s + 7t = 15.
11. 2r + 3s 4t = 12,
3r 2s
= 1,
8r s 4t = 10.
x + 2y z = 1,
2x + 3y + 2z = 5,
y z = 2.
x + 2y 2z = 1,
2x + y + z = 5,
x + y z = 2.
13. Use matrix inversion to determine a production schedule that satises the
requirements of the manufacturer described in Problem 12 of Section 2.1.
112
Chapter 3
The Inverse
14. Use matrix inversion to determine a production schedule that satises the
requirements of the manufacturer described in Problem 13 of Section 2.1.
15. Use matrix inversion to determine a production schedule that satises the
requirements of the manufacturer described in Problem 14 of Section 2.1.
16. Use matrix inversion to determine the bonus for the company described in
Problem 16 of Section 2.1.
17. Use matrix inversion to determine the number of barrels of gasoline that the
producer described in Problem 17 of Section 2.1 must manufacture to break
even.
18. Use matrix inversion to solve the Leontief inputoutput model described in
Problem 22 of Section 2.1.
19. Use matrix inversion to solve the Leontief inputoutput model described in
Problem 23 of Section 2.1.
3.4
BA = I,
AC = I,
and
CA = I.
A1
%1
= A.
See Problem 1.
Property 2 (AB)1 = B1 A1 .
%
$
Proof. (AB)1 denotes the inverse of AB. However, B1 A1 (AB) =
%
$
B1 A1 A B = B1 IB = B1 B = I. Thus, B1 A1 is also an inverse for AB,
and, by uniqueness of the inverse, B1 A1 = (AB)1 .
1
Property 3 (A1 A2 An )1 = An1 An1
A21 A11 .
3.4
113
$ T %1 $ 1 %T
A
= A
$ %1
Proof. AT
denotes the inverse of AT . However, using the property of the
transpose that (AB)T = BT AT , we have that
$
%$
%T $
%T
AT A1 = A1 A = IT = I.
%T
%T
$
$
Thus, A1 is an inverse of AT , and by uniqueness of the inverse, A1 =
$ T %1
A
.
Property 5 (A)1 = (1/) (A)1 if is a nonzero scalar.
Proof.
Find A2 if
1
=3
1
2
1
2
114
Chapter 3
The Inverse
Solution
2
A2 = A1
12
=
6
2
6
4
6
4
12
=
6
12
6
6
180
=
4
96
96
.
52
Problems 3.4
1. Prove Property 1.
2. Verify Property 2 for
1
2
A=
1
3
and
B=
2
1
5
.
2
1
A=
3
2
4
1
1
1
1
B=
3
and
1
.
5
1
1
0
1
B = 0
0
and
2
1
0
1
1.
1
1
A=
0
3
,
2
4
B=
0
0
,
2
and
1
C=
2
1
A = 2
1
0
3
0
2
1.
3
0
.
2
3.5
115
LU Decomposition
2
.
1
1
2
BA1
%T $
A1 BT
%1
= I.
3.5
LU Decomposition
Matrix inversion of elementary matrices (see Section 3.1) can be combined with
the third elementary row operation (see Section 2.3) to generate a good numerical
technique for solving simultaneous equations. It rests on being able to decompose
a nonsingular square matrix A into the product of lower triangular matrix L with
an upper triangular matrix U. Generally, there are many such factorizations. If,
however, we add the additional condition that all diagonal elements of L be unity,
then the decomposition, when it exists, is unique, and we may write
A = LU
(7)
with
0
1
l21
l
L =
31
...
l32
..
.
0
0
1
..
.
ln1
ln2
ln3
0
0
.
..
. ..
1
and
u11
0
0
U =
.
..
0
u12
u22
0
..
.
u13
u23
u33
..
.
u1n
u2n
u3n .
..
..
.
.
unn
To decompose A into from (7), we rst reduce A to upper triangular from using
just the third elementary row operation: namely, add to one row of a matrix a
116
Chapter 3
The Inverse
scalar times another row of that same matrix. This is completely analogous to
transforming a matrix to row-reduced form, except that we no longer use the
rst two elementary row operations. We do not interchange rows, and we do not
multiply a row by a nonzero constant. Consequently, we no longer require the
rst nonzero element of each nonzero row to be unity, and if any of the pivots
are zerowhich in the row-reduction scheme would require a row interchange
operationthen the decomposition scheme we seek cannot be done.
Example 1
2
A = 4
6
1
2
1
3
1
2
2
A = 4
6
1
2
1
3
2
1 0
2
6
2
0
0
2
0
0
1
4
1
3
5
2
by adding to the
second row (2) times
1
4
4
3
5
11
by adding to the
third row (3) times
1
4
0
3
5.
6
by adding to the
third row (1) times
%
En1,n E41 E31 E21 A = U,
(8)
where E21 denotes the elementary matrix that places a zero in the 21 position,
E31 denotes the elementary matrix that places a zero in the 31 position, E41
denotes the elementary matrix that places a zero in the 41 position, and so on.
Since elementary matrices have inverses, we can write (8) as
1 1
1
A = E1
E
E
E
21 31 41
n,n1 U.
(9)
3.5
117
LU Decomposition
Theorem 1 of Section 1.4 that the product of these lower triangular matrices is
itself lower triangular. Setting
1 1
1
L = E1
21 E31 E41 En,n1,
we see that (9) is identical to (7), and we have the decomposition we seek.
Example 2
Solution The elementary matrices associated with the elementary row operations described in Example 1 are
E21
1
= 2
0
0
1
0
0
0,
1
E31
1
= 0
3
0
1
0
0
0,
1
0
1
0
0
0,
1
and
E42
1
= 0
0
0
1
1
0
0,
1
E1
21
1
= 2
0
0
1
0
0
0,
1
E1
31
1
= 0
3
and
E1
42
1
= 0
0
0
1
1
0
0.
1
Then,
2 1
4
2
6 1
3
1
1 = 2
2
0
0
1
0
0
1 0
0 0 1
1 3 0
0 1
00
1 0
0
1
1
0 2
00
1 0
1
4
0
3
5
6
2 1
4
2
6 1
3
1
0
1 = 2
1
2
3 1
0 2
00
1 0
1
4
0
3
5.
6
118
Chapter 3
The Inverse
from. If this is not possible, because of a zero pivot, then stop; otherwise, the LU
decomposition is found by dening the resulting upper triangular matrix as U and
constructing the lower triangular matrix L utilizing Observation 1.
Example 3
2
6
A =
1
0
1
2
2
4
1
0
1 3
3
8
.
4
4
2
6
1
0
1
2
3
2
0
2
4
8
1
0
4 1
1 3 4
0
2
0
0
0
2
0
0
0
2
0
0
0
2
0
0
0
1
2
1 2
1
0
1 3
3
by adding to the
1
by adding$to the%
1
1
2
1 2
23
1
1
2
1 2
0
2
1 3
3
by adding$to the%
1
third row 23 times
4
the second row
4
1
1
0
0
2
2
2
5
3
by adding to the
1
fourth row (1) times
4
the second row
5
1
1
0
0
2
2
2
0
3
1
4
5
by adding to
the
fourth row 25 times
We now have an upper triangular matrix U. To get the lower triangular matrix
L in the decomposition, we note that we used the scalar 3 to place a zero in
the 21 position, so its negative (3) = 3 goes into the 21 position of L. We
used the scalar 21 to place a zero in the 31 position in the second step of the
above triangularization process, so its negative, 21 , becomes the 31 element in
L; we used the scalar 25 to place a zero in the 43 position during the last step of
3.5
119
LU Decomposition
2
6
1
0
1
2
2
4
1
0
1 3
1
3
8 3
= 1
4 2
4
0
0
1
3
2
0
0
1
25
0
2
0 0
0
0
1
1
1
0
0
2
2
2
0
3
1
.
4
5
LU decompositions, when they exist, can be used to solve systems of simultaneous linear equations. If a square matrix A can be factored into A = LU, then
the system of equations Ax = b can be written as L(Ux) = b. To nd x, we rst
solve the system
Ly = b
(10)
(11)
for x. Both systems (10) and (11) are easy to solve, the rst by forward substitution
and the second by backward substitution.
Example 4
2
4
6
1
2
1
3 x
9
1y = 9.
2
z
12
1
2
3
0
1
1
0
9
0 = 9,
1
12
120
Chapter 3
The Inverse
= 9,
2 +
= 9,
3 + = 12.
Solving this system from top to bottom, we get = 9, = 9, and = 30. Consequently, the matrix system Ux = y is
2 1
3 x
9
0
4 5y = 9.
0
0
6 z
30
which is equivalent to the system of equations
2x y + 3z = 9,
4y 5z = 9,
6z = 30.
Solving this system from bottom to top, we obtain the nal solution x = 1, y = 4,
and z = 5.
Example 5
= 5,
= 8,
= 4,
= 3.
Solution The matrix representation for this system has as its coefcient matrix
the matrix A of Example 3. Dene.
y = [, , , ]T .
Then, using the decomposition determined in Example 3, we can write the matrix
system Ly = b as the system of equations
3 +
1
3
2 + 2 +
= 5,
= 8,
= 4,
25 + = 3,
which has as its solution = 5, = 7, = 4, and = 0. Thus, the matrix system
Ux = y is equivalent to the system of equations
2a + b + 2c + 3d = 5,
b 2c d = 7,
2c + 4d = 4,
5d = 0.
3.5
121
LU Decomposition
Solving this set from bottom to top, we calculate the nal solution a = 1,
b = 3, c = 2, and d = 0.
LU decomposition and Gaussian elimination are equally efcient for solving
Ax = b, when the decomposition exists. LU decomposition is superior when Ax =
b must be solved repeatedly for different values of b but the same A, because once
the factorization of A is determined it can be used with all b. (See Problems 17 and
18.) A disadvantage of LU decomposition is that it does not exist for all nonsingular
matrices, in particular whenever a pivot is zero. Fortunately, this occurs rarely, and
when it does the difculty usually is overcome by simply rearranging the order of
the equations. (See Problems 19 and 20.)
Problems 3.5
In Problems 1 through 14, A and b are given. Construct an LU decomposition for
the matrix A and then use it to solve the system Ax = b for x.
1. A =
3. A =
1
3
1
,
4
b=
3
,
2
b=
8
5
6.
7.
8.
9.
10.
1
.
6
625
.
550
2 1
11
,
b=
.
1 2
2
1 1 0
4
4. A = 1 0 1, b = 1.
0 1 1
1
2. A =
1
2 0
1
A = 1 3 1,
b = 2.
2 2 3
3
2
1
3
10
1
0,
A = 4
b = 40.
2 1 2
0
3 2 1
50
A = 4 0 1,
b = 80.
3 9 2
20
1 2 1
80
1,
A = 2 0
b = 159.
1 1
3
75
1 2 1
8
1,
A = 0 2
b = 1.
0 0
1
5
1 0 0
2
A = 3 2 0,
b = 4.
1 1 2
2
5.
122
Chapter 3
1
1
11. A =
1
0
2
1
12. A =
0
0
1
1
13. A =
1
0
The Inverse
1
4
0
1
1
1
,
0
1
1 3
2 1
,
1 1
0 1
2
1
1
1
1 1
2 1
,
1 2
1 1
0
1
1
1
2 0
2 2
14. A =
4 3
1 0
1
0
1
1
2
0
1
3
0
6
,
1
1
4
3
b =
2.
2
1000
200
b =
100.
100
30
30
b =
10.
10
2
4
b =
9.
4
4,
2x + 7y 4z = 6.
(b) Resolve when the right sides of each equation are replaced by 10, 10, and
10, respectively.
17. Solve the system Ax = b for the following vectors b when A is given as in
Problem 4:
2
40
1
5
(b) 2,
(c) 50,
(d) 1.
(a) 7,
0
20
3
4
3.5
123
LU Decomposition
18. Solve the system Ax = b for the following vectors b when A is given as in
Problem 13:
1
0
190
1
1
0
130
1
(a)
(b)
(c)
(d)
1,
0,
160,
1.
1
0
60
1
19. Show that LU decomposition cannot be used to solve the system
2y + z = 1,
x + y + 3z =
8,
2x y z =
1,
but that the decomposition can be used if the rst two equations are
interchanged.
20. Show that LU decomposition cannot be used to solve the system
x + 2y + z = 2,
2x + 4y z = 7,
x + y + 2z = 2,
but that the decomposition can be used if the rst and third equations are
interchanged.
21. (a) Show that the LU decomposition procedure given in this chapter cannot
be applied to
0
A=
0
2
.
9
1
L=
1
0
1
and
2
.
7
2
.
3
0
U=
0
1
L=
3
0
1
and
0
U=
0
(d) Why do you think the LU decomposition procedure fails for this A? What
might explain the fact that A has more than one LU decomposition?
124
Chapter 3
3.6
The Inverse
1
0
Ai xj =
when i = j,
when i = j.
This equation can be notationally simplied if we make use of the Kronecker delta
ij dened by
ij =
1
0
when i = j.
when i = j.
Then,
Ai xj = ij .
Now consider the equation
n
i=0
ci Ai = 0.
3.6
125
Final Comments
We wish to show that each constant ci must be zero. Multiplying both sides of this
last equation on the right by the vector xj , we have
& n
'
ci Ai xj = 0xj ,
i=0
n
(ci Ai ) xj = 0,
i=0
n
$
%
ci Ai xj = 0,
i=0
n
ci ij = 0,
i=0
cj = 0.
Thus for each xj (j = 1, 2, . . . , n) we have cj = 0, which implies that c1 = c2 =
= cn = 0 and that the rows A1 , A2 , . . . , An are linearly independent.
It follows directly from Lemma 2 and the denition of an inverse that if an
n n matrix A has an inverse, then A must have rank n. This in turn implies
directly that if A does not have rank n, then it does not have an inverse. We now
want to show the converse: that is, if A has rank n, then A has an inverse.
We already have part of the result. If an n n matrix A has rank n, then the
procedure described in Section 3.2 is a constructive method for obtaining a matrix
C having the property that CA = I. The procedure transforms A to an identity
matrix by a sequence of elementary row operations E1 , E2 , . . . , Ek1 , Ek . That is,
Ek Ek1 . . . E2 E1 A = I.
Setting
C = Ek Ek1 . . . E2 E1 ,
(12)
CA = I.
(13)
we have
Proof. If AB = I, then from Lemma 1 A has rank n, and from (12) and (13)
there exists a matrix C such that CA = I. It follows from Theorem 1 of Section 3.4
that B = C.
126
Chapter 3
The Inverse
4
An Introduction to
Optimization
4.1
Graphing Inequalities
Many times in real life, solving simple equations can give us solutions to everyday
problems.
Example 1 Suppose we enter a supermarket and are informed that a certain
brand of coffee is sold in 3-lb bags for $6.81. If we wanted to determine the cost
per unit pound, we could model this problem as follows:
Let x be the cost per unit pound of coffee; then the following equation represents
the total cost of the coffee:
x + x + x = 3x = 6.81.
Dividing both sides of (1) by 3 gives the cost of $2.27 per pound of coffee.
(1)
Example 2 Lets suppose that we are going to rent a car. If the daily xed cost
is $100.00, with the added price of $1.25 per mile driven, then
C = 100 + 1.25m
(2)
represents the total daily cost, C, where m is the number of miles traveled on a
particular day.
What if we had a daily budget of $1000.00? We would then use (2) to determine the number of miles we could travel given this budget. Using elementary
algebra, we see that we would be able to drive 720 miles.
These two simple examples illustrate how equations can assist us in our daily
lives. But sometimes things can be a bit more complicated.
127
128
Chapter 4
An Introduction to Optimization
(3)
(4)
50
50
100
100
Figure 4.1
50
50
100
4.1
129
Graphing Inequalities
Remark 1 Notice that the lower left-hand part of the graph is shaded.
An easy way to check is to pick a point, say (50, 50); clearly 50 +
50 2, therefore the half-region containing this point must be the shaded
portion.
Remark 2 The graph of the strict inequality x + y < 2 yields the same picture
with the line dashed (instead of solid) to indicate that points on the line x + y = 2
are not included.
Example 5
Sketch 2x + 3y 450.
1000
800
600
400
200
0
0
200
400
600
800
1000
Figure 4.2
Remark 3 Notice that we have restricted this graph to the rst quadrant.
Many times the variables involved will have non-negative values, such as volume, area, etc. Notice, too, that the region is innite, as is the region in
Example 4.
Example 6
130
Chapter 4
An Introduction to Optimization
12
10
0
0
10
12
Figure 4.3
Remark 4 Note that the upper-right corner point is (2, 4). This point is
the intersection of the straight lines given by the equations 4x + y = 12 and
2x + 5y = 24; in Chapter 2 we covered techniques used in solving simultaneous
equations. Here the added constraints of x 0 and y 0 render a bounded or
nite region.
We will see regions like Figure 4.3 again both in Section 4.2 (with regard to
modeling) and Section 4.3 (using the technique of linear programming).
Problems 4.1
Sketch the following inequalities:
1. y 0
2. x 0
3. y
4. x + 4y 12
5. x + 4y < 12
6. x + 4y 12
7. x + 4y > 12
4.2
131
4.2
(5)
2X + 2Y 1200
(6)
X0
(7)
Y 0.
(8)
Note that (5) represents the constraint due to construction (in hours) while (6)
represents the constraint due to painting (also in hours). The inequalities (7) and
(8) merely state that the number of each type of wagon cannot be negative.
These four inequalities can be graphed as follows in Figure 4.4:
Let us make a few observations. We will call the shaded region that satises
all four inequalities the region of feasibility. Next, the shaded region has four
corner points called vertices. The coordinates of these points are given by (0, 0),
(0, 600), (450, 150) and (500, 0). Lastly, this region has the property that, given
any two points in the interior of the region, the straight line segment connecting
these two points lies entirely within the region. We call regions with this property
convex.
132
Chapter 4
An Introduction to Optimization
700
600
500
400
300
200
100
0
0
100
200
300
400
500
600
700
Figure 4.4
(9)
Note that Equation (9) is called the objective function. The notation P(X, Y ) is read
P of X and Y and is evaluated by simply substituting the respective values into
the expression. For example, P (0,600) = 50(0) + 60(600) = 0 + 36,000 = 36,000
dollars, while P (450,150) = 50(450) + 60(150) = 22,500 + 9000 = 31,500 dollars.
Equation (9), the inequalities (5)(8), and Figure 4.4 model the situation
above, which is an example of an optimization problem. In this particular example, our goal was to maximize a quantity (prot). Our next example deals with
minimization.
Suppose a specic diet calls for the following minimum daily requirements: 186
units of Vitamin A and 120 units of Vitamin B. Pill X contains 6 units of Vitamin
A and 3 units of Vitamin B, while pill Y contains 2 units of Vitamin A and 2 units
of Vitamin B. What is the least number of pills needed to satisfy both vitamin
requirements?
Let us allow X to represent the number of X pills ingested and let Y represent
the number of Y pills taken. Then the following inequalities hold:
6X + 2Y 186
(10)
3X + 2Y 120
(11)
4.2
133
80
60
40
20
0
0
20
40
60
80
100
Figure 4.5
X0
(12)
Y 0.
(13)
Note that (10) models the minimum daily requirement of units of Vitamin A, while
(11) refers to the minimum daily requirement of units of Vitamin B. The quantity
to be minimized, the total number of pills, is given by the objective function:
N(X, Y ) = X + Y.
(14)
We note that while this region of feasibility is convex, it is also unbounded. Our
vertices are (40, 0), (0, 93), and (22, 27).
In the next section we will solve problems such as these by applying a very
simple, yet extremely powerful, theorem of linear programming.
Problems 4.2
Model the following situations by dening all variables and giving all inequalities,
the objective function and the region of feasibility.
1. Farmer John gets $5000 for every truck of wheat sold and $6000 for every truck
of corn sold. He has two elds: eld A has 23 acres and eld B has 17 acres.
134
Chapter 4
An Introduction to Optimization
For every 2 acres of eld A, Farmer John produces a truck of wheat, while
3 acres are required of eld B for the same amount of wheat. Regarding the
corn, 3 acres of eld A are required for a truck, while only 1 acre of eld B is
needed. How many trucks of each commodity should be produced to maximize
Farmer Johns prot?
2. Redo Problem (1) if Farmer John gets $8000 for every truck of wheat and $5000
for every truck of corn.
3. Dr. Lori Pesciotta, a research scientist, is experimenting with two forms of a
special compound, H-Turebab. She needs at least 180 units of one form of the
compound () and at least 240 units of the second form of the compound ().
Two mixtures are used: X and Y . Every unit of X contains two units of and
three units of , while each unit of Y has the opposite concentration. What
combination of X and Y will minimize Dr. Pesciottas costs, if each unit of X
costs $500 and each unit of Y costs $750?
4. Redo Problem (3) if X costs $750 per unit and Y costs $500 per unit.
5. Redo Problem (3) if, in addition, Dr. Pesciotta needs at least 210 units of a
third form () of H-Turebab, and it is known that every unit of both X and Y
contains 10 units of .
6. Cereal X costs $.05 per ounce while Cereal Y costs $.04 per ounce. Every
ounce of X contains 2 milligrams (mg) of Zinc and 1 mg of Calcium, while
every ounce of Y contains 1 mg of Zinc and 4 mg of Calcium. The minimum
daily requirement (MDR) is 10 mg of Zinc and 15 mg of Calcium. Find the
least expensive combination of the cereals which would satisfy the MDR.
7. Redo Problem (6) with the added constraint of at least 12 mg of Sodium if each
ounce of X contains 3 mg of Sodium and every ounce of Y has 2 mg of Sodium.
8. Redo Problem (7) if Cereal X costs $.07 an ounce and Cereal Y costs $.08 an
ounce.
9. Consider the following group of inequalities along with a corresponding objective function. For each one, sketch the region of feasibility (except for 9 g) and
construct a scenario that might model each set of inequalities:
(a) x 0, y 0, 2x + 5y 10, 3x + 4y 12, F(x, y) = 100x + 55y
(b) x 0, y 0, x + y 40, x + 2y 60, G(x, y) = 7x + 6y
(c) x 2, y 3, x + y 40, x + 2y 60, H(x, y) = x + 3y
(d) x 0, y 0, x + y 600, 3x + y 900, x + 2y 1000, J(x, y) = 10x + 4y
(e) 2x + 9y 1800, 3x + y 750, K(x, y) = 4x + 11y
(f ) x + y 100, x + 3y 270, 3x + y 240, L(x, y) = 600x + 375y
(g) x 0, y 0, z 0, x + y + 2z 12, 2x + y + z 14, x + 3y + z 15,
M(x, y, z) = 2x + 3y + 4z (Do not sketch the region of feasibility for this
problem.)
4.3
4.3
135
(15)
P (0,0) = 0
P (0,600) = 36,000
P (450,150) = 31,500
P (500,0) = 25,000.
By the Fundamental Theorem of Linear Programming, we see that the maximum prot of $36,000 occurs if no X wagons are produced and 600 Y wagons are
made.
136
Chapter 4
An Introduction to Optimization
700
600
500
400
300
200
100
0
0
100
200
300
400
500
600
700
Figure 4.6
Example 2
given by
(16)
Then
R(0, 0) = 0
R(0,600) = 30,000
R(450,150) = 43,500
R(500,0) = 40,000.
We see, in this situation, that the maximum prot of $43,500 occurs if 450 X
wagons are produced, along with 150 Y wagons.
Example 3
given by
(17)
4.3
Then
137
L(0,0) = 0
L(0,600) = 45,000
L(450,150) = 45,000
L(500,0) = 37,500.
Note that we have two situations in which the prot is maximized at $45,000; in
fact, there are many points where this occurs. For example,
L(300, 300) = 45,000.
(18)
This occurs at any point along the constraint given by inequality (2). The reason
lies in the fact that coefcients of X and Y in (2) and in Equation (7) have the same
ratio.
(19)
The region of feasibility (same as Figure 4.5) is given below in Figure 4.7:
Evaluating our objective function (19) at the three vertices, we nd that
N(40, 0) = 40
N(0, 93) = 93
(20)
4X + 2Y 40
(21)
3X + 4Y 60
(22)
X0
(23)
Y 0.
(24)
138
Chapter 4
An Introduction to Optimization
100
80
60
40
20
0
0
20
40
60
80
100
Figure 4.7
The vertices of the region of feasibility are (0, 0), (0, 15), (4, 12), and (10, 0), as
seen below in Figure 4.8.
Note that (11) is maximized at Z(4, 12) = 52.
Suppose we now add a third constraint:
X + Y 30.
(25)
Figure 4.9 below reects this added condition. Note, however, that the region of
feasibility is not changed and the four vertices are unaffected by this redundant
constraint. It follows, therefore, that our objective function Z(X, Y ) = 4X + 3Y is
still maximized at Z(4, 12) = 52.
Remark 4 Sometimes a vertex does not have whole number coordinates (see
problem (15) below). If the physical model does not make sense to have a fractional or decimal answerfor example 2.5 bicycles or 1/3 carsthen we should
check the closest points with whole number coordinates, provided these points
lie in the region of feasibility. For example, if (2.3, 7.8) is the vertex which gives
the optimal value for an objective function, then the following points should be
checked: (2, 7), (2, 8), (3, 7) and (3, 8).
4.3
139
30
20
10
0
0
10
20
30
40
10
20
30
40
Figure 4.8
40
30
20
10
Figure 4.9
140
Chapter 4
An Introduction to Optimization
Problems 4.3
Using linear programming techniques, solve the following problems.
1. Section 4.2, Problem (1).
2. Section 4.2, Problem (2).
3. Section 4.2, Problem (3).
4. Section 4.2, Problem (4).
5. Section 4.2, Problem (5).
6. Section 4.2, Problem (6).
7. Section 4.2, Problem (7).
8. Section 4.2, Problem (8).
9. Section 4.2, Problem (9a); maximize F(x, y).
10. Section 4.2, Problem (9b); maximize G(x, y).
11. Section 4.2, Problem (9c); maximize H(x, y).
12. Section 4.2, Problem (9d); maximize J(x, y).
13. Section 4.2, Problem (9e); minimize K(x, y).
14. Section 4.2, Problem (9f); minimize L(x, y).
15. Maximize P(x, y) = 7x + 6y subject to the constraints x 0, y 0, 2x + 3y
1200 and 6x + y 1500.
4.4
4.4
141
with maximization problems. We will address minimization in the next, and nal,
section of this chapter.
Example 1
variables:
(26)
Note that we are using xi instead of the usual x and y, due to the fact that, in later
examples, we will have more than two independent variables.
Let us assume that the following constraints are imposed:
3x1 + 10x2 33,000
(27)
(28)
x1 0
(29)
x2 0.
(30)
(31)
(32)
and
We also incorporate these slack variables into our objective function (26), rewriting
it as:
7x1 22x2 + 0s1 + 0s2 + z = 0.
(33)
(34)
(35)
Remark 1 Admittedly, the Equations (33) through (35) seem somewhat strange.
However, the reader will soon see why we have recast these equations as they now
appear.
142
Chapter 4
An Introduction to Optimization
We are now ready to put these last three equations into a table known as the
initial tableau. This is nothing more than a kind of augmented matrix. To do this,
we merely detach the coefcients of the ve unknowns (x1 , x2 , s1 , s2 , and z) and
form the following table:
x2
s1
s2
10
8
22
1
0
0
0
1
0
0
0
1
x1
3
5
7
33,000
42,000 .
0
(36)
(37)
(38)
(39)
(40)
x1
1
2
4
4
x2
x3
s1
s2 s3
1
0
9
7
6
3
3
9
1
0
0
0
0
1
0
0
0
0
0
1
0
0
1
0
50
40
.
10
0
(41)
4.4
143
Change all inequalities into equations via the use of slack variables.
Rewrite the objective function, z, in terms of slack variables, setting one side
of the equation equal to zero and keeping the coefcient of z equal to +1.
The number of equations should equal the sum of the constraints plus one (the
equation given by the objective function).
Form the initial tableau, listing the constraints above the objective function,
labeling the columns, beginning with the decision variables, followed by the
slack variables, with z represented by the last column before the vertical bar.
The last column should have all the constants.
Locate the most negative number in the last row. If more than one equally negative number is present, arbitrarily choose any one of them. Call this number
k. This column will be called the work column.
Consider each positive element in the work column. Divide each of these
elements into the corresponding row entry element in the last column. The
ratio that is the smallest will be used as the work columns pivot. If there is
more than one smallest ratio, arbitrarily choose any one of them.
Use elementary row operations (see Chapter 2) to change the pivot element
to 1, unless it is already 1.
Use elementary row operations to transform all the other elements in the work
column to 0.
A column is reduced when all the elements are 0, with the exception of the
pivot, which is 1.
Repeat the process until there are no negative elements in the last row.
We are then able to determine the answers from this nal tableau.
x1
3
5
7
x2
10
8
22
s1 s2
1
0
0
0
0
1
0
1
0
33,000
42,000 .
0
(42)
We rst note that 22 is the most negative number in the last row of (42). So the
x2 column is our work column.
We next divide 33,000 by 10 = 3300 and 42,000 by 8 = 5250; since 3300 is the
lesser positive number, we will use 10 as the pivot. Note that we have put a carat
144
Chapter 4
An Introduction to Optimization
x2
x1
3
5
7
10
8
22
s1 s2
1
0
0
0
0
1
0
1
0
33,000
42,000 .
0
(43)
We now divide every element in the row containing the pivot by 10.
x1
x2
s1
s2
0.3
5
7
1
8
22
0.1
0
0
0
1
0
0
0
1
3300
42,000 .
0
(44)
Next, we use elementary row operations; we multiply the rst row by 8 and add
it to the second row and multiply the rst row by 22 and add it to the third row.
This will give us a 0 for every element (other than the pivot) in the work column.
x1
0.3
2.6
0.4
x2
s1
s2
1
0
0
0.1
0.8
2.2
0
1
0
0
0
1
3300
15,600 .
72,600
(45)
And now we repeat the process because we still have a negative entry in the
last row; that is, 0.4 is in the x1 column. Hence, this becomes our new work
column.
Dividing 3300 by 0.3 yields 11,000; dividing 15,600 by 2.6 gives us 6000; since
6000 is the lesser of the two positive ratios, we will use the 2.6 entry as the pivot
(again denoting it with a carat, and removing the carat from our rst pivot).
x1
x2
s1
s2
0.3
2.6
1
0
0
0.1
0.8
2.2
0
1
0
0
0
1
0.4
3300
15,600 .
72,600
(46)
Dividing each element in this row by 2.6 gives us the following tableau:
x1
0.3
1
0.4
x2
s1
1
0
0
0.1
.31
2.2
s2
0
.38
0
0
0
1
3300
6000 .
72,600
(47)
Using our pivot and elementary row operations, we transform every other element
in this work column to 0. That is, we multiply each element in the second row by
4.4
145
0.3 and add the row to the rst row and we multiply every element in the second
row by 0.4 and add the row to the last row. This gives us the following tableau:
x2
x1
0
1
0
s1
1
0
0
0.19
.31
2.08
s2
0.12
.38
0.15
z
0
0
1
1500
6000 .
75,000
(48)
We are now nished with the process, because there are no negative elements in
the last row. We interpret this nal tableau as follows:
Both slack variables equal 0. To verify this, please see Equations (31) and (32)
and substitute our values for x1 and x2 into these equations.
The maximum value of z is 75,000 (found in the lower right-hand corner box).
(49)
3x1 + 4x2 + s2 = 60
(50)
x1 2x2 + z = 0.
(51)
and
x1
4
3
1
x2
2
4
2
s1 s2
1
0
0
0
0
1
0
1
0
40
60 .
0
(52)
The second column will be our work column, since 2 is the most negative entry.
Dividing 40 by 2 gives 20; dividing 60 by 4 yields 15. Since 15 is a lesser positive
146
Chapter 4
An Introduction to Optimization
4
3
1
x2
2
4
2
s1 s2
1
0
0
0
0
1
0
1
0
40
60 .
0
(53)
Dividing every element of the second row will make our pivoting element 1:
x1
x2
s1
s2
4
0.75
1
2
1
2
1
0
0
0
0.25
0
0
0
1
40
15 .
0
(54)
We now use our pivot, along with the proper elementary row operations, to make
every other element in the column zero. This leads to the following tableau:
x1
2.5
0.75
0.5
x2 s1
0
1
0
1
0
0
s2
0.5
0.25
0.5
0
0
1
10
15 .
30
(55)
Since the last row has no negative entries, we are nished and have the nal
tableau:
x1
2.5
0.75
0.5
x2 s1
0
1
0
1
0
0
s2
0.5
0.25
0.5
0
0
1
10
15 .
30
(56)
(57)
Which forces both x1 and s2 to be zero, since neither can be negative. This forces
s1 = 10, as we can infer from the equation represented by the rst row:
0.25x1 + s1 0.5s2 = 10.
(58)
In practice, we are not concerned with the values of the slack variables, so we summarize by simply saying that our answers are x1 = 0 and x2 = 15 with a maximum
value of z = 30.
4.5
Final Comments
147
Problems 4.4
Using the Simplex Method, solve the following problems:
1. Section 4.2, Problem (1).
2. Section 4.2, Problem (2).
3. Maximize z = 3x1 + 5x2 , subject to x1 + x2 6 and 2x1 + x2 8.
4. Maximize z = 8x1 + x2 , subject to the same constraints in (3).
5. Maximize z = x1 + 12x2 , subject to the same constraints in (3).
6. Maximize z = 3x1 + 6x2 , subject to the constraints x1 + 3x2 30, 2x1 + 2x2
40, and 3x1 + x2 30.
7. Consider problem (9) at the end of Section 4.2. Set up the initial tableaus for
problems (9a) through (9d).
4.5
148
Chapter 4
An Introduction to Optimization
5
Determinants
5.1
Introduction
Every square matrix has associated with it a scalar called its determinant. To be
extremely rigorous we would have to dene this scalar in terms of permutations
on positive integers. However, since in practice it is difcult to apply a denition
of this sort, other procedures have been developed which yield the determinant in
a more straightforward manner. In this chapter, therefore, we concern ourselves
solely with those methods that can be applied easily. We note here for reference
that determinants are only dened for square matrices.
Given a square matrix A, we use det(A) or |A| to designate its determinant. If
the matrix can actually be exhibited, we then designate the determinant of A by
replacing the brackets by vertical straight lines. For example, if
1
A = 4
7
then
2
5
8
#
#1
#
det(A) = ##4
#7
3
6
9
2
5
8
#
3##
6##.
9#
(1)
(2)
We cannot overemphasize the fact that (1) and (2) represent entirely different
animals. (1) represents a matrix, a rectangular array, an entity unto itself while (2)
represents a scalar, a number associated with the matrix in (1). There is absolutely
no similarity between the two other than form!
We are now ready to calculate determinants.
Denition 1 The determinant of a 1 1 matrix [a] is the scalar a.
Thus, the determinant of the matrix [5] is 5 and the determinant of the matrix
[3] is 3.
149
150
Chapter 5
Determinants
a
c
b
d
Find det(A) if
1
A=
4
2
.
3
Solution
#
#1
det(A) = ##
4
Example 2
#
2##
= (1)(3) (2)(4) = 3 8 = 5.
3#
Find |A| if
A=
2
4
1
.
3
Solution
#
1##
= (2)(3) (1)(4) = 6 + 4 = 10.
3#
#
#2
|A| = ##
4
We now could proceed to give separate rules which would enable one to compute determinants of 3 3, 4 4, and higher order matrices. This is unnecessary.
In the next section, we will give a method that enables us to reduce all determinants of order n(n > 2) (if A has order n n then det(A) is said to have order n)
to a sum of determinants of order 2.
Problems 5.1
In Problems 1 through 18, nd the determinants of the given matrices.
3 4
3 4
3 4
1.
,
2.
,
3.
,
5 6
5
6
5 6
4.
5
7
6
,
8
5.
5
7
6
,
8
6.
5
7
6
,
8
5.1
1
,
7
0 1
10.
,
2 6
12 20
13.
,
3 5
t 2
16.
,
3 4
7.
151
Introduction
1
2
3
,
4
2
3
11.
,
4 4
36 3
14.
,
12 1
2t 3
17.
,
2 t
8.
2
4
1
,
8
9 0
12.
,
2 0
8 3
15.
,
7
9
3t t 2
18.
.
2
t
9.
3
3
#
2t ##
= 0.
t#
#
t ##
= 0.
t + 2#
1
2
3
1
and
B=
4 2
.
1 2
152
Chapter 5
Determinants
29. The second elementary row operation is to multiply any row of a matrix
by a nonzero constant. Apply this operation to the matrices given in Problems 1 through 15 for any constants of your choice, and calculate the new
determinants. How do they compare with the determinants of the original
matrix?
30. Redo Problem 29 for the third elementary row operation.
31. What is the determinant of a 2 2 matrix if one row or one column contains
only zero entries?
32. What is the relationship between the determinant of a 2 2 matrix and its
transpose?
33. What is the determinant of a 2 2 matrix if one row is a linear combination
of the other row?
5.2
Expansion by Cofactors
Denition 1 Given a matrix A, a minor is the determinant of any square
submatrix of A.
That is, given a square matrix A, a minor is the determinant of any matrix
formed from A by the removal of an equal number of rows and columns. As an
example, if
1 2 3
A = 4 5 6,
7 8 9
then
#
#1
#
#7
1
7
#
2##
8#
2
8
and
and
and
1
8
#
#5
#
#8
2
9
#
6##
9#
5
8
6
9
#
#1
#
2#
5.2
Expansion by Cofactors
153
A more useful concept for our immediate purposes, since it will enable us to
calculate determinants, is that of the cofactor of an element of a matrix.
Denition 2 Given a matrix A = aij , the cofactor of the element aij is a
scalar obtained by multiplying together the term (1)i + j and the minor obtained
from A by removing the ith row and jth column.
In other words, to compute the cofactor of the element aij we rst form a
submatrix of A by crossing out both the row and column in which the element aij
appears. We then nd the determinant of the submatrix and nally multiply it by
the number (1)i + j .
Example 1
1 2 3
A = 4 5 6.
7 8 9
Solution We rst note that 4 appears in the (2, 1) position. The submatrix
obtained by crossing out the second row and rst column is
1 2 3
4 5 6 = 2 3 ,
8 9
7 8 9
which has a determinant equal to (2)(9) (3)(8) = 6. Since 4 appears in the (2, 1)
position, i = 2 and j = 1. Thus, (1)i+j = (1)2+1 = (1)3 = (1). The cofactor
of 4 is (1)(6) = 6.
Example 2
Solution The element 9 appears in the (3, 3) position. Thus, crossing out the
third row and third column, we obtain the submatrix
1 2 3
4 5 6 = 1 2 .
4 5
7 8 9
which has a determinant equal to (1)(5) (2)(4) = 3. Since, in this case, i = j =
3, the cofactor of 9 is (1)3 + 3 (3) = (1)6 (3) = 3.
We now have enough tools at hand to nd the determinant of any matrix.
Expansion by Cofactors. To nd the determinant of a matrix A of arbitrary order,
(a) pick any one row or any one column of the matrix (dealers choice), (b) for
154
Chapter 5
Determinants
each element in the row or column chosen, nd its cofactor, (c) multiply each
element in the row or column chosen by its cofactor and sum the results. This sum
is the determinant of the matrix.
Example 3
Find det(A) if
3
A = 1
3
Solution
5
2
6
0
1.
4
Example 4
det(A).
Solution
|A| = 3(cofactor of 3) + 5(cofactor of 5) + 0(cofactor of 0)
#
#
#
#
#
#
#
#
1 + 1 # 2 1#
1 + 2 #1 1#
= (3)(1)
#6 4# + 5(1)
# 3 4# + 0
= (3)(1)(8 + 6) + (5)(1)(4 3)
= (3)(14) + (5)(7) = 42 + 35 = 77.
The previous examples illustrate two important properties of the method. First,
the value of the determinant is the same regardless of which row or column we
choose to expand by and second, expanding by a row or column that contains
zeros signicantly reduces the number of computations involved.
Example 5
Find det(A) if
1 0
1 4
A =
3 0
2 1
5 2
1 0
.
4 1
1 3
5.2
155
Expansion by Cofactors
Solution We rst check to see which row or column contains the most zeros and
expand by it. Thus, expanding by the second column gives
|A| = 0(cofactor of 0) + 4(cofactor of 4) + 0(cofactor of 0) + 1(cofactor of 1)
#
#
#
#
# 1 5 2#
# 1 5 2#
#
#
#
#
= 0 + 4(1)2+2 ## 3 4 1## + 0 + 1(1)4+2 ##1 1 0##
#2 1 3#
# 3 4 1#
#
#
# #
# 1 5 2# # 1 5 2#
#
#
# #
= 4 ## 3 4 1## + ##1 1 0##.
#2 1 3# # 3 4 1#
Using expansion by cofactors on each of the determinants of order 3 yields
#
# 1 5
#
# 3 4
#
#2 1
#
#
2##
#4
1## = 1(1)1+1 ##
1
3#
= 22
#
#
#
1##
1+2 # 3
+
5(1)
#
#2
3
#
#
#
1##
1+3 # 3
+
2(1)
#
#2
3
#
4##
1#
and
#
# 1
#
#1
#
# 3
5
1
4
#
#
2##
#1
0## = 2(1)1+3 ##
3
1#
= 8
#
#
#
1##
3+3 # 1
+
0
+
1(1)
#
#1
4
#
5##
1#
Hence,
|A| = 4(22) 8 = 88 8 = 96.
For n n matrices with n > 3, expansion by cofactors is an inefcient procedure for calculating determinants. It simply takes too long. A more elegant
method, based on elementary row operations, is given in Section 5.4 for matrices
whose elements are all numbers.
Problems 5.2
In Problems 1 through 22, use expansion by cofactors to evaluate the determinants
of the given matrices.
1 2 2
3 2 2
1 2 2
3,
4,
3 3,
1. 0 2
2. 1 0
3. 7
0 0 3
2 0 3
0
0
0
156
Chapter 5
Determinants
4.
7.
10.
13.
16.
18.
20.
22.
2 0 1
1 1
1,
3 2 3
2
1 9
3 1
1,
3 1
2
2
1 3
3 1 2,
2
3 5
4
0
0
2 1
0,
3
1 2
4
0
0 0
1 5
0 0
,
2
1 2 0
3
1 2 1
1
1
2 2
1
5
2 1
,
2 2
1
3
3
4 1
8
1
1 0 2
1
5 0 1
,
2 2 0
3
3
4 0
8
11
1 0 9 0
2
1 1 0 0
4 1 1 0 0.
3
2 2 1 0
0
0 1 2 0
5.
8.
11.
14.
3
1
2
1
1
1
1
4
1
1
1
5
5
0
2
3
1
1
3
5
3
3
4
3
17.
19.
21.
2
1 3 3
4,
8
3,
6. 2
7
4
5
0
3
1 3 3
4,
8
4,
9. 2
2
3
5
1
3
1
2 3
6,
5
1,
12. 5
3
2 5 1
2
3 2 0
1,
1 2,
15. 1
8
3
4 1
1 2
1
2
1 0
3 1
,
2 2 1
1
2 0 3
2
1
3
2 2
1 5 4
6
,
3 6
1
1
3 4
3 3
1
2 1 1
4
0 3
0
,
1
1 0
5
2 2 1
1
23. Use the results of Problems 1, 13, and 16 to develop a theorem about the
determinants of triangular matrices.
24. Use the results of Problems 3, 20, and 22 to develop a theorem regarding
determinants of matrices containing a zero row or column.
25. Find det(A I) if A is the matrix given in Problem 2.
26. Find det(A I) if A is the matrix given in Problem 3.
27. Find det(A I) if A is the matrix given in Problem 4.
28. Find det(A I) if A is the matrix given in Problem 5.
5.3
5.3
157
Properties of Determinants
Properties of Determinants
In this section, we list some useful properties of determinants. For the sake of
expediency, we only give proofs for determinants of order three, keeping in mind
that these proofs may be extended in a straightforward manner to determinants
of higher order.
Property 1 If one row of a matrix consists entirely of zeros, then the determinant
is zero.
Proof. Expanding by the zero row, we immediately obtain the desired result.
a11
A = a21
a31
a12
a22
a32
a13
a23 .
a33
a11
B = a31
a21
a12
a32
a22
a13
a33 .
a23
158
Chapter 5
Determinants
Proof. If we interchange the two identical rows of the matrix, the matrix remains
unaltered; hence the determinant of the matrix remains constant. From Property
2, however, by interchanging two rows of a matrix, we change the sign of the
determinant. Thus, the determinant must on one hand remain the same while on
the other hand change the sign. The only way both of these conditions can be met
simultaneously is for the determinant to be zero.
Property 4 If the matrix B is obtained from the matrix A by multiplying every
element in one row of A by the scalar , then |B| = |A|.
Proof.
#
#a11
#
# a21
#
# a31
a12
a22
a32
#
#
a13 ##
#a
#
a23 # = a11 ## 22
a32
a33 #
#
#
#a
a23 ##
a12 ## 21
#
a33
a31
( #
#a
= a11 ## 22
a32
#
#a11
#
= ##a21
#a31
a12
a22
a32
#
#
#a21
a23 ##
#
a
12 #
#
a33
a31
#
#
#a
a23 ##
+ a13 ## 12
#
a33
a31
#
a22 ##
a32 #
#
#
#a21
a23 ##
#
+
a
13 #
#
a33
a31
#)
a22 ##
a32 #
#
a13 ##
a23 ##.
a33 #
2
8 16
=
,
4
24 32
#
#1
8##
3
# #
2## ## 1
=
4# #24
1
3
in determinants we have
#
2##
,
32#
or alternatively
#
#1
8##
3
#
#
#1
2##
#
=
4(2)
#
#3
4
#
#
#2
2##
#
=
4
#
#3
4
# #
4## ## 2
=
4# #12
#
4##
.
16#
5.3
159
Properties of Determinants
a31
#
#a11 a12
#
= ##a21 a22
#a31 a32
#
# a11
#
= ()()## a21
#a31
a13
a11 a12 a13
a23 = det a21 a22 a23
a33
a31 a32 a33
#
#
#
# a11
a13 ##
a12
a13 ##
#
a23 ## = ##a21 a22 a23 ##
#a31 a32 a33 #
a33 #
#
#
#
#a11 a12 a13 #
a12
a13 ##
#
#
a22
a23 ## = ()()##a21 a22 a23 ##
#a31 a32 a33 #
a32 a33 #
a12
a22
a32
= 3 det(A).
Note that for a 3 3 matrix, n = 3.
Property 6 If a matrix B is obtained from a matrix A by adding to one row of
A, a scalar times another row of A, then |A| = |B|.
Proof. Let
a11
A = a21
a31
a12
a22
a32
a13
a23
a33
and
a11
B = a21
a31 + a11
a12
a22
a32 + a12
a13
,
a23
a33 + a13
where B has been obtained from A by adding times the rst row of A to the
third row of A. Expanding |B| by its third row, we obtain
#
#a
|B| = (a31 + a11 )## 12
a22
#
#a
+ (a33 + a13 )## 11
a21
#
#a
= a31 ## 12
a22
#
a13 ##
a23 #
#
a12 ##
a22 #
#
#
#a
a13 ##
a32 ## 11
a23 #
a21
#
#a
+ a11 ## 12
a22
#
#
#a
a13 ##
(a32 + a12 )## 11
a23 #
a21
#
#
#a
a13 ##
+ a33 ## 11
a23 #
a21
#
#
#a
a13 ##
a12 ## 11
a23 #
a21
#
a12 ##
a22 #
#
#
#a
a13 ##
+ a13 ## 11
a23 #
a21
#!
a12 ##
.
a22 #
160
Chapter 5
Determinants
The rst three terms of this sum are exactly |A| (expand |A| by its third row), while
the last three terms of the sum are
#
#a11
#
##a21
#a11
#
a13 ##
a23 ##
a13 #
a12
a22
a12
a12
a22
a12
#
a13 ##
a23 ##.
a13 #
From Property 3, however, this second determinant is zero since its rst and third
rows are identical, hence |B| = |A|.
The same type of argument will quickly show that this result is valid regardless
of the two rows chosen.
b
s
y
b
s
y
# #
c## ## a r
t ## = ##r + 2x
z# # x
bs
s + 2y
y
#
c t ##
t + 2z##.
z #
# #
c## ##a r b s c t ##
by adding to the rst
s
t ##,
row (1 ) times the
t ## = ## r
z# # x
y
z #
second row
#
#
#ar
bs
c t ## by adding to the
#
#
second row ( 2) times the
= #r + 2x s + 2y t + 2z##.
# x
y
z #
third row
a11
A = a21
a31
a12
a22
a32
a13
a23 ,
a33
then
a11
AT = a12
a13
a21
a22
a23
a31
a32 .
a33
5.3
161
Properties of Determinants
#
#
#a
a32 ##
a12 ## 21
#
a33
a23
#
#
#a
a31 ##
+ a13 ## 21
#
a33
a22
#
a31 ##
a32 #
= a11 (a22 a33 a32 a23 ) a12 (a21 a33 a31 a23 ) + a13 (a21 a32 a31 a22 ).
This, however, is exactly
# # the expression we would obtain if we expand det(A) by
the rst row. Thus #AT # = |A|.
It follows from Property 7 that any property about determinants dealing with
row operations is equally true for column operations (the analogous elementary
row operation applied to columns), because a row operation on AT is the same
as a column operation on A. Thus, if one column of a matrix consists entirely of
zeros, then its determinant is zero; if two columns of a matrix are interchanged, the
determinant changes the sign; if two columns of a matrix are identical, its determinant is zero; multiplying a determinant by a scalar is equivalent to multiplying one
column of the matrix by that scalar and then calculating the new determinant; and
the third elementary column operation when applied to a matrix does not change
its determinant.
Property 8 The determinant of a triangular matrix, either upper or lower, is the
product of the elements on the main diagonal.
Proof. See Problem 2.
Property 9 If A and B are of the same order, then det(A) det(B) = det(AB).
Because of its difculty, the proof of Property 9 is omitted here.
Example 2
Solution
2
1
3
4
and
B=
6 1
.
7
4
33
AB =
34
10
15
thus
Problems 5.3
1. Prove that the determinant of a diagonal matrix is the product of the elements
on the main diagonal.
162
Chapter 5
Determinants
# #
x## ##a
y## = ##b
z # #c
#
x##
y##.
z#
r
s
t
2
A = 5
2
1
1
1
0
3.
1
#
1##
2#
and
#
#
#a
x##
#
#
2y# = 12##b
#c
z#
#
#3
B = ##
2
#
x##
y##.
z#
r
s
t
#
#
#a
x 3y##
#
#
y 2z# = 5##b
#c
5z #
#
#
#a
c##
#
#
t # = ##b
#c
z#
#
1##
.
1#
b 3y
b + 5y
y
x
y
z
4b
2s
2y
r
s
t
#
r ##
s##.
t#
#
2c##
t ##.
z#
#
c 3z##
c + 5z## = 0.
z #
#
x##
y##.
z#
5.4
163
Pivotal Condensation
3a
3r
3x
#
c##
t ## = 0.
z#
12. Prove that if one column of a square matrix is a linear combination of another
column, then the determinant of that matrix is zero.
$
%
13. Prove that if A is invertible, then det A1 = 1/ det(A).
5.4
Pivotal Condensation
Properties 2, 4, and 6 of the previous section describe the effects on the determinant of a matrix of applying elementary row operations to the matrix itself. They
comprise part of an efcient algorithm for calculating determinants of matrices
whose elements are numbers. The technique is known as pivotal condensation: A
given matrix is transformed into row-reduced form using elementary row operations. A record is kept of the changes to the determinant as a result of Properties 2,
4, and 6. Once the transformation is complete, the row-reduced matrix is in upper
triangular form, and its determinant is found easily by Property 8. In fact, since
a row-reduced matrix has either unity elements or zeros on its main diagonal, its
determinant will be unity if all its diagonal elements are unity, or zero if any one
diagonal element is zero.
Example 1
Solution
#
# 1
#
#2
#
# 3
# #
2 3## ##1
3 2## = ##0
1 1# #3
#
#1
#
= ##0
#0
#
#1
#
= 7##0
#0
#
2 3##
7 8##
1 1#
2
7
7
2
1
7
#
3##
8##
8#
#
3#
8 ##
7#
8#
Property 6: adding to
the second row (2)
Property 6: adding to
the third row (3)
Property 4: applied
to the second row
164
Chapter 5
Determinants
#
#
#1 2 3 #
#
#
= 7##0 1 87 ##
#0 0 0 #
= 7(0) = 0.
Example 2
Property 6: adding
to the third row (7)
1
5
2
#
4##
1##.
3#
Solution
#
# 0 1
#
# 1 5
#
#6
2
#
#
#
# 1 5
1##
4##
#
1## = (1)## 0 1
4##
#6
2 3#
3#
#
#
#1
5 1##
#
1 4##
= (1)##0
#0 28 3#
#
#
#1
5
1##
#
1 4##
= (1)(1)##0
#0 28
3#
#
#
#1 5
1##
#
#
1
4##
= #0
#0
0 109#
#
#
#1 5
1##
#
1 4##
= (109)##0
#0
0
1#
= (109)(1) = 109.
Property 2: interchanging
the rst and second rows
Property 6: adding
to the third row (6)
Property 4: applied
to the second row
Property 6: adding
to the third row (28)
Property 4: applied
to the third row
Property 8
5.4
165
Pivotal Condensation
Evaluate
#
# 10
#
# 6
#
#10
Solution
#
# 10 6
#
# 6 5
#
#10
9
6
5
9
(Property 6)
#
# #
9## ##10 6 9##
7## = ## 6 5 7##
3
3#
12# # 0
#
#
#10 6 3#
#
#
= ## 6 5 2##
# 0
3
0#
#
#
#10 3#
#
= 3##
6 2#
= 3(20 + 18) = 6.
Example 4
#
9##
7##.
12#
*
by expansion by cofactors
Evaluate
#
#3
#
#0
#
#3
#
#9
1 0
1 4
2 3
7 0
#
2##
1##
.
5##
2#
Solution Since the third column already contains two zeros, it would seem
advisable to work on that one.
#
# #
#
1 0 2 #
#3 1 0 2# #3
#
# #
#
by
adding
43 times
1 4 1#
#0
1 4 1## ##0
#
#
the second row to
#3 2 3 5# = #3 11 0 17 #
4
4 #
# #
#
the third row.
#9
7 0 2 # #9
7 0 2#
#
#
#3
1 2##
#
by expansion
#
17 #
= 4#3 11
#
4
4 #
by cofactors
#
#9
7 2#
#
#
1 2## *
## 3
= 4 41 ##12 11 17## by Property 4
# 9
7 2#
166
Chapter 5
Determinants
#
#3 1
#
= (1) ##0 7
#9
7
#
#3 1
#
= (1)##0 7
#0 10
#
#7
= (1)(3)##
10
#
2##
9##
2#
second row
third row
by expansion by
cofactors
#
2##
9##
4#
#
9##
4#
Problems 5.4
In Problems 1 through 18, evaluate the determinants of the given matrices.
1 2 2
1 2 3
3 4
2
3,
5
7,
1. 1 3
2. 4 5 6,
3. 1
2 5
0
7 8 9
1
9 6
3 3
1 4,
1 2
1
4. 1
1
2
7. 3
2
2
10. 1
3
1
5. 2
3
3
2 ,
5
1
1
3
0
1
2
1 3
8. 4 5
1 3
1
1,
3
3
5
2
1
13.
5
4
8 3
1
1
1
5
15.
2 2
3
4
1
1
1
5
17.
2 2
3
4
3
11. 1
2
4 6
0 7
,
7 2
1 1
2
2
1
1
0
0
0
0
3
8
5
2
1
,
3
8
2
1
,
3
8
1
1
14.
2
2
1
1
16.
3
3
2
4
18.
3
5
5
0
2
3
4,
1
3
6,
3
2
4,
7
2
1
,
1
2
3
2 2
5 4
6
,
6
1
1
4
3 3
0 1
3
0 2 2
.
1 0
1
4 1
7
2
0
2
0
1
3
1
3
2
6. 3
3
1
9. 5
2
1
12. 2
4
1
1
1
9
1,
2
2
5
5
3
1,
1
3 3
8
3,
5
0
5.5
167
Inversion
19. What can you say about the determinant of an n n matrix that has rank less
than n?
20. What can you say about the determinant of a singular matrix?
5.5
Inversion
As an immediate consequence of Theorem 1 of Section 3.2 and the method of
pivotal condensation, we have:
Theorem 1 A square matrix has an inverse if and only if its determinant is not
zero.
In this section, we develop a method to calculate inverses of nonsingular matrices using determinants. For matrices with order greater than 3 3, this method is
less efcient than the one described in Section 3.2, and is generally avoided.
Denition 1 The cofactor matrix associated with an n n matrix A is an n n
matrix Ac obtained from A by replacing each element of A by its cofactor.
Example 1
Find Ac if
3 1 2
A = 2 5 4.
1 3 6
Solution
#
#
1+1 #5
(1)
#3
2+1 #1
Ac =
#3
(1)
#1
3+1
#
(1)
#5
#
#
#
#
#
#
#
4##
1+2 #2 4# (1)1+3 #2
(1)
#
#
#
# 1
6
1 6
#
#
#
#
#3 2#
#3
2##
2+2
2+3
#
#
#
(1)
(1)
#1 6#
#1
6#
#
#
#
#
# 3 2#
#
2##
# (1)3+3 # 3
(1)3+2 ##
#2
4#
2 4#
18
16 11
16 8.
Ac = 0
6 16
17
#
5##
3#
#
1##
,
3#
#
1##
5#
If A = aij , we will use the notation Ac = [aijc ] to represent the cofactor matrix.
Thus aijc represents the cofactor of aij .
Denition 2 The adjugate of an n n matrix A is the transpose of the cofactor
matrix of A.
168
Chapter 5
Determinants
Solution
18
A = 16
11
0
16
8
6
16.
17
Aa
A
|A|
(
=
)
Aa
A = I.
|A|
1 a
A
|A|
if |A| = 0.
That is, if |A| = 0, then A1 may be obtained by dividing the adjugate of A by the
determinant of A.
Example 3
Aa
|A|
Example 4
18
= 1/48 16
11
0
16
8
6
3/8
16 = 1/3
17
11/48
Find A1 if
5
A = 0
4
8
2
3
1
1.
1
0
1/3
1/6
1/8
1/3 .
17/48
5.5
169
Inversion
Solution
5
%
$
T
Aa = Ac = 4
8
5 11
6
Aa
9
5,
= 4
=
|A|
8 17 10
5
Ac = 11
6
A1
Example 5
4
9
5
8
17,
10
6
5,
10
Find A1 if
A=
Solution
11
9
17
1
3
2
.
4
$ c %T
3
4 2
a
, A = A
=
,
1
3
1
4 2 2
Aa
1
1
=
= 2
=
3
1 .
3
1
|A|
2 2
4
A =
2
c
A1
Problems 5.5
In Problems 1 through 15, nd the inverses of the given matrices, if they exist.
1 21
4 4
1 1
1.
,
2.
,
3.
,
1
1
4 4
3 4
2
3
2 1
8 3
2 1
4.
,
5.
,
6.
,
3
4
5 2
4 2
1 1 0
0 0 1
2 0 1
2,
7. 1 0 1,
8. 1 0 0,
9. 0 1
0 1 1
0 1 0
3 1
1
1 2 3
2 0 0
1
2
1
10. 4 5 6,
11. 5 1 0,
12. 3 2 4,
7 8 9
4 1 1
2
3 1
2
4
3
5
0 1
3 1
1
2 ,
13. 3 4 4,
14. 2 1
15. 1 3 1.
5
0 1
2
3 1
2 3 1
170
Chapter 5
Determinants
a
A=
c
b
d
5.6
Cramers Rule
Cramers rule is a method, based on determinants, for solving systems of simultaneous linear equations. In this section, we rst state the rule, then illustrate its
usage by an example, and nally prove its validity using the properties derived in
Section 5.3. We also discuss the many limitations of the method.
Cramers rule states that given a system of simultaneous linear equations in
the matrix form Ax = b (see Section 1.3), the ith component of x (or equivalently
the ith unknown) is the quotient of two determinants. The determinant in the
numerator is the determinant of a matrix obtained from A by replacing the ith
column of A by the vector b, while the determinant in the denominator is just |A|
Thus, if we are considering the system
a11 x1 + a12 x2 + a13 x3 = b1 ,
a21 x1 + a22 x2 + a23 x3 = b2 ,
a31 x1 + a32 x2 + a33 x3 = b3 ,
where x1 , x2 , and x3 represent the unknowns, then Cramers rule states that
#
#
#
#
#b1 a12 a13 #
#a11 b1 a13 #
#
#
#
#
#b2 a22 a23 #
#a21 b2 a23 #
#
#
#
#
#b3 a32 a33 #
#a31 b3 a33 #
x1 =
, x2 =
,
|A|
|A|
#
#
#a11 a12 b1 #
#
#
#a21 a22 b2 #
#
#
#
#
#a11 a12 a13 #
#a31 a32 b3 #
#
#
x3 =
, where |A| = ##a21 a22 a23 ##.
|A|
#a
a
a #
31
32
33
Two restrictions on the application of Cramers rule are immediate. First, the
systems under consideration must have exactly the same number of equations as
5.6
171
Cramers Rule
unknowns to insure that all matrices involved are square and hence have determinants. Second, the determinant of the coefcient matrix must not be zero since
it appears in the denominator. If |A| = 0, then Cramers rule cannot be applied.
Example 1
Solution
1
0
A =
2
1
3 1
3 1
,
1 1
1 1
2
1
3
0
x
y
x = ,
z
w
5
6
b = .
4
1
#
1##
1##
#
1#
#
1#
0
= 0,
20
40
= 2,
20
y=
w=
#
#1
#
#0
#
#
#2
#
#1
#
#1
#
#0
#
#
#2
#
#1
2
1
3
0
5 3
6
3
4
1
1
1
20
3
3
1
1
20
#
5##
6##
#
4#
#
1#
#
1##
1##
#
1#
#
1#
20
= 1,
20
20
= 1.
20
We now derive Cramers rule using only those properties of determinants given
in Section 5.3. We consider the general system Ax = b where
a1n
a2n
a3n
,
..
.
a11
a21
A = a31
..
.
a12
a22
a32
..
.
a13
a23
a33
..
.
an1
an2
an3
amn
x1
x2
x = x3 ,
..
.
xn
and
b1
b2
b = b3 .
..
.
bn
172
Chapter 5
Determinants
Then
#
#
#a11 x1 a12 a13 . . . a1n #
#
#
#a21 x1 a22 a23 . . . a2n #
#
# *
#
#
by Property 4 modied to columns
x1 |A| = #a31 x1 a32 a33 . . . a3n #
# ..
..
..
.. #
# .
#
.
.
.
#
#
#an1 x1 an2 an3 . . . ann #
#
#
# a11 x1 + a12 x2 a12 a13 . . . a1n #
#
#
# a21 x1 + a22 x2 a22 a23 . . . a2n #
#
# by adding (x2 ) times
#
#
the second column to
= # a31 x1 + a32 x2 a32 a33 . . . a3n #
#
..
..
..
.. # the rst column
#
#
.
.
.
. #
#
#an1 x1 + an2 x2 an2 an3 . . . ann #
#
#
# a11 x1 + a12 x2 + a13 x3 a12 a13 . . . a1n #
#
#
by adding (x3 )
# a21 x1 + a22 x2 + a23 x3 a22 a23 . . . a2n #
#
#
a12
a22
a32
..
.
a13
a23
a33
..
.
an2
an3
#
a1n ##
a2n ##
a3n ##
.. #
. ##
ann #
or
x1 =
#
#b1
#
#b2
#
# ..
#.
#
#bn
#
a1n ##
a2n ##
.. #
. ##
an2 ann #
|A|
a12
a22
..
.
5.7
173
Final Comments
Problems 5.6
Solve the following systems of equations by Cramers rule.
1.
x + 2y = 3,
3x + y = 1.
2. 2x + y = 3,
x y = 6.
3. 4a + 2b = 0,
5a 3b = 10.
4.
3s 4t = 30,
2s + 3t = 10.
5. 2x 8y = 200,
x + 4y = 150.
6.
x + y 2z = 3,
2x y + 3z = 2.
7. x + y = 15,
x + z = 15,
y + z = 10.
8. 3x + y + z = 4,
x y + 2z = 15,
2x 2y z = 5.
9.
x + 2y 2z = 1,
2x + y + z = 5,
x + y z = 2.
11. 2x + 3y + 2z = 3,
3x + y + 5z = 2,
7y 4z = 5.
13.
5.7
x
3x
2x
x
+ 2y
+ 4y
+ y
3y
+ z+
2z
z+
+ 4z +
10. 2a + 3b c = 4,
a 2b + c = 2,
3a b
= 2.
12. 5r + 8s + t = 2,
2s + t = 1,
4r + 3s t = 3.
w = 7,
4w = 13,
w = 4,
5w = 0.
174
Chapter 5
Determinants
a11
A = a21
a31
a12
a22
a32
a13
a23 .
a33
Consider the case in which we multiply every element of the third row by the
cofactor of the corresponding element in the second row and then sum the results.
Thus,
a31 (cofactor of a21 ) + a32 (cofactor of a22 ) + a33 ( cofactor of a23 )
#
#
#
#
#
#
#
#
#
#
#
#
3 #a12 a13 #
4 #a11 a13 #
5 #a11 a12 #
+ a32 (1) #
+ a33 (1) #
= a31 (1) #
a32 a33 #
a31 a33 #
a31 a32 #
#
#
#a11 a12 a13 #
#
#
*
= ##a31 a32 a33 ## = 0 from Property 3, Section 5.3
#a31 a32 a33 #
Note that this property is equally valid if we replace the word row by the word
column.
Theorem 1 AAa = |A|I.
Proof. We prove this theorem only for matrices of order 3 3. The proof easily
may be extended to cover matrices of any arbitrary order. This extension is left as
an exercise for the student.
c
c
c
a21
a31
a11
a11 a12 a13
c
c
c .
a22
a32
AAa = a21 a22 a23 a12
c
c
c
a31 a32 a33
a13
a23
a33
If we denote this product matrix by bij , then
c
c
c
b11 = a11 a11
+ a12 a12
+ a13 a13
,
c
c
c
b12 = a11 a21
+ a12 a22
+ a13 a23
,
c
c
c
b23 = a21 a31
+ a22 a32
+ a23 a33
,
c
c
c
b22 = a21 a21
+ a22 a22
+ a23 a23
,
etc.
We now note that b11 = |A| since it is precisely the term obtained when one computes det(A) by cofactors, expanding by the rst row. Similarly, b22 = |A| since it
is precisely the term obtained by computing det(A) by cofactors after expanding
by the second row. It follows from the above lemma that b12 = 0 and b23 = 0 since
b12 is the term obtained by multiplying each element in the rst row of A by the
5.7
175
Final Comments
cofactor of the corresponding element in the second row and adding, while b23
is the term obtained by multiplying each element in the second row of A by the
cofactor of the corresponding element in the third row and adding. Continuing
this analysis for each bij , we nd that
|A|
AAa = 0
0
0
|A|
0
0
1
0 = |A| 0
|A|
0
0
1
0
0
0,
1
AAa = |A|I.
Theorem 2
Aa A = |A|I.
Proof. This proof is completely analogous to the previous one and is left as an
exercise for the student.
6
Eigenvalues and Eigenvectors
6.1
Denitions
Consider the matrix A and the vectors x1 , x2 , x3 given by
1 4 1
4
3
1, x1 = 1, x2 = 2,
A = 0 2
0 0
3
0
2
3
x3 = 0.
0
8
2 = 2x1 ,
0
9
6 = 3x2 ,
6
and
3
0 = 1x3 ;
0
hence,
Ax1 = 2x1 ,
Ax2 = 3x2 ,
Ax3 = 1x3 .
That is, multiplying A by any one of the vectors x1 , x2 , or x3 is equivalent to simply
multiplying the vector by a suitable scalar.
Denition 1 A nonzero vector x is an eigenvector (or characteristic vector) of a
square matrix A if there exists a scalar such that Ax = x. Then is an eigenvalue
(or characteristic value) of A.
Thus, in the above example, x1 , x2 , and x3 are eigenvectors of A and 2, 3, 1 are
eigenvalues of A.
177
178
Chapter 6
Note that eigenvectors and eigenvalues are only dened for square matrices.
Furthermore, note that the zero vector can not be an eigenvector even though
A 0 = 0 for every scalar . An eigenvalue, however, can be zero.
Example 1
Show that
5
x = 0
0
is an eigenvector of
0 5
A = 0 1
0 3
7
2.
1
Solution
0 5 7
5
0
5
Ax = 0 1 2 0 = 0 = 00.
0 3 1
0
0
0
Example 2
Is
x=
1
1
an eigenvector of
A=
1
3
2
?
4
Solution
Ax =
1
3
2
4
1
3
=
.
1
7
=
=
.
7
1
6.1
179
Denitions
Problems 6.1
1. Determine which of the following vectors are eigenvectors for
1 2
A=
.
4 7
1
1
2
1
(a)
,
(b)
,
(c)
,
(d)
,
1
1
1
2
2
4
4
2
(e)
,
(f )
,
(g)
,
(h)
.
2
4
4
4
2. What are the eigenvalues that correspond to the eigenvectors found in
Problem 1?
3. Determine which of the following vectors are eigenvectors for
2 4
B=
.
3 6
1
1
2
0
(a)
,
(b)
,
(c)
,
(d)
,
1
1
1
0
6
2
4
1
(e)
,
(f )
,
(g)
,
(h)
.
3
3
6
0
4. What are the eigenvalues that correspond to the eigenvectors found in
Problem 3?
5. Determine which of the following vectors are eigenvectors for
2 0 1
1.
A= 1 2
1 0
2
1
0
1
3
(a) 0,
(b) 1,
(c) 2,
(d) 6,
0
0
1
3
1
1
2
1
(e) 0,
(f ) 0,
(g) 0,
(h) 1.
1
1
2
1
6. What are the eigenvalues that correspond to the eigenvectors found in
Problem 5?
7. Determine which of the following vectors are eigenvectors for
1
1
A=
0
0
3 0
1 0
0 1
0 4
0
0
.
2
3
180
Chapter 6
1
1
(a)
0 ,
0
0
0
(b)
1,
1
1
0
(c)
0,
1
3
1
(d)
0,
0
0
0
(e)
0,
0
1
1
(f )
0.
0
6.2
Eigenvalues
Let x be an eigenvector of the matrix A. Then there must exist an eigenvalue
such that
Ax = x
(1)
or, equivalently,
Ax x = 0
or
(A I)x = 0.
(2)
(3)
Bx = 0,
(4)
(5)
Equation (5) is called the characteristic equation of A. The roots of (5) determine
the eigenvalues of A.
6.2
181
Eigenvalues
Example 1
1
A=
4
2
.
3
Solution
1
A I =
4
2
1
3
0
0
1
=
1
4
2
3
0
1 2
=
.
4 3
det (A I) = (1 )(3 ) 8 = 2 4 5. The characteristic equation
of A is det (A I) = 0, or 2 4 5 = 0. Solving for , we have that = 1, 5;
hence the eigenvalues of A are 1 = 1, 2 = 5.
Example 2
1
A=
1
2
.
1
Solution
1
1
A I =
2
1
1
0
0
1 2
=
,
1
1
1
det (A I) = (1 )(1 ) + 2 = 2 2 + 3.
The characteristic equation is 2 2 + 3 = 0; hence,
solving for by the quadratic
formula, we have that 1 = 1 + 2 i, 2 = 1 2 i which are eigenvalues of A.
Note: Even if the elements of a matrix are real, the eigenvalues may be complex.
Example 3
t
A=
2t
2t
.
t
Solution
t
A I =
2t
2t
1
t
0
0
t
=
1
2t
2t
t
det (A I) = (t )(t ) 4t 2 = 2 5t 2 .
182
Chapter 6
The characteristic
equation is 2 5t 2 = 0, hence, the eigenvalues are 1 =
2 = 5t.
5t,
Note: If the matrix A depends on a parameter (in this case the parameter is t),
then the eigenvalues may also depend on the parameter.
Example 4
1 1
2 1.
0 1
2
A = 3
0
Solution
2
A I = 3
0
1
2
0
1
1
1 0
1
0
0
1
0
0
2
0 = 3
1
0
1
1
2
1 .
0
1
10
7
A =
8
7
7
5
6
5
8
6
10
9
7
5
.
9
10
For these reasons, eigenvalues are rarely found by the method just given, and
numerical techniques are used to obtain approximate values (see Sections 6.6
and 10.4).
6.2
183
Eigenvalues
Problems 6.2
In Problems 1 through 35, nd the eigenvalues of the given matrices.
1.
1
1
3
4.
9
2
,
4
6
,
6
2
5.
1
3
5
,
5 3
1 0
10.
,
0 1
2
2
13.
,
1 2
0
t
16.
,
2t t
1 0 3
19. 1 2 1,
3 0 1
1 1 1
0,
22. 0 0
1 2
3
3
1 1
3 1,
25. 1
1 1
5
3 1
1
3 1,
28. 1
1 1
3
7.
1
31. 1
0
1
3
34.
0
0
5 1
1 1,
0 3
1
5
0
0
2
2
2.
0 0
0 0
,
1 5
1 1
1
,
3
3.
1
,
4
3
5
,
5 3
0 1
11.
,
0 0
4 10
14.
,
9 5
0
2t
17.
,
2t 4t
2 0 1
2,
20. 2 2
1 0
2
3 0 0
23. 2 6 4,
2 3 5
1 2 3
26. 2 4 6,
3 6 9
1 2 1
29. 2 4 2,
1 2 1
0
32. 0
0
0
0
35.
0
4
2
,
1
2
5
,
1 2
0 0
12.
,
0 0
5 10
15.
,
9 4
4 2
18.
,
2 0 1
2,
21. 2 1
1 0
2
5 7 7
24. 4 3 4,
4 1 2
10 2
0
6,
27. 2 4
0 6 10
4 2 1
30. 2 7 2,
1 2 4
9.
1 0
0 1,
1 0
1
0
0
12
3
,
6
2
4
1
6.
4
8.
0
1
0
13
0
33. 0
27
0
0
.
1
6
1 0
0 1,
27 9
184
Chapter 6
0
0
C=
0
a0
1
0
0
a1
0
1
0
a2
1
an1
6.3
Eigenvectors
To each distinct eigenvalue of a matrix A there will correspond at least one
eigenvector which can be found by solving the appropriate set of homogeneous
equations. If an eigenvalue i is substituted into (2), the corresponding eigenvector
xi is the solution of
(A i I)xi = 0.
Example 1
(6)
1
4
2
.
3
(7)
6.3
185
Eigenvectors
2
1
+
3
0
1
4
or
2
4
2
4
0
1
!
0
x1
=
y1
0
0
x1
=
.
y1
0
or, equivalently,
2x1 + 2y1 = 0,
4x1 + 4y1 = 0.
A nontrivial solution to this set of equations is x1 = y1 , y1 arbitrary; hence, the
eigenvector is
x1
y1
1
=
= y1
, y1 arbitrary.
x1 =
y1
y1
1
By choosing different values of y1 , different eigenvectors for 1 = 1 can be
obtained. Note, however, that any two such eigenvectors would be scalar multiples of each other, hence linearly dependent. Thus, there is only one linearly
independent eigenvector corresponding to 1 = 1. For convenience we choose
y1 = 1, which gives us the eigenvector
1
.
x1 =
1
Many times, however, the scalar y1 is chosen in such a manner that the resulting
eigenvector becomes a unit vector. If we wished
to achieve this result for the above
4
4
2
2
0
x2
=
,
y2
0
186
Chapter 6
or, equivalently,
4x2 + 2y2 = 0,
4x2 2y2 = 0.
A nontrivial solution to this set of equations is x2 = 21 y2, where y2 is arbitrary;
hence
1
y2 /2
x2
=
= y2 2 .
x2 =
y2
y2
1
For convenience, we choose y2 = 2, thus
x2 =
1
.
2
1
4
2
3
1
5
1
=
=5
= 2 x2 .
2
10
2
Again note that x2 is not unique! Any scalar multiple of x2 is also an eigenvector
corresponding to 2 . However, in this case, there is just one linearly independent
eigenvector corresponding to 2.
Example 2
2
A = 0
0
0
2
1
0
5.
2
2
0
0
2
1
0
1
5 2 0
2
0
0
1
0
0
0 x1
0 y1 = 0,
z1
0
1
6.3
187
Eigenvectors
or
0
0
0
0
0
1
0
x1
0
5 y1 = 0,
4
z1
0
or, equivalently,
0 = 0,
5z1 = 0,
y1 4z1 = 0.
A nontrivial solution to this set of equations is y1 = z1 = 0, x1 arbitrary; hence
x1
x1
1
x1 = y1 = 0 = x1 0.
0
z1
0
We now nd the eigenvectors corresponding to 2 = i. If we designate x2 by
x2
y2 ,
z2
Eq. (6) becomes
2i
0
0
0
2i
1
0
x2
0
5 y2 = 0
2 i
0
z2
or
(2 i)x2 = 0,
(2 i)y2 + 5z2 = 0,
y2 + (2 i)z2 = 0.
A nontrivial solution to this set of equations is x2 = 0, y2 = (2 i)z2 , z2 arbitrary;
hence,
x2
0
0
x2 = y2 = (2 i)z2 = z2 2 i.
1
z2
z2
The eigenvectors corresponding to 3 = i are found in a similar manner to be
0
x3 = z3 2 i, z3 arbitrary.
1
188
Chapter 6
Problems 6.3
In Problems 1 through 23, nd an eigenvector corresponding to each eigenvalue
of the given matrix.
1 2
2 1
2 3
1.
,
2.
,
3.
,
1 4
2 3
4 6
3 6
1
2
3
5
4.
,
5.
,
6.
,
9 6
4 1
5 3
3
5
2
5
2
2
7.
,
8.
,
9.
,
5 3
1 2
1 2
4 10
0
t
4 2
10.
,
11.
,
12.
,
9 5
2t t
1 0 3
2 0 1
3 0 1
2,
2,
13. 1 2 1,
14. 2 2
15. 2 3
3 0 1
1 0
2
1 0
3
3 0 0
5 7 7
3
1 1
3 1,
16. 2 6 4,
17. 4 3 4,
18. 1
2 3 5
4 1 2
1 1
5
6.3
189
Eigenvectors
1
5 1
19. 1 1 1,
0
0 3
1 1 0 0
3
5 0 0
,
22.
0
0 1 4
0
0 1 1
0
20. 0
0
2
0
23.
0
0
1
0
1
4 2
1 0
3 3
2 0
0
1,
0
3
21. 0
0
2
4
1
1
0,
5
2
0
.
1
4
24. Find unit eigenvectors (i.e., eigenvectors whose magnitudes equal unity) for
the matrix in Problem 1.
25. Find unit eigenvectors for the matrix in Problem 2.
26. Find unit eigenvectors for the matrix in Problem 3.
27. Find unit eigenvectors for the matrix in Problem 13.
28. Find unit eigenvectors for the matrix in Problem 14.
29. Find unit eigenvectors for the matrix in Problem 16.
30. A nonzero vector x is a left eigenvector for a matrix A if there exists a scalar
such that xA = x. Find a set of left eigenvectors for the matrix in Problem 1.
31. Find a set of left eigenvectors for the matrix in Problem 2.
32. Find a set of left eigenvectors for the matrix in Problem 3.
33. Find a set of left eigenvectors for the matrix in Problem 4.
34. Find a set of left eigenvectors for the matrix in Problem 13.
35. Find a set of left eigenvectors for the matrix in Problem 14.
36. Find a set of left eigenvectors for the matrix in Problem 16.
37. Find a set of left eigenvectors for the matrix in Problem 18.
38. Prove that if x is a right eigenvector of a symmetric matrix A, then xT is a left
eigenvector of A.
39. A left eigenvector for a given matrix is known to be [1 1]. Find another left
eigenvector for the same matrix satisfying the property that the sum of the
vector components must equal unity.
40. A left eigenvector for a given matrix is known to be [2 3]. Find another left
eigenvector for the same matrix satisfying the property that the sum of the
vector components must equal unity.
41. A left eigenvector for a given matrix is known to be [1 2 5]. Find another
left eigenvector for the same matrix satisfying the property that the sum of
the vector components must equal unity.
42. A Markov chain (see Problem 16 of Section 1.1 and Problem 16 of Section 1.6)
is regular if some power of the transition matrix contains only positive elements. If the matrix itself contains only positive elements then the power
190
Chapter 6
is one, and the matrix is automatically regular. Transition matrices that are
regular always have an eigenvalue of unity. They also have limiting distribution vectors denoted by x() , where the ith component of x() represents the
probability of an object being in state i after a large number of time periods
have elapsed. The limiting distribution x() is a left eigenvector of the transition matrix corresponding to the eigenvalue of unity, and having the sum of
its components equal to one.
(a) Find the limiting distribution vector for the Markov chain described in
Problem 16 of Section 1.1.
(b) Ultimately, what is the probability that a family will reside in the city?
43. Find the limiting distribution vector for the Markov chain described in Problem 17 of Section 1.1. What is the probability of having a Republican mayor
over the long run?
44. Find the limiting distribution vector for the Markov chain described in Problem 18 of Section 1.1. What is the probability of having a good harvest over
the long run?
45. Find the limiting distribution vector for the Markov chain described in Problem 19 of Section 1.1. Ultimately, what is the probability that a person will use
Brand Y?
6.4
1
4
1
3
A = 0
1
Solution
2
1.
5
tr(A) = 3 + 4 + (5) = 2.
Property 1 The sum of the eigenvalues of a matrix equals the trace of the matrix.
Proof.
11
5
3
.
5
6.4
191
1
A = 2
3
0
1
4
0
0.
1
192
Chapter 6
%
$
Thus, det AT I = 0, which implies that is an eigenvalue of AT .
6.4
193
Problems 6.4
1. One eigenvalue of the matrix
8
A=
3
2
3
8
3
3
2
(b) 5 and 5,
(d) 2 and 4.
194
Chapter 6
12
A=
3
11. Verify Property 2 for
1 3
A = 1 2
2 1
16
.
7
6
1.
7
6.5
6.5
195
2
A = 0
0
1
2
0
0
1.
2
196
Chapter 6
Example 2
2
A = 0
0
1
2
0
0
0.
2
6.5
197
Example 3
2
A = 0
0
0
2
0
0
0.
2
0
1,
0
0
0.
1
In this case we see that three linearly independent eigenvectors are generated
by = 2. (Note that, from Theorem 1, this is the maximal number that could be
generated.)
The preceding examples are illustrations of
Theorem 2 If is an eigenvalue of multiplicity k of an n n matrix A, then
the number of linearly independent eigenvectors of A associated with is given by
= n r(A I). Furthermore, 1 k.
198
Chapter 6
(8)
and 1 = 2 = 3 = 1 .
Since we want to show that x1 , x2 , x3 are linearly independent, we must show
that the only solution to
(9)
c1 x1 + c2 x2 + c3 x3 = 0
is c1 = c2 = c3 = 0. By premultiplying (9) by A, we obtain
c1 Ax1 + c2 Ax2 + c3 Ax3 = A 0 = 0.
It follows from (8), therefore, that
c1 1 x1 + c2 2 x2 + c3 3 x3 = 0.
(10)
c1 x1
0
1 1 1
1 2 3
c2 x2 = 0.
2
2
2
1 2 3
0
c3 x3
(11)
6.5
199
Dene
1
1
B=
1
2
1
3
.
21
22
23
It can be shown that det(B) = (2 1 )(3 2 )(3 1 ). Thus, since all the
eigenvalues are distinct, det (B) = 0 and B is invertible. Therefore,
0
0
c1 x1
c2 x2 = B1 0 = 0
c3 x3
0
0
or
c1 x1 = 0
c x = 0
2 2
c3 x3 = 0
(12)
But since x1 , x2 , x3 are eigenvectors, they are nonzero, therefore, it follows from
(12) that c1 = c2 = c3 = 0. This result together with (9) implies Theorem 3.
Theorems 2 and 3 together completely determine the number of linearly
independent eigenvectors of a matrix.
Example 4
1
A = 4
4
0
3
2
0
2.
3
x1 =
y1
z1
200
Chapter 6
x1
1
0
x1
y1
0
x1 = y1 =
=
x
+
y
1
1 1.
z1
2x1 y1
2
1
By rst choosing x1 = 1, y1 = 0 and then x1 = 0, y1 = 1, we see that = 1
generates the two linearly independent eigenvectors
0,
2
1.
1
1
0,
2
0
1,
1
0
1.
1
Problems 6.5
In Problems 116 nd a set of linearly independent eigenvectors for the given
matrices.
2 1
3 1
3 0
1.
,
2.
,
3.
,
1
4
0 3
0 3
6.6
201
Power Methods
2 1 1
4. 0 1 0,
1 1 2
1 1 1
0,
7. 0 0
1 2
3
0
1 0
0 1,
10. 0
27 27 9
0 1
0 0
0 0
1 0
,
13.
0 0
0 1
1 4 6 4
1 0 0 0
1 2 1 1
15.
1 1 2 1,
1 1 1 2
2
5. 0
1
1
8. 2
3
0
11. 0
1
1
0
14.
0
0
3
0
16.
0
0
1
1
2
2 0 1
6. 2 1 2,
1 0
2
3 1
1
3 1,
9. 1
1 1
3
4 2 1
12. 2 7 2,
1 2 4
1
0,
2
2 3
4 6,
6 9
1 0
0 1,
3 3
0
0
0
1
1
3
0
0
0 0
1 0
,
0 1
3 3
1 2
1 1
.
2 0
0 2
1
x2
x22
..
.
x2n1
...
#
1 #
#
xn #
#
#
2
xn ##
#
#
.. #
. ##
#
xnn1 #
6.6
Power Methods
The analytic methods described in Sections 6.2 and 6.3 are impractical for calculating the eigenvalues and eigenvectors of matrices of large order. Determining the
characteristic equations for such matrices involves enormous effort, while nding
its roots algebraically is usually impossible. Instead, iterative methods which lend
202
Chapter 6
= 1 c1 v1 + c2
k
k1 c1 v1
2
1
)k
(
v2 + + cn
for large k.
n
1
)k
vn
6.6
203
Power Methods
This last pseudo-equality follows from noting that each quotient of eigenvalues is
less than unity in absolute value, as a result of indexing the rst eigenvalue as the
dominant one, and therefore tends to zero as that quotient is raised to successively
higher powers.
Thus, Ak x0 approaches a scalar multiple of v1 . But any nonzero scalar multiple
of an eigenvector is itself an eigenvector, so Ak x0 approaches an eigenvector of A
corresponding to the dominant eigenvalue, providing c1 is not zero. The scalar c1
will be zero only if x0 is a linear combination of {v2 , v3 , . . . , vn }.
The power method begins with an initial vector x0 , usually the vector having
all ones for its components, and then iteratively calculates the vectors
x1 = Ax0 ,
x2 = Ax1 = A2 x0 ,
x3 = Ax2 = A3 x0 ,
..
.
xk = Axk1 = Ak x0 .
As k gets larger, xk approaches an eigenvector of A corresponding to its dominant
eigenvalue.
We can even determine the dominant eigenvalue by scaling appropriately. If k
is large enough so that xk is a good approximation to the eigenvector, say to within
acceptable roundoff error, then it follows from Eq. (1) that
Axk = 1 xk .
If xk is scaled so that its largest component is unity, then the component of xk+1 =
Axk = 1 xk having the largest absolute value must be 1 .
We can now formalize the power method. Begin with an initial guess x0 for
the eigenvector, having the property that its largest component in absolute value
is unity. Iteratively, calculate x1 , x2 , x3 , . . . by multiplying each successive iterate
by A, the matrix of interest. Each time xk (k = 1, 2, 3, . . .) is computed, identify
its dominant component and divide each component by it. Redene this scaled
vector as the new xk . Each xk is an estimate of an eigenvector for A and each
dominant component is an estimate for the associated eigenvalue.
Example 1
Solution We initialize x0 = 1
1
4
T
1 . Then
2
.
3
204
Chapter 6
First Iteration
1
x1 = Ax0 =
4
2
3
1
3
=
,
1
7
7,
x1
1
3
7
T
= 0.428571
T
Second Iteration
1
4
x2 = Ax1 =
2
3
0.428571
2.428571
=
,
1
4.714286
4.714286,
x2
1
2.428571
4.714286
4.714286
T
= 0.515152
T
Third Iteration
1
x3 = Ax2 =
4
2
3
0.515152
2.515152
=
,
1
5.060606
= 5.060606,
x3
1
[2.515152 5.060606]T = [0.497006 1]T .
5.060606
Fourth Iteration
1
x4 = Ax3 =
4
2
3
0.497006
2.497006
=
,
1
4.988024
4.988024,
x4
1
2.497006
4.988024
4.988024
T
= 0.500600
T
Example 2
0
A= 0
18
1
0
1
0
1.
7
6.6
205
Power Methods
Solution We initialize x0 = 1
T
1 . Then
First Iteration
x1 = Ax0 = 1
10
T
10,
x1
1
1
10
10
T
= 0.1
0.1
T
Second Iteration
0
1
0
0.1
0.1
0
1 0.1 = 1 ,
x2 = Ax1 = 0
18 1 7
1
5.3
5.3,
1 +0.1 1 5.3,T
x2
5.3
T
= 0.018868 0.188679 1 .
Third Iteration
0 1 0
0.018868
0.188679
,
1
x3 = Ax2 = 0 0 1 0.188679 =
18 1 7
1
7.150943
7.150943,
T
1
0.188679 1 7.150943
7.150943
T
= 0.026385 0.139842 1 .
x3
Continuing in this manner, we generate Table 6.1, where all entries are rounded
to four decimal places. The algorithm is converging to the eigenvalue 6.405125
and its corresponding eigenvector
0.024376
0.1561240
T
206
Chapter 6
Table 6.1
Iteration
0
1
2
3
4
5
6
7
8
Eigenvector components
1.0000
0.1000
0.0189
0.0264
0.0219
0.0243
0.0242
0.0244
0.0244
1.0000
0.1000
0.1887
0.1398
0.1566
0.1551
0.1561
0.1560
0.1561
Eigenvalue
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
10.0000
5.3000
7.1509
6.3852
6.4492
6.4078
6.4084
6.4056
2
A=
2
Solution We initialize x0 = [1
with
1
L=
1
1
.
3
and
U=
2
0
1
.
2
First Iteration. We solve the system LUx1 = x0 by rst solving the system
Ly = x0 for y, and then solving the system Ux1 = y for x1 . Set y = [y1 y2 ]T and
6.6
207
Power Methods
1
[0.5 0]T = [1 0]T .
0.5
Second Iteration. We solve the system LUx2 = x1 by rst solving the system
Ly = x1 for y, and then solving the system Ux2 = y for x2 . Set y = [y1 y2 ]T and
x2 = [a b]T . The rst system is
y1 + 0y2 = 1,
y1 + y2 = 0,
which has as its solution y1 = 1 and y2 = 1. The system Ux2 = y becomes
2a + b = 1,
2b = 1,
which admits the solution a = 0.75 and b = 0.5. Thus,
x2 = A1 x1 = [0.75
0.75,
1
0.75
x2
0.75
0.5]T ,
0.5
T
= 1
0.666667
Third Iteration. We rst solve Ly = x2 to obtain y = 1
T
Ux3 = y to obtain x3 = 0.916667 0.833333 . Then,
0.916667
1
0.916667
x3
0.916667
0.833333
T
= 1
T
T
1.666667 , and then
0.909091
T
Continuing, we converge to the eigenvalue 1 for A1 and its reciprocal 1/1 = 1 for
T
A. The vector approximations are converging to 1 1 , which is an eigenvector
for both A1 and A.
208
Chapter 6
Example 4
7
A = 2
0
Solution We initialize x0 = 1
A = LU with
1
L = 0.285714
0
First Iteration
Set y = y1 y2
y3
T
T
1 . The LU decomposition for A has
0 0
1 0
14 1
0
6.
7
2
1
6
7
U = 0
0
and
and x1 = a
2
0.428571
0
0
6 .
77
T
c . The rst system is
y1 + 0y2 + 0y3 = 1,
0.285714y1 + y2 + 0y3 = 1,
0y1 + 14y2 + y3 = 1,
which has as its solution y1 = 1, and y2 = 0.714286, and y3 = 9. The system
Ux1 = y becomes
7a + 2b = 1,
0.428571b + 6c = 0.714286,
77c = 9,
which admits the solution a = 0.134199, b = 0.030303, and c = 0.116883. Thus,
x1 = A1 x0 = 0.134199
0.134199
x1
T
0.116833 ,
1
0.134199
0.134199
= 1
0.030303
0.225806
0.030303
0.870968
T
0.116833
T
6.6
209
Power Methods
Second Iteration
Solving the system Ly = x1 for y, we obtain
y= 1
0.059908
T
1.709677 .
T
0.022204 .
0.171065
Therefore,
0.171065,
x2
1
0.093981
0.171065
= 0.549388
0.171065
0.129796
T
T
0.022204 ,
Third Iteration
Solving the system Ly = x2 for y, we obtain
y = 0.549388
0.843032
11.932245
T
0.202424
0.154964
T
Table 6.2
Iteration
0
1
2
3
4
5
6
7
8
9
10
11
12
Eigenvector components
1.0000
1.0000
0.5494
0.6734
0.0404
0.2677
0.1723
0.2116
0.1951
0.2021
0.1991
0.2004
0.1998
1.0000
0.2258
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
0.8710
0.1298
0.7655
0.5782
0.5988
0.6035
0.5977
0.6012
0.5994
0.6003
0.5999
0.6001
Eigenvalue
0.1342
0.1711
0.2024
0.3921
0.3197
0.3372
0.3323
0.3336
0.3333
0.3334
0.3333
0.3333
210
Chapter 6
Therefore,
0.202424,
x3
+
1
0.136319
0.202424
= 0.673434
0.202424
0.765542
T
0.154964
,T
Continuing in this manner, we generate Table 6.2, where all entries are rounded
to four decimal places. The algorithm is converging to the eigenvalue 1/3 for
A1 and its reciprocal 3 for A. The vector approximations are converging to
[0.2 1 0.6]T , which is an eigenvector for both A1 and A.
We can use Property 7 and Observation 4 of Section 6.4 in conjunction with
the inverse power method to develop a procedure for nding all eigenvalues and
a set of corresponding eigenvectors for a matrix, providing that the eigenvalues
are real and distinct, and estimates of their locations are known. The algorithm is
known as the shifted inverse power method.
If c is an estimate for an eigenvalue of A, then A cI will have an eigenvalue
near zero, and its reciprocal will be the dominant eigenvalue of (A cI)1 . We
use the inverse power method with an LU decomposition of A cI to calculate
the dominant eigenvalue and its corresponding eigenvector x for (A cI)1 .
Then 1/ and x are an eigenvalue and eigenvector for A cI while 1/ + c and x
are an eigenvalue and eigenvector for A.
Example 5
Table 6.3
Iteration
0
1
2
3
4
5
6
7
8
9
10
11
Eigenvector components
1.0000
0.6190
0.4687
0.3995
0.3661
0.3496
0.3415
0.3374
0.3354
0.3343
0.3338
0.3336
1.0000
0.7619
0.7018
0.6816
0.6736
0.6700
0.6683
0.6675
0.6671
0.6669
0.6668
0.6667
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
Eigenvalue
0.2917
0.2639
0.2557
0.2526
0.2513
0.2506
0.2503
0.2502
0.2501
0.2500
0.2500
6.6
211
Power Methods
8
2
0
6,
A cI = 2 14
0
6 8
which has an LU decomposition with
1
0
0
1
0 and
L = 0.25
0
0.444444 1
8
U= 0
0
2
13.5
0
0
.
6
5.333333
Applying the inverse power method to A 15I, we generate Table 6.3, which
T
is converging to = 0.25 and x = 13 23 1 . The corresponding eigenvalue
of A is 1/ 0.25 + 15 = 11, with the same eigenvector.
Using the results of Examples 4 and 5, we have two eigenvalues, 1 = 3
and 2 = 11, of the 3 3 matrix dened in Example 4. Since the trace of a
matrix equals the sum of the eigenvalues (Property 1 of Section 6.4), we know
7 + 1 + 7 = 3 + 11 + 3 , so the last eigenvalue is 3 = 7.
Problems 6.6
In Problems 1 through 10, use the power method to locate the dominant eigenvalue
and a corresponding eigenvector for the given matrices. Stop after ve iterations.
2 1
2 3
3 6
1.
,
2.
,
3.
,
2 3
4 6
9 6
0 1
8 2
8 3
4.
,
5.
,
6.
,
4 6
3 3
3 2
3 0 0
7 2 0
3 2
3
6,
7. 2 6 4,
8. 2 1 6,
9. 2 6
2 3 5
0 6 7
3 6 11
2 17
7
1.
10. 17 4
7
1 14
11. Use the power method on
2
A= 2
1
0
2
0
1
2,
2
212
Chapter 6
3
A=
5
5
,
3
7
Matrix Calculus
7.1
Well-Dened Functions
The student should be aware of the vast importance of polynomials and exponentials to calculus and differential equations. One should not be surprised to nd,
therefore, that polynomials and exponentials of matrices play an equally important
role in matrix calculus and matrix differential equations. Since we will be interested in using matrices to solve linear differential equations, we shall devote this
entire chapter to dening matrix functions, specically polynomials and exponentials, developing techniques for calculating these functions, and discussing some
of their important properties.
Let pk (x) denote an arbitrary polynomial in x of degree k,
pk (x) = ak xk + ak1 xk1 + + a1 x + a0 ,
(1)
(2)
0
A = 0
0
1
0
0
0
1
0
if p2 (x) = 2x2 + 3x + 4.
213
214
Chapter 7
Solution
Matrix Calculus
0
p2 (A) = 20
0
0
= 20
0
0
0 1 0
1 0 0
1 + 30 0 1 + 40 1 0
0
0 0 0
0 0 1
1
0 1 0
1 0 0
4
0 + 30 0 1 + 40 1 0 = 0
0
0 0 0
0 0 1
0
1
0
0
0
0
0
3
4
0
2
3.
4
Note that had we dened p2 (A) = 2A2 + 3A + 4 (that is, without the I term),
we could not have performed the addition since addition of a matrix and a scalar
is undened.
Since a matrix commutes with itself, many of the properties of polynomials
(addition, subtraction, multiplication, and factoring but not division) are still valid
for polynomials of a matrix. For instance, if f(x), d(x), q(x), and r(x) represent
polynomials in x and if
f(x) = d(x)q(x) + r(x)
(3)
(4)
Equation (4) follows from (3) only because A commutes with itself; thus, we
multiply together two polynomials in A precisely in the same manner that we
multiply together two polynomials in x.
If we recall from calculus that many functions can be written as a Maclaurin
series, then we can dene functions of matrices quite easily. For instance, the
Maclaurin series for ex is
ex =
k
x
k=0
k!
=1+
x
x2
x3
+
+
+ .
1!
2!
3!
(5)
Ak
k=0
k!
=I+
A A2
A3
+
+
+ .
1!
2!
3!
(6)
7.1
215
Well-Dened Functions
Denition 2 The innite series
n=0 Bn , converges to B if the sequence {Sk } of
k
partial sums, where Sk = n=0 Bn , converges to B.
It can be shown (see Theorem 1, this section) that the innite series given in
(6) converges for any matrix A. Thus eA is dened for every matrix.
Example 2
Find eA if
2
A=
0
0
.
0
Solution
1 2 0 2
1 2 0
1 2 0 3
0
+
+
+
+
e =e
1
1! 0 0
2! 0 0
3! 0 0
3
2
2 /3! 0
2/1! 0
1 0
+
=
+
+ 2 /2! 0 +
0 1
0
0
0
0
0
0
k
2
2 /k! 0
e
0
=
.
= k=0
0 e0
0
1
A
1
=
0
1
0
A = ..
.
0
2
..
.
0
0
.. ,
.
n
e1
0
eA = ..
.
0
0
e 2
..
.
0
0
0
..
.
(7)
e n
1
A=
4
2
,
3
216
Chapter 7
Matrix Calculus
1 2e5 + 4e1
e =
6 4e5 4e1
A
2e5 2e1
4e5 + 2e1
For the purposes of this book, the exponential is the only function that is
needed. However, it may be of some value to know how other functions of matrices,
sines, cosines, etc., are dened. The following theorem, the proof of which is beyond
the scope of this book, provides this information.
Theorem
z represent the complex variable x + iy. If f(z) has the Taylor
1 Let
k , which converges for |z| < R, and if the eigenvalues , , . . . ,
series
a
z
k
n
1 2
k=0
|
|
<
R(i
=
1,
2,
.
.
.
,
n),
then
of
an
n
n
matrix
A
have
the
property
that
i
k
k=0 ak A will converge to an n n matrix which is dened to be f(A). In such a
case, f(A) is said to be well dened.
Example 3
Dene sin A.
(1)k z2k+1
(2k + 1)!
k=0
=z
z5
z7
z3
+
+
3!
5!
7!
This series can be shown to converge for all z (that is, R = ). Hence, since
any eigenvalue of A must have the property || < (that is, is nite) sin A
can be dened for every A as
sin A =
(1)k A2k+1
k=0
(2k + 1)!
=A
A3
A5
A7
+
+
3!
5!
7!
Problems 7.1
1. Let q(x) = x 1. Find pk (A) and q(A)pk (A) if
1
2 3
(a) A = 0 1 4,
k = 2, and p2 (x) = x2 2x + 1,
0
0 1
1 2
(b) A =
,
k = 3, and p3 (x) = 2x3 3x2 + 4.
3 4
(8)
7.1
217
Well-Dened Functions
0
2
0
A = 0
0
1
0 2
1 2
1
1.
(a) A =
,
(b) A = 3
3 4
2 2
3
The above equation is an example of matrix factoring.
4. Although x2 y2 = (x y)(x + y) whenever x and y denote real-valued
variables, show by example that A2 B2 need not equal the product
(A B)(A + B) whenever A and B denote 2 2 real matrices. Why?
5. It is known that x2 5x + 6 factors into the product (x 2)(x 3) whenever
x denotes a real-valued variable. Is it necessarily true that A2 5A + 6I =
(A 2I)(A 3I) whenever A represents a square real matrix? Why?
6. Determine limk Bk when
1
k
Bk =
3
2
2 2
k .
(0.5)k
2k
k+1
k+3
.
Bk = 2
k 2k + 1
3k2 + 2k
2k2
8. Determine limk Dk when
Dk =
(0.2)k
(0.1)k
3k
218
Chapter 7
Matrix Calculus
n
2n+1 converges for all
9. It is known that arctan (z) =
n=0 [(1) /(2n + 1)]z
|z|
< /2. Determine for which of the following matrices A, arctan(A) =
n
2n+1 is well dened:
n=0 [(1) /2n + 1]A
6
.
4
3
(a)
2
0
(d) 0
0
1
0
1
0
1.
0
4
.
5
5
(b)
6
1
(e) 0
0
6
(c)
2
1
5.
3
2
3
1
0
(f ) 0
0
5
.
1
1
0
18
0
1 .
3
4
n+1 /n]zn converges for all |z| < 1.
10. It is known that ln(1 + z) =
n=0 [(1)
Determine
for which of the matrices given in Problem 9 ln(I + A) =
n+1/n ]An is well dened.
[(1)
n=0
n n
11. It is known that f(z) =
3. Determine for
n=0 z /3 converges for
all |z| <
n n
which of the matrices given in Problem 9 f(A) =
n=0 A /3 is well dened.
12. Derive Eq. (7).
13. Find eA when
1
A=
0
0
.
2
1
0
0
.
28
2
A = 0
0
0
2
0
0
0.
0
16. Derive an expression for sin(A) similar to Eq. (7) when A is a square diagonal
matrix.
17. Find sin(A) for the matrix given in Problem 13.
18. Find sin(A) for the matrix given in Problem 14.
7.2
219
CayleyHamilton Theorem
19. Using Theorem 1, give a denition for cos A and use this denition to nd
1
cos
0
0
.
2
7.2
CayleyHamilton Theorem
We now state one of the most powerful theorems of matrix theory, the proof which
is given in the Final Comments at the end of this chapter.
CayleyHamilton Theorem. A matrix satises its own characteristic equation. That
is, if the characteristic equation of an n n matrix A is n + an1 n1 + + a1 +
a0 = 0, then
An + an1 An1 + + a1 A + a0 I = 0.
Note once again that when we change a scalar equation to a matrix equation,
the unity element 1 is replaced by the identity matrix I.
Example 1 Verify the CayleyHamilton theorem for
A=
1
4
2
.
3
1
4
A2 4A 5I =
9
=
16
2
3
2
1
4
3
4
1
4
8
4
17
16
=
945
16 16 0
2
1
5
3
0
8
5
12
0
0
5
3
A = 2
0
0
0
0
1
1.
4
880
0
=
17 12 5
0
0
1
0
= 0.
0
220
Chapter 7
Matrix Calculus
3 0 0
3 0 1
3 0 1
1 2 0
1
(3I A)(A)(4I A) = 0 3 0 2 0
0 0 3
0 0
4
0 0
4
4
0
0
0
= 2
0
0
= 2
0
0
4
0
0
3
0 2
4
0
1
1
4
0
3
0
1
3
1 2
1
0
0
0
0
1
1
1 2
0
4
0
3
0
1
3
1 2
1
0
0
0
0
3
0
2 = 0
0
0
0
0
0
0
4
0
0
0
0
1
1
0
0
0 = 0.
0
1
A (An1 + an1 An2 + + a1 I) = I.
a0
Example 3
1 n1
(A
+ an1 An2 + + a1 I).
a0
1 2 4
A = 0 1 2.
2
0 3
(9)
7.2
221
CayleyHamilton Theorem
( )
1
(A2 + 3A + 9I) = I.
3
Thus,
( )
1
A2 + 3A + 9I
3
9
0 12
3
1
4 1
4 + 0
=
3
8
4 17
6
A1 =
3
1
4
=
3 2
6
5
4
0
2.
1
6
3
0
12
9
6 + 0
9
0
0
9
0
0
0
9
Problems 7.2
Verify the CayleyHamilton theorem and use it to nd A1 , where possible, for:
1
1. A =
3
2
3. A = 4
0
1
0
5. A =
0
0
2
,
4
0
0
0
0
1
0
0
2
,
4
1 2
3 2,
1 2
1
2. A =
2
1
2,
1
0
0
1
0
1
4. A = 0
2
0
0
.
0
1
222
Chapter 7
7.3
Matrix Calculus
(10)
From the CayleyHamilton theorem, we know that a matrix must satisfy its own
characteristic equation, hence
d(A) = 0.
(11)
Let f(A) be any matrix polynomial of arbitrary degree that we wish to compute.
f() represents the corresponding polynomial of . A theorem of algebra states
that there exist polynomials q() and r() such that
f() = d()q() + r(),
(12)
where r() is called the remainder. The degree of r() is less than that of d(),
which is n, and must be less than or equal to the degree of f() (why?).
Example 1
Solution
,
d()
2 1
1
2
f() 2
= + 2 + 1 +
,
d()
d()
or
f() = d() 2 + 2 + 1 (2).
(13)
If we dene q() = 2 + 2 + 1 and r() = 2, (13) has the exact form of (12) for
all except possibly = 1. However, by direct substitution, we nd that (13) is
also valid for = 1; hence (13) is an identity for all ().
7.3
223
(14)
f(A) = r(A).
(15)
(16)
(17)
(18)
(19)
if i is an eigenvalue.
If we now assume that A has distinct eigenvalues, 1 , 2 , . . . , n (note that if the
eigenvalues are distinct, there must be n of them), then (19) may be used to generate n simultaneous linear equations for the n unknowns n1 , n2 , . . . , 1 , 0 :
f(1 ) = r(1 ) = n1 (1 )n1 + n2 (1 )n2 + + 1 (1 ) + 0 ,
f(2 ) = r(2 ) = n1 (2 )n1 + n2 (2 )n2 + + 1 (2 ) + 0 ,
..
.
f(n ) = r(n ) = n1 (n )n1 + n2 (n )n2 + + 1 (n ) + 0 .
(20)
Note that f() and the eigenvalues 1 , 2 , . . . , n are assumed known; hence
f(1 ), f(2 ), . . . , f(n ) are known, and the only unknowns in (20) are n1 ,
n2 , . . . , 1 , 0 .
224
Chapter 7
Example 2
Matrix Calculus
Find A593 if
A=
4
.
3
3
2
(21)
From (18), we have that f(i ) = r(i ) if i is an eigenvalue of A; thus, for this
example, (i )593 = 1 i + 0 . Substituting the eigenvalues of A into this equation,
we obtain the following system for 1 and 0 .
(1)593 = 1 (1) + 0 ,
(1)593 = 1 (1) + 0 ,
or
1 = 1 + 0 ,
1 = 1 + 0 .
(22)
3
2
Example 3
4
3
593
=
3
2
4
.
3
Find A39 if
A=
4
2
1
.
3
(23)
7.3
225
From (18) we have that f(i ) = r(i ) if i is an eigenvalue of A, thus for this
example, (i )39 = 1 i + 0 . Substituting the eigenvalues of A into this equation,
we obtain the following system for 1 and 0 :
539 = 51 + 0 ,
239 = 21 + 0 .
(24)
539 239
,
3
0 =
2(5)39 + 5(2)39
,
3
39
2(5)39 + 5(2)39 1
539 239 4 1
+
=
2 3
0
3
3
539 239
1 2(5)39 + 239
.
=
3 2(5)39 2(2)39 539 + 2(2)39
0
.
1
(25)
The number 539 and 239 can be determined on a calculator. For our purposes,
however, the form of (25) is sufcient and no further simplication is required.
Example 4
1
A = 0
0
4
0
3
2
0.
3
r(A) = 2 A2 + 1 A + 0 I,
r() = 2 2 + 1 + 0 .
Note that since A is a 3 3 matrix, r(A) must be no more than a second degree
polynomial. Now
f(A) = r(A);
thus,
A602 3A3 = 2 A2 + 1 A + 0 I.
(26)
226
Chapter 7
Matrix Calculus
3602 75
,
6
1 =
10
602 75 1
3
0
0
A602 3A3 =
6
0 9
12
1
= 0
6
0
(3)602 + 63
,
6
0 = 0.
8
602 + 63 1
(3)
0
0 +
6
9
0
6(3)602 498
0
6(3)602 + 474
6(3)602 + 486
6(3)602 486
(27)
4
0
3
2
0
3
Finally, the student should note that if the polynomial to be calculated is already
of a degree less than or equal to n 1, then this method affords no simplication
and the polynomial must still be computed directly.
Problems 7.3
1. Specialize system (20) for f(A) = A7 and
2 3
A=
.
1 2
Solve this system and use the results to determine A7 . Check your answer by
direct calculations.
7.3
227
1
.
1
0
A=
0
Solve this system and use the results to determine A735 (What do you notice
about A3 ?).
4. Specialize system (20) for f(A) = A20 and
A=
6
.
2
3
1
1
.
5
1
A = 0
0
1
1
0
2
2 .
2
228
Chapter 7
Matrix Calculus
15. Specialize system (20) for f(A) = A8 3A5 + 5I, when A is the matrix
described in Problem 11.
16. Specialize system (20) for f(A) = A8 3A5 + 5I, when A is the matrix
described in Problem 12.
17. Specialize system (20) for f(A) = A10 + 6A3 + 8A, when A is the matrix
described in Problem 12.
18. Specialize system (20) for f(A) = A10 + 6A3 + 8A, when A is the matrix
described in Problem 13.
19. Find A202 3A147 + 2I for the A of Problem 1.
20. Find A1025 4A5 for the A of Problem 1.
21. Find A8 3A5 I for the matrix given in Problem 7.
22. Find A13 12A9 + 5I for
3
A=
1
5
.
3
23. Find A10 2A5 + 10I for the matrix given in Problem 22.
24. Find A593 2A15 for
2
A = 0
1
4
0
5
3
0.
2
0 1 0
A = 0 0 1 .
4 4 1
Solve this system, and use the results to determine f(A).
26. Specialize system (20) for f(A) = A9 3A4 + I and
0 1 0
A = 0 0 1 .
1
1
1
16
4
4
Solve this system, and use the results to determine f(A).
7.4
7.4
229
(28)
d k1 r(i )
d k1 f(i )
=
,
dk1
dk1
where the notation d n f(i )/dn denotes the nth derivative of f() with respect to
evaluated at = i .
Thus, for example, if i is an eigenvalue of multiplicity 3, Theorem 1 implies
that f() and its rst two derivatives evaluated at = i are equal, respectively, to
r() and its rst two derivatives also evaluated at = i . If i is an eigenvalue of
multiplicity 5, then f() and the rst four derivatives of f() evaluated at = i are
equal respectively to r() and the rst four derivatives of r() evaluated at = i .
Note, furthermore, that if i is an eigenvalue of multiplicity 1, then Theorem 1
implies that f(i ) = r(i ), which is Eq. (18).
Example 1
3
A = 0
1
2
1
3
4
0.
1
r(A) = 2 A2 + 1 A + 0 I
f() = 24 315
r() = 2 2 + 1 + 0
f () = 2423 4514
r () = 22 + 1
f () = 55222 63013
r () = 22 .
Theorem 1 is proved by differentiating Eq. (12) k 1 times and noting that if is an eigenvalue of
i
multiplicity k, then
d(i ) =
d (k1) d(i )
d[d(i )]
= =
= 0.
d
dk1
230
Chapter 7
Matrix Calculus
(29)
44 270
A24 3A15 = 39A2 + 57A 20I = 0 2
21 93
Example 2
84
0.
40
1 4 3
2
1 7
0 0 2 11
1
0
0 0 1 1
0
1
.
A =
2
1
0 0 0 1
0 0 0
0 1 17
0 0 0
0
0
1
r(A) = 5 A5 + 4 A4 + 3 A3 + 2 A2 + 1 A + 0 I
f() = 15 62
r() = 5 5 + 4 4 + 3 3 + 2 2 + 1 1 + 0
f () = 1514 12
r () = 55 4 + 44 3 + 33 2 + 22 + 1
f ()
r () = 205 3 + 124 2 + 63 + 22 .
21013
12
7.4
231
(30)
Since = 1 is an eigenvalue of multiplicity 3, = 1 is an eigenvalue of multiplicity 2 and = 0 is an eigenvalue of multiplicity 1, it follows from Theorem 1
that
f(1) = r(1),
f (1) = r (1),
f (1) = r (1),
(31)
f(1) = r(1),
f (1) = r (1),
f(0) = r(0).
Hence,
(1)15 6(1)2 = 5 (1)5 + 4 (1)4 + 3 (1)3 + 2 (1)2 + 1 (1) + 0
15(1)14 12(1) = 55 (1)4 + 44 (1)3 + 33 (1)2 + 22 (1) + 1
210(1)13 12 = 205 (1)3 + 124 (1)2 + 63 (1) + 22
(1)15 6(1)2 = 5 (1)5 + 4 (1)4 + 3 (1)3 + 2 (1)2 + 1 (1) + 0
15(1)14 12(1) = 55 (1)4 + 44 (1)3 + 33 (1)2 + 22 (1) + 1
(0)15 12(0)2 = 5 (0)5 + 4 (0)4 + 3 (0)3 + 2 (0)2 + 1 (0) + 0
or
5 = 5 + 4 + 3 + 2 + 1 + 0
3 = 55 + 44 + 33 + 22 + 1
198 = 205 + 124 + 63 + 22
(32)
7 = 5 + 4 3 + 2 1 + 0
27 = 55 44 + 33 22 + 1
0 = 0 .
System (32) can now be solved uniquely for 5 , 4 , . . . , 0 ; the results are then
substituted into (30) to obtain f(A).
232
Chapter 7
Matrix Calculus
Problems 7.4
1. Using Theorem 1, establish the equations that are needed to nd A7 if A is a
2 2 matrix having 2 and 2 as multiple eigenvalues.
2. Using Theorem 1, establish the equations that are needed to nd A7 if A is a
3 3 matrix having 2 as an eigenvalue of multiplicity three.
3. Redo Problem 2 if instead the eigenvalues are 2, 2, and 1.
4. Using Theorem 1, establish the equations that are needed to nd A10 if A is
a 2 2 matrix having 3 as an eigenvalue of multiplicity two.
5. Redo Problem 4 if instead the matrix has order 3 3 with 3 as an eigenvalue
of multiplicity three.
6. Redo Problem 4 if instead the matrix has order 4 4 with 3 as an eigenvalue
of multiplicity four.
7. Using Theorem 1, establish the equations that are needed to nd A9 if A is a
4 4 matrix having 2 as an eigenvalue of multiplicity four.
8. Redo Problem 7 if instead the eigenvalues are 2, 2, 2, and 1.
9. Redo Problem 7 if instead the eigenvalues are 2 and 1, both with multiplicity
two.
10. Set up (but do not solve) the necessary equations to nd A10 3A5 if
5
0
0
A =
0
0
0
2 1
5 2
0 5
0 0
0 0
0 0
1
1
0
2
0
0
5 7
1
1
1 3
.
1
2
2
0
0
5
5
A=
2
8
.
5
4
A = 0
5
1
1
1
3
0.
4
7.5
233
Functions of a Matrix
4
A = 0
8
7.5
1
0
1
2
0 .
4
Functions of a Matrix
Once the student understands how to compute polynomials of a matrix, computing exponentials and other functions of a matrix is easy, because the methods
developed in the previous two sections remain valid for more general functions.
Let f() represent a function of and suppose we wish to compute f(A). It
can be shown, for a large class of problems, that there exists a function q() and
an n 1 degree polynomial r() (we assume A is of order n n) such that
f() = q()d() + r(),
(33)
(34)
Since (33) and (34) are exactly Eqs. (12) and (14), where f() is now understood
to be a general function and not restricted to polynomials, the analysis of Sections
7.3 and 7.4 can again be applied. It then follows that
(a) f(A) = r(A), and
(b) Theorem 1 of Section 7.4 remains valid
Thus, the methods used to compute a polynomial of a matrix can be generalized
and used to compute arbitrary functions of a matrix.
Example 1
Find eA if
1
A=
4
2
.
3
r(A) = 1 A + 0 I
f() = e
r() = 1 + 0 .
(35)
234
Chapter 7
Matrix Calculus
e5 e1
6
0 =
and
e5 + 5e1
.
6
Example 2
2e5 2e1
.
4e5 + 2e1
1 2e5 + 4e1
6 4e5 4e1
Find eA if
2
A = 0
0
1
2
0
0
1.
2
r(A) = 2 A2 + 1 A + 0 I
r() = 2 2 + 1 + 0
r () = 22 + 1
r () = 22 .
(36)
7.5
235
Functions of a Matrix
e2
,
2
1 = e2 ,
0 = e2 .
4
e2
0
e =
2 0
A
Example 3
4
4
0
2
1
2
4 e 0
0
4
1
2
0
0
1
2
1 + e 0
2
0
0
1
0
2
e
0
0 = 0
1
0
e2
e2
0
e2 /2
e2 .
e2
Find sin A if
A = 0
4
0
0 .
/2
r(A) = 2 A2 + 1 A + 0 I
f() = sin
r() = 2 2 + 1 + 0
f () = cos
r () = 22 + 1 .
(37)
236
Chapter 7
Matrix Calculus
0
sin A = 1/2 0
8
2
0
16 10
0
0 .
2
In closing, we point out that although exponentials of any square matrix can
always be computed by the above methods, not all functions of all matrices can;
f(A) must rst be well dened whereby well dened (see Theorem 1 of Section 7.1) we mean that f(z) has a Taylor series which converges for |z| < R and
all eigenvalues of A have the property that their absolute values are also less
than R.
Problems 7.5
1. Establish the equations necessary to nd eA if A is a 2 2 matrix having 1 and
2 as its eigenvalues.
2. Establish the equations necessary to nd eA if A is a 2 2 matrix having 2 and
2 as multiple eigenvalues.
3. Establish the equations necessary to nd eA if A is a 3 3 matrix having 2 as
an eigenvalue of multiplicity three.
7.5
237
Functions of a Matrix
1
A=
4
3
.
2
1
.
2
4
A=
1
15. Find eA for
1
A = 1
0
1
3
0
2
4.
2
1
A = 3
0
1 2
1 4.
0 2
A=
2
3
.
2
238
Chapter 7
Matrix Calculus
(1)k1 zk
k=1
which converges for |z| < 1. For the following matrices, A, determine whether
or not log(A + I) is well dened and, if so, nd it.
1
1
6 9
3
5
0 0
2
(b)
(c)
(d)
.
(a)
2 3
1 3
0 0
0 1
2
7.6
The Function e At
A very important function in the matrix calculus is eAt , where A is a square constant
matrix (that is, all of its entries are constants) and t is a variable. This function may
be calculated by dening a new matrix B = At and then computing eB by the
methods of the previous section.
Example 1
Find eAt if
1
A=
4
Solution
2
.
3
Dene
B = At =
t
4t
2t
.
3t
r(B) = 1 B + 0 I
r() = 1 + 0 .
(38)
7.6
The Function e At
239
eAt = eB =
Example 2
( )
1
1
2t
e5t + 5et
+
3t
0
6
2et
.
+ 2et
Find eAt if
3
A = 0
0
Solution
0
1
1
3
0
0
1.
3
Dene
3t
B = At = 0
0
t
3t
0
0
t .
3t
r(B) = 2 B2 + 1 B + 0 I
f() = e
r() = 2 2 + 1 + 0
(39)
f () = e
r () = 22 + 1
(40)
f () = e
r () = 22 .
(41)
(42)
(43)
f (3t) = r (3t),
(44)
f (3t) = r (3t).
(45)
240
Chapter 7
Matrix Calculus
(46)
e3t = 6t2 + 1 ,
(47)
e3t = 22 .
(48)
1 = (1 3t)e3t ,
0 = (1 3t + 29 t 2 )e3t .
eAt
t2
3t
6t 2 + (1 3t)e3t 0
0
9t 2
1 0 0
+ (1 3t + 29 t 2 )e3t 0 1 0
0 0 1
1 t t 2 /2
= e3t 0 1
t .
0 0
1
9t 2
B
1 3t
= e = 2e
0
0
6t 2
9t 2
0
t
3t
0
0
t
3t
Problems 7.6
Find eAt if A is given by:
4 4
2
1
1.
.
2.
.
3 5
1 2
0
1
3
2
4.
.
5.
.
14 9
2 6
0 1 0
1 0 0
7. 0 0 1.
8. 4 1 2.
0 0 0
1 4 1
4
1
3.
10
6.
6
1
.
2
6
.
10
7.7
7.7
241
Complex Eigenvalues
Complex Eigenvalues
When computing eAt , it is often the case that the eigenvalues of B = At are
complex. If this occurs the complex eigenvalues will appear in conjugate pairs,
assuming the elements of A to be real, and these can be combined to produce real
functions.
Let z represent a complex variable. Dene ez by
ez =
k
z
k=0
k!
=1+z+
z2
z3
z4
z5
+
+
+
+
2!
3!
4!
5!
(49)
(i)3
(i)4
(i)5
(i)2
+
+
+
+
2!
3!
4!
5!
2
i 3
4
i 5
+
+
2!
3!
4!
5!
4
2
+
e = 1
2!
4!
i
'
&
'
3
5
+i
+
.
3!
5!
(50)
But the Maclaurin series expansions for sin and cos are
sin =
3
5
+
1!
3!
5!
cos = 1
2
4
6
+
+
+ ;
2!
4!
6!
(51)
(52)
ei + ei
,
2
(53)
242
Chapter 7
Matrix Calculus
sin =
(54)
Equations (53) and (54) are Eulers relations and can be used to reduce
complex exponentials to expressions involving real numbers.
Example 1
Find eAt if
1
A=
2
Solution
5
.
1
B = At =
5t
.
t
t
2t
r(B) = 1 B + 0 I
r() = 1 + 0
(55)
and
&
'
1 e3ti e3ti
1 =
.
3t
2i
If we now use (53) and (54), where in this case = 3t, it follows that
0 = cos 3t
and
At
=e =
13 sin 3t + cos 3t
23 sin 3t
5
3
1
3
sin 3t
sin 3t + cos 3t
7.7
243
Complex Eigenvalues
(56)
and
$
%
e ei ei
e ei e ei
e+i ei
=
=
= e sin .
2i
2i
2i
Example 2
(57)
Find eAt if
1
.
1
2
4
A=
Solution
B = At =
2t
4t
t
;
t
'
15
3
+i
t,
2
2
&
2 =
'
15
3
i
t.
2
2
Thus,
f(B) = eB
r(B) = 1 B + 0 I
f() = e
r() = 1 + 0 .
15/2)]t
15/2)]t
= 1 [ 23 + i( 15/2)]t + 0 ,
= 1 [ 23 i( 15/2)]t + 0 .
(58)
244
Chapter 7
Matrix Calculus
Putting this system into matrix form, and solving for 1 and 0 by inversion, we
obtain
2
1 =
15t
e[(3/2)t+(
15/2)ti]
e[(3/2)t(
2i
15/2)ti]
'
&
3 e[(3/2)t+( 15/2)ti] e[(3/2)t( 15/2)ti]
0 =
2i
15
'
&
e[(3/2)t+( 15/2)ti] + e[(3/2)t( 15/2)ti]
.
+
2
15t
2
15
15
3 3t/2
3t/2
0 = e
sin
cos
t+e
t.
2
2
15
2
1 = e3t/2 sin
15t
eAt
15
15
1
sin
t + cos
t
2
2
15
= e3t/2
8
15
t
sin
2
15
2
15
t
sin
2
15
.
1
15
15
t + cos
t
sin
2
2
15
Problems 7.7
Find eAt if A is given by:
1 1
2
1.
.
2.
5 1
3
4
10
4.
8
.
4
3 1
7.
.
2 5
5.
2
1
0
8. 0
0
2
.
2
5
.
2
1
2
1
1
.
0
1
.
8
0
64
3.
0
5.
2
6.
0
25
7.8
7.8
Properties of e A
245
Properties of e A
Since the scalar function ex and the matrix function eA are dened similarly (see
Eqs. (5) and (6)), it should not be surprising to nd that they possess some similar
properties. What might be surprising, however, is that not all properties of ex are
common to eA . For example, while it is always true that ex ey = ex+y = ey ex , the
same cannot be said for matrices eA and eB unless A and B commute.
Example 1
0
.
1
e1
1
e
e e =
0
A B
1
0
0
e
=
e
0
e2 e
e
and
1
e e =
0
0
e
B A
e1
e
=
1
0
e
0
e1
;
e
hence
eA+B = eA eB , eA+B = eB eA and eB eA = eA eB .
Two properties that both ex and eA do have in common are given by the
following:
Property 1 e0 = I, where 0 represents the zero matrix.
Proof.
e =
( k)
A
k=0
k!
=I+
( k)
A
k=1
Hence,
e0 = I +
k
0
k=1
k!
= I.
k!
246
Chapter 7
Matrix Calculus
Property 2 (eA )1 = eA .
Proof.
A
(e )(e
)=
(
Ak )
(A)k
k=0
k!
k!
k=0
A2
A3
A2
A3
= I+A+
+
+
IA+
+
2!
3!
2!
3!
+
+
,
,
= II + A [1 1] + A2 21 ! 1 + 21 ! + A3 13 ! + 21 ! 21 ! + 13 ! +
= I.
Thus, eA is an inverse of eA . However, by denition, an inverse of eA is (eA )1 ;
hence, from the uniqueness of the inverse (Theorem 2 of Section 3.4), we have
that eA = (eA )1 .
0
A=
0
1
.
0
Solution
0 1
A =
,
0
0
1
1
, and eA =
1
0
eA =
1
0
1
.
1
Thus,
(eA )1 =
1
0
1
1
1
=
1
0
1
= eA .
1
Note that Property 2 implies that eA is always invertible even if A itself is not.
T
Property 3 (eA )T = eA .
Proof. The proof of this property is left as an exercise for the reader (see
Problem 7).
Properties of e A
7.8
247
2
.
3
4
,
3
1
A=
4
Solution
1
A =
2
T
AT
1 2e5 + 4e1
=
6 2e5 2e1
4e5 4e1
.
4e5 + 2e1
and
2e5 2e1
;
4e5 + 2e1
2e5 + 4e1
1
eA =
6 4e5 4e1
T
Hence, (eA )T = eA .
Problems 7.8
1. Verify Property 2 for
A=
3
.
1
1
0
0
A=
64
1
.
0
0
A = 0
0
1
0
0
0
1.
0
2
A = 0
1
1
2
1
0
0.
1
248
Chapter 7
Matrix Calculus
eA = n1 (AT )n1 + + 1 AT + 0 I,
then j = j for j = 0, 1, . . . , n 1.)
8. Find eA eB , eB eA , and eA+B if
1
A=
0
1
0
0
B=
0
and
1
1
7.9
Derivatives of a Matrix
Denition 1 An n n matrix A(t) = [aij (t)] is continuous at t = t0 if each of its
elements aij (t)(i, j = 1, 2, . . . , n) is continuous at t = t0 .
For example, the matrix given in (59) is continuous everywhere because each
of its elements is continuous everywhere while the matrix given in (60) is not
continuous at t = 0 because the (1, 2) element, sin(1/t), is not continuous at t = 0.
et
t2 1
sin2 t
t 3 3t
2t
(59)
sin(1/t)
45
(60)
We shall use the notation A(t) to emphasize that the matrix A may depend on
the variable t.
7.9
249
Derivatives of a Matrix
(61)
Find A(t)
if
A(t) =
sin t
.
2
et
t2
ln t
Solution
d(t 2 )
dA(t) dt
A(t)
=
=
d(ln t)
dt
dt
Example 2
d(sin t)
2t
dt
= 1
2
d(et )
t
dt
Find A(t)
if
3t
A(t) = 45.
t2
Solution
d(3t)
dt
3
d(45)
A(t) =
= 0 .
dt
2t
d(t 2 )
dt
Example 3
Find x (t) if
x1 (t)
x2 (t)
x(t) =
... .
xn (t)
cos t
2tet
250
Chapter 7
Matrix Calculus
Solution
x 1 (t)
x 2 (t)
x (t) =
... .
x n (t)
The following properties of the derivative can be veried:
d(A(t) + B(t))
dA(t) dB(t)
=
+
.
dt
dt
dt
dA(t)
d[A(t)]
=
, where is a constant.
(P2)
dt
dt
(
)
(
)
d(t)
dA(t)
d[(t)A(t)]
=
A(t) + (t)
, when (t) is a scalar function
(P3)
dt
dt
dt
of t.
(
)
(
)
d[A(t)B(t)]
dA(t)
dB(t)
(P4)
=
B(t) + A(t)
.
dt
dt
dt
(P1)
We warn the student to be very careful about the order of the matrices in (P4).
Any commutation of the matrices on the right side will generally yield a wrong
answer. For instance, it generally is not true that
d
[A(t)B(t)] =
dt
)
(
)
dA(t)
dB(t)
B(t) +
A(t).
dt
dt
3t 2
t
and
1
B(t) =
3t
2t
.
2
Solution
d
d
[A(t)B(t)] =
dt
dt
(
2t
1
3t 2
t
d 2t + 9t 3
=
dt 1 + 3t 2
1
3t
2t
2
)
10t 2
2 + 27t 2
=
6t
4t
20t
,
4
7.9
251
Derivatives of a Matrix
and
dB(t)
dA(t)
2 6t 1 2t
2t
B(t) + A(t)
=
+
0 1 3t 2
1
dt
dt
2 + 27t 2 20t
=
6t
4
=
d[A(t)B(t)]
.
dt
3t 2
t
0
3
2
0
Proof.
(At)k
k=0
k!
or
eAt = I + tA +
t 3 A3
t n1 An1
t n An
t n+1 An+1
t 2 A2
+
+ +
+
+
+
2!
3!
(n 1)!
n!
(n + 1)!
Therefore,
deAt
A 2tA2
3t 2 A3
nt n1 An
(n + 1)t n An+1
=0+
+
+
+ +
+
+
dt
1!
2!
3!
n!
(n + 1)!
tA2
t 2 A3
t n1 An
t n An+1
+
+ +
+
+
1!
2!
(n 1)!
n!
tA t 2 A2
t n1 An1
t n An
= I+
+
+ +
+
+ A
1!
2!
(n 1)!
n!
=A+
= eAt A.
If we had factored A on the left, instead of on the right, we would have obtained
the other identity,
deAt
= AeAt .
dt
252
Chapter 7
Matrix Calculus
Corollary 1
A(t) dt =
Example 5
Find
aij (t) dt .
A(t) dt if
A(t) =
3t
t2
2
.
et
Solution
A(t) dt = 1
Example 6
Find
11
0
3t dt
t 2 dt
1
1
2 dt
et dt
3
2
t 2 + c1
2t + c2
1
3
t 3 + c3
et + c 4
=
A(t) dt if
2t
A(t) = et
sin t
1
6t 2
0
2
1.
1
7.9
253
Derivatives of a Matrix
Solution
1
0
1 dt
A(t) dt =
et dt
0
1
sin t dt
1
= e 1
2/
6t 2 dt
0 dt
0
1
1
2
0
2 dt
01
1 dt
0 1
1 dt
2t dt
2
1.
1
[A(t) + B(t)] dt =
A(t) dt +
B(t) dt,
Problems 7.9
1. Find A(t)
if
cos t t 2 1
.
(a) A(t) =
2t
e(t1)
2et3
t(t 1)
2
17
t .
ln t
= 7,
(t) = t ,
t3
A(t) =
1
3t 2
,
2t
and
t
B(t) = 3
t
2t
.
t5
3. Prove that if dA(t)/dt = 0, then A(t) is a constant matrix. (That is, a matrix
independent of t).
1
4. Find A(t) dt for the A(t) given in Problem 1(a).
5. Verify Property (P5) for
= 2,
= 10,
6t
A(t) =
2t
t2
,
1
and
t
B(t) =
1
4t 2
.
2t
254
Chapter 7
Matrix Calculus
6. Using Property (P4), derive a formula for differentiating A2 (t). Use this
formula to nd dA2 (t)/dt, where
t
A(t) = 3
4t
2t 2
,
et
and, show that dA2 (t)/dt = 2A(t) dA(t)/dt. Therefore, the power rule of differentiation does not hold for matrices unless a matrix commutes with its
derivative.
7.10
3 + 22 + 3 + 4 23 + 32 + 4 + 5
33 + 42 + 5
2 + 3
2 3 2
3 4
4
1 2 3
+
+
=
+
4 0
5 2
0
3 0
5
.
3
(62)
d() = n + an1 n1 + + a1 + a0
(63)
and let
(64)
7.10
255
Final Comments
(65)
(66)
Ca C = Ca (A I) = Ca A Ca .
(67)
(68)
(69)
a1 I = C0 + C1 A
a0 I = C0 A.
Multiplying the rst equation in (69) by An , the second equation by An1 , . . . ,
and the last equation by A0 = I and adding, we obtain (note that the terms on the
right-hand side cancel out)
An + an1 An1 + + a1 A + A0 I = 0.
Equation (70) is the CayleyHamilton theorem.
(70)
8
Linear Differential Equations
8.1
Fundamental Form
We are now ready to solve linear differential equations. The method that we shall
use involves introducing new variables x1 (t), x2 (t), . . . , xn (t), and then reducing a
given system of differential equations to the system
dx1 (t)
= a11 (t)x1 (t) + a12 (t)x2 (t) + + a1n (t)xn (t) + f1 (t)
dt
dx2 (t)
= a21 (t)x1 (t) + a22 (t)x2 (t) + + a2n (t)xn (t) + f2 (t)
dt
..
.
(1)
dxn (t)
= an1 (t)x1 (t) + an2 (t)x2 (t) + + ann (t)xn (t) + fn (t).
dt
If we dene
x1 (t)
x (t)
2
x(t) = .. ,
.
xn (t)
a11 (t)
a21 (t)
A(t) =
...
a12 (t)
a22 (t)
..
.
a1n (t)
a2n (t)
,
..
.
and
f1 (t)
f (t)
2
f(t) = .. ,
.
fn (t)
(2)
(3)
257
258
Chapter 8
Example 1
Note that we are using the standard notation y (t) and z (t) to represent
dy(t)
dt
and
dz(t)
.
dt
Solution Dene x1 (t) = y(t) and x2 (t) = z(t). This system is then equivalent to
the matrix equation
2
sin t
x 1 (t)
3 x1 (t)
t
.
(4)
=
+
x 2 (t)
t 2 + 1
et t x2 (t)
If we dene
x1 (t)
,
x2 (t)
x(t) =
A(t) =
t2
et
3
,
t
and
f(t) =
sin t
t 2 + 1
x2 (t0 ) = c2 , . . . , xn (t0 ) = cn ,
(5)
x1 (t0 )
c1
x2 (t0 ) c2
x (t0 ) = . = . = c.
.. ..
xn (t0 )
cn
Thus, the initial conditions can be put into the matrix form
x(t0 ) = c.
(6)
8.1
259
Fundamental Form
(7)
y (2) = 1.
Solution Dene x1 (t) = x(t) and x2 (t) = y(t). This system is then equivalent to
the matrix equations
2
x 1 (t)
= 2
x 2 (t)
t
t
0
x1 (t)
0
+ t
e
x2 (t)
3
x1 (2)
=
.
1
x2 (2)
(8)
Consequently, if we dene
x1 (t)
x(t) =
,
x2 (t)
0
f(t) = t ,
e
then (8) is in fundamental form.
Example 3
t
,
0
2
A(t) = 2
t
3
c=
,
1
and
t0 = 2,
= l(t) m(t)
n(t)
=
$ %
l 15 = 0,
m(t) n(t)
$ %
m 15 = 170,
$ %
n 15 = 1007.
260
Chapter 8
Solution Dene x1 (t) = l(t), x2 (t) = m(t), x3 (t) = n(t). This system is then equivalent to the matrix equations
x 1 (t)
2
3 1
x1 (t)
x 2 (t) = 1 1
0 x2 (t),
0
1 1
x 3 (t)
x3 (t)
x1 (15)
0
x2 (15) = 170.
1007
x3 (15)
(9)
Thus, if we dene
x1 (t)
x(t) = x2 (t),
x3 (t)
0
c = 170
1007
2
A(t) = 1
0
and
3 1
1
0,
1 1
0
f(t) = 0,
0
t0 = 15,
(a) both x(t) and x (t) are continuous in some neighborhood J of the initial time
t = t0 ,
(b) the substitution of x(t) into the differential equation
x (t) = A(t)x(t) + f(t)
makes the equation an identity in t on the interval J; that is, the equation is
valid for each t in J, and
(c) x(t0 ) = c.
It would also seem advantageous, before trying to nd the solutions, to know
whether or not a given system has any solutions at all, and if it does, how
8.1
Fundamental Form
261
many. The following theorem from differential equations answers both of these
questions.
Theorem 1 Consider a system given by (7). If A(t) and f(t) are continuous in
some interval containing t = t0 , then this system possesses a unique continuous
solution on that interval.
Hence, to insure the applicability of this theorem, we assume for the remainder
of the chapter that A(t) and f(t) are both continuous on some common interval
containing t = t0 .
Problems 8.1
In Problems 8.1 through 8.8, put the given systems into fundamental form.
1.
dx(t)
= 2x(t) + 3y(t),
dt
dy(t)
= 4x(t) + 5y(t),
dt
x(0) = 6,
y(0) = 7.
z(0) = 1.
dx(t)
= 3x(t) + 3y(t) + 1,
dt
dy(t)
= 4x(t) 4y(t) 1,
dt
x(0) = 0,
4.
y(0) = 0.
dx(t)
= 3x(t) + t,
dt
dy(t)
= 2x(t) + t + 1,
dt
x(0) = 1,
5.
y(0) = 1.
dx(t)
= 3t 2 x(t) + 7y(t) + 2,
dt
dx(t)
= x(t) + ty(t) + 2t,
dt
x(1) = 2,
y(1) = 3.
262
Chapter 8
6.
7.
du(t)
= et u(t) + tv(t) + w(t),
dt
dv(t)
= t 2 u(t) 3v(t) + (t + 1)w(t),
dt
dw(t)
2
= v(t) + et w(t),
dt
u(4) = 0, u(4) = 1, z(4) = 1.
dx(t)
= 6y(t) + zt,
dt
dy(t)
= x(t) 3z(t),
dt
dz(t)
2y(t),
dt
x(0) = 10, y(0) = 10,
z(0) = 20.
s(1) = 2,
u(1) = 5.
sin t
(a)
,
cos t
1
0
x1
,
x2
t
e
(b)
,
0
x1 (0)
1
=
:
0
x2 (0)
cos t
(c)
.
sin t
2
3
x1
x (0)
1
, 1
=
:
x2
2
x2 (0)
et
(b)
,
2et
e5t
(c)
.
2e5t
1
x1
x (1)
=
:
, 1
x2
0
x2 (1)
1
3
e2(t1) + 2e(t1)
(b)
,
2e2(t1) + 2e(t1)
e2(t1)
.
(c)
0
8.2
8.2
263
an (t)
d n x(t)
d n1 x(t)
dx(t)
+
a
(t)
+ + a1 (t)
+ a0 (t)x(t) = f(t)
n1
dt n
dt
dt n1
x (t0 ) = c1 ,
dx (t0 )
= c2 , . . . ,
dt
d n1 x (t0 )
dt n1
(10)
= cn .
Equation (10) is an nth order differential equation for x(t) where a0 (t),
a1 (t), . . . , an (t) and f(t) are assumed known and continuous on some interval
containing t0 . Furthermore, we assume that an (t) = 0 on this interval.
A method of reduction, particularly useful for differential equations dened
by system (10), is the following:
Step 1. Rewrite (10) so that the nth derivative of x(t) appears by itself;
an1 (t) d n1 x(t)
a1 (t) dx(t) a0 (t)
f(t)
d n x(t)
=
x(t) +
.
n
n1
dt
an (t) dt
an (t) dt
an (t)
an (t)
(11)
Step 2. Dene n new variables (the same number as the order of the differential
equation), x1 (t), x2 (t), . . . , xn (t) by the equations
x1 = x(t),
x2 =
dx1
,
dt
x3 =
dx2
,
dt
(12)
..
.
xn1 =
dxn2
,
dt
xn =
dxn1
.
dt
264
Chapter 8
(13)
(14)
x+
.
n1
dt
an (t) dt
an (t) dt
an (t)
an (t)
Substituting (13) into this equation, we obtain
an1 (t)
a1 (t)
a0 (t)
f(t)
dxn
=
xn
x2
x1 +
.
dt
an (t)
an (t)
an (t)
an (t)
(15)
(16)
8.2
265
Note that in the last equation of (16) we have rearranged the order of (15) so that
the x1 term appears rst, the x2 term appears second, etc. This was done in order
to simplify the next step.
Step 5. Put (16) into matrix form.
Dene
x1 (t)
x2 (t)
..
.
x(t) =
,
xn1 (t)
xn (t)
0
0
0
..
.
A(t) =
a0 (t)
an (t)
1
0
0
..
.
0
1
0
..
.
0
0
1
..
.
a1 (t)
an (t)
a2 (t)
an (t)
a3 (t)
an (t)
0
0
0
..
.
an1 (t)
(17)
an (t)
and
0
0
.
.
.
f(t) =
.
0
f(t)
an (t)
Then (16) can be written as
x (t) = A(t)x(t) + f(t).
Step 6. Rewrite the initial conditions in matrix form.
From (17), (13), and (10), we have that
x (t0 )
dx (t0 )
c1
x1 (t0 )
c2
x2 (t0 )
dt
x(t0 ) = . =
= .. .
..
.
..
.
xn (t0 )
cn
d n1 x (t0 )
dt n1
(18)
266
Chapter 8
Thus, if we dene
c1
c2
c = . ,
..
cn
the initial conditions can be put into matrix form
x (t0 ) = c.
(19)
Equations (18) and (19) together represent the fundamental form for (10).
Since A(t) and f(t) are continuous (why?), Theorem 1 of the previous section
guarantees that a unique solution exists to (18) and (19). Once this solution is
obtained, x(t) will be known; hence, the components of x(t), x1 (t), . . . , xn (t) will
be known and, consequently, so will x(t), the variable originally sought (from (12),
x1 (t) = x(t)).
Example 1
x (0) = 2,
x (0) = 1,
...
x(0) = 0.
1
2
sin t.
8.2
267
....
...
x 4 = x = 2 x 8t x + 21 x t 2 x +
1
2
sin t
= 2x4 8tx3 + 21 x2 t 2 x1 +
1
2
sin t,
or
x 1
x 2
x 3
x 4
=
x2
=
x3
=
x4
= t 2 x1 + 21 x2 8tx3 + 2x4 +
1
2
sin t.
Dene
x1
x2
x(t) =
x3 ,
x4
0
0
A(t) =
0
t 2
0
0
f(t) =
0 ,
1
2 sin t
1
2
c =
1,
0
1
0
0
0
1
0
8t
1
2
0
0
,
1
2
and
t0 = 0.
Thus, the initial value problem may be rewritten in the fundamental form
x (t) = A(t)x(t) + f(t),
x (t0 ) = c.
Example 2
et
x(2) = 1,
dx(2)
= 1,
dt
4
d5x
2t d x
2e
+ tx = 4et ,
dt 4
dt 5
d 2 x (2)
= 1,
dt 2
d 3 x (2)
= 2,
dt 3
d 4 x (2)
= 3.
dt 4
268
Chapter 8
Dene
x1
x2
x3
x4
=x
= x 1 = x
= x 2 = x
...
= x 3 = x
x5 = x 4 =
d4x
;
dt 4
hence,
x 5 =
d5x
.
dt 5
Thus,
x 1
x 2
x 3
x 4
= x2
= x3
= x4
= x5
x 5 =
4
d5x
t d x tet x + 4
=
2e
dt 4
dt 5
= 2et x5 tet x1 + 4,
or
x 1
x 2
x 3
x 4
=
=
=
=
x 5 =
x2
x3
x4
tet x
x5
t
+ 2e x5
Dene
x1
x2
x(t) =
x3 ,
x4
x5
0
0
f(t) =
0,
0
4
0
0
0
A(t) =
0
tet
1
1
c=
1,
2
3
1
0
0
0
0
0
1
0
0
0
+ 4.
0
0
1
0
0
0
0
0
,
1
2et
and
t0 = 2.
Thus, the initial value problem may be rewritten in the fundamental form
x (t) = A(t)x(t) + f(t),
x (t0 ) = c.
8.3
269
Reduction of a System
Problems 8.2
Put the following initial-value problems into fundamental form:
1.
3.
d2x
dx
2 3x = 0;
2
dt
dt
dx(0)
x(0) = 4,
= 5.
dt
d2x
dx
tx = 0;
+ et
2
dt
dt
dx(1)
x(1) = 2,
= 0.
dt
2.
d2x
x = t2;
dt 2
d2x
dx
3et x = 2;
2e2t
dt
dt 2
dx(0)
x(0) = 0,
= 0.
dt
4. et
dx(0)
= 3.
dt
x(0) = 3,
5. x 3x + 2x = et ,
x(1) = x (1) = 2.
...
6. 4 x + t x x = 0,
x(1) = 2,
7. et
d 2 x(0)
= ,
dt 2
dx(0)
= 2,
dt
d 3 x(0)
= e3 .
dt 3
d6x
d4x
+
4
= t 2 t,
dt 4
dt 6
x() = 2,
x () = 1,
d 4 x()
= 1,
dt 4
8.3
x (1) = 205.
d4x
d2x
dx
+
t
=1+ ,
4
2
dt
dt
dt
x(0) = 1,
8.
x (1) = 1,
x () = 0,
...
x() = 2,
d 5 x()
= 0.
dt 5
Reduction of a System
Based on our work in the preceding section, we are now able to reduce systems of
higher order linear differential equations to fundamental form. The method, which
is a straightforward extension of that used to reduce the nth order differential
equation to fundamental form, is best demonstrated by examples.
Example 1
x (1) = 2,
x (1) = 3,
x (1) = 1,
y(1) = 0,
(20)
y (1) = 2.
270
Chapter 8
Step 1. Rewrite the differential equations so that the highest derivative of each
unknown function appears by itself. For the above system, this has already
been done.
Step 2. Dene new variables x1 (t), x2 (t), x3 (t), y1 (t), and y2 (t). (Since the highest
derivative of x(t) is of order 3, and the highest derivative of y(t) is of order 2,
we need 3 new variables for x(t) and 2 new variables for y(t). In general, for
each unknown function we dene a set a k new variables, where k is the order
of the highest derivative of the original function appearing in the system under
consideration). The new variables are dened in a manner analogous to that used
in the previous section:
x1 = x,
x2 = x 1 ,
x3 = x 2 ,
(21)
y1 = y,
y2 = y 1 .
From (21), the new variables are related to the functions x(t) and y(t) by the
following:
x1 = x,
x2 = x ,
x3 = x ,
(22)
y1 = y,
y2 = y .
It follows from (22), by differentiating x3 and y2 , that
...
x 3 = x,
y 2 = y .
(23)
(24)
8.3
271
Reduction of a System
(25)
y 1 = y2 ,
y 2 = x2 + 3y1 2y2 + sin t.
Note, that for convenience we have rearranged terms in some of the equations to
present them in their natural order.
Step 5. Write (25) in matrix form.
Dene
0 1
x1 (t)
0 0
x2 (t)
x(t) =
x3 (t), A(t) = 0 0
0 0
y1 (t)
0 1
y2 (t)
0 0 0
1 0 0
5 7 1
,
0 0 1
0 3 2
and
0
0
t
f(t) =
e .
0
sin t
(26)
(27)
Step 6. Rewrite the initial conditions in matrix form. From Eqs. (26), (22), and
(20) we have
x1 (1)
x(1)
2
x2 (1) x (1) 3
x(1) =
x3 (1) = x (1) = 1.
y1 (1) y(1) 0
y (1)
2
y2 (1)
Thus, if we dene
2
3
c =
1
0
2
(28)
272
Chapter 8
Since A(t) and f(t) are continuous, (27) and (28) possess a unique solution.
Once x(t) is known, we immediately have the components of x(t), namely
x1 (t), x2 (t), x3 (t), y1 (t) and y2 (t). Thus, we have the functions x(t) and y(t) (from
(21), x1 (t) = x(t) and y1 (t) = y(t)).
All similar systems containing higher order derivatives may be put into
fundamental form in exactly the same manner as that used here.
Example 2
x () = 15,
x () = 59,
y () = 3,
Solution
x () = 117,
z () = 36,
y () = 2,
z () = 212.
Dene
x1 = x
x2 = x 1 = x
x3 = x 2 = x ;
hence,
...
x 3 = x.
hence,
...
y 3 = y.
hence,
z 2 = z .
y1 = y
y2 = y 1 = y
y3 = y 2 = y ;
z1 = z
z2 = z 1 = z ;
Thus,
x 1 = x2
x 2 = x3
...
x 3 = x = 2x + t y 3z + t 2 z + t
= 2x2 + ty2 3z1 + t 2 z2 + t;
y 1 = y2
y 2 = y3
y () = 19,
8.3
273
Reduction of a System
$
%
...
y 3 = y = z + sin t y + x t
$
%
= z2 + sin t y1 + x1 t;
z 1 = z2
z 2 = z = x y + t 2 + 1
= x3 y3 + t 2 + 1;
or
x 1 =
x 2 =
x 3 =
y 1 =
y 2 =
y 3 = x1
z 1 =
z 2 =
x2
x3
+ ty2
y2
2x2
3z1 + t 2 z2 + t
y3
$
%
+ sin t y1
y3
x3
z2 t
z2
+ t 2 + 1.
Dene
x1
x2
x3
y1
x =
y2 ,
y3
z1
z2
0
0
,
f(t) =
0
t
0
t2 + 1
0
0
0
A(t) =
0
0
0
1
0
2
0
0
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
sin t
0
0
0 0 0
0 0 0
t 0 3
1 0 0
0 1 0
0 0 0
0 0 0
0 1 0
15
59
117
,
c=
19
36
212
and
t0 = .
0
0
t2
0
,
0
1
0
274
Chapter 8
Problems 8.3
Put the following initial-value problems into fundamental form:
1.
d2x
dx
= 2 + 3x + 4y,
dt
dt 2
dy
= 5x 6y,
dt
x(0) = 7,
dx(0)
= 8,
dt
y(0) = 9.
2.
d2x
dx dy
=
+ ,
dt
dt
dt 2
d2y
dy dx
=
,
dt
dt
dt 2
x(0) = 2,
3.
dx(0)
= 3,
dt
y(0) = 4,
dy(0)
= 4.
dt
dx
dy
= t 2 4x,
dt
dt
d2y
= ty + t 2 x,
dt 2
x(2) = 1,
4.
dy(2)
= 0.
dt
y(2) = 0,
dy
dx
= 2 4x + t,
dt
dt
d2y
= ty + 3x 1,
dt 2
x(3) = 0,
y(3) = 0,
dy(3)
= 0.
dt
...
5. x = 2x + y t,
....
y = tx ty + y et ;
x(1) = 2,
...
y(1) = 4.
x (1) = 0,
y(1) = 0,
y (1) = 3,
y (1) = 9,
8.4
6.
275
...
x = x y + y ,
y = x x + 2y ;
x(0) = 21, x (0) = 4,
7. x = y 2,
y = z 2,
z = x + y;
x() = 1, y() = 2,
x (0) = 5,
y () = 17,
y(0) = 5,
y (0) = 7.
z() = 0.
8. x = y + z + 2,
y = x + y 1,
z = x z + 1;
x(20) = 4,
x (20) = 4,
y(20) = 5,
y (20) = 5,
z(20) = 9,
z (20) = 9.
8.4
(29)
x(t0 ) = c.
The differential equation in (29) can be written as
x (t) Ax(t) = f(t).
(30)
(31)
(32)
276
Chapter 8
(33)
t0
t
,
d + At
e
x(t) dt =
eAt f(t) dt
dt
t0
t
eAs f(s) ds.
=
(34)
t0
Note that we have replaced the dummy variable t by the dummy variable s in
the right-hand integral of (34), which in no way alters the denite integral (see
Problem 1).
Upon evaluating the left-hand integral, it follows from (34) that
e
At
#t
t
#
x(t)## =
eAs f(s)ds
t0
t0
or that
eAt x(t) = eAt0 x(t0 ) +
eAs f(s)ds.
(35)
t0
eAs f(s)ds,
(36)
t0
$
%1
Premultiplying both sides of (36) by eAt
, we obtain
%1 At
%1
$
$
e 0 c + eAt
x(t) = eAt
(37)
t0
x(t) = e e
c+e
At
t0
(38)
8.4
277
Since At and At0 commute (why?), we have from Problem 10 of Section 7.8,
eAt eAt0 = eA(tt0 ) .
(39)
(40)
t0
Equation (40) is the unique solution to the initial-value problem given by (29).
A simple method for calculating the quantities eA(tt0 ) , and eAs is to rst
compute eAt (see Section 7.6) and then replace the variable t wherever it appears
by the variables t t0 and (s), respectively.
Example 1
A=
At
e
Hence,
A(tt0 )
e(tt0 )
=
0
(t t0 ) e(tt0 )
e(tt0 )
and
e
As
es
=
0
ses
.
es
278
Chapter 8
x(t) = eA(tt0 ) c +
t0
(41)
t0
Again the quantity eA(ts) can be obtained by replacing the variable t in eAt by
the variable (t s).
In general, the solution x(t) may be obtained quicker by using (41) than by using
(40), since there is one less multiplication involved. (Note that in (40) one must
premultiply the integral by eAt while in (41) this step is eliminated.) However,
since the integration in (41) is more difcult than that in (40), the reader who
is not condent of his integrating abilities will probably be more comfortable
using (40).
If one has a homogeneous initial-value problem with constant coefcients, that
is, a system dened by
x (t) = Ax(t),
x (t0 ) = c,
(42)
(43)
(45)
8.4
279
Example 2
u(t)
x(t) =
,
v(t)
1
A=
4
2
,
3
1
f(t) =
,
1
and
1
c=
.
2
(46)
1 2e5t + 4et
6 4e5t 4et
2e5t 2et
.
4e5t + 2et
Hence,
eAs =
1 2e5s + 4es
6 4e5s 4es
2e5s 2es
4e5s + 2es
and
eA(tt0 ) = eAt ,
since
t0 = 0.
Thus,
e
A(tt0 )
1 2e5t + 4et
c=
6 4e5t 4et
2e5t 2et
4e5t + 2et
1
2
5t
t + 2 2e5t 2et
1 1 2e + 4e
=
6 1 4e5t 4et + 2 4e5t + 2et
e5t
=
,
2e5t
(47)
and
eAs f(s) =
1 2e5s + 4es
6 4e5s 4es
2e5s 2es
4e5s + 2es
1
1
1 1 2e5s + 4es 1 2e5s 2es
es
=
.
=
5s
es
6 1 4e
4es 1 4e5s + 2es
280
Chapter 8
Hence,
s t
t
e
ds
t
e 1
e |0
0
As
e
f(s)ds = t
=
=
et + 1
es |t0
t0
s
e ds
t
and
e
At
As
t0
1 2e5t + 4et
f(s)ds =
6 4e5t 4et
2e5t 2et
$
$
et 1
4e5t + 2et
1 et
t
5t
5t
t
t
t
1 2e + 4e [e 1] + 2e 2e [1 e ]
=
6 4e5t 4et [et 1] + 4e5t + 2et [1 et ]
%
$
1 et %
$
.
=
1 + et
(48)
or
u(t) = e5t et + 1,
v(t) = 2e5t + et 1.
Example 3
0
A=
2
1
,
3
f(t) =
e3t
At
e2t + 2et
=
2e2t + 2et
e2t et
.
2e2t et
and
1
c=
.
0
8.4
281
Thus,
A(tt0 )
c=
e2(t1) + 2e(t1)
e2(t1) e(t1)
0
(49)
Now
f(t) =
e3t
f(s) =
e3s
,
and
e
As
e2s + 2es
f(s) =
2e2s + 2es
=
e2s es
2e2s es
e3s
e5s e4s
.
2e5s e4s
Hence,
t0
5s
4s
e
ds
e
eAs f(s)ds = 1 t
5s
4s
e
2e
ds
1
15 e5t + 41 e4t + 15 e5 41 e4
=
,
2
1
2
1
5t
4t
5
4
5 e + 4 e + 5 e 4 e
and
eAt
eAs f(s)ds
t0
$
%
$ 2t
% 1 5t 1 4t 1 5 1 4
2t
t
t
e
+
e
+
e
e + 2e
e e
4
4
5
5
= $
% $ 2t
%
2t
t
t
2
1
2
1
5t
4t
5
4
2e + 2e
2e e
5e + 4e + 5e 4e
1 3t
20 e
+ 15 e(2t5) 41 et4
=
.
3 3t
2 (2t5)
1 t4
20 e + 5 e
4e
(50)
282
Chapter 8
e2(t1) + 2et1
x1 (t)
=
x2 (t)
2e2(t1) + 2et1
e2(t1) + 2et1 +
=
2e2(t1) + 2et1
1 3t
20 e
3 3t
20
e
1 3t
20 e
3 3t
20 e
+ 15 e(2t5) 41 et4
+ 25 e(2t5) 41 et4
+ 15 e(2t5) 41 et4
+
2 (2t5)
5e
1 t4
4e
1 3t
20 e
+ 15 e(2t5) 41 et4 .
The1most tedious step in Example 3 was multiplying the matrix eAt by the
t
vector t0 eAs f(s)ds. We could have eliminated this multiplication had we used
(41) for the solution rather than (40). Of course, in using (41), we would have had
to handle an integral rather more complicated than the one we encountered.
If A and f(t) are relatively simple (for instance, if f(t) is a constant vector), then
the integral obtained in (41) may not be too tedious to evaluate, and its use can
be a real savings in time and effort over the use of (40). We illustrate this point in
the next example.
Example 4
x(t) =
A=
0
1
1
,
0
f(t) =
0
,
2
and
c=
0
.
1
(51)
8.4
283
Thus,
A(tt0 )
cos (t ) sin (t )
c=
sin (t ) cos (t )
0
1
=
sin (t )
,
cos (t )
(52)
and
A(ts)
cos (t s)
f(s) =
sin (t s)
0
2
sin (t s)
cos (t s)
2 sin (t s)
=
.
2 cos (t s)
Hence,
t
t0
2 sin (t s) ds
eA(ts) f(s)ds = t
2 cos (t s)ds
2 2 cos (t )
=
.
2 sin (t )
(53)
Substituting (52) and (53) into (41) and using the trigonometric identities
sin(t ) = sin t and cos(t ) = cos t, we have
x1 (t)
sin (t )
2 2 cos (t )
= x(t) =
+
cos (t )
2 sin (t )
x2 (t)
sin t + 2 cos t + 2
=
.
cos t 2 sin t
Thus, since x(t) = x1 (t), it follows that the solution to the initial-value problem is
given by
x(t) = sin t + 2 cos t + 2.
Example 5
284
Chapter 8
1 2e5t + 4et
6 4e5t 4et
2e5t 2et
4e5t + 2et
k1
k2
1 k1 2e5t + 4et + k2 2e5t 2et
=
6 k1 4e5t 4et + k2 4e5t + 2et
1 e5t (2k1 + 2k2 ) + et (4k1 2k2 )
=
.
6 e5t (4k1 + 4k2 ) + et (4k1 + 2k2 )
(54)
or
)
)
(
4k1 2k2 t
2k1 + 2k2 5t
e +
e
6
6
)
)
(
(
2k1 + 2k2 5t
4k1 + 2k2 t
e +
e .
v(t) = 2
6
6
(
u(t) =
(55)
We can simplify the expressions for u(t) and v(t) if we introduce two new arbitrary
constants k3 and k4 dened by
k3 =
2k1 + 2k2
,
6
k4 =
4k1 2k2
.
6
(56)
(57)
8.4
285
Problems 8.4
1. Show by direct integration that
t 2 dt =
t0
s2 ds =
t0
p2 dp.
t0
f(t) dt =
f(s) ds.
a
1
1
(Hint: Assume f(t) dt = F(t) + c. Hence, f(s) ds = F(s) + c. Use the fundamental theorem of integral calculus to obtain the result.)
2. Derive Eq. (44). (Hint: Follow steps (30)(33). For step (34) use indenite
integration and note that
,
d + At
x(t) dt = eAt x(t) + k,
e
dt
(b) eA(t2)
(d) eA(t2) , if
(c) eA(ts)
t 2 /2
eAt = e3t 0
t .
4. For eAt as given in Problem 3, invert by the method of cofactors to obtain eAt
and hence verify part (a) of that problem.
5. Find (a) eAt ,
At
5t
t
1 2e + 4e
=
6 4e5t 4et
(b) eAs ,
At
2e5t 2et
4e5t + 2et
(c) eA(ts) if
1 sin 3t + 3 cos 3t
=
3
2 sin 3t
5 sin 3t
sin 3t + 3 cos 3t
286
Chapter 8
Solve the systems given in Problems 7 through 14 by matrix methods. Note that
Problems 7 through 10 have the same coefcient matrix.
7. x (t) = 2x(t) + 3y(t),
x(2) = 2,
x(1) = 1,
y(2) = 4.
x(0) = 1,
x (0) = 0.
13. x x 2x = et ;
x (1) = 2,
x (1) = 3
14. x = 2x + 5y + 3,
x (0) = 0.
y = x 2y;
x(0) = 0,
8.5
y(1) = 1.
x (0) = 0,
y(0) = 1.
(58)
Note that A(t) may now depend on t, hence the analysis of Section 8.4 does not
apply. However, since we still require both A(t) and f(t) to be continuous in some
interval about t = t0 , Theorem 1 of Section 8.1 still guarantees that (58) has a
unique solution. Our aim in this section is to obtain a representation for this
solution.
Denition 1 A transition (or fundamental) matrix of the homogeneous equation x (t) = A(t)x(t) is an n n matrix (t, t0 ) having the properties that
(a)
d
(t, t0 ) = A(t)(t, t0 ),
dt
(b) (t0 , t0 ) = I.
(59)
(60)
Here t0 is the initial time given in (58). In the Final Comments to this chapter, we
show that (t, t0 ) exists and is unique.
8.5
287
Example 1
Solution Consider the matrix eA(tt0 ) . From Property 1 of Section 7.8, we have
that eA(t0 t0 ) = e0 = I, while from Theorem 1 of Section 7.9, we have that
d A(tt0 )
d At At0
e
e e
=
dt
dt
= AeAt eAt0 = AeA(tt0 ) .
Thus, eA(tt0 ) satises (59) and (60). Since (t, t0 ) is unique, it follows for the case
where A is a constant matrix that
(t, t0 ) = eA(tt0 ) .
(61)
(62)
(63)
Proof. If A(t) is a constant matrix, (63) reduces to (43) (see (61)), hence
Theorem 1 is valid. In general, however, we have that
dx(t)
d
d
=
[(t, t0 )c] =
[(t, t0 )] c,
dt
dt
dt
= A(t)(t, t0 )c
{from (59),
= A(t)x(t)
{from (63),
and
x(t0 ) = (t0 , t0 )c,
= Ic
= c.
$ %
{from 60 ,
288
Chapter 8
Example 2
y(1) = 1,
&
'
&
'
t 2 t02
t 2 t02
sin
cos
2
2
(t, t0 ) =
'
&
'.
&
2
2
2
2
t t0
sin t t0
cos
2
2
Thus, from (63), we have
'
t2 1
cos
x(t) =
'
&
2
sin t 1
2
&
'
t2 1
sin
2
0
&
' 1
2
t 1
cos
2
&
'
t2 1
sin
=
' .
&
2
cos t 1
2
&
&
'
t2 1
y(t) = cos
.
2
8.5
289
The transition matrix also enables us to give a representation for the solution
of the general time-varying initial-value problem
x (t) = A(t)x(t) + f(t),
x(t0 ) = c.
(58)
(t, s)f(s)ds.
(64)
t0
(65)
290
Chapter 8
Thus, Property 1 is immediate. For the more general case, that in which A(t)
depends on t, the argument runs as follows: Consider the initial-value problem
x (t) = A(t)x(t)
x(t0 ) = c.
(66)
(67)
x(t1 ) = (t1 , t0 )c
(68)
(69)
Hence,
and
where t1 is any arbitrary time greater than . If we designate the vector x(t1 ) by d
and the vector x() by b, then we can give the solution graphically by Figure 8.1.
Consider an associated system governed by
x (t) = A(t)x(t),
x() = b.
(70)
We seek a solution to the above differential equation that has an initial value b at
the initial time t = . If we designate the solution by y(t), it follows from Theorem 1
that
y(t) = (t, )b,
(71)
(72)
hence
x(t )
x(t1) 5 d
Solution curve
x(t ) 5 (t, t0)c
x( ) 5 b
x(t0) 5 c
t0
Figure 8.1
t1
8.5
291
But now we note that both x(t) and y(t) are governed by the same equation of
motion, namely x (t) = A(t)x(t), and both x(t) and y(t) go through the same point
(, b). Thus, x(t) and y(t) must be the same solution. That is, the solution curve for
y(t) looks exactly like that of x(t), shown in Figure 8.1 except that it starts at t = ,
while that of x(t) starts at t = t0 . Hence,
x(t) = y(t),
t ,
and, in particular,
x(t1 ) = y(t1 ).
(73)
(74)
(75)
(76)
Since c may represent any n-dimensional initial state, it follows from Lemma 1
that
(t1 , t0 ) (t1 , )(, t0 ) = 0
or
(t1 , t0 ) = (t1 , )(, t0 ).
(77)
Since t 1 is arbitrary, it can be replaced by t; Eq. (77) therefore implies Eq. (65).
Property 2
(78)
292
Chapter 8
cos
sin
2
2
&
'
&
'.
2 t2
2 t2
t
t
0
0
sin
cos
2
2
&
Solution This matrix is a transition matrix. (See Problem 1.) Hence using (78)
we nd the inverse to be
&
t02
t2
'
cos
'
&
2
2
sin t0 t
2
'
&
'
t 2 t02
t 2 t02
cos
sin
sin
2
2
2
.
&
' =
&
'
&
'
2
2
2
2
2
2
t t
t t0
sin t t0
cos 0
cos
2
2
2
&
t02
t2
'
&
Here we have used the identities sin() = sin and cos() = cos .
8.5
293
x(t) = (t, t0 )c +
(t, s)f(s)ds.
(79)
t0
Using Property 1, we have that (t, s) = (t, t0 )(t0 , s); hence, (79) may be
rewritten as
t
(t0 , s)f(s)ds.
(80)
x(t) = (t, t0 )c + (t, t0 )
t0
Now
x(t0 ) = (t0 , t0 )c + (t0 , t0 )
t0
(t0 , s)f(s)ds
t0
= Ic + I0 = c.
Thus, the initial condition is satised by (80). To show that the differential equation
is also satised, we differentiate (80) and obtain
t
dx(t)
d
=
(t, t0 )c + (t, t0 )
(t0 , s)f(s)ds
dt
dt
t0
=
t
d
d
(t, t0 ) c +
(t, t0 )
(t0 , s)f(s)ds
dt
dt
t0
+ (t, t0 )
d
dt
(t0 , s)f(s)ds
t0
= A(t)(t, t0 )c + A(t)(t, t0 )
(t0 , s)f(s)ds
t0
(t0 , s)f(s)ds
t0
294
Chapter 8
We conclude this section with one nal property of the transition matrix, the
proof of which is beyond the scope of this book.
Property 3
det (t, t0 ) = exp
!
tr [A(t)] dt .
(81)
t0
Since the exponential is never zero, (81) establishes that det (t, t0 ) = 0, hence,
we have an alternate proof that (t, t0 ) is invertible.
Problem 8.5
1. Use (59) and (60) to show that
'
&
'
t 2 t02
t 2 t02
sin
cos
2
2
(t, t0 ) =
&
'
&
'
2
2
2
2
t t0
t t0
sin
cos
2
2
&
g(s)ds
cos
t0
(t, t0 ) =
g(s)ds
sin
t0
sin
g(s)ds
cos
g(s)ds
t0
t0
rst, c =
0. , second, c = 0. , etc.)
..
..
0
0
8.6
295
Final Comments
t1
1
(t1 , t0 )
t0
=
t1
1
.
t0
8.6
e1 = 0, e2 = 0, e3 = 0, . . . , en =
0.
.
.
.
.
..
..
..
..
0
0
0
1
(82)
Thus,
[e1
e2
e3
1
0
en ] =
0.
..
0
0
1
0
..
.
0
0
0
0
..
.
0
0
0
0
= I.
..
.
1
(83)
(j = 1, 2, . . . , n),
(84)
where A(t) and t0 are taken from (58). For each j(j = 1, 2, . . . , n), Theorem 1
of Section 8.1 guarantees the existence of a unique solution of (84); denote this
solution by xj (t). Thus, x1 (t) solves the system
x 1 (t) = A(t)x1 (t)
x1 (t0 ) = e1 ,
(85)
296
Chapter 8
(86)
(87)
x2 (t0 )
e2
xn (t0 )]
en ]
=I
{from (85)(87)
{from (83)
and
d(t, t0 )
d
= [x1 (t) x2 (t)
dt
dt
= [x1 (t) x 2 (t)
xn (t)]
x n (t)]
{from (85)(87)
9
Probability and Markov
Chains
9.1
297
298
Chapter 9
Using mathematical notation, we can let S be the set that represents the
sample space and let E be the set that represents the desired event. Since the
number of objects (size) in any set is called the cardinal number, we can write
N(S) = 52 and N(E) = 1, to represent the cardinal number of each set. So we now
write
P(E) =
1
N(E)
=
N(S)
52
(1)
13
1
N(F )
=
= .
N(S)
52
4
(2)
Another example would be to consider a fair die; lets call it D. Since there are
six faces, there are six equally likely outcomes (1, 2, 3, 4, 5, or 6) for every roll of
the die, N(D) = 6. If the event G is to roll a 3, then
P(G) =
N(G)
1
= .
N(D)
6
(3)
Experiment by rolling a die many times. You will nd that the proportion
of times a 3 occurs is close to one-sixth. In fact, if this is not the case, the die is
most probably not fair.
Remark 1 From this example it is clear that the probability of any of the six
outcomes is one-sixth. Note, too, that the sum of the six probabilities is 1. Also,
the probability of rolling a 7 is zero, simply because there are no 7s on the any
of the six faces of the die.
Because of the above examples, it is most natural to think of probability as
a number. This number will always be between 0 and 1. We say that an event is
certain if the probability is 1, and that it is impossible if the probability is 0. Most
probabilities will be strictly between 0 and 1.
To compute the number we call the probability of an event, we will adopt the
following convention. We will divide the number of ways the desired event can occur
by the total number of possible outcomes. We always assume that each member of
the sample space is just as likely to occur as any other member. We call this a
relative frequency approach.
9.1
299
Example 1 Consider a fair die. Find the probability of rolling a number that is
a perfect square.
As before, the size of the sample space, D, is 6. The number of ways a perfect
square can occur is two: only 1 or 4 are perfect squares out of the rst six
positive integers. Therefore, the desired probability is
P(K) =
N(K)
2
1
= = .
N(D)
6
3
(4)
N(Z)
6
1
=
= .
N(R)
36
6
(5)
R\G
1
2
3
4
5
6
Figure 9.1
1
2
3
4
5
6
7
2
3
4
5
6
7
8
3
4
5
6
7
8
9
4
5
6
7
8
9
10
5
6
7
8
9
10
11
6
7
8
9
10
11
12
300
Chapter 9
In the next section we will give some rules that pertain to probabilities, and
investigate the meaning of probability more fully.
Problems 9.1
1. Find the sample space, its cardinal number and the probability of the desired
event for each of the following scenarios:
a) Pick a letter, at random, out of the English alphabet. Desired Event:
choosing a vowel.
b) Pick a date, at random, for the Calendar Year 2008. Desired Event:
choosing December 7th.
c) Pick a U.S. President, at random, from a list of all the presidents. Desired
Event: choosing Abraham Lincoln.
d) Pick a U.S. President, at random, from a list of all the presidents. Desired
Event: choosing Grover Cleveland.
e) Pick a card, at random, from a well shufed deck of regular playing cards.
Desired Event: choosing the Ace of Spades.
f) Pick a card, at random, from a well shufed deck of Pinochle playing cards.
Desired Event: choosing the Ace of Spades.
g) Roll a pair of fair dice. Desired Event: getting a roll of 2 (Snake Eyes).
h) Roll a pair of fair dice. Desired Event: getting a roll of 12 (Box Cars).
i) Roll a pair of fair dice. Desired Event: getting a roll of 8.
j) Roll a pair of fair dice. Desired Event: getting a roll of 11.
k) Roll a pair of fair dice. Desired Event: getting a roll of an even number.
l) Roll a pair of fair dice. Desired Event: getting a roll of a number that is a
perfect square.
m) Roll a pair of fair dice. Desired Event: getting a roll of a number that is a
perfect cube.
n) Roll a pair of fair dice. Desired Event: getting a roll of a number that is a
multiple of 3.
o) Roll a pair of fair dice. Desired Event: getting a roll of a number that is
divisible by 3.
p) Roll a pair of fair dice. Desired Event: getting a roll of 13.
2. Suppose we were to roll three fair dice: a red one rst, followed by a white die,
followed by a blue die. Describe the sample space and nd its cardinal number.
3. Suppose the probability for event A is known to be 0.4. Find the cardinal
number of the sample space if N(A) = 36.
9.2
301
4. Suppose the probability for event B is known to be 0.65. Find the cardinal
number of B, if the cardinal number of the sample space, S, is N(S) = 3000.
9.2
N(A)
,
N(S)
(6)
where the numerator and denominator are the respective cardinal numbers of A
and S.
We will now give a number of denitions and rules which we will follow regarding the computations of probabilities. This list formalizes our approach. We note
that the reader can nd the mathematical justication in any number of sources
devoted to more advanced treatments of this topic. We assume that A, B, C . . . are
any events and that S is the sample space. We also use to denote an impossible
event.
(7)
Remark 1 Here we use both the union () and intersection () notation from
set theory. For this rule, we subtract off the probability of the common even in
order not to count it twice. For example, if A is the event of drawing a King
from a deck of regular playing cards, and B is the event of drawing a Diamond,
clearly the King of Diamonds is both a King and a Diamond. Since there are 4
4
. And since there are 13
Kings in the deck, the probability of drawing a King is 52
13
. Since there
Diamonds in the deck, the probability of drawing a Diamond is 52
is only one King of Diamonds in the deck, the probability of drawing this card is
1
. We note that
clearly 52
4
13
1
16
+
=
,
52 52 52
52
(8)
which is the probability of drawing a King or a Diamond, because the deck contains
only 16 Kings or Diamonds.
(9)
302
Chapter 9
Remark 2 Two events are disjoint if they are mutually exclusive; that is, they
cannot happen simultaneously. For example, the events of drawing a King from a
deck of regular playing cards and, at the same time, drawing a Queen are disjoint
events. In this case, we merely add the individual probabilities. Note, also, that
since A and B are disjoint, we can write A B = ; hence, P(A B) = P() = 0.
The reader will also see that equation (9) is merely a special case of equation (7).
(10)
Remark 3 This follows from the fact that either an event occurs or it doesnt.
Therefore, P(A AC ) = 1; but since these two events are disjoint, P(A AC ) =
P(A) + P(AC ) = 1. Equation (10) follows directly.
For example, if the probability of rolling a 3 on a fair die is 16 , then the
probability of not rolling a 3 is 1 16 = 56 .
In the next section we will introduce the idea of independent events, along with
associated concepts. For the rest of this section, we give a number of examples
regarding the above rules of probability.
Example 1 Given a pair of fair dice, nd the probability of rolling a 3 or a 4.
Since these events are disjoint, we use Equation (9) and refer to Figure 9.1 and
2
3
5
+ 36
= 36
.
obtain the desired probability: 36
9.2
303
Example 4 Pick a card at random out of a well shufed deck of regular playing
cards. Find the probability of drawing a red card or a picture card.
We know there are 12 picture cards as well as 26 red cards (Hearts or Diamonds). But our events are not disjoint, since six of the picture cards are
red. Therefore, we apply Equation (7) and compute the desired probability as
12
26
6
32
52 + 52 52 = 52 .
Example 5 Suppose events A and B are not disjoint. Find P(A B), if it is given
that P(A) = 0.4, P(B) = 0.3, and P(A B) = 0.6.
Recall Equation (7): P(A B) = P(A) + P(B) P(A B). Therefore, 0.6 =
0.4 + 0.3 P(A B). Therefore, P(A B) = 0.1.
Example 6 Extend formula (7) for three non-disjoint events. That is, consider
P(A B C).
By using parentheses to group two events, we have the following equation
below:
P(A B C) = P((A B) C) = P(A B) + P(C) P((A B)) C).
(11)
From Set Theory, we know that the last term of (11) can be written as
P((A B) C) = P((A C) (B C)). Hence, applying (7) to (11) yields
P(A C) + P(B C) P((A C) (B C)).
(12)
But the last term of (12) is equivalent to P(A B C). After applying (7) to
the P(A B) term in (11), we have
P(A B C) = P(A) + P(B) + P(C) P(A B)
[P(A C) + P(B C) P((A C) (B C))].
(13)
(14)
Remark 5 Equation (14) can be extended for any nite number of events, and
it holds even if some events are pairwise disjoint. For example, if events A and B
are disjoint, we merely substitute P(A B) = 0 into (14).
304
Chapter 9
Problems 9.2
1. Pick a card at random from a well shufed deck of regular playing cards. Find
the probabilities of:
a) Picking an Ace or a King.
b) Picking an Ace or a picture card.
c) Picking an Ace or a black card.
d) Picking the Four of Diamonds or the Six of Clubs.
e) Picking a red card or a Deuce.
f) Picking a Heart or a Spade.
g) Not choosing a Diamond.
h) Not choosing a Queen.
i) Not choosing an Ace or a Spade.
2. Roll a pair of fair dice. Find the probabilities of:
a) Getting an odd number.
b) Rolling a prime number.
c) Rolling a number divisible by four.
d) Not rolling a 7.
e) Not rolling a 6 or an 8.
f) Not rolling a 1.
3. See Problem 2 of Section 9.1. Roll the three dice. Find the probabilities of:
a) Getting an odd number.
b) Rolling a 3.
c) Rolling an 18.
d) Rolling a 4.
e) Rolling a 17.
f) Rolling a 25.
g) Not rolling a 4.
h) Not rolling a 5.
i) Not rolling a 6.
4. Consider events A and B. Given P(A) = .7 and P(B) = .2, nd the probability
of A or B if P(A B) = .15.
9.3
305
5. Suppose events A and B are equally likely. Find their probabilities if P(A B) =
.46 and P(A B) = .34.
6. Extend Equation (14) for any four events A, B, C, and D.
9.3
(15)
Notice that we use the intersection () notation. This simple rule is called the
multiplication rule.
Remark 3 The reader must be careful not to confuse disjoint events with independent events. The former means that nothing is in common or that the events
cannot happen simultaneously. The latter means that the probabilities do not
inuence one another. Often, but not always, independent events are sequential;
like ipping a coin 10 times in a row.
It is clear that probabilities depend on counting, as in determining the size of
the sample space. We assume the following result from an area of mathematics
306
Chapter 9
known as Combinatorics:
The number of ways we can choose k objects from a given collection of n objects
is given by:
( )
n
n!
=
.
(16)
k
k!(n k)!
. Using (16) we see that 25 =
Since 5! = 120, 2! = 2 and 3! = 6, 25 = 120
12 = 10.
Example 1
Evaluate
5
2
5!
2!(52)!
5!
2!3! .
Example 2 Given a committee of ve people, in how many ways can a subcommittee of size two be formed? The number is precisely what we computed
in the previous example: 10. The reader can verify this as follows. Suppose the
people are designated: A, B, C, D, and E. Then, the 10 sub-committees of size two
are given by: AB, AC, AD, AE, BC, BD, BE, CD, CE, and DE.
Example 3 Given a committee of ve people, in how many ways can a subcommittee of size three be formed? We can use formula (16) to compute the
answer, however the answer must be 10. This is because a sub-committee of 3 is
the complement of a sub-committee or two. That is, consider the three people not
on a particular sub-committee of two, as constituting a sub-committee of three.
For example, if A and B are on one sub-committee of size two, put C, D, and E on
a sub-committee of size three. Clearly there are 10 such pairings.
9.3
307
Let H represent getting a head and T represent getting a tail. Since the coin
is fair, P(H) = P(T) = 21 . The only way we can obtain exactly one head in two tosses
is if the order of outcomes is either HT or TH. Note that the events are disjoint
or mutually exclusive; that is, we cannot get these two outcomes at the same time.
Hence, Equation (9) will come into play. And because of the independence of the
coin ips (each toss is a Bernoulli trial), we will use Equation (15) to determine
the probability of obtaining HT and TH.
Therefore, the probability of getting exactly one H is equal to
P(HT TH ) = P(HT ) + P(TH ) = P(H )P(T ) + P(T )P(H )
=
1 1 1 1
1
+ = .
2 2 2 2
2
(17)
Remark 7 Note that (17) could have been obtained by nding the probability of
HT in that orderand then multiplying it by the number of times (combinations)
we could get exactly one H in two tosses.
10
1
) ( ) ( )9
1
1
10
.
=
2
2
1024
(18)
Here, the rst factor gives the number of ways we can get exactly one H
in 10 tosses; this is where mutual exclusivity comes in. The second factor is the
probability of getting one H, and the third factor is the probability of getting nine
T s; the independence factor is employed here.
Example 6 Suppose we ip a fair coin 10 times. Find the probability of getting
exactly ve Hs.
308
Chapter 9
Since there are 10
5 = 252 ways of getting ve Hs in 10 tosses, the desired
probability is given by
(
10
5
) ( )5 ( )5
1
252
1
.
=
2
2
1024
(19)
)
10
(.3)5 (.7)5 0.103.
5
( )
( )
( )
)
10
10
10
10
5
5
6
4
7
3
(.3) (.7) +
(.3) (.7) +
(.3)8 (.7)2
(.3) (.7) +
6
7
8
5
(
+
)
( )
10
10
9
1
(.3) (.7) +
(.3)10 (.7)10 0.150.
9
10
(20)
9.3
309
)
10
(.3)0 (.7)10 0.972.
0
(21)
(22)
Problems 9.3
1. Evaluate the following:
b) 71 ;
a) 26 ;
f)
1000
1000
;
g)
1000
0
8
5
c)
;
h)
;
100
99
d)
;
i)
20
18
1000
999
;
e)
20
2
;
j)
0
0
2. How many different nine-player line-ups can the New York Yankees produce
if there are 25 players on the roster and every player can play every position?
3. Suppose 15 women comprised a club, and a committee of 6 members was
needed. How many different committees would be possible?
4. Toss a fair die eight times. Find the probability of:
a) Rolling exactly one 5.
b) Rolling exactly three 5s.
c) Rolling at least three 5s.
d) Rolling at most three 5s.
e) Rolling at least one 5.
5. Suppose event A has a probability of occurring equal to .65. Without evaluating the expressions, nd the following probabilities given 500 independent
Bernoulli trials.
310
Chapter 9
9.4
.7
P=
.9
.3
.1
(23)
represents the following probabilities. The element in the rst row and rst column
represents the probability of Family (1) originally residing in state A remaining
in state A, while the element in the rst row and second column represents the
probability of starting in state A and then moving to state B. Note that these two
probabilities add to one.
Similarly, let the element in the second row and rst column, represent the
probability of Family (2) starting in state B and moving to state A, while the element in the second row and second column, represents the probability of starting
in state B and remaining in state B. Here, too, these two probabilities add to one.
Note that we can consider the process as time conditioned in the sense that
there is a present and a future (for example, one year from the present).
Such a matrix is called a transition matrix and the elements are called
transitional probabilities.
9.4
311
.76
P =
.72
2
.24
.
.28
(24)
What does P 2 represent? To answer this question, let us ask another question:
From the perspective of Family (1), what is the probability of being in state A after
two years?
There are two ways Family (1) can be in state A after two years:
Scenario 2: The family moved to state B after one year and then moved back
to state A after the second year.
The probability of the rst scenario is .7(.7) = .49, because these events can be
considered independent. The probability of the second is .3(.7) = .27.
Because these events are disjoint, we add the probabilities to get .76.
Note that this is the element in the rst row and rst column of P 2 .
By similar analyses we nd that P 2 is indeed the transitional matrix of our
Moving Situation after two time periods.
Matrix P is the transition matrix for a Markov chain. The sum of the probabilities of each row must add to one, and by the very nature of the process, the
matrix must be square. We assume that at any time each object is in one and only
one state (although different objects can be in the same state). We also assume
that the probabilities remain constant over the given time period.
Remark 1 The notation p(n)ij is used to signify the transitional probability of
moving from state i to state j over n time periods.
Example 2 Suppose Moe, Curly, and Larry live in the same neighborhood. Let
the transition matrix
.7 .1 .2
S = .5 .3 .2
.8 .1 .1
(25)
represent the respective probabilities of Moe, Curly, and Larry staying at home
on Monday and either visiting one of their two neighbors or staying home on
Tuesday. We ask the following questions regarding Thursday:
a) What is the probability of Moe going to visit Larry at his home, p(3)13 ?
b) What is the probability of Curly being at his own home, p(3)22 ?
312
Chapter 9
.694
P 3 = .124
.695
.124
.132
.124
.182
.182 .
.181
(26)
So, our answers are as follows: a) the probability is .182, the entry in the rst
row and third column; b) the probability is .132, the entry in the second row and
second column.
.7
.1
K = .15 .75
.3 .2
.2
.1.
.5
(27)
.565
K2 = .2475
.39
.185
.5975
.28
.25
.155.
.33
(28)
.5
.3
.5
.
.7
(29)
We nd that if we raise this matrix to the 9th and 10th powers, our result is the
same. That is,
10
A =A
.375
=
.375
.625
.
.625
(30)
and the same result occurs for any higher power of A. This absorbing quality
implies that sooner or later the transitional probabilities will stabilize.
9.4
313
Markov processes are used in many areas including decision theory, economics
and political science.
Problems 9.4
1. Why are the following matrices not transitional matrices?
.1 .2 .7
0 1
.6 .5
.1
a)
;
b)
;
c) 1 0 0 ;
d)
1 2
.4 .5
.2
0 0 0
.5
.6
.4
.2
.1 .2
.7
1 0
0 1
d)
;
e)
;
f) .5 .25 .25
0 .1
1 0
.3
.3 .4
3. Consider the c) and d) matrices in the previous problem; show that these
matrices are absorbing matrices.
4. Consider the e) matrix in Problem (2). Raise the matrix to the powers of 2, 3,
4, 5, 6, 7, and 8. What do you notice? Can you construct a scenario for which
this transitional matrix could be a model?
.6 .4
5. Consider the following transitional matrix:
. Find p(2)11 , p(2)21 , p(3)12 ,
.1 .9
and p(3)22 .
6. Consider a game called Red-Blue. The rules state that after one turn Red can
become Blue or remain Red. The same is true with Blue. Suppose you make a
bet that after ve turns, Red will be in the Red category. You are told that the
following probabilities are valid:
314
Chapter 9
9.5
10
Real Inner Products and
Least-Square
10.1
Introduction
To any two vectors x and y of the same dimension having real components (as
distinct from complex components), we associate a scalar called the inner product,
denoted as x, y, by multiplying together the corresponding elements of x and y,
and then summing the results. Students already familiar with the dot product of
two- and three-dimensional vectors will undoubtedly recognize the inner product
as an extension of the dot product to real vectors of all dimensions.
Example 1
Find x, y if
1
x = 2
3
4
y = 5.
6
and
Example 2
30
10] and v = [10 5 8 6].
It follows immediately from the denition that the inner product of real vectors
satises the following properties:
(I1) x, x is positive if x = 0; x, x = 0 if and only if x = 0.
(I2) x, y = y, x.
(I3) x, y = x, y, for any real scalar .
315
316
Chapter 10
Example 3
Solution
x, x.
29.
The concepts of a normalized vector and a unit vector are identical to the
denitions given in Section 1.6. A nonzero vector is normalized if it is divided by
its magnitude. A unit vector is a vector whose magnitude is unity. Thus, if x is any
nonzero vector, then (1/x) x is normalized. Furthermore,
1
1
1
1
x,
x =
x,
x
x x
x
x
(Property I3)
1
1
x, x
x x
(Property I2)
(
=
(
=
1
x
1
x
)2
x, x
)2
x2 = 1,
(Property I3)
10.1
317
Introduction
Problems 10.1
In Problems 1 through 17, nd (a) x, y and (b) x, x for the given vectors.
1
3
1. x =
and y =
.
2
4
2
4
2. x =
and y =
.
0
5
5
3
3. x =
and y =
.
7
5
4. x = [3
14]
and y = [7
3].
5. x = [2 8] and y = [4 7].
2
1
6. x = 0 and y = 2.
1
4
2
4
7. x = 2 and y = 3.
4
3
3
6
8. x = 2 and y = 4.
5
4
1 1 1
9. x = 2 3 6
and y = 13 23 1 .
and y = 1/ 3
10. x = 1/ 2 1/ 3 1/ 6
and y = 41 21 81 .
11. x = 13 31 31
12. x = [10 20 30] and y = [5
1
1
0
1
13. x =
1 and y = 0.
1
1
1
1
2
1
2
2
14. x = 1 and y =
3.
2
4
1
2
3
5
15. x =
7
8
4
6
and y =
9.
8
3].
3/ 2
1.
318
Chapter 10
16. x =
1
1
5
17. x = [1
1
5
1
5
1
5
1]
y= 1
and
and
y = [3
3
8
4
11
5 .
4
7].
1
a = 3,
1
2
b = 1,
1
and
3
c = 0.
1
28. Determine whether it is possible for two nonzero vectors to have an inner
product that is zero.
29. Prove Property I2.
30. Prove Property I3.
31. Prove Property I4.
32. Prove Property I5.
33. Prove that x + y2 = x2 + 2x, y + y2 .
34. Prove the parallelogram law:
x + y2 + x y2 = 2x2 + 2y2 .
35. Prove that, for any scalar ,
0 x y2 = 2 x2 2x, y + y2 .
36. (Problem 35 continued) Take = x, y/x2 and show that
0
x, y2
+ y2 .
x2
10.1
319
Introduction
2 3
.
1 1
41. Compute x, yA for the vectors given in Problem 6 when
1
A = 1
0
1
0
1
0
1.
1
1 1 1
A = 0 1 1.
1 1 1
320
Chapter 10
10.2
Orthonormal Vectors
Denition 1 Two vectors x and y are orthogonal (or perpendicular) if x, y = 0.
Thus, given the vectors
1
1
1
x = 1, y = 1, z = 1,
1
0
0
we see that x is orthogonal to y and y is orthogonal to z since x, y = y, z = 0;
but the vectors x and z are not orthogonal since x, z = 1 + 1 = 0. In particular,
as a direct consequence of Property (I5) of Section 10.1 we have that the zero
vector is orthogonal to every vector.
A set of vectors is called an orthogonal set if each vector in the set is orthogonal
to every other vector in the set. The set given above is not an orthogonal set since
z is not orthogonal to x whereas the set given by {x, y, z},
1
1
1
x = 1, y = 1, z = 1,
1
2
0
is an orthogonal set because each vector is orthogonal to every other vector.
Denition 2 A set of vectors is orthonormal if it is an orthogonal set having the
property that every vector is a unit vector (a vector of magnitude 1).
The set of vectors
1/2
0
1/2
1/ 2 , 1/ 2 , 0
1
0
0
is an example of an orthonormal set.
Denition 2 can be simplied if we make use of the Kronecker delta, ij,
dened by
ij =
1
0
if
if
i = j,
i = j.
(1)
3
xi , xj = ij
i, j = 1, 2, . . . , n.
(2)
The importance of orthonormal sets is that they are almost equivalent to linearly independent sets. However, since orthonormal sets have associated with
them the additional structure of an inner product, they are often more convenient. We devote the remaining portion of this section to showing the equivalence
of these two concepts. The utility of orthonormality will become self-evident in
later sections.
10.2
321
Orthonormal Vectors
(3)
x2 , y1
y1
y1 , y1
y3 = x3
x3 , y1
x3 , y2
y1
y2
y1 , y1
y2 , y2
and, in general,
yj = xj
j1
xj , yk
k=1
yk , yk
yk
(j = 2, 3, . . . , n).
(4)
322
Chapter 10
that the yj terms lack in order to be the required set is that their magnitudes may
not be one. We remedy this situation by dening
qj =
yj
.
yj
(5)
1
1/2
1
1 =
1/2.
2 0
1
Then,
x3 , y1 = 1(1) + 0(1) + 1(0) = 1,
x3 , y2 = 1(1/2) + 0 (1/2) + 1(1) = 1/2,
y2 , y2 = (1/2)2 + (1/2)2 + (1)2 = 3/2,
so
x3 , y1
x3 , y2
1
1/2
y1
y2 = x3 y1
y2
y1 , y1
y2 , y2
2
3/2
1
1/2
2/3
1
1
1
= 0 1 1/2 = 2/3.
2
3
0
1
2/3
1
y3 = x3
10.2
323
Orthonormal Vectors
The vectors y1 , y2 , and y3 form an orthogonal set. To make this set orthonormal, we
note that y1 , y1 = 2, y2 , y2 = 3/2, and y3 , y3 = (2/3)(2/3) + (2/3)(2/3) +
(2/3)(2/3) = 4/3. Therefore,
y1 =
2
y2 =
y3 = y3 , y3 = 2/ 3,
and
y1 , y1 =
y2 , y2 = 3/2,
1
1/ 2
y1
1
1 = 1/ 2 ,
q1 =
=
y1
2 0
0
1/6
1/2
y2
1
1/2 = 1/ 6 ,
=
q2 =
y2
3/2
1
2/ 6
1/3
2/3
1
y3
2/3 = 1/ 3 .
q3 =
=
y3
2/ 3
2/3
1/ 3
x1 =
0,
1
1
2
x2 =
1,
0
0
1
x3 =
2,
1
1
0
x4 =
1.
1
Solution
1
1
y1 = x1 =
0,
1
y1 , y1 = 1(1) + 1(1) + 0(0) + 1(1) = 3,
x2 , y1 = 1(1) + 2(1) + 1(0) + 0(1) = 3,
0
1
x2 , y1
3
y2 = x2
y1 = x2 y1 =
1;
y1 , y1
3
1
324
Chapter 10
2/3
1/3
2
2
= x3 y1 y2 =
4/3;
3
3
1
y3 = x3
(
y3 , y3 =
2
3
)2
(
+
1
3
)2
+
( )2
4
10
,
+ (1)2 =
3
3
2/3
1/2
2
0
5/3
= x4 y1 y2
y3 =
1/3.
3
3
10/3
1/6
y4 = x4
Then
y4 , y4 = (2/3)(2/3) + (1/2)(1/2) + (1/3)(1/3) + (1/6)(1/6)
= 5/6,
and
1/3
1
1/ 3
1
1
=
,
q1 =
3 0 0
1
1/ 3
0
0
1 1
= 1/3 ,
q2 =
1
1/
3
3
1
1/ 3
10.2
325
Orthonormal Vectors
2/30
2/3
1
1/3 = 1/30 ,
q3 =
4/3
4/30
10/3
1
3/ 30
4/30
2/3
1
1/2 = 3/30.
q4 =
1/3
2/30
5/6
1/6
1/ 30
Problems 10.2
1. Determine which of the following vectors are orthogonal:
1
2
2
4
3
x=
, y=
, z=
, u=
, v=
.
2
1
1
2
6
2. Determine which of the following vectors are orthogonal:
1
1
1
1
2
x = 1, y = 1, z = 1, u = 1, v = 1.
2
1
1
0
1
3. Find x so that
x
3
.
is orthogonal to
4
5
4. Find x so that
1
1
x is orthogonal to 2.
3
3
326
Chapter 10
3
,
2
x2 =
3
.
3
1
13. x1 = 2,
1
1
x2 = 0,
1
1
x3 = 0.
2
2
14. x1 = 1,
0
0
x2 = 1,
1
2
x3 = 0.
2
1
15. x1 = 1,
0
2
x2 = 0,
1
2
x3 = 2.
1
0
16. x1 = 3,
4
3
x2 = 5,
0
2
x3 = 5.
5
0
1
17. x1 =
1,
1
1
1
18. x1 =
0,
0
1
1
0
1
x2 =
1, x3 = 0,
1
1
0
1
1
0
x2 =
1, x3 = 1,
0
0
1
1
x4 =
1.
0
1
0
x4 =
0.
1
10.3
327
1
x1 = 1,
0
0
x2 = 1,
1
1
x3 = 0
1
are linearly dependent. Apply the GramSchmidt process to it, and use the
results to deduce what occurs whenever the process is applied to a linearly
dependent set of vectors.
23. Prove that if x and y are orthogonal, then
||x y||2 = ||x||2 + ||y||2 .
24. Prove that if x and y are orthonormal, then
||sx + ty||2 = s2 + t 2
for any two scalars s and t.
25. Let Q be any n n matrix whose columns, when considered as n-dimensional
vectors, form an orthonormal set. What can you say about the product Q Q?
26. Prove that if y, x = 0 for every n-dimensional vector y, then x = 0.
27. Let x and y be any two vectors of the same dimension. Prove that x + y is
orthogonal to x y if and only if ||x|| = ||y||.
28. Let A be an n n real matrix and p be a real n-dimensional column vector.
Show that if p is orthogonal to the columns of A, then Ay, p = 0 for any
n-dimensional real column vector y.
10.3
328
Chapter 10
Figure 10.1
To use Denition 1, we need the cosine of the angle between two vectors, which
requires us to measure the angle. We shall take another approach.
The vectors u and v along with their difference u v form a triangle (see
Figure 10.2) having sides ||u||, ||v||, and ||u v||. It follows from the law of cosines
that
||u v||2 = ||u||2 + ||v||2 2||u|| ||v|| cos ,
whereupon
1
[||u||2 + ||v||2 ||u v||2 ]
2
1
= [u, u + v, v u v, u v]
2
= u, v.
u2v
Figure 10.2
10.3
329
Thus, the dot product of two-dimensional vectors is the inner product of those
vectors. That is,
u v = ||u|| ||v|| cos = u, v.
(6)
u, v
,
||u|| ||v||
(7)
and use Eq. (7) to calculate the angle between two vectors.
Example 1
Solution u, v =
2(3) + 5(4) = 14, u = 4 + 25 = 29, v = 9 + 16 =
5, so cos = 14/(5 29) = 0.1599, and = 58.7 .
Eq. (7) is used to dene the angle between any two vectors of the same, but
arbitrary dimension, even though the geometrical signicance of an angle becomes
meaningless for dimensions greater than three. (See Problems 9 and 10.)
A problem that occurs often in the applied sciences and that has important
ramications for us in matrices involves a given nonzero vector x and a nonzero
reference vector a. The problem is to decompose x into the sum of two vectors,
u + v, where u is parallel to a and v is perpendicular to a. This situation is illustrated
in Figure 10.3. In physics, u is called the parallel component of x and v is called the
x
u
Figure 10.3
330
Chapter 10
a, x
a
a, a
and
v=x
a, x
a.
a, a
2
x=
7
a, x
22 3
2.64
=
,
a=
3.52
a, a
25 4
v =xu =
2
2.64
4.64
=
.
7
3.52
3.48
a, x
a.
a, a
(8)
10.3
331
a, x
a is orthogonal to a.
a, a
(9)
That is, if we subtract from a nonzero vector x its projection onto another nonzero
vector a, we are left with a vector that is orthogonal to a. (See Problem 23.)
In this context, the GramSchmidt process, described in Section 10.2, is almost
obvious. Consider Eq. (4) from that section:
3
j1 2
xj , yk
2
3 yk
yj = xj
y ,y
k=1 k k
(4 repeated)
The quantity inside the summation sign is the projection of xj onto yk . Thus for
each k (k = 1, 2, . . . , j 1), we are sequentially subtracting from xj its projection
onto yk , leaving a vector that is orthogonal to yk .
We now propose to alter slightly the steps of the GramSchmidt orthonormalization process. First, we shall normalize the orthogonal vectors as soon as they
are obtained, rather than waiting until the end. This will make for messier hand
calculations, but for a more efcient computer algorithm. Observe that if the yk
vectors in Eq. (4) are unit vectors, then the denominator is unity, and need not be
calculated.
Once we have fully determined a yk vector, we shall immediately subtract the
various projections onto this vector from all succeeding x vectors. In particular,
once y1 is determined, we shall subtract the projection of x2 onto y1 from x2 , then
we shall subtract the projection of x3 onto y1 from x3 , and continue until we have
subtracted the projection of xn onto y1 from xn . Only then will we return to x2 and
normalize it to obtain y2 . Then, we shall subtract from x3 , x4 , . . . , xn the projections
onto y2 from x3 , x4 , . . . , xn , respectively, before returning to x3 and normalizing
it, thus obtaining y3 . As a result, once we have y1 , we alter x2 , x3 , . . . , xn so each
is orthogonal to y1 ; once we have y2 , we alter again x3 , x4 , . . . , xn so each is also
orthogonal to y2 ; and so on.
These changes are known as the revised GramSchmidt algorithm. Given a set
of linearly independent vectors {x1 , x2 , . . . , xn }, the algorithm may be formalized
as follows: Begin with k = 1 and, sequentially moving through k = n;
(i) calculate rkk =
xk , xk ,
332
Chapter 10
Example 3 Use the revised GramSchmidt algorithm to construct an orthonormal set of vectors from the linearly independent set {x1 , x2 , x3 }, where
1
x1 = 1,
0
0
x2 = 1,
1
1
x3 = 0.
1
Solution
First Iteration (k = 1)
r11 = x1 , x1 = 2,
1
1/ 2
1
1
1 = 1/ 2 ,
q1 =
x1 =
r11
2 0
0
1
r12 = x2 , q1 = ,
2
1
r13 = x3 , q1 = ,
2
0
x2 x2 r12 q1 = 1
1
1
x3 x3 r13 q1 = 0
1
1/ 2
1/2
1
1/2,
1/ 2 =
2
1
0
1/2
1/ 2
1
1/ 2 = 1/2.
2
1
0
1/6
1/2
1
1
1/2 = 1/ 6 ,
x2 =
q2 =
r22
3/2
1
2/ 6
1
r23 = x3 , q2 = ,
6
1/6
2/3
1/2
1
x3 x3 r23 q2 = 1/2 1/ 6 = 2/3.
6
2/3
1
2/ 6
10.3
333
Third Iteration (k = 3)
Using vectors from the second iteration, we compute
2
x3 , x3 = ,
3
1/3
2/3
1
1
x3 = 2/3 = 1/3 .
q3 =
r33
2/ 3
2/3
1/ 3
r33 =
1
x1 ,
r11
(10)
1
1
x2 =
(x2 r12 q1 ).
r22
r22
(11)
1
1
x3 =
(x3 r13 q1 r23 q2 ).
r33
r33
334
Chapter 10
(12)
Eqs. (10) through (12) form a pattern that is easily extended. Begin with
linearly independent vectors x1 , x2 , . . . , xn , and use the revised GramSchmidt
algorithm to form q1 , q2 , . . . , qn . Then, for any k(k = 1, 2, . . . , n).
xk = r1k q1 + r2k q2 + r3k q3 + + rkk qk .
If we set X = [x1 x2 . . . xn ],
Q = [q1 q2 qn ]
(13)
and
r11 r12
0 r22
0
R=0
..
..
.
.
0
0
r13
r23
r33
..
.
0
r1n
r2n
r3n
;
..
.
rnn
...
(14)
1
X = 1
0
0
1
1
1
0.
1
1/2 1/6
1/3
2 1/
2
1/
2
Q = 1/ 2
1/6 1/3 and R = 0
3/2 1/6.
0
0
2/ 3
0
2/ 6
1/ 3
10.3
335
Example 5
1
1
X=
0
1
1
2
1
0
0
1
2
1
1
0
.
1
1
x1 =
0, x2 = 1, x3 = 2,
1
0
1
1
0
x4 =
1.
1
x1 , x1 = 3 = 1.7321,
1
0.5774
1
1 1
= 0.5774,
q1 =
x1 =
r11
3 0 0.0000
1
0.5774
2
3
r12 = x2 , q1 = 1.7321,
3
2
r13 = x3 , q1 = 1.1547,
3
2
r14 = x4 , q1 = 1.1547,
1
0.5774
0.0000
2
0
0.5774
0.6667
1
0.5774 0.3333
x3 x3 r13 q1 =
2 1.1547 0.0000 = 2.0000,
1
0.5774
0.3333
1
0.5774
0.3333
0
0.5774 0.6667
x4 x4 r14 q1 =
1 1.1547 0.0000 = 1.0000.
1
0.5774
0.3333
336
Chapter 10
Second Iteration (k = 2)
Using vectors from the rst iteration, we compute
r22 =
x2 , x2 = 1.7321,
0.0000
0.0000
1
1
1.0000 = 0.5774,
q2 =
x2 =
1.0000
0.5774
r22
1.7321
1.0000
0.5774
3
2
r23 = x3 , q2 = 1.1547,
3
2
r24 = x4 , q2 = 0.0000,
0.6667
0.0000
0.6667
0.3333
0.5774 0.3333
x3 x3 r23 q2 =
2.0000 1.1547 0.5774 = 1.3333,
0.3333
0.5774
1.0000
0.3333
0.0000
0.3333
0.6667
0.5774 0.6667
x4 x4 r24 q2 =
1.0000 0.0000 0.5774 = 1.0000.
0.3333
0.5774
0.3333
Third Iteration (k = 3)
Using vectors from the second iteration, we compute
x3 , x3 = 1.8257,
0.6667
0.3651
1
1
0.3333 = 0.1826,
q3 =
x3 =
1.3333
0.7303
r33
1.8257
1.0000
0.5477
3
2
r34 = x4 , q3 = 0.9129,
0.3333
0.3651
0.6667
0.6667
0.1826 0.5000
x4 x4 r34 q3 =
1.0000 0.9129 0.7303 = 0.3333.
0.3333
0.5477
0.1667
r33 =
Fourth Iteration (k = 4)
Using vectors from the third iteration, we compute
r44 =
x4 , x4 = 0.9129,
10.3
337
0.6667
0.7303
1
1
0.5000 = 0.5477.
q4 =
x4 =
0.3333
0.3651
r44
0.9129
0.1667
0.1826
With these entries calculated (compare with Example 2 of Section 10.2),
we form
0.5774
0.5774
Q=
0.0000
0.5774
0.3651
0.1826
0.7303
0.5477
0.0000
0.5774
0.5774
0.5774
0.7303
0.5477
0.3651
0.1826
and
1
1
0
1
1
0.5774
0.5774
2
=
1 0.0000
0
0.5774
0.0000
0.5774
1.7321
0
0.5774
0.5774
1.7321
.
1.7321
Problems 10.3
In Problems 1 through 10, determine the (a) the angle between the given vectors,
(b) the projection of x1 onto x2 , and (c) its orthogonal component.
1
2
1
3
, x2 =
.
2. x1 =
, x2 =
.
1. x1 =
2
1
1
5
3. x1 =
3
,
2
x2 =
3
.
3
4. x1 =
4
,
1
x2 =
2
.
8
338
Chapter 10
7
2
5. x1 =
, x2 =
.
2
9
1
7. x1 = 1,
0
0
1
9. x1 =
1,
1
2
x2 = 2.
1
1
1
x2 =
1.
0
2
6. x1 = 1,
0
0
8. x1 = 3,
4
1
2
10. x1 =
3,
4
2
x2 = 0.
2
2
x2 = 5.
5
1
2
x2 =
0.
1
1
2
1
14. 2
2
2
17. 1
0
0
1
20.
1
1
2
.
1
3 3
.
2 3
3 1
2
1 1
2 1
2.
15. 1 0.
16.
1 1.
1
3 5
1 1
0 2
1 2 2
0
1 0.
18. 1 0 2.
19. 3
1 2
0 1 1
4
1 1
1
0
1
1
0 1
1
0
.
.
21.
1 0
0 1 1
1 1
0
0
0
12.
1
1
3
.
5
13.
3
5
0
2
5.
5
a, x
a
a, a
is orthogonal to a.
24. Discuss what is likely to occur in a QR-decomposition if the columns are not
linearly independent, and all calculations are rounded.
10.4
The QR-Algorithm
The QR-algorithm is one of the more powerful numerical methods developed
for computing eigenvalues of real matrices. In contrast to the power methods
described in Section 6.6, which converge only to a single dominant real eigenvalue
of a matrix, the QR-algorithm generally locates all eigenvalues, both real and
complex, regardless of multiplicity.
Although a proof of the QR-algorithm is beyond the scope of this book, the
algorithm itself is deceptively simple. As its name suggests, the algorithm is based
on QR-decompositions. Not surprisingly then, the algorithm involves numerous
arithmetic calculations, making it unattractive for hand computations but ideal
for implementation on a computer.
Like many numerical methods, the QR-algorithm is iterative. We begin with
a square real matrix A0 . To determine its eigenvalues, we create a sequence of
new matrices A1 , A2 , . . . , Ak1 , Ak , . . . , having the property that each new matrix
has the same eigenvalues as A0 , and that these eigenvalues become increasingly
obvious as the sequence progresses. To calculate Ak (k = 1, 2, 3, . . .) once Ak1 is
known, rst construct a QR-decomposition of Ak1 :
Ak1 = Qk1 Rk1 ,
and then reverse the order fo the product to dene
Ak = Rk1 Qk1 .
It can be shown that each matrix in the sequence {Ak } (k = 1, 2, 3, . . .) has identical
eigenvalues. For now, we just note that the sequence generally converges to one
of the following two partitioned forms:
(15)
U
V
- - - - - - - - - - - - - - - - - - - - - - - -
0 0 0 0
b c.
0 0 0 0 d e
(16)
--- ---
S
T
--------------------0 0 0 0 a
or
----- ---
10.4
339
The QR-Algorithm
If matrix (15) occurs, then the element a is an eigenvalue, and the remaining
eigenvalues are found by applying the QR-algorithm a new to the submatrix S.
If, on the other hand, matrix (16) occurs, then two eigenvalues are determined by
solving for the roots of the characteristic equation of the 2 2 matrix in the lower
right partition, namely
2 (b + e) + (be cd) = 0.
340
Chapter 10
The remaining eigenvalues are found by applying the QR-algorithm anew to the
submatrix U.
Convergence of the algorithm is accelerated by performing a shift at each
iteration. If the orders of all matrices are n n, we denote the element in the
(n, n)-position of the matrix Ak1 as wk1 , and construct a QR-decomposition for
the shifted matrix Ak1 wk1 I. That is,
Ak1 wk1 I = Qk1 Rk1 .
(17)
(18)
We dene
Example 1
0
A0 = 0
18
1
0
1
0
1.
7
Solution Using the QR-algorithm with shifting, carrying all calculations to eight
signicant gures but rounding to four decimals for presentation, we compute
7 1 0
A0 (7)I = 0 7 1
18 1 0
0.3624
= 0.0000
0.9320
0.1695
0.9833
0.0659
0.9165
19.3132
0.1818 0.0000
0.3564
0.0000
0.5696
7.1187
0.0000
0.0000
0.9833
0.1818
= Q0 R0 ,
A1 = R0 Q0 + (7)I
19.3132 0.5696
7.1187
= 0.0000
0.0000
0.0000
7
+ 0
0
0.0000
= 0.9165
0.1695
0
7
0
0.0000
0.3624
0.9833 0.0000
0.1818
0.9320
0
0
7
2.7130
0.0648
0.0120
17.8035
1.6449,
6.9352
0.1695
0.9833
0.0659
0.9165
0.1818
0.3564
10.4
341
The QR-Algorithm
17.8035
1.6449
0.0000
0.0260
6.9975
0.0120 0.0000
0.9996
0.0000
6.9352
2.7130
6.8704
A1 (6.9352)I = 0.9165
0.1695 0.0120
0.9911 0.1306
0.9913
= 0.1310
0.0242 0.0153
= Q1 R1 ,
0.0478
A2 = R1 Q1 + (6.9352)I = 0.9414
0.0117
2.9101
0.5954
0.0074
3.5884
6.4565
0.0000
17.4294
3.9562
0.4829
17.5612
4.0322.
6.4525
2.7835
1.1455
0.0001
16.8072
6.5200
6.4056
2.5510
1.5207
0.0000
15.9729
8.3583.
6.4051
0.5511
A3 = 0.7826
0.0001
and
0.9259
A4 = 0.5497
0.0000
A4 has form (15) with
0.9259
S=
0.5497
2.5510
1.5207
and
a = 6.4051.
Example 2
0
1
A0 =
0
0
0
0
1
0
0
0
0
1
25
30
.
18
6
342
Chapter 10
Solution Using the QR-algorithm with shifting, carrying all calculations to eight
signicant gures but rounding to four decimals for presentation, we compute
6
1
A0 (6)I =
0
0
0
6
1
0
0.9864
0.1644
=
0.0000
0.0000
6.0828
0.0000
0.0000
0.0000
0
0
6
1
25
30
18
0
0.1621
0.9726
0.1666
0.0000
0.0270
0.1620
0.9722
0.1667
0.0046
0.0274
0.1643
0.9860
0.9864
6.0023
0.0000
0.0000
0.0000
0.9996
6.0001
0.0000
29.5918
28.1246
13.3142
2.2505
0.0266
0.0044
0.9996
0.0000
4.9275
4.6881
2.3856
0.3751
29.1787
27.7311
,
14.1140
3.7810
0.0266
3.7854
0.9996
0.0000
4.9275
4.6881
1.3954
0.3751
29.1787
27.7311
14.1140
0.0000
0.2343
0.9361
0.2622
0.0000
0.0628
0.2509
0.9516
0.1662
0.0106
0.0423
0.1604
0.9861
0.8931
3.8120
0.0000
0.0000
5.9182
2.8684
2.2569
0.0000
35.0379
22.8257
8.3060
1.3998
1.6681
0.9646
0.5918
0.0000
11.4235
7.4792
3.0137
0.2326
33.6068
21.8871
.
8.5524
2.4006
= Q0 R0 ,
0.1622
0.9868
A1 = R0 Q0 + (6)I =
0.0000
0.0000
3.9432
0.9868
A1 (3.7810)I =
0.0000
0.0000
0.9701
0.2428
=
0.0000
0.0000
4.0647
0.0000
0.0000
0.0000
= Q1 R1 ,
0.3790
0.9254
A2 = R1 Q1 + (3.7810)I =
0.0000
0.0000
10.4
343
The QR-Algorithm
A25
4.4404
2.8641
0.0000
0.0000
4.8641
4.2635
=
0.0000
0.0000
18.1956
13.3357
2.7641
0.3822
28.7675
21.3371
,
4.1438
1.2359
4.8641
4.2635
4.4404
2.8641
and
b
d
c
2.7641
=
e
0.3822
4.1438
.
1.2359
Problems 10.4
1. Use one iteration of the QR-algorithm to calculate A1 when
0
A0 = 0
18
1
0
1
0
1.
7
Note that this matrix differs from the one in Example 1 by a single sign.
2. Use one iteration of the QR-algorithm to calculate A1 when
17
4
1
2
A0 = 17
7
7
1.
14
0
1
A0 =
0
0
0
0
1
0
0
0
0
1
13
4
.
14
4
344
Chapter 10
3
6. 2
2
2
9. 2
1
0
4.
5
0
6
3
0
3
0
1
2 .
2
0
3 2 1
1
0 2 3
.
13.
3
1 0 1
2 2 1
1
10.5
7
7. 2
0
1
10. 0
5
10
7
14.
8
7
0
6.
7
1 0
1 1.
9 6
2
1
6
7
5
6
5
8
6
10
9
3
8. 2
3
3
11. 1
2
3
6.
11
2
6
6
0
1
0
5
1.
3
7
5
.
9
10
Least-Squares
Analyzing data for forecasting and predicting future events is common to business,
engineering, and the sciences, both physical and social. If such data are plotted,
as in Figure 10.4, they constitute a scatter diagram, which may provide insight
into the underlying relationship between system variables. For example, the data
in Figure 10.4 appears to follow a straight line relationship reasonably well. The
problem then is to determine the equation of the straight line that best ts the data.
A straight line in the variables x and y having the equation
y = mx + c,
(19)
Figure 10.4
10.5
345
Least-Squares
15
10
2x
1
e(4)
y5
9
8
7
e(3)
6
e(1)
5
e(2)
3
2
e(0)
x
0
Figure 10.5
where m and c are constants, will have one y-value on the line for each value of
x. This y-value may or may not agree with the data at the same value of x. Thus,
for values of x at which data are available, we generally have two values of y, one
value from the data and a second value from the straight line approximation to
the data. This situation is illustrated in Figure 10.5. The error at each x, designated
as e(x), is the difference between the y-value of the data and the y-value obtained
from the straight-line approximation.
Example 1 Calculate the errors made in approximating the data given in
Figure 10.5 by the line y = 2x + 1.5.
Solution The line and the given data points are plotted in Figure 10.5. There
are errors at x = 0, x = 1, x = 2, x = 3, and x = 4. Evaluating the equation
y = 2x + 1.5 at these values of x, we compute Table 10.1.
It now follows that
e(0) = 1 1.5 = 0.5,
e(1) = 5 3.5 = 1.5,
e(2) = 3 5.5 = 2.5,
e(3) = 6 7.5 = 1.5,
346
Chapter 10
Given data
Evaluated from
y = 2x + 1.5
Table 10.1
0
1
2
3
4
1
5
3
6
9
1.5
3.5
5.5
7.5
9.5
and
e(4) = 9 9.5 = 0.5.
Note that these errors could have been read directly from the graph.
We can extend this concept of error to the more general situation involving N data
points. Let (x1 , y1 ), (x2 , y2 ), (x3 , y3 ), . . . , (xN , yN ) be a set of N data points for a
particular situation. Any straight-line approximation to this data generates errors
e(x1 ), e(x2 ), e(x3 ), . . . , e(xN ) which individually can be positive, negative, or zero.
The latter case occurs when the approximation agrees with the data at a particular
point. We dene the overall error as follows.
Denition 1 The least-squares error E is the sum of the squares of the individual
errors. That is,
E = [e(x1 )]2 + [e(x2 )]2 + [e(x3 )]2 + + [e(xN )]2 .
The only way the total error E can be zero is for each of the individual errors to be
zero. Since each term of E is squared, an equal number of positive and negative
individual errors cannot sum to zero.
Example 2
Example 1.
Solution
E = [e(0)]2 + [e(1)]2 + [e(2)]2 + [e(3)]2 + [e(4)]2
= (0.5)2 + (1.5)2 + (2.5)2 + (1.5)2 + (0.5)2
= 0.25 + 2.25 + 6.25 + 2.25 + 0.25
= 11.25.
10.5
347
Least-Squares
Denition 2 The least-squares straight line is the line that minimizes the
least-squares error.
We seek values of m and c in (19) that minimize the least-squares error. For
such a line,
e(xi ) = yi (mxi + c),
so we want the values for m and c that minimize
E=
N
(yi mxi c)2 .
i=1
i=1
and
E
2(yi mxi c)(1) = 0,
=
c
N
i=1
'
xi2
&
m+
i=1
N
'
xi c =
i=1
N
xi yi ,
(20)
i=1
&N '
N
xi m + Nc =
yi .
i=1
i=1
0
1
1
5
2
3
3
6
4
.
9
348
Chapter 10
Table 10.2
Sum
5
i=1
xi
yi
(xi )2
x i yi
0
1
2
3
4
1
5
3
6
9
0
1
4
9
16
0
5
6
18
36
xi = 10
5
5
yi = 24
i=1
(xi )2 = 30
i=1
5
i=1
xi yi = 65
x1
x2
x3
..
.
xN
1
y1
y2
1
m
1
= y3 .
..
.. c
.
.
1
yN
(21)
10.5
349
Least-Squares
The solution is the vector x satisfying the normal equations, which take the matrix
form
AT Ax = AT b.
(22)
System (22) is identical to system (20) when A and b are dened as above.
We now generalize to all linear systems of the form Ax = b. We are primarily
interested in cases where the system is inconsistent (rendering the methods developed in Chapter 2 useless), ands this generally occurs when A has more rows than
columns. We shall place no restrictions on the number of columns in A, but we
will assume that the columns are linearly independent. We seek the vector x that
minimizes the least-squares error dened by Eq. (21).
Theorem 1 If x has the property that Ax b is orthogonal to the columns of A,
then x minimizes Ax b2 .
Proof. For any vector x0 of appropriate dimension,
Ax0 b2 = (Ax0 Ax) + (Ax b)2
= (Ax0 Ax) + (Ax b) , (Ax0 Ax) + (Ax b)
= (Ax0 Ax) , (Ax0 Ax) + (Ax b) , (Ax b)
= +2 (Ax0 Ax) , (Ax b)
= (Ax0 Ax)2 + (Ax b)2
= +2 Ax0 , (Ax b) 2 Ax, (Ax b) .
It follows directly from Problem 28 of Section 10.2 that the last two inner products
are both zero (take p = Ax b). Therefore,
Ax0 b2 = (Ax0 Ax)2 + (Ax b)2
(Ax b)2 ,
and x minimizes Eq. (21).
As a consequence of Theorem 1, we seek a vector x having the property
that Ax b is orthogonal to the columns of A. Denoting the columns of A as
A1 , A2 , . . . , An , respectively, we require
Ai , Ax b = 0
If y = y1
then
y2
yn
T
(i = 1, 2, . . . , n).
Ay = A1 y1 + A2 y2 + + An yn ,
350
Chapter 10
and

⟨Ay, (Ax − b)⟩ = ⟨ Σ_{i=1}^{n} Ai yi, (Ax − b) ⟩
             = Σ_{i=1}^{n} yi ⟨Ai, (Ax − b)⟩        (23)
             = 0.

It also follows from Problem 39 of Section 6.1 that

⟨Ay, (Ax − b)⟩ = ⟨y, A^T(Ax − b)⟩ = ⟨y, (A^T Ax − A^T b)⟩.        (24)

Eqs. (23) and (24) imply that ⟨y, (A^T Ax − A^T b)⟩ = 0 for any y. We may deduce from Problem 26 of Section 10.2 that A^T Ax − A^T b = 0, or A^T Ax = A^T b, which has the same form as Eq. (22)! Therefore, a vector x is the least-squares solution to Ax = b if and only if it is the solution to A^T Ax = A^T b. This set of normal equations is guaranteed to have a unique solution whenever the columns of A are linearly independent, and it may be solved using any of the methods described in the previous chapters!
Example 4 Find the least-squares solution to

x + 2y +  z = 1,
3x − y      = 2,
2x + y −  z = 2,
x + 2y + 2z = 1.

Solution Here

A = [ 1   2   1 ]
    [ 3  −1   0 ],   x = [x, y, z]^T,   and   b = [1, 2, 2, 1]^T.
    [ 2   1  −1 ]
    [ 1   2   2 ]
Then,

A^T A = [ 15   3   1 ]
        [  3  10   5 ]   and   A^T b = [12, 4, 1]^T,
        [  1   5   6 ]
so the normal equations A^T Ax = A^T b become

[ 15   3   1 ] [ x ]   [ 12 ]
[  3  10   5 ] [ y ] = [  4 ].
[  1   5   6 ] [ z ]   [  1 ]
Using Gaussian elimination, we obtain as the unique solution to this set of equations x = 0.7597, y = 0.2607, and z = −0.1772, which is also the least-squares solution to the original system.
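Numerically, the normal equations are formed and solved in a few lines. The sketch below uses NumPy on the data of Example 4 (as reconstructed above); numpy.linalg.lstsq returns the same least-squares solution without forming A^T A explicitly.

```python
import numpy as np

# Least-squares solution of Ax = b via the normal equations A^T A x = A^T b,
# using the (reconstructed) data of Example 4.
A = np.array([[1.0,  2.0,  1.0],
              [3.0, -1.0,  0.0],
              [2.0,  1.0, -1.0],
              [1.0,  2.0,  2.0]])
b = np.array([1.0, 2.0, 2.0, 1.0])

x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)  # approximately [ 0.7597  0.2607 -0.1772]

# The same answer, without forming A^T A explicitly:
x_direct, *_ = np.linalg.lstsq(A, b, rcond=None)
```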
Example 5 Find the least-squares solution to

  x + 3y = 80,
 2x + 5y = 100,
 5x − 2y = 60,
 −x + 8y = 130,
10x −  y = 150.

Solution Here

A = [  1   3 ]
    [  2   5 ]
    [  5  −2 ],   x = [x, y]^T,   and   b = [80, 100, 60, 130, 150]^T.
    [ −1   8 ]
    [ 10  −1 ]
Then,

A^T A = [ 131  −15 ]   and   A^T b = [1950, 1510]^T,
        [ −15  103 ]

so the normal equations A^T Ax = A^T b become

[ 131  −15 ] [ x ]   [ 1950 ]
[ −15  103 ] [ y ] = [ 1510 ].
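A quick numerical check of the 2 × 2 normal equations above (with the off-diagonal signs as reconstructed):

```python
import numpy as np

# Normal equations of Example 5; solving them yields the
# least-squares solution of the overdetermined system.
AtA = np.array([[131.0, -15.0],
                [-15.0, 103.0]])
Atb = np.array([1950.0, 1510.0])
print(np.linalg.solve(AtA, Atb))  # approximately [16.845 17.114]
```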
Problems 10.5

In Problems 1 through 8, find the least-squares solution to the given systems of equations:
1.
2x + 3y = 8,
3x y = 5,
x + y = 6.
2.
2x + y = 8,
x + y = 4,
x + y = 0,
3x + y = 13.
3.  x + 3y = 65,
   2x − y = 0,
   3x + y = 50,
   2x + 2y = 55.
4.
2x + y = 6,
x + y = 8,
2x + y = 11,
x + y = 8,
3x + y = 4.
5.
2x + 3y 4z = 1,
x 2y + 3z = 3,
x + 4y + 2z = 6,
2x + y 3z = 1.
6.  2x + 3y + 2z = 25,
   2x − y + 3z = 30,
   3x + 4y − 2z = 20,
   3x + 5y + 4z = 55.
7.
x + y z = 90,
2x + y + z = 200,
x + 2y + 2z = 320,
3x 2y 4z = 10,
3x + 2y 3z = 220.
8.
x + 2y + 2z = 1,
2x + 3y + 2z = 2,
2x + 4y + 4z = 2,
3x + 5y + 4z = 1,
x + 3y + 2z = 1.
9. Which of the systems, if any, given in Problems 1 through 8 represent a least-squares, straight-line fit to data?
10. The monthly sales figures (in thousands of dollars) for a newly opened shoe store are:

    month:  1   2   3   4
    sales: 16  14  15  21

Find the least-squares straight line that best fits this data.

11. Sales data: 51, 50, 45, 46, 43, 39, 35, 34.

12. Annual rainfall data: 10.5, 10.8, 10.9, 11.7, 11.4, 11.8, 12.2.

(a) Find the least-squares straight line that best fits this data.
(b) Use this line to predict next year's rainfall.
13. Solve system (20) algebraically and explain why the solution would be susceptible to round-off error.

14. (Coding) To minimize the round-off error associated with solving the normal equations for a least-squares straight-line fit, the (xi, yi)-data are coded before being used in calculations. Each xi-value is replaced by the difference between xi and the average of all xi-data. That is, if

X = (1/N) Σ_{i=1}^{N} xi,   then set x′i = xi − X.
15. Consider the following population data:

    year:       1950  1960  1970  1980  1990
    population: 25.3  23.5  20.6  18.7  17.8

(a) Code this data using the procedure described in Problem 14, and then find the least-squares straight line that best fits it.
(b) Use this line to predict the population in 2000.
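As an illustration of Problem 14's coding, the sketch below centers the years of Problem 15 about their mean; because the coded abscissas sum to zero, the normal equations decouple into the two one-line formulas shown in the comments.

```python
# A sketch of the coding (centering) procedure of Problem 14, applied
# to the data of Problem 15. After replacing xi by xi - X (X = mean of
# the xi), the normal equations (20) decouple because sum(x') = 0:
#     m = sum(x'i * yi) / sum(x'i ** 2),   c = mean of the yi.
years = [1950, 1960, 1970, 1980, 1990]
pop = [25.3, 23.5, 20.6, 18.7, 17.8]

X = sum(years) / len(years)               # 1970
xp = [x - X for x in years]               # coded data: -20, -10, 0, 10, 20

m = sum(u * y for u, y in zip(xp, pop)) / sum(u * u for u in xp)  # about -0.198
c = sum(pop) / len(pop)                   # 21.18

print(m * (2000 - X) + c)                 # predicted 2000 population, about 15.24
```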
Appendix: A Word on Technology

We have covered a number of topics which relied very heavily on computations. For example, in Chapter 6 we computed eigenvalues, and in Chapter 9 we raised transitional matrices to certain powers.

While it is true that much of the number crunching involved only the basic operations of addition and multiplication, all would agree that much time can be consumed by these tasks.

We, as educators, are firm believers that students of mathematics, science, and engineering should first understand the underlying fundamental concepts involved with the topics presented in this text. However, once these ideas are mastered, a common-sense approach is appropriate regarding laborious numerical calculations.

As the first decade of this new millennium comes to a close, we can take advantage of many tools. Calculators and computer algebra systems are ideal instruments which can be employed.

We give a few suggestions below:
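As one concrete illustration (Python with the NumPy library is one such freely available tool, offered here as an example rather than a specific endorsement), both computations mentioned above reduce to a single function call:

```python
import numpy as np

# Eigenvalues of a matrix (the kind of computation done by hand in Chapter 6).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(np.linalg.eigvals(A))          # eigenvalues 3 and 1

# Raising a transition matrix to a power (as in the Markov chains of Chapter 9).
P = np.array([[0.95, 0.05],
              [0.02, 0.98]])
print(np.linalg.matrix_power(P, 3))  # three-step transition probabilities
```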
Answers and Hints to Selected Problems

CHAPTER 1

Section 1.1
5. A =
2
3
1 1
4. A =
.
1
1
7. C =
1
1
1
2
1
3
1
.
4
1
3
2 .
3
3
2
1
2
1 0 1
6. B = 0 1 2.
1 2 3
0 1 2 3
8. D = 3 0 1 2.
4 5 0 1
9. (a) 9
15 ,
(b) 12
10. (a) 7 4 1776 ,
(d) 10 31 1688 .
11. 950
1253
0,
(b) 12
3
12. 0
4
98 .
100 150
50 500
45 116
2.
14. 27
29
41 116
3
1000 2000
3000
15. (a)
.
0.07 0.075 0.0725
0.95 0.05
16.
.
0.01 0.99
0.80
19. 0.10
0.25
0.15
0.88
0.30
(c) 13
17.
0.6
0.7
0.05
0.02.
0.45
1941 ,
5
2
2
3
9
0
1.
2
6
4
.
8
0 1
1 0
5.
0 0.
2 2
3 2
2 2
9.
3 2.
4 8
4
5.
0
5 1
1 4
13.
2 1.
3 5
23
72
13. 45
81
6 8
.
10 12
9 3
3 6
3.
9 6.
6 18
7.
6.
(c) 4
15 .
1809 ,
12
32
10
0
6
2
.
1
11.
4 4
.
4 4
20 20
0 20
4.
50 30.
50 10
1 3
1 0
8.
8 5.
7 7
3 0
0 2
14.
3 2.
0 4
16
16.
35
1070.00 2150.00 3217.50
(b)
.
0.075
0.08 0.0775
0.10 0.50 0.40
0.4
.
18. 0.20 0.60 0.20.
0.3
0.25 0.65 0.10
5 10
.
15 20
2.
(d) 21
Section 1.2
30 ,
12.
15.
17 22
.
27 32
16.
2 2
.
0 7
5 6
.
3 18
359
4 3
1 4
18.
10 6.
8 0
11 1
3 8
21. X =
4 3.
1 17
1.5 1.0
1.0 1.0
24. S =
1.5 1.0.
2.0
0
17.
0.1 0.2
.
0.9 0.2
11 12
20. Y =
.
11 19
23. R =
2.8 1.6
.
3.6 9.2
4
19. X =
4
4
.
4
1.0 0.5
0.5 1.0
22. Y =
2.5 1.5.
1.5 0.5
25.
5
13
8
.
9
3 + 6 2 +
6 6
.
21
4 2 2 + 6/
27.
32. (a) 200 150 ,
33. (b) 11 2 6 3 ,
34. (b) 10,500 6,000
(b) 600 450 ,
(c) 550 550 .
(c) 9 4 10 8 .
4,500 ,
(c) 35,500 14,500 3,300 .
Section 1.3
1. (a) 2 × 2, (b) 4 × 4, (c) 2 × 1, (d) Not defined, (e) 4 × 2,
(f) 2 × 4, (g) 4 × 2, (h) Not defined, (i) Not defined,
(j) 1 × 4, (k) 4 × 4, (l) 4 × 2.
19 22
2.
.
43 50
13 12
5. A =
17 16
8. 9
11
.
15
10 .
1 3
.
7 3
1 2
0
17. 1
1
3
6. Not defined.
9. 7 4 1 .
2 2 2
12. 7 4 1.
8 4 0
11.
23 34
5 4 3
3.
.
4.
.
31 46
9 8 7
1
3.
5
2 2
18. 2 0
1 2
1
0.
2
7. 5
6 .
3.
5.
360
22.
xz
23. 3x + y + z.
x + 3y
0 0
26.
.
0 0
x + 2y
.
3x + 4y
2b11 b12 + 3b13
.
2b21 b22 + 3b23
0 0 0
7
5
28. 0 0 0.
29.
.
11 10
0 0 0
5
1 1 1 x
2
1
34.
33. 2 1 3y = 4.
3
1 1 0 z
0
1
25.
a11 x + a12 y
.
a21 x + a22 y
24.
0
16
27.
40
.
8
2
3 x
10
=
.
4 5 y
11
4
x
5
y 0
1
= .
0 z 3
3 w
4
32.
3
1
2
1
2
0
2
2
35. (a) PN = [38,000], which represents the total revenue for that flight.
613
625
39. FC = 887 960,
1870 1915
which represents the number of each sex in each state of sickness.
Section 1.4
7
1. 6
2
4 1
1 0.
2 6
t 3 + 3t
2t 3 + t 2
2.
t 4 + t 2 + t
t5
2t 2 + 3
4t 2 + t
3
2t + t + 1
2t 4
3
t
.
t + 1
0
4
6
9
4. XT X = [29], and XXT = 6
8 12
8
12.
16
361
1 2
3 4
2
4
6
8
, and XXT = [30].
5. XT X =
3 6
9 12
4 8 12 16
6. 2x² + 6xy + 4y².  7. A, B, D, F, M, N, R, and T.
8. E, F, H, K, L, M, N, R, and T.
9. Yes.
5 0 0
12. 0 9 0. 14. No.
0 0 2
29.
28. A = B + C.
3
2
1
7
2
21
0 21
3
1
30.
0
2 0 4 + 2
1 4 2
2 3
34. (a)
P2
0.37
=
0.28
(b) 0.37,
0.63
0.72
(c) 0.63,
7
2
21
5 + 23
1
5 8
2
1
3
2
25. 4.
21
0 2.
2 0
3
.
and
(d) 0.711,
P3
0.289
=
0.316
0.711
,
0.684
(e) 0.684.
35. 1 1 1 1, 1 1 2 1, 1 2 1 1, 1 2 2 1.
36. (a) 0.097, (b) 0.0194.
0
1
40. M =
0
0
0
0
1
0
0
0
0
1
0
0
0
0
1
1
0
0
0
0
1
0
0
1
0
0
0
1
0
0
1
1
0
1
0
1
1
0
0
0
0
0
.
1
0
1
362
0
2
41. (a) M =
0
1
0
2
0
0
2
1
0
0
0
1
0
1
2
1
0
1
0
1
0
,
1
0
0
1
0
42. (a) M =
1
0
0
1
0
1
0
1
0
0
0
1
1
0
0
1
0
0
0
0
0
0
0
1
1
1
0
1
1
1
1
0
1
0
0
0
0
0
1
1
0
0
1
0
0
0
1
0
0
0
1
0
0
0
,
0
1
0
(b) M^3 has a path from node 1 to node 7; it is the first integral power of M having m17 positive. The minimum number of intermediate cities is two.
Section 1.5
4
5 1
3. 15 10 4
1
1 5
9
22.
9
11 9 0 0
4 6 0 0
AB =
0 0 2 1.
0
18 6
12 6
5.
0 0
0 0
1
0
0
2
7. A =
0
0
0
4 1
0
0
1
3
0
0
.
0
4
0
4
0
0
0
0
0
0
0
0
0
0
6.
0
0
0
0
0
0
0
0
1
0
0
0
8. A^n = A when n is odd.
7 8 0
4 1 0
0 0 5
0 0 1
0
0
0
.
A3
1
0
0
0
0
.
1
2
1
0
0
=
0
0
0
0
8
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
.
0
0
0
363
Section 1.6
4/3
1
2.
8/3.
1/3
1. p = 1.
3. 1
6 3
2 1
(b)
12 6
0
0
5. (a) 4
1.
0.4
12
4
24
0
3
1
,
6
0
(c) [29],
2 0 2
(c) 1 0 1,
3 0 3
(b) [1],
1,
(d) [29].
0 1
0 0.
0 1
15,
(f)
1
(d) 0
1
2,
(b) 5,
8. (a)
2,
(b)
9. (a)
15,
(b)
(c)
5,
(c)
3,
(d)
1
2
3,
(d) 2,
(e)
2
3
10
12. x
+y
=
.
4
5
11
39.
3,
(e)
3
4
5
6
1
13. x 0 + y 1 + z 2 + w 8 = 0.
1
1
2
1
0
30,
16. 0.5
(f)
39.
0.3
2.
0.2 .
17. (a) There is a 0.6 probability that an individual chosen at random initially will
live in the city; thus, 60% of the population initially lives in the city, while
40% lives in the suburbs.
(b) d(1) = [0.574 0.426].
18. (a) 40% of customers now use brand X, 50% use brand Y, and 10% use other
brands.
(b) d1 = [0.395 0.530 0.075].
19. (a) d(0) = [0 1].
364
Section 1.7
1., 4., 6., 7. (graphs)
16. (graph)
17. 341.57°.
18. 111.80°.
19. 225°.
20. 59.04°.
21. 270°.
CHAPTER 2
Section 2.1
1. (a) No.
(b) Yes.
2. (a) Yes.
(b) No.
4. k = 1.
(c) Yes.
5. k = 1/12.
2
1 1 2 x
3 5 x
11
8.
=
.
9. 1 1 2y = 0.
2 7 y
3
1 2 2
z
1
1 2 3 x
6
1
2
2 x
0
4
2y = 0.
10. 1 3 2y = 0.
11. 2
3 4 7 z
6
3 6 4 z
0
12. 50r + 60s = 70,000,
30r + 40s = 45,000.
13.
5d + 0.25b = 200,
10d +
b = 500.
367
15.
5A + 4B + 8C + 12D = 80,
20A + 30B + 15C + 5D = 200,
3A + 3B + 10C + 7D = 50.
1
3
0
23. A = 0.20
0.10
0.20
0
24. A =
0.10
0.30
and
1
4 p1
1
4 p1
23 p2 + 13 p3 = 0,
+ 13 p2 21 p3 = 0.
0.15p4 = 0,
1
3 p3
2
3 p3
+ 0.40p4 = 0,
1
3 p3
+ 0.45p4 = 0,
p4 = 0.
d=
0.02
0
0.35
0.50
0.30
0.10
0.15
0.20
0.05
0.30
0.40
0
0
0.10
20,000
.
30,000
and
0.25
0
0.10
0.05
50,000
d = 80,000.
30,000
and
0
5,000,000
.
d =
0
0
Section 2.2
1. x = 1, y = 1, z = 2.
3. x = y = 1.
1
1
(n + 1), m = (3n 5p 3), n and p are arbitrary.
5
5
6. x = 0, y = 0, z = 0.
7. x = 2, y = 1, z = 1.
5. l =
8. x = 1, y = 1, z = 0, w = 1.
Section 2.3
1
1. Ab =
3
2 3
.
1 1
2. Ab =
1 2 1 1
.
2 3 2 4
368
1
3. Ab = 3
4
2
1
3
5
13.
0
2 4
4. Ab = 3 2
5 3
0
1
7
2
8.
15
2 3 4 12
5. Ab = 3 2 0 1.
8 1 4 10
6. x + 2y = 5,
y = 8.
7. x 2y + 3z = 10,
y 5z = 3,
z = 4.
8. r 3s + 12t =
40,
s 6t = 200,
t=
25.
9. x + 3y
= 8,
y + 4z = 2,
0 = 0.
10. a 7b + 2c = 0,
b c = 0,
0 = 0.
11. u v
= 1,
v 2w = 2,
w = 3,
0 = 1.
12. x = 11, y = 8.
1 2
5
.
0 1 23
1
21. 0
0
1
0
23.
0
0
2
1
0
3
4
5
7 .
1 41/29
3
1
0
0
2
0
5
1
.
1 9/35
0
0
19.
1 6 5
.
0 1 18
1
22. 0
0
1
24. 0
0
3 2
1 2
0 1
3/2
1
0
20.
3.5
1
2.5
.
6
1
4 .
32/23
2
50
1
3
32
53/76
25. x = 1, y = 2.
26. x = 5/7 (1/7)z, y = 6/7 + (4/7)z, z is arbitrary.
27. a = 3, b = 4.
1
0
0
6
5/76
5
130 .
190/76
369
29. r =
1
1
(21 + 8t), s =
(38 + 12t), t is arbitrary.
13
13
30. x = 1, y = 1, z = 2.
32. x = y = 1.
34. l =
1
1
(n + 1), m = (3n 5p 3), n and p are arbitrary.
5
5
37. A = 5, B = 8, C = 6.
40,000
55,000
58,333
(0)
(1)
(2)
48. x =
,
x =
,
x =
.
60,000
43,333
48,333
49. x(0)
100,000
= 160,000,
60,000
x(1)
83,200
= 118,000,
102,000
x(2)
103,360
= 127,240.
89,820
1,500,000
2,300,00
0
10,000,000
(1) = 7,000,000,
(2) = 6,400,000.
,
50. x(0) =
x
x
500,000
800,000
0
0
3,000,000
2,750,000
370
(b) 4,
(c) 8.
3. (a) 3,
(b) 3,
(c) 8.
5. (a) 9,
(b) 9,
(c) 11.
7. a = 3, b = 4.
8. r = 13/3, s = t = 5/3.
9. Depending on the round-off procedure used, the last equation may not be 0 = 0, but rather numbers very close to zero; in that case only one answer is obtained.
Section 2.5
1. Independent.
2. Independent.
3. Dependent.
4. Dependent.
5. Independent.
6. Dependent.
7. Independent.
8. Dependent.
9. Dependent.
10. Dependent.
11. Independent.
12. Dependent.
13. Independent.
14. Independent.
15. Dependent.
16. Independent.
17. Dependent.
18. Dependent.
2
1
1
1
19. Dependent.
20. 1 = (2)1 + (1) 0 + (3)1.
2
0
1
1
21. (a) [2 3] = 2[1 0] + 3[0 1], (b) [2 3] = 25 [1 1] + 21 [1 1], (c) No.
( ) 1
( ) 1
( ) 0
1
1
1
1
0 +
1 +
1,
22. (a) 1 =
2
2
2
1
1
0
1
1
1
1
1
(c) 1 = (0)0 + (1)1 + (0)1.
1
1
1
1
2
1
1
2
23. 0 = (1)0 + (1)0 + (0)0.
3
1
2
1
(b) No,
371
24. [a
25. [a
26. [1
1].
27. [a
b
+
c
a
+
b
c
a
+
b
+
c
0 +
1 +
1.
29. b =
2
2
2
c
1
0
1
30. No, impossible to write any vector with a nonzero second component as a
linear combination of these vectors.
a
1
1
2
31. 0 = (a)0 + (0)0 + (0)0.
32. 1 and 2 are bases.
a
1
2
1
33. 7 and 11 are bases.
Section 2.6
1. 2.
2. 2.
6. Independent.
3. 1.
4. 2.
5. 3.
7. Independent.
8. Dependent.
10. Independent.
11. Dependent.
12. Independent.
13. Dependent.
14. Dependent.
15. Dependent.
16. Independent.
17. Dependent.
18. Independent.
19. Dependent.
20. Independent.
21. Dependent.
22. Dependent.
9. Dependent.
25. Yes.
26. Yes.
27. No.
Section 2.7
1. Consistent with no arbitrary unknowns; x = 2/3, y = 1/3.
2. Inconsistent.
CHAPTER 3
Section 3.1
1. (c).
2. None.
5. D has no inverse.
1
5
3
20
10.
1
10
1
20
14. 0 5
0 0
0
18.
5
7.
0
1
11.
0.
1
0 0
1 0.
0 1
15.
.
3.
1
.
0
8.
16.
1 0 0 0
0 1 0 8
19.
.
0 0 1 0
0 0 0 1
3
1
20.
0
2
0
1 0 0 0
0 0 0 1
21.
.
0 0 1 0
0 1 0 0
1 0 0 0
0 0 0 1
22.
.
0 0 1 0
0
1 0
9.
12
13.
1
0
.
0 5
2
3
2
3
13
0
.
1
13
4.
21
3
0
3
2
12.
2
14
8
14
3
14
5
14
17. 0
0
0
1
0
0
0
0
0
1
0
0
0
1
0
0 0
0 0
0 0
.
1 0
0 1
1 0 0 0 0 0
0 0 0 1 0 0
0 0 1 0 0 0
.
23.
0 1 0 0 0 0
0 0 0 0 1 0
0 0 0 0 0 1
3.
1
373
24.
27.
0
.
7
1
2
31. 0
1
34. 0
0
1 0 0 0 0
0 1 0 0 0
25.
0 0 7 0 0.
0 0 0 1 0
0 0 0 0 1
1
0
28.
0 3
1 0.
0 1
0
1
0
0
0
0
1
0
1
35. 0
0
0
2.
1
0
.
1
5
1
0
39.
3
0
1
0
42.
0
0
0
0
0
1
0
0
.
1
0
1
3
0
0
1
0
0
0
54. 1
0
1
0
1
0
0
0
1
0
0
0
.
1
1
4
0
. 52. 0
0
3
1
0
1
0
4
0
. 55.
0
0
1
7
0
0
0
.
0
1
0
0
.
0
1
0
0
1
0
0 0
1 0
0 2
0 0
0
.
0 13
0
1
4
49. 0 21
0 0
1 1 0
48. 0 1 0.
0 0 1
0
45.
1
1
1 0 0
33. 0 1 0.
3 0 1
1 0 0
0 1 0
36.
.
1
0 0 4
0 0
0 7
.
1 0
0 1
0 0
0 0
.
1 0
0 1
30.
44. No inverse.
1
5
1
0
0
0
0
1
0
1 0
0 1
38.
0 0
0 0
1 0
0 1
41.
0 1
0 0
1
1 2 0 0
2
0 1 0 0
0
50.
0 0 1 0. 51.
0
0 0 2 1
0
0
1
53.
0
0
0
0.
1
1
0
0
0
0.
1
0
.
1
1
3
47.
0
0
0
0
.
1
0
1
0
.
0
0
0
32. 1
0
0
0.2
0
29.
1
3
1
10
1
26. 0
0
1 2
.
0 1
0
.
0
0
0
1
1
2
0
1
0
0
43.
.
1
0
37.
0
0
0
0
40.
0
1
2
46.
0
0
.
5
3
1
5
1
6
0
.
0
1
1
5
0
0
0 0
.
1 6
0 1
0 23
374
Section 3.2
4 1
1.
.
3 1
4.
1
4
11 3
1
2 1
.
2.
3 1 2
1
.
2
5.
1 1 1
1
1 1 1.
7.
2 1 1 1
2 3
.
5 8
0
8. 0
1
1
0
0
0
1.
0
1 0
1
5 2
11.
2
1 2
6.
1 1 1
9. 6 5 4.
3 2 2
0
0.
2
9 5 2
13. 5 3 1.
36 21 8
1 7 2
1
7 2 3.
14.
17 2 3 4
14 5 6
1
5 3 7.
15.
17 13 1 8
5
1
6
17.
33 8
0 4 4
1
1 5 4.
18.
4 3 7 8
3 1 8
1
0 2 1.
12.
6 0 0 3
3
3
15
4 6
.
6 12
1
12.
5
4 4 4 4
1 0 4 2 5
.
19.
4 0 0 2 3
0 0 0 2
1 0
2 1
20.
8 3
25 10
0
0
.
1
0
2
2 1
0
0
28. 24 13 27 19 28 9 0 1 1 24 10 24 18 0 18.
375
Section 3.3
1. x = 1, y = 2.
2. a = 3, b = 4.
3. x = 2, y = 1.
4. l = 1, p = 3.
6. x = 8, y = 5, z = 3.
7. x = y = z = 1.
8. l = 1, m = 2, n = 0.
9. r = 4.333, s = t = 1.667.
12. x = y = 1, z = 2.
1 2 1
1 4 4
14. A2 = 0 1 2, B2 = 0 1 2.
0 0 1
0 0 1
15. A3
1 3 3
= 0 1 3,
0 0 1
B3
1 6 9
= 0 1 3.
0 0 1
1
11 2
16.
=
.
2 11
125
$
%T
%1 $ T %1
$
17. First show that BA1 = A1 BT and that A1 BT
= B
A.
Section 3.5
1 0 1
1.
3 1 0
2.
1
0.5
0
1
1
10
, x=
.
1
9
2
0
1
8
, x=
.
1.5
5
376
3.
1
0.625
0
1
3
400
, x=
.
0.125
1275
8
0
1 0 0
1 1 0
3
4. 1 1 0 0 1 1, x = 1.
0 1 1
0 0 2
2
1 2 0
5
1 0 0
5. 1 1 0 0 1 1, x = 2.
2 2 1
0 0 5
1
1
6. 2
1
0
2 1 3
10
0 0 1 6, x = 0.
1
0 0 1
10
0
1
0
10
0 0 83 13 , x = 10.
40
1
0 0 18
4
1
7.
3
21
1 8
1
0
0
1 2
1
0 0 4
8. 2
1 0.75 1
0 0
1
9. 0
0
10. 3
1
1
1
11.
1
0
1
2
12.
0
1
79
3, x = 1.
4.25
1
0
1
0 0
1
0
2 1
19
2 1, x = 3.
0 1
5
1
2
0 1
0
0
0
1
0
2
0
0
1
1
1
0
0
1
2
0
1
0
0
1
1
0
0
1
2
7
5
7
0
1
0
0
0 0
1
0
0
0 0
0
0
0
1
2
0
0, x = 1.
1
2
2
0 1 1
1
5
1 1 0
, x = .
2
0 1 1
0 0 3
1
1 1
7
2
5
2
0 1
0
266.67
166.67
, x=
166.67.
1
266.67
3
21
377
0
0
0
2
0
1
0
0
1.5
1
0 0
0 0.25 1
0
1 0 0 0
1 2 1 1
10
1 1 0 0 0 1 1 0
10
13.
1 1 1 0 0 0 1 1, x = 10.
0 1 2 1
0 0 0 3
10
1
1
14.
2
0.5
15. (a) x = 5, y = 2;
0 2 0
2.5
2 2 6
, x = 1.5.
0 8 8
1.5
0 0 3
2.0
16. (a) x = 1, y = 0, z = 2;
8
2
35
17. (a) 3, (b) 0, (c) 5,
1
0
15
0
80
1
0
50
1
18. (a)
1, (b) 0, (c) 10,
0
20
1
0.5
(d) 1.5.
1.5
1
3
1
3
(d)
1 .
3
1
3
CHAPTER 4
Section 4.1
1.
2.
21
21
22
22
23
23
22
21
23
0
23
22
21
378
3.
22
24
5.
4.
23
21
26
0
6.
26
22
22
24
24
26
24
22
26
0
8.
22
24
1
26
24
22
22
26
24
22
26
24
26
7.
22
0
0
379
9. 12
10.
10
22
24
26
0
0
10
12
11. 10
26
24
22
12.
200
100
2100
2200
2200
2
2
13.
10
14.
100
50
250
2100
2100
250
50
100
2100
100
200
380
Section 4.2
Note: Assume all variables are non-negative for (1) through (8).
1. Let x = the number of trucks of wheat; y = the number of trucks of corn.
2x + 3y 23, 3x + y 17.
The objective function is 5000x + 6000y.
2. The objective function is 8000x + 5000y.
3. Let x = the number of units of X; y = the number of units of Y . 2x + 3y 180,
3x + 2y 240. The objective function is 500x + 750y.
4. The objective function is 750x + 500y.
5. Add the third constraint 10x + 10y 210.
6. Let x = the number of ounces of Zinc and y = the number of ounces of
Calcium. 2x + y 10, x + 4y 15. The objective function is .04x + .0 5y.
7. Add the third constraint 3x + 2y 12.
8. The objective function is .07x + .0 8y.
9. The Richard Nardone Emporium needs at least 1800 cases of regular scotch and
at least 750 cases of premium scotch. Each foreign shipment from distributor
x can deliver two cases of the former and three cases of the latter, while
distributor y can produce nine cases of the former and one case of the latter
for each foreign shipment. Minimize the cost if each x shipment costs $400
and each y shipment costs $1100. Note that the units for K (x, y) is in $100s.
(g) Three components are required to produce a special force (in pounds):
mechanical, chemical, and electrical. The following constraints are
imposed:
Every x force requires one mechanical unit, two chemical units and one
electrical unit;
Every y force needs one mechanical unit, one chemical unit and three
electrical units;
Every z force requires two mechanical units, one chemical unit and one
electrical unit.
The respective limits on these components is 12, 14, and 15 units, respectively.
The Cafone Force Machine uses 2x plus 3y plus 4z pounds of force; maximize
the sum of these forces.
Section 4.3
1. $50,000.
2. $57,000.
3. $45,000. Note that the minimum occurs at every point on the line segment
connecting (72,12) and (90,0).
4. $60,000. Note that the minimum occurs at every point on the line segment
connecting (72,12) and (0,120).
5. X = 72, Y = 12 is one solution,
X = 90, Y = 0 is another solution.
6. About 29 cents.
9. 400.
12. 3280.
14. 60,468.8.
15. 3018.8.
Section 4.4
1. $50,000.
2. $57,000.
3. 30.
4. 20.
5. 72.
x1
7. 2
3
100
x2
s1
s2
5
4
55
1
0
0
0
1
0.5
0
0
1
10
12
0
CHAPTER 5
Section 5.1
1. 2.
2. 38.
3. 38.
4. 2.
6. 82.
7. 9.
8. 20.
9. 21.
11. 20.
12. 0.
16. 4t 6.
17. 2t 2 + 6.
13. 0.
5. 82.
10. 2.
14. 0.
15. 93.
19. 0 and 2.
20. 1 and 4.
21. 2 and 3.
18. 5t 2 .
22. 6.
24. 2 9 + 38.
25. 2 13 2.
23. 2 9 2.
26. 2 8 + 9.
29. The new determinants are the chosen constant times the old determinants,
respectively.
30. No change.
31. Zero.
32. Identical.
33. Zero.
382
Section 5.2
1. 6.
2. 22.
3. 0.
4. 9.
6. 15.
7. 5.
8. 10.
9. 0.
5. 33.
10. 0.
11. 0.
12. 119.
13. 8.
14. 22.
15. 7.
16. 40.
17. 52.
18. 25.
19. 0.
20. 0.
21. 11.
22. 0.
25. 3 + 7 + 22.
26. 3 + 42 17.
27. 3 + 6 9.
2. 0.
3. 311.
4. 10.
5. 0.
6. 5.
7. 0.
8. 0.
9. 119.
10. 9.
11. 33.
12. 15.
13. 2187.
14. 52.
15. 25.
16. 0.
17. 0.
18. 152.
19. 0.
20. 0.
Section 5.5
1. Does not exist.
2.
4 1
.
3 1
3.
4 6
.
6 12
383
4.
1
4
3
11
1
.
2
5.
1 1 1
1
1 1 1.
7.
2 1 1 1
2 3
.
5 8
0
8. 0
1
1
0
0
0
1.
0
1 1 1
9. 6 5 4.
3 2 2
1 0
1
5 2
11.
2
1 2
5
1
6
14.
33 8
1
d
16.
ad bc c
b
.
a
0
0.
2
3
3
15
1
12.
5
14 5 6
1
5 3 7.
12.
17 13 1 8
0 4 4
1
1 5 4.
15.
4 3 7 8
Section 5.6
1. x = 1, y = 2.
2. x = 3, y = 3.
4. s = 50, t = 30.
7. x = 10, y = z = 5.
8. x = 1, y = 4, z = 5.
9. x = y = 1, z = 2.
3. a = 10/11, b = 20/11.
10. a = b = c = 1.
12. r = 3, s = 2, t = 3.
13. x = 1, y = 2, z = 5, w = 3.
CHAPTER 6
Section 6.1
1. (a), (d), (e), (f ), and (h).
384
Section 6.2
1. 2, 3.
2. 1, 4.
5. 3, 3.
6. 3, 3.
4. 3, 12.
3. 0, 8.
7. 34.
8. 4i.
9. i.
13. 2.
10. 1, 1.
11. 0, 0.
12. 0, 0.
16. t, 2t.
18. 2, 3.
19. 2, 4, 2.
20. 1, 2, 3.
21. 1, 1, 3.
22. 0, 2, 2.
23. 2, 3, 9.
24. 1, 2, 5.
25. 2, 3, 6.
26. 0, 0, 14.
28. 2, 2, 5.
29. 0, 0, 6.
30. 3, 3, 9.
31. 3, 2i.
32. 0, i.
33. 3, 3, 3.
34. 2, 4, 1, i 5.
35. 1, 1, 2, 2.
Section 6.3
2 1
1.
,
.
1 1
2.
1 2
,
.
1 3
1
1
5.
,
.
1 2
5
5
,
.
3 4i 3 + 4i
8.
4.
7.
10.
1 1
,
.
1 2
5
2
,
.
3 3
0
1
1
13. 1,1, 0.
0
1
1
0
3
0
16. 1,2,4.
1
0
3
11.
3 1
,
.
2 2
3.
5 ,
5 .
3 34 3 + 34
6.
5
5
,
.
2i 2+i
9.
1 1
,
.
1
2
2
1
2 2 + 2
,
.
1
12.
1
0
1
14. 4,1, 0.
1
0
1
0
1
1
17. 1, 0,1.
1
1
1
9
5
5
19. 1,1 + 2i,1 2i.
13
0
0
1
1
1
21. 0,1,0.
0
1
2
1 2
,
.
1
1
1
0
1
15. 4,1, 0.
1
0
1
1
1
1
18. 1,1, 1.
0
1
2
1
1
1
20. 0, i, i.
0
1
1
1
1
0
0
1 3 0 0
22.
0, 0,2,2.
0
0
1
1
385
10
1
2
2
6 0 0 0
23.
11,0,1, 1.
4
0
0
1
1/2
1/5
25.
.
,
1/ 2
2/ 5
1/3
1/ 2
0
27. 1,1/3 , 0 .
0
1/ 2
1/ 3
2/5
1/2
24.
,
.
1/ 2
1/ 5
3/13
1/5
26.
.
,
2/ 13
2/ 5
1/18
0
1/ 2
28. 4/18 ,1, 0 .
0
1/ 2
1/ 18
0
3/13
0
29. 1/2,2/ 13,4/5.
3/5
1/ 2
0
+
39.
1
2
1
2
,
.
(b) 16 .
Section 6.4
1. 9.
2. 9.2426.
3. 5 + 8 + = 4, = 17.
4. (5)(8) = 4, = 0.1.
6. (a) 6, 8;
(c) 6, 1;
(d) 1, 8.
(c) 6, 6, 12;
(d) 1, 5, 7.
8. (a) 2A,
(b) 5A,
(c) A2 ,
(d) A + 3I.
9. (a) 2A,
(b) A2 ,
(c) A3 ,
(d) A 2I.
386
Section 6.5
1.
1
.
1
1
2.
.
0
1
1
5. 0 , 0.
1
1
3
1
1
8. 0, 5, 2.
1
3
3
1
11. 1.
1
1
0
3.
,
.
0
1
1
1
1
4. 0, 1, 0.
1
0
1
0
1
1
6. 1, 0, 2.
0
1
1
1
1
1
9. 0, 2, 1.
1
1
1
1
1
1
12. 0, 2, 1.
1
1
1
1
0
0 1
14. ,
.
0
1
0
1
1
1
1
0
0 0 0 1
15. , , ,
.
0
1
0
1
1
0
1
1
1
0
1
0 1 1
16.
0, 1, 0.
0
0
1
Section 6.6
1.
2.
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
1.0000
0.6000
1.0000
0.5238
1.0000
0.5059
1.0000
0.5015
1.0000
0.5004
1.0000
Eigenvalue
Iteration
0
1
2
3
Eigenvector components
1.0000
1.0000
0.5000
1.0000
0.5000
1.0000
0.5000
1.0000
Eigenvalue
5.0000
4.2000
4.0476
4.0118
4.0029
10.0000
8.0000
8.0000
5
1
7. 4, 0.
1
1
1
10. 3.
9
1
1
13.
1.
1
387
3.
4.
5.
6.
7.
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
1.0000
0.6000
1.0000
0.6842
1.0000
0.6623
1.0000
0.6678
1.0000
0.6664
1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
1.0000
0.5000
1.0000
0.2500
1.0000
0.2000
1.0000
0.1923
1.0000
0.1912
1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
1.0000
1.0000
0.6000
1.0000
0.5217
1.0000
0.5048
1.0000
0.5011
1.0000
0.5002
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
1.0000
1.0000
0.4545
1.0000
0.4175
1.0000
0.4145
1.0000
0.4142
1.0000
0.4142
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000 1.0000 1.0000
0.2500 1.0000 0.8333
0.0763 1.0000 0.7797
0.0247 1.0000 0.7605
0.0081 1.0000 0.7537
0.0027 1.0000 0.7513
Eigenvalue
15.0000
11.4000
12.1579
11.9610
12.0098
2.0000
4.0000
5.0000
5.2000
5.2308
10.0000
9.2000
9.0435
9.0096
9.0021
11.0000
9.3636
9.2524
9.2434
9.2427
12.0000
9.8333
9.2712
9.0914
9.0310
388
8.
9.
10.
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000 1.0000 1.0000
0.6923 0.6923 1.0000
0.5586 0.7241 1.0000
0.4723 0.6912 1.0000
0.4206 0.6850 1.0000
0.3883 0.6774 1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000 1.0000 1.0000
0.4000 0.7000 1.0000
0.3415 0.6707 1.0000
0.3343 0.6672 1.0000
0.3335 0.6667 1.0000
0.3333 0.6667 1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000 1.0000
1.0000
0.4000 1.0000
0.3000
1.0000 0.7447
0.0284
0.5244 1.0000 0.3683
1.0000 0.7168 0.5303
0.6814 1.0000 0.7423
13.0000
11.1538
11.3448
11.1471
11.1101
20.0000
16.4000
16.0488
16.0061
16.0008
Eigenvalue
20.0000
14.1000
19.9504
18.5293
20.3976
1
1
0
11. 1 is a linear combination of 4 and 1, which are eigenvectors
1
1
0
corresponding to = 1 and = 2, not = 3. Thus, the power method converges to = 2.
13
2
3
6 converges after
14. Shift by = 16. Power method on A = 2 10
3
6 5
three iterations to = 14. + = 2.
389
15.
16.
17.
18.
19.
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
1.0000
0.3333
1.0000
1.0000
0.7778
0.9535
1.0000
1.0000
0.9904
0.9981
1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
0.5000
0.8571
1.0000
1.0000
0.9615
0.9903
1.0000
1.0000
0.9976
0.9994
1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
1.0000
0.2000
1.0000
0.1892
1.0000
0.2997
1.0000
0.3258
1.0000
0.3316
1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000
1.0000
0.2000
1.0000
0.3953
1.0000
0.4127
1.0000
0.4141
1.0000
0.4142
1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000 1.0000
1.0000
1.0000 0.4000 0.2000
1.0000 0.2703 0.4595
1.0000 0.2526 0.4949
1.0000 0.2503 0.4994
1.0000 0.2500 0.4999
0.6000
0.6000
0.9556
0.9721
0.9981
0.2917
0.3095
0.3301
0.3317
0.3331
0.2778
0.4111
0.4760
0.4944
0.4987
0.7143
1.2286
1.3123
1.3197
1.3203
Eigenvalue
0.3125
0.4625
0.4949
0.4994
0.4999
390
20.
21.
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000 1.0000 1.0000
0.3846 1.0000 0.9487
0.5004 0.7042 1.0000
0.3296 0.7720 1.0000
0.3857 0.6633 1.0000
0.3244 0.7002 1.0000
Eigenvalue
Iteration
0
1
2
3
4
5
Eigenvector components
1.0000 1.0000
1.0000
0.6667 1.0000 0.6667
0.3636 1.0000 0.3636
0.2963 1.0000 0.2963
0.2712 1.0000 0.2712
0.2602 1.0000 0.2602
0.1043
0.0969
0.0916
0.0940
0.0907
Eigenvalue
1.5000
1.8333
1.2273
1.0926
1.0424
7
2 3
25. Inverse power method applied to A = 2 4 6 converges to = 1/6.
3
6 1
+ 1/ = 10 + 6 = 16.
27 17 7
21 1 converges to
26. Inverse power method applied to A = 17
7
1 11
= 1/3. + 1/ = 25 + 3 = 22.
CHAPTER 7
Section 7.1
0 4
8
0
8 16
4 8, 0 8
16;
1. (a) 0
0
0
0
0
0
0
0
0
pk (1 )
pk (2 )
0 .
2. pk (A) = 0
0
0
pk (3 )
57
78
234
(b)
,
117 174
522
348
.
756
4. In general, AB = BA.
391
5. Yes.
6.
0 2
.
3 0
2
7. 0 .
3/2
0
sin(1 )
2
e
0 0
0
0
sin(2 )
15.
16. sin(A) = .
0 e
0.
.
.. .
..
..
...
.
0
0 1
0
0
sin(n )
sin(1)
0
sin(1)
0
17.
.
18.
.
0
sin(2)
0
sin(28)
19. cos A =
(1)k A2k
1
, cos
0
(2k)!
k=0
0
cos(1)
0
=
.
2
0
cos(2)
cos(2)
0
0
cos(2) 0.
20. 0
0
0
1
Section 7.2
2
1
1. A =
3/2
1
.
1/2
1/3
4. A1 1/3
1/2
1/3
1/6
1/4
2/3
1/6.
1/4
5. A1
1
0
=
0
0
0
1
0
0
0
0
1
0
0
0
.
0
1
Section 7.3
3
.
2
0
1
0 = 0 ,
.
3.
0 1
1 = 1 + 0 ;
0 = 0
3 6
4.
.
1 = 1 + 0 ;
1 2
1.
1 = 1 + 0 ,
1 = 1 + 0 ;
2
1
1
0
0
.
1
3
1
6
.
2
2.
5.
6.
0
0
1
.
1
392
$ %
478 + 378
478 + 2 378
$ %
$ %
$ %
.
7. 78
4 = 478 = 41 + 0 ;
2 478 2 378 2 478 378
$ %
441 + 2 341
441 + 341
$ %
$ %
$ %
8.
.
2 441 2 341 2 441 341
%%6
$
$
1 0 4 + 4 2222 3
9.
1 = 2 + 1 + 0 ,
$
$
%%6
1 = 2 1 + 0 ,
0 1 2 + 2 2222 3
.
2222 = 42 + 21 + 0 ;
0 0
2222
378 = 31 + 0 ,
10.
317 = 92 + 31 + 0 ,
517 = 252 + 51 + 0 ,
1017 = 1002 + 101 + 0 .
12.
1=
3
(2) = 83
25
$ 325 % = 273
4
= 643
25
11.
+ 2 + 1
+ 42 21
+ 92 + 31
+ 162 41
+
+
+
+
225
(2)25
325
425
0 ,
0 ,
0 ,
0 .
13.
1 = 4
1 = 4
256 = 164
256 = 164
6,561 = 814
14.
5,837 = 92 + 31 + 0 ,
381,255 = 252 + 51 + 0 ,
108 3 (10)5 + 5 = 1002 + 101 + 0 .
15.
165 = 83 + 42
357 = 83 + 42
5,837 = 273 + 92
62,469 = 643 + 162
16.
3=
3
357 = 83
5,837 = 273
68,613 = 643
17.
15 =
3
960 = 83
59,235 = 273
1,048,160 = 643
+ 3 + 2
3 + 2
+ 83 + 42
83 + 42
+ 273 + 92
+
+
21
21
31
41
1
1
21
21
31
+
+
+
+
+ 2 + 1
+ 42 21
+ 92 + 31
+ 162 41
+
+
+
+
+
0 ,
0 ,
0 ,
0 ,
0 .
0 ,
0 ,
0 ,
0 .
+
+
+
+
0 ,
0 ,
0 ,
0 .
+ 2 + 1
+ 42 21
+ 92 + 31
+ 162 41
+
+
+
+
0 ,
0 ,
0 ,
0 .
= 83 + 42
= 83 + 42
= 273 + 92
= 643 + 162
+
+
21
21
31
41
+
+
+
+
0 ,
0 ,
0 ,
0 .
393
18.
15 = 4
13 = 4
1,088 = 164
960 = 164
59,235 = 814
19.
9
3
9
.
3
22.
+ 3 + 2
3 + 2
+ 83 + 42
83 + 42
+ 273 + 92
3,007
1,024
6
20.
3
5,120
.
3,067
1
1
21
21
31
+
+
+
+
+
9
.
6
160
.
1130
938
23.
32
339
4440
1376
56,632
.
119,095
2 4 3
0
0.
24. 0
1 5 2
50,801
21.
113,264
25. 2, 569 = 42 + 21 + 0 ,
5, 633 = 42 21 + 0 ,
5 = 2 + 1 + 0 .
0 ,
0 ,
0 ,
0 ,
0 .
766
4101
3064
1110
344.
4445
0.003906
0.812500
0.000977
0.932312
0.229172.
0.755207
Section 7.4
1. 128 = 21 + 0 ,
448 = 1 .
2.
128 = 42 + 21 + 0 ,
448 = 42 + 1 ,
1,344 = 22 .
3. 128 = 42 + 21 + 0 ,
448 = 42 + 1 ,
1 = 2 + 1 + 0 .
4.
59,049 = 31 + 0 ,
196,830 = 1 .
= 273 + 92 + 31 + 0 ,
= 273 + 62 + 1 ,
= 183 + 22 ,
= 63 .
5.
59,049 = 92 + 31 + 0 ,
196,830 = 62 + 1 ,
590,490 = 22 .
6.
59,049
196,830
590,490
1,574,640
7.
512 = 83 + 42 + 21 + 0 ,
2,304 = 123 + 42 + 1 ,
9,216 = 123 + 22 ,
32,256 = 63 .
8.
512 = 83
2,304 = 123
9,216 = 123
1 = 3
9.
512 = 83
2,304 = 123
1 = 3
9 = 33
+
+
+
+
42
42
2
22
+
+
+
+
21 + 0 ,
1 ,
1 + 0 .
1 .
+
+
+
+
42 + 21 + 0 ,
42 + 1 ,
22 ,
2 + 1 + 0 .
394
10.
(5)10 3(5)5
10(5)9 15(5)4
90(5)8 60(5)3
720(5)7 180(5)2
(2)10 3(2)5
10(2)9 15(2)4
729 0
.
0
729
11.
= 5 (5)5
= 55 (5)4
= 205 (5)3
= 605 (5)2
= 5 (2)5
= 55 (2)4
+ 4 (5)4
+ 44 (5)3
+ 124 (5)2
+ 244 (5)
+ 4 (2)4
+ 44 (2)3
+ 3 (5)3
+ 33 (5)2
+ 63 (5)
+ 63 ,
+ 3 (2)3
+ 33 (2)2
3
0.
4
4
12. 0
5
1
1
1
0
13. 0
0
+ 2 (5)2 + 1 (5) + 0 ,
+ 22 (5) + 1 ,
+ 22 ,
+ 2 (2)2 + 1 (2) + 0 ,
+ 22 (2) + 1 .
0
0
0
0
0.
0
Section 7.5
1.
4.
e = 1 + 0 ,
e2 = 21 + 0 .
2. e2 = 21 + 0 ,
e2 = 1 .
e1 = 2 + 1 + 0 ,
= 42 21 + 0 ,
e3 = 92 + 31 + 0 .
e2
6.
sin (1) = 2 + 1 + 0 ,
sin (2) = 42 + 21 + 0 ,
sin (3) = 92 + 31 + 0 .
8.
e2
e2
e2
e2
= 83 + 42 + 21 + 0 ,
= 123 + 42 + 1 ,
= 123 + 22 ,
= 63 .
3. e2 = 42 + 21 + 0 ,
e2 = 42 + 1 ,
e2 = 22 .
e2 = 42 21 + 0 ,
e2 = 42 + 1 ,
e1 =
2 + 1 + 0 .
5.
7.
sin (2) = 42 21 + 0 ,
cos (2) = 42 + 1 ,
sin (1) =
2 + 1 + 0 .
9.
e2 = 83 + 42 + 21 + 0 ,
e2 = 123 + 42 + 1 ,
2
e = 83 + 42 21 + 0 ,
e2 = 123 42 + 1 .
10.
sin (2) = 83 + 42 + 21 + 0 ,
cos (2) = 123 + 42 + 1 ,
sin (2) = 83 + 42 21 + 0 ,
cos (2) = 123 42 + 1 .
11.
e3 = 273 + 92 + 31 + 0 ,
e3 = 273 + 62 + 1 ,
e3 = 183 + 22 ,
1
e = 3 + 2 1 + 0 .
12.
395
1 3e5 + 4e2 3e5 3e2
3 2 1
.
14.
e
.
1
0
7 4e5 4e2 4e5 + 3e2
12e2 + 4e2
4e2 4e2
0 1 3
1
2
2
15. e2 1 2 5.
16.
12e 12e
4e2 + 12e2
16
0 0 1
0
0
13.
17.
1 1
4
5
18. (a)
38e2 + 2e2
46e2 6e2 .
16e2
6
.
1
log(3/2)
0
log(3/2) log(1/2)
.
log(1/2)
(b) and (c) are not dened since they possess eigenvalues having absolute
value greater than 1.
(d)
0
0
0
.
0
Section 7.6
8t
3e + 4et 4e8t 4et
.
1. 1/7
3e8t 3et 4e8t + 3et
(2/ 3) sinh 3t + cosh 3t
2.
(1/ 3) sinh 3t
(1/ 3) sinh 3t
.
(2/ 3) sinh 3t + cosh 3t
Note:
e 3t e 3t
e 3t + e 3t
sinh 3t =
and cosh 3t =
.
2
2
0.2e2t 0.2e7t
1+t
t
1.4e2t 0.4e7t
3. e3t
.
4.
.
t
1t
2.8e2t + 2.8e7t 0.4e2t + 1.4e7t
0.8e2t + 0.2e7t 0.4e2t 0.4e7t
5.
.
0.4e2t 0.4e7t 0.2e2t + 0.8e7t
0.5e4t + 0.5e16t 0.5e4t 0.5e16t
6.
.
0.5e4t 0.5e16t 0.5e4t + 0.5e16t
1 t t 2 /2
7. 0 1
t .
0 0
1
396
12et
1
9et + 14e3t 5e3t
8.
12
24et + 14e3t + 10e3t
0
8e3t + 4e3t
8e3t 8e3t
0
4e3t 4e3t .
4e3t + 8e3t
Section 7.7
(1/2) sin 2t + cos 2t
(1/2) sin 2t
1.
.
(5/2) sin 2t
(1/2) sin 2t + cos 2t
2 sin 2t + cos 2t
2 sin 2t
2.
.
(3/ 2) sin 2t
2 sin 2t + cos 2t
cos(8t) 18 sin(8t)
3.
.
8 sin(8t) cos(8t)
1 2 sin(8t) + 4 cos(8t)
4 sin(8t)
.
4.
5 sin(8t)
2 sin(8t) + 4 cos(8t)
4
2 sin(t) + cos(t)
5 sin(t)
5.
.
sin(t)
2 sin(t) + cos(t)
1
4 sin(3t) + 3 cos(3t)
sin(3t)
6. e4t
.
25 sin(3t)
4 sin(3t) + 3 cos(3t)
3
sin t + cos t
sin t
4t
7. e
.
2 sin t
sin t + cos t
Section 7.8
3. A does not have an inverse.
e e1
1
A
B
8. e =
, e =
0
1
0
e1
,
1
e A eB
e
=
0
2e2 2e
,
e
e 2e 2
e 2e
A+B
=
, e
=
.
0
e
0 e
1 0
3 0
9. A =
, B=
. Also see Problem 10.
0 2
0 4
e B eA
11. First show that for any integer n, (P1 BP)n = P1 Bn P, and then use Eq. (6)
directly.
397
Section 7.9
1. (a)
sin t
2
4.
sin t + c1
t 2 + c3
2t 1
2 cos 2t
(b)
2t
.
e(t1)
1 3
3 t t + c2
e(t1) + c4
6t 2 et
2t + 3
0
1
.
1/t
CHAPTER 8
Section 8.1
x(t)
1. x(t) =
,
y(t)
y(t)
2. x(t) =
,
z(t)
x(t)
3. x(t) =
,
y(t)
x(t)
4. x(t) =
,
y(t)
2 3
0
6
A(t) =
, f(t) =
, c=
, t0 = 0.
4 5
0
7
3 2
0
1
A(t) =
, f(t) =
, c=
, t0 = 0.
4 1
0
1
3
3
1
0
A(t) =
, f(t) =
, c=
, t0 = 0.
4 4
1
0
3 0
t
1
A(t) =
, f(t) =
, c=
, t0 = 0.
2 0
t+1
1
2
x(t)
3t
, A(t) =
y(t)
1
t
e
u(t)
6. x(t) = v(t) , A(t) = t 2
w(t)
0
x(t)
0
7. x(t) = y(t), A(t) = 1
z(t)
0
2
r(t)
t
8. x(t) = s(t) , A(t) = 1
u(t)
2
5. x(t) =
2
2
7
, f(t) =
, c=
, t0 = 1.
2t
3
t
t
1
0
0
3 t + 1, f(t) = 0, c = 1, t0 = 4.
2
0
1
1 et
6
1
0
10
0 3, f(t) = 0, c = 10, t0 = 0.
2
0
0
20
sin t
3 sin t
1
0 , f(t) = t 2 1,
cos t
et t 2 1
4
c = 2, t0 = 1.
5
9. Only (c).
398
Section 8.2
0
x1 (t)
, A(t) =
1. x(t) =
3
x2 (t)
1
0
4
, f(t) =
, c=
, t0 = 0.
2
0
5
x1 (t)
0
2. x(t) =
, A(t) =
x2 (t)
t
0
2
1
, f(t) =
, c=
, t0 = 1.
0
0
et
0
x1 (t)
, A(t) =
3. x(t) =
1
x2 (t)
0
1
3
, f(t) = 2 , c =
, t0 = 0.
0
3
t
0
x1 (t)
, A(t) =
3
x2 (t)
4. x(t) =
1
0
0
,
f(t)
=
,
c
=
, t0 = 0.
2et
2et
0
0
x1 (t)
, A(t) =
5. x(t) =
2
x2 (t)
1
0
2
, f(t) = t , c =
, t0 = 1.
3
e
2
x1 (t)
0
6. x(t) = x2 (t), A(t) = 0
x3 (t)
1/4
1
0
0
0
0
2
1 , f(t) = 0, c =
1,
t/4
0
205
t0 = 1.
0
x1 (t)
0
x2 (t)
, A(t) =
7. x(t) =
x3 (t)
0
0
x4 (t)
1
0
0
et
0
1
0
tet
0
1
0
0
2
0
, f(t) =
1
0 , c = ,
0
et
e3
t0 = 0.
x1 (t)
0
x2 (t)
0
x3 (t)
0
8. x(t) =
, A(t) =
0
x4 (t)
x5 (t)
0
0
x6 (t)
2
1
0
c=
2, t0 = .
1
0
1
0
0
0
0
0
0 0
1 0
0 1
0 0
0 0
0 0
0
0 0
0
0 0
0
0 0
,
, f(t) =
1 0
0
0 1
2
4 0
t t
399
Section 8.3
x1 (t)
1. x(t) = x2 (t),
y1 (t)
x1 (t)
x2 (t)
2. x(t) =
y1 (t),
y2 (t)
x1 (t)
3. x(t) = y1 (t),
y2 (t)
x1 (t)
4. x(t) = y1 (t),
y2 (t)
x1 (t)
x2 (t)
y1 (t)
,
5. x(t) =
y2 (t)
y3 (t)
y4 (t)
0 1
0
0
7
4, f(t) = 0, c = 8, t0 = 0.
A(t) = 3 2
5 0 6
0
9
0
1 0 0
0
2
0
0
3
1 0 1
A(t) =
, f(t) = , c =
, t = 0.
0
0 0 1
0
4 0
0 1 0 1
0
4
4 0 t 2
0
1
A(t) = 0 0 1 , f(t) = 0, c = 0, t0 = 2.
0
0
t2 t 0
4 0 2
t
0
A(t) = 0 0 1, f(t) = 0, c = 0, t0 = 3.
3 t 0
1
0
0 1 0 0 0 0
0
0 2 0 0 0 1
t
0 0 0 1 0 0
0
A(t) =
, f(t) = 0,
0 0 0 0 1 0
0 0 0 0 0 1
0
t 0 t 0 1 0
et
2
0
0
c=
3, t0 = 1.
9
4
x1 (t)
0
x2 (t)
0
6. x(t) =
x3 (t), A(t) = 1
y1 (t)
0
y2 (t)
1
21
4
c=
5, t0 = 0.
5
7
1
0
0
0
0
0
1
0
0
1
0 0
0
0
0 0
1 1
, f(t) = 0,
0
0 1
0 2
0
400
0
x1 (t)
0
y1 (t)
7. x(t) =
y2 (t), A(t) = 0
1
z1 (t)
1
0
0
1
0
1
0
0
0
2
1
0
2
0
, f(t) = , c = , t0 = .
2
17
1
0
0
0
x1 (t)
0 1 0 0
x2 (t)
0 0 1 0
y1 (t)
, A(t) = 0 0 0 1
8. x(t) =
y2 (t)
1 0 1 0
z1 (t)
0 0 0 0
1 0 0 0
z2 (t)
0
1
0
0
0
1
0
0
2
0
0
, f(t) = 0,
1
0
0
1
0
1
4
4
5
c=
5, t0 = 20.
9
9
Section 8.4
t t 2 /2
1 t ,
0
1
1
3. (a) e3t 0
0
1
(c) e3(ts) 0
0
1
(b) e3(t2) 0
0
(t 2)
1
0
(t 2)2 /2
(t 2) ,
1
(t 2) (t s)2 /2
1
(t s) ,
0
1
1 (t 2) (t 2)2 /2
(d) e3(t2) 0
1
(t s) .
0
0
1
5t
1 2e + 4et 2e5t 2et
1 2e5s + 4es
5. (a)
,
(b)
6 4e5t 4et 4e5t + 2et
6 4e5s 4es
1 2e5(t3) + 4e(t3) 2e5(t3) 2e(t3)
.
6 4e5(t3) 4e(t3) 4e5(t3) + 2e(t3)
1 sin 3t + 3 cos 3t
5 sin 3t
6. (a)
,
2 sin 3t
sin 3t + 3 cos 3t
3
(c)
(b)
1 sin 3s + 3 cos 3s
2 sin 3s
3
5 sin 3s
,
sin 3s + 3 cos 3s
2e5s 2es
,
4e5s + 2es
401
1 sin 3(t s) + 3 cos 3(t s)
5 sin 3(t s)
.
(c)
2 sin 3(t s)
sin 3(t s) + 3 cos 3(t s)t
3
7. x(t) = 5e(t2) 3e(t2) , y(t) = 5e(t2) e(t2) .
8. x(t) = 2e(t1) 1, y(t) = 2e(t1) 1.
9. x(t) = k3 et + 3k4 et , y(t) = k3 et + k4 et .
10. x(t) = k3 et + 3k4 et 1, y(t) = k3 et + k4 et 1.
11. x(t) = cos 2t (1/6) sin 2t + (1/3) sin t.
12. x(t) = t 4 /24 + (5/4)t 2 (2/3)t + 3/8.
$ %
13. x(t) = (4/9) e2t + 5/9 e1t (1/3) te1t
14. x(t) = 8 cos t 6 sin t + 8 + 6t,
y(t) = 4 cos t 2 sin t 3.
Section 8.5
4. First show that
t1
(t1 , t0 )
1
(t1 , t0 )
t0
= (t0 , t1 )
t1
1
t0
=
t1
1
.
t0
CHAPTER 9
Section 9.1
1. (a) The English alphabet: a, b, c, . . . x, y, z. 26. 5/26.
(b) The 366 days designated by a 2008 Calendar, ranging from 1 January
through 31 December. 366. 1/366.
(c) A list of all 43 United States Presidents. 43. 1/43.
(d) Same as (c). 43. 2/43 (Grover Cleveland was both the 22nd and 24th
President).
(e) Regular deck of 52 cards. 52. 1/52.
(f) Pinochle deck of 48 cards. 48. 2/48.
(g) See Figure 9.1 of Chapter 9. 36. 1/36.
4. 1950.
Section 9.2
1. (a) 8/52.
(b) 16/52.
(c) 28/52.
(d) 2/52.
(e) 28/52.
(f) 26/52.
(g) 39/52.
(h) 48/52.
(i) 36/52.
2. (a) 18/36.
(b) 15/36.
(c) 10/36.
(d) 30/36.
(e) 26/36.
(f) 1.
3. (a) 108/216.
(b) 1/216.
(c) 1/216.
(d) 3/216.
(e) 3/216.
(f) 0.
(g) 213/216.
(h) 210/216.
(i) 206/216.
4. 0.75.
5. 0.4.
(b) 7.
(c) 56.
(d) 190.
(e) 190.
(f) 1.
(g) 1.
(h) 100.
(i) 1000.
(j) 1.
2. 2,042,975.
3. 5005.
403
500
(.65)123 (.35)377 .
123
(
)
500
(b)
(.65)485 (.35)15 .
485
(
)
(
)
500
500
(.65)498 (.35)2
(c)
(.65)497 (.35)3 +
498
497
5. (a)
)
(
)
500 $ %499 $ %1
500 $ %500 $ %0
.35 +
.35 .
.65
.65
499
500
(
)
(
)
(
)
500 $ %499 $ %1
500 $ %500 $ %0
500 $ %498 $ %2
.35
.35
.35 .
.65
.65
(d) 1
.65
499
500
498
(
)
(
)
(
)
500 $ %200 $ %300
500 $ %300 $ %200
500 $ %100 $ %400
.35
.35
.35
+
.65
+
.65
(e)
.65
200
300
100
(
+
500
400
.65
%400 $
.35
%100
(
+
500
500
.65
%500 $
.35
%0
6. Approximately .267.
7. Approximately .267.
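Expressions such as the one in 5(a) can be evaluated exactly by machine; a minimal sketch:

```python
from math import comb

# Binomial probability C(n, k) * p**k * (1 - p)**(n - k),
# evaluated for the expression in answer 5(a).
def binomial_prob(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(binomial_prob(500, 123, 0.65))  # prints a vanishingly small probability
```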
Section 9.4
1. (a) There is a negative element in the second row.
(b) The rst row does not add to 1.
(c) The third row does not add to 1.
(d) It is not a square matrix.
2. (a) If it is sunny today, there is a probability of .5 that it will be sunny tomorrow and a .5 probability that it will rain tomorrow. If it rains today, there is
a .7 probability that it will be sunny tomorrow and a .3 chance that it will
rain tomorrow.
(b) If a parking meter works today, there is a probability of .95 that it will
work tomorrow with a .05 probability that it will not work tomorrow. If the
parking meter is inoperative today, there is a probability of .02 that it will be fixed tomorrow and a .98 probability that it will not be fixed tomorrow.
(c) Any scenario has a 50–50 chance at any stage.
(d) What is good stays good; what is bad stays bad.
(e) What is good today is bad tomorrow; what is bad today is good
tomorrow.
(f) See Example 2 in Section 9.4 and use Tinker, Evers, and Chance for Moe,
Curly, and Larry and instead of visiting or staying home use borrowing a
car or not borrowing a car.
3. Clearly, if we raise either matrix to any power, we obtain the original matrix.

4. The even powers produce

[ 1  0 ]
[ 0  1 ]

and the odd powers give back the original matrix. The situation repeats itself after an even number of time periods.
CHAPTER 10
Section 10.1
1. 11, 5.
2. 8, 4.
5. 64, 68.
9. 5/6, 7/18.
13. 2, 3.
6. 6, 5.
10. 5/ 6, 1.
14. 1, 1.
3/5
18.
.
4/5
3/17
22. 2/17.
2/ 17
3. 50, 74.
4. 63, 205.
7. 26, 24.
8. 30, 38.
12. 0, 1400.
17. undefined, 6.
4/34
+
21. 3/ 34.
23. 2/ 21 4/ 21 1/ 21 .
3/ 34
4/197
+
,
6/ 197
. 25. 1/ 55 2/ 55 3 55
4/
55
5/
55 .
24.
9/ 197
8/ 197
+
,
405
40. 145
41. 27.
42. 32.
Section 10.2
1. x and y, x and u, y and v, u and v.
2. x and z, x and u, y and u, z and u, y and v.
4. 4.
5. 0.5.
7. x = 1, y = 2.
8. x = y = z.
10.
1/5
2/5
,
.
2/ 5
1/ 5
12.
2/13
3/13
,
.
2/ 13
3/ 13
2/45
1/3
2/5
14. 1/ 5 , 4/ 45 , 2/3.
2/3
0
5/ 45
3. 20/3.
6. x = 3y.
9. x = y = z; z = 1/ 3.
11.
1/2
1/2
,
.
1/ 2
1/ 2
1/6
1/3
1/ 2
13. 2/ 6 , 1/3 , 0 .
1/ 2
1/ 3
1/ 6
1/6
1/3
1/2
15. 1/ 2 , 1/3 , 1/ 6 .
0
1/ 3
2/ 6
0
3/5
4/5
16. 3/5, 16/25 , 12/25.
4/5
12/25
9/25
0
3/15
3/35
1/7
1/ 3 2/ 15 3/ 35 1/ 7
17.
1/ 3, 1/ 15 , 4/ 35 , 1/ 7 .
1/ 3
2/ 7
1/ 15
1/ 35
1/ 6
1/ 3
1/2
0
1/ 2 1/ 6 1/ 3 0
18.
0 , 2/6 , 1/3, 0.
1
0
0
0
23. x y2 = x y, x y = x2 2x, y + y2 .
24. sx + ty2 = sx ty, sx ty = sx2 2stx, y + ty2 .
25. I.
406
Section 10.3
1. (a) θ = 36.9°,
2. (a) θ = 14.0°,
3. (a) θ = 78.7°,
4. (a) θ = 90°,
5. (a) θ = 118.5°,
1.6
(b)
,
0.8
0.7059
(b)
,
1.1765
0.5
(b)
,
0.5
0
(b)
,
0
0.7529
(b)
,
3.3882
6. (a) θ = 50.8°,
1
(b) 0,
1
7. (a) θ = 19.5°,
8/9
(b) 8/9,
4/9
8. (a) θ = 17.7°,
1.2963
(b) 3.2407,
3.2407
9. (a) θ = 48.2°,
2/3
2/3
(b)
2/3,
0
7/6
7/3
(b)
0 ,
7/6
0.8944 2.2361
0.4472 0.0000
0.7071 0.7071 1.4142
12.
0.7071
0.7071 0.0000
0.8321 0.5547 3.6056
13.
0.5547 0.8321 0.0000
11.
0.4472
0.8944
1.7889
.
1.3416
5.6569
.
1.4142
0.8321
.
4.1603
0.6
(c)
.
1.2
0.2941
(c)
.
0.1765
2.5
(c)
.
2.5
4
(c)
.
1
6.2471
(c)
.
1.3882
1
(c) 1.
1
1/9
(c) 1/9.
4/9
1.2963
(c) 0.2407.
0.7593
2/3
1/3
(c)
1/3 .
1
13/6
1/3
(c)
3 .
17/6
407
14.
15.
16.
17.
18.
19.
20.
21.
0.3333
0.8085
3.0000 2.6667
0.6667
0.1617
.
0.0000 1.3744
0.6667 0.5659
0.3015 0.2752
0.3015 0.8808 3.3166 4.8242 .
0.0000 1.6514
0.9045
0.3853
0.7746 0.4034
0.5164 0.5714 3.8730 0.2582
0.8944 0.2981
0.3333
2.2361 0.4472 1.7889
0.4472
0.5963 0.6667 0.0000 1.3416 0.8944.
0.0000
0.7454
0.6667
0.0000 0.0000 2.0000
0.7071
0.5774 0.4082
1.4142 1.4142 2.8284
0.7071 0.5774
0.4082 0.0000 1.7321 0.5774.
0.0000
0.5774
0.8165
0.0000 0.0000 0.8165
0.00
0.60
0.80
5 3 7
0.60
0.64 0.48 0 5 2.
0.80 0.48
0.36
0 0 1
0.0000
0.7746
0.5071
1.7321 1.1547 1.1547
0.5774 0.5164
0.5071
0.7071 0.4082
0.5774
1.4142 0.7071 0.7071
0.7071
0.4082 0.5774
24. QR = A.
Section 10.4
1. A1 = R0 Q0 + 7I
19.3132 1.2945
0.0000
0.3624
0.0756
7.0231 0.9967 0.0000 0.9967
= 0.0000
0.0000
0.0000
0.0811
0.9320
0.0294
1 0 0
0.0000
2.7499 17.8357
+ 7 0 1 0 = 0.9289 0.0293 0.2095.
0 0 1
0.0756
0.0024
7.0293
0.9289
0.0811
0.3613
408
2. A1 = R0 Q0 14I
24.3721 17.8483
3.8979
0.6565 0.6250 0.4223
8.4522 4.6650 0.6975 0.2898 0.6553
= 0.0000
0.0000
0.0000
3.6117
0.2872
0.7248 0.6262
1 0 0
15.5690
7.2354
1.0373
2.6178.
14 0 1 0 = 7.2354 19.8307
0 0 1
1.0373
2.6178 11.7383
3. Shift by 4.
4.1231 0.9701
0.0000 13.5820
0.0000
4.0073 0.9982 4.1982
,
R0 =
0.0000
0.0000
4.0005 12.9509
0.0000
0.0000
0.0000
3.3435
0.2353 0.0570
3.3809
0.9719 0.0138 1.0529
A1 = R0 Q0 + 4I =
0.0000
0.9983
3.4864
0.0000
0.0000
0.8358
4. 7.2077, 0.1039 1.5769i.
13.1545
4.0640
.
13.5081
0.7626
6. 2, 3, 9.
7. Method fails. A0 − 7I does not have linearly independent columns, so no QR-decomposition is possible.
8. 2, 2, 16.
12. i, 2 3i.
9. 1, 3, 3.
10. 2, 3 i.
11. 1, i.
Section 10.5
1. x = 2.225, y = 1.464.
2. x = 3.171, y = 2.286.
3. x = 9.879, y = 18.398.
4. x = 1.174, y = 8.105.
(c) 21.9.
13. Solving system (20) gives

m = [ N Σ xi yi − (Σ xi)(Σ yi) ] / [ N Σ xi² − (Σ xi)² ],

c = [ (Σ xi²)(Σ yi) − (Σ xi)(Σ xi yi) ] / [ N Σ xi² − (Σ xi)² ],

where every sum runs from i = 1 to N. If N Σ xi² is near (Σ xi)², the denominator is the difference of two nearly equal numbers, so computing m and c is susceptible to round-off error.

14.
0.841
23. E = 0.210.
2.312
0.160
0.069
24. E =
0.042.
0.173
Index

A
Adjacency matrices, 27–28, 42f
Adjugates, 167–168
Algorithms, 135
Gram-Schmidt, revised, 331–337
QR, 339–344
Augmented matrix, 55–56

B
Bernoulli, Jakob, 305
Bernoulli trials, 305–310
binomial distribution and, 305
independent events and, 305
multiplication rule for, 305
Binomial distribution, 305
Normal Approximation to
Binomial Distribution, 314
Block diagonals, 31
Bounded regions. See Finite regions;
Regions

C
Calculations, for inversions,
101–108
theorems for, 101
for square matrices, 102
Calculus, for matrices, 213–255.
See also Function e^At
Cayley-Hamilton theorem,
219–222, 254–255
consequence of, 220
verification of, 219–220
matrix derivatives in, 248–254
definitions of, 248–253
properties of, 250–251
polynomials in, 222–232
for eigenvalues, 222–228
D
Decomposition. See LU
decomposition
DeMoivre's formula, 241
Derivatives, for matrices, 248–254
definitions for, 248–253
properties of, 250–251
Determinants, 149–175
cofactor expansion and, 152–156
definitions of, 152–155
minors and, 152
pivotal expansion and, 165
Cramer's rule and, 170–173
restrictions on, 170–171
simultaneous linear equations
and, 173
definitions of, 149–150
eigenvalues/eigenvectors and, 193
inversion and, 167–170
adjugates and, 167–168
cofactor matrices and, 167
Determinants (continued)
definitions of, 167–168
theorems for, 167
pivotal condensation and, 163–167
coding for, 164
cofactor expansion and, 165
properties of, 157–163
multiplications of matrices and,
158
row vectors as, 157–158
Diagonal elements, 2
Diagonal matrices, 21–22
block, 31
partitioning and, 31
Disjoint events, 302
complements to, 309
independent v., 305
in Markov chains, 311
Dominant eigenvalues, 202
iterations of, 204–205
Dot products, 327–330

E
Eigenvalues, 177–194, 201–212.
See also Inverse power
method
for companion matrices, 184
definitions of, 177–178
dominant, 202
iterations of, 204–205
eigenvectors and, 180–190
characteristic equation of,
180–182
function e^At and, 241–244
matrix calculus for, 222–228
multiplicity of, 182
power methods with, 201–212
deficiencies of, 205
inverse, 205–211
properties of, 190–194
determinants and, 193
trace of matrix as, 190–191
for upper/lower triangular
matrix, 191
Eigenvectors, 177–212. See also
Inverse power method
for companion matrices, 184
definitions of, 177–178
eigenvalues and, 180–190
characteristic equation of,
180–182
linearly independent, 186, 194–201
theorems for, 195–200
nontrivial solutions with, 185–186

F
Feasibility, region of, 131–132
in linear programming,
137–138
minimization in, 132
objective functions of, 132
properties of, 131
convex, 131
Simplex method and, 140
three-dimensional, 140
unbounded, 133
vertices in, 131
Finite regions, 130
First derived sets, 50
Force, vectors and, 40
Function e^At, 238–248
DeMoivre's formula and, 241
eigenvalues and, 241–244
Euler's relations and, 242–243
properties of, 245–248
Fundamental forms, 257–262
definitions of, 259–261
homogenous, 260
initial value problems in, 258
nonhomogenous, 260
theorem for, 261
Fundamental Theorem of Linear
Programming, 135–136
G
Gaussian elimination method, 54–65
augmented matrix in, 55–56
with complete pivoting strategies,
69–70
definition of, 54–55

H
Hadley, G., 140
Homogenous fundamental forms,
260
Horizontal bar, in Simplex method,
142

I
Identity matrices, 22
inversions for, 102
Independent events, 302, 305
Bernoulli trials and, 305
complements to, 309
disjoint v., 305
in Markov chains, 311
as sequential, 305
Inequalities, 127–134. See also
Feasibility, region of;
Graphing
Cauchy-Schwarz, 319
finite regions and, 130
graphing for, 127–131
infinite regions and, 129
intersections within, 129–130
modeling with, 131–134
region of feasibility and, 131
strict, 129
Infinite regions, 130
Initial tableaux, 142
Inner products, 315–344. See also
Orthonormal vectors
Cauchy-Schwarz inequality
and, 319
dot, 327–330
nonzero vectors, 316
normalized, 316
orthonormal vectors, 320–327,
334–344
Gram-Schmidt
orthonormalization process
for, 322–325
projections for, 327–338
QR decompositions for, 334–344
sets of, 320
theorems for, 321–325
unit vectors, 316
Intersections, within sets,
129–130, 301
Inverse power method, 205–211
iterations of, 206t, 210t
shifted, 209–210
Inversions, of matrices, 7, 93–126
calculations for, 101–108
definition of, 93–95
determinants and, 167–170
adjugates and, 167–168
cofactor matrices and, 167
definitions of, 167–168
theorems for, 167
for elementary matrices, 95–96
for identity matrices, 102
invertible, 93
for lower triangular, 113
LU decomposition in, 115–124
construction of, 118–119
Gaussian elimination methods
and, 121
for nonsingular square
matrices, 115–116
scalar negatives and, 117
in upper triangular matrices,
118
nonsingular, 93
properties of, 112–115
extensions of, 113
symmetry, 113
theorems of, 112–113
transpose, 113
simultaneous equations and,
109–112
singular, 93
for upper triangular, 113
Invertible inversions, 93

L
Laws of probability, 301–307, 309
combinations in, 306–307
disjoint events under, 302
independent events under, 302
Bernoulli trials and, 305
complements to, 309
disjoint v., 305
as sequential, 305
Least-squares, 344–354
error in, 346
linearly dependent columns
and, 349
M
Maclaurin series, 214
Markov chains, 45, 189–190, 310–313
modeling with, 310–313
disjoint events in, 311
independent events in, 311
Mathematical models
Leontief closed, 47–48
Leontief Input-Output, 49–50
Matrices, 1–40. See also Calculus, for
matrices; Determinants;
Eigenvalues; Eigenvectors;
Inner products; Inversions, of
matrices; Multiplications, of
matrices; Probability;
Transpose matrices; Vectors
augmented, 55–56
basic concepts of, 1–5
calculus for, 213–255
Cayley-Hamilton theorem,
219–222, 254–255
matrix derivatives in, 248–254
polynomials in, 222–232
well-defined functions in,
213–219, 233–248
cofactor, 167
companion, 184
constant coefficient systems,
solution of, 275–286
quantity replacement in,
277–278
definition of, 1–2
derivatives of, 248–254
definitions of, 248–253
properties of, 250–251
determinants and, 149–175
cofactor expansion and, 152–156
Cramer's rule and, 170–173
definitions of, 149–150
inversion and, 167–170
pivotal condensation and,
163–167
properties of, 157–163
Matrices (continued)
diagonal, 21–22
block, 31
partitioning and, 31
eigenvalues for, 177–194, 201–212
companion matrices for, 184
definitions of, 177–178
dominant, 202
eigenvectors and, 180–190
multiplicity of, 182
polynomials in, 222–228
power methods with, 201–212
properties of, 190–194
eigenvectors for, 177–212
companion matrices for, 184
definitions of, 177–178
eigenvalues and, 180–190
linearly independent, 186,
194–201
nontrivial solutions with,
185–186
power methods with, 201–212
properties of, 190–194
elementary, 95–96
inversions of, 95–96
elements in, 1–2
equality between, 1–5
inner products and, 315–344
Cauchy-Schwarz inequality
and, 319
dot, 327–330
nonzero vectors, 316
orthonormal vectors, 320–327,
334–344
unit vectors, 316
inversion of, 7, 93–126
calculations for, 101–108
definition of, 93–95
determinants and, 167–170
for elementary matrices, 95–96
for identity matrices, 102
invertible, 93
for lower triangular, 113
LU decomposition in, 115–124
nonsingular, 93
properties of, 112–115
simultaneous equations and,
109–112
singular, 93
for upper triangular, 113
least-squares and, 344–354
error in, 346
linearly dependent columns
and, 349
scatter diagram for, 344f, 345f
column, 33
components of, 33
definitions, 33–34
dimension of, 33
geometry of, 37–41
magnitude of, 33
normalized, 34
row, 33
unit, 34
zero
nonzero v., 20
submatrices, 31
Matrix calculus. See Calculus, for
matrices
Matrix functions, 245
Minors, 152
Models. See Mathematical models
Multiplication rule, 305
Multiplications, of matrices, 7, 9–19.
See also Simultaneous linear
equations
cancellations in, 14
coefficient matrix in, 14–15
definitions of, 12, 15
determinants and, 158
elements of, 10–11
partitioning and, 29
properties of, 13
commutativity and, 13
rules of, 9–11
postmultiplication, 10
premultiplication, 10
simultaneous linear equations
and, 14, 43–91
Cramer's rule and, 170–173
Gaussian elimination method
in, 54–65
linear independence and, 71–78
linear systems in, 43–50
pivoting strategies for, 65–71
rank and, 78–84
substitution methods in, 50–54
theory of solutions in, 84–88
N
Negative numbers, location of, 143
Nonhomogenous fundamental
forms, 260
Nonsingular inversions, 93
Nonsingular square matrices,
115116
Nontrivial solutions, 87
with eigenvectors, 185186
Nonzero matrices, 20
415
Index
Nonzero vectors, 316
normalized, 316
Normal Approximation to Binomial
Distribution, 314
Normalized vectors, 34
nth order equations, 263269
reduction of, 263269
variable denition for, 263
O
Optimization, 127148. See also
Simplex method, for
optimization
inequalities and, 127134
nite regions and, 130
graphing for, 127131
innite regions and, 129
intersections within, 129130
modeling with, 131134
problems in, 132
strict, 129
linear programming and, 135140
algorithms in, 135
coefcient ratios in, 137
Fundamental Theorem of
Linear Programming and,
135136
redundant constraints in,
137138
region of feasibility
in, 137138
Simplex method for, 140147
horizontal bar in, 142
initial tableaux in, 142
in Linear Programming, 140
region of feasibility and, 140
slack variables in, 141
steps in, 143
vertical bar in, 142
Orthogonal sets, 320
Orthonormal vectors, 320327
Gram-Schmidt
orthonormalization process
for, 322325, 331337
for projections, 331337
projections for, 327338
dot products in, 327330
Gram-Schmidt process for,
331337
QR decompositions for, 334344
algorithms for, 339344
iterations of, 335337
sets of, 320
theorems for, 321325
P
Partial pivoting strategies, for
simultaneous linear
equations, 6567
denition of, 65
in Gaussian elimination
methods, 65
Partitioning, 2932
block diagonals and, 31
denition of, 29
multiplication of matrices
and, 29
Pivotal condensation, 163167
coding for, 164
cofactor expansion and, 165
Pivoting strategies, for simultaneous
linear equations, 6571
complete, 6971
with Gaussian elimination
methods, 6970
round-off errors and, 69
partial, 6567
denition of, 65
with Gaussian elimination
methods, 65
scaled, 6768
ratios in, 6768
Pivots, 59
in work column location, 143
Polynomials, 222232
for eigenvalues, 222228
for general cases, 228233
Postmultiplication, 10
Power methods, with
eigenvalues/eigenvectors,
201212
deciencies of, 205
inverse, 205211
iterations of, 206t, 210t
shifted, 209210
Premultiplication, 10
Probability, 297310. See also Laws
of probability; Markov
chains
Bernoulli trials and, 305310
combinatorics and, 305310
interpretations of, 297298
laws of, 301307, 309
combinations in, 306307
disjoint events under, 302
independent events under, 302
Markov chains and, 45, 189190,
310313
modeling with, 310313
Q
QR decompositions, 334344
algorithms for, 339344
iterations of, 335337
QR-algorithms, 339344
R
Rank, in simultaneous linear
equations, 7884
for column vectors, 78
denition of, 78
for row vectors, 7882
theorems for, 79, 82
Ratios
coefcient, 137
in scaled pivoting strategies, 6768
Reduction of systems, 269275
Redundant constraints, 137138
Region of feasibility. See Feasibility,
region of
Regions
nite, 130
innite, 129
Round-off errors, 69
Row vectors, 33
as determinant property, 157158
rank for, 7882
Row-reduced form matrices, 2021
Ruth, Babe, 314
S
Scalar functions, 245
Scalar negatives, 117
Scaled pivoting strategies, for
simultaneous linear
equations, 6768
ratios in, 6768
Scatter diagram, 344f, 345f
Sets, 297300
intersections within, 301
orthogonal, 320
orthonormal vectors in, 320
union in, 297301
Shifted inverse power method,
209210
416
Index
T
Taylor series, 216, 236
Transitional matrices, 286289
Transpose matrices, 1920
commuted products of, 20
denition of, 20
inversions of, 113
Trivial solutions, 87
U
Unions, within sets, 301
Unit vectors, 34
as inner product, 316
Upper triangular matrices, 22
eigenvalues/eigenvectors and, 191
inversions of, 113
LU decomposition in, 118
V
Vectors, 3341. See also
Eigenvectors; Orthonormal
vectors
column, 33
linearly dependent, 349
rank for, 78
work, 143
components of, 33
denitions, 3334
dimension of, 33
eigenvectors, 177212
denitions of, 177178
eigenvalues and, 180190
linearly independent, 194201
nontrivial solutions with,
185186
power methods with, 201212
properties of, 190194
force and, 40
geometry of, 3741
angles in, 38f
equivalency in, 40
force effects on, 40
measurement parameters in,
3940, 40f
sum construction in, 3839, 39f
velocity effects on, 40
inner products and, 315344
linear dependence and, 72
linear independence and, 7176
magnitude of, 33
nonzero, 316
normalized, 316
normalized, 34
orthonormal, 320327
Gram-Schmidt process for,
322325, 331337
projections for, 327338
QR decompositions for,
334344
sets of, 320
theorems for, 321325
row, 33
as determinant property,
157158
rank for, 7882
unit, 34, 316
as inner product, 316
Velocity, vectors and, 40
Vertical bar, in Simplex method, 142
Vertices, 131
Visualizations, for graphing
inequalities, 128
W
Work column vectors, 143
Z
Zero matrices, 6
nonzero v., 20
Zero submatrices, 31