Selected Solutions To Linear Algebra Done Wrong


Selected Solutions to Linear Algebra Done Wrong
Jing Huang

August 2018

Introduction

Linear Algebra Done Wrong by Prof. Sergei Treil is a well-known linear algebra reference for college students. However, I found that no solution manual was available when I read the book and did its exercises, whereas a solution manual for the other famous book, Linear Algebra Done Right, is readily available owing to its greater popularity (without any doubt, Done Wrong is an excellent book). Reference solutions matter because insights tend to hide behind mistakes and ambiguity, which may be missed when there is no good way to check our answers. They also save time when a hint is missing or obscure.
I scanned all of the problems in the first seven chapters (2014 version), worked through most of them, and share the valuable ones here (those that are relatively hard or insightful from my perspective; I read this book for review and for a deeper mathematical understanding). The remaining problems should be tractable even for a novice in linear algebra. Chapter 7 contains only a few easy problems, so none is selected herein. The last chapter, Chapter 8, deals with dual spaces and tensors, which may be advanced for most readers and is also not covered. These solutions are meant as references to facilitate the reader's learning process; their correctness and optimality are not guaranteed. Lastly, do not copy the contents, as doing so may violate your college's academic rules.
I am now a PhD student with limited time to polish the contents, so typos and grammar mistakes are inevitable. Feedback is welcome at [email protected]. Cheers.

Chapter 1. Basic Notations

1.4. Prove that a zero vector of a vector space V is unique.


Proof Suppose there exist two different zero vectors 0_1 and 0_2. Then, for any v ∈ V,

v + 0_1 = v
v + 0_2 = v.

Taking the difference of the two equations gives

0_2 − 0_1 = v − v = 0 (where 0 may be read as either 0_1 or 0_2),

and adding 0_1 to both sides yields 0_2 = 0_1.
Thus, the zero vector is unique.

1.6. Prove that the additive inverse, defined in Axiom 4 of a vector space
is unique.
Proof Assume there exist two different additive inverses w1 and w2 of vec-
tor v ∈ V . Then
v + w1 = 0
v + w2 = 0.

Taking the difference of the two equations,

w1 − w2 = 0

then
w1 = w2 .
Therefore, the additive inverse is unique.

1.7. Prove that 0v = 0 for any vector v ∈ V .


Proof 0v = (0 + 0)v = 0v + 0v. Adding the additive inverse of 0v (Axiom 4) to both sides gives 0 = 0v, i.e., 0v = 0.

1.8. Prove that for any vector v its additive inverse −v is given by (−1)v.
Proof v + (−1)v = (1 − 1)v = 0v = 0 and we know from Problem 1.6 that
the additive inverse is unique. Hence, −v = (−1)v.

2.5. Let a system of vectors v1 , v2 , ..., vr be linearly independent but not


generating. Show that it is possible to find a vector vr+1 such that the
system v1, v2, ..., vr, vr+1 is linearly independent.
Proof Take a vector vr+1 that cannot be represented as ∑_{k=1}^{r} αk vk. Such a vr+1 exists because v1, v2, ..., vr are not generating. Now we need to show that v1, v2, ..., vr, vr+1 are linearly independent. Suppose, on the contrary, that they are linearly dependent, i.e.,

α1 v1 + α2 v2 + ... + αr vr + αr+1 vr+1 = 0

with ∑_{k=1}^{r+1} |αk| ≠ 0. If αr+1 = 0, then

α1 v1 + α2 v2 + ... + αr vr = 0

with ∑_{k=1}^{r} |αk| ≠ 0, which contradicts the linear independence of v1, v2, ..., vr. So αr+1 ≠ 0 and vr+1 can be represented as

vr+1 = −(1/αr+1) ∑_{k=1}^{r} αk vk.

This contradicts the choice of vr+1, which cannot be represented in terms of v1, v2, ..., vr. Thus, the system v1, v2, ..., vr, vr+1 is linearly independent.

3.2. Let a linear transformation in R2 be the reflection in the line x1 = x2 .


Find its matrix.
Solution 1. Reflection is a linear transformation. It is completely defined
on the standard basis. The reflection maps e1 = [1 0]ᵀ to r1 = [0 1]ᵀ and e2 = [0 1]ᵀ to r2 = [1 0]ᵀ. The matrix has the transformed standard basis vectors as its first and second columns, i.e.,

T = [r1 r2] = [ 0  1 ]
              [ 1  0 ].

Solution 2. (A more general method) Transformations such as reflections and rotations with respect to the coordinate axes are easy to express. The general idea is to convert a given transformation into a composition of such axis-aligned transformations.
Let α be the angle between the x-axis and the line. The reflection can be achieved through the following steps. First, rotate the plane about the origin by −α so that the line aligns with the x-axis. (Here x1 = x2 passes through the origin; if the line did not pass through the origin, a translation would be needed first, and homogeneous coordinates would be required, since a translation is not a linear transformation in standard coordinates.) Second, perform the reflection about the x-axis, whose matrix is easy to write down. Lastly, rotate the frame back to its original location (or, in general, undo the preliminary transformations). In this problem,

T = Rotz(α) · Ref · Rotz(−α).

That is

T = [ cos α  −sin α ] [ 1   0 ] [ cos(−α)  −sin(−α) ]
    [ sin α   cos α ] [ 0  −1 ] [ sin(−α)   cos(−α) ]

  = [ cos(π/4)  −sin(π/4) ] [ 1   0 ] [  cos(π/4)  sin(π/4) ]
    [ sin(π/4)   cos(π/4) ] [ 0  −1 ] [ −sin(π/4)  cos(π/4) ]

  = [ 0  1 ]
    [ 1  0 ].

3.7. Show that any linear transformation in C (treated as a complex vector


space) is a multiplication by α ∈ C.

Proof Suppose the linear transformation is T : C → C and T(1) = a + ib for two real numbers a and b. Then T(−1) = −T(1) = −a − ib. Note that i² = −1, so T(−1) = T(i · i) = iT(i). Thus

T(i) = (−a − ib)/i = i(a + ib).
For any ω = x + iy ∈ C, x, y ∈ R,

T (ω) = T (x + iy) = xT (1) + yT (i)


= x(a + ib) + yi(a + ib)
= (x + iy)(a + ib)
= ωT (1)
= ωα

where α = T (1).

5.4. Find the matrix of the orthogonal projection in R2 onto the line
x1 = −2x2 .
Solution Following similar steps presented in Problem 3.2, we have

T = R(α) P R(−α)

  = R(α) [ 1  0 ] R(−α).
         [ 0  0 ]

With α = tan⁻¹(−1/2), the resulting matrix is

T = [  4/5  −2/5 ]
    [ −2/5   1/5 ].

5.7. Find the matrix of the reflection through the line y = −2x/3. Perform
all the multiplications.
Solution As above,

T = R(α) Ref R(−α)

  = R(α) [ 1   0 ] R(−α).
         [ 0  −1 ]

With α = tan⁻¹(−2/3),

T = [   5/13  −12/13 ]
    [ −12/13   −5/13 ].
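Both matrices can be checked numerically in the same way as before (a sketch assuming NumPy):

    import numpy as np

    def rot(a):
        return np.array([[np.cos(a), -np.sin(a)],
                         [np.sin(a),  np.cos(a)]])

    # Problem 5.4: projection onto the line x1 = -2*x2 (slope -1/2)
    a_proj = np.arctan(-1/2)
    P = rot(a_proj) @ np.array([[1, 0], [0, 0]]) @ rot(-a_proj)
    print(np.round(P, 4))        # [[ 0.8 -0.4] [-0.4  0.2]], i.e. [[4/5,-2/5],[-2/5,1/5]]

    # Problem 5.7: reflection through the line y = -2x/3 (slope -2/3)
    a_ref = np.arctan(-2/3)
    F = rot(a_ref) @ np.array([[1, 0], [0, -1]]) @ rot(-a_ref)
    print(np.round(F * 13, 4))   # [[ 5. -12.] [-12. -5.]], i.e. (1/13)[[5,-12],[-12,-5]]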

6.1. Prove that if A : V → W is an isomorphism (i.e. an invertible linear


transformation) and v1 , v2 , ..., vn is a basis in V , then Av1 , Av2 , ..., Avn is

a basis in W.
Proof For any w ∈ W, let v = A⁻¹w ∈ V and write v in the basis as v = c1 v1 + c2 v2 + ... + cn vn. Then

w = Av = c1 Av1 + c2 Av2 + ... + cn Avn,

so Av1, Av2, ..., Avn span W. Next we show that Av1, Av2, ..., Avn are linearly independent. If they are not, then (without loss of generality) Av1 can be expressed as a linear combination of Av2, ..., Avn. Multiplying by A⁻¹ on the left shows that v1 can be expressed in terms of v2, ..., vn, which contradicts the fact that v1, v2, ..., vn is a basis in V. The proposition is proved.

7.4. Let X and Y be subspaces of a vector space V. Using the previ-


ous exercise, show that X ∪ Y is a subspace if and only if X ⊂ Y or Y ⊂ X.
Proof The sufficiency is obvious and easy to verify. For the necessity, suppose X ⊄ Y, Y ⊄ X, and X ∪ Y is a subspace of V. Then there are vectors x ∈ X, y ∈ Y with x ∉ Y and y ∉ X. According to Problem 7.3, x + y ∉ X and x + y ∉ Y. As a result, x + y ∉ X ∪ Y; i.e., x ∈ X ∪ Y and y ∈ X ∪ Y, but x + y ∉ X ∪ Y, which contradicts the assumption that X ∪ Y is a subspace. Thus, X ⊂ Y or Y ⊂ X.

8.5. A transformation T in R3 is a rotation about the line y = x + 3


in the x-y plane through an angle γ. Write a 4 × 4 matrix corresponding to
this transformation.
You can leave the result as a product of matrices.
Solution For a general spatial rotation about a given line through an angle γ, the 3 × 3 rotation matrix can be written as

R = Rx⁻¹ Ry⁻¹ Rz(γ) Ry Rx,

where the rotation by γ is chosen to be performed about the z-axis, and Rx, Ry are rotations that align the direction of the given line with the z-axis; they can be determined by simple trigonometry.
For this problem, the line y = x + 3 does not pass through the origin, so an extra step T0 is needed to translate the line so that it passes through the origin, and homogeneous coordinates are used:

R4×4 = T0⁻¹ Rx⁻¹ Ry⁻¹ Rz(γ) Ry Rx T0,
where the rotation matrices are also in their 4 × 4 homogeneous forms. T0 is not unique, since any translation that moves y = x + 3 onto the parallel line y = x through the origin works. For example, consider the following matrices:

T0 = [ 1  0  0  3 ]
     [ 0  1  0  0 ]   (Move y = x + 3 so that it passes through the origin as y = x.)
     [ 0  0  1  0 ]
     [ 0  0  0  1 ]

Rx = [ 1  0   0  0 ]
     [ 0  0  −1  0 ]   (Rotate y = x about the x-axis by π/2.)
     [ 0  1   0  0 ]
     [ 0  0   0  1 ]

Ry = [ √2/2  0  −√2/2  0 ]
     [  0    1    0    0 ]   (Rotate the line about the y-axis by −π/4.)
     [ √2/2  0   √2/2  0 ]
     [  0    0    0    1 ]

Rz(γ) = [ cos γ  −sin γ  0  0 ]
        [ sin γ   cos γ  0  0 ]   (Rotate about the z-axis by γ.)
        [   0       0    1  0 ]
        [   0       0    0  1 ]

Combining all the matrices above and their inverses yields

R4×4 = [ (1 + cos γ)/2    (1 − cos γ)/2    (√2/2) sin γ    (3 cos γ − 3)/2 ]
       [ (1 − cos γ)/2    (1 + cos γ)/2   −(√2/2) sin γ    (3 − 3 cos γ)/2 ]
       [ −(√2/2) sin γ    (√2/2) sin γ       cos γ         −(3√2/2) sin γ  ]
       [       0                0              0                  1       ].

Notice that the procedure above handles the general scenario, in which two rotations about two coordinate axes are performed in sequence to align the line with the third coordinate axis. In this particular problem it is easier to align the line directly with the x-axis or the y-axis, since the line already lies in the x-y plane. For instance, performing the rotation by γ about the x-axis, R4×4 is given by

R4×4 = T0⁻¹ Rz⁻¹ Rx(γ) Rz T0.

Specifically, T0 remains the same and

Rz = [  √2/2  √2/2  0  0 ]
     [ −√2/2  √2/2  0  0 ]   (Rotate about the z-axis by −π/4.)
     [   0     0    1  0 ]
     [   0     0    0  1 ]

Rx(γ) = [ 1    0       0     0 ]
        [ 0  cos γ  −sin γ   0 ]   (Rotate about the x-axis by γ.)
        [ 0  sin γ   cos γ   0 ]
        [ 0    0       0     1 ]
A direct computation verifies that this approach produces the same R4×4 as shown above.
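The two factorizations can also be checked numerically (a sketch assuming NumPy; the helper names h, rx, ry, rz are ours, not from the book): they agree with each other and fix every point of the line y = x + 3.

    import numpy as np

    def h(R, t=(0, 0, 0)):
        # build a 4x4 homogeneous matrix from a 3x3 rotation and a translation
        M = np.eye(4)
        M[:3, :3] = R
        M[:3, 3] = t
        return M

    def rx(a): return np.array([[1,0,0],[0,np.cos(a),-np.sin(a)],[0,np.sin(a),np.cos(a)]])
    def ry(a): return np.array([[np.cos(a),0,np.sin(a)],[0,1,0],[-np.sin(a),0,np.cos(a)]])
    def rz(a): return np.array([[np.cos(a),-np.sin(a),0],[np.sin(a),np.cos(a),0],[0,0,1]])

    g = 0.7                                    # an arbitrary test angle
    T0 = h(np.eye(3), (3, 0, 0))               # move y = x + 3 onto y = x
    M1 = np.linalg.inv(T0) @ h(rx(np.pi/2)).T @ h(ry(-np.pi/4)).T @ h(rz(g)) \
         @ h(ry(-np.pi/4)) @ h(rx(np.pi/2)) @ T0
    M2 = np.linalg.inv(T0) @ h(rz(-np.pi/4)).T @ h(rx(g)) @ h(rz(-np.pi/4)) @ T0
    print(np.allclose(M1, M2))                 # True: both routes agree
    p = np.array([1.0, 4.0, 0.0, 1.0])         # a point on the line y = x + 3
    print(np.allclose(M1 @ p, p))              # True: points on the axis are fixed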

Chapter 2. Systems of Linear Equations
3.8. Show that if the equation Ax = 0 has unique solution (i.e. if echelon
form of A has pivot in every column), then A is left invertible.
Proof If Ax = 0 has a unique solution, then that solution is the trivial one, and the echelon form of A has a pivot in every column. Let A be m × n; then m ≥ n, i.e., the number of rows is at least the number of columns. The reduced echelon form of A can be written as

Are = [ In×n     ]
      [ 0(m−n)×n ],

that is, the n × n identity on top of an (m − n) × n zero block. Suppose Are is obtained from A by a sequence of elementary row operations E1, E2, ..., Ek:

Are = Ek ... E2 E1 A,

where each Ei is m × m. A left inverse of A is given by the first n rows of the product of the Ei, i.e.,

Eleft = In×m Ek ... E2 E1,

where In×m = [ In×n  0n×(m−n) ] is the n × m matrix used to extract the In×n identity block from Are. Then Eleft A = In×m Are = In×n, thus A is left invertible.
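As a numerical illustration (assuming NumPy), one concrete left inverse of a matrix with a pivot in every column is (AᵀA)⁻¹Aᵀ; this is generally a different matrix from the Eleft constructed above, but it verifies left invertibility:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [0.0, 1.0],
                  [3.0, 1.0]])            # 3x2, independent columns -> pivot in every column
    L = np.linalg.inv(A.T @ A) @ A.T      # one concrete left inverse
    print(np.allclose(L @ A, np.eye(2)))  # True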

5.5. Let vectors u, v, w be a basis in V . Show that u + v + w, v + w, w


is also a basis in V .
Solution For any vector x ∈ V, write x = x1 u + x2 v + x3 w. It is easy to check that x = x1 (u + v + w) + (x2 − x1)(v + w) + (x3 − x2)w. Hence u + v + w, v + w, w span V; three spanning vectors in a three-dimensional space are linearly independent, and thus they form a basis in V.

7.4. Prove that if A : X → Y and V is a subspace of X then dim AV ≤


rank A. (AV here means the subspace V transformed by the transformation
A, i.e., any vector in AV can be represented as Av, v ∈ V ). Deduce from
here that rank(AB) ≤ rankA.
Proof Since V ⊂ X, we have AV ⊂ AX = Ran A, so dim AV ≤ dim Ran A = rank A.
For the second claim, let V = Ran B. Then Ran(AB) = A(Ran B) = AV, so rank(AB) = dim AV ≤ rank A.

7.5. Prove that if A : X → Y and V is a subspace of X then dim AV ≤


dim V . Deduce from here that rank(AB) ≤ rank B.

Proof Suppose that dim V = k and let v1, v2, ..., vk be a basis of V. Then AV is spanned by Av1, Av2, ..., Avk, so dim AV = rank [Av1, Av2, ..., Avk] ≤ k = dim V.
Similarly, assume rank B = k and let b1, b2, ..., bk be linearly independent columns of B spanning Ran B. Then rank(AB) = dim A(Ran B) = rank [Ab1, Ab2, ..., Abk] ≤ k = rank B.

7.6. Prove that if the product AB of two n × n matrices is invertible,


then both A and B are invertible. Do not use determinant for this problem.
Proof AB is invertible, so rank(AB) = n. From Problems 7.4 and 7.5, n = rank(AB) ≤ rank(A) ≤ n and n = rank(AB) ≤ rank(B) ≤ n. Thus rank(A) = rank(B) = n: both A and B have full rank and are invertible.

7.7. Prove that if Ax = 0 has unique solution, then the equation AT x = b


has a solution for every right side b. (Hint: count pivots)
Proof Suppose A is an m × n matrix. For Ax = 0 there is always the trivial solution x = 0 ∈ Rⁿ. Since the solution is unique, the echelon form of A has a pivot in every column, i.e., rank A = n. Accordingly, the echelon form of Aᵀ (an n × m matrix) has a pivot in every row (the column reduction of Aᵀ corresponds to the row reduction of A). Hence Aᵀx = b is consistent for every b.

7.14. Is it possible for a real matrix A that Ran A = Ker AT ? Is it


possible for a complex A?
Solution It is not possible for a real matrix, but it is possible for a complex one.
Real case. Suppose A is a real m × n matrix with Ran A = Ker Aᵀ. Then Ran A ⊂ Ker Aᵀ, i.e., AᵀAv = 0 for every v ∈ Rⁿ, so AᵀA = 0n×n. For a real matrix this forces A = 0m×n: writing A = [a1, a2, ..., an] with ai ∈ Rᵐ, the diagonal entries of AᵀA are (AᵀA)i,i = aiᵀai = ‖ai‖² = 0, so every column ai is zero. But then Ran A = {0} while Ker Aᵀ = Rᵐ, so the two subspaces cannot be equal. Hence Ran A = Ker Aᵀ is impossible for a real matrix.
Complex case. The argument above breaks down, because over C the quantity aᵀa is no longer ‖a‖², and in fact the equality can hold. For example, take

A = [ 1   i ]
    [ i  −1 ].

Then Aᵀ = A and A² = 0, and a direct check shows Ran A = span{(1, i)ᵀ} = Ker A = Ker Aᵀ. So it is possible for a complex matrix.

8.5. Prove that if A and B are similar matrices then trace A = trace
B. (Hint: recall how trace(XY ) and trace(Y X) are related.)
Proof trace(A) = trace(Q−1 BQ) = trace(Q−1 QB) = trace(B). (Note that
trace(AB) = trace(BA) as long as AB, BA can be performed.)

Chapter 3. Determinants
3.4. A square matrix (n × n) is called skew-symmetric (or antisymmetric) if
AT = −A. Prove that if A is skew-symmetric and n is odd, then det A = 0.
Is this true for even n?
Proof det A = det Aᵀ = det(−A) = (−1)ⁿ det A, using the properties of the determinant and of skew-symmetric matrices. If n is odd, (−1)ⁿ = −1, so det A = −det A and thus det A = 0.
If n is even, we only get det A = det A, so the conclusion does not follow; indeed, A = [ 0 1; −1 0 ] is skew-symmetric with det A = 1, so the result is generally not true for even n.

3.5. A square matrix is called nilpotent if Ak = 0 for some positive in-


teger k. Show that for a nilpotent matrix A, det A = 0.
Proof det(A^k) = (det A)^k = det 0 = 0, thus det A = 0.

3.6. Prove that if A and B are similar, then det A = det B.


Proof A and B are similar, so A = Q⁻¹BQ for some invertible matrix Q. Then
det A = det(Q⁻¹BQ)
      = (det Q⁻¹)(det B)(det Q)
      = (det Q⁻¹)(det Q)(det B)
      = det(Q⁻¹Q) det B
      = (det I)(det B)
      = det B.

3.7. A real square matrix Q is called orthogonal if QT Q = I. Prove that if


Q is an orthogonal matrix then det Q = ±1.
Proof det(QᵀQ) = (det Qᵀ)(det Q) = (det Q)² = det I = 1, so det Q = ±1.

3.9. Let points A, B and C in the plane R2 have coordinates (x1 , y1 ), (x2 , y2 )
and (x3 , y3 ) respectively. Show that the area of triangle ABC is the absolute
value of
(1/2) · det [ 1  x1  y1 ]
            [ 1  x2  y2 ]
            [ 1  x3  y3 ].
Hint: use row operation and geometric interpretation of 2 × 2 determinants
(area).
Proof The area of triangle ABC is half of the parallelogram defined by
neighbouring sides AB, AC, which also can be computed by

S△ABC = (1/2) · | det [ x2 − x1   y2 − y1 ] |
                      [ x3 − x1   y3 − y1 ]
       = (1/2) |(x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1)|.

At the same time, row reduction of the 3 × 3 determinant gives

det [ 1  x1  y1 ]       [ 1    x1        y1    ]
    [ 1  x2  y2 ] = det [ 0  x2 − x1   y2 − y1 ]
    [ 1  x3  y3 ]       [ 0  x3 − x1   y3 − y1 ]

                        [ 1    x1        y1                                     ]
                  = det [ 0  x2 − x1   y2 − y1                                  ]
                        [ 0    0       (y3 − y1) − (y2 − y1)(x3 − x1)/(x2 − x1) ]

                  = (x2 − x1)(y3 − y1) − (x3 − x1)(y2 − y1).

Here we assumed x2 − x1 ≠ 0; one can verify that the result still holds when x2 − x1 = 0. Multiplying by 1/2 and taking the absolute value, the two expressions coincide, so the conclusion holds.
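A quick numerical check of the area formula (assuming NumPy), comparing the 3 × 3 determinant with the 2 × 2 determinant of the edge vectors:

    import numpy as np

    A, B, C = (1.0, 2.0), (4.0, 6.0), (-2.0, 5.0)   # arbitrary test points
    M = np.array([[1, A[0], A[1]],
                  [1, B[0], B[1]],
                  [1, C[0], C[1]]])
    area_det = abs(np.linalg.det(M)) / 2
    area_edges = abs((B[0]-A[0])*(C[1]-A[1]) - (C[0]-A[0])*(B[1]-A[1])) / 2
    print(area_det, area_edges)   # both 10.5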

3.10. Let A be a square matrix. Show that block triangular matrices


       
[ I  ∗ ]   [ A  ∗ ]   [ I  0 ]   [ A  0 ]
[ 0  A ]   [ 0  I ]   [ ∗  A ]   [ ∗  I ]

all have determinant equal to det A. Here ∗ can be anything.


Proof Use row operations of the third kind (adding a multiple of one row to another), acting only on the rows through A, to bring A to triangular form (upper or lower, matching the position of the zero block); such operations change neither det A nor the determinant of the block matrix. The whole matrix then becomes triangular, with the block I untouched on the diagonal, so its determinant is the product of the diagonal entries, which equals det A.
(Problems 3.11 and 3.12 are just applications of the conclusion of Problem 3.10; the hint essentially gives the answer.)

4.2. Let P be a permutation matrix, i.e., an n × n matrix consisting of


zeros and ones and such that there is exactly one 1 in every row and every
column.

a) Can you describe the corresponding linear transformation? That will


explain the name.

b) Show that P is invertible. Can you describe P −1 ?

c) Show that for some N > 0

P^N := P · P · ... · P  (N factors) = I.

Use the fact that there are only finitely many permutations.
Solution a) Consider the linear transformation y = P x and rows of P .
There is only one 1 in each row of P . Suppose in the first row of P , P1,j = 1,
then y1 = p1 x = xj , where p1 is the first row of P . Namely, xj is moved
to the 1st place after the linear transformation. Similarly, for the second
row of P , suppose P2,k = 1, then y2 = xk and xk is moved to the second
place, so on and so forth. There is also only one 1 in each column, then

the column indices of the 1 entries P1,j, P2,k, P3,m, ... form a permutation (j, k, m, ...) of {1, 2, ..., n}. After multiplying by the permutation matrix P, the
elements in x change their orders to [xj , xk , xm , ...]T . (Considering from the
perspective of column vectors of P also works.)
b) The transformation y = Px simply permutes the coordinates of x, so it is undone by the inverse permutation. Concretely, since y1 = xj (where P1,j = 1), the inverse must send the first coordinate back to the j-th place, i.e., (P⁻¹)j,1 = 1 = P1,j. Similarly, y2 = xk gives (P⁻¹)k,2 = 1 = P2,k, and so on. Hence (P⁻¹)i,j = Pj,i, so P is invertible and P⁻¹ = Pᵀ.
c) Every power P, P², P³, ... is again a permutation matrix, and there are only finitely many permutation matrices (n! of them). Hence P^a = P^b for some 0 < a < b, and multiplying by (P^a)⁻¹ gives P^{b−a} = I. Thus P^N = I for some N > 0; in fact one can take N ≤ n!.
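A small numerical illustration of parts b) and c) (assuming NumPy), using one particular permutation matrix:

    import numpy as np

    P = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]])                   # cyclic permutation of 3 elements
    x = np.array([10, 20, 30])
    print(P @ x)                                # [20 30 10]: the coordinates are permuted
    print(np.allclose(P.T @ P, np.eye(3)))      # True: P^{-1} = P^T
    print(np.allclose(np.linalg.matrix_power(P, 3), np.eye(3)))  # True: P^N = I with N = 3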

Exercises in Part 5 and Part 7 of this chapter are not difficult. Some ideas and answers are given for reference:

• Problem 5.3: use cofactor expansion along the last column; the remaining minor of (A + tI) is a triangular matrix. The final expression is det(A + tI) = a0 + a1 t + a2 t² + ... + a_{n−1} t^{n−1}, and the power of −1 appearing in each term is even.

• Problem 5.7: n! multiplications are needed; this can be proved by induction.

• Problems 7.4 and 7.5: consider det(RA) = (det R)(det A) = det A, where R is a rotation matrix with determinant 1. For the proof of the parallelogram area formula we can also parametrize by angles, i.e., v1 = [x1, y1]ᵀ = [|v1| cos α, |v1| sin α]ᵀ and v2 = [x2, y2]ᵀ = [|v2| cos β, |v2| sin β]ᵀ, where |v1|, |v2| are the lengths of v1, v2 and α, β are the angles between each vector and the positive x-axis. Then

det A = det [ x1  x2 ] = x1 y2 − x2 y1
            [ y1  y2 ]
      = |v1||v2| (cos α sin β − cos β sin α)
      = |v1||v2| sin(β − α),

where β − α is the angle from v1 to v2.

Chapter 4. Introduction to Spectral Theory (Eigenvalues and Eigenvectors)
1.1. (Part) True or false:

b) If a matrix has one eigenvector, it has infinitely many eigenvectors;
True: if Ax = λx, then A(αx) = λ(αx) for any nonzero scalar α, so αx is also an eigenvector of A.

c) There exists a square matrix with no real eigenvalues;


True: for example, the 2D rotation matrix Rα with α ≠ nπ has no real eigenvalues.

d) There exists a square matrix with no (complex) eigenvectors;


False: over the complex numbers there is always an eigenvalue λ, and consequently A − λI has a nontrivial null space, which consists of eigenvectors.

f) Similar matrices always have the same eigenvectors;


False: if A and B are similar with A = SBS⁻¹ and Ax = λx, then SBS⁻¹x = λx, i.e., B(S⁻¹x) = λ(S⁻¹x). So S⁻¹x, not necessarily x itself, is an eigenvector of B.

g) The sum of two eigenvectors of a matrix A is always an eigenvector;


False: for example, if A = diag(1, 2), then e1 and e2 are eigenvectors, but e1 + e2 is not.

1.6. An operator A is called nilpotent if A^k = 0 for some k. Prove that if A is nilpotent, then σ(A) = {0} (i.e. that 0 is the only eigenvalue of A).
Proof Suppose λ is an eigenvalue of A with eigenvector x ≠ 0, i.e., Ax = λx. Then A²x = A(λx) = λ²x, A³x = λ³x, ..., A^k x = λ^k x. Since A^k = 0, we get λ^k x = 0 with x ≠ 0, so λ^k = 0 and hence λ = 0. Moreover, 0 is indeed an eigenvalue, since a nilpotent operator is not invertible. Thus σ(A) = {0}.

1.8. Let v1 , v2 , ..., vn be a basis in a vector space V . Assume also that


the first k vectors v1 , v2 , ..., vk of the basis are eigenvectors of an operator
A, corresponding to an eigenvalue λ (i.e. that Avj = λvj , j = 1, 2, ..., k).
Show that in this basis the matrix of the operator A has block triangular
form  
[ λIk  ∗ ]
[  0   B ]

Proof Let S denote the standard basis and write A_VV = [I]_VS A_SS [I]_SV, where [I]_SV = [v1 v2 ... vn] is the change of coordinates matrix whose columns are the vectors vj. Then

[I]_SV A_VV = A_SS [I]_SV = [A_SS v1, A_SS v2, ..., A_SS vn] = [λv1, ..., λvk, A_SS v_{k+1}, ..., A_SS vn].

Denote the i-th column of A_VV by a_i. For the first column, [I]_SV a_1 = λv1; since v1, ..., vn form a basis, this forces a_1 = [λ, 0, ..., 0]ᵀ. Similarly, the first k columns of A_VV are λ times the first k standard basis vectors. Hence A_VV, the matrix of A in this basis, has the block triangular form above.

1.9. Use the two previous exercises to prove that geometric multiplicity
of an eigenvalue cannot exceed its algebraic multiplicity.
Proof Let λ0 be an eigenvalue of A with geometric multiplicity k = dim Ker(A − λ0 I), choose linearly independent eigenvectors v1, ..., vk spanning Ker(A − λ0 I), and extend them to a basis v1, ..., vn. By Problem 1.8, in this basis A has the block triangular form with λ0 Ik in the upper-left corner.
For the algebraic multiplicity, consider the determinant

det(A − λI) = det [ (λ0 − λ)Ik        ∗        ]
                  [      0        B − λI_{n−k} ]
            = (λ0 − λ)^k det(B − λI_{n−k}).

So the algebraic multiplicity of λ0 is at least k. (It may exceed k if λ0 happens also to be a root of det(B − λI_{n−k}).) Thus the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity.

1.10. Prove that determinant of a matrix A is the product of its eigen-


values (counting multiplicity).
Proof (Just use the hint.) The characteristic polynomial of the n × n matrix A is det(A − λI), and we consider its roots over the complex numbers. According to the fundamental theorem of algebra, det(A − λI) has n roots counting multiplicity and can be factorized as det(A − λI) = (λ1 − λ)(λ2 − λ)...(λn − λ). (Recall the formal definition of the determinant: the highest-order term in λ, namely (−1)ⁿλⁿ, is generated by the diagonal product Π_{i=1}^n (aii − λ), so the sign of the factorization is correct.) Setting λ = 0 gives det A = λ1 λ2 ... λn.

1.11. Prove that the trace of a matrix equals the sum of its eigenvalues, in three steps. First, compute the coefficient of λ^{n−1} on the right side of the equality det(A − λI) = (λ1 − λ)(λ2 − λ)...(λn − λ). Then show that det(A − λI) can be represented as det(A − λI) = (a1,1 − λ)(a2,2 − λ)...(an,n − λ) + q(λ), where q(λ) is a polynomial of degree at most n − 2. Finally, compare the coefficients of λ^{n−1} to get the conclusion.
Proof First, the coefficient of λ^{n−1} on the right side of det(A − λI) = (λ1 − λ)(λ2 − λ)...(λn − λ) is C(λ^{n−1}) = (−1)^{n−1}(λ1 + λ2 + ... + λn): to obtain a λ^{n−1} term we must pick −λ from n − 1 of the n factors (λi − λ) and pick λj from the single remaining factor, giving the term (−1)^{n−1} λj λ^{n−1}; summing over the n possible choices of j gives C(λ^{n−1}) λ^{n−1}.
Next, we show that det(A − λI) can be represented as

det(A − λI) = (a1,1 − λ)(a2,2 − λ)...(an,n − λ) + q(λ),

where q(λ) is a polynomial of degree at most n − 2. That is, all λ^{n−1} terms in det(A − λI) come from the diagonal product (a1,1 − λ)(a2,2 − λ)...(an,n − λ). This holds because λ appears only on the diagonal of A − λI: in the formal definition of the determinant, any product that picks n − 1 diagonal entries must pick the remaining entry from the diagonal as well, so no other product contributes a λ^{n−1} term, and q(λ) has degree at most n − 2. Therefore the coefficient of λ^{n−1} also equals (−1)^{n−1}(a1,1 + a2,2 + ... + an,n).
The coefficients obtained in the two ways must be identical, so ∑_{i=1}^n ai,i = ∑_{i=1}^n λi, namely, the trace of a matrix equals the sum of its eigenvalues.
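A numerical spot check of Problems 1.10 and 1.11 (assuming NumPy) on a random matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    eigs = np.linalg.eigvals(A)
    print(np.isclose(np.prod(eigs), np.linalg.det(A)))   # True: det A = product of eigenvalues
    print(np.isclose(np.sum(eigs), np.trace(A)))         # True: trace A = sum of eigenvalues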

2.1. Let A be n × n matrix. True or false:

a) AT has the same eigenvalues as A.


True, det(A − λI) = det(A − λI)T = det(AT − λI)

b) AT has the same eigenvectors as A.

False: for example, A = [ 1 1; 0 1 ] has only multiples of e1 as eigenvectors, while Aᵀ = [ 1 0; 1 1 ] has only multiples of e2.

c) If A is diagonalizable, then so is AT .
True, A = SDS −1 , AT = (SDS −1 )T = (S −1 )T DT S T = (S T )−1 DS T .

2.2. Let A be a square matrix with real entries, and let λ be its complex
eigenvalue. Suppose v = (v1 , v2 , ..., vn )T is a corresponding eigenvector,
Av = λv. Prove that the λ̄ is an eigenvalue of A and Av̄ = λ̄v̄. Here v̄ is
the complex conjugate of the vector v, v̄ := (v¯1 , v¯2 , ..., v¯n )T .
Proof Since A is a real matrix, Ā = A. Taking the complex conjugate of both sides of Av = λv gives Ā v̄ = λ̄ v̄, i.e., A v̄ = λ̄ v̄. Thus λ̄ is an eigenvalue of A with eigenvector v̄.

Chapter 5. Inner Product Spaces


1.4. Prove that for vectors in an inner product space

‖x ± y‖² = ‖x‖² + ‖y‖² ± 2 Re(x, y).

Recall that Re(z) = (1/2)(z + z̄).


Proof
‖x − y‖² = (x − y, x − y)
         = (x, x − y) − (y, x − y)
         = (x, x) − (x, y) − (y, x) + (y, y)
         = ‖x‖² + ‖y‖² − 2 Re(x, y),

since (x, y) + (y, x) = 2 Re(x, y). Similarly, ‖x + y‖² = ‖x‖² + ‖y‖² + 2 Re(x, y).

1.5. Hint: a) Check conjugate symmetry. b) Check linearity. c) Check conjugate symmetry.

1.7. Prove the parallelogram identity for an inner product space V ,

‖x + y‖² + ‖x − y‖² = 2(‖x‖² + ‖y‖²).

Proof
‖x + y‖² + ‖x − y‖² = (x + y, x + y) + (x − y, x − y)
                    = (x, x) + (x, y) + (y, x) + (y, y) + (x, x) − (x, y) − (y, x) + (y, y)
                    = 2(x, x) + 2(y, y)
                    = 2(‖x‖² + ‖y‖²).

1.8. Proof sketch: a) Take v = x; then (x, x) = 0, so x = 0.
b) If (x, vk) = 0 for all k, then (x, v) = 0 for every v; from the conclusion in a), x = 0.
c) If (x − y, vk) = 0 for all k, then from b), x − y = 0, so x = y.

2.3. Let v1 , v2 , ..., vn be an orthonormal basis in V .

a) Prove that for any x = ∑_{k=1}^n αk vk, y = ∑_{k=1}^n βk vk,

(x, y) = ∑_{k=1}^n αk β̄k.

b) Deduce from this Parseval's identity

(x, y) = ∑_{k=1}^n (x, vk)(y, vk)‾,

where the bar denotes complex conjugation.

c) Assume now that v1 , v2 , ..., vn is only an orthogonal basis, not an


orthonormal one. Can you write down Parseval’s identity in this case?

Proof
a)

(x, y) = ( ∑_{k=1}^n αk vk , ∑_{k=1}^n βk vk ) = ∑_{i=1}^n ∑_{j=1}^n αi β̄j (vi, vj) = ∑_{k=1}^n αk β̄k,

because v1, v2, ..., vn is an orthonormal basis, so (vi, vj) = 0 for i ≠ j and (vi, vi) = 1.

b) Use (x, vk) = αk, (y, vk) = βk and the conclusion in a).
c) Using the same expansion as in a),

(x, y) = ( ∑_{k=1}^n αk vk , ∑_{k=1}^n βk vk ) = ∑_{i=1}^n ∑_{j=1}^n αi β̄j (vi, vj) = ∑_{k=1}^n αk β̄k (vk, vk)
       = ∑_{k=1}^n αk β̄k ‖vk‖² = ∑_{k=1}^n (x, vk)(y, vk)‾ / ‖vk‖².

Since the basis is only orthogonal, not orthonormal, here (x, vk) = (αk vk, vk) = αk ‖vk‖² and, similarly, (y, vk) = βk ‖vk‖².

3.3 Complete an orthogonal system obtained in the previous problem to


an orthogonal basis in R3 , i.e., add to the system some vectors (how many?)
to get an orthogonal basis.
Can you describe how to complete an orthogonal system to an orthogonal
basis in general situation of Rn or Cn ?
Solution For 3D space, we already have 2 orthogonal vectors v1 , v2 as
the basis components. Then we just need another basis vector v3 . The
computation of v3 exploits the orthogonality, i.e., (v1 , v3 ) = 0, (v2 , v3 ) = 0.
Expressed in matrix form, let A = [v1 , v2 ]. Then solve AT v3 = 0. (Since it
is in 3D space, using cross product is also simple.)
Generally, to complete an orthogonal system v1, v2, ..., vr, consider A = [v1, v2, ..., vr]; using orthogonality, the remaining basis vectors are obtained by solving Aᵀv = 0, i.e., they form a basis of Ker Aᵀ (the null space of Aᵀ), orthogonalized within Ker Aᵀ if necessary.

3.9 (Using eigenvalues to compute determinants).


a) Find the matrix of the orthogonal projection onto the one-dimensional
subspace in Rn spanned by the vector (1, 1, ..., 1)T ;
b) Let A be the n × n matrix with all entries equal 1. Compute its
eigenvalues and their multiplicities (use the previous problem);
c) Compute eigenvalues (and multiplicities) of the matrix A − I, i.e., of
the matrix with zeros on the main diagonal and ones everywhere else;
d) Compute det(A − I).
Solution a) From Remark 3.5 we know that PE = ∑_k (1/‖vk‖²) vk vk*, summing over an orthogonal basis of E. For this one-dimensional subspace spanned by v = (1, 1, ..., 1)ᵀ, ‖v‖² = n and

PE = (1/n) [ 1  1  ...  1 ]
           [ 1  1  ...  1 ]
           [ ⋮  ⋮   ⋱   ⋮ ]
           [ 1  1  ...  1 ]  (n × n).

b) Note that A = nPE. Suppose Ax = λx; then nPE x = λx, i.e., n times the orthogonal projection of the eigenvector onto the one-dimensional subspace equals λ times the eigenvector itself. In particular, the projection of an eigenvector is parallel to the eigenvector, so there are two possibilities:

• The eigenvector is parallel to the basis vector v of the one-dimensional subspace, i.e., x = αv with α ≠ 0. In this case PE x = x, so λ = n; the corresponding eigenspace is one-dimensional, so the geometric multiplicity is 1.

• The eigenvector is orthogonal to v, i.e., x ⊥ v and PE x = 0, so λ = 0. We can find n − 1 linearly independent such eigenvectors, so the geometric multiplicity of the eigenvalue λ = 0 is n − 1.

c) det(A − I − λI) = det(A − (λ + 1)I), i.e., λ is an eigenvalue of A − I exactly when λ + 1 is an eigenvalue of A. Thus the eigenvalues of A − I are the eigenvalues of A minus 1: n − 1 with multiplicity 1 and −1 with multiplicity n − 1.
d) det(A − I) = (n − 1)(−1)^{n−1}, which equals n − 1 if n is odd and 1 − n if n is even.
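A numerical check of parts b)–d) (assuming NumPy), here with n = 6:

    import numpy as np

    n = 6
    A = np.ones((n, n))
    print(np.round(np.linalg.eigvalsh(A), 8))               # one eigenvalue n, the rest 0
    print(np.round(np.linalg.eigvalsh(A - np.eye(n)), 8))   # one eigenvalue n-1, the rest -1
    print(np.isclose(np.linalg.det(A - np.eye(n)), (n - 1) * (-1) ** (n - 1)))  # True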

3.10. (Legendre's polynomials) Hint: using the Gram-Schmidt orthogonalization algorithm is sufficient. But remember to use the inner product defined in the problem, e.g., ‖1‖² = (1, 1) = ∫_{−1}^{1} 1 · 1̄ dt = 2.

3.11. Let P = PE be the matrix of an orthogonal projection onto a subspace


E. Show that

a) The matrix P is self-adjoint, meaning that P ∗ = P .

b) P 2 = P .

Remark The above 2 properties completely characterize orthogonal projec-


tion.
Proof a) From the orthogonality, we have (x, x−P x) = (x−P x, x) = 0, ∀x.
(x, x − P x) = (x − P x)∗ x = (x∗ − x∗ P ∗ )x = x∗ x − x∗ P ∗ x = 0. On the
other hand, (x − P x, x) = x∗ (x − P x) = x∗ x − x∗ P x = 0. Subtract two
equalities, x∗ (P − P ∗ )x = 0, ∀x. Then P − P ∗ = 0n×n , P = P ∗ .
b) Consider (P x, x − P x) = (x − P x)∗ P x = (x∗ − x∗ P ∗ )P x = x∗ (P −
P ∗ P )x = 0. Thus P = P ∗ P = P 2 since P = P ∗ .

3.13 Suppose P is the orthogonal projection onto an subspace E, and Q is


the orthogonal projection onto the orthogonal complement E ⊥ .

a) What are P + Q and PQ?

b) Show that P − Q is its own inverse.

Proof a) P + Q = I, since (P + Q)x = Px + Qx = PE x + PE⊥ x = x for every x.
PQ = 0n×n, since x*PQx = x*P*Qx = (Qx, Px) = 0 for all x (using that P is self-adjoint, as shown in Problem 3.11, and that Qx ∈ E⊥, Px ∈ E); over C this forces PQ = 0, and similarly QP = 0.
b) (P − Q)² = (P − Q)(P − Q) = P² − PQ − QP + Q² = P² + Q² = P² + Q² + PQ + QP = (P + Q)² = I² = I (using PQ = QP = 0), i.e., (P − Q)⁻¹ = P − Q.

4.5. Minimal norm solution. Let the equation Ax = b have a solution, and let A have a non-trivial kernel (so the solution is not unique). Prove that

a) There exists a unique solution x0 of Ax = b minimizing the norm ‖x‖, i.e., there exists a unique x0 such that Ax0 = b and ‖x0‖ ≤ ‖x‖ for any x satisfying Ax = b.

b) x0 = P_(Ker A)⊥ x for any x satisfying Ax = b.

Proof a) Suppose x0, x1 are solutions of Ax = b. Then A(x1 − x0) = 0, i.e., x1 − x0 ∈ Ker A. As a result, P_(Ker A)⊥ (x1 − x0) = 0 = P_(Ker A)⊥ x1 − P_(Ker A)⊥ x0, so P_(Ker A)⊥ x has the same value for every solution x; call it h.
Note that ‖x‖² = ‖P_(Ker A)⊥ x‖² + ‖x − P_(Ker A)⊥ x‖² ≥ ‖h‖² for any x satisfying Ax = b. Moreover, h itself is a solution: for any solution x, Ah = A(x − P_Ker A x) = Ax − 0 = b. Hence x0 = h is a solution of minimal norm, and it is the unique one: if a solution x has ‖x‖ = ‖h‖, then ‖x − P_(Ker A)⊥ x‖ = 0, i.e., x = P_(Ker A)⊥ x = h.
b) This is exactly what was shown above: x0 = h = P_(Ker A)⊥ x for any x satisfying Ax = b.
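Numerically (assuming NumPy), the minimal norm solution can be obtained from the pseudoinverse, and projecting any other solution onto (Ker A)⊥ recovers it, in line with part b):

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, 1.0]])       # 2x3, nontrivial kernel
    b = np.array([3.0, 2.0])

    x0 = np.linalg.pinv(A) @ b            # minimal norm solution
    x1 = x0 + np.array([2.0, -1.0, 1.0])  # another solution: (2,-1,1) spans Ker A
    print(np.allclose(A @ x1, b))         # True: x1 also solves Ax = b
    print(np.linalg.norm(x0) <= np.linalg.norm(x1))  # True: x0 has the smaller norm

    # project x1 onto (Ker A)^perp = Ran A^T and recover x0
    Q, _ = np.linalg.qr(A.T)              # orthonormal basis of Ran A^T
    print(np.allclose(Q @ (Q.T @ x1), x0))  # True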

5.1. Show that for a square matrix A the equality det(A*) = det(A)‾ (the complex conjugate of det A) holds.
Proof det(A*) = det((Ā)ᵀ) = det(Ā) = det(A)‾, since transposing does not change the determinant and conjugating every entry of a matrix conjugates its determinant.

5.3. Let A be an m × n matrix. Show that Ker A = Ker (A∗ A).


Proof It is easy to see that Ker A ⊂ Ker(A*A). Next we show Ker(A*A) ⊂ Ker A. Consider ‖Ax‖² = (Ax, Ax) = x*A*Ax. If A*Ax = 0, then x*A*Ax = ‖Ax‖² = 0, i.e., Ax = 0. Thus Ker(A*A) ⊂ Ker A, and we conclude Ker A = Ker(A*A).

6.4. Show that a product of unitary (orthogonal) matrices is unitary (or-


thogonal) as well.
Proof Suppose U1 , U2 are unitary (orthogonal), then

(U1 U2 )∗ U1 U2 = U2∗ U1∗ U1 U2 = U2∗ IU2 = I.

From Lemma 6.2, we know the product is unitary (orthogonal).

Chapter 6. Structure of Operators in Inner Product
Spaces
1.1. Use the upper triangular representations of an operator to give an
alternative proof of the fact that the determinant is the product and the
trace is the sum of eigenvalues counting multiplicities.
Proof (The proof uses the fact that the diagonal entries of T are the eigenvalues of A, counting multiplicity, which does not seem to be stated explicitly in the book but is standard.) Write A = UTU* with U unitary and T upper triangular. Then det A = (det U)(det T)(det U*) = det T = Π_{i=1}^n λi, because (det U)(det U*) = det(UU*) = 1 and T is upper triangular with the eigenvalues of A on its diagonal.
To compute the trace, write U = [u1, u2, ..., un]. Then

A = U T U* = [u1 u2 ... un] [ λ1  t12  ...  t1n ] [ u1* ]
                            [  0   λ2  ...  t2n ] [ u2* ]
                            [           ⋱       ] [  ⋮  ]
                            [  0    0  ...   λn ] [ un* ]

  = [ λ1 u1,  t12 u1 + λ2 u2,  ...,  t1n u1 + t2n u2 + ... + λn un ] [ u1* ]
                                                                     [  ⋮  ]
                                                                     [ un* ]

  = λ1 u1 u1* + λ2 u2 u2* + ... + λn un un* + ∑_{i<j} tij ui uj*.

Now use the orthonormality of the columns u1, u2, ..., un: the trace of the rank-one matrix ui uj* is uj* ui, which equals 1 when i = j and 0 when i ≠ j. Hence the cross terms contribute nothing to the trace, and

trace A = λ1 trace(u1 u1*) + λ2 trace(u2 u2*) + ... + λn trace(un un*) = λ1 + λ2 + ... + λn = ∑_{i=1}^n λi.

2.2. True or false: The sum of normal operators is normal? Justify your
conclusion.
Solution False: normality is not preserved under addition in general. For a counterexample, take

N1 = [ 1  0 ]   and   N2 = [  0  1 ]
     [ 0  0 ]              [ −1  0 ],

which are both normal (N1 is self-adjoint and N2 is skew-symmetric). Their sum S = N1 + N2 = [ 1 1; −1 0 ] satisfies

S S* = [  2  −1 ]   while   S* S = [ 2  1 ]
       [ −1   1 ]                  [ 1  1 ],

so S S* ≠ S* S and S is not normal. (An argument via the spectral decompositions N1 = U1 D1 U1*, N2 = U2 D2 U2* breaks down because U1 and U2 need not be the same unitary matrix, and for a complex diagonal D one has D* = D̄ ≠ D in general. The sum is normal in special cases, e.g., when N1 and N2 commute, since commuting normal operators can be diagonalized by a common unitary matrix.)
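The counterexample is easy to verify numerically (assuming NumPy):

    import numpy as np

    def is_normal(M):
        return np.allclose(M @ M.conj().T, M.conj().T @ M)

    N1 = np.array([[1.0, 0.0], [0.0, 0.0]])   # self-adjoint, hence normal
    N2 = np.array([[0.0, 1.0], [-1.0, 0.0]])  # skew-symmetric, hence normal
    print(is_normal(N1), is_normal(N2), is_normal(N1 + N2))  # True True False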

2.9. Give a proof if the statement is true, or give a counterexample if


it is false:
a) If A = A* then A + iI is invertible.
True. The eigenvalues of A + iI are λj + i, where the λj are the (real) eigenvalues of A. Then det(A + iI) = Π_{j=1}^n (λj + i) ≠ 0, since each factor has nonzero imaginary part. (If c1 c2 = 0 for c1, c2 ∈ C, then at least one of c1, c2 is 0.)
b) If U is unitary, U + (3/4)I is invertible.
True. If (U + (3/4)I)x = Ux + (3/4)x = 0, note that ‖Ux‖ = ‖x‖. Then ‖Ux + (3/4)x‖ ≥ ‖Ux‖ − ‖(3/4)x‖ = (1/4)‖x‖, so x = 0. The homogeneous equation only has the trivial solution, hence U + (3/4)I is invertible.
c) If a matrix A is real, A − iI is invertible.
False. A real matrix can have the eigenvalue i; e.g., the rotation matrix A = [ 0 −1; 1 0 ] has eigenvalues ±i, so A − iI is singular.
3.1. Show that the number of non-zero singular values of a matrix A coin-
cides with its rank.
Proof Suppose A is m × n. It is known (see Problem 5.3 above) that Ker A = Ker(A*A). Then rank A = n − dim Ker A = n − dim Ker(A*A) = rank(A*A). With the SVD A = WΣV*, we get A*A = VΣ*W*WΣV* = V(Σ*Σ)V*, where Σ*Σ is the n × n diagonal matrix with the squared singular values on its diagonal. Since V is unitary (full rank), rank A = rank(A*A) = rank(Σ*Σ) = number of non-zero singular values.

3.5. Find the singular value decomposition of the matrix

A = [ 2  3 ]
    [ 0  2 ].

Use it to find
a) max_{‖x‖≤1} ‖Ax‖ and the vector where the maximum is attained;
b) min_{‖x‖=1} ‖Ax‖ and the vector where the minimum is attained;
c) the image A(B) of the closed unit ball in R², B = {x ∈ R² : ‖x‖ ≤ 1}. Describe A(B) geometrically.
Solution (The SVD steps are omitted here.) a) Suppose A = WΣV*; then (Ax, Ax) = x*A*Ax = x*VΣ²V*x = (V*x)*Σ²(V*x). Define y = [y1 y2]ᵀ = V*x. Because V is orthogonal, y also lies in the unit ball. Thus

(Ax, Ax) = y*Σ²y = [ y1  y2 ] [ 16  0 ] [ y1 ] = 16y1² + y2²,
                              [  0  1 ] [ y2 ]

with y1² + y2² ≤ 1. Hence the maximum of ‖Ax‖² is 16, attained when y = [1 0]ᵀ, so max ‖Ax‖ = 4; the corresponding x is recovered from x = Vy, i.e., the first column of V.
b) Similarly, the minimum of ‖Ax‖ is 1, attained when y = [0 1]ᵀ, i.e., at the second column of V.
c) A(B) is a (filled) ellipse with semi-axes 4 and 1 along the directions of the columns of W.
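A numerical confirmation of the singular values and of the extremal directions (assuming NumPy):

    import numpy as np

    A = np.array([[2.0, 3.0],
                  [0.0, 2.0]])
    W, s, Vt = np.linalg.svd(A)
    print(s)                              # [4. 1.]
    x_max, x_min = Vt[0], Vt[1]           # rows of V* are the right singular vectors
    print(np.linalg.norm(A @ x_max))      # 4.0, the maximum of ||Ax|| over ||x|| = 1
    print(np.linalg.norm(A @ x_min))      # 1.0, the minimum of ||Ax|| over ||x|| = 1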

3.8. Let A be an m × n matrix. Prove that non-zero eigenvalues of the


matrices A∗ A and AA∗ (counting multiplicities) coincide.
Proof Suppose v is an eigenvector of A*A corresponding to a non-zero eigenvalue λ, i.e., A*Av = λv. Then Av ≠ 0 (otherwise λv = A*Av = 0 with v ≠ 0 would force λ = 0), and AA*(Av) = A(A*Av) = A(λv) = λ(Av), i.e., λ is an eigenvalue of AA* with eigenvector Av. Moreover, if v1, ..., vp are linearly independent eigenvectors of A*A for the eigenvalue λ ≠ 0, then Av1, ..., Avp are linearly independent (applying A* to a vanishing combination ∑ ck Avk = 0 gives λ ∑ ck vk = 0), so the multiplicity of λ as an eigenvalue of AA* is at least its multiplicity for A*A. Exchanging the roles of A and A* gives the reverse inequality. Thus the non-zero eigenvalues of A*A and AA*, counting multiplicities, coincide.

4.2. Let A be a normal operator, and let λ1 , λ2 , ..., λn be its eigenvalues


(counting multiplicities). Show that singular values of A are |λ1 |, |λ2 |, ..., |λn |.
Proof First, we show that for a normal operator A, ‖Ax‖ = ‖A*x‖. Since AA* = A*A,

0 = ((AA* − A*A)x, x) = (AA*x, x) − (A*Ax, x) = (A*x, A*x) − (Ax, Ax) = ‖A*x‖² − ‖Ax‖²,

thus ‖Ax‖ = ‖A*x‖.
Now suppose v is an eigenvector of A corresponding to the eigenvalue λ. The operator A − λI is also normal (expand (A − λI)(A − λI)* and (A − λI)*(A − λI) and use AA* = A*A). Thus ‖(A − λI)v‖ = ‖(A − λI)*v‖ = ‖(A* − λ̄I)v‖ = 0, i.e., A*v = λ̄v. So A*Av = A*(λv) = λλ̄v = |λ|²v, hence |λ|² is an eigenvalue of A*A and |λ| is a singular value of A. Counting multiplicities, this gives exactly the singular values |λ1|, |λ2|, ..., |λn|: a normal operator has an orthonormal basis of eigenvectors, and in such a basis A*A is diagonal with entries |λi|².
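A quick numerical check (assuming NumPy), building a random normal matrix as U diag(λ) U*:

    import numpy as np

    rng = np.random.default_rng(1)
    lam = rng.standard_normal(4) + 1j * rng.standard_normal(4)   # random complex eigenvalues
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
    A = Q @ np.diag(lam) @ Q.conj().T                            # normal by construction
    print(np.round(np.sort(np.abs(lam))[::-1], 6))               # |eigenvalues|, sorted
    print(np.round(np.linalg.svd(A, compute_uv=False), 6))       # singular values: the same numbers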

4.4. Let A = W̃ Σ̃ Ṽ* be a reduced singular value decomposition of A. Show that Ran A = Ran W̃, and then, by taking adjoints, that Ran A* = Ran Ṽ.
Proof Suppose A is m × n of rank r, so Σ̃ = diag(σ1, σ2, ..., σr) with all σk > 0, W̃ is m × r and Ṽ* is r × n. Since Ṽ* has rank r and Σ̃ is invertible (full rank r), Σ̃Ṽ* maps onto the whole r-dimensional coordinate space, i.e., Ran(Σ̃Ṽ*) = Fʳ (F = R or C). Therefore Ran A = Ran(W̃ Σ̃ Ṽ*) = W̃ Fʳ = Ran W̃.
Taking adjoints, A* = Ṽ Σ̃* W̃* is a reduced singular value decomposition of A*, so by the same argument Ran A* = Ran Ṽ.

∗ ∗ ∗ ∗ ∗ End ∗ ∗ ∗ ∗ ∗
