Selected Solutions To Linear Algebra Done Wrong
Jing Huang
August 2018
Introduction
Linear Algebra Done Wrong by Prof. Sergei Treil is a well-known linear algebra reference for college students. However, I found no solution manual available when I read this book and did the exercises. In contrast, a solution manual for the other famous book, Linear Algebra Done Right, is readily available thanks to its greater popularity (without any doubt, Done Wrong is an excellent book). Reference solutions are important because insights tend to hide behind mistakes and ambiguities, which may be missed if there is no good way to check our answers. They also save time when there is no hint or the hint is obscure.
I scanned all of the problems in the first seven chapters (2014 version), solved most of them, and share the valuable ones here (those that are relatively hard or insightful from my perspective; I read this book for review and deeper mathematical understanding). The remaining problems should be tractable even for a novice in linear algebra. Chapter 7 contains only a few easy problems, and none is selected here. The last chapter, Chapter 8, deals with dual spaces and tensors, which may be too advanced for most readers and is also not covered. As these solutions are references meant to facilitate the reader's learning process, their correctness and optimality are not guaranteed. Lastly, do not copy the contents, as doing so can violate your college's academic rules.
I am now a PhD student with limited time to polish the contents, so typos and grammar mistakes are inevitable. Feedback is welcome at [email protected]. Cheers.
Subtracting the two equations above gives
$$0_2 - 0_1 = v - v = 0_1 \ (\text{or } 0_2),$$
and then
$$0_2 = 0_1 + 0_1 = 0_1 \qquad (\text{or } 0_2 = 0_1 + 0_2 = 0_1).$$
Thus, the zero vector is unique.
1.6. Prove that the additive inverse, defined in Axiom 4 of a vector space, is unique.
Proof Assume there exist two different additive inverses $w_1$ and $w_2$ of a vector $v \in V$. Then
$$v + w_1 = 0, \qquad v + w_2 = 0.$$
Subtracting the two equations gives $w_1 - w_2 = 0$, and hence $w_1 = w_2$. Therefore, the additive inverse is unique.
1.8. Prove that for any vector v its additive inverse −v is given by (−1)v.
Proof v + (−1)v = (1 − 1)v = 0v = 0 and we know from Problem 1.6 that
the additive inverse is unique. Hence, −v = (−1)v.
$$\alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_r v_r = 0,$$
with $\sum_{k=1}^{r} |\alpha_k| \neq 0$. This contradicts the assumption that $v_1, v_2, \dots, v_r$ are linearly independent. So $\alpha_{r+1} \neq 0$, and $v_{r+1}$ can be represented as
$$v_{r+1} = -\frac{1}{\alpha_{r+1}} \sum_{k=1}^{r} \alpha_k v_k.$$
This contradicts the premise that $v_{r+1}$ cannot be represented by $v_1, v_2, \dots, v_r$. Thus, the system $v_1, v_2, \dots, v_r, v_{r+1}$ is linearly independent.
That is,
$$T = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} \cos(-\alpha) & -\sin(-\alpha) \\ \sin(-\alpha) & \cos(-\alpha) \end{pmatrix}$$
$$= \begin{pmatrix} \cos\frac{\pi}{4} & -\sin\frac{\pi}{4} \\ \sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} \cos(-\frac{\pi}{4}) & -\sin(-\frac{\pi}{4}) \\ \sin(-\frac{\pi}{4}) & \cos(-\frac{\pi}{4}) \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
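This can be double-checked numerically; below is a small NumPy sketch (the helper R(a) is just the 2 × 2 rotation matrix used above):

import numpy as np

def R(a):
    # 2x2 rotation matrix by angle a
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

F = np.array([[1.0, 0.0], [0.0, -1.0]])   # reflection about the x-axis

T = R(np.pi / 4) @ F @ R(-np.pi / 4)      # reflection about the line y = x
print(np.round(T, 10))                    # [[0. 1.] [1. 0.]]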
Proof Suppose a linear transformation $T : \mathbb{C} \to \mathbb{C}$ satisfies $T(1) = a + ib$ for two real numbers $a$ and $b$. Then $T(-1) = -T(1) = -a - ib$. Note that $i^2 = -1$, so $T(-1) = T(i \cdot i) = iT(i)$. Thus
$$T(i) = \frac{-a - ib}{i} = i(a + ib).$$
For any $\omega = x + iy \in \mathbb{C}$, $x, y \in \mathbb{R}$,
$$T(\omega) = xT(1) + yT(i) = x(a + ib) + iy(a + ib) = (x + iy)(a + ib) = \alpha\omega,$$
where $\alpha = T(1)$.
5.4. Find the matrix of the orthogonal projection in R2 onto the line
x1 = −2x2 .
Solution Following steps similar to those in Problem 3.2, we have
$$T = R(\alpha) P R(-\alpha) = R(\alpha) \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} R(-\alpha),$$
where $\alpha = \tan^{-1}(-\frac{1}{2})$ is the angle of the line.
5.7. Find the matrix of the reflection through the line y = −2x/3. Perform
all the multiplications.
Solution As above,
$$T = R(\alpha)\,\mathrm{Ref}\,R(-\alpha) = R(\alpha) \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} R(-\alpha).$$
With $\alpha = \tan^{-1}(-\frac{2}{3})$,
$$T = \begin{pmatrix} \frac{5}{13} & -\frac{12}{13} \\ -\frac{12}{13} & -\frac{5}{13} \end{pmatrix}.$$
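Both answers can be checked numerically. In the NumPy sketch below, the projection matrix printed for Problem 5.4 is computed here for illustration rather than quoted from the text:

import numpy as np

def R(a):
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

# Problem 5.4: projection onto the line x1 = -2*x2 (angle arctan(-1/2))
a_proj = np.arctan(-1 / 2)
P = R(a_proj) @ np.diag([1.0, 0.0]) @ R(-a_proj)
print(np.round(P, 4))            # [[ 0.8 -0.4] [-0.4  0.2]]

# Problem 5.7: reflection through y = -2x/3 (angle arctan(-2/3))
a_ref = np.arctan(-2 / 3)
T = R(a_ref) @ np.diag([1.0, -1.0]) @ R(-a_ref)
print(np.round(13 * T, 4))       # [[  5. -12.] [-12.  -5.]]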
a basis in W .
Proof For any $w \in W$, $A^{-1}w = v \in V$. Writing $v$ in the basis of $V$ as $v = [v_1\ v_2\ \dots\ v_n][\alpha_1\ \alpha_2\ \dots\ \alpha_n]^T$,
$$w = Av = A[v_1\ v_2\ \dots\ v_n][\alpha_1\ \alpha_2\ \dots\ \alpha_n]^T = [Av_1\ Av_2\ \dots\ Av_n][\alpha_1\ \alpha_2\ \dots\ \alpha_n]^T,$$
so $Av_1, Av_2, \dots, Av_n$ span $W$. Next we show that $Av_1, Av_2, \dots, Av_n$ are linearly independent. If they were not, suppose without loss of generality that $Av_1$ can be expressed as a linear combination of $Av_2, \dots, Av_n$. Multiplying on the left by $A^{-1}$ shows that $v_1$ can be expressed in terms of $v_2, \dots, v_n$, which contradicts the fact that $v_1, v_2, \dots, v_n$ is a basis of $V$. The proposition is proved.
$$T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -3 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad \text{(translate a point of the line, here } (0, 3, 0)\text{, to the origin)}$$

$$R_x = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad \text{(rotate the line } y = x \text{ about the } x\text{-axis by } \pi/2)$$

$$R_y = \begin{pmatrix} \frac{\sqrt{2}}{2} & 0 & -\frac{\sqrt{2}}{2} & 0 \\ 0 & 1 & 0 & 0 \\ \frac{\sqrt{2}}{2} & 0 & \frac{\sqrt{2}}{2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad \text{(rotate the line about the } y\text{-axis by } -\pi/4)$$

$$R_z(\gamma) = \begin{pmatrix} \cos\gamma & -\sin\gamma & 0 & 0 \\ \sin\gamma & \cos\gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad \text{(rotate about the } z\text{-axis by } \gamma)$$

Combining all the matrices above and their inverses yields

$$R_{4\times 4} = T^{-1} R_x^{-1} R_y^{-1} R_z(\gamma) R_y R_x T = \begin{pmatrix}
\frac{\cos\gamma + 1}{2} & \frac{1 - \cos\gamma}{2} & \frac{\sqrt{2}\sin\gamma}{2} & \frac{3\cos\gamma - 3}{2} \\[4pt]
\frac{1 - \cos\gamma}{2} & \frac{\cos\gamma + 1}{2} & -\frac{\sqrt{2}\sin\gamma}{2} & \frac{3 - 3\cos\gamma}{2} \\[4pt]
-\frac{\sqrt{2}\sin\gamma}{2} & \frac{\sqrt{2}\sin\gamma}{2} & \cos\gamma & -\frac{3\sqrt{2}\sin\gamma}{2} \\[4pt]
0 & 0 & 0 & 1
\end{pmatrix}.$$
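The composition can be verified numerically. The sketch below assumes the axis of rotation is the line $y = x + 3$ in the plane $z = 0$, consistent with the translation entries of the final matrix; translating the axis point $(0, 3, 0)$ to the origin is one valid choice (any point on the line works):

import numpy as np

g = 0.7  # an arbitrary test angle gamma

def Rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0, 0], [s, c, 0, 0],
                     [0, 0, 1, 0], [0, 0, 0, 1.0]])

Rx = np.array([[1, 0, 0, 0], [0, 0, -1, 0],
               [0, 1, 0, 0], [0, 0, 0, 1.0]])
c, s = np.cos(-np.pi / 4), np.sin(-np.pi / 4)
Ry = np.array([[c, 0, s, 0], [0, 1, 0, 0],
               [-s, 0, c, 0], [0, 0, 0, 1.0]])
T = np.eye(4)
T[1, 3] = -3.0   # translate the axis point (0, 3, 0) to the origin

Rfull = (np.linalg.inv(T) @ np.linalg.inv(Rx) @ np.linalg.inv(Ry)
         @ Rz(g) @ Ry @ Rx @ T)

# Closed form from the text
cg, sg, r2 = np.cos(g), np.sin(g), np.sqrt(2)
expected = np.array([
    [(cg + 1) / 2, (1 - cg) / 2,  r2 * sg / 2, (3 * cg - 3) / 2],
    [(1 - cg) / 2, (cg + 1) / 2, -r2 * sg / 2, (3 - 3 * cg) / 2],
    [-r2 * sg / 2, r2 * sg / 2,   cg,          -3 * r2 * sg / 2],
    [0, 0, 0, 1.0]])
print(np.allclose(Rfull, expected))   # True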
Chapter 2. Systems of Linear Equations
3.8. Show that if the equation $Ax = 0$ has a unique solution (i.e., if the echelon form of $A$ has a pivot in every column), then $A$ is left invertible.
Proof If $Ax = 0$ has a unique solution, that solution is the trivial one, and the echelon form of $A$ has a pivot in every column. Let the size of $A$ be $m \times n$; then $m \geq n$, i.e., the number of rows is at least the number of columns. The reduced echelon form of $A$ can be written as
$$A_{re} = \begin{pmatrix} I_{n\times n} \\ 0_{(m-n)\times n} \end{pmatrix}, \qquad A_{re} = E_k \cdots E_2 E_1 A,$$
where $E_1, E_2, \dots, E_k$ are the elementary matrices of the row operations.
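As a minimal numerical illustration that a left inverse exists, one can use the standard formula $B = (A^T A)^{-1} A^T$ for a full-column-rank $A$; note this is a different construction from the elementary-matrix argument above:

import numpy as np

# A tall matrix whose echelon form has a pivot in every column
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

B = np.linalg.inv(A.T @ A) @ A.T      # a left inverse of A
print(np.allclose(B @ A, np.eye(2)))  # True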
Proof Suppose that $\dim V = k$ and let $v_1, v_2, \dots, v_k$ be a basis of $V$. Then $AV$ is spanned by the images of the basis vectors, so $\dim AV = \operatorname{rank}[Av_1, Av_2, \dots, Av_k] \leq k = \dim V$.
Similarly, assume $\operatorname{rank} B = k$ and let $b_1, b_2, \dots, b_k$ be linearly independent columns of $B$. Every column of $AB$ lies in the span of $Ab_1, Ab_2, \dots, Ab_k$, so $\operatorname{rank} AB \leq \operatorname{rank}[Ab_1, Ab_2, \dots, Ab_k] \leq k = \operatorname{rank} B$.
8.5. Prove that if $A$ and $B$ are similar matrices then $\operatorname{trace} A = \operatorname{trace} B$. (Hint: recall how $\operatorname{trace}(XY)$ and $\operatorname{trace}(YX)$ are related.)
Proof $\operatorname{trace}(A) = \operatorname{trace}(Q^{-1}BQ) = \operatorname{trace}(BQQ^{-1}) = \operatorname{trace}(B)$. (Note that $\operatorname{trace}(XY) = \operatorname{trace}(YX)$ whenever both products are defined.)
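A quick NumPy check of this invariance on random matrices:

import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
Q = rng.standard_normal((4, 4))     # generically invertible
A = np.linalg.inv(Q) @ B @ Q        # A is similar to B
print(np.trace(A), np.trace(B))     # equal up to rounding error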
Chapter 3. Determinants
3.4. A square matrix (n × n) is called skew-symmetric (or antisymmetric) if
AT = −A. Prove that if A is skew-symmetric and n is odd, then det A = 0.
Is this true for even n?
Proof det A = det AT = det(−A) = (−1)n det A by using the properties
of determinant and skew-symmetric matrices. If n is odd, (−1)n = −1, we
have det A = − det A, thus det A = 0.
If n is even, we just have det A = det A, so the result above is generally
not true.
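A small numerical check of both cases (random odd-sized skew-symmetric matrices, plus the even-sized counterexample):

import numpy as np

rng = np.random.default_rng(1)
for n in (3, 5):                       # odd sizes
    M = rng.standard_normal((n, n))
    A = M - M.T                        # skew-symmetric: A.T == -A
    print(n, np.linalg.det(A))         # ~0 up to floating-point error

A2 = np.array([[0.0, 1.0], [-1.0, 0.0]])   # even n
print(np.linalg.det(A2))                   # 1.0, not zero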
3.9. Let points A, B and C in the plane R2 have coordinates (x1 , y1 ), (x2 , y2 )
and (x3 , y3 ) respectively. Show that the area of triangle ABC is the absolute
value of
$$\frac{1}{2} \begin{vmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{vmatrix}.$$
Hint: use row operations and the geometric interpretation of $2 \times 2$ determinants (area).
Proof The area of triangle $ABC$ is half the area of the parallelogram spanned by the neighbouring sides $AB$ and $AC$, which can be computed as
$$S_{\triangle ABC} = \frac{1}{2}\left| \det \begin{pmatrix} x_2 - x_1 & y_2 - y_1 \\ x_3 - x_1 & y_3 - y_1 \end{pmatrix} \right| = \frac{1}{2}\left| (x_2 - x_1)(y_3 - y_1) - (x_3 - x_1)(y_2 - y_1) \right|.$$
At the same time, computing the determinant by row reduction,
$$\frac{1}{2}\begin{vmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{vmatrix}
= \frac{1}{2}\begin{vmatrix} 1 & x_1 & y_1 \\ 0 & x_2 - x_1 & y_2 - y_1 \\ 0 & x_3 - x_1 & y_3 - y_1 \end{vmatrix}
= \frac{1}{2}\begin{vmatrix} 1 & x_1 & y_1 \\ 0 & x_2 - x_1 & y_2 - y_1 \\ 0 & 0 & y_3 - y_1 - (y_2 - y_1)\frac{x_3 - x_1}{x_2 - x_1} \end{vmatrix}
= \frac{1}{2}\big((x_2 - x_1)(y_3 - y_1) - (x_3 - x_1)(y_2 - y_1)\big).$$
Here we assume $x_2 - x_1 \neq 0$; one can verify that the result still holds if $x_2 - x_1 = 0$. Taking the absolute value, the conclusion follows.
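As a sanity check, the formula applied to a right triangle with legs 3 and 4 (area 6):

import numpy as np

# Right triangle with vertices (0,0), (3,0), (0,4); its area is 6
M = np.array([[1.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [1.0, 0.0, 4.0]])
print(abs(np.linalg.det(M)) / 2)   # 6.0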
$$P^N := \underbrace{P P \cdots P}_{N \text{ times}} = I.$$
Use the fact that there are only finitely many permutations.
Solution a) Consider the linear transformation $y = Px$ and the rows of $P$. There is exactly one 1 in each row of $P$. Suppose $P_{1,j} = 1$ in the first row; then $y_1 = p_1 x = x_j$, where $p_1$ is the first row of $P$. Namely, $x_j$ is moved to the first place by the transformation. Similarly, for the second row of $P$, suppose $P_{2,k} = 1$; then $y_2 = x_k$ and $x_k$ is moved to the second place, and so on. Since there is also exactly one 1 in each column, the column indices of the 1 entries $P_{1,j}, P_{2,k}, P_{3,m}, \dots$ form a permutation $(j, k, m, \dots)$ of $\{1, 2, \dots, n\}$. After multiplying by the permutation matrix $P$, the entries of $x$ are reordered as $[x_j, x_k, x_m, \dots]^T$. (Arguing via the column vectors of $P$ also works.)
b) By part a), $y = Px$ only reorders the entries of $x$, so the map is invertible and $x = P^{-1}y$. We know $y_1 = x_j$, so $P^{-1}_{j,1} = 1$, which returns $x_j$ to its original position. Similarly, $y_2 = x_k$, so $P^{-1}_{k,2} = 1$. Continuing this way, $P^{-1}_{i,j} = P_{j,i}$. So $P$ is invertible and $P^{-1} = P^T$.
c) Note that $Px, P^2x, P^3x, \dots, P^Nx$ are all permutations of $(x_1, x_2, \dots, x_n)$. If $P^N$ never equaled $I$, then $Px, P^2x, P^3x, \dots$ would all be different permutations. Since $N$ can be arbitrarily large, there would be infinitely many permutations of $(x_1, x_2, \dots, x_n)$, which is impossible. Thus there must be some $N > 0$ with $P^N = I$. In fact, since there are $n!$ permutations of $n$ distinct elements, $N \leq n!$.
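Parts b) and c) can be illustrated with a concrete permutation matrix; the sketch below uses the cyclic permutation of three elements, for which $P^3 = I$:

import numpy as np
from itertools import count

P = np.array([[0, 1, 0],    # permutation matrix of the 3-cycle
              [0, 0, 1],
              [1, 0, 0]])

print(np.array_equal(P.T @ P, np.eye(3, dtype=int)))   # True: P^{-1} = P^T

M = np.eye(3, dtype=int)
for N in count(1):
    M = M @ P
    if np.array_equal(M, np.eye(3, dtype=int)):
        print("P^N = I for N =", N)   # N = 3, and 3 <= 3! = 6
        break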
The exercises in Part 5 and Part 7 of this chapter are not difficult. Some ideas and answers are given for reference:
• Problem 5.3: we can use cofactor expansion along the last column, and the remaining minor of $A + tI$ in each term is a triangular matrix. The final expression is $\det(A + tI) = a_0 + a_1 t + a_2 t^2 + \dots + a_{n-1} t^{n-1}$. The power of $-1$ in each term is even.
b) If a matrix has one eigenvector, it has infinitely many eigenvectors.
True: if $Ax = \lambda x$, then $A(\alpha x) = \lambda(\alpha x)$ for an arbitrary nonzero scalar $\alpha$, so $\alpha x$ is also an eigenvector of $A$.
Proof $A_{VV} = [I]_{VS} A_{SS} [I]_{SV}$, where $S$ denotes the standard basis, $[I]$ is the change-of-coordinates matrix, and $[I]_{SV} = [v_1\ v_2\ \dots\ v_n]$. Then
$$[I]_{SV} A_{VV} = A_{SS}[I]_{SV} = A_{SS}[v_1\ v_2\ \dots\ v_n] = [\lambda v_1\ \dots\ \lambda v_k\ A_{SS}v_{k+1}\ \dots\ A_{SS}v_n].$$
Denote the $i$-th column of $A_{VV}$ by $a_i$. Consider $a_1$: $[I]_{SV} a_1 = \lambda v_1$. Since $v_1, v_2, \dots, v_n$ is a basis, $a_1$ can only have the form $a_1 = [\lambda, 0, 0, \dots, 0]^T$. Similarly, checking the first $k$ columns of $A_{VV}$, they are $\lambda$ times the first $k$ standard basis vectors. So $A_{VV}$ has the block triangular form above.
1.9. Use the two previous exercises to prove that geometric multiplicity
of an eigenvalue cannot exceed its algebraic multiplicity.
Proof We consider the problem in the basis $v_1, v_2, \dots, v_n$, in which $A$ has the block triangular form shown in Problem 1.8. Note that $k$ is the number of linearly independent eigenvectors corresponding to $\lambda_k$, which is also the dimension of $\operatorname{Ker}(A - \lambda_k I)$ (consider the equation $(A - \lambda_k I)x = 0$). Namely, $k$ is the geometric multiplicity of $\lambda_k$.
For the algebraic multiplicity, consider the determinant
$$\det(A - \lambda I) = \begin{vmatrix} (\lambda_k - \lambda)I_k & * \\ 0 & B - \lambda I_{n-k} \end{vmatrix} = (\lambda_k - \lambda)^k \det(B - \lambda I_{n-k}).$$
So the algebraic multiplicity of $\lambda_k$ is at least $k$. It is further possible that $\lambda_k$ is a root of the polynomial $\det(B - \lambda I_{n-k})$, in which case the algebraic multiplicity exceeds $k$. Thus the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity.
1.11. Prove that the trace of a matrix equals the sum of its eigenvalues in three steps. First, compute the coefficient of $\lambda^{n-1}$ on the right side of the equality
$$\det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \cdots (\lambda_n - \lambda).$$
Then show that $\det(A - \lambda I)$ can be represented as
$$\det(A - \lambda I) = (a_{1,1} - \lambda)(a_{2,2} - \lambda) \cdots (a_{n,n} - \lambda) + q(\lambda),$$
where $q(\lambda)$ is a polynomial of degree at most $n - 2$. Finally, compare the coefficients of $\lambda^{n-1}$ to get the conclusion.
Proof First, expanding the product (as in the binomial theorem), the coefficient of $\lambda^{n-1}$ in $\det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda)$ is $C(\lambda^{n-1}) = (-1)^{n-1}(\lambda_1 + \lambda_2 + \dots + \lambda_n)$: to get the term $\lambda^{n-1}$, we pick $-\lambda$ from $n - 1$ of the $n$ factors $\lambda_i - \lambda$, and $\lambda_j$ from the one remaining factor. The resulting term is $(-1)^{n-1}\lambda_j \lambda^{n-1}$; there are $n$ such choices, and their sum is $C(\lambda^{n-1})\lambda^{n-1}$.
Next, we show $\det(A - \lambda I)$ can be represented as
$$\det(A - \lambda I) = (a_{1,1} - \lambda)(a_{2,2} - \lambda)\cdots(a_{n,n} - \lambda) + q(\lambda).$$
That is to say, all the $\lambda^{n-1}$ terms in $\det(A - \lambda I)$ come from $(a_{1,1} - \lambda)(a_{2,2} - \lambda)\cdots(a_{n,n} - \lambda)$. This holds because $\lambda$ only appears on the diagonal of $A - \lambda I$: using the formal definition of the determinant, if a product picks $n - 1$ diagonal entries containing $\lambda$, then its last factor must also come from the diagonal, and there is no other way to generate $\lambda^{n-1}$. Thus $q(\lambda)$ is a polynomial of degree at most $n - 2$, and the coefficient of $\lambda^{n-1}$ also equals $C(\lambda^{n-1}) = (-1)^{n-1}(a_{1,1} + a_{2,2} + \dots + a_{n,n})$.
The coefficients derived in the two ways must be identical, so $\sum_{i=1}^{n} a_{i,i} = \sum_{i=1}^{n} \lambda_i$; namely, the trace of a matrix equals the sum of its eigenvalues.
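A quick numerical check of the conclusion:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
print(np.trace(A))                     # the trace
print(np.sum(np.linalg.eigvals(A)))    # the same value (tiny imaginary residue)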
c) If $A$ is diagonalizable, then so is $A^T$.
True: $A = SDS^{-1}$, so $A^T = (SDS^{-1})^T = (S^{-1})^T D^T S^T = (S^T)^{-1} D S^T$, since $D^T = D$ for diagonal $D$.
2.2. Let $A$ be a square matrix with real entries, and let $\lambda$ be its complex eigenvalue. Suppose $v = (v_1, v_2, \dots, v_n)^T$ is a corresponding eigenvector, $Av = \lambda v$. Prove that $\bar{\lambda}$ is an eigenvalue of $A$ and $A\bar{v} = \bar{\lambda}\bar{v}$. Here $\bar{v}$ is the complex conjugate of the vector $v$, $\bar{v} := (\bar{v}_1, \bar{v}_2, \dots, \bar{v}_n)^T$.
Proof $A$ is a real matrix, so $\bar{A}\bar{v} = A\bar{v}$. At the same time, $\bar{A}\bar{v} = \overline{Av} = \overline{\lambda v} = \bar{\lambda}\bar{v}$. Thus $A\bar{v} = \bar{\lambda}\bar{v}$.
conjugate symmetry.
Proof
$$\begin{aligned}
\|x + y\|^2 + \|x - y\|^2 &= (x + y, x + y) + (x - y, x - y) \\
&= (x, x) + (x, y) + (y, x) + (y, y) + (x, x) - (x, y) - (y, x) + (y, y) \\
&= 2(x, x) + 2(y, y) = 2(\|x\|^2 + \|y\|^2).
\end{aligned}$$
$$(x, y) = \sum_{k=1}^{n} \alpha_k \bar{\beta}_k.$$
Proof
a)
$$(x, y) = \Big( \sum_{k=1}^{n} \alpha_k v_k,\ \sum_{k=1}^{n} \beta_k v_k \Big) = \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \bar{\beta}_j (v_i, v_j) = \sum_{k=1}^{n} \alpha_k \bar{\beta}_k.$$
b) Use $(x, v_k) = \alpha_k$ and $(y, v_k) = \beta_k$, together with the conclusion of a).
c) Using the computation in a),
$$(x, y) = \Big( \sum_{k=1}^{n} \alpha_k v_k,\ \sum_{k=1}^{n} \beta_k v_k \Big) = \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \bar{\beta}_j (v_i, v_j) = \sum_{k=1}^{n} \alpha_k \bar{\beta}_k (v_k, v_k) = \sum_{k=1}^{n} \alpha_k \bar{\beta}_k \|v_k\|^2 = \sum_{k=1}^{n} \frac{(x, v_k)\overline{(y, v_k)}}{\|v_k\|^2}.$$
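The identity in c) can be checked numerically with a random orthogonal (but not orthonormal) basis. Note that np.vdot(a, b) computes $\sum_i \bar{a}_i b_i$, so the book's inner product $(x, y)$ corresponds to np.vdot(y, x):

import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4))
                    + 1j * rng.standard_normal((4, 4)))
V = Q * np.array([1.0, 2.0, 3.0, 4.0])   # orthogonal columns v_k, norms 1..4

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

lhs = np.vdot(y, x)                       # (x, y)
rhs = sum(np.vdot(V[:, k], x) * np.conj(np.vdot(V[:, k], y))
          / np.linalg.norm(V[:, k]) ** 2 for k in range(4))
print(np.allclose(lhs, rhs))              # True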
b) Note that $A = nP_E$. Suppose $Ax = \lambda x$; then $nP_E x = \lambda x$, i.e., $n$ times the orthogonal projection of the eigenvector onto the one-dimensional subspace $E$ equals $\lambda$ times the eigenvector. In particular, the eigenvector's orthogonal projection is parallel to the eigenvector itself. Then there are two possibilities:
b) P 2 = P .
Proof a) $P + Q = I$ since $(P + Q)x = Px + Qx = P_E x + Q_{E^\perp} x = x$ for all $x$. $PQ = 0_{n\times n}$ since $x^* P Q x = x^* P^* Q x = (Qx, Px) = 0$ for all $x$ (using the fact that $P$ is self-adjoint, shown in Problem 3.11, and that $Px \in E$, $Qx \in E^\perp$).
b) $(P - Q)^2 = (P - Q)(P - Q) = P^2 - PQ - QP + Q^2 = P^2 + Q^2 = P^2 + Q^2 + PQ + QP = (P + Q)^2 = I^2 = I$ (using $PQ = QP = 0$). That is, $(P - Q)^{-1} = P - Q$.
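Both parts can be verified numerically by building $P$ from an orthonormal basis of a random subspace $E$:

import numpy as np

rng = np.random.default_rng(4)
Qb, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal basis of E
P = Qb @ Qb.T               # orthogonal projection onto E
Q = np.eye(5) - P           # orthogonal projection onto E-perp
D = P - Q
print(np.allclose(D @ D, np.eye(5)))   # True: (P - Q)^2 = I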
5.1. Show that for a square matrix $A$ the equality $\det(A^*) = \overline{\det A}$ holds.
Proof $\det(A^*) = \det(\bar{A}^T) = \det(\bar{A}) = \overline{\det A}$.
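A one-line numerical check on a random complex matrix:

import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
print(np.linalg.det(A.conj().T))        # det(A*)
print(np.conj(np.linalg.det(A)))        # the same value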
Chapter 6. Structure of Operators in Inner Product Spaces
1.1. Use the upper triangular representations of an operator to give an
alternative proof of the fact that the determinant is the product and the
trace is the sum of eigenvalues counting multiplicities.
Proof (The proof uses the fact that the entries on the diagonal of $T$ are the eigenvalues of $A$, counting multiplicity, which does not seem to be stated explicitly in the book but can be found, e.g., in the Wikipedia article on the Schur decomposition.) Write $A = UTU^*$ with $U$ unitary and $T$ upper triangular with the eigenvalues of $A$ on its diagonal. Then $\det A = (\det U)(\det T)(\det U^*) = \det T = \prod_{i=1}^{n} \lambda_i$, because $\det(U)\det(U^*) = \det(UU^*) = 1$.
To consider the trace, suppose $U = [u_1, u_2, \dots, u_n]$. Then $A$ can be represented as
$$A = [u_1\ u_2\ \dots\ u_n]
\begin{pmatrix} \lambda_1 & t_{12} & \dots & t_{1n} \\ & \lambda_2 & \dots & t_{2n} \\ & & \ddots & \vdots \\ 0 & & & \lambda_n \end{pmatrix}
\begin{pmatrix} u_1^* \\ u_2^* \\ \vdots \\ u_n^* \end{pmatrix}
= \lambda_1 u_1 u_1^* + \lambda_2 u_2 u_2^* + \dots + \lambda_n u_n u_n^* + \sum_{i < j} t_{ij}\, u_i u_j^*.$$
Note that the trace of the outer product $u_i u_j^*$ is $u_j^* u_i$, which by the orthonormality of $u_1, u_2, \dots, u_n$ equals $0$ for $i \neq j$ and $\|u_i\|^2 = 1$ for $i = j$. Thus
$$\operatorname{trace} A = \lambda_1 \operatorname{trace}(u_1 u_1^*) + \dots + \lambda_n \operatorname{trace}(u_n u_n^*) = \lambda_1 + \lambda_2 + \dots + \lambda_n = \sum_{i=1}^{n} \lambda_i.$$
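Both conclusions can be checked numerically via the Schur decomposition (scipy.linalg.schur with complex output):

import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T, U = schur(A, output='complex')   # A = U T U*, T upper triangular
lam = np.diag(T)                    # eigenvalues on the diagonal of T
print(np.allclose(np.linalg.det(A), np.prod(lam)))   # True
print(np.allclose(np.trace(A), np.sum(lam)))         # True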
2.2. True or false: the sum of normal operators is normal? Justify your conclusion.
Solution False in general. Normality requires $N^*N = NN^*$, and for $N = N_1 + N_2$ with $N_1, N_2$ normal,
$$N^*N - NN^* = (N_1^* N_2 + N_2^* N_1) - (N_1 N_2^* + N_2 N_1^*),$$
which need not vanish. For a counterexample, take
$$N_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad N_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix};$$
both are normal, but their sum $N_1 + N_2 = \begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix}$ is not. (The sum is normal under extra assumptions, e.g., when $N_1$ and $N_2$ are simultaneously unitarily diagonalizable.)
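A NumPy check of this counterexample:

import numpy as np

N1 = np.array([[0.0, 1.0], [-1.0, 0.0]])   # normal (skew-symmetric)
N2 = np.array([[0.0, 0.0], [0.0, 1.0]])    # normal (diagonal)

def is_normal(X):
    return np.allclose(X @ X.conj().T, X.conj().T @ X)

print(is_normal(N1), is_normal(N2), is_normal(N1 + N2))   # True True False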
$y_1^2 + y_2^2 \leq 1$. Hence the maximum is 16, attained at $y = [1\ 0]^T$. The corresponding $x$ can be recovered from $x = Vy$.
b) Similarly, the minimum is 1, attained at $y = [0\ 1]^T$.
c) Ellipse.
∗ ∗ ∗ ∗ ∗ End ∗ ∗ ∗ ∗ ∗