Linear Algebra
Claudio Bartocci
Contents
2 Linear operators
3 Upper triangularization
5 Jordan-Chevalley decomposition S
Throughout these notes3 we shall adopt the following assumptions and notation:
= Definition;
= Remark (or Example);
= Historical remark;
S = Supplementary (sometimes more advanced) material;
! = Try to check it!
= Try to prove it!
1 Main topics: operations of sum and product; determinant; trace; rank; eigenvectors and eigenvalues;
characteristic polynomial.
2 Main topics: homomorphisms; bases; rank-nullity theorem (i.e., for any homomorphism A : V → W between finite-dimensional vector spaces V and W, one has dim Ker A + dim Im A = dim V); inner products and Gram-Schmidt orthogonalization.
3 The author wishes to thank Valeriano Lanza for his helpful remarks.
2 Linear operators
It is not difficult to find examples of operators having no eigenvectors at all. For instance, this is the case when V is a real vector space and J : V → V is an operator such that J² = −I (!). Notice that under these hypotheses the dimension of V has to be even.
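As a quick numerical illustration (a minimal sketch, assuming NumPy is available; the matrix is just the 90-degree rotation of the plane, not an example taken from the notes):

```python
import numpy as np

# A 90-degree rotation of the real plane: J @ J == -I, so no real eigenvector exists.
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])

assert np.allclose(J @ J, -np.eye(2))
print(np.linalg.eigvals(J))   # [0.+1.j 0.-1.j], purely imaginary eigenvalues
```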
$$
Lx \;=\;
\begin{pmatrix}
M(L)_{11} & \cdots & M(L)_{1n}\\
\vdots & \ddots & \vdots\\
M(L)_{n1} & \cdots & M(L)_{nn}
\end{pmatrix}
\begin{pmatrix}
x_1\\ \vdots\\ x_n
\end{pmatrix}
\;=\;
\Bigl(\textstyle\sum_{i=1}^{n} x_i\, M(L)_{1i},\;\dots,\;\sum_{i=1}^{n} x_i\, M(L)_{ni}\Bigr).
$$
The determinant det A and the trace tr A of the operator A : V → V are defined, respectively, as the determinant and the trace of the matrix M(A, E), for any arbitrary choice of a basis E for the vector space V. The characteristic polynomial of A is the polynomial χ_A(X) = det(XI − A) ∈ K[X].
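For a concrete check of these definitions, a minimal SymPy sketch (the 2×2 matrix is an arbitrary example):

```python
import sympy as sp

X = sp.symbols('X')
A = sp.Matrix([[2, 1],
               [0, 3]])

# det(X*I - A): the characteristic polynomial as defined above.
chi_A = (X * sp.eye(2) - A).det()
print(sp.expand(chi_A))       # X**2 - 5*X + 6
print(A.det(), A.trace())     # 6 5  (product and sum of the roots 2 and 3)
```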
Theorem 2.3. Let V be a complex vector space. Any operator A : V → V has at least one eigenvector.
k_1 + ⋯ + k_s = n (!).
Corollary 2.4. If the characteristic polynomial χ_A(X) of an operator A : V → V has n distinct roots λ_1, ..., λ_n, then the corresponding eigenvectors v_1, ..., v_n constitute a basis V for V. The matrix M(A, V) representing the operator A with respect to that basis is diagonal, namely
$$
M(A, V) =
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0\\
0 & \lambda_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & \lambda_n
\end{pmatrix}.
$$
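A short numerical illustration of Corollary 2.4 (NumPy assumed; arbitrary example matrix with distinct eigenvalues):

```python
import numpy as np

# An example matrix with distinct eigenvalues; its eigenvectors form a basis
# in which the operator is diagonal, as in Corollary 2.4.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, V = np.linalg.eig(A)      # columns of V are eigenvectors
D = np.linalg.inv(V) @ A @ V           # matrix of A in the eigenvector basis
print(np.round(D, 10))                 # diagonal, with the eigenvalues on the diagonal
```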
3 Upper triangularization
Proof. Let {V_1, ..., V_n} be an A-invariant complete flag. It is possible to choose a basis E = {e_1, ..., e_n} for V such that {e_1, ..., e_i} is a basis for V_i for all i = 1, ..., n: actually, it is enough to pick a generator e_1 of V_1, to complete it to a basis {e_1, e_2} for V_2, and to proceed inductively in the same way (recall that V_i ⊂ V_{i+1}). Now, by the very definition of A-invariant subspace, one has:
$$
\begin{aligned}
A e_1 &= a_{11}\, e_1\\
A e_2 &= a_{12}\, e_1 + a_{22}\, e_2\\
&\;\;\vdots\\
A e_i &= \sum_{k=1}^{i} a_{ki}\, e_k\,. \qquad (3.1)
\end{aligned}
$$
Hence, the matrix M(A, E) representing A with respect to the basis E is upper triangular, namely
$$
M(A, E) =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
0 & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & a_{nn}
\end{pmatrix}. \qquad (3.2)
$$
N
Theorem 3.2. Let V be a complex vector space. Any operator A : V → V can be represented by a triangular matrix.
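Numerically, a triangular representation of a complex operator is usually computed as a Schur decomposition, which refines Theorem 3.2 by making the change of basis unitary; a minimal sketch assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Complex Schur form: A = Z @ T @ Z.conj().T with T upper triangular.
T, Z = schur(A, output='complex')
print(np.round(T, 6))                         # upper triangular
print(np.allclose(A, Z @ T @ Z.conj().T))     # True
```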
Proof. In view of Lemma 3.1 we have just to find an A-invariant complete flag in V. We proceed by induction on the dimension n of V. The case n = 1 is completely trivial, and we assume the inductive hypothesis: if U is a complex vector space of dimension n − 1, then any operator B : U → U admits a B-invariant complete flag. We pass now to study the case of an operator A : V → V, where dim V = n. By Theorem 2.3 (where we use the fact that K = C!) A has at least one eigenvector v_1; so, Av_1 = λ v_1, for some λ ∈ C. Let V_1 be the subspace generated by v_1; we decompose V into the direct sum V = V_1 ⊕ U, where dim U = n − 1, and consider the canonical
Theorem 3.2 can be exploited to give a relatively elementary proof of the classical Cayley-Hamilton theorem. Let χ_A(X) = Σ_{k=0}^{n} s_k(A) X^k, with s_k(A) ∈ K and s_n(A) = 1.
Theorem 4.1 (Cayley-Hamilton). Let V be a complex vector space. For any operator A : V → V one has χ_A(A) = 0.
Proof (after [10]). By Theorem 3.2, after choosing an A-invariant complete flag {V_1, ..., V_n} in V, the operator A is represented with respect to an adapted basis E = {e_1, ..., e_n} (cf. Lemma 3.1) by an upper triangular matrix of the form shown in Eq. (3.2). Then the characteristic polynomial of A is readily expressed by the formula
$$
\chi_A(X) = \det
\begin{pmatrix}
X - a_{11} & -a_{12} & \cdots & -a_{1n}\\
0 & X - a_{22} & \cdots & -a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & X - a_{nn}
\end{pmatrix}
= \prod_{k=1}^{n} (X - a_{kk})\,. \qquad (4.1)
$$
Thus, one has χ_A(A) = ∏_{k=1}^{n} (A − a_{kk} I). In order to prove that χ_A(A) = 0 it suffices to show that
$$
\forall\, j = 1,\dots,n \quad \forall\, v \in V_j \qquad \prod_{k=1}^{j} (A - a_{kk} I)\, v = 0\,.
$$
$$
(A - a_{jj} I)\, v_j \;=\; \sum_{k=1}^{j} a_{kj}\, v_k - a_{jj}\, v_j \;=\; \sum_{k=1}^{j-1} a_{kj}\, v_k\,,
$$
$$
\prod_{k=1}^{j} (A - a_{kk} I)\, v \;=\; \prod_{k=1}^{j-1} (A - a_{kk} I)\,(u_1 + u_2)\,,
$$
with (u_1 + u_2) ∈ V_{j−1}. So, by the inductive hypothesis, we get ∏_{k=1}^{j} (A − a_{kk} I) v = 0 for any v ∈ V_j. Hence, χ_A(A) = 0. N
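A quick numerical sanity check of the theorem (NumPy assumed; the random matrix is arbitrary):

```python
import numpy as np

# Numerical check of the Cayley-Hamilton theorem for a random complex matrix.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

coeffs = np.poly(A)                       # coefficients of chi_A, highest degree first
chi_of_A = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
print(np.max(np.abs(chi_of_A)))           # ~1e-13: chi_A(A) vanishes up to round-off
```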
It should be clear that the Cayley-Hamilton theorem holds true not just in the complex case, but for any field K ⊆ C. Let A : V → V be an operator, V being an n-dimensional K-vector space. After selecting a basis for V, the operator A is represented by a matrix M and χ_A = χ_M. We can view M as an operator M : C^n → C^n and apply the previous theorem (of course, the characteristic polynomial of M is the same whether M is regarded as a matrix over K or over C). In conclusion, we have χ_A(A) = 0.
It seems that William Rowan Hamilton (1805-1865) was the first to prove a special case of the theorem that today we call the Cayley-Hamilton theorem. But the formalism of his Lectures on Quaternions (1853) is so perplexingly obscure that for a modern reader it is difficult even to spot the page where the result is stated. Five years later, Arthur Cayley (1821-1895), in his Memoir on the theory of matrices (Philosophical Transactions of the Royal Society of London, 148 (1858), pp. 17-37), provided a fairly complete algebraic theory of matrices. In this paper he claimed that "the determinant having for its matrix a given matrix less the same matrix considered as a single quantity involving the matrix unity, is equal to zero" [4, p. 482]. Cayley verified the result by hand for 2×2 and 3×3 matrices but he thought it unnecessary "to undertake the labour of a formal proof of the theorem in the general case of a matrix of any degree" [ibid., p. 483]. It was only in 1878 that Ferdinand Georg Frobenius (1849-1917), seemingly unaware of Cayley's work, gave a complete proof of the theorem in his paper Über lineare Substitutionen und bilineare Formen [5].
$$
\cdots + (-1)^{n-1}\,\sigma_{n-1}(a_{11},\dots,a_{nn})\,X + (-1)^{n}\,\sigma_{n}(a_{11},\dots,a_{nn})\,,
$$
Take now the matrix tI − M where t is an indeterminate over the ring R. Each entry of adj(tI − M) is of degree at most n − 1 in t, so that we can write
$$
\operatorname{adj}(tI - M) = \sum_{k=0}^{n-1} t^{k}\, N_k
$$
for suitable matrices N_0, ..., N_{n−1} whose entries are of degree 0 in t. The identity
$$
\prod_{k=1}^{n} (X - t_k) = X^{n} - \sigma_1 X^{n-1} + \sigma_2 X^{n-2} + \cdots + (-1)^{n}\,\sigma_n\,,
$$
where
$$
\sigma_j(t_1,\dots,t_n) = \sum_{1 \le l_1 < \cdots < l_j \le n} t_{l_1} \cdots t_{l_j}\,.
$$
By comparing the summands having the same degree in t, we get a set of n + 1 equations:
$$
\begin{aligned}
\text{Eq. } E_0:&\quad -M N_0 = s_0(M)\, I\\
&\quad\;\;\vdots\\
\text{Eq. } E_k:&\quad N_{k-1} - M N_k = s_k(M)\, I\\
&\quad\;\;\vdots\\
\text{Eq. } E_n:&\quad N_{n-1} = I\,.
\end{aligned}
$$
By multiplying on the left Eq. E_k by M^k and then summing up the resulting equations we get
$$
\chi_M(M) \;=\; \underbrace{M^{n} + \sum_{k=1}^{n-1} s_k(M)\, M^{k} + s_0(M)\, I}_{\text{sum of the RHSs of Eqs. } E_0,\dots,E_n}
\;=\; \underbrace{M^{n} N_{n-1} + \sum_{k=1}^{n-1} \bigl(M^{k} N_{k-1} - M^{k+1} N_k\bigr) - M N_0}_{\text{sum of the LHSs of Eqs. } E_0,\dots,E_n\text{: it's a telescopic sum!}} \;=\; 0\,.
$$
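Both ingredients of this argument, the adjugate identity and the vanishing of χ_M(M), can be checked symbolically; a minimal SymPy sketch with an arbitrary 3×3 matrix:

```python
import sympy as sp

t = sp.symbols('t')
M = sp.Matrix([[1, 2, 0],
               [0, 3, 1],
               [4, 0, 5]])
I = sp.eye(3)

# The identity behind the proof: (t*I - M) * adj(t*I - M) = chi_M(t) * I.
adj = (t * I - M).adjugate()
chi = (t * I - M).det()
print(sp.simplify((t * I - M) * adj - chi * I))   # zero matrix

# Cayley-Hamilton itself, evaluated symbolically: chi_M(M) = 0.
p = sp.Poly(chi, t)
chi_of_M = sum(c * M**k for k, c in zip(range(p.degree(), -1, -1), p.all_coeffs()))
print(chi_of_M)                                   # zero matrix
```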
5 Jordan-Chevalley decomposition S
In most of this section we assume that K is an arbitrary field (not necessarily a subfield of C). Let V be a K-vector space of finite dimension > 0. An operator A : V → V induces the ring homomorphism
$$
K[X] \longrightarrow \operatorname{End}_K V\,,\qquad p(X) \longmapsto p(A)\,.
$$
The ring K[A] is the subring of End_K V generated by A; it is a commutative ring, because A^j A^k = A^k A^j for all non-negative integers j, k. The kernel of this homomorphism is a non-zero ideal of K[X].
The key point is that this homomorphism induces a K[X]-module structure on V. Actually, for any p(X) = X^m + a_{m−1} X^{m−1} + ⋯ + a_1 X + a_0 ∈ K[X] and for any v ∈ V, we set
$$
p(X) \cdot_A v \;=\; p(A)\, v \;=\; A^{m} v + a_{m-1} A^{m-1} v + \cdots + a_1 A v + a_0 v\,.
$$
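In concrete terms the action is just evaluation of the polynomial at A; a tiny NumPy sketch with arbitrary example data (p(X) = X² + 1):

```python
import numpy as np

# The action p(X) . v = p(A) v for p(X) = X^2 + 1, an arbitrary example.
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])
v = np.array([1.0, 1.0])

p_of_A = A @ A + np.eye(2)
print(p_of_A @ v)    # the vector p(X) . v, here [5. 5.]
```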
Suppose V is given a structure of K[X]-module: so, for any p(X) ∈ K[X] and for any v ∈ V, one can perform the multiplication p(X) · v. Then, the multiplication by X induces a linear operator A : V → V, defined by setting Av = X · v for all v ∈ V. It is straightforward to check (!) that the K[X]-module structure on V induced by A coincides with the original one.
Theorem 5.2. Let M be a finitely generated module over a principal ideal domain R. Then M is isomorphic to the direct sum of a finite number of cyclic R-modules6 C_1, ..., C_q and a free R-module L. More precisely, M admits the following two decompositions:
invariant factor decomposition: M ≅ R/(d_1) ⊕ ⋯ ⊕ R/(d_k) ⊕ R^r, where r is a non-negative integer and d_1, ..., d_k are elements of R that are not units and not zero and such that d_i divides d_{i+1} for all i = 1, ..., k − 1;
primary decomposition: M ≅ R/p_1^{s_1} ⊕ ⋯ ⊕ R/p_h^{s_h} ⊕ R^r, where r is a non-negative integer (the same as above), p_1, ..., p_h are (not necessarily distinct) prime ideals in R, and s_1, ..., s_h are non-negative integers.
Notice that the elements d_1, ..., d_k in the invariant factor decomposition are unambiguously determined up to multiplication by units. They are called the invariant factors for the module M. The R-modules R/(d_i) are of course not necessarily indecomposable; on the contrary, the R-modules R/p_j^{s_j} of the primary decomposition are indecomposable. It should be emphasized that, for a module M over a principal ideal domain R, the existence of the primary factor decomposition is a consequence of the existence of the invariant factor decomposition, and conversely.
6 A cyclic R-module N is a module generated by a single element, say x. This is equivalent to saying that N ≅ R/I, where I is the ideal of elements of R annihilating x, i.e., I = {a ∈ R | ax = 0}. In our case, R is a principal ideal domain, so that I = (d), where d is uniquely determined up to multiplication by a unit.
When the ring R is Z, Theorem 5.2 corresponds to the structure theorem for finitely generated abelian groups (which, by the way, can be proved in a more elementary way). Actually, any such group G admits a primary decomposition
$$
G \simeq \mathbb Z/(q_1) \oplus \cdots \oplus \mathbb Z/(q_s) \oplus \mathbb Z^{r}\,,
$$
where the integers q_1, ..., q_s are powers of (not necessarily distinct) prime numbers p_1, ..., p_s, and an invariant factor decomposition
$$
G \simeq \mathbb Z/(d_1) \oplus \cdots \oplus \mathbb Z/(d_k) \oplus \mathbb Z^{r}\,,\qquad d_i \mid d_{i+1}\,.
$$
Theorem 5.2, when applied to our case, yields the following result.
$$
V = V_1 \oplus \cdots \oplus V_k\,,
$$
Proof. First, we may make use of Theorem 5.2 because K[X] is a principal ideal domain (actually, a Euclidean domain) and V is finitely generated over K[X] (this follows immediately from the fact that V is a finite-dimensional vector space). Next, it is clear that in the invariant factor decomposition of V there is no free direct summand; in fact, χ_A(X) ·_A v = 0 for all v ∈ V. So, in view of Theorem 5.2, V = V_1 ⊕ ⋯ ⊕ V_k, where V_i is a cyclic K[X]-module isomorphic to K[X]/(q_i(X)), where q_1(X), ..., q_k(X) are non-constant, non-zero polynomials such that q_i(X) divides q_{i+1}(X) for all i = 1, ..., k − 1; each polynomial q_i(X) is uniquely determined up to multiplication by a unit, so we can assume it is monic. To prove the last statement we notice that the polynomial q_i(X) annihilates all vectors in V_i. As q_k(X) is a non-zero multiple of q_i(X) for all i = 1, ..., k − 1, it follows that q_k(X) ·_A v = 0 for all v ∈ V; equivalently, q_k(A) = 0. Thus, the minimal polynomial μ_A(X) divides q_k(X) (and they are both monic). But no polynomial of lower degree than q_k(X) can annihilate all of V_k, so that μ_A(X) = q_k(X). N
Consider now the case of an operator B : W → W such that W, endowed with the K[X]-module structure induced by B, is isomorphic to K[X]/(p(X)), where p(X) is a polynomial of degree m = dim W. We make the identification W = K[X]/(p(X)); as usual, we denote by [f(X)] the class of f(X) in the quotient.
$$
w_0\,,\quad w_1 = X \cdot_B w_0\,,\quad w_2 = X \cdot_B w_1\,,\;\dots,\; w_{m-1} = X \cdot_B w_{m-2}
$$
$$
\begin{aligned}
B w_0 &= w_1\,,\\
B w_1 &= w_2\,,\\
&\;\;\vdots\\
B w_{m-2} &= w_{m-1}\,,\\
B w_{m-1} &= -a_{m-1}\, w_{m-1} - \cdots - a_1\, w_1 - a_0\, w_0\,.
\end{aligned}
$$
Let us come back to the general case of a pair (V, A) with the associated invariant factor decomposition V = V_1 ⊕ ⋯ ⊕ V_k described in Theorem 5.3; let n_i = dim V_i. For each V_i there exists a basis V^(i) = {v_1^(i), ..., v_{n_i}^(i)} such that A|_{V_i} is represented by a matrix F_i of the type (5.1). Thus, A = ⊕_i A|_{V_i} is represented by the block matrix
$$
F =
\begin{pmatrix}
F_1 & 0 & \cdots & 0\\
0 & F_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & F_k
\end{pmatrix}.
$$
Theorem 5.5. Let A : V → V be an operator. There exists a basis for V such that A is represented by a block matrix whose blocks are of the type (5.1).
The block matrix obtained in the previous Theorem is called the Frobenius canonical form (aka rational canonical form) of the operator A. As Michael Artin puts it, it "isn't particularly nice, but it is the best form available for an arbitrary field" [1, p. 479].
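A sketch of how a single block of this type can be built from the relations displayed above (NumPy assumed; the notes' matrix (5.1) is not reproduced here, so placing the coefficients in the last column is only a convention consistent with those relations):

```python
import numpy as np

def companion_block(coeffs):
    """Companion matrix of the monic polynomial X^m + a_{m-1} X^{m-1} + ... + a_0,
    with 1's on the subdiagonal and the coefficients, with opposite sign, in the
    last column. `coeffs` lists a_0, ..., a_{m-1}."""
    m = len(coeffs)
    F = np.zeros((m, m))
    F[1:, :-1] = np.eye(m - 1)          # B w_i = w_{i+1}
    F[:, -1] = -np.asarray(coeffs)      # B w_{m-1} = -a_{m-1} w_{m-1} - ... - a_0 w_0
    return F

# Block for q(X) = X^3 - 2X + 5  (coefficients a_0 = 5, a_1 = -2, a_2 = 0).
F = companion_block([5.0, -2.0, 0.0])
print(F)
print(np.round(np.poly(F), 10))   # [1, 0, -2, 5]: q is both char. and min. polynomial
```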
According to Theorem 5.2 any finitely-generated K[X ]-module admits, besides its
invariant factor decomposition, also its primary decomposition. By simply rephras-
ing the last part of Theorem 5.2, we obtain the following result.
$$
V = W_1 \oplus \cdots \oplus W_h\,,
$$
where W_i is a cyclic K[X]-module isomorphic to K[X]/p_i^{s_i} and p_1, ..., p_h are (not necessarily distinct) prime ideals in K[X]. Thus, each p_i is generated by a unique monic irreducible polynomial p_i(X) ∈ K[X].
Proof.
Corollary 5.7. The polynomials p_1(X), ..., p_h(X) are the irreducible factors of the minimal polynomial μ_A(X).
Proof. By comparing the invariant factor decomposition of Theorem 5.3 with the primary decomposition of Theorem 5.6, it turns out (!) that
$$
p_1^{s_1}(X) \cdots p_h^{s_h}(X) \;=\; q_1(X) \cdots q_{k-1}(X)\,\mu_A(X)\,.
$$
In order to take the best advantage of the primary decomposition obtained in Theorem 5.6 we analyze more closely the case when K is algebraically closed. Then, each irreducible polynomial in K[X] is a polynomial of degree 1 (actually, this statement is equivalent to the fact that K is algebraically closed). Therefore, each p_i(X) in Theorem 5.6 is of the form p_i(X) = X − λ_i, for some λ_i ∈ K (recall that repetitions are allowed).
$$
w_1 = (X - \lambda) \cdot_B w_0\,,\quad w_2 = (X - \lambda) \cdot_B w_1\,,\;\dots,\; w_{s-1} = (X - \lambda) \cdot_B w_{s-2}\,.
$$
By using the same argument as in Lemma 5.4, it is again a simple matter to prove that the set {w_0, ..., w_{s−1}} is a basis for W. Moreover, one has (X − λ)^s ·_B w_0 = 0, so that (X − λ) ·_B w_{s−1} = 0.
$$
\begin{aligned}
B w_0 &= w_1 + \lambda w_0\,,\\
B w_1 &= w_2 + \lambda w_1\,,\\
&\;\;\vdots\\
B w_{s-2} &= w_{s-1} + \lambda w_{s-2}\,,\\
B w_{s-1} &= \lambda w_{s-1}\,.
\end{aligned}
$$
Hence, with respect to the basis {w_0, ..., w_{s−1}}, the operator B is represented by the matrix
$$
\begin{pmatrix}
\lambda & 0 & \cdots & 0 & 0\\
1 & \lambda & \cdots & 0 & 0\\
0 & 1 & \ddots & \vdots & \vdots\\
\vdots & \vdots & \ddots & \lambda & 0\\
0 & 0 & \cdots & 1 & \lambda
\end{pmatrix}. \qquad (5.2)
$$
Proof.
The block matrix in Theorem 5.8 is called the Jordan canonical form of the operator A : V → V; its blocks are usually named the Jordan blocks of A.
$$
A = A_d + A_n\,,
$$
$$
A_d A_n = A_n A_d\,.
$$
To show this, let J be the canonical Jordan form of A with respect to a basis W. It is clear that J can be written in the form
$$
J = M + K\,,
$$
where K is nilpotent and M is a diagonal matrix, whose diagonal entries are the roots of the minimal polynomial. Notice that one has K = 0 only when all the roots of the minimal polynomial are distinct ( ). By direct computation it is not hard to check that M and K commute,
$$
K M = M K\,.
$$
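A small computational illustration of the decomposition A = A_d + A_n (SymPy assumed; the matrix is an arbitrary example, and SymPy's Jordan form places the 1's above the diagonal, i.e., the transpose of the convention used in (5.2)):

```python
import sympy as sp

A = sp.Matrix([[5, 1, 0],
               [0, 5, 0],
               [0, 0, 2]])

# Jordan canonical form: A = P * J * P^(-1).
P, J = A.jordan_form()

# Jordan-Chevalley decomposition read off from J: diagonal part plus nilpotent part.
M = sp.diag(*[J[i, i] for i in range(J.rows)])   # diagonal part of J
K = J - M                                        # nilpotent part
A_d = P * M * P.inv()
A_n = P * K * P.inv()
print(sp.simplify(A_d + A_n - A))                # zero matrix
print(sp.simplify(A_d * A_n - A_n * A_d))        # zero matrix: the two parts commute
```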
If V is a vector space over a field K which is not algebraically closed, then it may well happen that an operator A : V → V cannot be put in Jordan canonical form. Nonetheless, one can get some useful information about A by regarding it as an operator on a vector space defined over an algebraic closure K̄ of K (so, K̄ is an extension of K which is algebraic and algebraically closed; it is unique only up to isomorphisms inducing the identity on K). As usual, we shall identify K with its image under the immersion K ↪ K̄.
$$
K^{n} = V_1 \oplus \cdots \oplus V_k\,,
$$
where V_i ≅ K[X]/(q_i). With respect to the basis W described in Lemma 5.4, the operator A is put in Frobenius canonical form, i.e., it is represented by a block matrix F of the form described in Theorem 5.5. This means that there is an invertible matrix N with entries in K such that F = N^{−1} A N. Then, the matrix F, when acting on vectors in K̄^n, represents the operator A (with respect to the basis W viewed as a basis for K̄^n); the corresponding decomposition of K̄^n is
$$
\overline{K}^{\,n} = \overline{V}_1 \oplus \cdots \oplus \overline{V}_k\,,
$$
where V̄_i ≅ K̄[X]/(q_i). The claim then follows from the uniqueness of the invariant factors.
Theorem 5.9. Let V be a finite-dimensional vector space over an arbitrary field K. Let A : V → V be an operator. Then the characteristic polynomial χ_A(X) of A is equal to the product of the invariant factors of the pair (V, A),
$$
\chi_A(X) = q_1(X) \cdots q_k(X)\,,
$$
where q_k(X) coincides with the minimal polynomial μ_A(X) of A. Furthermore, χ_A(X) and μ_A(X) have the same irreducible factors in K[X].
For v = (v_1, ..., v_n), w = (w_1, ..., w_n) ∈ C^n,
$$
h(v, w) = \sum_{k=1}^{n} v_k\, \overline{w}_k\,.
$$
(of course, since h(v, v) is real and ≥ 0 for all v ∈ V, the square root is the only positive real number whose square is h(v, v)). In what follows we shall write just ‖·‖ instead of ‖·‖_h, whenever no confusion may arise.
As usual, we say that two vectors v, w ∈ V are orthogonal (w.r.t. the Hermitian product h) if h(v, w) = 0.
Proof.
Theorem 6.2. For any vectors v, w ∈ V and for any complex number z ∈ C we have:
1) |h(v, w)| ≤ ‖v‖ ‖w‖ (Schwarz inequality);
2) ‖v‖ = 0 if and only if v = 0;
3) ‖zv‖ = |z| ‖v‖;
4) ‖v + w‖ ≤ ‖v‖ + ‖w‖ (triangle inequality).
Proof. 1) For w = 0 the statement is trivial. Assume first that ‖w‖ = 1. If we let h(v, w) = λ ∈ C, then v − λw and w are orthogonal vectors. By the Pythagorean theorem, one has
$$
\|v\|^2 = \|v - \lambda w\|^2 + |\lambda|^2\,,
$$
so that |λ| = |h(v, w)| ≤ ‖v‖. For a general vector w, one takes the unit vector w/‖w‖ and obtains that
$$
\Bigl| h\Bigl(v, \frac{w}{\|w\|}\Bigr) \Bigr| \le \|v\|\,.
$$
Thus, one concludes that |h(v, w)| ≤ ‖v‖ ‖w‖. The proofs of 2), 3), and 4) are left to the reader as (easy) exercises ( ). N
It is well known that a real vector space endowed with a Euclidean scalar product
admits orthogonal bases. The analogous property for a complex vector space with a
Hermitian product is stated in the following theorem.
Theorem 6.3. Let (V, h) be a Hermitian vector space. Then V admits an orthogonal
basis.
$$
\begin{aligned}
e_1 &= v_1\\
e_2 &= v_2 - \frac{h(v_2, e_1)}{h(e_1, e_1)}\, e_1\\
&\;\;\vdots\\
e_n &= v_n - \frac{h(v_n, e_{n-1})}{h(e_{n-1}, e_{n-1})}\, e_{n-1} - \cdots - \frac{h(v_n, e_1)}{h(e_1, e_1)}\, e_1\,.
\end{aligned}
$$
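The display above is the Gram-Schmidt process for a Hermitian product; a minimal NumPy sketch, assuming the standard Hermitian product on C³ and arbitrary example vectors:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize linearly independent complex vectors with respect to the
    standard Hermitian product h(v, w) = sum(v * conj(w)), following the display
    above (no normalization, matching the 'orthogonal basis' statement)."""
    h = lambda v, w: np.vdot(w, v)    # np.vdot conjugates its first argument
    basis = []
    for v in vectors:
        e = v.astype(complex)
        for b in basis:
            e -= (h(v, b) / h(b, b)) * b
        basis.append(e)
    return basis

vs = [np.array([1.0, 1j, 0.0]), np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1j])]
es = gram_schmidt(vs)
print(np.round(np.vdot(es[1], es[0]), 12), np.round(np.vdot(es[2], es[0]), 12))  # 0j 0j
```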
$$
W^{\perp} = \{\, v \in V \mid h(v, w) = 0 \ \text{for all } w \in W \,\}\,.
$$
Then V = W ⊕ W^⊥.
Proof. Choose an orthogonal basis {w_1, ..., w_m} for W; by the Gram-Schmidt process we complete it to an orthogonal basis {w_1, ..., w_m, v_{m+1}, ..., v_n} for V. It suffices to verify that the set {v_{m+1}, ..., v_n} constitutes a basis for W^⊥ ( ). N
Hermitian products may seem to be like Euclidean products on real vector spaces. However, they differ from the latter in at least one important respect, as is shown by the next lemma.
Lemma 6.5. Let (V, h) be a Hermitian vector space. Let A : V → V be an operator such that h(Av, v) = 0 for all v ∈ V. Then A = 0.
Proof. It is immediate to check by direct computation (!) that, for any operator B, the Hermitian product satisfies the so-called polarization identity:
$$
h(Av, w) + h(Aw, v) = 0
$$
As v is arbitrary, this amounts to saying that h(Av, w) = 0 for all v, w ∈ V. In other words, Av has to be orthogonal to all vectors w ∈ V: so, Av = 0 for all v ∈ V. We conclude that A = 0. N
The previous Lemma is not true in the case of a Euclidean product on a real vector space. For example, we can take, on R² with the standard Euclidean product (·, ·)_Eucl, the operator defined by the matrix
$$
J = \begin{pmatrix} 0 & -1\\ 1 & 0 \end{pmatrix}.
$$
One has (Jx, x)_Eucl = 0 for all vectors x ∈ R².
Let A : V → V be a linear operator. Its adjoint operator (w.r.t. the Hermitian product h) is the operator A* uniquely determined by the formula
$$
h(Av, w) = h(v, A^{*}w) \qquad \text{for all } v, w \in V\,.
$$
It is straightforward to check that A* : V → V is a linear operator and that the adjoint operation has the following properties (!):
1) (A + B)* = A* + B*;
2) (AB)* = B* A*;
3) if A is invertible, then (A^{−1})* = (A*)^{−1};
4) (A*)* = A;
5) (zA)* = z̄ A*.
If M(A, E) is the matrix representing the operator A with respect to any orthonormal basis E = {e_1, ..., e_n}, then the complex conjugate transpose matrix $\overline{M(A, E)}^{\,T}$ represents the adjoint operator A* (!). Accordingly, for any complex matrix M we denote by M* its conjugate transpose $\overline{M}^{\,T}$; a matrix M is Hermitian if M* = M. The sum of two Hermitian matrices is Hermitian, and the product of a Hermitian matrix by a real number is Hermitian as well. Therefore, we see that the space Herm(n) of n×n Hermitian matrices carries a natural structure of real vector space; its dimension is n² ( ).
From this observation it follows immediately that the space of self-adjoint oper-
ators on a Hermitian vector space (V, h) is a vector space of dimension equal to the
square of the dimension of V .
Lemma 6.6. An operator A : V → V is self-adjoint if and only if h(Av, v) is real for all v ∈ V.
Therefore, h(Av, v) is real. Conversely, suppose that h(Av, v) is real for any v ∈ V. Then one has
$$
h(Av, v) = \overline{h(Av, v)} = h(v, Av) = h(A^{*}v, v)
$$
(the last equality follows from (A*)* = A), so that h((A − A*)v, v) = 0 for all v ∈ V. By Lemma 6.5 it follows that A − A* = 0. N
Corollary 6.7. Let A : V → V be a self-adjoint operator. Then all its eigenvalues are real.
One can give a more direct proof of Corollary 6.7 by noticing that, for any operator B : V → V, the eigenvalues of its adjoint B* are the complex conjugates of the eigenvalues of B ( ).
We are now going to prove a key result, often known under the name of spectral
theorem: every self-adjoint operator is diagonalizable w.r.t. an orthogonal basis of
eigenvectors. First we need an easy technical lemma.
Proof.
Theorem 6.9 (Spectral theorem). Let A : V → V be a self-adjoint operator w.r.t. a Hermitian product h on V. Then V admits an orthogonal basis E = {e_1, ..., e_n} such that each e_i is an eigenvector of A. W.r.t. such a basis A is represented by a diagonal matrix whose entries are real.
Proof. Recall that, by Theorem 2.3, any complex operator has at least one eigenvector. We prove the first claim by induction on n = dim V. The case n = 1 is trivial. Suppose the statement is true for any self-adjoint operator on a complex vector space of dimension n − 1. Now, let V be a complex vector space of dimension n; let A : V → V be a self-adjoint operator. We take an eigenvector v of A and consider the subspace V_1 ⊂ V generated by v. The subspace V_1 is A-invariant; its orthogonal complement V_1^⊥ is A-invariant (by Lemma 6.8) and we have V_1 ⊕ V_1^⊥ = V (by Theorem 6.4). Therefore, dim V_1^⊥ = n − 1, so that the inductive hypothesis is satisfied for the operator A|_{V_1^⊥} : V_1^⊥ → V_1^⊥. Let {f_1, ..., f_{n−1}} be an orthogonal basis for V_1^⊥ whose elements are eigenvectors of A|_{V_1^⊥}; obviously, they are eigenvectors of A as well. Then the set of vectors {v, f_1, ..., f_{n−1}} is an orthogonal basis for V and its elements are eigenvectors of A. The second claim follows straightforwardly from Corollary 6.7. N
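A numerical illustration of the spectral theorem (NumPy assumed; eigh returns an orthonormal, rather than merely orthogonal, eigenbasis):

```python
import numpy as np

# A Hermitian (self-adjoint) matrix: real eigenvalues, orthonormal eigenvectors,
# as guaranteed by Theorem 6.9.
H = np.array([[2.0, 1.0 - 1j],
              [1.0 + 1j, 3.0]])

eigenvalues, U = np.linalg.eigh(H)       # columns of U: orthonormal eigenvectors
print(eigenvalues)                        # real numbers
print(np.allclose(U.conj().T @ H @ U, np.diag(eigenvalues)))   # True
```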
Clearly, the identity operator I is a positive self-adjoint operator w.r.t. any Hermi-
tian product.
Proof. All claims can be easily proved once an orthogonal basis {e_1, ..., e_n} of eigenvectors of A has been fixed (this is possible in view of Theorem 6.9).
1) Let Ae_i = λ_i e_i. We have
$$
h(Ae_i, e_i) = \lambda_i\, h(e_i, e_i)\,,
$$
where h(e_i, e_i) > 0. If A is positive, then one has λ_i > 0 for all i = 1, ..., n. Conversely, if the eigenvalues λ_1, ..., λ_n are positive, then, for any v = Σ_{i=1}^{n} v_i e_i ∈ V, we get:
$$
h(Av, v) = h\Bigl(A\Bigl(\sum_{i=1}^{n} v_i e_i\Bigr),\, \sum_{j=1}^{n} v_j e_j\Bigr)
= \sum_{i,j=1}^{n} v_i\, \overline{v}_j\, \lambda_i\, h(e_i, e_j)
= \sum_{i=1}^{n} \lambda_i\, \|v_i e_i\|^{2} > 0\,.
$$
is a positive self-adjoint operator and commutes with A.
4) N
Let M be an n×n Hermitian matrix. It follows from Theorem 6.9 that there exists an invertible matrix N such that
$$
N^{-1} M N =
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0\\
0 & \lambda_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & \lambda_n
\end{pmatrix},
$$
Proof. The proofs of the logical equivalences 1) ⟺ 2) and 3) ⟺ 4) are left to the reader (!). The implication 1) ⟹ 3) is obvious. It remains only to show the implication 3) ⟹ 1). The following identity can be easily checked by direct computation ( ):
$$
4\, h(u_1, u_2) = \|u_1 + u_2\|^2 - \|u_1 - u_2\|^2 + i\, \|u_1 + i u_2\|^2 - i\, \|u_1 - i u_2\|^2\,.
$$
Suppose now that U is norm preserving and apply the previous identity to u_1 = Uv, u_2 = Uw. We get
$$
\begin{aligned}
4\, h(Uv, Uw) &= \|U(v + w)\|^2 - \|U(v - w)\|^2 + i\, \|U(v + i w)\|^2 - i\, \|U(v - i w)\|^2\\
&= \|v + w\|^2 - \|v - w\|^2 + i\, \|v + i w\|^2 - i\, \|v - i w\|^2\\
&= 4\, h(v, w)\,.
\end{aligned}
$$
Let {e_1, ..., e_n} be an orthonormal basis for the Hermitian vector space (V, h). Let U : V → V be a unitary operator. If we let f_i = Ue_i for all i = 1, ..., n, then it turns out (by 4) of Theorem 7.1) that the set of vectors {f_1, ..., f_n} is an orthonormal basis for V. Conversely, if {f_1, ..., f_n} is any orthonormal basis for V, there exists a unitary operator U such that f_i = Ue_i for all i = 1, ..., n. Therefore, the group U(V) can be thought of as acting on the space of orthonormal bases for V; the action is free and transitive (so that there is a one-to-one correspondence between U(V) and the space of orthonormal bases).
Unitary operators on C^n equipped with its standard Hermitian product are associated with n×n complex matrices U satisfying the relation U U* = U* U = I. The group U(n) consisting of those matrices is the unitary group of order n. Since det(U U*) = 1 and det U* = $\overline{\det U}$, we deduce that the determinant of a unitary matrix is a unit complex number:
$$
U \in U(n) \implies |\det U| = 1\,.
$$
When n = 1, the group U(1) coincides with the circle group of unit complex numbers. The determinant map det : U(n) → U(1) is a group homomorphism (!); its kernel is the special unitary group SU(n) ⊂ U(n) of unitary matrices having determinant equal to 1.
In view of the previous remark, there is a one-to-one correspondence between U(n) and the space of orthonormal bases of C^n (w.r.t. the standard Hermitian product). Actually, the columns of every unitary matrix give the components, with respect to the canonical basis {e_1 = (1, 0, ..., 0), e_2 = (0, 1, ..., 0), ..., e_n = (0, 0, ..., 1)}, of the vectors of an orthonormal basis, and every orthonormal basis is obtained in this way.
Let M be a Hermitian matrix. Then there exists a unitary matrix U such that U* M U is a diagonal matrix (whose diagonal entries are, of course, the eigenvalues of M).
Theorem 7.2 (Polar decomposition). Let (V, h) be a Hermitian vector space. Any invertible operator A : V → V can be uniquely factored as
A = RU ,
$$
R^{2} = A A^{*} = \widehat{R}\,\widehat{U}\,\widehat{U}^{*}\widehat{R} = \widehat{R}^{2}\,,
$$
so that $\widehat{R} = R$. N
Notice that, if A = RU as in the above Theorem, then we have det A = det R · det U, where det R is a positive real number and det U is a unit complex number.
Theorem 7.2 is equivalent to the fact that every invertible complex matrix A can be factored as A = RU, where R is a positive definite Hermitian matrix and U is a unitary matrix. We have seen that it is always possible to find a unitary matrix Z such that Z* R Z = D is a diagonal matrix. Letting S = Z* U, we get
$$
A = Z D S\,,
$$
where Z and S are unitary matrices and D is diagonal. The diagonal entries of D are the square roots of the eigenvalues of A A* (so, they are positive real numbers): they are called the singular values of A. Accordingly, the decomposition A = Z D S is called the singular value decomposition of A.
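A minimal NumPy sketch connecting the singular value decomposition and the polar decomposition (random example matrix; the factor names mirror the ones above):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))   # invertible with probability 1

# Singular value decomposition A = Z D S (here D = diag(s), Z and S unitary).
Z, s, S = np.linalg.svd(A)
print(s)                                        # the singular values, positive reals
print(np.allclose(A, Z @ np.diag(s) @ S))       # True

# Polar decomposition A = R U recovered from the SVD: R = Z diag(s) Z*, U = Z S.
R = Z @ np.diag(s) @ Z.conj().T                 # positive definite Hermitian factor
U = Z @ S                                       # unitary factor
print(np.allclose(A, R @ U), np.allclose(U @ U.conj().T, np.eye(3)))   # True True
```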
In this Section we shall prove the analogous versions of the spectral theorem and of
the polar decomposition theorem in the context of real vector spaces. Even though
not strictly needed for this purpose, we start by studying a generalization of the no-
tion of Euclidean product.
Lemma 8.1. Let W be a subspace of V. Then W ∩ W^⊥ = {0} if and only if W does not contain any null vector. In particular, if W does not contain any null vector, one has W ⊕ W^⊥ = V.
Proof.
Lemma 8.2. Let W be a subspace of (V, g) such that any w ∈ W is a null vector. Then g(w_1, w_2) = 0 for all w_1, w_2 ∈ W. In particular, there is at least one vector in V which is not a null vector.
As for the second statement, it suffices to observe that, if every vector v ∈ V is a null vector, then g is zero. N
When g is positive definite, it is well known that V admits orthogonal bases, i.e., bases {e_1, ..., e_n} such that g(e_i, e_j) = 0 for all i ≠ j (this can be proved in the same way as we did for Theorem 6.3). The same result holds true also in the general case.
Theorem 8.3. Any pseudo-Euclidean vector space (V, g) has an orthogonal basis.
does not depend on the choice of the basis {f_1, ..., f_n}. In fact, p is the dimension of a maximal subspace on which g is positive definite ( ); similarly, q is the dimension of a maximal subspace on which g is negative definite. The pair (p, q) is called the signature of the pseudo-Euclidean product g.
In the case q = 1, one gets the (generalized) Lorentz groups O(p, 1). When q = 0, we recover the usual definition of the orthogonal group:
$$
O(n) = \{\, A \in GL(n; \mathbb R) \mid A^{T} I_n A = I_n \,\}\,.
$$
It is not difficult to show that there are group isomorphisms O(p, q) ≅ O(q, p) for any p, q; in particular, O(n) ≅ O(0, n).
From now on we assume that (V, g) is a Euclidean vector space, namely, that g is positive definite.
(notice that, when b = 0, this is the ordinary real scalar multiplication). Let V^C be the resulting complex vector space. To make notation less cumbersome, the pair (v, w) will be denoted as v + i w. So,
$$
i \cdot (v + i w) = -w + i v\,,
$$
as required. Any basis for V is a basis for V^C over the complex numbers: indeed, if {e_1, ..., e_n} is a basis for V, one has v + i w = Σ_{j=1}^{n} (v_j + i w_j) e_j for any vector (v + i w) ∈ V^C. The converse is not true: there are bases for V^C that do not stem from bases for V (the reason for that is that GL(n; C) is much bigger than GL(n; R)).
It can be readily checked (!) that h(·, x + i y) : V^C → C is C-linear for all x + i y and that
$$
h(v + i w,\, x + i y) = \overline{h(x + i y,\, v + i w)} \qquad \text{for all } x + i y,\ v + i w\,.
$$
Note that any Euclidean orthogonal basis {e1 , . . . , en } for (V, g) is a Hermitian orthog-
onal basis for (V C , h).
Any operator A : V → V induces a complex operator Ã : V^C → V^C given by
$$
\widetilde{A}(v + i w) = A v + i A w\,.
$$
Lemma 8.4. Under the previous assumptions, if A : V → V is symmetric, then Ã : V^C → V^C is self-adjoint.
Proof. One has
$$
\begin{aligned}
h(\widetilde{A}(v + i w),\, x + i y) &= g(Av, x) + g(Aw, y) + i\,\bigl(g(Aw, x) - g(Av, y)\bigr)\\
&= g(v, Ax) + g(w, Ay) + i\,\bigl(g(w, Ax) - g(v, Ay)\bigr)\\
&= h(v + i w,\, \widetilde{A}(x + i y))\,.
\end{aligned}
$$
Therefore, Ã is self-adjoint. N
Theorem 8.5 (Euclidean spectral theorem). Let (V, g) be a Euclidean vector space. A symmetric operator A : V → V has n real eigenvalues. Moreover, there exists a basis for V consisting of eigenvectors of A.
$$
\widetilde{A}(x_j + i y_j) = A x_j + i A y_j = \lambda_j x_j + i \lambda_j y_j\,,
$$
One can then extract from the set of generators x_1, ..., x_n, y_1, ..., y_n a basis for V, whose elements are eigenvectors. N
Corollary 8.6. Under the same hypotheses as in Theorem 8.5, there is an orthonormal
basis for V consisting of eigenvectors of A.
Proof. By Theorem 8.5 A has at least one eigenvector v_1. Let V_1 be the subspace generated by v_1. Its orthogonal complement V_1^⊥ is A-invariant (cf. Lemma 6.8) and dim V_1^⊥ = dim V − 1. Hence, the statement can be easily proved by induction on dim V. N
Putting it all together, we are able to get the Euclidean analogue of the polar decomposition theorem.
$$
A = S Q\,,
$$
Proof. The proof is essentially the same as that of Theorem 7.2. The operator A A^T is positive symmetric; let S be its square root. Now, we define Q = S^{−1} A. We have
$$
Q Q^{T} = S^{-1} A A^{T} S^{-1} = S^{-1} S^{2} S^{-1} = I\,,
$$
We can rephrase the last three results in the setting of (real) matrices.
Theorem 8.5 is equivalent to the fact that any symmetric matrix can be diagonalized
by an invertible real matrix.
Corollary 8.6 amounts to saying that every symmetric matrix can be diagonalized by an orthogonal matrix. One more remark is in order here. After fixing a basis for V, a Euclidean product g is uniquely determined by a positive definite symmetric matrix
G. So, Corollary 8.6 entails that any two matrices A and G, with A symmetric and G
positive definite symmetric, can be simultaneously diagonalized.
Needless to say, Theorem 8.8 is equivalent to the fact that any invertible matrix can
be factored as the product of a positive definite symmetric matrix and an orthogonal
matrix.
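Following this matrix rephrasing, a minimal NumPy sketch of the Euclidean polar decomposition of Theorem 8.8 (random example matrix):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))                 # a random matrix is invertible with probability 1

# Square root S of the positive symmetric matrix A A^T, via the Euclidean spectral theorem.
w, V = np.linalg.eigh(A @ A.T)                  # w > 0, V orthogonal
S = V @ np.diag(np.sqrt(w)) @ V.T               # positive definite symmetric
Q = np.linalg.inv(S) @ A                        # the orthogonal factor of Theorem 8.8

print(np.allclose(A, S @ Q))                    # True
print(np.allclose(Q @ Q.T, np.eye(3)))          # True: Q is orthogonal
```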
References
[4] CAYLEY, ARTHUR, The Collected Mathematical Papers, vol. II, Cambridge University Press, Cambridge 1889.
[6] GREUB, WERNER H., Linear Algebra, Springer-Verlag, New York 1967.
[7] HUMPHREYS, JAMES E., Introduction to Lie Algebras and Representation Theory, third printing, revised, Springer-Verlag, New York-Heidelberg-Berlin 1972.
[8] JACOBSON, NATHAN, Lectures in Abstract Algebra, vol. II, Linear Algebra, Van Nostrand, Princeton (N.J.) 1953.
[9] LANG, SERGE, Algebra, second edition, Addison-Wesley, Reading (Mass.) 1984.