0% found this document useful (0 votes)
1 views

dissertation

Uploaded by

Hongjie Xu
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
1 views

dissertation

Uploaded by

Hongjie Xu
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 134

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/36283144

Topics in matrix analysis /

Article
Source: OAI

CITATIONS READS
101 8,278

1 author:

Dennis I. Merino
Southeastern Louisiana University
45 PUBLICATIONS 362 CITATIONS

SEE PROFILE

All content following this page was uploaded by Dennis I. Merino on 03 January 2016.

The user has requested enhancement of the downloaded file.


TOPICS IN MATRIX ANALYSIS

Dennis Iligan Merino

A dissertation submitted to The Johns Hopkins University


in conformity with the requirements for the degree of
Doctor of Philosophy

Baltimore, Maryland

March 21, 2008


Abstract

We study properties of coninvolutory matrices (EE = I), and derive a


canonical form for them under similarity and under unitary consimilarity.
We show that any complex matrix has a coninvolutory dilation and we char-
acterize the minimum size of a coninvolutory dilation of a square matrix.
We characterize those A ∈ Mm,n (the m-by-n complex matrices) that can
be factored as A = RE with R real and E coninvolutory, and we discuss
the uniqueness of this factorization when A is square and nonsingular. Let
A, C ∈ Mm,n and B, D ∈ Mn,m . The pair (A, B) is contragrediently equiva-
lent to the pair (C, D) if there are nonsingular complex matrices X ∈ Mm
and Y ∈ Mn such that XAY −1 = C and Y BX −1 = D. We develop a
complete set of invariants and an explicit canonical form for this equivalence
relation. We consider a generalization of the transpose operator to a general
linear operator φ : Mn → Mn that satisfies the following properties: for every
A, B ∈ Mn , (a) φ preserves the spectrum of A, (b) φ(φ(A)) = A, and (c)
φ(AB) = φ(B)φ(A). We show that (A, φ(A)) is contragrediently equivalent
to (B, φ(B)) if and only if A = X1 BX2 for some X1 , X2 ∈ Mn such that
Xi φ(Xi ) = I. We also consider a generalized polar factorization A = XY ,
where X, Y ∈ Mn satisfy and Xφ(X) = I and Y = φ(Y ). We study the
nilpotent part of the Jordan canonical forms of the products AB and BA
and present a canonical form for the complex orthogonal equivalence. We
characterize the linear operators on Mm,n that preserve complex orthogo-
nal equivalence and the linear operators on Mn (IF) that preserve unitary
t-congruence when IF = IR and IF = C.

i
Dedication

ii
Contents

Preface v

1 A Real-Coninvolutory Analog of the Polar Decomposition 1


1.1 Coninvolutory Matrices . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Real Similarity . . . . . . . . . . . . . . . . . . . . . . 4
1.1.2 Canonical Form . . . . . . . . . . . . . . . . . . . . . . 5
1.1.3 Singular Values . . . . . . . . . . . . . . . . . . . . . . 9
1.1.4 Coninvolutory Dilation . . . . . . . . . . . . . . . . . . 13
1.2 The RE Decomposition . . . . . . . . . . . . . . . . . . . . . 15
1.2.1 The Nonsingular Case — Existence . . . . . . . . . . . . 16
1.2.2 The Nonsingular Case — Uniqueness . . . . . . . . . . . 16
1.2.3 The Singular Case . . . . . . . . . . . . . . . . . . . . 18
1.2.4 More Factorizations Involving Coninvolutories . . . . . 20

2 The Contragredient Equivalence Relation 21


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2 A Canonical Form for the
Contragredient Equivalence Relation . . . . . . . . . . . . . . 23
2.3 Complex Orthogonal Equivalence and
the QS Decomposition . . . . . . . . . . . . . . . . . . . . . . 40
2.4 The φS Polar Decomposition . . . . . . . . . . . . . . . . . . . 42
2.4.1 The Nonsingular Case . . . . . . . . . . . . . . . . . . 44
2.4.2 The General Case . . . . . . . . . . . . . . . . . . . . . 47
2.4.3 The Symmetric Case . . . . . . . . . . . . . . . . . . . 49
2.4.4 The Skew-Symmetric Case . . . . . . . . . . . . . . . . 50
2.5 A∗ and A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.6 The Nilpotent Parts of AB and BA . . . . . . . . . . . . . . . 52

iii
2.7 A Sufficient Condition For Existence of A Square Root . . . . 55
2.8 A Canonical Form for
Complex Orthogonal Equivalence . . . . . . . . . . . . . . . . 60

3 Linear Operators Preserving


Orthogonal Equivalence on Matrices 68
3.1 Introduction and Statement of Result . . . . . . . . . . . . . . 69
3.2 Preliminary Results . . . . . . . . . . . . . . . . . . . . . . . . 70
3.3 A Rank-Preserving Property of T −1 . . . . . . . . . . . . . . . 74
3.4 Proof of the Main Theorem . . . . . . . . . . . . . . . . . . . 83

4 Linear Operators Preserving


Unitary t-Congruence on Matrices 85
4.1 Introduction and Statement of Results . . . . . . . . . . . . . 86
4.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.3 The Complex Case . . . . . . . . . . . . . . . . . . . . . . . . 92
4.4 The Real Case . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

A A New Look at the Jordan Canonical Form 115

Bibliography 120

Vita 123

iv
Preface

I came to America because it was a cool thing to do. And I was expected
to do so. It was the natural order of things back at the University of the
Philippines. You graduate with honors, you teach for a year or two, and
then you go to your Uncle Sam. I arrived at The Johns Hopkins University
without a clear objective (even though my application letter would tell you
otherwise). I did not know what to do. I guess I was lucky in that every
course I took was a palatable dish tasted for the first time. I was also given
choices I did not know I had.
In my second semester, I took Dr. Roger A. Horn’s class in Matrix Analy-
sis. One thing I noticed was that he was far from ordinary. He was (and
still is) a performer. An artist. And our classroom was his stage. I looked
forward to, and was swept away by, each and every performance. I knew
then what I wanted to do.
The notation in this work is standard, as in [HJ1] (except for the trans-
pose used in Chapter 4), and is explained at the beginning of each chapter.
Each chapter addresses a different topic and has its own set of notations and
definitions. Each chapter can be read independently of the others.
In Chapter 1, we study properties of coninvolutory matrices (EE = I),
and derive a canonical form under similarity as well as a canonical form un-
der unitary consimilarity for them. We show that any complex matrix has
a coninvolutory dilation and we characterize the minimum size of a conin-
volutory dilation of a square matrix. We characterize those A ∈ Mm,n (the
m-by-n complex matrices) that can be factored as A = RE with R real and
E coninvolutory, and we discuss the uniqueness of this factorization when A
is square and nonsingular.
Let A, C ∈ Mm,n and B, D ∈ Mn,m . The pair (A, B) is contragrediently
equivalent to the pair (C, D) if there are nonsingular complex matrices X ∈

v
Mm and Y ∈ Mn such that XAY −1 = C and Y BX −1 = D. In Chapter 2, we
develop a complete set of invariants and an explicit canonical form for this
equivalence relation. We show that (A, AT ) is contragrediently equivalent
to (C, C T ) if and only if there are complex orthogonal matrices P and Q
such that C = P AQ. Using this result, we show that the following are
equivalent for a given A ∈ Mn : (a) A = QS for some complex orthogonal Q
and some complex symmetric S; (b) AT A is similar to AAT ; (c) (A, AT ) is
contragrediently equivalent to (AT , A); (d) A = Q1 AT Q2 for some complex
orthogonal Q1 , Q2 ; (e) A = P AT P for some complex orthogonal P . We
then consider a generalization of the transpose operator to a general linear
operator φ : Mn → Mn that satisfies the following properties: for every
A, B ∈ Mn , (a) φ preserves the spectrum of A, (b) φ(φ(A)) = A, and (c)
φ(AB) = φ(B)φ(A). We show that (A, φ(A)) is contragrediently similar to
(B, φ(B)) if and only if A = X1 BX2 for some nonsingular X1 , X2 ∈ Mn that
satisfy X1−1 = φ(X1 ) and X2−1 = φ(X2 ). We also consider a factorization
of the form A = XY , where X ∈ Mn is nonsingular and X −1 = φ(X),
and Y ∈ Mn satisfies Y = φ(Y ). We also discuss the operators A → A∗
and A → A. We then study the nilpotent part of the Jordan canonical
forms of the products AB and BA. We use the canonical form for the
contragredient equivalence relation to give a new proof to a result due to
Flanders concerning the relative sizes of the nilpotent Jordan blocks of AB
and BA. We present a sufficient condition for the existence of square roots
of AB and BA and close the chapter with a canonical form for the complex
orthogonal equivalence relation (A → Q1 AQ2 , where Q1 ∈ Mm and Q2 ∈ Mn
are complex orthogonal matrices).
In Chapter 3, we characterize the linear operators on Mm,n that preserve
complex orthogonal equivalence, that is, T : Mm,n → Mm,n is linear and has
the property that T (A) is orthogonally equivalent to T (B) whenever A and
B are orthogonally equivalent.
In Chapter 4, we answer the question of Horn, Hong and Li in [HHL] con-
cerning characterization of linear operators in Mn (IF) that preserve unitary
t-congruence (A → U AU t , with unitary U ) in the cases IF = IR and IF = C.
The result in the real case is a rather pleasant surprise.
In the Appendix, we present a new proof for the Jordan canonical theorem
that incorporates ideas developed in Chapter 2.
I wish to thank Dr. Roger A. Horn and Dr. Chi-Kwong Li for their help
and guidance; and Dr. Irving Kaplansky for his inspiring basis-free methods.

vi
Chapter 1

A Real-Coninvolutory Analog
of the Polar Decomposition

1
2

The classical polar decomposition A = P U , where P is positive semi-


definite and U is unitary, can be viewed as a generalization to matrices of the
polar representation of a complex number z = reiθ , where r, θ ∈ IR. Another,
perhaps more natural, generalization would be A = R1 eiR2 , where R1 and R2
are matrices with real entries. We study this factorization and related ones.
We also investigate the matrices that can be written as E = eiR for some
real matrix R.
We denote the set of m-by-n complex matrices by Mm,n , and write Mn ≡
Mn,n . The set of m-by-n matrices with real entries is denoted by Mm,n (IR),
and we write Mn (IR) ≡ Mn,n (IR). A matrix E ∈ Mn is said to be coninvolu-
tory if EE = I, that is, E is nonsingular and E = E −1 . We denote the set
of coninvolutory matrices in Mn by Cn . For A, B ∈ Mn , A is similar to B
if there exists a nonsingular matrix S ∈ Mn such that A = SBS −1 and we
write A ∼ B; A and B are said to be consimilar if there exists a nonsingular
−1
S ∈ Mn such that A = SBS . Given a scalar λ ∈ C, the n-by-n upper
triangular Jordan block corresponding to λ (having λ on the main diagonal,
1 on the super diagonal, and all other entries are 0) is denoted by Jn (λ).
Topological and metric properties of subsets of Mm,n (compact, bounded,
etc.) are always with respect to the topology generated by some (any) norm
on Mm,n .

1.1 Coninvolutory Matrices


The following can be verified easily.

Proposition 1.1.1 Let A ∈ Mn . Any two of the following implies the third:
(1) A is unitary.

(2) A is symmetric.

(3) A is coninvolutory.

Proposition 1.1.2 Let n be a given positive integer. Then


(a) Cn is closed under consimilarity. In particular, Cn is closed under real
similarity (A → RAR−1 for a real nonsingular R) and coninvolutory
congruence (A → EAE for a coninvolutory E).
3

(b) Cn is closed under general similarity if and only if n = 1.


(c) Cn is closed.

(d) Cn is bounded if and only if n = 1.

(e) Cn is closed under products if and only if n = 1. If E1 , E2 ∈ Cn , then


E1 E2 ∈ Cn if and only if E1 commutes with E2 .

(f) |det E| = 1 for every E ∈ Cn .

(g) Cn ⊕ Cm ⊂ Cn+m for any positive integer m. In particular, E ⊕ Im is


coninvolutory for any E ∈ Cn and any positive integer m.

(h) eiR is coninvolutory for any R ∈ Mn (IR).


−1 −1
Proof If E ∈ Cn and S ∈ Mn is nonsingular, then (SES )(SES ) = I.
−1
Hence, SES ∈ Cn . The example
" # " #
1 0 1 i
E= ⊕ In−2 ∈ Cn and S = ⊕ In−2
0 −1 0 1

shows that " #


−1 1 −2i
SES = ⊕ In−2
0 −1
need not be coninvolutory if n ≥ 2.
Suppose that {Ej } is a sequence of coninvolutories converging to E.
Then I = lim (Ej E j ) = (lim Ej )(lim E j ) = EE as j → ∞, so E is
coninvolutory and Cn is closed.
For any t ∈ IR, one checks that
" #
1 t
⊕ In−2
0 −1

is coninvolutory, so Cn is not bounded for n ≥ 2. If n = 1, C1 = {z =


eiθ : θ ∈ IR} is compact and is closed under similarity and products.
Now " #" # " #
1 −1 1 1 1 2
= 6∈ C2 .
0 −1 0 −1 0 1
4

Therefore, a product of two coninvolutory matrices is not necessarily


coninvolutory. If E1 , E2 ∈ Cn and E1 E2 ∈ Cn , then E1 E2 = (E1 E2 )−1 =
−1 −1
E2 E1 = E2 E1 , so E1 commutes with E2 . Conversely, if E1 and E2
are commuting coninvolutories, then E1 E2 (E1 E2 ) = E2 E1 (E1 E2 ) = I.
The assertions in (f), (g), and (h) are easily verified.

1.1.1 Real Similarity


In Theorem (69) of [K2], Kaplansky used the QS decomposition of a nonsin-
gular matrix (A = QS with Q orthogonal and S symmetric) to prove that
there exists an orthogonal matrix Q such that A = QBQT if and only if there
exists a nonsingular S such that A = SBS −1 and AT = SB T S −1 . An anal-
ogous result about real similarity may be obtained by following the steps in
Kaplansky’s proof and using the RE decomposition (see Section 1.2). Here,
we present a proof that does not make use of the RE decomposition; later,
we shall use this result in our proof of the RE decomposition.

Theorem 1.1.3 Let A, B ∈ Mn be given. There exists a nonsingular real


R ∈ Mn such that A = RBR−1 if and only if there exists a nonsingular
S ∈ Mn such that A = SBS −1 and A = SBS −1 .

Proof The forward implication is easily verified. For the converse, the condi-
−1
tions are equivalent to A = SBS −1 and A = SBS . Hence, AS = SB
and AS = SB. Let S = R1 + iR2 with R1 , R2 ∈ Mn (IR). Then
R1 = (S +S)/2 and R2 = (S −S)/2i, so A(αR1 +βR2 ) = (αR1 +βR2 )B
for all α, β ∈ IR. Let p(z) ≡ det(R1 + zR2 ). Then p(z) is a polynomial
of degree at most n and p(z) 6≡ 0 since p(i) 6= 0. Hence, p(β0 ) 6= 0
for some β0 ∈ IR. Thus, R ≡ R1 + β0 R2 ∈ Mn (IR) is nonsingular,
AR = RB, and A = RBR−1 .

Corollary (6.4.18) in [HJ2] uses the QS decomposition to prove the fol-


lowing: Suppose that A = SBS −1 and suppose there exists a polynomial
p(t) such that AT = p(A) and B T = p(B). Then there exists an orthogonal
Q such that A = QBQT . The following is an analog of this result for real
similarity.
5

Corollary 1.1.4 Let A, B ∈ Mn be given, suppose that A = SBS −1 , and


suppose there exists a polynomial p(t) such that A = p(A) and B = p(B).
Then there exists a nonsingular real R ∈ Mn such that A = RBR−1 .

Proof This follows directly from the theorem since q(A) = Sq(B)S −1 for
any polynomial q(t).

It is known that two symmetric matrices are similar if and only if they
are orthogonally similar and two Hermitian (even normal) matrices are sim-
ilar if and only if they are unitarily similar. An analogous result holds for
coninvolutory matrices.

Corollary 1.1.5 Two coninvolutory matrices are similar if and only if they
are real similar (that is, the similarity can be achieved via a nonsingular real
matrix).

Proof If A and B are coninvolutory and similar, say A = SBS −1 , we have


A = A−1 = SB −1 S −1 = SBS −1 . Theorem 1.1.3 ensures that A =
RBR−1 for some nonsingular real R. Conversely, if two matrices are
real similar, then they are similar.

Corollary 1.1.4 remains true if the polynomial p(t) is replaced by any


primary matrix function (see Chapter 6 of [HJ2]). Corollary 1.1.5 then follows
from Corollary 1.1.4 by taking p(t) = t−1 .

1.1.2 Canonical Form


Definition 1.1.6 Let k be a given positive integer and let A, B ∈ Mk . We
define " #
1 A + B −i(A − B)
E2k (A, B) ≡ . (1.1.1)
2 i(A − B) A+B
For 0 6= λ ∈ C, we set E2k (λ) ≡ E2k (Jk (λ), Jk (λ)−1 ).

Direct computations (and a simple induction for (c)) verify the following
facts and functional equations involving the matrices E2k (A, B).

Lemma 1.1.7 Let A, B, C, D ∈ Mk be given.


6

(a) E2k (A, B)E2k (C, D) = E2k (AC, BD).


(b) E2k (A, B)E2k (C, D) = E2k (AD, BC).
(c) f (E2k (A, B)) = E2k (f (A), f (B)) for any entire analytic function f (z).
−1 −1
(d) E2k (A, B)−1 = E2k (B , A ) whenever A and B are nonsingular.
(e) E2k (A, B) is coninvolutory if and only if AB = I.
(f) {E2k (λ) : 0 6= λ ∈ C} is an unbounded commuting family of coninvolu-
tory matrices.
(g) E2k (A, B) = U (A ⊕ B)U −1 where
" #
1 Ik iIk
U≡√
2 iIk Ik
is unitary, symmetric, and coninvolutory.
1
(h) E (J (α), −Jk (α))
i 2k k
is a real matrix for every α ∈ C.
Thus, the eigenvalues of E2k (A, B) are the union of the eigenvalues of A
and B. In particular, for λ 6= 0 the coninvolutory matrix E2k (λ) is unitarily
similar to Jk (λ) ⊕ Jk (λ)−1 and the eigenvalues of E2k (λ) are λ and 1/λ, each
of multiplicity k.
Theorem 1.1.8 Let 0 6= λ ∈ C be given and let k be a given positive integer.
Then E2k (λ) = eiR where R ∈ M2k (IR) is a polynomial in E2k (λ) and R is
similar to 1i E2k (Jk (α), −Jk (α)), with α ≡ ln λ (principal branch).
Proof Let α ≡ ln λ (principal branch) and notice that Jk (λ) ∼ eJk (α) and
Jk (λ)−1 ∼ e−Jk (α) . Lemma 1.1.7(h) ensures that
1
B ≡ E2k (Jk (α), −Jk (α)) ∈ M2k (IR).
i
Thus,
E2k (λ) ∼ Jk (λ) ⊕ Jk (λ)−1
∼ eJk (α) ⊕ e−Jk (α)
∼ E2k (eJk (α) , e−Jk (α) )
= eE2k (Jk (α),−Jk (α))
= eiB
7

by Lemma 1.1.7(g, c). Since E2k (λ) and eiB are similar coninvolu-
tory matrices, Corollary 1.1.5 ensures that they are real similar, so
there is some nonsingular C ∈ M2k (IR) such that E2k (λ) = CeiB C −1 =
−1
eiCBC = eiR with R ≡ CBC −1 ∈ M2k (IR). Since the spectrum
of iR ∼ Jk (α) ⊕ −Jk (α) is confined to the horizontal line {z ∈ C :
Im z = arg λ}, basic facts about primary matrix functions (see Ex-
ample (6.2.16) in [HJ2]) ensure that there is a polynomial p(t) (which
may depend on λ) such that iR = p(E2k (λ)).

Proposition 1.1.9 (a) Let E ∈ Cn be given, and suppose Jk (λ) is a Jordan


block of E with multiplicity l. Then Jk (1/λ) is a Jordan block of E with
multiplicity l.
(b) Let λ be a given nonzero complex number and let k and l be given positive
integers. Then there exists an E ∈ C2kl such that Jk (λ) is a Jordan block of
E with multiplicity l; if k = 1 and |λ| = 1, there is such an E ∈ Cl .
−1
Proof Since E = E , if Jk (λ) is a Jordan block of E with multiplicity l, E
−1
must have l Jordan blocks similar to Jk (λ) , that is, equal to Jk (1/λ).
For (b), we may take E = E2k (λ) ⊕ · · · ⊕ E2k (λ) (l copies); if k = 1 and
|λ| = 1, we may take E = λI ∈ Ml .

If E ∈ Cn and if λ is an eigenvalue of E with |λ| 6= 1, Proposition 1.1.9(a)


says that the Jordan structure of E associated with the eigenvalue λ is paired
with the Jordan structure of E associated with 1/λ: Each Jk (λ) is paired
with Jk (1/λ). However, λ = 1/λ when |λ| = 1 (λ = eiθ with θ ∈ IR), so
there need be no pairing in this case, and the corresponding Jordan blocks
are of the form Jk (eiθ ), which are similar to eiJk (θ) . These observations and
Corollary 1.1.5 give the following canonical forms for a coninvolutory matrix.

Theorem 1.1.10 Let E ∈ Cn . There are integers r, s ≥ 0, positive integers


l1 , . . . , lr , k1 , . . . , ks such that l1 + · · · + lr + 2(k1 + · · · + ks ) = n, real numbers
θ1 , . . . θr , and complex (possibly real) numbers λ1 , . . . , λs with all |λi | > 1 such
that
(a) E is similar to

Jl1 (eiθ1 ) ⊕ · · · ⊕ Jlr (eiθr ) ⊕ [Jk1 (λ1 ) ⊕ Jk1 (1/λ1 )] ⊕ · · · ⊕ [Jks (λs ) ⊕ Jks (1/λs )]
8

(b) E is real similar to the coninvolutory matrix

eiJl1 (θ1 ) ⊕ · · · ⊕ eiJlr (θr ) ⊕ E2k1 (λ1 ) ⊕ · · · ⊕ E2ks (λs )

(c) E is real similar to the coninvolutory matrix

ei[Jl1 (θ1 )⊕···⊕Jlr (θr )⊕R1 ⊕···⊕Rs ]

where each Rj ≡ 1i E2kj (Jkj (αj ), −Jkj (αj )) ∈ M2kj (IR) and αj ≡ ln λj (prin-
cipal branch) for j = 1, . . . , s.

If C ∈ Mn (IR) gives the real similarity asserted in Theorem 1.1.10(c), we


have E = eiR , where R ≡ C[Jl1 (θ1 ) ⊕ · · · ⊕ Jlr (θr ) ⊕ R1 ⊕ · · · ⊕ Rs ]C −1 . Since
the spectrum of R is confined to an open horizontal strip containing the real
axis and having height 2π, Example (6.2.16) in [HJ2] ensures that there is a
polynomial p(t) (which may depend on E) such that R = p(E). We can also
i i
write E = (e 2 R )2 , where e 2 R is coninvolutory and 12 R is also a polynomial
in E. These observations give the following result, which was proved in a
different way in Corollary (6.4.22) of [HJ2].

Theorem 1.1.11 Let E ∈ Mn be given. The following are equivalent:

(a) EE = I, that is, E is coninvolutory.

(b) E = eiS , where S ∈ Mn (IR) is real.

(c) E = F 2 , where F ∈ Mn is coninvolutory.

For a coninvolutory E, the real matrix S in (b) and the coninvolutory matrix
F in (c) may be taken to be polynomials in E.

For a given E ∈ Cn , the real matrix S in Theorem 1.1.11(b) and the


coninvolutory matrix F in Theorem 1.1.11(c) are not uniquely determined.
However, the general theory of primary matrix functions ensures that there
is a unique real S with spectrum in the half-open strip {z ∈ C : − π2 <
Im z ≤ π2 } such that E = eiS and there is a unique coninvolutory F with
spectrum in the wedge {z ∈ C : − π2 < arg z ≤ π2 } such that E = F 2 ( see
Theorem (6.2.9) of [HJ2]). For another approach to this uniqueness result,
see Theorem (2.3) of [DBS].
9

1.1.3 Singular Values


Definition 1.1.12 Let A ∈ Mn be nonsingular and let k·k be a given matrix
norm. The condition number of A with respect to k · k, denoted κ(A, k · k),
is given by
κ(A, k · k) ≡ kAkkA−1 k.
Proposition 1.1.13 (a) If σ > 0 is a singular value of a given E ∈ Cn with
multiplicity k, then so is 1/σ.
(b) If E ∈ Cn , then the singular values of E and E −1 are the same.
(c) If E ∈ Cn and if k · k is any unitarily invariant norm, then
κ(E, k · k) = kEk2 .

(d) If σ > 0 is given and if k is a given positive integer, then there exists
a coninvolutory matrix E such that σ is a singular value of E with
multiplicity k.
Proof The first three assertions follow directly from the singular value de-
−1
composition of E and the equation E = E . If σ > 0 and σ 6= 1,
then " #
σ 0
E2 (σ) = U U∗
0 1/σ
has singular values σ and 1/σ. Hence, we may take E = E2 (σ) ⊕ · · · ⊕
E2 (σ) (k copies). If σ = 1, then we may take E = Ik .
Let E ∈ Cn . Proposition 1.1.13(a) shows that the singular value decom-
position of E has the form E = U ΣV , where U and V are unitary and
1 1
Σ = [σ1 In1 ⊕ In1 ] ⊕ · · · ⊕ [σk Ink ⊕ Ink ] ⊕ In−2j (1.1.2)
σ1 σk
Pk
with σ1 > · · · > σk > 1 and j = i=1 ni . Partition the unitary matrix
 
X11 X12 · · · X1,l−2 X1,l−1 X1l
 
 X21 X22 · · · X2,l−2 X2,l−1 X2l 
 
 .. .. . . . .. .. .. 
 . . . . . 
VU =
 X


 l−2,1 Xl−2,2 · · · Xl−2,l−2 Xl−2,l−1 Xl−2,l 
 
 Xl−1,1 Xl−1,2 · · · Xl−1,l−2 Xl−1,l−1 Xl−1,l 
Xl1 Xl2 · · · Xl,l−2 Xl,l−1 Xll
10

conformal to Σ, with X11 , X22 ∈ Mn1 , . . . , Xl−2,l−2 , Xl−1,l−1 ∈ Mnk , Xll ∈


−1
Mn−2j , and l = 2k + 1. Since U ΣV = E = E = V T Σ−1 U T , we have
(V U )Σ = Σ−1 U T V ∗ = Σ−1 (V U )T and hence
 1 1

σ1 X11 X
σ1 12 · · · σk X1,l−2 X
σk 1,l−1 X1l
 1 1 
 σ1 X21 X · · · σk X2,l−2 X
σk 2,l−1 X2l 
 σ1 22 
 .. .. . . . .. .. .. 
 . . . . . 
 =
 1 1 
 σ1 Xl−2,1

X
σ1 l−2,2 · · · σk Xl−2,l−2 X
σk l−2,l−1 Xl−2,l 

 σ1 Xl−1,1 1
X · · · σk Xl−1,l−2 1
X Xl−1,l 
 σ1 l−1,2 σk l−1,l−1 
σ1 Xl1 1 · · · σk Xl,l−2 1 Xll
X
σ1 l2 X
σk l,l−1
 1 1 1 T 1 1 
XT
σ1 11
XT
σ1 21 · · · σ1 Xl−2,1 XT
σ1 l−1,1 XT
σ1 l1
 T T T T 
 σ1 X12 σ1 X22 · · · σ1 Xl−2,2 σ1 Xl−1,2 σ1 Xl2T 
 . .. .. .. .. 
 .
 .
... 

 1
. . . . . (1.1.3)
 T 1
XT 1 T 1 T 1 T 
 σk X1,l−2 σk 2,l−2 · · · σk Xl−2,l−2 σk Xl−1,l−2 σk Xl,l−2 
 T T T T T 
 σk X1,l−1 σk X2,l−1 · · · σk Xl−2,l−1 σk Xl−1,l−1 σk Xl,l−1 
X1lT X2l T · · · Xl−2,lT T
Xl−1,l Xll T

Let k · kF denote the Frobenius norm (square root of the sum of squares
of moduli of entries). Since all the columns and rows of V U are unit vectors,
we have
k[ X11 X12 · · · X1l ]k2F = n1 = k[ X11
T T
X21 · · · Xl1T ]k2F . (1.1.4)
T T
Equating the first rows of (1.1.3) gives X21 = X12 and X11 = σ12 X11 , . . .,
T T T T
Xl−2,1 = σ1 σk X1,l−2 , Xl−1,1 = (σ1 /σk )X1,l−1 , Xl1 = σ1 X1l . Since all of the
coefficients σ12 , . . . , σ1 σk , σ1 /σk , and σ1 are greater than one, substituting
these identities in (1.1.4) shows that X1i = Xi1 = 0 for all i 6= 2, and hence
kX12 k2F = n1 = kX21
T 2
kF = kX21 k2F .
Again using the fact that the rows of V U are unit vectors, we also have
n1 = k[ X21 X22 · · · X2l ]k2F ,
and hence X2i = 0 for all i ≥ 2. Similarly, Xi2 = 0 for all i ≥ 2. Thus
 
0 X12 0
 T
V U =  X12 0 0 

0 0 V1
11

where V1 is unitary and satisfies V1 Σ1 = Σ−1 T


1 V1 with Σ1 ≡ [σ2 In2 ⊕
1
I ]
σ2 n2

· · · ⊕ [σk Ink ⊕ σ1k Ink ] ⊕ In−2j .
Iteration of this argument shows that
" # " #
0 X1 0 Xk
X ≡VU = ⊕ ··· ⊕ ⊕ Xk+1 , (1.1.5)
X1T 0 XkT 0

where Xk+1 ∈ Mn−2j is unitary and symmetric. Thus, we can write E as


E = U ΣV = V T XΣV , where Σ and X have the special structure exhibited
in (1.1.2) and (1.1.5).
Let
" 1
# " #
0 I
σi ni Xi 0
Ψi = and Yi = for 1 ≤ i ≤ k.
σi Ini 0 0 XiT

Then " #" #


0 Xi σi Ini 0
Yi Ψi = 1 = Ψi YiT .
XiT 0 0 I
σi ni

Let Zi be a unitary and polynomial square root of Yi . Then Zi Ψi = Ψi ZiT for


1 ≤ i ≤ k (see Problem 27 in Section (4.4) of [HJ1]). Since Xk+1 is symmetric
and unitary, it has a symmetric and unitary square root, say Zk+1 . Let Z =
Z1 ⊕· · ·⊕Zk ⊕Zk+1 , and Ψ = Ψ1 ⊕· · ·⊕Ψk ⊕In−2j . Then XΣ = Z 2 Ψ = ZΨZ T .
Hence, E = V T XΣV = V T ZΨZ T V = (V T Z)Ψ(V T Z)T . We state this result
in the following theorem; for alternative approaches to this result, see (98◦ )
in [A] and Theorem (3.7) of [HH2].

Theorem 1.1.14 Let E ∈ Cn be given. Then E = U ΣU T , where U ∈ Mn is


unitary,
" # " 1
#
1
0 I
σ 1 n1 0 I
σk nk
Σ= ⊕ ··· ⊕ ⊕ In−2j , (1.1.6)
σ1 In1 0 σk Ink 0
P
σ1 > · · · > σk > 1 and j = ki=1 ni . The positive numbers σ1 , σ11 , . . . , σk , σ1k ,
and 1 are the distinct singular values of E with multiplicities n1 , n1 , . . . , nk ,
nk , and n − 2j, respectively. Conversely, U ΣU T ∈ Cn whenever U ∈ Mn is
unitary and Σ has the form (1.1.6).

It is natural to ask to what extent Proposition 1.1.13(b) characterizes Cn .


12

Corollary 1.1.15 Let a nonsingular A ∈ Mn be given. The following are


equivalent:

(a) A and A−1 have the same singular values, that is, A and A−1 are
unitarily equivalent.

(b) A = EW for some E ∈ Cn and unitary W ∈ Mn .

(c) A = V E for some E ∈ Cn and unitary V ∈ Mn .

(d) A = V EW for some E ∈ Cn and unitary V, W ∈ Mn .

Proof If A and A−1 have the same singular values, then A = V1 ΣV2 , where
Σ has the form (1.1.6) and V1 , V2 ∈ Mn are unitary. The converse
assertion in Theorem 1.1.14 ensures that A = (V1 ΣV1T )(V 1 V2 ) is a
factorization of the form asserted in (b). If A = EW as in (b), then
A = W (W T EW ) is a factorization of the form asserted in (c). If
A = V E as in (c), then one may take W = I in (d). If A = V EW as
in (d), then A−1 = W ∗ EV ∗ . Thus, the singular values of A and A−1
are those of E and E, which by Proposition 1.1.13(b) are the same.

Corollary (4.6.12) of [HJ1] guarantees that a positive definite A ∈ Mn and


a Hermitian or symmetric B ∈ Mn can be simultaneously diagonalized by
an appropriate nonsingular congruence. The following is an analog of these
results in the coninvolutory case.

Theorem 1.1.16 Let A ∈ Mn be positive definite and let E ∈ Mn be conin-


volutory. Then there exists a nonsingular S ∈ Mn such that SAS ∗ = I and
−1
SES = Σ has the form (1.1.6).

Proof Let P be the unique positive definite square root of A. Notice that
P −1 EP ∈ Cn , hence Theorem 1.1.14 ensures that there exists a unitary
U and a Σ of the form (1.1.6) such that P −1 EP = U ΣU T . Let S ≡
−1
U ∗ P −1 , so that S ∗ = P −1 U . Then SAS ∗ = I and SES = Σ, as
desired.
13

1.1.4 Coninvolutory Dilation


Definition 1.1.17 Let A ∈ Mn be given, and let k be a given positive
integer. Then B ∈ Mn+k is a dilation of A if A is a leading principal submatrix
of B.

Thompson and Kuo [TK] characterized those A ∈ Mn with a dilation


B ∈ Mn+k that has some special property such as doubly stochastic, unitary,
or complex orthogonal. For example, Thompson and Kuo found that every
A ∈ Mn has a complex orthogonal dilation B ∈ Mn+k , and the least integer
k for which this is possible is k = rank(I − AAT ). When does A have a
coninvolutory dilation?

Lemma 1.1.18 Let A ∈ Mn and let k1 be a positive integer. Suppose that


A has a coninvolutory dilation of size n + k1 . Then A has a coninvolutory
dilation of size n +k for each k ≥ k1 . Moreover, k1 ≥ k0 (A) ≡ rank(I −AA).

Proof Suppose that " #


A X1
B= ∈ Cn+k1 .
X2 Y
Then B ⊕ Ik−k1 ∈ Cn+k is a coninvolutory dilation of A for all k ≥ k1 .
Notice that X1 , X2T ∈ Mn,k1 and AA + X1 X 2 = In , so X1 X 2 = In − AA
and k0 (A) ≡ rank(In − AA) = rank(X1 X 2 ) ≤ rank X1 ≤ k1 .

Definition 1.1.19 For A ∈ Mn , we define


" #
A i(I + A)
β(A) ≡ .
i(I − A) A

Lemma 1.1.20 Let A ∈ Mn (IR). Then β(A) is a coninvolutory dilation of


A of size 2n. Moreover, Jn (±1) has a coninvolutory dilation of size 2n − 1.

Proof A direct computation shows that β(A) ∈ C2n . If A = Jn (1), form B


from β(Jn (1)) by deleting its last row and column; if A = Jn (−1), form
B from β(Jn (−1)) by deleting its (n + 1)-st row and column. In both
cases, one checks that B ∈ C2n−1 is a coninvolutory dilation of A.
14

Lemma 1.1.21 Suppose that A1 ∈ Mn1 has a coninvolutory dilation of size


n1 + k1 , and suppose that A2 ∈ Mn2 has a coninvolutory dilation of size
n2 + k2 . Then A1 ⊕ A2 has a coninvolutory dilation of size n1 + n2 + k1 + k2 .

Proof Let " #


A1 X1
B1 = ∈ Cn1 +k1
Y1 Z1
and let " #
A2 X2
B2 = ∈ Cn2 +k2 .
Y2 Z2
An explicit computation now shows that
 
A1 0 X1 0
 0 A2 0 X2 
 
  ∈ Cn1 +n2 +k1 +k2 .
 Y1 0 Z1 0 
0 Y2 0 Z2

Theorem 1.1.22 Let A ∈ Mn and let k be a given nonnegative integer.


Then A has a coninvolutory dilation of size n + k if and only if k ≥ rank(I −
AA).

Proof Let A ∈ Mn be given and let k0 = rank(I − AA). Theorem (4.9)


−1
of [HH1] guarantees that A = SRS for some nonsingular S ∈ Mn
and R ∈ Mn (IR). Now, R is real similar to its real Jordan canonical
form JR (A) ≡ Cn1 (a1 , b1 ) ⊕ · · · ⊕ Cni (ai , bi ) ⊕ Jm1 (λ1 ) ⊕ · · · ⊕ Jmj (λj ) ∈
Mn (IR), where {Cnt (at , bt )} are the real Jordan blocks containing the
non-real eigenvalues of A and each λt ∈ IR (Theorem (3.4.5) in [HJ1]).
−1
Hence, A is consimilar to JR (A), say A = XJR (A)X . If JR (A) has
a coninvolutory dilation B of size n + k, then since Cn+k is invariant
15

under consimilarity,
" # " −1
# " #" #" −1
#
X 0 X 0 X 0 JR (A) ∗ X 0
B =
0 I 0 I 0 I ∗ ∗ 0 I
" −1
#
XJR (A)X ∗
=
∗ ∗
" #
A ∗
=
∗ ∗
is a coninvolutory dilation of A of size n + k. Lemma 1.1.20 guar-
antees that Cnt (at , bt ) has a coninvolutory dilation of size 2nt . More-
over, Jmt (λt ) has a coninvolutory dilation of size 2mt , and if λt = ±1
then Jmt (±1) has a coninvolutory dilation of size 2mt − 1. Thus,
Lemma 1.1.21 ensures that JR (A) has a coninvolutory dilation of size
2n − l, where l is the number of Jordan blocks in JR (A) with λt = ±1.
Now, n−l = rank(I −[JR (A)]2 ) = rank(I −X[JR (A)]2 X −1 ) = rank(I −
AA) = k0 . Therefore, A has a coninvolutory dilation of size n + k0 .
Lemma 1.1.18 guarantees that A has a coninvolutory dilation of size
n + k if and only if k ≥ k0 .
In [TK] several power dilation problems are considered: when does a given
square matrix A have a dilation
" #
A ∗
B=
∗ ∗
in a given class such that
" #
k Ak ∗
B = for k = 1, . . . , N ?
∗ ∗
Theorem 1.1.22 solves the power dilation problem for N = 1 in the class of
coninvolutory matrices; the situation for N > 1 is an open problem.

1.2 The RE Decomposition


−1
For any nonsingular A ∈ Mn , one checks that A A is coninvolutory, and it
is known (Lemma (4.6.9) in [HJ1]) that every coninvolutory matrix arises in
16

this way. Theorem 1.1.11 therefore ensures that there is a coninvolutory E


−1
such that E 2 = A A, and there exists such an E that is a polynomial in
−1
A A. An immediate consequence of this observation is the following RE de-
composition for square nonsingular complex matrices, which is an elaboration
of Theorem (6.4.23) in [HJ2].

1.2.1 The Nonsingular Case — Existence


Theorem 1.2.1 Let A ∈ Mn be nonsingular. There exists a coninvolutory
−1 −1
E0 ∈ Mn such that E0 2 = A A and E0 is a polynomial in A A. If E
−1
is a given coninvolutory matrix such that E 2 = A A, then R(A, E) ≡ AE
has real entries and A = R(A, E)E. Finally, A commutes with A if and
only if there exist R, E ∈ Mn such that A = RE, R has real entries, E is
coninvolutory, and R commutes with E.

Proof The first assertion has been dealt with. For the second, suppose
−1
EE = I and E 2 = A A, and compute
−1
AE = A(E 2 )E = A(A A)E = AE.

Thus, R(A, E) ≡ AE has real entries and A = AEE = R(A, E)E. If


A = RE with R real, E coninvolutory, and RE = ER, then AA =
RERE = R2 EE = R2 = REER = RERE = AA, so A commutes
−1
with A. Conversely, if A commutes with A, then A commutes with A ,
which is a polynomial in A. Let E be a coninvolutory matrix such that
−1 −1
E 2 = A A and E is a polynomial in A A, and let R(A, E) ≡ AE.
Then A commutes with E, R(A, E) has real entries, A = R(A, E)E,
and ER(A, E) = EAE = AEE = A = AEE = R(A, E)E.

Notice that a given A ∈ Mn commutes with A if and only if AA has


real entries. Although AA need not have real entries if n > 1, AA is always
similar to a square of a real matrix (see Corollary (4.10) of [HH1]).

1.2.2 The Nonsingular Case — Uniqueness


The factors in the classical polar decomposition (A = P U ) of a nonsingular
matrix are always uniquely determined. How well-determined are the factors
17

in an RE decomposition? One can always write A = RE = (−R)(−E). More


generally, if R1 is a real matrix such that R12 = I and R1 commutes with E,
then A = RE = (RR1 )(R1 E), RR1 is real, and (R1 E)−1 = ER1−1 = ER1 =
ER1 = R1 E so R1 E is coninvolutory. The following results show that this is
the only non-uniqueness in the factors of a nonsingular RE decomposition.

Lemma 1.2.2 Let G ∈ Mn be a given coninvolutory matrix, and let E0 be


a given coninvolutory matrix such that E02 = G and E0 is a polynomial in
G. If R ∈ Mn has real entries, R2 = I, and R commutes with E0 , then
F ≡ RE0 is coninvolutory and F 2 = G. Conversely, if F is a coninvolutory
matrix such that F 2 = G, then R(F, E0 ) ≡ F E0 is a real matrix such that
F = R(F, E0 )E0 , R(F, E0 )2 = I, and R(F, E0 ) commutes with E0 .

Proof The forward assertion has already been established. For the con-
−1
verse, let F be coninvolutory and satisfy F 2 = G = F F . Then
Theorem 1.2.1 ensures that R(F, E0 ) ≡ F E0 has real entries and F =
R(F, E0 )E0 . Since E0 is a polynomial in G = F 2 , both E0−1 = E0 and
R(F, E0 ) are polynomials in F . It follows that both F and R(F, E0 )
commute with E0 . Thus, R(F, E0 )2 = F E0 F E0 = F 2 (E0 )2 = GG = I,
as desired.

Theorem 1.2.3 Let A ∈ Mn be nonsingular, let E0 ∈ Mn be a given conin-


−1 −1
volutory matrix such that E02 = A A and E0 is a polynomial in A A, and
let R(A, E0 ) ≡ AE0 . Then R(A, E0 ) has real entries and A = R(A, E0 )E0 .
If R ∈ Mn has real entries, R2 = I, and RE0 = E0 R, then RE0 is con-
involutory, R(A, E0 )R is real, and A = (R(A, E0 )R)(RE0 ). Suppose E1 ,
R1 ∈ Mn are given with E1 coninvolutory, R1 real, and A = R1 E1 , and
define R(E1 , E0 ) ≡ E1 E0 . Then R(E1 , E0 ) is real and commutes with E0 ,
R(E1 , E0 )2 = I, R1 = R(A, E0 )R(E1 , E0 ), and E1 = R(E1 , E0 )E0 .

Proof Theorem 1.2.1 ensures that R(A, E0 ) is real and A = R(A, E0 )E0 .
Moreover, Lemma 1.2.2 ensures that RE0 is coninvolutory and A =
(R(A, E0 )R)(RE0 ) is an RE decomposition if R2 = I and R is real and
commutes with E0 . For the last assertions, apply Lemma 1.2.2 to the
−1
coninvolutory matrix E1 , for which E12 = A A = E02 , and conclude
that R(E1 , E0 ) ≡ E1 E0 is real, E1 = R(E1 , E0 )E0 , R(E1 , E0 ) commutes
18

with E0 , and R(E1 , E0 )2 = I. Then

R1 = AE1
= AR(E1 , E0 )E0
= AE0 R(E1 , E0 )
= R(A, E0 )R(E1 , E0 ).

−1
If E0 is a given coninvolutory polynomial square root of A A, we see
that the set of all pairs (R, E) with R real, E coninvolutory, and A = RE
is precisely {(AE 0 R, RE0 ) : R is real, R2 = I, and R commutes with E0 }.
Depending on the Jordan canonical form of Log E0 (see Theorem (6.4.20)
in [HJ2]), there can be many R’s that satisfy these conditions. If we use
Theorem 1.1.11(b) to write E0 = eiS with S real, and if S = T CT −1 is its
real Jordan form (see Theorem (3.4.5) in [HJ1]) with T and C both real and
C = Cn1 ⊕ · · · ⊕ Cnk , then E0 = T eiC T −1 ≡ T (En1 ⊕ · · · ⊕ Enk )T −1 . Any
matrix of the form R = T (±In1 ⊕ · · · ⊕ ±Ink )T −1 is real, commutes with E0 ,
and satisfies R2 = I.

1.2.3 The Singular Case


Every complex matrix, singular or nonsingular, square or not, has a classical
polar decomposition. Does every complex matrix have an RE decomposition?
Notice that if A = RE, then A = RE and A = AE 2 , so A and A necessarily
have the same range in this case. However,
" #
1 i
A1 ≡
i −1

and A1 have different ranges, so A does not have an RE decomposition. The


example " #
1 i
A2 ≡
0 0
has the same range as A2 , and
" #" #
1 0 1 i
A2 =
0 0 0 1
19

T
is an RE decomposition. Notice that AT2 and A2 have different ranges, so
AT2 does not have an RE decomposition. Thus, it is possible for one, but not
both, of the matrices A and AT to have an RE decomposition. We now show
that equality of the ranges of A and A is sufficient as well as necessary for A
to have an RE decomposition.

Theorem 1.2.4 Let A ∈ Mm,n be given. Then there exist a real R ∈ Mm,n
and a coninvolutory E ∈ Mn such that A = RE if and only if A and A have
the same range, that is, if and only if there exists a nonsingular S ∈ Mn such
that A = AS.

Proof The necessity of the condition has already been shown, so we only
need to show that it is sufficient.
We first show that the asserted equivalent conditions are both invariant
under right multiplication by a nonsingular matrix.

(i) If A = RE, then AS = RES = R(ES) and ES is nonsingular.


Hence ES = R1 E1 , with R1 real and E1 coninvolutory, and AS =
R(R1 E1 ) = (RR1 )E1 and RR1 is real.
(ii) If A = AS and A1 = AX, where X is nonsingular, then A1 =
−1 −1 −1
AX = ASX = AX(X SX) = A1 (X SX) and (X SX) is
nonsingular.

Choose a permutation matrix P ∈ Mn so that AP = [ A1 A2 ] and


A1 ∈ Mm,r has full rank. Then A2 = A1 B for some B ∈ Mr,n−r and
" #
Ir B
AP = [ A1 0 ] .
0 In−r

Hence, without loss of generality we may assume that A = [A1 0] with


A1 having full rank. Then
" #
S11 S12
[ A1 0 ] = A = AS = [ A1 0 ] = [A1 S11 A1 S12 ].
S21 S22

Hence, A1 S12 = 0 and A1 = A1 S11 = A1 S11 S11 = A1 S11 S11 , or A1 (Ir −


S11 S11 ) = 0. Since A1 has full rank, we conclude that S12 = 0 and Ir =
S11 S11 . Thus, S11 is coninvolutory and Theorem 1.1.11 ensures that
20

S11 = E 2 for some coninvolutory E. Taking R ≡ A1 E = (A1 S11 )E =


A1 E 2 E = A1 E = A1 E = R, we see that R is real and A1 = RE.
Finally, we have
" #
E 0
A = [ A1 0 ] = [ RE 0 ] = [ R 0 ]
0 In−r

and A has an RE decomposition.

1.2.4 More Factorizations Involving Coninvolutories


The following result may be thought of as an analog of the singular value
decomposition.

Theorem 1.2.5 Every A ∈ Mm,n can be written as A = E1 RE2 where


E1 ∈ Mm and E2 ∈ Mn are coninvolutory, and R ∈ Mm,n has real entries.

Proof The singular value decomposition ensures that there exist nonsingu-
lar matrices X ∈ Mm and Y ∈ Mn so that A = XRY , where R ∈ Mm,n
has real entries. Write X = E1 R1 and Y = R2 E2 , with each Ei conin-
volutory and each Ri real; then A = E1 (R1 RR2 )E2 .

Corollary 1.2.6 Let A ∈ Mm,n . Then there exists a coninvolutory X ∈ Mm


such that XA and XA have the same range.

Proof Write A = E1 RE2 , as in Theorem 1.2.5, and take X = E1 .

Every square complex matrix is known to be consimilar to a real matrix


(Theorem 4.9 of [HH1]), but we can now show that the consimilarity may be
taken to be coninvolutory. As a consequence, the coninvolutory factors E1
and E2 in Theorem 1.2.5 may be taken to be equal when A is square.
−1
Theorem 1.2.7 Every A ∈ Mn can be written as A = ERE = ERE ,
where E ∈ Mn is coninvolutory and R ∈ Mn (IR).
−1
Proof Write A = SR0 S with a nonsingular S and real R0 . Write S =
ER1 , where E is coninvolutory and R1 is real. Then A = ERE, where
R ≡ R1 R0 R1−1 is real.
Chapter 2

The Contragredient
Equivalence Relation

21
22

2.1 Introduction
We denote the set of m-by-n matrices by Mm,n and write Mn ≡ Mn,n . Given
a scalar λ ∈ C, the n-by-n upper triangular Jordan block corresponding to
λ is denoted by Jn (λ). We say that A, B ∈ Mm,n are equivalent if there are
nonsingular matrices X ∈ Mm and Y ∈ Mn such that A = XBY .

Definition 2.1.1 Let A, C ∈ Mm,n and B, D ∈ Mn,m . We say that (A, B)


is contragrediently equivalent to (C, D), and we write (A, B) ∼ (C, D), if
there are nonsingular X ∈ Mm and Y ∈ Mn such that XAY −1 = C and
Y BX −1 = D.

It can be useful to know that contragredient equivalence of two pairs of


matrices can be expressed as a block diagonal similarity of two block matrices:
" #" #" #−1 " #
X 0 0 A X 0 0 C
= .
0 Y B 0 0 Y D 0
It is easy to verify that contragredient equivalence is an equivalence relation
on Mm,n × Mn,m . In the special case n = m and B = D = I, notice that
(A, I) ∼ (C, I) if and only if A is similar to C.
Contragredient equivalence is a natural notion when one studies products
of matrices. If (A, B) ∼ (C, D), then
(1) CD = (XAY −1 )(Y BX −1 ) = X(AB)X −1 ,
(2) DC = (Y BX −1 )(XAY −1 ) = Y (BA)Y −1 ,
(3) D(CD)k = Y B(AB)k X −1 for all k = 0, 1, 2, . . . , and
(4) C(DC)k = XA(BA)k Y −1 for all k = 0, 1, 2, . . . .

Hence, AB is similar to CD, BA is similar to DC, D(CD)k is equivalent


to B(AB)k and C(DC)k is equivalent to A(BA)k for all k = 0, 1, 2, . . .. In
particular, the following rank identities hold for all integers k ≥ 0:
(10 ) rank (AB)k = rank (CD)k
(20 ) rank (BA)k = rank (DC)k
(30 ) rank B(AB)k = rank D(CD)k
(40 ) rank A(BA)k = rank C(DC)k .
Conversely, we shall show that these rank identities, together with either
of the similarity conditions (1) or (2), are sufficient for the contragredient
23

equivalence of (A, B) and (C, D). We obtain a canonical form for this rela-
tion and use it to study a variety of matrix factorizations involving complex
orthogonal factors.

2.2 A Canonical Form for the


Contragredient Equivalence Relation
For a given A ∈ Mm,n and B ∈ Mn,m , it is our goal to find a canonical pair
 ∈ Mm,n and B̂ ∈ Mn,m such that (A, B) ∼ (Â, B̂). Our first step is to
describe a reduction that shows there are only two essential cases to consider:
AB nonsingular and AB nilpotent.

Lemma 2.2.1 Let positive integers k, n be given with k < n. Let Y1 ∈ Mn,k
and P ∈ Mk,n be given with P Y1 nonsingular. Then there exists a Y2 ∈
Mn,n−k such that P Y2 = 0 and [Y1 Y2 ] ∈ Mn is nonsingular.

Proof Since P has full (row) rank, we may let {ξ1 , . . . , ξn−k } be a basis for
the orthogonal complement of the span of the columns of P ∗ in Cn ,
and set Y2 ≡ [ξ1 . . . ξn−k ]. Then P Y2 = 0 and Y2 has full (column)
rank. Let η ∈ Ck and ζ ∈ Cn−k and suppose
" #
η
[Y1 Y2 ] = Y1 η + Y2 ζ = 0.
ζ

Then 0 = P Y1 η + P Y2 ζ = P Y1 η, so η = 0 since P Y1 is nonsingular.


Thus, Y2 ζ = 0 and ζ = 0 since Y2 has full column rank. We conclude
that [Y1 Y2 ] ∈ Mn is nonsingular, as desired.

Lemma 2.2.2 Let positive integers m, n be given with m ≥ n, let A ∈ Mm,n


and B ∈ Mn,m be given, and let k ≡ rank (AB)m . Suppose the Jordan
canonical form of AB is J(AB) ⊕ N , where J(AB) ∈ Mk is nonsingular if
k ≥ 1 or is absent if k = 0, and N ∈ Mm−k is nilpotent if k < m or is absent
if k = n = m.

(a) If k = 0, then AB is nilpotent.


24

(b) If 1 ≤ k < n, there exist nonsingular X ∈ Mm and Y ∈ Mn such that


" # " #
−1 Ik 0 −1 J(AB) 0
XAY = and Y BX =
0 A 0 B

and AB ∈ Mm−k is nilpotent.

(c) If k = n < m, there exist nonsingular X ∈ Mm and Y ∈ Mn such that


" #
Ik
XAY −1 = and Y BX −1 = [J(AB) 0].
0

(d) If k = n = m, there exist nonsingular X, Y ∈ Mn such that

XAY −1 = Im and Y BX −1 = J(AB).

Proof Let k = rank (AB)m and let X ∈ Mm be a nonsingular matrix that


reduces AB to Jordan canonical form, that is,
" #
J(AB) 0
XABX −1 = ,
0 N

where J(AB) ∈ Mk is nonsingular and N ∈ Mm−k is nilpotent if k < m


and is absent if k = n = m. If k = 0, then AB is nilpotent and we
have case (a).
Suppose k ≥ 1. Partition XA as
" #
C1
XA = with C1 ∈ Mk,n and C2 ∈ Mm−k,n (2.2.1)
C2

and partition BX −1 as

BX −1 = [D1 D2 ] with D1 ∈ Mn,k and D2 ∈ Mn,m−k . (2.2.2)

Compute
" # " # " #
J(AB) 0 C1 C1 D1 C1 D2
= XABX −1 = [D1 D2 ] = .
0 N C2 C2 D1 C2 D2
25

Thus, C1 D2 = 0, C2 D1 = 0, and C1 D1 = J(AB) is nonsingular, as is


D1T C1T .
Now suppose 1 ≤ k < n. Two applications of Lemma 2.2.1 (with
P = D1T and P = C1 , respectively) ensure that there are C3 ∈ Mn−k,n
and D3 ∈ Mn,n−k such that
" # " #
C1 T D1T
Y ≡ ∈ Mn and F ≡ ∈ Mn
C3 D3T

are nonsingular and satisfy C1 D3 = 0 and D1T C3T = 0, that is C3 D1 = 0.


Notice that
" # " # " #
C1 C1 D1 C1 D3 J(AB) 0
YF = [D1 D3 ] = =
C3 C3 D1 C3 D3 0 C3 D3

is nonsingular, so C3 D3 is nonsingular. We also have


" # " # " #
C1 C1 D1 C1 D3 J(AB) 0
XAF = [D1 D3 ] = = .
C2 C2 D1 C2 D3 0 C2 D3

Now compute

XAY −1 = XAF F −1 Y −1 = (XAF )(Y F )−1


" #" #
J(AB) 0 J(AB)−1 0
=
0 C2 D3 0 (C3 D3 )−1
" # " #
Ik 0 Ik 0
= ≡ ,
0 C2 D3 (C3 D3 )−1 0 A

and
" # " #
−1 −1 C1 C1 D1 C1 D2
Y BX = Y (BX )= [D1 D2 ] =
C3 C3 D1 C3 D2
" # " #
J(AB) 0 J(AB) 0
= ≡ ,
0 C3 D2 0 B
26

where we set A ≡ C2 D3 (C3 D3 )−1 and B ≡ C3 D2 . Since


" #
J(AB) 0
= XABX −1 = (XAY −1 )(Y BX −1 )
0 N
" #" #
Ik 0 J(AB) 0
=
0 A 0 B
" #
J(AB) 0
= ,
0 AB

we have AB = N , as desired for case (b).


If k = n < m, then C1 , D1 ∈ Mn in (2.2.1) and (2.2.2) and the identity
C1 D1 = J(AB) shows that C1 and D1 are nonsingular. Since C1 D2 = 0
and C2 D1 = 0, it follows that D2 = 0 and C2 = 0. If we set Y ≡ C1
and F ≡ D1 , we have

Y F = C1 D1 = J(AB)

and " # " # " #


C1 C1 D1 J(AB)
XAF = D1 = = .
0 0 0
Thus,
XAY −1 = XAF F −1 Y −1 = (XAF )(Y F )−1
" # " #
J(AB) −1 Ik
= J(AB) =
0 0
and
Y BX −1 = C1 [D1 D2 ] = [C1 D1 C1 D2 ] = [J(AB) 0],
which is the assertion in (c).
Finally, suppose k = n = m. Then XA = C1 and BX −1 = D1 are
nonsingular. For Y ≡ C1 , we have XAY −1 = Im and Y BX −1 =
C1 D1 = J(AB), which is case (d).

For a given A ∈ Mm,n and B ∈ Mn,m , the four cases (a)—(d) in Lemma
2.2.2 are exhaustive and mutually exclusive. The special forms achieved by
27

the contragredient equivalences in (c) and (d) are the canonical forms we
seek, but more work is required to achieve a canonical form in (a). Once
that is achieved, however, the special form of the reduction in (b) shows that
a canonical form in this case can be expressed as a direct sum of the canonical
forms for cases (a) and (d): If
" # " #
−1 Ik 0 −1 J(AB) 0
XAY = and Y BX = ,
0 A 0 B

and if X1 ≡ Ik ⊕ F1 ∈ Mm and Y1 ≡ Ik ⊕ G1 ∈ Mn are nonsingular, then


" #
−1 Ik 0
(X1 X)A(Y1 Y ) =
0 F1 AG−1
1

and " #
−1 J(AB) 0
(Y1 Y )B(X1 X) = ,
0 G1 BF1−1
and (F1 AG−1 −1
1 )(G1 BF1 ) = F1 (AB)F1
−1
is nilpotent. Thus, if F1 and G1 are
chosen to put (A, B) into canonical form, we shall have achieved a canonical
form for (A, B).
In order to achieve a canonical form for (A, B) under contragredient equiv-
alence, we see that there are only two essential cases: AB is nonsingular or
AB is nilpotent. Before attacking the second case, it is convenient to sum-
marize what we have learned about the first.

Lemma 2.2.3 Let m, n be given positive integers with m ≥ n, and let


A, C ∈ Mm,n and B, D ∈ Mn,m be given. Suppose rank AB = rank (AB)m =
rank CD = rank (CD)m = n. Then (A, B) ∼ (C, D) if and only if AB is
similar to CD.

Proof The forward implication is immediate (and does not require the rank
conditions). For the converse, the hypotheses ensure that the respective
Jordan canonical forms of AB and CD are J(AB)⊕0m−n and J(CD)⊕
0m−n and one may arrange the Jordan blocks in the nonsingular factor
J(CD) ∈ Mn so that J(AB) = J(CD). Inspection of the canonical
forms in Lemma 2.2.2 (c) and (d) now shows that (A, B) ∼ (C, D).
28

With the nonsingular case disposed of, we now begin a discussion of the
nilpotent case. Let m ≥ n, A ∈ Mm,n , and B ∈ Mn,m be given and suppose
AB (and hence also BA) is nilpotent. Then among all the nonzero finite
alternating products of the form

Type I: · · · ABA (ending in A)


Type II: · · · BAB (ending in B)

there is a longest one (possibly two, one of each type). Suppose that a longest
one is of Type I. There are two possibilities, depending on the parity of a
longest product.

Case 1. Odd parity, k ≥ 1, A(BA)k−1 6= 0, (BA)k = 0.


Since we assumed that a longest product was of type I, we must
have (AB)k = 0 in this case.
Case 2. Even parity, k ≥ 2, (BA)k−1 6= 0, A(BA)k−1 = 0.
Again, since a longest nonzero product was assumed to be of type I,
we must have B(AB)k−1 = 0.

We now consider these two cases in turn.

Type I, Case 1. If k = 1, we set (AB)k−1 ≡ Im , even if A = 0. Let x ∈ Cn


and y ∈ Cm be such that y ∗ A(BA)k−1 x 6= 0, and define

Z1 ≡ [x BAx . . . (BA)k−1 x] ∈ Mn,k

Y1 ≡ AZ1 = [Ax A(BA)x . . . A(BA)k−1 x] ∈ Mm,k


 
y∗
 
 y ∗ AB 
P ≡
 ..  ∈ Mk,m

 . 
y ∗ (AB)k−1
and  
y∗A
 
 y ∗ (AB)A 
Q ≡ PA = 
 ..  ∈ Mk,n .

 . 
y ∗ (AB)k−1 A
29

Then
 
∗ y ∗ A(BA)k−1 x
 
 
 · 
P Y1 = QZ1 = 

 ∈ Mk

 · 
 
·
y ∗ A(BA)k−1 x 0

is nonsingular since y ∗ A(BA)k−1 x 6= 0. Moreover, we have the following


identities:
(a) AZ1 = Y1
(b) PA = Q
(2.2.3)
(c) BY1 = Z1 JkT (0)
(d) QB = Jk (0)P.
Suppose k < n. Then Lemma 2.2.1 ensures that there exist Y2 ∈ Mm,m−k
and Z2 ∈ Mn,n−k such that

(a) Y ≡ [Y1 Y2 ] ∈ Mm is nonsingular, and


(b) P Y2 = 0;
(2.2.4)
(c) Z ≡ [Z1 Z2 ] ∈ Mn is nonsingular, and
(d) QZ2 = 0.

Set C ≡ Y −1 AZ2 and partition


" #
C1
C= with C1 ∈ Mk,n−k and A2 ∈ Mm−k,n−k .
A2

Then " #
C1
AZ2 = Y C = [Y1 Y2 ] = Y1 C1 + Y2 A2 . (2.2.5)
A2
Now, use (2.2.5) and the identities (2.2.4b), (2.2.3b) and (2.2.4d) to compute

P AZ2 = P (AZ2 ) = P Y1 C1 + (P Y2 )A2 = (P Y1 )C1

and
P AZ2 = (P A)Z2 = QZ2 = 0.
Since P Y1 is nonsingular, we conclude that C1 = 0. Thus, (2.2.5) simplifies
to
AZ2 = Y2 A2 (2.2.6)
30

and we can use (2.2.3a) and (2.2.6) to compute


" #
Ik 0
A[Z1 Z2 ] = [AZ1 AZ2 ] = [Y1 Y2 ] ,
0 A2

so that " #
−1 Ik 0
Y AZ = . (2.2.7)
0 A2
Now, set D ≡ Z −1 BY2 and partition
" #
D1
D= with D1 ∈ Mk,m−k and B2 ∈ Mn−k,m−k .
B2

Again, notice that


" #
D1
BY2 = ZD = [Z1 Z2 ] = Z1 D1 + Z2 B2 . (2.2.8)
B2

Now use (2.2.8) and the identities (2.2.4d), (2.2.3d) and (2.2.4b) to compute

QBY2 = Q(BY2 ) = QZ1 D1 + (QZ2 )B2 = (QZ1 )D1

and
QBY2 = (QB)Y2 = Jk (0)(P Y2 ) = 0.
Since QZ1 is nonsingular, we conclude that D1 = 0. Hence, (2.2.8) simplifies
to
BY2 = Z2 B2 , (2.2.9)
and we can use (2.2.3c) and (2.2.9) to compute
" #
JkT (0) 0
B[Y1 Y2 ] = [BY1 BY2 ] = [Z1 Z2 ] ,
0 B2

and we have " #


−1 JkT (0) 0
Z BY = . (2.2.10)
0 B2
In this case, notice that A2 ∈ Mm−k,n−k and B2 ∈ Mn−k,m−k . Moreover,
(A2 B2 )k = 0 and (B2 A2 )k = 0.
31

If k = n < m, then Z2 is absent in (2.2.4c) and Z = Z1 is nonsingular; Q


is also nonsingular since QZ1 is nonsingular. The identities (2.2.3), (2.2.4a),
and (2.2.4b) still hold. Now,
" # " #
Ik Ik
AZ = Y1 = [Y1 Y2 ] =Y ,
0 0

so that (2.2.7) reduces to


" #
−1 Ik
Y AZ = . (2.2.11)
0

Using (2.2.3d) and (2.2.4b), we have

QBY2 = (QB)Y2 = Jk (0)(P Y2 ) = 0.

Since Q is nonsingular, we have BY2 = 0. Hence,

BY = B[Y1 Y2 ] = [BY1 BY2 ] = [ZJkT (0) 0] = Z[JkT (0) 0]

and (2.2.10) becomes


Z −1 BY = [JkT (0) 0]. (2.2.12)
If k = n = m, then Y2 and Z2 are absent; Y = Y1 , Z = Z1 , Q and P ∈ Mn
are all nonsingular. The identities (2.2.3) still hold and

Y −1 AZ = Ik and Z −1 BY = JkT (0). (2.2.13)

Notice that if k = n ≤ m, then we may take A2 = 0 in (2.2.7) and B2 = 0


in (2.2.10).

Type I, Case 2. We proceed as in Case 1. Let x, y ∈ Cn be such


that y ∗ (BA)k−1 x 6= 0. In this case, we have k ≥ 2, A(BA)k−1 = 0, and
B(AB)k−1 = 0. Let

Z1 ≡ [x BAx . . . (BA)k−2 x (BA)k−1 x] ∈ Mn,k

Y1 ≡ [Ax A(BA)x . . . A(BA)k−2 x] ∈ Mm,k−1


32

 
y∗B
 
 y ∗ BAB 
P ≡
 ..  ∈ Mk−1,m

 . 
y ∗ B(AB)k−2
and  
y∗
 ∗ 
 y BA 
 

Q≡ .. 
 ∈ Mk,n .
 . 
 y ∗ (BA)k−2 
 
y ∗ (BA)k−1
We define

Hk ≡ [Ik−1 0] ∈ Mk−1,k and Kk ≡ [0 Ik−1 ] ∈ Mk−1,k . (2.2.14)

Calculations similar to those in Case 1 show that P Y1 ∈ Mk−1 and QZ1 ∈


Mk are nonsingular. Morever, we have the following identities:

(a) AZ1 = Y1 Hk
(b) P A = Kk Q
(2.2.15)
(c) BY1 = Z1 KkT
(d) QB = HkT P.

Suppose k < n. Then Lemma 2.2.1 guarantees that there exists a Y2 ∈


Mm,m−k+1 and a Z2 ∈ Mn,n−k such that

(a) Y ≡ [Y1 Y2 ] ∈ Mm is nonsingular, and


(b) P Y2 = 0;
(2.2.16)
(c) Z ≡ [Z1 Z2 ] ∈ Mn is nonsingular, and
(d) QZ2 = 0.

As in Case 1, we set C ≡ Y −1 AZ2 . Similar calculations yield


" #
0
C= with A2 ∈ Mm−k+1,n−k ,
A2

and
AZ2 = Y2 A2 . (2.2.17)
33

Using the identities (2.2.15a) and (2.2.17) we have


" #
Hk 0
A[Z1 Z2 ] = [AZ1 AZ2 ] = [Y1 Hk Y2 A2 ] = [Y1 Y2 ]
0 C2
so that " #
Hk 0
Y −1 AZ = . (2.2.18)
0 C2
Again, as in Case 1, set D ≡ Z −1 BY2 and use similar calculations to
obtain " #
0
D= with B2 ∈ Mn−k,m−k+1
B2
so that
BY2 = Z2 D2 . (2.2.19)
Using the identities (2.2.15c) and (2.2.19), we have
" #
KkT 0
B[Y1 Y2 ] = [BY1 BY2 ] = [Z1 KkT Z2 B2 ] = [Z1 Z2 ]
0 B2
and hence " #
−1 KkT 0
Z BY = . (2.2.20)
0 B2
In this case, notice that A2 ∈ Mm−k+1,n−k and B ∈ Mn−k,m−k+1 . More-
over, B2 (A2 B2 )k−1 = 0 and A2 (B2 A2 )k−1 = 0.
If k = n ≤ m, then Z2 is absent and Z = Z1 ∈ Mn is nonsingular. Since
QZ1 is nonsingular, Q is also nonsingular. The identities (2.2.15), (2.2.16a),
and (2.2.16b) still hold. Hence,
" #
Hk
AZ = AZ1 = Y1 Hk = [Y1 Y2 ] ,
0

so that (2.2.18) becomes


" #
−1 Hk
Y AZ = . (2.2.21)
0

Using (2.2.15d) and (2.2.16b), we have

QBY2 = (QB)Y2 = HkT (P Y2 ) = 0.


34

Hence, BY2 = 0 since Q is nonsingular. Using this and (2.2.15c), we have

BY = B[Y1 Y2 ] = [BY1 BY2 ] = [Z1 KkT 0] = Z1 [KkT 0]

so that (2.2.20) becomes

Z −1 BY = [KkT 0]. (2.2.22)

Again, notice that if k = n ≤ m, then we may take A2 = 0 in (2.2.18)


and B2 = 0 in (2.2.20).
If a longest nonzero alternating product is of Type II, then again there are
two possibilities, depending on the parity of a longest product. Our analysis
of both cases holds with A and B interchanged. In particular, we have the
following results:

Type II, Case 1. Odd parity, k ≥ 1, B(AB)k−1 6= 0, (AB)k = 0 and


(BA)k = 0. There exist nonsingular Y ∈ Mm and Z ∈ Mn such that
(a) if k < n,
" #
−1 Ik 0
Z BY = for some B2 ∈ Mn−k,m−k (2.2.23)
0 B2

and " #
−1 JkT (0) 0
Y AZ = for some A2 ∈ Mm−k,n−k . (2.2.24)
0 A2
(b) if k = n ≤ m, then we may take B2 = 0 in (2.2.23) and A2 = 0 in
(2.2.24).

Type II, Case 2. Even parity, k ≥ 2, (AB)k−1 6= 0, B(AB)k−1 = 0 and


A(BA)k−1 = 0. There exist nonsingular Y ∈ Mm and Z ∈ Mn such that
(a) if k < n, or if k = n < m,
" #
−1 Hk 0
Z BY = for some B2 ∈ Mn−k+1,m−k (2.2.25)
0 B2

and " #
−1 KkT 0
Y AZ = for some A2 ∈ Mm−k,n−k+1 . (2.2.26)
0 A2
35

(b) if k = n ≤ m, then we may take B2 = 0 in (2.2.25) and A2 = 0 in


(2.2.26).
In all four possible combinations of types and cases, our analysis can be
applied again to the matrices A2 and B2 . Iteration of this process leads to
the following result.

Theorem 2.2.4 Let positive integers m, n be given with m ≥ n, let A ∈


Mm,n and B ∈ Mn,m be given, and let k ≡ rank (AB)m . Then there exist
nonsingular X ∈ Mm and Y ∈ Mn such that
 
JA 0 ··· 0 0
 
 0 A1 ··· 0 0 
 
XAY −1 =

.. .. ... .. ..
 (2.2.27)
 . . . .
 0 0 · · · Ap 0 
 
0 0 0 0 0

and  
JB 0 ··· 0 0
 
 0 B1 ··· 0 0 
 
Y BX −1

= .. .. ... .. ..
 (2.2.28)
 . . . .
 0 0 · · · Bp 0 
 
0 0 ··· 0 0
where

(a) JA , JB ∈ Mk are nonsingular and JA JB is similar to J(AB), the non-


singular part of the Jordan canonical form of AB,

(b) Ai , Bi ∈ Mmi ,ni for all i = 1, . . . , p,

(c) max {mi , ni } ≥ max {mi+1 , ni+1 } for all i = 1, . . . , p − 1,

(d) |mi − ni | ≤ 1 for all i = 1, . . . , p,


T T T T
(e) each (Ai , Bi ) ∈ {(Imi , Jm i
(0)), (Jm i
(0), Imi ), (Hmi , Km i
), (Km i
, Hmi )},

(f) Hj ≡ [Ij−1 0] ∈ Mj−1,j and Kj ≡ [0 Ij−1 ] ∈ Mj−1,j , and

(g) Hj KjT = Jj−1


T
(0) and KjT Hj = JjT (0).
36

When m = n, A = In , and B ∈ Mn is nilpotent, the canonical contragre-


dient equivalence guaranteed by Theorem 2.2.4 gives XY −1 = XAY −1 = In
(so that X = Y ), and XBY −1 = Y BY −1 reduces to the Jordan canonical
form of B. This result, together with Theorem (2.4.8) of [HJ1], leads to a
proof of the Jordan canonical form of a square matrix. For a more direct
exposition of this idea, see the Appendix.

Definition 2.2.5 Let positive integers m, n be given with m ≥ n, and let


A ∈ Mm,n and B ∈ Mn,m be given. Define
φ(A, B) ≡
{rank A, rank BA, rank ABA, . . . , rank (AB)m−1 A, rank (BA)m ,
rank B, rank AB, rank BAB, . . . , rank (BA)m−1 B, rank (AB)m }.

The reason for introducing φ(A, B) is that the parts of the canonical form
in Theorem 2.2.4 that contribute to the nilpotent part of AB are completely
determined by the sequence φ(A, B).
Consider first what happens if A and B consist of only one canonical
block. For each choice of the possible pairs of canonical blocks, Table 1 gives
the ranks of the indicated products.

(Type, Case)
(I, 1) (I, 2) (II, 1) (II, 2)
A Ik Hk JkT (0) KkT
T T
B Jk (0) Kk Ik Hk
(AB)k 0 0 0 0
(BA)k 0 0 0 0
A(BA)k−1 1 0 0 0
B(AB)k−1 0 0 1 0
(AB)k−1 1 0 1 1
(BA)k−1 1 1 1 0
Table 1
Ranks of Products by Type and Case

Suppose AB is nilpotent and we are given the sequence φ(A, B). How can
we use φ(A, B) to determine the Ai in (2.2.27) and the Bi in (2.2.28)? Now
consider the general case of nilpotent AB. Let k be the smallest integer such
37

that (AB)k = 0 and (BA)k = 0. Then l1 ≡ rank A(BA)k−1 is the number of


Ik ’s in the canonical form for A, and hence is also the number of JkT (0)’s in
the canonical form for B, since only (Type I, Case 1) has a nonzero entry in
the row corresponding to A(BA)k−1 . In particular, l1 of the Ai ’s are Ik and
each of the l1 corresponding Bi ’s are JkT (0). Similarly, l2 ≡ rank B(AB)k−1
is the number of Ai ’s equal to JkT (0), and also the number of corresponding
Bi ’s equal to Ik (Type II, Case 1). Notice that the row corresponding to
(AB)k−1 has three nonzero entries: (Type I, Case 1), (Type II, Case 1), and
(Type II, Case 2). Hence, if l3 ≡ rank (AB)k−1 , then l3 − l1 − l2 of the Ai ’s
are KkT and the same number of Bi ’s are Hk , corresponding to (Type II,
Case 2). Similarly, if l4 ≡ rank (BA)k−1 , then l4 − l1 − l2 of the Ai ’s are
Hk and the same number of corresponding Bi ’s are KkT (Type I, Case 2).
If k > 1, the minimality of k implies that (AB)k−1 6= 0 or (BA)k−1 6= 0.
Hence, one of l1 , l2 , l3 , l4 must be a positive integer. The only case in which
l1 = l2 = l3 = l4 = 0 is when k = 1, A = 0, and B = 0.
Let A1 be a matrix of the form (2.2.27) having the first l1 of the Ai ’s
equal to Ik , the next l2 of the Ai ’s equal to JkT (0), the next l3 − l1 − l2 of
the Ai ’s equal to KkT , and l4 − l1 − l2 of the Ai ’s equal to Hk . Let B1 be
a corresponding matrix of the form (2.2.28), that is, the first l1 of the Bi ’s
are JkT (0), the next l2 of the Bi ’s are Ik , the next l3 − l1 − l2 of the Bi ’s
are Hk , and the next l4 − l1 − l2 of Bi ’s are KkT . Subtracting the entries
of the sequence φ(A1 , B1 ) from corresponding entries of φ(A, B) gives a new
sequence of ranks corresponding to the remaining part of the canonical forms,
whereupon we can repeat the foregoing construction.
The following procedure summarizes a way of determining a canonical
form of (A, B) under the contragredient equivalence when AB is nilpotent:

• Compute the sequence φ(A, B) and examine it to determine the small-


est integer k such that (AB)k = 0 and (BA)k = 0.

• Let A1 have the form (2.2.27), with l1 of the Ai ’s equal to Ik , l2 of the


Ai ’s equal to JkT (0), l3 − l1 − l2 of the Ai ’s equal to Hk , and l4 − l1 − l2 of
the Ai ’s equal to KkT . Let B1 be the corresponding matrix of the form
(2.2.28). Compute φ(A1 , B1 ).

• Replace φ(A, B) by the sequence formed by the differences of corre-


sponding entries of the sequences φ(A, B) and φ(A1 , B1 ).
38

• Repeat the preceding three steps until the all the entries in the sequence
are 0.
The following result is immediate.
Lemma 2.2.6 Let integers m, n be given with m ≥ n, and suppose A, C ∈
Mm,n and B, D ∈ Mn,m . Suppose that AB is nilpotent. Then (A, B) ∼ (C, D)
if and only if φ(A, B) = φ(C, D).
Corollary 2.2.7 Let integers m, n be given and suppose A, C ∈ Mm,n and
B, D ∈ Mn,m . Then (A, B) ∼ (C, D) if and only if the following conditions
are satisfied:
(1) AB is similar to CD, and
(2) for all l = 0, 1, . . . , max {m, n},
(a) rank (BA)l = rank (DC)l ,
(b) rank A(BA)l = rank C(DC)l , and
(c) rank B(AB)l = rank D(CD)l .
Proof The forward implication is easily verified and was noted in the In-
troduction. For the converse, we may assume without loss of gener-
ality that m ≥ n. Theorem 2.2.4 guarantees that there exist non-
singular X1 , X2 ∈ Mm , Y1 , Y2 ∈ Mn , and a nonnegative integer k =
rank (AB)m = rank (CD)m such that
" # " #
A1 0 B1 0
X1 AY1−1 = and Y1 BX1−1 =
0 A 0 B
and " # " #
C1 0 D1 0
X2 CY2−1 = and Y2 DX2−1 = ,
0 C 0 D
where A1 , B1 , C1 , D1 ∈ Mk are nonsingular and AB and CD are nilpo-
tent. Since AB is similar to CD, the nonsingular parts of their respec-
tive Jordan canonical forms are the same. Therefore, A1 B1 is similar
to C1 D1 and hence Lemma 2.2.3 guarantees that (A1 , B1 ) ∼ (C1 , D1 ).
Moreover, (1) also ensures that rank (AB)l = rank (CD)l for all inte-
gers l = 0, 1, . . . , m. These identities and the rank identities (2) ensure
that φ(A, B) = φ(C, D), so (A, B) ∼ (C, D) by Lemma 2.2.6.
39

Corollary 2.2.8 Let positive integers m, n be given with m ≥ n, and let


A ∈ Mm,n and B ∈ Mn,m be given. Then (A, B) ∼ (C, D), where
 
JA 0 ··· 0 0
 
 0 Aσ(1) ··· 0 0 
 

C≡ .. .. ... .. ..
 (2.2.29)
 . . . .
 0 0 · · · Aσ(p) 0 
 
0 0 0 0 0

and  
JB 0 ··· 0 0
 
 0 Bσ(1) ··· 0 0 
 

D≡ .. .. ... .. ..
, (2.2.30)
 . . . .
 0 0 · · · Bσ(p) 0 
 
0 0 ··· 0 0
JA , JB , Ai , Bi , i = 1, . . . , p, are as in Theorem 2.2.4, and σ is any permuta-
tion of {1, . . . , p}.

Proof Check that the conditions of Corollary 2.2.7 are satisfied.

Definition 2.2.9 For A ∈ Mm,n and B ∈ Mn,m , we define


" #
0 A
ψ(A, B) ≡ .
B 0

Let k ≥ 0 be a given integer. Then ψ(A, B)2k = (AB)k ⊕ (BA)k and


ψ(A, B)2k+1 = ψ(A(BA)k , B(AB)k ). Hence,

rank ψ(A, B)2k = rank (AB)k + rank (BA)k (2.2.31)

and
rank ψ(A, B)2k+1 = rank A(BA)k + rank B(AB)k . (2.2.32)
If AB is similar to CD, ψ(A, B) is similar to ψ(C, D), and rank A(BA)l =
rank C(DC)l for all integers l ≥ 0, then the conditions of Corollary 2.2.7 are
satisfied, and hence (A, B) ∼ (C, D). Conversely, all of these conditions are
satisfied if (A, B) ∼ (C, D).
40

Corollary 2.2.10 Let integers m, n be given and suppose A, C ∈ Mm,n and


B, D ∈ Mn,m . Then (A, B) ∼ (C, D) if and only if the following conditions
are satisfied:

(1) AB is similar to CD,

(2) ψ(A, B) is similar to ψ(C, D), and

(3) rank A(BA)l = rank C(DC)l for all l = 0, 1, . . . , max {m, n}.

2.3 Complex Orthogonal Equivalence and


the QS Decomposition
The classical QS decomposition (the classical algebraic polar decomposition) is
the fact that any nonsingular A ∈ Mn can be written as A = QS, where Q ∈
Mn is complex orthogonal (Q−1 = QT ) and S ∈ Mn is complex symmetric
(S = S T ) (see Theorem (3) on p. 6 of [G] and Theorem (6.4.16) of [HJ2]).
However, not every singular A ∈ Mn has a QS decomposition. If A = QS,
then AT A = S 2 = QT AAT Q, so that AAT is similar to AT A. For the example
" # " # " #
1 i T 0 0 T 1 i
A≡ ∈ M2 , AA = , A A= ,
0 0 0 0 i −1

AAT is not similar to AT A, so this A cannot have a QS decomposition. In


[K1], it is shown that similarity of AAT and AT A is sufficient for A to have a
QS decomposition. Here, we use what we have learned about contragredient
equivalence to give a different approach to this result.

Lemma 2.3.1 Let A, C ∈ Mm,n with m ≥ n. Then the following are equiv-
alent:

(1) (A, AT ) ∼ (C, C T ).

(2) (A, AT ) ∼ (C, C T ) via orthogonal matrices, that is, there exist orthog-
onal Q1 ∈ Mm and Q2 ∈ Mn such that C = Q1 AQ2 .
" # " #
T T 0 A 0 C
(3) AA is similar to CC and is similar to .
AT 0 CT 0
41

Proof Suppose (A, AT ) ∼ (C, C T ). Then there exist nonsingular X ∈ Mm


and Y ∈ Mn such that C = XAY −1 and C T = Y AT X −1 . Hence,
XAY −1 = C = (C T )T = (X −1 )T AY T , so X T XA = AY T Y . From
this it follows that p(X T X)A = Ap(Y T Y ) for any polynomial p(t). √
Let p(t) be a polynomial that interpolates the principal branch of t
and its first m − 1 derivatives on the joint spectra of X T X and Y T Y .
Then Corollary (6.2.12) of [HJ2] guarantees that S1 ≡ p(X T X) and
S2 ≡ p(Y T Y ) are symmetric matrices such that S12 = X T X, S22 = Y T Y ,
and S1 A = AS2 . One checks that Q1 ≡ XS1−1 and Q2 ≡ Y S2−1 are
orthogonal, X = Q1 S1 , and Y = Q2 S2 . Now compute

C = XAY −1 = Q1 (S1 A)S2−1 Q−1 −1 T T


2 = Q1 (AS2 )S2 Q2 = Q1 AQ2 .

Since C T = (Q1 AQT2 )T = Q2 AT QT1 as well, we have shown that (1)


implies (2) One checks easily that (2) implies (3). To show that (3)
implies (1), suppose that AAT is similar to CC T and ψ(A, AT ) is sim-
ilar to ψ(C, C T ) (see Definition 2.2.9). Then rank ψ(A, AT )2k+1 =
rank ψ(C, C T )2k+1 for every integer k ≥ 0. Since rank X = rank X T
for any X ∈ Mm,n , we also have

rank ψ(B, B T )2k+1 = rank B(B T B)k + rank B T (BB T )k


= rank B(B T B)k + rank (B(B T B)k )T
= 2 rank B(B T B)k

for any B ∈ Mm,n and any integer k ≥ 0. Thus, we must also have

rank A(AT A)k = rank C(C T C)k for all k = 0, 1, 2, . . . .

Corollary 2.2.10 guarantees that (A, AT ) ∼ (C, C T ), and our proof is


complete.

Theorem 2.3.2 Let A ∈ Mn be given. The following are equivalent:


(1) (A, AT ) ∼ (AT , A).

(2) AT A is similar to AAT .

(3) A = Q1 AT Q2 for some orthogonal Q1 , Q2 ∈ Mn .

(4) A = QS for some orthogonal Q and symmetric S in Mn .


42

(5) A = QAT Q for some orthogonal Q ∈ Mn .

Proof Notice that for any A, B ∈ Mn , ψ(A, B) = ψ(In , In )ψ(B, A)ψ(In , In )


and ψ(In , In )−1 = ψ(In , In ); we conclude that ψ(A, B) is similar to
ψ(B, A). In particular, ψ(A, AT ) is similar to ψ(AT , A). Lemma 2.3.1
now guarantees the equivalence of (1), (2) and (3). Now suppose A =
Q1 AT Q2 for some orthogonal Q1 , Q2 ∈ Mn . Then
B ≡ AQT1 = Q1 AT Q2 QT1 = (AQT1 )T (Q2 QT1 ) = B T Q,
where Q ≡ Q2 QT1 . Now, BQT = B T = (B T Q)T = QT B, so QT = Q−1
commutes with B, and hence Q (which is a polynomial in Q−1 ) com-
mutes with B. Since QB = BQ implies QB T = B T Q, Q also commutes
with B T . Let R be any polynomial square root of Q, so that R com-
mutes with B and with B T . Moreover, B = B T Q = B T R2 . Thus,
BR−1 = B T R = RB T . Hence, (BR−1 )2 = (BR−1 )(RB T ) = BB T ,
so BB T has a square root. Theorem (4) of [CH2] (see also Problem
23 in Section (6.3) of [HJ2]) guarantees that BB T has a symmetric
square root S that is similar to BR−1 , say S = Y (BR−1 )Y −1 . Thus,
S 2 = Y (BB T )Y −1 and
ψ(S, S) = Zψ(BR−1 , BR−1 )Z −1
= Zψ(BR−1 , RB T )Z −1
= Z1 ψ(B, B T )Z1−1 ,

where Z ≡ Y ⊕ Y and Z1 ≡ Z(In ⊕ R). Hence, S 2 is similar to


BB T and ψ(S, S) is similar to ψ(B, B T ). Lemma 2.3.1 guarantees
that AQT1 = B = Q3 SQ4 for some orthogonal Q3 , Q4 ∈ Mn . Thus,
A = Q3 SQ4 Q1 = QS1 , where Q ≡ Q3 Q4 Q1 is orthogonal and S1 ≡
QT1 QT4 SQ4 Q1 is symmetric, so we have shown that (3) implies (4).
Now suppose A = QS for some orthogonal Q and symmetric S ∈ Mn .
Then S = QT A and AT = SQT = QT AQT so that A = QAT Q and (4)
implies (5). Since (5) trivially implies (3), our proof is complete.

2.4 The φS Polar Decomposition


In the preceding section we studied a factorization related to the transpose
operator φ : Mn → Mn given by φ(A) ≡ AT : we characterized those A ∈
43

Mn that could be written as A = XY , where X ∈ Mn is nonsingular and


X −1 = φ(X), and Y ∈ Mn satisfies Y = φ(Y ). In this section we consider
a factorization of the same form for a general linear operator φ : Mn → Mn
that shares some basic properties of the transpose operator.

Definition 2.4.1 Let Sn+ denote the set of nonsingular symmetric (AT = A)
matrices in Mn , let Sn− denote the set of all nonsingular skew-symmetric
(AT = −A) matrices in Mn , and set Sn ≡ Sn+ ∪ Sn− . The spectrum of
A ∈ Mn , the set of eigenvalues of A, will be denoted by σ(A).

Since the rank of a skew-symmetric matrix is even, Sn− is empty if n is


odd.

Lemma 2.4.2 Let φ : Mn → Mn be a given linear operator. There exists


an S ∈ Sn such that φ(X) = SX T S −1 for all X ∈ Mn if and only if the
following three conditions hold for all A, B ∈ Mn :
(1) σ(φ(A)) = σ(A),
(2) φ(φ(A)) = A, and
(3) φ(AB) = φ(B)φ(A).

Proof The forward implication can be verified easily. Conversely, suppose


φ : Mn → Mn is a linear operator satisfying the three given con-
ditions. Under assumption (1), results about linear preservers (see
Theorem (4.5.7) of [HJ2]) guarantee that there exists a nonsingular
S ∈ Mn such that either φ(X) = SXS −1 for all X ∈ Mn or φ(X) =
SX T S −1 for all X ∈ Mn . Since φ(AB) = φ(B)φ(A) for any A, B ∈
Mn , it must be that φ(X) = SX T S −1 . For all A ∈ Mn , we have
A = φ(φ(A)) = φ(SAT S −1 ) = S(SAT S −1 )T S −1 = SS −T AS T S −1 =
(SS −T )A(SS −T )−1 . Hence, SS −T = αI, and S = αS T . Thus, S =
α(αS T )T = α2 S. Since S is nonsingular, α = ±1, that is, either
S = S T (S ∈ Sn+ ) or S = −S T (S ∈ Sn− ). Hence, S ∈ Sn , as asserted.

Definition 2.4.3 Let S ∈ Sn be given. We define the linear operator φS :


Mn → Mn by
φS (A) ≡ SAT S −1 for all A ∈ Mn .
44

For a given S ∈ Sn , we say that A is φS symmetric if φS (A) = A; A is φS


skew-symmetric if φS (A) = −A; A is φS orthogonal if AφS (A) = I (that is,
A is nonsingular and A−1 = φS (A)). Finally, we say that A has a φS polar
decomposition if A = XY for some φS orthogonal X and φS symmetric Y .

If S = I, then φS is ordinary transposition and the φS polar decom-


position is the QS decomposition (algebraic polar decomposition). If S =
[ sij ] ∈ Sn has si,n−i+1 = 1 for i = 1, . . . , n and all other sij = 0, then φS is
anti-transposition (see [Ho]).
The following assertions are easily verified.

Lemma 2.4.4 Let S ∈ Sn be given, and define φS as in Definition 2.4.3.


Then

(1) φS (A−1 ) = φS (A)−1 for all nonsingular A ∈ Mn .

(2) AφS (A) and φS (A)A are φS symmetric for all A ∈ Mn .

(3) p(φS (A)) = φS (p(A)) for all A ∈ Mn and all polynomials p(t).

(4) If A ∈ Mn is φS symmetric, then p(A) is φS symmetric for every


polynomial p(t).

(5) If A, B ∈ Mn are φS orthogonal, then AB is φS orthogonal.

(6) Suppose that S ∈ Sn+ and let S1 ∈ Sn+ be such that S12 = S. Then
S1 AS1−1 is φS symmetric if and only if A is symmetric.

It is natural to ask if Theorem 2.3.2 can be generalized to the φS polar


decomposition. In particular, is it true that a given A ∈ Mn has a φS polar
decomposition if and only if AφS (A) is similar to φS (A)A?

2.4.1 The Nonsingular Case


The following is a generalization of the QS decomposition in the nonsingular
case.

Theorem 2.4.5 (Existence) Let S ∈ Sn and a nonsingular A ∈ Mn be


given. Then
45

(a) There exists a φS symmetric Y0 ∈ Mn such that Y02 = φS (A)A and Y0


is a polynomial in φS (A)A.

(b) If Y is a given φS symmetric matrix such that Y 2 = φS (A)A, then


X(A, Y ) ≡ AY −1 is φS orthogonal and A = X(A, Y )Y .

(c) A commutes with φS (A) if and only if there exist commuting X and Y ∈
Mn such that A = XY , X is φS orthogonal, and Y is φS symmetric.

Proof We may take Y0 to be any polynomial square root of φS (A)A.


For the second assertion, suppose Y 2 = φS (A)A. Then

I = Y −1 Y 2 Y −1 = Y −1 φS (A)AY −1 = φS (AY −1 )(AY −1 ),

so X(A, Y ) ≡ AY −1 is φS orthogonal and A = AY −1 Y = X(A, Y )Y .


If A = XY with X φS orthogonal and Y φS symmetric, and if X
and Y commute, then AφS (A) = (XY )φS (XY ) = XY φS (Y )φS (X) =
XY 2 φS (X) = Y 2 = φS (A)A. Conversely, if A commutes with φS (A),
then A commutes with Y0 , which is a polynomial in φS (A)A. It follows
that Y0 commutes with X(A, Y0 ) ≡ AY0−1 . But X(A, Y0 ) is orthogonal
by our second assertion, and A = X(A, Y0 )Y0 .

Theorem 2.4.6 (Uniqueness) Let S ∈ Sn and a nonsingular A ∈ Mn be


given. Let Y0 ∈ Mn be φS symmetric and suppose Y02 = φS (A)A and Y0 is
a polynomial in φS (A)A. Let X(A, Y0 ) ≡ AY0−1 , so A = X(A, Y0 )Y0 is a φS
polar decomposition of A. Let X, Y ∈ Mn be given. Then X is φS orthogonal,
Y is φS symmetric, and A = XY if and only if there exists a φS orthogonal
X1 ∈ Mn such that X1 commutes with Y0 , X12 = I, X = X(A, Y0 )X1 , and
Y = X1 Y0 .

Proof For the forward implication, let X1 ≡ Y Y0−1 . All the assertions
follow from Theorem 2.4.5 and the observation that Y0 is a polyno-
mial in φS (A)A = Y 2 , which ensures that Y (and hence X1 ) com-
mutes with Y0 . Conversely, under the stated assumptions we have
XY = X(A, Y0 )X12 Y0 = X(A, Y0 )Y0 = A and φS (Y ) = φS (X1 Y0 ) =
Y0 φS (X1 ) = Y0 X1−1 = Y0 X1 = X1 Y0 = Y , so Y is φS symmetric.
46

Moreover,

φS (X)X = φS ((X(A, Y0 )X1 )X(A, Y0 )X1


= φS (X1 )φS (X(A, Y0 ))X(A, Y0 )X1
= φS (X1 )X1 = I,

so X is φS orthogonal.

As an immediate application of the φS polar decomposition, we have the


following generalizations of the well-known fact that two complex symmetric
matrices are similar if and only they are complex orthogonally similar; see
Section 1.1.1 for analogs of the following three results involving real similarity.

Theorem 2.4.7 Let A, B ∈ Mn and S ∈ Sn be given. There exists a φS


orthogonal X ∈ Mn such that A = XBX −1 if and only if there exists a
nonsingular Z ∈ Mn such that A = ZBZ −1 and φS (A) = ZφS (B)Z −1 .

Proof The forward implication is easily verified. For the converse, we have
ZBZ −1 = A = φS (φS (A)) = φS (ZφS (B)Z −1 ) = φS (Z)−1 BφS (Z),
so (φS (Z)Z)B = B(φS (Z)Z). Theorem 2.4.5 guarantees that there
exists a φS symmetric Y ∈ Mn that is a polynomial in φS (Z)Z, as
well as a φS orthogonal X ∈ Mn such that Z = XY . Since φS (Z)Z
commutes with B, Y also commutes with B. Thus, A = ZBZ −1 =
(XY )B(Y −1 X −1 ) = XBX −1 .

The following immediate consequences of Theorem 2.4.7 generalize Corol-


laries (6.4.18—19) of [HJ2], which correspond to S = I.

Corollary 2.4.8 Let A, B ∈ Mn and S ∈ Sn be given, and suppose there


exists a polynomial p(t) such that φS (A) = p(A) and φS (B) = p(B). Then A
is similar to B if and only if there exists a φS orthogonal X ∈ Mn such that
A = XBX −1 .

Corollary 2.4.9 Let S ∈ Sn be given and suppose both A, B ∈ Mn are φS


symmetric, both are φS skew-symmetric, or both are φS orthogonal. Then A
is similar to B if and only if A is φS orthogonally similar to B.
47

2.4.2 The General Case


Let S ∈ Sn and A ∈ Mn be given. If A = XY is a φS polar decomposition
of A, then AφS (A) = (XY )φS (XY ) = XY 2 φS (X) = XY 2 X −1 is similar to
φS (A)A = Y 2 . Is this condition sufficient for the existence of a φS polar
decomposition of A? Using properties enumerated in Lemma 2.4.4, one can
follow the same argument used to prove Lemma 2.3.1 and obtain the following
generalization.

Lemma 2.4.10 Let S ∈ Sn and A, B ∈ Mn be given. The following are


equivalent:

(1) (A, φS (A)) ∼ (B, φS (B)).

(2) (A, φS (A)) ∼ (B, φS (B)) via φS orthogonal matrices, that is, there exist
φS orthogonal X1 , X2 ∈ Mn such that A = X1 BX2 .

(3) AφS (A) is similar to BφS (B) and


" # " #
0 A 0 B
is similar to .
φS (A) 0 φS (B) 0

Since for any C, D ∈ Mn , ψ(C, D) is always similar to ψ(D, C) (see


Definition 2.2.9), the following is an immediate consequence of Lemma 2.4.10.

Theorem 2.4.11 Let S ∈ Sn and A ∈ Mn be given. There exist φS orthogo-


nal X1 , X2 ∈ Mn such that A = X1 φS (A)X2 if and only if AφS (A) is similar
to φS (A)A.

Let A ∈ Mn and S ∈ Sn be given. If there are φS orthogonal matrices


X1 , X2 ∈ Mn such that A = X1 φS (A)X2 , let

B ≡ AφS (X1 ) and X ≡ X2 φS (X1 )

and follow the argument in the proof of Theorem 2.3.2 to show that:

(a) B = φS (B)X and X is φS orthogonal.

(b) X commutes with B, and hence also with φS (B).


48

Suppose R is a given square root of X that commutes with B. For


example, R may be taken to be any polynomial square root of X.
(c) BφS (B) = (BR−1 )2 .
(d) R commutes with φS (B) because RφS (B) = RBX −1 = BRX −1 =
φS (B)XRX −1 = φS (B)R. It follows that BR−1 = RφS (B).
Suppose W ∈ Mn is a given φS symmetric matrix that is similar to BR−1 .
(e) W 2 is similar to BφS (B) and ψ(W, φS (W )) = ψ(W, W ) is similar
to ψ(BR−1 , BR−1 ) = ψ(BR−1 , RφS (B)), which in turn is similar to
ψ(B, φS (B)).
Lemma 2.4.10 now ensures that there are φS orthogonal matrices Z1 , Z2 ∈ Mn
such that B = Z1 W Z2 , and hence A = [Z1 Z2 X1 ][φS (Z2 X1 )W (Z2 X1 )] is a
φS polar decomposition. Conversely, if A = ZY for Z, Y ∈ Mn such that
Y = φS (Y ) and ZφS (Z) = I, and if we take X1 = X2 = Z and R = I,
then A = X1 φS (A)X2 , R2 = X2 φS (X1 ) = I, and AφS (X1 )R−1 = Y is φS
symmetric. We summarize our conclusions in the following lemma.
Lemma 2.4.12 Let S ∈ Sn and A ∈ Mn be given. There exist a φS orthog-
onal X ∈ Mn and a φS symmetric Y ∈ Mn such that A = XY if and only if
there exist φS orthogonal X1 , X2 and a nonsingular R such that
(1) A = X1 φS (A)X2 ,
(2) R2 = X2 φS (X1 ) and R commutes with AφS (X1 ), and
(3) AφS (X1 )R−1 is similar to a φS symmetric matrix.
Notice that 2.4.12(2) is always attainable for some nonsingular R, since
one may take R to be any polynomial square root of φS (X1 )X2 . Moreover,
Theorem 2.4.11 ensures that 2.4.12(1) holds if and only if AφS (A) is similar
to φS (A)A. However, it is not clear that among the choices of X1 , X2 , and R
that satisfy 2.4.12(1, 2) there is one for which 2.4.12(3) holds. If S ∈ Sn+ , we
will show that any n-by-n matrix is similar to a φS symmetric matrix so that
2.4.12(3) is satisfied for any choice of X1 , X2 , and R that satisfy 2.4.12(1,
2). In this case the question asked at the beginning of this section has an
affirmative answer. However, for each even n and each S ∈ Sn− there is some
A ∈ Mn such that AφS (A) is similar to φS (A)A but A does not have a φS
polar decomposition.
49

2.4.3 The Symmetric Case


Lemma 2.4.13 Let S ∈ Sn+ and A ∈ Mn be given. There exists a φS
symmetric Y ∈ Mn that is similar to A.

Proof Let B ∈ Mn be symmetric and similar to A, let S1 be any symmetric


square root of S, and let C ≡ S1 BS1−1 . Then C is similar to A and
φS (C) = SC T S −1 = S(S1 BS1−1 )T S −1 = S12 (S1−1 BS1 )S1−2 = S1 BS1−1 =
C, so that C is φS symmetric.

Theorem 2.4.14 Let S ∈ Sn+ and a φS symmetric Y ∈ Mn be given, and


suppose Y = A2 for some A ∈ Mn . Then there exists a φS symmetric Y1 that
satisfies Y12 = Y and Y1 is similar to A.

Proof Lemma 2.4.13 guarantees that there exists a φS symmetric B that


is similar to A. Then B 2 (which is φS symmetric by Lemma 2.4.4(4))
is similar to A2 = Y . Corollary 2.4.9 guarantees that there exists a
φS orthogonal X ∈ Mn such that Y = XB 2 X −1 = (XBX −1 )2 . Then
XBX −1 = XBφS (X) is φS symmetric and is similar to A.

Lemmata 2.4.12, 2.4.10 and 2.4.13 now imply the following result about
φS polar decomposition in the symmetric case, which is a generalization of
Theorem 2.3.2.

Theorem 2.4.15 Let S ∈ Sn+ and A ∈ Mn be given. The following are


equivalent:

(1) (A, φS (A)) ∼ (φS (A), A).

(2) AφS (A) is similar to φS (A)A.

(3) A = X1 φS (A)X2 for some φS orthogonal X1 , X2 ∈ Mn .

(4) A = XY for some φS orthogonal X ∈ Mn and φS symmetric Y ∈ Mn .

(5) A = XφS (A)X for some φS orthogonal X ∈ Mn .


50

2.4.4 The Skew-Symmetric Case


Let an even integer n ≥ 2, S ∈ Sn− , and a φS symmetric Y ∈ Mn be
given. Then Y = φS (Y ) = SY T S −1 , so Y S = SY T = −S T Y T = −(Y S)T .
The skew-symmetric matrix Y S, and hence also Y itself, must have an even
rank. Thus, if n is even, no A ∈ Mn with odd rank can have a φS polar
decomposition.
Let an even integer n ≥ 2 and S ∈ Sn− be given. Let X ∈ Mn be
nonsingular and such that
" #
T 0 1
XSX = ⊕ S2 ,
−1 0

where S2 ∈ Sn−2 if n > 2 or is absent if n = 2. Consider B ≡ J2 (0) ⊕ 0 ∈ Mn ,
−1
and set A ≡ X BX. One computes
AφS (A) = (X −1 BX)S(X −1 BX)T S −1
= X −1 B(XSX T )B T X −T S −1 (X −1 X)
= X −1 B[(XSX T )B T (XSX T )−1 ]X
= X −1 (J2 (0) ⊕ 0n−2 )(−J2 (0) ⊕ 0n−2 )X
= 0.

Similarly, φS (A)A = 0. Hence, AφS (A) is similar (in fact, equal) to φS (A)A,
but rank A = 1, so A does not have a φS polar decomposition.
It is an open question to characterize the even-rank matrices with even
dimensions that have a φS polar decomposition for a given S ∈ Sn− .

2.5 A∗ and A
It is natural to ask what the analogs of Lemma 2.3.1 are for the mappings
A → A or A → A∗ . Although these mappings satisfy some of the conditions
2.4.2, they are not linear and do not preserve the spectrum. For a given
A, C ∈ Mm,n , when is (A, A∗ ) ∼ (C, C ∗ )? If m = n, when is (A, A) ∼ (C, C)?
The first case is immediate, and we present it without proof.

Theorem 2.5.1 Let A, C ∈ Mm,n be given. Then the following are equiva-
lent:
(1) (A, A∗ ) ∼ (C, C ∗ ).
51

(2) AA∗ is similar to CC ∗ .

(3) A and C have the same singular values, including multiplicities.

(4) (A, A∗ ) ∼ (C, C ∗ ) via unitary matrices, that is, A = U CV ∗ for some
unitary U ∈ Mn and V ∈ Mm .
−1
Recall that A, C ∈ Mn are consimilar if A = SCS for some nonsingu-
lar S ∈ Mn . We now apply the analysis of Lemma 2.3.1 to the conjugate
mapping.

Theorem 2.5.2 Let A, C ∈ Mn be given. Then (A, A) ∼ (C, C) if and only


−1
if there exists a nonsingular S ∈ Mn such that A = SCS , that is, if and
only if A is consimilar to C.

Proof Suppose (A, A) ∼ (C, C). Then there exist nonsingular X, Y ∈ Mn


such that A = XCY −1 and A = Y CX −1 . Hence XCY −1 = A =
−1 −1 −1
A = Y CX . It follows that ZC = CZ , where Z ≡ Y X. Now,
Theorem (6.4.20) of [HJ2] guarantees that there is a primary matrix
−1
function log X such that log(Z ) = −(log Z) and elog Z = Z. If we
1 −1 1 1 −1
set F ≡ e 2 log Z , then F 2 = Z and F = e− 2 log Z = e 2 log Z , so there
−1 −1
is a single polynomial p(t) such that F = p(Z) and F = p(Z ).
−1 −1
Thus, F C = p(Z)C = Cp(Z ) = CF and so A = XCY −1 =
−1 −1
Y (Y X)CY −1 = Y ZCY −1 = Y F 2 CY −1 = SCS , where S ≡ Y F .
Therefore, A and C are consimilar. The converse is easily verified.

Corollary 2.2.7 and Theorem 2.5.2 now proves the following, which is
Theorem (4.1) in [HH1].

Corollary 2.5.3 Let A, C ∈ Mn be given. Then A is consimilar to C if and


only if the following two conditions are satisfied:

(1) AA is similar to CC, and

(2) rank (AA)k A = rank (CC)k C for all k = 0, 1, . . . , n.


52

2.6 The Nilpotent Parts of AB and BA


Let integers m, n be given with m ≥ n, and let A ∈ Mm,n and B ∈ Mn,m be
given. It is known (Theorem (1.3.20) of [HJ1]) that the nonsingular Jordan
blocks in the Jordan canonical forms of AB and BA are the same; this also
follows from Lemma 2.2.2. What can be said about the nilpotent Jordan
blocks of AB and BA? The statement of the following result is due to
Flanders [F] (see [T] for a different approach); we give a new proof based on
the canonical form in Theorem 2.2.4.

Theorem 2.6.1 Let positive integers m, n be given with m ≥ n, let A ∈


Mm,n and B ∈ Mn,m be given, and let k ≡ rank (AB)m . There exist integers
m1 ≥ · · · ≥ mp ≥ 1 and n1 ≥ · · · ≥ np ≥ 1, and nonsingular X ∈ Mm and
Y ∈ Mn such that

X(AB)X −1 = J(AB) ⊕ Jm1 (0) ⊕ · · · ⊕ Jmp (0) ⊕ 0

and
Y (BA)Y −1 = J(AB) ⊕ Jn1 (0) ⊕ · · · ⊕ Jnp (0) ⊕ 0,
where J(AB) ∈ Mk is the nonsingular part of the Jordan canonical form of
AB, and |mi − ni | ≤ 1 for each i = 1, . . . , p. Conversely, let integers m1 ≥
· · · ≥ mp ≥ 1 and n1 ≥ · · · ≥ np ≥ 1 be given, and set m ≡ m1 + · · · + mp
and n ≡ n1 + · · · + np . Suppose that |mi − ni | ≤ 1 for each i = 1, . . . , p. Then
there exist A ∈ Mm,n and B ∈ Mn,m such that

AB is similar to Jm1 (0) ⊕ · · · ⊕ Jmp (0)

and
BA is similar to Jn1 (0) ⊕ · · · ⊕ Jnp (0).

Proof Theorem 2.2.4 ensures that there is an integer p ≥ 0 and nonsingular


X ∈ Mm and Y ∈ Mn such that
 
JA 0 ··· 0 0
 
 0 A1 ··· 0 0 
 
XAY −1

= .. .. ... .. ..

 . . . .
 0 0 · · · Ap 0 
 
0 0 0 0 0
53

and  
JB 0 ··· 0 0
 
 0 B1 ··· 0 0 
 
Y BX −1

= .. .. ... .. ..
,
 . . . .
 0 0 · · · Bp 0 
 
0 0 ··· 0 0
where JA , JB ∈ Mk are nonsingular; each
T T T T
(Ai , Bi ) ∈ {(Imi , Jm i
(0)), (Jm i
(0), Imi ), (Hmi , Km i
), (Km i
, Hmi )},

and m1 ≥ · · · ≥ mp ≥ 1. Hence, X(AB)X −1 = JA JB ⊕ A1 B1 ⊕ · · · ⊕


Ap Bp ⊕ 0, and Y (BA)Y −1 = JB JA ⊕ B1 A1 ⊕ · · · ⊕ Bp Ap ⊕ 0. Since
JA and JB ∈ Mk are nonsingular, JA JB is similar to JB JA and both
T
products are similar to J(AB). For all i such that Ai ∈ {Im1 , Jm 1
(0)},
T T
we have Ai Bi = Bi Ai = Jm1 (0); if Ai = Hm1 , then Ai Bi = Jm1 −1 (0)
T T T
and Bi Ai = Jm i
(0); and if Ai = Km 1
, then Ai Bi = Jm 1
(0) and Bi Ai =
T
Jm1 −1 (0). Notice that all of the sizes of the corresponding nilpotent
Jordan blocks of AB and BA just enumerated differ by at most one.
Repetition of this enumeration for m2 , . . . , mp gives the asserted result.
For the converse, we look at the three possibilities for each i = 1, . . . , p:
T
If mi = ni , set Ai = Imi and Bi = Jni (0); if mi = ni + 1, set Ai = Km i
T
and Bi = Hmi ; and if mi = ni − 1, set Ai = Hmi and Bi = Kmi . Then
   
A1 · · · 0 B1 · · · 0
 . . .   . . . . .. 
A≡
 .. . . ..  
 and B ≡  .. . 

0 · · · Ap 0 · · · Bp

have the asserted properties.

We can also use the results in Section 2.2 to give a different proof for
Theorem (3) in [F] (see also [PM]).

Theorem 2.6.2 Let A ∈ Mm,n , B ∈ Mn,m , and a nilpotent N ∈ Mm be


given. Suppose that N A = 0. Then the Jordan canonical forms of AB and
AB + N have the same nonsingular parts, and these two matrices have the
same set of eigenvalues.
54

Proof Lemma 2.2.2 guarantees that there exist nonsingular X ∈ Mm and


Y ∈ Mn such that
" # " #
−1 Ik 0 −1 J(AB) 0
XAY = and Y BX =
0 A 0 B

where J(AB) ∈ Mk is the nonsingular part of the Jordan canonical


form of AB and AB ∈ Mm−k is nilpotent. Partition XN X −1 as
" #
−1 N11 N12
XN X ≡ , where N11 ∈ Mk and N22 ∈ Mm−k .
N21 N22

Notice that
" #
−1 −1 −1 N11 N12 A
0 = N A = XN AY = (XN X )(XAY )=
N21 N22 A

so that N11 = 0, N21 = 0, N12 A = 0 and N22 A = 0. Thus,


" #
−1 0 N12
XN X ≡ ,
0 N22
m−k
and the nilpotence of N22 follows from that of N . Note that N22 =
m−k r t
(AB) = 0, so any product of the form (AB) N22 vanishes when
max {r, t} ≥ m − k. Since N22 A = 0, we have

(AB + N22 )l = (AB)l + (AB)l−1 N22 + · · · + (AB)N22


l−1 l
+ N22

for every l = 1, 2, . . .; for l = 2(m−k), each of these summands vanishes.


Thus AB + N22 is nilpotent. Since the spectra of J(AB) and AB + N22
are disjoint, it follows (see Problem (10) in Section (2.4) of [HJ1]) that
" #
−1 J(AB) N12
X(AB + N )X = (2.6.33)
0 AB + N22

is similar to J(AB) ⊕ (AB + N22 ), whose nonsingular part is J(AB),


as asserted. The assertion about the spectra of AB and AB + N is
apparent from (2.6.33).
55

2.7 A Sufficient Condition For Existence of


A Square Root
It is known that for any A ∈ Mn , AA is similar to the square of a real matrix
(Corollary (4.10) of [HH1]), and we have seen that AAT has a square root
whenever AAT is similar to AT A (since A = QS and AAT = (QSQT )2 in
this case). Our goal in this section is to show how the canonical form for
contragredient equivalence leads to a sufficient condition for existence of a
square root that encompasses both of these observations.

Lemma 2.7.1 Let an integer n > 1, A ∈ Mn−1,n , and B ∈ Mn,n−1 be given.


Suppose that BA is similar to Jn (0). Then

(1) AB is similar to Jn−1 (0), and


(
k k n−k−1 for k = 0, . . . , n − 1
(2) rank A(BA) = rank B(AB) =
0 for k ≥ n.

Proof The first assertion is an immediate consequence of Flanders’ The-


orem 2.6.1, but it is easy to give a direct proof: Since (BA)n = 0
and (AB)n+1 = A(BA)n B = 0, we see that AB is nilpotent, so
rank BA < n − 1. But n − 2 = rank (BA)2 ≤ rank AB < n − 1,
so rank AB = n − 2, which ensures that AB is similar to Jn−1 (0).
Using (1), we have

n − k − 1 = n − (k + 1) = rank (BA)k+1
= rank B(AB)k A
≤ rank B(AB)k
≤ rank (AB)k
= (n − 1) − k = n − k − 1

and
n−k−1 = rank (BA)k+1
= rank BA(BA)k
≤ rank A(BA)k
= rank (AB)k A
≤ rank (AB)k
= n − k − 1,
56

so rank B(AB)k = rank A(BA)k = n − k − 1 for k = 0, 1, . . . , n − 1.


Since (AB)n−1 = 0 and (BA)n = 0, rank B(AB)k = rank A(BA)k = 0
for all k ≥ n.

Let m ≥ n be given, and let A ∈ Mm,n and B ∈ Mn,m be given. Corol-


lary 2.2.8 guarantees that there exist nonsingular X ∈ Mm and Y ∈ Mn such
that
   
JA 0 0 0 JB 0 0 0
 0 A1 0 0   0 B1 0 0 
   
XAY −1 =   and Y BX −1 =  
 0 0 A2 0   0 0 B2 0 
0 0 0 0 0 0 0 0

where JA , JB ∈ Mk are nonsingular and JA JB is similar to the nonsingular


part of the Jordan canonical form of AB; A1 and B1 contain all canonical
T
blocks of the form Imi and Jm i
(0), that is

A1 ≡ [Im1 ⊕ JαT1 (0)] ⊕ · · · ⊕ [Imt ⊕ JαTt (0)]

and
T T
B1 ≡ [Jm 1
(0) ⊕ Iα1 ] ⊕ · · · ⊕ [Jm t
(0) ⊕ Iαt ]
with m1 ≥ · · · ≥ mt ≥ 0 and α1 ≥ · · · ≥ αt ≥ 0 (if all mi = αi = 0, then A1
and B1 are absent); and A2 and B2 contain all the canonical blocks of the
T
form Hmi and Km i
, that is,
" # " #
Hn1 0 Hns 0
A2 ≡ ⊕ ··· ⊕
0 KβT1 0 KβTs

and " # " #


KnT1 0 KnTs 0
B2 ≡ ⊕ ··· ⊕
0 Hβ1 0 Hβs
with n1 ≥ · · · ≥ ns ≥ 0 and β1 ≥ · · · ≥ βs ≥ 0 (if all nj = βj = 0, then A2
and B2 are absent).
Since Hp ∈ Mp−1,p and KpT ∈ Mp,p−1 satisfy Hp KpT = JpT (0), it follows
from Lemma 2.7.1(2) (or from inspection of the respective products) that
rank Hp (KpT Hp )k = rank KpT (Hp KpT )k for all k ≥ 0. Thus, we always have

rank A2 (B2 A2 )k = rank B2 (A2 B2 )k for all k = 0, 1, 2, . . .


57

without any assumptions on A or B.


If we now assume that

rank A(BA)k = rank B(AB)k for all k = 0, 1, 2, . . . ,

then it follows that

rank A1 (B1 A1 )k = rank B1 (A1 B1 )k for all k = 0, 1, 2, . . . . (2.7.34)

If α1 > m1 , take k = α1 − 1 ≥ m1 , so JαT1 (0)k 6= 0, JαTi (0)k+1 = 0 and


Jmi (0)k = 0 for all i = 1, . . . , t and hence

A1 (B1 A1 )k = [Jm
T
1
(0)k ⊕ JαT1 (0)k+1 ] ⊕ · · · ⊕ [Jm
T
t
(0)k ⊕ JαTt (0)k+1 ] = 0

and
B1 (A1 B1 )k = [JmT
1
(0)k ⊕ JαT1 (0)k ] ⊕ · · · ⊕ [Jm
T
t
(0)k ⊕ JαTt (0)k ]
= [0 ⊕ JαT1 (0)k ] ⊕ · · · ⊕ [0 ⊕ JαTt (0)k ] 6= 0.

Since this contradicts (2.7.34), we must have α1 ≤ m1 . A symmetric argu-


ment, taking k = m1 −1, shows that m1 ≤ α1 . Hence m1 = α1 . Repetition of
this argument shows that mi = αi for all i = 2, . . . , t. Conversely, if αi = mi
for all i, then rank A(BA)k = rank B(AB)k for all k = 0, 1, 2, . . .. We sum-
marize what we have learned in the following result, which also plays a key
role in identifying a canonical form under orthogonal equivalence.

Lemma 2.7.2 Let positive integers m, n be given with m ≥ n, and let A ∈


Mm,n and B ∈ Mn,m be given. There exist nonsingular X ∈ Mm and Y ∈ Mn
such that
   
JA 0 0 0 JB 0 0 0
 0 A1 0 0   0 B1 0 0 
   
XAY −1 =   and Y BX −1 =  ,
 0 0 A2 0   0 0 B2 0 
0 0 0 0 0 0 0 0

where JA , JB ∈ Mk are nonsingular and JA JB is similar to the nonsingular


part of the Jordan canonical form of AB,
T T
A1 = [Im1 ⊕ Jm 1
(0)] ⊕ · · · ⊕ [Imt ⊕ Jm t
(0)], (2.7.35)
T T
B1 = [Jm 1
(0) ⊕ Im1 ] ⊕ · · · ⊕ [Jm t
(0) ⊕ Imt ], (2.7.36)
58

" # " #
Hn1 0 Hns 0
A2 ≡ ⊕ ··· ⊕ ,
0 KβT1 0 KβTs
and " # " #
KnT1 0 KnTs 0
B2 ≡ ⊕ ··· ⊕
0 Hβ1 0 Hβs
if and only if rank A(BA)k = rank B(AB)k for all k = 0, 1, 2, . . ..

Notice that A1 B1 = B1 A1 . Thus, we always have

rank (A1 B1 )k = rank (B1 A1 )k for all k = 1, 2, 3, . . .

without any assumptions on A or B.


If we now assume that

rank (AB)k = rank (BA)k for all k = 1, 2, 3, . . . ,

then it follows that

rank (A2 B2 )k = rank (B2 A2 )k for all k = 1, 2, 3, . . . .

An argument similar to the proof of Lemma 2.7.2 now shows that nj = βj


for j = 1, . . . , s. Conversely, if nj = βj for each j, then rank (AB)k =
rank (BA)k for all k = 1, 2, 3, . . .. Thus, we have the following complement
to Lemma 2.7.3.

Lemma 2.7.3 Let positive integers m, n be given with m ≥ n, and let A ∈


Mm,n and B ∈ Mn,m be given. There exist nonsingular X ∈ Mm and Y ∈ Mn
such that
   
JA 0 0 0 JB 0 0 0
 0 A1 0 0   0 B1 0 0 
XAY −1 = 

 
 and Y BX −1 = 

,
 0 0 A2 0   0 0 B2 0 
0 0 0 0 0 0 0 0

where JA , JB ∈ Mk are nonsingular and JA JB is similar to the nonsingular


part of the Jordan canonical form of AB,

A1 = [Im1 ⊕ JβT1 (0)] ⊕ · · · ⊕ [Imt ⊕ JβTt (0)],


59

T T
B1 = [Jm 1
(0) ⊕ Iβ1 ] ⊕ · · · ⊕ [Jm t
(0) ⊕ Iβt ],
" # " #
Hn1 0 Hns 0
A2 = ⊕ ··· ⊕ , (2.7.37)
0 KnT1 0 KnTs
and " # " #
KnT1 0 KnTs 0
B2 = ⊕ ··· ⊕ (2.7.38)
0 Hn1 0 Hns
if and only if rank (AB)k = rank (BA)k for all k = 1, 2, 3, . . ..
Theorem 2.7.4 Let A ∈ Mm,n and B ∈ Mn,m be given. Suppose that for
every integer k ≥ 0, we have
(1) rank A(BA)k = rank B(AB)k , and
(2) rank (BA)k+1 = rank (AB)k+1 .
Then AB and BA have square roots.
Proof Suppose m ≥ n. Then AB is similar to J(AB) ⊕ A1 B1 ⊕ A2 B2 ⊕ 0,
in which J(AB) (if present) is nonsingular and hence has a square
T T
root. Lemma 2.7.2 guarantees that A1 B1 = [Jm 1
(0) ⊕ Jm 1
(0)] ⊕ · · · ⊕
T T
[Jm t
(0)⊕Jmt (0)], and hence has a square root. Lemma 2.7.3 guarantees
T T T T
that A2 B2 = [Jn1 −1 (0)⊕ Jn1 (0)] ⊕· · · ⊕ [Jns −1 (0) ⊕Jns (0)], so that A2 B2
also has a square root. It follows that AB has a square root. Similarly,
BA has a square root.
The conditions of Theorem 2.7.4, though sufficient, are not necessary for
the existence of a square root of AB. The example A1 ≡ I2k and B1 ≡
Jk (0) ⊕ Jk (0) shows that condition (1) need not be satisfied (although (2) is).
The example  
" # 0 0
1 0 0 0  1 0 
 
A2 ≡ and B2 ≡  
0 0 1 0  0 0 
0 1
shows that condition (2) need not be satisfied (although (1) is). Moreover,
the example " # " #
A1 0 B1 0
A≡ and B ≡
0 A2 0 B2
shows that neither condition need be satisfied.
60

Corollary 2.7.5 Let A ∈ Mn be given.


(1) AA has a square root.
(2) Let S ∈ Sn be given. If φS (A)A is similar to AφS (A), then φS (A)A has
a square root. In particular, if AT A is similar to AAT , then AT A has
a square root.

Proof Since rank X = rank X = rank φS (X) for any X ∈ Mn and any
S ∈ Sn , one checks that the conditions of Theorem 2.7.4 are satisfied
in each case.

The sufficient condition in Corollary 2.7.5 is not necessary. The example


" #
1 0
A≡
i 0

shows that AT A can have a square root without being similar to AAT .

2.8 A Canonical Form for


Complex Orthogonal Equivalence
For a given A ∈ Mm,n , what standard form can be achieved by Q1 AQ2 for
complex orthogonal Q1 ∈ Mm and Q2 ∈ Mn ? In our search for a standard
form, we are guided by the following facts and observations:
• If m = n and A = QS is a QS decomposition of A, then QT A = S is
symmetric and AAT is orthogonally similar to S 2 , so we seek a standard
form that is as much as possible like a symmetric matrix whose square
is similar to AAT .
• Lemma 2.3.1 ensures that Q1 AQ2 = C if and only if (A, AT ) ∼ (C, C T ),
so we may approach our standard form via a sequence of contragredient
equivalences applied to the pair (A, AT ).
• rank A(AT A)k = rank (A(AT A)k )T = rank AT (AAT )k for all k =
0, 1, 2, . . ., so Lemma 2.7.2 ensures that

(A, AT ) ∼ ([JA ⊕ A1 ⊕ A2 ⊕ 0], [JAT ⊕ B1 ⊕ B2 ⊕ 0])


61

in which JA JAT is similar to the nonsingular part of the Jordan canon-


ical form of AAT and all the direct summands in
T T
A1 = [Im1 ⊕ Jm 1
(0)] ⊕ · · · ⊕ [Imt ⊕ Jm t
(0)]

and
T T
B1 = [Jm 1
(0) ⊕ Im1 ] ⊕ · · · ⊕ [Jm t
(0) ⊕ Im1 ],
m1 ≥ · · · ≥ mt ≥ 1, are paired in sizes. It may not be possible to write
" # " #
Hr1 0 Hrq 0
A2 = ⊕ ··· ⊕ (2.8.39)
0 KsT1 0 KsTq

and " # " #


KrT1 0 KrTq 0
B2 = ⊕ ··· ⊕ (2.8.40)
0 Hs1 0 Hsq
in such a way that all ri = si ; the blocks Kk and Hk are defined in
(2.2.14) and satisfy the identities in Theorem 2.2.4(g). But we may
select the blocks (if any) for which the sizes can be paired, and may
write " # " #
Hj1 0 Hjs 0
A2 = ⊕ ···⊕ ⊕ A3
0 KjT1 0 KjTs
and " # " #
KjT1 0 KjTs 0
B2 = ⊕ ··· ⊕ ⊕ B3 ,
0 Hj1 0 Hjs
in which
" # " #
Hn1 0 Hnp 0
A3 = ⊕ ··· ⊕ (2.8.41)
0 KβT1 0 KβTp

and " # " #


KnT1 0 KnTp 0
B3 = ⊕ ··· ⊕ (2.8.42)
0 Hβ1 0 Hβp
and ni 6= βk for all i, k = 1, . . . , p.
The preceding comments motivate the following analyses of standard
forms for
(a) (JA , JAT ),
62

(b) ([Ik ⊕ JkT (0)], [JkT (0) ⊕ Ik ]),


Ã" # " #!
Hk 0 KkT 0
(c) , , and
0 KkT 0 Hk
Ã" # " #!
Hk 0 KkT 0
(d) , with k 6= l.
0 KlT 0 Hl

Proposition 2.8.1 Let A, B ∈ Mk be nonsingular. Then there exists a sym-


metric S ∈ Mk such that (A, B) ∼ (S, S).

Proof Let S1 ∈ Mk be a symmetric matrix that is similar to AB. Let S be


any polynomial square root of S1 . Then S is symmetric and S 2 = S1
is similar to AB. Lemma 2.2.2 guarantees that (A, B) ∼ (S, S).

Definition 2.8.2 Let λ ∈ C and a positive integer k be given. Then


   
λ 1 0 ··· 0 0 · · · 0 −1 0 
 ···  
 1 λ 1 0   0 · · · −1 0 1 
 ...   

Sk (λ) ≡  0 1 λ .. 
 + i
.. . 0 1 0   ∈ Mk .
 .  .
 .... . . ... .. .. 
 . . . 1 


 −1 . . . . 
0 0 ··· 1 λ 0 1 · · · 0 0

It is known that the symmetric matrix Sk (λ) is similar to the Jordan


block Jk (λ) (see pp. 207—209 of [HJ1]).

Proposition 2.8.3 For a given positive integer k,

([Ik ⊕ JkT (0)], [JkT (0) ⊕ Ik ]) ∼ (S2k (0), S2k (0)).

Proof Write Ak ≡ Ik ⊕ JkT (0) and Bk ≡ JkT (0) ⊕ Ik . Notice that Ak Bk =


Bk Ak = JkT (0) ⊕ JkT (0) is similar to J2k (0)2 , which is similar to S2k (0)2 .
One checks that
rank Bk (Ak Bk )l = rank Ak (Bk Ak )l
= rank S2k (0)2l+1
= rank Jk (0)l+1 ⊕ Jk (0)l
= 2k − 2l − 1 for all l = 0, 1, 2, . . . .

Corollary 2.2.7 now guarantees that (Ak , Bk ) ∼ (S2k (0), S2k (0)).
63

Proposition 2.8.4 For a given positive integer k,


Ã" # " #!
Hk 0 KkT 0
, ∼ (S2k−1 (0), S2k−1 (0)).
0 KkT 0 Hk

Proof Let " # " #


Hk 0 KkT 0
Ak ≡ , and Bk ≡ .
0 KkT 0 Hk
T
Notice that Ak Bk = Jk−1 (0) ⊕ JkT (0) and Bk Ak = JkT (0) ⊕ Jk−1
T
(0),
2
so that Ak Bk and Bk Ak are similar to J2k−1 (0) , which is similar to
S2k−1 (0)2 . One checks that

rank Bk (Ak Bk )l = rank Ak (Bk Ak )l


= rank S2k−1 (0)2l+1
= rank Hk (JkT (0))l ⊕ KkT (Jk−1 (0))l
= 2k − 2l − 2 for all l = 0, 1, 2, . . . .

Corollary 2.2.7 now guarantees that (Ak , Bk ) ∼ (S2k−1 (0), S2k−1 (0)).

We have now analysed all of the basic blocks except those in (2.8.41) and
(2.8.42). If A3 and B3 are present, there is no hope of finding a symmetric S
such that (A3 , B3 ) ∼ (S, S) because there must be at least one k for which
rank A3 (B3 A3 )k 6= rank B3 (A3 B3 )k . Notice that it is precisely the presence
of A3 and B3 that is the obstruction to writing A = QS when m = n. The
next two results will permit us to handle this remaining case.

Lemma 2.8.5 Let k be a given integer with 1 ≤ k < n. There exists a


C ∈ Mk−1,n such that C T C = Sk (0) ⊕ 0.

Proof The symmetric n-by-n matrix Sk (0) ⊕ 0 ∈ Mn has a singular value


decomposition Sk (0) ⊕ 0 ≡ U T Σ2 U (see Corollary (4.4.4) of [HJ1]),
where U ∈ Mn is unitary and Σ ≡ diag (σ1 , . . . , σk−1 , 0, . . . , 0) ∈ Mn
with σ1 ≥ · · · ≥ σk−1 > 0. Now let Σ1 ≡ diag (σ1 , . . . , σk−1 ) ∈ Mk−1
and set C ≡ [Σ1 0]U ∈ Mk−1,n . Then C T C = U T Σ2 U = Sk (0) ⊕ 0, as
desired.
64

Proposition 2.8.6 Let k be a given positive integer. There exists a C ∈


Mk−1,k such that C T C = Sk (0) and (C, C T ) ∼ (Hk , KkT ).
Proof Use Lemma 2.8.5 to construct a C ∈ Mk−1,k such that C T C = Sk (0),
which is similar to KkT Hk = JkT (0). Lemma 2.7.1 ensures that CC T
and Hk KkT are similar to Jk−1 (0), and that
rank C(C T C)l = rank C T (CC T )l
= rank Hk (KkT Hk )l
= rank KkT (Hk KkT )l
= k − l − 1 for all l = 0, 1, 2, . . . .

Corollary 2.2.7 guarantees that (Hk , KkT ) ∼ (C, C T ).


Assembling the preceding results, we can now state a canonical form
under orthogonal equivalence.
Theorem 2.8.7 Let positive integers m, n be given with m ≥ n, and let
A ∈ Mm,n be given. There exist orthogonal Q1 ∈ Mm and Q2 ∈ Mn such that
" #
S 0
Q1 AQ2 = (2.8.43)
0 C
where S ≡ S1 ⊕ S2 ⊕ S3 is symmetric,
(a1) S1 is symmetric and S12 is similar to the nonsingular part of the Jordan
canonical form of AAT ,
(a2) S2 ≡ S2m1 (0) ⊕ · · · ⊕ S2mt (0),
(a3) S3 ≡ S2j1 −1 (0) ⊕ · · · ⊕ S2js −1 (0),
and  

C1 0 · · · 0 0 

 0 D1 . . . ... ..
.


 .. . . . 
 . . . . . . . ... 
C≡

 .. .. .

,

 . . . . Cp 0 
 
 0 0 · · · 0 Dp 
 
0 0 ··· 0 0
in which
65

(b1) Ci ∈ Mni −1,ni has rank ni − 1, CiT Ci = Sni (0), and Ci CiT is similar to
Sni −1 (0) for each i ,
(b2) Dj ∈ Mβj ,βj −1 has rank βj − 1, Dj DjT = Sβj (0), and DjT Dj is similar
to Sβj −1 (0) for each j , and

(b3) ni 6= βj for all i and j.

Proof One checks that (A3 , B3 ) ∼ (C, C T ), so that (A, AT ) ∼ (R, RT ), where
" #
S 0
R≡ .
0 C

Lemma 2.3.1 guarantees that there exist orthogonal Q1 ∈ Mm and


Q2 ∈ Mn such that R = Q1 AQ2 .

Lemma 2.8.8 Let positive integers m, n be given with m ≥ n, and let A ∈


Mm,n be given. Let orthogonal Q1 ∈ Mm and Q2 ∈ Mn be such that Q1 AQ2
has the form (2.8.43). Then
(1) AT A is diagonalizable if and only if the following four conditions hold:
(1a) S1 is diagonalizable and may be taken to be diagonal,
(1b) S2 = S2 (0) ⊕ · · · ⊕ S2 (0), or is absent,
(1c) S3 = 0, or is absent, and
(1d) C may have some Dj ∈ M2,1 but C does not have Ci .

(2) rank A = rank AT A if and only if the following three conditions hold:
(2a) S2 is absent,
(2b) S3 = 0, or is absent, and
(2c) C may have some Ci but C does not have Dj .
(3) rank (AT A)k = rank (AAT )k for all k = 1, 2, 3, . . . if and only if C = 0.

Proof To show (1), notice that AT A is diagonalizable if and only if the


nonsingular part of the Jordan canonical form of AT A is diagonalizable
and the singular part of the Jordan canonical form of AT A is equal to
0. These conditions are equivalent to conditions (1a) - (1d).
66

Suppose rank A = rank AT A. Then rank S + rank C = rank S 2 +


rank C T C. Since rank S ≥ rank S 2 and rank C ≥ rank C T C, we must
have
rank S = rank S 2 and rank C = rank C T C.
It follows that S2 is absent, and S3 is either absent or is equal to 0.
Now, rank Ci = rank CiT Ci = ni − 1 for each i and rank Dj = βj − 1 >
βj − 2 = rank DjT Dj for each j. Hence, C may contain some Ci but not
Dj . The converse can be verified easily.
To prove (3), notice that rank X = rank X T for all X ∈ Mm,n . Hence,
Lemmata 2.7.2 and 2.7.3 ensure that rank (AT A)k = rank (AAT )k for
all k = 1, 2, 3, . . . if and only if A3 = 0 in (2.8.41) and B = 0 in (2.8.42).
This condition is equivalent to C = 0.
We give a new proof for the following analog of the singular value decom-
position, which is Theorem (2) of [CH2].
Corollary 2.8.9 Let A ∈ Mm,n be given. Then A can be written as A =
Q1 ΣQ2 where Q1 ∈ Mm and Q2 ∈ Mn are orthogonal and Σ ≡ [σij ] ∈ Mm,n
is such that σij = 0 for i 6= j if and only if
(1) AT A is diagonalizable, and
(2) rank A = rank AT A.
Proof The forward assertion can be verified easily. For the converse, suppose
m ≥ n. Let orthogonal Q1 ∈ Mm and Q2 ∈ Mn be such that Q1 AQ2
has the form (2.8.43). It follows from Lemma 2.8.8 that S2 = 0 (or is
absent), S3 = 0 (or is absent), and S1 may be taken to be diagonal.
Now, Lemma 2.8.8(1d) and (2c) show that C does not have Ci and Dj .
Hence, C = 0, or is absent. Hence Q1 AQ2 ≡ [σij ] satisfies σij = 0 for
i 6= j, as desired.
The following (problem (34) on p. 488 of [HJ2]) is a generalization of the
QS decomposition in the non-square case.
Corollary 2.8.10 Let integers m, n be given with m ≥ n, and let A ∈ Mm,n
be given. There exist a Q ∈ Mm,n with QT Q = In , and a symmetric S ∈ Mn
such that A = QS if and only if rank (AT A)k = rank (AAT )k for all k =
1, 2, 3, . . ..
67

Proof Suppose A = QS with Q ∈ Mm,n , QT Q = In , and a symmetric Y ∈


Mn . Let A1 ≡ [A 0] ∈ Mm , so that rank (AT1 A1 )k = rank (AT A)k and
rank (A1 AT1 )k = rank (AAT )k for all k ≥ 1. Theorem (2.7) of [CH1]
guarantees that there exists P ≡ [Q R] ∈ Mm such that P T P = Im .
Now,
" # " #
S 0 S 0
A1 = [A 0] = [QS 0] = [Q R] =P
0 0 0 0

is a QS decomposition of A1 . Hence, rank (AT A)k = rank (AT1 A1 )k =


rank (A1 AT1 )k = rank (AAT )k for all k = 1, 2, 3, . . .. To prove the
converse, let orthogonal matrices Q1 ∈ Mm and Q2 ∈ Mn be such that
Q1 AQ2 has the form (2.8.43). Lemma 2.8.8(3) ensures that C = 0, so
that " # " #
S 0 In
Q1 AQ2 = = (S ⊕ 0) ≡ P1 Z1 ,
0 0 0
where P1 ≡ [In 0]T ∈ Mm,n and Z1 ≡ S ⊕ 0 ∈ Mn . Hence, A =
QT1 P1 Z1 QT2 = QZ, where Z = Q2 Z1 QT2 ∈ Mn is symmetric (since Z1
is), and Q ≡ QT1 P1 QT2 ∈ Mm,n . Now, QT Q = Q2 P1T P1 QT2 = Q2 In QT2 =
In , as asserted.
Chapter 3

Linear Operators Preserving


Orthogonal Equivalence on
Matrices

68
69

3.1 Introduction and Statement of Result


Let Mm,n be the set of all m-by-n matrices with complex entries; and we
write Mn ≡ Mn,n . A matrix Q ∈ Mn is said to be (complex) orthogonal if
QT Q = I. Two matrices A, B ∈ Mm,n are said to be (complex) orthogonally
equivalent, denoted A ∼ B, if A = Q1 BQ2 for some orthogonal matrices
Q1 ∈ Mm and Q2 ∈ Mn . One checks that ∼ is an equivalence relation on
Mm,n . We are interested in studying linear orthogonal equivalence preservers
on Mm,n , that is, T : Mm,n → Mm,n such that T (A) ∼ T (B) whenever
A ∼ B. We prove the following, which is our main result.

Theorem 3.1.1 Let T be a given linear operator on Mm,n . Then T preserves


orthogonal equivalence if and only if there exist complex orthogonal Q1 ∈ Mm
and Q2 ∈ Mn and a scalar α ∈ C such that either
(1) T (A) = αQ1 AQ2 for all A ∈ Mm,n , or
(2) m = n and T (A) = αQ1 AT Q2 for all A ∈ Mn .

We have divided the proof of the theorem into three parts. In Section 3.2,
we establish that either T = 0 or T is nonsingular. In Section 3.3, we show
that T has the form asserted in the theorem, except that Q1 and Q2 are
nonsingular, but are not necessarily orthogonal. Finally, in Section 3.4, the
theorem is proved.
Similar problems have been studied in [HHL] and [HLT], and we use their
general approach. We analyse the orbits

O(A) = {X ∈ Mm,n : X ∼ A}

and their corresponding tangent spaces TA . It is known that these orbits


are homogeneous differentiable manifolds [Bo]. As in [HHL] and [HLT], it
is necessary to develop some special techniques to supplement the general
approach in solving the problem. Unlike the relations considered in [HLT],
little is known about complex orthogonal equivalence and a simple canoni-
cal form is not available. This makes the problem more difficult and more
interesting. In fact, the results obtained in this paper may give more insight
into, and better understanding of, the orthogonal equivalence relation.
We denote the standard basis of Mm,n by {E11 , E12 , . . . , Em,n }. When
Eij ∈ Mn , we set Fij ≡ Eij − Eji and to avoid confusion, when Eij ∈ Mm , we
70

set Gij ≡ Eij − Eji . Notice that all Fij and Gij are skew-symmetric matrices.
We denote the standard basis of Cn by {e1 , . . . , en }. The n-by-n identity
matrix is denoted by I, or when necessary for clarity, by In . A vector x ∈ Cn
is said to be isotropic if xT x = 0.

3.2 Preliminary Results


The following observation is used repeatedly in our arguments.

Lemma 3.2.1 Let V be a subspace of Mm,n that is invariant under orthog-


onal equivalence, and let T : V → Mm,n be a linear transformation that
preserves orthogonal equivalence. Then

T (span O(A)) ⊂ span O(T (A)) and T (TA ) ⊂ TT (A)

for every A ∈ V. Consequently, if T is nonsingular, then

dim span O(A) ≤ dim span O(T (A)) and dim TA ≤ dim TT (A)

for every A ∈ V.

Lemma 3.2.2 Let X, Y ∈ Mm,n be given, let T be a given linear orthogonal


preserver on Mm,n , and suppose X ∈ TX . If Y 6∈ TY , then T (X) 6∈ O(Y ).

Proof Suppose T (X) = Q1 Y Q2 for some orthogonal Q1 ∈ Mm and Q2 ∈


Mn . Let T1 ≡ QT1 T QT2 . Then Y = QT1 T (X)QT2 = T1 (X) ∈ T1 (TX ) ⊂
TT1 (X) = TY .

Remark The argument used to prove the preceding lemma can be used to
obtain its conclusion under more general hypotheses: Let G be a given group
of nonsingular linear operators on Mm,n , and say A ∼ B if A = L(B) for some
L ∈ G. Let T be a given linear operator on Mm,n such that T (A) ∼ T (B)
whenever A ∼ B. If X ∈ TX and Y 6∈ TY , then T (X) 6= L(Y ) for all L ∈ G.

Lemma 3.2.3 Span O(A) = Mm,n for every nonzero A ∈ Mm,n .


71

Proof Let A ∈ Mm,n be a given nonzero matrix. There exists orthogonal (in
fact permutation matrices) P ∈ Mm and Q ∈ Mn such that the (1,1)-
entry of B = P AQ, say b11 , is nonzero. For a given positive integer k,
define the diagonal orthogonal matrix

Dk ≡ diag(1, −1, −1, . . . , −1) ∈ Mk .

Then 4b11 E11 = (Im + Dm )B(In + Dn ) = B + Dm B + BDn + Dm BDn ∈


span O(A), so E11 ∈ span O(A). Since there exist permutation (and
hence orthogonal) matrices Pi ∈ Mm and Pj ∈ Mn such that Eij =
Pi E11 Pj , every Eij ∈ span O(A) and hence span O(A) = Mm,n .

Lemma 3.2.4 Let T be a given linear operator on Mm,n . Suppose T pre-


serves orthogonal equivalence. Then, either T = 0 or T is nonsingular.

Proof Suppose that ker T contains a nonzero matrix A. By Lemma 3.2.1,


T (span O(A)) ⊂ span O(T (A)) = {0}. Now, Lemma 3.2.3 guarantees
that span O(A) = Mm,n . Hence, T = 0.

We use the following result, which is Lemma (1) in [CH1].

Lemma 3.2.5 Let X1 , X2 ∈ Mn,k with 1 ≤ k ≤ n. There exists a complex


orthogonal Q ∈ Mn such that X1 = QX2 if and only if the following two
conditions are satisfied:

(a) X1T X1 = X2T X2 , and

(b) There exists a nonsingular B ∈ Mn such that X1 = BX2 .

Note that if X1 , X2 ∈ Mn,k have full rank, Lemma 3.2.5 ensures that
there exists an orthogonal Q ∈ Mn such that X1 = QX2 if and only if
X1T X1 = X2T X2 . In particular, for n ≥ 2 and any given nonzero z ∈ Cn ,
there are two possibilities:

(a) If z T z 6= 0, then z = αQe1 for some orthogonal Q ∈ Mn and some


nonzero α ∈ C, and

(b) If z 6= 0 and z T z = 0, then z = Q(e1 +ie2 ) for some orthogonal Q ∈ Mn .


72

Let A ∈ Mm,n be given. Then O(A) = {Q1 AQ2 : Q1 ∈ Mm , Q2 ∈


Mn , QT1 Q1 = Im , and QT2 Q2 = In }. If B ∼ A, then AT A is similar to
B T B and AAT is similar to BB T . Suppose A has rank 1, so A = xy T
for some x ∈ Cm and y ∈ Cn . Then rank AT A = 0 or 1 according to
whether xT x is zero or nonzero, and rank AAT = 0 or 1 according to whether
y T y is zero or nonzero. Depending on the four possibilities for the pair
(rank AT A, rank AAT ), it follows that for some nonzero scalar α ∈ C, O(A)
contains exactly one of the following: (a) αE11 ; (b) α(E11 +iE12 ); (c) α(E11 +
iE21 ); or (d) α(E11 − E22 + iE12 + iE21 ).
The same reasoning leads immediately to the following lemma.

Lemma 3.2.6 Let E11 , E12 , E21 , E22 ∈ Mm,n be standard basis matrices, and
let E ≡ E11 − E22 + iE12 + iE21 . Then
(a) O(E11 ) = {q1 q2T : q1 ∈ Cm , q2 ∈ Cn , and q1T q1 = q2T q2 = 1}.

(b) O(E11 + iE12 ) = {qy T : q ∈ Cm , y ∈ Cn , q T q = 1, and y T y = 0}.

(c) O(E11 + iE21 ) = {xq T : x ∈ Cm , q ∈ Cn , xT x = 0, and q T q = 1}.

(d) O(E) = {xy T : x ∈ Cm , y ∈ Cn , and xT x = y T y = 0} .

If Q(t) ∈ Mn is a differentiable family of orthogonal matrices with Q(0) =


I, then differentiation of the identity Q(t)Q(t)T = I at t = 0 shows that
Q0 (0) + Q0 (0)T = 0, that is, Q0 (0) is skew-symmetric. Conversely, if B ∈ Mn
is a given skew-symmetric matrix, then Q(t) ≡ etB is a differentiable family
of orthogonal matrices such that Q(0) = I and Q0 (0) = B. If A ∈ Mm,n is
given and Q1 (t) ∈ Mm and Q2 (t) ∈ Mn are given differentiable families of
orthogonal matrices with Q1 (0) = Im and Q2 (0) = In , one computes that
d
{Q1 (t)AQ2 (t)}|t=0 = Q01 (0)A + AQ02 (0).
dt
Thus, the tangent space to O(A) at A is given explicitly by

TA = {XA + AY : X ∈ Mm , Y ∈ Mn , X + X T = 0, and Y + Y T = 0}.

Definition 3.2.7 Let A ∈ Mm,n with n ≥ 2. Then

SA ≡ {AFij : i = 1, 2, and i < j ≤ n}.


73

Notice that SA ⊂ TA for every A ∈ Mm,n .


Suppose A ∈ Mm,n and rank A ≥ 2. Then there exists a permutation
matrix Q ∈ Mn such that the first two columns of AQ are linearly inde-
pendent. One checks that SAQ is a linearly independent subset of TAQ with
2n − 3 elements. Thus, dim TA = dim TAQ ≥ 2n − 3 whenever rank A ≥ 2.
A similar argument shows that dim TA ≥ 3n − 6 whenever rank A ≥ 3.
Now suppose A ∈ Mm,n , rank A ≥ 2, and dim TA = 2n − 3. Let Q ∈ Mn
be a permutation matrix such that the first two columns of

B ≡ AQ = [ b1 b2 b3 . . . bn ]

are independent. Then SB ⊂ TB and dim span SB = 2n − 3 = dim TA =


dim TB , so TB = span SB = {BY : Y ∈ Mn and Y + Y T = 0}. If n > 3 and
j > 3, note that

BF3j = [ 0 0 − bj 0 . . . 0 b3 0 . . . 0]

and the only matrices in SB with nonzero entries in the third column are
BF13 = [ − b3 0 b1 0 . . . 0 ] and BF23 = [ 0 − b3 b2 0 . . . 0 ]; thus, b3 = 0
and bj is a linear combination of b1 and b2 . Hence, rank A = rank B = 2 if
n > 3. Note that if n = 2 or m = 2 and if A ∈ Mm,n with rank A ≥ 2, then
rank A = 2.

Lemma 3.2.8 Let A ∈ Mm,n be nonzero. Then

TA = {XA + AY : X ∈ Mm , Y ∈ Mn and X + X T = 0, Y + Y T = 0}

and
(a) dim TA = m + n − 2 if A ∈ {E11 , E11 + iE12 , E11 + iE21 };
(b) dim TA = m + n − 3 if A = E11 − E22 + iE12 + iE21 ;
(c) dim TA ≥ 2n − 3 if rank A ≥ 2. Moreover, if n > 3, rank A ≥ 2,
and dim TA = 2n − 3, then rank A = 2 and there exists a permutation
matrix Q ∈ Mn such that TAQ = span SAQ ;
(d) dim TA ≥ 3n − 6 if rank A ≥ 3.
(e) If rank A = 2 and dim TA = 2n − 3, then there exists a permutation
matrix Q ∈ Mn such that dim span SAQ = 2n − 3.
74

Proof The asserted form of TA , as well as assertions (c) and (d), have been
verified. Assertions (a) and (b) follow from direct computations. We
consider in detail only the case in which A = E11 +iE12 ; the other cases
can be dealt with similarly. Let X = [xij ] ∈ Mm and Y = [yij ] ∈ Mn
be skew-symmetric. Then
 
0 0 0 ··· 0
 
 x21 ix21 0 ··· 0 
XA = 
 .. .. .. . . .. 

 . . . . . 
xm1 ixm1 0 ··· 0

and  
iy21 y12 y13 + iy23 · · · y1n + iy2n
 
 0 0 0 ··· 0 
AY = 
 .. .. .. ... .. 

 . . . . 
0 0 0 ··· 0

Since y21 = −y12 , dim TA = m + n − 2.

3.3 A Rank-Preserving Property of T −1


Proposition 3.3.1 Let T be a nonsingular linear orthogonal equivalence
preserver on Mm,n . Then T −1 (E) has rank 1 whenever E has rank 1.

We have organized the proof of Proposition 3.3.1 into a sequence of ten


lemmata.

Lemma 3.3.2 Let T be a nonsingular linear orthogonal preserver on Mm,n .


If E ∈ Mm,n and rank E = 1, then dim TT −1 (E) ≤ m + n − 2.

Proof Lemma 3.2.1 shows that dim TT −1 (E) ≤ dim TE , while Lemma 3.2.6
and Lemma 3.2.8(a, b) give dim TE ≤ m + n − 2.

Lemma 3.3.3 Proposition 3.3.1 holds if m = 1 or m + 1 < n.

Proof If m = 1, then the nonsingularity of T implies that T −1 (E) ≠ 0
whenever E ≠ 0, and this is equivalent to the asserted rank property

in this case. Let E ∈ Mm,n be given with 3 ≤ m+1 < n. If rank E = 1,


then dim TT −1 (E) ≤ m + n − 2 < 2n − 3. The first inequality is from
Lemma 3.3.2 and the strict inequality is from our assumption that
3 ≤ m + 1 < n. Lemma 3.2.8(c) now shows that T −1 (E) must have
rank 1.

Lemma 3.3.4 Let T be a nonsingular linear orthogonal preserver on Mm,n .


If m + 1 = n > 3 or m = n > 4, and if E ∈ Mm,n and rank E = 1, then
rank T −1 (E) ≤ 2.

Proof If rank E = 1, then Lemma 3.3.2 guarantees that dim TT −1 (E) ≤


m+ n − 2. If m + 1 = n > 3, then m+ n − 2 = 2n − 3 < 2n − 2 ≤ 3n − 6.
If m = n > 4, then m+n−2 = 2n−2 < 3n−6. Thus, under the stated
hypotheses we have dim TT −1 (E) < 3n − 6 and hence rank T −1 (E) ≤ 2
by Lemma 3.2.8(d).

Let rank A = 2 and suppose that A = [ a1 a2 . . . an ] ∈ Mm,n . There


are two possibilities: at least one ai is not isotropic, in which case we may
suppose aT1 a1 ≠ 0, or all of them are isotropic. Moreover, we may assume
that {a1 , a2 } is linearly independent.

Case 1. aT1 a1 ≠ 0
Let α = √(aT1 a1 ), so α ≠ 0. Lemma 3.2.5 ensures that Qa1 = [ α 0 . . . 0 ]T
for some orthogonal Q ∈ Mm , so
" #
α ∗
QA = .
0 A1
h i
If we write A1 ≡ b2 b3 . . . bn , then b2 6= 0 since a1 and a2 are linearly
independent. There are two possibilities: b2 is isotropic or not.
(i) Suppose bT2 b2 ≠ 0. After applying the preceding argument to A1 , we
see that there exists an orthogonal Q1 ∈ Mm such that
 
B ≡ Q1 A = [ α  δ  ∗
             0  β  ∗
             0  0  A2 ] ,   α, β, δ ∈ C, with αβ ≠ 0,

but A2 = 0 since rank A = 2. For n ≥ m ≥ 3, and referring to the discus-
sion after Definition 3.2.7, we see that {G13 B, G23 B} ∪ SB is a linearly
independent subset of TB . Hence, dim TA = dim TB ≥ 2n − 1 > m + n − 2.
(ii) If bT2 b2 = 0, a similar argument shows that there is an orthogonal
Q1 ∈ Mm such that
B ≡ Q1 A = [ α  δ   ∗
             0  β   z
             0  iβ  iz
             0  0   0 ] ,
where z ∈ M1,n−2 and β ≠ 0. Let X = [ x1 . . . xn ] ∈ SB . Then the columns
of X have the form
xj = [ ∗  aj  iaj  0 ]T ,  j = 1, . . . , n,
for some aj ∈ C. Hence, for n ≥ m ≥ 3, {G12 B, G13 B} ∪ SB is a linearly
independent subset of TB , and dim TA = dim TB ≥ 2n − 1 > m + n − 2.

Case 2. aT1 a1 = aT2 a2 = 0
Because aT1 a1 = 0, there exists an orthogonal Q ∈ Mm and α ≠ 0 such
that
QA = [ α   δ   ∗
       iα  β   ∗
       0   b2  ∗ ] .
Notice that the second column of QA is isotropic and independent of the first
column.
(i) If b2 = 0, then there is δ ≠ 0 such that
B ≡ QA = [ α   δ    ∗
           iα  −iδ  ∗
           0   0    0 ] .

For n ≥ m ≥ 3, {G13 B, G23 B} ∪ SB is a linearly independent subset of TB .


Hence, dim TA ≥ 2n − 1 > m + n − 2.

(ii) If b2 ≠ 0 and bT2 b2 ≠ 0, there exists an orthogonal Q1 ∈ Mm and
λ ≠ 0 such that
B ≡ Q1 A = [ α   δ  ∗
             iα  β  ∗
             0   λ  ∗
             0   0  0 ] .
For n ≥ m ≥ 4, {G14 B, G34 B} ∪ SB is a linearly independent subset of TB ,
and dim TA = dim TB ≥ 2n − 1 > m + n − 2.
(iii) If b2 ≠ 0 and bT2 b2 = 0, there exists an orthogonal Q1 ∈ Mm and
λ ≠ 0 such that
B ≡ Q1 A = [ α   δ   ∗
             iα  β   ∗
             0   λ   ∗
             0   iλ  ∗
             0   0   0 ] .
Just as in Case 1 (ii), {G13 B, G23 B} ∪ SB ⊂ TB is linearly independent.
Hence, for n ≥ m ≥ 4, dim TA = dim TB ≥ 2n − 1 > m + n − 2.
Therefore, if A ∈ Mm,n , n ≥ m ≥ 4, and rank A = 2, then dim TA ≥
2n − 1 > m + n − 2 ≥ dim TE for any E ∈ Mm,n with rank E = 1 by
Lemma 3.2.8(a, b). Combining this result with Lemma 3.3.4 proves the
following lemma.

Lemma 3.3.5 Proposition 3.3.1 holds if n > 4 and m = n or m = n − 1.

Let ψ = {(2, 2), (2, 3), (3, 3), (3, 4), (4, 4)}. If E ∈ Mm,n with n ≥ m and
(m, n) 6∈ ψ, we have shown that rank T −1 (E) = 1 whenever rank E = 1. We
treat the five special cases (m, n) ∈ ψ separately. We use the following two
results.

Lemma 3.3.6 Let A ∈ Mn . Then XA is skew-symmetric for every skew-


symmetric X ∈ Mn if and only if A = αI for some α ∈ C.

Proof If A = αI, then XA = αX is evidently skew-symmetric. Conversely,


if X and XA are skew-symmetric, then XA = −(XA)T = −(AT X T ) =
AT X. Let A = [aij ] and consider the skew-symmetric matrices F1k =

E1k − Ek1 for k = 2, . . . , n. Then
F1k A = [ ak1   ak2   · · ·  akk   · · ·  akn
          0     0     · · ·  0     · · ·  0
          ...
          −a11  −a12  · · ·  −a1k  · · ·  −a1n
          ...
          0     0     · · ·  0     · · ·  0 ]
(the negated first row of A sits in row k) and
AT F1k = [ −ak1  0  · · ·  a11  · · ·  0
           −ak2  0  · · ·  a12  · · ·  0
           ...
           −akk  0  · · ·  a1k  · · ·  0
           ...
           −akn  0  · · ·  a1n  · · ·  0 ]
(the column of a1j ’s sits in column k). Equating the two, akk = a11 and
aki = 0 for all i ≠ k. Thus, A = a11 I.
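(The lemma can also be confirmed numerically by imposing the constraint that XA be skew-symmetric for every basis element Fij and computing the null space of the resulting linear system. The following sketch, in Python with numpy, is ours and illustrative only.)

    import itertools
    import numpy as np

    n = 4
    rows = []
    for i, j in itertools.combinations(range(n), 2):
        X = np.zeros((n, n)); X[i, j], X[j, i] = 1.0, -1.0
        for p in range(n):
            for q in range(p, n):
                # coefficient matrix of the linear constraint (XA)_{pq} + (XA)_{qp} = 0
                coeff = np.zeros((n, n))
                coeff[:, q] += X[p, :]
                coeff[:, p] += X[q, :]
                rows.append(coeff.ravel())
    null_dim = n * n - np.linalg.matrix_rank(np.array(rows))
    print(null_dim)   # prints 1: the solution space is exactly {alpha * I}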

Lemma 3.3.7 Let A ∈ Mm,n . If dim TA = dim span SAQ for some orthog-
onal Q ∈ Mn , then AAT = αIm for some α ∈ C.

Proof Suppose dim TA = dim span SAQ . Since dim TA = dim TAQ , we have
TAQ = span SAQ . Hence, for each skew-symmetric Y ∈ Mm there
exists a skew-symmetric X ∈ Mn such that Y AQ = AQX. Thus
Y AAT = Y (AQ)(AQ)T = (AQ)X(AQ)T is skew-symmetric for every
skew-symmetric Y , and hence AAT = αIm by Lemma 3.3.6.

Lemma 3.3.8 Proposition 3.3.1 holds if (m, n) = (2, 3).

Proof Suppose E ∈ M2,3 has rank 1. Since rank T −1 (E) ≤ m = 2, there are
only two possibilities: rank T −1 (E) = 1 (which is the assertion of Proposi-
tion 3.3.1), or rank T −1 (E) = 2. We wish to exclude the latter possibility.
We look at
D ≡ {Y ∈ M2,3 : dim TY ≤ 3}.
Since dim TT −1 (Y ) ≤ dim TY , it must be the case that T −1 (Y ) ∈ D whenever
Y ∈ D. If E has rank 1, then Lemma 3.2.8(a, b) shows that E ∈ D. Suppose

X ∈ D and rank X = 2. Then Lemma 3.2.8(c) ensures that dim TX ≥ 3,


so that dim TX = 3 for this case. Hence, Lemma 3.2.8 ensures that there
exists an orthogonal Q ∈ M3 such that dim span SXQ = 3. Lemma 3.3.7
now guarantees that XX T = β^2 I2 for some β ∈ C. Moreover, Lemma (4.4)
of [HHL] shows that β ≠ 0. It follows from Lemma 3.2.5 that there exists an
orthogonal Q1 ∈ M3 such that
X = [ β  0  0 ; 0  β  0 ] Q1 .              (3.3.1)

We will show that T −1 (X) ∈ O(αX) for some α ≠ 0. Since X ∈ D, it
suffices to show that T (E) ∉ O(X) for each E ∈ E ≡ {E11 , E11 + iE12 , E11 +
iE21 , E11 − E22 + iE12 + iE21 }.
Let " #
β 0 0
A≡
0 β 0
so that O(A) = O(X). One checks that
(" # )
0 x y
TA = : x, y, z ∈ C .
−x 0 z

Let B ∈ O(A) ∩ TA . Then
[ x^2 + y^2   yz ; yz   x^2 + z^2 ] = BB T = β^2 I2 .
Hence, y = z = 0 and x^2 = β^2 . Thus, O(A) ∩ TA consists of exactly two
vectors.
Let E ≡ E11 − E22 + iE12 + iE21 . Then dim TT −1 (E) ≤ dim TE = 2 <
dim TB for any B ∉ {0} ∪ O(E). It follows that T −1 (E) ∈ O(E) and hence,
T (E) ∈ O(E). Thus, T (E) ∉ O(A).
Suppose that T (E11 ) ∈ O(A), say T (E11 ) = Q2 AQ3 . Let T1 ≡ QT2 T QT3 .
Then A = QT2 T (E11 )QT3 = T1 (E11 ). Notice that
TE11 = { [ 0  x  y ; z  0  0 ] : x, y, z ∈ C } .

Thus, {E12 , E13 , E21 } ⊂ O(E11 ) ∩ TE11 so that O(E11 ) ∩ TE11 contains at
least three vectors. Moreover, χ ≡ {T1 (E12 ), T1 (E13 ), T1 (E21 )} contains

three vectors since T1 is nonsingular. However, χ ⊂ T1 (O(E11 ) ∩ TE11 ) ⊂


O(T1 (E11 )) ∩ TT1 (E11 ) = O(A) ∩ TA , which contains exactly two vectors. This
contradiction shows that T (E11 ) ∉ O(A).
Now let E ≡ E11 + iE12 . Then E = E(−iF12 ) ∈ TE . Since A ∉ TA , we
have T (E) ∉ O(A) by Lemma 3.2.2. Similarly, if E ≡ E11 + iE21 = (−iG12 )E,
then T (E) ∉ O(A).
Thus, T −1 (A) cannot have rank 1, and hence it has rank 2 and T −1 (A) ∈
O(αA) for some α ≠ 0. It follows that T (O(A)) ⊂ O((1/α)A) and hence,
T −1 (E) ∉ O(αA) for all rank-one E ∈ M2,3 and all α ≠ 0. Combining this
result with the observation that T −1 (Y ) ∈ D for any Y ∈ D, we see that
rank T −1 (E) = 1 whenever rank E = 1.

Lemma 3.3.9 Proposition 3.3.1 holds if (m, n) = (3, 4).

Proof Let E ∈ M3,4 have rank 1 and let A ≡ T −1 (E). Lemma 3.3.4 shows
that rank A is either 1 or 2. Suppose rank A = 2. We apply the analysis of
the proof of Lemma 3.3.5 to A. Notice that Cases 1(i, ii) and 2(i, iii) are
not possible here. Thus, A has the form considered in Case 2(ii):
 
A = [ α   δ  x1  x4
      iα  β  x2  x5
      0   λ  x3  x6 ]
with λ ≠ 0 and all the columns of A are isotropic vectors. It now follows
from Lemma 3.2.8(a, c) and Lemma 3.2.1 that dim TA = 5. Moreover,
Lemma 3.2.8(e) ensures that dim span SAP = 5 for some permutation matrix
P ∈ M4 . Hence, Lemma 3.3.7 guarantees that there exists a ∈ C such that
AAT = aI3 . Since it is always the case that rank A ≥ rank AAT , we must
have a = 0, that is, the rows of A are also isotropic vectors. Thus, there exists
an orthogonal Q ∈ M4 such that the first row of AQ is [ α iα 0 0 ]. Since
AAT = 0, the third row of AQ has the form [ 0 0 c d ], where c^2 + d^2 = 0
and d ≠ 0. It follows that
AQ = [ α   iα  0    0
       iα  −α  ebd  ed
       0   0   bd   d ]

with e^2 = b^2 = −1 and αd ≠ 0. By considering all the possible cases, one
checks that G12 AQ ∉ span SAQ . We show one such case: b = e = i. Suppose

that G12 AQ = AQ(a1 F12 + a2 F13 + a3 F14 + a4 F23 + a5 F24 ). Examining the
(1, 1), (2, 1), (3, 1) and (2, 4) entries, we get a1 = −1, a2 = a3 = a5 = 0. This
is a contradiction since for any a4 , G12 AQ ≠ −AQF12 + a4 AQF23 . Hence,
dim TA = dim TAQ ≥ 6, again a contradiction. Since we can exclude all the
possibilities associated with rank A = 2, it follows that rank A = 1.

Lemma 3.3.10 Proposition 3.3.1 holds if (m, n) = (4, 4).

Proof If A ∈ M4 has rank 2, then dim TA ≥ 7, as shown in the proof of
Lemma 3.3.5. Let B ≡ E11 − E22 + iE12 + iE21 . Then dim TT −1 (B) ≤
dim TB ≤ dim TC for any C ∉ {0} ∪ O(B) and hence, T −1 (B) ∈ O(B)
has rank 1. Let D ≡ diag (1, −1, 1, 1). Then E11 + iE12 = (1/2)(B + DB)
and T −1 (E11 + iE12 ) = (1/2)(T −1 (B) + T −1 (DB)), a sum of two rank-
one matrices. Hence, T −1 (E11 + iE12 ) has rank at most 2. But since
dim TC ≤ 6 < 7 ≤ dim TA for all A, C ∈ M4 such that rank C = 1 and
rank A = 2, we must have rank T −1 (C) ≠ 2, so that rank(T −1 (E11 +
iE12 )) = 1. Similarly, since E11 + iE21 = (1/2)(B + BD) and E11 =
(1/2)((E11 + iE12 ) + (E11 + iE12 )D), both T −1 (E11 + iE21 ) and T −1 (E11 )
have rank 1. Thus, if E ∈ M4 has rank 1, then T −1 (E) must also have
rank 1.

Let T be a nonsingular linear orthogonal equivalence preserver on Mm,n ,


with m ≤ n and (m, n) ∉ {(2, 2), (3, 3)}. Our arguments up to this point
show that T −1 preserves rank 1 matrices, that is, T −1 (E) has rank 1 whenever
E ∈ Mm,n has rank 1. Theorem 4.1 of [HLT] guarantees that there exist
nonsingular X ∈ Mm and Y ∈ Mn such that either T −1 (A) = XAY for all
A ∈ Mm,n , or m = n and T −1 (A) = XAT Y for all A ∈ Mn . Hence, either
T (A) = MAN for all A ∈ Mm,n , or m = n and T (A) = M AT N for all
A ∈ Mn , where M ≡ X −1 and N ≡ Y −1 . We will now show that the same
conclusion can be drawn for the two remaining cases (m, n) = (2, 2), (3, 3),
from which Proposition 3.3.1 follows in these two cases.

Lemma 3.3.11 Let T be a nonsingular linear orthogonal preserver on Mn .


If n = 2 or n = 3, then there exists a scalar α ≠ 0 such that T −1 (In ) ∈
O(αIn ).

Proof First note that Xn ≡ E11 − E22 + iE12 + iE21 = Xn (iF12 ) ∈ TXn for
all n = 2, 3, . . ., where Eij , Xn ∈ Mn . Note also that TIn = {X ∈ Mn :

X + X T = 0} is the set of all skew-symmetric matrices in Mn . Hence,


dim TIn = n(n − 1)/2. Moreover, In ∉ TIn for each n.
Let n = 2. Then dim TI2 = 1. It follows from Lemma 3.2.8(a, b,
c) that either T −1 (I2 ) is nonsingular or T −1 (I2 ) ∈ O(X2 ). However,
T −1 (I2 ) ∉ O(X2 ) by Lemma 3.2.2. Thus, T −1 (I2 ) is nonsingular and
Lemma 3.2.8(e) implies that dim TT −1 (I2 ) = 1 = dim span SAQ , where
A ≡ T −1 (I2 ), for some permutation matrix Q ∈ M2 . Lemma 3.3.7
guarantees that T −1 (I2 ) ∈ O(βI2 ) for some β ≠ 0.
Let n = 3. A similar argument shows that either

rank T −1 (I3 ) ≥ 2 or T −1 (I3 ) ∈ O(X3 ),

and that the latter possibility is excluded. Hence, rank T −1 (I3 ) ≥


2. Let A ≡ T −1 (I3 ). Then dim TA = 3 = dim span SAQ for some
permutation matrix Q ∈ M3 . Hence, AAT = αI3 by Lemma 3.3.7,
and Lemma (4.4) of [HHL] again guarantees that α ≠ 0. Therefore,
T −1 (I3 ) ∈ O(βI3 ) with β^2 = α ≠ 0.

Suppose n = 2 or 3. Then Lemma 3.3.11 ensures that T −1 (In ) ∈ O(αIn ).


Hence, T1 ≡ αT satisfies T1−1 (In ) ∈ O(In ). It follows that T1 (O(In )) ⊂
O(In ). Lemma (1) of [D] guarantees that T1 (O(In )) = O(In ). Thus, Lemma
(6) of [BP] guarantees that there exist nonsingular M, N ∈ Mn such that
either T1 (A) = M AN or T1 (A) = M AT N for all A ∈ Mn . It follows that
Proposition 3.3.1 holds for these two cases as well.
The preceding ten lemmata constitute a proof of all cases of Proposi-
tion 3.3.1 in which m ≤ n. The remaining cases follow from considering
T1 (X) ≡ T (X T ) and applying the known cases to T1 : Mn,m → Mn,m .
The following proposition summarizes the main conclusions of this sec-
tion.

Proposition 3.3.12 Let T be a nonsingular linear orthogonal equivalence


preserver on Mm,n . Then there exist nonsingular M ∈ Mm and N ∈ Mn
such that either

(1) T (A) = MAN for all A ∈ Mm,n , or

(2) m = n and T (A) = M AT N for all A ∈ Mm,n .



3.4 Proof of the Main Theorem


Let T be a given linear operator on Mm,n . Suppose T preserves orthogonal
equivalence. Then Lemma 3.2.4 guarantees that either T = 0 or T is non-
singular. If T = 0, then Theorem 3.1.1 holds with α = 0. If T 6= 0, we will
use the following to show that Theorem 3.1.1 still holds.

Proposition 3.4.1 Let A ∈ Mn be nonsingular. Suppose that

xT AT Ax = xT P T AT AP x

for all orthogonal P ∈ Mn and all x ∈ Cn . Then there exist an orthogonal


Q ∈ Mn and a scalar α 6= 0 such that A = αQ.

Proof An easy polarization argument shows that if C ∈ Mn is symmet-


ric and xT Cx = 0 for all x ∈ Cn , then C = 0. Since xT AT Ax =
xT P T AT AP x for all x ∈ Cn , it follows that AT A = P T AT AP for all
orthogonal P ∈ Mn . Hence,
 
AT A = [ α  β  · · ·  β
         β  α  · · ·  β
         ..           ..
         β  · · ·  β  α ] .

Let x ≡ [ 1 1 0 . . . 0 ]T and y ≡ [ √2 0 0 . . . 0 ]T ∈ Cn . Then xT x =
yT y and hence there exists an orthogonal Q1 ∈ Mn such that y = Q1 x.
Now, 2α + 2β = xT AT Ax = xT QT1 AT AQ1 x = yT AT Ay = 2α. Hence,
β = 0 and AT A = αI with α ≠ 0 since A is nonsingular. Thus,
Q ≡ (1/√α)A is orthogonal and A = √α Q.
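(The hypothesis and conclusion of Proposition 3.4.1 can also be probed numerically: for A = αQ the form xT AT Ax is invariant under every orthogonal P, while for a matrix with non-scalar AT A the invariance already fails for a permutation. The sketch below, in Python with numpy, is ours; the test matrices are arbitrary choices.)

    import numpy as np

    rng = np.random.default_rng(0)
    n = 3
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    P = np.eye(n)[[1, 0, 2]]                  # a permutation matrix, hence orthogonal

    def form(A, v):
        return v @ A.T @ A @ v                # the bilinear form v^T A^T A v (no conjugation)

    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    A1 = (2.0 + 1.0j) * Q                     # a scalar multiple of a real orthogonal matrix
    A2 = np.diag([1.0, 2.0, 3.0])             # here A^T A is not a scalar matrix

    print(np.isclose(form(A1, x), form(A1, P @ x)))   # True: the form is P-invariant
    print(np.isclose(form(A2, x), form(A2, P @ x)))   # False: invariance fails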

Lemma 3.4.2 Let T be a given nonsingular linear operator on Mm,n . Then


T preserves orthogonal equivalence if and only if there exist orthogonal ma-
trices Q1 ∈ Mm , Q2 ∈ Mn and a scalar β ≠ 0 such that either

(1) T (A) = βQ1 AQ2 for all A ∈ Mm,n , or

(2) m = n and T (A) = βQ1 AT Q2 for all A ∈ Mn .



Proof Under the stated assumptions, Proposition 3.3.12 ensures that there
exist nonsingular M ∈ Mm and N ∈ Mn such that either T (A) = MAN
for all A ∈ Mm,n , or m = n and T (A) = M AT N for all A ∈ Mm,n . We
consider only the case T (A) = M AN ; the case T (A) = MAT N can
be dealt with similarly. Let an orthogonal P ∈ Mn be given. Since
T preserves orthogonal equivalence, for each A ∈ Mm,n there exist
orthogonal matrices Q ∈ Mm and Z ∈ Mn (which depend on A and P )
such that T (A) = Q T (AP ) Z. Hence,

MAN (MAN )T = T (A) T (A)T
            = Q T (AP ) Z (Q T (AP ) Z)T              (3.4.2)
            = Q MAP N (MAP N )T QT .
Choose A ≡ M −1 [ xT ; 0 ] , where x ∈ Cn . Then (3.4.2) becomes
[ xT N N T x  0 ; 0  0 ] = Q [ xT P N N T P T x  0 ; 0  0 ] QT .

Since QT = Q−1 , taking the trace of both sides shows that xT N N T x =


xT P N N T P T x. Since this identity holds for all x ∈ Cn and all or-
thogonal P ∈ Mn , Proposition 3.4.1 ensures that N = α2 Q2 for some
orthogonal Q2 ∈ Mn and some scalar α2 ≠ 0. A similar analysis of
T (A)T T (A) shows that M = α1 Q1 for some orthogonal Q1 ∈ Mm and
some scalar α1 ≠ 0.

This completes the proof of the forward implication of Theorem 3.1.1.


The converse can be easily verified.
Chapter 4

Linear Operators Preserving Unitary t-Congruence on Matrices

4.1 Introduction and Statement of Results


Let Mn (IF) be the set of all n-by-n matrices over the field IF, where either
IF = C or IF = IR. The set of all matrices U ∈ Mn (IF) that satisfy U ∗ U = I
(that is, U is unitary) is denoted by Un (IF). We say that A, B ∈ Mn (IF) are
t-congruent if there exists a nonsingular X ∈ Mn (IF) such that A = XBX t ;
moreover, if X ∈ Un (IF), then A and B are said to be unitarily t-congruent.
Notice that if IF = IR, then Un (IR) is the set of all real orthogonal matrices
and unitary t-congruence becomes orthogonal similarity.
We shall also use the following notations in our discussion.

(a) Sn (IF) ≡ {X ∈ Mn (IF) : X t = X} is the set of all n-by-n symmetric


matrices.
(b) Kn (IF) ≡ {X ∈ Mn (IF) : X t = −X} is the set of all n-by-n skew-
symmetric matrices.
(c) Eij ∈ Mn (IF) is the matrix whose (i, j) entry is 1; all other entries are
zero. Notice that {Eij : 1 ≤ i, j ≤ n} forms a basis for Mn (IF).
(d) Fij ≡ Eij − Eji . One checks that {Fij : 1 ≤ i < j ≤ n} forms a basis
for Kn (IF).
(e) If
X = [ 0     x12   x13   x14
      −x12  0     x23   x24
      −x13  −x23  0     x34
      −x14  −x24  −x34  0   ] ∈ K4 (IF),
we denote by X + the matrix derived from X by interchanging x13 and
x24 . One checks that
det X = (x12 x34 + x14 x23 − x13 x24 )^2 = det X + .

Notice that the singular values of any A = [aij ] ∈ K4 (IF) are uniquely
determined by tr A∗ A = Σi,j |aij |^2 and |det A|^2 , and hence these two
invariants determine A ∈ K4 (IF) uniquely up to unitary t-congruence.
Thus, X, Y ∈ K4 (IF) are unitarily t-congruent if and only if X + is
unitarily t-congruent to Y + . In addition, X and X + are unitarily t-
congruent whenever X ∈ K4 (IF).

Interchanging x14 and x23 is equivalent to X → −U X + U t , where
U ≡ [ 0  1  0   0
      1  0  0   0
      0  0  0   −1
      0  0  −1  0 ] ∈ U4 (IR).
Similarly, interchanging x12 and x34 is equivalent to X → −V X + V t ,
where
V ≡ [ 0  0  0  1
      0  0  1  0
      0  1  0  0
      1  0  0  0 ] ∈ U4 (IR).
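(These identities are easy to spot-check numerically. The following sketch, in Python with numpy, is ours; the helpers pf4 and plus are our names for the expressions in item (e), and the matrix U is as reconstructed above.)

    import numpy as np

    def pf4(X):
        # the expression from item (e): x12*x34 + x14*x23 - x13*x24
        return X[0, 1] * X[2, 3] + X[0, 3] * X[1, 2] - X[0, 2] * X[1, 3]

    def plus(X):
        # X^+: interchange x13 and x24 (and the corresponding subdiagonal entries)
        Y = X.copy()
        Y[0, 2], Y[1, 3] = X[1, 3], X[0, 2]
        Y[2, 0], Y[3, 1] = -X[1, 3], -X[0, 2]
        return Y

    rng = np.random.default_rng(1)
    B = rng.standard_normal((4, 4))
    X = B - B.T                                        # a random element of K4(IR)

    print(np.isclose(np.linalg.det(X), pf4(X) ** 2))             # True
    print(np.isclose(np.linalg.det(X), np.linalg.det(plus(X))))  # True

    U = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]], dtype=float)
    Z = X.copy()                                       # X with x14 and x23 interchanged
    Z[0, 3], Z[1, 2] = X[1, 2], X[0, 3]
    Z[3, 0], Z[2, 1] = -X[1, 2], -X[0, 3]
    print(np.allclose(-U @ plus(X) @ U.T, Z))          # True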

In [HHL], the authors studied the linear operators T on Mn (IF) that pre-
serve t-congruence, that is, T (A) is t-congruent to T (B) whenever A and B
are t-congruent. At the end of the paper they raised the question of charac-
terizing the linear operators on Mn (IF) that preserve unitary t-congruence.
The purpose of this chapter is to answer their question with the following
results.

Theorem 4.1.1 Let T be a given linear operator on Mn (C). Then T pre-


serves unitary t-congruence if and only if one of the following three conditions
holds:

(1) There exist U ∈ Un (C) and scalars µ, ν ∈ C such that

T (X) = µU (X + X t )U t + νU (X − X t )U t

for all X ∈ Mn (C) .

(2) n = 4 and there exist U ∈ Un (C) and a scalar ν ∈ C such that

T (X) = νU (X − X t )+ U t for all X ∈ M4 (C).

(3) n = 2 and there exists B ∈ M2 (C) such that

T (X) = (tr XF12 )B for all X ∈ M2 (C).



Theorem 4.1.2 Let an integer n ≥ 3 and a linear operator T on Mn (IR)


be given. Then T preserves orthogonal similarity if and only if one of the
following three conditions holds:

(1) There exists A0 ∈ Mn (IR) such that

T (X) = (tr X)A0 for all X ∈ Mn (IR).

(2) There exist U ∈ Un (IR) and scalars α, β, µ ∈ IR such that

T (X) = α(tr X)I + βU (X + X t )U t + µU (X − X t )U t

for all X ∈ Mn (IR).

(3) n = 4 and there exist U ∈ U4 (IR) and scalars α, β ∈ IR such that

T (X) = α(tr X)I + βU (X − X t )+ U t for all X ∈ M4 (IR).

Theorem 4.1.3 A linear operator T on M2 (IR) preserves orthogonal simi-


larity if and only if one of the following three conditions holds:

(1) There exist A0 , B0 ∈ M2 (IR) with tr B0 = tr A0 B0 = tr At0 B0 = 0 such


that
T (X) = (tr X)A0 + (tr XF12 )B0 for all X ∈ M2 (IR).

(2) There exist U ∈ U2 (IR) and scalars α, β, µ ∈ IR such that

T (X) = α(tr X)I + βU (X + X t )U t + µU (X − X t )U t

for all X ∈ M2 (IR).

(3) There exist U ∈ U2 (IR), C0 ∈ K2 (IR), and scalars α, β ∈ IR such that

T (X) = α(tr X)I + (tr X)C0 + βU (X + X t )U t

for all X ∈ M2 (IR).



As we shall see, even though the result in the complex case (Theo-
rem 4.1.1) is very similar to Theorem (4.1) of [HHL], it requires a refinement
of their proof and several additional ideas to obtain our result. The real case
is even more interesting. Our Theorems 4.1.2 and 4.1.3 are quite different
from the result in [HHL]. The complication is due to the fact that Mn (IR)
can be decomposed as a direct sum of the subspaces
Λn ≡ {λI : λ ∈ IR},
Sn0 (IR) ≡ {A ∈ Mn (IR) : A = At and tr A = 0}, and
Kn (IR) ≡ {A ∈ Mn (IR) : A + At = 0},


and each of these is an irreducible subspace (in the group representation
sense) of the group G of linear operators on Mn (IR) defined by A → U AU t
for a given orthogonal matrix U . As a result, it is conceivable that a linear
operator T on Mn (IR) can act quite independently on these three subspaces.
However, it turns out that the behavior of T on the three subspaces cannot
be too arbitrary, especially if T is nonsingular, as shown by Theorem 4.1.2.
Nevertheless, this makes the proof of the theorem more difficult and makes
the subject more interesting. One of the key steps in our proof (Lemma 4.4.3)
makes use of a result by Friedland and Loewy [FL] concerning the minimum
dimension of a subspace of Sn (IR) that is certain to contain a matrix whose
largest eigenvalue has at least a given multiplicity r (see (4.4.40)). This result
has been used in other linear preserver problems (see [L]).
One of the basic ideas in our proof is to study the orbits
O(A) ≡ {U AU t : U ∈ Un (IF)}, A ∈ Mn (IF)
and make use of their geometric properties to help solve our problem. These
techniques have been used in [HLT], [HHL], [LRT], and in Chapter 3. We
collect some useful results from [HHL] and put them in Section 4.2 for easy
reference. In Section 4.3 we present the proof of Theorem 4.1.1. Section 4.4
is devoted to proving Theorems 4.1.2 and 4.1.3. For problems similar to ours
and their variations, we refer the reader to [HLT] and [LRT]. For a gentle
introduction to linear preserver problems, we refer the reader to [LT2].

4.2 Preliminaries
The following three results can be found in [HHL]. By private correspon-
dence, the author has learned that Professor M. H. Lim has obtained different

proofs for these results.


Lemma 4.2.1 A nonzero linear operator T on Kn (C) preserves unitary t-
congruence if and only if there exist U ∈ Un (C) and a scalar µ > 0 such that
either
(a) T (X) = µU XU t for all X ∈ Kn (C); or
(b) n = 4 and T (X) = µU X + U t for all X ∈ K4 (C).

Lemma 4.2.2 A nonzero linear operator T on Sn (C) preserves unitary t-


congruence if and only if there exist U ∈ Un (C) and a scalar µ > 0 such that
T (X) = µU XU t for all X ∈ Sn (C).

Lemma 4.2.3 A nonzero linear operator T on Kn (IR) preserves orthogonal


similarity if and only if there exist U ∈ Un (IR) and a scalar µ ≠ 0 such that
either
(a) T (X) = µU XU t for all X ∈ Kn (IR); or
(b) n = 4 and T (X) = µU X + U t for all X ∈ K4 (IR).

The following is Theorem (2.2) in [Hi].

Lemma 4.2.4 A nonzero linear operator T on Sn (IR) preserves orthogonal


similarity if and only if one of the following conditions holds:
(a) There exists a nonzero A0 ∈ Sn (IR) such that

T (X) = (tr X)A0 for all X ∈ Sn (IR).

(b) There exist U ∈ Un (IR) and scalars µ, ν ∈ IR with (µ, ν) ≠ (0, 0) such
that
T (X) = µU XU t + ν(tr X)I for all X ∈ Sn (IR).

The following is Lemma (2.2) in [HHL].

Lemma 4.2.5 Suppose IF = C or IF = IR. Then span O(A) = Kn (IF) for


every nonzero A ∈ Kn (IF). Consequently, if T is a linear operator on Mn (IF)
that preserves unitary t-congruence, then ker T ∩ Kn (IF) ≠ {0} if and only
if Kn (IF) ⊂ ker T .

The following is Lemma (3.2) in [HHL].

Lemma 4.2.6 span O(A) = Sn (C) for every nonzero A ∈ Sn (C). Conse-
quently, if T is a linear operator on Mn (C) that preserves unitary t-congru-
ence, then ker T ∩ Sn (C) ≠ {0} if and only if Sn (C) ⊂ ker T .

Recall that if A ∈ Sn0 (IR), then A is symmetric and tr A = 0. Since


unitary t-congruence is real orthogonal similarity when IF = IR, any B ∈
O(A) is also in Sn0 (IR), that is, tr B = 0 as well. Hence span O(A) ⊂ Sn0 (IR)
but span O(A) 6= Sn (IR). This is one of the major differences between the
cases IF = IR and IF = C (see Lemma 4.2.6).

Lemma 4.2.7 span O(A) = Sn0 (IR) for every nonzero A ∈ Sn0 (IR). Conse-
quently, if T is a linear operator on Mn (IR) that preserves orthogonal simi-
larity, then ker T ∩ Sn0 (IR) ≠ {0} if and only if Sn0 (IR) ⊂ ker T .

Proof Suppose A ∈ Sn0 (IR). We have already shown that span O(A) ⊂
Sn0 (IR). We now prove that Sn0 (IR) ⊂ span O(A). There exist Q ∈
Un (IR) and α1 , α2 , . . . , αn−1 , αn ∈ IR such that

QAQt = diag (α1 , α2 , . . . , αn−1 , αn ).

Since A ≠ 0 and tr A = 0, we may take α1 ≠ 0 and α1 ≠ αn . Notice


that

(α1 − αn )(E11 − Enn ) = QAQt − diag (αn , α2 , . . . , αn−1 , α1 ),

so E11 − Enn ∈ span O(A). Since Eij + Eji ∈ O(E11 − Enn ), we have
Eij + Eji ∈ span O(A) for all 1 ≤ i < j ≤ n. The conclusion follows.
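(Lemma 4.2.7 can be illustrated numerically by sampling the orbit and computing the dimension of its span. The sketch below, in Python with numpy, is ours; the sample size and the particular A are arbitrary choices.)

    import numpy as np

    def rand_orth(n, rng):
        # a random real orthogonal matrix via QR of a Gaussian matrix
        Q, R = np.linalg.qr(rng.standard_normal((n, n)))
        return Q * np.sign(np.diag(R))

    n = 4
    rng = np.random.default_rng(2)
    A = np.diag([1.0, 2.0, -1.5, -1.5])       # nonzero and trace zero, so A lies in Sn0(IR)
    samples = []
    for _ in range(60):
        Q = rand_orth(n, rng)
        samples.append((Q @ A @ Q.T).ravel())
    dim = np.linalg.matrix_rank(np.array(samples))
    print(dim, n * (n + 1) // 2 - 1)          # both 9: span O(A) = Sn0(IR)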

Notice that span O(A) = Λn for every nonzero A ∈ Λn . This observation,


together with Lemma 4.2.7, proves the following; for a different approach,
see the proof of Theorem (3.2) of [LT1].

Lemma 4.2.8 Let A ∈ Sn (IR) be given. Suppose that A ∉ Λn ∪ Sn0 (IR). Then
span O(A) = Sn (IR). Consequently, if T is a linear operator on Mn (IR) that
preserves orthogonal similarity, then ker T ∩ Sn (IR) ≠ {0} if and only if
Sn (IR) ⊂ ker T .

Proof Write A = αI + AS , where α ≡ (1/n) tr A and AS ∈ Sn0 (IR). Notice
that α ≠ 0 since A ∉ Sn0 (IR). Moreover, AS ≠ 0 since A ∉ Λn . There
exists a U ∈ Un (IR) such that B ≡ U AU t − A = U AS U t − AS ≠
0. Notice that B ∈ Sn0 (IR). Hence, Lemma 4.2.7 guarantees that
Sn0 (IR) ⊂ span O(B) ⊂ span O(A). Thus, αI = A − AS ∈ span O(A)
and Λn ⊂ span O(A). The conclusion now follows.

Lemma 4.2.9 Let IF = C or IF = IR, and let A ∈ Mn (IF) be given. Then


(a) A + At ∈ span O(A), and

(b) A − At ∈ span O(A).

Proof We follow and refine slightly the proof of Lemma (4.2) of [HHL]. Let
{Pi : 1 ≤ i ≤ 2^n } denote the group of all real diagonal orthogonal
matrices in Mn (IF) and define
P(X) ≡ Σi=1,...,2^n Pi XPit for X ∈ Mn (IF).
Write X ≡ [xij ] and note that
P(X) = 2^n diag(x11 , . . . , xnn ) = P(X t ) ∈ span O(X)

for all X ∈ Mn (IF). Now, since A + At ∈ Sn (IF), there exists a


U ∈ Un (IF) such that D ≡ U (A + At )U t is diagonal. Hence, 2^n D =
P(D) = P(U (A + At )U t ) = 2P(U AU t ) ∈ span O(A). Thus, A + At ∈
span O(A). Finally, A − At = 2A − (A + At ) ∈ span O(A).
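(The averaging operator P used in this proof is easy to check directly; a minimal sketch, in Python with numpy and ours alone, follows.)

    import itertools
    import numpy as np

    def P(X):
        # sum of Pi X Pi^t over all 2^n diagonal sign matrices Pi
        n = X.shape[0]
        total = np.zeros_like(X)
        for signs in itertools.product([1.0, -1.0], repeat=n):
            D = np.diag(signs)
            total += D @ X @ D
        return total

    rng = np.random.default_rng(3)
    X = rng.standard_normal((3, 3))
    print(np.allclose(P(X), 2 ** 3 * np.diag(np.diag(X))))   # True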

4.3 The Complex Case


We begin the proof of the forward implication of Theorem 4.1.1 by con-
sidering a singular linear operator T on Mn (C) that preserves unitary t-
congruence; the reverse implication can be proved by direct verification.
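(For the reverse implication one can also convince oneself numerically that a map of the form in condition (1) of Theorem 4.1.1 preserves unitary t-congruence: if A = W XW t with W unitary, then T (A) = (U W U ∗ )T (X)(U W U ∗ )t . The sketch below, in Python with numpy, is ours; the random instances are arbitrary.)

    import numpy as np

    def rand_unitary(n, rng):
        Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        Q, _ = np.linalg.qr(Z)
        return Q

    rng = np.random.default_rng(4)
    n, mu, nu = 3, 1.5 - 0.5j, 2.0j
    U = rand_unitary(n, rng)
    T = lambda X: mu * U @ (X + X.T) @ U.T + nu * U @ (X - X.T) @ U.T

    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    W = rand_unitary(n, rng)
    Q = U @ W @ U.conj().T              # unitary, with T(WXW^t) = Q T(X) Q^t
    print(np.allclose(T(W @ X @ W.T), Q @ T(X) @ Q.T))   # True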

Lemma 4.3.1 Let A ∈ Mn be given and suppose that A ∉ Sn (C) ∪ Kn (C).


Then span O(A) = Mn (C). Consequently, if T is a linear operator on Mn (C)
that preserves unitary t-congruence, then A ∈ ker T if and only if T = 0.

Proof Lemma 4.2.9 guarantees that A + At , A − At ∈ span O(A). Since
A + At ≠ 0 and A − At ≠ 0, Lemmata 4.2.5 and 4.2.6 guarantee that
Sn (C) ∪ Kn (C) ⊂ span O(A). We conclude that span O(A) = Mn (C).

If T is a nonzero singular linear operator on Mn (C) that preserves unitary


t-congruence, then Lemma 4.3.1 implies that ker T ⊂ Sn (C) ∪ Kn (C), that
is, if T (A) = 0, then either A ∈ Sn (C) or A ∈ Kn (C). We look at these cases
separately.
If 0 ≠ A ∈ Kn (C), we can use (verbatim) the proof of Lemma (4.3) of
[HHL] to get the following analogous result.

Lemma 4.3.2 Let T be a nonzero linear operator on Mn (C) that preserves


unitary t-congruence. If ker T ∩ Kn (C) ≠ {0}, then ker T = Kn (C) and T
satisfies condition (1) of Theorem 4.1.1 with ν = 0.

Suppose now that 0 ≠ A ∈ Sn (C) ∩ ker T . Lemma 4.2.6 guarantees that


T (Sn (C)) = 0. Let n ≥ 3. We follow the proof of Lemma (4.6) of [HHL] to
show that T (Kn (C)) ⊂ Kn (C).
Define L : Kn (C) → Sn (C) by
L(X) ≡ (T (X) + T (X)t )/2 for all X ∈ Kn (C).
One checks that L is a linear operator that preserves unitary t-congruence.
Moreover, if for some nonzero B ∈ Kn (C), L(B) = 0, then Lemma 4.2.6
guarantees that L(Kn (C)) = 0. Notice that T (Kn (C)) ⊂ Kn (C)) if and
only if L = 0. Suppose then that L(A) 6= 0 for some (necessarily) nonzero
A ∈ Kn (C), that is, suppose L is nonsingular. There exist U ∈ Un (C) and a
scalar a1 > 0 such that
U AU t = a1 F12 + C (4.3.1)
where C ≡ 02 ⊕ C1 ∈ Kn (C) for some C1 ∈ Kn−2 (C). Notice that for any
µ ∈ C with |µ| = 1, ZAZ t = a1 F12 + µ^2 C is unitarily t-congruent to A, where
Z ≡ (I2 ⊕ µIn−2 )U ∈ Un (C). Hence,
rank L(A) = rank L(a1 F12 + µ^2 C) = rank (a1 L(F12 ) + µ^2 L(C))

for all such µ. If we put S12 ≡ L(F12 ), we have

rank L(X) ≥ rank L(F12 ) ≡ rank S12 for all 0 ≠ X ∈ Kn (C).

Moreover, if rank A = 2, then C = 0 in (4.3.1) and hence rank L(A) =


rank S12 ≤ rank L(X) for any 0 ≠ X ∈ Kn (C).
As in the case of nonsingular t-congruence, there exists an X0 ∈ Kn (C)
such that rank L(X0 ) ≤ 3 (see the sixth paragraph of the proof of Lemma
(4.4) of [HHL]). Thus, for any A ∈ Kn (C) having rank 2, we have

rank L(A) = rank S12 ≤ 3.

The analysis (for the complex case) of the proof of Lemma (4.4) of [HHL]
can be modified to show that this is not possible.

Lemma 4.3.3 Let an integer n ≥ 3 and a linear operator L : Kn (C) →


Sn (C) be given. Then L preserves unitary t-congruence if and only if L = 0.

If n ≥ 3 and ker T ∩ Sn (C) ≠ {0}, then Lemma 4.3.3 implies that

T (Kn (C)) ⊂ Kn (C).

Hence, for this case, we may regard T as a nonzero linear operator on Kn (C).
Lemma 4.2.1 guarantees that Theorem 4.1.1 holds for this case.
Suppose now that n = 2 and let 0 ≠ A ∈ ker T ∩ Sn (C). Again,
Lemma 4.2.6 guarantees that T (Sn (C)) = 0. Let

A0 ≡ −(1/2) T (F12 ).
Since X − X t = −(tr XF12 )F12 for all X ∈ M2 (C), we have
T (X) = (1/2)[T (X + X t ) + T (X − X t )]
      = (1/2) T (X − X t )
      = −(1/2)(tr XF12 )T (F12 )
      = (tr XF12 )A0 .

Hence T satisfies condition (3) of Theorem 4.1.1. If A0 ≡ αF12 for some


α ∈ C, then we may take ν = α and U = I2 in condition (1) of Theorem 4.1.1.

Lemma 4.3.4 Let T be a given linear operator on Mn (C) that preserves


unitary t-congruence. Suppose that ker T ∩ Sn (C) ≠ {0}. Then T (Sn (C)) =
{0} and one of the following conditions holds:

(a) There exist U ∈ Un (C) and a scalar ν ∈ C such that

T (X) = νU (X − X t )U t for all X ∈ Mn (C).

(b) n = 4 and there exist U ∈ Un (C) and a scalar ν ∈ C such that

T (X) = νU (X − X t )+ U t for all X ∈ M4 (C).

(c) n = 2 and there exists B ∈ M2 (C) such that

T (X) = (tr XF12 )B for all X ∈ M2 (C).

Lemmata 4.3.1–4.3.4 show that if T is a singular linear operator on Mn (C)


that preserves unitary t-congruence, then Theorem 4.1.1 holds. The following
result characterizes the nonsingular T . The proof is the same as that of
Lemma (4.7) of [HHL], and will be omitted.

Lemma 4.3.5 Let T be a nonsingular linear operator on Mn (C) that pre-


serves unitary t-congruence. Then T (Sn (C)) = Sn (C) and T (Kn (C)) =
Kn (C).

Let us now consider the nonsingular T . Lemmata 4.2.1, 4.2.2, and 4.3.5
imply that there exist nonzero scalars µ, ν ∈ C and V1 , V2 ∈ Un (C) such that

T (X) = µV1 XV1t for all X ∈ Sn (C)

and either

(A1) T (Y ) = νV2 Y V2t for all Y ∈ Kn (C); or

(A2) n = 4 and T (Y ) = νV2 Y + V2t for all Y ∈ K4 (C).

By considering various choices for X and Y , we will show that we may take
V2 = βV1 in (A1), for some nonzero β ∈ C. Moreover, we will show that
(A2) cannot happen.

We may assume without loss of generality that T (X) = X for all X ∈ Sn (C)
and T (Y ) = αV Y V t for all Y ∈ Kn (C), where α ≡ ν/µ, since we may
consider instead the linear transformation T1 ≡ (1/µ) V1−1 T V1−t . Thus, we need
to show that we may take V = βI for some nonzero β ∈ C.
Suppose that n = 2. Direct calculation shows that for a given V ∈ Un (C),

V Y V t = (det V )Y for all Y ∈ K2 (C).

Hence, T (Y ) = αV Y V t = α(det V )Y for all Y ∈ K2 (C), as desired.
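(The 2-by-2 identity V Y V t = (det V )Y is a one-liner to verify numerically; the sketch below, in Python with numpy, is ours.)

    import numpy as np

    rng = np.random.default_rng(5)
    Z = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    V, _ = np.linalg.qr(Z)                            # a random 2-by-2 unitary matrix

    F12 = np.array([[0.0, 1.0], [-1.0, 0.0]], dtype=complex)
    Y = (2.0 - 3.0j) * F12                            # an arbitrary element of K2(C)
    print(np.allclose(V @ Y @ V.T, np.linalg.det(V) * Y))   # True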


We now look at the case n ≥ 3. Let

X ≡ diag (1, 2, . . . , n) ∈ Sn (C),

and if n = 2k we let

Y ≡ B ⊕ 2B ⊕ · · · ⊕ kB ∈ Kn (C),

otherwise, if n = 2k + 1, then we take

Y ≡ B ⊕ 2B ⊕ · · · ⊕ kB ⊕ 0 ∈ Kn (C),

where B ≡ F12 ∈ K2 (C).


Notice that T (V XV t + (1/α)Y ) = V (X + Y )V t is unitarily t-congruent to
X + Y . Since T preserves unitary t-congruence and since unitary t-congruence
is an equivalence relation,
U V XV t U t + V U Y U t V t = T [U (V XV t + (1/α)Y )U t ] and X + Y
are unitarily t-congruent for any U ∈ Un (C). Hence, there exists a W ∈
Un (C) such that

X + Y = W (U V XV t U t + V U Y U t V t )W t (4.3.2)

Since X ∈ Sn (C) and Y ∈ Kn (C), we have

X = W U V XV t U t W t = (W U V )X(W U V )t (4.3.3)

and
Y = W V U Y U t V t W t = (W V U )Y (W V U )t . (4.3.4)

Since W, U, V ∈ Un (C), we have W U V, W V U ∈ Un (C) as well. Hence we


may rewrite the identity (4.3.3) as

X(W U V ) = (W U V )X (4.3.5)

and we may rewrite the identity (4.3.4) as

Y (W V U ) = (W V U )Y. (4.3.6)

If we write W U V ≡ [zij ], then the identity (4.3.5) implies that i zij = j zij .
Hence, if i ≠ j, then zij = 0. Thus W U V = diag (z11 , z22 , . . . , znn ).
Similarly, if we partition W V U ≡ [Wij ] conformal to Y , then the identity
(4.3.6) implies that iBWij = jWij B for each i, j. Notice
that B is unitary. Hence, if i ≠ j, then Wij = 0. Moreover, if n = 2k + 1,
then Wi,k+1 = 0 and Wk+1,i = 0 for each i = 1, 2, . . . , k. Thus,

W V U = W11 ⊕ W22 ⊕ · · · ⊕ Wkk if n = 2k,

or
W V U = W11 ⊕ W22 ⊕ · · · ⊕ Wkk ⊕ Wk+1,k+1 if n = 2k + 1,
where Wii ∈ U2 (C) for each i = 1, 2, . . . , k and Wk+1,k+1 ∈ U1 (C).
Since U was arbitrary, we may choose U so that U ∗ V ∗ U is diagonal.
Now, W U V = W V U (U ∗ V ∗ U )V . Hence V = (U ∗ V ∗ U )∗ (W V U )∗ (W U V ) is
block-diagonal and has the same form as W V U , that is, either

V = V1 ⊕ V2 ⊕ · · · ⊕ Vk if n = 2k, (4.3.7)

or
V = V1 ⊕ V2 ⊕ · · · ⊕ Vk ⊕ Vk+1 if n = 2k + 1, (4.3.8)
where Vi ∈ U2 (C) for each i = 1, 2, . . . k and Vk+1 ∈ U1 (C).
We apply the same analysis to the same choices for X ∈ Sn (C) and
U ∈ Un (C), but this time we take

Y1 ≡ F23 = 0 ⊕ B ⊕ 0n−3 ∈ Kn (C).

As before, there exists a W1 ∈ Un (C) such that

X(W1 U V ) = (W1 U V )X (4.3.9)



and
Y1 (W1 V U ) = (W1 V U )Y1 . (4.3.10)
The identity (4.3.9) implies that D1 ≡ W1 U V is diagonal and the identity
(4.3.10) implies that W1 V U has the form
 
W1 V U = [ A11  0    A13
           0    A22  0
           A31  0    A33 ] ,

where A11 ∈ C, A22 ∈ M2 (C), and A33 ∈ Mn−3 (C). Since W1 V U =


(W1 U V )(V ∗ )(U ∗ V U ), it follows that A13 = 0, A31 = 0, and the V1 and the
V2 in (4.3.7) or (4.3.8) are diagonal. Similarly, we can show that every Vi in
(4.3.7) or (4.3.8) is diagonal. Hence, V is diagonal, say V = diag (α1 , α2 , . . . , αn )
with |αi | = 1 for each i = 1, 2, . . . , n. Thus, T (Fij ) = αi αj Fij for each i, j
with 1 ≤ i < j ≤ n.
We summarize this result in the following.

Lemma 4.3.6 Let T be a nonsingular linear operator on Mn (C) that pre-


serves unitary t-congruence. Suppose that T (X) = X for all X ∈ Sn (C), and
for some scalar ν ∈ C and V ∈ Un (C), T (Y ) = νV Y V t for all Y ∈ Kn (C).
Then there exist scalars α1 , . . . , αn ∈ C with |αi | = 1 for each i = 1, . . . , n
such that V = diag (α1 , . . . , αn ). Consequently, T (Fij ) = αi αj Fij for each
i, j with 1 ≤ i < j ≤ n.

The proof of Lemma (4.9) of [HHL] can be used (again, verbatim) to prove
the following result.

Lemma 4.3.7 Let T be a nonsingular linear operator on Mn (C) that pre-
serves unitary t-congruence. Suppose T (X) = X for all X ∈ Sn (C), and
for each i, j with 1 ≤ i < j ≤ n, there exists a nonzero µij ∈ C such that
T (Fij ) = µij Fij . Then
T (X) = (1/2)[(X + X t ) + µ12 (X − X t )] for all X ∈ Mn (C).
If T (X) = X for all X ∈ Sn (C) and T (Y ) = νV Y V t for all Y ∈ Kn (C),
then Lemmata 4.3.6 and 4.3.7 imply that we may take V = βI for some
β ∈ C, as desired.

We now look at the remaining case n = 4, T (X) = µU XU t for all X ∈
S4 (C), and T (Y ) = νV Y + V t for all Y ∈ K4 (C), where µ, ν ∈ C are nonzero
and U, V ∈ U4 (C). As before, we may assume without loss of generality that
T (X) = X for all X ∈ S4 (C) and T (Y ) = αV Y + V t for all Y ∈ K4 (C).
By considering various choices for X and Y , we will show that this cannot
happen. In particular, we will prove the following.

Lemma 4.3.8 Let a scalar α ∈ C and V ∈ U4 (C) be given. Then
T (X) ≡ (1/2)(X + X t ) + αV (X − X t )+ V t for all X ∈ M4 (C)
preserves unitary t-congruence if and only if α = 0.
If α = 0, then direct verification shows that T preserves unitary t-
congruence. Suppose α ≠ 0 and T preserves unitary t-congruence. Let
X ≡ diag (1, 2, 3, 4), Y ≡ B ⊕ 2B, and U ≡ U1 ⊕ U2 ,
where B ≡ F12 ∈ K2 (C), and U1 , U2 ∈ U2 (C). Notice that (U Y U t )+ =
U Y U t . As before (see (4.3.2)), there exists a W ∈ U4 (C) such that
X + Y = W (U V XV t U t + V U Y U t V t )W t .
Consequently (see (4.3.5)),
X(W U V ) = (W U V )X (4.3.11)
and (see (4.3.6))
Y (W V U ) = (W V U )Y. (4.3.12)
The identity (4.3.11) implies that W U V is diagonal and the identity (4.3.12)
implies that W V U = Z1 ⊕ Z2 for some Z1 , Z2 ∈ U2 (C). Now,
U ∗ V = V (W U V )∗ (W V U )U ∗ . (4.3.13)
Since W U V is diagonal, and W V U and U are block-diagonal, we may write
(W U V )∗ (W V U )U ∗ ≡ W1 ⊕W2 for some W1 , W2 ∈ U2 (C). Partition V ≡ [Vij ]
conformal to U . The identity (4.3.13) can be rewritten as
(a) V11 W1 = U1∗ V11
(b) V12 W2 = U1∗ V12                      (4.3.14)
(c) V21 W1 = U2∗ V21
(d) V22 W2 = U2∗ V22 .

Let V11 = Q1 ΣQ2 be a (given) singular value decomposition of V11 , where


Σ = diag (b1 , b2 ) with b1 ≥ b2 ≥ 0, and Q1 , Q2 ∈ U2 (C). Then the identity
(4.3.14a) implies that
Σ(Q2 W1 Q∗2 ) = (Q∗1 U1∗ Q1 )Σ. (4.3.15)
If we write D1 ≡ Q2 W1 Q∗2 = [wij ] and D2 ≡ Q∗1 U1∗ Q1 = [uij ], the identity
(4.3.15) gives
(a) b1 w11 = b1 u11
(b) b1 w12 = b2 u12                      (4.3.16)
(c) b2 w21 = b1 u21
(d) b2 w22 = b2 u22 .
Suppose b1 > b2 . Since b2 ≥ 0, the identity (4.3.16a) shows that w11 = u11 .
The identity (4.3.16b) and the fact that both D1 and D2 are unitary imply
that w12 = u12 = 0. Moreover, w21 = u21 = 0 as well. Hence D2 = Q∗1 U1∗ Q1 is
diagonal. Since U1 is arbitrary, this is a contradiction. Thus b1 = b2 . Hence
V11 = b1 Q1 Q2 = b1 Q, where we put Q ≡ Q1 Q2 ∈ U2 (C). If b1 = 0, then
V11 = 0 and since V is unitary, V has the form
" #
0 V12
V = . (4.3.17)
V21 0

If b1 > 0, we let U1 ≡ I2 . The identity (4.3.14a) becomes b1 QW1 = b1 Q.


Hence, W1 = I2 . Now, the identity (4.3.14c) becomes V21 = U2∗ V21 . Since U2
is arbitrary, we conclude that V21 = 0. Since V is unitary, V12 = 0 as well.
Hence V has the form
V = [ V11  0 ; 0  V22 ] .                (4.3.18)
Now make the same choice of X, and let
 
Y1 ≡ F23 + 2F14 = [ 0   0   0  2
                    0   0   1  0
                    0   −1  0  0
                    −2  0   0  0 ]

and U1 ≡ diag (α1 , α2 , α3 , α4 ) ∈ U4 (C). As before, (U1 Y1 U1t )+ = U1 Y1 U1t and


moreover, there exists a W1 ∈ U4 (C) such that
X(W1 U1 V ) = (W1 U1 V )X (4.3.19)

and
Y1 (W1 V U1 ) = (W1 V U1 )Y1 .           (4.3.20)
Again, W1 U1 V is diagonal and the identity (4.3.20) shows that W1 V U1 has
the form
[ z11  0    0    z14
  0    z22  z23  0
  0    z32  z33  0
  z41  0    0    z44 ] .                 (4.3.21)
Now,
U1∗ V = V (W1 U1 V )∗ (W1 V U1 )U1∗ . (4.3.22)
Since W1 U1 V and U1 are diagonal, D ≡ (W1 U1 V )∗ (W1 V U1 )U1∗ has the form
(4.3.21). Since V has the form (4.3.17) or (4.3.18), the identity (4.3.22) shows
that D is diagonal, that is, z14 = z23 = z32 = z41 = 0. Now choose U1 having
distinct (diagonal) elements. Then the identity (4.3.22) implies that each of
the Vij in (4.3.17) and (4.3.18) has the form either
" #
ν1 0
(4.3.23)
0 ν2
or " #
0 ν4
. (4.3.24)
ν3 0
We will show that neither of these choices is possible.
To reiterate, we are assuming
T (X) ≡ (1/2)(X + X t ) + αV (X − X t )+ V t for all X ∈ M4 (C), α ≠ 0,
where V ∈ U4 (C) has either the form (4.3.17) or (4.3.18), and each Vij has
the form either (4.3.23) or (4.3.24). Notice that if T (A1 ) = W T (A2 )W t , then
(1/2)(A1 + At1 ) = W (1/2)(A2 + At2 )W t                 (4.3.25)
and
(1/2)V (A1 − At1 )+ V t = W V (1/2)(A2 − At2 )+ V t W t .    (4.3.26)

" #
0 V12
Case 1 V ≡ .
V21 0
Let    
1 0 1 0 1 1 0 0
 0 0 0 0   −1 0 0 0 
   
A1 ≡   and A2 ≡  .
 −1 0 0 0   0 0 0 0 
0 0 0 2 0 0 0 2
With " #
0 1
U ≡1⊕ ⊕ 1 ∈ U4 (C), (4.3.27)
1 0
one checks that A1 = U A2 U t . Hence, there exists a W ∈ U4 (C) such that
T (A1 ) = W T (A2 )W t . Now the identity (4.3.25) shows that W has the form
 
W ≡ [ w11  0    0    0
      0    w22  w23  0
      0    w32  w33  0
      0    0    0    w44 ] .
The identity (4.3.26) shows that
[ 0  −V21 BV12t ; V12 BV21t  0 ] = W [ 0  0 ; 0  V21 CV21t ] W t ,   (4.3.28)
where B ≡ [ 0  0 ; 0  1 ] and C ≡ [ 0  1 ; −1  0 ] .
Since V21 CV21t = βC, where β ≡ det V21 ≠ 0, direct computation shows that
W [ 0  0 ; 0  V21 CV21t ] W t = [ 0  0          0          0
                                  0  0          0          βw23 w44
                                  0  0          0          βw33 w44
                                  0  −βw23 w44  −βw33 w44  0        ] .   (4.3.29)
Now, use (4.3.28) and (4.3.29) to get
βw33 w44 = 0                             (4.3.30)
and
V12 BV21t = V12 [ 0  0 ; 0  1 ] V21t = [ 0  0 ; 0  βw23 w44 ] .   (4.3.31)

Since β ≠ 0 and w44 ≠ 0, (4.3.30) shows that w33 = 0. Since W is unitary,
w22 = 0 as well. The identity (4.3.31) implies that V12 and V21 are diagonal,
say
V12 ≡ [ ν1  0 ; 0  ν2 ]  and  V21 ≡ [ ν3  0 ; 0  ν4 ] .
Now, consider
D1 ≡ [ 1   0   2  0
       0   2   0  3
       −2  0   3  0
       0   −3  0  4 ]
and
D2 ≡ [ 1   2  0   0
       −2  3  0   0
       0   0  2   3
       0   0  −3  4 ] .

With the U in (4.3.27), we have D1 = U D2 U t . Hence, there exists a W1 ∈
U4 (C) such that T (D1 ) = W1 T (D2 )W1t . Moreover,
(1/2)(D1 + D1t ) = W1 (1/2)(D2 + D2t )W1t                (4.3.32)
and
(1/2)V (D1 − D1t )+ V t = W1 V (1/2)(D2 − D2t )+ V t W1t .   (4.3.33)
The identity (4.3.32) shows that W1 has the form
W1 ≡ [ u11  0    0    0
       0    0    u23  0
       0    u32  0    0
       0    0    0    u44 ] .

Consequently, (4.3.33) becomes
[ 0        0        −3ν1 ν3  0
  0        0        0        −2ν2 ν4
  3ν1 ν3   0        0        0
  0        2ν2 ν4   0        0       ]
=
[ 0    0    2β  0
  0    0    0   3δ
  −2β  0    0   0
  0    −3δ  0   0 ] ,                    (4.3.34)

where β ≡ u11 u32 ν1 ν2 and δ ≡ u23 u44 ν3 ν4 . Since W1 and V are unitary,
|β| = |δ| = |νi νj | = 1 for all 1 ≤ i, j ≤ 2, and hence (4.3.34) is impossible.

Case 2. V ≡ V11 ⊕ V22 .



For this case, we take
A1 ≡ [ 1   0  1  0
       0   2  0  0
       −1  0  0  0
       0   0  0  0 ]
and
A2 ≡ [ 1   0  0  1
       0   2  0  0
       0   0  0  0
       −1  0  0  0 ] .

One checks that A1 and A2 are unitarily t-congruent. Hence, there exists
a W ∈ U4 (C) such that T (A1 ) = W T (A2 )W t . As in Case 1, the identities
(4.3.25) and (4.3.26) hold. The identity (4.3.25) now shows that W has the
form
" # " #
w11 0 w33 w43
W ≡ W1 ⊕ W2 , with W1 = and W2 = .
0 w22 w34 w44

Consequently, (4.3.26) reduces to

V1 BV2t = W1 V1 CV2t W2t , (4.3.35)

where " # " #


0 0 0 1
B≡ and C ≡ .
0 1 0 0
The identity (4.3.35) can now be rewritten as

B(V2∗ W2∗ V2 )t = (V1∗ W1 V1 )C. (4.3.36)

Since V1 has the form either (4.3.23) or (4.3.24) and W1 is diagonal, V1∗ W1 V1
is diagonal. Thus, if we write (V2∗ W2∗ V2 )t = [uij ], (4.3.36) shows that u21 =
u22 = 0. Hence, V2∗ W2∗ V2 is singular, a contradiction to the fact that both V2
and W2 are unitary.

4.4 The Real Case


Lemma 4.4.1 Let T be a given linear operator on Mn (IR). Suppose T pre-
serves orthogonal similarity. Then ker T is a direct sum of one or more of
the following spaces: Λn , Sn0 (IR), and Kn (IR).

Proof Let 0 ≠ A ∈ ker T . Lemma 4.2.9 guarantees that AS ≡ A + At ∈
span O(A) and AK ≡ A − At ∈ span O(A). Thus, AS ∈ ker T and
AK ∈ ker T . If AK ≠ 0, Lemma 4.2.5 ensures that Kn (IR) ⊂ ker T ;
if AS ∈ Λn , then Λn ⊂ ker T ; if 0 ≠ AS ∈ Sn0 (IR), Lemma 4.2.7
guarantees that Sn0 (IR) ⊂ ker T ; and if AS ∉ Λn ∪ Sn0 (IR), Lemma 4.2.8
ensures that Sn (IR) ⊂ ker T .
Let T be a given linear operator on Mn (IR) that preserves orthogonal
similarity. Notice that for every X ∈ Mn (IR), we can write X = XS + XK ,
where
XS ≡ (X + X t )/2 ∈ Sn (IR) and XK ≡ (X − X t )/2 ∈ Kn (IR).
Hence, T (X) = T1 (X) + T2 (X), where
T1 (X) ≡ T (XS ) and T2 (X) ≡ T (XK ) for all X ∈ Mn (IR).
One checks that both T1 and T2 preserve orthogonal similarity. Moreover,
Kn (IR) ⊂ ker T1 and Sn (IR) ⊂ ker T2 .
Since Sn (IR) = Λn ⊕ Sn0 (IR), Lemma 4.4.1 guarantees that either Kn (IR) ⊂
ker T2 (so that T2 = 0) or Kn (IR) ∩ ker T2 = {0} (so that T2 is nonsingular
on Kn (IR)). However, in the case of T1 , any subset of {Λn , Sn0 (IR)} could
also be a subset of ker T1 . We begin with this case.
Lemma 4.4.2 Let T be a given linear operator on Mn (IR) that preserves
orthogonal similarity. Then T1 satisfies one of the following three conditions:
(A1) There exists A0 ∈ Mn (IR) such that
T1 (X) = (tr X)A0 for all X ∈ Mn (IR).

(A2) There exist U ∈ Un (IR) and scalars α, β ∈ IR such that


T1 (X) = α(tr X)I + βU (X + X t )U t for all X ∈ Mn (IR).

(A3) n = 2 and there exist U ∈ U2 (IR), B ∈ K2 (IR), and scalars α, β ∈ IR


such that
T1 (X) = α(tr X)I + (tr X)B + βU (X + X t )U t
for all X ∈ M2 (IR).

Proof Suppose T preserves orthogonal similarity. Then T1 also preserves


orthogonal similarity and hence Lemma 4.4.1 guarantees that ker T1 is a di-
rect sum of one or more of Λn , Sn0 (IR), Kn (IR). Suppose Sn0 (IR) ⊂ ker T1 .
Since Kn (IR) ⊂ ker T1 and any X ∈ Mn (IR) can be written as X =
((1/n) tr X)I + XS 0 + XK , where XS 0 ∈ Sn0 (IR) and XK ∈ Kn (IR), we have
T1 (X) = (tr X)A0 for all X ∈ Mn (IR),
where we put A0 ≡ (1/n) T1 (I) ∈ Mn (IR). Hence condition (A1) holds.


Now suppose Sn0 (IR) ⊄ ker T1 . Notice that if a nonzero A ∈ Sn0 (IR)
satisfies T1 (A) = 0, then Lemma 4.2.7 ensures that Sn0 (IR) ⊂ ker T1 . Hence,
in this case, T1 (A) ≠ 0 for all nonzero A ∈ Sn0 (IR). Since
dim Sn0 (IR) = n(n + 1)/2 − 1 > n(n − 1)/2 = dim Kn (IR),
there exists a nonzero B ∈ Sn (IR) such that T1 (A) = B for some nonzero
A ∈ Sn0 (IR). It follows that T1 (Sn0 (IR)) ⊂ Sn (IR). Since Λn ⊕Sn0 (IR) = Sn (IR),
we may regard T1 as a mapping from Sn (IR) to Mn (IR). Define

T1 (X) + T1 (X)t
TS (X) ≡ for all X ∈ Sn (IR)
2
and
T1 (X) − T1 (X)t
TK (X) ≡ for all X ∈ Sn (IR).
2
Notice that TS (Sn (IR)) ⊂ Sn (IR). Lemma 4.2.4 guarantees that TS satisfies
one of the following conditions:

(1a) There exists B1 ∈ Sn (IR) such that TS (X) = (tr X)B1 for all X ∈
Sn (IR); or

(1b) There exist U ∈ Un (IR) and α, β ∈ IR such that TS (X) = α(tr X)I +
βU XU t for all X ∈ Sn (IR).

For every nonzero A ∈ Sn0 (IR), T1 (A) is nonzero and symmetric. It follows
that TS (A) = T1 (A) ≠ 0 for all A ∈ Sn0 (IR). Hence (1a) cannot happen.
Notice also that TK (Sn (IR)) ⊂ Kn (IR), so that Sn0 (IR) ⊂ ker TK . Thus,
TK (X) = (tr X)A1 for all X ∈ Sn (IR),
where we put A1 ≡ (1/n) TK (I) ∈ Kn (IR). Since T1 = TS + TK , we have

T1 (X) = (tr X)A1 + α(tr X)I + βU XU t for all X ∈ Sn (IR).

If n = 2, then condition (A3) of Lemma 4.4.2 holds. If n ≥ 3, we consider


various choices for X to show that A1 = 0, and hence condition (A2) holds.
Let n ≥ 3 and suppose A1 ≡ [aij ] ≠ 0, that is, suppose that akh ≠ 0 for
some 1 ≤ k < h ≤ n. Let

U XU t ≡ diag (1, . . . , k, . . . , h, . . . , n)

and
U Y1 U t ≡ diag (1, . . . , h, . . . , k, . . . , n).
Notice that X and Y1 are orthogonally similar. Hence, there exists W ∈
Un (IR) such that T1 (X) = W T1 (Y1 )W t . It follows that

U XU t = W (U Y1 U t )W t (4.4.37)

and
A1 = W A1 W t . (4.4.38)
The identity (4.4.37) and the fact that W ∈ Un (IR) show that W ≡ [wij ] has
entries wkh , whk , wii = ±1 where i = 1, . . . , n, i ≠ h and i ≠ k, and wij = 0
otherwise. The k th and hth rows of the identity (4.4.38) show that

(a) wkh whk ahk = akh ,
(b) wkh wii ahi = aki for i ≠ k, and            (4.4.39)
(c) whk wii aki = ahi for i ≠ h.
Since ahk = −akh ≠ 0, the identity (4.4.39a) shows that wkh whk = −1. The
identities (4.4.39b) and (4.4.39c) now show that ahi = aki = 0 for all integers
i with i ≠ h and i ≠ k.
Since n ≥ 3, there is an integer t such that 1 ≤ t ≤ n, t 6= k, and t 6= h.
Suppose t < k and this time consider

U Y2 U t ≡ diag (1, . . . , k, . . . , t, . . . , n).

Notice that X and Y2 are also orthogonally similar. Repeat the previous
analysis and conclude that akh = 0. Hence, A1 = 0, as desired.

One checks that if a given linear operator L on Mn (IR) satisfies any one
of the three conditions (A1) - (A3), then L preserves orthogonal similarity.
We now consider T2 . Recall that either T2 = 0, or T2 is nonsingular on
Kn (IR). Hence, T2 (A) ≠ 0 for some (necessarily) nonzero A ∈ Kn (IR) if and
only if Kn (IR) ∩ ker T2 = {0}.

Lemma 4.4.3 Let T be a given linear operator on Mn (IR) that preserves


orthogonal similarity. Then T2 satisfies one of the following three conditions:

(B1) There exist U ∈ Un (IR) and a scalar α ∈ IR such that

T2 (X) = αU (X − X t )U t for all X ∈ Mn (IR).

(B2) n = 4 and there exist U ∈ U4 (IR) and a scalar α ∈ IR such that

T2 (X) = αU (X − X t )+ U t for all X ∈ M4 (IR).

(B3) n = 2 and there exists B0 ∈ M2 (IR) with tr B0 = 0 such that

T2 (X) = (tr XF12 )B0 for all X ∈ K2 (IR).

Proof We consider the restriction of T2 to Kn (IR). Define
TS (X) ≡ (T2 (X) + T2 (X)t )/2 for all X ∈ Kn (IR),
and
TK (X) ≡ (T2 (X) − T2 (X)t )/2 for all X ∈ Kn (IR).
Notice that for any X ∈ Kn (IR), we have T2 (X) = TS (X) + TK (X).
Moreover, TS (Kn (IR)) ⊂ Sn (IR) and TK (Kn (IR)) ⊂ Kn (IR). We will show
that if n ≥ 3, then TS = 0 so that Lemma 4.2.3 ensures that either (B1) or
(B2) holds. If n = 2, we will show that T2 satisfies (B3).
Suppose TS ≠ 0. Since TS also preserves orthogonal similarity, TS is
nonsingular. Since −A ∈ O(A) for any A ∈ Kn (IR), we have −TS (A) ∈
O(TS (A)). Hence, if λ is an eigenvalue of TS (A) with multiplicity k, then −λ
is also an eigenvalue of TS (A) with multiplicity k. Moreover, if A ≠ 0, then
TS (A) has a positive eigenvalue.

Now, Im TS is a subspace of Sn (IR) with dimension n(n − 1)/2. If Z is a
subspace of Sn (IR) with dimension k, and if r ∈ {2, . . . , n − 1} and
k ≥ κ(r) ≡ (r − 1)(2n − r + 2)/2,              (4.4.40)
then Theorem (1) of [FL] guarantees that Z contains a nonzero matrix B
whose largest eigenvalue has multiplicity at least r. This result, together with
the fact that the eigenvalues of TS (A) are paired, will give us a contradiction.
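(The two dimension inequalities invoked below are elementary arithmetic; the short Python script that follows, ours and illustrative only, confirms them for a range of small n.)

    def kappa(r, n):
        # the bound (4.4.40) from Theorem (1) of [FL]
        return (r - 1) * (2 * n - r + 2) // 2

    for t in range(1, 8):                    # odd case: n = 2t + 1
        n = 2 * t + 1
        assert n * (n - 1) // 2 >= kappa(t + 1, n)
    for t in range(3, 10):                   # even case: n = 2t with t >= 3
        n = 2 * t
        assert n * (n - 1) // 2 >= kappa(t + 1, n)
    print("dimension inequalities hold")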
Let n = 2t + 1, where t = 1, 2, . . .. Then 2 ≤ t + 1 ≤ n − 1 and
k = n(n − 1)/2 = t(2t + 1) ≥ 3t(t + 1)/2 = κ(t + 1).
Hence, there exists a nonzero B ∈ Im TS whose largest eigenvalue λ has
multiplicity at least t + 1. Since B ∈ Im TS and B ≠ 0, we must have λ > 0.
But since −λ is also an eigenvalue of B with multiplicity at least t + 1, this
is a contradiction.
Now let n = 2t, where t = 3, 4, . . .. Then 2 ≤ t + 1 ≤ n − 1 and
k = n(n − 1)/2 = t(2t − 1) ≥ t(3t + 1)/2 = κ(t + 1).
Hence, there exists a nonzero B ∈ Im TS whose largest eigenvalue is positive
and has multiplicity at least t + 1. Again, this is a contradiction.
Suppose now that n = 4. Notice that dim Im TS = 6 > 4 = κ(2).
Hence, Im TS contains a matrix of the form Y0 ≡ Qdiag (a, a, −a, −a)Qt
where Q ∈ U4 (IR) and a > 0. We will show that this cannot happen.
One checks that the tangent space at any A ∈ M4 (IR) is given by

TA ≡ {XA + AX t : X ∈ K4 (IR)}.

Moreover,
dim TY0 = dim TQt Y0 Q = 2
and
dim TA ≥ 4 for any nonzero A ∈ K4 (IR).
Since TS is nonsingular on K4 (IR), it must be that

dim TA ≤ dim TTS (A) for any A ∈ K4 (IR).



Hence, Y0 ∉ Im TS . Thus, if n = 4, then TS = 0.


Now consider the case n = 2. Notice that
X = −(1/2)(tr XF12 )F12 for every X ∈ K2 (IR).
It follows that

T (X) = (tr XF12 )A0 for all X ∈ K2 (IR),

where we put A0 ≡ −(1/2) T (F12 ). Since F12 and −F12 are orthogonally similar,
A0 is orthogonally similar to −A0 . Hence, tr A0 = tr (−A0 ) = −tr A0 , so
that tr A0 = 0. Notice that if A0 ∈ K2 (IR), then condition (B1) also holds.

We now return to consider our original T = T1 + T2 . First, we consider


n ≥ 3. Lemma 4.4.2 guarantees that T1 satisfies either (A1) or (A2), and
Lemma 4.4.3 guarantees that T2 satisfies either (B1) or (B2). We consider
these cases separately.

Case 1. T1 (X) = (tr X)A0 and n ≥ 3.


First we consider T2 (X) = cV (X − X t )V t , that is,

T (X) = (tr X)A0 + cV (X − X t )V t for all X ∈ Mn (IR).

If c = 0, then condition (1) of Theorem 4.1.2 holds. Suppose now that c ≠ 0.


Write A0 = AS + AK , where AS ∈ Sn (IR) and AK ∈ Kn (IR). Consider
X1 ≡ (1/n)I + (1/2c)V t AK V and X2 ≡ (1/n)I − (1/2c)V t AK V.
One checks that X1 and X2 are orthogonally similar. Hence, T (X1 ) = AS +
2AK is orthogonally similar to T (X2 ) = AS . It follows that AK = 0 and
A0 = AS ∈ Sn (IR). We will show that A0 = αI for some α ∈ IR.
Since A0 ∈ Sn (IR), there exist a Q ∈ Un (IR) and a1 , a2 , a3 , . . . , an ∈ IR
such that A0 = Q diag (a1 , a2 , a3 , . . . , an )Qt . Consider
X3 ≡ (1/n)I + (1/2c)V t Q ( [ 0   0  1
                              0   0  0
                              −1  0  0 ] ⊕ 0n−3 ) Qt V ∈ Mn (IR)
and
X4 ≡ (1/n)I + (1/2c)V t Q ( [ 0  0   0
                              0  0   1
                              0  −1  0 ] ⊕ 0n−3 ) Qt V ∈ Mn (IR).
One checks that X3 and X4 are orthogonally similar. Hence T (X3 ) is orthog-
onally similar to T (X4 ). It follows that
   
B1 ≡ [ a1  0   1
       0   a2  0
       −1  0   a3 ]
and
B2 ≡ [ a1  0   0
       0   a2  1
       0   −1  a3 ]
are orthogonally similar. Thus,

a2 (a1 a3 + 1) = det B1 = det B2 = a1 (a2 a3 + 1),

and hence a2 = a1 . Similarly, we can show that ai = a1 for each i = 3, . . . , n.


Hence, A0 = a1 I and condition (2) of Theorem 4.1.2 holds with β = 0.
Suppose now that T2 satisfies (B2), that is, n = 4 and

T (X) = (tr X)A0 + cV (X − X t )+ V t for all X ∈ M4 (IR).

If c = 0, then condition (1) of Theorem 4.1.2 holds. Suppose that c ≠ 0. We


will show that A0 = aI for some a ∈ IR so that condition (3) of Theorem 4.1.2
holds. We will use the facts that Y + ∈ K4 (IR) and (Y + )+ = Y whenever
Y ∈ K4 (IR).
Write A0 ≡ AS + AK , where AS ∈ S4 (IR) and AK ∈ K4 (IR). Let
X1 ≡ (1/n)I + (1/2c)(V t AK V )+ and X2 ≡ (1/n)I − (1/2c)(V t AK V )+ .
As in the previous case, X1 and X2 are orthogonally similar, so that T (X1 ) =
AS + 2AK is orthogonally similar to T (X2 ) = AS . It follows that AK = 0.
Moreover, the previous analysis can be used to show that A0 = aI, as desired.

Case 2. T1 (X) = a(tr X)I + bU (X + X t )U t


If T2 satisfies (B1), we may follow the analysis of the complex case (see
the discussion after Lemma 4.3.5) to show that we may take V = U . Thus,
condition (2) of Theorem 4.1.2 holds.

Suppose n = 4 and T2 satisfies (B2). If b = 0, then


T (X) = a(tr X)I + cV (X − X t )+ V t for all X ∈ M4 (IR),
so that condition (3) of Theorem 4.1.2 holds. If b 6= 0, we may again follow
the analysis of the complex case (see the discussion after Lemma 4.3.8) to
show that c = 0. Thus, T satisfies condition (2) of Theorem 4.1.2 with µ = 0.
This proves the forward implication of Theorem 4.1.2. The converse can
be proven by direct verification.

For the remaining case, n = 2, notice that if K2 (IR) ⊂ ker T , then


Lemma 4.4.2 shows that either condition (1) or condition (2) of Theorem 4.1.3
holds. It will be useful to have the following observation.

Proposition 4.4.4 Let A, B ∈ M2 (IR) be given. There exists U ∈ U2 (IR)


such that A = U BU t if and only if the following three conditions are satisfied:
(a) det A = det B,
(b) tr A = tr B, and
(c) tr At A = tr B t B.

Proof The forward implication is easily verified. For the converse, notice
that the conditions imply that A and B have the same eigenvalues and
singular values. Hence, the matrices Aα ≡ A + αI and Bα ≡ B + αI
also satisfy the three conditions for any α ∈ IR. Notice that A and B
are orthogonally similar if and only if there is some α ∈ IR such that Aα
and Bα are orthogonally similar. Thus, we may assume that tr A ≠ 0
and det A ≠ 0. Since A and B have the same singular values, there
exist X1 , X2 , V1 , V2 ∈ U2 (IR) such that
X1 AX1t = ΣX2 and V1 BV1t = ΣV2 , (4.4.41)
where Σ ≡ diag (σ1 , σ2 ) and σ1 ≥ σ2 > 0. Since the determinant is
invariant under orthogonal similarity, it follows from condition (a) and
(4.4.41) that det (X2 ) = det (V2 ). Now, any U ∈ U2 (IR) can be written
as " #
cos θ sin θ
U≡ , for some θ ∈ IR.
∓ sin θ ± cos θ

Hence, (b) and the fact that tr A ≠ 0 show that X2 and V2 have
the same diagonal elements. Direct verification now shows that X2 =
W V2 W t , where W ≡ diag (1, ±1) ∈ U2 (IR). Hence,

ΣX2 = ΣW V2 W t = W ΣV2 W t .

It follows that A and B are orthogonally similar.
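(The forward direction of Proposition 4.4.4 is immediate to test; the sketch below, in Python with numpy, is ours, with a random rotation and test matrix as arbitrary choices.)

    import numpy as np

    def invariants(A):
        # the three orthogonal-similarity invariants of Proposition 4.4.4
        return np.array([np.linalg.det(A), np.trace(A), np.trace(A.T @ A)])

    rng = np.random.default_rng(6)
    B = rng.standard_normal((2, 2))
    th = rng.uniform(0.0, 2.0 * np.pi)
    U = np.array([[np.cos(th), np.sin(th)], [-np.sin(th), np.cos(th)]])
    A = U @ B @ U.T

    print(np.allclose(invariants(A), invariants(B)))   # True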

Suppose K2 (IR) ⊄ ker T . Assume further that S20 (IR) ⊂ ker T . Write
X = (1/2)(tr X)I + XS 0 − (1/2)(tr XF12 )F12 .      (4.4.42)
Then XS 0 ∈ S20 (IR) and hence

T (X) = (tr X)A0 + (tr XF12 )B0 for all X ∈ M2 (IR),

where A0 ≡ (1/2) T (I) and B0 ≡ −(1/2) T (F12 ). Consider

X1 ≡ F12 and X2 ≡ −F12

to conclude that tr B0 = 0. Now let

Y1 ≡ I − F12 and Y2 ≡ I + F12 .

Then we have det T (Y1 ) = det T (Y2 ). Direct computation shows that this
implies tr A0 B0 = 0. Moreover, we have

tr T (Y1 )t T (Y1 ) = tr T (Y2 )t T (Y2 ).

For this case, it follows that tr At0 B0 = 0. Hence, T satisfies condition (1) of
Theorem 4.1.3.
Suppose S20 (IR) 6⊂ ker T and write X as in (4.4.42). Since the restriction
of T = T1 + T2 to S20 (IR) is nonsingular, it follows from Lemma 4.4.2 that

T (X) = (αtr X)I + βU (X + X t )U t + (tr X)C0 + (tr XF12 )D0

for all X ∈ M2 (IR), with β ≠ 0, C0 ∈ K2 (IR), and D0 ≡ −(1/2)T (F12 ). Now
use X1 = −X2 = F12 to show that tr D0 = 0. Hence, we may write D0 ≡
DS 0 + µF12 , where DS 0 ∈ S20 (IR).

Use Proposition 4.4.4 to show that

X3 ≡ U t DS 0 U + βF12 and X4 ≡ U t DS 0 U − βF12

are orthogonally similar. It follows that T (X3 ) is orthogonally similar to
T (X4 ), and DS 0 = 0, so that D0 = µF12 . Since K2 (IR) ⊄ ker T , we must
have µ ≠ 0.
Consider
X5 ≡ µI + C0 and X6 ≡ µI − C0 .
Since C0 ∈ K2 (IR), X5 and X6 are orthogonally similar. The orthogonal
similarity of T (X5 ) and T (X6 ) shows that C0 = 0. Now,

(det U )(tr XF12 )F12 = U (X − X t )U t for all X ∈ M2 (IR).

Hence, T satisfies condition (2) of Theorem 4.1.3.


The converse can be verified with the help of Proposition 4.4.4.
Appendix A

A New Look at the Jordan Canonical Form

The ultimate proof of the Jordan canonical form theorem is a kind of
mathematical holy grail, and we join others who have sought this treasure
in recent years ([Br], [FS], [GW], [Ho], [V]). Our matrix-theoretic approach
builds on the Schur triangularization theorem and shows how to reduce any
unispectral upper triangular matrix to a direct sum of Jordan blocks. Con-
noisseurs of the Jordan canonical form will recognize in our argument key
elements of Pták’s classic basis-free proof [P].
Our terminology and notation are standard, as in [HJ1]: we denote the set
of m-by-n complex matrices by Mm,n and write Mn ≡ Mn,n ; AT denotes the
transpose of A, and A∗ denotes the conjugate transpose; the k-by-k upper
triangular Jordan block with eigenvalue λ is denoted by Jk (λ).
Lemma A.1 Let positive integers k, n be given with k < n. Let X1 , Y T ∈
Mk,n be such that X1 Y ∈ Mk is nonsingular. Then there exists an X2 ∈
Mn−k,n such that X2 Y = 0 and X ≡ [X1T X2T ]T ∈ Mn is nonsingular.
Proof Since Y has full (column) rank k, we let {ξ1 , . . . , ξn−k } ⊂ Cn be a
basis for the orthogonal complement of the span of the columns of Y ,
and set X2 ≡ [ξ1 · · · ξn−k ]∗ . Then X2 Y = 0 and X2 has full (row)
rank. If X ≡ [X1T X2T ]T is singular, then its rows are dependent and
there are vectors η ∈ Ck and ζ ∈ Cn−k , not both zero, such that
" #T " #
η T T X1
X = [η ζ ] = η T X1 + ζ T X2 = 0.
ζ X2
Then 0 = η T X1 Y + ζ T X2 Y = η T X1 Y , so η = 0 since X1 Y is nonsin-
gular. Thus, ζ T X2 = 0 and ζ = 0 since X2 has full row rank. We
conclude that X is nonsingular, as desired.
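The only nontrivial ingredient in this construction is an orthonormal basis for the orthogonal complement of the column space of Y , which a complete QR factorization supplies. A minimal numerical sketch (Python with NumPy; the function name is our own, and Y is assumed to have full column rank, which the hypothesis on X1 Y guarantees):

    import numpy as np

    def complete_to_nonsingular(X1, Y):
        # Given X1 (k-by-n) and Y (n-by-k) with X1 @ Y nonsingular, build X2
        # (n-k-by-n) with X2 @ Y = 0, so that X = [X1^T X2^T]^T is
        # nonsingular (the construction in the proof of Lemma A.1).
        n, k = Y.shape
        Q, _ = np.linalg.qr(Y, mode='complete')  # last n-k columns of Q span range(Y)^perp
        X2 = Q[:, k:].conj().T                   # rows of X2 are the xi_j^* of the proof
        return np.vstack([X1, X2])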
Let A ∈ Mn be nilpotent, and let k ≥ 1 be such that Ak = 0 but
Ak−1 ≠ 0; if A = 0, then k = 1 and we set Ak−1 ≡ I. Let x, y ∈ Cn be such
that x∗ Ak−1 y ≠ 0. Define X1 ∈ Mk,n to be the matrix whose rows are
x∗ , x∗ A, . . . , x∗ Ak−1 , and define
Y ≡ [y Ay · · · Ak−1 y] ∈ Mn,k .
Since x∗ Ak−1 y ≠ 0, X1 Y ∈ Mk is nonsingular: its (i, j) entry is x∗ Ai+j−2 y,
so every entry below the main anti-diagonal is zero (because Ak = 0) while
every entry on the anti-diagonal equals x∗ Ak−1 y ≠ 0, and hence the columns
of X1 Y are independent (if k = n, this means X1 is nonsingular). Moreover,
X1 A = Jk (0)X1 (A.0.1)
and
AY = Y JkT (0). (A.0.2)
If k = n we have X1 A = Jn (0)X1 and X1 is nonsingular, so Jn (0) = X1 AX1−1 .
If k < n, Lemma A.1 ensures that there is an X2 ∈ Mn−k,n such that

X2 Y = 0 (A.0.3)

and " #
X1
X≡ ∈ Mn is nonsingular.
X2
Set B ≡ (X2 A)X −1 ∈ Mn−k,n and partition B = [B1 A2 ] with B1 ∈ Mn−k,k
and A2 ∈ Mn−k . Notice that

X2 A = BX = B1 X1 + A2 X2 . (A.0.4)

Now use (A.0.4) and the identities (A.0.3) and (A.0.2) to compute

X2 AY = (X2 A)Y = B1 X1 Y + A2 (X2 Y ) = B1 (X1 Y )

and
X2 AY = X2 (AY ) = X2 (Y JkT (0)) = (X2 Y )JkT (0) = 0.
Since X1 Y is nonsingular, we conclude that B1 = 0. Thus, (A.0.4) simplifies
to
X2 A = A2 X2 (A.0.5)

and we can use (A.0.1) and (A.0.5) to compute XA: stacking X1 A = Jk (0)X1
on top of X2 A = A2 X2 gives

XA = (Jk (0) ⊕ A2 )X.

Since X is nonsingular, we have

C ≡ XAX −1 = Jk (0) ⊕ A2 .
Hence, the given nilpotent matrix A and the block diagonal matrix C are
similar; either A2 is absent (k = n) or it has size strictly less than that of A
and satisfies (A2 )k = 0.
Application of the preceding argument to A2 and its successors (finitely
many steps) now shows that any nilpotent matrix is similar to a direct sum
of singular Jordan blocks.
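The argument is entirely constructive, and it can be traced numerically. Below is a minimal sketch (Python with NumPy) of one reduction step and of the recursion; the function names, the choice x = ei and y = ej from a largest entry of Ak−1 , and the tolerance used to decide that a power of A vanishes are our own choices, and no attention is paid to the conditioning of X:

    import numpy as np

    def reduction_step(A, tol=1e-10):
        # One step of the argument above: returns X and k such that
        # X A X^{-1} = J_k(0) (+) A2 for a nilpotent matrix A.
        n = A.shape[0]
        k, P = 1, np.eye(n)               # P holds A^{k-1}
        while np.abs(A @ P).max() > tol:  # find k with A^k = 0, A^{k-1} != 0
            P, k = A @ P, k + 1
        # Take x = e_i and y = e_j, where (i, j) locates a largest
        # entry of A^{k-1}, so that x^* A^{k-1} y != 0.
        i, j = np.unravel_index(np.abs(P).argmax(), P.shape)
        powers = [np.linalg.matrix_power(A, m) for m in range(k)]
        X1 = np.vstack([powers[m][i, :] for m in range(k)])       # rows x^* A^m
        Y = np.column_stack([powers[m][:, j] for m in range(k)])  # columns A^m y
        if k == n:
            return X1, k
        Q, _ = np.linalg.qr(Y, mode='complete')
        X2 = Q[:, k:].T                   # X2 Y = 0, as in Lemma A.1
        return np.vstack([X1, X2]), k

    def nilpotent_jordan_sizes(A, tol=1e-10):
        # Apply the reduction step to A2 and its successors; the list
        # of k's is the list of Jordan block sizes of A.
        sizes = []
        while A.size:
            X, k = reduction_step(A, tol)
            C = X @ A @ np.linalg.inv(X)
            sizes.append(k)
            A = C[k:, k:]                 # the deflated block A2
        return sorted(sizes, reverse=True)

For example, applied to any matrix similar to J3 (0) ⊕ J2 (0), the sketch should return [3, 2], up to roundoff.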
Lemma A.2 Let A ∈ Mn be nilpotent. Then there exist a nonsingular
S ∈ Mn and positive integers m, k1 ≥ · · · ≥ km ≥ 1 such that k1 + · · · + km = n
and

SAS −1 = Jk1 (0) ⊕ Jk2 (0) ⊕ · · · ⊕ Jkm (0).
It is a standard consequence of Schur’s triangularization theorem that
any complex matrix is similar to a direct sum of upper triangular matrices,
each of which has only one eigenvalue; see Theorem (2.4.8) in [HJ1] for a
proof.
Lemma A.3 Suppose that A ∈ Mn has distinct eigenvalues λ1 , . . . , λr with
respective multiplicities n1 , . . . , nr . Then A is similar to a direct sum of the
form

T1 ⊕ T2 ⊕ · · · ⊕ Tr , (A.0.6)
where each Ti ∈ Mni is upper triangular with all diagonal entries equal to λi ,
i = 1, . . . , r.

For a block matrix of the form (A.0.6), notice that each Ti − λi Ini is
nilpotent for i = 1, . . . , r. Lemma A.2 guarantees that for each i there exist
a nonsingular Si ∈ Mni and positive integers mi and ki,1 ≥ · · · ≥ ki,mi ≥ 1
such that Jki,1 (0) ⊕ · · · ⊕ Jki,mi (0) = Si (Ti − λi Ini )Si−1 = Si Ti Si−1 − λi Ini .
Hence, Si Ti Si−1 = Jki,1 (λi ) ⊕ · · · ⊕ Jki,mi (λi ). If we now consider the similarity
of (A.0.6) obtained with the direct sum S1 ⊕ · · · ⊕ Sr , we obtain the desired
result:

Jordan Canonical Form Theorem Let A ∈ Mn . Then there exist positive
integers m, k1 , . . . , km with k1 + · · · + km = n, scalars λ1 , . . . , λm , and a
nonsingular S ∈ Mn such that

SAS −1 = Jk1 (λ1 ) ⊕ Jk2 (λ2 ) ⊕ · · · ⊕ Jkm (λm ).

Once one has the Jordan form, its uniqueness (up to permutation of the
diagonal blocks) can be established by showing that the sizes and multiplici-
ties of the blocks are determined by the finitely many integers rank (A−λI)k ,
λ an eigenvalue of A, k = 1, . . . , n; see the proof of Theorem (3.1.11) in [HJ1].
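In fact those integers yield an explicit count: the number of k-by-k blocks with eigenvalue λ is rk−1 − 2rk + rk+1 , where rj ≡ rank (A − λI)j and r0 ≡ n, since rk−1 − rk counts the blocks of size at least k. A minimal numerical sketch (Python with NumPy; the function name and tolerance are our own, and floating-point rank determination is trustworthy only when the nonzero singular values are well separated from the tolerance):

    import numpy as np

    def jordan_block_counts(A, lam, tol=1e-8):
        # Number of k-by-k Jordan blocks of A with eigenvalue lam:
        #   b_k = r_{k-1} - 2 r_k + r_{k+1}, r_j = rank (A - lam I)^j, r_0 = n.
        n = A.shape[0]
        N = A - lam * np.eye(n)
        r = [n] + [np.linalg.matrix_rank(np.linalg.matrix_power(N, j), tol=tol)
                   for j in range(1, n + 2)]
        return {k: r[k - 1] - 2 * r[k] + r[k + 1]
                for k in range(1, n + 1) if r[k - 1] - 2 * r[k] + r[k + 1] > 0}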
Bibliography

[A] L. Autonne. Sur les Matrices Hypohermitiennes et sur les Matrices Unitaires. Annales de l'Université de Lyon, Nouvelle Série I, Fasc. 38 (1915), 1–77.

[Bo] W. M. Boothby. An Introduction to Differentiable Manifolds and Riemannian Geometry. Academic Press, New York, 1975.

[BP] E. P. Botta and S. Pierce. The Preservers of Any Orthogonal Group. Pacific J. Math. 70 (1977), 37–49.

[Br] R. A. Brualdi. The Jordan Canonical Form: an Old Proof. Amer. Math. Monthly 94 (1987), 257–267.

[CH1] D. Choudhury and R. A. Horn. An Analog of the Gram-Schmidt Algorithm for Complex Bilinear Forms and Diagonalization of Complex Symmetric Matrices. Technical Report No. 454, Department of Mathematical Sciences, The Johns Hopkins University, 1986.

[CH2] D. Choudhury and R. A. Horn. A Complex Orthogonal-Symmetric Analog of the Polar Decomposition. SIAM J. Alg. Disc. Meth. 8 (1987), 219–225.

[DBS] N. G. De Bruijn and G. Szekeres. On Some Exponential and Polar Representations of Matrices. Nieuw Archief voor Wiskunde (3) III (1955), 20–32.

[D] J. D. Dixon. Rigid Embedding of Simple Groups in the General Linear Group. Can. J. Math. XXIX (1977), 384–391.

[F] H. Flanders. Elementary Divisors of AB and BA. Proc. Amer. Math. Soc. 2 (1951), 871–874.

[FS] R. Fletcher and D. C. Sorensen. An Algorithmic Derivation of the Jordan Canonical Form. Amer. Math. Monthly 90 (1983), 12–16.

[FL] S. Friedland and R. Loewy. Subspaces of Symmetric Matrices Containing Matrices with a Multiple First Eigenvalue. Pacific J. Math. 62 (1976), 389–399.

[GW] A. Galperin and Z. Waksman. An Elementary Approach to Jordan Theory. Amer. Math. Monthly 87 (1981), 728–732.

[G] F. R. Gantmacher. The Theory of Matrices, Vol. 2. Chelsea Publishing Company, New York, 1974.

[Hi] F. Hiai. Similarity Preserving Linear Maps on Matrices. Linear Algebra Appl. 97 (1987), 127–139.

[Ho] Y. P. Hong. A Canonical Form Under φ-Equivalence. Linear Algebra Appl. 147 (1991), 501–549.

[HH1] Y. P. Hong and R. A. Horn. A Canonical Form for Matrices Under Consimilarity. Linear Algebra Appl. 102 (1988), 143–168.

[HH2] Y. P. Hong and R. A. Horn. A Characterization of Unitary Congruence. Linear and Multilinear Algebra 25 (1989), 105–119.

[HHL] Y. P. Hong, R. A. Horn, and C. K. Li. Linear Operators Preserving t-Congruence on Matrices. Linear Algebra Appl., to appear.

[HJ1] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, New York, 1985.

[HJ2] R. A. Horn and C. R. Johnson. Topics in Matrix Analysis. Cambridge University Press, New York, 1991.

[HLT] R. A. Horn, C. K. Li, and N. K. Tsing. Linear Operators Preserving Certain Equivalence Relations on Matrices. SIAM J. Matrix Anal. Appl. 12 (1991), 195–204.

[K1] I. Kaplansky. Algebraic Polar Decomposition. SIAM J. Matrix Anal. Appl. 11 (1990), 213–217.

[K2] I. Kaplansky. Linear Algebra and Geometry: A Second Course. Chelsea Publishing Company, New York, 1974.

[LRT] C. K. Li, L. Rodman, and N. K. Tsing. Linear Operators Preserving Certain Equivalence Relations Originating in System Theory. Linear Algebra Appl. 161 (1992), 165–225.

[LT1] C. K. Li and N. K. Tsing. G-Invariant Norms and G(c)-Radii. Linear Algebra Appl. 150 (1991), 150–179.

[LT2] C. K. Li and N. K. Tsing. Linear Preserver Problems: A Brief Introduction and Some Special Techniques. Linear Algebra Appl. 162–164 (1992), 217–236.

[L] R. Loewy. Linear Maps Which Preserve a Balanced Nonsingular Inertia Class. Linear Algebra Appl. 134 (1990), 165–179.

[PM] W. V. Parker and B. E. Mitchell. Elementary Divisors of Certain Matrices. Duke Math. J. 19 (1952), 483–485.

[P] V. Pták. Eine Bemerkung zur Jordanschen Normalform von Matrizen. Acta Sci. Math. (Szeged) 17 (1956), 190–194.

[T] R. C. Thompson. On the Matrices AB and BA∗. Linear Algebra Appl. 1 (1968), 43–58.

[TK] R. Thompson and C. Kuo. Doubly Stochastic, Unitary, Unimodular, and Complex Orthogonal Power Embedding. Acta Sci. Math. 44 (1982), 345–357.

[V] H. Väliaho. An Elementary Approach to the Jordan Form of a Matrix. Amer. Math. Monthly 93 (1986), 711–714.
Vita


DENNIS I. MERINO
817 Scarlett Drive
Baltimore, MD 21204
(301) 296 3620
e-mail: [email protected]

Personal

Marital Status Single


Citizenship Filipino
Visa F-1

Education / Awards

The Johns Hopkins University, Baltimore, MD


Candidate for Ph.D. degree in Mathematical Sciences,
expected May, 1992.

The Johns Hopkins University, Baltimore, MD


Master of Science in Engineering awarded May, 1990;
Recipient of the Abel Wolman Fellowship, 1988.

University of the Philippines, Quezon City, Philippines


Bachelor of Science in Mathematics, summa cum laude, November,
1986.

• Awardee, Dean’s Medal for Academic Excellence, 1987


• College Scholar, 1983-1984
• University Scholar, 1983-1987
• National Science and Technology Authority Scholar, 1983-1987
• Phi Kappa Phi Awardee, 1987
• Most Outstanding Student of Oriental Mindoro, 1985

Employment Experience

1989-present Teaching/Research Assistant


Department of Mathematical Sciences
The Johns Hopkins University, Baltimore, MD
Assisted in Discrete Mathematics, Combinatorial Analysis,
Graph Theory and Game Theory; Research in Matrix Analysis;
Statistical Data Analysis for the Maryland DWI Law Institute.

1990 (summer) Mathematician


Applied Physics Laboratory
National Institutes of Health, Bethesda, MD
Assisted in research on fractals and lasers.

1986-1988 Instructor
Department of Mathematics
The University of the Philippines, Quezon City, Philippines
Taught elementary courses in Analysis and Algebra.
Chairman of the Student-Faculty Relations Committee, 1987.
Trainer for the participants in the International
Mathematical Olympiad, 1988.

Membership in Professional Societies

Mathematical Association of America

Publications
R. A. Horn and D. I. Merino. A Real-Coninvolutory Analog of the
Polar Decomposition. (preprint)

R. A. Horn and D. I. Merino. A Canonical Form for Contragredient


Equivalence. (preprint)

R. A. Horn and D. I. Merino. A New Look at the Jordan
Canonical Form. (preprint)

R. A. Horn, C. K. Li, and D. I. Merino. Linear Operators Preserving


Orthogonal Equivalence on Matrices. (preprint)

C. K. Li and D. I. Merino. Linear Operators Preserving


Unitary t-Congruence on Matrices. (preprint)

References
Dr. Roger A. Horn
Department of Mathematical Sciences
The Johns Hopkins University
Baltimore, MD 21218
Dr. Chi-Kwong Li
Department of Mathematics
The College of William and Mary
Williamsburg, VA 23185
Dr. Alan J. Goldman
Department of Mathematical Sciences
The Johns Hopkins University
Baltimore, MD 21218
