
Graduate Texts in Mathematics 225

Daniel Bump

Lie Groups
Second Edition

Series Editors:

Sheldon Axler
San Francisco State University, San Francisco, CA, USA

Kenneth Ribet
University of California, Berkeley, CA, USA

Advisory Board:

Colin Adams, Williams College, Williamstown, MA, USA


Alejandro Adem, University of British Columbia, Vancouver, BC, Canada
Ruth Charney, Brandeis University, Waltham, MA, USA
Irene M. Gamba, The University of Texas at Austin, Austin, TX, USA
Roger E. Howe, Yale University, New Haven, CT, USA
David Jerison, Massachusetts Institute of Technology, Cambridge, MA, USA
Jeffrey C. Lagarias, University of Michigan, Ann Arbor, MI, USA
Jill Pipher, Brown University, Providence, RI, USA
Fadil Santosa, University of Minnesota, Minneapolis, MN, USA
Amie Wilkinson, University of Chicago, Chicago, IL, USA

Graduate Texts in Mathematics bridge the gap between passive study and creative
understanding, offering graduate-level introductions to advanced topics in mathemat-
ics. The volumes are carefully written as teaching aids and highlight characteristic
features of the theory. Although these books are frequently used as textbooks in grad-
uate courses, they are also suitable for individual study.

For further volumes:


http://www.springer.com/series/136
Daniel Bump
Department of Mathematics
Stanford University
Stanford, CA, USA

ISSN 0072-5285
ISBN 978-1-4614-8023-5 ISBN 978-1-4614-8024-2 (eBook)
DOI 10.1007/978-1-4614-8024-2
Springer New York Heidelberg Dordrecht London
Library of Congress Control Number: 2013944369

Mathematics Subject Classification: 22Exx, 17Bxx

© Springer Science+Business Media New York 2004, 2013


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and
executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this pub-
lication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s
location, in its current version, and permission for use must always be obtained from Springer. Permis-
sions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable
to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publica-
tion, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors
or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the
material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


Preface

This book aims to be both a graduate text and a study resource for Lie groups.
It tries to strike a compromise between accessibility and getting enough depth
to communicate important insights. In discussing the literature, often sec-
ondary sources are preferred: cited works are usually recommended ones.
There are four parts. Parts I, II, and IV are all “starting points” where one
could begin reading or lecturing. On the other hand, Part III assumes familiar-
ity with Part II. The following chart indicates the approximate dependencies
of the chapters. There are other dependencies, where a result is used from a
chapter that is not a prerequisite according to this chart: but these are rela-
tively minor. The dashed lines from Chaps. 1 and 2 to the opening chapters of
Parts II and IV indicate that the reader will benefit from knowledge of Schur
orthogonality but may skip or postpone Chaps. 1 and 2 before starting Part II
or Part IV. The other dashed line indicates that the Bruhat decomposition
(Chap. 27) is assumed in the last few chapters of Part IV.

[Chart: chapter dependencies. Chapters 1–2 lead into Chaps. 3–4 (Part I), Chaps. 5–22 (Part II), and Chaps. 32–39 (Part IV); Chaps. 23–27 follow Part II; Chaps. 28–31 (Part III) follow Chaps. 24–27; Chaps. 40–48 complete Part IV.]

The two lines of development in Parts II–IV were kept independent because
it was possible to do so. This has the obvious advantage that one may start
reading with Part IV for an alternative course. This should not obscure the
fact that these two lines are complementary, and shed light on each other. We
hope the reader will study the whole book.


Part I treats two basic topics in the analysis of compact Lie groups: Schur
orthogonality and the Peter–Weyl theorem, which says that the irreducible
unitary representations of a compact group are all finite-dimensional.
Usually the study of Lie groups begins with compact Lie groups. It is
attractive to make this the complete content of a short course because it can
be treated as a self-contained subject with well-defined goals, about the right
size for a 10-week class. Indeed, Part II, which covers this theory, could be used
as a traditional course culminating in the Weyl character formula. It covers
the basic facts about compact Lie groups: the fundamental group, conjugacy
of maximal tori, roots and weights, the Weyl group, the Weyl integration
formula, and the Weyl character formula. These are basic tools, and a short
course in Lie theory might end up with the Weyl character formula, though
usually I try to do a bit more in a 10-week course, even at the expense of
skipping a few proofs in the lectures. The last chapter in Part II introduces
the affine Weyl group and computes the fundamental group. It can be skipped
since Part III does not depend on it.
Sage, the free mathematical software system, is capable of doing typical
Lie theory calculations. The student of Part II may want to learn to use it.
An appendix illustrates its use.
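To give a taste of the kind of bookkeeping such software automates, here is a small plain-Python sketch (illustrative only, not Sage code) of the Clebsch–Gordan rule for SU(2): the tensor product of the irreducibles of highest weights m and n decomposes as V(m+n) ⊕ V(m+n−2) ⊕ · · · ⊕ V(|m−n|).

```python
# Clebsch-Gordan rule for SU(2): V_m (x) V_n decomposes into the irreducibles
# of highest weights m+n, m+n-2, ..., |m-n|, where dim V_k = k + 1.

def tensor_decomposition(m, n):
    """Highest weights of the irreducible constituents of V_m (x) V_n."""
    return list(range(m + n, abs(m - n) - 1, -2))

def dim(k):
    """Dimension of the irreducible SU(2)-representation with highest weight k."""
    return k + 1

weights = tensor_decomposition(2, 3)        # constituents of V_2 (x) V_3
print(weights)                              # [5, 3, 1]
print(sum(dim(k) for k in weights))         # 12, which equals dim(V_2) * dim(V_3)
```

Sage performs such decompositions out of the box, for all the classical and exceptional types; the appendix shows the actual syntax.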
The goal of Part I is the Peter–Weyl theorem, but Part II does not depend
on this. Therefore one could skip Part I and start with Part II. Usually when
I teach this material, I do spend one or two lectures on Part I, proving Schur
orthogonality but not the Peter–Weyl theorem. In the interests of speed I tend
to skip a few proofs in the lectures. For example, the conjugacy of maximal
tori needs to be proved, and this depends in turn on the surjectivity of the
exponential map for compact groups, that is, Theorem 16.3. This is proved
completely in the text, and I think it should be proved in class but some
of the differential geometry details behind it can be replaced by intuitive
explanations. So in lecturing, I try to explain the intuitive content of the
proof without going back and proving Proposition 16.1 in class. Beginning
with Theorems 16.2–16.4, the results to the end of the chapter, culminating
in various important facts such as the conjugacy of maximal tori and the
connectedness of centralizers can all be done in class. In the lectures I prove the
Weyl integration formula and (if there is time) the local Frobenius theorem.
But I skip a few things like Theorem 13.3. Then it is possible to get to the
Weyl character formula in under 10 weeks.
Although compact Lie groups are an essential topic that can be treated in
one quarter, noncompact Lie groups are equally important. A key role in much
of mathematics is played by the Borel subgroup of a Lie group. For example,
if G = GL(n, R) or GL(n, C), the Borel subgroup is the subgroup of upper
triangular matrices, or any conjugate of this subgroup. It is involved in two
important results, the Bruhat and Iwasawa decompositions. A noncompact Lie
group has two important classes of homogeneous spaces, namely symmetric
spaces and flag varieties, which are at the heart of a great deal of important
modern mathematics. Therefore, noncompact Lie groups cannot be ignored,
and we tried hard to include them.
In Part III we first introduce a class of noncompact groups, the complex
reductive groups, that are obtained from compact Lie groups by “complex-
ification.” These are studied in several chapters before eventually taking on
general noncompact Lie groups. This allows us to introduce key topics such
as the Iwasawa and Bruhat decompositions without getting too caught up in
technicalities. Then we look at the Weyl group and affine Weyl group, already
introduced in Part II, as Coxeter groups. There are two important facts about
them to be proved: that they have Coxeter group presentations, and the theo-
rem of Matsumoto and Tits that any two reduced words for the same element
may be related by applications of the braid relations.
For these two facts we give geometric proofs, based on properties of the
complexes on which they act. These complexes are the system of Weyl cham-
bers in the first case, and of alcoves in the second. Applications are given, such
as Demazure characters and the Bruhat order. For complex reductive groups,
we prove the Iwasawa and Bruhat decompositions, digressing to discuss some
of the implications of the Bruhat decomposition for the flag manifold. In
particular the Schubert and Bott–Samelson varieties, the Borel-Weil theorem
and the Bruhat order are introduced. Then we look at symmetric spaces, in
a chapter that alternates examples with theory. Symmetric spaces occur in
pairs, a compact space matched with a noncompact one. We see how some
symmetric spaces, the Hermitian ones, have complex structures and are im-
portant in the theory of functions of several complex variables. Others are
convex cones. We take a look at Freudenthal’s “magic square.” We discuss
the embedding of a noncompact symmetric space in its compact dual, the
boundary components and Bergman–Shilov boundary of a symmetric tube
domain, and Cartan’s classification. By now we are dealing with arbitrary
noncompact Lie groups, where before we limited ourselves to the complex
analytic ones. Another chapter constructs the relative root system, explains
Satake diagrams and gives examples illustrating the various phenomena that
can occur. The Iwasawa decomposition, formerly obtained for complex ana-
lytic groups, is reproved in this more general context. Another chapter surveys
the different ways Lie groups can be embedded in one another. Part III ends
with a somewhat lengthy discussion of the spin representations of the double
covers of orthogonal groups. First, we consider what can be deduced from the
Weyl theory. Second, as an alternative, we construct the spin representations
using Clifford algebras. Instead of following the approach (due to Chevalley)
often taken in embedding the spin group into the multiplicative group of the
Clifford algebra, we take a different approach suggested by the point of view
in Howe [75, 77].
This approach obtains the spin representation as a projective represen-
tation from the fact that the orthogonal group acts by automorphisms on
a ring having a unique representation. The existence of the spin group is a
byproduct of the projective representation. This is the same way that the Weil
representation is usually constructed from the Stone–von Neumann theorem,
with the Clifford algebra replacing the Heisenberg group.
Part IV, we have already mentioned, is largely independent of the earlier
parts. Much of it is concerned with correspondences which were emphasized
by Howe, though important examples occur in older work of Frobenius and
Schur, Weyl, Weil and others. Following Howe, a correspondence is a bijection
between a set of representations of a group G with a set of representations of
another group H which arise as follows. There is a representation Ω of G × H
with the following property. Let πi ⊗ πi′ be the irreducible representations
of G × H that occur in Ω. It is assumed that each occurs with
multiplicity one, and moreover, that there are no repetitions among the πi,
and none among the πi′. This gives a bijection between the representations
πi of G and the representations πi′ of H. Often Ω has an explicit description
with special properties that allow us to transfer calculation from one group
to the other. Sometimes Ω arises by restriction of a “small” representation of
a big group W that contains G × H as a subgroup.
The first example is the Frobenius–Schur duality. This is the correspon-
dence between the irreducible representations of the symmetric group and
the general linear groups. The correspondence comes from decomposing ten-
sor spaces over both groups simultaneously. Another correspondence, for the
groups GL(n) and GL(m), is embodied in the Cauchy identity. We will focus
on these two correspondences, giving examples of how they can be used to
transfer calculations from one group to the other.
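The smallest case of this tensor-space decomposition is easy to check by hand: for k = 2, the flip action of S2 on C^n ⊗ C^n splits the space into the symmetric and exterior squares, each irreducible under GL(n). A minimal Python sketch (an illustration, not part of the book's development) recovers their dimensions from the trace of the flip operator:

```python
# S_2 acts on C^n (x) C^n by the flip P(e_i (x) e_j) = e_j (x) e_i.  The projectors
# (I + P)/2 and (I - P)/2 cut out Sym^2 and Alt^2, so their dimensions are
# (n^2 + tr P)/2 and (n^2 - tr P)/2.

def flip_trace(n):
    """tr P, computed entrywise: the diagonal entry of P at ((i, j), (i, j)) is 1 iff i == j."""
    return sum(1 for i in range(n) for j in range(n) if (j, i) == (i, j))

def sym_alt_dims(n):
    """Dimensions of Sym^2(C^n) and Alt^2(C^n)."""
    t = flip_trace(n)          # equals n
    return ((n * n + t) // 2, (n * n - t) // 2)

print(sym_alt_dims(3))  # (6, 3): the 9-dimensional tensor square splits as 6 + 3
```

The same trace computation generalizes: for higher k, characters of S_k acting on tensor space lead to the symmetric-function formulas developed in Part IV.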
Frobenius–Schur duality is very often called “Schur–Weyl duality,” and in-
deed Weyl emphasized this theory both in his book on the classical groups and
in his book on quantum mechanics. However Weyl was much younger than
Schur and did not begin working on Lie groups until the 1920s, while the
duality is already mature in Schur’s 1901 dissertation. Regarding Frobenius’
contribution, Frobenius invented character theory before the relationship be-
tween characters and representations was clarified by his student Schur. With
great insight Frobenius showed in 1900 that the characters of the symmetric
group could be computed using symmetric functions. This very profound idea
justifies attaching Frobenius’ name with Schur’s to this phenomenon. Now
Green has pointed out that the 1892 work of Deruyts in invariant theory also
contains results almost equivalent to this duality. This came a few years too
soon to fully take the point of view of group representation theory. Deruyts'
work is prescient but less historically influential than that of Frobenius and
Schur since it was overlooked for many years, and in particular Schur was ap-
parently not aware of it. For these reasons we feel the term “Frobenius–Schur
duality” is most accurate. See the excellent history of Curtis [39].
Frobenius–Schur duality allows us to simultaneously develop the represen-
tation theories of GL(n, C) and Sk . For GL(n, C), this means a proof of the
Weyl character formula that is independent of the arguments in Part II. For
the symmetric group, this means that (following Frobenius) we may use sym-
metric functions to describe the characters of the irreducible representations
of Sk. This gives us a double view of symmetric function theory that sheds
light on a great many things. The double view is encoded in the structure of a
graded algebra (actually a Hopf algebra) R whose homogeneous part of degree
k consists of the characters of representations of Sk . This is isomorphic to the
ring Λ of symmetric polynomials, and familiarity with this equivalence is
the key to understanding a great many things.
One very instructive example of using Frobenius–Schur duality is the com-
putation by Diaconis and Shahshahani of the moments of the traces of unitary
matrices. The result has an interesting interpretation in terms of random ma-
trix theory, and it also serves as an example of how the duality can be used:
directly computing the moments in question is feasible but leads to a difficult
combinatorial problem. Instead, one translates the problem from the unitary
group to an equivalent but easier question on the symmetric group.
The GL(n) × GL(m) duality, like the Frobenius–Schur duality, can be
used to translate a calculation from one context to another, where it may be
easier. As an example, we consider a result of Keating and Snaith, also from
random matrix theory, which had significant consequences in understanding
the distribution of the values of the Riemann zeta function. The computation
in question is that of the 2k-th moment of the characteristic polynomial of
U(n). Using the duality, it is possible to transfer the computation from U(n)
to U(2k), where it becomes easy.
Other types of problems that may be handled this way are branching
rules: a branching rule describes how an irreducible representation of a group
G decomposes into irreducibles when restricted to a subgroup H. We will
see instances where one uses a duality to transfer a calculation from one pair
(G, H) to another, (G′, H′). For example, we may take G and H to be GL(p+q)
and its subgroup GL(p) × GL(q), and G′ and H′ to be GL(n) × GL(n) and
its diagonal subgroup GL(n).
Chapter 42 shows how the Jacobi–Trudi identity from the representation
theory of the symmetric group can be translated using Frobenius–Schur dual-
ity to compute minors of Toeplitz matrices. Then we look at involution models
for the symmetric group, showing how it is possible to find a set of induced
representations whose union contains every irreducible representation exactly
once. Translated by Frobenius–Schur duality, this gives some decompositions
of symmetric algebras over the symmetric and exterior square representations,
a topic that is also treated by a different method in Part II.
Towards the end of Part IV, we discuss several other ways that the graded
ring R occurs. First, the representation theory of the symmetric group has a
deformation in the Iwahori Hecke algebra, which is ubiquitous in mathematics,
from the representation theory of p-adic groups to the K-theory of flag vari-
eties and developments in mathematical physics related to the Yang–Baxter
equation. Second, the Hopf algebra R has an analog in which the representa-
tion theory of GL(k) (say over a finite field) replaces the representation theory
of Sk ; the multiplication and comultiplication are parabolic induction and its
adjoint (the Jacquet functor). The ground field may be replaced by a p-adic
field or an adele ring, and ultimately this “philosophy of cusp forms” leads to
the theory of automorphic forms. Thirdly, the ring R has as a homomorphic
image the cohomology rings of flag varieties, leading to the Schubert calculus.
These topics are surveyed in the final chapters.

What’s New? I felt that the plan of the first edition was a good one, but that
substantial improvements were needed. Some material has been removed, and
a fair amount of new material has been added. Some old material has been
streamlined or rewritten, sometimes extensively. In places what was implicit
in the first edition but not explained well is now carefully explained with at-
tention to the underlying principles. There are more exercises. A few chapters
are little changed, but the majority have some revisions, so the changes are
too numerous to list completely. Highlights in the newly added material in-
clude the affine Weyl group, new material about Coxeter groups, Demazure
characters, Bruhat order, Schubert and Bott–Samelson varieties, the Borel–Weil
theorem, the appendix on Sage, Clifford algebras, the Keating–Snaith
theorem, and more.

Notation. The notations GL(n, F ) and GLn (F ) are interchangeable for the
group of n × n matrices with coefficients in F . By Matn (F ) we denote the
ring of n × n matrices, and Matn×m (F ) denotes the vector space of n × m
matrices. In GL(n), I or In denotes the n × n identity matrix and if g is any
matrix, t g denotes its transpose. Omitted entries in a matrix are zero. Thus,
for example,
$$\begin{pmatrix} & 1 \\ -1 & \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
The identity element of a group is usually denoted 1 but also as I, if the
group is GL(n) (or a subgroup), and occasionally as e when it seemed the
other notations could be confusing. The notations ⊂ and ⊆ are synonymous,
but we mostly use X ⊂ Y if X and Y are known to be unequal, although we
make no guarantee that we are completely consistent in this. If X is a finite
set, |X| denotes its cardinality.

Acknowledgements The proofs of the Jacobi–Trudi identity were worked


out years ago with Karl Rumelhart when he was still an undergraduate at
Stanford. Chapters 39 and 42 owe a great deal to Persi Diaconis and (for the
Keating–Snaith result) Alex Gamburd. For the second edition, I thank the
many people who informed me of typos; I cannot list them all but I especially
thank Yunjiang (John) Jiang for his careful reading of Chap. 18. And thanks
in advance to all who will report typos in this edition.
This work was supported in part by NSF grants DMS-9970841 and DMS-
1001079.

Stanford, CA, USA Daniel Bump


Contents

Preface

Part I Compact Groups

1 Haar Measure
2 Schur Orthogonality
3 Compact Operators
4 The Peter–Weyl Theorem

Part II Compact Lie Groups

5 Lie Subgroups of GL(n, C)
6 Vector Fields
7 Left-Invariant Vector Fields
8 The Exponential Map
9 Tensors and Universal Properties
10 The Universal Enveloping Algebra
11 Extension of Scalars
12 Representations of sl(2, C)
13 The Universal Cover
14 The Local Frobenius Theorem
15 Tori
16 Geodesics and Maximal Tori
17 The Weyl Integration Formula
18 The Root System
19 Examples of Root Systems
20 Abstract Weyl Groups
21 Highest Weight Vectors
22 The Weyl Character Formula
23 The Fundamental Group

Part III Noncompact Lie Groups

24 Complexification
25 Coxeter Groups
26 The Borel Subgroup
27 The Bruhat Decomposition
28 Symmetric Spaces
29 Relative Root Systems
30 Embeddings of Lie Groups
31 Spin

Part IV Duality and Other Topics

32 Mackey Theory
33 Characters of GL(n, C)
34 Duality Between Sk and GL(n, C)
35 The Jacobi–Trudi Identity
36 Schur Polynomials and GL(n, C)
37 Schur Polynomials and Sk
38 The Cauchy Identity
39 Random Matrix Theory
40 Symmetric Group Branching Rules and Tableaux
41 Unitary Branching Rules and Tableaux
42 Minors of Toeplitz Matrices
43 The Involution Model for Sk
44 Some Symmetric Algebras
45 Gelfand Pairs
46 Hecke Algebras
47 The Philosophy of Cusp Forms
48 Cohomology of Grassmannians

Appendix: Sage

References

Index
Part I

Compact Groups
1 Haar Measure

If G is a locally compact group, there is, up to a constant multiple, a unique
regular Borel measure μL that is invariant under left translation. Here left
translation invariance means that μ(X) = μ(gX) for all measurable sets X.
Regularity means that

μ(X) = inf {μ(U ) | U ⊇ X, U open} = sup {μ(K) | K ⊆ X, K compact} .

Such a measure is called a left Haar measure. It has the properties that any
compact set has finite measure and any nonempty open set has measure > 0.

We will not prove the existence and uniqueness of the Haar measure. See
for example Halmos [61], Hewitt and Ross [69], Chap. IV, or Loomis [121] for
a proof of this. Left-invariance of the measure amounts to left-invariance of
the corresponding integral,
$$\int_G f(\gamma g)\, d\mu_L(g) = \int_G f(g)\, d\mu_L(g), \tag{1.1}$$

for any Haar integrable function f on G.


There is also a right-invariant measure, μR , unique up to constant multiple,
called a right Haar measure. Left and right Haar measures may or may not
coincide. For example, if
$$G = \left\{ \begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix} \;\middle|\; x, y \in \mathbb{R},\ y > 0 \right\},$$
then it is easy to see that the left- and right-invariant measures are, respectively,
$$d\mu_L = y^{-2}\, dx\, dy, \qquad d\mu_R = y^{-1}\, dx\, dy.$$
They are not the same. However, there are many cases where they do coincide,
and if the left Haar measure is also right-invariant, we call G unimodular .
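These two densities can be sanity-checked numerically (an illustration, not a proof): a density ρ is invariant under a map φ with constant Jacobian determinant J exactly when ρ(φ(p)) |J| = ρ(p) at every point p. Identifying the matrix above with the pair (y, x), left translation by (a, b) is (y, x) ↦ (ay, ax + b), with Jacobian a², and right translation is (y, x) ↦ (ya, x + yb), with Jacobian a. In the sketch below, the group element (a, b) and the sample points are arbitrary choices:

```python
# Check invariance of the candidate Haar densities on the group of matrices
# ( y x ; 0 1 ), y > 0, identified with points (y, x) in the upper half-plane.

def rho_left(y, x):      # candidate left Haar density  y^{-2}
    return y ** -2

def rho_right(y, x):     # candidate right Haar density y^{-1}
    return y ** -1

def invariant(rho, phi, jac, points):
    """True if rho(phi(p)) * |jac| == rho(p) at every sample point p."""
    return all(abs(rho(*phi(y, x)) * abs(jac) - rho(y, x)) < 1e-12 for y, x in points)

a, b = 2.0, -3.0                              # an arbitrary group element with a > 0
pts = [(0.5, 1.0), (2.0, -4.0), (7.0, 0.25)]  # arbitrary sample points with y > 0
left_translate  = lambda y, x: (a * y, a * x + b)   # Jacobian a^2
right_translate = lambda y, x: (y * a, x + y * b)   # Jacobian a

print(invariant(rho_left,  left_translate,  a * a, pts))  # True
print(invariant(rho_right, right_translate, a,     pts))  # True
print(invariant(rho_left,  right_translate, a,     pts))  # False: mu_L is not right-invariant
```

The third check makes the point of the example concrete: the left Haar density fails the right-translation test, so this group is not unimodular.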


Conjugation is an automorphism of G, and so it takes a left Haar measure
to another left Haar measure, which must be a constant multiple of the first.
Thus, if g ∈ G, there exists a constant δ(g) > 0 such that
$$\int_G f(g^{-1} h g)\, d\mu_L(h) = \delta(g) \int_G f(h)\, d\mu_L(h).$$

If G is a topological group, a quasicharacter is a continuous homomorphism
χ : G −→ C×. If |χ(g)| = 1 for all g ∈ G, then χ is a (linear) character or
unitary quasicharacter.

Proposition 1.1. The function δ : G −→ R×+ is a quasicharacter. The measure δ(h) dμL(h) is right-invariant.

The measure δ(h) dμL(h) is a right Haar measure, and we may write dμR(h) = δ(h) dμL(h). The quasicharacter δ is called the modular quasicharacter.

Proof. Conjugation by first g1 and then g2 is the same as conjugation by g1g2
in one step. Thus δ(g1g2) = δ(g1) δ(g2), so δ is a quasicharacter. Using (1.1),
$$\delta(g) \int_G f(h)\, d\mu_L(h) = \int_G f(g \cdot g^{-1} h g)\, d\mu_L(h) = \int_G f(hg)\, d\mu_L(h).$$
Replace f by f δ in this identity and then divide both sides by δ(g) to find
that
$$\int_G f(h)\, \delta(h)\, d\mu_L(h) = \int_G f(hg)\, \delta(h)\, d\mu_L(h).$$
Thus, the measure δ(h) dμL(h) is right-invariant. □




Proposition 1.2. If G is compact, then G is unimodular and μL (G) < ∞.

Proof. Since δ is a homomorphism, the image of δ is a subgroup of R×+. Since
G is compact, δ(G) is also compact, and the only compact subgroup of R×+ is
just {1}. Thus δ is trivial, so a left Haar measure is right-invariant. We have
mentioned as an assumed fact that the Haar volume of any compact subset of
a locally compact group is finite, so if G is compact, its Haar volume is finite. □

If G is compact, then it is natural to normalize the Haar measure so that G
has volume 1.
To simplify our notation, we will denote $\int_G f(g)\, d\mu_L(g)$ by $\int_G f(g)\, dg$.

Proposition 1.3. If G is unimodular, then the map g −→ g −1 is an isometry.

Proof. It is easy to see that g −→ g −1 turns a left Haar measure into a right
Haar measure. If left and right Haar measures agree, then g −→ g −1 multiplies
the left Haar measure by a positive constant, which must be 1 since the map
has order 2.


Exercises

Exercise 1.1. Let da X denote the Lebesgue measure on Matn (R). It is of course a
Haar measure for the additive group Matn (R). Show that | det(X)|−n da X is both a
left and a right Haar measure on GL(n, R).

Exercise 1.2. Let P be the subgroup of GL(r + s, R) consisting of matrices of the
form
$$p = \begin{pmatrix} g_1 & X \\ & g_2 \end{pmatrix}, \qquad g_1 \in GL(r, \mathbb{R}),\ g_2 \in GL(s, \mathbb{R}),\ X \in \mathrm{Mat}_{r \times s}(\mathbb{R}).$$
Let dg1 and dg2 denote Haar measures on GL(r, R) and GL(s, R), and let da X
denote an additive Haar measure on Matr×s (R). Show that
$$d_L p = |\det(g_1)|^{-s}\, dg_1\, dg_2\, d_a X, \qquad d_R p = |\det(g_2)|^{-r}\, dg_1\, dg_2\, d_a X,$$
are (respectively) left and right Haar measures on P, and conclude that the modular
quasicharacter of P is
$$\delta(p) = |\det(g_1)|^{s}\, |\det(g_2)|^{-r}.$$
2 Schur Orthogonality

In this chapter and the next two, we will consider the representation theory
of compact groups. Let us begin with a few observations about this theory
and its relationship to some related theories.
If V is a finite-dimensional complex vector space, or more generally a
Banach space, and π : G −→ GL(V ) a continuous homomorphism, then
(π, V ) is called a representation. Assuming dim(V ) < ∞, the function
χπ (g) = tr π(g) is called the character of π. Also assuming dim(V ) < ∞,
the representation (π, V ) is called irreducible if V has no proper nonzero
invariant subspaces, and a character is called irreducible if it is a character of
an irreducible representation.
[If V is an infinite-dimensional topological vector space, then (π, V ) is
called irreducible if it has no proper nonzero invariant closed subspaces.]
A quasicharacter χ is a character in this sense since we can take V = C
and π(g)v = χ(g)v to obtain a representation whose character is χ.

The archetypal compact Abelian group is the circle T = {z ∈ C× : |z| = 1}.


We normalize the Haar measure on T so that it has volume 1. Its characters
are the functions χn : T −→ C× , χn (z) = z n . The important properties of the
χn are that they form an orthonormal system and (deeper) an orthonormal
basis of L2 (T).
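The orthonormality of the χn can be checked by direct numerical integration. The sketch below (assuming NumPy; the uniform grid is a Riemann-sum stand-in for the normalized Haar measure dθ/2π, and is illustrative only):

```python
import numpy as np

# The characters chi_n(z) = z^n on T, integrated against normalized Haar
# measure dθ/2π, approximated by an N-point Riemann sum (exact for these
# integrands up to floating-point error).
N = 4096
theta = np.linspace(0, 2 * np.pi, N, endpoint=False)
z = np.exp(1j * theta)

def inner(m, n):
    # <chi_m, chi_n> = (1/2π) ∫ z^m conj(z^n) dθ
    return np.mean(z ** m * np.conj(z ** n))

assert abs(inner(3, 3) - 1) < 1e-10   # unit length
assert abs(inner(3, 5)) < 1e-10       # orthogonality
```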
More generally, if G is a compact Abelian group, the characters of G form
an orthonormal basis of L2 (G). If f ∈ L2 (G), we have a Fourier expansion,

    f(g) = Σ_χ a_χ χ(g),    a_χ = ∫_G f(g) \overline{χ(g)} dg,    (2.1)

and the Plancherel formula is the identity:



    ∫_G |f(g)|² dg = Σ_χ |a_χ|².    (2.2)

These facts can be directly generalized in two ways. First, Fourier analysis on locally compact Abelian groups, including Pontryagin duality, Fourier

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 7


DOI 10.1007/978-1-4614-8024-2 2, © Springer Science+Business Media New York 2013

inversion, the Plancherel formula, etc. is an important and complete theory


due to Weil [169] and discussed, for example, in Rudin [140] or Loomis [121].
The most important difference from the compact case is that the charac-
ters can vary continuously. The characters themselves form a group, the dual
group Ĝ, whose topology is that of uniform convergence on compact sets. The
Fourier expansion (2.1) is replaced by the Fourier inversion formula
 
f (g) = ˆ
f (χ) χ(g) dχ, ˆ
f (χ) = f (g) χ(g) dg.
Ĝ G

The symmetry between G and Ĝ is now evident. Similarly in the Plancherel


formula (2.2) the sum on the right is replaced by an integral.
The second generalization, to arbitrary compact groups, is the subject
of this chapter and the next two. In summary, group representation theory
gives an orthonormal basis of L²(G) in the matrix coefficients of irreducible
representations of G and a (more important and very canonical) orthonormal
basis of the subspace of L2 (G) consisting of class functions in terms of the
characters of the irreducible representations. Most importantly, the irreducible
representations are all finite-dimensional. The orthonormality of these sets is
Schur orthogonality; the completeness is the Peter–Weyl theorem.
These two directions of generalization can be unified. Harmonic analysis
on locally compact groups agrees with representation theory. The Fourier
inversion formula and the Plancherel formula now involve the matrix coeffi-
cients of the irreducible unitary representations, which may occur in contin-
uous families and are usually infinite-dimensional. This field of mathematics,
largely created by Harish-Chandra, is fundamental but beyond the scope of
this book. See Knapp [104] for an extended introduction, and Gelfand, Graev
and Piatetski-Shapiro [55] and Varadarajan [165] for the Plancherel formula
for SL(2, R).
Although infinite-dimensional representations are thus essential in har-
monic analysis on a noncompact group such as SL(n, R), noncompact Lie
groups also have irreducible finite-dimensional representations, which are
important in their own right. They are seldom unitary and hence not relevant
to the Plancherel formula. The scope of this book includes finite-dimensional
representations of Lie groups but not infinite-dimensional ones.
In this chapter and the next two, we will be mainly concerned with com-
pact groups. In this chapter, all representations will be complex and finite-
dimensional except when explicitly noted otherwise.
By an inner product on a complex vector space, we mean a positive definite
Hermitian form, denoted ⟨ , ⟩. Thus, ⟨v, w⟩ is linear in v, conjugate linear in
w, satisfies ⟨w, v⟩ = \overline{⟨v, w⟩}, and ⟨v, v⟩ > 0 if v ≠ 0. We will also use the term
inner product for real vector spaces—an inner product on a real vector space
is a positive definite symmetric bilinear form. Given a group G and a real or
complex representation π : G −→ GL(V), we say the inner product ⟨ , ⟩ on
V is invariant or G-equivariant if it satisfies the identity

    ⟨π(g)v, π(g)w⟩ = ⟨v, w⟩.

Proposition 2.1. If G is compact and (π, V ) is any finite-dimensional com-


plex representation, then V admits a G-equivariant inner product.

Proof. Start with an arbitrary inner product ⟨ , ⟩. Averaging it gives another
inner product,

    ⟨⟨v, w⟩⟩ = ∫_G ⟨π(g)v, π(g)w⟩ dg,

for it is easy to see that this inner product is Hermitian and positive definite.
It is G-invariant by construction. □
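For a finite group the averaging in this proof is a finite sum, which makes it easy to test. The following sketch (assuming NumPy; as an illustrative choice, S3 acting by permutation matrices stands in for a compact group, with the normalized counting measure as Haar measure):

```python
import itertools
import numpy as np

# Averaging an arbitrary inner product over a finite group (Proposition 2.1):
# G = S3 acts on C^3 by permutation matrices, and "dg" is the normalized
# counting measure.
def perm_matrix(p):
    M = np.zeros((3, 3))
    for i, j in enumerate(p):
        M[j, i] = 1.0
    return M

G = [perm_matrix(p) for p in itertools.permutations(range(3))]

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H0 = A.conj().T @ A + np.eye(3)                     # an arbitrary positive definite Hermitian form
H = sum(g.conj().T @ H0 @ g for g in G) / len(G)    # the averaged form

# The averaged form satisfies <pi(g)v, pi(g)w> = <v, w> for every g.
for g in G:
    assert np.allclose(g.conj().T @ H @ g, H)
```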


Proposition 2.2. If G is compact, then each finite-dimensional representa-


tion is the direct sum of irreducible representations.

Proof. Let (π, V ) be given. Let V1 be a nonzero invariant subspace of minimal


dimension. It is clearly irreducible. Let V1⊥ be the orthogonal complement of
V1 with respect to a G-invariant inner product. It is easily checked to be
invariant and is of lower dimension than V . By induction V1⊥ = V2 ⊕ · · · ⊕ Vn
is a direct sum of invariant subspaces and so V = V1 ⊕ · · · ⊕ Vn is also.


A function of the form φ(g) = L(π(g)v), where (π, V) is a finite-dimensional
representation of G, v ∈ V and L : V −→ C is a linear functional, is called a
matrix coefficient on G. This terminology is natural, because if we choose a
basis e1, . . . , en of V, we can identify V with C^n and represent g by matrices:

            ⎛ π11(g) · · · π1n(g) ⎞ ⎛ v1 ⎞             ⎛ v1 ⎞    n
    π(g)v = ⎜   ⋮      ⋱     ⋮   ⎟ ⎜  ⋮ ⎟ ,       v = ⎜  ⋮ ⎟ =  Σ  vj ej .
            ⎝ πn1(g) · · · πnn(g) ⎠ ⎝ vn ⎠             ⎝ vn ⎠   j=1

Then each of the n² functions πij is a matrix coefficient. Indeed

    πij(g) = Li(π(g)ej),

where Li(Σ_j vj ej) = vi.

Proposition 2.3. The matrix coefficients of G are continuous functions. The


pointwise sum or product of two matrix coefficients is a matrix coefficient, so
they form a ring.

Proof. If v ∈ V, then g −→ π(g)v is continuous since by definition a representation π : G −→ GL(V) is continuous, and so a matrix coefficient L(π(g)v) is
continuous.
If (π1, V1) and (π2, V2) are representations, vi ∈ Vi are vectors and
Li : Vi −→ C are linear functionals, then we have representations π1 ⊕ π2
and π1 ⊗ π2 on V1 ⊕ V2 and V1 ⊗ V2, respectively. Given vectors vi ∈ Vi
and functionals Li ∈ Vi∗, then L1(π1(g)v1) ± L2(π2(g)v2) can be expressed as

L((π1 ⊕ π2)(g)(v1, v2)), where L : V1 ⊕ V2 −→ C is L(x1, x2) = L1(x1) ± L2(x2),
so the matrix coefficients are closed under addition and subtraction.
Similarly, we have a linear functional L1 ⊗ L2 on V1 ⊗ V2 satisfying

(L1 ⊗ L2 )(x1 ⊗ x2 ) = L1 (x1 )L2 (x2 )

and

    (L1 ⊗ L2)((π1 ⊗ π2)(g)(v1 ⊗ v2)) = L1(π1(g)v1) L2(π2(g)v2),

proving that the product of two matrix coefficients is a matrix coefficient.




If (π, V) is a representation, let V∗ be the dual space of V. To emphasize the
symmetry between V and V∗, let us write the dual pairing V × V∗ −→ C in
the symmetrical form L(v) = ⟨v, L⟩. We have a representation (π̂, V∗), called
the contragredient of π, defined by

    ⟨v, π̂(g)L⟩ = ⟨π(g⁻¹)v, L⟩.    (2.3)

Note that the inverse is needed here so that π̂(g1 g2 ) = π̂(g1 )π̂(g2 ).
If (π, V) is a representation, then by Proposition 2.3 any linear combination
of functions of the form L(π(g)v) with v ∈ V, L ∈ V∗ is a matrix coefficient,
though it may be a function L′(π′(g)v′) where (π′, V′) is not (π, V), but a
larger representation. Nevertheless, we call any linear combination of functions
of the form L(π(g)v) a matrix coefficient of the representation (π, V). Thus,
the matrix coefficients of π form a vector space, which we will denote by Mπ.
Clearly, dim(Mπ) ≤ dim(V)².

Proposition 2.4. If f is a matrix coefficient of (π, V ), then fˇ(g) = f (g −1 )


is a matrix coefficient of (π̂, V ∗ ).

Proof. This is clear from (2.3), regarding v as a linear functional on V ∗ .




We have actions of G on the space of functions on G by left and right trans-


lation. Thus if f is a function and g ∈ G, the left and right translates are

λ(g)f (x) = f (g −1 x), ρ(g)f (x) = f (xg).

Theorem 2.1. Let f be a function on G. The following are equivalent.


(i) The functions λ(g)f span a finite-dimensional vector space.
(ii) The functions ρ(g)f span a finite-dimensional vector space.
(iii) The function f is a matrix coefficient of a finite-dimensional representa-
tion.

Proof. It is easy to check that if f is a matrix coefficient of a particular


representation V , then so are λ(g)f and ρ(g)f for any g ∈ G. Since V is finite-
dimensional, its matrix coefficients span a finite-dimensional vector space; in
fact, a space of dimension at most dim(V )2 . Thus, (iii) implies (i) and (ii).

Suppose that the functions ρ(g)f span a finite-dimensional vector space V .


Then (ρ, V ) is a finite-dimensional representation of G, and we claim that f is
a matrix coefficient. Indeed, define a functional L : V −→ C by L(φ) = φ(1).
Clearly, L(ρ(g)f) = f(g), so f is a matrix coefficient, as required. Thus (ii)
implies (iii).
Finally, if the functions λ(g)f span a finite-dimensional space, composing
these functions with g −→ g −1 gives another finite-dimensional space which is
closed under right translation, and fˇ defined as in Proposition 2.4 is an element
of this space; hence fˇ is a matrix coefficient by the case just considered.
By Proposition 2.4, f is also a matrix coefficient, so (i) implies (iii).


If (π1, V1) and (π2, V2) are representations, an intertwining operator, also


known as a G-equivariant map T : V1 −→ V2 or (since V1 and V2 are some-
times called G-modules) a G-module homomorphism, is a linear transforma-
tion T : V1 −→ V2 such that

T ◦ π1 (g) = π2 (g) ◦ T

for g ∈ G. We will denote by HomC (V1 , V2 ) the space of all linear trans-
formations V1 −→ V2 and by HomG (V1 , V2 ) the subspace of those that are
intertwining maps.
For the remainder of this chapter, unless otherwise stated, G will denote
a compact group.

Theorem 2.2 (Schur’s lemma).


(i) Let (π1 , V1 ) and (π2 , V2 ) be irreducible representations, and let T : V1 −→
V2 be an intertwining operator. Then either T is zero or it is an isomor-
phism.
(ii) Suppose that (π, V ) is an irreducible representation of G and T : V −→ V
is an intertwining operator. Then there exists a scalar λ ∈ C such that
T (v) = λv for all v ∈ V .

Proof. For (i), the kernel of T is an invariant subspace of V1 , which is assumed


irreducible, so if T is not zero, ker(T ) = 0. Thus, T is injective. Also, the image
of T is an invariant subspace of V2 . Since V2 is irreducible, if T is not zero,
then im(T ) = V2 . Therefore T is bijective, so it is an isomorphism.
For (ii), let λ be any eigenvalue of T . Let I : V −→ V denote the identity
map. The linear transformation T − λI is an intertwining operator that is not
an isomorphism, so it is the zero map by (i).


We are assuming that G is compact. The Haar volume of G is therefore finite,


and we normalize the Haar measure so that the volume of G is 1.
We will consider the space L2 (G) of functions on G that are square-
integrable with respect to the Haar measure. This is a Hilbert space with
the inner product

    ⟨f1, f2⟩_{L²} = ∫_G f1(g) \overline{f2(g)} dg.
Schur orthogonality will give us an orthonormal basis for this space.
If (π, V) is a representation and ⟨ , ⟩ is an invariant inner product on V,
then every linear functional is of the form x −→ ⟨x, v⟩ for some v ∈ V. Thus
a matrix coefficient may be written in the form g −→ ⟨π(g)w, v⟩, and such a
representation will be useful to us in our discussion of Schur orthogonality.

Lemma 2.1. Suppose that (π1, V1) and (π2, V2) are complex representations
of the compact group G. Let ⟨ , ⟩ be any inner product on V1. If vi, wi ∈ Vi,
then the map T : V1 −→ V2 given by

    T(w) = ∫_G ⟨π1(g)w, v1⟩ π2(g⁻¹)v2 dg    (2.4)

is G-equivariant.

Proof. We have

    T(π1(h)w) = ∫_G ⟨π1(gh)w, v1⟩ π2(g⁻¹)v2 dg.

The variable change g −→ gh⁻¹ shows that this equals π2(h)T(w), as required. □



Theorem 2.3 (Schur orthogonality). Suppose that (π1 , V1 ) and (π2 , V2 )


are irreducible representations of the compact group G. Either every matrix
coefficient of π1 is orthogonal in L2 (G) to every matrix coefficient of π2 , or
the representations are isomorphic.

Proof. We must show that if there exist matrix coefficients fi : G −→ C of πi
that are not orthogonal, then there is an isomorphism T : V1 −→ V2. We may
assume that the fi have the form fi(g) = ⟨πi(g)wi, vi⟩ since functions of that
form span the spaces of matrix coefficients of the representations πi. Here we
use the notation ⟨ , ⟩ to denote invariant inner products on both V1 and V2,
and vi, wi ∈ Vi. Then our assumption is that

    ∫_G ⟨π1(g)w1, v1⟩ ⟨π2(g⁻¹)v2, w2⟩ dg = ∫_G ⟨π1(g)w1, v1⟩ \overline{⟨π2(g)w2, v2⟩} dg ≠ 0.

Define T : V1 −→ V2 by (2.4). The map is nonzero since the nonvanishing
integral above is precisely ⟨T(w1), w2⟩ ≠ 0. It is an isomorphism by Schur's lemma. □

This gives orthogonality for matrix coefficients coming from nonisomorphic


irreducible representations. But what about matrix coefficients from the same
representation? (If the representations are isomorphic, we may as well assume
they are equal.) The following result gives us an answer to this question.

Theorem 2.4 (Schur orthogonality). Let (π, V ) be an irreducible


representation of the compact group G, with invariant inner product ⟨ , ⟩.
Then there exists a constant d > 0 such that

    ∫_G ⟨π(g)w1, v1⟩ \overline{⟨π(g)w2, v2⟩} dg = d⁻¹ ⟨w1, w2⟩ ⟨v2, v1⟩.    (2.5)

Later, in Proposition 2.9, we will show that d = dim(V ).

Proof. We will show that if v1 and v2 are fixed, there exists a constant c(v1, v2)
such that

    ∫_G ⟨π(g)w1, v1⟩ \overline{⟨π(g)w2, v2⟩} dg = c(v1, v2) ⟨w1, w2⟩.    (2.6)

Indeed, T given by (2.4) is G-equivariant, so by Schur's lemma it is a scalar.
Thus, there is a constant c = c(v1, v2) depending only on v1 and v2 such that
T(w) = cw. In particular, T(w1) = cw1, and so the right-hand side of (2.6)
equals

    ⟨T(w1), w2⟩ = ∫_G ⟨π(g)w1, v1⟩ ⟨π(g⁻¹)v2, w2⟩ dg.

Now the variable change g −→ g⁻¹ and the properties of the inner product
show that this equals the left-hand side of (2.6), proving the identity. The
same argument shows that there exists another constant c′(w1, w2) such that
for all v1 and v2 we have

    ∫_G ⟨π(g)w1, v1⟩ \overline{⟨π(g)w2, v2⟩} dg = c′(w1, w2) ⟨v2, v1⟩.

Combining this with (2.6), we get (2.5). We will compute d later in Proposition 2.9, but for now we simply note that it is positive since, taking w1 = w2
and v1 = v2, both the left-hand side of (2.5) and the two inner products on
the right-hand side are positive. □
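Formula (2.5) can be tested exactly on a finite group, where the Haar integral is the average over the group. The sketch below (assuming NumPy; as one convenient model, the 2-dimensional standard representation of S3 is realized on the sum-zero subspace of the permutation representation) checks (2.5) with d = dim(V) = 2:

```python
import itertools
import numpy as np

# Checking (2.5) for the 2-dimensional standard representation of S3;
# the Haar integral is the average over the 6 group elements.
def perm_matrix(p):
    M = np.zeros((3, 3))
    for i, j in enumerate(p):
        M[j, i] = 1.0
    return M

# Columns of B: an orthonormal basis of {x in R^3 : x1 + x2 + x3 = 0}.
B = np.array([[1.0, 1.0], [-1.0, 1.0], [0.0, -2.0]]) / np.array([np.sqrt(2), np.sqrt(6)])
G = [B.T @ perm_matrix(p) @ B for p in itertools.permutations(range(3))]  # orthogonal 2x2

rng = np.random.default_rng(2)
w1, v1, w2, v2 = (rng.standard_normal(2) + 1j * rng.standard_normal(2) for _ in range(4))

# <x, y> below is np.vdot(y, x): linear in x, conjugate linear in y.
lhs = np.mean([np.vdot(v1, g @ w1) * np.conj(np.vdot(v2, g @ w2)) for g in G])
rhs = np.vdot(w2, w1) * np.vdot(v1, v2) / 2   # d = dim(V) = 2, as in Proposition 2.9
assert np.isclose(lhs, rhs)
```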


Before we turn to the evaluation of the constant d, we will prove a different


orthogonality for the characters of irreducible representations (Theorem 2.5).
This will require some preparations.

Proposition 2.5. The character χ of a representation (π, V) is a matrix coefficient of V.

Proof. If v1, . . . , vn is a basis of V, and L1, . . . , Ln is the dual basis of V∗,
then χ(g) = Σ_{i=1}^n Li(π(g)vi). □

Proposition 2.6. Suppose that (π, V ) is a representation of G. Let χ be the


character of π.
(i) If g ∈ G then χ(g⁻¹) = \overline{χ(g)}.

(ii) Let (π̂, V∗) be the contragredient representation of π. Then the character
of π̂ is the complex conjugate \overline{χ} of the character χ of π.

Proof. Since π(g) is unitary with respect to an invariant inner product ⟨ , ⟩,
its eigenvalues t1, . . . , tn all have absolute value 1, and so

    tr π(g)⁻¹ = Σ_i t_i⁻¹ = Σ_i \overline{t_i} = \overline{χ(g)}.

This proves (i). As for (ii), referring to (2.3), π̂(g) is the adjoint of π(g)⁻¹ with
respect to the dual pairing ⟨ , ⟩, so its trace equals the trace of π(g)⁻¹. □


The trivial representation of any group G is the representation on a one-


dimensional vector space V with π(g)v = v being the trivial action.

Proposition 2.7. If (π, V) is an irreducible representation and χ its character, then

    ∫_G χ(g) dg = 1 if π is the trivial representation, and 0 otherwise.
Proof. The character of the trivial representation is just the constant function
1, and since we normalized the Haar measure so that G has volume 1, this
integral is 1 if π is trivial. In general, we may regard ∫_G χ(g) dg as the inner
product of χ with the character 1 of the trivial representation, and if π is
nontrivial, these are matrix coefficients of different irreducible representations
and hence orthogonal by Theorem 2.3. □


If (π, V ) is a representation, let V G be the subspace of G-invariants, that is,

V G = {v ∈ V | π(g)v = v for all g ∈ G} .

Proposition 2.8. If (π, V) is a representation of G and χ its character, then

    ∫_G χ(g) dg = dim(V^G).

Proof. Decompose V = ⊕i Vi into a direct sum of irreducible invariant subspaces, and let χi be the character of the restriction πi of π to Vi. By Proposition 2.7, ∫_G χi(g) dg = 1 if and only if πi is trivial. Hence ∫_G χ(g) dg is the
number of trivial πi. The direct sum of the Vi with πi trivial is V^G, and the
statement follows. □
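For a finite group, Proposition 2.8 is the familiar counting of invariants by averaged traces (Burnside's lemma). A sketch (assuming NumPy; S3 acting on C³ by permutation matrices is an illustrative choice):

```python
import itertools
import numpy as np

# Proposition 2.8 for the permutation representation of S3 on C^3:
# chi(g) is the number of fixed points of g, the invariants are the constant
# vectors, and both sides of the formula equal 1.
def perm_matrix(p):
    M = np.zeros((3, 3))
    for i, j in enumerate(p):
        M[j, i] = 1.0
    return M

G = [perm_matrix(p) for p in itertools.permutations(range(3))]
avg_chi = np.mean([np.trace(g) for g in G])   # ∫_G chi(g) dg

# dim V^G from the averaging projector P = (1/|G|) Σ pi(g): trace = rank.
P = sum(G) / len(G)
dim_invariants = int(round(np.trace(P)))
assert avg_chi == dim_invariants == 1
```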


If (π1 , V1 ) and (π2 , V2 ) are irreducible representations, and χ1 and χ2 are their
characters, we have already noted in proving Proposition 2.3 that we may form
representations π1 ⊕ π2 and π1 ⊗ π2 on V1 ⊕ V2 and V1 ⊗ V2 . It is easy to see
that χπ1 ⊕π2 = χπ1 + χπ2 and χπ1 ⊗π2 = χπ1 χπ2 . It is not quite true that
the characters form a ring. Certainly the negative of a matrix coefficient is a

matrix coefficient, yet the negative of a character is not a character. The set
of characters is closed under addition and multiplication but not subtraction.
We define a generalized (or virtual ) character to be a function of the form
χ1 − χ2 , where χ1 and χ2 are characters. It is now clear that the generalized
characters form a ring.

Lemma 2.2. Define a representation Ψ : GL(n, C) × GL(m, C) −→ GL(Ω),
where Ω = Mat_{m×n}(C), by Ψ(g1, g2) : X −→ g2Xg1⁻¹. Then the trace of
Ψ(g1, g2) is tr(g1⁻¹) tr(g2).

Proof. Both tr Ψ(g1, g2) and tr(g1⁻¹) tr(g2) are continuous, and since diagonalizable matrices are dense in GL(n, C), we may assume that both g1
and g2 are diagonalizable. Also, if γ is invertible we have Ψ(γg1γ⁻¹, g2) =
Ψ(γ, 1)Ψ(g1, g2)Ψ(γ, 1)⁻¹, so both tr Ψ(g1, g2) and tr(g1⁻¹) tr(g2)
are unchanged if g1 is replaced by γg1γ⁻¹. So we may assume that g1 is diagonal, and similarly g2. Now if α1, . . . , αn and β1, . . . , βm are the diagonal
entries of g1 and g2, the effect of Ψ(g1, g2) on X ∈ Ω is to multiply the
columns by the αi⁻¹ and the rows by the βj. So the trace is tr(g1⁻¹) tr(g2). □

Theorem 2.5 (Schur orthogonality). Let (π1, V1) and (π2, V2) be representations of G with characters χ1 and χ2. Then

    ∫_G χ1(g) \overline{χ2(g)} dg = dim Hom_G(V1, V2).    (2.7)

If π1 and π2 are irreducible, then

    ∫_G χ1(g) \overline{χ2(g)} dg = 1 if π1 ≅ π2, and 0 otherwise.

Proof. Define a representation Π of G on the space Ω = HomC (V1 , V2 ) of all


linear transformations T : V1 −→ V2 by

Π(g)T = π2 (g) ◦ T ◦ π1 (g)−1 .

By Lemma 2.2 and Proposition 2.6, the character of Π(g) is χ2(g) \overline{χ1(g)}. The
space of invariants Ω^G consists exactly of the T which are G-module homomorphisms,
so by Proposition 2.8 we get

    ∫_G \overline{χ1(g)} χ2(g) dg = dim Hom_G(V1, V2).

Since this is real, we may conjugate to obtain (2.7). □
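The second statement of the theorem can be verified directly for S3, whose irreducible characters are the trivial, sign, and 2-dimensional standard characters; the standard character takes the value (number of fixed points) − 1. A sketch (assuming NumPy for the averages; illustration only):

```python
import itertools
import numpy as np

# Orthonormality of the irreducible characters of S3 (all real-valued here).
perms = list(itertools.permutations(range(3)))

def sign(p):
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                s = -s
    return s

chi_triv = [1 for p in perms]
chi_sign = [sign(p) for p in perms]
chi_std = [sum(p[i] == i for i in range(3)) - 1 for p in perms]

def pair(a, b):
    # ∫_G chi_a(g) conj(chi_b(g)) dg; conjugation omitted since values are real
    return np.mean([x * y for x, y in zip(a, b)])

for chi in (chi_triv, chi_sign, chi_std):
    assert pair(chi, chi) == 1          # each irreducible character has norm 1
assert pair(chi_triv, chi_sign) == 0
assert pair(chi_triv, chi_std) == 0
assert pair(chi_sign, chi_std) == 0
```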




Proposition 2.9. The constant d in Theorem 2.4 equals dim(V ).



Proof. Let v1, . . . , vn be an orthonormal basis of V, n = dim(V). We have

    χ(g) = Σ_i ⟨π(g)vi, vi⟩

since ⟨π(g)vj, vi⟩ is the i, j component of the matrix of π(g) with respect to
this basis. Now

    1 = ∫_G |χ(g)|² dg = Σ_{i,j} ∫_G ⟨π(g)vi, vi⟩ \overline{⟨π(g)vj, vj⟩} dg.

There are n² terms on the right, but by (2.5) only the terms with i = j are
nonzero, and those equal d⁻¹. Thus, d = n. □


We now return to the matrix coefficients Mπ of an irreducible representation


(π, V ). We define a representation Θ of G × G on Mπ by

Θ(g1 , g2 )f (x) = f (g2−1 xg1 ).

We also have a representation Π of G × G on EndC (V ) by

    Π(g1, g2)T = π(g1) T π(g2)⁻¹.

Proposition 2.10. If f ∈ Mπ then so is Θ(g1 , g2 ) f . The representations Θ


and Π are equivalent.

Proof. Let L ∈ V∗ and v ∈ V. Define f_{L,v}(g) = L(π(g)v). The map (L, v) −→
f_{L,v} is bilinear, hence induces a linear map σ : V∗ ⊗ V −→ Mπ. It is surjective
by the definition of Mπ, and it follows from Theorem 2.4 that if the Li and vj
run through orthonormal bases, then the f_{Li,vj} are orthogonal and nonzero, hence linearly
independent. Therefore, σ is a vector space isomorphism. We have

    Θ(g1, g2)f_{L,v}(g) = L(π(g2⁻¹ g g1)v) = f_{π̂(g2)L, π(g1)v}(g),

where we recall that (π̂, V∗) is the contragredient representation. This means
that σ is a G × G-module homomorphism and so Mπ ≅ V∗ ⊗ V as G × G-modules.
On the other hand, we also have a bilinear map V∗ × V −→ End_C(V)
that associates with (L, v) the rank-one linear map T_{L,v}(u) = L(u)v. This
induces an isomorphism V∗ ⊗ V −→ End_C(V) which is G × G equivariant.
We see that Mπ ≅ V∗ ⊗ V ≅ End_C(V). □


A function f on G is called a class function if it is constant on conjugacy


classes, that is, if it satisfies the equation f (hgh−1 ) = f (g). The character of
a representation is a class function since the trace of a linear transformation
is unchanged by conjugation.

Proposition 2.11. If f is the matrix coefficient of an irreducible representa-


tion (π, V ), and if f is a class function, then f is a constant multiple of χπ .

Proof. By Schur’s lemma, there is a unique G-invariant vector in HomC (V, V );


hence. by Proposition 2.10, the same is true of Mπ in the action of G by
conjugation. This matrix coefficient is of course χπ .

Theorem 2.6. If f is a matrix coefficient and also a class function, then f
is a finite linear combination of characters of irreducible representations.
Proof. Write f = Σ_{i=1}^n fi, where each fi is a matrix coefficient of a distinct irreducible representation (πi, Vi). Since f is conjugation-invariant, and since the
fi live in spaces M_{πi}, which are conjugation-invariant and mutually orthogonal, each fi is itself a class function and hence a constant multiple of χ_{πi} by
Proposition 2.11. □


Exercises

Exercise 2.1. Suppose that G is a compact Abelian group and π : G −→ GL(n, C)


an irreducible representation. Prove that n = 1.
Exercise 2.2. Suppose that G is a compact group and f : G −→ C is a matrix
coefficient of an irreducible representation π. Show that g −→ \overline{f(g⁻¹)} is a matrix
coefficient of the same representation π.
Exercise 2.3. Suppose that G is a compact group. Let C(G) be the space of continuous functions on G. If f1 and f2 ∈ C(G), define the convolution f1 ∗ f2 of f1 and
f2 by

    (f1 ∗ f2)(g) = ∫_G f1(gh⁻¹) f2(h) dh = ∫_G f1(h) f2(h⁻¹g) dh.

(i) Use the variable change h −→ h⁻¹g to prove the identity of the last two terms.
Prove that this operation is associative, and so C(G) is a ring (without unit)
with respect to convolution.
(ii) Let π be an irreducible representation. Show that the space Mπ of matrix
coefficients of π is a 2-sided ideal in C(G), and explain how this fact implies
Theorem 2.3.
Exercise 2.4. Let G be a compact group, and let G × G act on the space Mπ
by left and right translation: (g, h)f (x) = f (g −1 xh). Show that Mπ ∼
= π̂ ⊗ π as
(G × G)-modules.
Exercise 2.5. Let G be a compact group and let g, h ∈ G. Show that g and h are
conjugate if and only if χ(g) = χ(h) for every irreducible character χ. Show also
that every character is real-valued if and only if every element is conjugate to its
inverse.
Exercise 2.6. Let G be a compact group, and let V, W be irreducible G-modules.
An invariant bilinear form B : V × W → C is one that satisfies B(g·v, g·w) = B(v, w)
for g ∈ G, v ∈ V, w ∈ W. Show that the space of invariant bilinear forms is at most
one-dimensional, and is one-dimensional if and only if V and W are contragredient.
3 Compact Operators

If H is a normed vector space, a linear operator T : H → H is called bounded
if there exists a constant C such that |Tx| ≤ C|x| for all x ∈ H. In this case,
the smallest such C is called the operator norm of T, and is denoted |T|.
The boundedness of the operator T is equivalent to its continuity. If H is a
Hilbert space, then a bounded operator T is self-adjoint if

    ⟨Tf, g⟩ = ⟨f, Tg⟩

for all f, g ∈ H. As usual, we call f an eigenvector with eigenvalue λ if f ≠ 0
and Tf = λf. Given λ, the set of eigenvectors with eigenvalue λ (together
with 0, which is not an eigenvector) is called the λ-eigenspace. It follows from
with 0, which is not an eigenvector) is called the λ-eigenspace. It follows from
elementary and well-known arguments that if T is a self-adjoint bounded
operator, then its eigenvalues are real, and the eigenspaces corresponding to
distinct eigenvalues are orthogonal. Moreover, if V ⊂ H is a subspace such
that T (V ) ⊂ V , it is easy to see that also T (V ⊥ ) ⊂ V ⊥ .
A bounded operator T : H → H is compact if whenever {x1, x2 , x3 , . . .} is
any bounded sequence in H, the sequence {T x1 , T x2 , . . .} has a convergent
subsequence.

Theorem 3.1 (Spectral theorem for compact operators). Let T be a


compact self-adjoint operator on a Hilbert space H. Let N be the nullspace
of T . Then the Hilbert space dimension of N⊥ is at most countable. N⊥ has an
orthonormal basis φi (i = 1, 2, 3, . . .) of eigenvectors of T so that T φi = λi φi .
If N⊥ is not finite-dimensional, the eigenvalues λi → 0 as i → ∞.

Since the eigenvalues λi → 0, if λ is any nonzero eigenvalue, it follows from


this statement that the λ-eigenspace is finite-dimensional.

Proof. This depends upon the equality

    |T| = sup_{0 ≠ x ∈ H} |⟨Tx, x⟩| / ⟨x, x⟩.    (3.1)


To prove this, let B denote the right-hand side. If 0 ≠ x ∈ H,

    |⟨Tx, x⟩| ≤ |Tx| · |x| ≤ |T| · |x|² = |T| · ⟨x, x⟩,

so B ≤ |T|. We must prove the converse. Let λ > 0 be a constant, to be
determined later. Using ⟨T²x, x⟩ = ⟨Tx, Tx⟩, we have

    ⟨Tx, Tx⟩ = ¼ |⟨T(λx + λ⁻¹Tx), λx + λ⁻¹Tx⟩ − ⟨T(λx − λ⁻¹Tx), λx − λ⁻¹Tx⟩|
             ≤ ¼ (|⟨T(λx + λ⁻¹Tx), λx + λ⁻¹Tx⟩| + |⟨T(λx − λ⁻¹Tx), λx − λ⁻¹Tx⟩|)
             ≤ ¼ (B⟨λx + λ⁻¹Tx, λx + λ⁻¹Tx⟩ + B⟨λx − λ⁻¹Tx, λx − λ⁻¹Tx⟩)
             = (B/2)(λ²⟨x, x⟩ + λ⁻²⟨Tx, Tx⟩).

Now taking λ = (|Tx|/|x|)^{1/2} (if Tx = 0 there is nothing to prove), we obtain

    |Tx|² = ⟨Tx, Tx⟩ ≤ B |x| |Tx|,

so |Tx| ≤ B|x|, which implies that |T| ≤ B, whence (3.1).


We now prove that N⊥ has an orthonormal basis consisting of eigenvectors
of T . It is an easy consequence of self-adjointness that N⊥ is T -stable. Let Σ
be the set of all orthonormal subsets of N⊥ whose elements are eigenvectors
of T . Ordering Σ by inclusion, Zorn’s lemma implies that it has a maximal
element S. Let V be the closure of the linear span of S. We must prove that
V = N⊥ . Let H0 = V ⊥ . We wish to show H0 = N. It is obvious that N ⊆ H0.
To prove the opposite inclusion, note that H0 is stable under T , and T induces
a compact self-adjoint operator on H0 . What we must show is that T |H0 = 0.
If T has a nonzero eigenvector in H0 , this will contradict the maximality of Σ.
It is therefore sufficient to show that a compact self-adjoint operator on a
nonzero Hilbert space has an eigenvector.
Replacing H by H0, we are therefore reduced to the easier problem of
showing that if T ≠ 0, then T has a nonzero eigenvector. By (3.1), there is
a sequence x1, x2, x3, . . . of unit vectors such that |⟨Txi, xi⟩| → |T|. Observe
that if x ∈ H, we have

    ⟨Tx, x⟩ = ⟨x, Tx⟩ = \overline{⟨Tx, x⟩},

so the ⟨Txi, xi⟩ are real; we may therefore replace the sequence by a subsequence such that ⟨Txi, xi⟩ → λ, where λ = ±|T|. Since T ≠ 0, λ ≠ 0. Since T
is compact, there exists a further subsequence {xi} such that Txi converges
to a vector v. We will show that xi → λ⁻¹v.
Observe first that

    |⟨Txi, xi⟩| ≤ |Txi| |xi| = |Txi| ≤ |T| |xi| = |λ|,

and since ⟨Txi, xi⟩ → λ, it follows that |Txi| → |λ|. Now



    |λxi − Txi|² = ⟨λxi − Txi, λxi − Txi⟩ = λ²|xi|² + |Txi|² − 2λ⟨Txi, xi⟩,

and since |xi| = 1, |Txi| → |λ|, and ⟨Txi, xi⟩ → λ, this converges to 0. Since
Txi → v, the sequence λxi therefore also converges to v, and xi → λ⁻¹v.
Now, by continuity, Txi → λ⁻¹Tv, so v = λ⁻¹Tv. This proves that v is
an eigenvector with eigenvalue λ. This completes the proof that N⊥ has an
orthonormal basis consisting of eigenvectors.
Now let {φi } be this orthonormal basis and let λi be the corresponding
eigenvalues. If ε > 0 is given, only finitely many |λi| > ε since otherwise we
can find an infinite sequence of φi with |Tφi| > ε. Such a sequence will have
no convergent subsequence, contradicting the compactness of T. Thus, N⊥ is
countable-dimensional, and we may arrange the {φi} in a sequence. If it is
infinite, we see that λi −→ 0. □
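A concrete illustration of the theorem: the integral operator on L²[0, 1] with kernel K(x, y) = min(x, y) is compact and self-adjoint, and its eigenvalues 1/((k − ½)²π²) tend to 0. The sketch below (assuming NumPy) discretizes the operator on a grid, a finite-dimensional stand-in for the spectral decomposition:

```python
import numpy as np

# Finite-dimensional stand-in for Theorem 3.1: discretize the compact
# self-adjoint operator (Tf)(x) = ∫_0^1 min(x, y) f(y) dy on an N-point grid.
# Its true eigenvalues are 1/((k - 1/2)^2 π^2), k = 1, 2, ..., tending to 0.
N = 1000
x = (np.arange(N) + 0.5) / N              # midpoint grid on [0, 1]
T = np.minimum.outer(x, x) / N            # Riemann-sum approximation of the operator

eigs = np.linalg.eigvalsh(T)[::-1]        # real eigenvalues, sorted decreasing
expected = [1 / ((k - 0.5) ** 2 * np.pi ** 2) for k in (1, 2, 3)]
assert np.allclose(eigs[:3], expected, rtol=1e-3)
assert eigs[40] < eigs[0] / 100           # the eigenvalues decay toward 0
```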

Proposition 3.1. Let X and Y be compact topological spaces with Y a metric
space with distance function d. Let U be a set of continuous maps X −→ Y
such that for every x ∈ X and every ε > 0 there exists a neighborhood N of
x such that d(f(x), f(x′)) < ε for all x′ ∈ N and for all f ∈ U. Then every
sequence in U has a uniformly convergent subsequence.
We refer to the hypothesis on U as equicontinuity.
Proof. Let S0 = {f1, f2, f3, . . .} be a sequence in U. We will show that it has
a convergent subsequence. We will construct a subsequence that is uniformly
Cauchy and hence has a limit. For every n ≥ 1, we will construct a subsequence
Sn = {fn1, fn2, fn3, . . .} of Sn−1 such that sup_{x∈X} d(fni(x), fnj(x)) ≤ 1/n.
Assume that Sn−1 is constructed. For each x ∈ X, equicontinuity guarantees the existence of an open neighborhood Nx of x such that d(f(y), f(x)) ≤ 1/(3n)
for all y ∈ Nx and all f ∈ U. Since X is compact, we can cover X by
a finite number of these sets, say Nx1, . . . , Nxm. Since the fn−1,i take values
in the compact space Y, the m-tuples (fn−1,i(x1), . . . , fn−1,i(xm)) have an
accumulation point, and we may therefore select the subsequence {fni} such
that d(fni(xk), fnj(xk)) ≤ 1/(3n) for all i, j and 1 ≤ k ≤ m. Then for any y,
there exists xk such that y ∈ Nxk and

    d(fni(y), fnj(y)) ≤ d(fni(y), fni(xk)) + d(fni(xk), fnj(xk)) + d(fnj(y), fnj(xk))
                     ≤ 1/(3n) + 1/(3n) + 1/(3n) = 1/n.

This completes the construction of the sequences {fni}.
The diagonal sequence {f11, f22, f33, . . .} is uniformly Cauchy. Since Y is
a compact metric space, it is complete, and so this sequence is uniformly
convergent. □

We topologize C(X) by giving it the L∞ norm | |∞ (sup norm).
Proposition 3.2 (Ascoli and Arzelà). Suppose that X is a compact space
and that U ⊂ C(X) is a bounded subset such that for each x ∈ X and ε > 0
there is a neighborhood N of x such that |f(x) − f(y)| ≤ ε for all y ∈ N and
all f ∈ U. Then every sequence in U has a uniformly convergent subsequence.

Again, the hypothesis on U is called equicontinuity.

Proof. Since U is bounded, there is a compact interval Y ⊂ R such that all


functions in U take values in Y . The result follows from Proposition 3.1. 

Exercises
Exercise 3.1. Suppose that T is a bounded operator on the Hilbert space H, and
suppose that for each ε > 0 there exists a compact operator Tε such that |T − Tε| < ε.
Show that T is compact. (Use a diagonal argument like the proof of Proposition 3.1.)

Exercise 3.2 (Hilbert–Schmidt operators). Let X be a locally compact Hausdorff space with a positive Borel measure μ. Assume that L²(X) has a countable
basis. Let K ∈ L²(X × X). Consider the operator on L²(X) with kernel K defined by

    Tf(x) = ∫_X K(x, y) f(y) dμ(y).

Let φi be an orthonormal basis of L²(X). Expand K in a Fourier expansion:

    K(x, y) = Σ_{i=1}^∞ ψi(x) φi(y),    ψi = Tφi.

Show that Σ_i |ψi|² = ∫∫ |K(x, y)|² dμ(x) dμ(y) < ∞. Consider the operator TN with
kernel

    K_N(x, y) = Σ_{i=1}^N ψi(x) φi(y).

Show that TN is compact, and deduce that T is compact.


4 The Peter–Weyl Theorem

In this chapter, we assume that G is a compact group. Let C(G) be the
convolution ring of continuous functions on G. It is a ring (without unit unless
G is finite) under the multiplication of convolution:

(f1 ∗ f2)(g) = ∫_G f1(gh⁻¹) f2(h) dh = ∫_G f1(h) f2(h⁻¹g) dh.

(Use the variable change h −→ h⁻¹g to prove the identity of the last two
terms. See Exercise 2.3.) We will sometimes define f1 ∗ f2 by this formula
even if f1 and f2 are not assumed continuous. For example, we will make use
of the convolution defined this way if f1 ∈ L∞ (G) and f2 ∈ L1 (G), or vice
versa.
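When G is a finite group, dh becomes the normalized counting measure and the convolution is a finite sum, so the identity of the two formulas can be checked directly. A small numerical sketch, not from the text, assuming NumPy and using the cyclic group Z/n written additively:

```python
import numpy as np

def convolve(f1, f2):
    """(f1 * f2)(g) = (1/n) sum_h f1(g - h) f2(h) on the cyclic group Z/n,
    written additively, with normalized counting measure."""
    n = len(f1)
    return np.array([sum(f1[(g - h) % n] * f2[h] for h in range(n)) / n
                     for g in range(n)])

rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=5), rng.normal(size=5)

# The variable change h -> h^{-1} g (here h -> g - h) gives the second
# formula: (f1 * f2)(g) = (1/n) sum_h f1(h) f2(g - h).
alt = np.array([sum(f1[h] * f2[(g - h) % 5] for h in range(5)) / 5
                for g in range(5)])
assert np.allclose(convolve(f1, f2), alt)
```

Since Z/n is abelian, this convolution is also commutative, which the same variable change shows.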
Since G has total volume 1, we have inequalities (where | |p denotes the
Lp norm, 1 ≤ p ≤ ∞)

|f|₁ ≤ |f|₂ ≤ |f|∞.    (4.1)

The second inequality is trivial, and the first is Cauchy–Schwarz:

|f|₁ = ⟨|f|, 1⟩ ≤ |f|₂ · |1|₂ = |f|₂.

(Here |f| means the function |f|(x) = |f(x)|.)


If φ ∈ C(G) let Tφ be left convolution with φ. Thus,

(Tφ f)(g) = ∫_G φ(gh⁻¹) f(h) dh.

Proposition 4.1. If φ ∈ C(G), then Tφ is a bounded operator on L1 (G).


If f ∈ L1 (G), then Tφ f ∈ L∞ (G) and

|Tφ f|∞ ≤ |φ|∞ |f|₁.    (4.2)

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 23


DOI 10.1007/978-1-4614-8024-2 4, © Springer Science+Business Media New York 2013

Proof. If f ∈ L1(G), then

|Tφ f|∞ = sup_{g∈G} | ∫_G φ(gh⁻¹) f(h) dh | ≤ |φ|∞ ∫_G |f(h)| dh,

proving (4.2). Using (4.1), it follows that the operator Tφ is bounded. In
fact, (4.1) shows that it is bounded in each of the three metrics | |₁, | |₂,
| |∞. ∎

Proposition 4.2. If φ ∈ C(G), then convolution with φ is a bounded operator
Tφ on L2(G) and |Tφ| ≤ |φ|∞. The operator Tφ is compact, and if φ(g⁻¹) =
φ̄(g) (the bar denoting complex conjugation), it is self-adjoint.
Proof. Using (4.1), L∞(G) ⊂ L2(G) ⊂ L1(G), and by (4.2), |Tφ f|₂ ≤
|Tφ f|∞ ≤ |φ|∞ |f|₁ ≤ |φ|∞ |f|₂, so the operator norm |Tφ| ≤ |φ|∞.
By (4.1), the unit ball in L2(G) is contained in the unit ball in L1(G), so
it is sufficient to show that B = {Tφ f | f ∈ L1(G), |f|₁ ≤ 1} is sequentially
compact in L2(G). Also, by (4.1), it is sufficient to show that it is sequentially
compact in L∞(G), that is, in C(G), whose topology is induced by the L∞(G)
norm. It follows from (4.2) that B is bounded. We show that it is equicontinuous.
Since φ is continuous and G is compact, φ is uniformly continuous.
This means that given ε > 0 there is a neighborhood N of the identity such
that |φ(kg) − φ(g)| < ε for all g when k ∈ N. Now, if f ∈ L1(G) and |f|₁ ≤ 1,
we have, for all g,

|(φ ∗ f)(kg) − (φ ∗ f)(g)| = | ∫_G (φ(kgh⁻¹) − φ(gh⁻¹)) f(h) dh |
    ≤ ∫_G |φ(kgh⁻¹) − φ(gh⁻¹)| |f(h)| dh ≤ ε |f|₁ ≤ ε.

This proves equicontinuity, and sequential compactness of B now follows by
the Ascoli–Arzela lemma (Proposition 3.2).
If φ(g⁻¹) = φ̄(g), then

⟨Tφ f1, f2⟩ = ∫_G ∫_G φ(gh⁻¹) f1(h) f̄2(g) dg dh,

while

⟨f1, Tφ f2⟩ = ∫_G ∫_G φ̄(hg⁻¹) f1(h) f̄2(g) dg dh.

These are equal, so Tφ is self-adjoint. ∎

If g ∈ G, let (ρ(g)f)(x) = f(xg) be the right translate of f by g.
Proposition 4.3. If φ ∈ C(G), and λ ∈ C, the λ-eigenspace
V (λ) = {f ∈ L2 (G) | Tφ f = λf }
is invariant under ρ(g) for all g ∈ G.

Proof. Suppose Tφ f = λf. Then

(Tφ ρ(g)f)(x) = ∫_G φ(xh⁻¹) f(hg) dh.

After the change of variables h −→ hg⁻¹, this equals

∫_G φ(xgh⁻¹) f(h) dh = ρ(g)(Tφ f)(x) = λ ρ(g)f(x). ∎

Theorem 4.1 (Peter and Weyl). The matrix coefficients of G are dense
in C(G).

Proof. Let f ∈ C(G). We will prove that there exists a matrix coefficient f′
such that |f − f′|∞ < ε for any given ε > 0.
Since G is compact, f is uniformly continuous. This means that there exists
an open neighborhood U of the identity such that if g ∈ U, then |λ(g)f −
f|∞ < ε/2, where λ : G → End C(G) is the action by left translation:
(λ(g)f)(h) = f(g⁻¹h). Let φ be a nonnegative function supported in U such
that ∫_G φ(g) dg = 1. We may arrange that φ(g) = φ(g⁻¹) so that the operator
Tφ is self-adjoint as well as compact. We claim that |Tφ f − f|∞ < ε/2. Indeed,
if h ∈ G,

|(φ ∗ f)(h) − f(h)| = | ∫_G (φ(g) f(g⁻¹h) − φ(g) f(h)) dg |
    ≤ ∫_U φ(g) |f(g⁻¹h) − f(h)| dg
    ≤ ∫_U φ(g) |λ(g)f − f|∞ dg
    ≤ ∫_U φ(g) (ε/2) dg = ε/2.

By Proposition 4.2, Tφ is a compact operator on L2(G). If λ is an eigenvalue
of Tφ, let V(λ) be the λ-eigenspace. By the spectral theorem, the spaces V(λ)
are finite-dimensional [except perhaps V(0)], mutually orthogonal, and they
span L2(G) as a Hilbert space. By Proposition 4.3 they are invariant under
ρ(g) for all g ∈ G. Let fλ be the projection of f on V(λ). Orthogonality of
the fλ implies that

Σ_λ |fλ|₂² = |f|₂² < ∞.    (4.3)

Let

f′ = Tφ(f′′),    f′′ = Σ_{|λ|>q} fλ,

 
where q > 0 remains to be chosen. We note that f′ and f′′ are both contained
in ⊕_{|λ|>q} V(λ), which is a finite-dimensional vector space closed under
right translation by Proposition 4.3, and by Theorem 2.1, it follows that they
are matrix coefficients.
By (4.3), we may choose q so that Σ_{0<|λ|<q} |fλ|₂² is as small as we like.
Using (4.1) we may thus arrange that

| Σ_{0<|λ|<q} fλ |₁ ≤ | Σ_{0<|λ|<q} fλ |₂ = ( Σ_{0<|λ|<q} |fλ|₂² )^{1/2} < ε/(2|φ|∞).    (4.4)

We have

Tφ(f − f′′) = Tφ( f0 + Σ_{0<|λ|<q} fλ ) = Tφ( Σ_{0<|λ|<q} fλ ).

Using (4.2) and (4.4) we have |Tφ(f − f′′)|∞ ≤ ε/2. Now

|f − f′|∞ = |f − Tφ f + Tφ(f − f′′)|∞ ≤ |f − Tφ f|∞ + |Tφ f − Tφ f′′|∞
    ≤ ε/2 + ε/2 = ε. ∎

Corollary 4.1. The matrix coefficients of G are dense in L2 (G).

Proof. Since C(G) is dense in L2 (G), this follows from the Peter–Weyl
theorem and (4.1).


We say that a topological group G has no small subgroups if it has a


neighborhood U of the identity such that the only subgroup of G contained
in U is just {1}. For example, we will see that Lie groups have no small
subgroups. On the other hand, some groups, such as GL(n, Zp ) where Zp is
the ring of p-adic integers, have a neighborhood basis at the identity consisting
of open subgroups. Such a group is called totally disconnected, and for such a
group the no small subgroups property fails very strongly.
A representation is called faithful if its kernel is trivial.

Theorem 4.2. Let G be a compact group that has no small subgroups. Then
G has a faithful finite-dimensional representation.

Proof. Let U be a neighborhood of the identity that contains no subgroup but


{1}. By the Peter–Weyl theorem, we can find a finite-dimensional representa-
tion π and a matrix coefficient f such that f(1) = 0 but f(g) > 1 when g ∉ U.
The function f is constant on the kernel of π, so that kernel is contained in U .
It follows that the kernel is trivial.


We will now prove a fact about infinite-dimensional representations of a


compact group G. The Peter–Weyl Theorem amounts to a “completeness”
of the finite-dimensional representations from the point of view of harmonic
analysis. One aspect of this is the L2 completeness asserted in Corollary 4.1.
Another aspect, which we now prove, is that there are no irreducible uni-
tary infinite-dimensional representations. From the point of view of harmonic
analysis, these two statements are closely related and are in fact equivalent.
Representation theory and Fourier analysis on groups are essentially the same
thing.
If H is a Hilbert space, a representation π : G −→ End(H) is called unitary
if ⟨π(g)v, π(g)w⟩ = ⟨v, w⟩ for all v, w ∈ H, g ∈ G. It is also assumed that the
map (g, v) −→ π(g)v from G × H −→ H is continuous.
Theorem 4.3 (Peter and Weyl). Let H be a Hilbert space and G be a
compact group. Let π : G −→ End(H) be a unitary representation. Then H is
a direct sum of finite-dimensional irreducible representations.

Proof. We first show that if H is nonzero then it has an irreducible finite-dimensional
invariant subspace. We choose a nonzero vector v ∈ H. Let N be
a neighborhood of the identity of G such that if g ∈ N then |π(g)v − v| ≤ |v|/2.
We can find a nonnegative continuous function φ on G supported in N such
that ∫_G φ(g) dg = 1.
We claim that ∫_G φ(g) π(g)v dg ≠ 0. This can be proved by taking the
inner product with v. Indeed

⟨ ∫_G φ(g) π(g)v dg, v ⟩ = ⟨v, v⟩ − ⟨ ∫_N φ(g)(v − π(g)v) dg, v ⟩    (4.5)

and

| ⟨ ∫_N φ(g)(v − π(g)v) dg, v ⟩ | ≤ ∫_N φ(g) |v − π(g)v| dg · |v| ≤ |v|²/2.

Thus, the two terms in (4.5) differ in absolute value and cannot cancel.
Next, using the Peter–Weyl theorem, we may find a matrix coefficient f
such that |f − φ|∞ < , where  can be chosen arbitrarily. We have
 
 
 (f − φ)(g) π(g)v dg   |v| ,
G

so if  is sufficiently small we have G f (g) π(g)v dg = 0.
Since f is a matrix coefficient, so is the function g −→ f(g⁻¹) by Proposition
2.4. Thus, let (ρ, W) be a finite-dimensional representation with w ∈ W
and L : W −→ C a linear functional such that f(g⁻¹) = L(ρ(g)w). Define a
map T : W −→ H by

T(x) = ∫_G L(ρ(g⁻¹)x) π(g)v dg.

This is an intertwining map by the same argument used to prove (2.4). It is
nonzero since T(w) = ∫_G f(g) π(g)v dg ≠ 0. Since W is finite-dimensional, the
image of T is a nonzero finite-dimensional invariant subspace.
We have proven that every nonzero unitary representation of G has a
nonzero finite-dimensional invariant subspace, which we may obviously assume
to be irreducible. From this we deduce the stated result. Let (π, H) be a
unitary representation of G. Let Σ be the set of all sets of orthogonal finite-
dimensional irreducible invariant subspaces of H, ordered by inclusion. Thus,
if S ∈ Σ and U, V ∈ S, then U and V are finite-dimensional irreducible
invariant subspaces, and if U ≠ V, then U ⊥ V. By Zorn's lemma, Σ has
a maximal element S and we are done if S spans H as a Hilbert space.
Otherwise, let H  be the orthogonal complement of the span of S. By what
we have shown, H  contains an invariant irreducible subspace. We may append
this subspace to S, contradicting its maximality. ∎


Exercises
Exercise 4.1. Let G be totally disconnected, and let π : G −→ GL(n, C) be a
finite-dimensional representation. Show that the kernel of π is open. (Hint: Use the
fact that GL(n, C) has no small subgroups.) Conclude (in contrast with Theorem 4.2)
that the compact group GL(n, Zp ) has no faithful finite-dimensional representation.

Exercise 4.2. Suppose that G is a compact Abelian group and H ⊂ G a closed


subgroup. Let χ : H −→ C× be a character. Show that χ can be extended to a
character of G. (Hint: Apply Theorem 4.3 to the space V = {f ∈ L2 (G) | f (hg) =
χ(h) f(g)}. To show that V is nonzero, note that if φ ∈ C(G) then f(g) =
∫_H φ(hg) χ(h)⁻¹ dh defines an element of V. Use Urysohn's lemma to construct φ
such that f ≠ 0.)
Part II

Compact Lie Groups


5
Lie Subgroups of GL(n, C)

If U is an open subset of Rn , we say that a map φ : U −→ Rm is smooth if


it has continuous partial derivatives of all orders. More generally, if X ⊂ Rn
is not necessarily open, we say that a map φ : X −→ Rm is smooth if for
each x ∈ X there exists an open set U of Rn containing x such that φ can be
extended to a smooth map on U . A diffeomorphism of X ⊆ Rn with Y ⊆ Rm
is a homeomorphism F : X −→ Y such that both F and F −1 are smooth. We
will assume as known the following useful criterion.
Inverse Function Theorem. If U ⊂ Rd is open and u ∈ U , if F : U −→
Rn is a smooth map, with d < n, and if the matrix of partial derivatives
(∂Fi /∂xj ) has rank d at u, then u has a neighborhood N such that F induces
a diffeomorphism of N onto its image.
A subset X of a topological space Y is locally closed (in Y ) if for all x ∈ X
there exists an open neighborhood U of x in Y such that X ∩ U is closed
in U . This is equivalent to saying that X is the intersection of an open set
and a closed set. We say that X is a submanifold of Rn of dimension d if it
is a locally closed subset and every point of X has a neighborhood that is
diffeomorphic to an open set in Rd .
Let us identify Matn(C) with the Euclidean space Cn² ≅ R2n². The subset
GL(n, C) is open, and if a closed subgroup G of GL(n, C) is a submanifold of
R2n² in this identification, we say that G is a closed Lie subgroup of GL(n, C).
It may be shown that any closed subgroup of GL(n, C) is a closed Lie sub-
group. See Remarks 7.1 and 7.2 for some subtleties behind the innocent term
“closed Lie subgroup.”
More generally, a Lie group is a topological group G that is a differentiable
manifold such that the multiplication and inverse maps G × G −→ G and
G −→ G are smooth. We will give a proper definition of a differentiable
manifold in the next chapter. In this chapter, we will restrict ourselves to
closed Lie subgroups of GL(n, C).


Example 5.1. If F is a field, then the general linear group GL(n, F ) is the
group of invertible n × n matrices with coefficients in F . It is a Lie group.
Assuming that F = R or C, the group GL(n, F ) is an open set in Matn (F )
and hence a manifold of dimension n2 if F = R or 2n2 if F = C. The special
linear group is the subgroup SL(n, F ) of matrices with determinant 1. It is a
closed Lie subgroup of GL(n, F ) of dimension n2 − 1 or 2(n2 − 1).
Example 5.2. If F = R or C, let O(n, F ) = {g ∈ GL(n, F ) | g · t g = I}. This
is the n × n orthogonal group. More geometrically, O(n, F ) is the group of
linear transformations preserving the quadratic form Q(x1, . . . , xn) = x1² +
x2² + · · · + xn². To see this, if x = t(x1, . . . , xn) is represented as a column
vector, we have Q(x) = Q(x1, . . . , xn) = t x · x, and it is clear that Q(gx) =
Q(x) if g · t g = I. The group O(n, R) is compact and is usually denoted
simply O(n). The group O(n) contains elements of determinants ±1. The
subgroup of elements of determinant 1 is the special orthogonal group SO(n).
The dimension of O(n) and its subgroup SO(n) of index 2 is (n² − n)/2. This
will be seen in Proposition 5.6 when we compute their Lie algebra (which is
the same for both groups).
Example 5.3. More generally, over any field, a vector space V on which there
is given a quadratic form q is called a quadratic space, and the set O(V, q)
of linear transformations of V preserving q is an orthogonal group. Over the
complex numbers, it is not hard to prove that all orthogonal groups are iso-
morphic (Exercise 5.4), but over the real numbers, some orthogonal groups are
not isomorphic to O(n). If k + r = n, let O(k, r) be the subgroup of GL(n, R)
preserving the indefinite quadratic form x1² + · · · + xk² − xk+1² − · · · − xn². If
r = 0, this is O(n), but otherwise this group is noncompact. The dimensions
of these Lie groups are, like SO(n), equal to (n² − n)/2.

Example 5.4. The unitary group U(n) = {g ∈ GL(n, C) | g · t ḡ = I}. If g ∈ U(n)
then | det(g)| = 1, and every complex number of absolute value 1 is a possible
determinant of g ∈ U(n). The special unitary group SU(n) = U(n) ∩ SL(n, C).
The dimensions of U(n) and SU(n) are n2 and n2 − 1, just like GL(n, R) and
SL(n, R).
Example 5.5. If F = R or C, let Sp(2n, F) = {g ∈ GL(2n, F) | g · J · t g = J},
where

J = ⎛ 0  −In ⎞
    ⎝ In   0 ⎠ .

This is the symplectic group. The compact group Sp(2n, C) ∩ U(2n) will be
denoted as simply Sp(2n).
A Lie algebra over a field F is a vector space g over F endowed with a bilinear
map, the Lie bracket , denoted (X, Y ) −→ [X, Y ] for X, Y ∈ g, that satisfies
[X, Y ] = −[Y, X] and the Jacobi identity
[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y ]] = 0. (5.1)

The identity [X, Y ] = −[Y, X] implies that [X, X] = 0.
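These axioms can be spot-checked numerically for the commutator bracket [X, Y] = XY − YX on matrices (the construction of Example 5.6 below). A sketch assuming NumPy, with random 3 × 3 real matrices:

```python
import numpy as np

def bracket(X, Y):
    """The Lie bracket of Example 5.6: [X, Y] = XY - YX."""
    return X @ Y - Y @ X

rng = np.random.default_rng(1)
X, Y, Z = (rng.normal(size=(3, 3)) for _ in range(3))

# Antisymmetry: [X, Y] = -[Y, X], hence [X, X] = 0.
assert np.allclose(bracket(X, Y), -bracket(Y, X))
assert np.allclose(bracket(X, X), np.zeros((3, 3)))

# The Jacobi identity (5.1).
jacobi = (bracket(X, bracket(Y, Z)) + bracket(Y, bracket(Z, X))
          + bracket(Z, bracket(X, Y)))
assert np.allclose(jacobi, np.zeros((3, 3)))
```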


We will show that it is possible to associate a Lie algebra with any Lie
group. We will show this for closed Lie subgroups of GL(n, C) in this chapter
and for arbitrary Lie groups in Chap. 7.
First we give two purely algebraic examples of Lie algebras.
Example 5.6. Let A be an associative algebra. Define a bilinear operation on
A by [X, Y ] = XY − Y X. With this definition, A becomes a Lie algebra.
If A = Matn (F ), where F is a field, we will denote the Lie algebra associated
with A by the previous example as gl(n, F ). After Proposition 5.5 it will
become clear that this is the Lie algebra of GL(n, F ) when F = R or C.
Similarly, if V is a vector space over F , then the space End(V ) of F -linear
transformations V −→ V is an associative algebra and hence a Lie algebra,
denoted gl(V ).
Example 5.7. Let F be a field and let A be an F -algebra. By a derivation
of A we mean a map D : A −→ A that is F -linear, and satisfies D(f g) =
f D(g) + D(f )g. We have D(1 · 1) = 2D(1), which implies that D(1) = 0, and
therefore D(c) = 0 for any c ∈ F ⊂ A. It is easy to check that if D1 and
D2 are derivations, then so is [D1 , D2 ] = D1 D2 − D2 D1 . However, D1 D2 and
D2 D1 are in general not themselves derivations. It is easy to check that the derivations
of A form a Lie algebra.
The exponential map exp : Matn(C) −→ GL(n, C) is defined by

exp(X) = I + X + (1/2)X² + (1/6)X³ + · · · .    (5.2)

This series is convergent for all matrices X.


Remark 5.1. If X and Y commute, then exp(X + Y) = exp(X) exp(Y). If they
do not commute, this is in general not true.
A one-parameter subgroup of a Lie group G is a continuous homomorphism
R −→ G. We denote this by t → gt . Since tX and uX commute, for X ∈
Matn (C), the map t −→ exp(tX) is a one-parameter subgroup. We will also
denote exp(X) = eX .
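Both Remark 5.1 and the one-parameter subgroup property can be illustrated numerically. A sketch assuming NumPy/SciPy, with scipy.linalg.expm standing in for the series (5.2):

```python
import numpy as np
from scipy.linalg import expm   # matrix exponential, the series (5.2)

rng = np.random.default_rng(2)
X, Y = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))

# t -> exp(tX) is a one-parameter subgroup: exp((s+t)X) = exp(sX) exp(tX)
# because sX and tX commute.
s, t = 0.3, 0.7
assert np.allclose(expm((s + t) * X), expm(s * X) @ expm(t * X))

# For noncommuting X and Y, exp(X + Y) = exp(X) exp(Y) generally fails.
assert not np.allclose(expm(X + Y), expm(X) @ expm(Y))
```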
Proposition 5.1. Let U be an open subset of Rn , and let x ∈ U . Then we
may find a smooth function f with compact support contained in U that does
not vanish at x.
Proof. We may assume x = (x1, . . . , xn) is the origin. Define

f(x1, . . . , xn) = { exp(−(1 − |x|²/r²)⁻¹)   if |x| < r,
                   { 0                        otherwise.

This function is smooth and has support in the ball {|x| ≤ r}. Taking r
sufficiently small, we can make this vanish outside U. ∎


Proposition 5.2. Let G be a closed Lie subgroup of GL(n, C), and let X ∈
Matn (C). Then the path t −→ exp(tX) is tangent to the submanifold G of
GL(n, C) at t = 0 if and only if it is contained in G for all t.
Proof. If exp(tX) is contained in G for all t, then clearly it is tangent to G at
t = 0. We must prove the converse. Suppose that exp(t0X) ∉ G for some t0 >
0. Using Proposition 5.1, let φ0 be a smooth compactly supported function
on GL(n, C) such that φ0(g) = 0 for all g ∈ G, φ0 ≥ 0, and φ0(exp(t0X)) ≠ 0.
Let

f(t) = φ(exp(tX)),    φ(h) = ∫_G φ0(hg) dg,    t ∈ R,
in terms of a left Haar measure on G. Clearly, φ is constant on the cosets hG
of G, vanishes on G, but is nonzero at exp(t0 X). For any t,
f′(t) = (d/du) φ(exp(tX) exp(uX)) |_{u=0} = 0

since the path u −→ exp(tX) exp(uX) is tangent to the coset exp(tX)G and
φ is constant on such cosets. Moreover, f(0) = 0. Therefore, f(t) = 0 for all
t, which is a contradiction since f(t0) ≠ 0. ∎

Proposition 5.3. Let G be a closed Lie subgroup of GL(n, C). The set Lie(G)
of all X ∈ Matn(C) such that exp(tX) ∈ G for all t ∈ R is a vector space whose
dimension is equal to the dimension of G as a manifold.
Proof. This is clear from the characterization of Proposition 5.2. ∎

Proposition 5.4. Let G be a closed Lie subgroup of GL(n, C). The map

X −→ exp(X)

gives a diffeomorphism of a neighborhood of the identity in Lie(G) onto a


neighborhood of the identity in G.
Proof. First we note that since exp(X) = I + X + (1/2)X² + · · ·, the Jacobian
of exp at the identity is 1, so exp induces a diffeomorphism of an open neighborhood
U of the identity in Matn(C) onto a neighborhood of the identity in
GL(n, C) ⊂ Matn(C). Now, since by Proposition 5.3 Lie(G) is a vector subspace
of dimension equal to the dimension of G as a manifold, the Inverse
Function Theorem implies that Lie(G) ∩ U must be mapped onto an open
neighborhood of the identity in G. ∎

Proposition 5.5. If G is a closed Lie subgroup of GL(n, C), and if X, Y ∈
Lie(G), then [X, Y ] ∈ Lie(G).
Proof. It is evident that Lie(G) is mapped to itself under conjugation by
elements of G. Thus, Lie(G) contains
(1/t)(etX Y e−tX − Y) = XY − YX + (t/2)(X²Y − 2XYX + YX²) + · · · .

Because this is true for all t, passing to the limit t −→ 0 shows that [X, Y] ∈
Lie(G). ∎
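The limit used in this proof can be watched converging numerically. A sketch assuming NumPy/SciPy; the error of the difference quotient is linear in t, matching the series above:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(6)
X, Y = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))

# (e^{tX} Y e^{-tX} - Y)/t tends to [X, Y] = XY - YX as t -> 0;
# the error is O(t), coming from the (t/2)(X^2 Y - 2XYX + YX^2) term.
bracket = X @ Y - Y @ X
errs = []
for t in (1e-2, 1e-4):
    quotient = (expm(t * X) @ Y @ expm(-t * X) - Y) / t
    errs.append(np.linalg.norm(quotient - bracket))

assert errs[1] < errs[0] / 10   # error shrinks linearly with t
assert errs[1] < 0.1
```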


We see that Lie(G) is a Lie subalgebra of gl(n, C). Thus, we are able to
associate a Lie algebra with a Lie group.
Example 5.8. The Lie algebra of GL(n, F ) with F = R or C is gl(n, F ).
Example 5.9. Let sl(n, F ) be the subspace of X ∈ gl(n, F ) such that tr(X) =
0. This is a Lie subalgebra, and it is the Lie algebra of SL(n, F ) when F = R
or C. This follows immediately from the fact that det(eX ) = etr(X) for any
matrix X because if x1 , . . . , xn are the eigenvalues of X, then ex1 , . . . , exn are
the eigenvalues of eX .
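The identity det(eX) = etr(X) from Example 5.9 is easy to test numerically; a sketch assuming NumPy/SciPy:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
X = rng.normal(size=(4, 4))

# det(exp(X)) = exp(tr(X)): the eigenvalues of exp(X) are the
# exponentials of the eigenvalues of X.
assert np.isclose(np.linalg.det(expm(X)), np.exp(np.trace(X)))

# Hence a traceless matrix exponentiates into SL(n).
X0 = X - (np.trace(X) / 4) * np.eye(4)   # project onto sl(4, R)
assert np.isclose(np.linalg.det(expm(X0)), 1.0)
```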
Example 5.10. Let o(n, F ) be the set of X ∈ gl(n, F ) that are skew-symmetric,
in other words, that satisfy X + t X = 0. It is easy to check that o(n, F ) is
closed under the Lie bracket and hence is a Lie subalgebra.
Proposition 5.6. If F = R or C, the Lie algebra of O(n, F) is o(n, F). The
dimension of O(n) is (n² − n)/2, and the dimension of O(n, C) is n² − n.
Proof. Let G = O(n, F), g = Lie(G). Suppose X ∈ o(n, F). Exponentiate the
identity t (tX) = −tX to get

exp(tX)−1 = t exp(tX),

whence exp(tX) ∈ O(n, F) for all t ∈ R. Thus, o(n, F) ⊆ g. To prove the
converse, suppose that X ∈ g. Then, for all t,
I = exp(tX) · t exp(tX)
  = (I + tX + (1/2)t²X² + · · ·)(I + t (t X) + (1/2)t²(t X)² + · · ·)
  = I + t(X + t X) + (1/2)t²(X² + 2X · t X + (t X)²) + · · · .

Since this is true for all t, each coefficient in this Taylor series must vanish
(except of course the constant one). In particular, X + t X = 0. This proves
that g = o(n, F ).
The dimensions of O(n) and O(n, C) are most easily calculated by computing
the dimension of the Lie algebras. A skew-symmetric matrix is determined
by its upper triangular entries, and there are (n² − n)/2 of these. ∎
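The computation in this proof can be mirrored numerically: exponentials of skew-symmetric matrices are orthogonal, and the dimension count matches the strictly upper triangular entries. A sketch assuming NumPy/SciPy:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4))
X = A - A.T            # skew-symmetric: X + X^T = 0, so X lies in o(4)

# exp(tX) is orthogonal for every t: exp(tX)^{-1} = exp(tX)^T.
for t in (0.5, 1.0, 2.0):
    g = expm(t * X)
    assert np.allclose(g @ g.T, np.eye(4))

# dim o(n) = (n^2 - n)/2: X is determined by its strictly upper
# triangular entries.
n = 4
assert len(X[np.triu_indices(n, k=1)]) == (n * n - n) // 2
```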

Example 5.11. Let u(n) be the set of X ∈ gl(n, C) such that X + t X̄ = 0.
One checks easily that this is closed under the gl(n, C) Lie bracket [X, Y ] =
XY − Y X. Despite the fact that these matrices have complex entries, this is
a real Lie algebra, for it is only a real vector space, not a complex one. (It is
not closed under multiplication by complex scalars.) It may be checked along
the lines of Proposition 5.6 that u(n) is the Lie algebra of U(n), and similarly
su(n) = {X ∈ u(n) | tr(X) = 0} is the Lie algebra of SU(n).
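Example 5.11 can be spot-checked the same way: a skew-Hermitian matrix exponentiates to a unitary one, and u(n) is only a real vector space. A sketch assuming NumPy/SciPy:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(7)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
X = A - A.conj().T     # skew-Hermitian, so X lies in u(3)

# exp(X) is unitary.
g = expm(X)
assert np.allclose(g @ g.conj().T, np.eye(3))

# u(n) is a real vector space but not a complex one: multiplying a
# skew-Hermitian X by i gives a Hermitian matrix, not a skew-Hermitian one.
assert not np.allclose(1j * X + (1j * X).conj().T, np.zeros((3, 3)))
```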
Example 5.12. Let sp(2n, F ) be the set of matrices X ∈ Mat2n (F ) that satisfy
XJ + J t X = 0, where

J = ⎛ 0  −In ⎞
    ⎝ In   0 ⎠ .
This is the Lie algebra of Sp(2n, F ).
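As with the orthogonal case, membership in sp(2n, F) can be tested numerically and exponentiated into the group. The block parametrization of X below is a standard fact stated as an aside, not taken from the text; the sketch assumes NumPy/SciPy:

```python
import numpy as np
from scipy.linalg import expm

n = 2
J = np.block([[np.zeros((n, n)), -np.eye(n)],
              [np.eye(n), np.zeros((n, n))]])

# A matrix X = [[A, B], [C, -A^T]] with B, C symmetric satisfies
# X J + J X^T = 0 for this J (a standard parametrization of sp(2n, R)).
rng = np.random.default_rng(5)
A = 0.3 * rng.normal(size=(n, n))
B = 0.3 * rng.normal(size=(n, n)); B = B + B.T
C = 0.3 * rng.normal(size=(n, n)); C = C + C.T
X = np.block([[A, B], [C, -A.T]])
assert np.allclose(X @ J + J @ X.T, np.zeros((2 * n, 2 * n)))

# Exponentiating lands in the symplectic group: g J (g^T) = J.
g = expm(X)
assert np.allclose(g @ J @ g.T, J)
```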

Exercises
Exercise 5.1. Show that O(n, m) is the group of g ∈ GL(n + m, R) such that
g J1 t g = J1, where

J1 = ⎛ In      ⎞
     ⎝     −Im ⎠ .

Exercise 5.2. If F = R or C, let OJ (F ) be the group of all g ∈ GL(N, F ) such that


g J t g = J, where J is the N × N matrix

J = ⎛      1 ⎞
    ⎜   ⋰   ⎟ .    (5.3)
    ⎝ 1      ⎠

Show that OJ (R) is conjugate in GL(N, R) to O(n, n) if N = 2n and to O(n + 1, n)


if N = 2n + 1. [Hint: Find a matrix σ ∈ GL(N, R) such that σ J t σ = J1, where J1
is as in the previous exercise.]

Exercise 5.3. Let J be as in the previous exercise, and let


        ⎛ 1/√(2i)                  −i/√(2i) ⎞
        ⎜     1/√(2i)          −i/√(2i)     ⎟
σ =     ⎜          ⋱          ⋰             ⎟ ,
        ⎜     i/√(2i)          −1/√(2i)     ⎟
        ⎝ i/√(2i)                  −1/√(2i) ⎠

with all entries not on one of the two diagonals equal to zero. If N is odd, the middle
element of this matrix is 1.

(i) Check that σ t σ = J, with J as in (5.3). With OJ(F) as in Exercise 5.2, deduce
    that σ−1 OJ(C) σ = O(N, C). Why does the same argument not prove that
    σ−1 OJ(R) σ = O(N, R)?
(ii) Check that σ is unitary. Show that if g ∈ OJ (C) and h = σ −1 g σ, then h is real
if and only if g is unitary.
(iii) Show that the group OJ (C) ∩ U(N ) is conjugate in GL(N, C) to O(N ).

Exercise 5.4. Let V1 and V2 be vector spaces over a field F , and let qi be a quadratic
form on Vi for i = 1, 2. The quadratic spaces are called equivalent if there exists an
isomorphism l : V1 −→ V2 such that q1 = q2 ◦ l.

(i) Show that over a field of characteristic not equal to 2, any quadratic form is
    equivalent to Σi ai xi² for some constants ai.
(ii) Show that, if F = C, then any quadratic space of dimension n is equivalent to
    Cn with the quadratic form x1² + · · · + xn².
(iii) Show that, if F = R, then any quadratic space of dimension n is equivalent to
    Rn with the quadratic form x1² + · · · + xr² − xr+1² − · · · − xn² for some r.

Exercise 5.5. Compute the Lie algebra of Sp(2n, R) and the dimension of the
group.

Let H = R ⊕ Ri ⊕ Rj ⊕ Rk be the ring of quaternions, where i² = j² = k² = −1
and ij = −ji = k, jk = −kj = i, ki = −ik = j. Then H = C ⊕ Cj. If x =
a + bi + cj + dk ∈ H with a, b, c, d real, let x̄ = a − bi − cj − dk. If u ∈ C, then
juj⁻¹ = ū. The group GL(n, H) consists of all n × n invertible quaternion matrices.

Exercise 5.6. Show that there is a ring isomorphism Matn(H) −→ Mat2n(C) with
the following description. Any A ∈ Matn(H) may be written uniquely as A1 + A2 j
with A1, A2 ∈ Matn(C). The isomorphism in question maps

A1 + A2 j −→ ⎛  A1  A2 ⎞
             ⎝ −Ā2  Ā1 ⎠ .

Exercise 5.7. Show that if A ∈ Matn(H), then A · t Ā = I if and only if the complex
2n × 2n matrix

⎛  A1  A2 ⎞
⎝ −Ā2  Ā1 ⎠

is in both Sp(2n, C) and U(2n). Recall that the intersection of these two groups was
the group denoted Sp(2n).

Exercise 5.8. Show that the groups SO(2) and SU(2) may be identified with the
groups of matrices

⎛  a b ⎞
⎝ −b̄ ā ⎠ ,    a, b ∈ F, |a|² + |b|² = 1,

where F = R or C, respectively.

Exercise 5.9. The group SU(1, 1) is by definition the group of g ∈ SL(2, C) such
that

g · J · t ḡ = J,    J = ⎛ 1    ⎞
                        ⎝   −1 ⎠ .

(i) Show that SU(1, 1) consists of all elements of SL(2, C) of the form

    ⎛ a b ⎞
    ⎝ b̄ ā ⎠ ,    |a|² − |b|² = 1.

(ii) Show that the Lie algebra su(1, 1) of SU(1, 1) consists of all matrices of the form

    ⎛ ai   b ⎞
    ⎝ b̄  −ai ⎠
with a real.
(iii) Let

        C = (1/√(2i)) ⎛ 1 −i ⎞  ∈ SL(2, C).
                      ⎝ 1  i ⎠

    This element is sometimes called the Cayley transform. Show that
    C · SL(2, R) · C−1 = SU(1, 1) and C · sl(2, R) · C−1 = su(1, 1).
6
Vector Fields

A smooth premanifold of dimension n is a Hausdorff topological space M


together with a set U of pairs (U, φ), where the set of U such that (U, φ) ∈ U
for some φ is an open cover of M and such that, for each (U, φ) ∈ U, the
image φ(U ) of φ is an open subset of Rn and φ is a homeomorphism of U onto
φ(U). We assume that if U, V ∈ U, then φV ◦ φU−1 is a diffeomorphism from
φU(U ∩ V) onto φV(U ∩ V). The set U is called a preatlas.
If M and N are premanifolds, a continuous map f : M −→ N is smooth
if whenever (U, φ) and (V, ψ) are charts of M and N , respectively, the map
ψ ◦ f ◦ φ−1 is a smooth map from φ(U ∩ f−1(V)) −→ ψ(V). Smooth maps
are the morphisms in the category of smooth premanifolds. The smooth map
f is a diffeomorphism if it is a bijection and has a smooth inverse. Open
subsets of Rn are naturally premanifolds, and the definitions of smooth maps
and diffeomorphisms are consistent with the definitions already given in that
special case.
If M is a premanifold with preatlas U, and if we replace U by the larger set
U′ of all pairs (U, φ), where U is an open subset of M and φ is a diffeomorphism
of U onto an open subset of Rn, then the set of smooth maps M −→ N or
N −→ M, where N is another premanifold, is unchanged. If U = U′, then we
call U′ an atlas and M a smooth manifold.
Suppose that M is a smooth manifold and m ∈ M. If U is a neighborhood
of m and (U, φ) is a chart, then we may write φ(u) = (x1(u), . . . , xn(u)),
where x1, . . . , xn : U −→ R are smooth functions. Composing φ with a
translation in Rn, we may arrange that xi(m) = 0, and it is often advantageous
to do so. We call x1, . . . , xn a set of local coordinates at m or coordinate
functions on U. The set U itself may be called a coordinate neighborhood.
Let m ∈ M, and let F = R or C. A germ of an F-valued function is
an equivalence class of pairs (U, fU), where U is an open neighborhood of
m and fU : U −→ F is a function. The equivalence relation is that (U, fU)
and (V, fV) are equivalent if fU and fV are equal on some open neighborhood
W of m contained in U ∩ V. Let Om be the set of germs of smooth


real-valued functions. It is a ring in an obvious way, and evaluation at m


induces a surjective homomorphism Om −→ R, the evaluation map. We will
denote the evaluation map f → f (m), a slight abuse of notation since f is a
germ, not a function. Let Mm be the kernel of this homomorphism; that is,
the ideal of germs of smooth functions vanishing at m. Then Om is a local
ring and Mm is its maximal ideal.

Lemma 6.1. Suppose that f is a smooth function on a neighborhood U of the


origin in Rn, and f(0, x2, . . . , xn) = 0 for (0, x2, . . . , xn) ∈ U. Then

g(x1, x2, . . . , xn) = { x1⁻¹ f(x1, . . . , xn)           if x1 ≠ 0,
                       { (∂f/∂x1)(0, x2, . . . , xn)    if x1 = 0,

defines a smooth function on U.

Proof. We show first that g is continuous. Indeed, with x2, . . . , xn fixed,

lim_{x1→0} x1⁻¹ f(x1, . . . , xn) = (∂f/∂x1)(0, x2, . . . , xn)

by the definition of the derivative. Convergence is uniform on compact sets in


x2, . . . , xn since by the remainder form of Taylor's theorem

|x1⁻¹ f(x1, . . . , xn) − (∂f/∂x1)(0, x2, . . . , xn)| ≤ (B/2)|x1|,

where B is an upper bound for |∂²f/∂x1²|. Since (∂f/∂x1)(0, x2, . . . , xn) is


continuous by the smoothness of f , it follows that g is continuous.
A similar argument based on Taylor’s theorem shows that the higher
partial derivatives ∂ⁿg/∂x1ⁿ are also continuous.
Finally, the two functions

∂^{k2+···+kn} f / ∂x2^{k2} · · · ∂xn^{kn}    and    ∂^{k2+···+kn} g / ∂x2^{k2} · · · ∂xn^{kn}

bear the same relationship to each other as f and g, so we obtain similarly
continuity of the mixed partials ∂^{k1+···+kn} g / ∂x1^{k1} ∂x2^{k2} · · · ∂xn^{kn}. ∎


Proposition 6.1. Let m ∈ M , where M is a smooth manifold of dimension n.


Let O = Om and M = Mm . Let x1 , . . . , xn be the germs of a set of local
coordinates at m. Then x1 , . . . , xn generate the ideal M. Moreover, M/M2
is a vector space of dimension n generated by the images of x1 , . . . , xn .

Proof. Although this is really a statement about germs of functions, we will


work with representative functions defined in some neighborhood of m.
If f ∈ M, we write f = f1 + f2 , where f1 (x1 , . . . , xn ) = f (0, x2 , . . . , xn )
and f2 = f − f1 . Then f2 ∈ x1 O by Lemma 6.1, while f1 is the germ of a
function in x2 , . . . , xn vanishing at m and lies in x2 O + · · ·+ xn O by induction
on n.

As for the last assertion, if f ∈ M, let ai = (∂f/∂xi)(m). Then f − Σi ai xi
vanishes to order 2 at m. We need to show that it lies in M2 . Thus, what we
must prove is that if f and ∂f /∂xi vanish at m, then f is in M2 . To prove
this, write f = f1 + f2 + f3, where

f1(x1, x2, . . . , xn) = f(x1, . . . , xn) − f(0, x2, . . . , xn) − x1 (∂f/∂x1)(0, x2, . . . , xn),

f2(x1, . . . , xn) = f(0, x2, . . . , xn),

f3(x1, x2, . . . , xn) = x1 (∂f/∂x1)(0, x2, . . . , xn).
Two applications of Lemma 6.1 show that f1 = x1²h where h is smooth, so
f1 ∈ M2. The function f2 also vanishes at m, along with its first-order partial
derivatives, but is a function in one fewer variables, so by induction it is in M2.
Lastly, ∂f /∂x1 vanishes at m and hence is in M by the part of this proposition
that is already proved, so multiplying by x1 gives an element of M2. ∎


A local derivation of Om is a map X : Om −→ R that is R-linear and such


that
X(f g) = f (m)X(g) + g(m)X(f ). (6.1)

Taking f = g = 1 gives X(1 · 1) = 2X(1) so X annihilates constant functions.


For example, if x1, . . . , xn are a set of local coordinates and a1, . . . , an ∈ R,
then

Xf = Σ_{i=1}^n ai (∂f/∂xi)(m)    (6.2)

is a local derivation.

Proposition 6.2. Let m be a point on an n-dimensional smooth manifold M .


Every local derivation of Om is of the form (6.2). The set Tm (M ) of such local
derivations is an n-dimensional real vector space.

Proof. If f and g both vanish at m, then (6.1) implies that a local derivation
X vanishes on M2 , and by Proposition 6.1 it is therefore determined by its
values on x1 , . . . , xn . If these are a1 , . . . , an , then X agrees with the right-hand
side of (6.2).


We now define the tangent space Tm (M ) to be the space of local derivations


of Om . We will call elements of Tm (M ) tangent vectors. Thus, a tangent vector
at m is the same thing as a local derivation of the ring Om .
This definition of tangent vector and tangent space has the advantage that
it is intrinsic. Proposition 6.2 allows us to relate this definition to the intuitive
notion of a tangent vector. Intuitively, a tangent vector should be an equiva-
lence class of paths through m: two paths are equivalent if they are tangent.

By a path we mean a smooth map u : (−ε, ε) −→ M such that u(0) = m for
some ε > 0. Given a function, or the germ of a function at m, we can use the
path to define a local derivation

    Xf = (d/dt) f (u(t)) |t=0 .                (6.3)

Using the chain rule, this equals (6.2) with ai = (d/dt) xi (u(t)) |t=0 .
Let M and N be smooth manifolds with a smooth map f : M → N . Let
m ∈ M and n ∈ N such that f (m) = n. Then we have a map df : Tm (M ) →
Tn (N ) defined as follows. Note that f induces a map from On (N ) to Om (M ).
If X ∈ Tm (M ), then X is a local derivation of Om (M ), and composing it
with this induced map produces a local derivation of On (N ). This is the tangent vector we
will denote df (X). The map df : Tm (M ) → Tn (N ) is called the differential
of f .
We will use the notation

    X = ∑i ai ∂/∂xi

to denote the element (6.2) of Tm (M ). By a vector field X on M we mean a rule


that assigns to each point m ∈ M an element Xm ∈ Tm (M ). The assignment
m −→ Xm must be smooth. This means that if x1 , . . . , xn are local coordinates
on an open set U ⊆ M , then there exist smooth functions a1 , . . . , an on U
such that
    Xm = ∑i ai (m) ∂/∂xi .                (6.4)

It follows from the chain rule that this definition is independent of the choice
of local coordinates xi .
Now let A = C ∞ (M, R) be the ring of smooth real-valued functions on M .
Given a vector field X on M , we may obtain a derivation of A as follows.
If f ∈ A, let X(f ) be the smooth function that assigns to m ∈ M the value
Xm (f ), where we are of course applying Xm to the germ of f at m. For
example, if M = U is an open set on Rn with coordinate functions x1 , . . . , xn
on U , given smooth functions ai : U −→ R, we may associate a derivation of
A with the vector field (6.4) by

    (Xf )(m) = ∑i ai (m) (∂f /∂xi )(m).                (6.5)

The content of the next theorem is that every derivation of A is associated


with a vector field in this way.
Proposition 6.3. There is a one-to-one correspondence between vector fields
on a smooth manifold M and derivations of C ∞ (M, R). Specifically, if D is
any derivation of C ∞ (M, R), there is a unique vector field X on M such that
Df = Xf for all f .

Proof. We show first that if m ∈ M , and if f ∈ A = C ∞ (M, R) has germ


zero at m, then the function Df vanishes at m. This implies that D induces a
well-defined map Xm : Om −→ R that is a local derivation. Our assumption
means that f vanishes in a neighborhood of m, so there is another smooth
function g such that gf = f , yet g(m) = 0. Now D(f )(m) = g(m) D(f )(m) +
f (m) D(g)(m). Since both f and g vanish at m, we see that D(f )(m) = 0.
Now let xi be local coordinates on an open set U of M . For each m ∈ U
there are real numbers ai (m) such that (6.4) is true. We need to know that the
ai (m) are smooth functions. Indeed, we have ai (m) = D(xi )(m), so ai = D(xi ) is smooth.



Now let X and Y be vector fields on M . By Proposition 6.3, we may regard


these as derivations of C ∞ (M, R). As we have noted in Example 5.7, deriva-
tions of an arbitrary ring form a Lie algebra. Thus [X, Y ] = XY − Y X defines
a derivation:
[X, Y ]f = X(Y f ) − Y (Xf ) . (6.6)

By Proposition 6.3 this derivation [X, Y ] corresponds to a vector field. Let us


see this again concretely by computing its effect in local coordinates. If X =
∑i ai ∂/∂xi and Y = ∑i bi ∂/∂xi , we have

    X(Y f ) = ∑i,j ( aj (∂bi /∂xj )(∂f /∂xi ) + ai bj ∂²f /∂xi ∂xj ) .

This is not a derivation, but if we subtract Y (Xf ) to cancel the unwanted
mixed partials, we see that

    [X, Y ] = ∑i,j ( aj ∂bi /∂xj − bj ∂ai /∂xj ) ∂/∂xi .
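The cancellation of the mixed partials can be checked symbolically. The following sketch (using SymPy; the coefficient functions chosen for X and Y are arbitrary illustrations, not taken from the text) computes X(Y f ) − Y (Xf ) on R2 and compares it with the coordinate formula above.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
xs = [x1, x2]
f = sp.Function('f')(x1, x2)

# Hypothetical smooth coefficient functions (arbitrary choices):
a = [x1 * x2, x2**2]        # X = a1 d/dx1 + a2 d/dx2
b = [sp.sin(x1), x1 + x2]   # Y = b1 d/dx1 + b2 d/dx2

def apply_field(coeffs, g):
    """Apply the vector field sum_i coeffs[i] d/dx_i to the expression g."""
    return sum(c * sp.diff(g, v) for c, v in zip(coeffs, xs))

# X(Yf) - Y(Xf): the second-order terms cancel ...
lhs = apply_field(a, apply_field(b, f)) - apply_field(b, apply_field(a, f))

# ... leaving the first-order coordinate formula for [X, Y]:
c = [sum(a[j] * sp.diff(b[i], xs[j]) - b[j] * sp.diff(a[i], xs[j])
         for j in range(2)) for i in range(2)]
rhs = apply_field(c, f)

print(sp.expand(lhs - rhs))
```

The printed difference expands to 0, confirming that the bracket of two vector fields is again a first-order operator.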

Exercises
The following exercise requires some knowledge of topology.

Exercise 6.1. Let X be a vector field on the sphere S k . Assume that X is nowhere
zero, i.e., Xm ≠ 0 for all m ∈ S k . Show that the antipodal map a : S k −→ S k and
the identity map S k −→ S k are homotopic. Deduce that k is odd.
Hint: Normalize the vector field so that Xm is a unit tangent vector for all m.
If m ∈ S k consider the great circle θm : [0, 2π] −→ S k tangent to Xm . Then
θm (0) = θm (2π) = m, but m −→ θm (π) is the antipodal map. Also, think about the
effect of the antipodal map on H k (S k ).
7
Left-Invariant Vector Fields

To recapitulate, a Lie group is a differentiable manifold with a group structure


in which the multiplication and inversion maps G × G −→ G and G −→ G
are smooth. A homomorphism of Lie groups is a group homomorphism that
is also a smooth map.
Remark 7.1. There is a subtlety in the definition of a Lie subgroup. A Lie sub-
group of G is best defined as a Lie group H with an injective homomorphism
i : H −→ G. With this definition, however, the image of i in G need not be closed,
as the following example shows. Let G be T × T, where T is the circle R/Z.
Let H be R, and let i : H −→ G be the map i(t) = (αt, βt) modulo 1, where
the ratio α/β is irrational. This is a Lie subgroup, but the image of H is not
closed. To require a closed image in the definition of a Lie subgroup would
invalidate a theorem of Chevalley that subalgebras of the Lie algebra of a Lie
group correspond to Lie subgroups. If we wish to exclude this type of example,
we will explicitly describe a Lie subgroup of G as a closed Lie subgroup.
Remark 7.2. On the other hand, in the expression “closed Lie subgroup,” the
term “Lie” is redundant. It may be shown that a closed subgroup of a Lie
group is a submanifold and hence a Lie group. See Bröcker and Tom Dieck
[25], Theorem 3.11 on p. 28; Knapp [106] Chap. I Sect. 4; or Knapp [105],
Theorem 1.5 on p. 20. We will only prove this for the special case of an
abelian subgroup in Theorem 15.2 below.

Suppose that M and N are smooth manifolds and φ : M −→ N is a smooth


map. As we explained in Chap. 6, if m ∈ M and n = φ(m), we get a map
dφ : Tm (M ) −→ Tn (N ), called the differential of φ. If φ is a diffeomorphism
of M onto N , then we can push a vector field X on M forward this way to
obtain a vector field on N . This vector field may be denoted φ∗ X, defined
by (φ∗ X)n = dφ(Xm ) when φ(m) = n. If φ is not a diffeomorphism, this
may not work because some points in N may not even be in the image of φ,
while others may be in the image of two different points m1 and m2 with no
guarantee that dφXm1 = dφXm2 .

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 45


DOI 10.1007/978-1-4614-8024-2 7, © Springer Science+Business Media New York 2013

Now let G be a Lie group. If g ∈ G, then Lg : G −→ G defined by Lg (h) =


gh is a diffeomorphism and hence induces maps Lg,∗ : Th (G) −→ Tgh (G).
A vector field X on G is left-invariant if Lg,∗ (Xh ) = Xgh .
Proposition 7.1. The vector space of left-invariant vector fields is closed
under [ , ] and is a Lie algebra of dimension dim(G). If Xe ∈ Te (G), there
is a unique left-invariant vector field X on G with the prescribed tangent vec-
tor at the identity.
Proof. Given a tangent vector Xe at the identity element e of G, we may
define a left-invariant vector field by Xg = Lg,∗ (Xe ), and conversely any left-
invariant vector field must satisfy this identity, so the space of left-invariant
vector fields is isomorphic to the tangent space of G at the identity. Therefore,
its vector space dimension equals the dimension of G.

Let Lie(G) be the vector space of left-invariant vector fields, which we may
identify with Te (G). It is clearly closed under [ , ].
Suppose now that G = GL(n, C). We have defined two different Lie alge-
bras for G: first, in Chap. 5, we defined the Lie algebra gl(n, C) of G to be
Matn (C) with the commutation relation [X, Y ] = XY − Y X (matrix multi-
plication); and second, we have defined the Lie algebra to be the Lie algebra
of left-invariant vector fields with the bracket (6.6). We want to see that these
two definitions are the same. We will accomplish this in Proposition 7.2 below.
If X ∈ Matn (C), we begin by associating with X a left-invariant vector
field. Since G is an open subset of the real vector space V = Matn (C), we may
identify the tangent space to G at the identity with V . With this identification,
an element X ∈ V is the local derivation at I [see (6.3)] defined by

    f −→ (d/dt) f (I + tX) |t=0 ,

where f is the germ of a smooth function at I. The two paths t −→ I + tX
and t −→ exp(tX) = I + tX + · · · are tangent when t = 0, so this is the
same as

    f −→ (d/dt) f (exp(tX)) |t=0 ,

which is a better definition. Indeed, if H is a Lie subgroup of GL(n, C) and X


is in the Lie algebra of H, then by Proposition 5.2, the second path exp(tX)
stays within H, so this definition still makes sense.
It is clear how to extrapolate this local derivation to a left-invariant global
derivation of C ∞ (G, R). We must define

    (dX)f (g) = (d/dt) f (g exp(tX)) |t=0 .                (7.1)

By Proposition 6.3, the left-invariant derivation dX of C ∞ (G, R) corresponds


to a left-invariant vector field. To distinguish this derivation from the element
X of Matn (C), we will resist the temptation to denote this derivation also as
X and denote it by dX.

Lemma 7.1. Let f be a smooth map from a neighborhood of the origin in Rn


into a finite-dimensional vector space V . We may write

f (x) = c0 + c1 (x) + B(x, x) + r(x), (7.2)

where c1 : Rn −→ V is linear, B : Rn × Rn −→ V is symmetric and bilinear,


and r vanishes to order 3.

Proof. This is just the familiar Taylor expansion. Denoting u = (u1 , . . . , un ),
let c0 = f (0),

    c1 (u) = ∑i (∂f /∂xi )(0) ui ,

and

    B(u, v) = (1/2) ∑i,j (∂²f /∂xi ∂xj )(0) ui vj .

Both f (x) and c0 + c1 (x) + B(x, x) have the same partial derivatives of order
≤ 2, so the difference r(x) vanishes to order 3. The fact that B is symmetric
follows from the equality of mixed partials:

    (∂²f /∂xi ∂xj )(0) = (∂²f /∂xj ∂xi )(0).



Proposition 7.2. If X, Y ∈ Matn (C), and if f is a smooth function on


G = GL(n, C), then d[X, Y ]f = dX(dY f ) − dY (dXf ).

Here [X, Y ] means XY − Y X computed using matrix operations; that is, the
bracket computed as in Chap. 5. This proposition shows that if X ∈ Matn (C),
and if we associate with X a derivation of C ∞ (G, R), where G = GL(n, C),
using the formula (7.1), then this bracket operation gives the same result as
the bracket operation (6.6) for left-invariant vector fields.

Proof. We fix a function f ∈ C ∞ (G) and an element g ∈ G. By Lemma 7.1,


we may write, for X near 0,

f (g(I + X)) = c0 + c1 (X) + B(X, X) + r(X),

where c1 is linear in X, B is symmetric and bilinear, and r vanishes to order


3 at X = 0. We will show that

(dX f )(g) = c1 (X) (7.3)

and
(dX ◦ dY f )(g) = c1 (XY ) + 2B(X, Y ). (7.4)

Indeed,

    (dX f )(g) = (d/dt) f (g(I + tX)) |t=0
               = (d/dt) [c0 + c1 (tX) + B(tX, tX) + r(tX)] |t=0 .

We may ignore the B and r terms because they vanish to order ≥ 2 in t, and since
c1 is linear, this is just c1 (X) proving (7.3). Also
    (dX ◦ dY f )(g) = (d/dt) (dY f )(g(I + tX)) |t=0
                    = (∂/∂t)(∂/∂u) f (g(I + tX)(I + uY )) |t=u=0
                    = (∂/∂t)(∂/∂u) [ c0 + c1 (tX + uY + tuXY )
                         + B(tX + uY + tuXY, tX + uY + tuXY )
                         + r(tX + uY + tuXY ) ] |t=u=0 .

We may omit r from this computation since it vanishes to third order.


Expanding the linear and bilinear maps c1 and B, we obtain (7.4).
Similarly,
(dY ◦ dXf )(g) = c1 (Y X) + 2B(X, Y ).

Subtracting this from (7.4) to kill the unwanted B term, we obtain



(dX ◦ dY − dY ◦ dX) f (g) = c1 (XY − Y X) = (d[X, Y ] f ) (g)

by (7.3).


If φ : G −→ H is a homomorphism of Lie groups, there is an induced map of


Lie algebras, as we will now explain. Let X be a left-invariant vector field on G.
We have induced a map dφ : Te (G) −→ Te (H), and by Proposition 7.1 applied
to H there is a unique left-invariant vector field Y on H such that dφ(Xe ) =
Ye . It is easy to see that for any g ∈ G we have dφ(Xg ) = Yφ(g) . We regard
Y as an element of Lie(H), and X −→ Y is a map Lie(G) −→ Lie(H),
which we denote Lie(φ) or, more simply, dφ. The Lie algebra homomorphism
dφ = Lie(φ) is called the differential of φ. A map f : g −→ h of Lie algebras
is naturally called a homomorphism if f ([X, Y ]) = [f (X), f (Y )].

Proposition 7.3. If φ : G −→ H is a Lie group homomorphism, then Lie(φ) :


Lie(G) −→ Lie(H) is a Lie algebra homomorphism.

Proof. If X, Y ∈ Lie(G), then Xe and Ye are local derivations of Oe (G), and it is


clear from the definitions that φ∗ ([Xe , Ye ]) = [φ∗ (Xe ), φ∗ (Ye )]. Consequently,
[Lie(φ)X, Lie(φ)Y ] and Lie(φ)([X, Y ]) are left-invariant vector fields on H that
agree at the identity, so they are the same by Proposition 7.1.


We may ask to what extent the Lie algebra homomorphism Lie(φ) contains
complete information about φ. For example, given Lie groups G and H with
Lie algebras g and h, and a homomorphism f : g −→ h, is there a homomor-
phism G −→ H with Lie(φ) = f ?
In general, the answer is no, as the following example will show.

Example 7.1. Let H = SU(2) and let G = SO(3). H acts on the three-
dimensional space V of Hermitian matrices of trace zero,

    ξ = ( x        y + iz )
        ( y − iz     −x   ) ,

by h : ξ → hξh−1 = hξ ᵗh̄ , and

    ξ → − det(ξ) = x2 + y 2 + z 2

is a positive definite quadratic form on V invariant under this
action. Thus, the transformation ξ → hξh−1 of V is orthogonal, and we have
a homomorphism ψ : SU(2) −→ SO(3). Both groups are three-dimensional,
and ψ is a local homeomorphism at the identity. The differential Lie(ψ) :
su(2) −→ so(3) is therefore an isomorphism and has an inverse, which is
a Lie algebra homomorphism so(3) −→ su(2). However, ψ itself does not
have an inverse since it has a nontrivial element in its kernel, −I. Therefore,
Lie(ψ)−1 : so(3) −→ su(2) is an example of a Lie algebra homomorphism that
does not correspond to a Lie group homomorphism SO(3) −→ SU(2).
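The homomorphism ψ of Example 7.1 can be made explicit numerically. The sketch below (NumPy; the sample element h of SU(2) is an arbitrary choice) computes the matrix of ξ → hξh−1 in the coordinates (x, y, z) and checks that it lies in SO(3).

```python
import numpy as np

# Basis of the trace-zero Hermitian matrices xi = [[x, y+iz], [y-iz, -x]],
# corresponding to the coordinates x, y, z of Example 7.1.
E = [np.array([[1, 0], [0, -1]], dtype=complex),     # x
     np.array([[0, 1], [1, 0]], dtype=complex),      # y
     np.array([[0, 1j], [-1j, 0]], dtype=complex)]   # z

def coords(xi):
    """Read off (x, y, z) from xi = [[x, y+iz], [y-iz, -x]]."""
    return np.array([xi[0, 0].real, xi[0, 1].real, xi[0, 1].imag])

def psi(h):
    """The 3x3 matrix of the action xi -> h xi h^{-1} in the basis E."""
    return np.column_stack([coords(h @ e @ np.linalg.inv(h)) for e in E])

# A sample element of SU(2) (an arbitrary choice).
t = 0.7
h = np.array([[np.cos(t), np.sin(t)],
              [-np.sin(t), np.cos(t)]], dtype=complex)

R = psi(h)
print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
```

One can check in the same way that ψ(h1 h2 ) = ψ(h1 )ψ(h2 ) and that ψ(−I) = I, exhibiting the nontrivial kernel {±I} mentioned above.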

Nevertheless, we will see later (Proposition 14.2) that if G is simply connected,


then any Lie algebra homomorphism g −→ h corresponds to a Lie group
homomorphism G −→ H. Thus, the obstruction to lifting the Lie algebra
homomorphism so(3) −→ su(2) to a Lie group homomorphism is topological
and corresponds to the fact that SO(3) is not simply connected.

Exercises
Exercise 7.1. Compute the Lie algebra homomorphism Lie(ψ) : su(2) −→ so(3) of
Example 7.1 explicitly.

Exercise 7.2. Show that no Lie group can be homeomorphic to the sphere S k if k
is even. On the other hand, show that SU(2) ≅ S 3 . (Hint: Use Exercise 6.1.)

Exercise 7.3. Let J be the matrix (5.3). Let o(N, C) and oJ (C) be the complexified
Lie algebras of the groups O(N ) and OJ (C) in Exercise 5.9. Show that these complex
Lie algebras are isomorphic. Describe o(N, C) explicitly, i.e., write down a typical
matrix.
8
The Exponential Map

The exponential map, introduced for closed Lie subgroups of GL(n, C) in


Chap. 5, can be defined for a general Lie group G as a map Lie(G) −→ G.
We may consider a vector field (6.5) that is allowed to vary smoothly. By
this we mean that we introduce a real parameter λ ∈ (−ε, ε) for some ε > 0
and smooth functions ai : M × (−ε, ε) −→ R and consider a vector field, which
in local coordinates is given by

    (Xf )(m) = ∑i ai (m, λ) (∂f /∂xi )(m).                (8.1)

Proposition 8.1. Suppose that M is a smooth manifold, m ∈ M , and X is


a vector field on M . Then, for sufficiently small ε > 0, there exists a path
p : (−ε, ε) −→ M such that p(0) = m and p∗ (d/dt)(t) = Xp(t) for t ∈ (−ε, ε).
Such a curve, on whatever interval it is defined, is uniquely determined. If
the vector field X is allowed to depend on a parameter λ as in (8.1), then for
small values of t, p(t) depends smoothly on λ.
Here we are regarding the interval (−ε, ε) as a manifold, and p∗ (d/dt) is the
image of the tangent vector d/dt. We call such a curve an integral curve for
the vector field.
Proof. In terms of local coordinates x1 , . . . , xn on M , the vector field X is

    ∑i ai (x1 , . . . , xn ) ∂/∂xi ,

where the ai are smooth functions in the coordinate neighborhood. If a path
p(t) is specified, let us write xi (t) for the xi component of p(t), with the
coordinates of m being x1 = · · · = xn = 0. Applying the tangent vector
p∗ (d/dt)(t) to a function f ∈ C ∞ (M ) gives

    (d/dt) f (x1 (t), . . . , xn (t)) = ∑i x′i (t) (∂f /∂xi )(x1 (t), . . . , xn (t)) .


On the other hand, applying Xp(t) to the same f gives

    ∑i ai (x1 (t), . . . , xn (t)) (∂f /∂xi )(x1 (t), . . . , xn (t)) ,

so we need a solution to the first-order system

    x′i (t) = ai (x1 (t), . . . , xn (t)) ,    xi (0) = 0,    (i = 1, . . . , n).

The existence of such a solution for sufficiently small |t|, and its uniqueness on
whatever interval it does exist, is guaranteed by a standard result in the theory
of ordinary differential equations, which may be found in most texts. See, for
example, Ince [81], Chap. 3, particularly Sect. 3.3, for a rigorous treatment.
The required Lipschitz condition follows from smoothness of the ai. For the
statement about continuously varying vector fields, one needs to know the
corresponding fact about first-order systems, which is discussed in Sect. 3.31
of [81]. Here Ince imposes an assumption of analyticity on the dependence of
the differential equation on λ, which he allows to be a complex parameter,
because he wants to conclude analyticity of the solutions; if one weakens
this assumption of analyticity to smoothness, one still gets smoothness of the
solution.
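The existence theorem invoked here is constructive in practice: integral curves can be approximated by any standard ODE solver. The sketch below (plain Python; the vector field x ∂/∂x on R and the step count are arbitrary illustrative choices) uses the classical Runge–Kutta scheme and compares with the exact integral curve x(t) = eᵗ x0.

```python
import math

def integral_curve(a, x0, t_end, n=10000):
    """Approximate the integral curve x'(t) = a(x(t)), x(0) = x0, by the
    classical fourth-order Runge-Kutta scheme (a standard ODE solver
    standing in for the existence theorem cited in the proof)."""
    h = t_end / n
    x = x0
    for _ in range(n):
        k1 = a(x)
        k2 = a(x + h * k1 / 2)
        k3 = a(x + h * k2 / 2)
        k4 = a(x + h * k3)
        x += (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

# Example field on R (an arbitrary choice): X = x d/dx, whose integral
# curve through x0 = 1 is x(t) = e^t.
approx = integral_curve(lambda x: x, 1.0, 1.0)
print(abs(approx - math.e) < 1e-9)
```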


In general, the existence of the integral curve of a vector field is only guaran-
teed in a small segment (−ε, ε), as in Proposition 8.1. However, we will now see
that, for left-invariant vector fields on a Lie group, the integral curve extends
to all R. This fact underlies the construction of the exponential map.

Theorem 8.1. Let G be a Lie group and g its Lie algebra. There exists a map
exp : g −→ G that is a local homeomorphism in a neighborhood of the origin
in g such that, for any X ∈ g, t −→ exp(tX) is an integral curve for the
left-invariant vector field X. Moreover, exp((t + u)X) = exp(tX) exp(uX).

Proof. Let X ∈ g. We know that for sufficiently small ε > 0 there exists
an integral curve p : (−ε, ε) −→ G for the left-invariant vector field X with
p(0) = 1. We show first that if p : (a, b) −→ G is any integral curve for an
open interval (a, b) containing 0, then

p(s) p(t) = p(s + t) when s, t, s + t ∈ (a, b). (8.2)

Indeed, since X is invariant under left-translation, left-translation by p(s)


takes an integral curve for the vector field into another integral curve. Thus,
t −→ p(s) p(t) and t −→ p(s + t) are both integral curves, with the same
initial condition 0 −→ p(s). They are thus the same.
With this in mind, we show next that if p : (−a, a) −→ G is an integral
curve for the left-invariant vector field X, then we may extend it to all of R.
Of course, it is sufficient to show that we may extend it to (−3a/2, 3a/2). We
extend it by the rule p(t) = p(a/2) p(t − a/2) when −a/2 ≤ t ≤ 3a/2 and
p(t) = p(−a/2) p(t + a/2) when −3a/2 ≤ t ≤ a/2, and it follows from (8.2)
that this definition is consistent on regions of overlap.
Now define exp : g −→ G as follows. Let X ∈ g, and let p : R −→ G be an
integral curve for the left-invariant vector field X with p(0) = 1. We define
exp(X) = p(1). We note that if u ∈ R, then t → p(tu) is an integral curve for
uX, so exp(uX) = p(u).
The exponential map is a smooth map, at least for X near the origin in g,
by the last statement in Proposition 8.1. Identifying the tangent space at the
origin in the vector space g with g itself, exp induces a map T0 (g) −→ Te (G)
(that is g −→ g), and this map is the identity map by construction. Thus, the
Jacobian of exp is nonzero and, by the Inverse Function Theorem, exp is a
local homeomorphism near 0.


We also denote exp(X) as eX for X ∈ g.

Remark 8.1. If G = GL(n, C), then as we explained in Chap. 7, Proposition 7.2


allows us to identify the Lie algebra of G with Matn (C). We observe that the
definition of exp : Matn (C) −→ GL(n, C) by a series in (5.2) agrees with the
definition in Theorem 8.1. This is because t −→ exp(tX) with either definition
is an integral curve for the same left-invariant vector field, and the uniqueness
of such an integral curve follows from Proposition 8.1.
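The one-parameter subgroup property exp((t + u)X) = exp(tX) exp(uX) of Theorem 8.1 can be checked numerically for matrix groups, where Remark 8.1 identifies exp with the series (5.2). The following sketch (NumPy; the matrix X and the truncation at 30 terms are arbitrary choices) truncates that series.

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential by a truncated power series (adequate for small
    matrices of moderate norm; the cutoff is an arbitrary choice)."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k      # term = A^k / k!
        out = out + term
    return out

# A sample element of gl(2, R) (an arbitrary choice).
X = np.array([[0.0, 1.0],
              [-2.0, 0.5]])
t, u = 0.3, 0.7

# The one-parameter subgroup property: exp((t+u)X) = exp(tX) exp(uX).
print(np.allclose(expm((t + u) * X), expm(t * X) @ expm(u * X)))
```

Note that this identity is special to a single X; for noncommuting X and Y one has exp(X) exp(Y ) ≠ exp(X + Y ) in general.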

Proposition 8.2. Let G, H be Lie groups and let g, h be their respective Lie
algebras. Let f : G → H be a homomorphism. Then the following diagram is
commutative:
              df
        g −−−−→ h
        │        │
     exp│        │exp
        ↓        ↓
        G −−−−→ H
               f
Proof. It is clear from the definitions that f takes an integral curve for a
left-invariant vector field X on G to an integral curve for df (X), and the
statement follows.


A representation of a Lie algebra g over a field F is a Lie algebra homomor-


phism π : g −→ End(V ), where V is an F -vector space, or more generally a
vector space over a field E containing F , and End(V ) is given the Lie algebra
structure that it inherits from its structure as an associative algebra. Thus,

π([x, y]) = π(x) π(y) − π(y) π(x).

We may sometimes find it convenient to denote π(x)v as just xv for x ∈ g


and v ∈ V . We may think of (x, v) → xv = π(x)v as a multiplication. If V
is a vector space, given a map g × V −→ V denoted (x, v) → xv such that
x → π(x) is a representation, where π(x) : V −→ V is the endomorphism
v −→ xv, then we call V a g-module. A homomorphism φ : U −→ V of
g-modules is an F -linear map satisfying φ(xv) = xφ(v).

Example 8.1. If π : G −→ GL(V ) is a representation, where V is a real or


complex vector space, then the Lie algebra of GL(V ) is End(V ), so the differ-
ential Lie(π) : Lie(G) −→ End(V ), defined by Proposition 7.3, is a Lie algebra
representation.
By the universal property of U (g) in Theorem 10.1, a Lie algebra representation
π : g −→ End(V ) extends to a ring homomorphism U (g) −→ End(V ),
which we continue to denote as π.
If g is a Lie algebra over a field F , we get a homomorphism ad : g −→
End(g), called the adjoint map, defined by ad(x)y = [x, y]. We give End(g)
the Lie algebra structure it inherits as an associative ring. We have

ad(x)([y, z]) = [ad(x)(y), z] + [y, ad(x)(z)] (8.3)

since, by the Jacobi identity, both sides equal [x, [y, z]] = [[x, y], z] + [y, [x, z]].
This means that ad(x) is a derivation of g.
Also
ad(x) ad(y) − ad(y) ad(x) = ad([x, y]) (8.4)
since applying either side to z ∈ g gives [x, [y, z]] − [y, [x, z]] = [[x, y], z] by the
Jacobi identity. So ad : g −→ End(g) is a Lie algebra representation.
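The identity (8.4) is the Jacobi identity in disguise, and for matrix Lie algebras it can be tested directly. The sketch below (NumPy; the random 3 × 3 matrices are an arbitrary test case, viewed as elements of gl(3, R)) applies both sides of (8.4) to a third matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

def bracket(a, b):
    return a @ b - b @ a

def ad(x):
    """ad(x) as an operator on matrices: y -> [x, y]."""
    return lambda y: bracket(x, y)

# Random 3x3 matrices (an arbitrary test case).
x, y, z = (rng.standard_normal((3, 3)) for _ in range(3))

# Both sides of (8.4), applied to z:
lhs = ad(x)(ad(y)(z)) - ad(y)(ad(x)(z))
rhs = ad(bracket(x, y))(z)
print(np.allclose(lhs, rhs))
```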
We next explain the geometric origin of ad. To begin with, representations
of Lie algebras arise naturally from representations of Lie groups. Suppose
that G is a Lie group and g is its Lie algebra. If V is a vector space over R
or C, any Lie group homomorphism π : G −→ GL(V ) induces a Lie algebra
homomorphism g −→ End(V ) by Proposition 7.3; that is, a real or complex
representation.
In particular, G acts on itself by conjugation, and so it acts on g = Te (G).
This representation is called the adjoint representation and is denoted Ad :
G −→ GL(g). We show next that the differential of Ad is ad. That is:
Theorem 8.2. Let G be a Lie group, g its Lie algebra, and Ad : G −→ GL(g)
the adjoint representation. Then the Lie algebra representation g −→ End(g)
corresponding to Ad by Proposition 7.3 is ad.
Proof. It will be most convenient for us to think of elements of the Lie algebra
as tangent vectors at the identity or as local derivations of the local ring there.
Let X, Y ∈ g. If f ∈ C ∞ (G), define c(g)f (h) = f (g −1 hg). Then our definitions
of the adjoint representation amount to

Ad(g)Y f = Y c(g −1 )f .

To compute the differential of Ad, note that the path t −→ exp(tX) in G is


tangent to the identity at t = 0 with tangent vector X. Therefore, under the
representation of g in Proposition 7.3, X maps Y to the local derivation at
the identity
    f −→ (d/dt) (Ad(etX )Y f ) |t=0 = (d/dt)(d/du) f (etX euY e−tX ) |t=u=0 .

By the chain rule, if F (t1 , t2 ) is a function of two real variables,

    (d/dt) F (t, t) |t=0 = (∂F /∂t1 )(0, 0) + (∂F /∂t2 )(0, 0).                (8.5)

Applying this with u fixed, to F (t1 , t2 ) = f (et1 X euY e−t2 X ), our last expression
equals

    (d/du)(d/dt) f (etX euY ) |t=u=0 − (d/du)(d/dt) f (euY etX ) |t=u=0
        = XY f (1) − Y Xf (1).

This is, of course, the same as the effect of [X, Y ] = ad(X)Y .
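For G = GL(n, R), Theorem 8.2 says that the derivative of t → Ad(exp(tX))Y = exp(tX) Y exp(−tX) at t = 0 is [X, Y ], which can be verified by finite differences. The following sketch (NumPy; the matrices X, Y and the step size are arbitrary choices) uses a central difference.

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential by truncated power series (fine for small matrices)."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

# Sample elements of gl(2, R) (arbitrary choices).
X = np.array([[0.0, 1.0], [0.0, 0.0]])
Y = np.array([[1.0, 0.0], [0.0, -1.0]])

def Ad(g):
    return lambda Z: g @ Z @ np.linalg.inv(g)

# Central-difference approximation to (d/dt) Ad(exp(tX))Y at t = 0.
h = 1e-5
deriv = (Ad(expm(h * X))(Y) - Ad(expm(-h * X))(Y)) / (2 * h)
print(np.allclose(deriv, X @ Y - Y @ X, atol=1e-8))
```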




Exercises
Exercise 8.1. Show that the exponential map su(2) → SU(2) is surjective, but the
exponential map sl(2, R) → SL(2, R) is not.
9
Tensors and Universal Properties

We will review the basic properties of the tensor product and use them to
illustrate the basic notion of a universal property, which we will see repeatedly.
If R is a commutative ring and M , N , and P are R-modules, then a bilinear
map f : M × N −→ P is a map satisfying

f (r1 m1 + r2 m2 , n) = r1 f (m1 , n) + r2 f (m2 , n), ri ∈ R, mi ∈ M, n ∈ N,

f (m, r1 n1 + r2 n2 ) = r1 f (m, n1 ) + r2 f (m, n2 ), ri ∈ R, ni ∈ N, m ∈ M.


More generally, if M1 , . . . , Mk are R-modules, the notion of a k-linear map
M1 × · · · × Mk −→ P is defined similarly: the map must be linear in each
variable.
The tensor product M ⊗R N is an R-module together with a bilinear map
⊗ : M × N −→ M ⊗R N satisfying the following property.

Universal Property of the Tensor Product. If P is any R-module and p :


M ×N −→ P is a bilinear map, there exists a unique R-module homomorphism
F : M ⊗ N −→ P such that p = F ◦ ⊗.

Why do we call this a universal property? It says that ⊗ : M × N −→


M ⊗ N is a “universal” bilinear map in the sense that any bilinear map of
M × N factors through it. As we will explain, the module M ⊗R N is uniquely
determined by the universal property. This is important beyond the immediate
example because often objects are described by universal properties. Before
we explain this point (which is obvious if one thinks about it correctly), let
us make a categorical observation.
If C is a category, an initial object in C is an object X0 such that, for each
object Y , the Hom set HomC (X0 , Y ) consists of a single element. A terminal
object is an object X∞ such that, for each object Y , the Hom set HomC (Y, X∞ )
consists of a single element. For example, in the category of sets, the empty
set is an initial object and a set consisting of one element is a terminal object.


Lemma 9.1. In any category, any two initial objects are isomorphic. Any two
terminal objects are isomorphic.
Proof. If X0 and X1 are initial objects, there exist unique morphisms f :
X0 −→ X1 (since X0 is initial) and g : X1 −→ X0 (since X1 is initial). Then
g ◦ f : X0 −→ X0 and 1X0 : X0 −→ X0 must coincide since X0 is initial,
and similarly f ◦ g = 1X1 . Thus f and g are inverse isomorphisms. Similarly,
terminal objects are isomorphic.

Theorem 9.1. The tensor product M ⊗R N , if it exists, is determined up to
isomorphism by the universal property.
Proof. Let C be the following category. An object in C is an ordered pair
(P, p), where P is an R-module and p : M × N −→ P is a bilinear map.
If X = (P, p) and Y = (Q, q) are objects, then a morphism X −→ Y consists
of an R-module homomorphism f : P −→ Q such that q = f ◦p. The universal
property of the tensor product means that ⊗ : M × N −→ M ⊗ N is an initial
object in this category and therefore determined up to isomorphism.

Of course, we usually denote ⊗(m, n) as m ⊗ n in M ⊗R N . We have not
proved that M ⊗R N exists. We refer to any text on algebra for this fact, such
as Lang [116], Chap. XVI.
In general, by a universal property we mean any characterization of a
mathematical object that can be expressed by saying that some associated object
is an initial or terminal object in some category. The basic paradigm is that
a universal property characterizes an object up to isomorphism.
A typical application of the universal property of the tensor product is
to make M ⊗R N into a functor. Specifically, if μ : M −→ M ′ and ν :
N −→ N ′ are R-module homomorphisms, then there is a unique R-module
homomorphism μ ⊗ ν : M ⊗R N −→ M ′ ⊗R N ′ such that (μ ⊗ ν)(m ⊗ n) =
μ(m) ⊗ ν(n). We get this by applying the universal property to the R-bilinear
map M × N −→ M ′ ⊗ N ′ defined by (m, n) −→ μ(m) ⊗ ν(n).
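For free modules over the real numbers, this functoriality can be realized concretely with Kronecker products. In the sketch below (NumPy; all dimensions and data are arbitrary illustrative choices), m ⊗ n is modeled by np.kron, and μ ⊗ ν by the Kronecker product of the matrices of μ and ν.

```python
import numpy as np

rng = np.random.default_rng(1)

# For free modules M = R^2, N = R^5 over R = the real numbers, the tensor
# m (x) n can be modeled by the Kronecker product of coordinate vectors,
# and mu (x) nu by the Kronecker product of matrices.
A = rng.standard_normal((3, 2))   # mu : R^2 -> R^3
B = rng.standard_normal((4, 5))   # nu : R^5 -> R^4
m = rng.standard_normal(2)
n = rng.standard_normal(5)

# (mu (x) nu)(m (x) n) = mu(m) (x) nu(n):
lhs = np.kron(A, B) @ np.kron(m, n)
rhs = np.kron(A @ m, B @ n)
print(np.allclose(lhs, rhs))
```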
As another example of an object that can be defined by a universal property,
let V be a vector space over a field F . Let us ask for an F -algebra
⊗V together with an F -linear map i : V −→ ⊗V satisfying the following
condition.

Universal Property of the Tensor Algebra. If A is any F -algebra and
φ : V −→ A is an F -linear map, then there exists a unique F -algebra homo-
morphism Φ : ⊗V −→ A such that φ = Φ ◦ i.

It should be clear from the previous discussion that this universal property
characterizes the tensor algebra up to isomorphism. To prove existence, we can
construct a ring with this exact property as follows. Let unadorned ⊗ mean
⊗F in what follows. By ⊗k V we mean the k-fold tensor product V ⊗ · · · ⊗ V
(k times); if k = 0, then it is natural to take ⊗0 V = F while ⊗1 V = V . If V
has finite dimension d, then ⊗k V has dimension dk . Let
    ⊗V = ⊕k≥0 ⊗k V .

Then ⊗V has the natural structure of a graded F -algebra in which the
multiplication ⊗k V × ⊗l V −→ ⊗k+l V sends

    (v1 ⊗ · · · ⊗ vk , u1 ⊗ · · · ⊗ ul ) −→ v1 ⊗ · · · ⊗ vk ⊗ u1 ⊗ · · · ⊗ ul .

We regard V as a subset of ⊗V , embedded as ⊗1 V = V .

Proposition 9.1. The universal property of the tensor algebra is satisfied.

Proof. If φ : V −→ A is any linear map of V into an F -algebra, define a map
Φ : ⊗V −→ A by Φ(v1 ⊗ · · · ⊗ vk ) = φ(v1 ) · · · φ(vk ) on ⊗k V . It is easy to
see that Φ is a ring homomorphism. It is unique since V generates ⊗V as an
F -algebra.


A graded algebra over the field F is an F -algebra A with a direct sum decom-
position

    A = ⊕k≥0 Ak

such that Ak Al ⊆ Ak+l . In most examples we will have A0 = F . Elements


of Ak are called homogeneous of degree k. The tensor algebra is a graded
algebra, with ⊗k V being the homogeneous part of degree k.
Next we define the symmetric and exterior powers of a vector space
V over the field F . Let V k denote V × · · · × V (k times). A k-linear
map f : V k −→ U into another vector space is called symmetric if for
any σ ∈ Sk it satisfies f (vσ(1) , . . . , vσ(k) ) = f (v1 , . . . , vk ) and alternating
if f (vσ(1) , . . . , vσ(k) ) = ε(σ) f (v1 , . . . , vk ), where ε : Sk −→ {±1} is the
alternating (sign) character. The kth symmetric and exterior powers of V ,
denoted ∨k V and ∧k V , are F -vector spaces, together with k-linear maps
∨ : V k −→ ∨k V and ∧ : V k −→ ∧k V . The map ∨ is symmetric, and the
map ∧ is alternating. We normally denote ∨(v1 , . . . , vk ) = v1 ∨ · · · ∨ vk and
similarly for ∧. The following universal properties are required.
Universal Properties of the Symmetric and Exterior Powers: Let
f : V k −→ U be any symmetric (resp. alternating) k-linear map. Then there
exists a unique F -linear map φ : ∨k V −→ U (resp. ∧k V −→ U ) such that
f = φ ◦ ∨ (resp. f = φ ◦ ∧).
As usual, the symmetric and exterior algebras are characterized up to
isomorphism by the universal property. We may construct ∨k V as a quo-
tient of ⊗k V , dividing by the subspace W generated by elements of the form
v1 ⊗ · · · ⊗ vk − vσ(1) ⊗ · · · ⊗ vσ(k) , with a similar construction for ∧k . The
universal property of ∨k V then follows from the universal property of the
tensor product. Indeed, if f : V k −→ U is any symmetric k-linear map, then

there is induced a linear map ψ : ⊗k V −→ U such that f = ψ ◦ ⊗. Since f is


symmetric, ψ vanishes on W , so ψ induces a map ∨k V = ⊗k V /W −→ U and
the universal property follows.
If V has dimension d, then ∨k V has dimension (d+k−1 choose k), for if x1 , . . . , xd is a basis of V , then {xi1 ∨ · · · ∨ xik | 1 ≤ i1 ≤ i2 ≤ · · · ≤ ik ≤ d} is a basis for ∨k V . On the other hand, the exterior power vanishes unless k ≤ d, in which case it has dimension (d choose k). A basis consists of {xi1 ∧ · · · ∧ xik | 1 ≤ i1 < i2 < · · · < ik ≤ d}. The vector spaces ∨k V may be collected together to make a
commutative graded algebra:
∨V = ⊕_{k=0}^{∞} ∨k V.

This is the symmetric algebra. The exterior algebra ∧V = ⊕_k ∧k V is constructed similarly. The spaces ∨0 V and ∧0 V are one-dimensional, and it is natural to take ∨0 V = ∧0 V = F .
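These dimension counts can be checked by brute-force enumeration of the bases just described. The following small script (an illustration, not part of the text; the values of d and k are arbitrary) lists the weakly increasing index tuples for the symmetric power and the strictly increasing ones for the exterior power:

```python
from itertools import combinations, combinations_with_replacement
from math import comb

# Basis vectors x_1, ..., x_d of V are represented by the indices 1..d.
d, k = 4, 2

# Basis of the k-th symmetric power: weakly increasing index tuples.
sym_basis = list(combinations_with_replacement(range(1, d + 1), k))
# Basis of the k-th exterior power: strictly increasing index tuples.
ext_basis = list(combinations(range(1, d + 1), k))

assert len(sym_basis) == comb(d + k - 1, k)  # dimension of the symmetric power
assert len(ext_basis) == comb(d, k)          # dimension of the exterior power
```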

Exercises
Exercise 9.1. Let V be a finite-dimensional vector space over a field F that may be assumed to be infinite. Let P(V ) be the ring of polynomial functions on V . Note that an element of the dual space V ∗ is a function on V , so regarding this function as a polynomial gives an injection V ∗ −→ P(V ). Show that this linear map extends to a ring isomorphism ∨V −→ P(V ).
Exercise 9.2. Prove that if V is a vector space, then V ⊗ V ≅ (V ∧ V ) ⊕ (V ∨ V ).
Exercise 9.3. Use the universal properties of the symmetric and exterior powers to show that if V and W are vector spaces and f : V −→ W is a linear map, then there are maps ∨k f : ∨k V −→ ∨k W and ∧k f : ∧k V −→ ∧k W such that

∨k f (v1 ∨ · · · ∨ vk ) = f (v1 ) ∨ · · · ∨ f (vk ),    ∧k f (v1 ∧ · · · ∧ vk ) = f (v1 ) ∧ · · · ∧ f (vk ).
Exercise 9.4. Suppose that V = F 4 . Let f : V −→ V be the linear transformation
with eigenvalues a, b, c, d. Compute the traces of the linear transformations ∨2 f and
∧2 f on ∨2 V and ∧2 V as polynomials in a, b, c, d.
Exercise 9.5. Let A and B be algebras over the field F . Then A ⊗ B is also an algebra, with multiplication (a ⊗ b)(a′ ⊗ b′ ) = aa′ ⊗ bb′ . Show that there are ring
homomorphisms i : A → A ⊗ B and j : B → A ⊗ B such that if f : A → C and
g : B → C are ring homomorphisms into a ring C satisfying f (a) g(b) = g(b) f (a)
for a ∈ A and b ∈ B, then there exists a unique ring homomorphism φ : A ⊗ B → C
such that φ ◦ i = f and φ ◦ j = g.
Exercise 9.6. Show that if U and V are finite-dimensional vector spaces over F , then

∨(U ⊕ V ) ≅ (∨U ) ⊗ (∨V )

and

∧(U ⊕ V ) ≅ (∧U ) ⊗ (∧V ).
10
The Universal Enveloping Algebra

We have seen that elements of the Lie algebra of a Lie group G are derivations
of C ∞ (G). They are thus first-order differential operators that are left-
invariant. The universal enveloping algebra is a purely algebraically defined
ring that may be identified with the ring of all left-invariant differential
operators, including higher-order ones.
We recall from Example 5.6 that if A is an associative algebra, then A may
be regarded as a Lie algebra by the rule [a, b] = ab − ba for a, b ∈ A. We will
denote this Lie algebra by Lie(A).

Theorem 10.1. Let g be a Lie algebra over a field F . There exists an associative F -algebra U (g) with a Lie algebra homomorphism i : g −→ Lie(U (g)) such that if A is any F -algebra, and φ : g −→ Lie(A) is a Lie algebra homomorphism, then there exists a unique F -algebra homomorphism Φ : U (g) −→ A such that φ = Φ ◦ i.

As always, an object [in this case U (g)] defined by a universal property is


characterized up to isomorphism by that property.
Proof. Let K be the ideal in ⊗g generated by elements of the form [x, y] − (x ⊗ y − y ⊗ x) for x, y ∈ g, and let U (g) be the quotient ⊗g/K. Let φ : g −→ Lie(A) be a Lie algebra homomorphism. This means that φ is an F -linear map such that φ([x, y]) = φ(x)φ(y) − φ(y)φ(x). Then φ extends to a ring homomorphism ⊗g −→ A by Proposition 9.1. Our assumption implies that K is in the kernel of this homomorphism, and so there is induced a ring homomorphism U (g) −→ A. Clearly, U (g) is generated by the image of g, so this homomorphism is uniquely determined.


Suppose that g is the Lie algebra of a Lie group G. Consider the ring A of
vector space endomorphisms of C ∞ (G) that commute with left translation
by elements of G. As we have already seen, elements of g are left-invariant
differential operators, by means of the action

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 61


DOI 10.1007/978-1-4614-8024-2 10, © Springer Science+Business Media New York 2013

Xf (g) = (d/dt) f (g etX ) |t=0 .    (10.1)
By the universal property of the universal enveloping algebra, this action
extends to a ring homomorphism U (g) −→ A, the image of which consists
of left-invariant differential operators [Exercise 10.2 (i)]. Let us apply this
observation to give a quick analytic proof of a fact that has a longer purely
algebraic proof.

Proposition 10.1. If g is the Lie algebra of a Lie group G, then the natural
map i : g −→ U (g) is injective.

It is a consequence of the Poincaré–Birkhoff–Witt theorem, a standard and


purely algebraic theorem, that i : g −→ U (g) is injective for any Lie algebra.
Instead of proving the Poincaré–Birkhoff–Witt theorem, we give a short proof
of this weaker statement.

Proof. Let A be the ring of endomorphisms of C ∞ (G). Regarding X ∈ g as a


derivation of C ∞ (G) acting by (10.1), we have a Lie algebra homomorphism
g −→ Lie(A), which by Theorem 10.1 induces a map U (g) −→ A. If X ∈ g
had zero image in U (g), it would have zero image in A. It would therefore be
zero.
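The compatibility of the action (10.1) with the bracket, which underlies both the extension to U (g) and the proof above, can be illustrated numerically. The sketch below (not from the text; the test function, base point, and step size are arbitrary choices) checks (RL − LR)f = Hf for G = SL(2, R), where H, R, L are the standard basis of sl(2, R) used in Chap. 12:

```python
import numpy as np

# Basis of sl(2, R) satisfying [R, L] = H (see Chap. 12)
H = np.array([[1.0, 0.0], [0.0, -1.0]])
R = np.array([[0.0, 1.0], [0.0, 0.0]])
L = np.array([[0.0, 0.0], [1.0, 0.0]])

def expm(X, t, terms=12):
    """Matrix exponential of tX by truncated power series (adequate for tiny t)."""
    A, term, out = t * X, np.eye(2), np.eye(2)
    for n in range(1, terms):
        term = term @ A / n
        out = out + term
    return out

f = lambda g: g[0, 0] ** 2 + g[0, 1] * g[1, 1]  # an arbitrary smooth test function

def act(X, func, eps=1e-5):
    """The operator (10.1): Xf(g) = d/dt f(g e^{tX}) |_{t=0}, via central differences."""
    return lambda g: (func(g @ expm(X, eps)) - func(g @ expm(X, -eps))) / (2 * eps)

g = np.array([[2.0, 1.0], [0.5, 0.75]])  # determinant 1, an element of SL(2, R)

# The composed operators satisfy (RL - LR)f = [R, L]f = Hf
lhs = act(R, act(L, f))(g) - act(L, act(R, f))(g)
rhs = act(H, f)(g)
assert abs(lhs - rhs) < 1e-3
```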


The center of U (g) is very important. One reason for this is that while
elements of U (g) are realized as differential operators that are invariant under
left-translation, elements of the center are invariant under both left and right
translation [Exercise 10.2 (ii)]. Moreover, the center acts by scalars on any
irreducible subspace, as we see in the following version of Schur’s lemma.
A representation (π, V ) of a Lie algebra g is irreducible if there is no proper
nonzero subspace U ⊂ V such that π(x)U ⊆ U for all x ∈ g.

Proposition 10.2. Let π : g −→ End(V ) be an irreducible representation of


the Lie algebra g. If c is in the center of U (g), then there exists a scalar λ
such that π(c) = λIV .

Proof. Let λ be any eigenvalue of π(c). Let U be the λ-eigenspace of π(c).


Since π(c) commutes with π(x) for all x ∈ g, we see that π(x)U ⊆ U for all
x ∈ g. By the definition of irreducibility, U = V , so π(c) acts by the scalar λ.



Thus, the center of U (g) is extremely important. One particular element,


the Casimir element, is especially important. To give two examples of its
significance, the Casimir element gives rise to the Laplace–Beltrami operator,
the spectral theory for which is very important in noneuclidean geometry. It is
also fundamental in the theory of Kac–Moody Lie algebras. This theory gener-
alizes the theory of finite-dimensional Lie algebras to an infinite-dimensional
setting in which (remarkably) all the main theorems remain valid. One of

the key features of this theory is how the Casimir element becomes the key
ingredient in many proofs (such as that of the Weyl character formula) where
other tools are no longer available. See [92].
Our next task will be to construct the Casimir elements. This requires
a discussion of invariant bilinear forms. If V is a vector space over F and
π : g −→ End(V ) is a representation, then we call a bilinear form B on V
invariant if

B(π(X)v, w) + B(v, π(X)w) = 0    (10.2)
for X ∈ g, v, w ∈ V . The following proposition shows that this notion of
invariance is the Lie algebra analog of the more intuitive corresponding notion
for Lie groups.
Proposition 10.3. Suppose that G is a Lie group, g its Lie algebra, and
π : G −→ GL(V ) a representation admitting an invariant bilinear form B.
Then B is invariant for the differential of π.
Proof. Invariance under π means that

B(π(etX )v, π(etX )w) = B(v, w).

The derivative of this with respect to t is zero. By (8.5), this derivative is



B(π(X)v, w) + B(v, π(X)w).

We see that (10.2) is satisfied.



If (π, V ) is a representation of g, define a bilinear form BV : g × g −→ C by
BV (X, Y ) = tr(π(X)π(Y )). This is the trace bilinear form on g with respect
to V . In the special case where V = g and π is the adjoint representation, the
trace bilinear form is called the Killing form.
Proposition 10.4. Suppose that (π, V ) is a representation of g. Then the
trace bilinear form on g is invariant for the adjoint representation ad : g −→
End(g).
Proof. Invariance under ad means

B([x, y], z) + B(y, [x, z]) = 0. (10.3)

Since π is a representation, π([x, y]) = π(x)π(y) − π(y)π(x), so B([x, y], z) is


the trace of
π(x) π(y) π(z) − π(y) π(x) π(z)
while B(y, [x, z]) is the trace of

π(y)π(x)π(z) − π(y)π(z)π(x).

Using the property of endomorphisms A and B of a vector space that tr(AB) =


tr(BA), these sum to zero. This same fact implies that B(x, y) = B(y, x). 
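For g = sl(2, R) in its defining representation, both the invariance identity (10.3) and the symmetry of the trace bilinear form can be verified directly. The following check is an illustration, not part of the text:

```python
import numpy as np
from itertools import product

# Basis of sl(2, R) (the H, R, L of Chap. 12)
basis = [np.array([[1.0, 0.0], [0.0, -1.0]]),
         np.array([[0.0, 1.0], [0.0, 0.0]]),
         np.array([[0.0, 0.0], [1.0, 0.0]])]

br = lambda a, b: a @ b - b @ a          # Lie bracket
B = lambda a, b: np.trace(a @ b)         # trace bilinear form of the standard rep

for x, y, z in product(basis, repeat=3):
    assert abs(B(br(x, y), z) + B(y, br(x, z))) < 1e-12   # invariance (10.3)
    assert abs(B(x, y) - B(y, x)) < 1e-12                 # symmetry
```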


Now given an invariant bilinear form on g, we may construct an element of


the center, provided the bilinear form is nondegenerate.
Theorem 10.2. Suppose that the Lie algebra g admits a nondegenerate invariant bilinear form B. Let x1 , . . . , xd be a basis of g, and let y1 , . . . , yd be the dual basis, so that B(xi , yj ) = δij (Kronecker δ). Then the element Δ = Σi xi yi of U (g) is in the center of U (g). The element Δ is independent of the choice of basis x1 , . . . , xd .
The element Δ is called the Casimir element of U (g) (with respect to B).
Proof. Let z ∈ g. There exist constants αij and βij such that [z, xi ] = Σj αij xj and [z, yi ] = Σj βij yj . Since B is invariant, we have

0 = B([z, xi ], yj ) + B(xi , [z, yj ]) = αij + βji .

Now

z Σi xi yi = Σi ([z, xi ]yi + xi zyi ) = ( Σi,j αij xj yi ) + Σi xi zyi ,

while

Σi xi yi z = Σi (−xi [z, yi ] + xi zyi ) = −( Σi,j βij xi yj ) + Σi xi zyi ,

and since βij = −αji , these are equal. Thus Δ commutes with g, and since g
generates U (g) as a ring, it is in the center.
It remains to be shown that Δ is independent of the choice of basis x1 , . . . , xd . Suppose that x1′ , . . . , xd′ is another basis. Write xi′ = Σj cij xj , and if y1′ , . . . , yd′ is the corresponding dual basis, let yi′ = Σj dij yj . The condition that B(xi′ , yj′ ) = δij (Kronecker δ) implies that Σk cik djk = δij . Therefore, the matrices (cij ) and (dij ) are transpose inverses of each other, and so we have also Σk cki dkj = δij . Now Σk xk′ yk′ = Σi,j,k cki dkj xi yj = Σk xk yk = Δ. □
Although Proposition 10.4 provides us with a supply of invariant bilinear
forms, there is no guarantee that they are nonzero, which is required by
Theorem 10.2. We will not address this point now.
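Theorem 10.2 can be illustrated concretely for g = sl(2, R) acting on C2 by its defining representation, with B(x, y) = tr(xy); this choice of form and basis is an assumption of the sketch below, not from the text:

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, -1.0]])
R = np.array([[0.0, 1.0], [0.0, 0.0]])
L = np.array([[0.0, 0.0], [1.0, 0.0]])

# With B(x, y) = tr(xy): B(H, H) = 2 and B(R, L) = B(L, R) = 1, and all
# other pairings vanish, so the basis dual to (H, R, L) is (H/2, L, R).
xs = [H, R, L]
ys = [H / 2, L, R]

for i, x in enumerate(xs):                      # check B(x_i, y_j) = delta_ij
    for j, y in enumerate(ys):
        assert abs(np.trace(x @ y) - (i == j)) < 1e-12

# Image of the Casimir element Delta = sum_i x_i y_i in this representation
Delta = sum(x @ y for x, y in zip(xs, ys))

for z in xs:                                    # Delta commutes with the image of g
    assert np.allclose(Delta @ z, z @ Delta)
assert np.allclose(Delta, Delta[0, 0] * np.eye(2))   # it is even scalar (Schur)
```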
One might wonder, since there may be many irreducible representations,
whether the invariant bilinear forms produced by Proposition 10.4 are all
distinct. Also, since these invariant bilinear forms are all symmetric, one
might wonder if we are missing some invariant bilinear forms that are not
symmetric. The following proposition shows that for simple Lie algebras, there
is essentially a unique invariant bilinear form, and that it is symmetric.
A Lie algebra g is called simple if it has no proper nonzero ideals. An ideal
is just an invariant subspace of g for the adjoint representation, so another way
of saying the same thing is that ad : g −→ End(g) is irreducible. For example,
it is not hard to see that for any field F , the Lie algebra sl(n, F ) is simple.

Proposition 10.5. Let g be a finite-dimensional simple Lie algebra over a


field F . Then there exists, up to scalar, at most one invariant bilinear form
on g. If a nonzero invariant bilinear form exists it is nondegenerate and
symmetric.

Proof. Let g∗ be the dual space to g. If λ ∈ g∗ and x ∈ g, we will use the notation ⟨x, λ⟩ for λ(x). Let α : g −→ End(g∗ ) be defined by the rule ⟨x, α(y)λ⟩ = −⟨[y, x], λ⟩. It is easy to check using the Jacobi identity that this α is a representation. We will regard g∗ as a g-module by means of α.
Every bilinear form B : g × g −→ F is of the form B(x, y) = ⟨x, θ(y)⟩ for some linear map θ : g −→ g∗ . We claim that the condition for B to be invariant is equivalent to θ being a homomorphism of g-modules. Indeed, for θ to be a g-module homomorphism we need α(x)θ(z) = θ(ad(x)z). Applying these linear functionals to y ∈ g, this condition is equivalent to −B([x, y], z) = B(y, [x, z]) for all y.
Thus, the vector space of invariant bilinear forms is isomorphic to the
space of g-module homomorphisms θ : g −→ g∗ . Since g is simple, any such
homomorphism is either zero or injective; if it is nonzero, it is bijective since
g and g∗ have the same finite dimension. By Schur’s lemma (Exercise 10.5)
the space of such θ is at most one-dimensional, and so, therefore, is the space
of invariant bilinear forms.
We must show that if B is nonzero and invariant it is symmetric and
nondegenerate. Since θ is injective, B(x, y) = 0 for all x implies that y = 0,
and so it is nondegenerate. To see that it is symmetric, note that (x, y) → B(y, x) is also invariant; since the invariant form is unique up to scalar, we have B(x, y) = cB(y, x) for some scalar c. Applying this twice, c² = 1, and we need to show c = 1. If the characteristic of F is two, then c² = 1 implies c = 1, so assume the characteristic is not two. Then we show that c ≠ −1. Arguing by contradiction, c = −1 implies that

B([x, y], z) = −B(z, [x, y]) = B([x, z], y) = −B([z, x], y).

Applying this identity three times, B([x, y], z) = −B([x, y], z) and because
the characteristic is not two, we have B([x, y], z) = 0 for all x, y, z. Now we
may assume that g is non-Abelian since otherwise it is one-dimensional and
any bilinear form is symmetric. Then [g, g] is an ideal of g (as follows from
the Jacobi identity) and is nonzero since g is non-Abelian. Since g is simple
[g, g] = g and we have proved that B = 0, a contradiction.


Exercises
Exercise 10.1. Let Xij ∈ gl(n, R) (1  i, j  n) be the n × n matrix with a 1 in
the i, j position and 0’s elsewhere. Show that [Xij , Xkl ] = δjk Xil − δil Xkj , where
δjk is the Kronecker δ. From this, show for any positive integer d that


  n          n
  Σ   · · ·  Σ    Xi1 i2 Xi2 i3 · · · Xid i1
i1 =1      id =1

is in the center of U (gl(n, R)).

Exercise 10.2. Let G be a connected Lie group and g its Lie algebra. Define an
action of g on the space C ∞ (G) of smooth functions on G by (10.1).
(i) Show that this is a representation of g. Explain why Theorem 10.1 implies that
this action of g on C ∞ (G) can be extended to a representation of the associative
algebra U (g) on C ∞ (G).
(ii) If h ∈ G, let ρ(h) and λ(h) be the endomorphisms of C ∞ (G) given by right and left translation, respectively. Thus

ρ(h)f (g) = f (gh), λ(h)f (g) = f (h−1 g).

Show that if h ∈ G and D ∈ U (g), then λ(h) ◦ D = D ◦ λ(h). If D is in the center


of U (g) then prove that ρ(h) ◦ D = D ◦ ρ(h). (Hint: Prove this first if h is of the form eX for some X ∈ g, and recall that G was assumed to be connected, so it is generated by a neighborhood of the identity.)

Exercise 10.3. Let G = GL(n, R). Let B be the “Borel subgroup” of upper triangular matrices in G, and let B0 be the connected component of the identity in B, consisting of upper triangular matrices with positive diagonal entries. Let K = O(n).
(i) Show that every element g ∈ G has a unique decomposition as g = bk with
b ∈ B0 and k ∈ K.
(ii) Let s = (s1 , . . . , sn ) ∈ Cn . By (i), we may define an element φ = φs of C ∞ (G) by

        ⎛⎛ y1 ∗ · · · ∗  ⎞  ⎞
        ⎜⎜ 0 y2 · · · ∗  ⎟  ⎟      n
     φs ⎜⎜ .. .. . . ..   ⎟ k⎟  =  Π  yi^{si} ,    yi > 0, k ∈ K.
        ⎝⎝ 0 0 · · · yn  ⎠  ⎠     i=1

Show that φ is an eigenfunction of the center of U (g). That is, if D is in the


center of U (g), then Dφ = λφ for some complex number λ. [Hint: Characterize
φ by properties of left and right translation and use Exercise 10.2 (ii).]
(iii) Define σs (g) = ∫K φs (kg) dk. Clearly σs satisfies σs (kgk ′ ) = σs (g) for k, k ′ ∈ K. Show that σs is an eigenfunction of the center of U (g). This is the spherical function.

Exercise 10.4. Give a construction similar to that in Exercise 10.3 for eigenfunc-
tions of the center of U (g) when g = gl(n, C).

Exercise 10.5 (Schur’s lemma). Let g be a Lie algebra, and let V , W be irreducible finite-dimensional complex g-modules.
(i) Show that the space of g-module homomorphisms φ : V → W is at most one-dimensional.
(ii) Show that the space of invariant bilinear forms V × W → C is at most one-dimensional.
11
Extension of Scalars

We will be interested in complex representations of both real and complex


Lie algebras. There is an important distinction to be made. If g is a real Lie
algebra, then a complex representation is an R-linear homomorphism g −→
End(V ), where V is a complex vector space. On the other hand, if g is a
complex Lie algebra, we require that the homomorphism be C-linear. The
reader should note that we ask more of a complex representation of a complex
Lie algebra than we do of a complex representation of a real Lie algebra.
The interplay between real and complex Lie groups and Lie algebras will
prove important to us. We begin this theme right here with some generalities
about extension of scalars.
If R is a commutative ring and S is a larger commutative ring containing
R, we may think of S as an R-algebra. In this case, there are functors between
the categories of R-modules and S-modules. Namely, if N is an S-module, we
may regard it as an R-module. On the other hand, if M is an R-module, then
thinking of S as an R-module, we may form the R-module MS = S ⊗R M .
This has an S-module structure such that t(s ⊗ m) = ts ⊗ m for t, s ∈ S,
and m ∈ M . We call this the S-module obtained by extension of scalars.
If φ : M −→ N is an R-module homomorphism, 1 ⊗ φ : MS −→ NS is an
S-module homomorphism, so extension of scalars is a functor.
Of the properties of extension of scalars, we note the following:
Proposition 11.1. Let S ⊇ R be commutative rings.
(i) If M1 and M2 are R-modules, we have the following natural isomorphisms of S-modules:

S ⊗R R ≅ S,    (11.1)
S ⊗R (M1 ⊕ M2 ) ≅ (S ⊗R M1 ) ⊕ (S ⊗R M2 ),    (11.2)
(S ⊗R M1 ) ⊗S (S ⊗R M2 ) ≅ S ⊗R (M1 ⊗R M2 ).    (11.3)

(ii) If M is an R-module and N is an S-module, we have a natural isomorphism

HomR (M, N ) ≅ HomS (S ⊗R M, N ).    (11.4)


Proof. To prove (11.1), note that the multiplication S × R −→ S is an


R-bilinear map hence by the universal property of the tensor product induces
an R-module homomorphism S ⊗R R −→ S. On the other hand, s −→ s ⊗ 1
is an R-module homomorphism S −→ S ⊗R R, and these maps are inverses
of each other. With our definition of the S-module structure on S ⊗R R, they
are S-module isomorphisms.
To prove (11.2), one may characterize the direct sum M1 ⊕ M2 as follows: given an R-module M with maps ji : Mi −→ M , pi : M −→ Mi (i = 1, 2) such that pi ◦ ji = 1Mi and j1 ◦ p1 + j2 ◦ p2 = 1M , then there are maps

M −→ M1 ⊕ M2 ,   m → (p1 (m), p2 (m)),

M1 ⊕ M2 −→ M ,   (m1 , m2 ) → j1 (m1 ) + j2 (m2 ).

These are easily checked to be inverses of each other, and so M ≅ M1 ⊕ M2 . Conversely, if M = M1 ⊕ M2 , such maps exist: take the inclusion and projection maps in and out of the direct sum. Now applying the functor M → S ⊗R M to the maps j1 , j2 , p1 , p2 gives corresponding maps for S ⊗R (M1 ⊕ M2 ), showing that it is isomorphic to (S ⊗R M1 ) ⊕ (S ⊗R M2 ).
To prove (11.3), one has an S-bilinear map

(S ⊗R M1 ) × (S ⊗R M2 ) −→ S ⊗R (M1 ⊗R M2 ) (11.5)

such that ((s1 ⊗ m1 ), (s2 ⊗ m2 )) → s1 s2 ⊗ (m1 ⊗ m2 ). This map is S-bilinear,
so it induces a homomorphism

(S ⊗R M1 ) ⊗S (S ⊗R M2 ) −→ S ⊗R (M1 ⊗R M2 ). (11.6)

Similarly, there is an R-bilinear map

S × (M1 ⊗R M2 ) −→ (S ⊗R M1 ) ⊗S (S ⊗R M2 )

such that (s, m1 ⊗ m2 ) → (s ⊗ m1 ) ⊗ (1 ⊗ m2 ) = (1 ⊗ m1 ) ⊗ (s ⊗ m2 ). This


induces an S-module homomorphism that is the inverse to (11.6).
To prove (11.4), we describe the correspondence explicitly. If

φ ∈ HomR (M, N ) and Φ ∈ HomS (S ⊗ M, N ),

then φ and Φ correspond if φ(m) = Φ(1 ⊗ m) and Φ(s ⊗ m) = sφ(m). It is


easily checked that φ → Φ and Φ → φ are well-defined inverse isomorphisms.



If V is a d-dimensional real vector space, then the complex vector space


VC = C ⊗R V is a d-dimensional complex vector space. This follows from
Proposition 11.1 because if V ∼ = R ⊕ . . . ⊕ R (d copies), then (11.1) and (11.2)
imply that VC ∼= C ⊕ · · · ⊕ C (d copies). We call VC the complexification of V .
The natural map V −→ VC given by v → 1 ⊗ v is injective, so we may think
of V as a real vector subspace of VC .

Proposition 11.2.
(i) If V is a real vector space and W is a complex vector space, any R-linear
transformation V −→ W extends uniquely to a C-linear transformation
VC −→ W .
(ii) If V and U are real vector spaces, any R-linear transformation V −→ U
extends uniquely to a C-linear map VC −→ UC .
(iii) If V and U are real vector spaces, any R-bilinear map V × V −→ U
extends uniquely to a C-bilinear map VC × VC −→ UC .

Proof. Part (i) is a special case of (ii) of Proposition 11.1. Part (ii) follows
by taking W = UC in part (i) after composing the given linear map V −→ U
with the inclusion U −→ W . As for (iii), an R-bilinear map V × V −→ U
induces an R-linear map V ⊗R V −→ U and hence by (ii) a C-linear map
(V ⊗R V )C −→ UC . But by (11.3), (V ⊗R V )C is VC ⊗C VC , and a C-linear map
VC ⊗C VC −→ UC is the same thing as a C-bilinear map VC × VC −→ UC . 

Proposition 11.3.
(i) The complexification gC of a real Lie algebra g with the bracket extended
as in Proposition 11.2 (iii) is a Lie algebra.
(ii) If g is a real Lie algebra, h is a complex Lie algebra, and ρ : g −→ h is a
real Lie algebra homomorphism, then ρ extends uniquely to a homomor-
phism ρC : gC −→ h of complex Lie algebras. In particular, any complex
representation of g extends uniquely to a complex representation of gC .
(iii) If g is a real Lie subalgebra of the complex Lie algebra h, and if h = g ⊕ ig
(i.e., if g and ig span h but g ∩ ig = {0}), then h ∼ = gC as complex Lie
algebras.

Proof. For (i), the extended bracket satisfies the Jacobi identity since both
sides of (5.1) are trilinear maps on gC × gC × gC −→ gC , which by assumption
vanish on g × g × g. Since g generates gC over the complex numbers, (5.1) is
therefore true on gC .
For (ii), the extension is given by Proposition 11.2 (i), taking W = h.
To see that the extension is a Lie algebra homomorphism, note that both
ρ([x, y]) and ρ(x)ρ(y) − ρ(y)ρ(x) are bilinear maps gC × gC −→ h that agree
on g × g. Since g generates gC over C, they are equal for all x, y ∈ gC .
For (iii), by Proposition 11.2 (i), it will be least confusing to distinguish
between g and its image in h, so we prove instead the following equivalent
statement: if g is a real Lie algebra, h is a complex Lie algebra, f : g −→ h
is an injective homomorphism, and if h = f (g) ⊕ i f (g), then f extends to an
isomorphism gC −→ h of complex Lie algebras. Now f extends to a Lie algebra
homomorphism fC : gC −→ h by part (ii). To see that this is an isomorphism,
note that it is surjective since f (g) spans h. To prove that it is injective, if
fC (X +iY ) = 0 with X, Y ∈ g, then f (X)+if (Y ) = 0. Now f (X) = f (Y ) = 0
because f (g) ∩ if (g) = 0. Since f is injective, X = Y = 0.


Of course, given any complex representation of gC , we may also restrict it


to g, so Proposition 11.3 implies that complex representations of g and com-
plex representations of gC are really the same thing. (They are equivalent
categories.)
As an example, let us consider the complexification of u(n).
Proposition 11.4.
(i) Every n × n complex matrix X can be written uniquely as X1 + iX2 ,
where X1 and X2 are n × n complex matrices satisfying X1 = −t X1 and
X2 = t X2 .
(ii) The complexification of the real Lie algebra u(n) is isomorphic to gl(n, C).
(iii) The complexification of the real Lie algebra su(n) is isomorphic to sl(n, C).
Proof. For (i), the unique solution is clearly

X1 = (1/2)(X − t X),    X2 = (1/2i)(X + t X).
For (ii), we will use the criterion of Proposition 11.3 (iii). We recall that u(n) is the real Lie algebra consisting of complex n × n matrices satisfying X = −t X̄. We want to get the complex conjugation out of the picture before we try to complexify it, so we write X = X1 + iX2 , where X1 and X2 are real n × n matrices. We must have X1 = −t X1 and X2 = t X2 . Thus, as a vector space, we may identify u(n) with the real vector space of pairs (X1 , X2 ) ∈ Matn (R) ⊕ Matn (R), where X1 is skew-symmetric and X2 symmetric. The
Lie bracket operation, required by the condition that
[X, Y ] = XY − Y X when X = X1 + iX2 and Y = Y1 + iY2 , (11.7)
amounts to the rule
[(X1 , X2 ), (Y1 , Y2 )]
= (X1 Y1 − X2 Y2 − Y1 X1 + Y2 X2 , X1 Y2 + X2 Y1 − Y2 X1 − Y1 X2 ). (11.8)
Now (i) shows that the complexification of this vector space (allowing X1 and
X2 to be complex) can be identified with Matn (C). Of course, (11.7) and (11.8)
are still equivalent if X1 , X2 , Y1 , and Y2 are allowed to be complex, so with
the Lie bracket in (11.8), this Lie algebra is Matn (C) with the usual bracket.
(iii) is similar to (ii), and we leave it to the reader.
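The decomposition in (i) is easy to test numerically on a random complex matrix; a quick sanity check (an illustration, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# The unique decomposition of Proposition 11.4 (i):
X1 = (X - X.T) / 2        # complex skew-symmetric part: X1 = -tX1
X2 = (X + X.T) / 2j       # complex symmetric part: X2 = tX2

assert np.allclose(X1, -X1.T)
assert np.allclose(X2, X2.T)
assert np.allclose(X1 + 1j * X2, X)   # X = X1 + iX2
```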

Theorem 11.1. Every complex representation of the Lie algebra u(n) or the
Lie algebra gl(n, R) extends uniquely to a complex representation of gl(n, C).
Every complex representation of the Lie algebra su(n) or the Lie algebra
sl(n, R) extends uniquely to a complex representation of sl(n, C).
Proof. This follows from Proposition 11.3 since the complexification of u(n) or gl(n, R) is gl(n, C), while the complexification of su(n) or sl(n, R) is sl(n, C). For gl(n, R) or sl(n, R), this is obvious. For u(n) and su(n), this is Proposition 11.4.

12
Representations of sl(2, C)

Unless otherwise indicated, in this chapter a representation of a Lie group


or Lie algebra is a complex representation. We remind the reader that if g is
a complex Lie algebra [e.g. sl(2, C)], then a complex representation π : g →
End(V ) is assumed to be complex linear, while if g is a real Lie algebra [e.g.
su(2) or sl(2, R)] then there is no such assumption.
Let us exhibit some representations of the group SL(2, C). We start with
the standard representation on C2 , with SL(2, C) acting by matrix multipli-
cation on column vectors. Due to the functoriality of ∨k , there is induced
a representation of SL(2, C) on ∨k C2 . The dimension of this vector space is
k + 1. In short, ∨k gives us a representation SL(2, C) −→ GL(k + 1, C). There
is an induced map of Lie algebras sl(2, C) −→ gl(k + 1, C) by Proposition 7.3,
and it is not hard to see that this is a complex Lie algebra homomorphism. We
have corresponding representations of the real subalgebras sl(2, R) and su(2),
and we will eventually see that these are all the irreducible representations of
these groups.
Let us make these symmetric power representations more explicit for the
algebra g = sl(2, R). A basis of g consists of the three matrices
     
     ⎛ 1  0 ⎞        ⎛ 0  1 ⎞        ⎛ 0  0 ⎞
 H = ⎝ 0 −1 ⎠ ,  R = ⎝ 0  0 ⎠ ,  L = ⎝ 1  0 ⎠ .

They satisfy the commutation relations

[H, R] = 2R, [H, L] = −2L, [R, L] = H. (12.1)

Let
     ⎛ 1 ⎞        ⎛ 0 ⎞
 x = ⎝ 0 ⎠ ,  y = ⎝ 1 ⎠ ,
be the standard basis of C2 . We have a corresponding basis of k + 1 elements
in ∨k C2 , which we will label by integers k, k − 2, k − 4, . . . , −k for reasons that
will become clear presently. Thus, we let


vk−2l = x ∨ · · · ∨ x ∨ y ∨ · · · ∨ y (k − l copies of x, l copies of y).


Since ∨k is a functor, if f : C2 −→ C2 is a linear transformation, there is in-
duced a linear transformation ∨k f of ∨k C2 . (See Exercise 9.3.) For simplicity,
if X ∈ g and v ∈ ∨k C2 we will write X · v or Xv instead of (∨k X)v.
Proposition 12.1. We have
H · vk−2l = (k − 2l)vk−2l ,   (0 ≤ l ≤ k),    (12.2)

R · vk−2l = { l vk−2l+2   if l > 0,
            { 0           if l = 0,    (12.3)

and

L · vk−2l = { (k − l)vk−2l−2   if l < k,
            { 0                if l = k.    (12.4)
The first identity is the reason for the labeling of the vectors vk−2l : each vk−2l
is an eigenvector of H, and the subscript is the eigenvalue. We may visualize
the effects of R and L as in Fig. 12.1. Each dot represents a one-dimensional
eigenspace of H, called a weight space.

[Fig. 12.1. Effects of R and L on weight vectors. The diagram shows the chain v−k , v−k+2 , v−k+4 , . . . , vk , with an arrow labeled R from each vj to vj+2 and an arrow labeled L from each vj to vj−2 .]

What this diagram means is that the operator R maps vj to a multiple


of vj+2 , while L maps vj to a multiple of vj−2 . The operators R and L shift
between the various weight spaces. The only exceptions are that R kills vk
and L kills v−k .
Proof. For example, let us compute the effect of ∨k R on vk−2l . In C2 ,

exp(tR) :  x → x,   y → y + tx.

So

R · vk−2l = (d/dt) exp(tR)vk−2l |t=0 .

Therefore, in ∨k V , remembering that the ∨ operation is symmetric (commutative), we see that exp(tR) maps vk−2l to

vk−2l + t l vk−2l+2 + t² (l choose 2) vk−2l+4 + · · · .
Differentiating with respect to t, then letting t = 0 gives (12.3). We leave the
reader to compute the effects of H and L.


For example, if k = 3, then with respect to the basis v3 , v1 , v−1 , v−3 , we find that

         ⎛ 0 1 0 0 ⎞
         ⎜ 0 0 2 0 ⎟
  ∨3 R = ⎜ 0 0 0 3 ⎟ ,
         ⎝ 0 0 0 0 ⎠

         ⎛ 0 0 0 0 ⎞           ⎛ 3 0  0  0 ⎞
         ⎜ 3 0 0 0 ⎟           ⎜ 0 1  0  0 ⎟
  ∨3 L = ⎜ 0 2 0 0 ⎟ ,  ∨3 H = ⎜ 0 0 −1  0 ⎟ .
         ⎝ 0 0 1 0 ⎠           ⎝ 0 0  0 −3 ⎠

It may be checked directly that these matrices satisfy the commutation relations (12.1).
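This check is easy to automate. The following snippet (an illustration, not part of the text) verifies the relations (12.1) for these matrices on the symmetric cube:

```python
import numpy as np

# Matrices of H, R, L on the symmetric cube, in the basis v3, v1, v-1, v-3
R3 = np.array([[0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 2.0, 0.0],
               [0.0, 0.0, 0.0, 3.0],
               [0.0, 0.0, 0.0, 0.0]])
L3 = np.array([[0.0, 0.0, 0.0, 0.0],
               [3.0, 0.0, 0.0, 0.0],
               [0.0, 2.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]])
H3 = np.diag([3.0, 1.0, -1.0, -3.0])

br = lambda a, b: a @ b - b @ a
assert np.allclose(br(H3, R3), 2 * R3)    # [H, R] = 2R
assert np.allclose(br(H3, L3), -2 * L3)   # [H, L] = -2L
assert np.allclose(br(R3, L3), H3)        # [R, L] = H
```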

Proposition 12.2. The representation ∨k C2 of sl(2, R) is irreducible.

Proof. Suppose that U is a nonzero invariant subspace. Choose a nonzero element Σ ak−2l vk−2l of U . Let k − 2l be the smallest integer such that ak−2l ≠ 0. Applying R to this vector l times shifts each vr to vr+2 times a nonzero constant, except for vk , which it kills. Consequently, this operation Rl will kill every vector vr with r > k − 2l, leaving only a nonzero constant times vk . Thus vk ∈ U . Now applying L repeatedly shows that vk−2 , vk−4 , . . . ∈ U , so U contains a basis of ∨k C2 . We see that any nonzero invariant subspace of ∨k C2 is the whole space, so the representation is irreducible.


If k = 0, we reiterate that ∨0 C2 = C. It is a trivial sl(2, R)-module, meaning


that π(X) acts as zero on it for all X ∈ sl(2, R).

Now we need an element of the center of U (sl(2, R)). An invariant bilinear form on g is given by B(x, y) = (1/2) tr(xy), where the trace is the usual trace of a matrix, and xy is the product of two matrices, not multiplication in U (sl(2, R)). The invariance of this bilinear form follows from the property of the trace that tr(xy) = tr(yx), since

B([x, y], z) + B(y, [x, z]) = (1/2)( tr(xyz) − tr(yxz) + tr(yxz) − tr(yzx) ) = 0 ,

proving (10.3). Dual to the basis H, R, L of sl(2, R) is the basis H, 2L, 2R, and it follows from Theorem 10.2 that the Casimir element

Δ = H² + 2RL + 2LR

is an element of the center of U (sl(2, R)).

Proposition 12.3. Suppose that (π, V ) is an irreducible representation of sl(2, R). Assume that there exists a nonzero vector vk in V such that Hvk = kvk and Rvk = 0. Then Δv = λv for all v ∈ V , where λ = k² + 2k.

Proof. By Proposition 10.2 there exists λ such that Δv = λv for all v. To


calculate λ, we use the identity [R, L] = H to write

Δ = H² + 2H + 4LR.    (12.5)

Using Rvk = 0 and Hvk = kvk we have Δvk = (k² + 2k)vk , so λ = k² + 2k.



Proposition 12.4. The element Δ acts by the scalar λ = k² + 2k on ∨k C2 .
Proof. This follows from Proposition 12.3.
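Propositions 12.1, 12.3, and 12.4 can be checked together in coordinates: build the matrices of H, R, L on ∨k C2 from (12.2)-(12.4), form Δ = H² + 2RL + 2LR, and observe the scalar k² + 2k. A sketch (not part of the text):

```python
import numpy as np

def rep(k):
    """Matrices of H, R, L on the (k+1)-dimensional space with basis
    v_k, v_{k-2}, ..., v_{-k} (position l holds v_{k-2l}), per (12.2)-(12.4)."""
    H = np.diag([float(k - 2 * l) for l in range(k + 1)])
    R = np.zeros((k + 1, k + 1))
    L = np.zeros((k + 1, k + 1))
    for l in range(1, k + 1):
        R[l - 1, l] = l          # R v_{k-2l} = l v_{k-2l+2}
    for l in range(k):
        L[l + 1, l] = k - l      # L v_{k-2l} = (k-l) v_{k-2l-2}
    return H, R, L

for k in range(1, 6):
    H, R, L = rep(k)
    Delta = H @ H + 2 * R @ L + 2 * L @ R   # the Casimir element of this chapter
    assert np.allclose(Delta, (k * k + 2 * k) * np.eye(k + 1))
```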

The following fact, though trivial to prove, is very important. It may be visu-
alized as in Fig. 12.1.
Lemma 12.1. Suppose v is an H-eigenvector in some module for sl(2, R) with
eigenvalue k. Then Rv (if nonzero) is also an eigenvector with eigenvalue k+2,
and Lv (if nonzero) is an eigenvector with eigenvalue k − 2.
Proof. In the enveloping algebra, we have HR − RH = [H, R] = 2R, so HRv = RHv + 2Rv = (k + 2)Rv. This proves the statement for R, and L is handled similarly.
handled similarly.
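As a concrete illustration (our own check, not from the text), take the three-dimensional module with H = diag(2, 0, −2): applying R to an H-eigenvector raises the eigenvalue by 2, and applying L lowers it by 2.

```python
import numpy as np

# H, R, L on the three-dimensional module (the case k = 2),
# in a basis v_2, v_0, v_{-2} of H-eigenvectors.
H = np.diag([2., 0., -2.])
R = np.array([[0., 1., 0.], [0., 0., 2.], [0., 0., 0.]])
L = np.array([[0., 0., 0.], [2., 0., 0.], [0., 1., 0.]])

v = np.array([0., 1., 0.])  # H-eigenvector with eigenvalue 0
assert np.allclose(H @ v, 0 * v)
assert np.allclose(H @ (R @ v), 2 * (R @ v))    # R raised the eigenvalue to 2
assert np.allclose(H @ (L @ v), -2 * (L @ v))   # L lowered it to -2
```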

Proposition 12.5. Let V be a finite-dimensional representation of sl(2, R).
Let vk ∈ V be an H-eigenvector with eigenvalue k maximal. Then k is a
nonnegative integer and vk is contained in an irreducible subspace of V isomorphic
to ∨k C2 .
Proof. We have Rvk = 0 by Lemma 12.1 and the maximality of k. Then
Δvk = (k² + 2k)vk follows from (12.5). Consider the submodule U generated
by vk . Every element of U is of the form ξvk where ξ is in the universal
enveloping algebra, and since Δ is in the center, it follows that Δξvk = λξvk
with λ = k² + 2k. It remains to be shown that U is isomorphic to ∨k C2 .
Define vk−2 , vk−4 , . . . , v−k by

    vk−2l−2 = (1/(k − l)) Lvk−2l .
Then (12.2) is satisfied by Lemma 12.1, and (12.4) is also satisfied by
construction. To prove (12.3), the case l = 0 is known, so assume l ≥ 1. Writing
Δ = H² − 2H + 4RL, the relation Δvk−2l+2 = (k² + 2k)vk−2l+2 gives

    (k² + 2k)vk−2l+2 = [(k − 2l + 2)² − 2(k − 2l + 2)]vk−2l+2 + 4(k − l + 1)Rvk−2l .

This can be simplified, giving (12.3). It is now clear that U is isomorphic to


∨k C2 . 
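The relations (12.2)–(12.4) determine the matrices of H, R, L on a (k + 1)-dimensional module completely, so Proposition 12.4 and the commutation relations (12.1) can be checked mechanically for any particular k. A sketch (the helper name spin_rep is ours):

```python
import numpy as np

def spin_rep(k):
    """Matrices of H, R, L on the (k+1)-dimensional module, in the
    basis w_0, ..., w_k where w_l = v_{k-2l}:
    H w_l = (k-2l) w_l,  L w_l = (k-l) w_{l+1},  R w_l = l w_{l-1}."""
    n = k + 1
    H = np.diag([k - 2 * l for l in range(n)]).astype(float)
    R = np.zeros((n, n))
    L = np.zeros((n, n))
    for l in range(n - 1):
        L[l + 1, l] = k - l
        R[l, l + 1] = l + 1
    return H, R, L

k = 4
H, R, L = spin_rep(k)
# The commutation relations (12.1).
assert np.allclose(H @ R - R @ H, 2 * R)
assert np.allclose(H @ L - L @ H, -2 * L)
assert np.allclose(R @ L - L @ R, H)
# The Casimir element acts by the scalar k^2 + 2k (Proposition 12.4).
Delta = H @ H + 2 * R @ L + 2 * L @ R
assert np.allclose(Delta, (k * k + 2 * k) * np.eye(k + 1))
```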

Proposition 12.6. Let (π, V ) be an irreducible complex representation of the
Lie algebra sl(2, R). Then Δ acts by a scalar λ on V , and λ = k² + 2k for
some nonnegative integer k. The representation π is isomorphic to ∨k C2 .

Proof. Let vk be an eigenvector for H with eigenvalue k maximal. Then


Rvk = 0 since otherwise Rvk is an eigenvector with eigenvalue k + 2. By
Proposition 12.5 vk generates an irreducible subspace isomorphic to ∨k C2 .
Since V is irreducible, the result follows.

Theorem 12.1. Let (π, V ) be any irreducible complex representation of
sl(2, R), su(2) or sl(2, C). Then π is isomorphic to ∨k C2 for some k.
Proof. By Theorem 11.1, it is sufficient to show this for sl(2, R), in which case
the statement follows from Proposition 12.6.

We can’t quite say yet that the finite-dimensional representations of sl(2, R),
su(2), and sl(2, C) are now classified. We know the irreducible representations
of these three Lie algebras. What we haven’t yet proved is the theorem of
Weyl that says that every finite-dimensional representation is completely reducible,
that is, a direct sum of irreducible representations. We will prove this next.
Another proof will be given later in Theorem 14.4. Therefore, the reader may
skip the rest of this chapter with no loss of continuity.
The proof given later in Theorem 14.4 is not purely algebraic. So even though
it is not needed, it is instructive to give a purely algebraic proof of complete
reducibility. The following proof depends on only two facts about g and the
Casimir element Δ. First, we have [g, g] = g, and second, that if V is an
irreducible module then Δv = λv for v ∈ V where the scalar λ is zero if and
only if V is trivial.
It may be shown that these properties are true for an arbitrary semisimple
Lie algebra, so the following arguments are applicable in that generality. The
exercises give an indication of how to extend the proof to other Lie algebras.
But in the special case where g is sl(2, R), su(2) or sl(2, C), the first state-
ment, that [g, g] = g follows from (12.1), and the second statement, that the
only irreducible module annihilated by Δ is the trivial module, follows from
Proposition 12.4 and our classification of the irreducible modules.
If g = su(2), we haven’t proven that Δ is an element of U (g). This can be
checked by direct computation, but we don’t really need it—it is an element of
U (gC ) ≅ U (g)C and as such acts as a scalar on any complex representation of g.
Proposition 12.7. Let g = sl(2, R), su(2) or sl(2, C). Let (π, V ) be a finite-
dimensional complex representation of g. If there exists k ≥ 1 such that
π(Δᵏ)v = 0 for all v ∈ V , then π(X)v = 0 for all X ∈ g, v ∈ V .
Proof. There is nothing to do if V = {0}. Assume therefore that U is a
maximal proper invariant subspace of V . By induction on dim(V ), g acts
trivially on U . Now V /U is irreducible by the maximality of U , and Δ
annihilates V /U , so by the classification of the irreducible representations of
g in Theorem 12.1, g acts trivially on V /U . This means that if Y ∈ g and
v ∈ V , then π(Y )v ∈ U . Since g acts trivially on U , if X is another el-
ement of g, we have π(X)π(Y )v = 0 and similarly π(Y )π(X)v = 0. Thus,
π([X, Y ])v = π(X)π(Y )v − π(Y )π(X)v = 0, and since by (12.1) elements of
the form [X, Y ] span g, it follows that g acts trivially on V .


Proposition 12.8. Let g = sl(2, R), su(2), or sl(2, C). Let (π, V ) be a finite-
dimensional complex representation of g.
(i) If v ∈ V and Δ²v = 0, then Δv = 0.
(ii) We have V = V0 ⊕ V1 , where V0 is the kernel of Δ and V1 is the image of
Δ. Both are invariant subspaces. If X ∈ g and v ∈ V0 , then π(X)v = 0.
(iii) The subspace V0 = {v ∈ V | π(X)v = 0 for all X ∈ g}.
(iv) If 0 −→ V −→ W −→ Q −→ 0 is an exact sequence of g-modules, then
there is an exact sequence 0 −→ V0 −→ W0 −→ Q0 −→ 0.

Proof. Since Δ commutes with the action of g, the kernel W of Δ² is an
invariant subspace. Now (i) follows from Proposition 12.7 applied to W .
It follows from (i) that V0 ∩ V1 = {0}. Now for any linear endomorphism
of a vector space, the dimension of the image equals the codimension of the
kernel, so dim(V0 ) + dim(V1 ) = dim(V ). It follows that V0 + V1 = V and this
sum is direct. Since Δ commutes with the action of g, both V0 and V1 are
invariant subspaces.
It follows from Proposition 12.7 that g acts trivially on V0 . This proves (ii)
and also (iii) since it is obvious that {v ∈ V |π(X)v = 0} ⊆ V0 , and we have
proved the other inclusion.
For (iv), any homomorphism V −→ W of g-modules maps V0 into W0 ,
so V −→ V0 is a functor. Given a short exact sequence 0 −→ V −→ W −→
Q −→ 0, consider

         0 ──→ V0 ──→ W0 ──→ Q0
                │      │      │
         0 ──→  V ──→  W ──→  Q ──→ 0
                │Δ     │Δ     │Δ
         0 ──→  V ──→  W ──→  Q ──→ 0
                │      │
               V /V1  W/W1

Exactness of the two middle rows implies exactness of the top row. We must
show that W0 −→ Q0 is surjective. We will deduce this from the Snake Lemma.
The cokernel of Δ : V −→ V is V /V1 ≅ V0 , and similarly the cokernel of
Δ : W −→ W is W/W1 ≅ W0 , so the Snake Lemma gives us a long exact
sequence:
0 −→ V0 −→ W0 −→ Q0 −→ V0 −→ W0 .
Since the last map is injective, the map Q0 −→ V0 is zero, and hence W0 −→
Q0 is surjective.


If V is a g-module, we call V0 = {v ∈ V | Xv = 0 for all X ∈ g} the module of


invariants. The proposition shows that it is an exact functor.
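A small numerical example (ours) makes the decomposition V = V0 ⊕ V1 visible: on V = C ⊕ ∨²C², with the trivial module as first summand, the Casimir matrix is diag(0, 8, 8, 8); its kernel is exactly the trivial summand, and g acts by zero there.

```python
import numpy as np

def block(a, b):
    """Block-diagonal sum of a 1x1 block (the trivial module) and b."""
    n = b.shape[0]
    m = np.zeros((n + 1, n + 1))
    m[0, 0] = a
    m[1:, 1:] = b
    return m

# H, R, L on ∨²C² (k = 2), then on V = trivial ⊕ ∨²C².
H2 = np.diag([2., 0., -2.])
R2 = np.array([[0., 1., 0.], [0., 0., 2.], [0., 0., 0.]])
L2 = np.array([[0., 0., 0.], [2., 0., 0.], [0., 1., 0.]])
H, R, L = block(0, H2), block(0, R2), block(0, L2)

Delta = H @ H + 2 * R @ L + 2 * L @ R
assert np.allclose(Delta, np.diag([0., 8., 8., 8.]))

# V0 = ker(Delta) is the trivial summand, and g acts by zero on it.
v0 = np.array([1., 0., 0., 0.])
assert np.allclose(Delta @ v0, 0)
for X in (H, R, L):
    assert np.allclose(X @ v0, 0)
```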
If g is a Lie algebra and V , W are g-modules, we can make the space
Hom(V, W ) of all C-linear transformations V −→ W into a g-module by:
12 Representations of sl(2, C) 77

(Xφ)v = Xφ(v) − φ(Xv).


It is straightforward to check that this defines a Lie algebra representation.
module of invariants is the space
Homg (V, W ) = {φ : V −→ W | φ(Xv) = Xφ(v) for all X ∈ g}
of all g-module homomorphisms.
Proposition 12.9. Let U, V, W, Q be g-modules, where g is one of sl(2, R),
su(2), or sl(2, C), and let
0 −→ V −→ W −→ Q −→ 0
be an exact sequence of g-modules. Composition with these maps gives an exact
sequence:
0 −→ Homg (U, V ) −→ Homg (U, W ) −→ Homg (U, Q) −→ 0.
Proof. Composition with these maps gives a short exact sequence:
0 −→ Hom(U, V ) −→ Hom(U, W ) −→ Hom(U, Q) −→ 0.
Here, of course, Hom(U, V ) is just the space of all linear transformations of
complex vector spaces. Taking the spaces of invariants gives the exact sequence
of Homg spaces, and by Proposition 12.8 it is exact.

Theorem 12.2. Let g = sl(2, R), su(2), or sl(2, C). Any finite-dimensional
complex representation of g is a direct sum of irreducible representations.
Proof. Let W be a g-module. If W is zero or irreducible, there is nothing to
check. Otherwise, let V be a proper nonzero submodule and let Q = W/V .
We have an exact sequence
0 −→ V −→ W −→ Q −→ 0
and by induction on dim(W ) both V and Q decompose as direct sums of
irreducible submodules. By Proposition 12.9, composition with these maps
produces an exact sequence
0 −→ Homg (Q, V ) −→ Homg (Q, W ) −→ Homg (Q, Q) −→ 0.
The surjectivity of the map Homg (Q, W ) −→ Homg (Q, Q) means that there
is a map i : Q −→ W which has a composition p ◦ i with the projection
p : W −→ Q that is the identity map on Q.
Now V and i(Q) are submodules of W such that V ∩ i(Q) = {0} and
W = V + i(Q). Indeed, if x ∈ V ∩ i(Q), then p(x) = 0 since p(V ) = {0}, and
writing x = i(q) with q ∈ Q, we have q = (p ◦ i)(q) = p(x) = 0; so x = 0 and
if w ∈ W we can write w = v + q, where v = w − ip(w) and q = ip(w) and,
since p(v) = p(w) − p(w) = 0, v ∈ ker(p) = V and q ∈ i(Q).
We see that W = V ⊕i(Q), and since V and Q are direct sums of irreducible
submodules, so is W.


Exercises
Exercise 12.1. If (π, V ) is a representation of SL(2, R), SU(2) or SL(2, C), then we
may restrict the character of π to the diagonal subgroup. This gives
 
    ξπ (t) = tr π( diag(t, t⁻¹) ) ,

which is a Laurent polynomial, that is, a polynomial in t and t⁻¹.

(i) Compute ξπ (t) for the symmetric power representations. Show that the polyno-
mials ξπ (t) are linearly independent and determine the representation π.
(ii) Show that if Π = π ⊗ π′ , then ξΠ = ξπ ξπ′ . Use this observation to compute the
decomposition of π ⊗ π′ into irreducibles when π = ∨n C2 and π′ = ∨m C2 .
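Part (ii) can be explored by machine. In the sketch below (helper names are ours), a Laurent polynomial is stored as a map from exponents to coefficients, and a product of characters is decomposed by repeatedly stripping off the character of the largest exponent present.

```python
from collections import Counter

def char(k):
    """Character of ∨^k C^2: xi_k(t) = t^k + t^(k-2) + ... + t^(-k)."""
    return Counter({k - 2 * j: 1 for j in range(k + 1)})

def mul(f, g):
    """Product of Laurent polynomials given as exponent -> coefficient."""
    h = Counter()
    for a, ca in f.items():
        for b, cb in g.items():
            h[a + b] += ca * cb
    return h

def decompose(f):
    """Write f as a sum of characters xi_k; return the list of k's."""
    f, ks = Counter(f), []
    while any(f.values()):
        top = max(e for e, c in f.items() if c != 0)
        ks.append(top)
        f.subtract(char(top))
    return ks

# Clebsch-Gordan: ∨²C² ⊗ ∨³C² decomposes as ∨⁵C² ⊕ ∨³C² ⊕ ∨¹C².
assert decompose(mul(char(2), char(3))) == [5, 3, 1]
```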

Exercise 12.2. Show that each representation of sl(2, R) comes from a representa-
tion of SL(2, R).

Exercise 12.3. Let g = sl(3, R). Let Δ be the Casimir element with respect to the
invariant bilinear form (1/2) tr(xy) on g. Show that if (π, V ) is an irreducible
representation with Δ · V = 0, then V is trivial.
[Hint: Here are some suggestions for a direct approach. Let
    H1 = diag(1, −1, 0) ,    H2 = diag(0, 1, −1) ,

and (denoting by Eij the matrix with 1 in the i, j position, 0 elsewhere) let R1 = E12 ,
R2 = E23 , R3 = E13 , L1 = E21 , L2 = E32 , L3 = E31 . These eight elements are a
basis. Since [H1 , H2 ] = 0 there exists a vector vλ that is a simultaneous eigenvector,
so that H1 vλ = (λ1 − λ2 )vλ and H2 vλ = (λ2 − λ3 )vλ for some triple (λ1 , λ2 , λ3 ) of
real numbers. (We may normalize them so λ1 + λ2 + λ3 = 0, and it may then be
shown that λi ∈ (1/3)Z, though you may not need that fact.) Let Vλ be the space of
such vectors. Show that R1 maps Vλ into Vλ+α1 and R2 maps Vλ into Vλ+α2 where
α1 = (1, −1, 0) and α2 = (0, 1, −1). (What does R3 do to Vλ , and what do the Li
do?) Conclude that there is a nonzero vector vλ in some Vλ that is annihilated by
R1 , R2 and R3 . Show that λ1 ≥ λ2 ≥ λ3 . For this, it may be useful to observe that
there are two copies of sl(2, R) in sl(3, R) spanned by Hi , Ri , Li with i = 1, 2, so you
may restrict the representation to these and make use of the theory in the text.
Compute the eigenvalue of Δ on vλ and show that Δvλ = 0 implies λ = (0, 0, 0).]

Exercise 12.4. Show that complex representations of su(3), sl(3, R), and sl(3, C)
are completely reducible.

Exercise 12.5. Show that if (π, V ) is a faithful representation of sl(2, R), then the
trace bilinear form BV : g × g → C defined by BV (X, Y ) = tr( π(X) π(Y ) ) is nonzero.

Exercise 12.6. Let g be a simple Lie algebra. Assume that g contains a subalgebra
isomorphic to sl(2, R). Let π : g → End(V ) be an irreducible representation. Assume
that π is not the trivial representation.

(i) Show that π is faithful.



(ii) Show that the trace bilinear form BV on g defined by B(X, Y ) = tr(π(X)π(Y ))
is nondegenerate. (Hint: First show that it is nonzero.)
(iii) By (ii) there exists a Casimir element Δ in the center of U (g) as in Theorem 10.2.
Show that the eigenvalue of Δ on V is dim(g)/ dim(V ). (Hint: Take traces.)
(iv) Show that representations of g are completely reducible. (Hint: Use Proposi-
tion 10.5.)

Exercise 12.7. Show that complex representations of su(n), sl(n, R), and sl(n, C)
are completely reducible.
13
The Universal Cover

If U is a Hausdorff topological space, a path is a continuous map p : [0, 1] −→


U . The path is closed if the endpoints coincide: p(0) = p(1). A closed path is
also called a loop.
An object in the category of pointed topological spaces consists of a pair
(X, x0 ), where X is a topological space and x0 ∈ X. The chosen point x0 ∈ X
is called the base point . A morphism in this category is a continuous map
taking base point to base point.
If U and V are topological spaces and φ, ψ : U −→ V are continuous maps,
a homotopy h : φ ≃ ψ is a continuous map h : U × [0, 1] −→ V such that
h(u, 0) = φ(u) and h(u, 1) = ψ(u). To simplify the notation, we will denote
h(u, t) as ht (u) in a homotopy. Two maps φ and ψ are called homotopic if
there exists a homotopy φ ≃ ψ. Homotopy is an equivalence relation.
If p : [0, 1] −→ U and p′ : [0, 1] −→ U are two paths, we say that p and
p′ are path-homotopic if there is a homotopy h : p ≃ p′ that does not move
the endpoints. This means that ht (0) = p(0) = p′ (0) and ht (1) = p(1) = p′ (1)
for all t. We call h a path-homotopy, and we write p ≈ p′ if a path-homotopy
exists.
Suppose there exists a continuous function f : [0, 1] −→ [0, 1] such
that f (0) = 0 and f (1) = 1 and that p′ = p ◦ f . Then we say that p′
is a reparametrization of p. The paths p and p′ are path-homotopic since we can
consider pt (u) = p( (1 − t)u + tf (u) ). Because the interval [0, 1] is convex,
(1 − t)u + tf (u) ∈ [0, 1] and pt : p ≃ p′ .
Let us say that a map of topological spaces is trivial if it is constant,
mapping the entire domain to a single point. A topological space U is
contractible if the identity map U −→ U is homotopic to a trivial map. A space
U is path-connected if for all x, y ∈ U there exists a path p : [0, 1] −→ U such
that p(0) = x and p(1) = y.
Suppose that p : [0, 1] −→ U and q : [0, 1] −→ U are two paths in the
space U such that the right endpoint of p coincides with the left endpoint of
q; that is, p(1) = q(0). Then we can concatenate the paths to form the path
p ∗ q:

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 81


DOI 10.1007/978-1-4614-8024-2 13, © Springer Science+Business Media New York 2013

    (p ∗ q)(t) = p(2t)        if 0 ≤ t ≤ 1/2 ,
                 q(2t − 1)    if 1/2 ≤ t ≤ 1 .

We may also reverse a path: −p is the path (−p)(t) = p(1 − t). These
operations are compatible with path-homotopy, and the path p ∗ (−p) is
path-homotopic to the trivial path p0 (t) = p(0). To see this, define

    pt (u) = p(2tu)            if 0 ≤ u ≤ 1/2 ,
             p( 2t(1 − u) )    if 1/2 ≤ u ≤ 1 .

This is a path-homotopy p0 ≃ p ∗ (−p). Also (p ∗ q) ∗ r ≈ p ∗ (q ∗ r) if p(1) = q(0)
and q(1) = r(0), since these paths differ by a reparametrization.
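These operations are easy to model as operations on functions [0, 1] −→ U. The following sketch (ours) checks the endpoint bookkeeping for concatenation and reversal.

```python
def concat(p, q):
    """The concatenated path p * q; requires p(1) == q(0)."""
    assert p(1) == q(0)
    return lambda t: p(2 * t) if t <= 0.5 else q(2 * t - 1)

def reverse(p):
    """The path -p, traversed backwards."""
    return lambda t: p(1 - t)

# Two paths in R with matching endpoints: p from 0 to 1, q from 1 to 3.
p = lambda t: t * t
q = lambda t: 1 + 2 * t
r = concat(p, q)
assert (r(0), r(0.5), r(1)) == (0, 1, 3)

# p * (-p) is a loop based at p(0).
loop = concat(p, reverse(p))
assert loop(0) == loop(1) == p(0)
```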
The space U is simply connected if it is path-connected and given any
closed path [that is, any p : [0, 1] −→ U such that p(0) = p(1)], there exists
a path-homotopy f : p  p0 , where p0 is a trivial loop mapping [0, 1] onto a
single point. Visually, the space is simply connected if every closed path can
be shrunk to a point. It may be convenient to fix a base point x0 ∈ U . In this
case, to check whether U is simply-connected or not, it is sufficient to consider
loops p : [0, 1] −→ U such that p(0) = p(1) = x0 . Indeed, we have:
Proposition 13.1. Suppose the space U is path-connected. The following are
equivalent.
(i) Every loop in U is path-homotopic to a trivial loop.
(ii) Every loop p in U with p(0) = p(1) = x0 is path-homotopic to a trivial
loop.
(iii) Every continuous map of the circle S 1 −→ U is homotopic to a trivial
map.
Thus, any one of these conditions is a criterion for simple connectedness.
Proof. Clearly, (i) implies (ii). Assuming (ii), if p is a loop in U , let x be the
endpoint p(0) = p(1) and (using path-connectedness) let q be a path from
x0 to x. Then q ∗ p ∗ (−q) is a loop beginning and ending at x0 , so using
(ii) it is path-homotopic to the trivial path p0 (t) = x0 for all t ∈ [0, 1]. Since
p0 ≈ q ∗ p ∗ (−q), p ≈ (−q) ∗ p0 ∗ q, which is path-homotopic to the trivial loop
t −→ x. Thus, (ii) implies (i).
As for (iii), a continuous map of the circle S 1 −→ U is equivalent to a
path p : [0, 1] −→ U with p(0) = p(1). To say that this path is homotopic to
a trivial path is not quite the same as saying it is path-homotopic to a trivial
path because in deforming p we need pt (0) = pt (1) (so that it extends to a
continuous map of the circle), but we do not require that pt (0) = p(0) for all t.
Thus, it may not be a path-homotopy. However, we may modify it to obtain
a path-homotopy as follows: let

    qt (u) = p_{3tu}(0)        if 0 ≤ u ≤ 1/3 ,
             pt (3u − 1)       if 1/3 ≤ u ≤ 2/3 ,
             p_{(3−3u)t}(1)    if 2/3 ≤ u ≤ 1 .
13 The Universal Cover 83

Then qt is a path-homotopy. When t = 0, it is a reparametrization of the


original path, and when t = 1, since p1 is trivial, q1 is path-homotopic to a
trivial path. Thus, (iii) implies (i), and the converse is obvious.

A map π : N −→ M is called a covering map if the fibers π −1 (m) are discrete
for m ∈ M , and every point m ∈ M has a neighborhood U such that π −1 (U )
is homeomorphic to U × π −1 (m) in such a way that the composition

    π −1 (U ) ≅ U × π −1 (m) −→ U,

where the second map is the projection, coincides with the given map π.
We say that the cover is trivial if N is homeomorphic to M × F , where
the space F is discrete, in such a way that π is the composition N ≅ M ×
F −→ M (where the second map is the projection). Thus, each m ∈ M has
a neighborhood U such that the restricted covering map π −1 (U ) −→ U is
trivial, a property we will cite as local triviality of the cover.
Proposition 13.2. Let π : N −→ M be a covering map.

(i) If p : [0, 1] −→ M is a path, and if y ∈ π −1 (p(0)), then there exists a
unique path p̃ : [0, 1] −→ N such that π ◦ p̃ = p and p̃(0) = y.
(ii) If p̃, p̃′ : [0, 1] −→ N are paths with p̃(0) = p̃′ (0), and if the paths π ◦ p̃ and
π ◦ p̃′ are path-homotopic, then the paths p̃ and p̃′ are path-homotopic.
We refer to (i) as the path lifting property of the covering space. We refer to
(ii) as the homotopy lifting property.
Proof. If the cover is trivial, then we may assume that N = M × F where
F is discrete, and if y = (x, f ), where x = p(0) and f ∈ F , then the unique
solution to this problem is p̃(t) = (p(t), f ).
Since p([0, 1]) is compact, and since the cover is locally trivial, there are a
finite number of open sets U1 , U2 , . . . , Un and points 0 = x0 < x1 < · · · < xn =
1 such that p([xi−1 , xi ]) ⊂ Ui and such that the restriction of the cover to Ui
is trivial. On each interval [xi−1 , xi ], there is a unique solution, and patching
these together gives the unique general solution. This proves (i).
For (ii), since p = π ◦ p̃ and p′ = π ◦ p̃′ are path-homotopic, there exists a
continuous map (u, t) → pt (u) from [0, 1]×[0, 1] −→ M such that p0 (u) = p(u)
and p1 (u) = p′ (u). For each t, using (i) there is a unique path p̃t : [0, 1] −→ N
such that pt = π ◦ p̃t and p̃t (0) = p̃(0). One may check that (u, t) → p̃t (u) is
continuous, and p̃0 = p̃ and p̃1 = p̃′ , so p̃ and p̃′ are path-homotopic.

Covering spaces of a fixed space M form a category: if π : N −→ M and
π ′ : N ′ −→ M are covering maps, a morphism is a covering map f : N −→ N ′
such that π = π ′ ◦ f . If M is a pointed space, we are actually interested in the
subcategory of pointed covering maps: if x0 is the base point of M , the base
point of N must lie in the fiber π −1 (x0 ), and in this category the morphism
f must preserve base points. We call this category the category of pointed
covering maps or pointed covers of M .
84 13 The Universal Cover

Let M be a path-connected space with a fixed base point x0 . We assume


that every point has a contractible neighborhood. The fundamental group
π1 (M ) consists of the set of homotopy classes of loops in M with left and
right endpoints equal to x0 . The multiplication in π1 (M ) is concatenation,
and the inverse operation is path-reversal. Clearly, π1 (M ) = 1 if and only
if M is simply connected. Changing the base point replaces π1 (M ) by an
isomorphic group, but not canonically so. Thus, π1 (M ) is a functor from the
category of pointed spaces to the category of groups—not a functor on the
category of topological spaces. If M happens to be a topological group, we will
always take the base point to be the identity element.

Proposition 13.3. If M is simply connected, N is path-connected, and π :
N −→ M is a covering map, then π is a homeomorphism.

Proof. Since a covering map is always a local homeomorphism, what we need
to show is that π is bijective. It is, of course, surjective. Suppose that n, n′ ∈ N
have the same image in M . Since N is path-connected, let p̃ : [0, 1] −→ N
be a path with p̃(0) = n and p̃(1) = n′ . Because M is simply connected
and π ◦ p̃(0) = π ◦ p̃(1), the path π ◦ p̃ is path-homotopic to a trivial path.
By Proposition 13.2 (ii), so is p̃. Therefore n = n′ .


Theorem 13.1. Let M be a path-connected space with base point x0 in


which every point has a contractible neighborhood. Then there exists a simply
connected space M̃ with a covering map π̃ : M̃ −→ M . If π : N −→ M is any
pointed covering map, there is a unique morphism M̃ −→ N of pointed covers
of M . If N is simply connected, this map is an isomorphism. Thus, M has a
unique simply connected cover.

Note that this is a universal property. Therefore it characterizes M̃ up to


isomorphism. The space M̃ is called the universal covering space of M .

Proof. To construct M̃ , let M̃ as a set be the set of all paths p : [0, 1] −→


M such that p(0) = x0 modulo the equivalence relation of path-homotopy.
We define the covering map π̃ : M̃ −→ M by π̃(p) = p(1). To topologize M̃ ,
let x ∈ M and let U be a contractible neighborhood of x. Let F = π̃ −1 (x). It is
a set of path-homotopy classes of paths from x0 to x. Using the contractibility
of U , it is straightforward to show that, given p ∈ π̃ −1 (U ) with y = π̃(p) ∈ U ,
there is a unique element of F represented by a path p′ such that p ≈ p′ ∗ q, where
q is a path from x to y lying entirely within U . We topologize π̃ −1 (U ) in the
unique way such that the map p → (p′ , y) is a homeomorphism π̃ −1 (U ) −→
F × U.
We must show that, given a pointed covering map π : N −→ M , there
exists a unique morphism M̃ −→ N of pointed covers of M . Let y0 be the
base point of N . An element of π̃ −1 (x), for x ∈ M , is an equivalence class
under the relation of path-homotopy of paths p : [0, 1] −→ M with x0 = p(0).
By Proposition 13.2 (i), there is a unique path q : [0, 1] −→ N lifting this with
13 The Universal Cover 85

q(0) = y0 , and Proposition 13.2 (ii) shows that the path-homotopy class of q
depends only on the path-homotopy class of p. Then mapping p → q(1) is the
unique morphism M̃ −→ N of pointed covers of M .
If N is simply connected, any covering map M̃ −→ N is an isomorphism
by Proposition 13.3.


Proposition 13.4. Let M , N and N ′ be topological spaces such that every
point has a contractible neighborhood. Assume that M is simply-connected.
Let π : N ′ −→ N be a covering map, and let f : M −→ N be continuous.
Then there exists a continuous map f ′ : M −→ N ′ such that π ◦ f ′ = f .

This result shows that the universal cover is a functor: if M̃ and Ñ are the
universal covers of M and N , then this proposition implies that a continuous
map φ : M → N induces a map φ̃ : M̃ → Ñ .

Proof. Let x0 be a base point for M , and let y0′ be an element of N ′ such that
π(y0′ ) = y0 where y0 = f (x0 ). If x ∈ M , we may find a path p : [0, 1] −→ M
such that p(0) = x0 and p(1) = x. By Proposition 13.2 (i) we may then find
a path p̃ : [0, 1] −→ N ′ such that π ◦ p̃ = f ◦ p and p̃(0) = y0′ . We will define
f ′ (x) = p̃(1), but first we must check that this is well-defined. If q is another
path with q(0) = x0 and q(1) = x, and if q̃ : [0, 1] −→ N ′ is the corresponding
lift of f ◦ q with q̃(0) = y0′ , then we must show q̃(1) = p̃(1). The paths p and q
are path-homotopic because M is simply connected. That is, the concatenation
of p with the inverse path to q is a loop, hence contractible, and this implies
that p and q are path-homotopic. It follows that f ◦ p and f ◦ q are path-homotopic,
so by Proposition 13.2 (ii) the lifts q̃ and p̃ are path-homotopic,
and in particular they have the same right endpoint q̃(1) = p̃(1). Hence we
may define f ′ (x) = p̃(1) and this is the required map.


If M is a pointed space and x0 is its base point, then the fiber π̃ −1 (x0 )
coincides with its fundamental group π1 (M ). We are interested in the case
where M = G is a Lie group. We take the base point to be the origin.

Theorem 13.2. Suppose that G is a path-connected group in which every


point has a contractible neighborhood. Then the universal covering space G̃
admits a group structure in which both the natural inclusion map π1 (G)−→G̃
and the projection π̃ : G̃ −→ G are homomorphisms. The kernel of π̃ is π1 (G).

Proof. If p : [0, 1] −→ G and q : [0, 1] −→ G are paths, so is t → p · q(t) =
p(t)q(t). If p(0) = q(0) = 1G , the identity element in G, then (p · q)(0) = 1G also.
If p and p′ are path-homotopic and q, q ′ are another pair of path-homotopic
paths, then p · q and p′ · q ′ are path-homotopic, for if t → pt is a path-homotopy
p ≃ p′ and t → qt is a path-homotopy q ≃ q ′ , then t → pt · qt is a path-
homotopy p · q ≃ p′ · q ′ .
It is straightforward to see that the projection π̃ is a group homomorphism.
To see that the inclusion of the fundamental group as the fiber over the identity
in G̃ is a group homomorphism, let p and q be loops with p(0) = p(1) =
q(0) = q(1) = 1G . There is a continuous map f : [0, 1] × [0, 1] −→ G given
86 13 The Universal Cover

by (t, u) −→ p(t)q(u). Taking different routes from (0, 0) to (1, 1) will give
path-homotopic paths. Going directly via t → f (t, t) = p(t)q(t) gives p · q,
while going indirectly via

    t →  f (2t, 0) = p(2t)            if 0 ≤ t ≤ 1/2 ,
         f (1, 2t − 1) = q(2t − 1)    if 1/2 ≤ t ≤ 1 ,

gives the concatenated path p ∗ q. Thus, p ∗ q and p · q are path-homotopic, so
the multiplication in π1 (G) is compatible with the multiplication in G̃.
The last statement, that the kernel of π̃ is π1 (G), is true by definition. 

Proposition 13.5. Let S r denote the r-sphere. Then π1 (S 1 ) ≅ Z, while S r
is simply-connected if r ≥ 2.

Proof. We may identify the circle S 1 with the unit circle in C. Then x →
e^{2πix} is a covering map R −→ S 1 . The space R is contractible and hence
simply-connected, so it is the universal covering space. If we give S 1 ⊂ C×
the group structure it inherits from C× , then this map R −→ S 1 is a group
homomorphism, so by Theorem 13.2 we may identify the kernel Z with π1 (S 1 ).
To see that S r is simply connected for r ≥ 2, let p : [0, 1] −→ S r be a path.
Since it is a mapping from a lower-dimensional manifold, perturbing the path
slightly if necessary, we may assume that p is not surjective. If it omits one
point P ∈ S r , its image is contained in S r − {P }, which is homeomorphic to
Rr and hence contractible. Therefore p is path-homotopic to a trivial path.
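The isomorphism π1(S¹) ≅ Z can be made computational: lifting a loop through the covering map x → e^{2πix} amounts to accumulating the small changes of argument between nearby sample points, and the total change divided by 2π is the winding number, an integer invariant of the path-homotopy class. A sketch (ours):

```python
import cmath
import math

def winding_number(loop, steps=1000):
    """Lift a loop in the unit circle through x -> e^(2 pi i x) by
    summing the argument changes between consecutive sample points."""
    total = 0.0
    prev = loop(0.0)
    for j in range(1, steps + 1):
        z = loop(j / steps)
        total += cmath.phase(z / prev)  # lies in (-pi, pi] for close samples
        prev = z
    return round(total / (2 * math.pi))

# The loop t -> e^(2 pi i k t) winds k times around the circle.
assert winding_number(lambda t: cmath.exp(2j * math.pi * 3 * t)) == 3
assert winding_number(lambda t: cmath.exp(-2j * math.pi * t)) == -1
```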



Proposition 13.6. The group SU(2) is simply-connected. The group SO(3)
is not. In fact π1 (SO(3)) ≅ Z/2Z.

Proof. Note that SU(2) consists of the matrices

    (  a   b )
    ( −b̄   ā )        with |a|² + |b|² = 1,

so it is homeomorphic to the 3-sphere in C2 . As such, it is simply connected. We have a homomorphism
SU(2) −→ SO(3), which we constructed in Example 7.1. Since this mapping
induced an isomorphism of Lie algebras, its image is an open subgroup of
SO(3), and since SO(3) is connected, this homomorphism is surjective. The
kernel {±I} of this homomorphism is finite, so this is a covering map.
Because SU(2) is simply connected, it follows from the uniqueness of the simply
connected covering group that it is the universal covering group of SO(3).
The kernel of this homomorphism SU(2) −→ SO(3) is therefore the funda-
mental group, and it has order 2.
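One concrete way to realize the covering map SU(2) −→ SO(3) (in the spirit of, though not copied from, Example 7.1) is to let u act on the traceless Hermitian 2 × 2 matrices by X → uXu⁻¹ and take the matrix of this action in the Pauli basis. The sketch below (ours) checks that the image is a rotation, that the map is a homomorphism, and that u and −u have the same image, exhibiting the kernel {±I}.

```python
import numpy as np

# Pauli matrices: a basis of the traceless Hermitian 2x2 matrices.
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def rho(u):
    """3x3 matrix of X -> u X u* in the Pauli basis; lands in SO(3)."""
    return np.array([[0.5 * np.trace(sigma[i] @ u @ sigma[j] @ u.conj().T).real
                      for j in range(3)] for i in range(3)])

t = 0.7
u = np.array([[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]], dtype=complex)
u2 = np.diag([np.exp(0.3j), np.exp(-0.3j)])

Rm = rho(u)
assert np.allclose(Rm @ Rm.T, np.eye(3))            # orthogonal
assert np.isclose(np.linalg.det(Rm), 1.0)           # determinant 1
assert np.allclose(rho(u @ u2), rho(u) @ rho(u2))   # homomorphism
assert np.allclose(rho(-u), Rm)                     # u and -u have equal image
```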


Let G and H be topological groups. By a local homomorphism G −→ H we


mean the following data: a neighborhood U of the identity and a continuous
map φ : U −→ H such that φ(uv) = φ(u)φ(v) whenever u, v, and uv ∈ U .
This implies that φ(1G ) = 1H , so if u, u−1 ∈ U we have φ(u−1 ) = φ(u)−1 .
We may as well replace U by U ∩ U −1 so this is true for all u ∈ U .
13 The Universal Cover 87

Theorem 13.3. Let G and H be topological groups, and assume that G is


simply connected. Let U be a neighborhood of the identity in G. Then any
local homomorphism U −→ H can be extended to a homomorphism G −→ H.
Proof. Let g ∈ G. Let p : [0, 1] −→ G be a path with p(0) = 1G , p(1) = g.
(Such a path exists because G is path-connected.) We first show that there
exists a unique path q : [0, 1] −→ H such that q(0) = 1H , and

q(v) q(u)−1 = φ( p(v) p(u)−1 )    (13.1)

when u, v ∈ [0, 1] and |u − v| is sufficiently small. We note that when u and


v are sufficiently close, p(v)p(u)−1 ∈ U , so this makes sense. To construct a
path q with this property, find 0 = x0 < x1 < · · · < xn = 1 such that when
u and v lie in an interval [xi−1 , xi+1 ], we have p(v)p(u)−1 ∈ U (1 ≤ i < n).
Define q(x0 ) = 1H , and if v ∈ [xi , xi+1 ] define

q(v) = φ( p(v) p(xi )−1 ) q(xi ).    (13.2)

This definition is recursive because here q(xi ) is defined by (13.2) with i


replaced by i − 1 if i > 0. With this definition, (13.2) is actually true for
v ∈ [xi−1 , xi+1 ] if i ≥ 1. Indeed, if v ∈ [xi−1 , xi ] (the subinterval for which
this is not a definition), we have

q(v) = φ( p(v) p(xi−1 )−1 ) q(xi−1 ),

so what we need to show is that


    q(xi ) q(xi−1 )−1 = φ( p(v) p(xi )−1 )−1 φ( p(v) p(xi−1 )−1 ).

It follows from the fact that φ is a local homomorphism that the right-hand
side is

φ( p(xi ) p(xi−1 )−1 ).
Replacing i by i − 1 in (13.2) and taking v = xi , this equals q(xi )q(xi−1 )−1 .
Now (13.1) follows for this path by noting that if ε = (1/2) min |xi+1 − xi |, then
when |u − v| < ε, u, v ∈ [0, 1], there exists an i such that u, v ∈ [xi−1 , xi+1 ],
and (13.1) follows from (13.2) and the fact that φ is a local homomorphism.
This proves that the path q exists. To show that it is unique, assume
that (13.1) is valid for |u − v| < ε, and choose the xi so that |xi − xi+1 | < ε; then
for v ∈ [xi , xi+1 ], (13.2) is true, and the values of q are determined by this
property.
Next we indicate how one can show that if p and p′ are path-homotopic,
and if q and q ′ are the corresponding paths in H, then q(1) = q ′ (1). It is
sufficient to prove this in the special case of a path-homotopy t → pt , where
p0 = p and p1 = p′ , such that there exists a sequence 0 = x0 ≤ · · · ≤ xn = 1
with pt (u) pt′ (v)−1 ∈ U when u, v ∈ [xi−1 , xi+1 ] and t, t′ ∈ [0, 1]. For
although a general path-homotopy may not satisfy this assumption, it can be
broken into steps, each of which does. In this case, we define
88 13 The Universal Cover

    qt (v) = φ( pt (v) p(xi )−1 ) q(xi )

when v ∈ [xi , xi+1 ] and verify that this qt satisfies



    qt (v) qt (u)−1 = φ( pt (v) pt (u)−1 )

when |u − v| is small. In particular, this is satisfied when t = 1 and p1 =
p′ , so q1 = q ′ by definition. Now q ′ (1) = φ( p′ (1) p(1)−1 ) q(1) = q(1) since
p(1) = p′ (1), as required.
We now define φ(g) = q(1). Since G is simply connected, any two paths
from the identity to g are path-homotopic, so this is well-defined. It is
straightforward to see that it agrees with φ on U . We must show that it
is a homomorphism. Given g and g ′ in G, let p be a path from the identity
to g, and let p′ be a path from the identity to g ′ , and let q and q ′ be the
corresponding paths in H defined by (13.1). We construct a path p′′ from the
identity to gg ′ by

    p′′ (t) = p′ (2t)          if 0 ≤ t ≤ 1/2 ,
              p(2t − 1) g ′    if 1/2 ≤ t ≤ 1.

Let

    q ′′ (t) = q ′ (2t)             if 0 ≤ t ≤ 1/2 ,
               q(2t − 1) q ′ (1)    if 1/2 ≤ t ≤ 1.

Then it is easy to check that q ′′ is related to p′′ by (13.1), and taking t = 1,
we see that φ(gg ′ ) = q ′′ (1) = q(1) q ′ (1) = φ(g)φ(g ′ ).

We turn next to the computation of the fundamental groups of some
noncompact Lie groups.
As usual, we call a square complex matrix g Hermitian if g = ᵗḡ, the
transpose of its complex conjugate. The eigenvalues of a Hermitian matrix are
real, and it is called positive definite if these eigenvalues are positive. If g is
Hermitian, so are g² and e^g = I + g + (1/2)g² + · · · . According to the spectral
theorem, the Hermitian matrix g can be written kak −1 , where a is real and
diagonal and k is unitary. We have g² = ka²k −1 and e^g = ke^a k −1 , so g² and
e^g are positive definite.

Proposition 13.7.
(i) If g1 and g2 are positive definite Hermitian matrices, and if g1² = g2², then
g1 = g2 .
(ii) If g1 and g2 are Hermitian matrices and e^{g1} = e^{g2}, then g1 = g2 .

Proof. To prove (i), assume that the gi are positive definite and that g12 = g22 .
We may write gi = ki ai ki−1 , where ai is diagonal with positive entries, and we
may arrange it so the entries in ai are in descending order. Since a21 and a22 are
similar diagonal matrices with their entries in descending order, they are equal,
and since the squaring map on the positive reals is injective, a1 = a2 . Denote
a = a1 = a2. It is not necessarily true that k1 = k2, but denoting k = k1⁻¹k2,
k commutes with a². Let λ1 > λ2 > · · · be the distinct eigenvalues of a with
multiplicities d1, d2, . . . . Since k commutes with

a² = diag(λ1² Id1, λ2² Id2, . . .),

it has the form

k = diag(K1, K2, . . .),

where Ki is a di × di block. This implies that k commutes with a, and so
g2 = k2 a k2⁻¹ = k1 a k1⁻¹ = g1.
The proof assuming e^{g1} = e^{g2} is similar. It is no longer necessary to assume
that g1 and g2 are positive definite because (unlike the squaring map) the
exponential map is injective on all of R. □


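The spectral construction in this proof is directly computable. Below is a minimal sketch (assuming NumPy is available; `hermitian_sqrt` is our name, not from the text) that builds the positive definite square root of a positive definite Hermitian matrix and checks the properties used above.

```python
import numpy as np

def hermitian_sqrt(g):
    """Positive definite square root of a positive definite Hermitian g,
    computed from the spectral decomposition g = k diag(a) k*."""
    eigvals, k = np.linalg.eigh(g)           # g = k @ diag(eigvals) @ k*
    return (k * np.sqrt(eigvals)) @ k.conj().T

rng = np.random.default_rng(0)
m = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
g = m @ m.conj().T + np.eye(4)               # Hermitian and positive definite

p = hermitian_sqrt(g)
assert np.allclose(p, p.conj().T)            # p is Hermitian
assert np.all(np.linalg.eigvalsh(p) > 0)     # p is positive definite
assert np.allclose(p @ p, g)                 # p squared recovers g
```

By Proposition 13.7(i), p is the *only* positive definite Hermitian matrix whose square is g, so the result does not depend on the eigendecomposition chosen.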
Theorem 13.4. Let P be the space of positive definite Hermitian matrices.
If g ∈ GL(n, C), then g may be written uniquely as pk, where k ∈ U(n)
and p ∈ P. Moreover, the multiplication map P × U(n) −→ GL(n, C) is a
diffeomorphism.

This is one of several related decompositions referred to as the Cartan
decomposition. See Chap. 28 for related material.

Proof. The matrix g · t ḡ is positive definite and Hermitian, so by the spectral
theorem it can be diagonalized by a unitary matrix. This means we can write
g · t ḡ = κaκ⁻¹, where κ is unitary and a is a diagonal matrix with positive real
entries. We may take the square root of a, writing a = d², where d is another
diagonal matrix with positive real entries. Let p = κdκ⁻¹. Since t κ̄ = κ⁻¹, we
have g · t ḡ = κd²κ⁻¹ = p · t p̄ (since p is Hermitian), which implies that
k = p⁻¹g is unitary.
The existence of the decomposition is now proved. To see that it is unique,
suppose that pk = p′k′, where p and p′ are positive definite Hermitian
matrices, and k and k′ are unitary. To show that p = p′ and k = k′, we may
move the k′ to the other side, so it is sufficient to show that if pk = p′, then
p = p′. Taking the conjugate transpose, k⁻¹p = p′, so (p′)² = pk · k⁻¹p = p².
The uniqueness now follows from Proposition 13.7.
We now know that the multiplication map P × U(n) −→ GL(n, C) is a
bijection. To see that it is a diffeomorphism, we can use the inverse function
theorem. One must check that the Jacobian of the map is nonzero near any
given point (p0, k0) ∈ P × U(n). Let X0 be a fixed Hermitian matrix such that
exp(X0) = p0. Parametrize P by elements of the vector space p of Hermitian
matrices, which we map to P by the map p ∋ X −→ exp(X0 + X), and
parametrize U(n) by elements of u(n) by means of the map u(n) ∋ Y −→
exp(Y)k0. Noting that p and u(n) are complementary subspaces of gl(n, C),
it is clear using this parametrization of a neighborhood of (p0, k0) that the
Jacobian is nonzero there, and so the multiplication map is a diffeomorphism. □


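The proof of Theorem 13.4 is constructive and can be carried out numerically. A sketch (assuming NumPy; `cartan_decomposition` is our name) that decomposes g ∈ GL(n, C) as g = pk following the proof:

```python
import numpy as np

def cartan_decomposition(g):
    """Return (p, k) with g = p @ k, p positive definite Hermitian, k unitary,
    following the proof: diagonalize g g* = kappa a kappa^{-1} and set
    p = kappa sqrt(a) kappa^{-1}, k = p^{-1} g."""
    a, kappa = np.linalg.eigh(g @ g.conj().T)   # g g* is Hermitian positive definite
    p = (kappa * np.sqrt(a)) @ kappa.conj().T
    k = np.linalg.solve(p, g)                   # k = p^{-1} g
    return p, k

rng = np.random.default_rng(1)
g = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # almost surely invertible
p, k = cartan_decomposition(g)
assert np.allclose(p @ k, g)                                # g = pk
assert np.allclose(k @ k.conj().T, np.eye(3))               # k is unitary
assert np.allclose(p, p.conj().T)                           # p is Hermitian
assert np.all(np.linalg.eigvalsh(p) > 0)                    # p is positive definite
```

Since k k* = p⁻¹(g g*)p⁻¹ = p⁻¹p²p⁻¹ = I, the computed k is unitary exactly as in the proof.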

Theorem 13.5. We have

π1(GL(n, C)) ≅ π1(U(n)),    π1(SL(n, C)) ≅ π1(SU(n)),

and

π1(SL(n, R)) ≅ π1(SO(n)).

We have omitted GL(n, R) from this list because it is not connected. There is
a general principle here: the fundamental group of a connected Lie group is
often the same as the fundamental group of a maximal compact subgroup.

Proof. First, let G = GL(n, C), K = U(n), and P be the space of positive
definite Hermitian matrices. By the Cartan decomposition, multiplication K ×
P −→ G is a bijection, and in fact, a homeomorphism, so it will follow that
π1(K) ≅ π1(G) if we can show that P is contractible. However, the exponential
map from the space p of Hermitian matrices to P is bijective (in fact, a
homeomorphism) by Proposition 13.7, and the space p is a real vector space
and hence contractible.
For G = SL(n, C), one argues similarly, with K = SU(n) and P the space of
positive definite Hermitian matrices of determinant one. The exponential map
from the space p of Hermitian matrices of trace zero is again a homeomorphism
of a real vector space onto P .
Finally, for G = SL(n, R), one takes K = SO(n), P to be the space of
positive definite real matrices of determinant one, and p to be the space of
real symmetric matrices of trace zero.


The remainder of this chapter will be less self-contained, but can be skipped
with no loss of continuity. We will calculate the fundamental groups of SO(n)
and SU(n), making use of some facts from algebraic topology that we do not
prove. (These fundamental groups can alternatively be computed using the
method of Chap. 23. See Exercise 23.4.)
If G is a Hausdorff topological group and H is a closed subgroup, then
the coset space G/H is a Hausdorff space with the quotient topology. Such a
quotient is called a homogeneous space.

Proposition 13.8. Let G be a Lie group and H a closed subgroup. If the
homogeneous space G/H is homeomorphic to a sphere S^r where r ≥ 3, then
π1(G) ≅ π1(H).

Proof. The map G −→ G/H is a fibration (Spanier [149], Example 4 on p. 91


and Corollary 14 on p. 96). It follows that there is an exact sequence
π2 (G/H) −→ π1 (H) −→ π1 (G) −→ π1 (G/H)

(Spanier [149], Theorem 10 on p. 377). Since G/H is a sphere of dimension
≥ 3, its first and second homotopy groups are trivial and the result follows. □



Theorem 13.6. The groups SU(n) are simply connected for all n. On the
other hand,

π1(SO(n)) ≅ Z if n = 2,    and    π1(SO(n)) ≅ Z/2Z if n > 2.

Proof. SO(2) is a circle, so its fundamental group is Z. By Proposition 13.6,
π1(SO(3)) ≅ Z/2Z and π1(SU(2)) is trivial. The group SO(n) acts transitively
on the unit sphere S^{n−1} in R^n, and the isotropy subgroup is SO(n − 1), so
SO(n)/SO(n − 1) is homeomorphic to S^{n−1}. By Proposition 13.8, we see that
π1(SO(n)) = π1(SO(n − 1)) if n ≥ 4. Similarly, SU(n) acts on the unit sphere
S^{2n−1} in C^n, and so SU(n)/SU(n − 1) ≅ S^{2n−1}, whence π1(SU(n)) ≅
π1(SU(n − 1)) for n ≥ 2. □


If n ≥ 3, the universal covering group of SO(n) is called the spin group and is
denoted Spin(n). We will take a closer look at it in Chap. 31.

Exercises
Exercise 13.1. Let G̃ be the universal covering group of SL(2, R), and let
π : G̃ −→ GL(V) be any finite-dimensional irreducible representation. Show that
π factors through SL(2, R) and is hence not a faithful representation. (Hint: Use
Exercise 12.2.)
14
The Local Frobenius Theorem

Let M be an n-dimensional smooth manifold. The tangent bundle T M of


M is the disjoint union of all tangent spaces of points of M . It can be
given the structure of a manifold of dimension 2 dim(M ) as follows. If U
is a coordinate neighborhood and x1 , . . . , xn are local coordinates on U , then
T (U ) = {Tx M | x ∈ U } can be taken to be a coordinate neighborhood of T M .
Every element of Tx M with x ∈ U can be written uniquely as

Σ_{i=1}^{n} ai ∂/∂xi,

and mapping this tangent vector to (x1, . . . , xn, a1, . . . , an) ∈ R^{2n} gives a chart
on T(U), making TM into a manifold.
By a d-dimensional family D in the tangent bundle of M we mean a rule
that associates with each x ∈ M a d-dimensional subspace Dx ⊂ Tx (M ).
We ask that the family be smooth. By this we mean that in a neighborhood
U of any given point x there are smooth vector fields X1 , . . . , Xd such that
for u ∈ U the vectors Xi,u ∈ Tu (M ) span Du .
We say that a vector field X is subordinate to the family D if Xx ∈ Dx
for all x ∈ U . The family is called involutory if whenever X and Y are vector
fields subordinate to D then so is [X, Y ]. This definition is motivated by the
following considerations.
An integral manifold of the family D is a d-dimensional submanifold N
such that, for each point x ∈ N , the tangent space Tx (N ), identified with its
image in Tx (M ), is Dx . We may ask whether it is possible, at least locally in
a neighborhood of every point, to pass an integral manifold. This is surely a
natural question.
Let us observe that if it is true, then the family D is involutory. To see
this (at least plausibly), let U be an open set in M that is small enough that
through each point in U there is an integral submanifold that is closed in U .
Let J be the subspace of C ∞ (U ) consisting of functions that are constant on
these integral submanifolds. Then the restriction of a vector field X to U is

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 93


DOI 10.1007/978-1-4614-8024-2 14, © Springer Science+Business Media New York 2013
subordinate to D if and only if it annihilates J. It is clear from (6.6) that if
X and Y have this property, then so does [X, Y ].
The Frobenius theorem is a converse to this observation. A global version
may be found in Chevalley [35]. We will content ourselves with the local
theorem.

Lemma 14.1. If X1, . . . , Xd are vector fields on M such that [Xi, Xj] lies in
the C∞(M) span of X1, . . . , Xd, and if for each x ∈ M we define Dx to be
the span of X1x, . . . , Xdx, then D is an involutory family.

Proof. Any vector field subordinate to D has the form (locally near x)
Σ_i fi Xi, where the fi are smooth functions. To check that the commutator of
two such vector fields is also of the same form amounts to using the formula

[f X, gY ] = f g[X, Y ] + f X(g)Y − g Y(f)X,

which follows easily on applying both sides to a function h and using the fact
that X and Y are derivations of C∞(M). □


Theorem 14.1 (Frobenius). Let D be a smooth involutory d-dimensional
family in the tangent bundle of M. Then for each point x ∈ M there exists a
neighborhood U of x and an integral manifold N of D through x in U. If N′
is another integral manifold through x, then N and N′ coincide near x. That
is, there exists a neighborhood V of x such that V ∩ N = V ∩ N′.

Proof. Since this is a strictly local statement, it is sufficient to prove this when
M is an open set in R^n and x is the origin.
We show first that if X is a vector field that does not vanish at x, then
we may find a system y1, . . . , yn of coordinates in which X = ∂/∂yn. Let
x1, . . . , xn be the standard Cartesian functions. Since X does not vanish at
the origin, the function X(xi) does not vanish at the origin for some i, so after
permuting the variables if necessary, we may assume that X(xn) ≠ 0. Write

X = a1 ∂/∂x1 + · · · + an ∂/∂xn

in terms of smooth functions ai = ai(x1, . . . , xn). Then an(0, . . . , 0) ≠ 0.
The new coordinate system y1 , . . . , yn will have the property that

(y1 , . . . , yn−1 , 0) = (x1 , . . . , xn−1 , 0).

To describe (y1 , . . . , yn ) when yn = 0, let us fix small numbers u1 , . . . , un−1 .


Then we will describe the path which is, in the y coordinates,

t −→ (u1 , . . . , un−1 , t).

This path is to be an integral curve for the vector field through the point
(u1 , . . . , un−1 , 0). By Proposition 8.1 a unique such path exists (for t small).
Thus, we have a path that is (in the x coordinates) t −→ (x1(t), . . . , xn(t)),
satisfying the first-order system

xi′(t) = ai(x1(t), . . . , xn(t)),        (14.1)
(x1(0), . . . , xn(0)) = (u1, . . . , un−1, 0).

For u1, . . . , un−1 sufficiently small, we have an(u1, . . . , un−1, 0) ≠ 0 and so
this integral curve is transverse to the hyperplane xn = 0. We choose our
coordinate system y1, . . . , yn so that

yi(x1(t), . . . , xn(t)) = ui,    (i = 1, 2, 3, . . . , n − 1),
yn(x1(t), . . . , xn(t)) = t.

Now ∂xi/∂yn = ai because the partial derivative is the derivative along one
of the paths (14.1). Thus

∂/∂yn = Σ_i (∂xi/∂yn) ∂/∂xi = Σ_i ai ∂/∂xi = X.

This proves that there exists a coordinate system in which X = ∂/∂yn.


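A concrete instance of the straightening just carried out (the specific field and coordinates are our illustration, not from the text): for X = x2 ∂/∂x1 + ∂/∂x2 on R², the integral curve through (u, 0) is t → (u + t²/2, t), and the recipe yields y1 = x1 − x2²/2, y2 = x2, in which X = ∂/∂y2. A sketch verifying this (assuming SymPy is available):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# The vector field X = x2 d/dx1 + d/dx2 (so a1 = x2, a2 = 1, a2(0, 0) != 0),
# acting on a function f as a derivation.
def X(f):
    return x2 * sp.diff(f, x1) + sp.diff(f, x2)

# Integral curve through (u, 0): x1(t) = u + t^2/2, x2(t) = t,
# giving the straightening coordinates:
y1 = x1 - x2**2 / 2
y2 = x2

assert sp.simplify(X(y1)) == 0   # y1 is constant along the flow
assert sp.simplify(X(y2)) == 1   # y2 advances at unit speed, so X = d/dy2
```

The two assertions say exactly that, in the (y1, y2) coordinates, X = ∂/∂y2.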
If d = 1, the result is proved by this. We will assume that d > 1 and that
the existence of integral manifolds is known for lower-dimensional involutory
families. Let X1, . . . , Xd be smooth vector fields such that the Xi,u span Du for
u near the origin. We have just shown that we may assume that X = Xd =
∂/∂yn. Since D is involutory, [Xd, Xi] = Σ_j gij Xj for smooth functions gij.
We will show that we can arrange things so that gid = 0 when i < d; that is,

[Xd, Xi] = Σ_{j=1}^{d−1} gij Xj,    (i < d).        (14.2)

Indeed, writing

Xi = Σ_{k=1}^{n} hik ∂/∂yk,    (i = 1, . . . , d − 1),        (14.3)

we will still have a spanning set if we subtract hin Xd from Xi. We may
therefore assume that hin = 0 for i < d. Thus

Xi = Σ_{k=1}^{n−1} hik ∂/∂yk,    (i = 1, . . . , d − 1).        (14.4)

In other words, we may assume that Xi does not involve ∂/∂yn for i < d.
Now

[Xd, Xi] = Σ_{k=1}^{n−1} (∂hik/∂yn) ∂/∂yk.        (14.5)
On the other hand, we have

[Xd, Xi] = Σ_{j=1}^{d−1} gij Xj + gid Xd = Σ_{j=1}^{d−1} Σ_{k=1}^{n−1} gij hjk ∂/∂yk + gid ∂/∂yn.

Comparing the coefficients of ∂/∂yn in this expression with that in (14.5)
shows that gid = 0, proving (14.2).
Next we show that if (c1, . . . , cd−1) are real constants, then there exist
smooth functions f1, . . . , fd−1 such that for small y1, . . . , yn−1 we have

fi(y1, y2, . . . , yn−1, 0) = ci,    (i = 1, . . . , d − 1),        (14.6)

and

[Xd, Σ_{i=1}^{d−1} fi Xi] = 0.

Indeed,

[Xd, Σ_{i=1}^{d−1} fi Xi] = Σ_{i=1}^{d−1} (∂fi/∂yn) Xi + Σ_{i,j=1}^{d−1} fi gij Xj.

For this to be zero, we need the fi to be solutions to the first-order system

∂fj/∂yn + Σ_{i=1}^{d−1} gij fi = 0,    j = 1, . . . , d − 1.

This first-order system has a solution locally with the prescribed initial
condition. Since the ci can be arbitrary, we may choose

ci = 1 if i = 1,    ci = 0 otherwise.

Then the vector field Σ_i fi Xi agrees with X1 on the hyperplane yn = 0.
Replacing X1 by Σ_i fi Xi, we may therefore assume that [Xd, X1] = 0.
Repeating this process, we may similarly assume that [Xd , Xi ] = 0 for all
i < d. Now with the hij as in (14.3), this means that ∂hij /∂yn = 0, so the hij
are independent of yn .
Since the hij are independent of yn , we may interpret (14.4) as defining
d − 1 vector fields on Rn−1 . They span a (d − 1)-dimensional involutory family
of tangent vectors in Rn−1 and by induction there exists an integral manifold
for this vector field. If this manifold is N0 ⊂ Rn−1 , then it is clear that

N = {(y1 , . . . , yn ) | (y1 , . . . , yn−1 ) ∈ N0 }

is an integral manifold for D.


We have established the existence of an integral submanifold. The local
uniqueness of the integral submanifold can also be proved now. In fact, if we
repeat the process by which we selected the coordinate system y1, . . . , yn so
that the vector field ∂/∂yn was subordinate to the involutory family D, we
eventually arrive at a system in which D is spanned by ∂/∂yn−d+1, . . . , ∂/∂yn.
Then the integral manifold is given by the equations y1 = · · · = yn−d = 0. □
If G is a Lie group, a local subgroup of G consists of an open neighborhood U
of the identity and a closed subset K of U such that 1G ∈ K, and if x, y ∈ K
such that xy ∈ U , then xy ∈ K, and if x ∈ K such that x−1 ∈ U , then
x−1 ∈ K. For example, if H is a closed subgroup of G and U is any open set,
then U ∩ H is a local subgroup.
Proposition 14.1. Let G be a Lie group with Lie algebra g, and let k be a
Lie subalgebra of g. Then there exists a local subgroup K of G whose tangent
space at the identity is k. The exponential map sends a neighborhood of
0 in k onto a neighborhood of the identity in K.
Proof. The Lie algebra g of G has two incarnations: as the tangent space to
the identity of G and as the set of left-invariant vector fields. For definiteness,
we identify g = Te (G) and recall how the left-invariant vector field arises.
If g ∈ G, let λg : G −→ G be left translation by g, so that λg (x) = gx.
Let λg∗ : Te (G) −→ Tx (G) be the induced map of tangent spaces. Then the
left-invariant vector field associated with Xe ∈ g has Xg = λg∗ (Xe ).
Let d = dim(k) and let D be the d-dimensional family of tangent vectors
such that Dg = λg∗ (k). Since k is closed under the bracket, it follows from
Lemma 14.1 that D is involutory, so there exists an integral submanifold K
in a neighborhood U of the identity. We will show that if U is sufficiently
small, then K is a local group.
Indeed, let x and y be elements of K such that xy ∈ U . Since the vector
fields associated with elements of k are left-invariant, the involutory family D
is invariant under left translation. The image of K under right translation by
x is also an integral submanifold of D through x, so this submanifold is K
itself. These submanifolds therefore coincide near x and, since y is in K, its
left translate xy by x is also in K.
Since the one-parameter subgroups exp(tX) with X ∈ k are tangent to
the left-invariant vector field at every point, they are contained in the integral
submanifold K near the identity, and the image of a neighborhood of the
identity under exp is a manifold of the same dimension as K, so the last
statement is clear.

We recall that the notion of a local homomorphism was defined in Chap. 13
before Theorem 13.3.
Proposition 14.2. Let G and H be Lie groups with Lie algebras g and h,
respectively, and let π : g −→ h be a Lie algebra homomorphism. Then there
exists a neighborhood U of the identity in G and a local homomorphism
π : U −→ H whose differential is π.
Proof. The tangent space to G × H at the identity is g ⊕ h. Let

k = {(X, π(X)) | X ∈ g} ⊂ g ⊕ h.

It is a Lie subalgebra, corresponding by Proposition 14.1 to a local subgroup
K of G × H. The tangent space to the identity of K is thus its Lie algebra k,
which intersects h in g ⊕ h transversally in a single point. Thus g ⊕ h is the
direct sum of k and h. Concretely, this reflects the fact that k is the graph of a map
π : g −→ h. Using the inverse function theorem, the same is true locally of
K: since its tangent space at the identity is a direct sum complement of the
tangent space of H in the tangent space of G × H, it is, locally, the graph
of a mapping. Thus, there exists a map π : U −→ H of a sufficiently small
neighborhood of the identity in G such that if (g, h) ∈ G × H, g ∈ U , and
h ∈ π(U ), then (g, h) ∈ K if and only if h = π(g). Because K is a local
subgroup, this implies that π is a local homomorphism.


Theorem 14.2. Let G and H be Lie groups with Lie algebras g and h, respec-
tively, and let π : g −→ h be a Lie algebra homomorphism. Assume that G is
simply connected. Then there exists a Lie group homomorphism π : G −→ H
with differential π.

Proof. This follows from Proposition 14.2 and Theorem 13.3.




We can now give another proof of Theorem 12.2. The basic idea here is to
use a compact subgroup to prove the complete reducibility of some class of
representations of a noncompact group. This idea was called the “Unitarian
Trick” by Hermann Weyl. We will extend the validity of Theorem 12.2, though
the algebraic method would work as well for this.

Theorem 14.3. Let G and K be Lie groups with Lie algebras g and k. Assume
K is compact and simply connected. Suppose that g and k have isomorphic
complexifications. Then every finite-dimensional irreducible complex represen-
tation of g is completely reducible. If G is connected, then every irreducible
complex representation of G is completely reducible.

Proof. Let (π, V) be a finite-dimensional representation of G, and let W be
a proper nonzero invariant subspace. We will show that there is another
invariant subspace W′ such that V = W ⊕ W′. By induction on dim(V),
it will follow that both W and W′ are direct sums of irreducible
representations.
The differential of π is a complex representation of g. As in Proposition
11.3, we may extend it to a representation of gC ≅ kC and then restrict it to k.
Since K is simply connected, the resulting Lie algebra homomorphism k −→
gl(V) is the differential of a Lie group homomorphism πK : K −→ GL(V).
Now, because K is compact, this representation of K is completely
reducible (Proposition 2.2). Thus, there exists a K-invariant subspace W′
such that V = W ⊕ W′. Of course, W′ is also invariant with respect to k

and hence kC =∼ gC , and hence g. It is therefore invariant under exp(g). If G


is connected, it is generated by a neighborhood of the identity, and so W  is
G-invariant.


Theorem 14.4. Let (π, V ) be a finite-dimensional irreducible complex repre-


sentation of g = sl(n, R), su(n), or sl(n, C). If g is sl(n, C) then assume that
π : g −→ End(V ) is complex linear. Then π is completely reducible.

Proof. We will prove this for sl(n, R) and su(n). For sl(n, R), we can take
G = SL(n, R) and K = SU(n). For su(n), we can take G = K = SU(n). In
either case K is simply connected by Theorem 13.6, so the hypotheses of
Theorem 14.3 are satisfied.
The case of sl(n, C) requires a minor modification to Theorem 14.3 and is
left to the reader. □


Theorem 14.5. Let (π, V ) be a finite-dimensional irreducible complex repre-


sentation of SL(n, R). Then π is completely reducible.

Proof. We take G = SL(n, R), K = SU(n).




Exercises
Exercise 14.1. Let G be a connected complex analytic Lie group, and let K ⊂ G be
a compact Lie subgroup. Let g and k ⊂ g be the Lie algebras of G and K, respectively.
Assume that g is the complexification of k and that K is simply-connected. Prove
that every finite-dimensional irreducible complex representation of g is completely
reducible. Show also that if G is connected, every irreducible complex analytic
representation of G is completely reducible.
15
Tori

A complex manifold M is constructed analogously to a smooth manifold. We


specify an atlas U = {(U, φ)}, where each chart U ⊂ M is an open set and
φ : U −→ Cm is a homeomorphism of U onto its image that is assumed to be
open in Cm . It is assumed that the transition functions ψ ◦ φ−1 : φ(U ∩ V ) −→
ψ(U ∩ V ) are holomorphic for any two charts (U, φ) and (V, ψ). A complex Lie
group (or complex analytic group) is a Hausdorff topological group that is a
complex manifold in which the multiplication and inversion maps G×G −→ G
and G −→ G are holomorphic. The Lie algebra of a complex Lie group is a
complex Lie algebra. For example, GL(n, C) is a complex Lie group.
If g is a Lie algebra and X, Y ∈ g, we say that X and Y commute if
[X, Y ] = 0. We call the Lie algebra g Abelian if [X, Y ] = 0 for all X, Y ∈ g.

Proposition 15.1. The Lie algebra of an Abelian Lie group is Abelian.

Proof. The action of G on itself by conjugation is trivial, so the induced
action Ad of G on its Lie algebra is trivial. By Theorem 8.2, it follows that
ad : Lie(G) −→ End(Lie(G)) is the zero map, so [X, Y ] = ad(X)Y = 0. □


Proposition 15.2. If G is a Lie group, and X and Y are commuting elements


of Lie(G), then eX+Y = eX eY . In particular, eX eY = eY eX .

Proof. First note that, since the differential of Ad is ad (Theorem 8.2),


Ad(etX )Y = Y for all t. Recalling that Ad(etX ) is the endomorphism of
Lie(G) induced by conjugation, this means that conjugation by etX takes the
one-parameter subgroup u −→ euY to itself, so etX euY e−tX = euY . Thus, etX
and euY commute for all real t and u.
We recall from Chap. 8 that the path p(t) = etY is characterized by the
fact that p(0) = 1G , while p∗ (d/dt) = Yp(t) . The latter condition means that
if f ∈ C ∞ (G) we have

(d/dt) f(p(t)) = (Y f)(p(t)).

Let q(t, u) = etX euY . The vector field Y is invariant under left translation, in
particular left translation by etX , so

(∂/∂u) f(q(t, u)) = (Y f)(etX euY ).

Similarly (making use of etX euY = euY etX ),

(∂/∂t) f(q(t, u)) = (Xf)(etX euY ).

Now, by the chain rule,

(d/dv) f(q(v, v)) = (∂/∂t) f(q(t, u))|_{t=u=v} + (∂/∂u) f(q(t, u))|_{t=u=v} = (Y f + Xf)(q(v, v)).

This means that the path v −→ r(v) = q(v, v) satisfies r∗(d/dv) = (X + Y)_{r(v)},
whence e^{v(X+Y)} = e^{vX} e^{vY}. Taking v = 1, the result is proved. □

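Proposition 15.2 is easy to test numerically. A sketch assuming NumPy, with a hand-rolled truncated Taylor series standing in for a library matrix exponential:

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small A)."""
    result, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

J = np.array([[0.0, 1.0], [-1.0, 0.0]])      # rotation generator
X, Y = 0.3 * J, 0.5 * J                      # scalar multiples commute: [X, Y] = 0
assert np.allclose(expm(X + Y), expm(X) @ expm(Y))

N = np.array([[0.0, 1.0], [0.0, 0.0]])       # [J, N] != 0
assert not np.allclose(expm(0.3 * J + N), expm(0.3 * J) @ expm(N))
```

The second assertion shows that the hypothesis [X, Y] = 0 cannot be dropped: for noncommuting matrices, e^{X+Y} and e^X e^Y differ already at second order.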
A compact torus is a compact connected Lie group that is Abelian. In the
context of Lie group theory a compact torus is usually just called a torus,
though in the context of algebraic groups the term “torus” is used slightly
differently.
For example, 𝕋 = {z ∈ C× : |z| = 1} is a torus. This group is isomorphic to
R/Z. Even though R and Z are additive groups, we may, during the following
discussion, sometimes write the group law in R/Z multiplicatively.
Proposition 15.3. Let T be a torus, and let t be its Lie algebra. Then exp :
t −→ T is a homomorphism, and its kernel is a lattice. We have T ≅ (R/Z)^r ≅
𝕋^r, where r is the dimension of T.

Proof. Let t be the Lie algebra of T. Since T is Abelian, so is t, and by
Proposition 15.2, exp is a homomorphism from the additive group t to T. The
kernel Λ ⊂ t is discrete since exp is a local homeomorphism, and Λ is cocompact
since T is compact. Thus, Λ is a lattice and T ≅ t/Λ ≅ (R/Z)^r ≅ 𝕋^r. □

A character of R^r of the form

(x1, . . . , xr) → e^{2πi Σ_j kj xj},        (15.1)

where (k1, . . . , kr) ∈ Z^r, induces a character on (R/Z)^r.


Proposition 15.4. Every irreducible complex representation of (R/Z)r coin-
cides with (15.1) for suitable ki ∈ Z.

Proof. By classical Fourier analysis, these characters span L²((R/Z)^r). Thus,
the character χ of any complex representation π is not orthogonal to (15.1) for
some (k1 , . . . , kr ) ∈ Zr . By Schur orthogonality, χ agrees with this character.


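The Schur orthogonality used in this proof can be seen numerically. A sketch (assuming NumPy; the grid size is arbitrary) approximating the L² inner product of two characters (15.1) on (R/Z)² by a grid average, which is exact whenever the frequencies involved are smaller than the grid size:

```python
import numpy as np

def char_inner_product(k, l, N=16):
    """<chi_k, chi_l> on (R/Z)^2, approximated by averaging over an N x N grid.
    The average is exact as long as |k_i - l_i| < N."""
    a = np.arange(N) / N
    x1, x2 = np.meshgrid(a, a)
    chi = lambda m: np.exp(2j * np.pi * (m[0] * x1 + m[1] * x2))
    return np.mean(chi(k) * np.conj(chi(l)))

assert np.isclose(char_inner_product((2, 3), (2, 3)), 1.0)   # same character
assert np.isclose(char_inner_product((2, 3), (1, -1)), 0.0)  # distinct characters
```

Distinct characters are exactly orthogonal here because the grid sum of e^{2πi m a/N} over a = 0, . . . , N − 1 vanishes unless N divides m.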
We also want to know the irreducible real representations of (R/Z)^r. Let
k1, . . . , kr ∈ Z be given. Assume that they are not all zero. The complex
character (15.1) is not a real representation. However, regarding it as a
homomorphism (R/Z)^r −→ 𝕋, we may compose it with the real representation

𝕋 ∋ t = e^{2πiθ} →  [  cos(2πθ)   sin(2πθ) ]
                    [ −sin(2πθ)   cos(2πθ) ]

of 𝕋. We obtain a real representation

(x1, . . . , xr) →  [  cos(2π Σ ki xi)   sin(2π Σ ki xi) ]        (15.2)
                    [ −sin(2π Σ ki xi)   cos(2π Σ ki xi) ].

Proposition 15.5. Let T = (R/Z)^r and let (π, V) be an irreducible real
representation. Then either π is trivial or π is two-dimensional and is one of
the irreducible representations (15.2) with ki ∈ Z not all zero. In the two-dimensional
case the complexified module VC = C ⊗ V decomposes into two
one-dimensional representations corresponding to a character and its inverse.

Proof. It is straightforward to see that the real representation (15.2) is
irreducible. The completeness of this set of irreducible real representations
follows from the corresponding classification of the irreducible complex
characters (Proposition 15.4). It is also easy to see that the complexified
representation is equivalent to

(x1, . . . , xr) → diag(e^{2πi Σ ki xi}, e^{−2πi Σ ki xi}). □


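The diagonalization in the last sentence of the proof can be verified numerically: the rotation matrix (15.2) has eigenvalues e^{±2πi Σ ki xi}. A sketch assuming NumPy, with arbitrary sample values for the ki and xi:

```python
import numpy as np

# theta = 2*pi*sum(k_i x_i) with illustrative k = (3, 1), x = (0.21, 0.55)
theta = 2 * np.pi * (3 * 0.21 + 1 * 0.55)
R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

eigvals = np.linalg.eigvals(R)
expected = {np.exp(1j * theta), np.exp(-1j * theta)}
# each eigenvalue matches e^{i theta} or e^{-i theta}
assert all(min(abs(ev - w) for w in expected) < 1e-9 for ev in eigvals)
```

Over C the matrix is conjugate to diag(e^{iθ}, e^{−iθ}), which is the decomposition into a character and its inverse.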

If T is a compact torus, we will associate with T a complex analytic group TC,
which we call the complexification of T. Let tC = C ⊗ t be the complexification
of the Lie algebra, and let TC = tC /Λ, where Λ ⊂ t is the kernel of exp : t −→
T . It is easy to see that this construction is functorial: given a homomorphism
φ : T −→ U of compact tori, the differential φ∗ : Lie(T ) −→ Lie(U ) commutes
with the exponential map, so φ∗ kills the kernel Λ of exp : t −→ T . Therefore,
there is an induced map TC −→ UC .
If we identify T = (R/Z)^r, the complexification is TC ≅ (C/Z)^r. Since
x −→ e^{2πix} induces an isomorphism of the additive group C/Z with the
multiplicative group C×, we see that TC ≅ (C×)^r. We call any complex Lie
group isomorphic to (C×)^r for some r a complex torus.
By a linear character χ of a compact torus T , we mean a continuous
homomorphism T −→ C×. These are just the characters of irreducible repre-
sentations, known explicitly by (15.1). They take values in 𝕋, as we may see
from (15.1), or by noting that the image is a compact subgroup of C× .
By a rational character χ of a complex torus T , we mean an analytic
homomorphism T −→ C× .

Proposition 15.6. Let T be a compact torus. Then any linear character χ of


T extends uniquely to a rational character of TC .
Proof. Without loss of generality, we may assume that T = (R/Z)^r and that
TC = (C×)^r, where the embedding T −→ TC is the map (x1, . . . , xr) −→
(e^{2πix1}, . . . , e^{2πixr}). Every linear character of T is given by (15.1) for
suitable ki ∈ Z, and this extends to the rational character (t1, . . . , tr) −→ ∏ ti^{ki}
of TC. Since a rational character is holomorphic, it is determined by its values
on the image 𝕋^r of T. □
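Proposition 15.6 in coordinates: the linear character with exponents ki and its rational extension ∏ ti^{ki} agree on the image of T in TC. A quick check using only the standard library (the exponents and sample point below are arbitrary illustrations):

```python
import cmath

k = (2, -1, 3)                    # exponents of the character
x = (0.13, 0.58, 0.91)            # a sample point of T = (R/Z)^3

# The linear character (15.1) evaluated at x:
linear = cmath.exp(2j * cmath.pi * sum(ki * xi for ki, xi in zip(k, x)))

# The image of x in T_C = (C*)^3, and the rational character there:
t = [cmath.exp(2j * cmath.pi * xi) for xi in x]
rational = 1
for ti, ki in zip(t, k):
    rational *= ti ** ki

assert abs(linear - rational) < 1e-12
```

Both values lie on the unit circle, as the text observes for linear characters of a compact torus.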

We will denote the group of characters of a compact torus T as X∗(T). We will
denote its group law additively: if χ1 and χ2 are characters, then (χ1 + χ2)(t) =
χ1(t)χ2(t). We may identify X∗(T) with the group of rational characters of TC.
A (topological) generator of a compact torus T is an element t such that
the smallest closed subgroup of T containing t is T itself.

Theorem 15.1 (Kronecker). Let (t1, . . . , tr) ∈ R^r, and let t be the image of
this point in T = (R/Z)^r. Then t is a generator of T if and only if 1, t1, . . . , tr
are linearly independent over Q.

Proof. Let H be the closure of the group ⟨t⟩ generated by t in T = (R/Z)^r.
Then T/H is a compact Abelian group, and if it is not reduced to the identity
it has a nontrivial character χ. We may regard this as a character of T that is
trivial on H, and as such it has the form (15.1) for suitable ki ∈ Z, not all zero.
Since t itself is in H, this means that Σ_j kj tj ∈ Z, so 1, t1, . . . , tr are linearly
dependent. The existence of nontrivial characters of T/H is thus equivalent
to the linear dependence of 1, t1, . . . , tr, and the result follows. □

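Kronecker's theorem can be illustrated numerically (a sketch using only the standard library; the grid resolution and step counts are arbitrary choices): multiples of a point whose coordinates are rationally independent from 1 spread over the whole torus, while a rational point generates a finite subgroup.

```python
from fractions import Fraction
import math

def orbit(t, n_steps):
    """The first n_steps multiples of t, reduced mod 1 in each coordinate."""
    return [tuple((k * c) % 1 for c in t) for k in range(n_steps)]

# 1, sqrt(2), sqrt(3) are linearly independent over Q, so t generates a dense subgroup:
t = (math.sqrt(2), math.sqrt(3))
cells = {tuple(int(10 * c) for c in p) for p in orbit(t, 20000)}
assert len(cells) == 100                 # every cell of a 10 x 10 grid is visited

# A rational point generates a finite subgroup (exact arithmetic via Fraction):
r = (Fraction(1, 3), Fraction(1, 2))
assert len(set(orbit(r, 20000))) == 6    # only lcm(3, 2) = 6 distinct multiples
```

This matches Corollary 15.1: points generating dense subgroups are the generic case, while rational points are the countable exceptions.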

Corollary 15.1. Each compact torus T has a generator. Indeed, generators


are dense in T .

Proof. We may assume that T = (R/Z)^r. By Kronecker's Theorem 15.1, what
we must show is that r-tuples (t1, . . . , tr) such that 1, t1, . . . , tr are linearly
independent over Q are dense in Rr . If 1, t1 , . . . , ti−1 are linearly independent,
then linear independence of 1, t1 , . . . , ti excludes only countably many ti , and
the result follows from the uncountability of R.


Proposition 15.7. Let T = (R/Z)^r.
(i) Each automorphism of T is of the form t −→ M t (mod Z^r), where M ∈
GL(r, Z). Thus, Aut(T) ≅ GL(r, Z).
(ii) If H is a connected topological space and f : H −→ Aut(T) is a map such
that (h, t) −→ f(h)t is a continuous map H × T −→ T, then f is constant.

We can express (ii) by saying that Aut(T ) is discrete since if it is given the
discrete topology, then (h, t) −→ f (h)t is continuous if and only if f is locally
constant.

Proof. If φ : T −→ T is an automorphism, then φ induces an invertible


linear transformation M of the Lie algebra t of T that commutes with the
exponential map. Because T is Abelian, the exponential map exp : t → T is
a group homomorphism, and φ must preserve its kernel Λ. We may identify


t = Rr in such a way that Λ is identified with Zr , in which case the matrix of
M must lie in GL(r, Z). Part (i) is now clear.
For part (ii), since T is compact and f is continuous, as h −→ h1 , f (h)t −→
f (h1 )t uniformly for t ∈ T . It is easy to see from (i) that this is impossible
unless f is locally constant.


In the remainder of this chapter, we will consider tori embedded in Lie groups.
First, we prove a general statement that implies the existence of tori.

Theorem 15.2. Let G be a Lie group and H a closed Abelian subgroup. Then
H is a Lie subgroup of G. If G is compact, then the connected component of
the identity in H is a torus.

The assumption that H is Abelian is unnecessary. See Remark 7.2 for refer-
ences to a result without this assumption.

Proof. Let g = Lie(G). The exponential map g −→ G is a local homeomor-


phism near the origin. Let U be a neighborhood of 0 ∈ g such that exp has a
smooth inverse log : exp(U ) −→ U . Let

h = {X ∈ g | exp(tX) ∈ H for all t ∈ R}.

Lemma 15.1. If X ∈ h and Y ∈ U, and if eY ∈ H, then [X, Y ] = 0.

To prove the lemma, note that for any t, etX and eY both lie in H and
commute, so eY = etX eY e−tX = exp(Ad(etX)Y). If t is small enough, both Y
and Ad(etX)Y are in U, so applying log we have Ad(etX)Y = Y. By Theorem
8.2, it follows that ad(X)Y = 0, proving the lemma.
Let us now show that h is an Abelian Lie algebra. It is clearly closed under
scalar multiplication. If X and Y are in h, then etY ∈ H and tY ∈ U for small
enough t, so by the lemma [X, tY ] = 0. Thus, [X, Y ] = 0. By Proposition 15.2
we have et(X+Y ) = etX etY for all t, so X + Y ∈ h.
Now we will show that there exists a neighborhood V of the identity in G
such that V ⊆ exp(U ) and V ∩ H = {exp(X) | X ∈ h ∩ log(V )}. This will show
that V ∩ H is a smooth locally closed submanifold of G. Since every point of
H has a neighborhood diffeomorphic to this neighborhood of the identity, it
will follow that H is a submanifold of G and hence a Lie subgroup.
It is clear that, for each open neighborhood V of the identity contained in exp(U ), we
have V ∩ H ⊇ {exp(X) | X ∈ h ∩ log(V )}. If this inclusion is proper for every
V , then there exists a sequence {hn } ⊂ H ∩ exp(U ) such that hn −→ 1 but
log(hn ) ∉ h. We write log(hn ) = Xn . Thus, Xn → 0.
Let us write g = h ⊕ p, where p is a vector subspace. We will show that
we may choose Xn ∈ p. Write Xn = Yn + Zn , where Yn ∈ h and Zn ∈ p. By
the lemma, [Xn , Yn ] = 0, so eZn = eXn e−Yn ∈ H. We may replace Xn by Zn

and hn by eZn , and we still have hn −→ 1, but log(hn ) ∉ h, and after this
substitution we have Xn ∈ p.
Let us put an inner product on g. We choose it so that the unit ball
is contained in U . The vectors Xn /|Xn | lie on the unit sphere in p, which is
compact, so they have an accumulation point. Passing to a subsequence, we
may assume that Xn /|Xn | −→ X∞ , where X∞ lies on the unit sphere in p. We
will show that X∞ ∈ h, which is a contradiction since h ∩ p = {0}.
To show that X∞ ∈ h, we must show that etX∞ ∈ H. It is sufficient
to show this for t < 1. With t fixed, let rn be the smallest integer greater
than t/|Xn |. Since Xn → 0 we have rn |Xn | → t. Thus, rn Xn −→ tX∞ and
ern Xn = (eXn )rn ∈ H since eXn ∈ H. Since H is closed, etX∞ ∈ H and the
proof that H is a Lie subgroup is complete.
If G is compact, then so is H. The connected component of the identity
in H is a connected compact Abelian Lie group and hence a torus.

If G is a group and H a subgroup, we will denote by NG (H) and CG (H) the
normalizer and centralizers of H. If no confusion is possible, we will denote
them as simply N (H) and C(H).
Let G be a compact, connected Lie group. It contains tori, for example
{1}, and an ascending chain T1 ⊊ T2 ⊊ T3 ⊊ · · · has length bounded by
the dimension of G. Therefore, G contains maximal tori. Let T be a maximal
torus.
The normalizer is N (T ) = {g ∈ G | gT g −1 = T }. It is a closed subgroup since
if t ∈ T is a generator, N (T ) is the inverse image of T under the continuous
map g −→ gtg −1 .
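A generator (topological generator) of T exists by Kronecker's theorem: a point whose coordinates together with 1 are linearly independent over Q generates a dense subgroup. Here is a numerical illustration of that density (the grid size and sample count are my arbitrary choices):

```python
import math

# Powers of t0 = (sqrt(2), sqrt(3)) mod 1: since 1, sqrt(2), sqrt(3) are linearly
# independent over Q, the cyclic group generated by t0 is dense in (R/Z)^2.
# Numerically: every cell of a coarse grid is eventually visited.
alpha, beta = math.sqrt(2), math.sqrt(3)
grid = 10
visited = set()
for k in range(1, 20001):
    x, y = (k * alpha) % 1.0, (k * beta) % 1.0
    visited.add((int(x * grid), int(y * grid)))
all_cells_hit = (len(visited) == grid * grid)
```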
Proposition 15.8. Let G be a compact Lie group and T a maximal torus.
Then N(T) is a closed subgroup of G. The connected component N (T )◦ of the
identity in N (T ) is T itself. The quotient N (T )/T is a finite group.
Proof. We have a homomorphism N (T ) −→ Aut(T ) in which the action is
by conjugation. By Proposition 15.7, Aut(T ) ≅ GL(r, Z) is discrete, so any
connected group of automorphisms must act trivially. Thus, if n ∈ N (T )◦ , n
commutes with T . If N (T )◦ = T , then it contains a one-parameter subgroup
R ∋ t −→ n(t), and the closure of the group generated by T and n(t) is a
closed commutative subgroup strictly larger than T . By Theorem 15.2, it is a
torus, contradicting the maximality of T . It follows that T = N (T )◦ .
The quotient group N (T )◦ /T is both discrete and compact and hence
finite.

The quotient N (T )/T is called the Weyl group of G with respect to T .
Example 15.1. Suppose that G = U(n). A maximal torus is
                T = { diag(t1 , . . . , tn ) : |t1 | = · · · = |tn | = 1 } .

Its normalizer N (T ) consists of all monomial matrices (matrices with a single


nonzero entry in each row and column), so the quotient N (T )/T ≅ Sn .
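The identification N (T )/T ≅ Sn can be seen concretely: conjugating a diagonal unitary matrix by a permutation matrix permutes its diagonal entries. A small check in the case n = 3 (the angles below are arbitrary choices of mine):

```python
import cmath

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Permutation matrix of the 3-cycle (1 2 3); its transpose is its inverse.
w = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
w_inv = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

t1, t2, t3 = cmath.exp(0.3j), cmath.exp(1.1j), cmath.exp(2.5j)
t = [[t1, 0, 0], [0, t2, 0], [0, 0, t3]]

c = matmul(matmul(w, t), w_inv)   # conjugate of t by w: again in T
diagonal = all(abs(c[i][j]) < 1e-12 for i in range(3) for j in range(3) if i != j)
diag = [c[0][0], c[1][1], c[2][2]]   # the diagonal entries, permuted
```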

Proposition 15.9. Let T be a maximal torus in the compact connected Lie


group G, and let t, g be the Lie algebras of T and G, respectively.
(i) Any vector in g fixed by Ad(T ) is in t.
(ii) We have g = t⊕p, where p is invariant under Ad(T ). Under the restriction
of Ad to T , p decomposes into a direct sum of two-dimensional irreducible
representations of T of the form (15.2).

Proof. For (i), if X ∈ g is fixed by Ad(T ), then by Proposition 15.2, exp(tX)
is a one-parameter subgroup that commutes with T . Unless X ∈ t, the closure
of the group it generates together with T will be a torus strictly larger than
T , which is a contradiction.
Since G is compact, there exists a positive definite symmetric bilinear form
on the real vector space g that is invariant under the real representation Ad :
G −→ GL(g). The orthogonal complement p of t is invariant under Ad(T ). It
contains no Ad(T )-fixed vectors by (i). Since every nontrivial irreducible real
representation of T is of the form (15.2), (ii) follows.


Corollary 15.2. If G is a compact connected Lie group and T a maximal


torus, then dim(G) − dim(T ) is even.

Proof. This follows since dim(G/T ) = dim(p), and p decomposes as a direct


sum of two-dimensional irreducible representations.
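One can confirm the parity statement against the standard dimension and rank tables for the classical compact groups (the formulas below are the standard values, stated here without proof; Sp(n) denotes the compact symplectic group):

```python
# Standard dimensions and ranks of the classical compact groups.
def dim_and_rank(family, n):
    if family == "U":
        return n * n, n
    if family == "SU":
        return n * n - 1, n - 1
    if family == "SO":
        return n * (n - 1) // 2, n // 2
    if family == "Sp":
        return n * (2 * n + 1), n

# dim(G) - rank(G) = dim(G/T) should always be even.
all_even = all(
    (dim_and_rank(f, n)[0] - dim_and_rank(f, n)[1]) % 2 == 0
    for f in ("U", "SU", "SO", "Sp") for n in range(2, 9)
)
```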


We review the notion of an orientation. Let M be a manifold of dimension n.


The orientation bundle of M is a certain twofold cover M̃ −→ M that we now describe.
One way of constructing M̃ begins with the n-fold exterior power of the tan-
gent bundle: the fiber over x ∈ M is ∧n Tx (M ). This is a one-dimensional
real vector space. Omitting the origin and dividing by the equivalence rela-
tion v ∼ w if v = λw for 0 < λ ∈ R, when v, w are elements of ∧n Tx (M ),
produces a set F (x) with two points. The disjoint union M̃ = ⊔x∈M F (x) is
topologized as follows. Let π : M̃ −→ M be the map sending F (x) to x. If
X1 , . . . , Xn are vector fields that are linearly independent on an open set U ,
then X1 ∧ · · · ∧ Xn determines, for each x ∈ U , an element s(x) of π −1 (x).
We topologize M̃ by requiring that s : U −→ M̃ be a local homeomorphism.
Now an orientation of the manifold M is a global section of the orientation
bundle, that is, a continuous map s : M −→ M̃ such that π ◦ s(x) = x for
all x ∈ M . If an orientation exists, then M̃ is a trivial cover, and M̃ ≅
M × (Z/2Z). In this case, the manifold M is called orientable. Any complex
manifold is orientable. On the other hand, a Möbius strip is not orientable.
If M and N are manifolds of dimension n and f : M −→ N is a diffeo-
morphism, there is induced for each x ∈ M an isomorphism ∧n Tx (M ) −→
∧n Tf (x) (N ) and so there is induced a canonical map f˜ : M̃ −→ Ñ covering f .
108 15 Tori

Proposition 15.10. Let G be a connected Lie group and H a connected closed


Lie subgroup. Then the quotient space G/H is a connected orientable manifold.

The manifold G/T is called a flag manifold.

Proof. To make G/H a manifold, choose a subspace p of g = Lie(G) comple-


mentary to h = Lie(H). Then X −→ exp(X)gH is a local homeomorphism of
a neighborhood of the origin in p with a neighborhood of the coset gH in
G/H.
To see that M = G/H is orientable, let π : M̃ −→ M be the orientation
bundle, and let ω be an element of π −1 (H). If g ∈ G then g acts by left
translation on M and hence induces an automorphism g̃ of M̃ . We can define
a global section s of M̃ by s(gH) = g̃(ω) if we can check that this is well-
defined. Thus, if gH = g ′ H, we must show that g̃(ω) = g̃ ′ (ω) in the fiber of
M̃ above gH. We will show that the map g̃ : M̃ −→ M̃ can be deformed into
g̃ ′ through a family of maps g̃t , each of them mapping the fiber above H to
the fiber above gH, so that g̃0 = g̃ and g̃1 = g̃ ′ . This is sufficient because the
fiber of M̃ above gH is a discrete set consisting of two elements, and t −→ g̃t (ω)
is then a continuous map from [0, 1] into this discrete set.
    The existence of g̃t will follow from the connectedness of H. Note that if
γ ∈ G we have
                    γgH = gH ⇐⇒ γ ∈ gHg −1 .                    (15.3)
In particular, g ′ g −1 ∈ gHg −1 . Since H is connected, so is gHg −1 , and there is
a path t −→ γt from the identity to g ′ g −1 within gHg −1 . Then xH −→ γt gxH
is a diffeomorphism of M that agrees with left translation by g when t = 0
and left translation by g ′ when t = 1, and by (15.3), each canonical lifting g̃t
takes the fiber above H to the fiber above gH, as required.


We have seen in Corollary 15.2 that the flag manifold X is even-dimensional,


and by Proposition 15.10 it is orientable. These facts will be explained by
Theorem 26.4, where we will see that X is actually a complex analytic mani-
fold.

Exercises

Exercise 15.1. Compute the dimensions of the flag manifolds for su(n), sp(2n) and
so(n).
16
Geodesics and Maximal Tori

An important theorem of Cartan asserts that any two maximal tori in a


compact Lie group are conjugate. There are different ways of proving this.
We will deduce it from the surjectivity of the exponential map, which we will
prove by showing that a geodesic between the origin and an arbitrary point
of the group has the form t → etX for some X in the Lie algebra.
We begin by establishing the properties of geodesics that we will need.
These properties are rather well-known, though they do require proof. Some
readers may want to start reading with Theorem 16.1.
A Riemannian manifold consists of a smooth manifold M and for each
x ∈ M an inner product on the tangent space Tx . Since Tx is a real vector
space and not a complex one, an inner product in this context is a positive
definite symmetric real-valued bilinear form. We also describe this family of
inner products on the tangent spaces as a Riemannian structure on the man-
ifold M . We will denote the inner product of X, Y ∈ Tx by ⟨X, Y ⟩ and the
length ⟨X, X⟩1/2 = |X|. As part of the definition, the inner product must
vary smoothly with x. To make this condition precise, we choose a system of
coordinates x1 , . . . , xn on some open set U of M , where n = dim(M ). Then,
at each point x ∈ U , a basis of Tx (M ) consists of ∂/∂x1 , . . . , ∂/∂xn . Let

                     gij = ⟨∂/∂xi , ∂/∂xj ⟩ .                     (16.1)

Thus, the matrix (gij ) representing the inner product is positive definite sym-
metric. Smoothness of the inner product means that the gij are smooth func-
tions of x ∈ U .
We also define (g ij ) to be the inverse matrix to (gij ). Thus, the functions
g ij satisfy

      Σ_j gij g jk = δik ,   where δik = 1 if i = k and 0 otherwise.      (16.2)

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 109


DOI 10.1007/978-1-4614-8024-2 16, © Springer Science+Business Media New York 2013

and of course
gij = gji , g ij = g ji .
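As a quick sanity check of the identity (16.2), here it is verified for the metric of the round 2-sphere in spherical coordinates (this choice of example metric is mine, not the text's):

```python
import math

# Round 2-sphere metric in coordinates (theta, phi): g = diag(1, sin^2 theta).
theta = 0.7
g = [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[g[1][1] / det, -g[0][1] / det],
         [-g[1][0] / det, g[0][0] / det]]

# Check sum_j g_ij g^{jk} = delta_i^k.
kronecker_ok = all(
    abs(sum(g[i][j] * g_inv[j][k] for j in range(2)) - (1.0 if i == k else 0.0)) < 1e-12
    for i in range(2) for k in range(2)
)
```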
Suppose that p : [0, 1] −→ M is a path in the Riemannian manifold M . We say
p is admissible if it is smooth, and moreover the movement along the path
never “stops,” that is, the tangent vector p∗ (d/dt), where t is the coordinate
function on [0, 1], is never zero. The length or arclength of p is
                     |p| = ∫_0^1 |p∗ (d/dt)| dt.                     (16.3)

In terms of local coordinates, if we write xi (t) = xi (p(t)), the integrand is

             |p∗ (d/dt)| = √( Σ_{i,j} gij (∂xi /∂t)(∂xj /∂t) ) .

We call the path well-paced if

                     ∫_0^a |p∗ (d/dt)| dt = a |p|

for all 0 ≤ a ≤ 1. Intuitively, this means that the point p(t) moves along the
path at a constant “velocity.”
It is an easy application of the chain rule that the arclength of p is
unchanged under reparametrization. Moreover, each path has a unique rep-
arametrization that is well-paced.
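The invariance of arclength under reparametrization is easy to test numerically. The sketch below (my own, not from the text) uses the flat metric gij = δij on the plane, where the integrand of (16.3) reduces to the Euclidean norm of the tangent vector; the curve and step counts are arbitrary choices:

```python
import math

def arclength(path, n=20000):
    """Arclength in the flat metric, by a midpoint Riemann sum with a
    central-difference tangent vector."""
    total, h = 0.0, 1.0 / n
    for k in range(n):
        t = (k + 0.5) * h
        x1, y1 = path(t - h / 4)
        x2, y2 = path(t + h / 4)
        total += math.hypot((x2 - x1) / (h / 2), (y2 - y1) / (h / 2)) * h
    return total

def semicircle(t):                     # well-paced: constant speed pi
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def slowed(t):                         # same curve, no longer well-paced
    return semicircle(t * t)

L1, L2 = arclength(semicircle), arclength(slowed)   # both approximately pi
```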
A Riemannian manifold becomes a metric space by defining the
distance between two points a and b as the infimum of the lengths of the paths
connecting them. It is not immediately obvious that there will be a shortest
path, and indeed there may not be for some Riemannian manifolds, but it is
easy to check that this definition satisfies the triangle inequality and induces
the standard topology.
We will encounter various quantities indexed by 1 ≤ i, j, k, . . . ≤ n, where
n is the dimension of the manifold M under consideration. We will make use
of Einstein’s summation convention (in this chapter only). According to this
convention, if any index is repeated in a term, it is summed. For example,
suppose that p : [0, 1] −→ M is a path lying entirely in a single chart U ⊂
M with coordinate functions x1 , . . . , xn . Then we may regard x1 , . . . , xn as
functions of t ∈ [0, 1], namely xi (t) = xi (p(t)). If f : U −→ C is a smooth
function, then according to the chain rule

  (df /dt)(x1 (t), . . . , xn (t)) = Σ_{i=1}^{n} (dxi /dt)(∂f /∂xi )(x1 (t), . . . , xn (t)).

According to the summation convention, we can write this as simply

                        df /dt = (dxi /dt)(∂f /∂xi ),
and the summation over i is understood because it is a repeated index.

If for each smooth curve q : [0, 1] −→ M with the same endpoints as p we


have |p| ≤ |q|, then we say that p is a path of shortest length. We will presently
define geodesics by means of a differential equation, but for the moment we
may provisionally describe a geodesic as a well-paced path along a manifold
M that on short intervals is a path of shortest length.
An example will explain the qualification “on short intervals” in this def-
inition. On a sphere, a geodesic is a great circle. The path in Fig. 16.1 is a
geodesic. It is obviously not the path of shortest length between a and b.


Fig. 16.1. A geodesic on a sphere

Although the indicated geodesic is not a path of shortest length, if we


break it into smaller segments, we may still hope that these shorter paths
may be paths of shortest length. Indeed they will be paths of shortest length
if they are not too long, and this is the content of Proposition 16.4 below. For
example, the segment from a to c is a path of shortest length.
Let p : [0, 1] −→ M be an admissible path. We can consider deformations
of p, namely we can consider a smooth family of paths u −→ pu , where, for
each u ∈ (−ε, ε), pu is a path from a to b and p0 = p. Note that, as with
the definition of path-homotopy, we require that the endpoints be fixed as the
path is deformed. We consider the function f (u) = |pu |. We say the path is
of stationary length if f ′ (0) = 0 for each such deformation.
    If p is a path of shortest length, then 0 will be a minimum of f so f ′ (0) = 0.
As for the example in Fig. 16.1, the path from a to b may be deformed by
raising it up above the equator and simultaneously shrinking it, but even
under such a deformation we will have f ′ (0) = 0. So although this path is not
a path of shortest length, it is still a path of stationary length.
Let x1 , . . . , xn be coordinate functions on some open set U on M . Relative
to this coordinate system, let gij and g ij be as in (16.1) and (16.2). We define
the Christoffel symbols

  [ij, k] = (1/2)( ∂gik /∂xj + ∂gjk /∂xi − ∂gij /∂xk ) ,    {ij, k} = g kl [ij, l].
In the last expression, l is summed by the summation convention.
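For example, the Christoffel symbols of the round 2-sphere, with metric g11 = 1, g22 = sin² θ in coordinates (x1 , x2 ) = (θ, φ), can be computed from this definition by finite differences and compared with the classical values {22, 1} = −sin θ cos θ and {12, 2} = cot θ. A sketch (my example; indices are 0-based in the code):

```python
import math

def g(x, i, j):
    """Metric of the round 2-sphere; x = (theta, phi)."""
    if i == j == 0:
        return 1.0
    if i == j == 1:
        return math.sin(x[0]) ** 2
    return 0.0

def dg(x, i, j, k, h=1e-6):
    """Partial of g_ij with respect to x_k, by a central difference."""
    xp = [x[0] + (h if k == 0 else 0), x[1] + (h if k == 1 else 0)]
    xm = [x[0] - (h if k == 0 else 0), x[1] - (h if k == 1 else 0)]
    return (g(xp, i, j) - g(xm, i, j)) / (2 * h)

def bracket(x, i, j, k):
    """The Christoffel symbol [ij, k]."""
    return 0.5 * (dg(x, i, k, j) + dg(x, j, k, i) - dg(x, i, j, k))

def christoffel(x, i, j, k):
    """{ij, k} = g^{kl} [ij, l]; the inverse metric here is diagonal."""
    g_inv_kk = 1.0 if k == 0 else 1.0 / math.sin(x[0]) ** 2
    return g_inv_kk * bracket(x, i, j, k)

x = [1.0, 0.5]
gamma_22_1 = christoffel(x, 1, 1, 0)    # should be -sin(theta) cos(theta)
gamma_12_2 = christoffel(x, 0, 1, 1)    # should be cot(theta)
```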

Proposition 16.1. Suppose that p : [0, 1] −→ M is a well-paced admissible


path. If the path lies within an open set U on which x1 , . . . , xn is a system of
coordinates, then writing xi (t) = xi (p(t)), the path is of stationary length if
and only if it satisfies the differential equation

            d²xk /dt² = −{ij, k} (dxi /dt)(dxj /dt).            (16.4)
Proof. Let us consider the effect of deforming the path. We consider a family
pu of paths parametrized by u ∈ (−ε, ε), where ε > 0 is a small real number.
It is assumed that the family of paths varies smoothly, so (t, u) −→ pu (t) is a
smooth map (−ε, ε) × [0, 1] −→ M .
We regard the coordinate functions xi of the point x = pu (t) to be func-
tions of u and t.
It is assumed that p0 (t) = p(t) and that the endpoints are fixed, so that
pu (0) = p(0) and pu (1) = p(1) for all u ∈ (−ε, ε). Therefore,

                ∂xi /∂u = 0    when t = 0 or 1.                (16.5)
In local coordinates, the arclength (16.3) becomes

        |pu | = ∫_0^1 √( gij (∂xi /∂t)(∂xj /∂t) ) dt.        (16.6)

Because the path p(t) = p0 (t) is well-paced, the integrand is constant (inde-
pendent of t) when u = 0, so

     (∂/∂t) √( gij (∂xi /∂t)(∂xj /∂t) ) = 0    when u = 0.     (16.7)
We do not need to assume that the deformed path p(t, u) is well-paced for
any u ≠ 0.
    Let f (u) = |pu |. We have

        f ′ (u) = (∂/∂u) ∫_0^1 √( gij (∂xi /∂t)(∂xj /∂t) ) dt.

This equals

 ∫_0^1 ( gij (∂xi /∂t)(∂xj /∂t) )^{−1/2} [ (1/2)(∂gij /∂u)(∂xi /∂t)(∂xj /∂t)
     + (1/2) gij (∂²xi /∂u∂t)(∂xj /∂t) + (1/2) gij (∂xi /∂t)(∂²xj /∂u∂t) ] dt

  = ∫_0^1 ( gij (∂xi /∂t)(∂xj /∂t) )^{−1/2} [ (1/2)(∂gij /∂xl )(∂xl /∂u)(∂xi /∂t)(∂xj /∂t)
     + gij (∂²xi /∂u∂t)(∂xj /∂t) ] dt,

where we have used the chain rule, and combined two terms that are equal.
(The variables i and j are summed by the summation convention, so we may

interchange them, and using gij = gji , the last two terms on the left-hand side
are equal.) We integrate the second term by parts with respect to t, making
use of (16.5) and (16.7) to obtain
 f ′ (0) = ∫_0^1 ( gij (∂xi /∂t)(∂xj /∂t) )^{−1/2} [ (1/2)(∂gij /∂xl )(∂xl /∂u)(∂xi /∂t)(∂xj /∂t)
        − (∂xi /∂u)(∂/∂t)( gij ∂xj /∂t ) ] dt

       = ∫_0^1 ( gij (∂xi /∂t)(∂xj /∂t) )^{−1/2} [ (1/2)(∂gij /∂xl )(∂xi /∂t)(∂xj /∂t)
        − (∂/∂t)( glj ∂xj /∂t ) ] (∂xl /∂u) dt.

Now all the partial derivatives are evaluated when u = 0. The last step is just
a relabeling of a summed index.
We observe that the displacements ∂xl /∂u are arbitrary except that they
must vanish when t = 0 and t = 1. (We did not assume the deformed path to
be well-paced except when u = 0.) Thus, the path is of stationary length if
and only if

      0 = (1/2)(∂gij /∂xl )(∂xi /∂t)(∂xj /∂t) − (∂/∂t)( glj ∂xj /∂t ) ,
so the condition is
    glj (∂²xj /∂t²) = (1/2)(∂gij /∂xl )(∂xi /∂t)(∂xj /∂t) − (∂glj /∂t)(∂xj /∂t).
Now

 (∂glj /∂t)(∂xj /∂t) = (∂glj /∂xi )(dxi /dt)(∂xj /∂t)
                     = (1/2)( ∂glj /∂xi + ∂gli /∂xj )(dxi /dt)(∂xj /∂t).
The two terms on the right-hand side are of course equal since both i and j
are summed indices. We obtain in terms of the Christoffel symbols

∂ 2 xj ∂xi ∂xj
glj 2
= −[ij, l] .
∂t ∂t ∂t
Multiplying by g kl , summing the repeated index l, and using (16.2), we obtain
(16.4).


We define a geodesic to be a solution to the differential equation (16.4). This


definition does not depend upon the choice of coordinate systems because
the differential equation (16.4) arose from a variational problem that was
formulated without reference to coordinates. Naturally, one may alternatively
confirm by direct computation that the differential equation (16.4) is stable
under coordinate changes.

Proposition 16.2. Let x be a point on the Riemannian manifold M , and


let X ∈ Tx (M ). Then, for sufficiently small ε, there is a unique geodesic
p : (−ε, ε) −→ M such that p(0) = x and p∗ (d/dt) = X.

Proof. Let x1 , . . . , xn be coordinate functions. Let y1 , . . . , yn be a set of new


variables, and rewrite (16.4) as a first-order system

                    dxi /dt = yi ,
                    dyk /dt = −{ij, k} yi yj .
The conditions p(0) = x and p∗ (d/dt) = X amount to initial conditions for
this first-order system, and the existence and uniqueness of the solution follow
from the general theory of first-order systems.
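The reduction to a first-order system is exactly what one integrates numerically. The sketch below (my own illustration) applies a standard fourth-order Runge–Kutta step to the geodesic equations of the round 2-sphere, using its classical Christoffel symbols; the conserved speed reflects the fact that solutions of (16.4) are well-paced:

```python
import math

def deriv(s):
    """Right-hand side of the first-order system for the round 2-sphere:
    the nonzero Christoffel symbols are {22,1} = -sin(theta)cos(theta)
    and {12,2} = {21,2} = cot(theta)."""
    theta, phi, vt, vp = s
    return (vt, vp,
            math.sin(theta) * math.cos(theta) * vp * vp,
            -2.0 * (math.cos(theta) / math.sin(theta)) * vt * vp)

def rk4_step(s, h):
    k1 = deriv(s)
    k2 = deriv(tuple(s[i] + 0.5 * h * k1[i] for i in range(4)))
    k3 = deriv(tuple(s[i] + 0.5 * h * k2[i] for i in range(4)))
    k4 = deriv(tuple(s[i] + h * k3[i] for i in range(4)))
    return tuple(s[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                 for i in range(4))

def speed2(s):
    theta, _, vt, vp = s
    return vt * vt + math.sin(theta) ** 2 * vp * vp

s = (math.pi / 2, 0.0, 0.3, 1.0)    # start on the equator with a tilted velocity
v0 = speed2(s)
for _ in range(1000):
    s = rk4_step(s, 0.001)
speed_drift = abs(speed2(s) - v0)   # geodesics are well-paced: speed is conserved
```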


We now come to a property of geodesics that may be less intuitive. Let U


be a smooth submanifold of M , homeomorphic to a disk, of codimension 1.
If x ∈ U , we consider the geodesic t −→ px (t) such that px (0) = x and such
that px,∗ (d/dt) is the unit normal vector to U at x in a fixed direction. For
small ε > 0, let Uε = {px (ε) | x ∈ U }. In other words, Uε is a translation of the
disk U along the family of geodesics normal to U .
It is obvious that U is normal to each of the geodesic curves px . What
is less obvious, and will be proved in the next proposition, is that Uε is also
normal to the geodesics px .
In order to prove this, we will work with a particular set of coordinates. Let
x2 , . . . , xn be local coordinates on U . At each point x = (x2 , . . . , xn ) ∈ U , we
choose the unit normal vector in a fixed direction and construct the geodesic
path through the point with that tangent vector. We prescribe a coordinate
system on M near U by asking that (0, x2 , . . . , xn ) agree with the point x ∈ U
and that the path t −→ (t, x2, . . . , xn ) agree with px . We describe such a
coordinate system as geodesic coordinates.

Proposition 16.3. In geodesic coordinates, g1i = 0 for 2 ≤ i ≤ n. Also


g11 = 1.

In view of (16.1), this amounts to saying that the geodesic curves (having
tangent vector ∂/∂x1 ) are orthogonal to the level hypersurfaces x1 = constant
(having tangent spaces spanned by ∂/∂x2 , . . . , ∂/∂xn ), such as U and Uε in
Fig. 16.2.

Proof. Having chosen coordinates so that the path t −→ (t, x2 , . . . , xn ) is a
geodesic, we see that if dxi /dt = 0 in (16.4) for i ≠ 1, then d²xk /dt² = 0
for all k. This means that {11, k} = 0. Since the matrix (gkl ) is invertible, it
follows that [11, k] = 0, so
                ∂g1k /∂x1 = (1/2) ∂g11 /∂xk .                (16.8)
First, take k = 1 in (16.8). We see that ∂g11 /∂x1 = 0, so if x2 , . . . , xn are held
constant, g11 is constant. When x1 = 0, the initial condition of the geodesic
curve px through (0, x2 , . . . , xn ) is that it is tangent to the unit normal to

the surface, that is, its tangent vector ∂/∂x1 has length one, and by (16.1)
it follows that g11 = 1 when x1 = 0, so g11 = 1 throughout the geodesic
coordinate neighborhood.
Now let 2  k  n in (16.8). Since g11 is constant, ∂g1k /∂x1 = 0, and
so g1k is also constant when x2 , . . . , xn are held constant. When x1 = 0, our
assumption that the geodesic curve px is normal to the surface means that
∂/∂x1 and ∂/∂xk are orthogonal, so by (16.1), g1k vanishes when x1 = 0 and
so it vanishes for all x1 .


Fig. 16.2. Hypersurface remains perpendicular to geodesics on parallel translation

With these preparations, we may now prove that short geodesics are paths
of shortest length.
Proposition 16.4.
(i) Let p : [0, 1] −→ M be a geodesic. Then there exists an ε > 0 such that
    the restriction of p to [0, ε] is the unique path of shortest length from p(0)
    to p(ε).
(ii) Let x ∈ M . There exists a neighborhood N of x such that for all y ∈ N
there exists a unique path of shortest distance from x to y, and that path
is a geodesic.
Proof. We choose a hypersurface U orthogonal to p at t = 0 and construct
geodesic coordinates as explained before Proposition 16.3. We choose ε and
B so small that the set N of points with coordinates {x1 ∈ [0, ε], 0 ≤
|x2 |, . . . , |xn | ≤ B} is contained within the interior of this geodesic coordi-
nate neighborhood. We can assume that the coordinates of p(0) are (0, . . . , 0),
so by construction p(t) = (t, 0, . . . , 0). Then |p| = ε, where now |p| denotes
the length of the restriction of the path to the interval from 0 to ε.
    We will show that if q : [0, ε] −→ M is any path with q(0) = p(0) and
q(ε) = p(ε), then |q| ≥ |p|.
    First, we consider paths q : [0, ε] −→ M that lie entirely within the
set N and such that the x1 -coordinate of q(t) is monotonically increasing.
Reparametrizing q, we may arrange that q(t) and p(t) have the same x1 -
coordinate, which equals t. Let us write q(t) = (t, x2 (t), . . . , xn (t)). We also
denote x1 (t) = t. Since g1k = gk1 = 0 when k ≥ 2 and g11 = 1, we have
  
        |q| = ∫_0^ε √( Σ_{i,j} gij (dxi /dt)(dxj /dt) ) dt

            = ∫_0^ε √( 1 + Σ_{2≤i,j≤n} gij (dxi /dt)(dxj /dt) ) dt.

Now since the matrix (gij )_{1≤i,j≤n} is positive definite, its principal minor
(gij )_{2≤i,j≤n} is also positive definite, so

            Σ_{2≤i,j≤n} gij (dxi /dt)(dxj /dt) ≥ 0

and
            |q| ≥ ∫_0^ε √1 dt = ε = |p|.
This argument is easily extended to include all paths such that the values
of x1 for those t such that q(t) ∈ N cover the entire interval [0, ε]. Paths for
which this is not true must be long enough to reach the edges of the box where
|xi | = B, and after reducing ε if necessary, they must be longer than ε. This
completes our discussion of (i).
    For (ii), given each unit tangent vector X ∈ Tx (M ), there is a unique
geodesic pX : [0, εX ] −→ M through x tangent to X, and εX > 0 may
be chosen so that this geodesic is a path of shortest length. We assert that
εX may be chosen so that the same value εX is valid for nearby unit tangent
vectors Y . We leave this point to the reader except to remark that it is perhaps
easiest to see this by applying a diffeomorphism of M that moves X to Y and
regarding X as fixed while the metric gij varies; if Y is sufficiently near X, the
variation of gij will be small and the ε in part (i) can be chosen to work for
small variations of the gij . So for each unit tangent vector X ∈ Tx (M ) there
exists an εX > 0 and a neighborhood NX of X in the unit ball of Tx (M ) such
that pY : [0, εX ] −→ M is a path of shortest length for all Y ∈ NX . Since the
unit tangent ball in Tx (M ) is compact, a finite number of the NX suffice to cover
it, and if ε is the minimum of the corresponding εX , then we can take N to
be the set of all points connected to x by a geodesic of length < ε.

If M is a connected Riemannian manifold, we make M into a metric space
by defining d(x, y) to be the infimum of |p|, where p is a smooth path from x
to y.
Theorem 16.1. Let M be a compact connected Riemannian manifold, and
let x and y be points of M . Then there is a geodesic p : [0, 1] −→ M with
p(0) = x and p(1) = y.
A more precise statement may be found in Kobayashi and Nomizu [110], The-
orem 4.2 on p. 172. It is proved there that if M is connected and geodesically
complete, meaning that any well-paced geodesic can be extended to (−∞, ∞),
then the conclusion of the theorem is true. (It is not hard to see that a compact
manifold is geodesically complete.)

Proof. Let {pi } be a sequence of well-paced paths from x to y such that


|pi | −→ d(x, y). Because they are well-paced, if 0 ≤ a < b ≤ 1 we have
d(pi (a), pi (b)) ≤ (b − a)|pi |, and it follows that the {pi } are equicontinuous. Thus
by Proposition 3.1 there is a subsequence that converges uniformly to a path p.
It is not immediately evident that p is smooth, but it is clearly continuous. So
we can partition [0, 1] into short intervals. On each sufficiently short interval
0 ≤ a < b ≤ 1, p(b) is near enough to p(a) that the unique path of shortest
distance between them is a geodesic by Proposition 16.4. It follows that p is
a geodesic.


Theorem 16.2. Let G be a compact Lie group. There exists on G a Rieman-


nian metric that is invariant under both left and right translation. In this
metric, a geodesic is a translate (either left or right) of a map t −→ exp(tX)
for some X ∈ Lie(G).

Proof. Let g = Lie(G). Since G is a compact group acting by Ad on the real


vector space g, there exists an Ad(G)-invariant inner product on g. Regarding
g as the tangent space to G at the identity, if g ∈ G, left translation induces
an isomorphism g = Te (G) −→ Tg (G) and we may transfer this inner product
to Tg (G). This gives us an inner product on Tg (G) and hence a Riemannian
structure on G, which is invariant under left translation. Right translation
by g induces a different isomorphism g = Te (G) −→ Tg (G), but these two
isomorphisms differ by Ad(g) : g −→ g, and since the original inner product is
invariant under Ad(g), we see that the Riemannian structure we have obtained
is invariant under both left and right translation.
It remains to be shown that a geodesic is a translate of the exponential
map. This is essentially a local statement. Indeed, it is sufficient to show that
any short segment of a geodesic is of the form t −→ g · exp(tX) since any path
that is of such a form on every short interval is globally of the same form.
Moreover, since the Riemannian metric is translation-invariant, it is sufficient
to show that a geodesic near the origin is of the form t −→ exp(tX).
    First, we consider the case where G is a torus. In this case, G ≅ Rn /Λ,
where Λ is a lattice. We identify the tangent space to Rn at any point with Rn
itself. By a linear change of variables, we may assume that the inner product
itself. By a linear change of variables, we may assume that the inner product
on Rn = Te (G) corresponding to the Riemannian structure is the standard
Euclidean inner product. Since the Riemannian structure is invariant under
translation it follows that G ≅ Rn /Λ is a Riemannian manifold as well as a
group. Geodesics are straight lines and so are translates of the exponential
map.
    We turn now to the general case. If X ∈ g, let EX : (−ε, ε) −→ G denote
the geodesic through the origin tangent to X ∈ g. It is defined for sufficiently
small ε (depending on X). If λ ∈ R, then t −→ EX (λt) is the geodesic through
the origin tangent to λX, so EX (λt) = EλX (t). Thus, there is a neighborhood
U of the origin in g and a map E : U −→ G such that EX (t) = E(tX) for
X, tX ∈ U . We must show that E coincides with the exponential map.

If g ∈ G, then translating E(tX) on the left by g and on the right by g −1


gives another geodesic, which is tangent to Ad(g)X. Thus, if tX ∈ U ,

                g E(tX) g −1 = E(t Ad(g)X).                (16.9)

We now fix X ∈ g. Let T be a maximal torus containing the one-parameter


subgroup {etX | t ∈ R}. It follows from (16.9) that E(tX) commutes with
g ∈ T when tX ∈ U . Thus the path t −→ E(tX) runs through the central-
izer C(T ) and a fortiori through N (T ). By Proposition 15.8, it follows that
E(tX) ∈ T .
Now the translation-invariant Riemannian structure on G induces a trans-
lation-invariant Riemannian structure on T , and since the geodesic path t −→
E(tX) of G is contained in T , it is a geodesic path in T also. The result
therefore follows from the special case of the torus, which we have already
handled.


Theorem 16.3. Let G be a compact Lie group and g its Lie algebra. Then
the exponential map g −→ G is surjective.

Proof. Put a Riemannian structure on G as in Theorem 16.2. By Theo-


rem 16.1, given g ∈ G, there exists a geodesic path from the identity to
g. By Theorem 16.2, this path is of the form t −→ etX for some X ∈ g, so
g = eX .
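For G = SU(2) the surjectivity can be made explicit: by the Cayley–Hamilton theorem, every g ∈ SU(2) satisfies g + g −1 = 2 cos θ · I where 2 cos θ = tr(g), and X = (θ/ sin θ)(g − cos θ · I) is a traceless skew-Hermitian logarithm of g (valid when sin θ ≠ 0). A numerical sketch with an arbitrary choice of g (my example, not the text's):

```python
import cmath, math

def expm(X, terms=60):
    """exp of a 2x2 matrix by its power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[sum(term[i][k] * X[k][j] for k in range(2)) / n
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

# An element of SU(2): rows (a, b) and (-conj(b), conj(a)), |a|^2 + |b|^2 = 1.
a = math.cos(0.7) * cmath.exp(0.4j)
b = math.sin(0.7) * cmath.exp(1.3j)
g = [[a, b], [-b.conjugate(), a.conjugate()]]

cos_t = (g[0][0] + g[1][1]).real / 2        # tr(g) = 2 cos(theta)
theta = math.acos(cos_t)
scale = theta / math.sin(theta)              # assumes sin(theta) != 0
X = [[scale * (g[i][j] - (cos_t if i == j else 0)) for j in range(2)]
     for i in range(2)]                      # X is in su(2): traceless, skew-Hermitian

err = max(abs(expm(X)[i][j] - g[i][j]) for i in range(2) for j in range(2))
```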


Theorem 16.4. Let G be a compact connected Lie group, and let T be a


maximal torus. Let g ∈ G. Then there exists k ∈ G such that g ∈ kT k −1 .

Proof. Let g and t be the Lie algebras of G and T , respectively. Let t0 be a


generator of T . Using Theorem 16.3, find X ∈ g and H0 ∈ t such that eX = g
and eH0 = t0 .
Since G is a compact group acting by Ad on the real vector space g, there
exists on g an Ad(G)-invariant inner product; we will denote the
corresponding symmetric bilinear form by ⟨ , ⟩. Choose k ∈ G so that the
real value ⟨X, Ad(k)H0 ⟩ is maximal, and let H = Ad(k)H0 . Thus, exp(H) =
kt0 k −1 generates kT k −1 .
    If Y ∈ g is arbitrary, then ⟨X, Ad(etY )H⟩ has a maximum when t = 0, so
using Theorem 8.2 we have

    0 = (d/dt)⟨X, Ad(etY )H⟩|t=0 = ⟨X, ad(Y )H⟩ = −⟨X, [H, Y ]⟩.

By Proposition 10.3, this means that

                    ⟨[H, X], Y ⟩ = 0

for all Y . Since an inner product is by definition positive definite, the bilinear
form ⟨ , ⟩ is nondegenerate, which implies that [H, X] = 0. Now, by Proposi-
tion 15.2, eH commutes with etX for all t ∈ R. Since eH generates the maximal

torus kT k −1, it follows that the one-parameter subgroup {etX } is contained


in the centralizer of kT k −1 , and since kT k −1 is a maximal torus, it follows
that {etX } ⊂ kT k −1. In particular, g = eX ∈ kT k −1 .
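For G = U(n), Theorem 16.4 specializes to the spectral theorem: every unitary matrix is conjugated into the diagonal torus by some unitary k. A 2 × 2 instance, with the eigenvectors of a rotation matrix written down by hand (my example, not from the text):

```python
import cmath, math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

alpha = 0.9
g = [[math.cos(alpha), -math.sin(alpha)],
     [math.sin(alpha),  math.cos(alpha)]]    # a rotation: unitary, det 1

# Normalized eigenvectors of g, for eigenvalues e^{i alpha} and e^{-i alpha},
# placed as the columns of k; k is unitary, so k^{-1} is its conjugate transpose.
r = 1 / math.sqrt(2)
k = [[r, r], [-1j * r, 1j * r]]
k_inv = [[r, 1j * r], [r, -1j * r]]

d = matmul(matmul(k_inv, g), k)
off_diag = max(abs(d[0][1]), abs(d[1][0]))   # ~0: the conjugate lies in the torus
```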


Theorem 16.5 (E. Cartan). Let G be a compact connected Lie group, and
let T be a maximal torus. Then every maximal torus is conjugate to T , and
every element of G is contained in a conjugate of T .

Proof. The second statement is contained in Theorem 16.4. As for the first
statement, let T ′ be another maximal torus, and let t′ be a generator. Then
t′ is contained in kT k −1 for some k, so T ′ ⊆ kT k −1 . Since both are maximal
tori, they are equal.


Proposition 16.5. Let G be a compact connected Lie group, S ⊂ G a torus


(not necessarily maximal), and g ∈ CG (S) an element of its centralizer. Let
H be the closure of the group generated by S and g. Then H has a topological
generator. That is, there exists h ∈ H such that the subgroup generated by h
is dense in H.

Proof. Since H is closed and Abelian, its connected component H ◦ of the


identity is a torus by Proposition 15.2. Let h0 be a topological generator.
The group H/H ◦ is compact and discrete and hence finite. Since S ⊆ H ◦ ,
and since S and g generate a dense subgroup of H, the finite group H/H ◦ is
cyclic and generated by gH ◦ . Let r be the order of H/H ◦ . Then g r ∈ H ◦ . Since
the rth power map H ◦ −→ H ◦ is surjective, we can find u ∈ H ◦ such that
(gu)r = h0 . Then the group generated by h = ug contains both a generator
h0 of H ◦ and a generator gH ◦ = (gu)H ◦ of H/H ◦ . Clearly, it is a topological
generator of H.


Proposition 16.6. If G is a Lie group and u ∈ G, then the centralizer CG (u)


is a closed Lie subgroup, and its Lie algebra is {X ∈ Lie(G) | Ad(u)X = X}.

Proof. To show that H = CG (u) is a closed submanifold of G, it is sufficient to


show that its intersection with a small neighborhood of the identity is a closed
submanifold since translation by an element h of H will give a diffeomorphism
of that neighborhood onto a neighborhood of h. In a neighborhood N of the
origin in Lie(G), the exponential map is a diffeomorphism onto exp(N ), and
we see that the preimage of CG (u) in N is a vector subspace by recalling
that conjugation by u corresponds to the linear transformation Ad(u) of N .
Specifically, u e^{tX} u⁻¹ = e^{t Ad(u)X}, so e^{tX} ∈ CG (u) for all t if and only if
Ad(u)X = X. □
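Proposition 16.6 can be checked concretely in a small example. The sketch below (not from the text; it assumes u = diag(i, −i) in SU(2) and a standard basis of su(2)) confirms that the Ad(u)-fixed basis vectors are exactly the diagonal ones, matching the fact that C_G(u) is the diagonal torus for this u.

```python
import numpy as np

# u = diag(i, -i) in SU(2); conjugation corresponds to Ad(u)X = u X u^{-1}.
u = np.diag([1j, -1j])

# A basis of su(2), the trace-zero skew-Hermitian 2x2 matrices.
H = np.array([[1j, 0], [0, -1j]])
Y = np.array([[0, 1], [-1, 0]], dtype=complex)
Z = np.array([[0, 1j], [1j, 0]])

def Ad(g, X):
    return g @ X @ np.linalg.inv(g)

# Only the diagonal direction H is fixed; Y and Z are rotated.
fixed = [X for X in (H, Y, Z) if np.allclose(Ad(u, X), X)]
print(len(fixed))
```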


Theorem 16.6. Let G be a compact connected Lie group and S ⊂ G a torus


(not necessarily maximal). Then the centralizer CG (S) is a closed connected
Lie subgroup of G.

Proof. We first prove that CG (S) is connected. Let g ∈ CG (S). By Proposition


16.5, there exists an element h of CG (S) that generates the closure H of the
group generated by S and g. Let T be a maximal torus in G containing h.
Then T centralizes S, so the closure of T S is a connected compact Abelian
group and hence a torus, and by the maximality of T it follows that S ⊆ T .
Now clearly T ⊆ CG (S), and since T is connected, T ⊆ CG (S)◦ . Now g ∈ H ⊆
T ⊂ CG (S)◦ . We have shown that CG (S)◦ = CG (S), so CG (S) is connected.
To show that CG (S) is a closed Lie subgroup, let u ∈ S be a generator.
Then CG (S) = CG (u), and the statement follows by Proposition 16.6.


Exercises
Exercise 16.1. Give an example of a connected Riemannian manifold with two
points P and Q such that no geodesic connects P and Q.

Exercise 16.2. Let G be a compact connected Lie group and let g ∈ G. Show that
the centralizer CG (g) of g is connected.

Exercise 16.3. Show that the conclusion of Exercise 16.2 fails for the connected
noncompact Lie group SL(2, R) by exhibiting an element with a centralizer that is
not connected.

If M and N are Riemannian manifolds of the same dimension, and if f : M −→
N is a diffeomorphism, then f is called a conformal map if there exists a positive
function φ on M such that if x ∈ M and y = f (x), and if we use the notation ⟨ , ⟩
to denote the inner products in both Tx (M ) and Ty (N ), then

    ⟨f∗ X, f∗ Y⟩ = φ(x) ⟨X, Y⟩ ,    X, Y ∈ Tx (M ),

where f∗ : Tx (M ) −→ Ty (N ) is the induced map. Intuitively, a conformal map is
one that preserves angles. If the function φ = 1, then f is called isometric.

Exercise 16.4. Show that if M and N are open subsets in C and f : M −→ N
is a holomorphic map such that the inverse map f⁻¹ : N −→ M exists and is
holomorphic (so f′ is never zero), then f is a conformal map.

The next exercises describe the geodesics for some familiar homogeneous spaces.
Let D = {z ∈ C : |z| < 1} be the complex disk in C, and let R = C ∪ {∞} be the
Riemann sphere. The group SL(2, C) acts on R by linear fractional transformations:

    ⎛a b⎞
    ⎝c d⎠ : z −→ (az + b)/(cz + d).

In this action, it is understood that ∞ is mapped to a/c and z is mapped to ∞ if
cz + d = 0. The map z −→ −1/z gives a chart near ∞, and R is a complex analytic
manifold. Let

    A = { ⎛a b⎞ : a, b ∈ C, |a| = 1 },
          ⎝0 ā⎠

    SU(2) = { ⎛ a b⎞ : a, b ∈ C, |a|² + |b|² = 1 },
              ⎝−b̄ ā⎠

    SU(1, 1) = { ⎛a b⎞ : a, b ∈ C, |a|² − |b|² = 1 },
                 ⎝b̄ ā⎠

and

    K = { ⎛a 0⎞ : |a|² = 1 } ≅ U(1).
          ⎝0 ā⎠

It will be shown in Chap. 28 that the group SU(1, 1) is conjugate in SL(2, C) to
SL(2, R). Let G be one of the groups SU(2), A, or SU(1, 1). The stabilizer of 0 ∈ R
is the group K, so we may identify the orbit of 0 ∈ R with the homogeneous space
G/K by the bijection g(0) ←→ gK. The orbit of 0 is given in the following table.

    G         K     orbit of 0 ∈ R
    SU(1, 1)  U(1)  D
    A         U(1)  C
    SU(2)     U(1)  R

Exercise 16.5. Show that if G is one of the groups SU(1, 1), A, or SU(2), then the
quotient G/K, which we may identify with D, C, or R, has a unique G-invariant
Riemannian structure.

Exercise 16.6. Show that the inclusions D −→ C −→ R are conformal maps but
are not isometric.

A subset C of R is called a circle if either C ⊂ C and C is a circle in the Euclidean
sense, that is, C is the set of all solutions z to the equation |z − α| = r for some
α ∈ C and r > 0, or else C = L ∪ {∞}, where L is a straight line. Let
∂D = {z : |z| = 1} be the unit circle.

Exercise 16.7.
(i) Show that the group SL(2, C) preserves the set of circles. Show, however, that
a linear fractional transformation g ∈ SL(2, C) may take a circle with center α
to a circle with center different from g(α).
(ii) Show that if M = D, C or R, then each geodesic is a circle, but not each circle
is a geodesic.
(iii) Show that the geodesics in C are the straight lines and that the geodesics in D
are the curves C ∩ D, where C is a circle in C perpendicular to ∂D.
(iv) Show that ∂D is a geodesic in R.
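The circle-preservation claim of Exercise 16.7 (i) is easy to probe numerically. The sketch below is illustrative only (the helper functions and the particular g, α, r are assumptions of the example): it maps points of a circle under a fixed g ∈ SL(2, C), checks that the images are concyclic, and checks that the image of the center is not the center of the image circle.

```python
import numpy as np

def mobius(g, z):
    # Linear fractional transformation z -> (az + b)/(cz + d).
    a, b, c, d = g.ravel()
    return (a*z + b) / (c*z + d)

def circumcenter(z1, z2, z3):
    # Center w of the circle through three points: |w - z_i|^2 = |w - z_1|^2
    # is linear in (Re w, Im w).
    A = 2 * np.array([[(z2 - z1).real, (z2 - z1).imag],
                      [(z3 - z1).real, (z3 - z1).imag]])
    rhs = np.array([abs(z2)**2 - abs(z1)**2, abs(z3)**2 - abs(z1)**2])
    x, y = np.linalg.solve(A, rhs)
    return x + 1j*y

g = np.array([[2, 1], [1, 1]], dtype=complex)   # det = 1, so g lies in SL(2, C)

# A circle that avoids the pole z = -1 of the transformation.
alpha, r = 0.3 + 0.2j, 0.5
zs = alpha + r * np.exp(2j * np.pi * np.arange(12) / 12)
ws = mobius(g, zs)

center = circumcenter(ws[0], ws[1], ws[2])
print(np.allclose(np.abs(ws - center), np.abs(ws[0] - center)))
```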
17
The Weyl Integration Formula

Let G be a compact, connected Lie group, and let T be a maximal torus.


Theorem 16.5 implies that every conjugacy class meets T . Thus, we should be
able to compute the Haar integral of a class function (e.g., the inner product of
two characters) as an integral over the torus. The formula that allows this, the
Weyl integration formula, is therefore fundamental in representation theory
and in other areas, such as random matrix theory.
If G is a locally compact group and H a closed subgroup, then the quotient
space G/H consisting of all cosets gH with g ∈ G, given the quotient topology,
is a locally compact Hausdorff space. (See Hewitt and Ross [69, Theorems 5.21
and 5.22 on p. 38].) Such a coset space is called a homogeneous space.
If X is a locally compact Hausdorff space, let Cc (X) be the space of continuous,
compactly supported functions on X. A linear functional I on Cc (X) is called
positive if I(f ) ≥ 0 whenever f is nonnegative. According to the Riesz
representation theorem, each such I is of the form

    I(f ) = ∫_X f dμ

for some regular Borel measure dμ. See Halmos [61, Sect. 56], or Hewitt and
Ross [69, Corollary 11.37 on p. 129]. (Regularity of the measure is discussed
after Definition 11.34 on p. 127.)
Proposition 17.1. Let G be a locally compact group, and let H be a compact
subgroup. Let dμG and dμH be left Haar measures on G and H, respectively.
Then there exists a regular Borel measure dμG/H on G/H which is invari-
ant under the action of G by left translation. The measure dμG/H may be
normalized so that, for f ∈ Cc (G), we have

    ∫_G f (g) dμG (g) = ∫_{G/H} ∫_H f (gh) dμH (h) dμG/H (gH).

Here the function g −→ ∫_H f (gh) dμH (h) is constant on the cosets gH, and we
are therefore identifying it with a function on G/H.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 123


DOI 10.1007/978-1-4614-8024-2 17, © Springer Science+Business Media New York 2013

Proof. We may choose the normalization of dμH so that H has total volume 1.
We define a map λ : Cc (G) −→ Cc (G/H) by

    (λf )(g) = ∫_H f (gh) dμH (h).

Note that λf is a function on G which is right invariant under translation
by elements of H, so it may be regarded as a function on G/H. Since H
is compact, λf is compactly supported. If φ ∈ Cc (G/H), regarding φ as a
function on G, we have λφ = φ because

    (λφ)(g) = ∫_H φ(gh) dμH (h) = ∫_H φ(g) dμH (h) = φ(g).

This shows that λ is surjective. We may therefore define a linear functional I
on Cc (G/H) by

    I(λf ) = ∫_G f (g) dμG (g),    f ∈ Cc (G),

provided we check that this is well defined. We must show that if λf = 0 then

    ∫_G f (g) dμG (g) = 0.                                        (17.1)

We note that the function (g, h) −→ f (gh) is compactly supported and con-
tinuous on G × H, so if λf = 0 we may use Fubini's theorem to write

    0 = ∫_G (λf )(g) dμG (g) = ∫_H ∫_G f (gh) dμG (g) dμH (h).

In the inner integral on the right-hand side we make the variable change
g −→ gh⁻¹. Recalling that dμG (g) is left Haar measure, this produces a
factor of δG (h), where δG is the modular quasicharacter on G. Thus,

    0 = ∫_H δG (h) ∫_G f (g) dμG (g) dμH (h).

Now the group H is compact, so its image under δG is a compact subgroup
of R×₊, which must be just {1}. Thus, δG (h) = 1 for all h ∈ H and we
obtain (17.1), justifying the definition of the functional I. The existence of
the measure on G/H now follows from the Riesz representation theorem. 
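The normalization in Proposition 17.1 has a transparent finite-group analogue, with Haar measure replaced by normalized counting measure. The following toy check (not from the text) verifies the quotient integration formula for G = S₃ and H = {e, (0 1)}.

```python
from itertools import permutations

# G = S_3 (permutations of {0,1,2} as tuples), H = {e, (0 1)}.
G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]
compose = lambda g, h: tuple(g[h[i]] for i in range(3))

f = lambda g: 1 + 3*g[0] - g[1]            # an arbitrary test function on G

# Left side: integral of f against normalized counting measure on G.
lhs = sum(f(g) for g in G) / len(G)

# Right side: integrate the H-average of f over the coset space G/H.
cosets = []
for g in G:
    c = frozenset(compose(g, h) for h in H)
    if c not in cosets:
        cosets.append(c)

rhs = sum(sum(f(compose(g, h)) for h in H) / len(H)
          for g in (sorted(c)[0] for c in cosets)) / len(cosets)
print(lhs, rhs)
```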
We have seen in Proposition 15.9 that in the adjoint action on g = Lie(G),
restricted to T , the Lie algebra t is an invariant subspace, complemented by
a space p, which decomposes as the direct sum of nontrivial two-dimensional
irreducible real representations as described in Proposition 15.5.
Let W = N (T )/T be the Weyl group of G. The Weyl group acts on T by
conjugation. Indeed, the elements of the Weyl group are cosets w = nT for
n ∈ N (T ). If t ∈ T , the element ntn−1 depends only on w so by abuse of
notation we denote it wtw−1 .

Theorem 17.1.
(i) Two elements of T are conjugate in G if and only if they are conjugate
in N (T ).
(ii) The inclusion T −→ G induces a bijection between the orbits of W on T
and the conjugacy classes of G.

Proof. Suppose that t, u ∈ T are conjugate in G, say gtg −1 = u. Let H =


CG (u)◦ be the connected component of the identity in the centralizer of u.
It is a closed Lie subgroup of G by Proposition 16.6. Both T and gT g −1 are
contained in H since they are connected commutative groups containing u.
As they are maximal tori in G, they are maximal tori in H, and so they are
conjugate in the compact connected group H. If h ∈ H such that hT h−1 =
gT g −1, then w = h−1 g ∈ N (T ). Since wtw−1 = h−1 uh = u, we see that t and
u are conjugate in N (T ).
Since G is the union of the conjugates of T , (ii) is a restatement of (i). 

Proposition 17.2. The centralizer C(T ) = T .

Proof. Since C(T ) ⊂ N (T ), T is of finite index in C(T ) by Proposition 15.8.


Thus, if x ∈ C(T ), we have xn ∈ T for some n. Let t0 be a generator of T .
Since the nth power map T −→ T is surjective, there exists t ∈ T such that
(xt)ⁿ = t₀. Now xt is contained in a maximal torus T′, which contains t₀ and
hence T ⊂ T′. Since T is maximal, T′ = T and x ∈ T . □


Proposition 17.3. There exists a dense open set Ω of T such that the |W |
elements wtw−1 (w ∈ W ) are all distinct for t ∈ Ω.

See Proposition 23.4 for a more precise result.

Proof. If w ∈ W , let Ωw = {t ∈ T | wtw⁻¹ ≠ t}. It is an open subset of T since
its complement is evidently closed. If w ≠ 1 and t is a generator of T , then
t ∈ Ωw because otherwise if n ∈ N (T ) represents w, then n ∈ C(t) = C(T ),
so n ∈ T by Proposition 17.2. This is a contradiction since w ≠ 1. The finite
intersection Ω = ⋂_{w≠1} Ωw is dense by Kronecker's Theorem 15.1. It thus fits
our requirements. □


Theorem 17.2 (Weyl). Let G be a compact connected Lie group, and let p
be as in Proposition 15.9. If f is a class function, and if dg and dt are Haar
measures on G and T (normalized so that G and T have volume 1), then
 
1 
f (g) dg = f (t) det [Ad(t−1 ) − Ip ]  p dt.
G |W | T

Proof. Let X = G/T . We give X the measure dX invariant under left trans-
lation by G such that X has volume 1. Consider the map

φ : X × T −→ G, φ(xT, t) = xtx−1 .

Both X × T and G are orientable manifolds of the same dimension. Of course,


G and T both are given the Haar measures such that G and T have volume 1.
We choose volume elements on the Lie algebras g and t of G and T ,
respectively, so that the Jacobians of the exponential maps g −→ G and
t −→ T at the identity are 1.
We compute the Jacobian Jφ of φ. Parametrize a neighborhood of xT
in X by a chart based on a neighborhood of the origin in p. This chart is
the map

    U −→ x e^U T    (U near the origin in p).

We also make use of the exponential map to parametrize a neighborhood of
t ∈ T . This is the chart V −→ t e^V . We therefore have the chart near the
point (xT, t) in X × T mapping

    p × t ∋ (U, V ) −→ (x e^U T, t e^V ) ∈ X × T

and, in these coordinates, φ is the map

    (U, V ) −→ x e^U t e^V e^{−U} x⁻¹.

To compute the Jacobian of this map, we translate on the left by t⁻¹x⁻¹ and
on the right by x. There is no harm in this because these maps preserve Haar
measure. We are reduced to computing the Jacobian of the map

    (U, V ) −→ t⁻¹ e^U t e^V e^{−U} = e^{Ad(t⁻¹)U} e^V e^{−U}.

Identifying the tangent space of the real vector space p × t with itself (that is,
with g = p ⊕ t), the differential of this map is

    U + V −→ [Ad(t⁻¹) − I_p] U + V.

The Jacobian is the determinant of the differential, so

    (Jφ)(xT, t) = det([Ad(t⁻¹) − I_p] | p).                       (17.2)

By Proposition 17.3, the map φ : X × T −→ G is a |W |-fold cover over a
dense open set and so, for any function f on G, we have

    ∫_G f (g) dg = (1/|W |) ∫_{X×T} f (φ(xT, t)) Jφ(xT, t) dx dt.

The integrand f (φ(xT, t)) Jφ(xT, t) = f (t) det([Ad(t⁻¹) − I_p] | p) is inde-
pendent of x since f is a class function, and the result follows. □


An example may help make this result more concrete.



Proposition 17.4. Let G = U(n), and let T be the diagonal torus. Writing

    t = diag(t₁, . . . , tₙ) ∈ T,

and letting dt be the Haar measure on T normalized so that its volume is
1, we have

    ∫_G f (g) dg = (1/n!) ∫_T f (diag(t₁, . . . , tₙ)) ∏_{i<j} |tᵢ − tⱼ|² dt.    (17.3)

Proof. This will follow from Theorem 17.2 once we check that

    det([Ad(t⁻¹) − I_p] | p) = ∏_{i<j} |tᵢ − tⱼ|².

To compute this determinant, we may as well consider the linear transfor-
mation induced by Ad(t⁻¹) − I_p on the complexified vector space C ⊗ p.
As in Proposition 11.4, we may identify C ⊗ u(n) with gl(n, C) = Matₙ(C).
We recall that C ⊗ p is spanned by the T -eigenspaces in C ⊗ u(n) corresponding
to nontrivial characters of T . These are spanned by the elementary matrices
Eᵢⱼ with a 1 in the i, jth position and zeros elsewhere, where 1 ≤ i, j ≤ n and
i ≠ j. The eigenvalue of t on Eᵢⱼ is tᵢtⱼ⁻¹. Hence

    det([Ad(t⁻¹) − I_p] | p) = ∏_{i≠j} (tᵢtⱼ⁻¹ − 1) = ∏_{i<j} (tᵢtⱼ⁻¹ − 1)(tⱼtᵢ⁻¹ − 1).

Since |tᵢ| = |tⱼ| = 1, we have (tᵢtⱼ⁻¹ − 1)(tⱼtᵢ⁻¹ − 1) = (tᵢ − tⱼ)(tᵢ⁻¹ − tⱼ⁻¹) =
|tᵢ − tⱼ|², proving (17.3). □
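The determinant identity in this proof can be sanity-checked numerically. The sketch below (an illustration, not part of the text) compares the product over the eigenvalues tᵢ⁻¹tⱼ of Ad(t⁻¹) on the matrix units Eᵢⱼ with ∏_{i<j} |tᵢ − tⱼ|² at a random point of the diagonal torus of U(3).

```python
import numpy as np

n = 3
rng = np.random.default_rng(1)
t = np.exp(2j * np.pi * rng.random(n))   # a random element of the diagonal torus of U(3)

# Ad(t^{-1}) has eigenvalue t_i^{-1} t_j on E_ij (i != j), so the determinant
# in the proof is the product of (t_i^{-1} t_j - 1) over i != j.
det = np.prod([t[i]**-1 * t[j] - 1 for i in range(n) for j in range(n) if i != j])

target = np.prod([abs(t[i] - t[j])**2 for i in range(n) for j in range(i + 1, n)])
print(det, target)
```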


Exercises
Exercise 17.1. Let G = SO(2n + 1). Choose the realization of Exercise 5.3.
Show that

    ∫_{SO(2n+1)} f (g) dg =
        (1/(2ⁿ n!)) ∫_{Tⁿ} f (diag(t₁, . . . , tₙ, 1, tₙ⁻¹, . . . , t₁⁻¹))
            × ∏_{i<j} |tᵢ − tⱼ|² |tᵢ − tⱼ⁻¹|² ∏_i |tᵢ − 1|² dt₁ · · · dtₙ.

Exercise 17.2. Let G = SO(2n). Choose the realization of Exercise 5.3. Show that

    ∫_{SO(2n)} f (g) dg =
        (1/(2ⁿ⁻¹ n!)) ∫_{Tⁿ} f (diag(t₁, . . . , tₙ, tₙ⁻¹, . . . , t₁⁻¹))
            × ∏_{i<j} |tᵢ − tⱼ|² |tᵢ − tⱼ⁻¹|² dt₁ · · · dtₙ.

Exercise 17.3. Describe the Haar measure on Sp(2n) as an integral over the diag-
onal maximal torus.

Exercise 17.4. Let f be a class function on SU(2). Suppose that

    f (diag(z, z⁻¹)) = Σₙ a(n) zⁿ.

Give at least two proofs that

    ∫_{SU(2)} f (g) dg = a(0) − a(2).

For the first proof, check that this is true for every irreducible character. For the
second proof, show that a(n) = a(−n). Then use the Weyl integration formula and
make use of the fact that a(2) = a(−2).

Exercise 17.5. Prove that

    ∫_{SU(2)} |tr(g)|²ᵏ dg = (1/(k + 1)) C(2k, k),

where C(2k, k) is the binomial coefficient. The moments of the trace are thus the
Catalan numbers.
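Exercise 17.5 can be checked by quadrature. For SU(2) the Weyl integration formula reduces the Haar integral of a class function of tr(g) = 2 cos θ to (1/π) ∫₀^{2π} (·) sin²θ dθ; the sketch below (numerical, not a proof) compares the moments with the Catalan numbers.

```python
import numpy as np
from math import comb

def moment(k, N=4096):
    # Weyl integration on SU(2): for a class function of tr(g) = 2 cos(theta),
    # dg becomes (1/pi) sin^2(theta) dtheta on [0, 2*pi).
    theta = np.linspace(0, 2 * np.pi, N, endpoint=False)
    return 2 * np.mean((2 * np.cos(theta))**(2 * k) * np.sin(theta)**2)

catalan = [comb(2 * k, k) // (k + 1) for k in range(5)]
print([round(float(moment(k)), 8) for k in range(5)], catalan)
```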


18
The Root System

A Euclidean space is a real vector space V endowed with an inner product, that
is, a positive definite symmetric bilinear form. We denote this inner product
by ⟨ , ⟩. If 0 ≠ α ∈ V, consider the transformation sα : V −→ V given by

    sα (x) = x − (2⟨x, α⟩ / ⟨α, α⟩) α.                             (18.1)

This is the reflection attached to α. Geometrically, it is the reflection in the
plane perpendicular to α. We have sα (α) = −α, while any element of that
plane (with ⟨x, α⟩ = 0) is unchanged by sα .
Definition 18.1. Let V be a finite-dimensional real Euclidean space and Φ ⊂ V
a finite subset of nonzero vectors. Then Φ is called a root system if for all
α ∈ Φ we have sα (Φ) = Φ, and if α, β ∈ Φ then 2⟨α, β⟩/⟨α, α⟩ ∈ Z. The root system
is called reduced if α ∈ Φ, λα ∈ Φ with λ ∈ R implies that λ = ±1.
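Both conditions of Definition 18.1 are finite checks and can be verified by machine for a small example. The sketch below (the explicit coordinates are an assumption of the illustration) does this for a root system of type C₂, the one attached to Sp(4) later in this chapter.

```python
import numpy as np

def reflect(alpha, x):
    # The reflection (18.1): s_alpha(x) = x - (2<x, alpha>/<alpha, alpha>) alpha.
    return x - (2 * np.dot(x, alpha) / np.dot(alpha, alpha)) * alpha

# Type C2: eight roots in R^2 (the Sp(4) roots, in standard coordinates).
Phi = [np.array(v) for v in
       [(1, -1), (-1, 1), (1, 1), (-1, -1), (2, 0), (-2, 0), (0, 2), (0, -2)]]

in_Phi = lambda x: any(np.allclose(x, b) for b in Phi)

# Axiom 1: each reflection permutes Phi.  Axiom 2: Cartan integers are in Z.
closed = all(in_Phi(reflect(a, b)) for a in Phi for b in Phi)
integral = all(float(2 * np.dot(a, b) / np.dot(a, a)).is_integer()
               for a in Phi for b in Phi)
print(closed, integral)
```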
There is another, more modern notion which was introduced in Demazure [10]
(Exposé XXI). This notion is known as a root datum. We will give the
definition, then discuss the relationship between the two notions. We will
find both structures in a compact Lie group.
A root datum consists of a quadruple (Λ, Φ, Λ∨, Φ∨) of data which are to
be as follows. First, Λ is a lattice, that is, a free Z-module, and Λ∨ =
Hom(Λ, Z) is the dual lattice. Inside each lattice there is given a finite set
of nonzero vectors, denoted Φ ⊂ Λ and Φ∨ ⊂ Λ∨, together with a bijection
α −→ α∨ from Φ to Φ∨. It is required that α∨(α) = 2 and that α∨(Φ) ⊂ Z.
Using these we may define, for each α ∈ Φ, linear maps sα : Λ → Λ and
sα∨ : Λ∨ → Λ∨ of order 2. These are defined by the formulas

    sα (v) = v − α∨(v) α,    sα∨ (v∗) = v∗ − v∗(α) α∨.

It is easy to see that sα∨ is the adjoint of sα , that is,

    sα∨ (v∗)(v) = v∗(sα⁻¹ v) = v∗(sα v).


Let us now explain the relationship between the root system and the root
datum. We will always obtain the root system with another piece of data: a
lattice Λ that spans V such that Φ ⊂ Λ. It will have the property of being
invariant under the sα . Let V∗ be the real dual space of V. The dual lattice
Λ∨ is the set of linear functionals v∗ : V → R such that v∗(Λ) ⊂ Z. It can be
identified with Hom(Λ, Z). If α ∈ Φ, the linear functional

    α∨(x) = 2⟨x, α⟩ / ⟨α, α⟩                                       (18.2)

is in Λ∨ by the definition of a root system. If α is a root, then α∨ is called
the associated coroot. Now if Φ∨ is the set of coroots, then (Λ, Φ, Λ∨, Φ∨) is a
root datum.
The root datum notion has several advantages. First, the root datum gives
complete information sufficient to uniquely determine the group G. This is
perhaps less important if G is semisimple, for then one may specify the group
by describing its root system and its fundamental group. However, if G is
reductive but not semisimple, the root system is not enough data. Another,
more subtle value of the root datum is that if (Λ, Φ, Λ∨ , Φ∨ ) is a root datum
then so is (Λ∨ , Φ∨ , Λ, Φ). This root datum describes another group Ĝ, usually
taken in its complexified form, as a complex analytic group. This is the Lang-
lands L-group, which plays an important role in the representation theory of
both Lie groups and p-adic groups. See Springer [152] and Borel [19]. In the
root system, we are making use of the Euclidean inner product structure to
identify the ambient vector space V with its dual. This has the psychological
advantage of allowing us to envision sα as reflection in the hyperplane perpen-
dicular to the root α. On the other hand from a purely mathematical point of
view, the identification of V with its dual is a somewhat artificial procedure.
The goal of this chapter is to associate a reduced root system with an
arbitrary compact connected Lie group G. The lattice Λ will be X ∗ (T ), where
T is a maximal torus, and the vector space V will be R ⊗ Λ. Elements of Λ
will be called weights, and Λ will be called the weight lattice.
Let G be a compact connected Lie group and T a maximal torus. The
dimension r of T is called the rank of G. We note that this terminology is not
completely standard,
for if Z(G) is not finite, the term rank might refer to
dim(T ) − dim Z(G) . We will refer to the latter statistic as the semisimple
rank of G.
Let g = Lie(G) and t = Lie(T ). Recall that 𝕋 is the Lie group of complex
numbers of absolute value 1. If we identify the Lie algebra of C× with C, then
the Lie algebra of 𝕋 is iR. Thus, if λ : T −→ 𝕋 is a character, let dλ : t −→ iR
be the differential of λ, defined as usual by

    dλ(H) = (d/dt) λ(e^{tH}) |_{t=0} ,    H ∈ t.                   (18.3)

Then dλ takes purely imaginary values.



Remark 18.1. Since T ≅ (R/Z)ʳ, its character group X∗(T ) ≅ Zʳ. We want
to embed X∗(T ) into a real vector space V ≅ Rʳ. There are two natural
ways of doing this. First, since X∗(T ) ≅ Zʳ, we can take V = R ⊗Z X∗(T ).
Alternatively, as we have just explained, if λ is a character
of T , then dλ ∈ Hom(t, iR). Extending dλ to a complex linear map on t_C, we
see that dλ also maps it → R. Part of the construction will be to produce
elements Hα ∈ it such that for λ ∈ X∗(T ) we have

    dλ(Hα) = α∨(λ).                                               (18.4)

(See Proposition 18.13.) In view of this close relationship, both the α∨ and the
Hα may be referred to as coroots.

The Weyl group W = N (T )/T acts on T by conjugation and hence on


V, and it will be convenient to give V an inner product (that is, a positive
definite symmetric bilinear form) that is W -invariant. We may, of course, do
this for any finite group acting on a real vector space.
If π : G → GL(V ) is a complex representation, then we may restrict π
to T , where it will decompose into one-dimensional characters. The elements
of Λ = X ∗ (T ) that occur in π restricted to T are called the weights of the
representation. A root of G with respect to T is a nonzero weight of the adjoint
representation. We recall from Chap. 17 that g = t⊕p where p is the direct sum
of nontrivial two-dimensional real subspaces that are irreducible T -modules.
Then gC decomposes as tC ⊕pC . The space pC will further decompose into one-
dimensional t-invariant complex subspaces. More precisely, if U is one of the
irreducible two-dimensional t-invariant subspaces of the real vector space p,
then by Proposition 15.5, UC is the direct sum of two one-dimensional invariant
complex vector spaces, each corresponding to a root α and its negative −α.
So we may say that a root is a character of T that occurs in the adjoint
representation on pC . If α is a root, let Xα ⊂ pC be the α-eigenspace. We will
denote by Φ ⊂ V the set of roots of G with respect to T . We will show in
Theorem 18.2 that Φ is a root system.
Because the proofs are somewhat long, it may be useful to have a couple of
examples in mind. First, SU(2) will play a role in the sequel, so we review it.
The Lie algebra g consists of 2 × 2 skew-Hermitian matrices of trace 0. Every
element of the Lie algebra sl(2, C) of SL(2, C) may be written uniquely as
X +iY with X and Y in g, so sl(2, C) = g⊕ig. In other words the complexified
Lie algebra gC = sl(2, C), the Lie algebra with representations that were
studied in Chap. 12.
Let T be the group of diagonal matrices in SU(2). A character of T has
the form

    λk : diag(t, t⁻¹) −→ tᵏ.                                      (18.5)

Define

    Hα = ⎛1  0⎞ ,    Xα = ⎛0 1⎞ ,    X−α = ⎛0 0⎞ .               (18.6)
         ⎝0 −1⎠          ⎝0 0⎠           ⎝1 0⎠

We will see that Hα ∈ it is the coroot, and that Xα and X−α span the
one-dimensional weight spaces Xα and X−α . Thus, the root system Φ =
{α, −α}.
Let us say that λk is the highest weight in an irreducible representation
V if k is maximal such that λk occurs in the restriction of the representation
to T . This ad hoc definition is a special case of a partial order on the weights
for general compact Lie groups in Chap. 21.

Proposition 18.1. If k ∈ Z then dλk (Hα ) = k. The roots of SU(2) are


α = λ2 and −α = λ−2 . If k is a nonnegative integer then SU(2) has a
unique irreducible representation with highest weight λk . The weights of this
representation are λl with −k  l  k and l ≡ k modulo 2.

Proof. Although Hα is not in t, iHα is, and we find that

    dλk(iHα) = (d/dt) λk(diag(e^{it}, e^{−it})) |_{t=0} = (d/dt) e^{ikt} |_{t=0} = ik,

so dλk(Hα) = k. We have

    Ad(diag(t, t⁻¹)) Xα = t² Xα ,

so Xα spans a T -eigenspace affording the character λ₂, which is thus a root α.
If k is a nonnegative integer, then we proved in Chap. 12 that sl(2, C) has an
irreducible representation ∨ᵏC² and the weights are seen from (12.2) to be
the integers between −k and k with the same parity as k. □
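The computations in this proof are easily verified by machine; the following sketch (not from the text) checks the sl(2) relations for the triple (18.6) and the eigenvalue t² of Ad(diag(t, t⁻¹)) on Xα.

```python
import numpy as np

H = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [0, 0]], dtype=complex)
Y = np.array([[0, 0], [1, 0]], dtype=complex)
brk = lambda A, B: A @ B - B @ A

# The sl(2) relations satisfied by the triple (18.6).
relations = (np.allclose(brk(H, X), 2 * X) and
             np.allclose(brk(H, Y), -2 * Y) and
             np.allclose(brk(X, Y), H))

# Ad(diag(z, z^{-1})) X = z^2 X, so the root is the character lambda_2.
z = np.exp(0.7j)
t = np.diag([z, 1 / z])
root = np.allclose(t @ X @ np.linalg.inv(t), z**2 * X)
print(relations, root)
```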


To give a higher-rank example, let us consider the group G = Sp(4). This is
a maximal compact subgroup of Sp(4, C), which we will take to be the group
of g ∈ GL(4, C) that satisfy g J ᵗg = J, where

    J = ⎛ 0  0  0 −1 ⎞
        ⎜ 0  0 −1  0 ⎟
        ⎜ 0  1  0  0 ⎟ .
        ⎝ 1  0  0  0 ⎠

This is not the same as the group introduced in Example 5.5, but it is
conjugate to that group in GL(4, C). The subgroup Sp(4) is the intersection
of Sp(4, C) with U(4). A maximal torus T can be taken to be the group of
diagonal elements, and we will show that the roots are the eight characters
diagonal elements, and the we will show that the roots are the eight characters
18 The Root System 133


⎪ α1 (t) = t1 t−1 2 ,

⎪ 2
⎛ ⎞ ⎪
⎪ α2 (t) = t 2 ,


t1 ⎪
⎪ (α1 + α2 )(t) = t1 t2 ,
⎜ ⎟ ⎨ (2α + α )(t) = t2 ,
t2
T t=⎜ ⎟ −→ 1 2 1
⎝ t−1 ⎠ ⎪ −α (t) = t −1
1 t2 ,
2 ⎪

1
−1 ⎪
⎪ −α (t) = t −2
t1 ⎪ 2 2 ,

⎪ −1 −1

⎪ −(α + α )(t) = t 1 t2 ,
⎩ 1 2
−(2α1 + α2 )(t) = t−21 .

They form a configuration in V that can be seen in Fig. 19.4 of the next
chapter. The reader can check that this forms a root system.
The complexified Lie algebra gC consists of matrices of the form
⎛ ⎞
t1 x12 x13 x14
⎜ x21 t2 x23 x13 ⎟
⎜ ⎟
⎝ x31 x32 −t2 −x12 ⎠ . (18.7)
x41 x31 −x21 −t1
The spaces X_{α₁} and X_{−α₁} are spanned by the vectors

    Xα₁ = ⎛0 1 0  0⎞          X−α₁ = ⎛0 0  0 0⎞
          ⎜0 0 0  0⎟                 ⎜1 0  0 0⎟
          ⎜0 0 0 −1⎟ ,               ⎜0 0  0 0⎟ .
          ⎝0 0 0  0⎠                 ⎝0 0 −1 0⎠

Similarly, the spaces X_{α₂} and X_{−α₂} are spanned by

    Xα₂ = ⎛0 0 0 0⎞           X−α₂ = ⎛0 0 0 0⎞
          ⎜0 0 1 0⎟                  ⎜0 0 0 0⎟
          ⎜0 0 0 0⎟ ,                ⎜0 1 0 0⎟ .
          ⎝0 0 0 0⎠                  ⎝0 0 0 0⎠

As you can see, Ad(t)Xα = α(t)Xα when α = α₁ or α₂. This proves that α₁
and α₂ are roots, and the four others are handled similarly. Note that these
Xα are elements not of g but of its complexification g_C, since to be in g, the
element (18.7) must be skew-Hermitian, which means that the tᵢ are purely
imaginary, and xᵢⱼ = −x̄ⱼᵢ.
As we have mentioned, the proof that the set of roots of a compact Lie
group forms a root system involves constructing certain elements Hα of it.
In this example,

    Hα₁ = ⎛1  0 0  0⎞          Hα₂ = ⎛0 0  0 0⎞
          ⎜0 −1 0  0⎟                ⎜0 1  0 0⎟
          ⎜0  0 1  0⎟ ,              ⎜0 0 −1 0⎟ .
          ⎝0  0 0 −1⎠                ⎝0 0  0 0⎠

Note that Hα ∉ t, but −iHα ∈ t, since the elements of t are diagonal and
purely imaginary. The Hα satisfy

[Hα , Xα ] = 2Xα , [Hα , X−α ] = −2X−α ,

and they are elements of the intersection of it with the complex Lie algebra
generated by Xα and X−α . We note that Xα and X−α are only determined
up to constant multiples by the description we have given, but Hα is fully
characterized. The Hα will be constructed in Proposition 18.8 below. They
form a root system that is dual to the one we want to construct—if α is a
long root, then Hα is short, and conversely, in root systems where not all the
roots have the same length. (See Exercise 18.1.)
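All of these Sp(4) assertions are finite matrix computations. The sketch below (conventions as in (18.7); illustrative only) checks that the displayed elements lie in sp(4, C), that [Hα, X±α] = ±2X±α, and that Ad(t) scales Xα₁ and Xα₂ by α₁(t) and α₂(t).

```python
import numpy as np

def E(i, j):
    M = np.zeros((4, 4), dtype=complex); M[i, j] = 1; return M

# J as above; sp(4, C) is the set of X with X J + J X^T = 0.
J = np.zeros((4, 4)); J[0, 3], J[1, 2], J[2, 1], J[3, 0] = -1, -1, 1, 1

H1 = np.diag([1, -1, 1, -1]).astype(complex)     # H_alpha1
H2 = np.diag([0, 1, -1, 0]).astype(complex)      # H_alpha2
X1, Xm1 = E(0, 1) - E(2, 3), E(1, 0) - E(3, 2)   # X_alpha1, X_-alpha1
X2, Xm2 = E(1, 2), E(2, 1)                       # X_alpha2, X_-alpha2

brk = lambda A, B: A @ B - B @ A
in_sp4 = lambda X: np.allclose(X @ J + J @ X.T, 0)

t1, t2 = np.exp(0.3j), np.exp(1.1j)
t = np.diag([t1, t2, 1 / t2, 1 / t1])            # a point of the maximal torus
Ad = lambda X: t @ X @ np.linalg.inv(t)
print(all(in_sp4(M) for M in (H1, H2, X1, Xm1, X2, Xm2)))
```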
A key step will be to construct an element wα of the Weyl group W =
N (T )/T corresponding to the reflection sα in (18.1). In order to produce this,
we will construct a homomorphism iα : SU(2) −→ G. Then wα ∈ N (T ) will
be the image of

    ⎛0 −1⎞
    ⎝1  0⎠

under iα .
Let us offer a word about how one can get a grip on iα . The centralizer
C(Tα ) of the kernel Tα of the homomorphism α : T −→ C× is a close relative of
this group iα (SU(2)). In fact, C(Tα ) = iα (SU(2))·T . Later, in Proposition 18.6
we will use this circumstance to show that Xα is one-dimensional, after which
the structure of C(Tα ) will become clear: since this group has only two roots
α and −α, it is itself a close relative of SU (2). Its Lie algebra contains a copy
of su(2) (Proposition 18.8) and using this fact we will be able to construct the
homomorphism iα in Theorem 18.1.
Let us take a look at the groups C(Tα) and the homomorphisms iα in the
example of Sp(4). The subgroup Tα₁ of T is characterized by t₁ = t₂, so its
centralizer consists of elements of the form

    ⎛a b    ⎞
    ⎜c d    ⎟        with ⎛a b⎞ ∈ U(2),
    ⎜    ∗ ∗⎟ ,           ⎝c d⎠
    ⎝    ∗ ∗⎠

where the elements marked ∗ are determined by the requirement that the
matrix be in Sp(4). The homomorphism iα₁ is given by

    iα₁ ⎛a b⎞ = ⎛a b      ⎞
        ⎝c d⎠   ⎜c d      ⎟ ,     ⎛a b⎞ ∈ SU(2).
                ⎜     a −b⎟       ⎝c d⎠
                ⎝    −c  d⎠

Similarly, Tα₂ is characterized by t₂ = ±1, and

    iα₂ ⎛a b⎞ = ⎛1      ⎞
        ⎝c d⎠   ⎜  a b  ⎟ .
                ⎜  c d  ⎟
                ⎝      1⎠
1

We turn now to the general case and to the proofs.



Proposition 18.2. A maximal Abelian subalgebra h of g is the Lie algebra of


a conjugate of T . Its dimension is the rank r of G.

Proof. By Proposition 15.2, exp(h) is a commutative group that is connected
since it is the continuous image of a connected space. By Theorem 15.2 its
closure H is a Lie subgroup of G, closed, connected, and Abelian, and therefore
a torus. It is therefore contained in a maximal torus H′. By maximality of h ⊆
Lie(H′) we must have h = Lie(H′) and H′ = H. By Cartan's Theorem 16.5,
H is a conjugate of T . □


Lemma 18.1. Suppose that G is a compact Lie group with Lie algebra g,
π : G −→ GL(V ) a representation, and dπ : g −→ End(V ) the differential. If
v ∈ V and X ∈ g are such that dπ(X)ⁿv = 0 for some n ≥ 1, then dπ(X)v = 0.

Proof. We may put a G-invariant positive definite inner product ⟨ , ⟩ on V .
The inner product is then g-invariant, which means that ⟨dπ(X)v, w⟩ =
−⟨v, dπ(X)w⟩. Thus, dπ(X) is skew-Hermitian, which by the spectral theorem
implies that V has a basis with respect to which its matrix is diagonal. It is
clear that, for a diagonal matrix M , Mⁿv = 0 implies that M v = 0. □
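The spectral argument in this proof has a concrete numerical counterpart: a skew-Hermitian matrix is normal, so higher powers acquire no new kernel. A sketch (not from the text), building a rank-3 skew-Hermitian matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5)))
# A skew-Hermitian matrix with purely imaginary eigenvalues i, 2i, 0, 0, -i.
M = Q @ np.diag(1j * np.array([1.0, 2.0, 0.0, 0.0, -1.0])) @ Q.conj().T

# ker(M^3) = ker(M): the ranks of M and M^3 agree.
r1 = np.linalg.matrix_rank(M)
r3 = np.linalg.matrix_rank(np.linalg.matrix_power(M, 3))
print(r1, r3)
```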


Let (π, V ) be any finite-dimensional complex representation of G. If λ ∈


X ∗ (T ), let V (λ) = {v ∈ V | π(t)v = λ(t)v}. Then V is the direct sum of
the V (λ). If (π, V ) = (Ad, gC ) and λ = α is a root, then V (λ) = Xα .

Proposition 18.3. Let (π, V ) be any irreducible representation of G, and let


α be a root.
(i) If dπ : g −→ gl(V ) is the differential of π, then

dπ(H)v = dλ(H)v, H ∈ t, v ∈ V (λ). (18.8)

(ii) We have

[H, Xα ] = ad(H)Xα = dα(H)Xα , H ∈ t, Xα ∈ Xα . (18.9)

(iii) If (π, V ) is a finite-dimensional complex representation of G and v ∈ V (λ)


for some λ ∈ X ∗ (T ), then dπ(Xα )v ∈ V (λ + α).

Proof. For (i), if H ∈ t and t ∈ R, then for v ∈ V (λ) we have

π(etH )v = λ(etH )v = etdλ(H) v.

Taking the derivative and setting t = 0, using (18.3) we obtain (18.8). When
V = gC and π = Ad, we have Xα = V (λ). Remembering that the differential
of Ad is ad (Theorem 8.2), we see that (18.9) is a special case of (18.8), and
(ii) follows.
For (iii), we have, by (18.9),

dπ(H) dπ(Xα ) − dπ(Xα ) dπ(H) = dπ[H, Xα ] = dα(H)dπ(Xα ).



Applying this to v and using (18.8) gives, with w = dπ(Xα)v,

    dπ(H)w = (dλ(H) + dα(H)) w,

so w ∈ V (λ + α). □

We may write g_C = g + ig. Let c : g_C −→ g_C be the conjugation with respect
to g, that is, the real linear transformation X + iY −→ X − iY (X, Y ∈ g).
Although c is not complex linear, it is an automorphism of g_C as a real Lie
algebra. We have c(aZ) = ā · c(Z), so c is complex antilinear.
Proposition 18.4.
(i) We have c(Xα) = X−α .
(ii) If Xα ∈ Xα , Xβ ∈ Xβ , α, β ∈ Φ, then

    [Xα , Xβ ] ∈  t_C        if β = −α ,
                  X_{α+β}    if α + β ∈ Φ ,

while [Xα , Xβ ] = 0 if β ≠ −α and α + β ∉ Φ.
(iii) If 0 ≠ Xα ∈ Xα , then [Xα , c(Xα)] is a nonzero element of it, and
dα([Xα , c(Xα)]) ≠ 0.

In case (ii), if α + β ∈ Φ, we will eventually show that [Xα , Xβ ] is a nonzero
element of X_{α+β}. See Corollary 18.1.
Proof. For (i), apply c to (18.9) using the complex antilinearity of c, and the
fact that dα(H) is purely imaginary to obtain, for H ∈ t

[H, c(Xα )] = [c(H), c(Xα )] = c[H, Xα ] = c(dα(H)Xα ) = −dα(H)c(Xα ).

This shows that c(Xα ) ∈ X−α .


Part (ii) is the special case of Proposition 18.3 (iii) when π = Ad and
V = gC since tC = V (0) while Xα = V (α) when α ∈ Φ.
Next we prove (iii). By (i) and (ii), [Xα , c(Xα )] ∈ tC . Applying c to this
element,

c [Xα , c(Xα )] = [c(Xα ), Xα ] = −[Xα , c(Xα )],

so [Xα , c(Xα )] ∈ it. We show that [Xα , c(Xα )] = 0. Let tα ⊂ t be the kernel
of dα. It is of course a subspace of codimension 1. Let H1 , . . . , Hr−1 be a basis.
If [Xα , c(Xα )] = 0, then denoting

Yα = 12 Xα + c(Xα ) , 1
Zα = 2i Xα − c(Xα ) , (18.10)

Yα and Zα are c-invariant and hence in g, and

H1 , . . . , Hr−1 , Yα , Zα

are r + 1 commuting elements of g that are linearly independent over R. This
contradicts Proposition 18.2, so [Xα , c(Xα )] ≠ 0.

It remains to be shown that dα([Xα , c(Xα )]) ≠ 0. If on the contrary
this vanishes, then [H0 , Xα ] = [H0 , c(Xα )] = 0 by (18.9), where H0 =
−i[Xα , c(Xα )] ∈ t. With Yα and Zα as in (18.10), this implies that [H0 , Yα ] =
[H0 , Zα ] = 0. Now

[Yα , Zα ] = −½ H0 ,     [Yα , H0 ] = 0.

Thus, ad(Yα )² Zα = 0, yet ad(Yα )Zα ≠ 0, contradicting Lemma 18.1.




Proposition 18.5. If dim(T ) = 1, then either G = T or dim(G) = 3. If α is


any root, then Xα is one-dimensional, and α, −α are the only roots.

Proof. Since tC is one-dimensional, let H be a basis vector. Assuming G ≠ T,
Φ is nonempty. The spaces Xα are just the eigenspaces of ad(H) on gC . Since T
is one-dimensional, so is V. Thus, if α ∈ Φ, every β ∈ Φ is of the form λα for
a nonzero constant λ. We choose α so that all |λ| ≥ 1. Let 0 ≠ Xα ∈ Xα , and
let X−α = −c(Xα ). We consider the complex vector space
V = CX−α ⊕ tC ⊕ ⨁λα∈Φ, λ>0 Xλα .

By Proposition 18.4, each component space is mapped into another by ad(Xα )


and ad(X−α ). Indeed, ad(X−α ) kills X−α , shifts tC into CX−α , and shifts Xλα
into tC if λ = 1 or into X(λ−1)α if λ ≠ 1. Note that λ > 0 implies λ ≥ 1, so indeed V
is stable under ad(X−α ). The case of ad(Xα ) is similar. Moreover, [Xα , X−α ]
is a nonzero multiple of H by Proposition 18.4. Since the commutator of two
linear transformations on a finite-dimensional vector space has trace zero, the
trace of H on V is therefore zero.
On the other hand, denoting C = dα(H), the trace of ad(H) on Xλα
equals λC dim(Xλα ), while the trace of ad(H) on CX−α is −C, and the trace
of ad(H) on tC is zero. We see that the trace is −C + ∑λ≥1 λC dim(Xλα ).
Since this is zero, there can be only one Xλα with λ > 0, namely Xα , and
dim(Xα ) = 1. Now g = CH ⊕ CXα ⊕ CX−α is three-dimensional.


We return now to the general case. If α ∈ Φ, let Tα ⊂ T be the kernel of α.


This closed subgroup of T may or may not be connected. Its Lie algebra is
the kernel tα of dα.

Proposition 18.6.
(i) If α ∈ Φ, then dim(Xα ) = 1.
(ii) If α, β ∈ Φ and α = λβ, λ ∈ R, then λ = ±1.

Proof. The group H = CG (Tα ) is a closed connected Lie subgroup by


Theorem 16.6. It has Tα as a normal subgroup. The Lie algebra of H is the
centralizer h in g of tα , so
hC = tC ⊕ ⨁λα∈Φ, λ∈R Xλα .

Thus H/Tα is a rank 1 group with maximal torus T /Tα . Its complexified Lie
algebra is therefore three-dimensional by Proposition 18.5. However, ⨁ Xλα
is embedded injectively in this complexified Lie algebra, so λ = ±1 are the
only possibilities, and X±α are one-dimensional.

Proposition 18.7.
(i) Let g be the Lie algebra of a compact Lie group. If X, Y ∈ g such that
[X, Y ] = cY with c a nonzero real constant, then Y = 0.
(ii) There does not exist any embedding of sl(2, R) into the Lie algebra of a
compact Lie group.
Proof. Let G be a compact Lie group with Lie algebra g. Then given any
finite-dimensional representation π : G −→ GL(V ) on a real vector space
V , there exists a positive-definite symmetric bilinear form B on V such that
B(π(g)v, π(g)w) = B(v, w) for g ∈ G, v, w ∈ V . By Proposition 10.3, we have
B(dπ(X)v, w) = −B(v, dπ(X)w) for X ∈ g. Now let us apply this with V = g
and π = Ad, so by Theorem 8.2 we have B([X, Y ], Z) = −B(Y, [X, Z]). If X
and Y such that [X, Y ] = cY with c a nonzero constant, then
cB(Y, Y ) = B([X, Y ], Y ) = −B(Y, [X, Y ]) = −cB(Y, Y ).
Since c = 0 and B is positive definite, it follows that Y = 0. This proves (i).
As for (ii), if g contains a subalgebra isomorphic to sl(2, R), then we may
take X and Y to be the images of

⎛ 1  0 ⎞       ⎛ 0 1 ⎞
⎝ 0 −1 ⎠  and  ⎝ 0 0 ⎠ ,

and obtain a contradiction to (i), since [X, Y ] = 2Y .

We remind the reader that the Lie algebra su(2) of SU(2) consists of the
trace zero skew-Hermitian matrices in Mat(2, C). The Lie algebra sl(2, C) of
SL(2, C) consists of all trace zero matrices. Any trace zero matrix X may
be uniquely written as X1 + iX2 where X1 and X2 are in su(2), so sl(2, C) is
the complexification of su(2).
Proposition 18.8. Let α ∈ Φ and let 0 = Xα ∈ Xα . Let X−α = −c(Xα ) ∈
X−α . Then Xα and X−α generate a complex Lie subalgebra gα,C of gC isomor-
phic to sl(2, C). Its intersection gα = g ∩ gα,C is isomorphic to su(2). We may
choose Xα and the isomorphism iα : sl(2, C) −→ gα,C so that
     
   ⎛ 1  0 ⎞            ⎛ 0 1 ⎞            ⎛ 0 0 ⎞
iα ⎝ 0 −1 ⎠ = Hα ,  iα ⎝ 0 0 ⎠ = Xα ,  iα ⎝ 1 0 ⎠ = X−α ,        (18.11)

where Hα = [Xα , X−α ]. In this case, Hα ∈ it and

[Xα , X−α ] = Hα , [Hα , Xα ] = 2Xα , [Hα , X−α ] = −2X−α . (18.12)



Proof. Let Hα = [Xα , X−α ]. By Proposition 18.4(iii), Hα is a nonzero element


of it not in itα . By Proposition 18.4(iii) and (18.9), we have [Hα , Xα ] = 2λXα ,
where λ is a nonzero real constant. Applying c and using c(Hα ) = −Hα , we
have [Hα , X−α ] = −2λX−α .
We will show later that λ > 0. For now, assume this. Replacing Xα , X−α
and Hα by λ−1/2 Xα , λ−1/2 X−α , and λ−1 Hα , we may arrange that (18.12)
is satisfied. Since the three matrices in sl(2, C) in (18.11) satisfy the same
relations, we have an isomorphism iα : sl(2, C) −→ gα,C such that (18.11) is
true. Now the real subalgebra gα fixed by the conjugation c is spanned as a
real vector space by iHα , i(Xα + X−α ) and Xα − X−α . (Here i = √−1, not to
be confused with iα .) These are the images under iα of

⎛ i  0 ⎞   ⎛ 0 i ⎞   ⎛  0 1 ⎞
⎝ 0 −i ⎠ , ⎝ i 0 ⎠ , ⎝ −1 0 ⎠ ,

which span su(2).


It remains to be shown that λ > 0. If not, we will obtain a contradiction.
Replacing Xα , X−α and Hα by |λ|−1/2 Xα , |λ|−1/2 X−α , and λ−1 Hα gives

[Xα , X−α ] = Hα , [Hα , Xα ] = −2Xα , [Hα , X−α ] = 2X−α .

We may now obtain an isomorphism iα of sl(2, C) with gα,C by

   ⎛ 1  0 ⎞            ⎛ 0 0 ⎞            ⎛ 0 i ⎞
iα ⎝ 0 −1 ⎠ = Hα ,  iα ⎝ i 0 ⎠ = Xα ,  iα ⎝ 0 0 ⎠ = X−α .

The real subalgebra gα fixed by the conjugation c is generated by iHα , i(Xα +
X−α ) and Xα − X−α , and these correspond to

⎛ i  0 ⎞   ⎛  0 −1 ⎞   ⎛ 0 −i ⎞
⎝ 0 −i ⎠ , ⎝ −1  0 ⎠ , ⎝ i  0 ⎠ ,

and these generate the Lie algebra su(1, 1), which is isomorphic to sl(2, R) by
Exercise 5.9. This is a contradiction because sl(2, R) cannot be embedded in
the Lie algebra of a compact Lie group by Proposition 18.7.
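The relations (18.12), and the fact that the associated real span lands in su(2), are easy to verify numerically. The following sketch (an added illustration using NumPy, not part of the text's argument) checks the bracket relations for the matrix triple of (18.11):

```python
import numpy as np

# The sl(2, C) triple of (18.11): H (coroot direction), X (raising), Y (lowering).
H = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [0, 0]], dtype=complex)
Y = np.array([[0, 0], [1, 0]], dtype=complex)

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return A @ B - B @ A

# The relations (18.12).
assert np.allclose(bracket(X, Y), H)
assert np.allclose(bracket(H, X), 2 * X)
assert np.allclose(bracket(H, Y), -2 * Y)

# iH, i(X + Y), X - Y are trace-zero and skew-Hermitian, i.e. lie in su(2).
for M in (1j * H, 1j * (X + Y), X - Y):
    assert np.allclose(M, -M.conj().T)
    assert abs(np.trace(M)) < 1e-12
print("relations (18.12) verified")
```
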


Since Xα is one-dimensional, the Lie algebra gα does not depend on the choice


of Xα .

Proposition 18.9. If H ∈ tα = ker(dα), then [H, gα ] = 0.

Proof. H centralizes Xα and X−α by (18.9); that is, [H, Xα ] = [H, X−α ] = 0,
and it follows that [H, X] = 0 for all X ∈ gα .


We gave the ambient vector space V of the set Φ of roots an inner product
(Euclidean structure) invariant under W . The Weyl group acts on T by conju-
gation and hence it acts on X ∗ (T ). It acts on p by the adjoint representation

(induced from conjugation) so it permutes the roots. The Weyl group elements
are realized as orthogonal motions with respect to this metric.
We may now give a method of constructing Weyl group elements. Let
α ∈ Φ. Let Tα = {t ∈ T | α(t) = 1}.

Theorem 18.1. Let α ∈ Φ. There exists a homomorphism iα : SU(2) −→


C(Tα )◦ ⊂ G whose differential diα : su(2) −→ g is the Lie algebra
homomorphism of Proposition 18.8. If
 
        ⎛ 0 −1 ⎞
wα = iα ⎝ 1  0 ⎠ ,                        (18.13)

then wα ∈ N (T ) and wα induces sα in its action on X ∗ (T ).

Proof. Since SU(2) is simply connected, it follows from Theorem 14.2 that the
Lie algebra homomorphism su(2) −→ g of Proposition 18.8 is the differential
of a homomorphism iα : SU(2) −→ G. By Proposition 18.9, gα centralizes tα ,
and since SU(2) is connected, it follows that iα (SU(2)) ⊆ C(Tα )◦ .
By Proposition 18.4, −iHα ∉ tα , so t is generated by its codimension-
one subspace tα and iα (su(2)) ∩ t. Since Lie(Tα ) = tα , it follows that T is
generated by Tα and T ∩ iα (SU(2)). By construction, wα normalizes

T ∩ iα (SU(2)) = { iα (diag(y, y −1 )) | y ∈ C, |y| = 1 } ,

and since iα (SU(2)) ⊆ C(Tα )◦ , wα also normalizes Tα .


Since we chose a W -invariant inner product, any element of the Weyl group
acts by a Euclidean motion. Since wα centralizes Tα , it acts trivially on tα
and thus fixes a codimension-one subspace in V. It also maps α −→ −α, and
these two properties characterize sα .
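The action of wα can be made concrete inside SU(2) itself. The following sketch (an added numerical illustration) checks that the matrix in (18.13) conjugates a torus element diag(y, y⁻¹) to its inverse, which is the reflection on characters:

```python
import numpy as np

w = np.array([[0, -1], [1, 0]], dtype=complex)   # the matrix in (18.13)
y = np.exp(0.7j)                                  # arbitrary y with |y| = 1
t = np.diag([y, 1 / y])                           # torus element diag(y, y^-1)

# Conjugation by w inverts the torus element: w t w^{-1} = diag(y^{-1}, y).
conj = w @ t @ np.linalg.inv(w)
assert np.allclose(conj, np.diag([1 / y, y]))
print("w t w^-1 = t^-1 on the diagonal torus")
```
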


Proposition 18.10. Let (π, V ) be a finite-dimensional representation of G,


and let λ ∈ X ∗ (T ) be such that V (λ) ≠ 0. Then 2⟨λ, α⟩/⟨α, α⟩ ∈ Z for all
α ∈ Φ.

Proof. Let
W = ⨁k∈Z V (λ + kα).

By Proposition 18.4, this subspace is stable under dπ(Xα ) and dπ(X−α ).


It is therefore invariant under the Lie algebra gα,C that they generate and its
subalgebra gα . Thus, it is invariant under iα (SU(2)), in particular under wα in
Theorem 18.1. Thus, wα V (λ) = V (λ + kα) for some k ∈ Z, and by (18.1) we
have k = −2⟨λ, α⟩/⟨α, α⟩. That proves that this is an integer.


Theorem 18.2. If Φ is the set of roots associated with a compact Lie group
and its maximal torus T , then Φ is a reduced root system.

Proof. Clearly, Φ is a set of nonzero vectors in a Euclidean space V. The fact


that Φ is invariant under sα , α ∈ Φ follows from the construction of
wα ∈ N (T ), the conjugation of which induces sα in Theorem 18.1. The fact
that 2⟨β, α⟩/⟨α, α⟩ ∈ Z for α, β ∈ Φ follows from applying
Proposition 18.10 to (Ad, gC ). Thus Φ is a root system. It is reduced by
Proposition 18.6.
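As an added illustration (using the rank-two root system B2 described in Chap. 19), the defining properties of a reduced root system can be verified by brute force:

```python
import itertools
import numpy as np

# The eight roots of B2: short roots ±e1, ±e2 and long roots ±e1 ± e2.
roots = [np.array(v) for v in
         [(1, 0), (-1, 0), (0, 1), (0, -1),
          (1, 1), (1, -1), (-1, 1), (-1, -1)]]
root_set = {tuple(r) for r in roots}

for a, b in itertools.product(roots, repeat=2):
    # Cartan integer condition: 2<b, a>/<a, a> is an integer.
    n = 2 * np.dot(b, a) / np.dot(a, a)
    assert n == round(n)
    # The reflection s_a permutes the roots.
    refl = b - (2 * np.dot(b, a) / np.dot(a, a)) * a
    assert tuple(int(round(x)) for x in refl) in root_set

# Reduced: the only multiples of a root alpha in Phi are ±alpha.
for a in roots:
    mult = [tuple(b) for b in roots if abs(a[0] * b[1] - a[1] * b[0]) < 1e-9]
    assert sorted(mult) == sorted([tuple(a), tuple(-a)])
print("B2 is a reduced root system")
```
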


Proposition 18.11. Let λ ∈ X ∗ (T ). Then there exists a finite-dimensional


complex representation (π, V ) of G such that V (λ) ≠ 0.

Proof. Consider the subspace L(λ) of L2 (G) of functions f satisfying

f (tg) = λ(t)f (g)

for t ∈ T . Let G act on L(λ) by right translation: ρ : G −→ End(V ) is the map


ρ(g)f (x) = f (xg). Clearly, L(λ) is an invariant subspace under this action,
and by Theorem 4.3 it decomposes into a direct sum of finite-dimensional
irreducible invariant subspaces. Let V be one of these subspaces, and let π
be the representation of G on V . Every linear functional on V has the form
x −→ ⟨x, f0 ⟩, where f0 is a vector and ⟨ , ⟩ is the L² inner product. Thus,
there exists an f0 ∈ V such that f (1) = ⟨f, f0 ⟩ for all f ∈ V . Clearly, f0 ≠ 0.
We have

⟨f, π(t)f0 ⟩ = ⟨π(t−1 )f, f0 ⟩ = (π(t−1 )f )(1) = f (t−1 ) = λ(t)−1 f (1) = ⟨f, λ(t)f0 ⟩.

Therefore π(t)f0 = λ(t)f0 and so V (λ) ≠ 0.




Proposition 18.12. If Hα is as in Proposition 18.8 and wα ∈ N (T ) is as in


Theorem 18.1, then Ad(wα )Hα = −Hα .

Proof. Since wα lies in iα (SU(2)), and since by Proposition 18.8 the element
−iHα lies in the image of the Lie algebra of SU(2) under the differential of
iα , we may work in SU(2) to confirm this. The result follows from (18.11)
and (18.13).


We now check the identity (18.4).

Proposition 18.13. Let λ ∈ V and α ∈ Φ. Then dλ(Hα ) = α∨ (λ).

(See Remark 18.1 about the notation dλ.)

Proof. First let us show that λ and α are orthogonal if and only if dλ(Hα ) = 0
with Hα as in Proposition 18.4. It is sufficient to show that the orthogonal
complement of α is contained in the kernel of this functional since both are
subspaces of codimension 1. Assuming therefore that α and λ are orthogonal,
sα (λ) = λ, and since the action of W on X ∗ (T ) and V = R⊗X ∗ (T ) is induced
by the action of W on T by conjugation, whose differential is the action of W
on t via Ad, we have

dλ(Hα ) = dλ(Ad(wα )Hα ) = −dλ(Hα )

by Proposition 18.12.
The result is now proved in the case where λ and α are orthogonal.
Therefore dλ(Hα ) = cα∨ (λ) for some constant c. To show that c = 1, we
take λ = α and (remembering that iHα ∈ t) check that dα(iHα ) = 2i. Indeed
we have
dα( iα (diag(i, −i)) ) = (d/dt) α( iα (diag(eit , e−it )) ) |t=0 = (d/dt) e2it |t=0 = 2i.
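The same computation can be phrased in coordinates: conjugating E12 by diag(y, y⁻¹) multiplies it by y², so α(diag(y, y⁻¹)) = y², consistent with the derivative just computed. A short numerical check (an added sketch):

```python
import numpy as np

y = np.exp(0.3j)                         # arbitrary y with |y| = 1
t = np.diag([y, 1 / y])
E12 = np.array([[0, 1], [0, 0]], dtype=complex)

# Ad(t) E12 = t E12 t^{-1} = y^2 E12, so the root is alpha(t) = y^2.
assert np.allclose(t @ E12 @ np.linalg.inv(t), y**2 * E12)
print("alpha(diag(y, 1/y)) = y^2")
```
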

We recall that if α is a root of G, then Tα ⊂ T is the kernel of α. An element


of G is called regular if it is contained in a unique maximal torus. Otherwise, it
is called singular.

Proposition 18.14.
(i) ⋂α∈Φ Tα is the center Z(G).
(ii) ⋃α∈Φ Tα is the set of singular elements of T .

Of course, Tα = T−α , so we could equally well write Z(G) = ⋂α∈Φ+ Tα .

Proof. For (i), any element of G is conjugate to an element of T . If it is in


Z(G), conjugation does not move it, so Z(G) ⊂ T . Now G is generated by
T together with the subgroups iα (SU(2)) as α runs through the roots of G,
because the Lie algebras of these groups generate the Lie algebra g, and G
is connected. Hence x ∈ T is in Z(G) if and only if it commutes with each of
these subgroups. From the construction of the groups iα (SU(2)), this is true
if and only if x is in the kernel of the representation induced by Ad on the
two-dimensional T -invariant subspace Xα ⊕ X−α . This kernel is Tα . Thus, x is
central if and only if x ∈ Tα for every root α, and the center of G is the
intersection of the Tα .
For (ii), suppose that T and T ′ are distinct maximal tori containing t.
Then both are contained in the connected centralizer C(t)◦ , and so by Theo-
rem 16.5 applied to this connected Lie group, they are conjugate in C(t)◦ . The
complexified Lie algebra of C(t)◦ must contain Xα for some α since otherwise
C(t)◦ would be a compact connected Lie group with no roots and hence a
torus, contradicting the assumption that T ≠ T ′ . Thus, t ∈ Tα . Conversely,
if t ∈ Tα , it is contained in every maximal torus in C(Tα )◦ , which is non-
Abelian, so there is more than one such torus.


Theorem 18.3. The Weyl group W = N (T )/T is generated by the wα with


α ∈ Φ.

Proof. Arguing by contradiction, choose w ∈ N (T )/T that is not in the sub-


group generated by the wα . If α ∈ Φ let tα be the Lie algebra of the group
Tα which is the kernel of α. They are hyperplanes in t, the kernels of the

linear functionals dα. Let us partition t into open chambers which are the
complements of the tα . Let C be one of these. Choose the counterexample w
to minimize the number of hyperplanes tα separating the chambers C and wC.
Since wα reflects in the hyperplane tα , we must have w(C) = C. We will argue
that w = 1, which will be a contradiction. Let n ∈ N (T ) represent w. What
we need to show is that n ∈ T .
Since w has finite order and maps C to itself, and since C is convex, we may
find an element H of C such that w(H) = H; simply averaging any element
over its orbit under powers of w will produce such an H. Since H does not lie in
any of the tα , the one-parameter subgroup S = {exp(tH) | t ∈ R} ⊂ T contains
regular elements. Since Ad(n) fixes H, n is in the centralizer CG (S). We claim
that CG (S) = T . First note that if g ∈ CG (S) then gT g −1 contains regular
elements of T , so gT g −1 = T . Thus CG (S) ⊂ NG (T ). But CG (S) is connected
by Theorem 16.6, so n ∈ CG (S) ⊆ NG (T )◦ = T by Proposition 15.8. Therefore
n ∈ T , as required.


Proposition 18.15. Suppose that α ∈ Φ. Let β ≠ ±α be another root. Let


W = ⨁k∈Z, β+kα∈Φ Xβ+kα .                  (18.14)

Then W is an irreducible module for iα (sl(2, C)) in the adjoint representation.

Proof. Denote gα,C = iα (sl(2, C)). First we note that W is an sl(2, C)-module,
since by Proposition 18.4 it is closed under the Lie bracket with Xα and X−α
(which generate gα,C ). Therefore, it is a module for iα (sl(2, C)).
We must show that it is irreducible. Let TSU(2) be the maximal torus
  
TSU(2) = { diag(t, t−1 ) | t ∈ C, |t| = 1 }

of SU(2). The inclusion iα : TSU(2) −→ T induces a homomorphism X ∗ (T ) −→


X ∗ (TSU(2) ). The image of α is the positive root α′ of SU(2), and the image of β
is a weight β ′ . In the notation (18.5) we have α′ = λ2 and β ′ = λm for some m.
All of the weights β ′ + kα′ of iα (SU(2)), or of its complexified Lie algebra
iα (sl(2, C)), in W are of the form λm+2k , where the indices m + 2k have the
same parity as m. Thus, decomposing into irreducibles, W is a direct sum of
modules ∨ni C2 (i = 1, 2, . . .) where the ni have the same parity as m. If there
is more than one of these, then without loss of generality we may assume that
n1 > n2 , in which case λn2 occurs as a weight of both ∨n1 C2 and ∨n2 C2 ,
so λn2 occurs with multiplicity two in W . Writing n2 = m + 2k, this means
that Xβ+kα is more than one-dimensional, contradicting Proposition 18.6.
Therefore W is irreducible.


Corollary 18.1. Suppose that α, β and α+ β ∈ Φ. Let Xα and Xβ be nonzero


elements of Xα and Xβ . Then [Xα , Xβ ] is a nonzero element of Xα+β .

Proof. We may identify the decomposition (18.14) with the irreducible module
described in (12.1). Now Xα is iα (R) in the notation of that proposition. Since
α + β is a root, Xβ is spanned by vk−2l with l > 0, and the nonvanishing of
[Xα , Xβ ] = ad(Xα )Xβ follows from (12.3).
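For a concrete instance (an added illustration in sl(3, C), the A2 case described in Chap. 19): with α = e1 − e2 and β = e2 − e3 , the root vectors are the elementary matrices E12 and E23 , and their bracket is E13 , a nonzero element of the α + β root space:

```python
import numpy as np

def E(i, j, n=3):
    """Elementary matrix with a 1 in position (i, j), 1-indexed."""
    M = np.zeros((n, n))
    M[i - 1, j - 1] = 1
    return M

# [E12, E23] = E13: the bracket of root vectors for alpha and beta
# is a nonzero element of the (alpha + beta) root space.
commutator = E(1, 2) @ E(2, 3) - E(2, 3) @ E(1, 2)
assert np.allclose(commutator, E(1, 3))
print("[E12, E23] = E13")
```
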


Exercises

Exercise 18.1.
(i) Let Φ be a root system in a Euclidean space V and for α ∈ Φ let

α∨ = 2α/⟨α, α⟩ .

Show that the α∨ also form a root system. Note that long roots in Φ correspond
to short vectors in Φ∨ . (Hint: Prove this first for rank two root systems, then note
that if α, β ∈ Φ are linearly independent roots the intersection of Φ with their span
is a rank two root system.)
(ii) Explain why this implies that the Hα form a root system in it.

Exercise 18.2. Analyze the root system of SO(5) similarly to the case of Sp(4) in
the text. It may be helpful to use Exercise 7.3.
19
Examples of Root Systems

It may be easiest to read the next chapter with examples in mind. In this
chapter we will describe various root systems and in particular we will illus-
trate the rank 2 root systems. Since the purpose of this chapter is to give
examples, we will state various facts here without proof. The proofs will come
in later chapters.
A root system Φ in a Euclidean space V is called reducible if we can
decompose V = V1 ⊕ V2 into orthogonal subspaces with Φ = Φ1 ∪ Φ2 , with
both Φi = Vi ∩ Φ nonempty. Then the Φi are themselves smaller root systems.
In classifying root systems, one may clearly restrict to irreducible root sys-
tems, and these were classified by Killing and Cartan. The irreducible root
systems are classified by a Cartan type which can be one of the classical Cartan
types Ar (r ≥ 1), Br (r ≥ 2), Cr (r ≥ 3), Dr (r ≥ 4), or one of the five excep-
tional types G2 , F4 , E6 , E7 and E8 . The subscript is the (semisimple) rank of
the corresponding Lie groups. We have an accidental isomorphism B2 ≅ C2 .
The Cartan types D2 and D3 are usually excluded, but it may be helpful to
consider D2 as a synonym for the reducible Cartan type A1 × A1 (that is, A1
is the Cartan type of both Φ1 and Φ2 in the orthogonal decomposition); and
D3 as a synonym for A3 .
In the last chapter we saw how to associate a root system Φ with a compact
Lie group G. The Euclidean space V containing Φ is R ⊗ X ∗ (T ) where T is a maximal
torus. The group G is called semisimple if the root system Φ spans V = R ⊗ Λ
where Λ = X ∗ (T ) is the group of rational characters of a maximal torus T .
We will denote by g the Lie algebra of G and other notations will be as in
Chap. 18.
Within each Cartan type there may be several Lie groups to consider,
but in each case there is a unique semisimple simply connected group. There
is also a unique such group with trivial center, which is isomorphic to the simply
connected group modulo its finite center. This is called the adjoint group since
it is isomorphic to its image in GL(g) under the adjoint representation. Here is
a table giving the simply connected and adjoint groups for each of the classical
Cartan types.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225,
DOI 10.1007/978-1-4614-8024-2_19, © Springer Science+Business Media New York 2013

Cartan type   Simply connected G   Adjoint group     Other common instance
Ar            SU(r + 1)            U(r + 1)/center   U(r + 1) (not semisimple)
Br            Spin(2r + 1)         SO(2r + 1)
Cr            Sp(2r)               Sp(2r)/center
Dr            Spin(2r)             SO(2r)/{±I}       SO(2r)

Let us consider first the Cartan type Ar . We will describe three distinct
groups, U(r + 1), SU(r + 1) and PU(r + 1), which is U (r + 1) modulo its
one-dimensional center. These have the same root system, but the ambient
vector space V is different in each case.
The group U(r + 1) is not semisimple. Its rank is r + 1 but its semisimple
rank is r. The maximal torus T consists of diagonal matrices t with eigen-
values t1 , . . . , tr+1 in T, the group of complex numbers of absolute value 1.
We may identify Λ = X ∗ (T ) ≅ Zr+1 , in which λ = (λ1 , . . . , λr+1 ) with λi ∈ Z
represents the character t −→ ∏i tiλi . So V = R ⊗ X ∗ (T ) may be identified with


Rr+1 with the usual Euclidean inner product. Let ei = (0, . . . , 0, 1, 0, . . . , 0)
be the standard basis of Rr+1 . The root system consists of the r(r + 1) vectors

ei − ej , i ≠ j, (19.1)

having exactly two nonzero entries, one being 1 and the other −1. To see
that this is the root system, we recall that the complexified Lie algebra is
C ⊗ g ≅ glr+1 (C) = Matr+1 (C), since Matr+1 (C) = g ⊕ ig. (Every complex
matrix can be written uniquely as X + iY with X and Y skew-Hermitian.)
If α = ei − ej , the one-dimensional vector space Xα of Matr+1 (C) spanned
by the matrix Eij with a 1 in the i, j-position and 0’s everywhere else is an
eigenspace for T affording the character α, and these eigenspaces, together
with the complexified Lie algebra of T , span Matr+1 (C). So the ei − ej are precisely the roots of
U(r+1). The group U(r+1) has semisimple rank r, since that is the dimension
of the space spanned by these vectors.
Next consider the group SU(r + 1). This is the semisimple and simply
connected group with the same root system. The ambient space V is one
dimension smaller than for U(r + 1), because the ti are subject to the equa-
tion ∏ ti = det(t) = 1. Therefore, the character represented by λ ∈ Zr+1 is
trivial if λ is in the diagonal lattice Δ = Z(1, . . . , 1). Thus, for this group, the
weight lattice, which we will denote ΛSU(r+1) , is Zr+1 /Δ, and the space V is
r-dimensional. It is spanned by the roots, so this group is semisimple.
The group PU(r + 1) is U(r + 1) modulo its one-dimensional central torus.
It is the adjoint group for the Cartan type Ar . It is isomorphic to SU(r + 1)
modulo its finite center of order r + 1. A character of U(r + 1) parametrized
by λ ∈ Zr+1 is well-defined on PU(r + 1) if and only if it is trivial on the
center of U(r + 1), which requires ∑ λi = 0. So the lattice ΛPU(r+1) is
isomorphic to the sublattice of Zr+1 determined by this condition. The composition
tice of Zr+1 determined by this condition. The composition

ΛPU(r+1) −→ Zr+1 −→ Zr+1 /Δ = ΛSU(r+1)


19 Examples of Root Systems 147

where the first map is the inclusion and the second the projection is injective,
so we may regard ΛPU(r+1) as a sublattice of ΛSU(r+1) . Its index is r + 1.
Turning now to the general case, the set Φ of roots will be partitioned into
two parts, called Φ+ and Φ− . Exactly half the roots will be in Φ+ and the other
half in Φ− . This is accomplished by choosing a hyperplane through the origin
in V that does not pass through any root, and taking Φ+ to be the roots on one
side of the hyperplane, Φ− the roots on the other side. Although the choice

of the hyperplane is arbitrary, if another such decomposition Φ = Φ₁⁺ ∪ Φ₁⁻
is found by choosing a different hyperplane, a Weyl group element w can be
found such that w(Φ⁺ ) = Φ₁⁺ and w(Φ⁻ ) = Φ₁⁻ , so the procedure is not as
+
arbitrary as one might think. The roots in Φ will be called positive. In the
figures of this chapter, the positive roots are labeled •, and the negative roots
are labeled ◦.
If G is a semisimple compact connected Lie group, then its universal cover
G̃ is a cover of finite degree, and as in the last example (where G = PU(r + 1)
and G̃ = SU(r + 1)) the weight lattice of G is a sublattice of that of G̃. Moreover,
if π : G −→ GL(V ) is an irreducible representation, then we may compose
it with the canonical map G̃ −→ G and get a representation of G̃. So if we
understand the representation theory of G̃ we understand the representation
theory of G. For this reason, we will consider mainly the case where G is
simply connected and semisimple in the remaining examples of this chapter.
Assuming G is semisimple, so the αi span V, we will define certain special
elements of V as follows. If Σ = {α1 , . . . , αr } are the simple positive roots
then let {α1∨ , . . . , αr∨ } be the corresponding coroots. In the semisimple case,
the coroots span V ∗ , and the fundamental dominant weights ϖi are the dual
basis of V. Thus,

αj∨ (ϖi ) = δij (Kronecker δ).
We will show later that if G is simply connected, then the ϖi are in the weight
lattice Λ = X ∗ (T ), though if G is not simply connected, they may not all be
in Λ.
Another important particular vector is ρ, sometimes called the Weyl
vector. It may be characterized as half the sum of the positive roots, and
in the semisimple case it may also be characterized as the sum of the funda-
mental weights. (See Proposition 20.17.)
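For type A2 , in the coordinates used in the example that follows, the two characterizations of ρ can be compared directly (an added numerical check; the positive roots and fundamental weight vectors are those computed for SU(3) in this chapter):

```python
import numpy as np

# Positive roots of A2 and the fundamental dominant weights in the
# trace-zero hyperplane of R^3 (see the SU(3) example in this chapter).
pos_roots = [np.array(v, dtype=float)
             for v in [(1, -1, 0), (0, 1, -1), (1, 0, -1)]]
fund_weights = [np.array([2/3, -1/3, -1/3]), np.array([1/3, 1/3, -2/3])]

rho_half_sum = sum(pos_roots) / 2          # half the sum of positive roots
rho_weight_sum = sum(fund_weights)         # sum of fundamental weights
assert np.allclose(rho_half_sum, rho_weight_sum)
assert np.allclose(rho_half_sum, [1, 0, -1])
print("rho = (1, 0, -1) both ways")
```
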
For example, the root system of type A2 , pictured in Fig. 19.1, consists of
α1 = (1, −1, 0), α2 = (0, 1, −1), (1, 0, −1),
(−1, 1, 0), (0, −1, 1), (−1, 0, 1).
With G = SU(3), we really mean the images of these vectors in Z3 /Δ, as
explained above. Taking T to be the diagonal torus of SU(3), α1 and α2 ∈
X ∗ (T ) are the roots
α1 (t) = t1 t2−1 ,   α2 (t) = t2 t3−1 ,   t = diag(t1 , t2 , t3 ) ∈ T.

The corresponding eigenspaces are spanned by


      ⎛ 0 1 0 ⎞               ⎛ 0 0 0 ⎞
E12 = ⎜ 0 0 0 ⎟ ∈ Xα1 ,  E23 = ⎜ 0 0 1 ⎟ ∈ Xα2 .
      ⎝ 0 0 0 ⎠               ⎝ 0 0 0 ⎠

The fundamental dominant weights ϖ1 and ϖ2 are, respectively, ϖ1 (t) = t1
and ϖ2 (t) = t3−1 . Let v0 = (1, 1, 1), so in our previous notation Δ = Zv0 . The
vector space V is R3 /Rv0 , but we may identify this with the codimension one
vector subspace of R3 consisting of (x1 , x2 , x3 ) with ∑ xi = 0. The funda-
mental weights are represented by the cosets in Z3 /Zv0 of the vectors (1, 0, 0)
and (1, 1, 0), or in the subspace of codimension one in R3 consisting of vectors
(x1 , x2 , x3 ) satisfying ∑i xi = 0 by (2/3, −1/3, −1/3) and (1/3, 1/3, −2/3), respectively.
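The eigenvector computation behind this example can be checked numerically (an added sketch): for a diagonal H = diag(h1 , h2 , h3 ) in the complexified Lie algebra of T , the bracket [H, Eij ] = (hi − hj )Eij , so Eij affords the root ei − ej :

```python
import numpy as np

h = np.array([0.3, -0.1, -0.2])      # arbitrary trace-zero diagonal entries
H = np.diag(h)

for i in range(3):
    for j in range(3):
        if i == j:
            continue
        E = np.zeros((3, 3))
        E[i, j] = 1
        # [H, E_ij] = (h_i - h_j) E_ij: E_ij spans the (e_i - e_j) root space.
        assert np.allclose(H @ E - E @ H, (h[i] - h[j]) * E)
print("each E_ij affords the root e_i - e_j")
```
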

[Figure: the roots α1 , α2 , the Weyl vector ρ, the fundamental weight ϖ1 , and the positive Weyl chamber C+ ]
Fig. 19.1. The root system of type A2

Figure 19.1 shows the root system of type A2 associated with the Lie group
SU(3). The shaded region in Fig. 19.1 is the positive Weyl chamber C+ , which
consists of {x ∈ V | ⟨x, α⟩ ≥ 0 for all α ∈ Φ+ }. It is a fundamental domain for
the Weyl group.
A role will also be played by a partial order on V. We define x ⪰ y
if x − y ⪰ 0, where x ⪰ 0 if x is a linear combination, with nonnegative
coefficients, of the elements of Σ. The shaded region in Fig. 19.2 is the set of
x such that x ⪰ 0 for the root system of type A2 .
Next we turn to the remaining classical root systems. The root system of
type Bn is associated with the odd orthogonal group SO(2n + 1) or with its
double cover Spin(2n + 1). The root system of type Cn is associated with the
symplectic group Sp(2n). Finally, the root system of type Dn is associated
with the even orthogonal group SO(2n) or its double cover Spin(2n). We will
now describe these root systems. Let ei = (0, . . . , 0, 1, 0, . . . , 0) be the standard
basis of Rn .

[Figure: the region {x ⪰ 0}, with the roots α1 , α2 , the weight ϖ2 , and ρ marked]
Fig. 19.2. The partial order

The root system of type Bn can be embedded in Rn . The roots are not all
of the same length. There are 2n short roots

±ei (1 ≤ i ≤ n)

and 2(n2 − n) long roots

±ei ± ej (i < j).

The simple positive roots are

α1 = e1 − e2 , α2 = e2 − e3 , ... αn−1 = en−1 − en , αn = en .
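The counts above, and the fact that every root is an integer combination of these simple roots with coefficients all of one sign, can be verified by brute force (an added sketch for n = 4; the choice of n is arbitrary):

```python
import itertools
import numpy as np

n = 4
e = np.eye(n, dtype=int)                 # standard basis vectors e_1, ..., e_n

short = [s * e[i] for i in range(n) for s in (1, -1)]
long_ = [s * e[i] + u * e[j]
         for i, j in itertools.combinations(range(n), 2)
         for s in (1, -1) for u in (1, -1)]
roots = short + long_
assert len(short) == 2 * n and len(long_) == 2 * (n * n - n)

# Simple roots of B_n: e1 - e2, ..., e_{n-1} - e_n, e_n.
simple = [e[i] - e[i + 1] for i in range(n - 1)] + [e[n - 1]]
A = np.array(simple, dtype=float).T      # change of basis to simple roots
for r in roots:
    c = np.linalg.solve(A, r.astype(float))
    assert np.allclose(c, np.round(c))                   # integer coefficients
    assert np.all(c >= -1e-9) or np.all(c <= 1e-9)       # one sign throughout
print(f"B_{n}: {len(roots)} = 2n^2 roots")
```
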

To see that this is the root system of SO(2n + 1), it is most convenient to use
the representation of SO(2n + 1) in Exercise 5.3. Thus, we replace the usual
realization of SO(2n + 1) as a group of real matrices by the subgroup of all
g ∈ U(2n + 1) that satisfy g J ᵗg = J, where
    ⎛        1 ⎞
J = ⎜    ⋰    ⎟ .
    ⎝ 1        ⎠

A maximal torus consists of all diagonal elements, which have the form (when
n = 4, for example)
t = diag(t1 , t2 , t3 , t4 , 1, t4−1 , t3−1 , t2−1 , t1−1 ).

The Lie algebra g consists of all skew-Hermitian matrices X satisfying


X J + J ᵗX = 0. Now we claim that the complexification of g consists of
all complex matrices satisfying X J + J ᵗX = 0. Indeed, by Proposition 11.4,
any complex matrix X can be written uniquely as X1 + iX2 with X1 and X2
skew-Hermitian, and it is easy to see that X J + J ᵗX = 0 if and only if X1 and
X2 satisfy the same identity. Thus, g ⊕ ig = {X ∈ gl(2n + 1, C) | X J + J ᵗX = 0}.
It now follows from Proposition 11.3(iii) that this is the complexification of g.
This Lie algebra is shown in Fig. 19.3 when n = 4.

t1 x12 x13 x14 x15 x16 x17 x18 0

x21 t2 x23 x24 x25 x26 x27 0 −x18

x31 x32 t3 x34 x35 x36 0 −x27 −x17

x41 x42 x43 t4 x45 0 −x36 −x26 −x16

x51 x52 x53 x54 0 −x45 −x35 −x25 −x15

x61 x62 x63 0 −x54 −t4 −x34 −x24 −x14

x71 x72 0 −x63 −x53 −x43 −t3 −x23 −x13

x81 0 −x72 −x62 −x52 −x42 −x32 −t2 −x12

0 −x81 −x71 −x61 −x51 −x41 −x31 −x21 −t1

Fig. 19.3. The Lie algebra so(9). The Dynkin diagram, which will be explained in
Chap. 25, has been superimposed on top of the Lie algebra

We order the roots so the root spaces Xα with α ∈ Φ+ are upper triangular.
In particular, the simple roots are α1 (t) = t1 t2−1 , acting on Xα1 , the space of
matrices in which all entries are zero except x12 ; α2 (t) = t2 t3−1 , with root space
corresponding to x23 ; α3 (t) = t3 t4−1 , corresponding to x34 ; and α4 (t) = t4 ,
corresponding to x45 . We have circled these positions. Note, however, that (for
example) x12 appears in a second place which has not been circled. The lines
connecting the circles, one of them double, map out the Dynkin diagram,
which will be explained in greater detail in Chap. 25. Briefly, the Dynkin diagram
is a graph whose vertices correspond to the simple roots; simple roots are
connected in the Dynkin diagram if they are not perpendicular. We have
drawn the nodes corresponding to each simple root on top of the variable xij
for the corresponding eigenspace.
We have drawn a double bond with an arrow pointing from a long root
to a short root, which is the convention when two adjacent roots have
different lengths.

If we take ei ∈ X ∗ (T ) to be the character ei (t) = ti , then it is clear that


the root system consists of the 2n2 roots ±ei and ±ei ± ej (i < j), as claimed.
The root system of type Cn is similar, but the long and short roots are
reversed. Now there are 2n long roots

±2ei (1 ≤ i ≤ n)

and 2(n2 − n) short roots

±ei ± ej (i < j).

The simple positive roots are

α1 = e1 − e2 , α2 = e2 − e3 , ... αn−1 = en−1 − en , αn = 2en .

We leave it to the reader to show that Cn is the root system of Sp(2n) in


Exercise 19.2. (Fig. 30.15 may help with this.)
The root system of type Dn consists of just the long roots in the root
system of type Bn . There are 2(n2 − n) roots, all of the same length:

±ei ± ej (i < j).

The simple positive roots are

α1 = e1 − e2 , α2 = e2 − e3 , ... αn−1 = en−1 − en , αn = en−1 + en .

To see that Dn is the root system of SO(2n), one may again use the realiza-
tion of Exercise 5.3. We leave this verification to the reader in Exercise 19.2.
(Fig. 30.1 may help with this.)

[Figure: the roots α1 , α2 and the weight ϖ2 , with the positive Weyl chamber shaded]
Fig. 19.4. The root system of type C2 , which coincides with type B2

It happens that Spin(5) ≅ Sp(4), so the root systems of types B2 and
C2 coincide. These are shown in Fig. 19.4. The shaded region is the positive

Weyl chamber. (We have labeled the roots so that the order coincides with
the root system C2 in the notations of Bourbaki [23], in the appendix at the
back of the book. For type B2 , the roots α1 and α2 would be switched.)
There is a nonreduced root system whose type is called BCn . The root
system of type BCn can be realized as all elements of the form

±ei ± ej (i < j), ±ei , ±2ei ,

where ei are standard basis vectors of Rn . Nonreduced root systems do not


occur as root systems of compact Lie groups, but they occur as relative root
systems (Chap. 29). The root system of type BC2 may be found in Fig. 19.5.

Fig. 19.5. The nonreduced root system BC2

In addition to the infinite families of Lie groups in the Cartan classification


are five exceptional groups, of types G2 , F4 , E6 , E7 and E8 . The root system
of type G2 is shown in Fig. 19.6.
In addition to the three root systems we have just considered there is
another rank two reduced root system. This is called A1 × A1 , and it is
illustrated in Fig. 19.7. Unlike the others listed here, this one is reducible.
If V = V1 ⊕ V2 (orthogonal direct sum), and if Φ1 and Φ2 are root systems
in V1 and V2 , then Φ = Φ1 ∪ Φ2 is a root system in V such that every root
in Φ1 is orthogonal to every root in Φ2 . The root system Φ is reducible if it
decomposes in this way.
We leave two other rank 2 root systems, which are neither reduced nor
irreducible, to the imagination of the reader. Their types are A1 × BC1 and
BC1 × BC1 .

Fig. 19.6. The root system of type G2

Fig. 19.7. The reducible root system A1 × A1

Exercises
Exercise 19.1. Show that any irreducible rank 2 root system is isomorphic to one
of those described in this chapter, of type A2 , B2 , G2 or BC2 .

Exercise 19.2. Verify, as we did for type SO(2n + 1), that the root system of the
Lie group SO(2n) is of type Dn and that the root system of Sp(2n) is of type Cn .

Exercise 19.3. Show that the root systems of types B2 and C2 are isomorphic.
Exercise 19.4. Show that the root system of SO(6) is isomorphic to that of SU(4).
What can you say about the root system of SO(4)?
Exercise 19.5. Suppose that G is a compact Lie group with root system Φ, and
that H is a Lie subgroup of G having the same maximal torus. Show that every root
of H is a root of G, and that if Φ′ ⊆ Φ is the root system of H, then

If α, β ∈ Φ′ and α + β ∈ Φ then α + β ∈ Φ′. (19.2)
Exercise 19.6. Conversely, let G be a compact Lie group with root system Φ. Let
Φ′ ⊆ Φ be a root system such that (19.2) is satisfied. Show that in the notation of
Chap. 18, tC and the Xα (α ∈ Φ′) span a complex Lie algebra h′C, and that h′ = h′C ∩ g
is a Lie subalgebra of g.
Exercise 19.7. Let Φ be the root system of type G2 .
(i) Show that the long roots form a root system Φ′ satisfying (19.2).
(ii) Assume the following fact: there exists a simply connected compact Lie group G
whose root system is Φ. This Lie group G2 may be constructed as the group of
automorphisms of the octonions (Jacobson [87]). Prove that there exists a nontrivial
homomorphism SU(3) → G (known to be injective). (Hint: Use Exercise 19.6 and
Theorem 14.2.)
(iii) Exhibit another root system in Φ of rank two satisfying (19.2). Note that you
cannot use the short roots for this.
It may be shown that the root systems Φ′ in (i) and (iii) of the last exercise
correspond to Lie groups [SU(3) for part (i)] that may be embedded in the exceptional
group G2 .
Exercise 19.8. Let ei (i = 1, 2, 3, 4) be the standard basis elements of R4 . Show
that the 48 vectors

±ei (1 ≤ i ≤ 4), ±ei ± ej (1 ≤ i < j ≤ 4), (1/2)(±e1 ± e2 ± e3 ± e4),

form a root system. This is the root system of Cartan type F4 . Compute the order of
the Weyl group. Show that this root system contains smaller root systems of types
B3 and C3 .
Exercise 19.9. Let Φ8 consist of the following vectors in R8 . First, the 112 vectors

±ei ± ej (1 ≤ i < j ≤ 8). (19.3)

Second, the 128 vectors

(1/2)(±e1 ± e2 ± e3 ± e4 ± e5 ± e6 ± e7 ± e8), (19.4)

where the number of − signs is even. We will refer to the vectors (19.3) as integral
roots and the vectors (19.4) as half-integral roots. Prove that Φ8 is a root system.
This is the exceptional root system of type E8 . Note that the integral roots form a
root system of type D8 .
Hint: To show that if α and β are roots then sα(β) ∈ Φ8, observe that the D8 Weyl
group permutes the roots, and using this action we may assume that α = e1 + e2
or α = (1/2)Σei. The first case is easy, so assume α = (1/2)Σei. We may then use
the action of the symmetric group on β, and there are only a few cases to check.
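The closure property the hint is aimed at can also be checked by brute force. In this sketch (my own, with every root scaled by 2 so that all coordinates are integers), reflect computes sα(β), and the final assertion verifies sα(β) ∈ Φ8 for every pair of roots:

```python
from itertools import combinations, product

def e8_roots():
    """The vectors of Exercise 19.9, scaled by 2 to integer coordinates."""
    roots = set()
    for i, j in combinations(range(8), 2):        # 112 integral roots
        for si, sj in product((2, -2), repeat=2):
            v = [0] * 8
            v[i], v[j] = si, sj
            roots.add(tuple(v))
    for signs in product((1, -1), repeat=8):      # 128 half-integral roots
        if signs.count(-1) % 2 == 0:
            roots.add(signs)
    return roots

def reflect(alpha, beta):
    """s_alpha(beta) = beta - (2<beta,alpha>/<alpha,alpha>) alpha."""
    num = 2 * sum(a * b for a, b in zip(alpha, beta))
    den = sum(a * a for a in alpha)
    c = num // den    # exact here: every <alpha, beta> is an integer
    return tuple(b - c * a for a, b in zip(alpha, beta))

Phi8 = e8_roots()
print(len(Phi8))  # 240
assert all(reflect(a, b) in Phi8 for a in Phi8 for b in Phi8)
```

The scaling does not affect the reflection formula, since sα is unchanged when α is replaced by a positive multiple.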

Exercise 19.10. Let Φ7 consist of the vectors in Φ8 that are orthogonal to e7 + e8 .


Show that Φ7 is a root system containing 126 roots. This is the exceptional root
system of type E7 .

Exercise 19.11. Let Φ6 consist of the vectors in Φ7 that are orthogonal to e6 − e7 .


Show that Φ6 is a root system containing 72 roots. This is the exceptional root
system of type E6 .
20
Abstract Weyl Groups

In this chapter, we will associate a Weyl group with an abstract root system,
and develop some of its properties.
Let V be a Euclidean space and Φ ⊂ V a reduced root system. (At the end
of the chapter we will remove the assumption that Φ is reduced, but many of
the results of this chapter are false without it.)
Since Φ is a finite set of nonzero vectors, we may choose ρ0 ∈ V such that
⟨α, ρ0⟩ ≠ 0 for all α ∈ Φ. Let Φ+ be the set of roots α such that ⟨α, ρ0⟩ > 0.
This consists of exactly half the roots since evidently a root α ∈ Φ+ if and
only if −α ∉ Φ+. Elements of Φ+ are called positive roots. Elements of the set
Φ− = Φ − Φ+ are called negative roots.
If α, β ∈ Φ+ and α + β ∈ Φ, then evidently α + β ∈ Φ+ . Let Σ be the set
of elements in Φ+ that cannot be expressed as a sum of other elements of Φ+ .
If α ∈ Σ, then we call α a simple positive root, or sometimes just a simple
root and we call sα defined by (18.1) a simple reflection.
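The construction just described is easy to carry out by machine. The sketch below (my own; ρ0 = (2, 1) is an arbitrary generic choice) does it for the type B2 roots: split Φ into positive and negative roots using ρ0, then extract the simple roots as the positive roots that are not sums of other positive roots. For this small example it suffices to test sums of pairs.

```python
from itertools import combinations_with_replacement

Phi = [(1, 0), (-1, 0), (0, 1), (0, -1),
       (1, 1), (-1, -1), (1, -1), (-1, 1)]      # the B2 roots
rho0 = (2, 1)                                   # generic: <alpha, rho0> != 0

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

Phi_plus = [a for a in Phi if dot(a, rho0) > 0]
pair_sums = {tuple(x + y for x, y in zip(b, c))
             for b, c in combinations_with_replacement(Phi_plus, 2)}
Sigma = [a for a in Phi_plus if a not in pair_sums]
print(sorted(Phi_plus))  # [(0, 1), (1, -1), (1, 0), (1, 1)]
print(sorted(Sigma))     # [(0, 1), (1, -1)]: alpha2 = e2, alpha1 = e1 - e2
```

The two simple roots recovered are e1 − e2 and e2, matching the B2 data given in Chap. 19.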

Proposition 20.1.
(i) The elements of Σ are linearly independent.
(ii) If α ∈ Σ and β ∈ Φ+, then either β = α or sα(β) ∈ Φ+.
(iii) If α and β are distinct elements of Σ, then ⟨α, β⟩ ≤ 0.
(iv) Each element α ∈ Φ can be expressed uniquely as a linear combination

α = Σ_{β∈Σ} nβ · β

in which each nβ ∈ Z and either all nβ ≥ 0 (if α ∈ Φ+) or all nβ ≤ 0
(if α ∈ Φ−).

Proof. Let Σ′ be a subset of Φ+ that is minimal with respect to the property
that every element of Φ+ is a linear combination with nonnegative coefficients
of elements of Σ′. (Subsets with this property clearly exist, e.g., Φ+ itself.)
We will eventually show that Σ′ = Σ.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 157


DOI 10.1007/978-1-4614-8024-2 20, © Springer Science+Business Media New York 2013

First, we show that if α ∈ Σ′ and β ∈ Φ+, then either β = α or sα(β) ∈ Φ+.
Otherwise −sα(β) ∈ Φ+, and

α∨(β) α = β + (−sα(β))

is a sum of the two positive roots β and −sα(β). Therefore, we have

β = Σ_{γ∈Σ′} nγ · γ,   −sα(β) = Σ_{γ∈Σ′} n′γ · γ,

where nγ, n′γ ≥ 0, and so

α∨(β) α = Σ_{γ∈Σ′} n″γ · γ,   n″γ ≥ 0,

where n″γ = nγ + n′γ. There exists some γ ≠ α in Σ′ such that n″γ ≠ 0, because β is
not α, and (the root system being reduced) it follows that β is not a multiple
of α. Therefore,

(α∨(β) − n″α) α = Σ_{γ∈Σ′, γ≠α} n″γ · γ,

and the right-hand side is not zero. Taking the inner product with ρ0 shows
that the coefficient on the left-hand side is strictly positive; dividing by
this positive constant, we see that α may be expressed as a linear combination
of the elements γ ∈ Σ′ distinct from α, and so α may be omitted
from Σ′, contradicting its assumed minimality. This contradiction shows that
sα(β) ∈ Φ+.
Next we show that if α and β are distinct elements of Σ′, then ⟨α, β⟩ ≤ 0.
We have already shown that sα(β) ∈ Φ+. If ⟨α, β⟩ > 0, then by (18.2) we have
α∨(β) > 0. Write

β = sα(β) + α∨(β) α. (20.1)

Writing sα(β) as a linear combination with nonnegative coefficients of the
elements of Σ′, and noting that the coefficient of α on the right-hand side
of (20.1) is strictly positive, we may write

β = Σ_{γ∈Σ′} nγ · γ,

where nα > 0. We rewrite this as

(1 − nβ) · β = Σ_{γ∈Σ′, γ≠β} nγ · γ.

At least one coefficient, namely nα > 0, remains on the right, so taking the inner product with
ρ0 we see that 1 − nβ > 0. Thus, β is a linear combination with nonnegative

coefficients of other elements of Σ′ and hence may be omitted, contradicting
the minimality of Σ′.
Now let us show that the elements of Σ′ are R-linearly independent. In a
relation of algebraic dependence, we move all the negative coefficients to the
other side of the identity and obtain a relation of the form

Σ_{α∈Σ1} cα · α = Σ_{β∈Σ2} dβ · β, (20.2)

where Σ1 and Σ2 are disjoint subsets of Σ′ and the coefficients cα, dβ are all
positive. Call this vector v. We have

⟨v, v⟩ = Σ_{α∈Σ1, β∈Σ2} cα dβ ⟨α, β⟩ ≤ 0

since we have already shown that the inner products ⟨α, β⟩ ≤ 0. Therefore,
v = 0. Now taking the inner product of the left-hand side in (20.2) with
ρ0 gives

0 = Σ_{α∈Σ1} cα ⟨α, ρ0⟩.

Since ⟨α, ρ0⟩ > 0 and cα > 0, this is a contradiction. This proves the linear
independence of the elements of Σ′.
Next let us show that every element of Φ+ may be expressed as a linear
combination of elements of Σ′ with integer coefficients. We define a function
h from Φ+ to the positive real numbers as follows. If α ∈ Φ+ we may write

α = Σ_{β∈Σ′} nβ · β,   nβ ≥ 0.

The coefficients nβ are uniquely determined since the elements of Σ′ are
linearly independent. We define

h(α) = Σ_β nβ. (20.3)

Evidently h(α) > 0. We want to show that the coefficients nβ are integers.
Assume a counterexample with h(α) minimal. Evidently, α ∉ Σ′ since if
α ∈ Σ′, then nα = 1 while all other nβ = 0, so such an α has all nβ ∈ Z.
Since

0 < ⟨α, α⟩ = Σ_{β∈Σ′} nβ ⟨α, β⟩, (20.4)

it is impossible that ⟨α, β⟩ ≤ 0 for all β ∈ Σ′. Thus, there exists γ ∈ Σ′ such
that ⟨α, γ⟩ > 0. Then by what we have already proved, α′ = sγ(α) ∈ Φ+, and
by (18.1) we see that

α′ = Σ_{β∈Σ′} n′β · β,

where

n′β = nβ if β ≠ γ,   n′γ = nγ − γ∨(α).

Since ⟨γ, α⟩ > 0, we have

h(α′) < h(α),

so by induction we have n′β ∈ Z. Since Φ is a root system, γ∨(α) ∈ Z, so
nβ ∈ Z for all β ∈ Σ′. This is a contradiction.
Finally, let us show that Σ = Σ′.
If α ∈ Σ, then by definition of Σ, α cannot be expressed as a linear
combination with integer coefficients of other elements of Φ+. Hence α cannot
be omitted from Σ′. Thus, Σ ⊂ Σ′.
On the other hand, if α ∈ Σ′, then we claim that α ∈ Σ. Otherwise, we
may write α = β + γ with β, γ ∈ Φ+, and β and γ may both be written as
linear combinations of elements of Σ′ with nonnegative integer coefficients, and
thus h(β), h(γ) ≥ 1, so h(α) = h(β) + h(γ) > 1. But evidently h(α) = 1 since
α ∈ Σ′. This contradiction shows that Σ′ ⊂ Σ.


Let W be the group generated by the simple reflections sα with α ∈ Σ. If
w ∈ W, let the length l(w) be defined to be the smallest k such that w admits
a factorization w = s1 · · · sk into simple reflections, or l(w) = 0 if w = 1. Let
l′(w) be the number of α ∈ Φ+ such that w(α) ∈ Φ−. We will eventually show
that the functions l and l′ are the same.

Proposition 20.2. Let s = sα (α ∈ Σ) be a simple reflection, and let w ∈ W.
We have

l′(sw) = l′(w) + 1 if w−1(α) ∈ Φ+,   l′(sw) = l′(w) − 1 if w−1(α) ∈ Φ−, (20.5)

and

l′(ws) = l′(w) + 1 if w(α) ∈ Φ+,   l′(ws) = l′(w) − 1 if w(α) ∈ Φ−. (20.6)

Proof. Since s(Φ−) is obtained from Φ− by deleting −α and adding α,
we see that (sw)−1Φ− = w−1(sΦ−) is obtained from w−1Φ− by deleting
−w−1(α) and adding w−1(α). Since l′(w) is the cardinality of Φ+ ∩ w−1Φ−,
we obtain (20.5). To prove (20.6), we note that l′(ws) is the cardinality
of Φ+ ∩ (ws)−1Φ−, which equals the cardinality of s(Φ+ ∩ (ws)−1Φ−) =
sΦ+ ∩ w−1Φ−, and since sΦ+ is obtained from Φ+ by deleting the element α
and adjoining −α, (20.6) is evident.


If w is any orthogonal linear endomorphism of V, then evidently wsα w−1 is


the reflection in the hyperplane perpendicular to w(α), so

wsα w−1 = sw(α) . (20.7)


20 Abstract Weyl Groups 161

Proposition 20.3. Suppose that α1, . . . , αk and α are elements of Σ, and let
si = sαi. Suppose that

s1 s2 · · · sk(α) ∈ Φ−.

Then there exists a 1 ≤ j ≤ k such that

s1 s2 · · · sk = s1 s2 · · · ŝj · · · sk sα, (20.8)

where the “hat” on the right signifies the omission of the single element sj.

Proof. Let 1 ≤ j ≤ k be minimal such that sj+1 · · · sk(α) ∈ Φ+. Then
sj sj+1 · · · sk(α) ∈ Φ−. Since αj is the unique element of Φ+ mapped into
Φ− by sj, we have
Φ− by sj , we have
sj+1 · · · sk (α) = αj ,
and by (20.7) we have

(sj+1 · · · sk )sα (sj+1 · · · sk )−1 = sj

or
sj+1 · · · sk sα = sj sj+1 · · · sk .
This implies (20.8).


Proposition 20.4. Suppose that α1, . . . , αk are elements of Σ, and let si =
sαi. Suppose that l′(s1 s2 · · · sk) < k. Then there exist 1 ≤ i < j ≤ k such that

s1 s2 · · · sk = s1 s2 · · · ŝi · · · ŝj · · · sk, (20.9)

where the “hats” on the right signify omission of the elements si and sj.

Proof. Evidently there is a first j such that l′(s1 s2 · · · sj) < j, and [since
l′(s1) = 1] we have j > 1. Then l′(s1 s2 · · · sj−1) = j − 1, and by Propo-
sition 20.2 we have s1 s2 · · · sj−1(αj) ∈ Φ−. The existence of i satisfying
s1 · · · sj−1 = s1 · · · ŝi · · · sj−1 sj now follows from Proposition 20.3, which
implies (20.9).


Proposition 20.5. If w ∈ W, then l(w) = l′(w).

Proof. The inequality

l′(w) ≤ l(w)

follows from Proposition 20.2 because we may write w = sw1, where s is a
simple reflection and l(w1) = l(w) − 1, and by induction on l(w1) we may
assume that l′(w1) ≤ l(w1), so l′(w) ≤ l′(w1) + 1 ≤ l(w1) + 1 = l(w).
Let us show that

l′(w) ≥ l(w).

Indeed, let w = s1 · · · sk be a counterexample with l(w) = k, where each
si = sαi with αi ∈ Σ. Thus, l′(s1 · · · sk) < k. Then, by Proposition 20.4 there
exist i and j such that
exist i and j such that

w = s1 s2 · · · ŝi · · · ŝj · · · sk .

This expression for w as a product of k − 2 simple reflections contradicts our


assumption that l(w) = k.
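Proposition 20.5 can be tested numerically. The sketch below (my own illustration, in type B2) builds the Weyl group by breadth-first search, so that l(w) is the graph distance from the identity in the Cayley graph of the simple reflections, and compares it with l′(w), the number of positive roots sent to negative roots:

```python
from collections import deque

Phi_plus = [(1, 0), (0, 1), (1, 1), (1, -1)]   # positive roots of B2
simple = [(1, -1), (0, 1)]                     # alpha1, alpha2

def reflect(a, v):
    """s_a(v) = v - (2<v,a>/<a,a>) a; exact in integers for these roots."""
    c = 2 * (v[0] * a[0] + v[1] * a[1]) // (a[0] * a[0] + a[1] * a[1])
    return (v[0] - c * a[0], v[1] - c * a[1])

def act(w, v):
    """Apply w, stored as the pair (w(e1), w(e2)), to the vector v."""
    return (v[0] * w[0][0] + v[1] * w[1][0], v[0] * w[0][1] + v[1] * w[1][1])

identity = ((1, 0), (0, 1))
length = {identity: 0}          # l(w) = distance in the Cayley graph
queue = deque([identity])
while queue:
    w = queue.popleft()
    for a in simple:
        sw = (reflect(a, w[0]), reflect(a, w[1]))   # s_a composed with w
        if sw not in length:
            length[sw] = length[w] + 1
            queue.append(sw)

def l_prime(w):
    """The number of positive roots that w sends to negative roots."""
    return sum(act(w, b) not in Phi_plus for b in Phi_plus)

assert len(length) == 8                             # |W(B2)| = 8
assert all(length[w] == l_prime(w) for w in length)
print("l(w) = l'(w) for all 8 elements of W(B2)")
```

The longest element here is −1, with length 4, the number of positive roots.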


Proposition 20.6. If w(Φ+ ) = Φ+ , then w = 1.

Proof. If w(Φ+) = Φ+, then l′(w) = 0, so l(w) = 0, that is, w = 1.




Proposition 20.7. If α ∈ Φ, there exists an element w ∈ W such that


w(α) ∈ Σ.

Proof. First, assume that α ∈ Φ+. We will argue by induction on h(α), which
is defined by (20.3). In view of Proposition 20.1(iv), we know that h(α) is a
positive integer, and if α ∉ Σ (which we may as well assume), then h(α) > 1.
As in the proof of Proposition 20.1, (20.4) implies that ⟨α, β⟩ > 0 for some
β ∈ Σ, and then with α′ = sβ(α) we have h(α′) < h(α). On the other hand,
α′ ∈ Φ+ since α ≠ β by Proposition 20.1(ii). By our inductive hypothesis,
w′(α′) ∈ Σ for some w′ ∈ W. Then w(α) = w′(α′) with w = w′sβ ∈ W. This
shows that if α ∈ Φ+, then there exists w ∈ W such that w(α) ∈ Σ.
If, on the other hand, α ∈ Φ−, then −α ∈ Φ+ so we may find w1 ∈ W such
that w1(−α) ∈ Σ. Letting w1(−α) = β we have w(α) = β with w = sβ w1.
In both cases, w(α) ∈ Σ for some w ∈ W.


Proposition 20.8. The group W contains sα for each α ∈ Φ.

Proof. Indeed, w(α) ∈ Σ for some w ∈ W , so sw(α) ∈ W and sα is conjugate


in W to sw(α) by (20.7). Therefore, sα ∈ W .


Proposition 20.9. The group W is finite.

Proof. By Proposition 20.6, w ∈ W is determined by w(Φ+ ) ⊂ Φ. Since Φ is


finite, W is finite. 

Proposition 20.10. Suppose that w ∈ W such that l(w) = k. Write w =


s1 · · · sk , where si = sαi , α1 , . . . , αk ∈ Σ. Then

{α ∈ Φ+ |w(α) ∈ Φ− } = {αk , sk (αk−1 ), sk sk−1 (αk−2 ), . . . , sk sk−1 · · · s2 (α1 )}.

Proof. By Proposition 20.5, the cardinality of {α ∈ Φ+ |w(α) ∈ Φ− } is k, so the


result will be established if we show that the described elements are distinct
and in the set. Let w = s1 w1 , where w1 = s2 · · · sk , so that l(w1 ) = l(w) − 1.
By induction, we have

{α ∈ Φ+ |w1 (α) ∈ Φ− } = {αk , sk (αk−1 ), sk sk−1 (αk−2 ), . . . , sk sk−1 · · · s3 (α2 )},

and the elements on the right are distinct. We claim that

{α ∈ Φ+ |w1 (α) ∈ Φ− } ⊂ {α ∈ Φ+ |s1 w1 (α) ∈ Φ− }. (20.10)



Otherwise, let α ∈ Φ+ such that w1 (α) ∈ Φ− , while s1 w1 (α) ∈ Φ+ . Let


β = −w1 (α). Then β ∈ Φ+ , while s1 (β) ∈ Φ− . By Proposition 20.1(ii), this
implies that β = α1 . Therefore, α = −w1−1 (α1 ). By Proposition 20.2, since
l(s1 w1 ) = k = l(w1 ) + 1, we have −α = w1−1 (α1 ) ∈ Φ+ . This contradiction
proves (20.10).
We will be done if we show that the last remaining element sk · · · s2 (α1 )
is in {α ∈ Φ+ |s1 w1 (α) ∈ Φ− } but not {α ∈ Φ+ |w1 (α) ∈ Φ− } since that will
guarantee that it is distinct from the other elements listed. This is clear since
if α = sk · · · s2(α1) we have w1(α) = α1 ∉ Φ−, while s1 w1(α) = −α1 ∈ Φ−.



A connected component of the complement of the union of the hyperplanes

{x ∈ V | ⟨x, α⟩ = 0} (α ∈ Φ)

is called an open Weyl chamber. The closure of an open Weyl chamber is
called a Weyl chamber. For example, C+ = {x ∈ V | ⟨x, α⟩ ≥ 0 for all α ∈ Σ}
is called the positive Weyl chamber. Since every element of Φ+ is a linear
combination of elements of Σ with nonnegative coefficients, C+ = {x ∈
V | ⟨x, α⟩ ≥ 0 for all α ∈ Φ+}. The interior

C+◦ = {x ∈ V | ⟨x, α⟩ > 0 for all α ∈ Σ} = {x ∈ V | ⟨x, α⟩ > 0 for all α ∈ Φ+}

is an open Weyl chamber.


If y ∈ V, let W (y) be the stabilizer {w ∈ W |w(y) = y}.

Proposition 20.11. Suppose that w ∈ W such that l(w) = k. Write w =
s1 · · · sk, where si = sαi, α1, . . . , αk ∈ Σ. Assume that x ∈ C+ such that
wx ∈ C+ also.
(i) We have ⟨x, αi⟩ = 0 for 1 ≤ i ≤ k.
(ii) Each si ∈ W(x).
(iii) We have w(x) = x.

Proof. If α ∈ Φ+ and wα ∈ Φ−, then we have ⟨x, α⟩ = 0. Indeed, ⟨x, α⟩ ≥ 0
since α ∈ Φ+ and x ∈ C+, and ⟨x, α⟩ = ⟨wx, wα⟩ ≤ 0 since wx ∈ C+ and
wα ∈ Φ−.
The elements of {α ∈ Φ+ | wα ∈ Φ−} are listed in Proposition 20.10.
Since αk is in this set, we have sk(x) = x − (2⟨x, αk⟩/⟨αk, αk⟩)αk = x.
Thus, sk ∈ W(x). Now since sk(αk−1) ∈ {α ∈ Φ+ | wα ∈ Φ−}, we have
0 = ⟨x, sk(αk−1)⟩ = ⟨sk(x), αk−1⟩ = ⟨x, αk−1⟩, which implies sk−1(x) =
x − (2⟨x, αk−1⟩/⟨αk−1, αk−1⟩)αk−1 = x. Proceeding in this way, we prove (i) and
(ii) simultaneously. Of course, (ii) implies (iii).


Theorem 20.1. The positive Weyl chamber C+ is a fundamental domain for
the action of W on V. More precisely, let x ∈ V.
(i) There exists w ∈ W such that w(x) ∈ C+.
(ii) If w, w′ ∈ W and w(x) ∈ C+◦, w′(x) ∈ C+◦, then w = w′.
(iii) If w, w′ ∈ W and w(x) ∈ C+, w′(x) ∈ C+, then w(x) = w′(x).
 

Proof. Let w ∈ W be chosen so that the cardinality of

S = {α ∈ Φ+ | ⟨w(x), α⟩ < 0}

is as small as possible. We claim that S is empty. If not, then there exists an
element β ∈ Σ ∩ S. We have ⟨w(x), −β⟩ > 0, and since sβ preserves Φ+
except for β, which it maps to −β, the set

S′ = {α ∈ Φ+ | ⟨w(x), sβ(α)⟩ < 0}

is smaller than S by one. Since S′ = {α ∈ Φ+ | ⟨sβ w(x), α⟩ < 0}, this contra-
dicts the minimality of |S|. Clearly, w(x) ∈ C+. This proves (i).
We prove (ii). We may assume that w = 1, so x ∈ C+◦. Since ⟨x, α⟩ > 0 for
all α ∈ Φ+, we have Φ+ = {α ∈ Φ | ⟨x, α⟩ > 0} = {α ∈ Φ | ⟨x, α⟩ ≥ 0}. Since
w′(x) ∈ C+, if α ∈ Φ+, we have ⟨w′−1(α), x⟩ = ⟨α, w′(x)⟩ ≥ 0 so w′−1(α) ∈ Φ+.
By Proposition 20.6, this implies that w′−1 = 1, whence (ii).
Part (iii) follows from Proposition 20.11(iii).
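The proof of part (i) is effectively an algorithm: while some simple root has negative inner product with the current vector, apply the corresponding simple reflection. A sketch in type B2 (the function name is mine):

```python
def to_positive_chamber(x):
    """Repeatedly reflect x in a simple root with <x, alpha> < 0 until x
    is dominant; returns the unique C+ representative of the W-orbit."""
    simple = [(1, -1), (0, 1)]          # alpha1 = e1 - e2, alpha2 = e2
    while True:
        for a in simple:
            d = x[0] * a[0] + x[1] * a[1]
            if d < 0:
                c = 2 * d // (a[0] * a[0] + a[1] * a[1])  # exact for these roots
                x = (x[0] - c * a[0], x[1] - c * a[1])
                break
        else:
            return x

print(to_positive_chamber((-3, 5)))  # (5, 3): the chamber is x1 >= x2 >= 0
```

Each reflection strictly increases the inner product with an interior point of C+, so the loop terminates; the W-orbit of a vector under W(B2) consists of its images under signed permutations of the coordinates, and the dominant representative of (−3, 5) is (5, 3).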


Proposition 20.12. The function w −→ (−1)l(w) ∈ {±1} is a character of
W. If α ∈ Φ, then (−1)l(sα) = −1.

Proof. If l(w) = k and l(w′) = k′, write w = s1 · · · sk and w′ = s′1 · · · s′k′ as
products of simple reflections. It follows from Proposition 20.4 that we may
obtain a decomposition of ww′ into a product of simple reflections of minimal
length from ww′ = s1 · · · sk s′1 · · · s′k′ by discarding elements in pairs until the
result is reduced. Therefore, l(ww′) ≡ l(w) + l(w′) modulo 2, so w −→ (−1)l(w)
is a character. (One may argue alternatively by showing that (−1)l(w) is the
determinant of w in its action on V.)
If α ∈ Φ, then by Proposition 20.7 there exists w ∈ W such that w(α) ∈
Σ. By (20.7), we have wsαw−1 = sw(α), and l(sw(α)) = 1. It follows that
(−1)l(sα) = −1.


Proposition 20.13. Let w̃ be a linear transformation of V that maps Φ to


itself. Then there exists w ∈ W such that w̃(C+ ) = wC+ . The transformation
w−1 w̃ of V permutes the elements of Φ+ and of Σ.

It is possible that w−1 w̃ is not the identity. (See Exercise 25.2.)

Proof. It is sufficient to show that w−1w̃(C+◦) = C+◦. Let x ∈ C+◦. Since the open
Weyl chambers are defined to be the connected components of the complement
of the set of hyperplanes perpendicular to the roots, and since w̃ permutes the
roots, w̃(C+◦) is an open Weyl chamber. By Theorem 20.1 there is an element
w ∈ W such that w−1w̃(x) ∈ C+, and w−1w̃(x) must be in the interior C+◦ since
x lies in an open Weyl chamber, and these are permuted by W as well as by
w̃. Now w−1w̃(C+◦) and C+◦ are open Weyl chambers intersecting nontrivially,
so they are equal.
The positive roots are characterized by the condition that α ∈ Φ+ if and
only if ⟨α, x⟩ > 0 for x ∈ C+◦. It follows that w−1w̃ permutes the elements of
Φ+. Since Σ is determined by Φ+, its elements too are permuted by w−1w̃.
+

Proposition 20.14. If C is any Weyl chamber, then there is a unique element
w of W such that C = wC+. In particular, let w0 be the unique element such
that −C+ = w0C+. Then w0Φ+ = Φ− and w0 is the longest element of W.

The element w0 is often called the long element of the Weyl group.

Proof. It is clear that W permutes the Weyl chambers transitively. The
uniqueness of w ∈ W such that C = wC+ follows from Theorem 20.1.
Regarding w0, since

C+ = {x | ⟨α, x⟩ ≥ 0 for α ∈ Φ+},

the element w0 such that w0C+ = −C+ sends positive roots to negative roots.
Thus, its length equals the number of positive roots, and is maximal.


An important particular element of V is the Weyl vector

ρ = (1/2) Σ_{α∈Φ+} α.

Proposition 20.15. If α is a simple root, then

sα (ρ) = ρ − α, α ∈ Σ. (20.11)

Proof. This follows since sα changes the sign of α and permutes the remaining
positive roots.
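Equation (20.11) is easy to check numerically. A sketch in type B2, working with 2ρ so that every coordinate is an integer (a convenience of this sketch, not the book's notation):

```python
Phi_plus = [(1, 0), (0, 1), (1, 1), (1, -1)]   # positive roots of B2
two_rho = tuple(sum(a[i] for a in Phi_plus) for i in range(2))  # 2*rho = (3, 1)

for a in [(1, -1), (0, 1)]:                    # the simple roots
    aa = a[0] * a[0] + a[1] * a[1]
    c = 2 * (two_rho[0] * a[0] + two_rho[1] * a[1]) // aa   # exact integer
    image = (two_rho[0] - c * a[0], two_rho[1] - c * a[1])  # 2*s_a(rho)
    assert image == (two_rho[0] - 2 * a[0], two_rho[1] - 2 * a[1])  # = 2(rho - a)
print("s_alpha(rho) = rho - alpha for both simple roots of B2")
```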


Let there be given a lattice Λ contained in V that contains a basis of V. Then


V may be identified with R ⊗Z Λ. We will assume that α∨ (Λ) ⊆ Z, and that
every root α is in Λ. For example if Φ is the root system of a compact Lie group
G with maximal torus T as in Chap. 18, then by Proposition 18.10 we may
take Λ = X∗(T). Elements of Λ are to be called weights, and our assumptions
are satisfied by Proposition 18.10. A weight λ is called dominant if λ ∈ C+.
By Theorem 20.1, every weight is equivalent by the action of W to a unique
dominant weight.

Proposition 20.16. If λ ∈ Λ, then λ − w(λ) ∈ Λroot .

Proof. This is true if w is a simple reflection by (18.1). The general case
follows, since if w = s1 · · · sr, where the si are simple reflections, we may
write λ − w(λ) = (λ − sr(λ)) + (sr(λ) − sr−1(sr(λ))) + · · · .


Now let us assume that Φ spans V. This will be true if G is semisimple. Let Λ̃
be the set of vectors v such that α∨(v) ∈ Z for α∨ ∈ Φ∨. In the semisimple
case the α∨ span V∗, so Λ̃ is a lattice. We have Λ̃ ⊇ Λ ⊇ Λroot, and all three
lattices span V, so [Λ̃ : Λroot] < ∞. The α∨i are linearly independent, and in
the semisimple case they are a basis of V∗, so let ϖi be the dual basis of V. In
other words, these vectors are defined by α∨i(ϖj) = δij (Kronecker delta). The
ϖi are called the fundamental dominant weights. Strictly speaking, because
in our usage only elements of Λ will be called weights, the ϖi might not
be weights by our conventions. However, we will call them the fundamental
weights because this terminology is standard. Clearly the ϖi span Λ̃ as a
Z-module.

Proposition 20.17. In the semisimple case ρ = ϖ1 + · · · + ϖr, the sum of the
fundamental dominant weights. In particular, ρ is a dominant weight. It lies in C+◦.

Proof. Let α = αi ∈ Σ. By (20.11), we have α∨(ρ) α = ρ − sα(ρ) = α. Thus,
α∨i(ρ) = 1 for each αi ∈ Σ. It follows that ρ is the sum of the fundamental
dominant weights. Since ⟨ρ, αi⟩ > 0, ρ lies in the interior of C+.


Up until now we have assumed that Φ is a reduced root system, and much
of the foregoing is false without this assumption. In Chap. 18, and indeed
most of the book, the root systems are reduced, so this is enough for now. In
Chap. 29, however, we will encounter relative root systems, which may not be
reduced, so let us say a few words about them. If Φ ⊂ V is not reduced, then
we may still choose ρ0 as before and partition Φ into positive and negative roots. We
call a positive root simple if it cannot be expressed as a linear combination
(with nonnegative coefficients) of other positive roots.

Proposition 20.18. Let (Φ, V) be a root system that is not necessarily re-
duced. If α and λα ∈ Φ with λ > 0, then λ = 1, 2, or 1/2. Partition Φ into
positive and negative roots, and let Σ be the set of simple roots. The elements
of Σ are linearly independent. Any positive root may be expressed as a linear
combination of elements of Σ with nonnegative integer coefficients.

Proof. If α and β are proportional roots, say β = λα, then 2⟨β, α⟩/⟨α, α⟩ ∈ Z
implies that 2λ is an integer and, by symmetry, so is 2λ−1. The first assertion
is therefore clear. Let Ψ be the set of all roots that are not the double of
another root. Then it is clear that Ψ is another root system with the same
Weyl group as Φ. Let Ψ + = Φ+ ∩ Ψ . With our definitions, the set Σ of simple
positive roots of Ψ + is precisely the set of simple positive roots of Φ. They
are linearly independent by Proposition 20.1. If α ∈ Φ+ , we need to know
that α can be expressed as a linear combination, with integer coefficients, of
the elements of Σ. If α ∈ Ψ , this follows from Proposition 20.1, applied to Ψ .
Otherwise, α/2 ∈ Ψ , so α/2 is a linear combination of the elements of Σ with
integer coefficients, and therefore so is α.


Exercises
Exercise 20.1. Suppose that S is any subset of Φ such that if α ∈ Φ, then either
α ∈ S or −α ∈ S. Assume furthermore that if α, β ∈ S and if α + β ∈ Φ then
α + β ∈ S. Show that there exists w ∈ W such that w(S) ⊇ Φ+. If for every
α ∈ Φ either α ∈ S or −α ∈ S but never both, then w is unique.

Exercise 20.2. Generalize (20.11) by proving, for w ∈ W:

w(ρ) = ρ − Σ_{α∈Φ+, w−1(α)∈Φ−} α. (20.12)
21
Highest Weight Vectors

If G is a compact connected Lie group, we will show in Chap. 22 that its ir-
reducible representations are parametrized uniquely by their highest weight
vectors. In this chapter, we will explain what this means and give some illus-
trative examples. This chapter is to some extent a continuation of the example
Chap. 19. As in that chapter, we will make many assertions that will only be
proved in later chapters, mostly Chap. 22.
We return to the figures in Chap. 19 (which the reader should review). Let
T be a maximal torus in G, with Λ = X ∗ (T ) embedded as a lattice in the
Euclidean space V = R ⊗ X ∗ (T ). Let Λroot ⊆ Λ be the lattice spanned by the
roots.
If G is semisimple, then Λroot spans V and has finite index in Λ. In
this case, the coroots also span V∗, so we may ask for the dual basis of V. These
are elements called ϖi such that α∨i(ϖj) = δij. These are the fundamental
dominant weights. They are not necessarily in Λ, however: they are in Λ if
G is simply connected as well as semisimple. We will only call elements of
V weights if they are in Λ, so if G is not simply connected, the term “fundamental
dominant weight” is a misnomer. But if G is semisimple and simply connected,
the ϖi are uniquely defined and span the weight lattice Λ. The fundamental
dominant weights do not play a major role in the general theory but they give
a convenient parametrization of Λ when G is semisimple, since then every
dominant weight is of the form Σ niϖi with ni nonnegative integers. (This is true
even if G is not simply connected.) Since our examples will be semisimple, we
will make use of the fundamental dominant weights.
Our first example is G = SU(3). The lattice Λ and its sublattice Λroot
(of index 3) are marked in Fig. 21.1. The positive Weyl chamber C+ is the
shaded cone. It is a fundamental domain for the Weyl group W, acting by
simple reflections, which are the reflections in the two walls of C+. The weight
lattice Λ is marked with light dots and the root sublattice with darker ones. In
this case G is semisimple and simply connected, so the fundamental dominant
weights ϖ1 and ϖ2 are defined and span the weight lattice. The root lattice
is of index 3 in Λ.


Fig. 21.1. The weight and root lattices for SU(3)

Let (π, V) be an irreducible complex representation of G. Then the restriction
of π to T is a representation of T that will not be irreducible if
π is not one-dimensional (since the irreducible representations of T are one-
dimensional). It can be decomposed into a direct sum of one-dimensional
irreducible T-invariant subspaces corresponding to the characters of T. Some characters
may occur with multiplicity greater than one. If μ ∈ X∗(T), let m(μ)
be the multiplicity of μ in the decomposition of π over T. Thus, m(μ) is the
dimension of V(μ) = {v ∈ V | π(t)v = μ(t)v for t ∈ T}. If m(λ) ≠ 0, we say
that λ is a weight of the representation π.
For example, let G = SU(3), and let T be the diagonal torus. Let ϖ1, ϖ2 :
T −→ C× be the fundamental dominant weights, labeled as in Chap. 19. They
are the characters ϖ1(t) = t1 and ϖ2(t) = t1t2 = t3−1, where t1, t2, t3 are the
entries in the diagonal matrix t.

Fig. 21.2. Left: The standard representation; Right: its dual

The standard representation of SU(3) is just the usual embedding
SU(3) −→ GL(3, C). The three one-dimensional subspaces spanned by the
standard basis vectors of C3 afford the characters ϖ1, −ϖ1 + ϖ2, and −ϖ2.
These are the weights of the standard representation. Each occurs with
multiplicity one. On the other hand, the contragredient of the standard
representation is its composition with the transpose-inverse automorphism
of GL(3, C). The standard basis vectors in this dual representation afford the
characters −ϖ1, ϖ1 − ϖ2, and ϖ2.
In Fig. 21.2 (left), we have labeled the three weights in the standard repre-
sentation with their multiplicities. (For this example each multiplicity is one.)
In Fig. 21.2 (right), we have labeled the three weights of the dual of the stan-
dard representation. Such a diagram, illustrating the weights of an irreducible
representation, is called a weight diagram.
In each irreducible representation π, there is always a weight λ of π in the
positive Weyl chamber such that if μ is another weight of π then λ ≽ μ in the
partial order. This weight is called the highest weight of the representation.
We always have m(λ) = 1, so V(λ) is one-dimensional, and we call an element
of V(λ) a highest weight vector. We have circled the highest weight vectors of
the standard representation and its dual in Fig. 21.2.
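The assertion that λ ≽ μ for every weight μ can be made concrete for the standard representation. Identifying ϖ1 with e1 and ϖ2 with e1 + e2 as in Chap. 19, its weights become e1, e2, e3 (modulo the relation e1 + e2 + e3 = 0 on the diagonal torus of SU(3)), and each difference λ − μ is a nonnegative integer combination of the simple roots. A sketch verifying this:

```python
alpha1, alpha2 = (1, -1, 0), (0, 1, -1)   # simple roots of SU(3)
lam = (1, 0, 0)                           # highest weight e1 of the standard rep

def minus(u, v):
    return tuple(x - y for x, y in zip(u, v))

def comb(n1, n2):
    """n1*alpha1 + n2*alpha2."""
    return tuple(n1 * a + n2 * b for a, b in zip(alpha1, alpha2))

assert minus(lam, (0, 1, 0)) == comb(1, 0)   # e1 - e2 = alpha1
assert minus(lam, (0, 0, 1)) == comb(1, 1)   # e1 - e3 = alpha1 + alpha2
print("lambda - mu is a nonnegative sum of simple roots for each weight mu")
```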

Fig. 21.3. The irreducible representation π3ϖ1+6ϖ2 of SU(3). The shaded region is
the positive Weyl chamber, and λ = 3ϖ1 + 6ϖ2 is circled

The highest weight can be any element of Λ∩C + . In fact, there is a bijection
between Λ ∩ C + and the isomorphism classes of irreducible representations of
G. Since there is a unique irreducible representation with the given highest
weight λ, we will denote it by πλ. For example, if λ = 3ϖ1 + 6ϖ2, the weight
diagram of πλ is shown in Fig. 21.3. Note that λ = 3ϖ1 + 6ϖ2 is marked with
a circle.
From this we can see several features of the general situation. The set of
weights of πλ can be characterized as follows. First, if μ is a weight of πλ
then λ ≽ μ in the partial order. This puts μ in the translate by λ of the
cone {μ ≼ 0}. This is the shaded region in Fig. 21.4. Moreover, since the set
of weights is invariant under the Weyl group W, we can actually say that
λ ≽ w(μ) for all w ∈ W. In Fig. 21.3, this puts μ in the hexagonal region
that is the convex hull of the W-orbit Wλ = {w(λ) | w ∈ W}. This region is
marked with dashed lines.

Fig. 21.4. With λ the highest weight (circled), the shaded region is {μ | μ ≼ λ}

It will be noted that not every element of Λ inside the hexagon is a weight
of πλ. Indeed, if μ is a weight of πλ then λ − μ ∈ Λroot. In the particular
example of Fig. 21.3, λ is itself in Λroot, so the weights of πλ are elements of
the root lattice. What is true in general is that the weights of πλ are the μ
inside the convex hull of Wλ such that λ − μ ∈ Λroot.
Next let G = Sp(4). The root system is of type C2. This group is also
simply connected, so again the fundamental dominant weights ϖ1 and ϖ2
are in the weight lattice. The weight lattice and root lattice are illustrated in
Fig. 21.5.
As in Fig. 21.1, the weight lattice Λ is marked with light dots and the
root sublattice with darker ones. We have also marked the positive Weyl
chamber, which is a fundamental domain for the Weyl group W, acting by
simple reflections.
The group Sp(4) admits a homomorphism Sp(4) −→ SO(5), so it has both
a four-dimensional and a five-dimensional irreducible representation. These
are πϖ1 and πϖ2, respectively. Their weight diagrams may be found in Fig. 21.6.

Fig. 21.5. The root and weight lattices of the C2 root system (the simple roots α1,
α2 and the fundamental weight ϖ2 are labeled in the figure)

The weight diagram of the irreducible representation π_{2ϖ1+3ϖ2} of Sp(4) is
shown in Fig. 21.7.
If we considered SO(5), the group would not be simply connected, so we
do not expect the fundamental weights to both lie in the weight lattice. Let us
see what happens. As explained in Chap. 19, the weight lattice is Z² and the
simple roots are e1 − e2 and e2, that is, (1, −1) and (0, 1). From this, the
fundamental dominant weights are ϖ1 = (1, 0) and ϖ2 = (1/2, 1/2). The first is
in the weight lattice but the second, being fractional, is not. So even though
we call ϖ2 a "fundamental dominant weight," it is not a weight of SO(5). It
is, however, a weight of the universal covering group Spin(5). Indeed, ϖ2 is
the highest weight of a four-dimensional irreducible representation of Spin(5),
the spin representation. Let t be an element of the maximal torus of Spin(5)
that projects onto

    diag(t1, t2, 1, t2^{−1}, t1^{−1}) ∈ SO(5).

Then the four eigenvalues of t in the spin representation are t1^{±1/2} t2^{±1/2},
where the signs of the square roots depend on which of the two elements in
the preimage of the above orthogonal matrix is chosen. The highest weight is
√(t1 t2), the character corresponding to ϖ2 = (1/2, 1/2).

Fig. 21.6. The fundamental representations of Sp(4) (in each diagram every weight
shown occurs with multiplicity 1)

[Weight multiplicities, row by row from the top of the diagram:
1 1 1
1 2 3 2 1
1 2 4 4 4 2 1
1 2 4 5 6 5 4 2 1
1 3 4 6 6 6 4 3 1
1 2 4 5 6 5 4 2 1
1 2 4 4 4 2 1
1 2 3 2 1
1 1 1]

Fig. 21.7. The irreducible representation π_{2ϖ1+3ϖ2} of Sp(4)

Exercises
Exercise 21.1. Consider the adjoint representation of SU(3) acting on the eight-
dimensional Lie algebra g of SU(3). (It may be shown to be irreducible.) Show that
the highest weight vector is ϖ1 + ϖ2, and construct a weight diagram.

Exercise 21.2. Construct a weight diagram for the adjoint representation of Sp(4)
or, equivalently, SO(5).

Exercise 21.3. Consider the symmetric square of the standard representation of
SU(3). This is an irreducible representation. Show that it has dimension six, and
that its highest weight vector is 2ϖ1. Construct its weight diagram.

Exercise 21.4. Consider the tensor product of the contragredient of the standard
representation of SU(3), having highest weight vector ϖ2, with the adjoint represen-
tation, having highest weight vector ϖ1 + ϖ2. We will see later in Exercise 22.4 that
this tensor product has three irreducible constituents. They are the contragredient
of the standard representation, the symmetric square of the standard representa-
tion, and another piece, which we will call π_{ϖ1+2ϖ2}. The first two pieces are known,
and the third can be obtained by subtracting the two others. Accepting for now
the validity of this decomposition, construct the weight diagram for the irreducible
representation π_{ϖ1+2ϖ2}.

Exercise 21.5. The Lie group G2 has an irreducible seven-dimensional represen-
tation. This information, together with the root system described in Chap. 19, is
enough to determine the weight diagram. Give the weight diagram for this repre-
sentation, and for the 14-dimensional adjoint representation.
22
The Weyl Character Formula

The character formula of Weyl [174] is the gem of the representation theory
of compact Lie groups.
Let G be a compact connected Lie group and T a maximal torus. Let
Λ = X ∗ (T ), and let Λroot be the lattice spanned by the roots. Then Λ ⊇ Λroot .
The index [Λ : Λroot ] may be finite (e.g. if G = SU(n)) or infinite (e.g. if
G = U(n)). If the index is finite, then we say G is semisimple, and this
corresponds to the semisimple case in Chap. 20. Elements of Λ will be called
weights.
We have written the characters of T additively. Sometimes we want to
write them multiplicatively, however, so we introduce symbols e^λ for λ ∈ V
subject to the rule e^λ e^μ = e^{λ+μ}. More formally, let E(R) denote the free
R-module on the set of symbols {e^λ | λ ∈ Λ}. It consists of all formal sums
Σ_{λ∈Λ} nλ e^λ with nλ ∈ R such that nλ = 0 for all but finitely many λ. It is a
ring with the multiplication

    (Σ_{λ∈Λ} nλ · e^λ)(Σ_{μ∈Λ} mμ · e^μ) = Σ_{ν∈Λ} (Σ_{λ+μ=ν} nλ mμ) · e^ν.    (22.1)

This makes sense because only finitely many nλ and only finitely many mμ
are nonzero. Of course, E(R) is just the group algebra over R of Λ. The Weyl
group acts on E(R), and we will denote by E(R)^W the subring of W-invariant
elements. Usually, we are interested in the case R = Z, and we will denote
E = E(Z), E^W = E(Z)^W. We will find it sometimes convenient to work in the
larger ring E2, which is the free Abelian group on (1/2)Λ.
If ξ = Σ_λ nλ · e^λ, we will sometimes denote m(ξ, λ) = nλ, the multiplicity
of λ in ξ. We will denote by ξ̄ = Σ_λ nλ · e^{−λ} the conjugate of ξ.
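Concretely, an element of E can be modeled as a finite map from weights (integer tuples) to coefficients, with the product given by (22.1). A minimal sketch (plain Python; the representation and function names are mine, not the text's):

```python
def e(*lam):
    """The basis element e^lam of E; the weight lam is an integer tuple."""
    return {tuple(lam): 1}

def add(xi, eta):
    """Sum in E, dropping zero coefficients."""
    total = dict(xi)
    for lam, m in eta.items():
        total[lam] = total.get(lam, 0) + m
    return {lam: n for lam, n in total.items() if n != 0}

def mul(xi, eta):
    """The multiplication (22.1): e^lam e^mu = e^(lam+mu), extended bilinearly."""
    prod = {}
    for lam, n in xi.items():
        for mu, m in eta.items():
            nu = tuple(a + b for a, b in zip(lam, mu))
            prod[nu] = prod.get(nu, 0) + n * m
    return {lam: n for lam, n in prod.items() if n != 0}

def conj(xi):
    """The conjugate: the sum of n_lam e^(-lam)."""
    return {tuple(-a for a in lam): n for lam, n in xi.items()}
```

For instance, with one-dimensional weights, (e^1 + e^{−1})² comes out as e^2 + 2e^0 + e^{−2}.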
By Theorem 17.1, class functions on G are the same thing as W -invariant
functions on T . In particular, if χ is the character of a representation of G,
then its restriction to T is a sum of characters of T and is invariant under
the action of W . Thus, if λ ∈ Λ, let nλ (χ) denote the multiplicity of λ in this
restriction. We associate with χ the element

    Σ_{λ∈Λ} nλ(χ) e^λ ∈ E^W.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225,
DOI 10.1007/978-1-4614-8024-2, © Springer Science+Business Media New York 2013

We will identify χ with this expression. We thus regard characters χ as
elements of E^W. The operation of conjugation that we have defined corresponds
to the conjugation of characters. The conjugate of a character is a character
by Proposition 2.6.
If μ1, μ2, ..., μr is a basis of the free Z-module Λ, then E is the Laurent
polynomial ring

    E = Z[μ1, ..., μr, μ1^{−1}, ..., μr^{−1}].

It is the localization S^{−1} Z[μ1, ..., μr], where S is the multiplicative subset of
Z[μ1, ..., μr] generated by {μ1, ..., μr}. As such, it is a unique factorization
domain. (See Lang [116], Exercise 5 on p. 115.)
Let Σ = {α1, ..., αr} be the simple roots, and Φ the set of roots,
partitioned into Φ+ and Φ− as usual. We will denote by Δ ∈ E the element

    Δ = e^{−ρ} ∏_{α∈Φ+} (e^α − 1) = e^ρ ∏_{α∈Φ+} (1 − e^{−α}).    (22.2)

The equivalence of the two expressions follows easily from the fact that 2ρ is
the sum of the positive roots.
Proposition 22.1. We have w(Δ) = (−1)^{l(w)} Δ for all w ∈ W.

Proof. It is sufficient to check that sβ(Δ) = −Δ for every simple root β. We
recall that sβ changes the sign of β and permutes the remaining positive roots.
Of the factors in the first expression for Δ in (22.2), only two are changed:
e^{−ρ} and (e^β − 1). These become [see (20.11)] e^{−ρ+β} and (e^{−β} − 1). The net
effect is that Δ changes sign.

An alternative way of explaining the same proof begins with the equation

    Δ = ∏_{α∈Φ+} (e^{α/2} − e^{−α/2}).    (22.3)

Here α/2 may not be an element of Λ, so each individual factor on the right
is not really an element of E but of the larger ring E2. Proposition 22.1 follows
by noting that, by Proposition 20.1(ii), each simple reflection changes the sign
of exactly one factor in (22.3).
Proposition 22.2. If ξ ∈ E satisfies w(ξ) = (−1)^{l(w)} ξ for all w ∈ W, then ξ
is divisible by Δ in E.

Proof. In the ring E, by Proposition 22.1, Δ is a product of distinct irreducible
elements 1 − e^α, where α runs through Φ+, times a unit. It is therefore
sufficient to show that ξ is divisible by each 1 − e^α. By Proposition 20.12,
we have sα(ξ) = −ξ. Write ξ = Σ_{λ∈Λ} nλ · e^λ. Since sα(ξ) = −ξ, we have
n_{sα(λ)} = −nλ. Noting that sα(λ) = λ − kα where k = α∨(λ) ∈ Z, we see that

    ξ = Σ_{λ mod ⟨sα⟩} nλ (e^λ − e^{λ−kα}).

The notation means that we choose only one representative λ for each ⟨sα⟩-orbit
of Λ. (If sα(λ) = λ, then nλ = 0.) Since

    e^λ − e^{λ−kα} = (1 − e^α)(−e^{λ−α} − e^{λ−2α} − ··· − e^{λ−kα}),

this is divisible by 1 − e^α, and hence ξ is divisible by Δ.


If λ ∈ Λ ∩ C+, let

    χ(λ) = Δ^{−1} Σ_{w∈W} (−1)^{l(w)} e^{w(λ+ρ)}.    (22.4)

By Proposition 22.2, χ(λ) ∈ E. Moreover, applying w ∈ W multiplies both
Σ_{w∈W} (−1)^{l(w)} e^{w(λ+ρ)} and Δ by (−1)^{l(w)}, so χ(λ) is actually in E^W.
We will eventually prove that if λ ∈ Λ ∩ C+ this is an irreducible character
of G. Then (22.4) is called the Weyl character formula.
If ξ = Σ nλ e^λ ∈ E, we define the support of ξ to be the finite set supp(ξ) =
{λ ∈ Λ | nλ ≠ 0}. We define a partial order on V by λ ≽ μ if λ = μ + Σ_{α∈Σ} cα α,
where each cα ≥ 0.
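For G = SU(2) the formula can be made completely explicit: Λ = Z, W = {1, s} with s(μ) = −μ, ρ = 1, and Δ = e^1 − e^{−1}. The sketch below (plain Python; representing Laurent polynomials as exponent-to-coefficient dictionaries is a convention of this illustration, not of the text) carries out the division by Δ:

```python
def chi(lam):
    """Weyl character formula (22.4) for SU(2): divide
    e^(lam+1) - e^(-(lam+1)) by Delta = e^1 - e^(-1)."""
    num = {lam + 1: 1, -(lam + 1): -1}     # sum over W = {1, s}, s(mu) = -mu
    den = {1: 1, -1: -1}                   # Delta
    quo = {}
    while num:
        d = max(num)                       # leading exponent of the remainder
        c = num[d]
        quo[d - 1] = quo.get(d - 1, 0) + c
        for k, v in den.items():           # subtract c * e^(d-1) * Delta
            key = d - 1 + k
            num[key] = num.get(key, 0) - c * v
            if num[key] == 0:
                del num[key]
    return quo
```

The quotient chi(lam) comes out as e^λ + e^{λ−2} + ··· + e^{−λ}, the familiar character of the (λ+1)-dimensional irreducible representation.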

Proposition 22.3. If λ ∈ C+, then λ ≽ w(λ) for all w ∈ W. If λ is in the
interior of C+ and w ≠ 1, then w(λ) ≺ λ.

Proof. It is easy to see that, for x ∈ V, x ≽ 0 if and only if ⟨x, v⟩ ≥ 0 for
all v in the interior of C+. So if λ ∈ C+ and λ ⋡ w(λ), then there exists v in
the interior of C+ such that ⟨λ − w(λ), v⟩ < 0. We choose w to maximize
⟨w(λ), v⟩. Since w(λ) ≠ λ and λ ∈ C+, it follows from Theorem 20.1 that
w(λ) ∉ C+. Therefore, there exists α ∈ Σ such that ⟨w(λ), α⟩ < 0, or
equivalently, α∨(w(λ)) < 0. Now

    ⟨sα w(λ), v⟩ = ⟨w(λ) − 2 (⟨w(λ), α⟩/⟨α, α⟩) α, v⟩
                 = ⟨w(λ), v⟩ − 2 (⟨w(λ), α⟩/⟨α, α⟩) ⟨α, v⟩ > ⟨w(λ), v⟩,

the inequality because ⟨w(λ), α⟩ < 0 while ⟨α, v⟩ > 0, since v is interior.
The maximality of ⟨w(λ), v⟩ is contradicted.




Proposition 22.4. Let λ ∈ C+. Then λ ∈ supp χ(λ). Indeed, writing χ(λ) =
Σ_μ nμ · e^μ, we have nλ = 1. Moreover, if μ ∈ supp χ(λ), then λ ≽ μ and
λ − μ ∈ Λroot. In particular, λ is the largest weight in the support of χ(λ).

Proof. We enlarge the ring E as follows. Let Ê be the "completion" consisting
of all formal sums Σ_{λ∈Λ} nλ · e^λ, where we now allow nλ ≠ 0 for an infinite
number of λ. However, we ask that there be a v ∈ V such that nλ ≠ 0 implies
that λ ≼ v. This means that, in the product (22.1), only finitely many terms
will be nonzero, so Ê is a ring. We can write

    Δ = e^ρ ∏_{α∈Φ+} (1 − e^{−α}),

so in Ê we have

    Δ^{−1} = e^{−ρ} ∏_{α∈Φ+} (1 + e^{−α} + e^{−2α} + ···).

Therefore,

    χ(λ) = e^λ ∏_{α∈Φ+} (1 + e^{−α} + e^{−2α} + ···) Σ_{w∈W} (−1)^{l(w)} e^{w(λ+ρ)−(λ+ρ)}.    (22.5)

In each factor of the product, every term is ≺ 0 except 1, and by Proposition
22.3 each term in the sum is ≺ 0 except the one corresponding to w = 1.
Hence every term in the expansion is ≼ λ, and exactly one term contributes λ itself.
It remains to be seen that if e^μ appears in the expansion of the right-hand
side of (22.5), then λ − μ is an element of Λroot. We note that w(λ+ρ)−(λ+ρ) ∈
Λroot by Proposition 20.16, and of course all the contributions coming from
the product over α ∈ Φ+ lie in Λroot, and the result follows.

Now let us write the Weyl integration formula in terms of Δ.

Theorem 22.1. If f is a class function on G, we have

    ∫_G f(g) dg = (1/|W|) ∫_T f(t) |Δ(t)|² dt.    (22.6)

Here there is an abuse of notation since Δ is itself only an element of E, not
even W-invariant, so it is not identifiable as a function on the group. However,
it will follow from the proof that ΔΔ̄ is always a function on the group, and
we will naturally denote ΔΔ̄ as |Δ|².

Proof. We will show that, in the notation of Theorem 17.2,

    det[(Ad(t^{−1}) − I_p)|p] = ΔΔ̄.    (22.7)

Indeed, since the complexification of p is the direct sum of the spaces Xα, on
each of which t ∈ T acts by α(t) in the adjoint representation,

    det[(Ad(t^{−1}) − I_p)|p] = ∏_{α∈Φ} (α(t)^{−1} − 1) = ∏_{α∈Φ+} |α(t) − 1|².

In E, this becomes the element

    (e^{−ρ} ∏_{α∈Φ+} (e^α − 1)) (e^{ρ} ∏_{α∈Φ+} (e^{−α} − 1)) = ΔΔ̄.

Now (22.6) is just the Weyl integration formula, Theorem 17.2.
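For G = SU(2), formula (22.6) can be checked numerically: parametrizing the maximal torus by t ↦ diag(e^{it}, e^{−it}) gives |Δ(t)|² = |e^{it} − e^{−it}|², |W| = 2, and χn(t) = sin((n+1)t)/sin t. A numerical sketch under these assumptions (the parametrization and function names are mine), recovering the orthonormality of the irreducible characters:

```python
import math

def chi(n, t):
    """Character of the (n+1)-dimensional irreducible of SU(2)."""
    s = math.sin(t)
    return float(n + 1) if abs(s) < 1e-12 else math.sin((n + 1) * t) / s

def inner(m, n, steps=20000):
    """(1/|W|) times the integral over T of chi_m chi_n |Delta|^2,
    with |W| = 2 and normalized Haar measure dt on the circle."""
    total = 0.0
    for j in range(steps):
        t = 2 * math.pi * (j + 0.5) / steps      # midpoint rule on [0, 2*pi)
        total += chi(m, t) * chi(n, t) * (2 * math.sin(t)) ** 2
    return total / steps / 2
```

By Schur orthogonality, inner(m, n) should come out close to 1 when m = n and close to 0 otherwise; the midpoint rule is essentially exact here because the integrand is a trigonometric polynomial.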




We now introduce an inner product on E^W. If ξ, η ∈ E^W, let

    ⟨ξ, η⟩ = (1/|W|) m((ξΔ)·(η̄Δ̄), 0).    (22.8)

That is, it is the multiplicity of the zero weight in (ξΔ) times the conjugate
of (ηΔ), divided by |W|.

Theorem 22.2. If ξ and η are characters of G, identified with elements of
E, then the inner product (22.8) agrees with the L² inner product of the
characters.

Proof. The L² inner product of ξ and η is just the integral of ξ · η̄ over the group
and, using (22.6), this is just |W|^{−1} times the multiplicity of 0 in (ξΔ)·(η̄Δ̄).

Proposition 22.5. If λ and μ are weights in C+, we have

    ⟨χ(λ), χ(μ)⟩ = 1 if λ = μ, and 0 otherwise.

Proof. Using (22.8), this inner product is the multiplicity of 0 in

    (1/|W|) (Σ_{w∈W} (−1)^{l(w)} e^{w(ρ+λ)}) (Σ_{w′∈W} (−1)^{l(w′)} e^{−w′(ρ+μ)})
        = (1/|W|) Σ_{w,w′∈W} (−1)^{l(w)+l(w′)} e^{w(ρ+λ)−w′(ρ+μ)}.

We must therefore ask, with both λ and μ in C+, under what circumstances
w(ρ + λ) − w′(ρ + μ) can vanish. If it does, then ρ + λ = w^{−1}w′(ρ + μ). Since
both ρ + λ and ρ + μ are in the interior of C+, it follows from Theorem 20.1
that w must equal w′, and so λ must equal μ. The number of solutions is thus
|W| if λ = μ and zero otherwise.


Proposition 22.6. The set of χ(λ) with λ ∈ Λ ∩ C+ is a basis of the free
Z-module E^W.

Proof. The linear independence of the χ(λ) follows from their orthogonality.
We must show that they span. Clearly, E^W is spanned by elements of the form

    B(λ) = Σ_{μ∈W·λ} e^μ,    λ ∈ Λ ∩ C+,

where W·λ is the orbit of λ under the action of W. It is sufficient to show
that B(λ) is in the Z-linear span of the χ(λ). It follows from Proposition 22.4
that when we expand B(λ) − χ(λ) in terms of the B(μ), only μ ∈ Λ with
μ ≺ λ can occur and, by induction, these are in the span of the χ(μ).


Theorem 22.3 (Weyl). Assume that G is semisimple. If λ ∈ Λ ∩ C+, then
χ(λ) is the character of an irreducible representation of G, and each irreducible
representation is obtained this way.

We will denote by π(λ) the irreducible representation of G with character χ(λ).

Proof. Let χ be an irreducible character of G. Regarding χ as an element
of E^W, we may expand χ in terms of the χ(λ) by Proposition 22.6. We write

    χ = Σ_{λ∈Λ∩C+} nλ · χ(λ),    nλ ∈ Z.

We have

    1 = ⟨χ, χ⟩ = Σ_λ nλ².

Therefore, exactly one nλ is nonzero, and that one has value ±1. Thus, either χ(λ)
or its negative is an irreducible character of G. To see that −χ(λ) is not a
character, consider its restriction to T. By Proposition 22.4, the multiplicity
of the character λ in −χ(λ) is −1, which is impossible if −χ(λ) is a character.
Hence, χ(λ) = χ is an irreducible character of G.
We have shown that every irreducible character of G is a χ(λ). It remains
to be shown that every χ(λ) is a character. Since the class functions on G are
identical to the W-invariant functions on T, the closure in L²(G) of E(C)^W is
identified with the space of all class functions on G. By Proposition 22.6, the
χ(λ) form an L²-basis of E(C)^W. Since by the Peter–Weyl theorem the set of
irreducible characters of G is an L²-basis of the space of class functions, the
characters of G cannot be a proper subset of the set of χ(λ).

Now let us step back and see what we have established. We know that in group
representation theory there is a duality between the irreducible characters of
a group and its conjugacy classes. We can study both the conjugacy classes
and the irreducible representations of a compact Lie group by restricting them
to T . We find that the conjugacy classes of G are in one-to-one correspon-
dence with the W -orbits of T . Dually, the irreducible representations of G are
parametrized by the orbits of W on Λ = X ∗ (T ).
We study these orbits by embedding X ∗ (T ) in a Euclidean space V. The
positive Weyl chamber C+ is a fundamental domain for the action of W on V,
and so the dominant weights—those in C+ —are thus used to parametrize the
irreducible representations. Of the weights that appear in the parametrized
representation χ(λ), the parametrizing weight λ ∈ C+ ∩ X ∗ (T ) is maximal
with respect to the partial order. We therefore call it the highest weight vector
of the representation.
Proposition 22.7. We have

    Δ = Σ_{w∈W} (−1)^{l(w)} e^{w(ρ)}.    (22.9)

Proof. The irreducible representation with highest weight 0 is obviously the
trivial representation. Therefore, χ(0) = e^0 = 1. The formula now follows
from (22.4).


Weyl gave a formula for the dimension of the irreducible representation with
character χλ. Of course, this is the value of χλ at the identity element of G,
but we cannot simply plug the identity into the Weyl character formula since
the numerator and denominator both vanish there. Naturally, the solution is
to use L'Hôpital's rule, which can be formulated purely algebraically in this
context.

Theorem 22.4 (Weyl). The dimension of π(λ) is

    ∏_{α∈Φ+} ⟨λ + ρ, α⟩ / ∏_{α∈Φ+} ⟨ρ, α⟩.    (22.10)

Proof. Let Ω : E2 −→ Z be the map

    Ω(Σ_{λ∈Λ} nλ · e^λ) = Σ_{λ∈Λ} nλ.

The dimension we wish to compute is Ω(χλ).


If α ∈ Φ, let ∂α : E2 −→ E2 be the map

    ∂α(Σ_λ nλ · e^λ) = Σ_λ nλ ⟨λ, α⟩ · e^λ.

It is straightforward to check that ∂α is a derivation and that the operators
∂α commute with each other. Let ∂ = ∏_{α∈Φ+} ∂α.
We show that if w ∈ W and f ∈ E2, we have

    w∂(f) = (−1)^{l(w)} ∂w(f).    (22.11)

We note first that

    ∂_{w(α)} ∘ w = w ∘ ∂α,    (22.12)

since applying the operator on the left-hand side to e^λ gives ⟨w(λ), w(α)⟩ e^{w(λ)},
while the right-hand side gives ⟨λ, α⟩ e^{w(λ)}, and these are equal. Now, to prove
(22.11), we may assume that w = sβ is a simple reflection. By (22.12), we have

    w ∘ ∏_{α∈Φ+} ∂_{w(α)} = ∂ ∘ w.

But by Proposition 20.1(ii), the set of w(α) consists of Φ+ with just one
element, namely β, replaced by its negative. Since ∂_{−β} = −∂_β, the left-hand
side equals −w ∘ ∂, and (22.11) is proved.

We consider now what happens when we apply Ω ∘ ∂ to both sides of the
identity

    Σ_{w∈W} (−1)^{l(w)} e^{w(λ+ρ)} = χλ · ∏_{α∈Φ+} (e^{α/2} − e^{−α/2}).    (22.13)

On the left-hand side, by (22.11), applying ∂ gives

    Σ_{w∈W} w(∂ e^{λ+ρ}) = Σ_{w∈W} w( ∏_{α∈Φ+} ⟨λ + ρ, α⟩ e^{λ+ρ} ).

Now applying Ω gives |W| ∏_{α∈Φ+} ⟨λ + ρ, α⟩.
On the other hand, we apply ∂ = ∏_β ∂β one derivation at a time to the
right-hand side of (22.13), expanding by the Leibniz product rule to obtain
a sum of terms, each of which is a product of χλ and the factors e^{α/2} − e^{−α/2},
with various subsets of the ∂β applied to each factor. When we apply Ω, any
term in which some e^{α/2} − e^{−α/2} is not hit by at least one ∂β will be killed. Since
the number of operators ∂β and the number of factors e^{α/2} − e^{−α/2} are equal,
only the terms in which each e^{α/2} − e^{−α/2} is hit by exactly one ∂β survive. Of
course, χλ is not hit by a ∂β in any such term. In other words,

    Ω ∘ ∂( χλ · ∏_{α∈Φ+} (e^{α/2} − e^{−α/2}) ) = θ · Ω(χλ),

where

    θ = Ω ∘ ∂( ∏_{α∈Φ+} (e^{α/2} − e^{−α/2}) )

is independent of λ. We have proved that

    |W| ∏_{α∈Φ+} ⟨λ + ρ, α⟩ = θ · Ω(χλ).

To evaluate θ, we take λ = 0, so that χλ is the character of the trivial rep-
resentation, and Ω(χλ) = 1. We see that θ = |W| ∏_{α∈Φ+} ⟨ρ, α⟩. Dividing by
this, we obtain (22.10).
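For G = SU(3) the formula is easy to evaluate: writing λ = aϖ1 + bϖ2, the pairings of λ + ρ with the three positive roots are proportional to a + 1, b + 1, and a + b + 2, and using coroots in place of roots rescales matching factors of the numerator and denominator of (22.10), leaving the quotient unchanged. A sketch under these conventions (the coordinates and function name are mine):

```python
from fractions import Fraction

def dim_su3(a, b):
    """Weyl dimension formula (22.10) for SU(3), lambda = a*w1 + b*w2.
    The pairings of lambda + rho with the positive coroots are
    a + 1, b + 1, a + b + 2; for rho they are 1, 1, 2."""
    num = (a + 1) * (b + 1) * (a + b + 2)
    den = 1 * 1 * 2
    return Fraction(num, den)
```

For example, dim_su3(1, 0) gives the standard representation, dim_su3(1, 1) the adjoint, dim_su3(2, 0) the symmetric square of the standard representation, and dim_su3(1, 2) the fifteen-dimensional constituent of Exercise 21.4.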

Proposition 22.8. Let λ be a dominant weight, and let (π, V) be an irre-
ducible representation with highest weight λ. Let w0 be the longest Weyl group
element (Proposition 20.14). Then the highest weight of the contragredient
representation is −w0(λ).

Proof. We recall that the character of the contragredient is the complex con-
jugate of the character of π (Proposition 2.6). The weights that occur in the
contragredient are therefore the negatives of the weights that occur in π. It fol-
lows that −λ is the lowest weight of π̂. The highest weight is therefore in the
same Weyl group orbit as −λ, and the unique dominant weight in that
orbit is −w0(λ).


In 1967 Klimyk described a method of decomposing tensor products that


is very efficient for computation. It is based on a simple idea and in retrospect
it was found that the same idea appeared in a much earlier paper of Brauer
(1937). The same idea appears in Steinberg [154]. We will prove a special case,
leaving the general case for the exercises.

Proposition 22.9 (Brauer, Klimyk). Suppose that λ and μ are in X*(T) ∩
C+. Decompose χμ into a sum of weights ν ∈ X*(T) with multiplicities m(ν):

    χμ = Σ_ν m(ν) e^ν.

Suppose that for each ν with m(ν) ≠ 0 the weight λ + ν is dominant. Then

    χλ χμ = Σ_ν m(ν) χ_{λ+ν}.    (22.14)

Since χλ χμ is the character of the tensor product representation, this gives the
decomposition of this tensor product into irreducibles. The method of proof
can be extended to the case where λ + ν is not dominant for all ν, though the
answer is a bit more complicated to state (Exercise 22.5).

Proof. By the Weyl character formula, we may write

    χλ χμ = Δ^{−1} Σ_ν m(ν) e^ν Σ_{w∈W} (−1)^{l(w)} e^{w(λ+ρ)}.

Interchange the order of summation, so that the sum over ν is the inner sum,
and make the variable change ν −→ w(ν). Since m(ν) = m(w(ν)), we get

    Δ^{−1} Σ_{w∈W} Σ_ν m(ν) (−1)^{l(w)} e^{w(λ+ν+ρ)}.

Now we may interchange the order of summation again and apply the Weyl
character formula to obtain (22.14).
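For G = SU(2) the hypothesis of the proposition holds whenever λ ≥ μ ≥ 0 (identifying weights with integers, an assumption of this illustration): the weights ν of χμ are μ, μ − 2, ..., −μ, each with multiplicity one, and λ + ν ≥ λ − μ ≥ 0 is dominant. Formula (22.14) then yields the Clebsch–Gordan decomposition:

```python
def tensor_su2(lam, mu):
    """Highest weights of the constituents of chi_lam * chi_mu for SU(2),
    by (22.14), assuming lam >= mu >= 0 so every lam + nu is dominant."""
    assert lam >= mu >= 0
    return [lam + nu for nu in range(-mu, mu + 1, 2)]
```

A quick dimension check: (λ+1)(μ+1) equals the sum of h + 1 over the highest weights h returned.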


It is sometimes convenient to shift ρ by a W-invariant element of R ⊗ X*(T) so
that it is in X*(T). Such a shift is harmless in the Weyl character formula and
the Weyl dimension formula provided we shift by a vector that is orthogonal to
all the roots, since ρ only appears in inner products with the roots. This is only
possible if G is not semisimple, for if G is semisimple, there are no nonzero
vectors orthogonal to the roots. Let us illustrate this trick with G = U(n).
We identify X*(T) with Z^n by mapping the character

    diag(t1, ..., tn) −→ ∏_i t_i^{k_i}    (22.15)

to (k1, ..., kn) ∈ Z^n. Then ρ is (1/2)(n − 1, n − 3, ..., 1 − n). If n is even, it is
an element of R ⊗ X*(T) but not of X*(T). However, if we add to it the
W-invariant element (1/2)(n − 1, ..., n − 1), we get

    δ = (n − 1, n − 2, ..., 1, 0) ∈ X*(T).    (22.16)

We can now write the Weyl character formula in the form

    χ(λ) = Δ0^{−1} Σ_{w∈W} (−1)^{l(w)} e^{w(λ+δ)},    (22.17)

where

    Δ0 = Σ_{w∈W} (−1)^{l(w)} e^{w(δ)}.

We have simply multiplied the numerator and the denominator by the same
W-invariant element so that both the numerator and the denominator are in
X*(T).
In (22.7), we write the factor |Δ|² = |Δ0|² since (Δ0/Δ)² = e^{2(δ−ρ)}. As a
function on the group, this is just det(g)^{n−1}, which has absolute value 1.
Therefore, we may write the Weyl integration formula in the form

    ∫_G f(g) dg = (1/|W|) ∫_T f(t) |Δ0(t)|² dt.    (22.18)

Exercises
In the first batch of exercises, G = SU(3) and, as usual, ϖ1 and ϖ2 are the funda-
mental dominant weights.

Exercise 22.1. By Proposition 22.4, all the weights in χλ lie in the set

    S(λ) = {μ ∈ Λ | λ ≽ w(μ) for all w ∈ W, λ − μ ∈ Λroot}.

Confirm by examining the weights that this is true for all the examples in Chap. 21—
in fact, for all these examples, S(λ) is exactly the set of weights.

Exercise 22.2. Use the Weyl dimension formula to compute the dimension of χ_{2ϖ1}.
Deduce from this that the symmetric square of the standard representation is irre-
ducible.

Exercise 22.3. Use the Weyl dimension formula to compute the dimension of
χ_{ϖ1+2ϖ2}, and check that it is consistent with the decomposition described in
Exercise 21.4.

Exercise 22.4. Use the Brauer–Klimyk method (Proposition 22.9) to compute the
tensor product of the contragredient of the standard representation (with character
χ_{ϖ2}) and the adjoint representation (with character χ_{ϖ1+ϖ2}).

Exercise 22.5. Prove the following extension of Proposition 22.9. Suppose that λ
is dominant and that ν is any weight. By Proposition 20.1, there exists a Weyl group
element w such that w(ν + λ + ρ) ∈ C+. The point w(ν + λ + ρ) is uniquely determined,
even though w may not be. If w(ν + λ + ρ) is on the boundary of C+, define ξ(ν, λ) = 0.
If w(ν + λ + ρ) is not on the boundary of C+, explain why w(ν + λ + ρ) − ρ ∈ C+ and
w is uniquely determined. In this case, define ξ(ν, λ) = (−1)^{l(w)} χ_{w(ν+λ+ρ)−ρ}. Prove
that if μ is a dominant weight, and χμ = Σ_ν m(ν) e^ν, then

    χμ χλ = Σ_ν m(ν) ξ(ν, λ).

Exercise 22.6. Use the last exercise to compute the decomposition of χ_{ϖ1}² into
irreducibles, and obtain another proof that the symmetric square of the standard
representation is irreducible.

Exercise 22.7. Let μ be an element of the root lattice. A vector partition of μ is a
decomposition of μ into a linear combination, with nonnegative integer coefficients,
of positive roots. In other words, it is an assignment of nonnegative integers nα
to the α ∈ Φ+ such that

    μ = Σ_{α∈Φ+} nα α.

The Kostant partition function P(μ) is defined to be the number of vector partitions
of μ. Note that this is zero unless μ ≽ 0. Let Ê be the completion of E defined in
the proof of Proposition 22.4. Show that in Ê

    ∏_{α∈Φ+} (1 − e^{−α})^{−1} = Σ_{μ∈Λroot, μ≽0} P(μ) e^{−μ},

and from (22.5) deduce the Kostant multiplicity formula, for λ a dominant weight:
the multiplicity of μ in χ(λ) is

    Σ_{w∈W} (−1)^{l(w)} P(w(λ + ρ) − ρ − μ).
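For the root system A2 this can be verified by machine. In the basis of the simple roots, Φ+ = {(1, 0), (0, 1), (1, 1)}, so P(mα1 + nα2) = min(m, n) + 1: the only freedom in a vector partition is the coefficient of α1 + α2. The sketch below (plain Python; the coordinates and helper names are mine) evaluates the Kostant multiplicity formula for the adjoint representation of SU(3):

```python
# Root-lattice coordinates: (m, n) stands for m*a1 + n*a2 in type A2; rho = (1, 1).

def P(m, n):
    """Kostant partition function for A2: the coefficient of a1 + a2 can be
    0, 1, ..., min(m, n); the coefficients of a1 and a2 are then forced."""
    return 0 if m < 0 or n < 0 else min(m, n) + 1

# Simple reflections acting on root-lattice coordinates:
# s1 sends a1 -> -a1, a2 -> a1 + a2; s2 sends a2 -> -a2, a1 -> a1 + a2.
def s1(v): return (v[1] - v[0], v[1])
def s2(v): return (v[0], v[0] - v[1])

# All six Weyl group elements as reduced words; sign = (-1) ** length.
WORDS = [[], [s1], [s2], [s1, s2], [s2, s1], [s1, s2, s1]]

def multiplicity(lam, mu):
    """Multiplicity of mu in chi(lam), both in root-lattice coordinates:
    sum over W of (-1)^l(w) P(w(lam + rho) - rho - mu)."""
    total = 0
    for word in WORDS:
        v = (lam[0] + 1, lam[1] + 1)          # lam + rho
        for s in word:
            v = s(v)
        total += (-1) ** len(word) * P(v[0] - 1 - mu[0], v[1] - 1 - mu[1])
    return total
```

The adjoint representation of SU(3) has λ = α1 + α2 = (1, 1); the formula gives multiplicity 2 for the zero weight and 1 for each root, as expected.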

Exercise 22.8. Let G = SU(3) and let ϖ1, ϖ2 : T −→ C be the fundamental
dominant weights, labeled as in Chap. 19. Use the Kostant multiplicity formula to
compute the weights of χ(ϖ1 + 2ϖ2). Note that you only need to consider
weights in supp χ(λ) as computed in Proposition 22.4. Do you observe a shortcut
for this type of calculation?

Exercise 22.9. Show that if −w0(λ) = λ for all dominant weights λ of G, then
every element of G is conjugate to its inverse.

Exercise 22.10. The nine-dimensional adjoint representation of U(3) has as an
invariant subspace the eight-dimensional Lie algebra of SU(3).
(i) Identifying the weight lattice of U(3) or GL(3, C) with Z³ as in Chap. 19, what
is the highest weight vector in this eight-dimensional module?
(ii) Decompose the tensor square of this representation into irreducibles by com-
puting the square of the character and finding irreducible representations whose
characters add up to the character in question.
(iii) Compute the symmetric and exterior squares of the character.

Exercise 22.11. Let V be the ten-dimensional adjoint representation of Sp(4).
What is the decomposition of the symmetric and exterior squares of this repre-
sentation into irreducibles?

Exercise 22.12. Generalize the last exercise as follows. Let V be the adjoint rep-
resentation of Sp(2n). Its degree is 2n² + n. What is the decomposition of the
symmetric and exterior squares of this representation into irreducibles? You might
want to use a computer program such as Sage to collect some data, but prove your
answer.

Exercise 22.13. Let the weight lattices of Sp(2n) and SO(2n + 1) be identified
with Z^n as in Chap. 19. Denote by ρC and ρB the Weyl vectors of these two groups.
Show that

    ρC = (n, n − 1, ..., 1),    ρB = (n − 1/2, n − 3/2, ..., 1/2).

Recall that Spin(2n + 1) is the double cover of SO(2n + 1). The weight lattice of
SO(2n + 1) is naturally embedded in that of Spin(2n + 1). Show that the weight lattice
of Spin(2n + 1) consists of tuples (μ1, ..., μn) such that 2μi ∈ Z, and the μi are either
all integers or all half integers, that is, the 2μi are either all even or all odd. (Hint:
Use Proposition 18.10 and look ahead to Proposition 31.2 if you need help.) Now
let λ = (λ1, ..., λn) ∈ Z^n such that λ1 ≥ ··· ≥ λn ≥ 0, and let μ = (μ1, ..., μn) where
μi = λi + 1/2. Show that λ and μ are dominant weights for Sp(2n) and Spin(2n + 1). Let
Vλ and Wμ be irreducible representations of Sp(2n) and Spin(2n + 1), respectively.
Show that

    dim(Wμ) = 2^n · dim(Vλ).

[Hint: It may be easiest to show that dim(Wμ)/dim(Vλ) is constant, then take λ = 0
to determine the constant.]

The next exercise treats the Frobenius–Schur indicator of an irreducible repre-
sentation. This will be covered (in a slightly different form) in Theorem 43.1. If (π, V)
is a representation of the compact group G, and B : V × V → C is an invariant
bilinear form, then B is unique up to scalar multiple (by a version of Schur's lemma),
so B(x, y) = c B(y, x) for some constant c. Since B(x, y) = c² B(x, y), c = ±1. Thus
the form B is either symmetric or skew-symmetric. If it is symmetric, then π(G)
is contained in the orthogonal group of the form, in which case we say π is or-
thogonal. If B is skew-symmetric, then π(G) is contained in the symplectic group
of the form, in which case dim(V) is even, and we say that π is symplectic. Every
self-contragredient representation is either orthogonal or symplectic but not both.
See Chap. 43 for further details.

Exercise 22.14. Let χ be the character of an irreducible representation π : G →
GL(V) of the compact group G.
(i) Show that χ(g²) = ∨²χ(g) − ∧²χ(g) and χ(g)² = ∨²χ(g) + ∧²χ(g), where ∨²χ
and ∧²χ are the characters of the symmetric and exterior square representations.
(ii) Show that g ↦ χ(g²) is a generalized character, and that when it is expanded in
terms of irreducible characters, the coefficient of the trivial character is the
Frobenius–Schur indicator

    ε(π) =  1  if π is orthogonal,
           −1  if π is symplectic,
            0  if π is not self-contragredient.

The generalized character g ↦ χ(g²) is (in the language of lambda rings) the second Adams
operation applied to χ. More generally, for k ≥ 0, the Adams operation is ψ^k χ(g) =
χ(g^k). We will return to the Adams operations in Chap. 33, and in particular we
will see that ψ^k χ is a generalized character for all k; for k = 2 this follows from
Exercise 22.14. Suppose that for μ in the weight lattice Λ of G, m(μ) is the weight
multiplicity for χ, so that in the notation of Chap. 22, we have

    χ = Σ_{μ∈Λ} m(μ) e^μ.

Then clearly

    ψ^k χ = Σ_{μ∈Λ} m(μ) e^{kμ}.

A method of computing the Frobenius–Schur indicator is simply to decompose
ψ²χ into irreducibles and note the coefficient of the trivial representation. A better
method is to use a result of Steinberg [155] (modestly called Lemma 79): there
exists an element η of order ≤ 2 in the center of G such that if π is self-dual, then the
central character of π applied to η is the Frobenius–Schur indicator. In Sage, irre-
ducible representations (as WeylCharacterRing elements) have a method to compute
the Frobenius–Schur indicator.
Exercise 22.15.
(i) Let G = SU(2). If k is a nonnegative integer, let χk be the character of the
irreducible representation on ∨^k C². Show that if χ = χk, then the generalized
character g ↦ χ(g²) equals

    Σ_{l=0}^{k} (−1)^l χ_{2k−2l},

and deduce that the Frobenius–Schur indicator of χk is (−1)^k.
(ii) Show that the image of SL(2, C) under the kth symmetric power homomorphism
to GL(k + 1, C) is contained in SO(k + 1) if k is even, and in Sp(k + 1) if k is odd.
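Part (i) can be sanity-checked numerically, assuming the usual parametrization of SU(2) classes by t with eigenvalues e^{±it}, so χk(t) = sin((k+1)t)/sin t, and computing the Frobenius–Schur indicator as the integral of χ(g²) over the group via the Weyl integration formula (the parametrization and function names are mine):

```python
import math

def chi(k, t):
    """Character of the irreducible representation of SU(2) on the
    k-th symmetric power of C^2."""
    s = math.sin(t)
    return float(k + 1) if abs(s) < 1e-12 else math.sin((k + 1) * t) / s

def fs_indicator(k, steps=20000):
    """Frobenius-Schur indicator of chi_k: the integral of chi(g^2) over
    SU(2), written as (1/2) times the average of chi_k(2t) |e^(it)-e^(-it)|^2."""
    total = 0.0
    for j in range(steps):
        t = 2 * math.pi * (j + 0.5) / steps   # midpoint rule on the torus
        total += chi(k, 2 * t) * (2 * math.sin(t)) ** 2
    return total / steps / 2
```

The result should be close to (−1)^k: the even symmetric powers are orthogonal, the odd ones symplectic.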
Exercise 22.16. Let k be a positive integer, and let G be a compact connected Lie
group. If λ is a dominant weight, let χλ denote the irreducible character with highest
weight λ. Let ρ be the Weyl vector, half the sum of the positive roots. Prove that

    χ_{(k−1)ρ} · ψ^k χλ = χ_{kλ+(k−1)ρ}.

(Hint: Use the Weyl character formula.)
Exercise 22.17. Let α1 and α2 be the simple roots for SU(3). The aim of this
exercise is to compute ψ²χ_{kρ}. Note that ρ = α1 + α2.
(i) Show that

    χρ Σ_{0≤i,j≤m; i=0 or j=0} (−1)^{i+j} χ_{2mρ−iα1−jα2} = χ_{(2m+1)ρ} − χ_{(2m−1)ρ}   if m > 0,

and that the left-hand side equals χρ if m = 0.
(Hint: One way to prove this is to use the Brauer–Klimyk method.)
(ii) Show that

    ψ²χ_{kρ} = Σ_{m=0}^{k} Σ_{0≤i,j≤m; i=0 or j=0} (−1)^{i+j} χ_{2mρ−iα1−jα2}.

Exercise 22.18. Let χ be a character of the group G. Let ∨^k χ, ∧^k χ and ψ^k χ be the
symmetric power, exterior power, and Adams operations applied to χ. Prove that

    ∨^k χ = (1/k) Σ_{r=1}^{k} (ψ^r χ)(∨^{k−r} χ),

    ∧^k χ = (1/k) Σ_{r=1}^{k} (−1)^{r−1} (ψ^r χ)(∧^{k−r} χ).

(Hint: The symmetric polynomial identity (37.3) below may be of use.)

If G is a Lie group, using the Brauer–Klimyk method to compute the right-hand
sides in these identities, it is not necessary to decompose ψ^k χ into irreducibles. So this
gives an efficient recursive way to compute the symmetric and exterior powers of a
character.
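Evaluated at a group element, ψ^r χ is the r-th power sum of the eigenvalues, and the two recursions are Newton's identities for the complete homogeneous and elementary symmetric polynomials. A minimal transcription (plain Python with exact rational arithmetic; the function names are mine):

```python
from fractions import Fraction

def adams(eigs, r):
    """psi^r chi evaluated at an element with the given eigenvalues."""
    return sum(x ** r for x in eigs)

def sym_ext(eigs, k):
    """(sym^k chi, ext^k chi) at the same element, via the two recursions."""
    sym = [Fraction(1)]    # sym[0] = ext[0] = 1 (the trivial character)
    ext = [Fraction(1)]
    for n in range(1, k + 1):
        sym.append(sum(adams(eigs, r) * sym[n - r]
                       for r in range(1, n + 1)) / n)
        ext.append(sum((-1) ** (r - 1) * adams(eigs, r) * ext[n - r]
                       for r in range(1, n + 1)) / n)
    return sym[k], ext[k]
```

For example, with eigenvalues 2, 3, 5 the degree-two values are 69 and 31, matching the complete homogeneous and elementary symmetric polynomials directly.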

Exercise 22.19. Let ϖ1 and ϖ2 be the fundamental dominant weights for SU(3).
Show that the irreducible representation with highest weight kϖ1 + lϖ2 is self-
contragredient if and only if k = l, and in this case, it is orthogonal.

Exercise 22.20. Let G be a semisimple Lie group. Show that the adjoint repre-
sentation is orthogonal. (Hint: You may assume that G is simple. Use the Killing
form.)
23
The Fundamental Group
In this chapter, we will look more closely at the fundamental group of a
compact Lie group G. We will show that it is a finitely generated Abelian
group and that each loop in G can be deformed into any given maximal torus.
Then we will show how to calculate the fundamental group. Along the way we
will encounter another important Coxeter group, the affine Weyl group. The
key arguments in this chapter are topological and are adapted from Adams [2].
Proposition 23.1. Let G be a connected topological group and Γ a discrete
normal subgroup. Then Γ ⊂ Z(G).

Proof. Let γ ∈ Γ. Then g −→ gγg⁻¹ is a continuous map G −→ Γ. Since G
is connected and Γ discrete, it is constant, so gγg⁻¹ = γ for all g. Therefore,
γ ∈ Z(G).
Proposition 23.2. If G is a connected Lie group, then the fundamental group
π1(G) is Abelian.

Proof. Let p : G̃ −→ G be the universal cover. We identify the kernel ker(p)
with π1(G). This is a discrete normal subgroup of G̃ and hence is central in
G̃ by Proposition 23.1. In particular, it is Abelian.
We remind the reader that an element of G is regular if it is contained in a
unique maximal torus. Clearly, a generator of a maximal torus is regular. An
element of G is singular if it is not regular. Let Greg and Gsing be the subsets
of regular and singular elements of G, respectively.

Proposition 23.3. The set Gsing is a finite union of submanifolds of G, each
of codimension at least 3.
Proof. By Proposition 18.14, the singular elements of G are the conjugates of
the kernels Tα of the roots. We first show that the union of the set of conjugates
of Tα is the image of a manifold of codimension 3 under a smooth map. Let
α ∈ Φ. The set of conjugates of Tα is the image of G/CG(Tα) × Tα under the
smooth map (gCG(Tα), u) → gug⁻¹. Let r = dim(T), so r − 1 = dim(Tα). The
dimension of CG(Tα) is at least r + 2 since its complexified Lie algebra contains
tC, Xα, and X−α. Thus, the dimension of the manifold G/CG(Tα) × Tα is at
most dim(G) − (r + 2) + (r − 1) = dim(G) − 3.
However, we have asserted more precisely that Gsing is a union of submanifolds
of codimension ≥ 3. This more precise statement requires a bit more
work. If S ⊂ Φ is any nonempty subset, let US = ⋂{Tα | α ∈ S}. Let VS be the
open subset of US consisting of elements not contained in US′ for any larger
S′. It is easily checked along the lines of (17.2) that the Jacobian of the map

(g CG(US), u) → gug⁻¹,   G/CG(US) × VS −→ G,

is nonvanishing, so its image is a submanifold of G by the inverse function
theorem. The union of these submanifolds is Gsing, and each has dimension
≤ dim(G) − 3.
Lemma 23.1. Let X and Y be Hausdorff topological spaces and f : X −→ Y
a local homeomorphism. Suppose that U ⊂ X is a dense open set and that the
restriction of f to U is injective. Then f is injective.

Proof. If x1 ≠ x2 are elements of X such that f(x1) = f(x2), find open
neighborhoods V1 and V2 of x1 and x2, respectively, that are disjoint, and
such that f induces a homeomorphism Vi −→ f(Vi). Note that U ∩ Vi is
a dense open subset of Vi, so f(U ∩ Vi) is a dense open subset of f(Vi).
Since f(V1) ∩ f(V2) ≠ ∅, it follows that f(U ∩ V1) ∩ f(U ∩ V2) is nonempty. If
z ∈ f(U ∩ V1) ∩ f(U ∩ V2), then there exist elements yi ∈ U ∩ Vi such that
f(yi) = z. Since the Vi are disjoint, y1 ≠ y2; yet f(y1) = f(y2), a contradiction
since f|U is injective.
We define a map φ : G/T × Treg −→ Greg by φ(gT, t) = gtg⁻¹. It is the
restriction to the regular elements of the map studied in Chap. 17.

Proposition 23.4.
(i) The map φ is a covering map of degree |W|.
(ii) If t ∈ Treg, then the |W| elements wtw⁻¹, w ∈ W, are all distinct.

Proof. For t ∈ Treg, the Jacobian of this map, computed in (17.2), is nonzero.
Thus the map φ is a local homeomorphism.
We define an action of W = N(T)/T on G/T × Treg by

w : (gT, t) −→ (gn⁻¹T, ntn⁻¹),   w = nT ∈ W.

W acts freely on G/T, so the quotient map G/T × Treg −→ W\(G/T × Treg)
is a covering map of degree |W|. The map φ factors through W\(G/T ×
Treg). Consider the induced map ψ : W\(G/T × Treg) −→ Greg. We have a
commutative diagram:
        G/T × Treg −−−−→ W\(G/T × Treg)
              φ ↘              ↙ ψ
                     Greg

Both φ and the horizontal arrow are local homeomorphisms, so ψ is a local
homeomorphism. By Proposition 17.3, the elements wtw⁻¹ are all distinct for
t in a dense open subset of Treg. Thus, ψ is injective on a dense open subset
of W\(G/T × Treg), and since it is a local homeomorphism, it is therefore
injective by Lemma 23.1. This proves both (i) and (ii).
Proposition 23.5. Let p : X −→ Y be a covering map. The induced map
π1(X) −→ π1(Y) is injective.

Proof. Suppose that p0 and p1 are loops in X with the same endpoints whose
images in Y are path-homotopic. It is an immediate consequence of Proposi-
tion 13.2 that p0 and p1 are themselves path-homotopic.

Proposition 23.6. The inclusion Greg −→ G induces an isomorphism of fun-
damental groups: π1(Greg) ≅ π1(G).
Proof. Of course, we usually take the base point of G to be the identity,
but that is not in Greg. Since G is connected, the isomorphism class of its
fundamental group does not change if we move the base point P into Greg.
If p : [0, 1] −→ G is a loop beginning and ending at P, the path may
intersect Gsing. We may replace the path by a smooth path. Since Gsing is a
finite union of submanifolds of codimension at least 3, we may move the path
slightly and avoid Gsing. (For this we only need codimension 2.) Therefore,
the induced map π1(Greg) −→ π1(G) is surjective.
Now suppose that p0 and p1 are two paths in Greg that are path-homotopic
in G. We may assume that both the paths and the homotopy are smooth.
Since Gsing is a finite union of submanifolds of codimension at least 3, we may
perturb the homotopy to avoid it, so p0 and p1 are homotopic in Greg. Thus,
the map π1(Greg) −→ π1(G) is injective.
Proposition 23.7. We have π1(G/T) = 1.

In Exercise 27.4 we will see that the Bruhat decomposition gives an alternative
proof of this fact.
Proof. Let t0 ∈ Treg and consider the map f0 : G/T −→ G, f0(gT) = gt0g⁻¹.
We will show that the map π1(G/T) −→ π1(G) induced by f0 is injective. We
may factor f0 as

G/T −υ→ G/T × Treg −φ→ Greg −→ G,

where the first map υ sends gT −→ (gT, t0). We will show that each
induced map

π1(G/T) −→ π1(G/T × Treg) −→ π1(Greg) −→ π1(G)   (23.1)

is injective. It should be noted that Treg might not be connected, so G/T × Treg
might not be connected, and π1(G/T × Treg) depends on the choice of a
connected component for its base point. We choose the base point to be (T, t0).
We can factor the identity map of G/T as G/T −υ→ G/T × Treg −→ G/T,
where the second map is the projection. Applying the functor π1, we see that
π1(υ) has a left inverse and is therefore injective. Also π1(φ) is injective by
Propositions 23.4 and 23.5, and the third map is injective by Proposition 23.6.
This proves that the map induced by f0 injects π1(G/T) −→ π1(G).
However, the map f0 : G/T → G is homotopic in G to the trivial map
sending G/T to a single point, as we can see by moving t0 to 1 ∈ G. Thus f0
induces the trivial map π1(G/T) −→ π1(G) and so π1(G/T) = 1.
Theorem 23.1. The induced map π1(T) −→ π1(G) is surjective. The group
π1(G) is finitely generated and Abelian.

Proof. One way to see this is to use the exact sequence

π1(T) −→ π1(G) −→ π1(G/T)

of the fibration G −→ G/T (Spanier [149, Theorem 10 on p. 377]). It follows
using Proposition 23.7 that π1(T) −→ π1(G) is surjective. Alternatively, to
avoid recourse to the exact sequence, we recall more directly why π1(G/T) = 1
implies that π1(T) → π1(G) is surjective. Given any loop in G, its image in
G/T can be deformed to the identity, and lifting this homotopy to G deforms
the original path to a path lying entirely in T.
As a quotient of the finitely generated Abelian group π1(T), the group
π1(G) is finitely generated and Abelian.
Lemma 23.2. Let H ∈ t.
(i) Let λ ∈ Λ. Then λ(e^H) = 1 if and only if (1/2πi) dλ(H) ∈ Z.
(ii) We have e^H = 1 if and only if (1/2πi) dλ(H) ∈ Z for all λ ∈ Λ.

Proof. Since t → λ(e^{tH}) is a character of R, we have λ(e^{tH}) = e^{2πiθt} for some
θ = θ(λ, H). Then

θ = (1/2πi) (d/dt) λ(e^{tH}) |_{t=0} = (1/2πi) dλ(H).

Taking t = 1 gives λ(e^H) = e^{2πiθ}, so λ(e^H) = 1 if and only if
(1/2πi) dλ(H) ∈ Z. And λ(e^H) = 1 for all λ ∈ X∗(T) if and only if e^H = 1.
Since the map π1(T) −→ π1(G) is surjective, we may study the fundamen-
tal group of G by determining the kernel of this homomorphism. The group
π1(T) is easy to understand: the Lie algebra t is simply-connected, and the
exponential map exp : t −→ T is a surjective group homomorphism that is
a covering map. Thus we may identify t with the universal cover of T, and
with this identification π1(T) is just the kernel of the exponential map t → T.
Moreover, the next result shows how we may further identify π1(T) with the
coweight lattice, which is the lattice Λ∨ of linear functionals on V that map
Λ into Z.
Proposition 23.8. Define τ : t → V∗ by letting τ(H) ∈ V∗ be the linear
functional that sends λ to (1/2πi) dλ(H). Then τ is a linear isomorphism, and τ
maps the kernel of exp : t → T to Λ∨. If α∨ is a coroot, then

α∨ = τ(2πiHα).   (23.2)

Proof. It is clear that τ is a linear isomorphism. It follows from Lemma 23.2(ii)
that it maps the kernel of exp onto the coweight lattice. The identity (23.2)
follows from Proposition 18.13.
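For G = SU(2), (23.2) can be checked numerically. With the standard normalization — an assumption about coordinates, not fixed by the text: t consists of the matrices diag(ia, −ia) and α(diag(a, −a)) = 2a, so Hα = diag(1, −1) — the element 2πiHα exponentiates to the identity, i.e. it lies in the kernel of exp and hence represents an element of the coweight lattice, while its midpoint πiHα exponentiates to the central element −I:

```python
import cmath

# Assumed SU(2) normalization: alpha(diag(a, -a)) = 2a, so alpha(H_alpha) = 2
# with H_alpha = diag(1, -1).  Only diagonal matrices occur, stored as tuples.
H_alpha = (1, -1)

def exp_diag(scale, H):
    # exponential of the diagonal matrix scale * diag(H)
    return tuple(cmath.exp(scale * h) for h in H)

full = exp_diag(2j * cmath.pi, H_alpha)  # e^{2*pi*i*H_alpha}
half = exp_diag(1j * cmath.pi, H_alpha)  # e^{pi*i*H_alpha}

# 2*pi*i*H_alpha is in ker(exp): it represents the coroot in the coweight lattice
assert all(abs(z - 1) < 1e-12 for z in full)
# the midpoint exponentiates to -I, the nontrivial central element of SU(2)
assert all(abs(z + 1) < 1e-12 for z in half)
```

This is exactly the loop used in the proof of Proposition 23.10 below, which passes through iα(−I) on its way around.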
For each α ∈ Φ and each k ∈ Z define the hyperplane Hα,k ⊂ t to be

{H ∈ t | τ(H)(α) = k}.

By Lemma 23.2, the preimage of Tα in t under the exponential map is the
union of the Hα,k.
The geometry of these hyperplanes leads to the affine Weyl group (Bour-
baki [23, Chap. IV Sect. 2]). This structure goes back to Stiefel [156] and has
subsequently proved very important. Adams [2] astutely based his discussion
of the fundamental group on the affine Weyl group. The fundamental group
is also discussed in Bourbaki [24, Chap. IX]. The affine Weyl group was used
by Iwahori and Matsumoto [86] to introduce a Bruhat decomposition into a
reductive p-adic group. The geometry introduced by Stiefel also reappears as
the “apartment” in the Bruhat-Tits building. Iwahori and Matsumoto also
introduced the affine Hecke algebra as a convolution algebra of functions on
a p-adic group, but the affine Hecke algebra has an importance in other areas
of mathematics for example because of its role in Kazhdan-Lusztig theory.
In addition to the simple positive roots α1 , . . . , αr , let −α0 be the highest
weight in the adjoint representation. It is a positive root, so α0 is a negative
root, the so-called affine root which will appear later in Chap. 30. We call a
connected component of the complement of the hyperplanes Hα,k an alcove.
There is a unique alcove contained in the positive Weyl chamber that contains
the origin in its closure. This alcove is the region bounded by the hyperplanes
Hα1,0, . . . , Hαr,0 and H−α0,1 = Hα0,−1. It will be called the fundamental
alcove F.
We have seen (by Lemma 23.2) that the lattice Λ∨ where the weights
take integer values may be identified with the kernel of the exponential map
t −→ T. Thus Λ∨ may be identified with the fundamental group of T, which
(by Theorem 23.1) maps surjectively onto π1(G). Therefore we need to
compute the kernel of this homomorphism Λ∨ −→ π1(G). We will show that
it is the coroot lattice Λ∨coroot, which is the sublattice generated by the coroots
α∨ (α ∈ Φ).
Fig. 23.1. The Cartan subalgebra t, partitioned into alcoves, when G = SU(3) or
PU(3). We are identifying t = V∗ via the isomorphism τ, so the coroots are in t.
Before turning to the proofs, let us consider an example. Let G = PU(3),
illustrated in Fig. 23.1. The hyperplanes Hα,k are labeled, subdividing t into
alcoves, and the fundamental alcove F is shaded. The coweight lattice
Λ∨ ≅ π1(T) consists of the vertices of the alcoves, which are the smaller dots. The
heavier dots mark the coroot lattice.
If we consider instead G = SU(3), the diagram would be the same with
only one difference: now not all the vertices of alcoves are in Λ∨. In this
example, Λ∨ = Λ∨coroot consists of only the heavier dots. Not every vertex of
the fundamental alcove is in the coweight lattice.
If α ∈ Φ and k ∈ Z let sα,k be the reflection in the hyperplane Hα,k. Thus

sα,k(x) = x − (α(x) − k) α∨.

Let Waff be the group of transformations of t generated by the reflections in
the hyperplanes Hα,k. This is the affine Weyl group. In particular we will label
the reflections in the walls of the fundamental alcove s0, s1, . . . , sr, where if
1 ≤ i ≤ r then si is the reflection in Hαi,0, and s0 is the reflection in Hα0,−1.
Proposition 23.9.
(i) The affine Weyl group acts transitively on the alcoves.
(ii) The group Waff is generated by s0, s1, . . . , sr.
(iii) The group Waff contains the group of translations by elements of Λ∨coroot as
a normal subgroup and is the semidirect product of W and the translation
group of Λ∨coroot.

Proof. Let W′aff be the subgroup ⟨s0, s1, . . . , sr⟩. Consider the orbit W′aff F of
alcoves. If F1 is in this orbit, and F2 is another alcove adjacent to F1, then
F = wF1 for some w ∈ W′aff and so wF2 is adjacent to F, i.e. wF2 = siF for
some si. It follows that F2 is in W′aff F also. Since every alcove adjacent to
an alcove in W′aff F is also in it, it is clear that W′aff F consists of all alcoves,
proving (i).
We may now show that Waff = W′aff. It is enough to show that the reflection
r = sα,k in the hyperplane Hα,k is in W′aff. Let A be an alcove that is adjacent
to Hα,k, and find w ∈ W′aff such that w(A) = F. Then w maps Hα,k to one of
the walls of F, so wrw⁻¹ is the reflection in that wall. This means wrw⁻¹ = si
for some i and so r = w⁻¹siw ∈ W′aff. This completes the proof that Waff is
generated by the si.
Now we recall that the group 𝒢 of all affine linear maps of the real vector
space t is the semidirect product of the group of linear transformations (which
fix the origin) by the group of translations, a normal subgroup. If v ∈ t we
will denote by T(v) the translation x → x + v. So T(t) is normal in 𝒢 and
𝒢 = GL(t) · T(t). This means that 𝒢/T(t) ≅ GL(t). The homomorphism
𝒢 −→ GL(t) maps the reflection in Hα,k to the reflection in the parallel
hyperplane Hα,0 through the origin. This induces a homomorphism from Waff
to W, and we wish to determine the kernel K, which is the group of translations
in Waff.
First observe that T(α∨) is in Waff, since it is the reflection in Hα,0 followed by
the reflection in Hα,1. Let us check that Λ∨coroot is normalized by W. Indeed we
check easily that sα,0 T(β∨) sα,0 is translation by sα,0(β∨) = β∨ − α(β∨)α∨ ∈
Λ∨coroot. Therefore W · T(Λ∨coroot) is a subgroup of Waff. Finally, we note that
sα,k = T(kα∨) sα,0, so W · T(Λ∨coroot) contains generators for Waff, and the
proof is complete.
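In rank 1 the structure in (iii) can be verified by direct computation. A sketch for type A1 (the coordinates, with α(x) = 2x and α∨ = 1, are an illustrative choice): affine maps x → ax + b are encoded as pairs (a, b), so sα,k(x) = x − (α(x) − k)α∨ = −x + k.

```python
def compose(f, g):
    # (f o g)(x) = f(g(x)) for affine maps encoded as (a, b) : x -> a*x + b
    (a1, b1), (a2, b2) = f, g
    return (a1 * a2, a1 * b2 + b1)

def s(k):
    # the reflection s_{alpha,k} in the hyperplane H_{alpha,k}: x -> -x + k
    return (-1, k)

def T(v):
    # translation by v
    return (1, v)

alpha_v = 1  # the coroot, in these coordinates

# Reflection in H_{alpha,0} followed by reflection in H_{alpha,1} is T(alpha_v):
assert compose(s(1), s(0)) == T(alpha_v)

# s_{alpha,k} = T(k * alpha_v) s_{alpha,0}, as at the end of the proof:
for k in range(-3, 4):
    assert s(k) == compose(T(k * alpha_v), s(0))
```

Note the order of composition: "the reflection in Hα,0 followed by the reflection in Hα,1" means sα,0 is applied first.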
The group Wextended generated by Waff and translations by Λ∨ may equal
Waff or it may be larger. This possibly larger group is called the extended affine
Weyl group. We will not need it, but it is important and we mention it for
completeness.
We have constructed a surjective homomorphism Λ∨ −→ π1 (G). To re-
capitulate, the exponential map t −→ T is a covering by a simply-connected
group, so its kernel Λ∨ may be identified with the fundamental group π1 (T ),
and we have seen that the map π1 (T ) −→ π1 (G) induced by inclusion is
surjective.
We recall that if X is a topological space then a path p : [0, 1] −→ X is
called a loop if p(0) = p(1).
Lemma 23.3. Let ψ : Y −→ X be a covering map, and let p : [0, 1] −→ Y
be a path. If ψ ◦ p : [0, 1] −→ X is a loop that is contractible in X, then p
is a loop.
Proof. Let q = ψ ◦ p, and let x = q(0). Let y = p(0), so ψ(y) = x. What we
know is that q(1) = x and what we need to prove is that p(1) = y.
Since q is contractible in X, we may find a family qu of paths indexed
by u ∈ [0, 1] such that q0 = q while q1 is the constant path q1 (t) = x.
We may choose the deformation so that neither end point moves, that is,
qu (0) = qu (1) = x for all u. For each u, by the path lifting property of covering
maps [Proposition 13.2(i)] we may find a unique path pu : [0, 1] −→ Y such
that ψ ◦ pu = qu and pu (0) = y. In particular, p0 = p. It is clear that pu (t)
varies continuously as a function of both t and u. Since q1 is constant, so is p1
and therefore p1 (1) = p1 (0) = y. Now u → pu (1) is a path covering a constant
path, hence it too is constant, so p0 (1) = p1 (1) = y.
Proposition 23.10. The kernel of the surjective homomorphism Λ∨ −→
π1(G) that we have described is Λ∨coroot. Thus the fundamental group π1(G) is
isomorphic to Λ∨/Λ∨coroot.
Proof. First let us show that a coroot α∨ is in the kernel of this homomor-
phism. In view of (23.2) this means, concretely, that if we take a path from
0 to 2πiHα in t then the exponential of this path (which has both left and
right endpoint at the identity) is contractible in G. We may modify the path
so that it is the straight-line path, passing through πiHα. Then we write it
as the concatenation of two paths p and q, in the notation of Chap. 13, where
p(0) = 0 and p(1) = πiHα while q(0) = πiHα and q(1) = 2πiHα.
The exponential e^{pq} of this concatenation, that is, e^p followed by e^q, is a
loop. In fact, we have

e^{p(0)} = e^{q(1)} = iα(I) = 1,   e^{p(1)} = e^{q(0)} = iα(−I).
We will deform the path q, leaving p unchanged. Let

          ⎛ cos(πu/2)  −sin(πu/2) ⎞
g(u) = iα ⎜                       ⎟ ,
          ⎝ sin(πu/2)   cos(πu/2) ⎠

and consider qu = Ad(g(u))q. The endpoints of qu do change as u goes from
0 to 1, but the endpoints of e^{qu} do not. Indeed e^{qu(0)} = g(u) iα(−I) g(u)⁻¹ =
iα(−I) and similarly e^{qu(1)} = 1. Thus the path e^{pq} is homotopic to
e^p followed by e^{q1}. Now e^{q1} = wα e^q wα⁻¹ (where wα = g(1)), which is the
reverse of the path e^p, being the exponential of the straight-line path from
−πiHα to −2πiHα. This proves that e^{pq} is path-homotopic to the identity.
Thus far we have shown that Λ∨coroot is in the kernel of the homomorphism
Λ∨ −→ π1(G). To complete the proof, we will now show that if K ∈ Λ∨ maps
to the identity in π1(G), then K ∈ Λ∨coroot. We note that there are |W| alcoves
that are adjacent to the origin; these are the alcoves wF with w ∈ W. We will
show that we may assume that K lies in the closure of one of these alcoves.
Indeed F + K is an alcove, so there is some w′ ∈ Waff such that F + K = w′F.
Moreover, w′ may be represented as T(K′)w with K′ ∈ Λ∨coroot and w ∈ W.
Thus F + K − K′ = wF and since we have already shown that K′ maps to
the identity in π1(G), we may replace K by K − K′ and assume that K is in
the closure of wF.
Our goal is to prove that K = 0, which will of course establish that
K ∈ Λ∨coroot, as required. Since K and the origin are in the same alcove,
we may find a path p : [0, 1] −→ t from 0 to K such that p(u) is in the interior
of the alcove for u ≠ 0, 1, while p(0) = 0 and p(1) = K are vertices of
the alcove.
Let treg be the preimage of Treg in t under the exponential map. It is the
complement of the hyperplanes Hα,k, or equivalently, the union of the alcove
interiors. We will make use of the map ψ : G/T × treg −→ Greg defined by
ψ(gT, H) = g e^H g⁻¹. It is the composition of the covering map φ in Proposi-
tion 23.4 with the exponential map treg −→ Treg, which is also a covering, so
ψ is a covering map.
Let N be a connected neighborhood of the identity in G that is closed
under conjugation such that the preimage under exp of N in t consists of
disjoint open sets, each containing a single element of the kernel Λ∨ of the
exponential map on t. Let Nt = t ∩ exp⁻¹(N) be this preimage. Each connected
component of Nt contains a unique element of Λ∨.
We will modify p so that it is outside treg only near t = 1. Let H ∈ t be
a small vector in the interior of wF ∩ Nt and let p′ : [0, 1] −→ t be the path
p shifted by H. The vector H can be chosen in the connected component of Nt
that contains 0 but no other element of Λ∨. So p′(0) = H and p′(1) = K + H.
When t is near 1 the path p′ may cross some of the Hα,k but only inside Nt.
The exponentiated path e^{p′(t)} will be a loop close to e^{p(t)}, hence con-
tractible. And e^{p′(t)} will be near the identity 1 = e^K in G at the end of
the path, where p′(t) may not be in the interior of wF. Because, by Proposi-
tion 23.3, Gsing is a union of codimension ≥ 3 submanifolds of G, we may find
a loop q′ : [0, 1] −→ Greg that coincides with e^{p′} until near the end, where
p′(t) is near K + H and e^{p′(t)} reenters N. At the end of the path, q′ dodges
out of T to avoid the singular subset of G, but stays near the identity. More
precisely, if q′ is close enough to e^{p′}, the paths will agree until e^{p′(t)} and q′
are both inside N.
The loop q′ will have the same endpoints as e^{p′}. It is still contractible by
Proposition 23.6. Therefore we may use the path lifting property of covering
maps [Proposition 13.2(i)] to lift the path q′ to a path p′′ : [0, 1] −→ G/T × treg
by means of ψ, with p′′(0) = (1 · T, H); the lifted path is a loop by Lemma 23.3.
Thus p′′(1) = p′′(0) = (1 · T, H). Now consider the projection p′′′ of p′′ onto
treg. At the end of the path, the paths e^{p′} and ψ ◦ p′′ are both within N, so
p′ and p′′′ can only vary within Nt. In particular their endpoints, which are
H + K and H respectively, must be the same, so K = 0 as required.
Proposition 23.11. Let G be a compact connected Lie group. The following
are equivalent.
(i) The root system Φ spans V = R ⊗ X∗(T).
(ii) The fundamental group π1(G) is finite.
(iii) The center Z(G) is finite.
If these conditions are satisfied, G is called semisimple.

Proof. The root lattice spans V if and only if the coroot lattice spans V∗,
which we are identifying with t. Since Λ∨ is a lattice in t of rank equal to
dim(V) = dim(T), the coroot lattice spans V∗ if and only if Λ∨/Λ∨coroot is
finite, and this, we have seen, is isomorphic to the fundamental group. Thus
(i) and (ii) are equivalent. The center Z(G) is the intersection of the Tα by
Proposition 18.14. Thus the Lie algebra z of Z(G) is the intersection of the tα,
the kernels of the roots interpreted as linear functionals on t = V∗. So Z(G)
is finite if and only if z = 0, if and only if the roots span V. The equivalence
of all three statements is now proved.
Let us now assume that G is semisimple. Let Λ̃ be the set of λ ∈ V such that
α∨(λ) ∈ Z for all coroots α∨, and let Λ̃∨ be the set of H ∈ V∗ = t such that
α(H) ∈ Z for all α ∈ Φ.
The following result gives a complete determination of both the center and
the fundamental group.

Theorem 23.2. Assume that G is semisimple. Then

Λ̃ ⊇ Λ ⊇ Λroot,   Λ̃∨ ⊇ Λ∨ ⊇ Λ∨coroot.
Regarding these as lattices in the dual real vector spaces V and V∗, which we
have identified with R ⊗ X∗(T) and with t, respectively, Λ̃ is the dual lattice
of Λ∨coroot, Λ is the dual lattice of Λ∨, and Λroot is the dual lattice of Λ̃∨. Both
π1(G) and Z(G) are finite Abelian groups and

π1(G) ≅ Λ̃/Λ ≅ Λ∨/Λ∨coroot,   Z(G) ≅ Λ̃∨/Λ∨ ≅ Λ/Λroot.   (23.3)
Proof. By Proposition 18.10 we have Λ̃ ⊇ Λ, and Λ ⊇ Λroot is clear since the
roots are elements of X∗(T). That Λ and Λ∨ are dual lattices is Lemma 23.2. That
Λroot and Λ̃∨ are dual lattices and that Λ̃ and Λ∨coroot are dual lattices are both
true by definition. The inclusions Λ̃ ⊇ Λ ⊇ Λroot then imply Λ̃∨ ⊇ Λ∨ ⊇ Λ∨coroot.
Moreover, Λ̃/Λ ≅ Λ∨/Λ∨coroot follows from the fact that two finite Abelian groups
in duality are isomorphic, and the vector space pairing V × V∗ −→ R induces a
perfect pairing Λ̃/Λ × Λ∨/Λ∨coroot −→ R/Z. Similarly Λ̃∨/Λ∨ ≅ Λ/Λroot. The
fact that π1(G) ≅ Λ∨/Λ∨coroot follows from Proposition 23.10. It remains to
be shown that Z(G) ≅ Λ̃∨/Λ∨. We know from Proposition 18.14 that Z(G) is
the intersection of the Tα. Thus H ∈ t exponentiates into Z(G) if and only if
α(H) ∈ Z for all α ∈ Φ. This means that the exponential induces a surjection
Λ̃∨ −→ Z(G). Since Λ∨ is the kernel of the exponential on t, the statement
follows.
Proposition 23.12. If G is semisimple and simply-connected, then Λ̃ = Λ.

Proof. This follows from (23.3) with π1(G) = 1.
Exercises
Exercise 23.1. If g is a Lie algebra let [g, g] be the vector space spanned by [X, Y ]
with X, Y ∈ g. Show that [g, g] is an ideal of g.

Exercise 23.2. Suppose that g is a real or complex Lie algebra. Assume that there
exists an invariant inner product B : g × g −→ C. Thus B is positive definite
symmetric or Hermitian and satisfies the ad-invariance property (10.3). Let z be the
center of g. Show that the orthogonal complement of z is [g, g].

Exercise 23.3. Let G be a semisimple group of adjoint type, and let G′ be its
universal cover. Show that the fundamental group of G is isomorphic to the center
of G′. (Both are finite Abelian groups.)

Exercise 23.4.
(i) Consider a simply-connected semisimple group G. Explain why Λ̃ = Λ in the
notation of Theorem 23.2.
(ii) Using the description of the root lattices for each of the four classical Cartan
types in Chap. 19, consider a simply-connected semisimple group G and compute
the weight lattice Λ using Theorem 23.2. Confirm the following table:

Cartan type    Fundamental group
Ar Zr+1
Br Z2
Cr Z2
Dr , r odd Z4
Dr , r even Z2 × Z2
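The table can be confirmed mechanically. Writing each simple root in the basis of fundamental weights expresses Λroot inside Λ̃ with the Cartan matrix as coefficient matrix, so Λ̃/Λroot — by Theorem 23.2 and Exercise 23.3, the fundamental group of the adjoint group — is the cokernel of the Cartan matrix, read off from its invariant factors. A sketch (the node-numbering conventions for types B, C, D are assumptions; any standard numbering gives the same quotient):

```python
from itertools import combinations
from math import gcd

def det(M):
    # integer determinant by cofactor expansion (fine for these small matrices)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j+1:] for r in M[1:]])
               for j in range(len(M)))

def invariant_factors(M):
    # d_k = gcd of all k x k minors; the invariant factors are d_k / d_{k-1};
    # trivial factors equal to 1 are suppressed
    n = len(M)
    d = [1]
    for k in range(1, n + 1):
        g = 0
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                g = gcd(g, det([[M[i][j] for j in cols] for i in rows]))
        d.append(g)
    return [d[k] // d[k - 1] for k in range(1, n + 1) if d[k] // d[k - 1] != 1]

def cartan(typ, r):
    # Cartan matrix for the classical types (assumed node numbering)
    C = [[2 if i == j else (-1 if abs(i - j) == 1 else 0) for j in range(r)]
         for i in range(r)]
    if typ == 'B':
        C[r - 2][r - 1] = -2
    if typ == 'C':
        C[r - 1][r - 2] = -2
    if typ == 'D':
        C[r - 2][r - 1] = C[r - 1][r - 2] = 0
        C[r - 3][r - 1] = C[r - 1][r - 3] = -1
    return C

assert invariant_factors(cartan('A', 4)) == [5]      # Z_5 = Z_{r+1}
assert invariant_factors(cartan('B', 3)) == [2]      # Z_2
assert invariant_factors(cartan('D', 5)) == [4]      # r odd:  Z_4
assert invariant_factors(cartan('D', 4)) == [2, 2]   # r even: Z_2 x Z_2
```

The determinant alone gives the order (r + 1, 2, 2, 4 for types A, B, C, D), but only the invariant factors distinguish Z_4 from Z_2 × Z_2 in type D.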

Exercise 23.5. If g is a Lie algebra, the center of g is the set of all Z ∈ g such that
[Z, X] = 0 for all X ∈ g. Show that if G is a connected Lie group with Lie algebra
g then the center of g is the Lie algebra of Z(G).

Exercise 23.6. (i) Let g be the Lie algebra of a compact Lie group G. If a is an
Abelian ideal, show that a is contained in the center of g.
(ii) Show by example that this may fail without the assumption that g is the Lie
algebra of a compact Lie group. Thus give a Lie algebra and an Abelian ideal
that is not central.

Exercise 23.7. Let G be a compact Lie group and g its Lie algebra. Let T, t,
and other notations be as in this chapter. Let t′ be the linear span of the coroots
α∨ = 2πiHα. Let z be the center of g. Show that t = t′ ⊕ z.
Part III

Noncompact Lie Groups

24
Complexification
Thus far, we have investigated the representations of compact connected Lie
groups. In this chapter, we will see how the representation theory of compact
connected Lie groups has implications for at least some noncompact Lie
groups.
Let K be a connected Lie group. A complexification of K consists of a
complex analytic group G with a Lie group homomorphism i : K −→ G such
that whenever f : K −→ H is a Lie group homomorphism into a complex
analytic group, there exists a unique analytic homomorphism F : G −→ H
such that f = F ◦ i. This is a universal property, so it characterizes G up to
isomorphism.
A consequence of this definition is that the finite-dimensional representa-
tions of K are in bijection with the finite-dimensional analytic representations
of G. Indeed, we may take H to be GL(n, C). A finite-dimensional rep-
resentation of K is a Lie group homomorphism K −→ GL(n, C), and so
any finite-dimensional representation of K extends uniquely to an analytic
representation of G.
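For instance, the symmetric-square representation of SU(2) is given by polynomial formulas in the matrix entries, and applying those same formulas to arbitrary complex matrices yields its analytic extension. A small sketch (helper names are illustrative) checking that the extended map is still a homomorphism on all of GL(2, C), not just on SU(2):

```python
import random

def sym2(g):
    # Matrix of g acting on Sym^2(C^2) in the basis e1^2, e1*e2, e2^2:
    # the columns are the images of the basis vectors under g = [[a,b],[c,d]].
    (a, b), (c, d) = g
    return [[a * a,     a * b,         b * b],
            [2 * a * c, a * d + b * c, 2 * b * d],
            [c * c,     c * d,         d * d]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

random.seed(0)
rnd = lambda: complex(random.uniform(-1, 1), random.uniform(-1, 1))
g = [[rnd(), rnd()], [rnd(), rnd()]]
h = [[rnd(), rnd()], [rnd(), rnd()]]

# The same polynomial formulas define a homomorphism on complex matrices:
lhs = sym2(matmul(g, h))
rhs = matmul(sym2(g), sym2(h))
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-9 for i in range(3) for j in range(3))
```

Because the entries of sym2(g) are polynomials in the entries of g, the extension is automatically analytic, which is the mechanism behind the uniqueness statement above.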
Proposition 24.1. The group SL(n, C) is the complexification of the Lie
group SL(n, R).

Proof. Given any complex analytic group H and any Lie group homomorphism
f : SL(n, R) −→ H, the differential is a Lie algebra homomorphism sl(n, R)−→
Lie(H). Since Lie(H) is a complex Lie algebra, this homomorphism extends
uniquely to a complex Lie algebra homomorphism sl(n, C) −→ Lie(H) by
Proposition 11.3. By Theorems 13.5 and 13.6, SL(n, C) is simply connected,
so by Theorem 14.2 this map is the differential of a Lie group homomorphism
F : SL(n, C) −→ H. We need to show that F is analytic. Consider the com-
mutative diagram

        sl(n, C) −−−−→ Lie(H)
       exp ↓               ↓ exp
        SL(n, C) −−F−→ H
The top, left, and right arrows are all holomorphic maps, and exp : sl(n, C) −→
SL(n, C) is a local homeomorphism in a neighborhood of the identity. Hence
F is holomorphic near 1. If g ∈ SL(n, C) and if l(g) : SL(n, C) −→ SL(n, C)
and l(F(g)) : H −→ H denote left translation by g and by F(g),
then l(g) and l(F(g)) are analytic, and F = l(F(g)) ◦ F ◦ l(g)⁻¹. Since F is
analytic at 1, it follows that it is analytic at g.
We recall from Chap. 14, particularly the proof of Proposition 14.1, that if G is
a Lie group and h a Lie subalgebra of Lie(G), then there is an involutory family
of tangent vectors spanned by the left-invariant vector fields corresponding to
the elements of h. Since these vector fields are left-invariant, this involutory
family is invariant under left translation.
Proposition 24.2. Let G be a Lie group and let h be a Lie subalgebra of
Lie(G). Let H be a closed connected subset of G that is an integral submanifold
of the involutory family associated with h, and suppose that 1 ∈ H. Then H
is a subgroup of G.
One must not conclude from this that every Lie subalgebra of Lie(G) is the
Lie algebra of a closed Lie subgroup. For example, if G = (R/Z)2 , then the
one-dimensional subalgebra spanned by a vector (x1 , x2 ) ∈ Lie(G) = R2 is
the Lie algebra of a closed subgroup only if x1 /x2 is rational or x2 = 0.
Proof. Let x ∈ H and let U = {y ∈ H | x⁻¹y ∈ H}.
We show that U is open in H. If y ∈ U = H ∩ xH, both H and xH
are integral submanifolds for the involutory family associated with h, since
the vector fields corresponding to elements of h are left-invariant. Hence by
the uniqueness assertion of the local Frobenius theorem (Theorem 14.1) H
and xH have the same intersection with a neighborhood of y in G, and it
follows that U contains a neighborhood of y in H.
We next show that the complement of U is open in H. Suppose that y
is an element of H − U. Thus, y ∈ H but x⁻¹y ∉ H. By the local Frobenius
theorem there exists an integral manifold V through x⁻¹y. Since H is closed,
the intersection of V with a sufficiently small neighborhood of x⁻¹y in G
is disjoint from H. Replacing V by its intersection with this neighborhood,
we may assume that V ∩ H = ∅. Since H and xV are
both integral manifolds through y, they have the same intersection with a
neighborhood of y in G, and so x⁻¹z ∈ V for z near y in H. Thus, z ∉ U.
It follows that H − U is open.
We see that U is both open and closed in H and nonempty since 1 ∈ U.
Since H is connected, it follows that U = H. This proves that if x, y ∈ H,
then x⁻¹y ∈ H. This implies that H is a subgroup of G.
Theorem 24.1. Let K be a compact connected Lie group. Then K has a
complexification K −→ G, where G is a complex analytic group. The induced
map π1(K) −→ π1(G) is an isomorphism. The Lie algebra of G is the
complexification of the Lie algebra of K. Any faithful complex representation
of K can be extended to a faithful analytic complex representation of G. Any
analytic representation of G is completely reducible.
Proof. By Theorem 4.2, K has a faithful complex representation, which is
unitarizable, so we may assume that K is a closed subgroup of U(n) for
some n. The differential of the embedding K −→ U(n) is a Lie algebra ho-
momorphism k −→ gl(n, C), where k is the Lie algebra of K. This extends, by
Proposition 11.3, to a homomorphism of complex Lie algebras kC −→ gl(n, C),
and we identify kC with its image.
Let P = {eiX |X ∈ k} ⊂ GL(n, C), and let G = P K. Let P  ⊂ GL(n, C)
be the set of positive definite Hermitian matrices. By Theorem 13.4, the
multiplication map P  × U(n) −→ GL(n, C) is a homeomorphism. Moreover,
the exponentiation map from the vector space of Hermitian matrices to P  is
a homeomorphism. Since ik is a closed subspace of the real vector space of
Hermitian matrices, P is a closed topological subspace of P  , and G = P K is
a closed subset of GL(n, C) = P  U (n).
We associate with each element of kC a left-invariant vector field on
GL(n, C) and consider the resulting involutory family on GL(n, C). We will
show that G is an integral submanifold of this involutory family. We must
check that the left-invariant vector field associated with an element Z of kC
is everywhere tangent to G. It is easiest to check this separately in the cases
Z = Y and Z = iY with Y ∈ k. Near the point eiX k ∈ G, with X ∈ k and
k ∈ K, the path t −→ ei(X+tAd(k)Y ) k is tangent to G when t = 0 and is also
tangent to the path
t → eiX eitAd(k)Y k = eiX keitY .
(The two paths are not identical if [X, Y ] = 0, but this is not a problem.)
The latter path is the left translate by eiX k of a path through the identity
tangent to the left-invariant vector field corresponding to iY ∈ k. Since this
vector field is left invariant, this shows that it is tangent to G at eiX k. This
settles the case Z = iY . The case where Z = Y is similar and easier.
It follows from Proposition 24.2 that G is a closed subgroup of GL(n, C).
Since P is homeomorphic to a vector space, it is contractible, and since G is
homeomorphic to P × K, it follows that the inclusion K −→ G induces an
isomorphism of fundamental groups.
The Lie algebra of G is, by construction, ik + k = kC .
To show that G is the complexification of K, let H be a complex analytic
group and f : K −→ H be a Lie group homomorphism. We have an induced
homomorphism k −→ Lie(H) of Lie algebras, which induces a homomorphism
kC = Lie(G) −→ Lie(H) of complex Lie algebras, by Proposition 11.3. If G̃
is the universal covering group of G, then by Proposition 14.2 we obtain a
Lie group homomorphism G̃ −→ H. To show that it factors through G ∼ =
G̃/π1 (G), we must show that the composite π1 (G) −→ G̃ −→ H is trivial.
But this coincides with the composition π1 (G) ∼ = π1 (K) −→ K̃ −→ K −→ H,
where K̃ is the universal covering group of K, and the composition π1 (K) −→
K̃ −→ K is already trivial. Hence the map G̃ −→ H factors through G,
proving that G has the universal property of the complexification.
We constructed G as an analytic subgroup of GL(n, C) starting with an
arbitrary faithful complex representation of K. Looking at this another way,
we have actually proved that any faithful complex representation of K can be
extended to a faithful analytic complex representation of G. The reason is that
if we started with another faithful complex representation and constructed the
complexification using that one, we would have gotten a group isomorphic to
G because the complexification is characterized up to isomorphism by its
universal property.
It remains to be shown that analytic representations of G are completely
reducible. If (π, V ) is an analytic representation of G, then, since K is compact,
by Proposition 2.1 there is a K-invariant inner product on V , and if U is an
invariant subspace, then V = U ⊕ W , where W is the orthogonal complement
of U . Then we claim that W is G-invariant. Indeed, it is invariant under k
and hence under kC = k ⊕ ik, which is the Lie algebra of G and, since G is
connected, under G itself.
In addition to the analytic notion of complexification that we have already
described, there is another notion, which we will call algebraic complexifica-
tion. We will not need it, and the reader may skip the rest of this chapter
with no loss of continuity. Still, it is instructive to consider complexification
from the point of view of algebraic groups, so we digress to discuss it now.
If G is an affine algebraic group defined over the real numbers, then K = G(R)
is a Lie group and G = G(C) is a complex analytic group, and G is the alge-
braic complexification of K. We will assume that G(R) is Zariski-dense in G
to exclude examples such as

G = {(x, y) | x2 + y 2 = ±1},

which is an algebraic group with group law (x, y)(z, w) = (xz − yw, xw + yz),
but which has one Zariski-connected component with no real points.
We see that the algebraic complexification is a functor not from the
category of Lie groups but rather from the category of algebraic groups G
defined over R. So the algebraic complexification of a Lie group K depends
on more than just the isomorphism class of K as a Lie group—it also depends
on its realization as the group of real points of an algebraic group. We illustrate
this point with an example.
Let Ga and Gm be the “additive group” and the “multiplicative group.”
These are algebraic groups such that for any field F , Ga (F ) ∼= F (additive
group) and Gm (F ) ∼= F × . The groups G1 = Ga × (Z/2Z) and G2 = Gm
have isomorphic groups of real points since G1 (R) ∼= R × (Z/2Z) and
G2 (R) ∼= R× , and these are isomorphic as Lie groups. Their complexifications
are G1 (C) ∼= C × (Z/2Z) and G2 (C) ∼= C× . These groups are not isomorphic.
If G is an algebraic group defined over F = R or C, and if K = G(F ),
then we call a complex representation π : K −→ GL(n, C) algebraic if there is
a homomorphism of algebraic groups G −→ GL(n) defined over C such that
the induced map of rational points is π. (This amounts to assuming that the
matrix coefficients of π are polynomial functions.) With this definition, the
algebraic complexification has an interpretation in terms of representations
like that of the analytic complexification.
Proposition 24.3. If G = G(C) is the algebraic complexification of K =
G(R), then any algebraic complex representation of K extends uniquely to an
algebraic representation of G.

Proof. This is clear since a polynomial function extends uniquely from G(R)
to G(C).


If K is a field and L is a Galois extension, we say that algebraic groups G1 and
G2 defined over K are L/K-Galois forms of each other—or (more succinctly)
L/K-forms—if there is an isomorphism G1 ∼= G2 defined over L. If K = R
and L = C this means that K1 = G1 (R) and K2 = G2 (R) have isomorphic
algebraic complexifications. A C/R-Galois form is called a real form.
The example in Proposition 24.4 will help to clarify this concept.

Proposition 24.4. U(n) is a real form of GL(n, R).

Compare this with Proposition 11.4, which is the Lie algebra analog of this
statement.

Proof. Let G1 be the algebraic group GL(n), and let

G2 = {(A, B) ∈ Matn × Matn | A · t A + B · t B = I, A · t B = B · t A}.

The group law for G2 is given by

(A, B)(C, D) = (AC − BD, AD + BC).

We leave it to the reader to check that this is a group. This definition is
constructed so that G2 (R) ∼= U(n) under the map (A, B) −→ A + Bi, when A
and B are real matrices.
We show that G2 (C) ∼= GL(n, C). Specifically, we show that if g ∈ GL(n, C)
then there are unique matrices A, B ∈ Matn (C) such that A · t A + B · t B = I
and A · t B = B · t A with A + Bi = g. We consider uniqueness first. We have
(A + Bi)(t A − t Bi) = (A t A + B t B) + (B t A − A t B)i = I,

so we must have g −1 = t A − t Bi and thus t g −1 = A − Bi. We may now solve
for A and B and obtain

A = (1/2)(g + t g −1 ),   B = (1/2i)(g − t g −1 ). (24.1)

This proves uniqueness. Moreover, if we define A and B by (24.1), then it is
easy to see that (A, B) ∈ G2 (C) and A + Bi = g.

It can be seen similarly that SU(n) and SL(n, R) are C/R Galois forms of each
other. One has only to impose in the definition of the second group G2 an
additional polynomial relation corresponding to the condition det(A+Bi) = 1.
(This condition, written out in terms of matrix entries, will not involve i, so
the resulting algebraic group is defined over R.)
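As a sanity check, the formulas (24.1) can be tested numerically. The sketch below is our own illustration (not from the text): it draws a random g ∈ GL(2, C), forms A and B by (24.1), and verifies the defining relations of G2 together with A + Bi = g.

```python
import numpy as np

# Our own numerical check of Proposition 24.4 and Eq. (24.1) for n = 2.
rng = np.random.default_rng(0)
g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
assert abs(np.linalg.det(g)) > 1e-8        # g lies in GL(2, C)

tginv = np.linalg.inv(g).T                 # the transpose of g^{-1}
A = (g + tginv) / 2                        # Eq. (24.1)
B = (g - tginv) / 2j

# The defining relations of G2 ...
assert np.allclose(A @ A.T + B @ B.T, np.eye(2))
assert np.allclose(A @ B.T, B @ A.T)
# ... and (A, B) maps back to g under (A, B) -> A + Bi
assert np.allclose(A + 1j * B, g)
```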

Remark 24.1. Classification of Galois forms of a group is a problem in Galois
cohomology. Indeed, the set of Galois forms of G is parametrized by
H 1 (Gal(L/K), Aut(G)). See Springer [150], Satake [144] and III.1 of Serre
[148]. Tits [162] contains the definitive classification over real, p-adic, finite,
and number fields.

Galois forms are important because if G1 and G2 are Galois forms of each
other, then we expect the representation theories of G1 and G2 to be related.
We have already seen this principle applied (for example) in Theorem 14.3.
Our next proposition gives a typical application.

Proposition 24.5. Let π : GL(n, R) −→ GL(m, C) be an algebraic representation.
Then π is completely reducible.

This would not be true if we removed the assumption of algebraicity. For
example, the representation π : GL(n, R) −→ GL(2, R) defined by

π(g) = ( 1   log | det(g)| )
       ( 0   1             )

is not completely reducible—and it is not algebraic.
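It is easy to confirm numerically that this π is a homomorphism, since log | det(gh)| = log | det g| + log | det h|. The sketch below is our own illustration; the random test matrices are arbitrary.

```python
import numpy as np

# Our own check that pi(g) pi(h) = pi(gh) for the non-algebraic representation.
def pi(g):
    return np.array([[1.0, np.log(abs(np.linalg.det(g)))],
                     [0.0, 1.0]])

rng = np.random.default_rng(1)
g, h = rng.normal(size=(2, 2, 2))          # two random elements of GL(2, R)
assert np.allclose(pi(g) @ pi(h), pi(g @ h))
# The line spanned by (1, 0) is invariant; complete reducibility would
# require an invariant complement, which fails when log|det(g)| is nonzero.
```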

Proof. Any algebraic representation of GL(n, R) can be extended
to an algebraic representation of GL(n, C) and then restricted to U(n), where
it is completely reducible because U(n) is compact.


The irreducible algebraic complex representations of GL(n, R) are the same
as the irreducible algebraic complex representations of GL(n, C), which in
turn are the same as the irreducible complex representations of U(n). (The
latter are automatically algebraic, and indeed we will later construct them as
algebraic representations.)
These finite-dimensional representations of GL(n, R) may be parametrized
by their highest weight vectors and classified as in the previous chapter. Their
characters are given by the Weyl character formula.
Although the irreducible algebraic complex representations of GL(n, R) are
thus the same as the irreducible representations of the compact group U(n),
their significance is very different. These finite-dimensional representations
of GL(n, R) are not unitary (except for the one-dimensional ones). They
therefore do not appear in the Fourier inversion formula (Plancherel theorem).
Unlike U(n), the noncompact group GL(n, R) has unitary representations that
are infinite-dimensional, and it is these infinite-dimensional representations
that appear in the Plancherel theorem.
Exercises

Exercise 24.1. If F is a field, let

SOJ (n, F ) = {g ∈ SL(n, F ) | g J t g = J},

where J is the n × n matrix with 1’s on the antidiagonal and 0’s elsewhere.
Show that SOJ (n, C) is the complexification of SO(n). (Use Exercise 5.3.)
25
Coxeter Groups

As we will see in this chapter, Weyl groups and affine Weyl groups are
examples of Coxeter groups, an important family of groups generated by
“reflections.”
Let G be a group, and let I be a set of generators of G, each of which has
order 2. In practice, we will usually denote the elements of I by {s1 , s2 , . . . , sr }
or {s0 , . . . , sr } with some definite indexing by integers. If si , sj ∈ I, let
n(i, j) = n(si , sj ) be the order of si sj . [Strictly speaking we should write
n(si , sj ) but we prefer the less cluttered notation.] We assume n(i, j) to be finite
for all si , sj . The pair (G, I) is called a Coxeter group if the relations

si2 = 1, (si sj )n(i,j) = 1 (25.1)

are a presentation of G. This means that G is isomorphic to the quotient of


the free group on a set of generators {σi }, one for each si ∈ I, by the smallest
normal subgroup containing all elements

σi2 , (σi σj )n(i,j) ,

and in this isomorphism each generator σi → si . Equivalently, G has the


following universal property: if Γ is any other group having elements vi (one
for each generator si ) satisfying the same relations (25.1), that is, if

vi2 = 1, (vi vj )n(i,j) = 1,

then there exists a unique homomorphism G → Γ such that each si → vi .


A word representing an element w of a Coxeter group (W, I) is a sequence
(si1 , . . . , sik ) such that w = si1 · · · sik . The word is reduced if k is as small as
possible. Thus, if the Coxeter group is a Weyl group and I the set of simple
reflections, then k is the length of w. Less formally, we may abuse language
by saying that w = si1 · · · sik is a reduced word or reduced decomposition of w.
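For a first example, take W to be the symmetric group S3, the Weyl group of type A2 in the notation below, generated by its two simple transpositions. The following brute-force sketch (our own encoding of permutations as tuples) checks the relations (25.1) with n(1, 2) = 3 and computes the length of every element.

```python
# Permutations of {0,1,2} as tuples; s1, s2 are the simple transpositions.
def compose(p, q):                 # (p o q)(x) = p(q(x))
    return tuple(p[q[x]] for x in range(3))

e  = (0, 1, 2)
s1 = (1, 0, 2)                     # swaps 0 and 1
s2 = (0, 2, 1)                     # swaps 1 and 2

# The relations (25.1): s_i^2 = 1 and (s1 s2)^{n(1,2)} = 1 with n(1,2) = 3.
assert compose(s1, s1) == e and compose(s2, s2) == e
p = compose(s1, s2)
assert compose(p, compose(p, p)) == e

# Breadth-first search from the identity finds the length of each element,
# i.e., the length of a shortest word in s1, s2 representing it.
lengths = {e: 0}
frontier = [e]
while frontier:
    nxt = []
    for w in frontier:
        for s in (s1, s2):
            ws = compose(w, s)
            if ws not in lengths:
                lengths[ws] = lengths[w] + 1
                nxt.append(ws)
    frontier = nxt

assert len(lengths) == 6           # |S3| = 6, so s1, s2 generate
assert max(lengths.values()) == 3  # the long element s1 s2 s1 = s2 s1 s2
```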
We return to the context of Chap. 20. Let V be a vector space, Φ a reduced
root system in V, and W the Weyl group. We partition Φ into positive and
D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 213
DOI 10.1007/978-1-4614-8024-2 25, © Springer Science+Business Media New York 2013
negative roots and denote by Σ the simple positive roots. Let I = {s1 , . . . , sr }
be the set of simple reflections. By definition, W is generated by the set I. Let n(i, j)
denote the order of si sj . We will show that (W, I) is a Coxeter group. It is
evident that the relations (25.1) are satisfied, but we need to see that they
give a presentation of W .

Theorem 25.1. Let W be the Weyl group of the root system Φ, and let I =
{s1 , . . . , sr } be the simple reflections. Then (W, I) is a Coxeter group.

We will give a geometric proof of this fact, making use of the system of Weyl
chambers. As it turns out, every Coxeter group has a geometric action on a
simplicial complex, a Coxeter complex, which for Weyl groups is closely related
to the action on Weyl chambers. This point of view leads to the theory of build-
ings. See Tits [163] and Abramenko and Brown [1] as well as Bourbaki [23].

Proof. Let (W ′ , I ′ ) be the Coxeter group with generators s′1 , . . . , s′r and
the relations (25.1), where n(i, j) is the order of si sj . Since the relations
(25.1) are true in W , we have a surjective homomorphism W ′ −→ W sending
s′i → si . We must show that it is injective. Let si1 · · · sin = sj1 · · · sjm
be two decompositions of the same element w into products of simple reflections.
We will show that we may go from one word (si1 , . . . , sin ) to the
other (sj1 , . . . , sjm ), only making changes corresponding to relations in
the Coxeter group presentation. That is, we may insert or remove a pair of
adjacent equal si , or we may replace a segment (si , sj , si , . . .) by (sj , si , sj , . . .),
where the total number of si and sj is 2n(si , sj ). This will
show that s′i1 · · · s′in = s′j1 · · · s′jm , so the homomorphism W ′ −→ W is indeed
injective.
Let C be the positive Weyl chamber. We have si1 · · · sin C = sj1 · · · sjm C.
Let
w1 = si1 , w2 = si1 si2 , . . . , wn = si1 · · · sin .

Consecutive chambers in the sequence

C, w1 C, w2 C, . . . , wn C = wC (25.2)

are adjacent. We will say that (C, w1 C, w2 C, . . . , wC) is the gallery associated
with the word (si1 , . . . , sin ) representing w. We find a path p from a point
in the interior of C to wC passing exactly through this sequence of chambers.
Similarly we have a gallery associated with the word (sj1 , . . . , sjm ). We may
similarly consider a path q from C to wC having the same endpoints as p
passing through the chambers of this gallery. We will consider what happens
when we deform p to q.
If α ∈ Φ let Hα be the set of v ∈ V such that α∨ (v) = 0. This is the hyper-
plane perpendicular to the root α, and these hyperplanes are the walls of Weyl
chambers. Let K2 be the closed subset of V where two or more hyperplanes
Hα intersect. It is a subset of codimension 2; that is, it is a (locally) finite
union of codimension 2 linear subspaces of V. Let K3 be the closed subset
of V consisting of points P such that three hyperplanes Hα1 , Hα2 , Hα3 pass
through P , with the roots α1 , α2 and α3 linearly independent. The subset K3
is of codimension 3 in V. We have K2 ⊃ K3 . The paths p and q do not pass
through K2 .
Since K3 has codimension 3 it is possible to deform p to q avoiding K3 .
Let pu with u ∈ [0, 1] be such a deformation, with p0 = p and p1 = q. For
each u, the sequence of chambers through which pu passes forms the gallery associated
to some word representing w. We consider what happens to this word when
the gallery changes as u is varied.

Fig. 25.1. Eliminating a crossing and recrossing of the same wall

There are two ways the word can change. If ik = ik+1 then wk−1 = wk+1
and we have a crossing as in Fig. 25.1. The path may move to eliminate
(or create) the crossing. This corresponds to eliminating (or inserting) a
repeated sik = sik+1 from the word, and since sik has order 2 in the Coxeter
group, the corresponding elements of the Coxeter group will be the same.
Since the deformation avoids K3 , the only other way that the word can
change is if the deformation causes the path to cross K2 , that is, some point
where two or more hyperplanes Hα1 , Hα2 , . . . , Hαn intersect, with the roots
α1 , . . . , αn in a two-dimensional subspace. In this case the transition looks
like Fig. 25.2.
This can happen if ik = ik+2 = · · · = i and ik+1 = ik+3 = · · · = j, and
the effect of the crossing is to replace a subword of the form (si , sj , si , . . .) by
an equivalent (sj , si , sj , . . .), where the total number of si and sj is 2n(si , sj ).
We have s′i s′j s′i · · · = s′j s′i s′j · · · in W ′ , so this type of transition also does not
change the element of W ′ . We see that s′i1 · · · s′in = s′j1 · · · s′jm , proving that
W ∼= W ′ . This concludes the proof that W is a Coxeter group.


The Coxeter group (W, I) has a close relative called the associated braid group.
We note that in the Coxeter group (W, I) with generators si satisfying si2 = 1,
the relation (si sj )n(i,j) = 1 (true when i ≠ j) can be written

si sj si · · · = sj si sj · · · , (25.3)
Fig. 25.2. Crossing K2
where the number of factors on both sides is n(i, j). Written this way, we call
equation (25.3) the braid relation.
Now let us consider a group B with generators ui in bijection with the si
that are assumed to satisfy the braid relations but are not assumed to be of
order two. Thus,
ui uj ui uj · · · = uj ui uj ui · · · , (25.4)
where there are n(i, j) terms on both sides. Note that since the relation si2 = 1
is not true for the ui , it is not true that n(i, j) is the order of ui uj , and in
fact ui uj has infinite order. The group B is called the braid group.
The term braid group is used due to the fact that the braid group of type
An is Artin’s original braid group, which is a fundamental object in knot
theory. Although Artin’s braid group will not play any role in this book,
abstract braid groups will play a role in our discussion of Hecke algebras in
Chap. 46, and the relationship between Weyl groups and braid groups
underlies many unexpected developments beginning with the use by Jones [91]
of Hecke algebras in defining new knot invariants and continuing with the
work of Reshetikhin and Turaev [135] based on the Yang–Baxter equation,
with connections to quantum groups and applications to knot and ribbon
invariants.
Consider a set of paths represented by n + 1 nonintersecting strings
connected to two (infinite) parallel posts in R3 to be a braid . Braids are
equivalent if they are homotopic. The “multiplication” in the braid group
is concatenation: to multiply two braids, the endpoints of the first braid on
the right post are tied to the endpoints of the second braid on the left post.
In Fig. 25.3, we give generators u1 and u2 for the braid group of type A2 and
calculate their product. In Fig. 25.4, we consider u1 u2 u1 and u2 u1 u2 ; clearly
these two braids are homotopic, so the braid relation u1 u2 u1 = u2 u1 u2 is
satisfied.
We did not have to make the map n part of the defining data in the Coxeter
group since n(i, j) is just the order of si sj . This is no longer true in the braid
group. Coxeter groups are often finite, but the braid group (B, I) is infinite if
|I| > 1.
Fig. 25.3. Generators u1 and u2 of the braid group of type A2 and u1 u2

Fig. 25.4. The braid relation. Left: u1 u2 u1 . Right: u2 u1 u2

Theorem 25.1 has an important complement due to Matsumoto [127] and
(independently) to Tits. According to this result, if two reduced words
represent the same element of W , then they also represent the same element
of the braid group.
Coxeter groups, but we will only prove it for Weyl groups and affine Weyl
groups.) Both Theorem 25.1 and Matsumoto’s theorem may be given proofs
based on Proposition 20.7, and these may be found in Bourbaki [23]. We will
give another geometric proof of Matsumoto’s theorem based on ideas similar
to those in the above proof of Theorem 25.1.

Theorem 25.2 (Matsumoto, Tits). Let w ∈ W have length l(w) = r. Let
si1 · · · sir = sj1 · · · sjr be two reduced decompositions of w into products of
simple reflections. Then the corresponding words are equal in the braid group,
that is, ui1 · · · uir = uj1 · · · ujr .

What we will actually prove is that if w of length k has two reduced
decompositions w = si1 · · · sik = sj1 · · · sjk , then the word (si1 , . . . , sik ) may
be transformed into (sj1 , . . . , sjk ) by a series of substitutions, in which a sub-
word (si , sj , si , . . .) is changed to (sj , si , sj , . . .), both subwords having n(i, j)
elements. For example, in the A3 Weyl group, two words representing the long
Weyl group element w0 are s1 s2 s1 s3 s2 s1 and s3 s2 s3 s1 s2 s3 . We may transform
the first into the second by the following series of substitutions:

(121321) ↔ (212321) ↔ (213231) ↔ (231213) ↔ (232123) ↔ (323123).
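This chain is easy to verify by machine. In the sketch below (our own; the A3 Weyl group is realized as S4 acting on {0, 1, 2, 3}), every word in the chain is checked to represent the long element w0.

```python
def compose(p, q):                     # (p o q)(x) = p(q(x))
    return tuple(p[q[x]] for x in range(4))

s = {1: (1, 0, 2, 3), 2: (0, 2, 1, 3), 3: (0, 1, 3, 2)}

def product(word):                     # the element s_{i1} ... s_{ik} of S4
    result = (0, 1, 2, 3)
    for i in word:
        result = compose(result, s[i])
    return result

w0 = (3, 2, 1, 0)                      # the long element reverses {0,1,2,3}
chain = [(1, 2, 1, 3, 2, 1), (2, 1, 2, 3, 2, 1), (2, 1, 3, 2, 3, 1),
         (2, 3, 1, 2, 1, 3), (2, 3, 2, 1, 2, 3), (3, 2, 3, 1, 2, 3)]
# Every word in the chain of substitutions represents w0:
for word in chain:
    assert product(word) == w0
```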

Proof. We associate with a word a gallery and a path as in the proof of
Theorem 25.1. Of the hyperplanes Hα perpendicular to the roots, let H1 , . . . , Hr
be the ones separating C and wC. Since any path from C to wC must cross these
hyperplanes, the word associated with the path will be reduced if and only
if it does not cross any one of these hyperplanes more than once. The paths
p and q corresponding to the given reduced words thus have this property,
and as in the proof of Theorem 25.1 it is easy to see that we may choose the
deformation pu from p to q that avoids K3 , such that pu does not cross any
of these hyperplanes more than once for any u.
Thus the sequence of words corresponding to the stages of pu are all
reduced words, and it is easy to see that this implies that the only
transitions allowed are ones implied by the braid relations. Therefore ui1 · · · uir =
uj1 · · · ujr .


As a typical example of how the theorem of Matsumoto and Tits is used, let
us define the divided difference operators Di on E. They were introduced by
Lascoux and Schützenberger, and independently by Bernstein, Gelfand, and
Gelfand, in the cohomology of flag varieties. The divided difference operators
are sometimes denoted ∂i , but we will reserve that notation for the Demazure
operators we will introduce below. Di acts on the group algebra of the weight
lattice Λ; this algebra was denoted E in Chap. 22. It has a basis eλ indexed
by weights λ ∈ Λ. We define
Di f = (eαi − 1)−1 (f − si f ) .

It is easy to check that f − si f is divisible in E by eαi − 1, so this operator


maps E to itself.
More formally, let M be the localization of E that is the subring of its
field of fractions obtained by adjoining denominators of the form H(α) =
(eα − 1)−1 with α ∈ Φ. It is convenient to think of the Di as living in the
ring D of expressions of the form ∑ fw · w, where fw ∈ M and the sum is
over w ∈ W . We have wf w−1 = w(f ), that is, conjugation by a Weyl group
element is the same as applying it to the element f of M. We have an obvious
action of D on M, and in this notation we write
Di = (eαi − 1)−1 (1 − si ) .

Because eαi − 1 divides f − si f for f ∈ E, the operators Di act on E.
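In type A the operators Di are easy to experiment with. The sketch below uses our own model of E for GL(3): Laurent polynomials in x1, x2, x3, with e^αi = xi /xi+1 and si exchanging xi and xi+1. It checks that Di maps polynomials to polynomials and confirms an instance of the braid relation asserted in Proposition 25.1 below.

```python
import sympy as sp

# Our conventions (type A2 / GL(3)): e^{alpha_i} = x_i / x_{i+1}, and s_i
# exchanges x_i and x_{i+1}.
x1, x2, x3 = sp.symbols('x1 x2 x3')

def s(i, f):
    a, b = (x1, x2) if i == 1 else (x2, x3)
    return f.subs({a: b, b: a}, simultaneous=True)

def D(i, f):
    a, b = (x1, x2) if i == 1 else (x2, x3)
    # D_i f = (e^{alpha_i} - 1)^{-1} (f - s_i f) = b (f - s_i f) / (a - b)
    return sp.cancel(b * (f - s(i, f)) / (a - b))

assert D(1, x1) == x2                  # (x1 - x2) divides out, as claimed

f = x1**3 * x2                         # an arbitrary test element
# The braid relation D1 D2 D1 = D2 D1 D2, since n(1, 2) = 3:
assert sp.simplify(D(1, D(2, D(1, f))) - D(2, D(1, D(2, f)))) == 0
```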

Proposition 25.1. Let n(i, j) be the order of si sj in W , where i ≠ j. Then
the Di satisfy the braid relation
Di Dj Di · · · = Dj Di Dj · · · (25.5)
where the number of factors on both sides is n(i, j). Moreover, this equals
( ∏α H(α) ) ( ∑w (−1)l(w) w )

where the product is over roots α in the rank two root system spanned by αi
and αj , and the sum is over the rank two Weyl group generated by si , sj .
Proof. This calculation can be done separately for the four possible cases
n(i, j) = 2, 3, 4 or 6. The case n(i, j) = 2 is trivial so let us assume n(i, j) = 3.
We will show that

Di Dj Di = H(αi )H(αj )H(αi + αj ) ∑w∈⟨si ,sj ⟩ (−1)l(w) w, (25.6)

which implies that Di Dj Di = Dj Di Dj . The left-hand side equals


H(αi ) (1 − si ) H(αj ) (1 − sj ) H(αi ) (1 − si ) .

This is the sum of eight terms H(αi )ε1 H(αj )ε2 H(αi )ε3 , where ε1 = 1 or −si ,
etc. Expanding, we get it in the form ∑ fw · w, where each of the six coefficients
fw is easily evaluated. When w = 1 or si there are two contributions, and
for these we use the identity
H(α + β) H(−α) + H(α) H(β) = H(β) H(α + β).
The other four terms have only one contribution and are trivial to check. Each
fw turns out to be equal to (−1)l(w) H(αi )H(αj )H(αi + αj ), proving (25.6).
If n(i, j) = 4 or 6, the proof is similar (but more difficult).

Now we may give the first application of the theorem of Matsumoto
and Tits. We may define, for w ∈ W an operator Dw to be Di1 Di2 · · · Dik
where w = si1 · · · sik is a reduced expression for w. This is well-defined
by the Matsumoto-Tits theorem. Indeed, given another reduced expression
w = sj1 · · · sjk , then the content of the Matsumoto-Tits theorem is that the
two deduced words are equal in the braid group, which means that we can go
from Di1 Di2 · · · Dik to Dj1 Dj2 · · · Djk using only the braid relations, that is,
by repeated applications of (25.5).
Similarly we may consider Demazure operators ∂w indexed by w ∈ W .
These were introduced by Demazure to describe the cohomology of line bun-
dles over Schubert varieties, but they may also be used to give an efficient
method of computing the characters of irreducible representations of compact
Lie groups. Let
∂i f = (1 − e−αi )−1 ( f − e−αi (si f ) ) .
It is easy to see that f − e−αi (si f ) is divisible by 1 − e−αi so that this is in
E; in fact, this follows from the more precise formula in the following proposition.
Proposition 25.2. We have

∂i2 = ∂i ,   si ∂i = ∂i .

Let f ∈ E. Then ∂i f is in E and is invariant under si , and if si f = f then
∂i f = f . If f = eλ with λ ∈ Λ, then we have

∂i eλ = eλ + eλ−αi + eλ−2αi + · · · + esi λ   if α∨i (λ) ≥ 0;
∂i eλ = −eλ+αi − · · · − esi λ−αi   if α∨i (λ) < 0.

Proof. We have si ∂i = (1 − eαi )−1 (si − eαi ) since si eλ si−1 = esi (λ) and in
particular si e−αi si−1 = eαi . Multiplying both the numerator and the denominator
by −e−αi then shows that si ∂i = ∂i . This identity shows that for any f ∈ E
the element ∂i f is si -invariant. Moreover, if f is si -invariant, then ∂i f = f
because ∂i f = (1 − e−αi )−1 (1 − e−αi ) f = f . Since ∂i f is si -invariant, we have
∂i2 f = ∂i f . The action of D on E is easily seen to be faithful, so this proves
∂i2 = ∂i (or check this by direct computation). The last identity follows from the
formula for a finite geometric series, (1 − x)−1 (1 − xN +1 ) = 1 + x + · · · + xN ,
together with si λ = λ − α∨i (λ) αi .


It is easy to check that the Demazure and divided difference operators are
related by Di = ∂i − 1.
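In rank one, everything in Proposition 25.2 can be seen concretely. In the sketch below (our conventions for SU(2): weights are powers of t, and α is the simple root, so e−α = t−2 and s inverts t), the geometric-series formula and the identities ∂i2 = ∂i and si ∂i = ∂i all check out.

```python
import sympy as sp

# Rank-one sketch (SU(2), our conventions): e^{omega} = t, alpha = 2 omega.
t = sp.symbols('t')

def s(f):                              # the simple reflection: t -> 1/t
    return sp.cancel(f.subs(t, 1 / t))

def dem(f):                            # the Demazure operator
    return sp.cancel((f - t**(-2) * s(f)) / (1 - t**(-2)))

g = dem(t**4)                          # e^lambda for lambda = 4 omega
# The geometric-series formula of Proposition 25.2:
assert sp.expand(g) == t**4 + t**2 + 1 + t**(-2) + t**(-4)
# dem^2 = dem and s applied after dem changes nothing:
assert sp.cancel(dem(g) - g) == 0
assert sp.cancel(s(g) - g) == 0
```

Note that g is exactly the character of the five-dimensional irreducible representation of SU(2), as Theorem 25.3 predicts, since in rank one the long element is w0 = s1.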

Proposition 25.3. The Demazure operators also satisfy the braid relations

∂i ∂j ∂i · · · = ∂j ∂i ∂j · · · (25.7)

where the number of factors on both sides is n(i, j).

Proof. Again there are different cases depending on whether n(i, j) = 2, 3, 4


or 6, but in each case this can be reduced to the corresponding relation
(25.5) by use of ∂i2 = ∂i . For example, if n(i, j) = 3, then expanding
0 = (∂i − 1) (∂j − 1) (∂i − 1) − (∂j − 1) (∂i − 1) (∂j − 1) and using ∂i2 = ∂i
gives ∂i ∂j ∂i = ∂j ∂i ∂j . The other cases are similar.


Now, by the theorem of Matsumoto and Tits, we may define ∂w = ∂i1 · · · ∂ik
where w = si1 · · · sik is any reduced expression, and this is well defined.
We return to the setting of Chap. 22. Thus, let λ be a dominant weight in
Λ = X ∗ (T ), where T is a maximal torus in the compact Lie group G. Let χλ
be the character of the corresponding highest weight module, which may be
regarded as an element of E.

Theorem 25.3 (Demazure). Let w0 be the long element in the Weyl group,
and let λ be a dominant weight. Then

χλ = ∂w0 eλ .
This is an efficient method of computing χλ . Demazure also gave
interpretations of ∂w for other Weyl group elements as characters of T -modules
of sections of line bundles over Schubert varieties.
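Theorem 25.3 can be tried out by machine in type A2. The sketch below uses our own GL(3) conventions (eλ is the monomial x1^a x2^b x3^c, e−αi = xi+1 /xi , and si exchanges xi and xi+1 ), so that ∂w0 = ∂1 ∂2 ∂1 ; the result is compared with the Weyl character formula in its bialternant (Schur polynomial) form.

```python
import sympy as sp

# Our conventions (GL(3)): del_i f = (x_i f - x_{i+1} s_i f) / (x_i - x_{i+1}).
x = sp.symbols('x1 x2 x3')

def dem(i, f):
    a, b = (x[0], x[1]) if i == 1 else (x[1], x[2])
    sf = f.subs({a: b, b: a}, simultaneous=True)
    return sp.cancel((a * f - b * sf) / (a - b))

def chi(lam):
    # Theorem 25.3 with the reduced word w0 = s1 s2 s1
    a, b, c = lam
    return sp.expand(dem(1, dem(2, dem(1, x[0]**a * x[1]**b * x[2]**c))))

# chi of the standard representation of GL(3):
assert chi((1, 0, 0)) == x[0] + x[1] + x[2]

# Compare with the Weyl character formula (bialternant form) for (2, 1, 0):
lam = (2, 1, 0)
num = sp.Matrix(3, 3, lambda i, j: x[i] ** (lam[j] + 2 - j)).det()
den = sp.Matrix(3, 3, lambda i, j: x[i] ** (2 - j)).det()
assert sp.expand(chi(lam) - sp.cancel(num / den)) == 0
```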

Proof. Let ∂w0 = ∑ fw · w. We will prove that

fw = Δ−1 (−1)l(w) ew(ρ) . (25.8)

where Δ is the Weyl denominator as in Chap. 22. This is sufficient, for then
∂w0 eλ = χλ when λ is dominant by the Weyl character formula.
Let N = l(w0 ). For each i, l(si w0 ) = N − 1, so we may find a reduced word
si w0 = si2 · · · siN . Then w0 = si si2 · · · siN in which i1 = i, so ∂w0 = ∂i ∂si w0 .
Since ∂i2 = ∂i and si ∂i = ∂i this means that ∂i ∂w0 = ∂w0 and si ∂w0 = ∂w0 .
A consequence is that ∂w0 eλ is W -invariant for every weight λ (dominant or
not). Therefore, if we write ∂w0 = ∑ fw · w with fw ∈ M, we have w′ (fw ) =
fw′ w . Since w(Δ−1 ) = (−1)l(w) Δ−1 , we now have only to check (25.8) for one
particular w. Fortunately when w = w0 it is possible to do this without too
much work. Choosing a reduced word, we have
∂w0 = (1 − e−αi1 )−1 (1 − e−αi1 si1 ) · · · (1 − e−αiN )−1 (1 − e−αiN siN ).

Expanding out the factors (1 − e−αik sik ), there is only one way to get w0 in
the decomposition ∑ fw · w, namely we must take −e−αik sik in every factor.
Therefore,

fw0 · w0 = (1 − e−αi1 )−1 (−e−αi1 si1 ) · · · (1 − e−αiN )−1 (−e−αiN siN )
= (−1)l(w0 ) H(αi1 ) si1 · · · H(αiN ) siN .

Moving the si to the right, this equals

(−1)l(w0 ) H(αi1 ) H(si1 αi2 ) H(si1 si2 αi3 ) · · · H(si1 · · · siN −1 αiN ) w0 .

Applying Proposition 20.10 to the reduced word w0 = siN · · · si1 , this
proves that

fw0 = (−1)l(w0 ) ∏α∈Φ+ H(α) = (−1)l(w0 ) ∏α∈Φ+ (eα − 1)−1 .

Since ew0 (ρ) = e−ρ , this is equivalent to (25.8) in the case w = w0 .




Theorem 25.4. The affine Weyl group is also a Coxeter group (generated by
s0 , . . . , sr ). Moreover, the analog of the Matsumoto-Tits theorem is true for
the affine Weyl group: if w of length k has two reduced decompositions w =
si1 · · · sik = sj1 · · · sjk , then the word (si1 , . . . , sik ) may be transformed into
(sj1 , . . . , sjk ) by a series of substitutions, in which a subword (si , sj , si , . . .) is
changed to (sj , si , sj , . . .), both subwords having n(i, j) elements.
Proof. This may be proved by the same method as Theorems 25.1 and 25.2
(Exercise 25.3).

As a last application of the theorem of Matsumoto and Tits, we discuss the
Bruhat order on the Weyl group, which we will meet again in Chap. 27. This
is a partial order, with the long Weyl group element maximal and the identity
element minimal. If v and u are elements of the Weyl group W , then we write
u  v if, given a reduced decomposition v = si1 · · · sik then there exists a
subsequence (j1 , . . . , jl ) of (i1 , . . . , ik ) such that u = sj1 · · · sjl . By Proposi-
tion 20.4 we may assume that u = sj1 · · · sjl is a reduced decomposition.
Proposition 25.4. This definition does not depend on the reduced decompo-
sition v = si1 · · · sik .
Proof. By Theorem 25.2 it is sufficient to check that if (i1 , . . . , ik ) is changed
by a braid relation, then we can still find a subsequence (j1 , . . . , jl ) repre-
senting u. We therefore find a subsequence of the form (t, u, t, . . .) where the
number of elements is the order of st su , and we replace this by (u, t, u, . . .).
We divide the subsequence (j1 , . . . , jl ) into three parts: the portion extracted
from that part of (i1 , . . . , ik ) before the changed subsequence, the portion ex-
tracted from the changed subsequence, and the portion extracted from after
the changed subsequence. The first and last part do not need to be altered.
A subsequence can be extracted from the portion in the middle to repre-
sent any element of the dihedral group generated by st and su whether it is
(t, u, t, . . .) or (u, t, u, . . .), so changing this portion has no effect.
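The subword criterion is easy to experiment with on a computer. The following Python sketch (our own illustration, not from the text; the realization of the type A3 Weyl group as the symmetric group S4 and all function names are ours) tests u ≤ v by searching for a subword of a reduced word for v that multiplies out to u.

```python
from itertools import combinations

n = 4  # the Weyl group of type A3, realized as the symmetric group S4


def s(i):
    """Simple reflection s_i, the adjacent transposition (i, i+1)."""
    p = list(range(n))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)


def compose(p, q):
    """Compose permutations as functions: (p o q)(x) = p(q(x))."""
    return tuple(p[q[x]] for x in range(n))


def word_to_perm(word):
    """Multiply out a word (i1, ..., ik) to the permutation s_i1 ... s_ik."""
    p = tuple(range(n))
    for i in word:
        p = compose(p, s(i))
    return p


def bruhat_le(u, v_word):
    """u <= v iff some subword of a reduced word for v multiplies to u."""
    k = len(v_word)
    return any(word_to_perm([v_word[j] for j in js]) == u
               for l in range(k + 1)
               for js in combinations(range(k), l))


w0_word = [1, 2, 1, 3, 2, 1]        # a reduced word for the long element of S4
assert word_to_perm(w0_word) == (3, 2, 1, 0)      # w0 reverses 1, 2, 3, 4
assert bruhat_le(word_to_perm([1, 3]), w0_word)   # s1 s3 <= w0
```

By Proposition 25.4 the answer does not depend on which reduced word for v is used, so the function can be sanity-checked against several decompositions of the same element.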

We now describe (without proof) the classification of the possible reduced
root systems and their associated finite Coxeter groups. See Bourbaki [23] for
proofs. If Φ1 and Φ2 are root systems in vector spaces V1 , V2 , then Φ1 ∪ Φ2 is
a root system in V1 ⊕ V2 . Such a root system is called reducible. Naturally, it
is enough to classify the irreducible root systems.
The Dynkin diagram represents the Coxeter group in compact form. It is
a graph whose vertices are in bijection with Σ. Let us label Σ = {α1 , . . . , αr },
and let si = sαi . Let θ(αi , αj ) be the angle between the roots αi and αj . Then

    n(si , sj ) = 2 if θ(αi , αj ) = π/2,
                  3 if θ(αi , αj ) = 2π/3,
                  4 if θ(αi , αj ) = 3π/4,
                  6 if θ(αi , αj ) = 5π/6.

These four cases arise in the rank 2 root systems A1 × A1 , A2 , B2 and G2 , as
the reader may confirm by consulting the figures in Chap. 19.
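The table above can be checked numerically. In the sketch below (our own illustration; the coordinates chosen for the simple roots are one convenient normalization, not taken from the text), the order m of si sj is recovered from the angle θ = π − π/m between the two simple roots.

```python
import math


def angle(a, b):
    """Angle between two vectors in the plane."""
    cos = sum(x * y for x, y in zip(a, b)) / (math.hypot(*a) * math.hypot(*b))
    return math.acos(cos)


def order(a, b):
    """Order m of s_a s_b in the dihedral group: theta = pi - pi/m."""
    return round(math.pi / (math.pi - angle(a, b)))


r3 = math.sqrt(3)
simple_roots = {
    "A1xA1": ((1, 0), (0, 1)),           # theta = pi/2
    "A2":    ((1, 0), (-0.5, r3 / 2)),   # theta = 2*pi/3
    "B2":    ((1, -1), (0, 1)),          # theta = 3*pi/4
    "G2":    ((1, 0), (-1.5, r3 / 2)),   # theta = 5*pi/6
}
assert [order(*v) for v in simple_roots.values()] == [2, 3, 4, 6]
```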
In the Dynkin diagram, we connect the vertices corresponding to αi and
αj only if the roots are not orthogonal. If they make an angle of 2π/3, we
connect them with a single bond; if they make an angle of 3π/4, we connect
them with a double bond; and if they make an angle of 5π/6, we connect them
with a triple bond. The latter case only arises with the exceptional group G2 .
If αi and αj make an angle of 3π/4 or 5π/6, then these two roots have
different lengths; see Figs. 19.4 and 19.6. In the Dynkin diagram, there will
be a double or triple bond in these examples, and we draw an arrow from
the long root to the short root. The triple bond (corresponding to an angle
of 5π/6) is rare—it is only found in the Dynkin diagram of a single group,
the exceptional group G2 . If there are no double or triple bonds, the Dynkin
diagram is called simply laced.

α1 α2 α3 α4 α5

Fig. 25.5. The Dynkin diagram for the type A5 root system

The root system of type An is associated with the Lie group SU(n + 1).
The corresponding abstract root system is described in Chap. 19. All roots
have the same length, so the Dynkin diagram is simply laced. In Fig. 25.5
we illustrate the Dynkin diagram when n = 5. The case of general n is the
same—exactly n nodes strung together in a line (•—•— · · · —•).

α1 α2 α3 α4 α5

Fig. 25.6. The Dynkin diagram for the type B5 root system

The root system of type Bn is associated with the odd orthogonal group
SO(2n + 1). The corresponding abstract root system is described in Chap. 19.
There are both long and short roots, so the Dynkin diagram is not simply
laced. See Fig. 25.6 for the Dynkin diagram of type B5 . The general case is
the same (•—•— · · · —•=>=•), with the arrow pointing towards the αn node
corresponding to the unique short simple root.

α1 α2 α3 α4 α5

Fig. 25.7. The Dynkin diagram for the type C5 root system

The root system of type Cn is associated with the symplectic group Sp(2n).
The corresponding abstract root system is described in Chap. 19. There are
both long and short roots, so the Dynkin diagram is not simply laced. See
Fig. 25.7 for the Dynkin diagram of type C5 . The general case is the same
(•—•— · · · —•=<=•), with the arrow pointing from the αn node corresponding
to the unique long simple root, towards αn−1 .

α5

α1 α2 α3 α4

α6

Fig. 25.8. The Dynkin diagram for the type D6 root system

The root system of type Dn is associated with the even orthogonal group
SO(2n). All roots have the same length, so the Dynkin diagram is simply laced.
See Fig. 25.8 for the Dynkin diagram of type D6 . The general case is similar,
but the cases n = 2 or n = 3 are degenerate, and coincide with the root
systems A1 × A1 and A3 . For this reason, the family Dn is usually considered
to begin with n = 4. See Fig. 30.2 and the discussion in Chap. 30 for further
information about these degenerate cases.
These are the “classical” root systems, which come in infinite families.
There are also five exceptional root systems, denoted E6 , E7 , E8 , F4 and G2 .
Their Dynkin diagrams are illustrated in Figs. 25.9–25.12.

α2

α1 α3 α4 α5 α6

Fig. 25.9. The Dynkin diagram for the type E6 root system

α2

α1 α3 α4 α5 α6 α7

Fig. 25.10. The Dynkin diagram for the type E7 root system

α2

α1 α3 α4 α5 α6 α7 α8

Fig. 25.11. The Dynkin diagram for the type E8 root system

α1 α2 α3 α4 α1 α2

Fig. 25.12. The Dynkin diagrams of types F4 (left) and G2 (right)

Exercises
Exercise 25.1. For the root systems of types An , Bn , Cn , Dn and G2 described in
Chap. 19, identify the simple roots and the angles between them. Confirm that their
Dynkin diagrams are as described in this chapter.

Exercise 25.2. Let Φ be a root system in a Euclidean space V. Let W be the Weyl
group, and let W  be the group of all linear transformations of V that preserve Φ.
Show that W is a normal subgroup of W  and that W  /W is isomorphic to the
group of all symmetries of the Dynkin diagram of the associated Coxeter group.
(Use Proposition 20.13.)

Exercise 25.3. Prove Theorem 25.4 by imitating the proof of Theorem 25.1.

Exercise 25.4. How many reduced expressions are there in the A3 Weyl group
representing the long Weyl group element?

Exercise 25.5. Let α1 , . . . , αr be the simple roots of a reduced irreducible root
system Φ, and let α0 be the affine root, so that −α0 is the highest root. By Proposition 20.1, the inner product ⟨αi , αj ⟩ ≤ 0 when i, j are distinct with 1 ≤ i, j ≤ r.
Show that this statement remains true if 0 ≤ i, j ≤ r.

The next exercise gives another interpretation of Proposition 20.4.

Exercise 25.6. Let W be a Weyl group. Let w = si1 si2 · · · siN be a decompo-
sition of w into a product of simple reflections. Construct a path through the
sequence (25.2) of chambers as in the proof of Theorem 25.1. Observe that the
word (i1 , i2 , . . . , iN ) representing w is reduced if and only if this path does not cross
any of the hyperplanes H orthogonal to the roots twice. Suppose that the word is
not reduced, and that it meets some hyperplane H in two points, P and Q. Then for
some k, with notation as in (25.2), P lies between wk−1 C and wk−1 sik C. Similarly
Q lies between wl−1 C and wl−1 sil C. Show that ik = il , and that

w = si1 · · · ŝik · · · ŝil · · · siN

where the “hat” means that the two entries are omitted. (Hint: Reflect the segment
of the path between P and Q in the hyperplane H.)

Exercise 25.7. Prove that the Bruhat order has the following properties, where s
denotes a simple reflection.

(i) If sv < v and su < u, then u ≤ v if and only if su ≤ sv.
(ii) If sv < v and su > u, then u ≤ v if and only if u ≤ sv.
(iii) If sv > v and su > u, then u ≤ v if and only if su ≤ sv.
(iv) If sv > v and su < u, then u ≤ v if and only if u ≤ sv.

[Hint: Any one of these four properties implies the others. For example, to deduce
(ii) from (i), replace u by su].
226 25 Coxeter Groups

Observe that su < u if and only if l(su) < l(u), a condition that is easy to check.
Therefore, (i) and (ii) give a convenient method of checking (recursively) whether
u ≤ v.

Exercise 25.8. Let w0 be the long element in a Weyl group W . Show that if u, v ∈
W then u ≤ v if and only if uw0 ≥ vw0 .
26
The Borel Subgroup

The Borel subgroup B of a (noncompact) Lie group G is a maximal closed


and connected solvable subgroup. We will give several applications of the
Borel subgroup in this chapter and the next. In this chapter, we will begin
with the Iwasawa decomposition, an important decomposition involving the
Borel subgroup. We will also show how invariant vectors with respect to the
Borel subgroup give a convenient method of decomposing a representation
into irreducibles. We will restrict ourselves here to complex analytic groups
such as GL(n, C) obtained by complexifying a compact Lie group. A more
general Iwasawa decomposition will be found later in Chap. 29.
Let us begin with an example. Let G = GL(n, C). It is the complexification
of K = U (n), which is a maximal compact subgroup. Let T be the maximal
torus of K consisting of diagonal matrices with eigenvalues that have absolute
value 1. The complexification TC of T can be factored as T A, where A is the
group of diagonal matrices with eigenvalues that are positive real numbers.
Let B be the group of upper triangular matrices in G, and let B0 be the
subgroup of elements of B whose diagonal entries are positive real numbers.
Finally, let N be the subgroup of unipotent elements of B. Recalling that a
matrix is called unipotent if its only eigenvalue is 1, the elements of N are
upper triangular matrices with diagonal entries that are all equal to 1. We may
factor B = T N and B0 = AN . The subgroup N is normal in B and B0 , so
these decompositions are semidirect products.

Proposition 26.1. With G = GL(n, C), K = U (n), and B0 as above, every
element g ∈ G can be factored uniquely as bk where b ∈ B0 and k ∈ K, or as
aνk, where a ∈ A, ν ∈ N , and k ∈ K. The multiplication maps N ×A×K −→
G and A × N × K −→ G are diffeomorphisms.

Proof. First let us consider N × A× K −→ G. Let g ∈ G. Let v1 , . . . , vn be the


rows of g. Then by the Gram–Schmidt orthogonalization algorithm, we find
constants θij (i < j) such that vn , vn−1 + θn−1,n vn , vn−2 + θn−2,n−1 vn−1 +
θn−2,n vn , . . . are orthogonal. Call these vectors un , . . . , u1 , and let

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 227


DOI 10.1007/978-1-4614-8024-2 26, © Springer Science+Business Media New York 2013
           ⎛ 1 θ12 · · · θ1n ⎞
           ⎜    1   · · · θ2n ⎟
    ν −1 = ⎜         ⋱     ⋮  ⎟ ,
           ⎝                1 ⎠

so u1 , . . . , un are the rows of ν −1 g. Let a be the diagonal matrix with diagonal


entries |u1 |, . . . , |un |. Then k = a−1 ν −1 g has orthonormal rows, so k is
unitary, and g = νak = b0 k with b0 = νa. This proves that the multiplication map
N ×A×K −→ G is surjective. It follows from the facts that B0 ∩K = {1} and
that A ∩ N = {1} that it is injective. It is easy to see that the matrices a, ν,
and k depend continuously on g, so the multiplication map N × A × K −→ G
has a continuous inverse and hence is a diffeomorphism.
As for the map A × N × K −→ G, this is the composition of the first map
with a bijection A × N × K → N × A × K, in which (a, n, k) → (n , a, k)
if an = n a. The latter map is also a diffeomorphism, and the conclusion is
proved.

The decomposition G ≅ A × N × K is called the Iwasawa decomposition
of GL(n, C).
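The proof of Proposition 26.1 is effective, and it is easy to turn it into code. The following Python sketch (our own; the function names are ours) carries out the bottom-up Gram–Schmidt process on the rows of g to produce the factorization g = b0 k of the Iwasawa decomposition of GL(n, C).

```python
def hdot(u, v):
    """Hermitian inner product <u, v> = sum_i u_i * conj(v_i)."""
    return sum(x * y.conjugate() for x, y in zip(u, v))


def iwasawa(g):
    """Factor g = b0 k with k unitary and b0 upper triangular with positive
    real diagonal, by Gram-Schmidt on the rows of g from the bottom row up,
    as in the proof of Proposition 26.1."""
    n = len(g)
    u = [None] * n
    for i in range(n - 1, -1, -1):          # orthogonalize v_n, v_{n-1}, ...
        w = list(g[i])
        for j in range(i + 1, n):
            c = hdot(g[i], u[j]) / hdot(u[j], u[j])
            w = [wx - c * ux for wx, ux in zip(w, u[j])]
        u[i] = w
    # normalize the rows to get k; then b0 = g k* is upper triangular
    k = [[x / hdot(row, row).real ** 0.5 for x in row] for row in u]
    b0 = [[hdot(g[i], k[j]) for j in range(n)] for i in range(n)]
    return b0, k


g = [[1 + 0j, 1 + 0j], [1j, 2 + 0j]]
b0, k = iwasawa(g)
assert abs(b0[1][0]) < 1e-12                 # b0 is upper triangular
assert abs(hdot(k[0], k[1])) < 1e-12         # rows of k are orthonormal
```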
To give another example, if G = GL(n, R), one takes K = O(n) to be a
maximal compact subgroup, A is the same group of diagonal real matrices
with positive eigenvalues as in the complex case, and N is the group of upper
triangular unipotent real matrices. Again there is an Iwasawa decomposition,
and one may prove it by the Gram–Schmidt orthogonalization process.
In this section, we will prove an Iwasawa decomposition if G is a complex
Lie group that is the complexification of a compact connected Lie group K.
This result contains the first example of G = GL(n, C), though not the second
example of G = GL(n, R). A more general Iwasawa decomposition containing
both examples will be obtained in Theorem 29.2.
We say that a Lie algebra n is nilpotent if there exists a finite chain of
ideals
n = n1 ⊃ n2 ⊃ · · · ⊃ nN = {0}
such that [n, nk ] ⊆ nk+1 .
Example 26.1. Let F be a field, and let n be the Lie algebra over F consisting
of upper triangular nilpotent matrices in gl(n, F ). Let

nk = {g ∈ n | gij = 0 if j < i + k}.

For example, if n = 3,

    n = n1 = ⎧⎛ 0 ∗ ∗ ⎞⎫         ⎧⎛ 0 0 ∗ ⎞⎫
             ⎨⎜ 0 0 ∗ ⎟⎬ ,  n2 = ⎨⎜ 0 0 0 ⎟⎬ ,  n3 = {0}.
             ⎩⎝ 0 0 0 ⎠⎭         ⎩⎝ 0 0 0 ⎠⎭

This Lie algebra is nilpotent.
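The inclusion [n, nk ] ⊆ nk+1 can be verified directly on matrices. The short Python sketch below (our own illustration) checks it for a pair of 3 × 3 examples.

```python
def bracket(X, Y):
    """[X, Y] = XY - YX for square matrices given as lists of rows."""
    n = len(X)

    def mul(A, B):
        return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
                for i in range(n)]

    XY, YX = mul(X, Y), mul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def in_nk(M, k):
    """M lies in n_k iff M[i][j] = 0 whenever j < i + k."""
    n = len(M)
    return all(M[i][j] == 0 for i in range(n) for j in range(n) if j < i + k)


X = [[0, 1, 5], [0, 0, 2], [0, 0, 0]]    # X in n = n_1
Y = [[0, 3, 1], [0, 0, 4], [0, 0, 0]]    # Y in n = n_1
Z = bracket(X, Y)
assert in_nk(Z, 2)                        # [n_1, n_1] lies in n_2
assert in_nk(bracket(X, Z), 3)            # [n_1, n_2] lies in n_3 = {0}
```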



We also say that a Lie algebra b is solvable if there exists a finite chain of
Lie subalgebras
b = b1 ⊃ b2 ⊃ · · · ⊃ bN = {0} (26.1)
such that [bi , bi ] ⊆ bi+1 . It is not necessarily true that bi is an ideal in b.
However, the assumption that [bi , bi ] ⊆ bi+1 obviously implies that [bi , bi+1 ] ⊆
bi+1 , so bi+1 is an ideal in bi .
Clearly, a nilpotent Lie algebra is solvable. The converse is not true, as
the next example shows.
Example 26.2. Let F be a field, and let b be the Lie algebra over F consisting
of all upper triangular matrices in gl(n, F ). Let

bk = {g ∈ b | gij = 0 if j < i + k − 1}.

Thus, if n = 3,

    b = b1 = ⎧⎛ ∗ ∗ ∗ ⎞⎫         ⎧⎛ 0 ∗ ∗ ⎞⎫
             ⎨⎜ 0 ∗ ∗ ⎟⎬ ,  b2 = ⎨⎜ 0 0 ∗ ⎟⎬ ,
             ⎩⎝ 0 0 ∗ ⎠⎭         ⎩⎝ 0 0 0 ⎠⎭

        b3 = ⎧⎛ 0 0 ∗ ⎞⎫
             ⎨⎜ 0 0 0 ⎟⎬ ,  b4 = {0}.
             ⎩⎝ 0 0 0 ⎠⎭

This Lie algebra is solvable. It is not nilpotent.
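The difference between the two examples can also be seen computationally: the bracket of two upper triangular matrices is strictly upper triangular, so the derived series of b shrinks, but bracketing a strictly upper triangular matrix with a diagonal one can reproduce it up to scale, so the lower central series of b never reaches {0}. A small self-contained Python check (our own illustration):

```python
def bracket(X, Y):
    """[X, Y] = XY - YX for square matrices given as lists of rows."""
    n = len(X)

    def mul(A, B):
        return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
                for i in range(n)]

    XY, YX = mul(X, Y), mul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


H = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]   # H in b (diagonal)
E = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]   # E in b2 (strictly upper triangular)
C = bracket(H, E)
# [b, b] lands in b2 (strictly upper triangular) ...
assert all(C[i][j] == 0 for i in range(3) for j in range(3) if j <= i)
# ... but [H, E] = -E is a nonzero element of b2, not of b3, so the
# lower central series of b stalls and b is not nilpotent.
assert C == [[0, -1, 0], [0, 0, 0], [0, 0, 0]]
```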


Proposition 26.2. Let b be a Lie algebra, b an ideal of b, and b = b/b .
Then b is solvable if and only if b and b are both solvable.
Proof. Given a chain of Lie subalgebras (26.1) satisfying [bi , bi ] ⊂ bi+1 ,
one may intersect them with b or consider their images in b and obtain
corresponding chains in b and b showing that these are solvable.
Conversely, suppose that b and b are both solvable. Then there are chains

b = b1 ⊃ b2 ⊃ · · · ⊃ bM = {0}, b = b1 ⊃ b2 ⊃ · · · ⊃ bN = {0}.

Let bi be the preimage of bi in b. Splicing the two chains in b as

b = b1 ⊃ b2 ⊃ · · · ⊃ bN = b = b1 ⊃ b2 ⊃ · · · ⊃ bM = {0}

shows that b is solvable.



Proposition 26.3. (Dynkin) Let g ⊂ gl(V ) be a Lie algebra of linear
transformations over a field F of characteristic zero, and let h be an ideal
of g. Let λ : h −→ F be a linear form. Then the space

W = {v ∈ V | Y v = λ(Y )v for all Y ∈ h}

is invariant under all of g.



Proof. If W = 0, there is nothing to prove, so assume 0 = v0 ∈ W . Fix an


element X ∈ g. Let W0 be the linear span of v0 , Xv0 , X 2 v0 , . . ., and let d be
the dimension of W0 .
If Z ∈ h, then we will prove that

Z(W0 ) ⊆ W0 and the trace of Z on W0 is dim(W0 ) · λ(Z). (26.2)

To prove this, note that

v0 , Xv0 , X 2 v0 , . . . , X d−1 v0 (26.3)

is a basis of W0 . With respect to this basis, for suitable cij ∈ F , we have

    ZX i v0 = λ(Z)X i v0 + Σj<i cij X j v0 .        (26.4)

This is proved by induction since

ZX i v0 = XZX i−1v0 − [X, Z]X i−1 v0 .

By the induction hypothesis, XZX i−1 v0 is Xλ(Z)X i−1 v0 plus a linear


combination of X j v0 with j < i, and [X, Z]X i−1 v0 is λ([X, Z])X i−1 v0 plus a
linear combination of X j v0 with j < i − 1. The formula (26.4) follows. The
invariance of W0 under Z is now clear, and (26.2) also follows from (26.4)
because with respect to the basis (26.3) the matrix of Z is upper triangular
and the diagonal entries all equal λ(Z).
Now let us show that Xv0 ∈ W . Let Y ∈ h. What we must show is that
Y Xv0 = λ(Y )Xv0 . The space W0 is invariant under both X (obviously) and
Y (by (26.2) taking Z = Y ). Thus, the trace of [X, Y ] = XY − Y X on W0 is
zero. Since Y ∈ h and h is an ideal, [X, Y ] ∈ h and we may take Z = [X, Y ]
in (26.2). Since the characteristic of F is 0, we see that λ([X, Y ]) = 0. Now

Y Xv0 = XY v0 − [X, Y ]v0 = λ(Y )Xv0 − λ([X, Y ])v0 = λ(Y )Xv0 ,

as required.


Theorem 26.1. (Lie) Let b ⊆ gl(V ) be a solvable Lie algebra of lin-


ear transformations over an algebraically closed field of characteristic zero.
Assume that V = 0.
(i) There exists a vector v ∈ V that is a simultaneous eigenvector for all of b.
(ii) There exists a basis of V with respect to which all elements of b are
represented by upper triangular matrices.

Proof. To prove (i), we may clearly assume that b = 0. Let us first observe
that b has an ideal h of codimension 1. Indeed, since b is solvable, [b, b] is
a proper ideal, and the quotient Lie algebra b/[b, b] is Abelian; hence any

subspace at all of b/[b, b] is an ideal. We choose a subspace of codimension 1,


and let h be its preimage in b.
Now h is solvable and of strictly smaller dimension than b, so by induction
there exists a simultaneous eigenvector v0 for all of h. Let λ : h −→ F be such
that Xv0 = λ(X)v0 . The space W = {v ∈ V | Xv = λ(X)v for all X ∈ h} is
nonzero, and by Proposition 26.3 it is b-invariant. Let Z ∈ b − h. Since F is
assumed to be algebraically closed, Z has an eigenvector on W , which will be
an eigenvector v1 for all of b since it is already an eigenvector for h.
For (ii), the Lie algebra of linear transformations of V /F v1 induced by
those of b is solvable, so by induction this quotient space has a basis v̄2 , . . . , v̄d
with respect to which every X ∈ b is upper triangular. This means that
for suitable aij ∈ F , we have X v̄j = Σ2≤i≤j aij v̄i . Letting v2 , . . . , vd be
representatives of the cosets v̄i in V , it follows that X is upper triangular
with respect to the basis v1 , . . . , vd .
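Part (i) can be seen in the smallest nontrivial case. Below is a Python sketch (our own illustration) with b spanned by H = diag(1, 2) and E = E12 inside gl(2): the vector e1 is a simultaneous eigenvector, as Lie's theorem predicts.

```python
# b spanned by H = diag(1, 2) and E = E_12 is solvable: [H, E] = -E,
# so [b, b] is spanned by E, and [[b, b], [b, b]] = 0.
H = [[1, 0], [0, 2]]
E = [[0, 1], [0, 0]]


def apply(M, v):
    """Apply a 2x2 matrix to a vector."""
    return tuple(sum(M[i][j] * v[j] for j in range(2)) for i in range(2))


v = (1, 0)                        # e1, a simultaneous eigenvector for all of b
assert apply(H, v) == (1, 0)      # H v = 1 * v
assert apply(E, v) == (0, 0)      # E v = 0 * v
```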

Let K be a compact connected Lie group and k its Lie algebra. Let g = kC
be the analytic complexification of k, so that g is the Lie algebra of the complex
Lie group G that is the complexification of K. Let T be a maximal torus
of K. We can embed its analytic complexification TC into G by the universal
property of the complexification.
Let Φ be the root system of K and let Φ+ be the positive roots with
respect to some ordering. If α ∈ Φ, let Xα ⊂ g be the α-eigenspace. By
Proposition 18.6, Xα is one-dimensional, and we will denote by Xα a nonzero
element. Define

    n = ⊕α∈Φ+ Xα .        (26.5)

Then n is a complex Lie subalgebra of g. Indeed, if α and β are positive roots,


it is impossible that α = −β, so by Proposition 18.4 (ii), [Xα , Xβ ] ⊂ Xα+β if
α + β is a positive root, and otherwise it is zero. In either case, it is in n.
Proposition 26.4. The Lie algebra n defined by (26.5) is nilpotent.

Proof. Let Φ+_k be the set of positive roots α such that α is expressible as the
sum of at least k simple positive roots. Thus, Φ+_1 = Φ+ , Φ+_1 ⊃ Φ+_2 ⊃ Φ+_3 ⊃ · · · ,
and eventually Φ+_k is empty. Define

    nk = ⊕α∈Φ+_k Xα .

It follows from Proposition 18.4 (ii) that [n, nk ] ⊆ nk+1 , and eventually nk is
zero, so n is nilpotent. 

Now let t be the Lie algebra of T , and let b = tC ⊕n. Since [tC , Xα ] ⊆ Xα , it
is clear that b, like n, is closed under the Lie bracket and forms a complex Lie
algebra. Moreover, since tC is Abelian and normalizes n, we have [b, b] ⊂ n,
and since n is nilpotent and hence solvable, it follows that b is solvable.

We aim to show that both n and b are the Lie algebras of closed complex
Lie subgroups of G.

Proposition 26.5. Let G be the complexification of a compact connected Lie


group K, and let n be as in (26.5). If π : G −→ GL(V ) is any representation
and X ∈ n, then π(X) is nilpotent as a linear transformation; that is,
π(X)N = 0 for all sufficiently large N .

We note that it is possible for a nilpotent Lie algebra of linear transformations


to contain linear transformations that are not nilpotent. For example, an
Abelian Lie algebra is nilpotent as a Lie algebra but might well contain linear
transformations that are not nilpotent.

Proof. By Theorem 26.1, we may choose a basis of V such that all π(X) are
upper triangular for X ∈ b, where we are identifying π(X) with its matrix
with respect to the chosen basis. What we must show is that if X ∈ n, then
the diagonal entries of this matrix are zero. It is sufficient to show this if
X ∈ Xα , where α is a positive root.
By the definition of a root, the character α of T is nonzero, and so its
differential dα is nonzero. This means that there exists H ∈ t such that
dα(H) = 0, and by (18.9) the commutator [π(H), π(Xα )] is a nonzero multiple
of π(Xα ). Because it is a nonzero multiple of the commutator of two upper
triangular matrices, it follows that π(Xα ) is an upper triangular matrix with
zeros on the diagonal. Thus, it is nilpotent.


Theorem 26.2. (i) Let G be the complexification of a compact connected


Lie group K, let T be a maximal torus of K, let t be the Lie algebra of T ,
and let TC be its complexification. Let n be as in (26.5), and let b = tC ⊕n.
Let N = exp(n) and B = TC N . Then N and B are closed Lie subgroups
of G, and n and b are the Lie algebras of N and B.
(ii) We may embed G in GL(n, C) for some n in such a way that K consists
of unitary matrices, TC consists of diagonal matrices, and B consists of
upper triangular matrices.
(iii) If u is a complex Lie subalgebra of n, and U = exp(u), then U is a complex
analytic subgroup of N and u is its Lie algebra. If u is a real Lie subalgebra
of n, and U = exp(u), then U is a Lie subgroup of N and u is its Lie
algebra.
(iv) Suppose that v and w are (complex) Lie subalgebras of n such that n =
v ⊕ w. Let V = exp (v) and W = exp (w) so that by (iii) V and W are
complex analytic subgroups of N . Then V ∩ W = {1} and N = V W .

The group B is called the standard Borel subgroup of G. A conjugate of B


is called a Borel subgroup. A subgroup containing a Borel subgroup is called
a parabolic subgroup. We will call a subgroup containing the standard Borel
subgroup a standard parabolic.

Proof. We will prove parts (i) and (ii) simultaneously.


Let π : K −→ GL(V ) be a faithful representation. We choose on V an inner
product with respect to which π(k) is unitary for k ∈ K. By Theorem 24.1,
we may extend π to a faithful complex analytic representation of G. We have
already noted that b is a solvable Lie algebra, so by Theorem 26.1 we may find
a basis v1 , . . . , vn of V with respect to which the linear transformations
 π(X)
with X ∈ b are upper triangular. This means that π(X)vi ∈ ji F vj . We
claim that we may assume that the vi are orthonormal. This is accomplished
by Gram–Schmidt orthonormalization. We first divide vi by |vi | so vi has
length 1. Next we replace v2 by v2 − v2 , v1 v1 and so forth so that the vi are
orthonormal. The matrices π(X) with X ∈ b remain upper triangular after
these changes.
We identify G with its image in GL(n, C) and its Lie algebra with the
corresponding Lie subalgebra of Matn (C) = gl(n, C). Thus, we write X instead
of π(X) and regard it as a matrix.
Let
    N = {exp(X) | X ∈ n}.        (26.6)
We will show that N is a closed analytic subgroup of G whose Lie algebra
is n.
By Remark 8.1, if X ∈ n and Y = exp(X), then

    Y = I + X + (1/2!)X 2 + · · · + (1/n!)X n .

This is now a series with only finitely many terms since X is nilpotent by
Proposition 26.5. Moreover, Y − I is a finite sum of upper triangular nilpotent
matrices and hence is itself nilpotent, and reverting the exponential series, we
have X = log(Y ), where we define

log(Y ) = (Y − I) − (1/2)(Y − I)2 + (1/3)(Y − I)3 − · · · + (−1)n−1 (1/n)(Y − I)n

if Y is an upper triangular unipotent matrix. As with the exponential series,


only finitely many terms are needed since (Y − I)n = 0. This series defines a
continuous map log : N −→ n, which is the inverse of the exponential map.
Therefore, n is homeomorphic to N .
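Because both series terminate, exp and log on nilpotent and unipotent matrices can be computed exactly, say over the rationals. The following Python sketch (our own; it uses exact Fraction arithmetic) checks that log inverts exp on a nilpotent example.

```python
from fractions import Fraction
from math import factorial


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(A, B, c=Fraction(1)):
    """A + c*B, entrywise."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def exp_nilpotent(X):
    """exp(X) = I + X + X^2/2! + ... + X^{n-1}/(n-1)!  (X^n = 0)."""
    n = len(X)
    Y, P = identity(n), identity(n)
    for k in range(1, n):
        P = mat_mul(P, X)
        Y = mat_add(Y, P, Fraction(1, factorial(k)))
    return Y


def log_unipotent(Y):
    """log(Y) = (Y-I) - (Y-I)^2/2 + ... + (-1)^{n-2}(Y-I)^{n-1}/(n-1)."""
    n = len(Y)
    Z = mat_add(Y, identity(n), Fraction(-1))       # Y - I is nilpotent
    X, P = [[Fraction(0)] * n for _ in range(n)], identity(n)
    for k in range(1, n):
        P = mat_mul(P, Z)
        X = mat_add(X, P, Fraction((-1) ** (k - 1), k))
    return X


X = [[Fraction(0), Fraction(2), Fraction(1)],
     [Fraction(0), Fraction(0), Fraction(3)],
     [Fraction(0), Fraction(0), Fraction(0)]]
assert log_unipotent(exp_nilpotent(X)) == X
```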
Next we show that N is a closed subset of GL(n, C) and in fact an
affine subvariety. Let n′ be the Lie subalgebra of gl(n, C) consisting of upper
triangular nilpotent matrices, and let λ1 , . . . , λr be a set of linear functionals
on n′ such that the intersection of the kernels of the λi is n. N may be
characterized as follows. An element g ∈ GL(n, C) is in N if and only if it is
upper triangular and unipotent, and each λi (log(g)) = 0. These conditions
comprise a set of polynomial equations characterizing N , showing that it
is closed.
Next we show that N is a group. Indeed, its intersection with a neigh-
borhood of the identity is a local group by Proposition 14.1. Thus, if g, h are
near the identity in N , we have gh ∈ N , so φi (g, h) = 0 where φi (g, h) =
λi (log(gh)). Thus, the polynomial φi vanishes near the identity in N × N ,
and since N × N is a connected affine variety, this polynomial vanishes
identically on all of N × N . Thus, N is closed under multiplication, and it is
and since N is a connected affine subvariety of GL(n, C), this polynomial van-
ishes identically on all of N . Thus, N is closed under multiplication, and it is
a group.
Since [tC , n] ⊂ n, the group TC normalizes N , so B = TC N is a subgroup
of G. It is not hard to show that it is a closed Lie subgroup and its Lie algebra
is b.
The same argument that proved that N is a Lie group proves (iii). In
the case where u is a complex Lie algebra, we simply take a larger set of
linear functionals λi on n′ with a common kernel that is the Lie subalgebra u and
argue identically to show first that U = exp(u) is closed, and that it is a Lie
subgroup. If u is a real Lie algebra, we proceed in the same way but take the
λi to be real linear.
We turn to the proof of (iv). We saw in the proof of (ii) that the map
exp : n −→ N is surjective, and given by a polynomial expression, with a
polynomial inverse log : N −→ n. Moreover, exp takes v to V and w to W ,
while log takes V to v and W to w. It follows that V ∩ W = {1} since if
g ∈ V ∩ W then log (g) ∈ v ∩ w = 0, so g = 1.
To show that N = V W we note that the multiplication map V × W −→ N
has as its differential the inclusion v ⊕ w −→ n. But this map is the identity
map. Therefore, by the inverse function theorem multiplication V × W −→ N
is onto a neighborhood of the identity and therefore has an analytic inverse.
This means that there are analytic maps φ : n −→ V and ψ : n −→ W , defined
by power series convergent near the identity, such that φ (X) ψ (X) = eX
for X ∈ n. We will argue that φ and ψ are polynomials. Let {Xi } be a
basis of v and let {Yj } be a basis of w. Let λ : n −→ n be the map
λ(X) = log(φ(X)) + log(ψ(X)). Then λ is the inverse of the map
μ that sends X = Σ ci Xi + Σ dj Yj to exp(Σ ci Xi ) exp(Σ dj Yj ). Regarding
n as a vector subspace of gl (n, C) = Matn (C), we see that μ (X) is a finite
linear combination of finite products of the ci Xi and dj Yj , where the products
are taken in the sense of matrix multiplication. Inverting μ, we see that λ (X)
also is a linear combination of such finite products of the ci Xi and dj Yj .
It is a finite such linear combination since only finitely many such products
are nonzero: this is because the matrices Xi and Yj are upper triangular and
nilpotent. Projecting onto v and w and exponentiating, we see that φ and ψ
are polynomials.
Since both sides are polynomials, the identity φ (X) ψ (X) = eX , already
proved for X near 0, is true for all X and it follows that the multiplication
map V × W −→ N is surjective. This proves that N = V W .
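Part (iv) can be made concrete for 3 × 3 unipotent matrices. In the sketch below (our own illustration; the splitting v = span(E12 , E13 ), w = span(E23 ) is one choice of complementary subalgebras of n, not taken from the text), the factorization g = vw is solved explicitly and is visibly polynomial in the entries of g.

```python
def V(x, y):
    """exp(x*E12 + y*E13); the span of E12, E13 is an abelian subalgebra."""
    return [[1, x, y], [0, 1, 0], [0, 0, 1]]


def W(z):
    """exp(z*E23)."""
    return [[1, 0, 0], [0, 1, z], [0, 0, 1]]


def mul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]


def factor(g):
    """Solve g = V(x, y) W(z); the solution is polynomial in the entries."""
    a, b, c = g[0][1], g[0][2], g[1][2]
    return (a, b - a * c), c


g = [[1, 2, 5], [0, 1, 3], [0, 0, 1]]
(x, y), z = factor(g)
assert mul(V(x, y), W(z)) == g
```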

The Borel subgroup is a bit too big for the Iwasawa decomposition since
it has a nontrivial intersection with K. Let a = it. It is the Lie algebra of a
closed connected Lie subgroup A of T . If we embed K and G into GL(n, C)
as in Theorem 26.2, the elements of T are diagonal, and A consists of the
subgroup of elements of T whose diagonal entries are positive real numbers.
Let B0 = AN .

Theorem 26.3. (Iwasawa decomposition) With notations as in Theo-


rem 26.2 and B0 and A as above, each element of g ∈ G can be factored
uniquely as bk where b ∈ B0 and k ∈ K, or as aνk where a ∈ A, ν ∈ N and
k ∈ K. The multiplication map A × N × K −→ G is a diffeomorphism.

Proof. Let G′ = GL(n, C), K ′ = U (n), A′ be the subgroup of GL(n, C)
consisting of diagonal matrices with positive real eigenvalues, and N ′ be the
subgroup of upper triangular unipotent matrices in G′ . By Theorem 26.2 (ii),
we may embed G into G′ for suitable n such that K ends up in K ′ , N ends
up in N ′ , and A ends up in A′ .
We have a commutative diagram

    A × N × K −−−→ A′ × N ′ × K ′
        ↓                 ↓
        G      −−−→       G′

where the vertical arrows are multiplications and the horizontal arrows are
inclusions. By Proposition 26.1, the composition

A × N × K −→ A′ × N ′ × K ′ −→ G′        (26.7)

is a diffeomorphism onto its image, and so the multiplication A×N ×K −→ G


is a diffeomorphism onto its image. We must show that it is surjective.
Since A, N , and K are each closed in A′ , N ′ , and K ′ , respectively, the
image of (26.7) is closed in G′ and hence in G. We will show that this image
is also open in G. We note that a + n + k = g since tC ⊂ a + k, and each
CXα ⊂ n + k. It follows that the dimension of A × N × K is greater than or
equal to that of G. (These dimensions are actually equal, though we do not
need this fact, since it is not hard to see that the sum a + n + k is direct.)
Since multiplication is a diffeomorphism onto its image, this image is open and
closed in G. But G is connected, so this image is all of G, and the theorem is
now clear.


As an application, we may now show why flag manifolds have a complex


structure.

Theorem 26.4. Let K be a compact connected Lie group and T a maximal


torus. Then X = K/T can be given the structure of a complex manifold in such
a way that the translation maps g : xT −→ gxT are holomorphic. This action
of K can be extended to an action of the complexification G by holomorphic
maps.

Proof. By the Iwasawa decomposition, we may write G = BK. Since B ∩K =


T , we have G/B ≅ K/T , and this diffeomorphism is K-equivariant. Now G
is a complex Lie group and B is a closed analytic subgroup, so the quotient
G/B has the structure of a complex analytic manifold, and the action of G,
a fortiori of K, consists of holomorphic maps.


We turn now to a different use of the Borel subgroup. If (π, V ) is a


finite-dimensional representation of K, then by Theorem 24.1, π can be
extended to a complex analytic representation of G, and of course a complex
analytic representation of G can be restricted back to K. So the categories
of finite-dimensional representations of K, and the finite-dimensional analytic
representations of G are equivalent.
Let λ be a weight. It is thus a character of TC . Now B is a semidirect
product TC N with N normal, so TC ≅ B/N . This means that λ may be
extended to a character of B with N in its kernel.
We will show that each irreducible representation of G has an N -fixed
vector v that is unique up to scalar multiple. It is the highest weight vector of
the representation. Thus, if λ is the highest weight, we have π(t)v = λ(t)v
for t ∈ TC . Since v is N -fixed, we may also write π(b)v = λ(b)v for b ∈ B.
We will give some applications of this useful fact.
Let n be the Lie algebra of N , defined by (26.5). We may similarly define n−
to be the span of Xα with α ∈ Φ− . It is also the Lie algebra of a Lie subgroup
of G, which we will denote N− . Let w0 be a representative of the long Weyl
group element. Then Ad(w0 ) interchanges the positive and negative roots, so
Ad(w0 )n = n− and w0 N w0−1 = N− .

Lemma 26.1. The Lie algebra n is generated by the Xα as α runs through


the simple positive roots.

Proof. Let n′ be the algebra generated by the Xα with α simple. Let us define
the height of a positive root to be the number of simple roots into which it
may be decomposed, counted with multiplicities. If α ∈ Φ+ is not simple, we
may write α = β + γ where β and γ are in Φ+ , and by induction on the height
of α, we may assume that Xβ and Xγ are in n′ . By Corollary 18.1, [Xβ , Xγ ]
is a nonzero multiple of Xα and so Xα is in n′ . Thus, n′ = n.
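In type A this is transparent: for gl(4, C) the root spaces are spanned by the matrix units Eij (i < j), the simple root spaces by E12 , E23 , E34 , and every other Eij is reached by bracketing. A quick Python check (our own illustration):

```python
def E(i, j, n=4):
    """Matrix unit E_ij (0-indexed), spanning a root space of gl(4)."""
    return tuple(tuple(int((r, c) == (i, j)) for c in range(n)) for r in range(n))


def bracket(X, Y):
    """[X, Y] = XY - YX."""
    n = len(X)

    def mul(A, B):
        return tuple(tuple(sum(A[i][m] * B[m][j] for m in range(n))
                           for j in range(n)) for i in range(n))

    XY, YX = mul(X, Y), mul(Y, X)
    return tuple(tuple(XY[i][j] - YX[i][j] for j in range(n)) for i in range(n))


# height-2 and height-3 root vectors arise from the simple ones:
assert bracket(E(0, 1), E(1, 2)) == E(0, 2)
assert bracket(E(1, 2), E(2, 3)) == E(1, 3)
assert bracket(E(0, 2), E(2, 3)) == E(0, 3)
```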


Now g is a complex Lie algebra and n− , tC and n are Lie subalgebras, so we
have homomorphisms in− , itC and in mapping U (n− ), U (tC ) and U (n) into U (g).
The Poincaré–Birkhoff–Witt theorem implies that these homomorphisms are
injective, but we do not need that fact. If ξ is in U (n− ), for example, we will
use the same notation ξ for in− (ξ), which is an abuse of notation since we are
omitting to prove that in− is injective. The multiplication map U (n− )×U (tC )×
U (n) −→ U (g) that sends (ξ, η, ζ) to ξηζ induces a linear map μ : U (n− ) ⊗
U (tC ) ⊗ U (n) −→ U (g). With the above abuse of notation, we denote the
image of μ as U (n− )U (tC )U (n).
image of μ as U (n− )U (tC )U (n).

Proposition 26.6. (Triangular decomposition) The linear map μ :


U (n− ) ⊗ U (tC ) ⊗ U (n) −→ U (g) is surjective.

Proof. Let R = U (n− )U (tC )U (n) be the image of μ. Since R contains


generators of U (g), it is enough to show that it is closed under multiplication.
It is obvious that U (n− ) R ⊆ R. Moreover, since [tC , n− ] ⊆ n− we also have
U (tC ) R ⊆ R. It remains for us to show that U (n)R ⊆ R. By Lemma 26.1,
U (n) is generated by the Xα with α simple, so it is enough to show that Xα R ⊆
R when α is simple. First, if β ∈ Φ+ then Xα X−β − X−β Xα = [Xα , X−β ] ∈ n−
unless β = α, by Proposition 18.4, while if β = α we have [Xα , X−α ] ∈ tC .
On the other hand if H ∈ tC then [Xα , H] is a constant multiple of Xα . As a
result of these relations, Xα R ⊆ R.


Let (π, V ) be an irreducible representation of K. We may extend (π, V ) to an irreducible analytic representation of G. Let λ be the highest weight. Proposition 22.4 tells us that the weight space V (λ) corresponding to the highest weight λ is one-dimensional. Let vλ be a nonzero element. If α is a positive root, then π(Xα)vλ = 0 because it lies in V (λ + α), which is zero. (Otherwise, λ would not be the highest weight.) On the other hand, the triangular decomposition gives some complementary information.

Theorem 26.5. Let (π, V ) be an irreducible representation of K. Extend (π, V ) to an irreducible analytic representation of G. Let λ be the highest weight. Then V (λ) = V^N is the space of N -invariants.

Proof. Clearly v ∈ V is N -invariant if and only if π(Xα)v = 0 for α ∈ Φ+ , and as we have noted this is true if v ∈ V (λ). We must show that N -invariance implies that v ∈ V (λ). Since TC normalizes N , V^N is TC -invariant and can be decomposed into weight spaces. So we may assume that v ∈ V (μ) for some μ, and the problem is to prove that μ = λ. We may write U(g) = U(n−)U(tC) ⊕ U(n−)U(tC)J where J is the ideal of U(n) generated by n, and Jv = 0. Therefore, U(g)v = U(n−)U(tC)v = U(n−)v. But by Proposition 18.3 (iii) each weight in U(n−)v is ≤ μ, so if μ ≠ λ then U(g)v does not contain V (λ) and is therefore a proper nonzero submodule. Since V is irreducible, this is a contradiction, so μ = λ. □
Now suppose that we have a representation that we want to decompose into irreducibles. Theorem 26.5 gives a strategy for obtaining this decomposition. We remind the reader that if λ is a weight, that is, a character of TC , then we may regard λ as a character of B in which N acts trivially, because B/N ≅ TC .

Proposition 26.7. Let W be a G-module that decomposes into a direct sum of finite-dimensional irreducible representations. Let λ be a dominant weight, and let πλ be the irreducible G-module with highest weight λ. Then the multiplicity of πλ as a G-module in W equals the multiplicity of λ as a B-module in W .

Proof. Since N acts trivially in λ as a B-module, every B-submodule of W isomorphic to λ is contained in W^N . By Theorem 26.5, each copy of πλ meets W^N in a one-dimensional space. The statement is therefore clear. □
To give an example, let us decompose the symmetric algebra ⋁(∨2 Cn ) on the symmetric square of the standard module of GL(n, C).

Proposition 26.8. Let Ω be the space of n × n symmetric complex matrices. Let P(Ω) be the ring of polynomials on Ω with the GL(n, C) action (gf )(X) = f (t g · X · g). Then P(Ω) is isomorphic to the GL(n, C)-module ⋁(∨2 Cn ).

Proof. For any module W the polynomial ring on W is isomorphic as a GL(n, C)-module to the symmetric algebra on W ∗ . So it is sufficient to show that ∨2 Cn and Ω are dual modules. Indeed, if V = Cn and M is an n × n symmetric matrix then M induces a symmetric bilinear map V × V −→ C by (v1 , v2 ) −→ t v1 M v2 . (We are thinking of v1 and v2 as column vectors.) The induced linear map ∨2 V −→ C identifies Ω with the dual space of ∨2 V . □
We identify the weight lattice Λ of U(n) or GL(n, C) with Zn as follows: if λ ∈ Zn then we identify λ with the character

    diag(t1 , . . . , tn ) −→ t1^{λ1} · · · tn^{λn} .
The weight λ is dominant if λ1 ≥ λ2 ≥ · · · ≥ λn . Assuming this, we say that λ is even if the λi are all even, and effective if λn ≥ 0. It is not hard to see that every highest weight occurring in ⋁(∨2 Cn ) is effective.
Theorem 26.6. The GL(n, C)-module ⋁(∨2 Cn ) decomposes into a direct sum of irreducible representations, each occurring with multiplicity one. Let λ be a dominant weight. The irreducible representation with highest weight λ occurs in this decomposition if and only if λ is even and effective.
Proof. By Proposition 26.8 we may work with the representation P(Ω). As we have explained, our task is to compute the N -invariants of the representation. If X = (Xij ) ∈ Ω, let Xk (1 ≤ k ≤ n) be the upper left k × k minor, that is,

    Xk = det(Xij )_{1≤i,j≤k} .

We consider Xk to be a polynomial function on Ω. It is simple to check that it is N -invariant, and we will show that the ring of N -invariants is generated by X1 , . . . , Xn . Let Ω′ be the subspace of Ω characterized by the nonvanishing of the Xk . It is a dense open set, so any element of P(Ω) is determined by its restriction to Ω′ . Now any X ∈ Ω′ can be diagonalized by the substitution X −→ t u X u with u ∈ N ; indeed X is equivalent to
    diag( X1 , X2 /X1 , X3 /X2 , . . . , Xn /Xn−1 ).
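This reduction is a familiar LDL-type factorization and can be checked with exact rational arithmetic. The following sketch is ours, not from the text (the helper names det, leading_minor and symmetric_reduce are assumptions): it reduces a sample symmetric matrix by simultaneous row and column operations, that is, by a congruence X −→ t u X u with u unipotent, and confirms that the resulting diagonal entries are the ratios Xk /Xk−1 of leading minors.

```python
from fractions import Fraction

def det(M):
    """Determinant by Laplace expansion; adequate for the small sizes used here."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def leading_minor(X, k):
    """The upper left k x k minor X_k."""
    return det([row[:k] for row in X[:k]])

def symmetric_reduce(X):
    """Diagonalize a symmetric matrix with nonvanishing leading minors by
    simultaneous row and column operations (a congruence by unipotent matrices)."""
    A = [row[:] for row in X]
    n = len(A)
    for i in range(n):
        for j in range(i + 1, n):
            f = A[j][i] / A[i][i]
            for k in range(n):                 # row operation R_j -= f R_i ...
                A[j][k] -= f * A[i][k]
            for k in range(n):                 # ... and the matching column operation
                A[k][j] -= f * A[k][i]
    return [A[i][i] for i in range(n)]

F = Fraction
X = [[F(2), F(1), F(0)],
     [F(1), F(3), F(1)],
     [F(0), F(1), F(4)]]                       # symmetric; leading minors are 2, 5, 18

D = symmetric_reduce(X)
minors = [leading_minor(X, k) for k in range(1, 4)]
assert minors == [F(2), F(5), F(18)]
assert D == [F(2), F(5, 2), F(18, 5)]          # X1, X2/X1, X3/X2
```

The same elimination works for any symmetric matrix whose leading minors are all nonzero, which is exactly the condition defining Ω′.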
Clearly any polynomial of X restricts to the diagonal as a polynomial

    Σ_{μ∈Zn} a(μ) xμ ,        xμ = x1^{μ1} · · · xn^{μn} ,

and since the value of xμ on the diagonalized X is X1^{μ1−μ2} X2^{μ2−μ3} · · · Xn^{μn} , if this is a polynomial of X we must have a(μ) = 0 unless μ1 ≥ μ2 ≥ · · · ≥ μn ; note also that μn ≥ 0, since xμ is a polynomial. Thus the xμ with μ an effective dominant weight
form a basis of the N -invariants. We have seen that there is one irreducible representation for each basis vector. Under the action in Proposition 26.8, the weight of the vector xμ is 2μ. Hence the highest weights of the irreducible representations in ⋁(∨2 Cn ) are the even effective dominant weights. □
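Two facts used above, the N -invariance of the leading minors Xk and the claim that xμ has weight 2μ, can be spot-checked with exact rational arithmetic. The sketch below is ours (the helper names det, matmul, transpose and leading_minor are not from the text):

```python
from fractions import Fraction

def det(M):
    """Determinant by Laplace expansion; fine for small matrices."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

def leading_minor(X, k):
    return det([row[:k] for row in X[:k]])

F = Fraction
X = [[F(2), F(1), F(0)], [F(1), F(3), F(1)], [F(0), F(1), F(4)]]  # symmetric

# N-invariance: X_k(t(u) X u) = X_k(X) for upper triangular unipotent u.
u = [[F(1), F(5), F(-2)], [F(0), F(1), F(7)], [F(0), F(0), F(1)]]
Y = matmul(matmul(transpose(u), X), u)
assert all(leading_minor(Y, k) == leading_minor(X, k) for k in (1, 2, 3))

# Weight: for g = diag(t1, t2, t3), X_k(t(g) X g) = (t1 ... tk)^2 X_k(X),
# so the monomial X_1^{a1} X_2^{a2} X_3^{a3} = x^mu transforms with weight 2 mu.
t = [F(2), F(3), F(5)]
g = [[t[r] if r == c else F(0) for c in range(3)] for r in range(3)]
Z = matmul(matmul(transpose(g), X), g)
prod = F(1)
for k in (1, 2, 3):
    prod *= t[k - 1]
    assert leading_minor(Z, k) == prod ** 2 * leading_minor(X, k)
```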
The following proposition abstracts this situation. We will say that a representation of G that decomposes into a direct sum of finite-dimensional irreducible representations is multiplicity-free if no irreducible representation occurs in it with multiplicity greater than one. For example, we have just proved that ⋁(∨2 Cn ) is a multiplicity-free GL(n, C)-module. By Proposition 26.7, a method of proving that a module W is multiplicity-free is to show that W^N is multiplicity-free as a B-module. The next result exposes the underlying mechanism in the proof of Theorem 26.6.
In the following result we will assume that G is an affine algebraic group
over C. We note that all of the usual examples, SL(n, C), GL(n, C), O(n, C),
Sp(2n, C), . . . are affine algebraic groups. We continue to assume that G is the
complexification of a compact Lie group, and this assumption is true for these
examples as well.
Theorem 26.7. Assume that G is an affine algebraic group over the complex
numbers. Assume that it is also the complexification of a compact Lie group K.
Let X be a complex affine algebraic variety on which the group G acts
algebraically. Assume that the Borel subgroup B has a dense open orbit in
X. Let W be the space of algebraic functions on X. Then W is a multiplicity-
free G-module.
The open orbit, if it exists, is always unique and dense, so the word “dense” could be eliminated from the statement. The theory of algebraic group actions is an important topic, and the standard monograph is Mumford, Fogarty, and Kirwan [132]. Varieties (whether affine or not) with an open B-orbit are called spherical.
Proof. We need to prove that W decomposes into a direct sum of finite-dimensional irreducible modules.
We begin by showing that if f ∈ W then the G-translates of f span a finite-dimensional vector space W (f ). Since the group action G × X −→ X is algebraic, if f is a polynomial function on X, then (g, x) −→ f (gx) is a polynomial function on G × X and so there exist polynomials φi on G and ψi on X such that f (gx) = Σi φi (g)ψi (x). Thus, the space W (f ) of left translates of f is spanned by the functions ψi and is finite-dimensional.
Now we embed X in an affine space, so that we may speak of the degree of a polynomial function. Let WN be the direct sum of the W (f ) for f a polynomial of degree ≤ N . Then WN is finite-dimensional and G-invariant. Because K is compact, WN decomposes into a direct sum of irreducible K-modules, which are also G-invariant subspaces since G is the complexification of K. Since WN ⊂ WN+1 , we may also choose these decompositions so that every irreducible that occurs in the decomposition of WN is also in the decomposition of WN+1 . Taking the sum of all the irreducibles that occur in these decompositions shows that W is completely reducible.
Now let x0 ∈ X be such that Bx0 is open and dense. If f ∈ W and g ∈ G, then the group action is by (gf )(x) = f (g −1 x). So if f is in W^N and λ is a dominant weight such that bf = λ(b)f for all b ∈ B, we have f (bx0 ) = λ(b)−1 f (x0 ). Because Bx0 is dense, this means that f is determined up to a scalar multiple by this condition. This shows that W^N is multiplicity-free as a B-module and therefore W is multiplicity-free as a G-module. □
Exercises

Exercise 26.1. Let Ω be the vector space of n × n skew-symmetric matrices. G = GL(n, C) acts on the space of polynomial functions on Ω by (gf )(X) = f (t g · X · g).
(i) Show that the symmetric algebra on the exterior square of the standard module of GL(n, C), that is, ⋁(∧2 Cn ), is isomorphic as a G-module to the ring of polynomial functions on Ω.
(ii) Show that ⋁(∧2 Cn ) decomposes as a direct sum of irreducible representations, each with multiplicity one, and that if λ = (λ1 , λ2 , . . . , λn ) is a dominant weight, then λ occurs in this decomposition if and only if λ1 = λ2 , λ3 = λ4 , . . . , and, if n is odd, λn = 0.
Exercise 26.2. Let G = GL(n, C), and let H = O(n, C). As in Proposition 26.8, let Ω be the space of symmetric n × n complex matrices, and let Ω ◦ be the open subset of invertible matrices in Ω. Let P(Ω) be the ring of polynomials on Ω, and let P(Ω ◦ ) be the ring of polynomial functions on Ω ◦ ; it is generated by P(Ω) together with X −→ det(X)−1 . The group G acts on both P(Ω) and P(Ω ◦ ) as in Proposition 26.8. The stabilizer of I ∈ Ω ◦ is the group H, and the action of G on Ω ◦ is transitive, so Ω ◦ is in bijection with G/H. Let (π, V ) be an irreducible representation of G.
(i) Show that (π, V ) has a nonzero H-fixed vector if and only if its contragredient (π̂, V ∗ ) does. [Hint: Show that π̂ is equivalent to the representation π ′ of G on V defined by π ′ (g) = π(t g −1 ) by comparing their characters.]
(ii) Assume that π has a nonzero H-fixed vector. By (i) there is an H-invariant linear functional φ : V −→ C. Define Φ : V −→ P(Ω ◦ ) by letting Φ(v) be the function Φv defined by

    Φv (X) = φ(π(g)v),        X = t gg.

Show that this is well defined and that Φgv (X) = Φv (t gXg). Deduce that v −→ Φv is an embedding of V into P(Ω ◦ ).
(iii) Show that π has a nonzero H-fixed vector if and only if π can be embedded in P(Ω ◦ ). [Hint: One direction is (ii). For the other, prove instead that π has an H-invariant linear functional.]
Remark: The argument in (ii) and (iii) is formally very similar to the proof of Frobenius reciprocity (Proposition 32.2), with P(Ω ◦ ) playing the role of the induced representation.
(iv) Show that an irreducible subrepresentation of P(Ω ◦ ) is contained in P(Ω) if and only if its highest weight λ is effective.
(v) Let πλ be an irreducible representation of G with highest weight λ. Assume that
λ is effective, so that λ is a partition. Show that πλ has an O(n)-fixed vector if
and only if λ is even.
(vi) Assume again that λ is effective, but now only assume that πλ has a fixed vector for SO(n). Which λ are possible?
Exercise 26.3. The last exercise shows that if (π, V ) is an irreducible representation of GL(n, C), then the multiplicity of the trivial representation of O(n, C) in the restriction of π to this subgroup is at most one. Show by example that other representations of O(n, C) can occur in such a restriction with higher multiplicity, for example when n = 5.
The next exercise is essentially a proof of the Cauchy identity, which is the
subject of Chap. 38.
Exercise 26.4. Let G = GL(n, C) × GL(n, C) act on the ring P of polynomial functions on Matn (C) by

    ((g1 , g2 )f )(X) = f (t g1 Xg2 ),        f ∈ P,  X ∈ Matn (C).
(i) Prove that P is isomorphic as a GL(n, C) × GL(n, C)-module to the symmetric algebra on V ⊗ V . [Hint: Adapt the proof of Proposition 26.8.]
(ii) Prove that P is isomorphic as a GL(n, C) × GL(n, C)-module to the direct sum of all modules πλ ⊗ πλ as λ runs through the effective dominant weights. [Hint: Adapt the proof of Theorem 26.6. Note that if B is the standard Borel subgroup of GL(n, C) then B × B is a Borel subgroup of GL(n, C) × GL(n, C), so the problem is to find the N × N invariants in P. These are, as in Theorem 26.6, again polynomials in certain minors of X.]
Exercise 26.5. Let G be a complex analytic Lie group and let H1 , H2 be closed analytic subgroups. Then G acts on the homogeneous space G/H1 , as does its subgroup H2 . The quotient is the space of double cosets, H2 \G/H1 , which might also be obtained by letting H1 act on the right on H2 \G.
(i) Show that if γ ∈ G then the stabilizer in H2 of the coset γH1 is Hγ = H2 ∩ γH1 γ −1 . Deduce that the dimension of the H2 -orbit of γH1 is dim(H2 ) − dim(Hγ ).
(ii) Show that H2 has an open orbit on G/H1 if and only if

    dim(Hγ ) + dim(G) = dim(H1 ) + dim(H2 )

for some γ ∈ G.
(iii) Show that H2 has an open orbit on G/H1 if and only if H1 has an open orbit on H2 \G.
27
The Bruhat Decomposition

The Bruhat decomposition was discovered quite late in the history of Lie groups, which is surprising in view of its fundamental importance. It was preceded by Ehresmann’s discovery of a closely related cell decomposition for flag manifolds. The Bruhat decomposition was axiomatized by Tits in the notion of a group with (B, N ) pair or Tits’ system. This is a generalization of the notion of a Coxeter group, and indeed every (B, N ) pair gives rise to a Coxeter group. We have remarked after Theorem 25.1 that Coxeter groups always act on simplicial complexes whose geometry is closely connected with their properties. As it turns out, a group with (B, N ) pair also acts on a simplicial complex, the Tits’ building. We will not have space to discuss this important concept, but see Tits [163] and Abramenko and Brown [1].
In this chapter, in order to be consistent with the notation in the literature on Tits’ systems, particularly Bourbaki [23], we will modify our notation slightly. In other chapters, such as the previous one, N denotes the subgroup (26.6) of the Borel subgroup. That group will appear in this chapter also, but we will denote it as U , reserving the letter N for the normalizer of T . Similarly, in this chapter U will be the subgroup formerly denoted N .
Let G = GL(n, F ), where F is a field, and let B be the Borel subgroup of upper triangular matrices in G. Taking T ⊂ B to be the subgroup of diagonal matrices in G, the normalizer N (T ) consists of all monomial matrices. The Weyl group is W = N (T )/T ≅ Sn . If w ∈ W is represented by ω ∈ N (T ), then since T ⊂ B the double coset BωB is independent of the choice of representative ω, so by abuse of notation we write BwB for BωB. It is a remarkable and extremely important fact that w −→ BwB is a bijection between the elements of W and the double cosets in B\G/B. We will prove the following Bruhat decomposition:
    G = ⋃_{w∈W} BwB    (disjoint). (27.1)
The example of GL(2, F ) is worth writing out explicitly. If g = ( a b ; c d ), then g ∈ B if and only if c = 0. To prove the Bruhat decomposition in this case, for a
D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 243
DOI 10.1007/978-1-4614-8024-2 27, © Springer Science+Business Media New York 2013
representative ω of the long Weyl group element it will be convenient to take

    ω = ( 0  −Δ/c )
        ( c    0  ) ,        Δ = ad − bc.

Then the decomposition follows from the identity

    ( a b )   ( 1  a/c )      ( 1  d/c )
    ( c d ) = ( 0   1  )  ω   ( 0   1  ) .
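This identity can be checked with exact rational arithmetic. A minimal sketch (assuming Python; the helper name mul2 is ours):

```python
from fractions import Fraction

def mul2(A, B):
    """Multiply two 2 x 2 matrices given as nested lists."""
    return [[A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
            [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]]]

a, b, c, d = map(Fraction, (2, 3, 5, 7))    # any sample with c != 0
Delta = a * d - b * c                       # the determinant of g
g     = [[a, b], [c, d]]
omega = [[0, -Delta / c], [c, 0]]           # the long Weyl element representative above
u1    = [[1, a / c], [0, 1]]                # unipotent upper triangular factors in B
u2    = [[1, d / c], [0, 1]]

assert mul2(mul2(u1, omega), u2) == g       # g = u1 * omega * u2 lies in B omega B
```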
We will prove this and also obtain a similar statement in complex Lie
groups. Specifically, if G is a complex Lie group obtained by complexification
of a compact connected Lie group, we will prove a “Bruhat decomposition”
analogous to (27.1) in G. A more general Bruhat decomposition will be found
in Theorem 29.5.
We will prove the Bruhat decomposition for a group with a Tits’ system,
which consists of a pair of subgroups B and N satisfying certain axioms. The
use of the notation N differs from that of Chap. 26, though the results of that
chapter are very relevant here.
Let G be a group, and let B and N be subgroups such that T = B ∩ N is normal in N . Let W be the quotient group N/T . As with GL(n, F ), we write wB for ωB when ω ∈ N represents the Weyl group element w, and similarly we write Bw = Bω and BwB = BωB.
Let G be a group with subgroups B and N satisfying the following conditions.

Axiom TS1. The group T = B ∩ N is normal in N .

Axiom TS2. There is specified a set I of generators of the group W = N/T such that if s ∈ I then s2 = 1.

Axiom TS3. Let w ∈ W and s ∈ I. Then

    wBs ⊂ BwsB ∪ BwB. (27.2)

Axiom TS4. Let s ∈ I. Then sBs−1 ⊄ B.

Axiom TS5. The group G is generated by N and B.

Then we say that (B, N, I) is a Tits’ system.
We will be particularly concerned with the double cosets C(w) = BwB with w ∈ W . Then Axiom TS3 can be rewritten

    C(w) C(s) ⊂ C(w) ∪ C(ws), (27.3)

which is obviously equivalent to (27.2). Taking inverses, this is equivalent to

    C(s) C(w) ⊂ C(w) ∪ C(sw). (27.4)
As a first example, let G = GL(n, F ), where F is any field. Let B be the Borel subgroup of upper triangular matrices in G, let T be the standard “maximal torus” of all diagonal elements, and let N be the normalizer in G of T . Then B is the semidirect product of T with the normal subgroup U of upper triangular unipotent matrices. The group N consists of the monomial matrices, that is, matrices having exactly one nonzero entry in each row and column. Let I = {s1 , . . . , sn−1 } be the set of simple reflections, namely si is the image in W = N/T of
the image in W = N/T of
⎛ ⎞
Ii−1
⎜ 0 1 ⎟
⎜ ⎟.
⎝ 1 0 ⎠
In−1−i

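Taking these si as permutation matrices, one can check the Coxeter relations of W ≅ Sn directly. A small sketch (assuming Python; the function names are ours):

```python
def identity(n):
    return [[int(r == c) for c in range(n)] for r in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def simple_reflection(i, n):
    """Permutation matrix diag(I_{i-1}, antidiag(1,1), I_{n-1-i}) representing s_i."""
    M = identity(n)
    M[i - 1], M[i] = M[i], M[i - 1]   # swap rows i and i+1 (1-based)
    return M

n = 4
s = {i: simple_reflection(i, n) for i in range(1, n)}

for i in range(1, n):
    assert matmul(s[i], s[i]) == identity(n)            # s_i^2 = 1
for i in range(1, n - 1):                               # braid relations
    assert matmul(matmul(s[i], s[i + 1]), s[i]) == \
           matmul(matmul(s[i + 1], s[i]), s[i + 1])
for i in range(1, n - 2):                               # distant generators commute
    assert matmul(s[i], s[i + 2]) == matmul(s[i + 2], s[i])
```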
We will prove in Theorem 27.1 below that this (B, N, I) is a Tits’ system.
The proof will require introducing a root system into GL(n, F ). Of course, we
have already done this if F = C, but let us revisit the definitions in this new
context.
Let X ∗ (T ) be the group of rational characters of T . In case F is a finite field, we don’t want any torsion in X ∗ (T ); that is, we want χ ∈ X ∗ (T ) to have infinite order so that R ⊗ X ∗ (T ) will be nonzero. So we define an element of X ∗ (T ) to be a character of T (F̄ ), the group of diagonal matrices in GL(n, F̄ ), where F̄ is the algebraic closure of F , of the form

    diag(t1 , . . . , tn ) −→ t1^{k1} · · · tn^{kn} , (27.5)
where ki ∈ Z. Then X ∗ (T ) ≅ Zn , so V = R ⊗ X ∗ (T ) ≅ Rn . As usual, we write the group law in X ∗ (T ) additively.
In this context, by a root of T in G we mean an element α ∈ X ∗ (T ) such that there exists a group isomorphism xα of F onto a subgroup Xα of G consisting of unipotent matrices such that

    t xα (λ) t−1 = xα (α(t) λ) ,        t ∈ T , λ ∈ F. (27.6)

(Strictly speaking, we should require that this identity be true as an equality of morphisms from the additive group into G.) There are n2 − n roots, which may be described explicitly as follows. If 1 ≤ i, j ≤ n and i ≠ j, let
    αij (t) = ti tj^{−1} (27.7)

when t is as in (27.5). Then αij ∈ X ∗ (T ), and if Eij is the matrix with 1 in the i, j position and 0’s elsewhere, and if

    xαij (λ) = I + λEij ,
then (27.6) is clearly valid. The set Φ consisting of the αij is a root system; we leave the reader to check this, but in fact it is identical to the root system of GL(n, C) or its maximal compact subgroup U(n) already introduced in Chap. 18 in the case F = C. Let Φ+ consist of the “positive roots” αij with i < j, and let Σ consist of the “simple roots” αi,i+1 . We will sometimes denote the simple reflections si = sα , where α = αi,i+1 .
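Relation (27.6) for these root subgroups is a quick matrix computation, since t(I + λEij )t−1 = I + λ αij (t)Eij . Assuming Python with exact rational arithmetic, the following sketch (the helper names matmul and x_root are ours) verifies it for all i ≠ j:

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def x_root(i, j, lam, n):
    """x_alpha(lam) = I + lam * E_ij for alpha = alpha_ij (i, j are 1-based)."""
    M = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    M[i - 1][j - 1] += lam
    return M

n = 4
ts = [Fraction(v) for v in (2, 3, 5, 7)]    # a sample torus element diag(2, 3, 5, 7)
t = [[ts[r] if r == c else Fraction(0) for c in range(n)] for r in range(n)]
t_inv = [[1 / ts[r] if r == c else Fraction(0) for c in range(n)] for r in range(n)]

lam = Fraction(11)
for i in range(1, n + 1):
    for j in range(1, n + 1):
        if i != j:
            conj = matmul(matmul(t, x_root(i, j, lam, n)), t_inv)
            # alpha_ij(t) = t_i / t_j, so conjugation rescales lambda by that factor
            assert conj == x_root(i, j, ts[i - 1] / ts[j - 1] * lam, n)
```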
Suppose that α is a simple root. Let Tα ⊂ T be the kernel of α. Let Mα be the centralizer of Tα , and let Pα be the subgroup generated by B and Mα . By abuse of language, Pα is called a minimal parabolic subgroup. Observe that it is a parabolic subgroup since it contains the Borel subgroup. Strictly speaking it is not minimal among the parabolic subgroups, since the Borel subgroup itself is smaller. However it is minimal among non-Borel parabolic subgroups, and it is commonly called a minimal parabolic. (Parabolic subgroups are discussed in Chap. 30.) We have a semidirect product decomposition Pα = Mα Uα , where Uα is the group generated by the xβ (λ) with β ∈ Φ+ − {α}. For example, if n = 4 and α = α23 , then
    Tα = diag( t1 , t2 , t2 , t4 ),

          ( ∗       )            ( ∗ ∗ ∗ ∗ )            ( 1 ∗ ∗ ∗ )
    Mα =  (   ∗ ∗   ) ,    Pα =  (   ∗ ∗ ∗ ) ,    Uα =  (   1   ∗ ) ,
          (   ∗ ∗   )            (   ∗ ∗ ∗ )            (     1 ∗ )
          (       ∗ )            (       ∗ )            (       1 )

where ∗ indicates an arbitrary value and blank entries are zero.

Lemma 27.1. Let G = GL(n, F ) for any field F , and let other notations be as above. If s is a simple reflection, then B ∪ C(s) is a subgroup of G.

Proof. First, let us check this when n = 2. In this case, there is only one simple reflection sα , where α = α12 . We check easily that

    C(sα ) = Bsα B = { ( a b ; c d ) ∈ GL(2, F ) : c ≠ 0 } ,

so C(sα ) ∪ B = G.
In the general case, both C(sα ) and B are subsets of Pα . We claim that their union is all of Pα . Both double cosets are right-invariant by Uα since Uα ⊂ B, so it is sufficient to show that C(sα ) ∪ B ⊃ Mα . Passing to the quotient in Pα /Uα ≅ Mα ≅ GL(2) × (F × )n−2 , this reduces to the case n = 2 just considered. □
We have an action of W on Φ as in Chap. 20. This action is such that if ω ∈ N represents the Weyl group element w ∈ W , we have

    ωxα (λ)ω −1 ∈ xw(α) (F ). (27.8)
Other notations, such as the length function l : W −→ Z, will be as in that chapter.
Lemma 27.2. Let G = GL(n, F ) for any field F , and let other notations be as above. If α is a simple root with s = sα , and w ∈ W is such that w(α) ∈ Φ+ , then C(w) C(s) = C(ws).

Proof. We will show that

    wBs ⊆ BwsB.

If this is known, then multiplying on both left and right by B gives C(w) C(s) = BwBsB ⊆ BwsB = C(ws). The other inclusion is obvious, so this is sufficient.
Let ω and σ be representatives in N of the Weyl group elements w and s, and let b ∈ B. We may write b = txα (λ)u, where t ∈ T , λ ∈ F , and u ∈ Uα . Then

    ωbσ = ωtω −1 · ωxα (λ)ω −1 · ωσ · σ −1 uσ.

We have ωtω −1 ∈ T ⊂ B since ω ∈ N = N (T ). We have ωxα (λ)ω −1 ∈ xw(α) (F ) ⊂ B using (27.8) and the fact that w(α) ∈ Φ+ . We have σ −1 uσ ∈ Uα ⊂ B since Mα normalizes Uα and σ ∈ Mα . We see that ωbσ ∈ BwsB as required. □
Proposition 27.1. Let G = GL(n, F ) for any field F , and let other notations be as above. If w, w′ ∈ W are such that l(ww′ ) = l(w) + l(w′ ), then

    C(ww′ ) = C(w) · C(w′ ).

Proof. It is sufficient to show that if l(w) = r, and if w = s1 · · · sr is a decomposition into simple reflections, then

    C(w) = C(s1 ) · · · C(sr ). (27.9)

Indeed, assuming we know this fact, let w′ = s′1 · · · s′r′ be a decomposition into simple reflections with r′ = l(w′ ). Then s1 · · · sr s′1 · · · s′r′ is a decomposition of ww′ into simple reflections with l(ww′ ) = r + r′ , so

    C(ww′ ) = C(s1 ) · · · C(sr ) C(s′1 ) · · · C(s′r′ ) = C(w) C(w′ ).

To prove (27.9), let sr = sα , and let w1 = s1 · · · sr−1 . Then l(w1 sα ) = l(w1 ) + 1, so by Propositions 20.2 and 20.5 we have w1 (α) ∈ Φ+ . Thus, Lemma 27.2 is applicable and C(w) = C(w1 ) C(sr ). By induction on r, we have C(w1 ) = C(s1 ) · · · C(sr−1 ) and so we are done. □
Theorem 27.1. With G = GL(n, F ) and B, N, I as above, (B, N, I) is a Tits’ system in G.
Proof. Only Axiom TS3 requires proof; the others can be safely left to the reader. Let α ∈ Σ be such that s = sα .
First, suppose that w(α) ∈ Φ+ . In this case, it follows from Lemma 27.2 that wBs ⊂ BwsB.
Next suppose that w(α) ∉ Φ+ . Then wsα (α) = w(−α) = −w(α) ∈ Φ+ , so we may apply the case just considered, with wsα replacing w, to see that

    wsBs ⊂ Bws2 B = BwB. (27.10)

By Lemma 27.1, B ∪ BsB is a group containing a representative of the coset of s ∈ N/T , so B ∪ BsB = sB ∪ sBsB and thus

    Bs ⊂ sB ∪ sBsB.

Using (27.10),

    wBs ⊂ wsB ∪ wsBsB ⊂ BwsB ∪ BwB.

This proves Axiom TS3. □
As a second example of a Tits’ system, let K be a compact connected Lie group, and let G be its complexification. Let T be a maximal torus of K, let TC be the complexification of T , and let B be the Borel subgroup of G as constructed in Chap. 26. Let N be the normalizer in G of TC , and let I be the set of simple reflections in W = N/T . We will prove that (B, N, I) is a Tits’ system in G, closely paralleling the proof just given for GL(n, F ). In fact, if F = C and K = U(n), so that G = GL(n, C), the two examples, including the method of proof, exactly coincide.
The key to the proof is the construction of the minimal parabolic subgroup Pα corresponding to a simple root α ∈ Σ. (Compare Chap. 30.) Let Tα be the kernel of α in T . The centralizer CK (Tα ) played a key role in Chap. 18, particularly in the proof of Theorem 18.1, where a homomorphism iα : SU(2) −→ CK (Tα ) was constructed. This homomorphism extends to a homomorphism, which we will also denote as iα , of the complexification SL(2, C) into the centralizer CG (Tα ) of Tα in G. Let Pα be the subgroup generated by iα (SL(2, C)) and B. Let Mα be the group generated by iα (SL(2, C)) and TC . Finally, let

    uα = ⊕_{β∈Φ+ , β≠α} Xβ .
If β1 , β2 ∈ {β ∈ Φ+ | β ≠ α}, then β1 + β2 ≠ 0, and if β1 + β2 is a root, it is also in {β ∈ Φ+ | β ≠ α}. It follows from this observation and Proposition 18.4 that uα is closed under the Lie bracket; that is, it is a complex Lie subalgebra of the Lie algebra denoted n in Chap. 26. Theorem 26.2 (iii) shows that it is the Lie algebra of a complex Lie subgroup Uα of G.
Proposition 27.2. Let G be the complexification of the compact connected Lie group K, let α be a simple positive root of G with respect to a fixed maximal torus T of K, and let other notations be as above. Then Mα normalizes Uα .

Proof. It is clear that B normalizes Uα , so we need to show that iα (SL(2, C)) normalizes Uα . If γ ∈ {β ∈ Φ+ | β ≠ α} and δ = α or −α, then γ + δ ≠ 0, and if γ + δ ∈ Φ, then γ + δ ∈ {β ∈ Φ+ | β ≠ α}. Thus [X±α , Xγ ] ⊆ uα , and since by Theorem 18.1 and Proposition 18.8 the Lie algebra of iα (SL(2, C)) is generated by Xα and X−α , it follows that the Lie algebra of iα (SL(2, C)) normalizes the Lie algebra of Uα . Since both groups are connected, it follows that iα (SL(2, C)) normalizes Uα . □
Since Mα normalizes Uα , we may define Pα to be the semidirect product Mα Uα . An analog of Lemma 27.1 is true in this context.

Lemma 27.3. Let G be the complexification of the compact connected Lie group K, and let other notations be as above. If s is a simple reflection, then B ∪ C(s) is a subgroup of G.

Proof. Indeed, if s = sα , then B ∪ C(s) = Pα . From Theorem 18.1, the group Mα contains a representative of s ∈ N/T , so it is clear that B ∪ C(s) ⊂ Pα . As for the other inclusion, both B and C(s) are invariant under right multiplication by Uα , so it is sufficient to show that Mα ⊂ B ∪ C(s). Moreover, both B and C(s) are invariant under right multiplication by TC , so it is sufficient to show that iα (SL(2, C)) ⊂ B ∪ C(s). This is identical to Lemma 27.1 except that we work with SL(2, C) instead of GL(2, F ). We have

    iα ( a b ; c d ) ∈ B if c = 0,    and    iα ( a b ; c d ) ∈ C(s) if c ≠ 0.

This completes the proof. □
Theorem 27.2. Let G be the complexification of the compact connected Lie group K. With B, N, I as above, (B, N, I) is a Tits’ system in G.

Proof. The proof is identical to that of Theorem 27.1. The analog of Lemma 27.2 is true, and the proof is the same except that we use Lemma 27.3 instead of Lemma 27.1. All other details are the same. □
Now that we have two examples of Tits’ systems, let us prove the Bruhat decomposition.

Theorem 27.3. Let (B, N, I) be a Tits’ system within a group G, and let W be the corresponding Weyl group. Then

    G = ⋃_{w∈W} BwB, (27.11)

and this union is disjoint.
Proof. Let us show that ⋃_{w∈W} C(w) is a group. It is clearly closed under inverses. We must show that it is closed under multiplication.
Let us consider C(w1 ) · C(w2 ), where w1 , w2 ∈ W . We show by induction on l(w2 ) that this is contained in a union of double cosets. If l(w2 ) = 0, then w2 = 1 and the assertion is obvious. If l(w2 ) > 0, write w2 = sw2′ , where s ∈ I and l(w2′ ) < l(w2 ). Then, by Axiom TS3,

    C(w1 ) · C(w2 ) = Bw1 Bsw2′ B ⊂ Bw1 Bw2′ B ∪ Bw1 sBw2′ B,

and by induction this is contained in a union of double cosets.
We have shown that the right-hand side of (27.11) is a group, and since it
clearly contains B and N , it must be all of G by Axiom TS5.
It remains to be shown that the union (27.11) is disjoint. Of course, two double cosets are either disjoint or equal, so assume that C(w) = C(w′ ), where w, w′ ∈ W . We will show that w = w′ .
Without loss of generality, we may assume that l(w) ≤ l(w′ ), and we proceed by induction on l(w). If l(w) = 0, then w = 1, and so B = C(w′ ). Thus, in N/T , a representative for w′ will lie in B. Since B ∩ N = T , this means that w′ = 1, and we are done in this case. Assume therefore that l(w) > 0 and that whenever C(w1 ) = C(w1′ ) with l(w1 ) < l(w) we have w1 = w1′ .
Write w = vs, where s ∈ I and l(v) < l(w). Thus vs ∈ C(w′ ), and since s has order 2, we have

    v ∈ C(w′ )s ⊂ C(w′ ) ∪ C(w′ s)

by Axiom TS3. Since two double cosets are either disjoint or equal, this means that either

    C(v) = C(w′ )    or    C(v) = C(w′ s).

Our induction hypothesis implies that either v = w′ or v = w′ s. The first case is impossible since l(v) < l(w) ≤ l(w′ ). Therefore v = w′ s. Hence w = vs = w′ s · s = w′ , as required. □
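For G = GL(2, F3 ) the decomposition (27.11) can be confirmed by brute force. Assuming Python, the sketch below (all names are ours) enumerates the group, forms the two double cosets, and checks that they are disjoint and exhaust G:

```python
from itertools import product

p = 3                                        # work over the finite field F_p

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p
                       for j in range(2)) for i in range(2))

G = [((a, b), (c, d))
     for a, b, c, d in product(range(p), repeat=4)
     if (a * d - b * c) % p != 0]            # keep only invertible matrices

B = {M for M in G if M[1][0] == 0}           # Borel subgroup: upper triangular
w = ((0, 1), (1, 0))                         # representative of the nontrivial Weyl element

BwB = {matmul(matmul(b1, w), b2) for b1 in B for b2 in B}

assert len(G) == 48 and len(B) == 12
assert B.isdisjoint(BwB)                     # the double cosets B and BwB are disjoint
assert B | BwB == set(G)                     # and their union is all of G
```

The double coset BwB turns out to be exactly the set of matrices with lower left entry nonzero, matching the GL(2) computation at the start of the chapter.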
We return to the second example of a Tits’ system. Let K be a compact connected Lie group and G its complexification. Let B be the standard Borel subgroup, containing the maximal torus TC , with T = TC ∩ K the maximal torus of K. The group (26.6), which is usually denoted N , will be denoted U (in this chapter only).
The flag manifold X = K/T may be identified with G/B as in Theorem 26.4. We will use the Bruhat decomposition G = ⋃w BwB to look more closely at X.
By Theorem 26.4, X is a complex manifold. It is compact since it is a continuous image of K. We may decompose X = ⋃w Yw , where w runs through the Weyl group and Yw = BwB/B. Let us begin by looking more closely at Yw . Let Uw+ = U ∩ wU w−1 and Uw− = U ∩ wU− w−1 . The Lie algebra uw+ is the intersection of the Lie algebras of U and wU w−1 , so
    uw+ = ⊕_{α∈Φ+ ∩wΦ+} Xα ,

and similarly

    uw− = ⊕_{α∈Φ+ ∩wΦ−} Xα .
Proposition 27.3. The map u −→ uwB is a bijection of Uw− onto Yw .

Proof. Clearly BwB/B = U wB/B. Moreover, if u, u′ ∈ U then uwB = u′ wB if and only if u−1 u′ ∈ Uw+ . We need to show that every coset in U/Uw+ has a unique representative from Uw− . This follows from Theorem 26.2 (iv). □
The orbits of the left action of B on X are the Yw . So the closure of Yw is a union of other Yu with u ∈ W . Which ones? We recall the Bruhat order that was introduced in Chap. 25. If w = si1 · · · sik is a reduced decomposition, then u ≤ w if and only if u is obtained by eliminating some of the factors. In other words, there is a subsequence (j1 , . . . , jl ) of (i1 , . . . , ik ) with u = sj1 · · · sjl . It was shown in Proposition 25.4 that this definition does not depend on the decomposition w = si1 · · · sik . Moreover, we may always arrange that u = sj1 · · · sjl is a reduced decomposition.
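The subword criterion is easy to implement as a subsequence test; by Proposition 25.4 the answer does not depend on which reduced word for w is used. A minimal sketch (assuming Python; the function name is ours):

```python
def is_subword(u_word, w_word):
    """True if u_word is a (not necessarily contiguous) subsequence of w_word."""
    it = iter(w_word)
    return all(letter in it for letter in u_word)   # 'in' consumes the iterator

# In S_3 the long element has reduced word s1 s2 s1; every element lies below it.
w = (1, 2, 1)
assert is_subword((1, 2), w) and is_subword((2, 1), w)
assert is_subword((2,), w) and is_subword((), w)
assert not is_subword((1, 2, 1, 2), w)   # longer words can never be subwords
```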
Our goal is to prove that Yu is contained in the closure of Yw if and only if u ≤ w in the Bruhat order. To prove this, we introduce the Bott–Samelson varieties. If 1 ≤ i ≤ r, where r is the semisimple rank of K, that is, the number of simple reflections, let Pi be the minimal parabolic subgroup generated by a representative of si and B.
Proposition 27.4. The minimal parabolic Pi = C(1) ∪ C(si ). The quotient Pi /B is diffeomorphic to the projective line P1 (C).

Proof. By Lemma 27.3, C(1) ∪ C(si ) is a group, so Pi = C(1) ∪ C(si ). Since SL(2, C) is simply connected, the injection iαi : sl(2, C) −→ Lie(G) as in Proposition 18.8 induces a homomorphism SL(2, C) −→ G whose image is in Pi . Since iαi (SL(2, C)) contains a representative of si , we have Pi = iαi (SL(2, C)) B. Therefore Pi /B is in bijection with iαi (SL(2, C)) modulo its intersection with B. The quotient of SL(2, C) by its Borel subgroup is the projective line P1 (C). □
If w = (i1 , . . . , ik ), define a right action of B k on Pi1 × · · · × Pik by

    (p1 , . . . , pk ) · (b1 , . . . , bk ) = (p1 b1 , b1^{−1} p2 b2 , . . . , b_{k−1}^{−1} pk bk ), (27.12)

where pj ∈ Pij and bj ∈ B. We are mainly interested in the case where w is a reduced word. The quotient Zw = (Pi1 × · · · × Pik )/B k is called a Bott–Samelson variety. We also have a map Zw −→ Zw′ , where w′ = (i1 , . . . , ik−1 ), in which the orbit of (p1 , . . . , pk ) goes to the orbit of (p1 , . . . , pk−1 ). This map is a fibration in which the typical fiber is Pik /B ≅ P1 (C). Thus the Bott–Samelson variety is obtained by successive fiberings of P1 (C). In particular it is a compact manifold.

We have a map τ : Zw −→ X induced by the map (p1 , . . . , pk ) −→


p1 . . . pk B. It is clearly well-defined. Let Xw be the closure of Yw in X. It
is called a Schubert variety. We will show that the image of Zw in X is pre-
cisely Xw . Although we will not discuss this point, both Zw and Xw are
algebraic varieties. The variety Zw is less canonical, since it depends on the
choice of a reduced word w. It is, however, easier to work with. For example
Zw is smooth, whereas Xw can be singular.
Bott-Samelson varieties play a key role in many aspects of the theory.
The map τ is a birational equivalence, so they resolve the singularities of the
Schubert varieties. They are used in Demazure’s calculation of the action of T
on the spaces of sections of line bundles on X restricted to Xw as Demazure
characters.

Theorem 27.4. The image of τ is Xw. The Schubert variety Xw is the union
of the Yu for u ≤ w in the Bruhat order.

Proof. Since C(si) is dense in Pi, the set C(si1) × · · · × C(sik) is dense in
Pi1 × · · · × Pik. Its image in X is C(si1) · · · C(sik) = C(w) by (27.9), and so
the image of C(si1) × · · · × C(sik) is Yw, and it is dense in τ(Zw). On the
other hand, τ(Zw) is closed since Zw is compact. Thus τ(Zw) is
the closure of Yw, which by definition is Xw.

Now since Pi = C(1) ∪ C(si), it is clear that τ(Zw) is the union of the
C(sj1) · · · C(sjl)/B as (j1, . . . , jl) runs through the subwords of (i1, . . . , ik). If
u = sj1 · · · sjl is a reduced decomposition, then by (27.9) this is C(u)/B, and
so we obtain every Yu for u ≤ w. If the decomposition is not reduced, it is
still a union of C(v)/B for v ≤ u, as follows easily from (27.3). □
The Borel–Weil theorem realizes an irreducible representation of a compact
Lie group or its complexification as an action on the space of sections of a
holomorphic line bundle on the flag variety. This will be our next topic. The
Bruhat decomposition will play a role in the discussion insofar as we will need
to know that the big Bruhat cell Bw0B is dense in G.

If (π, V) is a complex representation of the compact connected Lie group
K, then it follows from the definition of the complexification G that π has a
unique extension to an analytic representation π : G −→ GL(V). Similarly,
the contragredient representation π̂ : K −→ GL(V∗) may be extended to an
analytic representation of G. Let λ be the highest weight of π. By Proposi-
tion 22.8, the highest weight of π̂ is λ̂ = −w0λ, where w0 is the long element
of the Weyl group W.

Now let X = G/B be the flag variety. We will construct a line bundle Lλ
over X. This is a complex analytic manifold together with an analytic map
p : Lλ −→ X. The fibers of p are one-dimensional complex vector spaces.
Moreover, every point x ∈ X has a neighborhood U such that p⁻¹(U) is
a trivial bundle over U. This means that there is a complex analytic homeo-
morphism ψ : p⁻¹(U) −→ U × C such that the composition of ψ with the
projection U × C −→ U is p.
To construct Lλ, define a right action of B on G × C by (g, ε)b = (gb, λ̂(b)ε)
for b ∈ B and (g, ε) ∈ G × C. Then Lλ is the quotient (G × C)/B. We will
denote the orbit of (g, ε) by [g, ε], so if b ∈ B then [g, ε] = [gb, λ̂(b)ε]. The
map p : Lλ −→ X sends [g, ε] to gB. We leave it to the reader to check that
this is a line bundle.

A section of Lλ is a holomorphic map s : X −→ Lλ such that p ◦ s is the
identity on X. It is well known (and part of the Riemann–Roch theorem) that
the space Γ(Lλ) of sections is finite-dimensional. We have compatible actions
of G on X and on Lλ by left translation: if γ ∈ G then γ : X −→ X sends gB to
γgB and γ : Lλ −→ Lλ sends [g, ε] to [γg, ε]. Specifying a section is equivalent
to giving a holomorphic map φ : G −→ C such that φ(gb) = λ̂(b)φ(g); the
last condition is needed so that s(gB) = [g, φ(g)] is well-defined. So Γ(Lλ) is
isomorphic to the vector space H(λ) of such holomorphic maps φ.

The compatibility is that these actions commute with p. One says that
Lλ is an equivariant line bundle. Now we have an action of G on sections as
follows. If s ∈ Γ(Lλ) and γ ∈ G then γs is the section (γs)(x) = γ(s(γ⁻¹x))
for x ∈ X. This is equivalent to the action (γφ)(g) = φ(γ⁻¹g) on H(λ).
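The verification that this action indeed carries H(λ) to itself is one line; we record it here since the text uses it implicitly:

```latex
(\gamma\varphi)(gb) = \varphi(\gamma^{-1}gb)
  = \hat\lambda(b)\,\varphi(\gamma^{-1}g)
  = \hat\lambda(b)\,(\gamma\varphi)(g),
  \qquad \gamma \in G,\ b \in B.
```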
Theorem 27.5. (Borel–Weil) The space Γ(Lλ) is zero unless λ is domi-
nant. If λ is dominant, then Γ(Lλ) is irreducible as a G-module, with highest
weight λ.

Proof. We will follow the now-familiar strategy of identifying the N-fixed
vectors in the module. We will take for granted the well-known fact that
the space of sections Γ(Lλ) is finite-dimensional. See Gunning and Rossi [60],
Corollary 10 on page 241. Assume that Γ(Lλ) is nonzero. Let φ be a highest
weight vector for some irreducible submodule. Then by Theorem 26.5 we have
φ(ng) = φ(g) for n ∈ N, so φ(nw0b) = λ̂(b) φ(w0). Thus φ is determined
up to a constant on N w0B = Bw0B. This is the big Bruhat cell, and it is
open and dense in X. So φ is zero unless φ(w0) ≠ 0, and we may normalize
it so φ(nw0b) = λ̂(b). This shows that there can be (up to scalar multiple)
at most one N-fixed vector, and therefore Γ(Lλ), which we are assuming to
be nonzero, is irreducible. Also, since φ is the highest weight vector, we can
use it to compute the highest weight. Let t ∈ TC. Then (tφ)(w0) = φ(t⁻¹w0) =
φ(w0 · w0⁻¹t⁻¹w0) = λ̂(w0⁻¹t⁻¹w0) = λ(t). So by Theorem 26.5 the highest
weight in the unique irreducible submodule of H(λ) is λ. In particular, λ is
dominant.

We have yet to show that if λ is dominant then H(λ) is nonzero. The issue
is whether the section whose existence on the big cell Bw0B follows from the
above considerations can be extended to the entire group. We will accomplish
this by exhibiting a G-equivariant map V −→ H(λ). By Proposition 22.8,
the highest weight in π̂ : G −→ GL(V∗) is λ̂, so if θ ∈ V∗ is the highest
weight vector then π̂(b)θ = λ̂(b)θ for b ∈ B. Let us denote the dual pairing
V × V∗ −→ C by (v, v∗) → ⟨v, v∗⟩. Define a map v → φv from V to the space
of holomorphic functions on G by
φv(g) = ⟨π(g⁻¹)v, θ⟩.

We have φv(gb) = λ̂(b)φv(g) since the left-hand side is

φv(gb) = ⟨π(b⁻¹)π(g⁻¹)v, θ⟩ = ⟨π(g⁻¹)v, π̂(b)θ⟩ = λ̂(b)⟨π(g⁻¹)v, θ⟩.

So φv ∈ H(λ). It is clear that φγv(g) = φv(γ⁻¹g) = (γφv)(g), so the map
v → φv is equivariant. □
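As a sketch of the smallest case (not in the text): take G = SL(2, C), B the upper triangular Borel, π the standard representation on C², and θ = e₂∗, which is the highest weight vector of V∗ here. Then the map v ↦ φv can be written explicitly:

```latex
\varphi_v(g) = \langle \pi(g^{-1})v,\ e_2^{*} \rangle = -c\,v_1 + a\,v_2,
\qquad g = \begin{pmatrix} a & b \\ c & d \end{pmatrix},
\quad g^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.
% For b = (t, x; 0, t^{-1}) in B, the first column of gb is (at, ct), so
\varphi_v(gb) = -(ct)\,v_1 + (at)\,v_2 = t\,\varphi_v(g) = \hat\lambda(b)\,\varphi_v(g),
% since here \hat\lambda = -w_0\lambda = \lambda and \lambda(\mathrm{diag}(t, t^{-1})) = t.
```

As v runs over C², these φv span a two-dimensional copy of H(λ), consistent with the theorem.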
Exercises
Exercise 27.1. Explain why Yw has complex dimension l(w), or real dimension
2l(w). Also explain why Yw is open in Xw . Since Xw is a union of Yw and subsets
of lower dimension, we may say that l(w) is the dimension of the Schubert variety
Xw .
If W is a finite Weyl group then W has a longest element w0 . By the last exercise,
the Bruhat cell Bw0 B is the largest in the sense of dimension. It is therefore called
the big Bruhat cell.
Exercise 27.2. Let G = GL(n, C). Show that g = (gij) ∈ G is in the big Bruhat
cell if and only if all the bottom-left minors

              | gn−1,1  gn−1,2 |     | gn−2,1  gn−2,2  gn−2,3 |
   gn,1 ,     | gn,1    gn,2   | ,   | gn−1,1  gn−1,2  gn−1,3 | ,   . . .
                                     | gn,1    gn,2    gn,3   |

(the k × k minors formed from the last k rows and the first k columns, for
k = 1, . . . , n − 1) are nonzero. If n = 3, give a similar interpretation of all the
Bruhat cells.
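A numerical illustration of one direction of this exercise (a sketch, not from the text): any product g = b1 w0 b2 with b1, b2 invertible upper triangular lies in the big Bruhat cell B w0 B, so all of its bottom-left minors should be nonzero.

```python
import numpy as np

# Illustration (not from the text): g = b1 * w0 * b2 lies in B w0 B,
# so its bottom-left minors should all be nonzero.
n = 4
b1 = np.triu(np.ones((n, n)))   # invertible upper triangular
b2 = np.triu(np.ones((n, n)))
w0 = np.fliplr(np.eye(n))       # permutation matrix of the long element w0
g = b1 @ w0 @ b2

# The k-by-k minors taken from the last k rows and first k columns.
minors = [np.linalg.det(g[n - k:, :k]) for k in range(1, n)]
print(all(abs(m) > 1e-9 for m in minors))  # True
```

The converse direction (nonvanishing minors imply membership in the big cell) amounts to performing a suitable LU-type factorization, which the exercise asks for.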
Exercise 27.3. Show for an arbitrary reduced word that the fiber τ⁻¹(x) of the
Bott–Samelson map τ is a single point for x in general position. ("General position"
means that this is true on a dense open subset of Xw.)

Exercise 27.4. Let M be a manifold. Suppose that M has an open contractible
subset Ω whose complement is a union of submanifolds of codimension ≥ 2. Show
that M is simply connected. Use this to give another proof of Theorem 23.7, that the
flag manifold is simply connected.
Let G = GL(n, C). We give a concrete interpretation of the Bott–Samelson variety
as follows. The group G acts on the set 𝒳 of flags U = (U0, . . . , Un), where U0 ⊂
U1 ⊂ · · · ⊂ Un and each Ui is an i-dimensional vector subspace of Cn. We fix a flag
V = (V0, . . . , Vn), which we will call the standard flag. The Borel subgroup B may
be taken to be the stabilizer of V. The parabolic subgroup Pi is the set of g ∈ G
such that gVj = Vj for all j ≠ i.

Thus let 𝒳 be the set of flags. We have a bijection between X = G/B and 𝒳 in
which the coset gB corresponds to the flag gV. In the same way we will describe
Zw as the parameter space for a set of configurations of subspaces of Cn that are more
complicated than simple flags but very similar in spirit. Let w = (sh1, sh2, . . . , shk)
be a reduced word representing w = sh1 · · · shk, and let Z′w be the set of sequences
U = (U^0, . . . , U^k) of flags U^i = (U^i_0, . . . , U^i_n) such that U^0 = V is the standard flag and
U^i_j = U^{i−1}_j except when j = hi.
Exercise 27.5. Let p1, . . . , pk ∈ G. Let U^0 = V, and define a sequence U =
(U^0, . . . , U^k) of flags by

U^i = p1 · · · pi V.

Show that U ∈ Z′w if and only if (p1, . . . , pk) ∈ Ph1 × · · · × Phk. Moreover, if (p′1, . . . , p′k)
is another element of Ph1 × · · · × Phk, then we have U^i = p′1 · · · p′i V for all i if and only if
(p1, . . . , pk) and (p′1, . . . , p′k) differ by an element of B^k under the right action (27.12).
Conclude that the map Ph1 × · · · × Phk −→ Z′w induces a bijection Zw −→ Z′w.
Exercise 27.6. Show that there is a commutative diagram

        Zw ──────→ Z′w
        │            │
        φ            Φ
        ↓            ↓
        Xw ──────→ 𝒳w

where the horizontal maps are the bijections described above (the bottom one sending
gB ∈ Xw to the flag gV), φ is the canonical map Zw → Xw, and Φ is the map that
sends the configuration (U^0, . . . , U^k) to its last flag U^k.
For example, let n = 3 and w = (s1, s2, s1). Then U^0 = (V0, V1, V2, V3) will be the
standard flag. We have U^i_j = U^{i−1}_j except when (i, j) = (1, 1), (2, 2) or (3, 1), which
means that we may find subspaces W1, U2 and U1 of dimensions 1, 2, 1 such that
U^1 = (V0, W1, V2, V3), U^2 = (V0, W1, U2, V3) and U^3 = (V0, U1, U2, V3). Thus we
arrive at the following configuration:

           V3
         /    \
      V2        U2
     /   \     /   \
   V1     W1       U1
     \     |      /
           V0

Lines represent inclusions, subscripts dimensions. The Bott–Samelson
space is a moduli space for such configurations, where (V0, V1, V2, V3) are fixed as
the standard flag.
We may compute the fibers of the map φ by solving the equivalent problem of
computing the fibers of Φ. In this case, the question is: given U^0 (which is fixed)
and U^3 (representing a point in Xw), how many such configurations are there? The
only unknown is W1, but from the above inclusions, W1 may be characterized as
the intersection of V2 and U2. This will be a one-dimensional space (hence the only
possibility for W1) except in the case where U2 = V2. Thus if x ∈ Xw is in general
position the fiber φ⁻¹(x) consists of a single point. But if U2 = V2 then W1 can be
any one-dimensional subspace of U2, so the fiber φ⁻¹(x) is P1(C).
Exercise 27.7. (i) Show (for GL(3)) that if w = (1, 2) or (2, 1), then φ is an isomorphism.
(ii) Give a similar analysis when w = (2, 1, 2).
Exercise 27.8. For GL(4), the Schubert variety Xw is singular if w = (2, 1, 3, 2) or
w = (1, 3, 2, 1, 3). Analyze the fibers of φ using this method. Here are the relevant
configurations:

         V4                          V4
      V3    U3                  V3   W3   U3
   V2    W2    U2                  V2    U2
      V1    U1                  V1   W1   U1
         V0                          V0
Exercise 27.9. Let X′ = G/B−, where B− is the opposite Borel subgroup. Explain
why X′ is diffeomorphic to the flag manifold X. Explore how the statement of the
Borel–Weil theorem would change if instead of line bundles on X, we considered
line bundles on X′.
28
Symmetric Spaces
We have devoted some attention to an important class of homogeneous spaces
of Lie groups, namely flag manifolds. Another important class is that of sym-
metric spaces. In differential geometry, a symmetric space is a Riemannian
manifold in which around every point there is an isometry reversing the direc-
tion of every geodesic. Symmetric spaces generalize the non-Euclidean geome-
tries of the sphere (compact with positive curvature) and the Poincaré upper
half-plane (noncompact with negative curvature). Like these two examples,
they tend to come in pairs, one compact and one noncompact. They were
classified by E. Cartan.
Our approach to symmetric spaces will be to alternate the examination of
examples with an explanation of general principles. In a few places (Remark
28.2, Theorem 28.2, Theorem 28.3, Proposition 28.3, and in the next chapter
Theorem 29.5) we will make use of results from Helgason [66]. This should
cause no problems for the reader. These are facts that need to be included
to complete the picture, though we do not have space to prove them from
scratch. They can be skipped without serious loss of continuity. In addition to
Helgason [66], a second indispensable work on (mainly Hermitian) symmetric
spaces is Satake [145].
It turns out that symmetric spaces (apart from Euclidean spaces) are con-
structed mainly as homogeneous spaces of Lie groups. In this chapter, an
involution of a Lie group G is an automorphism of order 2.
Proposition 28.1. Suppose that G is a connected Lie group with an involu-
tion θ. Assume that the group

K = {g ∈ G | θ(g) = g}    (28.1)

is a compact Lie subgroup. In this setting, X = G/K is a symmetric space.

The involution θ is called a Cartan involution of G, and the involution it
induces on the Lie algebra is called a Cartan involution of Lie(G).
D. Bump, Lie Groups, Graduate Texts in Mathematics 225,
DOI 10.1007/978-1-4614-8024-2_28, © Springer Science+Business Media New York 2013
Proof. Clearly, G acts transitively on G/K, and K is the stabilizer of the
base point x0, that is, the coset K ∈ G/K. We put a positive definite inner
product on the tangent space Tx0(X) that is invariant under the compact
group K and also under θ. If x ∈ X, then we may find g ∈ G such that
g(x0) = x, and g induces an isomorphism Tx0(X) −→ Tx(X) by which we
may transfer this positive definite inner product to Tx(X). Because the inner
product on Tx0(X) is invariant under K, this inner product does not depend
on the choice of g. Thus, X becomes a Riemannian manifold. The involution θ
induces an automorphism of X that preserves geodesics through x0, reversing
their direction, so X is a symmetric space. □
We now come to a striking algebraic fact that leads to the appearance
of symmetric spaces in pairs. The involution θ induces an involution of g =
Lie(G). The +1 eigenspace of θ is, of course, k = Lie(K). Let p be the −1
eigenspace. Evidently,

[k, k] ⊂ k,   [k, p] ⊂ p,   [p, p] ⊂ k.

From this, it is clear that

gc = k + ip    (28.2)

is a Lie subalgebra of gC = C ⊗ g. We observe that g and gc have the same
complexification; that is, gC = g ⊕ ig = gc ⊕ igc.
The appearance of these two Lie algebras with a common complexification
means that symmetric spaces come in pairs. To proceed further, we will make
some assumptions, which we now explain.
Hypothesis 28.1. Let G be a noncompact connected semisimple Lie group
with Lie algebra g. Let θ be an involution of G such that the fixed subgroup
K of θ is compact, as in Proposition 28.1. Let k and p be the +1 and −1
eigenspaces of θ on g, and let gc be the Lie algebra defined by (28.2). We will
assume that gc is the Lie algebra of a second Lie group Gc that is compact and
connected. Let GC be the complexification of Gc (Theorem 24.1). We assume
that the Lie algebra homomorphism g −→ gC is the differential of a Lie group
embedding G −→ GC and that θ extends to an automorphism of GC, also
denoted θ, which stabilizes Gc.
This means G and Gc can be embedded compatibly in the complex analytic
group GC. The involution θ extends to gC and induces an involution on gc
such that

X + iY −→ X − iY,   X ∈ k, Y ∈ p.

The last statement in Hypothesis 28.1 means that this θ is the differential of an
automorphism of Gc. As a consequence, the homogeneous space Xc = Gc/K
is also a symmetric space, again by Proposition 28.1. The symmetric spaces
X and Xc, one noncompact and the other compact, are said to be in duality
with each other.
Remark 28.1. We will see in Theorem 28.3 that every noncompact semisimple
Lie group admits a Cartan involution θ such that this hypothesis is satisfied.
Our proof of Theorem 28.3 will not be self-contained, but we do not really
need to rely on it as motivation because we will give numerous examples in
this chapter and the next where Hypothesis 28.1 is satisfied.
Remark 28.2. We do not specify G, K, and Gc up to isomorphism by this
description since different K could correspond to the same pair G and θ. But
K is always connected and contains the center of G (Helgason [66], Chap.
VI, Theorem 1.1 on p. 252). If we replace G by a semisimple covering group,
the center increases, so we must also enlarge K, and the quotient space G/K
is unchanged. Hence, there is a unique symmetric space of noncompact type
determined by the real semisimple Lie algebra g. By contrast, the symmetric
space of compact type is not uniquely determined by gc . There could be a
finite number of different choices for Gc and K resulting in different compact
symmetric spaces that have the same universal covering space. We will not
distinguish a particular one as the dual of X but say that any one of these
compact spaces is in duality with X. See Helgason [66], Chap. VII, for a
discussion of this point and other subtleties in the compact case.
Example 28.1. Suppose that G = SL(n, R) and K = SO(n). Then g = sl(n, R)
and the involution θ : G −→ G is θ(g) = tg⁻¹. The induced involution
on g is X −→ −tX. Thus p consists of the symmetric matrices, and gc con-
sists of the skew-Hermitian matrices in sl(n, C); that is, gc = su(n). The
Lie groups G = SL(n, R) and Gc = SU(n) are subgroups of their common
complexification GC = SL(n, C). The symmetric spaces X = SL(n, R)/SO(n)
and Xc = SU(n)/SO(n) are in duality.
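A quick numerical check of this duality for n = 3 (illustrative, not from the text): with X in k = so(3) (skew-symmetric) and Y in p (symmetric, traceless), the element X + iY of gc = k + ip is traceless and skew-Hermitian, i.e. it lies in su(3).

```python
import numpy as np

# Illustration of Example 28.1: elements of k + ip land in su(3).
X = np.array([[0., 1., -2.],
              [-1., 0., 3.],
              [2., -3., 0.]])            # skew-symmetric: X^T = -X
Y = np.array([[1., 4., 5.],
              [4., -2., 6.],
              [5., 6., 1.]])             # symmetric with trace 0
Z = X + 1j * Y

skew_hermitian = np.allclose(Z.conj().T, -Z)
traceless = np.isclose(np.trace(Z), 0)
print(skew_hermitian, traceless)  # True True
```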
Let us obtain concrete realizations of the symmetric spaces G/K and Gc/K
in Example 28.1. The group GL(n, R) acts on the cone Pn(R) of positive
definite real symmetric matrices by the action

g : x −→ g x tg.    (28.3)

On the other hand, the group U(n) acts on the space En(R) of unitary sym-
metric matrices by the same formula (28.3). [The notation En(R) does not
imply that the elements of this space are real matrices.]
Proposition 28.2. Suppose that x ∈ Pn(R) or En(R).

(i) There exists g ∈ SO(n) such that g x tg is diagonal.
(ii) The actions of GL(n, R) and U(n) are transitive.
(iii) Let p be the vector space of real symmetric matrices. We have

Pn(R) = {e^X | X ∈ p},   En(R) = {e^{iX} | X ∈ p}.

See Theorem 45.6 in Chap. 45 for an application.
Proof. If x ∈ Pn(R), then (i) is, of course, just the spectral theorem. However,
if x ∈ En(R), this statement may be less familiar. It is instructive to give a
unified proof of the two cases. Give Cn its usual inner product, so ⟨u, v⟩ =
Σi ui v̄i.

Let λ be an eigenvalue of x. We will show that the eigenspace Vλ = {v ∈
Cn | xv = λv} is stable under complex conjugation. Suppose that v ∈ Vλ.
If x ∈ Pn(R), then both x and λ are real, and simply conjugating the identity
xv = λv gives xv̄ = λv̄. On the other hand, if x ∈ En(R), then x̄ = x⁻¹
(since x is symmetric and unitary) and |λ| = 1, so λ̄ = λ⁻¹. Thus, conjugating
xv = λv gives x⁻¹v̄ = λ⁻¹v̄, which implies that xv̄ = λv̄.

Now we can show that Cn has an orthonormal basis consisting of eigen-
vectors v1, . . . , vn such that vi ∈ Rn. The adjoint of x with respect to the
standard inner product is x or x⁻¹, depending on whether x ∈ Pn(R) or En(R).
In either case, x is the matrix of a normal operator—one that commutes with
its adjoint—and Cn is the orthogonal direct sum of the eigenspaces of x.
Each eigenspace has an orthonormal basis consisting of real vectors. Indeed,
if v1, . . . , vk is a basis of Vλ, then since we have proved that v̄i ∈ Vλ, the space
is spanned by the real vectors ½(vi + v̄i) and (1/2i)(vi − v̄i); selecting a basis
from this spanning set and applying the usual Gram–Schmidt orthogonalization
process gives an orthonormal basis of real vectors.

In either case, we see that Cn has an orthonormal basis consisting of
eigenvectors v1, . . . , vn such that vi ∈ Rn. Let xvi = λivi. Then, if k ∈ O(n) is
the matrix with columns vi and d is the diagonal matrix with diagonal entries
λi, we have xk = kd, so k⁻¹xk = d. As k⁻¹ = tk we may take the matrix
g = k⁻¹. If the determinant of k is −1, we can switch the sign of the first
column without harm, so we may assume k ∈ SO(n), and (i) is proved.

For (ii), we have shown that each orbit in Pn(R) or En(R) contains a
diagonal matrix. The eigenvalues are positive real if x ∈ Pn(R) or of absolute
value 1 if x ∈ En(R). In either case, applying the action (28.3) with a suitable
diagonal g ∈ GL(n, R) or U(n) will reduce the diagonal matrix to the identity,
proving (ii). For (iii), we use (ii) to write an arbitrary element x of Pn(R)
or En(R) as k d k⁻¹, where k is orthogonal and d diagonal. The eigenvalues
of d are either positive real if x ∈ Pn(R) or of absolute value 1 if x ∈ En(R).
Thus, d = e^Y, where Y is real or purely imaginary, and x = e^X or e^{iX},
where X = kY k⁻¹ or −ikY k⁻¹ is real. □
In the action (28.3) of GL(n, R) or U(n) on Pn (R) or En (R), the stabi-


lizer of I is O(n), so we may identify the coset spaces GL(n, R)/O(n) and
U(n)/O(n) with Pn (R) and En (R), respectively. The actions of SL(n, R) and
SU(n) on Pn (R) and En (R) are not transitive. Let Pn◦ (R) and En◦ (R) be
the subspaces of matrices of determinant 1. Then the actions of SL(n, R)
and SU(n) on Pn◦ (R) and En◦ (R) are transitive, so we may identify Pn◦ (R) =
SL(n, R)/SO(n) and En◦ (R) = SU(n)/SO(n). Thus, we obtain concrete models
of the dual symmetric spaces Pn◦ (R) and En◦ (R).
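Part (iii) of Proposition 28.2 is easy to test numerically (an illustration, not from the text): for real symmetric X, the matrix e^{iX} is unitary and symmetric, so it lies in En(R), and the diagonalization uses an orthogonal matrix k as in the proof.

```python
import numpy as np

# Illustration of Proposition 28.2 (iii): e^{iX} is unitary symmetric.
X = np.array([[0., 1., 2.],
              [1., -1., 0.],
              [2., 0., 1.]])             # real symmetric
lam, k = np.linalg.eigh(X)               # X = k diag(lam) k^T, k orthogonal
x = k @ np.diag(np.exp(1j * lam)) @ k.T  # x = e^{iX} via the spectral calculus

symmetric = np.allclose(x, x.T)
unitary = np.allclose(x @ x.conj().T, np.eye(3))
print(symmetric, unitary)  # True True
```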
We say that a symmetric space X is reducible if its universal cover
decomposes into a product of two lower-dimensional symmetric spaces. If X
is irreducible (i.e., not reducible) and not a Euclidean space, then it is clas-
sified into one of four types, called I, II, III, and IV. We next explain this
classification.
Example 28.2. If K0 is a compact Lie group, then K0 is itself a compact sym-
metric space, the geodesic-reversing involution being k −→ k⁻¹. A symmetric
space of this type is called Type II.

Example 28.3. Suppose that G is itself obtained by complexification of a com-
pact connected Lie group K0 and that the involution θ of G is the automor-
phism of G as a real Lie group induced by complex conjugation. This means
that on the Lie algebra g = k0 ⊕ ik0 of G, where k0 = Lie(K0), the involution
θ sends X + iY −→ X − iY, X, Y ∈ k0. The fixed subgroup of θ is K0, and the
symmetric space is G/K0. A symmetric space of this type is called Type IV.
It is noncompact.
We will show that the Type II and Type IV symmetric spaces are in
duality. For this, we need a couple of lemmas. If R is a ring and e, f ∈ R, we
call e and f orthogonal central idempotents if ex = xe and f x = xf for all
x ∈ R, e² = e, f ² = f, and ef = f e = 0.
Lemma 28.1. (Peirce decomposition) Let R be a ring, and let e and f
be orthogonal central idempotents. Assume that 1 = e + f. Then Re and Rf
are (two-sided) ideals of R, and each is a ring with identity elements e and f,
respectively. The ring R decomposes as Re ⊕ Rf.

Proof. It is straightforward to see that Re is closed under multiplication and
is a ring with identity element e, and similarly for Rf. Since 1 = e + f, we
have R = Re + Rf, and Re ∩ Rf = 0 because if x ∈ Re ∩ Rf we can write
x = re = r′f, so x = r′f ² = xf = ref = 0. □
Lemma 28.2. Regard C ⊗ C = C ⊗R C as a C-algebra with scalar multipli-
cation a(x ⊗ y) = ax ⊗ y, a ∈ C. Then C ⊗ C and C ⊕ C are isomorphic as
C-algebras.

Proof. Let

e = ½(1 ⊗ 1 + i ⊗ i),   f = ½(1 ⊗ 1 − i ⊗ i).    (28.4)

It is easily checked that e and f are orthogonal central idempotents whose
sum is the identity element 1 ⊗ 1, and so we obtain a Peirce decomposition by
Lemma 28.1. The ideals generated by e and f are both isomorphic to C. □
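The idempotent relations can be checked by hand, or mechanically (an illustration, not from the text): model C ⊗R C as real 4-vectors over the basis i^j ⊗ i^k, j, k ∈ {0, 1}, ordered (1⊗1, 1⊗i, i⊗1, i⊗i), with multiplication induced by (a ⊗ b)(c ⊗ d) = ac ⊗ bd.

```python
import numpy as np

# Multiplication in C (x)_R C on the basis i^j (x) i^k, index = 2j + k.
def mult(x, y):
    z = np.zeros(4)
    for a in range(4):
        for b in range(4):
            j1, k1 = divmod(a, 2)
            j2, k2 = divmod(b, 2)
            # i^(j1+j2) = (-1)^((j1+j2)//2) * i^((j1+j2) % 2), same in factor 2
            sign = (-1) ** ((j1 + j2) // 2) * (-1) ** ((k1 + k2) // 2)
            z[2 * ((j1 + j2) % 2) + (k1 + k2) % 2] += sign * x[a] * y[b]
    return z

one = np.array([1., 0., 0., 0.])   # 1 (x) 1
e = np.array([0.5, 0., 0., 0.5])   # (1/2)(1(x)1 + i(x)i), as in (28.4)
f = np.array([0.5, 0., 0., -0.5])  # (1/2)(1(x)1 - i(x)i)

print(np.allclose(mult(e, e), e),
      np.allclose(mult(f, f), f),
      np.allclose(mult(e, f), 0),
      np.allclose(e + f, one))     # True True True True
```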
Theorem 28.1. Let K0 be a compact connected Lie group. Then the compact
and noncompact symmetric spaces of Examples 28.2 and 28.3 are in duality.
Proof. Let g and k0 be the Lie algebras of G and K0, respectively. We have
g = C ⊗ k0. The involution θ : g −→ g takes a ⊗ X −→ ā ⊗ X. By Lemma 28.2,
we have gC = C ⊗ C ⊗ k0 ≅ (C ⊗ k0) ⊕ (C ⊗ k0). Now θ induces the automorphism

θ : a ⊗ b ⊗ X −→ a ⊗ b̄ ⊗ X,   a, b ∈ C, X ∈ k0.

The +1 and −1 eigenspaces are spanned by vectors of the form 1 ⊗ 1 ⊗ X
and 1 ⊗ i ⊗ X (X ∈ k0), so the Lie algebra gc as in (28.2) will be spanned by
vectors of the form 1 ⊗ 1 ⊗ X and i ⊗ i ⊗ X, and the Lie algebra k is 1 ⊗ 1 ⊗ k0.
Thus, with e and f as in (28.4), gc is the R-linear span of e ⊗ k0 and f ⊗ k0.
We can identify

gc = e ⊗ k0 ⊕ f ⊗ k0 ≅ k0 ⊕ k0.

The involution θ interchanges these two components, and since 1 ⊗ 1 = e + f,
k = 1 ⊗ 1 ⊗ k0 ≅ k0 is embedded diagonally in k0 ⊕ k0.

From this description, we see that gc is the Lie algebra of K0 × K0, which we
take to be the group Gc. The involution θ : K0 × K0 −→ K0 × K0 is θ(x, y) = (y, x),
and K0 is embedded diagonally. This differs from the description of the compact
symmetric space of Type II in Example 28.2, but it is equivalent. We may see
this as follows. We can map K0 −→ Gc/K0 by x −→ (x, 1)K0. The involution
sends this to (1, x)K0 = (x⁻¹, 1)K0 since (x, x) ∈ K0 embedded diagonally.
Thus, if we represent the cosets of Gc/K0 this way, the symmetric space is
parameterized by K0, and the involution corresponds to x −→ x⁻¹. □
If G/K and Gc/K are noncompact and compact symmetric spaces in
duality, and if G/K and Gc/K are not of Types IV and II, they are said
to be of Types III and I, respectively.

Theorem 28.2. Let G be a noncompact, connected semisimple Lie group with
an involution θ satisfying Hypothesis 28.1. Then K is a maximal compact sub-
group of G. Indeed, if K′ is any compact subgroup of G, then K′ is conjugate
to a subgroup of K.

Proof. This follows from Helgason [66], Theorem 2.1 of Chap. VI on page
246. (Note the hypothesis that K be compact in our Proposition 28.1.) The
proof in [66] depends on showing that G/K is a space of negative
curvature. A compact group of isometries of such a space has a fixed point
([66], Theorem 13.1 of Chap. I on page 75). Now if K′ fixes xK ∈ G/K, then
x⁻¹K′x ⊆ K. □
A semisimple real Lie algebra g is compact if and only if the Killing form
is negative definite. If this is the case, then ad(g) is contained in the Lie
algebra of the compact orthogonal group with respect to this negative definite
quadratic form, and it follows that g is the Lie algebra of a compact Lie group.
A semisimple Lie algebra is simple if it has no proper nontrivial ideals.
Theorem 28.3. If g is a noncompact semisimple real Lie algebra, then there
exists a noncompact Lie group G with Lie algebra g and a Cartan involution θ
of G whose fixed-point set is a maximal compact subgroup K of G, so that G/K
is a symmetric space of noncompact type. In particular, Hypothesis 28.1 is
satisfied. If g is simple, then G/K is irreducible, and this construction gives a
one-to-one correspondence between the noncompact simple real Lie algebras
and the irreducible symmetric spaces of noncompact type.
Although we will not need this fact, it is very striking that the classifica-
tion of irreducible symmetric spaces of noncompact type is the same as the
classification of noncompact real forms of the semisimple Lie algebras.
Proof. It follows from Helgason [66], Chap. III, Theorem 6.4 on p. 181, that
g has a compact form; that is, a compact Lie algebra gc with an isomorphic
complexification. It follows from Theorems 7.1 and 7.2 in Chap. III of [66]
that we may arrange things so that gc = k + ip and g = k + p, where k and p
are the +1 and −1 eigenspaces of a Cartan involution θ, and that this Cartan
involution is essentially unique. Let Gc be the adjoint group of gc; that is, the
group generated by exponentials of endomorphisms ad(X) with X ∈ gc. It is
a compact Lie group with Lie algebra gc—see Helgason [66], Chap. II, Section
5. Thus, Gc is a group of linear transformations of gc, but we extend them to
complex linear transformations of gC, and so Gc and the other groups G, GC,
and K that we will construct will all be subgroups of GL(gC). Let GC be the
complexification of Gc. The conjugation of gC with respect to g induces an
automorphism of GC as a real Lie group with a fixed-point set that can be
taken to be G. The Cartan involution θ induces an involution of G with a
fixed-point set K that is a subgroup with Lie algebra k. □
In Table 28.1, we give the classification of Cartan [31] of the Type I and
Type III symmetric spaces. (The symmetric spaces of Type II and Type IV,
as we have already seen, correspond to complex semisimple Lie algebras.)
In Table 28.1, the group SO∗(2n) consists of all elements of SO(2n, C) that
stabilize the skew-Hermitian form

x1x̄n+1 + x2x̄n+2 + · · · + xnx̄2n − xn+1x̄1 − xn+2x̄2 − · · · − x2nx̄n.
The subgroups S(O(p) × O(q)) and S(U(p) × U(q)) are the subgroups of
O(p) × O(q) and U(p) × U(q) consisting of elements of determinant 1. Cartan
considered the special cases q = 1 significant enough to warrant independent
classifications. The group S(O(p) × O(1)) ≅ O(p), and we have written K this
way for types BII and DII.

For the exceptional groups, we have only described the Lie algebra of the
maximal compact subgroup. We have given the real form from the classifica-
tion of Tits [162]. In this classification, 2E6,2^16 = iE6,r^d, for example, where i,
d, and r are numbers whose significance we will briefly discuss. They will all
reappear in the next chapter.

The number i = 1 if the group is an inner form and 2 if it is an outer
form. As we mentioned in Remark 24.1, real forms of Gc are parameterized by
elements of H1(Gal(C/R), Aut(GC)). If the defining cocycle is in the image of
Table 28.1. Real forms and Type I and Type III symmetric spaces

Cartan's class | G                              | Gc          | K° or k         | Dim.         | Rank      | Absolute / relative root systems
AI    | SL(n, R)                       | SU(n)       | SO(n)           | (n−1)(n+2)/2 | n−1       | An−1 / An−1
AII   | SL(n, H)                       | SU(2n)      | Sp(2n)          | (n−1)(2n+1)  | n−1       | A2n−1 / An−1
AIII  | SU(p, q), p, q > 1             | SU(p + q)   | S(U(p) × U(q))  | 2pq          | min(p, q) | Ap+q−1 / BCq (p > q), Cp (p = q)
AIV   | SU(p, 1)                       | SU(p + 1)   | S(U(p) × U(1))  | 2p           | 1         | Ap / BC1
BI    | SO(p, q), p, q > 1, p + q odd  | SO(p + q)   | S(O(p) × O(q))  | pq           | min(p, q) | B(p+q−1)/2 / Bq (p > q), Dp (p = q)
BII   | SO(p, 1), p + 1 odd            | SO(p + 1)   | O(p)            | p            | 1         | Bp/2 / B1
DI    | SO(p, q), p, q > 1, p + q even | SO(p + q)   | S(O(p) × O(q))  | pq           | min(p, q) | D(p+q)/2 / Bq (p > q), Dp (p = q)
DII   | SO(p, 1), p + 1 even           | SO(p + 1)   | O(p)            | p            | 1         | D(p+1)/2 / A1
DIII  | SO∗(2n)                        | SO(2n)      | U(n)            | n(n − 1)     | m = [n/2] | Dn / Cm (n = 2m), BCm (n = 2m + 1)
CI    | Sp(2n, R)                      | Sp(2n)      | U(n)            | n(n + 1)     | n         | Cn / Cn
CII   | Sp(2p, 2q)                     | Sp(2p + 2q) | Sp(2p) × Sp(2q) | 4pq          | min(p, q) | Cp+q / BCq (p > q), Cp (p = q)
EI    | 1E6,6^0                        | E6          | sp(8)           | 42           |           | E6 / E6
EII   | 2E6,4^2                        | E6          | su(6) × su(2)   | 40           |           | E6 / F4
EIII  | 2E6,2^16                       | E6          | so(10) × u(1)   | 32           |           | E6 / BC2
EIV   | 1E6,2^28                       | E6          | f4              | 26           |           | E6 / A2
EV    | E7,7^0                         | E7          | su(8)           | 70           |           | E7 / E7
EVI   | E7,4^9                         | E7          | so(12) × su(2)  | 64           |           | E7 / F4
EVII  | E7,3^28                        | E7          | e6 × u(1)       | 54           |           | E7 / C3
EVIII | E8,8^0                         | E8          | so(16)          | 128          |           | E8 / E8
EIX   | E8,4^28                        | E8          | e7 × su(2)      | 112          |           | E8 / F4
FI    | F4,4^0                         | F4          | sp(6) × su(2)   | 28           |           | F4 / F4
FII   | F4,1^21                        | F4          | so(9)           | 16           |           | F4 / BC1
G     | G2,2^0                         | G2          | su(2) × su(2)   | 8            |           | G2 / G2
H1(Gal(C/R), Inn(Gc)) −→ H1(Gal(C/R), Aut(Gc)),

where Inn(Gc) is the group of inner automorphisms, then the group is an inner
form. Looking ahead to the next chapter, where we introduce the Satake dia-
grams, G is an inner form if and only if the symmetry of the Satake diagram,
corresponding to the permutation α −→ −θ(α) of the relative root system, is
trivial. Thus, from Fig. 29.3, we see that SO(6, 6) is an inner form, but the
quasisplit group SO(7, 5) is an outer form. For the exceptional groups, only
E6 admits an outer automorphism (corresponding to the nontrivial automor-
phism of its Dynkin diagram). Thus, for the other exceptional groups, the
parameter i is omitted from the notation.
The number r is the (real) rank, that is, the dimension of the group A = exp(a),
where a is a maximal Abelian subspace of p. The number d is the dimension
of the anisotropic kernel M, the maximal compact subgroup of the
centralizer of A. Both A and M will play an extensive role in the next chapter.
We have listed the rank for the groups of classical type but not the exceptional
ones, since for those the rank is contained in the Tits classification.
For classification matters we recommend Tits [162] supplemented by Borel
[20]. The definitive classification in this paper, from the point of view of alge-
braic groups, includes not only real groups but also groups over p-adic fields,
number fields, and finite fields. Knapp [106], Helgason [66], Onishchik and
Vinberg [166], and Satake [144] are also very useful.

Example 28.4. Consider SL(2, R)/SO(2) and SU(2)/SO(2). Unlike the general
case of SL(n, R)/SO(n) and SU(n)/SO(n), these two symmetric spaces have
complex structures. Specifically, SL(2, R) acts transitively on the Poincaré
upper half-plane H = {z = x + iy | x, y ∈ R, y > 0} by linear fractional
transformations:

SL(2, R) ∋ ( a b ; c d ) : z −→ (az + b)/(cz + d).
The stabilizer of the point i ∈ H is SO(2), so we may identify H with
SL(2, R)/SO(2). Equally, let R be the Riemann sphere C ∪ {∞}, which is
the same as the complex projective line P1 (C). The group SU(2) acts transi-
tively on R, also by linear fractional transformations:
 
SU(2) ∋ ( a b ; −b̄ ā ) : z −→ (az + b)/(−b̄z + ā),   |a|² + |b|² = 1.

Again, the stabilizer of i is SO(2), so we may identify SU(2)/SO(2) with R.
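Both actions are easy to check numerically. The sketch below (plain Python; the helper name `mobius` and the sample matrices are our own choices, not from the text) verifies that a determinant-one real matrix preserves the upper half-plane, that SO(2) fixes i, and spot-checks an element of SU(2) of the displayed shape.

```python
import math

def mobius(g, z):
    """Apply the linear fractional transformation z -> (az + b)/(cz + d)."""
    (a, b), (c, d) = g
    return (a * z + b) / (c * z + d)

# A sample element of SL(2, R): determinant 2*2 - 3*1 = 1
g = ((2.0, 3.0), (1.0, 2.0))
z = 0.5 + 2.0j
w = mobius(g, z)
assert w.imag > 0                       # the upper half-plane is preserved
# for real matrices of determinant 1, Im g(z) = Im z / |cz + d|^2
assert abs(w.imag - z.imag / abs(1.0 * z + 2.0) ** 2) < 1e-12

# the rotation group SO(2) stabilizes the point i
t = 0.7
k = ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))
assert abs(mobius(k, 1j) - 1j) < 1e-12

# an SU(2) element (a, b; -conj(b), conj(a)) with |a|^2 + |b|^2 = 1
a, b = 0.6 + 0.48j, 0.512 + 0.384j
assert abs(abs(a) ** 2 + abs(b) ** 2 - 1) < 1e-12
su2 = ((a, b), (-b.conjugate(), a.conjugate()))
inv = ((a.conjugate(), -b), (b.conjugate(), a))     # its matrix inverse
p = 0.3 + 0.1j
assert abs(mobius(su2, mobius(inv, p)) - p) < 1e-12  # a genuine group action
```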

Both H and R are naturally complex manifolds, and the action of SL(2, R)
or SU(2) consists of holomorphic mappings. They are examples of Hermitian
symmetric spaces, which we now define. A Hermitian manifold is the complex
analog of a Riemannian manifold. A Hermitian manifold is a complex manifold
on which there is given a (smoothly varying) positive definite Hermitian inner
product on each tangent space (which has a complex structure because the

space is a complex manifold). The real part of this positive definite Hermitian
inner product is a positive definite symmetric bilinear form, so a Hermitian
manifold becomes a Riemannian manifold. A real-valued symmetric bilinear
form B on a complex vector space V is the real part of a positive definite
Hermitian form H if and only if it satisfies

B(iX, iY ) = B(X, Y ),

for if this is true it is easy to check that

H(X, Y ) = B(X, Y ) − iB(iX, Y )

is the unique positive definite Hermitian form with real part B. From this
remark, a complex manifold is Hermitian by our definition if and only if it is
Hermitian by the definition in Helgason [66].
A symmetric space X is called Hermitian if it is a Hermitian manifold that
is homogeneous with respect to a group of holomorphic Hermitian isometries
that is connected and contains the geodesic-reversing reflection around each
point. Thus, if X = G/K, the group G consists of holomorphic mappings,
and if g(x) = y for x, y ∈ X, g ∈ G, then g induces an isometry between the
tangent spaces at x and y.
The irreducible Hermitian symmetric spaces can easily be recognized by
the following criterion.
Proposition 28.3. Let X = G/K and Xc = Gc /K be a pair of irreducible
symmetric spaces in duality. If one is a Hermitian symmetric space, then they
both are. This will be true if and only if the center of K is a one-dimensional
central torus Z. In this case, the rank of Gc equals the rank of K.
In a nutshell, if K has a one-dimensional central torus, then there exists a
homomorphism of T into the center of K. The image of T induces a group
of isometries of X fixing the base point x0 ∈ X corresponding to the coset
of K. The content of the proposition is that X may be given the structure of
a complex manifold in such a way that the maps on the tangent space at x0
induced by this family of isometries correspond to multiplication by T, which
is regarded as a subgroup of C× .
Proof. See Helgason [66], Theorem 6.1 and Proposition 6.2, or Wolf [176],
Corollary 8.7.10, for the first statement. The latter reference has two other
very interesting conditions for the space to be symmetric. The fact that Gc
and K are of equal rank is contained in Helgason [66] in the first paragraph
of “Proof of Theorem 7.1 (ii), algebraic part” on p. 383.

A particularly important family of Hermitian symmetric spaces are the
Siegel upper half-spaces Hn, also known as Siegel spaces, which generalize the
Poincaré upper half-plane H = H1 . We will discuss this family of examples in
considerable detail since many features of the general case are already present
in this example and are perhaps best understood with an example in mind.

In this chapter, if F is a field (always R or C), the symplectic group is

Sp(2n, F ) = { g ∈ GL(2n, F ) | g J ᵗg = J } ,   J = ( 0 −In ; In 0 ) .
 
Write g = ( A B ; C D ), where A, B, C, and D are n × n blocks. Multiplying out
the condition g J ᵗg = J gives the conditions

A · ᵗB = B · ᵗA,   C · ᵗD = D · ᵗC,   A · ᵗD − B · ᵗC = I,   D · ᵗA − C · ᵗB = I.   (28.5)

The condition g J ᵗg = J may be expressed as (gJ)(ᵗgJ) = −I, so gJ and ᵗgJ
are negative inverses of each other. From this, we see that ᵗg is also symplectic,
and so (28.5) applied to ᵗg gives the further relations

ᵗA · C = ᵗC · A,   ᵗB · D = ᵗD · B,   ᵗA · D − ᵗC · B = I,   ᵗD · A − ᵗB · C = I.   (28.6)

If A + iB ∈ U(n), where the matrices A and B are real, then A · ᵗA + B · ᵗB = I
and A · ᵗB = B · ᵗA. Thus, if we take D = A and C = −B, then (28.5) is
satisfied. Thus,

A + iB −→ ( A B ; −B A )   (28.7)

maps U(n) into Sp(2n, R) and is easily checked to be a homomorphism.
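The embedding (28.7) can be verified numerically for a sample unitary matrix. In this sketch (plain Python; the helper names and the choice of U are ours), we take a 2 × 2 unitary U, form the 4 × 4 block matrix, and confirm the symplectic condition g J ᵗg = J.

```python
import math

def matmul(X, Y):
    """Product of rectangular matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

# A sample unitary matrix U = (1/sqrt 2) [[1, i], [i, 1]]
s = 1 / math.sqrt(2)
U = [[s, s * 1j], [s * 1j, s]]
A = [[z.real for z in row] for row in U]   # A = Re U
B = [[z.imag for z in row] for row in U]   # B = Im U

# the embedding (28.7): A + iB -> [[A, B], [-B, A]]
g = [A[0] + B[0], A[1] + B[1],
     [-x for x in B[0]] + A[0], [-x for x in B[1]] + A[1]]

# J = [[0, -I], [I, 0]] for n = 2
J = [[0, 0, -1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]]

# g J g^t = J, so the image really lies in Sp(4, R)
gJgt = matmul(matmul(g, J), transpose(g))
assert all(abs(gJgt[i][j] - J[i][j]) < 1e-12 for i in range(4) for j in range(4))
```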
If W is a Hermitian matrix, we write W > 0 if W is positive definite.
If Ω ⊂ Rr is any connected open set, we can form the tube domain over
Ω. This is the set of all elements of Cr that have imaginary parts in Ω. Let
Hn be the tube domain over Pn (R). Thus, Hn is the space of all symmetric
complex matrices Z = X + iY where X and Y are real symmetric matrices
such that Y > 0.
 
Proposition 28.4. If Z ∈ Hn and g = ( A B ; C D ) ∈ Sp(2n, R), then CZ + D
is invertible. Define

g(Z) = (AZ + B)(CZ + D)⁻¹ .   (28.8)

Then g(Z) ∈ Hn, and (28.8) defines an action of Sp(2n, R) on Hn. The action
is transitive, and the stabilizer of iIn ∈ Hn is the image of U(n) under the
embedding (28.7). If W is the imaginary part of g(Z), then

W = (Z̄ ᵗC + ᵗD)⁻¹ Y (CZ + D)⁻¹ .   (28.9)

Proof. Using (28.6), one easily checks that

(1/2i) [ (Z̄ ᵗC + ᵗD)(AZ + B) − (Z̄ ᵗA + ᵗB)(CZ + D) ] = (1/2i)(Z − Z̄) = Y,   (28.10)

where Y is the imaginary part of Z. From this it follows that CZ + D is
invertible, since if it had a nonzero nullvector v, then we would have ᵗv̄ Y v = 0,
which is impossible since Y > 0.
Therefore, we may make the definition (28.8). To check that g(Z) is symmetric,
the identity g(Z) = ᵗg(Z) is equivalent to

(Z ᵗC + ᵗD)(AZ + B) = (Z ᵗA + ᵗB)(CZ + D) ,

which is easily confirmed using (28.5) and (28.6).


Next we show that the imaginary part W of g(Z) is positive definite.
Indeed, W equals (1/2i)( g(Z) − conj(g(Z)) ). Using the fact that g(Z) is symmetric and
(28.10), this is

(1/2i) [ (AZ + B)(CZ + D)⁻¹ − (Z̄ ᵗC + ᵗD)⁻¹(Z̄ ᵗA + ᵗB) ] .

Simplifying this gives (28.9). From this it is clear that W is Hermitian and that
W > 0. It is real, of course, though that is less clear from this expression. Since
it is real and positive definite Hermitian, it is a positive definite symmetric
matrix.
It is easy to check that g(g′(Z)) = (gg′)(Z), so this is a group action.
To show that this action is transitive, we note that if Z = X + iY ∈ Hn, then

( I −X ; 0 I ) ∈ Sp(2n, R) ,

and this matrix takes Z to iY. Now if h ∈ GL(n, R), then

( h 0 ; 0 ᵗh⁻¹ ) ∈ Sp(2n, R),

and this matrix takes iY to iY′, where Y′ = h Y ᵗh. Since Y > 0, we may
choose h so that Y′ = I. This shows that any element in Hn may be moved
to iIn, and the action is transitive.
To check that U(n) is the stabilizer of iIn is quite easy, and we leave it to
the reader.
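The action (28.8) is easy to test numerically. The following sketch (plain Python; all helper names are ours) takes n = 2 and the symplectic element J itself, which acts by Z −→ −Z⁻¹, and checks that the image is symmetric with positive definite imaginary part agreeing with (28.9).

```python
def cmatmul(X, Y):  # product of 2x2 complex matrices
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cinv(M):  # inverse of a 2x2 complex matrix
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# A point of H_2: symmetric, with positive definite imaginary part Y
Z = [[1 + 2j, 0.5 + 0.3j], [0.5 + 0.3j, -1 + 1j]]
Y = [[z.imag for z in row] for row in Z]
assert Y[0][0] > 0 and Y[0][0] * Y[1][1] - Y[0][1] * Y[1][0] > 0

# The element J = (0, -I; I, 0) of Sp(4, R) acts by Z -> -Z^{-1}
W = [[-z for z in row] for row in cinv(Z)]

# g(Z) is again symmetric ...
assert abs(W[0][1] - W[1][0]) < 1e-12

# ... and (28.9) with C = I, D = 0 predicts Im g(Z) = conj(Z)^{-1} Y Z^{-1}
Zbar = [[z.conjugate() for z in row] for row in Z]
rhs = cmatmul(cmatmul(cinv(Zbar), [[complex(y) for y in row] for row in Y]),
              cinv(Z))
Wim = [[z.imag for z in row] for row in W]
assert all(abs(Wim[i][j] - rhs[i][j].real) < 1e-12 and abs(rhs[i][j].imag) < 1e-12
           for i in range(2) for j in range(2))

# Im g(Z) is again positive definite, so J maps H_2 into itself
assert Wim[0][0] > 0 and Wim[0][0] * Wim[1][1] - Wim[0][1] * Wim[1][0] > 0
```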

Example 28.5. By Proposition 28.4, we can identify Hn with Sp(2n, R)/U(n).
The fact that it is a Hermitian symmetric space is in accord with Proposi-
tion 28.3, since U(n) has a central torus. In the notation of Proposition 28.1,
if G = Sp(2n, R) and K = U(n) are embedded via (28.7), then the compact
group Gc is Sp(2n), where as usual Sp(2n) denotes Sp(2n, C) ∩ U(2n).
We will investigate the relationship between Examples 28.1 and 28.5 using
a fundamental map, the Cayley transform. For clarity, we introduce this first
in the more familiar context of the Poincaré upper half-plane (Example 28.4),
which is a special case of Example 28.5.
We observe that the action of Gc = SU(2) on the compact dual Xc =
SU(2)/SO(2) can be extended to an action of GC = SL(2, C). Indeed, if we

identify Xc with the Riemann sphere R, then the action of SU(2) was by
linear fractional transformations and so is the extension to SL(2, C).
As a consequence, we have an action of G = SL(2, R) on Xc since G ⊂ GC
and GC acts on Xc . This is just the action by linear fractional transformations
on R = C ∪ {∞}. There are three orbits: H, the projective real line P1 (R) =
R ∪ {∞}, and the lower half-plane H.
The Cayley transform is the element c ∈ SU(2) given by

c = (1/√(2i)) ( 1 −i ; 1 i ) ,   so   c⁻¹ = (1/√(2i)) ( i i ; −1 1 ) .   (28.11)

Interpreted as a transformation of R, the Cayley transform takes H to the
unit disk

D = { w ∈ C | |w| < 1 }.

Indeed, if z ∈ H, then

c(z) = (z − i)/(z + i) ,
and since z is closer to i than to −i, this lies in D. The effect of the Cayley
transform is shown in Fig. 28.1.

Fig. 28.1. The Cayley transform

The significance of the Cayley transform is that it relates a bounded


symmetric domain D to an unbounded one, H. We will use both H and D
together when thinking about the boundary of the noncompact symmetric
space embedded in its compact dual.
Since SL(2, R) acts on H, the group c SL(2, R) c−1 acts on c(H) = D. This
group is
  
SU(1, 1) = { ( a b ; b̄ ā ) | |a|² − |b|² = 1 } .
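Both facts, that c carries H into the unit disk and that conjugation by c carries SL(2, R) into SU(1, 1), are easy to confirm numerically; a minimal sketch (plain Python, helper names ours):

```python
import cmath

def cayley(z):
    """The Cayley transform c(z) = (z - i)/(z + i)."""
    return (z - 1j) / (z + 1j)

def mm(X, Y):  # 2x2 complex matrix product
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

# interior points of H go inside the unit disk, boundary points to the circle
for z in (1j, 2 + 0.5j, -3 + 0.01j):
    assert abs(cayley(z)) < 1
for x in (-2.0, 0.0, 5.0):
    assert abs(abs(cayley(x)) - 1) < 1e-12

# the matrices (28.11), with the normalization 1/sqrt(2i)
s = 1 / cmath.sqrt(2j)
c = [[s, -1j * s], [s, 1j * s]]
c_inv = [[1j * s, 1j * s], [-s, s]]
prod = mm(c, c_inv)
assert abs(prod[0][0] - 1) < 1e-12 and abs(prod[0][1]) < 1e-12

# conjugating an element of SL(2, R) by c lands in SU(1, 1)
g = [[2.0, 3.0], [1.0, 2.0]]          # determinant 1
m = mm(mm(c, g), c_inv)
a, b = m[0][0], m[0][1]
assert abs(m[1][0] - b.conjugate()) < 1e-12   # shape (a, b; conj b, conj a)
assert abs(m[1][1] - a.conjugate()) < 1e-12
assert abs(abs(a) ** 2 - abs(b) ** 2 - 1) < 1e-12
```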
The Cayley transform is generally applicable to Hermitian symmetric
spaces. It was shown by Cartan and Harish-Chandra that Hermitian symmet-
ric spaces could be realized on bounded domains in Cn . Piatetski-Shapiro [133]

gave unbounded realizations. Korányi and Wolf [111, 112] gave a completely
general theory relating bounded symmetric domains to unbounded ones by
means of the Cayley transform.
Now let us consider the Cayley transform for Hn . Let G = Sp(2n, R),
K = U(n), Gc = Sp(2n), and GC = Sp(2n, C). Let
   
c = (1/√(2i)) ( In −iIn ; In iIn ) ,   c⁻¹ = (1/√(2i)) ( iIn iIn ; −In In ) .
They are elements of Sp(2n). We will see that the scenario uncovered for
SL(2, R) extends to the symplectic group.
Our first goal is to show that Hn can be embedded in its compact dual,
a fact already noted when n = 1. The first step is to interpret Gc /K as an
analog of the Riemann sphere R, a space on which the actions of both groups
G and Gc may be realized as linear fractional transformations. Specifically,
we will construct a space Rn that contains a dense open subspace R◦n that
can be naturally identified with the vector space of all complex symmetric
matrices. What we want is for GC to act on Rn , and if g ∈ GC , with both
Z, g(Z) ∈ R◦n , then g(Z) is expressed in terms of Z by (28.8).
Toward the goal of constructing Rn, let

P = { ( h 0 ; 0 ᵗh⁻¹ ) ( I X ; 0 I ) | h ∈ GL(n, C), X ∈ Matn(C), X = ᵗX } .   (28.12)

This group is called the Siegel parabolic subgroup of Sp(2n, C). (The term
parabolic subgroup will be formally defined in Chap. 30.) We will define Rn
to be the quotient GC /P . Let us consider how an element of this space can
(usually) be regarded as a complex symmetric matrix, and the action of GC
is by linear fractional transformations as in (28.8).
Proposition 28.5. We have P Gc = Sp(2n, C) and P ∩ Gc = cKc−1 .
Proof. Indeed, P contains a Borel subgroup, the group B of matrices (28.12)
with h upper triangular, so P Gc = Sp(2n, C) follows from the Iwasawa
decomposition (Theorem 26.3). The group K is U(n) embedded via (28.7),
and it is easy to check that
  
cKc⁻¹ = { ( g 0 ; 0 ᵗg⁻¹ ) | g ∈ U(n) } .   (28.13)

It is clear that cKc−1 ⊆ P ∩ Sp(2n). To prove the converse inclusion, it is


straightforward to check that any unitary matrix in P is actually of the form
(28.13), and so P ∩ Gc ⊆ cKc−1 .


We define Rn = GC/P. We define R◦n to be the set of cosets gP, where
g = ( A B ; C D ) ∈ Sp(2n, C) and C is nonsingular.

Lemma 28.3. Suppose that

g = ( A B ; C D ) ,   g′ = ( A′ B′ ; C′ D′ )

are elements of GC. Then gP = g′P if and only if there exists a matrix h ∈
GL(n, C) such that Ah = A′ and Ch = C′. If C is invertible, then AC⁻¹ is a
complex symmetric matrix. In this case, a necessary and sufficient condition
for gP = g′P is that C′ is also invertible and that AC⁻¹ = A′(C′)⁻¹.

Proof. Most of this is safely left to the reader. We only point out the reason
that AC⁻¹ is symmetric. By (28.6), the matrix ᵗCA is symmetric, so
ᵗC⁻¹ · ᵗCA · C⁻¹ = AC⁻¹ is also.


By Lemma 28.3, the map σ from the vector space of n × n complex symmetric
matrices to R◦n defined by

σ(Z) = ( Z −I ; I 0 ) P

is a bijection, and we can write

σ(Z) = ( A B ; C D ) P

if and only if AC⁻¹ = Z.
 
AB
Proposition 28.6. If σ(Z) and g σ(Z) are both in R◦n , where g =
CD
is an element of Sp(2n, C), then CZ + D is invertible and

g σ(Z) = σ (AZ + B)(CZ + D)−1 .

Proof. We have
    
A B Z −I AZ + B −A
g σ(Z) = P = P .
C D I CZ + D −C

Since we are assuming this is in R◦n, the matrix CZ + D is invertible by
Lemma 28.3, and this equals σ( (AZ + B)(CZ + D)⁻¹ ).

In view of Proposition 28.6 we will identify R◦n with the vector space of
complex symmetric matrices by means of σ. Thus, elements of R◦n become for
us complex symmetric matrices, and the action of Sp(2n, C) is by linear
fractional transformations.
We can also identify Rn with the compact symmetric space Gc /K by
means of the composition of bijections

Gc /K −→ Gc /cKc−1 −→ GC /P = Rn .

The first map is induced by conjugation by c ∈ Gc . The second map is induced


by the inclusion Gc −→ GC and is bijective by Proposition 28.5, so we may
regard the embedding of Hn into Rn as an embedding of a noncompact sym-
metric space into its compact dual.
So far, the picture is extremely similar to the case where n = 1. We now
come to an important difference. In the case of SL(2, R), the topological
boundary of H (or D) in R was just a circle consisting of a single orbit of
SL(2, R) or even its maximal compact subgroup SO(2).
When n  2, however, the boundary consists of a finite number of orbits,
each of which is the union of smaller pieces called boundary components.
Each boundary component is a copy of a Siegel space of lower dimension.
The boundary components are infinite in number, but each is a copy of one
of a finite number of standard ones. Since the structure of the boundary is
suddenly interesting when n  2, we will take a closer look at it. For more
information about boundary components, which are important in the theory
of automorphic forms and the algebraic geometry of arithmetic quotients such
as Sp(2n, Z)\Hn , see Ash, Mumford, Rapoport, and Tai [11], Baily [13], Baily
and Borel [14], and Satake [142, 144].
The first step is to map Hn into a bounded region. Writing Z = X + iY ,
where X and Y are real symmetric matrices, Z ∈ Hn if and only if Y > 0. So Z
is on the boundary if Y is positive semidefinite yet has 0 as an eigenvalue. The
multiplicity of 0 as an eigenvalue separates the boundary into several pieces
that are separate orbits of G. (These are not the boundary components, which
we will meet presently.)
If we embed Hn into Rn , a portion of the border is at “infinity”; that is, it
is in Rn − R◦n . We propose to examine the border by applying c, which maps
Hn into a region with a closure that is wholly contained in Rn .
Proposition 28.7. The image of Hn under c is

Dn = { W ∈ R◦n | I − W̄W > 0 }.

The group c Sp(2n, R) c⁻¹, acting on Dn by linear fractional transformations,
consists of all symplectic matrices of the form

( A B ; B̄ Ā ) .   (28.14)

(Note that, since W is symmetric, I − W̄W is Hermitian.)


Proof. The condition on W to be in c(Hn) is that the imaginary part of

c⁻¹(W) = −i(W − I)(W + I)⁻¹

be positive definite. This imaginary part is

Y = −½ [ (W − I)(W + I)⁻¹ + (W̄ − I)(W̄ + I)⁻¹ ] = −½ [ (W − I)(W + I)⁻¹ + (W̄ + I)⁻¹(W̄ − I) ] ,

where we have used the fact that both W and (W − I)(W + I)⁻¹ are symmetric
to rewrite the second term. This will be positive definite if and only if
(W̄ + I)Y(W + I) is positive definite. This equals

−½ [ (W̄ + I)(W − I) + (W̄ − I)(W + I) ] = I − W̄W .

Since Sp(2n, R) maps Hn into itself, c Sp(2n, R) c⁻¹ maps Dn = c(Hn)
into itself. We have only to justify the claim that this group consists of the
matrices of form (28.14). For g = ( A B ; C D ) ∈ Sp(2n, C) to have the property
that c⁻¹ g c is real, we need

c⁻¹ ( A B ; C D ) c = c̄⁻¹ ( Ā B̄ ; C̄ D̄ ) c̄ ,   where   c c̄⁻¹ = −i ( 0 In ; In 0 ) .

This implies that C = B̄ and D = Ā.



Proposition 28.8. (i) The closure of Dn is contained within R◦n. The
boundary of Dn consists of all complex symmetric matrices W such that
I − W̄W is positive semidefinite but such that det(I − W̄W) = 0.
(ii) If W and W′ are points of the closure of Dn in Rn that are congruent
modulo c G c⁻¹, then the ranks of I − W̄W and I − W̄′W′ are equal.
(iii) Let W be in the closure of Dn, and let r be the rank of I − W̄W. Then
there exists g ∈ c G c⁻¹ such that g(W) has the form

( Wr 0 ; 0 In−r ) ,   Wr ∈ Dr .   (28.15)

Proof. The diagonal entries in W̄W are the squares of the lengths of the rows
of the symmetric matrix W. If I − W̄W is positive definite, these must be less
than 1. So Dn is a bounded domain within the set R◦n of symmetric complex
matrices. The rest of (i) is also clear.
For (ii), if g ∈ c G c⁻¹, then by Proposition 28.7 the matrix g has the form
(28.14), so W′ = (AW + B)(B̄W + Ā)⁻¹. Using the fact that both W and W′
are symmetric, we have

I − W̄′W′ = I − (W̄ ᵗB + ᵗA)⁻¹(W̄ ᵗĀ + ᵗB̄)(AW + B)(B̄W + Ā)⁻¹ .

Both W and W′ are in R◦n, so by Proposition 28.6 the matrix B̄W + Ā is
invertible. Therefore, the rank of I − W̄′W′ is the same as the rank of

(W̄ ᵗB + ᵗA)(I − W̄′W′)(B̄W + Ā) = (W̄ ᵗB + ᵗA)(B̄W + Ā) − (W̄ ᵗĀ + ᵗB̄)(AW + B) .

Using the relations (28.6), this equals I − W̄W.


To prove (iii), note first that if u ∈ U(n) ⊂ c G c−1 , then
 
u
c G c−1  : W −→ u W t u.


Taking u to be a scalar, we may assume that −1 is not an eigenvalue of W .


Then W + I is nonsingular so Z = c−1 W = −i(W − I)(W + I)−1 ∈ R◦n . Since
Z is in the closure of H, it follows that Z = X + iY , where X and Y are real
symmetric and Y is positive semidefinite. There exists an orthogonal matrix
k such that D = kY k −1 is diagonal with nonnegative eigenvalues. Now
  
k I −X
γ(Z) = iD, γ= ∈ G.
k I

Thus, denoting W′ = cγc⁻¹(W),

W′ = c(iD) = (D − I)(D + I)⁻¹ .

Like D, the matrix W′ is diagonal, and its diagonal entries equal to −1
correspond to the diagonal entries of D equal to 0. These correspond to diagonal
entries of I − W̄′W′ equal to 0, so the diagonal matrices D and I − W̄′W′
have the same rank. But by (ii), the ranks of I − W̄W and I − W̄′W′ are
equal, so the rank of D is r. Multiplying by a scalar once more, we may
replace W′ by −W′, which has the special form (28.15).
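For a concrete boundary point already in the normal form (28.15), the rank computation of Proposition 28.8 is immediate; a minimal numeric check (n = 2, r = 1; the variable names are ours):

```python
w = 0.3 + 0.4j                        # a point of D_1: |w| < 1
assert abs(w) < 1

# the normal form (28.15) with n = 2, r = 1: W = diag(w, 1)
W = [[w, 0], [0, 1]]

# for a diagonal W, I - conj(W) W is diagonal with entries 1 - |W_ii|^2
diag = [1 - abs(W[i][i]) ** 2 for i in range(2)]
assert abs(diag[0] - 0.75) < 1e-12 and abs(diag[1]) < 1e-12

# positive semidefinite of rank 1 and singular: a boundary point of D_2
rank = sum(1 for d in diag if d > 1e-12)
assert rank == 1
```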

Now let us fix r < n and consider

Br = { ( Wr 0 ; 0 In−r ) | Wr ∈ Dr } .

By Proposition 28.7, the subgroup of c G c⁻¹ of matrices of the form

⎛ Ar   0     Br   0    ⎞
⎜ 0    In−r  0    0    ⎟
⎜ B̄r   0     Ār   0    ⎟
⎝ 0    0     0    In−r ⎠

is isomorphic to Sp(2r, R), and Br is homogeneous with respect to this sub-


group. Thus, Br is a copy of the lower-dimensional Siegel space Dr embedded
into the boundary of Dn .
Any subset of the boundary that is congruent to a Br by an element of
cGc−1 is called a boundary component. There are infinitely many boundary
components, but each of them resembles one of these standard ones. The
closure of a boundary component is a union of lower-dimensional boundary
components. Now let us consider the union of the zero-dimensional boundary
components, that is, the set of all elements equivalent to B0 = {In }.
By Proposition 28.8, it is clear that this set is characterized by the vanishing
of I − W̄W. In other words, this is the set En (R).
If D is a bounded convex domain in Cr , homogeneous under a group G
of holomorphic mappings, the Bergman–Shilov boundary of D is the unique
minimal closed subset B of the topological boundary ∂D such that a function
holomorphic on D and continuous on its closure takes its maximum (in
absolute value) on B. See Korányi and Wolf [112] for further information, including
the fact that a bounded symmetric domain must have such a boundary.

Theorem 28.4. The domain Dn has En (R) as its Bergman–Shilov boundary.

Proof. Let f be a holomorphic function on Dn that is continuous on its closure.


We will show that f takes its maximum on the set En (R). This is sufficient
because G acts transitively on En (R), so the set En (R) cannot be replaced by
any strictly smaller subspace with the same maximizing property.
Suppose x ∈ Dn maximizes |f |. Let B be the boundary component con-
taining x, so B is congruent to some Br . If r > 0, then noting that the
restriction of f to B is a holomorphic function there, the maximum modulus
principle implies that f is constant on B and hence |f | takes the same maxi-
mum value on the boundary of B, which intersects En (R).


We now see that both the dual symmetric spaces Pn (R) and En (R) appear
in connection with Hn . The construction of Hn involved building a tube
domain over the cone Pn (R), while the dual En (R) appeared as the Bergman–
Shilov boundary. (Since Pn◦(R) and En◦(R) are in duality, it is natural to
extend the notion of duality to the reducible symmetric spaces Pn(R) and
En(R) and to say that these are in duality.)
This scenario repeats itself: there are four infinite families and one isolated
example of Hermitian symmetric spaces that appear as tube domains over
cones. In each case, the space can be mapped to a bounded symmetric domain
by a Cayley transform, and the compact dual of the cone appears as the
Bergman–Shilov boundary of the domain. These statements follow from the work
of Koecher, Vinberg, and Piatetski-Shapiro [133], culminating in Korányi and
Wolf [111, 112].
Let us describe this setup in a bit more detail. Let V be a real vector space
with an inner product ⟨ , ⟩. A cone C ⊂ V is a convex open set consisting of a
union of rays through the origin but not containing any line. The dual cone to
C is {x ∈ V | ⟨x, y⟩ > 0 for all y ∈ C}. If C is its own dual, it is naturally called
self-dual . It is called homogeneous if it admits a transitive automorphism
group.
A homogeneous self-dual cone is a symmetric space. It is not irreducible
since it is invariant under similitudes (i.e, transformations x −→ λx where
λ ∈ R× ). The orbit of a typical point under the commutator subgroup of
the group of automorphisms of the cone sits inside the cone, inscribed like a
hyperboloid, though this description is a little misleading since it may be the
constant locus of an invariant of degree > 2. For example, Pn◦ (R) is the locus
of det(x) = 1, and det is a homogeneous polynomial of degree n.
Homogeneous self-dual cones were investigated and applied to symmetric
domains by Koecher, Vinberg, and others. A Jordan algebra over a field F is
a nonassociative algebra over F with multiplication that is commutative and
satisfies the weakened associative law (ab)a² = a(ba²). The basic principle is
that if C ⊂ V is a self-dual convex cone, then V can be given the structure of
a Jordan algebra in such a way that C becomes the set of squares in V .
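The Jordan identity can be spot-checked for the symmetrized matrix product a ∘ b = (ab + ba)/2 on symmetric 2 × 2 matrices, which is the standard way a Jordan algebra arises from an associative one (a plain-Python sketch; names ours):

```python
def mm(X, Y):  # 2x2 matrix product
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def jordan(a, b):
    """The Jordan product a o b = (ab + ba)/2."""
    p, q = mm(a, b), mm(b, a)
    return [[(p[i][j] + q[i][j]) / 2 for j in range(2)] for i in range(2)]

# Two symmetric matrices; the ordinary product is not symmetric in general,
# but the Jordan product is
a = [[1.0, 2.0], [2.0, -1.0]]
b = [[0.0, 1.0], [1.0, 3.0]]
ab = jordan(a, b)
assert ab[0][1] == ab[1][0]

# the weakened associative law (a o b) o a^2 = a o (b o a^2)
a2 = jordan(a, a)
lhs = jordan(jordan(a, b), a2)
rhs = jordan(a, jordan(b, a2))
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-9 for i in range(2) for j in range(2))
```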

In addition to Satake [145] Chap. I, Sect. 8, see Ash, Mumford, Rapoport,


and Tai [11], Chap. II, for good explanations, including a discussion of the
boundary components of a self-dual cone.

Example 28.6. Let D = R, C, or H. Let d = 1, 2 or 4 be the real dimension


of D. Let Jn (D) be the set of Hermitian matrices in Matn (D), which is a
Jordan algebra. Let Pn (D) be the set of positive definite elements. It is a self-
dual cone of dimension n + (d/2)n(n − 1). It is a reducible symmetric space,
but the set of g ∈ Pn(D) such that multiplication by g, as an R-linear
transformation of Matn(D), has determinant 1 is an irreducible symmetric
space Pn◦(D) of dimension n + (d/2)n(n − 1) − 1. The dual En◦(D) is a compact
symmetric space.
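The dimension count in Example 28.6 agrees with the familiar closed forms for Hermitian matrices over R, C, and H; a quick integer sanity check:

```python
# dim P_n(D) = n + (d/2) n (n - 1) for D = R, C, H with d = 1, 2, 4
for n in range(1, 9):
    # real symmetric n x n matrices have dimension n(n + 1)/2
    assert n + (1 * n * (n - 1)) // 2 == n * (n + 1) // 2
    # complex Hermitian n x n matrices have real dimension n^2
    assert n + (2 * n * (n - 1)) // 2 == n * n
    # quaternionic Hermitian n x n matrices have real dimension n(2n - 1)
    assert n + (4 * n * (n - 1)) // 2 == n * (2 * n - 1)
```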

Example 28.7. The set defined by the inequality x0 > √(x1² + · · · + xn²) in Rn+1
is a self-dual cone, which we will denote P(n, 1). The group of automorphisms
is the group of similitudes for the quadratic form x0² − x1² − · · · − xn². The derived
group is SO(n, 1), and its homogeneous space P◦(n, 1) can be identified with
the orbit of (1, 0, . . . , 0), which is the locus of the hyperboloid x0² − x1² − · · · −
xn² = 1. The following special cases are worth noting: P(2, 1) ≅ P2(R) can be
identified with the Poincaré upper half-plane, P◦(3, 1) can be identified with
P2(C), and P◦(5, 1) can be identified with P2(H).

Example 28.8. The octonions or Cayley numbers are a nonassociative algebra
O over R of dimension 8. The automorphism group of O is the exceptional group
G2. The construction of Example 28.6 applied to D = O does not produce
a Jordan algebra if n > 3. If n ≤ 3, then Jn(O) is a Jordan algebra containing
a self-dual cone Pn(O). But P2(O) is the same as P(9, 1). Only the
27-dimensional exceptional Jordan algebra J3(O), discovered in 1947 by A. A.
Albert, produces a new cone P3(O). It contains an irreducible symmetric space
of codimension 1, P3◦(O), which is the locus of a cubic invariant. Let E3◦(O)
denote the compact dual. The Cartan classification of these 26-dimensional
symmetric spaces is EIV.

The tube domain H(C) over a self-dual cone C ⊂ V, consisting of all X + iY ∈
C ⊗ V with Y ∈ C, is a Hermitian symmetric space. These examples are extremely similar
C ⊗ V , is a Hermitian symmetric space. These examples are extremely similar
to the case of the Siegel space. For example, we can embed H(C) in its compact
dual R(C), which contains R◦ (C) = C⊗V as a dense open set. A Cayley trans-
form c : R(C) −→ R(C) takes H(C) into a bounded symmetric domain D(C),
the closure of which is contained in R◦ (C). The Bergman–Shilov boundary
can be identified with the compact dual of the (reducible) symmetric space
C, and its preimage under c consists of X + iY ∈ C ⊗ V with Y = 0, that is,
with the vector space V .
The nonassociative algebras O and J3 (O) are crucial in the construction
of the exceptional groups and Lie algebras. See Adams [3], Jacobson [88],
Onishchik and Vinberg [166] and Schafer [146] for constructions. Freudenthal
[50] observed a phenomenon involving some symmetric spaces known as the

magic square. Freudenthal envisioned a series of geometries over the algebras


R, C, H, and O, and found a remarkable symmetry, which we will present
momentarily. Tits [161] gave a uniform construction of the exceptional Lie
algebras that sheds light on the magic square. See also Allison [6]. The paper
of Baez [12] gives a useful survey of matters related to the exceptional groups,
including the magic square and much more. A recent paper on the magic
square, in the geometric spirit of Freudenthal’s original approach, is Landsberg
and Manivel [115]. And Tits’ theory of buildings ([163], [1]) had its roots in
his investigations of the geometry of the exceptional groups ([134]).
We will now take a look at the magic square. Let us denote R(C) as Rn(D) if
C = Pn(D). We associate with this C three groups Gn(D), G′n(D), and G′′n(D)
such that G′′n(D) ⊃ G′n(D) ⊃ Gn(D) and such that G′′n(D)/G′n(D) = Rn(D),
while G′n(D)/Gn(D) = En(D). Thus Gn(R) = SO(n) and G′n(R) = GL(n, R),
while G′′n(R) = Sp(2n, R).
These groups are tabulated in Fig. 28.2 together with the noncompact
duals that produce tube domains. Note that the symmetric spaces U(n) ×
U(n)/U(n) = U(n) and GL(n, C)/U(n) = Pn(C) of the center column are
of Types II and IV, respectively. The “magic” consists of the fact that the
square is symmetric.

D       | R         C            H      |  R          C          H
Gn(D)   | SO(n)     U(n)         Sp(2n) |  −          −          −
G′n(D)  | U(n)      U(n) × U(n)  U(2n)  |  GL(n, R)   GL(n, C)   GL(n, H)
G′′n(D) | Sp(2n)    U(2n)        SO(4n) |  Sp(2n, R)  GU(n, n)   SO(4n)∗

Fig. 28.2. The 3 × 3 square. Left: compact forms. Right: noncompact forms

We have the following numerology:

dim G′′n(D) + 2 dim Gn(D) = 3 dim G′n(D).   (28.16)

Indeed, dim G′′n(D) − dim G′n(D) is the dimension of the tube domain, and
this is twice the dimension dim G′n(D) − dim Gn(D) of the cone.
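The relation (28.16) can be confirmed from the standard dimension formulas dim SO(n) = n(n−1)/2, dim U(n) = n², dim Sp(2n) = n(2n+1), one column of Fig. 28.2 at a time (plain Python; the function names are ours):

```python
def so(n): return n * (n - 1) // 2    # dimension of SO(n)
def u(n):  return n * n               # dimension of U(n)
def sp(n): return n * (2 * n + 1)     # dimension of Sp(2n)

for n in range(1, 12):
    # D = R: G = SO(n), G' = GL(n, R) (dim n^2), G'' = Sp(2n, R)
    assert sp(n) + 2 * so(n) == 3 * u(n)
    # D = C: G = U(n), G' = U(n) x U(n), G'' = U(2n)
    assert u(2 * n) + 2 * u(n) == 3 * (2 * u(n))
    # D = H: G = Sp(2n), G' = U(2n), G'' = SO(4n)
    assert so(4 * n) + 2 * sp(n) == 3 * u(2 * n)
```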
Although in presenting the 3 × 3 square—valid for all n—in Fig. 28.2 it
seems best to take the full unitary groups in the second rows and columns,
this does not work so well for the 4 × 4 magic square. Let us therefore note that
we can also use modified groups that we call Hn(D) ⊂ H′n(D) ⊂ H′′n(D), which
are the derived groups of the Gn(D). We must modify (28.16) accordingly:

dim H′′n(D) + 2 dim Hn(D) = 3 dim H′n(D) + 3.   (28.17)

See Fig. 28.3 for the resulting “reduced” 3 × 3 magic square.


If n = 3, the reduced 3 × 3 square can be extended, resulting in Freuden-
thal’s magic square, which we consider next. It will be noted that in Cartan’s
list (Table 28.1) some of the symmetric spaces have an SU(2) factor in K. Since

D       | R        C               H      |  R          C         H
Hn(D)   | SO(n)    SU(n)           Sp(2n) |  n(n−1)/2   n² − 1    n(2n+1)
H′n(D)  | SU(n)    SU(n) × SU(n)   SU(2n) |  n² − 1     2n² − 2   4n² − 1
H′′n(D) | Sp(2n)   SU(2n)          SO(4n) |  n(2n+1)    4n² − 1   2n(4n−1)

Fig. 28.3. Left: the reduced 3 × 3 square. Right: dimensions

SU(2) is the multiplicative group of quaternions of norm 1, these spaces have


a quaternionic structure analogous to the complex structure shown by Her-
mitian symmetric spaces, where K contains a U(1) factor (Proposition 28.3).
See Wolf [175]. Of the exceptional types, EII, EVI, EIX, FI, and G are
quaternionic. Observe that in each case the dimension is a multiple of 4. Using
some of these quaternionic symmetric spaces it is possible to extend the
magic square in the special case where n = 3 by a fourth group H′′′n(D) such
that H′′′n(D) × SU(2) is the maximal compact subgroup of the relevant
noncompact form. It is also possible to add a fourth column when n = 3 due to
the existence of the exceptional Jordan algebra and P3(O).
The magic square then looks like Fig. 28.4. In addition to (28.17), there is
a similar relation,

dim H′′′3(D) + 2 dim H′3(D) = 3 dim H′′3(D) + 5,   (28.18)

which suggests that the quaternionic symmetric spaces (they are FI, EII,
EVI, and EIX in Cartan's classification) should be thought of as “quaternionic
tube domains” over the corresponding Hermitian symmetric spaces.

Exercises
In the exercises, we look at the complex unit ball, which is a Hermitian symmetric
space that is not a tube domain. For these spaces, Piatetski-Shapiro [133] gave
unbounded realizations that are called Siegel domains of Type II. (Siegel domains
of Type I are tube domains over self-dual cones.)

D        | R       C              H       O   |  R    C    H    O
H3(D)    | SO(3)   SU(3)          Sp(6)   F4  |  3    8    21   52
H′3(D)   | SU(3)   SU(3) × SU(3)  SU(6)   E6  |  8    16   35   78
H′′3(D)  | Sp(6)   SU(6)          SO(12)  E7  |  21   35   66   133
H′′′3(D) | F4      E6             E7      E8  |  52   78   133  248

Fig. 28.4. Left: the magic square. Right: dimensions
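Both numerological relations, and the symmetry of the square, can be checked against the dimensions in Fig. 28.4 (plain Python; the dictionary layout is our own):

```python
# dimensions from Fig. 28.4 (n = 3): per column D, the tuple is
# (dim H_3, dim H'_3, dim H''_3, dim H'''_3)
dims = {
    "R": (3, 8, 21, 52),
    "C": (8, 16, 35, 78),
    "H": (21, 35, 66, 133),
    "O": (52, 78, 133, 248),
}

for h, h1, h2, h3 in dims.values():
    # (28.17): dim H'' + 2 dim H = 3 dim H' + 3
    assert h2 + 2 * h == 3 * h1 + 3
    # (28.18): dim H''' + 2 dim H' = 3 dim H'' + 5
    assert h3 + 2 * h1 == 3 * h2 + 5

# the "magic": the 4 x 4 square of dimensions is a symmetric matrix
table = [list(dims[k]) for k in ("R", "C", "H", "O")]
assert all(table[i][j] == table[j][i] for i in range(4) for j in range(4))
```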



Exercise 28.1. The group G = SU(n, 1) consists of the solutions to

$$g \begin{pmatrix} I_n & \\ & -1 \end{pmatrix} {}^t\bar{g} = \begin{pmatrix} I_n & \\ & -1 \end{pmatrix}, \qquad g \in \mathrm{GL}(n+1, \mathbb{C}).$$

Let

$$B_n = \left\{ w = \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix} \;\middle|\; |w_1|^2 + \cdots + |w_n|^2 < 1 \right\}$$

be the complex unit ball. Write

$$g = \begin{pmatrix} A & b \\ c & d \end{pmatrix}, \qquad A \in \mathrm{Mat}_n(\mathbb{C}), \quad b \in \mathrm{Mat}_{n \times 1}(\mathbb{C}), \quad c \in \mathrm{Mat}_{1 \times n}(\mathbb{C}), \quad d \in \mathbb{C}.$$

If w ∈ Bn, show that cw + d is invertible. (This is a 1 × 1 matrix, so it can be
regarded as a complex number.) Define

$$g(w) = (Aw + b)(cw + d)^{-1}. \qquad (28.19)$$

Show that g(w) ∈ Bn and that this defines an action of SU(n, 1) on Bn.
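As a quick numerical sanity check of (28.19) (not a substitute for the proof the exercise asks for), the following sketch builds a sample element of SU(2, 1), verifies the defining relation, and confirms that the action preserves the ball and is compatible with matrix multiplication. The helper `act` and the particular matrices are our own choices, not notation from the text.

```python
import numpy as np

n = 2
J = np.diag([1.0, 1.0, -1.0]).astype(complex)

# A sample "boost" in SU(2,1) mixing the first and last coordinates;
# it satisfies the defining relation g J (conjugate transpose of g) = J.
t = 0.7
g = np.array([[np.cosh(t), 0, np.sinh(t)],
              [0,          1, 0         ],
              [np.sinh(t), 0, np.cosh(t)]], dtype=complex)
assert np.allclose(g @ J @ g.conj().T, J)

def act(g, w):
    """The fractional linear action (28.19): w -> (Aw + b)(cw + d)^(-1)."""
    A, b = g[:n, :n], g[:n, n]
    c, d = g[n, :n], g[n, n]
    return (A @ w + b) / (c @ w + d)

w = np.array([0.3 + 0.1j, -0.2 + 0.4j])        # a point of B_2
assert np.linalg.norm(w) < 1
gw = act(g, w)
assert np.linalg.norm(gw) < 1                   # the image stays in B_2

# Compatibility with multiplication: g(h(w)) = (gh)(w).
h = np.diag(np.exp(1j * np.array([0.5, -0.2, -0.3])))  # unitary, det = 1
assert np.allclose(act(g, act(h, w)), act(g @ h, w))
```

The composition identity is exactly what makes (28.19) a group action, once one knows the denominators never vanish on the ball.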

Exercise 28.2. Let Hn ⊂ Cn be the unbounded domain

$$\mathcal{H}_n = \left\{ z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix} \;\middle|\; 2\,\mathrm{Im}(z_1) > |z_2|^2 + \cdots + |z_n|^2 \right\}.$$

Show that there are holomorphic maps c : Hn −→ Bn and c−1 : Bn −→ Hn that are
inverses of each other and are given by

$$c(z) = \begin{pmatrix} (z_1 - i)(z_1 + i)^{-1} \\ \sqrt{2i}\, z_2 (z_1 + i)^{-1} \\ \vdots \\ \sqrt{2i}\, z_n (z_1 + i)^{-1} \end{pmatrix}, \qquad c^{-1}(w) = \begin{pmatrix} i(1 + w_1)(1 - w_1)^{-1} \\ \sqrt{2i}\, w_2 (1 - w_1)^{-1} \\ \vdots \\ \sqrt{2i}\, w_n (1 - w_1)^{-1} \end{pmatrix}.$$

Note: If we extend the action (28.19) to allow g ∈ GL(n + 1, C), these "Cayley
transforms" are represented by the matrices

$$c = \begin{pmatrix} 1/\sqrt{2i} & & -i/\sqrt{2i} \\ & I_{n-1} & \\ 1/\sqrt{2i} & & i/\sqrt{2i} \end{pmatrix}, \qquad c^{-1} = \begin{pmatrix} i/\sqrt{2i} & & i/\sqrt{2i} \\ & I_{n-1} & \\ -1/\sqrt{2i} & & 1/\sqrt{2i} \end{pmatrix}.$$
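The Cayley transform matrices can be checked numerically. The sketch below is our own verification (using numpy's principal branch for √(2i)): it confirms that the two matrices are mutually inverse, that the matrix action maps H₂ into B₂, and that it reproduces the displayed closed formula.

```python
import numpy as np

n = 2
s = 1 / np.sqrt(2j)          # principal branch: sqrt(2i) = 1 + i

c_mat = np.array([[s, 0, -1j * s],
                  [0, 1,  0     ],
                  [s, 0,  1j * s]])
c_inv = np.array([[1j * s, 0, 1j * s],
                  [0,      1, 0     ],
                  [-s,     0, s     ]])
assert np.allclose(c_mat @ c_inv, np.eye(3))    # mutually inverse

def act(g, z):
    A, b = g[:n, :n], g[:n, n]
    cc, d = g[n, :n], g[n, n]
    return (A @ z + b) / (cc @ z + d)

z = np.array([0.5 + 1.0j, 0.6 + 0.3j])          # 2 Im(z1) > |z2|^2
assert 2 * z[0].imag > abs(z[1]) ** 2
w = act(c_mat, z)
assert np.linalg.norm(w) < 1                    # c maps H_2 into B_2
assert np.allclose(act(c_inv, w), z)            # c^{-1} undoes c
# The matrix action agrees with the displayed formula for c(z):
assert np.isclose(w[0], (z[0] - 1j) / (z[0] + 1j))
assert np.isclose(w[1], np.sqrt(2j) * z[1] / (z[0] + 1j))
```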

Exercise 28.3. Show that c−1 SU(n, 1) c = SUξ, where SUξ is the group of all
g ∈ GL(n + 1, C) satisfying

$$g\,\xi\,{}^t\bar{g} = \xi, \qquad \xi = \begin{pmatrix} & & -i \\ & I_{n-1} & \\ i & & \end{pmatrix}.$$

Show that SUξ contains the noncompact "Heisenberg" unipotent subgroup

$$H = \left\{ \begin{pmatrix} 1 & i\,{}^t\bar{b} & \tfrac{i}{2}|b|^2 + a \\ & I_{n-1} & b \\ & & 1 \end{pmatrix} \;\middle|\; b \in \mathrm{Mat}_{n-1,1}(\mathbb{C}),\ a \in \mathbb{R} \right\}.$$

Let us write

$$z = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix} = \begin{pmatrix} z_1 \\ \zeta \end{pmatrix}, \qquad \zeta = \begin{pmatrix} z_2 \\ \vdots \\ z_n \end{pmatrix}.$$

According to (28.19), a typical element of H should act by

$$z_1 \longmapsto z_1 + i\,{}^t\bar{b}\,\zeta + \tfrac{i}{2}|b|^2 + a, \qquad \zeta \longmapsto \zeta + b.$$

Check directly that Hn is invariant under such a transformation. Also show that SUξ
contains the group

$$M = \left\{ \begin{pmatrix} u & & \\ & h & \\ & & \bar{u}^{-1} \end{pmatrix} \;\middle|\; u \in \mathbb{C}^\times,\ h \in \mathrm{U}(n-1) \right\}.$$

Describe the action of this group. Show that the subgroup of SUξ generated by H
and M is transitive on Hn, and deduce that the action of SU(n, 1) on Bn is also
transitive.
 
Exercise 28.4. Observe that the subgroup K = S(U(n) × U(1)) of SU(n, 1) acts
transitively on the topological boundary of Bn, and explain why this shows that the
Bergman–Shilov boundary is the whole topological boundary. Contrast this with the
case of Dn.

Exercise 28.5. Emulate the construction of Rn and R◦n to show that the compact
dual of Bn has a dense open subset that can be identified with Cn in such a way that
GC = GL(n + 1, C) acts by (28.19). Show that Bn can be embedded in its compact
dual, just as Dn is in the case of the symplectic group.
29
Relative Root Systems

In this chapter, we will consider root systems and Weyl groups associated
with a Lie group G. We will assume that G satisfies the assumptions in Hy-
pothesis 28.1 of the last chapter. Thus, G is semisimple and comes with a
compact dual Gc . In Chap. 18, we associated with Gc a root system and Weyl
group. That root system and Weyl group we will call the absolute root system
Φ and Weyl group W . We will introduce another root system Φrel , called the
relative or restricted root system, and a Weyl group Wrel called the relative
Weyl group. The relation between the two root systems will be discussed. The
structures that we will find give Iwasawa and Bruhat decompositions in this
context. This chapter may be skipped with no loss of continuity.
As we saw in Theorem 28.3, every semisimple Lie group admits a Car-
tan decomposition, and Hypothesis 28.1 will be satisfied. The assumption of
semisimplicity can be relaxed—it is sufficient for G to be reductive, though in
this book we only define the term “reductive” when G is a complex analytic
group. A more significant generalization of the results of this chapter is that
relative and absolute root systems and Weyl groups can be defined whenever
G is a reductive algebraic group defined over a field F . If F is algebraically
closed, these coincide. If F = R, they coincide with the structures defined in
this chapter. But reductive groups over p-adic fields, number fields, or finite
fields have many applications, and this reason alone is enough to prefer an ap-
proach based on algebraic groups. For this, see Borel [20] as well as Borel and
Tits [21], Tits [162] (and other papers in the same volume), and Satake [144].
Consider, for example, the group G = SL(r, H), the construction of which
we recall. The group GL(r, H) is the group of units of the central simple algebra
Matr (H) over R. We have C ⊗ H ∼ = Mat2 (C) as C-algebras. Consequently,
C ⊗ Matr (H) ∼ = Mat2r (C). The reduced norm ν : Matr (H) −→ R is a map
determined by the commutativity of the diagram

D. Bump, Lie Groups, Graduate Texts in Mathematics 225,
DOI 10.1007/978-1-4614-8024-2_29, © Springer Science+Business Media New York 2013

        Matr(H) ─────→ Mat2r(C)
           │               │
         ν │               │ det
           ↓               ↓
           R    ─────→     C

(See Exercise 29.1.) The restriction of the reduced norm to GL(r, H) is a
homomorphism ν : GL(r, H) −→ R× whose kernel is the group SL(r, H).
It is a real form of SL(2r, C), as are SL(2r, R) and the compact group
Gc = SU(2r), and we may associate with it the Weyl group W and root
system Φ of SU(2r), of type A2r−1. This is the absolute root system.
On the other hand, there is also
a relative or restricted root system and Weyl group, which we now describe.
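The reduced norm can be made concrete with a small computation. The sketch below is an illustration of ours, not code from the book; the particular embedding of H into Mat₂(C) is one of several equivalent choices. It realizes a quaternion matrix as a 2r × 2r complex matrix and checks that the determinant is real, as the commutative diagram above asserts.

```python
import numpy as np

def quat(a, b, c, d):
    """a + bi + cj + dk as a 2x2 complex matrix."""
    return np.array([[a + b * 1j,  c + d * 1j],
                     [-c + d * 1j, a - b * 1j]])

def embed(Q):
    """Send an r x r quaternion matrix (entries given as (a,b,c,d)
    tuples) to the corresponding 2r x 2r complex matrix."""
    r = len(Q)
    M = np.zeros((2 * r, 2 * r), dtype=complex)
    for i in range(r):
        for j in range(r):
            M[2*i:2*i+2, 2*j:2*j+2] = quat(*Q[i][j])
    return M

# For r = 1 the reduced norm is det(quat(q)) = a^2 + b^2 + c^2 + d^2.
assert np.isclose(np.linalg.det(quat(1, 2, 3, 4)).real, 30)

# For r = 2 the determinant of the embedded matrix is still real:
# the reduced norm nu really lands in R, not just in C.
Q = [[(1, 2, 0, -1), (0, 1, 1, 0)],
     [(2, -1, 1, 3), (1, 0, 0, 2)]]
nu = np.linalg.det(embed(Q))
assert abs(nu.imag) < 1e-9
```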

Let K be the group of g ∈ SL(r, H) such that g t ḡ = I, where the bar


denotes the conjugation map of H. By Exercise 5.7, K is a compact group
isomorphic to Sp(2r). The Cartan involution θ of Hypothesis 28.1 is the map
g −→ t ḡ −1 .
We will denote by R×+ the multiplicative group of the positive real numbers.
Let A be the subgroup

$$A = \left\{ \begin{pmatrix} t_1 & & \\ & \ddots & \\ & & t_r \end{pmatrix} \;\middle|\; t_i \in \mathbb{R}^\times_+,\ \prod_i t_i = 1 \right\} \cong (\mathbb{R}^\times_+)^{r-1}.$$
The centralizer of A is the group

$$C_G(A) = \left\{ \begin{pmatrix} t_1 & & \\ & \ddots & \\ & & t_r \end{pmatrix} \;\middle|\; t_i \in \mathbb{H}^\times \right\}.$$

The group M = CG(A) ∩ K consists of all elements with |ti| = 1. The group
of norm 1 elements in H× is isomorphic to SU(2) by Exercise 5.7 with n = 1.
Thus M is isomorphic to SU(2)^r.
On the other hand, the normalizer NG(A) consists of all monomial
quaternion matrices. The quotient Wrel = NG(A)/CG(A) is of type Ar−1. The
"restricted roots" are "rational characters" of the group A, of the form
αij = ti t_j^{-1}, with i ≠ j. We can identify g = Lie(G) with Matr(H), in which case
the subspace of g that transforms by αij consists of all elements of g having
zeros everywhere except in the i, j position. In contrast with the absolute
root system, where the eigenspace Xα of a root is always one-dimensional (see
Proposition 18.6), these eigenspaces are all four-dimensional.
We see from this example that the group SL(n, H) looks like SL(n, R), but
the root eigenspaces are “fattened up.” The role of the torus T in Chap. 18
will be played by the group CG (A), which may be thought of as a “fattened
up” and non-Abelian replacement for the torus.
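The four-dimensionality of the restricted root spaces can be seen concretely for r = 2. In the sketch below (our own illustration, using one standard embedding of H into Mat₂(C)), Ad of a diagonal element scales each of the four real dimensions of the (1, 2) quaternion entry by the same factor t₁/t₂.

```python
import numpy as np

def quat(a, b, c, d):
    """a + bi + cj + dk as a 2x2 complex matrix."""
    return np.array([[a + b * 1j,  c + d * 1j],
                     [-c + d * 1j, a - b * 1j]])

t1, t2 = 3.0, 0.5
a_mat = np.diag([t1, t1, t2, t2]).astype(complex)  # diag(t1, t2) in Mat_2(H)

# The alpha_12-eigenspace: quaternion matrices supported in the (1,2)
# entry.  All four basis quaternions 1, i, j, k scale by t1/t2, so the
# eigenspace is four-dimensional over R.
for q in [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]:
    X = np.zeros((4, 4), dtype=complex)
    X[0:2, 2:4] = quat(*q)
    AdX = a_mat @ X @ np.linalg.inv(a_mat)
    assert np.allclose(AdX, (t1 / t2) * X)
```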
We turn to the general case and to the proofs.

Proposition 29.1. Assume that the assumptions of Hypothesis 28.1 are sat-
isfied. Then the map
(Z, k) −→ exp(Z)k (29.1)
is a diffeomorphism p × K −→ G.

Proof. Choosing a faithful representation (π, V ) of the compact group Gc , we


may embed Gc into GL(V ). We may find a positive definite invariant inner
product on V and, on choosing an orthonormal basis, we may embed Gc into
U(n), where n = dim(V ). The Lie algebra gC is then embedded into gl(n, C)
in such a way that k ⊆ u(n) and p is contained in the space P of n × n
Hermitian matrices. We now recall from Theorem 13.4 and Proposition 13.7
that the formula (29.1) defines a diffeomorphism P × U(n) −→ GL(n, C). It
follows that it gives a diffeomorphism of p × K onto its image. It also follows
that (29.1) has nonzero differential everywhere, and as p × K and G have the
same dimension, we get an open mapping p × K −→ G. On the other hand,
p × K is closed in P × U(n), so the image of (29.1) is closed as well as open
in G. Since G is connected, it follows that (29.1) is surjective.
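Formula (29.1) specializes, for G = GL(n, C), to the classical polar decomposition invoked in the proof. A minimal numerical sketch (ours, not the book's) computes the factorization g = exp(Z)k by diagonalizing the positive definite matrix g g*:

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # in GL(3,C) a.s.

# Positive definite part exp(Z) = (g g*)^{1/2}, via diagonalization.
w, V = np.linalg.eigh(g @ g.conj().T)
p = V @ np.diag(np.sqrt(w)) @ V.conj().T     # Hermitian positive definite
k = np.linalg.inv(p) @ g                     # the unitary factor

assert np.allclose(k @ k.conj().T, np.eye(3))   # k lies in U(3)
assert np.allclose(p @ k, g)                    # g = exp(Z) k, Z = log p
```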


If a is an Abelian Lie subalgebra of g such that a ⊂ p, we say a is an Abelian


subspace of p. This expression is used instead of “Abelian subalgebra” since
p itself is not a Lie subalgebra of g. We will see later in Theorem 29.3 that a
maximal Abelian subspace a of p is unique up to conjugation.

Proposition 29.2. Assume that the assumptions of Hypothesis 28.1 are sat-
isfied. Let a be a maximal Abelian subspace of p. Then A = exp(a) is a closed
Lie subgroup of G, and a is its Lie algebra. There exists a θ-stable maximal
torus T of Gc such that A is contained in the complexification TC regarded as
a subgroup of GC. If r = dim(a), then A ≅ (R×+)^r. Moreover, Ac = exp(ia) is
a compact torus contained in T. We have T = Ac TM, where TM = (T ∩ K)◦.

Proof. By Proposition 15.2, A is an Abelian group. By Proposition 29.1, the
restriction of exp to p is a diffeomorphism onto its image, which is closed in G,
and since a is closed in p it follows that exp(a) is closed and isomorphic as a
Lie group to the vector space a ≅ R^r. Exponentiating, the group A ≅ (R×+)^r.
Let Ac = exp(ia) ⊂ Gc. By Proposition 15.2, it is an Abelian subgroup.
We will show that it is closed. If it is not, consider its topological closure Āc.
This is a closed connected Abelian subgroup of the compact group Gc
and hence a torus by Theorem 15.2. Since θ induces −1 on p, it induces the
automorphism g −→ g^{-1} on Ac and hence on Āc. Therefore, the Lie algebra of
Āc is contained in the −1 eigenspace ip of θ in Lie(Gc). Since ia is a maximal
Abelian subspace of ip, it follows that ia is the Lie algebra of Āc, and therefore
Ac = exp(ia) = Āc.
Now let T be a maximal θ-stable torus of Gc containing Ac. We will show
that T is a maximal torus of Gc. Let T′ ⊇ T be a maximal torus. Let t′ and
t be the respective Lie algebras of T′ and T. Suppose that H ∈ t′. If Y ∈ t,
then [Y, θH] = θ[θY, H] = 0 since t is θ-stable and θY, H ∈ t′, which
is Abelian. Thus, both H and θH are in the centralizer of t. Now we can write
H = H1 + H2, where H1 = ½(H + θH) and H2 = ½(H − θH). Note that the
torus Si, which is the closure of {exp(tHi) | t ∈ R}, is θ-stable; indeed θ is
trivial on S1 and induces the automorphism x −→ x^{-1} on S2. Also Si ⊆ T′
centralizes T. Consequently, T Si is a θ-stable torus and, by maximality of T,
Si ⊆ T. It follows that Hi ∈ t, and so H ∈ t. We have proven that t = t′ and
so T = T′ is a maximal torus.
It remains to be shown that T = Ac TM . It is sufficient to show that the Lie
algebra of T decomposes as ia ⊕ tM , where tM is the Lie algebra of TM . Since
θ stabilizes T , it induces an endomorphism of order 2 of t = Lie(T ). The +1
eigenspace is tM = t ∩ k since the +1 eigenspace of θ on gc is k. On the other
hand, the −1 eigenspace of θ on t contains ia and is contained in ip, which is
the −1 eigenspace of θ on gc . Since a is a maximal Abelian subspace of p, it
follows that the −1 eigenspace of θ on t is exactly ia, so t = ia ⊕ tM .


Lemma 29.1. Let Z ∈ GL(n, C) be a Hermitian matrix. If g ∈ GL(n, C)


commutes with exp(Z), then g commutes with Z.

Proof. Let λ1, . . . , λh be the distinct eigenvalues of Z. Let us choose a basis
with respect to which Z has the matrix

$$\begin{pmatrix} \lambda_1 I_{r_1} & & \\ & \ddots & \\ & & \lambda_h I_{r_h} \end{pmatrix}.$$

Then exp(Z) has the same form with λi replaced by exp(λi). Since the λi are
distinct real numbers, the exp(λi) are also distinct, and it follows that g has
the form

$$\begin{pmatrix} g_1 & & \\ & \ddots & \\ & & g_h \end{pmatrix},$$

where gi is an ri × ri block. Thus g commutes with Z.


Proposition 29.3. In the context of Proposition 29.2, let M = CG (A) ∩ K.


Then CG (A) = M A and M ∩ A = {1}, so CG (A) is the direct product of M
and A. The group TM is a maximal torus of M .

The compact group M is called the anisotropic kernel .

Proof. Since M ⊆ K and A ⊆ exp(p), and since by Proposition 29.1 K ∩
exp(p) = {1}, we have M ∩ A = {1}. We will show that CG(A) = M A. Let
g ∈ CG(A). By Proposition 29.1, we may write g = exp(Z)k, where Z ∈ p and
k ∈ K. If a ∈ A, then a commutes with exp(Z)k. We will show that any a ∈ A
commutes with exp(Z) and with k individually. From this we will deduce that
exp(Z) ∈ A and k ∈ M.

By Theorem 4.2, Gc has a faithful complex representation Gc −→ GL(V ).


We extend this to a representation of GC and gC . Giving V a Gc -invariant
inner product and choosing an orthonormal basis, Gc is realized as a group of
unitary matrices. Therefore gc is realized as a Lie algebra of skew-Hermitian
matrices, and since ip ⊆ gc , the vector space p consists of Hermitian matrices.
We note that θ(Z) = −Z, θ(a) = a^{-1}, and θ(k) = k. Thus if we
apply the automorphism θ to the identity a exp(Z)k = exp(Z)ka, we get
a^{-1} exp(−Z)k = exp(−Z)k a^{-1}. Since this is true for all a ∈ A, both exp(−Z)k
and exp(Z)k are in CG(A). It follows that exp(2Z) = (exp(Z)k)(exp(−Z)k)^{-1}
is in CG(A). Since exp(2Z) commutes with A, by Lemma 29.1, Z commutes
with the elements of A (in our matrix realization) and hence ad(Z)a = 0.
Because a is a maximal Abelian subspace of p, it follows that Z ∈ a. Also,
k centralizes A since exp(Z)k and exp(Z) both do, and so exp(Z) ∈ A and
k ∈ M.
It is clear that TM = (T ∩ K)◦ is contained in CG(A) and K, so TM is a
torus in M. Let T′M be a maximal torus of M containing TM. Then Ac T′M is
a connected Abelian subgroup of CG(A) containing T = Ac TM, and since T
is a maximal torus of Gc we have Ac T′M = T. Therefore, T′M ⊂ T. It is also
contained in K and connected. This proves that TM = T′M is a maximal torus
of M.

We say that a quasicharacter of A ∼ = (R× r
+ ) is a rational character if it can
be extended to a complex analytic character of AC = exp(aC ). We will denote
by X ∗ (A) the group of rational characters of A. We recall from Chap. 15 that
X ∗ (Ac ) is the group of all characters of the compact torus Ac .
Proposition 29.4. Every rational character of A has the form

(t1, . . . , tr) −→ t1^{k1} · · · tr^{kr},   ki ∈ Z.   (29.2)

The groups X∗(A) and X∗(Ac) are isomorphic: extending a rational character
of A to a complex analytic character of AC and then restricting it to Ac gives
every character of Ac exactly once.
Proof. Obviously (29.2) is a rational character. Extending any rational char-
acter of A to an analytic character of AC and then restricting it to Ac
gives a homomorphism X ∗ (A) −→ X ∗ (Ac ), and since the characters of
X ∗ (Ac ) are classified by Proposition 15.4, we see that every rational char-
acter has the form (29.2) and that the homomorphism X ∗ (A) −→ X ∗ (Ac ) is
an isomorphism.

Since the compact tori T and Ac satisfy T ⊃ Ac, we may restrict characters
of T to Ac. Some characters may restrict trivially. In any case, if β ∈ X∗(T)
restricts to α ∈ X∗(A) = X∗(Ac), we write β|α. Assuming that β and hence
α are not the trivial character, as in Chap. 18 we will denote by Xβ the
β-eigenspace of T on gC. We will also denote by Xrelα the α-eigenspace of Ac
on gC. Since X∗(Ac) = X∗(A), we may write

Xrelα = {X ∈ gC | Ad(a)X = α(a)X for all a ∈ A}.

We will see by examples that Xrelα may be more than one-dimensional. How-
ever, Xβ is one-dimensional by Proposition 18.6, and we may obviously write

$$X^{\mathrm{rel}}_\alpha = \bigoplus_{\substack{\beta \in X^*(T) \\ \beta | \alpha}} X_\beta.$$

Let Φ be the set of β ∈ X∗(T) such that Xβ ≠ 0, and let Φrel be the set of
α ∈ X∗(A) such that Xrelα ≠ 0.
If β ∈ X∗(T), let dβ : t −→ C be the differential of β. Thus

$$d\beta(H) = \left. \frac{d}{dt}\, \beta(e^{tH}) \right|_{t=0}, \qquad H \in t.$$

As in Chap. 18, the linear form dβ is pure imaginary on the Lie algebra tM ⊕ ia
of the compact torus T. This means that dβ is real on a and purely imaginary
on tM.
If α ∈ Φrel, the α-eigenspace Xrelα can be characterized by either the con-
dition (for X ∈ Xrelα)

Ad(a)X = α(a)X,   a ∈ A,

or

[H, X] = dα(H) X,   H ∈ a.   (29.3)

Let c : gC −→ gC denote the conjugation with respect to g. Thus, if
Z ∈ gC is written as X + iY, where X, Y ∈ g, then c(Z) = X − iY, so
g = {Z ∈ gC | c(Z) = Z}. Let m be the Lie algebra of M. Thus, the Lie
algebra of CG(A) = M A is m ⊕ a. It is the 0-eigenspace of A on g, so

$$g_{\mathbb{C}} = \mathbb{C}(\mathfrak{m} \oplus \mathfrak{a}) \oplus \bigoplus_{\alpha \in \Phi_{\mathrm{rel}}} X^{\mathrm{rel}}_\alpha \qquad (29.4)$$

is the decomposition into eigenspaces.


Proposition 29.5. (i) In the context of Proposition 29.2, if α ∈ Φrel, then
Xrelα ∩ g spans Xrelα.
(ii) If 0 ≠ X ∈ Xrelα ∩ g, then θ(X) ∈ Xrel−α ∩ g and [X, θ(X)] ≠ 0.
(iii) The space Xrelα ∩ g is invariant under Ad(M A).
(iv) If α, α′ ∈ Φrel, and if Xα ∈ Xrelα, Xα′ ∈ Xrelα′, then

[Xα, Xα′] ∈ C(m ⊕ a) if α′ = −α,   [Xα, Xα′] ∈ Xrelα+α′ if α + α′ ∈ Φrel,

while [Xα, Xα′] = 0 if α′ ≠ −α and α + α′ ∉ Φrel.
This is in contrast with the situation in Chap. 18, where the spaces Xα did
not intersect the Lie algebra of the compact Lie group.

Proof. We show that we may find a basis X1, . . . , Xh of the complex vector
space Xrelα such that Xi ∈ g. Suppose that X1, . . . , Xh are a maximal linearly
independent subset of Xrelα such that Xi ∈ g. If they do not span Xrelα, let
0 ≠ Z ∈ Xrelα be found that is not in their span. Then c(Z) ∈ Xrelα since
applying c to (29.3) gives the same condition, with Z replaced by c(Z). Now

$$\tfrac{1}{2}\big(Z + c(Z)\big), \qquad \tfrac{1}{2i}\big(Z - c(Z)\big),$$

are in g, and at least one of them is not in the span of X1, . . . , Xh since Z is not.
We may add this to the linearly independent set X1, . . . , Xh, contradicting the
assumed maximality. This proves (i).
For (ii), let us show that θ maps Xrelα to Xrel−α. Indeed, if X ∈ Xrelα, then for
a ∈ A we have Ad(a)X = α(a)X. Since θ(a) = a^{-1}, replacing a by its inverse
and applying θ, it follows that Ad(a)θ(X) = α(a^{-1}) θ(X). Since the group law
in X∗(A) is written additively, (−α)(a) = α(a^{-1}). Therefore θ(X) ∈ Xrel−α.
Since θ and c commute, if X ∈ g, then θ(X) ∈ g.
The last point we need to check for (ii) is that if 0 ≠ X ∈ Xrelα ∩ g, then
[X, θ(X)] ≠ 0. Since Ad : Gc −→ GL(gc) is a real representation of a compact
group, there exists a positive definite symmetric bilinear form B on gc that is
Gc-invariant. We extend B to a symmetric C-bilinear form B : gC × gC −→ C
by linearity. We note that Z = X + θ(X) ∈ k since θ(Z) = Z and Z ∈ g.
In particular Z ∈ gc. It cannot vanish since X and θ(X) lie in Xrelα and Xrel−α,
which have a trivial intersection. Therefore, B(Z, Z) ≠ 0. Choose H ∈ a such
that dα(H) ≠ 0. We have

B(X + θ(X), [H, X − θ(X)]) = B(Z, dα(H)Z) ≠ 0.

On the other hand, by (10.3) this equals

−B([X + θ(X), X − θ(X)], H) = 2B([X, θ(X)], H).

Therefore, [X, θ(X)] ≠ 0.
For (iii), we will prove that Xrelα is invariant under CG(A), which con-
tains M. Since g is obviously an Ad-invariant real subspace of gC it will follow
that Xrelα ∩ g is Ad(M)-invariant. Since CG(A) is connected by Theorem 16.6,
it is sufficient to show that Xrelα is invariant under ad(Z) when Z is in the Lie
algebra centralizer of a. Thus, if H ∈ a we have [H, Z] = 0. Now if X ∈ Xrelα
we have

[H, [Z, X]] = [[H, Z], X] + [Z, [H, X]] = [Z, dα(H)X] = dα(H)[Z, X].

Therefore, ad(Z)X = [Z, X] ∈ Xrelα.
Part (iv) is entirely similar to Proposition 18.4 (ii), and we leave it to the
reader.

The roots in Φ can now be divided into two classes. First, there are those
that restrict nontrivially to A and hence correspond to roots in Φrel . On the
other hand, some roots do restrict trivially, and we will show that these cor-
respond to roots of the compact Lie group M . Let m = Lie(M ).

Proposition 29.6. Suppose that β ∈ Φ. If the restriction of β to A is trivial,


then Xβ is contained in the complexification of m and β is a root of the compact
group M with respect to TM .
Proof. We show that Xβ is θ-stable. Let X ∈ Xβ . Then
[H, X] = dβ(H)X, H ∈ t. (29.5)
We must show that θ(X) has the same property. Applying θ to (29.5) gives
[θ(H), θ(X)] = dβ(H) θ(X), H ∈ t.
If H ∈ tM , then θ(H) = H and we have (29.5) with θ(X) replacing X. On the
other hand, if H ∈ ia we have θ(H) = −H, but by assumption dβ(H) = 0, so
we have (29.5) with θ(X) replacing X in this case, too. Since t = tM ⊕ ia, we
have proved that Xβ is θ-stable.
If a ∈ A and X ∈ Xβ, then Ad(a)X = β(a)X = X since the restriction of β
to A is trivial, so a commutes with the one-parameter subgroup t −→ exp(tX),
and therefore exp(tX) is contained in the centralizer of A in GC. This means
that X is contained in the complexification of the Lie algebra of CG(A), which
by Proposition 29.3 is C(m ⊕ a). Since θ is +1 on m and −1 on a, and since
we have proved that Xβ is θ-stable, we have X ∈ Cm.

Now let V = R ⊗ X ∗ (T ), VM = R ⊗ X ∗ (TM ), and Vrel = R ⊗ X ∗ (A) =
R ⊗ X ∗ (Ac ). Since T = TM Ac by Proposition 29.2, we have V = VM ⊕ Vrel .
In particular, we have a short exact sequence
0 −→ VM −→ V −→ Vrel −→ 0. (29.6)
Let ΦM be the root system of M with respect to TM . The content of Propo-
sition 29.6 is that the roots of Gc with respect to T that restrict trivially to
A are roots of M with respect to TM .
We choose on V an inner product that is invariant under the absolute Weyl
group NGc (T )/T . This induces an inner product on Vrel and, if α is a root,
there is a reflection sα : Vrel −→ Vrel given by (18.1).
Proposition 29.7. In the context of Proposition 29.2, let α ∈ Φrel. Let Aα ⊂
A be the kernel of α, let Gα ⊂ G be its centralizer, and let gα ⊂ g be the
Lie algebra of Gα. There exists Xα ∈ Xrelα ∩ g such that if X−α = −θ(Xα) and
Hα = [Xα, X−α], then dα(Hα) = 2. We have

[Hα, Xα] = 2Xα,   [Hα, X−α] = −2X−α.   (29.7)

There exists a Lie group homomorphism iα : SL(2, R) −→ Gα such that the
differential diα : sl(2, R) −→ gα maps

$$\begin{pmatrix} 1 & \\ & -1 \end{pmatrix} \longmapsto H_\alpha, \qquad \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \longmapsto X_\alpha, \qquad \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \longmapsto X_{-\alpha}. \qquad (29.8)$$

The Lie group homomorphism iα extends to a complex analytic homomor-
phism SL(2, C) −→ GC.

Proof. Choose 0 ≠ Xα ∈ Xrelα. By Proposition 29.5, we may choose Xα ∈ g, and
denoting X−α = −θ(Xα) we have X−α ∈ Xrel−α ∩ g and Hα = [Xα, X−α] ≠ 0.
We claim that Hα ∈ a. Observe that Hα ∈ g since Xα and X−α are in g, and
applying θ to Hα gives [X−α, Xα] = −Hα. Therefore, Hα ∈ p. Now if H ∈ a
we have

[H, Hα] = [[H, Xα], X−α] + [Xα, [H, X−α]] = [dα(H)Xα, X−α] + [Xα, −dα(H)X−α] = 0.

Since a is a maximal Abelian subspace of p, this means that Hα ∈ a.
Now iHα ∈ ip, Z = Xα − X−α ∈ k, and Y = i(Xα + X−α) ∈ ip are all
elements of gc = k ⊕ ip. We have

[iHα, Z] = dα(Hα)Y,   [iHα, Y] = −dα(Hα)Z,

and

[Z, Y] = 2iHα.

Now dα(Hα) ≠ 0. Indeed, if dα(Hα) = 0, then ad(Z)²Y = 0 while
ad(Z)Y ≠ 0, contradicting Lemma 18.1, since Z ∈ k. After replacing Xα by a
positive multiple, we may assume that dα(Hα) = 2.
Now at least we have a Lie algebra homomorphism sl(2, R) −→ g with the
effect (29.8), and we have to show that it is the differential of a Lie group
homomorphism SL(2, R) −→ G. We begin by constructing the corresponding
map SU(2) −→ Gc. Note that iHα, Y, and Z are all elements of gc, and so
we have a homomorphism su(2) −→ gc that maps

$$\begin{pmatrix} i & \\ & -i \end{pmatrix} \longmapsto iH_\alpha, \qquad \begin{pmatrix} & i \\ i & \end{pmatrix} \longmapsto Y, \qquad \begin{pmatrix} & 1 \\ -1 & \end{pmatrix} \longmapsto Z.$$

By Theorem 14.2, there exists a homomorphism SU(2) −→ Gc . Since SL(2, C)


is the analytic complexification of SU(2), and GC is the analytic complexifi-
cation of Gc , this extends to a complex analytic homomorphism SL(2, C) −→
GC . The restriction to SL(2, R) is the sought-after embedding.
Lastly, we note that Xα and X−α centralize Aα since [H, X±α ] = 0 for H
in the kernel aα of dα : a −→ R, which is the Lie algebra of Aα . Thus, the
Lie algebra they generate is contained in gα , and its exponential is contained
in Gα .
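The bracket relations defining the triple (Hα, Xα, X−α) are exactly those of the standard sl(2) triple appearing in (29.8); a one-line check of ours in the 2 × 2 matrix model:

```python
import numpy as np

H = np.array([[1, 0], [0, -1]], dtype=float)   # maps to H_alpha
X = np.array([[0, 1], [0, 0]], dtype=float)    # maps to X_alpha
Y = np.array([[0, 0], [1, 0]], dtype=float)    # maps to X_{-alpha}

def bracket(A, B):
    return A @ B - B @ A

assert np.allclose(bracket(X, Y), H)         # H_alpha = [X_alpha, X_{-alpha}]
assert np.allclose(bracket(H, X), 2 * X)     # the relations (29.7)
assert np.allclose(bracket(H, Y), -2 * Y)
```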


Theorem 29.1. In the context of Proposition 29.7, the set Φrel of restricted
roots is a root system. If α ∈ Φrel , there exists wα ∈ K that normalizes A and
that induces on X ∗ (A) the reflection sα .

Proof. Let

$$w_\alpha = i_\alpha \begin{pmatrix} & 1 \\ -1 & \end{pmatrix}.$$

We note wα ∈ K. Indeed, it is the exponential of

$$di_\alpha \left( \frac{\pi}{2} \begin{pmatrix} & 1 \\ -1 & \end{pmatrix} \right) = \frac{\pi}{2}\,(X_\alpha - X_{-\alpha}) \in k$$

since

$$\exp \left( t \begin{pmatrix} & 1 \\ -1 & \end{pmatrix} \right) = \begin{pmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{pmatrix}.$$

Now wα centralizes Aα by Proposition 29.7. Also, in SL(2, R), conjugation by

$$\begin{pmatrix} & 1 \\ -1 & \end{pmatrix}$$

sends diag(1, −1) to diag(−1, 1), and applying iα gives Ad(wα)Hα = −Hα. Since a is spanned by
the codimension 1 subspace aα and the vector Hα, it follows that (in its action
on Vrel) wα has order 2 and eigenvalue −1 with multiplicity 1. It therefore
induces the reflection sα in its action on Vrel.
Now the proof that Φrel is a root system follows the structure of the proof
of Theorem 18.2. The existence of the reflection wα in the Weyl group
implies that sα preserves the set Φrel.
For the proof that if α and β are in Φrel then 2⟨α, β⟩/⟨α, α⟩ ∈ Z, we adapt
the proof of Proposition 18.10. If λ ∈ X∗(Ac), we will denote (in this proof
only) by Xλ the λ-eigenspace of Ac in the adjoint representation. We normally
use this notation only if λ ≠ 0 is a root. If λ = 0, then Xλ is the complexified
Lie algebra of CG(A); that is, C(m ⊕ a). Let

$$W = \bigoplus_{k \in \mathbb{Z}} X_{\beta + k\alpha} \subseteq g_{\mathbb{C}}.$$

We claim that W is invariant under iα(SL(2, C)). To prove this, it is sufficient
to show that it is invariant under diα(sl(2, C)), which is generated by Xα and
X−α, since these are the images under diα of a pair of generators of sl(2, C)
by (29.8). Since Xα ∈ Xα and X−α ∈ X−α, we see
that ad(Xα)Xγ ⊆ Xγ+α and ad(X−α)Xγ ⊆ Xγ−α, proving that W
is invariant. In particular, W is invariant under wα ∈ iα(SL(2, C)). Since Ad(wα)
induces sα on Vrel, it follows that the set {β + kα | k ∈ Z, Xβ+kα ≠ 0} is invariant under
sα and, by (18.1), this implies that 2⟨α, β⟩/⟨α, α⟩ ∈ Z.
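The matrix identities used in the proof (that exp(t(E₁₂ − E₂₁)) is a rotation matrix, and that conjugation by wα negates diag(1, −1)) can be verified numerically; the crude series exponential below is our own helper, not a library routine.

```python
import numpy as np

K = np.array([[0.0, 1.0], [-1.0, 0.0]])

def series_expm(A, terms=30):
    """Crude matrix exponential by truncating the power series."""
    E, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        E = E + term
    return E

w_alpha = series_expm((np.pi / 2) * K)
# exp(t K) is rotation by t; at t = pi/2 it is [[0,1],[-1,0]].
assert np.allclose(w_alpha, np.array([[0, 1], [-1, 0]]))

# Conjugation by w_alpha negates diag(1,-1): the reflection s_alpha.
H = np.diag([1.0, -1.0])
assert np.allclose(w_alpha @ H @ np.linalg.inv(w_alpha), -H)
```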


The group Wrel = NG (A)/CG (A) is the relative Weyl group. In


Theorem 29.1 we constructed simple reflections showing that Wrel contains
the abstract Weyl group associated with the root system Φrel . An analog of
Theorem 18.3 is true – Wrel is generated by the reflections and hence coincides
with the abstract Weyl group. We note that by Theorem 29.1 the generators
of Wrel can be taken in K, so we may write Wrel = NK (A)/CK (A).
Although we have proved that Φrel is a root system, we have not proved
that it is reduced. In fact, it may not be—we will give examples where the type
of Φrel is BCq and is not reduced! In Chap. 20, except for Proposition 20.18,

it was assumed that the root system was reduced. Proposition 20.18 contains
all we need about nonreduced root systems.
The relationship between the three root systems Φ, ΦM , and Φrel can be
expressed in a “short exact sequence of root systems,”

0 −→ ΦM −→ Φ −→ Φrel −→ 0, (29.9)

embedded in the short exact sequence (29.6) of Euclidean spaces. Of course,


this is intended symbolically rather than literally. What we mean by this
“short exact sequence” is that, in accord with Proposition 29.6, each root of
M can be extended to a unique root of Gc ; that the roots in Φ that are not
thus extended from M are precisely those that restrict to a nonzero root in
Φrel ; and that every root in Φrel is a restricted root.

Proposition 29.8. If α ∈ Φ+rel is a simple positive root, then there exists a
β ∈ Φ+ such that β is a simple positive root and β|α. Moreover, if β ∈ Φ+ is a
simple positive root with a restriction to A that is nonzero, then its restriction
is a simple root of Φ+rel.

Proof. Find a root γ ∈ Φ whose restriction to A is α. Since we have chosen the
root systems compatibly, γ is a positive root. We write it as a sum of simple
positive roots: γ = Σ βi. Each of these restricts either trivially or to a relative root in
Φ+rel, and we can write α as the sum of the nonzero restrictions of the βi, which are
positive roots. Because α is simple, exactly one βi can have a nonzero restriction,
and taking β to be this βi, we have β|α.
The last statement is clear.


We see that the restriction map induces a surjective mapping from the set
of simple roots in Φ that have nonzero restrictions to the simple roots in Φrel .
The last question that needs to be answered is when two simple roots of Φ
can have the same nonzero restriction to Φrel .

Proposition 29.9. Let β ∈ Φ+. Then −θ(β) ∈ Φ+. The roots β and −θ(β)
have the same restriction to A. If β is a simple positive root, then so is −θ(β),
and if α is a simple root of Φrel and β, β′ are simple roots of Φ both restrict-
ing to α, then either β′ = β or β′ = −θ(β).

Proof. The fact that β and −θ(β) have the same restriction follows from
Proposition 29.5 (ii). It follows immediately that −θ(β) is a positive root
in Φ. The map β −→ −θ(β) permutes the positive roots, is additive, and
therefore preserves the simple positive roots.
Suppose that α is a simple root of Φrel and β, β′ are simple roots of Φ
both restricting to α. Since β − β′ has trivial restriction to Ac, it is θ-invariant.
Rewrite β − β′ = θ(β − β′) as β + (−θ(β)) = β′ + (−θ(β′)). This expresses the
sum of two simple positive roots as the sum of another two simple positive
roots. Since the simple positive roots are linearly independent by Proposi-
tion 20.18, it follows that either β′ = β or β′ = −θ(β).


The symmetry β −→ −θ(β) of the Weyl group is reflected by a symmetry


of the Dynkin diagram. It may be shown that if Gc is simply connected, this
symmetry corresponds to an outer automorphism of GC . Only the Dynkin
diagrams of types An , Dn , and E6 admit nontrivial symmetries, so unless the
absolute root system is one of these types, β = −θ(β).
The relationship between the three root systems in the “short exact se-
quence” (29.9) may be elucidated by the “Satake diagram,” which we will
now discuss. Tables of Satake diagrams may be found in Table VI on p. 532
of Helgason [66], p. 124 of Satake [144], or in Table 4 on p. 229 of Onishchik
and Vinberg [166]. The diagrams in Tits [162] look a little different from the
Satake diagram but contain the same information.
In addition to the Satake diagrams we will work out, a few different ex-
amples are explained in Goodman and Wallach [56].
Knapp [106] contains a different classification based on tori (Cartan sub-
groups) that (in contrast with our “maximally split” torus T ), are maximally
anisotropic, that is, are split as little as possible. Knapp also discusses the re-
lationships between different tori by Cayley transforms. In this classification
the Satake diagrams are replaced by “Vogan diagrams.”
In the Satake diagram, one starts with the Dynkin diagram of Φ. We
recall that the nodes of the Dynkin diagram correspond to simple roots of Gc .
Those corresponding to roots that restrict trivially to A are colored dark.
By Proposition 29.6, these correspond to the simple roots of the anisotropic
kernel M , and indeed one may read the Dynkin diagram of M from the Satake
diagram simply by taking the colored roots.
In addition to coloring some of the roots, the Satake diagram records the
effect of the symmetry β −→ −θ(β) of the Dynkin diagram. In the “exact
sequence” (29.9), corresponding nodes are mapped to the same node in the
Dynkin diagram of Φrel . We will discuss this point later, but for examples of
diagrams with nontrivial symmetries see Figs. 29.3b and 29.5.
As a first example of a Satake diagram, consider SL(3, H). The Satake
diagram is •−◦−•−◦−•. The symmetry β −→ −θ(β) is trivial. From this Satake
diagram, we can read off the Dynkin diagram of M ∼ = SU(2) × SU(2) × SU(2)
by erasing the uncolored dots to obtain the disconnected diagram • • •
of type A1 × A1 × A1 .
On the other hand, in this example, the relative root system is of type A2 .
We can visualize the “short exact sequence of root systems” as in Fig. 29.1,
where we have indicated the destination of each simple root in the inclusion
ΦM −→ Φ and the destinations of those simple roots in Φ that restrict non-
trivially in the relative root system.
0 −→ ΦM −→ Φ −→ Φrel −→ 0
     A1 × A1 × A1      A5      A2

Fig. 29.1. The "short exact sequence of root systems" for SL(3, H)

As a second example, let F = R, and let us consider the group G =
SO(n, 1). In this example, we will see that G has real rank 1 and that the
relative root system of G is of type A1. Groups of real rank 1 are in many
ways the simplest groups. Their symmetric spaces are direct generalizations
of the Poincaré upper half-plane, and the symmetric space of SO(n, 1) is
often referred to as hyperbolic n-space. (It is n-dimensional.) We have seen in


Example 28.7 that this symmetric space can be realized as a hyperboloid.
We will see, consistent with our description of SL(n, H) as a “fattened up”
version of SL(n, R), that SO(n, 1) can be seen as a “fattened up” version of
SO(2, 1).
We originally defined G = SO(n, 1) to be the set of g ∈ GL(n + 1, R) such
that g J t g = J, where J = J1 and
 
J_1 = \begin{pmatrix} I_n & \\ & -1 \end{pmatrix}.

However, we could just as easily take J = J2 and


J_2 = \begin{pmatrix} & & 1 \\ & I_{n-1} & \\ 1 & & \end{pmatrix}

since this symmetric matrix also has eigenvalues 1 with multiplicity n and −1
with multiplicity 1. Thus, if
u = \begin{pmatrix} 1/\sqrt{2} & & -1/\sqrt{2} \\ & I_{n-1} & \\ 1/\sqrt{2} & & 1/\sqrt{2} \end{pmatrix},

then u ∈ O(n + 1) and u J1 t u = J2 . It follows that if g J1 t g = J1 , then h =


ugu−1 satisfies h J2 t h = J2 . The two orthogonal groups are thus equivalent,
and we will take J = J2 in the definition of O(n, 1). Then we see that the Lie
algebra of G is

\left\{ \begin{pmatrix} a & x & 0 \\ y & T & -{}^t x \\ 0 & -{}^t y & -a \end{pmatrix} \;\middle|\; T = -{}^t T \right\}.

Here a is a 1×1 block, x is 1×(n−1), y is (n−1)×1, and T is (n−1)×(n−1).


The middle block is just the Lie algebra of SO(n − 1), which is the anisotropic
kernel. The relative Weyl group has order 2, and is generated by J2 . The
Satake diagram is shown in Fig. 29.2 for the two cases n = 11 and n = 10.

Fig. 29.2. Satake diagrams for the rank 1 groups SO(n, 1): (a) SO(11, 1) (Type DII); (b) SO(10, 1) (Type BII)
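The change of basis between the two realizations is easy to verify numerically. The following sketch (plain NumPy; the helper names `J1`, `J2`, `u` are ours, not the book's) checks that the matrix u above is orthogonal and conjugates J1 into J2:

```python
import numpy as np

def J1(n):
    # diag(I_n, -1), the first realization of the symmetric form
    return np.diag([1.0] * n + [-1.0])

def J2(n):
    # 1's in the corners, I_{n-1} in the middle
    J = np.zeros((n + 1, n + 1))
    J[0, n] = J[n, 0] = 1.0
    J[1:n, 1:n] = np.eye(n - 1)
    return J

def u(n):
    # the orthogonal change of basis from the text
    s = 1.0 / np.sqrt(2.0)
    U = np.zeros((n + 1, n + 1))
    U[0, 0], U[0, n] = s, -s
    U[n, 0], U[n, n] = s, s
    U[1:n, 1:n] = np.eye(n - 1)
    return U

for n in (2, 5, 9):
    U = u(n)
    assert np.allclose(U @ U.T, np.eye(n + 1))   # u lies in O(n + 1)
    assert np.allclose(U @ J1(n) @ U.T, J2(n))   # u J1 (t u) = J2
```

Since u is orthogonal, conjugation by u and the substitution g ↦ h = ugu⁻¹ carry one realization of O(n, 1) to the other, as in the text.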

A number of rank 1 groups, such as SO(n, 1), can be found in Cartan’s list.


Notably, among the exceptional groups, we find Type F II. Most of these can
be thought of as “fattened up” versions of SL(2, R) or SO(2, 1), as in the two
cases above. Some rank 1 groups have relative root system of type BC1 .
At the other extreme, let us consider the groups SO(n, n) and SO(n +
1, n − 1). The group SO(n, n) is split. This means that the anisotropic kernel
is trivial and that the absolute and relative root systems Φ and Φrel coincide.
We can take G = {g ∈ GL(2n, R) | g J tg = J}, where

J = \begin{pmatrix} & & 1 \\ & ⋰ & \\ 1 & & \end{pmatrix}.
We leave the details of this case to the reader. The Satake diagram is shown
in Fig. 29.3 when n = 6.

Fig. 29.3. Split and quasisplit even orthogonal groups: (a) SO(6, 6) (Type DI, split); (b) SO(7, 5) (Type DI, quasisplit)

A more interesting case is SO(n + 1, n − 1). This group is quasisplit. This
means that the anisotropic kernel M is Abelian. Since M contains no roots,
there are no colored roots in the Dynkin diagram of a quasisplit group. A split
group is quasisplit, but not conversely, as this example shows. This group
is not split since the absolute and relative root systems Φ and Φrel differ. We can take
G = {g ∈ GL(2n, R) | g J tg = J}, where now
J = \begin{pmatrix} & & J' \\ & I_2 & \\ J' & & \end{pmatrix}, \qquad J' = \begin{pmatrix} & & 1 \\ & ⋰ & \\ 1 & & \end{pmatrix} \quad \bigl((n-1) \times (n-1)\bigr),

so that J has 1’s on the antidiagonal except in the middle 2 × 2 block, which is the identity.

We can take A to be the group of matrices of the form


\mathrm{diag}\bigl(t_1, \ldots, t_{n-1},\, 1,\, 1,\, t_{n-1}^{-1}, \ldots, t_1^{-1}\bigr).

For n = 5, the Lie algebra of SO(6, 4) is shown in Fig. 29.4. For n = 6, the
Satake diagram of SO(7, 5) is shown in Fig. 29.3.

t1 x12 x13 x14 x15 x16 x17 x18 x19 0

x21 t2 x23 x24 x25 x26 x27 x28 0 −x19

x31 x32 t3 x34 x35 x36 x37 0 −x28 −x18

x41 x42 x43 t4 x45 x46 0 −x37 −x27 −x17

x51 x52 x53 x54 0 t5 −x45 −x35 −x25 −x15

x61 x62 x63 x64 −t5 0 −x46 −x36 −x26 −x16

x71 x72 x73 0 −x54 −x64 −t4 −x34 −x24 −x14

x81 x82 0 −x73 −x53 −x63 −x43 −t3 −x23 −x13

x91 0 −x82 −x72 −x52 −x62 −x42 −x32 −t2 −x12

0 −x91 −x81 −x71 −x51 −x61 −x41 −x31 −x21 −t1

Fig. 29.4. The Lie algebra of quasisplit SO(6, 4)

The circling of the x45 and x46 positions in Fig. 29.4 is slightly misleading
because, as we will now explain, these do not correspond exactly to roots.

Indeed, each of the circled coordinates x12 , x23 , and x34 corresponds to a
one-dimensional subspace of g spanning a space Xαi , where i = 1, 2, 3 are
the first three simple roots in Φ. In contrast, the root spaces Xα4 and Xα5
are divided between the x45 and x46 positions. To see this, note that the torus T in
Gc ⊂ GC consists of the matrices

t = \begin{pmatrix}
e^{it_1} \\
& e^{it_2} \\
& & e^{it_3} \\
& & & e^{it_4} \\
& & & & \cos(t_5) & \sin(t_5) \\
& & & & -\sin(t_5) & \cos(t_5) \\
& & & & & & e^{-it_4} \\
& & & & & & & e^{-it_3} \\
& & & & & & & & e^{-it_2} \\
& & & & & & & & & e^{-it_1}
\end{pmatrix}

with ti ∈ R. The simple roots are

α1 (t) = ei(t1 −t2 ) , α2 (t) = ei(t2 −t3 ) , α3 (t) = ei(t3 −t4 ) ,

and
α4 (t) = ei(t4 −t5 ) , α5 (t) = ei(t4 +t5 ) .
The eigenspaces Xα4 and Xα5 are spanned by Xα4 and Xα5, where Xα4 is the 10 × 10 matrix whose only nonzero entries are

(X_{\alpha_4})_{4,5} = 1, \quad (X_{\alpha_4})_{4,6} = i, \quad (X_{\alpha_4})_{5,7} = -1, \quad (X_{\alpha_4})_{6,7} = -i,
and its conjugate is Xα5 .
The involution θ is transpose-inverse. In its effect on the torus T , θ(t−1 )
does not change t1 , t2 , t3 , or t4 but sends t5 −→ −t5 . Therefore, −θ inter-
changes the simple roots α4 and α5 , as indicated in the Satake diagram in
Figs. 29.3 and 29.4.
As a last example, we consider the Lie group SU(p, q), where p > q.
We will see that this has relative root system of type BCq. Recall from Chap. 19 that the root system
of type BCq can be realized as all elements of the form

\pm e_i \pm e_j \ (i < j), \qquad \pm e_i, \qquad \pm 2e_i,

where the ei are standard basis vectors of Rq. See Fig. 19.5 for the case q = 2.
We defined U(p, q) to be

\{g \in \mathrm{GL}(p + q, \mathbb{C}) \mid g J\, {}^t\bar{g} = J\},

where J = J1 , but (as with the group O(n, 1) discussed above) we could just
as well take J = J2 , where now
J_1 = \begin{pmatrix} I_p & \\ & -I_q \end{pmatrix}, \qquad J_2 = \begin{pmatrix} & & I_q \\ & I_{p-q} & \\ I_q & & \end{pmatrix}.

This has the advantage of making the group A diagonal. We can take A to be
the group of matrices of the form
\mathrm{diag}\bigl(t_1, \ldots, t_q,\, I_{p-q},\, t_1^{-1}, \ldots, t_q^{-1}\bigr).

Now the Lie algebra of SU(p, q) consists of


\left\{ \begin{pmatrix} a & x & b \\ y & u & -{}^t\bar{x} \\ c & -{}^t\bar{y} & -{}^t\bar{a} \end{pmatrix} \;\middle|\; b, c, u \text{ skew-Hermitian} \right\}.

Considering the action of the adjoint representation, the roots t_i t_j^{-1} appear in
a, the roots t_i t_j and t_i^2 appear in b, the roots t_i^{-1} t_j^{-1} and t_i^{-2} appear in c, the
roots t_i appear in x, and the roots t_i^{-1} appear in y. Identifying R ⊗ X∗(A) = Rq
in such a way that the rational character t_i corresponds to the standard basis
vector e_i, we see that Φrel is a root system of type BCq. The Satake diagram
is illustrated in Fig. 29.5.
We turn now to the Iwasawa decomposition for G admitting a Cartan
decomposition as in Hypothesis 28.1. The construction is rather similar to
what we have already done in Chap. 26.

Proposition 29.10. Let G, Gc , K, g, and θ satisfy Hypothesis 28.1. Let M


and A be as in Propositions 29.2 and 29.3. Let Φ and Φrel be the absolute and
relative root systems, and let Φ+ and Φ+rel be the positive roots with respect to
compatible orders. Let

\mathfrak{n} = \bigoplus_{\alpha \in \Phi^+_{\mathrm{rel}}} (X_\alpha \cap \mathfrak{g}).

Then n is a nilpotent Lie algebra. It is the Lie algebra of a closed subgroup


N of G. The group N is normalized by M and by A. We may embed the
complexification GC of G into GL(n, C) for some n in such a way that G ⊆
GL(n, R), Gc ⊆ U(n), K ⊆ O(n), N is upper triangular, and θ(g) = t g −1 .

Fig. 29.5. The Satake diagram of SU(p, q), p > q (Type AIII), drawn for p = 8, q = 3; the two arms have q nodes each, and the middle segment has p − q − 1 nodes

Proof. As part of the definition of semisimplicity, it is assumed that the


semisimple group G has a faithful complex representation. Since we may em-
bed GL(n, C) in GL(2n, R), it has a faithful real representation. We may as-
sume that G ⊆ GL(V ), where V is a real vector space. We may then assume
that the complexification GC ⊆ GL(VC ), where VC = C⊗V is the complexified
vector space.
The proof that n is nilpotent is identical to Proposition 26.4 but uses
Proposition 29.5 (iv) instead of Proposition 18.4 (ii). By Lie’s Theorem 26.1,
we can find an R-basis v1 , . . . , vn of V such that each X ∈ n is upper triangular
with respect to this basis. It is nilpotent as a matrix by Proposition 26.5.
Choose a Gc-invariant inner product on VC (i.e., a positive definite Hermitian form ⟨ , ⟩). It induces an inner product on V; that is, its restriction to V
is a positive definite R-bilinear form. Now applying Gram–Schmidt orthogo-
nalization to the basis v1 , . . . , vn , we may assume that they are orthonormal.
This does not alter the fact that n consists of upper triangular matrices. It
follows by imitating the argument of Theorem 26.2 that N = exp(n) is a
Lie group with Lie algebra n. The group M normalizes N because its Lie
algebra normalizes the Lie algebra of N by Proposition 18.4 (iii), so the Lie
algebra of N is invariant under Ad(M A).
We have G ⊆ GL(n, R) since G stabilizes V. It is also clear that Gc ⊆ U(n)
since the vi are an orthonormal basis and the inner product ⟨ , ⟩ was chosen to
be Gc-invariant. Since K ⊆ G ∩ Gc, we have K ⊆ O(n).
It remains to be shown that θ(g) = t g −1 for g ∈ G. Since G is assumed to
be connected in Hypothesis 28.1, it is sufficient to show that θ(X) = −t X for
X ∈ g, and we may treat the cases X ∈ k and X ∈ p separately. If X ∈ k, then
X is skew-symmetric since K ⊆ O(n). Thus, θ(X) = X = −t X. On the other
hand, if X ∈ p, then iX ∈ gc , and iX is skew-Hermitian because Gc ⊆ U(n).
Thus, X is symmetric, and θ(X) = −X = −t X.


Since M normalizes N , we have a Lie subgroup B = M AN of G. We may


call it the (standard) R-Borel subgroup of G. (If G is split or quasisplit, one
may omit the “R-” from this designation.) Let B0 = AN .
Theorem 29.2. (Iwasawa decomposition) With notations as above, each
element g ∈ G can be factored uniquely as bk, where b ∈ B0 and k ∈
K, or as aνk, where a ∈ A, ν ∈ N, and k ∈ K. The multiplication map
A × N × K −→ G is a diffeomorphism.
Proof. This is nearly identical to Theorem 26.3, and we mostly leave the proof
to the reader. We consider only the key point that g = a + n + k. It is sufficient
to show that gC = C a + C n + C k. We have tC ⊆ C a + C m ⊆ C a + C k, so it
is sufficient to show that C n + C k contains Xβ for each β ∈ Φ. If β restricts
trivially to A, then Xβ ⊆ Cm by Proposition 29.6, so we may assume that
β restricts nontrivially. Let α be the restriction of β. If β ∈ Φ+ , then Xβ ⊆
Xα ⊂ C n. On the other hand, if β ∈ Φ− and X ∈ Xβ , then X + θ(X) ∈ C k
and θ(X) ∈ X−β ⊆ X−α ⊂ C n. In either case, Xβ ⊂ C k + C n.
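For a concrete matrix group such as SL(n, R), with K = SO(n), A the positive diagonal matrices, and N the unipotent upper triangular matrices, the Iwasawa factorization in the order g = kaν is exactly the QR decomposition with positive diagonal entries. The following is a minimal NumPy sketch under that identification (the helper name `iwasawa` is ours); the order aνk in the theorem is an equivalent rearrangement.

```python
import numpy as np

def iwasawa(g):
    """Factor an invertible real matrix as g = k a nu with k orthogonal,
    a positive diagonal, and nu unipotent upper triangular (Gram-Schmidt)."""
    k, r = np.linalg.qr(g)
    d = np.sign(np.diag(r))          # make the diagonal of r positive
    k, r = k * d, d[:, None] * r
    a = np.diag(np.diag(r))
    nu = np.linalg.inv(a) @ r
    return k, a, nu

rng = np.random.default_rng(0)
g = rng.standard_normal((4, 4))      # almost surely invertible
k, a, nu = iwasawa(g)
assert np.allclose(k @ a @ nu, g)            # g = k a nu
assert np.allclose(k.T @ k, np.eye(4))       # k lies in O(4)
assert np.all(np.diag(a) > 0)                # a lies in A
assert np.allclose(nu, np.triu(nu))          # nu is upper triangular
assert np.allclose(np.diag(nu), 1.0)         # nu is unipotent
```

The uniqueness assertion of the theorem corresponds to the uniqueness of the QR factorization once the diagonal of r is normalized to be positive.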

Our next goal is to show that the maximal Abelian subspace a is unique
up to conjugacy. First, we need an analog of Proposition 18.14 (ii). Let us say
that H ∈ p is regular if it is contained in a unique maximal Abelian subspace
of p and singular if it is not regular.
Proposition 29.11. (i) If H is regular and Z ∈ p satisfies [H, Z] = 0, then
Z ∈ a.
(ii) An element H ∈ a is singular if and only if dα(H) = 0 for some α ∈ Φrel .
Proof. The element H is singular if and only if there is some Z ∈ p − a
such that [Z, H] = 0, for if this is the case, then H is contained in at least
two distinct maximal Abelian subspaces, namely a and any maximal Abelian
subspace containing the Abelian subspace RZ + RH. Conversely, if no such
Z exists, then any maximal Abelian subspace containing H must obviously
coincide with a.
Now (i) is clear.
We also use this criterion to prove (ii). Consider the decomposition of
Z ∈ p in the eigenspace decomposition (29.4):

Z = Z_0 + \sum_{\alpha \in \Phi_{\mathrm{rel}}} Z_\alpha, \qquad Z_0 \in \mathbb{C}(\mathfrak{m} \oplus \mathfrak{a}), \quad Z_\alpha \in X^{\mathrm{rel}}_\alpha.

We have

0 = [H, Z] = [H, Z_0] + \sum_{\alpha \in \Phi_{\mathrm{rel}}} [H, Z_\alpha] = \sum_{\alpha \in \Phi_{\mathrm{rel}}} d\alpha(H)\, Z_\alpha.

Thus, for all α ∈ Φrel, we have either dα(H) = 0 or Zα = 0. So if dα(H) ≠ 0
for all α ∈ Φrel, then all Zα = 0 and Z = Z0 ∈ C(m ⊕ a). Since Z ∈ p, this implies
that Z ∈ a, and so H is regular. On the other hand, if dα(H) = 0 for some α,
then we can take Z = Zα − θ(Zα) for nonzero Zα ∈ X^rel_α ∩ g; then [Z, H] = 0 and
Z ∈ p − a.


Theorem 29.3. Let a1 and a2 be two maximal Abelian subspaces of p. Then
there exists k ∈ K such that Ad(k)a1 = a2.

Thus, the relative root system does not depend in any essential way on the
choice of a. The argument is similar to the proof of Theorem 16.4.

Proof. By Proposition 29.11 (ii), a1 and a2 contain regular elements H1
and H2. We will show that [Ad(k)H1, H2] = 0 for some k ∈ K. Choose
an Ad-invariant inner product ⟨ , ⟩ on g, and choose k ∈ K to maximize
⟨Ad(k)H1, H2⟩. If Z ∈ k, then since ⟨Ad(e^{tZ}) Ad(k)H1, H2⟩ is maximal when t = 0,
we have

0 = \frac{d}{dt}\Big|_{t=0} \left\langle \mathrm{Ad}(e^{tZ})\,\mathrm{Ad}(k)H_1,\, H_2 \right\rangle = -\left\langle [\mathrm{Ad}(k)H_1, Z],\, H_2 \right\rangle.

By Proposition 10.3, this equals ⟨Z, [Ad(k)H1, H2]⟩. Since both Ad(k)H1 and
H2 are in p, their bracket is in k, and the vanishing of this inner product for
all Z ∈ k implies that [Ad(k)H1, H2] = 0.
Now take Z = Ad(k)H1 in Proposition 29.11 (i). We see that Ad(k)H1 ∈
a2 , and since both Ad(k)H1 and H2 are regular, it follows that Ad(k)a1 = a2 .



Theorem 29.4. With notations as above, G = KAK.

Proof. Let g ∈ G. Let p = gθ(g)−1 = g t g. We will show that p ∈ exp(p).


By Proposition 29.1, we can write p = exp(Z) k0 , where Z ∈ p and k0 ∈ K,
and we want to show that k0 = 1. By Proposition 29.10, we may embed GC
into GL(n, C) in such a way that G ⊆ GL(n, R), Gc ⊆ U(n), K ⊆ O(n),
and θ(g) = t g −1 . In the matrix realization, p is a positive definite symmetric
matrix. By the uniqueness assertion in Theorem 13.4, it follows that k0 = 1
and p = exp(Z).
Now, by Theorem 29.3, we can find k ∈ K such that Ad(k)Z = H ∈ a. It
follows that kpk −1 = a2 , where a = exp(Ad(k)H/2) ∈ A. Now

(a^{-1}kg)\,\theta(a^{-1}kg)^{-1} = a^{-1}kg\,\theta(g)^{-1}k^{-1}a^{-1} = a^{-1}\,kpk^{-1}\,a^{-1} = 1.

Therefore, a−1 kg ∈ K, and it follows that g ∈ KaK.
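For real matrix groups this argument is realized concretely by the singular value decomposition: writing g = k1 a k2 with k1, k2 orthogonal and a diagonal with positive entries is literally the KAK decomposition for GL(n, R) relative to K = O(n). A NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.standard_normal((4, 4))       # almost surely invertible

# SVD: g = k1 a k2 with k1, k2 in O(4) and a positive diagonal
k1, sigma, k2 = np.linalg.svd(g)
a = np.diag(sigma)

assert np.allclose(k1 @ a @ k2, g)
assert np.allclose(k1 @ k1.T, np.eye(4))
assert np.allclose(k2 @ k2.T, np.eye(4))
assert np.all(sigma > 0)

# The proof's p = g (t g) is then k1 a^2 k1^{-1}, a positive symmetric matrix
assert np.allclose(g @ g.T, k1 @ a**2 @ k1.T)
```

The last assertion is exactly the step in the proof where p = g tg is conjugated into A by an element of K.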




Finally, there is the Bruhat decomposition. Let B be the R-Borel subgroup
of G. If w ∈ Wrel, let ω ∈ NG(A) represent w. Clearly, the double coset BωB
does not depend on the choice of representative ω, and we denote it BwB.

Theorem 29.5. (Bruhat decomposition) We have

G = \bigsqcup_{w \in W_{\mathrm{rel}}} BwB.

Proof. Omitted. See Helgason [66], p. 403.




Exercises
Exercise 29.1. Show that C ⊗ Matn(H) ≅ Mat2n(C) as C-algebras and that the
composition

\mathrm{Mat}_n(\mathbb{H}) \longrightarrow \mathbb{C} \otimes \mathrm{Mat}_n(\mathbb{H}) \cong \mathrm{Mat}_{2n}(\mathbb{C}) \xrightarrow{\ \det\ } \mathbb{C}

takes values in R.
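A numerical sketch of the embedding behind this exercise, writing a quaternionic matrix Q = A + Bj with complex n × n blocks A, B (the helper name `to_complex` is ours): the image is the 2n × 2n complex matrix [[A, B], [−B̄, Ā]], the map is multiplicative, and the determinant of a matrix in the image is real.

```python
import numpy as np

def to_complex(A, B):
    """Image of Q = A + B j (A, B complex n x n) in Mat_{2n}(C)."""
    return np.block([[A, B], [-B.conj(), A.conj()]])

rng = np.random.default_rng(2)
n = 3
A1, B1, A2, B2 = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
                  for _ in range(4)]

# Multiplicative, using (A1 + B1 j)(A2 + B2 j)
#   = (A1 A2 - B1 conj(B2)) + (A1 B2 + B1 conj(A2)) j
lhs = to_complex(A1, B1) @ to_complex(A2, B2)
rhs = to_complex(A1 @ A2 - B1 @ B2.conj(), A1 @ B2 + B1 @ A2.conj())
assert np.allclose(lhs, rhs)

# The determinant of the image (the reduced norm of Q) is real
d = np.linalg.det(to_complex(A1, B1))
assert abs(d.imag) < 1e-8 * (1.0 + abs(d))
```

The reality of the determinant follows since conjugation by the symplectic form carries a matrix in the image to its complex conjugate, so det M = conj(det M).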

Exercise 29.2. Compute the Satake diagrams for SO(p, q) with p ≥ q for all p
and q.

Exercise 29.3. Prove an analog of Theorem 18.3 showing that Wrel is generated
by the reflections constructed in Theorem 29.1.
30
Embeddings of Lie Groups

In this chapter, we will contemplate how Lie groups embed in one another.
Our aim is not to be systematic or even completely precise but to give the
reader some tools for thinking about the relationships between different Lie
groups.
If G is a Lie group and H a subgroup, then there exists a chain of Lie
subgroups of G,
G = G0 ⊃ G1 ⊃ · · · ⊃ Gn = H
such that each Gi is maximal in Gi−1. Dynkin [45–47] classified the maximal subgroups of semisimple complex analytic groups.
semisimple complex analytic subgroups of such a group is known.
Let K1 and K2 be compact connected Lie groups, and let G1 and G2 be
their complexifications. Given an embedding K1 −→ K2 , there is a unique
analytic embedding G1 −→ G2 . The converse is also true: given an analytic
embedding G1 −→ G2 , then K1 embeds as a compact subgroup of G2 . How-
ever, any compact subgroup of G2 is conjugate to a subgroup of K2 (Theorem
28.2), so K1 is conjugate to a subgroup of K2 . Thus, embeddings of compact
connected Lie groups and analytic embeddings of their complexifications are
essentially the same thing. To be definite, let us specify that in this chapter
we are talking about analytic embeddings of complex analytic groups, with
the understanding that the ideas will be applicable in other contexts. By a
“torus,” we therefore mean a group analytically isomorphic to (C×)n for some
n. We will allow ourselves to be a bit sloppy in this chapter, and we will
sometimes write O(n) when we should really write O(n, C).
So let us start with embeddings of complex analytic Lie groups. A useful
class of complex analytic groups that is slightly larger than the semisimple
ones is the class of reductive complex analytic groups. A complex analytic
group G (connected, let us assume) is called reductive if its linear analytic
representations are completely reducible. For example, GL(n, C) is reductive,
though it is not semisimple.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 303


DOI 10.1007/978-1-4614-8024-2 30, © Springer Science+Business Media New York 2013

Examples of groups that are not reductive are parabolic subgroups. Let G
be the complexification of the compact connected Lie group K, and let B be
the Borel subgroup described in Theorem 26.2. A subgroup of G containing B
is called a standard parabolic subgroup. (Any conjugate of a standard parabolic
subgroup is called parabolic.)
As an example of a group that is not reductive, let P ⊂ GL(n, C) be the
maximal parabolic subgroup consisting of matrices
 
\begin{pmatrix} g_1 & * \\ & g_2 \end{pmatrix}, \qquad g_1 \in \mathrm{GL}(r, \mathbb{C}),\ g_2 \in \mathrm{GL}(s, \mathbb{C}),\ r + s = n.
In the standard representation corresponding to the inclusion P −→ GL(n, C),
the set of vectors whose last s entries are zero is a P-invariant subspace
of Cn that has no invariant complement. Therefore, this representation
is not completely reducible, and so P is not reductive.
If G is the complexification of a connected compact group, then analytic
representations of G are completely reducible by Theorem 24.1. It turns out
that the converse is true—a connected complex analytic reductive group is the
complexification of a compact Lie group. We will not prove this, but it is useful
to bear in mind that whatever we prove for complexifications of connected
compact groups is applicable to the class of reductive complex analytic Lie
groups.
Even if we restrict ourselves to finding reductive subgroups of reductive
Lie groups, the problem is very difficult. After all, any faithful representation
gives an embedding of a Lie group in another. There is an important class of
embeddings for which it is possible to give a systematic discussion. Following
Dynkin, we call an embedding of Lie groups or Lie algebras regular if it takes
a maximal torus into a maximal torus and roots into roots. Our first aim is
to show how regular embeddings can be recognized using extended Dynkin
diagrams.
We will use orthogonal groups to illustrate some points. It is convenient
to take the orthogonal group in the form
O_J(n, F) = \{g \in \mathrm{GL}(n, F) \mid g J\, {}^t g = J\}, \qquad J = \begin{pmatrix} & & 1 \\ & ⋰ & \\ 1 & & \end{pmatrix}.
We will take the realization OJ(n, C) ∩ U(n) ≅ O(n) of the usual orthogonal
group in Exercise 5.3 with the maximal torus T consisting of diagonal ele-
ments of OJ (n, C) ∩ U(n). Then, as in Exercise 24.1, OJ (n, C) is the analytic
complexification of the usual orthogonal group O(n). We can take the ordering
of the roots so that the root eigenspaces Xα with α ∈ Φ+ are upper triangular.
We recall that the root system of type Dn is the root system for SO(2n).
Normally, one only considers Dn when n ≥ 4. The reason for this is that the
Lie groups SO(4) and SO(6) have root systems of types A1 × A1 and A3,
respectively. To see this, consider the Lie algebra of SO(8). This consists
of the set of all matrices of the form in Fig. 30.1.

t1 x12 x13 x14 x15 x16 x17 0

x21 t2 x23 x24 x25 x26 0 −x17

x31 x32 t3 x34 x35 0 −x26 −x16

x41 x42 x43 t4 0 −x35 −x25 −x15

x51 x52 x53 0 −t4 −x34 −x24 −x14

x61 x62 0 −x53 −x43 −t3 −x23 −x13

x71 0 −x62 −x52 −x42 −x32 −t2 −x12

0 −x71 −x61 −x51 −x41 −x31 −x21 −t1

Fig. 30.1. The Lie algebra of SO(8)

The Lie algebra t of T consists of the subalgebra of diagonal matrices,


where all xij = 0. The 24 roots α are such that each Xα is characterized by
the nonvanishing of exactly one xij . We have circled the Xα corresponding
to the four simple roots and drawn lines to indicate the graph of the Dynkin
diagram. (Note that each xij occurs in two places. We have only circled the
xij in the upper half of the diagram.)
The middle 6 × 6 block, shaded in Fig. 30.1, is the Lie algebra of SO(6),
and the very middle 4 × 4 block, shaded dark, is the Lie algebra of SO(4).
Looking at the simple roots, we can see the inclusions of Dynkin diagrams in
Fig. 30.2. The shadings of the nodes correspond to the shadings in Fig. 30.1.
The coincidences of root systems D2 = A1 × A1 and D3 = A3 are
worth explaining from another point of view. We may realize the group
SO(4) concretely as follows. Let V = Mat2 (C). The determinant is a non-
degenerate quadratic form on the four-dimensional vector space V . Since all
nondegenerate quadratic forms are equivalent, the group of linear
transformations of V preserving the determinant may thus be identified with
O(4). We consider the group

G = {(g1 , g2 ) ∈ GL(2, C) × GL(2, C) | det(g1 ) = det(g2 )}.

This group acts on V by

(g1 , g2 ) : X −→ g1 Xg2−1 .

This action preserves the determinant, so we have a homomorphism G −→


O(4). There is a kernel Z Δ consisting of the scalar matrices in GL(2, C)

Fig. 30.2. The inclusions SO(4) → SO(6) → SO(8), corresponding to the inclusions of Dynkin diagrams D2 = A1 × A1 ⊂ D3 = A3 ⊂ D4

embedded diagonally. We therefore have an injective homomorphism


G/Z Δ −→ O(4). Both groups have dimension 6, so this homomorphism is
a surjection onto the connected component SO(4) of the identity.
Using the fact that C is algebraically closed, the subgroup SL(2, C) ×
SL(2, C) of G maps surjectively onto SO(4). The kernel of the map

SL(2, C) × SL(2, C) −→ SO(4)

has order 2, and we may identify the simply-connected group SL(2, C) ×


SL(2, C) as the double cover Spin(4, C). Since SO(4) is a quotient of SL(2, C)×
SL(2, C), we see why its root system is of type A1 × A1 .
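A quick numerical check of this construction (a NumPy sketch): a pair (g1, g2) with det(g1) = det(g2) acts on V = Mat2(C) by X ↦ g1 X g2⁻¹, and the action preserves the quadratic form det.

```python
import numpy as np

rng = np.random.default_rng(3)

g1 = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
g2 = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
# rescale g2 so that det(g1) = det(g2), i.e. (g1, g2) lies in the group G
g2 = g2 * np.sqrt(np.linalg.det(g1) / np.linalg.det(g2))
assert np.isclose(np.linalg.det(g1), np.linalg.det(g2))

X = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
Y = g1 @ X @ np.linalg.inv(g2)

# the action preserves the quadratic form det on V = Mat_2(C)
assert np.isclose(np.linalg.det(Y), np.linalg.det(X))
```

Indeed det(Y) = det(g1) det(X) det(g2)⁻¹ = det(X), which is the computation underlying the homomorphism G −→ O(4).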

Remark 30.1. Although we could have worked with SL(2, C) × SL(2, C) at the
outset, over a field F that was not algebraically closed, it is often better to
use the realization G/Z Δ ∼ = SO(4). The reason is that if F is not algebraically
closed, the image of the homomorphism SL(2, F )×SL(2, F ) −→ SO(4, F ) may
not be all of SO(4). Identifying SL(2)×SL(2) with the algebraic group Spin(4),
this is a special instance of the fact that the covering map Spin(n) −→ SO(n),
though surjective over an algebraically closed field, is not generally surjective
on rational points over a field that is not algebraically closed. A surjective map
may instead be obtained by working with the group of similitudes GSpin(n),
which when n = 4 is the group G. This is analogous to the fact that the
homomorphism SL(2, F ) −→ PGL(2, F ) is not surjective if F is not algebraically
closed, which is why the adjoint group PGL(2, F ) of SL(2) is constructed as
GL(2, F ) modulo the center, not SL(2) modulo the center.

We turn next to SO(6). Let W be a four-dimensional complex vector space.


There is a homomorphism GL(W) −→ GL(∧2W) ≅ GL(6, C), namely the
exterior square map, and there is a bilinear map

\wedge : \wedge^2 W \times \wedge^2 W \longrightarrow \wedge^4 W \cong \mathbb{C}.

The latter map is symmetric since in the exterior algebra

(v_1 \wedge \cdots \wedge v_r) \wedge (w_1 \wedge \cdots \wedge w_s) = (-1)^{rs} (w_1 \wedge \cdots \wedge w_s) \wedge (v_1 \wedge \cdots \wedge v_r).

(Each vi has to move past each wj, producing rs sign changes; here r = s = 2, so the sign is +1.) Hence we may
regard x ↦ x ∧ x as a quadratic form on ∧2W. The subgroup of GL(∧2W) preserving this
quadratic form may therefore be identified with O(6). For g ∈ GL(W) we have

(\wedge^2 g\, x) \wedge (\wedge^2 g\, y) = (\wedge^4 g)(x \wedge y),

and the map g ↦ ∧4g, GL(W) −→ GL(∧4W) ≅ C×, is the determinant. Thus the image of SL(W) = SL(4, C) in GL(∧2W) preserves the quadratic form and is therefore
contained in SO(6). Both SL(4, C) and SO(6) are 15-dimensional and
connected, so we have constructed a homomorphism onto SO(6). The kernel
consists of {±1}, so we see that SO(6) ≅ SL(4, C)/{±I}. Since SO(6) is a
quotient of SL(4, C), we see why its root system is of type A3.
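The following NumPy sketch makes this concrete (the helper names `wedge2`, `sign`, and the Gram matrix `Q` are ours): the exterior square of g ∈ GL(4, C) is the 6 × 6 matrix of 2 × 2 minors of g, and when det(g) = 1 it preserves the symmetric form (x, y) ↦ x ∧ y on ∧2W.

```python
import numpy as np
from itertools import combinations

pairs = list(combinations(range(4), 2))     # basis e_i ^ e_j (i < j) of wedge^2 W

def wedge2(g):
    """6 x 6 matrix of g acting on wedge^2 W: the 2 x 2 minors of g."""
    m = np.zeros((6, 6), dtype=complex)
    for a, (i, j) in enumerate(pairs):
        for b, (k, l) in enumerate(pairs):
            m[a, b] = g[i, k] * g[j, l] - g[i, l] * g[j, k]
    return m

def sign(seq):
    # parity of the permutation given by seq
    s = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                s = -s
    return s

# Gram matrix of the form (x, y) -> x ^ y, valued in wedge^4 W = C e1^e2^e3^e4
Q = np.zeros((6, 6))
for a, (i, j) in enumerate(pairs):
    for b, (k, l) in enumerate(pairs):
        if {i, j, k, l} == {0, 1, 2, 3}:
            Q[a, b] = sign((i, j, k, l))

rng = np.random.default_rng(4)
g = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
g = g / np.linalg.det(g) ** 0.25            # scale into SL(4, C)
h = wedge2(g)
assert np.isclose(np.linalg.det(g), 1.0)
assert np.allclose(h.T @ Q @ h, Q)          # wedge^2 g preserves the form
```

In general h.T @ Q @ h = det(g) · Q, which is the statement that the form transforms by the determinant.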
The maps discussed so far, involving SO(2n) with n = 2, 3, and 4, are reg-
ular. Sometimes (as in these examples) regular embeddings can be recognized
by inclusions of ordinary Dynkin diagrams, but a fuller picture will emerge if
we introduce the extended Dynkin diagram.
Let K be a compact connected Lie group with maximal torus T . Let G be
its complexification. Let Φ, Φ+ , Σ, and other notations be as in Chap. 18.

Proposition 30.1. Suppose in this setting that S is any set of roots such that
if α, β ∈ S and if α + β ∈ Φ, then α + β ∈ S. Then

\mathfrak{h} = \mathfrak{t}_{\mathbb{C}} \oplus \bigoplus_{\alpha \in S} X_\alpha

is a Lie subalgebra of Lie(G).

Proof. It is immediate from Proposition 18.4 (ii) and Proposition 18.3 (ii)
that this vector space is closed under the bracket.


We will not worry too much about verifying that h is the Lie algebra of a
closed Lie subgroup of G except to remark that we have some tools for this,
such as Theorem 14.3.
We have already introduced the Dynkin diagram in Chap. 25. We recall
that the Dynkin diagram is obtained as a graph whose vertices are in bijection
with Σ. Let us label Σ = {α1 , . . . , αr }, and let si = sαi . Let θ(αi , αj ) be the
angle between the roots αi and αj . Then


n(s_i, s_j) = \begin{cases} 2 & \text{if } \theta(\alpha_i, \alpha_j) = \pi/2, \\ 3 & \text{if } \theta(\alpha_i, \alpha_j) = 2\pi/3, \\ 4 & \text{if } \theta(\alpha_i, \alpha_j) = 3\pi/4, \\ 6 & \text{if } \theta(\alpha_i, \alpha_j) = 5\pi/6. \end{cases}

The extended Dynkin diagram adjoins to the graph of the Dynkin diagram
one more node, which corresponds to the negative root α0 such that −α0
308 30 Embeddings of Lie Groups

is the highest weight vector in the adjoint representation. The negative root
α0 is sometimes called the affine root, because of its role in the affine root
system (Chap. 23). As in the usual Dynkin diagram, we connect the vertices
corresponding to αi and αj only if the roots are not orthogonal. If they make
an angle of 2π/3, we connect them with a single bond; if they make an angle
of 3π/4, we connect them with a double bond; and if they make an angle of
5π/6, we connect them with a triple bond.
The basic paradigm is that if we remove a node from the extended Dynkin
diagram, what remains will be the Dynkin diagram of a subgroup of G. To get
some feeling for why this is true, let us consider an example in the exceptional
group G2 . We may take S in Proposition 30.1 to be the set of six long roots.
These form a root system of type A2 , and h is the Lie algebra of a Lie subgroup
isomorphic to SL(3, C). Since SL(3, C) is the complexification of the simply-connected compact Lie group SU(3), it follows from Theorem 14.3 that there
is a homomorphism SL(3, C) −→ G.

Fig. 30.3. The exceptional root α0 of G2 (• = positive roots)

The ordinary Dynkin diagram of G2 does not reflect the existence of this
embedding. However, from Fig. 30.3, we see that the roots α2 and α0 can be
taken as the simple roots of SL(3, C). The embedding of SL(3, C) can be understood as an inclusion of the A2 (ordinary) Dynkin diagram in the extended
G2 Dynkin diagram (Fig. 30.4).

Fig. 30.4. The inclusion of SL(3) in G2: the A2 ordinary Dynkin diagram (nodes α2, α0) inside the G2 extended Dynkin diagram (nodes α1, α2, α0)



Let us consider some more extended Dynkin diagrams. If n > 2, and if G


is the odd orthogonal group SO(2n + 1), its root system is of type Bn , and
its extended Dynkin diagram is as in Fig. 30.5. We confirm this in Fig. 30.6
for SO(9) – that is, when n = 4 – by explicitly marking the simple roots
α1 , . . . , αn and the largest root α0 .

Fig. 30.5. The extended Dynkin diagram of type Bn (nodes α0, α1, . . . , αn)

t1 x12 x13 x14 x15 x16 x17 x18 0

x21 t2 x23 x24 x25 x26 x27 0 −x18

x31 x32 t3 x34 x35 x36 0 −x27 −x17

x41 x42 x43 t4 x45 0 −x36 −x26 −x16

x51 x52 x53 x54 0 −x45 −x35 −x25 −x15

x61 x62 x63 0 −x54 −t4 −x34 −x24 −x14

x71 x72 0 −x63 −x53 −x43 −t3 −x23 −x13

x81 0 −x72 −x62 −x52 −x42 −x32 −t2 −x12

0 −x81 −x71 −x61 −x51 −x41 −x31 −x21 −t1

Fig. 30.6. The Lie algebra of SO(9)

Next, if n ≥ 5 and G = SO(2n), the root system of G is Dn, and the
extended Dynkin diagram is as in Fig. 30.7. For example, if n = 5, the configuration of roots is as in Fig. 30.8.
We leave it to the reader to check the extended Dynkin diagrams of the
symplectic group Sp(2n), which is of type Cn (Fig. 30.9).
The extended Dynkin diagram of type An (n ≥ 2) is shown in Fig. 30.10.
It has the feature that removing a node leaves the diagram connected. Because
of this, the paradigm of finding subgroups of a Lie group by examining the
extended Dynkin diagram does not produce any interesting examples for
SL(n + 1) or GL(n + 1).
We already encountered the extended Dynkin diagram of G2 in Fig. 30.4.
The extended Dynkin diagrams of all the exceptional groups are listed in
Fig. 30.11.

Fig. 30.7. The extended Dynkin diagram of type Dn (nodes α0, α1, . . . , αn)

Our first paradigm of recognizing the embedding of a group H in G by embedding the ordinary Dynkin diagram of H in the extended Dynkin diagram
of G predicts the embedding of SO(2n) in SO(2n + 1) but not the embedding
of SO(2n + 1) in SO(2n + 2). For this we need another paradigm, which we
call root folding.
We note that the Dynkin diagram Dn+1 has a symmetry interchanging
the vertices αn and αn+1 . This corresponds to an outer automorphism of
SO(2n + 2), namely conjugation by
\begin{pmatrix} I_n & & & \\ & 0 & 1 & \\ & 1 & 0 & \\ & & & I_n \end{pmatrix},

which is in O(2n + 2) but not SO(2n + 2). The fixed subgroup of this outer
automorphism stabilizes the vector v0 = t (0, . . . , 0, 1, −1, 0, . . . , 0). This vector
is not isotropic (that is, it does not have length zero) so the stabilizer is the
group SO(2n + 1) fixing the 2n + 1-dimensional orthogonal complement of
v0. In this embedding SO(2n + 1) −→ SO(2n + 2), the short simple root of
SO(2n + 1) is embedded into the direct sum of Xαn and Xαn+1. We invite the
reader to confirm this for the embedding SO(9) −→ SO(10) with the above
matrices. We envision the Dn+1 Dynkin diagram being folded into the Bn
diagram, as in Fig. 30.12.
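A small NumPy check of this folding setup, taken for SO(10), i.e. n = 4 (the variable names are ours): the conjugating matrix lies in O(2n + 2) but not in SO(2n + 2), and v0 spans its (−1)-eigenspace and is anisotropic for the form J.

```python
import numpy as np

n = 4
N = 2 * n + 2                             # SO(10)

w = np.eye(N)                             # swap the two middle coordinates
w[n:n + 2, n:n + 2] = np.array([[0.0, 1.0], [1.0, 0.0]])

J = np.fliplr(np.eye(N))                  # symmetric form with 1's on the antidiagonal

assert np.allclose(w @ J @ w.T, J)        # w is in O(2n + 2) ...
assert np.isclose(np.linalg.det(w), -1.0) # ... but not in SO(2n + 2)

v0 = np.zeros(N)
v0[n], v0[n + 1] = 1.0, -1.0
assert np.allclose(w @ v0, -v0)           # v0 spans the (-1)-eigenspace of w
assert not np.isclose(v0 @ J @ v0, 0.0)   # v0 is not isotropic
```

Elements of SO(2n + 2) commuting with w preserve the (−1)-eigenspace spanned by v0, hence act on its (2n + 1)-dimensional orthogonal complement, which is the copy of SO(2n + 1) described in the text.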
The Dynkin diagram of type D4 admits a rare symmetry of order 3
(Fig. 30.13). This is associated with a phenomenon known as triality, which
we now discuss.
Referring to Fig. 30.1, the groups Xαi (i = 1, 2, 3, 4) correspond to x12 , x23 ,
x34 and x35 , respectively. The Lie algebra will thus have an automorphism
τ that sends x12 −→ x34 −→ x35 −→ x12 and fixes x23 . Let us consider the
effect on tC , which is the subalgebra of elements t with all xij = 0. Noting
that dα1 (t) = t1 − t2 , dα2 (t) = t2 − t3 , dα3 (t) = t3 − t4 , and dα4 (t) = t3 + t4 ,
we must have

\tau : \begin{cases} t_1 - t_2 \longmapsto t_3 - t_4, \\ t_2 - t_3 \longmapsto t_2 - t_3, \\ t_3 - t_4 \longmapsto t_3 + t_4, \\ t_3 + t_4 \longmapsto t_1 - t_2, \end{cases}

from which we deduce that

t1 x12 x13 x14 x15 x16 x17 x18 x19 0

x21 t2 x23 x24 x25 x26 x27 x28 0 −x19

x31 x32 t3 x34 x35 x36 x37 0 −x28 −x18

x41 x42 x43 t4 x45 x46 0 −x37 −x27 −x17

x51 x52 x53 x54 t5 0 −x46 −x36 −x26 −x16

x61 x62 x63 x64 0 −t5 −x45 −x35 −x25 −x15

x71 x72 x73 0 −x64 −x54 −t4 −x34 −x24 −x14

x81 x82 0 −x73 −x63 −x53 −x43 −t3 −x23 −x13

x91 0 −x82 −x72 −x62 −x52 −x42 −x32 −t2 −x12

0 −x91 −x81 −x71 −x61 −x51 −x41 −x31 −x21 −t1

Fig. 30.8. The Lie algebra of SO(10)

Fig. 30.9. The extended Dynkin diagram of type Cn (nodes α0, α1, . . . , αn)

Fig. 30.10. The extended Dynkin diagram of type An (nodes α0, α1, . . . , αn)

Fig. 30.11. Extended Dynkin diagrams of the exceptional groups. Left: G2, F4, E6. Right: E7, E8

\tau(t_1) = \tfrac{1}{2}(t_1 + t_2 + t_3 - t_4), \qquad \tau(t_2) = \tfrac{1}{2}(t_1 + t_2 - t_3 + t_4),
\tau(t_3) = \tfrac{1}{2}(t_1 - t_2 + t_3 + t_4), \qquad \tau(t_4) = \tfrac{1}{2}(t_1 - t_2 - t_3 - t_4).
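These formulas define a linear map on the Cartan subalgebra which one can check directly has order 3 and acts on the simple roots as claimed; a NumPy sketch:

```python
import numpy as np

# tau in the coordinates (t1, t2, t3, t4), read off from the formulas above
tau = 0.5 * np.array([
    [1,  1,  1, -1],
    [1,  1, -1,  1],
    [1, -1,  1,  1],
    [1, -1, -1, -1],
], dtype=float)

# tau has order 3
assert np.allclose(np.linalg.matrix_power(tau, 3), np.eye(4))

# tau sends the root functionals a1 -> a3 -> a4 -> a1 and fixes a2,
# matching t1 - t2 -> t3 - t4, etc.
e = np.eye(4)
a1, a2, a3, a4 = e[0] - e[1], e[1] - e[2], e[2] - e[3], e[2] + e[3]
assert np.allclose(a1 @ tau, a3)
assert np.allclose(a2 @ tau, a2)
assert np.allclose(a3 @ tau, a4)
assert np.allclose(a4 @ tau, a1)
```

Here a_i @ tau computes the functional dαi evaluated on τ(t), so the four assertions are exactly the four cases displayed above.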

Fig. 30.12. Embedding SO(2n + 1) −→ SO(2n + 2) as “folding”: the Dn+1 diagram (nodes α1, . . . , αn+1) folds onto the Bn diagram (nodes α1, . . . , αn)

Fig. 30.13. Triality: the order 3 symmetry of the D4 diagram permuting α1, α3, α4 and fixing α2

At first this is puzzling since, translated to a statement about the group,


we have
\tau \begin{pmatrix}
t_1 \\
& t_2 \\
& & t_3 \\
& & & t_4 \\
& & & & t_4^{-1} \\
& & & & & t_3^{-1} \\
& & & & & & t_2^{-1} \\
& & & & & & & t_1^{-1}
\end{pmatrix} = \begin{pmatrix}
t_1' \\
& t_2' \\
& & t_3' \\
& & & t_4' \\
& & & & t_4'^{-1} \\
& & & & & t_3'^{-1} \\
& & & & & & t_2'^{-1} \\
& & & & & & & t_1'^{-1}
\end{pmatrix},

where

t_1' = \sqrt{t_1 t_2 t_3 t_4^{-1}}, \qquad t_2' = \sqrt{t_1 t_2 t_3^{-1} t_4},
t_3' = \sqrt{t_1 t_2^{-1} t_3 t_4}, \qquad t_4' = \sqrt{t_1 t_2^{-1} t_3^{-1} t_4^{-1}}.

Due to the ambiguity of the square roots, this is not a univalent map.
The explanation is that since SO(8) is not simply-connected, a Lie algebra automorphism cannot necessarily be lifted to the group. However, there
is automatically induced an automorphism τ of the simply-connected double
cover Spin(8). The center of Spin(8) is (Z/2Z) × (Z/2Z), which has an automorphism of order 3 that does not preserve the kernel (of order 2) of the
covering map Spin(8) −→ SO(8). If we
divide Spin(8) by its entire center (Z/2Z) × (Z/2Z), we obtain the adjoint
group PGO(8), and the triality automorphism of Spin(8) induces an automorphism of order 3 of PGO(8). To summarize, triality is an automorphism
of order 3 of either Spin(8) or PGO(8) but not of SO(8).

The fixed subgroup of τ in either Spin(8) or PGO(8) is the exceptional


group G2 , and the inclusion of G2 in Spin(8) can be understood as a folding of
roots. The unipotent subgroup corresponding to a short simple root of G2 is
included diagonally in the three root groups exp(Xαi ), (i = 1, 3, 4) of Spin(8)
as in Fig. 30.14 (left).
Triality has the following interpretation. The quadratic space V of dimen-
sion 8 on which SO(8) acts can be given the structure of a nonassociative
algebra known as the octonions or Cayley numbers.
If f1 : V −→ V is any nonsingular orthogonal linear transformation, there
exist linear transformations f2 and f3 such that

f1 (xy) = f2 (x)f3 (y).

The linear transformations f2 and f3 are only determined up to sign. The maps
f1 −→ f2 and f1 −→ f3 , though thus not well-defined as automorphisms
of SO(8), do lift to well-defined automorphisms of Spin(8), and the resulting
automorphism f1 −→ f2 is the triality automorphism. Triality permutes the
three orthogonal maps f1 , f2 , and f3 cyclically. Note that if f1 = f2 = f3 ,
then f1 is an automorphism of the octonion ring, so the fixed group G2 is
the automorphism group of the octonions. See Chevalley [36], p.188. As an
alternative to Chevalley’s approach, one may first prove a local form of triality
as in Jacobson [88] and then deduce the global form. See also Schafer [146].
Over an algebraically closed field, the octonion algebra is unique. Over the
real numbers there are two forms, which correspond to the compact group
O(8) and the split form O(4, 4).
So far, the examples we have given of folding correspond to automorphisms
of the group G. For an example that does not, consider the embedding of G2
into Spin(7) (Fig. 30.14, right).

Fig. 30.14. The group G2 embedded in Spin(8) and Spin(7)

A frequent way in which large subgroups of a Lie group arise is as fixed
points of automorphisms, usually involutions. Many of these subgroups can
be understood by the paradigms explained above. A list of such subgroups

can be found in Table 28.1, for in this list, the compact subgroup K is the
fixed point of an involution in the compact group Gc , and this relationship is
also true for the complexifications. For example, the first entry, corresponding
to Cartan’s classification AI, is the symmetric space with Gc = SU(n) and the
subgroup K = SO(n). Assuming that we use the version of the orthogonal
group in Exercise 5.3, the involution θ is g → J t g −1 J, where J is given
by (5.3). This involution extends to the complexification SL(n, C), and the
fixed point set is the subgroup SO(n, C). If n is odd, then every simple root
eigenspace of SO(n, C) embeds in the direct sum of one or two simple root
eigenspaces of SL(n, C), and the embedding may be understood as an example
of the root folding paradigm. But if n = 2r is even, then one of the roots of
SO(2r), namely the simple root er−1 + er , involves non-simple roots of SL(n).
Suppose that V1 and V2 are quadratic spaces (that is, vector spaces
equipped with nondegenerate symmetric bilinear forms). Then V1 ⊕ V2 is
naturally a quadratic space, so we have an embedding O(V1 ) × O(V2 ) −→
O(V1 ⊕ V2 ). The same is true if V1 and V2 are symplectic (that is, equipped
with nondegenerate skew-symmetric bilinear forms). It follows that we have
embeddings

O(r) × O(s) −→ O(r + s),   Sp(2r) × Sp(2s) −→ Sp(2(r + s)).

These embeddings can be understood as embeddings of extended Dynkin diagrams
except in the orthogonal case where r and s are both odd (Exercise 30.2).
Also, if V1 and V2 are vector spaces with bilinear forms βi : Vi × Vi −→ C,
then there is a bilinear form B on V1 ⊗ V2 such that

B(v1 ⊗ v2 , v1′ ⊗ v2′ ) = β1 (v1 , v1′ ) β2 (v2 , v2′ ).

If both β1 and β2 are either symmetric or skew-symmetric, then B is symmetric.
If one of β1 and β2 is symmetric and the other skew-symmetric, then
B is skew-symmetric. Therefore, we have embeddings

O(r) × O(s) −→ O(rs), Sp(2r) × O(s) −→ Sp(2rs),

Sp(2r) × Sp(2s) → Sp(4rs). (30.1)
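The symmetry claims behind these embeddings can be checked concretely: the Gram matrix of B on V1 ⊗ V2 is the Kronecker product of the Gram matrices of β1 and β2, and (A ⊗ B)ᵗ = Aᵗ ⊗ Bᵗ. A small sketch (the sample 2 × 2 forms are arbitrary choices, not from the text):

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def is_symmetric(m):
    return m == transpose(m)

def is_skew(m):
    return transpose(m) == [[-x for x in row] for row in m]

# Sample nondegenerate forms: one symmetric, one skew-symmetric.
SYM = [[2, 1], [1, 3]]
SKEW = [[0, 1], [-1, 0]]
```

Symmetric ⊗ symmetric and skew ⊗ skew come out symmetric; a mixed pair comes out skew-symmetric, matching the three embeddings displayed above.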


The second embedding is the single most important “dual reductive pair,”
which is fundamental in automorphic forms and representation theory. A dual
reductive pair in a Lie or algebraic group H consists of reductive subgroups
G1 and G2 embedded in such a way that G1 is the centralizer of G2 in H and
conversely. If H is the symplectic group, or more properly its “metaplectic”
double cover, then H has an important infinite-dimensional representation
ω introduced by Weil [172]. Weil showed in [173] that in many cases the
restriction of the Weil representation to a dual reductive pair can be used to
understand classical correspondences of automorphic forms due to Siegel. The
importance of this phenomenon cannot be overstated. From Weil’s point of
view this phenomenon is a global one, but Howe [73] gave better foundations,

including a local theory. This is a topic that transcends Lie theory since in
much of the literature one will consider O(s) or Sp(2r) as algebraic groups
defined over a p-adic field or a number field (and its adele ring). Expositions of
pure Lie group applications may be found in Howe and Tan [78] and Goodman
and Wallach [56].
The classification of dual reductive pairs in Sp(2n), described in Weil [173]
and Howe [73], has its origins in the theory of algebras with involutions, due
to Albert [5]. The connection between algebras with involutions and the the-
ory of algebraic groups was emphasized earlier by Weil [171]. A modern and
immensely valuable treatise on algebras with involutions and their relations
with the theory of algebraic groups may be found in Knus, Merkurjev, Rost,
and Tignol [107].
A classification of dual reductive pairs in exceptional groups is in Ruben-
thaler [138]. These examples have proved interesting in the theory of auto-
morphic forms since an analog of the Weil representation is available.
So far, our point of view has been to start with a group G and understand
its large subgroups H, and we have a set of examples sufficient for understand-
ing most, but not all such pairs. Let us consider the alternative question: given
H, how can we embed it in a larger group G?
Suppose, therefore, that π : H → GL(V ) is a representation. We assume
that it is faithful and irreducible. Then we get an embedding of H into GL(V ).
However sometimes there is a smaller subgroup G ⊂ GL(V ) such that the
image of π is contained in G. A frequent case is that G is an orthogonal or
symplectic group. These cases may be classified by considering the theory
of the Frobenius-Schur indicator, which is discussed in the exercises to this
chapter and again in Chap. 43. The Frobenius–Schur indicator ε(π) is the
multiplicity of the trivial character in the generalized character g → χ(g^2),
where χ is the character of π. It equals 0 unless π ∼= π̂ is self-contragredient,
in which case either it equals 1 and π is orthogonal, or −1 and π is symplectic.
This means that if ε(π) = 1, then we may take G = O(n) where n = dim(V ),
while if ε(π) = −1, then dim(V ) is even, and we may take G = Sp(n).
The examples (30.1) can be understood this way. Here’s a couple more.
Let H = SL(2), and let π be the symmetric k-th power representation. The
vector space V is k + 1-dimensional. Exercise 22.15 computes the Frobenius-
Schur indicator, and we see that H embeds in SO(k + 1) if k is even, and
Sp(k + 1) if k is odd. For another example, if H is any simple Lie group,
then the adjoint representation is orthogonal since the Killing form on the Lie
algebra is a nondegenerate symmetric bilinear form. Thus for example we get
an embedding of SL(3) into SO(8).
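For H = SU(2) the Frobenius–Schur indicator of the symmetric power representations can be computed numerically by Weyl integration: the character of the kth symmetric power at an element with eigenvalues e^{±iθ} is the sum of e^{i(k−2j)θ} over 0 ≤ j ≤ k, and ε(π) = (2/π) ∫₀^π χ(g²) sin²θ dθ. A sketch of ours (the function names are assumptions, not a library API):

```python
import math

def char_sym_power(k, theta):
    """Character of the k-th symmetric power representation of SU(2)
    at an element with eigenvalues exp(+i*theta), exp(-i*theta)."""
    return sum(math.cos((k - 2 * j) * theta) for j in range(k + 1))

def frobenius_schur(k, steps=4096):
    """(2/pi) * integral_0^pi chi_k(g^2) sin^2(theta) dtheta,
    approximated by the midpoint rule (Weyl integration on SU(2))."""
    total = 0.0
    for s in range(steps):
        theta = (s + 0.5) * math.pi / steps
        total += char_sym_power(k, 2 * theta) * math.sin(theta) ** 2
    return 2.0 * total / steps
```

The computed values are 1 for k even and −1 for k odd, matching the orthogonal/symplectic dichotomy just described.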
As a final topic, we discuss parabolic subgroups. Just as regular subgroups
of G can be read off from the extended Dynkin diagram, the parabolic subgroups
can be read off from the regular Dynkin diagram. Let Σ′ ⊂ Σ be any
proper subset of the set of simple roots. Then Σ′ is the set of vertices of a
(possibly disconnected) Dynkin diagram D contained in that of G. There will

be a unique parabolic subgroup P such that, for a simple root α ∈ Σ, the
space X−α is contained in the Lie algebra of P if and only if α ∈ Σ′.
The spaces X−α and Xα with α ∈ Σ′ together with tC generate a Lie algebra
m, which is the Lie algebra of a reductive Lie group M , and

u = ⊕ Xα ,   the sum over α ∈ Φ+ with Xα ⊄ m,

is the Lie algebra of a unipotent subgroup U of P . (By unipotent we mean
here that its image in any analytic representation of G consists of unipotent
matrices.) The group P = M U . This factorization is called the Levi
decomposition. The subgroup U of P is normal, so this decomposition is a semidirect
product. The group M is called the Levi factor , and the group U is called the
unipotent radical of P .
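As an illustration of this recipe in type Cn (a sketch of ours, anticipating the symplectic example below; all helper names are assumptions): the positive roots whose expansion in the simple roots involves the deleted node span u, while the rest, together with the Cartan subalgebra, give m.

```python
def positive_roots_C(n):
    """Positive roots of type C_n in the coordinates e_1, ..., e_n."""
    roots = []
    for i in range(n):
        for j in range(i + 1, n):
            roots.append(tuple((m == i) - (m == j) for m in range(n)))  # e_i - e_j
            roots.append(tuple((m == i) + (m == j) for m in range(n)))  # e_i + e_j
        roots.append(tuple(2 * (m == i) for m in range(n)))             # 2 e_i
    return roots

def simple_coords(v):
    """Coefficients of v in the simple roots a_i = e_i - e_{i+1} (i < n)
    and a_n = 2 e_n."""
    partial, coords = 0, []
    for x in v[:-1]:
        partial += x
        coords.append(partial)
    coords.append((partial + v[-1]) // 2)
    return coords

def levi_decomposition_dims(n, r):
    """Dimensions (dim M, dim U) for the parabolic of Sp(2n) obtained by
    deleting the r-th simple root (1-based) from the C_n diagram."""
    pos = positive_roots_C(n)
    levi = [a for a in pos if simple_coords(a)[r - 1] == 0]
    # dim M = rank + each Levi root counted twice (positive and negative)
    return n + 2 * len(levi), len(pos) - len(levi)
```

For n = 5, r = 3 this gives dim M = 19 = dim GL(3) + dim Sp(4) and dim U = 18, consistent with dim Sp(10) = 55 = dim M + 2 · dim U.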
We illustrate all this with an example from the symplectic group. We take
G = Sp(2n) to be {g | tg J g = J}, where J is the 2n × 2n antidiagonal matrix
whose entry in row i and column 2n + 1 − i is −1 for 1 ≤ i ≤ n and +1 for
n + 1 ≤ i ≤ 2n.
This realization of the symplectic group has the advantage that the Xα
corresponding to positive roots α ∈ Φ+ all correspond to upper triangular
matrices. We see from Fig. 30.9 that removing a node from the Dynkin diagram
of type Cn gives a smaller diagram, disconnected unless we take an
end vertex, of type Ar−1 × Cn−r . This is the Dynkin diagram of a maximal
parabolic subgroup with Levi factor M = GL(r) × Sp(2(n − r)). The subgroup
looks like this, in blocks of sizes r, 2m, and r:

M = { diag(g, h, g′ ) | g ∈ GL(r), h ∈ Sp(2m) },
U = { ( Ir ∗ ∗ ; 0 I2m ∗ ; 0 0 Ir ) }.
Here m = n − r. In the matrix M , the matrix g′ depends on g; it is determined
by the requirement that the given matrix be symplectic. Figure 30.15 shows
the parabolic subgroup with Levi factor GL(3) × Sp(4) in Sp(10). In the
figure, the Lie algebra of M is shaded dark and the Lie algebra
of U is shaded light.
The Levi factor M = GL(3) × Sp(4) is a proper subgroup of the larger
group Sp(6)×Sp(4), which can be read off from the extended Dynkin diagram.
The Lie algebra of Sp(6) × Sp(4) is shaded dark in Fig. 30.16.

Fig. 30.15. A parabolic subgroup of Sp(10)

Fig. 30.16. The Sp(6) × Sp(4) subgroup of Sp(10)

Exercises
Exercise 30.1. Discuss as many as possible of the embeddings K−→Gc in Table
28.1 of Chap. 28 using the extended Dynkin diagram of Gc .

Exercise 30.2. In doing the last exercise, one case you may have trouble with is the
embedding of S(O(p) × O(q)) into SO(p + q) when p and q are both odd. To get some
insight, consider the embedding of SO(5) × SO(5) into SO(10). (Note: S(O(p) × O(q))
is the group of elements of determinant 1 in O(p) × O(q) and contains SO(p) × SO(q)
as a subgroup of index 2. For this exercise, it does not matter whether you work
with SO(5) × SO(5) or S(O(5) × O(5)).) Take the form of SO(10) in Fig. 30.8.
This stabilizes the quadratic form x1 x10 + x2 x9 + x3 x8 + x4 x7 + x5 x6 . Consider the
subspaces

V1 = { (a, b, 0, 0, c, −c, 0, 0, d, e)ᵗ },   V2 = { (0, 0, t, u, v, v, w, x, 0, 0)ᵗ }.
Observe that these five-dimensional spaces are mutually orthogonal and that the
restriction of the quadratic form is nondegenerate, so the stabilizers of these two
spaces are mutually centralizing copies of SO(5). Compute the Lie algebras of these
two subgroups, and describe how the roots of SO(10) restrict to SO(5) × SO(5).

Exercise 30.3. Let G be a semisimple Lie group. Assume that the Dynkin diagram
of G has no automorphisms. Show that every representation is self-contragredient.

Exercise 30.4. Let ϖ1 and ϖ2 be the fundamental dominant weights for Spin(5),
so that ϖ2 is the highest weight of the spin representation. Show that the irreducible
representation with highest weight kϖ1 + lϖ2 is orthogonal if l is even, and symplectic
if l is odd.

Exercise 30.5. The group Spin(8) has three distinct irreducible eight-dimensional
representations, namely the standard representation of SO(8) and the two spin
representations. Show that these are permuted cyclically by the triality automorphism.

Exercise 30.6. Prove that if G is semisimple and its Dynkin diagram has no
automorphisms, then every element in G is conjugate to its inverse. Is the converse
true?
31
Spin

This chapter does not depend on the last few chapters, and may be read at
any point after Chap. 23, or even earlier. The results of Chap. 23 are not used
here, but are illustrated by the results of this chapter.
We will take a closer look at the groups SO(N ) and their double covers,
Spin(N ). We assume that N ≥ 3 and that N = 2n + 1 or 2n. These groups
have remarkable “spin” representations of dimension 2^n. We will first show
that this follows from the Weyl theorem of Chap. 22. We will then take a
different point of view and give a different construction, using Clifford
algebras and a uniqueness principle.
The group Spin(N ) was constructed at the end of Chap. 13 as the universal
cover of SO(N ). Since we proved that π1(SO(N )) ∼= Z/2Z, it is a double cover.
In this chapter, we will construct and study the interesting and important spin
representations of the group Spin(N ). We will also show how to compute the
center of Spin(N ).
Let G = SO(N ) and let G̃ = Spin(N ). We will take G in the realization
of Exercise 5.3; that is, as the group of unitary matrices satisfying g J tg = J,
where J is (5.3). Let p : G̃ −→ G be the covering map. Let T be the diagonal
torus in G, and let T̃ = p−1 (T ). Thus ker(p) ∼= π1(SO(N )) ∼= Z/2Z.

Proposition 31.1. The group T̃ is connected and is a maximal torus of G̃.

Proof. Let Π ⊂ G̃ be the kernel of p. The connected component T̃◦ of the
identity in T̃ is a torus of the same dimension as T , so it is a maximal torus in
G̃. Its image in G is isomorphic to T̃◦/(T̃◦ ∩ Π) ∼= T̃◦Π/Π. This is a torus of
G contained in T , and of the same dimension as T , so it is all of T . Thus, the
composition T̃◦ −→ T̃ −→ T is surjective. We see that T̃/Π ∼= T ∼= T̃◦Π/Π
canonically and therefore T̃ = T̃◦Π.
We may identify Π with the fundamental group π1 (G) by Theorem 13.2. It
is a discrete normal subgroup of G̃ and hence central in G̃ by Proposition 23.1.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 319
DOI 10.1007/978-1-4614-8024-2 31, © Springer Science+Business Media New York 2013
320 31 Spin

Thus it is contained in every maximal torus by Proposition 18.14, particularly


in T̃ ◦ . Thus T̃ ◦ = T̃ ◦ Π = T̃ and so T̃ is connected and a maximal torus. 

Composition with p is a homomorphism X ∗ (T ) −→ X ∗ (T̃ ), which induces
an isomorphism R ⊗ X ∗ (T ) −→ R ⊗ X ∗ (T̃ ). We will identify these two vector
spaces, which we denote by V. From the short exact sequence
1 −→ π1 (G) −→ T̃ −→ T −→ 1,
we have a short exact sequence

0 −→ X∗(T ) −→ X∗(T̃ ) −→ X∗(π1 (G)) −→ 0. (31.1)
(Surjectivity of the last map uses Exercise 4.2.) We recall that Λroot ⊆
X∗(T ) ⊆ Λ, where Λroot and Λ are the root and weight lattices.
A typical element of T has the form

t = diag(t1 , . . . , tn , 1, tn^{−1} , . . . , t1^{−1} )   if N = 2n + 1 is odd,
t = diag(t1 , . . . , tn , tn^{−1} , . . . , t1^{−1} )      if N = 2n is even.      (31.2)

In either case, V is spanned by e1 , . . . , en , where ei (t) = ti . The root system,
as we have already seen in Chap. 19, consists of all ±ei ± ej (i ≠ j), with the
additional roots ±ei included only if N = 2n + 1 is odd. Order the roots so
that the positive roots are ei ± ej (i < j) and (if N is odd) ei . This is the
ordering that makes the root eigenspaces Xα upper triangular. See Fig. 30.1
and Fig. 19.3 for the groups SO(8) and SO(9).
It is easy to check that the simple roots are

α1 = e1 − e2 ,
α2 = e2 − e3 ,
. . .
αn−1 = en−1 − en ,
αn = en−1 + en if N = 2n,   αn = en if N = 2n + 1.      (31.3)
The Weyl group may now be described.

Theorem 31.1. The Weyl group W of O(N ) has order 2^n · n! if N = 2n + 1
and order 2^{n−1} · n! if N = 2n. It has as a subgroup the symmetric group Sn ,
which simply permutes the ti in the action on T , or dually the ei in its action
on V. It also has a subgroup H consisting of transformations of the form

ti −→ ti^{±1}   or, dually,   ei −→ ±ei .

If N = 2n + 1, then H consists of all such transformations, and its order
is 2^n . If N = 2n, then H only contains transformations that change an even
number of signs. In either case, H is a normal subgroup of W and W = H · Sn
is a semidirect product.
Proof. Regarding Sn and H as groups of linear transformations of V, the
group H is normalized by Sn , and H ∩ Sn = {1}, so the semidirect product
H · Sn exists and has order 2^n n! or 2^{n−1} n! depending on whether |H| = 2^n
or 2^{n−1}. We must show that this is exactly the group generated by the simple
reflections.
The simple reflections with respect to α1 , . . . , αn−1 are identical with the
simple reflections in the Weyl group Sn of U(n), which is clear since we may
embed U(n) −→ O(2n) or O(2n + 1) by

g −→ diag(g, g∗)   or   g −→ diag(g, 1, g∗),

where g∗ = J tg^{−1} J and J is the matrix with 1’s on the antidiagonal and
0’s elsewhere.

Under this embedding, the Weyl group Sn of U(n) gets embedded in the Weyl
group of O(N ). In its action on the torus, the ti are simply permuted, and in
the action on X∗(T ), the ei are permuted. The i-th simple reflection
in Sn sends αi to its negative (1 ≤ i ≤ n − 1) while permuting the
other positive roots, so it coincides with the i-th simple reflection in SO(N ).
Now let us consider the simple reflection with respect to αn . If N = 2n+ 1,
then since αn = en this just has the effect en −→ −en , and all other ei −→ ei .
A representative in N (T ) can be taken to be the block diagonal matrix

wn = diag( In−1 , ( 0 0 1 ; 0 −1 0 ; 1 0 0 ) , In−1 ).

It is clear that all elements of the group H described in the statement of the
theorem that change the sign of exactly one ei can be generated by conjugating

wn by elements of Sn and that these generate H. Thus W contains HSn .


On the other hand, all simple reflections are contained in HSn , so W = HSn
in this case.
If N = 2n, then since αn = en−1 + en , the simple reflection in αn has the
effect en−1 −→ −en , en −→ −en−1 . A representative in N (T ) can be taken
to be the block diagonal matrix

wn = diag( In−2 , ( 0 0 1 0 ; 0 0 0 1 ; 1 0 0 0 ; 0 1 0 0 ) , In−2 ).
If we multiply this by the simple reflection in αn−1 , which just interchanges
en−1 and en , we get the element of the group H that changes the signs of
en−1 and en and leaves everything else fixed. It is clear that all elements
of the group H described in the statement of the theorem that change the
sign of exactly two ei can be generated by conjugating this element of W by
elements of Sn and that these generate H. Again W contains HSn , and again
all simple reflections are contained in HSn , so W = HSn in this case.
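The orders 2^n · n! and 2^{n−1} · n! can be confirmed by brute force for small n by closing up the simple reflections inside the group of signed permutations. A sketch with our own encoding (w[i] = ±j means ei+1 ↦ ±ej; the function names are assumptions):

```python
def compose(w, u):
    """Composition w o u of signed permutations; w[i] = +-j (1-based)
    means the map sends e_{i+1} to +-e_j."""
    out = []
    for x in u:
        y = w[abs(x) - 1]
        out.append(y if x > 0 else -y)
    return tuple(out)

def generate(gens):
    """Close the generators up under composition (breadth-first)."""
    n = len(gens[0])
    identity = tuple(range(1, n + 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        nxt = []
        for w in frontier:
            for g in gens:
                wg = compose(w, g)
                if wg not in seen:
                    seen.add(wg)
                    nxt.append(wg)
        frontier = nxt
    return seen

def simple_reflections(n, kind):
    """Simple reflections of the Weyl group of SO(N):
    kind 'B' for N = 2n+1, kind 'D' for N = 2n."""
    gens = []
    for i in range(n - 1):                   # reflection in e_i - e_{i+1}
        w = list(range(1, n + 1))
        w[i], w[i + 1] = w[i + 1], w[i]
        gens.append(tuple(w))
    last = list(range(1, n + 1))
    if kind == 'B':                          # reflection in e_n
        last[-1] = -last[-1]
    else:                                    # reflection in e_{n-1} + e_n
        last[-2], last[-1] = -last[-1], -last[-2]
    gens.append(tuple(last))
    return gens
```

For n = 3 this yields 48 = 2³ · 3! elements in type B and 24 = 2² · 3! in type D, and every type D element changes an even number of signs.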


Proposition 31.2. The weight lattice Λ = X∗(T̃ ) consists of all elements of
V of the form

(1/2) (c1 e1 + · · · + cn en ),      (31.4)

where the ci ∈ Z are either all even or all odd.

Proof. From our determination of the simple reflections, which generate W ,
the W -invariant inner product on V = R ⊗ Λ may be chosen so that the ei are
orthonormal. By Proposition 18.10, every weight λ is in the lattice Λ1 of λ ∈ V
such that 2⟨λ, α⟩/⟨α, α⟩ ∈ Z for α in the root lattice. Since we know the root
system, it is easy to see that Λ1 consists exactly of the weights (31.4).
We could now invoke Proposition 23.12. But since Proposition 23.12 is
somewhat deep, let us give a simple alternative argument that avoids it.
We know that Zn ⊂ Λ since Zn = X∗(T ) is the weight lattice of SO(N ),
contained in Λ = X∗(T̃ ) by means of the homomorphism T̃ → T . By (31.1),
Zn is a subgroup of index two in Λ. Since Zn is also of index two in Λ1 , and
since Λ ⊆ Λ1 by Proposition 18.10, we see that Λ = Λ1 .
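The description (31.4) can be double-checked mechanically: λ = ½ Σ ci ei pairs integrally with every root of Bn exactly when the ci all have the same parity. A small verification sketch (helper names ours):

```python
from fractions import Fraction
from itertools import product

def roots_B(n):
    """All roots of type B_n (the root system of SO(2n+1))."""
    roots = []
    for i in range(n):
        for j in range(i + 1, n):
            for si, sj in product((1, -1), repeat=2):
                roots.append(tuple(si * (m == i) + sj * (m == j)
                                   for m in range(n)))
        for s in (1, -1):
            roots.append(tuple(s * (m == i) for m in range(n)))
    return roots

def is_weight(lam, roots):
    """True if 2<lam, a>/<a, a> is an integer for every root a."""
    def dot(u, v):
        return sum(Fraction(x) * y for x, y in zip(u, v))
    return all((2 * dot(lam, a) / dot(a, a)).denominator == 1 for a in roots)
```

Sweeping over all half-integral coordinate vectors in a small box confirms the same-parity criterion.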


From (31.3), we can compute the fundamental dominant weights ϖi .
If N = 2n + 1 is odd, these are

ϖ1 = e1 ,
ϖ2 = e1 + e2 ,
. . .
ϖn−1 = e1 + e2 + · · · + en−1 ,
ϖn = (1/2)(e1 + e2 + · · · + en−1 + en ).

On the other hand, if N = 2n is even, the last two are a little changed. In this
case, the fundamental weights are

ϖ1 = e1 ,
ϖ2 = e1 + e2 ,
. . .
ϖn−1 = (1/2)(e1 + e2 + · · · + en−1 − en ),
ϖn = (1/2)(e1 + e2 + · · · + en−1 + en ).

Of course, to check the correctness of these weights, what one must check is
that 2⟨ϖi , αj⟩/⟨αj , αj⟩ = 1 if i = j, and 0 if i ≠ j, and this is easily done.
We say that a weight is integral if it is in X ∗ (T ) and half-integral if it
is not. Thus a weight is integral if it is of the form (31.4) with the ci even,
and half-integral if they are odd. Dominant integral weights, of course, are
highest weight vectors of representations of SO(N ). By Proposition 31.2, the
dominant half-integral weights are highest weight vectors of representations
of Spin(N ). They are not highest weight vectors of representations of SO(N ).
If N = 2n+ 1, we see that just the last fundamental weight is half-integral,
but if N = 2n, the last two fundamental weights are half-integral. The repre-
sentations with highest weight vectors n (when N = 2n + 1) or n−1 and
n (when N = 2n) are called the spin representations.

Theorem 31.2. (i) If N = 2n + 1, the dimension of the spin representation
π(ϖn ) is 2^n . The weights that occur with nonzero multiplicity in this
representation all occur with multiplicity one; they are

(1/2)(±e1 ± e2 ± · · · ± en ).

(ii) If N = 2n, the dimensions of the spin representations π(ϖn−1 ) and π(ϖn )
are each 2^{n−1} . The weights that occur with nonzero multiplicity in this
representation all occur with multiplicity one; they are

(1/2)(±e1 ± e2 ± · · · ± en ),

where the number of minus signs is odd for π(ϖn−1 ) and even for π(ϖn ).

Proof. There is enough information in Proposition 22.4 to determine the


weights in the spin representations.

Specifically, let λ = ϖn and N = 2n + 1 or 2n, or λ = ϖn−1 if N = 2n. Let
S(λ) be as in Exercise 22.1. Then it is not hard to check that S(λ) is exactly
the set of weights stated in the theorem. By Proposition 22.4, S(λ) ⊇ supp χλ .
On the other hand, it is easy to check that S(λ) consists of a single Weyl group
orbit, namely the orbit of the highest weight vector λ, so S(λ) ⊆ supp χλ ,
and, for this orbit, Proposition 22.4 also tells us that each weight appears in
χλ with multiplicity exactly one.
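Independently of this argument, the dimension 2ⁿ in the odd case can be confirmed with the Weyl dimension formula, dim = Π over positive roots α of ⟨λ + ρ, α⟩/⟨ρ, α⟩, applied to Bn with λ = ϖn = ½(1, …, 1). A sketch in exact rational arithmetic (ours, not the method of the proof):

```python
from fractions import Fraction

def spin_dimension(n):
    """Weyl dimension formula for the B_n highest weight (1/2, ..., 1/2),
    the highest weight of the spin representation of Spin(2n+1)."""
    # rho = (n - 1/2, n - 3/2, ..., 1/2) with e_i orthonormal
    rho = [Fraction(2 * (n - i) - 1, 2) for i in range(n)]
    lam_rho = [Fraction(1, 2) + r for r in rho]
    dim = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            for sign in (1, -1):      # long roots e_i - e_j and e_i + e_j
                dim *= (lam_rho[i] + sign * lam_rho[j]) / (rho[i] + sign * rho[j])
        dim *= lam_rho[i] / rho[i]    # short root e_i
    return dim
```

For n = 2 through 5 the product collapses to exactly 2ⁿ.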


The center of SO(N ) consists of {±IN } if N is even but is trivial if N


is odd. The center of Spin(N ) is more subtle, but we now have the tools to
compute it.

Theorem 31.3. If N = 2n + 1, then Z(G̃) ∼= Z/2Z. If N = 2n, then Z(G̃) ∼=
Z/4Z if n is odd, while Z(G̃) ∼= (Z/2Z) × (Z/2Z) if n is even.

Proof. X ∗ (T̃ ) is described explicitly by Proposition 31.2, and we have also


described the simple roots, which generate Λroot . We leave the verification that
X ∗ (T̃ )/Λroot is as described to the reader. The result follows from Theorem
23.2.
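In the even case the verification left to the reader reduces to a single membership test: the quotient X∗(T̃)/Λroot has order 4, and it is cyclic exactly when the class of ϖn has order 4, that is, when 2ϖn = (1, …, 1) does not lie in the Dn root lattice {v ∈ Zⁿ : Σ vi even}. A sketch (function names ours):

```python
def in_D_root_lattice(v):
    """Membership in the root lattice of D_n: integer vectors whose
    coordinate sum is even."""
    return all(x == int(x) for x in v) and sum(v) % 2 == 0

def center_structure_even(n):
    """Structure of X*(T~)/Lambda_root for Spin(2n).  The quotient has
    order 4; it is cyclic exactly when 2*w_n = (1,...,1) is NOT in the
    root lattice, i.e. exactly when n is odd."""
    if in_D_root_lattice((1,) * n):
        return "(Z/2Z) x (Z/2Z)"
    return "Z/4Z"
```

The parity of n is all that matters, matching the dichotomy in Theorem 31.3.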


Now let us consider the spin representations from a different point of view.
If V is a complex vector space of dimension N with a nondegenerate quadratic
form q, and if W ⊂ V is a maximal subspace on which q restricts to zero, we
will call W Lagrangian. We will see that the dimension of such a Lagrangian
subspace W is n where N = 2n or 2n + 1, so the exterior algebra ⋀W has
dimension 2^n , and we will construct the spin representation on this vector
space.
To construct the spin representation, we will make use of the properties
of Clifford algebras. We digress to develop what we need. For more about
Clifford algebras, see Artin [9], Chevalley [36], Goodman and Wallach [56],
and Lawson and Michelsohn [118].
By a Z/2Z-graded algebra we mean an F -algebra A that decomposes into
a direct sum A0 ⊕ A1 , where A0 is a subalgebra, with Ai · Aj ⊆ Ai+j where i + j
is taken modulo 2. We require that F be contained in A0 and that F is central in A,
but A0 may be strictly larger than F . An element a of A is called homogeneous
if it is in Ai with i ∈ {0, 1}, and then we call deg(a) = i the degree of a.
If A and B are Z/2Z-graded algebras, then we may define a Z/2Z-graded
algebra A ⊗ B. As a vector space, this is the usual tensor product of A and
B, with the following Z/2Z grading:

(A ⊗ B)0 = A0 ⊗ B0 ⊕ A1 ⊗ B1 , (A ⊗ B)1 = A1 ⊗ B0 ⊕ A0 ⊗ B1 .

The multiplication involves a sign as follows. It is sufficient to define the
product of two homogeneous elements, and then we define

(a ⊗ b)(a′ ⊗ b′ ) = (−1)^{deg(b)·deg(a′)} aa′ ⊗ bb′ .      (31.5)

Every Z/2Z-graded algebra A has an automorphism of order 2 that is 1 on
A0 and −1 on A1 . We will denote this operation by a −→ ā. We will encounter
the algebra M = Mat2 (F ) with the following Z/2Z-grading: writing a typical
element as ( a b ; c d ), the subspace M0 consists of the matrices with b = c = 0,
and M1 consists of the matrices with a = d = 0.
Now if A is a Z/2Z-graded algebra, then we may identify elements of A ⊗ M
with 2 × 2 matrices with coefficients in A by mapping

a ⊗ ( 1 0 ; 0 0 ) + b ⊗ ( 0 1 ; 0 0 ) + c ⊗ ( 0 0 ; 1 0 ) + d ⊗ ( 0 0 ; 0 1 )

to the matrix ( a b ; c d ). However, we do not use ordinary matrix multiplication
in this ring. Indeed, by the sign rule (31.5) the multiplication in this “matrix
ring” is twisted by conjugation:

( a b ; c d ) ( a′ b′ ; c′ d′ ) = ( aa′ + bc̄′   ab′ + bd̄′ ; cā′ + dc′   cb̄′ + dd′ ).

Let us denote this Z/2Z-graded algebra M ⊗ A as M (A).


We recall that a ring is simple if it has no proper nontrivial ideals.

Proposition 31.3. If A is a simple Z/2Z-graded algebra then so is M (A).

Proof. Let I be a nonzero ideal. If m = ( a b ; c d ) is a nonzero element of I,
then one of a, b, c, d is nonzero. Left and/or right multiplying by ( 0 1 ; 1 0 ), we
may assume that a ≠ 0. Then left and right multiplying by ( 1 0 ; 0 0 ), we may
assume m = ( a 0 ; 0 0 ). Since A is simple, I contains ( 1 0 ; 0 0 ). Similarly, it
contains ( 0 0 ; 0 1 ). Adding these two elements, the identity is in the ideal I,
which is thus not proper.


We will also encounter the Z/2Z-graded algebra D(F ) which is a two-


dimensional algebra over F generated by an element ζ of Z/2Z-degree 1 that
satisfies ζ 2 = −1. Then A ⊗ D(F ) will be denoted D(A).

Proposition 31.4. In the graded ring D(A) we have D(A)0 ∼


= A as a ring.

Proof. We may identify D(A) with A⊕A as a vector space in which a⊗1+b⊗ζ
is identified with the ordered pair (a, b). In view of (31.5) the multiplication
is
(a, b)(c, d) = (ac − bd, ad + bc).

Now D(A)0 consists of pairs (a, b) with a of degree zero and b of degree 1,
and for this subring, the multiplication is

(a, b)(c, d) = (ac + bd, ad + bc).

Now every element of A can be written uniquely as a + b with a ∈ A0 and


b ∈ A1 . Then a + b −→ (a, b) is clearly an isomorphism A −→ D(A)0 .


Let F be a field, which for simplicity we assume has characteristic not


equal to 2. By a quadratic space we mean a vector space V (over F ) together
with a symmetric bilinear form B = BV : V × V −→ F . We say that the
quadratic space V is nondegenerate if the symmetric bilinear form B is nondegenerate.
Let q(x) = qV (x) = B(x, x). This is a quadratic form, and giving B is
equivalent to giving q since B(x, y) = (1/2)( q(x + y) − q(x) − q(y) ).
The Clifford algebra C(V ) will be an F -algebra characterized by a universal
property: it comes with a map ι : V −→ C(V ) and if x, y ∈ V then ι(x)2 =
q(x). The universal property is that if A is any F -algebra with a linear map
j : V −→ A satisfying j(x)2 = q(x) in A, then there exists a unique algebra
homomorphism J : C(V ) −→ A such that j = J ◦ ι.
Instead of verifying j(x)2 = q(x) it may be more convenient to verify the
equivalent condition j(x)j(y) + j(y)j(x) = 2B(x, y), for x, y ∈ V since the
latter condition is linear, so it is sufficient to verify it on a subset of V that
spans it. The bilinear condition is equivalent to j(x)2 = q(x) since

xy + yx = (x + y)2 − x2 − y 2 , 2B(x, y) = q(x + y) − q(x) − q(y).

In order to construct the Clifford algebra, we may take the tensor algebra
T (V ) modulo the ideal I = IV generated by elements of the form x ⊗ y + y ⊗
x − 2B(x, y) with x, y ∈ V .
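For a diagonal quadratic form the resulting multiplication can be made concrete: represent basis blades e_{i1} · · · e_{ik} by strictly increasing index tuples, move generators past one another with a sign, and contract e_g e_g to q(e_g). A toy model along these lines (ours, not the tensor-algebra construction itself; all names are assumptions):

```python
def blade_mult(a, b, q):
    """Product of basis blades a, b (strictly increasing tuples of indices)
    in the Clifford algebra of the diagonal form q(e_i) = q[i].
    Returns (coefficient, blade)."""
    coeff = 1
    result = list(a)
    for g in b:
        # move e_g leftward past every generator of `result` larger than g
        coeff *= (-1) ** sum(1 for h in result if h > g)
        if g in result:
            result.remove(g)       # e_g e_g = q(e_g)
            coeff *= q[g]
        else:
            result.append(g)
            result.sort()
    return coeff, tuple(result)

def vector_square(xs, q):
    """Square of x = sum_i xs[i] e_i, returned as {blade: coefficient}."""
    total = {}
    for i, xi in enumerate(xs):
        for j, xj in enumerate(xs):
            c, blade = blade_mult((i,), (j,), q)
            total[blade] = total.get(blade, 0) + xi * xj * c
    return {b: c for b, c in total.items() if c != 0}
```

The cross terms e_i e_j + e_j e_i cancel for i ≠ j, so the square of any vector is the scalar q(x), exactly the defining relation of C(V ).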

Proposition 31.5. (i) The Clifford algebra is a Z/2Z-graded algebra.


(ii) Suppose that V is the orthogonal direct sum of two subspaces U and W .
Then C(U ⊕ W ) ∼ = C(U ) ⊗ C(W ).
(iii) The dimension of C(V ) is 2dim(V ) .
(iv) The map i : V −→ C(V ) is injective.
(v) If v1 , v2 , . . . , vd is a basis of V , then the set of products

vi1 vi2 . . . vik (1  i1 < i2 < . . . < ik  d)

is a basis of C(V ). Here we are using (iv) to identify v ∈ V with i(v) ∈
C(V ).

Proof. The tensor algebra T = T (V ) is a graded algebra in which the homogeneous
part of degree k is ⊗^k V . Let

Ti = ⊕_{k ≡ i mod 2} ⊗^k V ,   (i = 0, 1).

Let R be the vector space in T spanned by the relations x⊗y +y ⊗x−2B(x, y)


so that T RT = I. Clearly R ⊂ T0 and so the ideal I is homogeneous in the
sense that I = (I ∩ T0 ) ⊕ (I ∩ T1 ). This implies that the quotient C(V ) =
T (V )/I inherits the Z/2Z-grading from T (V ).
For (ii), we have linear maps iU : U −→ C(U ) and iW : W −→ C(W ).
Define j : U ⊕ W −→ C(U ) ⊗ C(W ) by j(u) = iU (u) ⊗ 1 on U and j(w) = 1 ⊗
iW (w) on W . Using the fact that U and W are orthogonal, we have uw = −wu
in C(U ⊕ W ), from which it follows that j(x)2 = q(x) for x ∈ U ⊕ W . (Indeed
j(x)2 and q(x) equal q(u) + q(w) if x = u + w with u ∈ U and w ∈ W .)
Therefore there exists a ring homomorphism J : C(U ⊕ W ) −→ C(U ) ⊗ C(W )
such that J ◦ iU⊕W = j. The map is surjective since its image contains the
generators iU U ⊗1 and 1⊗iW W . To see that it is injective, we compose it with
the canonical map T (U ⊕ W ) −→ C(U ⊕ W ). The kernel IU⊕W is generated
by the relations x ⊗ y + y ⊗ x − 2B(x, y) with x and y in either U or W , and
in each of the four cases, these are mapped to zero by j. Hence the induced
map C(U ⊕ W ) −→ C(U ) ⊗ C(W ) is injective, and indeed is an isomorphism.
Next let us show that dim C(V ) = 2dim(V ) . We will argue by induction
on dim(V ). If dim(V ) = 1, then let v be a basis vector and a = q(v). The ring
C(V ) is easily seen to be spanned as an F -vector space by 1 and v with one
relation v 2 = a, and clearly v is not zero. Thus dim C(V ) = 2 = 2dim(V ) .
So by induction we may assume that dim(V ) > 1. We may always find a
nonzero vector u and a vector subspace W of codimension 1 that is orthog-
onal to u. Indeed, if the bilinear form B is degenerate, we may take u to be
a nonzero element of the kernel, and W to be any vector space of codimen-
sion 1 not containing u. On the other hand, if V is nondegenerate, we may
find a vector u with q(u) = 0; in this case we take W to be the orthogonal
complement of U = F u. Now C(V ) ∼ = C(U ) ⊗ C(W ). By induction, C(U )
has dimension 2dim(U) and C(W ) has dimension 2dim(W ) and the statement
follows.
If v1 , . . . , vd are a basis of V , it is easy to see using the generating relations
that the vector space spanned by
    iV(vi1) · · · iV(vik),    i1 < . . . < ik,
is closed under multiplication, so these span C(V). Since they are 2^{dim(V)} in
number, they are linearly independent. This proves (v) and (vi).
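The closure property used in this counting argument can be made concrete. The following sketch (illustrative code, not from the text, and assuming for simplicity that the basis is orthogonal) multiplies two basis monomials using only the relations vi vj = −vj vi (i ≠ j) and vi² = q(vi); every product is a scalar multiple of the monomial indexed by the symmetric difference of the index sets, which is why the 2^{dim(V)} monomials span a subspace closed under multiplication.

```python
def clifford_basis_product(S, T, q):
    """Multiply e_S . e_T for an orthogonal basis with v_i^2 = q[i] and
    v_i v_j = -v_j v_i for i != j.  Returns (scalar, index set of result)."""
    coeff = 1
    result = sorted(S)
    for t in sorted(T):
        # bring v_t leftward past the factors of `result` larger than t
        coeff *= (-1) ** sum(1 for s in result if s > t)
        if t in result:
            coeff *= q[t]          # v_t . v_t = q(v_t)
            result.remove(t)
        else:
            result.append(t)
            result.sort()
    return coeff, frozenset(result)

# anticommutativity and squares:
q = {1: 1, 2: 1}
assert clifford_basis_product({1}, {2}, q) == (1, frozenset({1, 2}))
assert clifford_basis_product({2}, {1}, q) == (-1, frozenset({1, 2}))
assert clifford_basis_product({1}, {1}, {1: 5}) == (5, frozenset())
```

Every product lands on the monomial indexed by S Δ T, as in the spanning argument above.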
A vector v is called isotropic if q(v) = 0. Similarly, a subspace W of the
quadratic space V is isotropic if B(x, y) = 0 for x, y ∈ W . If F = R there
may be no nonzero isotropic subspaces (if the quadratic form is positive def-
inite), but if F is C and V is nondegenerate, we will see that the dimension
of a maximal isotropic subspace W of V will be n if dim(V ) = 2n or 2n + 1.
If dim(W ) = n and dim(V ) = 2n or 2n + 1 we call the isotropic subspace
W Lagrangian. It follows from Witt’s theorem (see Lang [116]) that maximal
isotropic subspaces are conjugated transitively by O(N ), and if V is nonde-
generate these are the Lagrangian subspaces, provided Lagrangian subspaces
exist. This is always true if F is algebraically closed.
328 31 Spin
Let V be a two-dimensional quadratic space. Then V is called a hyperbolic plane if it is nondegenerate, and if V has a basis x and y of linearly independent isotropic vectors. We may multiply x by a nonzero constant and also assume that B(x, y) = 1/2.
Proposition 31.6. If V is a hyperbolic plane then C(V) ≅ Mat2(F) as Z/2Z-graded algebras.
Proof. Let X = ( 0 1 ; 0 0 ) and Y = ( 0 0 ; 1 0 ). With x, y such that q(x) = q(y) = 0 and B(x, y) = 1/2, we have x² = y² = 0 and xy + yx = 1 in C(V). Since X and
Y satisfy the same relations, the universal property of the Clifford algebra
implies that there is a homomorphism C(V ) −→ Mat2 (F ). Since Mat2 (F ) is
generated by X and Y , the map is surjective, and since both algebras have
dimension four, it is an isomorphism. The Z/2Z-gradings are compatible.
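The relations used in this proof are easy to verify numerically; the following is a small illustrative check (not part of the proof):

```python
import numpy as np

# X and Y from the proof of Proposition 31.6; they should satisfy the
# hyperbolic-plane relations x^2 = y^2 = 0 and xy + yx = 2B(x, y) = 1.
X = np.array([[0, 1], [0, 0]])
Y = np.array([[0, 0], [1, 0]])

assert not (X @ X).any() and not (Y @ Y).any()
assert ((X @ Y + Y @ X) == np.eye(2)).all()
```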
Lemma 31.1. Assume that F is algebraically closed and V is nondegenerate. If dim(V) ≥ 2 then V may be decomposed as V0 ⊕ V′ where V0 is a hyperbolic plane and V′ is its orthogonal complement.
Proof. Let v be any vector with q(v) ≠ 0, and let w be any nonzero vector in the orthogonal complement of v. Then q(w) ≠ 0 also, since otherwise it is in the kernel of the associated symmetric bilinear form B, but B is nondegenerate. Let q(v) = a² and q(w) = −b², which is possible since F is algebraically closed. Then x = bv − aw and y = bv + aw are linearly independent isotropic vectors since a, b ≠ 0. Clearly the space V0 spanned by x and y is a hyperbolic plane, and we may take V′ to be its orthogonal complement.
Proposition 31.7. If F is algebraically closed and V is a nondegenerate quadratic space of dimension 2n or 2n + 1, then V contains Lagrangian subspaces W and W′ such that W ∩ W′ = 0, and B induces a nondegenerate pairing W × W′ −→ F. If dim(V) = 2n + 1 then the one-dimensional orthogonal complement of W + W′ is spanned by a vector z such that q(z) = 1.
Proof. Using Lemma 31.1 repeatedly, we may decompose V = V1 ⊕ V2 ⊕ · · · ⊕ Vn ⊕ V′ where the Vi are hyperbolic planes and V′ is either zero or one-dimensional. Each Vi is spanned by two isotropic vectors xi and yi such that B(xi, yi) = 1/2. Let W be the space spanned by the xi and W′ be the space
spanned by the yi. We have xi yj + yj xi = δij (Kronecker delta). If dim(V) = 2n + 1 then the orthogonal complement V0 of W + W′ is one-dimensional, and since V is nondegenerate we have q(z) ≠ 0 for a nonzero vector z ∈ V0. Since F is algebraically closed, we may scale z so that q(z) = 1.
Let us assume that dim(V) = 2n or 2n + 1. Let us also assume that V has a decomposition V = W ⊕ W′ ⊕ V0 where W and W′ are Lagrangian subspaces (i.e. isotropic subspaces of dimension n) that are dually paired by B. Of course W and W′ are not orthogonal, but the space V0 is assumed to
be the orthogonal complement of W ⊕ W′. It is zero if dim(V) = 2n, but it is one-dimensional if dim(V) = 2n + 1, and in this case we assume that it is spanned by a vector z with q(z) = 1. We call such a decomposition V = W ⊕ W′ ⊕ V0 a Lagrangian decomposition. By Proposition 31.7 there is always a Lagrangian decomposition if F is algebraically closed.
Given a Lagrangian decomposition of V we will describe a representation of C(V) on the exterior algebra ⋀W of W. This module is known as the fermionic Fock space, and we will denote it as Ω.
Proposition 31.8. Given a Lagrangian decomposition V = W ⊕ W′ ⊕ V0 of a nondegenerate quadratic space, let Ω = ⋀W. There exists an algebra homomorphism ω : C(V) −→ End(Ω) in which, for ξ ∈ Ω, we have

    ω(x)ξ = x ∧ ξ,    x ∈ W,    (31.6)

    ω(y)ξ = Σ_{i=1}^{k} 2B(y, wi) (−1)^{i+1} w1 ∧ · · · ∧ ŵi ∧ · · · ∧ wk,    y ∈ W′,    (31.7)

if ξ = w1 ∧ · · · ∧ wk, where the “hat” over ŵi means that this factor is omitted. Also if dim(V) = 2n + 1, let z be the chosen element of V0 with q(z) = 1. If ξ ∈ Ω is homogeneous of degree i, then ω(z)ξ = (−1)^i ξ.
Proof. We can define ω by (31.6) and (31.7) and (if N is odd) the requirement that ω(z)ξ = (−1)^i ξ. Regarding (31.7), this is well-defined by the
universal property of the exterior power because it is easy to check that the
right-hand side is multiplied by −1 if wi and wi+1 are interchanged, so it is
alternating. We have to check that ω is an algebra homomorphism.
We will show that if x, y ∈ V then

    ω(x)ω(y) + ω(y)ω(x) = 2B(x, y).    (31.8)

If x, y ∈ W or x, y ∈ W′, both sides are zero. If x ∈ W and y ∈ W′, and ξ = w1 ∧ . . . ∧ wk, then ω(y)ω(x)ξ consists of k + 1 terms. All but one of these
are the k terms in ω(x)ω(y)ξ but with opposite sign, and the one term that
is not cancelled equals 2B(y, x)ξ, as required. This proves (31.8) in the case
dim(V ) = 2n. If dim(V ) = 2n + 1 we have also to check this if y = z and
either x = z or x ∈ W or x ∈ W  . If x ∈ W or W  , both sides of (31.8) vanish
by definition of ω(z) since ω(x) has graded degree ±1. If x = y = z, then both
sides of (31.8) are multiplication by 2, so (31.8) is proved.
By the universal property of the Clifford algebra, (31.8) implies that there
is a homomorphism C(V ) −→ End(Ω) as required.
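The operators (31.6) and (31.7) are the fermionic creation and annihilation operators, and (31.8) is the canonical anticommutation relation. The following illustrative sketch (hypothetical code, not from the text, with basis monomials of Ω = ⋀W encoded as subsets of {1, . . . , n} and the normalization 2B(w′i, wj) = δij) checks (31.8) on basis vectors:

```python
# Basis monomials of Omega are frozensets S of indices, standing for
# w_{i1} ^ ... ^ w_{ik} with S = {i1 < ... < ik}; a vector is a dict
# {frozenset: coefficient}.

def create(i, xi):
    """omega(w_i): exterior multiplication by w_i, as in (31.6)."""
    out = {}
    for S, c in xi.items():
        if i not in S:                       # w_i ^ w_i = 0
            sign = (-1) ** sum(1 for j in S if j < i)
            out[S | {i}] = out.get(S | {i}, 0) + sign * c
    return out

def annihilate(i, xi):
    """omega(w'_i): the contraction (31.7) with y = w'_i."""
    out = {}
    for S, c in xi.items():
        if i in S:                           # the (-1)^{i+1} sign of (31.7)
            sign = (-1) ** sum(1 for j in S if j < i)
            out[S - {i}] = out.get(S - {i}, 0) + sign * c
    return out

def acomm(f, g, xi):
    """(f g + g f)(xi), dropping zero coefficients."""
    a, b = f(g(xi)), g(f(xi))
    out = {S: a.get(S, 0) + b.get(S, 0) for S in set(a) | set(b)}
    return {S: c for S, c in out.items() if c != 0}

xi = {frozenset({2, 3}): 1}                  # the vector w_2 ^ w_3
c1 = lambda v: create(1, v)
a1 = lambda v: annihilate(1, v)
a2 = lambda v: annihilate(2, v)
assert acomm(c1, a1, xi) == xi               # 2B(w_1, w'_1) = 1, as in (31.8)
assert acomm(c1, a2, xi) == {}               # mixed indices anticommute to 0
```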
We will denote by Ci(V) the homogeneous part of degree i in the Z/2Z-grading. In other words, with i = 0 or 1, Ci(V) = Ai if A = C(V).
Theorem 31.4. Let V be a quadratic space with a nondegenerate symmetric bilinear form and a Lagrangian decomposition V = W ⊕ W′ ⊕ V0, and let Ω = ⋀W as in Proposition 31.8. Assume that the ground field contains an element i such that i² = −1. Let R = C(V) if dim(V) = 2n and R = C0(V) if dim(V) = 2n + 1. Then R is a simple ring with Ω its irreducible module, and in fact the representation π : R −→ End(Ω) in Proposition 31.8 is an isomorphism.
Proof. Since by Lemma 31.1 the even-dimensional subspace W ⊕ W  is an
orthogonal direct sum of hyperbolic planes, it follows from Proposition 31.6
that A = C(W ⊕ W  ) is a simple algebra. If dim(V ) = 2n then R = A.
On the other hand, suppose that dim(V ) = 2n + 1. Then taking ζ = iz,
where q(z) = 1 as in Proposition 31.8, we see that C(V0) ≅ D(F). Therefore C(V) ≅ D(A) where A = C(W ⊕ W′), and by Proposition 31.4 we have R = D(A)0 ≅ A.
In either case, it is a simple algebra by Proposition 31.3. The homomor-
phism π : R −→ End(Ω) must be an isomorphism since it cannot have a kernel
(by the simplicity of R) and both rings have the same dimension 2^{2n}.
We will construct a representation of Spin(N) on Ω (with Ω as in
Theorem 31.4) by first constructing a projective representation of O(N ). We
therefore digress to review projective representations and their relations to
true representations.
If V is a complex vector space, PGL(V ) is GL(V )/Z where Z is the center
of GL(V ), that is, the group of scalar linear transformations of V . Let P :
GL(V ) −→ PGL(V ) be the projection map. A projective representation of a
group G is a homomorphism π : G −→ PGL(V). Equivalently, we may describe
the projective representation by giving a map π′ : G −→ GL(V) such that P ◦ π′ = π.
We review the connection between projective representations and central
extensions. By a central extension of G by an Abelian group A we mean a
group Ĝ with a subgroup isomorphic to A contained in its center, such that (identifying this subgroup with A) we have Ĝ/A ≅ G. In other words we have
a short exact sequence
    1 −→ A −→ Ĝ −p→ G −→ 1
with the image of A contained in the center of Ĝ. We are interested in the
case where A ⊆ C×, the group of nonzero complex numbers.
Suppose (π̂, V ) is a representation of Ĝ. Assume that π̂(a) is a scalar
linear transformation for all a ∈ A. For example, by Schur’s Lemma, this is
true if π̂ is irreducible. Then the map P ◦ π̂ from Ĝ to PGL(V) is constant on
the cosets of A. It thus gives a projective representation of G, the projective
representation associated with π̂.
Proposition 31.9. Suppose that π : G −→ PGL(V ) is a projective repre-
sentation of G. Then there exists a central extension Ĝ of G by C× and a
representation of Ĝ such that π is the projective representation associated
with π̂.
Proof. Choose a map π′ : G −→ GL(V) such that P ◦ π′ = π. Then π′(g1 g2) differs from π′(g1)π′(g2) by a scalar linear transformation, since these have the same image under P. Thus there is a map φ : G × G −→ C× such that

    π′(g1) π′(g2) = φ(g1, g2) π′(g1 g2).    (31.9)
Applying π′ to (g1 g2)g3 = g1(g2 g3) gives the “cocycle relation”

    φ(g1, g2) φ(g1 g2, g3) = φ(g1, g2 g3) φ(g2, g3).
Let Ĝ be (as a set) the Cartesian product G × C× , and we make it a group
by defining
(g1 , ε1 )(g2 , ε2 ) = (g1 g2 , φ(g1 , g2 )ε1 ε2 ). (31.10)
The cocycle relation implies that this group law is associative.
Now define π̂ : Ĝ −→ GL(V ) by
    π̂(g, ε) = ε π′(g).    (31.11)
Then it is easy to see that (31.9) and (31.10) imply that π̂ is a representation,
and it is clear that the associated projective representation is π.
Remark 31.1. If the map φ in (31.9) takes values in a subgroup A of C× ,
then we may obtain a true representation of a central extension of G by A
by exactly the same construction: as a set, the extension is G × A, and the
multiplication is defined by the same formula (31.10). For example if G is a
simply-connected Lie group, the next proposition shows that we may even take A = 1 and obtain a true representation of G.
Proposition 31.10. Suppose that G is a simply-connected Lie group, and
let π : G −→ PGL(V ) be a projective representation. Then there exists a
representation π̂ : G −→ GL(V ) such that π is the projective representation
associated with π̂. Moreover we may assume that π̂(G) ⊆ SL(V ).
Proof. Let d = dim(V ). The natural map SL(V ) −→ PGL(V ) has kernel
of order d, consisting of scalar linear transformations εIV where ε is a d-th root of unity. Hence this is a covering map. Since G is simply-connected,
by Proposition 13.4 we may find a continuous map π′ : G −→ SL(V) such that P ◦ π′ = π. Now as in the proof of Proposition 31.9 we may define φ : G × G −→ C× such that (31.9) is true, and φ is continuous since π′ is. Taking determinants on both sides, φ(g1, g2) is a d-th root of unity. Proceeding
as in the proof of Proposition 31.9, we then obtain a true representation π̂′ of a central extension
    1 −→ μd −→ Ĝ −p→ G −→ 1
where μd is the group of d-th roots of unity in C. Since μd is discrete, the res-
triction of p to the connected component Ĝ◦ of the identity in Ĝ is a covering
map, but since G is simply-connected, this restriction is an isomorphism. Let
s : G −→ Ĝ◦ be the inverse map. Then π̂ = π̂′ ◦ s is a true representation of G whose associated projective representation is π. By (31.11) this construction takes its values in SL(V).
Now the method by which we will construct the spin representations of
orthogonal groups, or more precisely their double covers, may be revealed. The
Clifford algebra has a crucial property: it has only one or two classes of simple
modules, and if Ω is such a module, then O(V ) gets a projective representation
on Ω by the following Proposition. It follows that we have a representation of
a central extension of O(V ), and these are the spin representations.
Proposition 31.11. Let G be a group, and let R be a C-algebra that has a
unique isomorphism class of simple modules. Let Ω be such a module, and
let ω : R −→ EndC (Ω) be the C-algebra homomorphism defined by ω(r)v =
r · v. Let ρ : G −→ Aut(R) be a group homomorphism. Then there exists a
projective representation π : G −→ GL(Ω) such that for r ∈ R, g ∈ G and
v ∈ Ω we have
    π(g) ω(r) = ω(ρ(g)r) π(g).    (31.12)
Proof. Given g ∈ G, define another R-module structure on Ω by means of
the homomorphism ᵍω : R −→ EndC(Ω) given by ᵍω(ρ(g)r) = ω(r), r ∈ R. Denote by ᵍΩ the vector space Ω with this R-module structure. Since R has a unique isomorphism class of simple modules, we may find a π(g) : Ω −→ ᵍΩ that is an R-module isomorphism. By Schur's Lemma, it is determined up to a scalar multiple. The fact that it is an R-module homomorphism amounts to
the identity (31.12). To show that π is a projective representation, we need
to show that π(g1 )π(g2 ) and π(g1 g2 ) are the same, up to a constant multiple.
Indeed, both satisfy (31.12) with g = g1 g2 , and so these two endomorphisms
are proportional.
To apply this, we may take R = C(V ) if dim(V ) = 2n or C(V )0 if
dim(V) = 2n + 1 as in Theorem 31.4. Since O(V) acts by automorphisms on V, hence on R, we obtain a projective representation of O(V) on Ω = ⋀W
for W a Lagrangian subspace. In order to apply Proposition 31.10, we restrict
to the connected subgroup SO(V ), whose universal cover we denote Spin(V ).
Except in the case where dim(V ) = 2, we have already shown that this is a
central extension such that the cover map Spin(V ) −→ SO(V ) has degree 2.
We see that there exists a true representation π : Spin(V ) −→ GL(Ω), and
that the image of this may be taken inside of SL(Ω).
To compare this with our previous computation of the spin representations, let w1, . . . , wn be a basis of W, and let w′1, . . . , w′n be the dual basis of W′, characterized by 2B(wi, w′j) = δij. If dim(V) is even, then w1, . . . , wn, w′n, . . . , w′1 form a basis B of V; if dim(V) is odd, we supplement these by a basis vector v0 of V0, and let B be w1, . . . , wn, v0, w′n, . . . , w′1. Let T be the maximal torus
of SO(V ) that is diagonal with respect to the basis B. If we identify these
with the standard basis of CN then T consists of the elements (31.2). Let T̃
be the preimage of T in Spin(V ).
Proposition 31.12. Let t̃ ∈ T̃. Assume with the above identifications that the image of t̃ in T is (31.2). Then the eigenvalues of σ(t̃) are the 2^n values t1^{±1/2} · · · tn^{±1/2} for an appropriate choice of the square roots.
Proof. Let t ∈ T be the element corresponding to t̃ ∈ T̃ . By (31.12) we have,
for r ∈ R
σ(t̃)ω(r) = ω(ρ(t)r)σ(t̃). (31.13)
Here ρ : SO(V ) −→ Aut(R) is obtained by extending the action of SO(V )
on V to automorphisms of the Clifford algebra, and ω : R −→ End(Ω) is the
representation of Proposition 31.8. By (31.7) the vector 1 ∈ Ω is characterized
by being the unique (up to constant multiple) nonzero vector annihilated by
ω(W′), and since ρ(t)W′ = W′ it follows from this characterization that 1 ∈ Ω is an eigenvector of σ(t̃). Let σ(t̃)1 = λ·1, where λ ∈ C× is to be
determined.
Let r = wi1 . . . wik with i1 < . . . < ik , where the multiplication is in the
Clifford algebra. We have ρ(t)r = ti1 . . . tik r so by (31.13) we see that ω(r)1
is also an eigenvector of σ(t̃), with eigenvalue ti1 . . . tik λ. By (31.6) we have
ω(r)1 = wi1 ∧ . . . ∧ wik, so a basis of Ω consisting of eigenvectors of T̃ consists of the elements wi1 ∧ . . . ∧ wik, and the eigenvalues are λ ti1 · · · tik. Now since
σ(T̃) ⊂ SL(Ω), the product of these eigenvalues is 1, that is,

    ( λ² ∏_{i=1}^{n} ti )^{2^{n−1}} = 1.
Now λ² ∏_{i=1}^{n} ti must depend continuously on t̃, and since it is a 2^{n−1}-st root of unity, it is constant. Clearly λ = 1 when t̃ = 1, so λ² ∏_{i=1}^{n} ti = 1. Therefore we may write λ = ∏_{i=1}^{n} ti^{−1/2} for some choice of the square roots, and the statement follows.
Comparing Proposition 31.12 with Theorem 31.2, we see from this computation of the character that the representation we have constructed is the same spin representation described in that theorem when dim(V) is odd; when dim(V) is even, it is the direct sum of the two spin representations.
This approach, based on Proposition 31.12, is a variant of the construction of the Weil representation, or oscillator representation, a projective representation of the symplectic group over a local field which was introduced in the great paper [172] in order to explain Siegel's work on the theory of quadratic forms. The analogy between the Weil representation and the spin representation was emphasized and applied in the very interesting papers Howe [75, 77].
Let F be a field, and consider the action of Sp(2n, F ) on a vector space
V of dimension 2n with a nondegenerate bilinear form B : V × V −→ F
which satisfies B(x, y) = −B(y, x). There is again a symplectic Clifford alge-
bra more commonly called the Weyl algebra whose definition is similar to the
orthogonal case, except for a sign: in this case the relations to be satisfied are
xy − yx = B(x, y) for x, y ∈ V . Thus if W is a maximal isotropic subspace
(of dimension n) then elements of W commute, rather than anticommute.
Therefore the module Ω would not be the exterior algebra on W but the sym-
metric algebra, and it should be infinite-dimensional. As in the orthogonal
case it is indeed true that if F is a locally compact field (e.g. R, C, a finite or
p-adic field) then the Clifford algebra has a unique irreducible representation
though one takes not (as this reasoning might suggest) the symmetric algebra
but rather the Schwartz space on W . This uniqueness produces a projective
representation of Sp(2n, F ) by the same principle based on Proposition 31.11.
The symplectic Clifford algebra (Weyl algebra) is a quotient of the univer-
sal enveloping algebra of the Heisenberg Lie algebra h, which has generators
Xi , Yi , 1  i  n and the central element Z, with relations [Xi , Yi ] = Z. More
precisely, if we divide U (h) by the ideal generated by Z − λ, where λ is a
nonzero complex number, we obtain the symplectic Clifford algebra. So the
fact that the Weyl algebra has a unique irreducible module is equivalent to the Stone–von Neumann theorem, which asserts that the Heisenberg group, or its Lie algebra, has a unique irreducible module with a given nontrivial central character. See Lion and Vergne [119], Sect. 1.6, for an account of the Weil representation featuring the Stone–von Neumann theorem.
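The relation [Xi, Yi] = Z with Z ↦ λ = 1 can be realized concretely on polynomials in one variable, with X acting as multiplication by x and Y as d/dx. The sketch below (illustrative code, not from the text) checks [d/dx, x] = 1 on coefficient lists:

```python
# Polynomials are coefficient lists [c0, c1, ...] for c0 + c1*x + ...

def creation(p):          # X: multiplication by x
    return [0] + p

def annihilation(p):      # Y: d/dx
    return [k * c for k, c in enumerate(p)][1:]

def commutator_YX(p):
    """[Y, X] p = (d/dx)(x p) - x (dp/dx); this should return p itself."""
    a = annihilation(creation(p))
    b = creation(annihilation(p))
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u - v for u, v in zip(a, b)]

assert commutator_YX([2, 0, 0, 1]) == [2, 0, 0, 1]   # [d/dx, x] = 1 on 2 + x^3
```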
Exercises
Exercise 31.1. Check the details in the proof of Theorem 31.2. That is, verify that
S(λ) is exactly the set of characters stated in the theorem and that it consists of
just the W orbit of λ.
Exercise 31.2. Prove that the restriction of the spin representation of Spin(2n + 1)
to Spin(2n) is the sum of the two spin representations of Spin(2n).
Exercise 31.3. Prove that the restriction of either spin representation of Spin(2n) to Spin(2n − 1) is the spin representation of Spin(2n − 1).
Exercise 31.4. Show that one of the spin representations of Spin(6) gives an iso-
morphism Spin(6) ≅ SU(4). What is the significance of the fact that there are two
spin representations?
For another spin exercise, see Exercise 30.5.
Exercise 31.5. Verify the description of X ∗ (T̃ )/Λroot in Theorem 31.3.
Exercise 31.6. Let G be a compact connected Lie group whose root system is of
type G2 . (See Fig. 19.6.) Prove that G is simply-connected.
Part IV

Duality and Other Topics
32
Mackey Theory
Given a subgroup H of a finite group G, and a representation π of H,
there is an induced representation π G of G. Mackey theory is concerned with
intertwining operators between a pair of induced representations. It is based
on a very simple idea: if the two representations are induced from subgroups
H1 and H2, then every such intertwining operator is convolution with a suitable function Δ, which has left and right translation properties by H1 and
H2 . This leads to a method of calculating the space of intertwining operators,
based on the double cosets H2 \G/H1 .
If H is a subgroup of the finite group G, and if (π, V ) is a representation of
H, then we define the induced representation (π G , V G ) as follows. The vector
space V G consists of all maps f : G −→ V that satisfy f (hg) = π(h) f (g)
when h ∈ H. The representation π^G : G −→ GL(V^G) is by right translation:

    π^G(g)f(x) = f(xg).
It is easy to see that if f ∈ V^G, then so is π^G(g)f, and that π^G is a representation. We will sometimes denote the representation (π^G, V^G) as Ind_H^G(π). If V
happens to be one-dimensional, we may identify V = C. Also in Theorem 32.1
the vector space Hom(V1 , V2 ) plays a role; if V1 and V2 are one-dimensional
we may identify Hom(V1 , V2 ) with C.
We begin with an instructive example of how Mackey theory is used in
practice. Let G = GL(2, F ) where F = Fq is a finite field, and let B be the
Borel subgroup of upper triangular matrices. Let χ1 and χ2 be characters of
F×. Let χ be the character

    χ( ( y1 ∗ ; 0 y2 ) ) = χ1(y1) χ2(y2)    (32.1)
of B. Similarly, let μ1 and μ2 be two other characters of F × , and let μ be the
corresponding character of B.
D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 337


DOI 10.1007/978-1-4614-8024-2 32, © Springer Science+Business Media New York 2013
Proposition 32.1. The representation Ind_B^G(χ) is of degree q + 1. It is irreducible unless χ1 = χ2. Moreover, it is isomorphic to Ind_B^G(μ) if and only if either χ1 = μ1 and χ2 = μ2, or χ1 = μ2 and χ2 = μ1.
This completely classifies the principal series representations of GL(2, F ).
The irreducibles of this type are about half the irreducible representations of
GL(2, F ). The proof will be complete, assuming one fact that will be proved
later in the chapter. The reason for this deviation from linear ordering of the
material is heuristic—we assume that the reader will be more cheerful while
reading the proof of Theorem 32.1 given an example of how the theorem is
used.
Proof. The index of B in G is easily seen to be q + 1, so this is the dimension
of the induced representation. We recall that the vector space for the repre-
sentation IndGB (χ) consists of the space of functions f : G −→ C such that
f (bg) = χ(b)f (g). Let us call this space Vχ . The key calculation is to compute
HomG (Vχ , Vμ ). We will show that

    dim HomG(Vχ, Vμ) = { 1 if χ1 = μ1 and χ2 = μ2; 0 otherwise } + { 1 if χ1 = μ2 and χ2 = μ1; 0 otherwise }.    (32.2)

Before showing how Mackey theory can be used to prove (32.2), let us observe
that this implies the proposition. First, if χ1 = χ2 , then it shows that
HomG (Vχ , Vχ ) is one-dimensional, so Vχ is irreducible. Moreover, (32.2) shows
exactly when there is a nonzero intertwining operator Vχ −→ Vμ , and the
second statement is easily deduced.
To prove (32.2), we make use of Mackey’s theorem, which we will prove
later in the chapter. We recall that if f1 and f2 are functions on G, their
convolution is the function

    (f1 ∗ f2)(g) = Σ_{h∈G} f1(gh) f2(h^{−1}) = Σ_{h∈G} f1(h) f2(h^{−1}g).
Mackey’s theorem (Theorem 32.1 below) asserts that any intertwining operator
T : Vχ −→ Vμ is of the form T f = Δ ∗ f where Δ : G −→ C is a function
satisfying
Δ(b2 g b1 ) = μ(b2 )Δ(g)χ(b1 ).
Such a function is determined by its values on a set of representatives for
the double cosets B\G/B. By the Bruhat decomposition, there are just two
double cosets:

    G = B · 1 · B ∪ B w0 B,    w0 = ( 0 1 ; 1 0 ),
where 1 is, of course, the identity matrix. [A quick proof is given below (27.1).]
So what we will prove is that Δ(1) = 0 unless χ1 = μ1 and χ2 = μ2, and that Δ(w0) = 0 unless χ1 = μ2 and χ2 = μ1. Indeed,
    Δ(1) = Δ( diag(t1, t2) · 1 · diag(t1, t2)^{−1} ) = μ( diag(t1, t2) ) Δ(1) χ( diag(t1, t2)^{−1} ),
that is,
    Δ(1) = μ1(t1) μ2(t2) χ1(t1)^{−1} χ2(t2)^{−1} Δ(1).
Unless χ1 = μ1 and χ2 = μ2 , we may choose t1 and t2 so that

    μ1(t1) μ2(t2) χ1(t1)^{−1} χ2(t2)^{−1} ≠ 1,

proving Δ(1) = 0. The proof that Δ(w0) = 0 unless χ1 = μ2 and χ2 = μ1 is similar.
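The double coset count underlying this argument can be checked by brute force in the smallest case q = 2 (illustrative code, not from the text):

```python
from itertools import product

p = 2  # the field F_2

def mats():
    """All of GL(2, F_p) as 4-tuples (a, b, c, d)."""
    for a, b, c, d in product(range(p), repeat=4):
        if (a * d - b * c) % p != 0:
            yield (a, b, c, d)

def mul(m, n):
    a, b, c, d = m
    e, f, g, h = n
    return ((a*e + b*g) % p, (a*f + b*h) % p, (c*e + d*g) % p, (c*f + d*h) % p)

G = list(mats())
B = [m for m in G if m[2] == 0]       # upper triangular (Borel) subgroup

cosets = {frozenset(mul(mul(b1, g), b2) for b1 in B for b2 in B) for g in G}
assert len(cosets) == 2               # B\G/B = {B, B w0 B}, as in the text
assert sorted(len(c) for c in cosets) == [2, 4]
```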
Now let us treat Mackey theory more systematically. We will work with
finite groups and with representations over an arbitrary ground field F . In this
generality, representations may not be completely reducible. Before consider-
ing Mackey theory in general, we will give two functorial interpretations of
Frobenius reciprocity that correspond to the two special cases where H1 = G
and H2 = G.
Let G be a finite group, F a field, and F [G] the group algebra. If π : G −→
GL(V) is a representation on an F-vector space V, then V becomes an F[G]-module by

    ( Σ_{g∈G} cg · g ) v = Σ_{g∈G} cg π(g)v,    Σ_{g∈G} cg · g ∈ F[G],
and, conversely, if V is an F [G]-module, then π : G −→ GL(V ) defined by
π(g)v = gv is a representation. Thus, the categories of representations of G over F and of F[G]-modules are equivalent. In either case, we may refer to V as a
G-module. An intertwining operator for two representations is the same as an
F [G]-module homomorphism for the corresponding F [G]-modules, and we call
such a map a G-module homomorphism.
Also, if (σ, U ) is a representation of G, then we can restrict σ to H to
obtain a representation of H. We call UH the corresponding H-module. Thus,
as sets, U and UH are equal.
Proposition 32.2. (Frobenius reciprocity, first version) Let H be a sub-
group of G and let (π, V ) be a representation of H. Let (σ, U ) be a represen-
tation of G. Then

    HomG(U, V^G) ≅ HomH(UH, V).    (32.3)

In this isomorphism, J ∈ HomG(U, V^G) and j ∈ HomH(UH, V) correspond if and only if j(u) = J(u)(1) and J(u)(g) = j(σ(g)u).
Proof. Given J ∈ HomG (U, V G ), define j(u) = J(u)(1). We show that j is in
HomH (UH , V ). Indeed, if h ∈ H, we have

    j(σ(h)u) = J(σ(h)u)(1) = ( π^G(h) J(u) )(1)

because J : U −→ V^G is G-equivariant. This equals J(u)(1·h) = J(u)(h·1) = π(h)( J(u)(1) ) = π(h) j(u) because h ∈ H and J(u) ∈ V^G. Therefore, j ∈ HomH(UH, V).
Conversely, if j ∈ HomH(UH, V) and u ∈ U, we define J(u) : G −→ V by J(u)(g) = j(σ(g)u). We leave it to the reader to check that J(u) ∈ V^G and that J : U → V^G is G-equivariant. We also leave it to the reader to check that J → j and j → J are inverse maps, and so HomG(U, V^G) ≅ HomH(UH, V).
If the ground field F = C, then we may reinterpret this statement in terms
of characters. If η and χ are the characters of U and V , respectively, and if
χG is the character of the representation of G on V G , then by Theorem 2.5
we may express Proposition 32.2 by the well-known character identity
    ⟨χ, η⟩H = ⟨χ^G, η⟩G.    (32.4)
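The identity (32.4) can be checked numerically in a toy case (the groups and characters below are hypothetical choices, not from the text):

```python
# G = Z/4 additively, H = {0, 2}, chi the character of H with chi(2) = -1,
# and eta(g) = i^g a character of G.
G = [0, 1, 2, 3]
H = [0, 2]
chi = {0: 1, 2: -1}
eta = {g: 1j ** g for g in G}

# For abelian G the induced character is supported on H:
# chi^G(g) = [G : H] * chi(g) for g in H, and 0 otherwise.
chiG = {g: (len(G) // len(H)) * chi[g] if g in H else 0 for g in G}

def inner(f1, f2, S):
    """The inner product |S|^{-1} sum_{s in S} f1(s) conj(f2(s))."""
    return sum(f1[s] * complex(f2[s]).conjugate() for s in S) / len(S)

assert inner(chi, eta, H) == inner(chiG, eta, G) == 1   # (32.4) holds here
```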
Dual to (32.3) there is also a natural isomorphism
    HomG(V^G, U) ≅ HomH(V, U).    (32.5)
This is slightly more difficult than Proposition 32.2, and it also involves ideas
that we will need in our discussion of Mackey theory. We will approach this
by means of a universal property.
Proposition 32.3. Let H be a subgroup of G and let (π, V) be a representation of H. If v ∈ V, define ι(v) : G −→ V by

    ι(v)(g) = π(g)v if g ∈ H, and ι(v)(g) = 0 otherwise.
Then ι(v) ∈ V^G, and ι : V −→ V^G is H-equivariant. Let (σ, U) be a representation of G. If j : V −→ U is any H-module homomorphism, then there exists a unique G-module homomorphism J : V^G −→ U such that j = J ◦ ι. We have

    J(f) = Σ_{γ∈G/H} σ(γ) j(f(γ^{−1})).    (32.6)
Proof. It is easy to check that ι(v) ∈ V^G and that if h ∈ H, then

    ι(π(h)v) = π^G(h) ι(v).    (32.7)

Thus ι is H-equivariant.
We prove that if f ∈ V^G, then

    f = Σ_{γ∈G/H} π^G(γ) ι(f(γ^{−1})).    (32.8)
Using (32.7), each term on the right-hand side of (32.8) is independent of the
choice of representatives γ of the cosets in G/H. Let us apply the right-hand
side to g ∈ G. We get

    Σ_{γ∈G/H} ι(f(γ^{−1}))(gγ).

Only one coset representative γ of G/H contributes since, by the definition of ι, the contribution is zero unless gγ ∈ H. Since we have already noted that each term on the right-hand side of (32.8) is independent of the choice of γ modulo right multiplication by an element of H, we may as well choose γ = g^{−1}. We obtain ι(f(g))(1) = f(g). This proves (32.8).
Suppose now that J : V^G −→ U is G-equivariant and that j = J ◦ ι. Then, using (32.8),

    J(f) = Σ_{γ∈G/H} J( π^G(γ) ι(f(γ^{−1})) ) = Σ_{γ∈G/H} σ(γ) (J ◦ ι)(f(γ^{−1})),
so J must satisfy (32.6). We leave it to the reader to check that J defined by
(32.6) is independent of the choice of representatives γ for G/H. We check
that it is G-equivariant. If g ∈ G, we have

    J(π^G(g)f) = Σ_{γ∈G/H} σ(γ) j(f(γ^{−1}g)).
The variable change γ −→ gγ permutes the cosets in G/H and shows that

    J(π^G(g)f) = Σ_{γ∈G/H} σ(gγ) j(f(γ^{−1})) = σ(g) J(f),

Corollary 32.1. (Frobenius reciprocity, second version) If H is a sub-
group of the finite group G, and if (σ, U ) and (π, V ) are representations of G
and H, respectively, then HomG(V^G, U) ≅ HomH(V, U), and in this isomorphism j ∈ HomH(V, U) corresponds to J ∈ HomG(V^G, U) if and only if they
are related by (32.6).
Proof. This is a direct restatement of Proposition 32.3.
We turn next to Mackey theory. In the following statement, Hom(V1 , V2 )
means HomF (V1 , V2 ), the space of all linear maps.
Theorem 32.1. (Mackey’s theorem, geometric version) Suppose that
G is a finite group, H1 and H2 subgroups, and (π1 , V1 ) and (π2 , V2 ) represen-
tations of H1 and H2 , respectively. Then HomG (V1G , V2G ) is isomorphic as a
vector space to the space of all functions Δ : G −→ Hom(V1 , V2 ) that satisfy
    Δ(h2 g h1) = π2(h2) ◦ Δ(g) ◦ π1(h1),    hi ∈ Hi.    (32.9)
In this isomorphism an intertwining operator Λ : V1G −→ V2G corresponds to
Δ if Λ(f ) = Δ ∗ f (f ∈ V1G ), where the “convolution” Δ ∗ f is defined by

    (Δ ∗ f)(g) = Σ_{γ∈G/H1} Δ(γ) f(γ^{−1}g).    (32.10)
Proof. Let Δ satisfying (32.9) be given. It is easy to check, using (32.9) and
the fact that f ∈ V1G , that (32.10) is independent of the choice of coset
representatives γ for G/H1 . Moreover, if h2 ∈ H2 , then the variable change
γ −→ h2 γ permutes the cosets of G/H1 , and again using (32.9), this variable
change shows that Δ∗f ∈ V2G . Thus f −→ Δ∗f is a well-defined map V1G −→
V2G , and using the fact that G acts on both these spaces by right translation,
it is straightforward to see that Λ(f ) = Δ ∗ f defines an intertwining operator
V1G −→ V2G .
To show that this map Δ → Λ is an isomorphism of the space of Δ satisfy-
ing (32.9) to HomG (V1G , V2G ), we make use of Corollary 32.1. We must relate
the space of Δ satisfying (32.9) to HomH1 (V1 , V2G ). Given λ ∈ HomH1 (V1 , V2G )
corresponding to Λ ∈ HomG (V1G , V2G ) as in that corollary, define Δ : G −→
Hom(V1 , V2 ) by Δ(g)v1 = λ(v1 )(g). The condition that λ(v1 ) ∈ V2G for all
v1 ∈ V1 is equivalent to
Δ(h2 g) = π2 (h2 ) ◦ Δ(g), h 2 ∈ H2 ,
and the condition that λ : V1 −→ V2G is H1 -equivariant is equivalent to
Δ(gh1 ) = Δ(g) ◦ π1 (h1 ), h 1 ∈ H1 .
Of course, these two properties together are equivalent to (32.9). We see that
Corollary 32.1 implies a linear isomorphism between the space of functions Δ
satisfying (32.9) and the elements of HomG (V1G , V2G ). We have only to show
that this correspondence is given by (32.10). In (32.6), we take H = H1 ,
(σ, U ) = (π2G , V2G ), and j = λ. Then J = Λ and (32.6) gives us, for f ∈ V1G ,

    Λ(f) = Σ_{γ∈G/H1} π2^G(γ) λ(f(γ^{−1})).

Applying this to g ∈ G,

    Λ(f)(g) = Σ_{γ∈G/H1} λ(f(γ^{−1}))(gγ) = Σ_{γ∈G/H1} Δ(gγ) f(γ^{−1}).

Making the variable change γ −→ g −1 γ, this equals (32.10).
Remark 32.1. Although we are working here with finite groups, Mackey’s
theorem is (since Bruhat [26]) a standard tool in representation theory of
Lie groups also. The function Δ becomes a distribution.
Remark 32.2. Suppose that H1 , H2 , and (πi , Vi ) are as in Theorem 32.1. The
function Δ : G −→ Hom(V1 , V2 ) associated with an intertwining operator
Λ : V1G −→ V2G is clearly determined by its values on a set of representatives
for the double cosets in H2 \G/H1 . The simplest case is when Δ is supported
on a single double coset H2 γH1 . In this case, we say that the intertwining
operator Λ is supported on H2 γH1 .

Proposition 32.4. In the setting of Theorem 32.1, let γ ∈ G. Let Hγ = H2 ∩ γH1γ−1. Define two representations (π1γ, V1) and (π2γ, V2) of Hγ as follows.
The representation π2γ is just the restriction of π2 to Hγ . On the other hand,
we define π1γ (h) = π1 (γ −1 hγ) for h ∈ Hγ . The space of intertwining operators
Λ : V1G −→ V2G supported on H2 γH1 is isomorphic to HomHγ (π1γ , π2γ ), the
space of all δ : V1 −→ V2 such that
δ ◦ π1γ (h) = π2γ (h) ◦ δ, h ∈ Hγ . (32.11)
Proof. If Δ : G −→ Hom(V1 , V2 ) is associated with Λ as in Theorem 32.1,
then Δ is by assumption supported on H2 γH1 , and (32.9) implies that Δ is
determined by δ = Δ(γ). This is subject to a consistency condition derived
from (32.9). If h ∈ Hγ, then hγ = γh′, where h′ = γ−1hγ. We have h ∈ H2 and h′ ∈ H1, so by (32.9) the map δ : V1 −→ V2 must satisfy (32.11). Conversely,
if (32.11) is assumed, it is not hard to see that

Δ(g) = { π2(h2) δ π1(h1)   if g = h2γh1 ∈ H2γH1, hi ∈ Hi,
         0                 if g ∉ H2γH1,
is a well-defined function G −→ Hom(V1 , V2 ) satisfying (32.9), and the corre-
sponding intertwining operator Λ is supported on H2 γH1 .

Theorem 32.2. (Mackey’s theorem, algebraic version) In the setting of
Theorem 32.1, let γ1 , . . . , γh be a complete set of representatives of the double
cosets in H2 \G/H1 . With γ = γi , let πiγ be as in Proposition 32.4. We have


dim HomG(V1G, V2G) = ∑_{i=1}^{h} dim HomHγi(π1γi, π2γi).   (32.12)

Proof. If Δ is as in Theorem 32.1, write Δ = ∑_i Δi, where

Δi(g) = { Δ(g)   if g ∈ H2γiH1,
          0      otherwise.
Then Δi satisfy (32.9). Let Λi be the intertwining operator. Then Λi is sup-
ported on a single double coset, and the dimension of the space of such inter-
twining operators is computed in Proposition 32.4.
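As a sanity check on (32.12), the following sketch (an illustration, not part of the text) verifies Mackey's formula in the smallest interesting case: G = S3 with H1 = H2 the order-2 subgroup generated by a transposition and π1 = π2 trivial. The left side is computed as the norm of the induced character, using the formula (32.16) proved later in this chapter; the right side counts double cosets, since each contributes dim Hom_{Hγ}(triv, triv) = 1 here.

```python
from itertools import permutations
from fractions import Fraction

# G = S3 as permutation tuples; composition is (a*b)(i) = a(b(i)).
G = list(permutations(range(3)))

def mul(a, b):
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    r = [0] * 3
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)

H = [(0, 1, 2), (1, 0, 2)]     # the order-2 subgroup generated by (12)

# Character of Ind_H^G(trivial), via (32.16):
# chi^G(g) = (1/|H|) * #{x in G : x g x^-1 in H}.
def ind_triv(g):
    return Fraction(sum(mul(mul(x, g), inv(x)) in H for x in G), len(H))

# Left side of (32.12): <chi^G, chi^G>_G = dim Hom_G(V^G, V^G).
lhs = sum(ind_triv(g) ** 2 for g in G) / len(G)

# Right side of (32.12): with pi_1 = pi_2 trivial, every double coset
# contributes dim Hom_{H_gamma}(triv, triv) = 1, so we just count H\G/H.
cosets = {frozenset(mul(mul(h2, g), h1) for h2 in H for h1 in H) for g in G}
rhs = len(cosets)

print(lhs, rhs)   # 2 2
```

Here Ind_H^G(triv) decomposes as the trivial plus the standard representation, each with multiplicity one, matching the two (H, H) double cosets in S3.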


Corollary 32.2. Assume that the ground field F is of characteristic zero. Let
H1 and H2 be subgroups of G and let (π, V ) be an irreducible representation
of H1 . Let γ1 , . . . , γh be a complete set of representatives of the double cosets
in H2 \G/H1 . If γ ∈ G, let Hγ = H2 ∩ γH1 γ −1 , and let π γ : Hγ −→ GL(V )
be the representation π γ (g) = π(γ −1 gγ). Then the restriction of π G to H2 is
isomorphic to

⊕_{i=1}^{h} Ind_{Hγi}^{H2}(πγi).   (32.13)

In a word, first inducing and then restricting gives the same result as restrict-
ing, then inducing. This way of explaining the result is a pithy oversimplifi-
cation that has to be correctly understood. More precisely, there are different
ways we can restrict, namely given γ we may restrict to Hγ , then induce; we
have to sum over all these different ways. And the different ways depend only
on the double coset H2 γH1 .

Proof. Since we are assuming that the characteristic of F is zero, representa-


tions are completely reducible and it is enough to show that the multiplicity
of an irreducible representation (π2 , V2 ) in π G is the same as the multiplicity
of π2 in the direct sum (32.13). The multiplicity of π2 in π G is


dim HomH2(VG, V2) = dim HomG(VG, V2G) = ∑_{i=1}^{h} dim HomHγi(πγi, π2γi)

by Frobenius reciprocity and Theorem 32.2. One more application of Frobenius reciprocity shows that this equals

∑_{i=1}^{h} dim HomH2(Ind_{Hγi}^{H2}(πγi), π2).



Next we will reinterpret induced representations as obtained by “extension of scalars” as explained in Chap. 11. We must extend the setup there to
noncommutative rings. In particular, we recall the basics of tensor products
over noncommutative rings. Let R be a ring, not necessarily commutative,
and let W be a right R-module and V a left R-module. If C is an Abelian
group (written additively), a map f : W × V −→ C is called balanced if (for
w, w1 , w2 ∈ W and v, v1 , v2 ∈ V )

f (w1 + w2 , v) = f (w1 , v) + f (w2 , v),

f (w, v1 + v2 ) = f (w, v1 ) + f (w, v2 ),


and if r ∈ R,
f (wr, v) = f (w, rv).

The tensor product W ⊗R V is an Abelian group with a balanced map T :


W × V −→ W ⊗R V such that if f : W × V −→ C is any balanced map into an
Abelian group C, then there exists a unique homomorphism F : W ⊗R V −→ C
of Abelian groups such that f = F ◦T. The balanced map T is usually denoted
T (w, v) = w ⊗ v.

Remark 32.3. The tensor product always exists and is characterized up to


isomorphism by this universal property. If R is noncommutative, then W ⊗R V
does not generally have an R-module structure. However, in special cases it
is a module. If A is another ring, we call W an (A, R)-bimodule if it is a left
A-module and a right R-module, and if these module structures are compatible
in the sense that if w ∈ W , a ∈ A, and r ∈ R, then a(wr) = (aw)r. If W is
an (A, R)-bimodule, then W ⊗R V has the structure of a left A-module with
multiplication satisfying

a(w ⊗ v) = aw ⊗ v, a ∈ A.

If R is a subring of A, then A is itself an (A, R)-bimodule. Therefore, if V is


a left R-module, we can consider A ⊗R V and this is a left A-module.

Proposition 32.5. If R is a subring of A and V is a left R-module, let V′ be the left A-module A ⊗R V. We have a homomorphism i : V −→ V′ of R-modules defined by i(v) = 1 ⊗ v. If U is any left A-module and j : V −→ U is an R-module homomorphism, then there exists a unique A-module homomorphism J : V′ −→ U such that j = J ◦ i.

Proof. Suppose that J : V′ −→ U is A-linear and satisfies j = J ◦ i. Then

J(a ⊗ v) = J(a(1 ⊗ v)) = aJ(1 ⊗ v) = aJ(i(v)) = aj(v).

Since V′ is spanned by elements of the form a ⊗ v, this proves that J, if it exists, is unique.
To show that J exists, note that we have a balanced map A × V −→ U given by (a, v) −→ aj(v). Hence, there exists a unique homomorphism J : V′ = A ⊗R V −→ U of Abelian groups such that J(a ⊗ v) = aj(v). It is straightforward to see that this J is A-linear and that J ◦ i = j.


Proposition 32.6. If R is a subring of A, U is a left A-module, and V is a


left R-module, we have a natural isomorphism

HomR (V, U ) ∼
= HomA (A ⊗R V, U ). (32.14)

Proof. This is a direct generalization of Proposition 11.1 (ii). It is also essentially equivalent to Proposition 32.5. Indeed, composition with i : V −→ V′ = A ⊗R V is a map HomA(V′, U) −→ HomR(V, U), and the content of Proposition 32.5 is that this map is bijective.


Proposition 32.7. Suppose that H is a subgroup of G and V is an H-module.


Then V is a module for the group ring F [H], which is a subring of F [G]. We
have an isomorphism
VG ∼= F [G] ⊗F [H] V
as G-modules.

Proof. Comparing Proposition 32.3 and Proposition 32.5, the G-modules V G


and F [G] ⊗F [H] V satisfy the same universal property, so they are isomorphic.



Finally, if F = C, let us recall the formula for the character of the induced
representation. If χ is a class function of the subgroup H of G, let χ̇ : G −→ C
be the function given by χ̇(g) = χ(g) if g ∈ H and χ̇(g) = 0 otherwise,
and let χG : G −→ C be the function

χG(g) = ∑_{x∈H\G} χ̇(xgx−1).   (32.15)

We note that since χ is assumed to be a class function, each term depends


only on the coset of x in H\G. We may, of course, also write
χG(g) = (1/|H|) ∑_{x∈G} χ̇(xgx−1).   (32.16)

Clearly, χG is a class function on G.

Proposition 32.8. Let (π, V ) be a complex representation of the subgroup


H of the finite group G with character χ. Then the character of the induced
representation π G is χG .

Proof. Let η be the character of a representation (σ, U ) of G. We will prove


that the class function χG satisfies Frobenius reciprocity in its classical form (32.4). This suffices because χG is determined by the inner product values ⟨χG, η⟩. We have

⟨χG, η⟩G = (1/|G|) ∑_{g∈G} (1/|H|) ∑_{x∈G} χ̇(xgx−1) η(g)
         = (1/|G|)(1/|H|) ∑_{g∈G} ∑_{h∈H} ∑_{x∈G : xgx−1=h} χ(h) η(g).

Given h ∈ H, we can enumerate the pairs (g, x) ∈ G×G that satisfy xgx−1 = h
by noting that they are the pairs (x−1 hx, x) with x ∈ G. So the sum equals

(1/|G|)(1/|H|) ∑_{h∈H} ∑_{x∈G} χ(h) η(x−1hx) = (1/|H|) ∑_{h∈H} χ(h) η(h) = ⟨χ, η⟩H,

since η(x−1hx) = η(h).
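To illustrate Proposition 32.8 concretely (an illustration, not from the text): inducing a nontrivial linear character of A3 = ⟨(123)⟩ up to S3 via (32.16) should give the character (2, 0, −1) of the two-dimensional irreducible representation, with norm 1.

```python
from itertools import permutations
import cmath

G = list(permutations(range(3)))

def mul(a, b):
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    r = [0] * 3
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)

c = (1, 2, 0)                      # the 3-cycle (123)
H = [(0, 1, 2), c, mul(c, c)]      # H = A3, cyclic of order 3
w = cmath.exp(2j * cmath.pi / 3)
chi = {H[0]: 1, H[1]: w, H[2]: w * w}   # a nontrivial linear character of H

# chi^G via (32.16); chi_dot vanishes off H.
def chi_G(g):
    return sum(chi.get(mul(mul(x, g), inv(x)), 0) for x in G) / len(H)

values = {g: chi_G(g) for g in G}
# Expected: 2 at the identity, -1 at the two 3-cycles, 0 at transpositions,
# i.e. the character of the 2-dimensional irreducible of S3.
print({g: round(v.real) for g, v in values.items()})

norm = sum(abs(v) ** 2 for v in values.values()) / len(G)
print(round(norm))   # 1, so chi^G is irreducible
```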


Exercises
Exercise 32.1. Some points in the proof of Proposition 32.2 were left to the reader.
Write out a complete proof.
Exercise 32.2. Let H1 , H2 , and H3 be subgroups of G, with (πi , Vi ) a repre-
sentation of Hi . Let there be given intertwining operators Λ1 : V1G → V2G and
Λ2 : V2G → V3G . Let Δ1 : G → Hom(V1 , V2 ) and Δ2 : G → Hom(V2 , V3 ) be
the corresponding functions as in Theorem 32.1. Express the Δ : G → Hom(V1 , V3 )
corresponding to the composition Λ2 ◦ Λ1 in terms of Δ1 and Δ2 .
Exercise 32.3. Let H be a subgroup of G, and ψ : H → C× a linear character.
Prove that the ring of G-module endomorphisms of the induced representation ψ G
is isomorphic to the convolution ring of functions Δ : G → C such that
Δ(h2 gh1 ) = ψ(h2 ) Δ(g) ψ(h1 ), h1 , h2 ∈ H.
What can you say about ψG if this ring is commutative?
Exercise 32.4. Let G = GL(2, F ), where F is a finite field. Let B be the Borel
subgroup of upper triangular matrices, and let N be its subgroup of unipotent
matrices. Let ψF : F → C be any nontrivial character. Define a character of N as
follows:

ψ( [ 1 x ; 0 1 ] ) = ψF (x).
Let χ be a linear character of B as in (32.1). Show that up to scalar multiple there
is a unique intertwining operator Ind_B^G(χ) → Ind_N^G(ψ).

Exercise 32.5. Let H be the non-Abelian group of order q³ consisting of all matrices

⎛ 1 x z ⎞
⎜   1 y ⎟
⎝     1 ⎠

with x, y, z in a field F of q elements. The center Z consists of the matrices with x = y = 0. The subgroup A of matrices with x = 0
is Abelian but not central. Let χ and ψ be two linear characters of A.
(i) Assume that χ and ψ have nontrivial restrictions to Z. Let χH and ψ H be the
induced representations. Use Mackey theory to prove that

dim HomH(χH, ψH) = { 1   if χ, ψ have the same restriction to Z;
                     0   otherwise.
(ii) Prove that χH is irreducible, and that χH , ψ H are isomorphic if and only if χ
and ψ have the same restriction to Z.
(iii) Prove that given a nontrivial central character θ of Z, H has a unique irreducible
representation with central character θ. This is the Stone–von Neumann theorem
for finite fields.
33
Characters of GL(n, C)

In the next few chapters, we will construct the irreducible representations of


the symmetric group in parallel with the irreducible algebraic representations
of GL(n, C). In this chapter, we will construct some generalized characters of
GL(n, C). The connection with the representation theory of Sk will become
clear later.
A complex representation (π, V) of GL(n, C) is algebraic if the matrix coefficients of π(g) are polynomial functions in the matrix coefficients gij of g = (gij) ∈ GL(n, C) and of det(g)−1. Thus, if we choose a basis of V, then π(g) becomes a matrix (π(g)kl) with 1 ≤ k, l ≤ dim(V), and for each k, l we require that there be a polynomial Pkl in n² + 1 variables such that

π(g)kl = Pkl(g11, . . . , gnn, det(g)−1).

The assumption that a representation is algebraic is similar to the assumption


that it is analytic—it rules out representations such as complex conjugation
GL(n, C) → GL(n, C). It is not hard to show (using the Weyl character for-
mula) that every analytic representation of GL(n, C) is algebraic, and of course
the converse is also true.
A character χ is algebraic if it is the character of an algebraic representa-
tion. A generalized character , also called a virtual character , is the difference
between two characters. If G = GL(n, C), or more generally any algebraic
group, we will say a generalized character is algebraic if it is χ1 − χ2 , where
χ1 and χ2 are algebraic.
If R is a commutative ring, we will denote by Rsym [x1 , . . . , xn ] the ring of
symmetric polynomials in x1 , . . . , xn having coefficients in R. Let ek and hk ∈
Zsym [x1 , . . . , xn ] be the kth elementary and complete symmetric polynomials
in n variables. Specifically,

ek(x1, . . . , xn) = ∑_{1≤i1<i2<···<ik≤n} xi1 xi2 · · · xik,

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 349


DOI 10.1007/978-1-4614-8024-2 33, © Springer Science+Business Media New York 2013

hk(x1, . . . , xn) = ∑_{1≤i1≤i2≤···≤ik≤n} xi1 xi2 · · · xik.

If k > n, then ek = 0, although this is not true for hk . Our convention is that
e0 = h0 = 1.
Let E(t) be the generating function for the elementary symmetric
polynomials:
E(t) = ∑_{k=0}^{n} ek t^k.
Then
E(t) = (1 + x1 t)(1 + x2 t) · · · (1 + xn t) (33.1)
since expanding the right-hand side and collecting the coefficients of tk will
give each monomial in the definition of ek exactly once. Similarly, if


H(t) = ∑_{k=0}^{∞} hk t^k,

then
H(t) = ∏_{i=1}^{n} (1 + xi t + xi²t² + · · · ) = (1 − x1 t)−1 · · · (1 − xn t)−1.   (33.2)

We see that
H(t)E(−t) = 1.
Equating the coefficients in this identity gives us recursive relations
hk − e1 hk−1 + e2 hk−2 − · · · + (−1)k ek = 0, k > 0. (33.3)
These can be used to express the h’s in terms of the e’s or vice versa.
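The recursion (33.3) is easy to check numerically. The sketch below (an illustration, not part of the text) evaluates ek and hk directly from their definitions at an arbitrary rational sample point and confirms that the alternating sum vanishes.

```python
from itertools import combinations, combinations_with_replacement
from fractions import Fraction
from math import prod

# An arbitrary rational sample point (an assumption of this sketch).
xs = [Fraction(1), Fraction(2), Fraction(3), Fraction(5)]

def e(k):
    # k-th elementary symmetric polynomial; the empty sum gives e_k = 0 for k > n
    return sum((prod(c) for c in combinations(xs, k)), Fraction(0))

def h(k):
    # k-th complete symmetric polynomial
    return sum((prod(c) for c in combinations_with_replacement(xs, k)), Fraction(0))

# (33.3): h_k - e_1 h_{k-1} + e_2 h_{k-2} - ... + (-1)^k e_k = 0 for k > 0
for k in range(1, 7):
    assert sum((-1)**j * e(j) * h(k - j) for j in range(k + 1)) == 0
print("recursion (33.3) holds for k = 1..6")
```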
Proposition 33.1. The ring Zsym [x1 , . . . , xn ] is generated as a Z-algebra by
e1 , . . . , en , and they are algebraically independent. Thus, Zsym [x1 , . . . , xn ] =
Z[e1 , . . . , en ] is a polynomial ring. It is also generated by h1 , . . . , hn , which are
algebraically independent, and Zsym [x1 , . . . , xn ] = Z[h1 , . . . , hn ].
Proof. The fact that the ei generate Zsym [x1 , . . . , xn ] is Theorem 6.1 on p. 191
of Lang [116], and their algebraic independence is proved on p. 192 of that
reference. The fact that h1 , . . . , hn also generate follows since (33.3) can be
solved recursively to express the ei in terms of the hi . The hi must be alge-
braically independent since if they were dependent the transcendence degree
of the field of fractions of Zsym [x1 , . . . , xn ] would be less than n, so the ei
would also be algebraically dependent, which is a contradiction.

If V is a vector space, let ∧k V and ∨k V denote the kth exterior and
symmetric powers. If T : V −→ W is a linear transformation, then there are
induced linear transformations ∧k T : ∧k V −→ ∧k W and ∨k T : ∨k V −→
∨k W .

Proposition 33.2. If V is an n-dimensional vector space and T : V −→ V


an endomorphism, and if t1 , . . . , tn are its eigenvalues with multiplicities (that
is, each eigenvalue is listed with its multiplicity as a root of the characteristic
polynomial), then
tr ∧k T = ek (t1 , . . . , tn ) (33.4)
and
tr ∨k T = hk (t1 , . . . , tn ) . (33.5)

Proof. First, assume that T is diagonalizable and that v1 , . . . , vn are its eigen-
vectors, so T vi = ti vi . Then a basis of ∧k V consists of the vectors

vi1 ∧ · · · ∧ vik ,   1 ≤ i1 < i2 < · · · < ik ≤ n,

and this is an eigenvector of ∧k T with eigenvalue ti1 · · · tik . Summing these


eigenvalues gives ek (t1 , . . . , tn ). Thus, (33.4) is true if T is diagonalizable.
Similarly, a basis of ∨k V consists of the vectors
vi1 ∨ · · · ∨ vik ,   1 ≤ i1 ≤ i2 ≤ · · · ≤ ik ≤ n,
so (33.5) is also true if T is diagonalizable.
In the general case, both sides of (33.4) or (33.5) are continuous functions
of the matrix entries of T. The left-hand side of (33.4) is continuous because if we refer T to a fixed basis, then tr ∧k T is the sum of the (n choose k) principal k × k minors of its matrix with respect to this basis, and the right-hand side is
continuous because it is a coefficient in the characteristic polynomial of T .
Since the diagonalizable matrices are dense in GL(n, C), it follows that (33.4)
is true for all T . As for (33.5), the h’s are polynomial functions in the e’s, as
we see by solving (33.3) recursively, so the right-hand side of (33.5) is also
continuous, and (33.5) is also proved.
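The identity (33.4) can be tested numerically. The following sketch (an illustration, not part of the text; it uses NumPy) compares the sum of the principal k × k minors of a random matrix with ek evaluated at its eigenvalues.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, k = 4, 2
T = rng.standard_normal((n, n))

# tr(wedge^k T): the sum of the principal k x k minors of T
minor_sum = sum(
    np.linalg.det(T[np.ix_(idx, idx)]) for idx in combinations(range(n), k)
)

# e_k evaluated at the eigenvalues of T, taken with multiplicity
evals = np.linalg.eigvals(T)
e_k = sum(np.prod(evals[list(idx)]) for idx in combinations(range(n), k))

# T is real, so e_k is (up to sign) a coefficient of its characteristic
# polynomial and hence real; the two computations agree to rounding error.
print(np.isclose(minor_sum, e_k.real))
```

The analogous check of (33.5) can be done by comparing hk at the eigenvalues with the trace computed on a basis of the symmetric power.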


Theorem 33.1. Let f (x1 , . . . , xn ) be a symmetric polynomial with integer


coefficients. Define a function ψf on GL(n, C) as follows. If t1 , . . . , tn are
the eigenvalues of g, let
ψf (g) = f (t1 , . . . , tn ). (33.6)
Then ψf is an algebraic generalized character of GL(n, C).

As in Proposition 33.2, there may be repeated eigenvalues. If this is the


case, we count each eigenvalue with the multiplicity with which it occurs as
a root of the characteristic polynomial.

Proof. Let us call a symmetric polynomial f constructible if ψf is a gener-


alized character of GL(n, C). The generalized characters of GL(n, C) form a
ring since the direct sum and tensor product operations on GL(n, C)-modules
correspond to addition and multiplication of characters. Since
ψf1 ±f2 = ψf1 ± ψf2 , ψf1 f2 = ψf1 ψf2 ,

it follows that the constructible polynomials also form a ring. The ek are
constructible by Proposition 33.2 and generate Zsym [x1 , . . . , xn ] by Proposi-
tion 33.1. Thus, the ring of constructible polynomials is all of Zsym [x1 , . . . , xn ].



In addition to the elementary and complete symmetric polynomials, we


have the power sum symmetric polynomials

pk(x1, . . . , xn) = x1^k + · · · + xn^k.   (33.7)

Theorem 33.2. Let G be a group, let χ be a character of G, and let k be a


nonnegative integer. Then g −→ χ(g^k) is a virtual character of G.

Proof. Let χ be the character corresponding to the representation π : G →


GL(n, C). If ψ is any generalized character of GL(n, C), then ψ ◦ π is a gen-
eralized character of G. We take ψ = ψpk , which is a generalized character by
Theorem 33.1. If t1, . . . , tn are the eigenvalues of π(g), then t1^k, . . . , tn^k are the eigenvalues of π(g^k). Hence

(ψpk ◦ π)(g) = χ(g k ), (33.8)

proving that χ(g k ) is a generalized character.




Proposition 33.3. (Newton) The polynomials pk generate Qsym [x1 , . . . , xn ]


as a Q-algebra.

Proof. We will make use of the identity

log(1 + t) = ∑_{k=1}^{∞} ((−1)^{k−1}/k) t^k.

Replacing t by txi in this identity, summing over the xi, and using (33.1), we see that

log E(t) = ∑_{k=1}^{∞} ((−1)^{k−1}/k) pk t^k.

Exponentiating this identity,

∑_{k=0}^{∞} ek t^k = exp( ∑_{k=1}^{∞} ((−1)^{k−1}/k) pk t^k ).

Expanding and collecting the coefficients of t^k expresses ek as a polynomial in the p’s, with rational coefficients.
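Carrying out this expansion for small k gives, for example, e1 = p1, e2 = (p1² − p2)/2, and e3 = (p1³ − 3p1p2 + 2p3)/6. The sketch below (an illustration, not part of the text) confirms these at a rational sample point.

```python
from itertools import combinations
from fractions import Fraction
from math import prod

# An arbitrary rational sample point (an assumption of this sketch).
xs = [Fraction(1), Fraction(2), Fraction(4), Fraction(7)]

def e(k):
    # k-th elementary symmetric polynomial
    return sum((prod(c) for c in combinations(xs, k)), Fraction(0))

def p(k):
    # k-th power sum
    return sum(x**k for x in xs)

assert e(1) == p(1)
assert e(2) == (p(1)**2 - p(2)) / 2
assert e(3) == (p(1)**3 - 3*p(1)*p(2) + 2*p(3)) / 6
print("e_1, e_2, e_3 expressed in power sums: verified")
```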


Let us return to the context of Theorem 33.2. Let G be a group and χ the
character of a representation π : G −→ GL(n, C). As we saw in that theorem,
the functions g −→ χk (g) = χ(g k ) are generalized characters; indeed they are
the functions ψpk ◦ π. They are conveniently computable and therefore useful.
The operations χ −→ χk on the ring of generalized characters of G are called
the Adams operations. See also the exercises in Chap. 22 for more about the
Adams operations.
Let us consider an example. Consider the polynomial

s(x1, . . . , xn) = ∑_{i≠j} xi²xj + 2 ∑_{i<j<k} xi xj xk.   (33.9)

We find that

p1³ = ∑_i xi³ + 3 ∑_{i≠j} xi²xj + 6 ∑_{i<j<k} xi xj xk,

so

s = (1/3)(p1³ − p3).   (33.10)
Hence, if π : G −→ GL(n, C) is a representation affording the character χ,
then we have

(ψs ◦ π)(g) = (1/3)(χ(g)³ − χ(g³)).   (33.11)
Such a composition of a representation with a ψf is called a plethysm. The
expression on the right-hand side is useful for calculating the values of this
function, which we have proved is a virtual character of GL(n, C), provided we
know the values of the character χ. We will show in the next chapter that (for
this particular s) this plethysm is actually a proper character. Indeed, we will
actually prove that ψs is a character of GL(n, C), not just a virtual character.
This will require ideas different from those used in this chapter.
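The identity (33.10) can be confirmed numerically by evaluating both sides at a sample point (an illustration, not part of the text):

```python
from itertools import combinations
from fractions import Fraction

# An arbitrary rational sample point (an assumption of this sketch).
xs = [Fraction(2), Fraction(3), Fraction(5), Fraction(7)]
n = len(xs)

# s as defined in (33.9): the sum over ordered pairs i != j of x_i^2 x_j,
# plus twice the sum over i < j < k of x_i x_j x_k.
s_direct = (
    sum(xs[i]**2 * xs[j] for i in range(n) for j in range(n) if i != j)
    + 2 * sum(xs[i] * xs[j] * xs[k] for i, j, k in combinations(range(n), 3))
)

p1 = sum(xs)
p3 = sum(x**3 for x in xs)
assert s_direct == (p1**3 - p3) / 3      # the identity (33.10)
print("identity (33.10) verified; s =", s_direct)
```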

Exercises
Exercise 33.1. Express each of the sets of polynomials {ek | k  5} and {pk | k  5}
in terms of the other.
Exercise 33.2. Here is the character table of S4 .
     1   (123)  (12)(34)  (12)  (1234)
χ1   1     1       1        1      1
χ2   1     1       1       −1     −1
χ3   3     0      −1        1     −1
χ4   3     0      −1       −1      1
χ5   2    −1       2        0      0
Let s be as in (33.9). Using (33.11), compute ψs ◦π when (π, V ) is an irreducible rep-
resentation with character χi for each i, and decompose the resulting class function
into irreducible characters, confirming that it is a generalized character.
34
Duality Between Sk and GL(n, C)

Let V be a complex vector space, and let ⊗k V = V ⊗ · · · ⊗ V be the k-fold tensor power of V. (Unadorned ⊗ means ⊗C.) We consider this to be a right module
over the group ring C[Sk ], where σ ∈ Sk acts by permuting the factors:
(v1 ⊗ · · · ⊗ vk )σ = vσ(1) ⊗ · · · ⊗ vσ(k) . (34.1)
It may be checked that with this definition
((v1 ⊗ · · · ⊗ vk )σ) τ = (v1 ⊗ · · · ⊗ vk )(στ ).
If A is a C-algebra and V is an A-module, then ⊗k V has an A-module structure; namely, a ∈ A acts diagonally:
a(v1 ⊗ · · · ⊗ vk ) = av1 ⊗ · · · ⊗ avk .
This action commutes with the action (34.1) of the symmetric group, so it makes ⊗k V an (A, C[Sk])-bimodule. Suppose that ρ : Sk −→ GL(Nρ) is a representation. Then Nρ is an Sk-module, so by Remark 32.3

Vρ = (⊗k V) ⊗C[Sk] Nρ   (34.2)

is a left A-module.
We can take A = End(V ). Embedding GL(V ) −→ A, we obtain a rep-
resentation of GL(V ) parametrized by a module Nρ of Sk . Thus, Vρ is a
GL(V )-module. This is the basic construction of Frobenius–Schur duality.
We now give a reinterpretation of the symmetric and exterior powers,
which were used in the proof of Theorem 33.1. Let Csym be a left C[Sk ]-
module for the trivial representation, and let Calt be a C[Sk ]-module for the
alternating character. Thus, Calt is C with the Sk -module structure
σ x = ε(σ) x,
for σ ∈ Sk , x ∈ Calt , where ε : Sk → {±1} is the alternating character.


Proposition 34.1. Let V be a vector space over C. We have functorial isomorphisms

∧k V ≅ (⊗k V) ⊗C[Sk] Calt,   ∨k V ≅ (⊗k V) ⊗C[Sk] Csym.

Here “functorial” means that if T : V −→ W is a linear transformation, then we have a commutative diagram

∧k V  ≅  (⊗k V) ⊗C[Sk] Calt
  ↓              ↓
∧k W  ≅  (⊗k W) ⊗C[Sk] Calt

and in particular if V = W , this implies that ∧k V ≅ (⊗k V) ⊗C[Sk] Calt as GL(V )-modules.
Proof. The proofs of these isomorphisms are similar. We will prove the first.
It is sufficient to show that the right-hand side satisfies the universal property
of the exterior kth power. We recall that this is the following property of ∧k V .
Given a vector space W , a k-linear map f : V × · · ·× V −→ W is alternating if

f vσ(1) , . . . , vσ(k) = ε(σ) f (v1 , . . . , vk ).

The universal property is that any such alternating map factors uniquely
through ∧k V . That is, the map (v1 , . . . , vk ) → v1 ∧· · ·∧vk is itself alternating,
and given any alternating map f : V × · · · × V −→ W there exists a unique
linear map F : ∧k V −→ W such that f(v1, . . . , vk) = F(v1 ∧ · · · ∧ vk). We will show that (⊗k V) ⊗C[Sk] Calt has the same universal property.
We are identifying the underlying space of Calt with C, so 1 ∈ Calt. There exists a map i : V × · · · × V → (⊗k V) ⊗C[Sk] Calt given by

i(v1 , . . . , vk ) = (v1 ⊗ · · · ⊗ vk ) ⊗C[Sk ] 1.

Let f : V × · · · × V → W be an alternating k-linear map into a vector space


W. We must show that there exists a unique linear map

F : (⊗k V) ⊗C[Sk] Calt → W

such that f = F ◦ i. Uniqueness is clear since the image of i spans the space (⊗k V) ⊗C[Sk] Calt. To prove existence, we observe first that by the universal property of the tensor product there exists a linear map f′ : ⊗k V → W such that f(v1, . . . , vk) = f′(v1 ⊗ · · · ⊗ vk). Now consider the map

(⊗k V) × Calt → W

defined by (ξ, t) −→ t f′(ξ). It follows from the fact that f is alternating that this map is C[Sk]-balanced and consequently induces a map

F : (⊗k V) ⊗C[Sk] Calt → W.

This is the map we are seeking. We see that (⊗k V) ⊗C[Sk] Calt satisfies the same universal property as the exterior power, so it is naturally isomorphic to ∧k V.


For the rest of this chapter, fix n and let V = Cn . If ρ : Sk −→ GL(Nρ ) is


any representation, then (34.2) defines a module Vρ for GL(n, C).

Theorem 34.1. Let ρ : Sk −→ GL(Nρ ) be a representation. Let Vρ be as in


(34.2). There exists a homogeneous symmetric polynomial sρ of degree k in n
variables such that if ψρ (g) is the trace of g ∈ GL(n, C) on Vρ , and if t1 , . . . , tn
are the eigenvalues of g, then

ψρ (g) = sρ (t1 , . . . , tn ). (34.3)

Proof. First let us prove this for g restricted to the subgroup of diagonal
matrices. Let ξ1 , . . . , ξn be the standard basis of V . In other words, identifying
V with Cn , let ξi = (0, . . . , 1, . . . , 0), where the 1 is in the ith position. The
vectors (ξi1 ⊗ · · · ⊗ ξik ) ⊗ ν, where ν runs through a basis of Nρ and 1 ≤ i1 ≤ · · · ≤ ik ≤ n, span Vρ. They will generally not be linearly independent,
but there will be a linearly independent subset that forms a basis of Vρ . For g
diagonal, if g(ξi ) = ti ξi , then (ξi1 ⊗ · · · ⊗ ξik ) ⊗ ν will be an eigenvector for g
in Vρ with eigenvalue ti1 · · · tik . Thus, we see that there exists a homogeneous
polynomial sρ of degree k such that (34.3) is true for diagonal matrices g.
To see that sρ is symmetric, we have pointed out that the action of Sk on
⊗k V commutes with the action of GL(n, C). In particular, it commutes with
the action of the permutation matrices in GL(n, C), which form a subgroup
isomorphic to Sn . These permute the eigenvectors (ξi1 ⊗ · · · ⊗ ξik ) ⊗ ν of g
and hence their eigenvalues. Thus, the polynomial sρ must be symmetric.
Since the eigenvalues of a matrix are equal to the eigenvalues of any con-
jugate, we see that (34.3) must be true for any matrix that is conjugate to a
diagonal matrix. Since these are dense in GL(n, C), (34.3) follows for all g by
continuity.


Proposition 34.2. Let ρi : Sk −→ GL(Nρi ) (i = 1, . . . , h) be the irreducible


representations of Sk and let d1 , . . . , dh be their respective degrees. Then

p1^k = ∑_i di sρi.   (34.4)

Proof. If R is a ring and M a right R-module, then


M ⊗R R ≅ M.   (34.5)
(To prove this standard isomorphism, observe that m⊗r → mr and m → m⊗1
are inverse maps between the two Abelian groups.) If M is an (S, R)-bimodule,
then this is an isomorphism of S-modules. Consequently,
⊗k V = (⊗k V) ⊗C[Sk] C[Sk].

The multiplicity of ρi in the regular representation is di, that is, C[Sk] = ⊕i di Nρi, and hence

⊗k V ≅ ⊕i di ((⊗k V) ⊗C[Sk] Nρi) = ⊕i di Vρi.   (34.6)

Taking characters, we obtain (34.4).



Recall that we ended the last chapter by asserting that ψs is a proper
character of GL(n, C), where s is the polynomial in (33.10). We now have the
tools to prove this.
Let k = 3, and let ρ1, ρ2, ρ3 be the irreducible representations of S3.
We will take ρ1 to be the trivial representation, ρ2 = ε to be the alternating
representation, and ρ3 to be the irreducible two-dimensional representation.
If g ∈ GL(n, C) has eigenvalues t1, . . . , tn, then the value at g of the character of the representation of GL(n, C) on the module ⊗3 V is

p1(t1, . . . , tn)³ = (∑_i ti)³ = ∑_i ti³ + 3 ∑_{i≠j} ti²tj + 6 ∑_{i<j<k} ti tj tk.

The right-hand side of (34.4) consists of three terms. First, corresponding to


ρ1 and the symmetric cube ∨3 V ≅ Vρ1 representation of GL(n, C) is

h3 = ∑_i ti³ + ∑_{i≠j} ti²tj + ∑_{i<j<k} ti tj tk.

Second, corresponding to ρ2 and the exterior cube ∧3 V ≅ Vρ2 representation of GL(n, C) is

e3 = ∑_{i<j<k} ti tj tk.

Finally, corresponding to ρ3 , the associated module Vρ3 of GL(n, C) affords


the character ψρ3 , and the associated symmetric polynomial sρ3 occurs with
coefficient d3 = 2. This satisfies the equation

p1³ = h3 + e3 + 2sρ3 ,

from which we easily calculate that sρ3 is the polynomial in (33.10).
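This decomposition can be checked numerically (an illustration, not part of the text): evaluating at the sample point t = (1, 2, 3, 4), with s computed directly from (33.9),

```python
from itertools import combinations, combinations_with_replacement
from fractions import Fraction
from math import prod

ts = [Fraction(1), Fraction(2), Fraction(3), Fraction(4)]
n = len(ts)

e3 = sum((prod(c) for c in combinations(ts, 3)), Fraction(0))
h3 = sum((prod(c) for c in combinations_with_replacement(ts, 3)), Fraction(0))
p1 = sum(ts)

# s_{rho_3} computed directly from the definition (33.9)
s = (
    sum(ts[i]**2 * ts[j] for i in range(n) for j in range(n) if i != j)
    + 2 * sum(ts[i] * ts[j] * ts[k] for i, j, k in combinations(range(n), 3))
)

assert p1**3 == h3 + e3 + 2 * s     # p_1^3 = h_3 + e_3 + 2 s_{rho_3}
print("p1^3 =", p1**3, " h3 =", h3, " e3 =", e3, " s =", s)
```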



The conjugacy classes of Sk are parametrized by the partitions of k. A partition of k is a decomposition of k into a sum of positive integers. Thus,
the partitions of 5 are

5, 4 + 1, 3 + 2, 3 + 1 + 1, 2 + 2 + 1, 2 + 1 + 1 + 1, 1 + 1 + 1 + 1 + 1.

Note that the partitions 3 + 2 and 2 + 3 are considered equal. We may arrange
the terms in a partition into descending order. Hence, a partition λ of k may
be more formally defined to be a sequence of nonnegative integers (λ1, . . . , λl) such that λ1 ≥ λ2 ≥ · · · ≥ λl ≥ 0 and ∑_i λi = k. It is sometimes convenient to
allow some of the parts λi to be zero, in which case we identify two sequences
if they differ only by trailing zeros. Thus, (3, 2, 0, 0) is considered to be the
same partition as (3, 2). The length or number of parts l(λ) of the partition λ is the largest i such that λi ≠ 0, so the length of the partition (3, 2) is two.
We will denote by p(k) the number of partitions of k, so that p(5) = 7.
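The numbers p(k) are easy to compute by recursion on the largest part (an illustrative sketch, not part of the text):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def npartitions(k, max_part=None):
    """Number of partitions of k whose parts are all <= max_part."""
    if max_part is None:
        max_part = k
    if k == 0:
        return 1
    # classify a partition by its largest part m
    return sum(npartitions(k - m, m) for m in range(1, min(k, max_part) + 1))

assert npartitions(5) == 7
print([npartitions(k) for k in range(1, 8)])   # [1, 2, 3, 5, 7, 11, 15]
```

By transposing Young diagrams, partitions of k with parts of size at most n are in bijection with partitions of k into at most n parts, the count that appears in Proposition 34.3 later in this chapter.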
If λ is a partition of k, there is another partition, called the conjugate
partition and denoted λt , which may be constructed as follows. We construct
from λ a diagram in which the ith row is a series of λi boxes. Thus, the diagram corresponding to the partition λ = (3, 2) is

□ □ □
□ □

Having constructed the diagram, we transpose it, and the corresponding


partition is the conjugate partition, denoted λt. Hence, the transpose of the preceding diagram is

□ □
□ □

and so the partition of 5 conjugate to λ = (3, 2) is λt = (2, 2, 1). These types


of diagrams are called Young diagrams or Ferrers’ diagrams.
More formally, the diagram D(λ) of a partition λ is the set of (i, j) ∈ Z² such that 0 < i and 0 < j ≤ λi. We associate with each pair (i, j) the box in
the ith row and the jth column, where the convention is that the row index
i increases as one moves downward and the column index j increases as one
moves to the right, so that the boxes lie in the fourth quadrant.

Suppose that μ = λt . Then (i, j) ∈ D(λ) if and only if (j, i) ∈ D(μ).


Therefore,
j ≤ λi ⇐⇒ i ≤ μj.   (34.7)
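The equivalence (34.7) gives a direct formula for the conjugate partition: μj = #{i : λi ≥ j}. A short sketch (an illustration, not part of the text):

```python
def conjugate(la):
    """Conjugate of a partition, given as a weakly decreasing list of parts."""
    if not la:
        return []
    # By (34.7), the j-th part of the conjugate is mu_j = #{ i : la_i >= j }.
    return [sum(1 for part in la if part >= j) for j in range(1, la[0] + 1)]

lam = [3, 2]
mu = conjugate(lam)
print(mu)                      # [2, 2, 1]
assert conjugate(mu) == lam    # conjugation is an involution

# Spot-check the equivalence (34.7): j <= lam_i  <=>  i <= mu_j  (1-based)
for i in range(1, len(lam) + 1):
    for j in range(1, len(mu) + 1):
        assert (j <= lam[i - 1]) == (i <= mu[j - 1])
```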
If G is a finite group, let X(G) be the additive group of generalized charac-
ters of G. It is isomorphic to the free Abelian group generated by the isomor-
phism classes of irreducible representations. Because X(G) has a well-known
ring structure, it is usually called the character ring of G, but we will not
use the multiplication in X(G) at all. To us it is simply an additive Abelian
group, the group of generalized characters.
Let Rk = X(Sk ). Its rank, as a free Z-module is equal to the number p(k)
of partitions of k. Our convention is R0 = Z.
Although we do not need the ring structure on Rk itself, we will introduce a multiplication Rk × Rl → Rk+l, which makes R = ⊕k Rk into a graded
a multiplication Rk × Rl → Rk+l , which makes R = k Rk into a graded
ring. The multiplication in R is as follows. If θ, ρ are representations of Sk and
Sl , respectively, then θ ⊗ ρ is a representation of Sk × Sl , which is a subgroup
of Sk+l . We will always use the unadorned symbol ⊗ to denote ⊗C .
We let θ ◦ ρ be the representation obtained by inducing θ ⊗ ρ from Sk × Sl
to Sk+l . This multiplication, at first defined only for genuine representations,
extends to virtual representations by additivity, and so we get a multiplication
Rk × Rl → Rk+l . It follows from the principle of transitivity of induction that
this multiplication is associative, and since the subgroups Sk × Sl and Sl × Sk
are conjugate in Sk+l , it is also commutative.
Now let us introduce another graded ring. Let n be a fixed integer, and let
x1 , . . . , xn be indeterminates. We consider the ring

Λ(n) = Zsym [x1 , . . . , xn ]

of symmetric polynomials with integer coefficients in x1 , . . . , xn , graded by


degree. By Proposition 33.1, Λ(n) is a polynomial ring in the symmetric poly-
nomials e1 , . . . , en ,
Λ(n) ∼
= Z[e1 , . . . , en ] (34.8)
or equally, in terms of the symmetric polynomials hi ,

Λ(n) ∼
= Z[h1 , . . . , hn ].
Λ(n) is a graded ring. We have Λ(n) = ⊕k Λk(n), where Λk(n) consists of all homogeneous polynomials of degree k in Λ(n).
Proposition 34.3. The homogeneous part Λk(n) is a free Abelian group of rank equal to the number of partitions of k into no more than n parts.

Proof. Let λ be such a partition. Thus, λ = (λ1, . . . , λn), where λ1 ≥ λ2 ≥ · · · ≥ λn ≥ 0 and ∑_i λi = k. Let

mλ(x1, . . . , xn) = ∑ x1^{α1} · · · xn^{αn},

where (α1 , . . . , αn ) runs over all distinct permutations of (λ1 , . . . , λn ). Clearly,


the mλ form a Z-basis of Λk(n), and therefore Λk(n) is a free Abelian group of rank equal to the number of partitions of k into no more than n parts.

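The rank in Proposition 34.3 is easy to check computationally. The following short script (an illustration added here, not from the text) lists the partitions of k = 6 into at most n = 3 parts, which index the monomial symmetric functions mλ , and confirms that the count agrees with the number of Sn-orbits of degree-k monomials in n variables:

```python
import itertools

def partitions(k, max_part=None):
    """Yield the partitions of k as non-increasing tuples of positive parts."""
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

k, n = 6, 3
basis = [lam for lam in partitions(k) if len(lam) <= n]

# each m_lambda corresponds to an orbit of exponent vectors (a1, ..., an) with sum k
orbits = {tuple(p for p in sorted(v, reverse=True) if p)
          for v in itertools.product(range(k + 1), repeat=n) if sum(v) == k}

assert orbits == set(basis)
print(len(basis), sorted(basis))  # 7 partitions of 6 into at most 3 parts
```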
In Theorem 34.1, we associated with each irreducible representation ρ of
Sk an element sρ of Λk(n) . Thus, there exists a homomorphism of Abelian
groups chk(n) : Rk → Λk(n) such that chk(n) (ρ) = sρ . Let ch(n) : R −→ Λ(n) be
the homomorphism of graded rings that is chk(n) on the homogeneous part Rk
of degree k.
Proposition 34.4. The map ch(n) is a surjective homomorphism of graded
rings. The map chk(n) in degree k is an isomorphism if n ≥ k.
Proof. The main thing to check is that the group law ◦ that was introduced
in the ring R corresponds to multiplication of polynomials. Indeed, let θ and
ρ be representations of Sk and Sl , respectively. Then θ ⊗ ρ is an Sk × Sl -
module, and by Proposition 32.7, θ ◦ ρ is the representation of Sk+l attached
to C[Sk+l ] ⊗C[Sk ×Sl ] (Nθ ⊗ Nρ ). Therefore,

Vθ◦ρ = (⊗k+l V ) ⊗C[Sk+l ] ( C[Sk+l ] ⊗C[Sk ×Sl ] (Nθ ⊗ Nρ ) ),

which by (34.5) is isomorphic to

(⊗k+l V ) ⊗C[Sk ×Sl ] (Nθ ⊗ Nρ ) = ( (⊗k V ) ⊗ (⊗l V ) ) ⊗C[Sk ]⊗C[Sl ] (Nθ ⊗ Nρ )
    = (⊗k V ⊗C[Sk ] Nθ ) ⊗ (⊗l V ⊗C[Sl ] Nρ ) = Vθ ⊗ Vρ .

Consequently, the trace of g ∈ GL(n, C) on Vθ◦ρ is the product of the traces
on Vθ and Vρ . It follows that for representations θ and ρ of Sk and Sl , we have
sθ◦ρ = sθ sρ . Hence, ch(n) is multiplicative and therefore is a homomorphism
of graded rings. It is surjective because a set of generators—the elementary
symmetric polynomials ei —are in the image. If n ≥ k, then the ranks of Rk
and Λk(n) both equal p(k), so surjectivity implies that it is an isomorphism. □
We will denote by ek , hk ∈ Rk the classes of the alternating representation
and the trivial representation of Sk , respectively. It follows from Proposition 34.1
that ch(n) (ek ) = ek and ch(n) (hk ) = hk , where on the right-hand side ek and
hk denote the elementary and complete symmetric polynomials.
Proposition 34.5. R is a polynomial ring in an infinite number of genera-
tors, R = Z[e1 , e2 , e3 , . . .] = Z[h1 , h2 , h3 , . . .].
Proof. To show that the ei generate R, it is sufficient to show that the ring
they generate contains an arbitrary element u of Rk for any fixed k. Take
n ≥ k. Since the polynomials e1 , . . . , en generate the ring Λ(n) , there exists a
polynomial f with integer coefficients such that f (e1 , . . . , en ) = ch(n) (u) in
Λ(n) . Applying ch(n) to the element f (e1 , . . . , en ) of R then gives ch(n) (u),
and it follows from the injectivity assertion in Proposition 34.4 that
f (e1 , . . . , en ) = u.
To see that the ei are algebraically independent, suppose f is a polynomial
with integer coefficients such that f (e1 , . . . , en ) = 0. Then, applying ch(n) ,
we have f (e1 , . . . , en ) = 0 in Λ(n) , and by Proposition 33.1 it follows that
f = 0. Identical arguments work for the h's, using Proposition 33.1. □
The rings Λ(n) may be combined as follows. We have a homomorphism

rn : Λ(n+1) −→ Λ(n) ,        xn+1 −→ 0 .        (34.9)

It is easy to see that in this homomorphism ei −→ ei if i ≤ n while en+1 −→ 0,
and so in the inverse limit

Λ = lim←− Λ(n)        (34.10)

there exists a unique element whose image under the projection Λ → Λ(n) is
ek for all n ≥ k; we naturally denote this element ek , and (34.8) implies that

Λ ≅ Z[e1 , e2 , e3 , . . .]

is a polynomial ring in an infinite number of variables, and similarly

Λ ≅ Z[h1 , h2 , h3 , . . .].

In the natural grading on Λ, ei and hi are homogeneous of degree i. Since the
rank of Λk(n) equals the number of partitions of k into no more than n parts,
the rank of Λk equals the number of partitions of k.

Proposition 34.6. We have rn ◦ ch(n+1) = ch(n) as maps R −→ Λ(n) .

Proof. It is enough to check this on e1 , e2 , . . . since these generate R by
Proposition 34.5. Both maps send ek −→ ek if k ≤ n, and ek −→ 0 if k > n. □
Now turning to the inverse limit (34.10), the homomorphisms ch(n) : R →
Λ(n) are compatible with the homomorphisms Λ(n+1) → Λ(n) , and so there is
induced a ring homomorphism ch : R → Λ.

Theorem 34.2. The map ch : R −→ Λ is a ring isomorphism.

Proof. This is clear from Proposition 34.4. □
Theorem 34.3. The rings R and Λ admit automorphisms of order 2 that
interchange ei ←→ hi .

Proof. Of course, it does not matter which ring we work in. Since Λ ≅
Z[e1 , e2 , e3 , . . .], and since the ei are algebraically independent, if u1 , u2 , . . . are
arbitrary elements of Λ, there exists a unique ring homomorphism Λ −→ Λ
such that ei −→ ui . What we must show is that if we take ui = hi , then
this same homomorphism maps hi −→ ui . This follows from the fact that the
recursive identity (33.3), from which we may solve for the e's in terms of the
h's or conversely, is unchanged if we interchange ei ←→ hi . □

We will usually denote the involution of Theorem 34.3 as ι.
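A concrete low-degree illustration of this symmetry (added here, not from the text): solving (33.3) successively for k = 1, 2, 3 gives e1 = h1 , e2 = h1² − h2 , and e3 = h1³ − 2h1 h2 + h3 , and since the recursion is symmetric in e and h, the same formulas hold with e's and h's interchanged. In sympy, with ek and hk built from their definitions in four variables:

```python
import itertools
import sympy

n = 4
x = sympy.symbols(f"x1:{n+1}")

def e(k):  # elementary symmetric polynomial
    return sum(sympy.prod(c) for c in itertools.combinations(x, k))

def h(k):  # complete homogeneous symmetric polynomial
    return sum(sympy.prod(c) for c in itertools.combinations_with_replacement(x, k))

# solving h_k - e1 h_{k-1} + ... + (-1)^k e_k = 0 for the e's in terms of the h's:
assert sympy.expand(e(1) - h(1)) == 0
assert sympy.expand(e(2) - (h(1)**2 - h(2))) == 0
assert sympy.expand(e(3) - (h(1)**3 - 2*h(1)*h(2) + h(3))) == 0
# the same formula with e's and h's interchanged, reflecting the involution:
assert sympy.expand(h(3) - (e(1)**3 - 2*e(1)*e(2) + e(3))) == 0
```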

EXERCISES
Exercise 34.1. Let s = h1 h2 − h3 . Show that ι s = s.
35
The Jacobi–Trudi Identity

For another account that derives the Jacobi–Trudi identity as a determinantal
identity for characters of Sn using Mackey theory, see Kerber [100]. The point
of view in Zelevinsky [178] is slightly different but also similar in spirit. We
take up his Hopf algebra approach in the exercises. For us, the details were
worked out some years ago in the Stanford senior thesis of Karl Rumelhart.
An important question is to characterize the symmetric polynomials that
correspond to irreducible characters of Sk . These are called Schur polynomials.
An important question is to characterize the symmetric polynomials that
correspond to irreducible characters of Sk . These are called Schur polynomials.

If A = (aij ) and B = (bij ) are square N × N matrices, and if I, J ⊂
{1, 2, 3, . . . , N } are two subsets of cardinality r, where 1 ≤ r ≤ N , the minors

det(aij | i ∈ I, j ∈ J),        det(bij | i ∉ I, j ∉ J)

are called complementary.

Proposition 35.1. Let A be a matrix of determinant 1, and let B = tA−1 .
Each minor of A equals ± the complementary minor of B.

This is a standard fact from linear algebra. For example, if
⎛ ⎞ ⎛ ⎞
a11 a12 a13 a14 b11 b12 b13 b14
⎜ a21 a22 a23 a24 ⎟ ⎜ b21 b22 b23 b24 ⎟
A=⎜ ⎟
⎝ a31 a32 a33 a34 ⎠ , B=⎜ ⎟
⎝ b31 b32 b33 b34 ⎠ ,
a41 a42 a43 a44 b41 b42 b43 b44
then

        | b11 b12 b14 |
a23 = − | b31 b32 b34 | ,
        | b41 b42 b44 |

| a12 a13 |     | b21 b24 |
| a32 a33 | = − | b41 b44 | .
It is not hard to give a rule for the sign in general, but we will not need it.

Proof. Let us show how to prove this fact using exterior algebra. Suppose that
A is an N × N matrix. Let V = CN . Then ∧N V is one-dimensional, and we

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 365


DOI 10.1007/978-1-4614-8024-2 35, © Springer Science+Business Media New York 2013
366 35 The Jacobi–Trudi Identity

fix an isomorphism η : ∧N V −→ C. If 1 ≤ k ≤ N , and if A : V −→ V is any
linear transformation, we have a commutative diagram:

                        (∧k A, ∧N−k A)
(∧k V ) × (∧N−k V ) −−−−−−−−−−−−−−−→ (∧k V ) × (∧N−k V )

        ∧ ↓                                  ↓ ∧
                            ∧N A
        ∧N V −−−−−−−−−−−−−−−−−−−−−−−−−−→ ∧N V

        η ↓                                  ↓ η
                            det A
          C −−−−−−−−−−−−−−−−−−−−−−−−−−→ C

The vertical arrows marked ∧ are multiplications in the exterior algebra. The
vertical map η ◦ ∧ : (∧k V ) × (∧N−k V ) −→ C is a nondegenerate bilinear
pairing. Indeed, let v1 , . . . , vN be a basis of V chosen so that

η(v1 ∧ · · · ∧ vN ) = 1.

Then a pair of dual bases of ∧k V and ∧N −k V with respect to this pairing are

vi1 ∧ · · · ∧ vik , ±vj1 ∧ · · · ∧ vjN −k ,

where i1 < · · · < ik , j1 < · · · < jN −k , and the two subsets

{i1 , . . . , ik }, {j1 , . . . , jN −k } ,

of {1, . . . , N } are complementary. [The sign of the second basis vector will
be (−1)d , where d = (i1 − 1) + (i2 − 2) + · · · + (ik − k).] If det(A) = 1,
then the bottom arrow is the identity map, and therefore we see that the map
∧N −k A : ∧N −k V → ∧N −k V is the inverse of the adjoint of ∧k A : ∧k V → ∧k V
with respect to this dual pairing. Hence, if we use the above dual bases to
compute matrices for these two maps, the matrix of ∧N −k A is the transpose
of the inverse of the matrix of ∧k A. Thus, if B is the inverse of the adjoint
of A with respect to the inner product on V for which v1 , . . . , vN are an
orthonormal basis, then the matrix of ∧N −k B is the same as the matrix of
∧k A. Now, with respect to the chosen dual bases, the coefficients in the matrix
of ∧k A are the k × k minors of A, while the matrix coefficients of ∧N −k B are
(up to sign) the complementary (N − k) × (N − k) minors of B. Hence, these
are equal. □
Proposition 35.2. Suppose that λ = (λ1 , . . . , λr ) and μ = (μ1 , . . . , μs ) are
conjugate partitions of k. Then the r + s numbers

s + i − λi ,        (i = 1, . . . , r),
s − j + μj + 1,     (j = 1, . . . , s),

are 1, 2, 3, . . . , r + s rearranged.
Another proof of this combinatorial lemma may be found in Macdonald [124],
I.1.7.

Proof. First note that the r + s integers all lie between 0 and r + s. Indeed,
if 1 ≤ i ≤ r, then

0 ≤ s + i − λi ≤ s + r

because s is greater than or equal to the length l(μ) = λ1 ≥ λi , so s + i − λi ≥
s − λi ≥ 0, and s + i − λi ≤ s + i ≤ s + r; and if 1 ≤ j ≤ s, then

0 ≤ s − j + μj + 1 ≤ s + r

since s − j + μj + 1 ≥ s − j ≥ 0, and μj ≤ μ1 = l(λ) ≤ r, so s − j + μj + 1 ≤
s + μj ≤ s + r.
Thus, it is sufficient to show that there are no duplications between these
s + r numbers. The sequence s + i − λi is strictly increasing, so there can be
no duplications in it, and similarly there can be no duplications among the
s − j + μj + 1. We need to show that s + i − λi ≠ s − j + μj + 1 for all 1 ≤ i ≤ r,
1 ≤ j ≤ s, that is,

λi + μj + 1 ≠ i + j.        (35.1)

There are two cases. If j ≤ λi , then by (34.7) we have also i ≤ μj , so λi +
μj + 1 > λi + μj ≥ i + j. On the other hand, if j > λi , then by (34.7), i > μj ,
so
i + j ≥ λi + μj + 2 > λi + μj + 1.
In both cases, we have (35.1). □
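Proposition 35.2 can be spot-checked by machine over all conjugate pairs of small partitions (a verification script added here, not from the text):

```python
def partitions(k, max_part=None):
    """Yield partitions of k as non-increasing tuples of positive parts."""
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def conjugate(lam):
    """Conjugate (transposed) partition."""
    if not lam:
        return ()
    return tuple(sum(1 for p in lam if p >= i) for i in range(1, lam[0] + 1))

for k in range(1, 9):
    for lam in partitions(k):
        mu = conjugate(lam)
        r, s = len(lam), len(mu)
        nums = [s + i - lam[i - 1] for i in range(1, r + 1)] \
             + [s - j + mu[j - 1] + 1 for j in range(1, s + 1)]
        # the r + s numbers are 1, 2, ..., r + s rearranged
        assert sorted(nums) == list(range(1, r + s + 1)), (lam, nums)
```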
We will henceforth denote the multiplication in R, which was denoted in
Chap. 34 with the symbol ◦, by the usual notations for multiplication. Thus,
what was formerly denoted θ ◦ ρ will be denoted θρ, etc. Observe that the ring
R is commutative.
We recall that ek and hk ∈ Rk denote the sign character and the trivial
character of Sk , respectively.

Proposition 35.3. We have

hk − e1 hk−1 + e2 hk−2 − · · · + (−1)k ek = 0        (35.2)

if k ≥ 1.

Proof. Choose n ≥ k so that the characteristic map ch(n) : Rk → Λk(n) is
injective. It is then sufficient to prove that ch(n) annihilates the left-hand
side. Since ch(n) (ei ) = ei and ch(n) (hi ) = hi , this follows from (33.3). □
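The identity underlying (35.2) is the polynomial identity (33.3), which can be verified directly for small k (an added sympy check, not from the text):

```python
import itertools
import sympy

n = 5
x = sympy.symbols(f"x1:{n+1}")

def e(k):  # elementary symmetric polynomial; 0 outside 0 <= k <= n
    if k < 0 or k > n:
        return sympy.Integer(0)
    return sum(sympy.prod(c) for c in itertools.combinations(x, k))

def h(k):  # complete homogeneous symmetric polynomial; 0 for k < 0
    if k < 0:
        return sympy.Integer(0)
    return sum(sympy.prod(c) for c in itertools.combinations_with_replacement(x, k))

# h_k - e1 h_{k-1} + e2 h_{k-2} - ... + (-1)^k e_k = 0 for k >= 1:
for k in range(1, 5):
    lhs = sum((-1) ** i * e(i) * h(k - i) for i in range(0, k + 1))
    assert sympy.expand(lhs) == 0, k
```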
Proposition 35.4. Let λ = (λ1 , . . . , λr ) and μ = (μ1 , . . . , μs ) be conjugate
partitions of k. Then

det(hλi −i+j )1≤i,j≤r = ± det(eμi −i+j )1≤i,j≤s .        (35.3)
Our convention is that if r < 0, then hr = er = 0. (Also, remember that
h0 = e0 = 1.) As an example, if λ = (3, 3, 1), then μ = λt = (3, 2, 2), and we
have

| h3 h4 h5 |   | e3 e4 e5 |
| h2 h3 h4 | = | e1 e2 e3 | .
| 0  h0 h1 |   | e0 e1 e2 |

Later, in Theorem 35.1, we will see that the sign in (35.3) is always +. This
could be proved now by carefully keeping track of the sign, but this is more
trouble than it is worth because we will determine the sign in a different way.
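For the example λ = (3, 3, 1) above, the two determinants can be compared directly (a sympy check added here, not from the text). We evaluate in four variables, where e5 = 0; both sides specialize to sλ (x1 , . . . , x4 ), and the check confirms they agree with the plus sign:

```python
import itertools
import sympy

n = 4
x = sympy.symbols(f"x1:{n+1}")

def e(k):  # elementary symmetric polynomial; 0 outside 0 <= k <= n
    if k < 0 or k > n:
        return sympy.Integer(0)
    return sum(sympy.prod(c) for c in itertools.combinations(x, k))

def h(k):  # complete homogeneous symmetric polynomial; 0 for k < 0
    if k < 0:
        return sympy.Integer(0)
    return sum(sympy.prod(c) for c in itertools.combinations_with_replacement(x, k))

lam = (3, 3, 1)   # partition of 7
mu = (3, 2, 2)    # its conjugate
# entries h_{lam_i - i + j} and e_{mu_i - i + j} (the 0-based index shifts cancel)
H = sympy.Matrix(len(lam), len(lam), lambda i, j: h(lam[i] - i + j))
E = sympy.Matrix(len(mu), len(mu), lambda i, j: e(mu[i] - i + j))
assert sympy.expand(H.det() - E.det()) == 0  # equal, with sign +
```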

Proof. We may interpret (33.3) as saying that the Toeplitz matrix

⎛ h0  h1  · · ·  hr+s−1 ⎞
⎜     h0  · · ·  hr+s−2 ⎟
⎜          ..      ..   ⎟        (35.4)
⎝                  h0   ⎠

is the transpose inverse of

⎛ e0                     ⎞
⎜ e1      e0             ⎟
⎜ ..           ..        ⎟        (35.5)
⎝ er+s−1  er+s−2  · · · e0 ⎠

conjugated by

⎛ 1                    ⎞
⎜   −1                 ⎟
⎜       ..             ⎟ .
⎝          (−1)r+s−1   ⎠

We only need to compute the minors up to sign, and conjugation by the
latter matrix only changes the signs of these minors. Hence, it follows from
Proposition 35.1 that each minor of (35.4) is, up to sign, the same as the
complementary minor of (35.5). Let us choose the minor of (35.4) with columns
s + 1, . . . , s + r and rows s + i − λi (i = 1, . . . , r). This minor is the left-hand side
of (35.3). By Proposition 35.2, the complementary minor of (35.5) is formed
with columns 1, . . . , s and rows s − j + μj + 1 (j = 1, . . . , s). After conjugating
this matrix by the matrix

⎛      1 ⎞
⎜   · ·   ⎟
⎝ 1      ⎠

with 1's on the antidiagonal, we obtain the right-hand side of (35.3). □
Suppose that λ = (λ1 , . . . , λr ) is a partition of k. Then we will denote

eλ = eλ1 · · · eλr ,        hλ = hλ1 · · · hλr .

Referring to the definition of multiplication in the ring R, we see that eλ
and hλ are the characters of Sk induced from the sign and trivial characters,
respectively, of the subgroup Sλ1 × · · · × Sλr . We will denote this group by
Sλ .
There is a partial ordering on partitions. We write λ ⪯ μ if

λ1 + · · · + λi ≤ μ1 + · · · + μi ,        (i = 1, 2, 3, . . .).

Since Rk is the character ring of Sk , it has a natural inner product, which
we will denote ⟨ , ⟩. Our objective is to compute the inner product ⟨eλ , hμ ⟩.
Proposition 35.5. Let λ = (λ1 , . . . , λr ) and μ = (μ1 , . . . , μs ) be partitions of
k. Then
⟨hλ , eμ ⟩ = ⟨eλ , hμ ⟩.        (35.6)
This inner product is equal to the number of r × s matrices with each coefficient
equal to either 0 or 1 such that the sum of the ith row is equal to λi and the
sum of the jth column is equal to μj . This inner product is nonzero if and
only if μt ⪰ λ. If μt = λ, then the inner product is 1.
Proof. Computing the right- and left-hand sides of (35.6) both lead to the
same calculation, as we shall see. For definiteness, we will compute the left-
hand side of (35.6). Note that

⟨hλ , eμ ⟩ = dim HomSk ( Ind_{Sλ}^{Sk} (1), Ind_{Sμ}^{Sk} (ε) ) ,

where ε is the alternating character of Sμ , and Ind_{Sμ}^{Sk} (ε) denotes the
corresponding induced representation of Sk . This is because eμi ∈ Rμi is the
alternating character of Sμi , and the multiplication in R is defined so that
the product eμ = eμ1 · · · eμs is obtained by induction from Sμ .
By Mackey's theorem, we must count the number of double cosets in
Sμ \Sk /Sλ that support intertwining operators. (See Remark 32.2.) Simply
counting these double cosets is sufficient because the representations that we
are inducing are both one-dimensional, so each space on the right-hand side
of (32.12) is either one-dimensional (if the coset supports an intertwining
operator) or zero-dimensional (if it does not).
First, we will show that the double cosets in Sμ \Sk /Sλ may be parametrized
by s × r matrices with nonnegative integer coefficients such that the sum of
the ith row is equal to μi and the sum of the jth column is equal to λj . Then
we will show that the double cosets that support intertwining operators are
precisely those that have no entry > 1. This will prove the first assertion.
We will identify Sk with the group of k × k permutation matrices. (A
permutation matrix is one that has only zeros and ones as entries, with ex-
actly one nonzero entry in each row and column.) Then Sλ is the subgroup
consisting of elements of the form
⎛ ⎞
D1 0 ··· 0
⎜ 0 D2 · · · 0 ⎟
⎜ ⎟
⎜ .. .. . . .. ⎟,
⎝ . . . . ⎠
0 0 ··· Dr
where Di is a λi × λi permutation matrix. Let g ∈ Sk represent a double coset
in Sμ \Sk /Sλ . Let us write g in block form,
⎛ ⎞
G11 G12 · · · G1r
⎜ G21 G22 · · · G2r ⎟
⎜ ⎟
⎜ .. .. . . .. ⎟ , (35.7)
⎝ . . . . ⎠
Gs1 Gs2 · · · Gsr

where Gij is a μi × λj block. Let γij be the rank of Gij , which is the number
of nonzero entries. Then the s × r matrix (γij ) is independent of the choice
of representative of the double coset. It has the property that the sum of the
ith row is equal to μi and the sum of the jth column is equal to λj .
Moreover, it is easy to see that any such matrix arises from a double coset in
this manner and determines the double coset uniquely. This establishes the
correspondence between the matrices (γij ) and the double cosets.
Next we show that a double coset supports an intertwining operator if and
only if each γij ≤ 1. A double coset Sμ gSλ supports an intertwining operator
if and only if there exists a nonzero function Δ : Sk → C with support in
Sμ gSλ such that
Δ(τ hσ) = ε(τ )Δ(h) (35.8)
for τ ∈ Sμ , σ ∈ Sλ .
First, suppose the matrix (γij ) is given such that for some particular i, j,
we have γ = γij > 1. Then we may take as our representative of the double
coset a matrix g such that
 
Gij = ⎛ Iγ  0 ⎞ .
      ⎝ 0   0 ⎠

Now there exists a transposition σ ∈ Sλ and a transposition τ ∈ Sμ such that
g = τ gσ. Indeed, we may take σ to be the transposition (12) ∈ Sλj ⊂ Sλ and
τ to be the transposition (12) ∈ Sμi ⊂ Sμ . Now, by (35.8),

Δ(g) = Δ(τ gσ) = −Δ(g),

so Δ(g) = 0 and therefore Δ is identically zero. We see that if any γij > 1, then
the corresponding double coset does not support an intertwining operator.
On the other hand, if each γij ≤ 1, then we will show that for g a representative
of the corresponding double coset, g−1 Sμ g ∩ Sλ = {1}, or

Sμ g ∩ gSλ = {g}. (35.9)


Indeed, suppose that τ ∈ Sμ and σ ∈ Sλ are such that τ g = gσ. Writing

    ⎛ τμ1          ⎞          ⎛ σλ1          ⎞
τ = ⎜     τμ2      ⎟ ,    σ = ⎜     σλ2      ⎟ ,
    ⎝          . . ⎠          ⎝          . . ⎠

with τμi ∈ Sμi and σλi ∈ Sλi and letting g be as in (35.7), we have τμi Gij =
Gij σλj . If τμi ≠ I, then

τμi (Gi1 · · · Gir ) ≠ (Gi1 · · · Gir )

since the rows of the second matrix are distinct. Thus τμi Gij ≠ Gij for some j.
Since Gij has at most one nonzero entry, it is impossible that after reordering
the rows (which is the effect of left multiplication by τμi ) this nonzero entry
could be restored to its original position by reordering the columns (which
is the effect of right multiplication by σλj −1 ). Thus, τμi Gij ≠ Gij implies that
τμi Gij ≠ Gij σλj . This contradiction proves (35.9).
Now (35.9) shows that each element of the double coset has a unique
representation as τ gσ with τ ∈ Sμ and σ ∈ Sλ . Hence, we may define

Δ(h) = { ε(τ )   if h = τ gσ with τ ∈ Sμ and σ ∈ Sλ ,
       { 0       otherwise,

and this is well-defined. Hence, such a double coset does support an
intertwining operator.
Now we have asserted further that (35.6) is nonzero if and only if μt ⪰ λ
and that if μt = λ, then the inner product is 1. Let us ask, therefore, for given
λ and μ, whether we can construct a matrix (γij ) with each γij = 0 or 1 such
that the sum of the ith row is μi and the sum of the jth column is λj . Let
ν = μt . Then

νi = card {j | μj ≥ i}.

That is, νi is the number of rows that will accommodate up to i 1's. Now
ν1 + ν2 + · · · + νt is equal to the number of rows that will take a 1, plus the
number of rows that will take two 1's, and so forth. Let us ask how many 1's
we may put in the first t columns. Each nonzero entry must lie in a different
row, so to put as many 1's as possible in the first t columns, we should put νt
of them in those rows that will accommodate t nonzero entries, νt−1 of them in
those rows that will accommodate t − 1 entries, and so forth. Thus, ν1 + · · · + νt
is the maximum number of 1's we can put in the first t columns. We need to
place λ1 + · · · + λt ones in these columns, so in order for the construction to be
possible, what we need is

λ1 + · · · + λt ≤ ν1 + · · · + νt

for each t, that is, ν ⪰ λ. It is easy to see that if ν = λ, then the location of
the ones in the matrix (γij ) is forced, so that in this case there exists a unique
intertwining operator. □
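The counting statement in Proposition 35.5 is easy to verify by brute force for small k (an added check, not from the text): enumerate all 0-1 matrices with row sums λ and column sums μ, and compare the count against the dominance criterion μt ⪰ λ:

```python
import itertools

def partitions(k, max_part=None):
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def conjugate(lam):
    if not lam:
        return ()
    return tuple(sum(1 for p in lam if p >= i) for i in range(1, lam[0] + 1))

def dominates(a, b):
    """a >= b in the partial order: partial sums of a dominate those of b."""
    sa = sb = 0
    for i in range(max(len(a), len(b))):
        sa += a[i] if i < len(a) else 0
        sb += b[i] if i < len(b) else 0
        if sa < sb:
            return False
    return True

def zero_one_count(lam, mu):
    """Number of 0-1 matrices with row sums lam and column sums mu."""
    s = len(mu)
    row_choices = [list(itertools.combinations(range(s), li)) for li in lam]
    count = 0
    for rows in itertools.product(*row_choices):
        cols = [0] * s
        for row in rows:
            for j in row:
                cols[j] += 1
        if tuple(cols) == mu:
            count += 1
    return count

k = 5
for lam in partitions(k):
    for mu in partitions(k):
        c = zero_one_count(lam, mu)
        assert (c > 0) == dominates(conjugate(mu), lam), (lam, mu)  # nonzero iff mu^t >= lam
        if conjugate(mu) == lam:
            assert c == 1  # the count is exactly 1 when mu^t = lam
```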
Corollary 35.1. If λ and μ are partitions of k, then we have μt ⪯ λt if and
only if λ ⪯ μ.

Proof. This is equivalent to the statement that μt ⪰ λ if and only if λt ⪰ μ.
In this form, this is contained in the preceding proposition from the identity
(35.6) together with the characterization of the nonvanishing of that inner
product. Of course, one may also give a direct combinatorial argument. □
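Corollary 35.1 can also be confirmed by brute force over all pairs of small partitions (an added check, not from the text):

```python
def partitions(k, max_part=None):
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def conjugate(lam):
    if not lam:
        return ()
    return tuple(sum(1 for p in lam if p >= i) for i in range(1, lam[0] + 1))

def leq(a, b):
    """a <= b in the partial order: partial sums of a bounded by those of b."""
    sa = sb = 0
    for i in range(max(len(a), len(b))):
        sa += a[i] if i < len(a) else 0
        sb += b[i] if i < len(b) else 0
        if sa > sb:
            return False
    return True

for k in range(1, 9):
    parts = list(partitions(k))
    for lam in parts:
        for mu in parts:
            # mu^t <= lam^t  if and only if  lam <= mu
            assert leq(conjugate(mu), conjugate(lam)) == leq(lam, mu), (lam, mu)
```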
Theorem 35.1. (Jacobi–Trudi identity) Let λ = (λ1 , . . . , λr ) and μ =
(μ1 , . . . , μs ) be conjugate partitions of k. We have the identity

det(hλi −i+j )1≤i,j≤r = det(eμi −i+j )1≤i,j≤s        (35.10)

in Rk . We denote this element (35.10) as sλ . It is an irreducible character of
Sk and may be characterized as the unique irreducible character that occurs
with positive multiplicity in both Ind_{Sμ}^{Sk} (ε) and Ind_{Sλ}^{Sk} (1); it occurs with
multiplicity one in each of them. The p(k) characters sλ are all distinct, and are
all the irreducible characters of Sk .
Proof. Let n ≥ k, so that ch(n) : Rk → Λk(n) is injective. Applying ch(n) to
(35.10) and using (35.3), we see that the left- and right-hand sides are either
equal or negatives of each other. We will show that the inner product of the
left-hand side with the right-hand side of (35.10) equals 1. Since the inner
product is positive definite, this will show that the left- and right-hand sides
are actually equal. Moreover, if Σ di χi is the decomposition of (35.10) into
irreducibles, this inner product is Σi di² , so knowing that the inner product
is 1 will imply that sλ is either an irreducible character, or the negative of an
irreducible character.
We claim that expanding the determinant on the left-hand side of (35.10)
gives a sum of terms of the form ±hλ′ where each λ′ ⪰ λ and the term hλ
occurs exactly once. Indeed, the terms in the expansion of the determinant
are of the form
hλ1 −1+j1 hλ2 −2+j2 · · · hλr −r+jr ,
where (j1 , . . . , jr ) is a permutation of (1, 2, . . . , r). If we arrange the indices
λi − i + ji into descending order as λ′1 , λ′2 , . . ., then λ′1 is greater than or equal
to λ1 − 1 + j1 . Moreover, j1 ≥ 1 so

λ′1 ≥ λ1 − 1 + j1 ≥ λ1 ,

and similarly j1 + j2 ≥ 3 so

λ′1 + λ′2 ≥ (λ1 − 1 + j1 ) + (λ2 − 2 + j2 ) ≥ λ1 + λ2 ,

and so forth.
Similarly, expanding the right-hand side gives a sum of terms of the form
±eμ′ , where μ′ ⪰ μ, and the term eμ also occurs exactly once.
Now let us consider ⟨hλ′ , eμ′ ⟩. By Proposition 35.5, if this is nonzero we
have (μ′ )t ⪰ λ′ . Since λ′ ⪰ λ and μ′ ⪰ μ, which implies μt ⪰ (μ′ )t by
Corollary 35.1, we have

λ = μt ⪰ (μ′ )t ⪰ λ′ ⪰ λ.

Thus, we must have λ′ = λ. It is easy to see that this implies that (j1 , . . . , jr ) =
(1, 2, . . . , r), so the monomial hλ occurs exactly once in the expansion of
det(hλi −i+j ). A similar analysis applies to det(eμi −i+j ).
We see that the inner product of the left- and right-hand sides of (35.10)
equals 1, which implies everything except that sλ and not −sλ is an irreducible
character of Sk . To see this, we form the inner product ⟨sλ , hλ ⟩. The same
considerations show that this inner product is 1. Since hλ is a proper character
[it is the character of Ind_{Sλ}^{Sk} (1)] this implies that it is sλ , and not −sλ , that is
an irreducible character.
We have just noted that sλ occurs with positive multiplicity in hλ , which
is the character of the representation Ind_{Sλ}^{Sk} (1). Similar considerations show
that ⟨sλ , eμ ⟩ = 1 and eμ is the character of the representation Ind_{Sμ}^{Sk} (ε). By
Proposition 35.5, ⟨eμ , hλ ⟩ = 1, so there cannot be any other representation
that occurs with positive multiplicity in both.
This characterization of sλ shows that it cannot equal sμ for any μ ≠ λ,
so the irreducible characters sλ are all distinct. Their number is p(k), which is
also the number of conjugacy classes in Sk (i.e., the total number of irreducible
representations). We have therefore constructed all of them. □
Theorem 35.2. If λ and μ are conjugate partitions, and if ι is the involution
of Theorem 34.3, then ι sλ = sμ .

Proof. Since ι hλ = eλ and ι eλ = hλ , this follows from the Jacobi–Trudi
identity. □
EXERCISES
Exercise 35.1. Let λ and μ be partitions of k. Show that

⟨hλ , hμ ⟩ = ⟨eλ , eμ ⟩

and that this inner product is equal to the number of r × s matrices with each
coefficient a nonnegative integer such that the sum of the ith row is equal to λi , and
the sum of the jth column is equal to μj .

Exercise 35.2. Give a combinatorial proof of Corollary 35.1.


Exercise 35.3. If λ, μ are partitions of k, let Tsh (λ, μ) be the coefficient of hμ when
sλ is expressed in terms of the hμ , that is,

sλ = Σμ Tsh (λ, μ) hμ .
Similarly we will define Txy when x, y are s, e or h to denote the transition matrices
between the bases sλ , eλ and hλ of Λk .
(i) Show that Tsh (λ, μ) = 0 unless μ ⪰ λ.
(ii) Show that Ths (λ, μ) = 0 unless μ ⪰ λ.
(iii) Show that Tse (λ, μ) = 0 unless μ ⪰ λt .
(iv) Show that Tes (λ, μ) = 0 unless μt ⪰ λ.
(v) Show that The (λ, μ) = 0 unless μt ⪰ λ.
(vi) Show that Teh (λ, μ) = 0 unless μt ⪰ λ.
Zelevinsky [178] shows how the ring R may be given the structure of a
graded Hopf algebra. This extra algebraic structure (actually introduced earlier by
Geissinger) encodes all the information about the representations of Sk that comes
from Mackey theory. Moreover, a similar structure exists in a ring R(q) analogous
to R, constructed from the representations of GL(k, Fq ), which we will consider in
Chap. 47. Thus, Zelevinsky is able to give a unified discussion of important aspects
of the two theories. In the next exercises, we will establish the basic fact that R is
a Hopf algebra.
We begin by reviewing the notion of a Hopf algebra. We recommend Majid [125]
for further insight. (Apart from its use as an introduction to quantum groups, this is
good for gaining facility with Hopf algebra methods such as the Sweedler notation.)
Let A be a commutative ring. An A-algebra is normally defined to be a ring R with
a homomorphism u of A into the center of R. The homomorphism u (called the unit)
then makes R into an A-module. The multiplication map R × R → R is A-bilinear
hence induces a linear map m : R ⊗ R → R. The associative law for multiplication
may be interpreted as the commutativity of the diagram:

                m⊗1
R ⊗ R ⊗ R −−−−−−−→ R ⊗ R

   1⊗m ↓               ↓ m

                 m
     R ⊗ R −−−−−−−→ R

We also have commutative diagrams expressing the unit axiom: the composites

A ⊗ R −−u⊗1−→ R ⊗ R −−m−→ R ,        R ⊗ A −−1⊗u−→ R ⊗ R −−m−→ R

are the canonical maps. Here we are identifying R with A ⊗ R by the canonical
isomorphism x → 1 ⊗ x.
As an alternative viewpoint, given an A-module R with linear maps u : A → R
and m : R ⊗ R → R subject to these commutative diagrams, R is an A-algebra. The
change of viewpoint in replacing the bilinear multiplication map R × R → R with
the linear map R ⊗ R → R is a simple but useful one, since it allows us to transport
the notion to other contexts. For example, now we can dualize it.
The dual notion to an algebra is that of a coalgebra. The definition and axioms
are obtained by reversing all the arrows. That is, we require an A-module R together
with linear maps Δ : R → R ⊗ R and ε : R → A such that we have a commutative
diagram

            Δ
    R −−−−−−−→ R ⊗ R

  Δ ↓               ↓ 1⊗Δ

           Δ⊗1
R ⊗ R −−−−−−−→ R ⊗ R ⊗ R

together with counit diagrams: the composites

R −−Δ−→ R ⊗ R −−ε⊗1−→ A ⊗ R ,        R −−Δ−→ R ⊗ R −−1⊗ε−→ R ⊗ A

are the canonical identifications.

Exercise 35.4. Let R be an algebra that is also a coalgebra. Show that the following
three statements are equivalent.
(i) The comultiplication Δ : R −→ R ⊗ R and counit ε : R → A are homomorphisms
of algebras.
(ii) The multiplication m : R ⊗ R −→ R and unit A → R are homomorphisms of
coalgebras.
(iii) The following diagram is commutative:

         Δ⊗Δ                        1⊗τ ⊗1
R ⊗ R −−−−−→ R ⊗ R ⊗ R ⊗ R −−−−−−−→ R ⊗ R ⊗ R ⊗ R

   m ↓                                          ↓ m⊗m

                        Δ
     R −−−−−−−−−−−−−−−−−−−−−−−−−−−−→ R ⊗ R

Here τ is the “transposition” map R ⊗ R → R ⊗ R that sends x ⊗ y to y ⊗ x.
We will refer to this property as the Hopf axiom.

If these three equivalent conditions are satisfied, then R is called a bialgebra.


Note that this definition is self-dual. For example, if A is a field and R is a finite-
dimensional bialgebra, then the dual space R∗ is also a bialgebra, with comultipli-
cation being the adjoint of multiplication, etc.

Exercise 35.5. Let G be a finite group, A = C and let R be the group algebra.
Define a map Δ : R → R ⊗ R by extending the diagonal map G → G × G to a linear
map R → R ⊗ R, and let  : R → A be the augmentation map that sends every
element of G to 1. Show that R is a bialgebra.

As a variant, all these notions have graded versions. Let A be a commutative
ring. A graded A-module R is an A-module R with a sequence {R0 , R1 , R2 , . . .} of
submodules such that R = ⊕ Ri , and a homomorphism R −→ S of graded A-
modules is a homomorphism that takes Ri into Si . The tensor product R ⊗ S =
R ⊗A S of two graded A-modules is a graded A-module with

(R ⊗ S)m = ⊕k+l=m Rk ⊗ Sl .

A graded A-algebra is an A-algebra R in which R0 = A and the multiplication
satisfies Rk · Rl ⊂ Rk+l . (The condition that R0 = A may be replaced by A ⊆ R0 .)
The map m : R ⊗ R −→ R such that m(x ⊗ y) = xy is a homomorphism of graded
A-modules. The ring A is itself a graded module with A0 = A and Ai = 0 for
i > 0. Now a graded algebra, coalgebra or bialgebra is defined by requiring the
multiplication, unit, comultiplication and counit to be homomorphisms of graded
modules.

Exercise 35.6. Suppose that k + l = m. Let ⊗ denote ⊗Z . The group Rk ⊗ Rl can be
identified with the free Abelian group generated by the irreducible representations
of Sk × Sl . (Explain.) So restriction of a representation from Sm to Sk × Sl gives a
group homomorphism Rm −→ Rk ⊗ Rl . Combining these maps gives a map

Δ : Rm −→ ⊕k+l=m Rk ⊗ Rl = (R ⊗ R)m .

Show that this homomorphism of graded Z-modules makes R into a graded
coalgebra.

Exercise 35.7. (Zelevinsky [178])


(i) Let k+l = p+q = m. Representing elements of the symmetric group as matrices,
show that a complete set of double coset representatives for (Sp ×Sq )\Sm /(Sk ×
Sl ) consists of the matrices
⎛ ⎞
Ia 0 0 0
⎜ 0 0 0 Ic ⎟
⎜ ⎟
⎝ 0 0 Id 0 ⎠ ,
0 Ib 0 0

where a + b = k, c + d = l, a + c = p, and b + d = q.
(ii) Use (i) and Mackey theory to prove that R is a graded bialgebra over Z.

Hint: Both parts are similar to parts of the proof of Proposition 35.5.

A bialgebra R is called a Hopf algebra if it satisfies the following additional
condition. There must be a map S : R → R (the antipode) such that the following
diagram is commutative:

            Δ                 Δ
R ⊗ R ←−−−−−− R −−−−−−→ R ⊗ R

 1⊗S ↓           ↓           ↓ S⊗1

R ⊗ R −−−m−−→ R ←−−m−−− R ⊗ R

Here the middle vertical arrow is the composite R → A → R of the counit ε and
the unit u.
Exercise 35.8. Show that a group algebra is a Hopf algebra. The antipode is the
map S(g) = g −1 .

Exercise 35.9. Show that R is a Hopf algebra. We have S(hk ) = (−1)k ek and
S(ek ) = (−1)k hk .

Exercise 35.10. Let H be a Hopf algebra. The Hopf square map σ : H → H is
m ◦ Δ. Prove that if H is commutative as a ring, then σ is a ring homomorphism.

The next exercise is from the 2013 senior thesis of Seth Shelley-Abrahamson.
Similar statements relate the higher Hopf power maps to other wreath products.
Interest in the Hopf square map and higher-power maps has been stimulated by
recent investigations of Diaconis, Pang, and Ram.

Exercise 35.11. Let Hk be the hyperoctahedral group of k × k matrices g such that
g has one nonzero entry in every row and column, and every nonzero entry is ±1.
The order of Hk is k!2^k and Hk is isomorphic to the Weyl group of Cartan type Bk
(or Ck ). Given a character χ of Sk , we may induce χ to Hk , then restrict it back to
Sk . Thus, we get a self-map of Rk .
(i) Use Mackey theory to show that this map is the Hopf square map.
(ii) Let θk be the function on Sk whose value on a permutation σ is 2^n , where
n is the number of cycles in σ. Show that θk is a character of Sk and that the
map of (i) multiplies every character of Sk by θk .
36
Schur Polynomials and GL(n, C)

Now let sμ (x1 , . . . , xn ) be the symmetric polynomial ch(n) (sμ ); we will use the
same notation sμ for the element ch(sμ ) of the inverse limit ring Λ defined by
(34.10). These are the Schur polynomials.
Theorem 36.1. Assume that n is greater than or equal to the length l(λ) of
the partition λ, so that we may write λ = (λ1 , . . . , λn ) (possibly with trailing
zeros). We have

                      | x1^{λ1+n−1}  x2^{λ1+n−1}  · · ·  xn^{λ1+n−1} |
                      | x1^{λ2+n−2}  x2^{λ2+n−2}  · · ·  xn^{λ2+n−2} |
                      |     ..                               ..      |
                      | x1^{λn}      x2^{λn}      · · ·  xn^{λn}     |
sλ (x1 , . . . , xn ) = ─────────────────────────────────────────────── . (36.1)
                      | x1^{n−1}     x2^{n−1}     · · ·  xn^{n−1}    |
                      | x1^{n−2}     x2^{n−2}     · · ·  xn^{n−2}    |
                      |     ..                               ..      |
                      | x1           x2           · · ·  xn          |
                      | 1            1            · · ·  1           |

If n is less than l(λ), then sλ (x1 , . . . , xn ) = 0.
It is worth recalling that the Vandermonde determinant in the denominator
can be factored:

| x1^{n−1}  x2^{n−1}  · · ·  xn^{n−1} |
| x1^{n−2}  x2^{n−2}  · · ·  xn^{n−2} |
|    ..                        ..     |  =  ∏i<j (xi − xj ).
| x1        x2        · · ·  xn      |
| 1         1         · · ·  1       |
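Formula (36.1) can be checked against the Jacobi–Trudi expression in a small case (a sympy check added here, not from the text). For λ = (2, 1) and n = 3 the quotient of alternants should equal det(hλi −i+j ) = h2 h1 − h3 , and its value at x1 = x2 = x3 = 1 is 8, consistent with the dimension of the GL(3, C) module of highest weight (2, 1, 0):

```python
import itertools
import sympy

n = 3
x = sympy.symbols(f"x1:{n+1}")

def h(k):  # complete homogeneous symmetric polynomial; 0 for k < 0
    if k < 0:
        return sympy.Integer(0)
    return sum(sympy.prod(c) for c in itertools.combinations_with_replacement(x, k))

lam = (2, 1, 0)  # the partition (2,1), padded with a trailing zero
num = sympy.Matrix(n, n, lambda i, j: x[j] ** (lam[i] + n - 1 - i)).det()
den = sympy.Matrix(n, n, lambda i, j: x[j] ** (n - 1 - i)).det()
s = sympy.cancel(num / den)  # the quotient (36.1); den divides num exactly

jt = sympy.expand(h(2) * h(1) - h(3))  # det(h_{lam_i - i + j}) for lam = (2,1)
assert sympy.expand(s - jt) == 0
assert s.subs({xi: 1 for xi in x}) == 8
```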
It is also worth noting, since it is not immediately obvious from the expression
(36.1), that the Schur polynomial sλ in n + 1 variables restricts to the Schur
polynomial also denoted sλ under the map (34.9). This is of course clear from
Proposition 34.6 and the fact that ch(sλ ) = sλ .
Proof. Let e_k^{(i)} be the kth elementary symmetric polynomial in the n − 1 variables x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_n, omitting x_i. Using (33.1) and (33.2), and omitting one variable in (33.1), we have

$$
\sum_{k=0}^{\infty}(-1)^k e_k^{(i)}\,t^k=\prod_{j\neq i}(1-x_j t),
\qquad
\sum_{k=0}^{\infty} h_k\,t^k=\prod_{j=1}^{n}(1-x_j t)^{-1},
$$

and therefore

$$
\left(\sum_{k=0}^{\infty}(-1)^k e_k^{(i)}\,t^k\right)\left(\sum_{k=0}^{\infty} h_k\,t^k\right)=(1-tx_i)^{-1}=1+tx_i+t^2x_i^2+\cdots.
$$

Comparing the coefficients of t^r in this identity, we have

$$
\sum_{k}(-1)^k e_k^{(i)}\,h_{r-k}=x_i^r.
$$

(Our convention is that e_k^{(i)} = h_k = 0 if k < 0, and note also that e_k^{(i)} = 0 if k ≥ n.) Therefore, we have
$$
\begin{pmatrix}
h_{\lambda_1} & h_{\lambda_1+1} & \cdots & h_{\lambda_1+n-1}\\
h_{\lambda_2-1} & h_{\lambda_2} & \cdots & h_{\lambda_2+n-2}\\
\vdots & & & \vdots\\
h_{\lambda_n-n+1} & h_{\lambda_n-n+2} & \cdots & h_{\lambda_n}
\end{pmatrix}
\begin{pmatrix}
\pm e_{n-1}^{(1)} & \pm e_{n-1}^{(2)} & \cdots & \pm e_{n-1}^{(n)}\\
\mp e_{n-2}^{(1)} & \mp e_{n-2}^{(2)} & \cdots & \mp e_{n-2}^{(n)}\\
\vdots & & & \vdots\\
e_0^{(1)} & e_0^{(2)} & \cdots & e_0^{(n)}
\end{pmatrix}
=
\begin{pmatrix}
x_1^{\lambda_1+n-1} & x_2^{\lambda_1+n-1} & \cdots & x_n^{\lambda_1+n-1}\\
x_1^{\lambda_2+n-2} & x_2^{\lambda_2+n-2} & \cdots & x_n^{\lambda_2+n-2}\\
\vdots & & & \vdots\\
x_1^{\lambda_n} & x_2^{\lambda_n} & \cdots & x_n^{\lambda_n}
\end{pmatrix}.
$$
Denote the determinant of the second factor on the left-hand side by D. Taking determinants, and noting that the determinant of the first factor is sλ by the Jacobi–Trudi identity (Theorem 35.1), we obtain

$$
s_\lambda D=
\begin{vmatrix}
x_1^{\lambda_1+n-1} & x_2^{\lambda_1+n-1} & \cdots & x_n^{\lambda_1+n-1}\\
x_1^{\lambda_2+n-2} & x_2^{\lambda_2+n-2} & \cdots & x_n^{\lambda_2+n-2}\\
\vdots & & & \vdots\\
x_1^{\lambda_n} & x_2^{\lambda_n} & \cdots & x_n^{\lambda_n}
\end{vmatrix}.
\tag{36.2}
$$

Hence, we have only to prove that D is equal to the denominator in (36.1), and this follows from (36.2) by taking λ = (0, . . . , 0), since s_(0,...,0) = 1. □
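The bialternant formula (36.1) is easy to check numerically. The following sketch (my own illustration, not from the text; the function names are hypothetical) evaluates sλ at an integer point as a ratio of determinants, using exact arithmetic, and compares with the direct monomial expansion of s_(2,1) in three variables.

```python
from fractions import Fraction
from itertools import permutations

def det(m):
    """Determinant by the Leibniz expansion (fine for small matrices)."""
    n = len(m)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # sign of the permutation = (-1)^(number of inversions)
        sign = (-1) ** sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        prod = Fraction(1)
        for i in range(n):
            prod *= m[i][perm[i]]
        total += sign * prod
    return total

def schur(lam, x):
    """s_lambda(x_1,...,x_n) via the bialternant formula (36.1)."""
    n = len(x)
    lam = tuple(lam) + (0,) * (n - len(lam))  # pad with trailing zeros
    num = [[Fraction(x[j]) ** (lam[i] + n - 1 - i) for j in range(n)] for i in range(n)]
    den = [[Fraction(x[j]) ** (n - 1 - i) for j in range(n)] for i in range(n)]
    return det(num) / det(den)

# s_(2,1)(x1,x2,x3) = sum_{i != j} x_i^2 x_j + 2 x1 x2 x3
x = (2, 3, 5)
direct = sum(x[i] ** 2 * x[j] for i in range(3) for j in range(3) if i != j) + 2 * x[0] * x[1] * x[2]
assert schur((2, 1), x) == direct == 280
```

The ratio is always an integer-coefficient polynomial in the xi, even though neither determinant alone is symmetric.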
Suppose that V and W are vector spaces over a field of characteristic zero
and B : V × · · · × V −→ W is a symmetric k-linear map. Let Q : V −→ W be
the function Q(v) = B(v, . . . , v). The function B can be reconstructed from
Q, and this process is called polarization. For example, if k = 2 we have

$$B(v,w)=\tfrac12\bigl(Q(v+w)-Q(v)-Q(w)\bigr),$$

as we may see by expanding the right-hand side and using B(v, w) = B(w, v).
Proposition 36.1. Let U and W be vector spaces over a field of characteristic zero, and let B : U × · · · × U −→ W be a symmetric k-linear map. Let Q : U −→ W be the function Q(u) = B(u, . . . , u). If u1, . . . , uk ∈ U, and if S ⊆ I = {1, 2, . . . , k}, let u_S = Σ_{i∈S} u_i. We have

$$B(u_1,\ldots,u_k)=\frac{1}{k!}\left[\sum_{S\subseteq I}(-1)^{k-|S|}\,Q(u_S)\right].$$
Proof. Expanding Q(u_S) = B(u_S, . . . , u_S) and using the k-linearity of B, we have

$$Q(u_S)=\sum_{i_1,\ldots,i_k\in S} B(u_{i_1},u_{i_2},\ldots,u_{i_k}).$$

Therefore,

$$\sum_{S\subseteq I}(-1)^{k-|S|}\,Q(u_S)=\sum_{1\leqslant i_1,\ldots,i_k\leqslant k} B(u_{i_1},\ldots,u_{i_k})\sum_{S\supseteq\{i_1,\ldots,i_k\}}(-1)^{k-|S|}.$$

Suppose that there are repetitions among the list i1, . . . , ik. Then there will be some j ∈ I such that j ∉ {i1, . . . , ik}, and pairing those subsets containing j with those not containing j, we see that the sum Σ_{S⊇{i1,...,ik}} (−1)^{k−|S|} = 0. Hence, we need only consider those terms where {i1, . . . , ik} is a permutation of {1, . . . , k}. Remembering that B is symmetric, these terms all contribute equally, and the result follows. □
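Proposition 36.1 can be sanity-checked numerically. Here is a minimal sketch (my own, not from the text) with k = 3, taking B(u1, u2, u3) = u1·u2·u3 on U = W = Q, so that Q(u) = u³.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def polarize(Q, us):
    """Recover B(u_1,...,u_k) from Q via the formula of Proposition 36.1."""
    k = len(us)
    total = Fraction(0)
    for size in range(k + 1):
        for S in combinations(range(k), size):
            uS = sum(us[i] for i in S)  # u_S = sum of the u_i with i in S
            total += (-1) ** (k - len(S)) * Q(uS)
    return total / factorial(k)

# Symmetric trilinear form B(u,v,w) = u*v*w, so Q(u) = u^3.
Q = lambda u: Fraction(u) ** 3
assert polarize(Q, [2, 3, 5]) == 2 * 3 * 5
```

Each of the 2^k subsets contributes one evaluation of Q, which is what makes polarization usable when only the diagonal values Q(u) are known.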
Theorem 36.2. Let λ be a partition of k, and let n ≥ l(λ). Then there exists an irreducible representation πλ = πλ^{GL(n)} of GL(n, C) with character χλ such that if g ∈ GL(n, C) has eigenvalues t1, . . . , tn, then

χλ(g) = sλ(t1, . . . , tn).    (36.3)

The restriction of πλ to U(n) is an irreducible representation of U(n). If μ ≠ λ is another partition of k with n ≥ l(μ), then χλ and χμ are distinct.
Proof. We know that the representation exists by applying Theorem 34.1 to the irreducible representation (ρ, Nρ) of Sk with character sλ. The problem is to prove the irreducibility of the module Vρ = (⊗^k V) ⊗_{C[Sk]} Nρ, which has the character χλ by Theorem 34.1. (As in Theorem 34.1, we are taking V = C^n.)
Let B be the ring of endomorphisms of ⊗^k V that commute with the action of Sk. We will show that B is spanned by the linear transformations

v1 ⊗ · · · ⊗ vk −→ gv1 ⊗ · · · ⊗ gvk,    g ∈ GL(n, C).    (36.4)

We have an isomorphism ⊗^k End(V) ≅ End(⊗^k V). In this isomorphism, f1 ⊗ · · · ⊗ fk ∈ ⊗^k End(V) corresponds to the endomorphism v1 ⊗ · · · ⊗ vk −→ f1(v1) ⊗ · · · ⊗ fk(vk). Conjugation in End(⊗^k V) by an element σ ∈ Sk in the action (34.1) on ⊗^k V corresponds to the transformation

f1 ⊗ · · · ⊗ fk −→ f_{σ(1)} ⊗ · · · ⊗ f_{σ(k)}

of ⊗^k End(V). If ξ ∈ ⊗^k End(V) commutes with this action, then ξ is a linear combination of elements of the form B(f1, . . . , fk), where B : End(V) × · · · × End(V) −→ ⊗^k End(V) is the symmetric k-linear map

$$B(f_1,\ldots,f_k)=\sum_{\sigma\in S_k} f_{\sigma(1)}\otimes\cdots\otimes f_{\sigma(k)}.$$

It follows from Proposition 36.1 that the vector space of such elements of ⊗^k End(V) is spanned by those of the form Q(f) = B(f, . . . , f) with f ∈ End(V). Since GL(n, C) is dense in End(V), the elements Q(f) with f invertible span the same vector space. This proves that the transformations of the form (36.4) span the space of transformations of ⊗^k V commuting with the action of Sk.
We temporarily restrict the action of GL(n, C) × Sk on ⊗^k V to the compact subgroup U(n) × Sk. Representations of a compact group are completely reducible, and the irreducible representations of U(n) × Sk are of the form π ⊗ ρ, where π is an irreducible representation of U(n) and ρ is an irreducible representation of Sk. Thus, we write

⊗^k V ≅ ⊕_i π_i ⊗ ρ_i,    (36.5)

where the πi and ρi are irreducible representations of U(n) and Sk, respectively. We take the πi to be left U(n)-modules and the ρi to be right Sk-modules. This is because the commuting actions we have defined on ⊗^k V have U(n) acting on the left and Sk acting on the right.

The subspace of ⊗^k V corresponding to πi ⊗ ρi is actually GL(n, C)-invariant. This is because it is a complex subspace invariant under the
Lie algebra action of u(n) and hence is invariant under the action of the complexified Lie algebra u(n) + iu(n) = gl(n, C), and therefore under its exponential, GL(n, C). So we may regard the decomposition (36.5) as a decomposition with respect to GL(n, C) × Sk.

We claim that there are no repetitions among the isomorphism classes of the representations ρi of Sk that occur. For if ρi ≅ ρj with i ≠ j, then denoting by f an intertwining map ρi −→ ρj and by τ an arbitrary nonzero linear transformation from the space of πi to the space of πj, the map τ ⊗ f from the space of πi ⊗ ρi to the space of πj ⊗ ρj commutes with the action of Sk. Extending it by zero on the direct summands in (36.5) besides πi ⊗ ρi gives an endomorphism of ⊗^k V that commutes with the action of Sk. It is therefore in the span of the endomorphisms (36.4). But this is impossible because those endomorphisms leave πi ⊗ ρi invariant and this one does not. This contradiction shows that the ρi all have distinct isomorphism classes.

It follows from this that at most one ρi can be isomorphic to the contragredient representation of ρλ. Thus, in Vρ = (⊗^k V) ⊗_{C[Sk]} Nρ at most one term can survive, and that term will be isomorphic to πi as a GL(n, C)-module for this unique i. We know that Vρ is nonzero since by Theorem 36.1 the polynomial sλ ≠ 0 under our hypothesis that l(λ) ≤ n. Thus, such a πi does exist, and it is irreducible as a U(n)-module, and a fortiori as a GL(n, C)-module.

It remains to be shown that if μ ≠ λ, then χμ ≠ χλ. Indeed, the Schur polynomials sμ and sλ are distinct since the partition λ can be read off from the numerator in (36.1). □
We have constructed an irreducible representation of GL(n, C) for every partition λ = (λ1, . . . , λn) of length ≤ n.
Proposition 36.2. Suppose that n ≥ l(λ). Let

λ′ = (λ1 − λn, λ2 − λn, . . . , λ_{n−1} − λn, 0).

In the ring Λ^(n) of symmetric polynomials in n variables, we have

sλ(x1, . . . , xn) = en(x1, . . . , xn)^{λn} s_{λ′}(x1, . . . , xn).    (36.6)

In terms of the characters of GL(n, C), we have

χλ(g) = det(g)^{λn} χ_{λ′}(g).    (36.7)

Note that en(x1, . . . , xn) = x1 · · · xn. Caution: this identity is special to Λ^(n); the corresponding statement is not true in Λ.

Proof. It follows from (36.1) that sλ(x1, . . . , xn) is divisible by (x1 · · · xn)^{λn}. Indeed, each entry of the jth column of the matrix in the numerator is divisible by x_j^{λn}, so we may pull x_1^{λn} out of the first column, x_2^{λn} out of the second column, and so forth, obtaining (36.6).

If the eigenvalues of g are t1, . . . , tn, then en(t1, . . . , tn) = t1 · · · tn = det(g), and (36.7) follows from (36.6) and (36.3). □
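Equation (36.6) is also easy to test numerically with the bialternant formula. The sketch below (my own; the `det` and `schur` helpers are hypothetical names, as in the earlier snippet) checks s_(3,2,1) = (x1x2x3)·s_(2,1) at one integer point.

```python
from fractions import Fraction
from itertools import permutations

def det(m):
    n = len(m)
    total = Fraction(0)
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        prod = Fraction(1)
        for i in range(n):
            prod *= m[i][p[i]]
        total += sign * prod
    return total

def schur(lam, x):
    """s_lambda via (36.1)."""
    n = len(x)
    lam = tuple(lam) + (0,) * (n - len(lam))
    num = [[Fraction(x[j]) ** (lam[i] + n - 1 - i) for j in range(n)] for i in range(n)]
    den = [[Fraction(x[j]) ** (n - 1 - i) for j in range(n)] for i in range(n)]
    return det(num) / det(den)

x = (2, 3, 5)
# lambda = (3,2,1): lambda_n = 1 and lambda' = (2,1,0); e_3(x) = x1*x2*x3
e3 = x[0] * x[1] * x[2]
assert schur((3, 2, 1), x) == e3 ** 1 * schur((2, 1), x)  # instance of (36.6)
```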
Although we have constructed many irreducible characters of GL(n, C), it is not true that every character is a χλ for some partition λ. What we are missing are those of the form det(g)^{−m} χλ(g), where m > 0 and χλ is not divisible by det(g)^m. We may slightly expand the parametrization of the irreducible characters of GL(n, C) as follows. Let λ be a sequence of n integers, λ1 ≥ λ2 ≥ · · · ≥ λn. (We no longer assume that the λi are nonnegative; if λn < 0, such a λ is not a partition.) Then we can define a character of GL(n, C) by (36.7), since even if λ is not a partition, λ′ is still a partition. We will denote this representation by πλ^{GL(n)}, and its character by χλ.

We now have a representation πλ^{GL(n)} for each λ ∈ Z^n such that λ1 ≥ λ2 ≥ · · · ≥ λn. We will show that we have all the irreducible finite-dimensional analytic representations. We will call such a λ a dominant weight. Thus, the dominant weight λ is a partition if and only if λn ≥ 0. We call λ the highest weight of the representation πλ^{GL(n)}. This terminology is consistent with that introduced in Chap. 21.
Proposition 36.3. Let π be a finite-dimensional irreducible representation of U(n). Then π is isomorphic to the restriction of πλ^{GL(n)} for some λ.

Proof. Let G = U(n). By Schur orthogonality, it is enough to show that the characters of the πλ = πλ^{GL(n)} are dense in the space of class functions in L²(G). We refer to a symmetric polynomial in α1, . . . , αn and their inverses as a symmetric Laurent polynomial. We regard a symmetric Laurent polynomial as a class function on U(n) by applying it to the eigenvalues of g ∈ U(n). Every symmetric polynomial is a linear combination of the characters of the πλ with λ a partition, so expanding the set of λ to dominant weights gives us all symmetric Laurent polynomials. Remembering that the eigenvalues αi of g satisfy |αi| = 1, we may approximate an arbitrary L² class function by a symmetric Laurent polynomial by symmetrically truncating its Fourier expansion. □
Lemma 36.1. If f is an analytic function on GL(n, C), then f is determined by its restriction to U(n).

Proof. We show that if f|_{U(n)} = 0, then f = 0. Let g be the Lie algebra of U(n), consisting of the skew-Hermitian matrices. The exponential map exp : g −→ U(n) is surjective, so f ∘ exp is zero on g. Since f is analytic, so is f ∘ exp, and it follows that f ∘ exp is zero on g ⊕ ig, which is all of Matn(C). So f = 0 in a neighborhood of the identity in GL(n, C), and since f is analytic it vanishes identically. □
Proposition 36.4. Let π1 and π2 be analytic representations of GL(n, C). If π1 and π2 have isomorphic restrictions to U(n), they are isomorphic.

Proof. We may assume that π1 and π2 act on the same complex vector space V, and that π1(g) = π2(g) when g ∈ U(n). Applying Lemma 36.1 to the matrix coefficients of π1 and π2, it follows that π1(g) = π2(g) for all g ∈ GL(n, C). □
Theorem 36.3. Every finite-dimensional representation of the group U(n) extends uniquely to an analytic representation of GL(n, C). The irreducible complex representations of U(n), or equivalently the irreducible analytic complex representations of GL(n, C), are precisely the πλ^{GL(n)} parametrized by the dominant weights λ.

Proof. The fact that irreducible representations of U(n) extend to analytic representations follows from the fact that such a representation is a πλ^{GL(n)}, proved in Proposition 36.3. Since U(n) is compact, each representation is a direct sum of irreducibles, and it follows that each representation of U(n) extends to an analytic representation. The uniqueness of the extension follows from Proposition 36.4. The last statement now follows from Proposition 36.3. □
Proposition 36.5. Suppose that λ is a partition and l(λ) > n. Then we have sλ(x1, . . . , xn) = 0 in the ring Λ^(n).

Proof. If N = l(λ), then λ = (λ1, . . . , λN), where λN > 0 and N > n. Apply the homomorphism r_{N−1} defined by (34.9), noting that r_{N−1}(eN) = 0, since eN is divisible by xN, and r_{N−1} consists of setting xN = 0. It follows from (36.6) that r_{N−1} annihilates sλ. We may apply r_{N−2}, etc., until we reach Λ^(n), and so sλ = 0 in Λ^(n). □
Theorem 36.4. If λ is a partition of k, let ρλ denote the irreducible representation of Sk affording the character sλ constructed in Theorem 35.1. If, moreover, l(λ) ≤ n, let πλ denote the irreducible representation of GL(n, C) constructed in Theorem 36.2. Let V = C^n denote the standard module of GL(n, C). The GL(n, C) × Sk module ⊗^k V is isomorphic to ⊕_λ πλ ⊗ ρλ, where the sum is over partitions λ of k of length ≤ n.

Proof. Most of this was proved in the proof of Theorem 36.2. In particular, we saw there that each irreducible representation of Sk occurring in (36.5) occurs at most once and is paired with an irreducible representation of GL(n, C). If l(λ) ≤ n, we saw in the proof of Theorem 36.2 that ρλ does occur and is paired with πλ. The one fact that was not proved there is that the ρλ with l(λ) > n do not occur, and this follows from Proposition 36.5. □
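Theorem 36.4 implies a dimension identity: n^k = Σ_{λ⊢k, l(λ)≤n} dim(πλ)·dim(ρλ). One way to check this is with the Weyl dimension formula for GL(n), dim πλ = Π_{i<j} (λi − λj + j − i)/(j − i), and the hook length formula for Sk. Both formulas are standard but are not derived in this chapter, so the following sketch (my own) should be read as an independent consistency check.

```python
from fractions import Fraction
from math import factorial

def partitions(k, max_part=None):
    """All partitions of k as weakly decreasing tuples."""
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def dim_gl(lam, n):
    """Weyl dimension formula for the GL(n) module of highest weight lam."""
    lam = tuple(lam) + (0,) * (n - len(lam))
    d = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            d *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(d)

def dim_sym(lam):
    """Hook length formula for the S_k module indexed by lam."""
    k = sum(lam)
    hooks = 1
    for i, row in enumerate(lam):
        for j in range(row):
            arm = row - j - 1
            leg = sum(1 for r in lam[i + 1:] if r > j)
            hooks *= arm + leg + 1
    return factorial(k) // hooks

n, k = 3, 3
total = sum(dim_gl(l, n) * dim_sym(l) for l in partitions(k) if len(l) <= n)
assert total == n ** k  # 10*1 + 8*2 + 1*1 = 27
```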
37
Schur Polynomials and Sk
Frobenius [51] discovered that the characters of the symmetric group can be
computed using symmetric functions. We will explain this from our point of
view. We highly recommend Curtis [39] as an account, both historical and
mathematical, of the work of Frobenius and Schur on representation theory.
We remind the reader that the elements of Rk , as generalized characters,
are class functions on Sk . The conjugacy classes of Sk are parametrized by
the partitions as follows. Let λ = (λ1 , . . . , λr ) be a partition of k. Let Cλ
be the conjugacy class consisting of products of disjoint cycles of lengths
λ1 , λ2 , . . . . Thus, if k = 7 and λ = (3, 3, 1), then Cλ consists of the conjugates
of (123) (456) (7) = (123) (456). We say that the partition λ is the cycle type
of the permutations in the conjugacy class Cλ . Let zλ = |Sk |/|Cλ |.
The support of σ ∈ Sk is the set of x ∈ {1, 2, 3, . . . , k} such that σ(x) ≠ x.
Proposition 37.1. Let mr be the number of i such that λi = r. Then

$$z_\lambda=\prod_{r=1}^{k} r^{m_r}\,m_r!\,. \tag{37.1}$$
Proof. zλ is the order of the centralizer of a representative element g ∈ Cλ. This centralizer is easily described.

First, we consider the case where g contains only cycles of length r in its decomposition into disjoint cycles. In this case (denoting mr = m), k = rm and we may write g = c1 · · · cm, where each ci is a cycle of length r. The centralizer C_{Sk}(g) contains a normal subgroup N of order r^m generated by c1, . . . , cm. The quotient C_{Sk}(g)/N can be identified with Sm since it acts by conjugation on the m cyclic subgroups ⟨c1⟩, . . . , ⟨cm⟩. Thus, |C_{Sk}(g)| = r^m m!.

In the general case where g has cycles of different lengths, its centralizer is a direct product of groups such as the one just described. □
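Formula (37.1) can be confirmed by brute force for small k: compute zλ from the formula and compare with |Sk|/|Cλ|, counting the permutations of each cycle type directly. A sketch (my own, not from the text):

```python
from math import factorial
from itertools import permutations
from collections import Counter

def cycle_type(perm):
    """Cycle type of a permutation of {0,...,k-1}, as a decreasing tuple."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        n, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            n += 1
        lengths.append(n)
    return tuple(sorted(lengths, reverse=True))

def z(lam):
    """z_lambda = prod_r r^{m_r} m_r!, formula (37.1)."""
    out = 1
    for r, mr in Counter(lam).items():
        out *= r ** mr * factorial(mr)
    return out

k = 4
sizes = Counter(cycle_type(p) for p in permutations(range(k)))
for lam, size in sizes.items():
    assert z(lam) == factorial(k) // size  # z_lambda = |S_k| / |C_lambda|
```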
We showed in the previous chapter that the irreducible characters of Sk are also parametrized by the partitions of k: namely, to a partition μ there corresponds an irreducible character sμ. Our aim is to compute sμ(g) when g ∈ Cλ using symmetric functions.
Proposition 37.2. The character values of the irreducible representations of Sk are rational integers.

Proof. Using the Jacobi–Trudi identity (Theorem 35.1), sλ is a sum of terms of the form ±hμ for various partitions μ. Each hμ is the character induced from the trivial character of Sμ, so it has integer values. □
Let pλ (for λ a partition of k ≥ 1) be the conjugacy class indicator, which we define to be the function

$$p_\lambda(g)=\begin{cases} z_\lambda & \text{if } g\in C_\lambda,\\ 0 & \text{otherwise.}\end{cases}$$

As a special case, pk will denote the indicator of the conjugacy class of the k-cycle, corresponding to the partition λ = (k). The term "conjugacy class indicator" is justified by the following result.
Proposition 37.3. If g ∈ Cλ, then ⟨sμ, pλ⟩ = sμ(g).

Proof. We have

$$\langle s_\mu,p_\lambda\rangle=\frac{1}{|S_k|}\sum_{x\in C_\lambda} z_\lambda\,s_\mu(x).$$

The summand is constant on Cλ and equals zλ sμ(g) for any fixed representative g. The cardinality of Cλ is |Sk|/zλ, and the result follows. □
It is clear that the pλ are orthogonal. More precisely, we have

$$\langle p_\lambda,p_\mu\rangle=\begin{cases} z_\lambda & \text{if } \lambda=\mu,\\ 0 & \text{otherwise.}\end{cases} \tag{37.2}$$

This is clear since pλ is supported on the conjugacy class Cλ, which has cardinality |Sk|/zλ.

We defined pλ as a class function. We now show it is a generalized character.
Proposition 37.4. If λ is a partition of k, then pλ ∈ Rk.

Proof. The inner products ⟨pλ, sμ⟩ are rational integers by Propositions 37.2 and 37.3. By Schur orthogonality, we have pλ = Σ_μ ⟨pλ, sμ⟩ sμ, so pλ ∈ Rk. □
Proposition 37.5. If h = l(λ), so λ = (λ1, . . . , λh) and λh > 0, then

pλ = pλ1 pλ2 · · · pλh.

Proof. From the definitions, pλ1 · · · pλh is induced from the class function f on the subgroup Sλ of Sk whose value on (σ1, . . . , σh) is

$$f(\sigma_1,\ldots,\sigma_h)=\begin{cases} \lambda_1\cdots\lambda_h & \text{if each } \sigma_i \text{ is a } \lambda_i\text{-cycle},\\ 0 & \text{otherwise.}\end{cases}$$

The formula (32.15) may be used to compute this induced class function. It is clear that pλ1 · · · pλh is supported on the conjugacy class of cycle type λ, and so it is a constant multiple of pλ. We write pλ1 · · · pλh = c pλ and use a trick to show that c = 1. By Proposition 37.3, since hk = s(k) is the trivial character of Sk, we have ⟨hk, pλ⟩_{Sk} = 1. On the other hand, by Frobenius reciprocity, ⟨hk, pλ1 · · · pλh⟩_{Sk} = ⟨hk, f⟩_{Sλ}. As a class function, hk is just the constant function on Sk equal to 1, so this inner product is

$$\prod_i \langle h_{\lambda_i}, p_{\lambda_i}\rangle_{S_{\lambda_i}} = 1.$$

Therefore, c = 1. □
Proposition 37.6. We have

$$k\,h_k=\sum_{r=1}^{k} p_r\,h_{k-r}. \tag{37.3}$$
Proof. Let λ be a partition of k. Let ms be the number of λi equal to s. We will prove

⟨pr hk−r, pλ⟩ = r mr.    (37.4)

By Frobenius reciprocity, this inner product is ⟨f, pλ⟩_{Sr×Sk−r}, where f is the function on Sr × Sk−r whose value on (σ, τ), with σ ∈ Sr and τ ∈ Sk−r, is

$$f(\sigma,\tau)=\begin{cases} r & \text{if } \sigma \text{ is an } r\text{-cycle},\\ 0 & \text{otherwise.}\end{cases}$$

The value of f pλ restricted to Sr × Sk−r will be zero on (σ, τ) unless σ is an r-cycle [since f(σ, τ) must be nonzero] and τ has cycle type λ′, where λ′ is the partition obtained from λ by removing one part of length r [since pλ(σ, τ) must be nonzero]. The number of such pairs (σ, τ) is |Sr| · |Sk−r| divided by the product of the orders of the centralizers in Sr and Sk−r, respectively, of an r-cycle and of a permutation of cycle type λ′. That is,

$$\frac{|S_r|\cdot|S_{k-r}|}{r\cdot r^{m_r-1}(m_r-1)!\,\prod_{s\neq r} s^{m_s}\,m_s!}\,.$$

The value of f pλ on these conjugacy classes is r zλ. Therefore,

$$\langle f,p_\lambda\rangle_{S_r\times S_{k-r}}=\frac{1}{|S_r|\cdot|S_{k-r}|}\left(\frac{|S_r|\cdot|S_{k-r}|}{r\cdot r^{m_r-1}(m_r-1)!\,\prod_{s\neq r} s^{m_s}\,m_s!}\right) r\,z_\lambda,$$

which equals r mr. This proves (37.4).
We note that since λ is a partition of k, and λ has mr parts equal to r, we have k = Σ_{r=1}^k r mr. Therefore,

$$\left\langle \sum_{r=1}^{k} p_r\,h_{k-r},\ p_\lambda\right\rangle=\sum_{r} r\,m_r = k = \langle k\,h_k,\ p_\lambda\rangle.$$

Because this is true for every λ, we obtain (37.3). □
Let pλ = pλ1 pλ2 · · · ∈ Λ^k, where pk is the power-sum symmetric polynomial defined by (33.7); this symmetric polynomial pλ is to be distinguished from the conjugacy class indicator pλ ∈ Rk defined above.
Proposition 37.7. We have

$$k\,h_k=\sum_{r=1}^{k} p_r\,h_{k-r}. \tag{37.5}$$
Proof. We recall from (33.2) that

$$\sum_{k=0}^{\infty} h_k t^k=\prod_{i=1}^{n}(1-x_i t)^{-1},$$

which we differentiate logarithmically to obtain

$$\frac{\sum_{k=0}^{\infty} k\,h_k t^{k-1}}{\sum_{k=0}^{\infty} h_k t^k}=\sum_{i=1}^{n}\frac{d}{dt}\log(1-x_i t)^{-1}.$$

Since

$$\frac{d}{dt}\log(1-x_i t)^{-1}=\sum_{r=1}^{\infty} x_i^r t^{r-1},$$

we obtain

$$\sum_{k=1}^{\infty} k\,h_k t^{k-1}=\left(\sum_{k=0}^{\infty} h_k t^k\right)\left(\sum_{r=1}^{\infty} p_r t^{r-1}\right).$$

Equating the coefficients of t^{k−1}, the result follows. □
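The recursion (37.5) is easy to check numerically. The sketch below (my own, not from the text) evaluates hk and pr at an integer point and verifies the identity for small k.

```python
from itertools import combinations_with_replacement
from math import prod

def h(k, x):
    """Complete homogeneous symmetric polynomial h_k evaluated at x."""
    return sum(prod(c) for c in combinations_with_replacement(x, k))

def p(r, x):
    """Power sum p_r evaluated at x."""
    return sum(xi ** r for xi in x)

x = (2, 3, 5)
for k in range(1, 6):
    assert k * h(k, x) == sum(p(r, x) * h(k - r, x) for r in range(1, k + 1))
```

Since both sides are polynomials in x1, . . . , xn, agreement at sufficiently many points would in fact prove the identity in each fixed degree; here the check is only illustrative.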
Theorem 37.1. We have ch(pλ) = pλ.

Proof. We have pλ = pλ1 pλ2 · · ·. Hence, it is sufficient to show that ch(pk) = pk. This follows from the fact that they satisfy the same recursion formula (compare (37.5) with (37.3)) and that ch(hk) = hk. □
Now we may determine the irreducible characters of Sk.

Theorem 37.2. Express each symmetric polynomial pλ as a linear combination of the sμ:

$$p_\lambda=\sum_\mu c_{\lambda\mu}\,s_\mu.$$

Then the coefficient cλμ is the value of the irreducible character sμ on elements of the conjugacy class Cλ.
Proof. Since n ≥ k, ch : Rk → Λ^k is injective, and it follows that the corresponding identity pλ = Σ_μ cλμ sμ holds among the class functions in Rk. Taking the inner product of this relation with sμ, we see that

cλμ = ⟨pλ, sμ⟩.

The result follows from Proposition 37.3. □
Here is a variant of Theorem 37.2. Let

$$\Delta=\prod_{i<j}(x_i-x_j)=\det\bigl(x_j^{\,n-i}\bigr)$$

be the Vandermonde determinant, which is the denominator in (36.1).
Theorem 37.3. (Frobenius) Let μ be a partition of k of length ≤ n, and let λ be another partition of k. Let cλμ be the value of the character sμ on elements of the conjugacy class Cλ. Then cλμ is the coefficient of

$$x_1^{\mu_1+n-1} x_2^{\mu_2+n-2}\cdots x_n^{\mu_n} \tag{37.6}$$

in the polynomial pλ Δ.

Proof. By Theorem 37.2, we have pλ = Σ_μ cλμ sμ, and by (36.1) this means that

$$p_\lambda\,\Delta=\sum_\mu c_{\lambda\mu}\det\bigl(x_j^{\,\mu_i+n-i}\bigr),$$

the determinant being the determinant in the numerator in (36.1). The monomial (37.6) appears only in the μ term, and the statement follows. □
As an example of Theorem 37.2, let us verify the irreducible characters of S3. We have

$$
s_{(3)}=h_3=\sum x_i^3+\sum_{i\neq j} x_i^2 x_j+\sum_{i<j<k} x_i x_j x_k,\qquad
s_{(21)}=\sum_{i\neq j} x_i^2 x_j+2\sum_{i<j<k} x_i x_j x_k,\qquad
s_{(111)}=e_3=\sum_{i<j<k} x_i x_j x_k,
$$

and

$$
p_{(3)}=\sum x_i^3,\qquad
p_{(21)}=\sum x_i^3+\sum_{i\neq j} x_i^2 x_j,\qquad
p_{(111)}=\sum x_i^3+3\sum_{i\neq j} x_i^2 x_j+6\sum_{i<j<k} x_i x_j x_k,
$$

so

$$
p_{(111)}=s_{(3)}+s_{(111)}+2\,s_{(21)},\qquad
p_{(3)}=s_{(3)}+s_{(111)}-s_{(21)},\qquad
p_{(21)}=s_{(3)}-s_{(111)}.
$$
These coefficients are precisely the coefficients in the character table of S3 :

1 (123) (12)
s(3) 1 1 1
s(111) 1 1 −1
s(21) 2 −1 0
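Theorem 37.3 gives a mechanical way to produce the table above: cλμ is the coefficient of x^(μ1+2, μ2+1, μ3) in pλ·Δ. The following sketch (my own; it stores polynomials as dicts from exponent tuples to coefficients) recomputes the character table of S3.

```python
from collections import defaultdict

N = 3  # number of variables x_1, x_2, x_3

def mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return dict(out)

def power_sum(r):
    """p_r = x_1^r + x_2^r + x_3^r."""
    return {tuple(r if j == i else 0 for j in range(N)): 1 for i in range(N)}

def p_lambda(lam):
    out = {(0,) * N: 1}
    for part in lam:
        out = mul(out, power_sum(part))
    return out

def vandermonde():
    """Expand Delta = prod_{i<j} (x_i - x_j)."""
    out = {(0,) * N: 1}
    for i in range(N):
        for j in range(i + 1, N):
            xi = tuple(1 if t == i else 0 for t in range(N))
            xj = tuple(1 if t == j else 0 for t in range(N))
            out = mul(out, {xi: 1, xj: -1})
    return out

delta = vandermonde()
mus = [(3, 0, 0), (2, 1, 0), (1, 1, 1)]  # the mu, padded to length 3

def char_value(lam, mu):
    """c_{lambda mu} = coefficient of x^(mu_1+2, mu_2+1, mu_3) in p_lambda * Delta."""
    expo = tuple(mu[i] + N - 1 - i for i in range(N))
    return mul(p_lambda(lam), delta).get(expo, 0)

# Rows of the character table, by conjugacy class:
assert [char_value((1, 1, 1), mu) for mu in mus] == [1, 2, 1]   # class of 1
assert [char_value((2, 1), mu) for mu in mus] == [1, 0, -1]     # class of (12)
assert [char_value((3,), mu) for mu in mus] == [1, -1, 1]       # class of (123)
```

The same routine with N = 4 carries out Exercise 37.1 for S4.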
Before we leave the representation theory of the symmetric group, let us recall the involution ι of Proposition 34.3 and Theorem 35.2, which interchanges sλ with sμ, where μ = λt is the conjugate partition. It has a concrete interpretation in this context.
Lemma 37.1. Let H be a subgroup of the finite group G. Let χ be a character of H, and let ρ be a one-dimensional character of G, which we may restrict to H. The induced character (ρχ)^G equals ρ·χ^G.

Thus, it does not matter whether we multiply by ρ before or after inducing to G.

Proof. This may be proved either directly from the definition of the induced representation or by using (32.15). □
Theorem 37.4. If f is a class function on Sk, its involute ιf is the result of multiplying f by the alternating character ε of Sk.

We refrain from denoting ιf as εf because the graded ring R has a different multiplication.

Proof. Let us denote by τ : Rk −→ Rk the linear map that takes a class function f on Sk and multiplies it by ε, and assemble the τ in different degrees to a linear map of R to itself. We want to prove that τ and ι are the same. By the definition of the ek and hk, they are interchanged by τ, and by Theorem 35.2 they are interchanged by ι. Since the ek generate R as a ring, the result will follow if we check that τ is a ring homomorphism.

Applying Lemma 37.1 with G = Sk+l, H = Sk × Sl, and ρ = ε shows that multiplying the characters χ and η of Sk and Sl each by ε to obtain the characters τχ and τη, and then inducing the character τχ ⊗ τη of Sk × Sl to Sk+l, gives the same result as inducing χ ⊗ η and multiplying it by ε. This shows that τ is a ring homomorphism. □
EXERCISES

Exercise 37.1. Compute the character table of S4 using symmetric polynomials by the method of this chapter.

Exercise 37.2. Prove the identity

$$k\,e_k=\sum_{r=1}^{k}(-1)^{r-1}\,p_r\,e_{k-r}.$$

Let us say that a partition λ is a ribbon partition if its Young diagram only has entries in the first row and column. The ribbon partitions of k are of the form (k − r, 1^r) with 0 ≤ r ≤ k − 1, where the notation means the partition with one part of length k − r and r parts of length 1.

Exercise 37.3. Show that

$$p_k=\sum_{r=0}^{k-1}(-1)^r\,s_{(k-r,1^r)}.$$

[Hint: This may be proved by multiplying the denominator in (36.1) by pk and manipulating the result.]

See Exercise 40.1 for a generalization.
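For small k, the identity of Exercise 37.3 can be checked numerically with the bialternant formula: for k = 3 in three variables it reads p3 = s(3) − s(21) + s(111). A sketch (my own helpers, as in the earlier snippets):

```python
from fractions import Fraction
from itertools import permutations

def det(m):
    n = len(m)
    total = Fraction(0)
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        prod = Fraction(1)
        for i in range(n):
            prod *= m[i][p[i]]
        total += sign * prod
    return total

def schur(lam, x):
    """s_lambda via the bialternant formula (36.1)."""
    n = len(x)
    lam = tuple(lam) + (0,) * (n - len(lam))
    num = [[Fraction(x[j]) ** (lam[i] + n - 1 - i) for j in range(n)] for i in range(n)]
    den = [[Fraction(x[j]) ** (n - 1 - i) for j in range(n)] for i in range(n)]
    return det(num) / det(den)

x, k = (2, 3, 5), 3
pk = sum(xi ** k for xi in x)
# hook-shaped (ribbon) partitions (k-r, 1^r) with alternating signs
hooks = sum((-1) ** r * schur((k - r,) + (1,) * r, x) for r in range(k))
assert pk == hooks == 160
```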
Exercise 37.4. Let sλ be an irreducible character of Sk, where λ is a partition of k. Let σ be a k-cycle. Show that sλ(σ) is 0, 1 or −1. For which partitions is it nonzero?
38
The Cauchy Identity
Suppose that α1, . . . , αn and β1, . . . , βm are two sets of variables. The Cauchy identity asserts that

$$\prod_{i=1}^{n}\prod_{j=1}^{m}(1-\alpha_i\beta_j)^{-1}=\sum_\lambda s_\lambda(\alpha_1,\ldots,\alpha_n)\,s_\lambda(\beta_1,\ldots,\beta_m), \tag{38.1}$$

where the sum is over all partitions λ (of all k). The series is absolutely convergent if all |αi|, |βj| < 1. It can also be regarded as an equality of formal power series.
The general context for our discussion of the Cauchy identity will be the
Frobenius–Schur duality. For other approaches, see Exercises 26.4 and 38.4.
We recall from Chap. 34 that the characteristic map ch : R −→ Λ(N )
allows us to interpret a character (or class function) on the symmetric group
Sk as a symmetric polynomial in N variables that is homogeneous of degree k.
Here is a simple fact we will need. Notations are as in Chap. 37.
Proposition 38.1. Let k be a nonnegative integer. Then we have the following identity in the ring Λ^(N) of symmetric polynomials:

$$\sum_{\lambda\text{ a partition of }k} z_\lambda^{-1}\,p_\lambda=h_k.$$

Proof. In view of Theorem 37.1, it is sufficient to show in R that

$$\sum_{\lambda\text{ a partition of }k} z_\lambda^{-1}\,p_\lambda=h_k.$$

We consider both sides as functions on Sk. By definition, pλ is the function supported on the single conjugacy class Cλ, with value zλ on that class; summing over all conjugacy classes, Σ_λ zλ^{−1} pλ is the constant function equal to 1 on Sk, that is, hk. □
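Proposition 38.1 can be verified numerically. The sketch below (my own, not from the text) checks Σ_{λ⊢k} zλ^{−1} pλ(x) = hk(x) at an integer point for k ≤ 5, using exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial, prod
from itertools import combinations_with_replacement
from collections import Counter

def partitions(k, max_part=None):
    """All partitions of k as weakly decreasing tuples."""
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def z(lam):
    """z_lambda from (37.1)."""
    return prod(r ** mr * factorial(mr) for r, mr in Counter(lam).items())

def p_lam(lam, x):
    """p_lambda = product of power sums, evaluated at x."""
    return prod(sum(xi ** r for xi in x) for r in lam)

def h(k, x):
    """h_k evaluated at x."""
    return sum(prod(c) for c in combinations_with_replacement(x, k))

x = (2, 3, 5)
for k in range(1, 6):
    lhs = sum(Fraction(p_lam(lam, x), z(lam)) for lam in partitions(k))
    assert lhs == h(k, x)
```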
Next we will consider symmetric polynomials in two sets of variables, α1, . . . , αn and β1, . . . , βm. Consider a polynomial f in α1, . . . , αn and β1, . . . , βm that is (for fixed β) symmetric in α1, . . . , αn and homogeneous of degree k, and also (for fixed α) symmetric in β1, . . . , βm and homogeneous of degree l. Then we may transfer this by the Frobenius–Schur duality to Rk ⊗ Rl. In other words, we may find an element ξ in Rk ⊗ Rl such that (ch(n) ⊗ ch(m))(ξ) is the given symmetric polynomial in two sets of variables.

Proposition 38.2. Let k be a nonnegative integer. Then

$$\sum_{\lambda\text{ a partition of }k} s_\lambda(\alpha)\,s_\lambda(\beta)=\sum_{\lambda\text{ a partition of }k} z_\lambda^{-1}\,p_\lambda(\alpha)\,p_\lambda(\beta). \tag{38.2}$$
Proof. Both sides are polynomials in the αi and βj that are symmetric and homogeneous of degree k in either set of variables. Use the Frobenius–Schur duality to transfer the function on the right-hand side to a function on Sk × Sk. In view of Theorem 37.1, this is the function

$$\Delta(\sigma,\tau)=\sum_{\lambda\text{ a partition of }k} z_\lambda^{-1}\,p_\lambda(\sigma)\,p_\lambda(\tau),$$

which has the value zλ at (σ, τ) ∈ Sk × Sk if σ and τ are both in the conjugacy class Cλ, and is zero if σ and τ are not conjugate. This function may be characterized as follows: if f is a class function, then

$$\frac{1}{k!}\sum_{\tau\in S_k}\Delta(\sigma,\tau)\,f(\tau)=f(\sigma).$$

Indeed, if σ is in the conjugacy class Cλ, there are |Cλ| values of τ, namely the conjugates of σ, for which there is a contribution of zλ f(τ) = zλ f(σ), and since |Cλ| zλ = k!, the statement follows. Thus, Δ is the reproducing kernel for class functions. It is characterized by this property, together with the fact that (with τ fixed) Δ(σ, τ) is constant on conjugacy classes of σ, and similarly for τ with σ fixed. Now

$$\sum_{\lambda\text{ a partition of }k} s_\lambda(\sigma)\,s_\lambda(\tau)$$

is also a class function in σ and τ separately, and it has the same reproducing property, as a consequence of Schur orthogonality. Hence these are equal. We see that

$$\sum_{\lambda\text{ a partition of }k} s_\lambda(\sigma)\,s_\lambda(\tau)=\sum_{\lambda\text{ a partition of }k} z_\lambda^{-1}\,p_\lambda(\sigma)\,p_\lambda(\tau),$$

and applying ch ⊗ ch we obtain (38.2). □
Theorem 38.1. (Cauchy) Suppose α1, . . . , αn and β1, . . . , βm are complex numbers of absolute value < 1. Then

$$\prod_{i=1}^{n}\prod_{j=1}^{m}(1-\alpha_i\beta_j)^{-1}=\sum_\lambda s_\lambda(\alpha_1,\ldots,\alpha_n)\,s_\lambda(\beta_1,\ldots,\beta_m). \tag{38.3}$$

The sum is over all partitions λ.
Proof. Using (33.2) in the nm variables αiβj, the left-hand side equals

$$\sum_{k=0}^{\infty} h_k(\alpha_i\beta_j),$$

so it is sufficient to show

$$\sum_{\lambda\text{ a partition of }k} s_\lambda(\alpha_1,\ldots,\alpha_n)\,s_\lambda(\beta_1,\ldots,\beta_m)=h_k(\alpha_i\beta_j). \tag{38.4}$$

By (38.2) this equals

$$\sum_{\lambda\text{ a partition of }k} z_\lambda^{-1}\,p_\lambda(\alpha)\,p_\lambda(\beta).$$

We now make the observation that pk(αiβj), which is the kth power sum symmetric polynomial in the nm variables αiβj, equals pk(α)pk(β), and so pλ(αiβj) = pλ(α)pλ(β). The statement now follows from Proposition 38.1. □
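The key identity (38.4) can also be checked numerically: compare Σ_{λ⊢k} sλ(α)sλ(β) with hk evaluated at the nm products αiβj, interpreting sλ(α) as zero when l(λ) > n. A sketch (my own helpers):

```python
from fractions import Fraction
from math import prod
from itertools import permutations, combinations_with_replacement

def det(m):
    n = len(m)
    total = Fraction(0)
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        total += sign * prod((m[i][p[i]] for i in range(n)), start=Fraction(1))
    return total

def schur(lam, x):
    """s_lambda via (36.1); zero when l(lambda) exceeds the number of variables."""
    n = len(x)
    if len(lam) > n:
        return Fraction(0)
    lam = tuple(lam) + (0,) * (n - len(lam))
    num = [[Fraction(x[j]) ** (lam[i] + n - 1 - i) for j in range(n)] for i in range(n)]
    den = [[Fraction(x[j]) ** (n - 1 - i) for j in range(n)] for i in range(n)]
    return det(num) / det(den)

def partitions(k, max_part=None):
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

alpha, beta = (2, 3), (5, 7, 11)
products = [a * b for a in alpha for b in beta]
for k in range(1, 5):
    lhs = sum(schur(l, alpha) * schur(l, beta) for l in partitions(k))
    rhs = sum(prod(c) for c in combinations_with_replacement(products, k))  # h_k of the products
    assert lhs == rhs
```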
The Cauchy identity may be interpreted as describing the decomposition of the symmetric algebra over the tensor product representation of GLn × GLm, as we will now explain. Let G be a group, and let π : G −→ GL(Ω) be a representation on some vector space Ω. Let g ∈ G, and let α1, . . . , αN be the eigenvalues of π(g). Then hk(α) = hk(α1, . . . , αN) and ek(α) = ek(α1, . . . , αN) are the traces of π(g) on the kth symmetric and exterior powers ∨^k Ω and ∧^k Ω, respectively. Therefore,

$$\sum_{k=0}^{\infty} h_k(\alpha)=\prod_{i=1}^{N}(1-\alpha_i)^{-1} \quad\text{and}\quad \sum_{k=0}^{N} e_k(\alpha)=\prod_{i=1}^{N}(1+\alpha_i)$$

may be regarded as the characters of g on the symmetric and exterior algebras.

The symmetric algebra is infinite-dimensional, so strictly speaking the trace of an endomorphism only has a provisional meaning. Indeed, the first series is only convergent if |αi| < 1, but there are several ways of handling this. One may try to choose g so that its eigenvalues are < 1, or one may simply regard the series as formal. Or, assuming no αi = 1, one may regard the series as obtained from

$$\sum_{k=0}^{\infty} h_k(\alpha)\,t^k=\prod_{i=1}^{N}(1-t\alpha_i)^{-1}$$

by analytic continuation in t.
Proposition 38.3. Let G = GLn(C) × GLm(C) act on the tensor product Ω = C^n ⊗ C^m of the standard modules of GLn(C) and GLm(C). Then the symmetric algebra

$$\bigvee \Omega \cong \bigoplus_\lambda \pi_\lambda^{GL_n} \otimes \pi_\lambda^{GL_m} \tag{38.5}$$

as G-modules, where the summation is over all partitions λ of length ≤ min(m, n).

Proof. If g has eigenvalues αi and h has eigenvalues βj, then (g, h) has eigenvalues αiβj on Ω, hence has trace hk(αiβj) on ∨^k Ω. By the Cauchy identity in the form (38.4), this equals Σ sλ(α) sλ(β), where the sum is over partitions of k. [If the length of the partition λ is > n, we interpret sλ(α) as zero, and similarly for sλ(β) if l(λ) > m.] Combining the contributions over all k, the statement follows. □
There is a dual Cauchy identity.

Theorem 38.2. Suppose α1, . . . , αn and β1, . . . , βm are complex numbers of absolute value < 1. Then

$$\prod_{i=1}^{n}\prod_{j=1}^{m}(1+\alpha_i\beta_j)=\sum_\lambda s_\lambda(\alpha_1,\ldots,\alpha_n)\,s_{\lambda^t}(\beta_1,\ldots,\beta_m). \tag{38.6}$$

Note that now each partition λ is paired with its conjugate partition λ^t. This may be regarded as a decomposition of the exterior algebra on Matn(C)∗.
Proof. Let α1, . . . , αn be fixed complex numbers, and let Λ^(m) be the ring of symmetric polynomials in β1, . . . , βm with integer coefficients. We recall from Theorems 34.3 and 35.2 that Λ has an involution ι that interchanges sλ and s_{λ^t}. We have to be careful how we use ι because it does not induce an involution of Λ^(m). Indeed, it is possible that in Λ^(m) one of sλ and s_{λ^t} is zero and the other is not, so no involution exists that simply interchanges them.

We write the Cauchy identity in the form

$$\prod_{i=1}^{n}\left(\sum_{k=0}^{\infty}\alpha_i^k\,h_k(\beta_1,\ldots,\beta_m)\right)=\sum_\lambda s_\lambda(\alpha_1,\ldots,\alpha_n)\,s_\lambda(\beta_1,\ldots,\beta_m).$$

This is true for all m, and therefore we may write

$$\prod_{i=1}^{n}\left(\sum_{k=0}^{\infty}\alpha_i^k\,h_k\right)=\sum_\lambda s_\lambda(\alpha_1,\ldots,\alpha_n)\,s_\lambda,$$

where the hk on the left and the second occurrence of sλ on the right are regarded as elements of the ring Λ, which is the inverse limit (34.10) of the rings Λ^(m), while the αi and sλ(α1, . . . , αn) are regarded as complex numbers. To this identity we may apply ι and obtain

$$\prod_{i=1}^{n}\left(\sum_{k=0}^{\infty}\alpha_i^k\,e_k\right)=\sum_\lambda s_\lambda(\alpha_1,\ldots,\alpha_n)\,s_{\lambda^t},$$

and now we specialize from Λ to Λ^(m) and obtain (38.6). □
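The dual identity (38.6) can be tested the same way. For n = m = 2 only partitions fitting in a 2 × 2 box contribute (both λ and λ^t must have length ≤ 2), so the sum is finite. A sketch (my own helpers, as before):

```python
from fractions import Fraction
from itertools import permutations

def det(m):
    n = len(m)
    total = Fraction(0)
    for p in permutations(range(n)):
        sign = (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        prod = Fraction(1)
        for i in range(n):
            prod *= m[i][p[i]]
        total += sign * prod
    return total

def schur(lam, x):
    n = len(x)
    if len(lam) > n:
        return Fraction(0)
    lam = tuple(lam) + (0,) * (n - len(lam))
    num = [[Fraction(x[j]) ** (lam[i] + n - 1 - i) for j in range(n)] for i in range(n)]
    den = [[Fraction(x[j]) ** (n - 1 - i) for j in range(n)] for i in range(n)]
    return det(num) / det(den)

def conjugate(lam):
    """Conjugate (transposed) partition lambda^t."""
    if not lam:
        return ()
    return tuple(sum(1 for part in lam if part > j) for j in range(lam[0]))

alpha, beta = (2, 3), (5, 7)
lhs = 1
for a in alpha:
    for b in beta:
        lhs *= 1 + a * b
# all partitions fitting in the 2x2 box
box = [(), (1,), (2,), (1, 1), (2, 1), (2, 2)]
rhs = sum(schur(l, alpha) * schur(conjugate(l), beta) for l in box)
assert lhs == rhs == 58080
```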
In this chapter and the next, we will give some applications of the Cauchy identity. First some preliminaries. If λ, μ, ν are partitions, there is defined a nonnegative integer c^λ_{μν} called the Littlewood–Richardson coefficient. It is (by definition) zero unless |λ| = |μ| + |ν|, where we recall that |λ| = Σ_i λ_i is the sum of the parts, that is, λ is a partition of |λ|. There is a combinatorial description of c^λ_{μν}, but we will not describe it (except in special cases). For this Littlewood–Richardson rule see Macdonald [124] or Stanley [153].
The next theorem, asserting the equivalence of three definitions of c^λ_{μν}, shows that the Littlewood–Richardson coefficients have three distinct representation-theoretic interpretations. They have other interpretations too. For example, they describe the structure constants in the cohomology ring of Grassmannians with respect to the basis of cohomology classes corresponding
to Schubert cycles. If λ = (λ1, λ2, . . . , λn) is a partition of length ≤ n, we will denote by π_λ^{GL(n)} the irreducible representation of GL(n) parametrized by λ.
Let G be a group and H a subgroup. A rule describing how irreducible representations of G decompose into irreducibles when restricted to H is called
a branching rule. The tensor product rule describing how the tensor product π ⊗ π′ of irreducibles π, π′ of H decomposes into irreducibles of H may be thought of as a branching rule. Indeed, π ⊗ π′ extends to an irreducible representation of H × H, so the tensor product rule is really a branching rule for H embedded in G = H × H diagonally.
All three definitions of the Littlewood–Richardson coefficients may be
characterized as branching rules. To specify a branching rule, we need to
specify an embedding of a group H in a larger group G. The embeddings
H → G in the three branching rules are as follows. The first is the embedding of S_k × S_l −→ S_{k+l} that we worked with in Chap. 34. The second is the diagonal embedding GL(n, C) −→ GL(n, C) × GL(n, C). The third is the Levi embedding of GL(p, C) × GL(q, C) −→ GL(p + q, C) as follows:
$$(g,h)\longmapsto\begin{pmatrix}g&\\&h\end{pmatrix},\qquad g\in\mathrm{GL}_p(\mathbb{C}),\ h\in\mathrm{GL}_q(\mathbb{C}).\tag{38.7}$$
As usual, we are only interested in analytic representations of GL(n, C), which
are the same as representations of U(n), so we could equally well work with the
embeddings U(n) −→ U(n) × U(n) and the Levi embedding U(p) × U(q) −→
U(p + q).
Remarkably, these three branching rules involve the same coefficients cλμν .
This is the content of the next result.
Theorem 38.3. Let λ, μ, ν be partitions such that |λ| = |μ| + |ν|. Then the
following three definitions of cλμν are equivalent. We will denote k = |μ| and
l = |ν|.

(i) Let ρ_λ be the irreducible representation of S_{k+l} with character s_λ. Then c^λ_{μν} is the multiplicity of ρ_μ ⊗ ρ_ν in the restriction of ρ_λ to S_k × S_l.
(ii) Let n ≥ |λ|. Then c^λ_{μν} is the multiplicity of π_λ^{GL(n)} in the decomposition of the representation π_μ^{GL(n)} ⊗ π_ν^{GL(n)} of GL(n) into irreducibles.
(iii) Let p ≥ k and q ≥ l. Then c^λ_{μν} is the multiplicity of π_μ^{GL(p)} ⊗ π_ν^{GL(q)} in the restriction of π_λ^{GL(p+q)} to GL(p, C) × GL(q, C).

Proof. We note that (i) can be expressed as the identity
$$s_\mu s_\nu=\sum_{\lambda}c^\lambda_{\mu\nu}\,s_\lambda,$$

since taking the inner product of the left-hand side with s_λ and using Frobenius reciprocity gives the coefficient of ρ_μ ⊗ ρ_ν in the restriction of ρ_λ from S_{k+l} to S_k × S_l. On the other hand, (ii) can be expressed as the identity

sμ (x1 , . . . , xn ) sν (x1 , . . . , xn ) = cλμν sλ (x1 , . . . , xn )
λ

in the ring Λ^(n) of symmetric polynomials. Indeed, substituting for the x_i the eigenvalues of g ∈ GL(n, C), the Schur polynomial s_λ becomes the character of π_λ^{GL(n)}, and the left-hand side becomes the character of π_μ^{GL(n)} ⊗ π_ν^{GL(n)}. Thus the equivalence of (i) and (ii) follows from Proposition 34.4.
As for the equivalence of (ii) and (iii), we give an argument based on the Cauchy identity. Comparing the characters of π_λ^{GL(p+q)} and
$$\bigoplus_{\mu,\nu}c^\lambda_{\mu\nu}\,\pi_\mu^{GL(p)}\otimes\pi_\nu^{GL(q)}$$

on the matrix (38.7), we see that (iii) is equivalent to the identity
$$s_\lambda(\alpha_1,\ldots,\alpha_p,\beta_1,\ldots,\beta_q)=\sum_{\mu,\nu}c^\lambda_{\mu\nu}\,s_\mu(\alpha_1,\ldots,\alpha_p)\,s_\nu(\beta_1,\ldots,\beta_q),$$

where α1, . . . , αp are the eigenvalues of g and β1, . . . , βq are the eigenvalues of h. In a more succinct notation, we write the left-hand side s_λ(α, β), so what we need to prove is
$$s_\lambda(\alpha,\beta)=\sum_{\mu,\nu}c^\lambda_{\mu\nu}\,s_\mu(\alpha)\,s_\nu(\beta).\tag{38.8}$$

Let γ1, . . . , γn be arbitrary complex numbers. By the Cauchy identity,
$$\sum_{\lambda}s_\lambda(\alpha,\beta)\,s_\lambda(\gamma)=\prod_{i,k}(1-\alpha_i\gamma_k)^{-1}\prod_{j,k}(1-\beta_j\gamma_k)^{-1}.$$

Also by the Cauchy identity this equals
$$\left(\sum_{\mu}s_\mu(\alpha)\,s_\mu(\gamma)\right)\left(\sum_{\nu}s_\nu(\beta)\,s_\nu(\gamma)\right)=\sum_{\mu,\nu}s_\mu(\alpha)\,s_\nu(\beta)\sum_{\lambda}c^\lambda_{\mu\nu}\,s_\lambda(\gamma).$$

Since the functions sλ (γ) are linearly independent as λ varies, we may compare
the coefficients of sλ (γ) and obtain (38.8).
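The branching identity (38.8) can also be checked concretely (this sketch is an added illustration, not part of the original text). For λ = (2, 1) every nonzero Littlewood–Richardson coefficient c^λ_{μν} equals 1, a fact that follows from the Pieri rule; a numerical test with the bialternant formula for Schur polynomials then confirms the decomposition:

```python
import numpy as np

def schur(lam, x):
    """s_lambda(x) by the bialternant formula; 0 if the partition has too many parts."""
    n = len(x)
    lam = [part for part in lam if part > 0]
    if len(lam) > n:
        return 0.0
    lam = lam + [0] * (n - len(lam))
    num = np.linalg.det(np.array([[xi ** (lam[j] + n - 1 - j) for j in range(n)] for xi in x]))
    den = np.linalg.det(np.array([[xi ** (n - 1 - j) for j in range(n)] for xi in x]))
    return num / den

rng = np.random.default_rng(1)
alpha = list(rng.uniform(0.2, 0.9, size=2))   # stand-ins for the eigenvalues of g
beta = list(rng.uniform(0.2, 0.9, size=2))    # stand-ins for the eigenvalues of h

# For lambda = (2,1), the nonzero c^lambda_{mu,nu} all equal 1, for these pairs:
pairs = [((2, 1), ()), ((), (2, 1)), ((2,), (1,)), ((1,), (2,)),
         ((1, 1), (1,)), ((1,), (1, 1))]
lhs = schur((2, 1), alpha + beta)
rhs = sum(schur(mu, alpha) * schur(nu, beta) for mu, nu in pairs)
assert abs(lhs - rhs) < 1e-8
```

The list of pairs encodes the skew decompositions s_{(2,1)/(1)} = s_{(2)} + s_{(1,1)}, s_{(2,1)/(2)} = s_{(2,1)/(1,1)} = s_{(1)}.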

It is worth pondering the mechanism behind the proof that (ii) is equivalent
to (iii). We will reconsider it after some preliminaries.
We begin with the notion of a correspondence in the sense of Howe, who wrote many papers on this subject; see Howe [75, 77]. A correspondence is a bijection between a set of irreducible representations of a group G and a set of irreducible representations of another group H. The relevant examples arise in the following manner.
Let G̃ be a group with a representation Θ, and let G and H be subgroups of G̃ that centralize each other. Thus, we have a homomorphism G × H −→ G̃. (Often this homomorphism is injective, so G × H is a subgroup of G̃, but we do not require this.) We assume that the representation Θ of G̃ has the following property: when Θ is restricted to G × H, it becomes a direct sum ⊕_i π_i ⊗ π_i′, where the π_i are irreducible representations of G, and the π_i′ are irreducible representations of H. We assume that each π_i ⊗ π_i′ occurs with multiplicity at most one, and moreover, that there are no repetitions among the representations π_i and none among the π_i′. (This definition is adequate if G and H are compact but might need to be generalized slightly if they are not.) If this condition is satisfied, we say the representation Θ induces a correspondence for G and H. The correspondence is the bijection π_i ←→ π_i′. Here are some examples.
examples.
• Let G = S_k, H = GL(n, C), and G̃ = G × H. The representation Θ is the action on ⊗^k C^n in Theorem 36.4. That theorem implies that Θ induces a correspondence. Indeed, by Theorem 36.4 the correspondence is the bijection ρ_λ ←→ π_λ^{GL(n)}, as λ runs through the partitions of k that have length ≤ n. Thus, Frobenius–Schur duality is an example of a correspondence.
• Consider G = GL(n, C), H = GL(m, C) acting on Ω = C^n ⊗ C^m as above. Since (g ⊗ I_m)(I_n ⊗ h) = (g ⊗ h) = (I_n ⊗ h)(g ⊗ I_m), the actions of G and H commute. Let Θ be the action on the symmetric algebra of Ω. It is actually a representation of G̃ = GL(Ω). As we have already explained, when restricted to G × H, the Cauchy identity implies the decomposition (38.5), so Θ induces a correspondence. This is the bijection π_λ^{GL(n)} ←→ π_λ^{GL(m)} as λ runs through all partitions of length ≤ min(m, n). This equivalence is sometimes referred to as GL(n) × GL(m)-duality.
• Howe conjectured [73], and it was eventually proved, that if G and H are
reductive subgroups of Sp(2N, F ), where F is a local field (including R
or C), then the Weil (oscillator) representation induces a correspondence.
In some cases one of the groups of the correspondence must be replaced
by a covering group. In the most important case, G = Sp(2n) and H = O(m), where nm = N, so the correspondence relates representations of symplectic groups (or their double covers) to representations of an orthogonal group. This phenomenon is known as Howe duality. It is closely related to the theta liftings in the theory of automorphic forms.
Here the Weil representation is a projective representation of Sp(2N, F )
with a construction that is similar to the construction of a projective rep-
resentation of the orthogonal groups in Chap. 31. In place of the Clifford
algebra one uses a Heisenberg group or the symplectic Clifford algebra,
often called the Weyl algebra.
Now let us consider the following abstract situation. Let G and G′ be groups. Let H and H′ be subgroups of G and G′, respectively. We will assume that G and G′ are subgroups of a larger group G̃ such that H′ is the centralizer of G, and H is the centralizer of G′. Now let us assume that G̃ has a representation Θ that induces correspondences between G and H′, and between G′ and H.
We summarize this situation by a "see-saw" diagram:

    G          G′
    |    ×    |                    (38.9)
    H          H′

The vertical lines are inclusions, and the diagonal lines (joining G with H′, and G′ with H) are correspondences. Now we can show that the pairs G, H and G′, H′ have the same branching rule (except inverted with respect to inclusion).

Proposition 38.4. Let there be given a see-saw (38.9). Let π_i^G and π_i^{H′} be corresponding representations of G and H′, and let π_j^{G′} and π_j^H be corresponding representations of G′ and H. Then the multiplicity of π_j^H in π_i^G equals the multiplicity of π_i^{H′} in π_j^{G′}.

Proof. We may express the correspondences as follows:
$$\Theta|_{G\times H'}=\bigoplus_{i\in I}\pi_i^G\otimes\pi_i^{H'},\qquad\Theta|_{H\times G'}=\bigoplus_{j\in J}\pi_j^H\otimes\pi_j^{G'},\tag{38.10}$$

for suitable indexing sets I and J. We first observe that if σ is an irreducible of H that occurs in any π_i^G|_H, then σ = π_j^H for some j. Indeed, it follows from the second decomposition that the π_j^H are precisely the irreducibles of H that occur in the restriction of Θ to H, from which this statement is clear.
Therefore, we may find integers c(i, j) such that
$$\pi_i^G|_H=\bigoplus_{j\in J}c(i,j)\,\pi_j^H\tag{38.11}$$

and similarly
$$\pi_j^{G'}|_{H'}=\bigoplus_{i\in I}d(i,j)\,\pi_i^{H'}.$$

What we must prove is that c(i, j) = d(i, j). Now combining the first equation in (38.10) with (38.11), we get
$$\Theta|_{H\times H'}=\bigoplus_{i,j}c(i,j)\,\pi_j^H\otimes\pi_i^{H'}$$

and similarly
$$\Theta|_{H\times H'}=\bigoplus_{i,j}d(i,j)\,\pi_j^H\otimes\pi_i^{H'}.$$

Comparing, the statement follows.




Now let us reconsider the equivalence of (ii) and (iii) in Theorem 38.3. This may be understood as a reflection of the following see-saw:

GL(n) × GL(n) GL(p + q)

GL(n) GL(p) × GL(q)

The left vertical line is the diagonal embedding GL(n) −→ GL(n) × GL(n),
and the right vertical line is the Levi embedding GL(p)×GL(q) −→ GL(p+q).
The ambient group is GL(Ω), where Ω = C^n ⊗ C^{p+q}, acting on the symmetric algebra of Ω. More specifically, with H = GL(n) and G′ = GL(p + q), H × G′ acts on C^n ⊗ C^{p+q} in the obvious way; with G = GL(n) × GL(n) and H′ = GL(p) × GL(q) we use the isomorphism
$$\mathbb{C}^n\otimes\mathbb{C}^{p+q}\cong(\mathbb{C}^n\otimes\mathbb{C}^p)\oplus(\mathbb{C}^n\otimes\mathbb{C}^q)$$

with the first GL(n) and GL(p) acting on the first component, and the second
GL(n) and GL(q) on the second component. Proposition 38.4 asserts that the
two branching rules are the same, which is the equivalence of (ii) and (iii) in
Theorem 38.3.
The paper of Howe, Tan, and Willenbring [79] gives many more examples
of see-saws applied to branching rules. Kudla [113] showed that many con-
structions in the theory of automorphic forms could be explained by see-saws.
Branching rules are important for many problems and are the subject
of considerable literature. Branching rules for the orthogonal and symplectic

groups are discussed in Goodman and Wallach [56], Chap. 8. King [101] is a
useful survey of branching rules for classical groups. Many branching rules are
programmed into Sage.

Exercises
Exercise 38.1. Let n and m be integers. Define a bijection between partitions λ = (λ1, . . . , λn) with λ1 ≤ m and partitions μ = (μ1, . . . , μm) with μ1 ≤ n as follows. The shapes of the partitions λ and μ must sit as complementary pieces in an n × m box, with the λi being the lengths of the rows of one piece, and the μj being the lengths of the columns of the other. For example, suppose n = 3 and m = 5; we could have λ = (4, 2, 1) and μ = (3, 2, 2, 1). As usual, a partition may be padded with zeros, so we identify this μ with (3, 2, 2, 1, 0), and the diagram is as follows:

[Diagram: a 3 × 5 box cut into two complementary pieces, the rows of one labeled λ and the columns of the other labeled μ.]

(i) Show that
$$(y_1\cdots y_m)^n\,s_\lambda\!\left(y_1^{-1},\ldots,y_m^{-1}\right)=s_\mu(y_1,\ldots,y_m).$$
(ii) Prove that
$$\prod_{i=1}^{n}\prod_{j=1}^{m}(x_i+y_j)=\sum s_\lambda(x_1,\ldots,x_n)\,s_\mu(y_1,\ldots,y_m),$$

where the sum is over λ and μ related as explained above.

Exercise 38.2. Give another proof of Proposition 38.1 as follows. Show that
$$\prod_i(1-\alpha_i t)^{-1}=\sum_\lambda z_\lambda^{-1}\,p_\lambda(\alpha_1,\ldots,\alpha_n)\,t^{|\lambda|}\tag{38.12}$$

by writing the left-hand side as
$$\exp\left(\sum_k\sum_i\frac{\alpha_i^k}{k}\,t^k\right)=\exp\left(\sum_k\frac{p_k(\alpha_1,\ldots,\alpha_n)}{k}\,t^k\right),$$

expanding and making use of (37.1).
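The degree-k coefficient identity behind (38.12), namely h_k = Σ_{λ⊢k} z_λ^{-1} p_λ, can be tested exactly with rational arithmetic. This check is an added illustration, not part of the exercise; z_λ = ∏_j j^{m_j} m_j!, where m_j is the number of parts of λ equal to j:

```python
from fractions import Fraction
from math import factorial, prod
import itertools

def partitions(k, maxpart=None):
    """All partitions of k as weakly decreasing tuples."""
    maxpart = k if maxpart is None else maxpart
    if k == 0:
        yield ()
        return
    for first in range(min(k, maxpart), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def z(lam):
    # z_lambda = product over distinct parts j of j^{m_j} * m_j!
    return prod(j ** lam.count(j) * factorial(lam.count(j)) for j in set(lam))

def p(lam, x):
    # power-sum symmetric polynomial p_lambda = product of p_{lam_i}
    return prod(sum(Fraction(xi) ** part for xi in x) for part in lam)

def h(k, x):
    # complete homogeneous symmetric polynomial h_k: sum of all degree-k monomials
    return sum(prod(Fraction(x[i]) for i in c)
               for c in itertools.combinations_with_replacement(range(len(x)), k))

x = [2, 3, 5]          # any exact rational test values will do
for k in range(5):
    lhs = h(k, x)      # coefficient of t^k in prod_i (1 - x_i t)^{-1}
    rhs = sum(p(lam, x) / z(lam) for lam in partitions(k))
    assert lhs == rhs
```

For k = 3 this reduces to the classical Newton-type identity h_3 = p_3/3 + p_2 p_1/2 + p_1^3/6.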

The next two exercises lead to another proof of the Cauchy identity.

Exercise 38.3. Let G be any compact group. Let τ be an antiautomorphism of


G, that is, a continuous map that satisfies τ (gh) = τ (h)τ (g). Assume that τ (g) is
conjugate to g for all g ∈ G. For example, we could take G = U(n) and τ to be the
transpose map.

(i) Let (π, V) be an irreducible representation of G. Let π′ : G −→ GL(V) be the map π′(g) = π(τ(g)^{−1}). Show that (π′, V) is isomorphic to the contragredient representation (π̂, V̂).
(ii) Let G × G act on the ring of matrix coefficients M_π of π by
$$\big((g,h)f\big)(x)=f\big(\tau(g)\,x\,h\big).$$
Show that this representation is isomorphic to π ⊗ π as (G × G)-modules. (Hint: Use Exercise 2.4.)

Let us call a function on GL(n, C) regular if, as a function of g = (g_{ij}), it is a polynomial in the g_{ij} and det(g)^{−1}. A regular function on Mat_n(C) is a polynomial. Thus, it is a regular function on GL(n, C), but one that is a polynomial in just the coordinate functions g_{ij}, not involving the inverse determinant. The rings O(Mat_n(C)) and O(GL(n, C)) of regular functions on Mat_n(C) and GL(n, C) are just the affine rings of algebraic geometry. The fact that the regular functions on Mat_n are a subring of the regular functions on GL(n) reflects the fact that Mat_n contains GL(n) as an open subset.

Exercise 38.4. (i) Show that every matrix coefficient of U(n) extends uniquely to a regular function on GL(n, C), and deduce that the ring of matrix coefficients of U(n) may be identified with O(GL(n, C)). Let GL(n, C) × GL(n, C) act on functions on either GL(n, C) or Mat_n(C) by
$$\big((g_1,g_2)f\big)(h)=f({}^t g_1\,h\,g_2).$$

(ii) Show that
$$\mathcal{O}\big(\mathrm{GL}(n,\mathbb{C})\big)\cong\bigoplus_{\lambda\ \text{a dominant weight}}\pi_\lambda\otimes\pi_\lambda$$
as GL(n, C) × GL(n, C)-modules.


(iii) Show that the component π_λ ⊗ π_λ in this decomposition extends to a space of regular functions on Mat_n if and only if λ is a partition, and deduce that
$$\mathcal{O}\big(\mathrm{Mat}_n(\mathbb{C})\big)\cong\bigoplus_{\lambda\ \text{a partition}}\pi_\lambda\otimes\pi_\lambda.$$

Explain why this proves (38.5) when m = n.


(iv) Explain why the Cauchy identity when m = n implies the general case, and
deduce the Cauchy identity from (iii).

Exercise 38.5. (i) Let α = (α1, α2, . . .), β = (β1, β2, . . .), γ = (γ1, γ2, . . .), δ = (δ1, δ2, . . .) be four sets of variables. Using (38.8), evaluate Σ_ν s_ν(α, β) s_ν(γ, δ) in two different ways and obtain the identity
$$\sum_{\nu}c^\nu_{\lambda\mu}\,c^\nu_{\theta\tau}=\sum_{\phi,\psi,\xi,\eta}c^\lambda_{\phi\psi}\,c^\mu_{\xi\eta}\,c^\theta_{\phi\xi}\,c^\tau_{\psi\eta}.\tag{38.13}$$

(ii) Show that (38.13) implies the Hopf axiom, that is, the commutativity of the
diagram in Exercise 35.4.

Let α = (α1, α2, . . .), β = (β1, β2, . . .) be two sets of variables. Define the supersymmetric Schur polynomial (Littlewood [120], pages 66–70; Berele and Remmel [15]; Macdonald [123]; Bump and Gamburd [29]) by the formula
$$s_\lambda(\alpha/\beta)=\sum_{\mu,\nu}c^\lambda_{\mu\nu}\,s_\mu(\alpha)\,s_{\nu^t}(\beta),$$
where ν^t is the conjugate partition.

Exercise 38.6. Prove the supersymmetric Cauchy identity
$$\sum_\nu s_\nu(\alpha/\beta)\,s_\nu(\gamma/\delta)=\prod_{i,j}(1-\alpha_i\gamma_j)^{-1}\prod_{i,j}(1+\alpha_i\delta_j)\prod_{i,j}(1+\beta_i\gamma_j)\prod_{i,j}(1-\beta_i\delta_j)^{-1}.$$
(Hint: Use the involution.)


39
Random Matrix Theory

In this chapter, we will work not with GL(n, C) but with its compact subgroup U(n). As in the previous chapters, we will consider elements of R_k as generalized characters on S_k. If f ∈ R_k, then ch^{(n)}(f) ∈ Λ_k^{(n)} is a symmetric polynomial in n variables, homogeneous of weight k, and the function ψ : U(n) −→ C defined by (33.6) is obtained by applying ch^{(n)}(f) to the eigenvalues of g ∈ U(n). We will denote this function Ch^{(n)}(f). Thus, Ch^{(n)} maps the additive group of generalized characters on S_k to the additive group of generalized characters on U(n). It extends by linearity to a map from the Hilbert space of class functions on S_k to the Hilbert space of class functions on U(n).

Proposition 39.1. Let f be a class function on S_k. Write f = Σ_λ c_λ s_λ, where the sum is over the partitions of k. Then
$$|f|^2=\sum_{\lambda}|c_\lambda|^2,\qquad|\mathrm{Ch}^{(n)}(f)|^2=\sum_{l(\lambda)\leqslant n}|c_\lambda|^2.$$

Proof. The s_λ are orthonormal by Schur orthogonality, so |f|² = Σ |c_λ|². By Theorem 36.2, the Ch^{(n)}(s_λ) are distinct irreducible characters when λ runs through the partitions of k with length ≤ n, while, by Proposition 36.5, Ch^{(n)}(s_λ) = 0 if l(λ) > n. Therefore, we may write
$$\mathrm{Ch}^{(n)}(f)=\sum_{l(\lambda)\leqslant n}c_\lambda\,\mathrm{Ch}^{(n)}(s_\lambda),$$
and the Ch^{(n)}(s_λ) in this decomposition are orthonormal by Schur orthogonality on U(n). Thus, |Ch^{(n)}(f)|² = Σ_{l(λ)≤n} |c_λ|².


Theorem 39.1. The map Ch^{(n)} is a contraction if n < k and an isometry if n ≥ k. In other words, if f is a class function on S_k, then
$$|\mathrm{Ch}^{(n)}(f)|\leqslant|f|,$$
with equality when n ≥ k.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 407


DOI 10.1007/978-1-4614-8024-2 39, © Springer Science+Business Media New York 2013

Proof. This follows immediately from Proposition 39.1 since if n ≥ k every partition of k has length ≤ n.


Theorem 39.1 is a powerful tool for transferring computations from one


group to another, in this case from the unitary group to the symmetric group.
The underlying principle is that of a correspondence introduced in the last
chapter. This is not unlike Proposition 38.4, where we showed how correspon-
dences may be used to transfer a branching rule from one pair of groups to
another.
We will illustrate Theorem 39.1 with a striking result of Diaconis and Shahshahani [42], who showed by this method that the traces of large random unitary matrices are normally distributed. We will give a second example of using a correspondence to transfer a calculation from one group to another below in the theorem of Keating and Snaith, where we will employ GL_n × GL_m duality in a similar way.
A measure is called a probability measure if its total volume is 1. Suppose that X and Y are topological spaces and that X is endowed with a Borel probability measure dμ_X. Let f : X −→ Y be a continuous function. We can push the measure dμ_X forward to a probability measure dμ_Y on Y, defined by
$$\int_Y\phi(y)\,d\mu_Y(y)=\int_X\phi\big(f(x)\big)\,d\mu_X(x)$$

for measurable functions φ on Y. Concretely, this measure gives the distribution of the values f(x) when x ∈ X is a random variable.
For example, the trace of a Haar-random unitary matrix g ∈ U(n) is distributed with a measure dμ_n on C satisfying
$$\int_{U(n)}\phi\big(\mathrm{tr}(g)\big)\,dg=\int_{\mathbb{C}}\phi(z)\,d\mu_n(z).\tag{39.1}$$

We say that a sequence ν_n of Borel probability measures on a space X converges weakly to a measure ν if ∫_X φ(x) dν_n(x) −→ ∫_X φ(x) dν(x) for all bounded continuous functions φ on X. We will see that the measures μ_n converge weakly as n −→ ∞ to a fixed Gaussian measure
$$d\mu(z)=\frac{1}{\pi}\,e^{-(x^2+y^2)}\,dx\wedge dy,\qquad z=x+iy.\tag{39.2}$$

Let us consider how surprising this is! As n varies, the number of eigenvalues increases, and one might expect the standard deviation of the traces to increase with n. This is what would happen were the eigenvalues of a random unitary matrix uncorrelated. That it converges to a fixed Gaussian measure means that the eigenvalues of a random unitary matrix are quite evenly distributed around the circle.
Intuitively, the eigenvalues "repel" and tend not to lie too close together. This is reflected in the property of the trace—that its distribution does not spread out as n is increased. This can be regarded as a reflection of (17.3). Because of the factor ∏ |t_i − t_j|², matrices with close eigenvalues have small Haar measure in U(n). Dyson [48] gave the following analogy. Consider the eigenvalues of a Haar-random matrix distributed on the unit circle to be like the distribution of charged particles in a Coulomb gas. At a certain temperature (T = 1/2), this model gives the right distribution. The exercises introduce Dyson's "pair correlation" function that quantifies the tendency of the eigenvalues to repel at close ranges. Figure 39.1 shows the probability density
$$R_2(1,\theta)=n^2-\frac{\sin^2(n\theta/2)}{\sin^2(\theta/2)}\tag{39.3}$$
that there are eigenvalues at both e^{it} and e^{i(t+θ)}, as a function of θ (for n = 10). (Consult the exercises for the definition of R_m and a proof that R_2 is given by (39.3).) We can see from this figure that the probability is small when θ is small, but is essentially independent of θ if θ is moderate.

100

R2(1, θ)

θ π

Fig. 39.1. The pair correlation R2 (1, θ) when n = 10

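A two-line computation (added here as an illustration, not part of the text) makes the repulsion in (39.3) visible: R_2(1, θ) is tiny for small θ and close to n² once θ is moderate:

```python
import math

def R2(n, theta):
    """The pair correlation density (39.3) for Haar-random g in U(n)."""
    return n ** 2 - (math.sin(n * theta / 2) / math.sin(theta / 2)) ** 2

n = 10
assert R2(n, 0.01) < 1.0                       # near-coincident angles: strong repulsion
assert abs(R2(n, math.pi / 2) - n ** 2) < 15   # moderate separation: roughly n^2 = 100
```

At θ = 0 the ratio sin²(nθ/2)/sin²(θ/2) tends to n², so R_2(1, 0) = 0: two eigenvalues almost never coincide.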
Weak convergence requires that for any continuous bounded function φ,
$$\lim_{n\to\infty}\int_{\mathbb{C}}\phi(z)\,d\mu_n(z)=\int_{\mathbb{C}}\phi(z)\,d\mu(z),$$

or in other words,
$$\lim_{n\to\infty}\int_{U(n)}\phi\big(\mathrm{tr}(g)\big)\,dg=\int_{\mathbb{C}}\phi(z)\,d\mu(z).\tag{39.4}$$

Remarkably, if φ(z) is a polynomial in z and z̄, this identity is exactly true for sufficiently large n, depending only on the degree of the polynomial! Of course, a polynomial is not a bounded continuous function, but we will deduce weak convergence from this fact about polynomial functions.
Proposition 39.2. Let k, l ≥ 0. Then
$$\int_{U(n)}\mathrm{tr}(g)^k\,\overline{\mathrm{tr}(g)}^{\,l}\,dg=0\qquad\text{if }k\neq l,$$

while
$$\int_{U(n)}|\mathrm{tr}(g)|^{2k}\,dg\leqslant k!\,,$$
with equality when n ≥ k.


Proof. If k ≠ l, then the variable change g −→ e^{iθ}g multiplies the left-hand side by e^{i(k−l)θ} ≠ 1 for θ in general position, so the integral vanishes.
Assume that k = l. We show that
$$\int_{U(n)}|\mathrm{tr}(g)|^{2k}\,dg=k!\tag{39.5}$$

provided k ≤ n. Note that if V = C^n is the standard module for U(n), then tr(g)^k is the trace of g acting on ⊗^k V as in (36.4). As in (34.6), we may decompose
$$\bigotimes^k V=\bigoplus_{\lambda}d_\lambda\,V_\lambda,$$

where d_λ is the degree of the irreducible representation of S_k with character s_λ, and V_λ is an irreducible module of U(n) by Theorem 36.2. The L²-norm of f(g) = tr(g)^k can be computed by Proposition 39.1, and we have
$$\int_{U(n)}|\mathrm{tr}(g)|^{2k}\,dg=|f|^2=\sum_{\lambda}d_\lambda^2.$$

Of course, the sum of the squares of the degrees of the irreducible representations of S_k is |S_k| = k!, and (39.5) is proved. If k > n, then the same method can be used to evaluate the integral, and we obtain Σ_λ d_λ², where now the sum is restricted to partitions of length ≤ n. This is < k!.

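Proposition 39.2 invites a Monte Carlo experiment (an added illustration, not from the text). A Haar-random unitary matrix can be sampled by QR-decomposing a complex Ginibre matrix and correcting the phases of R's diagonal (Mezzadri's recipe); the exact moments E|tr(g)|² = 1! and E|tr(g)|⁴ = 2! for n ≥ 2 then show up empirically:

```python
import numpy as np

def haar_unitary(n, rng):
    """Haar-random element of U(n): QR of a complex Ginibre matrix, with
    the phases of R's diagonal divided out (Mezzadri's recipe)."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))   # scales column j of q by the phase of r_jj

rng = np.random.default_rng(0)
n, N = 5, 20000
traces = np.array([np.trace(haar_unitary(n, rng)) for _ in range(N)])
m2 = np.mean(np.abs(traces) ** 2)   # should be near 1! = 1 since n >= 1
m4 = np.mean(np.abs(traces) ** 4)   # should be near 2! = 2 since n >= 2
assert abs(m2 - 1.0) < 0.1
assert abs(m4 - 2.0) < 0.3
```

The tolerances are generous: with 20,000 samples the standard error of the fourth-moment estimate is about 0.03.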
Theorem 39.2. Suppose that φ(z) is a polynomial in z and z̄ of degree ≤ 2n. Then
$$\int_{U(n)}\phi\big(\mathrm{tr}(g)\big)\,dg=\int_{\mathbb{C}}\phi(z)\,d\mu(z),\tag{39.6}$$
where dμ is the measure (39.2).


Proof. It is sufficient to prove this if φ(z) = z k z l . If deg(φ)  2n, then k + l 
2n so either k = l or both k, l  n, and in either case Proposition 39.2 implies
that the left-hand side equals 0 if k = l and k! if k = l. What we must therefore
show is  
k! if k = l ,
z k z l dμ(z) =
C 0 if k = l.
The measure dμ(z) is rotationally symmetric, and if k = l, then replacing z
by eiθ z multiplies the left-hand side by eiθ(k−l) , so the integral is zero in that
case. Assume therefore that φ(x + iy) = |z|2k . Then using polar coordinates
(so z = x + iy = reiθ ) the integral equals
39 Random Matrix Theory 411
  ∞  ∞
2
+y 2 )
|z|2k dμ(z) = 1
π (x2 + y 2 )k e−(x dx dy =
C −∞ −∞
 2π  ∞  ∞
1
π r2k e−2r r dr dθ = 2 r2k+1 e−2r dr = Γ (k + 1) = k!
0 0 0
and the theorem is proved.


This establishes (39.4) when φ is a polynomial—indeed, the sequence becomes stationary for large n. However, it does not establish weak convergence. To this end, we will study the Fourier transforms of the measures μ_n and μ.
The Fourier transform of a probability measure ν on R^N is called its characteristic function. Concretely,
$$\hat{\nu}(y_1,\ldots,y_N)=\int_{\mathbb{R}^N}e^{i(x_1y_1+\cdots+x_Ny_N)}\,d\nu(x),\qquad x=(x_1,\ldots,x_N).$$

Theorem 39.3. Let ν_1, ν_2, ν_3, . . . and ν be probability measures on R^N. Suppose that the characteristic functions ν̂_i(y_1, . . . , y_N) −→ ν̂(y_1, . . . , y_N) pointwise for all (y_1, . . . , y_N) ∈ R^N. Then the measures ν_i converge weakly to ν.

Proof omitted. A proof may be found in Billingsley [18], Theorem 26.3 (when
N = 1) and Sect. 28 (for general N ). The precise statement we need is on
p. 383 before Theorem 29.4.


In the case at hand, we wish to compare probability measures on C = R², and it will be most convenient to define the Fourier transform as a function of w = u + iv ∈ C. Let
$$\hat{\mu}(w)=\int_{\mathbb{C}}e^{i(z\bar{w}+\bar{z}w)}\,d\mu(z)$$
and similarly for the μ̂_n.

Proposition 39.3. The functions μ̂n converge uniformly on compact subsets


of C to μ̂.

Proof. The function μ̂ is easily computed. As the Fourier transform of a Gaussian distribution, μ̂ is also Gaussian, and in fact μ̂(w) = e^{−|w|²}. We write this as a power series:
$$\hat{\mu}(w)=F(|w|),\qquad F(r)=\sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\,r^{2k}.$$
The radius of convergence of this power series is ∞.



We have
$$\hat{\mu}_n(w)=\int_{\mathbb{C}}\left(\sum_{k=0}^{\infty}\sum_{l=0}^{\infty}\frac{i^{k+l}\,z^k\bar{w}^k\,\bar{z}^l w^l}{k!\,l!}\right)d\mu_n(z)=\sum_{k=0}^{\infty}\sum_{l=0}^{\infty}\frac{i^{k+l}}{k!\,l!}\left(\int_{\mathbb{C}}z^k\bar{z}^l\,d\mu_n(z)\right)\bar{w}^k w^l.$$

The interchange of the summation and the integration is justified since the measure dμ_n is compactly supported, and the series is uniformly convergent when z is restricted to a compact set. By Proposition 39.2 and the definition (39.1) of μ_n, the integral inside brackets vanishes unless k = l, so
$$\hat{\mu}_n(w)=F_n(|w|),\qquad F_n(r)=\sum_{k=0}^{\infty}a_{k,n}\,\frac{(-1)^k}{k!}\,r^{2k},\qquad a_{k,n}=\frac{1}{k!}\int_{\mathbb{C}}|z|^{2k}\,d\mu_n(z).$$

By Proposition 39.2, the coefficients a_{k,n} satisfy 0 ≤ a_{k,n} ≤ 1, with a_{k,n} = 1 when k ≤ n. We have
$$|F(r)-F_n(r)|=\left|\sum_{k=n}^{\infty}(1-a_{k,n})\,\frac{(-1)^k}{k!}\,r^{2k}\right|\leqslant\sum_{k=n}^{\infty}\frac{r^{2k}}{k!},$$

which converges to 0 uniformly as n −→ ∞ when r is restricted to a compact set.

Corollary 39.1. The measures μn converge weakly to μ.
Proof. This follows immediately from the criterion of Theorem 39.3.

Since we have not proved Theorem 39.3, let us point out that we can immediately prove (39.4) for a fairly big set of test functions φ. For example, if φ is the Fourier transform of an integrable function ψ with compact support, we can write
$$\int_{U(n)}\phi\big(\mathrm{tr}(g)\big)\,dg=\int_{\mathbb{C}}\phi(z)\,d\mu_n(z)=\int_{\mathbb{C}}\psi(w)\,\hat{\mu}_n(w)\,du\wedge dv,\qquad w=u+iv,$$
by the Plancherel formula and, since we have proved that μ̂_n −→ μ̂ uniformly on compact sets, (39.4) is clear for such φ.
Diaconis and Shahshahani [42] proved a much stronger statement to the effect that the quantities
$$\mathrm{tr}(g),\ \mathrm{tr}(g^2),\ \ldots,\ \mathrm{tr}(g^r),$$
where g is a Haar-random element of U(n), are distributed like r independent Gaussian random variables. Strikingly, what the proof requires is the full representation theory of the symmetric group in the form of Theorem 37.1!

Proposition 39.4. We have
$$\int_{U(n)}|\mathrm{tr}(g)|^{2k_1}\,|\mathrm{tr}(g^2)|^{2k_2}\cdots|\mathrm{tr}(g^r)|^{2k_r}\,dg\leqslant\prod_{j=1}^{r}j^{k_j}\,k_j!\tag{39.7}$$
with equality provided k_1 + 2k_2 + · · · + rk_r ≤ n.

Proof. Let k = k_1 + 2k_2 + · · · + rk_r, and let λ be the partition of k containing k_1 entries equal to 1, k_2 entries equal to 2, and so forth. By Theorem 37.1, we have Ch^{(n)}(p_λ) = ψ_{p_λ}. This is the function
$$g\longmapsto\mathrm{tr}(g)^{k_1}\,\mathrm{tr}(g^2)^{k_2}\cdots\mathrm{tr}(g^r)^{k_r}$$
since p_λ = p_{λ_1} · · · p_{λ_r}, and applying p_{λ_i} to the eigenvalues of g gives tr(g^{λ_i}). The left-hand side of (39.7) is thus |Ch^{(n)}(p_λ)|², and if k ≤ n, then by Theorem 39.1 we may compute this norm in S_k. It equals
$$\frac{1}{|S_k|}\sum_{\sigma\in S_k}|p_\lambda(\sigma)|^2=z_\lambda$$
by (37.2). This is the right-hand side of (39.7). If k > n, the proof is identical except that Theorem 39.1 only gives an inequality in (39.7).


Theorem 39.4. (Diaconis and Shahshahani) The joint probability distribution of tr(g), tr(g²), . . . , tr(g^r) near (z_1, . . . , z_r) ∈ C^r is a measure weakly converging to
$$\prod_{j=1}^{r}\frac{1}{\pi j}\,e^{-|z_j|^2/j}\,dx_j\wedge dy_j.\tag{39.8}$$

Thus, tr(g), tr(g²), . . . , tr(g^r) are distributed like a sequence of independent Gaussian random variables.

Proof. Indeed, this follows along the lines of Corollary 39.1, using the fact that the moments of the measure (39.8),
$$\int_{\mathbb{C}^r}|z_1|^{2k_1}\,|z_2|^{2k_2}\cdots|z_r|^{2k_r}\prod_{j=1}^{r}\frac{1}{\pi j}\,e^{-|z_j|^2/j}\,dx_j\wedge dy_j=\prod_{j=1}^{r}j^{k_j}\,k_j!\,,$$
agree with (39.7).




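The variance structure in Theorem 39.4 can likewise be probed empirically (an added sketch, not from the text): taking k_j = 1 in (39.7) predicts E|tr(g^j)|² = j whenever n ≥ j, so the traces of powers spread out linearly in j but not at all in n:

```python
import numpy as np

def haar_unitary(n, rng):
    """Haar-random element of U(n) via phase-corrected QR of a Ginibre matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))

rng = np.random.default_rng(1)
n, N = 8, 20000
samples = [haar_unitary(n, rng) for _ in range(N)]
for j in (1, 2, 3):
    tj = np.array([np.trace(np.linalg.matrix_power(g, j)) for g in samples])
    second_moment = np.mean(np.abs(tj) ** 2)   # (39.7) with k_j = 1 predicts j
    assert abs(second_moment - j) < 0.3
```

This matches the Gaussian limit (39.8), under which |z_j|² has mean j.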
By an ensemble we mean a topological space with elements that are matrices, given a probability measure. Random matrix theory is concerned with the statistical distribution of the eigenvalues of the matrices in the ensemble, particularly local statistical facts such as the spacing of these eigenvalues.
The original focus of random matrix theory was not on unitary matrices
but on random Hermitian matrices. The reason for this had to do with the

origin of the theory in nuclear physics. In quantum mechanics, an observable


quantity such as energy or angular momentum is associated with a Hermitian
operator acting on a Hilbert space with elements that correspond to possible
states of a physical system. An eigenvector corresponds to a state in which the
observable has a definite value, which equals the eigenvalue of the operator
on that eigenvector. The Hermitian operator corresponding to the energy
level of the physical system (a typical observable) is called the Hamiltonian.
A Hamiltonian operator is typically positive definite.
It was observed by Wigner and his collaborators that although the spectra
of atomic nuclei (emitting or absorbing neutrons) were hopeless to calculate
from first principles, the spacing of the eigenvalues still obeyed statistical laws
that could be studied. To this end, random Hermitian operators were studied,
first by Wigner, Gaudin, Mehta, and Dyson. The book of Mehta [128] is the
standard treatise on the subject from the point of view taken by this physics-
inspired literature. The papers of Dyson [49] also greatly repay study. The
more recent books of Anderson, Guionnet, and Zeitouni [7], Deift [40], Katz
and Sarnak [95] and the handbook [4] are all strongly recommended.
Although the Hilbert space on which the Hermitian operator corresponding
to an observable acts is infinite-dimensional, one may truncate the operator,
replacing the Hilbert space with a finite-dimensional invariant subspace. The
operator is then realized as a Hermitian matrix.
To study the local properties of the eigenvalues, one seeks to give the real
vector space of Hermitian matrices a probability measure which is invariant
under the action of the unitary group by conjugation, since one is interested in
the eigenvalues, and these are preserved under conjugation. The usual way is
to assume that the matrix entries are independent random variables with nor-
mal (i.e., Gaussian) distributions. This probability space is called the Gaussian
unitary ensemble (GUE). Two other ensembles model physical systems with
time reversal symmetry. For these, the type of symmetry depends on whether
reversing the direction of time multiplies the operator by ±1. The ensemble
that models systems with a Hamiltonian that is unchanged under time-reversal
consists of real symmetric matrices and is called the Gaussian orthogonal
ensemble (GOE). The ensemble modeling systems with a Hamiltonian that
is antisymmetric under time-reversal can be represented by quaternionic Her-
mitian matrices and is called the Gaussian symplectic ensemble (GSE). See
Dyson [48] and Mehta [128] for further information about this point.
The space of positive definite Hermitian matrices is an open subset of
the space of all Hermitian matrices, and this space is isomorphic to the
Type IV symmetric space GL(n, C)/U(n), under the map which associates
with the coset gU(n) in the symmetric space the Hermitian matrix g t ḡ. Simi-
larly the positive-definite parts of the GOE and GSE are GL(n, R)/O(n) and
GL(n, H)/Sp(2n) with associated probability measures.
Dyson [48] shifted focus from the Gaussian ensembles to the circular ensembles that are the compact duals of the symmetric spaces GL(n, C)/U(n), GL(n, R)/O(n) and GL(n, H)/Sp(2n). For example, by Theorem 28.1, the dual of GL(n, C)/U(n) is just U(n). Haar measure makes this symmetric space
into the circular unitary ensemble (CUE). The ensemble is called circular
because the eigenvalues of a unitary matrix lie on the unit circle instead of
the real line. It is the CUE that we have studied in this chapter. Note that
in the GUE, we cannot use Haar measure to make GL(n, C)/U(n) into a
measure space, since we want a probability measure on each ensemble, but
the noncompact group GL(n, C) has infinite volume. This is an important
advantage of the CUE over the GUE. And as Dyson observed, as far as the local statistics of random matrices are concerned—for example, with matters of spacing of eigenvalues—the circular ensembles are faithful mirrors of the
Gaussian ones. The circular orthogonal and symplectic ensembles (COE and
CSE) are similarly the measure spaces U(n)/O(n) and U(2n)/Sp(2n) with
their unique invariant probability measures.
In recent years, random matrix theory has found a new application in the
study of the zeros of the Riemann zeta function and similar arithmetic data.
The observation that the distribution of the zeros of the Riemann zeta function
should have a local distribution similar to that of the eigenvalues of a random
Hermitian matrix in the GUE originated in a conversation between Dyson
and Montgomery, and was confirmed numerically by Odlyzko and others; see
Rubinstein [139]. See Katz and Sarnak [94] and Conrey [38] for surveys of this
field, and Keating and Snaith [99] for a typical paper from the extensive literature.
The paper of Keating and Snaith is important because it marked a paradigm
shift away from the study of the spacing of the zeros of ζ(s) to the distribution
of the values of $\zeta(\tfrac{1}{2} + it)$, which are, in the new paradigm, related to the values
of the characteristic polynomial of a random matrix.
Theorem 39.5. (Keating and Snaith) Let k be a nonnegative integer. Then
$$\int_{U(n)} |\det(g - I)|^{2k}\,dg \;=\; \prod_{j=0}^{n-1} \frac{j!\,(j+2k)!}{(j+k)!^{2}}. \qquad (39.9)$$
This was proved by Keating and Snaith using the Selberg integral. How-
ever an alternative proof was found by Alex Gamburd (see Bump and Gam-
burd [29]) which we will give here. This proof is similar to that of Theorem 39.4
in that we will transfer the computation from U(n) to another group. Whereas
in Theorem 39.4 we used the Frobenius–Schur duality to transfer the com-
putation to the symmetric group Sk , here we will use the Cauchy identity
to transfer the computation from U(n) to U(2k). The two procedures are
extremely analogous and closely related to each other.

Proof. Let $t_1, \ldots, t_k$ and $u_1, \ldots, u_k$ be complex numbers. We will show that
$$\int_{U(n)} \prod_{i=1}^{k} \det(I + t_i g)\,\det(u_i I + g^{-1})\,dg \;=\; s_{(n^k)}(t_1, \ldots, t_k, u_1, \ldots, u_k). \qquad (39.10)$$
416 39 Random Matrix Theory

Here $(n^k) = (n, \ldots, n, 0, \ldots, 0)$ is the partition with $k$ nonzero parts, each
equal to $n$. Taking $t_i = u_i = 1$ gives $\int_{U(n)} |\det(I+g)|^{2k}\,dg$, because $|\det(g)| = 1$,
so $\det(g)^{-1} = \overline{\det(g)}$. This equals the left-hand side of (39.9) because if $g$ is
a unitary matrix, so is $-g$. Now $s_{(n^k)}(1, \ldots, 1)$ with $2k$ entries equal to 1 is the
dimension of an irreducible representation of U(2k), which may be evaluated
using the Weyl dimension formula (Theorem 22.10). We leave it to the reader
to check that this dimension equals the right-hand side of (39.9).
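This last check is easy to automate. The Python sketch below (the function names are ours) evaluates $s_{(n^k)}(1,\ldots,1)$ as the Weyl dimension of the U(2k)-representation with highest weight $(n^k, 0^k)$ and compares it with the product on the right-hand side of (39.9), using exact rational arithmetic:

```python
from fractions import Fraction
from math import factorial

def weyl_dim(la):
    # Weyl dimension formula for U(N): prod over i < j of
    # (la_i - la_j + j - i) / (j - i), la a weakly decreasing integer sequence.
    N = len(la)
    d = Fraction(1)
    for i in range(N):
        for j in range(i + 1, N):
            d *= Fraction(la[i] - la[j] + (j - i), j - i)
    return int(d)

def keating_snaith_rhs(n, k):
    # The product over 0 <= j <= n-1 of j!(j+2k)!/(j+k)!^2 in (39.9).
    p = Fraction(1)
    for j in range(n):
        p *= Fraction(factorial(j) * factorial(j + 2 * k), factorial(j + k) ** 2)
    return int(p)

# dimension of the U(2k)-representation with highest weight (n^k, 0^k)
for n in range(1, 6):
    for k in range(1, 5):
        assert weyl_dim([n] * k + [0] * k) == keating_snaith_rhs(n, k)
```

The exact fractions avoid any rounding in the ratios of the dimension formula; the loop confirms the agreement for small n and k.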
Thus, consider the left-hand side of (39.10). If the eigenvalues of $g$ are
$\alpha_1, \ldots, \alpha_n$, by the dual Cauchy identity the integrand equals
$$\prod_{i=1}^{k} \prod_{j=1}^{n} (1 + t_i\alpha_j)(1 + u_i\alpha_j)\,\det(g)^{-k} \;=\; \sum_{\lambda} s_\lambda(\alpha_1, \ldots, \alpha_n)\, s_{\lambda^t}(t_1, \ldots, t_k, u_1, \ldots, u_k)\,\det(g)^{-k}.$$
Now each $s_\lambda(\alpha_1, \ldots, \alpha_n)$ is the character of an irreducible representation
of U(n) if it is nonzero, that is, if the length of $\lambda$ is $\leqslant n$. In particular
$\det(g)^k = s_{(k^n)}(\alpha_1, \ldots, \alpha_n)$. So by Schur orthogonality, integrating over $g$
picks off the contribution of a single term, with $\lambda = (k^n)$ and $\lambda^t = (n^k)$. This
proves (39.10). ∎
Exercises
Let $m \leqslant n$. The m-level correlation function of Dyson [48] for unitary statistics
is a function $R_m$ on $\mathbb{T}^m$ defined by the requirement that if $f$ is a test function on
$\mathbb{T}^m$ (piecewise continuous, let us say) then
$$\int_{\mathbb{T}^m} R_m(t_1, \ldots, t_m)\, f(t_1, \ldots, t_m)\, dt_1 \cdots dt_m \;=\; \int_{U(n)} \sum{}^{*} f(t_{i_1}, \ldots, t_{i_m})\, dg, \qquad (39.11)$$
where the sum $\sum^*$ is over all m-tuples $(i_1, \ldots, i_m)$ of distinct integers between
1 and n, and $t_1, \ldots, t_n$ are the eigenvalues of $g$. Intuitively, this function gives the
probability density that $t_1, \ldots, t_m$ occur among the eigenvalues of $g \in U(n)$.
The purpose of the exercises is to prove (and generalize) Dyson's formula
$$R_m(t_1, \ldots, t_m) = \det\bigl(s_n(\theta_j - \theta_k)\bigr)_{j,k}, \qquad t_j = e^{i\theta_j}, \qquad (39.12)$$
where
$$s_n(\theta) = \begin{cases} \dfrac{\sin(n\theta/2)}{\sin(\theta/2)} & \text{if } \theta \neq 0, \\ n & \text{if } \theta = 0. \end{cases}$$
As a special case, when m = 2, the graph of the "pair correlation" $R_2(1, \theta)$ may be
found in Fig. 39.1. This shows graphically the repulsion of the eigenvalues – as we can see,
the probability of two eigenvalues being close together is small, but for moderate distances
there is no correlation.
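Formula (39.12) admits a quick numerical sanity check: since $R_m$ counts ordered m-tuples of distinct eigenvalues, its total integral over $\mathbb{T}^m$ (with normalized measure) must equal $n!/(n-m)!$. For m = 2 the determinant reduces to $n^2 - s_n(\theta_1 - \theta_2)^2$, since $s_n$ is even. A Python sketch (function names ours):

```python
from math import sin, pi

def s_n(theta, n):
    # s_n(theta) = sin(n*theta/2)/sin(theta/2), with s_n(0) = n
    if abs(sin(theta / 2)) < 1e-12:
        return float(n)
    return sin(n * theta / 2) / sin(theta / 2)

def pair_correlation(theta, n):
    # R_2 depends only on theta = theta_1 - theta_2; the 2x2 determinant
    # det(s_n(theta_j - theta_k)) equals n^2 - s_n(theta)^2
    return n ** 2 - s_n(theta, n) ** 2

def total_mass(n, grid=20000):
    # midpoint rule for (1/2pi) * integral of R_2 over one angle;
    # this should equal n(n-1) = n!/(n-2)!
    h = 2 * pi / grid
    return sum(pair_correlation((j + 0.5) * h, n) for j in range(grid)) / grid

for n in (2, 3, 5):
    assert abs(total_mass(n) - n * (n - 1)) < 1e-6
```

The midpoint rule is exact (up to roundoff) here because the integrand is a trigonometric polynomial of low degree.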
Exercise 39.1. If $m = n$, prove that
$$R_n(t_1, \ldots, t_n) = \det(A\,{}^t\bar{A}), \qquad A = \begin{pmatrix} 1 & t_1 & \cdots & t_1^{n-1} \\ 1 & t_2 & \cdots & t_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & t_n & \cdots & t_n^{n-1} \end{pmatrix}.$$
[Since $n = m$, the matrix $A$ is square and we have $\det(A\,{}^t\bar{A}) = |\det(A)|^2$. Reduce
to the case where the test function $f$ is symmetric. Then use the Weyl integration
formula.]

Exercise 39.2. Show that
$$R_m(x_1, \ldots, x_m) = \frac{1}{(n-m)!} \int_{\mathbb{T}^{n-m}} R_n(x_1, \ldots, x_n)\, dx_{m+1} \cdots dx_n.$$

Exercise 39.3. Prove that when $m \leqslant n$ we have
$$R_m(t_1, \ldots, t_m) = \det(A\,{}^t\bar{A}), \qquad A = \begin{pmatrix} 1 & t_1 & \cdots & t_1^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & t_m & \cdots & t_m^{n-1} \end{pmatrix}.$$
Observe that if $m < n$, then $A$ is not square, so we may no longer factor the
determinant. Deduce Dyson's formula (39.12).

Exercise 39.4. (Bump, Diaconis and Keller [30]) Generalize Dyson's formula
as follows. Let $\lambda$ be a partition of length $\leqslant n$. The measure $|\chi_\lambda(g)|^2\,dg$ is a probability
measure, and we may define an m-level correlation function for it exactly as in
(39.11). Denote this as $R_{m,\lambda}$. Prove that
$$R_{m,\lambda}(t_1, \ldots, t_m) = \det(A\,{}^t\bar{A}), \qquad A = \begin{pmatrix} t_1^{-\lambda_1} & t_1^{1-\lambda_2} & \cdots & t_1^{-\lambda_n+n-1} \\ t_2^{-\lambda_1} & t_2^{1-\lambda_2} & \cdots & t_2^{-\lambda_n+n-1} \\ \vdots & \vdots & & \vdots \\ t_m^{-\lambda_1} & t_m^{1-\lambda_2} & \cdots & t_m^{-\lambda_n+n-1} \end{pmatrix}.$$

Exercise 39.5. Let us consider the distribution of the traces of $g \in SU(2)$. In this
case the traces are real valued, so we must modify (39.1) to read
$$\int_{SU(2)} \phi\bigl(\operatorname{tr}(g)\bigr)\, dg = \int_{\mathbb{R}} \phi(x)\, d\mu(x).$$
Since $|\operatorname{tr}(g)| \leqslant 2$, and since the map $g \mapsto -g$ takes $SU(2)$ to itself, the measure $d\mu$
will be even and supported between $-2$ and $2$. Show that
$$\frac{1}{2\pi} \int_{-2}^{2} \sqrt{4 - x^2}\, x^{2k}\, dx = \frac{1}{k+1}\binom{2k}{k}$$
and deduce that
$$d\mu(x) = \frac{1}{2\pi}\sqrt{4 - x^2}\, dx.$$
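The moment computation in Exercise 39.5 is easy to check numerically. The sketch below (function names ours) compares the semicircle moments with the Catalan numbers $\frac{1}{k+1}\binom{2k}{k}$ by a midpoint rule:

```python
from math import sqrt, pi, comb

def semicircle_moment(k, grid=200000):
    # midpoint rule for (1/2pi) * int_{-2}^{2} sqrt(4 - x^2) x^(2k) dx
    h = 4.0 / grid
    total = 0.0
    for j in range(grid):
        x = -2.0 + (j + 0.5) * h
        total += sqrt(4.0 - x * x) * x ** (2 * k)
    return total * h / (2 * pi)

def catalan(k):
    # the k-th Catalan number, (1/(k+1)) * binom(2k, k)
    return comb(2 * k, k) // (k + 1)

for k in range(5):
    assert abs(semicircle_moment(k) - catalan(k)) < 1e-3
```

The first few moments are 1, 1, 2, 5, 14, the familiar Catalan sequence, which is consistent with the measure $d\mu(x) = \frac{1}{2\pi}\sqrt{4-x^2}\,dx$.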
40
Symmetric Group Branching Rules
and Tableaux

If G ⊃ H are groups, a branching rule is an explicit description of how
representations of G decompose into irreducibles when restricted to H.
By Frobenius reciprocity, this is equivalent to asking how representations of H
decompose into irreducibles on induction to G. In this chapter, we will obtain
the branching rule for the symmetric groups.
Suppose that λ is a partition of k and that μ is a partition of l with $k \geqslant l$.
We write $\mu \subseteq \lambda$ or $\lambda \supseteq \mu$ if the Young diagram of μ is contained in the Young
diagram of λ. Concretely, this means that $\mu_i \leqslant \lambda_i$ for all i. If $\lambda \neq \mu$, we write
$\mu \subset \lambda$ or $\lambda \supset \mu$.
We will denote by ρλ the irreducible representation of Sk parametrized by
λ. We follow the notation of the last chapter in regarding elements of Rk as
generalized characters of Sk . Thus, sλ is the character of the representation ρλ .
Proposition 40.1. Let λ be a partition of k, and let μ be a partition of k − 1.
Then
$$\langle s_\lambda, s_\mu e_1 \rangle = \begin{cases} 1 & \text{if } \lambda \supset \mu, \\ 0 & \text{otherwise.} \end{cases}$$
Proof. Applying ch, it is sufficient to show that
$$e_1 s_\mu = \sum_{\lambda \supset \mu} s_\lambda.$$

We work in Λ(n) for any sufficiently large n; of course n = k is sufficient. Let
Δ denote the denominator in (36.1), and let
$$M = \begin{vmatrix} x_1^{\mu_n} & x_2^{\mu_n} & \cdots & x_n^{\mu_n} \\ x_1^{\mu_{n-1}+1} & x_2^{\mu_{n-1}+1} & \cdots & x_n^{\mu_{n-1}+1} \\ \vdots & \vdots & & \vdots \\ x_1^{\mu_1+n-1} & x_2^{\mu_1+n-1} & \cdots & x_n^{\mu_1+n-1} \end{vmatrix}. \qquad (40.1)$$
By (36.1), we have $s_\mu = M/\Delta$ and $e_1 = \sum_i x_i$, so

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 419


DOI 10.1007/978-1-4614-8024-2 40, © Springer Science+Business Media New York 2013
$$\Delta e_1 s_\mu = \sum_{i=1}^{n} x_i M = \sum_{i=1}^{n} \begin{vmatrix} x_1^{\mu_n} & \cdots & x_i^{\mu_n+1} & \cdots & x_n^{\mu_n} \\ \vdots & & \vdots & & \vdots \\ x_1^{\mu_{n-j}+j} & \cdots & x_i^{\mu_{n-j}+j+1} & \cdots & x_n^{\mu_{n-j}+j} \\ \vdots & & \vdots & & \vdots \\ x_1^{\mu_1+n-1} & \cdots & x_i^{\mu_1+n} & \cdots & x_n^{\mu_1+n-1} \end{vmatrix}. \qquad (40.2)$$

We claim that this equals
$$\sum_{j=1}^{n} \begin{vmatrix} x_1^{\mu_n} & \cdots & x_i^{\mu_n} & \cdots & x_n^{\mu_n} \\ \vdots & & \vdots & & \vdots \\ x_1^{\mu_{n-j}+j+1} & \cdots & x_i^{\mu_{n-j}+j+1} & \cdots & x_n^{\mu_{n-j}+j+1} \\ \vdots & & \vdots & & \vdots \\ x_1^{\mu_1+n-1} & \cdots & x_i^{\mu_1+n-1} & \cdots & x_n^{\mu_1+n-1} \end{vmatrix}. \qquad (40.3)$$

In (40.2), we have increased the exponent in exactly one column of M by one
and then summed over columns; in (40.3), we have increased the exponent
in exactly one row of M by one and then summed over rows. In either case,
expanding the determinants and summing over i or j gives the result of first
expanding M and then in each resulting monomial increasing the exponent of
exactly one $x_i$ by one. These are the same set of terms, so (40.2) and (40.3)
are equal.
In (40.3), not all terms may be nonzero. Two consecutive rows will be the
same if $\mu_{n-j} + j + 1 = \mu_{n-j+1} + j + 1$, that is, if $\mu_{n-j} = \mu_{n-j+1}$. In this case,
the determinant is zero. Discarding these terms, (40.3) is the sum of all $s_\lambda$ as
λ runs through those partitions of k that contain μ. ∎
Theorem 40.1. Let λ be a partition of k and let μ be a partition of k − 1.
The following are equivalent.
(i) The representation $\rho_\lambda$ occurs in the representation of $S_k$ induced from the
representation $\rho_\mu$ of $S_{k-1} \subset S_k$; in this case it occurs with multiplicity
one.
(ii) The representation $\rho_\mu$ occurs in the restriction to $S_{k-1}$ of the representation
$\rho_\lambda$ of $S_k$; in this case it occurs with multiplicity one.
(iii) The partition $\mu \subset \lambda$.

Proof. Statements (i) and (ii) are equivalent by Frobenius reciprocity. Noting
that $S_1$ is the trivial group, we have $S_{k-1} = S_{k-1} \times S_1$. By definition, $s_\mu e_1$
is the character of $S_k$ induced from the character $s_\mu \otimes e_1$ of $S_{k-1} \times S_1$. With
this in mind, this theorem is just a paraphrase of Proposition 40.1. ∎
A representation is multiplicity-free if in its decomposition into irreducibles,
no irreducible occurs with multiplicity greater than 1.
Corollary 40.1. If ρ is an irreducible representation of $S_{k-1}$, then the
representation of $S_k$ induced from ρ is multiplicity-free; and if τ is an irreducible
representation of $S_k$, then the representation of $S_{k-1}$ restricted from
τ is multiplicity-free.

Proof. This is an immediate consequence of the theorem. ∎
Let λ be a partition of k. By a standard (Young) tableau of shape λ, we
mean a labeling of the diagram of λ by the integers 1 through k in such a
way that entries increase in each row and column. As we explained earlier,
we represent the diagram of a partition by a series of boxes. This is more
convenient than a set of dots since we can then represent a tableau by putting
numbers in the boxes to indicate the labeling.
For example, the standard tableaux of shape (3, 2) are:

1 2 3    1 2 4    1 2 5    1 3 5    1 3 4
4 5      3 5      3 4      2 4      2 5

The following theorem makes use of the following chain of groups:

Sk ⊃ Sk−1 ⊃ · · · ⊃ S1 .

These have the remarkable property that the restriction of each irreducible
representation of Si to Si−1 is multiplicity-free and the branching rule is
explicitly known. Although this is a rare phenomenon, there are a couple
of other important cases:

U(n) ⊃ U(n − 1) ⊃ · · · ⊃ U(1),

and
O(n) ⊃ O(n − 1) ⊃ · · · ⊃ O(2).

Theorem 40.2. If λ is a partition of k, the degree of the irreducible representation
$\rho_\lambda$ of $S_k$ associated with λ is equal to the number of standard tableaux
of shape λ.

Proof. Removing the box labeled k from a standard tableau of shape λ results
in another standard tableau, of shape μ (say), where $\mu \subset \lambda$. Thus, the set of
tableaux of shape λ is in bijection with the disjoint union of the sets of tableaux
of shape μ, where μ runs through the partitions of k − 1 contained in λ.
The restriction of $\rho_\lambda$ to $S_{k-1}$ is the direct sum of the irreducible representations
$\rho_\mu$, where μ runs through the partitions of k − 1 contained in λ,
and by induction the degree of each such $\rho_\mu$ equals the number of tableaux of
shape μ. The result follows. ∎
Tableaux are an important topic in combinatorics. Fulton [53] and Stanley
[153] have extensive discussions of tableaux, and there is a very good discussion
of standard tableaux in Knuth [109].
A famous formula, due to Frame, Robinson, and Thrall, for the number
of tableaux of shape λ—that is, the degree of ρλ —is the hook length formula.
There are many proofs in the literature. For a variety of proofs see Fulton [53],
Knuth [109], Macdonald [124], Manivel [126], Sagan [141] (with anecdote), and
Stanley [153]. The hook length formula is equivalent to an older formula of
Frobenius and (independently) Young, which is treated in Exercise 40.4.
For each box B in the diagram of λ, the hook at B consists of B, all
boxes to the right of B in its row, and all boxes below B in its column. The
hook length is the number of boxes in the hook. For example, Fig. 40.1 shows
a hook for the partition λ = (5, 5, 4, 3, 3) of 20. This hook has length 5.
Theorem 40.3. (Hook length formula) Let λ be a partition of k. The
number of standard tableaux of shape λ equals k! divided by the product of the
lengths of the hooks.
For the example, we have indicated the lengths of the hooks in Fig. 40.1.
By the hook length formula, we see that the number of tableaux of shape λ is
$$\frac{20!}{9\cdot8\cdot7\cdot4\cdot2\cdot8\cdot7\cdot6\cdot3\cdot1\cdot6\cdot5\cdot4\cdot1\cdot4\cdot3\cdot2\cdot3\cdot2\cdot1} = 34{,}641{,}750,$$
and this is the degree of the irreducible representation ρλ of S20 .
Proof. See Exercise 40.5.


9 8 7 4 2
8 7 6 3 1
6 5 4 1
4 3 2
3 2 1

Fig. 40.1. The hook length formula for λ = (5, 5, 4, 3, 3). (In the original figure the hook at the box B, of length 5, is shaded.)
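Both counts can be cross-checked by machine. The sketch below (function names ours) counts standard tableaux by the corner-removal recursion implicit in the proof of Theorem 40.2 and, independently, by the hook length formula; it reproduces the values 5 for (3, 2) and 34,641,750 for (5, 5, 4, 3, 3):

```python
from math import factorial
from functools import lru_cache

@lru_cache(maxsize=None)
def syt_count(shape):
    # branching recursion: sum over removable corner boxes (Theorem 40.2)
    if sum(shape) == 0:
        return 1
    total = 0
    for i, part in enumerate(shape):
        # a corner is removable from row i iff row i is strictly longer than row i+1
        if part > 0 and (i + 1 == len(shape) or shape[i + 1] < part):
            smaller = shape[:i] + (part - 1,) + shape[i + 1:]
            total += syt_count(tuple(p for p in smaller if p > 0))
    return total

def hook_count(shape):
    # hook length formula: k! divided by the product of all hook lengths
    k = sum(shape)
    conj = [sum(1 for p in shape if p > j) for j in range(shape[0])]
    hooks = 1
    for i, p in enumerate(shape):
        for j in range(p):
            hooks *= (p - j) + (conj[j] - i) - 1  # arm + leg + 1
    return factorial(k) // hooks

assert syt_count((3, 2)) == hook_count((3, 2)) == 5
assert hook_count((5, 5, 4, 3, 3)) == 34641750
assert syt_count((5, 5, 4, 3, 3)) == 34641750
```

The memoized recursion stays fast because only partitions contained in the original shape ever occur.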

Proposition 40.1 is a special case of Pieri's formula, which we explain and
prove. First, we give a bit of background on the Littlewood–Richardson rule,
of which Pieri's formula is itself a special case.
The multiplicative structure of the ring $R \cong \Lambda$ is of intense interest. If λ
and μ are partitions of r and k, respectively, then we can decompose
$$s_\lambda s_\mu = \sum_{\nu} c^{\nu}_{\lambda\mu} s_\nu,$$
where the sum is over partitions ν of r + k. The coefficients $c^{\nu}_{\lambda\mu}$ are called
the Littlewood–Richardson coefficients. They are integers since the $s_\nu$ are a
Z-basis of the free Abelian group $R_{r+k}$.
Applying $\mathrm{ch}^{(n)}$, we may also write
$$s_\lambda s_\mu = \sum_{\nu} c^{\nu}_{\lambda\mu} s_\nu$$
as a decomposition of Schur polynomials, or $\chi_\lambda \chi_\mu = \sum_\nu c^{\nu}_{\lambda\mu} \chi_\nu$ in terms of the
irreducible characters of U(n) parametrized by λ, μ, and ν. Using the fact
that the $s_\lambda$ are orthonormal, we have also
$$c^{\nu}_{\lambda\mu} = \langle s_\lambda s_\mu, s_\nu \rangle.$$

Proposition 40.2. The coefficients $c^{\nu}_{\lambda\mu}$ are nonnegative integers.

Proof. This is clear from any one of the characterizations in Theorem 38.3. ∎
Given that the Littlewood–Richardson coefficients are nonnegative inte-
gers, a natural question is to ask for a combinatorial interpretation. Can
$c^{\nu}_{\lambda\mu}$ be realized as the cardinality of some set? The answer is yes, and this
interpretation is known as the Littlewood–Richardson rule. We refer to Fulton
[53], Stanley [153], or Macdonald [124] for a full discussion of the Littlewood–
Richardson rule.
Even just to state the Littlewood–Richardson rule in full generality is
slightly complex, and we will content ourselves with a particularly important
special case. This is where λ = (r) or λ = (1, . . . , 1), so sλ = hr or er . This
simple and useful case of the Littlewood–Richardson rule is called Pieri’s
formula. We will now state and prove it.
If μ ⊂ λ are partitions, we call the pair (μ, λ) a skew partition and denote
it λ\μ. Its diagram is the set-theoretic difference between the diagrams of λ
and μ. We call the skew partition λ\μ a vertical strip if its diagram does not
contain more than one box in any given row. It is called a horizontal strip if
its diagram does not contain more than one box in any given column.
For example, if μ = (3, 3), then the partitions λ of 8 such that λ\μ is a
vertical strip are (4, 4), (4, 3, 1), and (3, 3, 1, 1). The diagrams of these skew
partitions are the shaded regions in Fig. 40.2.
Theorem 40.4. (Pieri's formula) Let μ be a partition of k, and let $r \geqslant 0$.
Then $s_\mu e_r$ is the sum of the $s_\lambda$ as λ runs through the partitions of k + r
containing μ such that λ\μ is a vertical strip. Also, $s_\mu h_r$ is the sum of the $s_\lambda$
as λ runs through the partitions of k + r containing μ such that λ\μ is a horizontal strip.

Proof. Since by Theorems 34.3 and 35.2 applying the involution ι interchanges
er and hr and also interchanges sμ and sλ , the second statement follows from
the first, which we prove.
Fig. 40.2. Vertical strips

The proof that $s_\mu e_r$ is the sum of the $s_\lambda$ as λ runs through the partitions
of k + r containing μ such that λ\μ is a vertical strip is actually identical
to the proof of Proposition 40.1. Choose $n \geqslant k + r$ and, applying ch, it is
sufficient to prove the corresponding result for Schur polynomials.
With notations as in that proof, we see that $\Delta e_r s_\mu$ equals the sum of
$\binom{n}{r}$ terms, each of which is obtained by multiplying M, defined by (40.1),
by a monomial $x_{i_1} \cdots x_{i_r}$, where $i_1 < \cdots < i_r$. Multiplying M by $x_{i_1} \cdots x_{i_r}$
amounts to increasing the exponent by one in each of the columns $i_1, \ldots, i_r$. Thus,
we get $\Delta e_r s_\mu$ if we take M, increase the exponents in r columns each by one,
and then add the resulting $\binom{n}{r}$ determinants.
We claim that this gives the same result as taking M, increasing the exponents
in r rows each by one, and then adding the resulting $\binom{n}{r}$ determinants.
Indeed, either way, we get the result of taking each monomial occurring in the
expansion of the determinant M, increasing the exponents of exactly r of the
$x_i$ each by one, and adding all resulting terms.
Thus, $e_r s_\mu$ equals the sum of all terms (36.1) where $(\lambda_1, \ldots, \lambda_n)$ is obtained
from $(\mu_1, \ldots, \mu_n)$ by increasing exactly r of the $\mu_i$ by one. Some of these
terms may not be partitions, in which case the determinant in the numerator
of (36.1) will be zero since it will have repeated rows. The terms that remain
are those for which λ is a partition of k + r such that λ\μ is a vertical strip. These
partitions all have length $\leqslant n$ because we chose n large enough. Thus, $e_r s_\mu$ is
the sum of $s_\lambda$ for these λ, as required. ∎
Exercises
The next problem generalizes Exercise 37.3. If λ and μ are partitions such that
the Young diagram of λ contains that of μ, then the pair (λ, μ) is called a skew shape
and is denoted λ\μ. Its Young diagram is the set-theoretic difference between the
Young diagrams of λ and μ. The skew shape is called a ribbon shape if the diagram
is connected and contains no 2 × 2 squares. For example, if λ = (5, 4, 4, 3, 2) and
μ = (5, 3, 2, 2, 2) then the skew shape λ\μ is a ribbon shape. Its diagram is the
shaded region in the following figure.
If λ\μ is a ribbon shape, we call its height, denoted ht(λ\μ), one less than the
number of rows involved in its Young diagram. In the example, the height is 2.
40 Symmetric Group Branching Rules and Tableaux 425

The following result is called the Murnaghan–Nakayama rule (Stanley [153]).
It is the combinatorial basis of the Boson–Fermion correspondence in the theory of
infinite-dimensional Lie algebras.

Exercise 40.1. Let μ be a partition of k and r a positive integer. Show that
$$p_r s_\mu = \sum_{\lambda} (-1)^{\operatorname{ht}(\lambda\backslash\mu)} s_\lambda,$$
where the sum is over all partitions λ of k + r such that λ\μ is a ribbon shape.
[Hint: If $\lambda \in \mathbb{Z}^n$, let
$$F(\lambda) = \det\bigl(x_i^{\lambda_j}\bigr)/\Delta,$$
where Δ is the denominator in (36.1). Thus if $\rho = (n-1, n-2, \ldots, 0)$, then (36.1)
can be written $F(\lambda + \rho) = s_\lambda$. Show that
$$p_r s_\lambda = \sum_{k=1}^{n} F(\lambda + \rho + r e_k),$$
where $e_k = (0, \ldots, 1, \ldots, 0)$ is the kth standard basis vector of $\mathbb{Z}^n$. Show that each
term in this sum is either zero or $\pm s_\mu$, where μ\λ is a ribbon shape.]

Exercise 40.2. Since the hk generate the ring R, knowing how to multiply them
gives complete information about the multiplication in R. Thus, Pieri’s formula
contains full information about the Littlewood–Richardson coefficients. This exercise
gives a concrete illustration. Using Pieri’s formula (or the Jacobi–Trudi identity),
check that
h2 h1 − h3 = s(21) .
Use this to show that

s(21) s(21) = s(42) + s(411) + s(33) + 2s(321) + s(3111) + s(222) + s(2211) .
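Both identities in this exercise can be checked with exact arithmetic by evaluating Schur polynomials through the bialternant formula (36.1) at sample points; four variables suffice here since every partition involved has length at most 4. The sketch below (function names ours) is a sanity check rather than a proof:

```python
from fractions import Fraction

def det(m):
    # exact determinant over the rationals, by Gaussian elimination with pivoting
    m = [row[:] for row in m]
    n, d = len(m), Fraction(1)
    for c in range(n):
        piv = next(r for r in range(c, n) if m[r][c] != 0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return d

def schur(la, xs):
    # bialternant formula (36.1): det(x_i^(la_j + n - j)) divided by the Vandermonde
    n = len(xs)
    la = list(la) + [0] * (n - len(la))
    num = det([[Fraction(x) ** (la[j] + n - 1 - j) for j in range(n)] for x in xs])
    den = det([[Fraction(x) ** (n - 1 - j) for j in range(n)] for x in xs])
    return num / den

xs = (2, 3, 5, 7)               # a sample point with distinct coordinates
h = lambda r: schur((r,), xs)   # h_r = s_(r)

assert h(2) * h(1) - h(3) == schur((2, 1), xs)
lhs = schur((2, 1), xs) ** 2
rhs = (schur((4, 2), xs) + schur((4, 1, 1), xs) + schur((3, 3), xs)
       + 2 * schur((3, 2, 1), xs) + schur((3, 1, 1, 1), xs)
       + schur((2, 2, 2), xs) + schur((2, 2, 1, 1), xs))
assert lhs == rhs
```

Since all partitions appearing have length at most 4, the identities hold verbatim in Λ(4), and the evaluation confirms them at the chosen point.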

Exercise 40.3. Let λ be a partition of k into at most n parts. Prove that the
number of standard tableaux of shape λ is
$$\int_{U(n)} \operatorname{tr}(g)^k\, \overline{\chi_\lambda(g)}\, dg.$$
(Hint: Use Theorems 40.2 and 36.4.)


426 40 Symmetric Group Branching Rules and Tableaux

Exercise 40.4. (Frobenius) Let $(k_1, \ldots, k_r)$ be a sequence of integers whose sum
is k. The multinomial coefficient is
$$\binom{k}{k_1, \ldots, k_r} = \begin{cases} \dfrac{k!}{k_1! \cdots k_r!} & \text{if all } k_i \geqslant 0, \\ 0 & \text{otherwise.} \end{cases}$$
(i) Show that this multinomial coefficient is the coefficient of $t_1^{k_1} \cdots t_r^{k_r}$ in the
expansion of $(\sum_{i=1}^{r} t_i)^k$.
(ii) Prove that if λ is a partition of k into at most n parts, then the number of
standard tableaux of shape λ is
$$\sum_{w \in S_n} (-1)^{l(w)} \binom{k}{\lambda_1 - 1 + w(1), \lambda_2 - 2 + w(2), \ldots, \lambda_n - n + w(n)}.$$
For example, let λ = (3, 2) and k = 5. The sum is
$$\binom{5}{3, 2} - \binom{5}{4, 1} = 10 - 5 = 5,$$
the number of standard tableaux with shape λ. (Hint: Use Theorems 37.3
and 40.2.)
(iii) Let λ be a partition of k into at most n parts. Let $\mu = \lambda + \delta$, where $\delta =
(n-1, n-2, \ldots, 1, 0)$. Show that the number of standard tableaux of shape λ is
$$\frac{k!}{\prod_i \mu_i!} \prod_{i<j} (\mu_i - \mu_j).$$
[Hint: Show that
$$\frac{\prod_i \mu_i!}{k!} \sum_{w \in S_n} (-1)^{l(w)} \binom{k}{\mu_1 - n + w(1), \mu_2 - n + w(2), \ldots, \mu_n - n + w(n)}$$
is a polynomial of degree $\frac{1}{2}n(n-1)$ in $\mu_1, \ldots, \mu_n$, and that it vanishes when
$\mu_i = \mu_j$.]
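Part (ii) of Exercise 40.4 is easy to test by machine. The sketch below (function names ours) evaluates the alternating sum of multinomial coefficients over $S_n$ and compares it with known counts of standard tableaux:

```python
from itertools import permutations
from math import factorial

def multinomial(k, parts):
    # the multinomial coefficient, 0 if any part is negative
    if any(p < 0 for p in parts):
        return 0
    r = factorial(k)
    for p in parts:
        r //= factorial(p)
    return r

def sign(w):
    # (-1)^l(w), computed from the number of inversions
    inv = sum(1 for a in range(len(w)) for b in range(a + 1, len(w)) if w[a] > w[b])
    return -1 if inv % 2 else 1

def frobenius_syt(la):
    # sum over w in S_n of (-1)^l(w) * multinomial(k; la_i - i + w(i))
    n, k = len(la), sum(la)
    return sum(sign(w) * multinomial(k, [la[i] - (i + 1) + w[i] for i in range(n)])
               for w in permutations(range(1, n + 1)))

assert frobenius_syt((3, 2)) == 5      # the worked example: 10 - 5
assert frobenius_syt((2, 1)) == 2
assert frobenius_syt((5, 5, 4, 3, 3)) == 34641750  # matches the hook length example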

Continuing from the previous exercise:

Exercise 40.5. (i) Show that the product of the hooks in the ith row is
$$\frac{\mu_i!}{\prod_{j>i} (\mu_i - \mu_j)}.$$
(ii) Prove the hook length formula.


41
Unitary Branching Rules and Tableaux

In this chapter, representations of both GL(n, C) and GL(n − 1, C) occur.
To distinguish the two, we will modify the notation introduced before Theorem
36.3 as follows. If λ is a partition (of any k) of length $\leqslant n$, or more generally
an integer sequence $\lambda = (\lambda_1, \ldots, \lambda_n)$ with $\lambda_1 \geqslant \lambda_2 \geqslant \cdots$, we will denote
by $\pi_\lambda^{\mathrm{GL}_n}$, or more simply by $\pi_\lambda$, the representation of GL(n, C) parametrized
by λ. On the other hand, if μ is a partition of length $\leqslant n-1$, or more generally
an integer sequence $\mu = (\mu_1, \ldots, \mu_{n-1})$ with $\mu_1 \geqslant \mu_2 \geqslant \cdots$, we will
denote by $\pi_\mu^{\mathrm{GL}_{n-1}}$, or more simply by $\pi_\mu$, the representation of GL(n − 1, C)
parametrized by μ.
We embed $\mathrm{GL}(n-1, \mathbb{C}) \longrightarrow \mathrm{GL}(n, \mathbb{C})$ by
$$g \longmapsto \begin{pmatrix} g & \\ & 1 \end{pmatrix}. \qquad (41.1)$$

It is natural to ask when the restriction of $\pi_\lambda$ to GL(n − 1, C) contains $\pi_\mu$.
Since algebraic representations of GL(n, C) correspond precisely to representations
of its maximal compact subgroup, this is equivalent to asking for the
branching rule from U(n) to U(n − 1).
This question has a simple and beautiful answer in Theorem 41.1 below.
We say that the integer sequences $\lambda = (\lambda_1, \ldots, \lambda_n)$ and $\mu = (\mu_1, \ldots, \mu_{n-1})$
interlace if
$$\lambda_1 \geqslant \mu_1 \geqslant \lambda_2 \geqslant \mu_2 \geqslant \cdots \geqslant \mu_{n-1} \geqslant \lambda_n.$$

Proposition 41.1. Suppose that λn and μn−1 are nonnegative, so the integer
sequences λ and μ are partitions. Then λ and μ interlace if and only if λ ⊃ μ
and the skew partition λ\μ is a horizontal strip.
This is obvious if one draws a diagram.

Proof. Assume that $\lambda \supset \mu$ and λ\μ is a horizontal strip. Then $\lambda_j \geqslant \mu_j$ because
$\lambda \supset \mu$. We must show that $\mu_j \geqslant \lambda_{j+1}$. If not, then $\lambda_j \geqslant \lambda_{j+1} > \mu_j$, which
implies that the diagram of λ\μ contains two boxes in the $(\mu_j + 1)$th column,
namely in rows j and j + 1, a contradiction since λ\μ was assumed
to be a horizontal strip. We have proved that λ and μ interlace. The converse
is similar. ∎


Theorem 41.1. Let $\lambda = (\lambda_1, \ldots, \lambda_n)$ and $\mu = (\mu_1, \ldots, \mu_{n-1})$ be integer sequences
with $\lambda_1 \geqslant \lambda_2 \geqslant \cdots$ and $\mu_1 \geqslant \mu_2 \geqslant \cdots$. Then the restriction of $\pi_\lambda$
to GL(n − 1, C) contains a copy of $\pi_\mu$ if and only if λ and μ interlace. The
restriction of $\pi_\lambda$ is multiplicity-free.

Proof. We restrict the representation $\pi_\lambda$ of $\mathrm{GL}_n(\mathbb{C})$ to $\mathrm{GL}_{n-1}(\mathbb{C})$ in two
stages. First, we restrict it to $\mathrm{GL}_{n-1}(\mathbb{C}) \times \mathrm{GL}_1(\mathbb{C})$, and then we restrict it from
$\mathrm{GL}_{n-1}(\mathbb{C}) \times \mathrm{GL}_1(\mathbb{C})$ to $\mathrm{GL}_{n-1}(\mathbb{C})$. Here $(g, h) \in \mathrm{GL}_{n-1}(\mathbb{C}) \times \mathrm{GL}_1(\mathbb{C})$ is
embedded in $\mathrm{GL}_n(\mathbb{C})$ as in (38.7). Every irreducible character of $\mathrm{GL}_1(\mathbb{C}) = \mathbb{C}^\times$ is
of the form $\alpha \longmapsto \alpha^k$ for some $k \in \mathbb{Z}$, and we will denote this character as $\pi_k$.
We may order the eigenvalues of $(g, h)$ so that $\alpha_1, \ldots, \alpha_{n-1}$ are the eigenvalues
of g, and $\alpha_n$ is the eigenvalue of h. Since $s_\lambda(\alpha_1, \ldots, \alpha_n)$ is a homogeneous
polynomial of degree $|\lambda| = \sum \lambda_i$, and since $s_\mu(\alpha_1, \ldots, \alpha_{n-1})\alpha_n^k$ is
homogeneous of degree $|\mu| + k$, $\pi_\mu \otimes \pi_k$ can occur in the restriction of $\pi_\lambda$
to $\mathrm{GL}_{n-1}(\mathbb{C}) \times \mathrm{GL}_1(\mathbb{C})$ only if $k = |\lambda| - |\mu|$. In other words, the
fact that Schur polynomials are homogeneous implies that the multiplicity
of $\pi_\mu$ in $\pi_\lambda$ restricted to $\mathrm{GL}_{n-1}(\mathbb{C})$ equals the multiplicity of $\pi_\mu \otimes \pi_k$ in $\pi_\lambda$
restricted to $\mathrm{GL}_{n-1}(\mathbb{C}) \times \mathrm{GL}_1(\mathbb{C})$, where $k = |\lambda| - |\mu|$. By Theorem 38.3 this equals the
Littlewood–Richardson coefficient $c^{\lambda}_{\mu\nu}$ where $\nu = (k)$, and by Pieri's formula
(Theorem 40.4) this equals 1 if λ\μ is a horizontal strip and 0 otherwise. By
Proposition 41.1 this means that the partitions λ and μ must interlace. ∎
We can now give a combinatorial formula for the degree of the irreducible
representation $\pi_\lambda$ of GL(n, C), where $\lambda = (\lambda_1, \ldots, \lambda_n)$ and $\lambda_1 \geqslant \cdots \geqslant \lambda_n$. A
Gelfand–Tsetlin pattern of degree n consists of n decreasing integer sequences
of lengths n, n − 1, …, 1 such that each adjacent pair interlaces. For example,
if the top row is 3, 2, 1, there are eight possible Gelfand–Tsetlin patterns:

3 2 1 3 2 1 3 2 1
3 2 3 2 3 1
3 2 3
3 2 1 3 2 1 3 2 1
3 1 3 1 2 2
2 1 2
3 2 1 3 2 1
2 1 and 2 1
2 1

Theorem 41.2. The degree of the irreducible representation $\pi_\lambda$ of GL(n, C)
equals the number of Gelfand–Tsetlin patterns whose top row is λ.

Thus, dim π(3,2,1) = 8.

Proof. The proof is identical in structure to that of Theorem 40.2. The Gelfand–
Tsetlin patterns with top row λ can be counted by noting that striking the top
row gives a Gelfand–Tsetlin pattern with a top row that is a sequence μ of
length n − 1 that interlaces with λ. By induction, the number of such patterns
is equal to the dimension of $\pi_\mu$, and the result now follows from the branching
rule of Theorem 41.1. ∎
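Theorem 41.2 reduces the dimension to a finite count that is simple to program. In the sketch below (names ours), the rows below a given row can be enumerated coordinatewise, since $\lambda_i \geqslant \mu_i \geqslant \lambda_{i+1}$ already forces each candidate row to be weakly decreasing:

```python
from itertools import product
from functools import lru_cache

def rows_below(row):
    # all integer rows (b_1, ..., b_{n-1}) interlacing with row:
    # row[i] >= b_i >= row[i+1]; such rows are automatically weakly decreasing
    ranges = [range(row[i + 1], row[i] + 1) for i in range(len(row) - 1)]
    return product(*ranges)

@lru_cache(maxsize=None)
def gt_count(top):
    # number of Gelfand-Tsetlin patterns with the given top row
    if len(top) == 1:
        return 1
    return sum(gt_count(nxt) for nxt in rows_below(top))

assert gt_count((3, 2, 1)) == 8    # the eight patterns listed above
assert gt_count((2, 0)) == 3       # Sym^2 of C^2 has dimension 3
assert gt_count((1, 1, 0)) == 3    # wedge^2 of C^3 has dimension 3
```

The memoization makes the recursion feasible for moderately large top rows, mirroring the inductive proof of Theorem 41.2.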

Just as with the symmetric group, the dimension of an irreducible repre-
sentation of U(n) can be expressed as the number of tableaux of a certain
type. By a semistandard Young tableau of shape λ we mean a filling of the
boxes in the Young diagram of shape λ in which the columns are strictly
increasing but the rows are only weakly increasing.
Proposition 41.2. Let λ be a partition of length $\leqslant n$. The degree of the irreducible
representation $\pi_\lambda$ of GL(n, C) equals the number of semistandard Young
tableaux of shape λ with entries in {1, 2, …, n}.
Proof. In view of Theorem 41.2, it is sufficient to exhibit a bijection between
these tableaux and the Gelfand–Tsetlin patterns with top row λ. We will
explain how to go from the tableau to the Gelfand–Tsetlin pattern. Given
a tableau, the top row of the Gelfand–Tsetlin pattern is the shape of the
tableau:

1 1 1 2 3
2 2            {5 2 1}
3

Removing all boxes labeled n gives a second tableau, with entries in 1, 2, …, n − 1.
Its shape is the second row of the Gelfand–Tsetlin pattern:

1 1 1 2        {5 2 1}
2 2             {4 2}

We continue by removing the boxes labeled n − 1, and the resulting shape is the
third row:
               {5 2 1}
1 1 1           {4 2}
                 {3}

Continuing in this way we obtain a Gelfand–Tsetlin pattern. We leave it to
the reader to convince themselves that this is a bijection. ∎

The relationship between representation theory and the combinatorics of
tableaux is subtle and interesting. It can be understood as just an analogy,
but at a deeper level, it can be understood as a reflection of the theory of
quantum groups. We start by explaining the analogy.

There is an algorithm, the Robinson–Schensted–Knuth (RSK) algorithm,
which describes bijections between pairs of tableaux of the same shape (or
of conjugate shapes) and various combinatorial objects. Historically, the RSK
algorithm first occurred in Robinson's work on the representation theory of
the symmetric group [136]. It was rediscovered in the early 1960s by Schensted
[147], who was motivated by the question of the longest increasing subsequence
of an integer sequence, and substantially generalized by Knuth [108].
It has applications in various fields from linguistics to algebraic geometry.
We will comment mainly on its connections with representation theory, so we
begin by pointing out how it gives combinatorial analogs of the correspondences
that we are familiar with: Frobenius–Schur duality and $\mathrm{GL}_n \times \mathrm{GL}_m$
duality. We will focus on Frobenius–Schur duality.
Let us recapitulate two facts. Let λ be a partition of k with length $\leqslant n$.
• Let SYT(λ) be the set of standard tableaux of shape λ having entries in
{1, …, k}. Then by Theorem 40.2 the cardinality of SYT(λ) equals the
degree of the irreducible representation $\rho_\lambda$ of $S_k$ corresponding to λ.
• Let SSYT(λ, n) be the set of semistandard tableaux of shape λ having
entries in {1, …, n}. Then by Proposition 41.2 the cardinality of
SSYT(λ, n) equals the degree of the irreducible representation $\pi_\lambda^{\mathrm{GL}(n)}$ with
highest weight λ.
For more about the RSK algorithm, see Fulton [53], Knuth [109] Sect. 5.1.4,
or Stanley [153], van Leeuwen [164] and the original papers. Our interest here
in the RSK algorithm comes from the fact that it is the basis of the combi-
natorial side of a series of analogies between results in representation theory
and combinatorics. There are three main versions of RSK, each analogous to a
fact in representation theory. Briefly, the three representation-theoretic facts
in question are:
• The decomposition of C[Sk ] under the action of Sk × Sk by left and right
translation;
• Frobenius–Schur duality;
• GL(n) × GL(m) duality.
We will take these one at a time, focussing on the second.
The first version of RSK (Robinson) gives a bijection between $S_k$ and the
set of pairs of standard tableaux of the same shape λ. That is, between $S_k$
and the disjoint union
$$\bigsqcup_{\lambda \text{ a partition of } k} \mathrm{SYT}(\lambda) \times \mathrm{SYT}(\lambda).$$

So k!, the cardinality of $S_k$, equals the number of pairs of standard tableaux of
size k with the same shape. Beyond this combinatorial reason, let us observe
another representation-theoretic reason that these two sets have the same
cardinality. Indeed, the cardinality of any finite group equals the sum of the
squares of the degrees of its irreducible representations.

The second version of RSK (Schensted) gives a bijection between the set
of sequences $(m_1, \ldots, m_k)$ with $m_i \in \{1, \ldots, n\}$ (called words) and
$$\bigsqcup_{\substack{\lambda \text{ a partition of } k \\ l(\lambda) \leqslant n}} \mathrm{SYT}(\lambda) \times \mathrm{SSYT}(\lambda, n). \qquad (41.2)$$

Let us again observe that these sets have the same cardinality, which we may
prove using Frobenius–Schur duality in the form
$$\bigotimes^{k} \mathbb{C}^n \cong \bigoplus_{\substack{\lambda \text{ a partition of } k \\ l(\lambda) \leqslant n}} \rho_\lambda \otimes \pi_\lambda^{\mathrm{GL}(n)}.$$
Indeed, the dimension of the left-hand side is $n^k$, which is the cardinality
of the set of integer sequences; the dimension of the right-hand side equals the
cardinality of the above disjoint union. So the second RSK bijection is a combinatorial
analog of Frobenius–Schur duality.
The third RSK bijection (Knuth) is between the set of n × m matrices
whose entries are nonnegative integers and
$$\bigsqcup_{\lambda} \mathrm{SSYT}(\lambda, m) \times \mathrm{SSYT}(\lambda, n).$$
This may be thought of as a combinatorial analog of $\mathrm{GL}_n \times \mathrm{GL}_m$ duality. In
particular, there is a combinatorial proof of the Cauchy identity (see Stanley
[153]). Knuth also found a variant for matrices whose entries are 0 and 1,
in bijection with
$$\bigsqcup_{\lambda} \mathrm{SSYT}(\lambda, m) \times \mathrm{SSYT}(\lambda^t, n),$$
where $\lambda^t$ is the conjugate partition. This is related to the dual Cauchy identity,
and to the version of $\mathrm{GL}_n \times \mathrm{GL}_m$ duality for the exterior algebra on $\mathbb{C}^n \otimes \mathbb{C}^m$.
All of these combinatorial bijections are based on one process, called Schensted
insertion, which we will explain. Given a semistandard Young tableau
Q of shape λ and an integer m, there is a tableau m → Q whose shape μ is
obtained from λ by adding one box (somewhere).
To compute m → Q, we insert m into the first row at its "best"
location. It can go at the end if it is greater than or equal to all of the entries in the
row, and if this is true, the algorithm terminates. Otherwise, it will have to
displace or "bump" one of the entries in the row: it displaces the first entry
that is greater than it. The displaced entry is then inserted into the second
row. If it is greater than or equal to all the entries in that row, then we add
the entry at the end of the row and the algorithm terminates. Otherwise, we
continue by inserting the bumped entry into the third row, and so forth.
Let us do an example. We will calculate 1 −→ Q where Q is the following
tableau:

1 1 2 2 3
2 3 3
3

We’ve shaded the box where the inserted 1 will go. The 1 bumps the 2 which
is then inserted into the second row:
1 1 1 2 3
2 3 3
3

again we’ve shaded the location where the inserted 2 will go. The 3 that is
bumped will then go in the third row. This time it will go at the end:

1 1 1 2 3
2 2 3
3 3

The algorithm therefore terminates and we see that 1 −→ Q is the tableau:

1 1 1 2 3
2 2 3
3 3
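Schensted insertion is short to implement. The sketch below (our illustration; the function name is invented) replays the worked example, inserting 1 into Q:

```python
def insert(m, Q):
    """Schensted row insertion of m into the tableau Q (a list of rows)."""
    Q = [row[:] for row in Q]          # work on a copy; Q itself is unchanged
    row = 0
    while True:
        if row == len(Q):              # bumped below the last row: new row
            Q.append([m])
            return Q
        r = Q[row]
        # find the first entry strictly greater than m
        pos = next((j for j, x in enumerate(r) if x > m), None)
        if pos is None:                # m is >= everything: place at the end
            r.append(m)
            return Q
        m, r[pos] = r[pos], m          # bump, then insert into the next row
        row += 1

Q = [[1, 1, 2, 2, 3], [2, 3, 3], [3]]
print(insert(1, Q))    # [[1, 1, 1, 2, 3], [2, 2, 3], [3, 3]]
```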

Now let us explain the second RSK algorithm mentioned above, which is
the bijection of {1, . . . , n}k with (41.2). We begin with a sequence (m1 , . . . , mk )
and the empty tableau, which we will denote by ∅. We insert the mi one by
one, finally ending up with a tableau

Q = m1 → m2 → · · · → mk → ∅

Actually mk = mk → ∅, so this is the same as m1 → · · · → mk .


Clearly Q is in SSYT(λ, n) for some partition λ of k. We obtain another tableau P , called the recording tableau, of the same shape, which has the entries {1, 2, 3, . . . , k}, by putting 1 in the first box that was created (necessarily the upper left-hand corner), then 2 in the second box that was created, and so forth. The recording tableau is clearly a standard tableau. The bijection maps the sequence (m1 , . . . , mk ) to the pair (P, Q) in (41.2).
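The whole bijection is then a loop over the word, recording where each new box appears. A sketch (ours, not the author's), following the convention above that mk is inserted first:

```python
def rsk_word(word):
    """Return (P, Q): recording and insertion tableaux of the word,
    computing Q = m1 -> m2 -> ... -> mk -> (empty), i.e. inserting mk first."""
    Q, P = [], []
    for step, m in enumerate(reversed(word), start=1):
        row = 0
        while True:
            if row == len(Q):               # a new row is created: a new box
                Q.append([m])
                P.append([step])
                break
            r = Q[row]
            pos = next((j for j, x in enumerate(r) if x > m), None)
            if pos is None:                 # placed at the end: a new box
                r.append(m)
                P[row].append(step)
                break
            m, r[pos] = r[pos], m           # bump and move to the next row
            row += 1
    return P, Q

P, Q = rsk_word([1, 2, 3, 2, 3, 1])
print(Q)   # [[1, 1, 2], [2, 3], [3]]
print(P)   # [[1, 2, 4], [3, 5], [6]]
```

This output agrees with Exercise 41.2 below.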
During the years prior to the late 1980s, one could say that tableau com-
binatorics and representation theory existed in parallel. The RSK algorithm
existed as a combinatorial analog of the Frobenius–Schur duality correspon-
dence (and GLn × GLm duality), but a direct connection between these topics
was missing until Kashiwara described crystals as an aspect of the developing
theory of quantum groups. The book [72] of Hong and Kang and the paper
Kashiwara [93] are good introductions to this topic.

We will not give a complete definition of crystals here. Our goal is to describe them sufficiently well to explain their connection with the RSK algorithm. Crystals are purely combinatorial analogs of Lie group representations,
but now the connection is more than an analogy: crystals are derived from
Lie group representations by a process of deformation.
Let us begin with a Lie group G (say a compact Lie group or its complexifi-
cation) with weight lattice Λ. Let λ ∈ Λ be a dominant weight. The crystal Bλ
is a combinatorial analog of the irreducible representation πλG having highest
weight λ.
The crystal Bλ is a directed graph with vertices that may be identified with
some type of tableaux (at least for the classical Cartan types). Its cardinality
(i.e., the number of vertices in this graph) is equal to the dimension of πλG .
The edges of the graph are labeled by integers $1 \leqslant i \leqslant r$, where $r$ is the semisimple rank, that is, the number of simple roots. There is a weight function $\mathrm{wt} : B_\lambda \to \Lambda$ such that if $P$ and $Q$ are vertices with an edge $P \xleftarrow{\,i\,} Q$, then $\mathrm{wt}(Q) = \mathrm{wt}(P) + \alpha_i$.
If $\chi_\lambda$ is the character of $\pi_\lambda^G$, then
$$\chi_\lambda = \sum_{P \in B_\lambda} e^{\mathrm{wt}(P)}. \tag{41.3}$$
(The notation $e^\lambda$ is as in Chap. 22.)


Let us consider the case G = GL(n, C). The semisimple rank r is n − 1.
Assuming that the dominant weight λ is a partition, the vertices of Bλ are
the semistandard Young tableaux in SSYT(λ, n), and the weight function is
easy to describe: the weight of a tableau is μ = (μ1 , . . . , μn ) where μi is the
number of boxes labeled i in the tableau. Figure 41.1 shows the crystal with n = 3 and λ = (3, 1). The edges are labeled 1 or 2; these correspond to the
simple roots α1 and α2 .
In the case G = GL(n, C) where we have identified the vertices with
tableaux, the edges have the following meaning: if there is an edge labeled
i from P to Q, then Q is obtained from P by changing an entry labeled i to
i + 1. (But if there is more than one box labeled i, deciding which is to be
changed is not entirely straightforward.)
For GL(n), the weight function may be described as follows: Λ, we recall, is identified with $\mathbb{Z}^n$, in which $\mu = (\mu_1, \dots, \mu_n)$ is identified with the character $\mu \in X^*(T)$, where $T$ is the diagonal torus and the character $\mu$ maps
$$z = \begin{pmatrix} t_1 & & \\ & \ddots & \\ & & t_n \end{pmatrix} \longmapsto \prod_{i=1}^{n} t_i^{\mu_i}.$$
Then if P ∈ SSYT(λ, n) we define μ = wt(P ) by letting μi be the number of
entries in the tableau equal to i. So (41.3) becomes the combinatorial formula
for the Schur polynomial:

[Figure 41.1 appears here: the crystal graph whose 15 vertices are the tableaux in SSYT((3, 1, 0), 3), joined by edges labeled 1 or 2.]

Fig. 41.1. The crystal with highest weight λ = (3, 1, 0). The weight diagram (see
Chap. 21) is supplied to the right, to orient the reader


$$s_\lambda(z_1, \dots, z_n) = \sum_{P} z^{\mathrm{wt}(P)}, \tag{41.4}$$
where $z^{\mathrm{wt}(P)}$ is the product of the $z_i$ as $i$ runs through the entries in the tableau $P$.
We will not prove (41.4), which is due to Littlewood, but proofs may be found
in Fulton [53] or Stanley [153].
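Formula (41.4) can be tested by brute force. The following sketch (ours, not from the text) enumerates SSYT((3, 1), 3) and tabulates the weights, recovering the 15 monomials of $s_{(3,1)}(z_1, z_2, z_3)$:

```python
from collections import Counter

def ssyt(shape, n):
    """Enumerate the semistandard tableaux of the given shape, entries in {1,...,n}."""
    cells = [(i, j) for i, r in enumerate(shape) for j in range(r)]
    def fill(idx, T):
        if idx == len(cells):
            yield [row[:] for row in T]
            return
        i, j = cells[idx]
        lo = 1
        if j > 0:
            lo = max(lo, T[i][j - 1])        # rows weakly increase
        if i > 0:
            lo = max(lo, T[i - 1][j] + 1)    # columns strictly increase
        for v in range(lo, n + 1):
            T[i].append(v)
            yield from fill(idx + 1, T)
            T[i].pop()
    yield from fill(0, [[] for _ in shape])

tabs = list(ssyt((3, 1), 3))
print(len(tabs))   # 15: the dimension of the GL(3) module with highest weight (3,1,0)

wt = Counter(tuple(sum(1 for row in T for x in row if x == i) for i in (1, 2, 3))
             for T in tabs)
print(wt[(3, 1, 0)], wt[(1, 1, 2)])   # 1 2: coefficients of z1^3 z2 and z1 z2 z3^2
```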
Crystals have a purely combinatorial tensor product rule that exactly par-
allels the decomposition rule for tensor products of Lie group representations.
That is, if C and C  are crystals, a crystal C ⊗ C  is defined which is the disjoint
union of “irreducible” crystals, each isomorphic to a crystal of the type Bλ .
If λ, μ and ν are dominant weights, then the number of copies of Bλ in the
decomposition of Bμ ⊗ Bν equals the multiplicity of πλG in πμG ⊗ πνG .
Crystals give an explanation of the RSK algorithm. The point is that
the tensor product operation is closely related to Schensted insertion. Let
$B = B_{(1)}$. This is Kashiwara's standard crystal. It looks like this:
$$\boxed{1} \xrightarrow{\ 1\ } \boxed{2} \xrightarrow{\ 2\ } \cdots \xrightarrow{\ n-1\ } \boxed{n}\,.$$

Now we have an isomorphism
$$B \otimes B_\lambda \cong \bigsqcup_{\mu} B_\mu$$

as μ runs through all the partitions with Young diagrams that are obtained
from λ by adding one box. In this isomorphism, if P is a tableau of shape λ,
the element i ⊗ P in B ⊗ Bλ corresponds to the tableau i → P obtained

by Schensted insertion, in one of the $B_\mu$. Which $B_\mu$ it lives in depends on the row in which the Schensted insertion terminates. The crystal analog of
Frobenius–Schur duality expresses ⊗k B as a disjoint union of copies of Bλ as
λ runs through the partitions of k; the number of times Bλ occurs equals the
number of standard tableaux of shape λ.
Crystals are a concrete link between the combinatorics of tableaux and
representation theory. Let Vλ be the module for the representation πλG . We
would like to deform the group G to obtain a “quantum group” depending on
a parameter q. The quantum group should be G in the limit q −→ 1. This scenario
does not quite work, but instead, one may replace G by its (slightly modified)
universal enveloping algebra which is a Hopf algebra (Exercise 41.3). This has
a deformation Uq (g). This Hopf algebra is the quantized enveloping algebra.
It was introduced by Drinfeld and (independently) Jimbo in 1986, in response
to developments in mathematical physics. If the parameter q −→ 1 we recover
U (g). If q −→ 0, the Hopf algebra Uq (g) does not have a limit but its modules
do. They “crystallize” and the crystal is a basis of the resulting module. We
refer to Hong and Kang [72] for an account following Kashiwara.

Exercises
Exercise 41.1. Illustrate the bijection described in the proof of Proposition 41.2 by
translating the eight Gelfand–Tsetlin patterns with top row {3, 2, 1} into tableaux.

Exercise 41.2. Illustrate the second RSK bijection by showing that the word
(1, 2, 3, 2, 3, 1) corresponds to the tableau Q with recording tableau P where

Q = 1 1 2        P = 1 2 4
    2 3              3 5
    3                6

Exercise 41.3. Let g be a Lie algebra. Let U = U (g) be its universal enveloping
algebra.

(i) Show that the map Δ : g → U ⊗ U defined by Δ(X) = X ⊗ 1 + 1 ⊗ X satisfies Δ(X)Δ(Y ) − Δ(Y )Δ(X) = Δ([X, Y ]) and conclude that Δ extends to a ring homomorphism U → U ⊗ U .
(ii) Prove that U is a bialgebra with comultiplication Δ. You will have to define a
co-unit.
(iii) Let S : g → U be the map S(X) = −X. Show that S extends to a linear map that is antimultiplicative, that is, S(ab) = S(b) S(a).
(iv) Show that U is a Hopf algebra with antipode S.
42
Minors of Toeplitz Matrices

This chapter can be read immediately after Chap. 39. It may also be skipped
without loss of continuity. It gives further examples of how Frobenius–Schur
duality can be used to give information about problems related to random
matrix theory.

Let $f(t) = \sum_{n=-\infty}^{\infty} d_n t^n$ be a Laurent series representing a function $f : \mathbb{T} \longrightarrow \mathbb{C}$ on the unit circle. We consider the Toeplitz matrix
$$T_{n-1}(f) = \begin{pmatrix} d_0 & d_1 & \cdots & d_{n-1} \\ d_{-1} & d_0 & \cdots & d_{n-2} \\ \vdots & \vdots & & \vdots \\ d_{1-n} & d_{2-n} & \cdots & d_0 \end{pmatrix}.$$

Szegö [157] considered the asymptotics of $D_{n-1}(f) = \det T_{n-1}(f)$ as $n \longrightarrow \infty$. He proved, under certain assumptions, that if
$$f(t) = \exp\left( \sum_{n=-\infty}^{\infty} c_n t^n \right),$$
then
$$D_{n-1}(f) \sim \exp\left( n c_0 + \sum_{k=1}^{\infty} k c_k c_{-k} \right). \tag{42.1}$$

In other words, the ratio is asymptotically 1 as n −→ ∞. See Böttcher and Silbermann [22] for the history of this problem and applications of Szegö's theorem.
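The content of (42.1) is easy to see numerically. In the sketch below (our illustration; the symbol and parameters are chosen only for the test), $f(t) = \exp\bigl(c(t + t^{-1})\bigr)$, so $c_0 = 0$, $c_1 = c_{-1} = c$, and the predicted limit of $D_{n-1}(f)$ is $\exp(c^2)$:

```python
import numpy as np

c, n, N = 0.3, 30, 512
theta = 2 * np.pi * np.arange(N) / N
t = np.exp(1j * theta)
fvals = np.exp(c * (t + 1 / t))              # f on the unit circle

# Fourier coefficients d_m of f, via the discrete Fourier transform
d = np.fft.fft(fvals) / N                    # d[m % N] approximates d_m

# T_{n-1}(f): the (i, j) entry is d_{j-i}
T = np.array([[d[(j - i) % N] for j in range(n)] for i in range(n)])
det = np.linalg.det(T).real

print(det, np.exp(c * c))                    # both approximately 1.0942
```

The agreement is already excellent at n = 30 because the symbol is analytic, so the correction terms in the strong Szegö limit theorem decay extremely fast.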
A generalization of Szegö’s theorem was given by Bump and Diaconis [28],
who found that the asymptotics of minors of Toeplitz matrices had a similar
formula. Very strikingly, the irreducible characters of the symmetric group
appear in the formula.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 437


DOI 10.1007/978-1-4614-8024-2 42, © Springer Science+Business Media New York 2013

One may form a minor of a Toeplitz matrix by either striking some rows
and columns or by shifting some rows and columns. For example, if we strike
the second row and first column of T4 (f ), we get
$$\begin{pmatrix} d_1 & d_2 & d_3 & d_4 \\ d_{-1} & d_0 & d_1 & d_2 \\ d_{-2} & d_{-1} & d_0 & d_1 \\ d_{-3} & d_{-2} & d_{-1} & d_0 \end{pmatrix}.$$
This is the same result as we would get by simply shifting the indices in $T_3(f)$; that is, it is the determinant $\det(d_{\lambda_i - i + j})_{1 \leqslant i, j \leqslant 4}$, where $\lambda$ is the partition $(1)$.
The most general Toeplitz minor has the form det(dλi −μj −i+j ), where λ and
μ are partitions. The asymptotic formula of Bump and Diaconis holds λ and
μ fixed and lets n −→ ∞.
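This index bookkeeping can be spot-checked with random mock coefficients (a sketch of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
d = {m: rng.standard_normal() for m in range(-4, 5)}    # mock Fourier coefficients

T4 = np.array([[d[j - i] for j in range(5)] for i in range(5)])
minor = np.delete(np.delete(T4, 1, axis=0), 0, axis=1)  # strike row 2 and column 1

lam = [1, 0, 0, 0]                                      # the partition (1)
shifted = np.array([[d[lam[i] - (i + 1) + (j + 1)] for j in range(4)]
                    for i in range(4)])
print(np.allclose(minor, shifted))                      # True: the same matrix
```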
The formula with μ omitted [i.e., for det(dλi −i+j )] is somewhat simpler to
state than the formula, involving Laguerre polynomials, with both λ and μ.
We will content ourselves with the special case where μ is trivial.
We will take the opportunity in the proof of Theorem 42.1 to correct a
minor error in [28]. The statement before (3.4) of [28] that “. . . the only terms
that survive have $\alpha_k = \beta_k$” is only correct for terms of degree $\leqslant n$. (We thank
Barry Simon for pointing this out.)
If λ is a partition, let χλ denote the character of U(n) defined in Chap. 36.
We will use the notation like that at the end of Chap. 22, which we review
next. Although we hark back to Chap. 22 for our notation, the only “deep”
fact that we need from Part II of this book is the Weyl integration formula. For
example, the Weyl character formula in the form that we need it is identical
to the combination of (36.1) and (36.3). The proof of Theorem 42.1 in [28],
based on the Jacobi–Trudi and Cauchy identities, did not make use of the Weyl
integration formula, so even this aspect of the proof can be made independent
of Part II.
Let T be the diagonal torus in U(n). We will identify $X^*(T) \cong \mathbb{Z}^n$ by mapping the character (22.15) to $(k_1, \dots, k_n)$. If $\chi \in X^*(T)$ we will use the
“multiplicative” notation eχ for χ so as to be able to form linear combinations
of characters yet still write X ∗ (T ) additively. The Weyl group W can be
identified with the symmetric group Sn acting on X ∗ (T ) = Zn by permuting
the characters. Let E be the free Abelian group on X ∗ (T ). (This differs slightly
from the use of E at the end of Chap. 22.)
Elements of E are naturally functions on T . Since each conjugacy class of
U(n) has a representative in T , and two elements of T are conjugate in G if
and only if they are equivalent by W , class functions on G are the same as
W -invariant functions on T . In particular, a W -invariant element of E may
be regarded as a function on the group. We write the Weyl character formula
in the form (22.17) with δ = (n − 1, n − 2, . . . , 1, 0) as in (22.16).
If $\lambda$ and $\mu$ are partitions of length $\leqslant n$, let
$$D_{n-1}^{\lambda,\mu}(f) = \det(d_{\lambda_i - \mu_j - i + j}).$$
It is easy to see that this is a minor in a larger Toeplitz matrix.

Theorem 42.1 (Heine, Szegö, Bump, Diaconis). Let $f \in L^1(\mathbb{T})$ be given, with $f(t) = \sum_{n=-\infty}^{\infty} d_n t^n$. Let $\lambda$ and $\mu$ be partitions of length $\leqslant n$. Define a function $\Phi_{n,f}$ on U(n) by $\Phi_{n,f}(g) = \prod_{i=1}^{n} f(t_i)$, where the $t_i$ are the eigenvalues of $g \in \mathrm{U}(n)$. Then
$$D_{n-1}^{\lambda,\mu}(f) = \int_{\mathrm{U}(n)} \Phi_{n,f}(g)\, \overline{\chi_\lambda(g)}\, \chi_\mu(g)\, dg.$$

If λ and μ are trivial, this is the classical Heine–Szegö identity. Historically, a “Hermitian” precursor of this formula may be found in Heine's 1878
treatise on spherical functions, but the “unitary” version seems due to Szegö.
The following proof of the general case is different from that given by Bump
and Diaconis, who deduced this formula from the Jacobi–Trudi identity.

Proof. By the Weyl integration formula in the form (22.18), and the Weyl
character formula in the form (22.17), we have

$$\int_{\mathrm{U}(n)} \Phi_{n,f}(g)\, \overline{\chi_\lambda(g)}\, \chi_\mu(g)\, dg$$
$$= \frac{1}{n!} \int_T \Phi_{n,f}(t) \left( \sum_{w \in W} (-1)^{l(w)}\, e^{w(\mu+\delta)} \right) \left( \sum_{w' \in W} (-1)^{l(w')}\, e^{-w'(\lambda+\delta)} \right) dt$$
$$= \frac{1}{n!} \int_T \Phi_{n,f}(t) \left( \sum_{w, w' \in W} (-1)^{l(w)+l(w')}\, e^{w(\mu+\delta) - w'(\lambda+\delta)} \right) dt.$$

Interchanging the order of summation and integration, replacing $w$ by $w'w$, and then making the variable change $t \longrightarrow w't$, we get
$$\sum_{w' \in W} \frac{1}{n!} \left[ \int_T \Phi_{n,f}(t) \sum_{w \in W} (-1)^{l(w)}\, e^{w(\mu+\delta) - \lambda - \delta}\, dt \right].$$
Each $w'$ contributes equally, and we may simply drop the summation over $w'$ and the $1/n!$ to get
$$\int_T \Phi_{n,f}(t) \sum_{w \in W} (-1)^{l(w)}\, e^{w(\mu+\delta) - \lambda - \delta}\, dt.$$

Now, as a function on $T$, the weight $e^{w(\mu+\delta)-\lambda-\delta}$ has the effect
$$\begin{pmatrix} t_1 & & \\ & \ddots & \\ & & t_n \end{pmatrix} \longmapsto \prod_{i=1}^{n} t_i^{\mu_{w(i)} + (n - w(i)) - \lambda_i - (n - i)} = \prod_{i=1}^{n} t_i^{\mu_{w(i)} - w(i) - \lambda_i + i}.$$

Thus, the integral is
$$\sum_{w \in W} (-1)^{l(w)} \prod_{i=1}^{n} \int_{\mathbb{T}} \left( \sum_{k=-\infty}^{\infty} d_k t_i^k \right) t_i^{\mu_{w(i)} - w(i) - \lambda_i + i}\, dt_i = \sum_{w \in W} (-1)^{l(w)} \prod_{i=1}^{n} d_{-\mu_{w(i)} + w(i) + \lambda_i - i}.$$

Since the Weyl group is $S_n$ and $(-1)^{l(w)}$ is the sign character, by the definition of the determinant, this is the determinant $D_{n-1}^{\lambda,\mu}(f)$. □


As we already mentioned, we will only consider here the special case where
μ is (0, . . . , 0). We refer to [28] for the general case. If μ is trivial, then
Theorem 42.1 reduces to the formula

$$D_{n-1}^{\lambda}(f) = \int_{\mathrm{U}(n)} \Phi_{n,f}(g)\, \overline{\chi_\lambda(g)}\, dg, \tag{42.2}$$
where
$$D_{n-1}^{\lambda}(f) = \det(d_{\lambda_i - i + j}).$$

Theorem 42.2 (Szegö, Bump, Diaconis). Let
$$f(t) = \exp\left( \sum_{k=-\infty}^{\infty} c_k t^k \right),$$
where we assume that
$$\sum_k |c_k| < \infty \qquad \text{and} \qquad \sum_k |k|\, |c_k|^2 < \infty.$$
Let $\lambda$ be a partition of $m$. Let $s_\lambda : S_m \longrightarrow \mathbb{Z}$ be the irreducible character associated with $\lambda$. If $\xi \in S_m$, let $\gamma_k(\xi)$ denote the number of $k$-cycles in the decomposition of $\xi$ into a product of disjoint cycles, and define
$$\Delta(f, \xi) = \prod_{k=1}^{\infty} (k c_k)^{\gamma_k(\xi)}.$$
(The product is actually finite.) Then
$$D_{n-1}^{\lambda}(f) \sim \frac{1}{m!} \sum_{\xi \in S_m} s_\lambda(\xi)\, \Delta(f, \xi)\, \exp\left( n c_0 + \sum_{k=1}^{\infty} k c_k c_{-k} \right).$$

Proof. Our assumption that $\sum |c_k| < \infty$ implies that
$$\int_{\mathrm{U}(n)} \exp\left( \sum_k |c_k|\, |\mathrm{tr}(g^k)| \right) dg < \infty,$$
which is enough to justify all of the following manipulations. (We will use the assumption that $\sum |k|\,|c_k|^2 < \infty$ later.)
First, take λ to be trivial, so that m = 0. This special case is Szegö's original theorem. By (42.2),
$$D_{n-1}(f) = \int_{\mathrm{U}(n)} \prod_k \exp\bigl( c_k\, \mathrm{tr}(g^k) \bigr)\, dg = \int_{\mathrm{U}(n)} \exp\Bigl( \sum_k c_k\, \mathrm{tr}(g^k) \Bigr)\, dg.$$

We can pull out the factor exp(nc0 ) since tr(1) = n, substitute the series
expansion for the exponential function, and group together the contributions
for k and −k. We get
$$e^{nc_0} \int_{\mathrm{U}(n)} \prod_{k=1}^{\infty} \left( \sum_{\alpha_k=0}^{\infty} \frac{c_k^{\alpha_k}}{\alpha_k!}\, \mathrm{tr}(g^k)^{\alpha_k} \right) \left( \sum_{\beta_k=0}^{\infty} \frac{c_{-k}^{\beta_k}}{\beta_k!}\, \overline{\mathrm{tr}(g^k)}^{\,\beta_k} \right) dg$$
$$= e^{nc_0} \sum_{(\alpha_k)} \sum_{(\beta_k)} \int_{\mathrm{U}(n)} \prod_k \frac{c_k^{\alpha_k}}{\alpha_k!}\, \frac{c_{-k}^{\beta_k}}{\beta_k!}\, \mathrm{tr}(g^k)^{\alpha_k}\, \overline{\mathrm{tr}(g^k)}^{\,\beta_k}\, dg,$$

where the sum is now over all sequences $(\alpha_k)$ and $(\beta_k)$ of nonnegative integers. The integrand is multiplied by $e^{i\theta(\sum k\alpha_k - \sum k\beta_k)}$ when we multiply $g$ by $e^{i\theta}$. This means that the integral is zero unless $\sum k\alpha_k = \sum k\beta_k$. Assuming this, we look more closely at these terms. By Theorem 37.1, in notation introduced in Chap. 39, the function $g \longmapsto \prod_k \mathrm{tr}(g^k)^{\alpha_k}$ is $\mathrm{Ch}^{(n)}(p_\nu)$, where $\nu$ is a partition of $r = \sum k\alpha_k = \sum k\beta_k$ with $\alpha_k = \alpha_k(\nu)$ parts of size $k$, and similarly we will denote by $\sigma$ the partition of $r$ with $\beta_k$ parts of size $k$. This point was discussed in the last chapter in connection with (39.7). We therefore obtain


$$D_{n-1}(f) = e^{nc_0} \sum_{r=0}^{\infty} C(r, n),$$
where
$$C(r, n) = \sum_{\nu,\, \sigma \text{ partitions of } r} \left( \prod_k \frac{c_k^{\alpha_k}\, c_{-k}^{\beta_k}}{\alpha_k!\, \beta_k!} \right) \Bigl\langle \mathrm{Ch}^{(n)}(p_\nu),\, \mathrm{Ch}^{(n)}(p_\sigma) \Bigr\rangle.$$

Now consider the terms with $r \leqslant n$. When $r \leqslant n$, by Theorem 39.1, the characteristic map from $\mathcal{R}_r$ to the space of class functions in $L^2(G)$ is an isometry, and by (37.2) we have
$$\Bigl\langle \mathrm{Ch}^{(n)}(p_\nu),\, \mathrm{Ch}^{(n)}(p_\sigma) \Bigr\rangle_{\mathrm{U}(n)} = \langle p_\nu, p_\sigma \rangle_{S_r} = \begin{cases} z_\nu & \text{if } \nu = \sigma, \\ 0 & \text{otherwise.} \end{cases}$$

(This is the same fact we used in the proof of Proposition 39.4.) Thus, when $r \leqslant n$, we have $C(r,n) = C(r)$ where, using the explicit form (37.1) of $z_\nu$, we have
$$C(r) = \sum_{\nu \text{ a partition of } r} z_\nu \prod_k \frac{(c_k c_{-k})^{\alpha_k}}{(\alpha_k!)^2} = \sum_{\nu \text{ a partition of } r} \prod_k \frac{(k c_k c_{-k})^{\alpha_k}}{\alpha_k!}.$$

Now
$$\sum_{r} C(r) = \prod_k \sum_{\alpha_k=0}^{\infty} \frac{(k c_k c_{-k})^{\alpha_k}}{\alpha_k!} = \prod_k \exp(k c_k c_{-k}),$$
so as $n \longrightarrow \infty$, the series $\sum_r C(r,n)$ stabilizes to the series $\sum_r C(r)$, and $e^{nc_0} \sum_r C(r)$ converges to the right-hand side of (42.1).
To prove (42.1), we must bound the tails of the series $\sum_r C(r,n)$. It is enough to show that there exists an absolutely convergent series $\sum_r |D(r)| < \infty$ such that $|C(r,n)| \leqslant |D(r)|$. First, let us consider the case where $c_k = \overline{c_{-k}}$. In this case, we may take $D(r) = C(r)$. The absolute convergence of the series $\sum |D(r)|$ follows from our assumption that $\sum |k|\,|c_k|^2 < \infty$ and the Cauchy–Schwarz inequality. In this case,
$$C(r, n) = \left\| \sum_{\nu \text{ a partition of } r} \left( \prod_k \frac{c_k^{\alpha_k}}{\alpha_k!} \right) \mathrm{Ch}^{(n)}(p_\nu) \right\|^2,$$

where, as before, $\alpha_k = \alpha_k(\nu)$ is the number of parts of size $k$ of the partition $\nu$, and the inner product is taken in U(n). Invoking the fact from Theorem 39.1 that $\mathrm{Ch}^{(n)}$ is a contraction, this is bounded by
$$\left\| \sum_{\nu \text{ a partition of } r} \left( \prod_k \frac{c_k^{\alpha_k}}{\alpha_k!} \right) p_\nu \right\|^2,$$

where now the inner product is taken in $S_r$, and of course this is $C(r)$. If we do not assume $c_k = \overline{c_{-k}}$, we may use the Cauchy–Schwarz inequality and bound $C(r,n)$ by
$$\left\| \sum_{\nu \text{ a partition of } r} \left( \prod_k \frac{c_k^{\alpha_k}}{\alpha_k!} \right) \mathrm{Ch}^{(n)}(p_\nu) \right\| \cdot \left\| \sum_{\sigma \text{ a partition of } r} \left( \prod_k \frac{c_{-k}^{\beta_k}}{\beta_k!} \right) \mathrm{Ch}^{(n)}(p_\sigma) \right\|.$$

Each norm is dominated by the corresponding norm in $\mathcal{R}_r$ and, proceeding as before, we obtain the same bound with $c_k$ replaced by $\max(|c_k|, |c_{-k}|)$.

Now (42.1) is proved, which is the special case with λ trivial. We turn now
to the general case.
We will make use of the identity
$$s_\lambda = \sum_{\mu \text{ a partition of } m} z_\mu^{-1}\, s_\lambda(\xi_\mu)\, p_\mu$$

in the ring of class functions on $S_m$, where for each $\mu$, $\xi_\mu$ is a representative of the conjugacy class $C_\mu$ of cycle type $\mu$. This is clear since $z_\mu^{-1} p_\mu$ is the characteristic function of $C_\mu$, so this function has the correct value at every group element. Applying the characteristic map, in the ring of class functions on U(n) we have
$$\chi_\lambda = \sum_{\mu \text{ a partition of } m} z_\mu^{-1}\, s_\lambda(\xi_\mu)\, \mathrm{Ch}^{(n)}(p_\mu).$$

For each $\mu$, let $\gamma_k(\xi_\mu)$ be the number of cycles of length $k$ in the decomposition of $\xi_\mu$ into a product of disjoint cycles. By Theorem 37.1, we may write this identity
$$\chi_\lambda = \sum_{\mu \text{ a partition of } m} z_\mu^{-1}\, s_\lambda(\xi_\mu) \prod_k \mathrm{tr}(g^k)^{\gamma_k(\xi_\mu)}.$$
Now, proceeding as before from (42.2), we see that $D_{n-1}^{\lambda}(f)$ equals
$$e^{nc_0} \sum_{\mu \text{ a partition of } m} z_\mu^{-1}\, s_\lambda(\xi_\mu) \times \int_{\mathrm{U}(n)} \prod_k \left( \sum_{\alpha_k=0}^{\infty} \frac{c_k^{\alpha_k}}{\alpha_k!}\, \mathrm{tr}(g^k)^{\alpha_k} \right) \left( \sum_{\beta_k=0}^{\infty} \frac{c_{-k}^{\beta_k}}{\beta_k!}\, \overline{\mathrm{tr}(g^k)}^{\,\beta_k + \gamma_k(\xi_\mu)} \right) dg.$$

Since $S_m$ contains $m!/z_\mu$ elements of cycle type $\mu$ and $s_\lambda$ has the same value $s_\lambda(\xi_\mu)$ on all of them, we may write this as
$$e^{nc_0}\, \frac{1}{m!} \sum_{\xi \in S_m} s_\lambda(\xi) \sum_{(\alpha_k)} \sum_{(\beta_k)} \int_{\mathrm{U}(n)} \prod_k \frac{c_k^{\alpha_k}}{\alpha_k!}\, \frac{c_{-k}^{\beta_k}}{\beta_k!}\, \mathrm{tr}(g^k)^{\alpha_k}\, \overline{\mathrm{tr}(g^k)}^{\,\beta_k + \gamma_k(\xi)}\, dg.$$
 
As in the previous case, the contribution vanishes unless $\sum k\alpha_k = \sum k\beta_k + m$, and we assume this. We get
$$D_{n-1}^{\lambda}(f) = e^{nc_0}\, \frac{1}{m!} \sum_{\xi \in S_m} s_\lambda(\xi) \sum_{r=0}^{\infty} C(r, n, \xi),$$

where now
$$C(r, n, \xi) = \sum_{\substack{(\alpha_k) \\ \sum k\alpha_k = r + m}}\ \sum_{\substack{(\beta_k) \\ \sum k\beta_k = r}} \int_{\mathrm{U}(n)} \prod_k \frac{c_k^{\alpha_k}}{\alpha_k!}\, \frac{c_{-k}^{\beta_k}}{\beta_k!}\, \mathrm{tr}(g^k)^{\alpha_k}\, \overline{\mathrm{tr}(g^k)}^{\,\beta_k + \gamma_k(\xi)}\, dg.$$

If $r \leqslant n$, then (as before) the contribution is zero unless $\alpha_k = \beta_k + \gamma_k$. In this case,
$$\int_{\mathrm{U}(n)} \prod_k \mathrm{tr}(g^k)^{\alpha_k}\, \overline{\mathrm{tr}(g^k)}^{\,\beta_k + \gamma_k}\, dg = \prod_k (\beta_k + \gamma_k)!\ k^{\beta_k + \gamma_k},$$

and using this value, we see that when $r \leqslant n$ we have $C(r,n,\xi) = C(r,\xi)$, where
$$C(r, \xi) = \Delta(f, \xi) \sum_{\substack{(\beta_k) \\ \sum k\beta_k = r}} \prod_k \frac{(k c_k c_{-k})^{\beta_k}}{\beta_k!}.$$

The series is
$$\sum_{r} C(r, \xi) = \Delta(f, \xi) \exp\left( \sum_{k=1}^{\infty} k c_k c_{-k} \right),$$
so the result will follow as before if we can show that $|C(r, n, \xi)| \leqslant |D(r, \xi)|$ where $\sum_r |D(r, \xi)| < \infty$. The method is the same as before, based on the fact that the characteristic map is a contraction, and we leave it to the reader. □

Exercises
Exercise 42.1 (Bump et al. [30]).
(i) If f is a continuous function on T, show that there is a well-defined continuous
function uf : U(n) −→ U(n) such that if ti ∈ T and h ∈ U(n), we have
$$u_f\left( h \begin{pmatrix} t_1 & & \\ & \ddots & \\ & & t_n \end{pmatrix} h^{-1} \right) = h \begin{pmatrix} f(t_1) & & \\ & \ddots & \\ & & f(t_n) \end{pmatrix} h^{-1}.$$
(ii) If $g$ is an $n \times n$ matrix, with $n \geqslant m$, let $E_m(g)$ denote the sum of the $\binom{n}{m}$ principal $m \times m$ minors of $g$. Thus, if $n = 4$, then $E_2(g)$ is
$$\begin{vmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{vmatrix} + \begin{vmatrix} g_{11} & g_{13} \\ g_{31} & g_{33} \end{vmatrix} + \begin{vmatrix} g_{11} & g_{14} \\ g_{41} & g_{44} \end{vmatrix} + \begin{vmatrix} g_{22} & g_{23} \\ g_{32} & g_{33} \end{vmatrix} + \begin{vmatrix} g_{22} & g_{24} \\ g_{42} & g_{44} \end{vmatrix} + \begin{vmatrix} g_{33} & g_{34} \\ g_{43} & g_{44} \end{vmatrix}.$$

Prove that if $f(t) = \sum_k d_k t^k$, then
$$\int_{\mathrm{U}(n)} E_m\bigl(u_f(g)\bigr)\, \overline{\chi_\lambda(g)}\, \chi_\mu(g)\, dg = E_m\bigl(T_{n-1}^{\mu,\lambda}\bigr),$$
where $T_{n-1}^{\mu,\lambda}$ is the $n \times n$ matrix whose $i,j$th entry is $d_{\lambda_i - \mu_j - i + j}$. (Hint: Deduce this from the special case $m = n$.)
43
The Involution Model for Sk

Let σ1 = 1, σ2 = (12), σ3 = (12)(34), . . . be representatives of the conjugacy classes of involutions in Sk . It was shown by Klyachko and by Inglis et al. [82] that
it is possible to specify a set of characters ψ1 , ψ2 , ψ3 , . . . of degree 1 of the
centralizers of σ1 , σ2 , σ3 , . . . such that the direct sum of the induced repre-
sentations of the ψi contains every irreducible representation exactly once.
In the next chapter, we will see that translating this fact and related ones to
the unitary group gives classical facts about symmetric and exterior algebra
decompositions due to Littlewood [120].
If (π, V ) is a self-contragredient irreducible complex representation of a
compact group G, we may classify π as orthogonal (real) or symplectic (quater-
nionic). We will now explain this classification due to Frobenius and Schur
[52]. We recall that the contragredient representation to (π, V ) is the represen-
tation π̂ : G −→ GL(V ∗ ) on the dual space V ∗ of V defined by π̂(g) = π(g −1 )∗ ,
which is the adjoint of π(g −1 ). Its character is the complex conjugate of the
character of π.

Proposition 43.1. The irreducible complex representation π is self-contragredient if and only if there exists a nondegenerate bilinear form B : V × V −→ C such that
$$B\bigl(\pi(g)v,\, \pi(g)w\bigr) = B(v, w). \tag{43.1}$$
The form B is unique up to a scalar multiple. We have $B(w, v) = \varepsilon B(v, w)$, where $\varepsilon = \pm 1$.

Proof. To emphasize the symmetry between $V$ and $V^*$, let us write the dual pairing $V \times V^* \longrightarrow \mathbb{C}$ in the symmetrical form $L(v) = \langle v, L \rangle$. The contragredient representation thus satisfies $\langle \pi(g)v, L \rangle = \langle v, \hat\pi(g^{-1})L \rangle$, or $\langle \pi(g)v, \hat\pi(g)L \rangle = \langle v, L \rangle$. Any bilinear form $B : V \times V \longrightarrow \mathbb{C}$ is of the form $B(v, w) = \langle v, \lambda(w) \rangle$, where $\lambda : V \longrightarrow V^*$ is a linear isomorphism. It is clear that (43.1) is satisfied if and only if $\lambda$ intertwines $\pi$ and $\hat\pi$.
Since π and π̂ are irreducible, Schur’s lemma implies that λ, if it exists,
is unique up to a scalar multiple, and the same conclusion follows for B.


Now $(v, w) \mapsto B(w, v)$ has the same property as $B$, and so $B(w, v) = \varepsilon B(v, w)$ for some constant $\varepsilon$. Applying this identity twice, $\varepsilon^2 B(v, w) = B(v, w)$, so $\varepsilon = \pm 1$. □

If $(\pi, V)$ is self-contragredient, let $\varepsilon_\pi$ be the constant $\varepsilon$ in Proposition 43.1; otherwise let $\varepsilon_\pi = 0$. If $\varepsilon_\pi = 1$, then we say that $\pi$ is orthogonal or real; if $\varepsilon_\pi = -1$, we say that $\pi$ is symplectic or quaternionic. We call $\varepsilon_\pi$ the Frobenius–Schur indicator of $\pi$.
Theorem 43.1 (Frobenius and Schur). Let $(\pi, V)$ be an irreducible representation of the compact group $G$. Then
$$\varepsilon_\pi = \int_G \chi(g^2)\, dg.$$

Proof. We have $p_2 = h_2 - e_2$ in $\Lambda^{(n)}$. Indeed, $p_2(x_1, \dots, x_n)$ equals
$$\sum_i x_i^2 = \left( \sum_i x_i^2 + \sum_{i<j} x_i x_j \right) - \left( \sum_{i<j} x_i x_j \right) = h_2(x_1, \dots, x_n) - e_2(x_1, \dots, x_n).$$
By (33.8) and Proposition 33.2, this means that
$$\chi(g^2) = \mathrm{tr}\,\vee^2 \pi(g) - \mathrm{tr}\,\wedge^2 \pi(g).$$
We see that $\varepsilon_\pi$ is
$$\int_G \mathrm{tr}\,\vee^2 \pi(g)\, dg - \int_G \mathrm{tr}\,\wedge^2 \pi(g)\, dg.$$

Thus, what we need to know is that $\vee^2 \pi$ contains the trivial representation if and only if $\varepsilon_\pi = 1$, while $\wedge^2 \pi$ contains the trivial representation if and only if $\varepsilon_\pi = -1$.
If $\vee^2 \pi$ contains the trivial representation, let $\xi \in \vee^2 V$ be a $\vee^2\pi(g)$-fixed vector. Let $\langle\,,\,\rangle$ be a $G$-invariant inner product on $V$. There is induced a $G$-invariant Hermitian inner product on $\vee^2 V$ such that $\langle v_1 \vee v_2,\, w_1 \vee w_2 \rangle = \langle v_1, w_1 \rangle \langle v_2, w_2 \rangle$, and we may define a symmetric bilinear form on $V$ by $B(v, w) = \langle v \vee w, \xi \rangle$. Thus, $\varepsilon_\pi = 1$.
Conversely, if $\varepsilon_\pi = 1$, let $B$ be a symmetric invariant bilinear form. By the universal property of the symmetric square, there exists a linear form $L : \vee^2 V \longrightarrow \mathbb{C}$ such that $B(v, w) = L(v \vee w)$, and hence a vector $\xi \in \vee^2 V$ such that $B(v, w) = \langle v \vee w, \xi \rangle$, which is a $\vee^2\pi(g)$-fixed vector.
The case where $\varepsilon_\pi = -1$ is identical using the exterior square. □
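As a toy check of Theorem 43.1 (our illustration, not from the text), take G = Z/nZ written additively, so that g² means 2g, and χr(g) = e^{2πirg/n}. The indicator is 1 exactly when χr is real-valued, i.e. when 2r ≡ 0 (mod n), and 0 otherwise, since an abelian group has no symplectic irreducibles:

```python
import cmath

def fs_indicator(n, r):
    """(1/|G|) * sum over g of chi_r(g^2), for G = Z/nZ; additively, g^2 = 2g."""
    s = sum(cmath.exp(2 * cmath.pi * 1j * r * (2 * g) / n) for g in range(n))
    return round((s / n).real)

print([fs_indicator(4, r) for r in range(4)])   # [1, 0, 1, 0]
print([fs_indicator(5, r) for r in range(5)])   # [1, 0, 0, 0, 0]
```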

Proposition 43.2. Let $(\pi, V)$ be an irreducible complex representation of the compact group $G$. Then $\pi$ is the complexification of a real representation if and only if $\varepsilon_\pi = 1$. If this is true, $\pi(G)$ is conjugate to a subgroup of the orthogonal group O(n).

Proof. First, suppose that π : G −→ GL(V ) is the complexification of a real representation. This means that there exists a real vector space V0 and a homomorphism π0 : G −→ GL(V0 ) such that V ≅ C ⊗R V0 as G-modules. Every compact subgroup of GL(V0 ) ≅ GL(n, R) is conjugate to a subgroup
of O(n). Indeed, if $\langle\,,\,\rangle$ is a positive definite symmetric bilinear form on $V_0$, then averaging it gives another positive definite symmetric bilinear form
$$B_0(v, w) = \int_G \bigl\langle \pi_0(g)v,\, \pi_0(g)w \bigr\rangle\, dg$$

that is $G$-invariant. Choosing a basis of $V_0$ that is orthonormal with respect to this form, the matrices of $\pi_0(g)$ will all be orthogonal. Extending $B_0$ by linearity to a symmetric bilinear form on $V$, which we identify with $\mathbb{C} \otimes V_0$, gives a symmetric bilinear form showing that $\varepsilon_\pi = 1$.
Conversely, if $\varepsilon_\pi = 1$, there exists a $G$-invariant symmetric bilinear form $B$ on $V$. We will make use of both $B$ and a $G$-invariant inner product $\langle\,,\,\rangle$ on $V$. They differ in that $B$ is linear in the second variable, while the inner product is conjugate linear. If $w \in V$, consider the linear functional $v \mapsto B(v, w)$. Every linear functional is the inner product with a unique element of $V$, so there exists $\lambda(w) \in V$ such that $B(v, w) = \langle v, \lambda(w) \rangle$. The map $\lambda : V \longrightarrow V$ is $\mathbb{R}$-linear but not $\mathbb{C}$-linear; in fact, it is complex antilinear. Let $V_0 = \{ v \in V \mid \lambda(v) = v \}$. It is a real vector space. We may write every element $v \in V$ as a sum $v = u + iw$, where $u, w \in V_0$, taking $u = \frac{1}{2}\bigl(v + \lambda(v)\bigr)$ and $w = \frac{1}{2i}\bigl(v - \lambda(v)\bigr)$. This decomposition is unique since $\lambda(v) = u - iw$, and we may solve for $u$ and $w$. Therefore, $V = V_0 \oplus i V_0$ and $V$ is the complexification of $V_0$. Since $B$ and $\langle\,,\,\rangle$ are both $G$-invariant, it is easy to see that $\lambda \circ \pi(g) = \pi(g) \circ \lambda$, so $\pi$ leaves $V_0$ invariant and induces a real representation with complexification $\pi$. □



Theorem 43.2. Let G be a finite group. Let μ : G −→ C be the sum of the irreducible characters of G.
(i) Suppose that $\varepsilon_\pi = 1$ for every irreducible representation π. Then, for any g ∈ G, μ(g) is the number of solutions to the equation x2 = g in G.
(ii) Suppose that μ(1) is the number of solutions to the equation x2 = 1. Then $\varepsilon_\pi = 1$ for all irreducible representations π.

Proof. If $\pi$ is an irreducible representation of $G$, let $\chi_\pi$ be its character. We will show
$$\sum_{\text{irreducible } \pi} \varepsilon_\pi\, \chi_\pi(g) = \#\{ x \in G \mid x^2 = g \}. \tag{43.2}$$
Indeed, by Theorem 43.1, the left-hand side equals
$$\sum_{\chi} \chi(g) \left[ \frac{1}{|G|} \sum_{x \in G} \chi(x^2) \right] = \sum_{x \in G} \left[ \frac{1}{|G|} \sum_{\chi} \chi(g)\, \chi(x^2) \right].$$

Let C be the conjugacy class of g. By Schur orthogonality, the expression in


brackets equals 1/|C| if x2 is conjugate to g and zero otherwise. Each element
of the conjugacy class will have the same number of square roots, so counting
the number of solutions to x2 ∼ g (where ∼ denotes conjugation) and then
dividing by |C| gives the number of solutions to x2 = g. This proves (43.2).
Now (43.2) clearly implies (i). It also implies (ii) because, taking $g = 1$, each coefficient $\chi_\pi(1)$ is a positive integer, so
$$\sum_{\text{irreducible } \pi} \varepsilon_\pi\, \chi_\pi(1) = \sum_{\text{irreducible } \pi} \chi_\pi(1)$$
is only possible if all $\varepsilon_\pi$ are equal to 1. □




Let K be a field and F a subfield. Let V be a K-vector space. If π : G −→


GL(V ) is a representation of a group G over K, we say that π is defined over
F if there exists an F -vector space V0 and a representation π0 : G −→ GL(V0 )
over F such that π is isomorphic to the representation of G on the K-vector
space K ⊗F V0 . The dimension over K of V must clearly equal the dimension
of V0 as an F -vector space.

Theorem 43.3. Every irreducible representation of Sk is defined over Q.

Proof. The construction of Theorem 35.1 contained no reference to the ground field and works just as well over Q. Specifically, our formulation of Mackey theory was valid over an arbitrary field, so if λ and μ are conjugate partitions, the computation of Proposition 35.5 shows that there is a unique intertwining operator $\mathrm{Ind}_{S_\lambda}^{S_k}(\varepsilon) \longrightarrow \mathrm{Ind}_{S_\mu}^{S_k}(1)$, where we are now considering representations over Q. The image of this intertwining operator is a rational representation whose complexification is the representation ρλ of Sk parametrized by λ. □


In this chapter, we will call an element x ∈ G an involution if x2 = 1. Thus, the identity element is considered an involution by this definition. If G = Sk , then by Theorem 43.3 every irreducible representation is defined over Q, a fortiori over R, and so by Proposition 43.2 we have $\varepsilon_\pi = 1$ for all irreducible representations π. Therefore, by Theorem 43.2, the number of involutions is equal to the sum of the degrees of the irreducible characters, and moreover the sum of the irreducible characters evaluated at g ∈ Sk equals the number of solutions to x2 = g. In particular, it is a nonnegative integer.
It is possible to prove that the sum of the degrees of the irreducible repre-
sentations of G is equal to the number of involutions when G = Sk using the
Robinson–Schensted correspondence (see Knuth [109], Sect. 5.1.4, or Stanley
[153], Corollary 7.13.9). Indeed, both numbers are equal to the number of
standard tableaux.
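Both counts are quick to verify for small k (a sketch of ours, brute-forcing the involutions and using the hook length formula for the degrees):

```python
import math
from itertools import permutations

def involutions(k):
    """Number of x in S_k with x^2 = 1 (the identity included)."""
    return sum(1 for p in permutations(range(k))
               if all(p[p[i]] == i for i in range(k)))

def num_SYT(shape):
    """Degree f^lambda = number of standard tableaux, by the hook length formula."""
    conj = [sum(1 for r in shape if r > c) for c in range(shape[0])]
    prod = 1
    for i, r in enumerate(shape):
        for j in range(r):
            prod *= r - j + conj[j] - i - 1
    return math.factorial(sum(shape)) // prod

def partitions(k, max_part=None):
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

k = 5
print(involutions(k), sum(num_SYT(lam) for lam in partitions(k)))   # 26 26
```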
Let G be a group (such as Sk ) having the property that all $\varepsilon_\pi = 1$, so
the number of involutions of G is the sum of the degrees of the irreducible
representations. Let x1 , . . . , xh be representatives of the conjugacy classes of

involutions. The cardinality of the conjugacy class of x is the index of its centralizer $C_G(x)$, so $\sum_i [G : C_G(x_i)]$ is the number of involutions of G. Since this is the sum of the degrees of the irreducible characters of G, it becomes a natural question to ask whether we may specify characters ψi of degree 1 of CG (xi ) such that the direct sum of the induced characters ψiG contains each irreducible character exactly once. If so, these data comprise an involution model for G. Involution models do not always exist, even if all $\varepsilon_\pi = 1$.
A complete set of representatives of the conjugacy classes of involutions in Sk are 1, (12), (12)(34), . . . . To describe their centralizers, we first begin with the involution (12)(34)(56) · · · (2r − 1, 2r) ∈ S2r . Its centralizer, as described in
Proposition 37.1, has order $2^r r!$. It has a normal subgroup of order $2^r$ generated by the transpositions (12), (34), . . ., and the quotient is isomorphic to
Sr . We denote this group B2r . It is isomorphic to the Weyl group of Cartan
type Br .
Now consider the centralizer in Sk of (12)(34) · · · (2r − 1, 2r) where 2r < k.
It is contained in S2r × Sk−2r , where the second Sk−2r acts on {2r + 1, 2r +
2, . . . , k} and equals B2r ×Sk−2r . The theorem of Klyachko, Inglis, Richardson,
and Saxl is that we may specify characters of these groups with inductions
to Sk that contain every irreducible character exactly once. There are two
ways of doing this: we may put the alternating character on Sk−2r and the
trivial character on B2r , or conversely we may put the alternating character
(restricted from S2r ) on B2r and the trivial character on Sk−2r .
Let ω2r be the character of S2r induced from the trivial character of B2r .
Proposition 43.3. The restriction of ω2r to S2r−1 is isomorphic to the char-
acter of S2r−1 induced from the character ω2r−2 of S2r−2 .
Proof. First, let us show that B2r \S2r /S2r−1 consists of a single double coset.
Indeed, S2r acts transitively on X = {1, 2, . . . , 2r}, and the stabilizer of 2r is
S2r−1 . Therefore, we can identify S2r /S2r−1 with X and B2r \S2r /S2r−1 with
B2r \X. Since B2r acts transitively on X, the claim is proved.
Thus, we can compute the restriction of ω2r to S2r−1 by Corollary 32.2 to
Theorem 32.2, taking H1 = B2r , H2 = S2r−1 , G = S2r , π = 1, with γ = 1 the
only double coset representative. We see that the restriction of ω2r = IndG H1 (1)
is the same as the induction of 1 from Hγ = B2r ∩ S2r−1 = B2r−2 to H2 .
Inducing in stages first from B2r−2 to S2r−2 and then to S2r−1 , this is the
same as the character of S2r−1 induced from ω2r−2 .

We are preparing to compute ω2r . The key observation of Inglis, Richard-
son, and Saxl is that Proposition 43.3, plus purely combinatorial considera-
tions, contains enough information to do this.
We call a partition λ = (λ1 , λ2 , . . .) even if every λi is an even integer.
If λ is a partition, let Ri λ = (λ1 , λ2 , . . . , λi−1 , λi + 1, λi+1 , . . .) be the
result of incrementing the ith part. In applying this raising operator , we must
always check that the resulting sequence is a partition. For this, we need
either i = 1 or λi < λi−1 . Similarly, we have the lowering operator Li λ =
(λ1 , λ2 , . . . , λi−1 , λi − 1, λi+1 , . . .), which is a partition if λi > λi+1 .
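As a small sketch, the raising and lowering operators together with the validity checks just described can be coded as follows (the function names are ours):

```python
def raise_part(lam, i):
    # R_i: increment the i-th part (1-indexed).  Valid only if i == 1
    # or lam[i-1] < lam[i-2], so that the result is still a partition.
    lam = list(lam)
    if i > 1 and lam[i - 1] >= lam[i - 2]:
        raise ValueError("R_i does not yield a partition here")
    lam[i - 1] += 1
    return tuple(lam)

def lower_part(lam, i):
    # L_i: decrement the i-th part.  Valid only if lam[i-1] > lam[i]
    # (parts beyond the end count as 0); zero parts are dropped.
    lam = list(lam)
    nxt = lam[i] if i < len(lam) else 0
    if lam[i - 1] <= nxt:
        raise ValueError("L_i does not yield a partition here")
    lam[i - 1] -= 1
    return tuple(p for p in lam if p > 0)

assert raise_part((4, 2, 2), 2) == (4, 3, 2)
assert lower_part((4, 3, 2), 3) == (4, 3, 1)
```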

Lemma 43.1. Every partition of 2r − 1 having exactly one odd part is


contained in a unique even partition of 2r.

Proof. Let μ be a partition of 2r − 1 having exactly one odd part μi . The


unique even partition of 2r containing μ is Ri μ. Note that this is a partition
since i = 1 or μi < μi−1 . (We cannot have μi = μi−1 since one
is odd and the other is even.)


Proposition 43.4. Let S be a set of partitions of 2r. Assume that:


(i) Each partition of 2r − 1 contained in an element of S has exactly one odd
part;
(ii) Each partition of 2r −1 with exactly one odd part is contained in a unique
element of S;
(iii) The trivial partition (2r) ∈ S.
Then S consists of the set S0 of even partitions of 2r.

Proof. First, we show that S contains S0 . Assume on the contrary that λ ∈ S0


is not in S. We assume that the counterexample λ is minimal with respect
to the partial order, so if λ′ ∈ S0 with λ′ ≺ λ, then λ′ ∈ S. Let i = l(λ).
We note that i > 1 since if i = 1, then λ is the unique partition of 2r of
length 1, namely (2r), which is impossible since λ ∉ S while (2r) ∈ S by
assumption (iii).
Let μ = Li λ. It is a partition since we are decrementing the last nonzero
part of λ. It has a unique odd part μi , so by (ii) there is a unique τ ∈ S such
that μ ⊂ τ . Evidently, τ = Rj μ for some j. Let us consider what j can be.
We show first that j cannot be > i. If it were, we would have j = i + 1
because i is the length of μ and λ. Now assuming τ = Ri+1 μ = Ri+1 Li λ, we
can obtain a contradiction as follows. We have τi−1 = λi−1 ≥ λi > λi − 1 = τi ,
so ν = Li−1 τ is a partition. It has three odd parts, namely νi−1 , νi and νi+1 .
This contradicts (i) for ν ⊂ τ ∈ S.
Also j cannot equal i. If it did, we would have τ = Ri Li λ = λ, a contra-
diction since τ ∈ S while λ ∉ S.
Therefore, j < i. Let σ = Rj Li τ = Rj2 Li2 λ. Note that σ is a partition.
Indeed, either j = 1 or else τj ≠ τj−1 since one is odd and the other one
is even, and we are therefore permitted to apply Rj . Furthermore, τi ≠ τi+1
since one is odd and the other one is even, so we are permitted to apply Li .
Since λ is even, σ is even, and since j < i, σ ≺ λ. By our induction
hypothesis, this implies that σ ∈ S. Now let θ = Li τ = Lj σ. This is easily seen
to be a partition with exactly one odd part (namely θj ), and it is contained
in two distinct elements of S, namely τ and σ. This contradicts (ii).
This contradiction shows that S ⊃ S0 . We can now show that S = S0 .
Otherwise, S contains S0 and some other partition λ ∉ S0 . Let μ be any
partition of 2r − 1 contained in λ. Then μ has exactly one odd part by (i), so
by Lemma 43.1 it is contained in some element λ′ ∈ S0 ⊂ S. Since λ ∉ S0 , λ
and λ′ are distinct elements of S both containing μ, contradicting (ii).


Theorem 43.4. The character ω2r of S2r is multiplicity-free. It is the sum of


all irreducible characters sλ with λ an even partition of 2r.
Proof. By induction, we may assume that this is true for S2r−2 . The restriction
of ω2r to S2r−1 is the same as the character induced from ω2r−2 by Proposi-
tion 43.3. Using the branching rule for the symmetric groups, its irreducible
constituents consist of all sμ , where μ is a partition of 2r − 1 containing an
even partition of 2r − 2, and clearly this is the set of partitions of 2r − 1 having
exactly one odd part. There are no repetitions.
We see immediately that ω2r is multiplicity-free since its restriction to
S2r−1 is multiplicity-free. Let S be the set of partitions λ of 2r such that
sλ is contained in ω2r . Again using the branching rule for symmetric groups,
we see that this set satisfies conditions (i) and (ii) of Proposition 43.4 and
condition (iii) is clear by Frobenius reciprocity. The result now follows from
Proposition 43.4.
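In particular, Theorem 43.4 predicts that the degrees match: summing the number of standard tableaux over the even partitions λ of 2r should give deg ω2r = [S2r : B2r ] = (2r)!/(2r r!). A quick check of this for small r (the helper functions are ours, not from the text):

```python
from math import factorial

def partitions(n, max_part=None):
    # Generate the partitions of n as weakly decreasing tuples.
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def num_syt(lam):
    # Degree of the irreducible character s_lam (hook length formula).
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    hooks = 1
    for i, part in enumerate(lam):
        for j in range(part):
            hooks *= (part - j) + (conj[j] - i) - 1
    return factorial(sum(lam)) // hooks

for r in range(1, 6):
    deg = sum(num_syt(lam) for lam in partitions(2 * r)
              if all(part % 2 == 0 for part in lam))
    assert deg == factorial(2 * r) // (2**r * factorial(r))  # [S_2r : B_2r]
```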

We may now show that Sk has an involution model. The centralizer of the
involution (12)(34) · · · (2r − 1, 2r) is B2r × Sk−2r .
Theorem 43.5 (Klyachko, Inglis, Richardson, and Saxl). Every irre-
ducible character of Sk occurs with multiplicity 1 in the sum
⊕_{2r≤k} Ind^{Sk}_{B2r ×Sk−2r} (1 ⊗ ε),

where ε is the alternating character of Sk−2r .


Proof. We will show that Ind^{Sk}_{B2r ×Sk−2r} (1 ⊗ ε) is the sum of the sλ as λ runs
through the partitions of k having exactly k−2r odd parts. Indeed, it is obvious
that if λ is a partition of k, there is a unique even partition μ such that λ ⊃ μ
and λ\μ is a vertical strip; the partition μ is obtained by decrementing each
odd part of λ. Since ω2r is the sum of all sλ where λ is a partition of 2r into
even parts, it follows from Pieri’s formula that the character ω2r ek−2r is the
sum of all sλ where λ is a partition of k having exactly k − 2r odd parts.
We note that the number of odd parts of any partition λ of k is congruent
to k modulo 2 because k = Σi λi . The result follows by summing over r. 

Exercises
The first exercise generalizes Theorem 43.1 of Frobenius and Schur. Suppose that
G is a compact group and θ : G −→ G is an involution (i.e., an automorphism
satisfying θ2 = 1). Let (π, V ) be an irreducible representation of G. If π ≅ θπ, where
θπ : V −→ V is the “twisted” representation θπ(g) = π(θg), then by an obvious
variant of Proposition 43.1 there exists a nonzero bilinear form Bθ : V × V −→ C
such that
 
Bθ (π(g)v, π(θg)w) = Bθ (v, w). (43.3)
In this case, the twisted Frobenius–Schur indicator εθ (π) is defined to be the constant
equal to ±1 such that
Bθ (v, w) = εθ (π) Bθ (w, v).
If π ≇ θπ we define εθ (π) = 0. The goal is to prove the following theorem.
Theorem (Kawanaka and Matsuyama [96]). Let G be a compact group and
θ an involution of G. Let (π, V ) be an irreducible representation with character χ.
Then

εθ (π) = ∫G χ(g · θg) dg. (43.4)

Exercise 43.1. Assuming the hypotheses of the stated theorem, define a group H
that is the semidirect product of G by a cyclic group ⟨t⟩ generated by an element t
of order 2 such that tgt−1 = θg for g ∈ G. Thus, the index [H : G] = 2. The idea
is to use Theorem 43.1 for the group H to obtain the theorem of Kawanaka and
Matsuyama for G. Proceed as follows.
Case 1: Assume that π ≅ θπ. In this case, show that there exists an endomor-
phism T : V −→ V such that T ◦ π(g) = π(θg) ◦ T and T 2 = 1V . Extend π to a
representation πH of H such that πH (t) = T . Let Bθ : V × V −→ C satisfy (43.3).
Then B(v, w) = Bθ (v, T w) satisfies (43.1), as does B(T v, T w) = Bθ (T v, w). Thus,
there exists a constant δ such that B(T v, T w) = δB(v, w). Show that δ 2 = 1 and
that
εθ (π) = δ ε(π). (43.5)
Apply Theorem 43.1 to the representation πH , bearing in mind that the Haar mea-
sure on H restricted to G is only half the Haar measure on G because both measures
are normalized to have total volume 1. This gives

ε(πH ) = (1/2) [ ε(π) + ∫G χ(g · θg) dg ]. (43.6)

Now observe that if πH is self-contragredient, then the nondegenerate form that it
stabilizes must be a multiple of B. Deduce that if δ = 1 then πH is self-contragredient
and ε(πH ) = ε(π), while if δ = −1, then ε(πH ) = 0. In either case, reconcile (43.5)
and (43.6) to prove (43.4).
Case 2: Assume that π ≇ θπ. In this case, show that the induced representation
Ind^H_G (π) is irreducible and call it πH . Show that

ε(πH ) = ε(π) + ∫G χ(g · θg) dg.

Show using direct constructions with bilinear forms on V and VH that if either ε(π)
or εθ (π) is nonzero, then πH is self-contragredient, while if πH is self-contragredient,
then exactly one of ε(π) or εθ (π) is nonzero, and whichever one is nonzero equals
ε(πH ).

Exercise 43.2. Let G be a finite group and let θ be an involution. Let μ : G −→ C
be the sum of the irreducible characters of G. If μ(1) equals the number of solutions
to the equation x · θx = 1, then show that εθ (π) = 1 for all irreducible representations
π. If this is true, show that μ(g) equals the number of solutions to x · θx = g for all
g ∈ G.

For example, if G = GL(n, Fq ), it was shown independently by Gow [57] and Kly-
achko [103] that the conclusions to Exercise 43.2 are satisfied when G = GL(n, Fq )
and θ is the automorphism g −→ t g −1 .
For the next group of exercises, the group B2k is a Coxeter group with generators

(13)(24), (35)(46), . . . , (2k − 3, 2k − 1)(2k − 2, 2k)

and (2k − 1, 2k). It is thus a Weyl group of Cartan type Bk with order 2k k!. It has
a linear character ξ2k having value −1 on these “simple reflections.” This is the
character (−1)^l(w) of Proposition 20.12. Let η2k = Ind^{S2k}_{B2k} (ξ2k ) be the character of
S2k induced from this linear character of B2k . The goal of this exercise will be to
prove analogs of Theorem 43.4 and the other results of this chapter for η2k .

Exercise 43.3. Prove the analog of Proposition 43.3. That is, show that the
restriction of η2r to S2r−1 is isomorphic to the character of S2r−1 induced from
the character η2r−2 of S2r−2 .

Let S2k be the set of partitions λ of 2k such that if μ is the conjugate partition,
then μi = λi + 1 for all i such that λi ≥ i. For example,
the partition λ = (5, 5, 4, 3, 3, 2) has conjugate (6, 6, 5, 3, 2), and the hypothesis is
satisfied. Visually, this assumption means that the diagram of λ can be assembled
from two congruent pieces, as in Fig. 43.1. We will describe these as the “top piece”
and the “bottom piece,” respectively.

Top Piece

Bottom
Piece

Fig. 43.1. The diagram of a partition of class S2k when k = 11

Let T2k+1 be the set of partitions of 2k + 1 with a diagram that contains an element
of S2k .

Exercise 43.4. Prove that if λ ∈ T2k+1 , then there are unique partitions μ ∈ S2k
and ν ∈ S2k+2 such that the diagram of λ contains the diagram of μ and is contained
in the diagram of ν. (Hint: The diagrams of the skew partitions λ − μ and ν − λ,
each consisting of a single node, must be corresponding nodes of the top piece and
bottom piece.)

Exercise 43.5. Let Σ be a set of partitions of 2k + 2. Assume that each partition


λ of 2k + 1 is contained in an element of Σ if and only if λ ∈ T2k+1 , in which case it
is contained in a unique element of Σ. Show that Σ = S2k+2 . [This is an analog of
Proposition 43.4. It is not necessary to assume any condition corresponding to (iii)
of the proposition.]

Exercise 43.6. Show that η2k is multiplicity-free and that the representations
occurring in it are precisely the sλ with λ ∈ S2k .
44
Some Symmetric Algebras

The results of the last chapter can be translated into statements about the
representation theory of U(n). For example, we will see that each irreducible
representation of U(n) occurs exactly once in the decomposition of the sym-
metric algebra of V ⊕ ∧2 V , where V = Cn is the standard module of U(n).
The results of this chapter are also proved in Goodman and Wallach [56],
Howe [77], Littlewood [120], and Macdonald [124]. See Theorem 26.6 and the
exercises to Chap. 26 for alternative proofs of some of these results.
Let us recall some ideas that already appeared in Chap. 38. If ρ : G −→
GL(V ) is a representation, then the symmetric and exterior powers ∨k V and
∧k V become modules for G, and we may ask for their decomposition into
irreducible representations of G. For some representations ρ, this question
will have a simple answer, and for others the answer will be complex. The
very simplest case is where G = GL(V ).
In this case, each ∨k V is itself irreducible, and each ∧k V is either irreducible
(if k ≤ dim(V )) or zero.
We can encode the solution to this question with generating functions




Pρ∨ (g; t) = Σ_{k=0}^∞ tr(g | ∨k V ) tk ,   Pρ∧ (g; t) = Σ_{k=0}^∞ tr(g | ∧k V ) tk .

Proposition 44.1. Suppose that ρ : G −→ GL(V ) is a representation and


γ1 , . . . , γd are the eigenvalues of ρ(g). Then
Pρ∨ (g, t) = ∏i (1 − tγi )−1 ,   Pρ∧ (g, t) = ∏i (1 + tγi ). (44.1)

Proof. The traces of ρ(g) on ∨k V and ∧k V are

hk (γ1 , . . . , γd ) and ek (γ1 , . . . , γd ),

so this is a restatement of (33.1) and (33.2).
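For diagonal g these generating functions are easy to test numerically. The sketch below takes two eigenvalues, truncates the symmetric series (the exterior series terminates by itself), and compares with the products in (44.1); the sample eigenvalues and t are our choices, small enough that the truncation error is negligible:

```python
# Eigenvalues gamma_1, gamma_2 of rho(g) and a value of t inside the
# region of convergence; these particular numbers are just a test case.
g1, g2, t = 0.4, 0.25, 0.5
# h_k(g1, g2) t^k summed over k, truncated far beyond machine precision:
sym_series = sum(sum(g1**i * g2**(k - i) for i in range(k + 1)) * t**k
                 for k in range(200))
# e_k vanishes for k > 2 when there are two eigenvalues:
alt_series = 1 + (g1 + g2) * t + (g1 * g2) * t**2
assert abs(sym_series - 1 / ((1 - t * g1) * (1 - t * g2))) < 1e-9
assert abs(alt_series - (1 + t * g1) * (1 + t * g2)) < 1e-9
```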




D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 455


DOI 10.1007/978-1-4614-8024-2 44, © Springer Science+Business Media New York 2013

We see that for all g, Pρ∨ (g, t) is convergent if |t| < max(|γi |)−1 and has
meromorphic continuation in t, while Pρ∧ (g, t) is a polynomial in t of degree
equal to the dimension of V . We will denote Pρ∨ (g) = Pρ∨ (g, 1) and Pρ∧ (g) =
Pρ∧ (g, 1). Then we specialize t = 1 in (44.1) and write
Pρ∨ (g) = ∏i (1 − γi )−1 ,   Pρ∧ (g) = ∏i (1 + γi ). (44.2)

For the first equation, this is understood to be an analytic continuation since


the series defining Pρ∨ might not converge when t = 1.

Proposition 44.2. Let V = Cn be regarded as a GL(n, C)-module in the


usual way. Then
∨k (∨2 V ) ≅ (⊗2k V ) ⊗C[S2k ] ω2k

as GL(n, C)-modules. It is the direct sum of the πλ as λ runs through all even
partitions of 2k.

Proof. Let Ctrivial denote C regarded as a trivial module of C[B2k ]. It is suffi-
cient to prove that

∨k (∨2 V ) ≅ (⊗2k V ) ⊗C[B2k ] Ctrivial (44.3)

as GL(n, C)-modules. Indeed, assuming this, the right-hand side is
isomorphic to

((⊗2k V ) ⊗C[S2k ] C[S2k ]) ⊗C[B2k ] Ctrivial ≅ (⊗2k V ) ⊗C[S2k ] (C[S2k ] ⊗C[B2k ] Ctrivial )
≅ (⊗2k V ) ⊗C[S2k ] ω2k .

To prove (44.3), we will use the universal properties of the symmetric
power and tensor products to construct inverse maps

∨k (∨2 V ) ←→ (⊗2k V ) ⊗C[B2k ] Ctrivial .

Here B2k ⊂ S2k acts on ⊗2k V on the right by the action (34.1).
First, we note that the map

(v1 , . . . , v2k ) −→ (v1 ∨ v2 ) ∨ · · · ∨ (v2k−1 ∨ v2k )

commutes with the right action of B2k . It is 2k-linear and hence induces a
map
α : ⊗2k V −→ ∨k (∨2 V ),   α(v1 ⊗ · · · ⊗ v2k ) = (v1 ∨ v2 ) ∨ · · · ∨ (v2k−1 ∨ v2k ),

and α(ξσ) = α(ξ) for σ ∈ B2k . Thus, the map

(⊗2k V ) × Ctrivial −→ ∨k (∨2 V ),   (ξ, t) → tα(ξ),

is C[B2k ]-balanced and there is an induced map

(⊗2k V ) ⊗C[B2k ] Ctrivial −→ ∨k (∨2 V ),
(v1 ⊗ · · · ⊗ v2k ) ⊗ t → t(v1 ∨ v2 ) ∨ · · · ∨ (v2k−1 ∨ v2k ).

As for the other direction, we first note that for v3 , v4 , . . . , v2k fixed, using
the fact that ⊗C[B2k ] is B2k -balanced, the map

(v1 , v2 ) −→ (v1 ⊗ v2 ⊗ v3 ⊗ · · · ⊗ v2k ) ⊗ 1 ∈ (⊗2k V ) ⊗C[B2k ] Ctrivial

is symmetric and bilinear, so there is induced a map

μv3 ,v4 ,...,v2k : ∨2 V −→ (⊗2k V ) ⊗C[B2k ] Ctrivial ,
μv3 ,v4 ,...,v2k (v1 ∨ v2 ) = (v1 ⊗ v2 ⊗ v3 ⊗ · · · ⊗ v2k ) ⊗ 1.

Now with ξ1 ∈ ∨2 V and v5 , . . . , v2k fixed, the map

(v3 , v4 ) −→ μv3 ,v4 ,...,v2k (ξ1 )

is symmetric and bilinear, so there is induced a map

νξ1 ,v5 ,...,v2k : ∨2 V −→ (⊗2k V ) ⊗C[B2k ] Ctrivial ,
νξ1 ,v5 ,...,v2k (v3 ∨ v4 ) = μv3 ,v4 ,...,v2k (ξ1 ).

With v5 , . . . , v2k fixed, denote by

μv5 ,...,v2k : ∨2 V × ∨2 V −→ (⊗2k V ) ⊗C[B2k ] Ctrivial

the map μv5 ,...,v2k (ξ1 , ξ2 ) = νξ1 ,v5 ,...,v2k (ξ2 ). Continuing in this way, we even-
tually construct a k-linear map μ : ∨2 V × · · · × ∨2 V −→ (⊗2k V ) ⊗C[B2k ] Ctrivial
such that

μ(v1 ∨ v2 , . . . , v2k−1 ∨ v2k ) = (v1 ⊗ · · · ⊗ v2k ) ⊗ 1.

Using the fact that ⊗C[B2k ] is B2k -balanced, the map μ is symmetric and hence
induces a map ∨k (∨2 V ) −→ (⊗2k V ) ⊗C[B2k ] Ctrivial that is the inverse of the
map previously constructed. We have now proved (44.3).


Theorem 44.1. Let V = Cn be regarded as a GL(n, C)-module in the usual


way. Then

∨k (∨2 V ) ≅ ⊕_{λ an even partition of 2k} πλ .

Proof. This follows from Proposition 44.2, Theorem 36.4, and the explicit
decomposition of Theorem 43.4.


Theorem 44.2 (D. E. Littlewood). Let α1 , . . . , αn be complex numbers,


|αi | < 1. Then
∏_{1≤i≤j≤n} (1 − αi αj )−1 = Σ_{λ even} sλ (α1 , . . . , αn ). (44.4)

The sum is over even partitions.
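Identity (44.4) can be tested numerically when n = 2: only partitions with at most two parts contribute, and for a ≥ b one has s(a,b) (x, y) = (xy)b h_{a−b} (x, y). The sketch below (the truncation bound and sample values are our choices) compares the two sides:

```python
def h(m, x, y):
    # Complete homogeneous polynomial h_m in two variables.
    return sum(x**i * y**(m - i) for i in range(m + 1))

x, y = 0.3, 0.2
lhs = 1.0 / ((1 - x * x) * (1 - x * y) * (1 - y * y))
# Sum s_(a,b)(x, y) over even partitions (a, b) with a <= 38; the tail
# beyond that is far below the tolerance for these values of x and y.
rhs = sum((x * y)**b * h(a - b, x, y)
          for a in range(0, 40, 2) for b in range(0, a + 1, 2))
assert abs(lhs - rhs) < 1e-9
```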

Proof. This follows on applying (44.2) to the symmetric square representation
by using Proposition 44.1 and the explicit decomposition of Theorem 44.1. 

Theorem 44.3 (D. E. Littlewood). Let α1 , . . . , αn be complex numbers,


|αi | < 1. Then
[∏_{1≤i≤n} (1 + αi )] [∏_{1≤i≤j≤n} (1 − αi αj )−1 ] = Σλ sλ (α1 , . . . , αn ).

The sum is over all partitions.

Proof. The coefficient of tk in

[∏_{1≤i≤n} (1 + tαi )] [∏_{1≤i≤j≤n} (1 − t2 αi αj )−1 ] = (Σk ek tk ) (Σr Σ_{λ an even partition of 2r} sλ t2r )

is

Σ_{2r≤k} ek−2r (α1 , . . . , αn ) Σ_{λ an even partition of 2r} sλ .

This is the image of Σ_{2r≤k} ek−2r ω2r under the characteristic map, and it equals
the sum of the sλ for all partitions of k by Theorem 43.5. Taking t = 1, the result
follows.


A polynomial character of GL(n, C) is one with matrix coefficients that are
polynomials in the coordinate functions gij not involving det−1 . As we know,
they are exactly the characters of πλ where λ = (λ1 , . . . , λn ) is a partition.
We may express Theorem 44.3 as saying that every polynomial character of
GL(n, C) occurs exactly once in the algebra (∧V ) ⊗ ∨(∨2 V ), which is the
tensor product of the exterior algebra over V with the symmetric algebra over
the symmetric square representation.
There are dual forms of these results. Let ω̃2k = Ind^{S2k}_{B2k} (ε) be the character
of S2k obtained by inducing the alternating character ε from B2k .

Proposition 44.3. The character ω̃2k is the sum of the sλ , where λ runs
through all the partitions of 2k such that the conjugate partition λt is even.

Proof. This may be deduced from Theorem 43.4 as follows. Applying this with
G = S2k , H = B2k , and ρ = ε, we see that ω̃2k is the same as ω2k multiplied
by the character ε. By Theorem 37.4, this is ι ω2k , and by Theorems 43.4 and
35.2, this is the sum of the sλ with λt even.


Theorem 44.4. Let V = Cn be regarded as a GL(n, C)-module in the usual


way. Then

∨k (∧2 V ) ≅ (⊗2k V ) ⊗C[S2k ] ω̃2k

as GL(n, C)-modules. It is the direct sum of the πλ as λ runs through all
conjugates of even partitions of 2k.

Proof. Similar to Proposition 44.2 and Theorem 44.1.




Theorem 44.5 (D. E. Littlewood). Let α1 , . . . , αn be complex numbers,


|αi | < 1. Then
∏_{1≤i<j≤n} (1 − αi αj )−1 = Σ_{λt even} sλ (α1 , . . . , αn ). (44.5)

The sum is over partitions λ such that λt is even.

Proof. Similar to Theorem 44.2.




Theorem 44.6 (D. E. Littlewood). Let α1 , . . . , αn be complex numbers,


|αi | < 1. Then
[∏_{1≤i≤n} (1 − αi )−1 ] [∏_{1≤i<j≤n} (1 − αi αj )−1 ] = Σλ sλ (α1 , . . . , αn ).

The sum is over all partitions.

Proof. Similar to Theorem 44.3, and actually equivalent to Theorem 44.3
using the identity (1 + αi )(1 − αi2 )−1 = (1 − αi )−1 .


Exercises
Exercise 44.1. Let η2k be the character of S2k from the exercises of the last chapter,
and let S2k be the set of partitions of 2k defined there. Show that

∧k (∧2 V ) ≅ (⊗2k V ) ⊗C[S2k ] η2k ,

and deduce that

∧k (∧2 V ) ≅ ⊕_{λ∈S2k} πλ .

Prove also that

∧k (∨2 V ) ≅ ⊕_{λt ∈S2k} πλ .

Exercise 44.2. Prove the identities

∏_{1≤i<j≤n} (1 + αi αj ) = Σk Σ_{λ∈S2k} sλ (α1 , . . . , αn ),

∏_{1≤i≤j≤n} (1 + αi αj ) = Σk Σ_{λt ∈S2k} sλ (α1 , . . . , αn ).

Explain why, in contrast with (44.4) and (44.5), there are only finitely many nonzero
terms on the right-hand side in these identities.
45
Gelfand Pairs

We recall that a representation θ of a compact group G is called multiplicity-
free if in its decomposition into irreducibles,

θ = ⊕i di πi , (45.1)

each irreducible representation πi occurs with multiplicity di = 0 or 1.


A common situation that we have seen already several times is for a group
G ⊃ H to have the property that for some representation τ of H the induced
representation Ind^G_H (τ ) is multiplicity-free.
In this chapter we will see how the question of showing that a represen-
tation is multiplicity-free leads to the consideration of a Hecke algebra. If the
Hecke algebra is commutative, the representation is multiplicity free. If it is
not commutative, it may also have an interesting structure, as we will see in
the next chapter. Another approach to multiplicity-free representations may
be seen in Theorem 26.7.
Of course, we have only defined induced representations when H and G are
finite. Assuming H and G are finite, saying that Ind^G_H (τ ) is multiplicity-free
means that each irreducible representation π of G, when restricted to H, can
contain at most one copy of τ , and formulated this way, the statement makes
sense even if H and G are infinite.
The most striking examples we have seen are when H = Sk−1 and G = Sk
and when H = U(n − 1) and G = U(n). In these examples every irreducible
representation τ of H has this “multiplicity one” property. Such examples are
fairly rare. A far more common circumstance is for a single representation
τ of H to have the multiplicity one property. For example, we showed in
Theorem 43.4 that inducing the trivial representation from the group B2k of
S2k produces a multiplicity-free representation. However, this would not be
true for some other irreducible representations.


Proposition 45.1. Suppose θ is a representation of a finite group G.


A necessary and sufficient condition that θ be multiplicity-free is that the ring
EndG (θ) be commutative.


Proof. In the decomposition (45.1), we have EndG (θ) = ⊕i Matdi (C). This is
commutative if and only if all di ≤ 1.


Let G be a group, finite for the time being, and H a subgroup. Then (G, H)
is called a Gelfand pair if the representation of G induced by the trivial repre-
sentation of H is multiplicity-free. We also refer to H as a Gelfand subgroup.
More generally, if π is an irreducible representation of H, then (G, H, π) is
called a Gelfand triple if π G is multiplicity-free. See Gross [59] for a lively
discussion of Gelfand pairs.
From Proposition 45.1, Gelfand pairs are characterized by the commuta-
tivity of the endomorphism ring of an induced representation. To study it, we
make use of Mackey theory.

Proposition 45.2. Let G be a finite group, and let H1 , H2 , H3 be subgroups.


Let (πi , Vi ) be complex representations of H1 , H2 , and H3 and let L1 :
V1G −→ V2G and L2 : V2G −→ V3G be intertwining operators. Let Δ1 :
G −→ Hom(V1 , V2 ) and Δ2 : G −→ Hom(V2 , V3 ) correspond to L1 and L2
as in Theorem 32.1. Then Δ2 ∗ Δ1 : G −→ Hom(V1 , V3 ) corresponds to
L2 ◦ L1 : V1G −→ V3G , where the convolution is

Δ2 ∗ Δ1 (g) = Σ_{γ∈H2 \G} Δ2 (gγ −1 ) ◦ Δ1 (γ).

Proof. Note that, using (32.9), the summand Δ2 (gγ −1 )Δ1 (γ) does not depend
on the choice of representative γ ∈ H2 \G. The result is easily checked.


Theorem 45.1. Let H be a subgroup of the finite group G, and let (π, V ) be
a representation of H. Then (G, H, π) is a Gelfand triple if and only if the
convolution algebra H of functions Δ : G −→ EndC (V ) satisfying

Δ(h2 gh1 ) = π(h2 ) ◦ Δ(g) ◦ π(h1 ), h1 , h2 ∈ H,

is commutative.

We call a convolution ring H of this type a Hecke algebra.

Proof. By Proposition 45.2, this condition is equivalent to the commutativity


of the endomorphism ring EndG (V G ), so this follows from Proposition 45.1.



In this chapter, an involution of a group G is a map ι : G → G of order 2


that is anticommutative:
ι(g1 g2 ) = ιg2 ιg1 .

Similarly, an involution of a ring R is an additive map of order 2 that is


anticommutative for the ring multiplication.
A common method of proving that such a ring is commutative is to exhibit
an involution and then show that this involution reduces to the identity map.

Theorem 45.2. Let H be a subgroup of the finite group G, and suppose that
G admits an involution fixing H such that every double coset of H is invariant:
HgH = H ι g H. Then H is a Gelfand subgroup.

Proof. The ring H of Theorem 45.1 is just the convolution ring of H-bi-
invariant functions on G. We have an involution on this ring:
ιΔ(g) = Δ(ιg).

It is easy to check that


ι(Δ1 ∗ Δ2 ) = ιΔ2 ∗ ιΔ1 .

On the other hand, each Δ is constant on each double coset, and these are
invariant under ι by hypothesis, so ι is the identity map. This proves that H
is commutative, so (G, H) is a Gelfand pair.


Let Sn denote the symmetric group. We can embed Sn × Sm → Sn+m by


letting Sn act on the first n elements of the set {1, 2, 3, . . . , n + m} and letting
Sm act on the last m elements.

Proposition 45.3. The subgroup Sn × Sm is a Gelfand subgroup of Sn+m .

We already know this: the representation of Sn+m induced from the trivial
character of Sn × Sm is the product in the ring R of hn by hm . By Pieri’s
formula, one computes, assuming without loss of generality that n ≥ m,

hn hm = Σ_{k=0}^{m} s(n+m−k,k) .

Thus, the induced representation is multiplicity-free. We prove this again to


illustrate Theorem 45.2.

Proof. Let H = Sn × Sm and G = Sn+m . We take the involution ι in Theorem


45.2 to be the inverse map g −→ g −1 . We must check that each double coset
is ι-stable.
It will be convenient to represent elements of Sn+m by permutation
matrices. We will show that each double coset HgH has a representative
of the form
⎛ ⎞
Ir 0 0 0
⎜ 0 0n−r 0 In−r ⎟
⎜ ⎟
⎝ 0 0 Im−n+r 0 ⎠ . (45.2)
0 In−r 0 0n−r

Here In and 0n are the n × n identity and zero matrices, and the remaining
0 matrices are rectangular blocks.
We start with g in block form,

⎛A B⎞
⎝C D⎠ ,

where A, B, C, and D are subpermutation matrices—that is, matrices with


only 1’s and 0’s, and with at most one nonzero entry in each row and column.
Here A is n × n and D is m × m. Let r be the rank of A. Then clearly B and
C both must have rank n − r, and so D has rank m − n + r.
Multiplying A on the left by an element of Sn , we may arrange its rows so
that its nonzero entries lie in the first r rows. Then multiplying on the right by
an element of Sn , we may put these in the upper left-hand corner. Similarly,
we may arrange that D has its nonzero entries in the upper left-hand corner.
Now the form of the matrix is
⎛ ⎞
Tr 0 0 0
⎜ 0 0n−r 0 Un−r ⎟
⎜ ⎟,
⎝ 0 0 Vm−n+r 0 ⎠
0 Wn−r 0 0n−r

where the sizes of the square blocks are indicated by subscripts. The matrices
T , U , V , and W are permutation matrices (invertible). Left multiplication by an
element of Sr × Sn−r × Sm−n+r × Sn−r can now replace these four matrices
by identity matrices. This proves that (45.2) is a complete set of double coset
representatives.
Since these double coset representatives are all invariant under the invo-
lution, by Theorem 45.2 it follows that Sn × Sm is a Gelfand subgroup.
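The conclusion can also be checked by brute force for small n and m: by Theorem 45.2 it suffices that g−1 ∈ HgH for every g, and the sketch below (the function name is ours) verifies exactly that:

```python
from itertools import permutations

def check_gelfand(n, m):
    # H = S_n x S_m embedded in S_{n+m}; permutations are tuples on 0..n+m-1.
    N = n + m
    def compose(p, q):          # (p o q)(i) = p[q[i]]
        return tuple(p[q[i]] for i in range(N))
    def inverse(p):
        inv = [0] * N
        for i, pi in enumerate(p):
            inv[pi] = i
        return tuple(inv)
    H = [tuple(a) + tuple(n + c for c in cm)
         for a in permutations(range(n))
         for cm in permutations(range(m))]
    for g in permutations(range(N)):
        coset = {compose(h1, compose(g, h2)) for h1 in H for h2 in H}
        if inverse(g) not in coset:   # double coset not inverse-stable
            return False
    return True

assert check_gelfand(2, 2) and check_gelfand(3, 2)
```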


Proposition 45.4. Suppose that (G, H, ψ) is a Gelfand triple, and let (π, V )
be an irreducible representation of G. Then there exists at most one space M
of functions on G satisfying

M (hg) = ψ(h)M (g), (h ∈ H) , (45.3)

such that M is closed under right translation and such that the representation
of G on M by right translation is isomorphic to π.

The space M is called a model of π, meaning a concrete realization of the


representation in a space of functions on G.

Proof. This is just Frobenius reciprocity. The space of functions satis-
fying (45.3) is Ind^G_H (ψ), so M, if it exists, is the image of an element of
HomG (V, Ind^G_H (ψ)). This is one-dimensional since the induced representation
is assumed to be multiplicity-free.


We turn now to Gelfand pairs in compact groups. We will obtain a result


similar to Theorem 45.1 by a different method.
Let C(G) be the space of continuous functions on the compact group G.
It is a ring (without unit) under convolution. If φ ∈ C(G), and if (π, V ) is
a finite-dimensional representation, let π(φ) : V −→ V denote the endomor-
phism
π(φ) v = ∫G φ(g) π(g) v dg.
One checks easily that if φ, ψ ∈ C(G), then

π(φ ∗ ψ) = π(φ) ◦ π(ψ).

Let H be a closed subgroup of G. Let H be the subring of C(G) consisting


of functions that are both left- and right-invariant under H. If (π, V ) is a
representation of G, let V H denote the space of H-fixed vectors.
Theorem 45.3. Let H be a closed subgroup of the compact group G. Let H
be the subring of C(G) consisting of functions that are both left- and right-
invariant under H. If H is commutative, then V H is at most one-dimensional
for every irreducible representation (π, V ) of G.
In this case, extending the definition from the case of finite groups, we say
(G, H) is a Gelfand pair or that H is a Gelfand subgroup of G.
Proof. Let ξ, η ∈ V H . For g ∈ G, let

φξ,η (g) = ⟨π(g)ξ, η⟩ ,

where ⟨ , ⟩ is an invariant inner product on V (Proposition 2.1). It is easy to


see that φξ,η ∈ H. We will prove that

π(φξ,η ) v = (1/ dim V ) ⟨v, ξ⟩ η. (45.4)

Indeed, taking the inner product of the left-hand side with an arbitrary vector
θ ∈ V , Schur orthogonality (Theorem 2.4) gives

⟨π(φξ,η )v, θ⟩ = ∫G ⟨π(g) v, θ⟩ ⟨π(g)ξ, η⟩ dg = (1/ dim V ) ⟨v, ξ⟩ ⟨η, θ⟩ ,

and since this is true for every θ, we have (45.4).


Now we show that the image of π(φη,ξ ∗ φξ,η ) is Cξ. Indeed, applying (45.4)
twice, we see that

π(φη,ξ ∗ φξ,η ) v = π(φη,ξ ) ◦ π(φξ,η ) v = (1/ dim(V )2 ) ⟨v, ξ⟩ ⟨η, η⟩ ξ.

The image of this is contained in the linear span of ξ, and taking v = ξ shows
that the map is nonzero. Since H is assumed commutative, this also equals
π(φξ,η ∗ φη,ξ ). Hence, its image is also equal to Cη, and so we see that ξ and
η both belong to the same one-dimensional subspace of V .


To give an example where we can verify the hypotheses of Theorem 45.3,


let G = SO(n + 1), and let H = SO(n), which we embed into the upper
left-hand corner of G:

        ⎛ g 0 ⎞
g −→  ⎝ 0 1 ⎠ .
We also embed K = SO(2) into the lower right-hand corner:
⎛ a b ⎞        ⎛ In−1        ⎞
⎝ −b a ⎠ −→  ⎜        a  b ⎟ .        (45.5)
                ⎝       −b  a ⎠

Proposition 45.5. With G = SO(n + 1), H = SO(n), and K = SO(2) embedded as explained above, every double coset in H\G/H has a representative in K.

Proof. Let g ∈ G. Write the last column of g in the form

    ⎛ bv₁ ⎞
    ⎜  ⋮  ⎟  =  ⎛ bv ⎞ ,      v = ⎛ v₁ ⎞
    ⎜ bvₙ ⎟     ⎝ a  ⎠            ⎜ ⋮  ⎟ ,
    ⎝  a  ⎠                       ⎝ vₙ ⎠

where b² + a² = 1 and v has length 1. Complete v to an orthogonal matrix h ∈ H whose last column is v. Then it is simple to check that the last column of h⁻¹g is

    ⎛ 0 ⎞
    ⎜ ⋮ ⎟
    ⎜ 0 ⎟ ,
    ⎜ b ⎟
    ⎝ a ⎠

so with k the matrix in (45.5), the last column of k⁻¹h⁻¹g is

    ξ₀ = ⎛ 0 ⎞
         ⎜ ⋮ ⎟     (45.6)
         ⎜ 0 ⎟ .
         ⎝ 1 ⎠

This implies that k⁻¹h⁻¹g lies in O(n), embedded in the upper left-hand corner; since its determinant is 1, it lies in H, so g and k lie in the same double coset.
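The proof is effective: from the last column of g one can write down h and k explicitly. The following numerical sketch (hypothetical code, not from the text) carries this out for n = 2, that is, in SO(3):

```python
import math

# Numerical sketch of the proof for n = 2 (hypothetical code, not from the
# text): given g in SO(3), construct h in H = SO(2) (upper left) and
# k in K = SO(2) (lower right, as in (45.5)) with k^{-1} h^{-1} g fixing e_3.

def mult(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):   # equals the inverse for rotation matrices
    return [[A[j][i] for j in range(3)] for i in range(3)]

# A sample rotation g = R_x(0.7) R_z(0.4).
c1, s1 = math.cos(0.7), math.sin(0.7)
c2, s2 = math.cos(0.4), math.sin(0.4)
Rx = [[1, 0, 0], [0, c1, -s1], [0, s1, c1]]
Rz = [[c2, -s2, 0], [s2, c2, 0], [0, 0, 1]]
g = mult(Rx, Rz)

# Last column of g is (b v1, b v2, a) with b^2 + a^2 = 1 and |v| = 1.
bv1, bv2, a = g[0][2], g[1][2], g[2][2]
b = math.hypot(bv1, bv2)
v1, v2 = (bv1 / b, bv2 / b) if b > 1e-12 else (0.0, 1.0)

h = [[v2, v1, 0], [-v1, v2, 0], [0, 0, 1]]   # element of H sending e_2 to v
k = [[1, 0, 0], [0, a, b], [0, -b, a]]       # element of K as in (45.5)

r = mult(transpose(k), mult(transpose(h), g))  # k^{-1} h^{-1} g
err = max(abs(r[0][2]), abs(r[1][2]), abs(r[2][2] - 1))
print(err < 1e-12)  # True
```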

Theorem 45.4. The subgroup SO(n) of SO(n + 1) is a Gelfand subgroup.



Proof. With G = SO(n + 1), H = SO(n), and K = SO(2) embedded as explained above, we exhibit an involution of G, namely

    g ↦ ⎛ Iₙ    ⎞ ᵗg ⎛ Iₙ    ⎞
        ⎝    −1 ⎠    ⎝    −1 ⎠ .

Since it involves a transpose, this involution reverses multiplication. It maps H to itself and is the identity on the matrices in the image of K. Since every double coset in H\G/H has a representative in K, the map that the involution induces on the ring H is the identity; because that map reverses the convolution, H is therefore commutative.


Now let us think a bit about what this means in concrete terms. The
quotient G/H may be identified with the sphere S n . Indeed, thinking of S n
as the unit sphere in Rn+1 , G acts transitively and H is the stabilizer of a
point in S n .
Consequently, we have an action of G on L2 (S n ), and this may be thought
of as the representation induced from the trivial representation of O(n).

Theorem 45.5. Let (π, V) be an irreducible representation of O(n + 1). Then there exists at most one subspace of L²(Sⁿ) that is invariant under the action of O(n + 1) and affords a representation isomorphic to π.

This gives us a concrete model for at least some representations of O(n + 1).

Proof. Let φ : V → L²(Sⁿ) be an intertwining operator. It is sufficient to show that φ is uniquely determined up to a constant multiple. The O(n + 1)-equivariance of φ amounts to the formula

    φ(π(g)v)(x) = φ(v)(g⁻¹x)    (45.7)

for g ∈ O(n + 1), v ∈ V, and x ∈ Sⁿ.
Let ⟨·, ·⟩ be an invariant Hermitian form on V. This form is nondegenerate, so each linear functional on V is of the form v ↦ ⟨v, η⟩ for some vector η. In particular, with ξ₀ ∈ Sⁿ as in (45.6), there exists a vector η ∈ V such that

    φ(v)(ξ₀) = ⟨v, η⟩.

By (45.7), we have

    φ(v)(gξ₀) = ⟨π(g⁻¹)v, η⟩ = ⟨v, π(g)η⟩.

This makes it clear that φ is determined by η, and it also shows that η is O(n)-invariant since ξ₀ ∈ Sⁿ is O(n)-fixed. Since the space of O(n)-fixed vectors is at most one-dimensional, the theorem is proved.


Proposition 45.6. If g ∈ U(n), then there exist k1 and k2 ∈ O(n) such that
k1 gk2 is diagonal.

Proof. Let x = g ᵗg. This is a unitary symmetric matrix. By Proposition 28.2, there exists k₁ ∈ O(n) such that k₁xk₁⁻¹ is diagonal. It is unitary, so its diagonal entries have absolute value 1. Taking their square roots, we find a unitary diagonal matrix d such that k₁xk₁⁻¹ = d². This means that (d⁻¹k₁g) ᵗ(d⁻¹k₁g) = I, so k₂⁻¹ = d⁻¹k₁g is orthogonal and k₁gk₂ = d.


Theorem 45.6. The group O(n) is a Gelfand subgroup of U(n).


Proof. Let G = U(n) and H = O(n), and let H be the ring of Theorem 45.3.
The transpose involution of G preserves H and thus induces an involution
of H. By Proposition 45.6, every double coset in H\G/H has a diagonal
representative, so this involution is the identity map, and it follows that H is
commutative. Therefore, H is a Gelfand subgroup.


Exercises
Exercise 45.1. Let G be any compact group. Let H = G × G, and embed G into
H diagonally, that is, by the map g −→ (g, g). Use the involution method to prove
that G is a Gelfand subgroup of H.

Exercise 45.2. Use the involution method to show that O(n) is a Gelfand subgroup of U(n).

Exercise 45.3. Show that each irreducible representation of O(3) has an O(2)-fixed
vector, and deduce that L2 (S 2 ) is the (Hilbert space) direct sum of all irreducible
representations of O(3), each with multiplicity one.

Exercise 45.4 (Gelfand and Graev). Let G = GL(n, F_q) and let N be the subgroup of upper triangular unipotent matrices. Let ψ : F_q → C^× be a nontrivial additive character. Define a character ψ_N of N by

    ψ_N ⎛ 1 x₁₂ x₁₃ ⋯ x₁ₙ ⎞
        ⎜   1   x₂₃ ⋯ x₂ₙ ⎟
        ⎜       1   ⋱  ⋮  ⎟  = ψ(x₁₂ + x₂₃ + ⋯ + x_{n−1,n}).
        ⎝              1  ⎠

The object of this exercise is to show that Ind_N^G(ψ_N) is multiplicity-free. This Gelfand–Graev representation is important because it contains most irreducible representations of the group; those it contains are therefore called generic. We will denote by Φ the root system of GL(n, F_q) and by Φ⁺ the positive roots α_{ij} with i < j. Let Σ be the set of simple positive roots α_{i,i+1}.
(i) Show that each double coset in N \G/N has a representative m that is a mono-
mial matrix. In the notation of Chap. 27, this means that m ∈ N (T ), where T
is the group of diagonal matrices. (Make use of the Bruhat decomposition.) Let
w ∈ W = N (T )/T be the corresponding Weyl group element.

(ii) Suppose that the double coset NwN supports an intertwining operator Ind(ψ_N) → Ind(ψ_N). (See Remark 32.2.) Show that if α ∈ Σ and w(α) ∈ Φ⁺, then w(α) ∈ Σ. (Otherwise, choose x in the unipotent subgroup corresponding to the root α such that mx = ym with ψ_N(x) ≠ 1 and ψ_N(y) = 1, and, applying Δ as in Theorem 32.1, obtain a contradiction.)
(iii) Deduce from (ii) that there exist integers n₁, . . ., n_r with ∑ nᵢ = n such that

    m = ⎛             M_r ⎞
        ⎜         ⋰      ⎟
        ⎜    M₂          ⎟ ,
        ⎝ M₁             ⎠

where Mᵢ is an nᵢ × nᵢ diagonal matrix.
(iv) Again make use of the assumption that N wN supports an intertwining operator
to show that Mi is a scalar matrix.
(v) Define an involution ι of G by

    g ↦ w₀ ᵗg w₀ ,     w₀ = ⎛     1 ⎞
                             ⎜  ⋰   ⎟ .
                             ⎝ 1     ⎠

Note that N and its character ψ_N are invariant under ι. Interpret (iv) as showing that every double coset that supports an intertwining operator Ind(ψ_N) → Ind(ψ_N) has a representative that is invariant under ι, and deduce that End_G(Ind(ψ_N)) is commutative and that Ind(ψ_N) is multiplicity-free.
46
Hecke Algebras

A Coxeter group (Chap. 25) is a group W which may be given the following description. The group W has generators sᵢ (i = 1, 2, . . ., r) with relations sᵢ² = 1 and, for each pair of indices i and j, the "braid relations"

    sᵢsⱼsᵢ ⋯ = sⱼsᵢsⱼ ⋯ ,

where the number of terms on both sides is the same integer n(i, j). An example is the symmetric group S_k, where sᵢ is the transposition (i, i + 1). In this case r = k − 1.
Given a Coxeter group W, we may deform its group algebra as follows. Let H(W) be the ring with generators tᵢ satisfying the same braid relations

    tᵢtⱼtᵢ ⋯ = tⱼtᵢtⱼ ⋯ ,

but with the relation sᵢ² = 1 replaced by the more general relation

    tᵢ² = (q − 1)tᵢ + q.

The parameter q may be a complex number or an indeterminate. If q = 1, we recover the group algebra of W.
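It may help to note (a standard observation, not from the text) that the deformed quadratic relation factors:

    tᵢ² − (q − 1)tᵢ − q = (tᵢ − q)(tᵢ + 1) = 0,

so in any representation of the deformed algebra each tᵢ has eigenvalues among {q, −1}; at q = 1 these become the eigenvalues ±1 of the reflection sᵢ.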
Hecke algebras are ubiquitous. They arise in various seemingly different
ways: as endomorphism rings of induced representations for the groups of
Lie type such as GL(k, Fq ) (Iwahori [84], Howlett and Lehrer [80]); as
convolution rings of functions on p-adic groups (Iwahori and Matsumoto [86]);
as rings of operators acting on the equivariant K-theory of flag varieties
(Lusztig [122], Kazhdan and Lusztig [98]); as rings of transfer matrices in
statistical mechanics and quantum mechanics (Temperley and Lieb [160],
Jimbo [90]), in knot theory (Jones [91]), and other areas. It is the con-
text for defining the Kazhdan–Lusztig polynomials, which occur in seemingly
unrelated questions in representation theory, geometry and combinatorics [97].
Some of these different occurrences of Hecke algebras may seem unrelated to each other, but this impression can be an illusion: in fact, deep and surprising connections exist between them.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 471


DOI 10.1007/978-1-4614-8024-2 46, © Springer Science+Business Media New York 2013
472 46 Hecke Algebras

Following Iwahori [84], we will study a certain “Hecke algebra” Hk (q) that,
as we will see, is isomorphic to the Hecke algebra of the symmetric group Sk .
The ring Hk (q) can actually be defined if q is any complex number, but if q is a
prime power, it has a representation-theoretic interpretation. We will see that
it is the endomorphism ring of the representation of G = GL(k, Fq ), where
Fq is the finite field with q elements, induced from the trivial representation
of the Borel subgroup B of upper triangular matrices in G. The fact that it
is a deformation of C[Sk ] amounts to a parametrization of a certain set of
irreducible representations of G—the so-called unipotent ones—by partitions.
If instead of G = GL(k, Fq ) we take G = GL(k, Qp ), where Qp is the
p-adic field, and we take B to be the Iwahori subgroup consisting of elements
g of K = GL(k, Zp ) that are upper triangular modulo p, then one obtains the
affine Hecke algebra, which is similar to Hk (q) but infinite-dimensional. It was
introduced by Iwahori and Matsumoto [86]. The role of the Bruhat decomposition in the proofs requires a generalization of the Tits system described in Iwahori [85]. This Hecke algebra contains a copy of Hk(p). On the other hand,
it also contains the ring of K-bi-invariant functions, the so-called spherical
Hecke algebra (Satake [143], Tamagawa [158]). The spherical Hecke algebra is
commutative since K is a Gelfand subgroup of G. The spherical Hecke algebra
is (when k = 2) essentially the portion corresponding to the prime p of the
original Hecke algebra introduced by Hecke [65] to explain the appearance
of Euler products as the L-series of automorphic forms. See Howe [76] and
Rogawski [137] for the representation theory of the affine Hecke algebra.
Let F be a field. Let G = GL(k, F ) and, as in Chap. 27, let B be the
Borel subgroup of upper triangular matrices in G. A subgroup P containing
B is called a standard parabolic subgroup. (More generally, any conjugate of a
standard parabolic subgroup is called parabolic.)
Let k₁, . . ., k_r be positive integers such that ∑ᵢ kᵢ = k. Then S_k has a subgroup isomorphic to S_{k₁} × ⋯ × S_{k_r} in which the first S_{k₁} acts on {1, . . ., k₁}, the second S_{k₂} acts on {k₁ + 1, . . ., k₁ + k₂}, and so forth. Let Σ denote the set of k − 1 transpositions {(1, 2), (2, 3), . . ., (k − 1, k)}.

Lemma 46.1. Let J be any subset of Σ. Then there exist integers k₁, . . ., k_r such that the subgroup of S_k generated by J is S_{k₁} × ⋯ × S_{k_r}.

Proof. If J contains (1, 2), (2, 3), . . . , (k1 − 1, k1 ), then the subgroup they
generate is the symmetric group Sk1 acting on {1, . . . , k1 }. Taking k1 as large
as possible, assume that J omits (k1 , k1 + 1). Taking k2 as large as possible
such that J contains (k1 + 1, k1 + 2), . . . , (k1 + k2 − 1, k1 + k2 ), the subgroup
they generate is the symmetric group Sk2 acting on {k1 + 1, . . . , k1 + k2 }, and
so forth. Thus J contains generators of each factor in Sk1 × · · · × Skr and does
not contain any element that is not in this product, so this is the group it
generates.
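The proof can be read as a one-pass algorithm: scan the positions 1, . . ., k − 1 and cut a new block wherever (i, i + 1) is missing from J. A minimal sketch (hypothetical helper, not from the text):

```python
# Sketch of Lemma 46.1 (hypothetical helper, not from the text): given a
# set J of adjacent transpositions (i, i+1) of {1, ..., k}, recover the
# block sizes k_1, ..., k_r with <J> = S_{k_1} x ... x S_{k_r}.

def block_sizes(k, J):
    adjacents = {i for (i, j) in J}      # J given as 1-based pairs (i, i+1)
    sizes, current = [], 1
    for i in range(1, k):
        if i in adjacents:               # (i, i+1) in J glues i and i+1
            current += 1
        else:                            # block boundary between i and i+1
            sizes.append(current)
            current = 1
    sizes.append(current)
    return sizes

# J = {(1,2), (2,3), (5,6)} inside S_6 generates S_3 x S_1 x S_2:
print(block_sizes(6, {(1, 2), (2, 3), (5, 6)}))  # [3, 1, 2]
```

(Trivial factors S₁ may occur, as in this sample; they do not affect the product.)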


The notations from Chap. 27 will also be followed. Let T be the maximal torus of diagonal elements in G, N the normalizer of T, and W = N/T the Weyl group. Moreover, Φ will be the set of all roots, Φ⁺ the positive roots, and Σ the simple positive roots. Concretely, the elements of Φ are the k² − k rational characters of T of the form

    α_{ij} ⎛ t₁        ⎞
           ⎜    ⋱     ⎟  = tᵢtⱼ⁻¹,
           ⎝        t_k ⎠

where 1 ≤ i, j ≤ k and i ≠ j; Φ⁺ consists of the α_{ij} with i < j, and Σ = {α_{i,i+1}}. Identifying W with S_k, the set Σ in Lemma 46.1 is then the set of simple reflections.
Let J be any subset of Σ. Let W_J be the subgroup of W generated by the s_α with α ∈ J. Then, by Lemma 46.1, we have (for suitable kᵢ)

    W_J ≅ S_{k₁} × ⋯ × S_{k_r}.    (46.1)
Let N_J be the preimage of W_J in N under the canonical projection to W. Let P_J be the group generated by B and N_J. Then

    P_J = ⎧⎛ G₁₁ G₁₂ ⋯ G₁ᵣ ⎞⎫
          ⎪⎜  0  G₂₂ ⋯ G₂ᵣ ⎟⎪
          ⎨⎜  ⋮   ⋮  ⋱  ⋮  ⎟⎬ ,    (46.2)
          ⎪⎝  0   0  ⋯ Gᵣᵣ ⎠⎪
          ⎩                  ⎭

where each G_{ij} is a kᵢ × kⱼ block. The group P_J is a semidirect product P_J = M_J U_J = U_J M_J, where M_J is characterized by the condition that G_{ij} = 0 unless i = j, and the normal subgroup U_J is characterized by the condition that each G_{ii} is the identity matrix in GL(kᵢ). The groups P_J with J a proper subset of Σ are called the standard parabolic subgroups, and more generally any subgroup conjugate to a P_J is called parabolic. The subgroup U_J is the unipotent radical of P_J (that is, its maximal normal unipotent subgroup), and M_J is called the standard Levi subgroup of P_J. Evidently,

    M_J ≅ GL(k₁, F) × ⋯ × GL(k_r, F).    (46.3)
Any subgroup conjugate in PJ to MJ (which is not normal) would also be
called a Levi subgroup.
As in Chap. 27, we note that a double coset BωB, or more generally PI ωPJ
with I, J ⊂ Σ, does not depend on the choice ω ∈ N of representative for an
element w ∈ W , and we will use the notation BwB = C(w) or PI wPJ for
this double coset. Let BJ = MJ ∩ B. This is the standard “Borel subgroup”
of MJ .
Proposition 46.1.
(i) Let J ⊆ Σ. Then

    M_J = ⋃_{w ∈ W_J} B_J w B_J    (disjoint).

(ii) Let I, J ⊆ Σ. Then, if w ∈ W, we have

    B W_I w W_J B = P_I w P_J.    (46.4)

(iii) The canonical map w ↦ P_I w P_J from W to P_I\G/P_J induces a bijection

    W_I\W/W_J ≅ P_I\G/P_J.

Proof. For (i), we have (46.3) for suitable kᵢ. Now B_J is the direct product of the Borel subgroups of these GL(kᵢ, F), and W_J is the direct product (46.1). Part (i) follows directly from the Bruhat decomposition for GL(kᵢ, F) as proved in Chap. 27.
As for (ii), since BW_I ⊂ P_I and W_JB ⊂ P_J, we have BW_IwW_JB ⊆ P_IwP_J. To prove the opposite inclusion, we first note that

    wBW_J ⊆ BwW_JB.    (46.5)

Indeed, any element of W_J can be written as s₁ ⋯ s_r, where sᵢ = s_{αᵢ} with αᵢ ∈ J. Using Axiom TS3 from Chap. 27, we have

    wBs₁ ⋯ s_r ⊆ BwBs₂ ⋯ s_rB ∪ Bws₁Bs₂ ⋯ s_rB,

and, by induction on r, both sets on the right are contained in BwW_JB. This proves (46.5). A similar argument shows that

    W_IBwW_J ⊆ BW_IwW_JB.    (46.6)

Now, using (i),

    P_IwP_J = U_IM_IwM_JU_J ⊂ U_IB_IW_IB_IwB_JW_JB_JU_J ⊂ BW_IBwBW_JB.

Applying (46.5) and (46.6), we obtain BW_IwW_JB ⊇ P_IwP_J, whence (46.4).
As for (iii), since by the Bruhat decomposition w ↦ BwB is a bijection W → B\G/B, (46.4) implies that w ↦ P_IwP_J induces a bijection W_I\W/W_J → P_I\G/P_J.

To proceed further, we will assume that F = F_q is a finite field. We recall from Chap. 34 that R_k denotes the free Abelian group generated by the isomorphism classes of irreducible representations of the symmetric group S_k or, as we sometimes prefer, the additive group of generalized characters. It can be identified with the character ring of S_k. However, we do not need its ring structure, only its additive structure and its inner product, in which the distinct isomorphism classes of irreducible representations form an orthonormal basis.
Similarly, let R_k(q) be the free Abelian group generated by the isomorphism classes of irreducible representations of GL(k, F_q), or equivalently the additive group of generalized characters. Like R_k, we can make R_k(q) into the k-homogeneous part of a graded ring, a point we will take up in the next chapter.

Proposition 46.2. Let H be a group, and let M₁ and M₂ be subgroups of H. Then, in the character ring of H, the inner product of the characters induced from the trivial characters of M₁ and M₂, respectively, is equal to the number of double cosets in M₁\H/M₂.

Proof. By the geometric form of Mackey's theorem (Theorem 32.1), the space of intertwining maps from Ind_{M₁}^H(1) to Ind_{M₂}^H(1) is isomorphic to the space of functions Δ : H → Hom(C, C) ≅ C that satisfy Δ(m₂hm₁) = Δ(h) for mᵢ ∈ Mᵢ. Of course, a function has this property if and only if it is constant on double cosets, so the dimension of the space of such functions is equal to the number of double cosets. On the other hand, the dimension of the space of intertwining operators equals the inner product in the character ring by (2.7).
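For a small concrete instance of Proposition 46.2, one can compute both sides directly. The following sketch (hypothetical code, not from the text) uses the standard formula Ind_M^H(1)(g) = #{x : x⁻¹gx ∈ M}/|M|, with ambient group S₃ and M₁ = M₂ = S₂:

```python
from itertools import permutations
from fractions import Fraction

# Brute-force illustration of Proposition 46.2 (hypothetical code, not
# from the text): ambient group S3, with M1 = M2 = S2.

n = 3
G = list(permutations(range(n)))

def mul(a, b):
    return tuple(a[b[i]] for i in range(n))

def inv(a):
    out = [0] * n
    for i, ai in enumerate(a):
        out[ai] = i
    return tuple(out)

M1 = [g for g in G if g[2] == 2]          # a copy of S2
M2 = M1

def induced_trivial(M):
    # Ind(1)(g) = #{x : x^-1 g x in M} / |M|
    return {g: Fraction(sum(1 for x in G if mul(mul(inv(x), g), x) in M),
                        len(M))
            for g in G}

chi1, chi2 = induced_trivial(M1), induced_trivial(M2)
inner = sum(chi1[g] * chi2[inv(g)] for g in G) / len(G)   # <chi1, chi2>

# Count the double cosets M1\G/M2 directly.
seen, count = set(), 0
for g in G:
    if g not in seen:
        count += 1
        seen |= {mul(mul(a, g), b) for a in M1 for b in M2}

print(inner, count)  # 2 2
```

Both numbers come out to 2, as the proposition predicts.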


Theorem 46.1. There is a unique isometry of R_k into R_k(q) in which, for each subset I of Σ, the representation Ind_{W_I}^W(1) maps to the representation Ind_{P_I}^G(1). This mapping takes irreducible representations to irreducible representations.

Proof. If I ⊆ Σ, let χ_I denote the character of S_k induced from the trivial character of W_I, and let χ_I(q) denote the character of G induced from the trivial character of P_I.
We note that the characters χ_I span R_k. Indeed, by the definition of the multiplication in R, inducing the trivial representation from S_{k₁} × ⋯ × S_{k_r} to S_k, where ∑ kᵢ = k, gives the representation denoted

    h_{k₁} h_{k₂} ⋯ h_{k_r},

which is χ_I. Expanding the right-hand side of (35.10) expresses each s_λ as a linear combination of such representations, and by Theorem 35.1 the s_λ span R_k; hence so do the χ_I.
We would like to define a map R_k → R_k(q) by

    ∑_I n_I χ_I ↦ ∑_I n_I χ_I(q),    (46.7)

where the sum is over subsets of Σ. We need to verify that this is well-defined and an isometry.
By Proposition 46.1, if I, J ⊆ Σ, the cardinality of W_I\W/W_J equals the cardinality of P_I\G/P_J. By Proposition 46.2, it follows that

    ⟨χ_I, χ_J⟩_{S_k} = ⟨χ_I(q), χ_J(q)⟩_{GL(k,F_q)}.    (46.8)

Now, if ∑_I n_I χ_I = 0, we have

    ⟨∑_I n_I χ_I(q), ∑_I n_I χ_I(q)⟩_{GL(k,F_q)} = ∑_{I,J} n_I n_J ⟨χ_I(q), χ_J(q)⟩_{GL(k,F_q)}
        = ∑_{I,J} n_I n_J ⟨χ_I, χ_J⟩_{S_k} = ⟨∑_I n_I χ_I, ∑_I n_I χ_I⟩_{S_k} = 0,

so ∑_I n_I χ_I(q) = 0. Therefore (46.7) is well-defined, and (46.8) shows that it is an isometry.
It remains to be shown that irreducible characters go to irreducible characters. Indeed, if χ is an irreducible character of W = S_k, and if χ̂ is the corresponding character of G = GL(k, F_q), then ⟨χ̂, χ̂⟩ = ⟨χ, χ⟩ = 1, so either χ̂ or −χ̂ is an irreducible character, and it is sufficient to show that χ̂ occurs with positive multiplicity in some proper character of G. Indeed, χ = s_λ for some partition λ, and by (35.10) this means that χ appears with multiplicity one in the character induced from the trivial character of S_λ. Consequently, χ̂ occurs with multiplicity one in Ind_{P_I}^G(1), where I is any subset of Σ such that W_I ≅ S_λ. This completes the proof.

If λ is a partition, let s_λ(q), h_k(q), and e_k(q) denote the images of the characters s_λ, h_k, and e_k, respectively, of S_k under the isometry of Theorem 46.1. Thus h_k(q) is the trivial character. The character e_k(q) is called the Steinberg character of GL(k, F_q). The characters s_λ(q) are the unipotent characters of GL(k, F_q). This is not a proper definition of the term unipotent character because the construction as we have described it depends on the fact that the unipotent characters are precisely those that occur in Ind_B^G(1). This is true for G = GL(n, F_q) but not (for example) for Sp(4, F_q). See Deligne and Lusztig [41] and Carter [32] for unipotent characters of finite groups of Lie type and Vogan [167] for an extended meditation on unipotent representations.
Proposition 46.3. As a virtual representation, the alternating character e_k of S_k admits the following expression:

    e_k = ∑_{J⊆Σ} (−1)^{|J|} Ind_{W_J}^W(1).

Proof. We recall that e_k = s_λ, where λ is the partition (1, . . ., 1) of k. The right-hand side of (35.10) gives

    e_k = | h₁ h₂ h₃ ⋯ h_k     |
          | 1  h₁ h₂ ⋯ h_{k−1} |
          | 0  1  h₁ ⋯ h_{k−2} |
          | ⋮  ⋮  ⋮  ⋱  ⋮      |
          | 0  0  0  ⋯ h₁      | .

Expanding this determinant gives a sum of exactly 2^{k−1} monomials in the hᵢ, which are in one-to-one correspondence with the subsets J of Σ. Indeed, let J be given, and let k₁, k₂, k₃, . . . be as in Lemma 46.1. Then there is a monomial that has |J| 1's taken from below the diagonal; namely, if α_{i,i+1} ∈ J, then there is a 1 taken from the (i + 1, i) position, and there is an h_{k₁} taken from the (1, k₁) position, an h_{k₂} taken from the (k₁ + 1, k₁ + k₂) position, and so forth. This monomial equals (−1)^{|J|} h_{k₁}h_{k₂} ⋯, which is (−1)^{|J|} times the character induced from the trivial representation of W_J = S_{k₁} × S_{k₂} × ⋯.
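On the symmetric-function side, via the identification of the induced character with the product h_{k₁}h_{k₂}⋯ used in the proof of Theorem 46.1, the proposition reads e_k = ∑_J (−1)^{|J|} h_{k₁}h_{k₂}⋯. The following sketch (hypothetical code, not from the text) confirms this numerically for k = 3 by evaluating both sides at sample values of four variables:

```python
from itertools import combinations, combinations_with_replacement

# Numerical check of Proposition 46.3 for k = 3 (illustrative, not from
# the text): e_k = sum over J of (-1)^{|J|} h_{k_1} h_{k_2} ..., where
# k_1, k_2, ... are the block sizes of W_J as in Lemma 46.1.

xs = [2, 3, 5, 7]       # arbitrary sample values of four variables

def prod(c):
    out = 1
    for v in c:
        out *= v
    return out

def h(r):               # complete homogeneous symmetric polynomial h_r
    return sum(prod(c) for c in combinations_with_replacement(xs, r))

def e(r):               # elementary symmetric polynomial e_r
    return sum(prod(c) for c in combinations(xs, r))

k = 3
total = 0
for bits in range(2 ** (k - 1)):            # subsets J of {s_1, ..., s_{k-1}}
    J = {i + 1 for i in range(k - 1) if bits >> i & 1}
    sizes, cur = [], 1                      # block sizes given by J
    for i in range(1, k):
        if i in J:
            cur += 1
        else:
            sizes.append(cur)
            cur = 1
    sizes.append(cur)
    term = 1
    for sz in sizes:
        term *= h(sz)
    total += (-1) ** len(J) * term

print(total, e(k), total == e(k))  # 247 247 True
```

For k = 3 the four subsets give h₁³ − 2h₁h₂ + h₃, the expansion of the 3 × 3 determinant above.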

Theorem 46.2. As a virtual representation, the Steinberg representation e_k(q) of GL(k, F_q) admits the following expression:

    e_k(q) = ∑_{J⊆Σ} (−1)^{|J|} Ind_{P_J}^G(1).

Proof. This follows immediately from Proposition 46.3 on applying the mapping of Theorem 46.1.

For our next considerations, there is no reason that F needs to be finite, so we return to the case G = GL(k, F) for a general field F. We will denote by U the group of upper triangular unipotent matrices in GL(k, F).

Proposition 46.4. Suppose that S is any subset of Φ such that if α ∈ S, then −α ∉ S, and if α, β ∈ S and α + β ∈ Φ, then α + β ∈ S. Let U_S be the set of g = (g_{ij}) in GL(k, F) such that g_{ii} = 1 and, for i ≠ j, g_{ij} = 0 unless α_{ij} ∈ S. Then U_S is a group.

Proof. Let S̃ be the set of (i, j) such that the root α_{ij} ∈ S. Translating the hypothesis on S into a statement about S̃: if (i, j) ∈ S̃, then (j, i) ∉ S̃, and

    if both (i, j) and (j, k) are in S̃, then i ≠ k and (i, k) ∈ S̃.    (46.9)

From this it is easy to see that if g and h are in U_S, then so are g⁻¹ and gh.

As a particular case, if w ∈ W, then S = Φ⁺ ∩ wΦ⁻ satisfies the hypothesis of Proposition 46.4, and we denote

    U_{Φ⁺∩wΦ⁻} = U_w⁻.

Similarly, S = Φ⁺ ∩ wΦ⁺ meets this hypothesis, and we denote

    U_{Φ⁺∩wΦ⁺} = U_w⁺.

Finally, let U be the group of all upper triangular unipotent matrices in G, which was denoted N in Chap. 27.
Let l(w) denote the length of the Weyl group element w, which (as in Chap. 20) is the smallest r such that w can be written as a product of r simple reflections.

Proposition 46.5. Let F = F_q be finite, and let w ∈ W. We have

    |U_w⁻| = q^{l(w)}.

Proof. By Propositions 20.2 and 20.5, the cardinality of Φ⁺ ∩ wΦ⁻ is l(w), so this follows from the definition of U_S.


Proposition 46.6. Let w ∈ W. The multiplication map U_w⁺ × U_w⁻ → U is bijective.

Proof. We will prove this when F is finite, the only case we need. In this case U_w⁺ ∩ U_w⁻ = {1} by definition, since the sets Φ⁺ ∩ wΦ⁻ and Φ⁺ ∩ wΦ⁺ are disjoint. Thus, if u₁⁺u₁⁻ = u₂⁺u₂⁻ with uᵢ^± ∈ U_w^±, then (u₂⁺)⁻¹u₁⁺ = u₂⁻(u₁⁻)⁻¹ ∈ U_w⁺ ∩ U_w⁻, so u₁^± = u₂^±. Therefore, the multiplication map U_w⁺ × U_w⁻ → U is injective. To see that it is surjective, note that

    |U_w⁻| = q^{|Φ⁺∩wΦ⁻|},     |U_w⁺| = q^{|Φ⁺∩wΦ⁺|},

so the order of U_w⁺ × U_w⁻ is q^{|Φ⁺|} = |U|, and the surjectivity is now clear.


We are interested in the size of the double coset BwB. In geometric terms,
G/B can be identified with the space of F -rational points of a projective
algebraic variety, and the closure of BwB/B is an algebraic subvariety in
which BwB/B is an open subset; the dimension of this “Schubert cell” turns
out to be l(w).
If F = Fq , an equally good measure of the size of BwB is its cardinality.
It can of course be decomposed into right cosets of B, and its cardinality will
be the order of B times the cardinality of the quotient BwB/B.

Proposition 46.7. Let F = F_q be finite, and let w ∈ W. The order of BwB/B is q^{l(w)}.

Proof. We will show that u⁻ ↦ u⁻wB is a bijection U_w⁻ → BwB/B. The result then follows from Proposition 46.5.
Note that every right coset in BwB/B is of the form bwB for some b ∈ B. Using Proposition 46.6, we may write b ∈ B uniquely in the form u⁻u⁺t with u^± ∈ U_w^± and t ∈ T. Now w⁻¹u⁺tw = (w⁻¹u⁺w)(w⁻¹tw) ∈ B because w⁻¹u⁺w ∈ U and w⁻¹tw ∈ T. Therefore bwB = u⁻wB.
It is now clear that the map u⁻ ↦ u⁻wB is surjective. We must show that it is injective; in other words, if u₁⁻wB = u₂⁻wB for uᵢ⁻ ∈ U_w⁻, then u₁⁻ = u₂⁻. Indeed, if u⁻ = (u₁⁻)⁻¹u₂⁻, then w⁻¹u⁻w ∈ B from the equality of the cosets. On the other hand, w⁻¹u⁻w is lower triangular by the definition of U_w⁻. It is both upper triangular and lower triangular, and unipotent, so u⁻ = 1.
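The counts in Propositions 46.5 and 46.7 can be confirmed by brute force in a small case. The following sketch (hypothetical code, not from the text) enumerates GL(3, F₂) and measures each double coset BwB:

```python
from itertools import product, permutations

# Brute-force check of |BwB| = |B| * q^{l(w)} in GL(3, F_2)
# (illustrative, not from the text); l(w) = number of inversions, q = 2.

n, q = 3, 2

all_matrices = [tuple(tuple(row) for row in rows)
                for rows in product(product(range(q), repeat=n), repeat=n)]

def mmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) % q
                       for j in range(n)) for i in range(n))

def invertible(A):      # Gaussian elimination over F_2
    M = [list(r) for r in A]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col]), None)
        if piv is None:
            return False
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                M[r] = [(a + b) % q for a, b in zip(M[r], M[col])]
    return True

G = [A for A in all_matrices if invertible(A)]
B = [A for A in G if A[1][0] == A[2][0] == A[2][1] == 0]   # Borel subgroup

def perm_matrix(w):     # column j has its 1 in row w[j]
    return tuple(tuple(int(i == w[j]) for j in range(n)) for i in range(n))

def inversions(w):
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

ok = True
for w in permutations(range(n)):
    double = {mmul(mmul(b1, perm_matrix(w)), b2) for b1 in B for b2 in B}
    ok = ok and len(double) == len(B) * q ** inversions(w)
print(len(G), len(B), ok)  # 168 8 True
```

The six double-coset sizes 8, 16, 16, 32, 32, 64 sum to |GL(3, F₂)| = 168, consistent with the Bruhat decomposition.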



With k and q fixed, let H be the convolution ring of B-bi-invariant functions on G. The dimension of H equals the cardinality of B\G/B, which is |W| = k! by the Bruhat decomposition. A basis of H consists of the functions φ_w (w ∈ W), where φ_w is the characteristic function of the double coset C(w) = BwB. We normalize the convolution as follows:

    (f₁ ∗ f₂)(g) = (1/|B|) ∑_{x∈G} f₁(x) f₂(x⁻¹g) = (1/|B|) ∑_{x∈G} f₁(gx) f₂(x⁻¹).

With this normalization, the characteristic function φ₁ of B serves as a unit in the ring.
The ring H is a normed ring with the L¹ norm. That is, we have

    |f₁ ∗ f₂| ≤ |f₁| · |f₂|,     where     |f| = (1/|B|) ∑_{x∈G} |f(x)|.

There is also an augmentation map, that is, a C-algebra homomorphism ε : H → C, given by

    ε(f) = (1/|B|) ∑_{x∈G} f(x).

By Proposition 46.7, we have

    ε(φ_w) = q^{l(w)}.    (46.10)

Proposition 46.8. Let w, w′ ∈ W be such that l(ww′) = l(w) + l(w′). Then

    φ_{ww′} = φ_w ∗ φ_{w′}.

Proof. By Proposition 27.1, we have C(ww′) = C(w)C(w′). Therefore φ_w ∗ φ_{w′} is supported in C(ww′) and is hence a constant multiple of φ_{ww′}. Writing φ_w ∗ φ_{w′} = cφ_{ww′}, applying the augmentation ε, and using (46.10), we see that c = 1.


Proposition 46.9. Let s ∈ W be a simple reflection. Then

    φ_s ∗ φ_s = qφ₁ + (q − 1)φ_s.

Proof. By (27.2), we have C(s)C(s) ⊆ C(1) ∪ C(s). Therefore, there exist constants λ and μ such that φ_s ∗ φ_s = λφ₁ + μφ_s. Evaluating both sides at the identity gives λ = q. Now applying the augmentation and using the special cases ε(φ_s) = q, ε(φ₁) = 1 of (46.10), we have q² = λ · 1 + μ · q = q + μq, so μ = q − 1.


Let q be a nonzero element of a field containing C, and let R = C[q, q⁻¹]. Thus q might be a complex number, in which case the ring R = C, or it might be transcendental over C, in which case R will be the ring of Laurent polynomials over C.
We will define a ring H_k(q) as an algebra over R. Specifically, H_k(q) is the free R-algebra on generators f_{s_{α_i}} (i = 1, . . ., k − 1) subject to the relations

    f_{s_{α_i}}² = q + (q − 1) f_{s_{α_i}},    (46.11)

    f_{s_{α_i}} ∗ f_{s_{α_{i+1}}} ∗ f_{s_{α_i}} = f_{s_{α_{i+1}}} ∗ f_{s_{α_i}} ∗ f_{s_{α_{i+1}}},    (46.12)

    f_{s_{α_i}} ∗ f_{s_{α_j}} = f_{s_{α_j}} ∗ f_{s_{α_i}}     if |i − j| > 1.    (46.13)

We note that f_{s_{α_i}} is invertible, with inverse q⁻¹f_{s_{α_i}} + (q⁻¹ − 1), by (46.11).


Although H_k(q) is thus defined as an abstract ring, its structure reflects that of the Weyl group W of GL(k), which, as we have seen, is a Coxeter group. We recall what this means. Let s_{α₁}, . . ., s_{α_{k−1}} be the simple reflections of W. By Theorem 25.1, the group W has a presentation with generators s_{α_i} and relations

    s_{α_i}² = 1,
    s_{α_i}s_{α_{i+1}}s_{α_i} = s_{α_{i+1}}s_{α_i}s_{α_{i+1}},     1 ≤ i ≤ k − 2,
    s_{α_i}s_{α_j} = s_{α_j}s_{α_i}     if |i − j| > 1.

Of course, since s_{α_i}² = 1, the relation s_{α_i}s_{α_{i+1}}s_{α_i} = s_{α_{i+1}}s_{α_i}s_{α_{i+1}} is just another way of writing (s_{α_i}s_{α_{i+1}})³ = 1.
Proposition 46.10. If q = 1, the Hecke ring Hk (1) is isomorphic to the
group ring of Sk .
Proof. This is clear from Theorem 25.1 since if q = 1 the defining relations of
the ring Hk (1) coincide with the Coxeter relations presenting Sk .

Thus H_k(q) is a deformation of C[S_k], and its representation theory is the same as the representation theory of the symmetric group. One might therefore ask whether the Frobenius–Schur duality between the representations of S_k and U(n), which has been a great theme for us, can be extended to representations of this Hecke algebra. The answer is affirmative. The role of U(n) is played by a "quantum group," which is not actually a group at all but a Hopf algebra. Frobenius–Schur duality in this quantum context is due to Jimbo [89]. See also Zhang [179].
If w ∈ W is arbitrary, we want to associate an element fw of Hk (q)
extending the definition of the generators. The next result will make this
possible. (Of course, fw is already defined if w is a simple reflection.)
Proposition 46.11. Suppose that w ∈ W with l(w) = r, and suppose that w = s₁ ⋯ s_r = s₁′ ⋯ s_r′ are two decompositions of minimal length into simple reflections. Then

    f_{s₁} ∗ ⋯ ∗ f_{s_r} = f_{s₁′} ∗ ⋯ ∗ f_{s_r′}.    (46.14)

Proof. Let B be the braid group generated by elements u_{α_i} parametrized by the simple roots αᵢ, with n(u_{α_i}, u_{α_j}) equal to the order (2 or 3) of s_{α_i}s_{α_j}. Let sᵢ = s_{βᵢ} and sᵢ′ = s_{γᵢ} with βᵢ, γᵢ ∈ Σ, and let uᵢ = u_{βᵢ} and uᵢ′ = u_{γᵢ} be the corresponding elements of B. By Theorem 25.2, we have

    u₁ ⋯ u_r = u₁′ ⋯ u_r′.    (46.15)

Since the f_{s_{α_i}} satisfy the braid relations, there is a homomorphism of B into the group of invertible elements of H_k(q) such that u_{α_i} ↦ f_{s_{α_i}}. Applying this homomorphism to (46.15), we obtain (46.14).


If w ∈ W, let w = s₁ ⋯ s_r be a decomposition of w into r = l(w) simple reflections, and define

    f_w = f_{s₁} ∗ ⋯ ∗ f_{s_r}.

According to Proposition 46.11, this f_w is well-defined.

Theorem 46.3 (Iwahori). The f_w form a basis of H_k(q) as a free R-module. Thus, the rank of H_k(q) is |W|.

Proof. First, assume that q is transcendental, so that R is the ring of Laurent polynomials in q. We will deduce the corresponding statement when q ∈ C at the end.
Let us check that

    ∑_{w∈W} Rf_w = H_k(q).    (46.16)

It is sufficient to show that this R-submodule is closed under right multiplication by the generators f_s with s a simple reflection. If l(ws) = l(w) + 1, then f_wf_s = f_{ws}. On the other hand, if l(ws) = l(w) − 1, then writing w′ = ws we have f_wf_s = f_{w′s}f_s = f_{w′}f_s², which by (46.11) is a linear combination of f_{w′} and f_{w′}f_s = f_w.
It remains to be shown that the sum (46.16) is direct. If not, there will be some Laurent polynomials c_w(q), not all zero, such that

    ∑_w c_w(q) f_w = 0.

There exists a rational prime p such that the c_w(p) are not all zero. Let H be the convolution ring of B-bi-invariant functions on GL(k, F_p). It follows from Propositions 46.8 and 46.9 that (46.11)–(46.13) are all satisfied by the standard generators of H, so we have a homomorphism H_k(q) → H mapping each f_w to the corresponding generator φ_w of H and mapping q ↦ p. The images φ_w of the f_w are linearly independent in H, yet since the c_w(p) are not all zero, applying the homomorphism to the relation above yields a nontrivial relation of linear dependence among them. This is a contradiction.
The result is proved if q is transcendental. If q₀ ∈ C is nonzero, then there is a homomorphism R → C, and a compatible homomorphism H_k(q) → H_k(q₀), in which q ↦ q₀. What we must show is that the R-basis elements f_w remain linearly independent when projected to H_k(q₀). To prove this, we note that in H_k(q) we have

    f_w f_{w′} = ∑_{w″∈W} a_{w,w′,w″}(q, q⁻¹) f_{w″},

where each a_{w,w′,w″} is a polynomial in q and q⁻¹. We may construct a ring H̃_k(q₀) over C with basis elements f̃_w indexed by W and specialized structure constants a_{w,w′,w″}(q₀, q₀⁻¹). The associative law in H_k(q) boils down to a polynomial identity that remains true in this new ring, so this ring exists. Clearly, the identities (46.11)–(46.13) are true in the new ring, so there exists a homomorphism H_k(q₀) → H̃_k(q₀) mapping the f_w to the f̃_w. Since the f̃_w are linearly independent, so are the f_w in H_k(q₀).
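The recursion used in the first part of the proof (f_w f_s = f_{ws} when the length increases, and f_w f_s = q f_{ws} + (q − 1) f_w when it decreases) gives a practical way to compute in H_k(q). A minimal sketch (hypothetical helper names, permutations as tuples, and an arbitrary sample value of q, not from the text):

```python
from fractions import Fraction

# A small computational model of H_k(q) (illustrative, not from the text):
# elements are dicts {w: coefficient}, and right multiplication by a
# generator f_s uses the recursion
#   f_w f_s = f_{ws}                   if l(ws) = l(w) + 1,
#   f_w f_s = q f_{ws} + (q - 1) f_w   if l(ws) = l(w) - 1.
# We then check (46.11) and the braid relation (46.12) in H_3(q).

k = 3
q = Fraction(7)             # an arbitrary sample value of the parameter

def mul_perm(a, b):         # composition a o b
    return tuple(a[b[i]] for i in range(k))

def length(w):              # l(w) = number of inversions
    return sum(1 for i in range(k) for j in range(i + 1, k) if w[i] > w[j])

def s(i):                   # simple reflection s_{alpha_i}, 1 <= i <= k-1
    w = list(range(k))
    w[i - 1], w[i] = w[i], w[i - 1]
    return tuple(w)

def times_fs(elt, i):       # right multiplication by f_{s_i}
    out = {}
    for w, c in elt.items():
        ws = mul_perm(w, s(i))
        if length(ws) > length(w):
            out[ws] = out.get(ws, 0) + c
        else:
            out[ws] = out.get(ws, 0) + q * c
            out[w] = out.get(w, 0) + (q - 1) * c
    return {w: c for w, c in out.items() if c}

one = {tuple(range(k)): Fraction(1)}
f1 = times_fs(one, 1)       # f_{s_1}
# (46.11): f_s^2 = q + (q - 1) f_s
quadratic = times_fs(f1, 1) == {tuple(range(k)): q, s(1): q - 1}
# (46.12): f_1 f_2 f_1 = f_2 f_1 f_2
b1 = times_fs(times_fs(times_fs(one, 1), 2), 1)
b2 = times_fs(times_fs(times_fs(one, 2), 1), 2)
print(quadratic, b1 == b2)  # True True
```

Both products of the braid check land on the basis element indexed by the longest element of S₃, as (46.14) predicts for its two reduced words.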

Let us return to the case where q is a prime power.
Theorem 46.4. Let q be a prime power. Then the Hecke algebra H_k(q) is isomorphic to the convolution ring of B-bi-invariant functions on GL(k, F_q), where B is the Borel subgroup of upper triangular matrices in GL(k, F_q). In this isomorphism, the standard basis element f_w (w ∈ W) corresponds to the characteristic function of the double coset BwB.

Proof. It follows from Propositions 46.8 and 46.9 that (46.11)–(46.13) are all satisfied by the elements φ_w in the ring H of B-bi-invariant functions on GL(k, F_q), so there exists a homomorphism H_k(q) → H such that f_w ↦ φ_w. Since the f_w are a basis of H_k(q) and the φ_w are a basis of H, this ring homomorphism is an isomorphism.


Exercises
Exercise 46.1. Show that any subgroup of GL(k, F) containing B is of the form (46.2).

Exercise 46.2. For G = GL(3), describe Uw+ and Uw− explicitly for each of the six
Weyl group elements.

Exercise 46.3. Let G be a finite group and H a subgroup. Let H be the “Hecke
algebra” of H bi-invariant functions, with multiplication being the convolution
product normalized by

(f_1 ∗ f_2)(g) = \frac{1}{|H|} \sum_{x \in G} f_1(x) f_2(x^{-1} g).

If (π, V) is an irreducible representation of G, let V^H be the subspace of H-fixed
vectors. Then V^H becomes a module over H with the action

f · v = |H|^{-1} \sum_{g \in G} f(g) π(g) v. \qquad (46.17)

Show that V^H, if nonzero, is irreducible as an H-module. (Hint: If W is a nonzero
invariant subspace of V^H, and v ∈ V^H, then since V is irreducible, we have
f_1 · w = v for some function f_1 on G, where f_1 · w is defined as in (46.17) even
though f_1 ∉ H. Show that f · w = v, where f = ε ∗ f_1 ∗ ε and ε is the characteristic
function of H. Observe that f ∈ H and conclude that V^H = W.)
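The normalized convolution above is easy to experiment with. A minimal sketch (ours, not part of the exercise), taking G = S_3 and H the two-element subgroup generated by a transposition, checks that the characteristic function ε of H is idempotent under this convolution:

```python
from fractions import Fraction
from itertools import permutations

# A sanity check (our sketch): the normalized convolution on G = S_3 with
# H = {e, (1 2)}; the characteristic function eps of H satisfies eps * eps = eps.
def compose(a, b):              # permutations as tuples: (a b)(i) = a(b(i))
    return tuple(a[i] for i in b)

def inverse(a):
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]      # the subgroup generated by the transposition (1 2)

def conv(f1, f2):
    # (f1 * f2)(g) = |H|^{-1} sum_{x in G} f1(x) f2(x^{-1} g)
    return {g: Fraction(sum(f1[x] * f2[compose(inverse(x), g)] for x in G), len(H))
            for g in G}

eps = {g: 1 if g in H else 0 for g in G}   # characteristic function of H
assert conv(eps, eps) == eps
```
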

Exercise 46.4. In the setting of Exercise 46.3, show that (π, V) −→ V^H is a
bijection between the isomorphism classes of irreducible representations of G with
V^H ≠ 0 and isomorphism classes of irreducible H-modules.

Exercise 46.5. Show that if (π, V) is an irreducible representation of G = GL(k, F_q)
with character s_λ(q), then the degree of the corresponding representation of H_k(q)
is the degree of the irreducible character s_λ of S_k. (Thus, the degree d_λ of s_λ is the
dimension of V^B.) Show that d_λ is the multiplicity of s_λ(q) in Ind_B^G(1).

Exercise 46.6. Assume that q is a prime. Prove that

H_k(q) \cong \bigoplus_{\lambda \text{ a partition of } k} \mathrm{Mat}_{d_\lambda}(\mathbb{C}) \cong \mathbb{C}[S_k].

Exercise 46.7. Prove that the degree of the irreducible character sλ (q) of GL(k, Fq )
is a polynomial in q whose value when q = 1 is the degree dλ of the irreducible
character sλ of Sk .
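The degrees d_λ appearing in Exercises 46.5–46.7 can be computed by the hook length formula, and the dimension count behind Exercise 46.6, Σ_λ d_λ² = k!, is then easy to verify numerically. A small sketch (ours, not the text's):

```python
from math import factorial

# Our sketch: compute the degrees d_lambda of the irreducible characters of
# S_k by the hook length formula, and verify sum over partitions of
# d_lambda^2 = k! (the dimension of C[S_k], as in Exercise 46.6).
def partitions(k, max_part=None):
    # Partitions of k as weakly decreasing tuples.
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def hook_degree(lam):
    # d_lambda = k! / (product of the hook lengths of lambda).
    k = sum(lam)
    hooks = 1
    for i, row in enumerate(lam):
        for j in range(row):
            arm = row - j - 1                           # cells to the right
            leg = sum(1 for r in lam[i + 1:] if r > j)  # cells below
            hooks *= arm + leg + 1
    return factorial(k) // hooks

k = 5
degrees = {lam: hook_degree(lam) for lam in partitions(k)}
assert sum(d * d for d in degrees.values()) == factorial(k)
```
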

Exercise 46.8. An element of GL(k, F_q) is called semisimple if it is diagonalizable
over the algebraic closure of F_q. A semisimple element is called regular if its eigen-
values are distinct. If λ is a partition of k, let c_λ be a regular semisimple element of
GL(k, F_q) such that

c_\lambda = \begin{pmatrix} c_1 & & \\ & \ddots & \\ & & c_r \end{pmatrix}, \qquad c_i \in GL(\lambda_i, F_q),

and such that the eigenvalues of c_i generate F_{q^{\lambda_i}}. Of course, c_λ isn't completely
determined by this description. Such a c_λ will exist (for k fixed) if q is sufficiently
large.
(i) Show that, if k = 2, then the unipotent characters of GL(2, Fq ) have the
following values:
            c_(11)   c_(2)
  s_(11)      1        1
  s_(2)       1       −1

Note that this is the character table of S_2.
(ii) More generally, prove that in the notation of Chap. 37, the value of the character
s_μ(q) on the conjugacy class c_λ of GL(k, F_q) equals the value of the character s_μ
on the conjugacy class C_λ of S_k.
47
The Philosophy of Cusp Forms

There are four theories that deserve to be studied in parallel. These are:
• The representation theory of symmetric groups Sk ;
• The representation theory of GL(k, Fq );
• The representation theory of GL(k, F ) where F is a local field;
• The theory of automorphic forms on GL(k).
In this description, a local field is R, C, or a field such as the p-adic field
Qp that is complete with respect to a non-Archimedean valuation. Roughly
speaking, each successive theory can be thought of as an elaboration of its
predecessor. Both similarities and differences are important. We list some
parallels between the four theories in Table 47.1.
The plan of this chapter is to discuss all four theories in general terms,
giving proofs only for the second stage in this tower of theories, the representation
theory of GL(k, F_q). (The first stage is already adequately covered.)
Although the third and fourth stages are outside the scope of this book, our
goal is to prepare the reader for their study by exposing the parallels with the
finite field case.
There is one important way in which these four theories are similar: there
are certain representations that are the “atoms” from which all other repre-
sentations are built and a “constructive process” from which the other repre-
sentations are built. Depending on the context, the “atomic” representations
are called cuspidal or discrete series representations. The constructive process
is parabolic induction or Eisenstein series. The constructive process usually
(but not always) produces an irreducible representation.
Harish-Chandra [62] used the term “philosophy of cusp forms” to describe
this parallel, which will be the subject of this chapter. One may substitute any
reductive group for GL(k) and most of what we have to say will be applicable.
But GL(k) is enough to fix the ideas.
In order to explain the philosophy of cusp forms, we will briefly summarize
the theory of Eisenstein series before discussing (in a more serious way) a
part of the representation theory of GL(k) over a finite field. The reader

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 485
DOI 10.1007/978-1-4614-8024-2_47, © Springer Science+Business Media New York 2013
486 47 The Philosophy of Cusp Forms

only interested in the latter may skip the paragraphs on automorphic forms.
When we discuss automorphic forms, we will prove nothing and state exactly
what seems relevant in order to see the parallel. For GL(k, Fq ), we prove
more, but mainly what we think is essential to see the parallel. Our treatment
is greatly influenced by Howe [74] and Zelevinsky [178]. To go deeper into
the representation theory of the finite groups of Lie type, Carter [32] is an
exceedingly useful reference.
For the symmetric groups, there is only one “atom”—the trivial represen-
tation of S1 . The constructive process is ordinary induction from Sk × Sl to
Sk+l , which was the multiplication ◦ in the ring R introduced in Chap. 34.
The element that we have identified as atomic was called h1 there. It does not
generate the ring R. However, h_1^k is the regular representation (or character)
of Sk , and it contains every irreducible representation. To construct every
irreducible representation of Sk from this single irreducible representation of
S1 , the constructive process embodied in the multiplicative structure of the
ring R must be supplemented by a further procedure. This is the extraction
of an irreducible from a bigger representation h_1^k that includes it. This extraction
amounts to finding a description for the “Hecke algebra” that is the
endomorphism ring of h_1^k. This “Hecke algebra” is isomorphic to the group
ring of Sk .
For the groups GL(k, Fq ), let us construct a graded ring R(q) analogous
to the ring R in Chap. 34. The homogeneous part Rk (q) will be the free
Abelian group on the set of isomorphism classes of irreducible representations
of GL(k, Fq ), which may be identified with the character ring of this group;
the multiplicative structure of the character ring is not used. Instead, there
is a multiplication Rk (q) × Rl (q) −→ Rk+l (q), called parabolic induction.
Consider the maximal parabolic subgroup P = MU of GL(k + l, F_q), where

M = GL(k, F_q) \times GL(l, F_q) \cong \left\{ \begin{pmatrix} g_1 & \\ & g_2 \end{pmatrix} \;\middle|\; g_1 \in GL(k, F_q),\ g_2 \in GL(l, F_q) \right\}

and

U = \left\{ \begin{pmatrix} I_k & X \\ & I_l \end{pmatrix} \;\middle|\; X \in \mathrm{Mat}_{k \times l}(F_q) \right\}.
The group P is a semidirect product, since U is normal, and the composition

M −→ P −→ P/U

is an isomorphism. So given a representation (π_1, V_1) of GL(k, F_q) and a
representation (π_2, V_2) of GL(l, F_q), one may regard the representation π_1 ⊗ π_2 of
M as a representation of P/U ∼ = M and pull it back to a representation of P in
which U acts trivially. Inducing from P to GL(k + l, Fq ) gives a representation
that we will denote π1 ◦ π2 . By the definition of the induced representation, it
acts by right translation on the space V1 ◦ V2 of all functions f : G −→ V1 ⊗ V2
such that
f\left( \begin{pmatrix} g_1 & * \\ & g_2 \end{pmatrix} h \right) = \left( \pi_1(g_1) \otimes \pi_2(g_2) \right) f(h).

With this multiplication, R(q) = \bigoplus_k R_k(q) is a graded ring (Exercise 47.1).
Inspired by ideas of Philip Hall, Green [58] defined the ring R(q) and used
it systematically in his description of the irreducible representations of
GL(k, Fq ). Like R, it can be given the structure of a Hopf algebra. See Zelevin-
sky [178] and Exercise 47.5.
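Since π_1 ◦ π_2 is induced from P, its dimension is [GL(k + l, F_q) : P] · dim π_1 · dim π_2, and the index of this maximal parabolic is a Gaussian binomial coefficient. A quick numerical check of that index (our sketch, not from the text):

```python
# Our sketch: the index of the maximal parabolic P with Levi GL(k) x GL(l)
# in GL(k + l, F_q) equals the Gaussian binomial coefficient [k+l choose k]_q.
def gl_order(n, q):
    # |GL(n, F_q)| = prod_{i=0}^{n-1} (q^n - q^i)
    o = 1
    for i in range(n):
        o *= q**n - q**i
    return o

def q_binomial(n, k, q):
    # Gaussian binomial coefficient [n choose k]_q
    num = den = 1
    for i in range(k):
        num *= q**(n - i) - 1
        den *= q**(i + 1) - 1
    return num // den

k, l, q = 2, 3, 5
# |P| = |GL(k)| |GL(l)| q^{kl}: the Levi factor times the unipotent radical,
# which is Mat_{k x l}(F_q).
p_order = gl_order(k, q) * gl_order(l, q) * q**(k * l)
index = gl_order(k + l, q) // p_order
assert index == q_binomial(k + l, k, q)
```
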
If, imitating the construction with the symmetric group, we start with the
trivial representation h1 (q) of GL(1, Fq ) and consider all irreducible represen-
tations of GL(k, Fq ) that occur in h1 (q)k , we get exactly the unipotent repre-
sentations (i.e., the sk (q) of Chap. 46), and this is the content of Theorem 46.1.
To get all representations, we need more than this. There is a unique smallest
set of irreducible representations of the GL(k, Fq )—the cuspidal ones—such
that we can find every irreducible representation as a constituent of some
representation that is a ◦ product of cuspidal ones. We will give more precise
statements later in this chapter.
At the third stage in the tower of theories, the most important represen-
tations are infinite-dimensional, and analysis is important as well as algebra
in their understanding. The representation theory of algebraic groups over a
local field F is divided into the case where F is Archimedean—that is, F = R
or C—and where F is non-Archimedean.
If F is Archimedean, then an algebraic group over F is a Lie group, more
precisely a complex analytic group when F = C. The most important fea-
ture in the representation theory of reductive Lie groups is the Langlands
classification expressing every irreducible representation as a quotient of one
that is parabolically induced from discrete series representations. Usually the
parabolically induced representation is itself irreducible and there is no need
to pass to a quotient. See Knapp [104], Theorem 14.92 on p. 616 for the
Langlands classification. Knapp [104] and Wallach [168] are comprehensive
accounts of the representation theory of noncompact reductive Lie groups.
For reductive p-adic groups—that is, reductive algebraic groups over a
non-Archimedean local field—the situation is similar and in some ways sim-
pler. The most important discrete series representations are the supercusp-
idals. There is again a Langlands classification expressing every irreducible
representation as a quotient of one parabolically induced from discrete se-
ries. Surveys of the representation theory of p-adic groups can be found in
Cartier [33] and Moeglin [130]. Two useful longer articles with foundational
material are Casselman [34] and Bernstein and Zelevinsky [16]. The most im-
portant foundational paper is Bernstein and Zelevinsky [17]. Chapter 4 of
Bump [27] emphasizes GL(2) but is still useful.
The fourth of the four theories in the tower is the theory of automorphic
forms. In developing this theory, Selberg and Langlands realized that certain
automorphic forms were basic, and these are called cusp forms. The definitive
reference for the Selberg–Langlands theory is Moeglin and Waldspurger [131].
Let us consider the basic setup.

Table 47.1. The philosophy of cusp forms


Class of groups Atoms Synthetic Analytic Unexpected
process process symmetry
Sk h1 Induction Restriction (Trivial)
GL(k, Fq ) Cuspidal Parabolic Unipotent R(q) is
representations induction invariants commutative
GL(k, F ) Discrete series Parabolic Jacquet Intertwining
F local induction functors integrals such
rU,1 in [17] as (47.2)
GL(k, A) Automorphic Eisenstein Constant Functional
A = adele ring cuspidal series terms equations
of global F representations

Let G = GL(k, R). Let Γ be a discrete subgroup of G such that Γ\G
has finite volume, such as GL(k, Z). An automorphic form on G with respect
to Γ is a smooth complex-valued function f on G that is K-finite, Z-finite,
of moderate growth and automorphic, and has unitary central character. We
define these terms now.
The group G acts on functions by right translation: ρ(g)f (h) = f (hg).
The group K is the maximal compact subgroup O(k), and f is K-finite if the
space of functions ρ(κ)f with κ ∈ K spans a finite-dimensional vector space.
The Lie algebra g of G also acts by right translation: if X ∈ g, then

(Xf)(g) = \left. \frac{d}{dt} f(g e^{tX}) \right|_{t=0}.

As a consequence, the universal enveloping algebra U(g) acts on smooth functions.
Let Z be its center. This is a ring of differential operators on G that
are invariant under both right and left translation (Exercise 10.2). For exam-
ple, it contains the Casimir element constructed in Theorem 10.2 (from the
trace bilinear form B on g); in this incarnation, the Casimir element is the
Laplace–Beltrami operator. The function f is called Z-finite if the image of
f under Z is a finite-dimensional vector space.
Embed G into the 2k²-dimensional Euclidean space Mat_k(R) ⊕ Mat_k(R) = R^{2k^2} by

g −→ (g, g^{-1}).

Let ‖·‖ denote the Euclidean norm in R^{2k^2} restricted to G. The function f is
said to be of moderate growth if |f(g)| < C‖g‖^N for suitable C and N.
The function f is called automorphic with respect to Γ if f (γg) = f (g)
for all γ ∈ Γ .
We will consider functions f such that for some character ω of R_+^× we have

f\left( \begin{pmatrix} z & & \\ & \ddots & \\ & & z \end{pmatrix} g \right) = \omega(z) f(g)

for all z ∈ R_+^×. The character ω is the central character. It is fixed throughout
the discussion and is assumed unitary; that is, |ω(z)| = 1.
Let V be a vector space on which K and g both act. The actions are as-
sumed to be compatible in the sense that both induce the same representation
of Lie(K). We ask that V decomposes into a direct sum of finite-dimensional
irreducible subspaces under K. Then V is called a (g, K)-module. If every
irreducible representation of K appears with only finite multiplicity, then we
say that V is admissible. For example, let (π, H) be an irreducible unitary
representation of G on a Hilbert space H, and let V be the space of K-finite
vectors in H. It is a dense subspace and is closed under actions of both g
and K, so it is a (g, K)-module. The (g, K)-modules form a category that can
be studied by purely algebraic methods, which captures the essence of the
representations.
The space A(Γ \G) of automorphic forms is not closed under ρ because
K-finiteness is not preserved by ρ(g) unless g ∈ K. Still, both K and g
preserve the space A(Γ \G). A subspace that is invariant under these actions
and irreducible in the obvious sense is called an automorphic representation.
It is a (g, K)-module.
Given an automorphic form f on G = GL(k, R) with respect to Γ =
GL(k, Z), if k = r + t we can consider the constant term along the parabolic
subgroup P with Levi factor GL(r) × GL(t). This is the function

\int_{\mathrm{Mat}_{r \times t}(\mathbf{Z}) \backslash \mathrm{Mat}_{r \times t}(\mathbf{R})} f\left( \begin{pmatrix} I & X \\ & I \end{pmatrix} \begin{pmatrix} g_1 & \\ & g_2 \end{pmatrix} \right) dX
for (g1 , g2 ) ∈ GL(r, R) × GL(t, R). If the constant term of f along every maxi-
mal parabolic subgroup vanishes then f is called a cusp form. An automorphic
representation is called automorphic cuspidal if its elements are cusp forms.
Let L²(Γ\G, ω) be the space of measurable functions on G that are automorphic
and have central character ω and such that

\int_{\Gamma Z \backslash G} |f(g)|^2 \, dg < \infty.

The integral is well-defined modulo Z because ω is assumed to be unitary.


Cusp forms are always square-integrable—an automorphic cuspidal represen-
tation embeds as a direct summand in L2 (Γ \G, ω). In particular, it is unitary.
There is a construction that is dual to the constant term in the Selberg–
Langlands theory, namely the construction of Eisenstein series. Let (π1 , V1 )
and (π2 , V2 ) be automorphic cuspidal representations of GL(r, R) and GL(t, R),
where r + t = k. Let P = M U be the maximal parabolic subgroup with Levi
factor M = GL(r, R) × GL(t, R). The modular quasicharacter δ_P : P −→ R_+^× is

\delta_P \begin{pmatrix} g_1 & * \\ & g_2 \end{pmatrix} = \frac{|\det(g_1)|^t}{|\det(g_2)|^r}
by Exercise 1.2. The space of the (g, K)-module of the induced representation
Ind(π_1 ⊗ π_2 ⊗ δ_P^s) of G consists of K-finite functions f_s : G −→ C such that
any element f'_s of the (g, K)-submodule of C^∞(G) generated by f_s satisfies
the condition that

f'_s \begin{pmatrix} g_1 & X \\ & g_2 \end{pmatrix}

is independent of X and equals δ_P^{s+1/2} times a finite linear combination of
functions of the form f_1(g_1) f_2(g_2), where f_i ∈ V_i. Due to the extra factor
δ_P^{1/2}, this induction is called normalized induction, and it has the property
that if s is purely imaginary (so that π_1 ⊗ π_2 ⊗ δ_P^s is unitary), then the
induced representation is unitary.
Then, for re(s) sufficiently large and for f_s ∈ Ind(π_1 ⊗ π_2 ⊗ δ_P^s), the series

E(g, f_s, s) = \sum_{\gamma \in P(\mathbf{Z}) \backslash GL(k,\mathbf{Z})} f_s(\gamma g)

is absolutely convergent. Here P(Z) is the group of integer matrices in P with


determinant ±1.
Unlike cusp forms, the Eisenstein series are not square-integrable. Never-
theless, they are needed for the spectral decomposition of GL(k, Z)\GL(k, R).
This is analogous to the fact that the characters x −→ e^{2πiαx} of R are not
square-integrable, but as eigenfunctions of the Laplacian, a self-adjoint op-
erator, they are needed for its spectral theory and comprise its continuous
spectrum. The spectral problem for GL(k, Z)\GL(k, R) has both a discrete
spectrum (comprised of the cusp forms and residues of Eisenstein series) and
a continuous spectrum. The Eisenstein series (analytically continued in s and
restricted to the unitary principal series) are needed for the analysis of the
continuous spectrum.
For the purpose of analytic continuation, we call a family of functions
fs ∈ Ind(π1 ⊗ π2 ⊗ δPs ) a standard section if the restriction of the functions fs
to K is independent of s.

Theorem 47.1 (Selberg, Langlands). Let r + t = k. Let P and Q be the


parabolic subgroups of GL(k) with Levi factors GL(r) × GL(t) and GL(t) ×
GL(r), respectively. Suppose that fs ∈ Ind(π1 ⊗ π2 ⊗ δPs ) is a standard sec-
tion. Then E(g, fs , s) has meromorphic continuation to all s. There exists an
intertwining operator
M(s) : Ind(π_1 ⊗ π_2 ⊗ δ_P^s) −→ Ind(π_2 ⊗ π_1 ⊗ δ_Q^{-s})

such that the functional equation

E(g, f_s, s) = E(g, M(s) f_s, -s) \qquad (47.1)

is true.

The intertwining operator M(s) is given by an integral formula

\left( M(s)f \right)(g) = \int_{\mathrm{Mat}_{t \times r}(\mathbf{R})} f\left( \begin{pmatrix} & -I_t \\ I_r & \end{pmatrix} \begin{pmatrix} I & X \\ & I \end{pmatrix} g \right) dX. \qquad (47.2)

This integral may be shown to be convergent if re(s) > 1/2. For other values
of s, it has analytic continuation. This integral emerges when one looks at
the constant term of the Eisenstein series with respect to Q. We will not
explain this further but mention it because these intertwining integrals are
extremely important and will reappear in the finite field case in the proof of
Proposition 47.3.
The two constructions—constant term and Eisenstein series—have paral-
lels in the representation theory of GL(k, F ), where F is a local field including
F = R, C, or a p-adic field. These constructions are functors between repre-
sentations of GL(k, F ) and those of the Levi factor of any parabolic subgroup.
They are the Jacquet functors in one direction and parabolic induction in the
other. (We will not define the Jacquet functors, but they are the functors r_{U,1}
in Bernstein and Zelevinsky [17].) Moreover, these constructions also descend
to the case of representation theory of GL(n, Fq ), which we look at next.

An irreducible representation (π, V ) of GL(k, Fq ) is called cuspidal if there


are no fixed vectors for the unipotent radical of any (standard) parabolic sub-
group. If P ⊇ Q are parabolic subgroups and UP and UQ are their unipotent
radicals, then UP ⊆ UQ , and it follows that a representation is cuspidal if
and only if it has no fixed vectors for the unipotent radical of any (standard)
maximal parabolic subgroup; these are the subgroups of the form

\left\{ \begin{pmatrix} I_r & X \\ & I_t \end{pmatrix} \;\middle|\; X \in \mathrm{Mat}_{r \times t}(F_q) \right\}, \qquad r + t = k. \qquad (47.3)

Proposition 47.1. Let (π, V) be a cuspidal representation of GL(k, F_q). If U
is the unipotent radical of a standard maximal parabolic subgroup of GL(k, F_q),
and if η : V −→ C is any linear functional such that η(π(u)v) = η(v) for all
u ∈ U and all v ∈ V, then η is zero.
This means that the contragredient of a cuspidal representation is cuspidal.
Proof. Choose an invariant inner product ⟨ , ⟩ on V. There exists a vector
y ∈ V such that η(v) = ⟨v, y⟩. Then

⟨v, π(u)y⟩ = ⟨π(u)^{-1} v, y⟩ = η(π(u)^{-1} v) = η(v) = ⟨v, y⟩

for all u ∈ U and v ∈ V, so π(u)y = y. Since π is cuspidal, y = 0, whence
η = 0. □

Proposition 47.2. Every irreducible representation (π, V ) of GL(k, Fq ) is a
constituent in some representation π1 ◦ · · · ◦ πm with the πi cuspidal.

Proof. If π is cuspidal, then we may take m = 1 and π1 = π. There is nothing


to prove in this case.
If π is not cuspidal, then there exists a decomposition k = r + t such
that the space V U of U -fixed vectors is nonzero, where U is the group (47.3).
Let P = MU be the parabolic subgroup with Levi factor M = GL(r, F_q) ×
GL(t, F_q) and unipotent radical U. Then V^U is an M-module since M normalizes
U. Let ρ ⊗ τ be an irreducible constituent of V^U, where ρ and τ are
representations of GL(r, F_q) and GL(t, F_q). By induction, we may embed ρ
into π_1 ◦ · · · ◦ π_h and τ into π_{h+1} ◦ · · · ◦ π_m for some cuspidals π_i. Thus, we
get a nonzero M -module homomorphism

V U −→ ρ ⊗ τ −→ (π1 ◦ · · · ◦ πh ) ⊗ (πh+1 ◦ · · · ◦ πm ).

By Frobenius reciprocity (Exercise 47.2), there is thus a nonzero GL(k, Fq )-


module homomorphism

V −→ (π1 ◦ · · · ◦ πh ) ◦ (πh+1 ◦ · · · ◦ πm ) = π1 ◦ · · · ◦ πm .

Since π is irreducible, this is an embedding.




The notion of a cuspidal representation can be extended to Levi factors of


parabolic subgroups. Let λ = (λ1 , . . . , λr ), where the λi are positive integers
whose sum is k. We do not assume λ_i ≥ λ_{i+1}. Such a decomposition we call
an ordered partition of k. Let
P_\lambda = \left\{ \begin{pmatrix} g_{11} & * & \cdots & * \\ & g_{22} & \cdots & * \\ & & \ddots & \vdots \\ & & & g_{rr} \end{pmatrix} \;\middle|\; g_{ii} \in GL(\lambda_i, F_q) \right\}.

This parabolic subgroup has Levi factor

Mλ = GL(λ1 , Fq ) × · · · × GL(λr , Fq )

and unipotent radical U_λ characterized by g_{ii} = I_{λ_i}. Any irreducible
representation π_λ of M_λ is of the form ⊗π_i, where π_i is a representation of GL(λ_i, F_q).
We say that π_λ is cuspidal if each of the π_i is cuspidal.
Let B_k be the standard Borel subgroup of GL(k, F_q), consisting of upper
triangular matrices, and let B_λ = \prod B_{λ_i}. We regard this as the Borel subgroup
of M_λ. A standard parabolic subgroup of M_λ is a proper subgroup Q containing
B_λ. Such a subgroup has the form \prod Q_i, where each Q_i is either GL(λ_i, F_q) or
a parabolic subgroup of GL(λi , Fq ) and at least one Qi is proper. The parabolic
subgroup is maximal if exactly one Qi is a proper subgroup of GL(λi , Fq ) and
that Qi is a maximal parabolic subgroup of GL(λi , Fq ). A parabolic subgroup
of Mλ has a Levi subgroup and a unipotent radical; if Q is a maximal parabolic
subgroup of Mλ , then the unipotent radical of Q is the unipotent radical of

the unique Qi that is a proper subgroup of GL(λi , Fq ), and it follows that


π = ⊗πi is cuspidal if and only if it has no fixed vector with respect to the
unipotent radical of any maximal parabolic subgroup of Mλ .
Parabolic induction is as we have already described it for maximal parabolic
subgroups. The group Pλ = Mλ Uλ is a semidirect product with the subgroup
Uλ normal, and so the composition

Mλ −→ Pλ −→ Pλ /Uλ

is an isomorphism, where the first map is inclusion and the second projection.
This means that the representation πλ of Mλ may be regarded as a represen-
tation of Pλ in which Uλ acts trivially. Then π1 ◦ · · · ◦ πr is the representation
induced from Pλ .

Theorem 47.2. The multiplication in R(q) is commutative.

Proof. We will frame our proof in terms of characters rather than repre-
sentations, so in this proof elements of Rk (q) are generalized characters of
GL(k, Fq ).
We make use of the involution ι : GL(k, F_q) −→ GL(k, F_q) defined by

{}^\iota g = w_k \cdot {}^t g^{-1} \cdot w_k,

where w_k is the permutation matrix with 1's along the antidiagonal.

Let r + t = k. The involution takes the standard parabolic subgroup P with


Levi factor M = GL(r, Fq ) × GL(t, Fq ) to the standard parabolic subgroup ι P
with Levi factor ι M = GL(t, Fq ) × GL(r, Fq ). It induces the map M −→ ι M
given by

\begin{pmatrix} g_1 & \\ & g_2 \end{pmatrix} \longmapsto \begin{pmatrix} {}^\iota g_2 & \\ & {}^\iota g_1 \end{pmatrix}, \qquad g_1 \in GL(r, F_q),\ g_2 \in GL(t, F_q),

where {}^ι g_1 = w_r · {}^t g_1^{-1} · w_r and {}^ι g_2 = w_t · {}^t g_2^{-1} · w_t. Now since every element of
GL(n, F_q) is conjugate to its transpose, if μ is the character of an irreducible
representation of GL(n, F_q) with n = k, r, or t, we have μ({}^ι g) = μ̄(g). Let μ_1
and μ2 be the characters of representations of GL(r, Fq ) and GL(t, Fq ). Com-
posing the character μ̄2 ⊗ μ̄1 of ι M with ι : M −→ ι M and then parabolically
inducing from P to GL(k, Fq ) will give the same result as parabolically induc-
ing the character directly from ι P and then composing with ι. The first way
gives μ1 ◦ μ2 , and the second gives the conjugate of μ̄2 ◦ μ̄1 (that is, μ2 ◦ μ1 ),
and so these are equal.


Unfortunately, the method of proof in Theorem 47.2 is rather limited.


We next prove a strictly weaker result by a different method based on an
analog of the intertwining integrals (47.2). These intertwining integrals are
very powerful tools in the representation theory of Lie and p-adic groups, and

they are closely connected with the constant terms of the Eisenstein series
and with the functional equations. It is for this reason that we give a second,
longer proof of a weaker statement.

Proposition 47.3. Let (π1 , V1 ) and (π2 , V2 ) be representations of GL(r, Fq )


and GL(t, Fq ). Then there exists a nonzero intertwining map between the rep-
resentations π1 ◦ π2 and π2 ◦ π1 .

Proof. Let f ∈ V_1 ◦ V_2. Thus f : G −→ V_1 ⊗ V_2 satisfies

f\left( \begin{pmatrix} g_1 & * \\ & g_2 \end{pmatrix} h \right) = \left( \pi_1(g_1) \otimes \pi_2(g_2) \right) f(h), \qquad g_1 \in GL(r, F_q),\ g_2 \in GL(t, F_q). \qquad (47.4)
Now define M f : G −→ V_2 ⊗ V_1 by

M f(h) = \sum_{X \in \mathrm{Mat}_{r \times t}(F_q)} \tau f\left( \begin{pmatrix} & -I_r \\ I_t & \end{pmatrix} \begin{pmatrix} I & X \\ & I \end{pmatrix} h \right),

where τ : V1 ⊗ V2 −→ V2 ⊗ V1 is defined by τ (v1 ⊗ v2 ) = v2 ⊗ v1 . Let us show


that M f ∈ V2 ◦ V1 . A change of variables X −→ X − Y in the definition of
M f shows that

M f\left( \begin{pmatrix} I_r & Y \\ & I_t \end{pmatrix} h \right) = M f(h).
Also, if g_1 ∈ GL(r, F_q) and g_2 ∈ GL(t, F_q), we have

M f\left( \begin{pmatrix} g_2 & \\ & g_1 \end{pmatrix} h \right) = \sum_{X \in \mathrm{Mat}_{r \times t}(F_q)} \tau f\left( \begin{pmatrix} g_1 & \\ & g_2 \end{pmatrix} \begin{pmatrix} & -I_r \\ I_t & \end{pmatrix} \begin{pmatrix} I & g_2^{-1} X g_1 \\ & I \end{pmatrix} h \right).

Making the variable change X −→ g_2 X g_1^{-1} and then using (47.4) and the fact
that τ ∘ (π_1(g_1) ⊗ π_2(g_2)) = (π_2(g_2) ⊗ π_1(g_1)) ∘ τ shows that

M f\left( \begin{pmatrix} g_2 & \\ & g_1 \end{pmatrix} h \right) = \left( \pi_2(g_2) \otimes \pi_1(g_1) \right) M f(h).

Thus M f ∈ V2 ◦ V1 .
The map M is an intertwining operator since G acts on both the spaces of
π1 ◦ π2 and π2 ◦ π1 by right translation, and f −→ M f obviously commutes
with right translation. We must show that it is nonzero. Choose a nonzero
vector ξ ∈ V_1 ⊗ V_2. Define

f \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{cases} \left( \pi_1(A) \otimes \pi_2(D) \right) \xi & \text{if } C = 0, \\ 0 & \text{otherwise,} \end{cases}

where A, B, C and D are blocks, A being r × r and D being t × t, etc. It is


clear that f ∈ V1 ◦ V2 . Now
M f \begin{pmatrix} & I_t \\ -I_r & \end{pmatrix} = \sum_{X \in \mathrm{Mat}_{r \times t}} \tau f\left( \begin{pmatrix} & -I_r \\ I_t & \end{pmatrix} \begin{pmatrix} I & X \\ & I \end{pmatrix} \begin{pmatrix} & -I_t \\ I_r & \end{pmatrix} \right),

and the term is zero unless X = 0, so this equals τ(ξ) ≠ 0. This proves that
the intertwining operator M is nonzero. □
Returning momentarily to automorphic forms, the functional equation
(47.1) extends to Eisenstein series in several complex variables attached to
cusp forms for general parabolic subgroups. We will not try to formulate a
precise theorem, but suffice it to say that if πi are automorphic cuspidal rep-
resentations of GL(λi , R) and s = (s1 , . . . , sr ) ∈ Cr , and if ds : Pλ (R) −→ C
is the quasicharacter

d_s \begin{pmatrix} g_1 & & * \\ & \ddots & \\ & & g_r \end{pmatrix} = |\det(g_1)|^{s_1} \cdots |\det(g_r)|^{s_r},
then there is a representation Ind(π1 ⊗ · · · ⊗ πr ⊗ ds ) of GL(k, R) induced
parabolically from the representation π1 ⊗ · · · ⊗ πr ⊗ ds of Mλ . One may form
an Eisenstein series by a series that is absolutely convergent if re(si − sj ) are
sufficiently large and that has meromorphic continuation to all si . There are
functional equations that permute the constituents |det|^{s_i} ⊗ π_i.
If some of the πi are equal, the Eisenstein series will have poles. The polar
divisor maps out the places where the representations Ind(π1 ⊗ · · · ⊗ πr ⊗ ds )
are reducible. Restricting ourselves to the subspace of C^r where Σ s_i = 0,
the following picture emerges. If all of the πi are equal, then the polar divisor
will consist of r(r − 1) hyperplanes in parallel pairs. There will be r! points
where r − 1 hyperplanes meet in pairs. These are the points where the induced
representation Ind(π1 ⊗ · · · ⊗ πr ⊗ ds ) is maximally reducible. Regarding the
reducibility of representations, we will see that there are both similarities and
dissimilarities with the finite field case.
Returning to the case of a finite field, we will denote by T the subgroup
of diagonal matrices in GL(k, F_q). If α is a root, we will denote by U_α the
one-dimensional unipotent subgroup of GL(k, F_q) corresponding to α. Thus, if α = α_{ij}
in the notation (27.7), then U_α consists of the matrices of the form I + tE_{ij},
where E_{ij} has a 1 in the i, jth position and zeros elsewhere.
If λ = (λ1 , . . . , λr ) is an ordered partition of k, πi are representations of
GL(λi , Fq ), and πλ = π1 ⊗ · · · ⊗ πr is the corresponding representation of Mλ ,
we will use Ind(πλ ) as an alternative notation for π1 ◦ · · · ◦ πr .
Theorem 47.3 (Harish-Chandra). Suppose that λ = (λ_1, . . . , λ_r) and μ =
(μ_1, . . . , μ_t) are ordered partitions of k, and let π_λ = ⊗π_i and π_μ = ⊗π'_j be
cuspidal representations of M_λ and M_μ, respectively. Then

\dim \mathrm{Hom}_{GL(k,F_q)}\left( \mathrm{Ind}(\pi_\lambda), \mathrm{Ind}(\pi_\mu) \right)

is zero unless r = t. If r = t, it is the number of permutations σ of {1, 2, . . . , r}
such that λ_{σ(i)} = μ_i and π_{σ(i)} ≅ π'_i.

See also Harish-Chandra [62], Howe [74] and Springer [151].


Proof. Let V_i be the space of π_i and let V'_i be the space of π'_i, so π_λ acts on
V = ⊗V_i and π_μ acts on V' = ⊗V'_i. By Mackey's theorem in the geometric
form of Theorem 32.1, the dimension of this space of intertwining operators is
the dimension of the space of functions Δ : GL(k, F_q) −→ Hom_C(V, V') such
that for p ∈ P_λ and p' ∈ P_μ we have

Δ(p' g p) = π_μ(p') Δ(g) π_λ(p).

Of course, Δ is determined by its values on a set of coset representatives
for P_μ\G/P_λ, and by Proposition 46.1, these may be taken to be a set of
representatives of W_μ\W/W_λ, where if T is the maximal torus of diagonal
elements of GL(k, F_q), then W = N(T)/T, while W_λ = N_{M_λ}(T)/T and W_μ =
N_{M_μ}(T)/T. Thus W_λ is isomorphic to S_{λ_1} × · · · × S_{λ_r} and W_μ is isomorphic
to S_{μ_1} × · · · × S_{μ_t}.
In the terminology of Remark 32.2, let us ask under what circumstances
the double coset Pμ wPλ can support an intertwining operator. We assume
that Δ(w) ≠ 0.
We will show that wMλ w−1 ⊇ Mμ . We first note that Mμ ∩ wBk w−1 is a
(not necessarily standard) Borel subgroup of Mμ . This is because it contains
T , and if α is any root of Mμ , then exactly one of Uα or U−α is contained in
Mμ ∩wBk w−1 (Exercise 47.3). Now Mμ ∩wPλ w−1 contains Mμ ∩wBk w−1 and
hence is either Mμ or a (not necessarily standard) parabolic subgroup of Mμ .
We will show that it must be all of M_μ. If not, the unipotent radical of M_μ ∩ wP_λw^{-1}
is M_μ ∩ wU_λw^{-1}. Now, if u ∈ M_μ ∩ wU_λw^{-1}, then w^{-1}uw ∈ U_λ, so
Δ(w) = Δ(u−1 · w · w−1 uw) = πμ (u−1 ) ◦ Δ(w). (47.5)
This means that any element of the image of Δ(w) is invariant under πμ (u) and
hence zero by the cuspidality of πμ . We are assuming that Δ(w) is nonzero, so
this contradiction shows that Mμ = Mμ ∩wPλ w−1 . Thus Mμ ⊆ wPλ w−1 . This
actually implies that Mμ ⊆ wMλ w−1 because if α is any root of Mμ , then
Pλ contains both w−1 Uα w and w−1 U−α w, which implies that Mλ contains
w−1 Uα w, so Uα ⊆ wMλ w−1 . Therefore wMλ w−1 ⊇ Mμ .
Next let us show that wM_λw^{-1} ⊆ M_μ. As in the previous case, M_λ ∩
w^{-1}P_μw contains the (not necessarily standard) Borel subgroup M_λ ∩ w^{-1}B_μw
of M_λ, so either it is all of M_λ or a parabolic subgroup of M_λ. If it is a parabolic
subgroup, its unipotent radical is M_λ ∩ w^{-1}U_μw. If u ∈ M_λ ∩ w^{-1}U_μw, then
by (47.5) we have
Δ(w) = Δ(wuw−1 · w · u−1 ) = Δ(w) ◦ πλ (u−1 ).
By Proposition 47.1, this implies that Δ(w) = 0; this contradiction implies
that Mλ = Mλ ∩ w−1 Pμ w, and reasoning as before gives Mλ ⊆ w−1 Mμ w.
Combining the two inclusions, we have proved that if the double coset
Pμ wPλ supports an intertwining operator, then Mμ = wMλ w−1 . This means
r = t.

Now, since the representative w is only determined modulo left and right
multiplication by Mμ and Mλ , respectively, we may assume that w takes
positive roots of Mλ to positive roots of Mμ . Thus, a representative of w is a
“block permutation matrix” of the form

w = \begin{pmatrix} w_{11} & \cdots & w_{1r} \\ \vdots & & \vdots \\ w_{t1} & \cdots & w_{tr} \end{pmatrix},

where each wij is a μi × λj block, and either wij = 0 or μi = λj and wij is an


identity matrix of this size, and there is exactly one nonzero wij in each row
and column. Let σ be the permutation of {1, 2, . . . , r} such that wi,σ(i) is not
zero. Thus λ_{σ(i)} = μ_i, and if g_j ∈ GL(λ_j, F_q), then we can write

w \begin{pmatrix} g_1 & & \\ & \ddots & \\ & & g_r \end{pmatrix} = \begin{pmatrix} g_{\sigma(1)} & & \\ & \ddots & \\ & & g_{\sigma(r)} \end{pmatrix} w.
Thus

    Δ(w) ◦ πλ (diag(g1 , . . . , gr )) = πμ (diag(gσ(1) , . . . , gσ(r) )) ◦ Δ(w),

so

    Δ(w) ◦ (π1 (g1 ) ⊗ · · · ⊗ πr (gr )) = (π′1 (gσ(1) ) ⊗ · · · ⊗ π′r (gσ(r) )) ◦ Δ(w).
Since the representations π and π′ of Mλ and Mμ are irreducible, Schur's
lemma implies that Δ(w) is determined up to a scalar multiple, and moreover
π′i ≅ πσ(i) as a representation of GL(μi , Fq ) = GL(λσ(i) , Fq ).
We see that the double cosets that can support an intertwining operator
are in bijection with the permutations σ of {1, 2, . . . , r} such that λσ(i) = μi
and πσ(i) ≅ π′i , and that the dimension of the space of intertwining operators
that are supported on a single such coset is 1. The theorem follows.
This theorem has some important consequences.

Theorem 47.4. Suppose that λ = (λ1 , . . . , λr ) is an ordered partition of k,


and let πλ = ⊗πi be a cuspidal representation of Mλ . Suppose that no πi ∼ = πj .
Then π1 ◦ · · · ◦ πr is irreducible. Its isomorphism class is unchanged if the λi
and πi are permuted. If (μ1 , . . . , μt ) is another ordered partition of k, and
πμ = π1 ◦ · · · ◦ πt is a cuspidal representation of Mμ , with the πi also distinct,
then π1 ◦ · · · ◦ πr ∼ = π1 ◦ · · · ◦ πt if and only if r = t and there is a permutation
σ of {1, . . . , r} such that μi = λσ(i) and πi ∼ = πσ(i) .
Remark 47.1. This is the usual case. If q is large, the probability that there is
a repetition among a list of randomly chosen cuspidal representations is small.

Remark 47.2. The statement that the isomorphism class is unchanged if the λi
and πi are permuted is the analog of the functional equations of the Eisenstein
series.

Proof. By Theorem 47.3, the dimension of the space of intertwining operators
from Ind(πλ ) to itself is one, and it follows that this representation is irreducible.
The last statement is also clear from Theorem 47.3.
Suppose now that l is a divisor of k and that k = lt. Let π0 be a cuspidal
representation of GL(l, Fq ). Let us denote by π0◦t the representation π0 ◦ · · · ◦ π0
(t copies). We call any irreducible constituent of π0◦t a π0 -monatomic irreducible
representation. As a special case, if π0 is the trivial representation
of GL(1, Fq ), we are in the situation of Iwahori's Theorem 46.3. There, we saw
that the endomorphism ring of π0◦t was the Hecke algebra Ht (q), a deformation
of the group algebra of the symmetric group St , and we thereby obtained a
parametrization of its irreducible constituents by the irreducible representations
of St , or equivalently by partitions of t. The following result generalizes
Theorem 46.3.
Theorem 47.5 (Howlett and Lehrer). Let π0 be a cuspidal representation
of GL(l, Fq ). Then the endomorphism ring End(π0◦t ) is naturally isomorphic
to Ht (q l ).

Proof. Proofs may be found in Howlett and Lehrer [80] and Howe [74].
Corollary 47.1. There exists a natural bijection λ ↦ σλ (π0 ) between the set
of partitions λ of t and the irreducible constituents of π0◦t . The multiplicity of
σλ (π0 ) in π0◦t equals the degree of the irreducible character sλ of the symmetric
group St parametrized by λ.

Proof. The multiplicity of σλ (π0 ) in π0◦t equals the multiplicity of the
corresponding module of Ht (q l ). By Exercise 46.5, this is the degree of sλ .
Although we will not make use of the multiplicative structure that is contained
in this theorem of Howlett and Lehrer, we may at least see immediately
that
    dim End(π0◦t ) = t!,                                     (47.6)
by Theorem 47.3, taking μ = λ and all πi , π′i to be π0 . This is enough for the
following result.
Theorem 47.6. Let (λ1 , . . . , λr ) be an ordered partition of k, and let λi = li ti .
Let πi be a cuspidal representation of GL(li , Fq ), with no two πi isomorphic.
Let θi be a πi -monatomic irreducible representation of GL(λi , Fq ). Let
θλ = ⊗θi . Then Ind(θλ ) is irreducible, and every irreducible representation
of GL(k, Fq ) is of this type. If (μ1 , . . . , μt ) is another ordered partition of k,
if θ′i is a family of monatomic representations of GL(μi , Fq ) with respect to
another set of distinct cuspidals, and if θ′μ = ⊗θ′i , then Ind(θλ ) ≅ Ind(θ′μ )
if and only if r = t and there exists a permutation σ of {1, . . . , r} such that
μi = λσ(i) and θ′i ≅ θσ(i) .

Proof. We note the following general principle: if χ is a character of any group
and χ = Σi di χi is a decomposition into subrepresentations such that

    ⟨χ, χ⟩ = Σi di² ,

then the χi are irreducible and mutually nonisomorphic. Indeed, we have

    Σi di² = ⟨χ, χ⟩ = Σi di² ⟨χi , χi ⟩ + Σ_{i≠j} di dj ⟨χi , χj ⟩ .

All the inner products ⟨χi , χi ⟩ ≥ 1 and all the ⟨χi , χj ⟩ ≥ 0, so this implies
that all the ⟨χi , χi ⟩ = 1 and all the ⟨χi , χj ⟩ = 0.
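This principle is easy to illustrate concretely. The following sketch (ours, not from the text) checks it against the well-known character table of S3 using exact rational arithmetic: taking χ = triv + 2·std gives ⟨χ, χ⟩ = 1² + 2² = 5, consistent with the decomposition into irreducibles with multiplicities 1 and 2.

```python
from fractions import Fraction

# Character table of S_3 on the classes (e, transpositions, 3-cycles),
# which have sizes 1, 3, 2.  All values are real, so no conjugation needed.
sizes = [1, 3, 2]
triv = [1, 1, 1]
sign = [1, -1, 1]
std = [2, 0, -1]

def inner(chi, psi):
    # <chi, psi> = (1/|G|) * sum over classes of (class size) * chi * psi
    return Fraction(sum(s * a * b for s, a, b in zip(sizes, chi, psi)),
                    sum(sizes))

# chi = 1*triv + 2*std, so the principle predicts <chi, chi> = 1^2 + 2^2
chi = [t + 2 * s for t, s in zip(triv, std)]
print(inner(chi, chi))  # -> 5
```
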
Decompose each πi◦ti into a direct sum ⊕j dij θij of distinct irreducibles
θij with multiplicities dij . The representation θi is among the θij . We have

    π1◦t1 ◦ · · · ◦ πr◦tr = Σ_{j1} · · · Σ_{jr} (d1j1 · · · drjr ) θ1j1 ◦ · · · ◦ θrjr .

The dimension of the endomorphism ring of this module is computed by
Theorem 47.3. The number of permutations of the advertised type is t1 ! · · · tr !
because each permutation must map the ti copies of πi among themselves.
On the other hand, by (47.6), we have

    Σ_{j1} · · · Σ_{jr} (d1j1 · · · drjr )² = t1 ! · · · tr !

also. By the “general principle” stated at the beginning of this proof, it follows
that the representations θ1j1 ◦ · · · ◦ θrjr are irreducible and mutually
nonisomorphic.
Next we show that every irreducible representation π is of the form Ind(θλ ).
If π is cuspidal, then π is monatomic, and so we can just take r = t1 = 1 and
θ1 = π. We assume therefore that π is not cuspidal. Then by Proposition 47.2 we
may embed π into π1 ◦ · · · ◦ πm for some cuspidal representations πi . By
Proposition 47.4, we may order these so that isomorphic πi are adjacent, so
π is embedded in a representation of the form π1◦t1 ◦ · · · ◦ πr◦tr , where πi are
nonisomorphic cuspidal representations. We have determined the irreducible
constituents of such a representation, and they are of the form Ind(θλ ), where
θi is πi -monatomic. Hence π is of this form.
We leave the final uniqueness assertion for the reader to deduce from
Theorem 47.3.
The great paper of Green [58] constructs all the irreducible representations
of GL(k, Fq ). Systematic use is made of the ring R(q). However, Green does
not start with the cuspidal representations. Instead, Green takes as his ba-
sic building blocks certain generalized characters that are “lifts” of modular
characters, described in the following theorem.
Theorem 47.7 (Green). Let G be a finite group, and let ρ : G −→ GL(k, Fq )
be a representation. Let f ∈ Z[X1 , . . . , Xk ] be a symmetric polynomial with
integer coefficients. Let θ : F̄×q −→ C× be any character. Let χ : G −→ C be
the function
    χ(g) = f (θ(α1 ), . . . , θ(αk )),
where α1 , . . . , αk are the eigenvalues of ρ(g). Then χ is a generalized character.
Proof. First, we reduce to the following case: θ : F̄×q −→ C× is injective
and f (X1 , . . . , Xk ) = Σ Xi . If this case is known, then by replacing ρ by
its exterior powers we get the same result for the elementary symmetric
polynomials, and hence for all symmetric polynomials. Then we can take
f (X1 , . . . , Xk ) = Σ Xi^r , effectively replacing θ by θr . We may choose r to
match any given character on a finite field containing all eigenvalues of all g,
obtaining the result in full generality.
We recall that if l is a prime, a group is l-elementary if it is the direct
product of a cyclic group and an l-group. According to Brauer’s characteriza-
tion of characters (Theorem 8.4(a) on p. 127 of Isaacs [83]), a class function
is a generalized character if and only if its restriction to every l-elementary
subgroup H (for all l) is a generalized character. Thus, we may assume that
G is l-elementary. If p is the characteristic of Fq , whether l = p or not, we may
write G = P × Q, where P is a p-group and p ∤ |Q|. The restriction of χ to Q
is a character by Isaacs [83], Theorem 15.13 on p. 268. The result will follow
if we show that χ(gp q) = χ(q) for gp ∈ P , q ∈ Q. Since gp and q commute,
using the Jordan canonical form, we may find a basis for the representation
space of ρ over F̄q such that ρ(q) is diagonal and ρ(gp ) is upper triangular.
Because the order of gp is a power of p, its diagonal entries are 1’s, so q and
gp q have the same eigenvalues, whence χ(gp q) = χ(q).

Since the proof of this theorem of Green is purely character-theoretic, it
does not directly produce irreducible representations. And the characters that
it produces are not irreducible. (We will look more closely at them later.) How-
ever, Green’s generalized characters have two important advantages. First,
their values are easily described. By contrast, the values of cuspidal repre-
sentations are easily described on the semisimple conjugacy classes, but at
other classes require knowledge of “degeneracy rules” which we will not de-
scribe. Second, Green’s generalized character can be extended to a generalized
character of GL(n, Fqr ) for any r, a property that ordinary characters do not
have.
Still, the cuspidal characters have a satisfactory direct description, which
we turn to next. Choosing a basis for Fqk as a k-dimensional vector space
over Fq and letting F×qk act by multiplication gives an embedding F×qk −→
GL(k, Fq ). Call the image of this embedding T(k) . More generally, if λ =
(λ1 , . . . , λr ) is a partition of k, then Tλ is the group F×qλ1 × · · · × F×qλr
embedded in GL(k, Fq ) the same way. We will call any Tλ —or any conjugate
of such a group—a torus. An element of GL(k, Fq ) is called semisimple if it
is diagonalizable over the algebraic closure of Fq . This is equivalent to assuming
that it is contained in some torus. It is called regular semisimple if its eigenvalues
are distinct. This is equivalent to assuming that it is contained in a unique
torus.
There is a very precise duality between the conjugacy classes of GL(k, Fq )
and its irreducible representations. Some aspects of this duality are shown in
Table 47.2. In each case, there is an exact numerical equivalence. For example,
the number of unipotent conjugacy classes is the number of partitions of k,
and this is also the number of unipotent representations, as we saw in The-
orem 46.1. Again, the number of cuspidal representations equals the number
of regular semisimple conjugacy classes whose eigenvalues generate Fqk . We
will prove this in Theorem 47.8.

Table 47.2. The duality between conjugacy classes and representations

    Class type                                    Representation type
    Central conjugacy classes                     One-dimensional representations
    Regular semisimple conjugacy classes          Induced from distinct cuspidals
    Regular semisimple conjugacy classes          Cuspidal representations
      whose eigenvalues generate Fqk
    Unipotent conjugacy classes                   Unipotent representations
    Conjugacy classes whose characteristic        Monatomic representations
      polynomial is a power of an irreducible
To formalize this duality, and to exploit it in order to count the irreducible
cuspidal representations, we will divide the conjugacy classes of GL(k, Fq ) into
“types.” Roughly, two conjugacy classes have the same type if their rational
canonical forms have the same shape. For example, GL(2, Fq ) has four distinct
types of conjugacy classes. They are

    [ a   ]             [ a   ]     [ a  1 ]     [     0         1     ]
    [   b ] (a ≠ b),    [   a ] ,   [    a ] ,   [ −ν^{1+q}   ν + ν^q ] ,
502 47 The Philosophy of Cusp Forms

where the last consists of the conjugacy classes of matrices whose eigenvalues
are ν and ν q , where ν ∈ Fq2 − Fq . In the duality, these four types of conjugacy
classes correspond to the four types of irreducible representations: the
(q + 1)-dimensional principal series, induced from a pair of distinct characters of
GL(1); the one-dimensional representations χ ◦ det, where χ is a character
of F×q ; the q-dimensional representations obtained by tensoring the Steinberg
representation with a one-dimensional character; and the (q − 1)-dimensional
cuspidal representations.
Let f (X) = X d + ad−1 X d−1 + · · · + a0 be a monic irreducible polynomial
over Fq of degree d. Let
              [   0     1     0   · · ·    0    ]
              [   0     0     1            0    ]
    U (f ) =  [   ⋮                 ⋱      ⋮    ]
              [   0     0     0   · · ·    1    ]
              [ −a0   −a1   −a2   · · ·  −ad−1  ]
be the rational canonical form. Let
              [ U (f )    Id      0    · · ·    0    ]
              [   0     U (f )    Id            ⋮    ]
    Ur (f ) = [   0       0     U (f )   ⋱           ]
              [   ⋮                      ⋱     Id    ]
              [   0      · · ·                U (f ) ] ,
an array of r × r blocks, each of size d × d. If λ = (λ1 , . . . , λt ) is a partition of
r, so that λ1 ≥ · · · ≥ λt are nonnegative integers with |λ| = Σi λi = r, let

    Uλ (f ) = diag( Uλ1 (f ), . . . , Uλt (f ) ).

Then every conjugacy class of GL(k, Fq ) has a representative of the form

    diag( Uλ1 (f1 ), . . . , Uλm (fm ) ),                    (47.7)

where the fi are distinct monic irreducible polynomials and each λi =
(λi1 , λi2 , . . .) is a partition. The conjugacy class is unchanged if the fi and
λi are permuted, but otherwise they are uniquely determined.
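As a quick computational illustration (ours, not from the text), one can build U(f) for a small example and verify the Cayley–Hamilton property f(U(f)) = 0 over Fp, reflecting the fact that the companion matrix U(f) has characteristic polynomial f.

```python
def companion(coeffs, p):
    # U(f) for monic f = X^d + a_{d-1} X^{d-1} + ... + a_0 over F_p,
    # with f given by its lower coefficients [a_0, ..., a_{d-1}]
    d = len(coeffs)
    U = [[0] * d for _ in range(d)]
    for i in range(d - 1):
        U[i][i + 1] = 1                  # superdiagonal of 1's
    for j, a in enumerate(coeffs):
        U[d - 1][j] = (-a) % p           # bottom row: -a_0, ..., -a_{d-1}
    return U

def matmul(A, B, p):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p
             for j in range(n)] for i in range(n)]

def poly_at_matrix(coeffs, U, p):
    # evaluate f(U) = U^d + a_{d-1} U^{d-1} + ... + a_0 I modulo p
    n = len(U)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # U^0 = I
    result = [[0] * n for _ in range(n)]
    for a in coeffs + [1]:               # coefficients a_0, ..., a_{d-1}, 1
        for i in range(n):
            for j in range(n):
                result[i][j] = (result[i][j] + a * power[i][j]) % p
        power = matmul(power, U, p)
    return result

# f(X) = X^2 + 1 is irreducible over F_3, and f(U(f)) = 0
U = companion([1, 0], 3)
zero = poly_at_matrix([1, 0], U, 3)
```

Here f(X) = X² + 1 gives U(f) = [[0, 1], [2, 0]] over F3, and the evaluation returns the zero matrix.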
Thus the conjugacy class is determined by the following data: a pair of
sequences r1 , . . . , rm and d1 , . . . , dm of positive integers, and for each 1 ≤ i ≤ m
a partition λi of ri and a monic irreducible polynomial fi ∈ Fq [X] of degree
di , such that fi ≠ fj if i ≠ j. The data ({ri }, {di }, {λi }, {fi }) and
({r′i }, {d′i }, {(λ′)i }, {f′i }) parametrize the same conjugacy class if and only if
they both have the same length m and there exists a permutation σ ∈ Sm
such that r′i = rσ(i) , d′i = dσ(i) , (λ′)i = λσ(i) and f′i = fσ(i) .
We say two conjugacy classes are of the same type if the parametrizing
data have the same length m and there exists a permutation σ ∈ Sm such
that r′i = rσ(i) , d′i = dσ(i) , (λ′)i = λσ(i) . (The fi and f′i are allowed to differ.)
The set of types of conjugacy classes depends on k, but is independent of q
(though if q is too small, some types might be empty).
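The enumeration of types can be automated. The sketch below (ours, not from the text) counts multisets of components (d, r, λ) with Σ d·r = k, the polynomials fi being forgotten when passing to the type; it reproduces the four types of GL(2, Fq) listed above.

```python
def partitions(n):
    # all partitions of n as nonincreasing tuples
    def rec(n, maxpart):
        if n == 0:
            yield ()
            return
        for first in range(min(n, maxpart), 0, -1):
            for rest in rec(n - first, first):
                yield (first,) + rest
    yield from rec(n, n)

def class_types(k):
    # a type is a multiset of components (d, r, lambda), lambda a partition
    # of r, with the degrees d*r summing to k; repeated components are
    # allowed, since they may carry distinct polynomials f of degree d
    comps = [(d, r, lam)
             for d in range(1, k + 1) for r in range(1, k + 1)
             if d * r <= k for lam in partitions(r)]
    types = set()
    def rec(remaining, start, chosen):
        if remaining == 0:
            types.add(tuple(sorted(chosen)))
            return
        for i in range(start, len(comps)):
            d, r, _ = comps[i]
            if d * r <= remaining:
                rec(remaining - d * r, i, chosen + [comps[i]])
    rec(k, 0, [])
    return types

print(len(class_types(2)))  # -> 4
```

For k = 3 the same enumeration gives eight types, matching a hand count of the rational canonical forms for GL(3, Fq).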
Lemma 47.1. Let {N1 , N2 , . . .} be a sequence of numbers, and for each Nk
let Xk be a set of cardinality Nk (the Xk disjoint). Let Σk be the following set.
An element of Σk consists of a 4-tuple ({ri }, {di }, {λi }, {xi }), where {ri } =
{r1 , . . . , rm } and {di } = {d1 , . . . , dm } are sequences of positive integers such
that Σ ri di = k, together with a sequence {λi } of partitions of ri and elements
xi ∈ Xdi , no two of which are equal. Define an equivalence relation ∼
on Σk in which two elements are considered equivalent if they can be obtained
from each other by permuting the data; that is, if σ ∈ Sm then

    ({ri }, {di }, {λi }, {xi }) ∼ ({rσ(i) }, {dσ(i) }, {λσ(i) }, {xσ(i) }).

Let Mk be the number of equivalence classes. Then the sequence of numbers
Nk is determined by the sequence of numbers Mk .

Proof. By induction on k, we may assume that the cardinalities N1 , . . . , Nk−1
are determined by the Mk . Let M′k be the cardinality of the set of equivalence
classes of ({ri }, {di }, {λi }, {xi }) ∈ Σk in which no xi ∈ Xk . Clearly M′k depends
only on the cardinalities N1 , . . . , Nk−1 of the sets X1 , . . . , Xk−1 from
which the xi are to be drawn, so (by induction) it is determined by the Mi .
Now we claim that Nk = Mk − M′k . Indeed, given ({ri }, {di }, {λi }, {xi }) ∈
Σk of length m, if any xi ∈ Xk , then since Σi ri di = k, we must have
m = 1, r1 = 1, d1 = k, and the number of such elements is exactly Nk .
Theorem 47.8. The number of cuspidal representations of GL(k, Fq ) equals
the number of irreducible monic polynomials of degree k over Fq .
Proof. We can apply the lemma with Xk either the set of cuspidal representations
of GL(k, Fq ) or the set of monic irreducible polynomials of degree k over
Fq . We will show that in the first case, Mk is the number of irreducible representations
of GL(k, Fq ), while in the second, Mk is the number of conjugacy
classes. Since these are equal, the result follows.
If Xk is the set of cuspidal representations of GL(k, Fq ), from each
element ({ri }, {di }, {λi }, {xi }) ∈ Σk we can build an irreducible representation
of GL(k, Fq ) as follows. First, since xi is a cuspidal representation of
GL(di , Fq ), we can build the xi -monatomic representations of GL(di ri , Fq ) by
decomposing xi◦ri . By Corollary 47.1, the irreducible constituents of xi◦ri are
parametrized by partitions of ri , so xi and λi parametrize an xi -monatomic
representation πi of GL(ri di , Fq ). Let π = π1 ◦ · · · ◦ πm . By Theorem 47.4,
every irreducible representation of GL(k, Fq ) is constructed uniquely (up to
permutation of the πi ) in this way.
On the other hand, take Xk to be the set of monic irreducible polynomials
of degree k over Fq . We have explained above how the conjugacy classes of
GL(k, Fq ) are parametrized by such data.
Deligne and Lusztig [41] gave a parametrization of the characters of any
reductive group over a finite field by characters of tori. Carter [32] is a basic
reference for Deligne–Lusztig characters. Many important formulae, such
as a generalization of Mackey theory to cohomologically induced representations
and an extension of Green's “degeneracy rules,” are obtained. This
theory is very satisfactory, but the construction requires l-adic cohomology.
For GL(k, Fq ), the parametrization of irreducible characters by characters of
tori can be described without resorting to such deep methods. The key point
is the parametrization of the cuspidal characters by characters of T(k) ≅ F×qk .
Combining this with parabolic induction gives the parametrization of more
general characters by characters of other tori.
Thus let θ : T(k) ≅ F×qk −→ C× be a character such that the orbit of θ
under Gal(Fqk /Fq ) has cardinality k. The number of Gal(Fqk /Fq )-orbits of
such characters is

    (1/k) Σ_{d|k} μ(k/d) q^d ,                               (47.8)

where μ is the Möbius function—the same as the number of semisimple
conjugacy classes. There exists a cuspidal character σk = σk,θ of GL(k, Fq )
whose value on a regular semisimple conjugacy class g is zero unless g is
conjugate to an element of T(k) , that is, unless the eigenvalues of g are the
roots α, α^q , . . . , α^{q^{k−1}} of an irreducible monic polynomial of degree k
in Fq [X], so that Fqk = Fq [α]. In this case,

    σk (g) = (−1)^{k−1} Σ_{j=0}^{k−1} θ(α^{q^j} ).

By Theorem 47.8, the number of σk,θ is the total number of cuspidal repre-
sentations, so this is a complete list.
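The count in (47.8) is easy to verify by machine for small fields. The following Python sketch (ours, not part of the text) counts monic irreducible polynomials of degree n over Fp by brute-force trial multiplication and compares with the Möbius sum.

```python
from itertools import product

def polymul(a, b, p):
    # multiply coefficient lists (constant term first) modulo p
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def monic(deg, p):
    # all monic polynomials of the given degree over F_p
    for coeffs in product(range(p), repeat=deg):
        yield list(coeffs) + [1]

def count_irreducible(n, p):
    # a monic polynomial of degree n is reducible iff it has a monic
    # factor of degree between 1 and n//2
    reducible = set()
    for d in range(1, n // 2 + 1):
        for f in monic(d, p):
            for g in monic(n - d, p):
                reducible.add(tuple(polymul(f, g, p)))
    return p ** n - len(reducible)

def mobius(n):
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0               # squarefull: mu = 0
            result = -result
        d += 1
    return -result if n > 1 else result

def mobius_count(n, q):
    # (1/n) * sum over d | n of mu(n/d) q^d, as in (47.8)
    return sum(mobius(n // d) * q ** d
               for d in range(1, n + 1) if n % d == 0) // n

assert all(count_irreducible(n, p) == mobius_count(n, p)
           for p in (2, 3) for n in (1, 2, 3, 4))
```

For example, over F2 there is exactly one irreducible quadratic, X² + X + 1, and both counts give 1.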
We will first construct σk under the assumption that θ, regarded as a
character of F×qk , can be extended to a character θ : F̄×q −→ C× that is
injective. This assumption is too restrictive, and we will later relax it. We
will also postpone showing that σk is independent of the extension of θ
to F̄×q . Eventually we will settle these points completely in the special case
where k is a prime.
Let

    χk (g) = Σ_{i=1}^{k} θ(αi ),                             (47.9)
where the αi are the eigenvalues of g ∈ GL(k, Fq ). By Green's theorem, χk is
a generalized character.
Proposition 47.4. Assume that θ can be extended to a character θ : F̄×q −→
C× that is injective. Then the inner product ⟨χk , χk ⟩ = k.
Proof. We will first prove that this is true for q sufficiently large, then show
that it is true for all q. We will use “big O” notation, and denote by O(q −1 )
any term that is bounded by a factor independent of q times q −1 . The idea of
the proof is to show that as a function of q, the inner product is k + O(q −1 ).
Since it is an integer, it must equal k when q is sufficiently large.
The number of elements of G = GL(k, Fq ) is q^{k²} + O(q^{k²−1} ). This is clear
since G is the complement of the determinant locus in Matk (Fq ) ≅ Fq^{k²} . The
set Greg of regular semisimple elements also has order q^{k²} + O(q^{k²−1} ) since it
is the complement of the discriminant locus. Since |χk (g)| ≤ k for all g,
    ⟨χk , χk ⟩ = (1/|G|) Σ_{g∈Greg} |χk (g)|² + O(q^{−1}).

Because every regular element is contained in a unique conjugate of some Tλ ,
which has exactly [G : NG (Tλ )] such conjugates, this equals

    (1/|G|) Σ_{λ a partition of k} [G : NG (Tλ )] Σ_{g∈Tλ^reg} |χk (g)|² + O(q^{−1})
      = (1/|G|) Σ_{λ} [G : NG (Tλ )] Σ_{g∈Tλ} |χk (g)|² + O(q^{−1}),

the last step using the fact that the complement of Tλ^reg in Tλ is of
codimension one. We note that the restriction of χk to Tλ is the sum of k distinct
characters, so

    Σ_{g∈Tλ} |χk (g)|² = k |Tλ |.

Thus the inner product is

    k · (1/|G|) Σ_{λ} [G : NG (Tλ )] |Tλ | + O(q^{−1}).

We have

    (1/|G|) Σ_{λ} [G : NG (Tλ )] |Tλ | = (1/|G|) Σ_{λ} [G : NG (Tλ )] |Tλ^reg | + O(q^{−1})
                                      = (1/|G|) |Greg | + O(q^{−1})
                                      = 1 + O(q^{−1}).
The result is now proved for q sufficiently large.
To prove the result for all q, we will show that the inner product ⟨χk , χk ⟩
is a polynomial in q. This will follow if we can show that if S is the subset of G
consisting of the union of the conjugacy classes of a single type, then [G : CG (g)]
is constant for g ∈ S and

    Σ_{g∈S} |χk (g)|²                                        (47.10)

is a polynomial in q. We note that for each type, the index of the centralizer
of (47.7) is the same for all such matrices, and that this index is polynomial
in q. Thus it is sufficient to show that the sum over the representatives (47.7)
is a polynomial in q. Moreover, the value of χk is unchanged if every instance
of a Ur (f ) is replaced with r blocks of U (f ), so we may restrictourselves
to semisimple conjugacy classes in confirming this. Thus if k = di ri , we
consider the sum (47.10), where the sum is over all matrices
⎛ ⎞
U(r1 ) (f1 )
⎜ .. ⎟
⎝ . ⎠,
U(rm ) (fm )

where the fi are distinct irreducible polynomials, each of degree di , and U(r) (f )
is the direct sum of r blocks of U (f ). It is useful to conjugate these matrices so
that they are all elements of the same torus Tλ for some λ. The set S is then a
subset of Tλ characterized by exclusion from certain (non-maximal) subtori.
Let us look at an example. Suppose that λ = (2, 2, 2) and k = 6. Then
S consists of elements of Tλ , which may be regarded as triples (α, β, γ) in
(F×q2 )³ , where α, β, and γ are distinct elements of F×q2 − F×q . Now if we
sum (47.10) over all of Tλ we get a polynomial in q, namely 6(q² − 1)³ . On
the other hand, we must subtract from this three contributions when one of
α, β, and γ is in F×q . These are subtori of the form T(2,2,1) . We must also
subtract three contributions from subgroups of the form T(2,2) in which two
of α, β, and γ are equal. Then we must add back contributions that have been
subtracted twice, and so on.
In general, the set S will consist of the set Tλ minus subtori T1 , . . . , TN . If I
is a subset of {1, . . . , N }, let TI = ∩_{i∈I} Ti . We now use the inclusion–exclusion
principle in the form

    Σ_{g∈S} |χk (g)|² = Σ_{g∈Tλ} |χk (g)|² + Σ_{∅≠I⊆{1,...,N}} (−1)^{|I|} Σ_{g∈TI} |χk (g)|² .

Each of the sums on the right is easily seen to be a polynomial in q, and so
is (47.10).
is (47.10). 

Theorem 47.9. Assume that θ : F̄×q −→ C× is an injective character. For
each k there exists a cuspidal character σk = σk,θ of GL(k, Fq ) such that if g is
a regular semisimple element of GL(k, Fq ) with eigenvalues that are the Galois
conjugates of ν ∈ F×qk such that Fqk = Fq (ν), then
    σk,θ (g) = (−1)^{k−1} Σ_{i=0}^{k−1} θ(ν^{q^i} ).        (47.11)

If 1k denotes the trivial character of GL(k, Fq ), then

    χn = Σ_{k=1}^{n} (−1)^{k−1} σk ◦ 1n−k .

Note that σk ◦ 1n−k is an irreducible character of GL(n, Fq ) by Theorem 47.4.
So this gives the expression of χn in terms of irreducibles.
Proof. By induction, we assume the existence of σk and the decomposition of
χk as stated for k < n, and we deduce them for k = n.
We will show first that

    ⟨χn , σk ◦ 1n−k ⟩ = (−1)^{k−1} .                        (47.12)
Let P = M U be the standard parabolic subgroup with Levi factor M =
GL(k, Fq ) × GL(n − k, Fq ) and unipotent radical U . If m ∈ M and u ∈ U , then
as matrices in GL(n, Fq ), m and mu have the same characteristic polynomial,
so χn (mu) = χn (m). Thus, in the notation of Exercise 47.2(ii), with χ = χn ,
we have χU = χ restricted to M . Therefore,

    ⟨χn , σk ◦ 1n−k ⟩G = ⟨χn , σk ⊗ 1n−k ⟩M .
Let

    m = diag(m1 , m2 ) ∈ M,    m1 ∈ GL(k, Fq ),  m2 ∈ GL(n − k, Fq ).
Clearly, χn (m) = χk (m1 ) + χn−k (m2 ). Now using the induction hypothesis,
χn−k does not contain the trivial character of GL(n − k, Fq ); hence it is
orthogonal to 1n−k on GL(n − k, Fq ), so we can ignore χn−k (m2 ). Thus,

    ⟨χn , σk ◦ 1n−k ⟩G = ⟨χk , σk ⟩GL(k,Fq) .

By the induction hypothesis, χk contains σk with multiplicity (−1)^{k−1} , and
so (47.12) is proved.
Now σk ◦ 1n−k is an irreducible representation of GL(n, Fq ) by Theorem
47.4, and so we have exhibited n − 1 irreducible characters, each of which
occurs in χn with multiplicity ±1. Since ⟨χn , χn ⟩ = n, there must be one
remaining irreducible character σn such that

    χn = Σ_{k=1}^{n−1} (−1)^{k−1} σk ◦ 1n−k ± σn .          (47.13)

We show now that σn must be cuspidal. It is sufficient to show that if U
is the unipotent radical of the standard parabolic subgroup with Levi factor
M = GL(k, Fq ) × GL(n − k, Fq ), and if m1 ∈ GL(k, Fq ) and m2 ∈ GL(n − k, Fq ),
then
     
1 1
n−1
m1 m1
χn u = (−1)r−1 (σr ◦ 1n−r ) u ,
|U | m2 |U | r=1 m2
u∈U

since by Exercise 47.2(ii), this will show that the representation affording the
character σn has no U -invariants, the definition of cuspidality. The summand
on the left-hand side is independent of u, and by the definition of χn the
left-hand side is just χk (m1 ) + χn−k (m2 ). By Exercise 47.4, the right-hand
side can also be evaluated. Using (47.11), which we have assumed inductively
for σr with r < n, the terms r = k and r = n − k contribute χk (m1 ) and
χn−k (m2 ) and all other terms are zero.
To evaluate the sign in (47.13), we compare the values at the identity to
get the relation

    n = Σ_{k=1}^{n−1} (−1)^{k−1} (n k)_q Π_{j=1}^{k−1} (q^j − 1) ± Π_{j=1}^{n−1} (q^j − 1),

where

    (n k)_q = Π_{j=1}^{n} (q^j − 1) / ( Π_{j=1}^{k} (q^j − 1) · Π_{j=1}^{n−k} (q^j − 1) )

is the Gaussian binomial coefficient, which is the index of the parabolic
subgroup with Levi factor GL(k) × GL(n − k). Substituting q = 0 in this identity
shows that the missing sign must be (−1)^{n−1} .
If g is a regular element of T(n) , then the value of σn on g is now given
by (47.11), since if k < n then σk ◦ 1n−k vanishes on g, which is not
conjugate to any element of the parabolic subgroup from which σk ◦ 1n−k
is induced.
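The identity used to fix the sign can also be checked numerically. The sketch below (ours, not from the text) uses the standard fact that every cuspidal representation of GL(k, Fq) has degree (q − 1)(q² − 1) · · · (q^{k−1} − 1); evaluating the decomposition of χn at the identity must then give χn(1) = n for every q.

```python
def prod(xs):
    out = 1
    for x in xs:
        out *= x
    return out

def gauss_binom(n, k, q):
    # Gaussian binomial coefficient: the index of the parabolic subgroup
    # with Levi factor GL(k) x GL(n-k)
    num = prod(q ** j - 1 for j in range(1, n + 1))
    den = (prod(q ** j - 1 for j in range(1, k + 1)) *
           prod(q ** j - 1 for j in range(1, n - k + 1)))
    return num // den

def cuspidal_degree(k, q):
    # degree (q-1)(q^2-1)...(q^{k-1}-1) of a cuspidal of GL(k, F_q)
    return prod(q ** j - 1 for j in range(1, k))

def chi_degree(n, q):
    # degree side of chi_n = sum_{k=1}^{n} (-1)^{k-1} sigma_k o 1_{n-k},
    # where deg(sigma_k o 1_{n-k}) = (n k)_q * deg(sigma_k)
    return sum((-1) ** (k - 1) * gauss_binom(n, k, q) * cuspidal_degree(k, q)
               for k in range(1, n + 1))

# chi_n(1) = n for every n and q
assert all(chi_degree(n, q) == n for n in range(1, 7) for q in (2, 3, 4, 5))
```

For instance, with n = 4 and q = 2 the four terms are 15 − 35 + 45 − 21 = 4.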
See Exercise 47.9 for an example showing that the cuspidal characters
that we have constructed are not enough because of our assumption that
θ is injective. Without attempting a completely general result, we will now
give a variation of Theorem 47.9 that is sufficient to construct all cuspidal
representations of GL(k, Fq ) when k is prime.

Proposition 47.5. Let θ : F̄×q −→ C× be a character. Assume that the
restriction of θ to F×q is trivial, but that for any 0 < d ≤ k, the restriction of
θ to F×qd does not factor through the norm map F×qd −→ F×qr for any proper
divisor r of d. Then

    ⟨χk , χk ⟩ = k + 1.

Proof. The proof is similar to that of Proposition 47.4. It is sufficient to show
this for q sufficiently large. As in that proposition, the sum is

    (1/|G|) Σ_{λ a partition of k} [G : NG (Tλ )] Σ_{g∈Tλ} |χk (g)|² + O(q^{−1}).

We note that [NG (Tλ ) : Tλ ] = zλ , defined in (37.1). With our assumptions, if
the partition λ contains r parts of size 1, the restriction of χk to Tλ consists
of r copies of the trivial character and k − r copies of other characters, all
distinct (Exercise 47.8). The inner product of χk with itself on Tλ is thus
k − r + r² . The sum is thus

    Σ_{λ} (1/zλ ) (k + r² − r) + O(q^{−1}).
We can interpret this as a sum over the symmetric group. If σ ∈ Sk , let r(σ)
be the number of fixed points of σ. In the conjugacy class of shape λ, there
are k!/zλ elements, and so

    Σ_{λ} (1/zλ ) (k + r² − r) = (1/k!) Σ_{σ∈Sk} (k + r(σ)² − r(σ)).

Now r(σ) = h(k−1,1) (σ) = s(k−1,1) (σ) + hk (σ) in the notation of Chap. 37. Here,
of course, hk = s(k) is the trivial character of Sk and s(k−1,1) is an irreducible
character of degree k − 1. We note that r(σ)² − r(σ) is the value of the character
s(k−1,1)² + s(k−1,1) , so the sum is

    ⟨k hk + s(k−1,1)² + s(k−1,1) , hk ⟩ = k ⟨hk , hk ⟩ + ⟨s(k−1,1)² , hk ⟩ + ⟨s(k−1,1) , hk ⟩ ,
where the inner product is now over the symmetric group. Clearly ⟨hk , hk ⟩ =
1 and ⟨s(k−1,1) , hk ⟩ = 0. Since the character s(k−1,1) is real and hk is the
constant function equal to 1,

    ⟨s(k−1,1)² , hk ⟩ = ⟨s(k−1,1) , s(k−1,1) ⟩ = 1,

and the result follows.
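The symmetric-group average computed in this proof amounts to saying that the fixed-point count r(σ) over Sk has first moment 1 and second moment 2 for k ≥ 2, so the average of k + r² − r is k + 1. A brute-force sketch (ours, not from the text):

```python
from itertools import permutations
from math import factorial

def average(k):
    # (1/k!) * sum over sigma in S_k of (k + r(sigma)^2 - r(sigma)),
    # where r(sigma) is the number of fixed points of sigma
    total = 0
    for sigma in permutations(range(k)):
        r = sum(1 for i, x in enumerate(sigma) if i == x)
        total += k + r * r - r
    return total // factorial(k)

# agrees with <chi_k, chi_k> = k + 1 from Proposition 47.5
assert all(average(k) == k + 1 for k in range(2, 7))
```

For k = 2, for example, the identity contributes 2 + 4 − 2 = 4 and the transposition contributes 2, for an average of 3 = k + 1.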
Theorem 47.10. Suppose that n is a prime, and let θ : F×qn −→ C× be a
character that does not factor through the norm map F×qn −→ F×qr for any
proper divisor r of n. Then there exists a cuspidal character σn,θ of GL(n, Fq )
such that if g is a regular semisimple element with eigenvalues
ν, ν^q , . . . , ν^{q^{n−1}} ∈ Fqn , then

    σn,θ (g) = (−1)^{n−1} Σ_{i=0}^{n−1} θ(ν^{q^i} ).        (47.14)

This gives a complete list of the cuspidal characters of GL(n, Fq ).

The assumption that n is prime is unnecessary.
Proof. By Exercise 47.11, we can extend θ to a character of F̄×q without
enlarging the kernel. Thus the kernel of θ is contained in F×qn and does not
contain the kernel of any norm map F×qn −→ F×qr for any proper divisor r of n.
There are now two cases.
If θ is nontrivial on F×q , then we may proceed as in Theorem 47.9. We are
not in the case of that theorem, since we have not assumed that the kernel of
θ is trivial, and we do not guarantee that the sequence of cuspidals σk that
we construct can be extended to all k. However, if d ≤ k, our assumptions
guarantee that the restriction of θ to F×qd does not factor through the norm
map to F×qr for any proper divisor r of d, since the kernel of θ is contained in
F×qn , whose intersection with F×qd is just F×q since n is prime and d < n. In
particular, the kernel of θ cannot contain the kernel of N : F×qd −→ F×qr . We
get ⟨χk , χk ⟩ = k for k ≤ n, and proceeding as in Theorem 47.9 we get a
sequence of cuspidal representations σk of GL(k, Fq ) with k ≤ n such that

    χk = Σ_{r=1}^{k} (−1)^{r−1} σr ◦ 1k−r .
If θ is trivial on F×q , it is still true that the restriction of θ to F×qd does
not factor through the norm map to F×qr for any proper divisor r of d whenever
k ≤ n. So ⟨χk , χk ⟩ = k + 1 by Proposition 47.5. Now we can proceed as before,
except that σ1 = 11 , so σ1 ◦ 1k−1 is not irreducible—it is the sum of two
irreducible representations s(k−1,1) (q) and s(k) (q) of GL(k, Fq ), in the notation
of Chap. 46. Of course, s(k) (q) is the same as 1k in the notation we have been
using. The rest of the argument goes through as in Theorem 47.9. In particular,
the inner product formula ⟨χk , χk ⟩ = k + 1, together with the fact that 11 ◦ 1k−1
accounts for two representations in the decomposition of χk , guarantees that
σk , defined up to sign as χk − Σ_{r<k} (−1)^{r−1} σr ◦ 1k−r , is irreducible.
The cuspidal characters we have constructed are linearly independent
by (47.14). They are equal in number to the total number of cuspidal rep-
resentations, and so we have constructed all of them.
Let us consider next representations of reductive groups over local fields.
The problem is to parametrize the irreducible representations of Lie and p-adic
groups such as GL(k, F ), where F = R, C or a non-Archimedean local field.
The parametrization of irreducible representations by characters of tori,
which we have already seen for finite fields, extends to this setting. If T is a
maximal torus of G = GL(k, F ), then the characters of T parametrize certain
representations of G. As we will explain, not all admissible representations
can be parametrized by characters of tori, though in some sense most are so
parametrized. Moreover, if we expand the parametrization we can get a
bijection. This is the local Langlands correspondence, which we will now
discuss (though without formulating a precise statement).
In this context, a torus is the group of rational points of an algebraic group


that, over the algebraic closure of F , is isomorphic to a product of r copies of
the multiplicative group Gm . (See Chap. 24.) The torus is called anisotropic
if it has no subtori isomorphic to Gm over F . If F = R, an anisotropic torus
is compact. For example, SL(2, R) contains two conjugacy classes of maximal
tori—the diagonal torus, and the compact torus SO(2). Over the complex
numbers, the group SO(2, C) is conjugate by the Cayley transform to the
diagonal subgroup, since if a2 + b2 = 1, then
     
a b −1 a + bi 1 1 i
c c = , c= √ .
−b a a − bi 2i 1 −i
Thus, SO(2) is an anisotropic torus. If G is semisimple, then G has an
anisotropic maximal torus if and only if its maximal compact subgroup K
has the same rank as G. An examination of Table 28.1 shows that this is
sometimes true and sometimes not. For example, by Proposition 28.3, this
will be the case if G/K is a Hermitian symmetric space. The group SO(n, 1)
has anisotropic maximal tori if n is even, but not if n is odd. Among the groups SL(k, R), only SL(2, R) has one.
If F is a local field and E/F is an extension of degree k, then, as in the
case of a finite field, we may embed E × −→ GL(k, F ), and the norm one
elements will be an anisotropic torus of SL(k, F ). From this point of view, we
see why SL(2, R) is the only special linear group over R that has an anisotropic
maximal torus—the algebraic closure C of R is too small.
Let G be a locally compact group and Z its center. Let (π, V ) be an
irreducible unitary representation of G. By Schur’s lemma, π(z) acts by a
scalar ω(z) of absolute value 1 for z ∈ Z. Let L2 (G, ω) be the space of all functions f on G such that f (zg) = ω(z)f (g) and
\[
\int_{G/Z} |f(g)|^2 \, dg < \infty.
\]
The group G acts on L2 (G, ω) by right translation. The representation π is said to be in the discrete series if it can be embedded as a direct summand in L2 (G, ω). If G is a reductive group over a local field, the irreducible representations of G can be built up from discrete series representations of Levi factors of parabolic subgroups by parabolic induction.
Let F be a local field, and let E/F be a finite extension. Then the (rela-
tive) Weil group WE/F is a certain finite extension of E × . It fits in an exact
sequence:
1 −→ E × −→ WE/F −→ Gal(E/F ) −→ 1.
If E′ ⊃ E is a bigger field, there is a canonical map WE′/F −→ WE/F inducing the norm map E′× −→ E × , and the absolute Weil group WF is the inverse limit of the WE/F . The discrete series representations of GL(k, F ) are then parametrized by the irreducible k-dimensional complex representations
of WF . This is a slight oversimplification—we are neglecting the Steinberg
representation and a few other discrete series that can be parametrized by replacing WF by the slightly larger Weil–Deligne group.
This parametrization of the irreducible representations of GL(k, F ) is the local Langlands correspondence. Borel [19] is still a useful reference for the Langlands correspondences, though the correspondence must be made more precise than the formulation in this paper, written before many of the results were proved. Henniart's ICM talk [68] is a good more recent reference. The local
Langlands conjectures for GL(k) over non-Archimedean local fields of charac-
teristic zero were proved by Harris and Taylor [63]. The positive characteristic case had been proved earlier by Laumon, Rapoport, and Stuhler, and another proof was given soon after Harris and Taylor by Henniart [67].
Assume that G = GL(k) over a local field F . We now explain why most but
not all discrete series representations correspond to characters of anisotropic
tori. If T is a maximal torus of G, then T /Z is anisotropic if T ≅ E × , where E/F is an extension of degree k. If θ is a character of E × , then inducing θ to WE/F gives a representation of WE/F of degree k. This gives a parametrization
of many—even most—discrete series representations by characters of tori. In
fact, if F is non-Archimedean and the residue characteristic is prime to k, then every discrete series representation is of this form. This is proved in Tate [159]
(2.2.5.3). A simple proof when k = 2 is given in Bump [27], Proposition 4.9.3.
Although the parametrization of the discrete series representations by
characters of tori is thus a more complex story for local fields than for fi-
nite fields, the construction of the irreducible representations by parabolic
induction still follows the same pattern as in the finite field case. An analog
of Theorem 47.3 is true, and the method of proof extends—the function Δ
becomes a distribution, and the corresponding analog of Mackey theory is
due to Bruhat [26]. Some differences occur because of measure considerations.
There are important differences between the finite field case and the local field
case when reducibility occurs. The finite field statement Corollary 47.1 is both
suggestive and misleading when looking at the local field case. See Zelevinsky
[177]. Zelevinsky’s complete results are reviewed in Harris and Taylor [63].
Turning at last to automorphic forms, characters of tori still parametrize
automorphic representations, and characters of anisotropic tori parametrize
automorphic cuspidal representations. Thus, if E/F is an extension of number
fields with [E : F ] = k and AE is the adele ring of E, and if θ is a character of A_E^×/E^× , then there should exist an automorphic representation of GL(k, F )
whose L-function is the same as the L-function of θ. If E/F is cyclic, this is
a theorem of Arthur and Clozel [8], Sect. 3.6. In contrast with the situation
over local fields, however, where “most” discrete series are parametrized by
characters of tori, the cuspidal representations obtained this way are rare.
A few more are obtained if we allow parametrizations by the global Weil
group, but even these are in the minority. The literature on this topic is
too vast to survey here, but we mention one result: in characteristic p, the
Langlands parametrization of global automorphic forms on GL(n) was proved
by Lafforgue [114].
Exercises
Exercise 47.1 (Transitivity of parabolic induction).
(i) Let P be a parabolic subgroup of GL(k) with Levi factor M and unipotent
radical U , so P = M U . Suppose that Q is a parabolic subgroup of M with
Levi factor MQ and unipotent radical UQ . Show that MQ is the Levi factor of
a parabolic subgroup R of GL(k) with unipotent radical UQ U .
(ii) In the setting of (i), show that parabolic induction from MQ directly to GL(k)
gives the same result as parabolically inducing first to M , and then from M to
GL(k).
(iii) Show that the multiplication ◦ is associative and that R(q) is a ring.

Exercise 47.2 (Frobenius reciprocity for parabolic induction). Let P = M U be a parabolic subgroup of G = GL(n, Fq ).
(i) Let (π, V ) be a representation of G and let (σ, W ) be a representation of M .
Let V^U be the space of U -invariants in V . Since M normalizes U , V^U is an M -module. On the other hand, we may parabolically induce W to a representation
Ind(σ) of G. Show that
Hom_G (V, Ind(σ)) ≅ Hom_M (V^U , W ).

(Hint: Make use of Theorem 32.2. We need to show that
Hom_P (V, W ) ≅ Hom_M (V^U , W ).

Let V0 be the span of the elements v − π(u)v with u ∈ U , v ∈ V . Show that V = V^U ⊕ V0 as M -modules, and that any P -equivariant map V −→ W factors through V /V0 ≅ V^U .)
(ii) Let χ be a character of G, and let σ be a character of M . Let Ind(σ) be the
character of the representation of G parabolically induced from σ, and let χU
be the function on M defined by
\[
\chi_U(m) = \frac{1}{|U|} \sum_{u \in U} \chi(mu).
\]
Show that χU is a class function on M , and that
⟨χ, Ind(σ)⟩_G = ⟨χ_U , σ⟩_M .
Conclude that χU is a character of M . [Note: Although this statement is closely related to (i), and may be deduced from it, it may also be proved using (32.16) and Frobenius reciprocity for characters, avoiding use of (i).]

Exercise 47.3. Suppose that H is a subgroup of GL(k, Fq ) containing T such that for each α ∈ Φ the group H contains either Xα or X−α . Show that H is a (not
necessarily standard) parabolic subgroup. If H contains exactly one of Xα or X−α
for each α ∈ S, show that H is a (not necessarily standard) Borel subgroup. (See
Exercise 20.1.)

The next exercise is very analogous to the computation of the constant terms of
Eisenstein series. For example, the computation around pages 39–40 of Langlands
[117] is a near exact analog.
Exercise 47.4. Let 1 ≤ k, r < n. Let σ1 , σ2 be monatomic characters of GL(r, Fq ) and GL(n − r, Fq ) with respect to a pair of distinct cuspidal representations. Let σ
denote the character of the representation σ1 ◦ σ2 of GL(n, Fq ), which is irreducible
by Theorem 47.6. Let m1 ∈ GL(k, Fq ) and m2 ∈ GL(n − k, Fq ). Let U be the
unipotent radical of the standard parabolic subgroup P of GL(n, Fq ) with Levi
factor M = GL(k, Fq ) × GL(n − k, Fq ). if k = r, k
= n − r,

   ⎨ σ1 (m1 ) σ2 (m2 ) if k = r, k
= n − r,
1  m1
σ u = σ1 (m2 ) σ2 (m1 ) if k = n − r, k
= r,
|U | u∈U m2 ⎩
σ1 (m1 ) σ2 (m2 ) + σ1 (m2 ) σ2 (m1 ) if k = r = n − r.

[Hint: Both sides are class functions, so it is sufficient to compare the inner products with ρ1 ⊗ ρ2 , where ρ1 and ρ2 are irreducible representations of GL(k, Fq ) and GL(n − k, Fq ), respectively. Using Exercise 47.2, this amounts to comparing σ1 ◦ σ2 and ρ1 ◦ ρ2 . To do this, explain why in the last statement in Theorem 47.6 the assumption that the θi are monatomic with respect to distinct cuspidals may be omitted provided this assumption is made for the θi′ .]

Exercise 47.5. If k + l = m, and if P = M U is the standard parabolic of GL(m, Fq ) with Levi factor M = GL(k, Fq ) × GL(l, Fq ), then the space of U -invariants of any
representation (π, V ) of GL(m, Fq ) is an M -module. Show that this functor from
representations of GL(m, Fq ) to representations of GL(k, Fq ) × GL(l, Fq ) can be
made the basis of a comultiplication in R(q) and that R(q) is a Hopf algebra.

Exercise 47.6. Let G = GL(k, Fq ). As in Exercise 45.4, let N be the subgroup of upper triangular unipotent matrices. Let ψ : Fq −→ C× be a nontrivial additive
character, and let ψN be the character of N defined by
\[
\psi_N \begin{pmatrix}
1 & x_{12} & x_{13} & \cdots & x_{1k} \\
  & 1      & x_{23} & \cdots & x_{2k} \\
  &        & 1      & \ddots & \vdots \\
  &        &        & \ddots &        \\
  &        &        &        & 1
\end{pmatrix}
= \psi(x_{12} + x_{23} + \cdots + x_{k-1,k}).
\]

Let P be the “mirabolic” subgroup of g ∈ G where the bottom row is (0, . . . , 0, 1).
(Note that P is not a parabolic subgroup.) Call an irreducible representation of
P cuspidal if it has no U -fixed vector for the unipotent radical U of any standard
parabolic subgroup of G. Note that U is contained in P for each such U . If 1 ≤ r < k, let Gr be GL(r, Fq ) embedded in G in the upper left-hand corner, and let Nr be the subgroup of x ∈ N in which xij = 0 if i < j ≤ r.
(i) Show that the representation κ = Ind_N^P (ψN ) is irreducible. [Hint: Use Mackey theory to compute HomP (κ, κ).]
(ii) Let (π, V ) be a cuspidal representation of P . Let Lr be the set of all linear functionals λ on V such that λ(π(x)v) = ψN (x)λ(v) for v ∈ V and x ∈ Nr . Show that if λ ∈ Lr and r > 1, then there exists γ ∈ Gr−1 such that λ′ ∈ Lr−1 , where λ′ (v) = λ(π(γ)v).
(iii) Show that the restriction of a cuspidal representation π of GL(k, Fq ) to P is a direct sum of copies of κ. Then use Exercise 45.4 to show that at most one copy can occur, so π|P = κ.
(iv) Show that each irreducible cuspidal representation of GL(k, Fq ) has dimension (q − 1)(q² − 1) · · · (q^{k−1} − 1).

Exercise 47.7. Let θ : F_{q^k}^× −→ C× be a character.

(i) Show that the following are equivalent.
(a) The character θ does not factor through the norm map F_{q^k} −→ F_{q^d} for any proper divisor d of k.
(b) The character θ has k distinct conjugates under Gal(F_{q^k}/Fq ).
(c) We have θ^{q^r −1} ≠ 1 for all proper divisors r of k.
(ii) Show that the number of such θ satisfying these equivalent conditions is given by (47.8), and that this is also the number of monic irreducible polynomials of degree k over Fq .

Exercise 47.8. Suppose that θ : F̄_q^× −→ C× is a character. Suppose that for all d ≤ k, the restriction of θ to F_{q^d}^× does not factor through the norm map F_{q^d}^× −→ F_{q^r}^×
for any proper divisor r of d. Let λ be a partition of k. Show that the restriction
of θ to Tλ contains the trivial character with multiplicity r, equal to the number of
parts of λ of size 1, and k − r other characters that are all distinct from one another.

Exercise 47.9. Obtain a character table of GL(2, F3 ), a group of order 48. Show that there are three distinct characters θ of F_9^× that do not factor through the norm map F_9^× −→ F_3^× . Of these, two (of order eight) can be extended to an injective homomorphism F̄_3^× −→ C× , but the third (of order four) cannot. If θ is this third character, then χ2 defined by (47.9) is a character that splits as χtriv + χsteinberg − σ2 , where χtriv and χsteinberg are the trivial and Steinberg characters, and σ2 is the character of a cuspidal representation. Show also that σ2 differs from the sum of the two one-dimensional characters of GL(2, F3 ) only on the two non-semisimple conjugacy classes, of elements of orders 3 and 6.

Exercise 47.10. Suppose that χ is an irreducible representation of GL(k, Fq ). Let g be a regular semisimple element with eigenvalues that generate F_{q^k} . If χ(g) ≠ 0, show that χ is monatomic.

Exercise 47.11. Let θ be a character of F_q^× . Show that there exists a character θ̄ of F̄_q^× extending θ, whose kernel is the same as that of θ.

Exercise 47.12. Let θ be an injective character of F̄_q^× . Prove the following result.
Theorem. Let λ be a partition of n and let t ∈ Tλ . Then σn,θ (t) = 0 unless λ = (n).
Hint: Assume by induction that the statement is true for all k < n. Write t = (t1 , . . . , tr ), where ti ∈ GL(λi , Fq ) has distinct eigenvalues in F_{q^{λi}} . Show that
\[
(\sigma_k \circ 1_{n-k})(t) = \sum_{i \,:\, \lambda_i = k} \sigma_k(t_i).
\]
48
Cohomology of Grassmannians

In this chapter, we will deviate from our usual policy of giving complete proofs
in order to explain some important matters. Among other things, we will see
that the ring R introduced in Chap. 34 has yet another interpretation in terms
of the cohomology of Grassmannians.
References for this chapter are Fulton [53], Hiller [70], Hodge and Pedoe
[71], Kleiman [102], and Manivel [126].
We recall the notion of a CW-complex. Intuitively, this is just a space
decomposed into open cells, the closure of each cell being contained in
the union of cells of lower dimension—for example, a simplicial complex. (See
Dold [44], Chap. 5, and the appendix in Milnor and Stasheff [129].) Let Bn
be the closed unit ball in Euclidean n-space. Let B◦n be its interior, the unit
disk, and let Sn−1 be its boundary, the (n − 1)-sphere. We are given a Hausdorff topological space X together with a set S of subspaces of X. It is assumed that
X is the disjoint union of the Ci ∈ S, which are called cells. Each space Ci ∈ S
is homeomorphic to B◦d(i) for some d(i) by a homeomorphism εi : B◦d(i) −→ Ci
that extends to a continuous map εi : Bd(i) −→ X. The image of Sd(i)−1 under
εi lies in the union of the cells of strictly lower dimension. Thus, if we define
the n-skeleton
\[
X_n = \bigcup_{d(i) \leq n} C_i,
\]

the image of Sd(i)−1 under εi is contained in Xd(i)−1 . It is assumed that its image is contained in only finitely many Ci and that X is given the Whitehead
topology, in which a subset of X is closed if and only if its intersection with
each Ci is closed.
Let K be a compact Lie group, T a maximal torus, and X
the flag manifold K/T . We recall from Theorem 26.4 that X is naturally
a complex analytic manifold. The reason (we recall) is that we can identify
X = G/B where G is the complexification of K and B is its Borel subgroup.
The Lefschetz fixed-point formula can be used to show that the Euler
characteristic of X is equal to the order of the Weyl group W . Suppose that
D. Bump, Lie Groups, Graduate Texts in Mathematics 225, DOI 10.1007/978-1-4614-8024-2_48, © Springer Science+Business Media New York 2013
M is a manifold of dimension n and f : M −→ M a map. We define the Lefschetz number of f to be
\[
\Lambda(f) = \sum_{d=0}^{n} (-1)^d \, \mathrm{tr}\bigl(f \mid H^d(M, \mathbf{Q})\bigr).
\]

A fixed point of f is a solution to the equation f (x) = x. The fixed point x is isolated if it is the only fixed point in some neighborhood of x. According
is isolated if it is the only fixed point in some neighborhood of x. According
to the “Lefschetz fixed-point formula,” if M is a compact manifold and f has
only isolated fixed points, the Lefschetz number is the number of fixed points
counted with multiplicity; see Dold [43].
Let g ∈ K, and let f = fg : X → X be translation by g. If g is the identity,
then f induces the identity map on X and hence on its cohomology in every
dimension. Therefore, the Euler characteristic is Λ(f ). On the other hand,
Λ(f ) is unchanged if f is replaced by a homotopic map, so we may compute
it by moving g to a generator of T . (We are now thinking of X as K/T .) Then f (hT ) = hT if and only if h is in the normalizer of T , so there is one
is the multiplicity of the fixed point in the fixed point formula, may also be
computed for each fixed point (see Adams [2]) and equals 1. So Λ(f ) = |W |,
and this is the Euler characteristic of X.
It is possible to be a bit more precise than this: H^i (X) = 0 unless i is even, and Σi dim H^{2i} (X) = |W |. We will explain the reason for this now.
We may give a cell decomposition making X into a CW-complex as follows.
If w ∈ W , then BwB/B is homeomorphic to C^{l(w)} , where l is the length function on W . The proof is the same as Proposition 46.7: the unipotent subgroup U_w^− , whose Lie algebra is the direct sum of the root spaces Xα for α ∈ Φ+ ∩ wΦ− , is homeomorphic to C^{l(w)} , and u −→ uwB is a homeomorphism of U_w^− onto
BwB/B. The closure C(w) of BwB/B—known as a “closed Schubert cell”—
is a union of cells of smaller dimension, so G/B becomes a CW complex.
Since the homology of a CW-complex is the same as the cellular homology
of its skeleton (Dold [44], Chap. 5), and all the cells in this complex have
even dimension—the real dimension of BwB/B is 2l(w)—it follows that the
homology of X is all even-dimensional.
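For K = U(n), so that W = Sn, this can be checked combinatorially: each cell BwB/B contributes one generator in degree 2l(w), so the Betti numbers count permutations by length (number of inversions), and summing them recovers the Euler characteristic |W| = n!. A small sketch:

```python
from itertools import permutations
from math import factorial

def flag_betti(n):
    """Betti numbers of the flag manifold of GL(n, C): the cell BwB/B has real
    dimension 2*l(w), so b_{2k} counts permutations with k inversions, and the
    odd Betti numbers vanish."""
    betti = {}
    for w in permutations(range(n)):
        l = sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])
        betti[2 * l] = betti.get(2 * l, 0) + 1
    return betti

assert flag_betti(3) == {0: 1, 2: 2, 4: 2, 6: 1}    # Poincare poly (1+t^2)(1+t^2+t^4)
assert sum(flag_betti(4).values()) == factorial(4)  # Euler characteristic = |S_4| = 24
```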
Since X is a compact complex analytic manifold (Theorem 26.4), it is an
orientable manifold, and by Poincaré duality we may associate with C(w) a
cohomology class, and these classes span the cohomology ring H ∗ (X) as a
vector space.
This description can be recast in the language of algebraic geometry. A sub-
stitute for the cohomology ring was defined by Chow [37]. See Hartshorne [64],
Appendix A, for a convenient synopsis of the Chow ring, and see Fulton [54]
for a modern treatment. In the graded Chow ring of a nonsingular variety
X, the homogeneous elements of degree r are rational equivalence classes of
algebraic cycles. Here an algebraic cycle of codimension r is an element of the
free Abelian group generated by the irreducible subvarieties of codimension r.
Rational equivalence of cycles is an equivalence relation of algebraic deforma-
tion. For divisors, which are cycles of codimension 1, it coincides with the
familiar relation of linear equivalence. We recall that two divisors D1 and D2
are linearly equivalent if D1 − D2 is the divisor of a function f in the function
field of X.
The multiplication in the Chow ring is the intersection of cycles. If two
subvarieties Y and Z (of codimensions m and n) are given, we say that Y and
Z intersect properly if every irreducible component of Y ∩ Z has codimension
m + n. (If m + n exceeds the dimension of X, this means that Y and Z have
an empty intersection.) Chow’s lemma asserts that Y and Z may be deformed
to intersect properly. That is, there exist Y′ and Z′ rationally equivalent to Y and Z, respectively, such that Y′ and Z′ intersect properly. The intersection Y′ ∩ Z′ is then a union of cycles of codimension m + n, whose sum in the Chow ring is Y · Z. (They must be counted with a certain intersection multiplicity.)
The “moving” process embodied by Chow’s lemma will be an issue for
us when we consider the intersection pairing in Grassmannians, so let us
contemplate a simple case of intersections in Pn . Hartshorne [64], I.7, gives a
beautiful and complete treatment of intersection theory in Pn .
The space Pn (C), which we will come to presently, resembles flag manifolds
and Grassmannians in that the Chow ring and the cohomology ring coincide.
(Indeed, Pn (C) is a Grassmannian.) The homology of Pn (C) can be computed
very simply since it has a cell decomposition in which each cell is an affine
space A^i ≅ C^i :
P^n (C) = C^n ∪ C^{n−1} ∪ · · · ∪ C^0 , dim(C^i ) = 2i. (48.1)

Each cell contributes to the homology in exactly one dimension, so
\[
H_i\bigl(\mathbf{P}^n(\mathbf{C})\bigr) \cong
\begin{cases}
\mathbf{Z} & \text{if } i \leq 2n \text{ is even}, \\
0 & \text{otherwise}.
\end{cases}
\]

The cohomology is the same by Poincaré duality. The multiplicative structure in the ring H ∗ (P^n (C)) is that of a truncated polynomial ring. The cohomology
class of a hyperplane [Cn−1 in the decomposition (48.1)] is a generator.
Let us consider the intersection of two curves Y and Z in P2 (C). The intersection Y · Z, which is the product in the Chow ring, is a cycle of degree zero, that is, just a sum of points. The rational equivalence class of a cycle of degree zero is completely determined by the number of points, and intersection theory on P2 is fully described if we know how to compute this number.
Each curve is the locus of a homogeneous polynomial in three variables, and the degrees of these polynomials are the degrees of the curves, d(Y ) and d(Z).
According to Bezout’s theorem, the number of points in the intersection of Y


and Z equals d(Y ) d(Z).

Fig. 48.1. A curve of degree d in P2 is linearly equivalent to d lines (here a curve of degree 2, a hyperbola, is deformed into a pair of lines)

Bezout’s theorem can be used to illustrate Chow’s lemma. First, note that
a curve of degree d is rationally equivalent to a sum of d lines (Fig. 48.1), so
Y is linearly equivalent to a sum of d(Y ) lines, and Z is linearly equivalent
to a sum of d(Z) lines. Since two lines have a unique point of intersection,
the first set of d(Y ) lines will intersect the second set of d(Z) lines in exactly
d(Y ) d(Z) points, which is Bezout’s theorem for P2 (Fig. 48.2).

Fig. 48.2. Bezout’s theorem via Chow’s lemma

It is understood that a point of transversal intersection is counted once, but a point where Y and Z are tangent is counted with a multiplicity that can be defined in different ways.
The intersection Y · Z must be defined even when the cycles Y and Z
are equal. For this, one may replace Z by a rationally equivalent cycle before
taking the intersection. The self-intersection Y · Y is computed using Chow’s
lemma, which allows one copy of Y to be deformed so that its intersection with
the undeformed Y is transversal (Fig. 48.3). Thus, replacing Y by a rationally
equivalent cycle, one may count the intersections straightforwardly (Fig. 48.2).
The Chow ring often misses much of the cohomology. For example, if X is a curve of genus g > 1, then H 1 (X) ≅ Z^{2g} is nontrivial, yet the cohomology class of an algebraic cycle of codimension d lies in H^{2d} (X), and is never odd-dimensional. However, if X is a flag variety, projective space, or Grassmannian, the Chow ring and the cohomology ring are isomorphic. The cup product corresponds to the intersection of algebraic cycles.
Fig. 48.3. The self-intersection multiplicity of a cycle in P2 : to compute Y · Y , deform one copy of the circle; the circle meets the deformed copy in four points

Let us now consider intersection theory on G/P , where P is a parabolic


subgroup, that is, a proper subgroup of G containing B. For such a variety,
the story is much the same as for the flag manifold—the Chow ring and
the cohomology ring can be identified, and the Bruhat decomposition gives a
decomposition of the space as a CW-complex. We can write
B\G/P ≅ W/WP ,
where WP is the Weyl group of the Levi factor of P . If G = GL(n), this is Proposition 46.1(iii). If w ∈ W , let C(w)◦ be the open Schubert cell BwP/P ,
and let C(w) be its closure, which is the union of C(w)◦ and open Schubert
cells of lower dimension. The closed Schubert cells C(w) give a basis of the
cohomology.
We will discuss the particular case where G = GL(r + s, C) and P is the
maximal parabolic subgroup
\[
P = \left\{ \begin{pmatrix} g_1 & * \\ & g_2 \end{pmatrix} \,\middle|\, g_1 \in \mathrm{GL}(r, \mathbf{C}),\ g_2 \in \mathrm{GL}(s, \mathbf{C}) \right\},
\]
with Levi factor M = GL(r, C) × GL(s, C). The quotient Xr,s = G/P is then
the Grassmannian, a compact complex manifold of dimension rs. In this case,
the cohomology ring H ∗ (Xr,s ) is closely related to the ring R introduced in
Chap. 34.
To explain this point, let us explain how to “truncate” the ring R and
obtain a finite-dimensional algebra that will be isomorphic to H ∗ (Xr,s ).
Suppose that Jr is the linear span of all sλ such that the length of λ is > r. Then Jr is an ideal, and R/Jr ≅ Λ(r) by the characteristic map. Indeed, it follows from Proposition 36.5 that Jr is the kernel of the homomorphism ch(r) : R −→ Λ(r) .
We can also consider the ideal ι Js , where ι is the involution of Theorem 34.3.
By Proposition 35.2, this is the span of the sλ in which the length of λt is
greater than s—in other words, in which λ1 > s. So Jr + ι Js is the span of all
sλ such that the diagram of λ does not fit in an r × s box. Therefore, the ring
Rr,s = R/(Jr + ι Js ) is spanned by the images of sλ where the diagram of λ
does fit in an r × s box. For example, R3,2 is spanned by s() , s(1) , s(2) , s(11) ,
s(21) , s(22) , s(111) , s(211) , s(221) , and s(222) . It is a free Z-module of rank 10.

In general the rank of the ring Rr,s is the binomial coefficient $\binom{r+s}{r}$, which is the number of partitions of length ≤ r into parts not exceeding s—that is, partitions with diagrams that fit into a box of dimensions r × s.
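This count is simple to verify by enumeration; the sketch below lists the partitions whose diagram fits in an r × s box and compares with the binomial coefficient, reproducing the ten partitions spanning R3,2 listed above:

```python
from math import comb

def partitions_in_box(r, s):
    """Partitions with Young diagram inside an r x s box: weakly decreasing
    tuples (l_1 >= ... >= l_r) with 0 <= l_i <= s, trailing zeros stripped."""
    def gen(rows_left, max_part):
        if rows_left == 0:
            yield ()
            return
        for part in range(max_part, -1, -1):
            for rest in gen(rows_left - 1, part):
                yield (part,) + rest
    return [tuple(p for p in lam if p) for lam in gen(r, s)]

box = partitions_in_box(3, 2)
assert len(box) == comb(5, 3) == 10   # rank of R_{3,2}
assert sorted(box) == sorted([(), (1,), (2,), (1, 1), (2, 1), (2, 2),
                              (1, 1, 1), (2, 1, 1), (2, 2, 1), (2, 2, 2)])
```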

Theorem 48.1. The cohomology ring of Xr,s is isomorphic to Rr,s . In this isomorphism, the cohomology classes of the Schubert cells correspond to the sλ , as λ runs through the partitions with diagrams that fit into an r × s box.

We will not prove this. Proofs (all rather similar and based on a method
of Hodge) may be found in Fulton [53], Hiller [70], Hodge and Pedoe [71],
and Manivel [126]. We will instead give an informal discussion of the result,
including a precise description of the isomorphism and an example.
Let us explain how to associate with each Schubert cell of codimension |λ| a partition λ whose diagram is contained in the r × s box. In fact, to every coset wWP in W/WP we will associate such a partition.
Right multiplication by an element of WP ∼ = Sr × Ss consists of reordering
the first r columns and the last s columns. Hence, the representative w of the
given coset in W/WP may be chosen to be a permutation matrix such that
the entries in the first r columns are in ascending order, and so that the
entries in the last s columns are in ascending order. In other words, if σ is the
permutation such that wσ(j),j ≠ 0, then

σ(1) < σ(2) < · · · < σ(r), σ(r + 1) < σ(r + 2) < · · · < σ(r + s). (48.2)

With this choice, we associate a partition λ as follows, by marking some of the zero entries of the permutation matrix w. If 1 ≤ j ≤ r, if the 1 in the ith row is in the last s columns, and if the 1 in the jth column is above (i, j), then we mark the (i, j)th entry. For example, if r = 3 and s = 2, here are some examples of marked matrices:
\[
\begin{pmatrix}
1 & & & & \\
 & 1 & & & \\
\bullet & \bullet & & 1 & \\
 & & 1 & & \\
\bullet & \bullet & \bullet & & 1
\end{pmatrix},\quad
\begin{pmatrix}
1 & & & & \\
\bullet & & & 1 & \\
 & 1 & & & \\
 & & 1 & & \\
\bullet & \bullet & \bullet & & 1
\end{pmatrix},\quad
\begin{pmatrix}
1 & & & & \\
\bullet & & & 1 & \\
\bullet & & & & 1 \\
 & 1 & & & \\
 & & 1 & &
\end{pmatrix}
\tag{48.3}
\]

Now we collect the marks and read off the partition: for each column containing marks, there is a part of the partition equal to the number of marks in that column. In the three examples above, the respective partitions λ are
(2, 2, 1), (2, 1, 1), (2).

Their diagrams fit into a 3 × 2 box. We will write Cλ for the closed Schubert
cell C(w) when λ and w are related this way.
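The passage from λ to the marked matrix and back can be automated. The sketch below builds σ from λ via σ(i) = s + i − λi (formula (48.9) below), applies the marking rule, and recovers λ as the number of marks in each of the first r columns; the round trip is the identity for every partition in the box:

```python
def sigma_from_partition(lam, r, s):
    """Column j (1 <= j <= r) of the permutation matrix w has its 1 in row
    sigma(j) = s + j - lambda_j; lam is padded with zeros to length r."""
    lam = list(lam) + [0] * (r - len(lam))
    return [s + j + 1 - lam[j] for j in range(r)]  # 1-based row indices

def partition_from_marks(lam, r, s):
    """Mark (i, j) when column j <= r has its 1 above row i and row i has its 1
    in the last s columns; the parts of lambda are the column mark-counts."""
    sig = sigma_from_partition(lam, r, s)
    last_rows = [i for i in range(1, r + s + 1) if i not in sig]
    marks = [sum(1 for i in last_rows if sig[j] < i) for j in range(r)]
    return tuple(m for m in marks if m)

# round trip for the three examples of (48.3) and a few more partitions
for lam in [(2, 2, 1), (2, 1, 1), (2,), (), (2, 2, 2), (1, 1)]:
    assert partition_from_marks(lam, 3, 2) == lam
```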
Let Fi be the vector subspace of C^{r+s} consisting of vectors of the form t (x1 , . . . , xi , 0, . . . , 0). The group G acts on the Grassmannian Gr,s of r-dimensional subspaces of C^{r+s} . The stabilizer of Fr is precisely the
of r-dimensional subspaces of Cr+s . The stabilizer of Fr is precisely the
parabolic subgroup P , so there is a bijection Xr,s −→ Gr,s in which the coset gP maps to gFr . We topologize Gr,s by asking that this map be a homeomorphism.
We can characterize the Schubert cells in terms of this parametrization by
means of integer sequences. Given a sequence (d) = (d0 , d1 , . . . , dr+s ) with
0 = d0 ≤ d1 ≤ · · · ≤ dr+s = r,  di − di−1 ≤ 1,  (48.4)
we can consider the set C◦(d) of V in Gr,s such that
dim(V ∩ Fi ) = di . (48.5)
Let C(d) be the set of V in Gr,s such that
dim(V ∩ Fi ) ≥ di . (48.6)
The function V −→ dim(V ∩ Fi ) is upper semicontinuous on Gr,s ; that is, for any integer n, {V | dim(V ∩ Fi ) ≥ n} is closed. Therefore, C(d) is closed, and
in fact it is the closure of C◦(d) .
Lemma 48.1. In the characterization of C(d) it is only necessary to impose the condition (48.6) at integers 0 < i < r + s such that di+1 = di > di−1 .
Proof. If di+1 > di and dim(V ∩ Fi+1 ) ≥ di+1 , then since V ∩ Fi has codimension at most 1 in V ∩ Fi+1 , we do not need to assume dim(V ∩ Fi ) ≥ di . If di = di−1 and dim(V ∩ Fi−1 ) ≥ di−1 , then dim(V ∩ Fi ) ≥ di . □
We will show C◦(d) is the image in Gr,s of an open Schubert cell. For example,
with r = 3 and s = 2, taking w to be the first matrix in (48.3), we consider the
Schubert cell BwP/P , which has the image in G3,2 that consists of all bwF3 ,
where b ∈ B. A one-dimensional unipotent subgroup of B is sufficient to produce all of these spaces, and a typical such space consists of the vectors
\[
\begin{pmatrix}
1 & & & & \\
 & 1 & & & \\
 & & 1 & \alpha & \\
 & & & 1 & \\
 & & & & 1
\end{pmatrix}
\begin{pmatrix}
1 & & & & \\
 & 1 & & & \\
 & & & 1 & \\
 & & 1 & & \\
 & & & & 1
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} x_1 \\ x_2 \\ \alpha x_3 \\ x_3 \\ 0 \end{pmatrix}
\]
with α fixed. These may be characterized by the conditions (48.5) with (d0 , . . . , d5 ) = (0, 1, 2, 2, 3, 3).
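This dimension sequence can be confirmed by exact linear algebra: for any nonzero α the space V = bwF3 is spanned by e1, e2, αe3 + e4, and dim(V ∩ Fi) = dim V + dim Fi − dim(V + Fi). A sketch using rational arithmetic:

```python
from fractions import Fraction

def rank(rows):
    """Exact rank of a small matrix with rational entries (Gaussian elimination)."""
    rows = [list(map(Fraction, r)) for r in rows]
    rk, col = 0, 0
    ncols = len(rows[0]) if rows else 0
    while rk < len(rows) and col < ncols:
        piv = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            col += 1
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                f = rows[i][col] / rows[rk][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
        col += 1
    return rk

def e(i):  # standard basis vector e_i of C^5, 1-based
    return [1 if j == i else 0 for j in range(1, 6)]

alpha = Fraction(7)  # any nonzero alpha gives the same dimension sequence
V = [e(1), e(2), [0, 0, alpha, 1, 0]]  # basis of bwF_3
# dim(V n F_i) = dim V + dim F_i - dim(V + F_i)
d = [3 + i - rank(V + [e(j) for j in range(1, i + 1)]) for i in range(6)]
assert d == [0, 1, 2, 2, 3, 3]
```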
Proposition 48.1. The image in Gr,s of the Schubert cell C(w) corresponding to the partition λ (the diagram of which, we have noted, must fit in an r × s box) is C(d) , where (d) = (d0 , d1 , . . . , dr+s ) is the sequence determined by
d_k = i ⟺ s + i − λi ≤ k ≤ s + i − λi+1 . (48.7)
Similarly, the image of C(w)◦ is C◦(d) .
We note that, by Lemma 48.1, if (d) is the sequence in (48.7), the closed Schubert cell C(d) is characterized by the conditions
dim(V ∩ Fs+i−λi ) ≥ i. (48.8)
Also, by Lemma 48.1, this only needs to be checked when λi > λi+1 . [The characterization of the open Schubert cell still requires dim(V ∩ Fk ) to be specified for all k, not just those of the form s + i − λi .]
Proof. We will prove this for the open cell. The image of C(w)◦ in Gr,s consists
of all spaces bwFr with b ∈ B, so we must show that, with di as in (48.7), we
have
dim(bwFr ∩ Fi ) = di .
Since b stabilizes Fi , we may apply b−1 , and we are reduced to showing that
dim(wFr ∩ Fi ) = di .
If σ is the permutation such that wσ(j),j ≠ 0, then the number of entries below the nonzero element in the ith column, where 1 ≤ i ≤ r, is r + s − σ(i). However, r − i of these are not "marked." Therefore, λi = r + s − σ(i) − (r − i), that is,
σ(i) = s + i − λi . (48.9)
Now wFr is the space of vectors that have arbitrary values in the σ(1), σ(2), . . . , σ(r) positions, and all other entries zero. So the dimension of wFr ∩ Fi is the number of j such that 1 ≤ j ≤ r and σ(j) ≤ i. Using (48.2),
dim(wFr ∩ Fi ) = k ⟺ σ(k) ≤ i < σ(k + 1),
which by (48.9) is equivalent to (48.7).
□
When (d) and λ are related as in (48.7), we will also denote the Schubert
cell C(d) by Cλ .
As we asserted earlier, the cohomology ring of Xr,s is isomorphic to the quotient Rr,s of the ring R, which has played such a role in this last part of the book. To get some intuition for this, let us consider the identity in R
s(1) · s(1) = s(2) + s(11) .
By the parametrization we have given, sλ corresponds to the Schubert cell Cλ .


In the case at hand, the relevant cells are characterized by the following
conditions:

C(1) = {V | dim(V ∩ Fs ) ≥ 1},
C(2) = {V | dim(V ∩ Fs−1 ) ≥ 1},
C(11) = {V | dim(V ∩ Fs+1 ) ≥ 2}.

So our expectation is that if we deform C(1) into two copies C′(1) and C″(1) that
intersect properly, the intersection will be rationally equivalent to the sum of
C(2) and C(11) . We may choose spaces Gs and Hs of dimension s such that
Gs ∩ Hs = Fs−1 and Gs + Hs = Fs+1 . Now let us consider the intersection of

C′(1) = {V | dim(V ∩ Gs ) ≥ 1},    C″(1) = {V | dim(V ∩ Hs ) ≥ 1}.

If V lies in both C′(1) and C″(1) , then let v′ and v″ be nonzero vectors in V ∩ Gs
and V ∩ Hs , respectively. There are two possibilities. Either v′ and v″ are
proportional, in which case they lie in V ∩ Fs−1 , so V ∈ C(2) , or they are
linearly independent. In the second case, both lie in Fs+1 , so V ∈ C(11) .
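For readers who like to see the identity s(1) · s(1) = s(2) + s(11) verified concretely, here is a small plain-Python check (a sketch added here, not taken from the text), using the elementary expressions s(1) = e1, s(2) = h2, and s(11) = e2 in finitely many variables.

```python
from itertools import combinations, combinations_with_replacement

def s1(x):
    # Schur polynomial s_(1) = e_1, the sum of the variables.
    return sum(x)

def s2(x):
    # s_(2) = h_2, the sum of monomials x_i x_j with i <= j.
    return sum(a * b for a, b in combinations_with_replacement(x, 2))

def s11(x):
    # s_(11) = e_2, the sum of x_i x_j with i < j.
    return sum(a * b for a, b in combinations(x, 2))

# Check the identity s_(1) * s_(1) = s_(2) + s_(11) at sample points.
for x in [(1, 2, 3), (2, 5, 7, 11), (1, 1, 4)]:
    assert s1(x) ** 2 == s2(x) + s11(x)
print("s(1)^2 = s(2) + s(11) verified")
```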
The intersection theory of flag manifolds is very similar to that of Grass-
mannians. The difference is that while the cohomology of Grassmannians for
GL(r) is modeled on the ring R, which can be identified as in Chap. 34 with
the ring Λ of symmetric polynomials, the cohomology of flag manifolds is
modeled on a polynomial ring. Specifically, if B is the Borel subgroup of
G = GL(r, C), then the cohomology ring of G/B is a quotient of the poly-
nomial ring Z[x1 , . . . , xr ], where each xi is homogeneous of degree 2. Las-
coux and Schützenberger defined elements of the polynomial ring Z[x1 , . . . , xr ]
called Schubert polynomials which play a role analogous to that of the Schur
polynomials (see Fulton [53] and Manivel [126]).
A minor problem is that H ∗ (G/B) is not precisely the polynomial ring
Z[x1 , . . . , xr ] but a quotient, just as H ∗ (Gr,s ) is not precisely R or even its
quotient R/Jr , which is isomorphic to the ring of symmetric polynomials in
Z[x1 , . . . , xr ].
The ring Z[x1 , . . . , xr ] is more properly regarded as the cohomology ring of an
infinite CW-complex, namely the space Fr of r-flags in C∞ . That is, let Fr,s
be the space of r-flags in Cr+s :

{0} = F0 ⊂ F1 ⊂ F2 ⊂ · · · ⊂ Fr ⊂ Cr+s , dim(Fi ) = i. (48.10)

We can regard Fr,s as G/P , where G = GL(r + s, C) and P is the parabolic
subgroup of block upper triangular matrices

    ⎛ b ∗ ⎞
    ⎝ 0 g ⎠ ,    b ∈ B, g ∈ GL(s, C).    (48.11)

We may embed Fr,s → Fr,s+1 , and the union of the Fr,s (topologized as
the direct limit) is Fr . The open Schubert cells in Fr,s correspond to double
cosets B\G/P parametrized by elements w ∈ Sr+s /Ss . As we increase s,
the CW-complex Fr,s is obtained by adding new cells, but only in higher
dimension. The n-skeleton stabilizes when s is sufficiently large, and so
H^n (Fr ) ≅ H^n (Fr,s ) if s is sufficiently large. The ring H^∗ (Fr ) ≅ Z[x1 , . . . , xr ]
is perhaps the natural domain of the Schubert polynomials.
The cohomology of Grassmannians (and flag manifolds) provided some
of the original evidence for the famous conjectures of Weil [170] on the
number of points on a variety over a finite field. Let us count the number
of points of Xr,s over the field Fq with q elements. Representing the space
as GL(n, Fq )/P (Fq ), where n = r + s, its cardinality is

|GL(n, Fq )| / |P (Fq )| = (q^n − 1)(q^n − q) · · · (q^n − q^(n−1)) /
    [(q^r − 1)(q^r − q) · · · (q^r − q^(r−1)) · (q^s − 1)(q^s − q) · · · (q^s − q^(s−1)) · q^(rs)].

In the denominator, we have used the Levi decomposition P = M U , where
the Levi factor M = GL(r) × GL(s) and the unipotent radical U has dimension
rs. This is the Gaussian binomial coefficient [n over r]_q . It is a generating
function for the cohomology ring H^∗ (Xr,s ).
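The point count is easy to reproduce by machine. The following plain-Python sketch (an illustration added here, not part of the text) computes |Xr,s(Fq)| from the order formula above and checks that it agrees with the standard q-binomial product formula.

```python
def gl_order(n, q):
    # |GL(n, F_q)| = (q^n - 1)(q^n - q) ... (q^n - q^(n-1))
    order = 1
    for i in range(n):
        order *= q ** n - q ** i
    return order

def grassmannian_points(r, s, q):
    # |X_{r,s}(F_q)| = |GL(n, F_q)| / (|GL(r, F_q)| |GL(s, F_q)| q^(rs)), n = r + s.
    n = r + s
    return gl_order(n, q) // (gl_order(r, q) * gl_order(s, q) * q ** (r * s))

def gaussian_binomial(n, r, q):
    # [n over r]_q = prod_{i=1}^{r} (q^(n-i+1) - 1) / (q^i - 1)
    num = den = 1
    for i in range(1, r + 1):
        num *= q ** (n - i + 1) - 1
        den *= q ** i - 1
    return num // den

for r, s, q in [(1, 1, 2), (2, 2, 3), (2, 3, 5)]:
    assert grassmannian_points(r, s, q) == gaussian_binomial(r + s, r, q)

# For r = s = 1, X_{1,1} = P^1(F_q) has q + 1 points: dim H^0 + (dim H^2) q.
print(grassmannian_points(1, 1, 7))  # → 8
```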
Motivated by these examples and other similar ones, as well as the
examples of projective nonsingular curves (for which there is cohomology in
dimension 1, so that the Chow ring and the cohomology ring are definitely
distinct), Weil proposed a more precise relationship between the complex
cohomology of a nonsingular projective variety and the number of solutions
over a finite field. Proving the Weil conjectures required a new cohomology
theory that was eventually supplied by Grothendieck. This is the l-adic co-
homology. Let F̄q be the algebraic closure of Fq , and let φ : X −→ X be the
geometric Frobenius map, which raises the coordinates of a point in X to the
qth power. The fixed points of φ are then the elements of X(Fq ), and they may
be counted by means of a generalization of the Lefschetz fixed-point formula:


|X(Fq )| = Σ_{k=0}^{2n} (−1)^k tr(φ | H^k ).

The dimensions of the l-adic cohomology groups are the same as those of the
complex cohomology, and in these examples (since all the cohomology comes
from algebraic cycles) the odd-dimensional cohomology vanishes while on H^{2i}(X)
the Frobenius endomorphism acts by the scalar q^i . Thus,

|X(Fq )| = Σ_{k=0}^{n} dim H^{2k}(X) q^k .

Hence, the Grothendieck–Lefschetz fixed-point formula explains the extraor-
dinary fact that the number of points over a finite field of the Grassmannian
or flag varieties is a generating function for the complex cohomology.

Exercises
Exercise 48.1. Consider the space Fr,s (Fq ) of r-flags in Fq^(r+s) . Compute the car-
dinality by representing it as GL(n, Fq )/P (Fq ), where P is the parabolic subgroup
(48.11). Show that |Fr,s (Fq )| = Σ_i di (r, s) q^i , where for fixed i and s sufficiently
large, di (r, s) = (r+i−1 choose i).

Exercise 48.2. Prove that H^∗ (Fr ) is a polynomial ring in r generators, with
generators in H^2 (Fr ) being the cohomology classes of the canonical line bundles
ξi ; here ξi associates with a flag (48.10) the one-dimensional vector space Fi /Fi−1 .
Appendix: Sage

Sage is a system of free mathematical software that is under active develop-
ment. Although it was created to do number theory calculations, it contains
considerable code for combinatorics and other areas. For Lie groups, it can
compute tensor products of representations, symmetric and exterior powers,
and branching rules. It knows the roots, fundamental dominant weights, Weyl
group actions, etc. There is also excellent support for symmetric functions and
crystals. Many other things are in Sage: Iwahori Hecke algebras, Bruhat order,
Kazhdan–Lusztig polynomials, and so on.
This appendix is not a tutorial, but rather a quick introduction to a few of
the problems Sage can solve. For a systematic tutorial, you should go through
the Lie Methods and Related Combinatorics thematic tutorial available at:

http://www.sagemath.org/doc/thematic_tutorials/lie.html

Other Sage tutorials may be found at http://www.sagemath.org/help.html.


You should learn Sage’s systems of on-line documentation, tab completion,
and so forth.
You should try to run the most recent version of Sage you can because there
are continual improvements. Important speedups were added to the Lie group
code in both versions 5.4 and 5.5. For simple tasks Sage can be treated as a
command-line calculator (or you can use a notebook interface) but for more
complicated tasks you can write programs using Python. You can contribute
to Sage development: if you want a feature and it doesn’t exist, you can make
it yourself, and if it is something others might want, eventually get it into the
distributed version.
For computations with representations it is convenient to work in a
WeylCharacterRing. There are two notations for these. In the default nota-
tion, a representation is represented by its highest weight vector, as an element
of its ambient space in the notation of the appendices in Bourbaki [23]. Let
us give a brief example of a dialog using the standard notation.

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 529


DOI 10.1007/978-1-4614-8024-2, © Springer Science+Business Media New York 2013

sage: B3=WeylCharacterRing("B3"); B3
The Weyl Character Ring of Type [’B’, 3] with Integer Ring
coefficients
sage: B3.fundamental_weights()
Finite family {1: (1,0,0), 2: (1,1,0), 3: (1/2,1/2,1/2)}
sage: [B3(f) for f in B3.fundamental_weights()]
[B3(1,0,0), B3(1,1,0), B3(1/2,1/2,1/2)]
sage: [B3(f).degree() for f in B3.fundamental_weights()]
[7, 21, 8]
sage: B3(1,1,0).symmetric_power(3)
B3(1,0,0) + 2*B3(1,1,0) + B3(2,1,1) + B3(2,2,1) + B3(3,1,0)
+ B3(3,3,0)
sage: [f1,f2,f3]=B3.fundamental_weights()
sage: B3(f3)
B3(1/2,1/2,1/2)
sage: B3(f3)^3
4*B3(1/2,1/2,1/2) + 3*B3(3/2,1/2,1/2) + 2*B3(3/2,3/2,1/2)
+ B3(3/2,3/2,3/2)
This illustrates different ways of interacting with Sage as a command-line
interpreter. I prefer to run Sage from within an Emacs buffer; others prefer
the notebook. For complicated tasks, such as loading some Python code, you
may write your commands in a file and load or attach it. Whatever your
method of interacting with the program, you can have a dialog with this one.
Sage provides a prompt (“sage:”) after which you type a command. Sage will
sometimes produce some output, sometimes not; in any case, when it is done
it will give you another prompt.
The first line contains two commands, separated by a semicolon. The first
command creates the WeylCharacterRing B3 but produces no output. The
second command “B3” prints the name of the ring you have just created.
Elements of the ring are virtual representations of the Lie group Spin(7)
having Cartan type B3 . Addition corresponds to direct sum, multiplication to
tensor product.
The ring B3 is a Python class, and like every Python class it has methods
and attributes which you can use to perform various tasks. If at the sage:
prompt you type B3 then hit the tab key, you will get a list of Python
methods and attributes that B3 has. For example, you will notice methods
dynkin_diagram and extended_dynkin_diagram. If you want more informa-
tion about one of them, you may access the on-line documentation, with
examples of how to use it, by typing B3.dynkin_diagram?
Turning to the next command, the Python class B3 has a method called
fundamental_weights. This returns a Python dictionary with elements that
are the fundamental weights. The third command gives the irreducible repre-
sentations with these highest weights, as a Python list. After that, we compute
the degrees of these, the symmetric cube of a representation, the spin repre-
sentation B3(1/2,1/2,1/2) and its square.

We can alternatively create the WeylCharacterRing in coroot notation, by
passing the option style="coroots". This is most
appropriate for semisimple Lie groups: for a semisimple Lie group, we recall
that the fundamental dominant weights are defined to be the dual basis to
the coroots. Assuming that the group is simply connected the fundamental
dominant weights are weights, that is, characters of a maximal torus. For such
a case, where G is semisimple and simply connected, every dominant weight
may be uniquely expressed as a linear combination, with nonnegative integer
coefficients, of the fundamental dominant weights. This gives an alternative
notation for the representations, as the following example shows:
sage: B3=WeylCharacterRing("B3",style="coroots"); B3
The Weyl Character Ring of Type [’B’, 3] with Integer Ring
coefficients
sage: B3.fundamental_weights()
Finite family {1: (1,0,0), 2: (1,1,0), 3: (1/2,1/2,1/2)}
sage: [B3(f) for f in B3.fundamental_weights()]
[B3(1,0,0), B3(0,1,0), B3(0,0,1)]
sage: [B3(f).degree() for f in B3.fundamental_weights()]
[7, 21, 8]
sage: B3(0,1,0).symmetric_power(3)
B3(1,0,0) + 2*B3(0,1,0) + B3(1,0,2) + B3(0,1,2) + B3(2,1,0)
+ B3(0,3,0)
sage: [f1,f2,f3]=B3.fundamental_weights()
sage: B3(f3)
B3(0,0,1)
sage: B3(f3)^3
4*B3(0,0,1) + 3*B3(1,0,1) + 2*B3(0,1,1) + B3(0,0,3)
This is the same series of computations as before, just in a different
notation.
For Cartan Type Ar , if you use style="coroots", you are effectively
working with the group SL(r + 1, C). There is no way to represent the deter-
minant in this notation. On the other hand, if you use the default style, the
determinant is represented by A2(1,1,1) (in the case r = 2) so if you want
to do computations for GL(r + 1, C), do not use coroot style.
Sage knows many branching rules. For example, here is how to calculate
the restriction of a representation from SL(4) to Sp(4).
sage: A3=WeylCharacterRing("A3",style="coroots")
sage: C2=WeylCharacterRing("C2",style="coroots")
sage: r=A3(6,4,1)
sage: r.degree()
6860
sage: r.branch(C2,rule="symmetric")
C2(5,1) + C2(7,0) + C2(5,2) + C2(7,1) + C2(5,3)
+ C2(7,2) + C2(5,4) + C2(7,3) + C2(5,5) + C2(7,4)
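The degree 6860 reported by Sage can be double-checked with the Weyl dimension formula. For type A_{n−1} with highest weight given by its Dynkin labels (a1, . . . , an−1), the formula reads dim = ∏_{1≤i<j≤n} (ai + · · · + aj−1 + j − i)/(j − i). The following plain-Python sketch (an illustration added here, not Sage; the function name is ours) evaluates it on the labels (6, 4, 1).

```python
from fractions import Fraction

def dim_type_A(labels):
    # Weyl dimension formula for sl(n), n = len(labels) + 1, with the highest
    # weight given by its Dynkin labels (coroot-style coordinates a_1..a_{n-1}):
    #   dim = prod_{1 <= i < j <= n} (a_i + ... + a_{j-1} + (j - i)) / (j - i)
    n = len(labels) + 1
    d = Fraction(1)
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            d *= Fraction(sum(labels[i - 1:j - 1]) + (j - i), j - i)
    assert d.denominator == 1  # the product is always an integer
    return int(d)

print(dim_type_A((6, 4, 1)))  # degree of A3(6,4,1): 6860
print(dim_type_A((1, 0, 0)))  # standard representation of SL(4): 4
```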

To get documentation about Sage’s branching rules, either see the thematic
tutorial or enter the command:
sage: get_branching_rule?
As another example, let us compute

    ∫_{SU(2)} |tr(g)|^20 dg.

There are different ways of doing this computation. An efficient way is just
to decompose the tenth power of the standard character into irreducibles:
if tr(g)^10 = Σ_λ aλ χλ , then by orthonormality of the irreducible characters
the integral equals Σ_λ aλ^2 .
sage: A1=WeylCharacterRing("A1")
sage: A1(1)
A1(0,0)
sage: A1([1])
A1(1,0)
sage: A1([1])^10
42*A1(5,5) + 90*A1(6,4) + 75*A1(7,3) + 35*A1(8,2) + 9*A1(9,1)
+ A1(10,0)
sage: (A1([1])^10).monomial_coefficients()
{(8, 2): 35, (10, 0): 1, (9, 1): 9, (5, 5): 42,
(7, 3): 75, (6, 4): 90}
sage: sum(v^2 for v in (A1([1])^10).monomial_coefficients().values())
16796

Alternatively, |tr(g)|^20 is itself a character. We can compute this character,
then apply the method monomial_coefficients. This gives a dictionary whose
entries are the multiplicities of the irreducible constituents, indexed by highest
weight. The integral is the multiplicity of the trivial representation, so we
extract the coefficient of the zero weight, which we implement as
A1.space().zero().
sage: z = A1.space().zero(); z
(0, 0)
sage: ((A1([1])^10*A1([0,-1])^10)).monomial_coefficients()[z]
16796
Let us check that the even moments of the trace are the Catalan numbers:
sage: [sum(v^2 for v in
(A1([1])^k).monomial_coefficients().values()) for k in [0..10]]
[1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796]
sage: [catalan_number(k) for k in [0..10]]
[1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796]
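This pattern can also be verified without Sage. In the kth tensor power of the standard two-dimensional representation V of SU(2), the weight k − 2j has multiplicity C(k, j), so the irreducible with highest weight k − 2j occurs C(k, j) − C(k, j − 1) times. The following plain-Python sketch (added here as an illustration; the function names are ours) recovers the multiplicities above and the Catalan moments.

```python
from math import comb

def su2_tensor_power_multiplicities(k):
    # Decompose the k-th tensor power of the standard 2-dim rep of SU(2).
    # The weight k - 2j occurs with multiplicity C(k, j); the multiplicity of
    # the irreducible with highest weight k - 2j is then C(k, j) - C(k, j-1).
    return [comb(k, j) - (comb(k, j - 1) if j > 0 else 0)
            for j in range(k // 2 + 1)]

def catalan(k):
    return comb(2 * k, k) // (k + 1)

mults = su2_tensor_power_multiplicities(10)
print(sorted(mults))               # [1, 9, 35, 42, 75, 90], as in the Sage output
print(sum(a * a for a in mults))   # 16796

# The even moments of the trace are the Catalan numbers:
assert all(sum(a * a for a in su2_tensor_power_multiplicities(k)) == catalan(k)
           for k in range(11))
```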

You may also use the method weight_multiplicities of a Weyl character
to get a dictionary of weight multiplicities indexed by weight.
sage: A2=WeylCharacterRing("A2")
sage: d=A2(6,2,0).weight_multiplicities(); d
{(0, 6, 2): 1, (5, 0, 3): 1, (3, 5, 0): 1, ...

(Output suppressed.) Here is how to extract a single multiplicity. The space


method of the WeylCharacterRing returns the ambient vector space of the
weight lattice, and we may use this to generate a key.
sage: L=A2.space(); L
Ambient space of the Root system of type [’A’, 2]
sage: k=L((3,3,2)); k
(3, 3, 2)
sage: type(k)
<class ’sage.combinat.root_system.ambient_space.
AmbientSpace_with_category.element_class’>
sage: d[k]
3
In addition to the Lie group code in Sage, the symmetric function code in
Sage will be useful to readers of this book. You may convert between different
bases of the ring of symmetric functions (such as the Schur basis s and the
power sum basis p) and calculate important symmetric functions such as the
Hall–Littlewood symmetric functions. Moreover, Sage knows about the Hopf
algebra structure on the ring of symmetric functions. For example:
sage: Sym = SymmetricFunctions(QQ)
sage: s = Sym.schur()
sage: s[2]^2
s[2, 2] + s[3, 1] + s[4]
sage: (s[2]^2).coproduct()
s[] # s[2,2] + s[] # s[3,1] + s[] # s[4] + 2*s[1] # s[2,1]
+ 2*s[1] # s[3] + s[1, 1] # s[1, 1] + s[1, 1] # s[2]
+ s[2] # s[1, 1] + 3*s[2] # s[2] + 2*s[2, 1] # s[1]
+ s[2, 2] # s[] + 2*s[3] # s[1] + s[3, 1] # s[]
+ s[4] # s[]
sage: def f(a,b): return a*b.antipode()
sage: (s[2]^2).coproduct().apply_multilinear_morphism(f)
0
We’ve computed (in the notation introduced in the exercises to Chap-
ter 35) m ◦ (1 ⊗ S) ◦ Δ applied to s[2]^2. This computation may of course also
be done using the defining property of the antipode.
You can get a command line tutorial for the symmetric function code with:
sage: SymmetricFunctions?
References

1. Peter Abramenko and Kenneth S. Brown. Buildings, volume 248 of Graduate


Texts in Mathematics. Springer, New York, 2008. Theory and applications.
2. J. Adams. Lectures on Lie Groups. W. A. Benjamin, Inc., New York-
Amsterdam, 1969.
3. J. F. Adams. Lectures on exceptional Lie groups. Chicago Lectures in Math-
ematics. University of Chicago Press, Chicago, IL, 1996. With a foreword by
J. Peter May, Edited by Zafer Mahmud and Mamoru Mimura.
4. Gernot Akemann, Jinho Baik, and Philippe Di Francesco, editors. The Oxford
handbook of random matrix theory. Oxford University Press, Oxford, 2011.
5. A. Albert. Structure of Algebras. American Mathematical Society Colloquium
Publications, vol. 24. American Mathematical Society, New York, 1939.
6. B. N. Allison. Tensor products of composition algebras, Albert forms and some
exceptional simple Lie algebras. Trans. Amer. Math. Soc., 306(2):667–695,
1988.
7. Greg W. Anderson, Alice Guionnet, and Ofer Zeitouni. An introduction to
random matrices, volume 118 of Cambridge Studies in Advanced Mathematics.
Cambridge University Press, Cambridge, 2010.
8. J. Arthur and L. Clozel. Simple Algebras, Base Change, and the Advanced
Theory of the Trace Formula, volume 120 of Annals of Mathematics Studies.
Princeton University Press, Princeton, NJ, 1989.
9. E. Artin. Geometric Algebra. Interscience Publishers, Inc., New York and
London, 1957.
10. M. Artin, J. E. Bertin, M. Demazure, P. Gabriel, A. Grothendieck,
M. Raynaud, and J.-P. Serre. Schémas en groupes. Fasc. 5b: Exposés 17 et 18,
volume 1963/64 of Séminaire de Géométrie Algébrique de l’Institut des Hautes
Études Scientifiques. Institut des Hautes Études Scientifiques, Paris, 1964/1966
(http://www.math.jussieu.fr/~polo/SGA3/).
11. A. Ash, D. Mumford, M. Rapoport, and Y. Tai. Smooth Compactification of
Locally Symmetric Varieties. Math. Sci. Press, Brookline, Mass., 1975. Lie
Groups: History, Frontiers and Applications, Vol. IV.
12. John C. Baez. The octonions. Bull. Amer. Math. Soc. (N.S.), 39(2):145–205,
2002.
13. W. Baily. Introductory Lectures on Automorphic Forms. Iwanami Shoten,
Publishers, Tokyo, 1973. Kano Memorial Lectures, No. 2, Publications of the
Mathematical Society of Japan, No. 12.


14. W. Baily and A. Borel. Compactification of arithmetic quotients of bounded


symmetric domains. Ann. of Math. (2), 84:442–528, 1966.
15. A. Berele and J. B. Remmel. Hook flag characters and their combinatorics.
J. Pure Appl. Algebra, 35(3):225–245, 1985.
16. J. Bernstein and A. Zelevinsky. Representations of the group GL(n, F ) where
F is a local nonarchimedean field. Russian Mathematical Surveys, 31(3):1–68, 1976.
17. I. Bernstein and A. Zelevinsky. Induced representations of reductive p-adic
groups. I. Ann. Sci. École Norm. Sup. (4), 10(4):441–472, 1977.
18. P. Billingsley. Probability and Measure. Wiley Series in Probability and Math-
ematical Statistics. John Wiley & Sons Inc., New York, third edition, 1995.
A Wiley-Interscience Publication.
19. A. Borel. Automorphic L-functions. In Automorphic Forms, Representations
and L-Functions (Proc. Sympos. Pure Math., Oregon State Univ., Corvallis,
Ore., 1977), Part 2, Proc. Sympos. Pure Math., XXXIII, pages 27–61. Amer.
Math. Soc., Providence, R.I., 1979.
20. A. Borel. Linear Algebraic Groups, volume 126 of Graduate Texts in Mathe-
matics. Springer-Verlag, New York, second edition, 1991.
21. A. Borel and J. Tits. Groupes réductifs. Inst. Hautes Études Sci. Publ. Math.,
27:55–150, 1965.
22. A. Böttcher and B. Silbermann. Introduction to Large Truncated Toeplitz
Matrices. Universitext. Springer-Verlag, New York, 1999.
23. Nicolas Bourbaki. Lie groups and Lie algebras. Chapters 4–6. Elements of
Mathematics (Berlin). Springer-Verlag, Berlin, 2002. Translated from the 1968
French original by Andrew Pressley.
24. Nicolas Bourbaki. Lie groups and Lie algebras. Chapters 7–9. Elements of
Mathematics (Berlin). Springer-Verlag, Berlin, 2005. Translated from the 1975
and 1982 French originals by Andrew Pressley.
25. T. Bröcker and T. tom Dieck. Representations of Compact Lie Groups,
volume 98 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1985.
26. F. Bruhat. Sur les représentations induites des groupes de Lie. Bull. Soc. Math.
France, 84:97–205, 1956.
27. D. Bump. Automorphic Forms and Representations, volume 55 of Cambridge
Studies in Advanced Mathematics. Cambridge University Press, Cambridge,
1997.
28. D. Bump and P. Diaconis. Toeplitz minors. J. Combin. Theory Ser. A,
97(2):252–271, 2002.
29. D. Bump and A. Gamburd. On the averages of characteristic polynomials from
classical groups. Comm. Math. Phys., 265(1):227–274, 2006.
30. D. Bump, P. Diaconis, and J. Keller. Unitary correlations and the Fejér kernel.
Math. Phys. Anal. Geom., 5(2):101–123, 2002.
31. E. Cartan. Sur une classe remarquable d’espaces de Riemann. Bull. Soc. Math.
France, 54, 55:214–264, 114–134, 1926, 1927.
32. R. Carter. Finite Groups of Lie Type, Conjugacy classes and complex char-
acters. Pure and Applied Mathematics. John Wiley & Sons Inc., New York,
1985. A Wiley-Interscience Publication.
33. P. Cartier. Representations of p-adic groups: a survey. In Automorphic forms,
representations and L-functions (Proc. Sympos. Pure Math., Oregon State
Univ., Corvallis, Ore., 1977), Part 1, Proc. Sympos. Pure Math., XXXIII,
pages 111–155. Amer. Math. Soc., Providence, R.I., 1979.

34. W. Casselman. Introduction to the Theory of Admissible Representa-


tions of Reductive p-adic Groups. Widely circulated preprint. Available at
http://www.math.ubc.ca/~cass/research.html, 1974.
35. C. Chevalley. Theory of Lie Groups. I. Princeton Mathematical Series, vol. 8.
Princeton University Press, Princeton, N. J., 1946.
36. C. Chevalley. The Algebraic Theory of Spinors and Clifford Algebras. Springer-
Verlag, Berlin, 1997. Collected works. Vol. 2, edited and with a foreword by
Pierre Cartier and Catherine Chevalley, with a postface by J.-P. Bourguignon.
37. W. Chow. On equivalence classes of cycles in an algebraic variety. Ann. of
Math. (2), 64:450–479, 1956.
38. B. Conrey. L-functions and random matrices. In Mathematics unlimited—2001
and beyond, pages 331–352. Springer, Berlin, 2001.
39. C. Curtis. Pioneers of Representation Theory: Frobenius, Burnside, Schur, and
Brauer, volume 15 of History of Mathematics. American Mathematical Society,
Providence, RI, 1999.
40. P. A. Deift. Orthogonal polynomials and random matrices: a Riemann-Hilbert
approach, volume 3 of Courant Lecture Notes in Mathematics. New York Uni-
versity Courant Institute of Mathematical Sciences, New York, 1999.
41. P. Deligne and G. Lusztig. Representations of Reductive Groups over Finite
Fields. Ann. of Math. (2), 103(1):103–161, 1976.
42. P. Diaconis and M. Shahshahani. On the eigenvalues of random matrices.
J. Appl. Probab., 31A:49–62, 1994. Studies in applied probability.
43. A. Dold. Fixed point index and fixed point theorem for Euclidean neighborhood
retracts. Topology, 4:1–8, 1965.
44. A. Dold. Lectures on Algebraic Topology. Springer-Verlag, New York, 1972. Die
Grundlehren der mathematischen Wissenschaften, Band 200.
45. E. Dynkin. Maximal subgroups of semi-simple Lie groups and the
classification of primitive groups of transformations. Doklady Akad. Nauk SSSR
(N.S.), 75:333–336, 1950.
46. E. Dynkin. Maximal subgroups of the classical groups. Trudy Moskov. Mat.
Obšč., 1:39–166, 1952.
47. E. Dynkin. Semisimple subalgebras of semisimple Lie algebras. Mat. Sbornik
N.S., 30(72):349–462, 1952.
48. F. Dyson. Statistical theory of the energy levels of complex systems, I, II, III.
J. Mathematical Phys., 3:140–156, 157–165, 166–175, 1962.
49. Freeman Dyson. Selected papers of Freeman Dyson with commentary, volume 5
of Collected Works. American Mathematical Society, Providence, RI, 1996.
With a foreword by Elliott H. Lieb.
50. H. Freudenthal. Lie groups in the foundations of geometry. Advances in Math.,
1:145–190 (1964), 1964.
51. G. Frobenius. Über die charakteristischen Einheiten der symmetrischen Gruppe.
S’ber. Akad. Wiss. Berlin, 504–537, 1903.
52. G. Frobenius and I. Schur. Über die reellen Darstellungen der endlichen
Gruppen. S’ber. Akad. Wiss. Berlin, 186–208, 1906.
53. W. Fulton. Young Tableaux, with applications to representation theory and
geometry, volume 35 of London Mathematical Society Student Texts. Cambridge
University Press, Cambridge, 1997.
54. W. Fulton. Intersection Theory, volume 2 of Ergebnisse der Mathematik und
ihrer Grenzgebiete. Springer-Verlag, Berlin, second edition, 1998.

55. I. Gelfand, M. Graev, and I. Piatetski-Shapiro. Representation Theory and


Automorphic Functions. Academic Press Inc., 1990. Translated from the Rus-
sian by K. A. Hirsch, Reprint of the 1969 edition.
56. R. Goodman and N. Wallach. Representations and Invariants of the Classi-
cal Groups, volume 68 of Encyclopedia of Mathematics and its Applications.
Cambridge University Press, Cambridge, 1998.
57. R. Gow. Properties of the characters of the finite general linear group related to
the transpose-inverse involution. Proc. London Math. Soc. (3), 47(3):493–506,
1983.
58. J. Green. The characters of the finite general linear groups. Trans. Amer. Math.
Soc., 80:402–447, 1955.
59. B. Gross. Some applications of Gelfand pairs to number theory. Bull. Amer.
Math. Soc. (N.S.), 24:277–301, 1991.
60. R. Gunning and H. Rossi. Analytic functions of several complex variables.
Prentice-Hall Inc., Englewood Cliffs, N.J., 1965.
61. P. Halmos. Measure Theory. D. Van Nostrand Company, Inc., New York, N. Y.,
1950.
62. Harish-Chandra. Eisenstein series over finite fields. In Functional analysis and
related fields (Proc. Conf. M. Stone, Univ. Chicago, Chicago, Ill., 1968), pages
76–88. Springer, New York, 1970.
63. M. Harris and R. Taylor. The Geometry and Cohomology of Some Simple
Shimura Varieties, volume 151 of Annals of Mathematics Studies. Princeton
University Press, Princeton, NJ, 2001. With an appendix by Vladimir G.
Berkovich.
64. R. Hartshorne. Algebraic Geometry. Springer-Verlag, New York, 1977. Gradu-
ate Texts in Mathematics, No. 52.
65. E. Hecke. Über Modulfunktionen und die Dirichletschen Reihen mit Eulerscher
Produktentwicklungen, I and II. Math. Ann., 114:1–28, 316–351, 1937.
66. S. Helgason. Differential Geometry, Lie Groups, and Symmetric Spaces, volume
80 of Pure and Applied Mathematics. Academic Press Inc. [Harcourt Brace
Jovanovich Publishers], New York, 1978.
67. Guy Henniart. Une preuve simple des conjectures de Langlands pour GL(n)
sur un corps p-adique. Invent. Math., 139(2):439–455, 2000.
68. Guy Henniart. On the local Langlands and Jacquet-Langlands correspon-
dences. In International Congress of Mathematicians. Vol. II, pages 1171–1182.
Eur. Math. Soc., Zürich, 2006.
69. E. Hewitt and K. Ross. Abstract Harmonic Analysis. Vol. I, Structure of
topological groups, integration theory, group representations, volume 115 of
Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of
Mathematical Sciences]. Springer-Verlag, Berlin, second edition, 1979.
70. H. Hiller. Geometry of Coxeter Groups, volume 54 of Research Notes in Math-
ematics. Pitman (Advanced Publishing Program), Boston, Mass., 1982.
71. W. Hodge and D. Pedoe. Methods of Algebraic Geometry. Vol. II. Cambridge
Mathematical Library. Cambridge University Press, Cambridge, 1994. Book
III: General theory of algebraic varieties in projective space, Book IV: Quadrics
and Grassmann varieties, Reprint of the 1952 original.
72. Jin Hong and Seok-Jin Kang. Introduction to quantum groups and crystal
bases, volume 42 of Graduate Studies in Mathematics. American Mathematical
Society, Providence, RI, 2002.

73. R. Howe. θ-series and invariant theory. In Automorphic Forms, Representations


and L-Functions (Proc. Sympos. Pure Math., Oregon State Univ., Corvallis,
Ore., 1977), Part 1, Proc. Sympos. Pure Math., XXXIII, pages 275–285. Amer.
Math. Soc., Providence, R.I., 1979.
74. R. Howe. Harish-Chandra Homomorphisms for p-adic Groups, volume 59 of
CBMS Regional Conference Series in Mathematics. Published for the Confer-
ence Board of the Mathematical Sciences, Washington, DC, 1985. With the
collaboration of Allen Moy.
75. Roger Howe. Remarks on classical invariant theory. Trans. Amer. Math. Soc.,
313(2):539–570, 1989.
76. R. Howe. Hecke algebras and p-adic GLn . In Representation theory and analysis
on homogeneous spaces (New Brunswick, NJ, 1993), volume 177 of Contemp.
Math., pages 65–100. Amer. Math. Soc., Providence, RI, 1994.
77. Roger Howe. Perspectives on invariant theory: Schur duality, multiplicity-free
actions and beyond. In The Schur lectures (1992) (Tel Aviv), volume 8 of Israel
Math. Conf. Proc., pages 1–182. Bar-Ilan Univ., Ramat Gan, 1995.
78. R. Howe and E.-C. Tan. Nonabelian Harmonic Analysis. Universitext.
Springer-Verlag, New York, 1992. Applications of SL(2, R).
79. Roger Howe, Eng-Chye Tan, and Jeb F. Willenbring. Stable branching rules for
classical symmetric pairs. Trans. Amer. Math. Soc., 357(4):1601–1626, 2005.
80. R. Howlett and G. Lehrer. Induced cuspidal representations and generalised
Hecke rings. Invent. Math., 58(1):37–64, 1980.
81. E. Ince. Ordinary Differential Equations. Dover Publications, New York, 1944.
82. N. Inglis, R. Richardson, and J. Saxl. An explicit model for the complex rep-
resentations of Sn . Arch. Math. (Basel), 54:258–259, 1990.
83. I. M. Isaacs. Character Theory of Finite Groups. Dover Publications Inc., New
York, 1994. Corrected reprint of the 1976 original [Academic Press, New York;
MR 57 #417].
84. N. Iwahori. On the structure of a Hecke ring of a Chevalley group over a finite
field. J. Fac. Sci. Univ. Tokyo Sect. I, 10:215–236, 1964.
85. N. Iwahori. Generalized Tits system (Bruhat decompostition) on p-adic
semisimple groups. In Algebraic Groups and Discontinuous Subgroups (Proc.
Sympos. Pure Math., Boulder, Colo., 1965), pages 71–83. Amer. Math. Soc.,
Providence, R.I., 1966.
86. N. Iwahori and H. Matsumoto. On some Bruhat decomposition and the struc-
ture of the Hecke rings of p-adic Chevalley groups. Inst. Hautes Études Sci.
Publ. Math., 25:5–48, 1965.
87. N. Jacobson. Cayley numbers and normal simple Lie algebras of type G. Duke
Math. J., 5:775–783, 1939.
88. N. Jacobson. Exceptional Lie Algebras, volume 1 of Lecture Notes in Pure and
Applied Mathematics. Marcel Dekker Inc., New York, 1971.
89. M. Jimbo. A q-analogue of U (gl(N + 1)), Hecke algebra, and the Yang-Baxter
equation. Lett. Math. Phys., 11(3):247–252, 1986.
90. Michio Jimbo. Introduction to the Yang-Baxter equation. Internat. J. Modern
Phys. A, 4(15):3759–3777, 1989.
91. V. Jones. Hecke algebra representations of braid groups and link polynomials.
Ann. of Math. (2), 126:335–388, 1987.
92. Victor G. Kac. Infinite-dimensional Lie algebras. Cambridge University Press,
Cambridge, third edition, 1990.

93. Masaki Kashiwara. On crystal bases. In Representations of groups (Banff, AB,


1994), volume 16 of CMS Conf. Proc., pages 155–197. Amer. Math. Soc., Prov-
idence, RI, 1995.
94. N. Katz and P. Sarnak. Zeroes of zeta functions and symmetry. Bull. Amer.
Math. Soc. (N.S.), 36(1):1–26, 1999.
95. Nicholas M. Katz and Peter Sarnak. Random matrices, Frobenius eigenval-
ues, and monodromy, volume 45 of American Mathematical Society Colloquium
Publications. American Mathematical Society, Providence, RI, 1999.
96. N. Kawanaka and H. Matsuyama. A twisted version of the Frobenius-Schur
indicator and multiplicity-free permutation representations. Hokkaido Math.
J., 19(3):495–508, 1990.
97. David Kazhdan and George Lusztig. Representations of Coxeter groups and
Hecke algebras. Invent. Math., 53(2):165–184, 1979.
98. David Kazhdan and George Lusztig. Proof of the Deligne-Langlands conjecture
for Hecke algebras. Invent. Math., 87(1):153–215, 1987.
99. J. Keating and N. Snaith. Random matrix theory and ζ(1/2 + it). Comm.
Math. Phys., 214(1):57–89, 2000.
100. A. Kerber. Representations of permutation groups. I. Lecture Notes in Math-
ematics, Vol. 240. Springer-Verlag, Berlin, 1971.
101. R. King. Branching rules for classical Lie groups using tensor and spinor meth-
ods. J. Phys. A, 8:429–449, 1975.
102. S. Kleiman. Problem 15: rigorous foundation of Schubert’s enumerative cal-
culus. In Mathematical Developments Arising from Hilbert Problems (Proc.
Sympos. Pure Math., Northern Illinois Univ., De Kalb, Ill., 1974), pages 445–
482. Proc. Sympos. Pure Math., Vol. XXVIII. Amer. Math. Soc., Providence,
R. I., 1976.
103. A. Klyachko. Models for complex representations of groups GL(n, q). Mat. Sb.
(N.S.), 120(162)(3):371–386, 1983.
104. A. Knapp. Representation Theory of Semisimple Groups, an overview based on
examples, volume 36 of Princeton Mathematical Series. Princeton University
Press, Princeton, NJ, 1986.
105. A. Knapp. Lie groups, Lie algebras, and Cohomology, volume 34 of Mathematical
Notes. Princeton University Press, Princeton, NJ, 1988.
106. A. Knapp. Lie Groups Beyond an Introduction, volume 140 of Progress in
Mathematics. Birkhäuser Boston Inc., Boston, MA, second edition, 2002.
107. M.-A. Knus, A. Merkurjev, M. Rost, and J.-P. Tignol. The Book of Involutions,
volume 44 of American Mathematical Society Colloquium Publications. Amer-
ican Mathematical Society, Providence, RI, 1998. With a preface in French by
J. Tits.
108. Donald E. Knuth. Permutations, matrices, and generalized Young tableaux.
Pacific J. Math., 34:709–727, 1970.
109. D. Knuth. The Art of Computer Programming. Volume 3, Sorting
and Searching. Addison-Wesley Publishing Co., Reading, Mass.-London-
Don Mills, Ont., 1973. Addison-Wesley Series in Computer Science and
Information Processing.
110. S. Kobayashi and K. Nomizu. Foundations of Differential Geometry. Vol I.
Interscience Publishers, a division of John Wiley & Sons, New York-London,
1963.
111. A. Korányi and J. Wolf. Generalized Cayley transformations of bounded sym-
metric domains. Amer. J. Math., 87:899–939, 1965.
112. A. Korányi and J. Wolf. Realization of hermitian symmetric spaces as generalized half-planes. Ann. of Math. (2), 81:265–288, 1965.
113. S. Kudla. Seesaw dual reductive pairs. In Automorphic forms of several vari-
ables (Katata, 1983), volume 46 of Progr. Math., pages 244–268. Birkhäuser
Boston, Boston, MA, 1984.
114. Laurent Lafforgue. Chtoucas de Drinfeld et correspondance de Langlands.
Invent. Math., 147(1):1–241, 2002.
115. J. Landsberg and L. Manivel. The projective geometry of Freudenthal’s magic
square. J. Algebra, 239(2):477–512, 2001.
116. S. Lang. Algebra, volume 211 of Graduate Texts in Mathematics. Springer-
Verlag, New York, third edition, 2002.
117. R. Langlands. Euler Products. Yale University Press, New Haven, Conn., 1971.
A James K. Whittemore Lecture in Mathematics given at Yale University,
1967, Yale Mathematical Monographs, 1.
118. H. B. Lawson and M.-L. Michelsohn. Spin Geometry, volume 38 of Princeton
Mathematical Series. Princeton University Press, Princeton, NJ, 1989.
119. G. Lion and M. Vergne. The Weil representation, Maslov index and theta series,
volume 6 of Progress in Mathematics. Birkhäuser Boston, Mass., 1980.
120. D. Littlewood. The Theory of Group Characters and Matrix Representations
of Groups. Oxford University Press, New York, 1940.
121. L. Loomis. An Introduction to Abstract Harmonic Analysis. D. Van Nostrand
Company, Inc., Toronto-New York-London, 1953.
122. George Lusztig. Equivariant K-theory and representations of Hecke
algebras. Proc. Amer. Math. Soc., 94(2):337–342, 1985.
123. I. G. Macdonald. Schur functions: theme and variations. In Séminaire
Lotharingien de Combinatoire (Saint-Nabor, 1992), volume 498 of Publ. Inst.
Rech. Math. Av., pages 5–39. Univ. Louis Pasteur, Strasbourg, 1992.
124. I. Macdonald. Symmetric Functions and Hall Polynomials. Oxford Mathemat-
ical Monographs. The Clarendon Press Oxford University Press, New York,
second edition, 1995. With contributions by A. Zelevinsky, Oxford Science
Publications.
125. S. Majid. A quantum groups primer, volume 292 of London Mathematical So-
ciety Lecture Note Series. Cambridge University Press, Cambridge, 2002.
126. L. Manivel. Symmetric Functions, Schubert Polynomials and Degeneracy Loci,
volume 6 of SMF/AMS Texts and Monographs. American Mathematical Soci-
ety, Providence, RI, 2001. Translated from the 1998 French original by John
R. Swallow, Cours Spécialisés [Specialized Courses], 3.
127. H. Matsumoto. Générateurs et relations des groupes de Weyl généralisés. C.
R. Acad. Sci. Paris, 258:3419–3422, 1964.
128. M. Mehta. Random Matrices. Academic Press Inc., Boston, MA, second
edition, 1991.
129. J. Milnor and J. Stasheff. Characteristic Classes. Princeton University Press,
Princeton, N. J., 1974. Annals of Mathematics Studies, No. 76.
130. C. Moeglin. Representations of GL(n) over the real field. In Representation
theory and automorphic forms (Edinburgh, 1996), volume 61 of Proc. Sympos.
Pure Math., pages 157–166. Amer. Math. Soc., Providence, RI, 1997.
131. C. Mœglin and J.-L. Waldspurger. Spectral Decomposition and Eisenstein Se-
ries, Une paraphrase de l’Écriture [A paraphrase of Scripture], volume 113 of
Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge,
1995.
132. D. Mumford, J. Fogarty, and F. Kirwan. Geometric invariant theory, volume 34
of Ergebnisse der Mathematik und ihrer Grenzgebiete (2) [Results in Mathematics and Related Areas (2)]. Springer-Verlag, Berlin, third edition, 1994.
133. I. Pyateskii-Shapiro. Automorphic Functions and the Geometry of Clas-
sical Domains. Translated from the Russian. Mathematics and Its
Applications, Vol. 8. Gordon and Breach Science Publishers, New York, 1969.
134. Martin Raussen and Christian Skau. Interview with John G. Thompson and
Jacques Tits. Notices Amer. Math. Soc., 56(4):471–478, 2009.
135. N. Reshetikhin and V. G. Turaev. Invariants of 3-manifolds via link polynomi-
als and quantum groups. Invent. Math., 103(3):547–597, 1991.
136. G. de B. Robinson. On the Representations of the Symmetric Group. Amer.
J. Math., 60(3):745–760, 1938.
137. J. Rogawski. On modules over the Hecke algebra of a p-adic group. Invent.
Math., 79:443–465, 1985.
138. H. Rubenthaler. Les paires duales dans les algèbres de Lie réductives.
Astérisque, 219, 1994.
139. Michael Rubinstein. Computational methods and experiments in analytic num-
ber theory. In Recent perspectives in random matrix theory and number theory,
volume 322 of London Math. Soc. Lecture Note Ser., pages 425–506. Cambridge
Univ. Press, Cambridge, 2005.
140. W. Rudin. Fourier Analysis on Groups. Interscience Tracts in Pure and Applied
Mathematics, No. 12. Interscience Publishers (a division of John Wiley and
Sons), New York-London, 1962.
141. B. Sagan. The Symmetric Group, representations, combinatorial algorithms,
and symmetric functions, volume 203 of Graduate Texts in Mathematics.
Springer-Verlag, New York, second edition, 2001.
142. I. Satake. On representations and compactifications of symmetric
Riemannian spaces. Ann. of Math. (2), 71:77–110, 1960.
143. I. Satake. Theory of spherical functions on reductive algebraic groups over
p-adic fields. Inst. Hautes Études Sci. Publ. Math., 18:5–69, 1963.
144. I. Satake. Classification Theory of Semi-simple Algebraic Groups. Marcel
Dekker Inc., New York, 1971. With an appendix by M. Sugiura, Notes prepared
by Doris Schattschneider, Lecture Notes in Pure and Applied Mathematics, 3.
145. I. Satake. Algebraic Structures of Symmetric Domains, volume 4 of Kano
Memorial Lectures. Iwanami Shoten and Princeton University Press, Tokyo,
1980.
146. R. Schafer. An Introduction to Nonassociative Algebras. Pure and Applied
Mathematics, Vol. 22. Academic Press, New York, 1966.
147. C. Schensted. Longest increasing and decreasing subsequences. Canad. J.
Math., 13:179–191, 1961.
148. J.-P. Serre. Galois Cohomology. Springer-Verlag, Berlin, 1997. Translated from
the French by Patrick Ion and revised by the author.
149. E. Spanier. Algebraic Topology. McGraw-Hill Book Co., New York, 1966.
150. T. Springer. Galois cohomology of linear algebraic groups. In Algebraic Groups
and Discontinuous Subgroups (Proc. Sympos. Pure Math., Boulder, Colo.,
1965), pages 149–158. Amer. Math. Soc., Providence, R.I., 1966.
151. T. Springer. Cusp Forms for Finite Groups. In Seminar on Algebraic Groups
and Related Finite Groups (The Institute for Advanced Study, Princeton, N.J.,
1968/69), Lecture Notes in Mathematics, Vol. 131, pages 97–120. Springer,
Berlin, 1970.
152. T. Springer. Reductive groups. In Automorphic forms, representations and L-functions (Proc. Sympos. Pure Math., Oregon State Univ., Corvallis, Ore.,
1977), Part 1, Proc. Sympos. Pure Math., XXXIII, pages 3–27. Amer. Math.
Soc., Providence, R.I., 1979.
153. R. Stanley. Enumerative Combinatorics. Vol. 2, volume 62 of Cambridge Stud-
ies in Advanced Mathematics. Cambridge University Press, Cambridge, 1999.
With a foreword by Gian-Carlo Rota and appendix 1 by Sergey Fomin.
154. Robert Steinberg. A general Clebsch-Gordan theorem. Bull. Amer. Math. Soc.,
67:406–407, 1961.
155. Robert Steinberg. Lectures on Chevalley groups. Yale University, New Haven,
Conn. (http://www.math.ucla.edu/~rst/), 1968. Notes prepared by John
Faulkner and Robert Wilson.
156. E. Stiefel. Kristallographische Bestimmung der Charaktere der geschlossenen
Lie’schen Gruppen. Comment. Math. Helv., 17:165–200, 1945.
157. G. Szegö. On certain Hermitian forms associated with the Fourier series of a
positive function. Comm. Sém. Math. Univ. Lund [Medd. Lunds Univ. Mat.
Sem.], 1952(Tome Supplementaire):228–238, 1952.
158. T. Tamagawa. On the ζ-functions of a division algebra. Ann. of Math. (2),
77:387–405, 1963.
159. J. Tate. Number theoretic background. In Automorphic forms, representations
and L-functions (Proc. Sympos. Pure Math., Oregon State Univ., Corvallis,
Ore., 1977), Part 2, Proc. Sympos. Pure Math., XXXIII, pages 3–26. Amer.
Math. Soc., Providence, R.I., 1979.
160. H. N. V. Temperley and E. H. Lieb. Relations between the “percolation” and
“colouring” problem and other graph-theoretical problems associated with reg-
ular planar lattices: some exact results for the “percolation” problem. Proc.
Roy. Soc. London Ser. A, 322(1549):251–280, 1971.
161. J. Tits. Algèbres alternatives, algèbres de Jordan et algèbres de Lie excep-
tionnelles. I. Construction. Nederl. Akad. Wetensch. Proc. Ser. A 69 = Indag.
Math., 28:223–237, 1966.
162. J. Tits. Classification of algebraic semisimple groups. In Algebraic Groups and
Discontinuous Subgroups (Proc. Sympos. Pure Math., Boulder, Colo., 1965),
pages 33–62. Amer. Math. Soc., Providence, R.I., 1966.
163. Jacques Tits. Buildings of spherical type and finite BN-pairs. Lecture Notes in
Mathematics, Vol. 386. Springer-Verlag, Berlin, 1974.
164. Marc A. A. van Leeuwen. The Robinson-Schensted and Schützenberger algo-
rithms, an elementary approach. Electron. J. Combin., 3(2):Research Paper
15, approx. 32 pp. (electronic), 1996. The Foata Festschrift.
165. V. Varadarajan. An Introduction to Harmonic Analysis on Semisimple Lie
Groups, volume 16 of Cambridge Studies in Advanced Mathematics. Cambridge
University Press, Cambridge, 1989.
166. È. Vinberg, editor. Lie Groups and Lie Algebras, III, volume 41 of Encyclopae-
dia of Mathematical Sciences. Springer-Verlag, Berlin, 1994. Structure of Lie
groups and Lie algebras, A translation of Current problems in mathematics.
Fundamental directions. Vol. 41 (Russian), Akad. Nauk SSSR, Vsesoyuz. Inst.
Nauchn. i Tekhn. Inform., Moscow, 1990 [MR 91b:22001], Translation by V.
Minachin [V. V. Minakhin], Translation edited by A. L. Onishchik and È. B.
Vinberg.
167. D. Vogan. Unitary Representations of Reductive Lie Groups, volume 118 of
Annals of Mathematics Studies. Princeton University Press, Princeton, NJ,
1987.
168. N. Wallach. Real Reductive Groups. I, volume 132 of Pure and Applied Math-
ematics. Academic Press Inc., Boston, MA, 1988.
169. A. Weil. L’intégration dans les Groupes Topologiques et ses Applications.
Actual. Sci. Ind., no. 869. Hermann et Cie., Paris, 1940. [This book has been
republished by the author at Princeton, N. J., 1941.].
170. A. Weil. Numbers of solutions of equations in finite fields. Bull. Amer. Math.
Soc., 55:497–508, 1949.
171. A. Weil. Algebras with involutions and the classical groups. J. Indian Math.
Soc. (N.S.), 24:589–623 (1961), 1960.
172. A. Weil. Sur certains groupes d’opérateurs unitaires. Acta Math., 111:143–211,
1964.
173. A. Weil. Sur la formule de Siegel dans la théorie des groupes classiques. Acta
Math., 113:1–87, 1965.
174. H. Weyl. Theorie der Darstellung kontinuierlicher halb-einfacher Gruppen
durch lineare Transformationen, I, II and III. Math. Zeitschrift, 23:271–309,
24:328–395, 1925, 1926.
175. J. Wolf. Complex homogeneous contact manifolds and quaternionic symmetric
spaces. J. Math. Mech., 14:1033–1047, 1965.
176. J. Wolf. Spaces of Constant Curvature. McGraw-Hill Book Co., New York,
1967.
177. A. Zelevinsky. Induced representations of reductive p-adic groups. II. On ir-
reducible representations of GL(n). Ann. Sci. École Norm. Sup. (4), 13(2):
165–210, 1980.
178. A. Zelevinsky. Representations of Finite Classical Groups, A Hopf algebra
approach, volume 869 of Lecture Notes in Mathematics. Springer-Verlag, Berlin,
1981.
179. R. B. Zhang. Howe duality and the quantum general linear group. Proc. Amer.
Math. Soc., 131(9):2681–2692 (electronic), 2003.
Index

Abelian subspace, 283
absolute root system, 281, 282
Adams operations, 189, 353
adjoint group, 145
adjoint representation, 54
admissible path, 110
affine Hecke algebra, 472
affine ring, 405
affine root, 307
affine Weyl group, 191, 195, 221
algebraic character, 349
algebraic complexification, 208
algebraic cycle, 519
algebraic representation, 209, 349
alternating map, 59, 356
anisotropic kernel, 265, 284
anisotropic torus, 511
antipodal map, 43
arclength, 110
Ascoli-Arzela Lemma, 21, 22, 24
atlas, 39
augmentation map, 479
automorphic cuspidal representation, 489
automorphic form, 487, 488
automorphic representation, 489
balanced map, 345
base point, 81
Bergman-Shilov boundary, 274
Bezout's Theorem, 520
bialgebra, 375
big Bruhat cell, 254
bilinear form
  invariant, 63
bimodule, 345
Borel subgroup, 227, 232
  standard, 232
boundary
  Bergman-Shilov, 274
boundary component, 272, 274
boundary of a symmetric space, 269
bounded operator, 19
bracket
  Lie, 32
braid group, 216
braid relation, 216
branching rule, 399, 419
Brauer-Klimyk method, 185
Bruhat decomposition, 243, 300
building
  Tits', 195, 214, 243, 276
Cartan decomposition, 89
Cartan involution, 257
Cartan type, 145
  classical, 145
Casimir element, 62, 64, 75, 488
Catalan numbers, 128
Cauchy identity, 241, 395, 415
  dual, 398, 416
  supersymmetric, 406
Cayley numbers, 276, 313
Cayley transform, 37, 268–270
center, 201
central character, 489

D. Bump, Lie Groups, Graduate Texts in Mathematics 225, 545
DOI 10.1007/978-1-4614-8024-2, © Springer Science+Business Media New York 2013
central orthogonal idempotents, 261
character, 7
  algebraic, 349
  linear, 103
  rational, 103
  unipotent, 476
character, generalized, 15
character, virtual, 15
characteristic function of a measure, 411
Chow's Lemma, 519
Christoffel symbol, 111
circle
  Noneuclidean, 121
Circular orthogonal ensemble (COE), 415
Circular symplectic ensemble (CSE), 415
Circular unitary ensemble (CUE), 415
class function, 16
classical Cartan types, 145
classical root systems, 148
Clifford algebra, 324
closed Lie subgroup, 31, 45
coalgebra, 374
compact Lie algebra, 262
compact operator, 19
complementary minors, 365
complete reducibility, 75
complete symmetric polynomial, 349
complex analytic group, 101
complex and real representations, 67
complex Lie group, 101
complex manifold, 101
complexification, 68, 103
  algebraic, 208
  torus, 103
complexification of a Lie group, 205
concatenation of paths, 81
cone
  homogeneous, 275
  self-dual, 275
conformal map, 120
conjugacy class indicator, 388
conjugate partition, 359
constant term, 489
contractible space, 81
contragredient representation, 10, 445
convolution, 17, 23, 338
coordinate functions, 39
coordinate neighborhood, 39
coroot, 130–133, 144, 147, 169, 195
correlation, 416
correspondence, 401
covering map, 83
  local triviality of, 83
  pointed, 83
  trivial, 83
  universal, 84
covering space
  morphism of, 83
coweight lattice, 195
Coxeter complex, 214
Coxeter group, 213, 471
crystal, 432
cusp form, 487, 489
cuspidal representation, 485, 491, 492
CW-complex, 517
cycle type, 387
defined over a field, 448
Demazure operator, 219
derivation, 33
diffeomorphism, 31, 39
differential, 42, 48
discrete series, 485, 511
divided difference operator, 218
dominant weight, 165, 384
dual Cauchy identity, 398
dual group, 7
dual lattice, 130
dual reductive pair, 314
dual symmetric spaces, 258
Dynkin diagram, 222
  extended, 307
effective weight, 238
eigenspace, 19
eigenvalue, 19
eigenvector, 19
Einstein summation convention, 110
Eisenstein series, 485, 489
elementary symmetric polynomial, 349
ensemble, 413
equicontinuity, 21, 22
equivariant map, 11
Euclidean space, 129
even partition, 449
even weight, 238
exceptional group, 152
exceptional Jordan algebra, 276
exponential map, 33
extended Dynkin diagram, 304, 307
extension of scalars, 67
faithful representation, 26
Ferrers' diagram, 359
fixed point, 518
  isolated, 518
flag manifold, 108
folding, 310
Fourier inversion, 7
Fourier inversion formula, 8
Frobenius-Schur duality, viii, 355, 480
Frobenius-Schur indicator, 188, 446, 452
  twisted, 452
fundamental dominant weight, 147, 166, 169
fundamental group, 84
G-module, 339
Galois cohomology, 210
Gamburd, 415
Gaussian binomial coefficient, 508, 526
Gaussian Orthogonal Ensemble (GOE), 414
Gaussian Symplectic Ensemble (GSE), 414
Gaussian Unitary Ensemble (GUE), 414
Gelfand pair, 462, 465
Gelfand subgroup, 462, 465
Gelfand-Graev representation, 468
Gelfand-Tsetlin pattern, 428
general linear group, 32
generalized character, 15
generator
  topological, 104
generic representation, 468
geodesic, 111, 113
geodesic coordinates, 114
geodesically complete, 116
germ, 39
graded algebra, 59, 375
graded module, 375
Grassmannian, 521
Haar measure, 3
  left, 3
  right, 3
half-integral weight, 323
Hamiltonian, 414
Hecke algebra, 461, 462, 471
  affine, 472
  Iwahori, 472
  spherical, 472
Heine-Szegö identity, 439
Hermitian form, 8
Hermitian manifold, 266
Hermitian matrix, 88
  positive definite, 88
Hermitian symmetric space, 266
highest weight vector, 132, 169, 171, 182, 384
Hilbert–Schmidt operator, 22
homogeneous space, 90
homomorphism
  Lie algebra, 48
homomorphism of G-modules, 11
homotopic, 81
homotopy, 81
  path, 81
homotopy lifting property, 83
hook, 422
hook length formula, 422
Hopf algebra, 374, 376, 435
Hopf axiom, 375
horizontal strip, 423
Howe correspondence, 401
Howe duality, viii, 401
hyperbolic space, 293
hyperoctahedral group, 377
idempotents
  orthogonal central, 261
induced representation, 337
initial object, 57
inner form, 263
inner product, 8, 109
  equivariant, 8
  invariant, 8
integral curve, 51
integral manifold, 93
integral weight, 323
interlace, 427
intersection multiplicity, 519
intertwining integral, 491
intertwining operator, 11, 337
  support of, 343
invariant bilinear form, 63
invariant inner product, 8
invariants of a representation, 14
Inverse Function Theorem, 31
involution, 448, 463
  Cartan, 257
involution model, 449
involutory family, 93
irreducible character, 7
irreducible representation, 7, 62
isolated fixed point, 518
isometric map, 120
Iwahori Hecke algebra, 472
Iwahori subgroup, 472
Iwasawa decomposition, 228
Jacobi identity, 32
Jacquet functor, 491
Jordan algebra, 275
Kawanaka and Matsuyama theorem, 452
Keating and Snaith, 415
Killing form, 63
Kronecker's Theorem, 104
L-group, 130
lambda ring, 189
Langlands correspondence, 510, 512
Langlands L-group, 130
Laplace-Beltrami operator, 488
lattice
  weight, 172
Lefschetz fixed-point formula, 518, 526
Lefschetz number, 518
left invariant vector field, 46
length of a partition, 359
Levi subgroup, 399, 473
Lie algebra, 32
  compact, 262
  simple, 64
Lie algebra homomorphism, 48, 61
Lie algebra representation, 53
Lie bracket, 32
Lie group, 45
  reductive, 281, 303
Lie subgroup, 31, 45
  closed, 31, 45
Lie's theorem on solvable Lie algebras, 230
linear character, 103
linear equivalence of cycles, 519
Littlewood-Richardson coefficient, 399
Littlewood-Richardson rule, 422, 423
local coordinates, 39
local derivation, 41
local field, 485
local homomorphism, 86, 97
local Langlands correspondence, 510
local subgroup, 97
local triviality, 83
locally closed subspace, 31
long Weyl group element, 165
loop, 81
lowering operator, 449
Magic Square of Freudenthal, 276
manifold
  Hermitian, 266
  Riemannian, 109
  smooth, 39
matrix coefficient of a representation, 9, 10
Matsumoto's theorem, 216
minimal parabolic subgroup, 246, 248, 251
model of a representation, 464, 467
module, 11
module of invariants, 76
monatomic representation, 498
monomial matrix, 468
morphism of covering maps, 83
multinomial coefficient, 426
multiplicity
  weight, 177
multiplicity-free representation, 239, 420, 461, 462
Murnaghan-Nakayama rule, 424
negative root, 157
nilpotent Lie algebra, 228
no small subgroups, 26
noneuclidean geometry, 257
normalized induction, 490
observable, 414
octonion algebra, 276
octonions, 276, 313
one-parameter subgroup, 33
open Weyl chamber, 163
operator
  Demazure, 219
  divided difference, 218
operator norm, 19
ordered partition, 492
orientation, 107
orthogonal group, 32
orthogonal representation, 445, 446
oscillator representation, 333
outer form, 263
parabolic induction, 485, 486
parabolic subgroup, 232, 248, 270, 304, 473
  minimal, 246, 248, 251
  standard, 232, 472, 473
partial order on root space, 148
partition, 359
  conjugate, 359
  even, 449
  length, 359
path, 81
  arclength, 110
  concatenation of, 81
  reparametrization, 81
  reversal of, 82
  trivial, 81
  well-paced, 110
path lifting property, 83
path of shortest length, 111
path of stationary length, 111
path-connected space, 81
path-homotopy, 81
Peirce decomposition, 261
permutation matrix, 369
Peter-Weyl theorem, 8, 25–27, 182
Pieri's formula, 422, 423
Plancherel formula, 7, 8
plethysm, 353
Poincaré–Birkhoff–Witt theorem, 62, 236
pointed covering map, 83
pointed topological space, 81
polarization, 381
polynomial character, 458
Pontriagin duality, 7
positive root, 147, 157
positive Weyl chamber, 148, 163
power-sum symmetric polynomial, 352
preatlas, 39
probability measure, 408
quadratic space, 32, 36
quantum group, 435, 480
quasicharacter, 4
  modular, 4
  unitary, 4
quasisplit group, 294
quaternionic representation, 445, 446
raising operator, 449
random matrix theory, 413
rank
  real, 265
  semisimple, 130
rank of a Lie group, 130
rational character, 103, 285
rational equivalence of cycles, 519
real form, 209
real representation, 445, 446
recording tableau, 432
reduced decomposition, 213
reduced norm, 281
reduced word, 213
reducible root system, 152, 222
reductive group, 281, 303
reflection, 129
regular element, 142, 299, 483
regular embedding, 304
regular function, 405
regular measure, 3
regular semisimple element, 501
relative root system, 166, 281, 282
relative Weyl group, 281, 290
reparametrization of a path, 81
representation, 7
  algebraic, 209, 349
  contragredient, 10, 445
  cuspidal, 485
  discrete series, 485
  irreducible, 62
  Lie algebra, 53
  orthogonal, 445, 446
  quaternionic, 445, 446
  real, 445, 446
  symplectic, 445, 446
  trivial, 14
  unitary, 27
reproducing kernel, 396
restricted root system, 281, 282
ribbon, 393
Riemann zeta function, 416
Riemannian manifold, 109
Riemannian structure, 109
Robinson-Schensted-Knuth algorithm, 429
root, 131, 245
  affine, 307
  positive, 147, 157
  simple, 166
  simple positive, 157
root datum, 129
root folding, 310
root system, 129
  absolute, 281, 282
  reducible, 152, 222
  relative, 281, 282
RSK, 429
Schensted insertion, 431
Schubert cell, 521
Schubert polynomial, 525
Schur functions
  supersymmetric, 406
Schur orthogonality, 12, 13, 15
Schur polynomial, 365, 379
Schur's lemma, 11
Schur-Weyl duality, viii
see-saw, 402
Selberg Integral, 415
self-adjoint, 19
semisimple, 177
semisimple case, 166
semisimple element, 483, 501
semisimple Lie group, 145
semisimple rank, 130, 145
semistandard Young tableau, 429
Siegel domain
  Type I, 278
  Type II, 278
Siegel parabolic subgroup, 270
Siegel space, 266
Siegel upper half-space, 266
simple Lie algebra, 64
simple positive root, 166
simple reflection, 157, 244
simple root, 157
simply-connected, 49
  topological space, 82
simply-laced Dynkin diagram, 223
singular element, 142, 299
skew partition, 423
skew shape, 424
smooth manifold, 39
smooth map, 31, 39
smooth premanifold, 39
solvable Lie algebra, 229
  Lie's theorem, 230
special linear group, 32
special orthogonal group, 32
special unitary group, 32
spherical function, 66
spherical variety, 239
spin group, 91, 319
spin representation, 319, 323
split group, 294
standard Borel subgroup, 232
standard parabolic subgroup, 304, 472, 473, 492
standard representation, 170
standard tableau, 421
stationary length, 111
Steinberg character, 476
Stone-von Neumann theorem, 334, 347
strip
  horizontal, 423
  vertical, 423
submanifold, 31
subpermutation matrix, 464
summation convention, 110
supersymmetric Cauchy identity, 406
supersymmetric Schur functions, 406
support, 179
support of a permutation, 387
support of an intertwining operator, 343
symmetric algebra, 60
symmetric power, 59
symmetric space, 257
  boundary, 269
  dual, 258
  Hermitian, 266
  irreducible, 261
  reducible, 261
  type I, 262
  type II, 261
  type III, 262
  type IV, 261
symplectic Clifford algebra, 333
symplectic group, 32
symplectic representation, 445, 446
tableau, 421
  standard, 421
tangent bundle, 93
tangent space, 41
tangent vector, 41
tensor product, 57
terminal object, 57
Tits' system, 243, 244
Toeplitz matrix, 437
topological generator, 104
torus, 102, 501
  anisotropic, 511
  compact, 102
  complex, 103
totally disconnected group, 26
trace bilinear form, 63
triality, 310
trivial path, 81
trivial representation, 14
tube domain, 267
twisted Frobenius-Schur indicator, 452
Type I symmetric spaces, 262
Type II symmetric spaces, 261
Type III symmetric spaces, 262
Type IV symmetric spaces, 261
type of conjugacy class, 503
unimodular group, 3
unipotent character, 476
unipotent matrix, 227
unipotent radical, 316
unipotent subgroup, 316
unit of an algebra, 374
Unitarian Trick, 98
unitary group, 32
unitary representation, 8, 27
universal cover, 84
universal property, 57, 58
vector
  Weyl, 147
vector field, 42
  left invariant, 46
  subordinate to a family, 93
vertical strip, 423
virtual character, 15
weak convergence of measures, 408
weight, 130, 131, 165, 169, 177
  dominant, 165, 384
  fundamental dominant, 147, 166, 169
  half-integral, 323
  integral, 323
weight diagram, 171
weight lattice, 130, 172
weight multiplicity, 177
weight space, 72
Weil representation, 333, 401
well-paced, 110
Weyl algebra, 333
Weyl chamber, 163
  positive, 148
Weyl character formula, 179
Weyl dimension formula, 183
Weyl group, 106
  affine, 195, 221
  relative, 281, 290
Weyl integration formula, 123
Weyl vector, 147, 165
word, 213, 430
  reduced, 213
Young diagram, 359
Young tableau, 421
  semistandard, 429