Representation Theory
E. Kowalski
CHAPTER 1
These notes are intended to provide a basic introduction to (some of) the fundamental
ideas and results of representation theory of groups. In this first preliminary chapter, we
start with some motivating remarks and provide a general overview of the rest of the
text; we also include some notes on the prerequisite knowledge – which are not uniform
for all parts of the notes – and discuss the basic notation that we use.
In writing this text, the objective has never been to give the shortest or slickest proof.
To the extent that the author’s knowledge makes this possible, the goal is rather to
explain the ideas and the mechanism of thought that can lead to an understanding of
“why” something is true, and not simply to the quickest line-by-line check that it holds.
The point of view is that representation theory is a fundamental theory, both for
its own sake and as a tool in many other fields of mathematics; the more one knows,
understands and breathes representation theory, the better. This style (or its most ideal
potential form) is perhaps best summarized by P. Sarnak’s advice in the Princeton Com-
panion to Mathematics [17, p. 1008]:
One of the troubles with recent accounts of certain topics is that they
can become too slick. As each new author finds cleverer proofs or treat-
ments of a theory, the treatment evolves toward the one that contains
the “shortest proofs.” Unfortunately, these are often in a form that
causes the new student to ponder, “How did anyone think of this?”
By going back to the original sources one can usually see the subject
evolving naturally and understand how it has reached its modern form.
(There will remain those unexpected and brilliant steps at which one
can only marvel at the genius of the inventor, but there are far fewer
of these than you might think.) As an example, I usually recommend
reading Weyl’s original papers on the representation theory of compact
Lie groups and the derivation of his character formula, alongside one of
the many modern treatments.
So the text sometimes gives two proofs of the same result, even in cases where the
arguments are fairly closely related; one may be easy to motivate (“how would one try to
prove such a thing?”), while the other may recover the result by a slicker exploitation of
the formalism of representation theory. To give an example, we first consider Burnside’s
irreducibility criterion, and its developments, using an argument roughly similar to the
original one, before showing how Frobenius reciprocity leads to a quicker line of reasoning
(see Sections 2.7.3 and 2.7.4).
Acknowledgments. The notes were prepared in parallel with the course “Represen-
tation Theory” that I taught at ETH Zürich during the Spring Semester 2011. Thanks
are obviously due to all the students who attended the course for their remarks and inter-
est, in particular M. Lüthy, M. Rüst, I. Schwabacher, M. Scheuss, and M. Tornier, and to
the assistants in charge of the exercise sessions, in particular J. Ditchen who coordinated
those. Thanks also to “Anonymous Rex” for a comment on a blog post, to U. Schapira for
his comments and questions during the class, and to A. Venkatesh for showing me his
own notes for a (more advanced) representation theory class, from which I derived much
insight.
1.1. Presentation
A (linear) representation of a group G is, to begin with, simply a homomorphism
% : G −→ GL(E)
where E is a vector space over some field k and GL(E) is the group of invertible k-linear
maps on E. Thus one can guess that this should be a useful notion by noting how it
involves the simplest and most ubiquitous algebraic structure, that of a group, with the
powerful and flexible tools of linear algebra. Or, in other words, such a map attempts to
“represent” the elements of G as symmetries of the vector space E (note that % might
fail to be injective, so that G is not mapped to an isomorphic group).
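As a toy numerical illustration (ours, not from the text; all helper names are hypothetical), one can take the permutation representation of the symmetric group S3, where a permutation acts on k³ by permuting basis vectors, and verify the defining homomorphism property %(gh) = %(g)%(h) directly:

```python
# A minimal sketch: the permutation representation of S_3 sends a
# permutation p to the 0/1 matrix M(p) with M(p) e_j = e_{p(j)};
# we check the homomorphism property %(gh) = %(g)%(h) on one pair.

def perm_matrix(p):
    # entry (i, j) is 1 exactly when p(j) = i
    n = len(p)
    return tuple(tuple(1 if p[j] == i else 0 for j in range(n)) for i in range(n))

def mat_mul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def compose(p, q):
    # (p o q)(i) = p(q(i))
    return tuple(p[i] for i in q)

g = (1, 2, 0)   # the 3-cycle 0 -> 1 -> 2 -> 0
h = (1, 0, 2)   # the transposition exchanging 0 and 1
assert perm_matrix(compose(g, h)) == mat_mul(perm_matrix(g), perm_matrix(h))
```

This representation is not injective on nothing here, but for a quotient action it easily can fail to be, which is exactly the point made above about G not being mapped to an isomorphic group.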
But even a first guess would probably not lead one to imagine how widespread and influential the concepts of representation theory turn out to be in current mathematics. Few
fields of pure mathematics, or of mathematical physics (or chemistry), do not make use
of these ideas, and many depend on representations in an essential way. We will try to
illustrate this wide influence with examples, taken in particular from number theory and
from basic quantum mechanics; already in Section 1.2 below we state four results, where
representation theory does not appear in the statements although it is a fundamental
tool in the proofs. Moreover, it should be said that representation theory is now a field
of mathematics in its own right, which can be pursued without having immediate appli-
cations in mind; it does not require external influences to expand with new questions,
results and concepts – but we will barely scratch such aspects, by lack of knowledge.
The next chapter starts by presenting the fundamental vocabulary that is the foundation of representation theory, and by illustrating it with examples. It also contains
a number of short remarks concerning variants of the definition of a representation %:
restrictions can be imposed on the group G, on the type of fields or vector spaces E al-
lowed, or additional regularity assumptions may be imposed on % when this makes sense;
or the target groups GL(E) may be replaced by suitable subgroups, or even by other
classes of basic groups, such as symmetric groups. Many of these variants are important
topics in their own right, but most of them won’t reappear in the rest of these notes –
lack of space, if not also of expertise, is to blame.
Chapter 4 is an introduction to the simplest case of representation theory: the linear
representations of finite groups in finite-dimensional complex vector spaces. This is also
historically the first case that was studied in depth by Dirichlet (for finite abelian groups),
then Frobenius, Schur, Burnside, and many others. It is a beautiful theory, and has many
important applications. It also serves as a “blueprint” for many generalizations, as we will
see in the rest of the notes: various facts, which are extremely elementary for finite groups,
remain valid, when properly framed, for important classes of infinite groups.
Among these, the compact topological groups are undoubtedly those closest to finite
groups, and we consider them first abstractly in the following chapter. Then another
chapter presents some concrete examples of compact Lie groups (compact matrix groups,
such as unitary groups U(n, C)), with some applications – the most important being
probably how representation theory explains a lot about the behavior of the most basic
atom, hydrogen, in the real world...
The final chapter is only an introduction to the next class of infinite groups, the
locally compact, but not compact, groups, through the fundamental examples of
the abelian group R and the group SL2 (R) of two-by-two real matrices with determinant
1. We use these examples primarily to illustrate some of the striking new phenomena that
arise when compactness is lost. Here, although we cannot give complete proofs, we also
have an important concrete application in mind: the use of a famous theorem of Selberg
to construct so-called expander graphs.
In an Appendix, we have gathered statements and sketches of proofs for certain facts,
especially the Spectral Theorem for bounded or unbounded self-adjoint linear operators,
which are needed for rigorous treatments of compact and locally compact groups. We
have also included a short section illustrating concrete computations of representation
theory, using such software packages as Magma or GAP.
Throughout, we also present some examples by means of exercises. These are usually
not particularly difficult, but we hope they will help the reader to get acquainted with
the way of thinking that representation theory often suggests for certain problems...
1237, 2237, 5237, 7237, 8237, 19237, 25237, 26237, 31237, 32237,
38237, 40237, 43237, 46237, 47237, 52237, 56237, 58237, 64237,
70237, 71237, 73237, 77237, 82237, 85237, 88237, 89237, 91237, 92237
to be those prime numbers ending with 237 which are ≤ 100000.
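The list can be reproduced by a direct computation; the following short sieve (our own illustration, not part of the text) recovers exactly the primes below the bound ending with the given digits:

```python
# enumerate the primes up to `bound` whose decimal expansion ends with
# the digits of `suffix`, using a basic sieve of Eratosthenes
def primes_ending_in(suffix, bound):
    sieve = bytearray([1]) * (bound + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(bound ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, bound + 1, i)))
    mod = 10 ** len(str(suffix))
    return [p for p in range(2, bound + 1) if sieve[p] and p % mod == suffix]

ps = primes_ending_in(237, 100000)
print(len(ps), ps[0], ps[-1])  # should reproduce the 29 primes listed above
```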
We will present the idea of the proof of this theorem in Chapter 4. As we will see,
a crucial ingredient (but not the only one) is the simplest type of representation theory:
that of groups which are both finite and commutative. In some sense, there is no better
example to guess the power of representation theory than to see how even the simplest
instance leads to such remarkable results.
Example 1.2.3 (The hydrogen atom). According to current knowledge, about 75%
of the observable weight of the universe is accounted for by hydrogen atoms. In quantum
mechanics, the possible states of an (isolated) hydrogen atom are described in terms of
combinations of “pure” states, and the latter are determined by discrete data, tradition-
ally called “quantum numbers” – so that the possible energy values of the system, for
instance, form a discrete set of numbers, rather than a continuous interval.
Precisely, in the non-relativistic theory, there are four quantum numbers for a given
pure state, denoted (n, `, m, s) – “principal”, “angular momentum”, “magnetic” and
“spin” are their usual names – which are all integers, except for s, with the restrictions
n ≥ 1,    0 ≤ ℓ ≤ n − 1,    −ℓ ≤ m ≤ ℓ,    s ∈ {±1/2}.
It is rather striking that this simplest quantum-mechanical model of the hydrogen
atom can be “explained” qualitatively by an analysis of the representation theory of
the underlying symmetry group (see [44] or [38]), leading in particular to a natural
explanation of the intricate structure of these four quantum numbers!
We will attempt to explain this in Section 6.4.
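As a concrete illustration (ours, not from the text), one can simply enumerate the quadruples allowed by these restrictions: for each principal quantum number n there are n² pairs (ℓ, m) and two spin values, hence 2n² pure states – the degeneracy pattern that the representation-theoretic analysis explains:

```python
# enumerate the quadruples (n, l, m, s) allowed by the restrictions
# n >= 1, 0 <= l <= n - 1, -l <= m <= l, s in {-1/2, +1/2}
def states(n):
    return [(n, l, m, s)
            for l in range(n)
            for m in range(-l, l + 1)
            for s in (-0.5, 0.5)]

for n in (1, 2, 3):
    print(n, len(states(n)))  # the count is 2 * n**2: 2, 8, 18
```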
Example 1.2.4 (“Word” problems). For a prime number p, consider the finite group
SL2 (Fp ) of 2 × 2 matrices of determinant 1 with coefficients in the finite field Fp = Z/pZ.
This group is generated by the two elements
(1.1)        s1 = [ 1  1 ],    s2 = [ 1  0 ],
                  [ 0  1 ]          [ 1  1 ]
(this is a fairly easy fact from elementary group theory, see, e.g., [33, Th. 8.8] for
K = Fp , noting that s1 itself generates the set of all upper-triangular transvections and
s2 the lower-triangular ones). Certainly the group is also generated by the elements of the
set S = {s1, s1⁻¹, s2, s2⁻¹}, and in particular, for any g ∈ SL2 (Fp ), there exists an integer
k ≥ 1 and elements s1 , . . . , sk , each of which belongs to S, such that
g = s1 · · · sk .
One may ask: how large can k be, at worst? The following result gives an answer:
Theorem 1.2.5 (Selberg, Brooks, Burger). There exists a constant C > 0, indepen-
dent of p and g, such that, with notation as above, we have
k ≤ C log p
in all cases.
All proofs of this result depend crucially on ideas of representation theory, among
other tools. And while it may seem to be rather simple and not particularly worth
notice, the following open question should suggest that there is something very subtle
here:
Question. Find an algorithm that, given p and g ∈ SL2 (Fp ), explicitly gives k ≤
C log p and a sequence (g1 , . . . , gk ) of elements of S such that
g = g1 · · · gk .
For instance, how would you do this for
g = [ 1  (p − 1)/2 ]
    [ 0      1     ]
(for p > 3)? Of course, one can take k = (p − 1)/2 and gi = s1 for all i, but when p is
large, this is much larger than what the theorem claims to be possible!
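For a small prime one can make all of this concrete by a breadth-first search over SL2(Fp) (a sketch of ours, not from the text; the choice p = 13 is arbitrary). It confirms that S generates the group and computes the true word length of every element, in particular of the matrix g above:

```python
from collections import deque

P = 13                                   # an arbitrary small prime

def mmul(a, b):
    # product of 2x2 matrices stored as (a11, a12, a21, a22), entries mod P
    return ((a[0] * b[0] + a[1] * b[2]) % P, (a[0] * b[1] + a[1] * b[3]) % P,
            (a[2] * b[0] + a[3] * b[2]) % P, (a[2] * b[1] + a[3] * b[3]) % P)

gens = [(1, 1, 0, 1), (1, P - 1, 0, 1),  # s1 and its inverse
        (1, 0, 1, 1), (1, 0, P - 1, 1)]  # s2 and its inverse

# breadth-first search from the identity computes every word length
dist = {(1, 0, 0, 1): 0}
queue = deque(dist)
while queue:
    g = queue.popleft()
    for s in gens:
        h = mmul(g, s)
        if h not in dist:
            dist[h] = dist[g] + 1
            queue.append(h)

assert len(dist) == P * (P * P - 1)      # s1, s2 generate all of SL2(Fp)
g = (1, (P - 1) // 2, 0, 1)              # the matrix g from the question
print(max(dist.values()), dist[g])       # the diameter, and the word length of g
```

Of course this brute-force search visits all p(p² − 1) elements, so it says nothing about the algorithmic question above for large p; it only illustrates what the theorem asserts.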
We will not prove Theorem 1.2.5, nor really say much more about the known proofs.
However, in Section 4.7.1, we present more elementary results of Gowers [16] (and
Nikolov–Pyber [30]) which are much in the same spirit, and use the same crucial in-
gredient concerning representations of SL2 (Fp ).
In these first three examples, it turns out that representation theory appears in a
similar manner: it is used to analyze functions on a group, in a way which is close to the
theory of Fourier series or Fourier integrals – indeed, both of these can also be understood
in terms of representation theory, as we will see, for the groups (R/Z) and R, respectively.
The next motivating example is purely algebraic:
Example 1.2.6 (Burnside’s theorem on finite groups). Recall that if G is a finite
group, it is called solvable if there are normal subgroups
1 ⊳ Gk ⊳ Gk−1 ⊳ · · · ⊳ G1 ⊳ G0 = G
where each successive quotient Gi /Gi+1 is an abelian group.
Theorem 1.2.7 (Burnside). Let G be a finite group. If the order of G is divisible by
at most two distinct prime numbers, then G is solvable.
This beautiful result is sharp in some sense: it is well-known that the group S5 , as well
as its subgroup A5 of index two, is not solvable, and since its order 120 is divisible only
by the primes 2, 3 and 5, we see that the analogous statement with two prime factors
replaced with three is not true. (Also it is clear that the converse is not true either: any abelian group is
solvable, and there are such groups of any order.)
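Solvability can be watched concretely by computing a derived series; the sketch below (our own illustration) does this for S4, whose order 24 = 2³ · 3 is divisible by exactly two primes, so Burnside's theorem applies. The commutator subgroups shrink from S4 through A4 and the Klein four-group down to the trivial group:

```python
# compute the derived series of S4 (permutations of {0,1,2,3} as tuples);
# reaching the trivial group certifies that S4 is solvable
def compose(p, q):
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def closure(gens):
    # subgroup generated by `gens`, built by repeated right multiplication
    ident = tuple(range(4))
    group, frontier = {ident}, {ident}
    while frontier:
        frontier = {compose(g, s) for g in frontier for s in gens} - group
        group |= frontier
    return group

def derived(group):
    # subgroup generated by all commutators a b a^{-1} b^{-1}
    return closure({compose(compose(a, b), compose(inverse(a), inverse(b)))
                    for a in group for b in group})

s4 = closure({(1, 0, 2, 3), (1, 2, 3, 0)})   # a transposition and a 4-cycle
series = [s4]
while len(series[-1]) > 1:
    series.append(derived(series[-1]))
print([len(g) for g in series])              # [24, 12, 4, 1]: S4, A4, V4, {1}
```

Each step of the series is a normal subgroup with abelian quotient, which is exactly the chain required by the definition above.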
This theorem of Burnside will be proved using representation theory of finite groups
in Section 4.7.2 of Chapter 4, in much the same way as Burnside proceeded in the early
20th century. It was only in the late 1960s that a purely algebraic proof – not using
representation theory – was constructed, first by Goldschmidt when the two primes p and q
dividing the order of G are odd, and then by Bender and Matsuyama independently for the general case. There is
a full account of this in [20, §7D], and although it is not altogether overwhelming in length,
the reader who compares will probably agree that the proof based on representation theory
is significantly easier to digest...
Remark 1.2.8. There are even more striking results, which are much more difficult;
for instance, the famous “odd-order theorem” of Feit and Thompson states that if G has
odd order, then G is necessarily solvable.
(called the norm from k to Fp ) is surjective (e.g., because the kernel has at most
1 + p + p^2 + · · · + p^(n−1) = (p^n − 1)/(p − 1) elements, so the image has at least
p − 1 elements). Moreover, the kernel of the norm is the set of all x which can
be written as y/y^p for some y ∈ k× .
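For instance (an illustration of ours, with p = 5 and n = 2): realizing F25 as F5(α) with α² = 3, a non-square mod 5, the norm of x = a + bα is x · x⁵ = a² − 3b², and one checks directly that it is surjective onto F5× with kernel of order (p² − 1)/(p − 1) = 6:

```python
# the norm map from F_25 = F_5(alpha), alpha^2 = 3, down to F_5:
# N(a + b*alpha) = (a + b*alpha)(a - b*alpha) = a^2 - 3*b^2  (mod 5)
p = 5

def norm(a, b):
    return (a * a - 3 * b * b) % p

units = [(a, b) for a in range(p) for b in range(p) if (a, b) != (0, 0)]
image = {norm(a, b) for a, b in units}
kernel = [(a, b) for a, b in units if norm(a, b) == 1]
print(sorted(image), len(kernel))  # [1, 2, 3, 4] and 6 = (p**2 - 1)//(p - 1)
```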
When considering a normed vector space E, we usually denote the norm by ‖v‖, and
sometimes write ‖v‖E , when more than one space (or norm) is considered simultane-
ously. When considering a Hilbert space H, we speak synonymously of inner product or
positive-definite hermitian form, denoted ⟨·, ·⟩ or ⟨·, ·⟩H if more than one space might be
understood. We use the convention that a hermitian form is linear in the first variable,
and conjugate-linear in the other, i.e., we have
⟨αv, w⟩ = α⟨v, w⟩,    ⟨v, αw⟩ = ᾱ⟨v, w⟩,
for two vectors v, w and a scalar α ∈ C.
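In complex arithmetic this convention reads as follows (a trivial check of ours, not from the text):

```python
# the standard hermitian form on C^2 with the convention above:
# linear in the first variable, conjugate-linear in the second
def inner(v, w):
    return sum(x * y.conjugate() for x, y in zip(v, w))

v, w, alpha = [1 + 2j, 3j], [2 - 1j, 1 + 1j], 2 + 5j
assert inner([alpha * x for x in v], w) == alpha * inner(v, w)
assert inner(v, [alpha * x for x in w]) == alpha.conjugate() * inner(v, w)
assert inner(w, v) == inner(v, w).conjugate()   # hermitian symmetry
```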
Concerning topology and measure theory, we recall the following definitions and facts.
First, we always consider Hausdorff topological spaces, even if not explicitly mentioned.
A Borel measure on a topological space X is a measure defined on the σ-algebra of Borel
sets. A Radon measure is a Borel measure which is finite on compact subsets of X. The
integral of a non-negative measurable function f , or of an integrable function f , with
respect to µ, is denoted by either of the following
∫_X f (x) dµ(x) = ∫_X f dµ.
CHAPTER 2
for some J ⊂ I. This is false even for G trivial, where the only irreducible representation
is the trivial one, and writing a decomposition of E amounts to choosing a basis. Then
there are usually many subspaces of E which are not literally direct sums of a subset of
the basis directions (e.g., E = k ⊕ k and F = {(x, x) | x ∈ k}).
We will deduce the lemma from the following more abstract criterion for semisim-
plicity, which is interesting in its own right – it gives a useful property of semisimple
representations, and it is sometimes easier to check because it does not mention irre-
ducible representations.
Lemma 2.2.8 (Semisimplicity criterion). Let G be a group and let k be a field. A
k-representation
% : G −→ GL(E)
of G is semisimple if and only if, for any subrepresentation F ⊂ E of %, there exists a
complementary subrepresentation, i.e., a G-stable subspace F̃ ⊂ E such that
E = F ⊕ F̃ .
It is useful to give a name to the second property: one says that a representation % is
completely reducible if, for any subrepresentation %1 of %, one can find a complementary
one %2 with
% = %1 ⊕ %2 .
Proof of Lemma 2.2.7. Let % act on E, and let F ⊂ E be a subrepresentation.
We are going to check that the condition of Lemma 2.2.8 applies to F .5 Thus let G ⊂ F
be a subrepresentation of F ; it is also one of E, hence there exists a subrepresentation
G̃ ⊂ E such that
E = G ⊕ G̃.
5 I.e., a subrepresentation of a completely reducible one is itself completely reducible.
Now we claim that F = G⊕(F ∩ G̃), which shows that G has also a stable complement
in F – and finishes the proof that F is semisimple. Indeed, G and (F ∩ G̃) are certainly
in direct sum, and if v ∈ F and we write v = v1 + v2 with v1 ∈ G, v2 ∈ G̃, we also obtain
v2 = v − v1 ∈ F ∩ G̃,
because v1 is also in F . The case of a quotient representation is quite similar and is left
to the reader to puzzle...
Proof of Lemma 2.2.8. Neither direction of the equivalence is quite obvious. We
start with a semisimple representation %, acting on E, written as a direct sum
E = ⊕_{i∈I} Ei
From elementary multilinear algebra, we recall that if E is finite-dimensional, the
symmetric algebra is infinite-dimensional, but the alternating algebra is not – indeed,
⋀^m E = 0 if m > dim E. More generally, the dimensions of the symmetric and alternating
powers are given by the binomial coefficients
dim Sym^m (E) = (dim(E) + m − 1 choose m),    dim ⋀^m E = (dim E choose m).
For instance, if n = dim(E), we have
dim Sym^2 (E) = n(n + 1)/2,    dim ⋀^2 E = n(n − 1)/2.
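These formulas just count basis monomials: a basis of the m-th symmetric power is indexed by multisets of size m of basis indices, and a basis of the m-th alternating power by strictly increasing m-tuples. A quick cross-check (our own illustration):

```python
from itertools import combinations, combinations_with_replacement
from math import comb

def power_dims(n, m):
    # count multisets (symmetric power) and subsets (alternating power)
    # of size m drawn from n basis indices
    sym = sum(1 for _ in combinations_with_replacement(range(n), m))
    alt = sum(1 for _ in combinations(range(n), m))
    return sym, alt

n = 4
assert power_dims(n, 2) == (comb(n + 1, 2), comb(n, 2))
print(power_dims(n, 2))  # (10, 6), i.e. n(n+1)/2 and n(n-1)/2
```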
Remark 2.2.11 (Symmetric powers as coinvariants). Let E be a k-vector space. For
any m ≥ 1, there is a natural representation of the symmetric group Sm on the tensor
power
E ⊗m = E ⊗ · · · ⊗ E
(with m factors), which is induced by the permutation of the factors, i.e., we have
σ · (v1 ⊗ · · · ⊗ vm ) = vσ(1) ⊗ · · · ⊗ vσ(m) .
The classical definition of the m-th symmetric power is
Symm (E) = E ⊗m /F
where F is the subspace generated by all vectors of the type
(v1 ⊗ · · · ⊗ vm ) − (vσ(1) ⊗ · · · ⊗ vσ(m) ) = (v1 ⊗ · · · ⊗ vm ) − σ · (v1 ⊗ · · · ⊗ vm )
where vi ∈ E and σ ∈ Sm . In other words, in the terminology and notation of Sec-
tion 2.2.2, we have
Sym^m (E) = (E^⊗m)_{Sm} ,
the space of coinvariants of E ⊗m under this action of the symmetric group.
2.2.6. Contragredient, endomorphism spaces. Let % be a k-representation of
G, acting on the vector space E. Using the transpose operation, we can then define a
representation on the dual space E′ = Homk (E, k), which is called the contragredient %̃
of %. More precisely, since the transpose reverses products,6 the contragredient is defined
by the rule
⟨g · λ, v⟩ = ⟨λ, g⁻¹ · v⟩,
for g ∈ G, λ ∈ E′ and v ∈ E, using duality-bracket notation, or in other words the linear
form %̃(g)λ is the linear form
v ↦ λ(%(g⁻¹)v).
Remark 2.2.12. One way to remember this is to write the definition in the form of
the equivalent invariance formula
(2.13)    ⟨g · λ, g · v⟩ = ⟨λ, v⟩
for all λ ∈ E′ and v ∈ E.
6 Equivalently, in the language of Proposition 2.2.1, the assignment T (E) = E′ “reverses” arrows in
contrast with (2.8), i.e., T (f ◦ g) = T (g) ◦ T (f ), with T (f ) the transpose.
We check explicitly that the contragredient is a representation, to see that the inverse
(which also reverses products) compensates the effect of the transpose:
⟨gh · λ, v⟩ = ⟨λ, (gh)⁻¹ · v⟩ = ⟨λ, h⁻¹ g⁻¹ · v⟩ = ⟨h · λ, g⁻¹ · v⟩ = ⟨g · (h · λ), v⟩
for all g, h ∈ G and v ∈ E.
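In matrix terms (a sketch of ours, not from the text): if %(g) is given by a matrix, then in the dual basis %̃(g) is the inverse-transpose %(g⁻¹)ᵀ, and the computation above is precisely the fact that g ↦ %(g⁻¹)ᵀ is again a homomorphism. For 2 × 2 integer matrices of determinant 1 this can be tested exactly:

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    # inverse of a 2x2 integer matrix of determinant 1
    (a, b), (c, d) = A
    return [[d, -b], [-c, a]]

def contragredient(A):
    # inverse-transpose: the matrix of the dual action in the dual basis
    return [list(row) for row in zip(*inv2(A))]

g, h = [[1, 1], [0, 1]], [[1, 0], [1, 1]]
# the inverse compensates for the transpose reversing products:
assert contragredient(mat_mul(g, h)) == mat_mul(contragredient(g), contragredient(h))

# the invariance formula (2.13): <g.lam, g.v> = <lam, v>
def apply(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

lam, v = [2, 3], [5, 7]
assert sum(a * b for a, b in zip(apply(contragredient(g), lam), apply(g, v))) \
       == sum(a * b for a, b in zip(lam, v))
```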
The following lemma shows how the contragredient interacts with some of the other
operations previously discussed:
Lemma 2.2.13. Let k be a field, G a group.
(1) For any k-representations (%i ) of G, the contragredient of the direct sum satisfies
a canonical isomorphism
(⊕_i %i)~ ' ⊕_i %̃i .
given by mapping a family (vj ) of vectors in F to the unique linear map such that
A(wj ) = vj .
This is a k-linear isomorphism, since (wj ) is a basis, and it is very simple to check
that it is an intertwiner.
for this induced representation, but as for restriction, we will use it in the general case
(keeping in mind the ambiguity that comes from not indicating explicitly φ). One may
even drop H and G from the notation when they are clear from the context.
Remark 2.3.3. If we take h ∈ ker(φ), the transformation formula in (2.21) for ele-
ments of F gives
ϕ(x) = %(h)ϕ(x)
so that, in fact, any function ϕ ∈ F takes values in the space E^ker(φ) of invariants of
E under the action of the subgroup ker(φ) through %. However, we do not need to
state it explicitly in the definition of F , and this avoids complicating the notation. It
will reappear in the computation of the dimension of F (Proposition 2.3.8 below). Of
course, when φ is an inclusion (the most important case), the target space is genuinely
E anyway. It is worth observing, however, that as a consequence of Lemma 2.1.9, this
subspace E^ker(φ) is in fact a subrepresentation of E, so that in the condition
ϕ(φ(h)x) = %(h)ϕ(x),
the right-hand side also is always in E^ker(φ) .
Example 2.3.4 (Elementary examples of induction). (1) By definition of F , and
comparison with the definition of the regular representation, we see that the latter can
be expressed as
(2.22)    Ck (G) = Ind_1^G (1),
with the action of G/K induced by % (note that by Lemma 2.1.9, the subspace E^K is a
subrepresentation of E). This isomorphism is again given by ϕ ↦ ϕ(1), which – as we
have remarked – is a vector in E^ker(φ) = E^K . The reader should again check that this is
an isomorphism.
(4) Suppose now that φ : G → G is an automorphism. Then, for a representation
% of the “source” G, acting on E, the induced representation Ind_G^G (%) is not in general
isomorphic to %; rather it is isomorphic to
φ∗ % = % ◦ φ−1 : G −→ GL(E).
Indeed, the k-linear isomorphism
Φ : F −→ E,    ϕ ↦ ϕ(1)
satisfies
Φ(reg(g)ϕ) = ϕ(g) = ϕ(φ(φ⁻¹ (g))) = %(φ⁻¹ (g))ϕ(1) = φ∗ %(g)Φ(ϕ),
i.e., it intertwines the induced representation with the representation %◦φ−1 . Incidentally,
using again φ and seeing % as a representation of the target G, one has of course
Res_G^G (%) = φ∗ (%) = % ◦ φ.
Although this looks like a quick way to produce many “new” representations from
one, it is not so efficient in practice because if φ is an inner automorphism (i.e., if
φ(g) = xgx⁻¹ for some fixed x ∈ G), we do have φ∗ % ' %: by definition, the linear
isomorphism Φ = %(x) satisfies
Φ ◦ φ∗ %(g) = %(x)%(x⁻¹ gx) = %(g)Φ
for all g ∈ G, and therefore it is an isomorphism φ∗ % −→ %.
(5) Finally, one can see from the above how to essentially reduce a general induction to
one computed using an inclusion homomorphism. Indeed, we have always an isomorphism
Ind_H^G (%) ' Ind_{Im(φ)}^G (φ∗ (%^ker(φ) ))
where the right-hand side is computed using the inclusion homomorphism Im(φ) ⊂ G.
This isomorphism is a combination of the previous cases using the factorization
H −→ H/ ker(φ) ' Im(φ) ,→ G,
where the first map is a quotient map, the second the isomorphism induced by φ, and
the third an injection. (This is also a special case of “induction in steps”, see Proposi-
tion 2.3.14 below.)
(6) Another important special case of induction occurs when the representation % is
one-dimensional, i.e., is a homomorphism
H −→ k × .
In that case, the space F of Ind_H^G (%) is a subspace of the space Ck (G) of k-valued
functions on G, and since G acts on this space by the regular representation, the induced
representation is a subrepresentation of Ck (G), characterized as those functions which
transform like % under H:
ϕ(φ(h)x) = %(h)ϕ(x)
where now %(h) is just a (non-zero) scalar in k.
This type of example is significant because of the crucial importance of the regular
representation. Indeed, it is often a good strategy to (attempt to) determine the irre-
ducible k-representations of a group by trying to find them as being either induced by
one-dimensional representations of suitable subgroups, or subrepresentations of such in-
duced representations. We will see this in effect in Chapter 4, in the special case of the
groups GL2 (Fq ), where Fq is a finite field.
Remark 2.3.5. Although we have given a specific “model” of the induced represen-
tation by writing down a concrete vector space on which G acts, one should attempt to
think of it in a more abstract way. As we will see in the remainder of the book, many
representations constructed differently – or even “given” by nature – turn out to be iso-
morphic to induced representations, even if the vector space does not look like the one
above.
Note also that we have defined induction purely algebraically. As one may expect, in
cases where G is an infinite topological group, this definition may require some changes
to behave reasonably. The model (2.21) is then a good starting point, since it immediately
suggests considering restricted classes of functions on G instead of all of them (see
Example 5.2.10).
The following properties are the most important facts to remember about induction.
Proposition 2.3.6 (Induced representation). Let k be a field, φ : H → G a group
homomorphism.
(1) For any homomorphism Φ : %1 −→ %2 of k-representations of H, there is a
corresponding homomorphism
Ind(Φ) : Ind_H^G (%1 ) −→ Ind_H^G (%2 ),
and this is “functorial”: the identity maps to the identity and composites map to com-
posites.
(2) For any k-representation %1 of G and %2 of H, there is a natural isomorphism of
k-vector spaces
(2.23)    HomG (%1 , Ind_H^G (%2 )) ' HomH (Res_H^G (%1 ), %2 ),
where we recall that HomG (·, ·) denotes the k-vector space of homomorphisms between two
representations of G.
The last isomorphism is an instance of what is called Frobenius reciprocity, and it is
an extremely useful result. In fact, in some (precise) sense, this formula characterizes
the induced representation, and can be said to more or less define it (see Remark 2.3.15
for an explanation). We will use induction and the Frobenius formula extensively – in
particular in the next chapters – to analyze the decomposition of induced representations.
Another remark is that the definition of the induced representation that we chose is
the best for handling situations where [G : H] can be infinite. If [G : H] is finite, then
another natural (isomorphic) model leads to isomorphisms
(2.24)    HomG (Ind_H^G (%1 ), %2 ) ' HomH (%1 , Res_H^G (%2 )),
and those are sometimes considered to be the incarnation of Frobenius reciprocity (see
Chapter 4 and, e.g., [19, Ch. 5]).
Proof. (1) The induced homomorphism Φ∗ = Ind(Φ) is easy to define using the
model of the induced representation given above: denoting by F1 , F2 the spaces corresponding to Ind_H^G (%1 ) and Ind_H^G (%2 ) respectively, we define Φ∗ (ϕ) for ϕ ∈ F1 to be given
by
Φ∗ (ϕ)(x) = Φ(ϕ(x))
for x ∈ G. This is a function from G to E2 , by definition, and the relation
Φ∗ (ϕ)(φ(h)x) = Φ(ϕ(φ(h)x)) = Φ(%1 (h)ϕ(x)) = %2 (h)Φ(ϕ(x))
for all h ∈ H shows that Φ∗ (ϕ) is in the space F2 of the induced representation of
%2 . We leave it to the reader to check that Φ∗ is indeed a homomorphism between the
representations F1 and F2 .
(2) Here again there is little that is difficult, except maybe a certain bewildering
accumulation of notation, especially parentheses, when checking the details – the reader
should however make sure that these checks are done.
Assume that G acts on the space F1 through %1 , and that H acts on E2 through %2 .
Then the “restriction” of %1 acts on F1 through %1 ◦ φ, while the induced representation
of %2 acts on the space F2 defined as in (2.21).
We will describe how to associate to
Φ : F1 −→ F2 ,
which intertwines %1 and Ind_H^G (%2 ), a map
T (Φ) : F1 −→ E2
intertwining the restriction of %1 and %2 . We will then describe, conversely, how to start
with an intertwiner
Ψ : F1 −→ E2
and construct another one
T̃ (Ψ) : F1 −→ F2 ,
and then it will be seen that T ◦ T̃ and T̃ ◦ T are the identity morphism, so that T and
T̃ give the claimed isomorphisms.
The main point to get from this is that both T and T̃ more or less “write themselves”:
they express the simplest way (except for putting zeros everywhere!) to move between
the desired spaces. One must then check various things (e.g., that functions on G with
values in E2 actually lie in F2 , that the maps are actually intertwiners, that they are
reciprocal), but at least once this is done, it is quite easy to recover the definitions.
To begin, given Φ as above and a vector v ∈ F1 , we must define a map F1 −→ E2 ;
since Φ(v) is in F2 , it is a function on G with values in E2 , hence it seems natural to
evaluate it somewhere, and the most natural guess is to try to evaluate at the identity
element. In other words, we define T (Φ) to be the map
(2.25)    T (Φ) : F1 −→ E2 ,    v ↦ Φ(v)(1).
We can already easily check that Φ̃ = T (Φ) is an H-homomorphism (between the
restriction of %1 and %2 ): indeed, we have
Φ̃(h · v) = Φ̃(φ(h)v) = Φ(φ(h)v)(1) = Φ(v)(φ(h))
where the last equality reflects the fact that Φ intertwines %1 and the induced representa-
tion of %2 , the latter acting like the regular representation on F2 . Now because Φ(v) ∈ F2 ,
we get
Φ(v)(φ(h)) = %2 (h)Φ(v)(1) = %2 (h)Φ̃(v)
which is what we wanted.
In the other direction, given an H-homomorphism
Ψ : F1 → E2 ,
we must construct a map Ψ̃ = T̃ Ψ from F1 to F2 . Given v ∈ F1 , we need to build a
function on G with values in E2 ; the function
(2.26)    x ↦ Ψ(%1 (x)v),
is the most natural that comes to mind, since the values of Ψ are elements of E2 . Thus
Ψ̃(v) is defined to be this function.
We only describe some of the necessary checks required to finish the argument. First,
we check that ϕ = Ψ̃(v) is indeed in F2 : for all x ∈ G and h ∈ H, we have
ϕ(φ(h)x) = Ψ(φ(h)%1 (x)v) = %2 (h)Ψ(%1 (x)v) = %2 (h)ϕ(x)
(using the fact that Ψ is a homomorphism from Res_H^G (%1 ) to %2 ).
Next, Ψ̃ intertwines %1 and Ind_H^G (%2 ): for g ∈ G, the function Ψ̃(%1 (g)v) is
x ↦ Ψ(%1 (xg)v)
and this coincides with
reg(g)Ψ̃(v) = (x ↦ Ψ̃(v)(xg)).
The remaining property we need is that the two constructions are inverse of each
other. If we start with Ψ ∈ HomH (F1 , E2 ), then construct Ψ̃ = T̃ Ψ, the definitions (2.25)
and (2.26) show that
T T̃ Ψ(v) = Ψ̃(v)(1) = Ψ(v)
for all v, i.e., T ◦ T̃ is the identity. If we start with Φ ∈ HomG (F1 , F2 ), define Ψ = T Φ and
Φ̃ = T̃ Ψ = T̃ T Φ, and unravel the definitions again, we obtain the inescapable conclusion
that, given v ∈ F1 , the function Φ̃(v) is given by
(x ↦ Ψ(%1 (x)v) = Φ(%1 (x)v)(1)),
and this function of x does coincide with Φ(v) because
Φ(%1 (x)v) = reg(x)Φ(v) = (y ↦ Φ(v)(yx)).
Thus T̃ ◦ T is also the identity, and the proof is finished.
Example 2.3.7. Let %1 = 1 be the trivial (one-dimensional) representation of G.
Then its restriction to H is the trivial representation 1H of H. By Frobenius reciprocity,
we derive
HomG (1G , Ind(%2 )) ' HomH (1H , %2 ).
Comparing with (2.7), we deduce that there is a (canonical) isomorphism
Ind_H^G (%2 )^G ' (%2 )^H
Proof. The idea is very simple: the definition of the space F on which the induced
representation acts shows that the value of ϕ ∈ F at a point x determines the values at
all other points of the form φ(h)x, i.e., at all points which are in the same left-coset of G
modulo the image of φ. Thus there should be [G : Im(φ)] independent values of ϕ; each
seems to belong to the space E, but as we have observed in Remark 2.3.3, it is in fact
constrained to lie in the possibly smaller space E^ker(φ) .
To check this precisely, we select a set R of representatives of Im(φ)\G, we let F̃
denote the space of all functions
ϕ̃ : R −→ E^ker(φ) ,
and we consider the obvious k-linear map
F −→ F̃
defined by restricting functions on G to R (using Remark 2.3.3 to see that this is
well-defined). Now we claim that this is an isomorphism of vector spaces, and of course
this implies the formula for the dimension of F .
To check the injectivity, we simply observe that if ϕ ∈ F is identically zero on R, we
have
ϕ(φ(h)x) = %(h)ϕ(x) = 0
for all x ∈ R and h ∈ H; since these elements, by definition, cover all of G, we get ϕ = 0
(this is really the content of the observation at the beginning of the proof).
For surjectivity, for any x ∈ G, we denote by r(x) the element of R equivalent to x,
and we select one h(x) ∈ H such that
x = φ(h(x))r(x),
with h(x) = 1 if x ∈ R.
Given an arbitrary function ϕ̃ : R → E ker(φ) , we then define
ϕ(x) = ϕ(φ(h(x))r(x)) = %(h(x))ϕ̃(r(x)),
which is a well-defined E-valued function on G. Thus ϕ is equal to ϕ̃ on R; by definition
of F , this is in fact the only possible definition for such a function, but we must check
that ϕ ∈ F to conclude. Consider x ∈ G and h ∈ H; let y = φ(h)x, so that we have the
two expressions
y = φ(hh(x))r(x), y = φ(h(y))r(y) = φ(h(y))r(x)
since y and x are left-equivalent under Im(φ). It follows that hh(x) and h(y) differ by an
element (say κ) in ker(φ). Thus we get
ϕ(y) = ϕ(φ(h(y))r(x)) = %(h(y))ϕ̃(r(x))
= %(κ)%(hh(x))ϕ̃(r(x))
= %(h)%(h(x))ϕ̃(r(x))
since κ acts trivially on the space E ker(φ) , and (as in Lemma 2.1.9) the vector
%(hh(x))ϕ̃(r(x))
does belong to it. We are now done because
ϕ(φ(h)x) = ϕ(y) = %(h)%(h(x))ϕ̃(r(x)) = %(h)ϕ(x)
finishes the proof that ϕ ∈ F .
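The linear algebra in this proof can be checked numerically: the conditions ϕ(φ(h)x) = %(h)ϕ(x) are linear in ϕ, so the dimension of F is the dimension of the solution space of a linear system. Here is a small sketch (not from the text: the groups Z/6 and Z/4, the homomorphism φ(h) = 3h and the two one-dimensional representations are illustrative choices):

```python
import numpy as np

# Illustrative choices: G = Z/6 (additive), H = Z/4, phi(h) = 3h mod 6,
# a homomorphism with ker(phi) = {0, 2} and Im(phi) = {0, 3}, so
# [G : Im(phi)] = 3.
G, H = 6, 4
phi = lambda h: (3 * h) % G

def induced_dimension(rho):
    """Dimension of {f : Z/6 -> C with f(phi(h) + x) = rho(h) f(x)},
    for a one-dimensional representation rho of H (h -> complex number)."""
    rows = []
    for h in range(H):
        for x in range(G):
            row = np.zeros(G, dtype=complex)
            row[(phi(h) + x) % G] += 1
            row[x] -= rho(h)
            rows.append(row)
    # solution-space dimension = number of unknowns minus the rank
    return G - np.linalg.matrix_rank(np.array(rows))

# rho(h) = (-1)^h: ker(phi) acts trivially, so dim E^{ker(phi)} = 1 and the
# formula predicts [G : Im(phi)] * 1 = 3.
assert induced_dimension(lambda h: (-1) ** h) == 3
# rho(h) = i^h: rho(2) = -1, so E^{ker(phi)} = 0 and the space vanishes.
assert induced_dimension(lambda h: 1j ** h) == 0
```

Both predictions of the dimension formula are confirmed by the rank computation.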
Remark 2.3.9. From the proof we see that one could have defined the induced rep-
resentations as the k-vector space of all functions
Im(φ)\G −→ E^{ker(φ)} ,
together with a suitable action of G. However, this “restriction model” of Ind_H^G(%) is not
very convenient because the action of G, by “transport of structure”, is not very explicit.
Example 2.3.10 (Minimal index of a proper subgroup). Here is an application of this
formula: consider a group G, and a proper subgroup H of G of finite index. We want to
obtain a lower bound for its index [G : H]. If we fix a field k, then a simple one is given
by
[G : H] > min_{π6=1} dim(π),
where π runs over the non-trivial irreducible k-representations of G.
As in the case of the Frobenius reciprocity isomorphism (2.23), the proof is not very
difficult as the isomorphism can be described explicitly, but full details are a bit tedious.
The reader should attempt to guess a homomorphism between the two representations (it
is easier to go from right to left here), and after checking that the guess is right, should
also try to verify by herself that it satisfies the required properties.9
9 In fact, the details of this and similar proofs are probably not worth trying to read without
attempting such a process of self-discovery of the arguments.
Proof. We denote by F1 the space of %1 , by E2 that of %2 , and by F2 the space (2.21) of
the induced representation Ind_H^G(%2 ). Moreover, we denote by τ the representation
τ = %2 ⊗ Res_H^G(%1 )
for some wj (x) ∈ E2 . This is simply because, for every x, the vectors (x · vj )j also form
a basis of F1 .
We now first show the injectivity of Φ: any element of F2 ⊗ F1 can be expressed as
∑_j ϕj ⊗ vj
for some functions ϕj ∈ F2 . Let us assume such an element is in ker(Φ). This means that
for all x ∈ G, we have
∑_j ϕj (x) ⊗ (x · vj ) = 0 ∈ E2 ⊗ F1 .
thus defining coefficient functions ϕ̃j : G → E2 . We now will show that – because ϕ̃ ∈ F̃2
– each ϕ̃j is in fact in F2 , which will ensure that
ϕ̃ = ∑_j Φ(ϕ̃j ⊗ vj ).
Comparing, using the uniqueness of the expansion (2.27) with x replaced by φ(h)x, we find that, for
all j, we have
ϕ̃j (φ(h)x) = %2 (h)ϕ̃j (x)
and this does state that each coefficient function ϕ̃j is in F2 .
Remark 2.3.13. If φ is an injective homomorphism and the groups G and H are
finite, then all spaces involved are finite-dimensional. Since Proposition 2.3.8 shows that
both sides of the projection formula are of degree [G : H] deg(%1 ) deg(%2 ), the injectivity
of Φ is sufficient to finish the proof.
Yet another property of induction (and restriction), which is quite important, is the
following:
10 This is the trick: using (2.27) for a varying x, not for a single fixed basis.
Proposition 2.3.14 (Transitivity). Let k be a field and let
H2 −φ2−→ H1 −φ1−→ G
be group homomorphisms, and let φ = φ1 ◦ φ2 . For any k-representations %2 of H2 and %
of G, we have canonical isomorphisms
Res_{H2}^{H1}(Res_{H1}^{G} %) ' Res_{H2}^{G}(%),
Ind_{H1}^{G}(Ind_{H2}^{H1} %2 ) ' Ind_{H2}^{G}(%2 ).
Proof. As far as the restriction is concerned, this is immediate from the definition.
For induction, the argument is pretty much of the same kind as the ones we used before:
defining maps both ways is quite simple and hard to miss, and then one merely needs to
make various checks to make sure that everything works out; we will simplify those by
omitting the homomorphisms in the notation.
So here we go again: let E, F1 , F2 , F denote, respectively, the spaces of the represen-
tations
%2 , Ind_{H2}^{H1}(%2 ), Ind_{H1}^{G}(Ind_{H2}^{H1} %2 ), Ind_{H2}^{G}(%2 ),
so that we must define a G-isomorphism
T : F −→ F2 .
Note that F is a space of functions from G to E, and F2 a space of functions from
G to F1 . We define T as follows: given ϕ ∈ F , a function from G to E, it is natural to
consider
reg(g)ϕ = (x 7→ ϕ(xg)),
the image of ϕ under the regular representation on E-valued functions. But T (ϕ) must
be F1 -valued to be in F2 , and F1 is a space of functions from H1 to E. Hence we define
T (ϕ)(g) = (reg(g)ϕ) ◦ φ1 ,
the “restriction” to H1 of this function on G.
We can then check that T (ϕ) is, in fact, F1 -valued; if we assume that the group
homomorphisms involved are inclusions of subgroups, this amounts to letting η = T (ϕ)(g)
and writing
η(h2 h1 ) = reg(g)ϕ(h2 h1 ) = ϕ(h2 h1 g) = %(h2 )ϕ(h1 g) = %(h2 )η(h1 ),
for hi ∈ Hi , using of course in the middle the assumption that ϕ is in F (again, the
confusing mass of parentheses is unlikely to make much sense until the reader has tried
and succeeded independently to do it).
Now we should check that T (ϕ) is not only F1 -valued, but also lies in F2 , i.e., transforms
under H1 like the induced representation Ind_{H2}^{H1}(%2 ). We leave this to the reader:
this is much helped by the fact that the action of H1 on this induced representation is
also the regular representation.
Next, we must check that T is an intertwining operator; but again, both F and F2
carry actions which are variants of the regular representation, and this should not be
surprising – we therefore omit it...
The final step is the construction of the inverse T̃ of T .11 We now start with ψ ∈ F2
and must define a function from G to E; unraveling in two steps, we set
T̃ (ψ)(g) = ψ(g)(1)
11 If the vector spaces are finite dimensional and the homomorphisms are inclusions, note that it is
quite easy to check that T is injective, and since the dimensions of F and F2 are both [G : H2 ] dim %,
this last step can be shortened.
(ψ(g) is an element of F1 , i.e., a function from H1 to E, and we evaluate that at the unit
of H1 ...) Taking g ∈ G and h2 ∈ H2 , denoting ϕ = T̃ (ψ), we again let the reader check
that the following
ϕ(h2 g) = ψ(h2 g)(1) = (reg(h2 )ψ(g))(1) = ψ(g)(h2 ) = %(h2 )ψ(g)(1) = %(h2 )ϕ(g),
makes sense, and means that T̃ (ψ) is in F .
Now we see that T̃ T (ϕ) is the function which maps g ∈ G to
(reg(g)ϕ)(1) = ϕ(g),
in other words T̃ ◦ T is the identity. Rather more abstrusely, if ψ ∈ F2 , ϕ = T̃ (ψ) and
ψ̃ = T (ϕ), we find for g ∈ G and h1 ∈ H1 that
ψ̃(g)(h1 ) = (reg(g)ϕ)(h1 ) = ϕ(h1 g)
= ψ(h1 g)(1) = (reg(h1 )ψ(g))(1)
= ψ(g)(h1 )
(where we use the fact that, on F2 , H1 acts through the regular representation), which
indicates that T ◦ T̃ is also the identity. Thus both are reciprocal isomorphisms.
Remark 2.3.15 (Functoriality saves time). At this point, conscientious readers may
well have become bored and annoyed at this “death of a thousand checks”. And there
are indeed at least two ways to avoid much (if not all) of the computations we have done.
One will be explained in Section 3.1, and we sketch the other now, since the reader is
presumably well motivated to hear about abstract nonsense if it cuts down on the calculations.
The keyword is the adjective “natural” (or “canonical”) that we attributed to the
isomorphisms (2.23) of Frobenius Reciprocity. In one sense, this is intuitive enough: the
linear isomorphism
HomG (%1 , Ind_H^G(%2 )) −→ HomH (Res_H^G(%1 ), %2 ),
defined in the proof of Proposition 2.3.6 certainly feels natural. But we now take this
more seriously, and try to give rigorous sense to this sentence.
The point is the following fact: a representation % of G is determined, up to isomor-
phism, by the data of all homomorphism spaces
V (π) = HomG (π, %)
where π runs over k-representations of G, together with the data of the maps
V (π) −V (Φ)−→ V (π′)
associated to any (reversed!) G-homomorphism Φ : π′ −→ π, by mapping
(Ψ : π → %) ∈ V (π)
to
V (Φ)(Ψ) = Ψ ◦ Φ.
To be precise:
Fact. Suppose that %1 and %2 are k-representations of G, and that for any represen-
tation π, there is given a k-linear isomorphism
I(π) : HomG (π, %1 ) −→ HomG (π, %2 ),
in such a way that all diagrams
HomG (π, %1 ) −→ HomG (π, %2 )
      ↓                 ↓
HomG (π′, %1 ) −→ HomG (π′, %2 )
commute for any Φ : π′ → π, where the vertical arrows, as above, are given by Ψ 7→ Ψ ◦ Φ.
Then %1 and %2 are isomorphic, and in fact there exists a unique isomorphism
I : %1 −→ %2
such that I(π) is given by Ψ 7→ I ◦ Ψ for all π.
Let us first see why this is useful. When dealing with induction, the point is that it
tells us that an induced representation Ind_H^G(%) is characterized, up to isomorphism, by
the Frobenius Reciprocity isomorphisms (2.23). Indeed, the latter tells us, simply from
the data of %, what any G-homomorphism space
HomG (π, Ind_H^G(%))
is supposed to be. And the fact above says that there can be only one representation
with “given” homomorphism spaces HomG (π, %′).
More precisely, the behavior under morphisms must be compatible – the fact above
gives:
Fact. Let φ : H → G be a group-homomorphism and let % be a k-representation of
H. There exists, up to isomorphism of representations of G, at most one k-representation
%′ of G with k-linear isomorphisms
i(π) : HomG (π, %′) −→ HomH (Res(π), %)
such that the diagrams
HomG (π, %′) −i(π)−→ HomH (Res(π), %)
      ↓                        ↓
HomG (π′, %′) −i(π′)−→ HomH (Res(π′), %)
commute for any G-homomorphism Φ : π′ −→ π, where the vertical arrows are again Ψ 7→ Ψ ◦ Φ
on the left, and Ψ 7→ Ψ ◦ Res(Φ) on the right (the restriction of Φ to H).
Readers are invited to check that the (explicit) isomorphisms
i(π) : HomG (π, Ind_H^G(%)) −→ HomH (Res_H^G(π), %),
that we constructed (based on the explicit model (2.21)) are such that the diagrams
          HomG (π, Ind_H^G(%)) −i(π)−→ HomH (Res_H^G(π), %)
(2.28)          ↓                              ↓
          HomG (π′, Ind_H^G(%)) −i(π′)−→ HomH (Res_H^G(π′), %)
commute (these are the same as the ones above, with %′ = Ind(%)). This is the real content
of the observation that those are “natural”. Thus the construction of (2.21) proved the
existence of the induced representation defined by “abstract” Frobenius reciprocity...
Now we can see that the transitivity of induction is just a reflection of the – clearly
valid – transitivity of restriction. Consider
H2 −φ2−→ H1 −φ1−→ G
as in the transitivity formula, and consider a representation % of H2 , as well as
%1 = Ind_{H1}^{G}(Ind_{H2}^{H1}(%)), %2 = Ind_{H2}^{G}(%).
By Frobenius Reciprocity, applied twice, we have
HomG (π, %1 ) ' HomH1 (Res_{H1}^{G}(π), Ind_{H2}^{H1}(%)) ' HomH2 (Res_{H2}^{H1}(Res_{H1}^{G}(π)), %),
and
HomG (π, %2 ) ' HomH2 (Res_{H2}^{G}(π), %),
hence by comparison and the “obvious” transitivity of restriction, we obtain isomorphisms
I(π) : HomG (π, %1 ) ' HomG (π, %2 ).
The reader should easily convince herself (and then check!) that these isomorphisms
satisfy the compatibility required in the claim to deduce that %1 and %2 are isomor-
phic – indeed, this is a “composition” or “tiling” of the corresponding facts for the
diagrams (2.28).
At first sight, this may not seem much simpler than what we did earlier, but a second
look reveals that we did not use anything relating to k-representations of G except the
existence of morphisms, the identity maps and the composition operations! In particular,
there is no need whatsoever to know an explicit model for the induced representation.
Now we prove the claim: take π = %1 ; then HomG (π, %1 ) = HomG (%1 , %1 ). We may
not know much about the general existence of homomorphisms, but certainly this space
contains the identity of %1 . Hence we obtain an element
I = I(%1 )(Id%1 ) ∈ HomG (%1 , %2 ).
Then – this looks like a cheat – this I is the desired isomorphism! To see this – but
first try it! –, we check first that I(π) is given, as claimed, by pre-composition with I for
any π. Indeed, I(π) is an isomorphism
HomG (π, %1 ) −→ HomG (π, %2 ).
Take an element Φ : π → %1 ; we can then build the associated commutative square
HomG (%1 , %1 ) −→ HomG (%1 , %2 )
      ↓                  ↓
HomG (π, %1 ) −→ HomG (π, %2 ).
Take the element Id%1 in the top-left corner. If we follow the right-then-down route,
we get, by definition, the element
I(%1 )(Id%1 ) ◦ Φ = I ◦ Φ ∈ HomG (π, %2 ).
But if we follow the down-then-right route, we get I(π)(Id ◦ Φ), and hence the com-
mutativity of these diagrams says that, for all Φ, we have
(2.29) I(π)(Φ) = I ◦ Φ,
which is what we had claimed.
We now check that I is, indeed, an isomorphism, by exhibiting an inverse. The
construction we used strongly suggests that
J = I(%2 )−1 (Id%2 ) ∈ HomG (%2 , %1 ),
should be what we need (where we use that I(%2 ) is an isomorphism, by assumption).
Indeed, tautologically, we have
I(%2 )(J) = Id%2 ,
which translates, from the formula (2.29) we have just seen, to
I ◦ J = Id%2 .
Now we simply exchange the roles of %1 and %2 and replace I(π) by its inverse; then I
and J are exchanged, and we get also
J ◦ I = Id%1 .
Why did we not start with this “functorial” language? Partly this is a matter of
personal taste and partly of wanting to show very concretely what happens – especially
if the reader does (or has done...) all computations on her own, part of the spirit of the
game will have seeped in. Moreover, in some of the more down-to-earth applications of
these games with induction and its variants, it may be quite important to know what
the “canonical maps” actually are. The functorial language does usually give a way to
compute them, but it may be more direct to have written them down as directly as we
did.
To conclude with the general properties of induction, we leave the proofs of the fol-
lowing lemma to the reader:
Lemma 2.3.16. Let k be a field, let φ : H −→ G be a group homomorphism. For
any representations % and %i of H, we have natural isomorphisms
(Ind_H^G(%))˜ ' Ind_H^G(%̃),
and
Ind_H^G( ⊕_{i∈I} %i ) ' ⊕_{i∈I} Ind_H^G(%i ).
The corresponding statements for the restriction are also valid, and equally easy to
check. On the other hand, although the isomorphism
Res_H^G(%1 ⊗ %2 ) ' Res_H^G(%1 ) ⊗ Res_H^G(%2 ),
is immediate, it is usually definitely false (say when φ is injective but is not an isomor-
phism) that
Ind_H^G(%1 ⊗ %2 ), Ind_H^G(%1 ) ⊗ Ind_H^G(%2 ),
are isomorphic, for instance because the degrees do not match (from left to right, they
are given by
[G : H] deg(%1 ) deg(%2 ), [G : H]^2 deg(%1 ) deg(%2 ),
respectively).
We conclude this longish section with another type of “change of groups”. Fix a field
k and two groups G1 and G2 . Given k-representations %1 and %2 of G1 and G2 , acting on
E1 and E2 respectively, we can define a representation of the direct product G1 × G2 on
the tensor product E1 ⊗ E2 : for pure tensors v1 ⊗ v2 in E1 ⊗ E2 , we let
(%1 ⊠ %2 )(g1 , g2 )(v1 ⊗ v2 ) = %1 (g1 )v1 ⊗ %2 (g2 )v2 ,
which extends by linearity to the desired action, sometimes called the external tensor
product
%1 ⊠ %2 : G1 × G2 −→ GL(E1 ⊗ E2 ).
Of course, the dimension of this representation is again (dim %1 )(dim %2 ). In par-
ticular, it is clear that not all representations of G1 × G2 can be of this type, simply
because their dimensions might not factor non-trivially. However, in some cases, irre-
ducible representations must be external tensor products of irreducible representations of
the factors.
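In terms of matrices, the external tensor product is realized by the Kronecker product of the matrices of %1 and %2. A quick numerical sketch (the two rotation representations of Z/3 and Z/4 below are illustrative choices, not from the text):

```python
import numpy as np

def rot(t):
    # 2x2 rotation matrix by angle t
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

# Illustrative: rho1 on G1 = Z/3 (rotation by 120 degrees) and
# rho2 on G2 = Z/4 (rotation by 90 degrees), both acting on R^2.
rho1 = lambda g: rot(2 * np.pi * (g % 3) / 3)
rho2 = lambda g: rot(2 * np.pi * (g % 4) / 4)

# The external tensor product on E1 (x) E2, as a Kronecker product.
boxtimes = lambda g1, g2: np.kron(rho1(g1), rho2(g2))

# Homomorphism property of rho1 [x] rho2 on G1 x G2, over all pairs:
for g1 in range(3):
    for g2 in range(4):
        for h1 in range(3):
            for h2 in range(4):
                assert np.allclose(boxtimes(g1 + h1, g2 + h2),
                                   boxtimes(g1, g2) @ boxtimes(h1, h2))
```

The check rests on the identity kron(AB, CD) = kron(A, C) kron(B, D), which is exactly the matrix form of the tensor-product action on pure tensors.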
Proposition 2.3.17 (Irreducible representations of direct products). Let k be a field,
and let G1 , G2 be two groups. If % is a finite-dimensional irreducible k-representation
of G = G1 × G2 , then there exist irreducible k-representations %1 of G1 and %2 of G2 ,
respectively, such that
% ' %1 ⊠ %2 ;
moreover, %1 and %2 are unique, up to isomorphism of representations of their respective
groups.
Conversely, if %1 and %2 are irreducible finite-dimensional k-representations of G1 and
G2 , respectively, the external tensor product %1 ⊠ %2 is an irreducible representation of
G1 × G2 .
The proof of this requires some preliminary results, so we defer it to Section 2.7
(Proposition 2.3.17 and Exercise 2.7.29).
Remark 2.3.18 (Relation with the ordinary tensor product). Consider a group G;
there is a homomorphism (which is injective)
φ : G −→ G × G, g 7→ (g, g).
which corresponds to the action of R/2πZ on the real plane by rotation of a given angle.
This makes it clear that this is a homomorphism, as trigonometric identities can also be
used to check. Also, this interpretation makes it clear that % is irreducible: there is no
non-zero (real) subspace of R2 which is stable under %, except R2 itself.
However, this irreducibility breaks down when extending the base field to C: indeed,
on C^2 , we have
%(θ) (1, i)^T = (cos θ + i sin θ, − sin θ + i cos θ)^T = (cos θ + i sin θ) (1, i)^T ,
and
%(θ) (1, −i)^T = (cos θ − i sin θ, − sin θ − i cos θ)^T = (cos θ − i sin θ) (1, −i)^T ,
so that C^2 , under the action of G through %, splits as a direct sum
C^2 = C (1, i)^T ⊕ C (1, −i)^T
of two complex lines which are both subrepresentations, one of them isomorphic to the
one-dimensional complex representation
G −→ GL(C) ' C^× , θ 7→ e^{iθ} ,
and the other to
G −→ C^× , θ 7→ e^{−iθ}
(its conjugate, in a fairly obvious sense). Hence, one can say that % is not absolutely
irreducible.
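The splitting can be confirmed numerically: the two columns above are eigenvectors of %(θ) with eigenvalues e^{±iθ}. A minimal check (the angle is an arbitrary choice):

```python
import numpy as np

theta = 0.7  # arbitrary test angle
# rotation matrix with the convention used in the computation above
R = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
v_plus, v_minus = np.array([1, 1j]), np.array([1, -1j])
# the complex lines C(1, i)^T and C(1, -i)^T are eigenspaces:
assert np.allclose(R @ v_plus, np.exp(1j * theta) * v_plus)
assert np.allclose(R @ v_minus, np.exp(-1j * theta) * v_minus)
```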
Another way to change the field, which may be more confusing, is to apply automor-
phisms of k. Formally, this is not different: we have an automorphism σ : k −→ k, and
we compose % with the resulting homomorphism
%σ : G −%−→ GL(E) −→ GL(E ⊗_k k),
where we have to be careful to see k, in the second argument of the tensor product, as
given with the k-algebra structure σ. Concretely, Eσ = E ⊗k k is the k-vector space with
the same underlying abelian group E, but with scalar multiplication given by
α · v = σ(α)v ∈ E.
Here again matrix representations may help understand what happens: a basis (vi ) of
E is still a basis of Eσ but, for any g ∈ G, the matrix representing %(g) in the basis (vi )
of Eσ is obtained by applying σ −1 to all coefficients of the matrix that represents %(g).
Indeed, for any i, we have
%(g)vi = ∑_j αj vj = ∑_j σ^{−1}(αj ) · vj
so that the (j, i)-th coefficient of the matrix for %(g) is αj , while it is σ −1 (αj ) for %σ (g).
This operation on representations can be interesting because % and %σ are usually
not isomorphic as representations, despite the fact that they are closely related. In
particular, there is a bijection between the subrepresentations of E and those of Eσ
(given by F 7→ Fσ ), and hence % and %σ are simultaneously irreducible or not irreducible,
semisimple or not semisimple.
Example 2.4.2 (Complex conjugate). Consider k = C. Although C, considered as
an abstract field, has many automorphisms, the only continuous ones, and therefore the
most important, are the identity and the complex conjugation σ : z 7→ z̄. It follows
therefore that any time we have a complex representation % : G −→ GL(E), where E is a
C-vector space, there is a naturally associated “conjugate” representation %̄ obtained by
applying the construction above to the complex conjugation. From the basic theory of
characters (Corollary 2.7.32 below), one can see that %̄ is isomorphic to % if and only
if the function g 7→ Tr %(g) is real-valued. This can already be checked when % is one-
dimensional, since %̄ is then the conjugate function G → C, which equals % if and only if
% is real-valued. In particular, the examples in (2.4), (2.5) or (2.6) lead to many cases of
representations where % and %̄ are not isomorphic.
Since any field homomorphism is injective, field extensions are essentially the only
morphisms of fields. However, there are sometimes other possibilities to change fields,
which are more subtle.
2.6. Examples
We collect here some more examples of representations.
2.6.1. Binary forms and invariants. Let k be any field. For an integer m > 0,
we denote by Vm the vector space of polynomials in k[X, Y ] which are homogeneous of
degree m, i.e., the k-subspace of k[X, Y ] generated by the monomials
X^i Y^{m−i} , 0 6 i 6 m.
In fact, these monomials are linearly independent, and therefore form a basis of Vm . In
particular dim Vm = m + 1.
If we take G = SL2 (k), we can let G act on Vm by
(%m (g)f )(X, Y ) = f ((X, Y ) · g),
where (X, Y ) · g denotes the right-multiplication of matrices; in other words, if
g = ( a  b ; c  d ),
we have
(g · f )(X, Y ) = f (aX + cY, bX + dY )
(one just says that G acts on Vm by linear change of variables).
The following theorem will be proved partly in Example 2.7.9:
Theorem 2.6.1 (Irreducible representations of SL2 ). For k = C, the representations
%m , for m > 0, are irreducible representations of SL2 (C). In fact, %m is then an irreducible
representation of the subgroup SU2 (C) ⊂ SL2 (C).
On the other hand, if k is a field of non-zero characteristic p, the representation %p is
not irreducible.
We only explain the last statement here: if k has characteristic p, consider the
subspace W ⊂ Vp spanned by the monomials X^p and Y^p . Then W 6= Vp (since dim Vp = p + 1),
and W is a subrepresentation. Indeed, we have
(g · X^p ) = (aX + cY )^p = a^p X^p + c^p Y^p ∈ W,   (g · Y^p ) = (bX + dY )^p = b^p X^p + d^p Y^p ∈ W,
by the usual properties of the p-th power operation in characteristic p (i.e., the fact
that the binomial coefficients p!/(j! (p − j)!) are divisible by p for 1 6 j 6 p − 1). One can
also show that
W ⊂ Vp does not have a stable complementary subspace, so that Vp is not semisimple in
characteristic p.
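The divisibility of the binomial coefficients used here is easy to verify directly; a quick sketch (the list of primes is an arbitrary sample):

```python
from math import comb

# "Freshman's dream": p divides C(p, j) for 1 <= j <= p - 1, so
# (u + v)^p = u^p + v^p in characteristic p.
for p in (2, 3, 5, 7, 11, 13):
    assert all(comb(p, j) % p == 0 for j in range(1, p))
```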
We now consider only the case k = C. It is elementary that %m is isomorphic to
the m-th symmetric power of %1 for all m > 0. Hence we see here a case where, using
multilinear operations, all irreducible (finite-dimensional) representations of a group are
obtained from a “fundamental” one. We also see here an elementary example of a group
which has irreducible finite-dimensional representations of arbitrarily large dimension. (In
fact, SL2 (C) also has many infinite-dimensional representations which are irreducible, in
the sense of representations of topological groups explained in Section 3.2 in the next
chapter.)
Exercise 2.6.2 (Matrix representation). (1) Compute the matrix representation for
%2 and %3 , in the bases (X 2 , XY, Y 2 ) and (X 3 , X 2 Y, XY 2 , Y 3 ) of V2 and V3 , respectively.
(2) Compute the kernel of %2 and %3 .
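A possible computational route to part (1): the matrix of %m (g) is obtained by expanding f(aX + cY, bX + dY) for each basis monomial and collecting coefficients. Here is a sketch using sympy (the helper `rho_matrix` and the test matrices are illustrative, not part of the exercise):

```python
import sympy as sp

X, Y = sp.symbols('X Y')

def rho_matrix(g, m):
    # Matrix of rho_m(g) on V_m in the basis (X^m, X^{m-1} Y, ..., Y^m),
    # where g = ((a, b), (c, d)) acts by (g.f)(X, Y) = f(aX + cY, bX + dY).
    (a, b), (c, d) = g
    basis = [X**(m - i) * Y**i for i in range(m + 1)]
    cols = []
    for f in basis:
        img = sp.expand(f.subs({X: a*X + c*Y, Y: b*X + d*Y}, simultaneous=True))
        cols.append([sp.Poly(img, X, Y).coeff_monomial(mon) for mon in basis])
    return sp.Matrix(cols).T  # column i = coordinates of the image of basis[i]

g, h, gh = ((1, 1), (0, 1)), ((1, 0), (1, 1)), ((2, 1), (1, 1))  # gh = g h
# rho_2 is a homomorphism on these elements of SL2:
assert rho_matrix(gh, 2) == rho_matrix(g, 2) * rho_matrix(h, 2)
# -1 lies in the kernel of rho_2 (the monomials have even total degree):
assert rho_matrix(((-1, 0), (0, -1)), 2) == sp.eye(3)
```

The same helper applies verbatim to %3 with m = 3, where −1 no longer acts trivially.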
A very nice property of these representations – which turns out to be crucial in
Quantum Mechanics – illustrates another important type of results in representation
theory:
Theorem 2.6.3 (Clebsch-Gordan formula). For any integers m > n > 0, the tensor
product %m ⊗ %n is semisimple and decomposes as
(2.31) %m ⊗ %n ' %m+n ⊕ %m+n−2 ⊕ · · · ⊕ %m−n .
One point of this formula is to illustrate that, if one knows some irreducible represen-
tations of a group, one may well hope to be able to construct or identify others by trying
to decompose the tensor products of these representations into irreducible components
(if possible); here, supposing one knew only the “obvious” representations %0 = 1 and %1
(which is just the inclusion SL2 (C) −→ GL2 (C)), we see that all other representations
%m arise by taking tensor products iteratively and decomposing them, e.g.,
%1 ⊗ %1 = %2 ⊕ 1, %2 ⊗ %1 = %3 ⊕ %1 , etc.
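A quick consistency check of (2.31) at the level of dimensions: the left-hand side has dimension (m+1)(n+1), while the summand %_{m+n−2j} contributes m+n−2j+1 for j = 0, ..., n. A sketch:

```python
# Dimension bookkeeping for the Clebsch-Gordan formula (2.31):
# dim(rho_m (x) rho_n) = (m+1)(n+1) should equal the sum of the
# dimensions m + n - 2j + 1 of rho_{m+n-2j} for j = 0, ..., n.
for m in range(12):
    for n in range(m + 1):
        assert (m + 1) * (n + 1) == sum(m + n - 2 * j + 1 for j in range(n + 1))
```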
Proof. Both sides of the Clebsch-Gordan formula are trivial when m = 0. Using
induction on m, we then see that it is enough to prove that
(2.32) %m ⊗ %n ' %m+n ⊕ (%m−1 ⊗ %n−1 )
for m > n > 1.
At least a subrepresentation isomorphic to %m−1 ⊗ %n−1 is not too difficult to find.
Indeed, first of all, the tensor product %m ⊗ %n can be interpreted concretely by a rep-
resentation on the space Vm,n of polynomials in four variables X1 , Y1 , X2 , Y2 which are
homogeneous of degree m with respect to (X1 , Y1 ), and of degree n with respect to the
other variables, where the group SL2 (C) acts by simultaneous linear change of variable
on the two sets of variables, i.e.,
(g · f )(X1 , Y1 , X2 , Y2 ) = f ((X1 , Y1 )g, (X2 , Y2 )g)
for f ∈ Vm,n . This G-isomorphism
Vm ⊗ Vn −→ Vm,n
is induced by
(X^i Y^{m−i} ) ⊗ (X^j Y^{n−j} ) 7→ X1^i Y1^{m−i} X2^j Y2^{n−j}
for the standard basis vectors.
Using this description, we have a linear map
∆ : Vm−1,n−1 −→ Vm,n , f 7→ (X1 Y2 − X2 Y1 )f,
which is a G-homomorphism: if we view the factor X1 Y2 − X2 Y1 as a determinant
δ = det ( X1  X2 ; Y1  Y2 ),
it follows that
δ((X1 , Y1 )g, (X2 , Y2 )g) = det(g) δ(X1 , Y1 , X2 , Y2 ) = δ(X1 , Y1 , X2 , Y2 )
for g ∈ SL2 (C). Moreover, it should be intuitively obvious that ∆ is injective, but we
check this rigorously: if f 6= 0, it has some degree d > 0 with respect to the variable X1 ,
and then X1 Y2 f has degree d + 1 with respect to X1 , while X2 Y1 f remains of degree d,
and therefore X1 Y2 f 6= X2 Y1 f .
Now we describe a stable complement to the image of ∆. To justify a bit the solu-
tion, note that Im(∆) only contains polynomials f such that f (X, Y, X, Y ) = 0. Those
for which this property fails must be recovered. We do this by defining W to be the
representation generated by the single vector
e = X1^m X2^n ,
i.e., the linear span in Vm,n of the translates g · e. To check that it has the required
property, we look on W at the linear map “evaluating f when both sets of variables are
equal” suggested by the remark above. It is given by
T : W −→ Vm+n , f 7→ f (X, Y, X, Y )
(since a polynomial of the type f (X, Y, X, Y ) with f ∈ Vm,n is homogeneous of degree
m + n), and we notice that it is an intertwiner with %m+n . Since e maps to X m+n which
is non-zero, and %m+n is irreducible (Theorem 2.6.1; although the proof of this will be
given only later, the reader will have no problem checking that there is no circularity),
Schur’s Lemma 2.2.4 proves that T is surjective.
We now examine W more closely. Writing g = ( a  b ; c  d ), we have
g · e = (aX1 + cY1 )^m (aX2 + cY2 )^n = ∑_{0 6 j 6 m+n} a^j c^{m+n−j} ϕj (X1 , Y1 , X2 , Y2 )
for some ϕj ∈ Vm,n . We deduce that the space W , spanned by the vectors g·e, is contained
in the span of the ϕj , and hence that dim W 6 m+n+1. But since dim %m+n = m+n+1,
we must have equality, and in particular T is an isomorphism.
Since dim Vm−1,n−1 + dim W = mn + m + n + 1 = dim Vm,n , there only remains to check
that Im(∆) ⊕ W = Vm,n to conclude that (2.32) holds. But the intersection Im(∆) ∩ W
is zero, since f (X, Y, X, Y ) = 0 for f ∈ Im(∆), while f (X, Y, X, Y ) = T f 6= 0 for a
non-zero f ∈ W .
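The key computation in this proof, the SL2-invariance of the factor δ = X1Y2 − X2Y1, can be verified symbolically; a sketch using sympy (the variable names are illustrative):

```python
import sympy as sp

X1, Y1, X2, Y2, a, b, c, d = sp.symbols('X1 Y1 X2 Y2 a b c d')
delta = X1 * Y2 - X2 * Y1
# apply (Xi, Yi) -> (Xi, Yi) g, i.e. Xi -> a Xi + c Yi, Yi -> b Xi + d Yi
g_delta = delta.subs({X1: a*X1 + c*Y1, Y1: b*X1 + d*Y1,
                      X2: a*X2 + c*Y2, Y2: b*X2 + d*Y2}, simultaneous=True)
# delta transforms by det(g) = ad - bc, hence is invariant under SL2:
assert sp.expand(g_delta - (a*d - b*c) * delta) == 0
```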
In Corollary 5.6.4 in Chapter 5, we will see that the Clebsch-Gordan formula for the
subgroup SU2 (C) (i.e., seeing each %m as restricted to SU2 (C)) can be proved – at least at
the level of existence of an isomorphism! – in a few lines using character theory. However,
the proof above has the advantage that it “explains” the decomposition, and can be used
to describe concretely the subspaces of Vm ⊗ Vn corresponding to the subrepresentations
of %m ⊗ %n .
Now, in a slightly different direction, during the late 19th and early 20th Century,
a great amount of work was done on the topic called invariant theory, which in the
(important) case of the invariants of SL2 (C) can be described as follows: one considers,
for some m > 0, the algebra S(Vm ) of all polynomial functions on Vm ; the group G acts
on S(Vm ) according to
(g · φ)(f ) = φ(%m (g^{−1} )f ),
and hence S(Vm ) is also a representation of G (it is infinite-dimensional, but splits as a
direct sum of the homogeneous components of degree d > 0, which are finite-dimensional).
Then one tries to understand the subalgebra S(Vm )^G of all G-invariant functions on Vm ,
in particular, to understand the (finite) dimensions of the homogeneous pieces S(Vm )^G_d
of invariant functions of degree d.
For instance, if m = 2, so that V2 is the space of binary quadratic forms, one can
write any f ∈ V2 as
f = a0 X^2 + 2a1 XY + a2 Y^2 ,
and then S(V2 ) ' C[a0 , a1 , a2 ] is the polynomial algebra in these coordinates. One
invariant springs to mind: the discriminant
∆(a0 , a1 , a2 ) = a1^2 − a0 a2
of a binary quadratic form. One can then show that S(V2 )^G ' C[∆] is a polynomial
algebra in the discriminant. For m = 3, with
f = a0 X^3 + 3a1 X^2 Y + 3a2 XY^2 + a3 Y^3 ,
one can prove that S(V3 )^G ' C[∆3 ], where
∆3 = a0^2 a3^2 − 6 a0 a1 a2 a3 + 4 a0 a2^3 − 3 a1^2 a2^2 + 4 a1^3 a3 .
The search for explicit descriptions of the invariant spaces S(Vm )G – and similar
questions for other linear actions of groups like SLm (C) on homogeneous polynomials
in more variables – was one of the main topics of the classical theory of invariants, which
was extremely popular during the 19th century (see, e.g., [39, Ch. 3] for a modern
presentation). These questions are in fact very hard if one wishes to give concrete answers:
currently, explicit generators of S(Vm )G (as an algebra) seem to be known only for m 6 10.
For m = 9, one needs 92 invariants to generate S(Vm )G as an algebra (see [6]; these
generators are not algebraically independent).
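The invariance of the discriminant in the case m = 2 can also be checked symbolically; a sketch with sympy, under the change-of-variable convention of this section (it verifies the slightly stronger identity ∆(g · f) = det(g)^2 ∆(f), which gives SL2-invariance):

```python
import sympy as sp

X, Y, a0, a1, a2, a, b, c, d = sp.symbols('X Y a0 a1 a2 a b c d')
f = a0*X**2 + 2*a1*X*Y + a2*Y**2
# g acts by the linear change of variables X -> aX + cY, Y -> bX + dY
gf = sp.expand(f.subs({X: a*X + c*Y, Y: b*X + d*Y}, simultaneous=True))
p = sp.Poly(gf, X, Y)
b0, b1, b2 = p.coeff_monomial(X**2), p.coeff_monomial(X*Y) / 2, p.coeff_monomial(Y**2)
disc = lambda u0, u1, u2: u1**2 - u0*u2
# the discriminant picks up a factor det(g)^2, so it is SL2-invariant:
assert sp.expand(disc(b0, b1, b2) - (a*d - b*c)**2 * disc(a0, a1, a2)) == 0
```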
2.6.2. Permutation representations. At the origin of group theory, a group G
was often seen as a “permutation group”, or in other words, as a subgroup of the group
SX of all bijections of some set X (often finite). Indeed, any group G can be identified
with a subgroup of SG by mapping g ∈ G to the permutation h 7→ gh of the underlying
set G (i.e., mapping g to the g-th row of the “multiplication table” of the group law on G).
More generally, one may consider any action of G on a set X, i.e., any homomorphism
G −→ SX , g 7→ (x 7→ g · x),
as a “permutation group” analogue of a linear representation. Such actions, even if X
is not a vector space, are often very useful means of investigating the properties of a
group. There is always an associated linear representation which encapsulates the action
by “linearizing it”: given any field k, denote by EX the k-vector space generated by basis
vectors ex indexed by the elements of the set X, and define
% : G −→ GL(EX )
by linearity using the rule
%(g)ex = e_{g·x}
which exploits the action of G on X. Since g · (h · x) = (gh) · x (the crucial defining
condition for an action!), we see that % is, indeed, a representation of G. It has dimension
dim % = |X|, by construction.
Example 2.6.4. (1) The representation (denoted %1 ) in (2.3) of G on the space k(G)
spanned by G is simply the permutation representation associated to the left-action of G
on itself by multiplication.
(2) If H ⊂ G is a subgroup of G, with finite index, and X = G/H is the finite set of
right cosets of G modulo H, with the action given by
g · (xH) = gxH ∈ G/H,
the corresponding permutation representation % is isomorphic to the induced
representation Ind_H^G(1).
Indeed, the space for this induced representation is given by
F = {ϕ : G −→ k : ϕ(hg) = ϕ(g) for all h ∈ H and g ∈ G},
with the left-regular representation. This space has a basis given by the functions ϕx
which are the characteristic functions of the left cosets Hx. Moreover
reg(g)ϕx = ϕ_{xg^{−1}}
(the left-hand side is non-zero at those y where yg ∈ Hx, i.e., y ∈ Hxg^{−1} ), which means
that mapping
ϕx 7→ e_{x^{−1}}
gives a linear isomorphism F −→ EX , which is now an intertwiner.
A feature of all permutation representations is that if X is finite then, unless |X| = 1,
they are never irreducible: the element
∑_{x∈X} ex ∈ EX
is an invariant vector, and the line it spans is a proper non-zero subrepresentation.
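The linearization of an action, and the invariant vector just described, can be illustrated concretely with G = S3 acting on X = {0, 1, 2} (an illustrative choice, not from the text):

```python
import numpy as np
from itertools import permutations

# G = S3 acting on X = {0, 1, 2}; linearize by permutation matrices
# P(g) with P(g) e_x = e_{g(x)}.
S3 = list(permutations(range(3)))

def P(g):
    M = np.zeros((3, 3))
    for x in range(3):
        M[g[x], x] = 1  # column x has its 1 in row g(x)
    return M

compose = lambda g, h: tuple(g[h[x]] for x in range(3))  # (gh)(x) = g(h(x))

# P is a homomorphism: P(gh) = P(g) P(h) for all pairs
for g in S3:
    for h in S3:
        assert np.allclose(P(compose(g, h)), P(g) @ P(h))

# the vector sum_x e_x is fixed by every P(g): an invariant vector
ones = np.ones(3)
assert all(np.allclose(P(g) @ ones, ones) for g in S3)
```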
Exercise 2.6.5. If % is the permutation representation associated to the action on
G/H, for H ⊂ G of finite index, show that %^G is spanned by this invariant vector, and
explain how to recover it as the image of an explicit element
Φ ∈ HomG (1, %)
constructed using Frobenius reciprocity.
2.6.3. Generators and relations. From an abstract point of view, one may try
to describe representations of a group G by writing down a presentation of G, i.e., a
set g ⊂ G of generators, together with the set r of all relations between the elements
of g, relations being seen as (finite) words involving the g ∈ g – a situation which one
summarizes by writing
G ' ⟨g | r⟩.
Then one can see that for a given field k and dimension d > 1, it is equivalent to give
a d-dimensional (matrix) representation
G −→ GLd (k)
or to give a family
(x_g)_{g∈g}
of invertible matrices in GLd(k), such that “all relations in r hold”, i.e., if a given r ∈ r is given by a word
r = g1 · · · g_ℓ
(with gi in the free group generated by g), we should ensure that
x_{g1} · · · x_{g_ℓ} = 1
in the matrix group GLd (k).
This description is usually not very useful for practical purposes if the group G is given, since it is often the case that there is no particularly natural choice of generators and relations to use. However, it does have some uses.
For instance, if we restrict to representations of dimension d = 1, since GL1(k) = k^× is abelian (and no group GLd(k) is, for any k, when d ⩾ 2), and since for any group G
and abelian group A, there is a canonical bijection between homomorphisms G → A and
homomorphisms G/[G, G] → A, we derive:
Proposition 2.6.6 (1-dimensional representations). Let G be a group, and let G^ab = G/[G, G] be the abelianization of G. For any field k, the 1-dimensional representations of G correspond with the homomorphisms
G^ab −→ k^×.
In particular, if G is perfect, i.e., if [G, G] = G, any non-trivial representation of G,
over any field k, has dimension at least 2.
The last part of this proposition applies in many cases. For instance, if d ⩾ 2 and k is any field, SLd(k) is known to be perfect except when d = 2 and k = F2 or k = F3.
Thus no such group has a non-trivial one-dimensional representation.
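As an illustration of Proposition 2.6.6 (a small computation, not part of the text): for G = S3, the commutator subgroup can be found by brute force; it has index 2, so over C the group S3 has exactly two 1-dimensional representations (the trivial one and the sign).

```python
from itertools import permutations

def compose(s, t):
    return tuple(s[t[i]] for i in range(len(s)))

def inverse(s):
    inv = [0] * len(s)
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

S3 = list(permutations(range(3)))
# Commutators [g, h] = g h g^{-1} h^{-1}
comms = {compose(compose(g, h), compose(inverse(g), inverse(h)))
         for g in S3 for h in S3}
# Generate the subgroup [G, G] from the set of commutators
subgroup = set(comms)
while True:
    new = {compose(a, b) for a in subgroup for b in subgroup} - subgroup
    if not new:
        break
    subgroup |= new

# [S3, S3] = A3 has order 3, so S3^ab has order 2: over C there are
# exactly two homomorphisms S3 -> C^x, hence two 1-dimensional representations.
assert len(subgroup) == 3
assert len(S3) // len(subgroup) == 2
```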
One can also make use of this approach to provide more examples of groups with
“a lot” of representations. Indeed, if G is a group where there are no relations at all
between a set g of generators (i.e., a free group), it is equivalent to give a homomorphism
G −→ GL(E) as to give elements xg in GL(E) indexed by the generators g ∈ g. Moreover,
two such representations given by xg ∈ GL(E) and yg ∈ GL(F ) are isomorphic if and
only if these elements are (globally) conjugate, i.e., if there exists a linear isomorphism
Φ : E → F such that
xg = Φ−1 yg Φ
for all g ∈ g.
Here is a slight variant that makes this very concrete. Consider the group G =
PSL2 (Z) of matrices of size 2 with integral coefficients and determinant 1, modulo the
subgroup {±1}. Then G is not free, but it is known to be generated by the (image modulo
{±1} of the) two elements
g1 = [ 0  −1 ]      g2 = [ 0  −1 ]
     [ 1   0 ] ,         [ 1   1 ]
in such a way that the only relations between the generators are
g1² = 1,  g2³ = 1
(i.e., G is a free product of Z/2Z and Z/3Z).
Hence it is equivalent to give a representation PSL2(Z) −→ GL(E) or to give two elements x, y ∈ GL(E) such that x² = 1 and y³ = 1.
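These relations can be checked directly on the matrices g1, g2 above, where equality in PSL2(Z) means equality up to sign; a quick sketch:

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

g1 = [[0, -1], [1, 0]]
g2 = [[0, -1], [1, 1]]
neg_I = [[-1, 0], [0, -1]]

# In PSL2(Z) = SL2(Z)/{±1}, an element is trivial iff the matrix is ±I.
assert mat_mul(g1, g1) == neg_I            # g1^2 = -I: g1 has order 2 in PSL2(Z)
g2_cubed = mat_mul(mat_mul(g2, g2), g2)
assert g2_cubed == neg_I                   # g2^3 = -I: g2 has order 3 in PSL2(Z)
```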
Yet another use of generators and relations is in showing that there exist groups for
which certain representations do not exist: in that case, it is enough to find some abstract
presentation where the relations are incompatible with matrix groups. Here is a concrete
example:
Theorem 2.6.7 (Higman–Baumslag; “non-linear finitely generated groups exist”).
Let G be the group with 2 generators a, b subject to the relation
a−1 b2 a = b3 .
Then, whatever the field k, there exists no faithful linear representation
G −→ GL(E)
where E is a finite-dimensional k-vector space.
The first example of such a group was constructed by Higman; the example here is
due to Baumslag (see [29]), and is an example of a family of groups called the Baumslag-
Solitar groups which have similar presentations with the exponents 2 and 3 replaced by
arbitrary integers.
We will only give a sketch, dependent on some fairly deep facts of group theory.
Sketch of proof. We appeal to the following two results:
– (Malcev’s Theorem) If k is a field and G ⊂ GLd(k) is a finitely generated group, then for any g ∈ G with g ≠ 1, there exists a finite quotient G −→ G/H such that g is non-trivial modulo H.
– (The “Identitätssatz” of Magnus, or Britton’s Lemma; see, e.g., [33, Th. 11.81]) Let G
be a finitely presented group with a single relation (a one-relator group); then one can
decide algorithmically if a word in the generators represents or not the identity element
of G.
Now we are going to check that G fails to satisfy the conclusion of Malcev’s Theorem,
and therefore has no finite-dimensional representation over any field.
To begin with, iterating the single relation leads to
a^{−k} b^{2^k} a^k = b^{3^k}
for all k ⩾ 1. Now assume π : G −→ G/H is a finite quotient of G, and let α = π(a), β = π(b). Taking k to be the order of α in the finite group G/H, we see that
β^{2^k − 3^k} = 1,
i.e., the order of β divides 2^k − 3^k. In particular, this order is coprime with 2, and this
implies that the map γ ↦ γ² is surjective on the finite cyclic group generated by β. Thus β is a power of β². Similarly, after conjugation by a, the element b1 = a⁻¹ba is such that β1 = π(b1) is a power of β1².
But now we observe that β1² = π(a⁻¹b²a) = π(b³) = β³. Hence β1 is a power of β³, and in particular it commutes with β, so that
π([b1, b]) = β1 β β1⁻¹ β⁻¹ = 1
and this is valid in any finite quotient.
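The key elementary point used here – squaring is surjective on a finite cyclic group of odd order – can be checked numerically; writing the cyclic group additively as Z/n, the squaring map γ ↦ γ² becomes x ↦ 2x (a sketch, not part of the text):

```python
# On a cyclic group of odd order n, written additively as Z/n,
# the squaring map corresponds to x -> 2x mod n.
# It is a bijection precisely because gcd(2, n) = 1.
for n in range(1, 100, 2):   # odd orders only
    doubled = {(2 * x) % n for x in range(n)}
    # squaring is surjective, so every element is a power of its square
    assert doubled == set(range(n))
```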
Now Britton’s Lemma [33, Th. 11.81] implies that the word
c = [b1, b] = b1 b b1⁻¹ b⁻¹ = a⁻¹bab a⁻¹b⁻¹a b⁻¹
is non-trivial in G. Thus c ∈ G is an element which is non-trivial, but becomes trivial in any finite quotient of G. This is the desired conclusion.
Remark 2.6.8. Concerning Malcev’s Theorem, the example to keep in mind is the
following: a group like SLd (Z) ⊂ GLd (C) is finitely generated and one can check that it
satisfies the desired condition simply by using the reduction maps
SLd (Z) −→ SLd (Z/pZ)
modulo primes. Indeed, for any fixed g ∈ SLd(Z), if g ≠ 1, we can find some prime p
larger than the absolute values of all coefficients of g, and then g is certainly non-trivial
modulo p. The proof of Malcev’s Theorem is based on similar ideas (though of course
one has to use more general rings than Z).
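A numerical illustration of this reduction argument (a hypothetical sketch; the helper name detecting_prime is ours, not standard):

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def detecting_prime(g):
    # Smallest prime p larger than all |entries| of g; reduction modulo p
    # cannot send a non-identity integer matrix g to the identity.
    p = max(abs(e) for row in g for e in row) + 1
    while not is_prime(p):
        p += 1
    return p

g = [[1, 5], [0, 1]]          # a non-identity element of SL2(Z)
p = detecting_prime(g)
g_mod_p = [[e % p for e in row] for row in g]
assert g_mod_p != [[1, 0], [0, 1]]   # g stays non-trivial in SL2(Z/pZ)
```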
Note that if one does not insist for finitely-generated counterexamples, it is easier to
find non-linear groups – for instance, “sufficiently big” abelian groups will work.
%m(g) = [ %m_1(g)    ?      . . .     ?      ]
        [    0     %m_2(g)  . . .     ?      ]
        [    .        .     . . .     .      ]
        [    0        0     . . .   %m_n(g)  ]
for some subset S ⊂ {1, . . . , n}. Indeed, this is already clear in the case of the trivial
group G, where any representation is semisimple, and a decomposition (2.34) is obtained
by any choice of a basis of E: if dim(E) ⩾ 2, the vector space E contains many more
subspaces than those spanned by finitely many “axes” spanned by basis vectors.
More generally, consider E = E1 ⊕ E1 , a direct sum of two copies of E1 . Then we can
also write
E = F1 ⊕ F2 ,
with
F1 = {(v1, v1) | v1 ∈ E1},  F2 = {(v1, −v1) | v1 ∈ E1}
(at least if 2 ≠ 0 in k...), and the maps
E1 −→ F1, v ↦ (v, v)   and   E1 −→ F2, v ↦ (v, −v)
are isomorphisms of representations.
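A small linear-algebra check of this example (an illustration only, taking E1 = R² so that E = R⁴):

```python
import numpy as np

# E = E1 (+) E1 with E1 = R^2, so E = R^4 with coordinates (v, w), v, w in E1.
# Diagonal and antidiagonal copies of E1:
F1 = np.array([[1, 0, 1, 0], [0, 1, 0, 1]], dtype=float)    # rows span {(v, v)}
F2 = np.array([[1, 0, -1, 0], [0, 1, 0, -1]], dtype=float)  # rows span {(v, -v)}

# Both are 2-dimensional, and together they span E, so E = F1 (+) F2:
# a decomposition different from the obvious E1 (+) E1.
assert np.linalg.matrix_rank(F1) == 2
assert np.linalg.matrix_rank(F2) == 2
assert np.linalg.matrix_rank(np.vstack([F1, F2])) == 4
```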
A weaker uniqueness is still valid: in any decomposition (2.34), the direct sum M (π)
of all subspaces Ei on which the action of G is isomorphic to a given irreducible repre-
sentation π, is independent of the decomposition.
Proposition 2.7.7 (Unicity of isotypic components). Let G be a group and let k be
a field. Let % : G −→ GL(E) be a semisimple k-representation of G.
(1) Fix an irreducible k-representation π of G. For any decomposition
(2.35) E = ⊕_{i∈I} Ei,
The right-hand side is therefore semisimple, and its irreducible components (the χ^{2i−m}, which are irreducible since one-dimensional) occur with multiplicity 1.
Now consider any non-zero G-stable subspace F ⊂ Vm ; it is also a subrepresentation of
the restriction of %m to T , obviously, and from what we just observed, Proposition 2.7.7,
(2), implies that the subspace F is a direct sum of some of the lines Cei corresponding
to the representations of T occurring in F . Thus there exists some non-empty subset
I ⊂ {0, . . . , m}
such that
(2.38) F = ⊕_{i∈I} C ei.
Now we “bootstrap” this information using the action of other elements of G than
those in T . Namely, fix some i ∈ I and consider the action of
(2.39)      u = [ 1  1 ]
                [ 0  1 ] ;
since F is stable under G, we know that %m(u)ei ∈ F, and this means
X^i (X + Y)^{m−i} ∈ F.
Expanding by the binomial theorem, we get
X^m + (m − i)X^{m−1}Y + · · · + (m − i)X^{i+1}Y^{m−i−1} + X^i Y^{m−i} = Σ_{j=i}^{m} (m−i choose j−i) ej ∈ F,
and comparison with (2.38) leads to the conclusion that all j ⩾ i are also in I. Similarly, considering the action of
[ 1  0 ]
[ 1  1 ]
we conclude that j ∈ I if j ⩽ i. Hence, we must have I = {0, . . . , m}, which means that
F = Vm . This gives the irreducibility.
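The binomial-expansion step above can be verified with a short computation (an illustration, encoding the basis vector e_i = X^i Y^{m−i} by the exponent i of X):

```python
from math import comb

def act_u(i, m):
    # u = [[1,1],[0,1]] sends e_i = X^i Y^(m-i) to X^i (X+Y)^(m-i);
    # return the coordinates of the result in the basis e_0, ..., e_m.
    coords = [0] * (m + 1)
    for k in range(m - i + 1):       # expand (X+Y)^(m-i)
        j = i + k                    # resulting monomial X^j Y^(m-j) = e_j
        coords[j] = comb(m - i, k)
    return coords

m = 4
for i in range(m + 1):
    coords = act_u(i, m)
    # The coefficient of e_j is binom(m-i, j-i) for j >= i, and 0 for j < i:
    assert coords == [comb(m - i, j - i) if j >= i else 0 for j in range(m + 1)]
    # In particular every e_j with j >= i occurs with a non-zero coefficient.
    assert all(coords[j] > 0 for j in range(i, m + 1))
```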
Exercise 2.7.11 (Irreducibility of %m restricted to a smaller group). Consider again
the representation %m of SL2 (C) for m > 0. We restrict it now to the subgroup SU2 (C)
of unitary matrices of size 2. The proof of irreducibility of %m in the previous example
used the element (2.39) and its transpose, which do not belong to SU2 (C). However,
%m restricted to SU2 (C) is still irreducible, as claimed in Theorem 2.6.1. Of course, this
gives another proof of the irreducibility of %m as a representation of SL2 (C).
(1) Show that a decomposition (2.38) still holds with I not empty for a non-zero
subspace F stable under SU2 (C).
(2) Let j be such that ej = X^j Y^{m−j} is in F. Show that for
g = [  cos θ   sin θ ]
    [ −sin θ   cos θ ]  ∈ SU2(C),
we have
%m(g)ej = Σ_{0⩽i⩽m} fi(θ) ei
where the fi are functions on [0, 2π] which are not identically zero. Deduce that F = Vm .
Exercise 2.7.12 (Another example of irreducible representation). The following ex-
ample will be used in Chapter 6. We consider k = C and we let V = Cn for n > 1
and E = End(V ). The group G = GLn (C) acts on V (by matrix multiplication!) and
therefore there is an associated representation on E, as in (2.15).
(1) Show that this representation on E is the conjugation action
g · A = gAg −1
and that the space E0 of endomorphisms A ∈ E with trace 0 is a subrepresentation of E.
We will now prove that E0 is an irreducible representation of G, and in fact that it
is already an irreducible representation of the subgroup SUn (C) of unitary matrices with
determinant 1.
(2) Let T ⊂ SUn (C) be the diagonal subgroup of SUn (C). Show that the restriction
of E0 to T decomposes as the direct sum of T -subrepresentations
E0 = H ⊕ ( ⊕_{i≠j} C Ei,j )
Proof. The first part is a consequence of the earlier version of Schur’s Lemma
(Lemma 2.2.4). For the second, consider a G-homomorphism Φ from π to itself. The
fact that k is algebraically closed and π is finite-dimensional implies that Φ, as a linear
map, has an eigenvalue, say λ ∈ k. But Φ − λId is then a G-homomorphism of π which
is not injective. By (2), the only possibility is that Φ − λId be identically zero, which is
the desired conclusion.
Finally we prove the converse when π is semisimple. Let E be the k-vector space on
which π acts; we can assume that dim E > 1, and then we let F ⊂ E be an irreducible
subrepresentation, and F1 a complementary subrepresentation, so that E = F ⊕ F1 . The
projector Φ : E −→ E onto F with kernel F1 is an element of HomG (π, π) (we saw
this explicitly in Lemma 2.2.2), and our assumption on π implies that it is a multiple of
the identity. Since it is non-zero, it is therefore equal to the identity, which means that
F1 = 0 and E = F is irreducible.
Exercise 2.7.14 (Schur’s Lemma and semisimplicity). The last statement in Schur’s
Lemma can be a very useful irreducibility criterion. However, one should not forget the
semisimplicity condition! Consider the representation % of
G = { [ a  b ]
      [ 0  d ] } ⊂ GL2(C)
on C2 by left-multiplication.
(1) What are its composition factors? Is it semisimple?
(2) Compute HomG (%, %) and conclude that the converse of Schur’s Lemma (part (3))
does not always hold when π is not assumed to be semisimple. What happens if instead
of G one uses its subgroup where a = d = 1?
A simple, but important, corollary of Schur’s Lemma is the following:
Corollary 2.7.15 (Abelian groups and central character). Let G be an abelian group,
k an algebraically closed field. Then any finite-dimensional irreducible representation of
G is of dimension 1, i.e., the finite-dimensional irreducible representations of G coincide
with the homomorphisms G −→ k × .
More generally, if G is any group, and % : G −→ GL(E) is a finite-dimensional
irreducible representation of G, there exists a one-dimensional representation ω of the
center Z(G) of G such that
%(z) = ω(z)IdE
for all z ∈ Z(G). This representation ω is called the central character of %.
Proof. (1) Let % be a finite-dimensional irreducible representation of G, acting on E.
Because G is abelian, any Φ = %(g) : E → E is in fact a homomorphism in HomG (%, %).
By Schur’s Lemma 2.7.13, there exists therefore λ(g) ∈ k such that %(g) = λ(g)Id. Then any one-dimensional subspace of E is invariant under all operators %(g), and by irreducibility, this means that E is equal to any such subspace, so that dim E = 1.
(2) Similarly, for G arbitrary, if z is an element of the center of G, we see that
%(z) commutes with all %(g), for any representation of G, i.e., %(z) ∈ EndG (%). If % is
irreducible, Schur’s Lemma implies that %(z) is multiplication by a scalar ω(z), and of course the map z ↦ ω(z) is a one-dimensional representation of Z(G).
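To illustrate the corollary on a concrete abelian group (a numerical check, not part of the text): the irreducible C-representations of Z/nZ are the n characters x ↦ e^{2πikx/n}, and the homomorphism property can be verified directly:

```python
import cmath

n = 12

def chi(k, x):
    # The k-th character of Z/nZ over C.
    return cmath.exp(2j * cmath.pi * k * x / n)

# Each chi(k, .) is a homomorphism Z/nZ -> C^x:
for k in range(n):
    for x in range(n):
        for y in range(n):
            assert abs(chi(k, (x + y) % n) - chi(k, x) * chi(k, y)) < 1e-9
```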
Remark 2.7.16 (Division algebras). Example 2.4.1 shows that this result does not
hold in general if the field is not necessarily algebraically closed.
If % is an irreducible (finite-dimensional) k-representation of G, the earlier version of Schur’s Lemma already shows that A = EndG(%), the space of G-endomorphisms of %,
has a remarkable structure: it is a subalgebra of the matrix algebra Endk (%) which is a
division algebra, i.e., any non-zero element of A has an inverse in A.
In the case of Example 2.4.1, the reader is invited to show explicitly that A is isomor-
phic to C, as an R-algebra.
Another easy and useful corollary is the following algebraic characterization of multiplicities of irreducible representations in a semisimple representation:
Corollary 2.7.17 (Multiplicities). Let G be a group and k an algebraically closed
field. If % is a finite-dimensional semisimple k-representation of G, then for any irre-
ducible k-representation π of G, we have
nπ (%) = dim HomG (%, π) = dim HomG (π, %),
where nπ (%) is the multiplicity of π as a summand in %.
Proof. If we express
% ≃ ⊕_i %i,
with each %i irreducible, then
HomG(π, %) ≃ ⊕_i HomG(π, %i)
for any irreducible representation π. This space has dimension equal to the number of indices i for which %i ≃ π, by Schur’s Lemma (i.e., by (2.40)), which is of course nπ(%).
A similar argument applies of course to HomG (%, π).
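A numerical illustration of this corollary (a hypothetical example): for the permutation representation of S3 on C³, the multiplicity of the trivial representation is dim HomG(1, %), which is the dimension of the space of fixed vectors; it equals 1, the number of orbits.

```python
import numpy as np

# Permutation representation of S3 on C^3, given on the two generators
# c = (0 1 2) (a 3-cycle) and t = (0 1) (a transposition).
def perm_matrix(sigma):
    M = np.zeros((3, 3))
    for x in range(3):
        M[sigma[x], x] = 1
    return M

c = perm_matrix((1, 2, 0))
t = perm_matrix((1, 0, 2))

# HomG(1, %) = fixed vectors {v : %(g)v = v for all g}; it suffices to
# impose the condition on generators.  Its dimension is 3 minus the rank
# of the stacked matrices %(g) - Id.
A = np.vstack([c - np.eye(3), t - np.eye(3)])
multiplicity = 3 - np.linalg.matrix_rank(A)
assert multiplicity == 1   # the trivial representation occurs exactly once
```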
If k is algebraically closed, we can also use Schur’s Lemma to give a nice description
of the isotypic component M (π) of a finite-dimensional semisimple representation % of
G (acting on E). To describe this, let Eπ be the space on which π acts; then there is a
natural k-linear map
Θ : HomG(Eπ, E) ⊗ Eπ −→ E
    Φ ⊗ v ↦ Φ(v).
The image of this map is, almost by definition, equal to the isotypic component
M (π) ⊂ E (because any non-zero Φ ∈ HomG (Eπ , E) is injective by Schur’s Lemma, so
that Φ(v) is in the subrepresentation Im(Φ) isomorphic to π.)
If E is finite-dimensional, we then see (by the previous corollary) that the dimensions
of M (π) and of the source
HomG (Eπ , E) ⊗ Eπ
coincide, and we conclude that Θ gives an isomorphism
HomG (Eπ , E) ⊗ Eπ ' M (π) ⊂ E.
Moreover, Θ is a G-homomorphism if we let G act trivially on HomG (Eπ , E) (which
is natural by (2.18)) and through π on Eπ . From this (picking, if needed, a basis of
HomG (Eπ , E)) we see that M (π) is isomorphic, as representation of G, to a direct sum
of d copies of π, where
d = dim HomG (Eπ , E).
As it happens, the injectivity of Θ can also be proved directly, and this leads to
the following useful result, where % is not assumed to be semisimple, but we can still
characterize the π-isotypic component using the same construction:
Lemma 2.7.18 (A formula for isotypic components). Let G be a group, and let k be
an algebraically closed field. If
% : G −→ GL(E)
is a finite-dimensional k-representation of G and
π : G −→ GL(Eπ )
is any irreducible k-representation, the map
Θ : HomG(Eπ, E) ⊗ Eπ −→ E
    Φ ⊗ v ↦ Φ(v)
is injective and its image is equal to the π-isotypic component ME (π), the sum of all
subrepresentations of E isomorphic to π. In particular,
M (π) ' (dim HomG (Eπ , E))π
as representation.
Note that, if % is not semisimple, π might also appear as composition factor outside
of M (π) (i.e., as a genuine quotient or subquotient.)
Proof. As before, it is clear that the image of Θ is equal to the subrepresentation
M (π) ⊂ E. It remains thus to show that Θ is injective, as this leads to the isomorphism
M (π) ' HomG (Eπ , E) ⊗ π,
from which the last step follows.
Let (Φj) be a basis of the space HomG(Eπ, E), so that any element of the tensor product is of the form
Σ_j Φj ⊗ vj
for some vj ∈ Eπ. Then we have
Θ( Σ_j Φj ⊗ vj ) = Σ_j Φj(vj) ∈ E,
and the injectivity of Θ is seen to be equivalent to saying that the spaces Fj = Im(Φj ) are
in direct sum in E. We prove this in a standard manner as follows: assume the contrary, and let J ⊂ I be a set of smallest order for which there is a relation
Σ_{j∈J} Φj(vj) = 0
with all vj non-zero. Fixing some ℓ ∈ J, this relation shows that Im(Φ_ℓ) intersects the sum of the Im(Φj), j ∈ J − {ℓ}, non-trivially, so that, by irreducibility, this intersection is in fact equal to Im(Φ_ℓ) (note that the Φj, j ≠ ℓ, are in direct sum because otherwise we could replace the set J by J − {ℓ}.) This means that Φ_ℓ belongs, in an obvious sense, to the space
⊕_{j≠ℓ} HomG(Eπ, Im(Φj))
which is spanned by the homomorphisms (Φj)_{j≠ℓ}. This is impossible however, since all the Φj’s are linearly independent.
The following addition is also useful:
Lemma 2.7.19. With assumptions and notation as in Lemma 2.7.18, if (πi ) is a
family of pairwise non-isomorphic irreducible representations of G, the isotypic subspaces
M (πi ) ⊂ E are in direct sum.
Proof. Indeed, for any fixed π in this family, the intersection of M(π) with the sum of all M(πi), πi ≠ π, is necessarily zero: it cannot contain any irreducible subrepresentation, since the possibilities coming from π are incompatible with those coming from the other πi.
2.7.3. Burnside’s irreducibility criterion and its generalizations, 1. We now
show how to use Schur’s Lemma to prove a result of Burnside which provides a frequently
useful irreducibility criterion for finite-dimensional representations, and we derive further
consequences along the same lines. In fact, we will prove this twice (and a third time in
Chapter 4 in the case of finite groups); in this section, we argue in the style of Burnside,
and below in Section 2.7.4, we will recover the same results in the style of Frobenius.
We will motivate the first result, Burnside’s criterion, from the following point of view:
given a finite-dimensional k-representation % of a group G, acting on the vector space E,
the image of % is a subset of the vector space Endk (E). We ask then “what are the linear
relations satisfied by these elements?” For instance, the block-diagonal shape (2.30) of a representation which is not irreducible clearly shows some relations: those that express that the matrices in the bases indicated have lower-left corner(s) equal to 0, for instance.
These are obvious. Are there others?
Theorem 2.7.20 (Burnside’s irreducibility criterion). Let k be an algebraically closed
field, G a group. A finite-dimensional k-representation
% : G −→ GL(E)
is irreducible if and only if the image of % satisfies no non-trivial linear relation in
Endk (E). Equivalently, % is irreducible if and only if the linear span of the image of % in
Endk (E) is equal to Endk (E).
The proof we give is a modern version of Burnside’s original argument. One can give much shorter proofs – the one in the next section is an example – but this one has the advantage of “exercising” the basic formalism of representation theory, and of being easy to motivate.
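Before the proof, here is a numerical illustration of the criterion (not from the text; it uses one concrete integral model of the 2-dimensional irreducible representation of S3):

```python
import numpy as np

# A model of the 2-dimensional irreducible representation of S3:
# r has order 3, s has order 2, and s r s = r^{-1}.
r = np.array([[0, -1], [1, -1]])
s = np.array([[0, 1], [1, 0]])
I = np.eye(2, dtype=int)
assert (np.linalg.matrix_power(r, 3) == I).all()
assert (s @ s == I).all()

# The six matrices in the image of S3:
elements = [I, r, r @ r, s, s @ r, s @ r @ r]

# Burnside: the representation is irreducible (over C), so the flattened
# matrices span all of M2, which has dimension 4.
span = np.array([g.flatten() for g in elements])
assert np.linalg.matrix_rank(span) == 4
```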
Proof. First of all, the two statements we give are equivalent by duality of finite-
dimensional k-vector spaces. More precisely, let V = Endk (E); then by “relations satisfied
by the image of %”, we mean the k-linear subspace
R = {φ ∈ V′ | ⟨φ, %(g)⟩ = 0 for all g ∈ G},
of V′, the linear dual of Endk(E). Then we are saying that R = 0 if and only if the image
of G spans V , which is part of duality theory.
The strategy of the proof is going to be the following:
(1) For some natural representation of G on V 0 , we show that R is a subrepresenta-
tion;
(2) We find an explicit decomposition of V 0 (with its G-action) as a direct sum of
irreducible representations, embedded in V 0 in a specific manner;
(3) Using this description, we can see what the possibilities for R are, and in partic-
ular that R = 0 if % is irreducible.
This strategy will also be used afterward to give a more general result of comparison
of distinct irreducible representations.
We let G act on V′ by the contragredient of the representation of G on V given by
g · T = %(g) ◦ T
for g ∈ G and T : E → E. Note that this corresponds to the action (2.19), and not the action (2.16). To check that R ⊂ V′ is a subrepresentation, we need simply note that if φ ∈ R and g ∈ G, then we have
⟨g · φ, %(h)⟩ = ⟨φ, g⁻¹ · %(h)⟩ = ⟨φ, %(g⁻¹h)⟩ = 0
for all h ∈ G, which means that g · φ is also in R.
From (2.20), we know that V – with the above action – is isomorphic to a direct sum
of dim % copies of %; hence V 0 is isomorphic to a direct sum of the same number of copies
of %̃, which we know to be irreducible (Lemma 2.2.13). It follows that any irreducible
subrepresentation π of R – which exists if R 6= 0 – must be itself isomorphic to %̃. (This
fact is clear from the Jordan-Hölder-Noether Theorem, but can be seen directly also by
considering the composites
pi : π ↪ R ↪ V′ ≃ ⊕_{i⩽dim %} Wi −→ Wi ≃ %̃,
where the Wi are subrepresentations of V′ isomorphic to %̃; each of these composites is in
HomG (π, %̃), and hence is either 0 or an isomorphism, by Schur’s Lemma. Not all pi can
be zero, if π 6= 0, hence π ' %̃.)
However, we claim that there is an isomorphism (of k-vector spaces)
E −→ HomG(E′, V′)
v ↦ αv
where αv : E′ → V′ is defined by
⟨αv(λ), T⟩ = ⟨λ, T(v)⟩
for λ ∈ E′ and T : E → E. If this is the case, then assuming that R ≠ 0, and hence that R contains a copy of %̃, means that, for some v ≠ 0, the image of αv is in R. But this implies that for all λ ∈ E′, and g ∈ G, we have
0 = ⟨αv(λ), %(g)⟩ = ⟨λ, %(g)v⟩
which is impossible even for a single g, since %(g)v ≠ 0.
Checking the claim is not very difficult; we leave it to the reader to verify that each αv is indeed a G-homomorphism from E′ to V′, and that v ↦ αv is k-linear. We then
observe that
dim HomG(E′, V′) = dim HomG(%̃, (dim E)%̃) = dim(E) dim HomG(%̃, %̃) = dim E
by Schur’s Lemma again (using the fact that k is algebraically closed). So the map will
be an isomorphism as soon as it is injective. However, αv = 0 means that
⟨λ, T(v)⟩ = 0
for all λ ∈ E′ and T ∈ V, and that only happens when v = 0 (take T to be the
identity).
Example 2.7.21. Consider again the representation % of Example 2.4.1 which is
irreducible over R but not absolutely irreducible. We see then that the linear span of the
image of % is the proper subalgebra
{ [ a   b ]
  [ −b  a ]  | a, b ∈ R }
in M2 (R). In particular, it does satisfy non-trivial relations like “the diagonal coefficients
are equal”.
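This can be confirmed numerically (an illustration only): sampling rotation matrices in the image of % and computing the rank of their span gives 2, the dimension of the subalgebra above, rather than 4.

```python
import numpy as np

# Matrices of the rotation representation of Example 2.4.1:
# %(theta) = [[cos, -sin], [sin, cos]], acting on R^2.
thetas = np.linspace(0, 2 * np.pi, 12, endpoint=False)
mats = [np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]]) for t in thetas]

# The flattened matrices span only the 2-dimensional algebra
# { [[a, b], [-b, a]] }, not all of M2(R): the representation is
# irreducible over R, yet its span satisfies non-trivial relations.
span = np.array([m.flatten() for m in mats])
assert np.linalg.matrix_rank(span) == 2
```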
We emphasize again the strategy we used, because it is in fact a common pattern
in applications of representation theory: one wishes to analyze a certain vector space
(here, the relation space R); this space is seen to be a subspace of a bigger one, on
which a group G acts, and then the space is seen to be a subrepresentation of this bigger
space; independent analysis of the latter is performed, e.g., a decomposition in irreducible
summands; and then one deduces a priori restrictions on the possibilities for the space
of original interest.
We now implement this again in a generalization of Burnside’s Theorem (which is due
to Frobenius and Schur). To motivate it, consider a finite-dimensional k-representation
of G
% : G −→ GL(E)
which is irreducible, with k algebraically closed. Burnside’s Theorem means, in particular,
that if we fix a basis (vi ) of E and express % in the basis (vi ), producing the homomorphism
%m : G −→ GL(dim(E), k)
the resulting “matrix coefficients” (%m_{i,j}(g)), seen as functions on G, are k-linearly independent. Indeed, if we denote by (λi) the dual basis of E′, we have
%m_{i,j}(g) = ⟨λi, %(g)vj⟩,
so that a relation
Σ_{i,j} αi,j %m_{i,j}(g) = 0
valid for all g, for some fixed αi,j ∈ k, means that the element φ of Endk(E)′ defined by
⟨φ, T⟩ = Σ_{i,j} αi,j ⟨λi, T(vj)⟩
Then the k-linear span of the elements %(g), for g ∈ G, in Endk (E) is equal to
(2.41) ⊕_i Endk(Ei).
(2) The matrix coefficients of finite-dimensional irreducible k-representations of G are
linearly independent, in the following sense: for any finite collection (%i) of pairwise non-isomorphic, finite-dimensional, irreducible k-representations of G, acting on Ei, for any choice (vi,j)_{1⩽j⩽dim Ei} of bases of Ei, and for the dual bases (λi,j), the family of functions
(f_{vi,j ,λi,k})_{i,j,k}
on G is k-linearly independent.
Note that there are
Σ_i (dim Ei)²
functions on G in this family, given by
G → k,  g ↦ ⟨λi,j , %i(g)vi,k⟩_{Ei},
for 1 ⩽ j, k ⩽ dim Ei.
Proof. It is easy to see first that (1) implies (2): writing down as above a matrix representation for % = ⊕_i %i in the direct sum of the given bases of Ei, a k-linear relation on the matrix coefficients implies one on the k-linear span of the image of %, but there is no non-trivial such relation on
⊕_i Endk(Ei).
To prove (1), we use the same strategy as in the proof of Burnside’s theorem, for the
representation % of G on E: let V = Endk (E) and
R = {φ ∈ V′ | ⟨φ, %(g)⟩ = 0 for all g ∈ G}
be the relation space. We will compute R and show that R = S where
(2.42) S = {φ ∈ V′ | ⟨φ, T⟩ = 0 for all T ∈ ⊕_i Endk(Ei)},
Proof. We first check (1), which states that M (%) is a canonical subspace of Ck (G).
Let E be the space on which % acts and let τ : G −→ GL(F ) be a k-representation
isomorphic to %, with the linear map
Φ : E −→ F
giving this isomorphism. Then for any w ∈ F and λ ∈ F′, writing w = Φ(v) for some v ∈ E, we have
f_{w,λ}(g) = ⟨λ, τ(g)w⟩_F = ⟨λ, τ(g)Φ(v)⟩_F = ⟨λ, Φ(%(g)v)⟩_F = ⟨ᵗΦ(λ), %(g)v⟩_E = f_{v,ᵗΦ(λ)}(g),
for all g ∈ G, showing that any matrix coefficient for τ is also one for %. By symmetry,
we see that M (%) and M (τ ) are equal subspaces of Ck (G).
We next check that M(%) ⊂ Ck(G) is indeed a subrepresentation: for v ∈ E (the space on which G acts), λ ∈ E′, and g ∈ G, we have
(reg(g)f_{v,λ})(x) = f_{v,λ}(xg) = ⟨λ, %(xg)v⟩ = f_{%(g)v,λ}(x).
But this formula says more: it also shows that, for a fixed λ ∈ E′, the linear map
Φλ : v ↦ f_{v,λ}
is an intertwining operator between % and M(%).
Fix a basis (vj ) of the space E on which % acts, and let (λj ) be the dual basis. By
construction, M (%) is spanned by the matrix-coefficients
f_{i,j} = f_{vi,λj}, 1 ⩽ i, j ⩽ dim(%)
and from the linear independence of matrix coefficients, these functions form in fact a basis
of the space M(%). In particular, we have dim M(%) = dim(%)². Now the intertwining
operator
Φ = ⊕_i Φ_{λi} : ⊕_i % −→ M(%)
is surjective (since its image contains each basis vector fi,j ), and both sides have the same
dimension. Hence it must be an isomorphism, which shows that M (%) is isomorphic to
(dim(%))% as a representation of G.
There only remains to check the last part. Let E ⊂ Ck (G) be a subrepresentation of
the regular representation which is isomorphic to %. To show that E ⊂ M (%), we will
check that the elements f ∈ E, which are functions on G, are all matrix coefficients of
E: let δ ∈ Ck(G)′ be the linear form defined by
δ(f) = f(1)
for f ∈ Ck(G). Consider the element δE ∈ E′ which is the restriction of δ to E. Then, for any function f ∈ E and x ∈ G, we have, by definition of the regular representation,
⟨δE, reg(x)f⟩_E = (reg(x)f)(1) = f(x).
The left-hand side (as a function of x) is a matrix coefficient for % (since reg on E is
isomorphic to %), and hence we see that f ∈ M (%).
The next corollary will be improved in the chapter on representations of finite groups.
We state it here because it is the first a priori restriction we have found on irreducible
representations for certain groups:
Corollary 2.7.27. Let G be a finite group and let k be an algebraically closed field.
There are only finitely many irreducible k-representations of G up to isomorphism, and
they satisfy
Σ_% (dim %)² ⩽ |G|
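For instance (a standard fact, not proved at this point in the text), the irreducible C-representations of S3 have dimensions 1, 1 and 2 – the trivial, sign, and standard representations – and the inequality is in fact an equality:

```python
# Dimensions of the irreducible C-representations of S3:
# trivial, sign, and the 2-dimensional standard representation.
dims = [1, 1, 2]
order_S3 = 6
assert sum(d * d for d in dims) == order_S3   # the bound is an equality here
```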
is semisimple: conversely, if Ck (G) is semisimple, by Lemma 2.2.8 there exists a stable
subspace F such that
Ck(G) = F ⊕ ( ⊕_% M(%) ),
In the next section, we will derive further consequences of the linear independence
of matrix coefficients, related to characters of finite-dimensional representations. Before,
as another application of Schur’s Lemma and its corollaries, we can now prove Proposi-
tion 2.3.17 about irreducible representations of a direct product G = G1 × G2 .
Proof of Proposition 2.3.17. Recall that we are considering a field k and two
groups G1 and G2 and want to prove that all finite-dimensional irreducible k-representations of G = G1 × G2 are of the form % ≃ %1 ⊠ %2 for some irreducible representations %i of Gi (unique up to isomorphism). We will prove the result here when k is algebraically
closed, and leave the general case to an exercise for the reader below.
To begin with, if %1 and %2 are finite-dimensional irreducible representations of G1
and G2, the irreducibility of % = %1 ⊠ %2 follows from Burnside’s irreducibility criterion:
since the %1 (g1 ) and %2 (g2 ), for gi ∈ Gi , span the k-linear endomorphism spaces of their
respective spaces, it follows by elementary linear algebra that the %1 (g1 ) ⊗ %2 (g2 ) also
span the endomorphism space of the tensor product. (Here we use the fact that k is
algebraically closed.)
Thus what matters is to prove the converse. Let therefore
% : G1 × G2 −→ GL(E)
be a finite-dimensional irreducible representation of G; we must find irreducible representations %i of Gi such that
%1 ⊠ %2 ≃ %.
and, by fixing a basis (vj) of the space of %2′, we see that the restriction to G1 of % = %1 ⊠ %2′ is the direct sum
⊕_j E1 ⊗ kvj ≃ (dim %2′) E1,
so that the dimension of E2 is given by
dim E2 = (dim %2′) dim HomG1(%1, %1) = dim %2′
by the last part of Schur’s Lemma (we use again the fact that k is algebraically closed).
Finally, coming back to the general situation, note that this last observation on the
restriction of %1 ⊠ %2 to G1 (and the analogue for G2) show that %1 and %2 are indeed
unique up to isomorphism, by the Jordan-Hölder-Noether Theorem.
Exercise 2.7.29. In this exercise we prove Proposition 2.3.17 when k is not neces-
sarily algebraically closed.
(1) Show that the second part (every finite-dimensional irreducible k-representation
of G = G1 × G2 is an external tensor product) will follow from the first. [Hint: Follow
the same argument as in the algebraically closed case.]
(2) Consider two irreducible finite-dimensional representations %i of Gi , acting on Ei ,
and let % = %1 %2 . Show that if F 6= 0 is an irreducible subrepresentation of E1 ⊗ E2
(under the action of %) there exists a non-zero intertwiner
E1 ⊠ F2 −→ F
for some irreducible representation space F2 of G2 . [Hint: Use the ideas of the second
part in the algebraically closed case.]
(3) Conclude by showing that necessarily F2 ≃ E2, and F ≃ E1 ⊠ E2. [Hint: Show that E2 is a composition factor of F, and note that F2 is the only possible composition factor for the restriction of E1 ⊠ F2 to G2.]
2.7.4. Burnside’s theorem and its generalizations, 2. As promised, we now
explain how to recover the results of the previous section in a style closer (maybe) to
that of Frobenius. Even for readers who have understood the arguments already used,
this may be useful. In fact, the proofs are simpler, but not so well motivated. The
viewpoint is to start this time by determining directly the isotypic components of the
regular representation.
Proposition 2.7.30 (Isotypic component of the regular representation). Let k be
an algebraically closed field and G a group. For any finite-dimensional irreducible k-
representation
% : G −→ GL(E)
of G, the %-isotypic component M (%) ⊂ Ck (G) of the regular representation is isomorphic
to a direct sum of dim(%) copies of %, and is spanned by the matrix coefficients of %.
Proof. We start with the realization (2.22) of Ck (G) as the induced representation
Ind_1^G (1),
and then apply Frobenius Reciprocity (2.23), which gives us linear isomorphisms
HomG (%, Ck (G)) = HomG (%, Ind_1^G (1)) ≅ Hom_1 (Res_1^G (%), 1) = Homk (E, k) = E′ .
Thus the isotypic component M (%), which is equal to the image of the injective G-
homomorphism
HomG (%, Ck (G)) ⊗ E −→ Ck (G)
Φ⊗v 7→ Φ(v)
(see Lemma 2.7.18) is isomorphic to E′ ⊗ E as a representation of G, where E′ has the
trivial action of G, and E the action under %. This is the same as the direct sum of
dim(E) copies of %.
We now show, also directly, that the image of the linear map above is spanned by
matrix coefficients. Indeed, given λ ∈ E′ , the corresponding homomorphism
Φλ : E −→ Ck (G)
in HomG (%, Ck (G)) is given, according to the recipe in (2.26), by
Φλ (v) = (x 7→ λ(%(x)v)) = fv,λ ,
i.e., its values are indeed matrix coefficients.
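To see the proposition at work in a concrete case, here is a small numerical sketch (our own illustration, not from the text): we realize the two-dimensional irreducible representation of the symmetric group S3 by rotation and reflection matrices of an equilateral triangle (an ad hoc choice), and check that its four matrix coefficients, viewed as functions on the six group elements, are linearly independent, so that the component they span has dimension dim(%)² = 4.

```python
import math

# Hypothetical realization (ours): the 2-dimensional irreducible
# representation of S3 as the symmetry group of an equilateral triangle.
c, s = math.cos(2*math.pi/3), math.sin(2*math.pi/3)
r = [[c, -s], [s, c]]          # rotation by 120 degrees
f = [[1.0, 0.0], [0.0, -1.0]]  # a reflection

def mul(a, b):
    return [[sum(a[i][k]*b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# generate the whole group of 6 matrices by closure under the generators
group = [[[1.0, 0.0], [0.0, 1.0]]]
changed = True
while changed:
    changed = False
    for a in list(group):
        for gen in (r, f):
            m = mul(a, gen)
            if not any(all(abs(m[i][j] - b[i][j]) < 1e-9 for i in range(2)
                           for j in range(2)) for b in group):
                group.append(m)
                changed = True
assert len(group) == 6

# the four matrix coefficients f_{ij}: g -> rho(g)_{ij}, as vectors of length 6
coeffs = [[g[i][j] for g in group] for i in range(2) for j in range(2)]

def rank(rows, tol=1e-9):
    # Gaussian elimination with a numerical tolerance
    rows = [row[:] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if abs(rows[i][col]) > tol), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and abs(rows[i][col]) > tol:
                fac = rows[i][col]/rows[rk][col]
                rows[i] = [x - fac*y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

# the matrix coefficients are independent: dim M(rho) = (dim rho)^2 = 4
print(rank(coeffs))  # -> 4
```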
Re-proof of Theorem 2.7.24. First of all, we recover Burnside’s irreducibility
criterion. Consider an irreducible finite-dimensional representation % : G −→ GL(E).
We know from the proposition that
dim M (%) = dim(%)² .
Since M (%) is spanned by the dim(%)² matrix coefficients
f_{vi ,λj} , 1 ≤ i, j ≤ dim %
associated with any basis (vi ) of E and the dual basis (λj ) of E′ , these must be independent. But then if we consider the matrix representation %m of % in the basis (vi ), that
means that there is no non-trivial linear relation between the coefficients of the %m (g),
and hence – by duality – the span of those matrices must be the space of all matrices of
size dim %, which means that the linear span of the %(g) in End(E) is equal to End(E).
This recovers Burnside’s criterion (since the converse direction was immediate).
Now consider finitely many irreducible representations (%i ) which are pairwise non-
isomorphic. The subspaces M (%i ) ⊂ Ck (G) are in direct sum (Lemma 2.7.19: the inter-
section of any one with the sum of the others is a subrepresentation where no composition
factor is permitted, hence it is zero), and this means that the matrix coefficients of the
%i must be linearly independent – in the sense of the statement of Theorem 2.7.24.
2.7.5. Characters of finite-dimensional representations. As another consequence of Theorem 2.7.24, we see that if we are given one matrix coefficient of each of two finite-dimensional irreducible k-representations %1 and %2 of G, these two functions are certain to be distinct whenever %1 and %2 are not isomorphic. The converse is not true, since even a single representation typically has many matrix coefficients.
However, one can combine some of them in such a way that one obtains a function
which only depends on the representation up to isomorphism, and which characterizes
(finite-dimensional) irreducible representations, up to isomorphism.
Definition 2.7.31 (Characters). Let G be a group and k a field.
(1) A character of G over k, or k-character of G, is any function χ : G −→ k of the
type
χ(g) = Tr %(g)
where % is a finite-dimensional k-representation of G. One also says that χ is the character
of %.
(2) An irreducible character of G over k is a character associated to an irreducible
k-representation of G.
We will typically write χ% for the character of a given representation %. Then we have:
Corollary 2.7.32. Let G be a group and let k be an algebraically closed field.
(1) Two irreducible finite-dimensional representations %1 and %2 of G are isomorphic
if and only if their characters are equal as functions on G. More generally, the characters
of the irreducible finite-dimensional representations of G, up to isomorphism, are linearly
independent in Ck (G).
(2) If k is of characteristic zero, then two finite-dimensional semisimple representa-
tions of G are isomorphic if and only if their characters are equal.
Proof. If two representations (irreducible or not, but finite-dimensional) %1 and %2
are isomorphic, with Φ : E1 → E2 giving this isomorphism, we have
Φ ◦ %1 (g) ◦ Φ−1 = %2 (g)
for all g ∈ G, and hence
Tr(%1 (g)) = Tr(%2 (g))
so that their characters are equal.
To prove (1), we note simply that the character Tr %(g) is a sum of (diagonal) matrix
coefficients: if (vi ) is a basis of the space of %, with dual basis (λi ), we have
χ% = Σi f_{vi ,λi}
(i.e.,
Tr %(g) = Σi ⟨λi , %(g)vi ⟩
for all g ∈ G). Hence an equality
χ%1 = χ%2
is a linear relation between certain matrix coefficients of %1 and %2 respectively. If %1 and
%2 are irreducible but not isomorphic, it follows that such a relation is impossible.
Similarly, expanding in terms of matrix coefficients, we see that any linear relation
Σπ α(π)χπ = 0
The two statements in (2.44) are equivalent, and state that the value of a character at
some x ∈ G only depends on the conjugacy class of x in G. Functions with this property
are usually called class functions on G, or central functions.
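Both facts, that characters are class functions and that distinct irreducible characters are linearly independent, can be verified by hand for S3; the following sketch (our own, with the three irreducible characters of S3 – trivial, sign, and standard – assumed known) does exactly that.

```python
from itertools import permutations

# Our illustration: the three irreducible characters of S3 are class
# functions and linearly independent, as Corollary 2.7.32 asserts.
G = list(permutations(range(3)))

def compose(p, q):          # (p*q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0]*3
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

def sign(p):
    s = 1
    for i in range(3):
        for j in range(i+1, 3):
            if p[i] > p[j]:
                s = -s
    return s

chi_triv = [1 for g in G]
chi_sign = [sign(g) for g in G]
# character of the standard representation: fixed points minus 1
chi_std  = [sum(g[i] == i for i in range(3)) - 1 for g in G]
chars = [chi_triv, chi_sign, chi_std]

# class functions: chi(x g x^{-1}) = chi(g) for all x, g
for chi in chars:
    for g in G:
        for x in G:
            conj = compose(compose(x, g), inverse(x))
            assert chi[G.index(conj)] == chi[G.index(g)]

# linear independence: a 3x3 minor of the value matrix is invertible,
# taking the identity, a transposition and a 3-cycle as columns
e, t, c = (0, 1, 2), (1, 0, 2), (1, 2, 0)
M = [[chi[G.index(g)] for g in (e, t, c)] for chi in chars]
det = (M[0][0]*(M[1][1]*M[2][2] - M[1][2]*M[2][1])
     - M[0][1]*(M[1][0]*M[2][2] - M[1][2]*M[2][0])
     + M[0][2]*(M[1][0]*M[2][1] - M[1][1]*M[2][0]))
assert det != 0
print("the three characters are linearly independent")
```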
Note that we restrict to a finite-index subgroup for induction because otherwise the
dimension of the induced representation is not finite.
The character formula makes sense because the property that sgs−1 be in H, and the
value of the trace of %(sgs−1 ), are both unchanged if s is replaced by any other element
of the coset Hs. It may also be useful to observe that one cannot use the invariance of
χ% under conjugation to remove the s in χ% (sgs−1 ), since % is only a representation of H,
and s ∉ H (except for the coset H itself).
Note also that by taking g = 1, it leads – as it should – to the formula
dim Ind_H^G (%) = [G : H] dim %
of Proposition 2.3.8, if the field k has characteristic zero (in which case the equality in k
gives the same in Z).
Proof. The first formulas are direct consequences of the definitions and the prop-
erties of the trace of linear maps. Similarly, the formula for the restriction is clear, and
only the case of induction requires proof.
Let E be the space on which % acts, and let F be the space
F = {ϕ : G → E | ϕ(hx) = %(h)ϕ(x) for h ∈ H, x ∈ G}
of the induced representation. We will compute the trace by decomposing F (as a linear
space) conveniently, much as was done in the proof of Proposition 2.3.8 when computing
the dimension of F (which is also, of course, the value of the character at g = 1). First
of all, for any s ∈ G, let Fs ⊂ F be the subspace of those ϕ ∈ F which vanish for all
x ∉ Hs; thus Fs only depends on the coset Hs ∈ H\G. We then have a direct sum
decomposition
F = ⊕_{s∈H\G} Fs
(where the components of a given ϕ are just obtained by taking the restrictions of ϕ to
the cosets Hs and extending these by zero outside Hs).
Now, for a fixed g ∈ G, the action of Ind(g) on F is given, just as for the regular
representation, by
(Ind(g)ϕ)(x) = ϕ(xg).
It follows from this that Ind(g) permutes the subspaces Fs , and more precisely that
Ind(g) sends Fs to Fsg−1 . In other words, in terms of the direct sum decomposition
above, the action of Ind(g) is a “block permutation matrix”. Taking the trace (it helps
to visualize this as a permutation matrix), we see that it is the sum of the traces of the
maps induced by Ind(g) over those cosets s ∈ H\G for which Hsg−1 = Hs, i.e., over
those s for which sgs−1 ∈ H.
Now for any s ∈ G with sgs−1 ∈ H, we compute the trace of the linear map
πs,g : Fs −→ Fs
induced by Ind(g). To do this, we use the fact – already used in Proposition 2.3.8 – that
Fs is isomorphic to E. More precisely, there are reciprocal k-linear isomorphisms
α : E −→ Fs and β : Fs −→ E
such that
β(ϕ) = ϕ(s)
on the one hand, and α(v) is the element of Fs mapping hs to %(h)v (and all x ∉ Hs
to 0). The fact that α and β are inverses of each other is left for the reader to check (it is
contained in the proof of Proposition 2.3.8).
Thus the trace of Ind(g) is the sum, over those s, of the trace of the linear map on E
given by β ◦ πs,g ◦ α. But – and this shouldn’t be much of a surprise – this map is simply
given by
%(sgs−1 ) : E −→ E,
with trace χ% (sgs−1 ) (which is defined because sgs−1 ∈ H, of course). The stated formula
follows by summing over the relevant s.
We check the claim: given v ∈ E, and ϕ = α(v), we have
(β ◦ πs,g ◦ α)(v) = (Ind(g)ϕ)(s) = ϕ(sg) = ϕ((sgs−1 )s) = %(sgs−1 )ϕ(s) = %(sgs−1 )v,
using the definitions of α and β.
Example 2.7.36. The formula for an induced character may look strange or complicated at first. In particular, it is probably not clear just by looking at the right-hand side
that it is the character of a representation of G! However, we will see, here and especially
in Chapter 4, that the formula is quite flexible and much nicer than it may seem.
(1) Example 2.7.34 is also a special case of the formula (2.45) for the character of
an induced representation. Indeed, we know that the regular representation reg of a
group G (over a field k) is the same as the induced representation Ind_1^G (1) of the trivial
representation of the trivial subgroup {1} of G. Hence
χreg (g) = Σ_{s∈G, sgs−1 =1} 1,
which leads to the formula (2.43), since the conjugacy class of 1 is reduced to {1}.
(2) Generalizing this, let % be the permutation representation (Section 2.6.2) associ-
ated with the action of G on a finite set X. Then we have
χ% (g) = |{x ∈ X | g · x = x}|,
i.e., the character value at g is the number of fixed points of g acting on X. Indeed,
in the basis (ex ) of the space of %, each %m (g) is a permutation matrix, and its trace is
the number of non-zero (equal to 1) diagonal entries. These correspond to those x ∈ X
where %(g)x = egx is equal to ex , i.e., to the fixed points of g.
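This fixed-point description is easy to test numerically; the sketch below (our own check, not from the text) compares, for every permutation matrix of S4 acting on four points, the trace with the number of fixed points.

```python
from itertools import permutations

# Sanity check (ours) of Example 2.7.36(2): for the permutation
# representation, the character value at g is the number of fixed points.
n = 4
for g in permutations(range(n)):
    # permutation matrix: column x has its 1 in row g(x), i.e. e_x -> e_{g(x)}
    P = [[1 if g[x] == y else 0 for x in range(n)] for y in range(n)]
    trace = sum(P[i][i] for i in range(n))
    fixed = sum(g[x] == x for x in range(n))
    assert trace == fixed
print("checked all 24 permutations of S4")
```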
(3) Let H / G be a normal subgroup of G of finite index, and % a finite-dimensional
representation of H. Let π = IndG H (%). Then
χπ (g) = 0
for g ∉ H, since the condition sgs−1 ∈ H means g ∈ s−1 Hs = H. For h ∈ H, on the
other hand, we have shs−1 ∈ H for all s, and thus
χπ (h) = Σ_{s∈G/H} χ% (shs−1 ).
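As a concrete check of this formula (our own example): take G = S3, H = A3 its normal subgroup of index 2, and % the one-dimensional character of A3 sending a fixed 3-cycle to a primitive cube root of unity. The induced character then vanishes off H and takes the value −1 on the 3-cycles, matching the character of the two-dimensional irreducible representation of S3.

```python
import cmath
from itertools import permutations

# Our numerical check of the induced-character formula of Example 2.7.36(3):
# G = S3, H = A3 normal of index 2, rho sends the 3-cycle c to omega.
G = list(permutations(range(3)))

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0]*3
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

def sign(p):
    return 1 if p in [(0, 1, 2), (1, 2, 0), (2, 0, 1)] else -1

omega = cmath.exp(2j*cmath.pi/3)
c = (1, 2, 0)                               # a fixed 3-cycle generating A3
chi_rho = {(0, 1, 2): 1, c: omega, compose(c, c): omega**2}

def chi_pi(g):
    # sum over coset representatives of H\G (the identity and a transposition);
    # since H is normal, s g s^{-1} lies in H exactly when g does
    total = 0
    for s in [(0, 1, 2), (1, 0, 2)]:
        conj = compose(compose(s, g), inverse(s))
        if conj in chi_rho:
            total += chi_rho[conj]
    return total

assert chi_pi((0, 1, 2)) == 2               # dim Ind = [G:H] dim rho = 2
for g in G:
    if sign(g) == -1:
        assert chi_pi(g) == 0               # vanishes off the normal subgroup
assert abs(chi_pi(c) - (-1)) < 1e-12        # omega + omega^2 = -1 on 3-cycles
print("chi_pi(identity) =", chi_pi((0, 1, 2)))  # -> 2
```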
Exercise 2.7.37. Show directly, using the character formula, that if H is a subgroup
of finite index in G, the characters of the two sides of the projection formula
Ind_H^G (%2 ⊗ Res_H^G (%1 )) ≅ Ind_H^G (%2 ) ⊗ %1
are equal.
2.8. Conclusion
We have now built, in some sense, the basic foundations of representation theory, try-
ing to work in the greatest generality. Interestingly, there are some results of comparable
generality which the techniques we have used (which involve basic abstract algebra) are
– as far as we know! – unable to prove. These require the theory of algebraic groups. A
very short introduction, with a discussion of some of the results it leads to, can be found
in the beginning of Chapter 7.
Before closing this chapter, we can use the vocabulary we have built to ask: What are
the fundamental questions of representation theory? The following are maybe the two
most obvious ones:
• (Classification) For a group G, one may want to classify all its irreducible rep-
resentations (say over the complex numbers), or all representations of a certain
type. This is possible in a number of very important cases, and is often of
tremendous importance for applications; one should think here of a group G
which is fairly well-understood from a group-theoretic point of view.
• (Decomposition) Given a group G again, and a natural representation E of G
(say over C), one may ask to find explicitly the irreducible components of E,
either as summands if E is semisimple, or simply as Jordan-Hölder factors.
Both are crucial to the so-called “Langlands Program” in modern number theory.
CHAPTER 3
Variants
We discuss here some of the variants of the definition of representations that we have
used. Many of them are very important topics in their own right, but we will only come
back, usually rather briefly, to some of them.
respectively (with only finitely many non-zero coefficients, if G is infinite), and their
product is
ab = Σ_{g,h∈G} λg µh [g][h] = Σ_{x∈G} ( Σ_{gh=x} λg µh ) [x].
Similarly, multiplying s on the left or on the right with a basis vector [x], x ∈ G, we get
(3.2)   s[x] = Σ_{g∈G} [g][x] = Σ_{g∈G} [gx] = Σ_{h∈G} [h] = s,   and similarly [x]s = s.
By linearity, we see that sa = as for all a ∈ k(G); thus the element s is in the center
of the algebra k(G).
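The computation (3.2) can be mimicked in a toy model of the group algebra; the sketch below (our own, for G = Z/3Z over Q, with elements stored as dictionaries mapping group elements to coefficients) checks that s = Σ_g [g] absorbs every basis vector and commutes with an arbitrary element.

```python
from fractions import Fraction

# A minimal model (ours) of the group algebra k(G) for G = Z/3Z over Q:
# the product is the convolution ab = sum_x (sum_{gh=x} lambda_g mu_h) [x].
n = 3

def mult(a, b):
    out = {x: Fraction(0) for x in range(n)}
    for g, lg in a.items():
        for h, mh in b.items():
            out[(g + h) % n] += lg * mh
    return out

s = {g: Fraction(1) for g in range(n)}      # s = sum over g of [g]
a = {0: Fraction(2), 1: Fraction(-1, 3)}    # an arbitrary element of k(G)

# s[x] = s = [x]s for every basis vector, hence s is central by linearity
for x in range(n):
    basis = {x: Fraction(1)}
    assert mult(s, basis) == s and mult(basis, s) == s
assert mult(s, a) == mult(a, s)
print(mult(s, a))  # every coefficient equals 2 - 1/3 = 5/3
```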
Definition 3.1.2 (Group algebra). Let k be a field and G a group. The algebra k(G)
defined above is called the group algebra of G over k.
The characteristic algebraic property of the group algebra is the following:
Proposition 3.1.3 (Representations as modules over the group algebra). Let G be
a group and k a field. If
% : G −→ GL(E)
is a k-representation of G, then E has a structure of k(G)-module defined by extending
% by linearity, i.e.,
( Σ_{g∈G} λg [g] ) · v = Σ_{g∈G} λg %(g)v.
Conversely, any k(G)-module E inherits a representation of G by restricting the “mul-
tiplication by a” maps to the basis vectors [g], i.e., by putting
%(g) = (v 7→ [g] · v).
Sometimes, to avoid dropping the reference to the representation %, one writes
%(a)v = Σ_{g∈G} λg %(g)v
for an element a ∈ k(G) as above. Thus %(a) becomes an element in Homk (E, E).
Here are some reasons why this approach can be very useful:
• It gives access to all the terminology and results of the theory of algebras; in
particular, for any ring A, the notions of sums, intersections, direct sums, etc, of
A-modules, are well-defined and well-known. For A = k(G), they correspond to
the definitions already given in the previous chapter for linear representations.
More sophisticated constructions are however more natural in the context of
algebras. For instance, one may consider the ideals of k(G) and their properties,
which is not something so natural at the level of the representations themselves.
• The k(G)-modules parametrize, in the almost tautological way we have de-
scribed, the k-representations of G. It may happen that one wishes to con-
centrate attention on a special class C of representations, characterized by some
property. It happens, but very rarely, that these representations correspond in
a natural way to all (or some of) the representations of another group GC (an
example is to consider for C the class of one-dimensional representations, which
correspond to those of the abelianization G/[G, G], as in Proposition 2.6.6), but
it may happen more often that there is a natural k-algebra AC such that its
modules correspond precisely (and “naturally”) to the representations in C.
Partly for reasons of personal habit (or taste), we won’t exploit the group algebra
systematically in this book. This can be justified by the fact that it is not absolutely
necessary at the level we work. But we will say a bit more, e.g., in Section 4.3.6, and
Exercise 4.3.29 describes a property of representations with cyclic vectors which is much
more natural from the point of view of the group algebra.
Exercise 3.1.4 (Representations with a fixed vector under a subgroup). Let G be a
finite group. Consider a subgroup H ⊂ G; note that k(H) is naturally a subalgebra of
k(G).
(1) Show that
𝓗 = {a ∈ k(G) | hah′ = a for all h, h′ ∈ H}
is a subalgebra of k(G), and that it is spanned as a k-vector space by the characteristic
functions of the double cosets
HxH = {hxh′ | h, h′ ∈ H} ⊂ G.
(2) Show that if % : G −→ GL(E) is a representation of G, the subspace E^H of H-invariant
vectors is stable under multiplication by 𝓗, i.e., that E^H is an 𝓗-module.
(3) Show that if Φ ∈ HomG (E, F ), the restriction of Φ to E^H is an 𝓗-linear map from
E^H to F^H .
defined for T ∈ L(E) (in other words, it is a topology of uniform convergence on bounded
subsets of E; the equality of those three quantities follows from linearity). But it turns
out that asking for homomorphisms to BGL(E) which are continuous for this topology
does not lead to a good theory: there are “too few” representations in that case, as we
will illustrate below. The “correct” definition is the following:
Definition 3.2.1 (Continuous representation). Let G be a topological group and
k = R or C. Let E be a Banach space.
(1) A continuous representation, often simply called a representation, of G on E is a
homomorphism
% : G −→ BGL(E)
such that the corresponding action map
G × E −→ E,   (g, v) 7→ %(g)v,
is continuous, where G × E has the product topology.
(2) If %1 and %2 are continuous representations of G, acting on E1 and E2 respectively,
a homomorphism
Φ : %1 −→ %2
is a continuous linear map E1 −→ E2 such that
Φ(%1 (g)v) = %2 (g)Φ(v)
for all g ∈ G and v ∈ E1 .
If dim(E) < +∞, note that this is indeed equivalent to asking that % be continuous,
where BGL(E) has the topology coming from its isomorphism with GLn (k), n = dim(E).
Example 3.2.2 (Too many representations). Consider G = R and k = C. If we
consider simply one-dimensional representations
R → C× ,
with no continuity assumptions at all, there are “too many” for most purposes. Indeed,
as an abelian group, R is a vector space over Q; selecting a basis2 (vi )i∈I of R over Q,
we obtain zillions of 1-dimensional representations by deciding simply that
vi 7→ zi
1 It is natural, for instance, because L(E) becomes a Banach space for this norm.
2 Necessarily uncountable, and not measurable as a subset of R.
for some arbitrary zi ∈ C× , and extending these to homomorphisms defined on all of R,
which is possible (in many ways) because C× is a divisible abelian group.
But as soon as we impose some regularity on the homomorphisms R → C× , we obtain
a much better understanding:
Proposition 3.2.3 (Continuous characters of R). Let χ : R → C× be a continuous
group homomorphism. Then there exists a unique s ∈ C such that
χ(x) = e^{sx}
for all x ∈ R.
In fact, one can show that it is enough to ask that the homomorphism be measurable.
The intuitive meaning of this is that, if one can “write down” a homomorphism R →
C× in any concrete way, or using standard limiting processes, it will automatically be
continuous, and hence be one of the ones above.
The proof we give uses differential calculus, but one can give completely elementary
arguments (as in [1, Example A.2.5]).
Proof. If χ is differentiable, and not merely continuous, this can be done very quickly
using differential equations: we have
χ′ (x) = lim_{h→0} (χ(x + h) − χ(x))/h = χ(x) lim_{h→0} (χ(h) − 1)/h = χ(x)χ′ (0)
for all x ∈ R. Denoting s = χ′ (0), any solution of the differential equation y′ = sy is
given by
y(x) = αe^{sx}
for some parameter α ∈ C. In our case, putting x = 0 leads to 1 = χ(0) = α, and hence
we get the result.
We now use a trick to show that any continuous homomorphism χ : R −→ C× is in
fact differentiable. We define the primitive
Ψ(x) = ∫_0^x χ(u) du
(note that this is s−1 (e^{sx} − 1) if χ(x) = e^{sx} ; this formula explains the manipulations to
come), which is a differentiable function on R. Then we write
Ψ(x + t) = ∫_0^{x+t} χ(u) du = ∫_0^x χ(u) du + ∫_x^{x+t} χ(u) du
= Ψ(x) + ∫_0^t χ(x + u) du = Ψ(x) + χ(x)Ψ(t)
for all real numbers x and t. The function Ψ cannot be identically zero (its derivative χ
would then also be zero, which is not the case), so picking a fixed t0 ∈ R with Ψ(t0 ) ≠ 0,
we obtain
χ(x) = (Ψ(x + t0 ) − Ψ(x)) / Ψ(t0 ),
which is a differentiable function! Thus our first argument shows that χ(x) = e^{sx} for all
x, with s = χ′ (0).
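The key functional equation Ψ(x + t) = Ψ(x) + χ(x)Ψ(t), and the way it recovers χ as a quotient of differentiable functions, can be checked numerically for a concrete χ(x) = e^{sx} (a sanity check of ours, not a proof):

```python
import cmath

# Our numerical check of the functional equation used in the proof:
# with chi(x) = e^{sx}, its primitive vanishing at 0 is Psi(x) = (e^{sx}-1)/s,
# and Psi(x + t) = Psi(x) + chi(x) Psi(t).
s = 0.3 + 1.7j
chi = lambda x: cmath.exp(s*x)
Psi = lambda x: (cmath.exp(s*x) - 1)/s

for x in (-1.2, 0.0, 0.7):
    for t in (0.4, 2.5):
        assert abs(Psi(x + t) - (Psi(x) + chi(x)*Psi(t))) < 1e-12

# and the quotient (Psi(x + t0) - Psi(x))/Psi(t0) recovers chi(x)
t0 = 1.0
assert abs((Psi(0.7 + t0) - Psi(0.7))/Psi(t0) - chi(0.7)) < 1e-12
print("functional equation verified")
```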
Example 3.2.4 (Too few representations). Let G be the compact group R/Z. We
now show that, if one insisted on considering as representations only homomorphisms
% : R/Z −→ BGL(E)
which are continuous with respect to the norm topology on BGL(E), there would be “too
few” (similar examples hold for many other groups). In particular, there would be no
analogue of the regular representation. Indeed, if f is a complex-valued function on R/Z
and t ∈ R/Z, it is natural to try to define the latter with
%(t)f (x) = f (x + t).
To have a Banach space of functions, we must impose some regularity condition.
Although the most natural spaces turn out to be the Lp spaces, with respect to Lebesgue
measure, we start with E = C(R/Z), the space of continuous functions f : R/Z → C.
If we use the supremum norm
‖f ‖ = sup_{t∈R/Z} |f (t)|,
this is a Banach space. Moreover, the definition above clearly maps a function f ∈ E to
%(t)f ∈ E; indeed, we have
‖%(t)f ‖ = ‖f ‖
(since the graph of %(t)f , seen as a periodic function on R, is just obtained from that of
f by translating it to the left by t units), and this further says that %(t) is a continuous
linear map on E. Hence, % certainly defines a homomorphism
% : R/Z −→ BGL(E).
But now we claim that % is not continuous for the topology on BGL(E) coming from
the norm (3.3). This is quite easy to see: in fact, for any t with 0 < t < 1/2, we have
‖%(t) − IdE ‖L(E) = ‖%(t) − %(0)‖L(E) ≥ 1,
which shows that % is very far from being continuous at 0. To see this, we take as “test”
function any ft ∈ E which is zero outside [0, t], non-negative, always ≤ 1, and equal to
1 at t/2, for instance (where we view R/Z as obtained from [0, 1] by identifying the end
points). Then ‖ft ‖ = 1, and therefore
‖%(t) − Id‖L(E) ≥ ‖%(t)ft − ft ‖ = sup_{x∈R/Z} |ft (x + t) − ft (x)| = 1
since %(t)ft is supported on the image modulo Z of the interval [1 − t, 1], which is disjoint
from [0, t] apart from the common endpoint 1 ≡ 0.
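The test-function argument can be illustrated numerically (our own sketch, using a tent-shaped choice of the function f_t, sampled on a grid):

```python
# Our illustration of the test-function argument: for each t, the tent
# function f_t supported in [0, t] with f_t(t/2) = 1 satisfies
# sup_x |f_t(x + t) - f_t(x)| = 1, so the operator norm of rho(t) - Id
# stays >= 1 however small t is.
def f(t, x):
    x = x % 1.0                               # work on R/Z = [0, 1)
    if 0 <= x <= t:
        return 1 - abs(x - t/2) / (t/2)       # tent: 0 at 0 and t, 1 at t/2
    return 0.0

for t in (0.4, 0.1, 0.01):
    # at x = (t/2 - t) mod 1 we have f(t, x + t) = 1 while f(t, x) = 0
    x = (t/2 - t) % 1.0
    assert abs(f(t, x + t) - 1.0) < 1e-9 and f(t, x) == 0.0
    # the supremum over a grid of sample points is (numerically) 1
    sup = max(abs(f(t, k/1000 + t) - f(t, k/1000)) for k in range(1000))
    assert abs(sup - 1.0) < 1e-9
print("the norm distance stays >= 1 as t -> 0")
```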
However, the point of this is that we had to use a different test function for each t,
and for a fixed f , the problem disappears: by uniform continuity of f ∈ E, we have
lim_{t→0} ‖%(t)f − f ‖ = 0
with M = sup ‖%(g)‖. Thus if λ is close enough to 0 (and g arbitrary), so is %̃(g)λ, which
means that %̃ is continuous.
3.3. Unitary representations
When the representation space (for a topological group) is a Hilbert space H, with
an inner product3 ⟨·, ·⟩, it is natural to consider particularly closely the unitary representations, which are those for which the operators %(g) are unitary, i.e., preserve the inner
product:
⟨%(g)v, %(g)w⟩ = ⟨v, w⟩
for all g ∈ G and all v, w ∈ H.
We present here the most basic facts about such representations. They will be considered
in more detail first for finite groups, and then – more sketchily – for compact and locally
compact groups. For additional information on the general theory of unitary representa-
tions, especially with respect to infinite-dimensional cases, we recommend the Appendices
to [1].
Definition 3.3.1 (Unitary representations, unitarizable representations). Let G be
a topological group, which can be an arbitrary group with the discrete topology.
(1) A unitary representation of G is a continuous representation of G on a Hilbert
space H where %(g) ∈ U(H) for all g ∈ G, where U(H) is the group of unitary operators
of H. A morphism %1 → %2 of unitary representations, acting on H1 and H2 respectively,
is a morphism Φ : %1 −→ %2 of continuous representations (one does not require that Φ
preserve the inner product).
(2) An arbitrary representation
G −→ GL(E)
of G on a complex vector space E is unitarizable if there exists an inner product h·, ·i
on E, defining a structure of Hilbert space, such that the values of % are unitary for this
inner product, and the resulting map G −→ U(E) is a unitary representation.
Remark 3.3.2. If H is finite-dimensional and G carries the discrete topology (e.g., if
G is finite) it is enough to construct an inner product on the vector space E such that %
takes value in U(E) in order to check that a representation is unitarizable (the continuity
is automatic).
The continuity requirement for unitary representations can be rephrased in a way
which is easier to check:
Proposition 3.3.3 (Strong continuity criterion for unitary representations). Let % :
G −→ U(H) be a homomorphism of a topological group G to a unitary group of some
Hilbert space. Then % is a unitary representation, i.e., the action map is continuous, if
and only if % is strongly continuous: for any fixed v ∈ H, the map
G −→ H,   g 7→ %(g)v
is continuous on G. Equivalently, this holds when these maps are continuous at g = 1.
Proof. The joint continuity of the two-variable action map implies that of the maps
above, which are one-variable specializations. For the converse, we use the unitarity and
a splitting of epsilons...
3 Recall from the introduction that our inner products are linear in the first variable, and conjugate-
linear in the second: hv, λwi = λ̄hv, wi for λ ∈ C.
Let (g, v) be given in G × H. For any (h, w) ∈ G × H, we have
‖%(g)v − %(h)w‖ = ‖%(h)(%(h−1 g)v − w)‖ = ‖%(h−1 g)v − w‖
by unitarity. Then, by the triangle inequality, we get
‖%(g)v − %(h)w‖ ≤ ‖%(h−1 g)v − v‖ + ‖v − w‖.
Under the assumption of the continuity of x 7→ %(x)v when x → 1, this shows that
when w is close to v and h close to g, the difference becomes arbitrarily small, which
is the continuity of the action map at (g, v). To be precise: given ε > 0, we can first
find an open neighborhood Uv of v in H such that ‖v − w‖ < ε for w ∈ Uv , and we
can use the continuity assumption to find an open neighborhood U1 of 1 ∈ G such that
‖%(x)v − v‖ < ε for x ∈ U1 . Then, when (h, w) ∈ gU1−1 × Uv , we have
‖%(g)v − %(h)w‖ ≤ 2ε.
Remark 3.3.4 (The strong topology). The name “strong continuity” refers to the
fact that the condition above is equivalent with the assertion that the homomorphism
% : G −→ U(H)
is continuous, with respect to the topology induced on U(H) by the so-called strong
operator topology on L(H). The latter is defined as the weakest topology such that all the
maps
L(H) −→ H,   T 7→ T v
for v ∈ H are continuous with respect to it. This means that a basis of
neighborhoods of T0 ∈ L(H) for this topology is given by the finite intersections
V1 ∩ · · · ∩ Vm
where
Vi = {T ∈ L(H) | ‖T vi − T0 vi ‖ < ε}
for some unit vectors vi ∈ H and some fixed ε > 0 (see, e.g., [31, p. 183]).
Example 3.3.5 (The regular representation on L2 (R)). Here is an example of an infinite-dimensional unitary representation. We consider the additive group R of real numbers,
with the usual topology, and the space H = L2 (R), the space of square-integrable func-
tions on R, defined using Lebesgue measure. Defining
%(t)f (x) = f (x + t),
we claim that we obtain a unitary representation. This is formally obvious, but as we
have seen, the continuity requirement needs some care. We will see this in a greater
context in Proposition 5.2.6 in Chapter 5, but we sketch the argument here.
First, one must check that the definition of %(t)f makes sense (since elements of L2 (R)
are not really functions): this is because if we change a measurable function f on a set
of measure zero, we only change x 7→ f (x + t) on a translate of this set, which still has
measure zero.
Then the unitarity of the action is clear: we have
∫_R |%(t)f (x)|2 dx = ∫_R |f (x + t)|2 dx = ∫_R |f (x)|2 dx,
and only continuity remains to be checked. This is done in two steps using Proposi-
tion 3.3.3 (the reader can fill the outline, or look at Proposition 5.2.6). Fix f ∈ L2 (R),
and assume first that it is continuous and compactly supported. Then the continuity of
t 7→ %(t)f amounts to the limit
lim_{h→0} ∫_R |f (x + t + h) − f (x + t)|2 dx = 0
for all t ∈ R, which follows from the dominated convergence theorem. Next, one uses
the fact that continuous functions with compact support form a dense subspace of L2 (R)
(for the L2 -norm) in order to extend this continuity statement to all f ∈ L2 (R).
The operations of representation theory, when applied to unitary representations, lead
most of the time to other unitary representations. Here are the simplest cases:
Proposition 3.3.6 (Operations on unitary representations). Let G be a topological
group.
(1) If % is a unitary representation of G, then any subrepresentation and any quotient
representation are naturally unitary. Similarly, the restriction of % to any subgroup H is
a unitary representation with the same inner product.
(2) Direct sums of unitary representations are unitary with inner product
hv1 ⊕ w1 , v2 ⊕ w2 i%1 ⊕%2 = hv1 , v2 i%1 + hw1 , w2 i%2 .
for %1 ⊕%2 , so that the subrepresentations %1 and %2 in %1 ⊕%2 are orthogonal complements
of each other.
(3) The tensor product %1 ⊗ %2 of finite-dimensional unitary representations %1 and %2
is unitary, with respect to the inner product defined for pure tensors by
(3.4) hv1 ⊗ w1 , v2 ⊗ w2 i%1 ⊗%2 = hv1 , v2 i%1 hw1 , w2 i%2 .
Similarly, external tensor products of finite-dimensional unitary representations are
unitary, with the same inner product on the underlying tensor-product space.
We leave to the reader the simple (and standard) argument that checks that the
definition of (3.4) does extend to a well-defined inner product on the tensor product of
two finite-dimensional Hilbert spaces.
Note that one can extend this to situations involving infinite-dimensional represen-
tations; this is very easy if either %1 or %2 is finite-dimensional, but it becomes rather
tricky to define the tensor product of two infinite-dimensional Hilbert spaces. Similarly,
extending the notion of induction to unitary representations of topological groups is not
obvious; we will consider this in Chapter 5 for compact groups. In both cases, the reader
may look at [1, App. A, App. E] for some more results and general facts.
Example 3.3.7. A special case of (3) concerns the tensor product of an arbitrary
unitary representation % : G −→ U(H) with a one-dimensional unitary representation
χ : G −→ S1 . Indeed, in that case, one can define % ⊗ χ on the same Hilbert space H,
by
(% ⊗ χ)(g)v = χ(g)%(g)v,
with the same inner product, which is still invariant for % ⊗ χ: for any v, w ∈ H, we have
h(% ⊗ χ)(g)v, (% ⊗ χ)(g)wiH = hχ(g)%(g)v, χ(g)%(g)wiH
= |χ(g)|2 h%(g)v, %(g)wiH = hv, wiH .
Now we discuss the contragredient in the context of unitary representations. There
are special features here, which arise from the canonical duality properties of Hilbert
spaces. For a Hilbert space H, the conjugate space H̄ is defined to be the Hilbert space
with the same underlying abelian group (i.e., addition) as H, but with scalar multiplication
and inner product defined by
α · v = ᾱv,   ⟨v, w⟩H̄ = ⟨w, v⟩H .
The point of the conjugate Hilbert space is that it is canonically isometric4 (as Banach
space) to the dual of H, by the map
Φ : H̄ −→ H′ ,   w 7→ (λw : v 7→ ⟨v, w⟩)
(vectors in H̄ are “the same” as vectors in H, but because λαw = ᾱλw , this map is only
linear when the conjugate Hilbert space structure is used as the source).
If % is a unitary representation of a group G on H, this allows us to rewrite the basic
matrix coefficients fv,λ of % using inner products on H only: given λ = λw ∈ H′ and
v ∈ H, we have
fv,λ (g) = λw (%(g)v) = ⟨%(g)v, w⟩.
These are now parametrized by the two vectors v and w in H; though it is formally
better to see w as being an element of H̄, one can usually dispense with this extra
formalism without much loss.
Using the map Φ, we can also “transport” the contragredient representation of % to
an isomorphic representation %̄ acting on H̄. Its character, when % is finite-dimensional,
is given by
χ%̄ (g) = χ% (g−1 ),
which is the complex conjugate of χ% (g) (since the eigenvalues of a unitary matrix have
modulus 1, their inverses are the same as their conjugates; see also Lemma 3.3.10 below).
Exercise 3.3.8 (Matrix coefficients of the conjugate representation). Let % be a
unitary representation of G on a Hilbert space H. Show that the functions
g 7→ ⟨w, %(g)v⟩,
for v, w ∈ H, are matrix coefficients of %̄ (they are the complex conjugates of the matrix
coefficients of %).
Example 3.3.9 (A unitarizability criterion). Let G = R with its usual topology.
We have seen that there are many one-dimensional representations of G as a topological
group, given by
ωs : R −→ C× ,   x 7→ e^{sx} ,
for s ∈ C (these are pairwise distinct functions, hence non-isomorphic representations).
However, these are only unitary (or, more properly speaking, unitarizable) when s = it
is purely imaginary. Indeed, when Re(s) = 0, we have |ωs (x)| = 1 for all x ∈ R, and
the unit circle is the unitary group of the 1-dimensional Hilbert space C with inner
product z w̄. Conversely, the following lemma gives a quick and convenient necessary
condition for unitarity or unitarizability – a property which does not depend on any
knowledge of the inner product – and it implies that ωs is not unitarizable when Re(s) ≠ 0.
Lemma 3.3.10. Let % be a finite-dimensional unitary representation of a group G.
Then any eigenvalue of %(g), for any g, is a complex number of modulus 1.
4 This is the Riesz representation theorem for Hilbert spaces.
Proof. This is linear algebra for unitary matrices, of which the %(g) are examples...
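For the 2 × 2 case, the lemma can be checked by explicit computation from the characteristic polynomial (a sketch of ours; the parametrization of the unitary matrices used below is an ad hoc choice):

```python
import cmath
import math

# Our check of Lemma 3.3.10 in the simplest case: the eigenvalues of a
# 2x2 unitary matrix, computed from the characteristic polynomial, all
# have modulus 1.
def unitary(theta, a, b):
    c, s = math.cos(theta), math.sin(theta)
    # diag(e^{ia}, e^{ib}) times a rotation matrix is unitary
    return [[cmath.exp(1j*a)*c, -cmath.exp(1j*a)*s],
            [cmath.exp(1j*b)*s,  cmath.exp(1j*b)*c]]

for (theta, a, b) in [(0.3, 0.0, 0.0), (1.1, 0.7, -2.0), (2.5, 3.0, 1.2)]:
    U = unitary(theta, a, b)
    tr = U[0][0] + U[1][1]
    det = U[0][0]*U[1][1] - U[0][1]*U[1][0]
    assert abs(abs(det) - 1) < 1e-12          # |det U| = 1
    # eigenvalues via the quadratic formula for X^2 - tr X + det
    disc = cmath.sqrt(tr*tr - 4*det)
    for lam in ((tr + disc)/2, (tr - disc)/2):
        assert abs(abs(lam) - 1) < 1e-9
print("all eigenvalues lie on the unit circle")
```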
This lemma applies also to the representations %m of SL2 (C) of Examples 2.6.1
and 2.7.38: from (2.37) – for instance – we see that if m > 1, %m is not unitarizable
(since the restriction to the subgroup T is diagonal with explicit eigenvalues which are
not of modulus 1). However, the restriction of %m to the compact subgroup SU2 (C) is
unitarizable.
Along the same lines, observe that if we consider the compact group R/Z, its one-
dimensional representations induce, by composition, some representations of R
R −→ R/Z −→ C× ,
which must be among the ωs . Which ones occur in this manner is easy to see: we must
have Z ⊂ ker(ωs ), and from ωs (1) = 1, it follows that s = 2ikπ for some integer k ∈ Z.
In particular, we observe a feature which will turn out to be quite general: all these
representations of R/Z are unitary!
Example 3.3.11 (Regular representation of a finite group). Let G be a finite group,
and C(G) the space of complex-valued functions on G, with the regular representation
of G. A natural inner product on the vector space C(G) is
    ⟨ϕ1, ϕ2⟩ = (1/|G|) Σ_{x∈G} ϕ1(x) ϕ̄2(x)
(one could omit the normalizing factor 1/|G|, but it has the advantage that k1k = 1 for
the constant function5 1 on G, independently of the order of G.)
It is easy to check that, with respect to this inner product, the regular representation
on C(G) is unitary. Indeed, we have
    ‖reg(g)ϕ‖² = (1/|G|) Σ_{x∈G} |ϕ(xg)|² = (1/|G|) Σ_{y∈G} |ϕ(y)|² = ‖ϕ‖²
by the change of variable y = xg.
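This unitarity can also be checked numerically; the following sketch (with the cyclic group Z/5Z and an arbitrary function, both our choices rather than the text's) verifies that every translation preserves the normalized norm.

```python
# Illustrative check (our choices: G = Z/5Z and an arbitrary function f)
# that the regular representation preserves the normalized inner product
# <f1, f2> = (1/|G|) sum_x f1(x) * conj(f2(x)).

n = 5
G = list(range(n))  # Z/nZ, written additively

def inner(f1, f2):
    return sum(f1[x] * f2[x].conjugate() for x in G) / n

def reg(g, f):
    # (reg(g)f)(x) = f(x + g): the (right) regular representation
    return [f[(x + g) % n] for x in G]

f = [complex(x * x, x) for x in G]  # an arbitrary function on G
for g in G:
    assert abs(inner(reg(g, f), reg(g, f)) - inner(f, f)) < 1e-12
```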
Similarly, one can form Hilbert direct sums: given a family (ϱi)i∈I of unitary
representations of G, with ϱi acting on the Hilbert space Hi, let H be the Hilbert direct
sum of the spaces Hi, and define
    ϱ(g)v = (ϱi(g)vi)_{i∈I} for v = (vi)_{i∈I} ∈ H,
5 This should not be confused with the neutral element in the group.
which is easily checked to be a unitary representation acting on H (see Exercise 3.3.12
below). Of course, for each i, the subspace Hi ⊂ H is a subrepresentation of H isomorphic
to %i . Moreover, the “standard” direct sum of the Hi (the space of families (vi ) where
vi is zero for all but finitely many i) is a dense subspace of H. It is stable under the
action of %, but since it is not closed in general, it is usually not a subrepresentation in
the topological sense.
Exercise 3.3.12 (Pre-unitary representation). It is often convenient to define a uni-
tary representation by first considering an action of G on a pre-Hilbert space, which
“extends by continuity” to a proper unitary representation (e.g., when defining a repre-
sentation on a space of functions, it may be easier to work with a dense subspace of regular
functions; for a concrete example, see the construction of the regular representation of a
compact topological group in Proposition 5.2.6). We consider a fixed topological group
G.
A pre-unitary representation of G is a strongly continuous homomorphism
% : G −→ U(H0 ),
where H0 is a pre-Hilbert space, i.e., a complex vector space given with a (positive-
definite) inner product, but which is not necessarily complete.
(1) Show that if % is a pre-unitary representation, the operators %(g) extend by con-
tinuity to unitary operators of the completion H of H0 , and that the resulting map is a
unitary representation of G, such that H0 ⊂ H is a stable subspace. [Hint: To check the
strong continuity, use the fact that H0 is dense in H.]
(2) Suppose H is a Hilbert space and H0 ⊂ H a dense subspace. If % is a pre-unitary
representation on H0 , explain why the resulting unitary representation is a representation
of G on H.
(3) Use this to check that the Hilbert direct sum construction of (3.5) above leads to
unitary representations.
Unitary representations are better behaved than general (complex) representations.
One of the main reasons is the following fact:
Proposition 3.3.13 (Reducibility of unitary representations). Let ϱ : G → U(H) be
a unitary representation of a topological group G. Then any closed subspace F ⊂ H
invariant under % has a stable closed complement given by F ⊥ ⊂ H. In particular, any
finite-dimensional unitary representation is semisimple.
Proof. Since any finite-dimensional subspace of a Hilbert space is closed, the second
part follows from the first using the criterion of Lemma 2.2.8.
Thus we consider a subrepresentation F ⊂ H. Using the inner product, we can very
easily construct a stable complement: we may just consider the orthogonal complement
    F⊥ = {v ∈ H | ⟨v, w⟩ = 0 for all w ∈ F} ⊂ H.
Indeed, the theory of Hilbert spaces6 shows that F ⊥ is closed in H and that F ⊕ F ⊥ =
H. From the fact that % preserves the inner product, the same property follows for its
orthogonal complement: if v ∈ F⊥, we have
    ⟨ϱ(g)v, w⟩ = ⟨v, ϱ(g)⁻¹w⟩ = ⟨v, ϱ(g⁻¹)w⟩ = 0
for all g ∈ G and w ∈ F, since ϱ(g⁻¹)w ∈ F; i.e., F⊥ is indeed a subrepresentation of H.
6 If H is finite-dimensional, of course, this is mere linear algebra.
Example 3.3.14 (Failure of semisimplicity for unitary representations). Although this
property is related to semisimplicity, it is not the case that any unitary representation
% : G −→ U(H)
is semisimple, even in the sense that there exists a family (Hi )i∈I of stable subspaces of
H such that
    H = ⊕_{i∈I} Hi
(in the Hilbert-space sense described above). The reader can of course look back at the
proof of Lemma 2.2.8 where the equivalence of semisimplicity and complete reducibility
was proved for the “algebraic” case: the problem is that the first step, the existence of
an irreducible subrepresentation of H (which is Exercise 2.2.10), may fail in this context.
To see this, take the representation % of R on L2 (R) described in Example 3.3.5. This
is of course infinite-dimensional but L2 (R) contains no irreducible subrepresentation! We
can not quite prove this rigorously yet, but the following explains what happens: first,
because R is abelian, all its irreducible unitary representations are of dimension 1 (as is
the case for finite abelian groups, though the proof is harder here since one must exclude
possible infinite-dimensional representations), and then, by Proposition 3.2.3 and the
unitarizability criterion, these are given by
    χ_x : R → C×,  t ↦ e^{itx},
for x ∈ R. Now, a non-zero function f ∈ L²(R) spans an irreducible subrepresentation
of ϱ isomorphic to χ_x if and only if we have
    f(y + t) = (ϱ(t)f)(y) = χ_x(t)f(y) = e^{itx} f(y)
for all y and t ∈ R. But this means that |f(t)| = |f(0)| is constant for all t ∈ R, and
this constant is non-zero since we started with f ≠ 0. However, we then get
    ∫_R |f(y)|² dy = |f(0)|² ∫_R dy = +∞,
which contradicts the assumption f ∈ L²(R)...
In Chapter 5, we will see that this type of behavior does not occur for compact
topological groups. But this shows, obviously, that the study of unitary representations
of non-compact groups will be fraught with new difficulties...
Along the same lines, the following related result is also very useful:
Lemma 3.3.15 (Unrelated unitary subrepresentations are orthogonal). Let G be a
topological group and let ϱ : G → U(H) be a unitary representation. If H1 and H2 are
subrepresentations of H such that there is no non-zero G-intertwiner H1 → H2 , then
H1 and H2 are orthogonal. In particular, isotypic components in H of non-isomorphic
irreducible representations of G are pairwise orthogonal.
Proof. Consider the orthogonal projector
    Φ : H → H
onto H2. This linear map Φ is also a G-homomorphism because of the previous proposition:
if v ∈ H and g ∈ G, its projection Φ(v) is characterized by
    v = Φ(v) + (v − Φ(v)),  Φ(v) ∈ H2,  v − Φ(v) ∈ H2⊥,
and the subsequent relation
    ϱ(g)v = ϱ(g)(Φ(v)) + ϱ(g)(v − Φ(v)),
together with the conditions ϱ(g)Φ(v) ∈ H2 and ϱ(g)(v − Φ(v)) ∈ H2⊥ (which follow
from the fact that H2 is a subrepresentation, and hence so is its orthogonal complement),
imply that
    Φ(ϱ(g)v) = ϱ(g)Φ(v),
which gives the desired property.
Since the image of Φ is H2 , its restriction to H1 is a linear map
H1 −→ H2
which is a G-intertwiner. The assumption then says that it is zero, so that H1 ⊂ ker(Φ) =
H2⊥, which is the same as saying that H1 and H2 are orthogonal.
The last statement is of course a corollary of this fact together with Schur’s Lemma.
CHAPTER 4
In this chapter, we take up the special case of finite groups, building on the basic
results of Section 2.7. There are however still two very distinct cases: if the field k
has characteristic coprime with the order of a finite group G (for instance, if k = C,
or any other field of characteristic 0), a fundamental result of Maschke shows that any
k-representation of G is semisimple. Thus, we can use characters to characterize all
(finite-dimensional) representations of G, and this leads to very powerful methods to
analyze representations. Most of this chapter will be devoted to this case. However, if
k is of characteristic p dividing the order of G, the semisimplicity property fails. The
classification and structure of representations of G are then much more subtle; since the
author knows next to nothing about this case, we will only give some examples and
general remarks in Section 4.8.2.
4.1. Maschke’s Theorem
As already hinted, the next result is the most important result about the representa-
tion theory of finite groups:
Theorem 4.1.1 (Maschke). Let G be a finite group, and let k be a field with charac-
teristic not dividing |G|. Then any k-linear representation
% : G −→ GL(E)
of G is semisimple. In fact, the converse is also true: if all k-representations of G are
semisimple, then the characteristic of k does not divide |G|.
Thus, in some sense, in the case where Maschke’s Theorem applies, the classification
of all representations of G is reduced to the question of classifying the irreducible ones.
Note that it is not required to assume that k be algebraically closed here.
Proof. We use the criterion of Lemma 2.2.8. For a given subrepresentation F ⊂ E,
the idea is to construct a stable complement of F as the kernel of a linear projection
P : E −→ E
with image F which is a G-morphism, i.e., P ∈ HomG (E, E). Indeed, if P 2 = P (which
means P is a projection) and Im(P ) = F , we have
E = F ⊕ ker(P ),
and of course ker(P ) is a subrepresentation if P ∈ HomG (E, E).
From linear algebra again, we know the existence of a projection p ∈ Homk (E, E)
with Im(p) = F , but a priori not one that commutes with the action of G. Note that
p ∈ HomG (E, E) means that p ∈ Endk (E)G (see (2.18)). The trick is to construct P
using p by averaging the action (2.16) of G on p in order to make it invariant.
Let then
    P = (1/|G|) Σ_{g∈G} g · p ∈ End_k(E).
We claim that P is the desired projection in End_G(E). The first thing to notice is
that it is this definition which requires that the characteristic of k not divide |G|, since
|G| must be invertible in k in order to compute P.
By averaging, it is certainly the case that P ∈ End_G(E) = End_k(E)^G: acting on the
left by some h ∈ G just permutes the summands:
    h · P = (1/|G|) Σ_{g∈G} h · (g · p) = (1/|G|) Σ_{g∈G} (hg) · p = (1/|G|) Σ_{x∈G} x · p = P
(in the notation of Example 3.1.1, we are using the fact that P = e · p, where e = (1/|G|) s ∈
k(G), and that he = e by (3.2)).
Next, Im(P ) ⊂ Im(p) = F : indeed, F is G-invariant and each term
(g · p)v = %(g)(p(%(g −1 )v)) ∈ F
in the sum is in F for any fixed v. Moreover, P is the identity on F, since p is and F is
stable: for v ∈ F, we have
    P(v) = (1/|G|) Σ_{g∈G} ϱ(g)p(ϱ(g⁻¹)v) = (1/|G|) Σ_{g∈G} ϱ(g)ϱ(g⁻¹)v = v,
since ϱ(g⁻¹)v ∈ F for each g.
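The averaging trick can be seen in action on a small example (a sketch with all choices ours, not the text's: G = S3 permuting the coordinates of k³, F the line of constant vectors, and a deliberately non-equivariant starting projection).

```python
from fractions import Fraction
from itertools import permutations

# Sketch of the averaging construction (all choices ours): G = S3 acts on
# k^3 by permuting coordinates, F = span{(1,1,1)} is a subrepresentation,
# and proj is a projection onto F which is NOT G-equivariant. Averaging
# produces the equivariant projection P, which here averages coordinates.

G = list(permutations(range(3)))

def rho(g, v):
    # (rho(g)v)_{g(i)} = v_i, i.e., rho(g) permutes the coordinates
    w = [None] * 3
    for i in range(3):
        w[g[i]] = v[i]
    return w

def inv(g):
    h = [None] * 3
    for i in range(3):
        h[g[i]] = i
    return tuple(h)

def proj(v):
    # a projection with image F, not equivariant
    return [v[0], v[0], v[0]]

def P(v):
    # P = (1/|G|) sum_g rho(g) proj rho(g^{-1})
    acc = [Fraction(0)] * 3
    for g in G:
        w = rho(g, proj(rho(inv(g), v)))
        acc = [a + x for a, x in zip(acc, w)]
    return [a / len(G) for a in acc]

v = [Fraction(5), Fraction(-1), Fraction(2)]
assert all(P(rho(g, v)) == rho(g, P(v)) for g in G)  # P is a G-morphism
m = sum(v, Fraction(0)) / 3
assert P(v) == [m, m, m]  # P projects onto the constant vectors
```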
The last sum, which is λ(ϱ), is, however, equal to 0, and this is a contradiction. Indeed,
either ϱ is trivial, and then the value is |G| = 0 in k, or there exists x ∈ G with ϱ(x) ≠ 1,
and then writing
    λ(ϱ) = Σ_{g∈G} ϱ(g) = Σ_{h∈G} ϱ(xh) = ϱ(x)λ(ϱ)
implies that λ(ϱ) = 0 also!
Remark 4.1.2 (Semisimplicity of unitary representations). If k = C, we can also
prove Theorem 4.1.1 by exploiting Proposition 3.3.13, at least when dealing with finite-
dimensional representations. Indeed, we have the following result:
Proposition 4.1.3 (Unitarizability for finite groups). For a finite group G, any
finite-dimensional representation of G over C is unitarizable.
Proof. The idea is similar to the one in the proof of Maschke’s Theorem. Indeed,
let % be a finite-dimensional representation of G on E. What must be done is to find an
inner product h·, ·i on E with respect to which % is unitary, i.e., such that
h%(g)v, %(g)wi = hv, wi
for all v, w ∈ E. Now, since E is finite-dimensional, we can certainly find some inner
product ⟨·, ·⟩0 on E, although it is not necessarily invariant. But then if we let
    ⟨v, w⟩ = (1/|G|) Σ_{g∈G} ⟨ϱ(g)v, ϱ(g)w⟩0,
it is easy to check that we obtain the desired invariant inner product.
More generally, suppose h·, ·i0 is a non-negative, but not necessarily positive-definite,
hermitian form on E, with kernel
F = {v ∈ E | hv, vi = 0}.
Then it is clear that the construction of ⟨·, ·⟩ above still leads to an invariant non-
negative hermitian form. By positivity, it will be a genuine, positive-definite, inner
product if
    ∩_{g∈G} g · F = 0.
An example of this is the regular representation, where we can take
    ⟨ϕ1, ϕ2⟩0 = ϕ1(1) ϕ̄2(1).
This form has a huge kernel F (all functions vanishing at 1), but since
    reg(g)F = {ϕ ∈ C(G) | ϕ(g⁻¹) = 0},
the intersection of the translates of F is in fact 0. Rather naturally, the resulting inner
product on C(G) is the same as the one described in Example 3.3.11.
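This recovery of the inner product of Example 3.3.11 by averaging the degenerate form can be checked by a direct computation; the sketch below uses G = Z/4Z (our choice), written additively with neutral element 0.

```python
# Sketch (our choice: G = Z/4Z, neutral element 0) checking that averaging
# the degenerate form <f1, f2>_0 = f1(0) conj(f2(0)) over the regular
# representation recovers the inner product of Example 3.3.11.

n = 4
G = list(range(n))

def form0(f1, f2):
    # evaluation at the neutral element (0 in additive notation)
    return f1[0] * f2[0].conjugate()

def translate(g, f):
    return [f[(x + g) % n] for x in G]

def averaged(f1, f2):
    # (1/|G|) sum_g <reg(g)f1, reg(g)f2>_0
    return sum(form0(translate(g, f1), translate(g, f2)) for g in G) / n

def standard(f1, f2):
    return sum(f1[x] * f2[x].conjugate() for x in G) / n

f1 = [1 + 2j, 0j, -1j, 3 + 0j]
f2 = [2 + 0j, 1j, 1 + 1j, -2j]
assert abs(averaged(f1, f2) - standard(f1, f2)) < 1e-12
```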
The meaning of Maschke’s Theorem is that for any k-representation % of G on a vector
space E, if |G| is invertible in k, there is a direct sum decomposition of E in irreducible
stable subspaces. As already discussed in Chapter 2, this decomposition is not unique.
However, by Proposition 2.7.7, the isotypic components of ϱ, denoted M(π) or M_E(π),
are defined for any irreducible k-representation π of G, and they are intrinsic subspaces
of E such that
    (4.1)    E = ⊕_π M_E(π),
where π runs over isomorphism classes of irreducible k-representations of G. Recall that
because they are intrinsic, it follows that for any G-homomorphism
Φ : E −→ F,
the restriction of Φ to ME (π) gives a linear map
ME (π) −→ MF (π).
In order to analyze the representations concretely, one needs some way of obtaining
information concerning this decomposition; we will see how to describe (when k is alge-
braically closed) explicitly the projectors on E mapping onto the isotypic components,
and when k is of characteristic 0, how to use characters to compute the multiplicities
of the irreducible representations (which are of course related to the dimension of the
isotypic components.)
where the sum is over the set of isomorphism classes of representations of G, which can
be identified with the set of characters of irreducible representations.
One naturally wants to get more information about the irreducible representations
than what is contained in the formula (4.2). The first basic question is: what is the number
of irreducible representations (up to isomorphism)? The general answer is known, but
we start with a particularly simple case:
Proposition 4.2.2 (Irreducible representations of finite abelian groups). Let G be a
finite group and k an algebraically closed field of characteristic not dividing |G|. Then
all irreducible finite-dimensional representations of G are of dimension 1 if and only if G
is commutative. In particular, when G is commutative, there are |G| non-isomorphic
irreducible k-representations of G.
Proof of Proposition 4.2.2. We know that the one-dimensional representations
of G are in bijection with those of the abelianized group G/[G, G] (Proposition 2.6.6).
Thus if all irreducible k-representations of G are of dimension 1, Corollary 4.2.1 implies
that |G| = |G/[G, G]| (the left-hand sides being equal for G and G/[G, G]), which means
that [G, G] = 1, i.e., that G is commutative.
Although the representations of abelian groups are quite elementary in comparison
with the general case, they are of great importance in applications. We will say more
about them in Section 4.5, which includes in particular a sketch of the proof of Dirichlet’s
Theorem on primes in arithmetic progressions. (That later section could be read right
now without much difficulty.)
Note that one can not remove the assumption on k: there are cases where G is non-
abelian, k is algebraically closed of characteristic dividing |G|, and the only irreducible
k-representation of G is trivial (indeed, this is true whenever G is a p-group and k is
algebraically closed of characteristic p, see, e.g., [8, 27.28]; examples are given by the
groups from Example 2.7.33.)
Example 4.2.3. Consider G = S3, the symmetric group on 3 letters. It is non-
abelian of order 6, and hence the only possible values for the degrees of its irreducible
C-representations are 1, 1 and 2 (no other list of positive integers, not all equal to 1,
has squares summing to 6). The two one-dimensional representations are of course
the trivial one and the signature
ε : S3 −→ {±1} ⊂ C× ,
and the 2-dimensional one is isomorphic to the representation by permutation of the
coordinates on the vector space
E = {(x, y, z) ∈ C3 | x + y + z = 0}.
More generally, the decomposition of the regular representation implies that there
are at most |G| irreducible representations, and Proposition 4.2.2 shows that this upper
bound is reached if and only if G is abelian. In fact, we have the following:
Theorem 4.2.4 (Number of irreducible characters). Let G be a finite group, k an
algebraically closed field of characteristic not dividing |G|. Then the number of irreducible
k-representations of G is equal to the number of conjugacy classes in G.
This is another striking fact: it tells us immediately, for instance, that the symmetric
group S24 has exactly 1575 irreducible complex representations up to isomorphism.
Indeed, the number of conjugacy classes in Sn is, via the cycle type of permutations, the
same as the number p(n) of partitions of n, i.e., the number of solutions (r1 , . . . , rn ) of
the equation
n = 1 · r1 + 2 · r2 + · · · + n · rn ,
in non-negative integers (in this bijection, (r1 , . . . , rn ) corresponds to the permutations
which have cycle decomposition with r1 fixed points, r2 disjoint transpositions, . . . , and rn
cycles of length n.) Of course, it might not be obvious that p(24) = 1575, but computers
are here to confirm this. We will give more comments on this theorem after its proof.
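For instance, a standard recursion (counting partitions of n into parts of size at most k; implementation and names ours) confirms the value p(24) = 1575.

```python
from functools import lru_cache

# The number of conjugacy classes of S_n equals the partition number p(n);
# the standard recursion below (our implementation) counts partitions of n
# into parts of size at most k, and confirms p(24) = 1575.

@lru_cache(maxsize=None)
def partitions_max(n, k):
    if n == 0:
        return 1
    if n < 0 or k == 0:
        return 0
    # either use at least one part equal to k, or use only parts < k
    return partitions_max(n - k, k) + partitions_max(n, k - 1)

def p(n):
    return partitions_max(n, n)

assert p(4) == 5      # 4, 3+1, 2+2, 2+1+1, 1+1+1+1
assert p(24) == 1575  # the number of irreducible representations of S24
```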
Proof. We have the decomposition
    (4.3)    C_k(G) = ⊕_ϱ M(ϱ),
where M (%) is the space spanned by all matrix-coefficients of the irreducible k-representa-
tion %. To compute the number of summands, we try to find some invariant of Ck (G)
which will involve “counting” each % only once. The dimension does not work since
dim M (%) = dim(%)2 ; computing the intertwiners from Ck (G) to itself is also tempting
but since M(ϱ) is a direct sum of dim(ϱ) copies of ϱ, we have (by Schur's Lemma) again
    dim Hom_G(C_k(G), C_k(G)) = Σ_ϱ dim Hom_G(M(ϱ), M(ϱ)) = Σ_ϱ (dim ϱ)² = |G|.
The way (or one way at least) to do this turns out to be a bit tricky: we use the fact
that Ck (G) carries in fact a representation π of G × G defined by
π(g1 , g2 )f (x) = f (g1−1 xg2 ),
and that the decomposition (4.3) is in fact a decomposition of Ck (G) into subrepresen-
tations of π; indeed, if f_{v,λ} ∈ M(ϱ) is a matrix coefficient
    f_{v,λ}(x) = ⟨λ, ϱ(x)v⟩
of an irreducible representation ϱ, we have
    π(g1, g2)f_{v,λ}(x) = ⟨λ, ϱ(g1⁻¹ x g2)v⟩ = ⟨ϱ̃(g1)λ, ϱ(x)ϱ(g2)v⟩ = f_{ϱ(g2)v, ϱ̃(g1)λ}(x),
so that M (%) is indeed stable under the action of G × G.
Now the point is that M(ϱ), as a subrepresentation of π in C_k(G), is isomorphic to
the external tensor product ϱ̃ ⊠ ϱ. Indeed, if ϱ acts on the space E, this isomorphism is
the canonical one given by
    E′ ⊗ E → M(ϱ),  λ ⊗ v ↦ f_{v,λ},
which is a linear isomorphism by linear independence of the matrix coefficients, and an
intertwiner by the computation just performed.1
Since the representations ϱ̃ ⊠ ϱ of G × G are all irreducible and non-isomorphic, as ϱ
runs over the irreducible representations of G, by Proposition 2.3.17, each appears with
multiplicity 1 in π. As a consequence, we can use Schur's Lemma (on G × G) to express
the number of ϱ by the formula
    dim Hom_{G×G}(π, π) = Σ_{ϱ1,ϱ2} dim Hom_{G×G}(ϱ̃1 ⊠ ϱ1, ϱ̃2 ⊠ ϱ2) = Σ_ϱ 1.
We now compute directly the left-hand side dimension to deduce the desired formula.
In fact, consider a linear map
Φ : Ck (G) −→ Ck (G)
which commutes with the representation π. If δg ∈ Ck (G) denotes the function which is
0 except at x = g, where it is equal to 1, and
    ℓ_g = Φ(δ_g),
we must therefore have
    π(g1, g2)ℓ_g = Φ(π(g1, g2)δ_g) = Φ(δ_{g1 g g2⁻¹}) = ℓ_{g1 g g2⁻¹}
for all g, g1, g2 ∈ G. Evaluating at x ∈ G means that we must have
    (4.4)    ℓ_g(g1⁻¹ x g2) = ℓ_{g1 g g2⁻¹}(x),
and in fact, since (δ_g)_{g∈G} is a basis of C_k(G), these equations on the ℓ_g are equivalent
with Φ being an intertwiner of π with itself.
We can solve these equations as follows: picking first g2 = 1 and g1 = x⁻¹, and then
g1 = 1 and g2 = x, we quickly get the two relations
    (4.5)    ℓ_{g x⁻¹}(1) = ℓ_g(x) = ℓ_{x⁻¹ g}(1).
Thus Φ is entirely defined by the function ψ : G −→ k defined by
    ψ(g) = ℓ_g(1).
In view of the two expressions for ℓ_g(x) above, this function must satisfy
(4.6) ψ(ab) = ψ(ba)
1 Note that the order of the factors is important here! If we consider E ⊗ E′ instead, we do not
obtain an intertwiner...
for all a, b ∈ G. But conversely, if a function ψ satisfies this relation, then defining ℓ_g by
    ℓ_g(x) = ψ(g x⁻¹) = ψ(x⁻¹ g)
(see (4.5)), we obtain
    ℓ_g(g1⁻¹ x g2) = ψ(g g2⁻¹ x⁻¹ g1)
and
    ℓ_{g1 g g2⁻¹}(x) = ψ(x⁻¹ g1 g g2⁻¹),
from which (4.4) follows by putting a = g g2⁻¹, b = x⁻¹ g1 in (4.6).
The conclusion is that HomG×G (π, π) is isomorphic (by mapping Φ to ψ) to the linear
space ck (G) of all functions ψ such that (4.6) holds. But this condition is equivalent with
ψ(x) = ψ(gxg −1 ), for all x, g ∈ G,
or in other words, ck (G) is the space of class-functions on G. Since the dimension of
ck (G) is equal to the number of conjugacy classes of G, by definition (a class function is
determined by the values at the conjugacy classes), we obtain the desired formula.
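The count of conjugacy classes can also be done by brute force; the sketch below (our illustration) finds the 5 = p(4) classes of S4 directly.

```python
from itertools import permutations

# Brute-force count (our illustration) of the conjugacy classes of S4:
# the space of class functions has dimension equal to this count, p(4) = 5.

G = list(permutations(range(4)))

def mul(g, h):
    # composition: (g o h)(i) = g(h(i))
    return tuple(g[h[i]] for i in range(4))

def inv(g):
    r = [None] * 4
    for i, gi in enumerate(g):
        r[gi] = i
    return tuple(r)

# each class is the set of all conjugates g x g^{-1} of some x
classes = {frozenset(mul(mul(g, x), inv(g)) for g in G) for x in G}
assert len(classes) == 5
```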
Remark 4.2.5 (Sum of dimensions). Thus we have “directly accessible” group-theore-
tic expressions for the number of irreducible k-representations of a finite group G (which
is the number of conjugacy classes), and for the sum of the squares of their degrees
(which is simply |G|). It seems natural to ask: what about the sum of the degrees
themselves, or what about other powers of dim(%)? Although there does not seem to
exist any nice expression valid for all groups, there are some special cases (including
important examples like symmetric groups) where the sum of the dimensions has a nice
interpretation, as explained in Lemma 6.2.6 (in Chapter 6).
Remark 4.2.6 (Bijections, anyone?). Theorem 4.2.4 is extremely striking; since it
gives an equality between the cardinality of the set of irreducible characters of G (over
an algebraically closed field of characteristic not dividing |G|) and the cardinality of the
set of conjugacy classes, it implies that there exists some bijection between the two sets.
However, even for k = C, there is no general natural definition of such a bijection, and
there probably is none.
Despite this, there are many cases where one understands both the conjugacy classes
and the irreducible representations, and where some rough features of the two sets seem
to correspond in tantalizing parallels (see the discussion of GL2 (Fp ) later). And in some
particularly favorable circumstances, a precise correspondence can be found; the most
striking and important of these cases is that of the symmetric groups Sm .
Another equally striking aspect of the result is that the number of irreducible repre-
sentations does not depend on the field k. Here again, this means that there are bijections
between the sets of irreducible characters for different fields. This is of course surpris-
ing when the characteristics of the fields are distinct! One may also ask here if such a
bijection can be described explicitly, and the situation is better than the previous one.
Indeed, although a completely canonical correspondence does not seem to be obtainable,
Brauer developed a theory which – among other things – does lead to bijections between
irreducible characters of G over any algebraically closed fields of characteristic coprime
with |G|. See, e.g., [19, Ch. 15, Th. 15.13] or [34, Ch. 18], for an account of this theory,
from which it follows, in particular, that the families of dimensions of the irreducible
characters over two such fields always coincide.
4.3. Decomposition of representations
4.3.1. The projection on invariant vectors. Especially when written in terms
of the group algebra (using Example 3.1.1), the core argument of the proof of Maschke’s
Theorem can be immediately generalized:
Proposition 4.3.1 (Projection on the invariant vectors). Let G be a finite group and
k an algebraically closed field of characteristic not dividing |G|. For any k-representation
ϱ : G → GL(E), the map
    (1/|G|) Σ_{g∈G} ϱ(g) : E → E
is a homomorphism of representations, which is a projection with image equal to E^G, the
space of invariant vectors in E. Moreover, if E is finite-dimensional, we have
    (4.7)    dim(E^G) = (1/|G|) Σ_{g∈G} χ_ϱ(g).
To see what this means concretely, we select some Φ ∈ E = Homk (E1 , E2 ). Because
there is no intrinsic relation between the spaces here, some choice must be made. We
consider linear maps of rank 1: let λ ∈ E1′ be a linear form and w ∈ E2 be a vector, and
let
    Φ : v ↦ ⟨λ, v⟩w
be the corresponding rank 1 map in Hom_k(E1, E2). Spelling out the identity (4.8) by
applying it to a vector v ∈ E1, we get
    0 = (1/|G|) Σ_{g∈G} π2(g)(⟨λ, π1(g⁻¹)v⟩ w) = (1/|G|) Σ_{g∈G} f_{v,λ}(g) π2(g⁻¹)w
for all v (we replaced g by g⁻¹ in the sum, which merely permutes the terms).
We can make this even more concrete by applying an arbitrary linear form µ ∈ E20 to
obtain numerical identities:
Corollary 4.3.2. With notation as above, let π1 , π2 be non-isomorphic irreducible
representations of G; then for all vectors v ∈ E1, w ∈ E2 and linear forms λ ∈ E1′,
µ ∈ E2′, we have
    (4.9)    [f_{v,λ}, f_{w,µ}] = (1/|G|) Σ_{g∈G} f_{v,λ}(g) f_{w,µ}(g⁻¹) = (1/|G|) Σ_{g∈G} ⟨λ, g·v⟩_{E1} ⟨µ, g⁻¹·w⟩_{E2} = 0.
Because such sums will come up often, it is convenient to make the following definition:
Definition 4.3.3 (“Inner-product” of functions on G). Let G be a finite group and
let k be a field with |G| invertible in k. For ϕ1 , ϕ2 in Ck (G), we denote
    [ϕ1, ϕ2] = (1/|G|) Σ_{g∈G} ϕ1(g) ϕ2(g⁻¹).
This is a non-degenerate2 symmetric bilinear form on Ck (G), called the k-inner prod-
uct.
Thus we have shown that matrix coefficients of non-isomorphic representations are
orthogonal for the k-inner product on Ck (G).
Before going on, we can also exploit (4.7) in this situation: since E^G = Hom_G(E1, E2) = 0,
we derive
    0 = Σ_{g∈G} χ_E(g);
We now finally come back to rank 1 maps. For a given w ∈ E and λ ∈ E′, defining
    Φ : E → E,  v ↦ ⟨λ, v⟩w,
as before, the operator
    (1/|G|) Σ_{g∈G} g · Φ
is now a multiplication by a scalar α. To determine the scalar in question, we compute
the trace. Since
Tr(g · Φ) = Tr(π(g)Φπ(g −1 )) = Tr(Φ)
for all g (using (2.15)), we get
(dim E)α = Tr(α·Id_E) = Tr(Φ) = ⟨λ, w⟩
(the trace of the rank 1 map can be computed by taking a basis where w, the generator
of the image, is the first basis vector).
Since we have already seen that dim(E) is invertible in k, we obtain therefore
    (1/|G|) Σ_{g∈G} g · Φ = (⟨λ, w⟩ / dim E) Id_E,
and applying this at a vector v ∈ E, and applying further a linear form µ ∈ E 0 to the
result, we get:
Corollary 4.3.5 (Orthogonality of matrix coefficients). With notation as above, let
π be an irreducible k-representation of G. Then for all vectors v, w ∈ E and linear forms
λ, µ ∈ E′, we have
    (4.12)    [f_{v,λ}, f_{w,µ}] = (1/|G|) Σ_{g∈G} ⟨λ, g·v⟩ ⟨µ, g⁻¹·w⟩ = ⟨λ, w⟩⟨µ, v⟩ / dim E,
or equivalently
    (4.14)    (1/|G|) Σ_{g∈G} χ_π(g) χ_ϱ̃(g) = 0 if π is not isomorphic to ϱ, and = 1 otherwise.
Our claims are: (1) for any π, and any distinct indices i ≠ j, the coefficient α^{(π)}_{i,j} is
zero; (2) for any π, the diagonal coefficients α^{(π)}_{i,i} are constant as i varies. Given this,
if α_π denotes this last common value, we get
    ϕ = Σ_π α_π χ_π
We now think of ϱ as fixed. If we remember that f^{(ϱ)}_{k,l} is the (l, k)-th coefficient of the
matrix representing ϱ(g) in the basis (e_l)_l, we can reinterpret the left-hand side of this
computation as the (l, k)-th coefficient of the matrix representing
    A_ϕ = (1/|G|) Σ_{g∈G} ϕ(g) ϱ(g⁻¹) ∈ End_k(ϱ).
Now, because ϕ is a class function, it follows that, in fact, Aϕ is in EndG (%). This will
be presented in the context of the group algebra later, but it is easy to check: we have
    A_ϕ(ϱ(h)v) = (1/|G|) Σ_{g∈G} ϕ(g) ϱ(g⁻¹h)v
               = (1/|G|) Σ_{g∈G} ϕ(g) ϱ(h(h⁻¹g⁻¹h))v
               = (1/|G|) Σ_{g∈G} ϕ(hgh⁻¹) ϱ(h)ϱ(g⁻¹)v = ϱ(h)A_ϕ(v).
and spelling it out by applying this to a v ∈ E1 and taking the inner product of the result
with w ∈ E2 , we obtain3
    (1/|G|) Σ_{g∈G} ⟨g·v, v1⟩ ⟨g⁻¹·v2, w⟩ = (1/|G|) Σ_{g∈G} ⟨g·v, v1⟩ ⟨v2, g·w⟩ = 0.
We interpret this in terms of the invariant inner product
    ⟨ϕ1, ϕ2⟩ = (1/|G|) Σ_{g∈G} ϕ1(g) ϕ̄2(g)
for which the regular representation on the space of complex-valued functions C(G) is
unitary: the formula says that
hϕv1 ,v2 , ϕv3 ,v4 i = 0
for any “unitary” matrix-coefficients of non-isomorphic irreducible unitary representa-
tions of G.
Similarly, if E = E1 = E2 carries the irreducible unitary representation π, the same
argument as in the previous example leads to
    (1/|G|) Σ_{g∈G} g · Φ = (⟨v2, v1⟩ / dim E) Id_E ∈ End_C(E).
Note that we also have nπ (%) = dim HomG (%, π) by the symmetry between irreducible
quotient representations and subrepresentations, valid for a semisimple representation
(Corollary 2.7.17).
Proof. Since % is semisimple, we know that its character is given by
    χ_ϱ = Σ_π n_π(ϱ) χ_π
in terms of the multiplicities and the various irreducible representations. Then we get by
orthogonality
    [χ_ϱ, χ_π] = n_π(ϱ)[χ_π, χ_π] = n_π(ϱ).
This is an equality in the field k, but since k has characteristic zero, it is also one in
Z (in particular the left-hand side is an integer).
More generally, we can extend this multiplicity formula by linearity:
Proposition 4.3.12. Let G be a finite group, and k an algebraically closed field of
characteristic 0. For i = 1, 2, let
%i : G −→ GL(Ei )
be a finite-dimensional k-representation of G. Then we have
    [χ_{ϱ1}, χ_{ϱ2}] = dim Hom_G(ϱ1, ϱ2) = dim Hom_G(ϱ2, ϱ1).
If k = C, we have
    ⟨χ_{ϱ1}, χ_{ϱ2}⟩ = dim Hom_G(ϱ1, ϱ2) = dim Hom_G(ϱ2, ϱ1).
Remark 4.3.13. It is customary to use (especially when k = C) the shorthand
notation
    ⟨ϱ1, ϱ2⟩ = ⟨χ_{ϱ1}, χ_{ϱ2}⟩
for two representations ϱ1 and ϱ2 of G. We will do so to simplify notation, indicating
sometimes the underlying group by writing ⟨ϱ1, ϱ2⟩_G.
The multiplicity formula also leads to the following facts which are very useful when
attempting to decompose a representation, when one doesn’t know a priori all the irre-
ducible representations of G. Indeed, this leads to a very convenient “numerical” criterion
for irreducibility:
Corollary 4.3.14 (Irreducibility criterion). Let G be a finite group, and let k be an
algebraically closed field of characteristic 0. For any finite-dimensional k-representation
% of G, we have
    [χ_ϱ, χ_ϱ] = Σ_π n_π(ϱ)²,
where π runs over irreducible k-representations of G up to isomorphism, or, if k = C,
we have the formula
    ⟨χ_ϱ, χ_ϱ⟩ = Σ_π n_π(ϱ)²,
for the “squared norm” of the character of %.
In particular, % is irreducible if and only if
[χ% , χ% ] = 1,
and if k = C, if and only if
    ⟨χ_ϱ, χ_ϱ⟩ = (1/|G|) Σ_{g∈G} |χ_ϱ(g)|² = 1.
and similarly for k = C. And if this is equal to 1, as an equality in Z, the only possibility
is that one of the multiplicities n_π(ϱ) is equal to 1, and all the others are 0, which means
that ϱ ≃ π is irreducible.
Exercise 4.3.15 (Product groups). Let G = G1 × G2 where G1 and G2 are finite
groups. Use the irreducibility criterion to prove Proposition 2.3.17 directly for complex
representations: all irreducible complex representations of G are of the form π1 ⊠ π2 for
some (unique) irreducible representations πi of Gi.
Example 4.3.16 (Permutation representations). Suppose we have a complex repre-
sentation ϱ of G with ⟨χ_ϱ, χ_ϱ⟩ = 2. Then ϱ is necessarily a direct sum of two non-
isomorphic irreducible subrepresentations, since 2 can only be written as 1² + 1² as a
sum of positive squares of integers.
A well-known source of examples of this is given by certain permutation representations
(Section 2.6.2). Consider an action of G on a finite set X, and the associated permutation
representation % on the space EX with basis vectors (ex ), so that
%(g)ex = eg·x .
The character of % is given in Example 2.7.36: we have
χ% (g) = |{x ∈ X | g · x = x}|.
We first deduce from this that
hχ%, 1i = (1/|G|) Σ_{g∈G} Σ_{x∈X, g·x=x} 1 = (1/|G|) Σ_{x∈X} |G_x| ,
which is the number of orbits of G on X (we used the standard bijection G_{x0}\G −→ o induced
by mapping g ∈ G to g · x0 for some fixed x0 ∈ o, which shows that Σ_{x∈o} |G_x| = |G| for
every orbit o ⊂ X).
We assume now that there is a single orbit, i.e., that the action of G on X is transitive
(otherwise, % already contains at least two copies of the trivial representation). Then,
since the character of % is real-valued, we have
h%, %i = (1/|G|) Σ_{g∈G} ( Σ_{x∈X, g·x=x} 1 )²
       = (1/|G|) Σ_{x,y∈X} Σ_{g∈G_x∩G_y} 1
       = 1 + (1/|G|) Σ_{x≠y} Σ_{g∈G_x∩G_y} 1
       = 1 + (1/|G|) Σ_{g∈G} |{(x, y) ∈ X × X | x ≠ y and g · (x, y) = (x, y)}| ,
where the diagonal terms x = y contribute (1/|G|) Σ_{x∈X} |G_x| = 1 because the action is
transitive,
and we recognize, from the character formula for a permutation representation, that this
means
h%, %i = 1 + h%(2) , 1i,
where %(2) is the permutation representation corresponding to the natural action of G on
the set
Y = {(x, y) ∈ X × X | x ≠ y}
(recall that g ·x = g ·y implies that x = y, so the action of G on X ×X leaves Y invariant.)
By (4.18), we see that we have h%, %i = 2 if (and, in fact, only if) this action on Y is
transitive. This, by definition, is saying that the original action was doubly transitive:
not only can G bring any element x ∈ X to any other (transitivity), but a single element
can simultaneously bring any x to any x′, and any y to any y′, provided the conditions
x ≠ y and x′ ≠ y′ are satisfied.
Thus:
Proposition 4.3.17. Let G be a finite group acting doubly transitively on a finite set
X. Then the representation of G on the space
E = { Σ_{x∈X} λ_x e_x | Σ_{x∈X} λ_x = 0 }
is irreducible of dimension |X| − 1.
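For instance, the natural action of S_n on {1, …, n} is doubly transitive, and one can confirm numerically that the permutation character χ(g) = |{x | g · x = x}| has squared norm 2, so that it splits as the trivial representation plus one irreducible of dimension n − 1. An illustrative Python sketch (not from the text):

```python
from itertools import permutations

def fix(p):
    # number of fixed points of the permutation p
    return sum(1 for i, x in enumerate(p) if x == i)

norms = {}
for n in (3, 4, 5):
    G = list(permutations(range(n)))
    # squared norm of the permutation character: (1/|G|) sum of fix(g)^2
    norms[n] = sum(fix(g) ** 2 for g in G) / len(G)
print(norms)  # {3: 2.0, 4: 2.0, 5: 2.0}
```

The value 2 is exactly the number of orbits of S_n on pairs (x, y), i.e., the diagonal and its complement.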
Remark 4.3.18. A warning about the irreducibility criterion: it only applies if one
knows that χ% is, indeed, the character of a representation of G. There are many class
functions ϕ with squared norm 1 which are not characters, for instance
ϕ = (3χ_{π1} + 4χ_{π2}) / 5
if π1 and π2 are non-isomorphic irreducible representations. If π1 and π2 have dimen-
sion divisible by 5, the non-integrality might not be obvious from looking simply at the
character values!
However, note that if ϕ ∈ R(G) is a virtual character (over C, say), i.e., ϕ = χ%1 − χ%2
for some actual complex representations %1 and %2 , the condition
hϕ, ϕi = 1
means that either %1 or %2 is irreducible and the other zero, or in other words, either ϕ
or −ϕ is a character of an irreducible representation of G. Indeed, we can write
ϕ = Σ_π n_π χ_π
as a combination of irreducible characters with integral coefficients nπ , and we have again
hϕ, ϕi = Σ_π n_π²
so one, and only one, of the nπ is equal to ±1, and the others are 0.
Example 4.3.19 (Frobenius reciprocity). From the general multiplicity formula, we
get a “numerical” version of Frobenius reciprocity for induced representations (Proposi-
tion 2.3.6): given a subgroup H of a finite group G, a (complex, say) representation %1
of G and a representation %2 of H, we have⁴
h%1, Ind_H^G(%2)i_G = hRes_H^G %1, %2 i_H .
Note that, by symmetry, we also have
hInd_H^G(%2), %1 i_G = h%2, Res_H^G %1 i_H ,
(something which is not universally true in the generality in which we defined induced
representations.)
This numerical form of Frobenius reciprocity can easily be checked directly, as an
identity between characters: denoting χi = χ%i , we find using (2.45) that we have
h%1, Ind_H^G(%2)i_G = (1/|G|) Σ_{g∈G} χ1(g) Σ_{s∈H\G, sgs⁻¹∈H} χ̄2(sgs⁻¹)
                    = (1/|G|) Σ_{s∈H\G} Σ_{g∈s⁻¹Hs} χ1(g) χ̄2(sgs⁻¹)
                    = (1/|G|) Σ_{s∈H\G} Σ_{h∈H} χ1(s⁻¹hs) χ̄2(h)
                    = (|H\G|/|G|) Σ_{h∈H} χ1(h) χ̄2(h) = hRes_H^G(%1), %2 i_H .
For instance, if one thinks that %1 is an irreducible representation of G, and %2 is one
of H, Frobenius reciprocity says that “the multiplicity of %1 in the representation induced
from %2 is the same as the multiplicity of %2 in the restriction of %1 to H.”
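This identity can be verified by brute force on a small example. The following Python sketch (illustrative, with ad hoc encodings of S3 and A3, not taken from the text) checks it for the two-dimensional character of G = S3 and a nontrivial character ψ of H = A3, computing the induced character by the standard formula χ_Ind(g) = (1/|H|) Σ_{s∈G, s⁻¹gs∈H} ψ(s⁻¹gs):

```python
import cmath
from itertools import permutations

def mul(p, q):  # composition: (p*q)(i) = p[q[i]]
    return tuple(p[i] for i in q)

def inv(p):
    r = [0] * len(p)
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)

G = list(permutations(range(3)))
c = (1, 2, 0)                       # a 3-cycle generating H = A3
H = [(0, 1, 2), c, mul(c, c)]
w = cmath.exp(2j * cmath.pi / 3)
psi = {H[0]: 1, H[1]: w, H[2]: w * w}  # a nontrivial character of H

def chi1(g):  # character of the 2-dimensional representation of S3
    return sum(1 for i, x in enumerate(g) if x == i) - 1

def ind_psi(g):  # induced character of psi from H to G
    return sum(psi[mul(mul(inv(s), g), s)] for s in G
               if mul(mul(inv(s), g), s) in psi) / len(H)

lhs = sum(chi1(g) * ind_psi(g).conjugate() for g in G) / len(G)
rhs = sum(chi1(h) * psi[h].conjugate() for h in H) / len(H)
print(abs(lhs - rhs) < 1e-12)  # True: both inner products equal 1
```

Here the induced representation is itself the two-dimensional irreducible, so both sides equal 1.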
Here is an example of application: for a finite group G, we denote by A(G) the
maximal dimension of an irreducible complex representation of G. So, for instance,
A(G) = 1 characterizes finite abelian groups. More generally, A(G) can be seen to be
some measure of the complexity of G.
Proposition 4.3.20. For any finite group G and subgroup H ⊂ G, we have A(H) ≤ A(G).
Proof. To see this, pick an irreducible representation π of H such that dim π =
A(H). Now consider the induced representation % = Ind_H^G(π). It may or may not be
irreducible; in any case, let τ be any irreducible component of %; then we have
1 ≤ hτ, %i = hτ, Ind_H^G(π)i = hRes_H^G(τ), πi_H
by Frobenius reciprocity. This means that π occurs with multiplicity at least 1 in the
restriction of τ. This implies that necessarily dim τ = dim Res(τ) ≥ dim π. Thus τ
is an irreducible representation of G of dimension at least A(H), i.e., we have A(G) ≥
A(H).
4 We use the notation of Remark 4.3.13.
Exercise 4.3.21. For a finite group G and a real number p > 0, let
A_p(G) = Σ_π (dim π)^p ,
where π runs over the irreducible complex representations of G.
This doesn’t look like the 0 power series, but there might be cancellations in the sum.
However, we haven’t used the assumption that % is faithful, and there is the cunning
trick: the point 1/χ% (1) = 1/ dim(%) is a pole of the term corresponding to g = 1, and
it can not be cancelled because χ% (g) = dim(%) if and only if g ∈ ker % = 1 (this is the
easy Proposition 4.6.4 below; the point is that the character values are traces of unitary
matrices, hence sum of dim % complex numbers of modulus 1.) So it follows that the
power series is non-zero, which certainly means that mk is not always 0.
4.3.6. Isotypic projectors. We now come back to the problem of determining the
projectors on all the isotypic components of a representation, not only the invariant
subspace. In the language of the group algebra, Proposition 4.3.1 means that the single
element
e = (1/|G|) Σ_{g∈G} g ∈ k(G)
of the group algebra has the property that its action on any representation of G gives
“universally” the space of invariants. Since E G is the same as the isotypic component of
E with respect to the trivial representation, it is natural to ask for similar elements for the
other irreducible representations of G. These exist indeed, and they are also remarkably
simple: they are given by the characters.
Proposition 4.3.23 (Projectors on isotypic components). Let G be a finite group, k
an algebraically closed field of characteristic p with p ∤ |G|. For any k-representation
% : G −→ GL(E)
of G, and for any irreducible k-representation π of G, the element
e_π = (dim(π)/|G|) Σ_{g∈G} χ_π(g⁻¹) g ∈ k(G)
acts on E as a homomorphism in Hom_G(E, E) and is a projector onto the isotypic component M(π) ⊂ E.
In other words, the linear map
(4.20)   E −→ E ,   v ↦ (dim π/|G|) Σ_{g∈G} χ_π(g⁻¹) %(g) v
is a homomorphism in Hom_G(E, E), and is the projector onto the isotypic component M(π) ⊂ E.
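For a concrete illustration (not part of the text), one can take G = Z/3Z acting on its own group algebra by the regular representation and check that the three elements e_π really are idempotent projectors summing to the identity; a minimal Python sketch:

```python
import cmath

n = 3
w = cmath.exp(2j * cmath.pi / n)  # the characters of Z/nZ are chi_a(g) = w**(a*g)

def reg(g):  # matrix of reg(g) on C[Z/nZ]: sends basis vector e_j to e_{j+g}
    return [[1.0 if i == (j + g) % n else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def projector(a):
    # e_pi = (dim pi / |G|) sum_g chi_a(g^{-1}) g, with dim pi = 1 here
    P = [[0.0] * n for _ in range(n)]
    for g in range(n):
        c = w ** (-a * g) / n  # chi_a(g^{-1}) / |G|
        R = reg(g)
        P = [[P[i][j] + c * R[i][j] for j in range(n)] for i in range(n)]
    return P

projs = [projector(a) for a in range(n)]
idem = all(abs(matmul(P, P)[i][j] - P[i][j]) < 1e-12
           for P in projs for i in range(n) for j in range(n))
total = [[sum(P[i][j] for P in projs) for j in range(n)] for i in range(n)]
is_identity = all(abs(total[i][j] - (1.0 if i == j else 0.0)) < 1e-12
                  for i in range(n) for j in range(n))
print(idem, is_identity)  # True True
```

Each projector has trace 1, matching the fact that each isotypic component of the regular representation of an abelian group is one-dimensional.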
We will explain how one can find this formula for eπ , instead of merely checking
its properties. Indeed, this leads to additional insights. The point is that, for a given
irreducible representation π, the family of projections to the π-isotypic component, which
maps all others to 0, gives for every representation % : G −→ GL(E) of G a linear map
ε% : E −→ E,
in a “functorial” manner, in the sense described in Exercise 3.1.5: for any representation
τ on F and any G-homomorphism
Φ : E −→ F ,
we have
ετ ◦ Φ = Φ ◦ ε% .
Exercise 4.3.25. Check this fact (this is because Φ sends the isotypic components
ME(τ) to MF(τ), for any irreducible representation τ; see Proposition 2.7.7, (3)).
The outcome of Exercise 3.1.5 is that the source of a “universal” linear map on all
representations can only be the action of some fixed element a of the group algebra; even
if you did not solve this exercise, it should be intuitively reasonable that this is the only
obvious source of such maps. Thus, we know a priori that there is a formula for the
projector. We only need to find it.
The projectors are not just linear maps, but also intertwiners; according to the last
part of Exercise 3.1.5, this corresponds to an element a of the group algebra k(G) which
is in its center Z(k(G)). (This is because a gives rise to G-homomorphism if and only if
a satisfies
g · a = a · g ∈ k(G)
for all g ∈ G, which is equivalent to a ∈ Z(k(G)) because G generates k(G) as a ring.)
Remark 4.3.26. If we write
a = Σ_{x∈G} α_x x ,   α_x ∈ k,
Taking a = 1, we deduce from this that eπ ∈ I(π) in particular. This means for
instance that
(4.21) e2π = eπ eπ = eπ ,
and also (since other projections map I(π) to 0) that
(4.22) e% eπ = 0
if % is an irreducible representation not isomorphic to π. Note moreover that
(4.23)   1 = Σ_π e_π .
In any ring A, a family (ei ) of elements satisfying the relations (4.21), (4.22) and
(4.23) is known as a “complete system of independent idempotents”. Their meaning is
the following:
Corollary 4.3.27 (Product decomposition of the group algebra). Let G be a finite
group and k an algebraically closed field of characteristic not dividing |G|. Then the
subspaces I(π), where π runs over irreducible k-representations of G, are two-sided ideals
in k(G). Moreover, with eπ ∈ I(π) as unit, I(π) is a subalgebra of k(G) isomorphic to
the matrix algebra Endk (π), and we have a k-algebra isomorphism
(4.24)   k(G) ≃ ∏_π End_k(π) ,   a ↦ (π(e_π a))_π
Proof. The space I(π) is the image of the projection given by multiplication by eπ ,
i.e., we have
I(π) = eπ k(G),
which is, a priori, a right-ideal in k(G). But if we remember that eπ is also in the center
Z(k(G)) of the group algebra, we deduce that I(π) = k(G)eπ also, i.e., that I(π) is a
two-sided ideal.
In particular, like any two-sided ideal, I(π) is stable under multiplication. Usually
1 ∉ I(π), so that 1 does not provide a unit. But, for any a ∈ I(π), if we write a = eπ a1,
we find that
eπ a = e2π a1 = eπ a1 = a
by (4.21), and similarly aeπ = a, so that eπ , which is in I(π), is indeed a unit for this
two-sided ideal!
The identity (4.23) means that, as algebras, we have
k(G) ≃ ∏_π I(π),
where any a ∈ k(G) corresponds to (eπ a)π . Thus there only remains to prove that the
map
I(π) −→ End_k(π) ,   a ↦ π(a)
is an algebra isomorphism.
This is certainly an algebra homomorphism (the unit eπ maps to the identity in
Endk (π), since – by the above – the action of eπ is the projector on M (π), which is the
identity for π itself.) It is surjective, by Burnside’s irreducibility criterion (the latter says,
more precisely, that the image of all of k(G) is Endk (π), but the other isotypic components
map to 0.) We can show that it is an isomorphism either by dimension count (since the
representation of G on k(G) is, for a finite group, isomorphic to that on C_k(G), the
isotypic component I(π) has the same dimension as the space of matrix coefficients of
π, namely (dim π)²), or by proving injectivity directly: if a ∈ I(π) satisfies π(a) = 0,
it follows that the action of a on every π-isotypic component of every representation is
also zero; if we take the special case of k(G) itself, this means in particular that aeπ = 0.
However, aeπ = eπ a = a if a ∈ I(π), and thus a = 0.
Remark 4.3.28. This result also leads to Theorem 4.2.4: the center Z(k(G)) of k(G)
has dimension equal to the number of conjugacy classes of G (since it is the space of all
elements
a = Σ_{g∈G} α_g g
with the coefficients α_g constant on conjugacy classes), while the center of the right-hand
side of (4.24) has dimension equal to the number of irreducible k-representations of G
(since any endomorphism ring has one-dimensional center spanned by the identity). Thus
the dimension of Z(k(G)) is also equal to the number of irreducible k-representations of
G.
Exercise 4.3.29 (How big can a cyclic representation be?). Let G be a finite group
and k a field of characteristic not dividing |G|. Let % : G −→ GL(E) be a finite-
dimensional k-representation of G, and π an irreducible k-representation.
(1) For v ∈ ME (π) ⊂ E, show that the subrepresentation Fv of E generated by v
(which is a cyclic representation, see Remark 2.2.6) is the direct sum of at most dim(π)
copies of π. [Hint: Show that Fv is isomorphic to a quotient of the π-isotypic component
of k(G).]
(2) Show that this can not be improved (i.e., it is possible, for some % and v, that Fv
is the direct sum of exactly dim(π) copies of π.)
(3) If you solved (1) using the group algebra k(G), try to do it without (see [34, Ex.
2.10] if needed).
In the argument leading to the projector formula, we have also proved the following
useful result:
Proposition 4.3.30 (Action of the center of the group ring). Let G be a finite group
and k an algebraically closed field of characteristic not dividing |G|. For any irreducible
k-representation % of G, there is an associated algebra homomorphism
ω_% : Z(k(G)) −→ k ,
i.e., %(a) is multiplication by the scalar ω_%(a):
(4.25)   %(a) = ω_%(a) Id .
This is given by
(4.26)   ω_%( Σ_{g∈G} α_g g ) = (1/dim(%)) Σ_{g∈G} α_g χ_%(g).
The last formula is obtained, as usual, by taking the trace on both sides of (4.25).
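As an illustration (added here, not in the original text), one can check (4.25) and (4.26) on the two-dimensional representation of S3, realized as the dihedral symmetry group of a triangle: each class sum a_c = Σ_{g∈c} g acts as the scalar |c| χ_%(c)/dim(%).

```python
import math

def mm(A, B):  # product of 2x2 matrices
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t = 2 * math.pi / 3
r = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]  # rotation by 120 degrees
s = [[1.0, 0.0], [0.0, -1.0]]                                  # a reflection
e = [[1.0, 0.0], [0.0, 1.0]]
# conjugacy classes of S3 ~ D3 in this 2-dimensional representation
classes = [[e], [r, mm(r, r)], [s, mm(r, s), mm(mm(r, r), s)]]

ok = True
for cl in classes:
    S = [[sum(M[i][j] for M in cl) for j in range(2)] for i in range(2)]  # rho(a_c)
    chi = cl[0][0][0] + cl[0][1][1]   # trace = character value on the class
    scalar = len(cl) * chi / 2        # omega(a_c) = |c| * chi(c) / dim(rho)
    ok = ok and all(abs(S[i][j] - scalar * (1.0 if i == j else 0.0)) < 1e-12
                    for i in range(2) for j in range(2))
print(ok)  # True: each class sum acts as the predicted scalar
```

The three scalars are 1, −1 and 0, for the identity, the 3-cycles and the transpositions respectively.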
Note the following special case: if c ⊂ G is a conjugacy class, the element
a_c = Σ_{g∈c} g
(where 1X is the characteristic function of X). Furthermore, it is usually the case that
the constant function 1 is part of the orthonormal basis (this is the case for characters
as for matrix coefficients), in which case the corresponding term is
h1_X, 1i Σ_{x∈G} f(x) = (|X|/|G|) Σ_{x∈G} f(x),
which can be interpreted as the term that would arise from a heuristic argument, where
|X|/|G| is seen as the probability that some element of G is in X.
We present a first interesting illustration here, which is due to Frobenius, and another
one is found in Section 4.7.1 later on.
Proposition 4.4.3 (Detecting commutators). Let G be a finite group. The number
N (g) of (x, y) ∈ G × G such that g = [x, y] = xyx−1 y −1 is equal to
(4.29)   |G| Σ_{π∈Ĝ} χ_π(g)/χ_π(1) .
Proof. This will be a bit of a roller-coaster, and we won’t try to motivate the
arguments... The first idea is to compute instead N♯(g), the number of (x, y) ∈ G × G
such that [x, y] is conjugate to g. The point is that
(4.30)   N♯(g) = Σ_{h∈g♯} N(h) = |g♯| N(g)
where fg denotes the characteristic function of the conjugacy class of g. Using Corol-
lary 4.4.1, and exchanging the order of the two sums, we get
(4.31)   n(x, g) = (|g♯|/|G|) Σ_{π∈Ĝ} χ_π(g) Σ_{y∈G} χ_π([x, y]).
In order to go further, we consider the inner sum as β(x, x⁻¹), where β is the function
of two variables given by
β(a, b) = Σ_{y∈G} χ_π(a y b y⁻¹)
(after the change of variable h⁻¹y ↦ y). We therefore attempt to expand β in terms of
characters, with respect to the a-variable:
(4.32)   β(a, b) = Σ_{%∈Ĝ} ( (1/|G|) Σ_{h∈G} β(h, b) χ_%(h⁻¹) ) χ_%(a).
is simply
(1/|G|) Σ_{h∈G} χ_%(h⁻¹) reg(h) χ_π ,
or in other words (Proposition 4.3.23), it is 1/ dim(%) times the image of χπ under the
projection on the %-isotypic component of the regular representation! Since χπ is in the
π-isotypic component, this projection is 0 except when % = π, when it is equal to π.
Hence
(1/|G|) Σ_{h∈G} χ_π(hz) χ_%(h⁻¹) = χ_π(z)/dim(π) if % ≃ π, and 0 otherwise,
and
(1/|G|) Σ_{h∈G} β(h, b) χ_%(h⁻¹) = (1/dim(π)) Σ_{y∈G} χ_π(yby⁻¹) = |G| χ_π(b)/dim(π)
if % ≃ π, and 0 otherwise.
Going back to (4.32), only the term % = π survives, and gives
β(a, b) = (|G|/dim π) χ_π(a) χ_π(b),
or in other words, we have the fairly nice formula
(4.33)   χ_π(a) χ_π(b) = (dim π/|G|) Σ_{y∈G} χ_π(a y b y⁻¹).
Coming back to (4.31), we obtain
n(x, g) = (|g♯|/|G|) Σ_{π∈Ĝ} χ_π(g) β(x, x⁻¹) = |g♯| Σ_{π∈Ĝ} χ_π(g) |χ_π(x)|²/dim π .
Summing over x ∈ G, and using the orthonormality relation Σ_{x∈G} |χ_π(x)|² = |G|, we
get N♯(g) = |g♯| |G| Σ_π χ_π(g)/dim π, and dividing by |g♯| (see (4.30)) gives (4.29).
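The formula is easy to confirm by exhaustive search in a small group. The following Python sketch (an added illustration, not part of the text) compares, for each g ∈ S3, the brute-force count of pairs (x, y) with [x, y] = g to the character sum of (4.29):

```python
from itertools import permutations

def mul(p, q):
    return tuple(p[i] for i in q)

def inv(p):
    r = [0] * 3
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)

def sgn(p):  # sign of the permutation, via inversion count
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                s = -s
    return s

def fix(p):
    return sum(1 for i, x in enumerate(p) if x == i)

G = list(permutations(range(3)))
e = (0, 1, 2)
# the three irreducible characters of S3: trivial, sign, 2-dimensional
chars = [lambda g: 1, sgn, lambda g: fix(g) - 1]

ok = True
for g in G:
    brute = sum(1 for x in G for y in G
                if mul(mul(x, y), mul(inv(x), inv(y))) == g)
    frob = len(G) * sum(chi(g) / chi(e) for chi in chars)
    ok = ok and brute == round(frob)
print(ok)  # True: N(g) agrees with |G| * sum of chi(g)/chi(1)
```

In S3 every element is a commutator: N(g) is 18 for the identity, 9 for each 3-cycle, and 0 for transpositions (commutators are even).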
Remark 4.4.4. (1) If we isolate the contribution of the trivial representation, we see
that the number of (x, y) with [x, y] = g is given by
|G| ( 1 + Σ_{π≠1} χ_π(g)/χ_π(1) ).
1 −→ (Z/2Z)4 −→ G −→ A5 −→ 1,
such that the number of actual commutators [x, y] in G is not equal to |G|. To be more
precise, this group G is isomorphic to the commutator subgroup of the group W5 discussed
below in Exercise 4.7.13, with the homomorphism to A5 defined as the restriction to
[W5 , W5 ] of the natural surjection W5 −→ S5 . It inherits from W5 a faithful (irreducible)
representation of dimension 5 by signed permutation matrices, and it turns out that the
120 elements in the conjugacy class of the element
        ( 0 −1  0  0  0 )
        ( 1  0  0  0  0 )
g =     ( 0  0  0  1  0 )
        ( 0  0  1  0  0 )
        ( 0  0  0  0 −1 )
are not commutators, as one can see that all commutators have trace in {−3, −2, 0, 1, 2, 5}.
On the other hand, one can see that g is a commutator in W5 . (Note that, because G
contains at least 481 commutators, it also follows that any g ∈ G is the product of at
most two commutators, by the following well-known, but clever, argument: in any finite
group G, if S1 and S2 are subsets of G with |Si| > |G|/2, and if g ∈ G is arbitrary, the
fact that⁵ |S1| + |gS2⁻¹| > |G| implies that S1 ∩ gS2⁻¹ ≠ ∅, so that g is always of the form
s1 s2 with si ∈ Si ; the end of the proof of Theorem 4.7.1 in Section 4.7.1 will use an even
more clever variant of this argument involving three subsets...)
(2) If we take g = 1 in (4.29), we see that the number of (x, y) in G × G such that
xy = yx is equal to
|G| · |Ĝ| = |G| · |G♯| ,
where G♯ is the set of conjugacy classes of G.
The reader should attempt to prove this directly (without using characters).
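A direct verification for small symmetric groups (an illustrative Python sketch, not part of the text): counting commuting pairs exhaustively and comparing with |G| times the number of conjugacy classes.

```python
from itertools import permutations

def mul(p, q):
    return tuple(p[i] for i in q)

def inv(p):
    r = [0] * len(p)
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)

results = []
for n in (3, 4):
    G = list(permutations(range(n)))
    # brute-force count of commuting pairs (x, y)
    commuting = sum(1 for x in G for y in G if mul(x, y) == mul(y, x))
    # conjugacy classes, computed as orbits under conjugation
    classes = {frozenset(mul(mul(s, g), inv(s)) for s in G) for g in G}
    results.append((commuting, len(G) * len(classes)))
print(results)  # [(18, 18), (120, 120)]
```

The character-free proof amounts to Burnside's counting lemma applied to the conjugation action.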
The representations of a finite group G can also be used to understand other spaces of
functions. We give two further examples, by showing how to construct fairly convenient
orthonormal bases of functions on a quotient G/K, as well as on a given coset of a suitable
subgroup.
Proposition 4.4.5 (Functions on G/H). Let G be a finite group and let H be a
subgroup of G. Let V be the space of complex-valued functions on the quotient G/H, with
the inner product
hϕ1, ϕ2i_V = (1/|G/H|) Σ_{x∈G/H} ϕ1(x) ϕ̄2(x).
For π : G −→ GL(E) and v ∈ E^H, w ∈ E, define
ϕ_{π,v,w} : gH ↦ √(dim E) · hπ(g)v, wi_E .
Then the family of functions (ϕ_{π,v,w}), where π runs over Ĝ, w runs over an orthonormal
basis of the space E_π of π, and v runs over an orthonormal basis of E_π^H, forms
an orthonormal basis of V.
Proof. First of all, the functions ϕπ,v,w are well-defined (that is, they are functions
in V ), because replacing g by gh, with h ∈ H, leads to
hπ(gh)v, wiπ = hπ(g)π(h)v, wiπ = hπ(g)v, wiπ ,
since v ∈ EπH . We observe next that if ϕ̃π,v,w denote the corresponding matrix coefficients
of G, we have
hϕπ1 ,v1 ,w1 , ϕπ2 ,v2 ,w2 iV = hϕ̃π1 ,v1 ,w1 , ϕ̃π2 ,v2 ,w2 i,
so that the family of functions indicated is, by the orthonormality of matrix coefficients,
an orthonormal family in V .
5 We use here the notation S⁻¹ = {x⁻¹ | x ∈ S}.
It only remains to show that these functions span V . But their total number is
Σ_{π∈Ĝ} (dim π)(dim π^H) = Σ_{π∈Ĝ} (dim π) hRes_H^G π, 1_H i_H
                         = Σ_{π∈Ĝ} (dim π) hπ, Ind_H^G(1_H)i_G
and hence
hχ_{π1}, χ_{π2}i_W = (1/|H|) Σ_{y∈Y} χ_{π1}(y) χ̄_{π2}(y)
                   = (1/(|H| |A|)) Σ_{ψ∈Â} ψ̄(α) Σ_{y∈G} ψ(φ(y)) χ_{π1}(y) χ̄_{π2}(y)
                   = Σ_{ψ∈Â} ψ̄(α) h(ψ∘φ) ⊗ π1, π2 i_G
                   = Σ_{ψ∈Â, π2 ≃ (ψ∘φ)⊗π1} ψ̄(α).
This is more complicated than the orthogonality formula, but it remains manageable.
It shows that the characters remain orthogonal on Y unless we have π1 ≃ (ψ ◦ φ) ⊗ π2 for
some ψ ∈ Â. This is natural, because evaluating the characters, we obtain in that case
χπ1 (y) = ψ(α)χπ2 (y)
for y ∈ Y , i.e., χπ1 and χπ2 are then proportional. The factor ψ(α) is of modulus one,
and hence
hχ_{π1}, χ_{π2}i_W = hχ_{π1}, χ_{π1}i_W = Σ_{ψ∈Â, (ψ∘φ)⊗π1 ≃ π1} ψ̄(α)
in that case. This can still be simplified a bit: if ψ occurs in the sum, we obtain
ψ(α)χπ1 (y) = χπ1 (y)
for all y ∈ Y , and therefore either ψ(α) = 1, or χπ1 (y) = 0 for all y ∈ Y . In the second
case, the character actually vanishes on all of Y (and will not help in constructing an
orthonormal basis, so we can discard it!), while in the first case, we get
hχπ1 , χπ1 iW = |{ψ ∈ Â | (ψ ◦ φ) ⊗ π1 ' π1 }|.
Let us denote by
(4.34)   κ(π) = |{ψ ∈ Â | (ψ ◦ φ) ⊗ π ≃ π}|
the right-hand side of this formula: it is an invariant attached to any irreducible repre-
sentation of G. We can then summarize as follows the discussion:
Proposition 4.4.6 (Functions on cosets). Let G be a finite group, H ◁ G a normal
subgroup with abelian quotient G/H.
For α ∈ A, Y and W as defined above, an orthonormal basis of W is obtained by
considering the functions
ϕ_π(y) = χ_π(y) / √κ(π)
for y ∈ Y , where π runs over a subset ĜH defined by (1) removing from Ĝ those π such
that the character of π is identically 0 on Y ; (2) considering among other characters only
a set of representatives for the equivalence relation
π1 ∼_H π2   if and only if   Res_H^G π1 ≃ Res_H^G π2 .
To completely prove this, we must simply say a few additional words to explain why
the relation π1 ∼H π2 in the statement is equivalent with the existence of ψ ∈ Â such
that π2 ' (ψ ◦ φ) ⊗ π1 : in one direction this is clear (evaluating the character, which is 1
on H ⊃ ker ψ), and otherwise, if Res_H^G π1 ≃ Res_H^G π2, we apply the inner product formula
with α = 0 (so that Y = H) to get
0 ≠ hRes_H^G π1, Res_H^G π2 i_H = Σ_{ψ∈Â, π2 ≃ (ψ∘φ)⊗π1} ψ̄(α),
so that the sum can not be empty, and the existence of ψ follows. This remark means that
the functions described in the statement are an orthonormal system in W . We observed
at the beginning of the computation that they generate W , and hence we are done.
Exercise 4.4.7. In the situation of Proposition 4.4.6, show how to obtain an or-
thonormal basis of the space of all functions Y −→ C, with respect to the inner product
on C(G), using restrictions of matrix coefficients. [Hint: Example 3.3.7 can be useful.]
Exercise 4.4.8. Let F_q be a finite field with q elements and n ≥ 2 an integer.
(1) Show that taking G = GLn (Fq ) and H = SLn (Fq ) gives an example of the situation
considered above. What is A in that case?
(2) Show that, in this case, the invariant defined in (4.34) satisfies
κ(π) ≤ n
for any irreducible representation π ∈ Ĝ.
In Exercise 4.6.5, we will give examples of groups having representations where
κ(π) 6= 1, and also examples where the set ĜH differs from Ĝ (for both possible rea-
sons: characters vanishing on Y , or two characters being proportional on Y ).
for χ1 , χ2 ∈ Ĝ,
(4.36)   Σ_{χ∈Ĝ} χ(x) χ̄(y) = |G| if x = y, and 0 if x ≠ y,
for x, y ∈ G.
(3) Let ϕ : G −→ C be any function on G. We have the Fourier decomposition
ϕ = Σ_{χ∈Ĝ} ϕ̂(χ) χ ,
where
ϕ̂(χ) = hϕ, χi = (1/|G|) Σ_{x∈G} ϕ(x) χ̄(x),
and the Plancherel formula
Σ_{χ∈Ĝ} |ϕ̂(χ)|² = (1/|G|) Σ_{x∈G} |ϕ(x)|² .
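These formulas can be checked numerically; the sketch below (illustrative, not from the text) verifies Fourier inversion and the Plancherel formula for an arbitrary function on Z/8Z:

```python
import cmath

n = 8
def chi(a, x):  # the characters of Z/8Z: chi_a(x) = e(ax/8)
    return cmath.exp(2j * cmath.pi * a * x / n)

phi = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, 6.0]  # an arbitrary function on Z/8Z
# Fourier coefficients: phi_hat(a) = (1/n) sum of phi(x) * conj(chi_a(x))
phi_hat = [sum(phi[x] * chi(a, x).conjugate() for x in range(n)) / n
           for a in range(n)]
# Fourier inversion recovers phi
recon = [sum(phi_hat[a] * chi(a, x) for a in range(n)).real for x in range(n)]
# Plancherel: sum |phi_hat|^2 = (1/n) sum |phi|^2
lhs = sum(abs(c) ** 2 for c in phi_hat)
rhs = sum(v * v for v in phi) / n
print(max(abs(recon[x] - phi[x]) for x in range(n)) < 1e-9,
      abs(lhs - rhs) < 1e-9)  # True True
```

This is just the discrete Fourier transform of size 8, normalized as in the text.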
The crucial feature which is specific to abelian groups is that, since all irreducible
representations are of dimension 1, they form a group under pointwise multiplication: if
χ1 , χ2 are in Ĝ, the product
χ1 χ2 : x 7→ χ1 (x)χ2 (x)
is again in Ĝ. Similarly the inverse
χ⁻¹ : x ↦ χ(x)⁻¹ = χ̄(x)
(where the last equality holds because |χ(x)| = 1 for all characters) is a character, and with the trivial
character as neutral element, we see that Ĝ is also a group, in fact a finite abelian group
of the same order as G. Its properties are summarized by:
Theorem 4.5.2 (Duality of finite abelian groups). Let G be a finite abelian group,
and Ĝ the group of characters of G, called the dual group.
(1) There is a canonical isomorphism
e : G −→ (Ĝ)ˆ ,   x ↦ e_x ,
where e_x is the homomorphism of evaluation at x defined on Ĝ, i.e.,
e_x(χ) = χ(x).
(2) The group Ĝ is non-canonically isomorphic to G.
Proof. (1) A simple check shows that e is a group homomorphism. To show that it
is injective, we must show that if x 6= 1, there is at least one character χ with χ(x) 6= 1.
This follows, for instance, from the orthogonality relation
Σ_{χ∈Ĝ} χ(x) = 0.
(2) The simplest argument is to use the structure theory of finite abelian groups:
there exist integers r > 0 and positive integers
d1 | d2 | · · · | dr
such that
G ' Z/d1 Z × · · · × Z/dr Z.
Now we observe that for a direct product G1 × G2 , there is a natural isomorphism
Ĝ1 × Ĝ2 −→ (G1 × G2)ˆ ,   (χ1, χ2) ↦ χ1 χ2 ,
with (χ1 χ2 )(x1 , x2 ) = χ1 (x1 )χ2 (x2 ). Indeed, this is a group homomorphism, which is
quite easily seen to be injective, and the two groups have the same order.6
Thus we find
Ĝ ≃ (Z/d1Z)ˆ × · · · × (Z/drZ)ˆ
and this means that it is enough to prove that Ĝ ' G when G is a finite cyclic group
Z/dZ. But a homomorphism
χ : Z/dZ −→ C×
is determined uniquely by e1 (χ) = χ(1). This complex number must be a d-th root of
unity, and this means that we have an isomorphism
e_1 : (Z/dZ)ˆ −→ μ_d = {z ∈ C× | z^d = 1} ,   χ ↦ χ(1),
Since the group of d-th roots of unity in C× is isomorphic (non-canonically, if d ≥ 3)
to Z/dZ, we are done.
6 It is also surjective by an application of Proposition 2.3.17.
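For a concrete check (added here, not part of the text), the following Python sketch verifies for G = Z/6Z that the characters separate points, which is exactly the injectivity of e used in the proof, and that the dual group is again cyclic of order 6:

```python
import cmath

m = 6
def e(z):
    return cmath.exp(2j * cmath.pi * z)

# the m characters of Z/mZ: chi_a(x) = e(ax/m)
chars = [[e(a * x / m) for x in range(m)] for a in range(m)]

# the map x -> e_x is injective: every x != 0 is detected by some character
separates = all(any(abs(chars[a][x] - 1) > 1e-9 for a in range(m))
                for x in range(1, m))
# chi_1 generates the dual: chi_a = (chi_1)^a, so the dual is cyclic of order m
cyclic = all(abs(chars[1][x] ** a - chars[a][x]) < 1e-9
             for a in range(m) for x in range(m))
print(separates, cyclic)  # True True
```

Since both groups are cyclic of order 6, this exhibits (non-canonically) the isomorphism Ĝ ≃ G of part (2).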
Remark 4.5.3. In practice, one uses very often the explicit description of characters
of Z/mZ that appeared in this proof. Denoting e(z) = e^{2iπz} for z ∈ C, they are the
functions of the form
e_a : x ↦ e(ax/m)
where x ∈ Z/mZ and a ∈ Z/mZ. In this description, of course, the exponential is
to be interpreted as computed using representatives in Z of x and a, but the result is
independent of these choices (simply because e(k) = 1 if k ∈ Z).
Exercise 4.5.4. We have derived the basic facts about representations of finite
abelian groups from the general results of this chapter. However, one can also prove them
using more specific arguments. This exercise discusses one possible approach (see [35,
VI.1]).
(1) Prove the orthogonality relation (4.35) directly.
(2) Show – without using anything else than the definition of one-dimensional charac-
ters – that if H ⊂ G is a subgroup of a finite abelian group, and χ0 ∈ Ĥ is a character of
H, there exists a character χ ∈ Ĝ of G such that χ restricted to H is equal to χ0 . [Hint:
Use induction on |G/H|.] Afterward, reprove this using Frobenius reciprocity instead,
and compare with Exercise 2.3.11.
(3) Deduce from this the orthogonality relation (4.36).
(4) Deduce that Ĝ is an abelian group of the same order as G, and that the homo-
morphism e is an isomorphism.
Example 4.5.5. (1) Quite often, the Fourier decomposition is applied to the char-
acteristic function 1A of a subset A ⊂ G. Isolating – as in the previous section – the
contribution of the trivial character, we then have the expansion
1_A(x) = |A|/|G| + Σ_{χ∈Ĝ, χ≠1} α(χ) χ(x)
with
α(χ) = h1_A, χi = (1/|G|) Σ_{x∈A} χ̄(x).
The interpretation of the first term is the “probability” that an element x ∈ G,
chosen uniformly at random, belongs to A. Neglecting the other terms and using only
this probabilistic term is a common method to reason, on a heuristic level, about what
“should” be true for certain problems.
(2) In particular, it is often interesting to use the characteristic function of the set A
of d-th powers in G, for some fixed d, i.e.,
A = {x ∈ G | x = y^d for some y ∈ G}.
Since y ↦ y^d is surjective when d is coprime to the order of G, one assumes that
d | |G|.
Example 4.5.6 (Dirichlet’s Theorem on primes in arithmetic progressions). We sketch
how Dirichlet succeeded in proving Theorem 1.2.2. Thus we have a positive integer q ≥ 1
and an integer a ≥ 1 coprime with q, and we want to find prime numbers p ≡ a (mod q).
Dirichlet’s proof is motivated by an earlier argument that Euler used to reprove that
there are infinitely many prime numbers, as follows: we know that
lim_{σ→1} Σ_{n≥1} 1/n^σ = +∞,
e.g., by comparison of the series with the integral
∫_1^{+∞} x^{−σ} dx = 1/(σ − 1),   for σ > 1.
On the other hand, exploiting the unique factorization of positive integers in products
of primes, Euler showed that
(4.37)   Σ_{n≥1} n^{−σ} = ∏_p (1 − p^{−σ})^{−1}
for σ > 1, where the infinite product is over all prime numbers. (It is defined as the limit,
as x → +∞, of the partial products
∏_{p≤x} (1 − p^{−σ})^{−1} = ∏_{p≤x} Σ_{k≥0} p^{−kσ} = Σ_{n∈P(x)} n^{−σ} ,
where P(x) is the set of all positive integers with no prime divisor > x, and the unique
factorization of integers has been used in the last step; the absolute convergence of
the series on the left of (4.37) is then enough to justify the equality (4.37).)
Now obviously, if there were only finitely many primes, the right-hand side of the
formula would converge to some fixed real number as σ → 1, which contradicts what we
said about the left-hand side. Hence there are infinitely many primes.
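A numerical illustration (added here, not from the text): truncating the Euler product at p ≤ 1000, with σ = 2, already gives a good approximation to ζ(2) = π²/6.

```python
import math

def primes_up_to(n):  # simple sieve of Eratosthenes
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for k in range(p * p, n + 1, p):
                sieve[k] = False
    return [p for p, is_p in enumerate(sieve) if is_p]

sigma = 2.0
prod = 1.0
for p in primes_up_to(1000):
    prod *= 1.0 / (1.0 - p ** -sigma)
print(prod, math.pi ** 2 / 6)  # the partial product is close to 1.6449...
```

The missing factors account for integers with a prime divisor above 1000, whose total contribution to the series is small.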
An equivalent way to conclude is to take the logarithm on both sides; denoting
ζ(σ) = Σ_{n≥1} n^{−σ},
we get log ζ(σ) = Σ_p Σ_{k≥1} p^{−kσ}/k = Σ_p p^{−σ} + O(1) as σ → 1, using the power
series log(1 − x)^{−1} = Σ_{k≥1} x^k/k (which converges absolutely and uniformly on
compact sets for |x| < 1). Thus Euler's
argument can be phrased as
lim_{σ→1} Σ_p p^{−σ} = +∞.
Using this, it is rather tempting (isn’t it?) to try to analyze either the product or the
sum
∏_{p≡a (mod q)} (1 − p^{−σ})^{−1} ,    Σ_{p≡a (mod q)} p^{−σ} ,
similarly, and to show that, as σ → 1, these functions tend to +∞ (as before, they
differ at most by a bounded function). But if we expand the product, we do not get the
“obvious” series
Σ_{n≥1, n≡a (mod q)} n^{−σ} ,
because there is no reason that the primes dividing an integer congruent to a modulo
q should have the same property (also, if a is not ≡ 1 (mod q), the product of primes
congruent to a modulo q is not necessarily ≡ a (mod q)): e.g., 35 = 7 × 5 is congruent to
3 modulo 4, but 5 ≡ 1 (mod 4).
In other words, we are seeing the effect of the fact that the characteristic function
of a ∈ Z/qZ, which is used to select the primes in the product or the series, is not
multiplicative. Dirichlet’s idea is to use, instead, some functions on Z/qZ which are
multiplicative, and to use them to recover the desired characteristic function.
A Dirichlet character modulo q is then defined to be a map
χ : Z −→ C
such that χ(n) = 0 if n is not coprime to q, and otherwise χ(n) = χ∗ (n (mod q)) for some
character of the multiplicative group of invertible residue classes modulo q:
χ∗ : (Z/qZ)× −→ C× .
It follows that χ(nm) = χ(n)χ(m) for all n, m > 1 (either because both sides are 0
or because χ∗ is a homomorphism), and from the orthogonality relation we obtain
Σ_{χ (mod q)} χ̄(a) χ(n) = |(Z/qZ)×| = ϕ(q) if n ≡ a (mod q), and 0 otherwise,
for n > 1 (because a is assumed to be coprime to q), the sum ranging over all Dirichlet
characters modulo q, which correspond exactly to the characters of the group (Z/qZ)× .
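For a small modulus this orthogonality is easy to check by machine. The sketch below (illustrative, not from the text; it builds the four Dirichlet characters modulo 5 from a generator of (Z/5Z)×) verifies that averaging χ̄(a)χ(n) over the characters selects exactly the progression n ≡ 3 (mod 5):

```python
import cmath

q = 5
g = 2  # a generator of the cyclic group (Z/5Z)^x
dlog = {}
x = 1
for k in range(q - 1):  # discrete logarithm base g
    dlog[x] = k
    x = (x * g) % q

def chi(j, n):  # the j-th Dirichlet character modulo 5
    if n % q == 0:
        return 0.0
    return cmath.exp(2j * cmath.pi * j * dlog[n % q] / (q - 1))

a = 3
picked = []
for n in range(1, 11):
    s = sum(chi(j, a).conjugate() * chi(j, n) for j in range(q - 1)) / (q - 1)
    picked.append(round(s.real))
print(picked)  # [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]: only n = 3 and n = 8 survive
```

This is precisely the "suitable harmonics" step: the characteristic function of the progression is a linear combination of multiplicative functions.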
This is the first crucial point: the use of “suitable harmonics” to analyze the charac-
teristic function of a residue class. Using it by summing over n, we obtain the identity
Σ_{p≡a (mod q)} p^{−σ} = (1/ϕ(q)) Σ_{χ (mod q)} χ̄(a) Σ_p χ(p) p^{−σ} ,
while, for each χ, the multiplicativity leads to an analogue of the Euler product:
Σ_{n≥1} χ(n) n^{−σ} = ∏_p (1 − χ(p) p^{−σ})^{−1} .
As now classical, we denote by L(σ, χ) the function in this last formula. By the same
reasoning used for ζ(σ), it satisfies
log L(σ, χ) = Σ_p χ(p) p^{−σ} + O(1)
as σ → 1. We have therefore
Σ_{p≡a (mod q)} p^{−σ} = (1/ϕ(q)) Σ_{χ (mod q)} χ̄(a) log L(σ, χ) + O(1),
for all σ > 1, and now the idea is to imitate Euler by letting σ → 1 and seeing a divergence
emerge on the right-hand side, which then implies that the series on the left can not have
only finitely many non-zero terms.
On the right-hand side, for the character χ0 corresponding to χ∗ = 1, we have
L(σ, χ0) = ∏_{p∤q} (1 − p^{−σ})^{−1}
for any representation %, where the coordinate n(a) of n = (n(a)) is the multiplicity of
the irreducible character
x ↦ e(ax/m)
of Z/mZ.
(3) Show that if
m = ∏_{p|m} p^{k_p} ,
(4) Show that for p prime and for any quadratic form Q0 of rank d ≥ 1, we have
(Q_p ⊗ Q_0)(n) = (1/2) Σ_{a,b∈Z/pZ} Q_0(n(a) − n(b))
for any n = (n(a)) ∈ (Z^d)^p. [Hint: It may be useful to start with Q0(n) = n² of rank 1.]
(5) For Q0 as above, non-negative, let s(Q0) denote the smallest non-zero value of
Q0(n) for n ∈ Z^d. Show that for any quadratic form Q0 of rank d ≥ 1, we have
s(Q_p ⊗ Q_0) = (p − 1) s(Q_0).
(6) For k ≥ 2 and p prime, show that there exists a quadratic form Q0 of rank p^{k−1}
such that Q_{p^k} = Q_p ⊗ Q0 and s(Q0) = p^{k−1}. Then prove (4.38).
        1     (12)    (123)
1       1      1        1
ε       1     −1        1
%2      2      0       −1
as before. In other words, assume a subspace M ⊂ C(G) has the property that there
exist eigenvalues λc ∈ C, defined for all c, such that
(4.40)   M = {v ∈ C(G) | reg(a_c)v = λ_c v for all c ∈ G♯} ;
then M is one of the M (π), and conversely.
If this is granted, we proceed as follows: list the conjugacy classes, and for each of
them, find the eigenvalues and eigenspaces of the operator reg(ac ) (by finding bases for
them, using linear algebra, which is eminently computable), then compute their intersec-
tions, and list the resulting subspaces: they are the isotypic components M (π).
8 As an example of the subtleties that may be involved, note that having a generating set is not
enough: one must be able to say whether two arbitrary products of elements from such a set are equal
or not.
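The key computation behind this algorithm can be illustrated in Python (a sketch added here, not from the text): for G = S3, each irreducible character, viewed as a vector in C(G), is a common eigenvector of the operators reg(a_c), with eigenvalue |c| χ_π(c)/dim π.

```python
from itertools import permutations

def mul(p, q):
    return tuple(p[i] for i in q)

def inv(p):
    r = [0] * 3
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)

def fix(p):
    return sum(1 for i, x in enumerate(p) if x == i)

def sgn(p):
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                s = -s
    return s

G = list(permutations(range(3)))
classes = {frozenset(mul(mul(s, g), inv(s)) for s in G) for g in G}
# irreducible characters of S3 with their dimensions (as in the table above)
irreducibles = [(lambda g: 1, 1), (sgn, 1), (lambda g: fix(g) - 1, 2)]

ok = True
for cl in classes:
    cl = sorted(cl)
    for chi, dim in irreducibles:
        lam = len(cl) * chi(cl[0]) / dim  # predicted eigenvalue |c| chi(c) / dim
        # (reg(a_c) chi)(x) = sum over g in the class of chi(x g)
        ok = ok and all(abs(sum(chi(mul(x, g)) for g in cl) - lam * chi(x)) < 1e-9
                        for x in G)
print(ok)  # True: the characters are common eigenvectors of the class sums
```

Finding the common eigenspaces of the reg(a_c) therefore recovers the isotypic decomposition, as the algorithm claims.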
We now check the claim. Let M be a common eigenspace of the reg(ac ) given by (4.40).
Writing irreducible characters9 as combinations of the ac ’s, and taking linear combina-
tions, it follows that M is also contained in an eigenspace of the isotypic projectors eπ ,
for any π. But these are M (π), for the eigenvalue 1, and the sum of the other isotypic
components, for the eigenvalue 0. The sum of the eπ is the identity, so there exists some
π ∈ Ĝ such that the eigenvalue of eπ on M is 1. This means that M ⊂ M (π). But M (π)
itself is contained in a common eigenspace:
    M(π) ⊂ { v ∈ C(G) | reg(ac)v = (|c|χπ(c)/|G|) v for all c },
by Proposition 4.7.11. This shows that the λc must coincide with |c|χπ(c)/|G|. But then
M ⊂ M (π) ⊂ M,
and these inclusions must be equalities!
This algorithm is not at all practical if G is large, but at least it shows that, by trying
to get information from the character table, we can not be building castles completely in
the air!
Note that besides this fact, the proof has led to the characterization
    M(π) = { v ∈ C(G) | reg(a)v = ωπ(a)v for all a ∈ Z(k(G)) }.
9 We are allowed now to use characters as theoretical tools to check that the algorithm works!
Proof. Since % is unitary, the eigenvalues of %(g) are of modulus 1, and hence its trace, the value χ%(g), is of modulus at most dim(%). Moreover, by the equality case of the triangle inequality, equality can hold only if all eigenvalues are equal, which means (since %(g) is diagonalizable) that %(g) is multiplication by this common eigenvalue.
Finally, in that case, the common eigenvalue is χ%(g)/dim(%), and hence it is equal to 1 – i.e., %(g) is the identity – if and only if χ%(g) = dim(%).
Note, however, that in general a character, even for a faithful representation, has no
reason to be injective (on G or on conjugacy classes): the regular representation gives an
example of this (it is faithful but, for |G| > 3, its character is not an injective function
on G (Example 2.7.34)). Another type of “failure of injectivity” related to characters is
described in Exercise 4.6.14.
– More generally, all normal subgroups of G can be computed using the character
table, as well as their possible inclusion relations. Indeed, first one can find the kernels of
the irreducible representations using the lemma and the character table. Then, if N / G
is any normal subgroup, we have
N = ker %N
where %N is the permutation representation associated to the left-action of G on G/N
(indeed, if exN are the basis vectors for the space of %N , to have g ∈ ker %N means that
gexN = exN for all x ∈ G, i.e., gxN = xN for all x, so g ∈ N , and the converse follows
because N is normal) and if we denote by
    X = {% ∈ Ĝ | ⟨%, Ind_N^G(1)⟩ ≥ 1}
the set of those irreducible representations which occur in the induced representation, it follows that
    N = ⋂_{%∈X} ker %.
Thus, to determine all normal subgroups of G, from the character table, one can
first list all kernels of irreducible representations, and then compute all intersections of
finitely many such subgroups. In particular, once the conjugacy classes which form a
normal subgroup are known, its order can of course be computed by summing their size.
Note that, on the other hand, there is no way to determine all subgroups of G from the
character table (see Exercise 4.6.6).
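As a toy illustration of this recipe (a sketch of ours, not the text's: plain Python, using the character table of S3 displayed at the beginning of this section), one can read off the kernels of the irreducible characters as unions of conjugacy classes where χ(c) = χ(1), and then intersect them:

```python
from itertools import combinations

# Character table of S3: the conjugacy classes 1, (12), (123) have sizes 1, 3, 2.
sizes = {'1': 1, '(12)': 3, '(123)': 2}
table = {
    'triv': {'1': 1, '(12)': 1, '(123)': 1},
    'sign': {'1': 1, '(12)': -1, '(123)': 1},
    'dim2': {'1': 2, '(12)': 0, '(123)': -1},
}

# ker(rho) is the union of the classes c with chi(c) = chi(1).
kernels = [frozenset(c for c in sizes if chi[c] == chi['1'])
           for chi in table.values()]

# Every normal subgroup is an intersection of such kernels.
normal = set()
for r in range(1, len(kernels) + 1):
    for combo in combinations(kernels, r):
        normal.add(frozenset.intersection(*combo))
orders = sorted(sum(sizes[c] for c in n) for n in normal)
print(orders)  # [1, 3, 6]: the orders of the normal subgroups of S3
```

The three normal subgroups 1, A3 and S3 are recovered, with their orders obtained by summing class sizes, exactly as in the text.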
– The character table of a quotient G/N of G by a normal subgroup N / G can also be
determined from the character table, since irreducible representations of G/N correspond
bijectively to those irreducible representations of G where N ⊂ ker %.
– Whether G is solvable can then, in principle, be determined from the character
table. Indeed, we can determine whether G contains a proper normal subgroup N with
abelian quotient G/N (one can check if a group is abelian by checking that all irreducible
representations have dimension 1); if N = 1, then G is abelian, hence solvable, and if N
does not exist, the group is not solvable. Otherwise, we can iterate with G replaced by
G/N .
– It is also natural to ask what can not be determined from the character table. At
first, one might hope that the answer would be “nothing at all!”, i.e., that it may be used
to characterize the group up to isomorphism. This is not the case, however,10 as we will
show with a very classical example of two non-isomorphic groups with identical character
10 The reader should not add “unfortunately”: there are no unfortunate events in mathematics...
tables, in Exercise 4.6.6 below. It will follow, in particular, that the character table can
not be used to determine all subgroups of a finite group.
If b ≠ 0, sending x to xb + c is a bijection of Fp, and the result is therefore
    χ%(g) = pψ1(b) if ψ2 = 1,   and   χ%(g) = 0 if ψ2 ≠ 1,
while for b = 0, which means g = {0, 0, c}H ∈ Z, we have
χ% (g) = pψ2 (c).
Hence there are two cases for χ%, depending on whether ψ2 is trivial or not. In the corresponding table of values, the middle column concerns p − 1 non-central conjugacy classes, and the last concerns
the remaining p2 − p, each having p elements. Thus the respective squared norms in the
two cases are
    (1/p³) ( p · p² + p(p − 1) · p² ) = p,
when ψ2 = 1 and
    (p · p²)/p³ = 1
when ψ2 6= 1. Thus, whenever ψ2 6= 1, we have an irreducible representation. Moreover,
the character values in that case show that % is then independent of the choice of ψ1 ,
up to isomorphism, while looking at the values at the center, we see that inducing using
different choices of ψ2 leads to different representations of Hp . In other words, the p − 1
representations
    %ψ2 = Ind_{Kp}^{Hp}(ψ),   ψ(b, c) = ψ2(c),
with ψ2 non-trivial, give the remaining p−1 irreducible representations of Hp of dimension
p.
We can then present the full character table as follows, where the sole restriction is
that ψ2 in the last row should be non-trivial:
           {0, 0, c}H    {a, b, ?}H, (a, b) ≠ (0, 0)
χψ1,ψ2          1              ψ1(a)ψ2(b)
%ψ2          pψ2(c)                 0
with abelian kernel
    U = { (1 t; 0 1) | t ∈ Fp } ≃ Fp,
i.e., Bp is an extension of abelian groups.
As before, we compute the conjugacy classes, using the formula
    (x t; 0 y) (a u; 0 b) (x t; 0 y)^{−1} = (a  y^{−1}(t(b − a) + xu); 0  b).
We consider the middle matrix to be fixed, and we look for its conjugacy class. If b ≠ a, the top-right coefficient can take any value when varying x, y and t, in fact even with x = y = 1, and this gives us (p − 1)(p − 2) conjugacy classes of size p. If a = b, there are
two cases: (1) if u 6= 0, we can get all non-zero coefficients, thus we have p − 1 conjugacy
classes of size p − 1; (2) if u = 0, then in fact the matrix is scalar and its conjugacy class
is a single element.
To summarize, there are:
• p − 1 central conjugacy classes of size 1;
• p − 1 conjugacy classes of size p − 1, with representatives (a 1; 0 a);
• (p − 1)(p − 2) conjugacy classes of size p, with representatives (a 0; 0 b), a ≠ b.
The number of conjugacy classes, and hence of irreducible representations, is now
(p − 1)(p − 2) + 2(p − 1) = p(p − 1).
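For small p, this class count can be confirmed by brute force. The sketch below is an illustration of ours, not the text's: it encodes an element of Bp as a triple (x, t, y) standing for the matrix (x t; 0 y) with x, y nonzero mod p, and finds p(p − 1) = 20 classes for p = 5.

```python
# Conjugacy classes of B_p by brute force, for p = 5.
p = 5
group = [(x, t, y) for x in range(1, p) for t in range(p) for y in range(1, p)]

def mul(g, h):  # (x1 t1; 0 y1)(x2 t2; 0 y2) = (x1 x2, x1 t2 + t1 y2; 0, y1 y2)
    (x1, t1, y1), (x2, t2, y2) = g, h
    return ((x1 * x2) % p, (x1 * t2 + t1 * y2) % p, (y1 * y2) % p)

def inv(g):  # (x t; 0 y)^{-1} = (x^{-1}, -x^{-1} t y^{-1}; 0, y^{-1})
    x, t, y = g
    xi, yi = pow(x, p - 2, p), pow(y, p - 2, p)
    return (xi, (-xi * t * yi) % p, yi)

classes = {frozenset(mul(mul(h, g), inv(h)) for h in group) for g in group}
print(len(classes))  # p*(p-1) = 20
```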
We proceed to a thorough search... First, as in the previous section, we can easily
find the one-dimensional characters; the commutator formula
    (x t; 0 y)^{−1} (a u; 0 b)^{−1} (x t; 0 y) (a u; 0 b) = (1 (something); 0 1)
shows that the morphism (4.42) factors in fact through an isomorphism
    Bp/[Bp, Bp] ≃ (Fp^×)².
            (a 0; 0 a)          (a 1; 0 a)      (a 0; 0 b), a ≠ b
τ ⊗ χ    (p − 1)χ1(a)χ2(a)    −χ1(a)χ2(a)              0
These are all irreducible representations, but they depend only on the product char-
acter χ1 χ2 of F×p , and thus there are only p − 1 different irreducible representations of
dimension p − 1 that arise in this manner.
We have now found the right number of representations. We summarize all this
in the character table, using the characters %(χ, 1) to obtain the (p − 1)-dimensional
representations:
              (a 0; 0 a)        (a 1; 0 a)      (a 0; 0 b), a ≠ b
%(χ1, χ2)     χ1(a)χ2(a)        χ1(a)χ2(a)         χ1(a)χ2(b)
τ ⊗ %(χ, 1)  (p − 1)χ(a)          −χ(a)                0

We now consider the induced representations
    π(χ1, χ2) = Ind_{Bp}^{Gp}(%(χ1, χ2)),
which has dimension p + 1. To compute its character, we use the set of representatives
    R = { (1 0; t 1) | t ∈ Fp } ∪ { (0 1; 1 0) },
(where we write ar and br for the diagonal coefficients of rgr−1 ). Before considering the
four conjugacy types in turn, we observe the following very useful fact: for any x ∈ Gp ,
we have
(4.48)    xBp x^{−1} ∩ Bp = Bp if x ∈ Bp, and
          xBp x^{−1} ∩ Bp = { g | g diagonal in the basis (e1, xe1) } if x ∉ Bp.
Indeed, by definition, an element of Bp has the first basis vector e1 ∈ Fp² as eigenvector, and an element of xBp x^{−1} has xe1 as eigenvector; if xe1 is not proportional to e1 – i.e., if x ∉ Bp – this means that g ∈ Bp ∩ xBp x^{−1} if (and only if) g is diagonal in the fixed basis (e1, xe1). (Note that in this case, the intersection is a specific conjugate of the
group T1 of diagonal matrices.)
Now we compute:
• If g = (a 0; 0 a) is scalar, we obtain χπ(g) = (p + 1)χ1(a)χ2(a);
• If g = (a 1; 0 a) is not semisimple, since it is in Bp, only 1 ∈ R contributes to the sum (since for r ≠ 1 to contribute, it would be necessary that g ∈ r^{−1}Bp r ∩ Bp, which is not possible by (4.48) since g is not diagonalizable). Thus we get
    χπ(g) = χ1(a)χ2(a)
in that case;
• If g = (a 0; 0 b) with a ≠ b, besides r = 1, the other contributions must come from r ∈ R such that g is diagonal in the basis (e1, re1), which is only possible if re1 = e2, i.e., the only other possibility is r = w; this gives
    χπ(g) = χ1(a)χ2(b) + χ1(b)χ2(a)
for the split semisimple elements;
• If g is non-split semisimple, it has no conjugate at all in Bp (as this would mean
that g has an eigenvalue in Fp ), and hence χπ (g) = 0.
The character values are therefore quite simple:
              (a 0; 0 a)          (a 1; 0 a)     (a 0; 0 b), b ≠ a           (a b; εb a), b ≠ 0
π(χ1, χ2)  (p + 1)χ1(a)χ2(a)    χ1(a)χ2(a)    χ1(a)χ2(b) + χ1(b)χ2(a)             0
The squared norm of the character of these induced representations is also quite
straightforward to compute: we find
    ⟨χπ, χπ⟩ = (1/|Gp|) { (p − 1)(p + 1)² + (p − 1)(p² − 1) + A }
where A is the contribution of the split semisimple classes, namely
    A = (1/2) p(p + 1) ∑_{a,b∈Fp^×, a≠b} |χ1(a)χ2(b) + χ1(b)χ2(a)|².
Thus there are two cases: if χ1 = χ2, we have B = (p − 1)² − (p − 1), whereas if χ1 ≠ χ2, we get B = −(p − 1). This leads, if no mistake is made in gathering all the terms, to
    (4.49)   ⟨π(χ1, χ2), π(χ1, χ2)⟩ = 2 if χ1 = χ2, and 1 if χ1 ≠ χ2.
Thus π(χ1, χ2) is irreducible if and only if χ1 ≠ χ2 (see Exercise 4.8.3 for another argument towards this result, which is less computational). This means that we have found many irreducible representations of dimension p + 1. The precise number is ½(p − 1)(p − 2), because in addition to requiring χ1 ≠ χ2, we must remove the possible isomorphisms between those representations, and the character values show that if χ1 ≠ χ2, we have
π(χ1, χ2) ≃ π(χ′1, χ′2) if and only if (χ′1, χ′2) = (χ1, χ2) or (χ′1, χ′2) = (χ2, χ1).
These ½(p − 1)(p − 2) representations are called the principal series representations
for GL2 (Fp ).
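The value (4.49) can be confirmed by machine for small p, using nothing but the standard induced-character formula. The sketch below is our own check, not part of the text: matrices of GL2(F3) are encoded as 4-tuples (a, b, c, d), and ⟨χπ, χπ⟩ is computed for a pair χ1 ≠ χ2 and for χ1 = χ2.

```python
import itertools

p = 3
def det(g): return (g[0]*g[3] - g[1]*g[2]) % p
G = [g for g in itertools.product(range(p), repeat=4) if det(g) != 0]

def mul(g, h):
    return ((g[0]*h[0] + g[1]*h[2]) % p, (g[0]*h[1] + g[1]*h[3]) % p,
            (g[2]*h[0] + g[3]*h[2]) % p, (g[2]*h[1] + g[3]*h[3]) % p)

def inv(g):
    d = pow(det(g), p - 2, p)
    return ((g[3]*d) % p, (-g[1]*d) % p, (-g[2]*d) % p, (g[0]*d) % p)

B = [g for g in G if g[2] == 0]            # the subgroup B_p (upper triangular)
triv, sgn = {1: 1, 2: 1}, {1: 1, 2: -1}    # the two characters of F_3^x

def induced_char(chi1, chi2):
    # chi_pi(g) = (1/|B|) sum over x with x g x^{-1} in B of chi1(a)chi2(b)
    def val(g):
        s = 0
        for x in G:
            c = mul(mul(x, g), inv(x))
            if c[2] == 0:
                s += chi1[c[0]] * chi2[c[3]]
        return s / len(B)
    return [val(g) for g in G]

norm2 = lambda ch: sum(abs(v) ** 2 for v in ch) / len(G)
n_distinct = norm2(induced_char(triv, sgn))
n_equal = norm2(induced_char(triv, triv))
print(round(n_distinct, 6), round(n_equal, 6))  # 1.0 2.0, as in (4.49)
```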
Remark 4.6.9. The existence of the isomorphism π(χ1 , χ2 ) ' π(χ2 , χ1 ) is guaranteed
by the equality of characters. It is not immediate to write down an explicit isomorphism.
(Note that, by Schur’s Lemma, we have dim HomGp (π(χ1 , χ2 ), π(χ2 , χ1 )) = 1, so at least
the isomorphism is unique, up to scalar; for an actual description, see, e.g., [7, p.404].)
Even when χ1 = χ2 we are not far from having an irreducible representation: since
hπ(χ1 , χ1 ), π(χ1 , χ1 )i = 2,
the induced representation has two irreducible components. Could it be that one of
them is one-dimensional? Using Frobenius reciprocity we see that for a one-dimensional
character of the type χ ◦ det, we have
    ⟨π(χ1, χ1), χ ∘ det⟩_{Gp} = ⟨%(χ1, χ1), %(χ, χ)⟩_{Bp} = 0 if χ1 ≠ χ, and 1 if χ1 = χ,
since the restriction of χ ◦ det to Bp is
    (a t; 0 b) ↦ χ(a)χ(b).
Switching notation, we see that π(χ, χ) contains a unique 1-dimensional representation, which is χ ∘ det. Its other component, denoted St(χ), is irreducible of dimension p,
with character values given by
χSt(χ) = χπ(χ,χ) − χ ◦ det,
namely
          (a 0; 0 a)    (a 1; 0 a)    (a 0; 0 b), b ≠ a    (a b; εb a), b ≠ 0
St(χ)      pχ(a)²           0             χ(a)χ(b)           −χ(a² − εb²)
From this, we see also that these representations are pairwise non-isomorphic (use the
values for split semisimple elements).
Another description of these representations, which are called the Steinberg repre-
sentations, is the following: first Gp acts on the set Xp of lines in F2p , as did Bp in the
previous section; by linear algebra, this action is doubly transitive (choosing two vectors
on a pair of distinct lines gives a basis of F2p , and any two bases can be mapped to one
another using Gp ), and therefore by Proposition 4.3.17, the permutation representation
associated to Xp splits as
1 ⊕ St
for some irreducible representation St of dimension p. Then we get
St(χ) ' St ⊗χ(det)
G
(e.g., because the permutation representation on Xp is isomorphic to IndBpp (1) = π(1, 1),
as in Example 2.6.4, (2), so that St is the same as St(1), and then one can use character
values to check the effect of multiplying with χ ◦ det.) The character of “the” Steinberg
representation is particularly nice-looking:
       (a 0; 0 a)    (a 1; 0 a)    (a 0; 0 b), b ≠ a    (a b; εb a), b ≠ 0
St         p             0                1                    −1
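This description via the doubly transitive action on lines is easy to check numerically: counting fixed lines of each g gives the permutation character on Xp, and subtracting 1 should leave an irreducible character of dimension p. A sketch of ours, for p = 3 (lines of F3² are normalized to a spanning vector (1, t) or (0, 1)):

```python
import itertools

p = 3
def det(g): return (g[0]*g[3] - g[1]*g[2]) % p
G = [g for g in itertools.product(range(p), repeat=4) if det(g) != 0]

# The p + 1 lines of F_p^2, each given by a chosen spanning vector.
lines = [(1, t) for t in range(p)] + [(0, 1)]

def act(g, v):  # matrix (a b; c d) acting on the column vector (x, y)
    return ((g[0]*v[0] + g[1]*v[1]) % p, (g[2]*v[0] + g[3]*v[1]) % p)

def line_of(v):  # normalize a nonzero vector to its line representative
    x, y = v
    if x != 0:
        return (1, (y * pow(x, p - 2, p)) % p)
    return (0, 1)

# Permutation character = number of fixed lines; subtract 1 to get St.
chi_st = [sum(1 for l in lines if line_of(act(g, l)) == l) - 1 for g in G]
norm2 = sum(v * v for v in chi_st) / len(G)
print(norm2, chi_st[G.index((1, 0, 0, 1))])  # 1.0 (irreducible) and dim = 3
```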
         (a 0; 0 a)    (a 1; 0 a)    (a 0; 0 b), b ≠ a       (a b; εb a), b ≠ 0
R(φ)    (p − 1)φ(a)      −φ(a)              0             −(φ(a + b√ε) + φ(a − b√ε))
R(φ1 ) = R(φ2 )
Once this is known, we have all the characters we need. Indeed, these characters are
of dimension p − 1. To count them, we note that the condition (4.50) is equivalent with
    ker(φ) ⊃ { w ∈ Fp(√ε)^× | w = (x + y√ε)/(x − y√ε) for some (x, y) ≠ (0, 0) }
(when seeing φ as a character of Fp(√ε)^×) and the right-hand side is the same as the kernel of the norm map
    Fp(√ε)^× −→ Fp^×.
Thus those φ which do satisfy the condition are in bijection with the characters of the image of the norm map, which is Fp^× since the norm is surjective. There are therefore p − 1 such characters (those of the form
    x + y√ε ↦ χ(x² − εy²)
where χ is a character of Fp^×) to be excluded from the p² − 1 characters of T2. Finally, the identities R(φ) = R(φ′) show that the total number of irreducible characters given by the proposition is, as expected, ½p(p − 1).
Proof of Proposition 4.6.10. We see first that the identity R(φ) = R(φ′) does hold, as equality of class functions. Similarly, the restriction φ ≠ φ′ is a necessary condition for R(φ) to be an irreducible character, as we see by computing the square
norm, which should be equal to 1: we have
    ⟨R(φ), R(φ)⟩ = (1/|Gp|) { (p − 1)³ + (p − 1)(p² − 1) + (1/2) p(p − 1) ∑_{a,b∈Fp, b≠0} |φ(a + b√ε) + φ(a − b√ε)|² }
11 A better notation would be φ′ = φ^p, since this is what the operation a + b√ε ↦ a − b√ε amounts to.
and the last sum (rather like the one for the induced representation π(χ1, χ2)) is equal to
    ∑_{a,b∈Fp, b≠0} |φ(a + b√ε) + φ(a − b√ε)|² = 2p(p − 1) + 2 Re ∑_{a,b∈Fp, b≠0} φ(a + b√ε) φ(a − b√ε)^{−1}
        = 2p(p − 1) + 2 Re ( ∑_{x∈Fp(√ε)^×} φ(x) φ′(x)^{−1} − ∑_{a∈Fp^×} 1 )
(this is especially easy to evaluate because T2 only intersects conjugacy classes of central
and non-split semisimple elements.)
We are thus done computing this character table! To summarize, we present it in a
single location:
            (a 0; 0 a)          (a 1; 0 a)     (a 0; 0 b), b ≠ a            (a b; εb a), b ≠ 0
χ ∘ det       χ(a²)               χ(a²)             χ(ab)                      χ(a² − εb²)
π(χ1, χ2)  (p + 1)χ1(a)χ2(a)    χ1(a)χ2(a)    χ1(a)χ2(b) + χ1(b)χ2(a)              0
St(χ)        pχ(a²)                 0               χ(ab)                     −χ(a² − εb²)
R(φ)       (p − 1)φ(a)            −φ(a)               0               −(φ(a + b√ε) + φ(a − b√ε))
Remark 4.6.11. (1) For p = 2, the only actual difference is that there are no split
semisimple conjugacy classes, and correspondingly no principal series (induced) represen-
tations. Indeed, GL2 (F2 ) is isomorphic to S3 (an isomorphism is obtained by looking
at the permutations of the three lines in F22 induced by an element of GL2 (F2 )), and
the character table of the latter in Example 4.6.1 corresponds to the one above when we
remove the third line and column: the 2-dimensional representation of S3 corresponds
to the (unique) Steinberg representation and the signature corresponds to the (unique)
cuspidal representation of GL2 (F2 ).
(2) The restriction to GL2 (k) where k is a field of prime order was merely for conve-
nience; all the above, and in particular the full character table, are valid for an arbitrary
finite field k, with characters of k^×, and of the group of invertible elements in its quadratic extension, instead of those of Fp^× and F_{p²}^×.
This computation of the character table of GL2 (Fq ) was somewhat involved. The
following series of exercises shows some of the things that can be done once it is known.
Exercise 4.6.12 (Characters of SL2 (Fp )). The group SL2 (Fp ) is quite closely related
to GL2 (Fp ), and one can compute the character table of one from that of the other.
(1) For p > 3, show that SL2 (Fp ) has p + 4 conjugacy classes, and describe represen-
tatives of them.
(2) By decomposing the restriction to SL2 (Fp ) of the irreducible representations of
GL2 (Fp ), describe the character table of SL2 (Fp ) for p > 3. [See, for instance, [13, §5.2]
for the results; there are two irreducible representations of GL2 (Fp ) which decompose as
a direct sum of two representations whose characters are quite tricky to compute, and
you may try at first to just compute the dimensions of the irreducible components.]
(3) Show in particular that
    (4.51)   min_{π≠1} dim π = (p − 1)/2,
where π runs over all non-trivial irreducible (complex) representations of SL2(Fp).
(In Section 4.7.1, we will see some striking applications of the fact that this dimension
is large, in particular that it tends to infinity as p does, and we will prove (4.51) more
directly, independently of the computation of the full character table.)
(4) For G = GL2(Fp), H = SL2(Fp), A = G/H ≃ Fp^×, compute the invariant κ(π)
defined in (4.34) for all representations π of G.
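The class count of part (1) of the exercise above can at least be confirmed by brute force for a small p (a check of ours, not a solution of the exercise): for p = 5 there are indeed p + 4 = 9 conjugacy classes.

```python
import itertools

p = 5
SL = [g for g in itertools.product(range(p), repeat=4)
      if (g[0]*g[3] - g[1]*g[2]) % p == 1]

def mul(g, h):
    return ((g[0]*h[0] + g[1]*h[2]) % p, (g[0]*h[1] + g[1]*h[3]) % p,
            (g[2]*h[0] + g[3]*h[2]) % p, (g[2]*h[1] + g[3]*h[3]) % p)

def inv(g):  # det = 1, so the inverse is the adjugate
    return (g[3], (-g[1]) % p, (-g[2]) % p, g[0])

classes = {frozenset(mul(mul(x, g), inv(x)) for x in SL) for g in SL}
print(len(classes))  # p + 4 = 9
```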
Exercise 4.6.13 (The Gelfand-Graev representation). Let
    U = { (1 t; 0 1) | t ∈ Fp } ⊂ Gp.
This is a subgroup of Gp, isomorphic to Fp. Let ψ ≠ 1 be a non-trivial irreducible character of U. Compute the character of % = Ind_U^{Gp}(ψ) and show that if π is an irreducible representation of Gp, we have
    ⟨%, π⟩ = 1 if dim(π) ≥ 2, and 0 otherwise.
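A quick numerical sanity check of this claim (ours, for p = 3): since each irreducible of dimension ≥ 2 occurs exactly once, the squared norm ⟨%, %⟩ should equal the number of such irreducibles, which is 6 for GL2(F3) (out of its 8 irreducible representations).

```python
import cmath, itertools

p = 3
def det(g): return (g[0]*g[3] - g[1]*g[2]) % p
G = [g for g in itertools.product(range(p), repeat=4) if det(g) != 0]

def mul(g, h):
    return ((g[0]*h[0] + g[1]*h[2]) % p, (g[0]*h[1] + g[1]*h[3]) % p,
            (g[2]*h[0] + g[3]*h[2]) % p, (g[2]*h[1] + g[3]*h[3]) % p)

def inv(g):
    d = pow(det(g), p - 2, p)
    return ((g[3]*d) % p, (-g[1]*d) % p, (-g[2]*d) % p, (g[0]*d) % p)

psi = lambda t: cmath.exp(2j * cmath.pi * t / p)  # nontrivial character of U ~ F_p

def chi_GG(g):  # character of Ind_U^G(psi), by the induced-character formula
    s = 0
    for x in G:
        c = mul(mul(x, g), inv(x))
        if c[0] == 1 and c[2] == 0 and c[3] == 1:  # c lies in U
            s += psi(c[1])
    return s / p  # |U| = p

norm2 = sum(abs(chi_GG(g)) ** 2 for g in G) / len(G)
print(round(norm2, 6))  # 6.0: the number of irreducibles of dim >= 2
```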
The representation % is called the Gelfand-Graev representation of GL2 (Fp ). In con-
crete terms, the result means that for any irreducible representation
π : Gp −→ GL(E),
which is not one-dimensional, there exists, up to scalar, a unique linear form (called a
Whittaker functional for π)
`π : E −→ C
such that
    ℓπ( π((1 x; 0 1)) v ) = ψ(x) ℓπ(v)
for x ∈ Fp and v ∈ E (this is because such a linear form is exactly an element of
HomU (π, ψ), which is isomorphic to HomG (π, %) by Frobenius reciprocity).
Using the specific isomorphism that implements Frobenius reciprocity, we find that
given such a linear form `π 6= 0, the homomorphism
    π −→ Ind_U^G(ψ)
4.7. Applications
We present in this section some sample applications of the representation theory of
finite groups, where the statements do not, by themselves, seem to depend on represen-
tations. The sections are independent of each other.
Now the positivity argument shows that the set, say N, of those g ∈ G with C ∩ gB = ∅ satisfies
    ν(N) ≤ 1/(k ν(B)ν(C)).
This gives the intermediate statement proved by Gowers: if A, B, C satisfy
    |A||B||C|/|G|³ = ν(A)ν(B)ν(C) > 1/k,
the intersection C ∩ AB is not empty. Now we bootstrap this by an amazingly clever
trick of Nikolov and Pyber [30, Prop. 1]: consider again A, B, C as in the proposition,
and redefine
C1 = G − AB = {g ∈ G | g is not of the form ab with a ∈ A, b ∈ B}.
Then by definition, we have C1 ∩ AB = ∅. By contraposition, using the result of
Gowers, this means that we must have
    |A||B||C1|/|G|³ ≤ 1/k
and the assumption (4.53) now leads to
|C1 | < |C|.
This means that |AB| + |C| > |G|. Now for any g ∈ G, this means also that |AB| +
|gC −1 | = |AB|+|C| > |G|. Therefore the sets AB and gC −1 must intersect; this precisely
means that g ∈ ABC, and we are done.
The next corollary does not mention representations at all:
13 We could have kept the term −ν(B)2 to very slightly improve this estimate, but it does not seem
to matter in any application.
Corollary 4.7.2 (SL2 (Fp ) is quasirandom). If p > 3 is a prime number and A ⊂
SL2 (Fp ) is a subset such that
    |A| > 2^{1/3} p(p + 1)(p − 1)^{2/3},
or equivalently
    |A|/|SL2(Fp)| > (2/(p − 1))^{1/3},
then for any g ∈ SL2 (Fp ), there exist a1 , a2 , a3 ∈ A with g = a1 a2 a3 .
Proof. This follows from the proposition for G = SL2(Fp), where |G| = p(p² − 1), and B = C = A, using (4.51), which shows that k = ½(p − 1).
In particular, such sets are generators, but in a very strong sense. Results like this
can be used to help with certain proofs of Theorem 1.2.5 in Section 1.2: to show that
a very small subset like S = {s1, s2} (as in (1.1)) generates SL2(Fp) in at worst C log p steps, it suffices to find some C′ such that the number of elements in SL2(Fp) obtained using products of ≤ C′ log p elements from S is at least 2(p + 1)^{8/3}, for instance. Indeed, if that is the case, the set A formed by these products satisfies the assumption of the corollary, and every element of G is the product of at most 3C′ log p elements from S.
Exercise 4.7.3 (Minimal dimension of non-trivial representations of SL2 (Fp )). We
indicate here how to prove (4.51) without invoking the full computation of the character
table of SL2 (Fp ). Thus let % 6= 1 be an irreducible unitary representation of SL2 (Fp ),
with p > 3.
(1) Show that
1 1
A=%
0 1
1 1 1 0
is not the identity. [Hint: Use the fact that and generate SL2 (Fp ).]
0 1 1 1
(2) Let ξ ∈ C^× be an eigenvalue of A with ξ ≠ 1. Show that, for all a coprime to p, ξ^{a²} is also an eigenvalue of A. [Hint: Use a suitable conjugate of A.]
(3) Deduce that dim(%) > (p − 1)/2.
(Thanks to O. Dinai for pointing out this proof; note that it is only by constructing
the “cuspidal” representations that it is possible to show that this bound is sharp, and
also that if Fp is replaced with another finite field with q elements, of characteristic p,
this argument does not give the correct lower bound ½(q − 1).)
The terminology “quasirandom” may seem mysterious at first, but it is well explained
by the mechanism of the proof: the average of the function ϕB,C corresponds precisely
to the intuitive “probability” that an element x ∈ G belongs to two subsets of density
|B|/|G| and |C|/|G| if these are genuinely random and independent. Hence, the fact
that ϕB,C is quite closely concentrated around its average value, when k is large, may
be interpreted as saying that its elements and subsets behave as if they were random (in
certain circumstances).
To put the result in context, note that if k = 1 (for instance if G is abelian, or
if G = GL2 (Fp ), which has many one-dimensional irreducible representations) the con-
dition (4.53) can not be satisfied unless A = B = C = G. And indeed a statement
like Corollary 4.7.2 is completely false if, say, G = Z/pZ with p large: for instance, if A = B = C is the image modulo p of the set of integers 1 ≤ n ≤ ⌊p/3⌋ − 1, we see that A + B + C is not all of G, although the density of A is about 1/3, for all p.
4.7.2. Burnside’s “two primes” theorem. We prove here the theorem of Burn-
side mentioned in Chapter 1 (Theorem 1.2.7): a finite group with order divisible by at
most two distinct primes is necessarily solvable. The proof is remarkable, in that it does
not depend on being able to write the character table of the group being investigated, but
on subtler features about a finite group that may be found by looking at its irreducible
characters. These are related to integrality properties, which have many other important
applications.
The basic idea is to prove the following, weaker-looking statement:
Proposition 4.7.4 (Existence of normal subgroup). Let G be a finite group of order
pa q b for some primes p and q and integers a, b > 0. If G is not abelian, it contains a
normal subgroup H / G with H 6= 1 and H 6= G.
To see that this implies Burnside’s Theorem, that groups of order pa q b are solvable,
one argues by induction on the order of a group G of this type. By the proposition, either
G is abelian, and therefore solvable, or there exists H / G such that H 6= 1 and G/H 6= 1;
in that case we have an exact sequence
1 −→ H −→ G −→ G/H −→ 1,
and both H and G/H have orders strictly smaller than |G|, and divisible only (at most)
by the primes p and q. By induction, they are therefore solvable, and this is well-known
to imply that G itself is solvable.
So we are reduced to a question of finding a non-trivial normal subgroup in a group
G, one way or another, and Burnside’s idea is to find it as the kernel of some suitable
non-trivial irreducible representation
% : G −→ GL(E),
or of an associated homomorphism
    %̄ : G −→ GL(E) −→ PGL(E);
indeed, it is a bit easier to ensure that ker %̄ is non-trivial (the kernel is enlarged modulo
the scalars), and the possibility that ker %̄ = G is so special that its analysis is even
simpler.
We will find the desired representation by means of the following result, which is itself
of great interest:
Theorem 4.7.5 (Burnside). Let G be a finite group,14 and
% : G −→ GL(E)
an irreducible complex representation of G. If g ∈ G is such that its conjugacy class
g ] ⊂ G has order coprime with dim %, then either χ% (g) = 0, or g ∈ ker %̄, where %̄ is the
composite homomorphism
    G −→ GL(E) −→ PGL(E),
the first map being %.
This may not be a result that is easy to guess or motivate, except that the statement
may well come to mind after looking at many examples of character tables – for instance, in the case of the solvable groups Bp of order p(p − 1)² (see Table 4.3 in Section 4.6.3), the characters of the irreducible representations of dimension p − 1 vanish at all conjugacy classes of size p − 1, and the elements of conjugacy classes of size 1 are mapped to scalar matrices (hence lie in the kernel of %̄). Similarly, the character of the Steinberg representation of
14 Of any order.
dimension p of GL2 (Fp ) is zero at all conjugacy classes of size p2 − 1. (Note that if the
reader did look, she will certainly have also remarked a striking fact: for any irreducible
representation % of dimension dim % > 1, there exists (or so it seems) some conjugacy
class c with χ% (c) = 0; this is indeed true, as we will explain in Remark 4.7.10 below...)
We can also check immediately that the statement of the theorem is true for conjugacy
classes of size 1: this corresponds to elements of the center of G, for which %(g) is
always a homothety for any irreducible representation % (the central character, as in
Corollary 2.7.15.)
Proof of Proposition 4.7.4 from Theorem 4.7.5. Note first that we can cer-
tainly assume that a > 1 and b > 1, since a group of order a power of a single prime has
a non-trivial center (see, e.g., [33, Th. 4.4] for this basic feature of finite groups.)
We attempt to find an element g ∈ G and an irreducible representation % so that
Theorem 4.7.5 applies, while ensuring that the character value χ% (g) is non-zero. The
difficulty is to ensure the coprimality condition of dim(%) with |g ] |, and indeed this is
where the assumption that |G| is only divisible by two primes is important.
In fact, the following fact remains valid for arbitrary finite groups, and can be in-
terpreted as one more attempt of conjugacy classes and irreducible representations to
behave “dually” (see Remark 4.2.6):
Fact. Let G 6= 1 be a finite group, and let p, q be prime numbers. There exists a pair
(g, %), where g 6= 1 is an element of G, and % 6= 1 is an irreducible complex representation
of G, such that χ% (g) 6= 0 and
    (4.55)   p ∤ |g]|,   q ∤ dim(%).
We first conclude the proof using this fact: the point is that if |G| = pa q b with a,
b > 1, then with g and % as so conveniently given, the conditions (4.55) mean that
|g ] | and dim(%) must be coprime (one does not need to know the – true – fact that
dim(%) | |G|: the order of g ] does divide |G|, and hence must be a power of q, but q is
coprime with dim(%)).15 Hence we can apply Theorem 4.7.5 and conclude that g ∈ ker %̄,
so that the latter is non-trivial normal subgroup. The endgame is now straightforward:
if ker %̄ 6= G, this kernel is the required proper, non-trivial, normal subgroup, while
otherwise the composition %̄ is trivial, and then % takes scalar values, and must therefore
be one-dimensional by irreducibility. We then get an isomorphism
G/ ker(%) ' Im(%) ⊂ C× ,
which shows in turn that ker % is a proper, non-trivial, normal subgroup... unless G '
Im(%) is in fact abelian!
Now we prove the existence of the required pair (g, %). Given g 6= 1, the basic
relation between the values of the irreducible characters at g and their dimensions is the
orthogonality relation (4.28), which gives
    (4.56)   ∑_{%∈Ĝ} χ%(g)χ%(1) = ∑_{%∈Ĝ} (dim %) χ%(g) = 0,
which certainly tells us that there is some irreducible representation % 6= 1 such that
χ% (g) 6= 0.
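On the character table of S3, for instance, this column relation reads 1·1 + 1·(−1) + 2·0 = 0 at a transposition and 1·1 + 1·1 + 2·(−1) = 0 at a 3-cycle; a one-line check of ours in Python:

```python
# Column orthogonality: sum over irreducible rho of dim(rho) * chi_rho(g) = 0
# for g != 1, checked on the character table of S3 given earlier.
table = {            # rho -> values at the classes 1, (12), (123)
    'triv': (1, 1, 1),
    'sign': (1, -1, 1),
    'dim2': (2, 0, -1),
}
sums = [sum(row[0] * row[col] for row in table.values()) for col in (1, 2)]
print(sums)  # [0, 0]
```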
But even better, if we consider this modulo the prime number q, it implies that there
is some irreducible representation % 6= 1 with χ% (g) 6= 0 and q - dim %. This relies on
the fact that the values χ%(g) of irreducible characters, which are sums of roots of unity (the eigenvalues of %(g)), are algebraic integers: isolating the contribution of the trivial representation in (4.56) and reducing modulo q, some term in the remaining sum must not be divisible by q.
Precisely, if it were the case that q | dim % for all % such that χ% (g) 6= 0, we would get
    (4.57)   −1/q = ∑_{%≠1, q | dim(%)} (dim(%)/q) χ%(g),
where the right-hand side is an algebraic integer, and this is impossible since 1/q is not. In
Section A.1 in the Appendix, we present a short discussion of the properties of algebraic
integers that we use (here, Proposition A.1.1), and readers for whom this is not familiar
may either read this now, or continue while assuming that the character values involved
are all actual integers in Z, since in that case (4.57) is patently absurd.
So, given g 6= 1 and the prime q, we can always find % 6= 1 such that q - dim % and
χ% (g) 6= 0. It is therefore sufficient, to prove the claim, to show that g 6= 1 with p - |g ] |
also exists. Of course if p - |G|, this is always true, and we can assume that p is a divisor
of |G|. Then we use another “averaging” trick: by partitioning G into conjugacy classes,
we have
    ∑_{g] ∈ G]} |g]| = |G|.
We isolate the contribution of the conjugacy classes of size 1, i.e., of the center Z(G)
of G, and then reduce modulo p to get
    ∑_{g] ∈ G] − Z(G)} |g]| ≡ −|Z(G)| (mod p).
Thus either the center of G is not reduced to 1 (and we can take any non-trivial element g ∈ Z(G), for which |g]| = 1 is certainly prime to p), or else one of the terms in the left-hand side must be non-zero modulo p, and using such a g we can ensure all of (4.55).
Remark 4.7.6 (Why A5 is not solvable...). The first (in terms of order!) non-solvable
group is the alternating group A5 of order 60 = 22 · 3 · 5 (one can show that all groups of
order 30 – there are four up to isomorphism – are solvable.) It is instructive to see “how”
the argument fails in that case. The character table of A5 is computed, e.g., in [13, §3.1,
Ex. 3.5], and we just list it here, subscripting the conjugacy classes with their sizes (the
reader who has not seen it might think of finding natural linear actions corresponding to
the representations displayed):
        1_1    (12)(34)_15    (123)_20     (12345)_12     (13452)_12
1        1          1            1             1              1
%3       3         −1            0        (1 + √5)/2     (1 − √5)/2
%′3      3         −1            0        (1 − √5)/2     (1 + √5)/2
%4       4          0            1            −1             −1
%5       5          1           −1             0              0
The pairs (g, %) for which χ% (g) 6= 0 and (4.55) holds are the following:
(p, q) = (2, 3), (g ] , %) = ((12)(34), %4 ),
(p, q) = (3, 2), (g ] , %) = ((123), %5 ),
(p, q) = (2, 5), (g ] , %) = ((12)(34), %3 or %03 ),
(p, q) = (5, 2), (g ] , %) = ((12345) or (13452), %3 or %03 )
(p, q) = (3, 5), (g ] , %) = ((123), %4 )
(p, q) = (5, 3), (g ] , %) = ((12345) or (13452), %4 ).
As it should be, one sees that in all cases, the actual gcd of |g ] | and dim(%) is different
from 1.
We now come to the proof of Theorem 4.7.5. Here again, the basic ingredient is of
independent interest, as it provides more subtle integrality properties of character values:
Proposition 4.7.7 (Divisibility). Let G be a finite group and let
    a = ∑_{g∈G} α(g) g ∈ C[G]
be an element of the group ring with coefficients α(g) which are algebraic integers. More-
over, assume a ∈ Z(C(G)) is in the center of the group ring, or equivalently that α is
a class function on G. Then a acts on irreducible representations by multiplication by
scalars, and those are all algebraic integers.
In particular, for any g ∈ G and % ∈ Ĝ, we have
(4.58) dim(%) | χ% (g)|g ] |
in the ring Z̄ of algebraic integers.
Proof. The action of a central element a on an irreducible representation % is given
by the scalar ω% (a) of Proposition 4.3.30, and so we must show that this is in Z̄ under
the stated conditions.
Since, as a function of a, this scalar is a ring-homomorphism, and Z̄ is itself a ring
(Proposition A.1.2), it is enough to prove the integrality of ω% (a) when a runs over a set
of elements which span the subring Z(Z̄[G]). For instance, one can take the elements
    a_c = ∑_{g∈c} g
where c runs over conjugacy classes in G. This means, in practice, that we may assume
that the coefficients α(g) are in fact in Z.
Now, under this condition, we consider the element e% ∈ C[G] giving the %-isotypic
projector. Using the left-multiplication action of G on the group ring, we have
    a e% = ω%(a) e%,
i.e., multiplication by a, as a map on C[G], has ω%(a) as an eigenvalue. But now we claim
that this linear map
    Φa : C[G] −→ C[G],   x ↦ ax,
can be represented by an integral matrix in a suitable basis. In fact, the elements x ∈ G
form a basis of C[G] which does the job: we have
    ax = ∑_{g∈G} α(g) gx = ∑_{g∈G} α(gx⁻¹) g,
where the relevant coefficients, namely the α(gx⁻¹), are indeed integers.
We conclude that ω%(a) is a root of the characteristic polynomial det(X − Φa) of
Φa, and if we use the basis above, this is monic with integral coefficients, showing that
ω%(a) is indeed an algebraic integer. (We are using here one part of the criterion in
Proposition A.1.2.)
Now for the last part, we use the expression (4.26) for ω%(a), in the special case where
a = a_c, which is
    ω%(a_c) = (1/dim(%)) ∑_{g∈c} χ%(g) = |g♯| χ%(g) / dim(%),
and the fact that this is an algebraic integer is equivalent with the divisibility rela-
tion (4.58).
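For a group whose characters take integer values, (4.58) can be checked directly in Z. The following sketch (hypothetical helper data, not part of the text) does this for S3:

```python
# Hypothetical helper code: the character table of S3 has entries in Z,
# so divisibility in Zbar reduces to ordinary divisibility of integers.
class_sizes = [1, 3, 2]                   # e, transpositions, 3-cycles
char_table = [
    [1, 1, 1],                            # trivial
    [1, -1, 1],                           # sign
    [2, 0, -1],                           # the 2-dimensional representation
]
for row in char_table:
    dim = row[0]                          # dimension = value at e
    for value, size in zip(row, class_sizes):
        assert (value * size) % dim == 0  # dim(%) divides chi_%(g)|g#|
print("all divisibility checks pass")
```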
Before using this to conclude, the reader is probably tempted to apply the general
fact that ω% (a) ∈ Z̄ to other elements a. In fact this gives the proof of an observation we
already mentioned (see Remark 4.3.7 for instance):
Proposition 4.7.8 (Dimensions of irreducible representations divide the order). If G
is a finite group and % ∈ Ĝ is an irreducible complex representation of G, the dimension
of % divides |G|.
Proof. We are looking for a suitable a ∈ C[G] to apply the proposition; since
    ω%(a) = (1/dim(%)) ∑_{g∈G} α(g) χ%(g),
the most convenient would be to have α(g) such that the sum is equal to |G|. But there
does exist such a choice: by the orthogonality relation, we can take α(g) = χ̄%(g), the
complex conjugate, and then
    ω%(a) = (1/dim(%)) ∑_{g∈G} χ̄%(g) χ%(g) = |G|/dim(%).
Since α(g) ∈ Z̄, this is indeed an algebraic integer, by the proposition. Hence
    |G|/dim(%) ∈ Z̄ ∩ Q = Z,
which is the desired result.
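Both this divisibility and the regular-representation identity ∑ dim(%)² = |G| are immediate to confirm on the tables we know; the following one-liner-style check (hypothetical helper data, not part of the text) does so for S3 and A5:

```python
# Hypothetical helper data: irreducible dimensions of S3 and A5.
groups = {6: [1, 1, 2], 60: [1, 3, 3, 4, 5]}      # S3 and A5
for order, dims in groups.items():
    assert sum(d * d for d in dims) == order      # regular representation
    assert all(order % d == 0 for d in dims)      # Proposition 4.7.8
```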
We can finally finish:
Proof of Theorem 4.7.5. With (4.58) in hand, what to do is quite clear: the
dimension dim(%) divides the product
    χ%(g) |g♯|,
and it is assumed to be coprime with the second factor |g♯|. So it must divide the first
factor, i.e., the character value χ%(g) (this happens in the ring Z̄ of algebraic integers,
always; we are using Proposition A.1.5.)
Such a relation, we claim, is in fact equivalent with the conclusion of the theorem.
This would again be clear if χ%(g) were in Z, since the bound
    |χ%(g)| ≤ dim(%)
and the divisibility dim(%) | χ%(g) in Z lead to
    χ%(g) ∈ {− dim(%), 0, dim(%)},
and we know that |χ%(g)| = dim(%) is equivalent with %(g) being a scalar matrix, i.e.,
g ∈ ker %̄ (Proposition 4.6.4).
To deal with the general case, we must be careful because if we have non-zero algebraic
integers z1, z2 with
    z1 | z2,
we can not always conclude that |z1| ≤ |z2| (e.g., take z1 = 1 and z2 = −1 + √2.) What
we do is take the norm on both sides of the divisibility relation and obtain
    dim(%)^r | N(χ%(g))
where r is the number of conjugates of χ%(g) (this is Corollary A.1.8). This is now a
divisibility relation among integers, and if χ%(g) ≠ 0, we deduce the inequality
    dim(%)^r ≤ |N(χ%(g))|.
Now, each conjugate of χ%(g) is a sum of dim(%) roots of unity,¹⁶ hence is of modulus
≤ dim(%). This means that
    |N(χ%(g))| ≤ dim(%)^r,
and by comparison we must have equality in all the terms of the product, in particular
    |χ%(g)| = dim(%),
which – as before – gives g ∈ ker %̄.
Remark 4.7.9. We used the divisibility relation (4.58) in the previous proof by as-
suming that dim(%) is coprime with the factor |g♯| on the right-hand side. What happens
if we assume instead that dim(%) is coprime with the other factor, χ%(g)? One gets
the conclusion that dim(%) divides the size of the conjugacy class of g. This is of some
interest; in particular, if there exists some g with χ%(g) = ±1 (or even χ%(g) a root of
unity), we have
    dim(%) | |g♯|.
We can see this “concretely” in the Steinberg representations of GL2 (Fp ), of dimension
p: the values at semisimple conjugacy classes are roots of unity, and indeed dim(St) =
p | p(p + 1), p(p − 1), which are the sizes of the split (resp. non-split) semisimple classes.
On the other hand, it is not clear if this “dual” statement has any interesting applications
in group theory...
16 In fact, it is also a character value χ% (x) for some x ∈ G, but checking this fact requires the
Galois-theoretic interpretation of the conjugates, which we do not wish to assume.
Remark 4.7.10 (Characters have zeros). We come back to the following observa-
tion, which is certainly experimentally true for those groups for which we computed the
character table:
Proposition 4.7.11 (Burnside). Let G be a finite group, and % ∈ Ĝ an irreducible
representation of dimension at least 2. Then there exists some g ∈ G such that χ% (g) = 0.
Proof. This is once again obvious if the character takes actual integer values in Z:
the orthonormality relation for % gives
    (1/|G|) ∑_{g∈G} |χ%(g)|² = 1,
i.e., the mean-square average over G of the character of % is 1. Hence either |χ%(g)|² = 1
for all g, which can only happen when dim(%) = 1, or else some element g must have
|χ%(g)| < 1.
If χ%(g) ∈ Z, this gives immediately χ%(g) = 0. In the general case, we must again be
careful, since there are many non-zero algebraic integers z with |z| < 1 (e.g., −1 + √2).
However, one can partition G into subsets for which the sum of the character values is an
actual integer. To be precise, we write G as the union of the equivalence classes for the
relation defined by x ∼ y if and only if x and y generate the same (finite cyclic) subgroup
of G. Hence
    ∑_{g∈G} |χ%(g)|² = ∑_{S∈G/∼} ∑_{x∈S} |χ%(x)|².
Each class S is the set of generators of some finite cyclic subgroup H of G. Applying
(to H and the restriction of % to H) the inequality (4.38) from Exercise 4.5.7, we deduce
that
    ∑_{x∈S} |χ%(x)|² ≥ |S|
unless some (in fact, all) character values χ% (x) are zero for x ∈ S. Summing over S, and
comparing with the orthonormality relation, it follows that when χ% has no zero, there
must be equality in each of these inequalities. But S = {1} is one of the classes, and
therefore |χ%(1)|² = 1, which gives the desired result by contraposition.
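Proposition 4.7.11 is easy to confirm on the tables computed so far. The following check (hypothetical helper data, not part of the text) scans the character tables of S3 and A5 and verifies that every row of dimension at least 2 contains a zero:

```python
# Hypothetical helper data: character tables of S3 and A5, rows indexed
# by irreducible representations, first column = value at e = dimension.
from math import sqrt

phi = (1 + sqrt(5)) / 2
tables = {
    "S3": [[1, 1, 1], [1, -1, 1], [2, 0, -1]],
    "A5": [[1, 1, 1, 1, 1],
           [3, -1, 0, phi, 1 - phi],
           [3, -1, 0, 1 - phi, phi],
           [4, 0, 1, -1, -1],
           [5, 1, -1, 0, 0]],
}
for name, rows in tables.items():
    for row in rows:
        if row[0] >= 2:                      # dimension at least 2
            assert any(abs(v) < 1e-9 for v in row), (name, row)
print("every character of dimension >= 2 has a zero")
```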
This fact is about the rows of the character table; is there another, “dual”, property of
the columns? If there is, it is not the existence of at least one zero entry in each column,
except for those of central elements (for which the modulus of the character value is
the dimension): although this property holds in a number of examples, for instance the
groups GL2(Fp), we can see that it is false for the solvable groups Bp of Section 4.6.3: we
see in Table 4.3 that for the non-diagonalizable elements
    ( a 1 )
    ( 0 a ),
with conjugacy classes of size p − 1, every character value is a root of unity.
4.7.3. Relations between roots of polynomials. Our last application is to a
purely algebraic problem about polynomials: given a field k (arbitrary to begin with)
and a non-zero irreducible polynomial P ∈ k[X] of degree d ≥ 1, the question is whether
the roots
    x1, . . . , xd
of P (in some algebraic closure of k) satisfy any non-trivial linear, or multiplicative,
relation? By this, we mean: do there exist coefficients αi ∈ k, not all zero, such that
    α1 x1 + · · · + αd xd = 0
(linear relation), or integers ni ∈ Z, not all zero, such that
    x1^{n1} · · · xd^{nd} = 1
(multiplicative relation)?
For instance, since
    x1 + · · · + xd = −a_{d−1},   x1 · · · xd = (−1)^d a0,
for
    P = X^d + a_{d−1} X^{d−1} + · · · + a1 X + a0,
we have a non-trivial linear relation
    x1 + · · · + xd = 0
whenever the coefficient of degree d − 1 of P is zero, and a non-trivial multiplicative
relation
    x1² · · · xd² = 1
whenever a0 = P(0) = ±1.
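These two Vieta-type relations can be observed numerically. The following sketch (hypothetical helper code, not part of the text) finds the roots of X⁴ − X − 1 by the Durand–Kerner iteration and checks the linear relation (the X³ coefficient is zero) and the multiplicative relation (P(0) = −1, so the product of the roots squared is 1):

```python
# Hypothetical helper code; the iteration counts and seeds below are
# standard choices for Durand-Kerner, not taken from the text.
from functools import reduce

def poly_roots(coeffs, iterations=200):
    """Roots of X^d + coeffs[0] X^(d-1) + ... + coeffs[-1] (monic)."""
    d = len(coeffs)
    def p(x):
        v = 1 + 0j
        for a in coeffs:
            v = v * x + a
        return v
    zs = [(0.4 + 0.9j) ** k for k in range(d)]     # distinct seeds
    for _ in range(iterations):
        zs = [z - p(z) / reduce(lambda u, v: u * v,
                                (z - t for t in zs if t is not z), 1 + 0j)
              for z in zs]
    return zs

roots = poly_roots([0, 0, -1, -1])                 # X^4 - X - 1
total = sum(roots)
prod = reduce(lambda u, v: u * v, roots, 1 + 0j)
assert abs(total) < 1e-6                           # x1 + ... + x4 = 0
assert abs(prod + 1) < 1e-6                        # x1 ... x4 = (-1)^4 a0
assert abs(prod ** 2 - 1) < 1e-6                   # squares multiply to 1
```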
A general method to investigate such questions was found by Girstmair (see [15] and
the references there, for instance), based on representation theory. We present here the
basic idea and the simplest results.
As in the (first) proof of Burnside’s Irreducibility Criterion (Section 2.7.3), the basic
idea is to define the set of all relations (linear or multiplicative) between the roots of
P , and show that it carries a natural representation of a certain finite group G. If
we can decompose this representation in terms of irreducible representations, we will
obtain a classification of the possible relations that can occur. As was the case for the
Burnside criterion, this is often feasible because the relation space is a subrepresentation
of a well-understood representation of G.
First, with notation as before for the roots of P , we denote by
    Ra = {(αi)i ∈ k^d | α1 x1 + · · · + αd xd = 0},
    Rm = {(ni)i ∈ Z^d | x1^{n1} · · · xd^{nd} = 1}
the spaces of linear or multiplicative relations between the roots; we see immediately that
Ra is a k-vector subspace of k^d, while Rm is a subgroup of the abelian group Z^d, so that
it is a free abelian group of rank at most d.
The group G that acts naturally on these spaces is the Galois group of the polynomial
P , which means the Galois group of its splitting field
kP = k(x1 , . . . , xd ).
To ensure that this Galois group is well-defined, we must assume that P is separable,
for instance that k has characteristic zero (k = Q will do). The elements of G are
therefore field automorphisms
σ : kP −→ kP .
By acting on a relation (linear or multiplicative) using G, we see that the Galois group
acts indeed on Ra and Rm . More precisely, recall that σ ∈ G permutes the roots (xi ), so
that there exists a group homomorphism
    G −→ Sd,   σ ↦ σ̂,
characterized by
σ(xi ) = xσ̂(i)
for all roots of P . This homomorphism is faithful since the roots of P generate the
splitting field kP .
If α = (αi ) is in Ra , acting by σ on the relation
α1 x1 + · · · + αd xd = 0,
we get
0 = σ(α1 x1 + · · · + αd xd ) = α1 xσ̂(1) + · · · + αd xσ̂(d)
(since σ is the identity on k) or in other words the vector
    σ · α = (α_{σ̂⁻¹(i)})_{1≤i≤d}
is also in Ra . But note that we can define
    σ · α = (α_{σ̂⁻¹(i)})_{1≤i≤d}
for arbitrary α ∈ k^d and σ ∈ G; this is in fact simply the permutation k-representation of
G on k^d constructed from the action of G on the set {xi} of roots of P (see Section 2.6.2),
and hence we see that Ra is a subrepresentation of this permutation representation, which
we denote πk .
Similarly, we can act with G on multiplicative relations. However, since Rm is only an
abelian group, it is only the Q-vector space Rm ⊗Q that one can see as a subrepresentation
of the (same!) permutation representation πQ of G over Q (one may also use any field
containing Q).
If we can decompose πk , we can hope to see which subrepresentations can arise as
relation spaces Ra , and similarly for πQ and Rm . The simplest case is the following:
Proposition 4.7.12. Let k be a field of characteristic zero, P ∈ k[X] an irreducible
polynomial of degree d ≥ 2 with Galois group isomorphic to the full symmetric group Sd.
(1) Either Ra = 0, i.e., there are no non-trivial linear relations between the roots of
P , or Ra is one-dimensional and is spanned by the element e0 = (1, . . . , 1) corresponding
to the relation
    x1 + · · · + xd = 0.
This second case may always happen, for a given field k, if there exists a polynomial
with Galois group Sd.
(2) Either Rm = 0, i.e., there are no non-trivial multiplicative relations between the
roots of P , or Rm is a free Z-module of rank 1 generated by ne0 for some n ≥ 1, or the
splitting field of P is contained in the splitting field of a Kummer polynomial X^n − b. The
first case can happen for any field k for which there exists a polynomial with Galois group
Sd, and the second and third cases are possible for k = Q, for n = 2 and n = 3.
Proof. As we have already observed, the space k^d of πk decomposes as a direct sum
of subrepresentations
    k^d = k e0 ⊕ V
where
    V = {v = (vi) ∈ k^d | v1 + · · · + vd = 0}.
When G ' Sd , although k is not algebraically closed (otherwise an irreducible P of
degree > 2 would not exist!), these are irreducible subrepresentations. Indeed, we must
only check this for V , and we immediately see that if V could be decomposed into two
or more subrepresentations (recall that Maschke’s theorem does apply for any field of
characteristic 0), then the same would be true for the representation of G on V ⊗ k̄,
which contradicts the fact that it is irreducible (though we have only directly proved this
for k ⊂ C, see (4.19)).
Because ke0 and V are non-isomorphic as representations of G ' Sd (even for d = 2,
where V is also one-dimensional), the only possibilities for Ra are therefore
    Ra = 0, or k e0, or V, or k^d
(this is the uniqueness of isotypic subspaces, see the second part of Proposition 2.7.7.)
The cases Ra = 0 and Ra = ke0 are precisely the two possibilities of the statement we
try to prove, and we can now check that the others can not occur. For this, it is enough
to show that Ra ⊃ V is impossible. But V is spanned by the vectors
(4.59) f2 = (1, −1, 0, . . . , 0), ..., fd = (1, 0, . . . , 0, −1).
Even if only the first were to be in Ra , this would translate into x1 = x2 , which is
impossible.
There only remains to prove the existence part: provided some polynomial in k[X]
with Galois group Sd exists,17 one can find instances where both cases Ra = 0 or Ra = ke0
occur. This is easy: if we fix such a polynomial
P = X d + ad−1 X d−1 + · · · + a1 X + a0 ∈ k[X]
with Galois group Sd , the polynomials P (X + a), for any a ∈ k, also have the same
Galois group (the splitting field has not changed!), and the sum of the roots yi = xi − a
is
    y1 + · · · + yd = (x1 + · · · + xd) − da,
which takes all values in k when a varies; when it is zero, the polynomial P (X + a)
satisfies Ra = ke0 .
We now deal with the multiplicative case; the argument is similar, but some of the
excluded cases become possible. First, since Rm ⊗ Q is a subrepresentation of πQ , we see
again that there are the same four possibilities for the subspace Rm ⊗ Q. Of course, if
it is zero, we have Rm = 0 (because Rm is a free abelian group); if Rm ⊗ Q = Qe0 , on
the other hand, we can only conclude that Rm = nZe0 for some integer n ≥ 1 (examples
below show that one may have n ≠ 1).
Continuing with the other possibilities, we have Rm ⊗ Q ⊃ V if and only if, for some
n ≥ 1, the vectors nf2, . . . , nfd are in Rm, where fi is defined in (4.59). This means that
we have
    (x1/x2)^n = · · · = (x1/xd)^n = 1,
and from this we deduce that
    σ(x1^n) = x_{σ̂(1)}^n = x1^n
for all σ ∈ G. By Galois theory, this translates to x1^n ∈ k. Therefore x1 is a root of a
Kummer polynomial X^n − b ∈ k[X], which is the last possible conclusion we claimed. Note
that b could be a root of unity (belonging to k): this is a special case, which corresponds
to Rm ⊗ Q = Q^d (instead of V ), since each xi is then a root of unity.
In terms of existence of polynomials with these types of multiplicative relations, we
first note that Rm = 0 is always possible if there exists at least one irreducible polynomial
P ∈ k[X] with Galois group Sd. Indeed, we have P(0) ≠ 0, and as before, we may replace
17 This may not be the case, or only for some d (in the case of k = R, only d = 2 is possible.)
P with Q = P(aX) for a ∈ k^×, without changing the Galois group; the roots of Q are
yi = a⁻¹ xi, and
    y1 · · · yd = (x1 · · · xd)/a^d.
Then, if we pick a ∈ k^× so that this expression is not a root of unity, we obtain a
polynomial Q with Rm = 0 (such an a ≠ 0 exists: otherwise, taking a = 1 would show
that x1 · · · xd is itself a root of unity, and then it would follow that any a ∈ k^× is a root
of unity, which is absurd since Q ⊂ k).
For the case of Rm ⊗ Q = Qe0, we will just give examples for k = Q: it is known
(see, e.g., [37, p. 42]) that the polynomial
    P = X^d − X − 1
has Galois group Sd for d ≥ 2; since the product of its roots is (−1)^d P(0) = (−1)^{d+1}, it
satisfies the relation
    x1² · · · xd² = 1,
so Rm ⊗ Q = Qe0, and in fact Rm = nZe0 with n = 1 if d is odd, and n = 2 if d is even.
For the Kummer cases, we take k = Q for simplicity (it will be clear that many fields
will do). For n = 2, any quadratic polynomial X² − b with b not a square of a rational
number has Galois group S2; if b = −1, writing x1 = i, x2 = −i, we have
    Rm = {(n1, n2) ∈ Z² | n1 ≡ n2 (mod 4)},
which has rank 2 (so Rm ⊗ Q = Q²), and if b ≠ −1, we have
    Rm = {(n1, n2) ∈ Z² | ni ≡ 0 (mod 2), n1 + n2 = 0},
with Rm ⊗ Q = Qe0. For n = 3, any Kummer equation X³ − b = 0, with b not a perfect
cube, will have splitting field with Galois group S3, and a quick computation with the
roots ∛b, j ∛b, j² ∛b, where j is a primitive cube root of unity in C, leads to
    Rm = {(n1, n2, n3) ∈ Z³ | n1 + n2 + n3 = 0, ni ≡ nj (mod 3) for all i, j},
so that again Rm ⊗ Q = Qe0 .
Exercise 4.7.13 (Palindromic polynomials). We consider in this exercise the case
where d is even and the Galois group of P is the group Wd defined as the subgroup of
Sd that respects a partition of {1, . . . , d} into d/2 pairs. More precisely, let X be a finite
set of cardinality d, and let i : X → X be an involution on X (i.e., i ◦ i = IdX ) with
no fixed points, for instance X = {1, . . . , d} and i(x) = d + 1 − x. The d/2 pairs {x, i(x)}
partition X, and
    Wd = {σ ∈ Sd | σ(i(x)) = i(σ(x)) for all x ∈ X},
which means concretely that an element of Wd permutes the pairs {x, i(x)}, and may (or
not) switch x and i(x). This group is sometimes called the group of signed permutations
of {1, . . . , d/2}.
(1) Show that Wd is of order 2^{d/2} (d/2)! for d ≥ 2 even, and that there is an exact sequence
    1 −→ (Z/2Z)^{d/2} −→ Wd −→ S_{d/2} −→ 1.
Find a faithful representation of Wd in GLd(C) where the matrix representation has
values in GLd(Z). (See Example 7.1.2.)
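The order formula of part (1) is easy to confirm by brute force for small d. The following sketch (hypothetical helper code, not part of the text) does this for d = 4:

```python
# Hypothetical helper code: count the permutations of {1,2,3,4} that
# commute with the fixed-point-free involution i(x) = 5 - x, which
# pairs {1,4} and {2,3}; the count should be 2^(d/2) (d/2)! = 8.
from itertools import permutations
from math import factorial

d = 4
inv = {x: d + 1 - x for x in range(1, d + 1)}
W = [s for s in permutations(range(1, d + 1))
     if all(s[inv[x] - 1] == inv[s[x - 1]] for x in range(1, d + 1))]
assert len(W) == 2 ** (d // 2) * factorial(d // 2) == 8
```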
(2) Let k be a field of characteristic 0, P ∈ k[X] an irreducible polynomial of degree
d = 2n even, d ≥ 2, of the form
    P = X^d + a_{d−1} X^{d−1} + · · · + a_{n+1} X^{n+1} + a_n X^n + a_{n+1} X^{n−1} + · · · + a_{d−1} X + 1,
i.e., with the same coefficients for X^j and X^{d−j} for all j (such polynomials are called
palindromic, or self-dual). Show that the Galois group of P can be identified with a
subgroup of Wd. [Hint: If x is a root of P , then 1/x is also one, and 1/x ≠ x, so that one
can take X = {roots of P} and i(x) = 1/x.]
(3) Assume that the Galois group of P is equal to Wd . Show that the permutation
representation πk of Wd associated to the action of Wd on X splits as a direct sum
E0 ⊕ E1 ⊕ E2
where E0 = ke0 and, for a suitable numbering of the roots, we have
    E1 = {(αi) | α_{d+1−i} − αi = 0 for 1 ≤ i ≤ d, and α1 + · · · + αd = 0},
    E2 = {(αi) | α_{d+1−i} + αi = 0 for 1 ≤ i ≤ d},
and the three spaces are irreducible. [Hint: You may assume k ⊂ C; compute the orbits
of Wd on X, and deduce the value of the squared norm of the character of πk .]
(4) Show that the only possible spaces of linear relations between roots of a polynomial
P as above are Ra = 0 and Ra = ke0.
It is known that “many” palindromic polynomials with Galois groups Wd exist; for
more information and some applications, the reader may look at [25, §2].
Remark 4.7.14. Both Proposition 4.7.12 and this exercise are in the direction of
showing that linear or multiplicative relations are rare in some cases. However, for some
Galois groups, interesting things can happen. For instance, Girstmair showed that there
exists a group G of order 72 which can arise as the Galois group (for k = Q) of some
polynomial P of degree 9 for which the roots, suitably numbered, satisfy
4x1 + x2 + x3 + x4 + x5 − 2(x6 + x7 + x8 + x9 ) = 0.
Another example is the group G usually denoted W (E8 ), the “Weyl group of E8 ”,
which can be defined as the group with 8 generators
w1 , . . . , w 8
which are subject to the relations
    wi² = 1,   (wi wj)^{m(i,j)} = 1,   1 ≤ i < j ≤ 8,
where
    m(i, j) = 3 if (i, j) ∈ {(1, 3), (3, 4), (2, 4), (4, 5), (5, 6), (6, 7), (7, 8)},
and m(i, j) = 2 otherwise. (This definition is given here only in order to be definite; of
course, this presentation is justified by the many other definitions and properties of this
group, which is a special case of Coxeter groups.) The group W(E8) has order
    |W(E8)| = 696, 729, 600 = 2¹⁴ · 3⁵ · 5² · 7,
and one can construct irreducible polynomials P ∈ Q[X], of degree 240, with Galois
group W (E8 ), such that
dim(Rm ⊗ Q) = 232,
or in other words: there are 8 roots of P , out of 240, such that all others are in the
multiplicative group generated by those (see [2, §5] or [21, Rem. 2.4]).
4.8. Further topics
We finish this chapter with a short discussion of some further topics concerning rep-
resentations of finite groups. These – and their developments – are of great interest and
importance in some applications, and although we only consider basic facts, we will give
references where more details can be found. One last important notion, the Frobenius-
Schur indicator of an irreducible representation, will be considered in Section 6.2, because
it makes sense and is treated exactly the same way for all compact groups.
4.8.1. More on induction. We have used induction quite often, either in a general
way (typically to exploit Frobenius reciprocity) or to construct specific representations of
concrete groups. In the second role, in particular, we see that it is useful to understand
intertwiners between two induced representations. In particular, in Section 4.6.4, we com-
puted the dimension of such spaces “by hand”, as inner products of induced characters.
The answers are rather clean, as in (4.49), and it should not be a surprise to see that there
is a general approach to these computations.
Proposition 4.8.1 (Intertwiners between induced representations). Let G be a finite
group, let H1, H2 be subgroups of G, and let %1, %2 be complex finite-dimensional represen-
tations of H1 and H2, acting on the vector spaces E1 and E2, respectively. There is an
isomorphism
    Hom_G(Ind_{H1}^G %1, Ind_{H2}^G %2) ≃ I_{%1,%2},
where
(4.60)    I_{%1,%2} = {α : G → Hom_C(E1, E2) | α(h1 x h2) = %2(h2)⁻¹ ∘ α(x) ∘ %1(h1)⁻¹
                      for all h1 ∈ H1, x ∈ G, h2 ∈ H2}.
Proof. We start naturally by applying Frobenius reciprocity to “remove” one in-
duced representation: we have the isomorphism
    Hom_G(Ind_{H1}^G %1, Ind_{H2}^G %2) ≃ Hom_{H2}(Res_{H2}^G Ind_{H1}^G %1, %2).
The idea now is to find first a convenient model for the space
    Hom(Res_{H2}^G Ind_{H1}^G %1, %2),
and then to isolate inside it the H2-intertwiners. Let F1 denote the space on which Ind_{H1}^G %1
acts, as well as its restriction to H2. By construction, F1 is a subspace of the space V1 of
all functions from G to E1, and hence we can write any linear map T from F1 to E2 in
the form
    Tϕ = Tα ϕ = ∑_{x∈G} α(x)(ϕ(x))
for some α(x) ∈ Hom(E1, E2) (this amounts to using the basis of characteristic functions
of single points to compute linear maps with values in E2 which are defined on the whole
of V1). We claim that this gives an isomorphism
    T : I −→ Hom(F1, E2),   α ↦ Tα,
where
    I = {α : G −→ Hom(E1, E2) | α(h1 x) = α(x) ∘ %1(h1)⁻¹ for all h1 ∈ H1, x ∈ G}
(in general, since F1 is a subspace of V1, Hom(F1, E2) would be a quotient of Hom(V1, E2),
and what we are doing is finding a good representative subspace for it in Hom(V1, E2)).
Indeed, the computation
    Tα(ϕ) = ∑_{y∈H1\G} ∑_{h1∈H1} α(h1 y)(ϕ(h1 y))
           = ∑_{y∈H1\G} ∑_{h1∈H1} (α(y) ∘ %1(h1)⁻¹)(%1(h1) ϕ(y))
           = |H1| ∑_{y∈H1\G} α(y)(ϕ(y))
quickly shows that T is injective (since one can prescribe the values of an element ϕ ∈
F1 arbitrarily on each coset in H1\G). But dim I = [G : H1](dim E1)(dim E2) (by an
argument similar to the computation of the dimension of an induced representation),
and this is also the dimension of Hom(F1, E2) (by Proposition 2.3.8), so T is indeed an
isomorphism.
Now we can easily answer the question: given α ∈ I, when is Tα ∈ Hom(F1 , E2 ) an
H2 -intertwiner? We have
    Tα(h2 · ϕ) = ∑_{x∈G} α(x)((h2 · ϕ)(x)) = ∑_{x∈G} α(x)(ϕ(x h2))
Proof. By the irreducibility criterion (Corollary 4.3.14, which is also the converse of
Schur’s Lemma), we need to determine when the space Iχ,χ of intertwiners of % = Ind_H^G(χ)
with itself is one-dimensional. We apply Proposition 4.8.1 to compute this space; if
we note that the space Hom(E1, E2) can be identified with C when E1 = E2 is one-
dimensional, we see that Iχ,χ is isomorphic to the space I of functions α : G −→ C such
that
    α(h1 x h2) = χ(h1 h2)⁻¹ α(x)
for all x ∈ G and h1, h2 ∈ H. These conditions seem similar to those defining an induced
representation, and this would suggest at first that the dimension of I is |H\G/H| = |S|,
but there is a subtlety: a decomposition x = h1 s h2 with hi ∈ H need not be unique,
which creates additional relations to be satisfied. In consequence, the dimension can be
smaller than this guess.
The one-dimensional subspace of I corresponding to C Id in EndG(%) is spanned by
the function α0 such that
    α0(x) = χ(x)⁻¹ if x ∈ H,   and α0(x) = 0 otherwise,
(as one can easily check using the explicit form of the Frobenius reciprocity isomorphism).
We now determine the condition under which I is actually spanned by this special
function α0. If we denote by S a set of representatives for the double cosets HsH ⊂ G
(taking s = 1 for the double coset H · H = H), we see first of all, from the relations
defining I, that the map
    I −→ C^S,   α ↦ (α(s))_{s∈S},
is injective. Similarly, we get
(4.61)    α(hs) = χ(h)⁻¹ α(s) = α(sh)
for all h ∈ H and s ∈ S.
Now fix s ∈ S. We claim that either α(s) = 0 or χs = χ on Hs = H ∩ sHs⁻¹
(note in passing that χs is indeed a well-defined representation of this subgroup, since
s⁻¹ Hs s ⊂ H). Indeed, for x ∈ Hs = H ∩ sHs⁻¹, we have
    α(xs) = α(s(s⁻¹xs)) = α((s⁻¹xs)s)
by (4.61), applied with h = s⁻¹xs ∈ H, and this gives
    χ(x)⁻¹ α(s) = χ(s⁻¹xs)⁻¹ α(s),
which is valid for all x ∈ Hs, hence the claim.
Consequently, if no χs coincides with χ on Hs when s ∉ H (corresponding to the
double cosets which are distinct from H), any α ∈ I must vanish on the non-trivial
double cosets, and hence be a multiple of α0. This proves the sufficiency part of the
irreducibility criterion, and we leave the necessity to the reader...
Exercise 4.8.3 (Principal series). Let n ≥ 2 be an integer and let p be a prime
number. Let G = GLn (Fp ) and define B ⊂ G to be the subgroup of all upper-triangular
matrices. Moreover, let W ⊂ G be the subgroup of permutation matrices.
(1) Show that
    G = ⋃_{w∈W} BwB,
and that this is a disjoint union. [Hint: Think of using Gaussian elimination to triangulate
a matrix in some basis.]
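For n = 2 this decomposition is small enough to verify exhaustively. The following sketch (hypothetical helper code, not part of the text) enumerates GL2(F3) and checks that it is the disjoint union of B and BwB for the non-trivial permutation matrix w:

```python
# Hypothetical helper code: Bruhat decomposition of GL2(F3) by brute force.
from itertools import product

p = 3

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) % p
                       for j in range(2)) for i in range(2))

def det(a):
    return (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % p

G = {((a, b), (c, d)) for a, b, c, d in product(range(p), repeat=4)
     if det(((a, b), (c, d))) != 0}
B = {g for g in G if g[1][0] == 0}           # invertible upper-triangular
w = ((0, 1), (1, 0))
BwB = {mul(mul(b1, w), b2) for b1 in B for b2 in B}
assert len(G) == (p * p - 1) * (p * p - p) == 48
assert G == B | BwB and not B & BwB          # disjoint union
```

Here B = B·1·B accounts for the matrices with vanishing lower-left entry, and every other matrix lies in BwB.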
(2) Let χ : B −→ C^× be the one-dimensional representation given by
    χ(g) = χ1(g_{1,1}) χ2(g_{2,2}) · · · χn(g_{n,n}),
where χi, 1 ≤ i ≤ n, are characters of F_p^×, and let % = Ind_B^G χ. Show that % is irreducible
whenever all characters χi are distinct.
(3) For n = 2, show without using characters that the induced representations
π(χ1 , χ2 ) and π(χ2 , χ1 ) (with χ1 6= χ2 ) are isomorphic, and write down a formula for
an isomorphism. [Hint: Follow the construction in Proposition 4.8.1 and the Frobenius
reciprocity isomorphism.]
The irreducible representations of GLn (Fp ) constructed in this exercise are called the
principal series; for n = 2, they are exactly those whose irreducibility was proved in
Section 4.6.4 using their characters.
Note that there is no principal series unless there are at least n distinct characters
of F_p^×, i.e., unless p − 1 ≥ n. This suggests – and this is indeed the case! – that the
character tables of GLn(Fp), when p is fixed and n varies, behave rather differently from
those of GLn(Fp) when n is fixed and p varies.
Exercise 4.8.4. We explain here a different approach to the corollary. Let G be a
finite group, H1 and H2 subgroups of G, χ a one-dimensional complex representation of
H1 .
(1) Show that
    Res_{H2}^G Ind_{H1}^G χ ≃ ⊕_{s∈S} Ind_{H2,s}^{H2} χs,
where S is a set of representatives of the double cosets H2 gH1 , H2,s = H2 ∩ s−1 H1 s and
χs is the one-dimensional character of H2,s given by
    χs(x) = χ(s x s⁻¹).
[Hint: This can be done with character theory.]
(2) Prove the corollary, and recover the irreducibility result (2) of the previous exercise,
using (1).
Our second topic concerning induction takes up the following question: we have seen,
in concrete examples, that many irreducible representations of certain finite groups arise
as induced representations from subgroups, and indeed from one-dimensional characters
of abelian subgroups. How general is this property? It turns out that, provided some
leeway is allowed in the statement, one can in fact recover all irreducible representations
using induced representations of cyclic subgroups. This is the content of the following
theorem of Artin, which is an excellent illustration of the usefulness of the character ring
R(G) = RC (G) of virtual characters of G introduced in Definition 2.7.40.
Theorem 4.8.5 (Artin). Let G be a finite group, and let % ∈ Ĝ be an irreducible
representation of G. There exists a decomposition
(4.62)    % = ∑_i αi Ind_{Hi}^G χi
in R(G) ⊗ Q, where the αi are rational numbers, the Hi are cyclic subgroups of G, and
the χi are one-dimensional characters of Hi.
(precisely, m is a common denominator of all αi in (4.62), ni = −mαi if αi < 0, while
mj = mαj if αj > 0).
As we will see with examples, this statement can not, in general, be improved to
express % without considering virtual characters, i.e., with all αi non-negative.
Proof. There is a surprisingly easy proof: the Q-vector space R(G) ⊗ Q has a
basis corresponding to the irreducible representations of G, and inherits a non-degenerate
symmetric bilinear form ⟨·, ·⟩G for which those characters form an orthonormal basis. By
duality of vector spaces, a subspace V ⊂ R(G) ⊗ Q is equal to the whole space if and
only if its orthogonal V⊥ is zero. In particular, if V is generated by certain elements (χi)
in R(G) ⊗ Q, we have V = R(G) ⊗ Q if and only if no non-zero ξ ∈ R(G) ⊗ Q is orthogonal
to all χi.
We apply this now to the family χi of all representations induced from irreducible
representations of cyclic subgroups of G. Suppose ξ ∈ R(G) ⊗ Q is orthogonal to all
such χi. Then, using the Frobenius reciprocity formula (which holds in R(G) ⊗ Q by
“linearity” of induction and restriction with respect to direct sums), we get
    ⟨Ind_H^G χ, ξ⟩G = ⟨χ, Res_H^G ξ⟩H
for any cyclic subgroup H and χ ∈ Ĥ. Varying χ for a fixed H, it follows that Res_H^G ξ
is zero for all H. If we identify ξ with its virtual character in C(G), this means that ξ
vanishes on all cyclic subgroups of G. But since any element x ∈ G belongs to at least
one cyclic subgroup, this gives ξ = 0.
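The spanning statement can be made concrete for G = S3. The following sketch (hypothetical helper data, not part of the text; the induced character values were computed by hand from the induced-character formula) checks that the characters induced from irreducible characters of cyclic subgroups span the 3-dimensional space of class functions:

```python
# Hypothetical helper code: a small exact rank computation over Q.
from fractions import Fraction

# class functions on (e, transpositions, 3-cycles)
induced = [
    (6, 0, 0),    # from the trivial subgroup: the regular character
    (3, 1, 0),    # from <(12)>, trivial character
    (3, -1, 0),   # from <(12)>, sign character
    (2, 0, 2),    # from <(123)>, trivial character
    (2, 0, -1),   # from <(123)>, either nontrivial character
]

def rank(rows):
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

assert rank(induced) == 3     # the induced characters span R(S3) ⊗ Q
```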
4.8.2. Representations over other fields.
CHAPTER 5
1 It is useful here to remember that a continuous bijection between compact topological spaces is
necessarily a homeomorphism, i.e., the inverse is also automatically continuous.
where
    α(m) = ∫_{R/Z} ϕ(t) e^{−2iπmt} dt
are the Fourier coefficients. Many results are known about the convergence of Fourier
series towards ϕ, many of which reveal a great subtlety of behavior. For instance, it was
shown by Kolmogorov that there exist integrable functions ϕ ∈ L¹(R/Z) such that the
Fourier series above diverges for all x ∈ R/Z. (For the classical theory of Fourier series,
one can look at Zygmund’s treatise [45].)
However, it is also classical that a very good theory emerges when considering the
square-integrable functions: if ϕ ∈ L2 (G), the Fourier series converges in L2-norm, i.e.,
$$\Big\| \varphi - \sum_{|m| \le M} \alpha(m)\chi_m \Big\|_{L^2} \longrightarrow 0$$
holds as M → +∞.
This can be interpreted in terms of a unitary representation of G on L2 (G): under
the regular action
reg(t)ϕ(x) = ϕ(x + t),
the condition of square-integrability is preserved, and in fact
$$\|\operatorname{reg}(t)\varphi\|^2 = \int_{\mathbf{R}/\mathbf{Z}} |\varphi(x+t)|^2\, dx = \int_{\mathbf{R}/\mathbf{Z}} |\varphi(x)|^2\, dx = \|\varphi\|^2,$$
because the Lebesgue measure is invariant under translations. Thus L2 (G) is formally a
unitary representation (the continuity requirement also holds; we will verify this later in
a more general case). The L2 -theory of Fourier series says that the family of characters
(χm ) is an orthonormal basis of the space L2 (G), and this is, quite recognizably, a suitable
analogue of the decomposition of the regular representation: each of the characters χm
appears once (a one which is the dimension of χm!) in L2 (G). At this point it may
not be entirely clear that G has no other irreducible unitary representations, but at
least Schur’s Lemma still applies to show that G has no finite-dimensional irreducible complex
representation which is not one-dimensional (since G is abelian), and those are the χm
(see Example 3.3.9). As it turns out, there is indeed no irreducible infinite-dimensional unitary
representation of G, but we will see this later.
It is rather natural to explore a bit how facts about Fourier series can be interpreted
in terms of the representation theory of G. Although this is quite straightforward, it
brings out a few facts which are quite useful to motivate some parts of the next section,
where arbitrary compact groups enter the picture.
For instance, let us consider the orthogonal projection map
pm : L2 (G) −→ L2 (G)
onto the χm-isotypic component of L2 (G). Since this space is one-dimensional, with χm itself
as a unit vector, we have simply
$$p_m(\varphi) = \langle \varphi, \chi_m \rangle\, \chi_m,$$
179
i.e., for t ∈ R/Z, we have
$$(5.1)\qquad p_m(\varphi)(t) = \Big( \int_{\mathbf{R}/\mathbf{Z}} \varphi(x)\, e^{-2i\pi m x}\, dx \Big)\, e^{2i\pi m t}.$$
This seems to be a complicated way of writing the m-th Fourier coefficient of ϕ, which
is the integral that appears here. However, we can write this as
$$p_m(\varphi)(t) = \int_{\mathbf{R}/\mathbf{Z}} e^{2i\pi m(t-x)}\, \varphi(x)\, dx
 = \int_{\mathbf{R}/\mathbf{Z}} e^{-2i\pi m y}\, \varphi(y+t)\, dy
 = \int_{\mathbf{R}/\mathbf{Z}} \overline{\chi_m(y)}\, (\operatorname{reg}(y)\varphi)(t)\, dy,$$
i.e., pm (ϕ) can be written
$$p_m(\varphi) = \varphi \star \chi_m$$
where $\cdot \star \cdot$ denotes the convolution of functions on G:
$$(\varphi_1 \star \varphi_2)(t) = \int_{\mathbf{R}/\mathbf{Z}} \varphi_1(x)\,\varphi_2(t-x)\, dx$$
(when this makes sense, of course). In particular, (5.1) means that the χm are eigenfunctions of the convolution operator
$$T_\varphi \colon L^2(G) \longrightarrow L^2(G), \qquad f \mapsto \varphi \star f,$$
(for any ϕ ∈ L2 (G); this is a continuous linear operator on L2 (G)) with eigenvalues given
precisely by the Fourier coefficients ⟨ϕ, χm⟩. Note finally that this convolution operator is
an intertwiner for the action of G on L2 (G): this follows either from a direct computation,
or from the fact that Tϕ acts by scalar multiplication on the characters.
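The identity $p_m(\varphi) = \varphi \star \chi_m$ can be checked numerically on a discretization of R/Z (an illustrative sketch, not part of the text; all integrals become Riemann sums over N sample points):

```python
import numpy as np

# Discretized circle R/Z with N sample points: check that the isotypic
# projection p_m(phi) equals the convolution phi * chi_m, where
# chi_m(t) = exp(2i*pi*m*t).
N, m = 256, 3
t = np.arange(N) / N                      # sample points of R/Z
rng = np.random.default_rng(0)
phi = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # arbitrary "function"
chi_m = np.exp(2j * np.pi * m * t)

# Fourier coefficient alpha(m) = int phi(x) exp(-2i pi m x) dx (Riemann sum)
alpha_m = np.mean(phi * np.conj(chi_m))
proj = alpha_m * chi_m                    # p_m(phi)

# Convolution (phi * chi_m)(s) = int phi(x) chi_m(s - x) dx (Riemann sum)
conv = np.array([np.mean(phi * np.exp(2j * np.pi * m * (s - t))) for s in t])

assert np.allclose(proj, conv)
```

The agreement is exact up to floating-point error, since the two Riemann sums are algebraically identical.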
for f ∈ C(G). We recognize a type of expression which was extensively used in Chapter 4!
(2) If G = Rd , d > 1, or G = (R/Z)d , a Haar measure is obviously given by Lebesgue
measure. (Which shows that Theorem 5.2.1 is at least as deep as the existence of the
Lebesgue measure!)
(3) Let G = R× . Then a Haar measure on G is given by
$$d\mu(x) = \frac{dx}{|x|}$$
in terms of the Lebesgue measure dx on R. This can be proved by a straightforward
computation using the change of variable formula for the Lebesgue measure: for a ∈ R×,
putting y = ax with dy = |a|dx, we have
$$(5.5)\qquad \int_{\mathbf{R}^\times} f(ax)\,\frac{dx}{|x|} = \int_{\mathbf{R}^\times} f(y)\,\frac{dy}{|a|\,|y/a|} = \int_{\mathbf{R}^\times} f(y)\,\frac{dy}{|y|},$$
or more conceptually using the exponential group isomorphism R ≃ R+,× and the (obvious) fact that if
f : G1 → G2
is an isomorphism of locally compact groups (i.e., a homeomorphism which is also a group
isomorphism), the direct image f∗ µ1 of a Haar measure on G1 is a Haar measure on G2 .
If one restricts to R+,× = ]0, +∞[, we get the Haar measure $x^{-1}\,dx$. This property
explains some formulas, such as the definition
$$\Gamma(s) = \int_0^{+\infty} x^{s-1}\, e^{-x}\, dx$$
of the Gamma function: it really is an integral of $x \mapsto x^{s} e^{-x}$ with respect to Haar measure
on ]0, +∞[.
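Concretely (a one-line check, not spelled out in the text), the substitution $x = e^t$ transports the Haar measure $x^{-1}\,dx$ on $]0,+\infty[$ to the Lebesgue (Haar) measure $dt$ on $(\mathbf{R},+)$, and correspondingly

```latex
\Gamma(s) = \int_0^{+\infty} x^{s}\, e^{-x}\,\frac{dx}{x}
          = \int_{\mathbf{R}} e^{st}\,\exp(-e^{t})\,dt.
```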
(4) The example (3) can be generalized to the (genuinely non-abelian, non-compact)
group G = GLn (R) for n > 1: in terms of the Lebesgue measure $dx = \prod_{i,j} dx_{i,j}$ on the
space $M_n(\mathbf{R}) \simeq \mathbf{R}^{n^2}$ of matrices, a Haar measure on the subset G is given by
$$(5.6)\qquad d\mu(x) = \frac{dx}{|\det(x)|^{n}}.$$
This is again a simple application of the change of variable formula for multi-dimensional Lebesgue measure: the map x 7→ xg on G extends to a diffeomorphism of Mn (R)
with Jacobian $|\det(g)|^{n}$, and the result follows, formally, as in (5.5). Note that, although
G is not compact (and not abelian), the left-invariance property (5.3) also holds for this
Haar measure.
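The Jacobian computation can be verified numerically for n = 2 (an illustration, not from the text): the map $x \mapsto xg$ is linear on $M_2(\mathbf{R}) \simeq \mathbf{R}^4$, so its Jacobian determinant is the determinant of its matrix in any basis, and should equal $\det(g)^2$ — the exponent $n$ appearing in the Haar measure.

```python
import numpy as np

# The map x -> x g is linear on M_2(R) ~ R^4; build its 4x4 matrix by
# applying it to the four elementary matrices (flattened row-major), and
# check that its determinant is det(g)^2.
rng = np.random.default_rng(0)
g = rng.standard_normal((2, 2))

E = np.eye(4)
M = np.column_stack([(E[:, k].reshape(2, 2) @ g).reshape(4) for k in range(4)])

assert np.isclose(np.linalg.det(M), np.linalg.det(g) ** 2)
```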
(5) Let G = SU2 (C). Here, and in many similar circumstances, one needs first to find
a convenient parametrization to describe the Haar measure on G; typically, this means
introducing finitely many continuous coordinates on G, and using a measure defined using
Lebesgue measure in terms of these coordinates.
A very convenient system of coordinates on SU2 (C) is obtained by remarking that
any g ∈ SU2 (C) can be written uniquely
$$(5.7)\qquad g = \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}$$
where a, b ∈ C are arbitrary, subject to the single condition $|a|^2 + |b|^2 = 1$ (we leave the
verification of this fact as an exercise). Using the real and imaginary parts of a and b as
coordinates, we see that G is homeomorphic to the unit sphere S3 in R4 .
There is an obvious measure that comes to mind on this unit sphere: the “surface”
Lebesgue measure µ, which can be defined as follows in terms of the Lebesgue measure
µ4 on R4:
$$\mu(A) = \mu_4\Big( \Big\{ x \in \mathbf{R}^4 - \{0\} \ \Big|\ \|x\| \le 1 \text{ and } \frac{x}{\|x\|} \in A \Big\} \Big)$$
for A ⊂ S3. This measure is natural because it is invariant under the obvious linear
action of the orthogonal group O4 (R) on S3 (simply because so is the Lebesgue measure
µ4 ).
We claim that this also implies that µ is in fact a Haar measure on SU2 (C). Indeed,
an element g ∈ SU2 (C), when acting on S3 by multiplication on the right, does so as the
restriction of an element of O4 (R) (as an elementary computation reveals), which gives
the result.
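That elementary computation can also be checked numerically (an illustrative sketch, not part of the text): in the coordinates (Re a, Im a, Re b, Im b) of the first row, right multiplication by a fixed g ∈ SU2 (C) is an R-linear map on R⁴, and its 4×4 matrix is orthogonal.

```python
import numpy as np

# Pick g in SU_2(C) from a random point (a, b) on S^3, i.e. |a|^2 + |b|^2 = 1.
rng = np.random.default_rng(2)
v = rng.standard_normal(4)
v /= np.linalg.norm(v)
a, b = v[0] + 1j * v[1], v[2] + 1j * v[3]

def right_mult(alpha, beta):
    # first row of h*g, where h in SU_2(C) has first row (alpha, beta)
    return alpha * a - beta * np.conj(b), alpha * b + beta * np.conj(a)

def to_R4(alpha, beta):
    return np.array([alpha.real, alpha.imag, beta.real, beta.imag])

# Matrix of the R-linear map on R^4, column by column over an R-basis of C^2.
basis = [(1, 0), (1j, 0), (0, 1), (0, 1j)]
M = np.column_stack([to_R4(*right_mult(al, be)) for al, be in basis])

assert np.allclose(M.T @ M, np.eye(4))   # M is orthogonal
```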
Exercise 5.2.5. We present a few additional properties and examples of Haar mea-
sures in this exercise.
(1) [Compactness and Haar measure] Show that if G is a locally compact topological
group and µ is a Haar measure on G, we have µ(G) < +∞ if and only if G is compact.
Next, we will explain a slightly different proof of the left-invariance of Haar measure
on a compact group, which applies to more general groups.
(2) Fix a Haar measure µ on G. Show that the function ∆ : G −→ [0, +∞[ defined
by (5.4) is nowhere zero, and is a continuous homomorphism G −→ R+,× .
(3) Show that if G is compact, such a homomorphism is necessarily trivial. Show
directly (without using (5.5)) that ∆ is also necessarily trivial for G = GLn (R), n > 2.
(4) Let G be locally compact. Show that the measure
dν(y) = ∆(y)dµ(y)
is a non-zero left-invariant measure on G.
(5) Let
$$G = \Big\{ \begin{pmatrix} a & x \\ 0 & 1 \end{pmatrix} \ \Big|\ a \in \mathbf{R}^\times,\ x \in \mathbf{R} \Big\} \subset \mathrm{GL}_2(\mathbf{R}).$$
Show that, in terms of the coordinates a and x, the measure
$$d\mu = \frac{da\, dx}{|a|}$$
is a Haar measure on G. Check that it is not left-invariant, i.e., (5.3) does not always
hold. What is the function ∆(g) in this case? What is a non-zero left-invariant measure
on G?
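For part (5), the change-of-variables computation can be sketched as follows (a possible solution, not given in the text). Right translation by $g_0 = (a_0, x_0)$ sends $(a, x)$ to $(a a_0,\, a x_0 + x)$, with Jacobian determinant $a_0$, so that

```latex
\int f\big((a,x)\,g_0\big)\,\frac{da\,dx}{|a|}
  = \int f(a',x')\,\frac{da'\,dx'/|a_0|}{|a'|/|a_0|}
  = \int f(a',x')\,\frac{da'\,dx'}{|a'|},
```

and $\mu$ is invariant under right translations. Left translation by $g_0$ sends $(a,x)$ to $(a_0 a,\, a_0 x + x_0)$, with Jacobian determinant $a_0^2$; the same computation then leaves an extra factor $|a_0|^{-1}$, so $\mu$ is not left-invariant, while the measure $da\,dx/a^2$ is.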
With the Haar measure in hand, we can now define the regular representation of a
locally compact group.
Proposition 5.2.6 (The regular representation). Let G be a compact group, and let
µ be a Haar measure on G. The regular action
reg(g)ϕ(x) = ϕ(xg)
is a well-defined unitary representation of G on L2 (G, µ), which is strongly continuous.
Up to isometry, this representation is independent of the choice of Haar measure, and
is called “the” regular representation of G.
Similarly, the left-regular representation reg′ is defined on L2 (G, µ) by
$$\operatorname{reg}'(g)\varphi(x) = \varphi(g^{-1}x),$$
and the two representations commute: we have reg′(g) reg(h) = reg(h) reg′(g) for all g,
h ∈ G.
Proof. Although this sounds formal – and an important goal of the theory is to set
it up in such a way that it becomes formally as easy and flexible as it is in the case of
finite groups – it is important to see that there are actually some non-trivial subtleties
involved.
First of all, the regular action is clearly well-defined for functions, but an element of
L2 (G, µ) is an equivalence class of functions, modulo those ϕ which are zero µ-almost
everywhere. Thus we must check that, if ϕ has this property, so does reg(g)ϕ. This
follows directly from the invariance of Haar measure (but it is not a triviality). Once this
is settled, we check that reg is unitary using once more the invariance of µ:
$$\|\operatorname{reg}(g)\varphi\|^2 = \int_G |\varphi(xg)|^2\, d\mu(x) = \int_G |\varphi(x)|^2\, d\mu(x) = \|\varphi\|^2.$$
What is by no means obvious is the continuity of the representation. We use the
strong continuity criterion of Proposition 3.3.3 to check this.2
We first take ϕ ∈ C(G), and we check the continuity at g = 1 of
g 7→ reg(g)ϕ
as follows: we have
$$\|\operatorname{reg}(g)\varphi - \varphi\|^2 = \int_G |\varphi(xg) - \varphi(x)|^2\, d\mu(x),$$
and since ϕ(xg) − ϕ(x) → 0 as g → 1, for every x ∈ G, while
$$|\varphi(xg) - \varphi(x)|^2 \le 4\|\varphi\|_{\infty}^2,$$
we see from the dominated convergence theorem that
$$\lim_{g \to 1} \|\operatorname{reg}(g)\varphi - \varphi\|^2 = 0.$$
$$(5.9)\qquad V_0 = \{ f \colon G \longrightarrow H \mid f \text{ is continuous, and } f(kg) = \varrho(k)f(g) \text{ for all } k \in K,\ g \in G \},$$
on which G acts by the regular action:
$$\pi(g)f(x) = f(xg).$$
Define an inner product on V0 by
$$\langle f_1, f_2 \rangle_0 = \int_G \langle f_1(x), f_2(x) \rangle_H\, d\mu(x).$$
so that $\|P\| = 0$ if and only if P vanishes at every (a, b) ∈ C2 which is the first row of
a matrix in SU2 (C). As observed in (5.7), these are the pairs of complex numbers with
$|a|^2 + |b|^2 = 1$, and by homogeneity of P ∈ Vm, it follows that in fact P = 0, proving that
the inner product we defined is positive definite.
It remains to compute the inner products ⟨ej, ej⟩. We get
$$\langle e_j, e_j \rangle = \int_{\mathrm{SU}_2(\mathbf{C})} |a|^{2j}\, |b|^{2(m-j)}\, d\mu(g),$$
Another argument for determining this invariant inner product is based on an a priori
computation exploiting its known existence and its invariance property. Using the same
basis as above, we find, for instance, that for all θ ∈ R and j, k, we must have
$$\langle e_j, e_k \rangle = \Big\langle \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \cdot e_j,\ \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \cdot e_k \Big\rangle = \big\langle e^{i(2j-m)\theta} e_j,\ e^{i(2k-m)\theta} e_k \big\rangle = e^{2i(j-k)\theta}\, \langle e_j, e_k \rangle,$$
which immediately implies that ej and ek must be orthogonal when j ≠ k. This means
that (ej) is an orthogonal basis, and only the norms $\|e_j\|^2 = \langle e_j, e_j\rangle$ need to be computed in order
to conclude.
This must rely on other elements of SU2 (C) than the diagonal ones (otherwise, we
would be arguing with %m restricted to the diagonal subgroup K, where any choice of
$\|e_j\|^2 > 0$ gives a K-invariant inner product). Here we sketch the method of [43, III.2.4]:4
consider the elements
$$g_t = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}, \qquad t \in \mathbf{R},$$
of SO2 (R) ⊂ SU2 (C), and differentiate with respect to t and evaluate at 0 the invariance
formula
$$0 = \langle g_t \cdot e_j,\ g_t \cdot e_{j+1} \rangle.$$
One obtains the relation
$$\langle A e_j, e_{j+1} \rangle + \langle e_j, A e_{j+1} \rangle = 0,$$
4 Which can be seen as a simple case of studying the group SU2 (C) through its Lie algebra.
where A is the linear operator defined on Vm by
$$AP = \frac{d}{dt}\,(g_t \cdot P)\,\Big|_{t=0}.$$
Spelling out gt · ej, an elementary computation shows that
$$A e_j = \frac{j}{2}\, e_{j-1} - \frac{m-j}{2}\, e_{j+1},$$
so that (by orthogonality of the non-diagonal inner products) we get a recurrence relation
$$(j+1)\,\langle e_j, e_j \rangle = (m-j)\,\langle e_{j+1}, e_{j+1} \rangle.$$
This determines, up to a constant scalar factor c > 0, the norms $\|e_j\|^2$, and indeed a
quick induction leads to the formula
$$\langle e_j, e_j \rangle = c\, j!\,(m-j)!, \qquad 0 \le j \le m,$$
which coincides – as it should! – with (5.10) when taking $c^{-1} = (m+1)!$.
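The induction is easy to confirm numerically (a quick illustrative check, not part of the text): $h(j) = c\, j!\,(m-j)!$ does satisfy the recurrence $(j+1)\,h(j) = (m-j)\,h(j+1)$.

```python
from math import factorial

# Check, for small m, that h(j) = j! (m-j)! (taking c = 1) satisfies
# (j+1) h(j) = (m-j) h(j+1) for 0 <= j < m.
for m in range(1, 8):
    h = [factorial(j) * factorial(m - j) for j in range(m + 1)]
    for j in range(m):
        assert (j + 1) * h[j] == (m - j) * h[j + 1]
```

Any other solution of the recurrence is a scalar multiple, which is where the constant c comes from.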
5.3. The analogue of the group algebra
It is now natural to discuss the analogue of the action of the group algebra of a finite
group. However, some readers may prefer to skip to the next section, and to come back
once the proof of the Peter-Weyl theorem has shown that this extension is naturally
required.
The group algebra for a finite group G is the source of endomorphisms of a representa-
tion space % which are linear combinations of the %(g). When G is compact, but possibly
infinite, the group algebra (over C) can still be defined (as finite linear combinations of
symbols [g], g ∈ G), but the endomorphisms it defines are not sufficient anymore. For
instance, in a group like SU2 (C), the center of the group algebra is too small to create
interesting intertwiners (e.g., on the regular representation), because all conjugacy classes
in SU2 (C), except those of ±1, are infinite.
It seems intuitively clear that one can solve this problem of paucity by replacing
the sums defining elements of the group algebra with integrals (with respect to Haar
measure). This means that we want to consider suitable functions ψ on G and define
$$\int_G \psi(g)\,[g]\, d\mu(g),$$
in some sense. More concretely, we will consider a unitary representation %, and define
%(ψ) as the linear map
$$(5.11)\qquad \varrho(\psi) = \int_G \psi(g)\varrho(g)\, d\mu(g).$$
This already is well-defined when % is finite-dimensional: the integration can be com-
puted coordinate-wise after selecting a basis,5 but needs some care when % is not (e.g.,
when G is infinite and % is the regular representation), since we are then integrating
a function with values in the space L(H) of continuous linear maps on H, for some
infinite-dimensional Hilbert space H.
However, this can be defined. A natural approach would be to use the extension
of Lebesgue’s integration theory to Banach-space valued functions, but this is not so
commonly considered in first integration courses. We use an alternate definition using
“weak integrals” which is enough for our purposes and is much quicker to set up. The
5 One should check that the result is independent of the basis, but that is of course easy, e.g., by
computing the coordinates with respect to a fixed basis.
basic outcome is: for ψ ∈ L1 (G) (with respect to Haar measure, as usual), one can define
continuous linear operators %(ψ), for any unitary representation % of G, which behave
exactly as one would expect from the formal expression (5.11) and the standard properties
of integrals.
Proposition 5.3.1 (L1 -action on unitary representations). Let G be a compact group
with Haar measure µ. For any integrable function ψ ∈ L1 (G) and any unitary represen-
tation % : G −→ U(H), there exists a unique continuous linear operator
%(ψ) : H −→ H
such that
$$(5.12)\qquad \langle \varrho(\psi)v, w \rangle = \int_G \psi(g)\, \langle \varrho(g)v, w \rangle\, d\mu(g)$$
for any vectors v, w ∈ H. This has the following properties:
(1) For a fixed %, the map
$$L^1(G) \longrightarrow L(H), \qquad \psi \mapsto \varrho(\psi)$$
is linear and continuous, with norm at most 1, i.e., $\|\varrho(\psi)v\| \le \|\psi\|_{L^1}\|v\|$ for all ψ ∈
L1 (G). If Φ : % −→ π is a homomorphism of unitary representations, we have
$$\Phi \circ \varrho(\psi) = \pi(\psi) \circ \Phi.$$
(2) For a fixed ψ ∈ L1 (G), the adjoint of %(ψ) is given by
$$(5.13)\qquad \varrho(\psi)^* = \varrho(\check{\psi}),$$
where $\check{\psi}(g) = \overline{\psi(g^{-1})}$.
(3) For any ψ and %, and for any subrepresentation π of % acting on H1 ⊂ H, the
restriction of %(ψ) to H1 is given by π(ψ). In other words, H1 is a subspace stable under
%(ψ).
(4) For any g ∈ G and ψ ∈ L1 (G), we have
$$(5.14)\qquad \varrho(g)\varrho(\psi) = \varrho(\operatorname{reg}'(g)\psi), \qquad \varrho(\psi)\varrho(g) = \varrho(\operatorname{reg}(g^{-1})\psi),$$
where reg and reg′ represent here the right and left-regular representations acting on
L1 (G), as in Exercise 5.2.7. For ψ ∈ L2 (G), this coincides with the usual regular representations.
We will denote
$$\varrho(\psi) = \int_G \psi(x)\varrho(x)\, d\mu(x),$$
and for any vector v ∈ H, we also write
$$\varrho(\psi)v = \int_G \psi(x)\varrho(x)v\, d\mu(x),$$
which is a vector in H. The properties of the proposition can then all be seen to be
formal consequences of these expressions, and can be remembered (or recovered when
needed) in this way. For instance, the inequality $\|\varrho(\psi)\|_{L(H)} \le \|\psi\|_{L^1(G)}$ boils down to
$$\Big\| \int_G \psi(x)\varrho(x)\, d\mu(x) \Big\| \le \int_G |\psi(x)|\, \|\varrho(x)\|\, d\mu(x) = \int_G |\psi(x)|\, d\mu(x),$$
which is perfectly natural (we use here that %(x) has norm 1 for all x).
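In the finite model (an illustration, not from the text), these formulas become honest finite sums and can be checked directly: take G = Z/NZ with the normalized counting (Haar) measure, and let % be the regular representation by cyclic shift matrices. Then (5.13) reads $\varrho(\psi)^* = \varrho(\check\psi)$ with $\check\psi(k) = \overline{\psi(-k)}$:

```python
import numpy as np

# Finite model: G = Z/NZ, Haar measure = counting measure / N,
# rho(k) = cyclic shift by k (a unitary permutation matrix), and
# rho(psi) = (1/N) * sum_k psi(k) rho(k).
N = 7
rng = np.random.default_rng(3)
psi = rng.standard_normal(N) + 1j * rng.standard_normal(N)

P = np.roll(np.eye(N), 1, axis=0)                 # rho(1): shift matrix
rho = lambda k: np.linalg.matrix_power(P, k % N)  # rho(k) = P^k
op = lambda f: sum(f[k] * rho(k) for k in range(N)) / N

# psi_check(k) = conj(psi(-k)); relation (5.13): op(psi)^* = op(psi_check)
psi_check = np.conj(psi[(-np.arange(N)) % N])
assert np.allclose(op(psi).conj().T, op(psi_check))
```

Here the conjugation in $\check\psi$ is forced by unitarity: $\varrho(k)^* = \varrho(-k)$ while the coefficients get conjugated.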
Proof. (1) The procedure is a familiar one in Hilbert-space theory: a vector can be
pinpointed (and shown to exist) by showing what its inner products with other vectors are,
and these can be prescribed arbitrarily, provided only that they are linear and continuous.
Precisely, given ψ ∈ L1 (G), a unitary representation % on H and a vector v ∈ H, define
$$\ell_{\psi,v} \colon H \longrightarrow \mathbf{C}, \qquad w \mapsto \int_G \psi(g)\, \langle w, \varrho(g)v \rangle\, d\mu(g).$$
as expected.
(4) These formulas replace formal computations based on the invariance of the Haar
measure under translation. We only write down one of these: for g ∈ G and ψ ∈ L1 (G),
we have
$$\varrho(g)\varrho(\psi) = \varrho(g)\int_G \psi(x)\varrho(x)\, d\mu(x) = \int_G \psi(x)\varrho(gx)\, d\mu(x) = \int_G \psi(g^{-1}y)\varrho(y)\, d\mu(y),$$
which “is” the first formula. We leave the other to the reader, as we leave her the task
of translating it into a formal argument.
Exercise 5.3.2 (More general integrals). Many variants of this construction are pos-
sible. For instance, let H be a separable Hilbert space (so there is a countable subset of
H which is dense in H). For any function
f : G −→ H
which is “weakly measurable”, in the sense that for every w ∈ H, the function
g 7→ hf (g), wi
is measurable on G, and which has bounded values (in the norm of H), show how to
define the integral
$$\int_G f(g)\, d\mu(g) \in H,$$
and show that this construction is linear with respect to f, and satisfies
$$(5.15)\qquad \Big\| \int_G f(g)\, d\mu(g) \Big\| \le \int_G \|f(g)\|\, d\mu(g)$$
(you will first have to show that $g \mapsto \|f(g)\|$ is integrable). Moreover, for a unitary
representation % on H and for any ψ ∈ L1 (G) and v ∈ H, show that
$$\varrho(\psi)v = \int_G f(g)\, d\mu(g)$$
for f(g) = ψ(g)%(g)v.
Exercise 5.3.3 (Intertwiners from L1 -action). Let G be a compact group and µ a
Haar measure on G.
(1) For an integrable function ϕ on G, explain how to define the fact that ϕ is a class
function (i.e., formally, ϕ(xyx−1 ) = ϕ(y) for all x and y in G).
(2) Let ϕ be an integrable class function on G. For any unitary representation % of
G, show that the operator
$$\int_G \varphi(x)\varrho(x)\, d\mu(x)$$
is in HomG (%, %), i.e., is an intertwiner.
Exercise 5.3.4 (Convolution). Let G be a compact group and µ a Haar measure on
G. For any functions ϕ, ψ ∈ L2 (G), show that
$$\operatorname{reg}'(\varphi)\psi = \varphi \star \psi,$$
where the convolution ϕ ⋆ ψ is defined by
$$(\varphi \star \psi)(g) = \int_G \varphi(x)\psi(x^{-1}g)\, d\mu(x).$$
Prove also that
$$(\varphi \star \psi)(g) = \langle \varphi, \operatorname{reg}'(g)\check{\psi} \rangle,$$
where ψ̌ is given by (5.13), and deduce that ϕ ⋆ ψ is continuous.
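The first identity of this exercise can again be tested in the finite model G = Z/NZ (an illustrative sketch, not part of the text), where it also coincides with ordinary cyclic convolution as computed by the FFT:

```python
import numpy as np

# Finite model: on G = Z/NZ with normalized counting measure,
# (reg'(x) f)(g) = f(g - x), so reg'(phi) psi should equal
#   (phi * psi)(g) = (1/N) sum_x phi(x) psi(g - x),
# i.e. cyclic convolution divided by N.
N = 8
rng = np.random.default_rng(4)
phi = rng.standard_normal(N) + 1j * rng.standard_normal(N)
psi = rng.standard_normal(N) + 1j * rng.standard_normal(N)

shift = lambda f, x: np.roll(f, x)        # (reg'(x) f)(g) = f(g - x)
reg_action = sum(phi[x] * shift(psi, x) for x in range(N)) / N

conv = np.array([np.mean(phi * psi[(g - np.arange(N)) % N]) for g in range(N)])
fft_conv = np.fft.ifft(np.fft.fft(phi) * np.fft.fft(psi)) / N

assert np.allclose(reg_action, conv)
assert np.allclose(conv, fft_conv)
```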
The formulas (5.14) express the link between the action of L1 -functions on H (given
by the “implicit” definition (5.12) using inner products) and the original representation %.
The following is another important relation; in fact, it shows how to recover the original
representation operators %(g) starting from the collection of linear maps %(ψ). This is
trickier than for finite groups, because there is (in the infinite case) no integrable function
supported only at a single point g ∈ G.
Proposition 5.3.5 (L1 -approximation of representation operators). Let G be a com-
pact topological group, and let % : G −→ U(H) be a unitary representation of G. Fix
g ∈ G, and for any open neighborhood U of g in G, write ψU for the characteristic
function of U, normalized so that $\|\psi_U\|_{L^1} = 1$, i.e., define
$$\psi_U(x) = \begin{cases} 1/\mu(U) & \text{if } x \in U \\ 0 & \text{otherwise.} \end{cases}$$
Then we have
$$\lim_{U \downarrow g} \varrho(\psi_U) = \varrho(g),$$
where the limit is a limit “as U tends to g”, taken in the strong topology in L(H), i.e., it
should be interpreted as follows: for any v ∈ H, and for any ε > 0, there exists an open
neighborhood V of g such that
$$(5.16)\qquad \|\varrho(\psi_U)v - \varrho(g)v\| < \varepsilon$$
for any U ⊂ V.
Proof. This is a simple analogue of classical “approximation by convolution” results
in integration theory. We use (5.12), and the fact that the integral of ψU is one, to write
$$\langle \varrho(\psi_U)v - \varrho(g)v,\ w \rangle = \int_G \psi_U(x)\, \langle \varrho(x)v - \varrho(g)v,\ w \rangle\, d\mu(x)$$
for any U and any w ∈ H. Therefore
$$|\langle \varrho(\psi_U)v - \varrho(g)v,\ w \rangle| \le \|w\| \int_G \psi_U(x)\, \|\varrho(x)v - \varrho(g)v\|\, d\mu(x) \le \|w\| \sup_{x \in U} \|\varrho(x)v - \varrho(g)v\|$$
(alternatively, this should be thought of as an application of the inequality (5.15) for the
integral
$$\varrho(\psi_U)v - \varrho(g)v = \int_G \psi_U(x)\big(\varrho(x)v - \varrho(g)v\big)\, d\mu(x)$$
as defined in Exercise 5.3.2.)
But now, the strong continuity of the representation % means that $x \mapsto \|\varrho(x)v - \varrho(g)v\|$
is a continuous function on G taking the value 0 at x = g. Hence, for any ε > 0, we can
find some neighborhood V of g such that
$$\|\varrho(x)v - \varrho(g)v\| < \varepsilon$$
for all x ∈ V. This choice of V gives the desired inequality (5.16).
5.4. The Peter-Weyl theorem
Once we have the regular representation reg of a compact group G, we attack the study
of unitary representations of the group by attempting to decompose it into irreducibles.
This is done by the Peter-Weyl theorem, from which all fundamental facts about the
representations of general compact groups follow.
Theorem 5.4.1 (Peter-Weyl). Let G be a compact topological group with probability
Haar measure µ. Then the regular representation of G on L2 (G, µ) decomposes as a
Hilbert space direct sum6
$$(5.17)\qquad L^2(G, \mu) = \widehat{\bigoplus_{\varrho}}\ M(\varrho)$$
(where the direct sum is orthogonal) as well as the fact that M (%) is, for any finite-dimensional irreducible unitary representation, isomorphic to a direct sum of dim(%)
copies of %.
Indeed, we can follow the method of Section 2.7.3, and in particular Theorem 2.7.24,
to embed irreducible finite-dimensional representations in L2 (G, µ). Given an irreducible
unitary representation
% : G −→ U(H)
of G, we see that the unitary matrix coefficients
$$f_{v,w} \colon g \mapsto \langle \varrho(g)v, w \rangle$$
are bounded functions on G, by the Cauchy-Schwarz inequality, and are continuous, so
that fv,w ∈ L2 (G) for all v, w ∈ H (this is (5) in Theorem 5.2.1, which also shows that dis-
tinct matrix coefficients are distinct in L2 (G)). A formal argument (as in Theorem 2.7.24)
implies that, for a fixed w ∈ H, the map
v 7→ fv,w
6 Recall that we defined an orthogonal direct sum of unitary representations in Section 3.3.
is an intertwiner of % and reg. If w ≠ 0, Schur’s Lemma implies that this map is injective,
because it is then non-zero: we have $f_{w,w}(1) = \|w\|^2 \ne 0$, and the continuity of fw,w
shows that it is a non-zero element of L2 (G).
Now we assume in addition that % is finite-dimensional. In that case, the image
of v 7→ fv,w is a closed subspace of L2 (G) (since any finite-dimensional subspace of a
Banach space is closed) and hence it is a subrepresentation of reg which is isomorphic
to %. Varying w, again as in Theorem 2.7.24, we see furthermore that reg contains a
subrepresentation isomorphic to a direct sum of dim(%) copies of %.
If we let % vary among non-isomorphic finite-dimensional unitary representations,
we also see that, by Lemma 3.3.15, the corresponding spaces of matrix coefficients are
orthogonal.
So to finish the proof of the first inclusion (5.18), we should only check that the %-
isotypic component coincides with the space of matrix coefficients of %. We can expect
this to hold, of course, from the case of finite groups. Indeed, this is true, but this time
the argument in the proof of Theorem 2.7.24 needs some care before it can be applied,
as it relies on the linear form δ : ϕ 7→ ϕ(1) which is not necessarily continuous on a
subspace of L2 (G) (and it is certainly not continuous on all of L2 (G), if G is infinite).
Still, δ is well-defined on any space consisting of continuous functions. The following
lemma is then exactly what is needed:
Lemma 5.4.3. Let G be a compact group and let E ⊂ L2 (G) be a finite-dimensional
subrepresentation of the regular representation. Then E is (the image of ) a space of
continuous functions.
Assuming this, let E ⊂ L2 (G) be any subrepresentation isomorphic to %. We can
view E, by the lemma, as a subspace of C(G). Then the linear form
δ : ϕ 7→ ϕ(1)
is well-defined on E. Using the inner product induced on E by the L2-inner product, it
follows that there exists a unique ψ0 ∈ E such that
$$\delta(\varphi) = \langle \varphi, \psi_0 \rangle$$
for all ϕ ∈ E. We conclude as in the proof of Theorem 2.7.24: for all ϕ ∈ E and x ∈ G,
we have
$$\varphi(x) = \delta(\operatorname{reg}(x)\varphi) = \langle \operatorname{reg}(x)\varphi, \psi_0 \rangle = f_{\varphi,\psi_0}(x),$$
so that ϕ = fϕ,ψ0. This shows that E is contained in the space of matrix coefficients of %.
Before going to the converse inclusion, we must prove Lemma 5.4.3:
Proof of Lemma 5.4.3. The basic idea is that continuous functions can be ob-
tained by averaging integrable functions, and that averaging translates of a given ϕ ∈ E
(under the regular representation) leads to another function in E. This way we will show
that E ∩ C(G) is dense in E (for the L2 -norm), and since dim E < +∞, this implies that
E ∩ C(G) = E, which is what we want.
Thus, given ϕ ∈ E, consider functions of the type
$$\vartheta(g) = \int_G \psi(x)\varphi(gx)\, d\mu(x)$$
where ψ ∈ L2 (G). If we write this as
$$\vartheta(g) = \langle \operatorname{reg}'(g^{-1})\varphi,\ \overline{\psi} \rangle,$$
the strong continuity of the left-regular representation shows that ϑ is continuous. But
we can also write this expression as
$$\vartheta = \int_G \psi(x)\operatorname{reg}(x)\varphi\, d\mu(x) = \operatorname{reg}(\psi)\varphi,$$
using the action of L1 functions defined in Proposition 5.3.1 (since G is compact, any
square-integrable function is also integrable). This shows (using part (3) of the proposition) that ϑ ∈ E ∩ C(G). Finally, by Proposition 5.3.5 applied to g = 1, for any ε > 0,
we can find ψ ∈ L1 (G) such that
$$\|\varphi - \vartheta\| = \|\operatorname{reg}(1)\varphi - \operatorname{reg}(\psi)\varphi\| \le \varepsilon,$$
and we are done.
We have now proved (5.18), and must consider the converse. The problem is that, for
all we know, the set of finite-dimensional irreducible unitary representations of G might
be reduced to the trivial one! In other words, to prove the reverse inclusion, we need a
way to construct or show the existence of finite-dimensional representations of G.
The following exercise shows an “easy” way, which applies to certain important groups
(like Un (C) for instance).
Exercise 5.4.4 (Groups with a finite-dimensional faithful representation). Let G be
a compact topological group which is a closed subgroup of GLn (C) for some n > 1.
Show that the linear span of matrix coefficients of finite-dimensional irreducible rep-
resentations of G is a dense subspace of C(G), using the Stone-Weierstrass theorem (we
recall the statement of the latter in Section A.3). Deduce the Peter-Weyl Theorem from
this.
The class of groups covered by this exercise is restricted (in the next chapter, we
describe another characterization of these groups and give examples of compact groups
which are not of this type). We now deal with the general case, by proving that the direct
sum
$$\bigoplus_{\varrho} M(\varrho)$$
is dense in L2 (G). Equivalently, using Hilbert space theory, we must show that if ϕ ∈
L2 (G) is non-zero, it is not orthogonal to all M (%).
The basic motivation for what follows comes from looking at the case of the circle
group: when % varies, we expect the projection onto M (%) to be given by a convolu-
tion operator that commutes with the regular representation; if we can find a non-zero
eigenvalue with finite multiplicity of this convolution operator, this will correspond to a
non-trivial projection.
The details are a bit more involved because the group G is not necessarily abelian.
We exploit the fact that M (%) is stable under the left-regular representation reg′: if
ϕ ⊥ M (%), we have
$$\langle \varphi, \operatorname{reg}'(g)f \rangle = 0$$
for all g ∈ G and f ∈ M (%). As a function of g, this is the convolution ϕ ⋆ f̌ (see
Exercise 5.3.4). If we further note that, for a basic matrix coefficient f(g) = ⟨%(g)v, w⟩,
we have
$$\check{f}(g) = \overline{\langle \varrho(g^{-1})v, w \rangle} = \langle \varrho(g)w, v \rangle,$$
which is also a matrix coefficient, we can deduce that ϕ ⊥ M (%) implies that ϕ ⋆ f = 0
for all f ∈ M (%).
It is therefore sufficient to prove that, for some finite-dimensional subrepresentation
E ⊂ L2 (G), the convolution operator
$$T_\varphi \colon f \mapsto \varphi \star f$$
is non-zero on E. For an arbitrary operator, this might be very tricky, but by Exer-
cise 5.3.4 again, we have also
$$T_\varphi(f) = \operatorname{reg}'(\varphi)f,$$
so that Tϕ is an intertwiner of the regular representation (Exercise 5.3.3). As such, by
Schur’s Lemma, it acts by a scalar on any finite-dimensional subrepresentation. The
question is whether any of these scalars is non-zero, i.e., whether Tϕ has a non-zero
eigenvalue when acting on these finite-dimensional subspaces. Now here comes the trick:
we basically want to claim that the convolution form of Tϕ , abstractly, implies that it
has a non-zero eigenspace ker(Tϕ − λ), with non-zero eigenvalue λ, of finite dimension. If
that is the case, then ker(Tϕ − λ) is a finite-dimensional subrepresentation on which Tϕ
is a non-zero scalar, and we deduce that ϕ is not orthogonal to it.
To actually implement this we must change the operator to obtain a better-behaved
one, more precisely a self-adjoint operator. We form the function ψ = ϕ̌ ⋆ ϕ, and consider
the convolution operator
$$T_\psi = \operatorname{reg}'(\psi) = \operatorname{reg}'(\varphi)^* \operatorname{reg}'(\varphi).$$
Since the convolution product is associative, we see that ker(Tϕ ) ⊂ ker(Tψ ), and hence
the previous reasoning applies to Tψ also: Tψ intertwines the regular representation with
itself, and if there exists a finite-dimensional subrepresentation E such that Tψ acts as a
non-zero scalar on E, the function ϕ is not orthogonal to E.
But now Tψ is self-adjoint, and even non-negative since
$$\langle T_\psi f, f \rangle = \|T_\varphi f\|^2 \ge 0$$
for f ∈ L2 (G). Moreover, writing
$$T_\psi f(x) = \operatorname{reg}'(\psi)f(x) = \int_G \psi(y)f(y^{-1}x)\, d\mu(y) = \int_G \psi(xy^{-1})f(y)\, d\mu(y)$$
(by invariance of Haar measure), we see that Tψ is an integral Hilbert-Schmidt operator
on G with kernel
$$k(x, y) = \psi(xy^{-1})$$
(as defined in Proposition A.2.4); this kernel is in L2 (G × G, µ × µ) (again by invariance
of Haar measure, its L2-norm is $\|\psi\|$). By Proposition A.2.4, it is therefore a compact
operator.
The operator Tψ is also non-zero, because ψ is continuous (Exercise 5.3.4 again)
and $\psi(1) = \|\varphi\|^2 \ne 0$ (by assumption), and because by taking f to be a normalized
characteristic function of a small enough neighborhood of 1, we have ψ ⋆ f close to ψ,
hence non-zero (this is Proposition 5.3.5, applied to the left-regular representation). So
we can apply the spectral theorem (Theorem A.2.3, or the minimal version stated in
Proposition A.2.5) to deduce that Tψ has a non-zero eigenvalue λ with finite-dimensional
eigenspace ker(Tψ − λ), as desired.
Remark 5.4.5. At the end of Section A.2, we prove – essentially from scratch – the
minimal part of the spectral theorem which suffices for this argument, in the case when
the space L2 (G, µ) is separable (which is true, for instance, whenever the topology of G
is defined by a metric and thus for most groups of practical interest; see [31, Pb. IV.43]
for this.)
Exercise 5.4.6 (Another argument). Let G be a compact topological group with
probability Haar measure µ.
(1) Show directly that, for any g 6= 1, there exists a finite-dimensional unitary rep-
resentation % of G such that %(g) ≠ 1. (We also state this fact formally in Corollary 5.4.8.) [Hint: One can also use compact operators for this purpose.]
(2) Use this and the Stone-Weierstrass Theorem (Theorem A.3.1) to give a proof of
the Peter-Weyl theorem in the general case. (See Exercise 5.4.4.)
We now come to the proof of Corollary 5.4.2. This also requires the construction
of some finite-dimensional subrepresentations. The following lemma is therefore clearly
useful:
Lemma 5.4.7. Let G be a compact topological group, and let
% : G −→ U(H)
be any non-zero unitary representation of G. Then % contains an irreducible finite-
dimensional subrepresentation.
Proof. It suffices to find a non-zero finite-dimensional subrepresentation of H, since
it will in turn contain an irreducible one. But we cannot hope to construct the desired
subrepresentation using kernels or eigenspaces of intertwiners this time (think of % being
an infinite orthogonal direct sum of copies of the same irreducible representation). Instead
we bootstrap the knowledge we just acquired of the regular representation, and use the
following remark: if E ⊂ L2 (G) is a finite-dimensional subrepresentation of the left-regular representation and v ∈ H, then the image
$$F = \{ \varrho(\varphi)v \mid \varphi \in E \}$$
of the action of E on v is a subrepresentation of H. Indeed, by (5.14), we have
$$\varrho(g)\varrho(\varphi)v = \varrho(\operatorname{reg}'(g)\varphi)v \in F$$
for all ϕ ∈ E and g ∈ G. Obviously, the subspace F is a quotient of E (by the obvious
surjection ϕ 7→ %(ϕ)v), and hence we will be done if we can ensure that F ≠ 0.
To do this, we fix any v ≠ 0, and basically use the fact that %(1)v = v ≠ 0 is “almost”
in F . So we approximate %(1) using the L2 -action: first of all, by Proposition 5.3.5, we can
find ψ ∈ L2 (G) such that %(ψ)v is arbitrarily close to %(1)v, in particular, we can ensure
that %(ψ)v 6= 0. Further, using the Peter-Weyl Theorem, we can find a ψ1 ∈ L2 (G) which
is a (finite) linear combination of matrix coefficients of finite-dimensional representations
and which approximates ψ arbitrarily closely. Since (Proposition 5.3.1, (1)) we have
    ‖ρ(ψ) − ρ(ψ₁)‖_{L(H)} ≤ ‖ψ − ψ₁‖_{L¹} ≤ ‖ψ − ψ₁‖_{L²},
we get
    ‖ρ(ψ₁)v‖_H ≥ ‖ρ(ψ)v‖_H − ‖ψ − ψ₁‖_{L²} ‖v‖_H,
which will be > 0 if the approximation ψ₁ is suitably chosen. Now take for E the finite direct sum of the spaces M(π) for the representations π which occur in the expression of ψ₁ as a combination of matrix coefficients: we have ψ₁ ∈ E, and E is a subrepresentation of the left-regular representation with F ≠ 0 since it contains ρ(ψ₁)v ≠ 0.
Proof of Corollary 5.4.2. First of all, Lemma 5.4.7 shows by contraposition that all irreducible representations must be finite-dimensional (if ρ is infinite-dimensional, the lemma shows it is not irreducible). Then we also see that “complete reducibility” must hold: if it failed, the (non-zero) orthogonal complement of a “maximal” completely-reducible subrepresentation could not satisfy the conclusion of
the lemma. To be rigorous, this reasoning can be expressed using Zorn’s Lemma. We
give the details, though the reader may certainly think that this is quite obvious (or may
rather write his own proof). Let ρ : G −→ U(H) be a unitary representation; consider the set
    O = {(I, (H_i)_{i∈I})}
where I is an arbitrary index set, and the H_i ⊂ H are pairwise orthogonal finite-dimensional irreducible subrepresentations of H. This set is not empty, by Lemma 5.4.7. We order it by inclusion: (I, (H_i)) ≤ (J, (H′_j)) if and only if I ⊂ J and H′_i = H_i for i ∈ I ⊂ J. This complicated-looking ordered set is set up so that it is very easy to see that every totally ordered subset P has an upper bound.⁷ By Zorn's Lemma, we can find a maximal
element (I, (H_i)) in O. Then we claim that the subspace
    H′ = ( ⊕_{i∈I} H_i )^⊥
for all ϕ ∈ L²(G), where p_ρ is the orthogonal projection onto the isotypic component M(ρ). As a matter of fact, the statement of the Peter-Weyl Theorem in the original paper is that (3) holds (and the statement in Pontryagin's version a few years later was the density of M in C(G))!
Proof. (1) The regular representation is faithful (Exercise 5.2.8), so for g ≠ 1, the operator reg(g) is not the identity. However, by the Peter-Weyl Theorem, reg(g) is the direct sum of the operators obtained by restriction to each M(ρ), which are just direct
7 For the more natural set O0 of all “completely reducible subrepresentations” of H, ordered by
inclusion, checking this is more painful, because one is not keeping track of consistent decompositions of
the subspaces to use to construct an upper bound.
sums of dim(ρ) copies of ρ(g). Hence at least one of the ρ(g) must be distinct from the identity.
The statement of (2) concerning the L2 -norm is an immediate consequence of the
Peter-Weyl Theorem and Hilbert space theory: if M is not dense in L2 (G, µ), the orthog-
onal complement of its closure is a non-zero subrepresentation of the regular represen-
tation, which must therefore contain some irreducible subrepresentation π. Because the
definition of M means that it is spanned by the M(ρ) where ρ ranges over C, we must have π ∉ C.
Similarly, (3) is a direct translation of the Peter-Weyl Theorem using the theory of
Hilbert spaces.
For the part of (2) involving continuous functions, we recall that M is indeed a
subspace of C(G). We must show that M is dense in C(G) for the L∞ -norm. This
can be thought of as an analogue of the classical Weierstrass approximation theorem for
trigonometric polynomials (which, indeed, corresponds to G = S1 ), and one can indeed
use (1) and the Stone-Weierstrass Theorem (this is the content of Exercise 5.4.6).
Our last result in this section is a special case of Frobenius reciprocity for induced
representations.
Proposition 5.4.9 (Frobenius reciprocity for compact groups). Let G be a compact
topological group with probability Haar measure µ, and let K ⊂ G be a compact subgroup.
For any finite-dimensional unitary representations
    ρ₁ : G −→ U(H₁),    ρ₂ : K −→ U(H₂),
we have a natural isomorphism
    Hom_G(ρ₁, Ind_K^G(ρ₂)) ≅ Hom_K(Res_K^G(ρ₁), ρ₂).
Proof. We leave it to the reader to check that the proof of Proposition 2.3.6 can carry
through formally unchanged, provided one proves first that the image of any intertwiner
    Φ : ρ₁ −→ Ind_K^G(ρ₂)
lies in (the image of) the subspace V₀ of continuous functions used in the definition of induced representations (see (5.9)); note that all intertwiners constructed from right to left by (2.26) have this property, because of the strong continuity of unitary representations, so that Frobenius reciprocity can only hold when this property is true.
But since ρ₁ is finite-dimensional, so is the image of Φ, and thus the statement is an analogue of Lemma 5.4.3. To reduce to that case, note the isomorphism
    Ind_K^G(ρ₂) ≅ L²(G) ⊕ · · · ⊕ L²(G)
(with dim(H₂) summands), under which f ∈ V corresponds to a tuple (f_i) of component functions. By Lemma 5.4.3, each f_i is continuous for f ∈ Im(Φ), and hence Im(Φ) ⊂ V₀.
Exercise 5.4.10. Which of the other properties of induction can you establish for
compact groups?
5.5. Characters and matrix coefficients for compact groups
The reward for carefully selecting the conditions defining unitary representations of
compact groups and proving the analytic side of the Peter-Weyl Theorem is that, with
this in hand, character theory becomes available, and is just as remarkably efficient
as in the case of finite groups (but of course it is also restricted to finite-dimensional
representations).
We summarize the most important properties, using as before the notation Ĝ for the
set of irreducible unitary representations of G.
Theorem 5.5.1 (Character theory). Let G be a compact topological group, and let µ
be the probability Haar measure on G.
(1) The characters of irreducible unitary representations of G are continuous functions on G, which form an orthonormal basis of the space L²(G♮) of conjugacy-invariant functions on G. In particular, a set C of finite-dimensional irreducible representations of G is complete, in the sense of Corollary 5.4.8, if and only if the linear combinations of characters of representations in C are dense in L²(G♮).
(2) For any irreducible unitary representation π ∈ Ĝ of G, and any unitary representation
    ρ : G −→ U(H)
of G, the orthogonal projection map onto the π-isotypic component of H is given by
    Φ_π = (dim π) ∫_G χ̄_π(g) ρ(g) dμ(g),
using the L¹-action of Proposition 5.3.1.
In particular, if ρ is finite-dimensional, the dimension of the space ρ^G of G-invariant vectors in H is given by the average over G of the values of the character of ρ, i.e.,
    dim ρ^G = ∫_G χ_ρ(g) dμ(g).
using the orthonormality of the basis and the unitarity of π(g). Hence, by comparison,
we get α = 1/ dim(H), which gives the statement (5.20).
Proof of Theorem 5.5.1. (1) The space L²(G♮) is a closed subspace of L²(G). The characters of finite-dimensional representations are (non-zero) continuous functions, invariant under conjugation, and therefore belong to L²(G♮). Since the character of π ∈ Ĝ lies in M(π) (as a sum of matrix coefficients), the distinct characters are orthogonal, and Lemma 5.5.2 actually shows that they form an orthonormal system in L²(G♮).
In order to show its completeness, we need only check that if ϕ ∈ L²(G♮) is conjugacy-invariant, its isotypic components, say ϕ_π, are multiples of the character of π for all irreducible representations π. This can be done by direct computation, just as in the case of finite groups (Section 4.3.3), or by the following argument: it is enough to prove that the space M(π) ∩ L²(G♮), in which ϕ_π lies, is one-dimensional, since we already know that χ_π is a non-zero element of it.
But we can see this space M(π) ∩ L²(G♮) as the invariant subspace of M(π) when G acts on L²(G) by the diagonal, or conjugation, combination of the two regular representations, i.e.,
    ρ(g)ϕ(x) = ϕ(g⁻¹xg)
for ϕ ∈ L2 (G). The isotypic component M (π) is isomorphic to Hπ ⊗ H̄π ' End(Hπ ) as a
vector space, and the corresponding action on End(Hπ ) is the usual representation of G on
an endomorphism space. Thus the G-invariants of M(π) under the action ρ correspond to the space of G-invariants in End(H_π), which we know is End_G(H_π) = C·Id, by Schur's Lemma. Transporting the identity under the isomorphism of End(H_π) with M(π) above leads precisely to the character of π (we leave this as an exercise!), which proves the desired statement.
(2) Because characters are conjugacy-invariant, the action of Φπ on a unitary repre-
sentation is an intertwiner (Exercise 5.3.3). In particular, Φ_π acts by multiplication by a scalar on every irreducible unitary representation ρ ∈ Ĝ. This scalar is equal to the trace of Φ_π acting on ρ, divided by dim π. Since the trace is given by
    (dim π) ∫_G χ̄_π(g) χ_ρ(g) dμ(g),
which is equal to 0 if π is not isomorphic to ρ, and to dim π otherwise (by orthonormality of characters), we see that Φ_π is the identity on the π-isotypic component of any unitary representation, while it is zero on all other isotypic components. This means that it is the desired projection.
(3) If ρ is a finite-dimensional unitary representation of G, we have
    χ_ρ = Σ_{π∈Ĝ} n_π(ρ) χ_π,
for g, h ∈ G, does not make sense – in general – in any usual sense (i.e., when G is
infinite, this is usually a divergent series.) In other words, the duality between conjugacy
classes and irreducible representations is even fuzzier than was the case for finite groups.
Other features of the representations of finite groups that are missing when G is infinite are those having to do with integrality properties (though it is tempting to think that there should be some analogue).
In addition to character theory, matrix coefficients remain available to describe or-
thonormal bases of L2 (G, µ). We present this in two forms, one of which is more intrinsic
since it does not require a choice of basis. However, it gives an expansion of a different
nature than an orthonormal basis in Hilbert space, which is not so well-known in general.
Theorem 5.5.6 (Decomposition of the regular representation). Let G be a compact
topological group and let µ be the probability Haar measure on G.
(1) For π ∈ Ĝ, fix an orthonormal basis (e_{π,i})_{1≤i≤dim(π)} of the space of π. Then the matrix coefficients
    ϕ_{π,i,j}(g) = √(dim π) ⟨π(g)e_{π,i}, e_{π,j}⟩
form an orthonormal basis of L²(G, μ), where π runs over Ĝ and 1 ≤ i, j ≤ dim π.
(2) For π ∈ Ĝ, acting on the space H_π, consider the map
    A_π : L²(G) −→ End(H_π),    ϕ ↦ ∫_G ϕ(g) π(g⁻¹) dμ(g).
Then the A_π give “matrix-valued” Fourier coefficients for ϕ, in the sense that
    ϕ(x) = Σ_{π∈Ĝ} (dim π) Tr(A_π(ϕ)π(x)),
Proof. The first statement (1) is a consequence of the Peter-Weyl theorem and the orthogonality of matrix coefficients. The second statement is another formulation of the Peter-Weyl decomposition, because the summands g ↦ (dim π) Tr(A_π(ϕ)π(g)) are elements of M(π). The advantage of (2) is that we obtain an intrinsic decomposition without having to select a basis of the spaces of irreducible representations.
The statement itself is now easy enough: given ϕ ∈ L²(G), we have an L²-convergent series
    ϕ = Σ_{π∈Ĝ} ϕ_π,
where ϕπ is the orthogonal projection of ϕ on the π-isotypic component. We compute it
using the projection formula of Theorem 5.5.1 applied to the regular representation and
to the irreducible representation π. Using the unitarity, this gives
    ϕ_π(x) = (dim π) ∫_G χ̄_π(g) (reg(g)ϕ)(x) dμ(g)
           = (dim π) ∫_G χ_π(g⁻¹) ϕ(xg) dμ(g)
           = (dim π) Tr( ∫_G ϕ(xg) π(g⁻¹) dμ(g) )
           = (dim π) Tr( ∫_G ϕ(y) π(y⁻¹x) dμ(y) )
           = (dim π) Tr( ( ∫_G ϕ(y) π(y⁻¹) dμ(y) ) π(x) ) = (dim π) Tr(A_π(ϕ)π(x)),
as claimed.
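As a sanity check on (2) in the simplest possible setting, take G = Z/NZ: every irreducible representation is a one-dimensional character x ↦ e^{2iπkx/N}, the operator A_π(ϕ) is the ordinary Fourier coefficient, and the expansion above reduces to the inverse discrete Fourier transform. The following numerical sketch (Python with NumPy; the names `phi` and `recovered` are ours, for illustration only) checks this:

```python
import numpy as np

# Check of Theorem 5.5.6 (2) for the finite cyclic group Z/N: every
# irreducible is a character x -> exp(2*pi*i*k*x/N), dim(pi) = 1, and
# A_pi(phi) is an ordinary Fourier coefficient, so the expansion
# phi(x) = sum_pi dim(pi) Tr(A_pi(phi) pi(x)) is the inverse DFT.
N = 8
rng = np.random.default_rng(0)
phi = rng.normal(size=N) + 1j * rng.normal(size=N)

x = np.arange(N)
recovered = np.zeros(N, dtype=complex)
for k in range(N):
    # A_k(phi) = integral of phi(g) pi_k(g^{-1}) dmu(g)
    #          = (1/N) * sum_g phi(g) exp(-2*pi*i*k*g/N)
    A_k = np.mean(phi * np.exp(-2j * np.pi * k * x / N))
    recovered += A_k * np.exp(2j * np.pi * k * x / N)

assert np.allclose(recovered, phi)  # the expansion recovers phi exactly
```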
Exercise 5.5.7 (G-finite vectors). Let G be a compact group with probability Haar measure μ, and let
    ρ : G −→ U(H)
be a unitary representation of G. A vector v ∈ H is called G-finite if the subrepresentation generated by v is finite-dimensional.
(1) Show that the space H1 of G-finite vectors is stable under G, and that it is dense
in H. When is it a subrepresentation?
(2) Show that a function f ∈ L2 (G, µ) is a G-finite vector of the regular represen-
tation if and only if f is a finite linear combination of matrix coefficients of irreducible
unitary representations of G. (These functions are analogues, for G, of the trigonometric
polynomials in the case of the circle group S1 .)
(3) Prove the analogue for a unitary representation % of a compact group G of the
property described in Exercise 4.3.29 for finite groups: for any irreducible representation
π and any vector v in the π-isotypic component of %, the subrepresentation generated by
v is the direct sum of at most dim(π) copies of π.
i.e.,
    ϕ = Σ_{m≥0} 2i α_{m+1} ϕ_m
in L²([0, π], (2/π) sin²θ dθ). Since we already know that the ϕ_m are an orthonormal system in this space, it follows that they form an orthonormal basis, as we wanted to prove.
Example 5.6.3. The proof we gave has at least the advantage that we can easily compute the expansion of various class functions on SU₂(C), and use results on Fourier series to say something about their convergence with respect to norms other than the L²-norm (in particular, pointwise convergence).
We can now check the Clebsch-Gordan formula for SU₂(C) in a single stroke of the pen (compare with the proof of Theorem 2.6.3 that we sketched earlier):
Corollary 5.6.4 (Clebsch-Gordan formula for SU₂(C)). For all m ≥ n ≥ 0, the representations ρ_m of SU₂(C) satisfy
    ρ_m ⊗ ρ_n ≅ ⊕_{0≤i≤n} ρ_{m+n−2i}.
Proof. One might say it is a matter of checking the identity at the level of characters, i.e., of proving the elementary formula
    (sin((m+1)θ)/sin θ) · (sin((n+1)θ)/sin θ) = Σ_{0≤i≤n} sin((m+n−2i+1)θ)/sin θ
for all θ ∈ [0, π]. But more to the point, character theory explains how to guess (or find) such a formula: by complete reducibility and orthonormality of characters, we know that
    ρ_m ⊗ ρ_n ≅ ⊕_{0≤k≤(m+1)(n+1)−1} ⟨χ_m χ_n, χ_k⟩ ρ_k,
(the restriction of the range comes from dimension considerations), and we are therefore reduced to computing the multiplicities
    ⟨χ_m χ_n, χ_k⟩ = (2/π) ∫₀^π (sin((m+1)θ)/sin θ)(sin((n+1)θ)/sin θ)(sin((k+1)θ)/sin θ) sin²(θ) dθ.
This is, of course, an elementary – if possibly boring – exercise.
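The exercise can at least be checked numerically: the sketch below (Python with NumPy; the helper names `chi` and `mult` are ours) approximates the multiplicity integral by a midpoint rule and compares it with the value 0 or 1 predicted by the Clebsch-Gordan formula.

```python
import numpy as np

def chi(m, theta):
    # character of rho_m at the conjugacy class with angle theta
    return np.sin((m + 1) * theta) / np.sin(theta)

def mult(m, n, k, N=20000):
    # <chi_m chi_n, chi_k> for the probability Haar measure on SU_2(C),
    # approximated by the midpoint rule on (0, pi)
    t = (np.arange(N) + 0.5) * np.pi / N
    f = chi(m, t) * chi(n, t) * chi(k, t) * np.sin(t)**2
    return f.sum() * 2 / N   # combines dt = pi/N with the factor 2/pi

m, n = 4, 2
for k in range(m + n + 2):
    # Clebsch-Gordan predicts multiplicity 1 exactly for k = m+n-2i, 0 <= i <= n
    expected = 1 if (m - n <= k <= m + n and (m + n - k) % 2 == 0) else 0
    assert abs(mult(m, n, k) - expected) < 1e-3
```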
Using the coordinates in (5.7) on SU2 (C), it is easy to see how the representations
%m , defined as acting on polynomials, can be embedded in the regular representation.
Indeed, if we see the coordinates (a, b) ∈ C2 of some element g ∈ SU2 (C) (which are
subject to the condition |a|2 + |b|2 = 1) as a row vector, a direct matrix multiplication
shows that the row vector corresponding to a product gh is the same as the vector-matrix
product (a, b)h. If we restrict a polynomial P ∈ C[X, Y ] to (X, Y ) = (a, b), this gives
an intertwiner from Vm to a space of continuous functions on SU2 (C). Since this map is
non-zero (any basis vector X i Y m−i restricts to a non-zero function), it is an injection
    V_m ↪ L²(SU₂(C)).
According to Vilenkin [43], it was first observed by É. Cartan that matrix coefficients
or characters of irreducible representations of certain important groups lead to most of
the “classical” special functions8 (of course, the fact that the exponential function is
a representation of the additive group of C, or of R by restriction, is an even older
phenomenon that has the same flavor.) We present here some simple instances, related
to the group SU2 (C); more information about these, as well as many additional examples,
are found in [43].
We begin with the characters of SU2 (C). As functions of the parameter θ describing
the conjugacy class, they are of course completely elementary, but a change of variable
adds more subtlety:
Definition 5.6.5 (Chebychev polynomials). For m ≥ 0, there exists a polynomial P_m ∈ R[X], of degree m, such that
    χ_m(θ) = P_m(2 cos θ),  0 ≤ θ ≤ π,
8 Especially the functions that arise in mathematical physics, e.g., Bessel functions.
i.e., such that
(5.23)    sin((m+1)θ)/sin(θ) = P_m(2 cos θ),
and the polynomial U_m(X) = P_m(2X) is called the m-th Chebychev polynomial of the second kind.
The fact that the characters of SU2 (C) form an orthonormal basis of the space of
class functions translates into the following fact:
Proposition 5.6.6 (Chebychev polynomials as orthogonal polynomials). The restrictions to [−1, 1] of the polynomials U_m, m ≥ 0, form an orthonormal basis of the space L²([−1, 1], dν), where ν is the measure supported on [−1, 1] given by
    dν(t) = (2/π) √(1 − t²) dt.
The justification for this substitution is that the Chebychev polynomials arise in
many applications completely independently of any (apparent) consideration of the group
SU2 (C). On the other hand, algebraic properties of the representations %m can lead to
very simple (or very natural) proofs of identities among Chebychev polynomials which
might otherwise look quite forbidding if one starts from the definition (5.23), and even
more if one begins with an explicit (!) polynomial expansion.
Example 5.6.7. The first few Chebychev polynomials U_m are
    U₀ = 1,  U₁ = 2X,  U₂ = 4X² − 1,  U₃ = 8X³ − 4X,  U₄ = 16X⁴ − 12X² + 1, . . .
(as one can prove, e.g., by straightforward trigonometric manipulations).
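One way to generate these expressions, and to check (5.23) at the same time, is the classical three-term recurrence U_{m+1} = 2X U_m − U_{m−1}, which follows from the addition formula sin((m+2)θ) + sin(mθ) = 2 cos(θ) sin((m+1)θ). A short numerical sketch (Python with NumPy; the function name `U` is ours):

```python
import numpy as np

# U_m via the recurrence U_0 = 1, U_1 = 2x, U_{m+1} = 2x U_m - U_{m-1},
# evaluated pointwise at x (scalar or array)
def U(m, x):
    a, b = np.ones_like(x), 2 * x
    if m == 0:
        return a
    for _ in range(m - 1):
        a, b = b, 2 * x * b - a
    return b

theta = np.linspace(0.1, 3.0, 50)
for m in range(6):
    # defining relation (5.23): sin((m+1)*theta)/sin(theta) = U_m(cos(theta))
    assert np.allclose(np.sin((m + 1) * theta) / np.sin(theta),
                       U(m, np.cos(theta)))
```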
Exercise 5.6.8 (Playing with Chebychev polynomials). In this exercise, we express
the Clebsch-Gordan decomposition of %m ⊗ %n in terms of expansions of Chebychev poly-
nomials, and deduce some combinatorial identities.
(1) Show that we have
    U_m(X) = Σ_{0≤j≤m/2} (−1)^j ( m−j choose m−2j ) (2X)^{m−2j}
for all m ≥ 0. [Hint: It is helpful here to interpret the Chebychev polynomials in terms of characters of the larger group SL₂(C).]
(2) Using the Clebsch-Gordan decomposition, show that for m ≥ n ≥ 0, we have
    U_m U_n = Σ_{k=0}^{n} U_{m+n−2k}.
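The identity in (2) is easy to test numerically by multiplying coefficient vectors; the following sketch (Python with NumPy's polynomial helpers; `U` is our name) checks it for m = 5, n = 3.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Chebyshev polynomials of the second kind, as coefficient vectors in
# ascending degree, via U_0 = 1, U_1 = 2X, U_{m+1} = 2X*U_m - U_{m-1}
def U(m):
    a, b = np.array([1.0]), np.array([0.0, 2.0])
    if m == 0:
        return a
    for _ in range(m - 1):
        a, b = b, P.polysub(P.polymul([0.0, 2.0], b), a)
    return b

# Clebsch-Gordan in Chebyshev form: U_m U_n = sum_{k=0}^{n} U_{m+n-2k}
m, n = 5, 3
lhs = P.polymul(U(m), U(n))
rhs = np.zeros(m + n + 1)
for k in range(n + 1):
    u = U(m + n - 2 * k)
    rhs[:len(u)] += u
assert np.allclose(lhs, rhs)
```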
Example 5.6.9 (Representations of SO₃(R)). The group SO₃(R) of rotations in a 3-dimensional Euclidean space is also very important in applications (we will see it appear prominently in the analysis of the hydrogen atom). As it turns out, it is very closely related to the group SU₂(C), and this allows us to find easily the representations of SO₃(R) from those of SU₂(C).
Proposition 5.6.10. There exists a continuous surjective group homomorphism
    p : SU₂(C) −→ SO₃(R)
such that ker p = {±1} ⊂ SU₂(C) has order 2. As a consequence, the irreducible unitary representations of the group SO₃(R) are representations π_ℓ, ℓ ≥ 0, of dimension 2ℓ + 1, determined by the condition π_ℓ ∘ p = ρ_{2ℓ}.
Partial proof. One can write p explicitly in terms of matrices; crossing fingers to avoid typing mistakes, and using the fact that a matrix in SU₂(C) is uniquely of the form
    ( z  −ȳ )
    ( y   z̄ )
for some y, z ∈ C with |z|² + |y|² = 1, this takes the form

(5.24)    p ( a+ib  −c+id )  =  ( (a²+b²)−(c²+d²)    2(bd−ac)      −2(ad+bc)    )
            ( c+id   a−ib )     (  2(ac+bd)        a²−b²−c²+d²     2(ab−cd)    )
                                (  2(ad−bc)       −2(ab+cd)        a²−b²+c²−d² )
but that is about as enlightening as checking by hand the associativity of the product of matrices of fixed size 4 or more; a better explanation for the existence of p is given in the next chapter, so we defer a reasonable discussion of this part of the result (see also [38, §4.3] for a down-to-earth approach). Note at least that the formula makes it easy to check that p(g) = 1 if and only if g = 1 or −1.
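A convenient coordinate-free way to produce such a p, equivalent to the matrix formula up to a choice of basis, is to let g act by conjugation on the 3-dimensional real space of traceless Hermitian 2×2 matrices. The sketch below (Python with NumPy; all function names are ours) verifies numerically that the resulting map lands in SO₃(R), is a homomorphism, and kills −1.

```python
import numpy as np

# Pauli matrices: a basis of the real vector space of traceless
# Hermitian 2x2 matrices, on which SU(2) acts by conjugation.
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def p(g):
    """Matrix of X -> g X g^{-1} in the Pauli basis; lands in SO(3)."""
    cols = []
    for s in sigma:
        gs = g @ s @ g.conj().T
        # coordinates via the inner product <A, B> = Tr(AB)/2
        cols.append([np.trace(t @ gs).real / 2 for t in sigma])
    return np.array(cols).T

def random_su2(rng):
    z, y = rng.normal(size=2) + 1j * rng.normal(size=2)
    n = np.sqrt(abs(z)**2 + abs(y)**2)
    z, y = z / n, y / n
    return np.array([[z, -y.conjugate()], [y, z.conjugate()]])

rng = np.random.default_rng(1)
g, h = random_su2(rng), random_su2(rng)
assert np.allclose(p(g) @ p(g).T, np.eye(3))   # orthogonal
assert np.isclose(np.linalg.det(p(g)), 1.0)    # determinant 1
assert np.allclose(p(g) @ p(h), p(g @ h))      # homomorphism
assert np.allclose(p(-g), p(g))                # kernel contains -1
```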
On the other hand, given the existence of p, we notice that if % is any irreducible
unitary representation of SO3 (R), then % ◦ p is an irreducible unitary representation of
SU2 (C). By the previous classification, it is therefore of the form %m for some m > 0.
The question is therefore: for which integers m > 0 is %m of the form % ◦ p? The answer is
elementary: this happens if and only if ker(p) ⊂ ker(%m ), and since ker(p) has only two
elements, this amounts to asking that %m (−1) = 1. The answer can then be obtained
from the explicit description of %m , or by character theory using (5.22): −1 corresponds
to θ = π and we have
sin((m + 1)θ)
χm (−1) = lim = (−1)m ,
θ→π sin(θ)
so that %m is obtained from an irreducible representation of SO3 (R) if and only if m = 2`
is even. Since different values of ` > 0 lead to representations of different dimension, this
gives the desired correspondence.
Example 5.6.11 (Infinite products). The following class of examples is quite sim-
ple (and does not occur that often in applications), but it is enlightening. By Proposi-
tion 2.3.17, we see that the irreducible unitary representations of a direct product G₁ × G₂ of compact groups are of the form
    ρ₁ ⊠ ρ₂,
where ρ₁ (resp. ρ₂) ranges over the representations in Ĝ₁ (resp. Ĝ₂). This extends, by induction, to a finite product G₁ × · · · × G_k, with irreducible representations
    ρ₁ ⊠ · · · ⊠ ρ_k.
We now extend this to infinite products of compact groups. Let I be an arbitrary
index set (the interesting case being when I is infinite, say the positive integers), and
let (Gi )i∈I be any family of compact topological groups indexed by I (for instance, the
family (GL2 (Fp ))p , where p runs over primes). The product group
Y
G= Gi
i∈I
can be given the product topology. By Tychonov’s Theorem, this is a compact topological
space. One can check that the product on G is continuous, and hence G is a compact
topological group. We now determine its irreducible unitary representations.
There is an abundance of irreducible representations arising from the finite products
of the groups: for any finite subset J ⊂ I, we have the projection homomorphism
(5.25)    p_J : G −→ G_J = ∏_{i∈J} G_i,    (g_i)_{i∈I} ↦ (g_i)_{i∈J},
which is continuous (by definition of the product topology), so that any irreducible unitary representation ⊠_{i∈J} ρ_i of the finite product G_J gives by composition an irreducible representation of G, with character
    χ((g_i)) = ∏_{i∈J} χ_{ρ_i}(g_i).
Some of these representations are isomorphic, but this only happens when a component ρ_i is trivial, in which case we might as well have constructed the character using G_{J∖{i}} (we leave a formal proof to the reader!). In other words, we have a family of irreducible representations parametrized by a finite – possibly empty – subset J of I, and a family (ρ_i)_{i∈J} of non-trivial irreducible unitary representations of the groups G_i, i ∈ J. In particular, the trivial representation of G arises from J = ∅, in which case G_J = 1.
We now claim that these are the only irreducible unitary representations of G. This
statement is easy to prove, but it depends crucially on the topological structure of G. For
the proof, we use the completeness criterion from Peter-Weyl theory (Corollary 5.4.8),
by proving that the linear span (say V ) of the matrix coefficients of those known repre-
sentations is dense in L2 (G).
For this purpose, it is enough to show that the closure of V contains the continuous
functions, since C(G) is itself dense in L2 (G). Let therefore ϕ be continuous on G.
The main point is that the product topology imposes that ϕ depends “essentially” only
on finitely many coordinates. Precisely, let ε > 0 be arbitrary. Since G is compact,
the function ϕ is in fact uniformly continuous, in the sense that there exists an open
neighborhood U of 1 such that
    |ϕ(g) − ϕ(h)| ≤ ε
if gh⁻¹ ∈ U.⁹ The definition of the product topology shows that, for some finite subset
J ⊂ I, the open set U contains a product set
V = {(gi )i∈I | gi ∈ Vi for i ∈ J}
for suitable open neighborhoods Vi of 1 in Gi . Intuitively, up to a precision ε, it follows
that ϕ “only depends on the coordinates in J”. Let then
ϕJ (g) = ϕ(g̃)
where g̃_i = g_i for i ∈ J and g̃_i = 1 otherwise. Since gg̃⁻¹ ∈ V, this function on G satisfies
    |ϕ(g) − ϕ_J(g)| ≤ ε
9 We leave the proof as an exercise, adapting the classical case of functions on compact subsets of
R.
for all g ∈ G.
But now, ϕJ can be identified with a function on GJ . Then, by the Peter-Weyl
theorem, we can find a linear combination ψ of matrix coefficients of representations of
GJ such that
    ‖ψ − ϕ_J‖_{L²(G_J)} ≤ ε.
But it is quite easy to see that the probability Haar measure µ on G is such that its
image under the projection pJ is the probability Haar measure µJ on GJ . This means
that when we see ψ (and again ϕJ ) as functions on G, we still have
    ‖ψ − ϕ_J‖_{L²(G)} ≤ ε.
Putting these inequalities together, we obtain
    ‖ϕ − ψ‖_{L²(G)} ≤ 2ε,
and as ε was arbitrary, we are done.
In Exercise 6.1.4 in the next chapter, we present a slightly different proof, which
directly shows that an irreducible unitary representation of G must be of the “known”
type.
CHAPTER 6
are of this type, provided I is infinite and infinitely many among the Gi are non-trivial
(for the simplest example, take I to be countable and Gi = Z/2Z for all i). Indeed, from
the description of the irreducible unitary representations of G in Example 5.6.11, we see that any one of them contains in its kernel a subgroup
    G(J) = ∏_{i∉J} G_i
which is therefore the “explanation” for the Frobenius-Schur indicator.² Again from character theory, we get that the desired invariant is
    dim B^G_sym − dim B^G_alt = ∫_G (χ_sym(g) − χ_alt(g)) dμ(g),
where, for simplicity, we denote by χ_sym and χ_alt the characters of B_sym and B_alt. Therefore we proceed to compute the difference of the two characters.
This is quite easy. For a fixed element g ∈ G, we can diagonalize the unitary operator ρ(g) in some basis (e_i)_{1≤i≤n} of E, with dual basis (λ_i) of E′, so that
    ρ(g)e_i = θ_i e_i,    ρ̃(g)λ_i = θ̄_i λ_i
for some eigenvalues θ_i, whose sum is χ_ρ(g) or χ_{ρ̃}(g), respectively. Then the bilinear forms
    b_{i,j} = b_{λ_i,λ_j}
given by (6.2) form a basis of B, with
    g · b_{i,j} = θ_i θ_j b_{i,j},
by definition of the action on B. Applying the decomposition (6.6), a basis of B_sym is given by the symmetric bilinear forms
    ½ (b_{i,j} + b_{j,i})
2 Note that we could have exchanged the sign of the two terms, which would just have changed the
meaning of the indicators ±1; the choice we made is the standard one, but it is merely a convention.
where i and j are arbitrary (but of course (i, j) and (j, i) give the same basic bilinear form), and a basis of B_alt is given by the alternating forms
    ½ (b_{i,j} − b_{j,i})
where this time i and j are arbitrary, but distinct (the pair (i, i) leading to the zero form). Notice that in both cases, these are eigenvectors for the action of g with the same eigenvalue θ_i θ_j (because b_{i,j} and b_{j,i} are in the same eigenspace). Thus we find that
    χ_sym(g) = Σ_{1≤i≤j≤n} θ_i θ_j,    χ_alt(g) = Σ_{1≤i<j≤n} θ_i θ_j.
Only the diagonal terms are missing from the second sum compared to the first; hence we get
    χ_sym(g) − χ_alt(g) = Σ_{1≤i≤n} θ_i² = χ_ρ(g²)
(since the matrix ρ(g²) has eigenvalues θ_i² in the basis (e_i)), and to conclude and recover
the formula (6.1) we may simply observe that
    ∫_G χ_ρ(g²) dμ(g) = ∫_G χ̄_ρ(g²) dμ(g),
since we know already that the integral is a real number.
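The identity χ_sym(g) − χ_alt(g) = χ_ρ(g²) used above is a purely linear-algebraic statement about the eigenvalues of a unitary operator, and can be spot-checked numerically (sketch in Python with NumPy; the variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
# a random unitary matrix: its eigenvalues theta_i lie on the unit circle
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
Q, _ = np.linalg.qr(A)
theta = np.linalg.eigvals(Q)

n = len(theta)
chi_sym = sum(theta[i] * theta[j] for i in range(n) for j in range(i, n))
chi_alt = sum(theta[i] * theta[j] for i in range(n) for j in range(n) if i < j)
# chi_sym - chi_alt = sum of theta_i^2 = Tr(Q^2), the character at g^2
assert np.isclose(chi_sym - chi_alt, np.trace(Q @ Q))
```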
We will now give a few examples. Before doing this, the following result illustrates another interesting meaning of the Frobenius-Schur indicator, and should suggest that complex representations are in some sense the most usual ones.
Proposition 6.2.4 (Self-dual representations). Let G be a compact topological group and let ρ be an irreducible unitary representation of G. Then FS(ρ) ≠ 0 if and only if the character of ρ is real-valued, and this is the case if and only if ρ is isomorphic to its contragredient representation ρ̃. Such a representation is called self-dual.
Proof. According to the proof above, ρ is symplectic or orthogonal if and only if the space B^G of invariant bilinear forms on the space of ρ is one-dimensional, and (in view of the character formula (6.3)) this is the case if and only if
    ∫_G χ_ρ(g)² dμ(g) = 1.
As we did earlier, we argue that
    | ∫_G χ_ρ(g)² dμ(g) | ≤ ∫_G |χ_ρ(g)|² dμ(g) = 1,
but now we continue by noticing that if there is equality, it must be the case that χ_ρ(g)² is proportional to |χ_ρ(g)|², with a scalar multiple of modulus 1. Taking g = 1 shows that the scalar must be equal to 1, i.e. (taking conjugates), we have
    χ_ρ(g)² = |χ_ρ(g)|² ≥ 0
for all g ∈ G. Since, among complex numbers, only real numbers have a non-negative square, we obtain the first result.
Now the last (and possibly most interesting!) conclusion is easy: since the character of ρ̃ is χ̄_ρ, it follows from character theory that ρ has a real-valued character if and only if it is isomorphic to its contragredient.
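A minimal example where FS(ρ) = 0 is any non-trivial character of Z/3Z, which takes non-real values and is not self-dual; a quick numerical check (Python with NumPy, illustration only):

```python
import numpy as np

# For G = Z/3 and the character chi(x) = omega^x (omega a primitive cube
# root of 1): chi is not real-valued, its "square integral" vanishes, and
# its contragredient is the distinct character omega^{-x}.
omega = np.exp(2j * np.pi / 3)
chi = omega ** np.arange(3)

assert not np.allclose(chi.imag, 0)                 # not real-valued
assert np.isclose(np.mean(chi**2), 0)               # the invariant bilinear forms vanish
assert np.allclose(np.conj(chi), omega ** (-np.arange(3)))  # dual character
```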
Example 6.2.5. (1) Let G = SL₂(C), or SU₂(C). Among the representations ρ_m of G, m ≥ 0, those with m even are of orthogonal type, while those with m odd are of symplectic type. This can be checked in different ways: for SU₂(C), one may use the integration and character formulas to check that it amounts to proving the identity
    (2/π) ∫₀^π (sin(2(m+1)θ)/sin(2θ)) sin²θ dθ = (−1)^m.
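This trigonometric identity can be tested numerically before one proves it; the sketch below (Python with NumPy; `fs` is our name for the approximated integral) uses a midpoint rule, which conveniently avoids the removable singularity of sin(2(m+1)θ)/sin(2θ) at θ = π/2.

```python
import numpy as np

# Midpoint-rule check of (2/pi) * int_0^pi sin(2(m+1)t)/sin(2t) * sin(t)^2 dt
# = (-1)^m; the integrand is chi_m evaluated at the class of g^2.
def fs(m, N=200000):
    t = (np.arange(N) + 0.5) * np.pi / N   # midpoints never hit t = pi/2
    f = np.sin(2 * (m + 1) * t) / np.sin(2 * t) * np.sin(t)**2
    return f.sum() * 2 / N                 # combines dt = pi/N with 2/pi

for m in range(6):
    assert abs(fs(m) - (-1)**m) < 1e-3
```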
For either group, one may also define explicitly an invariant non-degenerate bilinear form b on the space V_m of homogeneous polynomials of degree m in two variables, by putting
    b(e_i, e_j) = (−1)^i δ(i, m−j) / ( m choose i )
for the basis vectors e_i = X^i Y^{m−i}.
This certainly defines a bilinear form on V_m, and it is symmetric for m even, alternating for m odd (in which case δ(i, m−i) is always zero). To see that it is non-degenerate, observe that the non-zero coefficients of the matrix (b(e_i, e_j))_{i,j} are exactly the anti-diagonal ones, so that the determinant is their product, up to sign, which is non-zero.
It is not immediately obvious, on the other hand, that b is invariant. A fairly quick
algebraic proof of this is explained in [39, 3.1.4, 3.1.5]: the group SL2 (C) is generated by
the elements
    u(t) = ( 1 t ; 0 1 ),  t ∈ C,    a(x) = ( x 0 ; 0 x⁻¹ ),  x ∈ C^×,    w = ( 0 1 ; −1 0 ),
so that it is enough to check that
    b(ρ_m(g)e_i, ρ_m(g)e_j) = b(e_i, e_j)
for g in one of these three classes (and all basis vectors). We leave the easy cases of a(x) and w to the reader, and just present the (maybe somewhat mysterious) case of g = u(t).
In that case, we have
(6.7)    ρ_m(g)e_i = X^i (tX + Y)^{m−i} = Σ_{k=0}^{m−i} ( m−i choose k ) t^k e_{i+k}
by the binomial theorem, and hence
    b(ρ_m(g)e_i, ρ_m(g)e_j) = Σ_{k=0}^{m−i} Σ_{ℓ=0}^{m−j} ( m−i choose k )( m−j choose ℓ ) t^{k+ℓ} b(e_{i+k}, e_{j+ℓ}).
Using the definition of b, only the terms with i + k + j + ℓ = m remain, and this can be rearranged as
    b(ρ_m(g)e_i, ρ_m(g)e_j) = (−1)^i t^{m−i−j} Σ_{k=0}^{m−i−j} (−1)^k ( m−i choose k )( m−j choose m−i−j−k ) / ( m choose i+k )
(where it is possible that the sum is empty). If one rearranges the ratio of binomial coefficients in terms of factorials, this becomes
    b(ρ_m(g)e_i, ρ_m(g)e_j) = (−1)^i t^{m−i−j} ( (m−i)!(m−j)! / ((m−i−j)! m!) ) Σ_{k=0}^{m−i−j} (−1)^k ( m−i−j choose k ).
Now the inner sum over k is zero (hence equal to b(e_i, e_j)) except when m = i + j, and in that last case we also get
    b(ρ_m(g)e_i, ρ_m(g)e_j) = (−1)^i (m−i)! i! / m! = b(e_i, e_j).
This example can be considered a rather striking illustration of the power of character theory: the existence of a symmetric or alternating invariant bilinear form on the space of ρ_m is detected by a simple integral of trigonometric functions, but the actual bilinear form is quite intricate, and it is not at all straightforward to guess its expression. In particular, note that it is quite different from the SU₂(C)-invariant inner product, for which the vectors e_i are orthogonal (see Example 5.2.12). In fact, this inner product ⟨·, ·⟩ on V_m is not invariant under the larger group SL₂(C), since ρ_m is not unitary as a representation of SL₂(C).
Concretely, recall that the inner product can be defined (as a scalar multiple of the one computed in Example 5.2.12) so that the (e_i) are orthogonal, with squared length
    ⟨e_i, e_i⟩ = 1 / ( m choose i ).
of the dimensions of irreducible representations of G is equal to the number of elements g ∈ G such that g² = 1.
Proof. This is quite a cute argument: by assumption, we have
    (1/|G|) Σ_{g∈G} χ_ρ(g²) = 1
for all ρ ∈ Ĝ. Multiplying by dim(ρ) and then summing over all ρ, we obtain
    Σ_{ρ∈Ĝ} dim(ρ) = (1/|G|) Σ_{g∈G} Σ_{ρ∈Ĝ} χ_ρ(g²) dim(ρ)
                   = (1/|G|) Σ_{g∈G} Σ_{ρ∈Ĝ} χ_ρ(g²) χ_ρ(1).
By the second orthogonality relation (4.28), the inner sum vanishes unless g² is conjugate to 1, i.e., unless g² = 1, and in that case it is equal to |G|. Thus we get
    Σ_{ρ∈Ĝ} dim(ρ) = |{g ∈ G | g² = 1}|,
as claimed.
The question of evaluating this sum was mentioned briefly in Remark 4.2.5.
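For instance, for G = S₃, all irreducible representations are realizable over R (so the hypothesis FS(ρ) = 1 holds), the dimensions are 1, 1 and 2, and there are indeed four solutions of g² = 1: the identity and the three transpositions. A sketch using only the Python standard library (the helper names are ours):

```python
from itertools import permutations

# Irreducible representations of S_3: trivial, sign, and the standard
# 2-dimensional representation (a known character-table fact).
dims = [1, 1, 2]

# Count the solutions of g^2 = 1 in S_3, realized as permutations of {0,1,2}.
def square(p):
    return tuple(p[p[i]] for i in range(len(p)))

involutions = [p for p in permutations(range(3))
               if square(p) == tuple(range(3))]
assert sum(dims) == len(involutions)  # 1 + 1 + 2 = 4 solutions of g^2 = 1
```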
Exercise 6.2.7. Consider the examples of finite groups for which we computed the
full character table (in Section 4.6.2, 4.6.3 and 4.6.4), and for each of them determine
the Frobenius-Schur indicators (in particular determine which are self-dual). [Hint: For
GL2 (Fp ), one can use Exercise 4.6.15 to first find very easily the self-dual representations.]
where x_t ∈ SU_n(C) defines any smooth curve with tangent vector A at t = 0. As usual, one takes x_t = exp(tA), where the exponential is that of matrices; then we have
i(g) x_t = g exp(tA) g^{−1} = exp(t gAg^{−1}),
(e.g., using the Taylor series expansion) and the derivative at t = 0 gives Ad(g)A = gAg^{−1}, as desired.
Remark 6.3.4 (The Larsen alternative for other groups). In addition to the case
of the unitary group considered above, there are criteria for orthogonal and symplectic
groups.
Remark 6.3.5 (Finite groups with M_4 = 2). As observed by Katz [22, 1.6.1], there do exist finite groups G ⊂ SU_n(C), for some n ≥ 2, for which M_4(G) = 2. For instance,
let G = PSL2 (F7 ); it follows from the character table of SL2 (F7 ) that G has two distinct
irreducible representations π1 and π2 of dimension 3 = (7 − 1)/2. Unitarized, either of
these gives a homomorphism
G −→ U3 (C).
Since G is a simple group, this is necessarily a faithful representation, and (for the same reason) the composite of G ↪ U_3(C) with det : U_3(C) −→ C^×, which can not be injective, is trivial.
Thus the image of either of these representations is a finite subgroup of U3 (C), and one
can check that these have fourth moment equal to 2, i.e., that M4 (π1 ) = M4 (π2 ) = 2.
Remark 6.3.6 (How does one apply the Larsen alternative?). We explain here, with a
specific example, some of the situations where results like the Larsen alternative are very
valuable tools. As already hinted, sometimes theory gives the existence of some group
which carries information concerning objects of interest. A very good example, though
it is not directly relevant to the Larsen alternative, is the Galois group of the splitting
field of a polynomial. This is a finite group, constructed abstractly. If one knows the
coefficients of the polynomial, however, it is not so easy to determine the Galois group.
In fact, often the only obvious information is that it is isomorphic to a subgroup of Sn ,
where n is the degree of the polynomial (for instance, can you guess the Galois group of
the splitting field of
X 8 − 4X 7 + 8X 6 − 11X 5 + 12X 4 − 10X 3 + 6X 2 − 3X + 2
over Q?) In fact, part of what makes the Larsen alternative surprising is that it does not
really have an analogue for Galois groups!
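One thing one can do with a computer (an illustrative probe of mine, not taken from the text): count the roots of this polynomial modulo various primes. By the Chebotarev density theorem, the statistics of these counts reflect how Frobenius elements act on the eight roots, and so constrain which subgroup of S_8 the Galois group can be, without determining it outright:

```python
# Count the roots, modulo small primes p, of the degree-8 polynomial
# X^8 - 4X^7 + 8X^6 - 11X^5 + 12X^4 - 10X^3 + 6X^2 - 3X + 2 from the
# text; the number of roots mod p is the number of fixed points of a
# Frobenius element acting on the 8 roots (for p unramified).
coeffs = [1, -4, 8, -11, 12, -10, 6, -3, 2]

def f_mod(x, p):
    v = 0
    for c in coeffs:  # Horner evaluation modulo p
        v = (v * x + c) % p
    return v

def root_count(p):
    return sum(1 for x in range(p) if f_mod(x, p) == 0)

for p in [3, 5, 7, 11, 13, 101, 1009]:
    print(f"roots mod {p}: {root_count(p)}")
```

Averaged over many primes, the count converges to the number of orbits of the Galois group on the roots, which already rules out many candidate subgroups.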
Now for the example, which is based on very recent (and very deep) work of N. Katz [23]. Fix an integer d ≥ 1 and a prime number p such that p ∤ d(d − 1). For any
finite field F_q of order q which is a power of p, and any non-trivial (one-dimensional) character χ of the multiplicative group F_q^×, one defines the sum
(6.13) S(χ; q) = Σ_{x∈F_q} χ(x^d − dx − 1)
using the convention χ(0) = 0. These are apparently just complex numbers, but they
turn out to be related to some compact Lie groups. Indeed, it follows from the work of
A. Weil4 that for every such pair (q, χ), there exists a well-defined conjugacy class θ(χ; q)
in the unitary group U_{d−1}(C) such that
(6.14) Tr θ(χ; q) = −S(χ; q)/√q
(in particular, note that this implies that
|S(χ; q)| ≤ (d − 1)√q,
which the reader may try to prove directly, knowing that, even for the first difficult case
d = 3, all known proofs are very involved...)
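The bound itself is easy to test numerically for small parameters (an illustrative sketch; the choices d = 3, p = 101 and the primitive root g = 2 are mine, not from the text):

```python
import cmath
import math

# Check |S(chi; q)| <= (d - 1) sqrt(q) for q = p = 101 and d = 3
# (so that p does not divide d(d - 1) = 6), over all non-trivial
# multiplicative characters chi of F_p^*.
p, d = 101, 3
g = 2  # 2 is a primitive root modulo 101
dlog = {pow(g, k, p): k for k in range(p - 1)}  # discrete log table

def chi(j, x):
    # chi_j(g^k) = e^{2 pi i jk/(p-1)}, with the convention chi_j(0) = 0
    if x % p == 0:
        return 0.0
    return cmath.exp(2j * math.pi * j * dlog[x % p] / (p - 1))

bound = (d - 1) * math.sqrt(p)
for j in range(1, p - 1):  # the p - 2 non-trivial characters
    S = sum(chi(j, x**d - d * x - 1) for x in range(p))
    assert abs(S) <= bound + 1e-9
print(f"|S(chi; {p})| <= {bound:.3f} for all non-trivial chi")
```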
The connection with the Larsen alternative arises from the following fact, which is
a recent theorem of Katz (closely related to another very deep result of Deligne): there
exists a compact subgroup K ⊂ Ud−1 (C), depending a priori on p and d, such that, first,
all θ(χ; q) are in fact naturally conjugacy classes of K, and second, they become equidis-
tributed among conjugacy classes of K, in the sense that for any conjugacy-invariant
continuous function f : K −→ C, we have
(6.15) ∫_K f(x) dμ(x) = lim_{q→+∞} (1/(q − 2)) Σ_{χ≠1} f(θ(χ; q)),
where μ is the probability Haar measure on K and the sum is over non-trivial characters of F_q^×.
Thus, if one succeeds in determining what the group K is – something which, just as was the case for the Galois group, is by no means clear by just looking at the sums (6.13)! –
one can answer many questions about the asymptotic distribution of the sum, something
which is of great interest (at least, to arithmeticians...)
Now it is clear why the Larsen alternative is useful: applying first (6.15) with f (x) =
| Tr(x)|4 and then (6.14), we get the alternative formula
M_4(K) = lim_{q→+∞} (1/(q − 2)) Σ_{χ≠1} |Tr θ(χ; q)|^4
       = lim_{q→+∞} (1/(q²(q − 2))) Σ_{χ≠1} |Σ_{x∈F_q} χ(x^d − dx − 1)|^4
for the fourth moment of K, which involves the given, concrete, data defining the problem.
We may have a chance to evaluate this...
As it turns out, one can show that the compact group K, if d ≥ 6 at least, does satisfy M_4(K) = 2 (though proving this is actually quite difficult). Hence the Larsen alternative shows that either K is finite, or K ⊃ SU_{d−1}(C). One can analyze the situation further, and the conclusion (still for d ≥ 6) is that K is equal to the full unitary group U_{d−1}(C).
4 This is a special case of the Riemann Hypothesis for curves over finite fields.
In the works of Katz, many other (more general) situations are considered. Note that,
even though the statements can be rather concrete, there is no known elementary proof
of the deep connection between sums like S(χ; q) and a compact Lie group.
Exercise 6.3.7 (Other moments). One can define other types of moments. For
instance, given a compact group G and a finite-dimensional unitary representation % of
G, let
M_a(%) = ∫_G χ_%(g)^a dμ(g)
for an integer a ≥ 0. It is an elementary consequence of character theory, which is not
necessarily clear at first when expressed for a “concrete” group, that Ma (%) is a non-
negative integer, as the multiplicity of the trivial representation in the finite-dimensional
representation %⊗a .
The sequence of moments (M_a(%))_{a≥0}, as a ≥ 0 varies, can be quite interesting...
(1) Take G = SU_2(C) and % the tautological inclusion SU_2(C) ↪ GL_2(C). Show that
M_a(%) = 0
if a is odd and
M_{2a}(%) = (1/(a + 1)) binom(2a, a)
for a ≥ 0. Can you prove directly that the right-hand side is an integer?
(2) Compute the first few terms and identify this sequence in the “Online Encyclopedia
of Integer Sequences” (http://oeis.org).
(3) For a prime number p, and an element α ∈ F_p^×, let
S(α; p) = (1/√p) Σ_{x∈F_p^×} e((x + α x̄)/p),
where e(·) is the character e(z) = e^{2iπz} of R/Z and x̄ designates the inverse of x modulo p (i.e., x x̄ ≡ 1 (mod p)). For reasonably large values of p (say p ≤ 100000) and the first few a ≥ 0, compute (using a computer) the "empirical" moments
m_{a,p} = (1/(p − 1)) Σ_{α∈F_p^×} S(α; p)^a.
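Parts (1) and (3) can be explored numerically along the following lines (a sketch of mine; the prime p = 997 and the quadrature step are arbitrary choices, and the closed form used for comparison is the one from part (1)):

```python
import cmath
import math

# Part (1): moments of SU_2(C) by Weyl integration. The trace of a
# random element is 2 cos(t), with density (2/pi) sin^2(t) on [0, pi],
# so M_a = (2/pi) \int_0^pi (2 cos t)^a sin^2(t) dt (midpoint rule).
def su2_moment(a, steps=20000):
    h = math.pi / steps
    return (2 / math.pi) * h * sum(
        (2 * math.cos((i + 0.5) * h)) ** a * math.sin((i + 0.5) * h) ** 2
        for i in range(steps))

for a in range(5):
    expected = math.comb(2 * a, a) // (a + 1)  # closed form of part (1)
    assert abs(su2_moment(2 * a) - expected) < 1e-3
    assert abs(su2_moment(2 * a + 1)) < 1e-3  # odd moments vanish

# Part (3): empirical moments of the normalized sums S(alpha; p).
p = 997  # a moderate prime, an arbitrary choice
e = [cmath.exp(2j * math.pi * t / p) for t in range(p)]  # e(t/p)
inv = {x: pow(x, -1, p) for x in range(1, p)}            # x -> inverse mod p
S = [sum(e[(x + alpha * inv[x]) % p] for x in range(1, p)).real
     / math.sqrt(p) for alpha in range(1, p)]  # the sums are real
m = [sum(s**k for s in S) / (p - 1) for k in range(5)]
print("empirical moments m_{a,p} for a = 0..4:", [round(x, 3) for x in m])
# as p grows, these approach 1, 0, 1, 0, 2, the moments of part (1)
```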
where p_i : H −→ ker(A − λ_i) is the orthogonal projection on the i-th eigenspace.
(This is a probability measure since ψ is a unit vector, and it satisfies the con-
dition that any physically observed values would be among the eigenvalues λi of
A, since the measure µψ,A has support given by these eigenvalues.)
In particular, suppose A is a (non-trivial: A ≠ 0, A ≠ Id) orthogonal projection. Its spectrum is {0, 1}, and the corresponding projections are just p_1 = A itself and p_0 = Id − A. Thus "measuring" the observable A will result in either of these values, with probability ‖Aψ‖² of getting 1, and 1 − ‖Aψ‖² of getting 0 (in probabilistic terms, this is a Bernoulli random variable).
• The probability can be understood experimentally, and the prediction checked,
as follows: if the measurement is repeated a large number of times (say N times),
each one after preparing the system to be in state ψ, then the proportion NB /N
of the number NB of measurements for which the experimental value λ is in B
will be close to µψ,A (B). [This is in striking contrast with newtonian mechanics:
given that the particle is in the state (x, p) ∈ P , the value of the observation f
is simply the exact value f (x, p) ∈ R.] This property makes the link between
the mathematical model and the natural world; it can, in principle, be falsified,
but the probabilistic interpretation has turned out to be confirmed by test after
test. Not only is it the case that the relative frequencies of various results
(especially zero/one tests corresponding to projections) are found to be close to
the theoretical values, but no method (either practical or even theoretical) has
been found to predict exactly the values of the measurements one after the other.
In the example where the spectral measure is given by (6.17), one will therefore "observe" the eigenvalue λ_i with relative frequency given by
μ_{ψ,A}({λ_i}) = ‖p_i(ψ)‖².
• Finally, the basic dynamical equation is Schrödinger’s equation: there exists
a particular observable E, the Hamiltonian, such that the state of the system
evolves in time as a solution (ψt ) of the equation
i (h/2π) (dψ_t/dt) = E ψ_t,
where h is Planck's constant. The Hamiltonian encapsulates the forces acting on the particle. [In newtonian mechanics, the particle evolves according to the differential equation m d²x/dt² = (sum of the forces).] We won't discuss dynamics of
quantum systems here, but it turns out that there is a connection between this
equation and unitary representations of the (additive, non-compact) group R;
see Section 7.2.
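These rules are easy to make concrete in a finite-dimensional toy model (entirely illustrative; the matrix A and the state ψ below are arbitrary choices): the weights ‖p_i(ψ)‖² form a probability distribution on the spectrum, and the mean of μ_{ψ,A} is ⟨Aψ, ψ⟩.

```python
import numpy as np

# A toy observable on C^3: a Hermitian (here real symmetric) matrix A
# and a unit state psi. For simple eigenvalues, the spectral measure
# mu_{psi,A} puts mass ||p_i(psi)||^2 = |<v_i, psi>|^2 on lambda_i.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, -2.0]])
psi = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)

eigvals, eigvecs = np.linalg.eigh(A)       # orthonormal eigenvector columns
weights = np.abs(eigvecs.T @ psi) ** 2     # the spectral measure of psi

assert abs(weights.sum() - 1.0) < 1e-10    # a probability measure
# the mean of mu_{psi,A} is the expectation value <A psi, psi>
assert abs((weights * eigvals).sum() - psi @ A @ psi) < 1e-10
for lam, w in zip(eigvals, weights):
    print(f"P(observing {lam:+.4f}) = {w:.4f}")
```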
For more information, written mostly from a mathematical point of view, the reader
may refer to [38, 41, 42, 44] (there are also, of course, many physics books on quantum
mechanics which may be worth reading.)
Exercise 6.4.1 (Unbounded self-adjoint operator). Let H be a Hilbert space, and let A : D_A −→ H be a linear operator defined on D_A ⊂ H, a dense subspace of H. The pair (D_A, A) is called an unbounded operator on H, and it is called self-adjoint if the following two conditions hold: (1) we have
the following two conditions hold: (1) we have
D_A = {ψ ∈ H | φ ↦ ⟨Aφ, ψ⟩ extends to a continuous linear form on H};
and (2) for all ψ_1, ψ_2 ∈ D_A, we have
⟨Aψ_1, ψ_2⟩ = ⟨ψ_1, Aψ_2⟩.
Show that the following defines a self-adjoint unbounded operator:
H = L²(R, dx),
D_A = {ψ ∈ H | x ↦ xψ(x) ∈ H},
(Aψ)(x) = xψ(x) for ψ ∈ D_A.
This observable is interpreted as the position of a particle constrained to move on the line R. Given ψ ∈ L²(R) with ‖ψ‖₂ = 1, the measure μ_{ψ,A}, in that case, is the probability measure |ψ(x)|² dx on R, so that the probability that a particle in the state described by ψ be located inside a set B is given by
∫_B |ψ(x)|² dx.
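For instance (a toy numerical check of mine, with an arbitrarily chosen state), for the Gaussian ψ(x) = π^{−1/4} e^{−x²/2} one has ‖ψ‖₂ = 1, and the probability of finding the particle in B = [0, 1] can be compared against the error function:

```python
import math

# For psi(x) = pi^{-1/4} exp(-x^2 / 2), the spectral measure of the
# position operator is |psi(x)|^2 dx = pi^{-1/2} exp(-x^2) dx.
def psi_sq(x):
    return math.exp(-x * x) / math.sqrt(math.pi)

def prob(a, b, steps=100000):
    # midpoint rule for the probability of finding the particle in [a, b]
    h = (b - a) / steps
    return h * sum(psi_sq(a + (i + 0.5) * h) for i in range(steps))

total = prob(-10.0, 10.0)      # ~ ||psi||_2^2 = 1 (tails are negligible)
p01 = prob(0.0, 1.0)
exact = 0.5 * math.erf(1.0)    # since erf(z) = (2/sqrt(pi)) int_0^z e^{-t^2}
assert abs(total - 1.0) < 1e-6
assert abs(p01 - exact) < 1e-9
print(f"P(particle in [0, 1]) = {p01:.6f}")
```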
Much more about the general theory of unbounded linear operators can be found, for
instance, in the books of Reed and Simon [31, 32].
How does representation theory enter the picture? The answer has to do with possible
symmetries of the system, which must be compatible with the linear structure underlying
the Hilbert space involved. If an observable A of interest is also compatible with the
symmetries of the system, and can be described using only eigenvalues (as in (6.16)), it
follows that the eigenspaces must be invariant under these symmetries; in other words, if
there is a symmetry group G of the system, the eigenspaces of observables are (unitary)
representations of G.
Now consider such an eigenspace, say V = ker(A−λ). For states ψ ∈ V , the observable
A has the specific, deterministic, value λ. If the representation V is not irreducible, we
can find another observable B such that B commutes with A, and some eigenspace of B
is a proper subspace W of V . For the states in W , both A and B have determined value.
Thus, in the opposite direction, if V is an irreducible representation of G, nothing more
may be said (deterministically) concerning the states in V .
What this shows is that, given a quantum system with symmetry group G, we should
attempt to decompose the corresponding representation of G on H into irreducible repre-
sentations. The states in each subrepresentation will be fundamental building blocks for
all states (by linearity), which can not be further analyzed in a fully deterministic way.
We now illustrate these general principles with concrete examples. We consider a
particle evolving in R3 , which is constrained to lie on the unit sphere S2 ⊂ R3 (this
restriction is not very physical, but it helps at first with the mathematical analysis, and
there are many fascinating purely mathematical questions). The underlying Hilbert space
is taken to be H = L2 (S2 , ν), where ν is the surface Lebesgue measure on the sphere,
defined similarly as in Example 5.2.4, (5). The operators of multiplication of a ψ ∈ H by each coordinate function can play the role of position observables (as in Exercise 6.4.1); but since the coordinates are bounded, the corresponding operators are continuous on H.
Suppose now that the system evolves according to a homogeneous force, compatible with
the rotational symmetry of the sphere S2 . The corresponding representation is therefore
a representation of the rotation group SO3 (R), given by
(g · ψ)(x) = ψ(g^{−1} x)
(this is indeed a unitary representation since the measure ν is SO_3(R)-invariant; see
Example 5.2.9).
The simplest observable to consider is the energy of the particle. In fact, even without
knowing anything about its shape, it is intuitively clear that if the system is rotation-
invariant, the energy must be compatible with the symmetries: in some sense, applying a
rotation to a state ψ amounts to observing this state from a different direction in space,
and rotation-invariance implies the absence of privileged directions!
Thus we attempt to decompose this representation of SO3 (R). Recall from Exam-
ple 5.6.9 that the irreducible unitary representations of SO3 (R) are obtained using the
projection
SU2 (C) −→ SO3 (R)
from the odd-dimensional irreducible representations of SU_2(C): for each integer ℓ ≥ 0, there exists a unique irreducible representation of SO_3(R) of dimension 2ℓ + 1, which we denote V_ℓ here.
Proposition 6.4.2 (Decomposition of L²(S²)). The space L²(S²) is isomorphic, as a representation of SO_3(R), to the Hilbert direct sum
⊕_{ℓ≥0} V_ℓ
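The dimension count fits with a classical description (a standard fact used here without proof, and not part of the text above): V_ℓ can be realized on the harmonic polynomials of degree ℓ in three variables, and since the Laplacian maps the homogeneous polynomials of degree ℓ onto those of degree ℓ − 2, this space has dimension binom(ℓ+2, 2) − binom(ℓ, 2) = 2ℓ + 1.

```python
from math import comb

# dim(homogeneous polynomials of degree l in 3 variables) = C(l+2, 2);
# granting that the Laplacian maps this space onto the degree-(l-2)
# polynomials, the harmonic ones form a space of dimension
# C(l+2, 2) - C(l, 2) = 2l + 1 = dim V_l.
for l in range(20):
    assert comb(l + 2, 2) - comb(l, 2) == 2 * l + 1

# consistency check for the Hilbert sum: the pieces with l <= L have
# total dimension sum_{l <= L} (2l + 1) = (L + 1)^2
L = 10
assert sum(2 * l + 1 for l in range(L + 1)) == (L + 1) ** 2
print("dim V_l = 2l + 1 verified for l = 0, ..., 19")
```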
CHAPTER 7
The picture of representation theory beyond the case of compact groups changes
quite dramatically. Even if one restricts to locally compact groups, it happens that non-
compact groups have typically infinite-dimensional irreducible unitary representations,
and from these one can not usually produce all representations using Hilbert direct sums.
Their study becomes in many respects more analytic, and certainly requires quite different
tools and techniques. These are mostly beyond the scope of this book – as people say!
– and the very modest goal of this chapter is simply to present some concrete examples
of groups and, even if without full proofs, a survey of some phenomena involved in the
representation theory of such groups, emphasizing mostly those that were not visible in
the previous chapters.
We begin however with some words about a more algebraic topic concerning algebraic
groups.
(the various coefficients b(e_k, e_ℓ) and b(e_i, e_j) are complex constants since b and the basis have been fixed.)
We now define two objects:
– the set of all polynomials vanishing on G, or more precisely, denoting
A = C[(Xi,j )i,j , Y ],
a polynomial algebra in n2 + 1 variables,1 we let
IG = {f ∈ A | f (gi,j , det(g)−1 ) = 0 for all g = (gi,j ) ∈ G} ;
– the set of all elements in GL(V) which are common zeros of these polynomials, i.e.,
(7.2) VG = {x ∈ GL(V ) | f (x, det(x)−1 ) = 0 for all f ∈ IG }.
Obviously, the definitions show that G ⊂ VG . Moreover, since the invariance rela-
tions (7.1) – which are satisfied by G – are polynomial, they define elements fi,j ∈ IG
(which happen to not depend on the extra variable Y ). This means that those relations
are also satisfied, by definition, by all elements g ∈ VG . In other words, the bilinear form
b is invariant under all elements g ∈ VG .
The claim is now that this set V_G ⊂ GL(V) is in fact a subgroup of GL(V). We will then denote Ḡ = V_G, and we have obtained the desired relation
(7.3) B^G = B^Ḡ.
1 The usefulness of the presence of the extra variable Y , which is used to represent polynomially the
inverse of a g ∈ GL(V ), will be clear very soon.
So let us prove the claim, which is quite elementary. We start by showing that VG
is stable under inversion, since this is where the extra variable Y is useful. Given any
f ∈ IG , define a new polynomial by
f˜((Xi,j ), Y ) = f (Y (X̃i,j ), det(Xi,j ))
where Y is seen as a scalar in the first argument, and the matrix X̃ = (X̃i,j ) it multiplies
is the comatrix of X = (Xi,j ), i.e., the coefficients X̃i,j are the polynomials in C[(Xi,j )]
such that
det(X) × X −1 = X̃.
Thus f˜ ∈ A; now, for g ∈ GL(V ), we have
f˜(g, det(g)−1 ) = f (det(g)−1 (g̃i,j ), det(g)) = f (g −1 , det(g)),
and this vanishes for all g ∈ G since – because G is a group – g −1 ∈ G and f ∈ IG . Thus
f˜ ∈ IG . This implies that f˜ vanishes on VG , i.e., for all g ∈ VG , we have
f (g −1 , det(g)) = f˜(g, det(g)−1 ) = 0.
Finally, consider g ∈ VG to be fixed; then f (g −1 , det(g)) = 0 for all f ∈ IG , which by
definition means that g −1 ∈ VG , as desired.
We now proceed to show that VG is stable under products, along similar lines. First,
we show that VG is stable by multiplication by elements of G on both sides: if g1 ∈ G is
given then for all g ∈ VG , both g1 g and gg1 are in VG . Indeed, given f ∈ IG , we define
f˜(Xi,j , Y ) = f (g1 · (Xi,j ), det(g1 )−1 Y )
where g1 · (Xi,j ) denotes the matrix product. Since matrix multiplication is polynomial
in the coordinates (Xi,j ), this is an element of A. And as such, it belongs to IG , because
for g ∈ G we have g1 g ∈ G – since G is itself a group – and hence
f˜(g, det(g)−1 ) = f (g1 g, det(g1 g)−1 ) = 0
by definition of I_G. Hence f̃ vanishes in fact identically on all of V_G, which means that f(g_1 g, det(g_1 g)^{−1}) = 0 for all g ∈ V_G. Since f ∈ I_G is arbitrary, this property applied to a fixed g ∈ V_G means that g_1 g ∈ V_G. Reversing the order of a few products, we also obtain in this way that g g_1 ∈ V_G for all g ∈ V_G.
Now we deduce the stability of V_G under all products: let g_1 ∈ V_G be given; for f ∈ I_G, define again
f̃((X_{i,j}), Y) = f(g_1 · (X_{i,j}), det(g_1)^{−1} Y) ∈ A.
We get f̃ ∈ I_G (because, for g ∈ G, we know from above that g_1 g ∈ V_G, and then f̃(g, det(g)^{−1}) = f(g_1 g, det(g_1 g)^{−1}) = 0). Therefore f̃ vanishes on V_G; in particular, fixing some g ∈ V_G and using the fact that f(g_1 g, det(g_1 g)^{−1}) = 0 for all f ∈ I_G, we conclude that g_1 g ∈ V_G.
Now we continue the proof. Because of (7.3), we are left with having to prove
dim B^Ḡ ≤ 1.
The reason this is easier, and why the passage from G to Ḡ is a drastic simplification (despite appearances at first sight!) is that Ḡ belongs to the category of linear algebraic groups, i.e., it is a subgroup of GL(V) which is the set of common zeros of a set of polynomial functions in A.
In fact, Ḡ is even better than a general algebraic group, because it is given with the inclusion Ḡ ⊂ GL(V), which is a faithful irreducible (in particular, semisimple) representation (since Ḡ ⊃ G, and G acts irreducibly on V, so does necessarily Ḡ). An algebraic group of this type is called a reductive group.²
Now we must invoke a fairly deep fact without proof: if Ḡ is a reductive subgroup of GL(V), then it contains a compact subgroup K ⊂ Ḡ for which the Zariski-closure (computed by the same method as above, with K instead of G) is still Ḡ. Therefore we get
dim B^G = dim B^Ḡ = dim B^K ≤ 1
by character theory applied to K as in the proof of Theorem 6.2.3 (see (6.4))...
Example 7.1.2. One may ask why one does not try to go directly from G to the compact group K. This seems difficult because it may well happen that G and K have trivial intersection! A basic example is G = GL_n(Z), n ≥ 2; then one can show (we will comment briefly on this in the next example) that Ḡ is the set of matrices g in GL_n(C) with det(g)² = 1; the subgroup K is then the group of unitary matrices g ∈ U_n(C) with det(g)² = 1. The intersection G ∩ K is the finite group W_n of signed permutation matrices (discussed in Exercise 4.7.13; indeed, if g = (g_{i,j}) is any unitary matrix with integral coefficients, the condition
Σ_j g_{i,j}² = 1
implies that, for a given i, a single g_{i,j} ∈ Z is ±1, and the others are 0; denoting j = σ(i), one sees that σ must be a permutation of {1, . . . , n}, so that g ∈ W_n; since W_n ⊂ GL_n(Z), we get the result.) In this case, G ∩ K still acts irreducibly on C^n (as the
reader should check), but if we replace G by the subgroup
G_3 = {g ∈ GL_n(Z) | g ≡ Id (mod 3)},
which has finite index, it is also possible to show that the Zariski-closure of G_3, and hence the compact subgroup K, are the same as that for G. However, G_3 ∩ K is now trivial, since a signed permutation matrix which is congruent to the identity modulo 3 has to be the identity (we used reduction modulo 3 instead of 2 here to be able to distinguish the two signs.)
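For n = 3, the description of G ∩ K = W_n can be confirmed by brute force (an illustrative enumeration of mine; it is exhaustive because every row of an integral orthogonal matrix has squared length 1, hence entries in {−1, 0, 1}):

```python
from itertools import product

# Enumerate all 3x3 matrices with entries in {-1, 0, 1} and keep the
# orthogonal ones (g^T g = Id): these are exactly the signed permutation
# matrices, and there are 2^3 * 3! = 48 of them.
def is_orthogonal(g):
    return all(sum(g[k][i] * g[k][j] for k in range(3))
               == (1 if i == j else 0)
               for i in range(3) for j in range(3))

count = 0
for entries in product([-1, 0, 1], repeat=9):
    g = [entries[0:3], entries[3:6], entries[6:9]]
    if is_orthogonal(g):
        # each row then carries a single entry +-1
        assert all(sum(abs(e) for e in row) == 1 for row in g)
        count += 1
assert count == 48
print("orthogonal integral 3x3 matrices:", count)  # 48 = 2^3 * 3!
```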
Example 7.1.3 (Examples of Zariski-closure). Going from G to the Zariski-closure Ḡ in the above proof might be a difficult psychological step at first, especially if one thinks of G as being particularly concrete (e.g., G = GL_n(Z) ⊂ GL_n(C)) while Ḡ seems a very abstract object. However, it is a fact that computing the Zariski-closure of a group is quite often relatively easy, using known results (which may, of course, be quite deep and non-trivial.) Here are a few examples.
Exercise 7.1.4 (The Zariski topology). The association of the “big” group G to the
group G has the aspect of a “closure” operation. Indeed, it can be interpreted as taking
the closure of G in GL(V ) with respect to a certain topology, called the Zariski topology.
Let k be an algebraically closed field and let n ≥ 1 be an integer. We define A^n = k^n, which is called the affine n-space over k, and A = k[X_1, . . . , X_n]. Note that for a polynomial f ∈ A and x ∈ A^n, we can evaluate f at x.
(1) For any subset I ⊂ A, let
V(I) = {x ∈ A^n | f(x) = 0 for all f ∈ I}.
2 One definition of a general reductive group is that it is an algebraic group which has a finite-
dimensional semisimple representation % with finite kernel.
Show that the collection of sets (V(I))_{I⊂A} is the collection of closed sets for a topology on A^n.
In particular, this allows us to speak of the Zariski-closure of any subset V ⊂ A^n: it is the intersection of all Zariski-closed subsets of A^n which contain V.
(2) For n = 1, show that a subset V ⊂ A¹ is closed for the Zariski topology if and only if V = A¹ or V is finite. Show that the Zariski topology on A² is not the product topology on A¹ × A¹. Show also that the Zariski topology is not Hausdorff, at least for A¹ (the case of A^n for arbitrary n might be more difficult.)
(3) Let G = GL_n(k), seen as a subset of A^{n²} by means of the matrix coefficients. Show that G is dense in A^{n²} with respect to the Zariski topology. Furthermore, show that the set
G̃ = {(g, x) ∈ A^{n²+1} | det(g)x = 1} ⊂ A^{n²+1}
is Zariski-closed in A^{n²+1}.
(4) For k = C and G ⊂ GL_n(C), show that the Zariski-closure of G̃ ⊂ A^{n²+1} is equal to
{(g, x) ∈ Ḡ × C | det(g)x = 1}.
The Zariski topology is a foundational notion in algebraic geometry; readers interested in learning more may look at [18] for the general theory (or should really try to attend a course in algebraic geometry, if at all possible!). In the context of linear algebraic groups, the books [4] and [40] may be more accessible.
Exercise 7.1.5 (Zariski closure and polynomial representations). Let k be an alge-
braically closed field and let G ⊂ GLn (k) be any subgroup. Let
% : G −→ GLm (k)
be a (matrix) representation of G. We assume that % is polynomial, i.e., that the matrix
coefficients of % are functions on G which are restrictions of polynomials in the coordinates
gi,j and in det(g)−1 .
(1) Define Ḡ ⊂ GL_n(k) as the Zariski closure of G. Show that there exists a unique representation
%_1 : Ḡ −→ GL_m(k)
such that %_1 coincides with % on G. What is %_1 if % is the injection of G in GL_n(k)?
(2) Show that a subspace V ⊂ k^m is a subrepresentation of % if and only if it is a subrepresentation of %_1.
(3) Show that the subspace (k^m)^G of vectors invariant under G is equal to the subspace of vectors invariant under Ḡ.
(4) Show that the representations %̃ (contragredient), Sym^k % (k-th symmetric power, for k ≥ 0) and ⋀^k % (alternating power, for k ≥ 0) are also polynomial.
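To see what "polynomial" means in the simplest case (an illustration of mine; the basis (e_1², e_1 e_2, e_2²) of Sym²(k²) is an arbitrary choice), the matrix of Sym²(g) has entries that are polynomials in the entries of g = [[a, b], [c, d]], and g ↦ Sym²(g) is multiplicative:

```python
import random

# Sym^2 of g = [[a, b], [c, d]] on the basis (e1^2, e1 e2, e2^2):
# every entry is a polynomial in a, b, c, d (no det(g)^{-1} needed here).
def sym2(g):
    (a, b), (c, d) = g
    return [[a * a, a * b, b * b],
            [2 * a * c, a * d + b * c, 2 * b * d],
            [c * c, c * d, d * d]]

def matmul(m1, m2):
    n = len(m1)
    return [[sum(m1[i][k] * m2[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# multiplicativity Sym^2(gh) = Sym^2(g) Sym^2(h), checked exactly on
# random integer matrices
random.seed(0)
for _ in range(100):
    g = [[random.randint(-5, 5) for _ in range(2)] for _ in range(2)]
    h = [[random.randint(-5, 5) for _ in range(2)] for _ in range(2)]
    assert sym2(matmul(g, h)) == matmul(sym2(g), sym2(h))
print("Sym^2 is a (polynomial) representation of 2x2 matrices")
```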
Remark 7.1.6 (The Larsen alternative). The Larsen Alternative can also be phrased
in terms of algebraic groups, if the fourth moment invariant M4 (G) of a subgroup G ⊂
GLn (C) is defined algebraically as the dimension of the space of G-invariants in End(Cn )⊗
End(Cn ), as in Lemma 6.3.3. The statement is the following (see [22, Th. 1.1.6 (1)]):
Theorem 7.1.7 (Larsen). Let G ⊂ GLn (C) be a reductive linear algebraic group. If
M4 (G) = 2, then either G is finite, or SLn (C) ⊂ G.
We now sketch another application of algebraic groups to a basic question of repre-
sentation theory, which is due to Chevalley: over C, the tensor product of two semisimple
representations remains semisimple (with no continuity or related assumption!)
Theorem 7.1.8 (Chevalley). Let %1 , %2 be finite-dimensional complex semisimple
representations of a group G. Then %1 ⊗ %2 is semisimple.
As mentioned by Serre [36, Part II, §1], it would be interesting to have an “elemen-
tary” proof of this fact, not involving algebraic groups.
Sketch of proof.
7.2. Locally-compact abelian groups
7.3. A non-abelian example: SL2 (R)
APPENDIX A
Given any N ≥ 1, we define the approximation T^{(N)} = T_{k_N}, where k_N is the corresponding partial sum
k_N = Σ_{k≤N} Σ_{ℓ≤N} α(k, ℓ) ψ_{k,ℓ}.
Note that for any ϕ ∈ L²(X), we have
T^{(N)} ϕ = Σ_{k≤N} Σ_{ℓ≤N} α(k, ℓ) (∫_X ϕ_ℓ(y) ϕ(y) dμ(y)) ϕ_k,
exists in H, and has norm ≤ 1. But then, since the ϕ_k are finite linear combinations of
the ψm , we get
⟨w_j, ϕ_k⟩ −→ ⟨v, ϕ_k⟩
as j → +∞, for all k. Since {ϕ_k} is dense in H, it follows (formally) that
(A.3) ⟨w_j, w⟩ −→ ⟨v, w⟩
as j → +∞, for all w ∈ H.³
We now argue that, due to the compactness of T , this weaker convergence property
suffices to ensure that T w_j → T v in H. If this is true, we are done: since T w_j = T_{n_j} v_{n_j} + (T − T_{n_j}) v_{n_j}, we have
‖T w_j‖ = λ_{n_j} + o(1) → λ,
and therefore ‖T v‖ = λ, as desired (note that this also shows that ‖v‖ = 1).
Now, we will finally use the explicit form of T to prove that T wj → T v. We write
k(x, y) = k_x(y), so that (using our simplifying assumption that k is continuous) each k_x is a well-defined continuous (hence bounded and square-integrable) function on X. Then we can write
(T w_j − T v)(x) = ⟨w_j − v, k_x⟩,
and hence
‖T w_j − T v‖² = ∫_X |⟨w_j − v, k_x⟩|² dμ(x).
3 One can think of this property as saying that "every coordinate" of w_j, captured by any linear functional on H, converges to the corresponding coordinate of v. To clarify the meaning of this, note that the formula ‖v‖ = sup_{‖w‖≤1} |⟨v, w⟩| shows that the convergence in norm is equivalent to a uniform convergence (over w of norm 1) of the corresponding "coordinates".
We can now apply the dominated convergence theorem to conclude: by (A.3), the integrand converges pointwise to 0, and moreover
|⟨w_j − v, k_x⟩|² ≤ (‖w_j‖ + ‖v‖)² ‖k‖²_∞ ≤ 4‖k‖²_∞,
which is an integrable function on X, so that the dominated convergence theorem does apply.
Remark A.2.6. In the language of weak convergence, we can summarize the last
steps of the argument as follows: (1) any sequence of unit vectors in H contains a weakly
convergent subsequence; (2) a compact operator maps a weakly convergent sequence to
a norm-convergent sequence. In general, (1) is a consequence of the Banach-Alaoglu
Theorem (see, e.g., [31, Th. IV.21]) and the self-duality of Hilbert spaces, while (2) is
most easily seen as coming from the alternate characterization of compact operators (on a Hilbert space H) as those mapping the unit ball to a relatively compact subset of H (see, e.g., [31, VI.5, Th. VI.11] for this fact).
A.3. The Stone-Weierstrass theorem
We conclude with the statement of the general Stone-Weierstrass approximation the-
orem, which is used in Exercises 5.4.4 and 5.4.6 for an alternative proof of the Peter-Weyl
Theorem.
Theorem A.3.1 (Stone-Weierstrass approximation theorem). Let X be a compact
topological space, and let A ⊂ C(X) be a subspace of continuous functions on X such
that (1) A is an algebra: if f, g ∈ A, the product f g is also in A; (2) A is stable under complex conjugation: if f ∈ A, then f̄ is in A (here f̄ maps x to the complex conjugate of f(x)); (3) A separates points, i.e., for any elements x, y in X with x ≠ y, there exists some f ∈ A such that f(x) ≠ f(y).
Then A is dense in C(X) for the uniform topology, i.e., for any f ∈ C(X) and ε > 0, there exists g ∈ A with
sup_{x∈X} |f(x) − g(x)| < ε.
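As a concrete instance (an illustration of mine; the Bernstein polynomials give one classical proof of the original Weierstrass theorem on [0, 1], of which Theorem A.3.1 is a far-reaching generalization), polynomials approximate f(x) = |x − 1/2| uniformly:

```python
import math

# Bernstein polynomials B_n(f)(x) = sum_k f(k/n) C(n, k) x^k (1-x)^{n-k}
# converge uniformly to f on [0, 1]; watch the sup-norm error shrink for
# the (non-differentiable) function f(x) = |x - 1/2|.
def bernstein(f, n, x):
    return sum(f(k / n) * math.comb(n, k) * x**k * (1 - x) ** (n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)
grid = [i / 200 for i in range(201)]

errors = {}
for n in [10, 50, 200]:
    errors[n] = max(abs(bernstein(f, n, x) - f(x)) for x in grid)
    print(f"n = {n:3d}: sup error on grid ~ {errors[n]:.4f}")

assert errors[200] < errors[50] < errors[10]
assert errors[200] < 0.05
```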
Bibliography
[1] B. Bekka, P. de la Harpe and A. Valette: Kazhdan’s Property (T ), New Math. Monographs 11,
Cambridge Univ. Press (2008).
[2] N. Berry, A. Dubickas, N. Elkies, B. Poonen and C. J. Smyth: The conjugate dimension of
algebraic numbers, Quart. J. Math. 55 (2004), 237–252.
[3] E.D. Bolker: The spinor spanner, American Math. Monthly 80 (1973), 977–984.
[4] A. Borel: Linear algebraic groups, 2nd edition, GTM 126, Springer 1991.
[5] W. Bosma, J. Cannon and C. Playoust, The Magma algebra system, I. The user language
J. Symbolic Comput. 24 (1997), 235–265; also http://magma.maths.usyd.edu.au/magma/
[6] A.E. Brouwer and M. Popoviciu: The invariants of the binary nonic, J. Symb. Computation 45
(2010) 709–720.
[7] D. Bump: Automorphic forms and representations, Cambridge Studies in Adv. Math. 55, Cam-
bridge Univ. Press (1998).
[8] C.W. Curtis and I. Reiner: Representation theory of finite groups and associative algebras, AMS
Chelsea Publishing, 1962.
[9] P. Diaconis: Group representations in probability and statistics, Inst. Math. Stat. Lecture Notes
11, Institute of Math. Statistics (1988).
[10] N. Dunford and J.T. Schwartz: Linear operators, part II: spectral theory, Wiley (1963).
[11] P. Etingof, O. Golberg, S. Hensel, T. Liu, A. Schwendner, D. Vaintrob, E. Yudovina: Introduc-
tion to representation theory, arXiv:0901.0827.
[12] G. Folland: Real Analysis, Wiley (1984).
[13] W. Fulton and J. Harris: Representation theory, a first course, Universitext, Springer (1991).
[14] The GAP Group: GAP – Groups, Algorithms, and Programming, Version 4.4.9, 2007; also
www.gap-system.org
[15] K. Girstmair: Linear relations between roots of polynomials, Acta Arithmetica 89 (1999), 53–96.
[16] W.T. Gowers: Quasirandom groups, Comb. Probab. Comp. 17 (2008), 363–387.
[17] W.T. Gowers (editor): The Princeton companion to mathematics, Princeton Univ. Press (2008).
[18] R. Hartshorne: Algebraic Geometry, Grad. Texts in Math. 52, Springer Verlag (1977).
[19] I.M. Isaacs: Character theory of finite groups, Academic Press (1976).
[20] I.M. Isaacs: Finite group theory, Graduate Studies in Math. 92, American Math. Soc. (2008).
[21] F. Jouve, E. Kowalski and D. Zywina: An explicit integral polynomial whose splitting field has
Galois group W (E8 ), Journal de Théorie des Nombres de Bordeaux 20 (2008), 761–782.
[22] N.M. Katz: Larsen’s alternative, moments and the monodromy of Lefschetz pencils, in “Contri-
butions to automorphic forms, geometry, and number theory (collection in honor of J. Shalika’s
60th birthday)”, J. Hopkins Univ. Press (2004), 521–560.
[23] N.M. Katz: Convolution and equidistribution: Sato-Tate theorems for finite field Mellin trans-
forms, Annals of Math. Studies, Princeton Univ. Press (to appear).
[24] A.W. Knapp: Representation theory of semisimple groups, Princeton Landmarks in Math.,
Princeton Univ. Press (2001).
[25] E. Kowalski: The large sieve, monodromy, and zeta functions of algebraic curves, II: indepen-
dence of the zeros, International Math. Res. Notices 2008, doi:10.1093/imrn/rnn091.
[26] S. Lang: Algebra, 3rd edition, Grad. Texts in Math. 211, Springer (2002).
[27] M. Larsen: The normal distribution as a limit of generalized Sato-Tate measures, preprint
(http://mlarsen.math.indiana.edu/~larsen/papers/gauss.pdf).
[28] M.W. Liebeck, E.A. O’Brien, A. Shalev and P.H. Tiep: The Ore conjecture, J. Eur. Math. Soc.
(JEMS) 12 (2010), 939–1008.
[29] W. Magnus: Residually finite groups, Bull. Amer. Math. Soc. 75 (1969), 305–316.
[30] N. Nikolov and L. Pyber: Product decompositions of quasirandom groups and a Jordan-type
theorem, J. European Math. Soc., to appear.
[31] M. Reed and B. Simon: Methods of modern mathematical physics, I: Functional analysis, Aca-
demic Press 1980.
[32] M. Reed and B. Simon: Methods of modern mathematical physics, II: Self-adjointness and
Fourier theoretic techniques, Academic Press 1980.
[33] J. Rotman: An introduction to the theory of groups, 4th ed., Grad. Texts in Math. 148, Springer
(1995).
[34] J.-P. Serre: Linear representations of finite groups, Grad. Texts in Math. 42, Springer (1977).
[35] J.-P. Serre: Cours d’arithmétique, Presses Univ. France (1988).
[36] J.-P. Serre: Moursund lectures, arXiv:math/0305257.
[37] J.-P. Serre: Topics in Galois theory, Res. Notes in Math., Jones and Bartlett (1992).
[38] S.F. Singer: Linearity, symmetry, and prediction in the hydrogen atom, Undergraduate texts in
mathematics, Springer (2005).
[39] T.A. Springer: Invariant theory, Springer Lecture Notes in Math. 585, (1977).
[40] T.A. Springer: Linear algebraic groups, 2nd edition, Progr. Math. 9, Birkhäuser (1998).
[41] S. Sternberg: Group theory and physics, Cambridge Univ. Press (1994).
[42] L. Takhtajan: Quantum mechanics for mathematicians, A.M.S. Grad. Studies in Math. 95 (2008).
[43] N.J. Vilenkin: Special functions and the theory of group representations, Translations of Math. Monographs 22, A.M.S. (1968).
[44] H. Weyl: The theory of groups and quantum mechanics, Dover (1950).
[45] A. Zygmund: Trigonometric series, 3rd Edition, Cambridge Math. Library, Cambridge Univ.
Press (2002).