LIE GROUPS

Course C3.5 MT 2023

Notes by Nigel Hitchin 2015

Lecturer: Jason D. Lotay


Contents

1 Introduction

2 Manifolds
2.1 Definitions
2.2 Lie groups
2.3 Functions and vector fields

3 Lie groups and Lie algebras
3.1 The Lie bracket
3.2 Examples of Lie groups
3.3 The adjoint representation
3.4 The exponential map

4 Submanifolds, subgroups and subalgebras
4.1 Lie subgroups
4.2 Continuity and smoothness
4.3 Subgroups versus subalgebras

5 Global aspects
5.1 Components and coverings
5.2 From Lie algebras to Lie groups

6 Representations of Lie groups
6.1 Basic notions
6.2 Integration on G
6.3 Characters and orthogonality

7 Maximal tori
7.1 Abelian subgroups
7.2 Conjugacy of maximal tori
7.3 Roots

8 Representations and maximal tori
8.1 The representation ring
8.2 Representations of U(n)
8.3 Integration on T

9 Simple Lie groups
9.1 The Killing form
9.2 Ideals and simplicity
9.3 Classification
9.4 The group G2

1 Introduction
A first course on group theory introduces a group as a set with two operations –
multiplication and inversion – satisfying appropriate axioms. However the subject
only begins to come to life when groups act on other sets: it is easier to think of the
symmetric group Sn as the group of permutations of the numbers {1, 2, . . . , n} than
as a multiplication table. But these same groups also appear acting on other sets, for
example the symmetric group S4 is isomorphic to the symmetries of the cube, and so
acts on R3 by rotations. Such linear actions of groups are frequent in real life. They
are called representations and they are studied side-by-side together with the groups
considered as objects in their own right.
The symmetric groups are finite groups but there are infinite groups such as the full
group of rotations which is what this course is about. Roughly speaking they are
infinite but with a finite number of degrees of freedom: a rotation in R3 is given
by an axis (equivalently a unit vector) with two degrees of freedom and an angle of
rotation, making three in all. Or the isometries of R2 – two degrees of freedom for
translations and one for rotations making again three. The structure that makes this
a consistent mathematical concept is that of a manifold, and that is what a Lie group
is: a manifold for which the group operations are smooth maps.
Lie groups are very special manifolds, however. If you think, rightly, of manifolds
as higher-dimensional generalizations of surfaces then the only compact, connected
surface which is a Lie group is a torus, and the only spheres which are Lie groups
are those of dimension one and three. So, although we shall have to review the basic
features of manifolds it will be via a viewpoint which is slanted towards this special
case.
Manifolds are also topological spaces and the two examples of Lie groups above are
quite different: the rotation group is compact and the isometry group of the plane
noncompact. The representation theory of compact Lie groups has a lot in common
with that of finite groups, and for the most part this course will deal with compact
groups and their representations.
This is a rich and well-studied area and a 16-lecture course, which also has to prove
the fundamental results, is necessarily restricted in its scope. We introduce roots
and the Weyl group but stop short of studying irreducible representations by highest
weights. This is a beautiful subject and provides a means of reading off information
about irreducibles from combinatorial formulae, but time does not allow us to pursue
that here.
We begin in Section 2 with the basics of manifold theory, as applied to Lie groups.

Section 3 defines the Lie algebra of a Lie group and the exponential map which links
the two. Section 4 deals with Lie subgroups where we prove in particular the useful
result that closed subgroups of Lie groups are Lie groups. Section 5 gives a precise
description of the Lie groups which have isomorphic Lie algebras.
The remaining four sections are about representations of compact Lie groups. Section
6 introduces integration on a compact Lie group, orthogonality of characters and the
Peter-Weyl theorem which shows that every irreducible representation appears within
the Hilbert space of L2 functions on the group. Section 7 introduces maximal tori
and proves the crucial result that all such tori are conjugate. We use a proof based
on the degree of a smooth map between manifolds of the same dimension, the higher-
dimensional analogue of the winding number. Given that result, many features follow
rapidly: the Weyl group, roots and weights. Section 8 introduces the representation
ring or character ring and its relation with the action of the Weyl group. The final
section states without proof the classification of simply-connected compact simple
Lie groups. I have also added a concrete description of the exceptional group G2 to
illustrate some of the features encountered within the course.

2 Manifolds

2.1 Definitions
The concept of a manifold starts with defining the notion of a coordinate chart.

Definition 1 A coordinate chart on a set M is a subset U ⊆ M together with a bijection
\[ \varphi : U \to \varphi(U) \subseteq \mathbf{R}^n \]
onto an open set $\varphi(U)$ in $\mathbf{R}^n$.

Thus we can parametrize points x of U by n coordinates φ(x) = (x1 , . . . , xn ).

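Definition 2 A smooth manifold of dimension n is a set M with a collection of
coordinate charts $\{U_\alpha, \varphi_\alpha\}_{\alpha \in I}$ such that

• M is covered by the $\{U_\alpha\}_{\alpha \in I}$

• for each $\alpha, \beta \in I$, $\varphi_\alpha(U_\alpha \cap U_\beta)$ is open in $\mathbf{R}^n$

• the map
\[ \varphi_\beta \varphi_\alpha^{-1} : \varphi_\alpha(U_\alpha \cap U_\beta) \to \varphi_\beta(U_\alpha \cap U_\beta) \]
is $C^\infty$ with $C^\infty$ inverse.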

Remark: Recall that a map $F(x_1, \dots, x_n) \in \mathbf{R}^n$ is $C^\infty$ if it has partial derivatives of
all orders. In manifold theory one can work with lower degrees of differentiability, with
different outcomes, but it is a deep theorem that for groups which are manifolds, assuming
only continuity of the maps above gives the same theory as assuming smoothness.

A manifold is automatically a topological space. Recall what a topological space is:


a set M with a distinguished collection of subsets called open sets such that

1. ∅ and M are open

2. an arbitrary union of open sets is open

3. a finite intersection of open sets is open

For a manifold we shall say that a subset V ⊆ M is open if, for each α, φα (V ∩ Uα )
is an open set in Rn . It is straightforward to see that this defines a topology, and
furthermore the coordinate charts are homeomorphisms. To avoid pathological cases,
we make the assumption that the topological space is Hausdorff, and has a countable
basis of open sets.
With this definition one can easily see that the product M ×N of manifolds is another
manifold where dim(M × N ) = dim M + dim N . All you need to do is look at the
product φα × ψi for charts {Uα , φα }, {Vi , ψi } on M, N respectively.
Here is the definition of a smooth map between manifolds:

Definition 3 A map F : M → N of manifolds is a smooth map if for each point
x ∈ M and chart $(U_\alpha, \varphi_\alpha)$ in M with $x \in U_\alpha$ and chart $(V_i, \psi_i)$ of N with $F(x) \in V_i$,
the set $F^{-1}(V_i)$ is open and the composite function
\[ \psi_i F \varphi_\alpha^{-1} \]
on $\varphi_\alpha(F^{-1}(V_i) \cap U_\alpha)$ is a $C^\infty$ function.

A smooth map is continuous in the manifold topology.


The natural notion of equivalence between manifolds is the following:

Definition 4 A diffeomorphism F : M → N is a smooth map with smooth inverse.

The composition of two diffeomorphisms is again a diffeomorphism and in particular


Diff(M ), the set of all diffeomorphisms F : M → M is a group, but far too large to
be a Lie group.

2.2 Lie groups


Finally we can define a Lie group:

Definition 5 A Lie group G is a smooth manifold which is also a group and is such
that

• the multiplication map µ : G × G → G

• and the inversion map i : G → G

are smooth maps of manifolds.

Then we have the natural notion:

Definition 6 A Lie group homomorphism γ : G → H is a smooth map which is also


a group homomorphism.

Examples:
1. The general linear group GL(n, R) of all invertible n × n matrices is the open
subset of the $n^2$-dimensional vector space of all n × n matrices given by det A ≠ 0.
The inclusion $GL(n, \mathbf{R}) \subset \mathbf{R}^{n^2}$ is a single chart: the coordinates are the entries
$a_{ij}$, $1 \le i, j \le n$, of the matrix A. The entries of the product AB are polynomials in
$a_{ij}, b_{k\ell}$ and so µ is smooth. Similarly the determinant is a smooth function. We can
write $A^{-1} = (\det A)^{-1} \operatorname{adj} A$ where adj A, the adjugate matrix, is the transpose of the
matrix of signed cofactors. The entries of adj A are polynomials in $a_{ij}$ of degree (n − 1)
and det A is a polynomial of degree n which does not vanish on GL(n, R), so the inversion
map, a ratio of polynomials, is smooth. So G = GL(n, R) is a Lie group (noncompact since
det A is an unbounded continuous function).
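
The adjugate formula for the inverse can be checked on a small example. The following is a minimal sketch in plain Python with exact rational arithmetic; the helper names (minor, det, adjugate, inverse, matmul) and the sample matrix A are illustrative choices and are not part of the notes.

```python
from fractions import Fraction

def minor(a, i, j):
    """The matrix a with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adjugate(a):
    """Transpose of the matrix of signed cofactors."""
    n = len(a)
    if n == 1:
        return [[1]]
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]

def inverse(a):
    """A^{-1} = adj(A) / det(A): each entry is a ratio of polynomials."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is not in GL(n, R)")
    return [[Fraction(entry, d) for entry in row] for row in adjugate(a)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

# Hypothetical sample matrix with integer entries, chosen only for illustration.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
A_inv = inverse(A)
identity = [[int(i == j) for j in range(3)] for i in range(3)]
print(matmul(A, A_inv) == identity)
```

Because the arithmetic is exact, the product of A with the computed inverse can be compared with the identity matrix directly rather than up to a tolerance.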
2. A standard way to produce manifolds is via the implicit function theorem. A
precise statement is the following:

Theorem 2.1 Let $F : U \to \mathbf{R}^m$ be a $C^\infty$ function on an open set $U \subseteq \mathbf{R}^{n+m}$ and
take $c \in \mathbf{R}^m$. Assume that for each $a \in F^{-1}(c)$, the derivative
\[ DF_a : \mathbf{R}^{n+m} \to \mathbf{R}^m \]
is surjective. Then $F^{-1}(c)$ has the structure of an n-dimensional manifold. Moreover
the manifold topology is the induced topology which is therefore Hausdorff and has a
countable basis of open sets.

In this construction, functions on $\mathbf{R}^{n+m}$ which are smooth restrict to smooth functions
on $F^{-1}(c)$. The standard example is the sphere where we take $F : \mathbf{R}^{n+1} \to \mathbf{R}$ to be
$F(x) = x_1^2 + \cdots + x_{n+1}^2$ and c = 1. If n = 1 then the sphere is the set of unit complex
numbers, which is a group. Moreover complex multiplication $(x_1 + ix_2)(y_1 + iy_2)$ is
smooth, as is inversion $(x_1 + ix_2) \mapsto (x_1 - ix_2)$, so the circle is a Lie group.
The 3-sphere is also a group: it is the group SU(2) of 2 × 2 unitary matrices with
determinant one. To see this take
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
If A is unitary then the first row is a unit complex vector so $a\bar{a} + b\bar{b} = 1$. This is the
unit sphere of vectors $(a, b) \in \mathbf{C}^2$. But by unitarity $a\bar{c} + b\bar{d} = 0$ and since det A = 1,
ad − bc = 1. Solve these equations for (c, d) and we get $c = -\bar{b}$, $d = \bar{a}$ so the first row
determines everything. Multiplication and inversion $A^{-1} = A^*$ are clearly smooth.
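
The identification of the 3-sphere with SU(2) can be tested numerically. This is a rough sketch in plain Python: the function su2 builds the matrix with first row (a, b) and second row (−b̄, ā), and the particular unit vectors are arbitrary illustrative values, not taken from the notes.

```python
import cmath
import math

def su2(a, b):
    """Matrix with first row (a, b) and second row (-conj(b), conj(a))."""
    if abs(abs(a) ** 2 + abs(b) ** 2 - 1) > 1e-12:
        raise ValueError("(a, b) must be a unit vector in C^2")
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(p):
    """Conjugate transpose A^*."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol for i in range(2) for j in range(2))

IDENTITY = [[1, 0], [0, 1]]

# Two arbitrary points of the unit sphere in C^2 (illustrative values).
A = su2(cmath.exp(0.3j) * math.cos(0.7), cmath.exp(-1.1j) * math.sin(0.7))
B = su2(cmath.exp(2.0j) * math.cos(1.2), cmath.exp(0.4j) * math.sin(1.2))
AB = matmul(A, B)

det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(abs(det_A - 1) < 1e-12, close(matmul(A, dagger(A)), IDENTITY))
# The product is again determined by its first row, so it lies on the sphere.
print(close(AB, su2(AB[0][0], AB[0][1])))
```

The last check illustrates that the product of two such matrices has the same shape, which is what makes the sphere closed under the group multiplication.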
The circle S 1 and SU (2) are the simplest compact Lie groups and we shall frequently
use them as examples. The circle is abelian but SU (2) is not.
3. The product of m copies of the circle S 1 is a torus which we shall denote by T m . It
is abelian, and is important since we shall see that every compact Lie group contains
distinguished tori.
4. The standard use of Theorem 2.1 is for the group O(n) of orthogonal matrices: the
space of n × n real matrices such that $AA^T = I$. Take the vector space $\mathbf{R}^{n^2}$ of all real
n × n matrices and define the function
\[ F(A) = AA^T \]
to the vector space of symmetric n × n matrices. This has dimension n(n + 1)/2.
Then $O(n) = F^{-1}(I)$.

We have
\[ F(A + H) = AA^T + HA^T + AH^T + R(A, H) \]
where the remainder term satisfies $\|R(A, H)\|/\|H\| \to 0$ as $H \to 0$. So the derivative at A is
\[ DF_A(H) = HA^T + AH^T \]
and putting H = KA this is
\[ KAA^T + AA^TK^T = K + K^T \]
if $AA^T = I$, i.e. if $A \in F^{-1}(I)$. But given any symmetric matrix S, taking K = S/2
shows that $DF_A$ is surjective and so, applying Theorem 2.1, we find that O(n) is
a manifold. Its dimension is $n^2 - n(n + 1)/2 = n(n - 1)/2$. The smoothness of
multiplication and inversion follows from that of GL(n, R).
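
As a sanity check on this computation, here is a small numerical sketch in plain Python. It uses the fact that for F(A) = AAᵀ the remainder is exactly R(A, H) = HHᵀ, and it verifies that H = (S/2)A is mapped to S by the derivative. The matrices A, H, S and the helper names are illustrative assumptions.

```python
import math

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(p):
    return [list(col) for col in zip(*p)]

def add(p, q):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * x for x in row] for row in p]

def max_abs_diff(p, q):
    return max(abs(x - y) for r, s in zip(p, q) for x, y in zip(r, s))

def F(a):
    return matmul(a, transpose(a))

def DF(a, h):
    """Derivative of F at a in the direction h: h a^T + a h^T."""
    return add(matmul(h, transpose(a)), matmul(a, transpose(h)))

theta = 0.4
A = [[math.cos(theta), math.sin(theta)], [-math.sin(theta), math.cos(theta)]]
H = [[0.3, -1.0], [2.0, 0.5]]
t = 1e-3

# F(A + tH) - F(A) - t DF_A(H) should equal t^2 H H^T.
remainder = add(add(F(add(A, scale(t, H))), scale(-1, F(A))), scale(-t, DF(A, H)))
expected_remainder = scale(t * t, matmul(H, transpose(H)))

# Surjectivity: a symmetric S is the image of H_S = (S/2) A.
S = [[2.0, -1.0], [-1.0, 3.0]]
H_S = matmul(scale(0.5, S), A)
print(max_abs_diff(remainder, expected_remainder) < 1e-12)
print(max_abs_diff(DF(A, H_S), S) < 1e-12)
```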

Remark: The role of K in this last example illustrates a particular point about
Lie groups as manifolds. If A were the identity then the derivative would be just
$H + H^T$. What we have done to get K from H is to right-multiply by $A^{-1}$: in other
words, we have translated back from A to the identity. Here is a point which distinguishes
Lie groups from other manifolds – every point looks like every other one by translation.
In particular once we know we have a manifold, a chart in a neighbourhood of the
identity can be translated to a chart anywhere else on the manifold.

5. Here is another chart for O(n). Let S be a real skew-symmetric n × n matrix,
so $S^T = -S$. Then the eigenvalues of S are purely imaginary or zero, so −1 is not an
eigenvalue and I + S is invertible. Then
\[ ((I - S)(I + S)^{-1})^T = ((I + S)^{-1})^T (I - S)^T = (I - S)^{-1}(I + S). \]
But (I + S), (I − S) commute, so this is
\[ (I + S)(I - S)^{-1} = ((I - S)(I + S)^{-1})^{-1}. \]
So $A = (I - S)(I + S)^{-1}$ is orthogonal. Now rewrite this as S(I + A) = I − A. The
matrix I + A is invertible if A has no eigenvalue −1, and a matrix with eigenvalue −1 is a
long way from the identity, so near the identity S is uniquely determined by A and we have
a chart for a neighbourhood of the identity. By translation we get similar charts
everywhere, with local coordinates given by skew-symmetric matrices.
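
The chart above (often called the Cayley transform) lends itself to an exact check. The sketch below, in plain Python with rational arithmetic, assumes a hypothetical skew-symmetric sample S; the helper names are illustrative. It confirms that A = (I − S)(I + S)⁻¹ is orthogonal and that the same formula applied to A recovers S.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def combine(p, q, sign):
    """Entrywise p + sign * q."""
    return [[x + sign * y for x, y in zip(r, s)] for r, s in zip(p, q)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(p):
    return [list(col) for col in zip(*p)]

def inverse(p):
    """Gauss-Jordan elimination in exact arithmetic; assumes p is invertible."""
    n = len(p)
    aug = [[Fraction(x) for x in row] + unit for row, unit in zip(p, identity(n))]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def cayley(m):
    """(I - M)(I + M)^{-1}: sends skew-symmetric S to orthogonal A and back."""
    n = len(m)
    return matmul(combine(identity(n), m, -1), inverse(combine(identity(n), m, 1)))

# Hypothetical skew-symmetric sample, chosen only for illustration.
S = [[0, 1, 2], [-1, 0, 3], [-2, -3, 0]]
A = cayley(S)
print(matmul(A, transpose(A)) == identity(3))  # A is orthogonal
print(cayley(A) == S)                          # S is recovered from A
```

Solving S(I + A) = I − A for S gives S = (I − A)(I + A)⁻¹, which is why one function serves for both directions of the chart.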

Each Lie group has some natural diffeomorphisms: for a fixed element h ∈ G we have
left translation $L_h x = hx$. From the smoothness of multiplication this is a smooth
map and it is invertible with inverse $L_{h^{-1}}$ of the same form so it is a diffeomorphism.
Since $L_g L_h x = L_{gh} x$ this is a homomorphism of groups G → Diff(G). We also have
right translation $R_g x = xg$ and conjugation $x \mapsto gxg^{-1}$. Note that $R_{g^{-1}} R_{h^{-1}} x =
xh^{-1}g^{-1} = x(gh)^{-1}$ so $g \mapsto R_{g^{-1}}$ defines an action of G.

2.3 Functions and vector fields


We defined above a smooth map between manifolds. The simplest ones are smooth
maps to R. For example the trace $\operatorname{tr} A = \sum_i A_{ii}$ for any of the groups of matrices
above is a natural smooth function. For a rotation in $\mathbf{R}^3$, $\operatorname{tr} A = 1 + 2\cos\theta$ where θ
is the angle of rotation. But if a function is differentiable we have to know what its
derivative is.
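
The trace formula for rotations can be illustrated with a short sketch in plain Python. It assumes Rodrigues' rotation formula to build the rotation about a given axis, which is a standard construction but is not used in the notes; the axis and angle are arbitrary sample values.

```python
import math

def rotation(axis, theta):
    """Rotation of R^3 by angle theta about the given axis (Rodrigues' formula)."""
    norm = math.sqrt(sum(c * c for c in axis))
    u = [c / norm for c in axis]
    cross = [[0, -u[2], u[1]], [u[2], 0, -u[0]], [-u[1], u[0], 0]]
    c, s = math.cos(theta), math.sin(theta)
    return [[c * (i == j) + (1 - c) * u[i] * u[j] + s * cross[i][j]
             for j in range(3)] for i in range(3)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

theta = 0.9
R = rotation([1.0, 2.0, 2.0], theta)
print(abs(trace(R) - (1 + 2 * math.cos(theta))) < 1e-12)
```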
The most convenient way to do this is to generalize the directional derivative of a
function f of several variables
\[ \sum_i c_i \frac{\partial f}{\partial x_i} \]
– the derivative of f in the direction $(c_1, \dots, c_n)$. Since manifolds do not sit naturally
in $\mathbf{R}^N$, where a direction would be a vector in the ambient space, we define a tangent
vector to a manifold as a linear map with the same formal properties as the directional
derivative.

Definition 7 A tangent vector at a point a ∈ M is a linear map $X_a : C^\infty(M) \to \mathbf{R}$
such that
\[ X_a(fg) = f(a) X_a g + g(a) X_a f. \]

This is the formal version of the Leibnitz rule for differentiating a product, and in a
chart {U, φ} where $\varphi = (x_1, \dots, x_n)$
\[ X_a(f) = \sum_i c_i \frac{\partial f}{\partial x_i}(\varphi(a)) \stackrel{\mathrm{def}}{=} \sum_i c_i \left(\frac{\partial}{\partial x_i}\right)_a f. \]
The tangent space at a ∈ M is the vector space $T_a M$ of all tangent vectors at a and
in a coordinate system has as a basis the n tangent vectors
\[ \left(\frac{\partial}{\partial x_1}\right)_a, \dots, \left(\frac{\partial}{\partial x_n}\right)_a. \]

Defining the tangent space this way provides an abstract, coordinate-free definition
of the derivative of a map of manifolds:

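Definition 8 The derivative at a ∈ M of the smooth map F : M → N is the
homomorphism of tangent spaces
\[ DF_a : T_a M \to T_{F(a)} N \]
defined by
\[ DF_a(X_a)(f) = X_a(f \circ F). \]

Concretely, this becomes
\[ DF_a\left(\frac{\partial}{\partial x_i}\right)_a (f) = \frac{\partial}{\partial x_i}(f \circ F)(a) = \sum_j \frac{\partial F_j}{\partial x_i}(a) \frac{\partial f}{\partial y_j}(F(a)) = \sum_j \frac{\partial F_j}{\partial x_i}(a) \left(\frac{\partial}{\partial y_j}\right)_{F(a)} f. \]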

Thus the derivative of F is an invariant way of defining the Jacobian matrix. Moreover
the derivative of the composition of two maps is the composition of the derivatives.
This in coordinates is the chain rule.
A particular case is when N = R and then the derivative of a smooth function f ,
denoted df , is a linear map from Ta M to R: an element of the dual space Ta∗ M called
the cotangent space.
We need the notion of a smoothly varying family of tangent vectors, like the wind
velocity at each point of the earth’s surface. This is provided by the following

Definition 9 A vector field is a linear map $X : C^\infty(M) \to C^\infty(M)$ which satisfies
the Leibnitz rule
\[ X(fg) = f(Xg) + g(Xf) \]
so for each a ∈ M the evaluation of Xf at a gives a tangent vector. In local coordinates
\[ X(f) = \sum_i c_i(x) \left(\frac{\partial f}{\partial x_i}\right)_x \]
where each $c_i(x)$ is a smooth function.

Any object like a vector field which is defined intrinsically in a coordinate-free fashion
can be transformed by a diffeomorphism. For a vector field X and F ∈ Diff(M ) we
define a vector field F∗ X by using the derivative DFa : Ta M → TF (a) M :
(F∗ X)F (x) = DFx (Xx ).
If F∗ X = X then we say X is invariant by F . We then define

Definition 10 On a Lie group G, a vector field X is left-invariant if (Lg )∗ X = X


for each g ∈ G, where Lg is the left-translation diffeomorphism.

The left-invariant vector fields are very important. We can construct one by taking
a tangent vector Xe ∈ Te G, the tangent space at the identity e ∈ G, and defining
Xg = (DLg )Xe
for then
((Lg )∗ X)gh = DLg (Xh ) = DLg (DLh Xe ) = D(Lg ◦ Lh )Xe = DLgh (Xe ) = Xgh .
Conversely, if X is left-invariant then (Lg )∗ X = X so Xg = DLg Xe and this is true
for all g. Thus the vector space of left-invariant vector fields is isomorphic to Te G
and is a vector space of dimension dim G.

Remark: If $X_e^1, \dots, X_e^n$ is a basis for $T_e G$ then the corresponding left-invariant
vector fields $X^1, \dots, X^n$ form a basis for each tangent space $T_g G$. In particular each
$X^i$ is non-vanishing so (for compact G of positive dimension) the Euler characteristic of
G vanishes. This is the first evidence that Lie groups are very special manifolds. The
topological consequences are much stronger since we have not just one but a basis – any
vector field can be written uniquely as
\[ X = \sum_i f_i X^i \]
for globally defined smooth functions $f_i$.

Taking the example of a vector field as the wind velocity (assuming it is constant in
time) on the surface of the earth, we can go further to say that a particle at position
x moves after time t seconds to a position φt (x). After a further s seconds it is at

φt+s (x) = φs (φt (x)).

What we get this way is a homomorphism of groups from the additive group R to
Diff(M ). Since Diff(M ) is not a Lie group we can’t use our definition of a smooth
map but the technical definition is the following:

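Definition 11 A one-parameter group of diffeomorphisms of a manifold M is a
smooth map
\[ \varphi : M \times \mathbf{R} \to M \]
such that (writing $\varphi_t(x) = \varphi(x, t)$)

• $\varphi_t : M \to M$ is a diffeomorphism

• $\varphi_0 = \mathrm{id}$

• $\varphi_{s+t} = \varphi_s \circ \varphi_t$.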

To a one-parameter group of diffeomorphisms we can associate a vector field: given
$\varphi_t$ and f a smooth function, then for each a ∈ M
\[ f(\varphi_t(a)) \]
is a smooth function of t and we write
\[ \frac{\partial}{\partial t} f(\varphi_t(a))\Big|_{t=0} = X_a(f). \]
It is straightforward to see that, differentiating a product with respect to t, the
Leibnitz rule holds and since φ0 (a) = a this is a tangent vector at a. So as a = x
varies we have a vector field. In local coordinates we have

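\[ \varphi_t(x_1, \dots, x_n) = (y_1(x, t), \dots, y_n(x, t)) \]
and
\[ \frac{\partial}{\partial t} f(y_1, \dots, y_n) = \sum_i \frac{\partial f}{\partial y_i}(y) \frac{\partial y_i}{\partial t}(x)\Big|_{t=0} = \sum_i c_i(x) \frac{\partial f}{\partial x_i}(x) \]
which yields the vector field
\[ X = \sum_i c_i(x) \frac{\partial}{\partial x_i}. \]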

We now want to reverse this: go from the vector field to the diffeomorphism. The
first point is to track the “trajectory” of a single particle.

Definition 12 An integral curve of a vector field X is a smooth map of an interval
$\gamma : (\alpha, \beta) \subset \mathbf{R} \to M$ such that
\[ D\gamma\left(\frac{d}{dt}\right) = X_{\gamma(t)}. \]

In a coordinate chart (U, ψ) around a, if
\[ X = \sum_i c_i(x) \frac{\partial}{\partial x_i} \]
then the equation for γ to define an integral curve can be written as the system of
ordinary differential equations
\[ \frac{dx_i}{dt} = c_i(x_1, \dots, x_n). \]
The existence and uniqueness theorem for ODEs asserts that there is some interval
on which there is a unique solution with initial condition
\[ (x_1(0), \dots, x_n(0)) = \psi(a). \]

A further theorem on ODEs says that in a suitably small neighbourhood the solution
has smooth dependence on the initial conditions. The curve γ then depends on the
initial point x and we write γ(t) = φt (x) where (t, x) 7→ φt (x) is smooth. Now
consider φt ◦ φs (x). If we fix s and vary t, then this is the unique integral curve of X
through φs (x). But φt+s (x) is an integral curve which at t = 0 passes through φs (x).
By uniqueness they must agree so that φt ◦ φs = φt+s . (Note that φt ◦ φ−t = id shows
that we have a diffeomorphism wherever it is defined).
Our conclusion is that we have the group law for these diffeomorphisms but only for
s, t, x in a small neighbourhood.

Example: Take M to be the one-dimensional manifold (0, ∞) ⊂ R and X = d/dx.
Then the ODE is
\[ \frac{dx}{dt} = 1 \]
and so $\varphi_t(x) = t + x$ and this is only defined for t > −x. On the other hand if we
take X = x d/dx we have
\[ \frac{dx}{dt} = x \]
and $\varphi_t(x) = e^t x$ which is positive if x > 0 and so is defined for all t.
In fact λx d/d(λx) = x d/dx for any λ > 0. If we regard (0, ∞) as a Lie group with
multiplication of positive reals as the operation, then this says that X is left-invariant.
Moreover if we set x = 1, the identity element, then the integral curve $\varphi_t(1) = e^t$ is a
smooth map from R to (0, ∞) which is a group homomorphism.
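
The two flows in this example can be reproduced numerically. The following sketch integrates dx/dt = c(x) with the classical fourth-order Runge-Kutta method, which is an assumption of this illustration and not a method used in the notes; the function names and the sample values of s, t, x0 are hypothetical.

```python
import math

def flow(c, x, t, steps=2000):
    """Approximate phi_t(x) for dx/dt = c(x) with classical Runge-Kutta steps."""
    h = t / steps
    for _ in range(steps):
        k1 = c(x)
        k2 = c(x + h * k1 / 2)
        k3 = c(x + h * k2 / 2)
        k4 = c(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

def constant_field(x):
    """Coefficient of X = d/dx."""
    return 1.0

def scaling_field(x):
    """Coefficient of X = x d/dx."""
    return x

s, t, x0 = 0.7, -1.3, 2.5
shifted = flow(constant_field, x0, t)   # should be close to x0 + t
scaled = flow(scaling_field, x0, t)     # should be close to exp(t) * x0
composed = flow(scaling_field, flow(scaling_field, x0, s), t)
direct = flow(scaling_field, x0, s + t)

print(abs(shifted - (x0 + t)) < 1e-9, abs(scaled - math.exp(t) * x0) < 1e-9)
print(abs(composed - direct) < 1e-9)    # phi_t composed with phi_s is phi_{s+t}
```

With x0 = 1 the same comparison expresses the homomorphism property of the integral curve through the identity.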

The behaviour in the example above holds more generally:

Theorem 2.2 Let G be a Lie group and X a left-invariant vector field. Then

• the integral curve $\varphi_t(a)$ through a ∈ G exists for all t ∈ R

• when a = e (the identity element)
\[ \varphi_t(e) : \mathbf{R} \to G \]
is a group homomorphism

• any Lie group homomorphism γ : R → G arises this way

Proof: (i) First note that if $\varphi_t(a)$ is an integral curve for X through a ∈ G then
$g\varphi_t(a)$ is an integral curve through ga. This holds because
\[ D\varphi_t\left(\frac{d}{dt}\right) = X_{\varphi_t(a)} \]
and so, because X is left-invariant,
\[ (DL_g)_{\varphi_t(a)} \circ D\varphi_t\left(\frac{d}{dt}\right) = (DL_g)_{\varphi_t(a)}(X_{\varphi_t(a)}) = X_{g\varphi_t(a)}. \]
But by the chain rule the left hand side is
\[ D(L_g\varphi_t)\left(\frac{d}{dt}\right) \]
which shows that $L_g\varphi_t$ is an integral curve, and since $\varphi_0(a) = a$, $L_g\varphi_0(a) = ga$.

This gives us two facts: the chart on which we have local existence and uniqueness
can be shifted around by left translation to give the same result in a neighbourhood
of any point; and if t ∈ (−ϵ, ϵ) is the interval on which the existence theorems work,
the same interval works for any of these neighbourhoods.
Consider now the curve
\[ \psi_t = \varphi_{\epsilon/2}(a)\, a^{-1} \varphi_{t-\epsilon/2}(a). \]
This is well-defined for t ∈ (−ϵ/2, 3ϵ/2) and when t = ϵ/2, $\psi_t = \varphi_{\epsilon/2}(a) a^{-1} a =
\varphi_{\epsilon/2}(a)$. It defines an integral curve because it is a left-translate of an integral curve.
However, it agrees with $\varphi_t$ at t = ϵ/2 and so by uniqueness it extends the solution
to the larger interval (−ϵ, 3ϵ/2). Continuing in this way at both ends of the interval,
suppose that (α, β) is a maximal interval on which $\varphi_t(a)$ is defined. Then if β is finite
we have a solution on (α, β − ϵ/4) but then the same argument extends it to (α, β + ϵ/4)
giving a contradiction. The same reasoning applies at the lower end α, so the integral
curve is defined for all t ∈ R.

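(ii) If $\varphi_t(e)$ is the integral curve through e, then consider $\varphi_s(e)\varphi_t(e)$ (group
multiplication) for fixed s. Since $\varphi_0(e) = e$, $\varphi_s(e)\varphi_t(e)$ and $\varphi_{s+t}(e)$ agree at t = 0.
Moreover $\varphi_s(e)\varphi_t(e)$ is the left translate by $g = \varphi_s(e)$ of the integral curve $\varphi_t(e)$
and is therefore an integral curve of X by part (i). Since $\varphi_{s+t}(e)$ is also an integral
curve in t, from uniqueness we must then have $\varphi_s(e)\varphi_t(e) = \varphi_{s+t}(e)$, so $t \mapsto \varphi_t(e)$
is a group homomorphism.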

(iii) If γ : R → G is a Lie group homomorphism then $R_{\gamma(t)^{-1}}$ (the right action
$\varphi_t(x) = x\gamma^{-1}(t)$) is a one-parameter group of diffeomorphisms. Since (gx)h = g(xh),
left and right actions commute. So the one-parameter group $\varphi_t$ commutes with all
left translations. The vector field obtained by differentiation with respect to t at t = 0
is therefore left-invariant. From part (ii) we see that the integral curve through $\varphi_s(e)$
is $\varphi_s(e)\varphi_t(e)$, i.e. right multiplication by the group, so this is obtained by the same
process.
□

3 Lie groups and Lie algebras

3.1 The Lie bracket


We saw in the previous section how a one-parameter group of diffeomorphisms on a
manifold gives rise to a vector field by
\[ \frac{\partial}{\partial t} f(\varphi_t(x))\Big|_{t=0} = Xf(x). \]
We can do the same with vector fields, namely use the diffeomorphism $\varphi_t$ to transform
the vector field Y into $(\varphi_t)_* Y$ and differentiate at t = 0. If this sounds worrying –
differentiating a map from R into the infinite-dimensional space of vector fields – just
note that these are vector fields with a parameter t so we are just differentiating a
tangent vector at each point:
\[ \sum_i \frac{\partial c_i(x, t)}{\partial t} \frac{\partial f}{\partial x_i}. \]

The result is called the Lie derivative of Y in the direction X, denoted by LX Y . From
the action of diffeomorphisms on vector fields we have
(φt )∗ Y (f (φt (x))) = (Y f )(φt (x))
and differentiating at t = 0 we obtain
(LX Y )f + Y Xf = XY f.
Thus, as operators on the space of smooth functions we have LX Y = XY − Y X.
This is also denoted by [X, Y ] and called the Lie bracket of two vector fields. (One
can also check directly that [X, Y ] satisfies the Leibnitz formula which characterizes
vector fields, but we have given here a more general context for this.)

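Example: In $\mathbf{R}^n$ we have
\[ \left[\frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j}\right] f = \frac{\partial^2 f}{\partial x_i \partial x_j} - \frac{\partial^2 f}{\partial x_j \partial x_i} = 0 \]
for all f so the Lie bracket of these standard vector fields vanishes.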

Remark: If [X, Y ] = 0 more generally, then ∂(φs+t )∗ Y /∂t = 0 at t = 0 so ((φs )∗ Y )a


is the value Y at φs (a). In other words the one parameter group φt defined by X
preserves the vector field Y . This means the integral curve ψt (φs (a)) of Y through
φs (a) is the transform φs (ψt (a)) of the integral curve through a. In other words the
two one parameter groups commute, at least where they are both defined.

Expanding all terms [X, Y ] = XY − Y X and cancelling pairs one can easily see that

[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y ]] = 0.
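
The cancellation can be watched in a finite-dimensional stand-in: square matrices are linear operators, and their commutator satisfies the same identity for the same formal reason. The sketch below is plain Python with integer matrices; the three sample matrices and the helper names are arbitrary illustrative choices.

```python
def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def add(p, q):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(p, q)]

def sub(p, q):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(p, q)]

def bracket(x, y):
    """Commutator [X, Y] = XY - YX."""
    return sub(matmul(x, y), matmul(y, x))

def jacobi(x, y, z):
    """[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]]."""
    total = add(bracket(x, bracket(y, z)), bracket(y, bracket(z, x)))
    return add(total, bracket(z, bracket(x, y)))

X = [[1, 2, 0], [0, 1, -1], [3, 0, 2]]
Y = [[0, 1, 1], [2, -1, 0], [1, 0, 4]]
Z = [[2, 0, 1], [1, 1, 0], [0, -3, 1]]
ZERO = [[0] * 3 for _ in range(3)]

print(jacobi(X, Y, Z) == ZERO)
```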

Now suppose we consider a Lie group and X, Y are left-invariant vector fields. In
Theorem 2.2 we saw that the one-parameter group for X consists of right multiplica-
tion so (φt )∗ Y is left-invariant, and differentiating at t = 0 we see that the Lie bracket
[X, Y ] is also left-invariant.

Definition 13 A Lie algebra is a vector space V with a map [ , ] : V × V → V such
that

• the bracket [ , ] is bilinear

• [X, Y ] = −[Y, X]

• Jacobi’s identity holds: [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y ]] = 0.

So we see that the left-invariant vector fields on a Lie group G form a finite-dimensional
Lie algebra $\mathfrak{g}$.

Example: Take G = GL(n, R). This is an open set in $\mathbf{R}^{n^2}$ and so a tangent vector
at any point can be considered as an element of this vector space. In particular, the
Lie algebra of G is isomorphic to the tangent space at the identity $T_I$ so it may be
considered as the space of all n × n matrices. We need the Lie bracket, so take the
matrix C to lie in the tangent space at the identity. Then AC, the left translate by
A, is the value of the corresponding left-invariant vector field Y at A ∈ GL(n, R).
To compute the Lie bracket we need to right multiply by $\gamma^{-1}$ for a homomorphism
γ : R → G.
For an n × n matrix B define
\[ \exp tB = I + tB + \frac{t^2}{2!} B^2 + \frac{t^3}{3!} B^3 + \dots \]
Using the usual estimates ∥AB∥ ≤ ∥A∥∥B∥ one can show that this is a smooth map
from R to the space of matrices and it is invertible with inverse exp(−tB). It is a Lie
group homomorphism γ : R → GL(n, R) and
\[ \frac{\partial}{\partial t} \exp tB\Big|_{t=0} = B. \]

Right multiplying by $\gamma^{-1}$ gives AC exp(−tB) so this is the value of the vector field
$(\varphi_t)_* Y$ at A exp(−tB) where the diffeomorphism $\varphi_t$ is $\varphi_t(g) = g\gamma^{-1}(t)$.
Left-translating back to the identity gives
\[ (A \exp(-tB))^{-1} AC \exp(-tB) = (\exp tB) C (\exp(-tB)) = C + t(BC - CB) + \dots \]
Differentiating at t = 0 we see that $L_X Y = [X, Y]$ is the left translate of the
commutator BC − CB.
So for GL(n, R) the Lie algebra is the space of n × n matrices with [A, B] = AB − BA.
Virtually all our examples are subgroups of GL(n, R) where the same picture holds.
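
The first-order term of (exp tB) C (exp(−tB)) can be checked numerically. This is a rough sketch in plain Python: expm sums a truncated exponential series (40 terms, an arbitrary cut-off that is ample for small matrices with modest entries), and the derivative at t = 0 is estimated by a central difference. The matrices B and C and all helper names are illustrative assumptions.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def add(p, q):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * x for x in row] for row in p]

def max_abs_diff(p, q):
    return max(abs(x - y) for r, s in zip(p, q) for x, y in zip(r, s))

def expm(b, t, terms=40):
    """Partial sum of I + tB + (tB)^2/2! + ... (truncated series)."""
    result = identity(len(b))
    term = identity(len(b))
    for k in range(1, terms):
        term = scale(t / k, matmul(term, b))
        result = add(result, term)
    return result

def conjugate(b, c, t):
    """exp(tB) C exp(-tB)."""
    return matmul(matmul(expm(b, t), c), expm(b, -t))

B = [[0.0, 1.0, 0.5], [-1.0, 0.2, 0.0], [0.3, 0.0, -0.4]]
C = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0], [0.0, 1.5, 0.3]]

h = 1e-5
derivative = scale(1 / (2 * h), add(conjugate(B, C, h), scale(-1, conjugate(B, C, -h))))
commutator = add(matmul(B, C), scale(-1, matmul(C, B)))
print(max_abs_diff(derivative, commutator) < 1e-6)
# exp((s + t)B) = exp(sB) exp(tB): the homomorphism property of t -> exp tB
print(max_abs_diff(expm(B, 0.8), matmul(expm(B, 0.3), expm(B, 0.5))) < 1e-10)
```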

3.2 Examples of Lie groups


1. SL(n, R)
The special linear group is the group of n × n matrices with determinant 1. It is
non-compact since $\operatorname{tr}(\operatorname{diag}(\lambda, \lambda^{-1}, 1, \dots, 1)) = \lambda + \lambda^{-1} + n - 2$ so tr : G → R is an
unbounded continuous function. To prove this is a manifold we view it as $\det^{-1}(1)$ for
the real-valued smooth function det on the space of n × n matrices. To use the implicit
function theorem, as in the case of O(n), it suffices to work at the identity and
\[ \det(I + H) = 1 + \operatorname{tr} H + R(H) \]
for a remainder term where |R(H)|/∥H∥ → 0 as H → 0. The tangent space at the identity
is the kernel of the derivative map, the space of matrices of trace 0, and this is the Lie
algebra. The Lie bracket is again the commutator of the matrices. We could also use
the complex numbers to obtain SL(n, C).
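
The expansion of det(I + H) can be illustrated for 3 × 3 matrices. In this plain-Python sketch the direction H0 is an arbitrary sample matrix, H = t·H0 is shrunk towards zero, and the ratio |R(H)|/∥H∥ (with the Euclidean norm of the entries, one possible choice of norm) is seen to decrease with t.

```python
import math

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def remainder_ratio(h0, t):
    """|R(H)| / ||H|| for H = t * h0, where R(H) = det(I + H) - 1 - tr H."""
    h = [[t * x for x in row] for row in h0]
    i_plus_h = [[h[r][c] + (r == c) for c in range(3)] for r in range(3)]
    trace = sum(h[r][r] for r in range(3))
    norm = math.sqrt(sum(x * x for row in h for x in row))
    return abs(det3(i_plus_h) - 1 - trace) / norm

# Hypothetical direction in the space of 3 x 3 matrices.
H0 = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0], [2.0, 1.0, 1.0]]
ratios = [remainder_ratio(H0, t) for t in (1e-1, 1e-2, 1e-3)]
print(ratios[0] > ratios[1] > ratios[2])
```

The ratio shrinks roughly in proportion to t, as expected for a remainder that begins with quadratic terms in the entries of H.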

2. SO(n)
The determinant of an orthogonal matrix is ±1 and det is continuous, so O(n) is
not connected as a topological space. The special orthogonal group SO(n) is the
subgroup with determinant +1. In fact SO(n) is connected and is the connected
component of O(n) which contains the identity. This can be seen by using the
eigenvalues of A ∈ SO(n), which are either ±1 or occur in complex conjugate pairs $e^{\pm i\theta_j}$.
If A preserves a subspace $V \subset \mathbf{R}^n$ then it preserves the orthogonal complement.
Suppose it has a complex eigenvalue $e^{i\theta_j}$ with eigenvector v. Then there is a real
invariant 2-dimensional subspace spanned by the real and imaginary parts of v and
A acts as
\[ \begin{pmatrix} \cos\theta_j & \sin\theta_j \\ -\sin\theta_j & \cos\theta_j \end{pmatrix}. \]
Then
\[ \begin{pmatrix} \cos t\theta_j & \sin t\theta_j \\ -\sin t\theta_j & \cos t\theta_j \end{pmatrix} \]
with t ∈ [0, 1] connects A in SO(n) to a matrix which is the identity on this subspace.
Continuing this way, we end up with A′ ∈ SO(n) with eigenvalues ±1. But det A′ = 1
so there is an even number of −1s. Taking $\theta_j = \pi$ in the above, the same argument
works for each 2-dimensional subspace of the eigenspace with eigenvalue −1. So we can
connect to the identity.
The Lie algebra is the space of skew symmetric matrices with commutator as bracket.
This group is compact: in fact each row of an orthogonal matrix has unit length and
so O(n) is a closed subspace of the product of spheres $S^{n-1} \times \cdots \times S^{n-1} \subset \mathbf{R}^{n^2}$ which
is closed and bounded hence compact.
Instead of the Euclidean positive definite bilinear form ( , ) we can take one with
p positive terms and q negative ones, and then the group of matrices such that
(Ax, Ay) = (x, y) is denoted O(p, q). This group is noncompact and there may be
more than two components in this case.

3. Sp(2m, R)
This, the real symplectic group, is the group of 2m × 2m matrices preserving a
nondegenerate skew-symmetric bilinear form ( , ), i.e. (Ax, Ay) = (x, y). It is noncompact.
If $(x, y) = \sum B_{ij} x_i y_j$ for some basis and skew-symmetric matrix B, then C is in the
Lie algebra if (Cx, y) + (x, Cy) = 0 or $C^T B + BC = 0$. But
\[ (BC)^T = C^T B^T = -C^T B = BC \]
and hence the Lie algebra is isomorphic to the space of symmetric 2m × 2m matrices
S. If S = BC, T = BD then the Lie bracket is B(CD − DC).
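
As an illustration of this description, the sketch below works with m = 2 and assumes the standard block matrix B = [[0, I], [−I, 0]] for the skew form, which is one possible choice and is not fixed by the notes. For this B one has B² = −I, so C = −BS solves BC = S. The symmetric matrices S and T are arbitrary integer samples.

```python
def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(p):
    return [list(col) for col in zip(*p)]

def add(p, q):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(p, q)]

def neg(p):
    return [[-x for x in row] for row in p]

# Assumed standard skew form on R^4; note that B * B = -I.
B = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
ZERO = [[0] * 4 for _ in range(4)]

def from_symmetric(s):
    """The matrix C with BC = s, namely C = -B s for this choice of B."""
    return neg(matmul(B, s))

def in_lie_algebra(c):
    """Test the condition C^T B + B C = 0."""
    return add(matmul(transpose(c), B), matmul(B, c)) == ZERO

def bracket(c, d):
    return add(matmul(c, d), neg(matmul(d, c)))

S = [[1, 2, 0, 1], [2, 0, 3, -1], [0, 3, 4, 2], [1, -1, 2, 5]]
T = [[0, 1, 1, 0], [1, 2, 0, 3], [1, 0, -1, 1], [0, 3, 1, 2]]
C, D = from_symmetric(S), from_symmetric(T)
E = bracket(C, D)

print(in_lie_algebra(C), in_lie_algebra(D), in_lie_algebra(E))
print(matmul(B, E) == transpose(matmul(B, E)))  # B(CD - DC) is symmetric
```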

4. U(m)
The unitary group is the group of unitary m × m matrices, i.e. $A^* = \bar{A}^T = A^{-1}$.
This is compact and connected by a similar argument to SO(n). Its Lie algebra is
the space of skew-Hermitian matrices $C^* = -C$. The determinant is now a Lie group
homomorphism to the unit complex numbers, another Lie group.

5. SU(m)
The special unitary group is the subgroup of U(m) for which det A = 1. Its Lie algebra
is the space of skew-Hermitian matrices of trace zero.

3.3 The adjoint representation
If V is a vector space (real or complex) then Aut(V ) is the group of invertible linear
transformations from V to V . By choosing a basis it is isomorphic to either GL(n, R)
or GL(n, C) where n = dim V .

Definition 14 A representation of a Lie group G on V is a Lie group homomorphism
G → Aut(V ).

All our examples above come with a particular representation – they are defined as
subgroups of Aut(V ) for V = R^n or C^n for some n. But for any Lie group we can
take V = g to be the space of left-invariant vector fields and the action of right
translation makes this a representation of G. It is called the adjoint representation.
For GL(n, R) it has dimension n^2 – much bigger than the defining representation –
but for some groups (especially one called E8 of dimension 248) it is the smallest
non-trivial representation. The action of g on X ∈ g is denoted by
Ad g(X).
Since X is left-invariant we can also consider the right action as the conjugation
action:
Ad g(X) = (L_g)_* (R_{g^{−1}})_* X
and conjugation C_g(x) = gxg^{−1} fixes the identity so identifying g with T_e the adjoint
action can also be written
DC_g : T_e → T_e.
In particular, if γ : R → G is a homomorphism with tangent vector X at e, then
gγg^{−1} has tangent vector Ad g(X).
In this case the representation space has the extra structure of the Lie bracket and
we need to know whether Ad(g) preserves this. So given left-invariant vector fields
X, Y let γ be the homomorphism determined by X then
\[ [\mathrm{Ad}\, g(X), \mathrm{Ad}\, g(Y)] = \frac{\partial}{\partial t} (R_{(g\gamma g^{-1})^{-1}})_* (R_{g^{-1}})_* Y \Big|_{t=0} = \frac{\partial}{\partial t} (R_{g^{-1}})_* (R_{\gamma^{-1}})_* Y \Big|_{t=0} \]
and the right hand term is
\[ (R_{g^{-1}})_* \frac{\partial}{\partial t} (R_{\gamma^{-1}})_* Y \Big|_{t=0} = \mathrm{Ad}\, g([X, Y]). \]
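
For a matrix group the statement that Ad g preserves the bracket can be tested directly, using the formula Ad A(X) = AXA^{−1} for GL(n, R). The sketch below uses integer 2 × 2 matrices so that the arithmetic is exact; the particular A, X, Y are arbitrary choices and the function names are ours.

```python
def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def bracket(p, q):
    """Commutator [P, Q] = PQ - QP."""
    pq, qp = matmul(p, q), matmul(q, p)
    return [[pq[i][j] - qp[i][j] for j in range(len(p))] for i in range(len(p))]

def Ad(a, a_inv, x):
    """Adjoint action of a matrix group: X -> A X A^{-1}."""
    return matmul(matmul(a, x), a_inv)

A = [[1, 1], [0, 1]]        # invertible, with integer inverse
A_inv = [[1, -1], [0, 1]]
X = [[1, 2], [3, 4]]
Y = [[0, 1], [5, -2]]

lhs = bracket(Ad(A, A_inv, X), Ad(A, A_inv, Y))   # [Ad A(X), Ad A(Y)]
rhs = Ad(A, A_inv, bracket(X, Y))                 # Ad A([X, Y])
```

The two sides agree entry by entry.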

Consider now the homomorphism of Lie groups Ad : G → Aut(g). The group Aut(g)
is just the invertible linear transformations from g to itself. The tangent space at I is
the space End(g) of all linear transformations of g. The derivative at the identity of Ad is then a
linear map
D Ad_e : T_e (≅ g) → End(g).
The Lie bracket [X, Y ] of left-invariant vector fields is the derivative at t = 0 of
(R_{γ^{−1}})_* Y where γ is the one-parameter subgroup which is an integral curve of X.
Since the adjoint action is right translation on left-invariant vector fields, (D Ad_e)(X_e)
applied to Y is the derivative at t = 0 of (R_{γ^{−1}})_* Y, which is [X, Y ]. This
endomorphism of the Lie algebra is denoted by ad X so that

ad X(Y ) = [X, Y ].

Examples:
1. If G = GL(n, R) then the Lie algebra is the space of all n × n matrices X and
Ad A(X) = AXA^{−1}. And then ad X(Y ) = XY − Y X.
2. Take G = SO(3), the Lie algebra is the 3-dimensional space of all 3 × 3 skew
symmetric matrices. Recall the vector cross-product

a × b = (∥a∥∥b∥ sin θ)n

where n is a unit vector orthogonal to a and b and such that a, b, n has a right-
handed orientation and θ is the angle between a and b. If A ∈ SO(3), it preserves
lengths, angles and orientation and so

A(a × b) = (Aa × Ab).

Now x ↦ a × x is skew-symmetric with respect to the symmetric bilinear form x · y
since
(a × x) · y = [a, x, y]
which is skew in all its entries, so we can describe the Lie algebra as vectors a ∈ R^3.
Then for A ∈ SO(3)

Ad A(a)(x) = A(a × A^{−1}x) = Aa × AA^{−1}x = Aa × x

and the adjoint action is just the usual action on vectors. So

ad a(x) = a × x.
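
The identification of so(3) with R^3 can be illustrated with exact integer arithmetic. In this sketch hat(a) denotes the skew-symmetric matrix of x ↦ a × x (our notation, not the notes'), and A is a quarter turn about the third axis, chosen only as an example.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def apply(m, v):
    """Matrix times vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def hat(a):
    """Skew-symmetric matrix with hat(a) x = a x x (cross product)."""
    return [[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(p, q):
    pq, qp = matmul(p, q), matmul(q, p)
    return [[pq[i][j] - qp[i][j] for j in range(3)] for i in range(3)]

A = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotation by a quarter turn about the z-axis
a, b = [1, 2, 3], [4, 5, 6]

rotation_respects_cross = apply(A, cross(a, b)) == cross(apply(A, a), apply(A, b))
bracket_is_cross = commutator(hat(a), hat(b)) == hat(cross(a, b))
```

The first check is A(a × b) = Aa × Ab; the second says that the commutator of skew-symmetric matrices corresponds to the cross product, so ad a(x) = a × x.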

We saw above that Ad g preserves the Lie bracket on g. This is part of a more general
fact:

Proposition 3.1 Let φ : G → H be a Lie group homomorphism, then the linear map
Dφ_e : T_e G → T_e H preserves the Lie bracket.

In other words Dφ_e defines a homomorphism of Lie algebras g → h. The case of Ad
is the conjugation homomorphism φ : G → G for φ(x) = gxg^{−1}.

Proof: Take Y_e ∈ T_e G. Then (L_g)_* Y_e is the value at g of the left-invariant vector field Y defined by
Y_e. Given X_e ∈ T_e G let γ be the one-parameter subgroup tangent to X_e at e. Then
at g ∈ G by definition
\[ [X, Y]_g = \frac{\partial}{\partial t} (R_{\gamma^{-1}})_* (L_g)_* Y_e \Big|_{t=0}. \]
Now apply Dφ. The chain rule for derivatives shows that
\[ D\varphi([X, Y]_g) = \frac{\partial}{\partial t} (R_{\varphi(\gamma)^{-1}})_* (L_{\varphi(g)})_* D\varphi_e(Y_e) \Big|_{t=0}. \qquad (1) \]
Then (L_{φ(g)})_* Dφ_e(Y_e) is the value at the point φ(g) of the left-invariant vector field Ỹ on H
defined by Dφ_e(Y_e).
Evaluating (1) at t = 0 we get on the right hand side [X̃, Ỹ ]_{φ(g)} where X̃ is the left-invariant vector
field with tangent Dφ_e(X_e) ∈ T_e H. Take g = e and we obtain
Dφ([X, Y ]_e) = [X̃, Ỹ ]_e
as required. □

Remark: In the case of the adjoint representation Ad : G → Aut(g) this derivative
was ad. The Lie bracket on End(g) is just the commutator of linear transformations
so from Proposition 3.1 we must have
ad X ad Y − ad Y ad X = ad[X, Y ].
Apply this to Z and it is equivalent to the Jacobi identity.
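
For the reader who wants the step spelled out, here is the expansion on an element Z, using only ad X(Y ) = [X, Y ] and the antisymmetry of the bracket:

```latex
\begin{align*}
(\operatorname{ad} X \operatorname{ad} Y - \operatorname{ad} Y \operatorname{ad} X)(Z)
  &= [X,[Y,Z]] - [Y,[X,Z]] = [X,[Y,Z]] + [Y,[Z,X]], \\
\operatorname{ad}[X,Y](Z) &= [[X,Y],Z] = -[Z,[X,Y]].
\end{align*}
% Equating the two right hand sides gives the Jacobi identity:
\[ [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0. \]
```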

3.4 The exponential map


We defined the exponential of a matrix above:
\[ \exp B = I + B + \frac{1}{2!} B^2 + \frac{1}{3!} B^3 + \cdots \]
This is a smooth function from the space of n × n matrices to the invertible ones:
from the Lie algebra of GL(n, R) to the Lie group. There is a generalization of this
to an arbitrary Lie group:

Definition 15 The exponential map for a Lie group G is the map exp : T_e G → G
defined by
exp(X_e) = γ(1)
where γ(t) is the one-parameter subgroup with tangent vector X_e at the identity.

The map exp is smooth. More explicitly it is the composition of X_e ↦ (1, e, X_e) ∈
R × G × T_e with (t, g, X_e) ↦ L_g γ_X(t) = gγ_X(t) and the latter is smooth by smooth dependence
of solutions to ODEs on initial conditions.
It is also a local diffeomorphism by the inverse function theorem, for the derivative
at 0 ∈ T_e in the direction X_e is by definition X_e, so D exp_0 = id. It gives therefore
a natural local coordinate system near the identity and by translation we get similar
coordinate neighbourhoods at all points.

Examples:
1. For the unit complex numbers, S^1, the exponential map is x ↦ e^{2πix}. This is a
diffeomorphism onto its image for x ∈ (0, 1) but of course exp^{−1}(1) = Z ⊂ R, so it is not a global
diffeomorphism. It is surjective however.
2. If G = SU(2), the exponential map is surjective (as in fact for any compact connected Lie
group) but note that skew-Hermitian matrices of the form
\[ B = \begin{pmatrix} ia & b \\ -\bar{b} & -ia \end{pmatrix} \]
with a real satisfy B^2 = −I if a^2 + b b̄ = 1. So, following the usual algebra of e^{iπ} = −1, we obtain
exp πB = −I and the inverse image of −I ∈ SU(2) is a 2-dimensional sphere inside
a 3-dimensional sphere.
3. If A = exp B then (exp(B/2))^2 = A so A has a square root. If G = SL(2, R)
and A = diag(λ, λ^{−1}) where λ is negative and not −1, then A has no square root and
hence is not in the image of exp. This can be seen by looking at
\[ C = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
with ad − bc = 1. If C^2 = A then b(a + d) = 0 and a^2 + bc = λ. If b = 0 this gives
a^2 = λ < 0, which is impossible, so b ≠ 0. Hence a + d = 0 and
1 = ad − bc = −a^2 − bc = −λ, contradicting λ ≠ −1.
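
The SU(2) example can be checked numerically by summing the exponential series. This is only a sketch: expm is our own truncated series (the cutoff of 60 terms is an arbitrary choice that is ample for a matrix of this size), and the values a = 0.6, b = 0.8 are just one point with a^2 + b b̄ = 1.

```python
import math

def matmul(p, q):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def expm(m, terms=60):
    """Partial sum I + M + M^2/2! + ... of the exponential series."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        # term becomes M^n / n! by multiplying the previous term by M and dividing by n
        term = [[x / n for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

a, b = 0.6, 0.8                     # a real, a^2 + |b|^2 = 1
B = [[1j * a, b], [-b, -1j * a]]    # skew-Hermitian; b is real here so conj(b) = b
pi_B = [[math.pi * x for x in row] for row in B]

E = expm(pi_B)
# Largest deviation of exp(pi B) from -I.
error = max(abs(E[i][j] - (-1 if i == j else 0)) for i in range(2) for j in range(2))
```

Up to rounding, the result is −I, in line with B^2 = −I and e^{iπ} = −1.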

The exponential map intertwines the associated homomorphisms of Lie groups and
Lie algebras:

Proposition 3.2 Let φ : G → H be a Lie group homomorphism, and let Dφ_e :
T_e G → T_e H be its derivative at the identity. Then
exp(Dφ_e(X_e)) = φ(exp X_e).

Proof: A one-parameter subgroup with tangent X_e maps under φ to a one-parameter
subgroup with tangent Dφ_e(X_e). □

It also tells us that a homomorphism φ : G → H of Lie groups, with G connected, is determined by Dφ_e.
This needs another result:

Proposition 3.3 Let V ⊂ G be an open set containing the identity, and suppose G
is connected. Then every element of G is the product of a finite number of elements
in V , together with their inverses.

Proof: Since left translation is a homeomorphism each set gV is open. Let F ⊆ G
be the subset obtained from products and inverses in V , and g ∈ F . Then gV ⊆ F
since we are multiplying on the right by a further element of V . This shows F is
open. Now if g lies in the closure of F , the open set gV intersects F in some point h,
but then g = hv^{−1} for some v ∈ V and g ∈ F . Since F is open and closed and G is
connected F = G. □

It follows that a homomorphism φ : G → H is determined uniquely by its restriction
to V . But there exists V such that the exponential map is a diffeomorphism from an
open set in T_e onto V and further exp(Dφ_e(X_e)) = φ(exp X_e). This means that φ on
V is uniquely determined by Dφ_e.

Remark: The above result does not say that a Lie algebra homomorphism g → h
defines a Lie group homomorphism G → H. For example, take the following basis
for the Lie algebra su(2):
\[ X = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \]
These satisfy [X, Y ] = 2Z, [Y, Z] = 2X, [Z, X] = 2Y . Recall that the Lie algebra
so(3) could be identified with R^3 and the action x ↦ a × x and then [a, b] = a × b.
So putting X = 2i, Y = 2j, Z = 2k using the standard basis for R^3, we get an
isomorphism of Lie algebras.
But SO(3) is not isomorphic to SU(2): the matrix −I ∈ SU(2) commutes with
everything but no rotation in SO(3) other than the identity does.
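
The bracket relations for this basis, and the matching relations for the cross product, are easy to confirm by machine. The sketch below uses Python's built-in complex numbers; the helper names are ours.

```python
def matmul(p, q):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def bracket(p, q):
    """Commutator [P, Q] = PQ - QP."""
    pq, qp = matmul(p, q), matmul(q, p)
    return [[pq[i][j] - qp[i][j] for j in range(2)] for i in range(2)]

def scale(c, p):
    return [[c * x for x in row] for row in p]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# The basis of su(2) from the text.
X = [[1j, 0], [0, -1j]]
Y = [[0, 1], [-1, 0]]
Z = [[0, 1j], [1j, 0]]

su2_relations = (bracket(X, Y) == scale(2, Z)
                 and bracket(Y, Z) == scale(2, X)
                 and bracket(Z, X) == scale(2, Y))

# Under X = 2i, Y = 2j, Z = 2k the same relations hold for the cross product.
i2, j2, k2 = [2, 0, 0], [0, 2, 0], [0, 0, 2]
so3_relations = (cross(i2, j2) == [2 * c for c in k2]
                 and cross(j2, k2) == [2 * c for c in i2]
                 and cross(k2, i2) == [2 * c for c in j2])
```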

The exponential map is however useful in describing abelian Lie groups:

Theorem 3.4 Let G be a connected Lie group, then

• exp : g → G is a group homomorphism if and only if G is abelian

• G is abelian if and only if it is isomorphic to T^r × R^k where T^r is the r-dimensional
torus.

Proof:
(i) Since the additive group g is abelian, if exp is a homomorphism then

exp(X) exp(Y ) = exp(X + Y ) = exp(Y + X) = exp(Y ) exp(X).

So since exp is a diffeomorphism near 0, there is a neighbourhood V of e all of whose
elements commute. Now apply Proposition 3.3 using the fact that G is connected.
Finite products of elements in V must commute, so G is abelian.
Conversely suppose G is abelian then the multiplication map µ : G × G → G is a Lie
group homomorphism, since µ(g_1 h_1, g_2 h_2) = g_1 h_1 g_2 h_2 = g_1 g_2 h_1 h_2. But Dµ(X, Y ) =
X + Y so from Proposition 3.2 exp(X + Y ) = exp(X) exp(Y ).
(ii) From (i) any element is a product of terms exp(±X_i) where each exp(±X_i) ∈ V ,
but since exp is a homomorphism these are of the form exp(±X_1 ± X_2 · · · ± X_k) and
hence exp is a surjective homomorphism. We need to identify the quotient group
G ≅ g/K where K is the kernel of exp.
Since exp(A + X) = exp(A) exp(X) and exp is a local diffeomorphism at 0, it is a
local diffeomorphism at every point, in particular at A ∈ K, the kernel of exp. This
means there is a neighbourhood of A ∈ g which only intersects K in the point A. We
need now to identify the structure of K, an additive subgroup of R^n.
Let r be the dimension of the vector subspace spanned by K, and choose r linearly
independent elements w_1, . . . , w_r ∈ K. Consider the set

F = {x ∈ K : x = x_1 w_1 + · · · + x_r w_r, 0 ≤ x_i ≤ 1}.

This is closed, bounded and hence compact, but each point has a neighbourhood which
contains one point of K. By compactness this open covering has a finite subcovering
and F therefore must be finite.
For each 1 ≤ i ≤ r choose v_i = x_i w_i + · · · + x_r w_r ∈ F with x_i > 0 and minimal. These are
clearly linearly independent and so any v ∈ K can be written as v = λ_1 v_1 + · · · + λ_r v_r.
An integer linear combination of the v_i lies in K so taking the integer part [λ_i] we see
that v′ = (λ_1 − [λ_1])v_1 + · · · + (λ_r − [λ_r])v_r ∈ K. Suppose for a contradiction that
some (λ_i − [λ_i]) (which is non-negative of course) is not zero, and choose the minimal
such index j. Then
v′ = x_j(λ_j − [λ_j])w_j + · · · ∈ F
which contradicts the minimality in the definition of v_j. Hence each λ_i is an integer
and K consists of all integer linear combinations of v_1, . . . , v_r.
Extend this to a basis v_1, . . . , v_n of g and then the map
\[ \sum_{i=1}^{n} x_i v_i \mapsto (e^{2\pi i x_1}, \ldots, e^{2\pi i x_r}, x_{r+1}, \ldots, x_n) \]
gives an isomorphism of G ≅ g/K to T^r × R^{n−r}. □

The exponential map, even for matrices, does not behave as well as the exponential of
real or complex numbers. In particular if A and B don’t commute then in general exp(A + B) ≠
exp A exp B. Nevertheless, it is a local diffeomorphism so there is a local inverse which
we could call log. So for any Lie group we could take A, B ∈ g sufficiently small, and
ask for
log(exp A exp B) ∈ g.
This is a passage from two elements in the Lie algebra to a third, and it is a natural
question to ask whether there is a formula for this entirely in terms of the algebra of
g together with its bracket. There is, and it is called the Campbell-Baker-Hausdorff
formula and involves just the operation ad X(Y ) = [X, Y ]. We shall not need it later
but here is one explicit (at least in a sense) formula. Define
\[ \psi(x) = \frac{x \log x}{x - 1} = 1 - \sum_{n=1}^{\infty} \frac{(1 - x)^n}{n(n+1)}. \]
Expanding ψ(e^y) = y/(1 − e^{−y}) in powers of y, the coefficients involve the Bernoulli
numbers, important in number theory and algebraic topology. Then
\[ \log(\exp X \exp Y) = X + \left( \int_0^1 \psi(\exp(\mathrm{ad}\, X) \exp(t\, \mathrm{ad}\, Y))\, dt \right) Y. \]
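
The two expressions for ψ can be compared numerically for real x with |1 − x| < 1, where the series converges. This is only a sketch: the function names and the cutoff of 200 terms are our own choices.

```python
import math

def psi_closed(x):
    """psi(x) = x log x / (x - 1), for x > 0 and x != 1."""
    return x * math.log(x) / (x - 1)

def psi_series(x, terms=200):
    """Partial sum of 1 - sum_{n>=1} (1 - x)^n / (n (n + 1))."""
    return 1 - sum((1 - x) ** n / (n * (n + 1)) for n in range(1, terms + 1))

samples = [0.5, 0.8, 1.2, 1.5]
max_difference = max(abs(psi_closed(x) - psi_series(x)) for x in samples)
```

At x = 1/2 both sides equal log 2, which gives a quick hand check of the series.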

4 Submanifolds, subgroups and subalgebras

4.1 Lie subgroups


In group theory, a subgroup of G is simply defined as a subset closed under the op-
erations of multiplication and inversion. Since Lie groups have the extra structure of
being manifolds we need to address the definition of submanifolds. However, we shall
prove theorems that show us how fairly minimal assumptions allow us to recognize
Lie subgroups.
As far as manifolds are concerned there are two notions of submanifold related to an
injective smooth map of manifolds F : M → N . If the derivative DF is injective at
each point, it is called an immersed submanifold. If, further, the induced topology
from N on F (M ) is the manifold topology of M then it is called an embedded sub-
manifold. There are various features that can cause an immersed submanifold to fail
to be embedded but here is one that is particularly relevant for us.

Example: Define F : R → S^1 × S^1 by

F (x) = (e^{2πix}, e^{2πiαx})

where α is an irrational number. This is a homomorphism of Lie groups with injective
derivative. The map is injective because the kernel requires x ∈ Z and αx ∈ Z, which
forces x = 0 since α is irrational.
However, α can be rationally approximated by |α − m/n| < 1/n^2 for arbitrarily large
n which means that e^{2πinα} can be made arbitrarily close to 1. So in the induced
topology, any neighbourhood of the identity contains arbitrarily large values of x,
which is definitely not the topology of R.
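
To see the approximation at work, take α = √2 (our choice of irrational number) and let n run through the denominators 1, 2, 5, 12, . . . of its continued fraction convergents, which satisfy |α − m/n| < 1/n^2. Since e^{2πin} = 1, the distance from F (n) to the identity is |e^{2πinα} − 1|. A minimal sketch:

```python
import cmath
import math

alpha = math.sqrt(2)  # assumed sample value of the irrational number alpha

def distance_to_identity(n):
    """|e^{2 pi i n alpha} - 1|; the first component of F(n) is exactly 1 for integer n."""
    return abs(cmath.exp(2j * math.pi * n * alpha) - 1)

# Denominators of the continued fraction convergents 1/1, 3/2, 7/5, 17/12, ... of sqrt(2).
denominators = [1, 2, 5, 12, 29, 70, 169, 408]
distances = [distance_to_identity(n) for n in denominators]

# |n alpha - m| < 1/n gives the bound |e^{2 pi i n alpha} - 1| <= 2 pi / n.
within_bound = all(d <= 2 * math.pi / n for d, n in zip(distances, denominators))
```

The distances shrink as n grows, so the integers n, which are far apart in R, accumulate at the identity of the torus.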

An embedded submanifold of dimension m can be described in a coordinate neighbourhood
as F^{−1}(c) where F is a smooth function U ⊂ R^n → R^{n−m} with surjective
derivative as in Theorem 2.1. So O(n) for example is an embedded submanifold of
R^{n^2}. The example above shows that we cannot discuss Lie groups without encountering
immersed submanifolds.

Definition 16 A Lie subgroup is a Lie group H which is a subgroup of G and is such
that the inclusion j : H → G is a Lie group homomorphism.

The point here is that the induced topology on H ⊆ G may not be the manifold
topology of H.

Examples:
1. Given a 1-parameter subgroup γ : R → G its image R/ Ker γ (which is either R
or S^1 by Theorem 3.4) is a Lie subgroup.
2. The kernel of a Lie group homomorphism φ : G → G′ is a Lie subgroup. In
this case it does have the induced topology. Since it is closed this will follow from
Theorem 4.1.
3. If a Lie group is not connected, then the component containing the identity is an
embedded Lie subgroup.

For Lie algebras we have

Definition 17 A Lie subalgebra is a vector subspace h ⊆ g which is closed under the
bracket operation.

and clearly given a Lie subgroup j : H → G, the image of Dj_e : h → g is a Lie subalgebra.

4.2 Continuity and smoothness


These are the definitions, but life is made easier by the following:

Theorem 4.1 Let G be a Lie group. A subgroup H is an embedded Lie subgroup if
and only if H is closed.

The point to note here is that H is not assumed to be a Lie group, simply a (topologi-
cally) closed subset which is (algebraically) closed under multiplication and inversion.
In particular, it tells us that all those examples of subgroups of GL(n, R) that we
gave are Lie groups, without using arguments like that given for O(n).

Proof:
(i) If H is embedded it is given in a coordinate neighbourhood of the identity as F^{−1}(c).
Let B be a closed ball in this neighbourhood containing the identity so that, F being continuous, H ∩ B
is closed. There is another such neighbourhood B^{−1} consisting of the inverses of
elements of B. We want to show that H itself is closed.
Take y ∈ H̄ then there exists x ∈ yB^{−1} ∩ H which implies y ∈ xB ∩ H̄ and hence
x^{−1}y ∈ B ∩ H̄ since x ∈ H. But B ∩ H is closed so x^{−1}y ∈ B ∩ H and x^{−1}y ∈ H
hence y ∈ H and H is closed.

(ii) Now assume that H is closed. We need to construct first a tangent space for H
and then a Lie group. Let log denote the local inverse of exp at the identity, defined
on U ⊂ G. Then if U′ = U ∩ H we need V′ = log(U′) to be a neighbourhood of 0 in
a vector subspace of T_e G.
Consider a sequence v_n ∈ V′ such that v_n → 0. Normalize v_n/∥v_n∥ and we have a
sequence on the unit sphere, which is closed and bounded hence there is a subsequence
converging to X. Consider such sequences and the tangent vectors X which they
generate.
Since ∥v_n∥ → 0, given t ≠ 0 ∈ R, |t|/∥v_n∥ → ∞. Let m_n be the integer part [t/∥v_n∥],
then m_n∥v_n∥ → t. This means

exp(m_n v_n) = exp(m_n∥v_n∥(v_n/∥v_n∥)) → exp(tX). (2)

On the other hand each term exp(v_n) lies in H. Considering the one-parameter
subgroup in G with tangent vector v_n we know that exp(m_n v_n) = exp(v_n)^{m_n} and
since H is a group this implies exp(m_n v_n) ∈ H. Equation (2) then says that exp(tX)
lies in the closure of H. But H is assumed closed so exp(tX) ∈ H.
We want to show that the Xs produced this way form a vector space. Multiplying by
a scalar is just rescaling t above, addition is the problem. Take X and Y and consider

ℓ(t) = log(exp(tX) exp(tY )).

Since exp(tX) and exp(tY ) lie in H, for small t we have ℓ(t) ∈ V′. Take v_n = ℓ(1/n)
then v_n → 0 and
nv_n → ℓ′(0) = X + Y
since Dµ_e(X, Y ) = X + Y where µ : G × G → G is the multiplication map. Finally

v_n/∥v_n∥ = (nv_n)/(n∥v_n∥) → (X + Y )/∥X + Y ∥.

We therefore have a vector space W ⊂ T_e defined by sequences as above and such
that exp(X) ∈ H if X ∈ W . We need to show that this is in some sense the largest
possible. So using an inner product decompose T_e = W ⊕ W^⊥ and consider the map
ψ(w, v) = exp(w) exp(v) for w ∈ W and v ∈ W^⊥. Suppose ψ(w, v) lies in H then
since we have just shown exp(w) does, we also have exp(v) ∈ H. Now repeat the
process above with a sequence of such vs. Then v_n/∥v_n∥ will converge to a unit vector
u ∈ W^⊥. But also by definition u ∈ W which is a contradiction. Hence v = 0.
The map log ψ gives a local coordinate system in which H is defined as the kernel of
projection onto W^⊥ and so is an embedded Lie subgroup. □

Example:
1. The centre of a group G is the (normal, abelian) subgroup of elements that commute
with every element of G. For a fixed element h in a Lie group G the map
g ↦ ghg^{−1}h^{−1} is continuous and so the inverse image of e is closed. This consists of
the elements that commute with h. The intersection for all h is the centre which is
closed and hence an embedded Lie subgroup.
2. If H is a compact Lie group and φ : H → G is an injective Lie group homomorphism
then φ(H) is an embedded Lie subgroup.

This theorem has a useful consequence:

Theorem 4.2 A continuous group homomorphism between two Lie groups is smooth
and hence a Lie group homomorphism.

Proof: Let φ : G → H be a continuous group homomorphism and consider the
graph
Γ = {(g, φ(g)) ∈ G × H}.
Since manifolds are Hausdorff, the graph is closed and so by Theorem 4.1 it is a Lie
group with a smooth homomorphism to G × H. The projection π from Γ onto the
first factor G is smooth because it is the restriction of a smooth map G × H → G to
Γ. It is moreover a homeomorphism with inverse g ↦ (g, φ(g)).
Now if Dπ_e had a non-trivial kernel, the exponential map would define a one-parameter
subgroup which mapped to the identity in G. Since π is a bijection this does not hold,
so Dπ_e is an isomorphism. By the inverse function theorem this means that π^{−1} is
smooth and so defines a Lie group isomorphism between Γ and G.
Now the projection from Γ to the second factor H is smooth and composing with π^{−1}
this is the homomorphism φ. □

4.3 Subgroups versus subalgebras


If H is a Lie subgroup of G then its Lie algebra h is a subalgebra of g. What about
the converse? Is every Lie subalgebra tangential to a Lie subgroup? The answer is:

Theorem 4.3 There is a one-to-one correspondence between Lie subalgebras h ⊆ g
and connected Lie subgroups H ⊆ G.

The case of the irrational homomorphism F (x) = (e^{2πix}, e^{2πiαx}) above shows that H
may not be closed in G. The subgroup is clearly going to be generated by exp(h) ⊂ G
but this needs to be given the structure of a manifold and of a group.
We start with the vector space h of left-invariant vector fields on G. If dim h = r
this defines a smoothly varying r-dimensional subspace of each tangent space of G.
This makes sense on a more general manifold and is called a distribution (not to be
confused with the analytical meaning of this term). Since h is a subalgebra, it is
closed under the Lie bracket, so if Y_1, . . . , Y_r forms a basis then [Y_i, Y_j] = Σ_k c_{ijk} Y_k for
constants c_{ijk}. This is a Lie bracket of vector fields. If f, g are smooth functions on
a manifold and X, Y vector fields then

[fX, gY ] = f(Xg)Y − g(Y f)X + fg[X, Y ] (3)

So in our case, introducing coefficients f_i, g_i which are smooth functions on G we see
that
\[ \Big[ \sum_i f_i Y_i, \sum_j g_j Y_j \Big] = \sum_{i,j} \Big( f_i (Y_i g_j) Y_j - g_j (Y_j f_i) Y_i + \sum_k f_i g_j c_{ijk} Y_k \Big). \]

Thus the Lie bracket as vector fields of any linear combinations of the Y_i with functions
as coefficients is again a combination of those basis vectors. A distribution on a
manifold with this property is called an integrable distribution. The basic theorem
which relates to this is:

Theorem 4.4 (Frobenius) If E is an integrable distribution of rank r on an open set
in R^n, then through each point in some open subset there is an embedded submanifold
of dimension r which is tangential to E. Furthermore we can choose coordinates
x_1, . . . , x_n such that these submanifolds are defined by x_i = c_i, for r + 1 ≤ i ≤ n.

Remark: If r = 1, then we can choose a single vector field Y and the integral curve
of Y is the submanifold. There is no integrability condition here since [Y, Y ] = 0.
Globally, as with the irrational homomorphism, the full integral curve may not be an
embedded submanifold.

Proof:
(i) Let Y_1, . . . , Y_r be a basis for the distribution: local vector fields which are linearly
independent and span E at each point,
\[ Y_i = \sum_{j=1}^{n} A_{ij} \frac{\partial}{\partial x_j}. \]
The matrix of functions A_{ij} has rank r so by reordering the basis we may assume
that on a possibly smaller neighbourhood A_{ij}, 1 ≤ i, j ≤ r is nonsingular. Call the
inverse of this matrix B_{ij} and define
\[ X_i = \sum_{j=1}^{r} B_{ij} Y_j. \]
Then observe that, by choosing B as the inverse of A, when we expand [X_i, X_j] using
[fX, gY ] = f(Xg)Y − g(Y f)X + fg[X, Y ] the last term only involves [∂/∂x_i, ∂/∂x_j]
and so vanishes. Thus
\[ [X_i, X_j] = \sum_{k=r+1}^{n} a_{ijk} \frac{\partial}{\partial x_k}. \]
But the integrability condition says that [X_i, X_j] is a linear combination of X_1, . . . , X_r.
Since X_l = ∂/∂x_l + (terms in ∂/∂x_k for k > r), the coefficient of X_l in this combination is the
∂/∂x_l component of [X_i, X_j], which is zero for l ≤ r. So [X_i, X_j] = 0.
(ii) Let φ^i_t be the (local) one-parameter group of diffeomorphisms for X_i and define
the smooth map
\[ F(t_1, \ldots, t_r) = \varphi^1_{t_1} \circ \varphi^2_{t_2} \circ \cdots \circ \varphi^r_{t_r}(a). \]
Since the vector fields commute, so do the one-parameter groups of diffeomorphisms,
so F is independent of the ordering. By moving φ^i_{t_i} to the beginning and differentiating
with respect to t_i we obtain X_i. So we see that F has injective derivative whose
image is spanned by X_1, . . . , X_r as required.
Now restrict a to depend on x_{r+1}, . . . , x_n: a = (a_1, . . . , a_r, x_{r+1}, . . . , x_n) then by the
inverse function theorem the local coordinate system (t_1, . . . , t_r, x_{r+1}, . . . , x_n) satisfies
the conditions. □

This theorem is a local one but to make it global we define an integral manifold for
the distribution to be an immersed submanifold F : N → M such that DF_a maps
the tangent space T_a N isomorphically to the distribution E_{F(a)} ⊂ T_{F(a)} M . This is
just a generalization of the integral curve of a vector field. Through each point we
can define a maximal integral submanifold and in our case, taking the point as the
identity, this will turn out to be the Lie group H. But first a technical point.

Proposition 4.5 If F : N → M is an integral submanifold and L is a manifold with a
map f : L → N , then f is smooth if and only if F ◦ f is.

Proof: One direction is obvious, so suppose that F ◦ f is smooth. For each a ∈ L
consider F (f(a)) ∈ M and a small neighbourhood U of this point for which the
Frobenius theorem holds. Then F^{−1}(U) ⊂ N is open. Since a manifold has by
definition a countable basis of open sets, this is a countable union of open connected
manifolds (each given by x_i = c_i, r + 1 ≤ i ≤ n for countably many c_i). The map
F ◦ f , being continuous, maps a connected open neighbourhood of a to just one of
these components given by a constant value of the c_i. In coordinates (y_1, . . . , y_m) on
this neighbourhood F ◦ f is just of the form (x_1(y), . . . , x_r(y), c_{r+1}, . . . , c_n) where the
c_i are constant and this is a smooth function f to N . □

Now for the proof of Theorem 4.3.

Proof: The distribution E on G given by the left-invariant vector fields h is integrable
as noted above. Let H be the maximal connected integral submanifold
through e. Since E is left-invariant left translation gives another integral submanifold
so if h ∈ H, h^{−1}H is an integral submanifold. But it passes through e = h^{−1}h so
h^{−1}H = H and we deduce that H is closed under multiplication and inversion.
We need to prove that multiplication and inversion are smooth. But H × H ⊂ G × G
is smooth and µ : G × G → G is smooth, so putting L = H × H, N = H and M = G
in Proposition 4.5 we have the result. □

5 Global aspects

5.1 Components and coverings


As we saw with O(n), naturally occurring Lie groups are not necessarily connected.
It is straightforward to see that a manifold is connected in the topological sense of
not containing any proper non-empty subsets which are both closed and open if and only if it is
path-connected.
The connected component G_0 of the identity in a Lie group is a subgroup since paths g(t), h(t) from
the identity to g, h define a path g(t)h(t) to gh. It is also a normal subgroup since
x ↦ gxg^{−1} is a homeomorphism and homeomorphisms take connected components
to connected components. Conjugation also takes the identity to the identity so takes
G_0 to G_0. The set of components π_0(G) is therefore a group: the quotient of G by the
normal subgroup G_0. It has the discrete topology (every point is open and closed)
and is countable because all our manifolds have a countable basis of open sets.

Examples:
1. The homomorphism det : O(n) → {±1} has kernel SO(n) which is connected, so
π_0(O(n)) ≅ Z_2.
2. If we take R^p ⊕ R^q and put a positive definite inner product on R^p and a negative
one on R^q we can write an element of O(p, q) in block form
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \]
and then (sgn det A, sgn det D) is a homomorphism to Z_2 × Z_2 with connected kernel.

Connectedness is a simple concept. More interesting is whether a Lie group is simply-connected
i.e. whether any continuous map f : S^1 → G is contractible to a point by a
continuous family f_t, that is a continuous map S^1 × [0, 1] → G. Clearly the circle is not simply connected, but
spheres of dimension at least 2 are simply connected, hence SU(2), which is a 3-dimensional sphere, is.
When a space is not simply-connected
it has non-trivial covering spaces and we encounter this way covering groups.

Definition 18 A smooth surjective map π : M → N of manifolds is a covering map
if each point of N has a neighbourhood U such that π^{−1}(U) is the disjoint union of
open sets U_i such that π : U_i → U is a diffeomorphism.

Remark: If π : M → N is a map of compact manifolds such that Dπ is an
isomorphism at each point, then using the inverse function theorem one can see this
is a covering map where there are a finite number of U_i s.

For Lie groups we have the following:

Theorem 5.1 Let π : G → H be a Lie group homomorphism, with H connected,
then π is a covering map if and only if Dπ_e is an isomorphism of Lie algebras.

Examples:
1. We observed that SU (2) and SO(3) have isomorphic 3-dimensional Lie algebras.
The adjoint representation Ad : SU (2) → GL(3, R) has SO(3) ⊂ GL(3, R) as image
and this is a covering homomorphism. The element −I ∈ SU (2) acts trivially by
conjugation so this is a 2-fold covering with kernel ±I.
2. The group SL(2, C) of 2 × 2 complex invertible matrices of determinant 1 is a
covering of the identity component of SO(1, 3), the Lorentz group of special relativity.
We can see this by considering the 4-dimensional real vector space V of 2×2 Hermitian
matrices X = X^* and the action of A ∈ SL(2, C) as X ↦ AXA^*. If
\[ X = \begin{pmatrix} a & b \\ \bar{b} & c \end{pmatrix} \]
where a, c are real, then det X = ac − b b̄ = (a + c)^2/4 − (a − c)^2/4 − b b̄ is a quadratic
form of signature (1, 3) and since det AXA^* = |det A|^2 det X = det X, the action
preserves the indefinite inner product. Again, −I ∈ SL(2, C) acts trivially.
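
The action X ↦ AXA^* in the second example can be tried out on a concrete pair. In this sketch A and X are arbitrary sample matrices with small integer entries (so the arithmetic is exact), and the helper names are ours.

```python
def matmul(p, q):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def conj_transpose(p):
    """The adjoint P^* (conjugate transpose)."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]

def det2(p):
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]

A = [[1, 1j], [0, 1]]            # an element of SL(2, C): det A = 1
X = [[2, 1 + 1j], [1 - 1j, 3]]   # Hermitian, with a = 2, c = 3, b = 1 + i

X_new = matmul(matmul(A, X), conj_transpose(A))   # the action X -> A X A^*

a, c, b = 2, 3, 1 + 1j
# The quadratic form written as a difference of squares.
quadratic_form = (a + c) ** 2 / 4 - (a - c) ** 2 / 4 - abs(b) ** 2
```

Here X_new is again Hermitian and has the same determinant as X, and the difference-of-squares expression reproduces det X.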

Proof:
(i) If π is a covering then it is a local diffeomorphism and so Dπ_e gives an isomorphism
of Lie algebras.
(ii) Suppose Dπ_e is an isomorphism then by the inverse function theorem π is a local
diffeomorphism at e and by left translation also at any point. At e ∈ G it maps an
open neighbourhood to an open neighbourhood of e ∈ H, which therefore generates
the whole group H by Proposition 3.3, hence π is surjective.
By the inverse function theorem e ∈ G has a neighbourhood W which maps diffeomorphically
to a neighbourhood U ⊂ H and therefore contains only one point of
Ker π, namely e. To get a covering we want, for each k ∈ Ker π such a neighbourhood
mapping diffeomorphically to the same U , perhaps by shrinking U .
Since the map f(g_1, g_2) = g_1 g_2^{−1} is continuous, f^{−1}(W) is open and so contains an
open neighbourhood V × V of (e, e). Take k ∈ Ker π with k ≠ e and suppose g ∈ kV ∩ V .
Then g = v_1 = kv_2 and so k = v_1 v_2^{−1}. But by construction this lies in W, whose only
point in Ker π is the identity, so V and kV are disjoint. Replacing U by π(V ), we have
π^{−1}(π(V )) expressed as a disjoint union of open sets each mapping diffeomorphically
to π(V ). Each such set is a translate kV . By left translation we get such open sets
for all inverse images. □

The kernel of a covering homomorphism is normal and also discrete (each element
is an open set in the induced topology) as we have seen in the proof of the last
theorem. Conjugation k ↦ gkg^{−1} permutes the elements of Ker π but if the covering
group G is connected it must act trivially, for given a path g(t) to e, g(t)kg(t)^{−1} is a
continuous map from [0, 1] to Ker π and so maps to one connected component. But
g(1)kg(1)^{−1} = eke = k. This means that each k ∈ Ker π commutes with all g and so Ker π is a
subgroup of the centre of G.

5.2 From Lie algebras to Lie groups


Every connected manifold M has a special connected covering called the universal
covering M̃ . The universal covering is simply connected, and this characterizes it.
It is defined, given a base point z ∈ M , as a set of equivalence classes of continuous
maps γ : [0, 1] → M with γ(0) = z. Two such maps γ, γ′ are equivalent if they have
the same endpoint γ(1) = γ′(1) and there is a continuous family γ_t, for t ∈ [0, 1], fixing
the end points and such that γ_0 = γ, γ_1 = γ′. The map [γ] ↦ γ(1) is a covering map
π of manifolds. It is universal in the sense that if p : N → M is any other connected
covering then there is a covering map q : M̃ → N with π = p ◦ q. The map q is
unique so long as we choose a point x ∈ M̃ and y ∈ N mapping to z ∈ M and require
q(x) = y.
When G is a Lie group and we take z = e, then multiplication of paths γ(t)γ′(t)
defines a product on G̃ and with the identity as the equivalence class of the trivial
path γ(t) = e, we can define inversion by (γ(t))^{−1}. Thus G̃ is a Lie group and
the projection π : G̃ → G a Lie group homomorphism. Moreover given a covering
homomorphism p : H → G, if we choose x, y, z as previously to be the identity
elements in each group, then the unique q : G̃ → H is a Lie group homomorphism.
Thus each covering group is obtained from G̃ as the quotient group by a subgroup of
Ker π.

Examples:
1. The universal covering of the circle is the additive real line: $\pi(t) = e^{2\pi i t}$. The
kernel is Z ⊂ R.
2. The universal covering of SO(3) is SU(2) with the projection defined above and
kernel ±I.
3. The universal covering of SL(2, R) is a connected Lie group which is not a subgroup
of any GL(n, R). In this case Ker π ≅ Z (this is the fundamental group $\pi_1$ of
SL(2, R)).
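
As a numerical sanity check of example 2, the sketch below models the projection SU(2) → SO(3) by conjugation on the Pauli matrices. This particular model is an assumption made for illustration (the notes define the projection in an earlier section), and the helper names `su2` and `rotation` are hypothetical. It checks that A and −A give the same rotation, which is what a kernel of ±I means.

```python
import math

# Pauli matrices: a basis for the traceless Hermitian 2x2 matrices.
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def su2(a, b):
    """Return [[a, b], [-conj(b), conj(a)]] rescaled so that |a|^2 + |b|^2 = 1."""
    r = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = complex(a) / r, complex(b) / r
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def rotation(A):
    """3x3 real matrix R with A s_j A* = sum_i R[i][j] s_i (s_i the Pauli matrices)."""
    A_star = [[A[j][i].conjugate() for j in range(2)] for i in range(2)]
    R = []
    for i in range(3):
        row = []
        for j in range(3):
            M = matmul(matmul(PAULI[i], A), matmul(PAULI[j], A_star))
            row.append(0.5 * (M[0][0] + M[1][1]).real)  # (1/2) tr(s_i A s_j A*)
        R.append(row)
    return R

A = su2(0.3 + 0.4j, -0.5 + 0.2j)
minus_A = [[-x for x in row] for row in A]
R_plus, R_minus = rotation(A), rotation(minus_A)
```

Here `R_plus` and `R_minus` agree entry by entry, and `R_plus` is orthogonal, so the two points A and −A of SU(2) lie over a single point of SO(3).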

Thus far we have associated Lie algebras to Lie groups, but the properties of covering
homomorphisms show that any two coverings of G have isomorphic Lie algebras so
the association is not one-to-one. However:

Theorem 5.2 Let G be a simply connected Lie group and H another Lie group. Then

• there is a one-to-one correspondence between Lie algebra homomorphisms g → h
and Lie group homomorphisms G → H;
• if both G and H are simply connected and have isomorphic Lie algebras then
they are isomorphic as Lie groups.

Proof:
(i) If ψ : g → h is the given homomorphism of Lie algebras consider its graph

Γ = {(x, ψ(x)) ∈ g ⊕ h}.

This is a Lie subalgebra since ψ is a homomorphism.


Now apply Theorem 4.3 and we get a connected Lie subgroup S ⊂ G × H with Lie
algebra Γ. Projection from Γ to the first factor is an isomorphism of Lie algebras so
from Theorem 5.1 the projection π : S → G is a covering homomorphism. But G is simply
connected, hence its own universal covering, which means that π is a diffeomorphism.
Then $\pi^{-1}$ followed by projection onto the second factor in G × H gives the required
Lie group homomorphism.
We already saw, using the exponential map, that a Lie group homomorphism from a connected group is
uniquely determined by the corresponding Lie algebra homomorphism.
(ii) Using the first part for both G and H gives the last part.
□

The outstanding question is whether one can associate a Lie group to any Lie algebra,
and this is provided by Ado’s theorem which we shall not prove:

Theorem 5.3 (Ado) For any Lie algebra V , there is an injective Lie algebra homo-
morphism V → gl(m, R) for some m.

By Theorem 4.3 this implies there is a connected Lie subgroup G of GL(m, R) with
V as Lie algebra. If we pass to the universal cover then we get Lie’s third theorem:

Theorem 5.4 There is a one-to-one correspondence between Lie algebras up to
isomorphism and simply-connected Lie groups up to isomorphism.

6 Representations of Lie groups

6.1 Basic notions


Recall that:

Definition 19 A representation of a Lie group G on V is a Lie group
homomorphism φ : G → Aut(V).

From Theorem 4.2 we need only assume that the homomorphism is continuous. We
can also describe a representation as an action of G on V by linear transformations:
gv = φ(g)v. Some of the following examples entail the action on certain functions on
a vector space. The action on functions is $(gf)(x) = f(g^{-1}x)$, for then

\[ (h(gf))(x) = (gf)(h^{-1}x) = f(g^{-1}h^{-1}x) = f((hg)^{-1}x) = ((hg)f)(x). \]

Examples:
1. For each n ∈ Z we have a representation $U_n$ of $S^1$ on C given by $\varphi(e^{i\theta})z = e^{in\theta}z$.
The case n = 0 is the trivial representation which of course any group has.
2. The group SU(2) has a defining representation on $V = \mathbf{C}^2$. We can regard this as
an action on linear functions in $z_1, z_2$. The matrix

\[ A = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} \]

acts on a linear function f(z) by $f(A^{-1}z)$ and using $A^{-1} = A^*$ this is

\[ f \mapsto f(\bar a z_1 - b z_2,\ \bar b z_1 + a z_2). \]

But this formula makes sense for a homogeneous polynomial of any degree. So if
$V_n$ denotes the space of homogeneous polynomials of degree n, it has a basis of the (n + 1) functions
$z_1^n, z_1^{n-1}z_2, \ldots, z_2^n$ and this is a representation space.
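
The substitution formula above can be tested numerically. The following is a minimal sketch, assuming a polynomial in $V_n$ is stored as a coefficient list `f` with `f[k]` the coefficient of $z_1^k z_2^{n-k}$; the function names are illustrative only. It checks the homomorphism property A(Bf) = (AB)f on a sample element of $V_3$.

```python
import math

def su2(a, b):
    """Return [[a, b], [-conj(b), conj(a)]] rescaled so that |a|^2 + |b|^2 = 1."""
    r = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = complex(a) / r, complex(b) / r
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def poly_mul(p, q):
    """Multiply homogeneous polynomials; p[k] is the coefficient of z1^k z2^(deg - k)."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def poly_pow(p, m):
    out = [1 + 0j]
    for _ in range(m):
        out = poly_mul(out, p)
    return out

def act(A, f):
    """(A f)(z) = f(A^{-1} z) = f(conj(a) z1 - b z2, conj(b) z1 + a z2) for A in SU(2)."""
    (a, b), _ = A                      # the first row determines A in SU(2)
    n = len(f) - 1
    u = [-b, a.conjugate()]            # linear form conj(a) z1 - b z2 replacing z1
    v = [a, b.conjugate()]             # linear form conj(b) z1 + a z2 replacing z2
    out = [0j] * (n + 1)
    for k, c in enumerate(f):
        term = poly_mul(poly_pow(u, k), poly_pow(v, n - k))
        for i, t in enumerate(term):
            out[i] += c * t
    return out

A = su2(0.3 + 0.4j, -0.5 + 0.2j)
B = su2(1 - 2j, 0.7j)
f = [1, 0, -2j, 0.5]                   # an arbitrary element of V_3
lhs = act(A, act(B, f))                # A(Bf)
rhs = act(matmul2(A, B), f)            # (AB)f
error = max(abs(x - y) for x, y in zip(lhs, rhs))
```

The value of `error` is at the level of floating-point rounding, consistent with $V_n$ being a representation space.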

Given a representation φ of a Lie group, $D\varphi_e$ is a representation of the Lie algebra,
meaning we have a homomorphism ψ : g → End V such that ψ[X, Y] = ψ(X)ψ(Y) −
ψ(Y)ψ(X). We find this by differentiating one-parameter subgroups at the identity.

Examples:

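1. For the representation of $S^1$ on C given by $\varphi(e^{i\theta})z = e^{in\theta}z$ we differentiate with
respect to θ at θ = 0 and get the Lie algebra homomorphism ψ(z) = inz.
2. For SU(2) we can use the basis

\[ X = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \]

of the Lie algebra. Then the action of exp tX is $f(z_1, z_2) \mapsto f(e^{-it}z_1, e^{it}z_2)$ and
differentiating at t = 0 we get

\[ Xf = -iz_1\frac{\partial f}{\partial z_1} + iz_2\frac{\partial f}{\partial z_2} \]

and similarly

\[ Yf = -z_2\frac{\partial f}{\partial z_1} + z_1\frac{\partial f}{\partial z_2}, \qquad Zf = -iz_2\frac{\partial f}{\partial z_1} - iz_1\frac{\partial f}{\partial z_2}. \]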

3. From the previous section, each Lie algebra representation defines a representation
of the associated simply-connected Lie group, but not necessarily otherwise. So al-
though SU (2) and SO(3) have isomorphic Lie algebras only the spaces Vn for n even
are representations of SO(3) since −I ∈ SU (2) acts as −1 on odd degree polynomials.
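
The differential operators X, Y, Z of example 2 can be written as matrices on $V_n$ and checked against the brackets of the 2 × 2 matrices, which a direct computation gives as [X, Y] = 2Z, [Y, Z] = 2X, [Z, X] = 2Y. The sketch below assumes the basis $e_k = z_1^k z_2^{n-k}$ (k = 0, ..., n) and uses hypothetical helper names. It also reads off the scalar by which exp(πX) = −I acts, illustrating example 3.

```python
import cmath
import math

def lie_algebra_matrices(n):
    """Matrices of X, Y, Z on V_n in the basis e_k = z1^k z2^(n-k); column k is the image of e_k."""
    size = n + 1
    X = [[0j] * size for _ in range(size)]
    Y = [[0j] * size for _ in range(size)]
    Z = [[0j] * size for _ in range(size)]
    for k in range(size):
        X[k][k] = 1j * (n - 2 * k)          # X e_k = i(n - 2k) e_k
        if k >= 1:
            Y[k - 1][k] = -k                # the -z2 d/dz1 part of Y
            Z[k - 1][k] = -1j * k           # the -i z2 d/dz1 part of Z
        if k < n:
            Y[k + 1][k] = n - k             # the z1 d/dz2 part of Y
            Z[k + 1][k] = -1j * (n - k)     # the -i z1 d/dz2 part of Z
    return X, Y, Z

def matmul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)] for i in range(m)]

def bracket(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

def is_twice(C, A, tol=1e-12):
    """True if C equals 2A entry by entry."""
    return all(abs(C[i][j] - 2 * A[i][j]) < tol
               for i in range(len(A)) for j in range(len(A)))

def minus_identity_scalars(n):
    """Eigenvalues of exp(pi X) on V_n; exp(pi X) is the image of -I in SU(2)."""
    X, _, _ = lie_algebra_matrices(n)
    return [cmath.exp(math.pi * X[k][k]) for k in range(n + 1)]
```

For each n the three bracket relations hold on $V_n$, and `minus_identity_scalars(n)` consists of copies of $(-1)^n$, so −I acts trivially exactly when n is even.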

Two representations V, W of G are isomorphic if there is a vector space isomorphism
V → W commuting with the two actions of G.
There are some natural operations on representation spaces:

• The direct sum V ⊕ W with the action g(v, w) = (gv, gw).

• The dual $V^*$. A linear transformation A : V → W has a natural dual map
$A' : W^* \to V^*$ (the transpose of the matrix if we take a dual basis) and to get
an action on $V^*$ from φ : G → Aut(V) we define, for $\xi \in V^*$,
\[ g\xi = \varphi(g^{-1})'\xi. \]
Concretely this is $(A^{-1})^T$ for the group action and $-B^T$ for the Lie algebra
action. Note that since $(A'f)(v) = f(Av)$ this is the natural action on linear
functions.
• The tensor product V ⊗ W with action
\[ g\sum_i v_i \otimes w_i = \sum_i gv_i \otimes gw_i. \]

We differentiate to get the Lie algebra action
\[ X\sum_i v_i \otimes w_i = \sum_i Xv_i \otimes w_i + \sum_i v_i \otimes Xw_i. \]

• The space Hom(V, W) of homomorphisms from V to W. Here the group action is
\[ (gA)(v) = g(A(g^{-1}v)). \]
In fact Hom(V, W) is canonically isomorphic to $V^* \otimes W$ and so this is a
combination of the above.

Definition 20 A representation V is

• reducible if there is a non-zero proper subspace W ⊂ V such that gw ∈ W for all w ∈ W
and g ∈ G;
• irreducible if {0} and V are the only invariant subspaces;
• completely reducible if it is a direct sum $V = V_1 \oplus V_2 \oplus \cdots \oplus V_m$ of irreducible
representations.

Examples:
1. Take G = R the additive reals and $V = \mathbf{R}^2$ with action t(x, y) = (x + ty, y).
Then y = 0 is an invariant subspace so V is reducible. It is not completely reducible,
however: a complementary invariant line would be spanned by some (x, y) with y ≠ 0,
and (x + ty, y) = λ(x, y) implies λ = 1 and t = 0.
2. Any one-dimensional representation is clearly irreducible so the representations $U_n$
of $S^1$ are.
3. The representations $V_n$ above of SU(2) are irreducible. To see this use the Lie
algebra action. Using the basis X, Y, Z write
\[ N^+ = \frac{1}{2}(Y + iZ) = z_1\frac{\partial}{\partial z_2} \]
then
\[ N^+(z_1^k z_2^{n-k}) = (n-k)\, z_1^{k+1} z_2^{n-k-1}. \]
If $W \subset V_n$ is invariant, and f ∈ W, then W contains $N^+f, (N^+)^2 f, \ldots$ also. Choose
a non-zero f ∈ W and let k be the smallest integer such that the coefficient of $z_1^k z_2^{n-k}$ is non-zero.
Then $(N^+)^{n-k}f$ is a non-zero multiple of $z_1^n$, hence $z_1^n \in W$. Now consider
\[ N^- = -\frac{1}{2}(Y - iZ) = z_2\frac{\partial}{\partial z_1}. \]

Applying this to $z_1^n$ we get $nz_1^{n-1}z_2$ and by repetition all the basis vectors, hence
$W = V_n$.
4. The product of a polynomial of degree m and one of degree n is of degree m + n and
so there is an invariant homomorphism $V_m \otimes V_n \to V_{m+n}$. Since $V_{m+n}$ is irreducible
the image is the whole space and so if m, n > 0 there is a non-zero kernel which is an
invariant subspace of $V_m \otimes V_n$ which is therefore reducible.
5. Restrict $V_n$ to $S^1 \subset SU(2)$ acting as $(z_1, z_2) \mapsto (e^{i\theta}z_1, e^{-i\theta}z_2)$ then by looking at
the basis vectors $z_1^k z_2^{n-k}$ we can see that it is completely reducible:

\[ V_n|_{S^1} = U_n \oplus U_{n-2} \oplus \cdots \oplus U_{-n}. \]
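
The irreducibility argument of example 3 is effectively an algorithm, and can be traced on coefficients. In the sketch below (an illustration with hypothetical function names) a polynomial is again a list `f` with `f[k]` the coefficient of $z_1^k z_2^{n-k}$; `n_plus` and `n_minus` implement $N^+ = z_1\partial/\partial z_2$ and $N^- = z_2\partial/\partial z_1$.

```python
def n_plus(f):
    """N+ = z1 d/dz2: sends z1^k z2^(n-k) to (n-k) z1^(k+1) z2^(n-k-1)."""
    n = len(f) - 1
    out = [0] * (n + 1)
    for k in range(n):
        out[k + 1] += (n - k) * f[k]
    return out

def n_minus(f):
    """N- = z2 d/dz1: sends z1^k z2^(n-k) to k z1^(k-1) z2^(n-k+1)."""
    n = len(f) - 1
    out = [0] * (n + 1)
    for k in range(1, n + 1):
        out[k - 1] += k * f[k]
    return out

def generated_basis(f):
    """From a non-zero f, produce multiples of z1^n, z1^(n-1) z2, ..., z2^n."""
    n = len(f) - 1
    k = next(i for i, c in enumerate(f) if c != 0)  # smallest k with non-zero coefficient
    top = f
    for _ in range(n - k):
        top = n_plus(top)          # after n - k steps only a multiple of z1^n survives
    chain = [top]
    for _ in range(n):
        chain.append(n_minus(chain[-1]))
    return chain

# f = 2 z1 z2^2 - z1^3 in V_3
chain = generated_basis([0, 2, 0, -1])
```

For this f the result is `[[0, 0, 0, 4], [0, 0, 12, 0], [0, 24, 0, 0], [24, 0, 0, 0]]`: a non-zero multiple of every basis vector, so any invariant subspace containing f is all of $V_3$.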

We have this useful result:

Proposition 6.1 (Schur's lemma) A G-invariant homomorphism A : V → W
between two irreducible representations is either an isomorphism or is zero. If V = W
and these are complex vector spaces then A is scalar multiplication by λ ∈ C.

Proof: Since V is irreducible, Ker A = V or 0 so A is either zero or injective. But
W is irreducible so the image is either 0 or W.
If V = W and the field is C, then A has an eigenvalue λ and so A − λI is an invariant
endomorphism of V. By the first part it has to be zero since A − λI is not invertible.
□

Remark: It is convenient to work over the complex numbers and only as a secondary
issue discuss whether a representation is real. A real structure on a complex vector
space V is an antilinear involution T, i.e. T(u + v) = T(u) + T(v), $T(\lambda u) = \bar\lambda T(u)$
and $T^2 = 1$. The fixed point set of T is a real vector space U and V is naturally
its complexification $U \otimes_{\mathbf{R}} \mathbf{C}$ with T becoming complex conjugation. So a complex
representation which commutes with such a T is actually a real representation.
As an example consider $(z_1, z_2) \mapsto (\bar z_2, -\bar z_1)$ acting on $\mathbf{C}^2$. This is antilinear but
squares to −1. However its action on polynomials of even degree gives a real structure
preserved by SU(2). So the $V_{2m}$, which we already saw are representations of SO(3),
are all real in this sense.

Proposition 6.2 Over the complex numbers every irreducible representation of an
abelian Lie group is one-dimensional.

Proof: If G is abelian the map v ↦ gv for a fixed g commutes with G and hence
by Schur's Lemma is a non-zero scalar $\lambda_g$. So multiples of a fixed non-zero vector v form an
invariant subspace and hence by irreducibility the whole space. □

If V is a real representation of an abelian group which is completely reducible then
we can complexify to $V \otimes_{\mathbf{R}} \mathbf{C}$ and from the proposition write

\[ V \otimes_{\mathbf{R}} \mathbf{C} = V_1 \oplus V_2 \oplus \cdots \oplus V_n \]

as a direct sum of one-dimensional representations: g ∈ G acts as a scalar $\lambda(g) \in \mathbf{C}^*$
on each one. Reality means the action commutes with an antilinear involution T:
complex conjugation. So T maps $V_i$ into some $V_j$ and the action λ(g) on $V_i$ is paired
with $\bar\lambda(g)$ on $V_j$. Since ±1 is the only real subgroup of the unit complex numbers, if
G is connected it must act by a complex scalar or the identity. So V is a direct sum
of a trivial representation of dimension k, say, and a sum of 2-dimensional real vector
spaces where the action is
\[ \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}. \]
An example is the real defining 3-dimensional representation of SO(3) restricted to
the abelian subgroup of rotations about (0, 0, 1). It has the trivial representation with
multiplicity 1 (the space (0, 0, z)) and an irreducible 2-dimensional real representation
on (x, y, 0). The eigenvalues $e^{\pm i\theta}$ of a rotation have complex conjugate eigenspaces
which are irreducible one-dimensional complex representations.

6.2 Integration on G
Suppose a representation space V has a positive-definite Hermitian inner product
⟨v, w⟩ and the action of G preserves it so that ⟨gv, gw⟩ = ⟨v, w⟩, then g acts as
unitary transformations, and the representation can be viewed as a homomorphism
φ : G → U (n) where dim V = n. More importantly, if W ⊂ V is an invariant
subspace, so is its orthogonal complement W ⊥ . For a finite group the existence of
such an inner product is straightforward by averaging over the group elements. In
other words, choose any inner product ( . ) and define
\[ \langle v, w\rangle = \frac{1}{|G|}\sum_{g \in G} (gv, gw) \]

and since the sum of positive-definite Hermitian inner products is still positive definite
we have the required inner product. This means that any representation is completely

reducible: pick an invariant subspace W ⊂ V, write $V = W \oplus W^\perp$ and repeat
for W and $W^\perp$ until no proper non-zero invariant subspaces remain.
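
The averaging construction can be carried out exactly for a small example. The sketch below is an illustration under stated assumptions: the group is the cyclic group of order 3 generated by the integer matrix M = [[0, −1], [1, −1]], acting on $\mathbf{R}^2$, and we average the standard real dot product (the real analogue of the Hermitian case in the text). The helper names are hypothetical.

```python
from fractions import Fraction

# A matrix of order 3 that does not preserve the standard dot product.
M = [[Fraction(0), Fraction(-1)], [Fraction(1), Fraction(-1)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def group_elements(gen):
    """Powers of gen, starting at the identity, up to the first return to the identity."""
    identity = [[Fraction(int(i == j)) for j in range(2)] for i in range(2)]
    elems, g = [identity], gen
    while g != identity:
        elems.append(g)
        g = matmul(g, gen)
    return elems

def averaged_gram(elems):
    """Gram matrix S of <v, w> = (1/|G|) sum_g (gv, gw), that is S = (1/|G|) sum_g g^T g."""
    total = [[Fraction(0)] * 2 for _ in range(2)]
    for g in elems:
        gtg = matmul(transpose(g), g)
        for i in range(2):
            for j in range(2):
                total[i][j] += gtg[i][j]
    return [[x / len(elems) for x in row] for row in total]

G = group_elements(M)
S = averaged_gram(G)
# Invariance <Mv, Mw> = <v, w> is the matrix identity M^T S M = S.
invariant = matmul(matmul(transpose(M), S), M) == S
```

Here `S` works out to [[4/3, −2/3], [−2/3, 4/3]], which is positive definite, and `invariant` is `True`, whereas the standard dot product itself is not preserved by M.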
For a compact Lie group we can do the same thing but we need to integrate over the
group rather than sum, so we need to know how to integrate functions on a manifold.
Recall the change of variables formula in a multiple integral:
\[ \int f(y_1,\ldots,y_n)\,dy_1\,dy_2\ldots dy_n = \int f(y_1(x),\ldots,y_n(x))\,|\det \partial y_i/\partial x_j|\,dx_1\,dx_2\ldots dx_n. \]

Let $\Omega(X_1,\ldots,X_n)$ be an alternating multilinear form on the tangent space $T_a$ of
a manifold of dimension n. So it is linear in each factor and changes sign if we
interchange two variables. If we write $X_i = \sum_j A_{ij}E_j$ for a basis $E_1,\ldots,E_n$ so that $X_i$
is the ith row of the matrix A, then det A is an example of such a form and in fact all
such forms are multiples of this. If $(x_1,\ldots,x_n)$, $(y_1,\ldots,y_n)$ are two local coordinate
systems they each give a basis for the tangent space and these are related by
\[ \left(\frac{\partial}{\partial x_i}\right)_a = \sum_j \frac{\partial y_j}{\partial x_i}(a)\left(\frac{\partial}{\partial y_j}\right)_a. \]

So
\[ \Omega\left(\frac{\partial}{\partial x_1},\ldots,\frac{\partial}{\partial x_n}\right) = \det\left(\frac{\partial y_i}{\partial x_j}\right)\Omega\left(\frac{\partial}{\partial y_1},\ldots,\frac{\partial}{\partial y_n}\right). \]
Apart from the absolute value this is how multiple integrals transform.
This is at one point a ∈ M but if we want a smoothly varying form this is the
definition:

Definition 21 A differential form of degree m on a manifold M is an alternating
multilinear function of vector fields $\Omega(X_1,\ldots,X_m) \in C^\infty(M)$.

When m = n, if f is a function with support in a coordinate neighbourhood then we
can define the integral of the n-form fΩ and this will be independent of coordinates
so long as the sign of $\det \partial y_i/\partial x_j$ is positive. For a Lie group this certainly holds for
we can take a non-zero alternating multilinear form ω on $T_e$ and by translation extend it to all
of G so that for a basis of left-invariant vector fields $X_1,\ldots,X_n$ we have

\[ \Omega(X_1,\ldots,X_n) = \omega((X_1)_e,\ldots,(X_n)_e). \]

If we restrict to local coordinates $x_1,\ldots,x_n$ for which

\[ \Omega\left(\frac{\partial}{\partial x_1},\ldots,\frac{\partial}{\partial x_n}\right) > 0 \]

then we have a consistent notion of integration on each coordinate neighbourhood.
A general function can be written on a compact manifold as a finite sum of smooth
functions supported in coordinate neighbourhoods and then we can define integration
of functions in a coordinate-independent way.
The form described above is left-invariant and on a compact manifold its integral is
finite, so we can normalize it so that the integral of Ω is 1. It is also right-invariant
because the adjoint action on left-invariant vector fields induces an action on the
one-dimensional space of degree n ($n = \dim T_e$) alternating multilinear maps, a homomorphism
from G to $\mathbf{R}^*$. But the image is compact, and {±1} is the largest compact subgroup of the
multiplicative group of non-zero reals, so if G is connected the action must be trivial.

Example: For the circle $x \mapsto e^{2\pi i x}$ for x ∈ [0, 1], we can take Ω = dx, its value on
the vector field d/dx is 1 and its integral is 1.

From now on, in performing integrals we shall omit Ω and just write
\[ \int_G f(g). \]

Then given a representation V we can always find a G-invariant inner product by
choosing any inner product ( . ) and defining
\[ \langle v, w\rangle = \int_G (gv, gw). \]

For an irreducible representation the inner product is unique up to multiplication by
a positive real number. This follows from Schur's lemma since two inner products
are related by $(u, v)_1 = (Au, v)_2$ where $A = A^*$ with respect to $( . )_2$. But if both are
G-invariant so is A and so by Schur it is a multiple of the identity.
Differential forms transform naturally under a diffeomorphism F; the transformed form is denoted
$F^*\Omega$. The integral of the differential form fΩ transforms as
\[ \int_M (f \circ F)\,F^*\Omega = \pm\int_M f\,\Omega, \]
depending on whether F preserves or reverses the orientation. The form defines a
measure which is insensitive to orientation: a positive function will always have a
positive integral.
Since $L_h$ and $R_h$ preserve our choice of Ω for G this means that for a fixed h ∈ G,
\[ \int_G f(hg) = \int_G f(gh) = \int_G f(g). \]

Example: For the circle this is just the statement that
\[ \int_0^1 f(x + a)\,dx = \int_0^1 f(x)\,dx \]
for a periodic function with period 1.

Inversion $g \mapsto g^{-1}$ acts as −1 on the tangent space at the identity which introduces a
sign $(-1)^{\dim G}$ in the transform of the differential form Ω but the associated measure
satisfies
\[ \int_G f(g^{-1}) = \int_G f(g). \]

The assumption above was that f is a real-valued function but it could equally be
vector valued. So consider a representation space V and for v ∈ V the function
g ↦ gv on G. Then
\[ h\int_G gv = \int_G hgv = \int_G gv \]
so the integral is an invariant vector. On the other hand if u is an invariant vector
\[ \int_G gu = \int_G u = u \]
since the integral of 1 is 1. So
\[ P(v) = \int_G gv \]
is a projection onto the subspace $V^G \subset V$ of invariant vectors. It is also an orthogonal
projection ($P = P^*$) since using an invariant inner product,
\[ \langle Pv, w\rangle = \int_G \langle gv, w\rangle = \int_G \langle v, g^{-1}w\rangle \]
since G acts as unitary transformations. But using the invariance of the integral
under $g \mapsto g^{-1}$ this is
\[ \int_G \langle v, gw\rangle = \langle v, Pw\rangle. \]
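
For the circle the projection P can be computed explicitly. The sketch below is an illustration, assuming a representation given as a direct sum of the $U_n$ (so each group element acts by a diagonal matrix) and approximating the normalized integral by a uniform Riemann sum, which is exact for these exponentials once the number of samples exceeds every |n|. The function name is hypothetical.

```python
import cmath

def average_over_circle(weights, samples=32):
    """Approximate P = int_G g for G = S^1 acting on U_{n_1} + ... + U_{n_k}.

    The element e^{2 pi i x} acts as diag(e^{2 pi i n x}); we average over x = j / samples.
    """
    size = len(weights)
    P = [[0j] * size for _ in range(size)]
    for j in range(samples):
        x = j / samples
        for k, n in enumerate(weights):
            P[k][k] += cmath.exp(2j * cmath.pi * n * x) / samples
    return P

weights = [2, 0, -1, 0]
P = average_over_circle(weights)
```

Up to rounding, `P` is the diagonal matrix with entry 1 on the two trivial summands and 0 elsewhere, which is the orthogonal projection onto the invariant vectors.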

6.3 Characters and orthogonality


Recall the trace of a matrix $\operatorname{tr} A = \sum_i A_{ii}$. It has the property that tr(AB) =
tr(BA), $\operatorname{tr} A^T = \operatorname{tr} A$. This has a more invariant interpretation if we use the canonical
isomorphism between End V and $V^* \otimes V$. The (linear extension of the) map ξ ⊗ v ↦
ξ(v) is the trace.

Definition 22 The character of a representation ρ : G → Aut(V ) is the function
χV (g) = tr(ρ(g)).

Here are some properties of the character for a compact group, which is evidently a
smooth function on G.

• $\chi_V(e) = \dim V$ since $\operatorname{tr} I_V = \dim V$.

• $\chi_V(hgh^{-1}) = \chi_V(g)$ since $\operatorname{tr}\rho(hgh^{-1}) = \operatorname{tr}(\rho(h)^{-1}\rho(h)\rho(g)) = \operatorname{tr}(\rho(g))$.

• If V, W are isomorphic as representations then they have the same character.
This is again invariance of trace under conjugation.
• $\chi_{V\oplus W} = \chi_V + \chi_W$.

• $\chi_{V\otimes W} = \chi_V\chi_W$. To see this note that if $v_1,\ldots,v_m$ is a basis for V and $w_1,\ldots,w_n$
a basis for W then $v_i \otimes w_j$ is a basis for the tensor product. So considering a
linear map of the form
\[ \sum_{ij} c_{ij}\, v_i \otimes w_j \mapsto \sum_{ij} c_{ij}\, Av_i \otimes Bw_j, \]
summing the diagonal terms for basis vectors $v_i \otimes w_1$ gives $(\operatorname{tr} A)B_{11}$ and
continuing gives tr A tr B.
• $\chi_{V^*}(g) = \chi_V(g^{-1})$. This is because $\operatorname{tr}(A^{-1})^T = \operatorname{tr}(A^{-1})$. But also the
representation is unitary and for a unitary matrix $\operatorname{tr} A^{-1} = \operatorname{tr}\bar A^T = \operatorname{tr}\bar A = \overline{\operatorname{tr} A}$ and
so also $\chi_{V^*}(g) = \overline{\chi_V(g)}$.

We see immediately from the previous section that
\[ \int_G \chi_V(g) = \dim V^G \]
since this is the trace of the orthogonal projection P onto the invariant subspace.
We can define an $L^2$ inner product on smooth functions on G by integration and then
we have:

Theorem 6.3 For two representations V, W of a compact Lie group G
\[ \langle\chi_V, \chi_W\rangle = \dim \mathrm{Hom}_G(V, W). \]
In particular, if V and W are inequivalent irreducible representations, their characters
are orthogonal.

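Proof: Using $\mathrm{Hom}(V, W) \cong V^* \otimes W$ and the properties of characters above we see
that
\[ \chi_{V^*\otimes W} = \chi_{V^*}\chi_W = \overline{\chi_V}\,\chi_W. \]
Integrating the right hand term gives $\langle\chi_W, \chi_V\rangle$. Integrating the first gives
$\dim \mathrm{Hom}_G(V, W)$, the dimension of the invariant subspace of Hom(V, W).
If V, W are inequivalent irreducibles then by Schur's lemma $\mathrm{Hom}_G(V, W) = 0$. In fact if V = W is
irreducible then Schur says that $\dim \mathrm{Hom}_G(V, V) = 1$ and so $\chi_V$ has norm 1.
□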
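
Theorem 6.3 can be checked directly for the circle, where the irreducible characters are $e^{in\theta}$. The sketch below is illustrative: a representation is described by its list of weights n (one entry per summand $U_n$), and the normalized integral is replaced by a uniform Riemann sum, exact here because all frequencies are smaller than the number of samples. The function names are hypothetical.

```python
import cmath

def character(weights):
    """Character of U_{n_1} + ... + U_{n_k} on the circle, as a function of theta."""
    return lambda theta: sum(cmath.exp(1j * n * theta) for n in weights)

def inner(chi_v, chi_w, samples=64):
    """Normalized integral of chi_v * conj(chi_w) over the circle (uniform Riemann sum)."""
    total = 0j
    for j in range(samples):
        theta = 2 * cmath.pi * j / samples
        total += chi_v(theta) * chi_w(theta).conjugate()
    return total / samples

V = character([1, 1, 0])        # U_1 + U_1 + U_0
W = character([1, 0, 0, -2])    # U_1 + U_0 + U_0 + U_{-2}
hom_dim = inner(V, W)           # compare with dim Hom_G(V, W) = 2*1 + 1*2 = 4
```

Counting multiplicities by hand, an invariant map V → W sends each $U_n$ summand to summands of the same weight, giving 2·1 + 1·2 = 4, and `hom_dim` agrees up to rounding. Likewise `inner(V, V)` returns 5 = 2² + 1².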

It is not just characters which satisfy orthogonality relations:

Theorem 6.4 Let V, W be irreducible representations of a compact Lie group G and
take $v_1, v_2 \in V$, $w_1, w_2 \in W$. Then
\[ \int_G \langle gv_1, v_2\rangle\,\overline{\langle gw_1, w_2\rangle} \]
vanishes if V and W are inequivalent and if V ≅ W equals
\[ \frac{1}{\dim V}\langle v_1, w_1\rangle\,\overline{\langle v_2, w_2\rangle}. \]

If $v_1,\ldots,v_m$ is an orthonormal basis for V then $\langle gv_i, v_j\rangle$ is the (i, j) entry in the matrix
representing the action of G so the theorem tells us that the matrix entries of irreducible
representations form an orthogonal set in $L^2(G)$, orthonormal after rescaling by $\sqrt{\dim V}$.

Proof: Consider the term $\overline{\langle gw_1, w_2\rangle}$ in this expression. Since a Hermitian inner
product is antilinear in the second factor, for each w ∈ W, ⟨u, w⟩ is complex linear
in u and defines an element $\xi_w \in W^*$. In different words, a Hermitian inner product
gives an antilinear isomorphism from W to its dual. Since G preserves the inner
product, $\overline{\langle gw_1, w_2\rangle} = \langle w_2, gw_1\rangle = (g\xi_{w_1})(w_2)$ where $g\xi_{w_1}$ is the dual action.
Now
\[ \int_G gv_1 \otimes g\xi_{w_1} \]
is the orthogonal projection P of $v_1 \otimes \xi_{w_1}$ onto the invariant subspace of $V \otimes W^* = \mathrm{Hom}(W, V)$.
But by Schur's lemma if V, W are inequivalent irreducibles this is zero. If V = W then the invariant
part consists of scalar multiples of the identity so for A ∈ End V the orthogonal
projection onto multiples of I is
\[ P(A) = \frac{1}{\dim V}(\operatorname{tr} A)I. \]
Now tr(v ⊗ ξ) = ξ(v) so $P(v_1 \otimes \xi_{w_1}) = \xi_{w_1}(v_1)I/\dim V = \langle v_1, w_1\rangle I/\dim V$.
Evaluating $gv_1 \otimes g\xi_{w_1} \in V \otimes V^*$ on $\xi_{v_2} \otimes w_2$ is

\[ \langle gv_1, v_2\rangle\langle w_2, gw_1\rangle = \langle gv_1, v_2\rangle\,\overline{\langle gw_1, w_2\rangle} \]

and so
\[ \int_G \langle gv_1, v_2\rangle\,\overline{\langle gw_1, w_2\rangle} = \frac{1}{\dim V}\langle v_1, w_1\rangle \operatorname{tr}(\xi_{v_2} \otimes w_2) = \frac{1}{\dim V}\langle v_1, w_1\rangle\,\overline{\langle v_2, w_2\rangle}. \]

If $v_1,\ldots,v_m$ is an orthonormal basis for V and $w_1,\ldots,w_n$ for W then $\langle gv_i, v_j\rangle$ is the
(i, j) matrix coefficient for g acting on V and similarly for W. The theorem then
says first of all that if V, W are inequivalent irreducible representations the matrix
coefficients, as functions on G, are orthogonal. And if V = W we have
\[ \int_G \langle gv_i, v_j\rangle\,\overline{\langle gv_k, v_\ell\rangle} = \frac{1}{\dim V}\langle v_i, v_k\rangle\,\overline{\langle v_j, v_\ell\rangle}. \]
The right hand side is zero unless i = k and j = ℓ which means that the matrix
coefficients $\langle gv_i, v_j\rangle$ are orthogonal.
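
The orthogonality relation for matrix coefficients can be verified exactly in a finite example, replacing the integral by the average over the group as in the finite-group discussion of Section 6.2. The sketch below assumes the two-dimensional representation of the symmetry group of a triangle (six elements: three rotations and three reflections), which is irreducible and has real matrix entries, so no complex conjugate is needed. The function names are illustrative.

```python
import math
from itertools import product

def triangle_group_matrices():
    """The six 2x2 orthogonal matrices of the symmetry group of an equilateral triangle."""
    mats = []
    for k in range(3):
        c = math.cos(2 * math.pi * k / 3)
        s = math.sin(2 * math.pi * k / 3)
        mats.append([[c, -s], [s, c]])    # rotation by 2 pi k / 3
        mats.append([[c, s], [s, -c]])    # the same rotation composed with a reflection
    return mats

def averaged_product(mats, i, j, k, l):
    """Group average of the product of the (i, j) and (k, l) matrix entries."""
    return sum(m[i][j] * m[k][l] for m in mats) / len(mats)

mats = triangle_group_matrices()
table = {
    (i, j, k, l): averaged_product(mats, i, j, k, l)
    for i, j, k, l in product(range(2), repeat=4)
}
```

Every entry of `table` is 1/2 = 1/dim V when (i, j) = (k, l) and 0 otherwise, up to rounding, as the theorem predicts.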

For the circle, the matrix coefficients of irreducible representations are the functions
$e^{in\theta}$, n ∈ Z and these form a complete orthonormal basis for the Hilbert space of all
$L^2$ functions on $S^1$, or equivalently periodic functions on [0, 2π]. The same is true of
any compact Lie group.

Remark: The proof we shall give is for a compact subgroup of GL(n, R): a matrix
group. In fact any compact subgroup is embedded in a general linear group, but this
is a consequence of the theorem below given a general proof. Such a proof can be
found in many texts (or online) using some functional analysis. For any compact Lie
group, the image of Ad is of course a matrix group and Ado’s theorem (which we
did not prove either) tells us that some quotient of the universal covering is a matrix
group.

If G ⊂ GL(n, R) then it is a compact submanifold of $\mathbf{R}^{n^2}$. From point set topology
any continuous function on G can be extended to a continuous function on $\mathbf{R}^{n^2}$ and
the Weierstrass approximation theorem says that continuous functions there can be
approximated by polynomials, so continuous functions on G can be approximated by
polynomials in the entries of the n × n matrix.
Call the representation space for G ⊂ GL(n, R) V (which may decompose as a sum
of irreducibles) then the matrix coefficients for V are the linear polynomials and for
V ⊗ V the homogeneous quadratic ones etc. So we see that the matrix coefficients
for a countable collection of irreducible representations are dense in the continuous
functions on G with the uniform norm. Then using the orthogonality of Theorem 6.4,
by choosing orthogonal bases $\{v_1,\ldots,v_{n_m}\}$ for each irreducible $V_m$ and normalizing
we get a complete orthonormal basis in $L^2(G)$.
Rather than dealing with an orthogonal sequence, this result, the Peter-Weyl theorem,
is better stated as

Theorem 6.5 (Peter-Weyl) Let G be a compact Lie group. Then

\[ L^2(G) \cong \widehat{\bigoplus}\, V_L \otimes V_R^*, \]

the $L^2$-completion of the direct sum over all finite-dimensional irreducible
representations of G, where $V_L$ denotes the left action on functions and $V_R$ the right action.

Proof: The matrix coefficients are the functions $f(g) = \langle gv_i, v_j\rangle$ as the $v_i$ range
through an orthonormal basis of V and the V run through (representatives of
equivalence classes of) irreducible representations.
Consider the functions $\langle gv_i, v_j\rangle$. For fixed $v_j$ this is a representation of G. Varying j
shows that it has multiplicity dim V in $L^2(G)$. More invariantly we define an action
of G × G on the space spanned by these functions

\[ (h, k)\langle gv_i, v_j\rangle = \langle k^{-1}ghv_i, v_j\rangle \]

which gives the term $V_L \otimes V_R^*$. □

Remark:
1. The theorem above holds for any compact group and in particular for a finite
group. There $L^2(G)$ has dimension |G| and is the regular representation.
2. The character $\chi_V$ of an irreducible representation V is the function on G given by
\[ \chi_V(g) = \sum_{i=1}^{n} \langle gv_i, v_i\rangle. \]
In terms of the Peter-Weyl theorem this is the identity $I_V \in \mathrm{Hom}(V_R, V_L)$ on that
factor.
3. The Hilbert space $L^2(G)$ above of complex functions is clearly the complexification
of the space of real functions, even though the representations V may be complex.
But they appear as $\mathrm{End}\, V \subset L^2(G)$ and $T(A) = A^*$ is an antilinear involution which
commutes with the G-action, so these are real representations.

7 Maximal tori

7.1 Abelian subgroups


The eigenspaces of a unitary matrix A are orthogonal. This means we can find an
orthonormal basis such that A is diagonal. Another way of saying this is to consider
the set of diagonal matrices in U(n): $\mathrm{diag}(e^{i\theta_1}, e^{i\theta_2},\ldots,e^{i\theta_n})$. This is a Lie subgroup
isomorphic to the product of n copies of $S^1$: the torus $T^n$. The argument above shows
that any element in U(n) is conjugate to an element in $T^n$.
This is particularly relevant when considering the character $\chi_V$ of a representation
V. Since $\chi_V(hgh^{-1}) = \chi_V(g)$, the function $\chi_V$ is determined by its restriction to
$T^n$. Orthogonality of characters implies that the character uniquely determines V
(up to equivalence of course) so this function on $T^n$ determines V. Not only that, but
since $T^n$ is abelian, V can be expressed as a direct sum of one-dimensional invariant
subspaces. Considering the actions of the separate factors, on each irreducible the
action is given by a scalar of the form $\exp i(m_1\theta_1 + m_2\theta_2 + \cdots + m_n\theta_n)$ for $m_k \in \mathbf{Z}$.
In other words V is determined by a collection of integer vectors. This is a general
feature which we shall investigate next.

Definition 23 A torus T ⊂ G is a Lie subgroup T isomorphic to a product of circles.
A maximal torus is maximal under the inclusion of tori.

Tori, being the image of a compact group, are closed in G and hence embedded
Lie subgroups. They are connected and so (using the exponential map) any proper
inclusion $T \subset T'$ implies that $\dim T' > \dim T$. It follows that any torus is contained
in a maximal one. A maximal torus is maximal among connected abelian subgroups
A, for the closure of A is a torus.

Definition 24 Let T ⊂ G be a maximal torus. The Weyl group W of T is the
quotient of the normalizer by the normal subgroup T:

\[ W = N(T)/T = \{g \in G : gTg^{-1} = T\}/T. \]

This definition seems to depend on the choice of maximal torus but we shall prove
that in fact all maximal tori are conjugate so that the Weyl group (up to isomorphism)
is independent of choice.

Examples:
1. For SO(3) a maximal torus is the circle subgroup of rotations by θ about an axis
given by a unit vector u. Conjugating by a rotation which takes u to −u takes θ to
−θ. The Weyl group is $\mathbf{Z}_2$.
2. Return to U(n). A matrix which commutes with all diagonal matrices is diagonal so
the group T of matrices $\mathrm{diag}(e^{i\theta_1}, e^{i\theta_2},\ldots,e^{i\theta_n})$ is a maximal torus. Let h ∈ T have distinct diagonal entries.
Then since conjugation leaves the eigenvalues unchanged, conjugation by g ∈ N(T) must permute
the entries. But permutation of the elements of an orthonormal basis is a unitary
transformation π. It follows that $\pi^{-1}g$ leaves fixed the open set in T with distinct
entries and by continuity every point. But then it commutes with all diagonals and
so lies in T. Hence the Weyl group N(T)/T is isomorphic to the symmetric group
$S_n$.
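
The action of permutation matrices on the diagonal torus in example 2 is easy to check by machine. The sketch below is an illustration with hypothetical helper names: it conjugates a diagonal matrix with distinct entries by each permutation matrix of size 3 and records the resulting diagonals.

```python
import cmath
from itertools import permutations

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def perm_matrix(sigma):
    """Matrix P with P e_j = e_{sigma(j)}."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]

def conjugate_diagonal(sigma, d):
    """Return P D P^{-1} for D = diag(d); for a permutation matrix P^{-1} = P^T."""
    n = len(d)
    P = perm_matrix(sigma)
    D = [[d[i] if i == j else 0 for j in range(n)] for i in range(n)]
    PT = [[P[j][i] for j in range(n)] for i in range(n)]
    return matmul(matmul(P, D), PT)

d = [cmath.exp(1j * t) for t in (0.3, 1.1, 2.5)]    # distinct diagonal entries
diagonals = set()
stays_diagonal = True
for sigma in permutations(range(3)):
    C = conjugate_diagonal(sigma, d)
    if any(C[i][j] != 0 for i in range(3) for j in range(3) if i != j):
        stays_diagonal = False
    diagonals.add(tuple(C[i][i] for i in range(3)))
```

Each conjugate is again diagonal, and `diagonals` contains 3! = 6 distinct rearrangements of the entries of `d`, matching the identification of the Weyl group with $S_3$ in this case.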

Proposition 7.1 The Weyl group is a finite group.

Proof: The normalizer N(T) acts on T and its Lie algebra, commuting with the
exponential map. It thus preserves the kernel of exp which from Theorem 3.4 consists
of integer linear combinations of basis vectors $v_1,\ldots,v_n$. Since this is discrete and the
action is continuous an element connected to the identity acts trivially. Let $N_0$ be
the component of the identity in N(T). It is a connected Lie subgroup which contains T and
acts trivially on T by conjugation. If $N_0 \neq T$ there is a one-parameter subgroup, not contained in T, which
commutes with T, but this contradicts the maximality of T amongst connected abelian
subgroups, so $N_0 = T$. Then $W = N(T)/T = N(T)/N_0$ is a compact group with the
discrete topology and hence is finite.
□

7.2 Conjugacy of maximal tori


We shall prove that maximal tori are conjugate by using the map

F : G/T × T → G

(where G/T is the space of cosets of T) defined by $F(gT, t) = gtg^{-1}$. This is well-defined
because replacing g by gs for s ∈ T gives $gsts^{-1}g^{-1} = gtg^{-1}$. Clearly the
image consists of elements conjugate to an element in T, so if we can prove F is
surjective then we conclude that every element h is conjugate to an element of T.

Example: Consider G = SO(3) and $T = S^1 \subset G$ the circle subgroup of rotations
about the unit vector k. To each coset gT associate the unit vector gk and this
gives an identification of G/T with the unit sphere $S^2 \subset \mathbf{R}^3$. If $h = gtg^{-1}$ then
$hgk = gtg^{-1}gk = gk$ so the unit vector is the axis of rotation of h and t is the angle
of rotation. But a rotation of θ about u is the same as a rotation of −θ about −u,
so the inverse image of a general point in SO(3) is two points in $S^2 \times S^1$. If h = I,
however, the inverse image is $S^2 \times \{e\}$.
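
The two-to-one behaviour in this example can be seen concretely. The sketch below uses Rodrigues' rotation formula, a standard fact that is not part of these notes, to build the rotation by θ about a unit vector u; the function name is illustrative. It checks that (u, θ) and (−u, −θ) give the same element of SO(3) and that u is the fixed axis.

```python
import math

def rotation(u, theta):
    """Rodrigues' formula: rotation by angle theta about the unit vector u."""
    ux, uy, uz = u
    c, s = math.cos(theta), math.sin(theta)
    K = [[0, -uz, uy], [uz, 0, -ux], [-uy, ux, 0]]   # cross-product matrix of u
    return [[c * (i == j) + s * K[i][j] + (1 - c) * u[i] * u[j]
             for j in range(3)] for i in range(3)]

u = (1 / 3, 2 / 3, 2 / 3)                  # a unit vector
theta = 0.7
R1 = rotation(u, theta)
R2 = rotation(tuple(-x for x in u), -theta)
Ru = [sum(R1[i][j] * u[j] for j in range(3)) for i in range(3)]
```

The matrices `R1` and `R2` coincide, and `Ru` equals `u`, so the two points (u, θ) and (−u, −θ) of $S^2 \times S^1$ map to the same rotation.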

This example reveals the general features: first that G/T × T and G are compact
manifolds of the same dimension, and secondly that the smooth map F is a covering
space on an open subset. We shall show this in general and use a theorem in the
theory of manifolds: If a smooth map F between two compact, connected, orientable
manifolds has non-zero degree, then it is surjective. A proof of this can be found in
https://people.maths.ox.ac.uk/hitchin/hitchinnotes/manifolds2012.pdf
but we will sketch the idea below.
The proper way to treat this is to use the theory of differential forms and the exterior
derivative, but we have not introduced that here. However, we have a distinguished
n-form on G, so we will work with a compact manifold M with an everywhere non-
vanishing form Ω, whose existence is the definition of being orientable. We will call
a differential n-form ω on a manifold of dimension n exact if there is a vector field X
such that
\[ \omega = \mathcal{L}_X\Omega \]
where we have taken the Lie derivative of Ω. Recall that diffeomorphisms act on
n-forms and integrating X to a local one-parameter subgroup of diffeomorphisms $\varphi_t$ we
have
\[ \mathcal{L}_X\Omega = \frac{\partial}{\partial t}\varphi_t^*\Omega\Big|_{t=0}. \]
But since $\varphi_t$ is a diffeomorphism connected to the identity the integral of $\varphi_t^*\Omega$ over
a compact manifold M is constant (effectively just a change of variables in the
integration). Thus the integral of an exact form is zero. In $\mathbf{R}^n$ if
\[ X = \sum_i a_i\frac{\partial}{\partial x_i}, \qquad \Omega\left(\frac{\partial}{\partial x_1},\ldots,\frac{\partial}{\partial x_n}\right) = f \]
then
\[ \mathcal{L}_X\Omega = Xf + f\sum_i\frac{\partial a_i}{\partial x_i} = \operatorname{div}(fX) \]
so locally, if f has compact support, this is the divergence theorem. More important
is the global converse theorem, which we do not prove here:

Proposition 7.2 Two differential n-forms on a compact, connected, oriented manifold
of dimension n differ by an exact form (a divergence) if their integrals are the same.

Now differential forms, unlike vector fields, transform not just via diffeomorphisms
but for arbitrary smooth maps F : M → N. This is the pull-back $F^*\Omega$ of a form on
N defined pointwise by

\[ (F^*\Omega)_x(X_1,\ldots,X_n) = \Omega_{F(x)}(DF_x(X_1),\ldots,DF_x(X_n)) \]

for tangent vectors $X_i$ at x ∈ M. Then one can show that the pull-back of an exact
form is exact: if $\tilde\Omega$ is a non-vanishing form on M then the required vector field Y on
M is defined by

\[ \tilde\Omega_x(Y, X_1,\ldots,X_{n-1}) = \Omega_{F(x)}(X, DF_x(X_1),\ldots,DF_x(X_{n-1})). \]

Now if F : M → N is a smooth map of compact manifolds which is not surjective, its
image is compact and hence closed, so its complement is non-empty and open. We
can then take an n-form fΩ on N where f has support in some small open set of the
complement and such that the integral is 1. Then $F^*(f\Omega) = 0$ because f vanishes
on the image of F. On the other hand, if we have a point x over which the map
is a covering map, and we take a form gΩ of integral one in a small neighbourhood
U, then the integral of $F^*(g\Omega)$ is a sum of terms over the finite number of open sets
$U_1,\ldots,U_k$ whose union is $F^{-1}(U)$. Since $F : U_i \to U$ is a diffeomorphism the integral
is ±1 on each $U_i$ depending on whether $F^*\Omega$ is a positive or negative multiple of $\tilde\Omega$.
The degree is defined to be the sum of these terms, which is clearly an integer. If all
signs are positive it counts the number of inverse images.
But since fΩ and gΩ have the same integral on N, they differ by an exact form, and
hence so do $F^*(f\Omega)$ and $F^*(g\Omega)$, in which case $F^*(g\Omega)$ has the same integral on M
as $F^*(f\Omega) = 0$. So if the degree is non-zero, there is a contradiction and the map is
surjective.

To apply this to our situation we have three tasks:

1. Show that G/T is an orientable manifold
2. Find an element g ∈ G over which F is a covering map
3. Calculate the orientations at the finite number of points $F^{-1}(g)$.
1. The space G/T is the set of cosets gT, so any point can be obtained by the left action
of some g on the identity coset [e] ∈ G/T. In order to define a manifold structure it
suffices to define a chart in a neighbourhood of the identity and transport it around
by the left action. So let t ⊂ $T_eG$ be the tangent space of T at e ∈ G and using
the invariant inner product take its orthogonal complement t⊥. Then the exponential
map restricted to this gives, by the inverse function theorem, a submanifold of a
neighbourhood U of e ∈ G which intersects each coset in UT in one point. Using these
charts (and the fact that T is a compact group to provide the Hausdorff condition)
we give G/T the structure of a compact manifold of dimension dim G − dim T.
The action of g ∈ G gives an isomorphism from the tangent space at [e] to the
tangent space at [g], but this is only well-defined up to the right action of T on
g. This is the adjoint action of T on t⊥ , so only properties of this vector space
which are T -invariant can be propagated around G/T . For example, there are no
invariant vectors since T is a maximal connected abelian subgroup so unlike G itself
we don’t have invariant vector fields. The inner product restricted to t⊥ is invariant,
and an inner product defines a multilinear form Ω(E1 , . . . , En ) = 1 for an oriented
orthonormal basis E1 , . . . , En . Since T is connected it cannot alter the orientation so
this defines a non-vanishing form on G/T which is therefore oriented.
2. In the example of $SO(3)$ we could take a general element of the torus $T$ to get
a covering and here we do the same, but interpret “general” as being a generator
$t$ in the sense that the closure of the subgroup $\{t^n : n \in \mathbb{Z}\}$ is $T$. If the kernel of
the exponential map for $T$ consists of $\mathbb{Z}$-linear combinations of $v_1, \dots, v_k$ then
$t = \exp(c_1v_1 + c_2v_2 + \dots + c_kv_k)$ is a generator if $\{1, c_1, c_2, \dots, c_k\}$ are linearly independent
over the rational numbers. One property of this choice is the following:

Proposition 7.3 If $t \in T$ is a generator then there are $|W|$ points in the inverse
image $F^{-1}(t)$, where $W$ is the Weyl group of $T$.

Proof: If $F(gT, s) = gsg^{-1} = t$ then $s^n = g^{-1}t^ng$ for every $n$ and so, taking the closure, $g^{-1}Tg \subset T$. Thus $g \in N(T)$
and so there is one coset $gT$ for each element of $N(T)/T = W$. The cosets are just the orbit
of the identity coset $T$ under the left action of $N(T)$. □

To prove that F is a covering map we need its derivative at these |W | points, and
this is necessary for the orientation question too.
3. Define $\tilde F : G \times T \to G$ by $\tilde F(g, t) = gtg^{-1}$. This is a lift of the map $F$. By left
translation we want to compute the derivative at $(g, t)$ in terms of the geometry at
the identity $(e, e)$. This means first translating on $G \times T$ from $(e, e)$ to $(g, t)$, applying
$\tilde F$ and then left multiplying by $\tilde F(g, t)^{-1}$. It is easiest to calculate imagining it as a
matrix group: the derivative of $gtg^{-1}$ is $\dot g tg^{-1} + g\dot t g^{-1} - gtg^{-1}\dot g g^{-1}$ and left translation
by $(gtg^{-1})^{-1} = gt^{-1}g^{-1}$ gives

$$gt^{-1}g^{-1}\dot g tg^{-1} + gt^{-1}\dot t g^{-1} - \dot g g^{-1}$$

so if $X = g^{-1}\dot g$ is a left-invariant vector field on $G$ and $Y = t^{-1}\dot t$ on $T$ this is

$$\mathrm{Ad}(gt^{-1})(X) + \mathrm{Ad}(g)(Y) - \mathrm{Ad}(g)(X) = \mathrm{Ad}(g)(\mathrm{Ad}(t^{-1})X + Y - X).$$

For the derivative of $F$ we want to restrict $X$ to the subspace $\mathfrak{t}^\perp$ and then this is a
linear map from $T_eG = \mathfrak{t}^\perp \oplus \mathfrak{t}$ to itself.
For a covering map we want this to be invertible. Since $\mathrm{Ad}(g)$ is invertible this will
fail to be invertible only if $\mathrm{Ad}(t^{-1})$ has an eigenvalue $+1$ on $\mathfrak{t}^\perp$. But as remarked
above that does not hold since $T$ is maximal abelian. Now $\mathrm{Ad}(g)$ is orthogonal so
its determinant is $\pm 1$ for each of the $|W|$ points $g_i$ over $t$. However, the $g_i$ are all
connected to the identity so $\det \mathrm{Ad}(g_i) = 1$ and hence $\det \mathrm{Ad}(g_i)\,\det(\mathrm{Ad}(t^{-1}) - I)$ is
the same non-zero number for each point and thus the degree is non-zero. We deduce
therefore that $F$ is surjective and hence

Theorem 7.4 Every element g ∈ G is contained in a maximal torus and all such
tori are conjugate.

The last part of the theorem follows by taking $t'$ to be a generator of a maximal torus
$T'$. Then $t' = gtg^{-1}$ for some $t \in T$ and taking powers and the closure it follows that
$T' \subset gTg^{-1}$ but by maximality this must be equality.

Remark: The power of this type of argument is that by investigating the equation
F (x) = g for a general point g we deduce the existence of a solution for any point.
For example, suppose $g$ is in the centre $Z(G)$, the subgroup which commutes with
everything. Then by the theorem it lies in a maximal torus $T$, but $hgh^{-1} = g$ so
it lies in all conjugates, hence in the intersection of all maximal tori. Conversely,
since every element is in a maximal torus, which is abelian, every element commutes
with the intersection, which is therefore equal to the centre. We saw earlier that a
covering homomorphism G → H is given by H = G/Γ for Γ ⊂ Z(G). It follows that
the maximal tori of H are quotients of those of G by Γ.
For example $\mathrm{diag}(e^{i\theta}, e^{-i\theta})$ is the maximal torus of $SU(2)$ which contains the centre
$\pm I$. The adjoint representation $SU(2) \to SO(3)$ is a covering homomorphism in
which $-I$ acts trivially. The group $SO(3)$ has trivial centre.

From the theorem it follows that, up to conjugation, there is a single Weyl group W
and the dimension of T , called the rank, is an invariant of G. There is one further
point:

Proposition 7.5 The dimension of G/T is even.

Proof: The tangent space at $[e] \in G/T$ is $\mathfrak{t}^\perp$ and this is a real representation of the
compact abelian group $T$. We saw that, for a generator $t$, $\mathrm{Ad}(t^{-1})$ has no eigenspace with eigenvalue
$+1$. Nor does it have one with $-1$, for if so $\mathrm{Ad}(t^2)$ would have a $+1$ eigenvalue.
But $t^2$ is also a generator since if $\{1, c_1, c_2, \dots, c_k\}$ are linearly independent over the
rationals so are $\{1, 2c_1, 2c_2, \dots, 2c_k\}$. It follows that the real irreducible subspaces
have dimension 2 and hence $\dim \mathfrak{t}^\perp$ is even. □

7.3 Roots
In the proof of Proposition 7.5 we used the fact that $\mathfrak{t}^\perp$ is a direct sum of irreducible
real 2-dimensional representation spaces for $T$, or homomorphisms $T \to SO(2)$. So
the tangent space at $e$, or equivalently the Lie algebra, is a direct sum

$$\mathfrak{g} = \mathfrak{t} \oplus \bigoplus_a \mathfrak{g}_a.$$

Let $k = \dim \mathfrak{t}$ be the rank of $G$. Choosing an orientation on $\mathbb{R}^2$ allows us to identify
the rotation action on $\mathfrak{g}_a$ as $e^{2\pi i\theta_a}$ where, for $(e^{2\pi ix_1}, e^{2\pi ix_2}, \dots, e^{2\pi ix_k}) \in T$, we have
$\theta_a = m_1x_1 + \dots + m_kx_k$ for integers $m_i$. Choosing the opposite orientation gives $-\theta_a$.

Definition 25 The linear forms ±θa ∈ t∗ are called the roots of G.

Example:
1. For $U(n)$ we have seen that $T$ consists of the diagonal matrices and the Lie algebra
is the space of skew Hermitian matrices, so $B_{ij} = -\bar B_{ji}$. The 1-dimensional complex
space of entries in the upper triangular $(i, j)$-place with $i < j$ is a real 2-dimensional
space $\mathfrak{g}_a$ since it is acted on by $T$ as $e^{2\pi ix_i}e^{-2\pi ix_j}$. Thus the roots are $x_i - x_j$, $i \ne j$.
2. For $SU(n)$ the only difference is that since $\det A = 1$ the maximal torus is given
by $\sum_i x_i = 0$. In particular for $SU(2)$ the roots are $\pm(x_1 - x_2) = \pm 2x_1$.
3. If $G = SO(2m)$ then the maximal torus consists of $m$ $2 \times 2$ blocks down the
diagonal each of the form
$$\begin{pmatrix} \cos 2\pi x_i & \sin 2\pi x_i \\ -\sin 2\pi x_i & \cos 2\pi x_i \end{pmatrix}.$$
It acts on the space of skew-symmetric matrices broken up into $2 \times 2$ blocks, with
the $i < j$ place a real $2 \times 2$ matrix acted on the left by a rotation by $2\pi x_i$ and on
the right by $-2\pi x_j$. This is a tensor product $V_i \otimes V_j$ and complexifying and looking
at the eigenspaces the four one-dimensional representations are $e^{2\pi i(\pm x_i \pm x_j)}$. Thus the
roots are $\pm x_i \pm x_j$ for $i < j$.
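
The first example can be checked numerically. The following is only a sketch in plain Python, with helper names (`matmul`, `conjugate_matrix_unit`) invented for illustration and the convention $t = \mathrm{diag}(e^{2\pi ix_1}, \dots, e^{2\pi ix_n})$ assumed: it conjugates each matrix unit $E_{ij}$ by a diagonal torus element and confirms that the result is $E_{ij}$ scaled by $e^{2\pi i(x_i - x_j)}$, which is the statement that the roots of $U(n)$ are $x_i - x_j$.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def conjugate_matrix_unit(xs, i, j):
    """Return t E_ij t^{-1} for t = diag(e^{2 pi i x_1}, ..., e^{2 pi i x_n})."""
    n = len(xs)
    phases = [cmath.exp(2j * cmath.pi * x) for x in xs]
    t = [[phases[r] if r == c else 0 for c in range(n)] for r in range(n)]
    t_inv = [[1 / phases[r] if r == c else 0 for c in range(n)] for r in range(n)]
    unit = [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]
    return matmul(matmul(t, unit), t_inv)

xs = [0.1, 0.25, 0.7]  # an arbitrary sample point of the torus (assumption)
for i in range(3):
    for j in range(3):
        if i != j:
            image = conjugate_matrix_unit(xs, i, j)
            weight = cmath.exp(2j * cmath.pi * (xs[i] - xs[j]))
            # only the (i, j) entry survives, scaled by the root character
            assert abs(image[i][j] - weight) < 1e-12
            assert all(abs(image[r][c]) < 1e-12
                       for r in range(3) for c in range(3) if (r, c) != (i, j))
```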

Given a root θa the kernel of the homomorphism T → S 1 is a subtorus of dimension
k − 1 which acts trivially on ga . So take a one-parameter subgroup with tangent
X ∈ ga . Its closure together with the kernel generates a connected abelian subgroup.
In fact since the maximum dimension is k it will be automatically closed (there is a
geometric reason for this which will become evident below). It follows that points on
a root hyperplane defined by θa (x) = 0 are tangent to many maximal tori.
Conversely, suppose that $t \in T$ is not contained in the kernel of any of the root
homomorphisms. This means its action on each $\mathfrak{g}_a$ is non-trivial. But this is the
condition in the proof of Theorem 7.4 for the derivative of $F$ to be an isomorphism
and $F$ to be a covering map in a neighbourhood. Moreover this neighbourhood
contains generators of $T$, so as in the proof of Proposition 7.3 the cosets in the fibre
form an orbit of $N(T)$.
Hence if $t$ is contained in another maximal torus $hTh^{-1}$ then $t = hsh^{-1}$ for some
$s \in T$, which means $(hT, s)$ is in $F^{-1}(t)$. But then $hT = gT$ for some $g \in N(T)$, so $h = gu$ with
$u \in T$ and hence $h \in N(T)$. This means $hTh^{-1} = T$.
Hence if $t$ is not contained in the kernel of any of the root homomorphisms it lies in a
unique maximal torus. The root planes are thus the tangent spaces of the intersection
of other maximal tori with $T$.

Example: In $U(n)$ the root planes are $x_i - x_j = 0$, so a matrix $A$ lies in a unique
maximal torus if its eigenvalues $e^{2\pi ix_i}$ are distinct.

The subgroup $N(T) \subset G$ acts on $\mathfrak{g}$ via the adjoint representation preserving $\mathfrak{t} \subset \mathfrak{g}$,
but $T$ acts trivially on itself by conjugation, so the quotient $W = N(T)/T$,
the Weyl group, acts on $\mathfrak{t}$. If $\rho : T \to \mathrm{Aut}(\mathfrak{g})$ is the adjoint action then
$\rho(gtg^{-1}) = \mathrm{Ad}(g)\rho(t)\mathrm{Ad}(g)^{-1}$, for $g \in N(T)$, is an equivalent representation. Thus $\mathrm{Ad}(g)$ permutes
the irreducible components of the action and so permutes the roots.
The discussion above shows that if X ∈ t does not lie in any root hyperplanes then
its orbit under W has |W | elements and so the action on this open subset is free.
In particular W embeds as a finite subgroup of linear automorphisms of t. It also
preserves the inner product, so it embeds in O(k).

Example:
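1. For $SU(3)$ the rank $k = 2$ and $W = S_3$ so this symmetric group embeds as a
subgroup of $O(2)$: it can only be the dihedral group of rotations and reflections of an
equilateral triangle.
2. For $SU(4)$ we have $S_4 \subset O(3)$, the full symmetry group (rotations and reflections) of a
regular tetrahedron. The rotation group of the cube is also isomorphic to $S_4$, but it lies
in $SO(3)$ and so contains no reflections.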

If X ∈ t lies in a root hyperplane then there are elements of W which leave it fixed.
We prove the following:

Theorem 7.6 The orthogonal reflection in each root hyperplane lies in the Weyl
group.

Since reflection in a hyperplane has one eigenvalue −1 and the rest 1, its determinant
is −1 and so W never lies in SO(k).

Proof: The 2-dimensional root space $\mathfrak{g}_a$ has a positive definite inner product: choose
an orthonormal basis $X_1, X_2$. Then $X_1 \pm iX_2$ is acted on by $T$ as $e^{\pm 2\pi i\theta_a(x)}$ and so
$[X_1, X_2] = i[X_1 + iX_2, X_1 - iX_2]/2 = \alpha$ is invariant by $T$ and so lies in $\mathfrak{t}$. It is non-zero,
for otherwise the root hyperplane together with $X_1, X_2$ span an abelian subalgebra
of $\mathfrak{g}$ of dimension $k + 1$ which contradicts maximality.
Since $T$ preserves $\mathfrak{g}_a$ this means that $\{X_1, X_2, \alpha\}$ spans a 3-dimensional non-abelian
Lie subalgebra. Its elements are skew-adjoint with respect to the inner product so
ad defines an isomorphism to $\mathfrak{so}(3)$. There is thus a corresponding connected Lie
subgroup $G_a$ of $G$ which is a covering of $SO(3)$. But the double covering of $SO(3)$ is
$SU(2)$ which is simply connected so the Lie subgroup is either $SU(2)$ or $SO(3)$ which
are compact and hence the subgroup is embedded.
Take $Y \in \mathrm{Ker}\,\theta_a$, then $[Y, X_1] = [Y, X_2] = 0$ since $Y$ acts trivially on $\mathfrak{g}_a$. And
$[\alpha, Y] = 0$ because $Y, \alpha \in \mathfrak{t}$ which is abelian. Hence $G_a$ fixes the hyperplane.
Consider $(Y, \alpha) = (Y, [X_1, X_2]) = (Y, \mathrm{ad}(X_1)X_2)$. Since the inner product is bi-invariant,
$\mathrm{ad}(X_1)$ is skew adjoint and so

$$(Y, \mathrm{ad}(X_1)X_2) = -(\mathrm{ad}(X_1)Y, X_2) = -([X_1, Y], X_2)$$

which vanishes since $[X_1, Y] = 0$. Hence $\alpha$ is orthogonal to the root hyperplane.
There is a rotation in $SO(3)$ which takes $\alpha$ to $-\alpha$ and a corresponding element
$g \in G_a$. This fixes pointwise $\mathrm{Ker}\,\theta_a$ and preserves $\mathrm{Ker}\,\theta_a \oplus \mathbb{R}\alpha = \mathfrak{t}$ and so lies in
$N(T)$. Consequently it acts as an element of order 2 in the Weyl group, namely the reflection in the root hyperplane. □
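
For $U(n)$ the theorem can be made concrete. The sketch below assumes the standard inner product on the diagonal matrices and takes the vector $e_i - e_j$ as the normal to the root hyperplane $x_i = x_j$ (the function names are illustrative only). It checks that the orthogonal reflection $x \mapsto x - 2\frac{(x, \alpha)}{(\alpha, \alpha)}\alpha$ simply exchanges $x_i$ and $x_j$, which is the action of a transposition in $W = S_n$.

```python
from fractions import Fraction

def root_vector(n, i, j):
    """Normal vector e_i - e_j to the root hyperplane x_i = x_j."""
    v = [0] * n
    v[i], v[j] = 1, -1
    return v

def reflect(x, alpha):
    """Orthogonal reflection of x in the hyperplane orthogonal to alpha."""
    dot = sum(a * b for a, b in zip(x, alpha))
    norm = sum(a * a for a in alpha)
    coeff = Fraction(2) * dot / norm
    return [xi - coeff * ai for xi, ai in zip(x, alpha)]

x = [Fraction(1, 7), Fraction(2, 5), Fraction(-3, 4), Fraction(1, 3)]
y = reflect(x, root_vector(4, 0, 2))
# the reflection swaps entries 0 and 2 and fixes the others
assert y == [x[2], x[1], x[0], x[3]]
# reflecting twice gives the identity, so the element has order 2
assert reflect(y, root_vector(4, 0, 2)) == x
```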

Remark: Every one-parameter subgroup in $SU(2)$ or $SO(3)$ is a circle and embedded.
The maximal tori which intersect the root plane are then generated by the
$(k - 1)$-dimensional kernel of $\rho_a : T \to S^1$ and one of the 2-parameter family of circle
subgroups of $G_a$.

8 Representations and maximal tori

8.1 The representation ring


We have observed that given representations V, W we can form the direct sum V ⊕ W
and the tensor product V ⊗ W . We can also write mV = V ⊕ V ⊕ · · · ⊕ V for
a positive integer m. If we now take equivalence classes [V ] of representations and
allow negative values of m we get a ring R(G) called the representation ring. Since
an equivalence class is determined by its character χV and χV ⊕W = χV + χW , and
χV ⊗W = χV χW this is also the character ring consisting of finite integer combinations
of the functions χV . We take complex representations here.

Example: For the circle $S^1$, every irreducible representation is of the form $U_n$ with
character $e^{in\theta}$. So any representation is a sum of these and $R(S^1) = \mathbb{Z}[t, t^{-1}]$, the ring
of finite Laurent series in $t$ where $t$ is the basic representation $U_1$ with character $e^{i\theta}$.

Restricting a representation $\rho$ of $G$ to a maximal torus $T$, it splits into one-dimensional
irreducibles with character of the form $\exp 2\pi i(a_1x_1 + \dots + a_kx_k)$ and these linear
maps $a_1x_1 + \dots + a_kx_k$ are called the weights of the representation. The roots are
the weights for the adjoint representation.
The representation ring $R(T)$ for $T = S^1 \times S^1 \times \dots \times S^1$ is, from the example above,
the Laurent polynomials in $k$ variables where $k = \dim T$: $R(T) = \mathbb{Z}[t_1^{\pm 1}, t_2^{\pm 1}, \dots, t_k^{\pm 1}]$.
Just as the Weyl group permutes the roots, it also permutes the weights of $\rho$ and so
the character of a representation restricted from $G$ must be invariant under $W$.

Proposition 8.1 The restriction map $R(G) \to R(T)^W$, to the fixed part under the
Weyl group, is injective.

Proof: By orthogonality of characters a representation is determined by its character
$\chi$, and the character is conjugation-invariant so since every element is conjugate to
one in $T$, $\chi$ is determined by its restriction to $T$. The conjugation action of $N(T)$
leaves $\chi$ invariant so its restriction to $T$ must be invariant under the Weyl group. □

Example: If $G = SU(2)$, then $T = \mathrm{diag}(e^{i\theta}, e^{-i\theta})$ and the Weyl group is $S_2 = \mathbb{Z}_2$ taking
$\theta$ to $-\theta$ (a rather simple case of reflection!) or $t$ to $t^{-1}$. So $R(T)^W$ is the ring of
Laurent polynomials of the form

$$\pi = m_0 + m_1(t + t^{-1}) + \dots + m_n(t^n + t^{-n}). \qquad (4)$$

The character $\chi_n$ of the irreducible representation $V_n$ in Section 6.1 is

$$t^n + t^{-n} + t^{n-2} + t^{-(n-2)} + \dots$$

so we can rewrite $\pi$ as a linear combination of irreducible characters:

$$\pi = m_0 + m_1\chi_1 + m_2(\chi_2 - \chi_0) + \dots + m_n(\chi_n - \chi_{n-2}).$$

The character of any irreducible representation can therefore be written in this form.
By orthogonality of characters, it must therefore be one of the $V_n$. The character
of any genuine representation is a sum of non-negative multiples of the $\chi_n$: a Laurent
polynomial (4) where $m_i - m_{i+2} \ge 0$ for all $i$ (taking $m_j = 0$ for $j > n$).
Since
$$(t + t^{-1})^n = t^n + t^{-n} + n(t^{n-2} + t^{-(n-2)}) + \dots$$
$\pi$ can also be written as a polynomial in a single variable $t + t^{-1}$ so algebraically
$R(T)^W = \mathbb{Z}[t + t^{-1}]$.
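
The rewriting of $\pi$ in terms of irreducible characters is easy to mechanise. In the sketch below (function names are hypothetical) a Weyl-invariant Laurent polynomial is stored as the list $[m_0, m_1, \dots, m_n]$ of its coefficients in (4), and the multiplicity of $\chi_i$ is read off as $m_i - m_{i+2}$. The inverse map is included as a consistency check.

```python
def irreducible_multiplicities(m):
    """Given [m_0, ..., m_n] as in (4), return [a_0, ..., a_n] with pi = sum a_i chi_i."""
    padded = list(m) + [0, 0]  # m_j = 0 for j > n
    return [padded[i] - padded[i + 2] for i in range(len(m))]

def invariant_coefficients(a):
    """Inverse map: chi_k contains t^i + t^-i exactly when k >= i and k - i is even."""
    n = len(a)
    return [sum(a[k] for k in range(i, n, 2)) for i in range(n)]

# (t + 1/t)^2 = 2 + (t^2 + t^-2) is the character of V_1 tensor V_1
square = [2, 0, 1]
print(irreducible_multiplicities(square))  # [1, 0, 1], i.e. V_0 + V_2
assert invariant_coefficients(irreducible_multiplicities(square)) == square
```

A polynomial (4) is the character of a genuine representation exactly when every entry returned by `irreducible_multiplicities` is non-negative.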

In the example, we saw that the restriction map R(G) → R(T )W was surjective and
hence an isomorphism. This is a general fact though we shall not prove it here.
Identifying the irreducible characters in the representation ring is in general far more
difficult than the case of SU (2). We shall demonstrate the isomorphism in another
example next.

8.2 Representations of U (n)


The Weyl group for $U(n)$ is the symmetric group $S_n$ acting on an $n$-dimensional
torus. Any polynomial in $t_1, \dots, t_n$ which is symmetric is itself a polynomial in
the elementary symmetric functions $\sigma_m$: up to the sign $(-1)^m$, $\sigma_m$ is the coefficient of $x^{n-m}$ in the expansion of
$(x - t_1)(x - t_2) \dots (x - t_n)$. The representation ring $R(T) = \mathbb{Z}[t_1^{\pm 1}, \dots, t_n^{\pm 1}]$ involves
the negative powers, but we can convert such a Laurent polynomial into a genuine
polynomial by multiplying by a power of $\sigma_n = t_1t_2 \dots t_n$. If we can find representations
of $U(n)$ with characters $\sigma_1, \sigma_2, \dots, \sigma_n, \sigma_n^{-1}$ then we will have an isomorphism
$R(U(n)) \cong R(T)^W$ and

$$R(T)^W \cong \mathbb{Z}[\sigma_1, \sigma_2, \dots, \sigma_n, \sigma_n^{-1}]$$

as a ring.
If $V$ is the $n$-dimensional vector space which defines $U(n)$ we shall use the exterior
powers $\Lambda^mV$ for $0 \le m \le n$. This is the algebra which lies behind the differential
forms which we used in discussing integration, but now using a complex vector space.
The easiest way to define $\Lambda^mV$ is as the dual space of the vector space of alternating
multilinear functions $M(v_1, \dots, v_m)$ on $V$. So $\Lambda^1V$ is the dual space of $V^*$ which is
canonically $V$ itself. Given vectors $v_1, \dots, v_m \in V$ we define a linear function of
multilinear forms $M$, that is an element of $\Lambda^mV$, by

$$(v_1 \wedge v_2 \wedge \dots \wedge v_m)(M) \overset{\mathrm{def}}{=} M(v_1, \dots, v_m).$$

This expression is linear in each $v_i$ and the alternating property of $M$ means it changes
sign if any two $v_i$ are interchanged. A general element in $\Lambda^mV$ is a linear combination
of terms like this. In particular, if $v_1, \dots, v_n$ is a basis for $V$, a basis for $\Lambda^mV$ is
provided by $v_{i_1} \wedge v_{i_2} \wedge \dots \wedge v_{i_m}$ where $i_1 < i_2 < \dots < i_m$. A matrix $A \in U(n)$ acts as

$$A \sum_{i_1 < \dots < i_m} a_{i_1i_2 \dots i_m}\, v_{i_1} \wedge v_{i_2} \wedge \dots \wedge v_{i_m} = \sum_{i_1 < \dots < i_m} a_{i_1i_2 \dots i_m}\, Av_{i_1} \wedge Av_{i_2} \wedge \dots \wedge Av_{i_m}.$$

Taking the maximal torus as the diagonal matrices with respect to the unitary basis
$v_1, \dots, v_n$, the character of this representation is the elementary symmetric function
$\sigma_m$. When $m = n$, $\Lambda^nV$ is one-dimensional and $A$ acts on it as $\det A$, since if $w_i = \sum_j A_{ij}v_j$, then

$$w_1 \wedge w_2 \wedge \dots \wedge w_n = \det A \; v_1 \wedge v_2 \wedge \dots \wedge v_n.$$

Its dual space has character $\sigma_n^{-1}$. Hence the generators of $R(T)^W$ are all characters
of representations of $U(n)$ and so $R(U(n)) \cong R(T)^W$.
These representations are irreducible because if $U \subset \Lambda^mV$ is an invariant subspace,
it is invariant by $T$ and hence a sum of one-dimensional representations given by
the weights. But the weights are all distinct so some $v_{i_1} \wedge v_{i_2} \wedge \dots \wedge v_{i_m} \in U$ and
the character $\chi_U$ contains the monomial $t_{i_1}t_{i_2} \dots t_{i_m}$. But it is invariant under the
symmetric group and so contains all such terms and is the character of $\Lambda^mV$.

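Example: The adjoint representation has weights which are the roots $\pm(x_i - x_j)$ and
$n$ trivial weights, so its character is

$$\chi_{\mathrm{Ad}} = n + \sum_{i \ne j} t_it_j^{-1} = \Big(\sum_i t_i\Big)\Big(\sum_j t_j^{-1}\Big).$$

Since $\sigma_n \sum_j t_j^{-1} = \sigma_{n-1}$, multiplying by $\sigma_n = t_1t_2 \dots t_n$ gives $\sigma_n\chi_{\mathrm{Ad}} = \sigma_1\sigma_{n-1}$, so in $R(T)^W$ we have

$$\chi_{\mathrm{Ad}} = \frac{\sigma_1\sigma_{n-1}}{\sigma_n}.$$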
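
As a sanity check on the character of the adjoint representation of $U(n)$, the sketch below evaluates $n + \sum_{i \ne j} t_it_j^{-1}$ at rational sample values of the $t_i$ and compares it with $\sigma_1\sigma_{n-1}/\sigma_n$. Both sides are Laurent polynomials in the $t_i$, so agreement at generic rational points is a fair test of the identity. The helper names and the sample values are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elementary_symmetric(ts, m):
    """sigma_m(t_1, ..., t_n): sum of all products of m distinct t_i."""
    return sum(prod(c) for c in combinations(ts, m))

def adjoint_character(ts):
    """n trivial weights plus one term t_i / t_j for each root x_i - x_j."""
    n = len(ts)
    return n + sum(ts[i] / ts[j] for i in range(n) for j in range(n) if i != j)

ts = [Fraction(2), Fraction(3), Fraction(5), Fraction(7)]  # arbitrary sample point
n = len(ts)
closed_form = (elementary_symmetric(ts, 1) * elementary_symmetric(ts, n - 1)
               / elementary_symmetric(ts, n))
assert adjoint_character(ts) == closed_form
```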

Elements of the group $SU(n)$ have determinant 1 and the determinant is the action
on $\Lambda^nV$ which is therefore the trivial representation and $t_1 \dots t_n = 1$, which is the
equation of its maximal torus. In this case the generators $\sigma_n, \sigma_n^{-1}$ are removed. In
particular when $n = 2$ we recover the description of $R(SU(2))$ above.

The representation ring contains the characters of all representations. An additive
basis is provided by the irreducible ones but the ring by itself does not identify these.
This requires the further study of roots, weights and the Weyl group which is beyond
this course. By the same token, it is difficult to determine the product in the ring
in terms of a basis of irreducibles. This is an important issue: given two irreducible
representations $V, W$, how does $V \otimes W$ break up as a sum of irreducible components?
We can answer this for $SU(2)$, since we know that the $V_n$, the representation on
homogeneous polynomials $f(z_1, z_2)$ of degree $n$, give all the irreducibles.

Proposition 8.2 (Clebsch-Gordan) The tensor product of the irreducible representations
$V_m, V_n$ (where $m \ge n$) of $SU(2)$ decomposes as

$$V_m \otimes V_n = V_{m+n} \oplus V_{m+n-2} \oplus \dots \oplus V_{m-n}.$$

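Proof: The character of $V_n$ is $t^n + t^{n-2} + \dots + t^{-n}$ or

$$\chi_{V_n} = \frac{t^{-n}(1 - t^{2n+2})}{1 - t^2}.$$

Hence

$$\chi_{V_m \otimes V_n} = \chi_{V_m}\chi_{V_n} = \frac{t^{-m-n}(1 - t^{2m+2})}{1 - t^2}\,(1 + t^2 + \dots + t^{2n})$$

and collecting terms the numerator is

$$t^{-m-n}(1 - t^{2m+2n+2}) + t^{-m-n+2}(1 - t^{2m+2n-2}) + \dots + t^{-m+n}(1 - t^{2m-2n+2}).$$

Divided by $1 - t^2$, the term beginning with $t^{-(m+n-2j)}$ is the character of $V_{m+n-2j}$, for $0 \le j \le n$, which gives the decomposition. □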
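
The character identity behind the Clebsch-Gordan formula can also be verified mechanically for small $m, n$. The sketch below is an illustration only: it stores a Laurent polynomial in $t$ as a `Counter` mapping exponents to coefficients (a representation chosen here for convenience) and compares $\chi_{V_m}\chi_{V_n}$ with the sum of the characters on the right-hand side.

```python
from collections import Counter

def character(n):
    """chi_{V_n} = t^n + t^(n-2) + ... + t^(-n) as {exponent: coefficient}."""
    return Counter({k: 1 for k in range(-n, n + 1, 2)})

def multiply(p, q):
    """Product of two Laurent polynomials."""
    out = Counter()
    for a, ca in p.items():
        for b, cb in q.items():
            out[a + b] += ca * cb
    return out

def clebsch_gordan(m, n):
    """Indices k of the summands V_k of V_m tensor V_n."""
    if m < n:
        m, n = n, m
    return list(range(m + n, m - n - 1, -2))

def check(m, n):
    """True when chi_m * chi_n equals the sum of the chi_k predicted above."""
    total = Counter()
    for k in clebsch_gordan(m, n):
        total.update(character(k))
    return multiply(character(m), character(n)) == total

print(clebsch_gordan(3, 1))  # [4, 2]
assert all(check(m, n) for m in range(6) for n in range(6))
```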

8.3 Integration on T
The argument for injectivity R(G) ⊂ R(T )W was based on orthogonality of charac-
ters. This means by integration over G. But since χV (ghg −1 ) = χV (h) the function
itself is uniquely determined by its restriction to T , so there should be a formula for
integrating a function of this form (a class function) over G in terms of an integral
over T .

In fact the formula comes from our discussion in 7.2 of the map $F : G/T \times T \to G$
together with the properties of root hyperplanes in 7.3. Recall that when $g \in G$ lies
in a unique maximal torus the map $F$ in a neighbourhood of $F^{-1}(g)$ is a covering
map of degree $|W|$ and so there is a dense open set of $G$ over which $F$ is
a covering. In particular, this is a set of full measure, so we can restrict our integral to this
set.
Further, because $F$ is a local diffeomorphism, the integral of an $n$-form transforms
via the determinant of $DF$, relative to the invariant $n$-forms on $G$ and $G/T \times T$. But
we calculated this, for $g = t$, as

$$\det(\mathrm{Ad}(t^{-1}) - I)$$

acting on $\mathfrak{t}^\perp$, although we omitted to determine the sign of this.


However, $t$ preserves each root space $\mathfrak{g}_a$ and acts via the root $\theta_a$ so the determinant
is a product of terms

$$\det\begin{pmatrix} \cos 2\pi\theta_a(x) - 1 & \sin 2\pi\theta_a(x) \\ -\sin 2\pi\theta_a(x) & \cos 2\pi\theta_a(x) - 1 \end{pmatrix} = 2(1 - \cos 2\pi\theta_a(x))$$

and, since $\theta_a(x)$ is not an integer because $t$ does not lie in the kernel of any root homomorphism, this is positive.
The degree of the covering is $|W|$ so the integral over $F^{-1}(U)$ for a coordinate neighbourhood
$U$ in $G$ is $|W|$ times the integral over $U$.
Now suppose $f : G \to \mathbb{C}$ is a class function, and consider $f \circ F : G/T \times T \to \mathbb{C}$. This
is $(f \circ F)(gT, t) = f(gtg^{-1}) = f(t)$. For fixed $t$ this is constant on $G/T$. Applying Fubini’s
theorem we get

Proposition 8.3 (Weyl integration formula) If $f$ is a class function,

$$\int_G f(g) = \frac{1}{|W|}\int_T \det(\mathrm{Ad}(t^{-1}) - I)|_{G/T}\, f(t).$$

Example: For $SU(2)$ with roots $\pm 2x$, we have

$$\det(\mathrm{Ad}(t^{-1}) - I)|_{G/T} = 2(1 - \cos 4\pi x) = -(t - t^{-1})^2$$

if $t = e^{2\pi ix}$. Then since the Weyl group has order 2, the Weyl integration formula
gives
$$\int_{SU(2)} f(g) = \int_0^1 (1 - \cos 4\pi x)\, f(x)\,dx.$$

We can test orthogonality of the characters of $V_m, V_n$ this way by replacing this by
the contour integral around the unit circle $C$. Since $\chi_{V_m} = e^{2\pi imx} + \dots + e^{-2\pi imx}$ is real,

$$\int_{SU(2)} \chi_{V_m}\bar\chi_{V_n} = \int_{SU(2)} \chi_{V_m}\chi_{V_n} = -\frac{1}{2}\int_C \frac{t^{-m-n}(1 - t^{2n+2})(1 - t^{2m+2})}{(1 - t^2)^2}\,(t - t^{-1})^2\,\frac{1}{2\pi i}\,t^{-1}\,dt$$

which is by Cauchy’s residue theorem the residue at $t = 0$ of

$$-\frac{1}{2t^3}\left(t^{-m-n} - t^{-m+n+2} - t^{m-n+2} + t^{m+n+4}\right)dt$$

and this vanishes unless $m = n$ in which case it is 1.
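
The same orthogonality can be checked numerically straight from the Weyl integration formula for $SU(2)$, without the contour integral. The sketch below (names are illustrative) approximates $\int_0^1 (1 - \cos 4\pi x)\chi_{V_m}\chi_{V_n}\,dx$ by a uniform Riemann sum. Because the integrand is a trigonometric polynomial, a uniform sum with more sample points than its degree is exact up to rounding, which is the assumption behind the choice of `samples`.

```python
import math

def chi(n, x):
    """Character of V_n at t = e^{2 pi i x}; the imaginary parts cancel, so it is real."""
    return sum(math.cos(2 * math.pi * k * x) for k in range(-n, n + 1, 2))

def character_pairing(m, n):
    """Riemann sum for the integral of (1 - cos 4 pi x) chi_m chi_n over [0, 1]."""
    samples = 2 * (m + n) + 8  # exceeds the degree m + n + 2 of the integrand
    total = 0.0
    for s in range(samples):
        x = s / samples
        total += (1 - math.cos(4 * math.pi * x)) * chi(m, x) * chi(n, x)
    return total / samples

for m in range(5):
    for n in range(5):
        expected = 1.0 if m == n else 0.0
        assert abs(character_pairing(m, n) - expected) < 1e-9
```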

9 Simple Lie groups

9.1 The Killing form


So far we have worked on a compact Lie group with a bi-invariant positive definite
inner product on the Lie algebra constructed by averaging. There are choices: at one
extreme is the torus where any left-invariant inner product is bi-invariant. But for
other groups there is a natural choice. Since $\mathrm{tr}\,AA^T = \sum_{i,j} A_{ij}^2$, $\mathrm{tr}\,AB^T$ is a positive
definite inner product on the space of all $n \times n$ real matrices so for $SO(n)$, whose Lie
algebra consists of the skew-symmetric matrices, $\mathrm{tr}\,A^2 = -\mathrm{tr}\,AA^T$ is negative-definite
and, since $\mathrm{tr}\,PXP^{-1} = \mathrm{tr}\,X$, is invariant under the adjoint action.
Every compact Lie group has a homomorphism $\mathrm{Ad} : G \to SO(\mathfrak{g})$ and so we can use
this natural inner product on $SO(\mathfrak{g})$ to give:

Definition 26 The Killing form on a Lie algebra g is the symmetric bilinear form

(X, Y ) = tr(ad X ad Y ).

For a compact Lie group $G$ this is negative definite by the observation above so long
as $\mathrm{ad} : \mathfrak{g} \to \mathrm{End}(\mathfrak{g})$ has zero kernel. The centre $Z(G)$ acts trivially by conjugation
and so acts trivially under Ad, and so its Lie algebra lies in this kernel. Conversely
if $\mathrm{ad}\,X = 0$ the exponential of $tX$ acts trivially on $\mathfrak{g}$ and hence on a neighbourhood
of the identity in $G$ so if $G$ is connected it lies in the centre.

Example:
1. For $SL(2, \mathbb{R})$, which is not compact, the Killing form is non-degenerate but
indefinite. In fact the adjoint representation is a covering map from $SL(2, \mathbb{R})$ onto the
identity component of $SO(2, 1)$ and the form has signature $(2, 1)$.
2. For $SO(n)$ the Killing form is $(n - 2)\,\mathrm{tr}\,XY$ which is negative definite for $n > 2$.
3. The Killing form for $U(n)$ is not negative definite since elements $e^{i\theta}I$ lie in the
centre. It is negative definite for $SU(n)$ and is $2n\,\mathrm{tr}\,XY$.
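
The second example can be verified directly from the definition. The following sketch (plain Python, with illustrative function names) computes $\mathrm{tr}(\mathrm{ad}\,X\,\mathrm{ad}\,Y)$ on $\mathfrak{so}(n)$ in the basis $E_{ij}$, $i < j$, of skew-symmetric matrices with $+1$ in place $(i, j)$ and $-1$ in place $(j, i)$, and compares it with $(n - 2)\,\mathrm{tr}\,XY$ for $n = 4$.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def so_basis(n):
    """Basis E_ij (i < j) of so(n), keyed by the index pair (i, j)."""
    basis = {}
    for i in range(n):
        for j in range(i + 1, n):
            m = [[0] * n for _ in range(n)]
            m[i][j], m[j][i] = 1, -1
            basis[(i, j)] = m
    return basis

def killing_form(x, y):
    """tr(ad x ad y) on so(n), summing diagonal coefficients of ad x ad y."""
    total = 0
    for (i, j), b in so_basis(len(x)).items():
        # the E_ij-coordinate of a skew-symmetric matrix is its (i, j) entry
        total += bracket(x, bracket(y, b))[i][j]
    return total

n = 4
basis = so_basis(n)
for x in basis.values():
    for y in basis.values():
        assert killing_form(x, y) == (n - 2) * trace(matmul(x, y))
```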

9.2 Ideals and simplicity


An abstract group is simple if it has no nontrivial normal subgroups. If H ⊂ G is a
connected normal Lie subgroup of a Lie group then its Lie algebra h is preserved by
the adjoint action of G, and so for Y ∈ h and X ∈ g, [X, Y ] ∈ h.

Definition 27 An ideal in a Lie algebra $\mathfrak{g}$ is a subspace $V$ such that for $Y \in V$ and
$X \in \mathfrak{g}$, $[X, Y] \in V$.

With an Ad-invariant inner product on $\mathfrak{g}$, the endomorphism $\mathrm{ad}\,X$ is skew adjoint.
It follows that if $V$ is preserved by all transformations of the form $\mathrm{ad}\,X$, so is its
orthogonal complement $V^\perp$ which is therefore an ideal also.

Definition 28 A Lie algebra is simple if it is non-abelian and has no non-trivial
ideals. It is semisimple if it is a direct sum of simple Lie algebras.

For a connected Lie group G, the ideals are the invariant subspaces for the adjoint
representation, and when G is compact we have complete reducibility of any represen-
tation so g splits as a direct sum of trivial representations and non-trivial irreducible
ones. Thus if there are no trivial ones, the Lie algebra is semisimple.
The connected Lie group corresponding to an ideal in g is a normal subgroup i.e.
preserved by conjugation. A trivial ideal is the Lie algebra of a connected Lie subgroup
in the centre of G.

Definition 29 A connected Lie group is simple if it is non-abelian and has no non-trivial
connected normal Lie subgroups. It is semisimple if its Lie algebra is semisimple.

Note that a simple Lie group may have a discrete centre which is of course a normal
subgroup.
Although a semisimple Lie algebra is a direct sum of simple Lie algebras, Theorem
5.2 shows that the corresponding statement about semisimple Lie groups only holds
for simply-connected ones.

Examples:
1. The group $U(n)$ is not semisimple because it has a 1-dimensional centre: the scalar
matrices $e^{i\theta}I$.
2. The group $SU(n)$ is simple because the adjoint representation is irreducible. In
fact since the roots $x_i - x_j$ are permuted transitively by the Weyl group, they are the weights of a
single representation so any complementary representation must be acted on trivially
by the maximal torus. But these are diagonal matrices and, unless they are scalars
(which is impossible if the trace is zero) they can be conjugated to be non-diagonal.
3. The centre of $SU(n)$ consists of scalars $\omega I$ where $\omega^n = 1$, the cyclic group $\mathbb{Z}_n$. So
take the diagonal copy of $\mathbb{Z}_n \subset \mathbb{Z}_n \times \mathbb{Z}_n$ and the quotient $(SU(n) \times SU(n))/\mathbb{Z}_n$. This
group is semisimple but not a product.

The result that the Lie algebra of a compact group is a direct sum of simple ones
does not require a positive definite invariant inner product. In particular it can be
used for groups for which the Killing form is nondegenerate.

Theorem 9.1 If the Lie algebra g contains no abelian ideals and admits an invariant
(possibly indefinite) inner product then it is semisimple.

Proof: Let $\mathfrak{m}$ be a minimal ideal. The span $[\mathfrak{m}, \mathfrak{m}]$ of commutators $[X, Y]$ is again an
ideal, contained in $\mathfrak{m}$. It cannot be zero, since $\mathfrak{m}$ would then be abelian, so it must be
$\mathfrak{m}$. Take its orthogonal complement $\mathfrak{m}'$ which is an ideal. Note that for an indefinite
inner product there exist null vectors, so a vector space can intersect non-trivially its
orthogonal complement.
If $\mathfrak{m} \cap \mathfrak{m}' = 0$ we have $\mathfrak{g} = \mathfrak{m} \oplus \mathfrak{m}'$, a direct sum of Lie algebras. Moreover any abelian
ideal in either factor is an ideal in $\mathfrak{g}$. The inner product is nondegenerate on each
factor too if it is nondegenerate on the sum. So we can continue, reducing dimension
each time.
If $\mathfrak{m} \cap \mathfrak{m}' \ne 0$ it must be $\mathfrak{m}$ by minimality. Together with $\mathfrak{m} = [\mathfrak{m}, \mathfrak{m}]$ this means that
the inner product is zero on $\mathfrak{m}$.
Take $A = \sum_i [B_i, C_i] \in \mathfrak{m}$ where $B_i, C_i \in \mathfrak{m}$. Then if $X \in \mathfrak{g}$

$$(A, X) = \sum_i ([B_i, C_i], X) = \sum_i (B_i, [C_i, X]) = 0$$

since $[C_i, X] \in \mathfrak{m}$ as $\mathfrak{m}$ is an ideal. But this contradicts the nondegeneracy of the
inner product. □

With this approach, by Lie’s third theorem one can reduce the classification of uni-
versal covers of compact Lie groups to the discussion of simple Lie groups.

9.3 Classification
Simply-connected compact simple Lie groups are classified. They are:

• The special unitary group $SU(n)$

• The double cover $\mathrm{Spin}(n)$ of the special orthogonal group $SO(n)$

• The quaternionic unitary group $Sp(n)$. The notation here is a little ambiguous
(Sp = symplectic). The reason is that the complexification of the Lie algebra
is $\mathfrak{sp}(2n, \mathbb{C})$, the Lie algebra of the non-compact group of complex $2n \times 2n$
matrices leaving fixed a nondegenerate skew symmetric form.

• The exceptional groups $G_2, F_4, E_6, E_7, E_8$. The subscript denotes the rank of
the group.

The group $Sp(n)$ is one we haven’t discussed. The quaternions $\mathbb{H}$ consist of real linear
combinations of $1, i, j, k$ where $i, j, k$ satisfy the defining relations

$$i^2 = j^2 = k^2 = ijk = -1.$$

So $q = x_0 + ix_1 + jx_2 + kx_3$ is a quaternion. Multiplication is associative but not
commutative. The quaternionic conjugate is $\bar q = x_0 - ix_1 - jx_2 - kx_3$ and $\overline{pq} = \bar q\,\bar p$.
The group $Sp(n)$ consists of $n \times n$ matrices $A$ with quaternionic entries such that
$A\bar A^T = I$. When $n = 1$ this is the 3-sphere $x_0^2 + x_1^2 + x_2^2 + x_3^2 = 1$ which we have
seen as $SU(2)$. In fact in low dimensions there are special isomorphisms among the
simply-connected groups:

$$Sp(1) \cong SU(2), \quad \mathrm{Spin}(4) \cong Sp(1) \times Sp(1), \quad \mathrm{Spin}(5) \cong Sp(2), \quad \mathrm{Spin}(6) \cong SU(4).$$
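
The quaternion relations used here are easy to test. The sketch below represents $q = x_0 + ix_1 + jx_2 + kx_3$ as the tuple $(x_0, x_1, x_2, x_3)$, a representation and function names chosen only for illustration, and checks the defining relations, the rule that conjugation reverses products, and the multiplicativity of $x_0^2 + x_1^2 + x_2^2 + x_3^2$, which is what makes the unit 3-sphere $Sp(1)$ closed under multiplication.

```python
def qmul(p, q):
    """Product of quaternions given as tuples (x0, x1, x2, x3)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qconj(q):
    """Quaternionic conjugate x0 - i x1 - j x2 - k x3."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def qnorm2(q):
    """Squared length x0^2 + x1^2 + x2^2 + x3^2."""
    return sum(x * x for x in q)

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

# defining relations i^2 = j^2 = k^2 = ijk = -1
assert qmul(I, I) == qmul(J, J) == qmul(K, K) == qmul(qmul(I, J), K) == MINUS_ONE

p, q = (1, 2, 3, 4), (5, -6, 7, 8)  # arbitrary integer test values
assert qmul(p, q) != qmul(q, p)                            # not commutative
assert qconj(qmul(p, q)) == qmul(qconj(q), qconj(p))       # conjugation reverses order
assert qnorm2(qmul(p, q)) == qnorm2(p) * qnorm2(q)         # norm is multiplicative
```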

More invariantly, if we consider $\mathbb{H}^n$ as a quaternionic vector space by right multiplication
by $\mathbb{H}$, then $Sp(n)$ is the group of orthogonal matrices commuting with this
action. Another even more invariant way to say this is to pick out the $i$ in the quaternions
and then regard $j$ as an antilinear map $J$ such that $J^2 = -1$. In Section 6.1
we saw that the action of $SU(2)$ on the representation space $V_n$ commuted with such
an action when $n$ was odd. This means that $V_{2m-1}$ corresponds to a homomorphism
$SU(2) \to Sp(m)$.

The exceptional groups have a rather more complicated description, especially $E_8$. It
is often said that if you want to prove something about Lie groups by case-by-case
treatment using the classification, by the time you get to $E_8$ you can see the general
proof. However $G_2$ is more amenable and we shall finally discuss this.

9.4 The group G2


The “exceptional” group $G_2$ is in some respects not exceptional and has more in
common with the “classical” groups like $SO(n)$. To justify this, consider $O(n)$ defined
as the subgroup of $GL(n, \mathbb{R})$ which preserves the positive definite bilinear form
$(x, y) = x_1y_1 + \dots + x_ny_n$. An invertible matrix $A \in GL(n, \mathbb{R})$ acts on all such bilinear
forms by $B(x, y) \mapsto B(Ax, Ay)$. Positive definiteness is an open condition and any
positive definite form can be transformed to $(x, y)$ by this action. So the orbit of $B$
in the space of symmetric bilinear forms (which has dimension $n(n + 1)/2$) is open,
and the stabilizer is a Lie group of dimension $n^2 - n(n + 1)/2 = n(n - 1)/2$ conjugate
to $O(n)$.
Instead of symmetric bilinear forms we can consider alternating trilinear forms $T(x, y, z)$.
In fact every Lie algebra with an invariant inner product has one, namely $(X, [Y, Z])$, but these are very special. In a 7-dimensional
space the picture is more like the symmetric forms: there are open orbits of the
49-dimensional Lie group $GL(7, \mathbb{R})$ on the 35-dimensional space $(\Lambda^3\mathbb{R}^7)^*$ of trilinear
forms $T$ and the stabilizer of a point has dimension $49 - 35 = 14$. This will be the
group $G_2$. To show this we need to pick a representative form $T$ and show that its
stabilizer is a compact subgroup of dimension 14.
We take an orthonormal basis $e_1, \dots, e_7$ of $\mathbb{R}^7$ and set
$$T = e_7(e_1e_2 + e_3e_4 + e_5e_6) + e_1e_3e_5 - e_1e_4e_6 - e_2e_3e_6 - e_2e_4e_5.$$
This notation means that evaluating $T$ on three of these vectors is the coefficient in
the above expression, taking account of the alternating property, so
$1 = T(e_7, e_1, e_2) = -T(e_1, e_7, e_2)$ etc.

Definition 30 The Lie group G2 is the identity component of the stabilizer in SO(7)
of the alternating trilinear form T .

We shall then prove

Proposition 9.2 The form T lies in an open orbit of GL(7, R) in the vector space
of alternating trilinear forms and the stabilizer in GL(7, R) of any form in this open
set is conjugate to G2 .

Proof: The Lie algebra action of a matrix $A$ takes $T$ to the form

$$T(Ax, y, z) + T(x, Ay, z) + T(x, y, Az).$$

First we shall find a 14-dimensional Lie subalgebra of $\mathfrak{so}(7)$ which leaves $T$ invariant,
then show that the connected Lie group corresponding to it is the stabilizer of $T$ in
$GL(7, \mathbb{R})$.

1. For each i ≠ j let E(ij) denote the skew-symmetric matrix taking ei to ej and ej
to −ei and zero on all other basis vectors. Consider E(71). It transforms T to

−e1(e3e4 + e5e6) + e7(e3e5 − e4e6).

But E(36) + E(45) transforms T to

2e1(e3e4 + e5e6) − 2e7(e3e5 − e4e6)

and so 2E(71) + E(36) + E(45) lies in the Lie algebra of the stabilizer. Continuing in the
same way with E(7i) for 1 ≤ i ≤ 6, each corrected by a suitable combination of the E(jk)
with j, k ≠ 7, we have a 6-dimensional space which leaves T invariant, and to find
the other elements we need only restrict to linear combinations of E(ij) for i, j ≠ 7.
These span the Lie subalgebra so(6) of so(7) leaving e7 fixed.
If e7 is fixed and T is fixed, then so are the alternating 2-form e1e2 + e3e4 + e5e6 and
the 3-form e1e3e5 − e1e4e6 − e2e3e6 − e2e4e5 which make up T. The skew bilinear
form B defined by the first is B(x, y) = (Ix, y) where Ie1 = e2, Ie2 = −e1, etc. So
e1 + ie2, e3 + ie4, e5 + ie6 form a basis for R^6 as a complex vector space C^3. The
elements in SO(6) which commute with I are orthogonal and complex linear, hence
in U(3). The 3-form can be written

e1e3e5 − e1e4e6 − e2e3e6 − e2e4e5 = Re[(e1 + ie2)(e3 + ie4)(e5 + ie6)]

which lies in the complex 1-dimensional space Λ^3 C^3. The group U(3) acts on this
by the complex determinant e^{iθ}, so if the real part is preserved the group must lie in
SU(3).
Thus SU(3) lies in the stabilizer of T and the Lie algebra g2 is su(3) ⊕ R^6, which has
dimension 8 + 6 = 14.
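The first computation in this step can be confirmed by brute force. The sketch below is not part of the notes: it assumes the Lie algebra action T(Ax, y, z) + T(x, Ay, z) + T(x, y, Az) stated at the start of the proof, and the helper names (`build_T`, `E`, `combine`, `act`, `is_zero`) are illustrative.

```python
N = 7
TERMS = [
    (1, (7, 1, 2)), (1, (7, 3, 4)), (1, (7, 5, 6)),
    (1, (1, 3, 5)), (-1, (1, 4, 6)), (-1, (2, 3, 6)), (-1, (2, 4, 5)),
]


def build_T():
    """Dense array with T[i][j][k] = T(e_{i+1}, e_{j+1}, e_{k+1})."""
    T = [[[0] * N for _ in range(N)] for _ in range(N)]
    for c, (i, j, k) in TERMS:
        i, j, k = i - 1, j - 1, k - 1
        for a, b, d in ((i, j, k), (j, k, i), (k, i, j)):   # even permutations
            T[a][b][d] = c
        for a, b, d in ((j, i, k), (i, k, j), (k, j, i)):   # odd permutations
            T[a][b][d] = -c
    return T


def E(i, j):
    """Skew-symmetric matrix with e_i -> e_j and e_j -> -e_i (1-based indices)."""
    A = [[0] * N for _ in range(N)]
    A[j - 1][i - 1] = 1     # column i holds the image of e_i
    A[i - 1][j - 1] = -1
    return A


def combine(*terms):
    """Linear combination of matrices given as (coefficient, matrix) pairs."""
    return [[sum(c * A[r][s] for c, A in terms) for s in range(N)] for r in range(N)]


def act(A, T):
    """Values of T(Ax, y, z) + T(x, Ay, z) + T(x, y, Az) on basis vectors."""
    return [[[sum(A[m][i] * T[m][j][k] + A[m][j] * T[i][m][k] + A[m][k] * T[i][j][m]
                  for m in range(N))
              for k in range(N)] for j in range(N)] for i in range(N)]


def is_zero(U):
    return all(v == 0 for plane in U for row in plane for v in row)


T = build_T()
X = combine((2, E(7, 1)), (1, E(3, 6)), (1, E(4, 5)))
print(is_zero(act(X, T)))          # True: X annihilates T
print(is_zero(act(E(7, 1), T)))    # False: E(71) alone does not
```

The same `act` function can be used to search for the remaining five combinations built from E(72), ..., E(76).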

2. Now suppose U is any alternating trilinear form on R^7. We shall define an associated
inner product. This involves the exterior product of two forms. Given M of degree p and
N of degree q the exterior product M ∧ N is defined as

(M ∧ N)(x_1, . . . , x_{p+q}) = 1/(p! q!) Σ_{σ ∈ S_{p+q}} sgn(σ) M(x_{σ(1)}, . . . , x_{σ(p)}) N(x_{σ(p+1)}, . . . , x_{σ(p+q)}).

If α, β are two linear forms then α ∧ β = −β ∧ α and more generally M ∧ N =
(−1)^{pq} N ∧ M.
For a vector x ∈ R^7 define a 2-form Bx by Bx(y, z) = U(x, y, z). Then consider

Bx ∧ By ∧ U.

Since 2-forms commute this is symmetric in x, y and takes values in the 1-dimensional
space of alternating 7-linear forms in 7 dimensions. This is almost an inner product, except
it takes values in a one-dimensional space on which A ∈ GL(7, R) acts as det A. So a
representative matrix B transforms like B ↦ (det A) A B A^T and so det B ↦ (det A)^9 det B.
Then B/(det B)^{1/9} defines an inner product if det B ≠ 0. This may not be positive
definite.
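The exponent 9 and the effect of the normalization can be spelled out in one line. This is a short worked check, assuming (as in the text) that B denotes the 7 × 7 symmetric matrix representing the form in a chosen basis, and using the real ninth root, which is defined for every nonzero real number:

```latex
\det\bigl((\det A)\,A B A^{T}\bigr)
  = (\det A)^{7}\,\det A\,\det B\,\det A^{T}
  = (\det A)^{9}\,\det B,
\qquad\text{hence}\qquad
\frac{(\det A)\,A B A^{T}}{\bigl((\det A)^{9}\det B\bigr)^{1/9}}
  = \frac{A B A^{T}}{(\det B)^{1/9}} .
```

So the normalized matrix transforms like an ordinary symmetric bilinear form, with the factor det A cancelled.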
3. Now consider T. The given positive definite inner product for which e1, . . . , e7
is an orthonormal basis is the one determined as above. In fact since T is invariant
by SU(3) this is an invariant inner product on R^6 ⊕ R. As a real representation of
SU(3), R^6 is irreducible, so on R^6 it can only be a scalar multiple of the standard one,
and it suffices to check that e7 and e1 have the same length in the canonical inner product.
In this case
Be7 = e1e2 + e3e4 + e5e6
and so Be7 ∧ Be7 ∧ T = 6e1e2e3e4e5e6e7. And

Be1 = −e7e2 + e3e5 − e4e6

which gives Be1 ∧ Be1 ∧ T = 6e1e2e3e4e5e6e7. These have the same sign and value so the
inner product is positive definite and (up to a factor) the given one.
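The two wedge products above, and the vanishing of the off-diagonal terms, can be verified with a few lines of exterior algebra. This is a sketch under stated assumptions rather than anything from the notes: forms are stored as dictionaries from increasing index tuples to coefficients, monomials such as e1e2 are multiplied by concatenation (which matches the normalized wedge product defined in step 2), and all function names are illustrative.

```python
TERMS = [
    (1, (7, 1, 2)), (1, (7, 3, 4)), (1, (7, 5, 6)),
    (1, (1, 3, 5)), (-1, (1, 4, 6)), (-1, (2, 3, 6)), (-1, (2, 4, 5)),
]


def sort_sign(idx):
    """Sorted index tuple and sign of the sorting permutation (0 if an index repeats)."""
    if len(set(idx)) < len(idx):
        return None, 0
    inversions = sum(1 for a in range(len(idx))
                     for b in range(a + 1, len(idx)) if idx[a] > idx[b])
    return tuple(sorted(idx)), (-1) ** inversions


def form(terms):
    """Build a form {sorted indices: coefficient} from (coefficient, indices) pairs."""
    out = {}
    for c, idx in terms:
        key, s = sort_sign(idx)
        out[key] = out.get(key, 0) + s * c
    return out


def wedge(M, N):
    """Exterior product of two forms."""
    out = {}
    for a, ca in M.items():
        for b, cb in N.items():
            key, s = sort_sign(a + b)
            if s:
                out[key] = out.get(key, 0) + s * ca * cb
    return {k: v for k, v in out.items() if v}


def contract(i, U):
    """The form B_x for x = e_i, i.e. (y, z, ...) -> U(e_i, y, z, ...)."""
    out = {}
    for key, c in U.items():
        if i in key:
            p = key.index(i)
            rest = key[:p] + key[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return out


def gram(T):
    """Matrix of coefficients of e1...e7 in B_{e_i} ^ B_{e_j} ^ T."""
    top = tuple(range(1, 8))
    return [[wedge(wedge(contract(i, T), contract(j, T)), T).get(top, 0)
             for j in range(1, 8)] for i in range(1, 8)]


T = form(TERMS)
G = gram(T)
print(G[6][6], G[0][0])   # coefficients for e7 and e1
```

Both printed coefficients should equal 6, and the full matrix `G` should be 6 times the identity, in line with the claim that the canonical inner product is a positive multiple of the given one.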
We conclude from this that the connected component of the stabilizer of T under the
action of GL(7, R) lies in SO(7) and so from the first part is a 14-dimensional compact
Lie group. The map A ↦ T(Ax, Ay, Az) from GL(7, R) to the 35-dimensional space
of trilinear forms is smooth. The derivative at the identity has, as we have seen, a
14-dimensional kernel and so, since 49 − 14 = 35, it is surjective. By the
inverse function theorem it maps a neighbourhood of the identity onto an open set and so
the orbit of T is open. □
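The dimension count in this last step can also be checked numerically. The sketch below is illustrative and not from the notes: it builds the derivative of A ↦ T(Ax, Ay, Az) at the identity as a 49 × 35 matrix and computes its rank exactly, then repeats the computation on the 21-dimensional subspace so(7). All helper names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations

N = 7
TERMS = [
    (1, (7, 1, 2)), (1, (7, 3, 4)), (1, (7, 5, 6)),
    (1, (1, 3, 5)), (-1, (1, 4, 6)), (-1, (2, 3, 6)), (-1, (2, 4, 5)),
]
TRIPLES = list(combinations(range(N), 3))   # the 35 index triples i < j < k


def build_T():
    """Dense array with T[i][j][k] = T(e_{i+1}, e_{j+1}, e_{k+1})."""
    T = [[[0] * N for _ in range(N)] for _ in range(N)]
    for c, (i, j, k) in TERMS:
        i, j, k = i - 1, j - 1, k - 1
        for a, b, d in ((i, j, k), (j, k, i), (k, i, j)):
            T[a][b][d] = c
        for a, b, d in ((j, i, k), (i, k, j), (k, j, i)):
            T[a][b][d] = -c
    return T


def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def image_of_elementary(a, b, T):
    """Coordinates of A.T for the elementary matrix A with A e_b = e_a."""
    return [(T[a][j][k] if i == b else 0)
            + (T[i][a][k] if j == b else 0)
            + (T[i][j][a] if k == b else 0)
            for i, j, k in TRIPLES]


T = build_T()
# Derivative on all of gl(7): one row per elementary matrix.
gl_rows = [image_of_elementary(a, b, T) for a in range(N) for b in range(N)]
# Derivative on so(7): the matrix with e_a -> e_b and e_b -> -e_a.
so_rows = [[u - v for u, v in zip(image_of_elementary(b, a, T),
                                  image_of_elementary(a, b, T))]
           for a, b in combinations(range(N), 2)]

print(49 - rank(gl_rows), 21 - rank(so_rows))
```

Both kernel dimensions should come out as 14: the derivative is then surjective onto the 35-dimensional space, and the whole kernel already sits inside so(7).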

We can now consider properties of G2 as a Lie group, using the subgroup SU (3).

• We have seen that the Lie algebra g2 = su(3) ⊕ R^6, where R^6 is the defining
3-dimensional complex vector space for SU(3). Its complexification is the
representation V ⊕ V̄ in the notation of Section 8.2. So the maximal torus T of
SU(3) acts on R^6 with weights ±x1, ±x2, ±x3. Since none of these is zero, T
is the maximal torus of G2, which is therefore a group of rank 2. Moreover the
roots are (where x1 + x2 + x3 = 0):

±(x1 − x2), ±(x2 − x3), ±(x3 − x1), ±x1, ±x2, ±x3.

• The Weyl group is generated by reflections about these six root planes and is
the dihedral group of symmetries in O(2) of a regular hexagon.

• The group G2 is simple: any ideal in g2 is a representation space for SU(3),
so (since SU(3) is simple) if g2 had a proper ideal it would be su(3), and then its
orthogonal complement R^6 would be an ideal as well. But R^6 is not closed under
the Lie bracket. We could check this directly, but since it has no zero weights as a
representation of SU(3), all the Lie brackets would vanish. But then we would have a
6-dimensional abelian subalgebra, which contradicts the rank calculation.

• The group G2 has the same maximal torus as the subgroup SU(3) but the Weyl
group consists of the symmetries of the hexagon instead of the triangle: it has
the extra symmetry of multiplication by −1. This acts on R(T) by the involution
(t1, t2, t3) ↦ (t1^{-1}, t2^{-1}, t3^{-1}). We saw that R(T)^W for SU(3) was Z[σ1, σ2] where

σ1 = t1 + t2 + t3,   σ2 = t1t2 + t2t3 + t3t1,   σ3 = t1t2t3 = 1.

So the extra involution takes σ1 to σ2/σ3 = σ2, and σ1 + σ2, σ1σ2 generate
the invariant subring.
Now the 7-dimensional representation of G2 splits as 1 ⊕ V ⊕ V̄ as a
representation of SU(3), with character

1 + t1 + t2 + t3 + t1^{-1} + t2^{-1} + t3^{-1} = 1 + σ1 + σ2,

and the adjoint representation has character

2 + t1 t2^{-1} + · · · + t1 + t2 + t3 + t1^{-1} + t2^{-1} + t3^{-1}.

But
σ1σ2 = (t1 + t2 + t3)(t1^{-1} + t2^{-1} + t3^{-1}) = 3 + (t1 t2^{-1} + · · ·)

so the character of the adjoint representation is

σ1σ2 − 1 + σ1 + σ2.

We see therefore that R(G2) = R(T)^W = Z[σ1 + σ2, σ1σ2] is a polynomial ring
on two generators.
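The character identities in the last item are easy to confirm by machine. The following sketch is not part of the notes: it represents Laurent polynomials in t1, t2 (with t3 = (t1 t2)^{-1}) as dictionaries from exponent pairs to coefficients, and the names `V`, `add`, `mul`, `const` and `invert` are illustrative.

```python
from collections import Counter

# Exponent vectors of t1, t2 and t3 = (t1 t2)^-1 in terms of t1, t2.
V = [(1, 0), (0, 1), (-1, -1)]


def add(*polys):
    """Sum of Laurent polynomials stored as {exponent pair: coefficient}."""
    out = Counter()
    for p in polys:
        for k, c in p.items():
            out[k] += c
    return {k: c for k, c in out.items() if c}


def mul(p, q):
    """Product of two Laurent polynomials."""
    out = Counter()
    for (a, b), c in p.items():
        for (u, v), d in q.items():
            out[(a + u, b + v)] += c * d
    return {k: c for k, c in out.items() if c}


def const(c):
    return {(0, 0): c}


def invert(p):
    """Apply the extra Weyl group involution t_i -> t_i^-1."""
    return {(-a, -b): c for (a, b), c in p.items()}


sigma1 = {v: 1 for v in V}
# t1 t2 + t2 t3 + t3 t1 = t3^-1 + t1^-1 + t2^-1 because t1 t2 t3 = 1.
sigma2 = invert(sigma1)

# The twelve roots: x_i - x_j for i != j, together with +x_i and -x_i.
roots = ([(a[0] - b[0], a[1] - b[1]) for a in V for b in V if a != b]
         + V + [(-a, -b) for a, b in V])
adjoint = add(const(2), {r: 1 for r in roots})
seven = add(const(1), sigma1, sigma2)

print(adjoint == add(mul(sigma1, sigma2), const(-1), sigma1, sigma2))
print(sum(seven.values()), sum(adjoint.values()))   # dimensions 7 and 14
```

Evaluating a character at the identity (summing its coefficients) recovers the dimension of the representation, which gives a quick sanity check on the two formulas.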
