1 Lie Groups: 1.1 The General Linear Group
1 Lie groups
As mentioned in Handout #11, the general framework for studying symmetry in mathe-
matics and physics is group theory. In your algebra class you have studied many examples
of finite groups (e.g. the symmetric group and its subgroups) and some examples of infinite
discrete groups (e.g. the integers Z). But in physics the most important symmetries are
those associated to continuous groups (i.e. groups parametrized by one or more real param-
eters), since these are the ones that give rise, via Noether’s theorem, to conservation laws.
Continuous groups are also known as Lie groups, in honor of the Norwegian mathematician
Sophus Lie (1842–1899); they play a key role in differential geometry, quantum mechanics,
and many other areas of mathematics and physics. In this handout I would like to give
you an elementary introduction to the theory of Lie groups, focusing on the classical matrix
groups and then concentrating on the most important one for our purposes, namely the
group of rotations of three-dimensional space.
A precise definition of “Lie group” requires some preliminary concepts from differential
geometry, notably the notion of a manifold. Instead of developing that machinery, I will simply show you
some concrete examples of Lie groups; all of my examples will be groups of matrices, with
the group operation being matrix multiplication.
Since a matrix is invertible if and only if its determinant is nonzero, we can also write
GL(n) = {A ∈ R^{n×n} : det A ≠ 0} .
Therefore, GL(n), considered as a set, is simply the n²-dimensional vector space R^{n×n} with
the subset {A : det A = 0} removed. Since the determinant is a polynomial function of the
matrix elements a₁₁, a₁₂, …, aₙₙ, it follows that the set {A : det A = 0} is an “algebraic
subvariety of codimension 1” in R^{n×n}. [Codimension 1 because it is specified by one equation.]
Geometrically it is a hypersurface in R^{n×n} (“hypersurface” is a synonym of “submanifold
of codimension 1”). Fancy language aside, the important point is that “nearly all” n × n
matrices are invertible; only a lower-dimensional subset of them are noninvertible. So GL(n)
is a Lie group of the full dimension n².
The n × n matrices are in one-to-one correspondence with the linear maps from R^n to
itself: namely, the matrix A induces the linear map x ↦ Ax. Under this correspondence,
the matrices A ∈ GL(n) correspond to the automorphisms of the n-dimensional real vector
space R^n: that is, they are the maps from R^n to itself that preserve its vector-space structure
(namely, are linear and invertible). The group GL(n) can thus be regarded as the group of
automorphisms of the n-dimensional real vector space R^n.
All of the remaining groups to be introduced here will be subgroups of GL(n).
Remark. Sometimes we write GL(n, R) to stress that these are matrices whose entries
are real numbers. One can also study the group GL(n, C) of complex matrices, or more
generally the group GL(n, F ) for any field F .
Since det(AB) = (det A)(det B), it follows that SL(n) really is a group; it is a subgroup of
GL(n). Since SL(n) consists of the matrices A satisfying the single equation det A = 1, it is a
subgroup of codimension 1. Therefore, SL(n) is a Lie group of dimension n² − 1; it will be
parametrized (at least locally) by n² − 1 independent real parameters.
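Since det(AB) = (det A)(det B), membership in SL(n) is easy to test numerically. Here is a minimal Python sketch for 2 × 2 matrices (the particular matrices A and B are my own arbitrary choices):

```python
def det2(M):
    """Determinant of a 2x2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Two matrices in SL(2): each has determinant 1.
A = [[2, 3], [1, 2]]   # det = 4 - 3 = 1
B = [[1, 5], [0, 1]]   # det = 1 - 0 = 1

AB = matmul(A, B)
# det(AB) = det(A) det(B) = 1, so the product is again in SL(2).
print(det2(A), det2(B), det2(AB))   # -> 1 1 1
```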
Remark. Once again we write SL(n, R) to stress that these are matrices whose entries
are real numbers, since one can also study SL(n, C) and SL(n, F ) for any field F , or even
SL(n, R) for any commutative ring R. In fact, the group SL(2, Z) and its subgroups play
an important role in complex analysis and analytic number theory. (Note, however, that
SL(2, Z) is a discrete group, not a Lie group.)
Lemma 1  Let A be an n × n matrix with real entries, and consider the linear transformation
of R^n defined by x ↦ Ax. Then the following are equivalent:
(a) This transformation preserves the Euclidean inner product, i.e. (Ax) · (Ay) = x · y for
all x, y ∈ R^n.
(b) This transformation preserves the Euclidean norm, i.e. ‖Ax‖ = ‖x‖ for all x ∈ R^n.
= Σ_{i,j,k=1}^n x_j (A^T)_{ji} A_{ik} y_k   (4c)
= Σ_{j,k=1}^n x_j (A^T A)_{jk} y_k .   (4d)
Remarks. 1. The operation used in the proof of (b) =⇒ (a) is called polarization: it is
a general procedure that allows one to recover a symmetric bilinear form from the quadratic
form it generates. An alternative version of the polarization formula is
x · y = (1/4) [ (x + y) · (x + y) − (x − y) · (x − y) ]   (7)
and the analogous thing with A.
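The polarization formula (7) is easy to test numerically; the following Python sketch uses arbitrarily chosen vectors in R³:

```python
def dot(x, y):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))

x = [1.0, 2.0, -1.0]
y = [3.0, 0.0, 4.0]

# Recover x . y from the quadratic form v -> v . v alone, as in (7).
plus = [a + b for a, b in zip(x, y)]
minus = [a - b for a, b in zip(x, y)]
recovered = 0.25 * (dot(plus, plus) - dot(minus, minus))

print(dot(x, y), recovered)   # -> -1.0 -1.0
```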
2. In your linear algebra course you learned that for a finite square matrix A, the following
are equivalent:
(a) A has a left inverse (namely, a matrix B satisfying BA = I).
(b) A has a right inverse (namely, a matrix B satisfying AB = I).
(c) det A ≠ 0.
and that when (a)–(c) hold, the left and right inverses are unique and equal. It follows from
this that A^T A = I is equivalent to AA^T = I; therefore, we can define “orthogonal matrix”
by any one of the three equivalent statements:
(i) A^T A = I
(ii) AA^T = I
(iii) A^T A = AA^T = I
Warning: The equivalence (a)⇐⇒(b) is not true for rectangular matrices, or for oper-
ators on infinite-dimensional vector spaces.
[You should verify for yourself that A, B ∈ O(n) imply AB ∈ O(n) and A⁻¹ ∈ O(n), so
that O(n) is indeed a group; it is a subgroup of GL(n).] The group O(n) is the group of
automorphisms of n-dimensional Euclidean space (i.e., R^n equipped with the Euclidean inner
product).
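As a concrete illustration (my own numerical sketch, not part of the handout), the following Python code checks condition (i) and the norm preservation of Lemma 1 for a rotation matrix built from the 3–4–5 triangle:

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [list(row) for row in zip(*X)]

# A rotation matrix with cos = 0.6, sin = 0.8 (from the 3-4-5 triangle).
R = [[0.6, -0.8],
     [0.8,  0.6]]

# (i) R^T R = I; by the remark on inverses, (ii) and (iii) follow.
RtR = matmul(transpose(R), R)
assert all(abs(RtR[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(2) for j in range(2))

# Lemma 1(b): the Euclidean norm is preserved.
def norm(v):
    return math.sqrt(sum(c * c for c in v))

x = [3.0, -7.0]
Rx = [sum(R[i][j] * x[j] for j in range(2)) for i in range(2)]
assert abs(norm(Rx) - norm(x)) < 1e-12
print("R is orthogonal and preserves norms")
```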
Taking determinants of the defining equation A^T A = I and remembering that det A^T = det A,
we see that a matrix A ∈ O(n) must satisfy (det A)² = 1, or in other words
det A = ±1 .   (10)
Both signs are possible. Consider, for instance, n × n diagonal matrices with 1’s and −1’s
on the diagonal. All such matrices are orthogonal, hence belong to O(n). If the number of
entries −1 is even, then det A = +1; but if the number of entries −1 is odd, then det A = −1.
Topologically, the group O(n) is the disjoint union of two connected components: the
orthogonal matrices with determinant +1 (this is the connected component containing the
identity matrix), and the orthogonal matrices with determinant −1. (These two subsets must
“stay away from” each other in R^{n×n}, because the determinant is a continuous function.)
The orthogonal matrices with det A = +1 form a subgroup of O(n) [why?]; it is called
the special orthogonal group SO(n):
SO(n) = {A ∈ O(n) : det A = +1} .   (11)
It is also called the rotation group, because the elements of SO(n) can be regarded as
rotations. (We will see this explicitly for n = 2 and n = 3.) By contrast, some of the
matrices in O(n) with determinant −1 can be regarded as reflections.
What is the dimension of the Lie groups O(n) and SO(n)? The defining equation A^T A =
I naively consists of n² scalar equations. But it is important to note that the matrix A^T A is symmetric (no
matter what A is); therefore, some of the n² equations in A^T A = I are redundant (namely,
the ij equation says the same thing as the ji equation). So the non-redundant equations
are, for instance, those on the diagonal and those above the diagonal (i.e. i ≤ j); there are
n(n + 1)/2 of these [you should check this carefully]. Therefore, the dimension of the Lie
groups O(n) and SO(n) is
n² − n(n + 1)/2 = n(n − 1)/2 .   (12)
In particular, the rotation group in dimension n = 2 is one-dimensional (it is parametrized
by a single angle), and the rotation group in dimension n = 3 is three-dimensional (it can
be parametrized by three angles, as we shall later see explicitly).
Remark. One might ask whether there are other symmetric bilinear forms worth considering,
beyond I_{p,q}. The answer is no, because Sylvester’s law of inertia guarantees
that every nondegenerate symmetric bilinear form can be represented by I_{p,q} in a suitably
chosen basis. More precisely, let us say that the n × n matrices A and B are congruent if
there exists an invertible matrix S such that B = SAS^T (and hence also A = S⁻¹B(S⁻¹)^T).
Then Sylvester’s theorem asserts that any symmetric matrix A is congruent to a diagonal
matrix having all its diagonal entries equal to +1, −1 or 0, and that moreover the number
of diagonal entries of each kind is an invariant of A, i.e. it does not depend on the matrix
S being used. This invariant is called the inertia (n₊, n₋, n₀) of the matrix A. The nondegenerate
symmetric bilinear forms are the ones with n₀ = 0; that is, they are congruent to
some matrix I_{p,q} (where p = n₊ and q = n₋).
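Here is a small worked instance in Python (the matrices A and S are my own illustrative choices): the hyperbolic form 2x₁x₂, with matrix A = [[0, 1], [1, 0]], is congruent to diag(+1, −1), so its inertia is (1, 1, 0).

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [list(row) for row in zip(*X)]

# Symmetric matrix of the quadratic form 2*x1*x2.
A = [[0.0, 1.0],
     [1.0, 0.0]]

# Change of basis found by completing the square:
# x1*x2 = ((x1 + x2)^2 - (x1 - x2)^2) / 4.
s = 1.0 / math.sqrt(2.0)
S = [[s,  s],
     [s, -s]]

B = matmul(matmul(S, A), transpose(S))
# B is (numerically) diag(+1, -1): inertia (n+, n-, n0) = (1, 1, 0).
print(B)
```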
det A = ±1 . (18)
But, unlike in the case of the orthogonal group, it turns out that the case det A = −1 does
not occur; all symplectic matrices in fact have det A = +1. But the proof of this fact requires
2 For a real antisymmetric matrix, the eigenvalues are pure imaginary and come in complex-conjugate
pairs. If the dimension is odd, then at least one of the eigenvalues must be zero.
3 In equation (39) of Handout #12 I used a slightly different convention, writing AΩA^T = Ω instead of
A^T ΩA = Ω.
4 In fact det Ω = +1, but we don’t need this.
a deeper algebraic study, and I will not pursue it here.
What is the dimension of the symplectic group Sp(2n)? The defining equation A^T ΩA = Ω
naively consists of (2n)² = 4n² scalar equations. But it is important to note that the matrix A^T ΩA is
antisymmetric (no matter what A is); therefore, some of the (2n)² equations in A^T ΩA = Ω
are redundant (namely, the ij equation says the same thing as the ji equation, and the
diagonal equations say 0 = 0). So the non-redundant equations are, for instance, those
above the diagonal (i.e. i < j); there are (2n)(2n − 1)/2 of these [you should check this
carefully]. Therefore, the dimension of the symplectic group Sp(2n) is
(2n)² − (2n)(2n − 1)/2 = (2n)(2n + 1)/2 = n(2n + 1) .   (19)
2 Lie algebras
Let G be a Lie group of n × n matrices: for instance, one of the “classical matrix groups”
introduced in the preceding section. It turns out that much of the structure of G can be
understood by looking only at the behavior of G in a tiny neighborhood of the identity
element. Intuitively speaking, we consider matrices of the form I + εA with “ε infinitesimal”, and
we ask under what conditions such a matrix belongs to G “through first order in ε”. More
rigorously, we consider a smooth curve A(t) lying in G and satisfying A(0) = I, and we ask
what are the possible values of A′(0) = (d/dt)A(t)|_{t=0}. Geometrically, what this means
is that we regard G as a submanifold embedded in the vector space R^{n×n}, and we seek to
determine the tangent space to G at the identity element I. The Lie algebra g associated
to the Lie group G is, by definition, the tangent space to G at the point I; that is, it is the
set of all possible values of A′ (0) when we consider all smooth curves A(t) lying in G and
satisfying A(0) = I. Since a tangent space to a manifold is always a vector space (that is, it
is closed under addition and multiplication by scalars), it follows that the Lie algebra g is a
vector space. This fact makes it somewhat easier to study than G itself, which is a nonlinear
manifold. That is the principal reason for introducing Lie algebras.
So our next task is to determine the Lie algebra g for each of the “classical matrix
groups” G introduced in the preceding section.
2.2 The special linear group
The next case to consider is the special linear group SL(n), which consists of the matrices
A satisfying det A = 1. Consider a smooth curve A(t) lying in SL(n) and satisfying A(0) = I,
say
A(t) = I + At + A₂t² + A₃t³ + … .   (21)
As explained last week in class, we have
det A(t) = 1 + t tr A + O(t²) ,   (22)
where tr A = Σ_{i=1}^n A_{ii} is the trace of the matrix A. [Recall the proof: we start from the
definition of the determinant as a sum over permutations σ. Every σ other than the identity
permutation must contribute the product of at least two off-diagonal elements, so it will make
a contribution starting only at order t² or higher. The identity permutation will contribute
the product of the diagonal elements, which is ∏_{i=1}^n (1 + tA_{ii}) = 1 + t Σ_{i=1}^n A_{ii} + O(t²).]
if the curve A(t) lies in SL(n) — that is, if det A(t) = 1 for all t — then we must have
tr A = 0. So every matrix in the Lie algebra sl(n) must be traceless. And conversely, if A
is a traceless n × n matrix, then it can be shown that the particular curve
A(t) = e^{tA} := Σ_{k=0}^∞ t^k A^k / k!   (23)
satisfies det A(t) = 1 for all t. So the Lie algebra sl(n) consists precisely of the n×n matrices
that are traceless:
sl(n) = {A ∈ R^{n×n} : tr A = 0} .   (24)
Note that sl(n) is a vector space of dimension n² − 1 (why?).
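The claim that det e^{tA} = 1 for traceless A can be tested numerically. The sketch below truncates the exponential series by hand (the matrix A, the value of t, and the truncation order are arbitrary choices):

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(M, terms=30):
    """Matrix exponential of a 2x2 matrix via a truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at I
    term = [[1.0, 0.0], [0.0, 1.0]]     # current term M^k / k!
    for k in range(1, terms):
        term = matmul(term, M)
        term = [[term[i][j] / k for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

# A traceless matrix, so e^{tA} should lie in SL(2) for every t.
A = [[1.0, 2.0], [3.0, -1.0]]
t = 0.1
E = expm([[t * A[i][j] for j in range(2)] for i in range(2)])
det = E[0][0] * E[1][1] - E[0][1] * E[1][0]
print(det)   # -> 1.0 up to rounding error
```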
The matrix exponential has some but not all of the properties of the ordinary exponential; this is because matrix
multiplication, unlike ordinary multiplication, is noncommutative. Thus, if M and N are
two matrices, in general we do not have e^M e^N = e^{M+N}; rather, log(e^M e^N) is given by a
much more complicated formula, called the Campbell–Baker–Hausdorff formula. But
if M and N happen to commute (that is, MN = NM), then it is true that e^M e^N = e^{M+N}.
In particular, since all multiples of a single matrix A trivially commute, it follows that
e^{sA} e^{tA} = e^{(s+t)A}, or in other words A(s)A(t) = A(s + t). Thus, the curve A(t) = e^{tA} forms
a commutative subgroup of G: it is the one-parameter subgroup of G generated by the
“infinitesimal generator” A.
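Both halves of this remark can be seen numerically. In the Python sketch below, M and N are the standard 2 × 2 nilpotent matrices (my own choice of example), which do not commute; A is an arbitrary matrix whose multiples commute with each other:

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(M, c):
    return [[c * M[i][j] for j in range(2)] for i in range(2)]

def madd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def expm(M, terms=30):
    """Matrix exponential of a 2x2 matrix via a truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = scale(matmul(term, M), 1.0 / k)
        result = madd(result, term)
    return result

# Noncommuting M, N: e^M e^N differs from e^{M+N}.
M = [[0.0, 1.0], [0.0, 0.0]]
N = [[0.0, 0.0], [1.0, 0.0]]
lhs = matmul(expm(M), expm(N))
rhs = expm(madd(M, N))
print(lhs[0][0], rhs[0][0])   # 2.0 versus cosh(1) ~ 1.543

# Multiples of one matrix commute: e^{sA} e^{tA} = e^{(s+t)A}.
A = [[0.3, -0.2], [0.5, 0.1]]
s, t = 0.4, 0.7
one = matmul(expm(scale(A, s)), expm(scale(A, t)))
two = expm(scale(A, s + t))
print(max(abs(one[i][j] - two[i][j])
          for i in range(2) for j in range(2)))   # essentially zero
```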
2.3 The orthogonal and special orthogonal groups
Now consider the orthogonal group O(n) and its subgroup SO(n). Any smooth (or even
continuous) curve in O(n) passing through the identity element must lie entirely in SO(n),
since the determinant obviously cannot jump from +1 to −1. So O(n) and SO(n) will have
the same Lie algebra, which we call so(n).
So consider a smooth curve A(t) lying in O(n) and satisfying A(0) = I, say
A(t) = I + At + A₂t² + A₃t³ + … .   (26)
we conclude that every matrix in the Lie algebra so(n) must satisfy A + A^T = 0, i.e. it must
be antisymmetric. And conversely, if A is an antisymmetric n × n matrix, then it can be
shown that
A(t) = e^{tA} := Σ_{k=0}^∞ t^k A^k / k!   (28)
satisfies A(t)^T A(t) = I for all t. So the Lie algebra so(n) consists precisely of the n × n matrices
that are antisymmetric:
so(n) = {A ∈ R^{n×n} : A^T = −A} .   (29)
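For instance (a numerical sketch of my own): exponentiating t times the basic 2 × 2 antisymmetric matrix gives the rotation by angle t, which is indeed orthogonal.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(M, terms=30):
    """Matrix exponential of a 2x2 matrix via a truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = matmul(term, M)
        term = [[term[i][j] / k for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

t = 0.7
A = [[0.0, -t], [t, 0.0]]   # t times the basic antisymmetric matrix
R = expm(A)

# R should be the rotation by angle t ...
assert abs(R[0][0] - math.cos(t)) < 1e-12
assert abs(R[1][0] - math.sin(t)) < 1e-12

# ... and hence orthogonal: R^T R = I.
Rt = [list(row) for row in zip(*R)]
RtR = matmul(Rt, R)
assert all(abs(RtR[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(2) for j in range(2))
print("e^{tA} is a rotation, hence orthogonal")
```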
2.4 The symplectic group
Finally, consider a smooth curve A(t) lying in Sp(2n) and satisfying A(0) = I, say
A(t) = I + At + A₂t² + A₃t³ + … .   (30)
we conclude that every matrix in the Lie algebra sp(2n) must satisfy
A^T Ω + ΩA = 0 .   (32)
This same equation arose already in Handout #12 when we were discussing “infinitesimal
canonical transformations”. Recall how we handled it: since Ω is antisymmetric, we can
also write the condition (32) as
ΩA − A^T Ω^T = 0   (33)
or in other words
ΩA − (ΩA)^T = 0 .   (34)
So this says that the matrix ΩA is symmetric, or equivalently (by the congruence M ↦ Ω^T M Ω, using Ω^T Ω = I) that the matrix
Ω^T (ΩA) Ω = AΩ   (35)
is symmetric. And conversely, if A is a 2n × 2n matrix such that ΩA is symmetric, then it can
be shown that the curve A(t) = e^{tA} satisfies A(t)^T ΩA(t) = Ω for all t. So the Lie algebra sp(2n) consists precisely of the 2n × 2n
matrices such that ΩA is symmetric:
sp(2n) = {A ∈ R^{2n×2n} : (ΩA)^T = ΩA} .
Note that sp(2n) is a vector space of dimension
(2n)(2n + 1)/2 = n(2n + 1)   (38)
(why?).
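For n = 1 this can be made completely concrete: with Ω = [[0, 1], [−1, 0]], the matrix ΩA is symmetric exactly when A is traceless, so sp(2) = sl(2), of dimension 3 = n(2n + 1). A small Python check (the sample matrices are my own choices):

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Omega = [[0.0, 1.0], [-1.0, 0.0]]

def in_sp2(A):
    """Check the Lie-algebra condition A^T Omega + Omega A = 0."""
    At = [list(row) for row in zip(*A)]
    M = matmul(At, Omega)
    N = matmul(Omega, A)
    return all(abs(M[i][j] + N[i][j]) < 1e-12
               for i in range(2) for j in range(2))

print(in_sp2([[2.0, 5.0], [1.0, -2.0]]))   # traceless -> True
print(in_sp2([[1.0, 0.0], [0.0, 1.0]]))    # trace 2  -> False
```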
Of course, in a general (e.g. discrete) group G there may not be any sensible way of
defining quantitatively what we mean by the “deviation” of ghg⁻¹h⁻¹ from e. But in a Lie
group there is an obvious way of doing so, since the group elements are parametrized by a
vector of real numbers. In particular, our examples of Lie groups are groups of n × n real
matrices (and e is the identity matrix I), so we can simply look at the matrix difference
ghg⁻¹h⁻¹ − I: its deviation from zero tells us the degree to which g fails to commute with h.
Let us do this when g and h are elements of G near the identity element. More precisely,
let us consider a smooth curve g(s) lying in G and satisfying g(0) = I, say
g(s) = I + As + O(s²) ,   (39)
and another smooth curve h(t) lying in G and satisfying h(0) = I, say
h(t) = I + Bt + O(t²) .   (40)
We shall need the fact that, for any real (or complex) number x with |x| < 1,
(1 + x)⁻¹ = 1 − x + x² − x³ + x⁴ − …   (41)
in the sense that the power series on the right-hand side is absolutely convergent and equals
(1 + x)⁻¹. [The absolute convergence is easy to prove, using properties of geometric series.
To show that the infinite series equals (1 + x)⁻¹, just multiply it by 1 + x, rearrange the
terms (this is legitimate since the series is absolutely convergent), and see that the result is
1 + 0x + 0x² + … .] Similarly, for any real (or complex) n × n matrix A satisfying ‖A‖ < 1
(where ‖·‖ is a suitable norm), we have
(I + A)⁻¹ = I − A + A² − A³ + A⁴ − … ;   (42)
the proof is exactly the same. This sum is called the Neumann series for the matrix
inverse. (Here we won’t bother to make explicit what norm ‖·‖ we are using; but it won’t
matter, because anyway we will be using the Neumann formula only for matrices A that are
very small.)
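A quick numerical illustration of the Neumann series (the matrix A and the number of terms kept are arbitrary choices; ‖A‖ < 1 is what makes the partial sums converge):

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# A small matrix, comfortably inside the ||A|| < 1 regime.
A = [[0.0, 0.1], [0.2, 0.0]]

# Partial sum of the Neumann series I - A + A^2 - A^3 + ...
approx = [[1.0, 0.0], [0.0, 1.0]]
power = [[1.0, 0.0], [0.0, 1.0]]
for k in range(1, 50):
    power = matmul(power, A)
    sign = -1.0 if k % 2 == 1 else 1.0
    approx = [[approx[i][j] + sign * power[i][j] for j in range(2)]
              for i in range(2)]

# Exact inverse of I + A via the 2x2 adjugate formula.
IpA = [[1.0, 0.1], [0.2, 1.0]]
d = IpA[0][0] * IpA[1][1] - IpA[0][1] * IpA[1][0]
exact = [[ IpA[1][1] / d, -IpA[0][1] / d],
         [-IpA[1][0] / d,  IpA[0][0] / d]]

err = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(err)   # negligibly small
```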
So using the Neumann formula we have
g(s)⁻¹ = I − As + O(s²)   (43)
and
h(t)⁻¹ = I − Bt + O(t²)   (44)
[why?]. The commutator of g(s) and h(t) is therefore
g(s)h(t)g(s)⁻¹h(t)⁻¹ = I + (AB − BA)st + …   (45)
(You should check this computation carefully!) Note that there are no terms sⁿt⁰ with n ≥ 1,
because g(s)h(0)g(s)⁻¹h(0)⁻¹ = I for all s [why?]; and likewise there are no terms s⁰tⁿ with
n ≥ 1. So all the terms not explicitly shown here must be of order st² or s²t or higher.
From (45) we see that the quantity AB − BA thus measures the degree to which g(s)
fails to commute with h(t), at the lowest nontrivial order (which is order st). We therefore
make the definition: the Lie bracket [A, B] is defined by⁹
[A, B] = AB − BA . (46)
The Lie bracket measures the noncommutativity of the Lie group G in a small neighborhood
of the identity element.
Since the curves g(s) and h(t) lie in the group G, their “infinitesimal elements” A and B
must belong to the Lie algebra g (by definition); and conversely, any two elements A and B
of the Lie algebra g are tangent vectors to suitable smooth curves g(s) and h(t) lying in G.
What about [A, B]? We see from (45) that [A, B] is indeed the tangent vector to a curve
lying in G, namely the curve
g(√t) h(√t) g(√t)⁻¹ h(√t)⁻¹ = I + [A, B] t + O(t^{3/2}) .   (47)
(I am not sure whether the O(t^{3/2}) here can actually be replaced by O(t²); but anyway,
formula (47) already suffices to show that the curve in question is at least once continuously
differentiable, with tangent vector [A, B] at t = 0.) It follows that [A, B] belongs to the Lie
algebra g.
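Formula (47) can be probed numerically: for small t, the group commutator of e^{√t A} and e^{√t B} should differ from I + [A, B]t by roughly t^{3/2}. A Python sketch (the matrices and the value of t are my own choices):

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(M, c):
    return [[c * M[i][j] for j in range(2)] for i in range(2)]

def expm(M, terms=30):
    """Matrix exponential of a 2x2 matrix via a truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = scale(matmul(term, M), 1.0 / k)
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
AB, BA = matmul(A, B), matmul(B, A)
bracket = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

t = 1e-4
r = math.sqrt(t)
comm = matmul(matmul(expm(scale(A, r)), expm(scale(B, r))),
              matmul(expm(scale(A, -r)), expm(scale(B, -r))))

# Deviation from I + [A,B] t should be of order t^{3/2} = 1e-6.
err = max(abs(comm[i][j] - ((1.0 if i == j else 0.0) + t * bracket[i][j]))
          for i in range(2) for j in range(2))
print(err)   # roughly 1e-6
```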
1. Bilinearity. We have [αA + βB, C] = α[A, C] + β[B, C], and similarly in the second argument.
2. Anticommutativity. We have [B, A] = −[A, B].
3. Jacobi identity. We have
[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0   (50)
or equivalently (using anticommutativity)
[[A, B], C] + [[B, C], A] + [[C, A], B] = 0   (51)
9 The Lie bracket is also called the commutator of the two matrices A and B. It is unfortunate that the
same word “commutator” is used in group theory and in linear algebra for two closely related but slightly
different notions.
The bilinearity and anticommutativity are trivial, and the Jacobi identity is an easy computation (do it!).
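These identities are indeed easy to check by direct computation; the Python sketch below verifies anticommutativity and the Jacobi identity for three arbitrarily chosen 2 × 2 matrices (bilinearity follows in one line from the definition):

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def madd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

A = [[1.0, 2.0], [0.0, -1.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
C = [[3.0, 0.0], [1.0, 2.0]]

# Anticommutativity: [B, A] = -[A, B].
assert bracket(B, A) == [[-x for x in row] for row in bracket(A, B)]

# Jacobi identity (50): the cyclic sum vanishes identically.
J = madd(madd(bracket(A, bracket(B, C)),
              bracket(B, bracket(C, A))),
         bracket(C, bracket(A, B)))
assert J == [[0.0, 0.0], [0.0, 0.0]]
print("anticommutativity and Jacobi identity verified")
```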
This structure is so important in mathematics that it is given a name: any vector space V
equipped with a product V ×V → V that is bilinear, anticommutative and satisfies the Jacobi
identity is called a Lie algebra (or an abstract Lie algebra if we wish to distinguish it
from the concrete Lie algebras of n × n matrices introduced here).
TO BE CONTINUED