

MATHEMATICS 7302 (Analytical Dynamics)

YEAR 2016–2017, TERM 2

HANDOUT #13: INTRODUCTION TO THE ROTATION GROUP

1 Lie groups
As mentioned in Handout #11, the general framework for studying symmetry in mathe-
matics and physics is group theory. In your algebra class you have studied many examples
of finite groups (e.g. the symmetric group and its subgroups) and some examples of infinite
discrete groups (e.g. the integers Z). But in physics the most important symmetries are
those associated to continuous groups (i.e. groups parametrized by one or more real param-
eters), since these are the ones that give rise, via Noether’s theorem, to conservation laws.
Continuous groups are also known as Lie groups, in honor of the Norwegian mathematician
Sophus Lie (1842–1899); they play a key role in differential geometry, quantum mechanics,
and many other areas of mathematics and physics. In this handout I would like to give
you an elementary introduction to the theory of Lie groups, focusing on the classical matrix
groups and then concentrating on the most important one for our purposes, namely the
group of rotations of three-dimensional space.
A precise definition of “Lie group” requires some preliminary concepts from differential
geometry, notably the notion of a manifold. Instead of doing this, I will simply show you
some concrete examples of Lie groups; all of my examples will be groups of matrices, with
the group operation being matrix multiplication.

1.1 The general linear group


The set of all n × n matrices (with real entries) does not form a group with respect
to matrix multiplication, because not all matrices have multiplicative inverses. But if we
restrict ourselves to invertible matrices, then we do have a group, since the product of two
invertible matrices is invertible: $(AB)^{-1} = B^{-1} A^{-1}$. We therefore define the general linear
group GL(n) to be the set of all invertible n × n matrices:

$$\mathrm{GL}(n) = \{A \in \mathbb{R}^{n \times n} : A \text{ is invertible}\} . \tag{1}$$

Since a matrix is invertible if and only if its determinant is nonzero, we can also write

$$\mathrm{GL}(n) = \{A \in \mathbb{R}^{n \times n} : \det A \neq 0\} . \tag{2}$$

Therefore, GL(n), considered as a set, is simply the $n^2$-dimensional vector space $\mathbb{R}^{n \times n}$ with
the subset $\{A : \det A = 0\}$ removed. Since the determinant is a polynomial function of the
matrix elements $a_{11}, a_{12}, \ldots, a_{nn}$, it follows that the set $\{A : \det A = 0\}$ is an “algebraic
subvariety of codimension 1” in $\mathbb{R}^{n \times n}$. [Codimension 1 because it is specified by one equation.]
Geometrically it is a hypersurface in $\mathbb{R}^{n \times n}$ (“hypersurface” is a synonym of “submanifold
of codimension 1”). Fancy language aside, the important point is that “nearly all” $n \times n$
matrices are invertible; only a lower-dimensional subset of them are noninvertible. So GL(n)
is a Lie group of the full dimension $n^2$.¹
The $n \times n$ matrices are in one-to-one correspondence with the linear maps from $\mathbb{R}^n$ to
itself: namely, the matrix $A$ induces the linear map $x \mapsto Ax$. Under this correspondence,
the matrices $A \in \mathrm{GL}(n)$ correspond to the automorphisms of the n-dimensional real vector
space $\mathbb{R}^n$: that is, they are the maps from $\mathbb{R}^n$ to itself that preserve its vector-space structure
(namely, are linear and invertible). The group GL(n) can thus be regarded as the group of
automorphisms of the n-dimensional real vector space $\mathbb{R}^n$.
All of the remaining groups to be introduced here will be subgroups of GL(n).
Remark. Sometimes we write GL(n, R) to stress that these are matrices whose entries
are real numbers. One can also study the group GL(n, C) of complex matrices, or more
generally the group GL(n, F ) for any field F .
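
The closure property behind GL(n) can be checked by hand in a small case. The following is a minimal Python sketch, not part of the handout, that verifies $(AB)^{-1} = B^{-1}A^{-1}$ for two sample 2 × 2 matrices using exact rational arithmetic. The helper names `det2`, `inv2` and `matmul` and the sample matrices are illustrative assumptions.

```python
from fractions import Fraction

def det2(A):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inv2(A):
    """Inverse of a 2x2 matrix; fails when det A = 0, i.e. when A is not in GL(2)."""
    d = det2(A)
    if d == 0:
        raise ValueError("det A = 0: matrix is not invertible")
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two sample invertible matrices (hypothetical choices, exact rationals).
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]   # det A = 1
B = [[Fraction(1), Fraction(3)], [Fraction(0), Fraction(2)]]   # det B = 2

inverse_of_product = inv2(matmul(A, B))
product_of_inverses = matmul(inv2(B), inv2(A))
print(inverse_of_product == product_of_inverses)
```

Exact fractions are used so that the comparison is an honest equality rather than a floating-point approximation.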

1.2 The special linear group


The determinant of an $n \times n$ matrix has an important geometric meaning: namely, $\det A$
is the factor by which volumes are scaled under the transformation $x \mapsto Ax$ of $\mathbb{R}^n$ into itself.
(More precisely, $|\det A|$ is the factor by which volumes are scaled, while the sign of $\det A$ tells
us whether orientations are preserved or reversed.) This suggests that we might consider
the subgroup of GL(n) consisting of matrices that preserve volume and orientation. We therefore
define the special linear group SL(n) by

SL(n) = {A ∈ GL(n) : det A = 1} . (3)

Since $\det(AB) = (\det A)(\det B)$, it follows that SL(n) really is a group; it is a subgroup of
GL(n). Since SL(n) consists of the matrices A satisfying one equation $\det A = 1$, it is a
subgroup of codimension 1. Therefore, SL(n) is a Lie group of dimension $n^2 - 1$; it will be
parametrized (at least locally) by $n^2 - 1$ independent real parameters.
Remark. Once again we write $\mathrm{SL}(n, \mathbb{R})$ to stress that these are matrices whose entries
are real numbers, since one can also study $\mathrm{SL}(n, \mathbb{C})$ and $\mathrm{SL}(n, F)$ for any field $F$, or even
$\mathrm{SL}(n, R)$ for any commutative ring $R$. In fact, the group $\mathrm{SL}(2, \mathbb{Z})$ and its subgroups play
an important role in complex analysis and analytic number theory. (Note, however, that
$\mathrm{SL}(2, \mathbb{Z})$ is a discrete group, not a Lie group.)
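
The multiplicativity of the determinant, which is what makes SL(n) a group, is easy to test on integer matrices. The sketch below is an illustration only: the matrices `S`, `T`, `P`, `Q` and the helper names are assumptions chosen for the example. It multiplies two elements of SL(2, Z) and also checks $\det(PQ) = (\det P)(\det Q)$ for two matrices that are not in SL(2, Z).

```python
def det2(A):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two elements of SL(2, Z): integer entries and determinant 1.
S = [[0, -1], [1, 0]]
T = [[1, 1], [0, 1]]
ST = matmul(S, T)            # the product is again an integer matrix with det 1

# Two integer matrices outside SL(2, Z), to illustrate det(PQ) = det(P) det(Q).
P = [[2, 3], [1, 4]]         # det P = 5
Q = [[1, -2], [3, 1]]        # det Q = 7
PQ = matmul(P, Q)

print(ST, det2(ST))
print(det2(PQ), det2(P) * det2(Q))
```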

1.3 The orthogonal group and special orthogonal group


Now consider $\mathbb{R}^n$ equipped with the Euclidean inner product $x \cdot y = \sum_{i=1}^n x_i y_i$ [where
$x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$] and hence with the Euclidean norm $\|x\| = (x \cdot x)^{1/2}$.
Which linear transformations $x \mapsto Ax$ preserve these structures? You learned the answer in
your linear algebra course:
¹ Another way of saying this is: for any invertible matrix $A$, there exists a neighborhood of $A$ in $\mathbb{R}^{n \times n}$ in
which all the matrices are invertible.

Lemma 1 Let $A$ be an $n \times n$ matrix with real entries, and consider the linear transformation
of $\mathbb{R}^n$ defined by $x \mapsto Ax$. Then the following are equivalent:

(a) This transformation preserves the Euclidean inner product, i.e. $(Ax) \cdot (Ay) = x \cdot y$ for
all $x, y \in \mathbb{R}^n$.

(b) This transformation preserves the Euclidean norm, i.e. $\|Ax\| = \|x\|$ for all $x \in \mathbb{R}^n$.

(c) The matrix $A$ is an orthogonal matrix, i.e. $A^T A = I$.

For completeness let me remind you of the proof:


Proof. Let $x$ and $y$ be arbitrary vectors in $\mathbb{R}^n$, and let us compute

\begin{align}
(Ax) \cdot (Ay) &= \sum_{i=1}^n (Ax)_i (Ay)_i \tag{4a} \\
&= \sum_{i,j,k=1}^n A_{ij} x_j A_{ik} y_k \tag{4b} \\
&= \sum_{i,j,k=1}^n x_j (A^T)_{ji} A_{ik} y_k \tag{4c} \\
&= \sum_{j,k=1}^n x_j (A^T A)_{jk} y_k . \tag{4d}
\end{align}

So if $A^T A = I$, or in other words $(A^T A)_{jk} = \delta_{jk}$, we have $(Ax) \cdot (Ay) = x \cdot y$. So (c)
implies (a).
For the converse, let $x = e_\ell$ (the vector having a 1 in the $\ell$th entry and zeros in all other
entries) and $y = e_m$. Then $x \cdot y = \delta_{\ell m}$. And the formula above shows that $(Ax) \cdot (Ay) =
(A^T A)_{\ell m}$. These two are equal for all choices of $\ell, m$ only if $A^T A = I$. So (a) implies (c).
Now (a) trivially implies (b), by setting $y = x$. To prove the converse, we observe that

$$x \cdot y = \frac{1}{2} \Big[ (x + y) \cdot (x + y) - x \cdot x - y \cdot y \Big] \tag{5}$$

and

$$(Ax) \cdot (Ay) = \frac{1}{2} \Big[ (A(x + y)) \cdot (A(x + y)) - (Ax) \cdot (Ax) - (Ay) \cdot (Ay) \Big] . \tag{6}$$

It follows from these two formulae that (b) implies (a) [please make sure you understand
why]. ∎

Remarks. 1. The operation used in the proof of (b) ⟹ (a) is called polarization: it is
a general procedure that allows one to recover a symmetric bilinear form from the quadratic
form it generates. An alternative version of the polarization formula is

$$x \cdot y = \frac{1}{4} \Big[ (x + y) \cdot (x + y) - (x - y) \cdot (x - y) \Big] \tag{7}$$

and the analogous thing with $A$.
2. In your linear algebra course you learned that for a finite square matrix $A$, the following
are equivalent:
(a) $A$ has a left inverse (namely, a matrix $B$ satisfying $BA = I$).
(b) $A$ has a right inverse (namely, a matrix $B$ satisfying $AB = I$).
(c) $\det A \neq 0$.
and that when (a)–(c) hold, the left and right inverses are unique and equal. It follows from
this that $A^T A = I$ is equivalent to $A A^T = I$; therefore, we can define “orthogonal matrix”
by any one of the three equivalent statements:
(i) $A^T A = I$
(ii) $A A^T = I$
(iii) $A^T A = A A^T = I$
Warning: The equivalence (a)⟺(b) is not true for rectangular matrices, or for operators
on infinite-dimensional vector spaces.
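
Lemma 1 and the polarization identity (5) can be illustrated numerically. The following Python sketch is an illustration under assumed inputs: the rotation angle `theta`, the test vectors `x`, `y` and the helper names are arbitrary choices, not taken from the handout. It applies a plane rotation to two vectors and compares inner products before and after.

```python
import math

def matvec(A, x):
    """Apply the matrix A (list of rows) to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

theta = 0.7                                   # arbitrary sample angle
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]     # an orthogonal matrix

x, y = [1.0, 2.0], [-3.0, 0.5]                # arbitrary sample vectors

inner_before = dot(x, y)
inner_after = dot(matvec(R, x), matvec(R, y))

# Polarization, equation (5): recover x.y from squared norms only.
s = [a + b for a, b in zip(x, y)]
polarized = 0.5 * (dot(s, s) - dot(x, x) - dot(y, y))

print(inner_before, inner_after, polarized)
```

The three printed numbers should agree up to floating-point rounding, since a rotation matrix satisfies condition (c) of the Lemma.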

In view of the preceding Lemma, we define the orthogonal group O(n) by

$$\mathrm{O}(n) = \{A \in \mathrm{GL}(n) : A^T A = I\} . \tag{8}$$

[You should verify for yourself that $A, B \in \mathrm{O}(n)$ imply $AB \in \mathrm{O}(n)$ and $A^{-1} \in \mathrm{O}(n)$, so
that O(n) is indeed a group; it is a subgroup of GL(n).] The group O(n) is the group of
automorphisms of n-dimensional Euclidean space (i.e., $\mathbb{R}^n$ equipped with the Euclidean inner
product).
Taking determinants of the defining equation $A^T A = I$ and remembering that

$$\det(A^T) = \det A \quad \text{and} \quad \det(AB) = (\det A)(\det B) , \tag{9}$$

we see that a matrix $A \in \mathrm{O}(n)$ must satisfy $(\det A)^2 = 1$, or in other words

$$\det A = \pm 1 . \tag{10}$$

Both signs are possible. Consider, for instance, n × n diagonal matrices with 1’s and −1’s
on the diagonal. All such matrices are orthogonal, hence belong to O(n). If the number of
entries −1 is even, then det A = +1; but if the number of entries −1 is odd, then det A = −1.
Topologically, the group O(n) is the disjoint union of two connected components: the
orthogonal matrices with determinant +1 (this is the connected component containing the
identity matrix), and the orthogonal matrices with determinant −1. (These two subsets must
“stay away from” each other in $\mathbb{R}^{n \times n}$, because the determinant is a continuous function.)
The orthogonal matrices with det A = +1 form a subgroup of O(n) [why?]; it is called
the special orthogonal group SO(n):

SO(n) = {A ∈ O(n) : det A = +1} = O(n) ∩ SL(n) . (11)

It is also called the rotation group, because the elements of SO(n) can be regarded as
rotations. (We will see this explicitly for n = 2 and n = 3.) By contrast, some of the
matrices in O(n) with determinant −1 can be regarded as reflections.
What is the dimension of the Lie groups O(n) and SO(n)? The defining equation $A^T A = I$
naively comprises $n^2$ scalar equations, one for each matrix entry. But it is important to note
that the matrix $A^T A$ is symmetric (no matter what $A$ is); therefore, some of the $n^2$ equations
in $A^T A = I$ are redundant (namely, the $ij$ equation says the same thing as the $ji$ equation).
So the non-redundant equations are, for instance, those on the diagonal and those above the
diagonal (i.e. $i \le j$); there are $n(n + 1)/2$ of these [you should check this carefully].
Therefore, the dimension of the Lie groups O(n) and SO(n) is

$$n^2 - \frac{n(n + 1)}{2} = \frac{n(n - 1)}{2} . \tag{12}$$

In particular, the rotation group in dimension n = 2 is one-dimensional (it is parametrized
by a single angle), and the rotation group in dimension n = 3 is three-dimensional (it can
be parametrized by three angles, as we shall later see explicitly).
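
The counting argument leading to (12) can be mimicked directly: enumerate the index pairs $i \le j$ and subtract their number from $n^2$. This short Python sketch does just that; the function name `orthogonal_group_dim` is a hypothetical label introduced for the illustration.

```python
def orthogonal_group_dim(n):
    """n^2 matrix entries minus the non-redundant equations (i <= j) in A^T A = I."""
    constraints = sum(1 for i in range(n) for j in range(i, n))
    return n * n - constraints

for n in (2, 3, 4):
    # Second and third columns should agree, as in equation (12).
    print(n, orthogonal_group_dim(n), n * (n - 1) // 2)
```

Note that this only counts equations; it does not by itself prove that the remaining equations are independent.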

1.4 The pseudo-orthogonal groups


Here is a generalization of the orthogonal groups O(n) that plays an important role in
special relativity, among other applications.
Note first that the defining equation $A^T A = I$ of the orthogonal group can trivially be
rewritten as $A^T I A = I$. That is, the orthogonal transformations are those that preserve the
symmetric bilinear form represented by the matrix $I$, which is $\langle x, y \rangle = x^T I y = x \cdot y$. (This
is precisely the content of the Lemma of the preceding section.)
So it might be worth generalizing this to consider matrices that preserve the symmetric
bilinear form represented by some other symmetric matrix. For instance, we can consider the
diagonal matrix $I_{p,q}$ whose diagonal entries are $p$ entries $+1$ followed by $q$ entries $-1$ (where $n = p + q$).
The associated symmetric bilinear form is

$$\langle x, y \rangle = x^T I_{p,q}\, y = \sum_{i=1}^{p} x_i y_i - \sum_{i=p+1}^{p+q} x_i y_i . \tag{13}$$

We then define the pseudo-orthogonal group O(p, q) by

$$\mathrm{O}(p, q) = \{A \in \mathrm{GL}(p + q) : A^T I_{p,q} A = I_{p,q}\} . \tag{14}$$

In particular, the pseudo-orthogonal group O(3, 1) is called the Lorentz group; it plays a
central role in special relativity. (The numbers 3 and 1 arise because we live in a universe
having three space dimensions and one time dimension.)
It is easy to see, just as with the orthogonal groups, that every matrix A ∈ O(p, q)
has determinant ±1 and that both signs are possible. We therefore define also the special
pseudo-orthogonal group SO(p, q) by
SO(p, q) = {A ∈ O(p, q) : det A = +1} = O(p, q) ∩ SL(p + q) . (15)
The pseudo-orthogonal groups O(p, q) and SO(p, q) have dimension n(n − 1)/2 [where
n = p + q], just like the orthogonal group.
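
To make the defining condition (14) concrete, here is a hedged Python sketch for the smallest interesting case, O(1, 1), rather than the Lorentz group O(3, 1) itself. It uses the standard hyperbolic "boost" matrix; the names `boost`, `ETA` and `preserves_form` and the sample parameter `0.8` are assumptions made for the illustration.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def boost(phi):
    """Hyperbolic rotation with parameter phi (a candidate element of O(1,1))."""
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, s], [s, c]]

ETA = [[1.0, 0.0], [0.0, -1.0]]        # the matrix I_{1,1}
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def preserves_form(A, G, tol=1e-12):
    """Check A^T G A = G entrywise, up to a rounding tolerance."""
    lhs = matmul(transpose(A), matmul(G, A))
    n = len(G)
    return all(abs(lhs[i][j] - G[i][j]) <= tol for i in range(n) for j in range(n))

A = boost(0.8)
print(preserves_form(A, ETA))          # boost preserves the indefinite form
print(preserves_form(A, IDENTITY))     # but it is not an orthogonal matrix
```

The check relies only on the identity $\cosh^2\varphi - \sinh^2\varphi = 1$, which also shows that this particular matrix has determinant $+1$.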

Remark. One might ask whether there are other symmetric bilinear forms worth considering,
beyond $I_{p,q}$. The answer is no, because Sylvester’s law of inertia guarantees
that every nondegenerate symmetric bilinear form can be represented by $I_{p,q}$ in a suitably
chosen basis. More precisely, let us say that the $n \times n$ matrices $A$ and $B$ are congruent if
there exists an invertible matrix $S$ such that $B = S A S^T$ (and hence also $A = S^{-1} B (S^{-1})^T$).
Then Sylvester’s theorem asserts that any symmetric matrix $A$ is congruent to a diagonal
matrix having all its diagonal entries equal to $+1$, $-1$ or $0$, and that moreover the number
of diagonal entries of each kind is an invariant of $A$, i.e. it does not depend on the matrix
$S$ being used. This invariant is called the inertia $(n_+, n_-, n_0)$ of the matrix $A$. The nondegenerate
symmetric bilinear forms are the ones with $n_0 = 0$; that is, they are congruent to
some matrix $I_{p,q}$ (where $p = n_+$ and $q = n_-$).

1.5 The symplectic group


We now come to the least intuitive of the “classical matrix groups”, namely the symplectic
group. But I would be remiss if I ignored it, because it plays a central role in Hamiltonian
mechanics.
We follow the same reasoning as for the pseudo-orthogonal groups, but instead of looking
for matrices that preserve a symmetric bilinear form, we look for matrices that preserve
an antisymmetric bilinear form. That is, we fix an antisymmetric matrix $J$ and we look
for matrices $A$ satisfying $A^T J A = J$. Now, nondegenerate antisymmetric matrices exist
only in even dimension, say dimension $2n$.² The canonical example of a nondegenerate
antisymmetric matrix is precisely the matrix $\Omega$ that we introduced in Handout #12 to
represent the structure of Hamiltonian phase space. Namely, $\Omega$ is the $2n \times 2n$ matrix whose
$n \times n$ blocks look like

$$\Omega = \begin{pmatrix} 0_n & I_n \\ -I_n & 0_n \end{pmatrix} \tag{16}$$

where $I_n$ denotes the $n \times n$ identity matrix and $0_n$ denotes the $n \times n$ zero matrix. Note that
$\Omega$ is antisymmetric and satisfies $\Omega^2 = -I$. We then define the symplectic group Sp(2n)
by

$$\mathrm{Sp}(2n) = \{A \in \mathrm{GL}(2n) : A^T \Omega A = \Omega\} . \tag{17}$$

The matrices $A \in \mathrm{Sp}(2n)$ are called symplectic matrices.³
Taking determinants of the defining equation $A^T \Omega A = \Omega$ gives $(\det A)^2 \det \Omega = \det \Omega$.
Using the nondegeneracy $\det \Omega \neq 0$,⁴ we conclude that every symplectic matrix satisfies

$$\det A = \pm 1 . \tag{18}$$

But, unlike in the case of the orthogonal group, it turns out that the case $\det A = -1$ does
not occur; all symplectic matrices in fact have $\det A = +1$. But the proof of this fact requires
a deeper algebraic study, and I will not pursue it here.⁵

² For a real antisymmetric matrix, the eigenvalues are pure imaginary and come in complex-conjugate
pairs. If the dimension is odd, then at least one of the eigenvalues must be zero.
³ In equation (39) of Handout #12 I used a slightly different convention, writing $A \Omega A^T = \Omega$ instead of
$A^T \Omega A = \Omega$.
⁴ In fact $\det \Omega = +1$, but we don’t need this.
What is the dimension of the symplectic group Sp(2n)? The defining equation $A^T \Omega A = \Omega$
naively comprises $(2n)^2 = 4n^2$ scalar equations. But it is important to note that the matrix
$A^T \Omega A$ is antisymmetric (no matter what $A$ is); therefore, some of the $(2n)^2$ equations in
$A^T \Omega A = \Omega$ are redundant (namely, the $ij$ equation says the same thing as the $ji$ equation,
and the diagonal equations say $0 = 0$). So the non-redundant equations are, for instance, those
above the diagonal (i.e. $i < j$); there are $(2n)(2n - 1)/2$ of these [you should check this
carefully]. Therefore, the dimension of the symplectic group Sp(2n) is

$$(2n)^2 - \frac{(2n)(2n - 1)}{2} = \frac{(2n)(2n + 1)}{2} = n(2n + 1) . \tag{19}$$
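
For $n = 1$ the symplectic condition can be tested exactly on integer matrices. For a 2 × 2 matrix one has $A^T \Omega A = (\det A)\,\Omega$, so Sp(2) coincides with SL(2). The Python sketch below is an illustration with assumed sample matrices and hypothetical helper names: it checks the condition (17) for a matrix of determinant $+1$ and for a reflection of determinant $-1$, and repeats the equation count behind (19).

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

OMEGA = [[0, 1], [-1, 0]]      # the matrix (16) for n = 1

def is_symplectic(A):
    """Exact test of A^T Omega A = Omega for a 2x2 integer matrix."""
    return matmul(transpose(A), matmul(OMEGA, A)) == OMEGA

def symplectic_group_dim(n):
    """(2n)^2 entries minus the non-redundant equations (i < j) in A^T Omega A = Omega."""
    size = 2 * n
    constraints = sum(1 for i in range(size) for j in range(i + 1, size))
    return size * size - constraints

A = [[2, 1], [3, 2]]           # det A = +1
R = [[1, 0], [0, -1]]          # a reflection: orthogonal, det R = -1

print(is_symplectic(A), is_symplectic(R))
print([symplectic_group_dim(n) for n in (1, 2, 3)])
```

The reflection `R` is orthogonal but not symplectic, a small instance of the fact that $\det A = -1$ does not occur in Sp(2n).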

2 Lie algebras
Let G be a Lie group of n × n matrices: for instance, one of the “classical matrix groups”
introduced in the preceding section. It turns out that much of the structure of G can be
understood by looking only at the behavior of G in a tiny neighborhood of the identity
element. Intuitively speaking, we consider matrices $I + \epsilon A$ with “$\epsilon$ infinitesimal” and
we ask under what conditions this matrix belongs to G “through first order in $\epsilon$”. More
rigorously, we consider a smooth curve $A(t)$ lying in G and satisfying $A(0) = I$, and we ask
what are the possible values of $A'(0) = (d/dt) A(t) \big|_{t=0}$.⁶ Geometrically, what this means
is that we regard G as a submanifold embedded in the vector space $\mathbb{R}^{n \times n}$, and we seek to
determine the tangent space to G at the identity element I. The Lie algebra g associated
to the Lie group G is, by definition, the tangent space to G at the point I; that is, it is the
set of all possible values of $A'(0)$ when we consider all smooth curves $A(t)$ lying in G and
satisfying $A(0) = I$. Since a tangent space to a manifold is always a vector space (that is, it
is closed under addition and multiplication by scalars), it follows that the Lie algebra g is a
vector space. This fact makes it somewhat easier to study than G itself, which is a nonlinear
manifold. That is the principal reason for introducing Lie algebras.
So our next task is to determine the Lie algebra g for each of the “classical matrix
groups” G introduced in the preceding section.

2.1 The general linear group


The case of the general linear group GL(n) is easy: every matrix in a neighborhood of
the identity matrix belongs to GL(n), so the Lie algebra gl(n) consists of all $n \times n$ matrices.
That is,

$$\mathfrak{gl}(n) = \mathbb{R}^{n \times n} . \tag{20}$$

Of course gl(n) is a vector space of dimension $n^2$.
⁵ The best way to prove this is to use the pfaffian, which is a kind of “square root of a determinant” for
antisymmetric matrices. See e.g. http://en.wikipedia.org/wiki/Symplectic_matrix and
http://en.wikipedia.org/wiki/Pfaffian
⁶ Here $t$ is simply a parameter; it need not be interpreted as “time”. But it can be interpreted as “time”
if you wish: that is, we consider a “particle” moving through G, reaching I at time $t = 0$, and we ask what
are its possible “velocities” at time $t = 0$.

2.2 The special linear group
The next case to consider is the special linear group SL(n), which consists of the matrices
$A$ satisfying $\det A = 1$. Consider a smooth curve $A(t)$ lying in SL(n) and satisfying $A(0) = I$,
say

$$A(t) = I + At + A_2 t^2 + A_3 t^3 + \cdots . \tag{21}$$

As explained last week in class, we have

$$\det A(t) = 1 + (\operatorname{tr} A)\, t + O(t^2) \tag{22}$$

where $\operatorname{tr} A = \sum_{i=1}^n A_{ii}$ is the trace of the matrix $A$. [Recall the proof: we start from the
definition of determinant as a sum over permutations $\sigma$. Every $\sigma$ other than the identity
permutation must contribute the product of at least two off-diagonal elements, so it will make
a contribution starting only at order $t^2$ or higher. The identity permutation will contribute
the product of diagonal elements, which is $\prod_{i=1}^n (1 + tA_{ii}) = 1 + t \sum_{i=1}^n A_{ii} + O(t^2)$.] Therefore,
if the curve $A(t)$ lies in SL(n) (that is, if $\det A(t) = 1$ for all $t$) then we must have
$\operatorname{tr} A = 0$. So every matrix in the Lie algebra sl(n) must be traceless. And conversely, if $A$
is a traceless $n \times n$ matrix, then it can be shown that the particular curve

$$A(t) = e^{tA} \stackrel{\text{def}}{=} \sum_{k=0}^{\infty} \frac{t^k A^k}{k!} \tag{23}$$

satisfies $\det A(t) = 1$ for all $t$. So the Lie algebra sl(n) consists precisely of the $n \times n$ matrices
that are traceless:

$$\mathfrak{sl}(n) = \{A \in \mathbb{R}^{n \times n} : \operatorname{tr} A = 0\} . \tag{24}$$

Note that sl(n) is a vector space of dimension $n^2 - 1$ (why?).
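
The claim that $e^{tA}$ has determinant 1 when $A$ is traceless can be probed numerically. The following Python sketch is an illustration only: it sums the series (23) up to a fixed number of terms (the truncation length, the helper names and the sample matrices are assumptions) and compares against the general identity $\det e^{M} = e^{\operatorname{tr} M}$.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, terms=40):
    """Truncated power series sum_{k < terms} M^k / k! (fine for small matrices)."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # holds M^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[r + x for r, x in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = [[1.0, 2.0], [3.0, -1.0]]                    # traceless sample matrix
dets = [det2(expm([[t * a for a in row] for row in A])) for t in (0.1, 0.5, 1.0)]

B = [[1.0, 2.0], [3.0, 0.5]]                     # trace 1.5, so not in sl(2)
det_B = det2(expm(B))
expected_det_B = math.exp(1.5)

print(dets)
print(det_B, expected_det_B)
```

A truncated series is adequate here because the matrices are small; it is not meant as a robust general-purpose matrix exponential.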

Remark. The matrix exponential, defined by

$$e^{M} \stackrel{\text{def}}{=} \sum_{k=0}^{\infty} \frac{M^k}{k!} , \tag{25}$$

has some but not all of the properties of the ordinary exponential; this is because matrix
multiplication, unlike ordinary multiplication, is noncommutative. Thus, if $M$ and $N$ are
two matrices, in general we do not have $e^M e^N = e^{M+N}$; rather, $\log(e^M e^N)$ is given by a
much more complicated formula, called the Campbell–Baker–Hausdorff formula. But
if $M$ and $N$ happen to commute (that is, $MN = NM$), then it is true that $e^M e^N = e^{M+N}$.
In particular, since all multiples of a single matrix $A$ trivially commute, it follows that
$e^{sA} e^{tA} = e^{(s+t)A}$, or in other words $A(s)A(t) = A(s + t)$. Thus, the curve $A(t) = e^{tA}$ forms
a commutative subgroup of G: it is the one-parameter subgroup of G generated by the
“infinitesimal generator” $A$.
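
Both statements in the Remark can be seen in a tiny example. In the Python sketch below (an illustration; the matrices `M`, `N` and the helper names are assumed for the example), $M$ and $N$ are nilpotent and do not commute, so $e^M e^N$ differs from $e^{M+N}$, while the one-parameter property $e^{sM} e^{tM} = e^{(s+t)M}$ still holds.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, terms=40):
    """Truncated power series sum_{k < terms} M^k / k!."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[r + x for r, x in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def scale(M, c):
    return [[c * x for x in row] for row in M]

M = [[0.0, 1.0], [0.0, 0.0]]          # M^2 = 0, so e^M = I + M
N = [[0.0, 0.0], [1.0, 0.0]]          # N^2 = 0, so e^N = I + N
M_plus_N = [[0.0, 1.0], [1.0, 0.0]]   # MN != NM

product = matmul(expm(M), expm(N))    # equals [[2, 1], [1, 1]]
exp_sum = expm(M_plus_N)              # equals [[cosh 1, sinh 1], [sinh 1, cosh 1]]

s, t = 0.3, 1.1
one_param_lhs = matmul(expm(scale(M, s)), expm(scale(M, t)))
one_param_rhs = expm(scale(M, s + t))

print(product)
print(exp_sum)
```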

2.3 The orthogonal and special orthogonal groups
Now consider the orthogonal group O(n) and its subgroup SO(n). Any smooth (or even
continuous) curve in O(n) passing through the identity element must lie entirely in SO(n),
since the determinant obviously cannot jump from $+1$ to $-1$. So O(n) and SO(n) will have
the same Lie algebra, which we call so(n).
So consider a smooth curve $A(t)$ lying in O(n) and satisfying $A(0) = I$, say

$$A(t) = I + At + A_2 t^2 + A_3 t^3 + \cdots . \tag{26}$$

The condition to lie in O(n) is $A^T A = I$. Since

$$A(t)^T A(t) = I + (A + A^T)\, t + O(t^2) , \tag{27}$$

we conclude that every matrix in the Lie algebra so(n) must satisfy $A + A^T = 0$, i.e. it must
be antisymmetric. And conversely, if $A$ is an antisymmetric $n \times n$ matrix, then it can be
shown that

$$A(t) = e^{tA} \stackrel{\text{def}}{=} \sum_{k=0}^{\infty} \frac{t^k A^k}{k!} \tag{28}$$

satisfies $A(t)^T A(t) = I$ for all $t$ [indeed $A(t)^T = e^{tA^T} = e^{-tA} = A(t)^{-1}$]. So the Lie algebra
so(n) consists precisely of the $n \times n$ matrices that are antisymmetric:

$$\mathfrak{so}(n) = \{A \in \mathbb{R}^{n \times n} : A^T = -A\} . \tag{29}$$

Note that so(n) is a vector space of dimension $n(n - 1)/2$ [why?].
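
In dimension $n = 2$ the correspondence between antisymmetric matrices and rotations can be seen explicitly. The Python sketch below is an illustration with an assumed sample angle and hypothetical helper names: it exponentiates $tJ$, where $J$ is the antisymmetric matrix with rows $(0, -1)$ and $(1, 0)$, and compares the result with the rotation by angle $t$.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def expm(M, terms=40):
    """Truncated power series sum_{k < terms} M^k / k!."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[r + x for r, x in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

J = [[0.0, -1.0], [1.0, 0.0]]                  # antisymmetric: J^T = -J
t = 0.9                                        # arbitrary sample angle
R = expm([[t * x for x in row] for row in J])  # candidate element of SO(2)

rotation = [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]
RtR = matmul(transpose(R), R)                  # should be the identity

print(R)
print(RtR)
```

Since $J^2 = -I$, the series (28) collapses to $\cos t \, I + \sin t \, J$, which is why the two matrices agree.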


A similar procedure can be employed to determine the Lie algebra so(p, q) associated to
the special pseudo-orthogonal group SO(p, q). The details are left to you.

2.4 The symplectic group


Finally, we consider the symplectic group Sp(2n), which you will recall is the set of
$2n \times 2n$ real matrices $A$ satisfying $A^T \Omega A = \Omega$, where the antisymmetric matrix $\Omega$ is defined
by (16).
So consider a smooth curve $A(t)$ lying in Sp(2n) and satisfying $A(0) = I$, say

$$A(t) = I + At + A_2 t^2 + A_3 t^3 + \cdots . \tag{30}$$

The condition to lie in Sp(2n) is $A^T \Omega A = \Omega$. Since

$$A(t)^T \Omega A(t) = \Omega + (A^T \Omega + \Omega A)\, t + O(t^2) , \tag{31}$$

we conclude that every matrix in the Lie algebra sp(2n) must satisfy

$$A^T \Omega + \Omega A = 0 . \tag{32}$$

This same equation arose already in Handout #12 when we were discussing “infinitesimal
canonical transformations”.⁷ Recall how we handled it: since $\Omega$ is antisymmetric, we can
also write the condition (32) as

$$\Omega A - A^T \Omega^T = 0 \tag{33}$$

or in other words

$$\Omega A - (\Omega A)^T = 0 . \tag{34}$$

So this says that the matrix $\Omega A$ is symmetric, or equivalently that the matrix

$$\Omega^T (\Omega A) \Omega = A \Omega \tag{35}$$

is symmetric. And conversely, if $A$ is a $2n \times 2n$ matrix such that $\Omega A$ is symmetric, then it
can be shown that

$$A(t) = e^{tA} \stackrel{\text{def}}{=} \sum_{k=0}^{\infty} \frac{t^k A^k}{k!} \tag{36}$$

satisfies $A(t)^T \Omega A(t) = \Omega$ for all $t$. So the Lie algebra sp(2n) consists precisely of the $2n \times 2n$
matrices such that $\Omega A$ is symmetric:

$$\mathfrak{sp}(2n) = \{A \in \mathbb{R}^{2n \times 2n} : \Omega A = (\Omega A)^T\} . \tag{37}$$

Note that sp(2n) is a vector space of dimension

$$\frac{(2n)(2n + 1)}{2} = n(2n + 1) \tag{38}$$

(why?).
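
The equivalence between condition (32) and the symmetry of $\Omega A$ can be checked exactly on an integer example. In this Python sketch (an illustration; the symmetric matrix `S` and all helper names are assumptions), a matrix $A$ is built as $A = -\Omega S$ with $S$ symmetric, so that $\Omega A = S$ because $\Omega^2 = -I$, and both forms of the condition are tested for $n = 2$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def omega(n):
    """The 2n x 2n block matrix of equation (16)."""
    size = 2 * n
    W = [[0] * size for _ in range(size)]
    for i in range(n):
        W[i][n + i] = 1
        W[n + i][i] = -1
    return W

def in_sp(A, W):
    """Exact test of condition (32): A^T W + W A = 0."""
    left, right = matmul(transpose(A), W), matmul(W, A)
    size = len(W)
    return all(left[i][j] + right[i][j] == 0 for i in range(size) for j in range(size))

W = omega(2)
S = [[1, 2, 0, 1],
     [2, 0, 3, 0],
     [0, 3, -1, 2],
     [1, 0, 2, 4]]                       # an arbitrary symmetric integer matrix
A = [[-x for x in row] for row in matmul(W, S)]   # A = -Omega S, so Omega A = S

omega_A = matmul(W, A)
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

print(in_sp(A, W), omega_A == transpose(omega_A))
print(in_sp(identity, W))
```

Since $S$ ranges over all symmetric $2n \times 2n$ matrices, this construction also makes the dimension count (38) plausible.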

3 More on Lie algebras: The Lie bracket


The Lie algebra g, being the tangent space to the Lie group G at the identity element,
is a vector space. But it is not just a vector space; it is also an algebra, i.e. a vector space
equipped with a bilinear product. This bilinear product g × g → g is called the Lie bracket;
it encodes much of the structure of the underlying Lie group G, and in a form that is easier to
work with than the multiplication law of G itself.⁸ The purpose of this section is to explain
what the Lie bracket is and how it encodes the structure of G.
The most interesting groups are noncommutative: that is, $gh \neq hg$ for at least some
pairs $g, h \in G$. We can rewrite this as $gh(hg)^{-1} \neq e$, where $e$ is the identity element of G.
In group theory, the quantity $gh(hg)^{-1} = g h g^{-1} h^{-1}$ is called the commutator of $g$ and $h$;
its deviation from $e$ measures in some sense the degree to which $g$ fails to commute with $h$.
⁷ With a transpose compared to the present notation: that is, $K$ in Handout #12 [see equations (50) and
following] corresponds to $A^T$ here. I have therefore slightly modified the treatment from Handout #12 in
order to avoid unnecessary transposes in the final formula.
⁸ More precisely, the Lie bracket encodes all of the local structure of the group G. But it obviously does
not tell us about global aspects of G, such as the difference between the groups O(n) and SO(n), both of
which have the same Lie algebra.

Of course, in a general (e.g. discrete) group G there may not be any sensible way of
defining quantitatively what we mean by the “deviation” of $g h g^{-1} h^{-1}$ from $e$. But in a Lie
group there is an obvious way of doing so, since the group elements are parametrized by a
vector of real numbers. In particular, our examples of Lie groups are groups of $n \times n$ real
matrices (and $e$ is the identity matrix $I$), so we can simply look at the matrix difference
$g h g^{-1} h^{-1} - I$: its deviation from zero tells us the degree to which $g$ fails to commute with $h$.
Let us do this when $g$ and $h$ are elements of G near the identity element. More precisely,
let us consider a smooth curve $g(s)$ lying in G and satisfying $g(0) = I$, say

$$g(s) = I + As + O(s^2) , \tag{39}$$

and another smooth curve $h(t)$ lying in G and satisfying $h(0) = I$, say

$$h(t) = I + Bt + O(t^2) . \tag{40}$$

We now wish to compute the commutator $g(s) h(t) g(s)^{-1} h(t)^{-1}$ in power series in $s$ and $t$.


Recall first that for any real (or complex) number $x$ satisfying $|x| < 1$, we have

$$(1 + x)^{-1} = 1 - x + x^2 - x^3 + x^4 - \cdots \tag{41}$$

in the sense that the power series on the right-hand side is absolutely convergent and equals
$(1 + x)^{-1}$. [The absolute convergence is easy to prove, using properties of geometric series.
To show that the infinite series equals $(1 + x)^{-1}$, just multiply it by $1 + x$, rearrange the
terms (this is legitimate since the series is absolutely convergent), and see that the result is
$1 + 0x + 0x^2 + \cdots$.] Similarly, for any real (or complex) $n \times n$ matrix $A$ satisfying $\|A\| < 1$
(where $\| \cdot \|$ is a suitable norm), we have

$$(I + A)^{-1} = I - A + A^2 - A^3 + A^4 - \cdots ; \tag{42}$$

the proof is exactly the same. This sum is called the Neumann series for the matrix
inverse. (Here we won’t bother to make explicit what norm $\| \cdot \|$ we are using; but it won’t
matter, because anyway we will be using the Neumann formula only for matrices $A$ that are
very small.)
So using the Neumann formula we have

$$g(s)^{-1} = I - As + O(s^2) \tag{43}$$

and

$$h(t)^{-1} = I - Bt + O(t^2) \tag{44}$$

[why?]. The commutator of $g(s)$ and $h(t)$ is therefore

\begin{align}
g(s) h(t) g(s)^{-1} h(t)^{-1} &= [I + As + O(s^2)]\,[I + Bt + O(t^2)]\,[I - As + O(s^2)]\,[I - Bt + O(t^2)] \tag{45a} \\
&= I + (AB - BA)\, st + \text{terms of order } st^2 \text{ and } s^2 t \text{ and higher} . \tag{45b}
\end{align}

(You should check this computation carefully!) Note that there are no terms $s^n t^0$ with $n \ge 1$,
because $g(s) h(0) g(s)^{-1} h(0)^{-1} = I$ for all $s$ [why?]; and likewise there are no terms $s^0 t^n$ with
$n \ge 1$. So all the terms not explicitly shown here must be of order $st^2$ or $s^2 t$ or higher.
From (45) we see that the quantity $AB - BA$ thus measures the degree to which $g(s)$
fails to commute with $h(t)$, at the lowest nontrivial order (which is order $st$). We therefore
make the definition: the Lie bracket $[A, B]$ is defined by⁹

$$[A, B] = AB - BA . \tag{46}$$

The Lie bracket measures the noncommutativity of the Lie group G in a small neighborhood
of the identity element.
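
The expansion (45) can be tested numerically by taking $g(s) = e^{sA}$ and $h(t) = e^{tB}$ for small $s$ and $t$. The Python sketch below is an illustration under stated assumptions: the two antisymmetric 3 × 3 generators, the step sizes and the helper names are choices made for the example. It forms the group commutator and compares $(g h g^{-1} h^{-1} - I)/(st)$ with $AB - BA$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, terms=30):
    """Truncated power series sum_{k < terms} M^k / k!."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[r + x for r, x in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result

def scale(M, c):
    return [[c * x for x in row] for row in M]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

# Two antisymmetric matrices, i.e. elements of so(3) (sample choices).
A = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
B = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]

s = t = 1e-3
g, h = expm(scale(A, s)), expm(scale(B, t))
g_inv, h_inv = expm(scale(A, -s)), expm(scale(B, -t))   # inverses along the same curves

C = matmul(matmul(g, h), matmul(g_inv, h_inv))          # group commutator
approx = [[(C[i][j] - (1.0 if i == j else 0.0)) / (s * t) for j in range(3)]
          for i in range(3)]
exact = bracket(A, B)
max_error = max(abs(approx[i][j] - exact[i][j]) for i in range(3) for j in range(3))

print(exact)
print(max_error)
```

The residual `max_error` should shrink roughly in proportion to $s + t$, consistent with the neglected terms of order $st^2$ and $s^2 t$.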
Since the curves $g(s)$ and $h(t)$ lie in the group G, their “infinitesimal elements” $A$ and $B$
must belong to the Lie algebra g (by definition); and conversely, any two elements $A$ and $B$
of the Lie algebra g are tangent vectors to suitable smooth curves $g(s)$ and $h(t)$ lying in G.
What about $[A, B]$? We see from (45) that $[A, B]$ is indeed the tangent vector to a curve
lying in G, namely the curve

$$g(\sqrt{t})\, h(\sqrt{t})\, g(\sqrt{t})^{-1} h(\sqrt{t})^{-1} = I + [A, B]\, t + O(t^{3/2}) . \tag{47}$$

(I am not sure whether the $O(t^{3/2})$ here can actually be replaced by $O(t^2)$; but anyway,
formula (47) already suffices to show that the curve in question is at least once continuously
differentiable, with tangent vector $[A, B]$ at $t = 0$.) It follows that $[A, B]$ belongs to the Lie
algebra g.

Conclusion: The Lie bracket [A, B] = AB − BA maps g × g → g.

The Lie bracket has the following fundamental properties:

1. Bilinearity. We have

$$[\alpha_1 A_1 + \alpha_2 A_2, B] = \alpha_1 [A_1, B] + \alpha_2 [A_2, B] \tag{48}$$

and likewise in the second argument $B$.
2. Anticommutativity. We have

$$[A, B] = -[B, A] . \tag{49}$$

In particular it follows that $[A, A] = 0$ (why?).
3. Jacobi identity. We have

$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0 \tag{50}$$

or equivalently (using anticommutativity)

$$[[A, B], C] + [[B, C], A] + [[C, A], B] = 0 \tag{51}$$
⁹ The Lie bracket is also called the commutator of the two matrices $A$ and $B$. It is unfortunate that the
same word “commutator” is used in group theory and in linear algebra for two closely related but slightly
different notions.

The bilinearity and anticommutativity are trivial, and the Jacobi identity is an easy compu-
tation (do it!).
This structure is so important in mathematics that it is given a name: any vector space V
equipped with a product V ×V → V that is bilinear, anticommutative and satisfies the Jacobi
identity is called a Lie algebra (or an abstract Lie algebra if we wish to distinguish it
from the concrete Lie algebras of n × n matrices introduced here).
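
The three properties of the matrix bracket can be spot-checked exactly on integer matrices; this is of course no substitute for doing the computation by hand. In the Python sketch below, the sample matrices and the helper names are arbitrary assumptions made for the illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

# Arbitrary integer sample matrices, so that all arithmetic is exact.
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]
C = [[2, 0], [7, -3]]

# Left-hand side of the Jacobi identity (50).
jacobi = add(bracket(A, bracket(B, C)),
             add(bracket(B, bracket(C, A)), bracket(C, bracket(A, B))))

# Anticommutativity (49): [A, B] + [B, A] should vanish.
antisym = add(bracket(A, B), bracket(B, A))

print(jacobi)
print(antisym)
```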

TO BE CONTINUED
