
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data


Names: Yang, Chen Ning, 1922– editor. | Ge, M. L. (Mo-Lin), editor. | He, Yang-Hui, 1975– editor.
Title: Topology and physics / editors, Chen Ning Yang (Institute for Advanced Study (IAS),
Tsinghua University, China), Mo-Lin Ge (Chern Institute of Mathematics, China),
Yang-Hui He (City University of London, UK).
Description: New Jersey : World Scientific, 2018. | Includes bibliographical references.
Identifiers: LCCN 2018045467| ISBN 9789813278493 (hardcover : alk. paper) |
ISBN 9789813278509 (pbk : alk. paper)
Subjects: LCSH: Topology. | Physics.
Classification: LCC QA611 .T65725 2018 | DDC 514--dc23
LC record available at https://lccn.loc.gov/2018045467

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

Copyright © 2019 by World Scientific Publishing Co. Pte. Ltd.


All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or
mechanical, including photocopying, recording or any information storage and retrieval system now known or to
be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.

For any available supplementary material, please visit
https://www.worldscientific.com/worldscibooks/10.1142/11217#t=suppl

Printed in Singapore


Preface

Early examples of topological concepts in physics∗

C. N. Yang
Institute for Advanced Study, Tsinghua University, P. R. China

In the mid-1940s, S. S. Chern published an “intrinsic proof” of a generalization of
the Gauss–Bonnet Theorem to 4 dimensions. The paper led to the Chern Class and
Chern Numbers, to the new exciting field of global differential geometry, and to new
important topological concepts in other areas of mathematics. André Weil was one
mathematician who was greatly impressed. He wrote an enthusiastic review of the
paper which became very influential.
A few years later, in 1946–1949, several totally unexpected new elementary
particles were discovered by experimental physicists. They were of different kinds,
with very different quantum numbers, and quickly became the center of physicists’
attention.
One day in 1948, I was present at a lunch in which Weil told Fermi his speculation
that these new particles might be related to some topological classification
ideas in geometry. Neither Fermi, nor I, nor others at that lunch, understood what
Weil had meant that day by his speculation across the boundary of math–physics.
Many years later, in the mid-1970s, after I learned from Jim Simons elements of
fiber bundle geometry and related concepts, I realized Weil may have been speculating
that day about possible relationships between the new particles, with their new
quantum numbers, and topological concepts such as the Chern Numbers. For details
please see Ref. 1.
****************************
In an article published in 2012 (Ref. 2), I discussed in some detail the following early
entries of topology into physics:

• The Aharonov–Bohm experiment, proposed theoretically in 1959 and verified
experimentally by Tonomura in 1983–1986.
• In the early 1950s physicists used the new computers to calculate the vibrational
frequency distribution of crystals, and were surprised to find unexplained ups
and downs in the spectra. Were they real? Or just quirks of the computation?
The puzzle was resolved in a 1953 paper of Van Hove which introduced topology,
viz. Morse Theory, into physics.

∗ This chapter also appeared in Modern Physics Letters A, Vol. 33, No. 22 (2018) 1830009. DOI:
10.1142/S0217732318300094.

****************************
That topological concepts are important in physics is now well known, especially in
phenomena/problems involving Abelian or non-Abelian phases. Here is an example
which shows that in one problem in classical Maxwell theory, topology already
plays an essential role:
Consider
    an EM field interacting with both an electric charge e and a magnetic
    charge g,
a problem which had been considered by Dirac in 1931 (Ref. 3). The electromagnetic
potential (i.e. the connection), when analytically continued, forms a complicated
nontrivial manifold. The action integral $\mathcal{A}$ is then definable only modulo
$4\pi e g$ (Ref. 4). If one tries to quantize this theory, à la Feynman’s path integral,
one would be dealing with the quantity
\[ \exp(i\mathcal{A}/\hbar) , \]
which is meaningful only if
\[ 2eg/\hbar = \text{an integer} . \]
This condition, first given by Dirac, is thus a consequence of the topology of classical
Maxwell theory.
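
The step from the ambiguity of the action to Dirac's condition can be spelled out in one line. The following is a sketch under stated assumptions: the action is written $\mathcal{A}$, the constant in the exponent is read as the reduced Planck constant $\hbar$, and the integer label $n$ is introduced here for illustration only.

```latex
% Require the path-integral weight to be unchanged when the action
% is shifted by its ambiguity 4*pi*e*g.
\begin{align*}
\exp\!\left(\frac{i(\mathcal{A} + 4\pi e g)}{\hbar}\right)
  = \exp\!\left(\frac{i\mathcal{A}}{\hbar}\right)
&\iff \frac{4\pi e g}{\hbar} = 2\pi n , \quad n \in \mathbb{Z} \\
&\iff \frac{2 e g}{\hbar} = n .
\end{align*}
```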

References
1. C. N. Yang, Phys. Today 65, 33 (2012).
2. C. N. Yang, Int. J. Mod. Phys. A 27, 1230035 (2012).
3. P. A. M. Dirac, Proc. R. Soc. London A 133, 60 (1931).
4. T. T. Wu and C. N. Yang, Phys. Rev. D 14, 437 (1976).
Contents

Preface: Early examples of topological concepts in physics
C. N. Yang

1. Complex geometry of nuclei and atoms
M. F. Atiyah and N. S. Manton

2. Developments in topological gravity
Robbert Dijkgraaf and Edward Witten

3. Majorana Fermions and representations of the braid group
Louis H. Kauffman

4. Arithmetic gauge theory: A brief introduction
Minhyong Kim

5. Singularity theorems
Roger Penrose

6. Beyond anyons
Zhenghan Wang

7. Four revolutions in physics and the second quantum revolution:
A unification of force and matter by quantum information
Xiao-Gang Wen

8. Topological insulators from the perspective of first-principles calculations
Haijun Zhang and Shou-Cheng Zhang

Appendix: SO4 symmetry in a Hubbard model
C. N. Yang and Shou-Cheng Zhang

Chapter 1

Complex geometry of nuclei and atoms∗

M. F. Atiyah
School of Mathematics, University of Edinburgh,
James Clerk Maxwell Building,
Peter Guthrie Tait Road, Edinburgh EH9 3FD, UK

N. S. Manton
Department of Applied Mathematics and Theoretical Physics,
University of Cambridge,
Wilberforce Road, Cambridge CB3 0WA, UK

We propose a new geometrical model of matter, in which neutral atoms are modelled
by compact, complex algebraic surfaces. Proton and neutron numbers are determined
by a surface’s Chern numbers. Equivalently, they are determined by combinations of
the Hodge numbers, or the Betti numbers. Geometrical constraints on algebraic surfaces
allow just a finite range of neutron numbers for a given proton number. This range
encompasses the known isotopes.

Keywords: Atoms; nuclei; algebraic surfaces; 4-manifolds.

PACS numbers: 02.40.Tt, 02.40.Re, 21.60.−n

1. Introduction
It is an attractive idea to interpret matter geometrically, and to identify conserved
attributes of matter with topological properties of the geometry. Kelvin made the
pioneering suggestion to model atoms as knotted vortices in an ideal fluid.1 Each
atom type would correspond to a distinct knot, and the conservation of atoms in
physical and chemical processes (as understood in the 19th century) would follow
from the inability of knots to change their topology. Kelvin’s model has not survived
because atoms are now known to be structured and divisible, with a nucleus formed
of protons and neutrons bound together, surrounded by electrons. At high energies,
these constituents can be separated. It requires of order 1 eV to remove an electron
from an atom, but a few MeV to remove a proton or neutron from a nucleus.

∗ This chapter also appeared in International Journal of Modern Physics A, Vol. 33, No. 24 (2018)
1830022. DOI: 10.1142/S0217751X18300223.

Atomic and nuclear physics has progressed, mainly by treating protons, neutrons
and electrons as point particles, interacting through electromagnetic and
strong nuclear forces.2 Quantum mechanics is an essential ingredient, and leads
to a discrete spectrum of energy levels, both for the electrons and nuclear particles.
The nucleons (protons and neutrons) are themselves built from three pointlike
quarks, but little understanding of nuclear structure and binding has so far
emerged from quantum chromodynamics (QCD), the theory of quarks. These point
particle models are conceptually not very satisfactory, because a point is clearly
an unphysical idealisation, a singularity of matter and charge density. An infinite
charge density causes difficulties both in classical electrodynamics3 and in quantum
field theories of the electron. Smoother structures carrying the discrete information
of proton, neutron and electron number would be preferable.
In this paper, we propose a geometrical model of neutral atoms where both the
proton number P and neutron number N are topological and none of the constituent
particles are pointlike. In a neutral atom the electron number is also P , because the
electron’s electric charge is exactly the opposite of the proton’s charge. For given
P , atoms (or their nuclei) with different N are known as different isotopes.
A more recent idea than Kelvin’s is that of Skyrme, who proposed a nonlinear
field theory of bosonic pion fields in 3 + 1 dimensions with a single topological
invariant, which Skyrme identified with baryon number.4,5 Baryon number (also
called atomic mass number) is the sum of the proton and neutron numbers, B =
P + N . Skyrme’s baryons are solitons in the field theory, so they are smooth,
topologically stable field configurations. Skyrme’s model was designed to model
atomic nuclei, but electrons can be added to produce a model of a complete atom.
Protons and neutrons can be distinguished in the Skyrme model, but only after
the internal rotational degrees of freedom are quantised.6 This leads to a quantised
“isospin,” with the proton having isospin up, $I_3 = \frac{1}{2}$, and the neutron having
isospin down, $I_3 = -\frac{1}{2}$, where $I_3$ is the third component of isospin. The model is
consistent with the well-known Gell-Mann–Nishijima relation7
\[ Q = \frac{1}{2} B + I_3 , \tag{1.1} \]
where $Q$ is the electric charge of a nucleus (in units of the proton charge) and
$B$ is the baryon number. $Q$ is integral, because $I_3$ is integer-valued (half-integer-valued)
when $B$ is even (odd). $Q$ equals the proton number $P$ of the nucleus and
also the electron number of a neutral atom. The neutron number is $N = \frac{1}{2} B - I_3$.
The Skyrme model has had considerable success providing models for nuclei.8–11
Despite the pion fields being bosonic, the quantised Skyrmions have half-integer
spin if B is odd.12 But a feature of the model is that proton number and neutron
number are not separately topological, and electrons have to be added on.
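
The bookkeeping in relation (1.1) can be checked with a few lines of Python. This is a minimal sketch, not part of the Skyrme model itself; the function name `nucleus_numbers` is hypothetical, and exact fractions are used so that half-integer isospin is handled without rounding.

```python
from fractions import Fraction


def nucleus_numbers(B, I3):
    """Return (Q, N) for baryon number B and isospin component I3.

    Q = B/2 + I3 is relation (1.1); N = B/2 - I3 follows from B = P + N with P = Q.
    """
    I3 = Fraction(I3)
    Q = Fraction(B, 2) + I3
    N = Fraction(B, 2) - I3
    if Q.denominator != 1:
        # I3 must be integer-valued for even B and half-integer-valued for odd B.
        raise ValueError("B and I3 do not give an integral charge")
    return int(Q), int(N)


print(nucleus_numbers(1, Fraction(1, 2)))    # proton: (1, 0)
print(nucleus_numbers(1, Fraction(-1, 2)))   # neutron: (0, 1)
print(nucleus_numbers(4, 0))                 # B = 4, zero isospin: (2, 2)
```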
The Skyrme model has a relation to 4-dimensional fields that provides some
motivation for the ideas discussed in this paper. A Skyrmion can be well approximated
by a projection of a 4-dimensional Yang–Mills field. More precisely, one can
take an SU(2) Yang–Mills instanton and calculate its holonomy along all lines in the
(Euclidean) time direction.13 The result is a Skyrme field in 3-dimensional space,
whose baryon number B equals the instanton number.
So a quasi-geometrical structure in 4-dimensional space (a Yang–Mills instanton
in flat $\mathbb{R}^4$) can be closely related to nuclear structure, but still there is just one
topological charge. A next step, first described in Ref. 14, was to propose an identification
of smooth, curved 4-manifolds with the fundamental particles in atoms:
the proton, neutron and electron. Suitable examples of manifolds were suggested.
These manifolds were not all compact, and the particles they modelled were not all
electrically neutral. One of the more compelling examples was Taub-NUT space as
a model for the electron. By studying the Dirac operator on the Taub-NUT background,
it was shown how the spin of the electron can arise in this context.15 There
has also been an investigation of multi-electron systems modelled by multi-Taub-NUT
space.16,17 However, there are some technical difficulties with the models of
the proton and neutron, and no way has yet been found to geometrically combine
protons and neutrons into more complicated nuclei surrounded by electrons.
Nor is it clear in this context what exactly should be the topological invariants
representing proton and neutron number.
A variant of these ideas is a model for the simplest atom, the neutral hydrogen
atom, with one proton and one electron. This appears to be well modelled by $\mathbb{CP}^2$,
the complex projective plane (see footnote a). The fundamental topological property
of $\mathbb{CP}^2$ is that it has a generating 2-cycle with self-intersection 1. The second
Betti number is $b_2 = 1$, which splits into $b_2^+ = 1$ and $b_2^- = 0$. A complex line in
$\mathbb{CP}^2$ represents this cycle, and in the projective plane, two lines always intersect
in one point. A copy of this cycle together with its normal neighbourhood can be
interpreted as the proton part of the atom, whereas the neighbourhood of a point dual
to this is interpreted as the electron. The neighbourhood of a point is just a 4-ball,
with a 3-sphere boundary, but this is the same as in the Taub-NUT model of the
electron, which is topologically just $\mathbb{R}^4$. The 3-sphere is a twisted circle bundle
over a 2-sphere (the Hopf fibration) and this is sufficient to account for the electron charge.
In this paper, we have a novel proposal for the proton and neutron numbers.
The 4-manifolds we consider are compact, to model neutral atoms. Our previous
models always required charged particles to be noncompact so that the electric
flux could escape to infinity, and this is an idea we will retain. We also restrict
our manifolds to be complex algebraic surfaces, and their Chern numbers will be
related to the proton and neutron numbers. There are more than enough examples
to model all currently known isotopes of atoms. We will retain $\mathbb{CP}^2$ as the model
for the hydrogen atom.

a $\mathbb{CP}^2$ had a different interpretation in Ref. 14.


Fig. 1. The Hodge diamond for a general complex surface (left) and its entries in terms of Betti
numbers for an algebraic surface (right).

2. Topology and Physics of Algebraic Surfaces


Complex surfaces18 provide a rich supply of compact 4-manifolds. They are principally
classified by two integer topological invariants, denoted $c_1^2$ and $c_2$. For a
surface $X$, $c_1$ and $c_2$ are the Chern classes of the complex tangent bundle. $c_2$ is an
integer because $X$ has real dimension 4, whereas the (dual of the) canonical class $c_1$
is a particular 2-cycle in the second homology group, $H_2(X)$. $c_1^2$ is the intersection
number of $c_1$ with itself, and hence another integer.
There are several other topological invariants of a surface $X$, but many are
related to $c_1^2$ and $c_2$. Among the most fundamental are the Hodge numbers. These
are the dimensions of the Dolbeault cohomology groups of holomorphic forms. In
two complex dimensions the Hodge numbers are denoted $h^{i,j}$ with $0 \le i, j \le 2$.
They are arranged in a Hodge diamond, as illustrated in Fig. 1. Serre duality, a
generalisation of Poincaré duality, requires this diamond to be unchanged under a
$180^\circ$ rotation. For a connected surface, $h^{0,0} = h^{2,2} = 1$.
Complex algebraic surfaces are a fundamental subclass of complex surfaces.19,20
A complex algebraic surface can always be embedded in a complex projective space
$\mathbb{CP}^n$, and thereby acquires a Kähler metric from the ambient Fubini–Study metric
on $\mathbb{CP}^n$. For any Kähler manifold, the Hodge numbers have an additional symmetry,
$h^{i,j} = h^{j,i}$. For a surface, this gives just one new relation, $h^{0,1} = h^{1,0}$. Not all
complex surfaces are algebraic: some are still Kähler and satisfy this additional
relation, but some are not Kähler and do not satisfy it.
Particularly interesting for us are the holomorphic Euler number $\chi$, which is an
alternating sum of the entries on the top right (or equivalently, bottom left) diagonal
of the Hodge diamond, and the analogous quantity for the middle diagonal, which
we denote $\theta$. More precisely,
\[ \chi = h^{0,0} - h^{0,1} + h^{0,2} , \tag{2.1} \]
\[ \theta = -h^{1,0} + h^{1,1} - h^{1,2} . \tag{2.2} \]
(Note the sign choice for $\theta$.) The Euler number $e$ and signature $\tau$ can be expressed
in terms of these as
\[ e = 2\chi + \theta , \tag{2.3} \]
\[ \tau = 2\chi - \theta . \tag{2.4} \]
The first of these formulae reduces to the more familiar alternating sum of Betti
numbers $e = b_0 - b_1 + b_2 - b_3 + b_4$, because each Betti number is the sum of the
entries in the corresponding row of the Hodge diamond. The second formula is the
less trivial Hodge index theorem. $\tau$ is more fundamentally defined by the splitting
of the second Betti number into positive and negative parts, $b_2 = b_2^+ + b_2^-$. Over the
reals the intersection form on the second homology group $H_2(X)$ is nondegenerate
and can be diagonalised. $b_2^+$ is then the dimension of the positive subspace, and $b_2^-$
the dimension of the negative subspace. The signature is $\tau = b_2^+ - b_2^-$.

Fig. 2. Hodge diamonds for the projective plane $\mathbb{CP}^2$ (left) and for a K3 surface (right).

The Chern numbers are related to $\chi$ and $\theta$ through the formulae
\[ c_1^2 = 2e + 3\tau = 10\chi - \theta , \qquad c_2 = e = 2\chi + \theta . \tag{2.5} \]
Their sum gives the Noether formula $\chi = \frac{1}{12}(c_1^2 + c_2)$, which is always integral.
For an algebraic surface, there are just three independent Hodge numbers and
they are uniquely determined by the Betti numbers $b_1$, $b_2^+$ and $b_2^-$. The Hodge
diamond must take the form shown on the right in Fig. 1, which gives the correct
values for $b_1$, $e$ and $\tau$. Note that $b_1$ must be even and $b_2^+$ must be odd. $\chi$ and $\theta$ are
now given by
\[ \chi = \frac{1}{2}(1 - b_1 + b_2^+) , \tag{2.6} \]
\[ \theta = 1 - b_1 + b_2^- . \tag{2.7} \]
If $X$ is simply connected, which accounts for many examples, then $b_1 = 0$. Hodge
diamonds for the projective plane $\mathbb{CP}^2$ and for a K3 surface, both of which are
simply connected, are shown in Fig. 2. For the projective plane $\chi = 1$ and $\theta = 1$,
so $e = 3$ and $\tau = 1$, and for a K3 surface $\chi = 2$ and $\theta = 20$, so $e = 24$ and $\tau = -16$.
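
The chain from Hodge numbers to $\chi$, $\theta$, $e$, $\tau$ and the Chern numbers can be traced numerically. The sketch below is illustrative only: the function names are hypothetical, and the Hodge entries used for the projective plane and the K3 surface are the standard values, assumed here because Fig. 2 is not reproduced in the text. They do reproduce the values of $\chi$ and $\theta$ quoted above.

```python
def kahler_surface_diamond(h01, h02, h11):
    """Full Hodge diamond of a connected Kaehler surface from its three free entries.

    Uses h^{0,0} = h^{2,2} = 1, Serre duality h^{i,j} = h^{2-i,2-j}
    and the Kaehler symmetry h^{i,j} = h^{j,i}.
    """
    return {
        (0, 0): 1,   (0, 1): h01, (0, 2): h02,
        (1, 0): h01, (1, 1): h11, (1, 2): h01,
        (2, 0): h02, (2, 1): h01, (2, 2): 1,
    }


def invariants(h):
    """Evaluate Eqs. (2.1)-(2.5) for a Hodge diamond h[(i, j)]."""
    chi = h[0, 0] - h[0, 1] + h[0, 2]        # Eq. (2.1)
    theta = -h[1, 0] + h[1, 1] - h[1, 2]     # Eq. (2.2)
    e = 2 * chi + theta                      # Eq. (2.3)
    tau = 2 * chi - theta                    # Eq. (2.4)
    return {"chi": chi, "theta": theta, "e": e, "tau": tau,
            "c1_squared": 2 * e + 3 * tau, "c2": e}   # Eq. (2.5)


def euler_from_betti(h):
    """Alternating sum of Betti numbers, with b_k the sum of h^{i,j} over i + j = k."""
    betti = [sum(v for (i, j), v in h.items() if i + j == k) for k in range(5)]
    return sum((-1) ** k * b for k, b in enumerate(betti))


cp2 = kahler_surface_diamond(h01=0, h02=0, h11=1)   # assumed standard diamond
k3 = kahler_surface_diamond(h01=0, h02=1, h11=20)   # assumed standard diamond
print(invariants(cp2))
print(invariants(k3))
```

The helper `euler_from_betti` gives an independent cross-check that Eq. (2.3) agrees with the alternating sum of Betti numbers for these two examples.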
Our proposal is to model neutral atoms by complex algebraic surfaces and to
interpret $\chi$ as proton number $P$, and $\theta$ as baryon number $B$. So neutron number is
$N = \theta - \chi$. This proposal fits with $\mathbb{CP}^2$ having $P = 1$ and $N = 0$. We will see later
that for each positive value of $P$ there is an interesting, finite range of allowed $N$
values.
In terms of $e$ and $\tau$,
\[ P = \frac{1}{4}(e + \tau) , \qquad B = \frac{1}{2}(e - \tau) , \qquad N = \frac{1}{4}(e - 3\tau) . \tag{2.8} \]
Note that for a general, real 4-manifold, these formulae for $P$ and $N$ might be
fractional, and would need modification. It is also easy to verify that in terms of $P$
and $N$,
\[ c_1^2 = 9P - N , \tag{2.9} \]
\[ c_2 = e = 3P + N , \tag{2.10} \]
\[ \tau = P - N . \tag{2.11} \]
The simple relation of signature $\tau$ to the difference between proton and neutron
numbers is striking. If we write $N = P + N_{\mathrm{exc}}$, where $N_{\mathrm{exc}}$ denotes the excess of
neutrons over protons (which is usually zero or positive, but can be negative), then
$\tau = -N_{\mathrm{exc}}$.
If an algebraic surface $X$ is simply connected then $b_1 = 0$, and in terms of $P$
and $N$,
\[ b_2^+ = 2P - 1 , \qquad b_2^- = P + N - 1 = 2P - 1 + N_{\mathrm{exc}} . \tag{2.12} \]
These formulae will be helpful when we consider intersection forms in more detail.
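
Equations (2.8) to (2.12) amount to a small dictionary between surface invariants and atomic numbers. The following Python sketch makes that dictionary explicit; the function names are hypothetical, and the examples are the surfaces already mentioned in the text (the projective plane, a K3 surface, and the quadric with e = 4 and τ = 0).

```python
def atom_from_surface(e, tau):
    """Proton and neutron numbers (P, N) from Euler number e and signature tau, Eq. (2.8)."""
    if (e + tau) % 4 != 0:
        # For a general real 4-manifold the formulae can give fractions.
        raise ValueError("(e + tau) is not divisible by 4, so P would be fractional")
    P = (e + tau) // 4
    N = (e - 3 * tau) // 4
    return P, N


def surface_from_atom(P, N):
    """Chern numbers and signature from (P, N), Eqs. (2.9)-(2.11)."""
    return {"c1_squared": 9 * P - N, "c2": 3 * P + N, "tau": P - N}


def betti_simply_connected(P, N):
    """(b2_plus, b2_minus) for a simply connected surface (b1 = 0), Eq. (2.12)."""
    return 2 * P - 1, P + N - 1


print(atom_from_surface(3, 1))       # projective plane: (1, 0)
print(atom_from_surface(24, -16))    # K3 surface: (2, 18)
print(atom_from_surface(4, 0))       # quadric: (1, 1)
print(betti_simply_connected(2, 18)) # K3 surface: (3, 19)
```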
The class of surfaces that we will use, as models of atoms, are those with $c_1^2$ and
$c_2$ non-negative. Many of these are minimal surfaces of general type. Perhaps the
most important results on the geometry of algebraic surfaces are certain inequalities
that the Chern numbers of minimal surfaces of general type have to satisfy. The
basic inequalities are that $c_1^2$ and $c_2$ are positive. Also, there is the Bogomolov–
Miyaoka–Yau (BMY) inequality which requires $c_1^2 \le 3c_2$, and finally there is the
Noether inequality $5c_1^2 - c_2 + 36 \ge 0$. These inequalities can be converted into the
following inequalities on $P$ and $N$:
\[ P > 0 , \qquad 0 \le N < 9P , \qquad N \le 7P + 6 . \tag{2.13} \]
All integer values of $P$ and $N$ satisfying these are allowed. The allowed region is
shown in Fig. 3, and corresponds to the allowed region shown on page 229 of Ref. 18,
or in the article, Ref. 21.
There are also the elliptic surfaces (including the Enriques surface and K3 surface)
where $c_1^2 = 0$ and $c_2$ is non-negative, and we shall include these among our
models. Here, $P \ge 0$ and $N = 9P$, so $c_2 = 12P$ and $\tau = -8P$. $\mathbb{CP}^2$ is also allowed,
even though it is rational and not of general type, because $c_1^2$ and $c_2$ are positive.
In addition to $\mathbb{CP}^2$, there are further surfaces on the BMY line $c_1^2 = 3c_2$ (Ref. 22), which
have $P > 1$ and $N = 0$.
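
The conversion of the Chern-number inequalities into a range of neutron numbers can be sketched in Python. This is an illustration under stated assumptions: the function names are hypothetical, the general-type test applies the four inequalities above after substituting Eqs. (2.9) and (2.10), and the elliptic line N = 9P is added literally as described in the text, without any claim about how it should be treated at large P.

```python
def is_general_type_point(P, N):
    """Chern-number inequalities for minimal surfaces of general type, in terms of (P, N)."""
    c1_squared, c2 = 9 * P - N, 3 * P + N        # Eqs. (2.9), (2.10)
    return (c1_squared > 0 and c2 > 0
            and c1_squared <= 3 * c2             # BMY inequality
            and 5 * c1_squared - c2 + 36 >= 0)   # Noether inequality


def is_elliptic_point(P, N):
    """Elliptic surfaces as described in the text: c1^2 = 0, i.e. N = 9P with P >= 0."""
    return P >= 0 and N == 9 * P


def allowed_neutron_numbers(P):
    """All N in 0..9P admitted by either of the two tests above."""
    return [N for N in range(0, 9 * P + 1)
            if is_general_type_point(P, N) or is_elliptic_point(P, N)]


print(allowed_neutron_numbers(1))   # N = 0, ..., 9
print(allowed_neutron_numbers(2))   # N = 0, ..., 18
```

For P = 1 and P = 2 this reproduces the ranges 0 to 9 and 0 to 18 that the text quotes for hydrogen and helium.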
Physicists usually denote an isotope by proton number and baryon number,
where proton number P is determined by the chemical name, and baryon number
is P + N . For example, the notation $^{56}$Fe means the isotope of iron with P = 26
and N = 30. The currently recognised isotopes are shown in Fig. 4.
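
The isotope notation can be unpacked mechanically. The sketch below is a hypothetical helper, not a general chemistry tool: labels are written in plain text as `56Fe`, and the lookup table contains only a handful of elements mentioned in this chapter.

```python
import re

# A few proton numbers for illustration; a full periodic table would be needed in practice.
PROTON_NUMBER = {"H": 1, "He": 2, "O": 8, "Ca": 20, "Fe": 26, "U": 92}


def parse_isotope(label):
    """Turn a label such as '56Fe' (baryon number, then chemical symbol) into (P, N)."""
    match = re.fullmatch(r"(\d+)([A-Z][a-z]?)", label)
    if match is None:
        raise ValueError(f"unrecognised isotope label: {label!r}")
    baryon_number, symbol = int(match.group(1)), match.group(2)
    P = PROTON_NUMBER[symbol]
    return P, baryon_number - P


print(parse_isotope("56Fe"))   # (26, 30)
```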
The shape of the allowed region of algebraic surfaces qualitatively matches the
region of recognised isotopes, and this is the main justification for our proposal.

Fig. 3. Proton numbers P, and neutron numbers N, for atoms modelled as algebraic surfaces.
The allowed region is limited by inequalities on the Chern numbers, as discussed in the text. Note
the change of slope from 9 to 7 at the point P = 3, N = 27 on the boundary. The line N = P
corresponds to surfaces with zero signature, i.e. τ = 0.

Fig. 4. Nuclear isotopes. The horizontal axis is proton number P (Z in physics notation) and the
vertical axis is neutron number N . The shading (colouring online) indicates the lifetime of each
isotope, with black denoting stability (infinite lifetime).

For example, for P = 1, the geometric inequalities allow N to take values from 0
up to 9. This corresponds to a possible range of hydrogen isotopes from $^{1}$H to $^{10}$H.
Physically, the well-known hydrogen isotopes are the proton, deuterium and tritium,
that is, $^{1}$H, $^{2}$H and $^{3}$H respectively, but nuclear physicists recognise isotopes of a
quasi-stable nature (resonances) up to $^{7}$H, with N = 6.
The minimal models for the common isotopes, the proton alone, and deuterium,
each bound to one electron, are $\mathbb{CP}^2$ and the complex quadric surface Q. The quadric
is the product $Q = \mathbb{CP}^1 \times \mathbb{CP}^1$, with e = 4 and τ = 0. We shall say more about its
intersection form below.
For P = 2, N is geometrically allowed in the range 0 to 18. The corresponding
algebraic surfaces should model helium isotopes from $^{2}$He to $^{20}$He. Isotopes from
$^{3}$He up to $^{10}$He are physically recognised. All of these potentially form neutral
atoms with two electrons. The helium isotope $^{2}$He with no neutrons is not listed
in some nuclear tables, but there does exist an unbound diproton resonance, and
diprotons are sometimes emitted when heavier nuclei decay. The most common,
stable helium isotope is $^{4}$He, with two protons and two neutrons, but $^{3}$He is also
stable. $^{4}$He nuclei are also called alpha-particles, and play a key role in nuclear
processes and nuclear structure. It is important to have a good geometrical model
of an alpha-particle, which ideally should match the cubically symmetric B = 4
Skyrmion that is a building block for many larger Skyrmions.9,11,23,24

3. Valley of Stability
Running through the nuclear isotopes is the valley of stability.2 In Fig. 4, this is
the irregular curved line of stable nuclei marked in black. On either side, the nuclei
are unstable, with lifetimes of many years near the centre of the valley, reducing
to microseconds further away. Sufficiently far from the centre are the nuclear drip
lines, where a single additional proton or neutron has no binding at all, and falls
off in a time of order $10^{-23}$ seconds.
For small nuclei, for P up to about 20, the valley is centred on the line N = P .
In the geometrical model, this line corresponds to surfaces with signature τ = 0.
For larger P , nuclei in the valley have a neutron excess, Nexc , which increases slowly
from just a few when P is near 20 to over 50 for the quasi-stable uranium isotopes
with P = 92, and slightly more for the heaviest artificially produced nuclei with P
approaching 120.
In standard nuclear models, the main effect explaining the valley is the Pauli
principle. Protons and neutrons have a sequence of rather similar 1-particle states
of increasing energy, and just one particle can be in each state. For given baryon
number, the lowest-energy state has equal proton and neutron numbers, filling the
lowest available states. If one proton is replaced by one neutron, the proton state
that is emptied has lower energy than the neutron state that is filled, so the total
energy goes up. An important additional effect is a pairing energy that favours
protons to pair up and neutrons to pair up. Most nuclei with P and N both odd
are unstable as a result.
For larger values of P , the single-particle proton energies tend to be higher
than the single-particle neutron energies, because in addition to the attractive,
strong nuclear forces which are roughly the same for protons and neutrons, there
is the electrostatic Coulomb repulsion that acts between protons alone. This effect
becomes important for nuclei with large P , and favours neutron-rich nuclei. It also
explains the instability of all nuclei with P larger than 83. These nuclei simply
split up into smaller nuclei, either by emitting an alpha-particle, or by fissioning
into larger fragments. However, the lifetimes can be billions of years in some cases,
which is why uranium, with P = 92, is found in nature in relatively large quantities.
Note that if N = P , then the electric charge is half the baryon number, and
according to formula (1.1), the third component of isospin is zero. By studying
nuclear ground states and excited states, one can determine the complete isospin,
and it is found to be minimal for stable nuclei. So nuclei with N = P have zero
isospin. When the baryon number is odd, the most stable nuclei have N just one
greater than P (if P is not too large), and the isospin is $\frac{1}{2}$. Within the Skyrme
model, isospin arises from the quantisation of internal degrees of freedom, associated
with an SO(3) symmetry acting on the pion fields. There is an energy contribution
proportional to the squared isospin operator $\mathbf{I}^2$, analogous to the spin energy
proportional to $\mathbf{J}^2$. In the absence of Coulomb effects, the energy is minimised by
fixing the isospin to be zero or $\frac{1}{2}$. The Coulomb energy competes with isospin, and
shifts the total energy minimum towards neutron-rich nuclei.
These are the general trends of nuclear energies and lifetimes. However there
is a lot more in the detail. Each isotope has its own character, depending on its
proton and neutron numbers. This is most clear in the energy spectra of excited
states, and the spins of the ground and excited states. Particularly interesting is the
added stability of nuclei where either the proton or neutron number is magic. The
smaller magic numbers are 2, 8, 20, 28, 50. It is rather surprising that protons and
neutrons can be treated independently with regard to the magic properties. This
appears to contradict the importance of isospin, in which protons and neutrons are
treated as strongly influencing each other.
Particularly stable nuclei are those that are doubly magic, like $^{4}$He, $^{16}$O, $^{40}$Ca
and $^{48}$Ca. $^{40}$Ca is the largest stable nucleus with N = P . $^{48}$Ca is also stable, and
occurs in small quantities in nature, but is exceptionally neutron-rich for a relatively
small nucleus.
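
The notion of a doubly magic nucleus is easy to state as a check on (P, N). The sketch below uses only the smaller magic numbers listed in the text; the function name `magic_status` is hypothetical.

```python
MAGIC_NUMBERS = {2, 8, 20, 28, 50}   # the smaller magic numbers quoted in the text


def magic_status(P, N):
    """Classify a nucleus by whether its proton and neutron numbers are magic."""
    magic_P, magic_N = P in MAGIC_NUMBERS, N in MAGIC_NUMBERS
    if magic_P and magic_N:
        return "doubly magic"
    if magic_P or magic_N:
        return "magic"
    return "not magic"


for name, P, N in [("4He", 2, 2), ("16O", 8, 8), ("40Ca", 20, 20),
                   ("48Ca", 20, 28), ("56Fe", 26, 30)]:
    print(name, magic_status(P, N))
```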
The important issue for us here is to what extent our proposed geometrical
model based on algebraic surfaces is compatible with these nuclear phenomena,
not forgetting the electron structure in a neutral atom. There are some broad
similarities. First there is the “geography” of surfaces we have discussed above,
implying that the geometrical inequalities restrict the range of neutron numbers.
Algebraic geometers also refer to “botany,” the careful construction and study of
surfaces with particular topological invariants. The patterns are very complicated.
Some surfaces are simple to construct, others less so, and their internal structure is
very variable. This is analogous to the complications of the nuclear landscape, and
the similar complications (better understood) of the electron orbitals and atomic
shell structure.
Rather remarkable is that the line of nuclear stability where N = P corresponds
to the simple geometrical condition that the signature τ is zero. We have not yet
tried to pinpoint an energy function on the space of surfaces, but clearly it would be
easy to include a dominant contribution proportional to $\tau^2$, whose minimum would
be in the desired place. Mathematicians have discovered that it is much easier to
construct surfaces on this line, and on the neutron-rich side of it, where τ is negative,
than on the proton-rich side. There are always minimal surfaces on the neutron-rich
side which are simply connected, but not everywhere on the proton-rich side. The
geometry of surfaces therefore distinguishes protons from neutrons rather clearly.
This is attractive for the physical interpretation, as it can be regarded as a prediction
of an asymmetry between the proton and neutron. In standard nuclear physics
it is believed that in an ideal world with no electromagnetic effects, there would be
an exact symmetry between the proton and neutron, but in reality they are not the
same, partly because of Coulomb energy, but more fundamentally, because their
constituent up (u) and down (d) quarks are not identical in their masses, making
the proton (uud) less massive than the neutron (udd), despite its electric charge.
The geometrical model would need an energy contribution that favours neutrons
over protons for the larger nuclei and atoms. One possibility has been explored by
LeBrun [25, 26]. This is the infimum, over complex surfaces with given topology, of the
$L^2$ norm of the scalar curvature. For surfaces with $b_1$ even, including all surfaces
that are simply connected, this infimum is simply a constant multiple of $c_1^2$. The
scalar curvature can be zero for surfaces on the line $c_1^2 = 0$, for example the K3
surface, which is the extreme of neutron-richness, with P = 2 and N = 18. It would
be interesting to consider more carefully the energy landscape for an energy that
combines $\tau^2$ and a positive multiple of the $L^2$ norm of scalar curvature.
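
As a purely illustrative sketch of such a combined energy, one can write a toy function of P and N. The coefficients a and b are hypothetical. The relations $\tau = P - N$ and $c_1^2 = 9P - N$ are assumptions, inferred for simply connected surfaces from the examples quoted in this chapter ($\mathbb{CP}^2$ with P = 1, N = 0 and K3 with P = 2, N = 18) and from the standard values $c_1^2 = 9$ and $c_1^2 = 0$ for those two surfaces. They are not taken from LeBrun's work.

```latex
\begin{align*}
E(P, N) &= a\,\tau^2 + b\,c_1^2 = a\,(P - N)^2 + b\,(9P - N), \qquad a, b > 0, \\
\frac{\partial E}{\partial N} &= -2a\,(P - N) - b = 0
\quad\Longrightarrow\quad N - P = \frac{b}{2a} > 0 .
\end{align*}
```

Under these assumptions the minimum at fixed P moves off the line N = P towards the neutron-rich side, which is the qualitative effect sought in the text. Whether a realistic energy behaves in this way is not established by this sketch.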

4. Intersection Form
A complex surface X is automatically oriented, so any pair of 2-cycles has an unambiguous
intersection number [27]. Given a basis $\alpha_i$ of 2-cycles for the second homology
group $H_2(X)$, the matrix $\Omega_{ij}$ of intersection numbers is called the intersection form
of X. $\Omega_{ij} \equiv \Omega(\alpha_i, \alpha_j)$ is the intersection number of basis cycles $\alpha_i$ and $\alpha_j$, and the
self-intersection number $\Omega_{ii}$ is the intersection number of $\alpha_i$ with a generic smooth
deformation of itself. Ω is a symmetric matrix of integers, and by Poincaré duality
it is unimodular (of determinant ±1). Over the reals, such a symmetric matrix can be
brought by a change of basis to diagonal form, with diagonal entries either +1 or −1.
The numbers of each of these are $b_2^+$ and $b_2^-$, respectively, and we have already given
an interpretation of them for simply-connected algebraic surfaces X in terms of P and N
in Eq. (2.12) above.
However, diagonalisation over the reals does not make sense for cycles, because
one can end up with fractional cycles in the new basis. One may only change the
basis of cycles using an invertible matrix S of integers, whose effect is to replace Ω
by $S^{T} \Omega S$. The classification of intersection forms is finer over the integers
than the reals.
For almost all algebraic surfaces, Ω is indefinite. $b_2^+$ is always positive, and $b_2^-$
is positive too, except for surfaces with $b_1 = 0$ and B = θ = 1. So the only surfaces
for which the intersection form Ω is definite are $\mathbb{CP}^2$, and perhaps additionally the
fake projective planes, for which we have not found a physical interpretation. For
$\mathbb{CP}^2$, with P = 1 and N = 0, the intersection form is the 1 × 1 matrix Ω = (1).
Nondegenerate, indefinite forms over the integers have a rather simple classification.
The basic dichotomy is between those that are odd and those that are even. An
odd form is one for which at least one entry $\Omega_{ii}$ is odd, or more invariantly, Ω(α, α)
is odd for some 2-cycle α. An odd form can always be diagonalised, with entries +1
and −1 on the diagonal.
Even forms are more interesting. Here Ω(α, α) is even for any cycle α. The
simplest example is
$$\Omega = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} . \tag{4.1}$$
This is the intersection form of the quadric Q, with the two $\mathbb{CP}^1$ factors as basis
cycles, $\alpha_1$ and $\alpha_2$. If $\alpha = x\alpha_1 + y\alpha_2$ then Ω(α, α) = 2xy, so is always even. Over the
reals this form can be diagonalised and has entries +1 and −1 (the eigenvalues). So
it has zero signature. But the diagonalisation involves fractional matrices, and is
not possible over the integers. The intersection form (4.1) is called the “hyperbolic
plane.” A second ingredient in even intersection forms is the matrix $-E_8$. This
is the negative of the Cartan matrix of the Lie algebra $E_8$ (with diagonal entries
−2). It is even and unimodular. By itself this form is negative definite, but when
combined with hyperbolic plane components, the result is indefinite, as needed.
The most general (indefinite) even intersection form for an algebraic surface can be
brought to the block diagonal form
$$\Omega = l \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \oplus m(-E_8) , \tag{4.2}$$
with l > 0 and m ≥ 0. l must be odd, and the Betti numbers are $b_2^+ = l$ and
$b_2^- = l + 8m$. The signature is τ = −8m.
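
The statements about the block form (4.2) can be checked numerically. The following is a minimal sketch in Python, not part of the original text. The function names are hypothetical, and the labelling of the $E_8$ Dynkin diagram (a chain of seven nodes with an eighth node attached to the third) is one conventional choice among several. The code builds the form, tests evenness on the diagonal, and obtains $b_2^+$, $b_2^-$ and the determinant by exact congruence diagonalisation over the rationals.

```python
from fractions import Fraction

# E8 Dynkin diagram: chain 0-1-2-3-4-5-6 with node 7 attached to node 2
# (one conventional labelling, assumed here for illustration).
E8_EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (2, 7)]


def minus_e8():
    """Return -E8: diagonal entries -2, entries +1 for linked nodes."""
    m = [[0] * 8 for _ in range(8)]
    for i in range(8):
        m[i][i] = -2
    for i, j in E8_EDGES:
        m[i][j] = m[j][i] = 1
    return m


def direct_sum(blocks):
    """Block-diagonal direct sum of square integer matrices."""
    n = sum(len(b) for b in blocks)
    out = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                out[offset + i][offset + j] = x
        offset += len(b)
    return out


def even_form(l, m):
    """The form l * (hyperbolic plane) + m * (-E8) of Eq. (4.2)."""
    hyperbolic = [[0, 1], [1, 0]]
    return direct_sum([hyperbolic] * l + [minus_e8()] * m)


def is_even(matrix):
    """A symmetric integer form is even iff every diagonal entry is even."""
    return all(matrix[i][i] % 2 == 0 for i in range(len(matrix)))


def inertia_and_det(matrix):
    """Return (b_plus, b_minus, det) by exact symmetric elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    plus = minus = 0
    det = Fraction(1)
    while a:
        n = len(a)
        p = next((i for i in range(n) if a[i][i] != 0), None)
        if p is None:
            pair = next(((i, j) for i in range(n) for j in range(i + 1, n)
                         if a[i][j] != 0), None)
            if pair is None:
                return plus, minus, Fraction(0)  # degenerate form
            i, j = pair
            # Change of basis e_i -> e_i + e_j (determinant 1), which
            # makes the i-th diagonal entry equal to 2 * a[i][j].
            for k in range(n):
                a[i][k] += a[j][k]
            for k in range(n):
                a[k][i] += a[k][j]
            continue
        d = a[p][p]
        det *= d
        if d > 0:
            plus += 1
        else:
            minus += 1
        # Schur complement after splitting off the pivot direction.
        rest = [i for i in range(n) if i != p]
        a = [[a[i][j] - a[i][p] * a[p][j] / d for j in rest] for i in rest]
    return plus, minus, det


if __name__ == "__main__":
    for name, (l, m) in {"quadric": (1, 0), "Enriques": (1, 1), "K3": (3, 2)}.items():
        form = even_form(l, m)
        print(name, is_even(form), inertia_and_det(form))
```

Exact rational arithmetic is used instead of floating-point eigenvalues so that the determinant check is not affected by rounding.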
For most surfaces, the signature is not a multiple of 8, so the intersection form
is odd. If the signature is a multiple of 8, it may be even. For given Betti numbers,
there could be two distinct minimal surfaces (or families of these), one with an odd
intersection form, and the other with an even intersection form. We do not know if
surfaces with both types of intersection form always occur.
We can reexpress these conditions in terms of the physical numbers P and N.
If $N_{\rm exc} = N - P$ is neither zero nor a positive multiple of 8, then the intersection
form must be odd. If N = P, then the intersection form can be of the hyperbolic
plane type $l \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, with l = 2P − 1, or it might still be odd. Notice that l is odd,
as it must be. The isotopes for which even intersection forms are possible therefore
include all those with N = P . These are numerous. In addition to the stable
isotopes with N = P that occur up to $^{40}$Ca, with P = 20, there are many that are
quasistable, like $^{52}$Fe, with P = 26. The heaviest recognised isotopes with N = P
are $^{100}$Sn and perhaps $^{108}$Xe, with P = 50 and P = 54. Our geometrical model
suggests that the additional stability of these isotopes is the result of the nontrivial
structure of an even intersection form.
If $N_{\rm exc} = 8m$ then the intersection form can be of type (4.2), again with l =
2P − 1, but it might also be odd. Examples are the Enriques surface, for which l = 1
and m = 1, and the K3 surface, for which l = 3 and m = 2. The potential isotopes
corresponding to these surfaces are $^{10}$H and $^{20}$He. These are both so neutron-rich
that they have not been observed, but there are many heavier nuclei (and
corresponding atoms) for which the neutron excess $N_{\rm exc}$ is a multiple of 8.
There is some evidence that nuclei whose neutron excess is a multiple of 8 have
additional stability. The most obvious example is $^{48}$Ca, but this is conventionally
attributed to the shell model, as P = 20 and N = 28, both magic numbers. A more
interesting and less understood example is the heaviest known isotope of oxygen,
$^{24}$O, with 8 protons and 16 neutrons. This example and others do not obviously fit
with the shell model. The most stable isotope of iron is $^{56}$Fe, whose neutron excess
is 4, but it is striking that $^{60}$Fe, whose neutron excess is 8, has a lifetime of over
a million years. Here P = 26 and N = 34. $^{64}$Ni, also with a neutron excess of 8,
is one of the stable isotopes of nickel. There are also striking examples of stable
or relatively stable isotopes with neutron excesses of 16 or 24. Some of these are
outliers compared to the general trends in the valley of stability. An example is
$^{124}$Sn, the heaviest stable isotope of tin, with $N_{\rm exc} = 24$. A more careful study
would be needed to confirm if the additional stability of isotopes whose neutron
excess is a multiple of 8 is statistically significant.
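
The arithmetic behind these examples is easy to reproduce. The sketch below is an illustration only, with hypothetical function names. It computes the neutron excess A − 2P for the isotopes named above and, where the excess is zero or a positive multiple of 8, returns the parameters (l, m) of the candidate even form (4.2) using l = 2P − 1 and $N_{\rm exc}$ = 8m from the text. It says nothing about whether a surface with that even form actually exists.

```python
# (label, mass number A, proton number P) for isotopes named in the text.
ISOTOPES = [
    ("48Ca", 48, 20),
    ("24O", 24, 8),
    ("56Fe", 56, 26),
    ("60Fe", 60, 26),
    ("64Ni", 64, 28),
    ("124Sn", 124, 50),
]


def neutron_excess(a, p):
    """N_exc = N - P, with N = A - P."""
    return a - 2 * p


def even_form_parameters(a, p):
    """Return (l, m) for a candidate even form, or None if the form must be odd."""
    n_exc = neutron_excess(a, p)
    if n_exc < 0 or n_exc % 8 != 0:
        return None
    return 2 * p - 1, n_exc // 8


if __name__ == "__main__":
    for label, a, p in ISOTOPES:
        print(label, neutron_excess(a, p), even_form_parameters(a, p))
```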
There is no evidence that a neutron deficit of 8 has a stabilising effect. In fact,
almost no nuclei with such a large neutron deficit are recognised. The only candidate
is $^{48}$Ni, with the magic numbers P = 28 and N = 20.

5. Other Surfaces
In addition to the minimal surfaces of general type there are various other classes
of algebraic surface. Do these have a physical interpretation?
On a surface X it is usually possible to “blow up” one or more points. The
result is not minimal, because a minimal surface, by definition, is one that cannot be
constructed by blowing up points on another surface. Blowing up one point increases
$c_2$ by 1 and decreases $c_1^2$ by 1. This is equivalent, in our model, to increasing N by
1, leaving P unchanged. In other words, one neutron has been added. Topologically,
blowing up is a local process, equivalent to attaching (by connected sum) a copy
of $\overline{\mathbb{CP}}^2$, the complex projective plane with reversed orientation. This adds a 2-cycle that has self-intersection −1, but no intersection with
any other 2-cycle. The rank (size) of the intersection form Ω increases by 1, with an
extra −1 on the diagonal, and the remaining entries of the extra row and column all
zero. This automatically makes the intersection form odd, so any previously even
form now becomes diagonalisable.
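
The effect of blowing up on P and N can be sketched in a few lines of Python. The relations P = ($c_1^2$ + $c_2$)/12 and N = (3$c_2$ − $c_1^2$)/4 used below are assumptions: they are inferred so as to reproduce the values quoted in this chapter ($\mathbb{CP}^2$, the quadric, K3) and the blow-up rule just stated, and are not copied from Eq. (2.12). The function names are hypothetical.

```python
def proton_neutron(c1_squared, c2):
    """Map Chern numbers to (P, N) under the assumed relations
    P = (c1^2 + c2) / 12 and N = (3 * c2 - c1^2) / 4."""
    twelve_p = c1_squared + c2
    four_n = 3 * c2 - c1_squared
    if twelve_p % 12 != 0 or four_n % 4 != 0:
        raise ValueError("Chern numbers do not give integer P and N")
    return twelve_p // 12, four_n // 4


def blow_up(c1_squared, c2, points=1):
    """Blowing up one point lowers c1^2 by 1 and raises c2 by 1."""
    return c1_squared - points, c2 + points


if __name__ == "__main__":
    cp2 = (9, 3)             # Chern numbers (c1^2, c2) of CP^2
    h1 = blow_up(*cp2)       # Hirzebruch surface H_1
    print(proton_neutron(*cp2), proton_neutron(*h1))
```

Because $c_1^2$ + $c_2$ is unchanged by a blow-up while 3$c_2$ − $c_1^2$ grows by 4, the sketch reproduces the rule that each blown-up point adds one neutron and no protons.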
The physical interpretation seems to be that a neutron has been added, well
separated from any other neutron or proton. This adds a relatively high energy,
more than if the additional neutron were bound into an existing nucleus. Minimal
algebraic surfaces, and especially those with even intersection forms, should
correspond to tightly bound nuclei and atoms, having lower energy.
The simplest example is the blow up of one point on $\mathbb{CP}^2$. The result is the
Hirzebruch surface $H_1$, which is a nontrivial $\mathbb{CP}^1$ bundle over $\mathbb{CP}^1$. Its intersection
form is $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. The Hirzebruch surface and quadric are both simply connected and
have the same Betti numbers, $b_2^+ = b_2^- = 1$, corresponding to P = 1 and N = 1, but
the intersection form is odd for the Hirzebruch surface and even for the quadric.
The proposed interpretation is that the Hirzebruch surface represents a separated
proton, neutron and electron, whereas the quadric represents the deuterium atom,
with a bound proton and neutron as its nucleus, orbited by the electron.
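
The parity difference between the two forms can be made explicit. For a 2-cycle α with integer coefficients x and y in the respective bases, the diagonal form with entries +1 and −1 for the Hirzebruch surface and the hyperbolic plane (4.1) for the quadric give the following short check:

```latex
\begin{align*}
H_1:&\quad \Omega(\alpha, \alpha) = x^2 - y^2, \quad \text{which equals 1 (odd) for } (x, y) = (1, 0), \\
Q:&\quad \Omega(\alpha, \alpha) = 2xy, \quad \text{which is even for all integers } x, y .
\end{align*}
```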
There is an inequality of LeBrun for the $L^2$ norm of the Ricci curvature supporting
this interpretation [25, 26]. The norm increases if points on a minimal surface
are blown up, the increase being a constant multiple of the number of blown-up
points. This indicates that both the norm of the Ricci curvature and the norm of
the scalar curvature, possibly with different coefficients, should be ingredients in
the physical energy.
So far, we have not considered any surfaces X that could represent a single neutron,
or a cluster of neutrons. Candidates are the surfaces of Type VII. These have
$c_1^2 = -c_2$, with $c_2$ positive, equivalent to P = 0 and arbitrary positive N. These surfaces
are complex, but are not algebraic and do not admit a Kähler metric. They are
also not simply connected. It is important to have a model of a single neutron. The
discussion of blow-ups suggests that $\overline{\mathbb{CP}}^2$ is another possible model. In this case a
single neutron would be associated with a 2-cycle with self-intersection −1, mirroring
the proton inside $\mathbb{CP}^2$ being represented by a 2-cycle with self-intersection +1.
A free neutron is almost stable, having a half-life of approximately 10 minutes.
There is considerable physical interest in clusters of neutrons. There is a dineutron
resonance similar to the diproton resonance. Recently there has been some
experimental evidence for a tetraneutron resonance, indicating some tendency for
four neutrons to bind [28]. Octaneutron resonances have also been discussed, but no
conclusive evidence for their existence has yet emerged. Neutron stars consist of
multitudes of neutrons, accompanied perhaps by a small number of other particles
(protons and electrons), but their stability is only possible because of the gravitational
attraction supplementing the nuclear forces. Standard Newtonian gravity is
of course negligible for atomic nuclei.
Products of two Riemann surfaces (algebraic curves) of genus 2 or more are
examples of minimal surfaces of general type, but they are certainly not simply
connected. Their interpretation as atoms should be investigated. Other surfaces,
for example ruled surfaces, may have some physical interpretation, but our formulae
would give them negative proton and neutron numbers. They do not model
antimatter, that is, combinations of antiprotons, antineutrons and positrons, because
antimatter is probably best modelled using the complex conjugates of surfaces
modelling matter. Also bound states of protons and antineutrons, with positive P
and negative N, do not seem to exist.

6. Conclusions
We have proposed a new geometrical model of matter. It goes beyond our earlier
proposal [14] in that it can accommodate far more than just a limited set of basic
particles. In principle, the model can account for all types of neutral atom.
Each atom is modelled by a compact, complex algebraic surface, which as a
real manifold is 4-dimensional. The physical quantum numbers of proton number P
(equal to electron number for a neutral atom) and neutron number N are expressed
in terms of the Chern numbers $c_1^2$ and $c_2$ of the surface, but they can also be
expressed in terms of combinations of the Hodge numbers, or of the Betti numbers
$b_1$, $b_2^+$ and $b_2^-$.
Our formulae for P and N were arrived at by considering the interpretation
of just a few examples of algebraic surfaces: the complex projective plane $\mathbb{CP}^2$,
the quadric surface Q, and the Hirzebruch surface $H_1$. Some consequences, which
follow from the known constraints on algebraic surfaces, can therefore be regarded
as predictions of the model. Among these are that P is any positive integer, and
that N is bounded below by 0 and bounded above by the lesser of 9P and 7P + 6.
This encompasses all known isotopes. A most interesting prediction is that the
line N = P, which is the centre of the valley of nuclear stability for small and
medium-sized nuclei, corresponds to the line τ = 0, where $\tau = b_2^+ - b_2^-$ is the
signature. Surfaces with τ positive and τ negative are known to be qualitatively
different, which implies that in our model there is a qualitative difference between
proton-rich and neutron-rich nuclei.
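
The predicted range of neutron numbers is simple to tabulate. The following minimal sketch uses hypothetical function names and implements only the bounds stated above, 0 ≤ N ≤ min(9P, 7P + 6) for positive integer P.

```python
def neutron_range(p):
    """Return (N_min, N_max) allowed by the model for proton number p."""
    if p < 1:
        raise ValueError("P must be a positive integer")
    return 0, min(9 * p, 7 * p + 6)


def is_allowed(p, n):
    """True if (P, N) = (p, n) lies within the stated bounds."""
    lower, upper = neutron_range(p)
    return lower <= n <= upper


if __name__ == "__main__":
    for p in (1, 2, 3, 4, 20, 50):
        print(p, neutron_range(p))
```

The two upper bounds coincide at P = 3. For smaller P the bound 9P is the stronger one, and for larger P the bound 7P + 6 takes over.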
For simply connected surfaces with $b_1 = 0$ (or more generally, if $b_1$ is held fixed),
an increase of P by 1 corresponds to an increase of $b_2^+$ by 2. The interpretation
is that there are two extra 2-cycles with positive self-intersection, corresponding to
the extra proton and the extra electron. This matches our earlier models, where
a proton was associated with such a 2-cycle [14], and where multi-Taub-NUT space
with n NUTs modelled n electrons [16, 17]. On the other hand, an increase of N by 1
corresponds to an increase of $b_2^-$ by 1. This means that a neutron is associated with
a 2-cycle of negative self-intersection, which differs from our earlier ideas, where a
neutron was modelled by a 2-cycle with zero self-intersection. It appears now that
the intersection numbers are related to isospin (whose third component is $\frac{1}{2}$ for a
proton and $-\frac{1}{2}$ for a neutron) rather than to electric charge (1 for a proton and 0
for a neutron).
Clearly, much further work is needed to develop these ideas into a physical
model of nuclei and atoms. We have earlier made a few remarks about possible
energy functions for algebraic surfaces. Combinations of the topological invariants
and nontopological curvature integrals should be explored, and compared with the
detailed information on the energies of nuclei and atoms in their ground states.
It will be important to account for the quantum mechanical nature of the ground
and excited states, their energies and spins. Discrete energy gaps could arise from
discrete changes in geometry, for example, by replacing a blown-up surface with a
minimal surface, or by considering the effect of changing b1 while keeping P and N
fixed, or by comparing different embeddings of an algebraic surface in (higher-
dimensional) projective space. In some cases there should be a discrete choice
for the intersection form. There are also possibilities for finding an analogue of
a Schrödinger equation using linear operators, like the Laplacian or Dirac operator,
acting on forms or spinors on a surface. Alternatively, the right approach may
be to consider the continuous moduli of surfaces as dynamical variables, and then
quantise these. The moduli should somehow correspond to the relative positions of
the protons, neutrons and electrons. Some of the ideas just mentioned have
already been investigated in the context of single particles, modelled by the Taub-NUT
space or another noncompact 4-manifold [15, 29]. Further physical processes, for
example, the fission of larger nuclei, and the binding of atoms into molecules, also
need to be addressed.
Before these investigations can proceed, it will be necessary to decide what
metric structure the surfaces need. Previously, we generally required manifolds to
have a self-dual metric, i.e. to be gravitational instantons, but this now seems too
rigid, as there are very few compact examples. Requiring a Kähler–Einstein metric
may be more reasonable, although these do not exist for all algebraic surfaces [30, 31].
For further developments of these ideas, see Refs. 32 and 33.

Acknowledgments
We are grateful to Chris Halcrow for producing Figs. 1–3, and Nick Mee for supplying
Fig. 4.

References
1. W. Thomson, On vortex atoms, Trans. R. Soc. Edin. 6, 94 (1867).
2. J. Lilley, Nuclear Physics: Principles and Applications (Wiley, Chichester, 2001).
3. F. Rohrlich, Classical Charged Particles, 3rd edn. (World Scientific, Singapore, 2007).
4. T. H. R. Skyrme, A nonlinear field theory, Proc. R. Soc. London A 260, 127 (1961).
5. T. H. R. Skyrme, A unified field theory of mesons and baryons, Nucl. Phys. 31, 556
(1962).
6. G. S. Adkins, C. R. Nappi and E. Witten, Static properties of nucleons in the Skyrme
model, Nucl. Phys. B 228, 552 (1983).
7. D. H. Perkins, Introduction to High Energy Physics, 4th edn. (Cambridge University
Press, Cambridge, 2000).
8. R. A. Battye et al., Light nuclei of even mass number in the Skyrme model, Phys.
Rev. C 80, 034323 (2009).
9. P. H. C. Lau and N. S. Manton, States of carbon-12 in the Skyrme model, Phys. Rev.
Lett. 113, 232503 (2014).
10. C. J. Halcrow, Vibrational quantisation of the B = 7 Skyrmion, Nucl. Phys. B 904,
106 (2016).
11. C. J. Halcrow, C. King and N. S. Manton, A dynamical α-cluster model of $^{16}$O, Phys.
Rev. C 95, 031303(R) (2017).
12. D. Finkelstein and J. Rubinstein, Connection between spin, statistics and kinks, J.
Math. Phys. 9, 1762 (1968).
13. M. Atiyah and N. S. Manton, Skyrmions from instantons, Phys. Lett. B 222, 438
(1989).
14. M. Atiyah, N. S. Manton and B. J. Schroers, Geometric models of matter, Proc. R.
Soc. London A 468, 1252 (2012).
15. R. Jante and B. J. Schroers, Dirac operators on the Taub-NUT space, monopoles and
SU(2) representations, J. High Energy Phys. 01, 114 (2014).
16. G. Franchetti and N. S. Manton, Gravitational instantons as models for charged
particle systems, J. High Energy Phys. 03, 072 (2013).
17. G. Franchetti, Harmonic forms on ALF gravitational instantons, J. High Energy Phys.
12, 075 (2014).
18. W. Barth, C. Peters and A. Van de Ven, Compact Complex Surfaces (Springer, Berlin,
Heidelberg, 1984).
19. P. Griffiths and J. Harris, Principles of Algebraic Geometry (Wiley Classics, New
York, Chichester, 1994).
20. C. Voisin, Hodge Theory and Complex Algebraic Geometry, Vol. I (Cambridge University
Press, Cambridge, 2002).
21. Enriques–Kodaira classification, Wikipedia, 2016.
22. D. I. Cartwright and T. Steger, Enumeration of the 50 fake projective planes, C.R.
Acad. Sci. Paris, Ser. I 348, 11 (2010).
23. E. Braaten, S. Townsend and L. Carson, Novel structure of static multisoliton solutions
in the Skyrme model, Phys. Lett. B 235, 147 (1990).
24. R. A. Battye, N. S. Manton and P. M. Sutcliffe, Skyrmions and the α-particle model
of nuclei, Proc. R. Soc. London A 463, 261 (2007).
25. C. LeBrun, Four-manifolds without Einstein metrics, Math. Res. Lett. 3, 133 (1996).
26. C. LeBrun, Ricci curvature, minimal volumes, and Seiberg–Witten theory, Invent.
Math. 145, 279 (2001).
27. S. K. Donaldson and P. B. Kronheimer, Geometry of Four-Manifolds (Oxford University
Press, Oxford, 1990).
28. K. Kisamori et al., Candidate resonant tetraneutron state populated by the
$^{4}$He($^{8}$He, $^{8}$Be) reaction, Phys. Rev. Lett. 116, 052501 (2016).
29. R. Jante and B. J. Schroers, Spectral properties of Schwarzschild instantons, Class.
Quantum Grav. 33, 205008 (2016).
30. T. Ochiai (ed.), Kähler Metric and Moduli Spaces, Advanced Studies in Pure Mathematics,
Vol. 18-II (Kinokuniya, Tokyo, 1990).
31. G. Tian, Kähler–Einstein metrics with positive scalar curvature, Invent. Math. 137,
1 (1997).
32. M. Atiyah and M. Marcolli, Anyons in geometric models of matter, J. High Energy
Phys. 07, 076 (2017).
33. M. F. Atiyah, Geometric models of helium, Mod. Phys. Lett. A 32, 1750079 (2017).
October 31, 2018 14:52 ws-rv961x669 chap02-TopoGrav page 17

Chapter 2

Developments in topological gravity

Robbert Dijkgraaf and Edward Witten


Institute for Advanced Study,
Einstein Drive, Princeton, NJ 08540 USA

This note aims to provide an entrée to two developments in two-dimensional topological
gravity (that is, intersection theory on the moduli space of Riemann surfaces)
that have not yet become well-known among physicists. A little over a decade ago,
Mirzakhani discovered [1, 2] an elegant new proof of the formulas that result from the
relationship between topological gravity and matrix models of two-dimensional gravity.
Here we will give a very partial introduction to that work, which hopefully will also
serve as a modest tribute to the memory of a brilliant mathematical pioneer. More
recently, Pandharipande, Solomon, and Tessler [3] (with further developments in [4–6])
generalized intersection theory on moduli space to the case of Riemann surfaces with
boundary, leading to generalizations of the familiar KdV and Virasoro formulas. Though
the existence of such a generalization appears natural from the matrix model viewpoint
— it corresponds to adding vector degrees of freedom to the matrix model — constructing
this generalization is not straightforward. We will give some idea of the unexpected way
that the difficulties were resolved.

Contents
1 Introduction
2 Weil–Petersson Volumes and Two-Dimensional Topological Gravity
2.1 Background and initial steps
2.2 A simpler problem
2.3 How Maryam Mirzakhani cured modular invariance
2.4 Volumes and intersection numbers
3 Open Topological Gravity
3.1 Preliminaries
3.2 The anomaly
3.3 Relation to condensed matter physics
3.4 Two realizations of theory T
3.5 Boundary conditions in theory T
3.6 Boundary anomaly of theory T
3.6.1 The trivial case
3.6.2 Boundary condition in condensed matter physics
3.6.3 Boundary condition in two-dimensional gravity
3.7 Anomaly cancellation
3.8 Branes
3.8.1 The ζ-instanton equation and compactness
3.8.2 Boundary condition in the ζ-instanton equation
3.8.3 Orientations and statistics
3.8.4 Quantizing the string
3.9 Boundary degenerations
3.10 Computations of disc amplitudes
4 Interpretation via Matrix Models
4.1 The loop equations
4.2 Double-scaling limits and topological gravity
4.3 Branes and open strings

1. Introduction
There are at least two candidates for the simplest model of quantum gravity in
two space–time dimensions. Matrix models are certainly one candidate, extensively
studied since the 1980s. These models were proposed in [7–11] and solved in [12–14];
for a comprehensive review with extensive references, see [15]. A second candidate is
provided by topological gravity, that is, intersection theory on the moduli space of
Riemann surfaces. It was conjectured some time ago that actually two-dimensional
topological gravity is equivalent to the matrix model [16, 17].
This equivalence led to formulas expressing the intersection numbers of certain
natural cohomology classes on moduli space in terms of the partition function of the
matrix model, which is governed by KdV equations [18] or equivalently by Virasoro
constraints [19]. These formulas were first proved by Kontsevich [20] by a direct
calculation that expressed intersection numbers on moduli space in terms of a new
type of matrix model (which was again shown to be governed by the KdV and
Virasoro constraints).
A little over a decade ago, Maryam Mirzakhani found a new proof of this relationship
as part of her Ph.D. thesis work [1, 2]. (Several other proofs are known
[21,22].) She put the accent on understanding the Weil–Petersson volumes of moduli
spaces of hyperbolic Riemann surfaces with boundary, showing that these volumes
contain all the information in the intersection numbers. A hyperbolic structure on
a surface Σ is determined by a flat $SL(2, \mathbb{R})$ connection, so the moduli space M of
hyperbolic structures on Σ can be understood as a moduli space of flat $SL(2, \mathbb{R})$
connections. Actually, the Weil–Petersson symplectic form on M can be defined by
the same formula that is used to define the symplectic form on the moduli space
of flat connections on Σ with structure group a compact Lie group such as SU(2).
For a compact Lie group, the volume of the moduli space can be computed by a
direct cut and paste method [23] that involves building Σ out of simple building
blocks (three-holed spheres). Naively, one might hope to do something similar for
$SL(2, \mathbb{R})$ and thus for the Weil–Petersson volumes. But there is a crucial difference:
in the case of $SL(2, \mathbb{R})$, in order to define the moduli space whose volume one will
calculate, one wants to divide by the action of the mapping class group on Σ. (Otherwise
the volume is trivially infinite.) But dividing by the mapping class group is
not compatible with any simple cut and paste method. Maryam Mirzakhani overcame
this difficulty in a surprising and elegant way, of which we will give a glimpse
in Sec. 2.
Matrix models of two-dimensional gravity have a natural generalization in which
vector degrees of freedom are added [24–29]. This generalization is related, from a
physical point of view, to two-dimensional gravity formulated on two-manifolds Σ
that carry a complex structure but may have a boundary. We will refer to such
two-manifolds as open Riemann surfaces (if the boundary of Σ is empty, we will
call it a closed Riemann surface). It is natural to hope that, by analogy with what
happens for closed Riemann surfaces, there would be an intersection theory on the
moduli space of open Riemann surfaces that would be related to matrix models
with vector degrees of freedom. In trying to construct such a theory, one runs
into immediate difficulties: the moduli space of open Riemann surfaces does not
have a natural orientation and has a boundary; for both reasons, it is not obvious
how to define intersection theory on this space. These difficulties were overcome
by Pandharipande, Solomon, and Tessler in a rather unexpected way [3] whose
full elucidation involves introducing spin structures in a problem in which at first
sight they do not seem relevant [4–6]. In Sec. 3, we will explain some highlights of
this story. In Sec. 4, we review matrix models with vector degrees of freedom, and
show how they lead — modulo a slightly surprising twist — to precisely the same
Virasoro constraints that have been found in intersection theory on the moduli
space of open Riemann surfaces.
The matrix models we consider are the direct extension of those studied in
[12–14]. The same problem has been treated in a rather different approach via
Gaussian matrix models with an external source in [30] and in chapter 8 of [31].
See also [32] for another approach. For an expository article on the relation of
matrix models and intersection theory, see [33].

2. Weil–Petersson Volumes and Two-Dimensional Topological Gravity
2.1. Background and initial steps
Let Σ be a closed Riemann surface of genus g with marked points¹ $p_1, \dots, p_n$, and
let $L_i$ be the cotangent space to $p_i$ in Σ. As Σ and the $p_i$ vary, $L_i$ varies as the fiber
of a complex line bundle (which we also denote as $L_i$) over $M_{g,n}$, the moduli
space of genus g curves with n punctures. In fact, these line bundles extend naturally
over $\overline{M}_{g,n}$, the Deligne–Mumford compactification of $M_{g,n}$. We write $\psi_i$ for the
first Chern class of $L_i$; thus $\psi_i = c_1(L_i)$ is a two-dimensional cohomology class. For
a non-negative integer d, we set $\tau_{i,d} = \psi_i^d$, a cohomology class of dimension 2d. The
usual correlation functions of 2d topological gravity are the intersection numbers

$$\langle \tau_{d_1} \tau_{d_2} \cdots \tau_{d_n} \rangle = \int_{\overline{M}_{g,n}} \tau_{1,d_1} \tau_{2,d_2} \cdots \tau_{n,d_n} = \int_{\overline{M}_{g,n}} \psi_1^{d_1} \psi_2^{d_2} \cdots \psi_n^{d_n}, \tag{2.1}$$

where $d_1, \dots, d_n$ is any n-plet of nonnegative integers. The right-hand side of Eq.
(2.1) vanishes unless $\sum_{i=1}^n d_i = 3g - 3 + n$. To be more exact, what we have defined in
Eq. (2.1) is the genus g contribution to the correlation function; the full correlation
function is obtained by summing over g ≥ 0. (For a given set of $d_i$, there is at most
one integer solution g of the condition $\sum_{i=1}^n d_i = 3g - 3 + n$, and this is the only
value that contributes to $\langle \tau_{d_1} \tau_{d_2} \cdots \tau_{d_n} \rangle$.)

¹ The marked points are labeled and are required to be always distinct.
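The selection rule stated below Eq. (2.1) can be illustrated with a short sketch. The function name `contributing_genus` is a hypothetical choice made for this illustration, not notation from the text. It only solves the degree condition for g and does not check any further condition on the pair (g, n).

```python
from typing import Optional, Sequence


def contributing_genus(degrees: Sequence[int]) -> Optional[int]:
    """Return the unique g >= 0 with sum(d_i) = 3g - 3 + n, or None if there is none."""
    if any(d < 0 for d in degrees):
        raise ValueError("all degrees d_i must be nonnegative")
    n = len(degrees)
    three_g = sum(degrees) - n + 3  # equals 3g when the selection rule holds
    if three_g < 0 or three_g % 3 != 0:
        return None
    return three_g // 3


# A few sample insertions (d_1, ..., d_n).
for ds in [(0, 0, 0), (1,), (4,), (2, 3), (2, 2, 2), (2,)]:
    print(ds, "->", contributing_genus(ds))
```

For instance, the insertions (4), (2, 3) and (2, 2, 2) all select genus 2, while a single insertion with d = 2 has no integer solution and so gives a vanishing correlation function.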
Let us now explain how these correlation functions are related to the Weil–Petersson
volume of $M_g$. In the special case n = 1, we have just a single marked
point p and a single line bundle L and cohomology class ψ. We also have the
forgetful map $\pi : \overline{M}_{g,1} \to \overline{M}_g$ that forgets the marked point. We can construct
a two-dimensional cohomology class κ on $\overline{M}_g$ by integrating the four-dimensional
class $\tau_2 = \psi^2$ over the fibers of this forgetful map:

$$\kappa = \pi_*(\tau_2). \tag{2.2}$$

More generally, the Miller–Morita–Mumford (MMM) classes are defined by
$\kappa_d = \pi_*(\tau_{d+1})$, so κ is the same as the first MMM class $\kappa_1$. κ is cohomologous to a
multiple of the Weil–Petersson symplectic form ω of the moduli space [34, 35]:

$$\frac{\omega}{2\pi^2} = \kappa. \tag{2.3}$$

Because of (2.2), it will be convenient to use κ, rather than ω, to define a volume
form. With this choice, the volume of $M_g$ is

$$V_g = \int_{\overline{M}_g} \frac{\kappa^{3g-3}}{(3g-3)!} = \int_{\overline{M}_g} \exp(\kappa). \tag{2.4}$$

The relation between κ and $\tau_2$ might make one hope that the volume $V_g$ would
be one of the correlation functions of topological gravity:

$$V_g \overset{?}{=} \frac{1}{(3g-3)!} \left\langle \tau_2^{3g-3} \right\rangle. \tag{2.5}$$

Such a simple formula is, however, not true, for the following reason. To compute
the right-hand side of Eq. (2.5), we would have to introduce 3g − 3 marked points
on Σ, and insert $\tau_2$ (that is, $\psi_i^2$) at each of them. It is true that for a single marked
point, κ can be obtained as the integral of τ2 over the fiber of the forgetful map,
as in Eq. (2.2). However, when there is more than one marked point, we have to
take into account that the Deligne–Mumford compactification of Mg,n is defined
in such a way that the marked points are never allowed to collide. Taking this into
account leads to corrections in which, for instance, two copies of τ2 are replaced by a
single copy of τ3 . The upshot is that Vg can be expressed in terms of the correlation
functions of topological gravity, and thus can be computed using the KdV equations
or the Virasoro constraints, but the necessary formula is more complicated. See
Subsec. 2.4 below. For now, we just remark that this approach has been used [36]
to determine the large g asymptotics of Vg , but apparently does not easily lead to
explicit formulas for Vg in general. Weil–Petersson volumes were originally studied
and their asymptotics estimated by quite different methods [37].
$M_{g,n}$ likewise has a Weil–Petersson volume $V_{g,n}$ of its own, which likewise can
be computed, in principle, using a knowledge of the intersection numbers on $\overline{M}_{g,n'}$
for $n' > n$. Again this gives useful information but it is difficult to get explicit
general formulas.

Fig. 1. (a) A marked point in a hyperbolic Riemann surface is treated as a cusp: it lies at infinity
in the hyperbolic metric. (b) Instead of a cusp, a hyperbolic Riemann surface might have a geodesic
boundary, with circumference any positive number b.

Mirzakhani’s procedure was different. First of all, she worked in the hyperbolic
world, so in the following discussion Σ is not just a complex Riemann surface; it
carries a hyperbolic metric, by which we mean a Riemannian metric of constant
scalar curvature R = −1. We recall that a complex Riemann surface admits a
unique Kähler metric with R = −1. We recall also that in studying hyperbolic
Riemann surfaces, it is natural² to think of a marked point as a cusp, which lies at
infinity in the hyperbolic metric (Fig. 1).
Instead of a marked point, we can consider a Riemann surface with a boundary
component. In the hyperbolic world, one requires the boundary to be a geodesic
in the hyperbolic metric. Its circumference may be any positive number b. Let
us consider, rather than a closed Riemann surface Σ of genus g with n labeled
marked points, an open Riemann surface Σ also of genus g, but now with n labeled
boundaries. In the hyperbolic world, it is natural to specify n positive numbers
$b_1, \dots, b_n$ and to require that Σ carry a hyperbolic metric such that the boundaries
are geodesics of lengths $b_1, \dots, b_n$. We denote the moduli space of such hyperbolic
metrics as $M_{g;b_1,b_2,\dots,b_n}$ or more briefly as $M_{g,\vec b}$, where $\vec b$ is the n-plet $(b_1, b_2, \dots, b_n)$.
As a topological space, $M_{g,\vec b}$ is independent of $\vec b$. In fact, $M_{g,\vec b}$ is an orbifold,
and the topological type of an orbifold cannot depend on continuously variable
data such as $\vec b$. In the limit that $b_1, \dots, b_n$ all go to zero, the boundaries turn into
cusps and $M_{g,\vec b}$ turns into $M_{g,n}$. Thus topologically, $M_{g,\vec b}$ is equivalent to $M_{g,n}$ for
any $\vec b$. Very concretely, we can always convert a Riemann surface with a boundary
component to a Riemann surface with a marked point by gluing a disc, with a
marked point at its center, to the given boundary component. Thus we can turn a
Riemann surface with boundaries into one with marked points without changing the
parameters the Riemann surface can depend on, and this leads to the topological
equivalence of $M_{g,\vec b}$ with $M_{g,n}$. If we allow the hyperbolic metric of Σ to develop
cusp singularities, we get a compactification $\overline{M}_{g,\vec b}$ of $M_{g,\vec b}$ which coincides with the
Deligne–Mumford compactification $\overline{M}_{g,n}$ of $M_{g,n}$.

² This is natural because the degenerations of the hyperbolic metric of Σ that correspond to
Deligne–Mumford compactification of $M_{g,n}$ generate cusps. Since the extra marked points that
occur when Σ degenerates (for example to two components that are glued together at new marked
points) appear as cusps in the hyperbolic metric, it is natural to treat all marked points that way.

$M_{g,n}$ and $M_{g,\vec b}$ have natural Weil–Petersson symplectic forms that we will call
ω and $\omega_{\vec b}$ (see [38]). Since $M_{g,n}$ and $M_{g,\vec b}$ are equivalent topologically, it makes
sense to ask if the symplectic form $\omega_{\vec b}$ of $M_{g,\vec b}$ has the same cohomology class as
the symplectic form ω of $M_{g,n}$. The answer is that it does not. Rather, one has
(see [2], Theorem 4.4)

$$\omega_{\vec b} = \omega + \frac{1}{2} \sum_{i=1}^n b_i^2 \psi_i. \tag{2.6}$$

(This is a relationship in cohomology, not an equation for differential forms.) From
this it follows that the Weil–Petersson volume of $M_{g,\vec b}$ is³

$$V_{g,\vec b} = \frac{1}{(2\pi^2)^{3g-3+n}} \int_{M_{g,\vec b}} \exp\left( \omega + \frac{1}{2} \sum_{i=1}^n b_i^2 \psi_i \right). \tag{2.7}$$

Equivalently, since compactification by allowing cusps does not affect the volume
integral, and the compactification of $M_{g,\vec b}$ is the same as $\overline{M}_{g,n}$, one can write this
as an integral over the compactification:

$$V_{g,\vec b} = \frac{1}{(2\pi^2)^{3g-3+n}} \int_{\overline{M}_{g,n}} \exp\left( \omega + \frac{1}{2} \sum_{i=1}^n b_i^2 \psi_i \right). \tag{2.8}$$

This last result tells us that at $\vec b = 0$, $V_{g,\vec b}$ reduces to the volume
$V_{g,n} = (1/2\pi^2)^{3g-3+n} \int_{\overline{M}_{g,n}} e^{\omega}$ of $M_{g,n}$. Moreover, Eq. (2.8) implies that $V_{g,\vec b}$ is a polynomial
in $\vec b^2 = (b_1^2, \dots, b_n^2)$ of total degree 3g − 3 + n. In evaluating the term of top
degree in $V_{g,\vec b}$, we can drop ω from the exponent in Eq. (2.8). Then the expansion
in powers of the $b_i$ tells us that this term of top degree is

$$\frac{1}{(2\pi^2)^{3g-3+n}} \sum_{d_1,\dots,d_n} \prod_{i=1}^n \frac{b_i^{2d_i}}{2^{d_i} d_i!} \langle \tau_{d_1} \tau_{d_2} \cdots \tau_{d_n} \rangle. \tag{2.9}$$

(Only terms with $\sum_i d_i = 3g - 3 + n$ make nonzero contributions in this sum.) In
other words, the correlation functions of two-dimensional topological gravity on a
closed Riemann surface appear as coefficients in the expansion of $V_{g,\vec b}$. Of course,
$V_{g,\vec b}$ contains more information,⁴ since we can also consider the terms in $V_{g,\vec b}$ that
are subleading in $\vec b$.
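As a minimal sketch of how Eq. (2.9) turns a correlation function into a coefficient of the volume polynomial, the following code evaluates the rational factor multiplying $\prod_i b_i^{2d_i}$ and, separately, the power of $2\pi^2$. The function names are hypothetical. The two input values, $\langle\tau_0^3\rangle = 1$ in genus 0 and $\langle\tau_1\rangle = 1/24$ in genus 1, are the standard values of these intersection numbers and are assumed here rather than taken from the text.

```python
from fractions import Fraction
from math import factorial, pi


def top_degree_coefficient(degrees, correlator):
    """Rational coefficient of prod_i b_i^(2 d_i) in Eq. (2.9), omitting the power of 2*pi^2."""
    denom = 1
    for d in degrees:
        denom *= 2**d * factorial(d)
    return Fraction(correlator) / denom


def pi_prefactor(g, n):
    """The overall factor 1/(2 pi^2)^(3g-3+n) of Eq. (2.9), as a float."""
    return (2 * pi**2) ** -(3 * g - 3 + n)


# Assumed inputs (standard values, not stated in the text):
# <tau_0 tau_0 tau_0> = 1 in genus 0, <tau_1> = 1/24 in genus 1.
c_03 = top_degree_coefficient((0, 0, 0), Fraction(1))
c_11 = top_degree_coefficient((1,), Fraction(1, 24))

print("g=0, n=3:", c_03, "times", pi_prefactor(0, 3))
print("g=1, n=1:", c_11, "* b^2 times", pi_prefactor(1, 1))
```

With these inputs the three-holed sphere gives a constant polynomial equal to 1, and the one-holed torus has top-degree term proportional to $b^2/48$ before the factor $1/(2\pi^2)$ from the normalization of Eq. (2.4) is applied.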
Thus Mirzakhani’s approach to topological gravity involved deducing the correlation
functions of topological gravity from the volume polynomials $V_{g,\vec b}$. We will
give a few indications of how she computed these volume polynomials in Subsec.
2.3, after first recalling a much simpler problem.

³ The reason for the factor of $1/(2\pi^2)^{3g-3+n}$ is just that we defined the volumes in Eq. (2.4) using
κ rather than ω.
⁴ This additional information in principle is not really new. Using facts that generalize the relationship
between $V_{g,n}$ and the correlation functions of topological gravity that we discussed at
the outset, one can deduce also the subleading terms in $V_{g,\vec b}$ in terms of the correlation functions
of topological gravity. However, it appears difficult to get useful formulas in this way.

2.2. A simpler problem


Before explaining how to compute the volume of $M_{g,\vec b}$, we will describe how volumes
can be computed in a simpler case. In fact, the analogy was noted in [2].
Let G be a compact Lie group, such as SU(2), with Lie algebra $\mathfrak{g}$, and let Σ be a
closed Riemann surface of genus g. Let $M_g$ be the moduli space of homomorphisms
from the fundamental group of Σ to G. Equivalently, $M_g$ is the moduli space of
flat $\mathfrak{g}$-valued connections on Σ. Then [38,39] this moduli space has a natural symplectic
form that in many ways is analogous to the Weil–Petersson form on the moduli space
of Riemann surfaces. Writing A for a flat
connection on Σ and δA for its variation, the symplectic form of $M_g$ can be defined
by the gauge theory formula

$$\omega = \frac{1}{4\pi^2} \int_\Sigma \mathrm{Tr}\, \delta A \wedge \delta A, \tag{2.10}$$

where (for G = SU (2)) we can take Tr to be the trace in the two-dimensional
representation.
Actually, the Weil–Petersson form of Mg can be defined by much the same
formula. The moduli space of hyperbolic metrics on Σ is a component⁵ of the
moduli space of flat SL(2, R) connections over Σ, divided by the mapping class
group of Σ. Denoting the flat connection again as A and taking Tr to be the trace
in the two-dimensional representation of SL(2, R), the right hand side of Eq. (2.10)
becomes in this case a multiple of the Weil–Petersson symplectic form ω on Mg .
There is also an analog for compact G of the moduli spaces $M_{g,\vec b}$ of hyperbolic
Riemann surfaces with geodesic boundary. For $\vec b = (b_1, \dots, b_n)$, $M_{g,\vec b}$ can be interpreted
as follows in the gauge theory language. A point in $M_{g,\vec b}$ corresponds
to a flat SL(2, R) connection on Σ with the property
that the holonomy around the i-th boundary is conjugate in SL(2, R) to the group
element $\mathrm{diag}(e^{b_i}, e^{-b_i})$.
In this language, it is clear how to imitate the definition of $M_{g,\vec b}$ for a compact
Lie group such as SU(2). For k = 1, . . . , n, we choose a conjugacy class in SU(2),
say the class that contains $U_k = \mathrm{diag}(e^{i\alpha_k}, e^{-i\alpha_k})$, for some $\alpha_k$. We write $\vec\alpha$ for the n-plet
$(\alpha_1, \alpha_2, \dots, \alpha_n)$, and we define $M_{g,\vec\alpha}$ to be the moduli space of flat connections
on a genus g surface Σ with n holes (or equivalently n boundary components) with
the property that the holonomy around the k-th hole is conjugate to $U_k$. With a
little care,⁶ the right hand side of the formula (2.10) can be used in this situation
to define the Weil–Petersson form $\kappa_{\vec b}$ of $M_{g,\vec b}$, and the analogous symplectic form
$\omega_{\vec\alpha}$ of $M_{g,\vec\alpha}$. Thus in particular, $M_{g,\vec\alpha}$ has a symplectic volume $V_{g,\vec\alpha}$. Moreover, $V_{g,\vec\alpha}$
is a polynomial in $\vec\alpha$, and the coefficients of this polynomial are the correlation
functions of a certain version of two-dimensional topological gauge theory: they
are the intersection numbers of certain natural cohomology classes on $M_{g,\vec\alpha}$.

⁵ The moduli space of flat SL(2, R) connections on Σ has various components labeled by the Euler
class of a flat real vector bundle of rank 2 (transforming in the two-dimensional representation of
SL(2, R)). One of these components parametrizes hyperbolic metrics on Σ together with a choice
of spin structure. If we replace SL(2, R) by PSL(2, R) = SL(2, R)/Z₂ (the symmetry group of the
hyperbolic plane), we forget the spin structure, so to be precise, $M_g$ is a component of the moduli
space of flat PSL(2, R) connections. This refinement will not be important in what follows and
we loosely speak of SL(2, R). In terms of PSL(2, R), one can define Tr as 1/4 of the trace in the
three-dimensional representation.

Fig. 2. (a) A three-holed sphere or “pair of pants.” (b) A Riemann surface Σ, possibly with
boundaries, that is built by gluing three-holed spheres along their boundaries. Each boundary of
one of the three-holed spheres is either an external boundary (a boundary of Σ) or an internal
boundary, glued to a boundary of one of the three-holed spheres (generically a different one). The
example shown has one external boundary and four internal ones.

These statements, which are analogs of what we described in the case of gravity
in Subsec. 2.1, were explained for gauge theory with a compact gauge group in [40].
Moreover, for a compact gauge group, various relatively simple ways to compute the
symplectic volume Vg,~α were described in [23]. None of these methods carry over
naturally to the gravitational case. However, to appreciate Maryam Mirzakhani’s
work on the gravitational case, it helps to have some idea how the analogous problem
can be solved in the case of gauge theory with a compact gauge group. So we will
make a few remarks.
First we consider the special case of a three-holed sphere (sometimes called a
pair of pants; see Fig. 2(a)). In the case of a three-holed sphere, for G = SU(2),
$M_{0,\vec\alpha}$ is either a point, with volume 1, or an empty set, with volume 0, depending on
$\vec\alpha$. The volumes of the three-holed sphere moduli spaces can also be computed (with
a little more difficulty) for other compact G, but we will not explain the details as
the case of SU(2) will suffice for illustration.

⁶ On the gravity side, Mirzakhani’s proof that the cohomology class of $\kappa_{\vec b}$ is linear in $\vec b^2$ did not
use Eq. (2.10) at all, but a different approach based on Fenchel–Nielsen coordinates. On the
gauge theory side, in using Eq. (2.10), it can be convenient to consider a Riemann surface with
punctures (i.e., marked points that have been deleted) rather than boundaries. This does not
affect the moduli space of flat connections, because if Σ is a Riemann surface with boundary, one
can glue in to each boundary component a once-punctured disc, thus replacing all boundaries by
punctures, without changing the moduli space of flat connections. For brevity we will stick here
with the language of Riemann surfaces with boundary.

Now to generalize beyond the case of a three-holed sphere, we observe that
any closed surface Σ can be constructed by gluing together three-holed spheres
along some of their boundary components (Fig. 2(b)). If Σ is built in this way,
then the volume of the corresponding moduli space $M_{g,\vec\alpha}$ can be obtained by multiplying together the
volume functions of the individual three-holed spheres and integrating over the
α parameters of the internal boundaries, where gluing occurs. (One also has to
integrate over some twist angles that enter in the gluing, but these give a trivial
overall factor.) Thus for a compact group it is relatively straightforward to get
formulas for the volumes $V_{g,\vec\alpha}$. Moreover, these formulas turn out to be rather
manageable.
If we try to imitate this with SU(2) replaced by SL(2, R), some of the steps
work. In particular, if Σ is a three-holed sphere, then for any $\vec b$, the moduli space
$M_{0,\vec b}$ is a point and $V_{0,\vec b} = 1$. What really goes wrong for SL(2, R) is that, if Σ
is such that $M_{0,\vec b}$ is not just a point, then the volume of the moduli space of flat
SL(2, R) connections on Σ is infinite. For SU(2), the procedure mentioned in the
last paragraph leads to an integral over the parameters $\vec\alpha$. Those parameters are
angular variables, valued in a compact set, and the integral over these parameters
converges. For SL(2, R) (in the particular case of the component of the moduli
space of flat connections that is related to hyperbolic metrics), we would want to
replace the angular variables $\vec\alpha$ with the positive parameters $\vec b$. The set of positive
numbers is not compact and the integral over $\vec b$ is divergent.
This should not come as a surprise as it just reflects the fact that the group
SL(2, R) is not compact. The relation between flat SL(2, R) connections and complex
structures tells us what we have to do to get a sensible problem. To go from (a
component of) the moduli space of flat SL(2, R) connections to the moduli space
of Riemann surfaces, we have to divide by the mapping class group of Σ (the group
of components of the group of diffeomorphisms of Σ). It is the moduli space of
Riemann surfaces that has a finite volume, not the moduli space of flat SL(2, R)
connections.
But here is precisely where we run into difficulty with the cut and paste method
to compute volumes. Topologically, Σ can be built by gluing three-holed spheres in
many ways that are permuted by the action of the mapping class group. Any one
gluing procedure is not invariant under the mapping class group and in a calculation
based on any one gluing procedure, it is difficult to see how to divide by the mapping
class group.
Dealing with this problem, in a manner that we explain next, was the essence of
Maryam Mirzakhani’s approach to topological gravity.

2.3. How Maryam Mirzakhani cured modular invariance


Let Σ be a hyperbolic Riemann surface with geodesic boundary. Ideally, to compute
the volume of the corresponding moduli space, we would “cut” Σ on a simple
closed geodesic ℓ. This cutting gives a way to build Σ from hyperbolic Riemann
surfaces that are in some sense simpler than Σ. If cutting along ℓ divides Σ into
two disconnected components (Fig. 3(a)), then Σ can be built by gluing along ℓ
two hyperbolic Riemann surfaces $\Sigma_1$ and $\Sigma_2$ with geodesic boundary. If cutting along
ℓ leaves Σ connected (Fig. 3(b)), then Σ is built by gluing together two boundary
components of a surface Σ′. We call these the separating and nonseparating cases.

Fig. 3. A “cut” of a Riemann surface with boundary along an embedded circle may be separating
as in (a) or non-separating as in (b).

In the separating case, we might naively hope to compute the volume function
$V_{g,\vec b}$ for Σ by multiplying together the corresponding functions for $\Sigma_1$ and $\Sigma_2$ and
integrating over the circumference b of ℓ. Schematically,

$$V_\Sigma \overset{?}{=} \int_0^\infty db\, V_{\Sigma_1,b}\, V_{\Sigma_2,b}, \tag{2.11}$$

where we indicate that $\Sigma_1$ and $\Sigma_2$ each has one boundary component, of circumference
b, that does not appear in Σ. In the nonseparating case, a similarly naive
formula would be

$$V_\Sigma \overset{?}{=} \int_0^\infty db\, V_{\Sigma',b,b}, \tag{2.12}$$

where we indicate that Σ′, relative to Σ, has two extra boundary components each
of circumference b.
The surfaces $\Sigma_1$, $\Sigma_2$, and Σ′ are in a precise sense “simpler” than Σ: their genus
is less, or their Euler characteristic is less negative. So if we had something like
(2.11) or (2.12), a simple induction would lead to a general formula for the volume
functions.
The trouble with these formulas is that a hyperbolic Riemann surface actually
has infinitely many simple closed geodesics $\ell_\alpha$, and there is no natural (modular-invariant)
way to pick one. Suppose, however, that there were a function F(b) of a
positive real number b with the property that

$$\sum_\alpha F(b_\alpha) = 1, \tag{2.13}$$

where the sum runs over all simple closed geodesics $\ell_\alpha$ on a hyperbolic surface Σ, and
$b_\alpha$ is the length of $\ell_\alpha$. In this case, by summing over all choices of embedded simple
closed geodesic, and weighting each with a factor of F(b), we would get a corrected
version of the above formulas. In writing the formula, we have to remember that
cutting along a given $\ell_\alpha$ either leaves Σ connected or separates a genus g surface Σ
into surfaces $\Sigma_1$, $\Sigma_2$ of genera $g_1$, $g_2$ such that $g_1 + g_2 = g$. In the separating case,
the boundaries of Σ are partitioned in some arbitrary way between $\Sigma_1$ and $\Sigma_2$ and
each of $\Sigma_1$, $\Sigma_2$ has in addition one more boundary component whose circumference
we will call b′. So denoting as $\vec b$ the boundary lengths of Σ, the boundary lengths
of $\Sigma_1$ and $\Sigma_2$ are respectively $(\vec b_1, b')$ and $(\vec b_2, b')$, where $\vec b = \vec b_1 \sqcup \vec b_2$ (here $\vec b_1 \sqcup \vec b_2$
denotes the disjoint union of two sets $\vec b_1$ and $\vec b_2$) and Σ is built by gluing together
$\Sigma_1$ and $\Sigma_2$ along their boundaries of length b′. This is drawn in Fig. 3(a), but in
the example shown, the set $\vec b$ consists of only one element. In the nonseparating
case of Fig. 3(b), Σ is made from gluing a surface Σ′ of boundary lengths $(\vec b, b', b')$
along its two boundaries of length b′. The genus g′ of Σ′ is g′ = g − 1. Assuming
the hypothetical sum rule (2.13) involves a sum over all simple closed geodesics $\ell_\alpha$,
regardless of topological properties, the resulting recursion relation for the volumes
will also involve such a sum. This recursion relation would be

$$V_{g,\vec b} \overset{?}{=} \frac{1}{2} \sum_{g_1, g_2 \,|\, g = g_1 + g_2} \; \sum_{\vec b_1, \vec b_2 \,|\, \vec b_1 \sqcup \vec b_2 = \vec b} \int_0^\infty db'\, F(b')\, V_{g_1, \vec b_1, b'}\, V_{g_2, \vec b_2, b'} + \int_0^\infty db'\, F(b')\, V_{g-1, \vec b, b', b'}. \tag{2.14}$$

In the first term, the sum runs over all topological choices in the gluing; the factor
of 1/2 reflects the possibility of exchanging $\Sigma_1$ and $\Sigma_2$. The factors of $F(b')$ in the
formula compensate for the fact that in deriving such a result, one has to sum over
cuts on all simple closed geodesics. By induction (in the genus and the absolute
value of the Euler characteristic of a surface), such a recursion relation would lead
to explicit expressions for all $V_{g,\vec b}$.
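To make the index set of the first sum in the schematic relation (2.14) concrete, here is a minimal sketch that lists the ordered splittings of the genus and of the set of boundary lengths. The function name `separating_terms` and the string labels for the boundaries are hypothetical. Like the naive formula itself, the sketch imposes no further restriction on which splittings are admissible.

```python
from itertools import combinations


def separating_terms(g, boundary_labels):
    """Yield (g1, S1, g2, S2) with g1 + g2 = g and S1, S2 a disjoint splitting of the boundaries."""
    labels = tuple(boundary_labels)
    positions = range(len(labels))
    for g1 in range(g + 1):
        for size in range(len(labels) + 1):
            for chosen in combinations(positions, size):
                s1 = tuple(labels[i] for i in chosen)
                s2 = tuple(labels[i] for i in positions if i not in chosen)
                yield g1, s1, g - g1, s2


terms = list(separating_terms(2, ["b1", "b2"]))
print(len(terms))  # (g + 1) * 2**n ordered splittings; here 3 * 4 = 12
for term in terms:
    print(term)
```

Each unordered way of cutting appears twice in this list, once with the roles of the two components exchanged, which is the double counting that the factor of 1/2 in Eq. (2.14) removes.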
There is an important special case in which there actually is a sum rule [41]
precisely along the lines of Eq. (2.13) and therefore there is an identity precisely
along the lines of Eq. (2.14). This is the case that Σ is a surface of genus 1 with
one boundary component.
The general case is more complicated. In general, there is an identity that involves
pairs of simple closed geodesics in Σ that have the property that, together
with a specified boundary component of Σ, they bound a pair of pants (Fig. 4).
This identity was proved for hyperbolic Riemann surfaces with punctures by McShane
in [41] and generalized to surfaces with boundary by Mirzakhani in [1],
Theorem 4.2.
This generalized McShane identity leads to a recursive formula for Weil–Petersson
volumes that is similar in spirit to Eq. (2.14). See Theorem 8.1 of Mirzakhani’s
paper [1] for the precise statement. The main difference between the naive
formula (2.14) and the formula that actually works is the following. In Eq. (2.14),
we imagine building Σ from simpler building blocks by a more or less arbitrary
gluing. In the correct formula (Mirzakhani’s Theorem 8.1) we more specifically
build Σ by gluing a pair of pants onto something simpler, as in Fig. 4. There is
a function $F(b, b', b'')$, analogous to $F(b')$ in the above schematic discussion, that
enters in the generalized McShane identity and therefore in the recursion relation.
It compensates for the fact that in deriving the recursion relation, one has to sum
over infinitely many ways to cut off a hyperbolic pair of pants from Σ.

Fig. 4. Building a hyperbolic surface Σ by gluing a hyperbolic pair of pants with geodesic boundary
onto a simpler hyperbolic surface Σ′. Σ and Σ′ both have geodesic boundary. (Shown here is
the case that Σ′ is connected.)

In this manner, Mirzakhani arrived at a recursive formula for Weil–Petersson
volumes that is similar to although somewhat more complicated than Eq. (2.14).
Part of the beauty of the subject is that this formula turned out to be surprisingly
tractable. In [1], section 6, she used the recursive formula to give a new proof,
independent of the relation to topological gravity that we reviewed in Subsec. 2.1,
that the volume functions $V_{g,\vec b}$ are polynomials in $b_1^2, \dots, b_n^2$. In [2], she showed
that these polynomials satisfy the Virasoro constraints of two-dimensional gravity,
as formulated for the matrix model in [19]. Thereby — using the relation between
volumes and intersection numbers that we reviewed in Subsec. 2.1 and to which we
will return in a moment — she gave a new proof of the known formulas [17, 20] for
intersection numbers on the moduli space of Riemann surfaces, or equivalently for
correlation functions of two-dimensional topological gravity.

2.4. Volumes and intersection numbers


We conclude this section by briefly describing the formula that relates Weil–
Petersson volumes to correlation functions of topological gravity.
Given a surface Σ with n + 1 marked points, there is a forgetful map
$\pi : \overline{M}_{g,n+1} \to \overline{M}_{g,n}$ that forgets one of the marked points p. If we insert the class
$\tau_{d+1}$ at p and integrate over the fiber of π, we get the Miller–Morita–Mumford
class $\kappa_d = \pi_*(\tau_{d+1})$, which is a class of degree 2d in the cohomology of $\overline{M}_{g,n}$.
As a first step in evaluating a correlation function $\langle \tau_{d+1} \prod_{j=1}^k \tau_{n_j} \rangle$, one might
try to integrate over the choice of the point at which $\tau_{d+1}$ is inserted. Integrating
over the fiber of $\pi : \overline{M}_{g,n+1} \to \overline{M}_{g,n}$, one might hope to get a formula

$$\left\langle \tau_{d+1} \prod_{j=1}^k \tau_{n_j} \right\rangle \overset{?}{=} \left\langle \kappa_d \prod_{j=1}^k \tau_{n_j} \right\rangle. \tag{2.15}$$

This is not true, however. The right version of the formula has corrections that
involve contact terms in which $\tau_{d+1}$ collides with $\tau_{n_j}$ for some j. Such a collision
generates a correction that is an insertion of $\tau_{d+n_j}$. For a fuller explanation, see [17].
Taking account of the contact terms, one can express correlation functions of the
τ’s in terms of those of the κ’s, and vice versa.
A special case is the computation of volumes. As before, we write just κ for $\kappa_1$,
and we define the volume of $M_g$ as $\int_{\overline{M}_g} \kappa^{3g-3}/(3g-3)!$. This can be expressed in
terms of correlation functions of the τ’s, but one has to take the contact terms into
account.
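As an example, we consider the case of a closed surface of genus 2. The volume
of the compactified moduli space $\overline{M}_2$ is

$$V_2 = \frac{1}{3!} \langle \kappa\kappa\kappa \rangle, \tag{2.16}$$

and we want to compare this to topological gravity correlation functions such as

$$\frac{1}{3!} \langle \tau_2 \tau_2 \tau_2 \rangle. \tag{2.17}$$

By integrating over the position of one puncture, we can replace one copy of $\tau_2$
with κ, while also generating contact terms. In such a contact term, $\tau_2$ collides with
some $\tau_s$, s ≥ 0, to generate a contact term $\tau_{s+1}$. Thus for example

$$\langle \tau_2 \tau_2 \tau_2 \rangle = \langle \kappa \tau_2 \tau_2 \rangle + 2 \langle \tau_3 \tau_2 \rangle, \tag{2.18}$$

where the factor of 2 reflects the fact that the first $\tau_2$ may collide with either of
the two other $\tau_2$ insertions to generate a $\tau_3$. The same process applies if factors of
κ are already present; they do not generate additional contact terms. For example,

$$\langle \kappa \tau_2 \tau_2 \rangle = \langle \kappa\kappa \tau_2 \rangle + \langle \kappa \tau_3 \rangle = \langle \kappa\kappa\kappa \rangle + \langle \kappa \tau_3 \rangle. \tag{2.19}$$

Similarly

$$\langle \tau_2 \tau_3 \rangle = \langle \kappa \tau_3 \rangle + \langle \tau_4 \rangle. \tag{2.20}$$

Taking linear combinations of these formulas, we learn finally that

$$\langle \kappa\kappa\kappa \rangle = \langle \tau_2 \tau_2 \tau_2 \rangle - 3 \langle \tau_2 \tau_3 \rangle + \langle \tau_4 \rangle. \tag{2.21}$$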
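The chain of substitutions in Eqs. (2.18)–(2.21) can be checked with exact rational arithmetic. The sketch below is illustrative only: the three genus-2 intersection numbers used as input are the commonly tabulated values, assumed here and not quoted in the text, and the variable names are hypothetical.

```python
from fractions import Fraction

# Assumed genus-2 inputs (commonly tabulated values, not given in the text).
tau2_tau2_tau2 = Fraction(7, 240)
tau2_tau3 = Fraction(29, 5760)
tau4 = Fraction(1, 1152)

# Eq. (2.20) solved for <kappa tau_3>.
kappa_tau3 = tau2_tau3 - tau4
# Eq. (2.18) solved for <kappa tau_2 tau_2>.
kappa_tau2_tau2 = tau2_tau2_tau2 - 2 * tau2_tau3
# Eq. (2.19) solved for <kappa kappa kappa>.
kappa_cubed = kappa_tau2_tau2 - kappa_tau3

# Eq. (2.21) gives the same combination in one step.
assert kappa_cubed == tau2_tau2_tau2 - 3 * tau2_tau3 + tau4

# Eq. (2.16): the volume in the normalization that uses kappa.
V2 = kappa_cubed / 6
print(kappa_cubed, V2)
```

Under these assumed inputs the stepwise elimination and the closed form (2.21) agree, giving $\langle\kappa\kappa\kappa\rangle = 43/2880$ and $V_2 = 43/17280$.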

This is equivalent to saying that $V_2$, which is the term of order $\xi^3$ in

$$\langle \exp(\xi\kappa) \rangle, \tag{2.22}$$

is equally well the term of order $\xi^3$ in

$$\left\langle \exp\left( \xi\tau_2 - \frac{\xi^2}{2!}\tau_3 + \frac{\xi^3}{3!}\tau_4 \right) \right\rangle. \tag{2.23}$$

The generalization of this for higher genus is that

$$\left\langle \exp(\xi\kappa) \right\rangle = \left\langle \exp\left( \sum_{k=2}^\infty \frac{(-1)^k \xi^{k-1}}{(k-1)!} \tau_k \right) \right\rangle. \tag{2.24}$$

The volume of $M_g$ is the coefficient of $\xi^{3g-3}$ in the expansion of either of these formulas.
To prove Eq. (2.24), we write W(ξ) for the right-hand side, and we compute
that

$$\frac{d}{d\xi} W(\xi) = \left\langle \left( \tau_2 + \sum_{r=3}^\infty (-1)^r \frac{\xi^{r-2}}{(r-2)!} \tau_r \right) \exp\left( \sum_{k=2}^\infty \frac{(-1)^k \xi^{k-1}}{(k-1)!} \tau_k \right) \right\rangle. \tag{2.25}$$


Next, one replaces the explicit $\tau_2$ term in the parentheses on the right-hand side
with κ plus a sum of contact terms between $\tau_2$ and the $\tau_k$’s that appear in the
exponential. These contact terms cancel the $\tau_r$’s inside the parentheses, and one
finds

$$\frac{d}{d\xi} W(\xi) = \left\langle \kappa \exp\left( \sum_{k=2}^\infty \frac{(-1)^k \xi^{k-1}}{(k-1)!} \tau_k \right) \right\rangle. \tag{2.26}$$

Repeating this process gives for all s ≥ 0

$$\frac{d^s}{d\xi^s} W(\xi) = \left\langle \kappa^s \exp\left( \sum_{k=2}^\infty \frac{(-1)^k \xi^{k-1}}{(k-1)!} \tau_k \right) \right\rangle. \tag{2.27}$$

Setting ξ = 0, we get

$$\left. \frac{d^s}{d\xi^s} W(\xi) \right|_{\xi=0} = \langle \kappa^s \rangle, \tag{2.28}$$

and the fact that this is true for all s ≥ 0 is equivalent to Eq. (2.24).
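As a consistency check on the right-hand side of Eq. (2.24), one can expand the exponential formally, treating the $\tau_k$ as commuting symbols, and read off the coefficient of a given power of ξ. The following is a minimal sketch under that assumption; the function names and the encoding of a monomial $\tau_{k_1}\tau_{k_2}\cdots$ as a sorted tuple $(k_1, k_2, \dots)$ are hypothetical choices made for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from math import factorial


def multiply(a, b, order):
    """Product of two truncated series; each maps xi-power -> {tau-monomial: coefficient}."""
    out = defaultdict(lambda: defaultdict(Fraction))
    for pa, monos_a in a.items():
        for pb, monos_b in b.items():
            if pa + pb > order:
                continue
            for ta, ca in monos_a.items():
                for tb, cb in monos_b.items():
                    out[pa + pb][tuple(sorted(ta + tb))] += ca * cb
    return out


def exponent_series(order):
    """Exponent of Eq. (2.24), sum_{k>=2} (-1)^k xi^(k-1)/(k-1)! tau_k, up to xi^order."""
    return {k - 1: {(k,): Fraction((-1) ** k, factorial(k - 1))}
            for k in range(2, order + 2)}


def exp_series(order):
    """exp(exponent) up to xi^order, computed as the sum over m of exponent^m / m!."""
    exponent = exponent_series(order)
    result = {0: {(): Fraction(1)}}
    power = {0: {(): Fraction(1)}}
    for m in range(1, order + 1):  # exponent^m starts at xi^m, so m <= order suffices
        power = multiply(power, exponent, order)
        for p, monos in power.items():
            bucket = result.setdefault(p, {})
            for t, c in monos.items():
                bucket[t] = bucket.get(t, Fraction(0)) + c / factorial(m)
    return result


series = exp_series(3)
for monomial, coeff in sorted(series[3].items()):
    print(monomial, coeff)
```

The coefficient of $\xi^3$ comes out as $\tfrac{1}{6}\tau_2^3 - \tfrac{1}{2}\tau_2\tau_3 + \tfrac{1}{6}\tau_4$, which is $1/3!$ times the combination on the right-hand side of Eq. (2.21), as Eqs. (2.16) and (2.23) require.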
Eq. (2.24) has been deduced by comparing matrix model formulas to Mirzakhani’s
formulas for the volumes [42]. We will return to this when we discuss the
spectral curve in Subsec. 4.2. For algebro-geometric approaches and generalizations
see [43, 44]. It is also possible to obtain similar formulas for the volume of $M_{g,\vec b}$.

3. Open Topological Gravity


3.1. Preliminaries
In this section, we provide an introduction to recent work [3–6] on topological gravity
on open Riemann surfaces, that is, on oriented two-manifolds with boundary.
From the point of view of matrix models of two-dimensional gravity, one should
expect an interesting theory of this sort to exist because adding vector degrees of
freedom to a matrix model of two-dimensional gravity gives a potential model of
two-manifolds with boundary.⁷ We will discuss matrix models with vector degrees
of freedom in Sec. 4. Here, however, we discuss the topological field theory side of
the story.

⁷ Similarly, by replacing the usual symmetry group U(N) of the matrix model with O(N) or
Sp(N), one can make a model associated to gravity on unoriented (and possibly unorientable)
two-manifolds. It is not yet understood if this is related to some sort of topological field theory.


Let Σ be a Riemann surface with boundary, and in general with marked points
or punctures both in bulk and on the boundary. Its complex structure gives Σ a
natural orientation, and this induces orientations on all boundary components. If
p is a bulk puncture, then the cotangent space to p in Σ is a complex line bundle
L, and as we reviewed in Subsec. 2.1, one defines for every integer d ≥ 0 the
cohomology class $\tau_d = \psi^d$ of degree 2d. The operator $\tau_0 = 1$ is associated to a bulk
puncture, and the τd with d > 0 are called gravitational descendants.
A boundary puncture has no analogous gravitational descendants, because if p
is a boundary point in Σ, the tangent bundle to p in Σ is naturally trivial. It has a
natural real subbundle given by the tangent space to p in ∂Σ, and this subbundle
is actually trivialized (up to homotopy) by the orientation of ∂Σ. So c1 (L) = 0 if p
is a boundary puncture.
Thus the list of observables in 2d topological gravity on a Riemann surface with
boundary consists of the usual bulk observables $\tau_d$, d ≥ 0, and one more boundary
observable σ, corresponding to a boundary puncture. Formally, the sort of thing one
hopes to calculate for a Riemann surface Σ with n bulk punctures and m boundary
punctures is

$$\langle \tau_{d_1} \tau_{d_2} \cdots \tau_{d_n} \sigma^m \rangle_\Sigma = \int_M \psi_1^{d_1} \psi_2^{d_2} \cdots \psi_n^{d_n}, \tag{3.1}$$

where M is the (compactified) moduli space of conformal structures on Σ with
n bulk punctures and m boundary punctures. The $d_i$ are arbitrary nonnegative
integers, and we note that the cohomology class $\prod_{i=1}^n \psi_i^{d_i}$ that is integrated over
M is generated only from data at the bulk punctures (and in fact only from those
bulk punctures with $d_i > 0$). The boundary punctures (and those bulk punctures
with $d_i = 0$) participate in the construction only because they enter the definition
of M, the space on which the cohomology class in question is supposed to be
integrated. Similarly to the case of a Riemann surface without boundary, to make
the integral (3.1) nonzero, Σ must be chosen topologically so that the dimension of
M is the same as the degree of the cohomology class that we want to integrate:

$$\dim M = \sum_{i=1}^n 2 d_i. \tag{3.2}$$

Assuming that we can make sense of the definition in Eq. (3.1), the (unnormalized)
correlation function $\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_n}\sigma^m \rangle$ of 2d gravity on Riemann surfaces
with boundary is then obtained by summing $\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_n}\sigma^m \rangle_\Sigma$ over all topological
choices of Σ. (If Σ has more than one boundary component, the sum over
Σ includes a sum over how the boundary punctures are distributed among those
boundary components.) It is also possible to slightly generalize the definition by
weighting a surface Σ in a way that depends on the number of its boundary components.
For this, we introduce a parameter w, and weight a surface with h boundary
components with a factor of $w^h$.$^{8}$

8 Still more generally, we could introduce a finite set S of “labels” for the boundaries, so that each

Introducing also the usual coupling parameters $t_i$ associated with the bulk observables
$\tau_i$, and one more parameter v associated with σ, the partition function of
2d topological gravity on a Riemann surface with boundary is then formally
$$Z(t_i; v, w) = \sum_{h=0}^{\infty} \sum_{\Sigma} w^h \left\langle \exp\left( \sum_{i=0}^{\infty} t_i\tau_i + v\sigma \right) \right\rangle_{\Sigma}. \tag{3.3}$$
The sum over Σ runs over all topological types of Riemann surface with h boundary
components, and specified bulk and boundary punctures. The exponential on the
right hand side is expanded in a power series, and the monomials of an appropriate
degree are then evaluated via Eq. (3.1).
There are two immediate difficulties with this formal definition:
(1) To integrate a cohomology class such as $\prod_i \psi_i^{d_i}$ over a manifold M, that
manifold must be oriented. But the moduli space of Riemann surfaces with boundary
is actually unorientable.
(2) For the integral of a cohomology class over an oriented manifold M to be
well-defined topologically, M should have no boundary, or the cohomology class in
question should be trivialized along the boundary of M . However, the compactified
moduli space M of conformal structures on a Riemann surface with boundary is
itself in general a manifold with boundary.
Dealing with these issues requires some refinements of the above formal defini-
tion [3–6]. The rest of this section is devoted to an introduction. We begin with the
unorientability of M. Implications of the fact that M is a manifold with boundary
will be discussed in Subsec. 3.9.

3.2. The anomaly


The problem in orienting the moduli space of Riemann surfaces with boundary can
be seen most directly in the absence of boundary punctures. Thus we let Σ be a
Riemann surface of genus g with h holes or boundary components and no boundary
punctures, but possibly containing bulk punctures.
First of all, if h = 0, then Σ is an ordinary closed Riemann surface, possibly
with punctures. The compactified moduli space M of conformal structures on Σ is
then a complex manifold (or more precisely an orbifold) and by consequence has
a natural orientation. This orientation is used in defining the usual intersection
numbers on M, that is, the correlation functions of 2d topological gravity on a
Riemann surface without boundary.

boundary is labeled by some $s \in S$. Then for each $s \in S$, one would have a boundary observable $\sigma_s$
corresponding to a puncture inserted on a boundary with label s, and a corresponding parameter
$v_s$ to count such punctures. Eq. (3.3) below corresponds to the case that w is the cardinality of
the set S, and $v_s = v$ for all $s \in S$. This generalization to include labels would correspond in
Eq. (4.49) below to modifying the matrix integral with a factor $\prod_{s \in S} \det(z_s - \Phi)$. Similarly, one
could replace $w^h$ by $\prod_{s \in S} w_s^{h_s}$, where $h_s$ is the number of boundary components labeled by s and
there is a separate parameter $w_s$ for each s. This corresponds to including in the matrix integral
a factor $\prod_{s \in S} (\det(z_s - \Phi))^{w_s}$. Such generalizations have been treated in [45].

This remains true if Σ has punctures (which automatically are bulk punctures
since so far Σ has no boundary). Now let us replace some of the punctures of Σ by
holes. Each time we replace a bulk puncture by a hole, we add one real modulus
to the moduli space. If we view Σ as a two-manifold with hyperbolic metric and
geodesic boundaries, then the extra modulus is the circumference b around the hole.
By adding $h > 1$ holes, we add h real moduli $b_1, b_2, \ldots, b_h$, which we write
collectively as $\vec b$. We denote the corresponding compactified moduli space as $\mathcal{M}_{g,n,\vec b}$
(here n is the number of punctures that have not been converted to holes). A very
important detail is the following. In defining Weil–Petersson volumes in Sec. 2,
we treated the $b_i$ as arbitrary constants; the “volume” was defined as the volume
for fixed $\vec b$, without trying to integrate over $\vec b$. (Such an integral would have been
divergent since the volume function $V_{g,n,\vec b}$ is polynomial in $\vec b$. Moreover, what naturally
enters Mirzakhani’s recursion relation is the volume function defined for fixed
$\vec b$.) In defining two-dimensional gravity on a Riemann surface with boundary, the
$b_i$ are treated as full-fledged moduli: they are some of the moduli that one integrates
over in defining the intersection numbers. Hopefully this change in viewpoint
relative to Sec. 2 will not cause serious confusion.
If we suppress the $b_i$ by setting them all to 0, the holes turn back into punctures
and $\mathcal{M}_{g,n,\vec b}$ is replaced by $\mathcal{M}_{g,n+h}$. This is a complex manifold (or rather an orbifold)
and in particular has a natural orientation. Restoring the $b_i$, $\mathcal{M}_{g,n,\vec b}$ is a fiber
bundle$^{9}$ over $\mathcal{M}_{g,n+h}$ with fiber a copy of $\mathbb{R}_+^h$ parametrized by $b_1, \ldots, b_h$. (Here $\mathbb{R}_+$
is the space of positive real numbers and $\mathbb{R}_+^h$ is the Cartesian product of h copies
of $\mathbb{R}_+$.) Orienting $\mathcal{M}_{g,n,\vec b}$ is equivalent to orienting this copy of $\mathbb{R}_+^h$.
If we were given an ordering of the holes in Σ up to an even permutation, we
would orient $\mathbb{R}_+^h$ by the differential form

$$\Omega = db_1\, db_2 \cdots db_h. \tag{3.4}$$

However, for $h > 1$, in the absence of any information about how the holes should
be ordered, $\mathbb{R}_+^h$ has no natural orientation.
Thus $\mathcal{M}_{g,n,\vec b}$ has no natural orientation for $h > 1$. In fact it is unorientable.
This follows from the fact that a Riemann surface Σ with more than one hole
has a diffeomorphism that exchanges two of the holes, leaving the others fixed.
(Moreover, this diffeomorphism can be chosen to preserve the orientation of Σ.)
Dividing by this diffeomorphism in constructing the moduli space $\mathcal{M}_{g,n,\vec b}$ ensures
that this moduli space is actually unorientable.
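
As a small illustration of the sign involved (a sketch that is not part of the original argument), a relabeling of the holes multiplies the form $db_1\, db_2 \cdots db_h$ by the sign of the corresponding permutation. The helper `permutation_sign` below is a hypothetical name introduced only for this check; it confirms that exchanging two holes reverses the orientation of the fiber, while a cyclic relabeling of three holes preserves it.

```python
from itertools import permutations

def permutation_sign(perm):
    """Sign of a permutation of 0..h-1, computed by counting inversions.

    This is the factor by which db_1 db_2 ... db_h is multiplied when the
    holes are relabeled by `perm` (hypothetical helper, for illustration).
    """
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

# Exchanging the first two holes (third one fixed) is an odd permutation.
swap = (1, 0, 2)
# A cyclic relabeling of three holes is an even permutation.
cycle = (1, 2, 0)

print(permutation_sign(swap))   # -1: the orientation is reversed
print(permutation_sign(cycle))  # 1: the orientation is preserved

# For h > 1, exactly half of the h! relabelings reverse the orientation.
h = 4
signs = [permutation_sign(p) for p in permutations(range(h))]
print(signs.count(1), signs.count(-1))
```

Since the exchange of two holes is realized by a diffeomorphism of Σ, the odd sign cannot be avoided by any choice of convention, which is the content of the statement above.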
We can view this as a global anomaly in two-dimensional topological gravity on
an oriented two-manifold with boundary. The moduli space is not oriented, or even

9 This assertion actually follows from a fact that was exploited in Sec. 2: for fixed $\vec b$, $\mathcal{M}_{g,n,\vec b}$ is
isomorphic to $\mathcal{M}_{g,n+h}$ as an orbifold (their symplectic structures are inequivalent, as we discussed
in Sec. 2). Given this equivalence for fixed $\vec b$, it follows that upon letting $\vec b$ vary, $\mathcal{M}_{g,n,\vec b}$ is a fiber
bundle over $\mathcal{M}_{g,n+h}$ with fiber parametrized by $\vec b$.

orientable, so there is no way to make sense of the correlation functions that one
wishes to define.
As usual, to cancel the anomaly, one can try to couple two-dimensional topo-
logical gravity to some matter system that carries a compensating anomaly. In
the context of two-dimensional topological gravity, the matter system in question
should be a topological field theory. To define a theory that reduces to the usual
topological gravity when the boundary of Σ is empty, we need a topological field
theory that on a Riemann surface without boundary is essentially trivial, in a sense
that we will see, and in particular is anomaly-free. But the theory should become
anomalous in the presence of boundaries.
These conditions may sound too strong, but there actually is a topological field
theory with the right properties. First of all, we endow Σ with a spin structure. (We
will ultimately sum over spin structures to get a true topological field theory that
does not depend on the choice of an a priori spin structure on Σ.) When ∂Σ = ∅,
we can define a chiral Dirac operator on Σ (a Dirac operator acting on positive
chirality spin 1/2 fields on Σ). There is then a $\mathbb{Z}_2$-valued invariant that we call ζ,
namely the mod 2 index of the chiral Dirac operator, in the sense of Atiyah and
Singer [46,47]. ζ is defined as the number of zero-modes of the chiral Dirac operator,
reduced mod 2. ζ is a topological invariant in that it does not depend on the choice
of a conformal structure (or metric) on Σ. A spin structure is said to be even or
odd if the number of chiral zero-modes is even or odd (in other words if ζ = 0 or
ζ = 1). For an introduction to these matters, see [48], especially Subsec. 3.2.
We define a topological field theory by summing over spin structures on Σ with
each spin structure weighted by a factor of $\frac{1}{2}(-1)^\zeta$. The reason for the factor of
$\frac{1}{2}$ is that a spin structure has a symmetry group that acts on fermions as ±1,
with 2 elements. As in Faddeev–Popov gauge-fixing in gauge theory, to define a
topological field theory, one needs to divide by the order of the unbroken symmetry
group, which in this case is the group $\mathbb{Z}_2$. This accounts for the factor of $\frac{1}{2}$. The
more interesting factor, which will lead to a boundary anomaly, is $(-1)^\zeta$. It may
not be immediately apparent that we can define a topological field theory with
this factor included. We will describe two realizations of the theory in question in
Subsec. 3.4, making it clear that there is such a topological field theory. We will
call it T .
On a Riemann surface of genus g, there are $\frac{1}{2}(2^{2g} + 2^g)$ even spin structures and
$\frac{1}{2}(2^{2g} - 2^g)$ odd ones. The partition function of T in genus g is thus
$$Z_g = \frac{1}{2}\left( \frac{1}{2}(2^{2g} + 2^g) - \frac{1}{2}(2^{2g} - 2^g) \right) = 2^{g-1}. \tag{3.5}$$
This is not equal to 1, and thus the topological field theory T is nontrivial. However,
when we couple to topological gravity, the genus g amplitude has a factor
$g_{\rm st}^{2g-2}$, where $g_{\rm st}$ is the string coupling constant.$^{10}$ The product of this with $Z_g$ is

10 In mathematical treatments, $g_{\rm st}$ is often set to 1. There is no essential loss of information, as



$(2g_{\rm st}^2)^{g-1}$. Thus, as long as we are on a Riemann surface without boundary, coupling
topological gravity to T can be compensated$^{11}$ by absorbing a factor of $\sqrt{2}$ in the
definition of $g_{\rm st}$. In that sense, coupling of topological gravity to T has no effect,
as long as we consider only closed Riemann surfaces.
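
The counting behind Eq. (3.5) can be checked by brute force for small genus. The sketch below relies on an assumption that is standard but not stated in the text: relative to a symplectic basis of one-cycles, spin structures on a closed genus g surface are labeled by characteristics (a, b) in $(\mathbb{Z}_2)^g \times (\mathbb{Z}_2)^g$, and the parity ζ equals $a \cdot b$ mod 2 (the Arf invariant). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def spin_structure_counts(g):
    """Count even and odd spin structures on a closed genus-g surface.

    Assumption (not from the text): spin structures are labeled by
    characteristics (a, b) in (Z_2)^g x (Z_2)^g, and the parity zeta
    equals a.b mod 2 (the Arf invariant).
    """
    even = odd = 0
    for bits in product((0, 1), repeat=2 * g):
        a, b = bits[:g], bits[g:]
        zeta = sum(x * y for x, y in zip(a, b)) % 2
        if zeta == 0:
            even += 1
        else:
            odd += 1
    return even, odd

def partition_function_T(g):
    """Z_g = sum over spin structures of (1/2) (-1)^zeta, as in Eq. (3.5)."""
    even, odd = spin_structure_counts(g)
    return Fraction(even - odd, 2)

for g in range(5):
    even, odd = spin_structure_counts(g)
    # Compare with the closed formulas quoted in the text.
    assert even == (2 ** (2 * g) + 2 ** g) // 2
    assert odd == (2 ** (2 * g) - 2 ** g) // 2
    assert partition_function_T(g) == Fraction(2) ** (g - 1)
    print(g, even, odd, partition_function_T(g))
```

Exact rational arithmetic is used so that the genus 0 value $Z_0 = 1/2$ is reproduced without rounding.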
Matters are different if Σ has a boundary. On a Riemann surface with boundary,
it is not possible to define a local boundary condition for the chiral Dirac operator
that is complex linear and sensible (elliptic), and there is no topological invariant
corresponding to ζ. Thus theory T cannot be defined as a topological field theory
on a manifold with boundary.
It is possible to define theory T on a manifold with boundary as a sort of
anomalous topological field theory, with an anomaly that will help compensate
for the problem that we found above with the orientation of the moduli space.
To explain this, we will first describe some more physical constructions of theory
T . First we discuss how theory T is related to contemporary topics in condensed
matter physics.

3.3. Relation to condensed matter physics


Theory T has a close cousin that is familiar in condensed matter physics. One
considers a chain of fermions in 1 + 1 dimensions with the property that in the bulk
of the chain there is an energy gap to the first excited state above the ground state,
and the further requirement that the chain is in an “invertible” phase, meaning that
the tensor product of a suitable number of identical chains would be completely
trivial.12 There are two such phases, just one of which is nontrivial. The nontrivial
phase is called the Kitaev spin chain [50]. It is characterized by the fact that at
the end of an extremely long chain, there is an unpaired Majorana fermion mode,
usually called a zero-mode because in the limit of a long chain, it commutes with
the Hamiltonian.13
The Kitaev spin chain is naturally studied in condensed matter physics from a
Hamiltonian point of view, which in fact we adopted in the last paragraph. From
a relativistic point of view, the Kitaev spin chain corresponds to a topological
field theory that is defined on an oriented two-dimensional spin manifold Σ, and

the dependence on $g_{\rm st}$ carries the same information as the dependence on the parameter $t_1$ in the
generating function. This follows from the dilaton equation, that is, the $L_0$ Virasoro constraint.
11 This point was actually made in [49], as a special case of a more general discussion involving
$r$th roots of the canonical bundle of Σ for arbitrary $r \geq 2$.


12 Triviality here means that by deforming the Hamiltonian without losing the gap in the bulk

spectrum, one can reach a Hamiltonian whose ground state is the tensor product of local wave-
functions, one on each lattice site.
13 A long but finite chain has a pair of such Majorana modes, one at each end. Upon quantization,

they generate a rank two Clifford algebra, whose irreducible representation is two-dimensional.
As a result, a long chain is exponentially close to having a two-fold degenerate ground state.
In condensed matter physics, this degeneracy is broken by tunnelling effects in which a fermion
propagates between the two ends of the chain. In the idealized model considered below, the
degeneracy is exact.

whose partition function if Σ has no boundary is $(-1)^\zeta$. We will see below how
this statement relates to more standard characterizations of the Kitaev spin chain.
Our theory T differs from the Kitaev spin chain simply in that we sum over spin
structures in defining it, while the Kitaev model is a theory of fermions and is
defined on a two-manifold with a particular spin structure. Moreover, as we discuss
in detail below, when Σ has a boundary, the appropriate boundary conditions in
the context of condensed matter physics are different from what they are in our
application to two-dimensional gravity. Despite these differences, the comparison
between the two problems will be illuminating.
Because we are here studying two-dimensional gravity on an oriented two-
manifold Σ, time-reversal symmetry, which corresponds to a diffeomorphism that
reverses the orientation of Σ, will not play any role. The Kitaev spin chain has an
interesting time-reversal symmetric refinement, but this will not be relevant.
Theory T has another interesting relation to condensed matter physics: it is
associated to the high temperature phase of the two-dimensional Ising model. In
this interpretation [51], the triviality of theory T corresponds to the fact that the
Ising model in its high temperature phase has only one equilibrium state, which
moreover is gapped.

3.4. Two realizations of theory T


We will describe two realizations of theory T , one in the spirit of condensed matter
physics, where we get a topological field theory as the low energy limit of a physical
gapped system, and one in the spirit of topological sigma models [52], where a
supersymmetric theory is twisted to get a topological field theory.
First we consider a massive Majorana fermion in two spacetime dimensions. It
is convenient to work in Euclidean signature. One can choose the Dirac operator to
be

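$$D_m = \gamma^1 D_1 + \gamma^2 D_2 + m\gamma, \tag{3.6}$$

where one can choose the gamma matrices to be real and symmetric, for instance
$$\gamma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.7}$$

This ensures that $\gamma = \gamma^1\gamma^2$ is real and antisymmetric:
$$\gamma = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{3.8}$$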
These choices ensure that the Dirac operator Dm is real and antisymmetric. We
call m the mass parameter; the mass of the fermion is actually |m|.
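
The algebraic properties of the gamma matrices used here are easy to confirm explicitly. The following sketch multiplies the matrices of Eq. (3.7) as plain nested lists (the helper functions are introduced only for this check) and verifies that their product reproduces Eq. (3.8), along with the symmetry and anticommutation properties.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

gamma1 = [[0, 1], [1, 0]]    # Eq. (3.7)
gamma2 = [[1, 0], [0, -1]]   # Eq. (3.7)
gamma = matmul(gamma1, gamma2)

checks = {
    "gamma equals Eq. (3.8)": gamma == [[0, -1], [1, 0]],
    "gamma1 and gamma2 are symmetric":
        transpose(gamma1) == gamma1 and transpose(gamma2) == gamma2,
    "gamma is antisymmetric": transpose(gamma) == scale(-1, gamma),
    "gamma1^2 = gamma2^2 = 1":
        matmul(gamma1, gamma1) == IDENTITY and matmul(gamma2, gamma2) == IDENTITY,
    "gamma1 and gamma2 anticommute":
        add(matmul(gamma1, gamma2), matmul(gamma2, gamma1)) == ZERO,
    "gamma anticommutes with gamma1":
        add(matmul(gamma1, gamma), matmul(gamma, gamma1)) == ZERO,
    "gamma^2 = -1": matmul(gamma, gamma) == scale(-1, IDENTITY),
}
for name, ok in checks.items():
    print(name, ok)
```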
Formally, the path integral for a single Majorana fermion is $\mathrm{Pf}(D_m)$, the Pfaffian
of the real antisymmetric operator $D_m$. The Pfaffian of a real antisymmetric operator
is real, and its square is the determinant; in the present context, the determinant

$\det D_m = (\mathrm{Pf}(D_m))^2$ can be defined satisfactorily by, for example, zeta-function
regularization. However, the sign of the Pfaffian is subtle. For a finite-dimensional
real antisymmetric matrix M, the sign of the Pfaffian depends on an orientation
of the space that M acts on. In the case of the infinite-dimensional matrix $D_m$, no
such natural orientation presents itself and therefore, for a single Majorana fermion,
there is no natural choice of the sign of $\mathrm{Pf}(D_m)$.
Suppose, however, that we consider a pair of Majorana fermions with the same
(nonzero) mass parameter m. Then the path integral is $\mathrm{Pf}(D_m)^2$ and (since $\mathrm{Pf}(D_m)$
is naturally real) this is real and positive. This actually ensures that the topological
field theory obtained from the low energy limit of a pair of massive Majorana
fermions of the same mass parameter is completely trivial. Without losing the mass
gap, we can take $|m| \to \infty$, and in that limit, $\mathrm{Pf}(D_m)^2$ produces no physical effects
at all except for a renormalization of some of the parameters in the effective action.$^{14}$
To get theory T , we consider instead a pair of Majorana fermions, one of positive
mass parameter and one of negative mass parameter. Just varying mass parameters,
to interpolate between this theory and the equal mass parameter case, we would
have to let a mass parameter pass through 0, losing the mass gap.
This suggests that a theory with opposite sign mass parameters might be in
an essentially different phase from the trivial case of equal mass parameters. To
establish this and show the relation to theory T , we will analyze what happens to
the partition function of the theory when the mass parameter of a single Majorana
fermion is varied between positive and negative values.
The absolute value of $\mathrm{Pf}(D_m)$ does not depend on the sign of m. This follows
from the fact that the operator $D_m^2$ is invariant under $m \to -m$. (The determinant
of $-D_m^2$ is a power of $\mathrm{Pf}(D_m)$, and the fact that it is invariant under $m \to -m$
implies that $\mathrm{Pf}(D_m)$ is independent of sign(m) up to sign.) Therefore the partition
functions of the two theories with opposite masses or with equal masses have the
same absolute value. They can differ only in sign, and this sign is what we want to
understand.
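
The invariance of the squared operator can be made explicit. The short derivation below is a sketch under the assumption that the gamma matrices commute with the covariant derivatives $D_1, D_2$; it writes $D = \gamma^1 D_1 + \gamma^2 D_2$ for the massless operator and uses only $\gamma^i\gamma + \gamma\gamma^i = 0$ and $\gamma^2 = -1$, both of which follow from Eqs. (3.7) and (3.8).

```latex
\begin{aligned}
D_m^2 &= (D + m\gamma)^2
       = D^2 + m\,(D\gamma + \gamma D) + m^2\gamma^2 \\
      &= D^2 + m\sum_{i=1,2}\bigl(\gamma^i\gamma + \gamma\gamma^i\bigr)D_i - m^2
       = D^2 - m^2 .
\end{aligned}
```

The mass thus enters the squared operator only through $m^2$, so reversing the sign of m leaves it unchanged.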
To determine the sign, we ask what happens to Pf(Dm ) when m is varied from
large positive values to large negative ones. To change sign, Pf(Dm ) has to vanish,
and it vanishes only when Dm has a zero-mode. This can only happen at m = 0.
So the question is just to determine what happens to the sign of Pf(Dm ) when m
passes through 0.
Zero-modes of $D_m$ at m = 0 are simply zero-modes of the massless Dirac operator
$D = \gamma^1 D_1 + \gamma^2 D_2$. Such modes appear in pairs of equal and opposite chirality.
To be more precise, let $\hat\gamma = i\gamma$ be the chirality operator, with eigenvalues 1 and
−1 for fermions of positive or negative chirality. What we called the chiral Dirac

14 In two dimensions, when we integrate out a massive neutral field, the only parameters that
have to be renormalized are the vacuum energy, which corresponds to a term in the effective
action proportional to the volume $\int_\Sigma d^2x \sqrt{g}$ of Σ, and the coefficient of another term $\int_\Sigma d^2x \sqrt{g}\, R$
proportional to the Euler characteristic of Σ (here R is the Ricci scalar of Σ).



operator when we defined the mod 2 index in section 3.2 is the operator D restricted
to act on states of $\hat\gamma = +1$. Since $\hat\gamma$ is imaginary and D is real, complex
conjugation of an eigenfunction reverses its $\hat\gamma$ eigenvalue while commuting with D;
thus zero-modes of D occur in pairs with equal and opposite chirality.
Restricted to such a pair of zero-modes of D, the antisymmetric operator $D_m$
looks like
$$\begin{pmatrix} 0 & -m \\ m & 0 \end{pmatrix}, \tag{3.9}$$

and its Pfaffian is m, up to an m-independent sign that depends on a choice of
orientation in the two-dimensional space. The important point is that this Pfaffian
changes sign when m changes sign. If there are s pairs of zero-modes, the Pfaffian
in the zero-mode space is $m^s$, and so changes in sign by $(-1)^s$ when m changes
sign. But s is precisely the number of zero-modes of positive chirality, and $(-1)^s$ is
the same as $(-1)^\zeta$, where ζ is the mod 2 index of the Dirac operator.
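
The sign change can be checked on a finite-dimensional model of the zero-mode space. The sketch below builds a block-diagonal matrix from s copies of the block in Eq. (3.9) and evaluates its Pfaffian by expansion along the first row; `pfaffian` and `zero_mode_block` are hypothetical helpers written for this illustration. With the convention used in the code, one block has Pfaffian −m rather than m, which is the m-independent orientation sign mentioned above.

```python
def pfaffian(a):
    """Pfaffian of an even-dimensional antisymmetric matrix (nested lists),
    computed by expansion along the first row."""
    n = len(a)
    if n == 0:
        return 1
    total = 0
    for j in range(1, n):
        if a[0][j] == 0:
            continue
        # Delete rows and columns 0 and j, then recurse on the minor.
        keep = [k for k in range(n) if k not in (0, j)]
        minor = [[a[r][c] for c in keep] for r in keep]
        total += (-1) ** (j + 1) * a[0][j] * pfaffian(minor)
    return total

def zero_mode_block(m, s):
    """Block-diagonal matrix with s copies of [[0, -m], [m, 0]], Eq. (3.9)."""
    size = 2 * s
    mat = [[0] * size for _ in range(size)]
    for k in range(s):
        mat[2 * k][2 * k + 1] = -m
        mat[2 * k + 1][2 * k] = m
    return mat

m = 3
for s in range(1, 5):
    plus = pfaffian(zero_mode_block(m, s))
    minus = pfaffian(zero_mode_block(-m, s))
    # Reversing the sign of m multiplies the Pfaffian by (-1)^s.
    assert minus == (-1) ** s * plus
    print(s, plus, minus)
```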
Now we can answer the original question. Since the partition function for the
theory with two equal mass parameters is trivial (up to a renormalization of some
of the low energy parameters), the partition function of the theory with one mass
of each sign is $(-1)^\zeta$ (up to such a renormalization). Thus we have found a physical
realization of theory T .
The result we have found can be interpreted in terms of a discrete chiral anomaly.
At the classical level, for m = 0, the Majorana fermion has a $\mathbb{Z}_2$ chiral symmetry$^{15}$
$\psi \to \hat\gamma\psi$. The mass parameter is odd under this symmetry, so classically the theories
with positive or negative m are equivalent. Quantum mechanically, one has to ask
whether the fermion measure is invariant under the discrete chiral symmetry. As
usual, the nonzero modes of the Dirac operator are paired up in such a way that
the measure for those modes is invariant; thus one only has to test the zero-modes.
Since $\psi \to \hat\gamma\psi$ leaves invariant the positive chirality zero-modes and multiplies each
negative chirality zero-mode by −1, this operation transforms the measure by a
factor $(-1)^s = (-1)^\zeta$, where s is the number of negative (or positive) chirality
zero-modes, and ζ is the mod 2 index.
Finally, we will describe another though closely related way to construct the
same topological field theory. The extra machinery required will be useful later.
We consider in two dimensions a theory with (2, 2) supersymmetry and a single
complex chiral superfield Φ. We work in flat spacetime to begin with and assume
a familiarity with the usual superspace formalism of (2, 2) supersymmetry and its

15 With our conventions, the operator $\hat\gamma$ is imaginary in Euclidean signature and one might wonder
if this symmetry makes sense for a Majorana fermion. However, after Wick rotation to Lorentz
signature (in which $\gamma^0$ acquires a factor of i), $\hat\gamma$ becomes real, and it is always in Lorentz signature
that reality conditions should be imposed on fermion fields and their symmetries. Thus actually
$\psi \to \hat\gamma\psi$ is a physically meaningful symmetry and $\psi \to \gamma\psi$ (which may look more natural in
Euclidean signature) is not. Under the latter transformation, the massless Dirac action actually
changes sign, so it is indeed $\psi \to \hat\gamma\psi$ and not $\psi \to \gamma\psi$ that is a symmetry at the classical level.

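twisting to make a topological field theory. Φ can be expanded

$$\Phi = \phi + \theta_-\psi_+ + \theta_+\psi_- + \theta_+\theta_- F. \tag{3.10}$$

Here φ is a complex scalar field; $\psi_+$ and $\psi_-$ are the chiral components of a Dirac
fermion field; and F is a complex auxiliary field. We consider the action
$$S = \int d^2x\, d^4\theta\, \overline{\Phi}\Phi + \frac{i}{2}\int d^2x\, d^2\theta\, m\Phi^2 - \frac{i}{2}\int d^2x\, d^2\overline{\theta}\, \overline{m}\,\overline{\Phi}^2. \tag{3.11}$$
Thus the superpotential is $W(\Phi) = im\Phi^2/2$. In general, here m is a complex mass
parameter, but for our purposes we can assume that m > 0. After integrating over
the θ’s and integrating out the auxiliary field F, the action becomes$^{16}$
$$S = \int d^2x \left( \partial_\mu\overline{\phi}\,\partial^\mu\phi + m^2|\phi|^2 + \overline{\psi}\gamma^\mu\partial_\mu\psi + im\,\epsilon^{\alpha\beta}\bigl(\psi_\alpha\psi_\beta - \overline{\psi}_\alpha\overline{\psi}_\beta\bigr) \right). \tag{3.12}$$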

If we expand the Dirac fermion ψ in terms of a pair of Majorana fermions $\chi_1, \chi_2$
by $\psi = (\chi_1 + i\chi_2)/\sqrt{2}$, we find that $\chi_1$ and $\chi_2$ are massive Majorana fermions with
a mass matrix that has one positive and one negative eigenvalue, as in our previous
construction of theory T. The massive field φ does not play an important role at
low energies: its path integral is positive definite, and in the large m or low energy
limit, just contributes renormalization effects. So at low energies the supersymmetric
theory considered here gives another realization of theory T. However, the
supersymmetric machinery gives a way to obtain theory T without taking a low
energy limit, and this will be useful later. Because the superpotential $W = im\Phi^2/2$
is homogeneous in Φ, the theory has a U(1) R-symmetry that acts on the superspace
coordinates as $\theta_\pm \to e^{i\alpha}\theta_\pm$. Because W is quadratic in Φ, one has to define
this symmetry to leave ψ invariant and to transform φ by $\phi \to e^{i\alpha}\phi$. When one
“twists” to make a topological field theory, the spin of a field is shifted by one-half
of its R-charge. In the present case, as ψ is invariant under the R-symmetry, it
remains a Dirac fermion after twisting, but φ acquires spin +1/2 (it transforms
under rotations like the positive chirality part of a Dirac fermion).
The twisted theory can be formulated as a topological field theory on any Rie-
mann surface Σ with any metric tensor. We use the phrase “topological field theory”
loosely since the twisted theory, as it has fields of spin 1/2, requires a choice of spin
structure. To get a true topological field theory, one has to sum over the choice
of spin structure. The supersymmetry of the twisted theory ensures that the path
integral over φ cancels the absolute value of the path integral over ψ, leaving only
the sign $(-1)^\zeta$. Thus the twisted theory is precisely equivalent to theory T, without
taking any low energy limit.
In [49], this last statement is deduced in another way as a special case of an
analysis of a theory with $\Phi^r$ superpotential for any $r \geq 2$.

16 Here $\epsilon^{\alpha\beta}$ is the Levi-Civita antisymmetric tensor in the two-dimensional space spanned by
$\psi_+, \psi_-$.

For our later application, it will be useful to know that the condition for a
configuration of the φ field to preserve the supersymmetry of the twisted theory is
$$\overline{\partial}\,\overline{\phi} + im\phi = 0. \tag{3.13}$$
The generalization of this equation for arbitrary superpotential is
$$\overline{\partial}\,\overline{\phi} + \frac{\partial W}{\partial\phi} = 0, \tag{3.14}$$
which has been called the ζ-instanton equation [53]. A small calculation shows that
if we set $\phi = \phi_1 + i\phi_2$ with real $\phi_1, \phi_2$, and set $\hat\phi = \begin{pmatrix} \phi_1 \\ -\phi_2 \end{pmatrix}$, then Eq. (3.13) is
equivalent to
$$D_m\hat\phi = 0, \tag{3.15}$$
with $D_m$ the massive Dirac operator (3.6).

3.5. Boundary conditions in theory T


Our next task is to consider theory T on a manifold with boundary. Here of course
we must begin by discussing possible boundary conditions. In this section, we will
use the realization of theory T in terms of a pair of Majorana fermions with opposite
masses.
The main requirement for a boundary condition is that it should preserve the
antisymmetry of the operator $D_m$. If tr denotes the transpose, then the antisymmetry
means concretely that
$$\int \left( \chi^{\rm tr} D_m\psi + (D_m\chi)^{\rm tr}\psi \right) = 0. \tag{3.16}$$

In verifying this, one has to integrate by parts, and one encounters a surface term,
which is the boundary integral of $\chi^{\rm tr}\gamma_\perp\psi$, where $\gamma_\perp$ is the gamma matrix normal
to the boundary. This will vanish if we impose the boundary condition
$$\gamma_\parallel\psi\big| = \eta\,\psi\big|, \tag{3.17}$$
where η = +1 or −1, $\gamma_\parallel$ is the gamma matrix tangent to the boundary, and |
represents restriction to the boundary. Just to ensure the antisymmetry of the
operator $D_m$, either choice of sign will do. With either choice of sign, $D_m$ is a real
operator, so its Pfaffian $\mathrm{Pf}(D_m)$ remains real.
The boundary conditions $\gamma_\parallel\psi = \pm\psi$ have a simple interpretation. Tangent to
the boundary, there is a single gamma matrix $\gamma_\parallel$. It generates a rank 1 Clifford
algebra, satisfying $\gamma_\parallel^2 = 1$. In an irreducible representation, it satisfies $\gamma_\parallel = 1$ or
$\gamma_\parallel = -1$. Thus the spin bundle of Σ, which is a real vector bundle of rank 2,
decomposes along ∂Σ as the direct sum of two spin bundles of ∂Σ, namely the
subbundles defined respectively by $\gamma_\parallel\psi = \psi$ and by $\gamma_\parallel\psi = -\psi$. These two spin
bundles of ∂Σ are isomorphic, since they are exchanged by multiplication by $\gamma_\perp$,

which is globally-defined along ∂Σ. Thus the spin bundle of Σ decomposes along
∂Σ in a natural way as the direct sum of two copies of the spin bundle of ∂Σ, and
the boundary condition says that along the boundary, ψ takes values in one of these
bundles. We will write S for the spin bundle of Σ and E for the spin bundle of ∂Σ.
Now let us discuss the behavior near the boundary of a Majorana fermion that
satisfies one of these boundary conditions. We work on a half-space in $\mathbb{R}^2$, say the
half-space $x_1 \geq 0$ in the $x_1 x_2$ plane. For a mode that is independent of $x_2$, the
Dirac equation $D_m\psi = 0$ becomes
$$\left( \frac{d}{dx_1} + m\gamma_2 \right)\psi = 0, \tag{3.18}$$
with solution
$$\psi = \exp(-m x_1\gamma_2)\,\psi_0, \tag{3.19}$$
for some $\psi_0$. In this geometry, $\gamma_2$ is the same as $\gamma_\parallel$. We see that if ψ satisfies the
boundary condition $\gamma_\parallel\psi = \eta\psi$, then this mode is normalizable if and only if
$$m\eta > 0. \tag{3.20}$$
If $m\eta < 0$, the theory remains gapped, with a gap of order m, even along the
boundary. But if $m\eta > 0$, the mode that we have just found propagates along the
boundary as a (0 + 1)-dimensional massless Majorana fermion.
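
The normalizability criterion can be illustrated numerically. In the sketch below, $\gamma_2$ is taken to be the diagonal matrix diag(1, −1) of Eq. (3.7), so that the exponential in Eq. (3.19) is diagonal as well. Comparing the norm of the mode deep in the bulk with its norm at the boundary is used as a crude stand-in for normalizability; the function names and the depth of 10 length units are arbitrary choices made for this illustration.

```python
import math

def boundary_mode(m, eta, x1):
    """psi(x1) = exp(-m x1 gamma_2) psi_0 with gamma_2 = diag(1, -1), Eq. (3.19).

    psi_0 is chosen to obey gamma_2 psi_0 = eta psi_0, which is Eq. (3.17)
    on the half-space x1 >= 0, where gamma_parallel = gamma_2.
    """
    psi0 = (1.0, 0.0) if eta == 1 else (0.0, 1.0)
    # exp(-m x1 gamma_2) is diagonal because gamma_2 is diagonal.
    return (math.exp(-m * x1) * psi0[0], math.exp(m * x1) * psi0[1])

def norm_squared(psi):
    return psi[0] ** 2 + psi[1] ** 2

def decays_into_bulk(m, eta, depth=10.0):
    """True if the mode is smaller at x1 = depth than at the boundary x1 = 0."""
    at_depth = norm_squared(boundary_mode(m, eta, depth))
    at_boundary = norm_squared(boundary_mode(m, eta, 0.0))
    return at_depth < at_boundary

for m in (-2.0, -0.5, 0.5, 2.0):
    for eta in (1, -1):
        # The mode decays exactly when m * eta > 0, as in Eq. (3.20).
        print(m, eta, decays_into_bulk(m, eta), m * eta > 0)
```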
We will now use these results to study the boundary anomaly of theory T , with
several possible boundary conditions.

3.6. Boundary anomaly of theory T


Let us first recall that for a real fermion field with a real antisymmetric Dirac
operator such as $D_m$, in general there is an anomaly in the sign of the path integral
$\mathrm{Pf}(D_m)$. The anomaly is naturally described mathematically by saying that there
is a real Pfaffian line $\mathit{PF}$ associated to the Dirac operator, and the fermion Pfaffian
$\mathrm{Pf}(D_m)$ is well-defined as a section of $\mathit{PF}$.
In our problem, there are two Majorana fermions, say $\psi_1$ and $\psi_2$, with possibly
different masses and possibly different boundary conditions. Correspondingly there
are two Pfaffian lines, say $\mathit{PF}_1$ and $\mathit{PF}_2$, and the overall Pfaffian line is the tensor
product$^{17}$ $\mathit{PF} = \mathit{PF}_1 \otimes \mathit{PF}_2$.

17 There is a potential subtlety here. If a fermion field has an odd number of zero-modes, its Pfaffian
line should be considered odd or fermionic. Accordingly, if $\psi_1$ and $\psi_2$ each have an odd number of
zero-modes, then $\mathit{PF}_1$ and $\mathit{PF}_2$ are both odd and the correct statement is that $\mathit{PF} = \mathit{PF}_1 \,\hat\otimes\, \mathit{PF}_2$,
where $\hat\otimes$ is a $\mathbb{Z}_2$-graded tensor product (this notion is described in Subsec. 3.6.2). We will not
encounter this subtlety, because always at least one of $\psi_1$ and $\psi_2$ will satisfy one of the boundary
conditions (3.17). A fermion field obeying one of those boundary conditions has an even number
of zero-modes, since there are none at all if $m\eta < 0$ and the number is independent of m mod 2.
Note that on a Riemann surface with boundary, there is no notion of the chirality of a zero-mode
and we simply count all zero-modes. By contrast, the mod 2 index that is used in defining theory
T on a surface without boundary is defined by counting positive chirality zero-modes only.

In general, the Pfaffian line of a Dirac operator does not depend on a fermion
mass, but it may depend on the boundary conditions. Indeed, as we will see, there
is such a dependence in our problem and it will play an essential role.
We will now consider the boundary path integral and boundary anomaly in our
problem for several choices of boundary condition.

3.6.1. The trivial case


The most trivial case is that the two masses and also the two boundary conditions
are the same. Moreover, we choose the masses and the signs so that mη < 0.
Since the two boundary conditions are the same, PF 1 is canonically isomorphic
to PF 2 , and therefore PF = PF 1 ⊗ PF 2 is canonically trivial.
Since the two Majorana fermions have the same mass and boundary condition,
the combined Dirac operator D of the two modes is just the direct sum of two copies
of the same Dirac operator Dm . Thus the fermion path integral Pf(D) satisfies
Pf(D) = Pf(Dm )^2 , and in particular Pf(D) is naturally positive (relative to the
trivialization of PF that reflects the isomorphism PF 1 ≅ PF 2 ).
Since mη < 0, there are no low-lying modes near the boundary and the theory
has a uniform mass gap of order m along the boundary as well as in the bulk.
Therefore, after renormalizing a few constants in the low energy effective action,
the path integral Pf(D) is just 1.
In other words, with equal boundary conditions for the two modes, the trivial
theory with equal masses remains trivial along the boundary.
Assuming we allow ourselves to make a generic relevant deformation of the
theory (as we would certainly do in condensed matter physics, for example), this is
still true if we pick the boundary conditions for the two Majorana fermions to be
equal but such that mη > 0. Then we generate two (0 + 1)-dimensional massless
Majorana fermions, say χ1 , χ2 . But given any such pair of Majorana modes in
(0 + 1) dimensions, one can add a mass term iµχ1 χ2 to the Hamiltonian (or the
Lagrangian), with some constant µ, removing them from the low energy theory.
The theory becomes gapped and the renormalized partition function is again 1.
Fermi statistics do not allow the addition of a mass term for a single massless 1d
Majorana fermion. Hence the number of 1d Majorana modes along the boundary
is a topological invariant mod 2. We will discuss next the case that this invariant
is nonzero.
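As a small illustration of the last two paragraphs, the following sketch (ours, not part of the original construction) realizes two Majorana modes χ1 , χ2 as 2 × 2 matrices, assuming the normalization χi^2 = 1, and checks that H = iµχ1 χ2 has spectrum ±µ, so that the pair is gapped. The helper names are hypothetical. For a single mode the only bilinear is χ^2 = 1, a constant, which is why no mass term is available.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]

# Assumed representation: chi_1 = sigma_x, chi_2 = sigma_y (Pauli matrices).
# They square to the identity and anticommute, as Majorana modes should.
CHI1 = [[0, 1], [1, 0]]
CHI2 = [[0, -1j], [1j, 0]]

def mass_hamiltonian(mu):
    """H = i * mu * chi_1 * chi_2 for a real constant mu."""
    return scale(1j * mu, matmul(CHI1, CHI2))

def eigenvalues_2x2(h):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    tr = h[0][0] + h[1][1]
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return sorted([((tr - disc) / 2).real, ((tr + disc) / 2).real])

if __name__ == "__main__":
    # The two levels are split by 2*mu, so no zero-energy mode survives.
    print(eigenvalues_2x2(mass_hamiltonian(1.5)))
```

The choice of Pauli matrices is only one convenient representation; any pair of anticommuting matrices squaring to 1 would give the same spectrum.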

3.6.2. Boundary condition in condensed matter physics


For theory T , or for the Kitaev spin chain, we consider two Majorana fermions, with
opposite signs of m. In the context of condensed matter physics, to study the theory
on a manifold with boundary, we want a boundary condition that makes the theory
fully anomaly-free. In other words, we want to ensure that the Pfaffian line bundle
PF = PF 1 ⊗ PF 2 remains canonically trivial. This is straightforward: since PF is
in general independent of the masses, we simply use the same boundary condition
as in Subsec. 3.6.1 — namely the same sign of η for both Majorana fermions —
and then PF remains canonically trivial, regardless of the masses.
However, since the two Majorana fermions have opposite signs of m, we see now
that regardless of the common choice of η, precisely one of them has a normalizable
zero-mode (3.19) along the boundary. This means that the mass gap of the theory
breaks down along the boundary. Although it is gapped in bulk, there is a single
(0 + 1)-dimensional massless Majorana fermion propagating along the boundary.
As we noted in Subsec. 3.3, this is regarded in condensed matter physics as the
defining property of the Kitaev spin chain.
Now let us discuss a consequence of this construction that has been important
in mathematical work [3–6] on 2d gravity on a manifold with boundary. We will
see later the reason for its importance. In general, suppose that Σ has h boundary
components ∂1 Σ, ∂2 Σ, . . . , ∂h Σ. On each boundary component, one makes a choice
of sign in the boundary condition, and this determines a real spin bundle Ei of ∂i Σ.
Along each ∂i Σ, there propagates a massless 1d Majorana fermion χi . In propagat-
ing around ∂i Σ, χi may obey either periodic or antiperiodic boundary conditions.
Indeed, on the circle ∂i Σ, there are two possible spin structures, which in string
theory are usually called the Neveu-Schwarz or NS (antiperiodic) spin structure
and the Ramond (periodic) spin structure. The NS spin structure is bounding and
the R spin structure is unbounding. The underlying spin bundle S of Σ determines
whether Ei is of NS or Ramond type. For general S, the only general constraint on
the Ei is that the number R of boundary components with Ramond spin structure
is even.
The field χi , in propagating around the circle ∂i Σ, has a zero-mode if and only
if Ei is of Ramond type. This is not an exact zero-mode, but it is exponentially
close to being one if m is large (compared to the inverse of the characteristic length
scale of Σ). Let us write νi , i = 1, . . . , R, for these modes. The νi have much smaller
eigenvalues of Dm than any other modes of ψ1 and ψ2 , so there is a consistent
procedure in which we integrate out all other modes and leave an effective theory
of the νi only.
Since the underlying theory was chosen to be anomaly-free, it must determine a
well-defined measure for the νi . This condition is not as innocent as it may sound.
A measure on the space parametrized by the νi is something like
dν1 dν2 · · · dνR . (3.21)
However, a priori, this expression does not have a well-defined sign. First of all, its
sign is obviously changed if we make an odd permutation of the νi , that is of the
Ramond boundary components. But in addition, we should worry about the signs
of the individual νi . Since the νi are real, we can fix their normalization up to sign
by asking them to have, for example, unit L^2 norm. But there is no natural way to
choose the signs of the νi , and obviously, flipping an odd number of the signs will
reverse the sign of the measure (3.21).
There is no natural way to pick the signs of the νi up to an even number of
sign flips, and likewise, there is no natural way to pick an ordering of the νi up to
even permutations. However, the fact that there is actually a well-defined measure
on the space spanned by ν1 , . . . , νR means that one of these choices determines the
other. This fact (originally proved in a very different way) is an important lemma
in [3–6].
The existence of a natural measure on the space spanned by the νi can be
expressed in the following mathematical language. For i = 1, . . . , R, let εi be the one-
dimensional real vector space generated by νi . The Z2 -graded tensor product18 of
the εi , denoted $\widehat{\otimes}_i\, \varepsilon_i$, is equivalent to the ordinary tensor product once an ordering of
the εi is picked, by an isomorphism that reverses sign if two of the εi are exchanged.
The lemma that we have been describing is equivalent to the statement that the
Z2 -graded tensor product of the εi is canonically trivial:

\[ \widehat{\bigotimes}_{i=1}^{R} \varepsilon_i \cong \mathbb{R}. \tag{3.22} \]
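The bookkeeping behind this lemma can be made concrete with a short sketch. It only tracks the sign picked up by the formal measure dν1 dν2 · · · dνR under a reordering of the modes and under sign flips νi → −νi ; the function names are ours and purely illustrative. It shows that an odd permutation can be compensated by an odd number of sign flips, which is the sense in which one choice determines the other.

```python
def permutation_sign(perm):
    """Sign (+1 or -1) of a permutation of 0..R-1, computed from its cycles."""
    sign = 1
    seen = [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        # A cycle of even length is an odd permutation.
        if length % 2 == 0:
            sign = -sign
    return sign

def measure_sign(perm, flips):
    """Sign acquired by d(nu_1)...d(nu_R) when the modes are reordered by
    `perm` and rescaled by `flips` (a list of +1 / -1, one entry per mode)."""
    sign = permutation_sign(perm)
    for f in flips:
        sign *= f
    return sign

if __name__ == "__main__":
    swap = [1, 0, 2, 3]                      # exchange the first two modes
    print(measure_sign(swap, [1, 1, 1, 1]))   # odd permutation alone
    print(measure_sign(swap, [-1, 1, 1, 1]))  # compensated by one sign flip
```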

3.6.3. Boundary condition in two-dimensional gravity


For the application of theory T to two-dimensional gravity — or at least to the
theory studied in [3–6] — we need a different boundary condition. In this applica-
tion, we want theory T to remain gapped along the boundary as well as in bulk.
But it will have an anomaly that will help in canceling the gravitational anomaly.
Thus, the two Majorana fermions must remain gapped along the boundary, even
though they have opposite masses. To achieve this, we must give the two Majorana
fermions opposite boundary conditions, so that mη < 0 for each of the two modes.
Given that the theory has a uniform mass gap of order m even near the bound-
ary, its path integral, after renormalizing a few parameters in the effective action, is
of modulus 1. Moreover, this path integral is naturally real. Thus it is fairly natural
to write the path integral as (−1)^ζ , just as we did in the absence of a boundary.19
However, (−1)^ζ is no longer a number ±1; it now takes values in the real line bundle
PF . In fact, since it is everywhere nonzero, (−1)^ζ is a trivialization of PF .
This theory actually challenges some of the standard terminology about anoma-
lies. The line bundle PF is clearly trivial, because the renormalized partition func-
tion (−1)^ζ provides a trivialization. However, because this trivialization is provided
by the path integral itself, rather than by more local or more elementary consider-
ations, it is not natural to call the theory anomaly-free. When we say that a theory

18 Since this notion may be unfamiliar, we give an example, following P. Deligne. Let Si , i = 1, . . . , t
be a family of circles, and let T be the torus $\prod_{i=1}^{t} S_i$. Then εi = H^1 (Si , R) is a one-dimensional
vector space, as is α = H^t (T, R), the top-degree cohomology of T . There can be no natural isomorphism between α and the
ordinary tensor product ⊗i εi , since the exchange of two of the circles acts trivially on ⊗i εi , while
acting on α as −1. But there is a canonical isomorphism $\alpha \cong \widehat{\otimes}_i\, \varepsilon_i$.
19 This is a path integral for a particular spin structure. As usual, to make the partition function
of theory T , we sum over spin structures and divide by 2.



is anomaly-free, we usually mean that its path integral can be defined as a number,
rather than as a section of a line bundle; that is not the case here.
In our problem, PF cannot be trivialized by local considerations. Rather, local
considerations will give an isomorphism

\[ \mathit{PF} \cong \widehat{\bigotimes}_{i=1}^{R} \varepsilon_i , \tag{3.23} \]
where the product is over all boundary components with Ramond spin structure.
This claim is consistent with the claim that PF is trivial, because we have shown
in Eq. (3.22) that $\widehat{\otimes}_{i=1}^{R}\, \varepsilon_i$ is trivial.
To explain what we mean in saying that (3.23) can be established by local
considerations, first set
\[ V = \bigoplus_{i=1}^{R} \varepsilon_i . \tag{3.24} \]
Then the statement (3.23) is equivalent to
\[ \mathit{PF} \cong \det V, \tag{3.25} \]
where for a vector space V , det V is its top exterior power. (Note that exchanging
two summands εi and εj in V acts as −1 on det V , and likewise acts as −1 on the
Z2 -graded tensor product in (3.23).)
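To make the parenthetical remark concrete, here is a worked example for R = 2, written as a sketch in which ν1 and ν2 denote chosen generators of ε1 and ε2 (an illustrative choice, not a canonical one). It shows the sign under exchange of the two summands, and also the sign under flipping one generator.

```latex
\begin{align*}
V &= \varepsilon_1 \oplus \varepsilon_2, \qquad
\det V = \Lambda^2 V = \mathbb{R}\,(\nu_1 \wedge \nu_2), \\
\nu_2 \wedge \nu_1 &= -\,\nu_1 \wedge \nu_2
  && \text{(exchanging the two summands acts as } -1\text{)}, \\
(-\nu_1) \wedge \nu_2 &= -\,\nu_1 \wedge \nu_2
  && \text{(flipping one generator also acts as } -1\text{)}.
\end{align*}
```

Performing both operations together returns the same generator of det V , in line with the lemma of Subsec. 3.6.2.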
We will use the following fact about Pfaffian line bundles. Consider a family of
real Dirac operators parametrized by some space W (in our case, W represents the
choice of metric on Σ). As long as the space of zero-modes of the Dirac operator has
a fixed dimension, it furnishes the fiber of a vector bundle V → W . The Pfaffian
line bundle PF → W is then det V , the top exterior power of V .
More generally, instead of considering zero-modes, we can consider any positive
number a that (in a given portion of W ) is not an eigenvalue of iDm , and let V be
the space spanned by eigenvectors of the Dirac operator with eigenvalue less than
a in absolute value. One still has an isomorphism PF ≅ det V .
Furthermore, the Pfaffian line bundle PF is independent of fermion masses. This
means that to compute PF in our problem, instead of considering the case that the
masses are opposite and the signs in the boundary conditions are also opposite, we
can take the masses to be the same while the boundary conditions remain opposite.
In this situation, one of the fields ψ1 , ψ2 has positive mη and one has negative
mη. So although the interpretation is different, we are back in the situation consid-
ered in Subsec. 3.6.2: one fermion has a mass gap m that persists even along the
boundary, and the other has a single low-lying mode along each Ramond boundary
component. The space of low-lying fermion modes is thus $V = \bigoplus_{i=1}^{R} \varepsilon_i$, and this
leads to Eq. (3.23).
Eq. (3.23) will suffice for our purposes, but it is perhaps worth pointing out that
it has the following generalization, which is analogous to Theorem B in [54]. Instead
of flipping the boundary condition simultaneously along all boundary components
of Σ, it makes sense to flip the boundary condition along one boundary component
at a time. Let S be a particular boundary component of Σ and let PF and PF′ be
the Pfaffian line bundles before and after flipping the boundary condition of one
fermion along S. If the spin structure along S is of NS type, then

\[ \mathit{PF}' \cong \mathit{PF} , \tag{3.26} \]

that is, changing the boundary condition has no effect. But if it is of Ramond type,
then20

\[ \mathit{PF}' \cong \varepsilon \,\widehat{\otimes}\, \mathit{PF} , \tag{3.27} \]

where ε is the space of fermion zero-modes along S. Repeated application of these
rules, starting with the fact that PF is trivial if ψ1 and ψ2 have the same sign of
η, leads to Eq. (3.23) for the case that they have opposite signs of η.
To justify the statements (3.26) and (3.27), we use the fact that by the excision
property of index theory, the change in the Pfaffian line when we flip the boundary
condition along S depends only on the geometry along S and not on the rest of Σ.
Thus we can embed S in any convenient Σ of our choice. It is convenient to take Σ
to be the annulus S × I, where I = [0, 1] is a unit interval, and we consider S to be
embedded in S × I as the left boundary S × {0}. We want to compute the effect
of flipping the boundary condition at S × {0}, keeping it fixed at S × {1}. We can
take the fermion mass to be 0, so the Dirac operator becomes conformally invariant
and we can take the metric on the annulus to be flat. A fermion zero-mode is then
simply a constant mode that satisfies the boundary conditions. For the case of an
NS spin structure, the fermions are antiperiodic in the S direction and so have
no zero-modes. Thus the space of zero-modes is V = 0, so that det V = R. This
justifies (3.26) in the NS case. In the R case, flipping the boundary condition at
one end adds or removes a zero-mode (depending on the boundary condition at the
other end). The relevant space of zero-modes is V ≅ ε, so that det V ≅ ε, leading
to (3.27).
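The counting of constant modes in this argument can be sketched as follows. We assume a circle of circumference b and Fourier modes exp(2πikx/b), with k an integer for the Ramond (periodic) spin structure and a half-odd-integer for the NS (antiperiodic) one; the function names and the cutoff parameter are ours. Boundary conditions at the two ends of the annulus are not modeled here, only the periodicity around S.

```python
from fractions import Fraction

def allowed_momenta(spin_structure, cutoff=2):
    """Momenta k (in units of 2*pi/b) allowed around the circle S:
    integers for 'R', half-odd-integers for 'NS'. Only a finite window
    around zero, set by `cutoff`, is listed."""
    if spin_structure == "R":
        return [Fraction(k) for k in range(-cutoff, cutoff + 1)]
    if spin_structure == "NS":
        return [Fraction(2 * k + 1, 2) for k in range(-cutoff, cutoff)]
    raise ValueError("spin_structure must be 'R' or 'NS'")

def has_constant_mode(spin_structure):
    """True if a mode that is constant around S (k = 0) is allowed."""
    return Fraction(0) in allowed_momenta(spin_structure)

if __name__ == "__main__":
    for s in ("NS", "R"):
        print(s, has_constant_mode(s), allowed_momenta(s))
```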

3.7. Anomaly cancellation


We are finally ready to explain how the anomaly that we described in Subsec. 3.2
has been canceled in [3–6]. We consider first the case that all boundaries of Σ are
of Ramond type, and to start with, we omit boundary punctures. We denote the
circumference of the ith boundary as bi . We recall that the reason for the anomaly
is that there is no natural sign of the differential form Ω = db1 db2 · · · dbR (Eq.
(3.4)). However, after coupling to theory T , what needs to have a natural sign is
the product of this with (−1)^ζ , the path integral of theory T :
\[ \widehat{\Omega} = db_1\, db_2 \cdots db_R\, (-1)^{\zeta} . \tag{3.28} \]

20 This formula shows that if we flip the boundary condition for one of the Majorana fermions
along just one of the Ramond boundaries or more generally along some but not all of them, then
the Pfaffian line becomes nontrivial and the theory becomes genuinely anomalous.

We recall, in addition, that (−1)^ζ takes values in $\widehat{\otimes}_i\, \varepsilon_i$, where εi is a one-dimensional
vector space of zero-modes along the i-th Ramond boundary.
Here $\widehat{\otimes}_i$ is a Z2 -graded tensor product, meaning that $\widehat{\otimes}_i\, \varepsilon_i$ changes sign if any
two of the boundary components are exchanged. But the original anomaly was that
db1 db2 · · · dbR likewise changes sign if any two boundary components are exchanged.
The upshot then is that the product $\widehat{\Omega}$ does not change sign under permutations of
boundary components. It naturally takes values in the ordinary tensor product of
the εi :
\[ \widehat{\Omega} \in \bigotimes_{i=1}^{R} \varepsilon_i . \tag{3.29} \]
What have we gained? The anomaly has not disappeared, but it has become
local: it has turned into an ordinary tensor product of factors associated with in-
dividual boundary components; because it is an ordinary tensor product, it can be
canceled by a local choice made independently on each boundary component.
The last step in canceling the anomaly is to say that a boundary of Σ is not just
a “bare” boundary: it comes with additional structure. Let Si be the ith boundary
component of Σ, and let Ei be its spin structure. In the theory developed in [3–6]
(but for the moment still ignoring boundary punctures) each Si is endowed with a
trivialization of Ei , up to homotopy. For the moment we consider Ramond bound-
aries only. Since Ei is a real line bundle, and is trivial on a Ramond boundary, it has
two homotopy classes of trivialization over each Ramond boundary. In addition to
summing over spin structures on Σ and integrating over its moduli, one is supposed
to sum over (homotopy classes of) trivializations of Ei for each Ramond boundary
Si .
A fermion zero-mode on Si is a “constant” mode that is everywhere non-
vanishing, so the choice of such a zero-mode gives a trivialization of Ei . This means
that, still in the absence of boundary punctures, trivializations of Ei correspond to
choices of the sign of the zero-mode on Si . Hence once we trivialize all the Ei , the
right-hand side of (3.29) is trivialized and $\widehat{\Omega}$ acquires a well-defined sign.
Thus once theory T is included and the boundaries are equipped with trivializa-
tions of their spin bundles, the problem with the orientation of the moduli space is
solved. However, without some further ingredients, all correlation functions would
vanish. Indeed, summing over the signs of the trivializations of the εi will imply
summing over the sign of $\widehat{\Omega}$.
Moreover, what we have said does not make sense for boundaries with NS spin
structure, since their spin bundles cannot be globally trivialized.
The additional ingredient that has to be considered is a boundary puncture.
One postulates that locally, away from punctures, Ei is trivialized, but that this
trivialization changes sign in crossing a boundary puncture.
With this rule, it is possible to incorporate NS as well as Ramond boundaries. A
simple example of a boundary component with NS spin structure is the boundary
of a disc (Fig. 5). Its spin structure is of NS or antiperiodic type, and cannot be
trivialized globally. It can be trivialized on the complement of one point, but then
the trivialization changes sign in crossing that point. In the theory of [3–6], that
point would be interpreted as a boundary puncture. So an NS boundary with one
boundary puncture is possible in the theory, but an NS boundary with no boundary
punctures is not. More generally, the number of punctures on a given NS boundary
can be any positive odd number, since the spin structure of an NS boundary can
have a trivialization that jumps in sign any odd number of times in going around
the boundary circle.
As an example, a disc with n boundary punctures and m bulk ones has a moduli
space of dimension 2m + n − 3. The fact that n is odd means that this number is
even.
This is actually a necessary condition for some of the correlation functions
$\int_{M} \prod_i \psi_i^{d_i}$ (Eq. (3.1)) to be nonzero, since the cohomology classes ψi are all of even
degree.
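The parity statement is elementary, but a two-line check may be useful. The sketch below simply evaluates the formula 2m + n − 3 quoted above; the function name is ours, and no stability condition on (n, m) is imposed.

```python
def disc_moduli_dimension(n_boundary, m_bulk):
    """Real dimension 2m + n - 3 of the moduli space of a disc with
    n boundary punctures and m bulk punctures."""
    return 2 * m_bulk + n_boundary - 3

if __name__ == "__main__":
    # n odd (as required on an NS boundary) always gives an even dimension.
    for n in (1, 3, 5):
        for m in (1, 2):
            print(n, m, disc_moduli_dimension(n, m))
```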
The spin structure of a Ramond boundary is globally trivial, so it is possible to
have a Ramond boundary with no boundary punctures. Of course, this is the case
we started with. More generally, a Ramond boundary can have any even number
of punctures.
On any given boundary component of either NS or Ramond type, there are two
allowed classes of piecewise trivialization of the spin structure. One can pick an
arbitrary trivialization at a given starting point (not one of the punctures), and
then the extension of this over the rest of the circle is uniquely determined by the
condition that the trivialization jumps in sign whenever a boundary puncture is
crossed.
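The rule just described can be summarized in a short sketch. We represent the spin structure of a boundary circle by its holonomy (−1 for NS, +1 for Ramond) and a piecewise trivialization by the list of signs on successive arcs; the function names are ours. A number n of punctures is allowed when the accumulated sign (−1)^n after one circuit matches the holonomy.

```python
def piecewise_trivialization(start_sign, n_punctures):
    """Signs met going once around the circle, starting from `start_sign`
    and flipping at each puncture. The last entry is the sign with which
    one arrives back at the starting arc."""
    signs = [start_sign]
    for _ in range(n_punctures):
        signs.append(-signs[-1])
    return signs

def is_allowed(spin_structure, n_punctures):
    """True if n_punctures sign jumps are compatible with the holonomy of
    the real spin bundle: -1 for 'NS' (antiperiodic), +1 for 'R' (periodic)."""
    holonomy = {"NS": -1, "R": +1}[spin_structure]
    signs = piecewise_trivialization(+1, n_punctures)
    return signs[-1] == holonomy * signs[0]

if __name__ == "__main__":
    for n in range(6):
        print(n, "NS:", is_allowed("NS", n), "R:", is_allowed("R", n))
```

The two allowed classes on a given boundary correspond to the two values of `start_sign`; the rest of the list is then fixed.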

Fig. 5. A disc with five boundary punctures. The spin bundle of the boundary circle S is a real
line bundle that is inevitably of NS type. This real line bundle is not trivial globally over S, but
— since the number of boundary punctures is odd — it can be trivialized on the complement of
the boundary punctures in such a way that the trivialization changes sign whenever one crosses a
boundary puncture.

This description of boundaries and their punctures may seem bizarre at first,
but we will see in Subsec. 3.8 that it is not too difficult to give it a plausible physical
interpretation. But first, let us ask whether incorporating boundary punctures has
reintroduced any problem with the orientation of moduli space. We will deal with
this question by describing a consistent recipe [3–6] for dealing with the sign ques-
tions. We expect that this recipe could be deduced from the framework of Subsec.
3.8, but we will not show this.
First let us consider the case of a boundary component S with NS spin structure.
It has a circumference b and it has an odd number n of boundary punctures that
have a natural cyclic order. Let us pick an arbitrary starting point p ∈ S and relative
to this label the punctures in ascending order by angles α1 < α2 < · · · < αn . So b
and α1 , . . . , αn are the moduli that are associated to S. To orient this parameter
space, we can use the differential form
Υ = db dα1 dα2 · · · dαn . (3.30)
We note that Υ has a natural sign: because the number of α’s is odd, moving a dα
from the end of the chain to the beginning does not affect the sign of Υ. Also, since
Υ is of even degree, it commutes with similar factors associated to other boundary
components. Therefore, an NS boundary component raises no problem in orienting
the moduli space.
Now let S have Ramond spin structure. In this case, n is even. This has two
consequences. First, we get a sign change if we move a dα from the end of the chain
to the beginning. However, just as in the case n = 0 that we started with, the sign
of (−1)^ζ depends on how one trivializes the spin structure of a Ramond boundary.
A consistent recipe is to define the sign of (−1)^ζ using the trivialization that is in
effect just to the right of the starting point p ∈ S relative to which we measured
the α’s. Then moving one of the boundary punctures from the end of the chain to
the beginning will reverse the sign of Υ while also reversing the sign of (−1)^ζ . Second,
because n is even, Υ is of odd degree in the case of a Ramond boundary. Therefore
the Υ factors associated to different Ramond boundaries anticommute with each
other. Just as we discussed for the case n = 0, this compensates for the fact that
(−1)^ζ is odd under exchanging any two Ramond boundaries.
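The sign rules for Υ used in the last two paragraphs follow from counting one-forms, and can be sketched as below. We assume Υ = db dα1 · · · dαn as in Eq. (3.30), so that Υ has degree n + 1, and that moving the last dα to the front of the chain of dα's passes n − 1 one-forms; the function names are ours.

```python
def upsilon_degree(n):
    """Degree of Upsilon = db dalpha_1 ... dalpha_n for n boundary punctures."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n + 1

def cyclic_move_sign(n):
    """Sign from moving dalpha_n to the front of dalpha_1 ... dalpha_n,
    which passes n - 1 anticommuting one-forms. Requires n >= 1."""
    if n < 1:
        raise ValueError("need at least one boundary puncture")
    return 1 if (n - 1) % 2 == 0 else -1

def exchange_sign(n1, n2):
    """Sign from exchanging the Upsilon factors of two boundaries carrying
    n1 and n2 punctures: (-1)^(deg1 * deg2)."""
    return -1 if (upsilon_degree(n1) * upsilon_degree(n2)) % 2 else 1

if __name__ == "__main__":
    print(cyclic_move_sign(3), cyclic_move_sign(4))  # NS (n odd), Ramond (n even)
    print(exchange_sign(2, 4), exchange_sign(1, 2))  # Ramond-Ramond, NS-Ramond
```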

3.8. Branes
3.8.1. The ζ-instanton equation and compactness
In the present section, we will attempt to interpret the possibly strange-sounding
picture just described in terms of the physics of branes.
For this, it will be helpful to use the second realization of theory T that was pre-
sented in Subsec. 3.4. This was based on topologically twisting a two-dimensional
theory with (2, 2) supersymmetry and a complex chiral superfield Φ. The bottom
component of Φ is a complex field φ. The theory also has a holomorphic superpoten-
tial, which in our application is W (Φ) = (i/2) m Φ^2 , but we will write some formulas
for a more general W (Φ).
The condition for a configuration of the φ field to be supersymmetric is the
ζ-instanton equation
\[ \frac{\partial \bar\phi}{\partial \bar z} + \frac{\partial W}{\partial \phi} = 0. \tag{3.31} \]
This equation can be written $\bar\partial \bar\phi + d\bar z\, \partial_\phi W = 0$, so it can be defined on a Riemann
surface Σ with a distinguished, everywhere nonzero (0, 1)-form $d\bar z$. (For example,
such a form exists globally if Σ is a Riemann surface of genus 1 or a domain in
the complex plane.) If W is quasihomogeneous, as in our case, the equation is
conformally-invariant if φ is suitably interpreted. The conformally-invariant ver-
sion of the equation can be formulated on any Riemann surface. This conformally-
invariant form of the equation is what one gets when one topologically twists the
theory using the R-symmetry that exists for quasi-homogeneous W . For example,
in the case of a quadratic W , after topological twisting, φ has to be interpreted
as a section of a chiral spin bundle L → Σ, a square root of the canonical bundle
K → Σ. In [49], a more general case W ∼ Φ^r was considered, and then in the
topologically twisted version of the theory, φ is a section of an rth root of K (this
rth root may have singularities at specified points in Σ where “twist fields” are
inserted).
Certain important properties hold whenever the ζ-instanton equation can be
defined, whether in a topologically-twisted version or simply in a naive version in
which φ is a complex field. In particular, if Σ has no boundary, then the ζ-instanton
equation has only “trivial” solutions. This is proved in a standard way: take the
absolute value squared of the equation, integrate over Σ, and then integrate by
parts, to show that any solution satisfies
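\[ 0 = \int_\Sigma |d^2 z| \left| \frac{\partial \bar\phi}{\partial \bar z} + \frac{\partial W}{\partial \phi} \right|^2 = \int_\Sigma |d^2 z| \left( |d\phi|^2 + \left| \frac{\partial W}{\partial \phi} \right|^2 + \partial_z W + \partial_{\bar z} \overline{W} \right). \tag{3.32} \]

If Σ has no boundary, we can drop the total derivatives $\partial_z W$ and $\partial_{\bar z} \overline{W}$, and we
learn that on a closed surface Σ, any solution has dφ = 0 and ∂W/∂φ = 0; in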
other words, φ must be constant and this constant must be a critical point of W .
For a large class of W ’s, this implies that, on a surface Σ without boundary, the
space of solutions of the ζ-instanton equation is compact (and in fact “trivial”).
This compactness is an important ingredient in the well-definedness of the twisted
topological field theory constructions related to the ζ-instanton equation.
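For the reader who wishes to see where the total derivatives in the identity (3.32) come from, here is a sketch of the pointwise step, assuming that the ζ-instanton equation has the form $\partial_{\bar z}\bar\phi + \partial W/\partial\phi = 0$ with W holomorphic in φ. Only the cross terms are tracked; the normalization relating $|\partial_{\bar z}\bar\phi|^2$ to $|d\phi|^2$, which requires the integration by parts, is left as in the text.

```latex
\begin{align*}
\left| \frac{\partial\bar\phi}{\partial\bar z} + \frac{\partial W}{\partial\phi} \right|^2
&= \left| \frac{\partial\bar\phi}{\partial\bar z} \right|^2
 + \left| \frac{\partial W}{\partial\phi} \right|^2
 + \frac{\partial\phi}{\partial z}\,\frac{\partial W}{\partial\phi}
 + \frac{\partial\bar\phi}{\partial\bar z}\,\overline{\frac{\partial W}{\partial\phi}} \\
&= \left| \frac{\partial\bar\phi}{\partial\bar z} \right|^2
 + \left| \frac{\partial W}{\partial\phi} \right|^2
 + \partial_z W + \partial_{\bar z}\overline{W},
\end{align*}
% using \overline{\partial_{\bar z}\bar\phi} = \partial_z\phi and the chain rule
% \partial_z W(\phi) = (\partial W/\partial\phi)\,\partial_z\phi for holomorphic W.
```

For the quadratic superpotential of our application, $\partial W/\partial\phi$ is proportional to $m\phi$, so for m ≠ 0 the only critical point is φ = 0.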

3.8.2. Boundary condition in the ζ-instanton equation


If Σ has a boundary, then we have to pick a boundary condition on the ζ-instanton
equation. Let us first ignore the twisting and treat φ as an ordinary complex scalar
field. If we also set W to 0, the equation for φ becomes the Cauchy–Riemann equa-
tion saying that φ is holomorphic. The topological σ-model associated to counting
solutions of this equation is then an ordinary A-model. Though the topological
field theory associated to theory T is not an ordinary A-model — because of the
superpotential and because φ is twisted to have spin 1/2 — it will be useful to first
discuss this more familiar case.
A boundary condition for the Cauchy–Riemann equations that is sensible
(elliptic) at least locally can be obtained by picking an arbitrary curve ℓ ⊂ C
and asking that the boundary values of φ should lie in ℓ. Adding a superpotential
to get the ζ-instanton equation does not affect this statement, which only depends
on the “leading part” of the equation (the terms with the maximum number of
derivatives). Here we may loosely call ℓ a brane, although to be more precise, it is
the support of a brane. As we will discuss later, there can be more than one brane
with support ℓ. More generally, as is usual in brane physics, we may impose such
a boundary condition in a piecewise way. For this, we pick several branes ℓα , we
decompose the boundary ∂Σ as a union of intervals Iα that meet only at their end-
points, and for each α, we require that Iα should map to ℓα . (A common endpoint
of Iα and Iβ must then map to an intersection point of ℓα and ℓβ .)
What sort of ℓ should we use? At first sight, it may seem that the A-model
is most obviously well-defined if ℓ is compact. Actually, a compact closed curve
in C is a boundary, and with such a choice of ℓ, the A-model with target C is
actually anomalous, as explained from a physical point of view in [53], Subsec.
13.5. This anomaly is an ultraviolet effect that is related to a boundary contribution
to the fermion number anomaly on a Riemann surface. More intuitively, if ℓ is a
closed curve in the plane, then it can be shrunk to a point and is not interesting
topologically. Thus we should consider noncompact ℓ, for example a straight line in
C.
With such a choice, we avoid the ultraviolet issues mentioned in the last para-
graph, but the noncompactness of ℓ raises potential infrared problems. The space
of solutions of the Cauchy–Riemann equation $\bar\partial\phi = 0$, with boundary values in the
noncompact space ℓ, is in general not compact, and this poses difficulties in defining
the A-model with target C.
There are a number of approaches to resolving these difficulties, depending
on what one wants. One approach leads mathematically to the “wrapped Fukaya
category.” For our purposes, we want to use the superpotential W to prevent φ from
becoming large. This corresponds mathematically to the Fukaya-Seidel category
[55]; for a physical interpretation, see [53], especially Subsecs. 11.2.6 and 11.3. To
see the idea, let us return to the identity (3.32), but now allow for the possibility
that Σ has a boundary. For instance, we can take Σ to be the upper half z-plane.
Setting z = x1 + ix2 , the identity becomes
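\[ 0 = \int_\Sigma |d^2 z| \left( |d\phi|^2 + \left| \frac{\partial W}{\partial \phi} \right|^2 \right) + 2 \int_{\partial\Sigma} dx_1\, \mathrm{Im}\, W. \tag{3.33} \]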

Now it becomes clear what sort of brane we should consider. We should choose ℓ so
that Im W → ∞ at ∞ along ℓ. Then the boundary term in the identity will ensure
that φ cannot become large along ∂Σ, and given this, the bulk terms in the identity
ensure that φ cannot become large anywhere. That is an essential technical step
toward being able to define the A-model.
Let us implement this in our case that W (φ) = (i/2) mφ^2 , with m > 0 and φ =
φ1 + iφ2 . We have Im W (φ) = (m/2)(φ1^2 − φ2^2 ). Thus near infinity in the complex φ
plane, there are two regions with Im W → +∞: this happens near the positive real
φ axis and also near the negative axis.
A noncompact 1-manifold ℓ is topologically a copy of the real line, with two
ends. To ensure that Im W → ∞ at ∞ along ℓ, we should pick ℓ so that each of
its ends is in one of the good regions near the positive or negative real φ axis. Beyond
this, the precise choice of ℓ does not matter, because of the fact that the A-model
is invariant under Hamiltonian symplectomorphisms of C. All that really matters
is whether φ tends toward +∞ or −∞ at each of the two ends of ℓ. Moreover, if φ
tends to infinity in the same direction at each end of ℓ, then ℓ is “topologically trivial” in
the sense that it can be pulled off to infinity in the φ-plane while preserving the fact
that Im W → ∞ at ∞ along ℓ. So the only interesting case is that φ tends to −∞
at one end of ℓ and to +∞ at the other. Further details do not matter. Therefore,
we may as well simply take21 ℓ to be the real φ axis. In other words, the boundary
condition on φ is that it is real along ∂Σ; that is, if φ = φ1 + iφ2 , then
φ2 = 0 at x2 = 0.
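The formula for Im W and the location of the good regions are easy to check numerically. The sketch below assumes W (φ) = (i/2) mφ^2 with m > 0, as in the text; the function names and the sample points are ours.

```python
def superpotential(phi, m=1.0):
    """W(phi) = (i/2) * m * phi^2 for a complex number phi."""
    return 0.5j * m * phi ** 2

def im_w_closed_form(phi1, phi2, m=1.0):
    """Im W = (m/2) * (phi1^2 - phi2^2), with phi = phi1 + i*phi2."""
    return 0.5 * m * (phi1 ** 2 - phi2 ** 2)

if __name__ == "__main__":
    # Compare the two expressions at a few sample points.
    for phi1, phi2 in [(3.0, 4.0), (-2.0, 0.5), (0.0, 1.0)]:
        w = superpotential(complex(phi1, phi2), m=2.0)
        print(phi1, phi2, w.imag, im_w_closed_form(phi1, phi2, m=2.0))
    # Im W grows along the real axis and decreases along the imaginary axis.
    print(superpotential(10.0).imag, superpotential(-10.0).imag,
          superpotential(10.0j).imag)
```

Along the real φ axis (φ2 = 0) the sketch gives Im W = (m/2)φ1^2 , which is nonnegative and grows at both ends, consistent with the choice of ℓ made above.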
This has an interesting interpretation in the topologically twisted model that
we are really interested in. We recall that in this model, φ is a section of the chiral
spin bundle L of Σ. The fiber of L at a point in Σ is a complex vector space of
dimension 1. This is actually the same as a real vector space of rank 2. Thus, we can
alternatively view the complex line bundle L → Σ as a rank 2 real vector bundle
S → Σ. The resulting S is simply the real, nonchiral spin bundle of Σ. Thus, it is
possible to view the real and imaginary parts of φ as a two-component real spinor
field over Σ. In fact, we have already made much the same statement in Eq. (3.15),
where we asserted that the ζ-instanton equation for φ is equivalent to the massive
Dirac equation for $\widehat{\phi} = \begin{pmatrix} \phi_1 \\ -\phi_2 \end{pmatrix}$.
Now recall that in Subsec. 3.5, we defined a rank 1 real spin bundle E → ∂Σ by
saying that a section of E is a section $\widehat{\phi}$ of the rank 2 spin bundle S of Σ (restricted
to ∂Σ) that satisfies $\gamma_\parallel \widehat{\phi} = \widehat{\phi}$. (The opposite sign in this relation, $\gamma_\parallel \widehat{\phi} = -\widehat{\phi}$,
defines another equivalent real spin bundle of ∂Σ.) For Σ the upper half plane, the
tangential gamma matrix is $\gamma_\parallel = \gamma_1$, and the representation that we have used of
the gamma matrices (Eq. (3.7)) is such that $\gamma_1 \widehat{\phi} = \widehat{\phi}$ is equivalent to φ2 = 0.
Thus, we can state the boundary condition that we have found in a way that
makes sense in general for the twisted topological field theory under study. In bulk,
that is away from ∂Σ, φ is a section of the chiral spin bundle L → Σ. The boundary
condition satisfied by φ is that along ∂Σ, it is a section of the real spin bundle
E → ∂Σ. The merit of this boundary condition is the same as it is in the ordinary
A-model, which we used as motivation: it ensures that the surface terms in Eq.
(3.32) vanish, and therefore that the only solution of the ζ-instanton equation on
a Riemann surface Σ with boundary is φ = 0.

21 This ℓ can be described as a Lefschetz thimble for the superpotential W associated to its unique
critical point at φ = 0. In general, in the Fukaya-Seidel category, the most basic objects are such
Lefschetz thimbles.

We can gain some more insight by comparison to the ordinary A-model. To
construct a brane with support ℓ, we need to pick an orientation of ℓ. There are
two possible orientations, so there are two possible branes, which we will call B′
and B″. Neither one is distinguished relative to the other.
In the ordinary A-model, we could at our discretion introduce B′ or B″ or both.
The twisted model that is related to theory T, in which φ is a chiral spinor rather
than a complex-valued field, is different in this respect. The reason it is different
is that B′ and B″ represent choices of orientation of the real spin bundle E → ∂Σ,
but in general this real spin bundle is unorientable. Thus, if one goes all the way
around a component of ∂Σ with NS spin structure, then B′ and B″ are exchanged.
Accordingly, in the model relevant to theory T, if we introduce one of these branes,
we have to also introduce the other.
Once we introduce branes B′ and B″, we are very close to the picture developed
in the mathematical literature [3–6]. The boundary of Σ is decomposed as a union
of intervals Iα that have only endpoints in common, and each interval is labeled by
B′ or B″. This labeling here means simply a chosen orientation of E → ∂Σ. Since E
is a real vector bundle of rank 1, a choice of orientation of E is (up to homotopy)
the same as a trivialization of E, the language used in Subsec. 3.7.
There is really just one more puzzle. In the theory developed in [3–6], whenever
one crosses a boundary puncture, the orientation of E jumps. Why is this true?
A quick answer is the following. In general, for any brane B, (B, B) strings in
the A-model correspond to local operators that can be inserted on the boundary of
the string in a region of the boundary that is labeled by brane B. Our model is only
locally equivalent to an A-model, but this is good enough to discuss local operators.
In the case of the branes B′ and B″, as ℓ is contractible, the only interesting local
(B′, B′) or (B″, B″) operator is the identity operator. However, in topological string
theory, what we add to the action along the boundary of the string worldsheet
is really a descendant of a given local operator. In the case of a boundary local
operator O, what we want is the 1-form operator V that can be deduced from O
via the descent procedure. If O is the identity operator, then V = 0. (Recall that V
is characterized by {Q, V} = dO, where Q is the BRST operator of the theory; if O
is the identity operator, then dO = 0 so V = 0.) Therefore we cannot get anything
interesting from (B′, B′) or (B″, B″) strings.
The analogy with the standard A-model indicates that the space of (B′, B″) or
(B″, B′) strings is also one-dimensional (see Subsecs. 3.8.3 and 3.8.4), but now a
(B′, B″) or (B″, B′) string corresponds to a local operator that causes a jumping in
the brane that labels the boundary, and this is certainly not the identity operator.
Thus the gravitational descendant will not vanish.
Another crucial detail concerns the statistics of the operators. The identity
operator is bosonic, so its 1-form descendant, if not zero, would be fermionic. A
fermionic boundary puncture operator is not what we need for the theory of [3–6],
in which the coupling parameters and correlation functions are all bosonic. The
analogy with the standard A-model indicates (Subsec. 3.8.3) that the (B′, B″) and
(B″, B′) local operators are fermionic, so that their 1-form descendants are bosonic.
There is also an important detail on which the analogy to the standard A-
model is a little misleading, because it is only valid locally. In an A-model with
branes B′ and B″, the (B′, B″) and (B″, B′) local operators would be independent
operators, and we would potentially include them (or their 1-form descendants)
with independent coupling parameters. In the present context, there is not really
any way to say which is which of B′ and B″; one can only say that they differ by the
orientation of the real spin bundle.²² So there is really only one type of boundary
puncture, which one can think of as (B′, B″) or (B″, B′), and correspondingly there
is only one boundary coupling.
It follows, incidentally, that even if the identity (B′, B′) or (B″, B″) operator had
a nontrivial 1-form gravitational descendant, it could not play a role. We would
have to identify these two operators, so we would have a single such operator with
a fermionic coupling constant υ. As the correlation functions of topological gravity
are bosonic, they could not depend on a single fermionic variable υ.

3.8.3. Orientations and statistics


Consider a brane B′ in an arbitrary A-model with some target space X. The support
of B′ is a Lagrangian submanifold L ⊂ X. Take B′ to have trivial Chan-Paton
bundle.²³ If we consider N copies of brane B′, we get an effective U(N) gauge
theory along L.
Another M copies of brane B′ would similarly support by themselves a U(M)
gauge theory. If we combine N copies of B′ with M more copies, we get a U(N+M)
gauge theory.
Now consider another brane B″ that differs from B′ only by reversing the
orientation of L. M copies of B″ would support a U(M) gauge theory. However, if
one combines M copies of B″ with N copies of B′, one does not get a gauge group
U(N+M). Instead, one gets the supergroup U(N|M) [56].
We will give a simple example to explain why this must be the case. For a
familiar setting, take X to be a Calabi–Yau three-fold. The effective gauge theory
for N copies of a brane is actually a U(N) gauge theory. Let us denote the gauge
field as A. The theory also has a 1-form field φ in the adjoint representation, which
describes fluctuations in the position of the brane. The effective action is a multiple

22 For example, B′ and B″ are exchanged in going all the way around a circle with NS spin
structure. Perhaps more fundamentally, orienting the real spin bundle of one boundary of Σ does
not in general tell us how to choose such an orientation for other boundaries. So we can say locally
how B′ and B″ differ but there is no global notion of which is which.
23 For example, L might be topologically trivial (as it is in our application, with L = ℓ). We will
ignore various subtleties related to the K-theory interpretation of branes; these are not relevant
for our purposes.

of the Chern–Simons three-form for the complex connection $\mathcal{A} = A + i\phi$:

$$ I = \frac{1}{g_{st}} \int_L \mathrm{CS}(\mathcal{A}). \tag{3.34} $$

Here $\mathrm{CS}(\mathcal{A}) = \mathrm{Tr}\left(\mathcal{A}\wedge d\mathcal{A} + \tfrac{2}{3}\,\mathcal{A}\wedge\mathcal{A}\wedge\mathcal{A}\right)$ is the Chern–Simons three-form and
gst is the string coupling constant. There is no problem, given L purely as a bare
three-manifold, to define the three-form $\mathrm{CS}(\mathcal{A})$. But to integrate a three-form over
L requires an orientation of L. There is no natural choice, but a choice is part of
the definition of a brane with support L. That is one way to understand the fact
that in order to define a brane B′ or B″ with support L, one needs to endow L
with an orientation; and there are in fact two A-branes B′ and B″ with the same
support L that differ only by which orientation is chosen. The sign of the effective
action I is opposite for B′ relative to B″.
Now if we bring together N branes supporting a U(N) Chern–Simons theory
to M more branes supporting a U(M) Chern–Simons theory with the same sign
of the action, the two Chern–Simons theories can merge into a U(N+M) Chern–
Simons theory. (The expectation value of the field φ can describe the breaking of
U(N+M) down to U(N)×U(M).) However, if the U(M) and U(N) Chern–Simons
actions have opposite signs, they cannot possibly combine to a U(N+M) Chern–
Simons theory. Instead, they can combine to a U(N|M) supergroup Chern–Simons
theory. We recall that the supertrace of an N|M-dimensional matrix is defined, in
an obvious notation, as

$$ \mathrm{Str}\begin{pmatrix} U & V \\ W & X \end{pmatrix} = \mathrm{Tr}\,U - \mathrm{Tr}\,X, \tag{3.35} $$

where the relative minus sign is just what we need so that the supertrace of a
Chern–Simons three-form of U(N|M) leads to opposite signs for the U(N) and
U(M) parts of the action.
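As a quick check of the sign claim, one can restrict to block-diagonal connections, with the off-diagonal blocks set to zero. The symbols $\mathcal{A}_N$ and $\mathcal{A}_M$ below are labels introduced here for the two diagonal blocks (they are not notation from the text), and the standard coefficient 2/3 of the cubic term is assumed. This is only a sketch of the bosonic part of the supergroup action.

```latex
% Sketch: block-diagonal U(N|M) connection, off-diagonal blocks V = W = 0 (assumption).
% Products of block-diagonal matrices stay block-diagonal, so Str = Tr_N - Tr_M
% acts separately on each block.
\[
\mathcal{A} = \begin{pmatrix} \mathcal{A}_N & 0 \\ 0 & \mathcal{A}_M \end{pmatrix},
\qquad
\mathrm{Str}\left(\mathcal{A}\wedge d\mathcal{A}
  + \tfrac{2}{3}\,\mathcal{A}\wedge\mathcal{A}\wedge\mathcal{A}\right)
= \mathrm{CS}(\mathcal{A}_N) - \mathrm{CS}(\mathcal{A}_M).
\]
```

The relative minus sign between the two terms is the one that matches the opposite orientations of the two stacks of branes.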
A consequence of going from U(N+M) to U(N|M) is that the statistics of
the off-diagonal blocks V and W is reversed. At the end of Subsec. 3.8.2, that is
what we needed so that the (B′, B″) strings are fermionic, and have bosonic 1-form
descendants.
The situation just described does not usually arise in physical string theory,
because there one usually is interested in branes that satisfy a stability condition
involving the phase of the holomorphic volume form of the Calabi–Yau manifold,
restricted to the brane. For a given Lagrangian submanifold, this condition is satisfied
at most for one orientation.

3.8.4. Quantizing the string


In the standard A-model, the space of local operators of type (B1 , B2 ), for any
branes B1 and B2 that may or may not be the same, is the same as the space of
physical states found by quantization on an infinite strip with boundary conditions
set by B1 at one end and by B2 at the other end. Here we will explain the analog
of this for the model under consideration here, which is only locally equivalent to
a standard A-model.
We will work on the strip 0 ≤ x2 ≤ a in the x1 x2 plane, for some a, and will treat
x1 as Euclidean “time.” In Eq. (3.33), there is now a boundary contribution at x2 =
a, as well as the one at x2 = 0 that was discussed previously. The two contributions
have opposite signs, and to achieve compactness the boundary condition at x2 = a
should ensure that Im W → −∞ at infinity. Thus we take the boundary condition
at x2 = a to be φ1 = 0, while at x2 = 0 it is φ2 = 0, as before.²⁴
To find the space of physical states with these boundary conditions, the first
step is to find the space of classical ground states. With x1 viewed as “time,”
these are the x1 -independent solutions of the ζ-instanton equation that satisfy the
boundary conditions at the two ends. For solutions that depend only on x2, the
ζ-instanton equation reduces to $\frac{d\bar\phi}{dx_2} + m\phi = 0$. The only solution of this linear
first-order equation with φ2 = 0 at x2 = 0 and φ1 = 0 at x2 = a is φ = 0.
Moreover, this solution is nondegenerate, meaning that when we linearize around
it, the linearized equation has trivial kernel. (In the present case, this statement is
trivial since the ζ-instanton equation is already linear.) A nondegenerate classical
solution corresponds upon quantization to a single state.
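The uniqueness of the classical ground state can be verified by hand. The sketch below assumes that the reduced equation has the antilinear form $d\bar\phi/dx_2 + m\phi = 0$ with m real and nonzero; this form is an assumption made here for illustration. The constants A and B are labels introduced for the two integration constants.

```latex
% Assumed reduced equation: d\bar\phi/dx_2 + m\phi = 0, with \phi = \phi_1 + i\phi_2, m real.
% Real and imaginary parts decouple:
\begin{align*}
\frac{d\phi_1}{dx_2} + m\,\phi_1 &= 0, &
-\frac{d\phi_2}{dx_2} + m\,\phi_2 &= 0, \\
\phi_1 &= A\,e^{-m x_2}, &
\phi_2 &= B\,e^{m x_2}.
\end{align*}
% Boundary conditions: \phi_2 = 0 at x_2 = 0 and \phi_1 = 0 at x_2 = a.
\[
\phi_2(0) = B = 0, \qquad \phi_1(a) = A\,e^{-ma} = 0 \;\Longrightarrow\; A = 0,
\qquad\text{so}\qquad \phi \equiv 0 .
\]
```

Because each exponential is nowhere zero, a single boundary condition on each real component already forces its coefficient to vanish.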
If there were multiple classical vacua, we would have to consider possible
tunnelling effects to identify the quantum states that really are supersymmetric ground
states. With only one classical vacuum, this step is trivial. So in our problem, there
is just one supersymmetric ground state.
One might be slightly puzzled that we seem to have used different boundary
conditions and thus different branes at x2 = a relative to x2 = 0. However, if we
conformally map the strip to the upper half plane x2 ≥ 0, mapping x1 = −∞ in
the strip to the origin x1 = x2 = 0 in the boundary of the upper half plane, then
this difference disappears. What we have done, on both boundaries, is to require
that φ should restrict on ∂Σ to a section of the real spin bundle E → ∂Σ.
The space of supersymmetric ground states that we just obtained corresponds to
the space of local operators of type (B′, B′), (B″, B″), or (B′, B″) that can be inserted
at x1 = x2 = 0. Since we did not have to orient the spin bundles of the boundaries of
the strip in order to determine that there is a 1-dimensional space of physical states
on the strip, the spaces of local operators of type (B′, B′), (B″, B″), or (B′, B″) are
the same if understood just as vector spaces. But these operators have different
statistics, as explained in Subsec. 3.8.3.

3.9. Boundary degenerations


So far we have concentrated on questions concerning the orientation of the moduli
space. However, as explained in Subsec. 3.1, in trying to define topological gravity

24 This difference in boundary condition is related to something that will be explained in Subsec.
3.9: at x2 = 0, √dz is real, while at x2 = a, √(−dz) is real.

on Riemann surfaces with boundary, there is a second serious problem, which is that
the moduli space of Riemann surfaces with boundary, with its Deligne–Mumford
compactification, itself has a boundary. Because of this, intersection numbers such
as the correlation functions $\int_{\mathcal M} \prod_i \psi_i^{d_i}$ of topological gravity (Eq. (3.1)) are a priori
not well-defined from a topological point of view. We will explain schematically how
this difficulty has been overcome, going just far enough to describe the simplest
concrete computations. For full explanations, see [3–6].
First let us give a simple example to illustrate the problem. A disc Σ with
n boundary punctures (and no bulk punctures) has a moduli space M of real
dimension n−3. The disc can degenerate in real codimension 1 by forming a narrow
neck (Fig. 6(a)), which then pinches off (Fig. 6(b)) to make a singular Riemann
surface Σ that can be obtained by gluing together two discs Σ1 and Σ2 (Fig. 6(c)).
This occurs in real codimension 1, and thus Fig. 6(b) describes a component of ∂M,
the boundary of M. As a check, let us confirm that the configuration in Fig. 6(b)
has precisely n − 4 real moduli, so that it is of real codimension 1 in M. Σ1 and Σ2
inherit the boundary punctures of Σ, say n1 for Σ1 and n2 for Σ2 with n1 + n2 = n.
In addition, Σ1 and Σ2 have one more boundary puncture p1 or p2 where the gluing
occurs. So in all, Σ1 and Σ2 have respectively n1 +1 and n2 +1 boundary punctures,
and moduli spaces of dimension n1 − 2 and n2 − 2. The singular configuration in
Fig. 6(b) thus has a total of (n1 − 2) + (n2 − 2) = n − 4 real moduli, as claimed.
Thus, we have confirmed the assertion that moduli spaces of Riemann surfaces
with boundary are themselves manifolds (or orbifolds) with boundary. This presents
a problem for defining intersection numbers.
Now let us reexamine this assuming that Σ is endowed with a spin bundle S
and that the induced real spin bundle E of ∂Σ is piecewise trivialized along ∂Σ, as
described in Subsec. 3.7. We immediately run into something interesting. If Σ is a
disc, the spin bundle E → ∂Σ is always of NS type, and the number n of boundary
punctures on a disc will have to be odd. But when Σ degenerates to the union of two
branches Σ1 and Σ2 , with n1 + 1 punctures on one side and n2 + 1 on the other side,
inevitably either n1 + 1 or n2 + 1 is even. But in the theory that we are describing
here, a disc is always supposed to have an odd number of boundary punctures.
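The parity statement can be spelled out in one line. The sketch below uses only the facts stated above: n is odd and the original punctures are split as n1 + n2 = n, with one extra puncture on each branch.

```latex
% n odd, n_1 + n_2 = n; each branch gains one puncture at the double point.
\[
(n_1 + 1) + (n_2 + 1) = n + 2 \equiv 1 \pmod 2
\;\Longrightarrow\;
\text{exactly one of } n_1 + 1,\ n_2 + 1 \text{ is even.}
\]
% Illustration: n = 5 split as (n_1, n_2) = (2, 3) gives branches with 3 and 4 punctures.
```

So one branch always carries an odd number of punctures and the other an even number, whichever way the original punctures are distributed.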

Fig. 6. (a) A disc Σ with n boundary punctures that develops a narrow neck. (b) The neck
collapses and Σ degenerates to the union of two discs Σ1 and Σ2 glued at a point. (c) The picture
of part (b) can be recovered by gluing p1 ∈ Σ1 to p2 ∈ Σ2. The original boundary punctures of Σ
are divided in some way between Σ1 and Σ2.

What this means in practice is that either p1 or p2 does not really behave as a
boundary puncture in the sense of this theory: the piecewise trivializations of the
real spin bundles E1 → Σ1 and E2 → Σ2 jump in crossing either p1 or in crossing p2 ,
but not both. This is explained more explicitly shortly. As a result, the cohomology
classes ψi whose products we want to integrate to get the correlation functions have
the property that when restricted to ∂M, they are pullbacks from a quotient space
in which either p1 or p2 is forgotten. Effectively, then, ∂M behaves as if it is of real
codimension 2 and the intersection numbers are well-defined.
Now let us explain these assertions in more detail. First we introduce a useful
language. In the following, Σ will be a Riemann surface, possibly with boundary.
We write K for the complex canonical bundle of Σ and S for its chiral spin bundle.
So K is a complex line bundle over Σ, and S is a complex line bundle over Σ with
a linear map w : S ⊗ S → K that establishes an isomorphism between S ⊗ S and
K.
Along ∂Σ, it is meaningful to say that a one-form is real, and thus K, restricted
to ∂Σ, has a real subbundle. Moreover, the Riemann surface Σ is oriented and this
induces an orientation of ∂Σ. As a result, it is meaningful to say that a section of
K, when restricted to ∂Σ, is real and positive. For example, if Σ is the upper half
of the complex z-plane, so that ∂Σ is the real z axis, then the complex 1-form dz
is real and positive when restricted to ∂Σ. But if Σ is the lower half of the z-plane,
then its boundary is the real z axis now with the opposite orientation, and so in
this case, −dz is real and positive along ∂Σ.
This gives a convenient framework in which to describe the real spin bundle E of
∂Σ. We say that a local section ψ of S → Σ is real along ∂Σ if the 1-form w(ψ ⊗ ψ)
is real and positive when restricted to ∂Σ. In this case, we say that the restriction of
ψ to ∂Σ is a section of E. This serves to define E. For example, if Σ is the upper half
of the complex z-plane, then a section ψ of S with the property that w(ψ ⊗ ψ) = dz
is real along ∂Σ, and its restriction to ∂Σ provides a section of E. We describe this
more informally by writing ψ = √dz. Note that since (−ψ) ⊗ (−ψ) = ψ ⊗ ψ, in
this situation we also have w((−ψ) ⊗ (−ψ)) = dz. So just like the square root of a
number, a square root of dz is only uniquely determined up to sign. If Σ is the lower
half of the complex z plane, then a section ψ of S that satisfies w(ψ ⊗ ψ) = −dz is
real and is a section of E. We describe this informally by writing ψ = ±√(−dz) or
ψ = ±i√dz.
A trivialization of the real spin bundle E → ∂Σ is given by any nonzero section
of E. For example, if Σ is the upper half z plane, then E → ∂Σ can be trivialized
by ψ = ±√dz, and if Σ is the lower half z plane, then E → ∂Σ can be trivialized
by ψ = ±i√dz.
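The factor of i on the lower half plane follows directly from the linearity of the map w. In the sketch below, ψ0 is a label introduced here for a fixed section with w(ψ0 ⊗ ψ0) = dz; it is not notation from the text.

```latex
% \psi_0 is a hypothetical name for a chosen square root of dz: w(\psi_0 \otimes \psi_0) = dz.
% Since w is linear in each factor, multiplying \psi_0 by \pm i multiplies w by (\pm i)^2 = -1.
\[
w\big((\pm i\,\psi_0)\otimes(\pm i\,\psi_0)\big)
= (\pm i)^2\, w(\psi_0\otimes\psi_0) = -\,dz .
\]
```

Since −dz is real and positive along the boundary of the lower half plane, the sections ±iψ0 are the real ones there.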
With this in place, we can return to our problem. In Fig. 7, we show the same
open-string degeneration as in Fig. 6, but now we zoom in on the important region
where the degeneration occurs and do not specify what the Riemann surface Σ
looks like outside this region. The open-string degeneration is drawn in the figure
ignoring spin structures and their trivializations. In Fig. 8, we repeat Fig. 7(a),
but now providing information about the trivializations of spin structures.

Fig. 7. (a) The complement of the shaded region of the complex z-plane is a Riemann surface Σ
with boundary. It consists of an upper and lower half plane connected through a narrow neck. (b)
In real codimension 1, the neck collapses and Σ degenerates to a pair of branches Σ1 and Σ2 glued
together along a double point. (c) In this picture, the two branches have been separated. Now Σ1
and Σ2 are upper and lower half-planes, respectively, with distinguished boundary punctures p1
and p2. Gluing p1 to p2 will return us to the singular configuration in (b).

Fig. 8. Here we repeat Fig. 7(a), but now providing information on the trivialization of the spin
bundle of ∂Σ. On the upper and lower left and right of the figure, ∂Σ is parallel to the real z
axis and so the spin structure is trivialized by a choice of ±√dz (the upper regions) or ±i√dz (the
lower regions). Labels in the figure: ±√dz (upper left), √dz (upper right), ∓i√dz (lower left),
i√dz (lower right).

First of all, as there are no boundary punctures in this picture,²⁵ the real spin
bundle of ∂Σ is supposed to be trivialized everywhere in the picture. The
trivializations are easy to describe in the regions (the upper and lower left and right in
the figure) in which ∂Σ is parallel to the real z axis. We will use the fact that as
Σ is a region in the complex z plane, the complex 1-form dz is defined everywhere
on Σ; similarly it is possible to make a global choice of sign of ψ = √dz, though
such a ψ will not be everywhere real on ∂Σ. The overall sign of what we mean by
√dz will not be important in what follows.
We begin on the upper right of the picture with E trivialized by ψ = √dz. (It
would add nothing essentially new to use −√dz in the starting point, as the overall
sign of √dz is anyway arbitrary.) Now on the upper left of the picture, we pick a
trivialization ±√dz. This sign is meaningful, given that we used the trivialization
+√dz on the upper right. Now we continue through the narrow neck into the lower
part of the picture. As we do this, the boundary ∂Σ bends counterclockwise by

25 The Deligne–Mumford compactification is defined in such a way that a degeneration never
occurs at the location of an already existing puncture. Hence in Fig. 7(a), which shows the part
of Σ in which an open-string degeneration occurs, we can assume that there are no boundary
punctures.

an angle π on the right of the figure and by an angle −π on the left. As a result, a
section of S → ∂Σ has to acquire a phase in order to remain real. The trivialization
of E that is defined as √dz on the upper right will evolve to i√dz on the lower right,
and the trivialization of E that is defined as ±√dz on the upper left will evolve to
∓i√dz on the lower left.
We see that with one choice of sign on the left part of the picture, the trivial-
izations agree on the upper left and upper right of the figure but not on the lower
left and lower right; with the other choice of sign, matters are reversed. So when
Σ degenerates to the union of two branches Σ1 and Σ2 that are to be joined by
gluing a point p1 ∈ ∂Σ1 to a point p2 ∈ ∂Σ2 , as in Fig. 7(c), the trivialization of
the spin structure of the boundary jumps in crossing p1 but not in crossing p2 or
in crossing p2 but not in crossing p1 . In the construction studied in [3–6], precisely
one of p1 and p2 plays no role and can be forgotten. This is the basic reason that
the boundary of M behaves as if it is of real codimension two and the correlation
functions are well-defined. We provide more detail momentarily.
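The mismatch between the two sides of the neck can be summarized by comparing ratios of the trivializations quoted above. In this sketch, ε = ±1 is a label introduced here for the sign chosen on the upper left, and the subscripted ψ's simply name the four trivializations in the four regions of the figure.

```latex
% \epsilon = \pm 1: sign chosen on the upper left (label introduced for this sketch).
% Upper right: \sqrt{dz};  upper left: \epsilon\sqrt{dz};
% lower right: i\sqrt{dz}; lower left: -\epsilon\, i\sqrt{dz}.
\[
\frac{\psi_{\text{upper left}}}{\psi_{\text{upper right}}}
  = \frac{\epsilon\sqrt{dz}}{\sqrt{dz}} = \epsilon,
\qquad
\frac{\psi_{\text{lower left}}}{\psi_{\text{lower right}}}
  = \frac{-\epsilon\, i\sqrt{dz}}{i\sqrt{dz}} = -\epsilon .
\]
% The product of the two ratios is -1 for either choice of \epsilon.
```

Since the two ratios always have opposite signs, the trivializations agree across exactly one of the two pairs, which is the statement that the jump occurs at p1 or at p2 but not at both.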

3.10. Computations of disc amplitudes


Several concrete methods to compute in this framework have been deduced [3–6].
Here we will just describe the simplest computations of disc amplitudes.
First let us discuss the proper normalization of a disc amplitude. We write gst
for the string coupling constant in topological gravity of closed Riemann surfaces
with its usual normalization, and $\tilde g_{st}$ for the string coupling constant in the present
theory.
In the standard approach, genus g amplitudes are weighted by a factor of $g_{st}^{2g-2}$.
With theory T included, this is replaced by $\tilde g_{st}^{\,2g-2}\, 2^{g-1}$, where $2^{g-1}$ is the partition
function of theory T (Eq. (3.5)). The relation between the two is thus

$$ \tilde g_{st} = \frac{g_{st}}{\sqrt{2}}. \tag{3.36} $$
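The relation between the two couplings can be obtained by requiring that the two weightings of a closed genus g surface agree for every g. The short derivation below is a sketch under that assumption, taking both couplings to be positive.

```latex
% Require equal weights for closed surfaces of every genus g (assumption of this sketch).
\[
\tilde g_{st}^{\,2g-2}\, 2^{\,g-1} = g_{st}^{\,2g-2}
\;\Longleftrightarrow\;
\left(2\,\tilde g_{st}^{\,2}\right)^{g-1} = \left(g_{st}^{\,2}\right)^{g-1}
\;\Longrightarrow\;
\tilde g_{st} = \frac{g_{st}}{\sqrt{2}} .
\]
```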
A disc has Euler characteristic 1, so a disc amplitude is weighted by $1/\tilde g_{st} = \sqrt{2}/g_{st}$.
The partition function of theory T on a disc is 1/2 (as a disc has only
one spin structure). However, for any given set of boundary punctures, there are
two possible piecewise trivializations of the spin structure of the boundary, with
the requisite jumps across boundary punctures. These two choices will contribute
equally in the simple computations we will discuss, so we can take them into account
by including a factor of 2.
The factors discussed so far combine to $2\cdot\tfrac{1}{2}\cdot\sqrt{2}/g_{st} = \sqrt{2}/g_{st}$. In addition, in [3]
it was found convenient to include a factor of $1/\sqrt{2}$ for every boundary puncture.
Thus, let Σ be a disc with m boundary punctures and n bulk punctures labeled by
integers d1, . . . , dn; let M be the compactified moduli space of conformal structures
on Σ. Then refining Eq. (3.1), the general disc amplitude is

$$ \langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_n}\sigma^m\rangle_D = \frac{2^{(1-m)/2}}{g_{st}} \int_{\mathcal M} \psi_1^{d_1}\psi_2^{d_2}\cdots\psi_n^{d_n}. \tag{3.37} $$

This formula agrees with Eq. (18) in [3]. We have included factors of gst in this
explanation, because that helps determine the factors of 2 that are needed to ensure
that the theory is consistent with the standard normalization in the case that a
surface Σ has no boundary. However, in mathematical treatments, gst is often set
to 1, and we will do so in the rest of this section. (No topological information is lost,
since a given correlation function receives contributions only from surfaces with a
given Euler characteristic, and this determines the power of gst .)
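For bookkeeping, the prefactor of the disc amplitude can be assembled factor by factor from the ingredients listed in the text. The underbrace labels below are descriptive tags added for this sketch; $Z_T$ is a name introduced here for the partition function of theory T.

```latex
% Assembling the disc prefactor for m boundary punctures.
\[
\underbrace{2}_{\text{trivializations}}
\cdot \underbrace{\tfrac{1}{2}}_{Z_T(\text{disc})}
\cdot \underbrace{\frac{\sqrt{2}}{g_{st}}}_{1/\tilde g_{st}}
\cdot \underbrace{\left(\frac{1}{\sqrt{2}}\right)^{m}}_{\text{boundary punctures}}
= \frac{2^{(1-m)/2}}{g_{st}} .
\]
```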
In interpreting Eq. (3.37), we consider the boundary punctures to be inequivalent
and labeled, and we sum over all possible cyclic orderings. For example, let us
compute ⟨σσσ⟩, which receives a contribution only from a disc with three boundary
punctures labeled 1,2,3. There are two cyclic orderings (namely 123 and 132), and
for each cyclic ordering, M is just a point, with $\int_{\mathcal M} 1 = 1$. So after setting gst = 1,
Eq. (3.37) with n = 0, m = 3, and including a factor of 2 from the sum over cyclic
orderings, gives

$$ \langle\sigma^3\rangle = 1. \tag{3.38} $$

Getting this formula was the motivation to include a factor 1/√2 for each boundary
puncture. Another simple formula is

$$ \langle\tau_0\sigma\rangle = 1. \tag{3.39} $$

This is again easy because the moduli space is a point. With boundary punctures
only, Eq. (3.38) is the only nonzero amplitude, for dimensional reasons, and similarly
Eq. (3.39) is the only additional nonzero disc amplitude with insertions of σ and τ0 only.
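Both values can be checked by substituting into the general disc formula with gst = 1. The sketch below just tracks the three factors: the power of 2 from the normalization, the number of cyclic orderings, and the integral over a point.

```latex
% <sigma^3>: n = 0, m = 3, two cyclic orderings, moduli space a point.
% <tau_0 sigma>: n = 1, m = 1, d_1 = 0, one ordering, moduli space a point.
\[
\langle\sigma^3\rangle
 = \underbrace{2^{(1-3)/2}}_{=\,1/2} \times \underbrace{2}_{\text{cyclic orderings}}
   \times \int_{\mathcal M} 1 = 1,
\qquad
\langle\tau_0\sigma\rangle
 = 2^{(1-1)/2} \times 1 \times \int_{\mathcal M} \psi_1^{0} = 1 .
\]
```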
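The simplest method to compute arbitrary disc amplitudes is given by the
recursion relations in Theorem 1.5 of [3], and indeed the first of these relations is
sufficient. To explain it, first we recall the genus 0 recursion relations of [17]. It is
convenient to define

$$ \langle\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_s}\rangle\rangle = \left\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_s} \exp\left(\sum_{n=0}^{\infty} t_n\tau_n\right)\right\rangle. \tag{3.40} $$

Thus $\langle\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_s}\rangle\rangle$ is an amplitude with specified insertions as shown, with
all possible additional insertions weighted by powers of the tn. We also write
$\langle\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_s}\rangle\rangle_0$ for the genus 0 contribution to $\langle\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_s}\rangle\rangle$. Then one has
the genus 0 recursion relation

$$ \langle\langle \tau_{d_1}\tau_{d_2}\tau_{d_3}\rangle\rangle_0 = \langle\langle \tau_{d_1-1}\tau_0\rangle\rangle_0\, \langle\langle \tau_0\tau_{d_2}\tau_{d_3}\rangle\rangle_0. \tag{3.41} $$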
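As an illustration of how the genus 0 recursion is used, one can extract a single number from it. The sketch below takes d1 = 1, d2 = d3 = 0, differentiates once with respect to t0 (which inserts one more τ0), and then sets all tn = 0. It uses as input that, among genus 0 amplitudes with τ0 insertions only, the one with exactly three insertions equals 1 and all others vanish.

```latex
% Recursion with d_1 = 1, d_2 = d_3 = 0, after one derivative with respect to t_0:
\[
\langle\langle \tau_1\tau_0^3\rangle\rangle_0
= \langle\langle \tau_0^3\rangle\rangle_0\,\langle\langle \tau_0^3\rangle\rangle_0
+ \langle\langle \tau_0^2\rangle\rangle_0\,\langle\langle \tau_0^4\rangle\rangle_0 .
\]
% Setting all t_n = 0, with <tau_0^3>_0 = 1 and <tau_0^2>_0 = <tau_0^4>_0 = 0:
\[
\langle \tau_1\tau_0^3\rangle_0 = 1\cdot 1 + 0 = 1 .
\]
```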

The proof goes roughly as follows. For a smooth genus 0 surface Σ, we take the
complex z-plane plus a point at infinity. We denote the specified punctures as
z1, z2, z3. We will construct a convenient section λ of the line bundle L1 → M
whose fiber is the cotangent space to Σ at z1. Let ρ be the 1-form

$$ \rho = (z_2 - z_3)\,\frac{dz}{(z - z_2)(z - z_3)}. \tag{3.42} $$

Fig. 9. (a) If the two-sphere Σ degenerates to two branches with punctures z2 and z3 on opposite
sides, then the 1-form ρ = dz/(z − z2)(z − z3) has poles on each branch, so in particular it is
nonzero on each branch. (When Σ degenerates, ρ also acquires poles at the double point that the
two branches have in common, with equal and opposite residues on the two sides.) λ also remains
nonzero. (b) If instead z2 and z3 are on the same branch, then all poles of ρ are on that branch
and in fact ρ = 0 on the other branch. Since λ is defined by setting z = z1 in ρ, λ vanishes if, as
sketched here, z1 is on the branch on which ρ is identically zero.

It has poles at z = z2, z3, with residues 1 and −1, and elsewhere is regular and
nonzero. These properties characterize ρ uniquely, so ρ does not depend on the
coordinates used in writing the formula. Upon setting z = z1 in ρ, we get a
holomorphic section λ of L1 → M; the divisor D of the zeroes of this section represents
c1(L1). But λ never vanishes when Σ is smooth, because ρ has no zeroes on the
finite z-plane or at z = ∞. If Σ degenerates to two components with z2 and z3
on opposite sides (Fig. 9(a)), λ is still everywhere nonzero. But if z2 and z3 are
contained in the same component (Fig. 9(b)), then ρ vanishes on the other
component. Finally, then, λ vanishes precisely if, as in the figure, z1 is contained in
the opposite component from the one containing z2 and z3. Moreover, this is a
simple zero (because ρ has a simple zero at z2 = z3). So in $\tau_{d_1} = c_1(L_1)^{d_1}$, we can
replace one factor of c1(L1) with a restriction to the divisor D that is depicted in
Fig. 9(b). After making this substitution, we are left with an insertion of $\tau_{d_1-1}$ on
one branch and insertions of $\tau_{d_2}$ and $\tau_{d_3}$ on the other; in addition, a new puncture
corresponding to an insertion of τ0 appears on each branch, where the two branches
meet. All this leads to the right hand side of Eq. (3.41). It is not difficult to see that
this recursion relation uniquely determines all genus zero amplitudes, modulo the
statement that the only nonzero amplitude with insertions of τ0 only is $\langle\tau_0^3\rangle_0 = 1$.
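The stated properties of ρ can be verified directly. The sketch below gives the partial-fraction form, from which the residues can be read off, and the behavior near z = ∞ in the coordinate w = 1/z (a coordinate introduced here for the check).

```latex
% Partial fractions: residues +1 at z_2 and -1 at z_3.
\[
\rho = (z_2 - z_3)\,\frac{dz}{(z - z_2)(z - z_3)}
     = \left(\frac{1}{z - z_2} - \frac{1}{z - z_3}\right) dz .
\]
% Near z = \infty use w = 1/z, so dz = -dw/w^2 and (z - z_j) = (1 - z_j w)/w.
\[
\rho = -\,\frac{(z_2 - z_3)\,dw}{(1 - z_2 w)(1 - z_3 w)}
\;\xrightarrow{\; w \to 0 \;}\; -(z_2 - z_3)\,dw \neq 0 \quad (z_2 \neq z_3).
\]
```

The second line also shows the simple zero in z2 − z3 that appears when the two punctures approach each other.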
The disc recursion relation that we aim to describe can be formulated and proved
in almost the same way. Similarly to the previous case, we define

$$ \langle\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_s}\sigma^m\rangle\rangle = \left\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_s}\sigma^m \exp\left(\sum_{n=0}^{\infty} t_n\tau_n + v\sigma\right)\right\rangle, \tag{3.43} $$

and write $\langle\langle \tau_{d_1}\tau_{d_2}\cdots\tau_{d_s}\sigma^m\rangle\rangle_D$ for the disc contribution. The desired recursion
relation is

$$ \langle\langle \tau_n\sigma\rangle\rangle_D = \langle\langle \tau_{n-1}\tau_0\rangle\rangle_0\, \langle\langle \tau_0\sigma\rangle\rangle_D + \langle\langle \tau_{n-1}\rangle\rangle_D\, \langle\langle \sigma^2\rangle\rangle_D. \tag{3.44} $$

Given a knowledge of Eqs. (3.38) and (3.39) and vanishing of $\langle\tau_0^n\sigma^m\rangle_D$ for other
values of n, m, it is not difficult to see that Eq. (3.44) determines all disc amplitudes
in terms of the genus 0 amplitudes. These in turn can be determined, for example,
from (3.41).
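Two small cases show how the disc recursion works in practice. The sketch below takes n = 1, differentiates once with respect to t0 or twice with respect to v, and then sets all couplings to zero. The only inputs are the two nonzero disc amplitudes with σ and τ0 insertions quoted above (both equal to 1), the genus 0 value of the three-τ0 amplitude (equal to 1), and the vanishing of all other amplitudes built from σ and τ0 alone.

```latex
% n = 1 in the disc recursion:
\[
\langle\langle \tau_1\sigma\rangle\rangle_D
= \langle\langle \tau_0\tau_0\rangle\rangle_0\,\langle\langle \tau_0\sigma\rangle\rangle_D
+ \langle\langle \tau_0\rangle\rangle_D\,\langle\langle \sigma^2\rangle\rangle_D .
\]
% One t_0 derivative, then t = v = 0: only the term with <tau_0^3>_0 <tau_0 sigma>_D survives.
\[
\langle \tau_1\tau_0\sigma\rangle_D
= \langle \tau_0^3\rangle_0\,\langle \tau_0\sigma\rangle_D = 1 .
\]
% Two v derivatives, then t = v = 0: the cross term of the product rule survives.
\[
\langle \tau_1\sigma^3\rangle_D
= 2\,\langle \tau_0\sigma\rangle_D\,\langle \sigma^3\rangle_D = 2 .
\]
```

In the second case the factor of 2 is the binomial coefficient from distributing the two v derivatives over the product of the two disc factors.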
The proof of Eq. (3.44) is rather similar to the proof of the genus zero recursion
relation (3.41). However, we will have to explain more fully what is meant in saying
that one of the punctures in an open-string degeneration should be forgotten.
Roughly speaking, we are going to again compute c1 (L1 ), for one of the bulk
punctures, from the zeroes of a convenient section λ of L1 . However, here because
M has a boundary, we have to discuss how to relate c1 (L1 ) to the zeroes of a
section.
As discussed in Subsec. 3.9, the boundary ∂M of M has a forgetful map in
which precisely one of the extra boundary punctures that appears at an open-
string degeneration is forgotten. Let us write N for the remaining moduli space
when this puncture is forgotten, so that the forgetful map is π : ∂M → N .
Simplifying a little,²⁶ the recipe [3] is that c1(L1) can be represented by the
zeros of any section s of L1 that is nonvanishing everywhere along ∂M, and whose
restriction to ∂M is a pullback from N . Alternatively, one can still calculate c1 (L1 )
using any section s of L1 that is everywhere nonzero along the boundary, even if its
restriction to the boundary is not a pullback. But in this case, c1 (L1 ) is represented
by a sum of two contributions, one involving in the usual way the zeroes of s, and
the second measuring the failure of the restriction of s to be a pullback.
Setting z = x + iy, we take a smooth disc D to be the closed upper half-
plane y ≥ 0 plus a point at infinity. On the left-hand side of Eq. (3.44), we see
a distinguished bulk puncture that we place at z1 = x1 + iy1 , y1 > 0, and a
distinguished boundary puncture that we place at x0 . In the present case, there
is a convenient section λ of L1 that is everywhere nonzero along the boundary,
but whose restriction to the boundary is not a pullback. To construct it, rather as
before, we set
\[ \rho = (\bar z_1 - x_0)\, \frac{dz}{(z - \bar z_1)(z - x_0)} . \tag{3.45} \]
This 1-form is regular and nonzero throughout D, except at the boundary point
x0 . Evaluating ρ at z = z1 , we get a section λ of L1 that is regular and nonzero as
long as D is smooth.
At a closed-string degeneration, where D splits up into the union of a two-sphere
and a disc (Fig. 10(a)), λ has a simple zero if and only if z1 is on the two-sphere
component. This is responsible for the first term on the right-hand side of the

26 The general recipe has two further complications. First, in general one is allowed to compute
using a multisection rather than a section. This is important because the conditions on a section
that we are about to state are difficult to satisfy. Second, the general procedure allows one to
define $\prod_{i=1}^{n} c_1(L_i)^{d_i}$, without defining the individual $c_1(L_i)$, by picking a multisection s of
$E = \oplus_{i=1}^{n} L_i^{\oplus d_i}$. This multisection should obey conditions analogous to the ones that we will state
momentarily.
[Figure 10: panels (a) and (b), with marked points $z_1$, $\bar z_1$, $x_0$, $p_1$, $p_2$.]

Fig. 10. (a) A disc D splits up into the union of a disc and a sphere (upper half of the drawing).
If the bulk puncture z1 is contained in the sphere, then the section λ vanishes. To see this, take
the closed oriented double cover, obtained here by adding additional components (lower half of
the drawing, sketched with dotted lines). It is a union of three spheres connected at double points.
The differential ρ has poles only on the bottom two components and vanishes identically on the
top component. So, setting z = z1 to define λ, we learn that, with z1 being in the top component,
λ vanishes. (b) The same disc D splits into a union of two discs, again comprising the upper
half of the drawing. The interesting case is that z1 and x0 are on opposite sides, as shown. The
oriented double cover (the full drawing including the bottom half) is a union of two spheres. ρ
has poles at $x_0$ and $\bar z_1$ and so is nonzero on both branches; hence $\lambda \neq 0$ along this divisor. On
the branch containing z1 , ρ has an additional pole at the point labeled p1 where the two branches
meet. Therefore λ depends on p1 , and, if p1 is the boundary puncture that is forgotten by the
forgetful map π : ∂M → N , then along this component of the boundary, λ is not a pullback.

recursion relation (3.44). At an open-string degeneration, where D splits up into


the union of two discs (Fig. 10(b)), λ remains everywhere nonzero. However, in case
the boundary puncture that is supposed to be forgotten is in the same component
as z1 , λ restricted to ∂M is not a pullback from N . The second term on the right
hand side of Eq. (3.44) corrects for this failure. See Fig. 10 for an explanation of
the statements about the behavior of λ at degenerations.

4. Interpretation via Matrix Models


4.1. The loop equations
Let us now briefly recapitulate the representation of topological gravity in terms of
random matrix models. The simplest models are single matrix models of the form
\[ Z = \frac{1}{\mathrm{vol}(U(N))} \int d\Phi \cdot \exp\left( -\frac{1}{g_{st}}\, \mathrm{Tr}\, W(\Phi) \right) \tag{4.1} \]

Fig. 11. (a) Feynman diagrams of the matrix model are naturally ribbon graphs. The two sides
of a ribbon represent the flow of the two “indices” of an N × N matrix $M^i{}_j$, i, j = 1, . . . , N. The
edges of the ribbons form closed loops. Gluing a disc to each such loop, the union of the ribbons
and the discs is a two-manifold Σ without boundary on which the given Feynman diagram can
be drawn. (The edges of the ribbon are oriented, though this is not shown here, because the two
indices transform according to inequivalent, dual representations of U(N). As a result, Σ has a
natural orientation. A similar model with symmetry group O(N) or Sp(N) leads to unoriented
two-manifolds.) (b) New variables $\Psi$, $\bar\Psi$ transforming in the N-dimensional representation of U(N)
and its dual are added to the matrix model. Because $\Psi^i$ and $\bar\Psi_j$, i, j = 1, . . . , N, carry only a
single “index” (rather than the two indices of the matrix $M^i{}_j$), their propagator is naturally
represented by a single line rather than the double line of the matrix propagator. These single
lines provide boundaries of the surface Σ, so now we get a ribbon graph on Σ with Ψ propagating
on the boundary of Σ, as shown. For the model described in the text, the Ψ propagator is 1/z and
this gives a factor $1/z^L$, where L is the length of the boundary.

Here Φ is a Hermitian N × N matrix integrated with the Euclidean measure for


each matrix element, W (x) is a complex polynomial, say of degree d + 1, and gst is
the string coupling constant. Since we divide by the volume of the “gauge group”
U (N ), this integral should be considered the zero-dimensional analogue of a gauge
theory — we integrate over matrices Φ modulo gauge transformations

\[ \Phi \to U \cdot \Phi \cdot U^{-1} . \tag{4.2} \]

In general, if Re W is not bounded below, one needs to complexify the matrix Φ


and pick a suitable integration contour in the space of complex matrices to make
the integral well-defined. For a formal expansion in powers of gst and even for the
formal expansion in powers of 1/N that we will make shortly, this is not necessary
and we can consider (4.1) as a formal expression.
In a perturbative expansion near a critical point of W(Φ), the Feynman diagrams
become so-called “fat” or ribbon graphs that can be conveniently represented (see
Fig. 11(a)) by a double line [57]. These are graphs, in general with $\ell$ loops, that can
be naturally drawn on some oriented two-manifold of genus g. The contribution of
such a graph to the expansion of the matrix integral is weighted by a factor

\[ (g_{st} N)^{\ell}\, g_{st}^{2g-2} . \tag{4.3} \]
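The weight in Eq. (4.3) can be recovered by elementary power counting. The sketch below assumes the usual conventions for an action normalized as in Eq. (4.1), which the text does not spell out: each vertex contributes $1/g_{st}$, each propagator contributes $g_{st}$, and each index loop contributes N. The function names are illustrative only.

```python
def ribbon_graph_weight(vertices, edges, faces):
    """Return (genus, power of g_st, power of N) for a connected ribbon graph.

    Assumed conventions: a vertex carries 1/g_st, a propagator carries g_st,
    and every index loop (face) carries a factor N.
    """
    euler = vertices - edges + faces          # V - E + F = 2 - 2g
    if euler % 2 != 0 or euler > 2:
        raise ValueError("not the Euler number of a closed oriented surface")
    genus = (2 - euler) // 2
    gst_power = edges - vertices              # g_st^(E - V)
    n_power = faces                           # N^F
    return genus, gst_power, n_power


def matches_eq_4_3(vertices, edges, faces):
    """Check that the counting agrees with (g_st N)^l g_st^(2g-2), l = faces."""
    genus, gst_power, n_power = ribbon_graph_weight(vertices, edges, faces)
    return gst_power == faces + 2 * genus - 2 and n_power == faces


# One quartic vertex with two propagators: the planar contraction has three
# index loops (genus 0), the non-planar one has a single index loop (genus 1).
planar = ribbon_graph_weight(1, 2, 3)
non_planar = ribbon_graph_weight(1, 2, 1)
```

Both contractions carry the same power of $g_{st}$ at fixed vertex and propagator number; they differ only in the power of N, which is what the genus expansion organizes.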

The large N or ’t Hooft limit is obtained by taking the rank N of the matrix to
infinity and simultaneously the coupling $g_{st}$ to zero, keeping fixed the combination

\[ \mu = g_{st} N . \tag{4.4} \]

In the limit, all graphs with a fixed genus and an arbitrary number of holes
contribute in the same order, so the matrix integral has an asymptotic expansion of
the form
\[ Z \sim \exp\left( \sum_{g \ge 0} g_{st}^{2g-2} F_g \right) , \tag{4.5} \]

where Fg is the contribution of ribbon graphs of genus g. In general, the matrix


integral depends on the coefficients of the potential W and the particular critical
point around which the expansion is made. We describe the critical points at the
end of this section.
Matrix integrals are governed by Virasoro constraints that are associated to the
vector fields $L_n \sim -\mathrm{Tr}\, \Phi^{n+1}\, \partial/\partial\Phi$. Though these constraints can be deduced directly
from that representation of $L_n$, a fuller understanding with details that we will
need below can be obtained by diagonalizing the matrix as $\Phi = U \Lambda U^{-1}$, with U
unitary and $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$. The integral over U cancels the factor of
1/vol(U(N)) in the definition of the matrix integral, and the integral becomes
\[ Z = \int d^N\lambda \prod_{I<J} (\lambda_I - \lambda_J)^2 \exp\left( -\frac{1}{g_{st}} \sum_I W(\lambda_I) \right) . \tag{4.6} \]

If $x_i$, i = 1, . . . , d are the critical points of the polynomial W(x), then the critical
points of the matrix function Tr W(Φ) are found by setting each $\lambda_I$ equal to one
of the $x_i$. A critical point is labeled by the number $N_i$ of eigenvalues with $\lambda_I = x_i$.
(Note that the eigenvalues $\lambda_I$ are only defined up to permutation.) The large N
limit is taken in such a way that the “filling fractions”

\[ \mu_i = g_{st} N_i , \quad i = 1, \ldots, d, \tag{4.7} \]

are all kept finite. These parameters characterize the saddle-points, and together
with the coefficients of the polynomial W (x) play the role of moduli of the matrix
model. (In our application, because it only involves a local portion of the spectral
curve, we will not really see these parameters.)
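To derive the Virasoro constraints on the matrix integral, one can start with
\[ 0 = \int d^N\lambda \sum_K \frac{\partial}{\partial\lambda_K} \left( \frac{1}{x - \lambda_K} \prod_{I<J} (\lambda_I - \lambda_J)^2 \exp\left( -\frac{1}{g_{st}} \sum_I W(\lambda_I) \right) \right) . \tag{4.8} \]
This implies the identity
\[ \left\langle \left( \sum_K \frac{1}{x - \lambda_K} \right)^2 - \frac{1}{g_{st}} \sum_K \frac{W'(\lambda_K)}{x - \lambda_K} \right\rangle = 0, \tag{4.9} \]
where the symbol $\langle \cdots \rangle$ is defined by
\[ \langle A \rangle = \frac{1}{Z} \int d^N\lambda \; A \prod_{I<J} (\lambda_I - \lambda_J)^2 \exp\left( -\frac{1}{g_{st}} \sum_I W(\lambda_I) \right) . \tag{4.10} \]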
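The only nontrivial algebra in passing from Eq. (4.8) to Eq. (4.9) is that the derivative of $1/(x-\lambda_K)$ and the derivative of the Vandermonde factor combine into a perfect square. The sketch below checks this identity in exact rational arithmetic at one sample point; the function names and sample values are illustrative, not taken from the text.

```python
from fractions import Fraction


def derivative_terms(lams, x):
    """Sum over K of the lambda_K-derivative of (1/(x - lambda_K)) * prod (lambda_I - lambda_J)^2,
    divided by the Vandermonde factor itself (the potential term is left out)."""
    total = Fraction(0)
    for k, lk in enumerate(lams):
        total += Fraction(1) / (x - lk) ** 2              # d/dlambda_K of 1/(x - lambda_K)
        for j, lj in enumerate(lams):
            if j != k:                                    # d/dlambda_K of the squared Vandermonde
                total += Fraction(2) / ((lk - lj) * (x - lk))
    return total


def squared_resolvent(lams, x):
    """(sum_K 1/(x - lambda_K))^2, the first term inside Eq. (4.9)."""
    s = sum(Fraction(1) / (x - lk) for lk in lams)
    return s * s


sample_lams = [Fraction(1), Fraction(-2), Fraction(5, 3), Fraction(1, 7)]
sample_x = Fraction(7, 2)
identity_holds = derivative_terms(sample_lams, sample_x) == squared_resolvent(sample_lams, sample_x)
```

The remaining term of Eq. (4.9) comes from differentiating the exponential, which brings down $-W'(\lambda_K)/g_{st}$ for each K.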

In Eq. (4.9) we see the matrix resolvent $\mathrm{Tr}\,(x - \Phi)^{-1} = \sum_K (x - \lambda_K)^{-1}$, but as we
will see a slightly more convenient variable is
\[ J(x) = \frac{1}{2} W'(x) - g_{st}\, \mathrm{Tr}\, \frac{1}{x - \Phi} = \frac{1}{2} W'(x) - g_{st} \sum_K \frac{1}{x - \lambda_K} . \tag{4.11} \]
The identity (4.9) is equivalent to
\[ \langle J(x)^2 \rangle = \frac{1}{4} W'(x)^2 - g_{st} \left\langle \sum_K \frac{W'(x) - W'(\lambda_K)}{x - \lambda_K} \right\rangle , \tag{4.12} \]
and we note that if W is a polynomial, then
\[ f(x) = -g_{st} \left\langle \sum_K \frac{W'(x) - W'(\lambda_K)}{x - \lambda_K} \right\rangle \tag{4.13} \]
is a polynomial in x, as is
\[ P(x) = \frac{1}{4} W'(x)^2 + f(x). \tag{4.14} \]
If W is a general function $W = \sum_{n \ge 0} u_n x^n$ regular at x = 0, then P(x) is no longer
a polynomial but is regular at x = 0.
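The step from Eq. (4.9) to Eq. (4.12) can be spelled out as follows. This is a short worked computation using only the definition (4.11); no new input is assumed.

```latex
\begin{aligned}
J(x)^2 &= \frac14 W'(x)^2
  - g_{st}\, W'(x) \sum_K \frac{1}{x-\lambda_K}
  + g_{st}^2 \Big( \sum_K \frac{1}{x-\lambda_K} \Big)^2 , \\
g_{st}^2 \Big\langle \Big( \sum_K \frac{1}{x-\lambda_K} \Big)^2 \Big\rangle
  &= g_{st} \Big\langle \sum_K \frac{W'(\lambda_K)}{x-\lambda_K} \Big\rangle
  \qquad \text{(by Eq. (4.9))} , \\
\langle J(x)^2 \rangle
  &= \frac14 W'(x)^2
  - g_{st} \Big\langle \sum_K \frac{W'(x) - W'(\lambda_K)}{x-\lambda_K} \Big\rangle .
\end{aligned}
```

Each summand $(W'(x) - W'(\lambda_K))/(x - \lambda_K)$ is a polynomial in x whenever $W'$ is, which is why f(x) is polynomial.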


When we insert the expression J(x) inside the matrix integral (4.6), where we
now consider a general function
\[ W(x) = \sum_{n \ge 0} u_n x^n , \tag{4.15} \]
it can be written as a differential operator
\[ J(x) = \frac{1}{2} \sum_{n \ge 0} \left( (n+1)\, u_{n+1}\, x^n + 2 g_{st}^2\, \frac{\partial}{\partial u_n}\, x^{-n-1} \right) . \tag{4.16} \]

Comparing to standard formulas in conformal field theory, we are led to set
\[ J(x) = \frac{g_{st}}{\sqrt{2}}\, \partial\varphi(x) \tag{4.17} \]
where ϕ(x) is a chiral boson in a c = 1 conformal field theory with canonical
two-point function $\partial\varphi(x)\, \partial\varphi(y) \sim 1/(x - y)^2$. Thus
\[ \partial\varphi(x) = \sqrt{2} \left( \frac{W'(x)}{2 g_{st}} - \sum_K \frac{1}{x - \lambda_K} \right) , \tag{4.18} \]
and formally
\[ \varphi(x) = \sqrt{2} \left( \frac{W(x)}{2 g_{st}} - \sum_K \log(x - \lambda_K) \right) = \sqrt{2} \left( \frac{W(x)}{2 g_{st}} - \log\det(x - \Phi) \right) . \tag{4.19} \]

The corresponding stress tensor is
\[ T(x) = \frac{1}{2} (\partial\varphi)^2 = \frac{1}{g_{st}^2}\, J(x)^2 . \tag{4.20} \]
Making the standard mode expansion
\[ T(x) = \sum_{k \in \mathbb{Z}} \frac{L_k}{x^{k+2}} , \tag{4.21} \]
the equation (4.12) becomes a set of differential equations for the partition function,
\[ \sum_{k \in \mathbb{Z}} \frac{L_k}{x^{k+2}}\, Z = g_{st}^{2}\, P(x)\, Z . \tag{4.22} \]
Since P(x) is regular at x = 0, it contributes only to the terms in Eq. (4.22) with
k ≤ −2 and those terms serve to determine$^{27}$ P(x). However, for k ≥ −1, P(x)
does not contribute to Eq. (4.22) and we get differential equations satisfied by Z:
\[ L_n Z = 0, \quad n \ge -1. \tag{4.23} \]
In this range of n, the $L_n$ are
\[ L_{-1} = \sum_{k \ge 1} k\, u_k\, \frac{\partial}{\partial u_{k-1}} , \tag{4.24} \]
\[ L_n = \sum_{k} k\, u_k\, \frac{\partial}{\partial u_{k+n}} + g_{st}^2 \sum_{i+j=n} \frac{\partial^2}{\partial u_i\, \partial u_j} , \quad n \ge 0. \tag{4.25} \]
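As a sanity check on Eqs. (4.24) and (4.25), the simplest case can be tested numerically. The sketch below assumes N = 1 and a quadratic potential $W = u_0 + u_1 x + u_2 x^2$ with $u_2 > 0$ and $g_{st} > 0$, so that Eq. (4.6) is an ordinary Gaussian integral with a closed form; the overall constant 1/vol(U(1)) drops out. Only n = −1 and n = 0 are tested, since higher n would require derivatives with respect to couplings that have been set to zero. The helper names and step sizes are illustrative choices.

```python
import math


def Z(u, g):
    """Closed form of the N = 1 integral of exp(-(u0 + u1*x + u2*x^2)/g) over the real line."""
    u0, u1, u2 = u
    return math.sqrt(math.pi * g / u2) * math.exp((u1 * u1 / (4 * u2) - u0) / g)


def d1(f, u, i, h=1e-4):
    """Central first difference of f with respect to u[i]."""
    up, dn = list(u), list(u)
    up[i] += h
    dn[i] -= h
    return (f(up) - f(dn)) / (2 * h)


def d2(f, u, i, h=1e-3):
    """Central second difference of f with respect to u[i]."""
    up, dn = list(u), list(u)
    up[i] += h
    dn[i] -= h
    return (f(up) - 2 * f(u) + f(dn)) / (h * h)


def virasoro_residuals(u, g):
    """Return (L_{-1} Z / Z, L_0 Z / Z), with u_k = 0 for k >= 3."""
    f = lambda v: Z(v, g)
    _, u1, u2 = u
    l_minus_1 = 1 * u1 * d1(f, u, 0) + 2 * u2 * d1(f, u, 1)
    l_zero = 1 * u1 * d1(f, u, 1) + 2 * u2 * d1(f, u, 2) + g * g * d2(f, u, 0)
    return l_minus_1 / f(u), l_zero / f(u)


residuals = virasoro_residuals([0.3, 0.5, 1.2], 0.7)
```

Both residuals vanish up to finite-difference error, in line with Eq. (4.23). For general N the same structure holds, with $g_{st}^2\, \partial^2 Z/\partial u_0^2 = N^2 Z$ replacing the value 1 found here.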

So far, all of this is true for any N; we have not made any large N approximation.
For any function h, the quantity $\langle g_{st}\, \mathrm{Tr}\, h(\Phi) \rangle$ has a limit for large N, and for any
two functions $h_1$, $h_2$, one has a large N factorization
\[ \langle g_{st}^2\, \mathrm{Tr}\, h_1(\Phi)\, \mathrm{Tr}\, h_2(\Phi) \rangle \;\xrightarrow{N \to \infty}\; \langle g_{st}\, \mathrm{Tr}\, h_1(\Phi) \rangle \langle g_{st}\, \mathrm{Tr}\, h_2(\Phi) \rangle . \tag{4.26} \]
These properties can be demonstrated by an elementary study of the matrix
integral. In particular, both $\langle J \rangle$ and f(x) have large N limits, and in the large N
limit
\[ \langle J(x)^2 \rangle = \langle J(x) \rangle^2 . \tag{4.27} \]

27 If W is a polynomial of degree d + 1, then P(x) is a polynomial of degree 2d. The condition
on W means that $u_n = 0$ for n > d + 1, which leads to $L_n = 0$ for n < −2d − 2. The terms
in Eq. (4.22) with −2 ≥ k ≥ −2d − 2 determine P and the ones with k < −2d − 2 are trivial
identities. If we consider a general function $W = \sum_{n \ge 0} u_n x^n$ that is regular at x = 0, then P has
a similar power series expansion around x = 0, and all of the constraints (4.22) with n ≤ −2 serve
to determine P.

We define
\[ y = \langle J(x) \rangle_0 , \tag{4.28} \]
where the subscript denotes the large N limit. Eq. (4.12) becomes for large N
a hyperelliptic equation for y
\[ y^2 = P(x) = \frac{1}{4} W'(x)^2 + f(x) \tag{4.29} \]
and defines what is known as the spectral curve C. In Eq. (4.29), y, W , and f
all depend on the “coupling parameters” ui , though this is not shown explicitly.
Remarkably, the spectral curve fully captures the solution of the matrix model.
That is, all the perturbative functions Fg can be completely calculated using the
geometric data of the spectral curve [58].
Suppose that W is a polynomial of degree d + 1, thus with d critical points
$p_1, \ldots, p_d$. Concretely, when one takes the large N limit of the matrix integral, the
first step is to pick a critical point of the matrix function $\mathrm{Tr}\, W(\Phi) = \sum_{K=1}^{N} W(\lambda_K)$
about which to expand. The critical points of this matrix function are found simply
by setting each of the $\lambda_K$ equal to one of the $p_j$. Up to a permutation of the λ’s,
the critical points are classified by the number $N_i$ of eigenvalues that equal $p_i$. The
$N_i$ are subject to one constraint
\[ \sum_{i=1}^{d} N_i = N. \tag{4.30} \]

A large N limit is obtained in general by taking N → ∞ keeping fixed
\[ \mu_i = g_{st} N_i . \tag{4.31} \]
In the large N limit, the $\mu_i$ behave as continuous variables constrained only by
\[ \sum_i \mu_i = \mu. \tag{4.32} \]

Thus if W is of degree d + 1, there are d “moduli” µi that appear in constructing


the large N limit. We note from Eq. (4.13) that f (x) is a polynomial of degree d − 1
in x and so has d coefficients. These d coefficients are functions of the moduli µi of
the matrix model. So except for constraints coming from keeping the Ni real and
positive, a (d − 1)-parameter family of f ’s can arise, even for fixed W , by varying
the critical point about which one expands the matrix model.
In the above derivation, for finite N , we discovered that the matrix integral
is governed by an operator-valued conformal field ∂ϕ(x). For finite N , this field
depends on the parameters of the matrix model, namely the ui and N , as well
as x. In the large N limit, the matrix integral can be defined by an expansion
around a particular saddle point, and then new parameters appear. For the “bare”
matrix integral, without trying to compute the expectation value of the resolvent,
the extra parameters are the µi . When one tries to compute the expectation value
of the resolvent, there is an additional binary choice, since $\langle J(x) \rangle$ is governed by
a quadratic equation with two roots. In the large N limit, and also in the more
refined double scaling limit in which N → ∞ with $\mu = g_{st} N$ fixed, the formalism
with the conformal field ∂ϕ remains valid, but this field now depends on additional
parameters: the $\mu_i$ and the choice of sign of $\langle J(x) \rangle$.
In our application, the $\mu_i$ will not be very important, since we will consider
only the local behavior near a particular branch point. However, the extension of
the conformal formalism to include the choice of sign of $\langle J(x) \rangle$ is important. It
means that ∂ϕ should be interpreted as a conformal field on the spectral curve C,
the double cover of the x-plane that is defined by the hyperelliptic equation (4.29).
The hyperelliptic curve has an involution y → −y that exchanges the two choices
of the sign of $\langle J(x) \rangle$. Since ∂ϕ is defined as a multiple of J(x) (Eq. (4.17)), ∂ϕ is
odd under the hyperelliptic involution.

4.2. Double-scaling limits and topological gravity


Topological gravity and other models of two-dimensional gravity coupled to matter
are obtained by taking a suitable double-scaling limit of the generic matrix model.
These scaling limits are best understood in terms of the underlying spectral curve.
For the so-called (2, 2p − 1) minimal model CFT coupled to gravity, the corresponding
spectral curve takes the form
\[ y^2 \sim x^{2p-1} . \tag{4.33} \]
This limiting curve can be obtained by starting from the generic case $y^2 = P(x)$,
where P is a polynomial of degree 2p, and then making 2p − 1 branch points coincide
and sending the remaining one to infinity. In particular for topological gravity, which
corresponds to the case p = 1, we choose to write the underlying curve as
\[ \frac{1}{2} y^2 = x. \tag{4.34} \]
This curve can be obtained, for example, from the simple Gaussian matrix model,
with a quadratic polynomial $W(x) = x^2$. In this example, the polynomial P is
$P(x) = x^2 - c$, with a constant c. There are branch points at $x = \pm\sqrt{c}$. After
shifting x by a constant and “zooming in” to a single branch point, one gets the
curve of Eq. (4.34).
In the limit that the spectral curve C is described by Eq. (4.34), the operator-
valued conformal field ∂ϕ takes a simple form. Because it is odd under the hyperel-
liptic involution y → −y, its expansion in powers of x has only half-integer powers.
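We will choose to parametrize the expansion as
\[ \frac{1}{2}\, g_{st}\, \partial\varphi(x) = x^{1/2} - \sum_{n \ge 0} \left( n + \frac{1}{2} \right) s_n\, x^{n - 1/2} - \frac{1}{4}\, g_{st}^2 \sum_{n \ge 0} \frac{\partial}{\partial s_n}\, x^{-n - 3/2} . \tag{4.35} \]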

Here ϕ is what would usually be called a twisted chiral boson on the complex x-plane,
with a twist field at x = 0 (and another at x = ∞). The $s_n$ are functions of
the parameters $u_n$ of an underlying matrix model; the precise relationship depends
upon exactly what matrix model one starts with before passing to the limit in
which the spectral curve C reduces to the curve $y^2 = 2x$. This relationship is not
very important for us.
What is important is the relationship between the $s_n$ and the corresponding
parameters $t_n$ of topological gravity, that is, the parameters that were introduced in Eq.
(3.3). This relationship turns out to be
\[ t_n = \frac{(2n+1)!!}{2^n}\, s_n . \tag{4.36} \]
This statement is part of the relationship between the matrix model and intersection
theory on $M_{g,n}$, as proved in [20] as well as [2, 21, 22]. (Note that the factor $2^n$,
which is not entirely standard, is a consequence of our particular normalization of
the spectral curve in Eq. (4.34).)
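Inserting these expressions into the loop equations then gives the familiar Virasoro
constraints
\[ L_n Z = 0, \quad n \ge -1, \tag{4.37} \]
where the operators $L_n$ are modes of the stress tensor $T = \frac{1}{2} (\partial\varphi)^2$, with ∂ϕ now
given by Eq. (4.35). That is, we have
\[ L_{-1} = -\frac{\partial}{\partial s_0} + \sum_{k \ge 1} \left( k + \frac{1}{2} \right) s_k\, \frac{\partial}{\partial s_{k-1}} + \frac{1}{2 g_{st}^2}\, s_0^2 , \tag{4.38} \]
\[ L_0 = -\frac{\partial}{\partial s_1} + \sum_{k \ge 0} \left( k + \frac{1}{2} \right) s_k\, \frac{\partial}{\partial s_k} + \frac{1}{16} , \tag{4.39} \]
\[ L_n = -\frac{\partial}{\partial s_{n+1}} + \sum_{k \ge 0} \left( k + \frac{1}{2} \right) s_k\, \frac{\partial}{\partial s_{k+n}} + \frac{1}{8}\, g_{st}^2 \sum_{i+j=n-1} \frac{\partial^2}{\partial s_i\, \partial s_j} , \quad n \ge 1. \tag{4.40} \]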

Note that these equations fix the normalization of the partition function. In particular
if we set all variables $s_n = 0$ for n > 0, the $L_{-1}$ constraint gives the genus
zero contribution (using $s_0 = t_0$)
\[ \frac{\partial F_0}{\partial t_0} = \frac{1}{2}\, t_0^2 \tag{4.41} \]
corresponding to three closed-string punctures on the sphere
\[ \langle \tau_0^3 \rangle_0 = 1. \tag{4.42} \]
Note that in that case, with only $t_0$ non-zero, the spectral curve becomes
\[ \frac{1}{2} y^2 = x - t_0 . \tag{4.43} \]
Returning to a theme from Subsec. 2.4, we are now also in a position to write
the spectral curve that corresponds to the model computing the volumes of the
moduli space of curves. As we have seen in Eq. (2.24), in that case the values of
the coupling constants are
\[ t_n = \frac{(-1)^n\, \xi^{n-1}}{(n-1)!} , \quad n \ge 2, \tag{4.44} \]
which corresponds to
\[ s_n = \frac{n\, (-1)^n\, 2^{2n}\, \xi^{n-1}}{(2n+1)!} , \quad n \ge 2. \tag{4.45} \]
Plugging this into (4.35) we find
\[ y = \frac{\sin(2\sqrt{\xi x})}{2\sqrt{2\xi}} \tag{4.46} \]
which is, up to normalization conventions, the known expression for the spectral
curve [42].
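The passage from Eq. (4.44) to Eq. (4.45), and the appearance of the sine function, can be checked coefficient by coefficient in exact arithmetic. The sketch below tracks only the rational coefficient of each power of ξ. It verifies that the classical part of the right-hand side of Eq. (4.35), $x^{1/2} - \sum_{n\ge2}(n+\tfrac12)s_n x^{n-1/2}$, has the same series as $\sin(2\sqrt{\xi x})/(2\sqrt{\xi})$; the overall constant relating this series to y is a normalization convention and is not tested. All function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def double_factorial(m):
    """m!! for a positive odd integer m."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def t_coeff(n):
    """Coefficient of xi^(n-1) in t_n of Eq. (4.44), valid for n >= 2."""
    return Fraction((-1) ** n, factorial(n - 1))


def s_coeff_from_t(n):
    """Invert Eq. (4.36): s_n = 2^n t_n / (2n+1)!!."""
    return Fraction(2 ** n) * t_coeff(n) / double_factorial(2 * n + 1)


def s_coeff_closed(n):
    """Coefficient of xi^(n-1) in s_n of Eq. (4.45)."""
    return Fraction(n * (-1) ** n * 2 ** (2 * n), factorial(2 * n + 1))


def sine_coeff(m):
    """Coefficient of xi^m x^(m+1/2) in sin(2 sqrt(xi x)) / (2 sqrt(xi))."""
    return Fraction((-1) ** m * 2 ** (2 * m), factorial(2 * m + 1))


relation_ok = all(s_coeff_from_t(n) == s_coeff_closed(n) for n in range(2, 15))
series_ok = sine_coeff(0) == 1 and all(
    -(n + Fraction(1, 2)) * s_coeff_closed(n) == sine_coeff(n - 1) for n in range(2, 15)
)
```

The leading coefficient `sine_coeff(0) == 1` reproduces the $x^{1/2}$ term of Eq. (4.35), and the terms with n ≥ 2 supply the rest of the sine series.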
Perhaps we should add another word about Eq. (4.35). Because the modes
of ∂ϕ(x) proportional to $x^{n-1/2}$, n ≥ 0, commute, we can just declare them to
be multiplication by commuting variables $s_n$. In a derivation that starts with a
matrix model based on a function $W(x) = \sum u_n x^n$, the s’s would be complicated
functions of the u’s; the precise functions would depend on exactly how one zooms
in on a critical point to get to the spectral curve $y^2 = 2x$. Once the coefficients of
$x^{n-1/2}$, n ≥ 0 are fixed as $s_n$, the coefficients of other terms in ∂ϕ(x) are uniquely
determined by the commutation relations and operator product expansion satisfied
by ∂ϕ(x).

4.3. Branes and open strings


Before we consider open strings within topological gravity, let us first discuss the
formulation of open strings in a general random matrix model.$^{28}$ Open strings
are naturally included by adding vector degrees of freedom. Let $\Psi$, $\bar\Psi$ be a pair of
conjugate U(N) vectors. We can choose these to be bosonic or fermionic variables.
The natural interaction with the matrix variable Φ takes the form
\[ \int d\Psi\, d\bar\Psi \cdot \exp\left\{ -z\, \bar\Psi^{T} \Psi + \bar\Psi^{T} \cdot \Phi \cdot \Psi \right\} \tag{4.47} \]

The effect of adding these additional variables is that now the ribbon graph is
naturally drawn on a two-manifold Σ with boundary (Fig. 11(b)). The propagator
of the vector variables has a factor 1/z, leading to a factor $1/z^L$, where L is the
length of the boundary of Σ.
The integral over $\Psi$ and $\bar\Psi$ just gives a determinant
\[ \det(z - \Phi)^{\pm 1} \tag{4.48} \]
(apart from an irrelevant constant factor that could be absorbed in normalizing the
measure). Here the sign in the exponent is −1 or +1 if $\Psi$, $\bar\Psi$ are bosons or fermions.
In terms of the Feynman diagram expansion, this sign means that for fermions, one
will get an extra −1 for every component of the boundary of Σ.
Instead of including the variables $\Psi$, $\bar\Psi$ in the model, it is equivalent to simply
consider a matrix model with an extra factor of $\det(z - \Phi)^{\pm 1}$ in the integrand.

28 Early references include [24–29].



However, it turns out that it is slightly more convenient to accompany this factor
with a prefactor $e^{\mp W(z)/2g_{st}}$, which is a “trivial” modification in the sense that it
does not depend on the matrix variables. Thus we consider the modified matrix
model based on the integral
\[ \frac{1}{\mathrm{vol}(U(N))} \int d\Phi \cdot \exp\left( -\frac{1}{g_{st}}\, \mathrm{Tr}\, W(\Phi) \right) \det(z - \Phi)^{\pm 1}\, e^{\mp W(z)/2g_{st}} \tag{4.49} \]
Loosely
\[ V(z) = \det(z - \Phi)\, e^{-W(z)/2g_{st}} , \qquad V^{*}(z) = \det(z - \Phi)^{-1}\, e^{W(z)/2g_{st}} \tag{4.50} \]
are “operators” that create a brane or antibrane with the “modulus” z. We will see
that z has the interpretation of a value of x, which parametrizes the base of the
hyperelliptic spectral curve$^{29}$ $y^2 = P(x)$. More generally, one could add several sets
of vector degrees of freedom $\Psi_a$, $\bar\Psi_a$, a = 1, . . . , r, each with its own modulus $z_a$.
For definiteness, we will consider the case of insertion of just one factor of V:
\[ Z_V(z) = \frac{1}{\mathrm{vol}(U(N))} \int d\Phi \cdot \exp\left( -\frac{1}{g_{st}}\, \mathrm{Tr}\, W(\Phi) \right) \det(z - \Phi)\, e^{-W(z)/2g_{st}} . \tag{4.51} \]
It is not difficult to derive the modification of the Virasoro equations that reflects
the presence of a brane. Repeating the derivation of Eq. (4.9), we get
\[ \left\langle \left( \sum_K \frac{1}{x - \lambda_K} \right)^2 - \frac{1}{g_{st}} \sum_K \frac{W'(\lambda_K)}{x - \lambda_K} - \sum_K \frac{1}{(x - \lambda_K)(z - \lambda_K)} \right\rangle_{V(z)} = 0. \tag{4.52} \]

Here $\langle A \rangle_{V(z)}$ is defined, by analogy with Eq. (4.10), as the expectation of A in the
matrix integral $Z_V(z)$. However, it turns out that it is slightly more convenient to
make the insertion of V explicit and to write the equivalent identity
\[ \left\langle \left[ \left( \sum_K \frac{1}{x - \lambda_K} \right)^2 - \frac{1}{g_{st}} \sum_K \frac{W'(\lambda_K)}{x - \lambda_K} - \sum_K \frac{1}{(x - \lambda_K)(z - \lambda_K)} \right] \cdot V(z) \right\rangle = 0, \tag{4.53} \]
where $\langle A \rangle$ is defined precisely as in Eq. (4.10), with the original matrix integral Z.
We can write the integral (4.51) as
\[ Z_V = \frac{1}{\mathrm{vol}(U(N))} \int d\Phi \cdot \exp\left( -\frac{1}{g_{st}}\, \mathrm{Tr} \left( W(\Phi) - g_{st} \log(z - \Phi) \right) \right) e^{-W(z)/2g_{st}} , \tag{4.54} \]
suggesting that in the definition of J(x), we should just shift $W(\Phi) \to W(\Phi) - g_{st} \log(z - \Phi)$
and hence $W'(x) \to W'(x) + g_{st}/(z - x)$. So we define
\[ J(x) = \frac{1}{2} W'(x) + \frac{g_{st}}{2(z - x)} - g_{st} \sum_K \frac{1}{x - \lambda_K} \tag{4.55} \]

29 Thus, z parametrizes a pair of points on the spectral curve that are exchanged by the hyperel-
liptic involution y → −y. This is somewhat analogous to the fact that in Subsec. 3.8, the brane
had locally two components, which globally are exchanged by a sort of monodromy.

and again
\[ T(x) = \frac{J(x)^2}{g_{st}^2} . \tag{4.56} \]

Because we multiplied the partition function with the factor $e^{-W(z)/2g_{st}}$, which
introduces an extra explicit dependence on the coefficients $u_n$, the formula for J(x)
as a differential operator is still given by Eq. (4.16). We get the identity
\[ \langle T(x)\, V(z) \rangle = \left( P(x) + \frac{1}{4}\, \frac{1}{(x - z)^2} + \frac{1}{x - z}\, \frac{\partial}{\partial z} \right) \langle V(z) \rangle \tag{4.57} \]
where now
\[ P(x) = \frac{1}{4} W'(x)^2 + f(x) - \frac{1}{2}\, g_{st}\, \frac{W'(z) - W'(x)}{z - x} \tag{4.58} \]
and the definition of f(x) becomes
\[ f(x) = -g_{st}\, \frac{1}{\langle V(z) \rangle} \left\langle \sum_K \frac{W'(x) - W'(\lambda_K)}{x - \lambda_K} \cdot V(z) \right\rangle . \tag{4.59} \]

P(x) has the same essential properties as before: it is a polynomial of degree 2d
if W(x) is a polynomial of degree d + 1, and if W(x) has a general expansion
$\sum_{n \ge 0} u_n x^n$, then P(x) is regular at x = 0. Moreover, P(x) is regular at x = z.
Eq. (4.57) has a nice interpretation. We can interpret V(z) as an insertion on
the spectral curve (which generically is locally parametrized by x) of a primary
field of conformal dimension h = 1/4. On the right-hand side of Eq. (4.57), we see
the expected singular contributions to the T(x)V(z) operator product expansion,
\[ T(x) \cdot V(z) \sim \frac{h}{(x - z)^2}\, V(z) + \frac{1}{x - z}\, \partial_z V(z) + \ldots \tag{4.60} \]
as well as regular terms that are contained in P(x). Indeed, comparing Eq. (4.19)
to the definition (4.50), we see that we can identify V and $V^{*}$ in terms of the
conformal field ϕ as
\[ V(z) = e^{-\varphi(z)/\sqrt{2}} , \qquad V^{*}(z) = e^{\varphi(z)/\sqrt{2}} . \tag{4.61} \]

These are indeed standard expressions for conformal primaries of dimension 1/4.
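The value h = 1/4 follows from the standard free-boson computation, sketched here in the normalization used in the text, $T = \tfrac12 (\partial\varphi)^2$ with $\partial\varphi(x)\,\partial\varphi(y) \sim 1/(x-y)^2$. The charge α is notation introduced only for this sketch.

```latex
\begin{aligned}
\varphi(x)\,\varphi(z) &\sim \log(x - z), \qquad
\partial\varphi(x)\, e^{\alpha\varphi(z)} \sim \frac{\alpha}{x - z}\, e^{\alpha\varphi(z)} , \\
T(x)\, e^{\alpha\varphi(z)} &\sim \frac{\alpha^2/2}{(x - z)^2}\, e^{\alpha\varphi(z)}
  + \frac{1}{x - z}\, \partial_z e^{\alpha\varphi(z)} , \\
h &= \frac{\alpha^2}{2} = \frac14
  \quad \text{for } \alpha = \mp \frac{1}{\sqrt{2}} .
\end{aligned}
```

The double contraction of $\tfrac12(\partial\varphi)^2$ with the exponential gives the $\alpha^2/2$ term, and the single contraction gives the derivative term.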
In the large N limit the scalar ϕ and therefore also the vertex operator V can be
expressed in terms of the spectral curve data
\[ \varphi(z) \sim \frac{\sqrt{2}}{g_{st}} \left( \int^{z} y(x)\, dx + O(g_{st}) \right) . \tag{4.62} \]
Since there are two roots in the hyperelliptic spectral curve, there are two saddle-points
that dominate the expectation value of V(z)
\[ \langle V(z) \rangle \sim \left\{ A\, e^{-\frac{1}{g_{st}} \int^{z} y(x)\, dx} + B\, e^{\frac{1}{g_{st}} \int^{z} y(x)\, dx} \right\} \left( 1 + O(g_{st}) \right) \tag{4.63} \]
for some coefficients A, B given by the one-loop correction. These two contributions,
that only appear in string perturbation theory, can be considered as the manifestation
of the two branes $B'$ and $B''$ as discussed in the A-model in Subsec. 4.3. Note
that they are interchanged by flipping the sign of $g_{st}$.
Just as before, the terms in Eq. (4.57) involving negative powers of x give
Virasoro constraints
\[ L_n Z_V = 0, \quad n \ge -1, \tag{4.64} \]
while the terms involving nonnegative powers determine P(x) or are trivial identities.
However, there are additional terms in the Virasoro generators. We write the
Virasoro generators as $L_n = L_n^c + L_n^o$, where superscripts c and o represent “closed-string”
and “open-string” contributions. $L_n^c$ comes from T(x) on the left-hand side
of Eq. (4.57) and is given by the same formula (4.24) as before. To find $L_n^o$, we move
the singular terms in Eq. (4.57) to the left-hand side of the equation and expand
in powers of 1/x:
\[ -\frac{1}{x - z}\, \frac{\partial}{\partial z} - \frac{1}{4 (x - z)^2} = -\sum_{k=0}^{\infty} \frac{1}{x^{k+1}} \left( z^k\, \frac{\partial}{\partial z} + \frac{1}{4}\, k\, z^{k-1} \right) . \tag{4.65} \]
Thus
\[ L_k^o = -z^{k+1}\, \frac{\partial}{\partial z} - \frac{1}{4} (k+1)\, z^k . \tag{4.66} \]
Consequently
\[ L_n = L_n^c + L_n^o = \sum_k k\, u_k\, \frac{\partial}{\partial u_{k+n}} + g_{st}^2 \sum_{i+j=n} \frac{\partial^2}{\partial u_i\, \partial u_j} - z^{n+1}\, \frac{\partial}{\partial z} - \frac{1}{4} (n+1)\, z^n . \tag{4.67} \]
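The expansion in Eq. (4.65) rests on two geometric-series identities, valid for |z| < |x|. The sketch below checks them numerically at one sample point; the function names, the truncation order and the sample values are illustrative choices. Reading off the coefficient of $x^{-(n+2)}$, that is k = n + 1 in the sum, gives the index shift between Eq. (4.65) and Eq. (4.66).

```python
def open_string_modes(z, n_max):
    """Coefficients (a_k, b_k), k = 0..n_max-1, such that
    1/(x - z)   = sum_k a_k / x^(k+1)  with a_k = z^k,
    1/(x - z)^2 = sum_k b_k / x^(k+1)  with b_k = k z^(k-1)."""
    return [(z ** k, k * z ** (k - 1) if k > 0 else 0.0) for k in range(n_max)]


def expansion_errors(x, z, n_max=200):
    """Absolute errors of the truncated series against the closed forms."""
    modes = open_string_modes(z, n_max)
    simple_pole = sum(a / x ** (k + 1) for k, (a, _) in enumerate(modes))
    double_pole = sum(b / x ** (k + 1) for k, (_, b) in enumerate(modes))
    return abs(simple_pole - 1 / (x - z)), abs(double_pole - 1 / (x - z) ** 2)


errors = expansion_errors(3.0, 0.5)
```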

On top of these Virasoro constraints, there is another useful relation that should
be added. Recall that with the introduction of the brane modulus z, the partition
function depends on one more variable, and we expect to find an accompanying
relation to determine the matrix model. This extra relation can be considered as
the analogue of the BPZ equation for degenerate fields. It is obtained as the limit
of the expression T(x)V(z) when we take x to z. The equation can be derived by
observing that [59]
\[ \frac{\partial^2}{\partial z^2}\, Z_V = \left\langle \left( \sum_K \frac{1}{z - \lambda_K} \right)^2 - \sum_K \frac{1}{(z - \lambda_K)^2} - \frac{1}{2 g_{st}}\, W''(z) \right\rangle_{V(z)} . \tag{4.68} \]

On the right-hand side we recognize part of the loop equation (4.53) in the case
x = z. Combining the two equations we obtain a second-order differential equation
in z
\[ \left( \frac{\partial^2}{\partial z^2} - Q(z) \right) Z_V = 0 \tag{4.69} \]
where now
\[ Q(z) = \lim_{x \to z} P(x) = \frac{1}{4} W'(z)^2 - \frac{1}{2}\, g_{st}\, W''(z) + g(z), \tag{4.70} \]
with
\[ g(z) = \lim_{x \to z} f(x) = -g_{st}\, \frac{1}{\langle V(z) \rangle} \left\langle \sum_K \frac{W'(z) - W'(\lambda_K)}{z - \lambda_K} \cdot V(z) \right\rangle . \tag{4.71} \]

Note that g(z) can in general be a complicated function of z, not necessarily polynomial.
Together, the Virasoro constraints combined with Eq. (4.69) determine the
behavior of the open string partition function as a function of the couplings $t_n$ and
z.
Let us now consider these equations in the double-scaling limit, where the spectral
curve takes the form
\[ \frac{1}{2} y^2 = x - t_0 . \tag{4.72} \]
In the absence of any further deformations (that is, without any other closed
string insertions than the bulk puncture $t_0$) the open string partition function
$Z_V(z)$ is very simple to compute. We obtain this case by taking the limit of the
Gaussian model $W(x) = a x^2$, for which we find
\[ Q(x) = a^2 x^2 - c, \qquad c = g_{st} (2N + 1), \tag{4.73} \]
and again zoom in on one of the branch points. In this limit the function Q(z)
becomes simply $Q = 2(z - t_0)$, and consequently Eq. (4.69) becomes the Airy
equation
\[ \left( \frac{1}{2}\, g_{st}^2\, \frac{\partial^2}{\partial z^2} - z + t_0 \right) Z_V = 0. \tag{4.74} \]
The solution is the Airy function
\[ Z_V(z) = \int dv\; e^{\frac{1}{g_{st}} \left( -vz + v^3/6 + v t_0 \right)} . \tag{4.75} \]
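One way to see that Eq. (4.75) solves Eq. (4.74) is that the Airy operator acting on the integrand produces a total derivative in v. The sketch below checks this pointwise with finite differences, for real sample values chosen only for illustration. It assumes, as is usual for Airy integrals, that the v-contour is chosen so that the integrand vanishes at its ends; the text treats the contour formally and does not specify it.

```python
import math


def integrand(v, z, t0, g):
    """Integrand of Eq. (4.75): exp((-v z + v^3/6 + v t0) / g_st)."""
    return math.exp((-v * z + v ** 3 / 6 + v * t0) / g)


def airy_operator_on_integrand(v, z, t0, g, h=1e-4):
    """(g^2/2) d^2/dz^2 - z + t0 applied to the integrand, via a central second difference in z."""
    f = lambda zz: integrand(v, zz, t0, g)
    second = (f(z + h) - 2 * f(z) + f(z - h)) / (h * h)
    return 0.5 * g * g * second + (t0 - z) * f(z)


def total_derivative(v, z, t0, g, h=1e-5):
    """g_st times d/dv of the integrand, via a central first difference in v."""
    f = lambda vv: integrand(vv, z, t0, g)
    return g * (f(v + h) - f(v - h)) / (2 * h)


lhs = airy_operator_on_integrand(0.7, 0.4, 0.2, 0.9)
rhs = total_derivative(0.7, 0.4, 0.2, 0.9)
```

Analytically, both sides equal $(v^2/2 - z + t_0)$ times the integrand, so integrating over v gives zero once the boundary terms are dropped.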

In this case one can also directly take the double-scaling limit of the exact expression
for $Z_V(z)$ in the Gaussian model, where it is given by the Nth eigenfunction of the
harmonic oscillator; see, e.g., the discussion in [60].
We now claim that the brane partition function, as computed in the double-scaled
matrix model, is related to the topological gravity partition function by a
Laplace transform
\[ Z_{top}(v) = \int dz\; e^{\frac{1}{g_{st}} v z}\, Z_V(z). \tag{4.76} \]

Something similar has been encountered in the B-model. It has been claimed in
[61, 62], for example, that there is an important subtlety if one introduces branes on
a spectral curve. One can insert branes at a fixed value x = z or at a fixed value
of y = v. These two brane insertions are exchanged by a Laplace (or Fourier$^{30}$)
transform.

30 Note that all these functional transforms are here considered as operations on formal power
series.

We have to compare this answer with the calculation in topological gravity
where one computes insertions of the bulk puncture operator $\tau_0$ and the boundary
puncture operator σ
\[ Z_{top}(v) = \exp \sum_{\chi = -n} g_{st}^{\,n} \left\langle e^{t_0 \tau_0 + v \sigma} \right\rangle \tag{4.77} \]
as a sum over surfaces with Euler number −n. In the absence of other operators,
as discussed in Subsec. 3.10, only two nonvanishing contributions are expected: the
disc with three insertions of σ, or with one insertion of σ and one of $\tau_0$:
\[ \langle \sigma^3 \rangle_D = 1, \qquad \langle \tau_0 \sigma \rangle_D = 1. \tag{4.78} \]
So the correct answer should be
\[ Z_{top}(v) = e^{\frac{1}{g_{st}} \left( v^3/6 + v t_0 \right)} , \tag{4.79} \]
which is consistent with the matrix model calculation (4.75).
One can now include arbitrary closed string perturbations and use this identification
for the full partition functions. This becomes clear by considering the
combined Virasoro constraints. If one takes into account the above Laplace transformation,
these now take the form
\[ L_n^c\, Z_{top}(v) = \int dz\; e^{\frac{1}{g_{st}} v z} \left( \frac{1}{4} (n+1)\, z^n + z^{n+1}\, \frac{\partial}{\partial z} \right) Z_V(z) \tag{4.80} \]
\[ \phantom{L_n^c\, Z_{top}(v)} = g_{st}^{\,n} \left[ -\frac{3}{4} (n+1) \left( \frac{\partial}{\partial v} \right)^{n} - v \left( \frac{\partial}{\partial v} \right)^{n+1} \right] Z_{top}(v). \tag{4.81} \]

This is indeed the expression given in [3]. This completes the identification of
the double-scaled matrix model with the open-closed topological string partition
function.
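The step from Eq. (4.80) to Eq. (4.81) is an integration by parts under the Laplace transform (4.76). The sketch below assumes that boundary terms can be dropped, which is consistent with treating the transforms as operations on formal power series.

```latex
\begin{aligned}
\int dz\; e^{\frac{1}{g_{st}} v z}\, z^{n}\, Z_V(z)
  &= \Big( g_{st} \frac{\partial}{\partial v} \Big)^{n} Z_{top}(v) , \\
\int dz\; e^{\frac{1}{g_{st}} v z}\, z^{n+1}\, \frac{\partial Z_V}{\partial z}
  &= - \int dz\; \frac{\partial}{\partial z} \Big( e^{\frac{1}{g_{st}} v z}\, z^{n+1} \Big) Z_V(z) \\
  &= - \Big[ \frac{v}{g_{st}} \Big( g_{st} \frac{\partial}{\partial v} \Big)^{n+1}
       + (n+1) \Big( g_{st} \frac{\partial}{\partial v} \Big)^{n} \Big] Z_{top}(v) , \\
\frac14 (n+1)\, g_{st}^{\,n}\, \partial_v^{\,n} - (n+1)\, g_{st}^{\,n}\, \partial_v^{\,n} - g_{st}^{\,n}\, v\, \partial_v^{\,n+1}
  &= g_{st}^{\,n} \Big[ -\frac34 (n+1)\, \partial_v^{\,n} - v\, \partial_v^{\,n+1} \Big] .
\end{aligned}
```

The coefficient −3/4 in Eq. (4.81) is thus the sum of +1/4 from the conformal weight term and −1 from differentiating $z^{n+1}$.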

Acknowledgements
We thank D. Freed, R. Penner, and J. Solomon for comments on the manuscript.
Research of EW is supported in part by NSF Grant PHY-1606531.

References
[1] M. Mirzakhani, Simple geodesics and Weil–Petersson volumes of moduli spaces of
bordered Riemann surfaces, Invent. Math. 167, 179–222 (2007).
[2] M. Mirzakhani, Weil–Petersson volumes and intersection theory on the moduli space
of curves, J. Am. Math. Soc. 20, 1–23 (2007).
[3] R. Pandharipande, J. P. Solomon, and R. J. Tessler, Intersection theory on moduli
of disks, open KdV, and Virasoro, arXiv:1409.2191.
[4] R. Tessler, The combinatorial formula for open gravitational descendants,
arXiv:1507.04951.
[5] A. Buryak and R. J. Tessler, Matrix models and a proof of the open analog of Witten’s
conjecture, arXiv:1501.07888.
[6] J. P. Solomon and R. J. Tessler, to appear.


[7] D. Weingarten, Euclidean quantum gravity on a lattice, Nucl. Phys. B 210, 229–245
(1982).
[8] V. Kazakov, Bilocal regularization of models of random surfaces, Phys. Lett. B 150,
282–284 (1985).
[9] F. David, Randomly triangulated surfaces in two dimensions, Nucl. Phys. B 159,
303–306 (1985).
[10] J. Ambjorn, B. Durhuus, and J. Fröhlich, Diseases of triangulated random surface
models, and possible cures, Nucl. Phys. B 257, 433–449 (1985).
[11] V. Kazakov, I. Kostov, and A. Migdal, Critical properties of randomly triangulated
planar random surfaces, Phys. Lett. B 157, 295–300 (1985).
[12] E. Brezin and V. Kazakov, Exactly solvable field theories of closed strings, Phys.
Lett. B 236, 144–150 (1990).
[13] M. Douglas and S. Shenker, Strings in less than one dimension, Nucl. Phys. B 335,
635–654 (1990).
[14] D. J. Gross and A. Migdal, Nonperturbative two-dimensional quantum gravity, Phys.
Rev. Lett. 64, 127–130 (1990).
[15] P. Di Francesco, P. H. Ginsparg, and J. Zinn-Justin, 2-d gravity and random matrices,
Phys. Rept. 254, 1–133 (1995).
[16] E. Witten, On the structure of the topological phase of two-dimensional gravity, Nucl.
Phys. B 340, 281–332 (1990).
[17] E. Witten, Two-dimensional gravity and intersection theory on moduli space, Surveys
Diff. Geom. 1, 243–310 (1991).
[18] M. Douglas, Strings in less than one dimension and the generalized KdV equations,
Phys. Lett. B 238, 176–180 (1990).
[19] R. Dijkgraaf, H. L. Verlinde, and E. P. Verlinde, Loop equations and Virasoro
constraints in nonperturbative 2-d quantum gravity, Nucl. Phys. B 348, 435–456
(1991).
[20] M. Kontsevich, Intersection theory on the moduli space of curves and the matrix Airy
function, Commun. Math. Phys. 147, 1–23 (1992).
[21] A. Okounkov and R. Pandharipande, Gromov-Witten theory, Hurwitz numbers, and
matrix models, Proc. Symp. Pure Math. 80.1, 325–489 (2009).
[22] M. E. Kazarian and S. K. Lando, An algebro-geometric proof of Witten’s conjecture,
J. Am. Math. Soc. 20, 1079–1089 (2007).
[23] E. Witten, On quantum gauge theories in two dimensions, Commun. Math. Phys.
141, 153–209 (1991).
[24] I. K. Kostov, Exactly solvable field theory of D = 0 closed and open strings, Phys.
Lett. B 238, 181–186 (1990).
[25] J. A. Minahan, Matrix models with boundary terms and the generalized Painlevé II
equation, Phys. Lett. B 268, 29–34 (1991).
[26] S. Dalley, C. V. Johnson, T. R. Morris, and A. Watterstam, Mod. Phys. Lett. A 7,
2753–2762 (1992), hep-th/9206060.
[27] Z. Yang, Dynamical loops in d = 1 random matrix models, Phys. Lett. B 257, 40–44
(1991).
[28] Y. Itoh and Y. Tanii, Schwinger-Dyson equations of matrix models for open and
closed strings, Phys. Lett. B 289, 335–341 (1992), hep-th/9202080.
[29] C. V. Johnson, On integrable c < 1 open string theory, Nucl. Phys. B 414, 239–266
(1994), hep-th/9301112.
[30] E. Brézin and S. Hikami, Random matrix, singularities, and open/closed intersection
numbers, J. Phys. A: Math. Theor. 48, 475201 (2015).
[31] E. Brézin and S. Hikami, Random Matrix Theory With An External Source, Springer
Briefs in Mathematical Physics, Vol. 19, Springer Singapore.
[32] A. Alexandrov, Open intersection numbers and free fields, arXiv:1606.06712.
[33] A. Okounkov, Random trees and moduli of curves, arXiv:math/0309075.
[34] S. Wolpert, Chern forms and the Riemann tensor for the moduli space of curves,
Invent. Math. 85, 119–145 (1986).
[35] S. Wolpert, On the homology of the moduli space of stable curves, Ann. Math. 118,
491–523 (1983).
[36] P. Zograf, On The large genus asymptotics of Weil–Petersson volumes, arXiv:
0812.0544.
[37] R. Penner, Weil–Petersson volumes, J. Diff. Geom. 35, 599–608 (1992).
[38] W. Goldman, The symplectic nature of fundamental groups of surfaces, Adv. Math.
54, 200–225 (1984).
[39] M. F. Atiyah and R. Bott, The Yang–Mills equations over Riemann surfaces, Phil.
Trans. Roy. Soc. London A 308, 523–615 (1983).
[40] E. Witten, Two-dimensional gauge theory revisited, J. Geom. Phys. 9, 303–368
(1992).
[41] G. McShane, Simple geodesics and a series constant over Teichmüller space, Invent.
Math. 132, 607–632 (1998).
[42] B. Eynard, Recursion between Mumford volumes of moduli spaces, Ann. Henri
Poincaré 12, 1431–1447 (2011).
[43] R. Kaufmann, Yu. Manin, and D. Zagier, Higher Weil–Petersson volumes of moduli
spaces of stable n-pointed curves, arXiv:alg-geom/9604001.
[44] Yu. Manin and P. Zograf, Invertible cohomological field theories and Weil–Petersson
volumes, arXiv:math/9902051.
[45] A. Alexandrov, A. Buryak, and R. Tessler, Refined open intersection numbers and
the Kontsevich–Penner matrix model, arXiv:1702.02319.
[46] M. F. Atiyah and I. M. Singer, The index of elliptic operators: V, Ann. Math. 93,
139–149 (1971).
[47] M. F. Atiyah, Riemann Surfaces and Spin Structures, Ann. Scientifique de l’É.N.S.
4, 47–62 (1971).
[48] E. Witten, Fermion path integrals and topological phases, Rev. Mod. Phys. 88,
035001 (2016), arXiv:1508.04715.
[49] E. Witten, Algebraic geometry associated with matrix models of two dimensional
gravity, Topological Methods in Modern Mathematics: A Symposium in Honor of
John Milnor’s Sixtieth Birthday, L. R. Goldberg and A. V. Phillips, eds., Publish or
Perish, Inc., 1993.
[50] A. Kitaev, Unpaired Majorana fermions in quantum wires, Usp. Fiz. Nauk. (Suppl.)
171, 131–136 (2001), arXiv:cond-mat/0010440.
[51] A. Kapustin and N. Seiberg, Coupling a QFT to a TQFT and duality, arXiv:
1401.0740.
[52] E. Witten, Topological sigma models, Commun. Math. Phys. 118, 411–449 (1988).
[53] D. Gaiotto, G. Moore, and E. Witten, Algebra of the infrared: string field theoretic
structures in massive N = (2, 2) field theory in two dimensions, arXiv:1506.04087.
[54] D. Freed, Two index theorems in odd dimensions, Commun. Anal. Geom. 6, 317–329
(1998), dg-ga/9601005.
[55] P. Seidel, Fukaya A∞ Structures associated to Lefschetz fibrations, I, II, arXiv:
0912.3932, arXiv:1404.1352.
[56] C. Vafa, Brane/anti-Brane systems and U (N |M ) supergroup, hep-th/0101218.
[57] G. ’t Hooft, A planar diagram theory for strong interactions, Nucl. Phys. B 72,
461–473 (1974).
[58] B. Eynard and N. Orantin, Invariants of algebraic curves and topological expansion,
Commun. Number Theory 1, 347–452 (2007).
[59] M. Aganagic, M. C. N. Cheng, R. Dijkgraaf, D. Krefl and C. Vafa, Quantum geometry
of refined topological strings, J. High Energy Phys. 11, 019 (2012).
[60] J. M. Maldacena, G. W. Moore, N. Seiberg and D. Shih, Exact vs. Semiclassical
target space of the minimal string, J. High Energy Phys. 10, 020 (2004).
[61] M. Aganagic and C. Vafa, Mirror symmetry, D-branes and counting holomorphic
discs, arXiv:hep-th/0012041.
[62] M. Aganagic, R. Dijkgraaf, A. Klemm, M. Marino and C. Vafa, Topological strings
and integrable hierarchies, Commun. Math. Phys. 261, 451–516 (2006).
October 31, 2018 12:11 taken from 139-IJMPA ws-rv961x669 chap03-S0217751X18300235 page 81

Chapter 3

Majorana Fermions and Representations of the Braid Group∗

Louis H. Kauffman
Department of Mathematics, Statistics and Computer Science,
851 South Morgan Street, University of Illinois at Chicago,
Chicago, Illinois 60607-7045, USA
Department of Mechanics and Mathematics,
Novosibirsk State University, Novosibirsk, Russia

In this paper we study unitary braid group representations associated with Majorana
Fermions. Majorana Fermions are represented by Majorana operators, elements of a
Clifford algebra. The paper recalls and proves a general result about braid group repre-
sentations associated with Clifford algebras, and compares this result with the Ivanov
braiding associated with Majorana operators. The paper generalizes observations of
Kauffman and Lomonaco and of Mo-Lin Ge to show that certain strings of Majorana
operators give rise to extraspecial 2-groups and to braiding representations of the Ivanov
type.

Keywords: Knots; links; braids; braid group; Fermion; Majorana Fermion; Kitaev chain;
extraspecial 2-group; Majorana string; Yang–Baxter equation; quantum process; quan-
tum computing.

1. Introduction
In this paper we study a Clifford algebra generated by non-commuting elements
of square equal to one and the relationship of this algebra with braid group
representations and Majorana Fermions [10, 29, 33]. Majorana Fermions can be seen not
only in the structure of collectivities of electrons, as in the quantum Hall effect [39],
but also in the structure of single electrons, both by experiments with electrons in
nanowires [2, 32] and by the decomposition of the operator algebra for a Fermion [9]
into a Clifford algebra generated by two Majorana operators. Majorana Fermions
have been discussed by this author and his collaborators in Refs. 11, 13, 22–24,
26–30, 33, 38, and parts of the present paper depend strongly on this previous work.
In order to make this paper self-contained, we have deliberately taken explanations,
definitions and formulations from these previous papers, indicating the references
when it is appropriate. The purpose of this paper is to discuss these braiding
representations, which are important for relationships among physics, quantum
information and topology.

∗ This chapter also appeared in International Journal of Modern Physics A, Vol. 33, No. 23 (2018)
1830023. DOI: 10.1142/S0217751X18300235.

We study Clifford algebra and its relationship with braid group representations
related to Majorana Fermion operators. Majorana Fermion operators a and b are
defined, so that the creation and annihilation operators ψ † and ψ for a single stan-
dard Fermion can be expressed through them. The Fermion operators satisfy the
well-known algebraic rules:
\[ (\psi^\dagger)^2 = \psi^2 = 0, \]
\[ \psi \psi^\dagger + \psi^\dagger \psi = 1. \]
Remarkably, these equations are satisfied if we take
\[ \psi = (a + ib)/2, \qquad \psi^\dagger = (a - ib)/2, \]
where the Majorana operators a, b satisfy
\[ a^\dagger = a, \qquad b^\dagger = b, \]
\[ a^2 = b^2 = 1, \qquad ab + ba = 0. \]
In certain situations, it has been conjectured and partially verified by experiments
[32, 33] that electrons (in low temperature nanowires) may behave as though
each electron were physically a pair of Majorana particles described by these
Majorana operators. In this case the mathematics of the braid group
representations that we study may have physical reality.
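The algebra above can be checked on a concrete matrix realization. The following is a minimal sketch, assuming (for illustration only) that a and b are represented by the Pauli matrices sigma_x and sigma_y; the helper names `matmul`, `combine` and `dagger` are hypothetical and not part of the text.

```python
# Assumption: a and b are realized by the Pauli matrices sigma_x and sigma_y.
# Any pair of matrices with a^2 = b^2 = 1 and ab = -ba would serve equally well.
a = [[0, 1], [1, 0]]
b = [[0, -1j], [1j, 0]]
ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(x, y, cx=1, cy=1):
    """Entrywise linear combination cx*x + cy*y."""
    return [[cx * x[i][j] + cy * y[i][j] for j in range(2)] for i in range(2)]

def dagger(x):
    """Conjugate transpose."""
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

psi = combine(a, b, 0.5, 0.5j)       # psi         = (a + i b)/2
psi_dag = combine(a, b, 0.5, -0.5j)  # psi^dagger  = (a - i b)/2

def majorana_relations_hold():
    """a, b Hermitian, a^2 = b^2 = 1, ab + ba = 0."""
    return (dagger(a) == a and dagger(b) == b
            and matmul(a, a) == ONE and matmul(b, b) == ONE
            and combine(matmul(a, b), matmul(b, a)) == ZERO)

def fermion_relations_hold():
    """psi^2 = (psi^dagger)^2 = 0 and psi psi^dagger + psi^dagger psi = 1."""
    return (dagger(psi) == psi_dag
            and matmul(psi, psi) == ZERO
            and matmul(psi_dag, psi_dag) == ZERO
            and combine(matmul(psi, psi_dag), matmul(psi_dag, psi)) == ONE)
```

In this realization psi is the familiar lowering matrix with a single nonzero entry, so the Fermion relations can also be read off by inspection.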
Particles corresponding to the Clifford algebra generated by a and b described
in the last paragraph are called Majorana particles because they satisfy a† = a and
b† = b, indicating that they are their own antiparticles. Majorana [35] analyzed real
solutions to the Dirac equation [6, 31] and conjectured the existence of such particles
that would be their own antiparticle. It has been conjectured that the neutrino is
such a particle. Only more recently [10, 33] has it been suggested that electrons may
be composed of pairs of Majorana particles. It is common to speak of Majorana
particles when referring to particles that satisfy the interaction rules for the original
Majorana particles. These interaction rules are, for a given particle P, that P
can interact with another identical P to produce a single P or to produce an
annihilation. For this, we write $PP = P + 1$ where the right-hand side is to be
read as a superposition of the possibilities P and 1, where 1 stands for the state of
annihilation, the absence of the particle P. We refer to this equation as the fusion
rules for a Majorana Fermion. In modeling the quantum Hall effect [3, 4, 7, 39], the
braiding of quasiparticles (collective excitations) leads to nontrivial representations
of the Artin braid group. Such particles are called anyons. The braiding in these
models is related to topological quantum field theory.
Thus there are two algebraic descriptions for Majorana Fermions — the fusion
rules, and the associated Clifford algebra. One may use both the Clifford algebra
and the fusion rules in a single physical situation. However, for studying braiding, it
turns out that the Clifford algebra leads to braiding and so does the fusion algebra
in the so-called Fibonacci model (while the Fibonacci model is not directly related
to the Clifford algebra). Thus we can discuss these two forms of braiding. We show
mathematical commonality between them in the appendix to the present paper.
Both forms of braiding could be present in a single physical system. For example,
in the quantum Hall systems, the anyons (collective excitations of electrons) can
behave according to the Fibonacci model, and the edge effects of these anyons can
be modeled using Clifford algebraic braiding for Majorana Fermions.
Braiding operators associated with Majorana operators are described as follows.
Let {c1, c2, . . . , cn} denote a collection of Majorana operators such that $c_k^2 = 1$ for
k = 1, . . . , n and $c_i c_j + c_j c_i = 0$ when $i \neq j$. Take the indices {1, 2, . . . , n} as a set
of residues modulo n so that n + 1 = 1. Define operators
\[ \sigma_k = (1 + c_{k+1} c_k)/\sqrt{2} \]
for k = 1, . . . , n where it is understood that $c_{n+1} = c_1$ since n + 1 = 1 modulo n.
Then one can verify that
\[ \sigma_i \sigma_j = \sigma_j \sigma_i \]
when |i − j| ≥ 2 and that
\[ \sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1} \]
for all i = 1, . . . , n. Thus
\[ \{\sigma_1, \ldots, \sigma_{n-1}\} \]
describes a representation of the n-strand Artin braid group Bn . As we shall see
in Sec. 3, this representation has very interesting properties and it leads to unitary
representations of the braid group that can support partial topological computing.
What is missing to support full topological quantum computing in this representa-
tion is a sufficient structure of U (2) transformations. These must be supplied along
with the braiding operators. It remains to be seen if the braiding of Majorana oper-
ator constituents of electrons can be measured, and if the physical world will yield
this form of partial topological computing.
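These relations can be verified numerically for small n. The sketch below assumes a Jordan–Wigner realization of n = 4 Majorana operators on two qubits (an illustrative choice, not taken from the text) and works with the unnormalized operators 1 + c_{k+1} c_k, so that all arithmetic is exact; the overall factor 1/sqrt(2) cancels from both sides of each relation.

```python
def kron(x, y):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[p * q for p in rx for q in ry] for rx in x for ry in y]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def dagger(x):
    n = len(x)
    return [[complex(x[j][i]).conjugate() for j in range(n)] for i in range(n)]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

# Assumption: Jordan-Wigner realization of N = 4 Majorana operators on two qubits.
N = 4
c = {1: kron(X, I2), 2: kron(Y, I2), 3: kron(Z, X), 4: kron(Z, Y)}
ID = kron(I2, I2)

def s(k):
    """Unnormalized generator 1 + c_{k+1} c_k (indices mod N); sigma_k = s(k)/sqrt(2)."""
    return add(ID, matmul(c[k % N + 1], c[k]))

def braid_relation(i):
    """Check s_i s_{i+1} s_i = s_{i+1} s_i s_{i+1}, with i + 1 taken mod N."""
    si, sj = s(i), s(i % N + 1)
    return matmul(matmul(si, sj), si) == matmul(matmul(sj, si), sj)

def commute(i, j):
    return matmul(s(i), s(j)) == matmul(s(j), s(i))

def is_unitary_up_to_scale(k):
    """s(k) s(k)^dagger = 2 * identity, i.e. sigma_k = s(k)/sqrt(2) is unitary."""
    return matmul(s(k), dagger(s(k))) == [[2 * e for e in row] for row in ID]
```

For instance, `braid_relation(i)` can be evaluated for i = 1, ..., 4 (the last two cases use the circular generator), and `commute(1, 3)` tests a pair of generators that share no Majorana operator.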
Here is an outline of the contents of the paper. Section 2 reviews the definition
of the Artin braid group and emphasizes that the braid group on n strands, Bn ,
is a natural extension of the symmetric group on n letters. Along with its topo-
logical interpretations, this proximity to the symmetric group probably explains
the many appearances of the braid group in physical and mathematical problems.
Section 3 discusses how unitary braiding operators can be (in the presence of local
unitary transformations) universal gates for quantum computing and why certain
solutions of the Yang–Baxter (braiding) equation, when entangling, are such uni-
versal gates. Section 4 discusses how a Clifford algebra of Majorana Fermion opera-
tors can produce braiding representations. We recall the Clifford Braiding Theorem
of Ref. 30 and we show how extraspecial 2-groups give rise to representations of
the Artin braid group. In Sec. 5 we show how Majorana Fermions can be used
to construct representations of the Temperley–Lieb algebra and we analyze
corresponding braid group representations, showing that they are equivalent to our
already-constructed representations from the Clifford Braiding Theorem. In Sec. 6
we show how the Ivanov [10] representation of the braid group on a space of Majorana
Fermions generates a 4 × 4 universal quantum gate that is also a braiding operator.
This shows how Majorana Fermions can appear at the base of (partial) topologi-
cal quantum computing. We say partial here because a universal topological gate
of this type must be supported by local unitary transformations not necessarily
generated by the Majorana Fermions. In Sec. 7 we consider the Bell-Basis Change
Matrix BII and braid group representations that are related to it. The matrix itself
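is a solution to the Yang–Baxter equation and so it is a universal gate for partial
topological computing [22]. Mo-Lin Ge has observed that $B_{II} = (I + M)/\sqrt{2}$ where
$M^2 = -I$. In fact, we take
\[ M = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}
     = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
       \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}. \tag{1} \]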
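Let
\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \qquad
   B = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}. \tag{2} \]
Thus $M = AB$ and $A^2 = B^2 = I$ while $AB + BA = 0$. Thus, we can take A and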
B themselves as Majorana Fermion operators. This is our key observation in this
paper. The fact that the matrix M factors into a product of (Clifford algebraic)
Majorana Fermion operators means that the extraspecial 2-group associated with
M can be seen as the result of a string of Majorana operators where we define such
a string as follows: A list of pairs Ak , Bk of Majorana operators that satisfies the
identities below is said to be a string of Majorana operators:
\[ A_i^2 = B_i^2 = 1, \tag{3} \]
\[ A_i B_i = -B_i A_i, \tag{4} \]
\[ A_i B_{i+1} = -B_{i+1} A_i, \tag{5} \]
\[ A_{i+1} B_i = B_i A_{i+1}, \tag{6} \]
\[ A_i B_j = B_j A_i \quad \text{for } |i - j| > 1, \tag{7} \]
\[ A_i A_j = A_j A_i, \tag{8} \]
\[ B_i B_j = B_j B_i \quad \text{for all } i \text{ and } j. \tag{9} \]

Letting $M_i = A_i B_i$ we obtain an extraspecial 2-group and, by the Extraspecial
2-Group Braiding Theorem of Sec. 4, a representation of the Artin braid group with
generators $\sigma_i = (I + A_i B_i)/\sqrt{2}$. If we think of the Majorana Fermions Ai and Bi
as anyonic particles, then it will be of great interest to formulate a Hamiltonian for
them and to analyze them in analogy with the Kitaev spin chain. This project will
be carried out in a sequel to the present paper.
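The factorization of M can be confirmed by direct matrix arithmetic. The sketch below uses the 4 × 4 matrices A, B and M as stated in the introduction; the helper functions and the dictionary of named checks are illustrative choices, not part of the text.

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def scale(k, x):
    return [[k * e for e in row] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
A = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
B = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
M = [[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

def check_majorana_factorization():
    """Return the identities claimed for A, B and M = AB as named booleans."""
    zero = scale(0, I4)
    i_plus_m = add(I4, M)   # sqrt(2) * B_II
    return {
        "M = AB": matmul(A, B) == M,
        "A^2 = I": matmul(A, A) == I4,
        "B^2 = I": matmul(B, B) == I4,
        "AB + BA = 0": add(matmul(A, B), matmul(B, A)) == zero,
        "M^2 = -I": matmul(M, M) == scale(-1, I4),
        # (I + M)(I + M)^T = 2I: B_II = (I + M)/sqrt(2) is real orthogonal, hence unitary.
        "B_II unitary": matmul(i_plus_m, transpose(i_plus_m)) == scale(2, I4),
    }
```

All entries are integers, so the comparisons are exact rather than approximate.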
In this paper, we continue the description of the relationship between the Majorana
string for the $B_{II}$ matrix and the work of Mo-Lin Ge [8] relating it to an extraspecial
2-group and to the topological order in the Kitaev spin chain. We believe that
our formulation of the Majorana string sheds new light on this relationship.
The last section of the paper is an appendix on braid group representations of
the quaternions. As the reader can see, the different braid group representations
studied in this paper all stem from the same formal considerations that occur in
classifying quaternionic braid representations. The essential reason for this is that,
given three Majorana Fermions A, B, C that pairwise anticommute and satisfy
$A^2 = B^2 = C^2 = 1$, the Clifford elements BA, CB, AC generate a copy
of the quaternions. Thus, our Clifford braiding representations are generalizations
of particular quaternionic braiding. The appendix is designed to show that the
quaternions are a rich source of connection among these representations, including
the Fibonacci model [23, 26], which is briefly discussed herein.

2. Braids
A braid is an embedding of a collection of strands that have their ends in two rows
of points that are set one above the other with respect to a choice of vertical. The
strands are not individually knotted and they are disjoint from one another. See
Figs. 1 and 2 for illustrations of braids and moves on braids. Braids can be multiplied
by attaching the bottom row of one braid to the top row of the other braid. Taken
up to ambient isotopy, fixing the endpoints, the braids form a group under this
notion of multiplication. In Fig. 1 we illustrate the form of the basic generators of
the braid group, and the form of the relations among these generators. Figure 2
illustrates how to close a braid by attaching the top strands to the bottom strands
by a collection of parallel arcs. A key theorem of Alexander states that every knot
or link can be represented as a closed braid. Thus, the theory of braids is critical
to the theory of knots and links. Figure 2 illustrates the famous Borromean Rings
(a link of three unknotted loops such that any two of the loops are unlinked) as the
closure of a braid.
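[Figure 1: diagrams of the braid generators s1, s2, s3 and the inverse s1^{-1}, together
with the relations s1^{-1} s1 = 1, s1 s2 s1 = s2 s1 s2 and s1 s3 = s3 s1.]

Fig. 1. Braid generators.

[Figure 2: a braid b and its closure CL(b).]

Fig. 2. Borromean rings as a braid closure.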

Let Bn denote the Artin braid group on n strands. We recall here that Bn is
generated by elementary braids {s1 , . . . , sn−1 } with relations
(1) si sj = sj si for |i − j| > 1,
(2) si si+1 si = si+1 si si+1 for i = 1, . . . , n − 2.
See Fig. 1 for an illustration of the elementary braids and their relations. Note
that the braid group has a diagrammatic topological interpretation, where a braid
is an intertwining of strands that lead from one set of n points to another set of
n points. The braid generators si are represented by diagrams where the ith and
(i + 1)th strands wind around one another by a single half-twist (the sense of this
turn is shown in Fig. 1) and all other strands drop straight to the bottom. Braids
are diagrammed vertically as in Fig. 1, and the products are taken in order from
top to bottom. The product of two braid diagrams is accomplished by adjoining
the top strands of one braid to the bottom strands of the other braid.
In Fig. 1 we have restricted the illustration to the four-stranded braid group B4.
In that figure the three braid generators of B4 are shown, and then the inverse of
the first generator is drawn. Following this, one sees the identities $s_1 s_1^{-1} = 1$ (where
the identity element in B4 consists of four vertical strands), s1 s2 s1 = s2 s1 s2, and
finally s1 s3 = s3 s1.
Braids are a key structure in mathematics. It is not just that they are a col-
lection of groups with a vivid topological interpretation. From the algebraic point
of view the braid groups Bn are important extensions of the symmetric groups Sn .
Recall that the symmetric group Sn of all permutations of n distinct objects has
presentation as shown below.

(1) $s_i^2 = 1$ for i = 1, . . . , n − 1,
(2) si sj = sj si for |i − j| > 1,
(3) si si+1 si = si+1 si si+1 for i = 1, . . . , n − 2.

Thus Sn is obtained from Bn by setting the square of each braiding generator
equal to one. We have an exact sequence of groups

1 → P (n) → Bn → Sn → 1

exhibiting the Artin braid group as an extension of the symmetric group. The kernel
P (n) is the pure braid group, consisting in those braids where each strand returns
to its original position.
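The map Bn → Sn in this sequence simply forgets over- and under-crossings. A minimal sketch follows, assuming (as a convention chosen here, not taken from the text) that a braid word is a list of nonzero integers in which +i stands for s_i and −i for its inverse.

```python
def braid_to_permutation(word, n):
    """Image of a braid word under B_n -> S_n.

    word: list of nonzero ints; +i means s_i, -i means s_i^{-1}, with 1 <= |i| <= n - 1.
    Both s_i and its inverse map to the transposition of positions i and i + 1.
    Returns the resulting arrangement of the labels 1..n as a tuple.
    """
    arrangement = list(range(1, n + 1))
    for g in word:
        i = abs(g)
        if not 1 <= i <= n - 1:
            raise ValueError("generator index out of range: %r" % g)
        arrangement[i - 1], arrangement[i] = arrangement[i], arrangement[i - 1]
    return tuple(arrangement)

def is_pure(word, n):
    """True when the braid lies in the kernel P(n): every strand returns to its position."""
    return braid_to_permutation(word, n) == tuple(range(1, n + 1))
```

For example, the word `[1, -2] * 3` in B3, a standard braid word whose closure is the Borromean rings, is pure, while the single generator `[1]` is not. The word `[1, 1]` is also pure even though it is a nontrivial braid, which illustrates that the kernel P(n) is not trivial.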
In the next sections we shall show how representations of the Artin braid
group, rich enough to provide a dense set of transformations in the unitary groups
(see Ref. 23 and references therein), arise in relation to Fermions and Majorana
Fermions. Braid groups are in principle fundamental to quantum computation and
quantum information theory.

3. Braiding Operators and Universal Quantum Gates


A key concept in the construction of quantum link invariants is the association
of a Yang–Baxter operator R to each elementary crossing in a link diagram. The
operator R is a linear mapping

R: V ⊗ V → V ⊗ V

defined on the 2-fold tensor product of a vector space V, generalizing the permu-
tation of the factors (i.e. generalizing a swap gate when V represents one qubit).
Such transformations are not necessarily unitary in topological applications. It is
useful to understand when they can be replaced by unitary transformations for
the purpose of quantum computing. Such unitary R-matrices can be used to make
unitary representations of the Artin braid group.
More information about the material sketched in this section can be found in
Refs. 12, 22 and 23.
[Figure 3: braid diagrams for the two sides of the identity
(R ⊗ I)(I ⊗ R)(R ⊗ I) = (I ⊗ R)(R ⊗ I)(I ⊗ R).]

Fig. 3. The Yang–Baxter equation.

A solution to the Yang–Baxter equation, as described in the last paragraph, is a
matrix R, regarded as a mapping of the 2-fold tensor product V ⊗ V of a vector space V
to itself, that satisfies the equation
(R ⊗ I)(I ⊗ R)(R ⊗ I) = (I ⊗ R)(R ⊗ I)(I ⊗ R) .
From the point of view of topology, the matrix R is regarded as representing an ele-
mentary bit of braiding represented by one string crossing over another. In Fig. 3 we
have illustrated the braiding identity that corresponds to the Yang–Baxter equa-
tion. Each braiding picture with its three input lines (bottom) and output lines
(top) corresponds to a mapping of the three fold tensor product of the vector space
V to itself, as required by the algebraic equation quoted above. The pattern of
placement of the crossings in the diagram corresponds to the factors R ⊗ I and
I ⊗ R. This crucial topological move has an algebraic expression in terms of such a
matrix R. We need to study solutions of the Yang–Baxter equation that are unitary.
Then the R matrix can be seen either as a braiding matrix or as a quantum gate
in a quantum computer.
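For a given 4 × 4 matrix the Yang–Baxter equation is a finite check on 8 × 8 matrices. The sketch below is illustrative: it tests the swap gate, CNOT, and the unnormalized Bell-basis change matrix sqrt(2) · B_II = I + M from the introduction. Since the equation is homogeneous of degree three in R, the scalar factor does not affect the outcome.

```python
def kron(x, y):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[p * q for p in rx for q in ry] for rx in x for ry in y]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I2 = [[1, 0], [0, 1]]

SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
# Unnormalized Bell-basis change matrix: sqrt(2) * B_II = I + M.
BELL = [[1, 0, 0, 1], [0, 1, -1, 0], [0, 1, 1, 0], [-1, 0, 0, 1]]

def satisfies_yang_baxter(r):
    """Check (R x I)(I x R)(R x I) == (I x R)(R x I)(I x R) for a 4x4 matrix R."""
    ri, ir = kron(r, I2), kron(I2, r)
    lhs = matmul(matmul(ri, ir), ri)
    rhs = matmul(matmul(ir, ri), ir)
    return lhs == rhs
```

With these definitions, `satisfies_yang_baxter(SWAP)` and `satisfies_yang_baxter(BELL)` hold, while CNOT, although a universal gate, is not a solution of the braiding form of the equation.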

3.1. Universal gates


A two-qubit gate G is a unitary linear mapping G : V ⊗ V → V ⊗ V where V is a two
complex dimensional vector space. We say that the gate G is universal for quantum
computation (or just universal) if G together with local unitary transformations
(unitary transformations from V to V) generates all unitary transformations of
the complex vector space of dimension $2^n$ to itself. It is well known [36] that CNOT
is a universal gate. (On the standard basis, CNOT is the identity when the first
qubit is |0⟩, and it flips the second qubit, leaving the first alone, when the first
qubit is |1⟩.)
A gate G, as above, is said to be entangling if there is a vector
\[ |\alpha\beta\rangle = |\alpha\rangle \otimes |\beta\rangle \in V \otimes V \]
such that G|αβ⟩ is not decomposable as a tensor product of two qubits. Under these
circumstances, one says that G|αβ⟩ is entangled.
In Ref. 5, the Brylinskis give a general criterion for G to be universal. They prove
that a two-qubit gate G is universal if and only if it is entangling.

Remark. A two-qubit pure state
\[ |\phi\rangle = a|00\rangle + b|01\rangle + c|10\rangle + d|11\rangle \]
is entangled exactly when $ad - bc \neq 0$. It is easy to use this fact to check when a
specific matrix is, or is not, entangling.
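The criterion in this remark translates directly into a short test. The sketch below is illustrative: states are lists of four amplitudes on |00⟩, |01⟩, |10⟩, |11⟩, gates are 4 × 4 nested lists, and the function names are hypothetical. Because the condition ad − bc ≠ 0 is unchanged by rescaling, unnormalized states and the unnormalized matrix sqrt(2) · B_II are used to keep the arithmetic exact.

```python
def tensor(alpha, beta):
    """Product state |alpha> (x) |beta> as amplitudes [a, b, c, d]."""
    return [alpha[0] * beta[0], alpha[0] * beta[1],
            alpha[1] * beta[0], alpha[1] * beta[1]]

def apply(gate, state):
    """Apply a 4x4 gate (nested lists) to a two-qubit state."""
    return [sum(gate[i][j] * state[j] for j in range(4)) for i in range(4)]

def is_entangled(state, tol=1e-12):
    """A two-qubit pure state is entangled exactly when ad - bc != 0."""
    a, b, c, d = state
    return abs(a * d - b * c) > tol

CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
# Unnormalized Bell-basis change matrix sqrt(2) * B_II from the introduction.
BELL = [[1, 0, 0, 1], [0, 1, -1, 0], [0, 1, 1, 0], [-1, 0, 0, 1]]
```

A gate is shown to be entangling by exhibiting one product state whose image is entangled, for example `apply(CNOT, tensor([1, 1], [1, 0]))` or `apply(BELL, tensor([1, 0], [1, 0]))`. The swap gate sends every product state to a product state, so no such witness exists for it.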

Remark. There are many gates other than CNOT that can be used as universal
gates in the presence of local unitary transformations (see Ref. 22). Some of these are
themselves topological (unitary solutions to the Yang–Baxter equation, see Refs. 1
and 5) and themselves generate representations of the Artin braid group. Replacing
CNOT by a solution to the Yang–Baxter equation does not place the local unitary
transformations as part of the corresponding representation of the braid group.
Thus, such substitutions give only a partial solution to creating topological quantum
computation. In a full solution (e.g. as in Ref. 23) all unitary operations can be built
directly from the braid group representations, and one hopes that the topology in
the physics behind these representations will give the system protection against
decoherence. It remains to be seen if partial topological computing systems can
have this sort of protection.

4. Fermions, Majorana Fermions and Braiding


Fermion Algebra. Recall Fermion algebra [9]. One has Fermion annihilation
operators ψ and their conjugate creation operators ψ†. One has $\psi^2 = 0 = (\psi^\dagger)^2$. There
is a fundamental anticommutation relation
\[ \psi \psi^\dagger + \psi^\dagger \psi = 1. \]
If you have more than one standard Fermion operator, say ψ and φ, then they
anticommute:
\[ \psi \phi = -\phi \psi. \]
The Majorana Fermions [10, 35] c satisfy c† = c so that they are their own antiparticles.
They have a different algebra structure than standard Fermions, as we shall see
below. The reader may be curious just why the Majorana Fermions are assigned a
Clifford algebra structure, as we are about to do. We give a mathematical motivation
below, by showing that, with this Clifford algebra structure, a standard Fermion is
generated by two Majorana Fermions. One way of putting this is that we can use
$c_1 = \psi + \psi^\dagger$, $c_2 = (\psi - \psi^\dagger)/i$ to make two operators satisfying $c_i^\dagger = c_i$ for
i = 1, 2. We shall see that c1 and c2 satisfy Clifford algebra identities.
Majorana Fermion operators can model quasiparticles, and they are related to
braiding and to topological quantum computing. A group of researchers [2, 32] have
found quasiparticle Majorana Fermions in edge effects in nanowires. (A line of
Fermions could have a Majorana Fermion happen nonlocally from one end of the line
to the other.) The Fibonacci model that we discuss [23, 33] is also based on Majorana
particles, possibly related to collective electronic excitations. If P is a Majorana
Fermion particle, then P can interact with itself to either produce itself or to
annihilate itself. This is the simple “fusion algebra” for this particle. One can write
$P^2 = P + *$ to denote the two possible self-interactions of the particle P, as we have
discussed in the introduction. The patterns of interaction and braiding of such a
particle P give rise to the Fibonacci model [23].

Majoranas make Fermions. Majorana operators [35] are related to standard
Fermions as follows: We first take two Majorana operators c1 and c2. The algebra
for Majoranas is $c_1 = c_1^\dagger$, $c_2 = c_2^\dagger$ and $c_1 c_2 = -c_2 c_1$ if c1 and c2 are distinct
Majorana Fermions, with $c_1^2 = 1$ and $c_2^2 = 1$. Thus, the operator algebra for a
collection of Majorana particles is a Clifford algebra. One can make a standard Fermion
operator from two Majorana operators via

\[ \psi = (c_1 + i c_2)/2, \]
\[ \psi^\dagger = (c_1 - i c_2)/2. \]

Note, for example, that

\[ \psi^2 = (c_1 + i c_2)(c_1 + i c_2)/4 = \left( c_1^2 - c_2^2 + i(c_1 c_2 + c_2 c_1) \right)/4 = (0 + i0)/4 = 0. \]

Similarly, one can mathematically make two Majorana operators from any single
Fermion operator via

\[ c_1 = \psi + \psi^\dagger, \]
\[ c_2 = (\psi - \psi^\dagger)/i. \]

This simple relationship between the Fermion creation and annihilation algebra and
an underlying Clifford algebra has long been a subject of speculation in physics.
Only recently have experiments shown (indirect) evidence [32] for Majorana Fermions
underlying the electron.
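As a worked check, using only the Clifford relations $c_1^2 = c_2^2 = 1$, $c_1 c_2 = -c_2 c_1$ and the Fermion relations stated earlier, the two constructions are inverse to each other when the Majorana operators are normalized as $c_1 = \psi + \psi^\dagger$, $c_2 = (\psi - \psi^\dagger)/i$, which is the normalization that inverts $\psi = (c_1 + i c_2)/2$:

```latex
\begin{aligned}
\psi\psi^\dagger + \psi^\dagger\psi
  &= \tfrac{1}{4}\big[(c_1 + i c_2)(c_1 - i c_2) + (c_1 - i c_2)(c_1 + i c_2)\big]
   = \tfrac{1}{4}\big[2 c_1^2 + 2 c_2^2\big] = 1, \\
c_1^2 &= (\psi + \psi^\dagger)^2
       = \psi^2 + (\psi^\dagger)^2 + \psi\psi^\dagger + \psi^\dagger\psi = 1, \\
c_2^2 &= -(\psi - \psi^\dagger)^2
       = -\psi^2 - (\psi^\dagger)^2 + \psi\psi^\dagger + \psi^\dagger\psi = 1, \\
c_1 c_2 + c_2 c_1
  &= \tfrac{1}{i}\big[(\psi + \psi^\dagger)(\psi - \psi^\dagger) + (\psi - \psi^\dagger)(\psi + \psi^\dagger)\big]
   = \tfrac{2}{i}\big[\psi^2 - (\psi^\dagger)^2\big] = 0 .
\end{aligned}
```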

Braiding. Let there be given a set of Majorana operators {c1, c2, c3, . . . , cn} so
that $c_i^2 = 1$ for all i and $c_i c_j = -c_j c_i$ for $i \neq j$. Then there are natural braiding
operators [10, 33] that act on the vector space with these ck as the basis. The operators
are mediated by algebra elements

\[ \tau_k = (1 + c_{k+1} c_k)/\sqrt{2}, \]
\[ \tau_k^{-1} = (1 - c_{k+1} c_k)/\sqrt{2}. \]

Then the braiding operators are

\[ T_k : \mathrm{Span}\{c_1, c_2, \ldots, c_n\} \to \mathrm{Span}\{c_1, c_2, \ldots, c_n\} \]

via

\[ T_k(x) = \tau_k x \tau_k^{-1}. \]

The braiding is simply:

\[ T_k(c_k) = c_{k+1}, \]
\[ T_k(c_{k+1}) = -c_k, \]
and Tk is the identity otherwise. This gives a very nice unitary representation of
the Artin braid group and it deserves better understanding.
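The action of $T_k$ follows in a few lines from $c_i^2 = 1$ and $c_i c_j = -c_j c_i$ for $i \neq j$, with $\tau_k = (1 + c_{k+1} c_k)/\sqrt{2}$ and $\tau_k^{-1} = (1 - c_{k+1} c_k)/\sqrt{2}$. A worked sketch of the computation:

```latex
\begin{aligned}
T_k(c_k) &= \tfrac{1}{2}(1 + c_{k+1} c_k)\, c_k\, (1 - c_{k+1} c_k)
          = \tfrac{1}{2}(c_k + c_{k+1})(1 - c_{k+1} c_k) \\
         &= \tfrac{1}{2}\big( c_k + c_{k+1} - c_k c_{k+1} c_k - c_{k+1} c_{k+1} c_k \big)
          = \tfrac{1}{2}\big( c_k + c_{k+1} + c_{k+1} - c_k \big) = c_{k+1}, \\
T_k(c_{k+1}) &= \tfrac{1}{2}(1 + c_{k+1} c_k)\, c_{k+1}\, (1 - c_{k+1} c_k)
          = \tfrac{1}{2}(c_{k+1} - c_k)(1 - c_{k+1} c_k) \\
         &= \tfrac{1}{2}\big( c_{k+1} - c_k - c_{k+1} c_{k+1} c_k + c_k c_{k+1} c_k \big)
          = \tfrac{1}{2}\big( c_{k+1} - c_k - c_k - c_{k+1} \big) = -c_k, \\
T_k(c_j) &= \tfrac{1}{2}(1 + c_{k+1} c_k)(1 - c_{k+1} c_k)\, c_j = c_j
          \qquad (j \neq k,\, k+1).
\end{aligned}
```

The last line uses the fact that $c_j$ commutes with the even element $c_{k+1} c_k$ when $j \neq k, k+1$, together with $(c_{k+1} c_k)^2 = -1$.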

That there is much more to this braiding is indicated by the following result.

Clifford Braiding Theorem. Let C be the Clifford algebra over the real numbers
generated by linearly independent elements {c1, c2, . . . , cn} with $c_k^2 = 1$ for all
k and $c_k c_l = -c_l c_k$ for $k \neq l$. Then the algebra elements $\tau_k = (1 + c_{k+1} c_k)/\sqrt{2}$
form a representation of the (circular) Artin braid group. That is, we have
{τ1, τ2, . . . , τn−1, τn} where $\tau_k = (1 + c_{k+1} c_k)/\sqrt{2}$ for 1 ≤ k < n and
$\tau_n = (1 + c_1 c_n)/\sqrt{2}$, and $\tau_k \tau_{k+1} \tau_k = \tau_{k+1} \tau_k \tau_{k+1}$ for all k and
$\tau_i \tau_j = \tau_j \tau_i$ when |i − j| ≥ 2.
Note that each braiding generator τk has order 8.

Proof. Let $a_k = c_{k+1} c_k$. Examine the following calculation:

\[ \begin{aligned}
\tau_{k+1} \tau_k \tau_{k+1} &= \tfrac{1}{2\sqrt{2}} (1 + a_{k+1})(1 + a_k)(1 + a_{k+1}) \\
&= \tfrac{1}{2\sqrt{2}} (1 + a_k + a_{k+1} + a_{k+1} a_k)(1 + a_{k+1}) \\
&= \tfrac{1}{2\sqrt{2}} (1 + a_k + a_{k+1} + a_{k+1} a_k + a_{k+1}
   + a_k a_{k+1} + a_{k+1} a_{k+1} + a_{k+1} a_k a_{k+1}) \\
&= \tfrac{1}{2\sqrt{2}} (1 + a_k + a_{k+1} + c_{k+2} c_k + a_{k+1} + c_k c_{k+2} - 1 - c_k c_{k+1}) \\
&= \tfrac{1}{2\sqrt{2}} (a_k + a_{k+1} + a_{k+1} + c_{k+1} c_k) \\
&= \tfrac{1}{2\sqrt{2}} (2 a_k + 2 a_{k+1}) = \tfrac{1}{\sqrt{2}} (a_k + a_{k+1}).
\end{aligned} \]
Since the end result is symmetric under the interchange of k and k + 1, we conclude
that
\[ \tau_k \tau_{k+1} \tau_k = \tau_{k+1} \tau_k \tau_{k+1}. \]
Note that this braiding relation works circularly if we define $\tau_n = (1 + c_1 c_n)/\sqrt{2}$.
It is easy to see that $\tau_i \tau_j = \tau_j \tau_i$ when |i − j| ≥ 2. This completes the proof.

This representation of the (circular) Artin braid group is significant for the
topological physics of Majorana Fermions. This part of the structure needs further
study. We discuss its relationship with the work of Mo-Lin Ge in the next section.
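The statements above can be checked mechanically. The following is a minimal sketch, not taken from the source: it models the Clifford algebra with dictionaries keyed by bitmasks (an implementation choice made here), and all function names (`blade_sign`, `mul`, `tau`, `T`, `braid_ok`) are hypothetical helpers introduced for illustration.

```python
import math

# An algebra element is a dict {bitmask: coefficient}; bit (i-1) of the mask
# is set when the generator c_i appears in the ordered monomial.

def blade_sign(a, b):
    """Sign from reordering the monomial a*b, using c_i c_j = -c_j c_i and c_i^2 = 1."""
    swaps = 0
    a >>= 1
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps % 2 else 1.0

def mul(x, y):
    """Product of two algebra elements."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out

def close(x, y, tol=1e-12):
    return all(abs(x.get(m, 0.0) - y.get(m, 0.0)) < tol for m in set(x) | set(y))

def c(i):
    """The generator c_i."""
    return {1 << (i - 1): 1.0}

def scaled(x, s):
    return {m: s * v for m, v in x.items()}

def tau(k, n, inverse=False):
    """tau_k = (1 + c_{k+1} c_k)/sqrt(2) for k < n, and tau_n = (1 + c_1 c_n)/sqrt(2)."""
    a = mul(c(1), c(n)) if k == n else mul(c(k + 1), c(k))
    s = -1.0 if inverse else 1.0
    out = {0: 1.0}
    for m, v in a.items():
        out[m] = out.get(m, 0.0) + s * v
    return scaled(out, 1.0 / math.sqrt(2.0))

def T(k, n, x):
    """Conjugation action T_k(x) = tau_k x tau_k^{-1}."""
    return mul(mul(tau(k, n), x), tau(k, n, inverse=True))

def power(x, p):
    out = {0: 1.0}
    for _ in range(p):
        out = mul(out, x)
    return out

def braid_ok(k, n):
    """Check tau_k tau_{k+1} tau_k = tau_{k+1} tau_k tau_{k+1}, with k+1 read cyclically."""
    s, t = tau(k, n), tau(k % n + 1, n)
    return close(mul(mul(s, t), s), mul(mul(t, s), t))
```

For example, with `n = 5` one can confirm that `T(2, 5, c(2))` equals `c(3)`, that `braid_ok(k, 5)` holds for every `k` from 1 to 5 (including the circular case), and that `power(tau(1, 5), 8)` is the identity.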

Remark. It is worth noting that a triple of Majorana Fermions, say $x, y, z$, gives
rise to a representation of the quaternion group. This is a generalization of the well-
known association of Pauli matrices and quaternions. We have $x^2 = y^2 = z^2 = 1$
and, when different, they anticommute. Let $I = yx$, $J = zy$, $K = xz$. Then
\[ I^2 = J^2 = K^2 = IJK = -1 , \]

giving the quaternions. The operators
\[ X = \sigma_I = \tfrac{1}{\sqrt{2}} (1 + I) , \qquad Y = \sigma_J = \tfrac{1}{\sqrt{2}} (1 + J) , \qquad Z = \sigma_K = \tfrac{1}{\sqrt{2}} (1 + K) \]
braid one another:
\[ XYX = YXY , \qquad XZX = ZXZ , \qquad YZY = ZYZ . \]
This is a special case of the braid group representation described above for an
arbitrary list of Majorana Fermions. These braiding operators are entangling and
so can be used for universal quantum computation, but they give only partial
topological quantum computation due to the interaction with single qubit operators
not generated by them.
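As a concrete illustration of this remark, the sketch below takes the three Pauli matrices as one possible realization of the triple $x, y, z$ (an assumption made here; any three anticommuting operators squaring to 1 would do) and checks the quaternion and braiding relations numerically. The helper names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

ONE = [[1, 0], [0, 1]]
MINUS_ONE = [[-1, 0], [0, -1]]

# Assumed realization: the Pauli matrices square to 1 and pairwise anticommute.
x = [[0, 1], [1, 0]]
y = [[0, -1j], [1j, 0]]
z = [[1, 0], [0, -1]]

# Quaternion units built exactly as in the remark: I = yx, J = zy, K = xz.
I = mat_mul(y, x)
J = mat_mul(z, y)
K = mat_mul(x, z)

def braid_generator(q):
    """Return (1 + q)/sqrt(2) for a 2x2 matrix q."""
    r = 1.0 / math.sqrt(2.0)
    return [[r * (ONE[i][j] + q[i][j]) for j in range(2)] for i in range(2)]

def braids(p, q):
    """True when p q p = q p q."""
    return mat_close(mat_mul(mat_mul(p, q), p), mat_mul(mat_mul(q, p), q))

X, Y, Z = braid_generator(I), braid_generator(J), braid_generator(K)
```

With these definitions, `braids(X, Y)`, `braids(X, Z)` and `braids(Y, Z)` can each be evaluated to test the three relations stated above.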

Remark. We can see just how the braid group relation arises in the Clifford Braid-
ing Theorem as follows. Let $A$ and $B$ be given algebra elements with
\[ A^2 = B^2 = -1 \]
and
\[ AB = -BA . \]
(For example we can let $A = c_{k+1} c_k$ and $B = c_{k+2} c_{k+1}$ as above.)

Then we see immediately that $ABA = -BAA = -B(-1) = B$ and $BAB = A$.

Let
\[ \sigma_A = (1 + A)/\sqrt{2} \]
and
\[ \sigma_B = (1 + B)/\sqrt{2} . \]
Then
\begin{align*}
\sigma_A \sigma_B \sigma_A &= (1 + A)(1 + B)(1 + A)/(2\sqrt{2}) \\
&= (1 + B + A + AB)(1 + A)/(2\sqrt{2}) \\
&= (1 + B + A + AB + A + BA + A^2 + ABA)/(2\sqrt{2}) \\
&= (B + 2A + ABA)/(2\sqrt{2}) \\
&= 2(A + B)/(2\sqrt{2}) = (A + B)/\sqrt{2} .
\end{align*}
From this it follows by symmetry that $\sigma_A \sigma_B \sigma_A = \sigma_B \sigma_A \sigma_B$.
The relations ABA = B and BAB = A play a key role in conjunction with the
fact that A and B have square equal to minus one. We further remark that there
is a remarkable similarity in the formal structure of this derivation and the way

elements of the Temperley–Lieb algebra (Refs. 23, 29) can be combined to form a representa-
tion of the Artin braid group. In Temperley–Lieb algebra one has elements $R$ and
$S$ with $RSR = R$ and $SRS = S$ where $R^2 = R$ and $S^2 = S$. The Temperley–Lieb
algebra also gives rise to representations of the braid group and this is used in the
Fibonacci model (Ref. 29). Both the Clifford algebra and the Temperley–Lieb algebra are
associated with Majorana Fermions. The Clifford algebra is associated with the
annihilation/creation algebra and the Temperley–Lieb algebra is associated with
the fusion algebra (see Ref. 29 for a detailed discussion of the two algebras in
relation to Majoranas). We will explore the relationship further in the next section.
Note that if we have a collection of operators $M_k$ such that
\[ M_k^2 = -1 , \qquad M_k M_{k+1} = -M_{k+1} M_k \]
and
\[ M_i M_j = M_j M_i \]
when $|i - j| \ge 2$, then the operators
\[ \sigma_i = (1 + M_i)/\sqrt{2} \]
give a representation of the Artin braid group. This is a way to generalize the
Clifford Braiding Theorem, since defining $M_i = c_{i+1} c_i$ gives exactly such a set of
operators, and the observation about braiding we give above extends to a proof of
the more general theorem. One says that the operators $M_k$ generate an extraspecial
2-group (Ref. 37).

Extraspecial 2-Group Braiding Theorem. Let $\{M_1, \ldots, M_n\}$ generate an
extraspecial 2-group as above. Then the operators $\sigma_i = (1 + M_i)/\sqrt{2}$ for $i =
1, \ldots, n - 1$ give a representation of the Artin braid group with $\sigma_i \sigma_j = \sigma_j \sigma_i$ for
$|i - j| \ge 2$ and $\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1}$.

Proof. The proof follows at once from the discussion above.

Remark. In the next section we will see examples of braid group representations
that arise from the Extraspecial 2-Group Braiding Theorem, but are of a different
character than the representations that come from the Clifford Braiding Theorem.
The braiding operators in the Clifford Braiding Theorem can be seen to act
on the vector space over the complex numbers that is spanned by the Majorana
Fermions $\{c_1, c_2, c_3, \ldots, c_n\}$. To see how this works, let $x = c_k$ and $y = c_{k+1}$ from
the basis above. Let
\[ s = \frac{1 + yx}{\sqrt{2}} , \qquad T(p) = s p s^{-1} = \left( \frac{1 + yx}{\sqrt{2}} \right) p \left( \frac{1 - yx}{\sqrt{2}} \right) , \]

[Figure: two strands labeled $x$ and $y$ are interchanged, with $T(x) = y$ and $T(y) = -x$.]

Fig. 4. Braiding action on a pair of Fermions.

and verify that T (x) = y and T (y) = −x. Now view Fig. 4 where we have illustrated
a topological interpretation for the braiding of two Fermions. In the topological
interpretation the two Fermions are connected by a flexible belt. On interchange,
the belt becomes twisted by 2π. In the topological interpretation a twist of 2π
corresponds to a phase change of −1. (For more information on this topological
interpretation of 2π rotation for Fermions, see Ref. 12.) Without a further choice
it is not evident which particle of the pair should receive the phase change. The
topology alone tells us only the relative change of phase between the two particles.
The Clifford algebra for Majorana Fermions makes a specific choice in the matter
by using the linear ordering of the Majorana operators {c1 · · · cn }, and in this way
fixes the representation of the braiding. In saying this, we are just referring to
the calculational form above where with x prior to y in the ordering, the braiding
operator assigns the minus sign to one of them and not to the other.
A remarkable feature of this braiding representation of Majorana Fermions is
that it applies to give a representation of the n-strand braid group Bn for any row of
n Majorana Fermions. It is not restricted to the quaternion algebra. Nevertheless,
we shall now examine the braiding representations of the quaternions. These repre-
sentations are very rich and can be used in situations (such as Fibonacci particles)
involving particles that are their own antiparticles (analogous to the Majorana
Fermions underlying electrons). Such particles can occur in collectivities of elec-
trons as in the quantum Hall effect. In such situations it is theorized that one can
examine the local interaction properties of the Majorana particles and then the
braidings associated with triples of them (the quaternion cases) can come into play

very strongly. In the case of electrons in nanowires, one at the present time must
make do with long-range correlations between ends of the wires and forgo such local
interactions. Nevertheless, it is the purpose of this paper to juxtapose the full story
about three strand braid group representations of the quaternions in the hope that
this will lead to deeper understanding of the possibilities for even the electronic
Majorana Fermions.

5. Remark on the Temperley–Lieb Algebra and Braiding from Majorana Operators

Let $\{c_1, \ldots, c_n\}$ be a set of Majorana Fermion operators so that $c_i^2 = 1$ and
$c_i c_j = -c_j c_i$ for $i \neq j$. Recall (Ref. 12) that the $n$-strand Temperley–Lieb algebra $TL_n$
is generated by $\{U_1, \ldots, U_{n-1}\}$ with the relations (for an appropriate scalar $\delta$)
$U_i^2 = \delta U_i$, $U_i U_{i \pm 1} U_i = U_i$ and $U_i U_j = U_j U_i$ for $|i - j| \ge 2$. Here we give a
Majorana Fermion representation of the Temperley–Lieb algebra that is related to the
Ivanov braid group representation that we have discussed in previous sections of
the paper.
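First define $A$ and $B$ as $A = c_i c_{i+1}$, $B = c_{i-1} c_i$ so that $A^2 = B^2 = -1$ and
$AB = -BA$. Note the following relations: Let $U = (1 + iA)$ and $V = (1 + iB)$, where the
$i$ multiplying $A$ and $B$ is the imaginary unit rather than the index. Then $U^2 = 2U$ and
$V^2 = 2V$ and $UVU = 2U$ and $VUV = 2V$, so dividing $U$ and $V$ by $\sqrt{2}$ gives elements
that satisfy the Temperley–Lieb relations.
Thus, a Majorana Fermion representation of the Temperley–Lieb algebra is
given by
\begin{align}
U_k &= \frac{1}{\sqrt{2}} (1 + i c_{k+1} c_k) , \tag{10} \\
U_k^2 &= \sqrt{2}\, U_k , \tag{11} \\
U_k U_{k \pm 1} U_k &= U_k , \tag{12} \\
U_k U_j &= U_j U_k \quad \text{for } |k - j| \ge 2 . \tag{13}
\end{align}

Hence, we have a representation of the Temperley–Lieb algebra with loop value $\sqrt{2}$.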
Using this representation of the Temperley–Lieb algebra, we can construct (via the
Jones representation of the braid group to the Temperley–Lieb algebra) another
apparent representation of the braid group that is based on Majorana Fermions. In
order to see this representation more explicitly, note that the Jones representation
can be formulated as
\[ \sigma_k = A U_k + A^{-1} 1 , \qquad \sigma_k^{-1} = A^{-1} U_k + A 1 , \]
where $A$ now denotes a scalar (not the operator $A$ used above) with
\[ -A^2 - A^{-2} = \delta \]
and $\delta$ is the loop value for the Temperley–Lieb algebra so that $U_k^2 = \delta U_k$ (Ref. 12).

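In the case of the Temperley–Lieb algebra constructed from Majorana Fermion
operators described above, we take $A$ such that
\[ -A^2 - A^{-2} = \sqrt{2} \]
and
\begin{align*}
\sigma_k &= A U_k + A^{-1} 1 \\
&= A \frac{1}{\sqrt{2}} (1 + i c_{k+1} c_k) + A^{-1} 1 \\
&= \left( A \frac{1}{\sqrt{2}} + A^{-1} \right) + \left( i A \frac{1}{\sqrt{2}} \right) c_{k+1} c_k .
\end{align*}
Note that with
\[ A^2 + A^{-2} = -\sqrt{2} , \qquad x = A \frac{1}{\sqrt{2}} + A^{-1} , \]
and
\[ y = i A \frac{1}{\sqrt{2}} , \]
then $x^2 = y^2$. In fact the condition $x^2 = y^2$ is by itself a sufficient condition for the
braiding relation to be satisfied.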
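The relations of this section can be tested numerically on a small case. The sketch below is an illustration under stated assumptions: it realizes three Majorana operators by the Pauli matrices (a choice made here, not in the text), uses $U_k = (1 + i c_{k+1} c_k)/\sqrt{2}$, and picks the scalar $A = e^{3\pi i/8}$, which is one solution of $-A^2 - A^{-2} = \sqrt{2}$. Helper names are hypothetical.

```python
import cmath
import math

def mul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comb(s, a, t, b):
    """Linear combination s*a + t*b of two 2x2 matrices."""
    return [[s * a[i][j] + t * b[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

ONE = [[1, 0], [0, 1]]
# Assumed realization of three Majorana operators (each squares to 1, all anticommute).
c1 = [[0, 1], [1, 0]]
c2 = [[0, -1j], [1j, 0]]
c3 = [[1, 0], [0, -1]]

R = 1.0 / math.sqrt(2.0)

def tl_generator(c_hi, c_lo):
    """U = (1 + i c_hi c_lo)/sqrt(2), with i the imaginary unit."""
    return comb(R, ONE, 1j * R, mul(c_hi, c_lo))

U1 = tl_generator(c2, c1)
U2 = tl_generator(c3, c2)

A = cmath.exp(3j * math.pi / 8)  # one scalar with -A^2 - A^-2 = sqrt(2)

def jones(U):
    """Jones braid generator sigma = A U + A^{-1} 1."""
    return comb(A, U, 1 / A, ONE)

S1, S2 = jones(U1), jones(U2)

# Coefficients of sigma_k = x + y c_{k+1} c_k as in the text.
x_coeff = A * R + 1 / A
y_coeff = 1j * A * R
```

One can then compare `mul(U1, U1)` with $\sqrt{2}\,$`U1`, check that `mul(mul(U1, U2), U1)` returns `U1`, and confirm both the braid relation for `S1`, `S2` and the equality of `x_coeff**2` and `y_coeff**2` up to rounding.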

Remark. Note that $x^2 - y^2 = (x + y)(x - y)$ and so the condition $x^2 = y^2$ is
equivalent to the condition that $x = +y$ or $x = -y$. This means that the braiding
representation that we have constructed for Majorana Fermion operators via the
Temperley–Lieb representation is not different from the one we first constructed
from the Clifford Braiding Theorem. More precisely, we have the result of the
Theorem below.

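Theorem. Suppose that $\alpha^2 = \beta^2 = -1$ and $\alpha \beta = -\beta \alpha$. Define
\[ \sigma_\alpha = x + y \alpha \]
and
\[ \sigma_\beta = x + y \beta . \]
Then $x^2 = y^2$ is a sufficient condition for the braiding relation
\[ \sigma_\alpha \sigma_\beta \sigma_\alpha = \sigma_\beta \sigma_\alpha \sigma_\beta \]
to be satisfied. Since $x^2 = y^2$ is satisfied by $y = x$ or $y = -x$, we have the two
possibilities:
\[ \sigma_\alpha = x \sqrt{2}\, (1 + \alpha)/\sqrt{2} \]
and
\[ \sigma_\alpha = x \sqrt{2}\, (1 - \alpha)/\sqrt{2} . \]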

In other words, braiding representations that arise in this way will be equivalent
to the original braiding representation of Ivanov that we have derived from the
Clifford Braiding Theorem.

Proof. Given the hypothesis of the Theorem, note that $\alpha \beta \alpha = -\beta \alpha^2 = \beta$. Thus,
we have
\begin{align*}
\sigma_\alpha \sigma_\beta \sigma_\alpha &= (x + y\alpha)(x + y\beta)(x + y\alpha) \\
&= (x^2 + xy\beta + xy\alpha + y^2 \alpha\beta)(x + y\alpha) \\
&= x^3 + x^2 y \beta + x^2 y \alpha + x y^2 \alpha\beta + x^2 y \alpha + x y^2 \beta\alpha + x y^2 \alpha^2 + y^3 \alpha\beta\alpha \\
&= x^3 + x^2 y \beta + x^2 y \alpha + x^2 y \alpha + x y^2 (-1) + y^3 \beta \\
&= x^3 - x y^2 + (x^2 y + y^3) \beta + 2 x^2 y \alpha .
\end{align*}
This expression will be symmetric in $\alpha$ and $\beta$ if $x^2 y + y^3 = 2 x^2 y$, or equivalently,
if $y(-x^2 + y^2) = 0$. Thus the expression will be symmetric if $x^2 = y^2$. Since the
braiding relation asserts the symmetry of this expression, this completes the proof
of the Theorem.

6. Majorana Fermions Generate Universal Braiding Gates


Recall that in Sec. 3 we showed how to construct braid group representations. Let
$T_k : V_n \to V_n$ be defined, as in Sec. 3, by
\[ T_k(v) = \tau_k v \tau_k^{-1} . \]
Note that $\tau_k^{-1} = \frac{1}{\sqrt{2}} (1 - c_{k+1} c_k)$. It is then easy to verify
that
\[ T_k(c_k) = c_{k+1} , \qquad T_k(c_{k+1}) = -c_k \]
and that $T_k$ is the identity otherwise.


For universality, take n = 4 and regard each Tk as operating on V ⊗ V where
V is a single qubit space. Then the braiding operator T2 satisfies the Yang–Baxter
equation and is an entangling operator. So, we have universal gates (in the presence
of single qubit unitary operators) from Majorana Fermions. If experimental work
shows that Majorana Fermions can be detected and controlled, then it is possible
that quantum computers based on these topological unitary representations will be
constructed. Note that the matrix form $R$ of $T_2$ is
\[ R = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \]

Here we take the ordered basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$ for the corresponding 2-qubit
space $V \otimes V$ so that
\[ R|00\rangle = |00\rangle , \quad R|01\rangle = |10\rangle , \quad R|10\rangle = -|01\rangle , \quad R|11\rangle = |11\rangle . \]
It is not hard to verify that $R$ satisfies the Yang–Baxter equation. To see that it is
entangling we take the state $|\phi\rangle = a|0\rangle + b|1\rangle$ and test $R$ on
\[ |\phi\rangle \otimes |\phi\rangle = a^2 |00\rangle + ab|01\rangle + ab|10\rangle + b^2 |11\rangle \]
and find that
\[ R(|\phi\rangle \otimes |\phi\rangle) = a^2 |00\rangle + ab|10\rangle - ab|01\rangle + b^2 |11\rangle . \]
The determinant of this state is $a^2 b^2 + (ab)(ab) = 2a^2 b^2$. Thus when both $a$ and
$b$ are nonzero, we have that $R(|\phi\rangle \otimes |\phi\rangle)$ is entangled. This proves that $R$ is an
entangling operator, as we have claimed. This calculation shows that a fragment
of the Majorana operator braiding can be used to make a universal quantum gate,
and so to produce partial topological quantum computing if realized physically.
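Both claims about $R$ can be checked with exact integer arithmetic. The sketch below is illustrative only: it assumes the braided form of the Yang–Baxter equation, $(R \otimes I)(I \otimes R)(R \otimes I) = (I \otimes R)(R \otimes I)(I \otimes R)$, and uses the determinant $s_{00} s_{11} - s_{01} s_{10}$ of a two-qubit state as the entanglement test described above. Function names are hypothetical.

```python
def matmul(a, b):
    """Matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

I2 = [[1, 0], [0, 1]]

# Matrix of T_2 in the ordered basis |00>, |01>, |10>, |11>.
R = [[1, 0, 0, 0],
     [0, 0, -1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

def yang_baxter_holds(r):
    """Check (r x I)(I x r)(r x I) == (I x r)(r x I)(I x r) on three qubits."""
    r12, r23 = kron(r, I2), kron(I2, r)
    return matmul(matmul(r12, r23), r12) == matmul(matmul(r23, r12), r23)

def apply(m, v):
    """Apply matrix m to column vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def two_qubit_det(state):
    """For s00|00> + s01|01> + s10|10> + s11|11>, return s00*s11 - s01*s10."""
    return state[0] * state[3] - state[1] * state[2]

def product_state(a, b):
    """Coefficients of (a|0> + b|1>) tensor (a|0> + b|1>)."""
    return [a * a, a * b, a * b, b * b]
```

For a product state the determinant vanishes, while `two_qubit_det(apply(R, product_state(a, b)))` evaluates to $2a^2b^2$, which is nonzero whenever both amplitudes are nonzero.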

7. The Bell-Basis Matrix and Majorana Fermions


We can say more about braiding by using the operators $\tau_k = \frac{1}{\sqrt{2}} (1 + c_{k+1} c_k)$,
as these operators have natural matrix representations. In particular, consider the
Bell-Basis Matrix $B_{II}$ that is given as follows:
\[ B_{II} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & 1 & 0 \\ -1 & 0 & 0 & 1 \end{pmatrix} = \frac{1}{\sqrt{2}} (I + M) \quad (M^2 = -1) , \tag{14} \]
where
\[ M = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{15} \]
and we define
\[ M_i = I \otimes I \otimes \cdots \otimes I \otimes M \otimes I \otimes I \otimes \cdots \otimes I , \]
where there are $n$ tensor factors in all and $M$ occupies the $i$-th and $(i+1)$-st positions.
Then one can verify that these matrices satisfy the relations of an “extraspecial
2-group” (Refs. 8, 37). The relations are as follows:
\begin{align}
M_i M_{i \pm 1} &= -M_{i \pm 1} M_i , \qquad M_i^2 = -I , \tag{16} \\
M_i M_j &= M_j M_i , \qquad |i - j| \ge 2 . \tag{17}
\end{align}

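In fact, we can go further and identify a collection of Majorana Fermion opera-
tors behind these relations. Note that
\[ M = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} . \tag{18} \]
Let
\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} , \qquad B = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} . \tag{19} \]
Then we have $M = AB$ and $A^2 = B^2 = I$ while $AB + BA = 0$. Thus we can take
$A$ and $B$ themselves as Majorana Fermion operators. By the same token, we have
\[ M_i = I \otimes I \otimes \cdots \otimes I \otimes AB \otimes I \otimes I \otimes \cdots \otimes I , \]
so that if we define
\[ A_i = I \otimes I \otimes \cdots \otimes I \otimes A \otimes I \otimes I \otimes \cdots \otimes I \]
and
\[ B_i = I \otimes I \otimes \cdots \otimes I \otimes B \otimes I \otimes I \otimes \cdots \otimes I , \]
then
\[ M_i = A_i B_i . \]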
By direct calculation (details omitted here), we find the following relations:
\begin{align}
A_i^2 = B_i^2 &= 1 , \tag{20} \\
A_i B_i &= -B_i A_i , \tag{21} \\
A_i B_{i+1} &= -B_{i+1} A_i , \tag{22} \\
A_{i+1} B_i &= B_i A_{i+1} , \tag{23} \\
A_i B_j &= B_j A_i \quad \text{for } |i - j| > 1 , \tag{24} \\
A_i A_j &= A_j A_i , \tag{25} \\
B_i B_j &= B_j B_i \quad \text{for all } i \text{ and } j . \tag{26}
\end{align}
Definition. Call a list of pairs of Majorana operators that satisfies the above
identities a string of Majorana operators or Majorana string.

Theorem. Let $A_i, B_i$ ($i = 1, \ldots, n$) be a Majorana string. Then the product
operators $M_i = A_i B_i$ ($i = 1, \ldots, n$) define an extraspecial 2-group and hence the
operators $\sigma_i = (1 + M_i)/\sqrt{2}$ for $i = 1, \ldots, n - 1$ give a representation of the Artin
braid group.

Proof. We must verify that the $M_i = A_i B_i$ satisfy the relations for an extraspecial
2-group. To this end, note that $M_i^2 = A_i B_i A_i B_i = -A_i A_i B_i B_i = -1$ and that
(using the relations for a string of Majorana operators as above)
\[ M_i M_{i+1} = A_i B_i A_{i+1} B_{i+1} = A_i A_{i+1} B_i B_{i+1} \]
while
\[ M_{i+1} M_i = A_{i+1} B_{i+1} A_i B_i = -A_{i+1} A_i B_{i+1} B_i . \]
Thus $M_i M_{i+1} = -M_{i+1} M_i$. We leave to the reader to check that for $|i - j| \ge 2$,
$M_i M_j = M_j M_i$. This completes the proof that the $M_i$ form an extraspecial 2-group.
The braiding representation then follows from the Extraspecial 2-Group Braiding
Theorem.
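The matrix identities used in this section lend themselves to a direct check. The following sketch works with exact integers: it verifies $M = AB$, the extraspecial 2-group relations on three and four qubits, and the braid relation for $I + M_i$ (the common factor $1/\sqrt{2}$ is dropped since it does not affect the relation). The helper names and the choice of three and four tensor factors are assumptions made for the example.

```python
def matmul(a, b):
    """Matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def neg(a):
    return [[-v for v in row] for row in a]

def commute(a, b):
    return matmul(a, b) == matmul(b, a)

def anticommute(a, b):
    return matmul(a, b) == neg(matmul(b, a))

def braid_relation(m1, m2):
    """Check (I+m1)(I+m2)(I+m1) == (I+m2)(I+m1)(I+m2)."""
    p, q = add(identity(len(m1)), m1), add(identity(len(m2)), m2)
    return matmul(matmul(p, q), p) == matmul(matmul(q, p), q)

I2, I4 = identity(2), identity(4)
A = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
B = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
M = matmul(A, B)

# Three tensor factors: M_1 and M_2 overlap in the middle factor.
M1, M2 = kron(M, I2), kron(I2, M)
A1, B1 = kron(A, I2), kron(B, I2)
A2, B2 = kron(I2, A), kron(I2, B)

# Four tensor factors: N1 and N3 act on disjoint pairs of factors.
N1, N3 = kron(M, I4), kron(I4, M)
```

Calling `braid_relation(M1, M2)` tests the braiding of neighbouring generators, and `commute(N1, N3)` tests the commutation of distant ones.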

Remark. It is remarkable that these matrices representing Majorana Fermion
operators give, so simply, a braid group representation and that the pattern behind
this representation generalizes to any string of Majorana operators. Kauffman and
Lomonaco (Ref. 22) observed that $B_{II}$ satisfies the Yang–Baxter equation and is an entan-
gling gate. Hence $B_{II} = \frac{1}{\sqrt{2}} (I + M)$ ($M^2 = -1$) is a universal quantum gate in
the sense of this paper. It is of interest to understand the possible relationships of
topological entanglement (linking and knotting) and quantum entanglement. See
Ref. 22 for more than one point of view on this question.

Remarks. The operators $M_i$ take the place here of the products of Majorana
Fermions $c_{i+1} c_i$ in the Ivanov picture of braid group representation in the form
$\tau_i = (1/\sqrt{2})(1 + c_{i+1} c_i)$. This observation gives a concrete interpretation of these
braiding operators and relates them to a Hamiltonian for a physical system by an
observation of Mo-Lin Ge (Ref. 8). Mo-Lin Ge shows that the observation of Ivanov (Ref. 10) that
$\tau_k = (1/\sqrt{2})(1 + c_{k+1} c_k) = \exp(c_{k+1} c_k \pi/4)$ can be extended by defining
\[ \breve{R}_k(\theta) = e^{\theta c_{k+1} c_k} . \]
Then $\breve{R}_i(\theta)$ satisfies the full Yang–Baxter equation with rapidity parameter $\theta$. That
is, we have the equation
\[ \breve{R}_i(\theta_1) \breve{R}_{i+1}(\theta_2) \breve{R}_i(\theta_3) = \breve{R}_{i+1}(\theta_3) \breve{R}_i(\theta_2) \breve{R}_{i+1}(\theta_1) . \]
This makes it very clear that $\breve{R}_i(\theta)$ has physical significance and suggests examining
the physical process for a temporal evolution of the unitary operator $\breve{R}_i(\theta)$.
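The parametrized equation above can be explored numerically. In the sketch below the two anticommuting products $c_{i+1} c_i$ and $c_{i+2} c_{i+1}$, each squaring to $-1$, are modelled by the quaternion units $I$ and $J$ (a stand-in chosen here for convenience). Expanding both sides by hand suggests that the three angles are not independent: the identity appears to hold when $\tan\theta_2 = (\tan\theta_1 + \tan\theta_3)/(1 + \tan\theta_1 \tan\theta_3)$. This constraint is not stated in the passage and should be read as an assumption of the example; the function names are hypothetical.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (a, b, c, d) = a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def R_a(theta):
    """exp(theta * a), with a modelled by the quaternion unit I."""
    return (math.cos(theta), math.sin(theta), 0.0, 0.0)

def R_b(theta):
    """exp(theta * b), with b modelled by the quaternion unit J."""
    return (math.cos(theta), 0.0, math.sin(theta), 0.0)

def ybe_defect(t1, t2, t3):
    """Largest coefficient of R_a(t1) R_b(t2) R_a(t3) - R_b(t3) R_a(t2) R_b(t1)."""
    lhs = qmul(qmul(R_a(t1), R_b(t2)), R_a(t3))
    rhs = qmul(qmul(R_b(t3), R_a(t2)), R_b(t1))
    return max(abs(l - r) for l, r in zip(lhs, rhs))

def matching_theta2(t1, t3):
    """Middle angle from the assumed constraint on the tangents."""
    x, y = math.tan(t1), math.tan(t3)
    return math.atan((x + y) / (1.0 + x * y))
```

With all three angles equal to $\pi/4$ the constraint is met and the equation reduces to the braid relation for the $\tau_k$; for a generic middle angle such as `ybe_defect(0.3, 0.4, 0.5)` the two sides differ.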

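In fact, following Ref. 8, we can construct a Kitaev chain (Refs. 33, 34) based on the
solution $\breve{R}_i(\theta)$ of the Yang–Baxter equation. Let a unitary evolution be governed
by $\breve{R}_k(\theta)$. When $\theta$ in the unitary operator $\breve{R}_k(\theta)$ is time-dependent, we define a state
$|\psi(t)\rangle$ by $|\psi(t)\rangle = \breve{R}_k |\psi(0)\rangle$. With the Schrödinger equation $i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = \hat{H}(t) |\psi(t)\rangle$
one obtains:
\[ i\hbar \frac{\partial}{\partial t} [\breve{R}_k |\psi(0)\rangle] = \hat{H}(t) \breve{R}_k |\psi(0)\rangle . \tag{27} \]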

Then the Hamiltonian $\hat{H}_k(t)$ related to the unitary operator $\breve{R}_k(\theta)$ is obtained by
the formula:
\[ \hat{H}_k(t) = i\hbar \frac{\partial \breve{R}_k}{\partial t} \breve{R}_k^{-1} . \tag{28} \]
Substituting $\breve{R}_k(\theta) = \exp(\theta c_{k+1} c_k)$ into Eq. (28), we have:
\[ \hat{H}_k(t) = i\hbar \dot{\theta} c_{k+1} c_k . \tag{29} \]
This Hamiltonian describes the interaction between the $k$th and $(k + 1)$th sites via the
parameter $\dot{\theta}$. When $\theta = n \times \frac{\pi}{4}$, the unitary evolution corresponds to the braiding
process of two nearest Majorana Fermion sites in the system as we have described
it above. Here $n$ is an integer and counts the number of times the braiding operation is applied.
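The step from Eq. (28) to Eq. (29) can be spelled out as a short worked derivation. This is a sketch that uses only $(c_{k+1} c_k)^2 = -1$, which follows from the Majorana relations; the last line additionally assumes Hermitian Majorana operators, $c_k^\dagger = c_k$, and real $\dot{\theta}$, neither of which is stated explicitly in the passage.

```latex
\begin{align*}
\breve{R}_k(\theta) &= e^{\theta c_{k+1} c_k} = \cos\theta + \sin\theta \, c_{k+1} c_k , \\
\frac{\partial \breve{R}_k}{\partial t} &= \dot{\theta} \left( -\sin\theta + \cos\theta \, c_{k+1} c_k \right)
  = \dot{\theta} \, c_{k+1} c_k \, \breve{R}_k , \\
\hat{H}_k(t) &= i\hbar \, \frac{\partial \breve{R}_k}{\partial t} \, \breve{R}_k^{-1}
  = i\hbar \, \dot{\theta} \, c_{k+1} c_k , \\
\hat{H}_k^\dagger &= -i\hbar \, \dot{\theta} \, c_k c_{k+1} = i\hbar \, \dot{\theta} \, c_{k+1} c_k = \hat{H}_k .
\end{align*}
```

At $\theta = \pi/4$ the first line gives $\breve{R}_k = (1 + c_{k+1} c_k)/\sqrt{2} = \tau_k$, which is the braiding operator discussed earlier.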
We remark that it is interesting to examine this periodicity of the appearance
of the topological phase in the time evolution of this Hamiltonian (compare with
discussion in Ref. 38). For applications, one may consider processes that let the
Hamiltonian take the system right to one of these topological points and then
switch the Hamiltonian off. This goes beyond the work of Ivanov, who examines
the representation on Majoranas obtained by conjugating by these operators. The
generators $T_k$ of the Ivanov representation have order four (the square of $T_k$ sends
$c_k$ and $c_{k+1}$ to their negatives), while the generators $\tau_k$ of this representation
have order eight.
Mo-Lin Ge points out that if we only consider the nearest-neighbor interactions
between Majorana Fermions, and extend Eq. (29) to an inhomogeneous chain with
$2N$ sites, the derived model is expressed as:
\[ \hat{H} = i\hbar \sum_{k=1}^{N} \left( \dot{\theta}_1 c_{2k} c_{2k-1} + \dot{\theta}_2 c_{2k+1} c_{2k} \right) , \tag{30} \]

with $\dot{\theta}_1$ and $\dot{\theta}_2$ describing odd–even and even–odd pairs, respectively.
The Hamiltonian derived from $\breve{R}_i(\theta(t))$ corresponding to the braiding of nearest
Majorana Fermion sites is exactly the same as the 1D wire proposed by Kitaev (Ref. 33),
and $\dot{\theta}_1 = \dot{\theta}_2$ corresponds to the phase transition point in the “superconducting”
chain. By choosing different time-dependent parameters $\theta_1$ and $\theta_2$, one finds that the
Hamiltonian $\hat{H}$ corresponds to different phases. These observations of Mo-Lin Ge
give physical substance and significance to the Majorana Fermion braiding opera-
tors discovered by Ivanov (Ref. 10), putting them into a robust context of Hamiltonian
evolution via the simple Yang–Baxterization $\breve{R}_i(\theta) = e^{\theta c_{i+1} c_i}$.
In Ref. 22, Kauffman and Lomonaco observe that the Bell-Basis Change Matrix
$B_{II}$ is a solution to the Yang–Baxter equation. This solution can be seen as a $4 \times 4$
matrix representation for the operator $\breve{R}_i(\theta)$. One can ask whether there is a relation
between topological order, quantum entanglement and braiding. This is the case for
the Kitaev chain where nonlocal Majorana modes are entangled and have braiding
structure.
As we have pointed out above, Ge makes the further observation, that the Bell-
Basis Matrix BII can be used to construct an extraspecial 2-group and a braid
representation that has the same properties as the Ivanov braiding for the Kitaev

chain. We have pointed out that this extraspecial 2-group comes from a string
of Majorana matrix operators (described above). This suggests alternate physical
interpretations that need to be investigated in a sequel to this paper.

Appendix. SU (2) Representations of the Artin Braid Group


This appendix determines all the representations of the three-strand Artin braid
group $B_3$ to the special unitary group $SU(2)$ and concomitantly to the unitary
group $U(2)$. One regards the groups $SU(2)$ and $U(2)$ as acting on a single qubit,
and so $U(2)$ is usually regarded as the group of local unitary transformations in
a quantum information setting. If one is looking for a coherent way to represent
all unitary transformations by way of braids, then $U(2)$ is the place to start. Here
we will give an example of a representation of the three-strand braid group that
generates a dense subset of $SU(2)$ (the Fibonacci model, Refs. 23, 33). Thus it is a fact that
local unitary transformations can be “generated by braids” in many ways. The
braid group representations related to Majorana Fermions are also seen to have
their roots in these quaternion representations, as we shall see at the end of the
appendix.
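We begin with the structure of $SU(2)$. A matrix in $SU(2)$ has the form
\[ M = \begin{pmatrix} z & w \\ -\bar{w} & \bar{z} \end{pmatrix} , \]
where $z$ and $w$ are complex numbers, and $\bar{z}$ denotes the complex conjugate of $z$.
To be in $SU(2)$ it is required that $\mathrm{Det}(M) = 1$ and that $M^\dagger = M^{-1}$ where Det
denotes determinant, and $M^\dagger$ is the conjugate transpose of $M$. Thus if $z = a + bi$
and $w = c + di$ where $a, b, c, d$ are real numbers, and $i^2 = -1$, then
\[ M = \begin{pmatrix} a + bi & c + di \\ -c + di & a - bi \end{pmatrix} \]
with $a^2 + b^2 + c^2 + d^2 = 1$. It is convenient to write
\[ M = a \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} + c \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + d \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} , \]
and to abbreviate this decomposition as
\[ M = a + bI + cJ + dK , \]
where
\[ 1 \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \quad I \equiv \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} , \quad J \equiv \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} , \quad K \equiv \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \]
so that
\[ I^2 = J^2 = K^2 = IJK = -1 \]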

and
\[ IJ = K , \quad JK = I , \quad KI = J , \qquad JI = -K , \quad KJ = -I , \quad IK = -J . \]
The algebra of 1, I, J, K is called the quaternions after William Rowan Hamilton
who discovered this algebra prior to the discovery of matrix algebra. Thus the unit
quaternions are identified with SU (2) in this way. We shall use this identification,
and some facts about the quaternions to find the SU (2) representations of braiding.
First we recall some facts about the quaternions. For a detailed exposition of basics
for quaternions, see Ref. 12.

(1) Note that if $q = a + bI + cJ + dK$ (as above), then $q^\dagger = a - bI - cJ - dK$ so
    that $qq^\dagger = a^2 + b^2 + c^2 + d^2 = 1$.
(2) A general quaternion has the form $q = a + bI + cJ + dK$ where the value of
    $qq^\dagger = a^2 + b^2 + c^2 + d^2$ is not fixed to unity. The length of $q$ is by definition
    $\sqrt{qq^\dagger}$.
(3) A quaternion of the form $rI + sJ + tK$ for real numbers $r, s, t$ is said to be a
    pure quaternion. We identify the set of pure quaternions with the vector space
    of triples $(r, s, t)$ of real numbers $\mathbb{R}^3$.
(4) Thus a general quaternion has the form $q = a + bu$ where $u$ is a pure quater-
    nion of unit length and $a$ and $b$ are arbitrary real numbers. A unit quaternion
    (element of $SU(2)$) has the additional property that $a^2 + b^2 = 1$.
(5) If $u$ is a pure unit length quaternion, then $u^2 = -1$. Note that the set of pure
    unit quaternions forms the two-dimensional sphere $S^2 = \{(r, s, t) \mid r^2 + s^2 + t^2 = 1\}$
    in $\mathbb{R}^3$.
(6) If $u, v$ are pure quaternions, then
    \[ uv = -u \cdot v + u \times v , \]
    where $u \cdot v$ is the dot product of the vectors $u$ and $v$, and $u \times v$ is the vector
    cross product of $u$ and $v$. In fact, one can take the definition of quaternion
    multiplication as
    \[ (a + bu)(c + dv) = ac + bc(u) + ad(v) + bd(-u \cdot v + u \times v) , \]
    and all the above properties are consequences of this definition. Note that
    quaternion multiplication is associative.
(7) Let $g = a + bu$ be a unit length quaternion so that $u^2 = -1$ and $a = \cos(\theta/2)$,
    $b = \sin(\theta/2)$ for a chosen angle $\theta$. Define $\phi_g : \mathbb{R}^3 \to \mathbb{R}^3$ by the equation
    $\phi_g(P) = gPg^\dagger$, for $P$ any point in $\mathbb{R}^3$, regarded as a pure quaternion. Then $\phi_g$
    is an orientation preserving rotation of $\mathbb{R}^3$ (hence an element of the rotation
    group $SO(3)$). Specifically, $\phi_g$ is a rotation about the axis $u$ by the angle $\theta$.
    The mapping
    \[ \phi : SU(2) \to SO(3) \]

    is a two-to-one surjective map from the special unitary group to the rota-
    tion group. In quaternionic form, this result was proved by Hamilton and by
    Rodrigues in the middle of the nineteenth century. The specific formula for
    $\phi_g(P)$ is shown below:
    \[ \phi_g(P) = gPg^{-1} = (a^2 - b^2)P + 2ab(u \times P) + 2(P \cdot u)b^2 u . \]
We want a representation of the three-strand braid group in $SU(2)$. This means
that we want a homomorphism $\rho : B_3 \to SU(2)$, and hence we want elements
$g = \rho(s_1)$ and $h = \rho(s_2)$ in $SU(2)$ representing the braid group generators $s_1$ and
$s_2$. Since $s_1 s_2 s_1 = s_2 s_1 s_2$ is the generating relation for $B_3$, the only requirement on
$g$ and $h$ is that $ghg = hgh$. We rewrite this relation as $h^{-1}gh = ghg^{-1}$ and analyze
its meaning in the unit quaternions.
Suppose that $g = a + bu$ and $h = c + dv$ where $u$ and $v$ are unit pure quaternions
so that $a^2 + b^2 = 1$ and $c^2 + d^2 = 1$. Then $ghg^{-1} = c + d\phi_g(v)$ and $h^{-1}gh =
a + b\phi_{h^{-1}}(u)$. Thus, it follows from the braiding relation that $a = c$, $b = \pm d$, and
that $\phi_g(v) = \pm\phi_{h^{-1}}(u)$. However, in the case where there is a minus sign we have
$g = a + bu$ and $h = a - bv = a + b(-v)$, so replacing $v$ by $-v$ reduces this case to the
one with the plus sign. Thus, we can now prove the following Theorem.
Theorem. Let $u$ and $v$ be pure unit quaternions and $g = a + bu$ and $h = c + dv$
have unit length. Then (without loss of generality), the braid relation $ghg = hgh$
is true if and only if $h = a + bv$, and $\phi_g(v) = \phi_{h^{-1}}(u)$. Furthermore, given that
$g = a + bu$ and $h = a + bv$, the condition $\phi_g(v) = \phi_{h^{-1}}(u)$ is satisfied if and only
if $u \cdot v = \frac{a^2 - b^2}{2b^2}$ when $u \neq v$. If $u = v$ then $g = h$ and the braid relation is trivially
satisfied.

Proof. We have proved the first sentence of the Theorem in the discussion prior to
its statement. Therefore assume that $g = a + bu$, $h = a + bv$, and $\phi_g(v) = \phi_{h^{-1}}(u)$.
We have already stated the formula for $\phi_g(v)$ in the discussion about quaternions:
\[ \phi_g(v) = gvg^{-1} = (a^2 - b^2)v + 2ab(u \times v) + 2(v \cdot u)b^2 u . \]
By the same token, since $h^{-1} = a + b(-v)$, we have
\begin{align*}
\phi_{h^{-1}}(u) = h^{-1}uh &= (a^2 - b^2)u + 2ab((-v) \times u) + 2(u \cdot (-v))b^2(-v) \\
&= (a^2 - b^2)u + 2ab(u \times v) + 2(v \cdot u)b^2 v .
\end{align*}
Hence, we require that
\[ (a^2 - b^2)v + 2(v \cdot u)b^2 u = (a^2 - b^2)u + 2(v \cdot u)b^2 v . \]
This equation is equivalent to
\[ 2(u \cdot v)b^2(u - v) = (a^2 - b^2)(u - v) . \]
If $u \neq v$, then this implies that
\[ u \cdot v = \frac{a^2 - b^2}{2b^2} . \]
This completes the proof of the Theorem.
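The criterion of the Theorem is easy to test numerically. The sketch below represents quaternions as 4-tuples, builds $g = a + bu$ and $h = a + bv$ with a prescribed dot product $u \cdot v$, and measures how far $ghg$ is from $hgh$. The particular choice $u = I$ with $v$ in the $I$–$J$ plane, and all function names, are assumptions made for the example.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (a, b, c, d) = a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def braid_defect(g, h):
    """Largest coefficient of ghg - hgh."""
    lhs = qmul(qmul(g, h), g)
    rhs = qmul(qmul(h, g), h)
    return max(abs(l - r) for l, r in zip(lhs, rhs))

def required_dot(half_angle):
    """The value (a^2 - b^2)/(2 b^2) with a = cos(half_angle), b = sin(half_angle)."""
    a, b = math.cos(half_angle), math.sin(half_angle)
    return (a * a - b * b) / (2.0 * b * b)

def generators(half_angle, dot):
    """g = a + b u and h = a + b v with u = I and v a unit vector satisfying u.v = dot."""
    a, b = math.cos(half_angle), math.sin(half_angle)
    v = (dot, math.sqrt(1.0 - dot * dot), 0.0)
    g = (a, b, 0.0, 0.0)
    h = (a, b * v[0], b * v[1], b * v[2])
    return g, h
```

For instance, `braid_defect(*generators(0.9, required_dot(0.9)))` is zero up to rounding, whereas shifting the dot product away from the required value gives a visibly nonzero defect. At `half_angle = math.pi / 4` the required dot product is zero, which is the orthogonal case treated next.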

The Majorana Fermion Example. Note the case of the theorem where
\[ g = a + bu , \qquad h = a + bv . \]
Suppose that $u \cdot v = 0$. Then the theorem tells us that we need $a^2 - b^2 = 0$ and
since $a^2 + b^2 = 1$, we conclude that $a = 1/\sqrt{2}$ and $b$ likewise. For definiteness, then
we have for the braiding generators (since $I$, $J$ and $K$ are mutually orthogonal) the
three operators
\[ A = \frac{1}{\sqrt{2}} (1 + I) , \qquad B = \frac{1}{\sqrt{2}} (1 + J) , \qquad C = \frac{1}{\sqrt{2}} (1 + K) . \]
Each pair satisfies the braiding relation so that $ABA = BAB$, $BCB = CBC$,
$ACA = CAC$. We have already met this braiding triplet in our discussion of the
construction of braiding operators from Majorana Fermions in Sec. 3. This shows
(again) how close Hamilton’s quaternions are to topology and how braiding is fun-
damental to the structure of Fermionic physics.

The Fibonacci Example. Let
\[ g = e^{I\theta} = a + bI , \]
where $a = \cos(\theta)$ and $b = \sin(\theta)$. Let
\[ h = a + b \left[ (c^2 - s^2) I + 2cs K \right] , \]
where $c^2 + s^2 = 1$ and $c^2 - s^2 = \frac{a^2 - b^2}{2b^2}$. Then we can rewrite $g$ and $h$ in matrix
form as the matrices $G$ and $H$. Instead of writing the explicit form of $H$, we write
$H = FGF^\dagger$ where $F$ is an element of $SU(2)$ as shown below:
\[ G = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} , \qquad F = \begin{pmatrix} ic & is \\ is & -ic \end{pmatrix} . \]
This representation of braiding where one generator $G$ is a simple matrix of phases,
while the other generator $H = FGF^\dagger$ is derived from $G$ by conjugation by a unitary
matrix, has the possibility for generalization to representations of braid groups (with
more than three strands) to $SU(n)$ or $U(n)$ for $n$ greater than 2. In fact, we shall see
just such representations (Ref. 23) by using a version of topological quantum field theory.
The simplest example is given by
\[ g = e^{7\pi I/10} , \qquad f = I\tau + K\sqrt{\tau} , \qquad h = fgf^{-1} , \]
where $\tau^2 + \tau = 1$. Then $g$ and $h$ satisfy $ghg = hgh$ and generate a representation of
the three-strand braid group that is dense in $SU(2)$. We shall call this the Fibonacci
representation of $B_3$ to $SU(2)$.
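The braid relation for this example can be confirmed with a few lines of quaternion arithmetic. The sketch below takes $\tau = (\sqrt{5} - 1)/2$, the positive root of $\tau^2 + \tau = 1$, and reads $f$ as $I\tau + K\sqrt{\tau}$ so that $f$ has unit length; both readings are assumptions of the example. It checks only the braid relation and the dot-product criterion, not the density statement.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (a, b, c, d) = a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    """Quaternion conjugate, which is the inverse for unit quaternions."""
    return (q[0], -q[1], -q[2], -q[3])

def braid_defect(g, h):
    """Largest coefficient of ghg - hgh."""
    lhs = qmul(qmul(g, h), g)
    rhs = qmul(qmul(h, g), h)
    return max(abs(l - r) for l, r in zip(lhs, rhs))

tau = (math.sqrt(5.0) - 1.0) / 2.0      # positive root of tau^2 + tau = 1
theta = 7.0 * math.pi / 10.0

a, b = math.cos(theta), math.sin(theta)
g = (a, b, 0.0, 0.0)                     # g = exp(theta I) = a + b I
f = (0.0, tau, 0.0, math.sqrt(tau))      # f = I tau + K sqrt(tau), a pure unit quaternion
h = qmul(qmul(f, g), conj(f))            # h = f g f^{-1}

# h = a + b v with v a pure unit quaternion; u = I, so u.v is the I-coefficient of v.
u_dot_v = h[1] / b
criterion = (a * a - b * b) / (2.0 * b * b)
```

Evaluating `braid_defect(g, h)` and comparing `u_dot_v` with `criterion` ties this example back to the Theorem of the appendix.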
At this point we can close this paper with the speculation that braid group
representations such as this Fibonacci representation can be realized in the con-
text of electrons in nanowires. The formalism is the same as our basic Majorana
representation. It has the form of a braiding operator of the form
exp(θyx) ,

where $x$ and $y$ are Majorana operators and the angle $\theta$ is not equal to $\pi/4$ as
is required in the full Majorana representation. For a triple $\{x, y, z\}$ of Majorana
operators, any quaternion representation is available. Note how this will affect the
conjugation representation: Let $T = r + syx$ where $r$ and $s$ are real numbers with
$r^2 + s^2 = 1$ (the cosine and sine of $\theta$), chosen so that a representation of the braid
group is formed at the triplet (quaternion) level. Then $T^{-1} = r - syx$ and the
reader can verify that
\[ TxT^{-1} = (r^2 - s^2)x + 2rsy , \qquad TyT^{-1} = (r^2 - s^2)y - 2rsx . \]
Thus, we see that the original Fermion exchange occurs with $r = s$ and then the
sign on $-2rs$ is the well-known sign change in the exchange of Fermions. Here it is
generalized to a more complex linear combination of the two particle/operators.

Acknowledgments
Much of this paper is based upon our joint work in the papers of Refs. 13–23, 25, 26. We have
woven this work into the present paper in a form that is coupled with recent and
previous work on relations with logic and with Majorana Fermions. This work was
supported by the Laboratory of Topology and Dynamics, Novosibirsk State Uni-
versity (Contract No. 14.Y26.31.0025 with the Ministry of Education and Science
of the Russian Federation).

References
1. R. J. Baxter, Exactly Solved Models in Statistical Mechanics (Academic Press, 1982).
2. C. W. J. Beenakker, Search for Majorana Fermions in superconductors,
arXiv:1112.1950.
3. N. E. Bonesteel, L. Hormozi, G. Zikos and S. H. Simon, Braid topologies for quantum
computation, Phys. Rev. Lett. 95, 140503 (2005), arXiv:quant-ph/0505665.
4. S. H. Simon, N. E. Bonesteel, M. H. Freedman, N. Petrovic and L. Hormozi, Topo-
logical quantum computing with only one mobile quasiparticle, Phys. Rev. Lett. 96,
070503 (2006), arXiv:quant-ph/0509175.
5. J. L. Brylinski and R. Brylinski, Universal quantum gates, in Mathematics of Quantum
Computation, eds. R. Brylinski and G. Chen (Chapman & Hall/CRC Press, Boca
Raton, Florida, 2002).
6. P. A. M. Dirac, Principles of Quantum Mechanics (Oxford University Press, 1958).
7. E. Fradkin and P. Fendley, Realizing non-Abelian statistics in time-reversal invariant
systems, Theory Seminar, Physics Department, UIUC, 4/25/2005.
8. L.-W. Yu and M.-L. Ge, More about the doubling degeneracy operators associated
with Majorana fermions and Yang–Baxter equation, Sci. Rep. 5, 8102 (2015).
9. B. Hatfield, Quantum Field Theory of Point Particles and Strings (Perseus Books,
Cambridge, Massachusetts, 1991).
10. D. A. Ivanov, Non-Abelian statistics of half-quantum vortices in p-wave superconduc-
tors, Phys. Rev. Lett. 86, 268 (2001).
11. L. H. Kauffman, Temperley–Lieb Recoupling Theory and Invariants of Three-
Manifolds, Annals Studies, Vol. 114 (Princeton University Press, 1994).
12. L. H. Kauffman, Knots and Physics (World Scientific, Singapore, 1991), 2nd edn.
(1993), 3rd edn. (2002), 4th edn. (2012).
13. L. H. Kauffman, Knot logic and topological quantum computing with Majorana
Fermions, in Logic and Algebraic Structures in Quantum Computing, eds. J. Chubb,
A. Eskandarian and V. Harizanov (Cambridge University Press, 2016), pp. 223–335.
14. L. H. Kauffman and S. J. Lomonaco Jr., Quantum entanglement and topological
entanglement, New J. Phys. 4, 73 (2002).
15. L. H. Kauffman, Teleportation topology: Proceedings of the 2004 Byelorus Conference
on quantum optics, Opt. Spectrosc. 9, 227 (2005), arXiv:quant-ph/0407224.
16. L. H. Kauffman and S. J. Lomonaco Jr., Entanglement criteria — Quantum and
topological, in Quantum Information and Computation, Orlando, FL, 21–22 April
2003, Proc. SPIE, Vol. 5105, eds. E. Donkor, A. R. Pinch and H. E. Brandt (SPIE,
2003), pp. 51–58.
17. L. H. Kauffman and S. J. Lomonaco Jr., Quantum knots, in Quantum Information
and Computation II, 12–14 April 2004, Proc. SPIE, Vol. 5436, eds. E. Donkor, A. R.
Pinch and H. E. Brandt (SPIE, 2004), pp. 268–284.
18. S. J. Lomonaco and L. H. Kauffman, Quantum knots and mosaics, J. Quantum Inf.
Process. 7, 85 (2008), http://arxiv.org/abs/0805.0339.
19. S. J. Lomonaco and L. H. Kauffman, Quantum knots and lattices, or a blueprint for
quantum systems that do rope tricks, in Quantum Information Science and its Contri-
butions to Mathematics, Proc. Symp. Appl. Math., Vol. 68 (American Mathematical
Society, Providence, RI, 2010), pp. 209–276.
20. S. J. Lomonaco and L. H. Kauffman, Quantizing braids and other mathematical struc-
tures: The general quantization procedure, in Quantum Information and Computation
IX, April 2014, Proc. SPIE, Vol. 8057, eds. E. Donkor, A. R. Pinch and H. E. Brandt
(SPIE, 2003), pp. 1–14.
21. L. H. Kauffman and S. J. Lomonaco, Quantizing knots groups and graphs, in Quantum
Information and Computation IX, April 2011, Proc. SPIE, Vol. 8057, eds. E. Donkor,
A. R. Pinch and H. E. Brandt (SPIE, 2011), pp. 1–15.
22. L. H. Kauffman and S. J. Lomonaco, Braiding operators are universal quantum gates,
New J. Phys. 6, 134 (2004).
23. L. H. Kauffman and S. J. Lomonaco Jr., q-deformed spin networks, knot polynomials
and anyonic topological quantum computation, J. Knot Theory Ramifications 16, 267
(2007).
24. L. H. Kauffman and S. J. Lomonaco Jr., Spin networks and quantum computation, in
Lie Theory and Its Applications in Physics VII, eds. H. D. Doebner and V. K. Dobrev
(Heron Press, Sofia, 2008), pp. 225–239.
25. L. H. Kauffman, Quantum computing and the Jones polynomial, in Quantum Com-
putation and Information, Contemporary Mathematics, Vol. 305, ed. S. Lomonaco Jr.
(AMS, 2002), pp. 101–137, arXiv:math.QA/0105255.
26. L. H. Kauffman and S. J. Lomonaco Jr., The Fibonacci model and the Temperley–Lieb
algebra, Int. J. Mod. Phys. B 22, 5065 (2008).
27. L. H. Kauffman, Iterants, fermions and Majorana operators, in Unified Field
Mechanics — Natural Science Beyond the Veil of Spacetime, eds. R. Amoroso, L. H.
Kauffman and P. Rowlands (World Scientific, 2015), pp. 1–32.
28. L. H. Kauffman, Iterant algebra, Entropy 19, 347 (2017), doi:10.3390/e19070347.
29. L. H. Kauffman, Knot logic and topological quantum computing with Majorana
fermions, in Logic and Algebraic Structures in Quantum Computing and Information,
Lecture Notes in Logic, eds. J. Chubb, A. Eskandarian and V. Harizanov
(Cambridge University Press, 2016).
30. L. H. Kauffman, Braiding and Majorana fermions, J. Knot Theory Ramifications 26,
1743001 (2017).
31. L. H. Kauffman and P. Noyes, Discrete physics and the Dirac equation, Phys. Lett.
A 218, 139 (1996).
32. V. Mourik, K. Zuo, S. M. Frolov, S. R. Plissard, E. P. A. M. Bakkers and L. P. Kouwen-
huven, Signatures of Majorana fermions in hybrid superconductor-semiconductor
devices, arXiv:1204.2792.
33. A. Kitaev, Anyons in an exactly solved model and beyond, Ann. Phys. 321, 2 (2006),
arXiv:cond-mat/0506438.
34. A. Kitaev, Fault-tolerant quantum computation by anyons, Ann. Phys. 303, 2 (2003),
arXiv:quant-ph/9707021.
35. E. Majorana, A symmetric theory of electrons and positrons, Il Nuovo Cimento 14,
171 (1937).
36. M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information
(Cambridge University Press, Cambridge, 2000).
37. J. Franko, E. C. Rowell and Z. Wang, Extraspecial 2-groups and images of braid
group representations, J. Knot Theory Ramifications 15, 413 (2006).
38. R. Ul Haq and L. H. Kauffman, Z/2Z topological order and Majorana doubling in
Kitaev Chain, to appear, arXiv:1704.00252v1 [cond-mat.str-el].
39. F. Wilczek, Fractional Statistics and Anyon Superconductivity (World Scientific,
1990).
October 31, 2018 12:13 taken from 146-MPLA ws-rv961x669 chap04-S0217732318300124 page 109

Chapter 4

Arithmetic gauge theory: A brief introduction∗

Minhyong Kim
Mathematical Institute, University of Oxford,
Woodstock Road, Oxford OX2 6GG, UK
The Korea Institute for Advanced Study, 85 Hoegiro,
Dongdaemungu, Seoul 02455, Republic of Korea

Much of arithmetic geometry is concerned with the study of principal bundles. They
occur prominently in the arithmetic of elliptic curves and, more recently, in the study of
the Diophantine geometry of curves of higher genus. In particular, the geometry of mod-
uli spaces of principal bundles holds the key to an effective version of Faltings’ theorem
on finiteness of rational points on curves of genus at least 2. The study of arithmetic
principal bundles includes the study of Galois representations, the structures linking
motives to automorphic forms according to the Langlands program. In this paper, we
give a brief introduction to the arithmetic geometry of principal bundles with emphasis
on some elementary analogies between arithmetic moduli spaces and the constructions
of quantum field theory.

Keywords: Gauge fields; principal bundles; arithmetic schemes; algebraic number fields;
Galois groups; class field theory.

PACS Nos.: 02.10.De, 02.40.Re

1. Fermat’s Principle
Fermat’s principle says that the trajectory taken by a beam of light is a solution to
an optimization problem. That is, among all the possible paths that light could take,
it selects the one requiring the least time to traverse. This was the first example of a
very general methodology known nowadays as the principle of least action. To figure
out the trajectory or spacetime configuration favored by nature, you should analyze
the physical properties of the system to associate to each possible configuration a
number, called the action of the configuration. Then the true trajectory is one
where the action is extremized. The action determines a constraint equation, the
so-called Euler–Lagrange (E-L) equation of the system, whose solutions give you
possible trajectories. The action principle in suitably general form is the basis of
classical field theory, particle physics, string theory, and gravity. For Fermat to have

∗ This chapter also appeared in Modern Physics Letters A, Vol. 33, No. 29 (2018) 1830012. DOI:
10.1142/S0217732318300124.
discovered this idea so long ago in relation to the motion of light was a monumental
achievement, central to the scientific revolution that rose out of the intellectual
fervor of 17th century Europe.
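Fermat's principle can be made concrete with a small numerical experiment. The sketch below is an illustration added here, not part of the original discussion: it assumes light travelling from a point at height 1 above a flat interface to a point at depth 1 below it, with horizontal separation 2 and speeds 1 and 0.75 in the two media (all values hypothetical). Minimizing the travel time over the crossing point recovers Snell's law, which plays the role of the Euler–Lagrange equation of this one-parameter problem.

```python
import math

# Hypothetical geometry: source at (0, A_HEIGHT), target at (SEPARATION, -B_DEPTH),
# interface along the horizontal axis, light speeds V1 (above) and V2 (below).
A_HEIGHT, B_DEPTH, SEPARATION = 1.0, 1.0, 2.0
V1, V2 = 1.0, 0.75

def travel_time(x):
    """Time along the broken path that crosses the interface at (x, 0)."""
    return math.hypot(x, A_HEIGHT) / V1 + math.hypot(SEPARATION - x, B_DEPTH) / V2

def minimize_convex(f, lo, hi, iterations=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def snell_defect(x):
    """sin(theta1)/V1 - sin(theta2)/V2, which is the derivative of travel_time."""
    sin1 = x / math.hypot(x, A_HEIGHT)
    sin2 = (SEPARATION - x) / math.hypot(SEPARATION - x, B_DEPTH)
    return sin1 / V1 - sin2 / V2

x_star = minimize_convex(travel_time, 0.0, SEPARATION)
print(x_star, snell_defect(x_star))
```

The travel time is a sum of two convex functions of the crossing point, which is what justifies the ternary search; the printed defect should be zero up to floating-point error.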
However, Fermat is probably better known these days as the first modern
number-theorist. Among the intellectual giants of the period, Fermat was almost
unique in his preoccupation with prime numbers and Diophantine equations, poly-
nomial equations to which one seeks integral or rational solutions. Located among
his many forays into this subject one finds his famous “Last Theorem”, which
elicited from the best mathematical minds of subsequent generations several hun-
dred years of theoretical development before it was finally given a proof by Andrew
Wiles in 1995.48 The action principle and Fermat’s last theorem are lasting tributes
to one of the singularly original minds active at the dawn of modern science. Could
there be a relation between the two? In fact, the problem of finding the trajectory of
light and that of finding rational solutions to Diophantine equations are two facets
of the same problem, one occurring in geometric gauge theory, and the other, in
arithmetic gauge theory. The fact that the photon is described by a U(1) gauge field
is well-known. The purpose of this paper is to give the motivated physicist with
background in geometry and topology some sense of the second type of theory and
its relevance to the theory of Diophantine equations.
In the context of Abelian problems, say the arithmetic of elliptic curves, much
of the material is classical. However, for non-Abelian gauge groups, the perspective
of gauge theory is very useful and has concrete consequences. It is hoped that
number-theorists will also benefit from the intuition provided by this somewhat
fanciful view.

2. Diophantine Geometry and Gauge Theory


We will be employing the language of Diophantine geometry, whereby a system of
equations will be encoded in an algebraic variety
\[ V \]
defined over $\mathbb{Q}$. We will always assume $V$ is connected. The rational solutions (or
points, to use the language of geometry) will be denoted by $V(\mathbb{Q})$, while $p$-adic
or adelic points$^{a}$ will be denoted by $V(\mathbb{Q}_p)$ and $V(\mathbb{A}_{\mathbb{Q}})$. The theory of Refs. 24–27
associates to $p$-adic or adelic points of $V$ arithmetic gauge fields. We will be focusing
mostly on the $p$-adic theory for the sake of expositional simplicity. The statement,
which we will review in Sec. 5, is that there is a natural map
\[ A : V(\mathbb{Q}_p) \longrightarrow \{\text{$p$-adic arithmetic gauge fields}\}. \]

a The fields of $p$-adic numbers, one for each prime $p$, are non-Archimedean completions of $\mathbb{Q}$. They give a
way of geometrizing the rational numbers into a fractal-like space whose main advantage over the
reals is having a substantial but manageable absolute Galois group. The adeles can be thought of
as essentially the product ring of R and Qp for all p, with some small restriction. See Ref. 39 for
a review.
The type of gauge field is determined by the arithmetic geometry of $V$. Among $p$-adic
or adelic gauge fields, the problem is to find the locus of rational gauge fields.
The condition for a gauge field to be rational can be phrased entirely in terms
of global symmetry, and is shown in many cases to impose essentially computable
constraints on the p-adic gauge fields. These constraints are what we view as “arith-
metic E-L equations”. In number theory, they are also known as reciprocity laws.
The key point is that when the solution x ∈ V (Qp ) lies in the subset V (Q), then
the corresponding gauge field A(x) is rational. That is, we have a commutative
diagram

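\[
\begin{array}{ccc}
V(\mathbb{Q}) & \hookrightarrow & V(\mathbb{Q}_p) \\
\downarrow{\scriptstyle A} & & \downarrow{\scriptstyle A} \\
\{\text{rational gauge fields}\} & \hookrightarrow & \{\text{$p$-adic gauge fields}\}.
\end{array}
\]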

The E-L equation for A(x) can be translated, using p-adic Hodge theory, back to an
analytic equation satisfied by the point x. When V is a curve and the equation thus
obtained is nontrivial, this implies finiteness theorems for rational points. That is,
it is often possible to prove that
\[ A^{-1}(\text{rational gauge fields}) \]
is a finite set. One can thereby give new proofs of the finiteness of rational solutions
to a range of Diophantine equations, including the generalized Fermat equations$^{12}$
\[ ax^n + by^n = c \]
for $n \ge 4$. This finiteness was first proved by Gerd Faltings in 1983 as part of
his proof of the Mordell conjecture (cf. Sec. 3) using ideas and constructions of
arithmetic geometry. However, the proof in Ref. 12 has a number of theoretical
advantages as well as practical ones. On the one hand, the gauge-theoretical per-
spective has the potential to be applicable to a very broad class of phenomena
encompassing many of the central problems of current day number theory.28 On
the other, unlike Faltings’ proof, which is widely regarded as ineffective, the gauge-
theory proof leads to a computational method for actually finding rational solutions,
a theme that is currently under active investigation.7,8,14,15
It should be remarked that the map A that associates gauge fields to points has
been well-known since the ’50s when the variety V is an elliptic curve, an Abelian
variety, or generally, a commutative algebraic group. More general equations, for
example, curves of genus ≥ 2, require non-Abelian gauge groups, and it is in this
context that the analogy with physics assumes greater importance. Nonetheless, the
arithmetic E-L equations obtained thus far have not been entirely canonical. The
situation is roughly that of having an E-L equation without an action. On the other
hand, if we consider gauge theory with constant gauge groups (to be discussed be-
low), there is a very natural analogue of the Chern–Simons action on 3-manifolds,
for which it appears a theory can be developed in a manner entirely parallel to usual
topology. In particular, some rudiments of path integral quantization become
available, and give interpretations of $n$th power residue symbols as arithmetic linking
numbers.$^{28,10,11}$

3. Principal Bundles and Number Theory: Weil’s Constructions


In the language of geometry, gauge fields are principal bundles with connection,
and this is the form in which we will be discussing arithmetic analogues. Perhaps
it is useful to recall that over the last 40 years or so, the idea that a space X
can be fruitfully studied in terms of the field theories it can support has been
extraordinarily powerful in geometry and topology. The space of interest can start
out either as a target space of fields or as a source. Both cases are able to give rise
to suitable moduli spaces of principal bundles (with connections)
M (X, G) ,
which then can be viewed as invariants of X. Here, G might be a compact Lie group
or an algebraic group, while the moduli space might consist of flat connections, or
other spaces of solutions to differential equations, for example, the (self-dual) Yang–
Mills equation. Of course this idea is at least as old as Hodge theory for Abelian
G, while the non-Abelian case has seen an increasing array of deep interactions
with physics since the work of Atiyah, Bott, Drinfeld, Hitchin, Manin, Donaldson,
Simpson and Witten.3,5,4,18,43,49
However, my impression is that it is not widely known among mathematicians
that the study of principal bundles was from its inception closely tied to num-
ber theory. Probably, the first moduli space of principal bundles appeared as the
Jacobian of a Riemann surface, the complex torus target of the Abel–Jacobi map
\[ x \mapsto \left( \int_b^x \omega_1, \int_b^x \omega_2, \ldots, \int_b^x \omega_g \right) \bmod H_1(X, \mathbb{Z}), \]

in what can now be interpreted as the Hodge realization. In the early 20th century,46
André Weil gave the first algebraic construction of the Jacobian $J_C$ of a smooth
projective algebraic curve C of genus at least two defined over an algebraic number
field F . His main motivation was the Mordell conjecture, which said that the set
C(F ) of F -rational points should be finite. The algebraic nature of the construction
allowed JC to be viewed also as a variety in its own right over F that admitted an
embedding
\[ C(F) \hookrightarrow J_C(F), \qquad x \mapsto \mathcal{O}(x) \otimes \mathcal{O}(-b). \]
Weil proved that JC (F ) was a finitely-generated Abelian group, but was unable to
use this striking fact to prove the finiteness of C(F ). It appears to have taken him
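another decade or so$^{47}$ to realize that the Abelian nature of $J_C$ kept it from being
too informative about $C(F)$, and from there he went on to define
\[ M(C, \mathrm{GL}_n)(F) = \Bigl[ \prod_{x \in C} \mathrm{GL}_n(\mathcal{O}_{C,x}) \Bigr] \backslash\, \mathrm{GL}_n(\mathbb{A}_{F(C)}) \,/\, \mathrm{GL}_n(F(C)), \]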
the set of isomorphism classes of rank n vector bundles on C. Weil considered this as
a non-Abelian extension of the Jacobian, which might be applied to the non-Abelian
arithmetic of C. Even though the Mordell conjecture remained unproven for another
45 years, Weil’s construction went on to inspire many ideas in geometric invariant
theory and non-Abelian Hodge theory, much of it in interaction with Yang–Mills
theory.4,38,18,43
In order to bring about further applications to number theory, it turned out to
be critical to consider moduli of principal bundles over F itself, or over various rings
of integers in F , not just over other objects of algebro-geometric nature sitting over
F . These are the arithmetic gauge fields mentioned above.

4. Arithmetic Gauge Fields


For the most part, in this paper, we will present the theory in a pragmatic manner,
requiring as little theory as possible. What underlies the discussion is the topology
of the spectra of number fields, local fields, and rings of integers, but it is possible to
formulate most statements in the language of fields and groups. Roughly speaking,
when we refer to an object over (or on) a ring O, we will actually have in mind the
geometry Spec(O), the spectrumb of O.
Given a field K of characteristic zero, denote by
GK = Gal(K̄/K) ,
the Galois group of an algebraic closure K̄ of K. Thus, these are the field auto-
morphisms of K̄ that act as the identity on K. For any finite extension L of K
contained in K̄, the algebraic closure L̄ is the same as K̄, and
GL = {g ∈ GK | g|L = I} .
When $L/K$ is Galois, $G_L$ is the kernel of the projection
\[ G_K \longrightarrow \mathrm{Gal}(L/K). \]
In fact, we can write the Galois group as an inverse limit$^{c}$
\[ G_K = \varprojlim_{L} \mathrm{Gal}(L/K), \]

b But the reader will not be required to know the language of spectra or schemes until the very
end of the last section.
c An element of such an inverse limit is a compatible collection $(g_L)_L$, where the compatibility
means that if $L \supset L' \supset K$, then $g_L|_{L'} = g_{L'}$.
as L runs over the finite Galois extensions of K contained in K̄. This equips GK
with the topology of a compact, Hausdorff, totally disconnected space, with a basis
of open sets given by the cosets of the GL . In particular, it is homeomorphic to a
Cantor set. Such large inverse limits afford some initial psychological difficulty, but
form an essential part of arithmetic topology.
By a gauge group over K we mean a topological group U with a continuous
action of GK . A U -gauge field, or principal U -bundle over K is a topological space
P with a simply-transitive continuous right U -action and a continuous left GK
action that are compatible. This means

g(pu) = g(p)g(u) ,

for all $g \in G_K$, $u \in U$ and $p \in P$. We remark that a principal $U$-bundle as defined
corresponds naively only to flat connections in geometry. We will comment on this
analogy in more detail below. In algebraic geometry, the expression U -torsor is
commonly used in place of the differential geometric terminology. We will use both.
For the purposes of this paper, an arithmetic gauge group (or field ) will mean a
gauge group (or field) over an algebraic number field or a completion of an algebraic
number field.
There is an obvious notion of isomorphism of $U$-torsors, and a well-known
classification of $U$-torsors over $K$: Given $P$, choose $p \in P$. Then for any $g \in G_K$,
$g(p) = p\,c(g)$ for a unique $c(g) \in U$. It is easy to check that $g \mapsto c(g)$ defines a
continuous function
\[ c : G_K \longrightarrow U \]
such that
\[ c(gg') = c(g)\, g(c(g')). \]
The set of such functions is denoted $Z^1(G_K, U)$, and called the set of continuous
1-cocycles of $G_K$ with values in $U$. There is a right action of $U$ on $Z^1(G_K, U)$ by
\[ (c \cdot u)(g) = u^{-1}\, c(g)\, g(u), \]
which records the effect of replacing the chosen point $p$ by $pu$, and we define
\[ H^1(G_K, U) := Z^1(G_K, U)/U. \]

Lemma 4.1. The procedure described above defines a bijection
\[ \{\text{isomorphism classes of $U$-torsors}\} \simeq H^1(G_K, U). \]
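The cocycle identity and the effect of changing the base point follow directly from the two axioms of a torsor. The short computation below is a sketch under the conventions stated above only: $U$ acts on $P$ simply transitively from the right, $G_K$ acts on $U$ by group automorphisms, $g(pu) = g(p)g(u)$, and $c$ is defined by $g(p) = p\,c(g)$. The symbol $c_u$ for the cocycle attached to the new base point $pu$ is notation introduced here for illustration.

```latex
\begin{align*}
(gg')(p) &= g\bigl(g'(p)\bigr) = g\bigl(p\,c(g')\bigr)
          = g(p)\,g\bigl(c(g')\bigr) = p\,c(g)\,g\bigl(c(g')\bigr)
  &&\Longrightarrow\quad c(gg') = c(g)\,g\bigl(c(g')\bigr), \\
g(pu) &= g(p)\,g(u) = p\,c(g)\,g(u) = (pu)\,u^{-1}c(g)\,g(u)
  &&\Longrightarrow\quad c_u(g) = u^{-1}\,c(g)\,g(u).
\end{align*}
```

In both lines the implication uses simple transitivity: an element of $U$ is determined by where it sends the chosen point. The second line shows that a different choice of base point moves $c$ only within its $U$-orbit, which is why the class in $H^1(G_K, U)$ depends on $P$ alone.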

We will denote $H^1(G_K, U)$ also by $H^1(K, U)$, to emphasize its dependence on
the topology of $\mathrm{Spec}(K)$. A rather classical case is when $U = R(\bar{K})$, the $\bar{K}$-points
of an algebraic group $R$ over $K$, which we consider with the discrete topology. We
will often write $R$ for $R(\bar{K})$, when there is no danger of confusion. A trivial but
important example is $R = \mathbb{G}_m$, the multiplicative group, so that $\mathbb{G}_m(\bar{K}) = \bar{K}^\times$. In
this case, Hilbert's Theorem 90 (see Ref. 40) says
\[ H^1(K, \mathbb{G}_m) = 0. \]
Another important class is that of Abelian varieties, for example, elliptic curves. In
that case $H^1(K, R)$ is usually called the Weil–Châtelet group of $R$.$^{42}$
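Returning to Hilbert's Theorem 90, its classical finite-level form can be tested by hand. The sketch below is an added illustration that assumes the quadratic extension Q(i)/Q, with elements stored as pairs of rationals and complex conjugation as the nontrivial Galois element. In this setting the theorem says that every element a of norm 1 can be written as b divided by the conjugate of b; the function name `hilbert90_witness` and the choice b = 1 + a are conventions of this example, not of the text.

```python
from fractions import Fraction

# Elements of Q(i) are pairs (re, im) of Fractions.

def gmul(x, y):
    """Product in Q(i)."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def conj(x):
    """The nontrivial element of Gal(Q(i)/Q)."""
    return (x[0], -x[1])

def norm(x):
    """Norm from Q(i) down to Q, namely x * conj(x)."""
    return x[0] ** 2 + x[1] ** 2

def ginv(x):
    """Inverse of a nonzero element of Q(i)."""
    n = norm(x)
    return (x[0] / n, -x[1] / n)

def hilbert90_witness(a):
    """Given a in Q(i) of norm 1, return b with a = b / conj(b)."""
    if norm(a) != 1:
        raise ValueError("element must have norm 1")
    if a == (-1, 0):
        return (Fraction(0), Fraction(1))      # b = i, since i / (-i) = -1
    return (1 + a[0], a[1])                    # b = 1 + a works whenever a != -1

a = (Fraction(3, 5), Fraction(4, 5))           # norm 1, from the triple 3, 4, 5
b = hilbert90_witness(a)
print(b, gmul(b, ginv(conj(b))) == a)
```

The choice b = 1 + a works because a times its conjugate is 1, so the conjugate of b equals b divided by a. Rational points on the unit circle, and hence Pythagorean triples, are exactly the norm 1 elements handled here.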
Some useful operations on torsors include
(1) Pushout: If $f : U \to U'$ is a continuous homomorphism of groups over $K$,
then there is a pushout functor $f_*$ that takes $U$-torsors to $U'$-torsors.
The formula is
\[ f_*(P) = [P \times U']/U, \]
where the right action of $U$ on the product is $(p, u)v = (pv, f(v^{-1})u)$. The
resulting quotient still has the $U'$-action: $[(p, u)]u' = [(p, uu')]$.
(2) Product: When $P$ is a $U$-torsor and $P'$ is a $U'$-torsor, $P \times P'$ is a $U \times U'$-torsor.
Note that if $U$ is Abelian, the group law
\[ m : U \times U \longrightarrow U \]
is a homomorphism. Using this,
\[ (P, P') \mapsto m_*(P \times P') \]
defines a bifunctor on principal $U$-bundles and an Abelian group law on $H^1(K, U)$.
However, if $U$ is non-Abelian, there is no group structure on the $H^1$ and matters
become more subtle and interesting.
When $U$ is an Abelian group, one can define cohomology groups in every degree
\[ H^i(K, U) := \mathrm{Ker}\bigl[d : C^i(G_K, U) \to C^{i+1}(G_K, U)\bigr] \,/\, \mathrm{Im}\bigl[d : C^{i-1}(G_K, U) \to C^i(G_K, U)\bigr]. \]
Here, $C^i(G, U)$ is the set of continuous maps from $G^i$ to $U$, while the differential $d$
is defined in a natural combinatorial manner.$^{40}$ One checks that $H^0(K, U) = U^{G_K}$,
the set of invariants of the action, and that the cohomology groups fit into a long
exact sequence as usual. That is, if
\[ 1 \to U'' \to U \to U' \to 1 \]
is exact, then we get
\[
\begin{aligned}
0 &\to (U'')^{G_K} \to U^{G_K} \to (U')^{G_K} \\
  &\to H^1(K, U'') \to H^1(K, U) \to H^1(K, U') \\
  &\to H^2(K, U'') \to H^2(K, U) \to H^2(K, U') \to \cdots
\end{aligned}
\]
The sequence up to the $H^1$ terms remains exact even when the groups are
non-Abelian, except the meaning needs to be interpreted a bit carefully.
An important case is $U = R(\bar{K})$ for $R$ a connected Abelian algebraic group. In
this case, multiplication by $n$ induces an exact sequence
\[ 0 \to R[n] \to R \xrightarrow{\;n\;} R \to 0, \]
where $A[n]$ generally denotes the $n$-torsion subgroup of an Abelian group $A$. Hence,
we get the long exact sequence
\[ 0 \to (R[n])^{G_K} \to R^{G_K} \xrightarrow{\;n\;} R^{G_K} \to H^1(K, R[n]) \to H^1(K, R) \xrightarrow{\;n\;} H^1(K, R) \to \cdots \]
Note here that $R^{G_K} = R(K)$, the $K$-rational points of $R$. Thus, we get an injection
\[ R(K)/nR(K) \hookrightarrow H^1(K, R[n]), \]

indicating how principal bundles for R[n] can encode information about the group
of rational points. When R is an elliptic curve, this is the basis of the descent
algorithm for computing the Mordell–Weil group, about which we will say more
later.
Some genuinely topological groups $U$ arise from taking inverse limits. For
example, we have the group $\mu_n \subset \mathbb{G}_m$ of $n$th roots of unity. They are related by the
system of power maps
\[ \mu_{ab} \xrightarrow{\;(\cdot)^a\;} \mu_b, \]
so that we can take an inverse limit
\[ \hat{\mathbb{Z}}(1) := \varprojlim_{n} \mu_n. \]
This is a topological group isomorphic to $\hat{\mathbb{Z}}$, the profinite completion$^{d}$ of $\mathbb{Z}$, but
with a nontrivial action of $G_K$. It is common to focus on a set of prime powers for
a fixed prime $p$, and define
\[ \mathbb{Z}_p(1) = \varprojlim_{n} \mu_{p^n}. \]
As a topological group, $\mathbb{Z}_p(1) \simeq \mathbb{Z}_p$, the group of $p$-adic integers.$^{e}$ It is a simple
example of a compact $p$-adic Lie group. Principal bundles for this are then classified
by $H^1(K, \mathbb{Z}_p(1))$.
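An element of an inverse limit such as the $p$-adic integers is nothing more than a compatible sequence of residues, one modulo each power of $p$. The sketch below is an added illustration with hypothetical names and parameters: it builds the first few residues of a square root of 2 in the 7-adic integers by Newton (Hensel) lifting from the solution 3 modulo 7, and checks that each residue reduces to the previous one. It assumes Python 3.8 or later, where the three-argument `pow` accepts the exponent -1 for modular inverses.

```python
def hensel_sqrt(a, p, x1, depth):
    """Residues x_n mod p^n, n = 1..depth, with x_n^2 = a (mod p^n).

    Assumes p is odd, x1^2 = a (mod p) and x1 is not divisible by p.
    """
    x = x1 % p
    residues = [x]
    for n in range(1, depth):
        modulus = p ** (n + 1)
        # Newton step x <- x - (x^2 - a) / (2x), carried out modulo p^(n+1).
        inverse = pow(2 * x, -1, modulus)
        x = (x - (x * x - a) * inverse) % modulus
        residues.append(x)
    return residues

def is_compatible(residues, p):
    """Check that the (n+1)st residue reduces to the nth one modulo p^n."""
    return all(residues[n] % p ** n == residues[n - 1]
               for n in range(1, len(residues)))

sqrt2_in_z7 = hensel_sqrt(2, 7, 3, 6)
print(sqrt2_in_z7, is_compatible(sqrt2_in_z7, 7))
```

The compatibility check is exactly the condition defining an element of the inverse limit; the list is a finite truncation of a single 7-adic integer whose square is 2, even though 2 has no rational square root.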
In fact, we have an isomorphism
\[ H^1(K, \hat{\mathbb{Z}}(1)) \simeq \varprojlim_{n} H^1(K, \mu_n), \]

d Given any group $A$, the profinite completion $\hat{A}$ of $A$ is by definition
\[ \hat{A} = \varprojlim_{N} A/N, \]
where $N$ runs over the normal subgroups of finite index.
e Recall that the $p$-adic integers are sometimes represented as power series $\sum_{i=0}^{\infty} a_i p^i$ with
$0 \le a_i \le p - 1$. Another representation is $\mathbb{Z}_p = \varprojlim_{n} \mathbb{Z}/p^n$.
so we can consider a $\hat{\mathbb{Z}}(1)$-torsor as being a compatible collection of $\mu_n$-torsors as
we run over $n$. The exact sequence
\[ 1 \to \mu_n \to \mathbb{G}_m \xrightarrow{\;n\;} \mathbb{G}_m \to 1 \]
gives rise to the long exact sequence
\[ 1 \to \mu_n(K) \to K^\times \xrightarrow{\;n\;} K^\times \to H^1(K, \mu_n) \to 0, \]
so that we get an isomorphism
\[ K^\times/(K^\times)^n \simeq H^1(K, \mu_n). \]
Concretely, the torsor associated to an element $a \in K^\times$ is simply the set $a^{1/n}$ of $n$th
roots of $a$ in $\bar{K}$. This clearly admits an action of $\mu_n$. That is, the group $\mu_n$ can be
thought of as “internal symmetries” of the set $a^{1/n}$. This torsor only depends on
the class of $a$ modulo $n$th powers, and is trivial if and only if $a$ has an $n$th root
in K. The point is that the choice of any nth root in K̄ will determine a bijection
to µn , but this will be equivariant for the GK -action exactly if you choose an nth
root in K itself, which may or may not be possible. In discussing torsors over fields,
it will be important in this way to keep track of both the U -action, the internal
symmetries, and the GK -action, which can be thought of as the analogue of external
(spacetime) symmetries in physics.
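The torsor of $n$th roots can be worked out explicitly in the smallest case. The sketch below is an added illustration under stated assumptions: $K = \mathbb{Q}$, $n = 2$, and everything takes place inside the quadratic field $\mathbb{Q}(\sqrt{2})$, whose Galois group has the two elements named `id` and `sigma` here. It computes the cocycle $c(g) = g(p)/p$ for a chosen root $p$, checks the cocycle condition, and compares $a = 2$ (no rational square root) with $a = 4$ (rational square root 2).

```python
from fractions import Fraction
from itertools import product

D = 2  # we work in Q(sqrt(D)); the pair (u, v) stands for u + v*sqrt(D)

def mul(x, y):
    """Product in Q(sqrt(D))."""
    return (x[0] * y[0] + D * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def inv(x):
    """Inverse of a nonzero element of Q(sqrt(D))."""
    n = x[0] * x[0] - D * x[1] * x[1]
    return (x[0] / n, -x[1] / n)

# The Galois group of Q(sqrt(D)) over Q and its multiplication table.
GALOIS = {"id": lambda x: x, "sigma": lambda x: (x[0], -x[1])}
COMPOSE = {("id", "id"): "id", ("id", "sigma"): "sigma",
           ("sigma", "id"): "sigma", ("sigma", "sigma"): "id"}

def kummer_cocycle(root):
    """c(g) = g(root) / root, which lands in mu_2 = {1, -1}."""
    return {name: mul(g(root), inv(root)) for name, g in GALOIS.items()}

def is_cocycle(c):
    """Check c(gh) = c(g) * g(c(h)) for all pairs of group elements."""
    return all(c[COMPOSE[g, h]] == mul(c[g], GALOIS[g](c[h]))
               for g, h in product(GALOIS, repeat=2))

ROOT_OF_2 = (Fraction(0), Fraction(1))   # sqrt(2), not in Q
ROOT_OF_4 = (Fraction(2), Fraction(0))   # 2, a square root of 4 lying in Q

c2 = kummer_cocycle(ROOT_OF_2)
c4 = kummer_cocycle(ROOT_OF_4)
print(c2["sigma"], c4["sigma"], is_cocycle(c2), is_cocycle(c4))
```

Because $\mu_2 = \{1, -1\}$ is Abelian and fixed by the Galois action, changing the base point does not alter the cocycle, so the value $-1$ at `sigma` for $a = 2$ certifies a nontrivial torsor, while $a = 4$ gives the trivial one.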

5. Homotopy and Gauge Fields


We will generalize the discussion of internal and external symmetries of the previous
section. Let $V$ be a variety defined over $K$ and $b \in V(K)$ a $K$-rational point.$^{f}$ From
this data, one gets a gauge group as well as torsors on $K$ associated to rational
points of $V$. The gauge group will be
\[ U = \pi_1(\bar{V}, b), \]
one of the many different versions of the fundamental group$^{g}$ of $\bar{V}$, which is $V$
regarded$^{h}$ as a variety over $\bar{K}$. We will not take care to distinguish notationally
between different types of fundamental groups, since the context will make it clear
which one is being referred to. (Conceptually, it is also useful to regard them all as
essentially the same.) Whenever $K$ is embedded into $\mathbb{C}$, it will be a completion of the
topological fundamental group of $V(\mathbb{C})$, either in a profinite or an algebraic sense.
However, the key point is that it admits an action of $G_K$, and has the structure

f In most of the work thus far, a basepoint b was used. It is possible to develop the theory without
such a choice. But then, instead of a moduli space of torsors, we will be dealing with a gerbe.
g We will not give the precise definitions in terms of fiber functors. A good general introduction is

the book of Szamuely,44 while the algebraic group realization we will use below is given a careful
discussion in Ref. 16.
h The general principle is that varieties over algebraically closed fields lie within the realm of usual
geometry, while there is always an arithmetic component to geometry over non-closed fields. But
even in dealing with such subtleties, one constantly uses geometry over the algebraic closure.
of a gauge group over $K$. The $G_K$-action is usually highly nontrivial, and this is a
main difference from geometric gauge theory, where the gauge group tends to be
constant over spacetime. Now, given any other point $x \in V(K)$, we associate to it
the homotopy classes of paths
\[ P(x) := \pi_1(\bar{V}; b, x) \]
from $b$ to $x$, which then carries compatible actions of both $G_K$ and $\pi_1(\bar{V}, b)$.
That is,
the loops based at $b$ are acting as internal symmetries of sets of paths
emanating from $b$, while $G_K$ acts compatibly as external symmetries.$^{i}$
In order to provide some intuition for the $G_K$-action, we give a rather concrete
description in the case where $\pi_1(\bar{V}, b)$ is the profinite étale fundamental group. It
is worth stressing again that this is just the profinite completion of the topological
fundamental group of $V(\mathbb{C})$, the complex manifold associated to $V$ via some complex
embedding of $K$. However, the remarkable, albeit elementary, fact is the existence
of the “hidden” Galois symmetry. To describe it, one approach is to construct the
fundamental group and path spaces using covering spaces.
Recall that for a manifold $M$, if
\[ f : \tilde{M} \longrightarrow M \]
is the universal covering space, then the choice of a basepoint $m \in M$ and a lift
$\tilde{m} \in \tilde{M}_m := f^{-1}(m)$ determines a canonical bijection
\[ \pi_1(M, m) \simeq \tilde{M}_m \]
that takes $e$ to $\tilde{m}$. This bijection is induced by the homotopy lifting of paths.
Similarly,
\[ \pi_1(M; m, m') \simeq \tilde{M}_{m'}. \]
If we replace $M$ by the variety $\bar{V}$, there is still a notion of an algebraic universal
covering
\[ \tilde{\bar{V}} \longrightarrow \bar{V}, \]
except it is actually an inverse system
\[ (\bar{V}_i \longrightarrow \bar{V})_{i \in I} \]
of finite algebraic covers, each of which is unramified, that is, surjective on tangent
spaces. The universality means that any finite connected unramified cover is
dominated by one of the $\bar{V}_i$. For an easy example, consider the compatible system of $n$th
power maps
\[ \bigl( \bar{\mathbb{G}}_m \xrightarrow{\;(\cdot)^n\;} \bar{\mathbb{G}}_m \bigr)_n. \]
These together form the algebraic universal cover $\tilde{\bar{\mathbb{G}}}_m \longrightarrow \bar{\mathbb{G}}_m$.

i This is a very elementary idea, but worth emphasizing in my view.


Now if we choose a basepoint $b \in V(K)$ and a lift$^{j}$ $\tilde{b} \in \tilde{\bar{V}}$, then there is a unique
$K$-model
\[ \tilde{V} \longrightarrow V \]
of $\tilde{\bar{V}}$, that is, a system
\[ (V_i \longrightarrow V)_i \]
defined over $K$ that gives rise to the universal covering over $\bar{K}$, characterized by
the property that $\tilde{b}$ consists of $K$-rational points of the system.$^{k}$
Even though we have not given a formal definition of the profinite étale
fundamental group, a useful fact is that there are canonical bijections
\[ \pi_1(\bar{V}, b) \simeq \tilde{V}_b \]
and
\[ \pi_1(\bar{V}; b, x) \simeq \tilde{V}_x. \]
That is, the fundamental group and the homotopy class of paths can be identified
with the fibers of the universal covering space. This way of presenting them makes
it somewhat hard to see the torsor structure. On the other hand, it does make it
apparent how GK is acting. The problem of describing this action can be thought
of as that of giving some manageable construction of Ṽ . This is in general a quite
hard problem and typically, one studies some quotient of the fundamental group
corresponding to special families of covers, such as Abelian covers or solvable covers.
An alternative is to study linearizations of the fundamental group, which we will
discuss below.
Anyway, we end up with a map
\[ V(K) \longrightarrow H^1(G_K, \pi_1(\bar{V}, b)), \qquad x \mapsto \pi_1(\bar{V}; b, x), \]
encoding points of $V$ into torsors. Typically $H^1(G_K, \pi_1(\bar{V}, b))$ will be much bigger
than $V(K)$. That is, there will be many torsors$^{l}$ that are not of the form $P(x)$ for
any point $x$. But the important thing for us is that the space of torsors often
carries a natural geometry, remarkably similar to the geometry of classical solutions
to a geometric gauge theory. This added geometric structure turns out to be very
useful in grappling with the sparse structure of V (K).

j By this, we mean a compatible system $b_i$ of basepoint lifts to the finite covers $\bar{V}_i \to \bar{V}$.
Compatibility here means that whenever you have a map $\bar{V}_i \to \bar{V}_j$ of covers, $b_i$ is taken to $b_j$.
k One way to see this is to note that it is the pointed covering $(\tilde{\bar{V}}, \tilde{b}) \to (\bar{V}, b)$ that is really
universal, in that any other pointed covering is dominated by a unique map from $\tilde{\bar{V}}$. By applying
this to Galois conjugates of $(\tilde{\bar{V}}, \tilde{b})$, we get descent data that gives a $K$-model for the pointed
system. This kind of reasoning is usually called “Weil descent”. For details, see Ref. 36.
l However, there are important cases where this is conjectured to be a bijection. This is the subject
of Grothendieck’s section conjecture.21



120 Topology and Physics

6. The Local to Global Problem, Reciprocity Laws, and Euler Lagrange Equations
For the remainder of this paper, we will assume that U is either a p-adic Lie group^m
for a fixed prime p or a discrete group. (Depending on convention, the latter can
be included in the former.) So as to avoid discussing detailed algebraic number
theory, we will focus mostly on K = Q or K = Qv , where v could be a prime p
or the symbol ∞. We will refer to any such v as a place of Q, as it corresponds
to an equivalence class of absolute values. The field Qv is obtained by completing
with respect to an absolute value corresponding to v. Thus, we have the field Qp of
p-adic numbers, while Q∞ denotes the field of real numbers R.
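$$\begin{array}{ccccc}
 & & \mathbb{R} & & \\
 & & \uparrow & & \\
\mathbb{Q}_2 & \hookleftarrow & \mathbb{Q} & \hookrightarrow & \mathbb{Q}_3 \\
 & \swarrow & & \searrow & \\
\mathbb{Q}_5 & & & & \mathbb{Q}_7 \ \cdots
\end{array}$$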

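We will denote by $\bar{\mathbb{Q}}$ the field of algebraic numbers and
$$G := G_{\mathbb{Q}} = \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}) .$$
We denote by $\bar{\mathbb{Q}}_v$ an algebraic closure of $\mathbb{Q}_v$ and
$$G_v = G_{\mathbb{Q}_v} = \mathrm{Gal}(\bar{\mathbb{Q}}_v/\mathbb{Q}_v) .$$
For each $v$, we choose an embedding $\bar{\mathbb{Q}} \hookrightarrow \bar{\mathbb{Q}}_v$. Restricting the action of $G_v$ to $\bar{\mathbb{Q}}$
then determines an embedding^n
$$G_v \hookrightarrow G$$
for each $v$.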
m We will not define this notion here, but rely on examples like $\mathbb{Z}_p$, $GL_n(\mathbb{Z}_p)$, p-adic points of
more general reductive algebraic groups, finite groups, and group extensions formed out of such
groups. For a systematic treatment, see Ref. 41.
n The fact that this is an embedding is not entirely obvious. It has to do with the denseness of
algebraic numbers inside $\bar{\mathbb{Q}}_v$. The reader should be aware that $G_v$ is a very thin subgroup of $G$.
It is topologically finitely generated and has an explicit description.40 The structure of $G$, on the
other hand, is still very mysterious.

We will need one more mildly technical fact about the structure of $G_p$ for primes
p.39 The field $\mathbb{Q}_p$ has an integral subring $\mathbb{Z}_p$, the p-adic integers. The integral
closure^o of $\mathbb{Z}_p$ in $\bar{\mathbb{Q}}_p$ is a subring
$$\mathcal{O}_{\bar{\mathbb{Q}}_p} \subset \bar{\mathbb{Q}}_p$$
which is stabilized by the $G_p$-action. The ring $\mathcal{O}_{\bar{\mathbb{Q}}_p}$ has a unique maximal ideal $\mathfrak{m}_p$,
and
$$\mathcal{O}_{\bar{\mathbb{Q}}_p}/\mathfrak{m}_p \simeq \bar{\mathbb{F}}_p ,$$
an algebraic closure of $\mathbb{F}_p$. Thus, acting on this quotient ring determines a homomorphism
$$G_p \to G^u_p = \mathrm{Aut}(\mathcal{O}_{\bar{\mathbb{Q}}_p}/\mathfrak{m}_p) \simeq \mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p)$$
that turns out to be surjective.^p The last Galois group is (topologically) generated by one
element $Fr'_p$ whose effect is $x \mapsto x^{1/p}$. (This is the group-theoretic inverse of the usual
generator, the pth power map.^q) Any lift $Fr_p$ of $Fr'_p$ to $G_p \subset G$ is called a Frobenius
element at p. The kernel of the homomorphism $G_p \to G^u_p$ is denoted by $I_p$ and
called the inertia subgroup at p.
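
The Frobenius generator and its inverse can be made concrete on a small finite field. The following is a minimal sketch, assuming the toy model $\mathbb{F}_8 = \mathbb{F}_2[t]/(t^3+t+1)$ with elements stored as 3-bit integers; the choice of field and the names `mul`, `frob` and `frob_inverse` are illustrative and do not come from the text.

```python
# Toy model: F_8 = F_2[t]/(t^3 + t + 1), elements stored as 3-bit integers.
MOD = 0b1011  # t^3 + t + 1, irreducible over F_2

def mul(a, b):
    """Multiply two elements of F_8 (carry-less product reduced mod MOD)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # degree reached 3: reduce by the modulus
            a ^= MOD
    return result

def power(a, n):
    """a^n in F_8 by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = mul(result, a)
    return result

def frob(a):
    """The p-th power map x -> x^p, here with p = 2."""
    return power(a, 2)

def frob_inverse(a):
    """x -> x^(1/2); on F_8 this is x -> x^4, because x^8 = x."""
    return power(a, 4)

elements = range(8)
# Elements fixed by Frobenius form the prime field F_2.
fixed_field = [a for a in elements if frob(a) == a]
```

On this finite level the Frobenius has order 3, and the map $x \mapsto x^{1/p}$ of the text is realized as $x \mapsto x^4$. Passing to the limit over all finite fields is what makes the generator only topological.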
The key interaction for applications to arithmetic is between $H^1(\mathbb{Q}, U)$ and
the various $H^1(\mathbb{Q}_v, U)$. Since $G_v$ injects into $G$, there is a restriction map
$$H^1(\mathbb{Q}, U) \to H^1(\mathbb{Q}_v, U)$$
for each $v$, which we put together into
$$\mathrm{loc}: H^1(\mathbb{Q}, U) \to \prod_v H^1(\mathbb{Q}_v, U) .$$
Here is the main problem of arithmetic gauge theory:

For a gauge group U over Q, describe the image of loc.

Any kind of a solution to this problem is called a local-to-global principle in number
theory.
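
The simplest instance of the localization map can be computed by hand. The sketch below assumes the discrete gauge group $U = \{\pm 1\}$ with trivial Galois action, together with the standard Kummer-theory identification $H^1(K, \{\pm 1\}) = K^\times/(K^\times)^2$; neither is spelled out in the text. Under that assumption a class is a nonzero rational number up to squares, and its component at $v$ is trivial exactly when the number is a square in $\mathbb{Q}_v$. The function names are hypothetical.

```python
from fractions import Fraction

def is_local_square(a, v):
    """Is the nonzero rational a a square in Q_v?  v is a prime or the string 'inf'."""
    a = Fraction(a)
    m = a.numerator * a.denominator   # same square class as a, but an integer
    if v == "inf":
        return m > 0
    p, k = v, 0
    while m % p == 0:                 # strip the power of p
        m //= p
        k += 1
    if k % 2:                         # odd valuation: never a square
        return False
    if p == 2:
        return m % 8 == 1             # 2-adic units: the squares are exactly 1 mod 8
    return pow(m, (p - 1) // 2, p) == 1   # Euler's criterion, lifted by Hensel's lemma

def localize(a, places):
    """Components of loc applied to the class of a (True means locally trivial)."""
    return {v: is_local_square(a, v) for v in places}
```

For example, `localize(-1, ["inf", 2, 3, 5])` records that $-1$ is a square in $\mathbb{Q}_5$ but not in $\mathbb{R}$, $\mathbb{Q}_2$ or $\mathbb{Q}_3$. Only finitely many places can be inspected this way, so this illustrates the components of loc rather than computing its image.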
o This refers to the elements of the field extension that satisfy a nontrivial monic polynomial
equation with coefficients in $\mathbb{Z}_p$. This notion is most natural when we consider the integral closure
of $\mathbb{Z}$ in a field extension $F$ of $\mathbb{Q}$ of dimension $d$. In this setting, the integral closure is the maximal
subring of $F$ isomorphic to $\mathbb{Z}^d$ as a group.
p The superscript is supposed to stand for “unramified”, corresponding to the fact that $G^u_p$ is the
Galois group of the maximal extension in $\bar{\mathbb{Q}}_p$ that is unramified over $\mathbb{Q}_p$.
q The reason for using the inverse rather than the natural p-power map has to do with the geometric
Frobenius map acting on étale cohomology. This is a rather confusing convention, about which I
would suggest the reader refrain from asking further at the moment.

At this point, we pursue a bit more analogy with geometric gauge fields. As
discussed already, geometric gauge theory with symmetry group $U$ (in this case a
real Lie group) deals with a space $\mathcal{A}$ of principal $U$-connections on a spacetime
manifold $X$. The usual convention these days is to take $\mathcal{A}$ to be a space of $C^\infty$
connections. There is an action functional
$$S: \mathcal{A} \to \mathbb{R} ,$$
that is invariant under gauge transformations $\mathcal{U}$ (connection preserving automorphisms
of the principal bundle). The space of classical solutions is
$$M(X, U) = \mathcal{A}^{EL}/\mathcal{U} ,$$
where $\mathcal{A}^{EL} \subset \mathcal{A}$ is the set of connections that satisfy the E-L equations for the
functional $S$. The classical problem is to describe the space $M(X, U)$, or to find
points in $M(X, U)$ corresponding to specific boundary conditions. The quantum
problem is to compute path integrals like
$$\int_{\mathcal{A}/\mathcal{U}} O_1(A) O_2(A) \cdots O_k(A) \exp(-S(A))\, dA ,$$
where the $O_i$ are local functions of $A$.


Now, from the point of view of classical physics, $M(X, U)$ will be the fields that
we actually observe, and the embedding
$$M(X, U) \subset \mathcal{A}/\mathcal{U}$$
corresponds to a model for “quantum fluctuations” around classical solutions. However,
the justification for considering $\mathcal{A}/\mathcal{U}$ as the space of quantum fluctuations, or
“off-shell states” of the field, is not so clear. It depends on the choice of an initial
mathematical model inside which the classical solutions were constructed. Some
might argue that it is hard to even describe $M(X, U)$ unless one starts from $\mathcal{A}/\mathcal{U}$.
This is false for physical reasons, since $M(X, U)$ is supposed to be a model of the
classical states, which should make intrinsic sense regardless of the space in which
we seek them.^r Another objection comes from specific examples such as 3d Chern–
Simons theory or 2d Yang–Mills theory, where $M(X, U)$ can easily be a space of
flat connections. In that case, it has a topological description as a space of
representations of the fundamental group of $X$. It is mostly this last case we have in
mind when we consider the arithmetic versions. A model for quantum fluctuations
of $M(X, U)$ might then just as well be collections of punctual local systems around
points of $X$, in the spirit of the jagged or singular paths that occur in Feynman’s
motivational description of the path integral.50 This will be especially appropriate
if we allow a model of $X$ that can have complicated local topology.
It is from this point of view that we regard a collection $(P_v)_v$ of principal bundles
over the $\mathbb{Q}_v$, as $v$ runs over the places of $\mathbb{Q}$, as a quantum arithmetic gauge field on $\mathbb{Q}$.
The problem of finding which collections come from a rational gauge field, that is, a
principal $U$-bundle over $\mathbb{Q}$, is the problem of describing the image of the localization
map. It is also an arithmetic analogue of finding and solving the E-L equation.

r Of course this is not quite true. Classical states should be a statistical state of some sort arising
out of the quantum theory, and hence, dependent on the quantum states. However, we are following
here the tentative treatment found in standard expositions of path integral quantization.

7. The Tate Shafarevich Group and Abelian Gauge Fields


We should remark that the kernel of the localization map is also frequently of
importance. Its elements are the locally trivial torsors.^s The best known
case is when we have an elliptic curve $E$. The kernel of localization is then called
the Tate–Shafarevich group of $E$, and denoted^t
$$\text{Ш}(\mathbb{Q}, E) .$$
A simple example is when $E$ is given by the (non-Weierstrass) equation
$$x^3 + y^3 + 60z^3 = 0 .$$
Then the curve $C$ given by
$$3x^3 + 4y^3 + 5z^3 = 0$$
is an element^u of $\text{Ш}(\mathbb{Q}, E)$. The group $H^1(\mathbb{Q}, E)$ is an infinite torsion^v group.
Remarkably, $\text{Ш}(\mathbb{Q}, E)$ is conjectured to be finite.
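
A brute-force search gives a feel for what "locally trivial but globally nontrivial" means for the curve $3x^3 + 4y^3 + 5z^3 = 0$. This is only a sketch under stated limitations. A nonzero solution modulo $p$ is necessary for a point over $\mathbb{Q}_p$, but a Hensel-type lifting argument (not carried out here) is needed to make it sufficient. An empty search box is consistent with the absence of rational points but does not prove it. The function names and the bounds are illustrative choices.

```python
from itertools import product

def selmer_form(x, y, z):
    """The cubic form defining the curve C."""
    return 3 * x**3 + 4 * y**3 + 5 * z**3

def solution_mod_p(p):
    """Some nonzero (x, y, z) modulo p on the curve, or None if there is none."""
    for x, y, z in product(range(p), repeat=3):
        if (x, y, z) != (0, 0, 0) and selmer_form(x, y, z) % p == 0:
            return (x, y, z)
    return None

def small_integer_solutions(bound):
    """Nonzero integer solutions with all coordinates in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    return [(x, y, z) for x, y, z in product(rng, repeat=3)
            if (x, y, z) != (0, 0, 0) and selmer_form(x, y, z) == 0]

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23]
local_solutions = {p: solution_mod_p(p) for p in primes}
```

Every prime in the list yields a solution modulo $p$, while `small_integer_solutions(15)` comes back empty.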
The relationship between the localization map and Ш is a crucial part of the
so-called “descent algorithm” for computing the points on an elliptic curve. Recall
that $E(\mathbb{Q})$ is a finitely generated Abelian group, that is,
$$E(\mathbb{Q}) \simeq \mathbb{Z}^r \times (\text{finite Abelian group})$$
and that its torsion subgroup is easy to compute.42 However, the rank $r$ is still a
difficult quantity and the conjecture of Birch and Swinnerton-Dyer (BSD) is mainly
concerned with the computation of $r$. The standard method at the moment is to
look at
$$E(\mathbb{Q})/pE(\mathbb{Q})$$
for some prime $p$, often $p = 2$. If we know the structure of this group, it is elementary
group theory to figure out the rank of $E(\mathbb{Q})$, given that we also know the torsion.
The long exact sequence arising from
$$0 \to E[p] \to E \xrightarrow{\;p\;} E \to 0$$
gives
$$0 \to E(\mathbb{Q})/pE(\mathbb{Q}) \hookrightarrow H^1(\mathbb{Q}, E[p]) \xrightarrow{\;i\;} H^1(\mathbb{Q}, E)[p] \to 0 .$$
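
The "elementary group theory" step can be written out. If $E(\mathbb{Q}) \simeq \mathbb{Z}^r \times T$ with $T$ finite, then $E(\mathbb{Q})/pE(\mathbb{Q})$ is an $\mathbb{F}_p$-vector space of dimension $r$ plus the number of cyclic factors of $T$ whose order is divisible by $p$. The sketch below assumes that the order of the quotient and a cyclic decomposition of $T$ are already known; the function name and input format are hypothetical.

```python
def rank_from_quotient(quotient_order, p, torsion_invariants):
    """Rank r of E(Q) = Z^r x T, given |E(Q)/pE(Q)| and T = product of Z/d
    for d in torsion_invariants."""
    # dim = log_p of the order of E(Q)/pE(Q), which is an F_p-vector space
    dim, order = 0, quotient_order
    while order % p == 0:
        order //= p
        dim += 1
    if order != 1:
        raise ValueError("quotient order must be a power of p")
    # a cyclic factor Z/d contributes one dimension to T/pT exactly when p divides d
    torsion_dim = sum(1 for d in torsion_invariants if d % p == 0)
    return dim - torsion_dim
```

For instance, a quotient of order 16 with $p = 2$ and torsion $\mathbb{Z}/2 \times \mathbb{Z}/4$ gives rank $4 - 2 = 2$.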

s One of the complexities associated with arithmetic gauge fields is that many are not locally
trivial, unlike the geometric situation.
t Read “Sha”.
u This also is not so easy to see. There is an action of $E$ on $C$, which arises from the fact that $E$
is actually the Jacobian of $C$.2
v To see this, one notes that all elements of $H^1(\mathbb{Q}, E)$ can be represented by the $\bar{\mathbb{Q}}$ points of an
algebraic curve $C$ in such a way that the action is algebraic and defined over $\mathbb{Q}$.42 The torsor becomes
trivial as soon as $C$ has a rational point. Now $C$ has a rational point over some finite field extension
$K$ of $\mathbb{Q}$. That is, the class will become trivial under the restriction map $H^1(\mathbb{Q}, E) \to H^1(K, E)$.
However, there is also a “trace” map $H^1(K, E) \to H^1(\mathbb{Q}, E)$ defined by summing a Galois
conjugacy class of torsors. Also, the composed map $H^1(\mathbb{Q}, E) \to H^1(K, E) \to H^1(\mathbb{Q}, E)$ is
simply multiplication by $[K : \mathbb{Q}]$. Thus, the element $[C] \in H^1(\mathbb{Q}, E)$ is killed by this degree.

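We have
$$\text{Ш}(\mathbb{Q}, E)[p] \subset H^1(\mathbb{Q}, E)[p] .$$
Define the $p$-Selmer group $\mathrm{Sel}(\mathbb{Q}, E[p])$ to be the inverse image of $\text{Ш}(\mathbb{Q}, E)[p]$ under
the map $i: H^1(\mathbb{Q}, E[p]) \to H^1(\mathbb{Q}, E)[p]$, so that it fits into an exact sequence
$$0 \to E(\mathbb{Q})/pE(\mathbb{Q}) \to \mathrm{Sel}(\mathbb{Q}, E[p]) \to \text{Ш}(\mathbb{Q}, E)[p] \to 0 .$$
Described in words, $\mathrm{Sel}(\mathbb{Q}, E[p])$ consists of the $E[p]$-torsors that become locally
trivial when pushed out to $E$-torsors.
The key point is that the Selmer group is effectively computable, and this already
gives us a bound on the Mordell–Weil group of $E$. This is then refined by way of
the diagram
$$\begin{array}{ccccccccc}
0 & \to & E(\mathbb{Q})/p^n E(\mathbb{Q}) & \to & \mathrm{Sel}(\mathbb{Q}, E[p^n]) & \to & \text{Ш}(\mathbb{Q}, E)[p^n] & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & E(\mathbb{Q})/pE(\mathbb{Q}) & \to & \mathrm{Sel}(\mathbb{Q}, E[p]) & \to & \text{Ш}(\mathbb{Q}, E)[p] & \to & 0
\end{array}$$
for increasing values of $n$. Provided Ш is finite, one can see that the image of
$E(\mathbb{Q})/pE(\mathbb{Q})$ in $\mathrm{Sel}(\mathbb{Q}, E[p])$ consists exactly of the elements that can be lifted to
$\mathrm{Sel}(\mathbb{Q}, E[p^n])$ for all $n$. We get thereby a cohomological expression for the group
$E(\mathbb{Q})/pE(\mathbb{Q})$ that can be used to compute its structure precisely. The idea is to
compute the image
$$\mathrm{Im}(\mathrm{Sel}(\mathbb{Q}, E[p^n])) \subset \mathrm{Sel}(\mathbb{Q}, E[p])$$
for each $n$ and simultaneously compute the image of $E(\mathbb{Q})_{\le n}$ in $\mathrm{Sel}(\mathbb{Q}, E[p])$. Here,
$E(\mathbb{Q})_{\le n}$ consists of the points in $E(\mathbb{Q})$ of height^w $\le n$. This is a finite set that can
be effectively computed: just look at the finite set of pairs $(x, y)$ of height $\le n$ and
see which ones satisfy the equation for $E$. Thus, we have an inclusion
$$\mathrm{Im}(E(\mathbb{Q})_{\le n}) \subset \mathrm{Im}(\mathrm{Sel}(\mathbb{Q}, E[p^n])) \subset \mathrm{Sel}(\mathbb{Q}, E[p]) .$$
Assuming Ш is finite, we get
$$\mathrm{Im}(E(\mathbb{Q})_{\le n}) = \mathrm{Im}(\mathrm{Sel}(\mathbb{Q}, E[p^n]))$$
for $n$ sufficiently large, at which point we can conclude that
$$E(\mathbb{Q})/pE(\mathbb{Q}) = \mathrm{Im}(\mathrm{Sel}(\mathbb{Q}, E[p^n])) .$$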
This is a conditional^x algorithm for computing the rank, which is used by all the
existing computer packages. With slightly more care, it also gives a set of generators
for the group $E(\mathbb{Q})$. In this sense, we do arrive at a conditional algorithm for
“completely determining” $E(\mathbb{Q})$. An important part (some would argue the most
important part) of BSD is to remove the “conditional” aspect.

w The height $h(x, y)$ of a point $(x, y) \in E(\mathbb{Q})$ is defined as follows. Write $(x, y) = (s/r, t/r)$ for
coprime integers $s, t, r$. Then $h(x, y) := \log \sup\{|s|, |t|, |r|\}$. The height of the origin is defined to
be zero. Clearly, there are only finitely many rational $(x, y)$ of height $\le n$ for any $n$.
x In the sense that it terminates only if Ш, or more precisely, its p-primary part $\text{Ш}[p^\infty]$, is finite.
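
The "search for points of bounded height" half of the algorithm is easy to sketch. The code below assumes a curve in short Weierstrass form $y^2 = x^3 + ax + b$ with integer coefficients, which is an illustrative choice not made in the text. It also takes the bound $B$ on $\max(|s|, |t|, |r|)$ directly, so that height $\le n$ corresponds to $B = \lfloor e^n \rfloor$. The function name is hypothetical, and the Selmer-group half of the algorithm is not attempted.

```python
from fractions import Fraction
from math import gcd

def points_up_to(a, b, B):
    """Affine points (x, y) = (s/r, t/r) on y^2 = x^3 + a*x + b
    with s, t, r coprime, r > 0 and max(|s|, |t|, |r|) <= B."""
    found = set()
    for r in range(1, B + 1):
        for s in range(-B, B + 1):
            for t in range(-B, B + 1):
                if gcd(gcd(s, t), r) != 1:
                    continue
                # clear denominators in (t/r)^2 = (s/r)^3 + a*(s/r) + b
                if t * t * r == s**3 + a * s * r * r + b * r**3:
                    found.add((Fraction(s, r), Fraction(t, r)))
    return sorted(found)
```

For the curve $y^2 = x^3 + 1$, `points_up_to(0, 1, 3)` returns the five affine points $(-1, 0)$, $(0, \pm 1)$, $(2, \pm 3)$. Raising the bound to 10 finds nothing new, in line with the known fact that this curve has rank 0.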

8. Non-Abelian Gauge Fields and Diophantine Geometry


We keep to the conventions of the previous section and assume $U$ to be a p-adic
Lie group over $\mathbb{Q}$. We now make the following further assumption: There is a finite
set $S$ of places containing $p$ and $\infty$ such that for all $v \notin S$, the action of $G_v$ on
$U$ factors through the quotient $G^u_v \simeq \mathrm{Gal}(\bar{\mathbb{F}}_v/\mathbb{F}_v)$. Another way of saying this is
that the action of the inertia subgroup $I_v$ is trivial. We say that $U$ is unramified
at $v$. Geometrically, this corresponds to having a family of groups on $\mathrm{Spec}(\mathbb{Z}_S)$,
where $\mathbb{Z}_S \subset \mathbb{Q}$ is the ring of $S$-integers,^y i.e. rational numbers whose denominators
are divisible only by primes in $S$. We will assume that the torsors $P$ satisfy
the same condition. In terms of $\mathrm{Spec}(\mathbb{Z})$, these can be thought of as connections
having singularities^z only along the primes in $S$. We will refer to these as $S$-integral
$U$-torsors. We will denote by
$$H^1(\mathbb{Z}_S, U)$$
the isomorphism classes of $S$-integral $U$-torsors. The reason for introducing this
notion is that the $U$ and $P$ that arise in nature are $S$-integral for some $S$. The
condition of being unramified clearly makes sense even when $P$ is just a torsor
over $\mathbb{Q}_v$. We denote by $H^1_u(\mathbb{Q}_v, U)$ the isomorphism classes of unramified $U$-torsors
over $\mathbb{Q}_v$.
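
The ring of $S$-integers is simple enough to test membership directly. This is a minimal sketch, assuming $S$ is passed as a collection of primes in which an optional entry `"inf"` for the place $\infty$ is ignored; the function name is hypothetical.

```python
from fractions import Fraction

def is_S_integer(q, S):
    """True if the denominator of the rational q is divisible only by primes in S."""
    den = Fraction(q).denominator
    for p in S:
        if p == "inf":          # the archimedean place imposes no condition
            continue
        while den % p == 0:     # remove every factor of p from the denominator
            den //= p
    return den == 1
```

With $S = \{\infty, 2\}$ the number $3/8$ qualifies and $3/10$ does not. Enlarging $S$ to $\{\infty, 2, 5\}$ admits both.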
One additional condition that $U$ and its torsors are required to satisfy is that of
being crystalline at $p$, a technical condition about which we will be quite vague.^aa
There is a big topological $\mathbb{Q}_p$-algebra $B_{cr}$ called the ring of p-adic periods and
the torsors are required to trivialize over $B_{cr}$. This is a condition that comes from
geometry and is closely related to p-adic Hodge theory. The point is that because $U$
is a p-adic Lie group, it will very rarely happen that the action is actually unramified
at $p$. The crystalline condition captures smooth behavior nevertheless. We denote
by $H^1_f(G_p, U)$ the torsors over $\mathbb{Q}_p$ that are crystalline.
With these assumptions, we denote by
$$\prod\nolimits'_v H^1(\mathbb{Q}_v, U)$$
the isomorphism classes of tuples $(P_v)_v$ where $P_v$ is a $U$-torsor over $\mathbb{Q}_v$ with the
property that all but finitely many $P_v$ are unramified and such that $P_p$ is crystalline.

y In terms of scheme theory, the set underlying $\mathrm{Spec}(\mathbb{Z}_S)$ is the open subset of $\mathrm{Spec}(\mathbb{Z})$ obtained
by removing the primes in $S$.
z The geometry of schemes is organized in such a way that $\mathbb{Z}$ becomes a ring of functions on
$\mathrm{Spec}(\mathbb{Z})$. It is easy to imagine that $\mathbb{Z}_S$ then becomes the ring of functions with restricted poles.
aa This is an enormous subject in the study of Galois representations and pedagogical references
are easy to find with just the key word “crystalline representation”. In the non-Abelian situation,
a treatment is given in Ref. 24.

For the global version, denote by $H^1_f(\mathbb{Z}_S, U)$ the $U$-torsors over $\mathbb{Q}$ that are unramified
outside $S$ and crystalline at $p$. Thus, we get a map
$$\mathrm{loc}: H^1_f(\mathbb{Z}_S, U) \to \prod\nolimits'_v H^1(\mathbb{Q}_v, U) ,$$
whose image we would like to compute.
The main examples are:
(1) The constant group $U = GL_n(\mathbb{Z}_p)$ or other p-adic Lie groups with trivial
$G$-action.
In this case, from the earlier description in terms of cocycles, it is easy to see
that a $U$-torsor is simply a representation
$$\rho: G \to U .$$
By our earlier assumption, this representation is required to be unramified
outside $S$ and crystalline at $p$. We will return to this important case in the next
section.
(2) The $\mathbb{Q}_p$-pro-unipotent fundamental group16,25
$$U = \pi_1(\bar{V}, b)_{\mathbb{Q}_p}$$
of a smooth projective variety $V$ over $\mathbb{Q}$ equipped with a rational base-point
$b \in V(\mathbb{Q})$. We assume that $V$ extends to a smooth projective family over
$\mathbb{Z}_{S \setminus p}$. An abstract definition of $U$ can be given starting from the profinite étale
fundamental group $\pi_1(\bar{V}, b)$: $\pi_1(\bar{V}, b)_{\mathbb{Q}_p}$ is the universal pro-unipotent group^bb
over $\mathbb{Q}_p$ admitting a continuous homomorphism
$$\pi_1(\bar{V}, b) \to \pi_1(\bar{V}, b)_{\mathbb{Q}_p} .$$
This is one of a number of “algebraic envelopes” of a group that have been
important in both arithmetic and algebraic geometry.1 In spite of the difficulty
of definition, it is substantially easier to work with than either the “bare”
fundamental group or its profinite completion.
bb An algebraic group is unipotent if it can be represented as a group of upper-triangular matrices
with 1s on the diagonal. A pro-unipotent group is a projective limit of unipotent algebraic groups.

The important and convenient fact is that $H^1_f(\mathbb{Z}_S, U)$ has the structure of a
pro-algebraic scheme over $\mathbb{Q}_p$.24 Among the constructions discussed so far, this is
the closest to gauge-theoretic moduli spaces in physics and geometry. For another
rational point $x$, one can also define
$$P(x) = \pi_1(\bar{V}; b, x)_{\mathbb{Q}_p} := [\pi_1(\bar{V}; b, x) \times U]/\pi_1(\bar{V}, b) ,$$
the $U$-torsor of pro-unipotent paths from $b$ to $x$. This construction gives us a map
$$V(\mathbb{Q}) \to H^1_f(\mathbb{Z}_S, U) ; \qquad x \mapsto P(x)$$
that fits into a diagram
$$\begin{array}{ccc}
V(\mathbb{Q}) & \to & V(\mathbb{Q}_p) \\
{\scriptstyle A}\downarrow & & \downarrow{\scriptstyle A_p} \\
H^1_f(\mathbb{Z}_S, U) & \xrightarrow{\;\mathrm{loc}_p\;} & H^1_f(\mathbb{Q}_p, U)
\end{array}$$

Even though the localization map needs in principle to be studied as a whole,
because $U$ is a p-adic Lie group, it will usually happen that the component at $p$
is the most informative, and we will concentrate on this for now. (We will explain
below the role of the adelic points.)
Conjecture 8.1 (Ref. 6). Suppose $V$ is a smooth projective curve of genus $\ge 2$.
Then
$$A_p^{-1}(\mathrm{Im}(\mathrm{loc}_p)) = V(\mathbb{Q}) .$$

In essence, the conjecture is saying that the rational points can be recovered as
the intersection between p-adic points and the space of S-integral torsors inside the
space of p-adic torsors. A number of other diagrams are relevant to this discussion.
$$\begin{array}{ccc}
H^1_f(\mathbb{Z}_S, U) & \xrightarrow{\;\mathrm{loc}\;} & \prod\nolimits'_v H^1(\mathbb{Q}_v, U) \\
 & \searrow & \downarrow \\
 & & H^1_f(\mathbb{Q}_p, U)
\end{array}$$
The right vertical arrow is just the projection to the component at $p$. The image of
the horizontal arrow should be computed by a reciprocity law,29,30 which we view
as the arithmetic analogue of the E-L equation in that it specifies which collections
of local torsors glue to a global torsor.
The diagram
$$\begin{array}{ccc}
V(\mathbb{Q}_p) & & \\
{\scriptstyle A_p}\downarrow & \searrow{\scriptstyle A^{DR}} & \\
H^1_f(\mathbb{Q}_p, U) & \xrightarrow{\;D\;} & U^{DR}/F^0
\end{array}$$
is used to clarify the structure of the local moduli space $H^1_f(\mathbb{Q}_p, U)$ and to translate
the E-L equations into equations satisfied by the p-adic points. The last object $U^{DR}$
is the De Rham fundamental group16 endowed with a Hodge filtration $F^i$, which can
be computed explicitly in such a way that the map $A^{DR}$ is also described explicitly
in terms of p-adic iterated integrals. From this point of view, computing the E-L
equation is the main tool for finding the points $V(\mathbb{Q})$.26,27,7,8

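To make this practical we use the lower central series
$$U = U^1 \supset U^2 \supset U^3 \supset \cdots ,$$
where $U^n = [U, U^{n-1}]$. We denote by $U_n = U/U^{n+1}$ the corresponding quotients,
which are then finite-dimensional algebraic groups. All the diagrams above can be
replaced by truncated versions, for example,
$$\begin{array}{ccc}
V(\mathbb{Q}) & \to & V(\mathbb{Q}_p) \\
{\scriptstyle A_n}\downarrow & & \downarrow{\scriptstyle A_{n,p}} \\
H^1_f(\mathbb{Z}_S, U_n) & \xrightarrow{\;\mathrm{loc}_p\;} & H^1_f(\mathbb{Q}_p, U_n)
\end{array}$$
These iteratively give equations for $V(\mathbb{Q})$ depending on a reciprocity law for the
image of $H^1_f(\mathbb{Z}_S, U_n)$ in $\prod\nolimits'_v H^1(\mathbb{Q}_v, U_n)$.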
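
The lower central series itself can be computed by brute force in a finite toy case. The sketch below assumes the group of $4 \times 4$ upper-triangular matrices over $\mathbb{F}_2$ with 1s on the diagonal, a finite stand-in chosen only so that everything can be enumerated. The groups $U$ in the text are pro-unipotent over $\mathbb{Q}_p$ and are not handled this way. All names are illustrative.

```python
from itertools import product

N, P = 4, 2   # toy choice: 4x4 unitriangular matrices over F_2

def mat_mul(a, b):
    """Matrix product modulo P."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(N)) % P
                       for j in range(N)) for i in range(N))

IDENTITY = tuple(tuple(int(i == j) for j in range(N)) for i in range(N))

def unitriangular_group():
    """All upper-triangular matrices with 1s on the diagonal."""
    slots = [(i, j) for i in range(N) for j in range(i + 1, N)]
    group = []
    for values in product(range(P), repeat=len(slots)):
        m = [list(row) for row in IDENTITY]
        for (i, j), v in zip(slots, values):
            m[i][j] = v
        group.append(tuple(map(tuple, m)))
    return group

def generated_subgroup(gens):
    """Closure of a finite set of group elements under multiplication."""
    elems, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mat_mul(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

def lower_central_series(group):
    """Orders of U^1 = U, U^2 = [U, U^1], U^3 = [U, U^2], ... down to the trivial group."""
    inv = {g: next(h for h in group if mat_mul(g, h) == IDENTITY) for g in group}
    series, current = [], set(group)
    while True:
        series.append(len(current))
        if len(current) == 1:
            return series
        # commutators g h g^-1 h^-1 with g in U and h in the current term
        commutators = {mat_mul(mat_mul(g, h), mat_mul(inv[g], inv[h]))
                       for g in group for h in current}
        current = generated_subgroup(commutators)

orders = lower_central_series(unitriangular_group())
```

In this toy case the terms have orders 64, 8, 2, 1, so the quotients $U_1$, $U_2$, $U_3$ have orders 8, 32 and 64.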
We illustrate this process with one example,14,15 which we take to be affine
because it is easier to describe than the projective case. Let $V = \mathbb{P}^1 \setminus \{0, 1, \infty\}$.
When we take $n = 2$ and $S = \{\infty, 2, p\}$, the image of $H^1_f(\mathbb{Z}_S, U_2)$ in
$$H^1_f(\mathbb{Q}_p, U_2) \simeq \mathbb{A}^3 = \{(x, y, z)\}$$
is described by the equation^cc
$$z - (1/2)xy = 0 .$$
When translated back to points, this yields the consequence that the 2-integral
points $V(\mathbb{Z}_2)$ are in the zero set of the function
$$D_2(z) = \ell_2(z) + (1/2) \log(z) \log(1 - z) .$$
Here, $\log(z)$ is the p-adic logarithm that is defined by the usual power series in
a neighborhood of 1 and then continued to all of $\mathbb{Q}_p \setminus \{0\}$ via additivity and the
condition $\log(p) = 0$. The p-adic k-logarithm is defined for $k \ge 2$ by
$$\ell_k(z) = \sum_{n=1}^{\infty} z^n/n^k$$
in a neighborhood of zero and analytically continued to $V(\mathbb{Q}_p)$ using Coleman
integration.24
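
The p-adic logarithm with $\log(p) = 0$ can be approximated modulo $p^N$ on rational arguments. This is a sketch under stated assumptions: it uses exact rational arithmetic, truncates the series at $4N$ terms (enough for precision $p^N$ when the argument is divisible by $p$), and extends to units by $\log(u) = \log(u^{p-1})/(p-1)$. It does not implement Coleman integration or the polylogarithms $\ell_k$. The function names and the precision parameter `N` are illustrative.

```python
from fractions import Fraction

def vp(q, p):
    """p-adic valuation of a nonzero rational q."""
    q = Fraction(q)
    v, num, den = 0, q.numerator, q.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def to_mod(q, p, N):
    """Reduce a rational q with denominator prime to p to an element of Z/p^N."""
    M = p ** N
    return q.numerator * pow(q.denominator, -1, M) % M

def log1p_padic(x, p, N):
    """log(1 + x) mod p^N for rational x divisible by p, from the usual power series."""
    x = Fraction(x)
    assert x == 0 or vp(x, p) >= 1
    total = Fraction(0)
    for n in range(1, 4 * N + 1):     # later terms are divisible by p^N
        total += (-1) ** (n + 1) * x ** n / n
    return to_mod(total, p, N)

def padic_log(z, p, N):
    """Branch of the p-adic logarithm with log(p) = 0, for nonzero rational z, mod p^N."""
    z = Fraction(z)
    assert z != 0
    u = z / Fraction(p) ** vp(z, p)   # unit part; the power of p contributes log(p) = 0
    w = u ** (p - 1)                  # w is congruent to 1 mod p
    M = p ** N
    return log1p_padic(w - 1, p, N) * pow(p - 1, -1, M) % M
```

With $p = 5$ and $N = 8$, one can check additivity such as $\log 2 + \log 3 \equiv \log 6 \pmod{5^8}$, together with $\log 5 = 0$ and $\log(-1) = 0$.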
When we use the E-L equations for $H^1_f(\mathbb{Z}_S, U_4)$, we find that $V(\mathbb{Z}_2)$ is killed by
the additional equation
$$\zeta_p(3)\,\ell_4(z) + \frac{8}{7}\left[\frac{\log^3 2}{24} + \frac{\ell_4(1/2)}{\log 2}\right] \log(z)\,\ell_3(z)
+ \left[\frac{4}{21}\left(\frac{\log^3 2}{24} + \frac{\ell_4(1/2)}{\log 2}\right) + \frac{\zeta_p(3)}{24}\right] \log^3(z) \log(1 - z) = 0 .$$
Here, $\zeta_p(s)$ is the Kubota–Leopoldt p-adic zeta function.

cc The isomorphism between the local moduli space and the affine three-space is also a consequence
of p-adic Hodge theory.25

It is worth noting that this method for finding rational points is a surprising
confluence of three ingredients.
(1) The method of Chabauty,13 which was, in retrospect, the case of Abelian gauge
groups. One ends up using the p-adic logarithm on the Jacobian of the curve
without considering cohomology at all.
(2) The descent method for finding points on elliptic curves.42 This again is another
version of the Abelian case, where one uses Galois cohomology and the Selmer
group. As described earlier, this method is central to the conjecture of Birch
and Swinnerton-Dyer.
(3) The geometry of arithmetic gauge fields.

9. Galois Representations, L-Functions, and Chern Simons Actions
Finally, we consider
$$H^1_f(\mathbb{Z}_S, GL_n(\mathbb{Z}_p)) ,$$
a moduli space of Galois representations.^dd
It is believed then that the image of the localization map can be characterized by an
L-function via a reciprocity law that one is tempted to view as an arithmetic action
principle of sorts. That is, for any collection $(P_v)_v$ of local principal bundles with
$P_p \in H^1_f(\mathbb{Q}_p, GL_n(\mathbb{Z}_p))$ crystalline and $P_v \in H^1_u(\mathbb{Q}_v, GL_n(\mathbb{Z}_p))$ for $v \notin S$, one looks at
the complex-valued product31
$$L((P_v)_v, s) := \prod_{v \neq \infty} \frac{1}{\det(I - v^{-s} Fr_v \mid P_v^{I_v})} ,$$
which formally amalgamates the information associated to all “local” functions
$$(P_v)_v \mapsto \mathrm{Tr}(Fr_v^n \mid P_v^{I_v})$$
as we run over places $v$ and natural numbers $n$.
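
In the one-dimensional case the product above is an ordinary Euler product and can be evaluated numerically. The sketch below makes several illustrative assumptions. Each $P_v$ is one-dimensional with $Fr_v$ acting by a single complex number `frobenius_eigenvalue(p)`. A ramified place with no inertia invariants is modelled by the value 0. The product is truncated at a finite bound and evaluated only in the region of absolute convergence. The function names are hypothetical.

```python
from math import pi

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def euler_product(frobenius_eigenvalue, s, bound):
    """Truncated product of 1 / (1 - alpha_p * p^(-s)) over primes p <= bound."""
    value = 1.0
    for p in primes_up_to(bound):
        value /= 1.0 - frobenius_eigenvalue(p) * p ** (-s)
    return value

def trivial(p):
    """Trivial representation: Frobenius acts by 1 everywhere."""
    return 1

def chi4(p):
    """Nontrivial character mod 4, with local factor 1 at the ramified prime 2."""
    return 0 if p == 2 else (1 if p % 4 == 1 else -1)

zeta_2 = euler_product(trivial, 2, 10000)      # approximates pi^2 / 6
catalan = euler_product(chi4, 2, 10000)        # approximates Catalan's constant
```

The first value approximates $\zeta(2) = \pi^2/6$ and the second approximates Catalan's constant, the value at $s = 2$ of the L-function of the character mod 4.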
Assuming a rather large number of standard conjectures in the theory of motives,
some necessary conditions for (Pv )v to be in the image of the localization map of
an irreducible representation P are as follows.
(1) Each of the $\det(I - v^{-s} Fr_v \mid P_v^{I_v})$ should be a polynomial in $v^{-s}$ with integral
coefficients for $v \notin S$; for any $v$, the coefficients should be algebraic. We use the
algebraicity to regard the polynomial as complex-valued.
(2) There is an integer $w$ such that the absolute values of the eigenvalues of $Fr_v$
are $v^{w/2}$ for $v \notin S$. This implies that the product converges absolutely for
$\mathrm{Re}(s) > w/2 + 1$.

dd Here, we will allow ourselves to use the word “space” quite loosely. There are numerous ways
to geometrize this set, sometimes formally35 and sometimes analytically.9 It may also be most
natural to regard it as a stack without worrying too much about representability.

(3) $L((P_v)_v, s)$ has analytic continuation to all of $\mathbb{C}$ and satisfies a functional
equation of the form
$$L((P_v)_v, s) = a b^s L((P_v)_v, w + 1 - s) ,$$
for some rational numbers $a, b$. This function should have no poles unless $w$
is even, the $P_v$ are one-dimensional, and $Fr_v$ acts as $v^{w/2}$ for all but finitely
many $v$.

Roughly speaking, the necessity of these conditions is summarized under the rubric
of the Fontaine–Mazur conjecture19 and the Hasse–Weil conjecture.31 Even though
it is not clear whether such a conjecture is stated in the literature, it appears to be
commonly believed that these conditions should also characterize all $(P_v)_v$ that are in the
image of the localization map (cf. Ref. 45).
There is a sense in which $L((P_v)_v, s)$ should be related to an action. The complex
number $s$ itself parametrizes representations of a somewhat more general type,
namely belonging to the idele class group of a number field $F$. That is, for each place
$v$ of $F$, there is a suitably normalized absolute value $\|\cdot\|_v$, which come together to
form the norm character
$$N: \mathbb{A}_F^\times \to \mathbb{C}^\times ; \qquad (a_v)_v \mapsto N((a_v)_v) := \prod_v \|a_v\|_v .$$
This character and its complex powers $N(\cdot)^{-s}$ factor through the idele class group
$\mathbb{A}_F^\times/F^\times$ and the L value is a complex amplitude associated to $(P_v)_v$ twisted by
$N(\cdot)^{-s}$. The infinite product expansion will hold only for a region of $s$, so that
the conjectured analytic continuation is supposed to involve a move from a kind of
“decomposable range of the parameter” to one that is not. When the continuation
is carried out, it turns out to be natural to view it as a section of a determinant line
bundle, which is a function only in certain regions,20,23 creating an analogy with
the wave functions of topological quantum field theory.3
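
That the norm character factors through the idele class group amounts, for $F = \mathbb{Q}$, to the product formula: the product of $\|a\|_v$ over all places equals 1 for every nonzero rational $a$. The following sketch checks this with exact arithmetic. It assumes the usual normalization $\|a\|_p = p^{-v_p(a)}$ and the ordinary absolute value at $\infty$; the function names are illustrative.

```python
from fractions import Fraction

def prime_factors(n):
    """Prime divisors of a positive integer, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def abs_p(a, p):
    """Normalized p-adic absolute value p^(-v_p(a)) of a nonzero rational a."""
    num, den, v = a.numerator, a.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def idele_norm_of_principal(a):
    """N applied to the diagonal idele (a, a, a, ...) for a nonzero rational a."""
    a = Fraction(a)
    norm = abs(a)                                   # the place v = infinity
    for p in prime_factors(abs(a.numerator)) | prime_factors(a.denominator):
        norm *= abs_p(a, p)                         # all other primes contribute 1
    return norm
```

For example, `idele_norm_of_principal(Fraction(-360, 77))` evaluates to exactly 1.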
The question of finding natural action functionals on spaces of principal bundles
appears to be important not just for unity of the theory, but because of the hope
that it might lead to a more efficient approach to the arithmetic E-L equations
of the previous section. While an action on torsors for π1 (V , b)Qp seems hard to
define, there is an approach to a Chern–Simons action of Galois representations.
To describe this, we lapse now into more geometric language and reproduce the
discussion from Refs. 28 and 10, which, in turn, is based on Ref. 17.
Let X = Spec(OF ), the spectrum of the ring of integers in a number field F . We
assume that F is totally imaginary. Denote by Gm the étale sheaf that associates
to a scheme the units in the global sections of its coordinate ring. The topological
fact underlying the functional is the canonical isomorphism [Ref. 33, p. 538]:

$$\mathrm{inv}: H^3(X, \mathbb{G}_m) \simeq \mathbb{Q}/\mathbb{Z} . \qquad (\ast)$$



This map is deduced from the “invariant” map of local class field theory.40 We will
therefore use the same name for a range of isomorphisms having the same essential
nature, for example,
$$\mathrm{inv}: H^3(X, \mathbb{Z}_p(1)) \simeq \mathbb{Z}_p , \qquad (\ast\ast)$$
where $\mathbb{Z}_p(1) = \varprojlim_i \mu_{p^i}$, and $\mu_n \subset \mathbb{G}_m$ is the sheaf of nth roots of 1. The
pro-sheaf $\mathbb{Z}_p(1)$ is a very familiar coefficient system for étale cohomology and (∗∗)
is reminiscent of the fundamental class of a compact oriented three-manifold for
singular cohomology. Such an analogy was noted by Mazur around 50 years ago34
and has been developed rather systematically by a number of mathematicians,
notably, Masanori Morishita.37 Within this circle of ideas is included the analogy
between knots and primes, whereby the map
$$\mathrm{Spec}(\mathcal{O}_F/P_v) \hookrightarrow X ,$$
from the residue field of a prime $P_v$ should be similar to the inclusion of a knot.
Let $F_v$ be the completion of $F$ at the prime $v$ and $\mathcal{O}_{F_v}$ its valuation ring. If one
takes this analogy seriously, the map
$$\mathrm{Spec}(\mathcal{O}_{F_v}) \to X ,$$
should be similar to the inclusion of a handle-body around the knot, whereas
$$\mathrm{Spec}(F_v) \to X$$
resembles the inclusion of its boundary torus.^ee Given a finite set $S$ of primes, we
consider the scheme
$$X_S := \mathrm{Spec}(\mathcal{O}_F[1/S]) = X \setminus \{P_v\}_{v \in S} .$$
Since a link complement is homotopy equivalent to the complement of a tubular
neighborhood, the analogy is then forced on us between $X_S$ and a three-manifold with
boundary given by a union of tori, one for each “knot” in $S$. These of course are
basic morphisms in three-dimensional topological quantum field theory.3 From this
perspective, perhaps the coefficient system $\mathbb{G}_m$ of the first isomorphism should have
reminded us of the $S^1$-coefficient important in Chern–Simons theory.49,17 A more
direct analogue of $\mathbb{G}_m$ is the sheaf $\mathcal{O}_M^\times$ of invertible analytic functions on a complex
variety $M$. However, for compact Kähler manifolds, the comparison isomorphism
$$H^1(M, S^1) \simeq H^1(M, \mathcal{O}_M^\times)_0 ,$$
where the subscript refers to the line bundles with trivial topological Chern class,
is a consequence of Hodge theory. This indicates that in the étale setting with no
natural constant sheaf of $S^1$’s, the familiar $\mathbb{G}_m$ has a topological nature, and can
be regarded as a substitute.^ff One problem, however, is that the $\mathbb{G}_m$-coefficient
computed directly gives divisible torsion cohomology, whence the need for considering
coefficients like $\mathbb{Z}_p(1)$ in order to get functions of geometric objects having
an analytic nature as arise, for example, in the theory of torsors for motivic
fundamental groups.12,24–27

ee It is not clear to us that the topology of the boundary should really be a torus. This is reasonable
if one thinks of the ambient space as a three-manifold. On the other hand, perhaps it is possible
to have a notion of a knot in a homology three-manifold that has an exotic tubular neighborhood?
In any case, Kapranov has pointed out that a better analogy is with a Klein bottle.
We now move to the definition of the arithmetic Chern–Simons action just for
the simple case of a finite unramified Galois representation. Let
$$\pi := \pi_1(X, b) ,$$
be the profinite étale fundamental group of $X$, where we take
$$b: \mathrm{Spec}(\bar{F}) \to X$$
to be the geometric point coming from an algebraic closure of $F$. Assume now that
the group $\mu_n(\bar{F})$ of nth roots of unity is in $F$ and fix a trivialization $\zeta_n: \mathbb{Z}/n\mathbb{Z} \simeq \mu_n$.
This induces the isomorphism
$$\mathrm{inv}: H^3(X, \mathbb{Z}/n\mathbb{Z}) \simeq H^3(X, \mu_n) \simeq \tfrac{1}{n}\mathbb{Z}/\mathbb{Z} .$$
Now let $A$ be a finite group with trivial $G_F$-action and fix a class $c \in H^3(A, \mathbb{Z}/n\mathbb{Z})$.
For
$$[\rho] \in H^1(\pi, A) ,$$
we get a class
$$\rho^*(c) \in H^3(\pi, \mathbb{Z}/n\mathbb{Z})$$
that depends only on the isomorphism class $[\rho]$. Denoting by inv also the composed
map
$$H^3(\pi, \mathbb{Z}/n\mathbb{Z}) \to H^3(X, \mathbb{Z}/n\mathbb{Z}) \xrightarrow[\simeq]{\;\mathrm{inv}\;} \tfrac{1}{n}\mathbb{Z}/\mathbb{Z} ,$$
we get thereby a function
$$CS_c: H^1(\pi, A) \to \tfrac{1}{n}\mathbb{Z}/\mathbb{Z} ; \qquad [\rho] \mapsto \mathrm{inv}(\rho^*(c)) .$$

This is the basic and easy case of the classical Chern–Simons action in the arithmetic
setting. There is a natural generalization to the case where ramification is allowed
and where the representation has p-adic coefficients. It is related to natural
invariants of algebraic number theory such as extensions of ideal class groups and nth
power residue symbols.10,11 One might hope for such constructions to be related
at once to L-functions and to E-L equations even for the unipotent fundamental
groups of the previous section. Indeed, the approaches to the BSD conjecture that
go via the “main conjecture of Iwasawa theory” take the view that Selmer groups
should be annihilated by L-functions.23 The reader might notice that the formulae
for the arithmetic E-L equations of the previous section actually indicate some
connection to L-functions, but in a way that remains mysterious.

ff Recall, however, that it is of significance in Chern–Simons theory that one side of this
isomorphism is purely topological while the other has an analytic structure.
Finally, we mention that the Langlands reciprocity conjecture32 has as its goal
the rewriting of arithmetic L-functions quite generally in terms of automorphic L-
functions. In view of the striking work,22 it seems reasonable to expect the geometry
of arithmetic gauge fields to play a key role in importing quantum field theoretic
dualities to arithmetic geometry.

Acknowledgment
This work is supported by Grant EP/M024830/1 from the EPSRC.

References
1. J. Amorós, M. Burger, K. Corlette, D. Kotschick and D. Toledo, Fundamental Groups
of Compact Kähler Manifolds, Mathematical Surveys and Monographs, Vol. 44 (Amer-
ican Mathematical Society, 1996).
2. S. Y. An, S. Y. Kim, D. C. Marshall, S. H. Marshall, W. G. McCallum and A. R.
Perlis, J. Number Theory 90, 304 (2001).
3. M. Atiyah, Inst. Hautes Études Sci. Publ. Math. 68, 175 (1988).
4. M. F. Atiyah and R. Bott, Philos. T. Roy. Soc. A 308, 1523 (1983).
5. M. F. Atiyah, N. J. Hitchin, V. G. Drinfel’d and Yu. I. Manin, Phys. Lett. A 65, 185
(1978).
6. J. Balakrishnan, I. Dan-Cohen, M. Kim and S. Wewers, arXiv:1209.0640. To be
published in Math. Ann.
7. J. Balakrishnan and N. Dogra, arXiv:1601.00388.
8. J. Balakrishnan and N. Dogra, arXiv:1705.00401.
9. G. Chenevier, The p-adic analytic space of pseudocharacters of a profinite group
and pseudorepresentations over arbitrary rings, Automorphic Forms and Galois
Representations, Vol. 1, London Mathematical Society Lecture Note Series, Vol. 414
(Cambridge Univ. Press, 2014), pp. 221–285.
10. H.-J. Chung, D. Kim, M. Kim, J. Park and H. Yoo, arXiv:1609.03012.
11. H.-J. Chung, D. Kim, M. Kim, G. Pappas, J. Park and H. Yoo, arXiv:1706.03336.
12. J. Coates, M. Kim and S. Minhyong, Kyoto J. Math. 50, 827 (2010).
13. R. F. Coleman, Duke Math. J. 52, 765 (1985).
14. I. Dan-Cohen and S. Wewers, arXiv:1311.7008.
15. I. Dan-Cohen and S. Wewers, arXiv:1510.01362.
16. P. Deligne, Le groupe fondamental de la droite projective moins trois points,
Galois groups over Q, Mathematical Sciences Research Institute Publications,
Vol. 16 (Springer, 1989), pp. 79–297.
17. R. Dijkgraaf and E. Witten, Commun. Math. Phys. 129, 393 (1990).
18. S. K. Donaldson, J. Differ. Geom. 18, 279 (1983).
19. J.-M. Fontaine and B. Mazur, Geometric Galois representations, Elliptic Curves,
Modular Forms, and Fermat’s Last Theorem, Vol. 1 of Series in Number Theory
(International Press, 1995), pp. 41–78.
20. T. Fukaya and K. Kato, A formulation of conjectures on p-adic zeta functions
in noncommutative Iwasawa theory, in Proc. St. Petersburg Mathematical Society,
Vol. XII (American Mathematical Society, 2006), pp. 1–85.
21. A. Grothendieck and B. G. Faltings, Geometric Galois Actions, Vol. 1, London
Mathematical Society Lecture Note Series, Vol. 242 (Cambridge Univ. Press, 1997),
pp. 49–58.
22. A. Kapustin and E. Witten, Commun. Number Theory Phys. 1, 1 (2007).
23. K. Kato, Lectures on the approach to Iwasawa theory for Hasse-Weil L-functions
via $B_{\mathrm{dR}}$. Part I, Arithmetic Algebraic Geometry, Lecture Notes in Mathematics,
Vol. 1553 (Springer, 1993), pp. 50–163.
24. M. Kim, Invent. Math. 161, 629 (2005).
25. M. Kim, Ann. Math. 172, 751 (2010).
26. M. Kim, J. Am. Math. Soc. 23, 725 (2010).
27. M. Kim, Duke Math. J. 161, 173 (2012).
28. M. Kim, arXiv:1510.05818.
29. M. Kim, Diophantine geometry and non-Abelian reciprocity laws I, Elliptic Curves,
Modular Forms and Iwasawa Theory, Springer Proceedings in Mathematics and
Statistics, Vol. 188 (Springer, 2016), pp. 311–334.
30. M. Kim, Principal bundles and reciprocity laws in number theory. To be published in
Proc. Symp. Pure Mathematics.
31. M. Kim, Classical motives and motivic L-functions, Autour Des Motifs — École d’été
Franco-Asiatique de Géométrie Algébrique et de Théorie des Nombres, Asian-French
Summer School on Algebraic Geometry and Number Theory, Vol. I (Soc. Math.
France, 2009), pp. 1–25.
32. R. P. Langlands, L-functions and automorphic representations, in Proc. Int. Congress
of Mathematicians, Helsinki, 1978 (Acad. Sci. Fennica, 1980), pp. 165–175.
33. B. Mazur, Ann. Sci. École Norm. Sup. 6, 521 (1974).
34. B. Mazur, Remarks on the Alexander polynomial. Unpublished notes.
35. B. Mazur, An introduction to the deformation theory of Galois representations, in
Modular Forms and Fermat’s Last Theorem (Springer, 1997), pp. 243–311.
36. J. S. Milne, Michigan Math. J. 46, 203 (1999).
37. M. Morishita, Knots and Primes. An Introduction to Arithmetic Topology, Universi-
text (Springer, 2012).
38. M. S. Narasimhan and C. S. Seshadri, Ann. Math. 82, 540 (1965).
39. J. Neukirch, Algebraic Number Theory, Grundlehren der Mathematischen
Wissenschaften, Vol. 322 (Springer-Verlag, 1999).
40. J. Neukirch, A. Schmidt and K. Wingberg, Cohomology of Number Fields, 2nd edn.,
Grundlehren der Mathematischen Wissenschaften, Vol. 323 (Springer-Verlag, 2008).
41. P. Schneider, p-Adic Lie Groups, Grundlehren der Mathematischen Wissenschaften,
Vol. 344 (Springer, 2011).
42. J. H. Silverman, The Arithmetic of Elliptic Curves, 2nd edn., Graduate Texts in
Mathematics, Vol. 106 (Springer, 2009).
43. C. T. Simpson, Inst. Hautes Études Sci. Publ. Math. 75, 5 (1992).
44. T. Szamuely, Galois Groups and Fundamental Groups, Cambridge Studies in Ad-
vanced Mathematics, Vol. 117 (Cambridge Univ. Press, 2009).
45. R. Taylor, Ann. Fac. Sci. Toulouse Math. 13, 73 (2004).
46. A. Weil, Acta Math. 52, 281 (1929).
47. A. Weil, J. Math. Pure Appl. 17, 47 (1938).
48. A. Wiles, Ann. Math. 141, 443 (1995).
49. E. Witten, Commun. Math. Phys. 121, 351 (1989).
50. A. Zee, Quantum Field Theory in a Nutshell, 2nd edn. (Princeton Univ. Press, 2010).
December 11, 2018 11:13 ws-rv961x669 chap05-Penrose page 135


Chapter 5

Singularity theorems

Roger Penrose
Mathematical Institute,
Oxford University,
Radcliffe Observatory Centre,
Woodstock Rd., Oxford OX1 6GG, UK

Not long after Einstein introduced the equations for his general theory of relativity,
solutions were found by Friedmann, Lemaître, and others that presented plausible
models of the cosmos, but they had singular origins where densities and curvatures diverge
to infinity. A similar situation, but in the opposite time direction, occurred with the
Oppenheimer–Snyder model, found somewhat later, which describes gravitational
collapse to what is now called a black hole with its internal singularity. The question arose
as to whether the strict spherical symmetry and the simplistic representation of the
matter source (dust) in these models might be misleading, and whether a generic
asymmetrical (possibly rotating) model, with perhaps a more realistic matter source, might
avoid actual singularities, allowing for the possibility of a non-singular “bounce”.
However, techniques introduced in the 1960s and described in detail here, which drew on
ideas from differential topology and causal structure theory, showed that the singularities
occurring in these situations cannot be removed by such means, being stable phenomena and
occurring with very general matter sources satisfying merely a condition related to local
energy non-negativity. Results in this area due to Stephen Hawking, Robert Geroch,
the current author, and many others are presented here, these being concerned with
causality conditions, the structures of boundaries of futures and pasts, Cauchy
horizons, and domains of dependence. A version of the author’s original singularity theorem,
demonstrating the necessity of singularities arising in gravitational collapse, is presented
in detail, but with a new perspective, whereby the nature of such singularities can be
analysed in relation to the notion of boundary points to space-times, referred to as TIPs
and TIFs (terminal indecomposable pasts and futures).

5.1. General Background


Albert Einstein produced his great theory of general relativity (here abbreviated
GR) in 1915, making the modification of incorporating his additional Λ-term two
years later (Λ-GR). Although being admired for its firm physical foundations, orig-
inality, boldness, and over-arching grandeur, and being given some early physical
credence by Sir Arthur Eddington’s expedition to the Isle of Principe during the
solar eclipse of 1919, where the first rudiments of observational support for the the-
ory were found in its effect of the bending of light by the Sun’s mass, the theory was
nevertheless viewed with suspicion by most physicists for many years afterwards.
This was understandable, because Einstein’s theory of gravity was very different in
form from that of any other successful physical theory that had been put forward
up to that point. In GR, gravitation is not even regarded as a force, in a remarkable
irony with the fact that the huge success of Newton’s theory of gravitational force
had provided the model for all later serious theories of physical interaction between
particles. Not only that, but unlike all other successful physical theories, in GR
there seemed to be no local measure of the energy in the gravitational field.
This latter point seemed particularly troublesome, as energy had become rec-
ognized as one of the most fundamental concepts in physics, and it remains so
today. As early as 1916, Einstein had argued [Einstein, A. (1918)] that a rotating
rod should emit gravitational waves, which would carry energy away from the kinetic
motion of the rod, and he provided a “mass quadrupole formula” for the rate
of energy released in this radiation, a formula still used today in the calculations
pertinent to the interpretation of signals received by the LIGO gravitational wave
detectors. Yet there was no locally meaningful measure of the energy density in
GR’s curved space-time geometry that one could identify as that carried away by
such a wave. Einstein had introduced a concept of an energy “pseudo-tensor” to
take on this role, but it was a coordinate-dependent quantity that could not be
regarded as having local physical meaning. Only by taking one’s considerations
out to infinity did it become possible to find some kind of coordinate-invariant no-
tion that could be assigned genuine physical content. The issue remained obscure
for many years, and Einstein himself vacillated from year to year concerning the
physical reality of genuine energy-carrying gravitational waves, as did many of his
contemporaries. It was not until the late 1950s and early 1960s, through the work
of Andrzej Trautman [Trautman, A. (1958)], Hermann Bondi and collaborators
[Bondi, H. (1960)], [Bondi, H., van der Burg, M.G.J. and Metzner, A.W.K. (1962)],
Rainer Sachs [Sachs, R.K. (1961)], [Sachs, R.K. (1962a)], [Sachs, R.K. (1962b)], and
many others [Newman, E.T. and Unti, T.W.J. (1962)], [Newman, E.T. and Penrose,
R. (1962)], [Penrose, R. (1963)], [Penrose, R. (1965)], that a clear measure of the
energy carried away in the form of gravitational waves coming from a reasonably
localized gravitating system was obtained, albeit only when the entire space-time
could be considered asymptotically flat in some appropriate sense (see [Trautman,
A. (1958)], [Bondi, H., van der Burg, M.G.J. and Metzner, A.W.K. (1962)], [Sachs,
R.K. (1962a)], [Sachs, R.K. (1962b)], [Newman, E.T. and Penrose, R. (1962)],
[Penrose, R. (1963)], [Penrose, R. (1965)]) — and even at infinity there remains
some non-locality in the measure of energy flux (see [Penrose, R. and Rindler,
W. (1986)] p. 427).
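To give a feel for the size of the effect for a rotating rod, here is a minimal numerical sketch. It assumes the standard modern form of the quadrupole formula for a thin uniform rod spinning about a perpendicular axis through its centre, $P = \tfrac{32}{5}\,G I^2 \omega^6 / c^5$ with $I = M L^2/12$; the function name and the sample rod (one tonne, 10 m long, 100 rad/s) are illustrative choices, not taken from the text.

```python
# Sketch: gravitational-wave power from a spinning rod (quadrupole formula).
# Assumption: thin uniform rod, rotation axis perpendicular to the rod
# through its centre, weak-field slow-motion regime.

G = 6.67430e-11    # Newton's constant, m^3 kg^-1 s^-2
C = 2.99792458e8   # speed of light, m/s


def rod_gw_power(mass, length, omega):
    """Return the radiated power in watts for a rod of given mass (kg),
    length (m) and angular velocity omega (rad/s)."""
    moment_of_inertia = mass * length**2 / 12.0
    return (32.0 / 5.0) * G * moment_of_inertia**2 * omega**6 / C**5


# Hypothetical laboratory-scale rod: 1 tonne, 10 m long, 100 rad/s.
power = rod_gw_power(mass=1.0e3, length=10.0, omega=100.0)
print(f"radiated power ~ {power:.2e} W")
```

The result is of order $10^{-32}$ W, which illustrates why only astrophysical sources are observable; the steep $\omega^6$ dependence is the main design point to notice.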
As an alternative approach, borrowed from the standard procedures of
(quantum) field theory and particle physics, one may take note of the fact that
Hilbert had shown, in 1915 [Hilbert, D. (1915)], that the equations of GR could
be obtained from a Lagrangian. One might therefore appeal to Noether’s theorem,
which demonstrates that any such theory with a time-translation-invariant
Lagrangian necessarily possesses a conserved notion of energy, and expect this
to apply to GR. However, the time-invariance of GR is
merely an aspect of the general covariance (the infinite-parameter coordinate
independence) of the theory, and Noether’s theorem does not usefully apply (although
some authors have found value in this procedure in a relatively limited way; see
[Komar, A. (1959)]). The problem is made more manifest with the notion of
angular momentum conservation, which turns out to be particularly problematic for
GR (see [Sachs, R.K. (1962b)]).
Despite this apparent departure from some of the most basic of principles of
modern physics, the actual physical precision of GR has become more and more
apparent as the years have rolled by. A particular landmark was the 1974 observation
of a double star system, one member of which was a pulsar (PSR B1913+16),
by Russell Hulse and Joseph Taylor, demonstrating the accuracy of GR to an
extraordinary degree (see [Taylor, J.H. (1994)], [Will, C.M. (2005)]), this including a
significant and necessary contribution from the energy loss from the system due
to its emission of gravitational waves, and earning them the well-deserved 1993
Nobel Prize for physics. The overall accuracy, from Newtonian effects, through
orbital corrections due to GR, to the details of the gravitational radiation, revealed
a precision of GR to around one part in $10^{13}$, which compares favourably with
even the extraordinary precision of one part in $10^{11}$ that is exhibited by quantum
electrodynamics (referred to as QED) as seen in the calculation of the electron’s magnetic
moment. Later observations of other pulsars, most particularly the binary pulsar
PSR J0737-3039, could eventually improve on this gravitational figure considerably.
This comparison of GR with QED is apposite, the first being our best theory
of the large and the second, of the small. QED has often been claimed to be the
most accurate theory in known science, and when making this kind of comparison,
we must take into account the fact that quantum field theory (QFT), and most
particularly QED, finds many different types of application, and is undoubtedly a
hugely versatile theory, explaining many different kinds of phenomena. GR, on the
other hand, has relatively few examples of different phenomena where a detailed
matching between theory and observed effects can be made. Nevertheless, examples
of phenomena for which GR-predicted effects can be directly observed have
increased in recent years. The most noteworthy of these has been LIGO’s detection
of anticipated signals that are completely in accordance with GR’s prediction of
gravitational wave signals from colliding black holes. Another is the clear evidence
of lensing effects of the light from very distant galaxies as it reaches us after coming
near to other large conglomerations of mass.
On the negative side, both theories encounter fundamental difficulties that can
be regarded as serious shortcomings, in that strict calculations lead us into infini-
ties, perhaps suggesting a basic incompleteness of these theories as we currently
know them. But here, it seems to me, the situations with regard to the two are very
different. In QFT, these infinities lurk behind every calculation that is carried out
within the theory. Strict adherence to the principles underlying QFT’s calculational
procedures almost inevitably leads us to divergent series or divergent integrals, and
numerous ad hoc theoretical devices, such as renormalization or dimensional
regularization, are invoked in order that one may extract the required finite answers,
and frequently even these do not remove all the troublesome infinities. With GR,
the situation is very different. Here we find that most physical situations of inter-
est, although frequently leading to greatly complicated calculations, nevertheless
provide perfectly well-defined finite results. Yet under certain particular extreme
conditions, it can be shown that the equations of the theory do inevitably lead us
to “singularities” where the evolution according to the procedures of GR cannot be
continued to give us finite results. In the next section we shall see what kinds of
physical situation can lead us into problems of this kind.

5.2. The Big Bang and Gravitational Collapse


In 1922 and 1924, the Russian mathematician Alexander Friedmann published the
first solutions of Einstein’s equations aimed at providing space-times that might
give a reasonable approximation to our universe as a whole. His models provided
alternative overall spatial geometries that were either closed up, with topology
$S^3$, or open hyperbolic 3-space. A little later, in 1927, the Belgian priest Georges
Lemaître also solved the GR equations in such a cosmological setting (this time
with a flat spatial geometry) and, as with Friedmann’s earlier findings, there was a
singular origin to the universe in the model, where curvatures and densities diverge
to infinity. Yet the expanding nature of this model, following this singularity, was
a positive feature for Lemaître, for he had been strongly influenced by the
observational findings of Vesto Slipher (see [O’Raifeartaigh, C. (2013)]), which indicated
that the actual universe appeared to be expanding, a finding that was later given
more definitive support by Edwin Hubble.
Lemaître referred to this initial singularity as the “primordial atom”, the
current terminology “Big Bang” being much later introduced by Fred Hoyle [Hoyle,
F. (1950)], as an intendedly derogatory term, as Hoyle was a supporter of the
rival steady state theory in which there would be no initial singularity [Sciama,
D.W. (1959)]. However, the evolution equations for that theory required an
awkward modification of standard GR. Nevertheless, many other cosmologists were
also unhappy about this “Big-Bang” origin for the universe. Most particularly,
Einstein himself, although admiring the mathematics of both Friedmann and Lemaître,
initially derided the physical understandings of both of them. Eventually, after
becoming aware of Hubble’s impressive findings, Einstein began to appreciate the
power of Lemaître’s reasoning and was converted to the Big-Bang idea.
In 1917, Einstein had himself proposed a cosmological model [Einstein,
A. (1917)] which was completely static (i.e., technically, having a
hypersurface-orthogonal timelike Killing-vector field), with topology $\mathbb{R} \times S^3$. It was to allow
solutions of this type that Einstein introduced the Λ-term into his equations. But
Eddington argued that the static aspect of this model was unstable, and since
astronomical evidence indicated that the universe is in fact not static but expanding,
Einstein abandoned not only this model but the Λ-term itself. This was ironic, as it
eventually turned out, since there is now significant evidence for a positive Λ, and
Λ-GR appears to provide a profoundly accurate description of our large-scale
universe.
A separate issue for relativity and astrophysics was brought to light in 1930 by
the then 19-year-old Indian physics student Subrahmanyan Chandrasekhar when
he was travelling on the boat from India to England, where he was to study for
a doctorate in astrophysics in Cambridge. Chandrasekhar was interested in the
equations for the physical effects that keep a white dwarf star from collapsing
inwards, owing to its huge gravitational forces. White dwarfs (such as the dark
companion of Sirius) are extremely dense — something like the entire mass of our
Sun concentrated into an object the size of the Earth. Such stars are held apart
by what is called electron degeneracy pressure, arising from the Pauli exclusion
principle that forbids different electrons from occupying the same state. What
Chandrasekhar found was that when the rules of special relativity are appropriately
incorporated into the electrons’ equations, there would be a limit to the total mass
of the star, namely about 1.4 times the mass of the Sun, now referred to as the
Chandrasekhar limit (here referred to as $M_C$), above which the star would not
be able to hold itself apart after it had cooled down.
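The figure of about 1.4 solar masses can be checked with a short calculation. The sketch below assumes the standard textbook expression for the limiting mass of a fully relativistic degenerate electron gas, $M_C = \tfrac{\sqrt{3\pi}}{2}\,\omega_3^0\,(\hbar c/G)^{3/2}/(\mu_e m_H)^2$, where $\omega_3^0 \approx 2.018$ is a constant from the $n = 3$ Lane–Emden equation and $\mu_e$ is the mean molecular weight per electron. Neither this formula nor the choice $\mu_e = 2$ (helium, carbon, or oxygen composition) appears in the text; they are assumptions of the illustration.

```python
import math

# Physical constants (SI units).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
G = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2
M_H = 1.6735575e-27      # mass of a hydrogen atom, kg
M_SUN = 1.98847e30       # solar mass, kg

# Assumed Lane-Emden constant omega_3^0 for a polytrope of index n = 3.
LANE_EMDEN_N3 = 2.018236


def chandrasekhar_mass(mu_e=2.0):
    """Return the limiting white-dwarf mass in kg for mean molecular
    weight per electron mu_e (2.0 is an assumed typical composition)."""
    planck_factor = (HBAR * C / G) ** 1.5   # equals the Planck mass cubed
    prefactor = LANE_EMDEN_N3 * math.sqrt(3.0 * math.pi) / 2.0
    return prefactor * planck_factor / (mu_e * M_H) ** 2


ratio = chandrasekhar_mass() / M_SUN
print(f"M_C is roughly {ratio:.2f} solar masses")
```

Note that the limit depends only on fundamental constants and on the composition through $\mu_e$, scaling as $1/\mu_e^2$.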
This raised a very disturbing question. Stars whose mass is much greater than
$M_C$ are observed to exist, though with temperatures that are currently very high,
so that they can be held apart by pressures resulting from the thermal motions of
the stars’ constituent particles. Eventually, however, the thermonuclear processes
involved in maintaining such temperatures will begin to run out. The picture pre-
sented by detailed astrophysical analysis is that material of the star starts to extend
outwards, but accompanied by a central white-dwarf core. This core begins to ac-
quire more and more of the star’s mass, as the outer material continues to expand,
and the star becomes what is called a red giant. As the nuclear fuel starts to be
used up, this core acquires more and more material. If the total mass is not too
great (depending upon details), the star may eventually shrink down as the core
grows, and the whole configuration becomes that of a white dwarf of mass less than
$M_C$, finding peace as a white dwarf star. But for a star of mass much greater than
$M_C$ the growing core would collapse, accompanied by infalling outer material that
could reach enormous temperatures, and nuclear processes then lead to a supernova
explosion, removing much mass from the system. This may result in an even more
concentrated core region, where neutron degeneracy pressure takes over from the
white-dwarf’s electron degeneracy pressure, and the core becomes an entity of far
greater density, called a neutron star (something like the entire mass of the sun
concentrated into a region of roughly 10 kilometres in radius). Again, there is a
limit to the mass that can be supported against gravitational collapse, sometimes
referred to as the Landau limit (here referred to as $M_L$) after the distinguished Soviet
physicist Lev Landau. However, the exact value that should be attached to $M_L$ is
very much dependent upon details of high-energy particle physics that were not
known in Landau’s time, and there is still much uncertainty. It appears that neu-
tron stars can have up to about twice the mass of the Sun (as has been currently
observed), but probably not much more.
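The two comparisons quoted above (a solar mass packed into an Earth-sized white dwarf, and into a neutron star of roughly 10 km radius) translate into mean densities as in the following rough sketch. The uniform-sphere approximation and the specific values used for the solar mass and the Earth’s radius are assumptions made for the illustration.

```python
import math

M_SUN = 1.98847e30    # solar mass, kg
R_EARTH = 6.371e6     # mean radius of the Earth, m
R_NEUTRON = 1.0e4     # the text's "roughly 10 kilometres", m


def mean_density(mass, radius):
    """Mean density (kg/m^3) of a uniform sphere of given mass and radius."""
    volume = 4.0 / 3.0 * math.pi * radius**3
    return mass / volume


rho_white_dwarf = mean_density(M_SUN, R_EARTH)
rho_neutron_star = mean_density(M_SUN, R_NEUTRON)

print(f"white dwarf : {rho_white_dwarf:.2e} kg/m^3")
print(f"neutron star: {rho_neutron_star:.2e} kg/m^3")
print(f"ratio       : {rho_neutron_star / rho_white_dwarf:.2e}")
```

On these assumptions the white dwarf comes out at roughly $10^9$ kg/m³ and the neutron star at several times $10^{17}$ kg/m³, the ratio being simply the cube of the ratio of the radii.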
Nevertheless, stars are observed which have a much greater mass, apparently
up to 300 times the mass of the sun, and the more massive a star may be, the
more quickly it is expected to evolve through its different stages. When nuclear
fuel runs out, the star will collapse, and it is hard to imagine that in the ensuing
evolution there can be an explosive emission of mass that would result in a remnant
that does not hugely exceed both $M_C$ and $M_L$. The importance
of this problem was appreciated by Chandrasekhar [Chandrasekhar, S. (1931)]. But
Eddington [Eddington, A.S. (1935)] formed the opinion that there must be some-
thing wrong in the analysis, or perhaps the very laws of physics themselves needed
profound change (see [Eddington, A.S. (1946)]).

5.3. Collapse to a Black Hole


In 1939, J. Robert Oppenheimer and his student Hartland Snyder [Oppenheimer,
J.R. and Snyder, H. (1939)] addressed the issue head on, by considering
what GR had to say about the problem, applying their considerations to the
simplified situation where the collapsing material consists just of “dust”; see Fig. 5.1.

Fig. 5.1. Space-time picture of Oppenheimer–Snyder gravitational collapse to a black hole, with
null cones indicated. (Labels in the figure: singularity, horizon, observer, time, collapsing matter.)

This is an idealized type of material where there is no pressure. Moreover, all
the cosmological models explicitly referred to above (Friedmann, Lemaître, and
Einstein) also take such “dust” as the source term for the GR equations, since
they were considering things on a much larger scale where the “dust particles”
would represent individual galaxies (or perhaps clusters of galaxies) moving
independently of one another. This gross type of approximation is still very commonly
used in cosmology. (In fact, the local geometry of the infalling dust material, in the
Oppenheimer–Snyder space-time, was modelled exactly by that of one of the above
Friedmann–Lemaître cosmological GR solutions!) The energy-momentum tensor of
such “dust” can be written simply as

$$T_{ab} = \mu\,\nu_a \nu_b,$$

where $\mu$ is the mass density and $\nu^a$ is the 4-velocity of the material, at each point
in space-time, normalized to be a unit vector, according to

$$\nu_a \nu^a = 1.$$

As an interlude to these physical considerations, it will be helpful to make some
comments about notation. I am adopting the very useful abstract-index notation
[Penrose, R. (1968)], [Penrose, R. and Rindler, W. (1984)], which superficially
resembles what pure geometers may refer to as “physicists’ notation”. However, while
being designed to resemble that notation in appearance and in explicit calculation,
the symbols do not refer to components of tensor (or vector) quantities in some
coordinate patch, but they actually represent the appropriate tensor/vector
quantities directly. This is perfectly rigorous and is described in detail in [Penrose, R.
and Rindler, W. (1984)], and also in [Penrose, R. (1968)]. We may note that the
conventions require that whereas $\nu^a$ is a vector field, $\nu_a$ is the covector field
obtained by transvecting $\nu^a$ with the metric tensor field $g_{ab}$, as expressed (mirroring
Einstein’s summation convention) by

$$\nu_a = g_{ab}\,\nu^b.$$

We must take into account that although $\nu^b$ and $\nu^a$ each stand for the same
vector field, it would be incorrect, algebraically, to write $\nu^b = \nu^a$, since a different
member of the (infinite) alphabet of abstract index letters is associated with
each term. This is what allows the abstract-index formalism to mirror the physicist’s
coordinate-based notation. The dual metric tensor is $g^{ab}$, and we have, conversely,

$$\nu^b = g^{ab}\,\nu_a,$$

where $g^{ab}$ is the contravariant form of the metric (inverse to $g_{ab}$), and all the
standard rules of the “physicist’s notation” (including Einstein’s summation
convention, as used above) hold true. On the odd occasion where I actually wish to
refer to a specific set of actual coordinates or local basis frames I shall use upright
bold index letters rather than the above lightface italic abstract ones.
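Although the abstract-index expressions above are not statements about components, they can be checked numerically once a specific frame is chosen. The following minimal sketch works in a local orthonormal frame, where the metric components are diag(1, −1, −1, −1), with units in which c = 1. The helper names and the sample velocity are hypothetical; the code lowers the index on a 4-velocity and confirms the unit normalization.

```python
import math

# Components of the metric in an orthonormal frame, signature (+ - - -).
# Only the diagonal is stored because the off-diagonal entries vanish.
METRIC_DIAGONAL = [1.0, -1.0, -1.0, -1.0]


def four_velocity(vx, vy, vz):
    """Frame components of the unit 4-velocity for 3-velocity (vx, vy, vz),
    measured in units where c = 1."""
    speed_squared = vx * vx + vy * vy + vz * vz
    if speed_squared >= 1.0:
        raise ValueError("speed must be less than that of light")
    gamma = 1.0 / math.sqrt(1.0 - speed_squared)
    return [gamma, gamma * vx, gamma * vy, gamma * vz]


def lower_index(vector):
    """Covector components obtained by contracting with the diagonal metric."""
    return [g * v for g, v in zip(METRIC_DIAGONAL, vector)]


def contract(covector, vector):
    """Sum over the repeated index (Einstein summation convention)."""
    return sum(a * b for a, b in zip(covector, vector))


nu_up = four_velocity(0.6, 0.0, 0.0)   # [1.25, 0.75, 0.0, 0.0]
nu_down = lower_index(nu_up)           # spatial components change sign
norm = contract(nu_down, nu_up)        # equals 1 up to rounding
print(nu_up, nu_down, norm)
```

Because the frame metric is its own inverse, the same `lower_index` helper also raises the index again in this particular frame.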
I should also explain the specific sign conventions that I normally use [Penrose,
R. and Rindler, W. (1984)], [Penrose, R. and Rindler, W. (1986)], but where, for
convenience in relation to the discussions here, I reverse my previous signs of the
Ricci tensor $R_{ab}$ and Ricci scalar $R$. The metric signature is $(+\,-\,-\,-)$, and the
signs I here adopt for the Riemann curvature tensor $R_{abcd}$, Ricci tensor $R_{ab}$, and
scalar curvature $R$ are determined by

$$(\nabla_a \nabla_b - \nabla_b \nabla_a) V^d = R_{abc}{}^{d}\, V^c, \quad R_{bc} = R_{abc}{}^{a}, \quad \text{and} \quad R = R_a{}^{a}, \tag{5.3.1}$$

where $\nabla_a$ stands for covariant derivative. The equations of Λ-GR are

$$R_{ab} - \tfrac{1}{2} R\, g_{ab} - \Lambda g_{ab} = 8\pi G\, T_{ab},$$

where $T_{ab}$ is the energy-momentum tensor of the matter (source of gravity), Λ is
Einstein’s cosmological constant, and $G$ is Newton’s gravitational constant.
Having dealt with such notational issues in this interlude, let us return to what
Oppenheimer and Snyder actually showed. In their model, they demonstrated that
a spherically symmetrical cloud of dust, falling inwards towards a central point, could,
according to the strict rules of GR, reach a state at which densities become
infinite, so that the equations of GR cannot be continued beyond that. Moreover,
an examination of the causal structure of the Oppenheimer–Snyder model (here
abbreviated O–S) reveals that it has a “horizon”, from within which signals cannot
escape to infinity, as well as a central singularity. This provides the picture that we
have come to recognize as collapse to a black hole. See Fig. 5.1.
Yet, many questions can be raised concerning the physical trustworthiness of the
O–S model. To begin with, one might consider that the assumption that the collapsing
material is just “dust” is too restrictive, since in this idealization there is no pressure
to hold the infalling material apart. Curiously, as it turns out, in GR collapse
situations, the presence of pressure in the matter actually makes things worse,
because GR’s equations tell us that, in a sense, it is really the trace-reversed
energy-momentum tensor

$$T_{ab} - \tfrac{1}{2} T g_{ab}, \quad \text{where } T = T_a{}^{a}, \tag{5.3.2}$$

that acts as “source of Ricci tensor”, which is what we see in the Λ-GR equations
when written in the equivalent trace-reversed form

$$R_{ab} = 8\pi G \left( T_{ab} - \tfrac{1}{2} T g_{ab} \right) - \Lambda g_{ab}, \tag{5.3.3}$$

a description which will have importance for us later (§5.7).
In normal circumstances, one can find, at each point, a local pseudo-orthonormal
vector basis frame $(\delta^a_{\mathbf{0}}, \delta^a_{\mathbf{1}}, \delta^a_{\mathbf{2}}, \delta^a_{\mathbf{3}})$, i.e.

$$g_{\mathbf{jk}} = g_{ab}\,\delta^a_{\mathbf{j}}\,\delta^b_{\mathbf{k}} = \delta_{b\mathbf{j}}\,\delta^b_{\mathbf{k}} = \mathrm{diag}(1, -1, -1, -1), \tag{5.3.4}$$

for which the components $T_{\mathbf{jk}}$ of the energy tensor $T_{ab}$ take the form

$$T_{\mathbf{jk}} = \mathrm{diag}(\mu, p_1, p_2, p_3),$$

$\mu$ being the energy density and $p_1$, $p_2$, and $p_3$ being the principal pressures.
For a perfect fluid, $p_1 = p_2 = p_3$, so we just have a single pressure quantity $p$:

$$T_{\mathbf{jk}} = \mathrm{diag}(\mu, p, p, p).$$

The trace-reversed energy tensor is now given by components

$$T_{\mathbf{jk}} - \tfrac{1}{2} T g_{\mathbf{jk}} = \tfrac{1}{2}\,\mathrm{diag}(\mu + 3p,\ \mu - p,\ \mu - p,\ \mu - p).$$

It is the time-time component $(\mu + 3p)/2$, rather than just $\mu$, that we may regard
as being responsible for the overall inward pull of the matter, in GR, as we shall
be seeing in §5.7, which accounts for the curious fact that a positive pressure in
collapsing material can increase, rather than reduce, the overall tendency for a body
to collapse in GR.
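The component formula for the trace-reversed tensor is easy to verify by direct arithmetic. The sketch below assumes the orthonormal frame of (5.3.4) and a perfect fluid; the function name is an illustrative choice. It forms the trace $T = \mu - 3p$ and subtracts half of it times the metric.

```python
# Diagonal components of g_jk in the orthonormal frame, signature (+ - - -).
METRIC_DIAGONAL = [1.0, -1.0, -1.0, -1.0]


def trace_reversed_perfect_fluid(mu, p):
    """Diagonal frame components of T_jk - (1/2) T g_jk for a perfect fluid
    with energy density mu and pressure p."""
    energy_tensor = [mu, p, p, p]
    # T = g^{jk} T_jk; in an orthonormal frame the inverse metric has the
    # same diagonal entries as the metric, so T = mu - 3p.
    trace = sum(g * t for g, t in zip(METRIC_DIAGONAL, energy_tensor))
    return [t - 0.5 * trace * g for g, t in zip(METRIC_DIAGONAL, energy_tensor)]


dust = trace_reversed_perfect_fluid(mu=1.0, p=0.0)
radiation = trace_reversed_perfect_fluid(mu=1.0, p=1.0 / 3.0)
print("dust     :", dust)        # time-time component mu/2
print("radiation:", radiation)   # time-time component (mu + 3p)/2 = mu
```

For dust the time-time component is $\mu/2$, whereas for a radiation-like fluid with $p = \mu/3$ it doubles to $\mu$, which is the sense in which pressure adds to the inward pull.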
Nevertheless, at a local level, dust is indeed liable to lead to spurious local
regions of infinite density. It follows immediately from the “conservation law”

$$\nabla^a T_{ab} = 0$$

(not a proper integral conservation law because of the extra index $b$) that,
wherever $\mu \neq 0$, the timelike unit vectors $\nu^a$ are tangents to geodesics. If such geodesics
are non-rotating, then they are likely to encounter caustics (an issue that we shall
be more seriously concerned with later, see §5.6, §5.7), and at such caustic points,
the space-time density, and therefore the space-time curvature, becomes
infinite, so we have a space-time singularity (see [Yodzis, P., Seifert, H.-J. and Muller
zum Hagen, H. (1973)]). However, from the point of view of the general
discussion of gravitational collapse, such singularities would normally be considered to
be spurious, because with actual matter, to which this “dust” description would
be considered to be just an approximation, the material particles would push each
other apart (by contact forces) when they get too close to one another, so that
these caustic regions would not be expected to give rise to actual physical
singularities, once a more realistic description of the collapsing material is adopted.
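The step from the conservation law to geodesic motion can be spelled out as follows. This is a sketch of the standard argument, using only the dust form $T_{ab} = \mu\,\nu_a\nu_b$ and the normalization $\nu_a\nu^a = 1$ given earlier; the parenthetical labels are added for orientation.

```latex
\begin{align*}
0 &= \nabla^a T_{ab}
   = \nu_b\,\nabla^a(\mu\,\nu_a) + \mu\,\nu^a \nabla_a \nu_b , \\
% contract with \nu^b; note \nu^b \nabla_a \nu_b = \tfrac12 \nabla_a(\nu_b \nu^b) = 0
0 &= \nabla^a(\mu\,\nu_a)
   \qquad \text{(conservation of the mass flux)} , \\
% substitute this back into the first line
0 &= \mu\,\nu^a \nabla_a \nu_b
   \;\Longrightarrow\; \nu^a \nabla_a \nu_b = 0
   \quad \text{wherever } \mu \neq 0 .
\end{align*}
```

The last line is the geodesic equation for the integral curves of $\nu^a$: the unit tangent is parallelly propagated along itself.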
For reasons such as this, many physicists were reluctant to accept the O–S pic-
ture as representative of the general case of gravitational collapse. More importantly,
a detailed discussion of general space-time singularities in GR had been carried out
by the Russian physicists Evgeny Mikhailovich Lifshitz and Isaak Markovich Kha-
latnikov in 1963 [Lifshitz, E.M. and Khalatnikov, I.M. (1963)], and they had come
to the conclusion that in the general case, actual singularities would not occur in
solutions of the GR equations. In accordance with such a picture, a collapsing star
would, in realistic situations, not follow the behavior described in the O–S picture,
but when the irregularities in the collapsing material become close to the central
regions, deviations from the assumed spherical symmetry would become extreme,
and the O–S picture would be no longer relevant. Instead, a complicated swirling
around near the centre might take over, and we could well hold to the view that
most of the collapsing material would be slung out from the central regions, so that
a non-singular space-time could be the result.

Fig. 5.2. Supposed classical cosmological non-singular space-time “bounce” via very complicated
intermediary phase. (Axis label in the figure: time.)

Similarly, in many cosmological models, such as that described by Friedmann
in 1922 [Friedmann, A. (1922)], there are collapsing phases of the model which
reach a singular moment, where matter densities and space-time curvatures become
infinite. The solution carries on, however, and a singular “bounce” occurs in the
model, where the collapse is converted to an expansion. One may well consider that
the assumed spherical symmetry of the model is what gives rise to the singularity,
and that a more realistic irregular collapse, with perhaps some significant rotation,
might swirl itself out and convert the situation of the collapse into an expansion, via
an extremely complicated, though non-singular, intermediate phase. See Fig. 5.2 for
a fanciful impression of such a complicated transition. It is the main purpose of the
remainder of this article to provide the essentials of the arguments that show that
such classical “bounces” are, however, inconsistent with basic physical requirements
of classical general relativity.

5.4. The Problem of Generic Gravitational Collapse


It was in 1963 that the Dutch (-American) radio-astronomer Maarten Schmidt
identified the radio source 3C 273, as the first example of what we now refer to as a
quasar. It became evident, from its frequent variations in brightness, that this object
could not be of greater diameter than something like our solar system. Yet, from
its overall emission rate of energy, far exceeding an entire normal galaxy’s output,
it was realized that we were observing an entity that must be of such concentration
of mass, that considerations of GR tell us that, very remarkably, this activity must
be taking place near this entity’s “Schwarzschild radius”.
This term refers to Karl Schwarzschild’s famous 1915 solution of the GR equa-
tions, in the case of spherical symmetry, where for a mass m this is the critical
radius
r = 2mG.
The physical nature of this Schwarzschild radius was initially misunderstood, and
it was frequently referred to as the “Schwarzschild singularity”. It is now more
correctly referred to as the “Schwarzschild horizon”. In order to appreciate the
issues involved, it will be necessary to examine this Schwarzschild model explicitly,
and to simplify matters I shall henceforth adopt units for which the speed of light
c and Newton’s gravitational constant G are both set equal to unity:
c = 1, G = 1.
The original metric form for Schwarzschild’s solution is
\[ ds^2 = \left(1 - \frac{2m}{r}\right) dt^2 - \left(1 - \frac{2m}{r}\right)^{-1} dr^2 - r^2 \left(d\theta^2 + \sin^2\theta \, d\phi^2\right). \tag{5.4.1} \]
This expression evidently becomes singular when the radial r-coordinate reaches
r = 2m, but it turns out that the space-time can be smoothly extended across
r = 2m to smaller r values if we adopt a suitable coordinate change, replacing the
time-coordinate t by the “advanced” time parameter
u = t + r + 2m log(r − 2m).
Then the extended Schwarzschild metric becomes
\[ ds^2 = \left(1 - \frac{2m}{r}\right) du^2 - 2\, du\, dr - r^2 \left(d\theta^2 + \sin^2\theta \, d\phi^2\right). \tag{5.4.2} \]
Now, the singularity in the metric expression at r = 2m has disappeared, but
instead, we find that the 3-surface given by r = 2m is a null hypersurface, i.e.
its intrinsic geometry has a degenerate metric, this being −r²(dθ² + sin²θ dφ²),
which has rank only 2, the u-coordinate being not represented. This 3-surface is
the Schwarzschild space-time’s horizon, so-called for reasons that we shall come
to later. It is interesting that as far back as 1933, Lemaître [Lemaître, G. (1933)]
had already clearly understood that the “Schwarzschild singularity” is really only
some sort of horizon, that could be crossed by infalling particles, but he did not
have the very simple metric form given in (5.4.2). The metric form (5.4.2) is often
called the Eddington–Finkelstein metric (here abbreviated E–F), after Eddington
[Eddington, A.S. (1924)] and David Finkelstein [Finkelstein, D. (1958)], although
Eddington’s purpose was very different from ours, and he did not comment on the
singularity/horizon issue. (It may also be mentioned that others, even well before
Lemaître’s 1933 publication [Lemaître, G. (1933)], had found coordinates extending
through r = 2m, the most notable being the mathematician Paul Painlevé in 1921
[Painlevé, P. (1921)], but there was little understanding of the physical nature of
such extensions.)
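
The passage from (5.4.1) to (5.4.2) can be checked with exact arithmetic. The following is a minimal sketch, not part of the original argument: the function names are illustrative, the angular term is omitted because it is identical in both forms, and the relation du = dt + (1 + 2m/(r − 2m)) dr is obtained by differentiating u = t + r + 2m log(r − 2m).

```python
from fractions import Fraction


def schwarzschild_ds2(m, r, dt, dr):
    """Radial-temporal part of (5.4.1); undefined at r = 2m."""
    f = 1 - Fraction(2 * m) / r
    return f * dt**2 - dr**2 / f


def ef_ds2(m, r, du, dr):
    """Radial-temporal part of the Eddington-Finkelstein form (5.4.2)."""
    f = 1 - Fraction(2 * m) / r
    return f * du**2 - 2 * du * dr


def advanced_du(m, r, dt, dr):
    """du obtained by differentiating u = t + r + 2m log(r - 2m)."""
    return dt + (1 + Fraction(2 * m) / (r - 2 * m)) * dr


# Sample displacement at r = 5m (values chosen arbitrarily for illustration).
m, r = 1, Fraction(5)
dt, dr = Fraction(3, 7), Fraction(-2, 5)
du = advanced_du(m, r, dt, dr)
same_interval = schwarzschild_ds2(m, r, dt, dr) == ef_ds2(m, r, du, dr)

# At r = 2m the form (5.4.2) stays finite, whereas (5.4.1) would divide by zero.
on_horizon = ef_ds2(1, Fraction(2), Fraction(1), Fraction(1))
```

Using `Fraction` rather than floating point makes the agreement of the two line elements an exact identity for the sampled displacement, rather than an approximate one.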
We notice that even though the apparent “Schwarzschild singularity” at r = 2m
in (5.4.1) has been removed by this mere change of coordinates, the problem at
r = 0 remains. It certainly cannot be removed by a coordinate change because the
space-time curvature diverges to infinity there, as appropriate calculation shows.
We might take the view that some realistic matter source, rather than the vacuum
(R_ab = 0, Λ = 0) used in (5.3.3), might provide a smooth continuation to r = 0, or
perhaps the strict assumption of exact spherical symmetry might be the culprit.
However, as we shall be seeing in §5.7, neither of these modifications can provide
a way out, despite the claims that had been made by Lifshitz and Khalatnikov
[Lifshitz, E.M. and Khalatnikov, I.M. (1963)] in 1963.
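
The "appropriate calculation" can be illustrated with a standard curvature invariant. The sketch below assumes the well-known value of the Kretschmann scalar for the Schwarzschild solution, K = R_abcd R^abcd = 48m²/r⁶ (in units with G = c = 1); this formula is quoted as a known result, not derived here, and the function name is illustrative.

```python
from fractions import Fraction


def kretschmann(m, r):
    """Kretschmann scalar 48 m^2 / r^6 of the Schwarzschild solution (G = c = 1)."""
    return Fraction(48 * m**2) / Fraction(r) ** 6


# Finite at the horizon r = 2m, but unbounded as r approaches 0.
at_horizon = kretschmann(1, 2)
near_centre = [kretschmann(1, Fraction(1, 10**k)) for k in range(1, 4)]
```

Because K is a scalar, its value does not depend on the coordinates used, which is why its finiteness at r = 2m and its divergence as r → 0 distinguish the horizon from the genuine singularity.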
Fascinated by the quasar observations — and much stimulated by comments
made to me by John Archibald Wheeler, of Princeton University, about the
importance of these observations for general relativity and for fundamental physics
generally — I began to think seriously about the issue of whether or not singu-
larities might be a general consequence of gravitational collapse. I had not studied
the work of Lifshitz and Khalatnikov in any detail, but I was aware of the gen-
eral kind of mathematical procedure involved in their arguments, and I did not
feel confident that one could come to a clear-cut conclusion by using techniques
of that nature. Consequently, I began to try to think, myself, about whether I
believed that singularities would occur in a generic collapse. I came to regard it as
unlikely that the forcing of a singularity could be simply a local matter, whereby
some purely local condition on densities or curvatures becoming too large could be
what leads inevitably to runaway behaviour ending in singularity. We recall that
with the singularities in dust, which occur when geodesic world-lines encounter a
caustic, such local divergence in density can be removed by a slight change in the
equations governing the matter. It had seemed to me that the forcing of singular
behaviour in gravitational collapse must be something quite different, and instead
be understood in terms of some more global criterion that tells us when collapsing
material has surpassed an overall limit beyond which there can be no escape from
runaway behaviour leading to catastrophe.
I had been aware, from earlier work [Penrose, R. (1964)], [Penrose, R. (1966)] of
the way that the focussing of null geodesics (henceforth rays) relates to the energy
flux (including that of gravity) across them. I had also gained some expertise (see
appendix of [Penrose, R. (1965)]) concerning the structure of those 3-dimensional
spaces that can act as boundaries of the futures of sub-regions in curved space-times.
I took the view that such concepts could prove invaluable for identifying limits to
what can be achieved in singularity-free GR. For some while, I could identify no
plausible candidate for this. But in the autumn of 1964, an idea occurred to me
(the unusual circumstances of which are related in [Penrose, R. (1989)], pp. 542–544)
which provided an appropriate criterion for an irretrievable gravitational collapse.
This was the notion of a trapped surface, which I shall come to in §5.7, but in the
next two sections we shall need some appropriate mathematical preliminaries so
that we can appreciate the fundamental problems confronting space-time structure
in gravitational collapse.

5.5. Causal Structure of Space-Times


Let us start with some general concepts of Lorentzian space-time geometry. Al-
though much of what I have to say applies generally, for Lorentzian manifolds
of any dimension (preferably ≥ 3), for explicitness in the topics at hand I shall
phrase my descriptions so that they apply specifically to a 4-dimensional
Lorentzian manifold M. Thus, our metric g_ab has signature (+−−−), and a
pseudo-orthonormal vector basis (δ_0^a, δ_1^a, δ_2^a, δ_3^a) can be assigned at any chosen point, as in
(5.3.4). A non-zero tangent vector ν^a can be either timelike, spacelike, or null,
according as g_ab ν^a ν^b is positive, negative, or zero. The null cone at each point p of
M consists of the family of null tangent vectors at p. We shall require that M be
connected, and that it be time-oriented, which means that the two components, at
each point, of the non-spacelike (and non-zero) tangent vectors can be separated
into two continuously disjoint systems, which we refer to as future-pointing and
past-pointing. Accordingly, we can consistently designate, in a continuous way, the
two separate parts of the null cone at any point p as the future cone and the past
cone, thereby assigning to M its time-orientation.
If two distinct points p and q of M can be connected by a smooth future-timelike
curve on M, from p to q (where “future-timelike” means that its oriented tangent
vectors are timelike, future-pointing), then we write p ≪ q, which we read as “q lies
to the chronological future of p”. If the connecting curve from p to q is everywhere
future-causal (i.e. future-timelike or future-null), then we write p ≺ q, which we
read as “q lies to the causal future of p”. We have the elementary properties:

p ≪ q implies p ≺ q,
p ≪ q and q ≪ r together imply p ≪ r,
p ≺ q and q ≪ r together imply p ≪ r,
p ≪ q and q ≺ r together imply p ≪ r,
p ≺ q and q ≺ r together imply p ≺ r.

(See [Kronheimer E.H. and Penrose, R. (1967)], and also [Penrose, R. (1972)] where
the definitions of p ≺ q and p ≪ q are given in terms of geodetic links or finite
successions of geodetic links. This has certain advantages in simplifying some of the
proofs.)
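
The elementary properties can be checked by brute force in the simplest case. The sketch below assumes flat 2-dimensional Minkowski space with points written as (t, x), where the chronological relation reduces to Δt > |Δx| and the causal relation to Δt ≥ |Δx|; the function names and the sample grid are illustrative only, and a finite check of this kind is an illustration, not a proof.

```python
from itertools import product


def chron(p, q):
    """Chronological relation: q lies strictly inside the future cone of p."""
    dt, dx = q[0] - p[0], q[1] - p[1]
    return dt > abs(dx)


def causal(p, q):
    """Causal relation: q lies inside or on the future cone of p (p itself included)."""
    dt, dx = q[0] - p[0], q[1] - p[1]
    return dt >= abs(dx)


def elementary_properties_hold(points):
    """Test the five listed implications on every triple of sample points."""
    for p, q, r in product(points, repeat=3):
        if chron(p, q) and not causal(p, q):
            return False
        if chron(p, q) and chron(q, r) and not chron(p, r):
            return False
        if causal(p, q) and chron(q, r) and not chron(p, r):
            return False
        if chron(p, q) and causal(q, r) and not chron(p, r):
            return False
        if causal(p, q) and causal(q, r) and not causal(p, r):
            return False
    return True


# A small integer grid of events (t, x), so that all comparisons are exact.
sample_points = [(t, x) for t in range(-2, 3) for x in range(-2, 3)]
all_hold = elementary_properties_hold(sample_points)
```

Points on the null cone, such as (0, 0) and (1, 1), are causally but not chronologically related, which is what makes the mixed implications (third and fourth in the list) non-trivial.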
We assume that M is strongly causal [Hawking, S.W. and Ellis, G.F.R. (1973)],
[Penrose, R. (1972)], which means that there is no pair of distinct points p, q in
M, with p ≺ q, such that for every pair of open neighbourhoods P of p and Q
of q there are points u ∈ P and v ∈ Q such that v ≪ u. This condition is a bit
stronger than mere absence of causality violation, which forbids the existence of
pairs of distinct points u, v such that both u ≪ v and v ≪ u hold. Strong causality
asserts, in effect, the absence of “almost closed” timelike curves. We assume that
M is, indeed, strongly causal. With strong causality we certainly never have p ≪ p.
However, the relation p ≺ p is defined to hold always. Any connected open subset
of a strongly causal space-time is itself also strongly causal.
The chronological futures I⁺(q) of a point q ∈ M and I⁺[Q] of a subset Q ⊂ M
are defined by

\[ I^+(q) = \{x \in M \mid q \ll x\} \quad\text{and}\quad I^+[Q] = \bigcup_{x \in Q} I^+(x) \]

and, similarly, for the chronological pasts

\[ I^-(q) = \{x \in M \mid x \ll q\} \quad\text{and}\quad I^-[Q] = \bigcup_{x \in Q} I^-(x). \]

In the same way, we can define the causal future and past of a point q ∈ M or
subset Q ⊂ M by

\[ J^+(q) = \{x \in M \mid q \prec x\} \quad\text{and}\quad J^+[Q] = \bigcup_{x \in Q} J^+(x), \]

\[ J^-(q) = \{x \in M \mid x \prec q\} \quad\text{and}\quad J^-[Q] = \bigcup_{x \in Q} J^-(x), \]

but I shall not be quite so concerned with this concept here. It is easy to see that
the operations of taking the chronological future and past are each idempotent:

\[ I^+[I^+[Q]] = I^+[Q], \qquad I^-[I^-[Q]] = I^-[Q] \]

and we call such sets future-sets and past-sets, respectively, a future-set being char-
acterized as being its own chronological future and a past-set, its own chronological
past. Equivalently, a future-set is simply the chronological future of some subset of
M, and a past-set, the chronological past of some subset of M. Past-sets and future-
sets are always open sets, in the topology of the manifold M (although it may be
remarked that causal futures {x ∈ M | q ≺ x} and causal pasts {x ∈ M | x ≺ q}
need not always be closed sets). With the assumption of strong causality, made
here for M, we may obtain M’s manifold topology from its chronological struc-
ture, by means of its Alexandroff topology, for which the open sets are unions of
basic neighbourhoods of the form {x ∈ M | p ≪ x ≪ q} for every pair p, q, with
p ≪ q.
Of particular interest are those future-sets and past-sets that are irreducible. An
irreducible future-set, or IF, is a future-set that is not the union of two future-sets
unless one is contained in the other. Likewise, an irreducible past-set, or IP, is a
past-set that is not the union of two past-sets unless one is contained in the other.
It is a basic theorem that every IF is the chronological future of — or generated
by — a connected timelike curve in M and every IP is the chronological past of
(generated by) a connected timelike curve in M [Seifert, H.-J. (1971)], [Geroch,
R., Kronheimer E.H., and Penrose, R. (1972)], [Penrose, R. (1998)]. It is also the
case that the chronological future of any connected causal curve in M is an IF,
and likewise the chronological past of that curve is an IP. As with the case of a
connected timelike curve, we can say that such a curve generates the IF or IP in
question, but there is a difference in that such a causal generating curve need not
lie within the IF or IP that it generates.
IFs are of two kinds, namely the PIFs (pointed IFs) — which are the chronologi-
cal futures of single points in M (such a point — called a generating point — being
the past end-point of a curve generating the PIF) — and those which are not the
futures of individual points in M, called TIFs (terminal IFs), which are generated
by past-endless timelike curves (i.e. not further extendible into the past, as timelike
curves). Likewise, any IP is either a PIP (pointed IP) — the chronological past of
a single generating point in M — or else it is a TIP (terminal IP), generated by a
future-endless timelike curve. With the assumption of strong causality made here,
the system of PIFs is in 1 : 1 correspondence with the system of points in M that
generate those PIFs, and this system of points is, in turn, in 1 : 1 correspondence
with the system of PIPs that those points generate. However, the systems of TIFs
and TIPs give us something new. We may think of the TIFs and TIPs as supplying
ideal points for the manifold M, the TIFs supplying past boundary points to M,
and the TIPs supplying future boundary points for M.
In the case of the flat Minkowski space-time 𝕄, the TIPs provide 𝕄’s future
conformal boundary [Penrose, R. (1965)], [Penrose, R. (1964)], which consists of
a 3-dimensional region ℐ⁺ together with a single ideal point i⁺. The TIP i⁺ is
generated by any timelike straight line in 𝕄, and the ideal points that constitute
ℐ⁺ are generated by null straight lines (rays) in 𝕄, where two rays in 𝕄 generate
the same ideal point in ℐ⁺ if they belong to the same null hyperplane in 𝕄.
Likewise, the TIFs of 𝕄 provide 𝕄’s past conformal boundary, consisting of the
3-dimensional region ℐ⁻ of ideal points that are generated by rays in 𝕄, together
with a single ideal point i⁻ generated by any timelike straight line. See Fig. 5.3.

Fig. 5.3. Minkowski space, conformally compactified, has i⁺ and the points of ℐ⁺ represented as
TIPs, and i⁻ and the points of ℐ⁻ represented as TIFs. (Spatial infinity i⁰ is not represented in this
way.)

This type of structure is shared by other asymptotically flat space-times, but
there may be additional TIPs or TIFs corresponding to the singular points of such
a space-time M. A good illustrative example is the O–S model of a gravitationally
collapsing dust cloud. Here, the matter-free part of the space-time is described by
a portion of the future-extended Schwarzschild metric given by the E–F expression
(5.4.1–5.4.2), and the collapsing dust cloud by a portion of a Friedmann–Lemaître
cosmological model, as is indicated (with future null-cone structure) at the bottom
of Fig. 5.3. We would have past ideal points (TIFs) very similar to the ℐ⁻ and i⁻
of 𝕄, and we also have future ideal points corresponding to the TIPs giving the ℐ⁺
of 𝕄, but now there are some additional TIPs, corresponding to the singularity at
r = 0. These “singular TIPs” are generated by the timelike curves that terminate
at the r = 0 singularity.
One lesson we learn from this is that, despite a natural expectation that this
singular region should be viewed as a single point (perhaps stretched out to a
line through its persistence with time), the TIP perspective gives us a different
picture, namely that this singularity should actually be regarded as a 3-dimensional
surface! We might try to think of this 3-surface as being a 2-surface persisting with
time, but this is not really appropriate. Locally, near the singularity, the time-
direction has become spacelike, so it is better to think of the entire singularity
as actually spacelike! This may well be a feature of the singularities that arise in
general relativity that come about in generic realistic circumstances — this being
the contention that is referred to as the hypothesis of strong cosmic censorship
[Penrose, R. (1969)], [Penrose, R. (1998)].
In fact, the TIP/TIF point of view enables us to adopt a considerably more
constructive attitude to singularities in GR than might have been anticipated, and
it becomes meaningful to provide some kind of causal structure to singularities. But
before addressing this issue, it is important to try to distinguish those TIPs and
TIFs that actually correspond to singular points from those which refer to points
at infinity. The most direct way of doing this would appear to be to label a TIP
as a “point at future infinity”, or an “∞-TIP”, if it has a generating curve that
extends into the future to infinite length. Correspondingly, we can label a TIF as an
“∞-TIF” if it has a generating curve that extends into the past to infinite length.
Then we could refer to the TIPs which are not ∞-TIPs as singular TIPs and the
TIFs which are not ∞-TIFs as singular TIFs.
This is not unreasonable, and I shall loosely follow this terminology here, but
a few clarifying points ought to be made. The first is that there might be
circumstances under which by following such a generating curve (into the future for a
TIP or into the past for a TIF) we could find that some kind of singular behaviour
ensues, such as a divergence of curvature scalars, or other catastrophic behaviour,
and in such a case one might regard the TIP or TIF as representing some kind
of “singularity at infinity”. Another point to bear in mind is that the “length” of
a timelike curve, namely ∫ ds, is really a time measure (“proper time”), and that
in a Lorentzian geometry, the integral of the proper time along a timelike curve is
maximized rather than minimized, locally, when that curve is a geodesic (so that
“wiggling” a timelike curve between two fixed points reduces, rather than increases,
its length). Accordingly, having a timelike generator of infinite length (into the future
for a TIP and into the past for a TIF) is a non-trivial restriction.
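
The effect of “wiggling” can be seen numerically. The following sketch assumes flat 2-dimensional Minkowski space and approximates curves by piecewise-straight timelike paths through events (t, x); the function name and the two sample paths are illustrative choices.

```python
import math


def proper_time(path):
    """Sum of sqrt(dt^2 - dx^2) over the straight segments of a future-timelike path."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(path, path[1:]):
        dt, dx = t1 - t0, x1 - x0
        interval = dt * dt - dx * dx
        if dt <= 0 or interval <= 0:
            raise ValueError("segment is not future-timelike")
        total += math.sqrt(interval)
    return total


# Both paths join the same two events; only the second one wiggles in x.
straight = [(0.0, 0.0), (10.0, 0.0)]
wiggled = [(0.0, 0.0), (2.5, 1.0), (5.0, 0.0), (7.5, -1.0), (10.0, 0.0)]

tau_straight = proper_time(straight)
tau_wiggled = proper_time(wiggled)
```

Here the straight path is the geodesic and has the larger proper time, in contrast with Euclidean geometry, where the straight path would be the shortest.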
However, there is another issue to bear in mind, which is that the singularity
theorems, as so far established, do not directly assert the existence of “singular
TIPs” or “singular TIFs” in the above sense. More to the point would seem to be a
notion of a geodetic-singular TIP or TIF, which is generated by a timelike geodesic
of finite length. What we shall find for the theorem explicitly proved at the end of
§5.7 is the slightly different notion of a singular TIP (or TIF), namely:

A null-singular TIP or TIF is one generated by an affine-finite ray segment.    (5.5.1)

(For the notion of “affine” length for a ray, see (5.7.5)). Although one can envisage
exotic situations in which these various definitions of singular TIPs (or TIFs) differ
from one another, this can only happen where the space-time curvature behaves
badly in the limit when one approaches the end of the generating curve, so the
TIP (or TIF) should certainly be considered to be “singular” in any case, in such
circumstances (compare [Hawking, S.W. and Ellis, G.F.R. (1973)], [Geroch, R. and
Horowitz, G.T. (1987)]).
It is of relevance that causal and chronological relations can be provided for
TIFs and for TIPs. Let P and Q be two TIFs. Then we can define

P ≺ Q iff P ⊇ Q;
P ≪ Q iff there is a point x ∈ M such that P ⊃ I⁺(x) ⊃ Q;

and if U and V are two TIPs, we define

U ≺ V iff U ⊆ V ;
U ≪ V iff there is a point x ∈ M such that U ⊂ I⁻(x) ⊂ V .

If these definitions are applied to PIFs or to PIPs we get the same relation between
their generating points in the case of the chronological relation ≪, but with regard
to the causal relation, there can be some differences. These differences do not oc-
cur, however, when M is what is called “globally hyperbolic”, a restriction (first
considered by the French mathematician Jean Leray) that we come to next.
A situation that can arise in certain space-times (such as in “anti-de Sit-
ter space”; see, for example [Penrose, R. (1972)], [Hawking, S.W. and Ellis,
G.F.R. (1973)]) is that there can be a PIF I⁺(p) that contains a TIF Q:

I⁺(p) ⊇ Q

so that the past ideal point Q lies to the chronological future of an actual space-time
point p, which we might try to write as p ≪ Q. This kind of situation
can cause difficulties when we consider the time development of the standard type
of evolution equations in relativistic physics. Let us consider a generating curve γ
of Q and consider a point q ∈ γ. Then we have a future-timelike curve λ from p to
q. Allowing q to recede indefinitely into the past along γ (Q being a TIF), we find
that the connecting curves λ cannot have a compact limit within M. This would
contradict one of the characterizations of a globally hyperbolic space-time,
namely that for any pair of its points p, q such that p ≪ q, the set of points x such
that p ≪ x ≪ q (i.e. the Alexandroff neighbourhood I⁺(p) ∩ I⁻(q)) has compact
closure. Equivalently, for any p ≪ q, the set of points x such that p ≺ x ≺ q, i.e.
J⁺(p) ∩ J⁻(q), is compact [Penrose, R. (1972)]. This, in turn, is equivalent to the
assertion that, for any p ≪ q, the space of causal curves from p to q is itself compact
[Penrose, R. (1972)]. Thus, global hyperbolicity for the space-time is equivalent to
the absence of any PIF containing a TIF, or (by the time symmetry of the definition
of global hyperbolicity), to the absence of any PIP containing a TIP. Moreover, we
note from the definition of the relation “≪”, for TIFs and for TIPs, that this relation
can never be satisfied for either TIPs or TIFs in a globally hyperbolic space-time.
It is a theorem of Robert Geroch [Geroch, R. (1970)] that a globally hyperbolic
space-time M necessarily has the topology

\[ M \cong \mathbb{R} \times S, \]

where each instance of ℝ, in this product, is a timelike curve, and each instance of S
is a spacelike 3-surface that is a Cauchy hypersurface for M. A Cauchy hypersurface
for a space-time is a hypersurface S with the property that every timelike curve
in the space-time meets S, or can be extended as a timelike curve to meet S.
More concisely, we can say that every endless (i.e. both future- and past-endless)
timelike curve meets S. From the point of view of standard relativistic classical field
equations (such as Maxwell’s free-field equations), the assumption of a Cauchy
surface is a natural one. Initial data for all the fields involved would normally
be specified on such a spacelike hypersurface, and if these equations (and initial
constraint relations) are appropriate ones, one would expect to be able to evolve
the fields uniquely throughout the space-time. Thus, Cauchy hypersurfaces have an
important role to play in relativistic classical field theory [Choquet-Bruhat, Y. and
Geroch, R. (1969)].
Moreover, the role of a Cauchy hypersurface can also have importance in a
more local sense. Assume that the space-time M is strongly causal, though not
necessarily globally hyperbolic. Suppose that, in M, we have a 3-surface S (not
necessarily everywhere smooth) which is achronal, meaning that there is no pair of
points p and q in S for which p ≪ q. Then, we can define the domain of dependence

D(S) = {x ∈ M | every endless timelike curve through x meets S}.

D(S) represents the total region throughout which we would expect that appropri-
ate relativistic classical field equations would have a unique evolution from data on
S. It is useful to separate D(S) into the future and past domains of dependence

\[ D^+(S) = D(S) \cap I^+[S] \quad\text{and}\quad D^-(S) = D(S) \cap I^-[S], \]

respectively which, together with S itself, determine D(S):

\[ D(S) = D^+(S) \cup D^-(S) \cup S. \]

We take note of the fact that the interior region

int D(S),

when regarded as a space-time in its own right, would be globally hyperbolic. An
important fact about int D(S), in relation to this (with particular regard to singularity
theorems), is:

if x ∈ int D(S), then every endless causal curve through x meets S.    (5.5.2)

In particular, every ray through x meets S. For a proof, see [Penrose, R. (1972)]
p. 45. The future and past boundaries of D(S)

\[ H^+(S) = \{x \in D(S) \mid I^+(x) \cap D(S) = \emptyset\}, \qquad H^-(S) = \{x \in D(S) \mid I^-(x) \cap D(S) = \emptyset\}, \tag{5.5.3} \]

are, respectively, the future and past Cauchy horizons of S, concepts introduced by
Stephen Hawking [Hawking, S.W. (1967)], that prove useful in establishing singular-
ity theorems in situations where it is not assumed that a global Cauchy hypersurface
exists.
The other kind of horizon that is important for the singularity theorems is the
boundary of the chronological future (or past) of some sub-region L of M. There is a
curious kind of duality between these two kinds of horizon, which can be understood
in the following way. Let us consider that the achronal region S is part of a much
larger achronal set K, where we are taking K to be large enough that I⁺[K] ∪ I⁻[K]
contains everything of interest to us. Let us consider any point x in D(K). Then
every endless timelike curve through x meets K. Let R be the complement of S
within K, so that K is the disjoint union of S and R:

R ∪ S = K and R ∩ S = ∅.

Any endless timelike curve γ through such a point x must meet K, so
it must meet either R or S. We consider two alternatives, with regard to x. It might
be that every endless timelike curve through x meets S, in which case x ∈ D(S), or
else there is an endless timelike curve through x which does not meet S and which
therefore meets R. In this latter case, either x ∈ I⁺[R] or x ∈ I⁻[R], or else x ∈ R.
It follows that I^±[K] is the disjoint union of I^±[R] and D^±(S):

I⁺[R] ∪ D⁺(S) = I⁺[K], and I⁺[R] ∩ D⁺(S) = ∅.

From this we can see that the common boundary between the future of R and the
domain of dependence of S will have the same kind of structure. This is somewhat
confused, in the above picture, by the fact that this common boundary need not
be the entire boundary of either, because the boundary of the future and Cauchy
horizon of K itself would become relevant. Nevertheless this illustrates the essential
similarity of structure between these two kinds of boundary. Indeed, we may compare
the definition of the Cauchy horizon, given in (5.5.3) above, with the boundary
definition for the chronological future of an achronal set R,

\[ \partial I^+[R] = \{x \in M \mid I^+(x) \subseteq I^+[R] \text{ but } x \notin I^+[R]\} \]

and, similarly, for the boundary of a past,

\[ \partial I^-[R] = \{x \in M \mid I^-(x) \subseteq I^-[R] \text{ but } x \notin I^-[R]\}. \]
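
The duality between the two kinds of horizon can be made concrete in a toy case. The sketch below assumes flat 2-dimensional Minkowski space with events (t, x), takes K to be the whole line t = 0, S the closed segment a ≤ x ≤ b of that line, and R the rest of the line; under these assumptions D(S) is the closed diamond |t| ≤ min(x − a, b − x), and I⁺[R] consists of the events with t > 0 for which x − t < a or x + t > b. All function names are illustrative.

```python
def in_domain_of_dependence(event, a, b):
    """D(S): every endless timelike curve through the event meets the segment S."""
    t, x = event
    return a <= x - abs(t) and x + abs(t) <= b


def in_future_domain(event, a, b):
    """D+(S) = D(S) intersected with I+[S]; here simply the part of D(S) with t > 0."""
    return event[0] > 0 and in_domain_of_dependence(event, a, b)


def in_future_of_R(event, a, b):
    """I+[R]: some point of the line t = 0 outside [a, b] lies chronologically before."""
    t, x = event
    return t > 0 and (x - t < a or x + t > b)


def in_future_of_K(event):
    """I+[K] for K the whole line t = 0."""
    return event[0] > 0


def future_splits_disjointly(events, a, b):
    """Check that I+[K] is the disjoint union of I+[R] and D+(S) on the sample."""
    for e in events:
        in_r, in_d = in_future_of_R(e, a, b), in_future_domain(e, a, b)
        if in_r and in_d:
            return False
        if (in_r or in_d) != in_future_of_K(e):
            return False
    return True


# Integer sample events, with S the segment 0 <= x <= 6 of the line t = 0.
events = [(t, x) for t in range(-4, 8) for x in range(-4, 11)]
splits = future_splits_disjointly(events, 0, 6)
```

In this example the common boundary of I⁺[R] and D⁺(S) is made up of the two null segments t = x − a and t = b − x with t > 0, which meet at the top corner of the diamond.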

An important property of the boundary of the chronological future B = ∂I⁺[R]
of any achronal set R, and also of any future Cauchy horizon (resp. B = ∂D⁺(S))
of any achronal set S, within a strongly causal space-time M is the following:

Lemma 5.1. If p is any point of B, not on R, (resp., not on S), then there is
a null geodesic (ray) γ lying on B, whose future end-point is p and which extends
into the past along B, either endlessly or until it reaches a point of the closure of
R (resp. of S).

Proof. For the proof, see [Penrose, R. (1972)].

Thus, we have an important property of boundaries of futures and of future
Cauchy horizons. Any such region is a topological 3-manifold B which is completely
generated by rays, each of which is either past-endless or has a past end-point on
the closure of the initiating set (R or S), and each of which is either future-endless
or has a future end-point on B, beyond which the extended ray enters the interior
region I⁺[R] (in the case of the boundary of a future) or leaves D⁺(S) altogether
(in the case of a Cauchy horizon). These hypersurfaces are topological 3-manifolds,
but they are normally not everywhere smooth, being not smooth at the future end-
points of these generating rays. Of course, the time-reverses of all these statements,
when referred to boundaries of chronological pasts and past Cauchy horizons, also
hold.

5.6. Geodetic Maximality


In order to address the singularity issues raised by a generic gravitational col-
lapse, it will be necessary to go a little further than the framework of §5.5, which
was concerned essentially with the causal structure of a space-time M, this being
equivalent to M’s time-oriented conformal structure (M being strongly causal),
and not with its more refined metric structure. This conformal structure is given
by M’s Lorentzian metric field g_ab factored by the equivalence

\[ g_{ab} \equiv \Omega^2 g_{ab}, \]

where Ω is a smooth positive scalar field on M. The metric assigns a non-negative
length (physically: a proper time) to any causal curve segment γ connecting points
p and q in M, with p ≺ q, (as already noted in §5.5), given by

\[ \int_p^q ds, \qquad \text{where } ds = \sqrt{g_{ab}\, dx^a\, dx^b}. \]

We shall be concerned with maximality features of this
integral. It can be seen, using local arguments (assuming p ≺ q), that if q is close
enough to p — which technically means with q within some small enough topological
neighbourhood N of p — there will be a causal geodesic γ within N , from p to q,
that maximizes the length of all causal curves from p to q lying within N . In this,
we recall that in a Lorentzian space we look for maximality rather than minimality,
for a geodesic — the direct route that maximizes the “proper time”. Similarly, this
holds for all p close enough to q, (with p ≺ q). As a special case of this, if p ≺ q,
but not p ≪ q, so that q lies on p’s light cone, then the geodesic from p to q must
be null (i.e. a ray), the length being zero, so γ is the only connecting causal curve
in this extreme case.
If M is not globally hyperbolic, then there can be pairs of points p, q, with
p ≺ q, for which there is no causal curve from p to q that maximizes the length
of causal curves from p to q. The simplest example of such an M (though hardly
a plausible one, for a realistic space-time) would be Minkowski space M with a
single point o removed. Then, if p and q are two points on a timelike straight
line through o, in M, each on either side of o, then the upper limit of the lengths
of timelike curves from p to q is not attained. Though clearly unrealistic as a
physical space-time, this example illustrates the kind of difficulty that can arise,
when M is not globally hyperbolic. A more physically plausible example is the
frequently considered (unwrapped) anti-de Sitter space (see [Hawking, S.W. and
Ellis, G.F.R. (1973)] p. 131 or [Penrose, R. (2004)] p. 749). There, if points p and q
are taken sufficiently far apart on any timelike geodesic γ, we find that there is no
upper bound to the lengths of timelike curves from p to q, and so the geodesic γ
from p to q is certainly not maximal.
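The punctured Minkowski example can be made concrete with a short sketch, assuming 2-dimensional Minkowski space with ds² = dt² − dx², the removed point o at the origin, and p = (−1, 0), q = (1, 0) in (t, x) coordinates. These coordinates and the function name `detour_length` are illustrative assumptions. Broken timelike curves that pass o at spatial distance eps have lengths approaching 2, the length of the excluded straight segment, without ever reaching it.

```python
import math

def detour_length(eps):
    """Proper time of the broken timelike curve (-1,0) -> (0,eps) -> (1,0),
    written as (t, x) in 2-dimensional Minkowski space, ds^2 = dt^2 - dx^2."""
    return 2.0 * math.sqrt(1.0 - eps * eps)

# Curves passing ever closer to the removed point o = (0, 0).
lengths = [detour_length(10.0 ** -k) for k in range(1, 6)]
```

Only this one-parameter family of curves is sampled here, so the sketch illustrates the failure to attain the upper limit rather than proving it.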
On the other hand, if M is globally hyperbolic then, from the fact that the
space of causal curves from p to q (with p ≺ q) is compact (see §5.5 or [Penrose,
R. (1972)]), it follows that there must indeed be a maximizing curve from p to q
(possibly more than one such curve) and such a curve must be a causal geodesic.
This applies, also, within any globally hyperbolic open sub-region D of M, whether
or not the containing space-time M, assumed to be strongly causal, is itself globally
hyperbolic. In particular, it applies to the interior of any domain of dependence
D = int D(S), S being any achronal subset of M, so that there is a maximal
geodesic connecting any pair of points p, q, within D, for which p ≺ q.
We shall need to go somewhat further than this and consider causal curves
within such a region D = int D(S), which maximize the lengths of causal curves
joining a point p in D to S. Again, we find that there is a maximizing causal geodesic
γ, indeed a timelike one, that joins p in D to S. Moreover, if we consider that S is
a smooth spacelike hypersurface (the situation normally considered), then we can
assert that the geodesic γ is necessarily orthogonal (in the appropriate Lorentzian
sense) to the hypersurface S. For the proof, see [Penrose, R. (1972)].
It will be important for us to consider the circumstances where a causal geodesic
γ, connecting two points p and q, or when connecting a point q to a spacelike hyper-
surface S, does not actually maximize the length among the family of causal curves
satisfying that condition. For this, we shall need the notion of conjugate points
along a timelike or null geodesic, or of a point conjugate to a surface intersecting
such a geodesic. For this we shall need the idea of a Jacobi field along a geodesic
γ, a solution of what is sometimes called the equation of geodesic deviation.
The idea of a Jacobi field is that it describes how a geodesic γ relates to its
immediately neighbouring geodesics. We consider γ′ to be a neighbouring geodesic
to γ, where the points of γ′ are displaced from those of γ by the vector field $\nu^a$ along
γ, taking $\nu^a$ to be Lie propagated along the geodesic γ, whose parallel-propagated
tangent vector is $t^a$:

$$t^a\nabla_a t^b = 0, \qquad t^a\nabla_a \nu^b = \nu^a\nabla_a t^b. \tag{5.6.1}$$

We can write this as

$$Dt^a = 0, \qquad D\nu^b = \nu^a\nabla_a t^b, \tag{5.6.2}$$

where the operator $D = t^a\nabla_a$ defines parallel propagation along γ. The Jacobi
equation is then

$$D^2\nu^a = R_{bcd}{}^{a}\, t^b \nu^c t^d \tag{5.6.3}$$

(see (5.3.1) for conventions). The solutions (for $\nu^a$) of this equation, other than
the trivial one where $\nu^a$ is everywhere zero along γ, are referred to as Jacobi fields
along γ. It is usual to normalize $t^a$ according to:

$$t^a t_a = 1 \ \text{ if $\gamma$ is timelike, but } \ t^a t_a = 0 \ \text{ if $\gamma$ is null}.$$

If γ is timelike, then we can choose $\nu^a$ to be orthogonal to $t^a$ all along γ,

$$t^a\nu_a = 0, \tag{5.6.4}$$

without any effect on the relation between the neighbouring geodesics γ and γ′.
However, when γ is null, the equation (5.6.4), which holds all along γ if it holds
at any point of γ, asserts a particular geometrical relation between γ and γ′ that
is referred to as their being abreast (see [Penrose, R. and Rindler, W. (1986)],
Chapter 7). This is important to us here, because this condition is always satisfied
for neighbouring rays on a null hypersurface, and this is just the situation for
non-singular parts of the boundaries of future sets $I^+[R]$ or past sets $I^-[R]$, or Cauchy
horizons. In accordance with this, we make the requirement that the orthogonality
condition (5.6.4) always holds for our Jacobi fields for null geodesics, generally.

Two distinct points p and q, on a geodesic γ, are said to be conjugate to one
another on γ if there is a Jacobi field on γ for which $\nu^a$ vanishes at both p and q.
The importance of this for us here is:

Lemma 5.2. If γ is a causal geodesic from p to q (with p ≺ q) for which there is
a pair of conjugate points p′, q′ on γ between p and q (where we take p′ ≺ q′ and
allow either p′ = p or q′ = q, but not both), then there is a causal curve from p to
q which is longer than the segment of γ from p to q.
We note that when γ is null, then under the conditions stated, we must actually
have p ≪ q, so that in this case γ cannot, in its entirety from p to q, remain on the
boundary of a future or of a past, or on a Cauchy horizon.
The proof of this lemma comes from an examination of the neighbourhood of
either q′ or p′ and noting the orders of infinitesimal change that are made when
the segment of γ through p′ or q′, respectively, is varied slightly. This enables us to
find a slightly longer causal curve connecting p to q′ or connecting p′ to q than is
achieved by γ. See [Penrose, R. (1972)] for details.
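To get a feeling for conjugate points, here is a minimal numerical sketch under a strong simplifying assumption that is not made in the text: along a timelike geodesic, the curvature term in the Jacobi equation is taken to act on a transverse component ν as −Kν with K constant, so that the equation reduces to ν″ = −Kν. The function name `first_zero` and the integration scheme (a leapfrog step) are illustrative choices. A Jacobi field that vanishes at τ = 0 then vanishes again at τ = π/√K when K > 0, whereas for K ≤ 0 (including the flat case K = 0, where ν is linear in τ) no second zero occurs.

```python
import math

def first_zero(K, dtau=1e-3, tau_max=20.0):
    """Integrate nu'' = -K * nu with nu(0) = 0, nu'(0) = 1 (a model Jacobi
    field; constant K is an assumption of this sketch) and return the first
    tau > 0 at which nu returns to zero, or None if that does not happen
    before tau_max."""
    nu, dnu, tau = 0.0, 1.0, 0.0
    while tau < tau_max:
        # One leapfrog (velocity Verlet) step.
        dnu_half = dnu - 0.5 * dtau * K * nu
        nu_new = nu + dtau * dnu_half
        dnu = dnu_half - 0.5 * dtau * K * nu_new
        tau += dtau
        if nu > 0.0 and nu_new <= 0.0:
            # Linear interpolation for the crossing inside the last step.
            return tau - dtau * (-nu_new) / (nu - nu_new)
        nu = nu_new
    return None

focusing = first_zero(1.0)   # close to pi
flat = first_zero(0.0)       # None: no conjugate point in the flat model
```

In this model the parameter separation of the conjugate pair shrinks as K grows, which is the qualitative behaviour exploited later through the Raychaudhuri effect.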
We also need the notion of conjugacy, on a timelike geodesic γ, between a
point q on γ and a smooth spacelike hypersurface (3-surface) R which meets γ
orthogonally at a point p, distinct from q. We are here concerned with Jacobi fields
along γ which represent geodesics neighbouring to γ that remain orthogonal to R.
From the equation $D\nu^b = \nu^a\nabla_a t^b$ in (5.6.2), we find that the propagation of these
Jacobi fields along γ is constrained, at p, by the extrinsic curvature ($\nu^a\nabla_a t^b$) of
R there. We note that the orthogonality of γ to R and the requirement $\nu_a t^a = 0$
in (5.6.4) tell us that the vectors $\nu^a$ are tangent to R at p, so that the quantities
$\nu^a\nabla_a t^b$ indeed describe how the normal $t^a$ to R varies in these different directions
tangential to R, in accordance with the extrinsic curvature of R at p. We say that
the point q is conjugate to the surface R if there is such a Jacobi field orthogonal
to R which vanishes at q.
Roughly speaking, we can think of the geodesics normal to R that are near
to γ as coming to some sort of focus at q, but in general this will not be a clean
focus. Technically it is a caustic point of the congruence of geodesics near γ that are
orthogonal to R. Here, we must bear in mind the nature of Lorentzian geometry and
note that in the case of what we might think of as positive overall extrinsic curvature
(“hill shaped”) the normals $t^a$ converge into the future (eventually to reach
such a caustic point, in the case of flat space-time), whereas for a “dip-shaped”
extrinsic curvature, the normal directions $t^a$ will diverge into the future.
The case when γ is a null geodesic (ray) needs particular attention. Here, it
is appropriate to think in terms of a spacelike 2-surface T instead of a spacelike
3-surface. The rays intersecting T orthogonally constitute two families, which we
can associate with the two “sides” of T. These two sides¹ correspond to the two

1 The reader may be justly puzzled how this very different-looking situation arises, as compared
with that for timelike geodesics orthogonal to a spacelike 3-surface R. We can, indeed, consider
different null directions orthogonal to a spacelike 2-surface element δT (these being
the intersections of the orthogonal complement of δT, which is timelike, with the
null cone). See Fig. 5.4, which provides a picture of this situation from the
perspective of an observer in a rest-frame with time-axis orthogonal to δT, so the 2-surface
is at rest in this frame.
The rays neighbouring to γ that intersect T orthogonally are all abreast with γ
as required. Again, we have a notion of a point q on γ being conjugate, now, to a
spacelike 2-surface T, intersecting γ orthogonally at a point p, distinct from q. We
say that q is conjugate to T on γ if there is a Jacobi field $\nu^a$ along γ, representing
neighbouring rays to γ that are all orthogonal to T, which vanishes at q.
To get a picture of how T ’s extrinsic curvature relates to the behaviour of the
rays that meet it orthogonally, let us first think in terms of a 2-surface T that lies
in one constant time slice of Minkowski space-time, though T could be curved one
way or another. Then the rays orthogonal to T will converge into the future on a
side where T is concave, and diverge into the future on the other side where T is
convex ; see Fig. 5.5. However, when T is not constrained in this way, things are
not so simple, the importance of which we shall be seeing in §5.7.
We are now in a position to consider the following result:

Lemma 5.3. Let γ be a causal geodesic segment from p to q, which is orthogonal
at p to a hypersurface (3-surface) R if γ is timelike, or to a spacelike 2-surface T
if γ is null, and suppose there is a point q′ between p and q on γ which is conjugate
on γ to R or to T, respectively. Then there is a timelike curve connecting R or T,
respectively, to q that is longer than the segment γ.

The proof uses arguments similar to the proof of Lemma 5.2, where we examine the
Jacobi fields in the neighbourhood of q′, taking note of the orders of infinitesimal
change as the geodesics orthogonal to R or T vary, to see that a longer causal curve
to q can be constructed. For details, see [Penrose, R. (1972)].
It should be noted that without the orthogonality condition in Lemma 5.3, we
would not even need the existence of a conjugate point q′ to obtain the conclusion of
the lemma (as follows from an examination of the neighbourhood of p). We should
also take note of the corollary that although the boundary of the future $\partial I^+[T]$
of T is initially generated by rays with past end-points on T that are orthogonal
to T there, some of these rays must subsequently leave $\partial I^+[T]$ if they eventually
encounter a conjugate point q′ to T, either at q′ or earlier, and thereafter enter the
interior $I^+[T]$. A corresponding statement clearly also applies to the boundaries
of pasts.

taking a limit, whereby R becomes null. The appropriate picture is that in this limit, the (now)
null 3-surface R becomes foliated by a family of spacelike 2-surfaces, of which T is a member,
the others being displacements of T in the direction of the family of rays orthogonal to T on its
opposite side.

[Figure 5.4 (image not reproduced); labels: null normal, spatial normal, spatial projection, time, one time instant, 2-surface element.]

Fig. 5.4. A combined space-time and spatial picture illustrating the two null normals to a space-
like 2-surface element (shown shaded).

Fig. 5.5. For a spacelike 2-surface T constrained to an instant of time, in flat space-time, the
normal rays converge where it is concave and diverge where it is convex.

5.7. Collapse to Singularity


We are now almost in a position to address the issue of gravitational collapse, to
see whether the creation of singularities is a generic feature, once a situation of “no
return” has been reached. However, little can be said unless we have some restriction
on the energy tensor $T_{ab}$, since Einstein’s Λ-GR equations restrict the space-time
only by asserting that the Ricci tensor $R_{ab}$ must be related to $T_{ab}$ via the equation
(5.3.3):

$$R_{ab} = 8\pi G\left(T_{ab} - \tfrac12 T g_{ab}\right) - \Lambda g_{ab}.$$
Now, the detailed form of Tab can be very complicated, depending upon all the
various equations that the matter fields might satisfy. Moreover, in the extreme
situations that could be likely to occur in a catastrophic collapse, much would be
likely to be unknown. It is fortunate, therefore, that one can say a great deal on
the basis of some very general principles that, at least for considerations of purely
classical physics, ought to hold true for any reasonable matter source. These are
what are referred to as energy conditions, of which there are several different versions,
but for current purposes, I refer primarily to just two, namely

(a) the null energy condition: for any null vector $n^a$,

$$n^a n^b T_{ab} \ge 0;$$

(b) the strong energy condition: for any timelike vector $t^a$,

$$t^a t^b\left(T_{ab} - \tfrac12 T g_{ab}\right) \ge 0.$$

In the case of a perfect fluid, with mass-energy density µ and pressure p,
this null energy condition can be re-expressed as

$$\mu + p \ge 0,$$

whereas the strong energy condition becomes the pair

$$\mu + p \ge 0, \qquad \mu + 3p \ge 0,$$

which makes a difference when we allow for materials with large negative pressures
(as occurs, for example, with the Maxwell field which, though not a “perfect fluid”,
has positive pressure terms as well as negative). The physical motivation for
the null energy condition² is better founded, theoretically, than that for the strong
energy condition, since the former is a consequence of classical local energy positivity,
whereas there is no such clear-cut motivation for the latter, and one may consider
particularly exotic kinds of matter for which $p < -\tfrac13\mu$, violating strong energy,
despite the energy component of $T_{ab}$ being non-negative in any local frame.
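The perfect-fluid forms of the two conditions are simple enough to check mechanically. The following is a minimal sketch; the function names and the sample equations of state are illustrative assumptions rather than anything specified in the text.

```python
def null_energy_ok(mu, p):
    """Null energy condition for a perfect fluid: mu + p >= 0."""
    return mu + p >= 0

def strong_energy_ok(mu, p):
    """Strong energy condition for a perfect fluid:
    mu + p >= 0 and mu + 3p >= 0."""
    return mu + p >= 0 and mu + 3 * p >= 0

# Illustrative equations of state (labels are ours, not from the text).
examples = {
    "dust (p = 0)": (1.0, 0.0),
    "radiation (p = mu/3)": (1.0, 1.0 / 3.0),
    "p = -mu/2": (1.0, -0.5),
    "p = -mu": (1.0, -1.0),
}
report = {name: (null_energy_ok(mu, p), strong_energy_ok(mu, p))
          for name, (mu, p) in examples.items()}
```

In this sketch the last two entries satisfy the null condition while violating the strong one, which is the regime $p < -\tfrac13\mu$ referred to above.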
A related issue is the role of the cosmological constant Λ. In contrast with
the situation for the null energy condition, which is unaffected by Λ, the Λ-term
in Einstein’s equations provides an effective contribution towards violation of the
strong energy condition. It had been usual, in the discussions concerning singularity
theorems, to ignore Λ on the grounds that it would be extremely small and therefore
not relevant to the behaviour of space-time near a singularity, where one would

2 In my own early writings on this topic, e.g. [Penrose, R. (1972)], p. 63, I referred to what is here
called “the null energy condition” as the “weak energy condition”. However this has led to some
confusion, as Hawking and Ellis [Hawking, S.W. and Ellis, G.F.R. (1973)] have used that term for
a different concept, so to avoid possible confusion, I adopt their terminology here.

be concerned with excessively large curvatures, and the larger the curvature the
more irrelevant would be Λ. Nevertheless the formulation of appropriate conditions
indicating unstoppable collapse to singularity can be sensitive to the presence of a
non-zero Λ, which could be of relevance in cosmology.
More to the point is the fact that what is really required for many singularity
theorems is positivity for $t^a t^b R_{ab}$, for all timelike $t^a$, and the strong energy
condition, as stated, does not encapsulate this when there is a positive Λ. Accordingly
we could consider

(c) the Λ-strong energy condition: for any timelike vector $t^a$,

$$t^a t^b\left\{T_{ab} - \left(\tfrac12 T + \frac{\Lambda}{8\pi G}\right) g_{ab}\right\} \ge 0.$$

This unwieldy-looking condition does assert, via the Λ-GR equations, positivity of
$t^a t^b R_{ab}$, for all timelike $t^a$. However, this condition is violated by the presence of
a positive (or negative) Λ in regions where $T_{ab} = 0$, where Λ provides an effective
“matter density” with $\mu = \frac{\Lambda}{8\pi G}$ and $p = -\frac{\Lambda}{8\pi G}$.
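The effective values of µ and p quoted here can be checked with a short calculation. It assumes the standard perfect-fluid form of the energy tensor, with signature (+ − − −) and $u^a u_a = 1$; that form is our assumption, since it is not written out in this section.

```latex
\begin{align*}
T_{ab} &= (\mu+p)\,u_a u_b - p\,g_{ab}, \qquad
  \mu = -p = \frac{\Lambda}{8\pi G}
  \;\Longrightarrow\; T_{ab} = \frac{\Lambda}{8\pi G}\,g_{ab},\\
T &= g^{ab}T_{ab} = \frac{4\Lambda}{8\pi G},\\
8\pi G\left(T_{ab} - \tfrac12 T g_{ab}\right)
  &= \Lambda g_{ab} - 2\Lambda g_{ab} = -\Lambda g_{ab}.
\end{align*}
```

Under these assumptions, such a fluid contributes to $R_{ab}$ exactly as the term $-\Lambda g_{ab}$ does. It has $\mu + p = 0$, so the null energy condition is unaffected, and $\mu + 3p = -\frac{2\Lambda}{8\pi G}$, which is negative when Λ is positive.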
The point about these energy conditions is indeed that they are supposed to
assert, by virtue of the Einstein (Λ-)GR-equations, the non-negativity of the Ricci
tensor in the appropriate causal directions. The null energy condition asserts (with
conventions given in §5.3)

$$\text{Null Ricci positivity:}\quad n^a n^b R_{ab} \ge 0 \ \text{ for all null vectors } n^a, \tag{5.7.1}$$

(irrespective of Λ) and the Λ-strong energy condition asserts

$$\text{Ricci positivity:}\quad t^a t^b R_{ab} \ge 0 \ \text{ for all timelike vectors } t^a. \tag{5.7.2}$$

These non-negativity requirements for the Ricci tensor are all that we need from
the Einstein equations for the singularity theorems to hold.
To appreciate the role of this Ricci positivity, we shall need to address the issue
of how the (Λ-)strong and null energy conditions are used, respectively, to imply
the existence, on a timelike or null geodesic γ, of conjugate points to a spacelike
3-surface R or a spacelike 2-surface T . For this, we consider a 3-parameter family of
timelike geodesics meeting R orthogonally, or a 2-parameter family of rays meeting
T orthogonally. As a preliminary issue, however, we need to consider the matter of
the rotation of a family of causal geodesics.
The easiest way to consider rotation, for our current considerations, is simply
to restrict attention to the situation when it is absent. For timelike geodesics, this
occurs when they are orthogonal to a smooth spacelike hypersurface R, and under
this circumstance, if we move along the geodesics by a fixed distance — or proper
time — either into the future or past we again get smooth spacelike hypersurfaces
that remain orthogonal to the geodesics, so long as we do not proceed so far that
caustics and crossing regions are reached. We can consider a time function τ that
is constant over each of these respective hypersurfaces, so that

$$t_a = \nabla_a\tau, \quad\text{whence}\quad \nabla_a t_b = \nabla_b t_a, \tag{5.7.3}$$

which expresses a condition that the geodesics are rotation-free, and provided that
τ represents proper time, we maintain

$$t^a t_a = 1. \tag{5.7.4}$$

The case of rays is a little different. Again, we can consider a “time-function” ν (not
proper time), but now we are only concerned with the single value ν = 0, though
it is convenient that we consider neighbouring values of ν also, each constant value
of ν specifying a null hypersurface, so that we can define

$$t_a = \nabla_a\nu, \quad\text{with}\quad t^a t_a = 0,$$

whence the non-rotating condition $\nabla_a t_b = \nabla_b t_a$ again holds, but now we demand
that all these hypersurfaces be null, and from the above we also find

$$t^a\nabla_a t^b = 0, \tag{5.7.5}$$

so that the vectors $t^a$ are parallel-propagated along the null generators of the null
hypersurfaces, as required at ν = 0. The condition that the rays are abreast, in any
one of these null hypersurfaces, is also automatically satisfied.
The condition (5.7.5) tells us that a parameter u along a null geodesic for which

$$t^a\nabla_a u = 1 \tag{5.7.6}$$

is what is called an affine parameter. It would still be an affine parameter if the
“1” were replaced by any positive constant along γ, but the scaling of
u provided by (5.7.6) is the particular affine parameter uniquely (up to an additive
constant) associated with the scaling provided by the null vector ta (subject to
parallel propagation along γ, as given by (5.7.5)). Although the concept of “affine
parameter” applies also to timelike geodesics — and, indeed, the ordinary length
(proper time) parameter is then an affine parameter (as is any constant multiple of
this time parameter) — affine parameters have particular relevance to rays, since
the natural “length” notion does not apply, the actual length of any segment of a
ray being simply zero.
The key result, of relevance here, is that referred to as the Raychaudhuri effect
(see [Raychaudhuri, A.K. (1955)], [Komar, A. (1956)]; this is the Lorentzian version
of an earlier result for Riemannian metrics due to Myers [Myers, S.B. (1941)]; see
also [Penrose, R. (1965)] for the Lorentzian version with null geodesics). For this, we
require a notion of a volume measure ∆, which is a spacelike 3-volume in the case of
timelike geodesics, and a spacelike 2-volume (area) in the case of rays. For a given
such geodesic γ, we consider a point p where it meets, orthogonally, a spacelike
3-surface R if γ is timelike [resp. 2-surface T if γ is null]. At p, we choose three [resp.
two] linearly independent vectors $v_1, v_2, v_3$, tangent to R [resp. $v_1, v_2$, tangent to
T] and we define ∆ at p to be the 3-volume spanned by $v_1$, $v_2$, and $v_3$ [resp. area
spanned by $v_1$ and $v_2$] at p. As in §5.6, we propagate the $v_j$ along γ according to
the Jacobi equation, taking note of the fact that the starting value of $Dv_j$ (with
$D = t^a\nabla_a$) is defined by the Lie condition $D\nu^b = \nu^a\nabla_a t^b$, which is determined by
the extrinsic curvature of R [resp. T] at p (see (5.6.2)). The convergence ρ of the
family of geodesics³ near γ is defined by

$$D\Delta = -\rho\Delta, \tag{5.7.7}$$

and we also find that

$$\rho = -\nabla_a t^a. \tag{5.7.8}$$

Lemma 5.4. Take ∆ to be positive. Then

$$D^2\Delta^{1/3} \le -\tfrac13 R_{ab}t^a t^b\,\Delta^{1/3} \tag{5.7.9}$$

if γ is timelike, and

$$D^2\Delta^{1/2} \le -\tfrac12 R_{ab}n^a n^b\,\Delta^{1/2} \tag{5.7.10}$$

if γ is null.

Proof. We first note that (5.3.1) applied to $t^a(\nabla_a\nabla_b - \nabla_b\nabla_a)t^b$ gives

$$D\rho = R_{ab}t^a t^b + \nabla_a t^b\,\nabla_b t^a = R_{ab}t^a t^b + (\nabla_a t_b)(\nabla^a t^b),$$

where we have used $t^a\nabla_a t^b = 0$, $\rho = -\nabla_b t^b$ and $\nabla_a t_b = \nabla_b t_a$ (vanishing rotation).
Suppose that $t^a$ is timelike; then we can express Raychaudhuri’s equation as

$$D\rho - \tfrac13\rho^2 = R_{ab}t^a t^b + S_{ab}S^{ab}, \tag{5.7.11}$$

where

$$S_{ab} = S_{ba} = \nabla_a t_b + \tfrac13\rho\,(g_{ab} - t_a t_b)$$

is the symmetric shear tensor. From the fact ($S_{ab}t^a = 0$) that all components of
$S_{ab}$ lie in the spacelike 3-plane orthogonal to $t^a$, it follows that $S_{ab}S^{ab}$ is
non-negative. The timelike part of Lemma 5.4 then follows from (5.7.11). For the null
part, a similar argument applies, the simplest version using a complex quantity σ
to describe the shear [Penrose, R. and Rindler, W. (1986)], and the non-negative
quantity $\sigma\bar\sigma$ takes the place of $S_{ab}S^{ab}$.
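The intermediate algebra in the timelike case can be sketched as follows. It uses only the definitions already given ($D\Delta = -\rho\Delta$, $\rho = -\nabla_a t^a$, $t^a t_a = 1$, and the shear tensor $S_{ab}$); the abbreviation $h_{ab} = g_{ab} - t_a t_b$ is introduced here purely for convenience.

```latex
\begin{align*}
h_{ab}h^{ab} &= 3, \qquad h^{ab}\nabla_a t_b = \nabla_a t^a = -\rho
  \quad (\text{since } t^b\nabla_a t_b = 0),\\
S_{ab}S^{ab} &= (\nabla_a t_b)(\nabla^a t^b) + \tfrac23\rho\,h^{ab}\nabla_a t_b
  + \tfrac19\rho^2 h_{ab}h^{ab}
  = (\nabla_a t_b)(\nabla^a t^b) - \tfrac13\rho^2,\\
D\rho - \tfrac13\rho^2 &= R_{ab}t^a t^b + S_{ab}S^{ab},\\
D\Delta^{1/3} &= \tfrac13\Delta^{-2/3}D\Delta = -\tfrac13\rho\,\Delta^{1/3},\\
D^2\Delta^{1/3} &= -\tfrac13\left(D\rho - \tfrac13\rho^2\right)\Delta^{1/3}
  = -\tfrac13\left(R_{ab}t^a t^b + S_{ab}S^{ab}\right)\Delta^{1/3}
  \le -\tfrac13 R_{ab}t^a t^b\,\Delta^{1/3}.
\end{align*}
```

The final inequality uses only $S_{ab}S^{ab} \ge 0$ and $\Delta > 0$, so equality holds exactly when the shear vanishes.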

3 While the letter ρ is not uncommonly used for the convergence of a family of rays (though
normally half the value used here (e.g. [Penrose, R. and Rindler, W. (1986)])), it is more usual to
denote this concept by −θ in the timelike case.

The Raychaudhuri effect, for our current purposes, is encapsulated in the fol-
lowing result:

Lemma 5.5. If a timelike [resp. null] geodesic γ belongs to a family of such
geodesics meeting orthogonally a spacelike 3-surface R [resp. 2-surface T], γ meeting
it at a point p at which the convergence ρ of the family is positive, and if Ricci
positivity [resp. null Ricci positivity] is assumed along γ, then there is a point q on
γ, to the future of p, which is conjugate to R [resp. T] along γ, provided that γ
is long enough so that its affine parameter extends to the future of p to parameter
distance 3/ρ in the timelike case (the affine parameter being proper time) [resp. 2/ρ in
the null case]. A corresponding result holds in the past direction, if ρ is negative at
p.

Proof. In the timelike case, we compare the general situation of Ricci positivity,
where (5.7.9) tells us that $D^2\Delta^{1/3} \le 0$, with the simple example where $D^2\Delta^{1/3} = 0$.
In each example we take the initial starting value of $D\Delta^{1/3}$ to be $D\Delta^{1/3} = -\tfrac13\rho\Delta^{1/3}$, at
τ = 0 (where τ is the affine, or time, parameter associated with $t^a$). In the example
$D^2\Delta^{1/3} = 0$, we can directly integrate it using $D\Delta = -\rho\Delta$ (see (5.7.7)) with ρ > 0,
finding a straight line graph coming down to the value 0 at the affine parameter
value τ = 3/ρ. Comparing this with the actual graph for $\Delta^{1/3}$ with $D^2\Delta^{1/3} \le 0$, we
find that $\Delta^{1/3}$ necessarily becomes zero at some point q of γ, given by a positive value
of τ ≤ 3/ρ. Now, we can only obtain a zero value for ∆ at a point q on γ where the
vectors (Jacobi fields) $v_1$, $v_2$, $v_3$ defining ∆ become linearly dependent, so that
some non-trivial linear combination of them would be a Jacobi field vanishing at q, so that q
is indeed a conjugate point to R, as required. The argument for the null case is
identical to this, but with $\Delta^{1/2}$ replacing $\Delta^{1/3}$.
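The comparison argument can be illustrated in a simple model, sketched below under assumptions that go beyond the text: the shear is taken to vanish and $R_{ab}t^a t^b$ is taken to be a non-negative constant along γ (called `ricci` here), so that $f = \Delta^{1/3}$ satisfies $f'' = -\tfrac13\,\text{ricci}\,f$ with $f(0) = 1$ and $f'(0) = -\rho/3$. The function name `focal_parameter` is hypothetical. The first zero of f then has a closed form, and it never lies beyond 3/ρ.

```python
import math

def focal_parameter(rho, ricci):
    """First zero of f(tau) solving f'' = -(ricci/3) f, f(0) = 1,
    f'(0) = -rho/3, where f stands for Delta^(1/3) and `ricci` is a constant
    value of R_ab t^a t^b >= 0 (constancy and zero shear are assumptions
    of this sketch)."""
    if rho <= 0 or ricci < 0:
        raise ValueError("need rho > 0 and ricci >= 0")
    if ricci == 0:
        return 3.0 / rho                 # straight-line comparison case
    omega = math.sqrt(ricci / 3.0)
    # f = cos(omega tau) - (rho / (3 omega)) sin(omega tau)
    return math.atan(3.0 * omega / rho) / omega

bounds_hold = all(
    focal_parameter(rho, ricci) <= 3.0 / rho + 1e-12
    for rho in (0.1, 1.0, 10.0)
    for ricci in (0.0, 0.01, 1.0, 100.0)
)
```

In this model, increasing the Ricci term only brings the focal point closer, which is the content of the inequality τ ≤ 3/ρ in the proof.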

We are now in a position to establish the first singularity theorem, concerning
the collapse of an over-massive star, as envisaged at the end of §5.4, or alternatively,
the collapsing together of a large collection of stars. The picture of a collapsing
dust cloud, as provided by Oppenheimer and Snyder and described in §5.3 (see
Fig. 5.1), provides a plausible overall description of the kind of initial situation in a
gravitational collapse. Such an initial picture might well be generally plausible, even
if minor deviations from spherical symmetry may be present, and also if the O–S
assumption of “dust” (describing a gravitationally collapsing “perfect fluid” of
non-interacting particles) may be regarded as a rather crude approximation. Moreover,
if one considers a large, roughly spherical, collection of stars to be involved in an
overall gravitational in-fall, the “Schwarzschild radius” of the whole system could
well encompass a great number of the stars, without the densities of the material
approaching anything like the somewhat exotic densities involved in neutron stars
or even in white dwarfs. This follows merely from the way that physical quantities
scale in general relativity. For a large enough total mass, the Schwarzschild radius
can arise, and be crossed, by material of density as small as we please.
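The scaling can be made explicit with a rough estimate, sketched below. It assumes the usual expression r = 2GM/c² for the Schwarzschild radius and, as a heuristic only, divides the mass by the Euclidean volume of a ball of that radius. The constants are rounded textbook values, and the function names are illustrative.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg

def schwarzschild_radius(mass_kg):
    """r = 2GM/c^2 in metres."""
    return 2.0 * G * mass_kg / C**2

def mean_density_at_horizon(mass_kg):
    """Mass divided by the Euclidean volume (4/3) pi r^3 of a ball of
    Schwarzschild radius; a heuristic estimate only, in kg/m^3."""
    r = schwarzschild_radius(mass_kg)
    return mass_kg / (4.0 / 3.0 * math.pi * r**3)

for n in (1, 1e6, 1e8, 1e10):
    print(f"{n:g} solar masses: {mean_density_at_horizon(n * M_SUN):.3g} kg/m^3")
```

Because r grows in proportion to M, this estimate of the density falls off as 1/M², so it can be made as small as we please by taking the total mass large enough.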
However, as the collapse progresses, it could well be expected that the O–S
picture gets less and less reliable. Even small deviations from spherical symmetry
would be expected to become more and more enhanced as the material gets com-
pressed into irregular shapes, with matter swirling around in complicated ways,
and, moreover, the “dust” approximation is hardly likely to remain plausible for
long. What our singularity theorem is able to achieve is to circumvent all this com-
plication, by concentrating on a certain key feature of the collapse that signifies the
passing of a “point of no return”.
Fig. 5.6 depicts an O–S type collapse, as in Fig. 5.1, but within the region inside
the horizon, and just outside the collapsing material, a trapped surface is delineated,
as the pair of points at the bottom of the shaded region — where we must bear
in mind that the 4-space is obtained by rotating the picture through a sphere $S^2$
about the central vertical axis, so this pair of points itself represents a spherical
spacelike 2-surface T . The key feature of this 2-surface is that the two families of
null normals (the ingoing and outgoing ones — where we notice that the “outgoing”
ones are also falling inwards, but at a lesser rate than the “ingoing” ones) are both
converging into the future, which we see by the fact that they both enter regions of
decreasing r values, so that for both families the convergence ρ is positive at T , as
is illustrated in Fig. 5.7. This property of the spacelike 2-surface T , together with

Fig. 5.6. Spherically symmetrical collapse (one space dimension suppressed). The diagram
essentially also serves for the discussion of the asymmetrical case. The coordinate u is that of the E–F
metric of (5.4.2).

Fig. 5.7. For a trapped surface, the rays normal to it on both sides start converging.

Fig. 5.8. The past light cones of two spacelike-separated points p and q in Minkowski space
intersect in a spacelike 2-surface that is locally trapped, but not compact.

the fact that T is closed (compact without boundary) is what characterizes T as a
trapped surface.
It might be felt that this property of having the rays orthogonal to a spacelike
surface converging on both sides is a strange local behaviour for a surface, but this
is not so at all. This situation occurs wherever two past light cones intersect, in
Minkowski space M (see Fig. 5.8). What is odd about a trapped surface is that
this property of converging null normals on both sides holds globally over a closed
surface and it is this feature that leads to singular behaviour of the space-time, as
we see in the following theorem.

Theorem 5.1. Let M be a globally hyperbolic space-time with a non-compact
Cauchy hypersurface S. Assume that M is null Ricci positive. Suppose there is
a trapped surface T to the future of S. Then M contains a null-singular TIP.
Recall that a null-singular TIP is a TIP generated by a ray segment of finite affine
length; see (5.5.1) and (5.7.5).

Proof. Let $B = \partial I^+[T]$. By Lemma 5.1, the topological 3-surface B is generated
by ray segments each of which either has a past end-point on T or else is
past-endless, while remaining on B. The latter possibility cannot occur here because, by
(5.5.2), any past-endless ray in $\operatorname{int} D^+(S)$ (which here is the whole of $I^+[S]$, by M’s
global hyperbolicity) must meet S, impossible because $B \subset I^+[S]$. Now, since T is
trapped, we have ρ > 0 at the initial point of every ray generating the 3-surface B,
so by Lemma 5.5 there must be a conjugate point to T on each of these generators,
by an affine distance no greater than 2/ρ, to the future of its initial point on T, so
long as that generator continues, within M, to such a parameter value. If it does not,
this would be a situation not normally envisaged for a non-singular space-time, and,
indeed, such an “incomplete” ray segment would generate a null-singular TIP. Let
us suppose, therefore, that such null-singular TIPs are absent in M. Then, we infer
from Lemma 5.3 (or more explicitly from the comments following Lemma 5.3, in
the final paragraph of §5.6) that such a ray continuing into the future would have
to leave B, and enter the interior of $I^+[T]$. Thus, if M is free of null-singular TIPs,
then B must be entirely generated by ray segments of finite affine length, where
these ray segments join together at their end-points at various places, namely at
T, at their past end-points, and at caustics and crossing points at their future
end-points. Overall, being a topological 3-manifold, and being composed entirely of an
$S^2$’s worth of finite line segments, coming together at various places, B must be
a compact topological 3-manifold (i.e. closed without boundary). We now invoke a
general property of any Lorentzian time-oriented manifold, namely that it admits a
smooth global timelike vector field $h^a$, the integral curves of which can be used⁴ to
map the compact achronal 3-surface B down to the initial 3-surface S, injectively, to
obtain a homeomorphic image B′ of B. This is not possible, because B′ would have
to be compact without boundary, whereas S is a non-compact topological manifold,
of the same dimension. We deduce that M must therefore contain a null-singular
TIP, and the theorem is proved.

It may be remarked that this theorem (basically [Penrose, R. (1965)]) tells us
somewhat more than the mere existence of singularities, under the assumptions
stated. Lemma 5.5, as used here, already tells us how far to the future of T (namely
2/ρ), in terms of affine distance, we must expect singularities to have arisen. Moreover,
we can consider that the singularities arising in the collapse provide a family of
singular TIPs, of which the null-singular TIPs in the considerations above form a
subset. The projection down to S, by the vector field $h^a$, gives some kind of image of
the space of these singular TIPs. The singularities would provide “holes” in B that
would render B’s homeomorphic image B′ non-compact, and whose boundary
points, when mapped back upwards by the $h^a$ vector field, would correspond to
generators of singular TIPs. As we consider a family of trapped surfaces like T but
moved continuously outwards until, in the limit, they reach the horizon, we can
envisage that the TIPs arising give us some kind of picture of the entire singular
boundary of the space-time.

4 This simplification of my original argument [Penrose, R. (1965)] was suggested to me by
Charles W. Misner in the later 1960s.

Following the publication of what was essentially the above theorem in early
1965, Stephen Hawking produced a series of papers aimed primarily at the
cosmological big-bang singularity. His first paper [Hawking, S.W. (1965)] was based
simply on the observation that the time-reverse of a trapped surface could occur
in the very distant regions of an expanding universe. Subsequently, by developing
new techniques, many of which have been incorporated into the above discussions
(in §5.5–§5.7), he was able to provide results that eliminated the need for
a Cauchy hypersurface, non-compact or otherwise, the study of Cauchy horizons
providing an important input into the arguments [Hawking, S.W. (1966a)],
[Hawking, S.W. (1966b)], [Hawking, S.W. (1967)]. Finally, we got together to supply a
very general version [Hawking, S.W. and Penrose, R. (1970)], encompassing most
of what had gone before. Yet what might be regarded as a possibly significant
disadvantage of this later work was its dependence on Ricci positivity, rather than
the weaker assumption of null Ricci positivity, which can be argued to have a firmer
foundational status.
A natural question to ask, at this point, is how do these results square with
the work of Lifshitz and Khalatnikov [Lifshitz, E.M. and Khalatnikov, I.M. (1963)],
referred to in §5.3? The answer is that after becoming acquainted with these
singularity theorems, with the aid of Vladimir Alekseevich Belinski, they were able to
locate an error in their earlier work and were able to identify a much larger class of
singularity types that could well be of a completely general character (see
[Belinskii, V.A., Khalatnikov, I.M. and Lifshitz, E.M. (1970)],
[Belinskii, V.A., Khalatnikov, I.M. and Lifshitz, E.M. (1972)] and also the related work of
Misner [Misner, C.W. (1969)]). The conflict with the singularity theorems was thereby resolved.
Of course, the usual response to the classical singularity theorems described in
this article and elsewhere was that when curvatures become extraordinarily large,
quantum-gravitational considerations must take over, and perhaps some kind of
non-singular quantized theory could provide us with finite answers. However, as
quantized theory currently stands, we are not in a position to make trustworthy
statements about this (but see [Ashtekar, A. (2005)], [Bojowald, M. (2005)]). The
point should be made, nevertheless, that even with the very slight modifications to
classical gravitational theory that are involved in the well-founded phenomenon of the
Hawking evaporation of large, very classically behaving black holes, an extremely
tiny violation of even the null energy condition is needed in order that a black hole's
horizon can slowly shrink owing to its loss of energy due to Hawking evaporation
[Hawking, S.W. (1975)]. For astrophysical black holes, this requires a very tiny
violation of null Ricci positivity. This can be attributed to an allowed small negative
energy flux from quantum fields. Yet this does not at all involve large space-time
curvatures, and it is hard to see how such considerations can provide a qualitative
change to the purely classical arguments presented in this article.

Bibliography
Einstein, A. (1918) Über gravitationswellen, Königlich Preußische Akademie der Wis-
senschaften (Berlin), Sitzungsberichte, 154–167.
Trautman, A. (1958) Radiation and boundary conditions in the theory of gravitation,
Bull. Acad. Pol. Sci. Sér. Sci. Math. Astron. Phys. 6(6), 407–412.
Bondi, H. (1960) Gravitational waves in general relativity, Nature 186(4724), 535–535.
Bondi, H., van der Burg, M.G.J. and Metzner, A.W.K. (1962) Gravitational waves in
general relativity. VII. Waves from axi-symmetric isolated systems, Proc. Roy. Soc.
London A269, 21–52.
Sachs, R.K. (1961) Gravitational waves in general relativity. VI. The outgoing radiation
condition, Proc. Roy. Soc. London A264, 309–338.
Sachs, R.K. (1962a) Gravitational waves in general relativity. VIII. Waves in asymptoti-
cally flat space-time, Proc. Roy. Soc. London A270, 103–126.
Sachs, R.K. (1962b) Asymptotic symmetries in gravitational theory, Phys. Rev. 128, 2851–
2864.
Newman, E.T. and Unti, T.W.J. (1962) Behavior of asymptotically flat empty space, J.
Math. Phys. 3, 891–901.
Newman, E.T. and Penrose, R. (1962) An approach to gravitational radiation by a method
of spin coefficients, J. Math. Phys. 3, 566–578 (Errata 4 (1963) 998).
Penrose, R. (1963) Asymptotic properties of fields and space-times, Phys. Rev. Lett. 10,
66–68.
Penrose, R. (1965) Zero rest-mass fields including gravitation: asymptotic behaviour, Proc.
Roy. Soc. London, A284, 159–203.
Penrose, R. and Rindler, W. (1986) Spinors and Space-Time, Vol. 2: Spinor and Twistor
Methods in Space-Time Geometry, Cambridge University Press, Cambridge.
Hilbert, D. (1915) Die Grundlagen der Physik (Erste Mitteilung), Nachrichten, Königliche
Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse,
395–407. Translation by D. Fine as The Foundations of Physics (First Communication),
in J-P Hsu and D. Fine (eds.), 100 Years of Gravity and Accelerated Frames:
The Deepest Insights of Einstein and Yang-Mills, World Scientific, Singapore, 120–131.
Komar, A. (1959) Covariant conservation laws in general relativity, Phys. Rev. 113(3),
934–936.
Taylor, J.H. (1994) Binary pulsars and relativistic gravity, Rev. Mod. Phys. 66(3), 711–
719.
Will, C. M. (2005) Was Einstein right? Testing relativity at the centenary, in Abhay
Ashtekar (ed.) 100 Years of Relativity. Space-Time Structure: Einstein and Beyond,
World Scientific, Singapore, 205–227.
O’Raifeartaigh, C. (2013) The contribution of V.M. Slipher to the discovery of the expand-
ing universe, in Michael Way and Deidre Hunter (eds.), Origins of the Expanding
Universe: 1912–1932, Astronomical Society of the Pacific Conference Series 471,
49–62.
Hoyle, F. (1950) The Nature of the Universe, Basil Blackwell, Oxford.
Sciama, D.W. (1959) The Unity of the Universe, Doubleday, Garden City, New York.
Einstein, A. (1917) Kosmologische betrachtungen zur allgemeinen relativitätstheorie,
Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften (Berlin),
Seite 142–152.

Chandrasekhar, S. (1931) The maximum mass of ideal white dwarfs, Astrophys. J. 74,
81–82.
Eddington, A.S. (1935) Meeting of the Royal Astronomical Society, Friday, January 11,
1935, The Observatory 58(February 1935), 33–41.
Eddington, A.S. (1946) Fundamental Theory, Cambridge University Press, Cambridge.
Oppenheimer, J.R. and Snyder, H. (1939) On continued gravitational contraction, Phys.
Rev. 56(5), 455–459.
Penrose, R. (1968) Structure of space-time, in C.M. DeWitt and J.A. Wheeler (eds.)
Battelle Rencontres, 1967 Lectures in Mathematics and Physics, Benjamin, New
York.
Penrose, R. and Rindler, W. (1984) Spinors and Space-Time, Vol. 1: Two-Spinor Calculus
and Relativistic Fields, Cambridge University Press, Cambridge.
Yodzis, P., Seifert, H.-J. and Müller zum Hagen, H. (1973) On the occurrence of naked
singularities in general relativity, Commun. Math. Phys. 34(2), 135–148.
Lifshitz, E.M. and Khalatnikov, I.M. (1963) Investigations in relativistic cosmology, Adv.
Phys. 12, 185–249.
Friedmann, A. (1922) Über die Krümmung des Raumes, Z. Phys. 10(1), 377–386.
Lemaître, G. (1933) L’univers en expansion, Ann. Soc. Sci. Bruxelles I A53, 51–85 (cf.
p. 82).
Eddington, A.S. (1924) A comparison of Whitehead’s and Einstein’s formulas, Nature
113(2832), 192–192.
Finkelstein, D. (1958) Past-future asymmetry of the gravitational field of a point particle,
Phys. Rev. 110, 965–967.
Painlevé, P. (1921) La mécanique classique et la théorie de la relativité, C. R. Acad. Sci.
(Paris) 173, 677–680.
Penrose, R. (1964) Conformal approach to infinity, in B.S. DeWitt and C.M. DeWitt
(eds.) Relativity, Groups and Topology: The 1963 Les Houches Lectures, Gordon
and Breach, New York.
Penrose, R. (1966) General-relativistic energy flux and elementary optics, in B. Hoffmann
(ed.) Perspectives in Geometry and Relativity, Indiana University Press, Blooming-
ton, 259–274.
Penrose, R. (1989) The Emperor’s New Mind: Concerning Computers, Minds, and the
Laws of Physics, Oxford University Press, Oxford. ISBN: 0-19-851973-7.
Kronheimer, E.H. and Penrose, R. (1967) On the structure of causal spaces, Proc. Camb.
Phil. Soc. 63, 481–501.
Penrose, R. (1972) Techniques of Differential Topology in Relativity, CBMS Regional Conf.
Ser. in Appl. Math., No. 7, S.I.A.M., Philadelphia.
Hawking, S.W. and Ellis, G.F.R. (1973) The Large-Scale Structure of Space-Time, Cam-
bridge University Press, Cambridge.
Seifert, H.-J. (1971) The causal boundary of space-times, Gen. Rel. Grav. 1(3),
247–259.
Geroch, R., Kronheimer, E.H. and Penrose, R. (1972) Ideal points in spacetime, Proc. Roy.
Soc. (London) A327, 545–567.
Penrose, R. (1998) The question of cosmic censorship, in R.M. Wald (Ed.) Black Holes
and Relativistic Stars, University of Chicago Press, Chicago, Illinois.

Penrose, R. (1969) Gravitational collapse: the role of general relativity, Rivista del Nuovo
Cimento Serie I, Vol. 1; Numero speciale, 252–276. Reprinted: Gen. Rel. Grav. 34(7),
July 2002, 1141–1165.
Geroch, R. and Horowitz, G.T. (1987) Global structure of spacetimes, in S.W. Hawking
and W. Israel (eds.) General Relativity: An Einstein Centenary Survey, Cambridge
University Press, Cambridge.
Geroch, R. (1970) Domain of dependence, J. Math. Phys. 11(2), 437–449.
Choquet-Bruhat, Y. and Geroch, R. (1969) Global aspects of the Cauchy problem in
general relativity, Commun. Math. Phys. 14(4), 329–335.
Hawking, S.W. (1967) The occurrence of singularities in cosmology III. Causality and
singularities, Proc. Roy. Soc. (London) A300, 187–201.
Penrose, R. The Road to Reality: A Complete Guide to the Laws of the Universe, Vintage. ISBN:
978-0-679-77631-4.
Raychaudhuri, A.K. (1955) Relativistic cosmology. I, Phys. Rev. 98(4), 1123–1126.
Komar, A. (1956) Necessity of singularities in the solution of the field equations of general
relativity, Phys. Rev. 104(2), 544–546.
Myers, S.B. (1941) Riemannian manifolds with positive mean curvature, Duke Math. J.
8(2), 401–404.
Penrose, R. (1965) Gravitational collapse and space-time singularities, Phys. Rev. Lett.
14(3), 57–59.
Hawking, S.W. (1965) Occurrence of singularities in open universes, Phys. Rev. Lett.
15(17), 689–690.
Hawking, S.W. (1966a) The occurrence of singularities in cosmology, Proc. Roy. Soc.
(London) A294(1439), 511–521.
Hawking, S.W. (1966b) The occurrence of singularities in cosmology II, Proc. Roy. Soc.
(London) A295(1443), 490–493. [Adams Prize Essay: Singularities in the geometry
of space-time.]
Hawking, S.W. and Penrose, R. (1970) The singularities of gravitational collapse and
cosmology, Proc. Roy. Soc. (London) A314(1519), 529–548.
Belinskii, V.A., Khalatnikov, I.M. and Lifshitz, E.M. (1970) Oscillatory approach to a
singular point in the relativistic cosmology, Usp. Fiz. Nauk 102, 463–500. Engl.
transl. in Adv. in Phys. 19(80), 525–573.
Belinskii, V.A., Khalatnikov, I.M. and Lifshitz, E.M. (1972) Construction of a general
cosmological solution of the Einstein equation with a time singularity, Zh. Eksp.
Teor. Fiz. 62, 1606–1613. English transl. in Soviet Phys. JETP 35(5), 838–841.
Misner, C.W. (1969) Mixmaster universe, Phys. Rev. Lett. 22(20), 1071–1074.
Ashtekar, A. (2005) Quantum geometry and its ramifications, in Abhay Ashtekar (ed.) 100
Years of Relativity. Space-Time Structure: Einstein and Beyond, World Scientific,
Singapore.
Bojowald, M. (2005) Loop quantum cosmology, in Abhay Ashtekar (ed.) 100 Years of
Relativity. Space-Time Structure: Einstein and Beyond, World Scientific, Singapore.
Hawking, S.W. (1975) Particle creation by black holes, Commun. Math. Phys. 43, 199–220.
October 31, 2018 14:49 taken from 146-MPLA ws-rv961x669 chap06-S0217732318300112 page 173


Chapter 6

Beyond anyons∗

Zhenghan Wang
Microsoft Research Station Q and Department of Mathematics,
University of California, Santa Barbara, CA, USA

The theory of anyon systems, as modular functors topologically and unitary modular
tensor categories algebraically, is mature. To go beyond anyons, our first step is the
interplay of anyons with conventional group symmetry due to the paramount importance
of group symmetry in physics. This led to the theory of symmetry-enriched topological
order. Another direction is the boundary physics of topological phases, both gapless
as in the fractional quantum Hall physics and gapped as in the toric code. A more
speculative and interesting direction is the study of Banados–Teitelboim–Zanelli (BTZ)
black holes and quantum gravity in 3d. The clearly defined physical and mathematical
issues require a far-reaching generalization of anyons and seem to be within reach. In this
short survey, I will first cover the extensions of anyon theory to symmetry defects and
gapped boundaries. Then, I will discuss a desired generalization of anyons to anyon-like
objects — the BTZ black holes — in 3d quantum gravity.

Keywords: Anyon; topological phase; symmetry defect; gapped boundary.

PACS Nos.: 71.10.-w, 73.20.-r, 75.60.ch

1. Introduction
Systematic application of quantum topology in condensed matter physics accelerated
significantly after the 2003 Workshop Topological Phases in Condensed Matter
Physics [8]. Anyons, elementary excitations in 2D topological phases of matter, play
a central role in this new period of interactions between topology and physics in
3d spacetime$^a$ (see Refs. 13, 16 and references therein). Conceptually, anyons
provide an interpolation between bosons and fermions through the topological spins
of Abelian anyons: $e^{i\theta}$ for some $\theta$, with $\theta = 0$ for bosons and
$\theta = \pi$ for fermions. I see a striking parallel between the application of topology
in physics and the application of topology in differential geometry, that is, global
differential geometry. Progress in science and technology has made topological physics
inevitable: the inherently discrete nature of quantum mechanics, the continuing
miniaturization of quantum devices, and the maturity of local classical physics.

∗ This chapter also appeared in Modern Physics Letters A, Vol. 33, No. 28 (2018) 1830011. DOI:
10.1142/S0217732318300112.
a I will use the convention that nD means n-dimensional space and nd means n-dimensional
spacetime, so (n + 1)d = nD + 1.

At the forefront of topological physics is the application to quantum computing:
topological quantum computing, with the promise of an inherently fault-tolerant
universal quantum computer [9]. However, to build a real topological quantum
computer, topological physics has to be coupled to conventional physics, such as during
initialization and read-out. Therefore, it is natural to consider how to extend
topological physics beyond anyons. One obvious direction is from 2D to 3D. But
the difference between 2D and 3D could be enormous, as the topology of 3d and 4d
manifolds is dramatically different. For example, we have a rather complete
classification of 3d spacetime manifolds due to the geometrization theorem of
Perelman–Thurston, while there is no reasonable conjectured picture of smooth 4d spacetime
manifolds.
The theory of anyon systems, as modular functors topologically and unitary
modular tensor categories algebraically, is mature [13]. To go beyond anyons, our
first step is the interplay of anyons with conventional group symmetry, due to the
paramount importance of group symmetry in physics. This led to the theory of
symmetry-enriched topological order. Another direction is the boundary physics
of topological phases, both gapless as in the fractional quantum Hall physics and
gapped as in the toric code.
A more speculative and interesting direction is the study of Banados–
Teitelboim–Zanelli (BTZ) black holes and quantum gravity in 3d. The clearly
defined physical and mathematical issues require a far-reaching generalization of
anyons and seem to be within reach.
In this survey, I will first cover the extensions of anyon theory to symmetry de-
fects and gapped boundaries. Then I will discuss a desired generalization of anyons
to anyon-like objects — the BTZ black holes — in 3d quantum gravity.

2. 2D Non-Abelian Objects
Non-Abelian means that the order in a sequence of things is important, such as the
order of letters in words: NO is not the same as ON. If many non-Abelian objects
$X_1, \ldots, X_n$ are arranged in a line, then their states can be changed by exchanging
any two of them. If two different exchanges are performed sequentially, the order
of the two exchanges becomes important. The fundamental prerequisite for such a
phenomenon is that the states of such $n$ non-Abelian objects are not unique, i.e. the
ground states have degeneracy, which is more fundamental than statistics [14]. The
best understood non-Abelian objects are non-Abelian anyons. The last two decades
of research in condensed matter physics have yielded remarkable progress in the
understanding of non-Abelian anyons in topological phases of matter. Recently,
other non-Abelian objects have been discovered, such as the Majorana zero modes, which
lead essentially to the same non-Abelian physics as non-Abelian anyons. These new
non-Abelian objects (symmetry defects and gapped boundaries) open doors to
new approaches to topological quantum computation.
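
The statement that the order of two exchanges matters can be made concrete with a small sketch. It is an illustration only and is not taken from this chapter: it assumes the reduced Burau matrices of the three-strand braid group as stand-ins for two exchange operations acting on a 2-dimensional (degenerate) state space, with the parameter value `T = 2` chosen arbitrarily to keep the arithmetic exact. These matrices are not unitary, so they do not describe a physical anyon model.

```python
# Illustrative sketch (not from the chapter): two exchange operations acting
# on a 2-dimensional state space, modeled by the reduced Burau matrices of
# the three-strand braid group at an arbitrary integer parameter T.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

T = 2  # assumed parameter value, chosen only to keep the arithmetic exact

SIGMA_1 = [[-T, 1], [0, 1]]   # exchange of objects 1 and 2
SIGMA_2 = [[1, 0], [T, -T]]   # exchange of objects 2 and 3

# The two possible orderings of the exchanges give different matrices.
product_12 = matmul(SIGMA_1, SIGMA_2)
product_21 = matmul(SIGMA_2, SIGMA_1)
order_matters = product_12 != product_21

# The braid relation still holds, as it must for exchanges of three objects.
lhs = matmul(matmul(SIGMA_1, SIGMA_2), SIGMA_1)
rhs = matmul(matmul(SIGMA_2, SIGMA_1), SIGMA_2)
braid_relation_holds = lhs == rhs

print(order_matters, braid_relation_holds)
```

On a 1-dimensional state space every exchange would act as a single number, and numbers always commute; this is why degeneracy is the prerequisite for non-Abelian behavior.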

2.1. Symmetry defects


In the absence of any symmetry, gapped quantum systems at zero temperature can
still form distinct phases of matter — topological phases of matter (TPMs), which
are characterized by their topological order.
A TPM $\mathcal{H} = \{H\}$ is an equivalence class of gapped Hamiltonians $H$ which
realize a TQFT at low energy. Elementary excitations in a TPM $\mathcal{H}$ are point-like
anyons. Anyons can be modeled algebraically as simple objects in a unitary modular
tensor category (UMC) $\mathcal{B}$, which will be referred to as the topological order of the
TPM $\mathcal{H}$.
The interplay of symmetry with topological order has generated intense research.
In the presence of symmetries, TPMs acquire a finer classification and fractional
quasi-particles of a topologically ordered state can acquire fractional quantum num-
bers of the global symmetry.
When a Hamiltonian for a TPM possesses a global symmetry, it is natural
to consider the topological order that is obtained when this global symmetry is
promoted to a local, gauge symmetry. This gauging procedure is useful in many
ways.

2.2. Algebraic model of symmetry defects


In the real world, TPMs are always coupled to conventional degrees of freedom.
TPMs with conventional group symmetries are called symmetry enriched topo-
logical phases of matter (SETs). When the intrinsic topological order is trivial,
SETs become symmetry protected topological phases (SPTs). Important examples
of SPTs are topological insulators and topological superconductors.
Let G be a finite group and C a UMC, also called an anyon model.

2.2.1. Topological symmetry


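Promoting $G$ to a categorical-group $\underline{G}$, we denote by
$\underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}$ the categorical-group
of braided tensor auto-equivalences of the UMC $\mathcal{C}$, which is the full topological
symmetry of $\mathcal{C}$.
Definition 2.1. A finite group $G$ is a topological symmetry of the UMC $\mathcal{C}$ if there
is a monoidal functor $\underline{\rho}: \underline{G} \to \underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}$.
The topological symmetry is denoted as $(\rho, G)$ or simply $\rho$.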

2.2.2. Symmetry defects


The invertible module categories over $\mathcal{C}$ form the Picard categorical-group
$\underline{\mathrm{Pic}(\mathcal{C})}$ of $\mathcal{C}$. The Picard categorical-group
$\underline{\mathrm{Pic}(\mathcal{C})}$ of the UMC $\mathcal{C}$ is monoidally equivalent
to the categorical-group $\underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}$.
This one–one correspondence between braided tensor auto-equivalences and invertible
module categories underlies the relation between symmetry and defect. Hence, given a
topological symmetry $(\rho, G)$ of the UMC $\mathcal{C}$ and an isomorphism of
$\underline{\mathrm{Pic}(\mathcal{C})}$ with $\underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}$,
each $\rho(g) \in \underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}$ corresponds
to an invertible bi-module category $\mathcal{C}_g \in \underline{\mathrm{Pic}(\mathcal{C})}$.

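Definition 2.2. An extrinsic topological defect of flux $g \in G$ is a simple object
in the invertible module category $\mathcal{C}_g \in \underline{\mathrm{Pic}(\mathcal{C})}$
corresponding to the braided tensor auto-equivalence
$\rho(g) \in \underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}$.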

2.2.3. Gauging topological symmetry


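Let $\underline{\underline{G}}$ be the categorical 2-group, and
$\underline{\underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}}$ be the categorical
2-group of braided tensor auto-equivalences.

Definition 2.3. A topological symmetry
$\underline{\rho}: \underline{G} \to \underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}$
can be gauged if $\underline{\rho}$ can be lifted to a categorical 2-group functor
$\underline{\underline{\rho}}: \underline{\underline{G}} \to \underline{\underline{\mathrm{Aut}^{\mathrm{br}}_{\otimes}(\mathcal{C})}}$.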

The physical and mathematical theory of symmetry defects can be found in
Refs. 1 and 2. Symmetry defects can be used to enhance the computing power of
anyons. A study of topological quantum computation with symmetry defects can be
found in Ref. 7.

2.3. Gapped boundaries


A second direction for non-Abelian objects beyond anyons is gapped boundaries in
TPMs whose topological order is the Drinfeld center $Z(\mathcal{C})$ of a unitary fusion
category $\mathcal{C}$, called a doubled theory in physics.
The Levin–Wen (LW) model in 2D is a lattice Hamiltonian realization of
Turaev–Viro type TQFTs based on unitary fusion categories. The conceptual
underpinning of the 2D LW model is two mathematical theorems: the Drinfeld center
$Z(\mathcal{C})$ of a unitary fusion category $\mathcal{C}$ is always modular, and the
Turaev–Viro TQFT based on $\mathcal{C}$ is equivalent to the Reshetikhin–Turaev TQFT
based on $Z(\mathcal{C})$. Therefore, the LW model is a lattice Hamiltonian implementation
of both theorems simultaneously. These rigorously solvable models provide the best
playground for the theoretical study of TPMs. Realistically, samples of TPMs have boundaries.
The interplay of the boundary with the interior, or bulk, contains rich physics, as
exemplified by the famous holographic principle.
In the categorical formalism, the bulk of a doubled TQFT is given by a UMC
$\mathcal{B} = Z(\mathcal{C})$ for some unitary fusion category $\mathcal{C}$, and a (gapped)
hole is a Lagrangian algebra $A = \bigoplus_a n_a a$ in $\mathcal{B}$. In the case of
Dijkgraaf–Witten theories, we have $\mathcal{C} = \mathrm{Vec}_G$. For most purposes, $A$ can
be regarded as a (composite) non-Abelian anyon of quantum dimension $d_A$. Gapped
boundaries are conjectured to be in one-to-one correspondence with indecomposable module
categories $\mathcal{M}_i$ over $\mathcal{C}$. Then, elementary excitations on $\mathcal{M}_i$
are the simple objects in the functor fusion category
$\mathcal{C}_{ii} = \mathrm{Fun}_{\mathcal{C}}(\mathcal{M}_i, \mathcal{M}_i)$, and simple
boundary defects between two gapped boundaries $\mathcal{M}_i$ and $\mathcal{M}_j$ are the
simple objects in the bimodule category
$\mathcal{C}_{ij} = \mathrm{Fun}_{\mathcal{C}}(\mathcal{M}_i, \mathcal{M}_j)$. The
collection of fusion categories $\mathcal{C}_{ii}$ and their bimodule categories
$\mathcal{C}_{ij}$ forms a multi-fusion category $\mathfrak{C}$. From this multi-fusion
category, we can find quantum dimensions of both boundary excitations and the defects
between gapped boundaries.
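
As a small worked example of a Lagrangian algebra, consider the toric code, the doubled theory $Z(\mathrm{Vec}_G)$ for $G = \mathbb{Z}_2$. The sketch below is an assumption-laden illustration rather than a general algorithm: it hard-codes the standard toric code data (four Abelian anyons labelled `1`, `e`, `m`, `psi`, with `psi` a fermion) and filters subsets of anyons by conditions that are only necessary for a Lagrangian algebra, namely containing the vacuum, consisting of bosons, closure under fusion, and $d_A^2 = \dim \mathcal{B}$. All names in the code are hypothetical.

```python
# Sketch under stated assumptions: toric code anyon data is hard-coded, and
# only necessary conditions for a Lagrangian algebra are checked.
from itertools import combinations

ANYONS = ["1", "e", "m", "psi"]
QUANTUM_DIM = {"1": 1, "e": 1, "m": 1, "psi": 1}
TWIST = {"1": 1, "e": 1, "m": 1, "psi": -1}   # topological twists

# Fusion is the group law of Z_2 x Z_2 with e = (1, 0) and m = (0, 1).
CHARGE = {"1": (0, 0), "e": (1, 0), "m": (0, 1), "psi": (1, 1)}
LABEL = {charge: name for name, charge in CHARGE.items()}

def fuse(a, b):
    """Fuse two Abelian anyons by adding their Z_2 x Z_2 charges."""
    (a1, a2), (b1, b2) = CHARGE[a], CHARGE[b]
    return LABEL[((a1 + b1) % 2, (a2 + b2) % 2)]

BULK_DIM = sum(d * d for d in QUANTUM_DIM.values())  # dim of the bulk = 4

def passes_necessary_conditions(support):
    """Necessary conditions for A = direct sum of the anyons in `support`."""
    d_a = sum(QUANTUM_DIM[a] for a in support)
    return ("1" in support
            and all(TWIST[a] == 1 for a in support)       # bosons only
            and all(fuse(a, b) in support
                    for a in support for b in support)    # closed under fusion
            and d_a * d_a == BULK_DIM)                    # d_A^2 = dim(B)

candidates = [subset
              for size in range(1, len(ANYONS) + 1)
              for subset in combinations(ANYONS, size)
              if passes_necessary_conditions(subset)]
print(candidates)
```

The two surviving candidates, $1 \oplus e$ and $1 \oplus m$, each of quantum dimension $d_A = 2$, match the two well-known gapped boundary types of the toric code.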
Topological quantum computation with gapped boundaries is investigated in
Ref. 3, and a striking example is the universal gate set from a purely Abelian TPM [4].
Topological quantum computation with boundary defects is also very interesting,
but a systematic study has not been initiated.

3. Three-Dimensional Topological Physics Beyond Anyons


Classically the universe has three spatial dimensions, but nano-technology makes
the study of low-dimensional physics, such as anyons in 2D, exciting. Therefore, both
as a toy model for 3D physics and as potentially realistic low-dimensional physics, it
is interesting to consider all possible 3d physics including the Yang–Mills theory.
One salient feature of 3d is the Chern–Simons (CS) action, which could be coupled
to Yang–Mills. A fascinating direction is 3d quantum gravity. Classical 3d gravity
is the same as a doubled CS theory with gauge group $SL(2, \mathbb{R})$, but how to
quantize doubled $SL(2, \mathbb{R})$-CS theory, which has a non-compact gauge group, is very
challenging. This is an excellent example for the generalization of anyon systems
to topological systems with infinitely many elementary excitation types, closely
related to BTZ black holes. The geometry, topology, and physics in and around 3d
pure quantum gravity with negative cosmological constant center on the relation
between quantum 3d gravity and quantum doubled CS gauge theory (complicated
by the invertibility of vierbeins), the existence of BTZ black holes, and the
asymptotic Virasoro algebra discovered by Brown and Henneaux, which is a precursor of
the AdS/CFT correspondence.

3.1. 3d gravity
Let $X^3$ be a closed oriented 3d spacetime manifold and $g$ a gravitational field. The
Einstein–Hilbert action is

$$I(g) = \int_{X^3} d^3x \, \sqrt{g} \, (R - 2\Lambda),$$

where $R$ is the scalar curvature and $\Lambda$ the cosmological constant. The equation of
motion gives rise to the Einstein equation:

$$R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} + \Lambda g_{\mu\nu} = 0,$$

where $R_{\mu\nu}$ is the Ricci curvature.
The 3d anti-de Sitter space is the subspace of $\mathbb{R}^4$ defined as

$$\{ -x_1^2 - x_2^2 + x_3^2 + x_4^2 = -l^2 \},$$

for some constant $l > 0$ with the metric $ds^2 = -dx_1^2 - dx_2^2 + dx_3^2 + dx_4^2$. Direct
computation gives

$$R_{\mu\nu} = -\frac{2}{l^2} g_{\mu\nu}, \qquad R = -\frac{6}{l^2},$$

therefore the gravitational field $ds^2 = -dx_1^2 - dx_2^2 + dx_3^2 + dx_4^2$ is a solution of
the Einstein equation if we choose the negative cosmological constant $\Lambda = -\frac{1}{l^2}$.
Topologically the anti-de Sitter space is simply $S^1 \times \mathbb{R}^2$.
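
The choice of $\Lambda$ can be checked in one line. The following worked substitution only restates the computation described in the text, using the quoted values $R_{\mu\nu} = -(2/l^2)\, g_{\mu\nu}$ and $R = -6/l^2$; no further assumptions are made.

```latex
\begin{aligned}
R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} + \Lambda g_{\mu\nu}
  &= \left( -\frac{2}{l^2} + \frac{1}{2} \cdot \frac{6}{l^2} + \Lambda \right) g_{\mu\nu}
   = \left( \frac{1}{l^2} + \Lambda \right) g_{\mu\nu} , \\
\text{which vanishes exactly when} \quad \Lambda &= -\frac{1}{l^2} .
\end{aligned}
```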
The tangent bundle of $X^3$ is trivial, so $TX \cong X^3 \times \mathbb{R}^3$. A framing
$e: TX \cong X^3 \times \mathbb{R}^3$ is a choice of such an identification, which is called
a vierbein (really dreibein) in physics. Let $\omega$ be a spin connection, then $(\omega, e)$
can be made into an $SO(2, 2)$ gauge field. Let $A_\pm = \omega \pm \frac{e}{l}$, then the
Einstein–Hilbert action becomes
a doubled CS action $I(X, g) = \frac{k_L}{4\pi} CS(A_+) - \frac{k_R}{4\pi} CS(A_-)$. Therefore,
classical 3d gravity with $\Lambda < 0$ is the same as doubled CS theory with levels
$k_L = k_R = \frac{3l}{2G}$ [18], where $G$ is the Newton constant.
The quantum theory of 3d gravity is more subtle, as the correspondence with
CS theory is not exact due to the difference between gauge transformations and
non-invertible vierbeins. But if 3d quantum gravity can be defined mathematically,
it should be some irrational TQFT. Then solving 3d quantum gravity would be to
find the corresponding conformal field theory in a sense, presumably irrational too.
Speculatively, then, BTZ black holes would be anyon-like objects.

3.2. Volume conjecture


One profound implication of 3d quantum gravity is the possibility of a volume
conjecture [18]. The volume conjecture is made precise for hyperbolic knots [11]. Our
interest is in closed hyperbolic 3-manifolds. It can be easily seen that the naive
generalization from knots to closed 3-manifolds using rational unitary TQFTs cannot
be correct, as the 3-manifold invariants grow only polynomially as the level goes
to infinity. The most promising version is to use non-unitary rational TQFTs as
in Ref. 5. It is puzzling how the non-unitarity arises from the unitary 3d quantum
gravity.
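
For orientation, the knot version of the volume conjecture is usually quoted in the following form. The notation here is an assumption of this sketch and is not fixed by the chapter: $\langle K \rangle_N$ denotes Kashaev's invariant of a hyperbolic knot $K$ at level $N$, and $\mathrm{Vol}$ the hyperbolic volume of the knot complement.

```latex
\lim_{N \to \infty} \frac{2\pi}{N} \, \log \bigl| \langle K \rangle_N \bigr|
  \;=\; \mathrm{Vol}\bigl( S^3 \setminus K \bigr)
```

The left-hand side requires exponential growth of the invariant in $N$, which is why invariants that grow only polynomially in the level cannot satisfy a statement of this kind.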

4. Four-Dimensional Topological Physics


Two interesting families of (3 + 1)-TQFTs are discrete gauge theories and BF
theories. Both families are related to the Crane–Frenkel–Yetter (CFY) (3 + 1)-
TQFTs based on unitary pre-modular categories, which can be realized by lattice
models.
A more general framework for the CFY TQFTs is Mackaay's TQFTs based
on spherical 2-categories [12]. Cui generalized the CFY construction to $G$-crossed
braided fusion categories [6]. The resulting new (3 + 1)-TQFTs do not fit into
Mackaay's notion of spherical 2-categories. Therefore, one problem is to formulate a
higher category theory that underlies all these theories, and to study their
application in 3D TPMs. Cui's TQFTs can also be realized with exactly solvable lattice
models [17].

4.1. Crane–Frenkel–Yetter TQFTs and tetra-categories


The lattice models for (3 + 1)-TQFTs are generalizations of the LW models in
2D. We expect a tetra-category, which is some double of the input tri-category,
to describe all elementary excitations. Algebraic formulations of general tri- and
tetra-categories are extremely complicated. The physical intuition we gain from the
lattice models would help us to understand these special tetra-categories.

4.2. Representations of motion groups and statistics of extended objects

One lesson that we learned from 2D is that deep information about TQFTs is
encoded in their associated representations of the mapping class groups, in particular
the braid groups. The braid groups are motion groups of points in the 2D disk. In
3D, given a 3-manifold $M$ and a (not necessarily connected) sub-manifold $N$, we
can define the motion group of $N$ in $M$. The first interesting cases will be for links
in $S^3$. While the partition functions of the CFY (3 + 1)-TQFTs are not necessarily
interesting topological invariants, their induced representations of the motion
groups could be more interesting.
The Müger center describes the point-like excitations in lattice models. More
interesting in 3D are the loop excitations or, in general, excitations of the shape of any
closed surface. Considering the simplest case of $n$ unlinked unknotted oriented closed
loops in $\mathbb{R}^3$, we obtain the $n$-component loop braid group $LB_n$. The elementary
exchange of two loops leads to a subgroup isomorphic to the permutation group
$S_n$, and the passing-through operation to a subgroup isomorphic to the braid group
$B_n$. It follows that the loop braid group for $n$ unlinked unknotted oriented closed
loops in $\mathbb{R}^3$ is some product of the permutation group $S_n$ with the braid group $B_n$.
By general TQFT properties, each Crane–Frenkel–Yetter TQFT will lead to a
representation of the loop braid group. Physically, we are computing the generalized
statistics of loop excitations in the lattice model. There is no reason to consider only
unknotted loops. Similarly we can consider statistics of knotted loop excitations.
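
To give a feel for how the two kinds of generators of $LB_n$ interact, here is a minimal sketch for $n = 3$. It is not the chapter's construction and does not come from a Crane–Frenkel–Yetter TQFT. It assumes the unreduced Burau matrices for the passing-through moves (`SIGMA_1`, `SIGMA_2`) and permutation matrices for the plain exchanges (`S_1`, `S_2`), with an arbitrary parameter `T = 2`. The precise form of the mixed relations in a presentation of the loop braid group depends on composition conventions; the code simply verifies, by direct multiplication, the relations that hold for these particular matrices.

```python
# Sketch, not the chapter's construction: 3x3 matrices standing in for the
# generators of the loop braid group LB_3. Passing-through moves use the
# unreduced Burau matrices; plain exchanges use permutation matrices.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product(*factors):
    """Multiply several matrices from left to right."""
    result = factors[0]
    for matrix in factors[1:]:
        result = matmul(result, matrix)
    return result

T = 2  # assumed parameter value, chosen only to keep the arithmetic exact

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
SIGMA_1 = [[1 - T, T, 0], [1, 0, 0], [0, 0, 1]]   # loop 1 passes through loop 2
SIGMA_2 = [[1, 0, 0], [0, 1 - T, T], [0, 1, 0]]   # loop 2 passes through loop 3
S_1 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]           # exchange loops 1 and 2
S_2 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]           # exchange loops 2 and 3

checks = {
    "braid relation for passing-through":
        product(SIGMA_1, SIGMA_2, SIGMA_1) == product(SIGMA_2, SIGMA_1, SIGMA_2),
    "exchanges square to the identity":
        product(S_1, S_1) == IDENTITY and product(S_2, S_2) == IDENTITY,
    "braid relation for exchanges":
        product(S_1, S_2, S_1) == product(S_2, S_1, S_2),
    "mixed relation (two exchanges, one passing-through)":
        product(S_1, S_2, SIGMA_1) == product(SIGMA_2, S_1, S_2),
    "mixed relation (one exchange, two passing-throughs)":
        product(S_1, SIGMA_2, SIGMA_1) == product(SIGMA_2, SIGMA_1, S_2),
    "passing-through does not square to the identity":
        product(SIGMA_1, SIGMA_1) != IDENTITY,
}

for name, holds in checks.items():
    print(f"{name}: {holds}")
```

The exchanges behave like permutations, the passing-through moves behave like braids, and the mixed relations tie the two together, which is the sense in which $LB_n$ combines $S_n$ and $B_n$ without being a plain direct product.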

4.3. Fracton physics


Another question in 4d physics is how topological fracton systems
are [10, 15]. The low energy effective theories of fracton systems are not TQFTs,
as the ground state degeneracy on a fixed space manifold grows as the lattice size
grows. Therefore, a new framework beyond anyons is necessary to capture these
new systems.
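
The contrast with a TQFT can be seen in a few numbers. The sketch below uses values commonly quoted in the fracton literature, taken here as assumptions rather than derived: on an $L \times L \times L$ three-torus, the X-cube model has $\log_2(\mathrm{GSD}) = 6L - 3$, whereas the 3D toric code has $\mathrm{GSD} = 2^3$ for every $L$. The function names are hypothetical.

```python
# Sketch using literature values as assumptions (not derived here):
# ground state degeneracy (GSD) on an L x L x L three-torus.

def log2_gsd_x_cube(linear_size: int) -> int:
    """Assumed formula for the X-cube fracton model: log2(GSD) = 6L - 3."""
    return 6 * linear_size - 3

def log2_gsd_toric_code_3d(linear_size: int) -> int:
    """3D toric code: GSD = 2^3 on the three-torus, independent of L."""
    return 3

for size in (2, 4, 8, 16):
    print(size, log2_gsd_toric_code_3d(size), log2_gsd_x_cube(size))
```

A TQFT assigns a fixed vector space to a fixed space manifold, so the middle column stays constant, while the last column grows linearly with the lattice size.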

5. Applications
An immediate application of topological phases of matter is the construction of a
scalable fault-tolerant quantum computer. In quantum computation, a qubit is an
abstraction of all 2-level quantum systems, though it is not the same as any particular
one, just as the number 1 is not the same as an apple. Quantumness should be a
new source of energy, and a quantum computer is a machine that converts quantum
resources such as superposition and entanglement into useful energy. We need new
constants similar to the Planck constant or Boltzmann constant to quantify the
energy in superposition and entanglement.

Acknowledgment
This work was partially supported by NSF DMS under Grant No. 1411212.

References
1. M. Barkeshli, P. Bonderson, M. Cheng and Z. Wang, arXiv:1410.4540.
2. S.-X. Cui, C. Galindo, J. Plavnik and Z. Wang, Commun. Math. Phys. 348, 1043
(2016).
3. I. Cong, M. Cheng and Z. Wang, arXiv:1609.02037.
4. I. Cong, M. Cheng and Z. Wang, Phys. Rev. Lett. 119, 170504 (2017), arXiv:1707.05490.
5. Q. Chen and T. Yang, arXiv:1503.02547.
6. S.-X. Cui, arXiv:1610.07628.
7. C. Delaney and Z. Wang, Symmetry defects and their application to topological quan-
tum computing, in preparation.
8. https://aimath.org/pastworkshops/topquantum.html.
9. M. Freedman, A. Kitaev, M. Larsen and Z. Wang, B. Am. Math. Soc. 40, 31 (2003).
10. J. Haah, Phys. Rev. A 83, 042330 (2011).
11. R. Kashaev, Lett. Math. Phys. 39, 269 (1997).
12. M. Mackaay, Adv. Math. 143, 288 (1999).
13. E. C. Rowell and Z. Wang, arXiv:1705.06206.
14. E. C. Rowell and Z. Wang, Phys. Rev. A 93, 030102 (2016).
15. S. Vijay, J. Haah and L. Fu, Phys. Rev. B 94, 235157 (2016).
16. Z. Wang, Topological Quantum Computation, Issue 112 of Regional Conference Series
in Mathematics (American Mathematical Soc., 2010).
17. D. Williamson and Z. Wang, Ann. Phys. 377, 311 (2017).
18. E. Witten, Nucl. Phys. B 311, 46 (1988).
October 31, 2018 14:53 IJMPB chap07-S0217979218300104 page 181


Chapter 7

Four revolutions in physics and the second quantum revolution:
A unification of force and matter by quantum information∗

Xiao-Gang Wen
Department of Physics,
Massachusetts Institute of Technology,
Cambridge, Massachusetts 02139, USA

Newton’s mechanical revolution unifies the motion of planets in the sky and the falling
of apples on Earth. Maxwell’s electromagnetic revolution unifies electricity, magnetism,
and light. Einstein’s relativistic revolution unifies space with time, and gravity with
space–time distortion. The quantum revolution unifies particles with waves, and energy
with frequency. Each of those revolutions changes our world view. In this article, we
will describe a revolution that is happening now: the second quantum revolution which
unifies matter/space with information. In other words, the new world view suggests
that elementary particles (the bosonic force particles and fermionic matter particles)
all originated from quantum information (qubits): they are collective excitations of an
entangled qubit ocean that corresponds to our space. The beautiful geometric Yang–Mills
gauge theory and the strange Fermi statistics of matter particles now have a common
algebraic quantum informational origin.

Symmetry is beautiful and rich.


Quantum entanglement is even more beautiful and richer.

1. Four Revolutions in Physics


We have a strong desire to understand everything from a single or very few origins.
Driven by such a desire, physics theories were developed through the cycle of dis-
coveries, unification, more discoveries, and bigger unification. Here, we would like
to review the development of physics and its four revolutions.a We will see that the

∗ This chapter also appeared in International Journal of Modern Physics B, Vol. 32, No. ?? (2018)
1830010. DOI: 10.1142/S0217979218300104.
a Here we do not discuss the revolution for thermodynamical and statistical physics.

history of physics can be summarized into three stages: (1) all matter is formed by
particles; (2) the discovery of wave-like matter; (3) particle-like matter = wave-like
matter. It appears that we are now entering the fourth stage: matter and space =
information (qubits), where qubits emerge as the origin of everything.1–8 In other
words, all elementary bosonic force particles and fermionic matter particles can be
unified by quantum information (qubits).

1.1. Mechanical revolution


Although the downward pull of the earth was felt even before human civilization,
such a phenomenon did not arouse any curiosity. On the other hand, the motion of
planets in the sky has aroused a lot of curiosity and led to many imaginary fantasies.
However, it was only after Kepler found that planets move in a certain way described
by a mathematical formula (see Fig. 1) that people started to wonder: Why are
planets so rational? Why do they move in such a peculiar and precise way? This
motivated Newton to develop his theory of gravity and his laws of mechanical
motion (see Fig. 2). Newton’s theory not only explains the planets’ motion; it
also explains the downward pull that we feel on earth. The planets’ motion in the sky
and the apple falling on earth look very different (see Fig. 3); however, Newton’s
theory unifies the two seemingly unrelated phenomena. This is the first revolution
in physics — the mechanical revolution.
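
To make this unification concrete, here is a minimal numerical sketch (our illustration, not part of the original argument). It uses the single force law of Fig. 2 to compute both planetary periods and the acceleration of a falling apple. It assumes circular orbits; the solar mass, the astronomical unit, the orbital radii, and the Earth's mass and radius are standard approximate values that we supply, and all function and variable names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (value quoted in Fig. 2)
M_SUN = 1.989e30   # solar mass in kg (assumed standard value, not from the text)
AU = 1.496e11      # astronomical unit in m (assumed standard value)


def orbital_period(a):
    """Period in seconds of a circular orbit of radius a (in m) around the Sun.

    Obtained by equating gravity G*M*m/a^2 with the centripetal force m*a*(2*pi/T)^2.
    """
    return 2.0 * math.pi * math.sqrt(a**3 / (G * M_SUN))


def surface_gravity(mass, radius):
    """Acceleration of a falling body at the surface, from the same force law."""
    return G * mass / radius**2


# Orbital radii in AU (assumed approximate values).
planets = {"Earth": 1.000, "Mars": 1.524, "Saturn": 9.537}
ratios = {}
for name, a_au in planets.items():
    a = a_au * AU
    period = orbital_period(a)
    ratios[name] = period**2 / a**3   # Kepler's third law: the same for every planet
    years = period / (86400 * 365.25)
    print(f"{name}: T = {years:.2f} yr, T^2/a^3 = {ratios[name]:.4e} s^2/m^3")

# The apple: Earth's mass and radius are assumed standard values.
g_earth = surface_gravity(5.972e24, 6.371e6)
print(f"g at the Earth's surface = {g_earth:.2f} m/s^2")
```

The ratio T^2/a^3 comes out the same for every planet, which is Kepler's third law, while the same constant G gives an acceleration of about 9.8 m/s^2 for the falling apple.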

Fig. 1. Kepler’s Laws of Planetary Motion: (1) The orbit of a planet is an ellipse with the Sun
at one of the two foci. (2) A line segment joining a planet and the Sun sweeps out equal areas
during equal intervals of time. (3) The square of the orbital period of a planet is proportional to
the cube of the semimajor axis of its orbit.

Fig. 2. Newton laws: (a) The more force, the more acceleration, no force leads to no acceleration.
(b) Action force = reaction force. (c) Newton’s universal gravitation: $F = G \frac{m_1 m_2}{r^2}$, where
$G = 6.674 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$.

Fig. 3. The perceived trajectories of planets (Mars and Saturn) in the sky. The falling of an
apple on earth and the motion of a planet in the sky are unified by Newton’s theory.

Mechanical revolution
All matter is formed by particles, which
obey Newton’s laws. Interactions are
instantaneous over distance.

After Newton, we view all matter as formed by particles, and use Newton’s laws for
particles to understand the motion of all matter. The success and the completeness
of Newton’s theory gave us a sense that we understood everything.

1.2. Electromagnetic revolution


But, later we discovered that two other seemingly unrelated phenomena, electricity
and magnetism, can generate each other (see Fig. 4). Our curiosity about elec-
tricity and magnetism led to another giant leap in science, which is summarized
by Maxwell’s equations. Maxwell’s theory unifies electricity and magnetism and
reveals that light is merely an electromagnetic wave (see Fig. 5). We gain a much
deeper understanding of light, which is so familiar and yet so unexpectedly rich and

Fig. 4. (a) A changing magnetic field can generate an electric field around it, that drives electric
current in a coil. (b) An electric current I in a wire can generate a magnetic field B around it.
(c) A changing electric field E (just like electric current) can generate a magnetic field B
around it.

Fig. 5. Three very different phenomena, electricity, magnetism, and light, are unified by
Maxwell’s theory.

complex in its internal structure. This can be viewed as the second revolution —
electromagnetic revolution.

Electromagnetic revolution
The discovery of a new form of matter —
wave-like matter: electromagnetic waves,
which obey Maxwell equation. Wave-like
matter causes interaction.

However, the true essence of Maxwell theory is the discovery of a new form of
matter — wave-like (or field-like) matter (see Fig. 6), the electromagnetic wave.
The motion of this wave-like matter is governed by Maxwell’s equation, which is
very different from the particle-like matter governed by Newton’s equation F = ma.
Thus, the sense that Newtonian theory describes everything is incorrect. Newtonian
theory does not apply to wave-like matter, which requires a new theory — Maxwell’s
theory.
Unlike particle-like matter, the new wave-like matter is closely related to a kind
of interaction — electromagnetic interaction. In fact, electromagnetic interaction
can be viewed as an effect of the newly discovered wave-like matter.

Fig. 6. (a) Magnetic field revealed by iron powder. (b) Electric field revealed by glowing plasma.
(c) They form a new kind of matter: light — a wave-like matter.

1.3. Relativity revolution


After realizing the connection between the interaction and wave-like matter, one
naturally asks: does gravitational interaction also correspond to a wave-like matter?
The answer is yes.
First, people realized that Newton’s equation and Maxwell’s equation have different
symmetries under the transformations between two frames moving relative to
each other. In other words, Newton’s equation F = ma is invariant under Galilean
transformation, while Maxwell’s equation is invariant under Lorentz transforma-
tion (see Fig. 7). Certainly, only one of the above two transformations is correct.
If one believes that physical laws should be the same in different frames, then the
above observation implies that Newton’s equation and Maxwell’s equation are in-
compatible, and one of them must be wrong. If Galilean transformation is correct,
then the Maxwell theory is wrong and needs to be modified. If Lorentz transfor-
mation is correct, then the Newton theory is wrong and needs to be modified. The
Michelson–Morley experiment showed that the speed of light is the same in all the
frames, which implied the Galilean transformation to be wrong. So, Einstein chose


Fig. 7. (a) A rest frame and a moving frame with velocity v. An event is recorded with
coordinates (x, y, z, t) in the rest frame and with (x′, y′, z′, t′) in the moving frame. There are two
opinions on how (x, y, z, t) and (x′, y′, z′, t′) are related: (b) Galilean transformation or (c) Lorentz
transformation where c is the speed of light. In our world, the Lorentz transformation is correct.

Fig. 8. The equivalence of the gravitational force of the earth and the force experienced in an
accelerating elevator, leads to a geometric way of understanding gravity: gravity = distortion in
space. In other words, the “gravitational force” in an accelerating elevator is related to a geometric
feature: the transformation between the coordinates in a still elevator and in an accelerating
elevator.

to believe in the Maxwell equation. He modified Newton’s equation and developed
the theory of special relativity. Thus, the Newton theory is not only incomplete, it
is also incorrect.
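
The incompatibility between the two transformations can be seen in a short numerical sketch (our illustration, with hypothetical function names). We follow a light pulse emitted from the origin and ask how fast it moves in a frame with velocity v along the x axis. The choice v = 0.6c and the restriction to one spatial dimension are assumptions made only for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def galilean(x, t, v):
    """Galilean transformation to a frame moving with velocity v along x."""
    return x - v * t, t


def lorentz(x, t, v):
    """Lorentz transformation to a frame moving with velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / C**2)


v = 0.6 * C      # velocity of the moving frame (assumed example value)
t = 1.0          # time in the rest frame, in seconds
x = C * t        # position of a light pulse emitted from the origin at t = 0

x_g, t_g = galilean(x, t, v)
x_l, t_l = lorentz(x, t, v)

speed_galilean = x_g / t_g   # speed of the pulse seen from the moving frame
speed_lorentz = x_l / t_l

print(f"Galilean: pulse speed = {speed_galilean / C:.3f} c")
print(f"Lorentz:  pulse speed = {speed_lorentz / C:.3f} c")
```

Under the Lorentz transformation the pulse still moves at c in the moving frame, whereas the Galilean transformation gives c − v, in conflict with the Michelson–Morley result.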
Einstein went further. Motivated by the equivalence of gravitational force
and the force experienced in an accelerating frame (see Fig. 8), Einstein also
developed the theory of general relativity.9 Einstein’s theory unifies several seemingly
unrelated concepts, such as space and time, as well as interaction and geometry.
Since gravity is viewed as a distortion of space and since the distortion can prop-
agate, Einstein discovered the second wave-like matter — gravitational wave (see
Fig. 9). This is another revolution in physics — relativity revolution.

Relativity revolution
A unification of space and time. A unification
of gravity and space–time distortion.

Motivated by the connection between interaction and geometry in gravity, people
went back to reexamine the electromagnetic interaction, and found that the
electromagnetic interaction is also connected to geometry. Einstein’s general relativity
views gravity as a distortion of space, which can be viewed as a distortion

Fig. 9. Gravitational wave is a propagating distortion of space: a circle is distorted by a
gravitational wave.

Fig. 10. A curved space can be viewed as a distortion of local directions of the space: parallel
moving a local direction (represented by an arrow) around a loop in a curved space, the direction
of the arrow does not come back. Such a twist in local direction corresponds to a curvature in
space.

of local directions of space (see Fig. 10). Motivated by such a picture, in 1918,
Weyl proposed that the unit that we used to measure physical quantities is relative
and is defined only locally. A distortion of the unit system can be described by a
vector field which is called gauge field. Weyl proposed that such a vector field (the
gauge field) is the vector potential that describes the electromagnetism. Although
the above proposal turns out to be incorrect, Weyl’s idea is correct. In 1925, the
complex quantum amplitude was discovered. If we assume the complex phase is rel-
ative, then a distortion of the unit system that measures local complex phase can
also be described by a vector field. Such a vector field is indeed the vector potential
that describes electromagnetism. This leads to a unified way to understand gravity
and electromagnetism: gravity arises from the relativity of spatial directions at dif-
ferent spatial points, while electromagnetism arises from the relativity of complex
quantum phases at different spatial points. Furthermore, Nordström,
Kaluza, and Klein showed that both gravity and electromagnetism can be understood
as a distortion of space–time provided that we think of the space–time as
five-dimensional with one dimension compactified into a small circle.10–12 This can
be viewed as a unification of gravity and electromagnetism. Those theories are
so beautiful. Since that time, the geometric way to view our world has dominated
theoretical physics.

1.4. Quantum revolution


However, such a geometric view of the world was immediately challenged by new
discoveries from the microscopic world.b Experiments in the microscopic world tell
us that not only is Newtonian theory incorrect, even its relativistic modification

b Many people have ignored such challenges and the geometric world view becomes mainstream.

Fig. 11. (a) An electron beam passing through a double-slit can generate an interference pattern,
indicating that electrons are also waves. (b) Using light to eject electrons from a metal (the
photoelectric effect) shows that the higher the light wave frequency (the shorter the wave length),
the higher the energy of the ejected electron. This reveals that a light wave of frequency f can be
viewed as a beam of particles of energy $E = hf$, where $h = 6.62607004 \times 10^{-34}\ \mathrm{m^2\,kg\,s^{-1}}$.

is incorrect. This is because Newtonian theory and its relativistic modification are
theories for particle-like matter. But through experiments on very tiny things, such
as electrons, people found that particles are not really particles. They also behave
like waves at the same time. Similarly, experiments also reveal that light waves
behave like a beam of particles (photons) at the same time (see Fig. 11). So, the
matter in our world is not what we thought it was. Matter is neither particle
nor wave, and both particle and wave. So, the Newton theory (and its relativistic
modification) for particle-like matter and the Maxwell/Einstein theories for wave-
like matter cannot be the correct theories for matter. We need a new theory for the
new form of existence: particle–wave-like matter. The new theory is the quantum
theory that explains the microscopic world. The quantum theory unifies particle-like
matter and wave-like matter.

Quantum revolution
There is no particle-like matter nor wave-like
matter. All the matter in our world is
particle–wave-like matter.

From the above, we see that quantum theory reveals the true existence in our
world to be quite different from the classical notion of existence in our mind. What
exist in our world are not particles or waves but both particle and wave. Such a
picture is beyond our wildest imagination, but reflects the truth about our world and
is the essence of quantum theory. To understand the new notion of existence more
clearly, let us consider another example. This time it is about a bit (represented
by spin-1/2). A bit has two possible states of classical existence: |1⟩ = |↑⟩ and
|0⟩ = |↓⟩. However, quantum theory also allows a new kind of existence |↑⟩ + |↓⟩.
One may say that |↑⟩ + |↓⟩ is also a classical existence since |↑⟩ + |↓⟩ = |→⟩
that describes a spin in x-direction. So, let us consider a third example of two bits.
Then there will be four possible states of classical existence: |↑↑⟩, |↑↓⟩, |↓↑⟩, and

Fig. 12. To observe two points of distance l apart, we need to send in light of wave length λ < l.
The corresponding photon has an energy E = hc/λ. If l is less than the Planck length, l < l_P, then
the photon will make a black hole of size larger than l. The black hole will swallow the two points,
and we can never measure the separation of two points of distance less than l_P. What cannot be
measured cannot exist. So the notion of “two points less than l_P apart” has no physical meaning
and does not exist.

|↓↓⟩. Quantum theory allows a new kind of existence |↑↑⟩ + |↓↓⟩. Such a quantum
existence is entangled and has no classical analogues.
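
The difference between the one-bit and the two-bit examples can be checked directly. A two-qubit state with amplitudes c_↑↑, c_↑↓, c_↓↑, c_↓↓ factorizes into a product of single-qubit states exactly when c_↑↑ c_↓↓ − c_↑↓ c_↓↑ = 0. The sketch below applies this standard criterion; the helper names are ours and are only meant as an illustration.

```python
import math


def normalize(amplitudes):
    """Rescale a list of amplitudes to unit norm."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    return [a / norm for a in amplitudes]


def is_product_state(amplitudes, tol=1e-12):
    """Return True if the two-qubit state factorizes into two one-qubit states.

    amplitudes = [c_uu, c_ud, c_du, c_dd] in the basis |up up>, |up down>,
    |down up>, |down down>. The state is a product state exactly when the
    2x2 matrix of amplitudes has zero determinant.
    """
    c_uu, c_ud, c_du, c_dd = amplitudes
    return abs(c_uu * c_dd - c_ud * c_du) < tol


classical = normalize([0, 1, 0, 0])   # |up down>
rotated = normalize([1, 1, 1, 1])     # (|up> + |down>)(|up> + |down>), both spins along x
entangled = normalize([1, 0, 0, 1])   # |up up> + |down down>

for label, state in [("classical", classical), ("rotated", rotated), ("entangled", entangled)]:
    print(label, "is a product state:", is_product_state(state))
```

The first two states are products of single-spin states, so each spin still has a definite direction of its own. The state |↑↑⟩ + |↓↓⟩ fails the test: no assignment of a direction to each spin separately reproduces it.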
Although the geometric way to understand our world is mainstream in physics,
here we will take a position that the geometric understanding is not good enough
and will try to advocate a very different non-geometric understanding of our world.
Why is the geometric understanding not good enough? First the geometric
understanding is not self-consistent. It contradicts quantum theory. The
consideration based on quantum mechanics and Einstein’s gravity indicates that two points
separated by a distance less than the Planck length
$$l_P = \sqrt{\frac{\hbar G}{c^3}} = 1.616199 \times 10^{-35}\ \mathrm{m} \qquad (1)$$
cannot exist as a physical reality (see Fig. 12). Thus, the foundation of the geomet-
ric approach — manifold — simply does not exist in our universe, since manifold
contains points with arbitrarily small separation. This suggests that geometry is an
emergent phenomenon that appears only at long distances. So, we cannot use geom-
etry and manifold as a foundation to understand fundamental physical problems.
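
The order of magnitude of the Planck length, and the argument of Fig. 12, can be reproduced with a rough sketch. We assume standard approximate values for the constants, and we estimate the size of the black hole made by a photon of wavelength λ as the Schwarzschild radius 2GE/c^4 with E = hc/λ. This estimate ignores factors of order one and is only meant as an illustration; the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s (assumed standard value)
G = 6.674e-11            # gravitational constant in m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light in m/s


def planck_length():
    """Planck length sqrt(hbar * G / c^3) in meters."""
    return math.sqrt(HBAR * G / C**3)


def photon_horizon_radius(wavelength):
    """Rough horizon size 2*G*E/c^4 for a photon of energy E = h*c/wavelength."""
    energy = 2.0 * math.pi * HBAR * C / wavelength
    return 2.0 * G * energy / C**4


l_p = planck_length()
print(f"Planck length = {l_p:.4e} m")

# Compare an atomic-scale probe with a Planck-scale probe.
for wavelength in (1e-10, l_p):
    r = photon_horizon_radius(wavelength)
    print(f"wavelength = {wavelength:.3e} m, horizon radius = {r:.3e} m")
```

For an atomic-scale wavelength the horizon radius is negligible compared with the wavelength, so the measurement is possible. For a wavelength of order l_P the horizon radius exceeds the wavelength, which is the situation described in Fig. 12.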
Second, Maxwell’s theory of light and Einstein’s theory of gravity predict light
waves and gravitational waves. But the theories fail to tell us what is waving.
Maxwell’s theory and Einstein’s theory are built on top of geometry. They fail to
answer the question of what the origin of the apparent geometry that we see is.
In other words, the Maxwell theory and Einstein theory are incomplete, and they
should be regarded as effective theories at long distances.
Since geometry does not exist in our world, we say the geometric view of the world
is challenged by quantum theory. The quantum theory tells us such a point of view
is wrong at length scales of order Planck length. So, the quantum theory represents
the most dramatic revolution in physics.

2. It From Qubit, Not Bit: A Second Quantum Revolution


After realizing that even the notion of existence is changed by quantum theory, it is
no longer surprising to see that quantum theory also blurs the distinction between

information and matter. In fact, it implies that information is matter, and matter is
information. This is because the frequency is an attribute of information. Quantum
theory tells us that frequency is energy E = hf , and relativity tells us that energy
is mass m = E/c2 . Both energy and mass are attributes of matter. So matter =
information. This represents a new way to view our world.
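
As a small worked example of this chain of relations, the sketch below converts a frequency into an energy and then into a mass. The wavelength of 550 nm (green light) is an assumed example value, and the function names are ours.

```python
H = 6.62607004e-34   # Planck constant in J s (value quoted in Fig. 11)
C = 299_792_458.0    # speed of light in m/s


def energy_from_frequency(frequency_hz):
    """Quantum theory: E = h f."""
    return H * frequency_hz


def mass_from_energy(energy_joule):
    """Relativity: m = E / c^2."""
    return energy_joule / C**2


wavelength = 550e-9            # green light in meters (assumed example value)
frequency = C / wavelength     # frequency of the light wave in Hz
energy = energy_from_frequency(frequency)
mass = mass_from_energy(energy)

print(f"f = {frequency:.3e} Hz, E = {energy:.3e} J, m = {mass:.3e} kg")
```

A single number, the frequency, fixes both the energy and the equivalent mass.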

The essence of quantum theory


The energy–frequency relation E = hf implies
that matter = information.

The above point of view of “matter = information” is similar to Wheeler’s “it
from bit,” which represents a deep desire to unify matter and information. In fact,
such a unification has happened before at a small scale. We introduced electric and
magnetic field to informationally (or pictorially) describe electric and magnetic
interaction. But later, electric/magnetic field became real matter with energy and
momentum, and even has a particle associated with it.
However, in our world, “it” are very complicated. (1) Most “it” are fermions,
while “bit” are bosonic. Can fermionic “it” come from bosonic “bit”? (2) Most “it”
also carry spin-1/2. Can spin-1/2 arise from “bit”? (3) All “it” interact via a special
kind of interaction — gauge interaction. Can “bit” produce gauge interaction?
Can “bit” produce waves that satisfy the Maxwell equation? Can “bit” produce a
photon?
In other words, to understand the concrete meaning of “matter from informa-
tion” or “it from bit,” we note that matter is described by Maxwell’s equation
(photons), Yang–Mills equation (gluons and W/Z bosons), as well as Dirac and
Weyl equations (electrons, quarks, neutrinos). The statement “matter = informa-
tion” means that those wave equations can all come from qubits. In other words,
we know that elementary particles (i.e. matter) are described by gauge fields and
anti-commuting fields in a quantum field theory. Here we try to say that all those
very different quantum fields can arise from qubits. Is this possible?
All the waves and fields mentioned above are waves and fields in space. The
discovery of the gravitational wave strongly suggested that space is a deformable
dynamical medium. In fact, the discovery of the electromagnetic wave and the
Casimir effect already strongly suggested that space is a deformable dynamical
medium. As a dynamical medium, it is not surprising that the deformation of space
gives rise to various waves. But the dynamical medium that describes our space
must be very special, since it should give rise to waves satisfying the Einstein equa-
tion (gravitational wave), Maxwell equation (electromagnetic wave), Dirac equation
(electron wave), etc. But what is the microscopic structure of space? What kind
of microscopic structure can, at the same time, give rise to waves that satisfy the
Maxwell equation, Dirac/Weyl equation, and Einstein equation?

Let us view the above questions from another angle. Modern science has made
many discoveries and has also unified many seemingly unrelated discoveries into
a few simple structures. Those simple structures are so beautiful and we regard
them as wonders of our universe. They are also very mysterious since we do not
understand where they come from and why they have to be the way they are. At
the moment, the most fundamental mysteries and/or wonders in our universe can
be summarized by the following short list:

Eight wonders:

(1) Locality.
(2) Identical particles.
(3) Gauge interactions.13–15
(4) Fermi statistics.16,17
(5) Tiny masses of fermions (∼ $10^{-20}$ of the Planck mass).2,18,19
(6) Chiral fermions.6,7,20,21
(7) Lorentz invariance.22
(8) Gravity.9

In the current physical theory of nature (such as the standard model), we take
the above properties for granted and do not ask where they come from. We put
those wonderful properties into our theory by hand, for example, by introducing
one field for each kind of interaction or elementary particle.
However, here we would like to question where those wonderful and mysterious
properties come from. Following the trend of science history, we wish to have a single
unified understanding of all the above mysteries. Or more precisely, we wish that
we can start from a single structure to obtain all the above wonderful properties.
The simplest element in quantum theory is the qubit, with states |0⟩ and |1⟩ (or |↓⟩ and |↑⟩).
Qubit is also the simplest element in quantum information. Since our space is
a dynamical medium, the simplest choice is to assume space to be an ocean of
qubits. We will give such an ocean a formal name, “qubit ether.” Then the matter,
i.e. the elementary particles, are simply the waves, “bubbles” and other defects in
the qubit ocean (or qubit ether). This is how we get “it from qubit” or “matter =
information.”
Qubit, having only two states |↓⟩ and |↑⟩, is very simple. We may view the
many-qubit state with all qubits in |↓⟩ as the quantum state that corresponds to
the empty space (the vacuum). Then the many-qubit state with a few qubits in
|↑⟩ corresponds to a space with a few spin-0 particles described by a scalar field.
Thus, it is easy to see that a scalar field can emerge from qubit ether as a density
wave of up-qubits. Such a wave satisfies the Euler equation, but not the Maxwell
equation or Yang–Mills equation. So the above particular qubit ether is not the one
that corresponds to our space. It has the wrong microscopic structure and cannot
carry waves satisfying the Maxwell equation and Yang–Mills equation. But this

line of thinking may be correct. We just need to find a qubit ether with a different
microscopic structure.
However, for a long time, we did not know how waves satisfying the Maxwell
equation or Yang–Mills equation can emerge from any qubit ether. The anticom-
muting wave that satisfies the Dirac/Weyl equation seems even more impossible.
So, even though quantum theory strongly suggests “matter = information,” trying
to obtain all elementary particles from an ocean of simple qubits is regarded as
impossible by many and has never become an active research effort.
So, the key to understand “matter = information” is to identify the microscopic
structure of the qubit ether (which can be viewed as space). The microscopic struc-
ture of our space must be very rich, since our space can not only carry gravitational
wave and electromagnetic wave, it can also carry electron wave, quark wave, gluon
wave, and the waves that correspond to all elementary particles. Is such a qubit
ether possible?
In condensed matter physics, the discovery of fractional quantum Hall states23
brings us into a new world of highly entangled many-body systems. When the strong
entanglement becomes long range entanglement,24 the systems will possess a new
kind of order, topological order,25,26 and represent new states of matter. We find
that the waves (the excitations) in topologically ordered qubit states can be very
strange: they can be waves that satisfy the Maxwell equation, Yang–Mills equation,
or Dirac/Weyl equation. So the impossible becomes possible: all elementary particles
(the bosonic force particles and fermionic matter particles) can emerge from long
range entangled qubit ether and be unified by quantum information.2–8,27,28
We would like to stress that the above picture is “it from qubit,” which is
very different from Wheeler’s “it from bit.” As we have explained, our observed
elementary particles can only emerge from long range entangled qubit ether. The
requirement of quantum entanglement implies that “it” cannot come from “bit.” In fact, it is “it
from entangled qubits.”

3. A String-Net Liquid of Qubits and A Unification of Gauge
Interactions and Fermi Statistics
In this section, we will consider a particular entangled qubit ocean — a string liquid
of qubits. Such an entangled qubit ocean supports new kinds of waves and their
corresponding particles. We find that the new waves and the emergent statistics
are so profound that they may change our view of the universe. Let us start by
explaining a basic notion — the “principle of emergence.”

3.1. Principle of emergence


Typically, one thinks the properties of a material should be determined by the
components that form the material. However, this simple intuition is incorrect, since
all materials are made of the same components: electrons, protons and neutrons.
So, we cannot use the richness of the components to understand the richness of

Fig. 13. Liquids only have a compression wave — a wave of density fluctuations.

the materials. In fact, the various properties of different materials originate from
various ways in which the particles are organized. Different orders (the organizations
of particles) give rise to different physical properties of a material. It is the richness
of the orders that gives rise to the richness of the material world.
Let us use the origin of mechanical properties and the origin of waves to explain,
in a more concrete way, how orders determine the physical properties of a material.
We know that a deformation in a material can propagate just like a ripple on the
surface of water. The propagating deformation corresponds to a wave traveling
through the material. Since liquids can resist only compression deformation, liquids
can only support a single kind of wave — the compression wave (see Fig. 13).
(Compression wave is also called longitudinal wave.) Mathematically the motion of
the compression wave is governed by the Euler equation
$$\frac{\partial^2 \rho}{\partial t^2} - v^2 \frac{\partial^2 \rho}{\partial x^2} = 0, \qquad (2)$$
where ρ is the density of the liquid.
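
A simple way to see what Eq. (2) describes is to check that any density profile moving rigidly at speed v, ρ(x, t) = f(x − vt), satisfies it. The sketch below does this numerically with centered finite differences. The Gaussian shape of the pulse, the speed v = 2, and the step size are assumptions chosen only for illustration.

```python
import math

V = 2.0  # wave speed (assumed example value, arbitrary units)


def rho(x, t):
    """A Gaussian density pulse traveling to the right at speed V."""
    return math.exp(-((x - V * t) ** 2))


def wave_residual(x, t, h=1e-3):
    """Left-hand side of Eq. (2), evaluated with centered second differences."""
    d2_dt2 = (rho(x, t + h) - 2.0 * rho(x, t) + rho(x, t - h)) / h**2
    d2_dx2 = (rho(x + h, t) - 2.0 * rho(x, t) + rho(x - h, t)) / h**2
    return d2_dt2 - V**2 * d2_dx2


# Sample a few space-time points; the residual should vanish up to discretization error.
sample_points = [(0.3, 0.1), (1.0, 0.5), (-0.7, 0.2)]
residuals = [wave_residual(x, t) for x, t in sample_points]
for (x, t), r in zip(sample_points, residuals):
    print(f"x = {x:+.1f}, t = {t:.1f}: residual = {r:.2e}")
```

The residuals are zero up to the small error of the finite-difference approximation, so the pulse propagates without changing shape, as expected for a compression wave.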
A solid can resist both compression and shear deformations. As a result, solids
can support both a compression wave and transverse wave. The transverse wave
corresponds to the propagation of shear deformations. In fact, there are two trans-
verse waves corresponding to two directions of shear deformations. The propagation
of the compression wave and the two transverse waves in solids are described by
the elasticity equation
$$\frac{\partial^2 u^i}{\partial t^2} - T^{ikl}_{j} \frac{\partial^2 u^j}{\partial x^k \partial x^l} = 0 \qquad (3)$$
where the vector field $u^i(x, t)$ describes the local displacement of the solid.
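
The counting of one compression wave and two transverse waves can be made explicit in a special case. For a plane wave with wave vector k, Eq. (3) reduces to an eigenvalue problem for the 3×3 matrix $T^{ikl}_{j} k_k k_l$. The sketch below assumes an isotropic solid, for which this matrix is $c_t^2 k^2 \delta^i_j + (c_l^2 - c_t^2) k^i k_j$. The isotropic form and the values of the speeds c_l and c_t are our assumptions, not taken from the text.

```python
import math


def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def dynamical_matrix(k, c_l, c_t):
    """The matrix T^{ikl}_j k_k k_l for an isotropic solid (assumed form)."""
    k2 = dot(k, k)
    return [
        [c_t**2 * k2 * (1.0 if i == j else 0.0) + (c_l**2 - c_t**2) * k[i] * k[j]
         for j in range(3)]
        for i in range(3)
    ]


def mode_speed(k, polarization, c_l, c_t):
    """Phase speed omega/|k| of a plane wave with the given polarization.

    The polarization is assumed to be an eigenvector of the dynamical matrix,
    so omega^2 is read off from the Rayleigh quotient.
    """
    matrix = dynamical_matrix(k, c_l, c_t)
    image = [dot(row, polarization) for row in matrix]
    omega_squared = dot(polarization, image) / dot(polarization, polarization)
    return math.sqrt(omega_squared / dot(k, k))


k = (0.0, 0.0, 2.0)   # wave traveling along z (assumed example)
C_L, C_T = 5.0, 3.0   # compression and shear wave speeds (assumed example values)

speeds = {
    "compression (displacement along z)": mode_speed(k, (0.0, 0.0, 1.0), C_L, C_T),
    "transverse (displacement along x)": mode_speed(k, (1.0, 0.0, 0.0), C_L, C_T),
    "transverse (displacement along y)": mode_speed(k, (0.0, 1.0, 0.0), C_L, C_T),
}
for label, speed in speeds.items():
    print(f"{label}: speed = {speed:.3f}")
```

The displacement parallel to k travels at c_l, and the two displacements perpendicular to k travel at c_t. Setting c_t = 0 removes the restoring force against shear and leaves only the compression wave, as in the liquid described by Eq. (2).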
We would like to point out that the elasticity equation and the Euler equa-
tion not only describe the propagation of waves, they actually describe all small
deformations in solids and liquids. Thus, the two equations represent a complete
mathematical description of the mechanical properties of solids and liquids.
But why do solids and liquids behave so differently? What makes a solid have a
shape and a liquid have no shape? What are the origins of the elasticity equation
and Euler equations? The answers to those questions must wait until the discovery

Fig. 14. Drawing a grid on a solid helps us to see the deformation of the solid. The vector ui
in (3) is the displacement of a vertex in the grid. In addition to the compression wave (i.e. the
density wave), a solid also supports transverse waves (waves of shear deformation) as shown in
the above figure.


Fig. 15. (a) Particles in liquids do not have fixed relative positions. They fluctuate freely and
have a random but uniform distribution. (b) Particles in solids form a fixed regular lattice.

of atoms in the 19th century. Since then, we have realized that both solids and liquids
are formed by collections of atoms. The main difference between solids and liquids
is that the atoms are organized very differently. In liquids, the positions of atoms
fluctuate randomly [see Fig. 15(a)], while in solids, atoms organize into a regular
fixed array [see Fig. 15(b)].c It is the different organizations of atoms that lead to
the different mechanical properties of liquids and solids. In other words, it is the
different organizations of atoms that make liquids able to flow freely and solids able
to retain their shapes.
How can different organizations of atoms affect mechanical properties of mate-
rials? In solids, both the compression deformation [see Fig. 16(a)] and the shear de-
formation [see Fig. 16(b)] lead to real physical changes of the atomic configurations.
Such changes cost energy. As a result, solids can resist both kinds of deformation
and can retain their shapes. This is why we have both the compression wave and
the transverse wave in solids.
In contrast, a shear deformation of atoms in liquids does not result in a new
configuration since the atoms still have uniformly random positions. So, the shear
deformation is a do-nothing operation for liquids. Only the compression deformation

c The solids here should be more accurately referred to as crystals.



(a) (b)
Fig. 16. The atomic picture of (a) the compression wave and (b) the transverse wave in a crystal.

Fig. 17. The atomic picture of the compression wave in liquids.

which changes the density of the atoms results in a new atomic configuration and
costs energy. As a result, liquids can only resist compression and have only the
compression wave. Since shear deformations do not cost any energy for liquids,
liquids can flow freely.
We see that the properties of the propagating wave are entirely determined by
how the atoms are organized in materials. Different organizations lead to different
kinds of waves and different kinds of mechanical laws. The view that different
kinds of waves and laws originate from different organizations of particles is a
central theme in condensed matter physics. It is called the principle of
emergence.

3.2. String-net liquid of qubits unifies light and electrons


The elasticity equation and the Euler equation are two very important equations.
They lay the foundation of many branches of science such as mechanical engineer-
ing, aerodynamic engineering, etc. But, we have a more important equation, the
Maxwell equation, that describes light waves in vacuum. When the Maxwell equa-
tion was first introduced, people firmly believed that any wave must correspond
to the motion of something. So people wanted to know: what is the origin of the
Maxwell equation? What is the motion that gives rise to the electromagnetic wave?
First, one may wonder: can the Maxwell equation come from a certain symmetry
breaking order? Based on Landau symmetry-breaking theory, the different symme-
try breaking orders can indeed lead to different waves satisfying different wave
equations. So, maybe a certain symmetry breaking order can give rise to a wave
that satisfies the Maxwell equation. But people have been searching for ether —
a medium that supports the light waves — for over 100 years and have not been
able to find any symmetry breaking states that can give rise to waves satisfying the
Maxwell equation. This is one of the reasons why people have given up the idea of
ether as the origin of light and Maxwell equation.
However, the discovery of topological order25,26 suggests that Landau symmetry-
breaking theory does not describe all possible organizations of bosons/spins. This
gives us new hope: the Maxwell equation may arise from a new kind of organization
of bosons/spins that has non-trivial topological order.
In addition to the Maxwell equation, there is an even stranger equation, the
Dirac equation, that describes the wave of electrons (and other fermions). Elec-
trons have Fermi statistics. They are fundamentally different from the quanta of
other familiar waves, such as photons and phonons, since those quanta all have Bose
statistics. To describe the electron wave, the amplitude of the wave must be anti-
commuting Grassmann numbers, so that the wave quanta will have Fermi statistics.
Since electrons are so strange, few people regard electrons and electron waves as
collective motions of something. People accept without questioning that electrons
are fundamental particles, one of the building blocks of all that exist.
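
The anticommuting property mentioned above can be made concrete with a small numerical sketch. Grassmann generators satisfy θ1θ2 = −θ2θ1, which forces θ² = 0. The 4×4 matrix realization below (built from the 2×2 matrices `sigma_plus` and `sigma_z`) is only an illustrative choice made here, not a construction taken from the text.

```python
import numpy as np

sigma_plus = np.array([[0.0, 1.0], [0.0, 0.0]])   # nilpotent: sigma_plus @ sigma_plus = 0
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])     # anticommutes with sigma_plus
identity = np.eye(2)

# Two anticommuting, nilpotent generators realized as 4x4 matrices
# (an assumed, illustrative representation).
theta1 = np.kron(sigma_plus, identity)
theta2 = np.kron(sigma_z, sigma_plus)

def anticommutator(a, b):
    """Return a b + b a."""
    return a @ b + b @ a

# Each generator squares to zero, and the two generators anticommute.
print(np.allclose(theta1 @ theta1, 0.0))
print(np.allclose(theta2 @ theta2, 0.0))
print(np.allclose(anticommutator(theta1, theta2), 0.0))
# The product itself is not zero, so the anticommutation is not trivial.
print(np.any(theta1 @ theta2 != 0.0))
```

Ordinary (commuting) wave amplitudes would give a nonzero anticommutator 2ab instead, which is why they can only describe quanta with Bose statistics.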
However, from a condensed matter physics point of view, all low energy excita-
tions are collective motions of something. If we try to regard photons as collective
modes, why can’t we regard electrons as collective modes as well? So maybe, the
Dirac equation and associated fermions can also arise from a new kind of organiza-
tion of bosons/spins that have non-trivial topological orders.
A recent study provides a positive answer to the above questions.3,29,30 We find
that if bosons/spins form large oriented strings and if those strings form a quantum
liquid state, then the collective motion of bosons/spins organized in this way will
correspond to waves described by the Maxwell equation and the Dirac equation. The
strings in the string liquid are free to join and cross each other. As a result, the
strings look more like a network (see Fig. 18). For this reason, the string liquid is
actually a liquid of string-nets, which is called a string-net condensed state.
But why does the waving of strings produce waves described by the Maxwell
equation? We know that the particles in a liquid have a random but uniform distri-

Fig. 18. A quantum ether: The fluctuations of oriented strings give rise to electromagnetic waves
(or light). The ends of strings give rise to electrons. Note that oriented strings have directions
which should be described by curves with arrows. For ease of drawing, the arrows on the curves
are omitted in the above plot.

Fig. 19. The fluctuating strings in a string liquid.

Fig. 20. A “density” wave of oriented strings in a string liquid. The wave propagates in x-
direction. The “density” vector E points in y-direction. For ease of drawing, the arrows on the
oriented strings are omitted in the above plot.

bution. A deformation of such a distribution corresponds to a density fluctuation,
which can be described by a scalar field ρ(x, t). Thus the waves in a liquid are
described by the scalar field ρ(x, t), which satisfies the Euler equation (2). Similarly,
the strings in a string-net liquid also have a random but uniform distribution (see
Fig. 19). A deformation of string-net liquid corresponds to a change of the density
of the strings (see Fig. 20). However, since strings have an orientation, the “den-
sity” fluctuations are described by a vector field E(x, t), which indicates there are
more strings in the E direction on average. The oriented strings can be regarded as
flux lines. The vector field E(x, t) describes the smeared average flux. Since strings
are continuous (i.e. they cannot end), the flux is conserved: ∂ · E(x, t) = 0. The
vector density E(x, t) of strings cannot change in the direction along the strings
(i.e. along the E(x, t) direction). E(x, t) can change only in the direction perpendic-
ular to E(x, t). Since the direction of the propagation is the same as the direction
in which E(x, t) varies, the waves described by E(x, t) must be transverse
waves: E(x, t) is always perpendicular to the direction of the propagation. There-
fore, the waves in the string liquid have a very special property: the waves have
only transverse modes and no longitudinal mode. This is exactly the property of the
light waves described by the Maxwell equation. We see that “density” fluctuations
of strings (which are described by a transverse vector field) naturally give rise to
the light (or electromagnetic) waves and the Maxwell equation.2,3,30–33
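
The transversality argument above can be checked with a short numerical sketch. For a plane wave E(x, t) = E0 cos(k · x − ωt), the constraint ∂ · E = 0 reduces to k · E0 = 0, so only the part of E0 perpendicular to k survives. The helper names `transverse_projector` and `count_transverse_modes` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def transverse_projector(k):
    """Return P = 1 - k k^T / |k|^2, which removes the component along k."""
    k = np.asarray(k, dtype=float)
    return np.eye(len(k)) - np.outer(k, k) / np.dot(k, k)

def count_transverse_modes(k):
    """Number of independent amplitudes E0 allowed by k . E0 = 0."""
    return int(np.linalg.matrix_rank(transverse_projector(k)))

k = np.array([1.0, 0.0, 0.0])       # wave propagating in the x-direction
E0 = np.array([0.3, 1.0, -0.5])     # arbitrary trial amplitude
E_allowed = transverse_projector(k) @ E0

print(count_transverse_modes(k))    # independent polarizations left by the constraint
print(np.dot(k, E_allowed))         # the allowed amplitude is perpendicular to k
```

In three dimensions the projector has rank two, which matches the two transverse modes and the absence of a longitudinal mode described above.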

It is interesting to compare solid, liquid, and string-net liquid. We know that
the particles in a solid are organized into a regular lattice pattern. The waving of
such organized particles produces a compression wave and two transverse waves.
The particles in a liquid have a more random organization. As a result, the waves in
liquids lose two transverse modes and contain only a single compression mode. The
particles in a string-net liquid also have a random organization, but in a different
way. The particles first form string-nets and string-nets then form a random liquid
state. Due to this different kind of randomness, the waves in the string-net condensed
state lose the compression mode and contain two transverse modes. Such a wave
(having only two transverse modes) is just the electromagnetic wave.
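
The mode counting for solids and liquids can be sketched numerically. The example assumes an isotropic medium with Lamé parameters λ and μ (μ is the shear modulus), for which plane waves obey ρω²u = [μ|k|² + (λ + μ) k kᵀ] u. This isotropic form and the parameter values are simplifying assumptions made here, not the general elasticity equation of the text.

```python
import numpy as np

def mode_speeds_squared(k_hat, lam, mu, rho=1.0):
    """Squared wave speeds of the three polarizations (illustrative isotropic model)."""
    k_hat = np.asarray(k_hat, dtype=float)
    k_hat = k_hat / np.linalg.norm(k_hat)
    # Matrix acting on the displacement amplitude u for a unit wave vector.
    christoffel = mu * np.eye(3) + (lam + mu) * np.outer(k_hat, k_hat)
    return np.sort(np.linalg.eigvalsh(christoffel)) / rho

def count_propagating_modes(k_hat, lam, mu, tol=1e-12):
    """Count polarizations with a nonzero restoring force."""
    return int(np.sum(mode_speeds_squared(k_hat, lam, mu) > tol))

k_hat = [0.0, 0.0, 1.0]
solid = mode_speeds_squared(k_hat, lam=2.0, mu=1.0)    # shear deformation costs energy
liquid = mode_speeds_squared(k_hat, lam=2.0, mu=0.0)   # shear deformation costs nothing

print(solid, count_propagating_modes(k_hat, 2.0, 1.0))
print(liquid, count_propagating_modes(k_hat, 2.0, 0.0))
```

With a nonzero shear modulus there are two degenerate transverse modes and one compression mode. Setting the shear modulus to zero leaves only the compression mode, as described for liquids.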
To understand how electrons appear from string-nets, we would like to point
out that if we only want photons and no other particles, the strings must be closed
strings with no ends. The fluctuations of closed strings produce only photons. If
strings have open ends, those open ends can move around and just behave like
independent particles. Those particles are not photons. In fact, the ends of strings
are nothing but electrons.
How do we know that ends of strings behave like electrons? First, since the
waving of string-nets is an electromagnetic wave, a deformation of string-nets cor-
responds to an electromagnetic field. So, we can study how an end of a string
interacts with a deformation of string-nets. We find that such an interaction is just
like the interaction between a charged electron and an electromagnetic field. Also,
electrons have a subtle but very important property — Fermi statistics, which is
a property that exists only in quantum theory. Amazingly, the ends of strings can
reproduce this subtle quantum property of Fermi statistics.29,34 Actually, string-net
liquids explain why Fermi statistics should exist.
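
The statement that closed strings carry conserved flux while string ends act like charges can be illustrated on a lattice. In this minimal sketch, an oriented string is a list of vertices, each consecutive pair is one link carrying one unit of flux, and the lattice divergence at a vertex is the flux leaving minus the flux entering. The function `vertex_divergence` and the example paths are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def vertex_divergence(path):
    """Net outgoing flux at each vertex of an oriented string.

    `path` is a list of lattice vertices visited in order; each consecutive
    pair is one oriented link carrying one unit of flux. Only vertices with
    nonzero divergence are returned.
    """
    div = defaultdict(int)
    for tail, head in zip(path, path[1:]):
        div[tail] += 1   # flux leaves the tail of the link
        div[head] -= 1   # flux enters the head of the link
    return {v: d for v, d in div.items() if d != 0}

closed_loop = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
open_string = [(0, 0), (1, 0), (2, 0), (2, 1)]

print(vertex_divergence(closed_loop))   # {}
print(vertex_divergence(open_string))   # {(0, 0): 1, (2, 1): -1}
```

The closed loop has zero divergence everywhere, the lattice analogue of ∂ · E = 0. The open string has divergence +1 and −1 at its two ends, so the ends look like a pair of opposite point charges attached to the flux line.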
We see that qubits that organize into string-net liquid naturally explain both
light and electrons (gauge interactions and Fermi statistics). In other words, string-
net theory provides a way to unify light and electrons.3,30 So, the fact that our
vacuum contains both light and electrons may not be a mere accident. It may
actually suggest that the vacuum is indeed a string-net liquid.

3.3. More general string-net liquid and emergence of Yang–Mills gauge theory

Here, we would like to point out that there are many different kinds of string-net
liquids. The strings in different liquids may have different numbers of types. The
strings may also join in different ways. For a general string-net liquid, the waving
of the strings may not correspond to light and the ends of strings may not be
electrons. Only one kind of string-net liquid gives rise to light and electrons. On
the other hand, the fact that there are many kinds of string-net liquids allows us
to explain more than just light and electrons. We can design a particular type of
string-net liquid which not only gives rise to electrons and photons, but also gives
rise to quarks and gluons.2,29 The waving of this type of string-net corresponds
to photons (light) and gluons. The ends of different types of strings correspond
to electrons and quarks. It would be interesting to see if it is possible to design a
string-net liquid that produces all elementary particles! If this is possible, the ether
formed by such string-nets can provide an origin of all elementary particles.d
We would like to stress that the string-nets are formed by qubits. So, in the string-net
picture, both the Maxwell equation and the Dirac equation emerge from a local qubit
model, as long as the qubits form a long-range entangled state (i.e. a string-net liq-
uid). In other words, light and electrons are unified by the long-range entanglement
of qubits!
The electric field and the magnetic field in the Maxwell equation are called gauge
fields. The fields in the Dirac equation are Grassmann-number valued fields.e For a
long time, we thought that we had to use gauge fields to describe light waves that
have only two transverse modes, and we thought that we had to use Grassmann-
number valued fields to describe electrons and quarks that have Fermi statistics. So
gauge fields and Grassmann-number valued fields became the fundamental build-
ing blocks of the quantum field theory that describes our world. The string-net liquids
demonstrate that we do not have to introduce gauge fields and Grassmann-number val-
ued fields to describe photons, gluons, electrons, and quarks. They demonstrate how
gauge fields and Grassmann fields emerge from local qubit models that contain only
complex scalar fields at the cutoff scale.
Our attempt to understand light has a long and evolving history. We first
thought light to be a beam of particles. After Maxwell, we understood light as
electromagnetic waves. After Einstein’s theory of general relativity, where gravity
is viewed as curvature in space–time, Weyl and others tried to view the electro-
magnetic field as curvatures in the “unit system” that we used to measure complex
phases. This led to the notion of gauge theory. General relativity and gauge the-
ory are two cornerstones of modern physics. They provide a unified understanding
of all four interactions in terms of a beautiful mathematical framework: all inter-
actions can be understood geometrically as curvatures in space–time and in “unit
systems” (or more precisely, as curvatures in the tangent bundle and other vector
bundles in space–time).
Later, people in high-energy physics and in condensed matter physics found
another way in which the gauge field can emerge35–38 : one first cuts a particle (such
as an electron) into two partons by writing the field of the particle as the product of
the two fields of the two partons. Then one introduces a gauge field to glue the two
partons back to the original particle. Such a “glue-picture” of gauge fields (instead
of the fiber bundle picture of gauge fields) allows us to understand the emergence
of gauge fields in models that originally contain no gauge field at the cutoff scale.

d So far we can use string-nets to produce almost all elementary particles, except for the graviton
that is responsible for gravity. In particular, we can even produce the chiral coupling between the
SU(2) gauge boson and the fermions from the qubit ocean.6,7
e Grassmann numbers are anticommuting numbers.

A string picture represents the third way of understanding gauge theory. String
operators appear in the Wilson-loop characterization39 of gauge theory. The Hamil-
tonian and the duality description of lattice gauge theory also reveal string struc-
tures.40–43 Lattice gauge theories are not local bosonic models and the strings are
unbreakable in lattice gauge theories. String-net theory points out that even break-
able strings can give rise to gauge fields.44 So we do not really need strings. Qubits
themselves are capable of generating gauge fields and the associated Maxwell equa-
tion. This phenomenon was discovered in several qubit models1,27,33,37,45 before
realizing their connection to the string-net liquids.31 Since gauge fields can emerge
from local qubit models, the string picture evolves into the entanglement picture —
the fourth way to understand gauge fields: gauge fields are fluctuations of long-range
entanglement. I feel that the entanglement picture captures the essence of gauge
theory. Despite the beauty of the geometric picture, the essence of gauge theory
is not the curved fiber bundles. In fact, we can view gauge theory as a theory of
long-range entanglement, although gauge theory was discovered long before the
notion of long-range entanglement. The evolution of our understanding of light and
gauge interaction is: particle beam → wave → electromagnetic wave → curvature
in fiber bundle → glue of partons → wave in string-net liquid → wave in long-range
entanglement; this represents the 200-year effort of the human race to unveil the
mystery of the universe (see Fig. 21).

(a) (b) (c) (d)

(e) (f) (g)

Fig. 21. The evolution of our understanding of light (and gauge interaction): (a) particle beam,
(b) wave, (c) electromagnetic wave, (d) curvature in fiber bundle, (e) glue of partons, (f) wave in
string-net liquid and (g) wave in long-range entanglement of many qubits.

Viewing gauge fields (and the associated gauge bosons) as fluctuations of long-
range entanglement has an added bonus: we can understand the origin of Fermi
statistics in the same way: fermions emerge as defects of long-range entanglement,
even though the original model is purely bosonic. Previously, there were two ways
to obtain emergent fermions from purely bosonic models: by binding gauge charge
and gauge flux in (2 + 1)D,46,47 and by binding the charge and the monopole in a
U(1) gauge theory in (3 + 1)D.48–52 But those approaches only work in (2 + 1)D
or only for U(1) gauge fields. Using long-range entanglement and its string-net
realization, we can obtain the simultaneous emergence of both gauge bosons and
fermions in any dimension and for any gauge group.2,29,30,34 This result gives us
hope that maybe every elementary particle is emergent and can be unified using
local qubit models. Thus, long-range entanglement offers us a new option to view
our world: maybe our vacuum is a long-range entangled state. It is the pattern of
the long-range entanglement in the vacuum that determines the content and the
structures of observed elementary particles. Such a picture has an experimental
prediction that will be described in Subsec. 3.4.
We would like to point out that the string-net unification of gauge bosons and fermions
is very different from the superstring theory for gauge bosons and fermions. In the
string-net theory, gauge bosons and fermions come from the qubits that form the
space, and “string-net” is simply the name that describes how qubits are organized
in the ground state. So, a string-net is not a thing, but a pattern of qubits. In the
string-net theory, the gauge bosons are waves of collective fluctuations of the string-
nets, and a fermion corresponds to one end of a string. In contrast, gauge bosons
and fermions come from strings in the superstring theory. Both gauge bosons and
fermions correspond to small pieces of strings. Different vibrations of the small
pieces of strings give rise to different kinds of particles. The fermions in superstring
theory are put in by hand through the introduction of Grassmann fields.

3.4. A falsifiable prediction of string-net unification of gauge interactions and Fermi statistics

In the string-net unification of light and electrons,3,30 we assume that space is
formed by a collection of qubits and the qubits form a string-net condensed state.
Light waves are collective motions of the string-nets, and an electron corresponds
to one end of the string. Such a string-net unification of light and electrons has a
falsifiable prediction: all fermionic excitations must carry some gauge charges.29,34
The U (1) × SU (2) × SU (3) standard model for elementary particles contains
fermionic excitations (such as neutrons and neutrinos) that do not carry any U (1)×
SU (2) × SU (3) gauge charge. So according to the string-net theory, the U (1) ×
SU (2) × SU (3) standard model is incomplete. According to the string-net theory,
our universe not only has U (1) × SU (2) × SU (3) gauge theory, it must also contain
other gauge theories. Those additional gauge theories may have a gauge group of
Z2 or other discrete groups. Those extra discrete gauge theories will lead to new
cosmic strings which appear in the very early universe.

4. A New Chapter in Physics


Our world is rich and complex. When we discover the inner workings of our world
and try to describe it, we often find that we need to invent new mathematical
language to describe our understanding and insight. For example, when Newton
discovered his law of mechanics, the proper mathematical language was not invented
yet. Newton (and Leibniz) had to develop calculus in order to formulate the law of
mechanics. For a long time, we tried to use the theory of mechanics and calculus
to understand everything in our world.
As another example, when Einstein discovered the general equivalence principle
to describe gravity, he needed a mathematical language to describe his theory.
In this case, the needed mathematics, Riemannian geometry, had been developed,
which led to the theory of general relativity. Following the idea of general relativity,
we developed the gauge theory. Both general relativity and gauge theory can be
described by the mathematics of fiber bundles. Those advances led to a beautiful
geometric understanding of our world based on quantum field theory, and we tried
to understand everything in our world in terms of quantum field theory.
Now, I feel that we are at another turning point. In a study of quantum matter,
we find that long-range entanglement can give rise to many new quantum phases.
So long-range entanglement is a natural phenomenon that can happen in our world.
These discoveries greatly expand our understanding of possible quantum phases and bring the
research of quantum matter to a whole new level. To gain a systematic understand-
ing of new quantum phases and long-range entanglement, we would like to know,
what mathematical language should we use to describe long-range entanglement?
The answer is not totally clear. But early studies suggest that tensor category29,53–59
and group cohomology60,61 should be a part of the mathematical framework that
describes long-range entanglement. Further progress in this direction will lead to a
comprehensive understanding of long-range entanglement and topological quantum
matter.
However, what is really exciting in the study of quantum matter is that it might
lead to a whole new point of view of our world. This is because long-range entangle-
ment can give rise to both gauge interactions and Fermi statistics. In contrast, the
geometric point of view can only lead to gauge interactions. So maybe we should
not use geometric pictures, based on fields and fiber bundles, to understand our
world. Maybe we should use entanglement pictures to understand our world. This
way, we can get both gauge interactions and fermions from a single origin — qubits.
We may live in a truly quantum world. So, quantum entanglement represents a new
chapter in physics.

References
1. D. Foerster, H. B. Nielsen and M. Ninomiya, Phys. Lett. B 94, 135 (1980).
2. X.-G. Wen, Phys. Rev. D 68, 065003 (2003), arXiv:hep-th/0302201.
3. M. Levin and X.-G. Wen, Phys. Rev. B 73, 035122 (2006), arXiv:hep-th/0507118.
4. Z.-C. Gu and X.-G. Wen, Nucl. Phys. B 863, 90 (2012), arXiv:0907.1203.
5. X.-G. Wen, ISRN Condensed Matter Physics 2013, 198710 (2013), arXiv:1210.1281.
6. X.-G. Wen, Chin. Phys. Lett. 30, 111101 (2013), arXiv:1305.1045.
7. Y.-Z. You and C. Xu, Phys. Rev. B 91, 125147 (2015), arXiv:1412.4784.
8. B. Zeng, X. Chen, D.-L. Zhou and X.-G. Wen, (2015), arXiv:1508.02595.
9. A. Einstein, Annalen der Physik 49, 769 (1916).
10. G. Nordström, Physik. Zeitschr. 15, 504 (1914).
11. T. Kaluza, Sitzungsber. Preuss. Akad. Wiss. Berlin. (Math. Phys.), 966 (1921).
12. O. Klein, Z. Phys. 37, 895 (1926).
13. H. Weyl, Space, Time, Matter (Dover, 1952).
14. W. Pauli, Rev. Mod. Phys. 13, 203 (1941).
15. C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
16. E. Fermi, Z. Phys. 36, 902 (1926).
17. P. A. M. Dirac, Proc. Roy. Soc. A 112, 661 (1926).
18. D. J. Gross and F. Wilczek, Phys. Rev. Lett. 30, 1343 (1973).
19. H. D. Politzer, Phys. Rev. Lett. 30, 1346 (1973).
20. T. D. Lee and C. N. Yang, Phys. Rev. 104, 254 (1956).
21. C. S. Wu et al., Phys. Rev. 105, 1413 (1957).
22. A. Einstein, Annalen der Physik 17, 891 (1905).
23. D. C. Tsui, H. L. Stormer and A. C. Gossard, Phys. Rev. Lett. 48, 1559 (1982).
24. X. Chen, Z.-C. Gu and X.-G. Wen, Phys. Rev. B 82, 155138 (2010), arXiv:1004.3835.
25. X.-G. Wen, Phys. Rev. B 40, 7387 (1989).
26. X.-G. Wen, Int. J. Mod. Phys. B 4, 239 (1990).
27. X.-G. Wen, Phys. Rev. Lett. 88, 011602 (2002), arXiv:hep-th/0109120.
28. Z.-C. Gu and X.-G. Wen, A lattice bosonic model as a quantum theory of gravity,
gr-qc/0606100.
29. M. Levin and X.-G. Wen, Phys. Rev. B 71, 045110 (2005), cond-mat/0404617.
30. M. A. Levin and X.-G. Wen, Rev. Mod. Phys. 77, 871 (2005), cond-mat/0407140.
31. X.-G. Wen, Phys. Rev. B 68, 115413 (2003), cond-mat/0210040.
32. R. Moessner and S. L. Sondhi, Phys. Rev. B 68, 184512 (2003).
33. M. Hermele, M. P. A. Fisher and L. Balents, Phys. Rev. B 69, 064404 (2004).
34. M. Levin and X.-G. Wen, Phys. Rev. B 67, 245316 (2003), cond-mat/0302460.
35. A. D’Adda, P. D. Vecchia and M. Lüscher, Nucl. Phys. B 146, 63 (1978).
36. E. Witten, Nucl. Phys. B 149, 285 (1979).
37. G. Baskaran and P. W. Anderson, Phys. Rev. B 37, 580 (1988).
38. I. Affleck and J. B. Marston, Phys. Rev. B 37, 3774 (1988).
39. K. G. Wilson, Phys. Rev. D 10, 2445 (1974).
40. J. Kogut and L. Susskind, Phys. Rev. D 11, 395 (1975).
41. T. Banks, R. Myerson and J. B. Kogut, Nucl. Phys. B 129, 493 (1977).
42. J. B. Kogut, Rev. Mod. Phys. 51, 659 (1979).
43. R. Savit, Rev. Mod. Phys. 52, 453 (1980).
44. M. B. Hastings and X.-G. Wen, Phys. Rev. B 72, 045141 (2005), cond-mat/0503554.
45. O. I. Motrunich and T. Senthil, Phys. Rev. Lett. 89, 277004 (2002).
46. J. M. Leinaas and J. Myrheim, Il Nuovo Cimento 37B, 1 (1977).
47. F. Wilczek, Phys. Rev. Lett. 49, 957 (1982).
48. I. Tamm, Z. Phys. 71, 141 (1931).
49. R. Jackiw and C. Rebbi, Phys. Rev. Lett. 36, 1116 (1976).
50. F. Wilczek, Phys. Rev. Lett. 48, 1146 (1982).
51. A. S. Goldhaber, Phys. Rev. Lett. 49, 905 (1982).
52. K. Lechner and P. A. Marchetti, J. High Energy Phys. 2000, 12 (2000), arXiv:hep-th/0010291.
53. E. Keski-Vakkuri and X.-G. Wen, Int. J. Mod. Phys. B 7, 4227 (1993).
54. M. Freedman, C. Nayak, K. Shtengel, K. Walker and Z. Wang, Ann. Phys. (NY) 310,
428 (2004), cond-mat/0307511.
55. E. Rowell, R. Stong and Z. Wang, Comm. Math. Phys. 292, 343 (2009), arXiv:0712.1377.
56. X.-G. Wen, Natl. Sci. Rev. 3, 68 (2016), arXiv:1506.05768.
57. M. Barkeshli, P. Bonderson, M. Cheng and Z. Wang, Symmetry, defects, and gauging
of topological phases, arXiv:1410.4540.
58. T. Lan, L. Kong and X.-G. Wen, Phys. Rev. B 94, 155113 (2016), arXiv:1507.04673.
59. T. Lan, L. Kong and X.-G. Wen, (2016), arXiv:1602.05946.
60. X. Chen, Z.-C. Gu, Z.-X. Liu and X.-G. Wen, Phys. Rev. B 87, 155114 (2013),
arXiv:1106.4772.
61. X. Chen, Z.-C. Gu, Z.-X. Liu and X.-G. Wen, Science 338, 1604 (2012), arXiv:1301.0861.
October 31, 2018 14:54 taken from 146-MPLA ws-rv961x669 chap08-Reprint page 205


Phys. Status Solidi RRL 7, No. 1–2, 72– 81 (2013) / DOI 10.1002/pssr.201206414

pss
Part of Focus Issue on
Topological Insulators – From Materials Design to Reality
Eds.: Claudia Felser, Shoucheng Zhang, Binghai Yan

www.pss-rapid.com
Topological insulators from the perspective of first-principles calculations
Haijun Zhang and Shou-Cheng Zhang*

Department of Physics, McCullough Building, Stanford University, Stanford, California 94305-404531, USA

Received 28 September 2012, revised 22 November 2012, accepted 22 November 2012


Published online 30 November 2012

Keywords topological insulators, first-principles calculations, spin–orbit coupling, surface states

* Corresponding author: e-mail [email protected], Phone: +01-650-723-2894, Fax: +01-650-723-9389

Topological insulators are new quantum states with helical
gapless edge or surface states inside the bulk band gap. These
topological surface states are robust against weak time-
reversal invariant perturbations without closing the bulk band
gap, such as lattice distortions and non-magnetic impurities.
Recently a variety of topological insulators have been pre-
dicted by theories, and observed by experiments. First-prin-
ciples calculations have been widely used to predict topologi-
cal insulators with great success. In this review, we summa-
rize the current progress in this field from the perspective of
first-principles calculations. First of all, the basic concepts of
topological insulators and the frequently-used techniques
within first-principles calculations are briefly introduced.
Secondly, we summarize general methodologies to search for
new topological insulators. In the last part, based on the band
inversion picture first introduced in the context of HgTe, we
classify topological insulators into three types with s–p, p–p
and d–f, and discuss some representative examples for each
type.

Surface states of topological insulator Bi2Se3 consist of a single Dirac cone, as
obtained from first-principles calculations.

© 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

1 Introduction In two-dimensional electron systems at low temperature and strong
magnetic field, the Hall conductance σxy takes quantized values [1], called quantum
Hall (QH), which proved to have a fundamental topological meaning [2]. σxy can be
expressed as an integral of the first Chern number over the magnetic Brillouin zone.
Quantum anomalous Hall (QAH) is adiabatically equivalent to QH. Both QH and QAH
systems are called Chern insulators due to the non-zero Chern number where the
time-reversal symmetry (TRS) is broken. Recently quantum spin Hall (QSH) state was
predicted in CdTe/HgTe quantum well [3] and soon observed experimentally [4].
Earlier theoretical models [5–7] provided important conceptual framework. In this
system TRS is present, and spin–orbit coupling (SOC) effect plays the role of Lorentz
force in QH effect. The concept of QSH can be generalized to three-dimensional (3D)
topological insulators with TRS [8, 9]. The electromagnetic response of a topological
insulator is described by the topological θ term of
Sθ = (θ/2π)(α/2π) ∫ d³x dt E · B with θ = π, where E and B are the external
electromagnetic fields [9]. This indicates the physically measurable and topologically
non-trivial response, which opens the door for experiments and potential applications
of topological insulators.
Both 2D (QSH) and 3D topological insulators have interesting physical properties
[10–15]. In this review, we


Reprinted with permission from Physica Status Solidi RRL, Vol. 7, No. 1–2, pp. 72–81 (2013),
DOI: 10.1002/pssr.201206414 © 2013 by John Wiley and Sons.

focus on 3D topological insulators with TRS. In this field, an important task is to search systematically for all topological insulators, and first-principles calculations have played a crucial role in this search. Up to now, most topological insulators were first predicted by first-principles calculations and subsequently observed in experiments.

2 Theories and methods
2.1 First-principles methods Density functional theory (DFT) is a formally exact theory based on the two Hohenberg–Kohn (HK) theorems [16], but the functional of the exchange and correlation interaction that enters the Kohn–Sham (KS) equation [17] is unknown. In order to perform numerical calculations, the local-density approximation (LDA) [17] and the generalized gradient approximation (GGA) [18, 19] are usually used to approximate the exchange and correlation interaction in the KS equation. Experience so far shows that LDA and GGA work quite well for the study of topological insulators, because most topological insulators found to date are weakly correlated electronic systems, such as Bi2Se3 [20] and TlBiSe2 [21–24].

It is well known that conventional LDA and GGA first-principles calculations tend to underestimate the band gap [25, 26]. However, the band gap is directly related to the possibility of band inversion, which is the key topological property [3]. For example, LDA and GGA sometimes predict a negative band gap, whereas the band gap is positive in reality [27]. This can cause serious problems when predicting topological insulators, so it is necessary to improve the calculation of the energy gap. The most effective method for calculating the band gap is the GW approximation [28]. Simply put, the GW approximation considers the Hartree–Fock self-energy interaction with the screening effect. Although the GW method has been used to study topological insulators, for example Hg chalcogenides, half-Heusler compounds, antiperovskite nitrides, honeycomb-lattice chalcogenides, Bi2Se3 and Bi2Te3 [29–31], it is computationally very expensive. Besides the GW method, the modified Becke–Johnson exchange potential together with LDA (MBJLDA), proposed by Tran and Blaha in 2009 [32], costs about as much as LDA and GGA but yields band gaps with an accuracy similar to GW. The MBJLDA potential also recovers LDA for an electronic system with a constant charge density and mimics the behavior of orbital-dependent potentials. MBJLDA was successfully used to predict topological insulators with the chalcopyrite structure [33].

LDA + U [34], LDA + DMFT [35] and LDA + Gutzwiller [36] are employed to study strongly correlated electronic systems (d and f electrons), because LDA often fails for these systems. In strongly correlated electronic systems, the electrons are strongly localized and retain more features of atomic orbitals. This case requires a proper treatment of atomic configurations and orbital dependence. Neither LDA nor GGA includes the orbital dependence of the Coulomb and exchange interactions, which is why they fail to describe strongly correlated electronic systems. Based on this understanding, LDA + U, LDA + DMFT and LDA + Gutzwiller all include the orbital-dependent feature, each in a different way. For example, the on-site interaction is treated in a static Hartree–Fock mean-field manner in the LDA + U method, which is the simplest and cheapest of the three. It is often used for strongly correlated systems, but it does not work well for intermediately correlated metallic systems. The self-energy in the LDA + DMFT method is obtained in a self-consistent way. Up to now LDA + DMFT is the most accurate and reliable method, but its computational cost is high. LDA + Gutzwiller, based on the Gutzwiller variational approach, was developed recently. This method works well for intermediately correlated electronic systems, and it is cheaper than LDA + DMFT. How well these methods work for strongly correlated systems is still an open question. Nevertheless, they can reproduce some experimental results and help to understand some novel results in strongly correlated electronic systems, for example the LDA + DMFT study of the topological insulator PuTe [37].

Haijun Zhang is a post-doctoral researcher in the Department of Physics at Stanford University, CA, USA. He received his BS degree from the University of Science and Technology of China (USTC) in 2004, and his Ph.D. from the Institute of Physics (IOP), Chinese Academy of Sciences (CAS), China in 2009. His interest is to discover and understand novel phenomena in condensed matter physics with first-principles calculations. Recently his research has focused on new topological insulator materials. He received the Outstanding Science and Technology Team Achievement Award from the Qiu Shi Science & Technologies Foundation in 2011 for the discovery of three-dimensional topological insulators (Bi2Se3, Bi2Te3 and Sb2Te3).

Shou-Cheng Zhang is the J. G. Jackson and C. J. Wood professor of physics at Stanford University. He received his BS degree from the Free University of Berlin in 1983, and his Ph.D. from the State University of New York at Stony Brook in 1987. He was a postdoc fellow at the Institute for Theoretical Physics in Santa Barbara from 1987 to 1989 and a research staff member at the IBM Almaden Research Center from 1989 to 1993. He joined the faculty at Stanford in 1993. He is a condensed matter theorist known for his work on topological insulators, spintronics and high-temperature superconductivity. He is a fellow of the American Physical Society and a fellow of the American Academy of Arts and Sciences. He received the Guggenheim fellowship in 2007, the Alexander von Humboldt research prize in 2009, the Johannes Gutenberg research prize in 2010, the Europhysics prize in 2010, the Oliver Buckley prize in 2012 and the Dirac Medal and Prize in 2012 for his theoretical prediction of the quantum spin Hall effect and topological insulators.


2.2 Spin–orbit coupling Generally, SOC describes the interaction of a particle's spin with its orbital motion. For example, in an atom, the interaction between an electron's spin and the magnetic field produced by its orbit around the nucleus shifts the electron's atomic energy levels, which is the typical SOC effect. The SOC Hamiltonian is given as [38]

\[ H_{\mathrm{soc}} = -\frac{\hbar}{4 m_0^2 c^2}\, \boldsymbol{\sigma} \cdot \mathbf{p} \times (\nabla V_0), \tag{1} \]

where $\hbar$ is the reduced Planck constant, $m_0$ is the mass of a free electron, $c$ is the velocity of light and $\boldsymbol{\sigma}$ represents the Pauli spin matrices. $H_{\mathrm{soc}}$ couples the potential $V_0$ and the momentum operator $\mathbf{p}$.

In the case of a single atom, where $V_0$ is spherically symmetric, $H_{\mathrm{soc}}$ can be simplified to

\[ H_{\mathrm{soc}} = \lambda\, \mathbf{L} \cdot \boldsymbol{\sigma}, \tag{2} \]

where $\lambda$ is the strength of the SOC interaction and $\mathbf{L}$ represents the angular momentum. In solid systems, however, $V_0$ is the periodic potential, which can be quite complex in form. For convenience, it is sufficient to treat the SOC effect with a second-variational procedure using a radially symmetric average of the potential around the atoms. The SOC interaction is the key to the band topology, so all first-principles calculations that study topological insulators should be carried out with SOC.

2.3 The criterion of topological insulators There are four $Z_2$ invariants $(\nu_0; \nu_1\nu_2\nu_3)$ for three-dimensional topological insulators, first proposed by Fu, Kane and Mele [8]. When $\nu_0 = 1$, materials are strong topological insulators, which have topologically protected gapless surface states consisting of an odd number of Dirac cones. These surface states are robust against time-reversal-invariant (TRI) weak disorder. If $\nu_0 = 0$ and at least one of $\nu_{1,2,3}$ is non-zero, the corresponding materials are weak topological insulators, which have surface states with an even number of Dirac cones on special surfaces. Weak topological insulators can simply be regarded as stacks of layered two-dimensional QSH materials. In the presence of disorder, the surface states of weak topological insulators can be destroyed. When all of $\nu_{0,1,2,3}$ are zero, materials are conventional insulators.

2.3.1 With inversion symmetry The calculation of the $Z_2$ invariants is very simple for compounds with inversion symmetry. The $Z_2$ invariants can be expressed through the parity values at the eight time-reversal-invariant momenta (TRIMs) [39],

\[ (-1)^{\nu_0} = \prod_{i=1}^{8} \delta_i \tag{3} \]

and

\[ (-1)^{\nu_k} = \prod_{n_k = 1;\; n_{j \neq k} = 0,1} \delta_{i=(n_1 n_2 n_3)}, \tag{4} \]

where

\[ \delta_i = \prod_{m=1}^{N} \xi_{2m}(K_i). \tag{5} \]

Here $N$ is half of the number of occupied bands, and $\xi_{2m}(K_i)$ is the parity eigenvalue of the $2m$-th occupied energy band at the TRIM $K_i = (n_1 n_2 n_3) = \tfrac{1}{2}(n_1 \mathbf{b}_1 + n_2 \mathbf{b}_2 + n_3 \mathbf{b}_3)$, where $\mathbf{b}_{1,2,3}$ represent the primitive reciprocal lattice vectors.

2.3.2 Without inversion symmetry For compounds without inversion symmetry, several methods have been proposed to calculate the $Z_2$ invariants [40–43]. Considering the simplicity for first-principles calculations, here we briefly introduce the proposal of Fukui et al. [40]. Firstly, the $Z_2$ formula of the QSH state can be expressed with the Berry connection and the Berry curvature, as shown by Fu and Kane,

\[ Z_2 = \frac{1}{2\pi} \left[ \oint_{\partial B^-} A(\mathbf{k}) - \int_{B^-} F(\mathbf{k}) \right] \bmod 2, \tag{6} \]

with

\[ A(\mathbf{k}) = i \sum_n \langle u_n(\mathbf{k}) | \nabla_{\mathbf{k}} u_n(\mathbf{k}) \rangle \quad \text{and} \quad F(\mathbf{k}) = \nabla_{\mathbf{k}} \times A(\mathbf{k}), \tag{7} \]

where $B^-$ and $\partial B^-$ indicate half of a two-dimensional (2D) torus and its boundary, respectively. For numerical calculations, Eq. (6) can be rewritten directly in its lattice version. Secondly, for the 3D case, we can define six 2D tori as $Z_0\,(k_x, k_y, 0)$, $Z_1\,(k_x, k_y, \pi)$, $Y_0\,(k_x, 0, k_z)$, $Y_1\,(k_x, \pi, k_z)$, $X_0\,(0, k_y, k_z)$, and $X_1\,(\pi, k_y, k_z)$. We can calculate $Z_2$ based on Eq. (6) for each of these six tori, giving $z_0$, $z_\pi$, $y_0$, $y_\pi$, $x_0$, and $x_\pi$, where the subscript $\pi$ labels the tori $Z_1$, $Y_1$ and $X_1$. The four $Z_2$ invariants of topological insulators are obtained as $\nu_0 = (x_0 + x_\pi) \bmod 2$, $\nu_1 = x_\pi$, $\nu_2 = y_\pi$ and $\nu_3 = z_\pi$. Xiao et al. were the first to use these formulas successfully to evaluate the $Z_2$ invariants of half-Heusler compounds by first-principles calculations [44].

2.3.3 Adiabatic argument Sometimes it is not necessary to calculate $Z_2$ directly for compounds without inversion symmetry. One can start from a related compound with inversion symmetry, and then adiabatically change this compound into the one without inversion symmetry. If the energy gap does not close during the adiabatic process, the topological property will not change. For example, the space group of α-Sn is Fd-3m (No. 227), and this structure has inversion symmetry. It follows easily from the parity calculations that α-Sn is topologically non-trivial [39]. The band gap of α-Sn defined by Eq. (9) is negative, which is the key for α-Sn to be topologically non-trivial. We then imagine adiabatically changing α-Sn into HgTe without closing this negative gap. Based on the adiabatic argument, one can conclude that HgTe is topologically non-trivial. Another example of this adiabatic argument is to take the SOC strength as the adiabatic parameter [45]. The band gap of YBiTe3 stays open when the SOC strength is tuned adiabatically from 0 to 100%, which


means YBiTe3 with SOC has the same topological property as the non-SOC case. So one can conclude that YBiTe3 is topologically trivial.

2.3.4 Surface states Gapless surface states of topological insulators must include an odd number of Dirac cones on one surface, and these surface states are robust against TRI weak disorder. So the calculation of surface states is another useful method to judge the band topology. The simplest way to calculate surface states is based on the free-standing structure. This is a very powerful method, but only for compounds with inversion symmetry and a layered structure, such as Bi, Sb and Bi2Se3. For example, if the compounds do not have inversion symmetry, the polarization field might cause serious artificial effects, especially for compounds with a small band gap. In addition, if the compounds are not layered, the dangling bonds on the surface might cause a number of complex, topologically trivial chemical surface states which can mix with the topologically non-trivial ones. The topological surface states originate from the topological property of the bulk electronic structure. Though the details of these surface states can be modified by the particular dangling bonds and by the reconstruction of the electronic structure on the surface, we stress that the topological feature, such as the odd number of Dirac cones, does not change. The calculation of the free-standing model is also expensive, because the vacuum layer and the material slab both have to be thick enough to avoid hybridization between the top and bottom surfaces.

Besides the free-standing model, maximally localized Wannier function (MLWF) methods [46, 47] can be used to calculate the surface states [20, 48]. Essentially the MLWF method is a tight-binding method, but unlike the conventional tight-binding method it can exactly reproduce the band structure of first-principles calculations. It is not easy to obtain MLWFs, however, because the transformation from Bloch functions to Wannier functions is not unique owing to the phase ambiguity of the Bloch functions used in first-principles calculations. Marzari and Vanderbilt reported an effective method to obtain MLWFs by minimizing the spread function $\sum_n (\langle r^2 \rangle_n - \langle \mathbf{r} \rangle_n^2)$ [46]. In order to calculate surface states, we first carry out the first-principles calculations for the 3D bulk structure and then transform the Bloch functions to MLWFs. At the same time the hopping parameters $H_{mn}(\mathbf{R}) = \langle n\mathbf{0} | \hat{H} | m\mathbf{R} \rangle$ between Wannier functions are obtained. In the next step, we use these hopping parameters to construct the hopping parameters of the corresponding semi-infinite structure, and then an iterative method can be used to solve for the surface Green's function,

\[ G^{\alpha,\alpha}_{nn}(k_\parallel, \varepsilon + i\eta), \tag{8} \]

where $n$ denotes the unit cell along the surface normal, and $\alpha$ is the Wannier orbital in the unit cell. The MLWF method can predict surface states well for layered compounds. For example, the surface states of Bi2Se3 calculated with the MLWF method agree well with those measured by angle-resolved photoelectron spectroscopy (ARPES) [20, 49]. Usually we do not expect to predict the exact dispersion of surface states, because this method does not include all the complex situations on the surface. On the other hand, the surface states obtained from the MLWF method originate from the topological property of the bulk electronic structure, so this is an ideal method to judge whether a compound is topologically non-trivial or not.

3 Three-dimensional topological insulators After the initial discovery of the 2D topological insulator HgTe [3, 4], a number of 3D topological insulators have been found through the great effort of theorists and experimentalists [10, 12, 13]. In the following, we classify the topological insulators by the type of band inversion, because the band inversion provides a clear and general physical picture for most topological insulators. There are three basic types of band inversion (s–p, p–p, d–f) in the topological insulators discovered so far. In the following discussion, we take some representative compounds as examples for each type of topological insulator.

3.1 s–p type The most important s–p topological insulator is HgTe [3, 4], which has the zinc-blende structure with space group F-43m (No. 216). Before HgTe was found to be a topologically non-trivial compound, it had been widely studied experimentally and theoretically [50–52]. Unlike other zinc-blende compounds, HgTe is a semiconductor with a symmetry-protected zero-energy band gap. Hg has occupied shallow 5d levels which tend to be delocalized, so Hg has a large effective positive charge in its core. The Hg s level, which forms the $\Gamma_6$ state in cubic symmetry, is pulled down by this effective positive charge of the Hg core to below the Te p levels, which split into $\Gamma_8$ and $\Gamma_7$. Finally the energy level sequence at the Γ point shows the $\Gamma_8$-$\Gamma_6$-$\Gamma_7$ order, which we call the s–p-type band inversion. We define the energy gap $\Delta E$ as

\[ \Delta E = E_{\Gamma_6} - E_{\Gamma_8}, \tag{9} \]

where $E_{\Gamma_6}$ and $E_{\Gamma_8}$ are the energy levels of $\Gamma_6$ and $\Gamma_8$ at the Γ point. HgTe has a negative $\Delta E$ because of the s–p-type band inversion, so it is well known as a negative-gap semiconductor.

Standard LDA and GGA can predict the band inversion between $\Gamma_6$ and $\Gamma_8$, but they cannot obtain the exact band sequence $\Gamma_8$-$\Gamma_6$-$\Gamma_7$ [52]. The LDA band structure with SOC shows the $\Gamma_8$-$\Gamma_7$-$\Gamma_6$ sequence, as shown in Fig. 1(a). As noted above, the MBJLDA method can correct this error of the LDA band structure. The band structure obtained with the MBJLDA method is shown in Fig. 1(b), which shows the correct $\Gamma_8$-$\Gamma_6$-$\Gamma_7$ sequence.

Bernevig, Hughes and Zhang first identified the band inversion in HgTe to be the key ingredient of its topologi-


cally non-trivial behavior [3]. Its topological invariant can also be obtained by an adiabatic argument [39]. If we replace Hg and Te by the same atom in the zinc-blende structure, the crystal structure changes to the diamond structure, which has inversion symmetry. Luckily, grey tin occurs in nature with the diamond structure, space group Fd-3m, and it is also a semiconductor with a negative energy gap $\Delta E$ because the s level lies below the p level. Because grey tin has inversion symmetry, its parity values at all TRIMs can be easily calculated. It is worth noting that, though grey tin is a zero-band-gap semiconductor, we can still define the topological property for all of its occupied bands. Based on the formulas proposed by Fu and Kane, its $Z_2$ invariants are calculated to be (1;000), which indicates that it is topologically non-trivial. Here the key is that the s and p states at the Γ point have opposite parity values. The occupied s state forms $\Gamma_7^-$, whereas the p states form $\Gamma_7^+$ and $\Gamma_8^+$. Taking grey tin as the starting point, we make a thought experiment in which grey tin is adiabatically changed into HgTe. In this process, the negative gap ($\Delta E$) never closes, which means grey tin and HgTe have the same topological property. So HgTe is shown to be topologically non-trivial with $Z_2$ invariants (1;000). Besides this adiabatic argument, the $Z_2$ invariants of HgTe can also be calculated directly by the numerical method described above.

Figure 1 (online colour at: www.pss-rapid.com) (a) and (b) Band structure of HgTe by LDA and MBJLDA methods, respectively. $\Gamma_{6,7,8}$ represent the symmetry of the energy levels at the Γ point. The solid red circles indicate the projection of the s orbital of Hg. The LDA band structure shows the $\Gamma_8$-$\Gamma_7$-$\Gamma_6$ band sequence, which is not correct, but MBJLDA gives the correct band sequence $\Gamma_8$-$\Gamma_6$-$\Gamma_7$.

Similar to HgTe, there is a big family of compounds known as half-Heusler materials (XYZ) [53], which includes more than 250 semiconductors and semimetals. Half-Heusler compounds consist of face-centered cubic (fcc) sublattices and share the same space group as HgTe. Y and Z form a zinc-blende structure which is stuffed by X. Usually X and Y are transition metal or rare earth elements, and Z is a main group element. Usually the 18-electron half-Heusler compounds are the candidates for topological insulators, because they meet the requirement of being semiconducting. The band structure of these half-Heusler compounds at the Γ point near the Fermi level is almost the same as that of HgTe. The s state forms $\Gamma_6$, and the p states split into $\Gamma_7$ and $\Gamma_8$. Some half-Heusler compounds, such as ScPtSb with the band sequence $\Gamma_6$-$\Gamma_8$-$\Gamma_7$, are topologically trivial, and some others, such as LaPtBi with the inverted band sequence $\Gamma_8$-$\Gamma_6$-$\Gamma_7$, are topologically non-trivial. Interestingly, the half-Heusler family was independ-

Figure 2 (online colour at: www.pss-rapid.com) (a) Crystal structure of chalcopyrite compounds (ABC2). (b) Energy gap $\Delta E$ for various chalcopyrite compounds as a function of the lattice constant. Open symbols mean the lattice constant has been reported. The lattice constants of the rest are obtained by first-principles total energy minimization. Squares represent topological insulators, and diamonds represent topological metals. From Ref. [33].
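The parity criterion of Eqs. (3)–(5), which gives the (1;000) result quoted for grey tin, reduces to simple bookkeeping once the $\delta_i$ at the eight TRIMs are known. The Python sketch below is illustrative only: the function name `z2_from_parities` and the input values are hypothetical and are not taken from the review. The sample input merely mimics a single parity inversion at Γ and does not come from a first-principles calculation.

```python
from itertools import product


def z2_from_parities(delta):
    """Return (nu0, (nu1, nu2, nu3)) from delta[(n1, n2, n3)] = +1 or -1.

    delta maps each of the eight TRIMs K_i = (n1*b1 + n2*b2 + n3*b3)/2,
    labelled by n_j in {0, 1}, to the parity product delta_i of Eq. (5).
    """
    trims = list(product((0, 1), repeat=3))

    # Eq. (3): the strong index uses the product over all eight TRIMs.
    prod_all = 1
    for k in trims:
        prod_all *= delta[k]
    nu0 = 0 if prod_all == 1 else 1

    # Eq. (4): each weak index uses the four TRIMs with n_k = 1.
    weak = []
    for axis in range(3):
        prod_plane = 1
        for k in trims:
            if k[axis] == 1:
                prod_plane *= delta[k]
        weak.append(0 if prod_plane == 1 else 1)
    return nu0, tuple(weak)


# Hypothetical input: delta_i = -1 at Gamma = (0, 0, 0) only.
delta = {k: 1 for k in product((0, 1), repeat=3)}
delta[(0, 0, 0)] = -1
print(z2_from_parities(delta))  # (1, (0, 0, 0))
```

In a real calculation the values of `delta` would be built from the parity eigenvalues of the occupied Kramers pairs at each TRIM, as in Eq. (5).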


ently reported almost at the same time by three theory groups [44, 54, 55]. Besides the topological property, half-Heusler compounds are a class of multifunctional materials [56, 57], showing for example superconductivity and magnetism, owing to their transition metal and rare earth elements. So half-Heusler compounds might be the best platform to study the Majorana fermion in topological superconductors [58], the dynamical axion field in the topological antiferromagnetic phase [59], and the quantum anomalous Hall (QAH) effect in the topological ferromagnetic phase [60]. Recently, some ARPES and transport experiments have already been reported for half-Heusler compounds [61–63].

Generally, due to the cubic symmetry, many topologically non-trivial compounds (HgTe and half-Heusler compounds) are zero-gap semiconductors with the Fermi level passing through the $\Gamma_8$ level at the Γ point, and a uniaxial strain is usually needed to break the cubic symmetry in order to open a finite energy gap [64]. Feng et al. reported that the chalcopyrite structure can naturally break the cubic symmetry [33]. The chalcopyrite structure (ABC2) is a body-centered tetragonal structure with space group I-42d (No. 122), which can be regarded as a superlattice of two cubic zinc-blende unit cells, AC and BC, as seen in Fig. 2(a). In essence, the unit cell of chalcopyrite is a doubled unit cell of HgTe in which the cubic symmetry is naturally broken, and we expect that these two classes of compounds might share the same topological property. Feng et al. found that some materials with the chalcopyrite structure are indeed topological insulators, as shown in Fig. 2(b).

Besides the compounds discussed above, there are many other s–p-type topological insulators, such as β-Ag2Te [65], the KHgSb family [66, 67], Na3Bi [68], the CsPbCl3 family [69] and so on.

3.2 p–p type Owing to their simple surface states consisting of a single Dirac cone, the Bi2Se3, Bi2Te3 and Sb2Te3 compounds [20, 49, 72–75] quickly became topological insulators that are extensively studied worldwide. In particular, Bi2Se3 has a large energy gap of 0.3 eV, which is much larger than the energy scale of room temperature. These compounds share a layered structure with a five-atom layer, called the quintuple layer (QL), as the unit cell, with the space group R-3m (No. 166). Each QL contains two equivalent Se atoms, two equivalent Bi atoms and a third Se atom. The coupling between neighboring atomic layers within one QL is chemical bonding, whereas the coupling between two QLs is of the van der Waals type, which is much weaker. It is worth noting that the crystal structure has inversion symmetry.

In the following, we briefly introduce the basic electronic structure of this family of compounds, taking Bi2Se3 as an example. First of all, the band structure without SOC shows Bi2Se3 to be a narrow-band-gap insulator. Both the bottom of the conduction band and the top of the valence band are at the Γ point, as seen in Fig. 3(a). After SOC is turned on, the bottom of the conduction band is pulled down below the top of the valence band, and an interaction gap opens at the crossing of the valence and conduction bands, as seen in Fig. 3(b). Based on the parity calculations, the $Z_2$ invariants of Bi2Se3 are calculated to be (1;000), which means it is topologically non-trivial. The key for Bi2Se3 to be a topological insulator is the band inversion at Γ between the conduction and valence bands with opposite parity values. The schematic of the band sequence at the Γ point shows the band evolution starting from the atomic levels in three stages, see Fig. 3(c). Because the s levels are much lower than the p levels, we start from the atomic p levels of Bi ($6s^2 6p^3$) and Se ($4s^2 4p^4$). At stage (I), the bonding and anti-bonding effects between Bi and Se atoms are considered. All the atomic orbitals are recombined into $P0^{-}_{x,y,z}$, $P1^{\pm}_{x,y,z}$ and $P2^{\pm}_{x,y,z}$, where '0' represents the third Se, and '1' and '2' represent Bi and the other two Se, respectively. '±' represents the parity values. Because the third Se sits exactly at the inversion center, it is different from the two other Se atoms, which together can be classified by parity. We use P0

Figure 3 (online colour at: www.pss-rapid.com) Band structure of Bi2Se3 without (a) and with (b) SOC. The blue dashed line repre-
sents the Fermi level. (c) Evolution of the band sequence at Γ point starting from atomic levels. The three stages (I), (II) and (III) rep-
resent turning on chemical bonding, crystal field and SOC effects step by step. From Ref. [20].
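The SOC stage (III) in Fig. 3(c) can be illustrated with the atomic form $H_{\mathrm{soc}} = \lambda\, \mathbf{L} \cdot \boldsymbol{\sigma}$ of Eq. (2). The NumPy sketch below builds this operator for a single p shell ($l = 1$) and diagonalizes it. It is a toy calculation under stated assumptions: the function name `soc_hamiltonian`, the basis ordering and the value $\lambda = 1$ are chosen here for illustration and are not taken from the review, and no crystal field or bonding is included.

```python
import numpy as np

# Pauli matrices (spin basis: up, down).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Orbital angular momentum for l = 1 in the basis m = +1, 0, -1 (hbar = 1).
l_plus = np.sqrt(2) * np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=complex)
lx = (l_plus + l_plus.conj().T) / 2
ly = (l_plus - l_plus.conj().T) / 2j
lz = np.diag([1.0, 0.0, -1.0]).astype(complex)


def soc_hamiltonian(lam):
    """Return lam * L.sigma as a 6x6 matrix; index = 2 * (orbital) + (spin)."""
    return lam * (np.kron(lx, sx) + np.kron(ly, sy) + np.kron(lz, sz))


h = soc_hamiltonian(1.0)
levels = np.linalg.eigvalsh(h)
# Two states at -2*lam (j = 1/2) and four states at +lam (j = 3/2).
print(np.round(levels, 6))

# The m = 0 (p_z-like) block has no diagonal SOC term ...
print(np.allclose(h[2:4, 2:4], 0))
# ... but it couples to the m = +1 orbital with opposite spin.
print(abs(h[1, 2]))
```

The vanishing diagonal block and the non-zero off-diagonal element are the atomic analogue of the statement that the SOC term does not shift $p_z$ directly but mixes it with $p_{x,y}$, which is what drives the level repulsion in stage (III).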


Figure 4 (online colour at: www.pss-rapid.com) (a) Calculated surface states of Bi2Se3 with the MLWF tight-binding method for a semi-infinite structure with the surface normal (111). The red regions indicate bulk bands and the blue regions indicate the band gaps. The clear surface states with linear dispersion at Γ can be seen in the band gap. (b) ARPES result for Bi2Se3 along the Γ-M direction. From Refs. [20, 49].
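A semi-infinite calculation such as the one in Fig. 4(a) needs the surface Green's function of Eq. (8). The review only states that an iterative method is used, so the Python sketch below shows one widely used choice, a decimation scheme that doubles the effective layer spacing at every step. It is a minimal sketch, not the authors' implementation: the function name `surface_green` and the one-orbital chain used as input are hypothetical. In practice `h00` and `h01` would be the intra-layer and inter-layer blocks assembled from the MLWF hopping parameters at a fixed $k_\parallel$.

```python
import numpy as np


def surface_green(energy, h00, h01, eta=1e-3, tol=1e-12, max_iter=200):
    """Surface Green's function of a semi-infinite stack of identical layers.

    h00: on-site block of one principal layer.
    h01: hopping block from a layer to the next layer below the surface.
    """
    dim = h00.shape[0]
    z = (energy + 1j * eta) * np.eye(dim)
    eps_s = h00.astype(complex)   # renormalized surface block
    eps = h00.astype(complex)     # renormalized bulk block
    alpha = h01.astype(complex)
    beta = h01.conj().T.astype(complex)
    for _ in range(max_iter):
        g = np.linalg.inv(z - eps)
        agb = alpha @ g @ beta
        bga = beta @ g @ alpha
        eps_s = eps_s + agb
        eps = eps + agb + bga
        alpha = alpha @ g @ alpha
        beta = beta @ g @ beta
        # Stop once the effective inter-layer coupling has decayed.
        if np.linalg.norm(alpha) < tol and np.linalg.norm(beta) < tol:
            break
    return np.linalg.inv(z - eps_s)


# Hypothetical toy input: one orbital per layer, nearest-layer hopping -1.
h00 = np.array([[0.0]])
h01 = np.array([[-1.0]])
g_surf = surface_green(0.5, h00, h01)
surface_dos = -g_surf[0, 0].imag / np.pi
print(surface_dos)
```

Scanning `energy` and $k_\parallel$ and plotting the surface density of states $-\mathrm{Im}\, \mathrm{Tr}\, G / \pi$ gives a spectral map of the kind shown in Fig. 4(a), with bulk continua and surface states appearing as regions and sharp lines of high weight.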

to indicate the third Se. At stage (II), after the crystal field is turned on, the $p_{x,y,z}$ levels split into $p_{x,y}$ and $p_z$. The $P1^{+}_{z}$ and $P2^{-}_{z}$ levels are nearest to the Fermi level. At stage (III), the SOC effect is further introduced. $P1^{+}_{z}$ becomes a doubly degenerate level ($P1^{+}_{z,\uparrow\downarrow}$), and $P2^{-}_{z}$ becomes a doubly degenerate level ($P2^{-}_{z,\uparrow\downarrow}$), owing to time-reversal symmetry. Though $\langle p_z | H_{\mathrm{soc}} | p_z \rangle$ is zero, $\langle p_+ | H_{\mathrm{soc}} | p_z \rangle$ is not zero, which acts like a level repulsion between the $p_{x,y}$ and $p_z$ orbitals, so the SOC effect pulls $P1^{+}_{z,\uparrow\downarrow}$ down and pushes $P2^{-}_{z,\uparrow\downarrow}$ up. Finally, if SOC is strong enough, the p–p-type band inversion happens between $P1^{+}_{z,\uparrow\downarrow}$ and $P2^{-}_{z,\uparrow\downarrow}$.

Owing to the layered structure with inversion symmetry, both the free-standing model and the tight-binding model based on MLWFs can be used to calculate surface states. Figure 4(a) shows the clear surface states of Bi2Se3 with a single Dirac cone at Γ calculated by the MLWF tight-binding model. Almost at the same time as the theoretical prediction of Zhang et al. [20], the Hasan group reported the topologically non-trivial surface states of Bi2Se3 observed by ARPES [49], as shown in Fig. 4(b). Comparing the theoretical and experimental results, one has to agree that first-principles calculations can successfully predict topological insulators, including the details of the surface states. Recently many experimental studies of topological insulators have focused on these compounds, because they are easy to grow by a variety of experimental methods.

Figure 5 (online colour at: www.pss-rapid.com) Schematic for the comparison of the surface states of (a) first-principles calculations [48], (b) tight-binding calculations [70], and (d) ARPES experiment [71]. (c) With the extra surface states shown as red dotted lines, the surface states from first-principles calculations may agree with those of the ARPES experiment. From Ref. [48].

The topological insulator Bi1–xSbx (0.07 < x < 0.22) alloy also belongs to the p–p type [39]. Bulk Bi and Sb share a rhombohedral R-3m structure which has inversion symmetry, and both are semimetals with some tiny Fermi pockets around the TRIM L and T points, but there is a direct gap at every k point throughout the whole Brillouin zone (BZ). So we can define an imaginary Fermi surface in the direct gap. Based on the parity calculations, we confirm that Bi is topologically trivial with $Z_2$ (0;000), and that Sb is topologically non-trivial with $Z_2$ (1;111). The key difference between Bi and Sb is the band sequence of the conduction and valence bands at the three L points. For example, the conduction band of Bi is $L_s$, and the valence band is $L_a$, where 'a/s' indicates the –/+ parity. In contrast, these two bands are switched in Sb. After carefully comparing the band structures of Bi and Sb, Fu and Kane predicted that the insulating phase of the Bi1–xSbx (0.07 < x < 0.22) alloy must be a topological insulator. Subsequently, the Hasan group observed the topologically non-trivial property of Bi1–xSbx by ARPES [71]. But the details of surface states do not agree with the


ones of tight-binding [70] and first-principles calculations (No. 225), and the inversion symmetry is also held in this
[48]. The schematics of the difference among these results structure. All these compounds have been well studied by
are shown in Fig. 5. We can see that the ARPES result in- theories and experiments, known as mixed valence materials.
Here we take AmN as an example to understand the band
dicates three surface states Σ1, 2,3 , but two surface states Σ1, 2
structure. The configuration of actinide Am is 5f77s26d0.
are only found by tight-binding and first-principles calcula-
The SOC interaction is stronger than Hund’s rule, so the f
tions. Zhang et al. argued that the extra surface state Σ 3
orbitals split into high energy J = 7/2 and low energy
might come from the imperfect surface, but this still is an J = 5/2 states. Approximately, in AmN, Am forms Am3+
open question up to now. with the configuration 5f67s06d0, the states of J = 5/2
Following Bi2Se3 family, a number of other p – p Bi- should be fully occupied, and J = 7/2 states are unoccupied.
based topological insulators are predicted by theories and But due to the delocalization of 5f in Am, 5f states partly
observed by experiments, such as, TlBiSe2 family [21–24], hybridize with 6d states with neighbor Am atoms.
SnBi2Te4 and SnBi4Te7 family [76], and so on. In the fcc crystal field, d orbitals first split into t 2g and
eg states, and t 2g level goes down to cross 5f below the
3.3 d–f type There is no clear evidence for the limit Fermi level along Γ-X direction, shown in Fig. 6. The
(>0.3 eV) of the energy gap size for topological insulators. band inversion happens at three X points. If only LDA cal-
How could we find new topological insulators with bigger culations are used, the full energy gap cannot open through
energy gap? One possible way to enhance the SOC energy the whole BZ. After the electron correlation is introduced
gap is to consider the cooperation of the SOC interaction with LDA + U method, a band gap can open up with
and other effects, such as, the electron–electron correlation. proper correlation parameter U. We have to address that
In this idea, topological Kondo insulators were proposed, the electron correlation U is found to enhance the SOC in
and SmB6 as an example was predicted to be a topological these compounds. Because there are three TRIM X points
Kondo insulator [78]. Though due to 4f orbitals SmB6 is a in BZ, Z2 invariants of AmN must be topologically non-
strong correlated system. It only has a tiny energy gap. Re- trivial. Furthermore, our conclusion suggests that all the
cently Zhang et al. predicted AmN and PuTe family com- mix-valence compounds with rock-salt structure must be
pounds are d and f topological insulators with strong inter- topologically non-trivial. Especially, transport experiments
action [77]. All AmN and PuTe family compounds showed, PuTe [79] has a big energy gap around 0.2 eV,
have rock-salt crystal structure with space group Fm3m and this gap can be enhanced to 0.4 eV with pressure.
Many of these f compounds host all kinds of magnetic
phases, so they might open the opportunity to study QAH
effect and dynamic Axion field.

4 Summary and outlook In this review, we first introduced widely used techniques within first-principles calculations, including the LDA and GGA, GW and MBJLDA, LDA + U, LDA + DMFT and LDA + Gutzwiller methods, because they play a crucial role in the field of topological insulators. Then the basic concepts of topological insulators and some useful methods to confirm the topological property are summarized. We classify the topological insulators found to date into three types, s–p, p–p and d–f, based on the clear band inversion picture. For each type, we take several typical compounds as examples and discuss their electronic structure and topological property.
Figure 6 Band structure for the AmN compound with U = 0 eV (a) and U = 2.5 eV (b) from the LDA + U method. The thickness of the band corresponds to the projected weight of the d character of Am. In the Γ–X direction, one band with d character clearly comes down to cross the valence bands, which mainly have f character. Panel (a) represents a semi-metal without a band gap, but (b) represents a finite band gap. From Ref. [77].

Though many topological insulators have been discovered, it is still important to find more with desired properties. First of all, a big band gap is important for the application of surface states of topological insulators. Up to now the biggest band gap is around 0.3 eV in the Bi2Se3 compound. Secondly, the transport experiments to detect surface states are still very challenging [80–82]. One reason is that the quality of samples is not good enough, with a low mobility. Another reason is that the Dirac cone always coexists with some bulk carriers. In order to overcome this barrier, on the one hand, experimentalists are trying to improve the quality of samples. On the other hand, it is important to

www.pss-rapid.com © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim



80 H. Zhang and S.-C. Zhang: A review of topological insulators

find other new topological insulators with functional properties. In addition, it is interesting to study the cooperation of the topological property with other phases, such as superconductivity, magnetism and so on. We hope that this review can provide some guidance in the search.

Acknowledgements This work was supported by the Defense Advanced Research Projects Agency Microsystems Technology Office, MesoDynamic Architecture Program (MESO) through the contract number N66001-11-1-4105 and by the Army Research Office (No. W911NF-09-1-0508).

References
[1] K. v. Klitzing, G. Dorda, and M. Pepper, Phys. Rev. Lett. 45, 494 (1980).
[2] D. J. Thouless, M. Kohmoto, M. P. Nightingale, and M. den Nijs, Phys. Rev. Lett. 49, 405 (1982).
[3] B. A. Bernevig, T. L. Hughes, and S. C. Zhang, Science 314, 1757 (2006).
[4] M. König, S. Wiedmann, C. Brüne, A. Roth, H. Buhmann, L. Molenkamp, X. L. Qi, and S. C. Zhang, Science 318, 766–770 (2007).
[5] C. L. Kane and E. J. Mele, Phys. Rev. Lett. 95, 226801 (2005).
[6] B. A. Bernevig and S. C. Zhang, Phys. Rev. Lett. 96, 106802 (2006).
[7] S. Murakami, Phys. Rev. Lett. 97, 236805 (2006).
[8] L. Fu, C. L. Kane, and E. J. Mele, Phys. Rev. Lett. 98(10), 106803 (2007).
[9] X. L. Qi, T. L. Hughes, and S. C. Zhang, Phys. Rev. B 78, 195424–195443 (2008).
[10] X. L. Qi and S. C. Zhang, Phys. Today 63(1), 33–38 (2010).
[11] J. E. Moore, Nature 464(7286), 194–198 (2010).
[12] M. Z. Hasan and C. L. Kane, Rev. Mod. Phys. 82(4), 3045–3067 (2010).
[13] X. L. Qi and S. C. Zhang, Rev. Mod. Phys. 83, 1057–1110 (2011).
[14] B. Yan and S. C. Zhang, Rep. Progr. Phys. 75, 096501 (2012).
[15] L. Müchler, H. Zhang, S. Chadov, B. Yan, F. Casper, J. Kübler, S. C. Zhang, and C. Felser, Angew. Chem., Int. Ed. 51(29), 7221–7225 (2012).
[16] P. Hohenberg and W. Kohn, Phys. Rev. 136(3B), B864–B871 (1964).
[17] W. Kohn and L. J. Sham, Phys. Rev. 140(4A), A1133–A1138 (1965).
[18] D. C. Langreth and M. J. Mehl, Phys. Rev. B 28, 1809–1834 (1983).
[19] A. D. Becke, Phys. Rev. A 38, 3098–3100 (1988).
[20] H. Zhang, C. X. Liu, X. L. Qi, X. Dai, Z. Fang, and S. C. Zhang, Nature Phys. 5(6), 438–442 (2009).
[21] B. Yan, C. X. Liu, H. J. Zhang, C. Y. Yam, X. L. Qi, T. Frauenheim, and S. C. Zhang, Europhys. Lett. 90(3), 37002 (2010).
[22] H. Lin, R. S. Markiewicz, L. A. Wray, L. Fu, M. Z. Hasan, and A. Bansil, Phys. Rev. Lett. 105, 036404 (2010).
[23] Y. L. Chen, Z. K. Liu, J. G. Analytis, J. H. Chu, H. J. Zhang, B. H. Yan, S. K. Mo, R. G. Moore, D. H. Lu, I. R. Fisher, S. C. Zhang, Z. Hussain, and Z. X. Shen, Phys. Rev. Lett. 105, 266401 (2010).
[24] T. Sato, K. Segawa, H. Guo, K. Sugawara, S. Souma, T. Takahashi, and Y. Ando, Phys. Rev. Lett. 105, 136802 (2010).
[25] J. P. Perdew and M. Levy, Phys. Rev. Lett. 51, 1884–1887 (1983).
[26] L. J. Sham and M. Schlüter, Phys. Rev. Lett. 51, 1888–1891 (1983).
[27] J. K. Perry, J. Tahir-Kheli, and W. A. Goddard, Phys. Rev. B 63, 144510 (2001).
[28] M. S. Hybertsen and S. G. Louie, Phys. Rev. B 34, 5390–5413 (1986).
[29] J. Vidal, X. Zhang, L. Yu, J. W. Luo, and A. Zunger, Phys. Rev. B 84, 041109 (2011).
[30] R. Sakuma, C. Friedrich, T. Miyake, S. Blügel, and F. Aryasetiawan, Phys. Rev. B 84, 085144 (2011).
[31] O. V. Yazyev, E. Kioupakis, J. E. Moore, and S. G. Louie, Phys. Rev. B 85, 161101 (2012).
[32] F. Tran and P. Blaha, Phys. Rev. Lett. 102, 226401 (2009).
[33] W. Feng, D. Xiao, J. Ding, and Y. Yao, Phys. Rev. Lett. 106, 016402 (2011).
[34] V. I. Anisimov, J. Zaanen, and O. K. Andersen, Phys. Rev. B 44, 943–954 (1991).
[35] A. Georges, G. Kotliar, W. Krauth, and M. J. Rozenberg, Rev. Mod. Phys. 68, 13–125 (1996).
[36] X. Deng, L. Wang, X. Dai, and Z. Fang, Phys. Rev. B 79, 075114 (2009).
[37] M. T. Suzuki and P. M. Oppeneer, Phys. Rev. B 80, 161103 (2009).
[38] R. Winkler, Spin–Orbit Coupling Effects in Two-dimensional Electron and Hole Systems, Springer Tracts Mod. Phys., Vol. 191 (Springer-Verlag, Berlin, 2003).
[39] L. Fu and C. L. Kane, Phys. Rev. B 76(4), 045302 (2007).
[40] T. Fukui and Y. Hatsugai, J. Phys. Soc. Jpn. 76(5), 053702 (2007).
[41] A. A. Soluyanov and D. Vanderbilt, Phys. Rev. B 83, 035108 (2011).
[42] Z. Ringel and Y. E. Kraus, Phys. Rev. B 83, 245115 (2011).
[43] R. Yu, X. L. Qi, A. Bernevig, Z. Fang, and X. Dai, Phys. Rev. B 84, 075119 (2011).
[44] D. Xiao, Y. Yao, W. Feng, J. Wen, W. Zhu, X. Q. Chen, G. M. Stocks, and Z. Zhang, Phys. Rev. Lett. 105, 096404 (2010).
[45] B. Yan, H. J. Zhang, C. X. Liu, X. L. Qi, T. Frauenheim, and S. C. Zhang, Phys. Rev. B 82(16), 161108 (2010).
[46] N. Marzari and D. Vanderbilt, Phys. Rev. B 56, 12847 (1997).
[47] I. Souza, N. Marzari, and D. Vanderbilt, Phys. Rev. B 65, 035109 (2001).
[48] H. J. Zhang, C. X. Liu, X. L. Qi, X. Y. Deng, X. Dai, S. C. Zhang, and Z. Fang, Phys. Rev. B 80, 085307 (2009).
[49] Y. Xia, D. Qian, D. Hsieh, L. Wray, A. Pal, H. Lin, A. Bansil, D. Grauer, Y. S. Hor, R. J. Cava, and M. Z. Hasan, Nature Phys. 5(6), 398–402 (2009).
[50] P. Capper and J. Brice, Properties of Mercury Cadmium Telluride (INSPEC, London, 1987).
[51] Z. W. Lu, D. Singh, and H. Krakauer, Phys. Rev. B 39, 10154–10161 (1989).
[52] A. Delin and T. Klüner, Phys. Rev. B 66, 035117 (2002).
[53] C. Felser, G. H. Fecher, and B. Balke, Angew. Chem., Int. Ed. 46(5), 668–699 (2007).


Phys. Status Solidi RRL 7, No. 1–2 (2013) 81

[54] S. Chadov, X. Qi, J. Kübler, G. H. Fecher, C. Felser, and S. C. Zhang, Nature Mater. 9(7), 541–545 (2010).
[55] Hsin Lin, L. Wray, Yuqi Xia, Suyang Xu, Shuang Jia, R. Cava, A. Bansil, and M. Hasan, Nature Mater. 9, 546–549 (2010).
[56] P. C. Canfield, J. D. Thompson, W. P. Beyermann, A. Lacerda, M. F. Hundley, E. Peterson, Z. Fisk, and H. R. Ott, J. Appl. Phys. 70(10), 5800–5802 (1991).
[57] G. Goll, M. Marz, A. Hamann, T. Tomanic, K. Grube, T. Yoshino, and T. Takabatake, Physica B: Condensed Matter 403(5), 1065–1067 (2008).
[58] X. L. Qi, T. L. Hughes, S. Raghu, and S. C. Zhang, Phys. Rev. Lett. 102, 187001 (2009).
[59] Y. Y. Li, G. Wang, X. G. Zhu, M. H. Liu, C. Ye, X. Chen, Y. Y. Wang, K. He, L. L. Wang, X. C. Ma, H. J. Zhang, X. Dai, Z. Fang, X. C. Xie, Y. Liu, X. L. Qi, J. F. Jia, S. C. Zhang, and Q. K. Xue, Adv. Mater. 22(36), 4002–4007 (2010).
[60] R. Yu, W. Zhang, H. J. Zhang, S. C. Zhang, X. Dai, and Z. Fang, Science 329(5987), 61–64 (2010).
[61] K. Gofryk, D. Kaczorowski, T. Plackowski, A. Leithe-Jasper, and Y. Grin, Phys. Rev. B 84, 035208 (2011).
[62] C. Liu, Y. Lee, T. Kondo, E. D. Mun, M. Caudle, B. N. Harmon, S. L. Bud’ko, P. C. Canfield, and A. Kaminski, Phys. Rev. B 83, 205133 (2011).
[63] C. Shekhar, S. Ouardi, G. H. Fecher, A. K. Nayak, C. Felser, and E. Ikenaga, Appl. Phys. Lett. 100(25), 252109 (2012).
[64] X. Dai, T. L. Hughes, X. L. Qi, Z. Fang, and S. C. Zhang, Phys. Rev. B 77(12), 125319-6 (2008).
[65] W. Zhang, R. Yu, W. Feng, Y. Yao, H. Weng, X. Dai, and Z. Fang, Phys. Rev. Lett. 106, 156808 (2011).
[66] H. J. Zhang, S. Chadov, L. Müchler, B. Yan, X. L. Qi, J. Kübler, S. C. Zhang, and C. Felser, Phys. Rev. Lett. 106, 156402 (2011).
[67] B. Yan, L. Müchler, and C. Felser, Phys. Rev. Lett. 109, 116406 (2012).
[68] Z. Wang, Y. Sun, X. Q. Chen, C. Franchini, G. Xu, H. Weng, X. Dai, and Z. Fang, Phys. Rev. B 85, 195320 (2012).
[69] K. Yang, W. Setyawan, S. Wang, M. B. Nardelli, and S. Curtarolo, Nature Mater. 11(7), 614–619 (2012).
[70] J. C. Y. Teo, L. Fu, and C. L. Kane, Phys. Rev. B 78, 045426 (2008).
[71] D. Hsieh, D. Qian, L. Wray, Y. Xia, Y. S. Hor, R. J. Cava, and M. Z. Hasan, Nature 452, 970–974 (2008).
[72] J. Moore, Nature Phys. 5(6), 378–380 (2009).
[73] Y. L. Chen, J. G. Analytis, J. H. Chu, Z. K. Liu, S. K. Mo, X. L. Qi, H. J. Zhang, D. H. Lu, X. Dai, Z. Fang, S. C. Zhang, I. R. Fisher, Z. Hussain, and Z. X. Shen, Science 325(5937), 178–181 (2009).
[74] Y. Zhang, K. He, C. Z. Chang, C. L. Song, L. L. Wang, X. Chen, J. F. Jia, Z. Fang, X. Dai, W. Y. Shan, S. Q. Shen, Q. Niu, X. L. Qi, S. C. Zhang, X. C. Ma, and Q. K. Xue, Nature Phys. 6(9), 584 (2010).
[75] H. Peng, K. Lai, D. Kong, S. Meister, Y. Chen, X.-L. Qi, S.-C. Zhang, Z.-X. Shen, and Y. Cui, Nature Mater. 9, 225–229 (2010).
[76] S. V. Eremeev, G. Landolt, T. V. Menshchikova, B. Slomski, Y. M. Koroteev, Z. S. Aliev, M. B. Babanly, J. Henk, A. Ernst, L. Patthey, A. Eich, A. A. Khajetoorians, J. Hagemeister, O. Pietzsch, J. Wiebe, R. Wiesendanger, P. M. Echenique, S. S. Tsirkin, I. R. Amiraslanov, J. H. Dil, and E. V. Chulkov, Nature Commun. 3, 635 (2012).
[77] X. Zhang, H. Zhang, J. Wang, C. Felser, and S. C. Zhang, Science 335(6075), 1464–1466 (2012).
[78] M. Dzero, K. Sun, V. Galitski, and P. Coleman, Phys. Rev. Lett. 104, 106408 (2010).
[79] V. Ichas, J. C. Griveau, J. Rebizant, and J. C. Spirlet, Phys. Rev. B 63, 045109 (2001).
[80] M. Veldhorst, M. Snelder, M. Hoek, T. Gang, V. K. Guduru, X. L. Wang, U. Zeitler, W. G. van der Wiel, A. A. Golubov, H. Hilgenkamp, and A. Brinkman, Nature Mater. 11(5), 417–421 (2012).
[81] D. Kim, S. Cho, N. P. Butch, P. Syers, K. Kirshenbaum, S. Adam, J. Paglione, and M. S. Fuhrer, Nature Phys. 8(6), 458–462 (2012).
[82] S. S. Hong, J. J. Cha, D. Kong, and Y. Cui, Nature Commun. 3, 757 (2012).



December 12, 2018 15:58 taken from MPLB ws-rv961x669 chap09-appendix-S0217984990000933 page 215


Dedicated to the memory of Professor Shou-Cheng Zhang

APPENDIX

SO4 SYMMETRY IN A HUBBARD MODEL∗

CHEN NING YANG


Institute for Theoretical Physics, State University of New York,
Stony Brook, NY 11794-3840, USA
and
S.C. ZHANG
IBM Research Division, Almaden Research Center,
San Jose, CA 95120-6099, USA

For a simple Hubbard model, using a particle-particle pairing operator $\eta$ and a particle-hole pairing operator $\zeta$, it is shown that one can write down two commuting sets of angular momenta operators $\mathbf{J}$ and $\mathbf{J}'$, both of which commute with the Hamiltonian. These considerations allow the introduction of quantum numbers $j$ and $j'$, and lead to the fact that the system has $SO_4 = (SU_2 \times SU_2)/Z_2$ symmetry. $j$ is related to the existence of superconductivity for a state and $j'$ to its magnetic properties.

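In a recent paper¹ it was found that a pairing operator $\eta$ is useful for considering the Hamiltonian in a simple Hubbard model on an $L \times L \times L$ lattice, where $L$ is even. We shall extend such considerations in the present paper. All notations are the same as in Ref. 1. We introduce here a Hamiltonian $H'$ and a momentum operator $P'$ which are trivially different from the $H$ and $P$ of Ref. 1, in order to bring out more symmetries of the system:

$$H' = T' + V' , \qquad (1)$$

$$T' = -2\varepsilon \sum_{\mathbf{k}} (\cos k_x + \cos k_y + \cos k_z)(a_{\mathbf{k}}^\dagger a_{\mathbf{k}} + b_{\mathbf{k}}^\dagger b_{\mathbf{k}}) , \qquad (2)$$

$$V' = 2W \sum_{\mathbf{r}} (a_{\mathbf{r}}^\dagger a_{\mathbf{r}} - \tfrac{1}{2})(b_{\mathbf{r}}^\dagger b_{\mathbf{r}} - \tfrac{1}{2}) , \qquad (3)$$

$$P' = \sum_{\mathbf{k}} (\mathbf{k} - \tfrac{1}{2}\boldsymbol{\pi})(a_{\mathbf{k}}^\dagger a_{\mathbf{k}} + b_{\mathbf{k}}^\dagger b_{\mathbf{k}}) \pmod{2\pi} . \qquad (4)$$

(1) The operators $J_x$, $J_y$, and $J_z$ - It is easy to verify that $\eta^\dagger\eta - \eta\eta^\dagger = \sum (a^\dagger a + b^\dagger b) - M$, where $M = L^3$. Calculating the commutator of this commutator with $\eta$ we obtain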

PACS Nos: 74.20.-z, 05.30.Fk.

∗ This chapter also appeared in Modern Physics Letters B, Vol. 4, No. 11 (1990) 759–766. DOI: 10.1142/S0217984990000933.

Theorem 1. Defining

$$\eta^\dagger = J_x + iJ_y , \quad \eta = J_x - iJ_y , \quad J_z = \tfrac{1}{2}\sum (a^\dagger a + b^\dagger b) - \tfrac{1}{2}M , \qquad (5)$$

one finds that $J_x$, $J_y$, $J_z$ commute with each other like the components of an angular momentum. Hence the eigenvalue of $J^2$ is $j(j+1)$ where $2j$ = integer $\ge 0$. Furthermore (as can be easily checked),

$$[T', \mathbf{J}]_- = [V', \mathbf{J}]_- = [H', \mathbf{J}]_- = [P', \mathbf{J}]_- = 0 . \qquad (6)$$

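(2) The operators $J'_x$, $J'_y$ and $J'_z$ - We now define a particle-hole pairing operator,

$$\zeta = \sum_{\mathbf{r}} a_{\mathbf{r}} b_{\mathbf{r}}^\dagger . \qquad (7)$$

Then

Theorem 2. Defining

$$\zeta^\dagger = J'_x + iJ'_y , \quad \zeta = J'_x - iJ'_y , \quad J'_z = \tfrac{1}{2}\sum (a^\dagger a - b^\dagger b) , \qquad (8)$$

one finds that $J'_x$, $J'_y$, $J'_z$ commute with each other like the components of an angular momentum. Hence the eigenvalue of $J'^2$ is $j'(j'+1)$ where $2j'$ = integer $\ge 0$. Furthermore all 3 components of $\mathbf{J}$ commute with all 3 components of $\mathbf{J}'$, and

$$[T', \mathbf{J}']_- = [V', \mathbf{J}']_- = [H', \mathbf{J}']_- = [P', \mathbf{J}']_- = 0 . \qquad (9)$$

$\zeta$ is the usual spin lowering operator and $\mathbf{J}'$ is the usual "spin" operator.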
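The commutation relations stated in Theorems 1 and 2 can be checked numerically on a small system. The following is a minimal sketch under several assumptions that are not in the text: a one-dimensional ring of M = 4 sites stands in for the L × L × L lattice, the hopping is written in its nearest-neighbour real-space form, the pairing operator is taken as η = Σ_r (−1)^r a_r b_r (the one-dimensional analogue of the form used in Ref. 1), ζ = Σ_r a_r b_r†, and the values of ε and W are arbitrary. Operators are stored as sparse dictionaries in the occupation-number basis with Jordan-Wigner signs; all helper names are hypothetical.

```python
from collections import defaultdict

M, EPS, W = 4, 1.0, 0.7      # assumed: 1D ring with an even number of sites
DIM = 2 ** (2 * M)           # bit r: a_r occupied, bit M + r: b_r occupied

def clean(C):
    """Drop numerically vanishing entries of a sparse matrix {(row, col): value}."""
    return {ij: v for ij, v in C.items() if abs(v) > 1e-12}

def add(A, B, c=1.0):
    """Return A + c*B."""
    C = defaultdict(float, A)
    for ij, v in B.items():
        C[ij] += c * v
    return clean(C)

def mul(A, B):
    """Return the matrix product A B."""
    by_row = defaultdict(list)
    for (j, k), w in B.items():
        by_row[j].append((k, w))
    C = defaultdict(float)
    for (i, j), v in A.items():
        for k, w in by_row[j]:
            C[(i, k)] += v * w
    return clean(C)

def dag(A):
    """Hermitian conjugate (all entries here are real)."""
    return {(j, i): v for (i, j), v in A.items()}

def comm(A, B):
    return add(mul(A, B), mul(B, A), -1.0)

def total(terms):
    out = {}
    for t in terms:
        out = add(out, t)
    return out

def annihilate(p):
    """Annihilation operator for mode p, including the Jordan-Wigner sign."""
    op, bit = {}, 2 ** p
    for s in range(DIM):
        if (s // bit) % 2:
            below = bin(s % bit).count("1")   # occupied modes with index below p
            op[(s - bit, s)] = -1.0 if below % 2 else 1.0
    return op

a = [annihilate(r) for r in range(M)]
b = [annihilate(M + r) for r in range(M)]
IDENT = {(s, s): 1.0 for s in range(DIM)}
num = lambda c: mul(dag(c), c)
hop = lambda c, r: mul(dag(c[(r + 1) % M]), c[r])

Na, Nb = total(num(x) for x in a), total(num(x) for x in b)
T = total(add({}, add(hop(c, r), dag(hop(c, r))), -EPS) for c in (a, b) for r in range(M))
V = total(add({}, mul(add(num(a[r]), IDENT, -0.5), add(num(b[r]), IDENT, -0.5)), 2 * W)
          for r in range(M))
H = add(T, V)

eta = total(add({}, mul(a[r], b[r]), (-1.0) ** r) for r in range(M))
zeta = total(mul(a[r], dag(b[r])) for r in range(M))
two_Jz = add(add(Na, Nb), IDENT, -float(M))    # 2 J_z  = N_a + N_b - M
two_Jpz = add(Na, Nb, -1.0)                    # 2 J'_z = N_a - N_b

checks = {
    "[eta+, eta] = 2 Jz": add(comm(dag(eta), eta), two_Jz, -1.0) == {},
    "[zeta+, zeta] = 2 J'z": add(comm(dag(zeta), zeta), two_Jpz, -1.0) == {},
    "[H', eta] = 0": comm(H, eta) == {},
    "[H', zeta] = 0": comm(H, zeta) == {},
    "[eta, zeta] = 0": comm(eta, zeta) == {},
    "[eta, zeta+] = 0": comm(eta, dag(zeta)) == {},
}
for name, ok in checks.items():
    print(name, ok)
```

The shift by 1/2 in V' matters here: with the unshifted interaction the pairing operator would not commute with the Hamiltonian, which is one way to see why H' rather than H is used in this paper.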
(3) Explicit eigenfunctions of $H'$ - We can find many eigenstates of $H'$ with Theorems 1 and 2 as follows. We diagonalize $J^2$, $J'^2$, $J_z$, $J'_z$, $H'$ and $P'$ simultaneously. These states can be sorted into multiplets $\{j, j'\}$, each comprising $(2j+1)(2j'+1)$ states, as illustrated in Fig. 1, where $N_a$ and $N_b$ are eigenvalues of $\sum a^\dagger a$ and $\sum b^\dagger b$,

$$J_z = \tfrac{1}{2}(N_a + N_b - M) , \quad J'_z = \tfrac{1}{2}(N_a - N_b) . \qquad (10)$$

As explained in Fig. 1, $j + j'$ = integer, i.e., not all representations of $SU_2 \times SU_2$ are present. This means that the true symmetry of the problem is $(SU_2 \times SU_2)/Z_2 = SO_4$.

Fig. 1. $(N_a, N_b)$ diagram for $M = 8$. The relationship between $(J_z, J'_z)$ and $(N_a, N_b)$ is given by Eq. (10). Each multiplet $\{j, j'\}$ is represented by a rectangular set of states centered at $J_z = J'_z = 0$ in this diagram. The number of states in the multiplet is $(2j+1)(2j'+1)$. One such multiplet is illustrated. All states of a multiplet share the same eigenvalue of $H'$ and $P'$. The lowest corner in the multiplet is where $J_z = -j$, $J'_z = -j'$. One can generate all states of a multiplet by starting from its lowest corner and repeatedly operating on it with $\eta^\dagger = J_x + iJ_y$ (which increases $J_z$) and with $\zeta^\dagger = J'_x + iJ'_y$ (which increases $J'_z$). Obviously $j + j'$ = integer. Notice that for fixed $j$ and $j'$, there are in general a large number of multiplets $\{j, j'\}$, except for $\{M/2, 0\}$ and $\{0, M/2\}$, each of which occurs only once. For the former, the lowest corner is the point A where $N_a = N_b = 0$, which is a single state. For the latter, the lowest corner is B where $N_a = 0$, $N_b = M$, which is also a single state.

Consider now the states in one spot on the bottom row of Fig. 1. For these states, $N_a = 0$. The operators $H'$ and $P'$ for such states are easily diagonalizable since for such states there are no a-particle–b-particle interactions, so that the problem reduces to that of $N_b$ noninteracting fermions. One can thus trivially write down the eigenstates of $H'$ and $P'$ in momentum space. There are $\binom{M}{N_b}$ such states. Operating with $\eta^\dagger$ and $\zeta^\dagger$ on these states generates $\binom{M}{N_b}$ multiplets $\{j, j'\}$. Now obviously

$$j = \tfrac{1}{2}(M - N_b) , \quad j' = \tfrac{1}{2}N_b .$$

Thus we can easily write down explicitly the eigenfunctions of $H'$ and $P'$ for $\binom{M}{N_b}$ multiplets $\{\tfrac{1}{2}(M - N_b), \tfrac{1}{2}N_b\}$. The total number of such states is

$$\sum_{N_b} \binom{M}{N_b} (M - N_b + 1)(N_b + 1) ,$$

where the summation extends from $N_b = 0$ to $M$. The summation is equal to $2^{M-2}(M^2 + 3M + 4)$. This is an enormous number of eigenstates, but still very small compared to the total number of eigenstates, which is $4^M$. We remark here that the eigenstates $\psi_N$ of Ref. 1 are special cases of the states discussed in this section.
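The closed form quoted for the number of explicitly known eigenstates can be confirmed by direct summation. The short sketch below (function names are illustrative only) compares the sum over $N_b$ with $2^{M-2}(M^2 + 3M + 4)$ and with the full Hilbert-space dimension $4^M$; the closed form is evaluated for M ≥ 2 so that the power of two stays an integer.

```python
from math import comb

def explicit_state_count(M):
    """Sum over N_b of C(M, N_b) * (2j + 1) * (2j' + 1) with 2j = M - N_b, 2j' = N_b."""
    return sum(comb(M, nb) * (M - nb + 1) * (nb + 1) for nb in range(M + 1))

def closed_form(M):
    """Closed form 2**(M - 2) * (M**2 + 3M + 4), used here for M >= 2."""
    return 2 ** (M - 2) * (M * M + 3 * M + 4)

for M in (2, 8, 64):
    known, all_states = explicit_state_count(M), 4 ** M
    print(M, known == closed_form(M), known / all_states)
```

For the M = 8 lattice of Fig. 1 the explicitly known states already make up a noticeable part of the spectrum, while for large M the fraction falls off roughly as $M^2/2^{M+2}$.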

The eigenstates of $H'$ constructed above obviously do not depend on $W$ and are simultaneous eigenstates of $T'$ and $V'$. We believe they are the only $W$-independent eigenstates of $H'$, but we do not know how to prove this statement except in special cases.
(4) ODLRO - We shall show
Theorem 3. For any state $\psi$ for which $j^2 - j_z^2 = O(M^2)$, there is ODLRO.
The 2-particle reduced density matrix $\rho_2$ has matrix element

Thus

$$\sum_{\mathbf{r},\mathbf{s}} e^{i\boldsymbol{\pi}\cdot(\mathbf{r}-\mathbf{s})} \langle b_{\mathbf{s}} a_{\mathbf{s}} | \rho_2 | b_{\mathbf{r}} a_{\mathbf{r}} \rangle = \psi^\dagger \eta^\dagger \eta \psi = \psi^\dagger (J_x + iJ_y)(J_x - iJ_y) \psi = j^2 - j_z^2 + j + j_z .$$

Using

$$\langle b_{\mathbf{r}'} a_{\mathbf{r}} | \phi \rangle = M^{-1/2} e^{i\boldsymbol{\pi}\cdot\mathbf{r}} \delta(\mathbf{r} - \mathbf{r}')$$

as a trial wave function for $\rho_2$, we find the expectation value of $\rho_2$ to be

$$\langle \rho_2 \rangle = \frac{1}{M}(j^2 - j_z^2) + O(1) = O(M) > 0 .$$

Thus the largest eigenvalue of $\rho_2$ is $O(M)$ and the state has ODLRO.²
In Ref. 1 we had shown that the states $\psi_N$ have ODLRO. That fact is a special case of the above theorem, because for $\psi_N$, $j = M/2$ and $j_z = -M/2 + N$.
In the above discussions, the pairs are particle-particle pairs. If the particle has charge $e$, then the state exhibits² flux quantization in units of $ch/2e$. If $j'^2 - j_z'^2 = O(M^2)$, the system exhibits particle-hole ODLRO. There is no superconductivity for such a system.²,³ Thus $j$ is related to superconductivity and $j'$ to magnetic properties.
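As a small worked example of Theorem 3, the quantity $j^2 - j_z^2 + j + j_z$ can be evaluated for the states $\psi_N$ using the quantum numbers quoted above, $j = M/2$ and $j_z = -M/2 + N$. The sketch below uses exact rational arithmetic; the function names are illustrative, and the closed form $N(M - N + 1)$ it is compared against follows from factoring the expression as $(j + j_z)(j - j_z + 1)$.

```python
from fractions import Fraction

def eta_dag_eta(j, jz):
    """Expectation value j^2 - jz^2 + j + jz of eta^dagger eta in a state |j, jz>."""
    return j * j - jz * jz + j + jz

def psi_N_value(M, N):
    """Same quantity for psi_N, with j = M/2 and jz = -M/2 + N."""
    j = Fraction(M, 2)
    jz = Fraction(-M, 2) + N
    return eta_dag_eta(j, jz)

M = 8
for N in range(M + 1):
    value = psi_N_value(M, N)
    # Dividing by M gives the trial-function estimate for the largest eigenvalue of rho_2.
    print(N, value, value == N * (M - N + 1), float(value / M))
```

At fixed filling N/M the value grows like M², so after division by M the estimate is of order M, which is the ODLRO criterion used in the theorem.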
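(5) Unitary Operators $U_b$ and $X$ - We define these two operators as follows:

$$U_b a_{\mathbf{r}} U_b^{-1} = a_{\mathbf{r}} , \quad U_b b_{\mathbf{r}} U_b^{-1} = e^{i\boldsymbol{\pi}\cdot\mathbf{r}} b_{\mathbf{r}}^\dagger , \quad U_b^2 = 1 , \qquad (11)$$

and

$$X a_{\mathbf{r}} X^{-1} = e^{i\boldsymbol{\pi}\cdot\mathbf{r}} a_{\mathbf{r}} , \quad X b_{\mathbf{r}} X^{-1} = e^{i\boldsymbol{\pi}\cdot\mathbf{r}} b_{\mathbf{r}} , \quad X^2 = 1 . \qquad (12)$$

Operator $X$ is well known and operator $U_b$ has been discussed in the literature.⁴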
We observe that

(13)

and

$$\zeta = U_b \eta U_b^{-1} . \qquad (14)$$

Theorem 4. Writing H'(W) for H', we have

(15)

Theorem 5.

$$X H'(W) X^{-1} = -H'(-W) , \qquad (17)$$

It follows that

(19)

Denoting by Spm$(W, N_a, N_b)$ the spectrum of $H'(W)$ for given $N_a$ and $N_b$, we have, by Theorem 4,
Theorem 6.

$$\mathrm{Spm}(W, N_a, N_b) = \mathrm{Spm}(-W, N_a, M - N_b) = \mathrm{Spm}(-W, M - N_a, N_b) = \mathrm{Spm}(W, M - N_a, M - N_b) . \qquad (20)$$

By Theorem 5, we have
Theorem 7.

(21)

Combining these two results we obtain

$$\mathrm{Spm}(W, N_a, N_b) = -\mathrm{Spm}(W, N_a, M - N_b) = -\mathrm{Spm}(W, M - N_a, N_b) = \mathrm{Spm}(W, M - N_a, M - N_b) . \qquad (22)$$

(6) Limit $M \to \infty$ - We shall now put $\varepsilon = 1$ in (2). Diagonalizing $J^2$, $J'^2$, $J_z$, $J'_z$, $H'$, $P'$, we have also diagonalized $N_a$ and $N_b$ because of (10). Let the lowest eigenvalue of $H'$ at a fixed $N_a$, $N_b$ be denoted by $E_0(W, N_a, N_b)$. Now keeping fixed the values of

$$N_a/M = \rho_a , \quad N_b/M = \rho_b$$

we approach the limit $M \to \infty$. It can be proved, by a method used in Ref. 5, that $M^{-1}E_0$ approaches a limit which we shall denote by $f(W, \rho_a, \rho_b)$. $f$ is the lowest eigenvalue of $H'$ per site at fixed densities $\rho_a$ and $\rho_b$.
The function $f$ has many symmetries. Because of Theorems 1 and 2,

$$f(W, \rho_a, \rho_b) = f(W, \rho_b, \rho_a) = f(W, 1 - \rho_a, 1 - \rho_b) = f(W, 1 - \rho_b, 1 - \rho_a) . \qquad (23)$$
Because of (20),

(24)

These symmetries are illustrated in Fig. 2.

Fig. 2. Equi-$f$ contours in the $\rho_a$, $\rho_b$ plane (schematic). Because of (23), these contours are reflection symmetrical with respect to the $\rho_a = \rho_b$ axis and the $\rho_a + \rho_b = 1$ axis. Because of Theorem 8, these contours are convex. One can obtain the $(-W)$ contours from the $(W)$ contours by a rotation through 90° around the center of the square.

Theorem 8. $f(W, \rho_a, \rho_b)$ as a function of $\rho_a$ and $\rho_b$ is continuous and concave upwards.
Theorem 9. $f(W, \rho_a, \rho_b)$ as a function of $W$ is concave downwards.
These two theorems can be proved using the methods of Ref. 5.
Theorem 8 and Eq. (23) show that the minimum of $f(W, \rho_a, \rho_b)$ for fixed $W$ is $f(W, 1/2, 1/2)$. This minimum value may be shared by $f$ at other values of $(\rho_a, \rho_b)$ than $(1/2, 1/2)$. Let the region of $(\rho_a, \rho_b)$ where this is true be denoted by $R$, and call the states that have this minimum value of $f$ lowest states. Equation (23) shows that $R$ is reflection symmetrical with respect to the axis $\rho_a = \rho_b$, and with respect to the axis $\rho_a + \rho_b = 1$. Using Theorem 8 we can show

Theorem 10. The region $R$ in $(\rho_a, \rho_b)$ where $f(W, \rho_a, \rho_b) = f(W, 1/2, 1/2)$ is convex. Possible schematic shapes of $R$ are illustrated in Fig. 3.
Each of the lowest states belongs to a multiplet $\{j, j'\}$. Within that multiplet the leading state (i.e. where $J_z = j$, $J'_z = j'$) is also a lowest state. Hence it must be in the $J_z \ge 0$, $J'_z \ge 0$ quadrant of $R$. Thus

Theorem 11. All the lowest states on the boundary of $R$ have $j = |J_z|$, $j' = |J'_z|$.
Finally we remark that for the points $\rho_a = 0$ (or $\rho_b = 0$), the system is devoid of a (or b) particles. Hence the values of $f(W, 0, \rho_b)$ and $f(W, \rho_a, 0)$ can be easily evaluated. Equation (23) then allows one to write down $f(W, 1, \rho_b)$ and $f(W, \rho_a, 1)$. Thus the value of $f$ on the boundary of the square in Fig. 2 is known.
We now define $g(W, \rho_a, \rho_b)$ to be the highest eigenvalue of $H'$ per site. Equation (22) then shows that

More generally we define the free energy per site by

$$F(\beta, W, \rho_a, \rho_b) = \lim\, (-M\beta)^{-1} \ln (\text{p.f.}) \qquad (26)$$

where

(p.f.) = trace of block of $\exp(-\beta H')$ belonging to given $\rho_a$, $\rho_b$ , (27)

and the limit is for $M \to \infty$. Then

(28)
The function $F$ has many symmetries. Theorems 1 and 2 show that

$$F(\beta, W, \rho_a, \rho_b) = F(\beta, W, \rho_b, \rho_a) = F(\beta, W, 1 - \rho_a, 1 - \rho_b) = F(\beta, W, 1 - \rho_b, 1 - \rho_a) . \qquad (29)$$

Equation (20) shows that

Equation (21) shows that

(31)

Fig. 3. Possible shapes for $R$ (panels a, b, c, d). $R$ is convex and is reflection symmetrical with respect to the $\rho_a = \rho_b$ and the $\rho_a + \rho_b = 1$ axes. For case c there is particle-particle ODLRO at low temperatures in the open line segment. For case d there is particle-particle ODLRO at low temperatures inside of the region $R$. These cases exhibit superconductivity.

These two last equations together show that

(32)

Acknowledgments

One of us (CNY) is supported in part by the National Science Foundation under grant number PHY 8908495.

References
1. C. N. Yang, Phys. Rev. Lett. 63 (1989) 2144.
2. C. N. Yang, Rev. Mod. Phys. 34 (1962) 694.
3. W. Kohn and D. Sherrington, Rev. Mod. Phys. 42 (1970) 1.
4. H. Shiba, Prog. Theor. Phys. 48 (1972) 2171.
5. C. N. Yang and C. P. Yang, Phys. Rev. 147 (1966) 303.
