Three Hundred Years of Gravitation - Stephen Hawking and Werner Israel (Eds.) - 2022
PHILOSOPHIÆ
NATURALIS
PRINCIPIA
MATHEMATICA.
IMPRIMATUR.
S. PEPYS, Reg. Soc. PRÆSES.
Julii 5. 1686.
LONDINI,
Jussu Societatis Regiæ ac Typis Josephi Streater. Prostat apud
plures Bibliopolas. Anno MDCLXXXVII.
Three hundred years of gravitation
EDITED BY
S. W. HAWKING
Lucasian Professor of Mathematics, University of Cambridge
W. ISRAEL
Professor of Physics, University of Alberta, and
Senior Fellow, Canadian Institute for Advanced Research
Cambridge
1. Gravitation
I. Hawking, S. W. II. Israel, W.
531'.14 QC178
Includes bibliographies.
1. Gravitation. 2. Cosmology. 3. Astrophysics.
I. Hawking, S. W. (Stephen W.) II. Israel, W.
III. Title: 300 years of gravitation.
QC178 .T47 1987 531'.14 87-10364
List of contributors IX
Preface xi
1 Newton's Principia
S. W. Hawking
2 Newtonianism and today's physics 5
S. Weinberg
3 Newton, quantum theory and reality 17
R. Penrose
3.1 Newton's corpuscular-undulatory theory and reality 17
3.2 Stated reasons for rejection of wave theory 18
3.3 Newton and relativity 20
3.4 Newton's route to an undulatory-corpuscular picture 24
3.5 Quantum mechanics 25
3.6 Physical reality 26
3.7 Reality of the state vector 27
3.8 Quantum non-locality 29
3.9 Quantum mechanics and macroscopic physics 31
3.10 Linearity and time-evolution 32
3.11 Quantum gravity and time-asymmetry 34
3.12 Time asymmetry of state-vector reduction 37
3.13 Reduction and the (longitudinal) graviton count 42
3.14 Non-locality in quantum geometry 45
References 48
4 Experiments on gravitation 51
A. H. Cook
4.1 Introduction 51
4.8 Conclusion 75
References 77
C. M. Will
5.1 Introduction 80
References 125
T. Damour
6.1 Introduction 128
6.2 The N-extended-body problem in Newtonian gravity 132
6.3 The external and the internal problems of motion (Newtonian case) 134
6.4 The effacement of internal structure in the external problem 136
6.5 The effacement of external structure in the internal problem 140
6.6 Newton and the strong principle of equivalence 141
6.7 Solving the Newtonian problems of motion 144
6.8 The N-extended-body problem in Einsteinian gravity 145
6.9 Approximation methods 149
6.10 The post-Newtonian approximation methods 151
6.11 The post-Minkowskian approximation methods 156
6.12 Singular perturbation methods 160
6.13 The external and the internal problems of motion (Einsteinian case) 162
6.14 The effacement of internal structure in the external problem (Einsteinian case) 173
6.15 The problem of gravitational radiation damping and the relativistic Laplace effect 180
6.16 Conclusion 191
References 192
7 Dark stars: the evolution of an idea 199
W. Israel
7.1 Introduction 199
7.2 Early speculations (1784-1921) 201
7.3 White dwarfs: the first compact massive objects (1910-26) 205
7.4 The Chandrasekhar limit (1929-35) 212
7.5 Eddington's intervention (1935) 216
7.6 Neutron stars and gravitational collapse (1934-59) 223
7.7 The Schwarzschild 'singularity' (1916-66) 231
7.8 Quasars and relativistic astrophysics (1951-72) 239
7.9 Non-spherical collapse: from frozen star to black hole (1964-71) 248
7.10 Towards the quantum era: the thermodynamics of black holes (1970-74) 261
References 266
8 Astrophysical black holes 277
R. D. Blandford
8.1 Introduction 277
8.2 Black holes in astrophysics 280
K. S. Thorne
9.1 Introduction 330
9.2 The physical and mathematical description of a gravitational wave 338
9.3 The generation and propagation of gravitational waves 345
9.4 Astrophysical sources of gravitational waves 364
9.5 Detection of gravitational waves 400
9.6 Conclusion 445
References 446
M. J. Rees
10.1 Introduction 459
10.2 The constituents of the universe: 'dark' matter and 'luminous' matter 463
10.3 Large-scale structure and isotropy 474
10.4 Formation of protogalaxies 478
10.5 A flat (Ω₀ = 1) universe 488
References 496
12.9 Quantum theory of the new inflationary universe phase transition 570
Appendix 596
References 597
A. Linde
13.1 Introduction 604
13.2 Chaotic inflation 607
13.3 Inflation and the wave function of the universe 610
13.4 Quantum fluctuations in the inflationary universe 612
13.5 Eternal chaotic inflation 618
13.6 Global structure of the inflationary universe and the anthropic principle 621
13.7 Conclusions 628
References 628
14 Quantum cosmology 631
S. W. Hawking
14.1 Introduction 631
14.2 The quantum state of the universe 633
14.3 The density matrix 636
14.4 The Wheeler-DeWitt equation 638
14.5 Minisuperspace 640
14.6 Beyond minisuperspace 645
14.7 The direction of time 647
14.8 The origin and fate of the universe 649
References 651
15 Superstring unification 652
J. H. Schwarz
15.1 Introduction 652
15.2 Classification of string theories 654
15.3 Feynman diagrams 656
15.4 String field theory 659
15.5 Anomalies 666
15.6 Compactification 670
15.7 Remaining problems and conclusions 672
References 673
R. Blandford
Department of Theoretical Astrophysics, California Institute of Technology, Pasadena,
California 91125, USA
S. K. Blau
Department of Physics, University of Texas, Austin, Texas 78712, USA
A. H. Cook
The Master's Lodge, Selwyn College, Cambridge
C. Crnkovic
Department of Physics, Joseph Henry Laboratory, Princeton, New Jersey 08544, USA
T. Damour
Department of Fundamental Astrophysics, Observatoire de Paris, 5 Place Janssen, 92195
Meudon, France
A. Guth
Department of Physics, Massachusetts Institute of Technology, Cambridge,
Massachusetts 02139, USA
S. W. Hawking
Department of Applied Mathematics and Theoretical Physics, Silver Street,
Cambridge CB3 9EW, UK
W. Israel
Avadh Bhatia Physics Laboratory, University of Alberta, Edmonton T6G 2J1, Canada
A. Linde
P. N. Lebedev Institute of Physics, Academy of Sciences of USSR, Leninski Prospect 53,
117924 Moscow, USSR
R. Penrose
Mathematical Institute, 24-29 St Giles, Oxford OX1 3LB, UK
M. J. Rees
Institute of Astronomy, Madingley Road, Cambridge CB3 0HA, UK
J. H. Schwarz
Department of Theoretical Astrophysics, California Institute of Technology, Pasadena,
California 91125, USA
K. S. Thorne
Department of Theoretical Astrophysics, California Institute of Technology, Pasadena,
California 91125, USA
A. Vilenkin
Department of Physics and Astronomy, Tufts University, Medford,
Massachusetts 02155, USA
S. Weinberg
Department of Physics, University of Texas at Austin, Texas 78712, USA
C. M. Will
McDonnell Centre for Space Science, Department of Physics, Washington University, St
Louis, Missouri 63130, USA
E. Witten
Department of Physics, Joseph Henry Laboratory, Princeton, New Jersey 08544, USA
Preface
The intervening eight years have witnessed dramatic progress on both the
observational and theoretical fronts. The earlier volume was already in press
when Joseph Taylor announced measurements of the orbital speed-up of the
binary pulsar which confirmed the value predicted from Einstein's formula
for the loss of energy by gravitational radiation. In 1979, this spectacular
vindication could be noted only as a last-minute tabular entry in the proof
stages of Clifford Will's article. In Chapter 5 Will carries the story of this and
other experimental tests of Einstein's theory up to the present.
The flood of data from the binary pulsar has spurred a vigorous
theoretical effort finally to bring under control the two-body problem in
general relativity, long embroiled in controversy and confusion. Thibault
Damour, who has been in the vanguard of these advances, reviews their
current status in Chapter 6.
It is now a realistic prospect that, before the turn of the century, direct
observation of gravitational waves using laser interferometry will have
become possible and even routine. In Chapter 9, Kip Thorne presents a
comprehensive review of the relevant theoretical and experimental aspects.
Cygnus X-1, the only firm black hole candidate known in the 1970s, has
since been joined by other binary X-ray sources in our galaxy and in the
Large Magellanic Cloud. Meanwhile circumstantial evidence for
supermassive holes in the nuclei of galaxies (including our own) has
continued to build up steadily. Roger Blandford critically reviews the
current observational evidence for black holes in Chapter 8.
Of the recent theoretical developments, one of the most publicised and
most exciting has been superstring theory (reviewed in Chapter 15 by John
Schwarz, a leading pioneer in the field), which for the first time offers the
hope of including the gravitational force within a unified and finite theory.
Another is the great upsurge of interest in the very early universe, sparked by
Alan Guth's formulation of inflationary cosmology in 1981. Guth's idea
was, in part, inspired by a brilliantly imaginative chapter in the Einstein
centenary volume co-authored by Robert Dicke and Jim Peebles, which
spotlighted the horizon problem of the standard big-bang cosmology as well
as 'the remarkable balance between mass density and expansion rate'.
Chapters 12-14 by Blau and Guth, Linde and Hawking survey current ideas
on inflationary and quantum cosmology.
The inflationary hypothesis makes it difficult to escape the conclusion
that perhaps 99% of the matter in the universe must be in some invisible
form. Martin Rees in Chapter 10 discusses various possibilities for the
nature of this hypothetical dark matter and its influence on the manner in
idea that the universe was infinite in extent and filled with a distribution of
stars with a density that was more or less constant in space and time. If the
stars were at rest with respect to each other at one time, the gravitational
attraction of each star for the others would cause them to start to fall
together and the density would go up with time. Alternatively, the stars
could be moving away from each other. In that case the density would
decrease with time. Gravity would act to slow down the recession of stars
from one another and might eventually stop the expansion of the universe
and convert it into a contraction. Newton was aware of this problem but he
argued that, in an infinite distribution of stars, the force on a star caused by
the attraction of other stars to one side of it (say, the right side) would be
almost exactly balanced by the force of attraction of the stars to the left of the
star. The net force on the star would therefore be small and the system of
stars could continue to exist at more or less constant distances from each
other. However, there is a flaw in this argument. The force of attraction
produced by the stars to the right of the given star would be infinite because,
although the force produced by each star to the right gets smaller the further
away the star is from the given star, the number of stars at a given distance
gets larger the further one is from the given star. Similarly, the force of
attraction produced by the stars to the left would be infinite. When one
subtracts infinity from infinity, it is well known that one can get any answer
one wants. We now know that the universal attractive nature of gravity is
inconsistent with a static infinite universe: the universe has to be either
expanding or contracting. This was not realised, however, until in the 1920s
observations of distant galaxies revealed that the universe is expanding. It
was a great missed opportunity for theoretical physics: Newton could have
predicted the expansion of the universe.
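The divergence in Newton's balancing argument can be made explicit with a rough sketch. Assume, purely for illustration, a uniform number density n of stars of mass m, and consider only the stars inside a cone of solid angle Ω opening to the right of a given star of mass M (the symbols n, m, M, Ω and R are introduced here and do not appear in the original text). The stars between distances r and r + dr number nΩr² dr, and each pulls with a force of order GMm/r², so, up to a geometric factor of order unity,

```latex
F_{\text{right}} \;\sim\; \int_{0}^{R} \frac{G M m}{r^{2}} \; n\,\Omega\, r^{2}\, dr
\;=\; G M m\, n\, \Omega\, R \;\longrightarrow\; \infty
\qquad (R \to \infty)
```

The factor r² from the growing number of stars cancels the 1/r² fall-off of each star's pull, so every shell contributes equally and the total grows without bound. The same holds for the stars to the left, which is why the difference of the two forces is not well defined.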
In the Principia Newton introduced the concepts of Absolute Time and
Absolute Space. By Absolute Time he meant that there was a quantity called
time which would be measured by all properly constructed clocks,
regardless of their motion. This was an idea that nobody questioned until
the advent of the Theory of Relativity at the beginning of this century. Much
more controversial was Newton's advocacy of Absolute Space. Newton
argued that in the case of circular motion there was a preferred state of rest in
which there were no centrifugal forces. He therefore claimed that by analogy
there ought to be a preferred state of Absolute Rest with respect to linear
motion, though he admitted that it would be difficult to determine this state
of rest by observations. He suggested, as a hypothesis, that this state of rest
coincided with the centre of gravity of the solar system. In fact Newton's
Laws of Motion are the same in all frames of reference moving uniformly in
straight lines though they are not the same in rotating frames. Thus the laws
do not determine a state of rest and so do not support the concept of
Absolute Space, though they are not inconsistent with it. They do, however,
require the concept of Absolute Time. In other words, in Newton's theory
one can determine whether two events at different points in space occur at
the same time but one cannot tell whether events at different times occur at
the same point of space: that depends on the frame of reference and would be
different in two frames that were moving relative to each other.
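This statement can be illustrated in one dimension with the Galilean transformation between two frames in uniform relative motion. The sketch below assumes a constant relative velocity v and a force F that depends only on the separations between bodies, so that it takes the same value in both frames; the primed and unprimed symbols are introduced here for illustration.

```latex
\begin{aligned}
x' &= x - vt, \qquad t' = t \\
\frac{d^{2}x'}{dt'^{2}} &= \frac{d^{2}x}{dt^{2}}
\quad\Longrightarrow\quad
F = m\,\frac{d^{2}x}{dt^{2}} \ \text{takes the same form in both frames} \\
\Delta t' &= \Delta t, \qquad \Delta x' = \Delta x - v\,\Delta t
\end{aligned}
```

Two events with Δt = 0 have Δt' = 0 in every such frame, so simultaneity is absolute. Two events at the same place in one frame (Δx = 0) but at different times have Δx' = -vΔt, which is non-zero whenever v is non-zero, so "the same point of space" depends on the frame.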
The laws presented in the Principia remained the accepted theories of
mechanics and gravity for more than two hundred years. Even today they
are the basis of nearly all practical calculations. It is only in very extreme
situations that one has to take into account the modifications introduced by
the Special and General Theories of Relativity formulated by Albert
Einstein in 1905 and 1915 respectively. In the Theory of Relativity one has to
abandon the concept of Absolute Time as well as that of Absolute Space.
The time measured by a clock now depends on its velocity and one can no
longer determine whether events at different points of space occur at the
same time: again this depends on the motion of the frame of reference. The
philosophical implications of this change from Absolute Space and Time to
Relative Space and Time have been profound but have often not been
properly appreciated. Lenin wrote a pamphlet attacking Relativity because
he thought it was a threat to the absolute system of Hegel and Marx. It is
only in the last thirty years that the study of relativity has become
respectable in the Soviet Union. Other people in the West reacted in a
similar way. Einstein was accused of undermining moral standards by
suggesting that everything was relative. However, this attack and that of
Lenin were based on a misunderstanding of the Theory of Relativity: it is a
mathematical model of space and time. It makes no statement about how
human affairs should be conducted or organised.
Einstein is the only figure in the physical sciences with a stature that can
be compared with Newton. Newton is reported to have said: 'If I have seen
further than other men, it is because I have stood on the shoulders of giants.'
This remark is even more true of Einstein who stood on the shoulders of
Newton. Both Newton and Einstein put forward a theory of mechanics and
a theory of gravity but Einstein was able to base General Relativity on the
mathematical theory of curved spaces that had been constructed by
Riemann while Newton had to develop his own mathematical machinery. It
is therefore appropriate to acclaim Newton as the greatest figure in
mathematical physics and the Principia is his greatest achievement.
2
Newtonianism and today's physics
STEVEN WEINBERG
The first question that I as a physicist should address is : how well does it
work? How does Newtonianism stand up after three hundred years of
experimental testing? It stands up very well. We do now understand that
there are small corrections to the Newtonian picture of gravity and
dynamics. The most interesting and important corrections are those
provided by Einstein's general theory of relativity, but they're awfully small.
For example, in the motion of the earth around the sun, the corrections due
to general relativity are of the order of one part in a hundred million, so
small in fact that they've never been detected. We detect the effects of
relativity only for more rapidly moving bodies like the planet Mercury, and
for light itself.
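As a rough check on the figure of one part in a hundred million, one can evaluate the dimensionless ratio GM/(rc²), a common order-of-magnitude measure of general-relativistic corrections for a body orbiting a mass M at distance r. This is a sketch only: the choice of this ratio, the rounded constants and the function name are assumptions made here, not a calculation given in the text.

```python
# Order-of-magnitude size of general-relativistic corrections: GM / (r c^2).
# All constants are rounded standard values (assumed here for illustration).
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # mass of the sun, kg
C = 2.998e8         # speed of light, m/s


def relativistic_parameter(orbit_radius_m: float) -> float:
    """Return GM_sun / (r c^2) for a body orbiting the sun at radius r."""
    return G * M_SUN / (orbit_radius_m * C**2)


EARTH_ORBIT_M = 1.496e11    # mean earth-sun distance, m
MERCURY_ORBIT_M = 5.79e10   # mean Mercury-sun distance, m

earth = relativistic_parameter(EARTH_ORBIT_M)
mercury = relativistic_parameter(MERCURY_ORBIT_M)

print(f"earth:   {earth:.1e}")
print(f"Mercury: {mercury:.1e}")
```

For the earth the ratio comes out close to 10⁻⁸, in line with the figure quoted above, and it is a few times larger for Mercury, which orbits closer to the sun and moves faster.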
However, whether the corrections are large or small doesn't seem to me to
be the point. In 1919, when Einstein's theory began to be popular, The
Times of London proclaimed that Einstein had disproved Newton. That's
very far from the truth. Einstein's theory reduces to Newton's theory in the
limit of slowly moving bodies at large distances from each other, which is
certainly true of the outer part of the solar system, and indeed of most of the
universe.
In fact, it is fair to say that, not only does Einstein's theory not supplant
Newton's theory, it explains Newton's theory. (I think this is a point that's
not often appreciated.) In Newton's work the inverse square law appears as
a means of accounting for the observations of the solar system, particularly
Kepler's interpretation of Brahe's observations in terms of a relationship
between periods and radii of orbits. For Newton it was just a fact, learned
from experiment, that the gravitational force fell off as the inverse square of
the distance. As far as Newton was concerned, if the force had fallen off as the
inverse cube of the distance or the inverse 2.1 power of the distance that
would have been perfectly fine. There was no explanation in Newton's
theory of why it had to be an inverse square law. This explanation was finally
provided in 1915 and 1916, by Einstein's general theory of relativity. If you
adopt Einstein's insight that the force of gravity is due to a curvature in
spacetime, and follow that insight to where it leads you, you find that you
cannot without standing on your head make a theory of gravity in which the
force at large distances is anything but an inverse square law. (What the
force is at short distances is a very complicated question to which I will
return.)
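The way the inverse square law follows from the observed relationship between periods and radii can be sketched for the simplified case of circular orbits. The restriction to circular orbits and the symbols k, C and n are assumptions introduced here for illustration. A body of mass m on a circle of radius r with period T needs a centripetal force

```latex
\begin{aligned}
F &= \frac{m v^{2}}{r} = \frac{4\pi^{2} m\, r}{T^{2}}
\qquad \left(v = \frac{2\pi r}{T}\right) \\
T^{2} = k\, r^{3} \ &\Longrightarrow\ F = \frac{4\pi^{2} m}{k}\,\frac{1}{r^{2}} \\
F = \frac{C}{r^{n}} \ &\Longrightarrow\ T^{2} = \frac{4\pi^{2} m}{C}\; r^{\,n+1}
\end{aligned}
```

The last line shows why the exponent is an empirical matter in Newton's framework: an inverse 2.1 power would simply have given periods proportional to the 1.55 power of the radius instead of the 1.5 power, and nothing in the theory itself forbids it.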
Newton's theory then works very well, and it now has a rationale that it
lacked in Newton's time, provided by general relativity. We use it to follow
the motion of objects in the solar system, some of them objects that we put
inside a hydrogen atom at any one time, and if we try to find out, the
experiment that we do breaks up the atom, and we are unable to answer the
question that we set out to answer. Because in quantum mechanics one talks
in terms of probabilities, there has grown up especially in the last decade or
so an idea, fostered by some popular books, that quantum mechanics is
somehow closer to a gentler, more mystical, view of nature than the hard,
brutal deterministic view of Newton. Here Newton is being cast as a kind of
cosmic Scrooge who denies particles any volition, whereas the developers of
quantum mechanics in the 1920s are seen as gentle, soft-spoken flower
children who are going to return physics to a universal mysticism.
Nothing could be farther from the truth. Quantum mechanics in fact
provides a completely deterministic view of the evolution of physical states.
In quantum mechanics, if you know the state at one instant, and of course if
you know the machinery of the system, you can then calculate with no lack
of precision what the state is in any subsequent instant. The indeterminacy
of quantum mechanics comes in when human beings try to make
measurements, because in making measurements we inevitably disturb the
system in ways that can't be predicted, but the system itself evolves in a
completely deterministic way.
In a sense a more interesting challenge to the determinism of Newton's
theory came more recently than quantum mechanics, in the last decade or
so, within Newtonian mechanics itself, with the discovery of the importance
of what is called chaos. Chaos has become a technical term, referring to the
practical unpredictability of systems that are often very simple. It has always
been of course understood that if you have to deal with a very complicated
system with many individual parts, such as the stock market or the weather,
it will be impossible, just because the system is so complicated, to predict
what will happen. What has been realized in recent years is that even very
simple systems exhibit behaviour which can best be called chaotic, because it
is for all practical purposes unpredictable. Let me give you an example, and,
since this is the year of Halley's comet, let me give you an example involving
three bodies: a comet, a planet (let's say Jupiter), and the sun. Imagine them
all going around in their orbits, and suppose for simplicity (what is not true
for Halley's comet) that they're all on the same plane, the plane of the
ecliptic. Also for the sake of simplicity forget their finite size, so that we don't
have to worry about whether the comet is going to run into the sun or
Jupiter is going to hit the comet. The comet goes way out very far from the
sun, and then comes back on a very eccentric orbit. Jupiter goes chugging
around the sun in its nearly circular orbit. Every once in a while the two
bodies come close together. When that happens the comet is perturbed by
the gravitational field of Jupiter and goes off in a slightly different direction.
Now, there is no question that if we know with absolute mathematical
precision the motion of the comet at one instant (and if we had an infinitely
capable computer) we would be able to predict it at all future instants. But
that's never the way the world works. In this situation I think you can see
that if we only know the motion of the comet to, say, a tenth of a per cent
before its close encounter with Jupiter then, since what happens when it
makes the close encounter with Jupiter depends so sensitively on exactly at
what angle it approaches the planet, its motion after its encounter with
Jupiter will be more uncertain, let's say by one per cent. (Just take these
numbers as an example.) Then on the next encounter the uncertainty with
which the orbit of the comet is known increases by another factor of ten, and
so on. Clearly no matter what kind of accuracy you have when you first
observe the comet, whether you observe the elements of its orbits to one part
in a million or one part in a hundred million or what you will, after a
hundred or so encounters with Jupiter the comet's motion will be
completely unknowable, because the system puts such stringent demands on
the accuracy of the initial data (and on the accuracy of your computations)
that as time passes eventually no conceivable precision would allow you to
predict what would happen. The system is chaotic; determinism has lost its
point.
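The arithmetic of this example can be sketched in a few lines. The code takes the illustrative factor of ten per encounter literally and counts how many encounters it takes for the relative uncertainty to reach 100 per cent. The function name and the use of exact integer arithmetic are choices made here; the model is a toy, not an orbit calculation.

```python
def encounters_until_unpredictable(one_part_in: int, growth_factor: int = 10) -> int:
    """Count encounters until the uncertainty reaches 100 per cent.

    The initial relative uncertainty is 1 / one_part_in, and each encounter
    multiplies it by growth_factor (the illustrative factor of ten).
    Integers are used so that no rounding error affects the count.
    """
    if one_part_in < 1 or growth_factor < 2:
        raise ValueError("need one_part_in >= 1 and growth_factor >= 2")
    encounters = 0
    amplification = 1  # total factor by which the uncertainty has grown so far
    while amplification < one_part_in:
        amplification *= growth_factor
        encounters += 1
    return encounters


for label, one_part_in in [
    ("a tenth of a per cent", 1_000),
    ("one part in a million", 10**6),
    ("one part in a hundred million", 10**8),
]:
    print(f"{label}: {encounters_until_unpredictable(one_part_in)} encounters")
```

Under these assumptions an initial accuracy of one part in a million is used up after six encounters and one part in a hundred million after eight. Surviving a hundred encounters would require knowing the orbit to one part in 10¹⁰⁰, which is the sense in which no conceivable precision suffices.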
This is a rather obvious example, one of many that have been known for
generations. What has been realized in the last few decades is that chaos is
itself interesting; there are universal rules of chaos. As long as you don't try
to follow a precise trajectory, but think statistically of families of trajectories,
you can describe chaotic behavior in terms that turn out to have common
features for the orbits of comets, for the concentrations of various chemicals
in chaotic chemical reactions, for convection in fluids, and so on.
Now there are international conferences on chaos, and journals devoted
to chaos, but it took a while to realize how interesting chaos is. One of the
reasons it took so long is that our solar system is not particularly chaotic.
The orbits of the planets are quite regular (a good thing for us!). Just in the
last year evidence has come in that one of the moons of Saturn, Hyperion, a
moon a few hundred miles in diameter shaped somewhat like a large
hamburger, is tumbling in what appears to be a chaotic manner. Here
Newtonian mechanics is dealing with what is after all a fairly simple system,
the moons of Saturn, a dozen or so moons going around the planet, the
planet going around the sun. Yet Newtonian mechanics is for all practical
Trevor-Roper has credited Newton and the ideas that flowed from Newton
with the disappearance of the witch craze in Europe in the eighteenth
century. Historians have argued to what extent Newtonianism, the physics
of Newton and his followers, was responsible for the industrial revolution a
century later, and I don't have an opinion of my own. The general view
seems to be that the industrial revolution was made by men like Watt and
Stephenson and Edison who were not very learned in the science of their
time (or even of Newton's time), but who operated within a scientifically
oriented style that had been created by the scientific revolution of the
seventeenth century. It's not surprising that the British historian, Herbert
Butterfield, a historian both of science and of the politics of the seventeenth
century, made the remark, 'Since the rise of Christianity there is no
landmark in history that is worthy to be compared with this'.
But Newton of course only provided us with a glimpse of a really
comprehensive system of nature. Newton knew there were other forces in
nature besides gravitation, and he knew that he didn't know what they were,
but he hoped that the same kind of mathematical reasoning, the same clarity
of vision that had revealed the nature of the force of gravitation through its
role in the solar system, would reveal the nature of the other forces, and their
role governing all the phenomena of nature. In the preface to the first edition
of the Principia, which Newton wrote at Trinity College on 8 May 1686, he
said, 'I wish we could derive the rest of the phenomena of nature by the same
kind of reasoning from mechanical principles, for I am induced by many
reasons to suspect that they may all depend on certain forces.' But he didn't
know what those forces were, and it took a long time to understand what
they were. There was the understanding in the nineteenth century by organic
chemists that the chemicals of living things were not subject to separate
chemical laws, but were subject to the same chemical laws as ordinary
inorganic chemicals. There was the understanding by Darwin and Wallace
that the growth of various species of living things on earth did not have to be
accounted for by some extra biological laws separate from the ordinary laws
of physics and chemistry, but could be accounted for by the operations of
chance on inheritable variations among organisms. There was the
realization by Maxwell that light was not something separate from the rest
of nature, but was a manifestation of oscillating electric and magnetic fields.
And so on. These were all great steps in the development of a unified view
of nature, but by the beginning of the twentieth century we were still a long
way from understanding even the possibility of a really unified view.
Remember the famous remark, made in 1903 by the American physicist
A. A. Michelson. In his book, Light Waves and Their Uses (as quoted by
R. S. Andersen), he said that 'The more important fundamental laws and
facts of physical science have all been discovered, and these are now so firmly
established that the possibility of their ever being supplemented in
consequence of new discoveries is exceedingly remote.' Physicists have
laughed at this ever since. Well, of course Michelson was wrong, but I think
most of those who laugh at this remark may miss the point about where he
went wrong. A certain kind of physics was indeed coming to an end in 1903.
It was the physics of the macroscopic motions of fluids and solid bodies.
Michelson could not possibly have thought that the physicists of his time
had succeeded in explaining chemical forces, for example. To Michelson,
chemistry and physics were two different sciences; it was not the
responsibility of physics to explain chemistry. Michelson's mistake I would
guess was in having too unambitious a view of what would ultimately be
explained by the methods of physics. Indeed it was not until the discovery of
direct evidence that nature is composed of atoms, and the beginning of the
task of pinning down their properties - measuring the charge of the electron,
measuring the mass of the hydrogen atom - in the first few decades of the
twentieth century, that physicists really began to see that the properties of
matter - chemical forces, frictional forces, all the familiar things that happen
when you bang things together, could be accounted for in terms of physical
forces, just the way Newton had hoped.
By 1918, Einstein had concluded that we were well along the path to a
unified view of nature, and in a speech he gave in that year he said, 'The
supreme task of the physicist is to arrive at those universal elementary laws
from which the cosmos can be built up by pure deduction' - just Newton's
hope.
Now there's a bad name for this sort of hope: it's sometimes called
reductionism. In fact there's no question that a naive kind of reductionism
can do a great deal of harm. It is certainly not true that physics is going to
replace the other sciences. The work of the chemists will continue to be done
in terms of chemical phenomena, and chemists will not reduce all of their
efforts to solving the Schrödinger equation of quantum mechanics. It's even
more true of the work of biologists and sociologists and economists. They
will all operate within their own disciplines, because at a certain level of
complexity nature begins to exhibit phenomena that cannot be usefully
reduced to the motion of elementary particles.
We even see this within physics itself. Thermodynamics, the science of
heat, is a separate branch of physics. We do not study the behavior of hot
bodies when they cool by going back at every point to the motions of the
elementary particles within them. It isn't useful; in fact if you could follow
the motion of every elementary particle in a glass of water as the water boils,
you would have an incredible amount of information about the trajectory of
every particle, and nowhere in the mountain of computer tape would you get
the impression that water was boiling.
So reductionism has its limits. Another reductionist fallacy that has to be
avoided is seeing physics as a model for the other sciences, or for human
thought in general. I think a great deal of harm has been done by those from
Herbert Spencer on who tried to see the social sciences for example as being
based on physics as a model. I think physics is a terrible model for the social
sciences. In fact it's probably a terrible model for everything except physics
itself.
Nevertheless, despite all the caveats that I've presented you with about
the dangers of reductionism, still there is a sense that among all the holes in
our understanding of nature, there is a special importance to the hole at the
bottom, that is, in our understanding of the forces which govern the
particles, out of which the atoms, out of which ordinary matter is made.
Of course we are still struggling to complete this understanding. We now
understand the electroweak forces which are responsible for electricity and
magnetism and for the weak nuclear interactions. We understand the strong
nuclear forces which hold together the quarks inside the particles inside the
nucleus of the atom, and we understand some aspects of gravitation. Oddly
enough, of all the forces that we know about, the one that we've studied
longest, gravitation, is the one we understand least. We have a good
mathematical theory of the strong nuclear forces, known as quantum
chromodynamics, developed by about a dozen physicists in the early 1970s.
Quantum chromodynamics is a very satisfactory mathematical theory, in
which there apparently are no mathematical inconsistencies, and which
accounts as far as we can tell for all the phenomena having to do with the
strong nuclear forces, the forces that hold the nucleus of the atom together.
The electroweak forces are also well understood though with certain large
gaps in our understanding, which we hope will be closed by the next
generation of accelerators. These theories make mathematical sense, they've
been reasonably well tested by experiment, and they even have a kind of
compelling quality. They haven't been carefully adjusted to fit the data; they
are what they are, because you can't think of anything else.
Gravity is in a very different position. We have a theory of gravity,
Einstein's theory of general relativity, which as I said reduces to Newton's
theory at large distances and small velocities. This theory of gravity works
very well on the scale of the solar system or the galaxy or here on the scale of
everyday life on the surface of the earth, but it is a theory which when pushed
to very short distances and high energies begins to give mathematical
nonsense. If you ask questions like: What is the contribution to the force
between two particles due to the emission and reabsorption of quanta of
gravitational radiation of arbitrarily short wavelength?, the answer that the
theory gives you is that the force is infinite. It's nonsense - it's clear the
theory is simply breaking down for very short wavelengths. How then do we
make sense of these two great revolutions of the twentieth century? How do
we combine our understanding of gravity with the quantum mechanics
which is supposed to govern nature in the very small?
In recent years there has been the development of a remarkable theory
called superstring theory, which perhaps now for the first time provides a
mathematically consistent theory of gravity. In superstring theory the
fundamental constituents of nature are seen to be not particles or waves, but
little strings, either open or closed, continually joining and breaking
apart. Each string can vibrate in many modes and the vibrations of these
strings are supposed to stand in one-to-one correspondence with the various
species of elementary particles. This is a theory that apparently is free of all
the mathematical inconsistencies of all previous theories of gravity and,
perhaps even more intriguing, this is a theory that not only allows a
mathematically consistent theory of gravity, but can't help it: You can't in
this theory imagine that gravity didn't exist. So we have taken three big
steps. First, Newton described gravity successfully in terms of an inverse
square law, but he could always have imagined that it wasn't an inverse
square law - it was just the data that forced it to be an inverse square law.
Then Einstein in general relativity developed a theory of gravity which
included Newton's as a limiting case, and which explained why in that
limiting case it had to be an inverse square law. But Einstein's general theory
of relativity still left open the question: Well why is there any gravity at all?
Why does space curve? Why isn't space just perfectly flat and there is no
such thing as gravity? Who asked for that? In the superstring theory we have
for the first time a theory in which gravity cannot be left out.
The history of this theory is rather amusing. It was developed as an
attempt to explain strong nuclear forces, at a time before the advent of
quantum chromodynamics. The mathematical physicists who developed it
at the end of the 1960s were terribly embarrassed at the fact that the theory
predicted the existence of a particle which looked like the quantum of
Acknowledgement
Research supported by Robert A. Welch Foundation and NSF grant PHY
8605978. I wish to thank Gerald Holton and Victor Szebehely for their
valuable help in the preparation of the published version of this talk.
3
Newton, quantum theory and reality
ROGER PENROSE
well as under shift of origin and under rotation). This much he made clear in
his Principia. But what evidence is there that Newton at any time shared
Einstein's conviction that physics must be invariant under change to uniform
motion? The common view seems to be that, on the contrary, 'Newton
believed in absolute space' - with its decidedly non-relativistic connotations.
If space (as opposed to spacetime) is indeed absolute, so that it is meaningful
to talk about the same point in space at two different times, then there is an
absolutely preferred state of rest. It then becomes a 'fluke' that Newton's
dynamical laws are Galilei-invariant. There is no need to believe that this
fluke should continue to hold when additional ingredients of physics are
added to the basic dynamical laws. One such additional ingredient is the
velocity of the medium in which light travels, if it indeed be the case that light
merely consists of wave motion in some all-pervading medium. If we insist
that light is merely a wave motion, then we must give up Galilean relativity.
Of course, Lorentz, Einstein and Poincaré have now taught us that there
is a way out of this. Light can still consist of a wave motion - this time, the
waves of oscillating electric and magnetic fields of the Maxwell theory - and
invariance under change to a uniformly moving frame can be maintained.
However, this requires a drastic revision of dynamical laws (for speeds
approaching that of light) and also an overturning of one's very notions of
space and time. It is thus, as we now know, that Galilean relativity can be
given up and replaced by another relativity - the relativity of Einstein and
Poincaré.
I am certainly not attempting to suggest that Newton could have had the
absurd foresight to sense the need, or even the possibility, of such a revision
of basic principles. However, if we do accept that this route was not
effectively open to him, then we see that Newton was presented with a clear
choice: give up Galilean relativity or stick to a corpuscular theory of light.
Now what evidence is there that Newton believed 'in his bones' that some
kind of dynamical relativity must hold? I contend that, despite what appears
to be a common belief to the contrary, such evidence does exist. I refer the
reader to Richard Westfall's fine biography of Newton (Never at Rest, 1980).
On p. 147 Westfall describes how, in January 1665, recorded in private notes
to himself in a work referred to as his Waste Book, the youthful Newton
derived the laws of impact for two perfectly elastic bodies. (Huygens's slightly
earlier treatment of this problem had not yet been published.) Newton's
procedure was to consider the two bodies as a single system, and he came to
the conclusion that their common centre of gravity moves inertially whether
or not the bodies impinge upon one another. Realizing that every impact
can be viewed from the special frame of reference of the common centre of
gravity of the two bodies, Newton appears to have used the impact
behaviour in that frame - in which the behaviour may be regarded as
'obvious' - and then assumed Galilean relativity to derive the elastic impact
law in the general case. This was more than twenty years before the
publication of Principia, but Westfall remarks, concerning the consequent
uniform motion of the centre of gravity, that 'Newton never forgot this
conclusion. It contained the first adumbration of his third law of motion, ...'
- and it seems to have formed one of the central guiding ideas behind
Newton's dynamics, despite the fact that in Principia Newton chose to
derive his dynamics from different axiomatic principles.
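The procedure described above can be sketched in a few lines. This is only an illustrative reconstruction under stated assumptions, not Newton's own working: motion is taken to be one-dimensional, the 'obvious' behaviour in the centre-of-gravity frame is taken to be that each body rebounds with its velocity reversed, and the function name `elastic_impact` and the sample masses and velocities are hypothetical.

```python
from fractions import Fraction

def elastic_impact(m1, v1, m2, v2):
    """One-dimensional perfectly elastic impact, derived via the
    centre-of-gravity frame (an illustrative sketch)."""
    # Velocity of the common centre of gravity; it moves inertially
    # whether or not the bodies collide.
    v_cm = (m1 * v1 + m2 * v2) / (m1 + m2)
    # Galilean change of frame: pass to the centre-of-gravity frame.
    u1, u2 = v1 - v_cm, v2 - v_cm
    # Assumed 'obvious' behaviour in that frame: each body rebounds.
    u1_after, u2_after = -u1, -u2
    # Return to the original frame, assuming the same dynamical laws
    # hold in both frames (Galilean relativity).
    return u1_after + v_cm, u2_after + v_cm

# Hypothetical sample values, exact rational arithmetic.
m1, v1, m2, v2 = Fraction(2), Fraction(3), Fraction(1), Fraction(-1)
w1, w2 = elastic_impact(m1, v1, m2, v2)

momentum_before = m1 * v1 + m2 * v2
momentum_after = m1 * w1 + m2 * w2
energy_before = (m1 * v1**2 + m2 * v2**2) / 2
energy_after = (m1 * w1**2 + m2 * w2**2) / 2
```

The general impact law is never written down directly: it follows from the rebound rule in one special frame together with the two changes of frame, which is where the relativity assumption enters.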
In his account, Westfall does not draw particular attention to the fact that
Newton was implicitly relying on Galilean invariance to derive his impact
law,† although Newton's procedure cannot be used without such an
assumption. Without it, one has no right to 'pass to the special frame of
reference of the common centre of gravity' and to assume that the same
dynamical laws must hold in that frame as in the original frame. It seems to
me to be highly likely that by the age of 23, when Newton produced this
argument, some strong intuition concerning the relativity of motion was
already an essential ingredient of his thinking. It is, after all, a necessary
ingredient of the Copernican-Galilean mode of thought that the dynamics
that we employ at the surface of the Earth should be insensitive to the
Earth's uniform motion about the Sun. Newton was well aware of this, of
course. (It appears that, at least by 1666, Newton had read Galileo's
Dialogues, which appeared in translation in England in 1665.)
† Towards the end of his life, Galileo had claimed a solution of the problem of the
dynamics of impulses. However, no account of this appears to have survived (cf. Galilei,
1638; Dover edn, p. 293). It is intriguing to speculate whether Galileo might have been
guided along similar lines of reasoning.
Although Huygens's treatment of this problem seems to have been a little different
from that of Newton, he also used Galilean invariance (cf. Mach, 1972). It is striking to
note that Huygens did not believe that this Galilean invariance extended to light, in the
sense that I mean it here, since he strongly promoted the wave theory! Huygens's view
seems to have been that the individual particles of the medium might be subject to a
Galilei-invariant dynamics even though the medium as a whole might define a rest
frame. However, this viewpoint attaches an 'accidental' rather than a 'fundamental' role
to this medium - and therefore to light itself. It also presupposes that the Galilei
invariance of the local dynamics of macroscopic bodies is insensitive to the presence of
the medium - another 'accidental' (and, moreover, highly improbable) feature. While the
example of Huygens shows that it was possible for a great scientist of the time to hold to
the wave theory and also to be a professed relativist, this is not totally being a relativist
in the 'deep' sense that I mean here.
It would have been natural for him to suppose that other physical properties (such as the
behaviour of light) should also be governed by the same beautiful invariance
principles. Perhaps, at that time, he had not explicitly formulated, even to
himself, what such 'invariance principles' might entail, but my guess would
be that he knew pretty well within himself what he was about. His treatment
of the problem of impulses seems to me clearly to indicate that he
instinctively felt that a relativity principle must be intrinsic to Nature's ways.
At roughly the same age as he studied the impulse problem, he was also
becoming profoundly interested in the nature of light. On Newton's own
account, he 'had the Theory of Colours' in January 1666 (Westfall, 1980,
p. 156). He may even have had some inkling (as had Galileo before him - cf.
Drake, 1957, p. 278, concerning a remark made in The Assayer) that light
has some profound role to play in governing the minute structure of matter.
One can readily imagine that Newton's great intuition guided him to believe
strongly in some kind of relativity principle encompassing both the
dynamics of matter and the behaviour of light. This intuition would have
guided him surely to a corpuscular rather than to a wave picture of light.
That would have been in the years when his profound views on the nature of
light were first being formed. These years must have set the stage for all his
later thinking. When at a later time in his life he became fully confronted by
the experimental facts of optical interference phenomena, he would have
been reluctant to give up the intuitions that he had built up during this
formative time. I suggest that in his 'bones' he knew that light must be
corpuscular, even though in later years the good reasons for these early
intuitions may well have receded from him. And unlike the stated reasons
that Newton later put forward in his Opticks in favour of a corpuscular
picture of light - which, technically, were 'incorrect' - this unstated one, that
a relativity principle must hold, was, I am claiming, profoundly 'correct'!
When Newton came to write his Principia, some twenty years after these
intuitions must have been formed, he needed to place his logical
development on a clear footing. To state his basic laws in a definite
mathematical way, he needed the framework of 'absolute space' so as to be
able to formulate things precisely. Even with the abstract mathematical
machinery of today, it is not so easy to give a totally Galilean-relativistic
treatment of space, time and dynamics right from the start. It can be done,
but one needs an absolute spacetime, rather than an absolute space. (See
Arnol'd, 1978, for an excellent modern such treatment.) Newton did take
pains to be completely clear that Galilean invariance was indeed a feature of
his laws (cf. Scholium 5, following Definition VIII), but this invariance was
rather subtle. It is clearest for the case of the spin-state space of a spin
one-half massive particle, say an electron. Here each physically distinct state of
spin corresponds to a definite direction in space. This direction is the one
with the property that if the spin of the electron is measured in that direction,
then the answer to the measurement is: yes, with certainty. In the case of
polarization states of a photon, the geometrical role of this sphere is slightly
less direct. Suppose that the photon is moving upwards. Then the north and
south poles correspond, respectively, to the states of right-handed and
left-handed circular polarization. The points around the equator correspond to
the states of linear polarization (Newton's 'sides'), but since a linear
polarization state corresponds to an undirected line element, the two
opposite directions of this line element must correspond to the same point
on the equator. More directly, it is the 'square root' Riemann sphere for
$(\lambda/\mu)^{1/2}$ which has a clear-cut geometrical correspondence with polarization
directions (Stokes vectors).
Sometimes the geometry is not at all direct, such as with the various linear
superposition states of two different positions of a particle. Suppose a
particle is passing through a pair of slits A and B. If the particle is at A, then
we represent this by the north pole of our Riemann sphere. If it is at B, then by
the south pole. The various other points of the sphere correspond to states in
which the particle is partly at A and partly at B. As we know, it is not a
question of mere probabilities of the particle being at A or B, but of
amplitudes of it being at A or B - whatever that might mean! The space of
different possible ways that the particle can be located as it passes through
the pair of slits has the structure of a sphere - not a line-segment, as would be
the case with probabilities (the range of probabilities being the real-line
segment [0, 1]).
formalism, do not try to form pictures and do not ask questions about
reality! This seems to me to be wholly unreasonable. Physics, after all,
constitutes our best way of groping for the true nature of the real world in
which we find ourselves. Some would have us believe that physics is to be
regarded as merely providing a means of effective prediction about the
future. It is true that one of the most powerful shifts in attitude in science
came about with the methodology of Galileo and Newton: form a
mathematical theory from which predictions can be made and test these
predictions against observation. Do not ask what it is that constitutes the
substance of things, nor why things behave as they do. Just ask how they
behave and try to form an elegant mathematical structure that mimics that
behaviour as closely as possible. ('Hypotheses non fingo'!)
However, the power of such methodology should not obscure the fact that
we are still, nevertheless, groping for the realities of the physical world. The
methodology is powerful because it enables us to clear from our minds a
good many preconceived and incorrect notions concerning the nature of this
reality. It does not, and it should not, lead us to believe that we can dispense
with the notion of physical reality altogether - as some would seem to have
us believe! What, after all, is the point of making 'predictions about the
future' unless we are allowed to assume that there is some reality about this
future whose state we are proposing to predict? Despite the operationalistic
nature of the methodology of Galileo and Newton, their dynamical theory
provided a picture of 'reality' which was very much clearer than that which
had gone before. In fact, this picture has become so 'clear' to us that there is
now a great reluctance, even after we have been forced to accept the physical
existence of quantum phenomena, to believe that 'reality' can take any other
form.
It is clear from Newton's Queries in his Opticks, that he was, indeed, well
prepared to speculate on the nature of physical reality and that, in the case of
gravity, the existence of a wonderful mathematical theory did not stop him
from speculating on 'causes' (cf. his refractive all-pervasive medium referred
to above). 'Hypotheses non fingo' did not apply to Opticks (nor, indeed, did
it apply to Newton's earlier thinking)! It seems likely, also, that Newton's
picture of reality was actually not very close to the 'clear' picture of reality
that we have subsequently built up from a familiarity with Newtonian
dynamics.
presents us with. We have seen that, with electron spin, our geometrical
picture of the quantum state-vector is not at all an unreasonable one.
However, even here, many would say that we are deluding ourselves, and to
form such a picture is misleading. Somehow the electron has only two ways
in which its spin can point, not a whole spherical continuum of ways. Any
experiment that we may perform on the electron to measure its spin can give
us but one bit of information: the spin can turn out to be one way, or it can
turn out to be the other, but it can be nothing in between. That is the way
quantum mechanics works.
However, when there are just two such 'alternatives', the space of states is
indeed a whole Riemann sphere. We cannot say which the alternatives are.
They might be up/down or north/south or east/west or anything in between.
There is nothing to choose between any of these pairs of alternatives. But
given any particular state of spin of the electron, there will be precisely one
direction in space for which the state gives certainty that the spin is in that
direction. Given the state, there is a 'reality' about the direction in which the
state points. An observation might be performed measuring the spin in that
direction and, if so, the state has to be prepared to say 'yes', with certainty, to
that measurement. Somehow the state has to 'know' that direction, even
though there is no experiment which can be performed to determine which
direction in space it actually is. So long as we are sticking to standard
quantum mechanics, the different possible 'realities' for the states of spin of
an electron are indeed the points of the Riemann sphere. This is for a system
with just two alternative states. In a sense, the whole Riemann sphere
'counts' as just two!
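The correspondence between spin states and directions can be made concrete with a short sketch. It assumes the standard convention in which a state is written as a pair of complex amplitudes (a, b) for spin up and spin down along a chosen vertical axis; the function names `spin_direction` and `prob_yes` are illustrative and do not come from the text.

```python
def spin_direction(a, b):
    """Unit vector (x, y, z) along which the spin one-half state
    a|up> + b|down> has its spin with certainty (assumed convention)."""
    a, b = complex(a), complex(b)
    norm = abs(a) ** 2 + abs(b) ** 2
    c = a.conjugate() * b
    x = 2 * c.real / norm
    y = 2 * c.imag / norm
    z = (abs(a) ** 2 - abs(b) ** 2) / norm
    return (x, y, z)

def prob_yes(a, b, n):
    """Probability of the answer 'yes' when the spin of the state (a, b)
    is measured along the unit vector n."""
    sx, sy, sz = spin_direction(a, b)
    return 0.5 * (1 + sx * n[0] + sy * n[1] + sz * n[2])

up = spin_direction(1, 0)        # north pole of the sphere
down = spin_direction(0, 1)      # south pole
sideways = spin_direction(1, 1)  # a point on the equator

# Measuring along the state's own direction gives 'yes' with certainty;
# measuring along the opposite direction never gives 'yes'.
own = spin_direction(1, 1j)
opposite = tuple(-component for component in own)
p_own = prob_yes(1, 1j, own)
p_opposite = prob_yes(1, 1j, opposite)
```

Every pair of amplitudes, up to an overall complex factor, lands on exactly one point of the sphere, although any single measurement still yields only one bit.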
For massive particles of higher spin there is also a corresponding
geometrical picture for the states. Take the spin to be n/2, then in place of the
Riemann sphere we have a complex n-dimensional projective space of
different possible physical states. But we can still use the Riemann sphere to
describe the individual states. Each physical state of spin n/2 corresponds
uniquely to an unordered set of n points on the sphere of directions.† A
characterization of these points is that if the spin is measured in that
direction then there is zero probability that the spin turns out to be totally in
the opposite direction. Again, the state is uniquely characterized by the fact
† I am told that this result is due originally to Majorana, but I do not know of a
reference. The result is not hard to see using 2-spinors (cf. Penrose and Rindler, 1984,
p. 162). If $\psi_{AB \ldots N}$ is an n-index symmetric spinor representing the state, then it has a
canonical decomposition $\psi_{AB \ldots N} = \alpha_{(A} \beta_B \cdots \nu_{N)}$ (round brackets denoting
symmetrization), and the n directions are those represented by the spinors $\alpha_A, \beta_B, \ldots, \nu_N$.
then the other instantly acquires an 'objective spin state', namely the state
corresponding to spin with certainty in the opposite direction. This applies
equally to any direction that we may choose. Moreover, there is no
requirement that the two particles be at all close to one another. In principle
they could be hundreds of miles apart. By observing the spin of one of the
particles, we instantly put the other particle into a spin state whose direction
is fixed by our direction of observation on the first particle! (We note that no
message can be sent from one particle to the other by this means.) This is a
decidedly non-local picture of reality.
I am regarding the total state on the right-hand side of these equations as
being 'real'. But how else are we to regard it? We are always at liberty to
choose to measure the spins of neither particle but, instead, to reflect both of
them - carefully, so as not to disturb their spins - back to their original
position so that they can recombine as a spin-zero state. The spin of the
recombined state must indeed be zero with certainty, and the possibility of
observation of this final spin assigns a 'reality' to the non-local spin state of
the combined system. Other situations of this same general nature exhibit a
similar non-locality. This is particularly striking when the two emitted
particles are photons since the effect of observing one of them would, if it
were a signal in the ordinary sense, have to travel faster than light in order to
reach the other one. The experiments of Freedman and Clauser (1972) and
Aspect (1976) and coworkers, etc., have shown that a non-locality of this
type (over a distance of some twelve metres, in the latter case) is an actual
feature of the world we live in and not just a theoretical fiction.
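The statement that observing one particle of a spin-zero pair leaves the other with spin, with certainty, in the opposite direction can be checked numerically. The sketch below assumes the usual singlet state (|up,down> - |down,up>)/sqrt(2) for the combined system; the helper names `spinor`, `direction` and `partner_state` and the sample angles are hypothetical.

```python
import math

# Assumed combined spin-zero state of two spin one-half particles:
# (|up,down> - |down,up>)/sqrt(2), with index 0 = up and 1 = down.
SINGLET = {(0, 1): 1 / math.sqrt(2), (1, 0): -1 / math.sqrt(2)}

def spinor(theta, phi):
    """One-particle state with spin along the polar angles (theta, phi)."""
    return (complex(math.cos(theta / 2)),
            complex(math.cos(phi), math.sin(phi)) * math.sin(theta / 2))

def direction(state):
    """Point on the sphere of directions for a one-particle state (a, b)."""
    a, b = state
    norm = abs(a) ** 2 + abs(b) ** 2
    c = a.conjugate() * b
    return (2 * c.real / norm, 2 * c.imag / norm,
            (abs(a) ** 2 - abs(b) ** 2) / norm)

def partner_state(chi):
    """Unnormalized state left for particle 2 when particle 1 is found
    in the state chi (projection of the singlet onto chi)."""
    return tuple(
        sum(chi[i].conjugate() * SINGLET.get((i, j), 0.0) for i in (0, 1))
        for j in (0, 1)
    )

chi = spinor(1.1, 0.4)           # arbitrary direction of observation
other = partner_state(chi)
n_first = direction(chi)         # direction found for the first particle
n_second = direction(other)      # direction acquired by the second
outcome_probability = abs(other[0]) ** 2 + abs(other[1]) ** 2
```

Whatever angles are chosen, `n_second` comes out as the negative of `n_first`, and the outcome on the first particle has probability one half. That the outcome cannot be chosen in advance is consistent with the remark that no message can be sent by this means.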
things to occur. However, $|\downarrow\rangle$ will normally be 'absurd' - at least in the sense
that it represents an extreme improbability which violates the second law -
this probability being nothing like what would be predicted by the standard
quantum-mechanical computation. For example, one might have a
Stern-Gerlach apparatus preparing spin one-half atoms in a spin-up state ($|\uparrow\rangle$).
Interposed in the beam we insert a second Stern-Gerlach apparatus
oriented at right angles so as to measure the spins in the left/right direction
($|\rightarrow\rangle$, $|\leftarrow\rangle$), and the beam duly splits as required. However, if we try to
imagine the alternative input, corresponding to $|\downarrow\rangle$, we find that, instead of
being the alternative beam of the original Stern-Gerlach apparatus, it is yet
another beam, coming from the laboratory wall or some irrelevant part of
the apparatus! Such a behaviour would normally be thought of as extremely
improbable, if not totally impossible - and not at all that given by the
quantum probabilities applied in the time-reversed sense.
An even more elementary situation of this type occurs if we consider a
photon, coming from a laboratory source (state $|\uparrow\rangle$), and simply reflect it off
a half-silvered mirror, where we set up photo-cells in both the reflected and
transmitted positions (states $|\rightarrow\rangle$ and $|\leftarrow\rangle$, respectively). The state $|\downarrow\rangle$
represents the photon coming from such a direction that if reflected it would
reach the transmitted position and if transmitted it would reach the reflected
position. There would normally be no source in that direction. The photon,
in state $|\downarrow\rangle$, like the atoms above, would have to be just 'absurdly' ejected
from the laboratory wall!
I believe this to be a key issue; and, since it is easily misunderstood, let me
try to be somewhat clearer about the sense in which these probabilities are to
be interpreted. It is necessary to adopt a viewpoint on this which allows
conditional probabilities to be understood in a way that is totally unbiassed
with respect to the direction in which time is taken to be running. Let me
simplify the above experimental set-up even further so that we have just one
photo-cell and one 'lamp' (the photon source). Half-way between the
photo-cell and lamp we have a half-reflecting mirror whose plane is perpendicular
to the line joining photo-cell to lamp. If desired, we can add an ellipsoidal
mirror (a prolate ellipsoid of revolution) surrounding all three, the
photo-cell P being at one focus and the lamp L being at the other, so that we need
not worry about the direction in which the photon enters or leaves the
photo-cell or lamp. (See Fig. 3.1.) There are four possible ways that the
photon can go. It can start at L and end at P, or it can start at L and end at L.
(These are the 'normal' cases, of everyday experience.) Moreover, it can start
at P and end at P, or it can start at P and end at L. (These cases are also
[Fig. 3.1. Diagram labels: half-silvered mirror; ellipsoidal mirror.]
to be entirely stochastic, and not influenced by any other such factors. The
standard rules of quantum mechanics were obtained by observing (in our
own galaxy in the present era!) the way in which the probabilities behave in
the normal direction in time. These particular quantum-mechanical rules for
calculating probabilities simply do not work when used in the reverse
direction in time.
I am proposing to disallow these 'absurd' states (like the in-state $|P\rangle$
above), and simply accept that the objective physical process underlying
state-vector reduction must be time-asymmetrical. (Most such 'absurd'
states, like the in-state $|P\rangle$, would, when traced backwards in time,
ultimately become incompatible with the hypothesis of vanishing Weyl
Fig. 3.2. Spacetime diagrams of the four different ways that the photon can
travel. The top two diagrams depict the 'normal' situation, with the photon
coming from the lamp. The two lower possibilities are needed in order to
complete the list of possible transitions between the quantum states.
[Axis label: Time.]
Fig. 3.3. The entire space-time is examined. If the initial state is selected for,
the final state ratios are compatible with the quantum-mechanical
calculation, but if the final state is selected for, the same calculation gives
quite the wrong answer for the initial state ratios.
[Axis label: Time.]
lost at the singularities, all of which are of the future type (into which matter
can disappear : black holes) rather than of the past type (from which matter
could have emerged : white holes), the latter having to be effectively excluded
owing to the above-mentioned claimed quantum gravity effect. As I have
suggested elsewhere (Penrose, 1981, 1986), it is very plausible to suggest that
the uniting of phase-space paths that result from the above singularity
discussion is exactly balanced by the splitting of phase-space paths which I
argued earlier should result from the proposed objective state-vector
reduction procedure. I am not, of course, arguing for any direct connection
between specific spacetime singularities and specific instances of the
reduction process. Clearly one does not need a black hole in the laboratory
in order to perform a Stern-Gerlach experiment! I am speaking merely of an
overall balance between phase-space loss from spacetime singularities and
phase-space gain arising from state-vector reduction. The argument is that if
the first is a quantum gravity effect, then so also must be the second.
In earlier accounts I have phrased things in terms of some (undefined)
concept of 'gravitational entropy'. My present opinion is that this concept is
both inappropriate and unnecessary for the discussion. (I am grateful to
Boris Zel'dovich and Bob Wald for some very pertinent criticisms of my
earlier view.) Instead, I believe what is required is some measure of
(longitudinal) graviton number. The idea is that when two states in linear
superposition become differently coupled to the gravitational field, to the
extent that the difference between the fields is of the general order of one
(longitudinal) graviton, then failure of linear superposition becomes
significant, and some non-linear procedure (instability?) forces the state out
of the superposition and into one alternative or the other. This procedure, if
it could be found, would provide our sought-for objective state-vector
reduction.
The immediate question that arises is whether the orders of magnitude
involved make this seem at all plausible. While I have not been able to
explore this question in a great deal of detail, it seems to me that the signs
are not altogether discouraging. Let me give an example, which I shall try
to describe according to the kind of picture that I have in mind. Suppose we
have an atom surrounded, at some distance, by a particle detector - say a
cloud chamber or a bubble chamber. The atom decays, emitting a charged
particle. As the particle moves away from the atom, its reality consists of a
spherical wave moving outwards and centred on the atom. Quantum
linearity holds excellently at this level, and the wave can, if preferred, be
viewed as a linear superposition of straight particle tracks directed outwards
from the atom. As these tracks enter the detector, they each cause streaks of
ionization which in turn cause trails of droplets, or bubbles, to be formed.
The reality of the situation is still a complex linear combination of all these
possible different trails together, so long as the movements of energy are
sufficiently small that the coupling to gravity is significantly below the level
of one graviton. Eventually, however, when the droplets, or bubbles, grow
to a sufficient size, the differences between the gravitational fields produced
by the different linearly superposed trails will grow to become
quantum-mechanically significant - by which I mean that the (longitudinal) graviton
count for the differences between the various gravitational fields of the
individual trails reaches order unity. When this happens, non-linear effects
become important and the system 'flops' into a state in which the spacetime
geometry is (at the level of one graviton or so) well defined. Just one of the
complex-linearly superposed spacetime geometries is singled out, and so just
one of the possible trails reaches realization. The idea is that, in a sense,
Nature abhors complex linear combinations of differing spacetime
geometries!
One must do a calculation to see if this is in any way plausible. My own
crude attempts at this have been superseded by a calculation performed by
Abhay Ashtekar. Let us, for simplicity, assume that we have just a single
droplet. (The case of a bubble is essentially similar: we simply count its mass
as negative - which occurs in the formula squared. This mass is the difference
between the mass of the bubble and that of the ambient medium.) Let the
droplet's mass be m and its radius be r. Let the radius of the material from
which the droplet has condensed be R. Take everything to be spherically
symmetrical, and suppose, crudely, that there is a region of vacuum formed
between the radii R and r. Adapting, to the case of (linearized) gravity,
Zel'dovich's (1966) procedure for estimating the number of photons in a
classical free electromagnetic field and applying the expression to the
situation of a static field with source (where the procedure does not strictly
apply - but what else can one do?), and adopting a particular gauge
condition, Ashtekar obtains the following expression for the expectation
value of the number of (longitudinal) gravitons arising in the region between
radii r and R (this being the region where the linearized Weyl curvature is
non-zero) :
768rc3(m/m p ) 2 log(R/r).
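As a rough numerical illustration of the graviton-count expression just quoted, the following sketch solves for the droplet mass at which the count reaches unity. It assumes that the reference mass in the formula is the Planck mass (the CODATA value is used) and takes log(R/r) = 1 purely for illustration; neither choice is stated in the text, and the function names are hypothetical.

```python
import math

# Assumption: the reference mass in the formula is the Planck mass (CODATA value, kg).
PLANCK_MASS_KG = 2.176434e-8


def graviton_count(m_kg, R_over_r):
    """Expectation value 768*pi^3*(m/m_P)^2*log(R/r), as quoted in the text."""
    return 768 * math.pi**3 * (m_kg / PLANCK_MASS_KG) ** 2 * math.log(R_over_r)


def mass_for_unit_count(R_over_r):
    """Droplet mass (kg) at which the graviton count equals one."""
    return PLANCK_MASS_KG / math.sqrt(768 * math.pi**3 * math.log(R_over_r))


# log(R/r) = 1 is an illustrative choice, not a value taken from the text.
threshold_mass = mass_for_unit_count(math.e)
print(f"mass for unit graviton count: {threshold_mass:.2e} kg")
```

Because the mass enters squared and the radii only logarithmically, the threshold mass is insensitive to the precise ratio R/r.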
observed, and this instantly affects the state of the other photon, putting that
one into a state which becomes subject to the observation on it which
immediately follows. However, this picture is not at all Lorentz invariant. If
we allow a different Lorentz frame to describe the situation, we can arrange
that it is the observation on this second photon which occurs first and causes
the partial reduction, the result of which 'then' becomes subject to the
original observation on the first photon. In this second Lorentz frame our
picture of 'reality' is completely different from the one in the first frame -
where I am taking the conventional description in terms of evolving and
reducing state-vectors to represent 'reality'.
At this point we could simply abandon relativity for our picture of reality,
if we wish. There would be no actual conflict with the results of experiment if
we do so. But there is conflict with the spirit of relativity in such a procedure
(cf. Bell, 1987). Once we adopt a non-relativistic view for our picture of
reality, we find it to be a 'fluke' that relativity actually holds. To adopt such a
view would be to ignore the insight that Einstein impressed upon us in 1905
- and also the closely related insight that I am suggesting guided the
youthful Newton to an important aspect of his views on the nature of light.
This is the insight that relativity should be taken as a principle rather than as
an accidental feature of the precise form of the physical laws. It may be
argued that cosmology (and, indeed, the big-bang singularity) now supplies
us with a rest-frame, so the case is certainly debatable. Nevertheless, my own
money, for what it is worth, would go on a description of reality which, at
this level, should indeed be relativistically invariant. The best suggestion
that I can make at this stage would be for a picture involving some sort of
partially formed, partially bifurcating spacetime, where the nature of the
spacetime has not been adequately resolved until the second photon
observation has taken place. I have been suggesting, after all, that the
reduction process should be an intimate feature of the interrelation between
quantum mechanics and general relativity. General relativity describes the
structure of spacetime, so, when quantum mechanics becomes intimately
involved, we must expect some radically different pre-spacetime notions to
enter the picture.
Another puzzle concerning the reduction procedure is the one mentioned
above concerning 'null measurements'. Not observing a particle in a
detector can effect a partial reduction of the state-vector. On the view that I
am putting forward, this is not unreasonable. The spacetime geometry
which gets coupled to the situation resulting from the particle entering the
detector differs from the geometry resulting from it not entering the detector,
and their difference might well reach the one-graviton level. At this point
Nature would have to choose between (partial?) spacetime geometries. She
might well choose the geometry in which the particle has not entered the
detector - and that choice would non-locally affect the state elsewhere.
How is one to develop such a non-local quantum spacetime theory? I
have my own ideas about what might be the fruitful line to pursue, but it
would not be appropriate for me to try to enter into these here since the
ideas, even after very many years of labour, are still much too ill-formulated
to suggest any reliable insights. It will surprise no-one if I say that it is still
my opinion that the formalism of twistor theory, with its essentially
non-local description of spacetime ideas, ought to provide an important input to
such a quantum spacetime theory (cf. Penrose and Rindler, 1986). Also, I
still have some hankering after ideas that I entertained many years ago in
evolving a theory of spin-networks (Penrose, 1971, 1972). Thought
experiments of the Bohm-Einstein-Podolsky-Rosen type had an important
motivational role to play in that theory, and the idea was to build up the
concept of space as a limiting implicit structure when large numbers of
particles are involved. However, neither twistor theory nor spin-network
theory has any time-asymmetrical ingredients, as things stand. It is clear to
me that some essential new input is needed.
Whatever the future may hold for the development of physical theory, it is
clear that quantum theory, relativity, and even Maxwell's electromagnetic
theory, have led us - or should have led us, if we have examined the facts - to
a view of reality very different from that presented by classical Newtonian
particle mechanics. From his writings, it is equally clear that Isaac Newton
was not, in the strict sense, a 'classical Newtonian'. His view of Nature
contained very much more mystery than that which the standard picture of
Newtonian mechanics now conjures up. Perhaps his readiness to put
forward such a strange corpuscular yet wave-like conception of light is
indicative of his belief in the depths of Nature's mysteries. If so, he was
certainly right!
Acknowledgements
I am grateful to Abhay Ashtekar and Dipankar Home for valuable
suggestions, and to Karel Kuchar for very helpful criticisms of the
manuscript. I thank, also, the Institute for Theoretical Physics at the
University of California in Santa Barbara for its hospitality and stimulation,
where discussions with D. Boulware, I. Bialynicki-Birula, E. T. Newman,
W. G. Unruh and R. Wald were most valuable. This research was supported
References
Penrose, R. and Rindler, W. (1984). Spinors and Space-Time, Volume 1. Cambridge
University Press: Cambridge.
Penrose, R. and Rindler, W. (1986). Spinors and Space-Time, Volume 2. Cambridge
University Press: Cambridge.
Westfall, R. S. (1980). Never at Rest. Cambridge University Press: Cambridge.
Wheeler, J. A. (1964). Geometrodynamics and the issue of the final state. In
Relativity, Groups and Topology: the 1963 Les Houches Lectures, ed. B. S. DeWitt
and C. M. DeWitt. Gordon and Breach: New York.
Wigner, E. P. (1961). Remarks on the mind-body question. In The Scientist Speculates,
ed. I. J. Good. Heinemann: London.
Zel'dovich, Ya. B. (1966). Sov. Phys. Dokl., 10, 771-2.
4.1 Introduction
Newton's law of gravitation has long intrigued and baffled experimenters
and theorists alike by its simplicity and generality and by the failure of all
attempts to establish any departure from it or any dependence on
extraneous circumstances. The only known deviation is that consequent
upon general relativity in the neighbourhood of a massive body which leads,
for example, to the anomalous precession of the perihelion of Mercury. The
situation is in contrast to that of the force between electrically charged
particles where the inverse square law attraction between isolated stationary
particles is modified by the presence of other material bodies and by the
velocities of particles. Those deviations from the simple Coulomb law have
enabled the nature of the electromagnetic force and the electrical structure of
materials to be studied by experiment and, in the absence of similar effects in
gravitation, the gravitational force resists experimental investigation.
It may be that gravitation truly is nothing but a manifestation of the
geometrical structure of nature as set out in general relativity, but it may be
that the reason that no other effects have so far been detected is that they are
very small and that experiments are insufficiently sensitive. As will appear
later on, experiments on gravitation are indeed much more difficult than
those on electromagnetism, for two reasons: first, the gravitational force is
only $10^{-40}$ of the electrostatic force between baryons, so that disturbing
forces are much more troublesome; and, secondly, the techniques available
for mechanical measurements are far less delicate than those for electrical
measurements, in which very sensitive electronic devices can be used. Thus,
for example, the inverse square law variation of gravitation has been shown
sphere drawn about a point source being proportional to the square of the
radius, or to the proper distance from the source being independent of
source strength, for Gauss's theorem shows that in a massless field the
source strength is equal to the integral of the force over a sphere drawn
around the source and, if the inverse square law is to apply, the area of the
sphere must be proportional to the square of the radius. In the PPN
formulation, the proper distance from a source of potential U is $r(1 + \gamma U)$
where $\gamma$ is one of the PPN parameters (see below); if $\gamma$ is zero, the proper
distance is independent of U and then the area of a sphere is proportional to
the square of the radius.
The condition that the field should be massless is not satisfied by, for
example, a Yukawa potential; effectively the total energy stored within a
sphere now depends on the size of the sphere, even with Euclidean geometry.
The electromagnetic field is massless as is shown by the fact that
electromagnetic waves in vacuum have no dispersion. The electrostatic law
of force is also inverse square to very close limits, as shown experimentally
by the absence of any detectable field outside a closed conductor. It may
therefore be inferred that the geometry is locally Euclidean, at least for the
values of masses that can be placed in a terrestrial laboratory.
If, then, it were found that the inverse square law did not hold for
gravitation, it would follow that the gravitational field carried mass, and the
analysis of experiments on the inverse square law is then naturally made in
terms of a Yukawa potential (Fujii, 1971, 1972)
$$-\frac{k}{r}\left(1 + \alpha e^{-\mu r}\right).$$
Experiments on the inverse square law seek to establish limits on $\alpha$
and $\mu$.
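To make the role of the two Yukawa parameters concrete, the sketch below evaluates the potential quoted above and the radial force obtained by differentiating it. The force expression and the "effective coupling" ratio are derived here from the potential and are not given in the text; the function names and the sample parameter values are hypothetical.

```python
import math


def yukawa_potential(r, k=1.0, alpha=0.0, mu=0.0):
    """Potential -(k/r) * (1 + alpha * exp(-mu * r)) from the text."""
    return -(k / r) * (1.0 + alpha * math.exp(-mu * r))


def yukawa_force(r, k=1.0, alpha=0.0, mu=0.0):
    """Radial force -dV/dr (negative means attractive), derived from the potential."""
    return -(k / r**2) * (1.0 + alpha * (1.0 + mu * r) * math.exp(-mu * r))


def effective_coupling_ratio(r, alpha, mu):
    """Ratio of the force to the pure inverse-square force -k/r^2."""
    return 1.0 + alpha * (1.0 + mu * r) * math.exp(-mu * r)


# Illustrative values only: alpha = 0.01 and a range 1/mu of 100 length units.
for r in (1.0, 100.0, 10000.0):
    print(r, effective_coupling_ratio(r, alpha=0.01, mu=0.01))
```

At separations much smaller than 1/mu the force is inverse-square with coupling k(1 + alpha), and at separations much larger it is inverse-square with coupling k, so a departure shows up mainly when the separation is comparable with 1/mu.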
The general structure of the PPN formulation of metric theories of
gravitation has been described by Will (1981). The theories are metric
because the motion of a particle is completely determined by the metric of
the geometry, a metric which is specified by ten parameters.
The elements of the metric form are
$$g_{00} = -1 + 2U - 2\beta U^2 - 2\xi\Phi_W + (2\gamma + 2 + \alpha_3 + \zeta_1 - 2\xi)\Phi_1 + \cdots$$
relating the (active) mass M of a primary to the semi-major axis, a, and mean
motion, n, of a secondary would not be followed, for it assumes that the
inertial mass, $m_i$, and (passive) gravitational mass, $m_g$, of the secondary are
the same. Without that assumption,
$$GM = a^3 n^2 (m_g/m_i)$$
and if secondaries differed in the ratio $m_g/m_i$ (as, say, between the Earth and
Jupiter) there might be detectable differences.
The weak equivalence principle does not hold in all PPN theories, for the
ratios of passive and active gravitational masses ($m_P$ and $m_A$) to inertial
mass, $m_i$, are given by (Will, 1981)
$$m_P/m_i = 1 + \left(4\beta - \gamma - 3 - \tfrac{10}{3}\xi - \alpha_1 + \tfrac{2}{3}\alpha_2 - \tfrac{2}{3}\zeta_1 - \tfrac{1}{3}\zeta_2\right)\Omega/m$$
and
$$m_A/m_i = 1 + \left(4\beta - \gamma - 3 - \tfrac{10}{3}\xi - \tfrac{1}{2}\alpha_3 - \tfrac{1}{3}\zeta_1 - 2\zeta_2\right)\Omega/m + \zeta_3 E'/m' - \left(\tfrac{3}{2}\alpha_3 + \zeta_1 - 3\zeta_4\right)P'/m',$$
where $m_i$ is the total mass energy of the body (rest energy, internal kinetic,
potential and gravitational energy), $\Omega$ is the internal self-energy of
gravitation, $E'$ is the integral of internal energy and $P'$ that of pressure for the
attracted body of mass $m'$. In general relativity $4\beta = \gamma + 3$ and all other
parameters are zero, so $m_A = m_P = m_i$. Experimental studies of the weak
equivalence principle are crucial to gravitational physics.
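The statement that the passive-to-inertial mass ratio reduces to unity in general relativity can be checked with a few lines of exact arithmetic. The sketch below assumes the coefficient of the self-energy term as tabulated by Will (1981), with numerical fractions 10/3, 2/3, 2/3 and 1/3 that should be checked against that source; the function names and the sample self-energy fraction are hypothetical.

```python
from fractions import Fraction as F


def passive_mass_coefficient(beta, gamma, xi=0, alpha1=0, alpha2=0, zeta1=0, zeta2=0):
    """Coefficient multiplying Omega/m in the ratio m_P/m_i.

    Assumed form: 4*beta - gamma - 3 - (10/3)*xi - alpha1
                  + (2/3)*alpha2 - (2/3)*zeta1 - (1/3)*zeta2.
    """
    return (4 * beta - gamma - 3 - F(10, 3) * xi - alpha1
            + F(2, 3) * alpha2 - F(2, 3) * zeta1 - F(1, 3) * zeta2)


def passive_to_inertial_ratio(omega_over_m, **ppn):
    """m_P/m_i for a body whose gravitational self-energy fraction is omega_over_m."""
    return 1 + passive_mass_coefficient(**ppn) * omega_over_m


# General relativity: beta = gamma = 1, all other parameters zero.
print(passive_mass_coefficient(beta=1, gamma=1))
# A hypothetical theory with beta slightly different from 1, and an
# illustrative self-energy fraction of -5e-10.
print(passive_to_inertial_ratio(-5e-10, beta=F(101, 100), gamma=1))
```

Because the departure from unity is multiplied by the self-energy fraction, it is only appreciable for bodies of planetary size or larger, which is why laboratory test masses cannot probe this combination of parameters.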
The crucial observations that first established the validity of general
relativity were those on the deflexion of light in the gravitational field of the
Sun and the anomalous advance of the perihelion of Mercury,
corresponding to a term in the potential of the Sun proportional to $1/r^2$. The
deflexion of light may be seen as the consequence of a reduction of the speed
of light near a massive body; the reduction, which is proportional to $\tfrac{1}{2}(1 + \gamma)$,
may be observed either through the shift of the apparent direction of a
source or through an increase in the time of passage of light. Both
observations can now be made rather precisely by radio means, long-baseline
interferometry for direction (Fomalont and Sramek, 1977), and radar
measurements (Shapiro et al., 1972) and radio transponders (Anderson et
al., 1975; Reasenberg et al., 1979) for time delays. The time delay
measurements are the most precise and show that $\gamma$ is 1 to within 0.1 per cent
(Will, 1981) in accordance with general relativity. The overall precession of
the perihelion of Mercury is obtained both from long series of optical
observations (to within 1 per cent) and from radar observations of the
distance of Mercury from the Earth (to within 0.5 per cent) (Shapiro et al.,
1972; Shapiro et al., 1976). In Newtonian theory, the attractions of the other
planets contribute to the precession, as does the quadrupole moment of the
Sun and there is an effect of the general precession of the equinox. The
relativistic effect is the residue. The planetary and equinoctial terms are well
known, but the value of the quadrupole moment of the Sun has been
questioned. The relativistic precession is proportional to $\tfrac{1}{3}(2 + 2\gamma - \beta)$ and
with the best radar results for the overall precession and discounting doubts
about the internal constitution of the Sun, $\beta$ is found to be $0.991 \pm 0.015$,
again in accordance with general relativity.
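The size of the relativistic precession can be estimated numerically. The sketch below multiplies the PPN factor (2 + 2γ - β)/3 by the standard general-relativistic advance per orbit, 6πGM/[c²a(1 - e²)]; that formula and the orbital elements of Mercury used here are standard reference values assumed for illustration, not figures taken from the text.

```python
import math

# Assumed reference values (not from the text).
GM_SUN = 1.32712440018e20    # m^3 s^-2
C = 299792458.0              # m s^-1
A_MERCURY = 5.7909e10        # semi-major axis, m
E_MERCURY = 0.20563          # eccentricity
PERIOD_DAYS = 87.969         # orbital period, days


def perihelion_advance_arcsec_per_century(beta=1.0, gamma=1.0):
    """Relativistic perihelion advance of Mercury in arc seconds per century."""
    ppn_factor = (2.0 + 2.0 * gamma - beta) / 3.0
    per_orbit_rad = ppn_factor * 6.0 * math.pi * GM_SUN / (
        C**2 * A_MERCURY * (1.0 - E_MERCURY**2))
    orbits_per_century = 36525.0 / PERIOD_DAYS
    return math.degrees(per_orbit_rad * orbits_per_century) * 3600.0


print(perihelion_advance_arcsec_per_century())            # general relativity
print(perihelion_advance_arcsec_per_century(beta=0.991))  # beta at the quoted central value
```

Since β enters only through the factor (2 + 2γ - β)/3, a fractional uncertainty in the measured precession translates into an uncertainty about three times larger in β, which is consistent with the order of the error quoted above.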
It thus appears that those PPN parameters, $\beta$ and $\gamma$, which are non-zero in
general relativity, have the expected values of 1. There is also nothing in the
motions of planets or satellites which requires there to be a preferred frame
of reference for the solar system. In the PPN formulation, the equations of
motion contain terms proportional to the velocity relative to the preferred
frame, multiplied by various linear combinations of the parameters $\alpha_1$, $\alpha_2$
and $\alpha_3$. The absence of preferred frame effects means that $\alpha_1$, $\alpha_2$ and $\alpha_3$ are all
zero. Evidence from Earth tides (Section 4.5) also sets limits on the $\alpha$s.
has also been used for the non-Newtonian force; the two are equivalent at
short distances if
$$\epsilon r_0^{-1} = \alpha\mu.$$
[Figure: two panels showing regions marked 'Allowed' and 'Excluded', with curves labelled 'Chen et al.' and 'null', plotted against $\lg(\mu^{-1}/\mathrm{m})$.]
they are less precise than the three just described (Panov and Frontov, 1979;
Ogawa et al., 1982; Chan et al., 1982). Experiments at greater distances have
also been attempted (Yu et al., 1979) while Stacey and Tuck (1981) have
argued from geophysical measurements that the corresponding value of G is
some 0.5-1.0 per cent greater than laboratory determinations. However, the
precision of these results also is poor (see also next section).
suspension to the line from the Sun to the ith body, projected upon the plane
of the triangle of mass.
The gravitational torque about the suspension fibre is then
$$\frac{G M_\odot}{R^2}(m_1 p_1 + m_2 p_2 + m_3 p_3).$$
[Figure: surviving labels 'Al' and 'Sun'.]
the ratio of torques will not be constant, but will vary with a period of 24 h.
Roll, Krotkov and Dicke combined a very sensitive detector of the angular
position of the triangular frame with electrodes fed with a voltage to apply a
restoring force, so that the angular position of the frame was kept constant
as the Earth rotated; a record was made of the voltage that had to be
applied.
One of the main problems that had to be faced was that many disturbing
effects depend on the position of the Sun, whether directly through the
diurnal variation of the temperature or otherwise, so that very great care
had to be taken to isolate the apparatus (in a special vault) to reduce
extraneous effects of diurnal period so far as possible. The limit put upon any
difference of the ratio of gravitational to inertial mass between gold and
aluminium was 1 part in $10^{11}$, but an unexplained feature of the
observations is that there was a variation of torque with a semi-diurnal
period. Subsequently Braginski and Panov (1972) carried out a similar
experiment but with improved discrimination, so that they were able to set a
limit of 1 part in $10^{12}$ on any difference between materials.
Another experiment was started about 1979 (Keiser and Faller, 1979) in
which the test masses are not hung from a fibre but are floated on water at
the temperature of maximum density and are centred by electrostatic forces.
Preliminary results were encouraging but a final result has still to be
announced.
Eötvös's main experiments were different from the later ones, in that
the gravitational attraction was that of the Earth and the inertial force that
of rotation about the Earth's polar axis. The geometry is shown in Fig. 4.3. The
forces acting on the masses are g, the gravitational attraction of the Earth
which is towards the centre of mass, and a, the centrifugal force which is
perpendicular to the axis of rotation. The resultant of the two is the force of
gravity as commonly understood and the suspension fibre is tangential to
the resultant; that direction is not the direction of gravitational attraction
unless the centrifugal force is parallel to it (on the equator) or is zero (at the
poles).
Let $\alpha$ be the angle between the gravitational and resultant forces and let $\theta$
be that between the centrifugal and resultant forces; $\theta$ is very nearly 180°
minus the latitude. If the beam of the torsion balance hangs east-west, the
net torque on one mass is
$$g m_g l \sin\alpha - a m_i l \cos\theta \sin\theta,$$
where $l$ is the arm of the beam, $m_g$ is the gravitational mass and $m_i$ is the inertial
mass. With two masses, one at each end of the beam, the net torque is
$$g(m_{1g} l_1 - m_{2g} l_2)\sin\alpha - \tfrac{1}{2}a(m_{1i} l_1 - m_{2i} l_2)\sin 2\theta.$$
If the apparatus is turned round through 180° the signs of the differences
$(m_{1g} l_1 - m_{2g} l_2)$ and $(m_{1i} l_1 - m_{2i} l_2)$ are reversed. If the two factors are equal
there is no change of torque on reversing the balance, but if they differ
because the ratios $(m_g/m_i)_1$ and $(m_g/m_i)_2$ are unequal, there will be a net
change of torque.
Eötvös, Pekar and Fekete (1922) examined a number of pairs of different
substances and found no difference of gravitational mass relative to inertial
mass to exceed 1 part in $10^9$ and their results were confirmed in a later set of
experiments by Renner (1935).
Recently, the results have been reexamined by Fishbach et al. (1986).
[Fig. 4.3: geometry of the Eötvös experiment, showing North, the gravitational attraction g, the centrifugal force a and their resultant g + a.]
detected by the inverse square law experiments of Section 4.3. They would
also not affect the experiment of Roll, Krotkov and Dicke (1964), nor of
Braginski and Panov (1972) for there r is the distance to the Sun and $e^{-\mu r}$ is
negligible. There might, Fishbach et al. argue, be some effect in the original
Eötvös experiment, because in it a substantial part of the attraction on the
test masses, a few parts in $10^3$, comes from terrestrial matter within a few
hundred metres of the apparatus. Now the baryon number of material of
given mass varies with atomic mass on account of different binding energies,
and so a failure of the weak equivalence principle might be expected in the
circumstances of the Eötvös experiment. Fishbach et al. argue that such a
failure is indeed shown if the results are appropriately analysed, but Keyser,
Niebauer and Faller (1986) maintain that the analysis is faulty, that the
results of the further Eötvös type of experiment by Renner (1935) are
incorrectly neglected, and that the discrepancy as isolated by Fishbach et al.
is inconsistent with that needed to explain the geophysical data. The status
of the suggestion of Fishbach et al. is thus doubtful.
It should be noted that the force postulated by Fishbach et al., involving a
new type of hypercharge, is different from that proposed by Fujii (1971,
1972) which is purely gravitational.
The implications of the weak equivalence results are rather far-reaching.
Thus Dicke has pointed out that, since the relative numbers of protons and
neutrons differ in aluminium and gold, there is no detectable difference
between the attraction of the Sun upon protons and neutrons. Further, the
internal velocities at a given temperature are different because the atomic
masses are different and so the gravitational force cannot depend on atomic
velocities. (The kinetic energies are of course the same, so the experiments
say nothing about a potential proportional to kinetic energy.)
in $g_{0i}$:
$$-\tfrac{1}{2}(\alpha_1 - 2\alpha_2)\,w^i U - \alpha_2 w^j U_{ij},$$
and as a result (Will, 1981) the measured constant of gravitation, as found
from the acceleration of a test mass towards a source mass, contains the term
$$\tfrac{1}{2}G\alpha_2(1 - 3I/mr^2)(\mathbf{w}\cdot\mathbf{n})^2 - \tfrac{1}{2}G[\alpha_1 - \alpha_3 - \alpha_2(1 - I/mr^2)]w^2,$$
where $\mathbf{n}$ is the unit vector along the line joining the two bodies and m, r and I
are the rest mass, radius and moment of inertia of the source mass, supposed
spherical.
If the source density is uniform, $I = \tfrac{2}{5}mr^2$ and the first term is
$$-\tfrac{1}{10}G\alpha_2(\mathbf{w}\cdot\mathbf{n})^2$$
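The numerical coefficient of the first term for a uniform sphere follows from simple arithmetic, which the sketch below carries out exactly. It assumes that the term in the square of the projected velocity carries the factor one half times (1 - 3I/mr²), as in the expression for the measured constant of gravitation; the function name and the thin-shell comparison are illustrative additions.

```python
from fractions import Fraction


def first_term_coefficient(inertia_factor):
    """Coefficient of G*alpha_2*(w.n)^2 when I = inertia_factor * m * r^2.

    Assumed form of the term: (1/2) * (1 - 3*I/(m*r^2)).
    """
    return Fraction(1, 2) * (1 - 3 * inertia_factor)


uniform_sphere = first_term_coefficient(Fraction(2, 5))  # I = (2/5) m r^2
thin_shell = first_term_coefficient(Fraction(2, 3))      # I = (2/3) m r^2, for comparison
print(uniform_sphere)  # -1/10
print(thin_shell)      # -1/2
```

The coefficient vanishes only for a source with I = mr²/3, so for any realistic spherical source mass the term survives and its size depends on how the mass is distributed.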
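$$w^2 = w_0^2 + v^2 + 2\mathbf{w}_0\cdot\mathbf{v}$$
and
$$(\mathbf{w}\cdot\mathbf{n})^2 = [(\mathbf{w}_0 + \mathbf{v})\cdot\mathbf{n}]^2 = (\mathbf{w}_0\cdot\mathbf{n})^2 + (\mathbf{v}\cdot\mathbf{n})^2 + 2(\mathbf{w}_0\cdot\mathbf{n})(\mathbf{v}\cdot\mathbf{n}).$$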
the two other periods, all thus readily identified in the observations.
No such experiment has yet been performed and the utility of such experiments would
depend on whether they could attain appreciably better limits on the PPN
parameters than are set by observations on the Earth tides. As the Earth
rotates, the potential at any point on the surface varies because of the
changing positions of the Sun and Moon relative to the point, and because a
point on the surface does not coincide with the centre of mass of the Earth,
where the gravitational attraction of the Sun (or Moon) balances the inertial
acceleration in the orbit. The changing potential at the surface is the tidal
potential and can be calculated very accurately on Newtonian principles.
The tidal potential is about 1 part in $10^7$ of the Earth's own attraction, and
gives rise to a variation in the magnitude of the local gravitational
acceleration of about 1 part in $10^7$, again accurately calculable. It is,
however, not the only effect. The Earth, being elastic, yields under the tidal
stresses, the direct consequence of which is that the radial distance of the
surface from the centre of mass changes, and hence also the gravitational
acceleration due to the mass of the Earth. The indirect consequence is that
the moment of inertia (and with it higher multipole moments) changes with the
tidal potential, giving rise to a further change in gravity at the surface. The
elastic yielding of the Earth was at one time somewhat uncertain, but the
present knowledge of the distribution of density and elastic moduli within
the Earth (see Bullen and Bolt, 1985) is now such that the gravitational
effects of tidal yielding can be calculated quite well. A third effect is less
certain. The largest motions produced by the tidal potential are of course the
tides in the oceans, and they in turn have a direct and indirect effect upon the
surface gravity. The gravitational attraction at any point includes a
contribution from the oceans, and that changes with the tides. In addition,
the pressure exerted by the water upon the surface of the solid Earth has a
tidal variation which leads to an additional distortion of the solid Earth, and
so to a further change in surface gravity. Finally, the tidal displacement of
the sea bed gives rise to yet further components of the ocean tides. All these
effects are again of the order of 1 part in $10^7$ of surface gravity, but the ocean
tides and their effects cannot be calculated so well as those of the solid Earth
assuming the oceans absent. The ocean basins have complicated shapes, and
elaborate numerical integrations of the equations of motion are required for
meaningful results. It has also, until recently, been difficult to check the
solutions. Tides around the ocean basins are naturally well known, but
those far from land , which contribute most to the various gravitational
effects, could not be observed until fairly recently.
If there were a preferred frame of reference or a preferred location, the
local surface acceleration due to the gravitational attraction of the Earth as a
whole would vary as the Earth rotated and altered the alignment of the
direction of the centre of the Earth, and the position of the surface point
relative to the preferred position. Will (1981) gives a thorough analysis of the
respective changes in gravity. The variations would mimic part of the Earth
tides and because the Earth tides are about 1 part in $10^7$ of gravity, it can be
said that any preferred frame or location effects must be less than 1 part in
$10^7$, and in fact less by at least one order of magnitude since to within an
order of magnitude the observed tidal effects agree with the calculations on
the best geophysical models.
The measurement of variations of 1 part in $10^8$ in local gravity is
nowadays not so difficult, and the measurements themselves are not the
limitation in setting bounds on the PPN parameters. Spring gravity meters
as are commonly used in geophysical surveys are sufficiently sensitive, but
need to be provided with automatic recorders and isolated from mechanical
and thermal disturbance. With such precautions the drift of the gravity
meter can be kept low, as is necessary for good tidal observations. Some
special instruments have been constructed for tidal observations, in
particular one in which a ball with a superconducting coating is suspended
electromagnetically in the field of a superconducting magnet. The sensitivity
($10^{-11}\,g$) and stability are much superior to spring instruments. Analysing a
run of measurements at Pinon Flat in Southern California, Warburton and
Goodkind (1976) were able to set a limit of $10^{-3}$ on $\alpha_2$.
eclipse has been seen near the sunset to the west of Trieste where the
variation of the horizontal attraction could be measured with very sensitive
horizontal pendulums of long period (Marussi in Cook , 19� 1). Nothing
inconsistent with the straight sum of the attractions before and after eclipse
was seen during eclipse. The test is not perhaps very rigorous for with a
similar geometrical disposition, electrostatic or magnetostatic screening
would be slight; for substantial screening, the Moon would need to subtend
more nearly 90°.
A proposed test of the precession of a gyroscope in a space craft may be
considered as an experiment, rather than as an observation, for the
apparatus is under an experimenter's control . The experiment was first
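suggested by Schiff (1960a, b) and has been in preparation for more than a
decade, but still awaits a suitable space flight. The rate of change of the spin
vector $\mathbf{S}$ is (Will, 1981)
$$d\mathbf{S}/d\tau = \boldsymbol{\Omega}\times\mathbf{S},$$
where $\boldsymbol{\Omega}$ is an angular velocity:
$$\boldsymbol{\Omega} = -\tfrac{1}{2}\mathbf{v}\times\mathbf{a} - \tfrac{1}{2}\nabla\times\mathbf{g} + (\gamma + \tfrac{1}{2})\mathbf{v}\times\nabla U$$
($U$ is the gravitational potential, $\mathbf{g}$ is a vector related to the metric tensor:
$\mathbf{g} = g_{0i}\mathbf{e}_i$ (where $\mathbf{e}_i$ is a unit vector in the i-direction) and $\mathbf{a}$ is the spatial
component of the 4-acceleration and is zero in free fall, as in a satellite in
orbit about the Earth).
Thus if $\mathbf{v}$ is the centre of mass 3-velocity,
$$\boldsymbol{\Omega} = \tfrac{1}{4}(4 + 4\gamma + \alpha_1)\nabla\times\mathbf{V} - \tfrac{1}{4}\alpha_1\mathbf{w}\times\nabla U + (\gamma + \tfrac{1}{2})\mathbf{v}\times\nabla U,$$
where $\mathbf{w}$ is the velocity of the coordinate system relative to a preferred frame.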
The observed precession is the resultant of a number of terms in the
equation for $d\mathbf{S}/d\tau$. First, there is the geodetic precession, resulting from the
curvature of space near massive bodies. For a gyroscope in a circular orbit of
radius a about the Earth (of mass $m_E$) the change of the direction of the spin
axis in one orbit is
$$\delta\mathbf{S} = -2\pi(\gamma + \tfrac{1}{2})(m_E/a)(\mathbf{S}\times\mathbf{h}),$$
where h is the unit vector perpendicular to the plane of the orbit. The rate is
approximately 7 arc seconds per year for a close satellite; and is much the
largest contribution to the precession.
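The quoted rate of about 7 arc seconds per year can be checked from the expression for the change of the spin direction per orbit. The sketch below assumes that the formula is written in units with G = c = 1, so that the mass of the Earth over the orbital radius becomes GM/(c²a) in SI units; the constants, the sample orbital radius of 7000 km and the function names are assumptions made for illustration.

```python
import math

# Assumed reference values (not from the text).
GM_EARTH = 3.986004418e14        # m^3 s^-2
C = 299792458.0                  # m s^-1
SECONDS_PER_YEAR = 365.25 * 86400.0


def geodetic_angle_per_orbit(a_m, gamma=1.0):
    """Magnitude of the spin-axis change in one circular orbit, in radians."""
    return 2.0 * math.pi * (gamma + 0.5) * GM_EARTH / (C**2 * a_m)


def geodetic_rate_arcsec_per_year(a_m, gamma=1.0):
    """Accumulated geodetic precession in arc seconds per year."""
    period_s = 2.0 * math.pi * math.sqrt(a_m**3 / GM_EARTH)  # Kepler's third law
    orbits_per_year = SECONDS_PER_YEAR / period_s
    return math.degrees(geodetic_angle_per_orbit(a_m, gamma) * orbits_per_year) * 3600.0


# A close satellite: orbital radius 7000 km (illustrative choice).
print(geodetic_rate_arcsec_per_year(7.0e6))
```

The rate scales with (γ + 1/2), so a measurement of the geodetic precession to a given fractional accuracy determines γ to roughly one and a half times that fractional accuracy.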
The Lense-Thirring precession would be of the order of 0.01 arc seconds
per year, while a periodic precession arising from the velocity of a preferred
frame would have an amplitude of some $10^{-3}\alpha_1$ seconds.
The practical difficulties are very great. The prime requirement is that the
gyroscope should be subject to no torques other than those arising from the
effects sought, and that means that it must be uniform in shape and
homogeneous in density to about 1 part in a million. Secondly, the
recognition of the spin axis must involve no significant anisotropy or torque
and, lastly, the direction of spin has to be referred to a reference frame
constituted by distant stars, the directions of which should be established to
$10^{-3}$ arc seconds. All these requirements present severe problems (Lippa
and Everitt, 1978; Cabrera and van Kann, 1978).
Lastly, in this section, the possible dependence of the constant of
gravitation upon time is mentioned, if only to remark that at present it seems
most unlikely that any dependence could be examined experimentally.
General relativity treats G as a fundamental constant, part of the system of
definition of the theory, and so not susceptible to experimental test (the
velocity of light in special relativity occupies a similar position) but other
theories admit variation, with the implication that G is not an independent
constant of nature, but a quantity related in some way to other constants of
nature so that it may meaningfully be compared with them to establish
whether or not it changes. Dirac's hypothesis of large numbers leads to a
rate of change of an order of magnitude in the lifetime of the universe, an
amount that could only be checked by reference to geophysical or
astronomical observations. Analyses of, for example, palaeomagnetic data
about possible changes in the radius of the Earth have been inconclusive.
Within the laboratory, an experiment on the time dependence of G would
need to establish the value of G in relation to the standards of length, time
and mass (in actual current practice, the velocity of light, the standard of
frequency and the standard kilogram) with sufficient precision at different
times to enable a change of $10^{-10}$ per year to be detected. Since, as will be
seen, the value of G is known at best to rather better than 1 part in $10^4$, and
since the standard kilogram is established to no better than 1 part in $10^9$, it
will be clear that there is no prospect of any realistic experimental check
upon a possible variation of G in time.
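The scale of the difficulty can be put in numbers. The sketch below makes the simple assumption, which is not a statement from the text, that a steady fractional drift becomes detectable only once the accumulated change equals the fractional precision of a single determination; the function name is hypothetical.

```python
def years_to_detect(drift_per_year, fractional_precision):
    """Years needed for a steady drift to accumulate to the measurement precision.

    Assumes two determinations of equal precision and ignores any statistical
    gain from repeated measurements.
    """
    return fractional_precision / drift_per_year


# Values quoted in the text: a drift of 1e-10 per year, G known to about
# 1 part in 1e4, and the standard kilogram to about 1 part in 1e9.
print(years_to_detect(1e-10, 1e-4))  # limited by the determination of G
print(years_to_detect(1e-10, 1e-9))  # limited by the mass standard alone
```

On this crude reckoning the determination of G itself, rather than the standards of mass, length or time, is the limiting factor by several orders of magnitude.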
forces are very small compared with the electromagnetic forces and so
experiments are very susceptible to extraneous disturbances, while the
reliable and sensitive electromagnetic techniques that can be applied to
other constants are not available for gravitation. Yet, while the measured
value of G is so uncertain the relevance of G to the rest of physics is slight.
The other principal constants of physics form an interconnected set and a
good knowledge of their values has consequences in both fundamental
theory (relativistic quantum mechanics, for example) and in practical
measurement of high precision (establishment of standards for electrical
measurements as an instance). Almost no such requirements or implications
apply to knowledge of the value of G. It is, so far as is known or postulated
(but see the previous section) independent of all the other constants and is
not required in any practical system of measurement. It enters into the
conversion of orbital parameters in celestial mechanics into masses (in
kilograms) of celestial bodies, but no theory of the constitution of such
bodies is so well established in detail as to require the masses to be known in
terms of the terrestrial standard to better than a few per cent. The only
possible exception to that statement has to do with the internal constitution
of the Earth. The variations of density and elastic moduli within the Earth
are nowadays established to about 1 part in a thousand throughout most
of the Earth from analysis of numerous seismological observations. In essence
those analyses give the value of the ratios K/ρ and µ/ρ, where K is the bulk
modulus, µ the shear modulus and ρ the density. In those ratios the standard
of mass enters both the numerators and denominators and so appears in
neither of the ratios. At the same time, the absolute value of density is set by
the condition that the density must integrate up to the total mass of the
Earth, which is fixed by the attraction of gravity of the Earth; the conversion
of density to units of the terrestrial standards of mass and length therefore
requires a knowledge of the constant of gravitation. The point of such a
conversion is to make comparisons with laboratory determinations of
equations of state of possible constituents of the body of the Earth, in which
the density is measured against the usual terrestrial standards and is
independent of the value of G.
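A short numerical sketch may make the role of G in this conversion concrete. It assumes modern round values for the geocentric gravitational constant GM, the mean radius of the Earth and G itself (none of which are quoted in the text), and it computes only the mean density, not a full density profile.

```python
import math

# Assumed modern values, not taken from the text.
GM_EARTH = 3.986004418e14   # m^3 s^-2, fixed by the Earth's attraction
R_EARTH = 6.371e6           # m, mean radius
G_LAB = 6.674e-11           # N m^2 kg^-2, laboratory value of G

def mean_density(gm, radius, g_const):
    """Mean density in kg/m^3: the mass GM/G spread over a sphere of the given radius."""
    mass = gm / g_const
    volume = 4.0 / 3.0 * math.pi * radius**3
    return mass / volume

rho = mean_density(GM_EARTH, R_EARTH, G_LAB)
rho_shifted = mean_density(GM_EARTH, R_EARTH, 1.01 * G_LAB)

print(f"mean density = {rho:.0f} kg/m^3")
print(f"1 per cent increase in G changes it by {100 * (rho_shifted / rho - 1):.2f} per cent")
```

Because GM is fixed by observation, any fractional error in G passes directly into the density expressed in terrestrial units.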
Five methods have been used for the laboratory determination of the
constant of gravitation. The earliest uses the deflexion of a torsion balance
to measure the force between two masses. In the second method, a torsion
balance is used as an oscillator, the gravitational potential of attracting
masses being used to alter the restoring force and hence the period of
oscillation. In a variant of those two methods, the attracting masses were
moved back and forth around the torsional pendulum in resonance with the
period of oscillation of the pendulum and the steady amplitude of the
oscillations was measured. The force exerted by an attracting mass on a test
mass can also be measured by balancing it against the gravitational
attraction of the Earth (with a known acceleration due to gravity) on a
sensitive equal arm chemical balance. Lastly, a method first suggested in
principle by Newton has been attempted. Newton (see Section 4.1)
calculated the time it would take two masses moving freely to come together
from an initial separation; the experiment of Luther et al. (1976) is
somewhat different, in that the separation of two masses was kept constant
by a servo-system which moved the one away from the other in opposition
to the gravitational acceleration. One mass (the controlled one) was on a
rotating table, the other was suspended from a torsion balance and the
servo-control of the first mass was adjusted to maintain the deflexion of the
torsion balance constant, when the acceleration of the masses towards each
other was equal to the acceleration of the controlled mass.
In the eighteenth century, estimates of the constant of gravitation were
also made from the gravitational attraction of some isolated mountain (in
Scotland, Schiehallion and Arthur's Seat) which was found from the amount
by which a plumb bob was deflected towards the mountain - the tangent of
that angle is the ratio of the attraction of the mountain to that of the whole
Earth. The radical weakness of the method is the determination of the
attraction of the mountain, for the internal distribution of density can never
be known with any exactness and there is no such thing as an isolated
mountain with a clearly defined boundary. The interest of such observations
lies in what they tell us about sub-surface distributions of mass associated
with mountains and not in the values of G that have been derived.
All experimental determinations of the constant of gravitation suffer from
the same problems - the measurement of the very small gravitational forces
in the face of noise from other causes, and the location of the centres of mass
of the attracting and test masses and the measurement of their separations.
In the torsion balance, the force on the test mass is balanced by the torque
corresponding to the twist of the fibre of the balance, and that is so whether a
steady value of the twist is measured, as in the original work of Cavendish,
or whether the torque is the restoring force in free oscillations of the
pendulum, as in the determinations of Braun and others following him. The
essential advantage of finding the torque from the period of free oscillations
is that, if the pendulum is allowed to oscillate for many periods, a period can
be found much more precisely than a steady deflexion. In either case, the
torsional constant of the pendulum has to be found, usually from the free
period in the absence of attracting masses. The other problem common to all
experimental determinations is that the distribution of density within
masses may not be uniform and therefore the position of the centre of
attraction cannot be found from the dimensions of the exterior surface. That
problem can to a large extent be overcome by turning the attracting masses
so that the centre of attraction rotates about some known axis, and taking
observations in different positions. The test mass is usually made small
enough that the density can be assumed to be uniform. There is also the
problem of measuring from a fixed attracting mass to a mobile test mass, but
it usually is possible to arrange the observations so that the critical distances
are those between different positions of the attracting masses (compare the
experiment of Chen on the inverse square law - Section 4.3).
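The torque balance used in the deflexion method can be illustrated with a minimal sketch. It assumes an idealized apparatus (point masses, a massless beam, each attracting mass pulling only its nearest test mass at right angles to the beam), and all numerical values are hypothetical. The torsional constant is obtained from the free period, as described above, and G is then recovered from the steady twist.

```python
import math

def torsion_constant(inertia, free_period):
    """Torsional constant kappa (N m/rad) from the free period T = 2*pi*sqrt(I/kappa)."""
    return 4.0 * math.pi**2 * inertia / free_period**2

def steady_deflection(g_const, big_m, small_m, beam_length, separation, kappa):
    """Twist (rad) at which the fibre torque balances the gravitational torque."""
    # Two test masses, each with lever arm L/2: total torque = 2 * F * (L/2) = F * L.
    force = g_const * big_m * small_m / separation**2
    return force * beam_length / kappa

def g_from_deflection(theta, big_m, small_m, beam_length, separation, kappa):
    """Invert the torque balance kappa * theta = G * M * m * L / r^2 for G."""
    return kappa * theta * separation**2 / (big_m * small_m * beam_length)

# Hypothetical apparatus, chosen only for illustration.
G_TRUE = 6.674e-11                   # N m^2 kg^-2, assumed input value
BIG_M, SMALL_M = 158.0, 0.73         # kg, attracting and test masses
BEAM_LENGTH, SEPARATION = 1.8, 0.225 # m
FREE_PERIOD = 420.0                  # s, period without attracting masses
INERTIA = 2.0 * SMALL_M * (BEAM_LENGTH / 2.0)**2  # kg m^2, beam mass neglected

kappa = torsion_constant(INERTIA, FREE_PERIOD)
theta = steady_deflection(G_TRUE, BIG_M, SMALL_M, BEAM_LENGTH, SEPARATION, kappa)
g_recovered = g_from_deflection(theta, BIG_M, SMALL_M, BEAM_LENGTH, SEPARATION, kappa)

print(f"kappa = {kappa:.3e} N m/rad, twist = {theta:.3e} rad, G = {g_recovered:.4e}")
```

The twist comes out at about a milliradian, which indicates how strongly the result depends on the stability of the rest position and on the measured separation, which enters as its square.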
In all measurements with the torsion balance, the stability of the position
of rest of the balance beam is critical. The measurement of a period is much
less subject to instability of the rest position, though not entirely so. The
cause of the instability is the movement of dislocations of the material of the
fibre under stress and a considerable improvement in technique was made
when Boys developed fused silica fibres which not only can be made much
finer than metal or other fibres, and so attain a greater sensitivity, but also
lack the dislocations of polycrystalline metals, although as glasses they
undergo viscous drift. Thus, since the experiments of Boys, fibres of fused
silica have generally been used for the most delicate balances, although in
recent years Braginski and Manukin (1977), followed by Chen (Chen, Cook
and Metherell, 1984), have shown how to remove dislocations from
tungsten wires by a combination of thermal and mechanical stress, and so
have produced very stable torsion balances.
All the earlier determinations of the constant of gravitation depended on
the deflexion of a torsion balance (Cavendish, 1798; Baily, 1842; Cornu and
Baille, 1873, 1878) and perforce were performed in an enclosure with air at
atmospheric pressure; they were greatly disturbed by convection currents.
Two important advances were made by Boys (1895) and Braun (1897). Boys
introduced quartz fibres for the torsion suspension and showed that it was
an advantage to reduce the size of a deflexion experiment. Braun, who
worked in relative isolation in his Jesuit house, was the first to evacuate the
enclosure in which the torsion balance hung and also performed the first
experiment using the period of the balance instead of its deflexion - he used
the deflexion method as well.
Braun's use of the period of the torsion balance has been followed by Heyl
(1930), Heyl and Chrzanowski (1942) and Luther and Towler (1982). If
attracting masses are placed in line with the beam of the torsion balance, the
torque they exert is in the same sense as that of the suspension and increases
the frequency of oscillation, whereas if they are placed at right angles to the
beam, the attraction decreases the frequency. The method seems to have
given the best results so far, the most precise result apparently being that
of Luther and Towler (Table 4.1).
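The way the two positions of the attracting masses are combined can be sketched as follows. The sketch assumes that the gravitational torque per radian is +G c_near with the masses in line with the beam and -G c_far with the masses at right angles; the geometry factors c_near and c_far, and every numerical value, are hypothetical placeholders for quantities that a real experiment would compute from the measured mass distribution.

```python
import math

def period(kappa, inertia, grav_stiffness):
    """Period of a torsion pendulum whose restoring torque per radian is
    kappa (fibre) plus grav_stiffness (attracting masses, may be negative)."""
    return 2.0 * math.pi * math.sqrt(inertia / (kappa + grav_stiffness))

def g_from_periods(t_near, t_far, inertia, c_near, c_far):
    """G from the periods in the two positions; the fibre constant cancels
    in the difference of the squared angular frequencies."""
    w_near_sq = (2.0 * math.pi / t_near)**2
    w_far_sq = (2.0 * math.pi / t_far)**2
    return inertia * (w_near_sq - w_far_sq) / (c_near + c_far)

# Hypothetical apparatus, for illustration only.
G_TRUE = 6.674e-11        # N m^2 kg^-2, assumed input value
KAPPA = 2.0e-8            # N m/rad, fibre constant
INERTIA = 1.0e-4          # kg m^2
C_NEAR, C_FAR = 3.0, 1.5  # kg^2/m, assumed geometry factors

t_near = period(KAPPA, INERTIA, +G_TRUE * C_NEAR)  # masses in line: stiffer, faster
t_far = period(KAPPA, INERTIA, -G_TRUE * C_FAR)    # masses at right angles: slower
g_recovered = g_from_periods(t_near, t_far, INERTIA, C_NEAR, C_FAR)

print(f"T_near = {t_near:.3f} s, T_far = {t_far:.3f} s, G = {g_recovered:.4e}")
```

In this idealization the fibre constant drops out of the difference, so that only the moment of inertia, the geometry factors and the two periods are needed.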
A variant of the method of periods was introduced by Zahradnicek (1933).
Two coaxial torsion balances were set up, a light one with masses of about
9 g and a heavy one with masses of about 11 kg. The periods were nearly but
not quite the same. The heavy balance was set into free oscillations driving
the light balance through the gravitational attraction; the value of G is found
from the amplitude of the forced oscillation of the light balance.
A very sensitive equal arm common balance was used by von Jolly (1881)
and by Poynting (1891). Von Jolly placed a container which could be filled
with mercury below one pan, while in Poynting's determination a large
block of lead was placed below the mass on one pan of the balance and the
increase in weight of the mass was found by simple weighing. Poynting
devised a very sensitive means of measuring the deflexion of the balance and
hence the small increase in weight.
The accuracy of his result is comparable with that of other determinations
of that time (Table 4.1). It is clear that if the sensitivity of the balance were to
[Table 4.1: Author | Method | Result ($10^{-11}$ N m$^2$ kg$^{-2}$)]
4.8 Conclusion
The experiments discussed in this article are those that have been made in a
laboratory on Earth or that could be made in one in a spacecraft. Thus
nothing has been said of attempts to observe gravitational radiation, for,
although they use some of the most advanced techniques of experimental
physics, they depend upon radiation from sources which are clearly not
under the control of the experimenter; the experiments considered here
involve for the most part sources within the laboratory and at the
experimenter's disposition. In addition the techniques are all similar,
involving delicate mechanical measurements of very small forces. Tests of
general relativity or other theories of gravitation by means of analyses of
celestial mechanics and geophysics are also not considered.
The limitation to experimental tests means that the range of distances
over which theories can be checked is of the order of 1 m or somewhat less.
Taken with geophysical and astronomical observations (for the most part in
the solar system), laboratory experiments test the validity of theories of
gravitation from a few centimetres out to the extent of the solar system, so
that there are ranges of both larger and smaller distances for which there is no
experimental test.
Newton's law of gravitation implies that the attraction between two
bodies is proportional to mass and independent of material constitution,
that the law is the inverse square law and that G is a constant independent of
time, or any preferred coordinate frame or any preferred position.
Experimental results confirm the first implication at least to the extent that
the ratio of passive gravitational mass to inertial mass is the same for all
materials to within 1 part in $10^{12}$, and they confirm that the law is the inverse
square to within 1 part in $10^{4}$. Geophysical evidence from Earth tides
indicates that any effects of a preferred frame or location are small and
nothing useful can be said about variation in time. There is some evidence
against screening of a gravitational source by other material.
Looked at from the point of view of PPN theory, observations on the
scale of the solar system show that the parameters $\gamma$ and $\beta$ must have values
close to unity as predicted by general relativity, but the only other
conclusion to be drawn from experiment or observation is that the
parameter $\alpha_2$ is zero and that certain combinations of other parameters are
also zero. All parameters other than $\beta$ and $\gamma$ are zero in general relativity.
In a number of instances, the results of experiments show unexplained
systematic effects, of which three have already been mentioned. The results
of the experiment by Roll, Krotkov and Dicke (1964) showed a 12 h period
in the recorded signal; that cannot arise from the attraction of the Sun on the
masses and its origin has not been accounted for. Heyl (1930), in his
determination of the gravitational constant, used different materials for the
masses in various groups of observations and found different results for G
(Table 4.1), which is inconsistent with the Eotvos group of experiments.
Similarly, there appear to be some systematic differences between materials
within the Eotvos experiment itself, even though the interpretation
proposed by Fischbach et al. (1986) may be invalid. Again, Heyl and
Chrzanowski (1942) found that whether the tungsten torsion fibre was hard
drawn or annealed gave different values of G (Table 4.1). The significance of
these examples is not so much the physical interpretation that may be put
upon them, rather it is that it is difficult to attain an adequate understanding
of experiments at the limit of available techniques.
In view of the rather limited nature of experiments so far performed,
and in view of the difficulties in understanding the outcomes of
these experiments, there must be considerable interest in devising
new experiments, both better ones to reexamine existing results and new
ones to investigate other aspects of gravitation. Braginski et al. (1977)
have proposed three new techniques of detection which they argue should
be very sensitive and have suggested a number of experiments of which they
might form the basis. None of the experiments has, however, yet been
performed.
The naive experimenter may be allowed to conclude that general relativity
remains the best description of gravitation, but he is also well aware that
much ingenuity, care and imagination will be required before his
experiments have the delicacy that will enable them to contribute a s
References
Fujii, Y. (1971). Dilaton and possible non-Newtonian gravity. Nature Phys. Sci., 234,
5-7.
Fujii, Y. (1972). Scale invariance and gravity of hadrons. Ann. Phys., 69, 494-521.
Gibbons, G. W. and Whiting, B. F. (1981). Newtonian gravity measurements impose
constraints on unification theories. Nature (London), 291, 636-8.
Heyl, P. R. (1930). A redetermination of the constant of gravitation. J. Res. Nat. Bur.
Stds., 5, 1243-90.
Heyl, P. R. and Chrzanowski, P. (1942). A new determination of the constant of
gravitation. J. Res. Nat. Bur. Stds., 29, 1-31.
Holding, S. C. and Tuck, G. J. (1984). A new mine determination of the Newtonian
gravitational constant. Nature (London), 307, 714-16.
Keiser, G. M. and Faller, J. E. (1979). A new approach to the Eotvos experiment. Bull.
Amer. Phys. Soc., 24, 579.
Keyser, P. T., Niebauer, T. and Faller, J. E. (1986). Comment on 'Reanalysis of the
Eotvos experiment'. Phys. Rev. Lett., 56, 2425.
Kreuzer, L. B. (1968). Experimental measurement of the equivalence of active and
passive gravitational mass. Phys. Rev., 169, 1007-12.
Lippa, J. A. and Everitt, C. W. F. (1978). The role of cryogenics in the gyroscope
experiment. Acta Astron., 5, 119-123.
Long, D. R. (1976). Experimental examination of the gravitational inverse square law.
Nature (London), 260, 417-18.
Long, D. R. (1980). Nuovo Cim., B55, 252.
Luther, G. G. and Towler, W. R. (1982). Redetermination of the Newtonian
gravitational constant, G. Phys. Rev. Lett., 48, 121-3.
Luther, G. G., Towler, W. R., Deslattes, R. D., Lowry, R. and Beams, J. (1976). Int.
Conf. on Atomic Masses and Fundamental Constants 5, ed. J. H. Sanders and A. H.
5.1 Introduction
The tercentenary of the publication of Newton's Principia has occurred at a
propitious time in the subject of gravitation physics. We find ourselves in the
midst of a renaissance for general relativity, the theory of gravitation that
superseded Newtonian gravity, and that is now one of the most active and
exciting branches of physics. General relativity has become an important
theoretical tool for the astrophysicist, as well as a fundamental ingredient in
the quest for unification of the theories of the basic interactions.
Like any other branch of physics, gravitation has a strong experimental
component as well. The confrontation between theory and experiment was a
central element in the Principia; there Newton reported results of his own
experiments to verify the principle of equivalence, and made detailed
comparisons of the predictions of his theory of gravity with astronomical
observations. In the two centuries following publication of the Principia,
Newtonian gravitation was put to the test in a variety of ways, and passed
every test but one with flying colors. Nor did Einstein shy away from
experimental confrontation, for he proposed three important tests of general
relativity. Although two of Einstein's three tests were confirmed
immediately or shortly after publication of his theory in 1916, further
experimental progress during the next 45 years was very slow, largely
because of a lack of experimental technology of sufficient accuracy to
measure the extremely small predicted effects.
However, during the two decades 1960-80, there was a rebirth in the
subject of experimental gravitation. This rebirth coincided with the
renaissance of general relativity as a whole, and was propelled by
astronomical discoveries that indicated a role for relativistic gravity in
astrophysics, by new theoretical insights into the observable consequences
relativity. There, in a local freely falling frame, the measured gradients of the
acceleration experienced by a collection of particles at rest are the 'electric'
components of the Riemann tensor ($R_{0i0j}$); the divergence of the
acceleration is thus the trace of $R_{0i0j}$, which is the Ricci tensor component
$R_{00}$, which vanishes in vacuum, according to Einstein's field equations.
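The Newtonian counterpart of this statement is easy to check numerically. The sketch below is an illustration added here, not part of the original argument: it assumes a single point mass, writes down the gradient of the Newtonian acceleration (the tidal tensor) at a field point outside the mass, and confirms that its trace, the divergence of the acceleration, vanishes. The value of GM and the field point are arbitrary assumed inputs.

```python
def tidal_tensor(gm, position):
    """Gradient d a_i / d x_j of the Newtonian acceleration a = -GM x / r^3
    of a point mass, evaluated at the Cartesian point `position` (metres)."""
    r_sq = sum(coord * coord for coord in position)
    r_five = r_sq**2.5
    return [
        [gm * (3.0 * position[i] * position[j] - (r_sq if i == j else 0.0)) / r_five
         for j in range(3)]
        for i in range(3)
    ]

GM = 3.986004418e14                 # m^3 s^-2, assumed (Earth-like point mass)
FIELD_POINT = (7.0e6, 1.0e6, -2.0e6)  # m, arbitrary point outside the mass

tensor = tidal_tensor(GM, FIELD_POINT)
trace = sum(tensor[i][i] for i in range(3))
largest = max(abs(component) for row in tensor for component in row)

print(f"largest component = {largest:.3e} s^-2, trace = {trace:.3e} s^-2")
```

The individual gradients are of order $10^{-6}$ s$^{-2}$, while their trace vanishes to rounding error, as required in vacuum.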
Yet there is one area in which the two theories are very similar though
slightly different, and this is the area of prime importance for experimental
gravitation. This is the weak-field, slow-motion limit of general relativity,
the limit appropriate for discussing motions in the solar system and in other
astronomical arenas. In this limit, the Einstein field equations can be solved
by successive approximations. At the first approximation, the field
equations and equations of motion for matter are equivalent to those of
Newtonian gravity. This is the 'Newtonian limit' of general relativity. At the
next level of approximation, known as the 'post-Newtonian
† This and other quotations from the Principia are taken from the so-called 'Cajori'
edition (Cajori, 1934).
physics are those written in the language of special relativity. The argument
that leads to this conclusion simply notes that, if EEP is valid, then in local
freely falling frames, the laws governing experiments must be independent of
the velocity of the frame (LLI), with constant values for the various atomic
constants (in order to be independent of location). The only laws we know of
that fulfil this are those that are compatible with special relativity, such as
Maxwell's equations of electromagnetism. Furthermore, in local freely
falling frames, test bodies appear to be unaccelerated, in other words, they
move on straight lines, but such 'locally straight' lines simply correspond to
'geodesics' in a curved spacetime (TEGP, section 2.3).
General relativity is a metric theory of gravity, but then so are many
others, including the Brans-Dicke theory. Because Newtonian gravitation
is not compatible with LLI, it is not a metric theory (although it can be
expressed in geometrical, coordinate-free language). So the notion of curved
spacetime is a very general and fundamental one, and therefore it is
important to test the various aspects of EEP thoroughly.
A direct test of WEP is the comparison of the acceleration of two
laboratory-sized bodies of different composition in an external gravitational
field. Such tests of WEP predate Newton, including Philiponos (5th or 6th
C.), Stevin (1586) and Galileo (c. 1590). If the principle were violated, then
the accelerations of different bodies would differ. The simplest way to
quantify such possible violations of WEP in a form suitable for comparison
with experiment is to suppose that for a body with inertial mass $m_I$, the
passive gravitational mass $m_P$ is no longer equal to $m_I$, so that in a
gravitational field $g$, the acceleration is given by $m_I a = m_P g$. Now the inertial
mass of a typical laboratory body is made up of several types of
mass-energy: rest energy, electromagnetic energy, weak-interaction energy, and
so on. If one of these forms of energy contributes to $m_P$ differently than it
does to $m_I$, a violation of WEP would result. One could then write
$$ m_P = m_I + \sum_A \eta^A E^A / c^2 , \qquad (1) $$
where $E^A$ is the internal energy of the body generated by interaction $A$, and
$\eta^A$ is a dimensionless parameter that measures the strength of the violation
of WEP induced by that interaction, and $c$ is the speed of light. A
measurement or limit on the fractional difference in acceleration between
two bodies then yields a quantity called the 'Eotvos ratio' given by
$$ \eta = \frac{2 |a_1 - a_2|}{|a_1 + a_2|} = \sum_A \eta^A \left( \frac{E_1^A}{m_1 c^2} - \frac{E_2^A}{m_2 c^2} \right) , \qquad (2) $$
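A small numerical sketch may help to show how the two forms of the Eotvos ratio fit together. It assumes a single violating interaction with a purely hypothetical coupling and hypothetical energy fractions for the two bodies; the function names are illustrative. The accelerations follow from $m_I a = m_P g$ with $m_P$ given by equation (1), and the ratio formed from them agrees with the sum over interactions to first order in the couplings.

```python
G_FIELD = 9.81  # m/s^2, assumed external gravitational field

def acceleration(eta, energy_fraction, g=G_FIELD):
    """a = (m_P / m_I) g, with m_P / m_I = 1 + sum_A eta^A E^A / (m_I c^2)."""
    mass_ratio = 1.0 + sum(eta[name] * energy_fraction[name] for name in eta)
    return mass_ratio * g

def eotvos_ratio(a1, a2):
    """Fractional difference in acceleration, 2|a1 - a2| / |a1 + a2|."""
    return 2.0 * abs(a1 - a2) / abs(a1 + a2)

def predicted_eotvos_ratio(eta, fraction_1, fraction_2):
    """First-order sum over interactions of eta^A times the difference in energy fractions."""
    return sum(eta[name] * (fraction_1[name] - fraction_2[name]) for name in eta)

# Hypothetical numbers, chosen only to illustrate the bookkeeping.
ETA = {"electrostatic": 1.0e-9}      # assumed violation parameter
BODY_1 = {"electrostatic": 4.0e-3}   # assumed E^A / (m c^2) for body 1
BODY_2 = {"electrostatic": 1.5e-3}   # assumed E^A / (m c^2) for body 2

measured = eotvos_ratio(acceleration(ETA, BODY_1), acceleration(ETA, BODY_2))
predicted = predicted_eotvos_ratio(ETA, BODY_1, BODY_2)

print(f"from accelerations: {measured:.3e}, from energy fractions: {predicted:.3e}")
```

Read in the other direction, an experimental upper limit on the ratio, divided by the difference in energy fractions, bounds the corresponding coupling.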
these estimates and limits (based on the Moscow experiments) for various
interactions (see TEGP, section 2.4(a), for details).
Recently, there has been a renewal of interest in Eotvos's original version
of the experiment as a consequence of a reanalysis of the Eotvos data by
Fischbach et al. (1986a). One of the goals of that reanalysis was to search for
the effects of a hypothetical short-range (~100 m) force, known as the 'fifth'
Interaction          Limit on $|\eta^A|$
Strong               $5 \times 10^{-10}$
Electromagnetic
  Electrostatic      $4 \times 10^{-10}$
  Magnetostatic      $6 \times 10^{-6}$
  Hyperfine          $2 \times 10^{-7}$
Weak                 $10^{-2}$
Gravitational        No limit
(3)
where $\mu_i$ is the mass of the $i$th body measured in atomic mass units, $B_i$ is its
total baryon number, and $\eta^Y$ is a parameter that depends on the strength
and range of the putative force, and on the detailed distribution of local
matter within that range. Fischbach et al. claimed that Eotvos's data
showed a significant dependence of $\eta$ on the baryon-number-per-unit-mass
difference, and they quoted a value $\eta^Y = (5.65 \pm 0.71) \times 10^{-6}$. This result,
they argued, was qualitatively in accord with measured deviations from the
inverse square law of gravity using gravimeter data from deep mines
(Holding et al., 1986), and with anomalous energy dependences in the
fundamental parameters that characterize the behavior of the $K^0$-$\bar{K}^0$
mesons, effects that would also be consequences of such a fifth force. The
more precise Princeton and Moscow experiments would not be sensitive to
this effect because the relevant source of gravity in those cases was the Sun,
and the effect of a 100 m short-range force would therefore be negligible.
A number of authors subsequently took issue with the results of this
reanalysis. It was argued by some that the evidence for the behavior
indicated by (3) was much weaker than claimed by Fischbach et al. because
of a number of factors which were inadequately taken into account in the
reanalysis, including a sign error in the interpretation of one of Eotvos's
conventions, uncertainties in the isotopic composition of each element and
the chemical composition of each compound used in the experiment,
uncertainties in the individual masses of the substances used and of the
containers in which they were placed, and unknown systematic errors that
might have affected the outcomes of Eotvos's experiments, given that three
different methods were used by Eotvos and his colleagues. Reanalyses by
others of Eotvos's data, and of the data from a series of 1935 experiments
using the same apparatus by Renner, failed to support the Fischbach et al.
value of $\eta^Y$. Other authors pointed out that, even if one accepts the
dependence of $\eta$ on baryon number claimed by Fischbach et al., it is virtually
impossible to infer anything about the nature of a short-range fifth force,
because its putative effect on Eotvos's experiments is extremely sensitive to
the details of the nearby mass distribution (such as the mass of the building
† Most of the papers commenting on the Fischbach et al. analysis were contained in the
2 June 1986 issue of Physical Review Letters: they include Neufield (1986), Thieberger
(1986), Nussinov (1986), Thodberg (1986), Keyser et al. (1986), together with responses
by Fischbach et al. (1986b). Also listed by the editors are the authors of similar papers
which were not published for reasons of space; see also De Rujula (1986).
The principle of LPI, the third part of EEP, can be tested by the
gravitational redshift experiment, the first experimental test of gravitation
proposed by Einstein. Despite the fact that Einstein regarded this as a
crucial test of general relativity, we now realize that it does not distinguish
between general relativity and any other metric theory of gravity; instead it
is a test only of EEP. A typical gravitational redshift experiment measures
the frequency or wavelength shift $Z = \Delta\nu/\nu = -\Delta\lambda/\lambda$ between two identical
Hughes-Drever      Lithium-7      $5 \times 10^{-16}$
Prestage et al.    Beryllium-9    $10^{-18}$
Lamoreaux et al.   Mercury-201    $10^{-20}$
a For discussion see TEGP, sections 2.4(b), 2.6(e), and Haugan (1986b).
where the parameter $\alpha$ may depend upon the nature of the clock whose shift
is being measured (see TEGP, section 2.4(b), for details).
Although there were several attempts following the publication of the
general theory of relativity to measure the gravitational redshift of spectral
lines from white dwarf stars, the results were inconclusive (see Bertotti et al.
(1962) for a review). The first successful, high-precision redshift
measurement was the series of Pound-Rebka-Snider experiments of
1960-5, that measured the frequency shift of $\gamma$-ray photons from iron-57 as
they ascended or descended the Jefferson Physical Laboratory tower at
Harvard University. The high accuracy achieved - 1 per cent - was obtained
by making use of the Mossbauer effect to produce a narrow resonance line
whose shift could be accurately determined. Other experiments since 1960
measured the shift of spectral lines in the Sun's gravitational field and the
change in rate of atomic clocks transported aloft on aircraft, rockets and
satellites. Table 5.4 summarizes the important redshift experiments that
have been performed since 1960.
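The size of the effect sought in the tower experiments can be estimated with a short sketch. It assumes the usual parametrization $Z = (1 + \alpha)\,\Delta U/c^2$ with $\Delta U = g h$ in a uniform field, a tower height of about 22.5 m and a standard value of $g$; none of these numbers is quoted in the text above, and the function name is illustrative.

```python
C_LIGHT = 299_792_458.0   # m/s
G_SURFACE = 9.81          # m/s^2, assumed local acceleration due to gravity
TOWER_HEIGHT = 22.5       # m, assumed height of the tower

def redshift_uniform_field(height, alpha=0.0):
    """Fractional frequency shift Z = (1 + alpha) * g * h / c^2 between two
    clocks separated vertically by `height` (assumed parametrization)."""
    return (1.0 + alpha) * G_SURFACE * height / C_LIGHT**2

z_predicted = redshift_uniform_field(TOWER_HEIGHT)
z_one_per_cent = 0.01 * z_predicted

print(f"predicted shift        = {z_predicted:.2e}")
print(f"1 per cent of the shift = {z_one_per_cent:.1e}")
```

A shift of a few parts in $10^{15}$, measured to 1 per cent, corresponds to a limit on $|\alpha|$ of order $10^{-2}$ in this parametrization.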
The most recent experiments have taken advantage of the development of
frequency standards of ultra-high stability - parts in $10^{15}$-$10^{16}$ over
averaging times of 10-100 seconds and longer, such as hydrogen-maser
clocks, and superconducting-cavity stabilized oscillator (SCSO) clocks.
[Table 5.4: Experiment | Name | Method | Limit on $|\alpha|$]
The first such experiment (and the most precise to date) was the Vessot-Levine
rocket redshift experiment that took place in June 1976 (in NASA
circles, the experiment was denoted Gravity Probe A or GPA). A hydrogen-maser
clock was flown on a rocket to an altitude of about 10 000 km and its
frequency compared with a similar clock on the ground. The experiment
took advantage of the masers' frequency stability by monitoring the
frequency shift as a function of altitude. A sophisticated data acquisition
scheme accurately eliminated all effects of the first-order Doppler shift due
to the rocket's motion, while tracking data were used to determine the
payload's location and velocity (to evaluate the potential difference $\Delta U$, and
the time dilation). Analysis of the data yielded a limit $|\alpha| < 2 \times 10^{-4}$ (Vessot
et al., 1980). Improvement of this limit may be possible by placing such
clocks on Earth-orbiting satellites or on a spacecraft in a very eccentric solar
orbit.
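The order of magnitude of the shift available to the rocket experiment can be sketched as follows. The sketch assumes a point-mass potential for the Earth with modern values of GM and the mean radius (not given in the text), and it ignores the time-dilation term that the real analysis had to include; only the altitude of 10 000 km and the limit on $|\alpha|$ come from the passage above.

```python
GM_EARTH = 3.986004418e14   # m^3 s^-2, assumed modern value
R_EARTH = 6.371e6           # m, assumed mean radius
C_LIGHT = 299_792_458.0     # m/s

def potential_shift(altitude):
    """Delta U / c^2 between the ground and a clock at the given altitude (m),
    for a point-mass Earth; rotation and time dilation are ignored."""
    return GM_EARTH / C_LIGHT**2 * (1.0 / R_EARTH - 1.0 / (R_EARTH + altitude))

PEAK_ALTITUDE = 1.0e7   # m, from the text: about 10 000 km
ALPHA_LIMIT = 2.0e-4    # from the text: limit on |alpha|

shift_at_peak = potential_shift(PEAK_ALTITUDE)
resolved_fraction = ALPHA_LIMIT * shift_at_peak

print(f"Delta U / c^2 at peak altitude = {shift_at_peak:.2e}")
print(f"frequency resolution implied   = {resolved_fraction:.1e}")
```

The implied fractional frequency resolution, somewhat below $10^{-13}$, is comfortably within the maser stabilities quoted earlier.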
Advances in stable clocks have also made possible a redshift experiment
that is a more direct test of LPI: a 'null' gravitational redshift experiment
that compares two different types of clocks, side by side in the same
laboratory. If LPI is violated, then not only is the proper ticking rate of an
atomic clock dependent upon position, but the position dependence must
itself depend on the structure and composition of the clock, otherwise all
clocks would vary with position in a universal way and there would be no
operational way to detect the effect (since one clock must be selected as a
standard and ratios taken relative to that clock). From (5) it is easy to see
that a comparison of two different clocks A and B at the same location
would measure variations in their frequency ratio that depend upon
gravitational potential according to
(6)
where $\alpha^A$ and $\alpha^B$ are $\alpha$-parameters for each clock type, and $(\nu^A/\nu^B)_0$ is the
constant frequency ratio at some fiducial spacetime location from which $U$ is
measured.
A null redshift experiment of this type was performed in April 1978 at
Stanford University. The rates of two hydrogen-maser clocks and of an
ensemble of three SCSO clocks were compared over a 10-day period.
During this period, the solar potential $U/c^2$ changed sinusoidally with a 24-hour
period by $3 \times 10^{-13}$ because of the Earth's rotation, and changed
linearly at $3 \times 10^{-12}$ per day because the Earth is 90° from perihelion in
April. However, analysis of the data revealed no variations of either type
within experimental errors, leading to a limit on the LPI violation
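The two variations of the solar potential quoted for this experiment can be reproduced to order of magnitude with a short sketch. It assumes modern values of the solar GM, the astronomical unit, the Earth's radius, orbital speed and eccentricity, and an approximate latitude for Stanford (none of which appear in the text), and it treats the radial velocity 90° from perihelion as the eccentricity times the mean orbital speed.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, assumed
AU = 1.495978707e11         # m, assumed mean Earth-Sun distance
C_LIGHT = 299_792_458.0     # m/s
R_EARTH = 6.371e6           # m, assumed
LATITUDE_DEG = 37.4         # degrees, approximate latitude of Stanford (assumed)
ECCENTRICITY = 0.0167       # assumed eccentricity of the Earth's orbit
ORBITAL_SPEED = 2.978e4     # m/s, assumed mean orbital speed
SECONDS_PER_DAY = 86400.0

# Solar potential at the Earth in units of c^2.
u_over_c2 = GM_SUN / (AU * C_LIGHT**2)

# Daily term: rotation carries the laboratory towards and away from the Sun
# by up to R cos(latitude), and U scales as 1/r.
diurnal_amplitude = u_over_c2 * R_EARTH * math.cos(math.radians(LATITUDE_DEG)) / AU

# Linear term: near 90 degrees from perihelion the radial velocity is roughly
# e times the orbital speed, so r changes steadily from day to day.
linear_per_day = u_over_c2 * ECCENTRICITY * ORBITAL_SPEED * SECONDS_PER_DAY / AU

print(f"U/c^2             = {u_over_c2:.2e}")
print(f"diurnal amplitude = {diurnal_amplitude:.1e}")
print(f"linear change     = {linear_per_day:.1e} per day")
```

Both estimates agree with the figures quoted above to within the accuracy of so simple a model.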
[Table: Method | $\sigma = (\dot{G}/G) \times (2 \times 10^{10}\ \mathrm{yr})$ | Name]
estimates, despite being based upon similar data sets) are the uncertainty in
the masses and distributions of the asteroids, and the level of correlations
among the many parameters to be estimated in the models. Some authors
have suggested that radar observations of a Mercury orbiter over a two-year
mission (30 cm accuracy in range) could yield $\Delta(\dot{G}/G) \sim 3 \times 10^{-13}\ \mathrm{yr}^{-1}$ (see
UPDATE, section 5.3).
$$ \lambda_p = \left[ \tfrac{1}{3}(2 + 2\gamma - \beta) + 3 \times 10^{-4} (J_2/10^{-7}) \right] . \qquad (13) $$
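Equation (13) is simple enough to evaluate directly, and a minimal sketch shows how the relativistic and quadrupole contributions compare. The function name is illustrative, and the values of $J_2$ used below are sample inputs chosen for the illustration, not measurements.

```python
def lambda_p(gamma, beta, j2):
    """Dimensionless perihelion-shift factor of equation (13):
    (2 + 2*gamma - beta)/3 + 3e-4 * (J2 / 1e-7)."""
    return (2.0 + 2.0 * gamma - beta) / 3.0 + 3.0e-4 * (j2 / 1.0e-7)

# General relativity: gamma = beta = 1, so the PPN term is exactly 1.
gr_no_quadrupole = lambda_p(1.0, 1.0, 0.0)

# Sample quadrupole moments (illustrative inputs only).
gr_small_j2 = lambda_p(1.0, 1.0, 1.0e-7)
gr_large_j2 = lambda_p(1.0, 1.0, 1.0e-5)

print(f"J2 = 0     : lambda_p = {gr_no_quadrupole:.4f}")
print(f"J2 = 1e-7  : lambda_p = {gr_small_j2:.4f}")
print(f"J2 = 1e-5  : lambda_p = {gr_large_j2:.4f}")
```

A quadrupole moment of order $10^{-7}$ shifts the factor by only a few parts in $10^{4}$, whereas one of order $10^{-5}$ would shift it by a few per cent, comparable to or larger than the precision of the perihelion measurements.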
The measured perihelion shift is accurately known: after the effects of the
general precession of the equinoxes (5000" c$^{-1}$) and the perturbing effects of
the other planets (280" c$^{-1}$ from Venus, 150" c$^{-1}$ from Jupiter, 100" c$^{-1}$
from the rest) have been accounted for, the remaining perihelion shift is
known (i) to a precision of about 1 per cent from optical observations of
Mercury during the past three centuries, and (ii) to about 0.5 per cent from
radar observations during 1966-76. Unfortunately, measurements of the
orbit of Mercury alone are incapable at present of separating the effects of
relativistic gravity and of solar quadrupole moment in the determination of
$\lambda_p$. Thus, in two analyses of radar distance measurements to Mercury, $J_2$
was assumed to have a value corresponding to uniform rotation (effect on $\lambda_p$
negligible), and the PPN parameter combination was estimated. The results
were
$$ \tfrac{1}{3}(2 + 2\gamma - \beta) = \begin{cases} 1.005 \pm 0.020 & \text{(1966-71 data)} \\ 1.003 \pm 0.005 & \text{(1966-76 data)} \end{cases} , \qquad (14) $$
(see TEGP, section 7.3, for references) where the quoted errors are $1\sigma$
[Figure: values plotted against Year (1965-1985); labels: Hill, Gough, Duvall, General relativity $1\sigma$, General relativity $2\sigma$; vertical scale includes $10^{-5}$.]
relevant to specific data sets, such as station locations, rotation and libration
of bodies, known systematic errors or corrections, etc. The output of the
model is a set of updated or improved values of the parameters. In addition
to many years of optical observations of the planets, the data set includes
radar-ranging measurements to Mercury, Venus, the Mars orbiter Mariner
9, and the Viking Mars landers, and laser-ranging measurements to the
Moon. Further analyses of post-Newtonian ephemerides will improve our
Earth. Although Kant was the first to suggest, in 1754, that the Earth must
be slowing down (arguing that since the Moon has spun down until it
presents the same face to the Earth, presumably from tidal effects, the Earth
must suffer the same slowing down), he later retracted the hypothesis, and it
was not until the work of Delaunay in 1866 and Spencer-Jones in 1939 that
the importance of this effect was appreciated (for a brief review, see the
article by Muller in Halpern ( 1978)).
Today, the agreement of the theory of the Moon's motion with
observations carried out using such modern tools as lunar laser ranging
represents another triumph for Newtonian gravity. Although some workers
have incorporated general relativistic effects into the theory, most of them
are too small to be detectable with current technology (TEGP, section 8.3).
The effect of tidal dissipation in the Earth on the Moon's secular
acceleration remains an important uncertainty in any attempts to place tight
limits on a cosmological variation in G (see Section 5.3 .4), but these
uncertainties may soon be reduced significantly through laser ranging to the
Earth-orbiting LAGEOS satellite, from which improved determinations of
the effects of ocean tides can be obtained.
However, there is one way in which the Moon has provided an important
test of relativistic gravity, and it hearkens back to Newton's WEP.
According to Newton, the equivalence principle should apply to all bodies,
laboratory-sized bodies as well as planets. Indeed, since Newtonian gravity
is a linear theory, it is straightforward to show that the acceleration of a
gravitating body such as a planet in an external gravitational field is the
same as that of a body of negligible size (ignoring tidal effects). But in most
relativistic theories of gravity, including general relativity, gravity is
non-linear, so that the internal gravitational field of a massive body can interact
with the external gravitational field, in a manner that could differ from the
interaction of the matter comprising the body. The result would be a
violation of WEP for massive, self-gravitating bodies.
In a pioneering calculation using an early form of the PPN formalism,
Nordtvedt (1968) showed that many metric theories predicted just such a
violation, in other words, they predicted that bodies could fall with different
accelerations depending on their gravitational self-energy. For a spherically
symmetric body, the acceleration from rest in an external gravitational
potential $U$ has the form
$$\begin{aligned}
a &= (m_p/m_I)\,\nabla U, \\
m_p &= m_I - \eta E_g/c^2, \\
\eta &= 4\beta - \gamma - 3 - \tfrac{10}{3}\xi - \alpha_1 + \tfrac{2}{3}\alpha_2 - \tfrac{2}{3}\zeta_1 - \tfrac{1}{3}\zeta_2,
\end{aligned} \qquad (15)$$
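As a small numerical illustration of eq. (15), the following Python sketch evaluates the combination $\eta$ for a given set of PPN parameters. The function names are illustrative only and do not come from the text. The Brans-Dicke values used in the second function, $\gamma = (1+\omega)/(2+\omega)$ and $\beta = 1$ with all other parameters zero, are the standard ones and are an assumption here, since this passage does not list them.

```python
from fractions import Fraction


def nordtvedt_eta(gamma, beta, xi=0, alpha1=0, alpha2=0, zeta1=0, zeta2=0):
    """Combination eta of eq. (15), evaluated with exact rational arithmetic."""
    return (4 * beta - gamma - 3
            - Fraction(10, 3) * xi
            - alpha1
            + Fraction(2, 3) * alpha2
            - Fraction(2, 3) * zeta1
            - Fraction(1, 3) * zeta2)


def brans_dicke_eta(omega):
    """eta for Brans-Dicke theory with integer coupling omega.

    Assumption (not stated in this passage): gamma = (1 + omega)/(2 + omega),
    beta = 1, and all other PPN parameters vanish.
    """
    gamma = Fraction(1 + omega, 2 + omega)
    return nordtvedt_eta(gamma, 1)


# General relativity (gamma = beta = 1): no violation for massive bodies.
print(nordtvedt_eta(1, 1))
# A scalar-tensor theory with a large coupling gives a small but non-zero eta.
print(brans_dicke_eta(500))
```

With these assumed values the Brans-Dicke result reduces to $\eta = 1/(2+\omega)$, so a bound on $\eta$ translates directly into a lower bound on $\omega$.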
[Figure: values of $\tfrac{1}{2}(1 + \gamma)$, with entries labelled Hill (1971), Sramek (1974), Riley (1973), and Weiler et al. (1974); a second axis gives the corresponding value of the scalar-tensor parameter $\omega$ (5, 10, 20, 40, $\infty$).]
sensitive to the deflection of light over almost the entire celestial sphere (at
90° from the Sun, the deflection is still 4 milliarcseconds). The data yielded a
value $\tfrac{1}{2}(1 + \gamma) = 1.004 \pm 0.003$, where the error is a formal standard error
(Robertson and Carter, 1984).
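The figure of 4 milliarcseconds at 90° from the Sun can be checked with a short Python sketch. It assumes the standard first-order PPN expression for a distant source seen at angular separation $\chi$ from the Sun by an observer at distance $r$, $\delta\theta = (1+\gamma)(GM_\odot/c^2 r)\cot(\chi/2)$, which is not written out in this passage; the function name and the numerical constants are likewise assumptions of the sketch.

```python
import math

GM_SUN_OVER_C2_M = 1476.625          # G*M_sun/c^2 in metres (assumed value)
AU_M = 1.495978707e11                # astronomical unit in metres
RAD_TO_MAS = math.degrees(1.0) * 3600.0 * 1000.0  # radians -> milliarcseconds


def deflection_mas(elongation_deg, gamma=1.0, observer_distance_m=AU_M):
    """First-order light deflection in milliarcseconds.

    elongation_deg is the angle between the source and the Sun as seen by
    the observer; gamma = 1 corresponds to general relativity.
    """
    chi = math.radians(elongation_deg)
    angle_rad = ((1.0 + gamma) * GM_SUN_OVER_C2_M / observer_distance_m
                 / math.tan(chi / 2.0))
    return angle_rad * RAD_TO_MAS


# Deflection for a source at right angles to the Sun, in milliarcseconds.
print(round(deflection_mas(90.0), 2))
# Deflection for a ray grazing the solar limb (elongation about 0.266 deg),
# in arcseconds.
print(round(deflection_mas(0.26646) / 1000.0, 2))
```

The first value is close to 4 milliarcseconds and the second is close to the familiar limb value of about 1.75 arcseconds, both scaled by $\tfrac{1}{2}(1+\gamma)$ in a general metric theory.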
One of the major sources of error in these experiments is the solar corona,
which bends radio waves much more strongly than it bent the visible light
rays which Eddington observed. Improvements in dual frequency
techniques improved accuracies by allowing the coronal bending, which
depends on the frequency of the wave, to be measured separately from the
gravitational bending, which does not (for a review, see Fomalont and
Sramek, 1977).
Measurements of optical light deflection using telescopes in space have
recently come under serious study. Two proposals along this line are (i)
Hipparcos, an astrometry satellite planned for launch in the late 1980s by
the European Space Agency, designed to measure positions of $10^5$ stars to 5
milliarcsecond accuracy (anticipated accuracy in $\gamma$ of $10^{-3}$); and (ii)
POINTS (precision optical interferometry in space), a proposed orbiting
optical interferometer with microarcsecond accuracy, capable in principle of
detecting second-order or post-post-Newtonian effects in light deflection.
For further details concerning the deflection of light, see TEGP, section 7.1,
and UPDATE, section 4.1.
The discovery in 1979 of the 'double' quasar Q0957+561 and its
subsequent interpretation as a multiple image of a single quasar caused by
the gravitational lensing effect of an intervening galaxy made the deflection
of light a useful tool in astrophysics and cosmology. The number and
characteristics of images in such lensed systems can be used as a probe of the
mass distribution of the lensing galaxy or cluster of galaxies.
Another important effect of gravity on the propagation of light was
considered neither by Newton nor by Einstein. Instead it was discovered by
Shapiro, in 1964. It is now called the Shapiro time delay, a retardation of
light signals that pass near a massive body, such as the Sun (Shapiro, 1964).
For instance, a radar signal sent across the solar system past the Sun to a
planet or satellite and returned to the Earth suffers an additional
non-Newtonian delay in its round-trip travel time, given by
$$\delta t \simeq \tfrac{1}{2}(1 + \gamma)\,[240 - 20\ln(d^2/r)]\ \mu\mathrm{s}, \qquad (17)$$
where $d$ is the distance of closest approach of the ray in solar radii, and $r$ is
the distance of the planet or satellite from the Sun, in astronomical units (see
TEGP, section 7.2, for detailed derivation and references).
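Equation (17) is simple enough to evaluate directly. The following Python sketch does so; the function name and argument names are illustrative and not taken from the text.

```python
import math


def shapiro_delay_us(d_solar_radii, r_au, gamma=1.0):
    """Round-trip delay of eq. (17) in microseconds.

    d_solar_radii: closest approach of the ray to the Sun, in solar radii.
    r_au: distance of the planet or satellite from the Sun, in AU.
    gamma: PPN parameter (1 in general relativity).
    """
    return 0.5 * (1.0 + gamma) * (240.0 - 20.0 * math.log(d_solar_radii ** 2 / r_au))


# A ray grazing the solar limb to a target at 1.52 AU (roughly Mars's distance).
print(round(shapiro_delay_us(1.0, 1.52), 1))
# The same geometry with the ray passing ten solar radii from the Sun.
print(round(shapiro_delay_us(10.0, 1.52), 1))
```

The logarithmic dependence on $d$ means the delay falls off slowly as the ray moves away from the Sun, which is why measurements near superior conjunction are the most sensitive to $\gamma$.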
[Figure: time-delay measurements of $\tfrac{1}{2}(1 + \gamma)$, on an axis running from 0.88 to 1.08. Entries, in order: passive radar to Mercury and Venus, Shapiro et al. (1971); active radar; anchored spacecraft: Mariner 9, Anderson et al. (1978), Reasenberg and Shapiro (1977); Viking, Shapiro et al. (1977), Cain et al. (1978), Reasenberg et al. (1979) ($\pm 0.001$). A second axis gives the corresponding value of the scalar-tensor parameter $\omega$ (5, 10, 20, 40, $\infty$).]
system. The reason is that the first 'double star' (conventionally referring to
a pair of stars seen to be less than a few arcseconds apart) was discovered
only around 1650 and, until around 1779, such occurrences were viewed
merely as curiosities caused by the chance juxtaposition of stellar images. In
1767, Michell pointed out that the probability of such a random
juxtaposition was sufficiently low for the closest pairs, that they may 'really
consist of stars placed near together, and under the influence of some general
law'. But in 1779, Mayer suggested the serious possibility of small suns
revolving around larger ones, and Herschel began a systematic search for
double stars. By 1803, Herschel had demonstrated conclusively that the
changes in relative positions of some of the double star systems that he had
followed over a 25 year period could only be accounted for if the two stars
were 'intimately held together by the bonds of mutual attraction'. The
motions were seen to correspond to elliptical orbits projected onto the plane
of the sky, a confirmation of the Newtonian inverse square law. Between
1827 and 1832, the problem of deriving the orbit from observational data
was solved by Savary, Encke, and Herschel. In 1889, the first of a new class
of binary star systems, the spectroscopic binaries, was discovered. These are
binary systems in which the two stars cannot be resolved telescopically;
instead the varying Doppler shifts in one or both of the spectra are
measured.†
Analysis of binary-system orbits provides information about the masses
of the components, which are basic quantities for the theory of stellar
structure and evolution. Observations of non-Keplerian apsidal advances in
close binaries with tidal interactions can be used to infer something about
the interior density distribution of the components. The evolution of close
binaries containing massive highly-evolved stars undergoing mass transfer
is important in understanding binary X-ray sources containing neutron
stars and black holes. Newtonian celestial mechanics is the key tool in all
these orbital analyses. Relativistic gravity has typically played a minor role.
However, there is one system in which relativistic celestial mechanics
must be used instead of Newtonian celestial mechanics : the binary pulsar.
Discovered in the summer of 1974 by Hulse and Taylor, it is a pulsar of
nominal period 59 ms in a close binary system with an as yet unseen
companion (Hulse and Taylor, 1975). From detailed analyses of the arrival
times of pulses (which amount to an integrated version of the Doppler-shift
methods used in spectroscopic binary systems) extremely accurate orbital
and physical parameters for the system have been obtained (see Table 5.6).
Because the orbit is so close ($\approx 1\,R_\odot$) and because there is no evidence of an
eclipse of the pulsar signal or of mass transfer from the companion, it is
generally believed that the companion is compact: a white dwarf, a black
hole, or (most likely) a second neutron star. Thus the orbital motion is very
clean, free from tidal or other complicating effects. Furthermore, the data
acquisition is 'clean' in the sense that the observers can keep track of the
pulsar phase with an accuracy of 20 µs, despite gaps of up to six months
between observing sessions. The pulsar has shown no evidence of 'glitches'
in its pulse period.
Three factors make this system an arena where relativistic celestial
mechanics must be used: the relatively large size of relativistic effects
[$v_{\rm orbit}/c \approx (GM/c^2R)^{1/2} \approx 10^{-3}$]; the short orbital period (8 hours), allowing
secular effects to build up rapidly; and the cleanliness of the system, allowing
| Parameter | Symbol (units) | Value (data to Feb. 1985) |
|---|---|---|
| (i) 'Physical' parameters | | |
| Right ascension | $\alpha$ | $19^{\rm h}\,13^{\rm m}\,12^{\rm s}.468 \pm 0^{\rm s}.001$ |
| Declination | $\delta$ | $16^\circ\,01'\,08''.16 \pm 0''.02$ |
| Pulsar period | $P_p$ (s) | 0.059 029 995 2709 ± 20 |
| Derivative of period | $\dot P_p$ (s s$^{-1}$) | $(8.63 \pm 0.02) \times 10^{-18}$ |
| (ii) 'Classical' parameters | | |
| Projected semimajor axis | $a_p \sin i$ (light-sec) | 2.341 85 ± 0.000 12 |
| Eccentricity | $e$ | 0.617 127 ± 0.000 003 |
| Orbital period | $P_b$ (s) | 27 906.981 63 ± 0.000 02 |
| Longitude of periastron | $\omega_0$ (deg) | 178.8643 ± 0.0009 |
| Julian ephemeris date of periastron and reference time for $P_b$ and $\omega_0$ | $T_0$ | 2 442 321.433 2084 ± 0.000 0012 |
| (iii) 'Relativistic' parameters | | |
| Mean rate of periastron advance | $\langle\dot\omega\rangle$ (deg yr$^{-1}$) | 4.2263 ± 0.0003 |
| Gravitational redshift and time dilation | $\gamma$ (s) | 0.004 38 ± 0.000 12 |
| Orbital period derivative | $\dot P_b$ (s s$^{-1}$) | $(-2.40 \pm 0.09) \times 10^{-12}$ |
| Orbital inclination | $\sin i$ | 0.72 ± 0.05 |
Epstein (see Haugan, 1985, for review and references). An alternative timing
model, using a slightly different set of relativistic parameters, devised by
Damour and Deruelle (1986), has also been used in the data analysis, with
equivalent results. The values shown in Table 5.6 are from data taken
through February 1985 (Haugan, 1986a); results through August 1983 were
presented by Weisberg and Taylor (1984).
The most convenient way to display these results is to plot the constraints
they imply for the two masses $m_p$ and $m_c$. These are shown in Fig. 5.5. From
$\langle\dot\omega\rangle$ and $\gamma$ we obtain the values $m_p = (1.42 \pm 0.03)\,m_\odot$ and
$m_c = (1.40 \pm 0.03)\,m_\odot$. Equations (20) and (21) then predict values for the
remaining parameters $\dot P_b = (-2.403 \pm 0.002) \times 10^{-12}$ and $\sin i = 0.72 \pm 0.03$,
in complete agreement with the measured values in Table 5.6. This
consistency is also displayed in Fig. 5.5, in which the regions allowed by the
four measured constraints have a single common overlap. This consistency
provides a test of the assumption that the two bodies behave as 'point'
masses, without complicated tidal effects.
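The chain of inference described above can be retraced numerically. The Python sketch below takes the measured $P_b$, $e$, $\langle\dot\omega\rangle$ and $\gamma$ from Table 5.6, solves for the two masses, and then predicts $\dot P_b$ and $\sin i$. Equations (20) and (21) are not reproduced in this passage, so the sketch assumes the standard general-relativistic expressions for the periastron advance, the redshift and time-dilation parameter, and quadrupole orbital decay; the function names and the value of $GM_\odot/c^3$ are likewise assumptions.

```python
import math

T_SUN = 4.925490947e-6            # G*M_sun/c^3 in seconds (assumed constant)
JULIAN_YEAR_S = 365.25 * 86400.0  # seconds per Julian year


def masses_from_timing(pb_s, ecc, omega_dot_deg_yr, gamma_s):
    """Return (m_p, m_c) in solar masses from <omega-dot> and gamma."""
    n = 2.0 * math.pi / pb_s                                  # orbital frequency
    omega_dot = math.radians(omega_dot_deg_yr) / JULIAN_YEAR_S
    # Periastron advance fixes the total mass:
    #   omega_dot = 3 n^(5/3) (T_sun M)^(2/3) / (1 - e^2)
    m_total = (omega_dot * (1.0 - ecc ** 2) / (3.0 * n ** (5.0 / 3.0))) ** 1.5 / T_SUN
    # gamma = e n^(-1/3) T_sun^(2/3) m_c (M + m_c) / M^(4/3): a quadratic in m_c.
    k = ecc * n ** (-1.0 / 3.0) * T_SUN ** (2.0 / 3.0) / m_total ** (4.0 / 3.0)
    rhs = gamma_s / k
    m_c = 0.5 * (-m_total + math.sqrt(m_total ** 2 + 4.0 * rhs))
    return m_total - m_c, m_c


def predicted_pb_dot(pb_s, ecc, m_p, m_c):
    """Orbital period derivative from quadrupole gravitational radiation."""
    n = 2.0 * math.pi / pb_s
    chirp = (m_p * m_c) ** 0.6 / (m_p + m_c) ** 0.2
    f_e = (1.0 + 73.0 / 24.0 * ecc ** 2 + 37.0 / 96.0 * ecc ** 4) / (1.0 - ecc ** 2) ** 3.5
    return -(192.0 * math.pi / 5.0) * (n * T_SUN * chirp) ** (5.0 / 3.0) * f_e


def predicted_sin_i(pb_s, x_light_s, m_p, m_c):
    """sin i from the projected semimajor axis and Kepler's third law."""
    n = 2.0 * math.pi / pb_s
    m_total = m_p + m_c
    return x_light_s * (n * m_total) ** (2.0 / 3.0) / (T_SUN ** (1.0 / 3.0) * m_c)


# Central values from Table 5.6.
PB, ECC, OMEGA_DOT, GAMMA, X = 27906.98163, 0.617127, 4.2263, 0.00438, 2.34185
m_p, m_c = masses_from_timing(PB, ECC, OMEGA_DOT, GAMMA)
print(m_p, m_c)
print(predicted_pb_dot(PB, ECC, m_p, m_c))
print(predicted_sin_i(PB, X, m_p, m_c))
```

Only central values are propagated here; reproducing the quoted uncertainties would require carrying the errors on $\langle\dot\omega\rangle$ and, above all, on $\gamma$ through the same formulas.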
Fig. 5.5. Curves showing constraints on the mass of pulsar and its
companion provided by measured values and estimated errors of the
parameters $\langle\dot\omega\rangle$, $\gamma$, $\dot P_b$, and $\sin i$. Uncertainty in $\langle\dot\omega\rangle$ is less than width of
sloping straight line. All four constraints overlap in the region near 1.4 solar
masses for each body.
[Fig. 5.5 plot. Horizontal axis: mass of pulsar (solar masses), 0 to 3.]
One of the most important results of the analysis of the binary pulsar is the
confirmation of the existence of gravitational radiation through the loss of
orbital energy. Although it is normally stated that Newtonian gravity does
not admit gravitational radiation because the interaction is instantaneous,
this was not necessarily Newton's intent. In the gravitational calculations in
the Principia the interaction was assumed to be instantaneous as a necessary
approximation, while in other writings Newton made it clear that he had in
mind a propagating action, in which the ether between two moving,
gravitating bodies might 'begin to rarify'.† In an unsuccessful attempt to
explain the secular acceleration of the Moon, Laplace assumed that the
speed of gravitational propagation was seven million times that of light.
Nevertheless, the simplest mathematical formulation of Newtonian gravity
assumes instantaneous gravitational interactions. On the other hand,
gravitational radiation is a necessary outcome of general relativity, since the
theory is compatible with special relativity, and with a limiting speed for all
interactions.
By the same token, alternative metric theories of gravity, because of their
compatibility with Lorentz invariance at some level, also predict
gravitational radiation. However, most such theories predict, in addition to
a contribution from 'quadrupole' gravitational radiation analogous to that
of general relativity, a contribution of 'dipole' gravitational radiation. In
such cases, the amount of orbital period decay is expected to be in significant
disagreement with the constraints imposed on the masses by the other three
relativistic parameters. This difficulty dealt a mortal blow to the 'Rosen
bimetric theory', for example (see TEGP, section 12.3, for discussion).
The binary pulsar has yielded a remarkably rich test of gravitation theory,
confirming Newtonian and general relativistic celestial mechanics, and the
existence of gravitational radiation. It has also demonstrated a role for
general relativity in the determination of astrophysical parameters, such as
the mass of a neutron star. The first announcement of the observation of the
orbital period decrease came in late 1978, just at the beginning of the
Einstein centenary year, and toward the end of the 'decades for testing
general relativity'. It is now appropriate to summarize and to attempt to
look to the future of experimental gravitation.
and to higher orders perhaps, but with little reason to anticipate a failure of
Einstein's theory.
On the other hand, although physics is normally concerned with making
predictions about the behavior of systems, physicists are notoriously poor at
predicting the future of their own subject. So even after 300 years of
experimental gravitation, who is to say that there will not be surprises
around the corner?
Acknowledgement
This work was supported in part by the National Science Foundation
[PHY85-13953].
$$U(\mathbf{x}, t) = \int \rho(\mathbf{x}', t)\,|\mathbf{x} - \mathbf{x}'|^{-1}\,d^3x'.$$
The 'order of smallness' is determined according to the rules
$U \sim v^2 \sim \Pi \sim p/\rho \sim O(2)$, $v^i \sim |d/dt|/|d/dx| \sim O(1)$, and so on. A consistent post-Newtonian
limit requires determination of $g_{00}$ correct through $O(4)$, $g_{0i}$ through $O(3)$
and $g_{ij}$ through $O(2)$ (for details see TEGP, section 4.1). The only way that
one metric theory differs from another is in the numerical values of the
coefficients that appear in front of the metric potentials. The parametrized
post-Newtonian (PPN) formalism inserts parameters in place of these
coefficients, parameters whose values depend on the theory under study. In
the standard version of the PPN formalism, summarized in Table 5.7, ten
parameters are used, chosen in such a manner that they measure or indicate
general properties of metric theories of gravity (Table 5.8). The parameters $\gamma$
and $\beta$ are the usual Eddington-Robertson-Schiff parameters used to
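A. Coordinate system:
B. Matter variables:
1. $\rho$ = density of rest mass as measured in a local freely falling frame momentarily
comoving with the gravitating matter.
2. $v^i = (dx^i/dt)$ = coordinate velocity of the matter.
5. $\Pi$ = internal energy per unit rest mass. It includes all forms of non-rest-mass,
non-gravitational energy, e.g., energy of compression and thermal energy.
C. PPN parameters:
$\gamma$, $\beta$, $\xi$, $\alpha_1$, $\alpha_2$, $\alpha_3$, $\zeta_1$, $\zeta_2$, $\zeta_3$, $\zeta_4$.
E. Metric potentials:
$$U = \int \frac{\rho'}{|\mathbf{x} - \mathbf{x}'|}\,d^3x', \qquad U_{ij} = \int \frac{\rho'\,(x - x')_i\,(x - x')_j}{|\mathbf{x} - \mathbf{x}'|^3}\,d^3x',$$
$$\Phi_W = \int \frac{\rho'\rho''\,(\mathbf{x} - \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|^3} \cdot \left(\frac{\mathbf{x}' - \mathbf{x}''}{|\mathbf{x} - \mathbf{x}''|} - \frac{\mathbf{x} - \mathbf{x}''}{|\mathbf{x}' - \mathbf{x}''|}\right) d^3x'\,d^3x'',$$
$$A = \int \frac{\rho'\,[\mathbf{v}' \cdot (\mathbf{x} - \mathbf{x}')]^2}{|\mathbf{x} - \mathbf{x}'|^3}\,d^3x',$$
$$\Phi_1 = \int \frac{\rho' v'^2}{|\mathbf{x} - \mathbf{x}'|}\,d^3x', \qquad \Phi_2 = \int \frac{\rho' U'}{|\mathbf{x} - \mathbf{x}'|}\,d^3x', \qquad \Phi_3 = \int \frac{\rho' \Pi'}{|\mathbf{x} - \mathbf{x}'|}\,d^3x',$$
$$V_i = \int \frac{\rho' v_i'}{|\mathbf{x} - \mathbf{x}'|}\,d^3x'.$$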
| Parameter | What it measures relative to general relativity | Value in general relativity | Value in semi-conservative theories | Value in fully-conservative theories |
References
6.1 Introduction
The problem of motion, i.e. the problem of describing the dynamics of N
gravitationally interacting extended bodies, is the cardinal problem of any
theory of gravity. From the publication of Newton's Principia to the
beginning of the twentieth century this problem has been thoroughly
investigated within the framework of Newton's dynamics and theory of
gravity. This has led to the formulation of many concepts and theoretical
tools which have been applied to other fields in physics. For instance, the
early development of quantum mechanics has been greatly facilitated by the
existence of several structures of classical mechanics: canonical formalism,
the Hamilton-Jacobi equation, action-angle variables, the eigenvalue
problem of secular perturbations of the solar system.... In fact, quantum
mechanics was successfully modelled on the formal structures of classical
mechanics.
In contrast, Einstein's theory of gravity has developed within a
conceptual and mathematical framework which was completely alien to the
Newtonian framework, as is indicated by the name given to it by its founder:
the general theory of relativity ('Die allgemeine Relativitätstheorie',
Einstein, 1916a). As a consequence, the relationship between Einstein's and
Newton's theories of gravity has been, and still is, very peculiar. On the one
hand, from the technical point of view, the existence of Newton's theory
facilitated the early development of Einstein's theory by suggesting an
approximation method (called 'post-Newtonian') which allowed the
theorists to draw quickly some observational consequences of general
relativity. Indeed, this post-Newtonian approximation method, developed
by Einstein, Droste and De Sitter within one year of the discovery of general
relativity, has led to the predictions of: the relativistic advance of the
go into the details of the way these ideas are implemented in the Newtonian
framework, some of which will seem fairly trivial to the reader; but we think
it is important to go through them for the following reason: precisely
because these details are technically trivial (or too well known), one is liable
to miss the ideas which underlie them, so that, when tackling the
corresponding relativistic problem, one has the tendency to transplant,
within Einstein's mathematical framework, exactly the same technical
developments, instead of trying to transmute, within Einstein's conceptual
framework, the ideas underlying them.
In Section 6.2 we pose the Newtonian problem of the motion of N
extended bodies in a general form. In Section 6.3 we discuss the splitting of
this problem into two sub-problems: the external problem (the motion of a
body as a whole), and the internal problem (the intrinsic motion of each
body). This splitting is useful because the two sub-problems are only weakly
coupled. Very fundamental facts underlie the remarkable weakness of this
coupling, and these are discussed in Sections 6.4 and 6.5. In Section 6.4 we
study one aspect of this weak coupling, namely the remarkable smallness of
the influence of the internal structure on the 'external' motion. Reviving the
terminology introduced by M. Brillouin (1922) and T. Levi-Civita (1937a),
we shall speak of an 'effacement' of the internal structure in respect of the
external problem. In Section 6.5 we study the other aspect, namely the
remarkable smallness of the influence of the external motion on the internal
one. Extending the use of Brillouin's and Levi-Civita's terminology, we
shall speak of an 'effacement' of the external structure in the internal
problem. These 'effacement' properties are closely linked with the so-called
'principle of equivalence' (in a sense they can be regarded as precise
formulations of some aspects of the principle of equivalence, the latter often
being expressed either in a vague manner or in a precise but restricted one).
While preparing this article I went back to read those sections of the
Principia in which Newton discusses the equivalence of inertial and
gravitational mass. Newton was fully aware of the remarkable character of
this equivalence, and he gives as many arguments as possible in its favour. In
particular I found, much to my surprise, that Newton gives a very
interesting argument, about the relative motion of celestial bodies, which
anticipates ideas recently put forward by Nordtvedt (1968) within the
framework of relativistic theories of gravity (see Section 6.6). We conclude
our discussion of the Newtonian problem of motion by outlining the way the
two sub-problems can be effectively solved (Section 6.7).
In the second part of this article (Sections 6.8-6.15) we discuss the
report of Ehlers and Walker), Schutz (1984, 1985), Will (1986), and the
proceedings of the 114th IAU symposium (Kovalevsky and Brumberg,
1986) (in particular see Brumberg, and Grishchuk and Kopejkin).
(3)
$$\lim_{|\mathbf{x}| \to \infty,\ t = \mathrm{const.}} U(\mathbf{x}, t) = 0 \qquad (4b)$$
(where $|\mathbf{x}|$ is the Euclidean norm of $\mathbf{x}$). Then the only acceptable solution of
Poisson's equation is:
$$U(\mathbf{x}, t) = G \int \frac{\rho(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|}\,d^3x', \qquad (5)$$
where $|\mathbf{x} - \mathbf{x}'|$ denotes the Euclidean distance between the field point $\mathbf{x}$ and
the source point $\mathbf{x}'$, and $d^3x'$ denotes the Euclidean volume element in
cartesian coordinates, $dx'^1\,dx'^2\,dx'^3$.
Now the dynamics of the system are described by the evolution in time of
$\rho(\mathbf{x}, t)$ and $\mathbf{v}(\mathbf{x}, t)$, the evolution of the latter variables being governed by eqs.
(7)
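$$\frac{dF(\mathbf{x}, t)}{dt} = \frac{\partial F(\mathbf{x}, t)}{\partial t} + v^i\,\frac{\partial F(\mathbf{x}, t)}{\partial x^i} \qquad (9)$$
denotes the 'convective derivative' of $F$, implies, when applied twice to the
definition (7),
$$m_a \frac{d^2 z_a^i}{dt^2} = \int_{V_a} \rho\,\frac{dv^i}{dt}\,d^3x. \qquad (10)$$
By the local equations of motion $\rho\,dv^i/dt = \mathcal{F}^i$, where $\mathcal{F}^i$ denotes the local
force density, so that
$$m_a \frac{d^2 z_a^i}{dt^2} = \int_{V_a} \mathcal{F}^i\,d^3x \qquad (11)$$
('centre-of-mass theorem'). In our perfect fluid model, eq. (3), one has
$$\mathcal{F}^i = -\frac{\partial p}{\partial x^i} + \rho\,\frac{\partial U}{\partial x^i}. \qquad (12)$$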
p ( ' , t) d 3 x ' ·
other bodies, is
$$U^{(e)a}(\mathbf{x}, t) := \sum_{b \neq a} G \int_{V_b} \frac{\rho(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|}\,d^3x', \qquad (16)$$
while the basic equation of the $a$th internal problem (motion in the
centre-of-mass frame of the $a$th body) is (from eqs. (3) and (17)-(18))
$$\rho_a \left(\frac{\partial w_a^i}{\partial t} + w_a^j\,\frac{\partial w_a^i}{\partial y_a^j}\right) = \mathcal{F}^i_{(s)a} + \mathcal{F}^i_{(e)a} - \rho_a \frac{d^2 z_a^i}{dt^2} \qquad (20)$$
or, equivalently, by using eq. (19):
$$\rho_a \left(\frac{\partial w_a^i}{\partial t} + w_a^j\,w^i_{a,j}\right) = \mathcal{F}^i_{(s)a} - \frac{\rho_a}{m_a} \int_{V_a} \mathcal{F}^i_{(s)a}\,d^3y_a + \mathcal{F}^i_{(e)a} - \frac{\rho_a}{m_a} \int_{V_a} \mathcal{F}^i_{(e)a}\,d^3y_a. \qquad (21)$$
$$F^i_{(s)a} := \int_{V_a} \mathcal{F}^i_{(s)a}\,d^3y_a, \qquad (22)$$
and, moreover, the first term in its right-hand side is of mixed
external-internal origin (as discussed later). On the other hand, the internal problem,
eq. (21), contains, as third and fourth terms in its right-hand side, terms of
external origin, the 'tidal' force-density:
(23)
(25)
this is easily checked by using Gauss's theorem and the vanishing of the
pressure outside the body, for the first term on the right-hand side of eq. (25),
and by replacing the value (15) for the self-gravitational potential in the
second term, leading to
$$F^i_{(s)a} = 0 + \int_{V_a} \rho_a\,\frac{\partial U^{(s)a}}{\partial y_a^i}\,d^3y_a = -G \iint \frac{\rho_a(\mathbf{y}_a)\,\rho_a(\mathbf{y}_a')}{|\mathbf{y}_a - \mathbf{y}_a'|^3}\,(y_a^i - y_a'^i)\,d^3y_a\,d^3y_a' = 0 \qquad (26)$$
(29)
where we have used the vanishing of the first-order relative mass moment
denotes the 'inertia tensor' (second-order relative mass moment) and where
$O_{\rm ext}(\alpha^3)$ denotes a remainder term (mainly due to the third-order moment
coupling) which is of the order of $mL^3\,\partial^4 U \sim m(L/R)^3\,\partial U$, i.e. $\alpha^3$ smaller than
the first term in the right-hand side of eq. (31). In other words, if one uses a
system of units adapted to the external problem, i.e. such that a
characteristic mass, $m$, a characteristic separation, $R$, and Newton's
constant (or a characteristic orbital time), are all equal to one:
$$m = R = G = 1, \qquad (34)$$
then the first term on the right-hand side of eq. (31) is numerically of order
one, the second term is numerically $\sim \alpha^2$ while the remaining term is $\sim \alpha^3$.
Note that by its definition (16) the Laplacian of $U^{(e)a}$ is zero within the $a$th
body so that we can replace, in eq. (31), the inertia tensor, $I_a^{ij}$, by its
(symmetric and) trace-free part, the 'quadrupole tensor':
$$Q_a^{ij}(t) := I_a^{ij} - \tfrac{1}{3}\,\delta^{ij} I_a^{kk}, \qquad (35)$$
and thus the equation for the external motion reads:
·
(36)
dt 2
If we introduce an 'ellipticity parameter', $\epsilon$,
$$\epsilon := \sup_a \left(\frac{|Q_a^{ij}|}{|I_a^{ij}|}\right) \qquad (37)$$
(the vertical bars denoting the Euclidean norm of tensors), then $0 \le \epsilon \le 1$ and
the magnitude of the second term on the right-hand side of eq. (36) is in fact
$\sim \epsilon\alpha^2$, which can be much smaller than $\alpha^2$ if, as is frequently the case (for
reasons to be discussed later), $\epsilon \ll 1$.
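To make the definitions (35) and (37) concrete, the following Python sketch forms the trace-free quadrupole tensor from a given second-order mass moment and evaluates the ratio of Euclidean norms for a single body. The function names are illustrative; the text's $\epsilon$ is the supremum of this ratio over all bodies, and the input here is the second mass moment $I^{ij}$ in the sense of the text, not the rigid-body moment of inertia.

```python
import math


def quadrupole_tensor(inertia):
    """Trace-free part Q^{ij} = I^{ij} - (1/3) delta^{ij} I^{kk}, eq. (35)."""
    trace = sum(inertia[k][k] for k in range(3))
    return [[inertia[i][j] - (trace / 3.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def euclidean_norm(tensor):
    """Square root of the sum of squares of all nine components."""
    return math.sqrt(sum(tensor[i][j] ** 2 for i in range(3) for j in range(3)))


def ellipticity(inertia):
    """Ratio |Q|/|I| for one body; eq. (37) takes the supremum over bodies."""
    return euclidean_norm(quadrupole_tensor(inertia)) / euclidean_norm(inertia)


sphere = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
slightly_oblate = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.99]]
needle = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

print(ellipticity(sphere))
print(ellipticity(slightly_oblate))
print(ellipticity(needle))
```

A spherically symmetric mass moment gives exactly zero, a slightly flattened body gives a value of a few parts in a thousand, and even an extremely elongated distribution stays below one, consistent with $0 \le \epsilon \le 1$.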
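Finally, we must replace the external potential $U^{(e)a}$ (eq. (16)) in eq. (36), i.e.
$$U^{(e)a}(\mathbf{x}, t) = \sum_{b \neq a} G \int_{V_b} d^3y_b\,\frac{\rho_b(\mathbf{y}_b, t)}{|\mathbf{x} - \mathbf{z}_b - \mathbf{y}_b|} \qquad (38)$$
by its $\alpha$-expansion. The following Taylor expansion in powers of $\mathbf{y}_b$ (with
$A_{,i} := \partial A/\partial x^i$)
$$\frac{1}{|\mathbf{x} - \mathbf{z}_b - \mathbf{y}_b|} = \frac{1}{|\mathbf{x} - \mathbf{z}_b|} - \left(\frac{1}{|\mathbf{x} - \mathbf{z}_b|}\right)_{,i} y_b^i + \frac{1}{2} \left(\frac{1}{|\mathbf{x} - \mathbf{z}_b|}\right)_{,ij} y_b^i y_b^j + O(|\mathbf{y}_b|^3), \qquad (39)$$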
b# a Ix bl -
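Hence we get for the $\alpha$-expanded equation of external motion:
$$m_a \frac{d^2 z_a^i}{dt^2} = \sum_{b \neq a} \left\{ G m_a m_b\,\frac{\partial}{\partial z_a^i}\,\frac{1}{|\mathbf{z}_a - \mathbf{z}_b|} + \frac{1}{2}\,G\,(m_b Q_a^{jk} + m_a Q_b^{jk})\,\frac{\partial^3}{\partial z_a^i\,\partial z_a^j\,\partial z_a^k} \left(\frac{1}{|\mathbf{z}_a - \mathbf{z}_b|}\right) \right\} + O_{\rm ext}(\alpha^3). \qquad (41)$$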
Now the system of eqs. (41), with $a = 1, \ldots, N$, does not yield an
autonomous system for the external motion because it contains, along with
the variables $z_a^i(t)$ describing the external motion, and the constant
parameters $m_a$, the time-varying quadrupole tensors $Q_a^{ij}(t)$ which depend on
the internal motion of the bodies. However, in eq. (41) the contribution of
the internal structure $Q^{ij}(t)$ to the net force is only $\sim G\epsilon m^2 L^2/R^4$, which is $\epsilon\alpha^2$
smaller than the Newtonian force between two point-masses $\sim Gm^2/R^2$
(first term on the right-hand side of eq. (41)). On the other hand, an a priori
order of magnitude estimate of the influence of the internal structure on the
first form of the equation of the external problem, eq. (19), would have
yielded an internal-structure-dependent contribution $\sim Gm^2/L^2$, possibly
coming from the self-force. Therefore, in Newtonian theory there is a drastic
reduction of the influence of the internal structure on the external motion by
a factor $\sim \epsilon\alpha^4$ from a priori estimates, and by a factor $\sim \epsilon\alpha^2$ in the final result.
Using a terminology introduced by Brillouin (1922) and Levi-Civita
(1937a), and revived recently in a study of the motion of two compact bodies
(Damour, 1983a), we shall speak of an effacement of the internal structure in
the external problem. When one can neglect the relative order of magnitude
$\epsilon\alpha^2 + \alpha^3$ the 'effacement' allows one to reduce the external problem to the
problem of the motion of $N$ point-masses fully characterised by their
positions, $z_a^i(t)$, and their (constant) masses, $m_a$, and subject to the
autonomous differential system:
$$m_a \frac{d^2 z_a^i}{dt^2} = \sum_{b \neq a} G m_a m_b\,\frac{\partial}{\partial z_a^i}\,\frac{1}{|\mathbf{z}_a - \mathbf{z}_b|}. \qquad (42)$$
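The right-hand side of the point-mass system (42) is straightforward to evaluate. The Python sketch below computes the acceleration of each body by carrying out the gradient explicitly, $\partial/\partial z_a^i\,(1/|\mathbf{z}_a - \mathbf{z}_b|) = -(z_a^i - z_b^i)/|\mathbf{z}_a - \mathbf{z}_b|^3$. The function name and the choice of units with $G = 1$ by default are assumptions of the sketch.

```python
def point_mass_accelerations(masses, positions, G=1.0):
    """Accelerations d^2 z_a / dt^2 implied by eq. (42) for N point masses.

    Both sides of eq. (42) carry a factor m_a, so the acceleration of body a
    is the sum over b != a of -G m_b (z_a - z_b)/|z_a - z_b|^3.
    """
    accelerations = []
    for a, z_a in enumerate(positions):
        total = [0.0, 0.0, 0.0]
        for b, z_b in enumerate(positions):
            if a == b:
                continue
            d = [z_a[k] - z_b[k] for k in range(3)]
            r3 = sum(c * c for c in d) ** 1.5
            for k in range(3):
                total[k] -= G * masses[b] * d[k] / r3
        accelerations.append(total)
    return accelerations


# Two bodies of mass 1 and 3 separated by a distance 2 along the x axis.
acc = point_mass_accelerations([1.0, 3.0], [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
print(acc[0])  # acceleration of the lighter body, towards the heavier one
print(acc[1])  # acceleration of the heavier body, towards the lighter one
```

Feeding these accelerations to any standard ordinary-differential-equation integrator gives the autonomous external motion; the neglected terms are of relative order $\epsilon\alpha^2 + \alpha^3$, as stated above.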
$$\mathcal{F}^i_{(t)a}(\mathbf{y}_a) := \rho_a\,U^{(e)a}_{,i}(\mathbf{z}_a + \mathbf{y}_a) - \frac{\rho_a}{m_a} \int_{V_a} d^3y_a'\,\rho_a(\mathbf{y}_a')\,U^{(e)a}_{,i}(\mathbf{z}_a + \mathbf{y}_a'). \qquad (44)$$
A priori, the relative order of magnitude of the 'external' force $\mathcal{F}^i_{(t)a}$, as
compared with the gravitational self-force density, $\rho_a U^{(s)a}_{,i} \sim \rho_a Gm/L^2$, would
be thought to be given by the order of magnitude of the terms explicitly
appearing on the right-hand side of eq. (44), which are $\sim \rho_a Gm/R^2$, so that, a
priori, the influence of the external motion ($R(t)$) on the internal evolution
would be of relative order $(L/R)^2 = \alpha^2$. However, a 'peculiar circumstance'
(which was raised to the status of a fundamental physical fact by Einstein)
causes a reduction between the two terms of the right-hand side of (44),
leading to a further 'effacement' of the influence of the external motion on the
internal one. Indeed, using the same Taylor expansion, eq. (30), as in the
external problem and again taking into account the vanishing of the
first-order relative moment, eq. (32), we find:
$$\mathcal{F}^i_{(t)a}(\mathbf{y}_a) = \rho_a \left[ U^{(e)a}_{,ij}(\mathbf{z}_a)\,y_a^j + \tfrac{1}{2}\,U^{(e)a}_{,ijk}(\mathbf{z}_a) \left( y_a^j y_a^k - \frac{I_a^{jk}}{m_a} \right) + O_{\rm int}(\alpha^5) \right] \qquad (45)$$
(where again $I_a^{jk}$ can be replaced by $Q_a^{jk}$).
The first term in the right-hand side of eq. (45) is now of order $\alpha^3$ (without
factors $\epsilon$) as compared with the gravitational self-force, while the second and
third terms are, respectively, of order $\alpha^4$ and $\alpha^5$. In other words, if one uses a
(47)
$$\frac{\partial \rho_a}{\partial t} + \frac{\partial}{\partial y_a^i}\,(\rho_a w_a^i) = 0, \qquad (48)$$
with some given equation of state (1) and the condition that $p_a$ and $\rho_a$ vanish
outside some compact domain $V_a$.
the 'equivalence principle for massive bodies'. As we shall discuss later, the
strong form of the effacement of internal structure which holds within
general relativity guarantees that in particular $\delta_{ab} = 0$ (even for strongly
self-gravitating bodies), but Nordtvedt pointed out that in other relativistic
theories of gravity $\delta_{ab}$ might become non-zero for self-gravitating bodies.
To be fair it should be added that Newton's estimate (54) (given without
intermediate steps) is incorrect both in magnitude and in sign and that the
first correct calculation of the 'polarisation' of the orbit of a satellite by a
relative acceleration field towards the Sun ('gravitational Stark effect') was
achieved by Nordtvedt (1968). However, even if Newton's conclusion (based
on the overestimate (54)) that the observations of the Jovian system (in his
time) allow one to conclude that $|\delta_{ab}| < 10^{-3}$ (a value consistent with
Newton's best pendulum experiments) is too optimistic, we can still admire
Newton's remarkable insight which led him not only to ask profound
questions about the strong equivalence principle but also to suggest where
to look for a possible astronomical violation of the equality (49).
d Xe
0 = - + Eo (Xe) + ex 2 E ( Xe , X J + ex 3 E 3 (Xe , XJ + · · · , (55a)
d te 2
d�
0 = - + lo ( XJ + ex 3 l 3 (X i , Xe ) + ex4J (X i , X e ) + . .
·, (55b)
d ti 4
where te (resp . tJ is the dimensionless external (resp . internal) time (in eq .
(55b) the matrix Xi describing, a priori, a set of fields, can be infinite) .
Therefore, in principle, one can solve the coupled system (55) by an iteration
process which starts with knowledge of the solution of the uncoupled system
obtained as the ex = 0 limit of ( 55). Note also that under normal conditions
the internal motion will be nearly stationary and , in fact , slowly 'driven' by
external influences, so that one expects $X_i$ to vary with the external time
$$t_e = \alpha^{3/2}\, t_i. \tag{56}$$
Using $t_e$ instead of $t_i$ in eq. (55b) leads to
$$0 = I_0(X_i) + \alpha^{3/2}\,\frac{dX_i}{dt_e} + \alpha^3 I_3(X_i, X_e) + \cdots, \tag{57}$$
which, together with (55a), can be iteratively solved starting with a
stationary internal trial solution.
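The step from (55b) to (57) can be spelled out as follows. This is only a sketch; it assumes, as the garbled originals suggest, that (55b) is first order in $t_i$ and that the internal operators are labelled $I_0$, $I_3$, $I_4$.

```latex
% Change of time variable (56): t_e = \alpha^{3/2} t_i
\frac{d}{dt_i} = \frac{dt_e}{dt_i}\,\frac{d}{dt_e} = \alpha^{3/2}\,\frac{d}{dt_e}
% Inserting this into (55b):
0 = \alpha^{3/2}\,\frac{dX_i}{dt_e} + I_0(X_i) + \alpha^{3} I_3(X_i, X_e) + \alpha^{4} I_4(X_i, X_e) + \cdots
% At lowest order in \alpha the time derivative drops out:
0 = I_0\bigl(X_i^{(0)}\bigr)
```

The last line is the stationary internal trial solution from which the iteration starts; the time derivative and the external couplings enter only as corrections of order $\alpha^{3/2}$ and $\alpha^3$.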
Another way of building a zeroth approximation to the system (55) is to
start with the assumption that the N bodies undergo rigid motions. This
allows one to reduce the system (55) (either taken exactly or truncated at
some finite approximation in $\alpha$) to an autonomous system of 6N differential
equations for the 6N parameters describing the translational and the
rotational degrees of freedom of the N bodies (see e.g. Fock, 1959, and
Dixon, 1979, for detailed accounts of this approach).
However, the extreme accuracy (sometimes even at the centimetre level)
now needed in both the translational motion of the planets (especially the
Moon and the Earth) and the internal motion of the Earth makes it
necessary to go beyond the approximation of rigid bodies, while still taking
into account the smaller $\alpha$-dependent couplings in the system (55). For
recent reviews of the state of the art on the problem of Newtonian motion
the reader is referred to Kovalevsky and Brumberg (1986) and references
therein.
Then the equations describing both the dynamics of the matter and the
generation of the 'relativistic gravitational potentials', $g_{\mu\nu}$, by the matter are
the Einstein equations ($c$ = velocity of light)
$$E_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}, \tag{60}$$
where $E_{\mu\nu}$ denotes the Einstein tensor
(61)
The Einstein equations (60) replace eqs. (2) (continuity), (3) (Euler) , and
(4a) (Poisson) of the corresponding Newtonian problem. As they are purely
local they must be supplemented by boundary conditions expressing the
The problem of motion in Newtonian and Einsteinian gravity 147
(67)
Similarly, let $\beta_i$ denote the ratio between a characteristic 'internal' velocity
(velocity of spinning motion) and the velocity of light:
(72)
If m denotes a characteristic mass let us also introduce two other relativistic
dimensionless parameters, an 'external ' one,
(73)
2 (relative d ev1ation
mL ·
this makes a total of six parameters ($\alpha$, $\beta_e$, $\beta_i$, $\gamma_e$, $\gamma_i$, $\epsilon$) that can be used to
concoct some approximation methods.
First let us note that some of these parameters are linked by equalities or
inequalities. Indeed, first,
(76)
and as it is generally expected that $L \gtrsim Gm/c^2$ (the equality, in order of
magnitude, being reached only for 'condensed bodies', i.e. neutron stars or
black holes) we shall always have at least
$$\gamma_i \lesssim 1. \tag{77}$$
Hence, from our previous minimal assumption about the relative separation
of the bodies we shall have, in general:
$$\gamma_e \lesssim \alpha \ll 1 \quad \text{(minimal assumption)}. \tag{78}$$
On the other hand, we know, a priori, only that $\beta_e \lesssim 1$, $\beta_i \lesssim 1$, $\gamma_i \lesssim 1$ and $\epsilon \lesssim 1$.
However, if we further assume that the system is gravitationally bound, so
that by the virial theorem $(v^{\rm ext})^2 \sim Gm/R$ (or nearly gravitationally bound as
in wide-angle scattering) then we shall have
$$\beta_e^2 \sim \gamma_e \ll 1 \quad \text{(gravitationally bound or wide-angle scattering)}, \tag{79a}$$
but in the case of small-angle gravitational scattering we would have
$$\gamma_e \ll \beta_e^2 \lesssim 1 \quad \text{(small-angle scattering)}. \tag{79b}$$
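To make the orders of magnitude concrete, the parameters can be evaluated for an illustrative system. The sketch below is not part of the original text. It assumes the definitions $\alpha = L/R$, $\beta_e = v^{\rm ext}/c$, $\gamma_e = Gm/c^2R$ and $\gamma_i = Gm/c^2L$, which are inferred from the surrounding discussion because the defining equations are missing from this extract. The numerical inputs are hypothetical round values for a binary of neutron-star-like bodies.

```python
# Physical constants in SI units (rounded values).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # velocity of light, m/s


def expansion_parameters(m, L, R, v_ext):
    """Return (alpha, beta_e, gamma_e, gamma_i) for bodies of mass m and
    size L, separated by R and moving with external velocity v_ext.

    The definitions used here are assumptions inferred from the text.
    """
    alpha = L / R                  # size of a body over the separation
    beta_e = v_ext / C             # external velocity over c
    gamma_e = G * m / (C**2 * R)   # strength of the external field
    gamma_i = G * m / (C**2 * L)   # strength of the self-field
    return alpha, beta_e, gamma_e, gamma_i


# Hypothetical inputs: two compact bodies of about 1.4 solar masses.
m = 2.8e30                   # kg
L = 1.0e4                    # m
R = 2.0e9                    # m
v_ext = (G * m / R) ** 0.5   # virial estimate (v_ext)^2 ~ Gm/R

alpha, beta_e, gamma_e, gamma_i = expansion_parameters(m, L, R, v_ext)
print("alpha    =", alpha)
print("beta_e^2 =", beta_e**2)
print("gamma_e  =", gamma_e)
print("gamma_i  =", gamma_i)
```

With the virial choice of `v_ext` the script reproduces the bound-system relation (79a), and the output satisfies the minimal assumption (78) and the bound (77) for these inputs.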
In some special situations one can encounter other small parameters, such
as the ratio between some masses (which plays an important role in the
mechanics of the solar system), but we will discuss only the more general
cases (see e.g. Brumberg, 1986, and references therein for the proposal of a
relativistic approximation method based on such a small parameter m/M).
Before embarking on a quick survey of the main approximation methods
which have been used in the problem of motion, let the reader be warned
that there are many different points of view concerning these methods and
that my viewpoint is probably biased. He is therefore invited to consult
other reviews, e.g. Ehlers et al. (1976), Ehlers (1977, 1980), Walker (1984),
Ehlers and Walker (1984), Schutz (1984, 1985), Will (1986).
equation of continuity :
(81)
where the 'time' $t := x^0/c$, the 'velocity' $v^i := dx^i/dt = c\,u^i/u^0$ and
$$r_* := r_0 \sqrt{g}\, u^0 \tag{82}$$
is a 'coordinate rest-mass density'.
Note that I have intentionally refrained from using the letter p to denote
any of the relativistic densities (of energy or rest-mass, proper or
'coordinate'). Indeed, the fact that many authors use the same letter, p, to
denote different things is a source of confusion (and error) in the literature
on equations of motion and I did not wish to compound the confusion.
Moreover, in keeping with what I said in the preceding paragraph, I think
that this confusion of notions pertaining to different theories is not innocent
so that the use of different letters helps to contrast Einstein's theory with
Newton's.
Now the modelling of the reduced Einstein equations (69a) on the
Poisson equation (4a) is achieved at a price of several Ansätze. First one
writes
(83)
where $f^{\mu\nu} = f_{\mu\nu} = \mathrm{diag}(-1, +1, +1, +1)$ are the usual components of a 'flat'
metric. Then $h^{\mu\nu}$ is assumed to be everywhere small ('weak gravitational
field') and admits an asymptotic expansion when $\gamma_i \to 0$ of the type:
$$h^{\mu\nu}(\mathbf{x}, t) = \gamma_i\, h^{\mu\nu}_{(2)}(\mathbf{x}, t) + \gamma_i^{3/2}\, h^{\mu\nu}_{(3)}(\mathbf{x}, t) + \gamma_i^{2}\, h^{\mu\nu}_{(4)}(\mathbf{x}, t) + \cdots + \gamma_i^{n/2}\, h^{\mu\nu}_{(n)}(\mathbf{x}, t) + \cdots \tag{84}$$
The necessity to introduce half-integer powers of $\gamma_i$ comes from the basic
assumptions (80) which imply that the spatial part of the 4-velocity $u^k \sim \beta_e \lesssim \gamma_i^{1/2}$
(they also imply that $\Pi_0/c^2 \sim p_0/r_0c^2 \sim \beta_i^2 \lesssim \gamma_i$). Another basic ingredient
of all PNA methods is to require that
$$\frac{1}{c}\frac{\partial h^{\mu\nu}}{\partial t} \sim \gamma_i^{1/2}\, \frac{\partial h^{\mu\nu}}{\partial x^k}. \tag{85}$$
It is assumed that the expansion coefficients $h_{(n)}$ are always and everywhere
of order unity (or less), and that their spatial derivatives are everywhere of
order $1/L$ (or less). Note also that, in general, the asymptotic expansion (84)
is of the generalised type where some dependence on $\gamma_i$ is implicitly
contained in each coefficient $h_{(n)}$. In order to give to (84) the rigorous
meaning of a Poincaré-type asymptotic expansion one should first specify
which of the parameters determining the gravitational field are fixed in the
limit $\gamma_i \to 0$. For one possible approach along these lines see Futamase and
Schutz (1983).
Then one easily sees that if one uses a system of units adapted to the study
of the considered gravitationally interacting system, i .e. such that
G = m = L = 1, (86)
then the numerical value of the velocity of light in these units is such that
(87)
and all the $h_{(n)}$s, their time or space derivatives, and all the internal or
external velocities become of numerical order unity (or less). With this in
mind, one can summarise the post-Newtonian assumptions by saying that
one looks for solutions of the system (69) in the form of formal expansions in
powers of 1/c (and it is convenient to use this language even if one does not
explicitly use the system of units (86)).
Among the field equations (69a) the dominant one is the zero-zero
component which reads, at lowest order:
$$\Delta\left(\gamma_i\, h^{00}_{(2)}\right) = \frac{16\pi G}{c^2}\, r_0 + O(\gamma_i^2/L^2). \tag{88}$$
Eq. (88) is very similar to the Poisson equation (4a), but like the latter it
does not determine $h^{00}_{(2)}$ unless we supplement it by some boundary
conditions. Many authors working with PNAs have implicitly assumed that
it was sufficient to use the same boundary condition as in the Newtonian
case (eq. (4b)) namely:
$$\lim_{\substack{|\mathbf{x}| \to \infty \\ t = \mathrm{const}}} h^{\mu\nu}_{(n)}(\mathbf{x}, t) = 0. \tag{89}$$
$$\gamma_i\, h^{00}_{(2)}(\mathbf{x}, t) = -\frac{4G}{c^2} \int \frac{r_0(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|}\, d^3x', \tag{90}$$
which leads, consistent with the previous assumptions, to a $h^{00}_{(2)}$ whose
maximum value is of order unity.
If we extend this program to higher orders in $\gamma_i^{1/2}$ one finds that it leads
to a formal hierarchy of Poisson equations for the $h_{(n)}$s of the type:
$$\Delta\left(\gamma_i^{n/2}\, h^{\mu\nu}_{(n)}\right) = T^{\mu\nu}\ \text{terms} + (\text{terms known from preceding approximations}). \tag{91}$$
1 54 T. Damour
For the first few steps this hierarchy admits a unique solution fulfilling the
boundary conditions (89), and satisfying everywhere $h_{(n)} \lesssim 1$. However, the
terms resulting from lower approximations on the right-hand side of eq. (91)
quickly lead to badly behaving Poisson equations which do not admit any
solutions fulfilling the boundary conditions (89) (in the present scheme using
harmonic coordinates this problem arises first for h?J> and h?6�). The reason
for this incompatibility between the hierarchy (9 1) and the boundary
conditions (89) does not, however, mean that there is a fundamental
breakdown of the post-Newtonian approach at the level of h?J> or h?6� (called
second-post-Newtonian, or 2PN, level). Neither does it mean that the
harmonic coordinates become intrinsically 'bad' at the 2PN level. The
blame for this incompatibility lies with the boundary conditions (89).
Indeed, as was first realised clearly by Fock (1959), the post-Newtonian
expansion (84)-(85) is basically a near-zone expansion of the exact $h^{\mu\nu}(\mathbf{x}, t)$,
i.e. an expansion valid only up to a radius, $r$, around the system which is
much smaller than a characteristic wavelength, $\lambda$, of the gravitational
radiation emitted by the system: $r \ll \lambda$. As a consequence the asymptotic
behaviour for $|\mathbf{x}| \to \infty$, $t$ fixed, of each term of the right-hand side of eq. (84)
has, a priori, nothing to do with the real asymptotic behaviour of the exact
$h^{\mu\nu}(\mathbf{x}, t)$ (because, indeed, the expansion on the right-hand side of eq. (84)
becomes invalid when $r$ gets large enough to reach the wave-zone $r \gtrsim \lambda$). This
is clearly understood for the trivial example of the near-zone (or $1/c$)
expansion of the simplest radiative field:
$$\frac{S(t - r/c)}{r} = \frac{S(t)}{r} - \frac{1}{c}\,\dot{S}(t) + \frac{1}{2c^2}\,\ddot{S}(t)\, r - \frac{1}{6c^3}\,\dddot{S}(t)\, r^2 + \cdots \tag{92}$$
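The limited range of validity of expansion (92) can be checked numerically. The following sketch is an illustration only: it assumes a monochromatic source $S(t) = \sin(\omega t)$ and units in which $c = 1$, neither of which comes from the text, and compares the first four terms of (92) with the exact retarded field in the near zone and in the wave zone.

```python
import math

C = 1.0       # units in which c = 1 (assumption for the illustration)
OMEGA = 1.0   # angular frequency of the assumed source S(t) = sin(OMEGA * t)
WAVELENGTH = 2.0 * math.pi * C / OMEGA


def exact_field(t, r):
    """Exact retarded field S(t - r/c) / r."""
    return math.sin(OMEGA * (t - r / C)) / r


def near_zone_field(t, r):
    """First four terms of the 1/c expansion (92)."""
    s0 = math.sin(OMEGA * t)
    s1 = OMEGA * math.cos(OMEGA * t)       # first time derivative of S
    s2 = -OMEGA**2 * math.sin(OMEGA * t)   # second time derivative of S
    s3 = -OMEGA**3 * math.cos(OMEGA * t)   # third time derivative of S
    return s0 / r - s1 / C + s2 * r / (2 * C**2) - s3 * r**2 / (6 * C**3)


def relative_error(t, r):
    exact = exact_field(t, r)
    return abs(near_zone_field(t, r) - exact) / abs(exact)


t = 0.3
err_near = relative_error(t, 0.01 * WAVELENGTH)   # r much smaller than the wavelength
err_wave = relative_error(t, 2.0 * WAVELENGTH)    # r of the order of the wavelength
print("near zone:", err_near)
print("wave zone:", err_wave)
```

For these inputs the truncated series is accurate to a small fraction of a per cent at $r = 0.01\lambda$, while at $r = 2\lambda$ it is wrong by orders of magnitude, because the neglected terms grow as powers of $r/\lambda$.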
(94)
f
I, [f] (X , t) o = f (X' , tJIX - X ' I ' d 3 '
x . (95)
The idea behind this formal procedure was to retain both the technical
simplicity (eq. (85) neglecting many terms) and the conceptual ease of the
post-Newtonian approach ($h_{(n)}$ being expressed as a functional of the
instantaneous state of the source, as in Newtonian theory) while still trying
to incorporate the 'real physics' (which states that in Einsteinian theory
gravity propagates with the velocity of light). This programme has been
developed by many authors, notably Peres (1959, 1960), Carmeli (1964,
1965), Synge (1969, 1970), Hogan and McCrea (1974), Anderson and
Decanio (1975). An improvement of the method (leaving the time
derivatives inside the integral operators in eq. (94) and 'reducing' them by
means of the equations of motion) has been proposed by Ehlers (1977, 1980)
and developed by Kerlick (1980) and Caporali (1981a). However, the use of
the near-zone expanded retarded potentials (94) causes the appearance of
divergent integrals. This means that all these schemes break down (because
they become undefined) at the 3PN level (Kerlick, 1980), or even before in
the case of the non-improved schemes. This breakdown is a direct
consequence of the limitation of validity of the post-Newtonian expansions
to the near-zone.
Other authors, conscious of the necessity to incorporate the propagating
character of gravity (both in the near- and the wave-zone), but aware of the
simplifications brought by the PNA assumptions, attempted to devise some
mixed post-Minkowskian (see next section)-post-Newtonian approaches.
See e.g. Fock (1959), Persides (1971a, 1971b), Winicour (1983), Gürses and
Walker (1984), Schäfer (1985). However, in order to really face the problem
of the near-zone limitation of the post-Newtonian approach it seems
necessary to have recourse to new methods (see Sections 6.11 and 6.12).
(( ))
c + r/c = const
ohµ v 1 ohµ v
( 103b)
l
f�! t
c + r/c = const
r
0 ; +� 0 ; = 0,
where $r := (\delta_{ij} x^i x^j)^{1/2}$ and $t := x^0/c$ (these conditions must be satisfied along
all past Minkowski-light-cones). Fock (1959) has proven that the fall-off
condition (103a), together with the 'no-incoming-radiation condition'
(103b) necessarily implies that
$$\gamma_i^n\, h_n^{\mu\nu}(x) = \int d^4x'\, G_{\rm ret}(x - x')\, S_n^{\mu\nu}(x'), \tag{104}$$
where $G_{\rm ret}(x - x')$ is the retarded Green function of the (Minkowski) wave
operator $\Box_f := f^{\mu\nu}\, \partial^2/\partial x^\mu \partial x^\nu$, and where $S^{\mu\nu}$ denotes the right-hand side of
eq. (102) ('effective source of the gravitational field'). Therefore, an
alternative prescription for solving the hierarchy of differential systems (102)
would be to replace it by the hierarchy of iterated retarded integrals (104).
Note, however, that it is not guaranteed that the 'retarded' $h_n$s defined by
(104) will satisfy the Fock conditions (103) (see Leipold and Walker, 1977).
The first step in the hierarchy (104) was introduced and explicitly
worked out by Einstein (1916b); this is the famous 'linearized
approximation' to general relativity:
$$\gamma_i\, h_1^{\mu\nu}(x) = \frac{16\pi G}{c^4} \int d^4x'\, G_{\rm ret}(x - x')\, \tilde{T}^{\mu\nu}(x'), \tag{105}$$
where, following the notation of eqs. (100)-(101), $\tilde{T}^{\mu\nu}$ denotes a Minkowski-like
stress-energy tensor of the matter.
For a long time the post-Minkowskian approximation stayed dormant,
while the post-Newtonian one was developed. It was revived in 1956 and
then began to develop in turn: see Bertotti (1956), Havas (1957), Kerr
(1959), Bertotti and Plebanski (1960) (first attempt to explicitly tackle the
non-linear approximation), Havas and Goldberg (1962), Kühnel (1963,
1964), Stephani (1964), Goenner (1970), and Bennewitz and Westpfahl
(1971). A new approach to the post-linear formalism, based on an attempt to
use retarded Green's functions in curved space-time (however, the final
formulas use only the flat Green function), was devised by Thorne and
Kovacs (1975) and Crowley and Thorne (1977). Recently several authors
spatial range of validity ($r \ll \lambda$). Using this point of view one can similarly
investigate the domain of validity of the null-cone-improved post-Newtonian
approaches of Persides (1971a, b) and Winicour (1983). These
methods essentially consist of looking for expansions of the type
$$h^{\mu\nu}(x^\lambda) = \sum_{m \geq 2} \frac{1}{c^m}\, k_m^{\mu\nu}(\mathbf{x}, u), \tag{109}$$
where $u$ is some retarded time ($\approx t - r/c$) which is kept fixed in the limit
process $1/c \to 0$. Now the expansion (109), if it exists, must 'descend' from the
PM expansion (108), re-written in terms of $(\mathbf{x}, u)$, and re-expanded for
$1/c \to 0$. Then explicit calculations at the 2PM level (in general relativity,
and in some scalar field models with $\lambda\phi^n$ non-linearities) show that these
improved approaches have a range of spatial validity which is still restricted
to the near-zone ($r \ll \lambda$) (only the linearised approximation sees its range of
validity extended to the wave-zone). In particular, this can be used to show
that the 'logarithmic violation' of peeling found by Isaacson, Welling and
Winicour (1984) is only a near-zone effect (behaviour when $r \gg R$ but $r \ll \lambda$)
and says nothing about the 'real' asymptotic behaviour of the gravitational
field at future null infinity.
( 1 10)
n
(where $\delta_{n+1}(\epsilon)/\delta_n(\epsilon) \to 0$ when $\epsilon \to 0$), such that some of the coefficients $f_n(x)$
are not bounded on the domain of variation of the configuration-space
variables $x$. Problems of this type are quite common in physics, particularly
in fluid mechanics, and several methods have been developed to cope with
them (see e.g. Lagerstrom et al., 1967; van Dyke, 1975). There are essentially
two classes of methods. In one class one looks for a generalised asymptotic
expansion,
( 1 1 1)
II
admitting generalised coefficients $g_n(x; \epsilon)$ which are bounded all over the
$$f \simeq \sum_n \delta_n(\epsilon)\, f_n(x), \tag{112a}$$
(112b)
where $X = X(x, \epsilon)$. The subtlety of the latter approach lies in the fact that
when $\epsilon \to 0$ the $\epsilon$-dependent domains of the configuration-space, where the
expansions (112a, b) are a priori valid, are not required to overlap (and in
general they do not).
One can say that, as seen from the post-Newtonian point of view, the
post-Minkowskian expansions constitute a first class 'cure' for the wave-zone
breakdown of PN expansions (see eq. (108)). Now the usual post-Minkowskian
expansion itself exhibits a kind of (weak) breakdown in the
'exponentially far-wave zone'
$$r \gtrsim \lambda \exp\left\{\frac{c^2 \lambda}{4\pi G m}\right\}, \tag{113}$$
because of the appearance of log r/r terms, when r --+ oo , in h1r. However, as
argued already by Fock (1959), Isaacson and Winicour (1968), and more
recently by Anderson (1980) and others, this can easily be cured by a first
class method based on a $(G/c^2)$-dependent change of coordinates (for a
complete proof that this cure works at all orders in $G$, see e.g. Blanchet
(1987)).
The singular-perturbation methods of the second class have been found
useful in several problems of relativistic gravity. They have been used, in an
intuitive way, by Fock (1959, section 87) and Manasse (1963); and were
introduced in a more formal way in the late sixties by Burke (1971) and
Thorne (1969) to tackle the wave-zone breakdown of PN expansions.
Recently they have been used by several authors: e.g. Demianski and
Grishchuk (1974), D'Eath (1975a, b), Kates (1980a, b), Anderson, Kates,
Kegeles and Madonna (1982), Damour (1983a), Anderson and Madonna
(1983), Blanchet and Damour (1984b), Thorne and Hartle (1985), Futamase
(1986). A general discussion of the formal way in which these methods fit
into the framework of General Relativity has been attempted (Kates, 1981;
describe fully the physical problem at hand so that they must somehow fit
together. Now the literature which deals with precise rules for the 'matching
of asymptotic expansions' is full of ambiguities and controversies. In the
context of General Relativity some of these ambiguities have been pointed
out recently (Blanchet and Damour, 1984b; Damour, 1987). In particular, a
re-examination of the problem initially studied by Burke (1971) has shown
the necessity to take non-linear effects into account with great care (Blanchet
and Damour, 1984b, and references therein). This is why, when using this
type of method in the problem of the motion of condensed bodies, care has
been taken to control in sufficient detail the structure of the non-linear
corrections so as to be able to extract minimal, but reliable, information
from the matching of the two expansions (Damour, 1983a, sections 5 and 6).
( 1 15)
with
( 1 16)
Now the parameter $\alpha$ must necessarily be small in all situations where one
wants to talk about an external problem in contrast to an internal one.
Moreover, as $\gamma_e \lesssim \alpha$ (eq. (78)) there are certainly at least two small
parameters among the five appearing in eq. (119a). Therefore we can expect
that, in the most general case, it should be possible to write the following
expanded form of the external equations of motion :
the terms necessary to derive the 2.5PN equations of motion (of the form
(119d) without $\gamma_i$-expansion). The 1PM equations of motion of spinning
bodies have been studied by Ibanez and Martin (1982).
Explicit results of the fully expanded form (119d) have been obtained at
the 1PN level by many authors starting with the pioneering work of Lorentz
and Droste (1917). Note that the latter work is the first one to give the
correct 1PN equations of motion; the earlier results of Droste and De Sitter
and the later results of Chazy and Levi-Civita are incorrect. For references
to more recent works see e.g. Brumberg (1972), Ehlers (1977, 1984),
Caporali (1981b), Damour (1983a), Grishchuk and Kopejkin (1986) and a
long-promised (but not yet published) review article by P. Havas. Special
mention should be given to the problem of the motion of spinning bodies
which involves several subtle issues (centre of mass, ordinary versus
acceleration-dependent Lagrangian). This problem has been reviewed by
Barker and O'Connell (1979). Further references to the works of Michalska,
Gabosz, Lanzano and Brumberg will be found in Brumberg (1972). Recent
works are due to Damour (1978, 1982), Dallas (1977), Bel and Martin
(1981), Ibanez and Martin (1982) and Thorne and Hartle (1985).
Concerning the issue of the presence of accelerations in the Lagrangian for
spinning bodies and of their avoidance by use of a suitable non-covariant
spin-condition see e.g. Damour (1982).
Explicit results at the 2PN level have been obtained by Ohta et al. (1974b)
for N point-masses, by Damour and Deruelle (1981a) and Damour (1982,
1983a) for two condensed objects, by Schäfer (1985) for N point-masses, and
by Kopejkin (1985) and Grishchuk and Kopejkin (1986) for two weakly
self-gravitating fluid balls. The 2PN equations of motion can be derived from a
Hamiltonian as first shown by the work of Ohta, Okamura, Kimura and
Hiida (1974b). However, there are several subtle issues connected with this
Hamiltonian and with its Legendre-transform associated Lagrangian. On
the one hand, one term in the potential part of this Lagrangian, for the
two-particle case, has been wrongly evaluated in Ohta et al. (1974a), as was first
remarked in Damour (1982). The correct evaluation has been given in
Damour (1983a) and Damour and Schäfer (1985). More important is the
issue of the influence of the choice of the coordinate system on the
Lagrangian. Indeed, it has been shown by Damour and Deruelle (1981b)
and Damour (1982) that the 2PN motion of two condensed bodies in
harmonic coordinates could be deduced only from a generalised Lagrangian
function involving not only positions and velocities but also
accelerations. This issue has been recently clarified by Damour and Schäfer
are incorrect, but it means that they have not yet been justified within the
PN-fluid schemes. More work is needed to control the effect of the
higher-order self-field effects.
( 122)
to be compared with the Newtonian equation (55b). In the Newtonian case
the influence of the external structure on the internal problem was effaced up
to the order $\alpha^3$. But here we see that there already is a strong influence of the
external motion on the internal one of order $\beta_e^2$ or $\gamma_e$. The presence of the $\beta_e^2$
coupling is readily understood if one remembers that with the
transformation (121) one is using the external coordinate time, $t$, as internal
time. Then this means that each body will suffer an apparent 'Lorentz
contraction' in the direction of external motion by a factor
$\sqrt{1 - (v^{\rm ext}/c)^2} \approx 1 - \tfrac{1}{2}(v^{\rm ext}/c)^2$. Similarly, the $\gamma_e$-coupling in eq. (122) can be
related to what I will call the apparent 'Einstein-contraction', i.e. the factor
linking the coordinate length to the proper length, as was first discussed by
Einstein (1916a). In harmonic coordinates this apparent 'Einstein-contraction'
of a body embedded in an external gravitational potential $U^{(e)}_a$
is isotropic and is given by a factor $\approx 1 - U^{(e)}_a/c^2$. The Lorentz contraction
effects mean that (when real tidal effects are negligible or subtracted) a point
on the Earth's surface suffers, in the global post-Newtonian coordinate
system, an anisotropic (coordinate) displacement of maximal amplitude (in
the direction of the velocity of the Earth $v_\oplus$)
$$\tfrac{1}{2}\left(\frac{v_\oplus}{c}\right)^2 R_\oplus \simeq 3\ \text{cm}.$$
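The order of magnitude quoted for the Earth can be reproduced with rounded textbook values. The sketch below is an illustration under stated assumptions: the numerical constants are standard rounded values, not taken from the text, and the 'Einstein' estimate simply multiplies the factor $U/c^2$ for the Sun's potential at the Earth by the Earth's radius, which is an assumed way of turning the isotropic factor into a length.

```python
C = 2.998e8         # velocity of light, m/s
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
V_EARTH = 2.98e4    # orbital velocity of the Earth, m/s (rounded)
R_EARTH = 6.371e6   # mean radius of the Earth, m (rounded)
M_SUN = 1.989e30    # mass of the Sun, kg (rounded)
AU = 1.496e11       # Earth-Sun distance, m (rounded)

# Apparent Lorentz contraction: factor 1 - (1/2)(v/c)^2 along the velocity.
lorentz_deficit = 0.5 * (V_EARTH / C) ** 2
lorentz_displacement = lorentz_deficit * R_EARTH   # metres

# Apparent Einstein contraction: isotropic factor 1 - U/c^2, with U the
# Newtonian potential of the Sun at the Earth's orbit (assumed estimate).
einstein_deficit = G * M_SUN / (C**2 * AU)
einstein_displacement = einstein_deficit * R_EARTH   # metres

print("Lorentz  displacement (cm):", 100.0 * lorentz_displacement)
print("Einstein displacement (cm):", 100.0 * einstein_displacement)
```

The Lorentz estimate comes out at about 3 cm, consistent with the figure above. The Einstein estimate is of the same order, as expected for a bound system in which $\beta_e^2 \sim \gamma_e$.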
i.e. as the sum of its total rest-mass energy, of its internal kinetic energy, of its
self-gravitational energy and of its internal elastic energy.
Recently the extension of the property of effacement beyond the
first-post-Newtonian (1PN) approximation has been studied further. On the one
hand, some authors, staying within the post-Newtonian framework (i.e. for
weakly self-gravitating, slowly moving bodies) have shown that it still holds
at the 2½th post-Newtonian approximation (Grishchuk and Kopejkin,
1986; and Kopejkin, 1985, improving on previous partial results of Rudolph
and Börner; and Breuer and Rudolph, 1982). They have shown (in the
absence of internal motion, and for 'spherically symmetric' bodies) that the
effective mass is equal, up to order $c^{-6}$, to the 'Tolman mass' of each body.
On the other hand, other authors (D'Eath, 1975b; Damour, 1983a; Thorne
and Hartle, 1985) have shown that the effacement property still holds for
strongly self-gravitating bodies. In particular, it has been proven that the
equations of motion of non-rotating 'elastic' condensed bodies (neutron
stars or black holes), which would be spherically symmetric if isolated, can
be written in terms of only some 'centres of field' and some 'Schwarzschild
masses' (Damour, 1983a) up to the appearance of very small
internal-structure-dependent terms of relative order $\gamma_e^5$ in the equations of motion (i.e.
tenth post-Newtonian order in the usual counting!). The intuitive reason for
this extremely good effacement is that each 'elastic' body is supposed to be in
a stable spherically symmetric state when isolated so that when it is not
isolated it becomes distorted by 'tidal forces' $GmL/R^3$, thereby acquiring a
small (structure dependent) ellipticity
$$\epsilon \sim \frac{\text{tidal gravity}}{\text{self gravity}} = \frac{GmL/R^3}{Gm/L^2} = \left(\frac{L}{R}\right)^3 = \alpha^3.$$
Then the corresponding tidally induced quadrupole moments of each body,
$\epsilon m L^2$, introduce some (structure dependent) interbody forces $\sim G\epsilon m^2 L^2/R^4$
which are therefore $\epsilon L^2/R^2 \sim (L/R)^5 = \alpha^5$ smaller than the main Newtonian
forces. Now, by definition, condensed bodies are such that $L \sim Gm/c^2$, so
that
$$\alpha = \frac{L}{R} \sim \frac{Gm}{c^2 R} = \gamma_e,$$
hence the very small relative correction $\alpha^5 \sim \gamma_e^5 = (Gm/c^2R)^5 = O(1/c^{10})$ =
10PN order.
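The chain of estimates above is short enough to evaluate directly. The sketch below follows the scaling argument of the text (ellipticity $\sim \alpha^3$, relative force correction $\sim \alpha^5$); the function name and the numerical inputs are hypothetical and serve only to show how small the structure-dependent correction is for a compact binary.

```python
def effacement_estimate(L, R):
    """Scaling estimates for a body of size L at interbody separation R.

    Returns (alpha, ellipticity, force_ratio), where
      alpha       = L / R,
      ellipticity = tidal gravity / self gravity = alpha**3,
      force_ratio = ellipticity * (L / R)**2     = alpha**5,
    the last being the size of the structure-dependent interbody force
    relative to the main Newtonian force.
    """
    alpha = L / R
    ellipticity = alpha ** 3
    force_ratio = ellipticity * alpha ** 2
    return alpha, ellipticity, force_ratio


# Hypothetical compact binary: 10 km bodies separated by 2 million km.
alpha, ellipticity, force_ratio = effacement_estimate(L=1.0e4, R=2.0e9)
print("alpha       =", alpha)
print("ellipticity =", ellipticity)
print("force ratio =", force_ratio)
```

For condensed bodies $\alpha \sim \gamma_e$, so the last number is also an estimate of $\gamma_e^5$, the tenth post-Newtonian order mentioned above.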
It has been noted that the property of effacement of the internal structure
in the external motion of condensed bodies seemed to be a very peculiar
property of Einstein's theory and that in other relativistic theories of gravity
the equations of motion of condensed bodies already contain, at the 1PN
level ($O(1/c^2)$ relative corrections), some internal-structure-dependent
parameters (see Will, 1981, and references therein). It is therefore worth
trying to understand in a non-complex way what are the elements of
structure of general relativity which allow the effacement of internal
structure. In the proofs quoted above these elements of structure are
somewhat lost in the midst of long calculations or complex arguments.
However, it is possible to present a proof of the effacement property within
Newtonian theory which is equivalent to, but different from, the usual
straightforward one presented in Section 6.5 and which shows up some of
the basic elements of structure which allow one to prove the effacement
property within Einstein's theory. I shall therefore spell out some details of
this proof which can serve as a nice model problem for understanding some
of the 'inner wheels' of general relativity.
The basic trick is to rewrite the Euler equations (3) in the form of a
conservation law (using the continuity equation (2) and Poisson's law (4a)):
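$$\frac{\partial}{\partial t}(\rho v^i) + \frac{\partial}{\partial x^j}(\rho v^i v^j) + \frac{\partial}{\partial x^j} T^{ij} + \frac{\partial}{\partial x^j} t^{ij} = 0, \tag{125}$$
where
$$T^{ij} := p\, \delta^{ij} \tag{126}$$
is the material stress tensor, and where
$$t^{ij} := \frac{1}{4\pi G}\left[\frac{\partial U}{\partial x^i}\frac{\partial U}{\partial x^j} - \frac{1}{2}\,\delta^{ij}\, \frac{\partial U}{\partial x^k}\frac{\partial U}{\partial x^k}\right] \tag{127}$$
is a quadratic form in the gravitational field, which is the gravitational
analogue of Maxwell's electromagnetic stress tensor. Note that $t^{ij}$ satisfies
the identity
$$\frac{\partial t^{ij}}{\partial x^j} = \frac{1}{4\pi G}\, \frac{\partial U}{\partial x^i}\, \Delta U. \tag{128}$$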
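The identity (128) follows in two lines from the definition (127). The derivation below is a sketch; the last step assumes that Poisson's law (4a) reads $\Delta U = -4\pi G \rho$ (with $U > 0$), which is the sign convention suggested by the positive potentials used later in the text.

```latex
% Differentiate (127), writing U_{,i} for \partial U / \partial x^i:
\frac{\partial t^{ij}}{\partial x^j}
  = \frac{1}{4\pi G}\Bigl[ U_{,ij}\,U_{,j} + U_{,i}\,U_{,jj} - U_{,ki}\,U_{,k} \Bigr]
% The first and third terms cancel after relabelling the summed index:
  = \frac{1}{4\pi G}\, U_{,i}\,\Delta U
% With the assumed form of Poisson's law, \Delta U = -4\pi G \rho:
  = -\rho\, U_{,i}
```

The divergence of $t^{ij}$ thus reproduces the gravitational force density in the Euler equations, which is what allows them to be written as the conservation law (125).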
Now if we integrate eq. (125) over any volume, $V$, containing the $a$th body
and no other body in the system, the second and third terms of the left-hand
side will give no contributions because of Gauss's theorem and the vanishing
of $\rho$ and $p$ outside the body. Using eqs. (7)-(8) we therefore obtain the
following equations of external motion:
(129a)
where the external force is given as an integral over any surface $S$ enclosing
the $a$th body (and no other body):
$$F^i_a = -\oint_S dS_j\, t^{ij}. \tag{129b}$$
written as :
( 1 30)
with
( 1 3 1)
and
( 1 32)
(actually we are now cheating a little because we are using the vanishing of
the dipole term in $U^{\rm out}$ which, however, has been proven only by an
argument using some knowledge of the internal structure). The effacement
of internal structure in the equation for the external motion (129) is now
clearly apparent if we choose for the arbitrary surface $S$ a surface of
dimensions $\sim R$, the interbody separation, and if we use (for a while) the
external units (eq. (34)). Then both $U_m$ and $U_{m,i}$ are of (numerical) order
unity, while $U_Q$ and $U_{Q,i}$ are of order $\alpha^2$. Therefore
$$F^i_a = F^i_m + O(\alpha^2), \tag{133}$$
where
(134)
is of numerical order unity, with
$$t_m^{ij} := \frac{1}{4\pi G}\left[U_{m,i}\, U_{m,j} - \frac{1}{2}\,\delta^{ij}\, U_{m,k}\, U_{m,k}\right]. \tag{135}$$
real'. But still, at the computational level, past experience in many fields of
physics has shown that it is efficient to use some idealised mathematical
tools (point particles, delta functions, plane waves, ...) to deal with real
physical situations. However, one then needs to have a clear mathematical
framework to use such tools (distribution theory, Fourier transform, ...). In
our case distribution theory is not appropriate to deal with $t_m^{ij}$, which is
quadratic in $U_m$. We can, however, use the following trick (inspired by the
work of M. Riesz, 1949). Let us consider, if we are studying the motion of the
$a$th body,
(136)
where $r_a := |\mathbf{x} - \mathbf{z}_a(t)|$, $A$ is a complex number, and
$$U_{m)a} := \sum_{b \neq a} \frac{G m_b}{|\mathbf{x} - \mathbf{z}_b(t)|}. \tag{137}$$
Now let us define, for a given surface S, the following complex function of the
complex variable A :
a -4 ·
_ 1-2 b 11 r:A
Hence if the real part of $A$ is large enough (for instance, Real($A$) > 3) $t_m^{ij}(A)$
will define a field which is differentiable at all points within $S$, including
$\mathbf{x} = \mathbf{z}_a$. Then for such large $A$ we can apply Gauss's theorem to transform the
surface-integral expression (138) into an integral over the volume $V$
contained within $S$:
$$F^i_m(A) = -\int_V d^3x\, \frac{\partial t_m^{ij}(A)}{\partial x^j}. \tag{142}$$
-I d 3x �:,; A(A -
1) 2 (x' z�)r; A • ,
- - ( 145)
Both the left-hand and the right-hand sides of eq. (146) are analytic functions
of $A$ in the domain Real($A$) > 0 (such that the volume integral is convergent).
As they coincide for Real($A$) > 3, as has just been shown, they will also
necessarily coincide for Real($A$) > 0. Now when $A \to 0$, the factor in front of
the right-hand side of eq. (146) shows that only the infinitesimal
neighbourhood of $\mathbf{x} = \mathbf{z}_a$ contributes to the limit $F^i_m(0) = F^i_m$ (see (140)).
Plugging into eq. (146) the Taylor expansion of $U_{m)a,i}$ around $\mathbf{x} = \mathbf{z}_a$ it is
easily seen that only the first term contributes to the limit:
{
Fi (O) = Fi = limit - � A(A
41t
m - l)
0 um)a
;::i 1
ux
(� (.ia )
iR. 4nr2a dra raA - J }
0
( 147)
as shown by an easy calculation.
dt h #- 1.1
F� ( A ) = -
4 G
� L d 3x t.ir;;u;:, , , ( 150)
Damour, 1983a; Thorne and Hartle, 1985, and references therein to the
earlier works of Weyl, Einstein and others).
Therefore, even very small damping forces in the system will become
observable by their secular effects on the orbital motion. Such an effect has
indeed been observed by Taylor and coworkers (Taylor, Fowler and
McCulloch, 1979; Taylor and Weisberg, 1982) with sign and magnitude in
excellent agreement (now better than 4 per cent) with the so-called
'quadrupole formula', a formula which had been previously derived (Peters
and Mathews, 1963 ; Peters, 1964) by inferring the secular effects on the
orbital motion of the loss of energy in the form of gravitational radiation.
This suggests the conclusion that, for the first time, a clear proof of the
existence of gravitational radiation, with the properties predicted by
General Relativity, has been obtained.
Because of the importance of this conclusion one must critically examine
both the hypothesis and the chain of deductions leading to it. We shall not
examine here the hypotheses concerning the structure (two condensed
bodies) and the 'cleanness' (no other effects perturbing the orbital motion) of
the system (see Will, 1981; Taylor and Weisberg, 1982; Damour and
Deruelle, 1986). Instead we shall concentrate on the theoretical arguments
used in the derivation and application of the 'quadrupole formula'.
We do not wish to review in detail the controversy about the meanings
and the derivations of several 'quadrupole formulae' and the related
problem of gravitational radiation damping (see e.g. Ehlers et al., 1976;
Damour, 1983a; Walker, 1984; Ehlers and Walker, 1984; Schutz, 1985;
Will, 1986, and references therein). However, we wish to emphasise that
most of the work which has been done concerning the quadrupole formulae
has only an indirect bearing on the problem of explaining the secular
acceleration of the orbital motion of PSR 1913+16 observed by Taylor and
coworkers. Indeed, one class of quadrupole formulae gives information
accelerations. Then each body must satisfy the following equation of motion
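(Damour and Deruelle, 1981a; Damour, 1982):
$$a^i = A_0^i(\mathbf{z}-\mathbf{z}') + c^{-2}A_2^i(\mathbf{z}-\mathbf{z}',\mathbf{v},\mathbf{v}') + c^{-4}A_4^i(\mathbf{z}-\mathbf{z}',\mathbf{v},\mathbf{v}',S,S') + c^{-5}A_5^i(\mathbf{z}-\mathbf{z}',\mathbf{v}-\mathbf{v}') + O(c^{-6}), \tag{154}$$
with
$$A_0^i = -Gm'R^{-2}N^i, \tag{155}$$
$$\begin{aligned}
A_2^i = Gm'R^{-2}\Big\{ & N^i\big[-v^2 - 2v'^2 + 4(vv') + \tfrac{3}{2}(Nv')^2 + 5(Gm/R) + 4(Gm'/R)\big] \\
& + (v^i - v'^i)\big[4(Nv) - 3(Nv')\big]\Big\},
\end{aligned} \tag{156}$$
$$A_4^i = B_4^i + C_4^i + D_4^i, \tag{157}$$
$$\begin{aligned}
B_4^i = Gm'R^{-2}\Big\{ & N^i\Big[-2v'^4 + 4v'^2(vv') - 2(vv')^2 + \tfrac{3}{2}v^2(Nv')^2 + \tfrac{9}{2}v'^2(Nv')^2 \\
& - 6(vv')(Nv')^2 - \tfrac{15}{8}(Nv')^4 + (Gm/R)\big(-\tfrac{15}{4}v^2 + \tfrac{5}{4}v'^2 - \tfrac{5}{2}(vv') + \tfrac{39}{2}(Nv)^2 \\
& - 39(Nv)(Nv') + \tfrac{17}{2}(Nv')^2\big) + (Gm'/R)\big(4v'^2 - 8(vv') + 2(Nv)^2 \\
& - 4(Nv)(Nv') - 6(Nv')^2\big)\Big] \\
& + (v^i - v'^i)\Big[v^2(Nv') + 4v'^2(Nv) - 5v'^2(Nv') - 4(vv')(Nv) \\
& + 4(vv')(Nv') - 6(Nv)(Nv')^2 + \tfrac{9}{2}(Nv')^3 \\
& + (Gm/R)\big(-\tfrac{63}{4}(Nv) + \tfrac{55}{4}(Nv')\big) + (Gm'/R)\big(-2(Nv) - 2(Nv')\big)\Big]\Big\},
\end{aligned} \tag{158}$$
$$C_4^i = G^3m'R^{-4}N^i\big[-\tfrac{57}{4}m^2 - 9m'^2 - \tfrac{69}{2}mm'\big], \tag{159}$$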
(
m
ik S' ik
m
)
D � = S + 2 , (vi - v' 1 ) ( ) (
G m'
R ,kt
+ 2
5k1
m
+ 2
S'�1 1 1 Gm'
m
(v - v' )
R , ik '
(160)
and
$$A_5^i = \tfrac{4}{5}G^2mm'R^{-3}\Big\{V^i\big[-V^2 + 2(Gm/R) - 8(Gm'/R)\big] + N^i(NV)\big[3V^2 - 6(Gm/R) + \tfrac{52}{3}(Gm'/R)\big]\Big\}. \tag{161}$$
The two parameters m and m' appearing in eqs. (154)-(161) are the
'Schwarzschild masses' of the condensed bodies. They are two constants
which appear in the external gravitational field, in which are hidden many
internal structure effects (see the discussion of the 'effacement of internal
structure' in Section 6.14). On the other hand, the spin tensors undergo a slow
evolution (on the post-Newtonian time scale, i.e. Pe- 2 times the orbital
period) which is also obtained in the Einstein-Infeld-Hoffmann-Kerr-type
approach (Damour, 1982, and references therein). Introducing, à la Schiff, a
suitable spin-vector, $\mathbf{S}$, associated with $S^{\mu\nu}$, the law of evolution ('spin
precession') reads for the first body (see also references in Section 6.13.2)
$$\frac{d\mathbf{S}}{dt} = \left[\frac{Gm'}{c^2R^2}\,\mathbf{N}\times\left(\tfrac{3}{2}\mathbf{v} - 2\mathbf{v}'\right)\right]\times\mathbf{S} + O\!\left(\frac{1}{c^4}\right). \tag{162}$$
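$$L_4(z, v, a) = M_4 + N_4, \tag{167}$$
$$\begin{aligned}
M_4(z, v, a) = {} & \sum\big(\tfrac{1}{16}mv^6\big) + \sum Gmm'R^{-1}\big[\tfrac{7}{8}v^4 + \tfrac{15}{16}v^2v'^2 - 2v^2(vv') + \tfrac{1}{8}(vv')^2 \\
& - \tfrac{7}{8}(Nv)^2v'^2 + \tfrac{3}{4}(Nv)(Nv')(vv') + \tfrac{3}{16}(Nv)^2(Nv')^2\big] \\
& + \sum G^2m^2m'R^{-2}\big[\tfrac{1}{4}v^2 + \tfrac{7}{4}v'^2 - \tfrac{7}{4}(vv') + \tfrac{7}{2}(Nv)^2 \\
& + \tfrac{1}{2}(Nv')^2 - \tfrac{7}{2}(Nv)(Nv')\big] \\
& + \sum Gmm'\big[(Na)\big(\tfrac{7}{8}v'^2 - \tfrac{1}{8}(Nv')^2\big) - \tfrac{7}{4}(v'a)(Nv')\big],
\end{aligned} \tag{168}$$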
and
( 169)
{
binary system, they read:
d = d i + c - 4D� + c - 5A� , ( 170a)
a' i = d l i + c - 4D1 + c - 5A'� , ( 1 70b)
They can be solved by treating $c^{-4}D_4^i + c^{-5}A_5^i$ as a perturbative force
constant calculation:
$$\dot{P}_b = -\frac{192\pi}{5c^5}\left(\frac{2\pi G}{P_b}\right)^{5/3}\frac{mm'}{(m+m')^{1/3}}\,\frac{1 + \tfrac{73}{24}e^2 + \tfrac{37}{96}e^4}{(1-e^2)^{7/2}}, \tag{172}$$
e denoting a 'relativistic eccentricity' (for details of the meaning and proof of
eqs. (171)-(172) see Damour, 1985; for earlier heuristic proofs see Esposito
and Harrison (1975) and Wagoner (1975)).
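As a rough numerical illustration of the secular period-change formula (172), the sketch below evaluates $-\frac{192\pi}{5c^5}(2\pi G/P)^{5/3}\,mm'(m+m')^{-1/3}\,(1+\frac{73}{24}e^2+\frac{37}{96}e^4)(1-e^2)^{-7/2}$ for a binary. The function name, the constants and the sample parameters (approximate published values for PSR 1913+16) are assumptions introduced here for illustration; they are not taken from the text.

```python
import math

# Assumed constants (SI). GM_SUN is used directly because the product
# G*M_sun is known far more precisely than G or M_sun separately.
GM_SUN = 1.32712440018e20   # m^3 s^-2
C = 299792458.0             # m s^-1


def orbital_period_derivative(m, m_prime, period_s, e):
    """Dimensionless dP/dt from the quadrupole formula.

    m, m_prime : masses in solar masses
    period_s   : orbital period in seconds
    e          : orbital eccentricity (0 <= e < 1)
    """
    t_sun = GM_SUN / C**3  # G*M_sun/c^3, a time of about 4.9 microseconds
    eccentricity_factor = (
        1.0 + (73.0 / 24.0) * e**2 + (37.0 / 96.0) * e**4
    ) / (1.0 - e**2) ** 3.5
    mass_factor = m * m_prime / (m + m_prime) ** (1.0 / 3.0)
    return (
        -(192.0 * math.pi / 5.0)
        * (2.0 * math.pi * t_sun / period_s) ** (5.0 / 3.0)
        * mass_factor
        * eccentricity_factor
    )


if __name__ == "__main__":
    # Illustrative, rounded parameters for PSR 1913+16 (assumed values).
    pdot = orbital_period_derivative(1.44, 1.39, 27907.0, 0.617)
    print(f"dP/dt = {pdot:.3e}")
```

With these rounded inputs the result is of order $-2.4\times10^{-12}$ (seconds per second). The strong eccentricity dependence, a factor of roughly 12 at $e \approx 0.62$ relative to a circular orbit, is what makes the effect measurable in this system.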
Finally, these results concerning the coordinate motion of the binary system
in a special harmonic coordinate system geared to the binary system have to
be related to the quantities actually measured by the observers. This
problem has been re-examined recently with a view to correcting, simplifying
and completing previous works (see Damour and Deruelle, 1986, and
references therein). This allows one to conclude that the 'theoretical
parameters' introduced above (such as m, m' , P 0 , e, P 0 , . . ) coincide with
their corresponding 'observed parameters', denoted mP , me, Pb , e, Pb , . . . .
On the other hand, a recently developed program (Schäfer, 1983, 1985,
1986) based on a Hamiltonian approach to the interaction of N spinless
point-particles with the gravitational wave-field has allowed one to reach
elegantly (in a different gauge more adapted to the separation of
conservative and damping effects) the main result of the previous method:
i.e. a force, equivalent to $c^{-5}A_5^i$ (coming from the interaction with the
dynamical degrees of freedom of the gravitational field) acting on the
Hamiltonian sub-system of the instantaneously interacting N particles.
However, the treatment of the problems associated with the use of point
particles has not been fully justified so that the physical relevance of this
work is not obvious on its own.
Another line of work aimed at treating the problem of the motion of a
binary system with the same completeness as the first method discussed
above, has also been recently reported (Grishchuk and Kopejkin, 1983,
1986; Kopejkin, 1985). It is based on: (1) a post-Newtonian approximation
scheme of the type of eq. (94), and (2) the assumption that the bodies are
non-rotating, 'spherically symmetric' fluid balls. Spherical symmetry is
taken in the coordinate sense, so that the centre of mass defined by eq. (115),
with ρ given by eq. (117), coincides with the centre of symmetry. Then the
equations for the motion of the centre of mass of each body are obtained by
integrating the local post-Newtonian equations of motion (69b). They have
been explicitly calculated, retaining all the higher derivatives that appear. If
one 'reduces' these higher derivatives by using the lower-order equations of
motion (as was done in the first method) the explicit result of this
post-Newtonian calculation is (Kopejkin, 1985):
$$\begin{aligned}
(\mu + c^{-2}\mu_2 + c^{-4}\mu_4)\,a^i = {} & \big[(\mu + c^{-2}\mu_2 + c^{-4}\mu_4)\,A_0^i(\mu' + c^{-2}\mu_2' + c^{-4}\mu_4')\big] \\
& + c^{-2}\big[(\mu + c^{-2}\mu_2)\,A_2^i(\mu + c^{-2}\mu_2,\ \mu' + c^{-2}\mu_2')\big] \\
& + c^{-4}\mu\big(B_4^i(\mu, \mu') + C_4^i(\mu, \mu')\big) \\
& + c^{-5}\mu A_5^i(\mu, \mu') + O(c^{-6}),
\end{aligned} \tag{173}$$
where the square brackets mean that one should expand in explicit powers
of $c^{-1}$ and discard all powers greater than or equal to the sixth. In eq. (173)
the parameter μ denotes the 'rest mass' of the body:
µ •= L d 3xr. , ( 1 74)
while
µ 1 •= fv d 3 xr . (TI 0 - 1U1'lj, ( 1 75)
time derivative which does not contribute to the equations of motion, and
$Q(a - A_0)$ is a quadratic form of the differences $a - A_0(z, v)$, $a' - A_0(z, v)$ which
does not contribute to the order-reduced equations of motion at the order
considered. They would contribute at the order $c^{-6}$, but then it is sufficient
to replace $a - A_0$ by $a - A_0 - c^{-2}A_2$ to postpone the influence of such
'double-zero' terms (using the terminology of Barker and O'Connell (1980)).
The statement of Grishchuk and Kopejkin that the Lagrangian (164) is 'only
valid in a non-harmonic frame of reference' is incorrect and is based on a
confusion between the effects of a 'double-zero' term such as $c^{-4}Q(a - A_0)$
(which does not contribute) and those of what one might call a 'simple-zero'
term , i.e. a linear function, c - 4.P(a - A 0 ), of the differences a - A 0 . As
remarked recently by Schäfer (1984), a simple-zero term does contribute to
the equations of motion but in a way which is equivalent to a change of
coordinate system at order $c^{-4}$. For a general investigation of the effect of
infinitesimal coordinate transformations on generalised Lagrangians see
Damour and Schäfer (1985) and references therein. In particular, it has been
shown in the latter work that the acceleration-dependent two-body
Lagrangian (164) is equivalent, modulo a suitable coordinate
transformation, to the special N = 2 case of an ordinary N-body Lagrangian,
L(z, v), derived earlier by Ohta, Okamura, Kimura and Hiida (1974b), who
were using a non-harmonic coordinate system.
The fact that two independent methods (post-Minkowskian +
Einstein-Infeld-Hoffmann and post-Newtonian + perfect fluid) give formally
identical equations of motion, in harmonic coordinates, is a strong
confirmation of the validity of the numerical coefficients in eqs. (155)-(159)
and (161). So much so that the coefficients in (165)-(169) are further
confirmed by comparison with the works of Ohta et al. (1974a, b) and
Schäfer (1985) based on a third independent method (post-Newtonian +
delta functions) (after correction of a wrongly evaluated two-body integral
in Ohta et al. (1974a); see Damour (1983a, section 13); Damour and Schäfer
(1985)). However, only the first method has been tailored to deal with the
and ( 1 82 )
" Gm z 2 zi ,
1;55 := � J.ISS == L Gmz 2 v i ,
one finds
$$\frac{d}{dt}\left(E_{\rm Noet} - c^{-5}E_5\right) = -\frac{1}{5Gc^5}\,Q^{(3)}_{ij}Q^{(3)}_{ij} + O(c^{-6}), \tag{183}$$
where both $E_{\rm Noet}$ and $E_5$ are univalued functions of the instantaneous state
of the binary system: z(t), z'(t), v(t), v'(t). Therefore $E_{\rm Noet}$ experiences a
secular decrease,
$$\left\langle\frac{dE_{\rm Noet}}{dt}\right\rangle = -\frac{1}{5Gc^5}\left\langle Q^{(3)}_{ij}Q^{(3)}_{ij}\right\rangle, \tag{184}$$
whose value coincides with the so-called 'quadrupole flux formula' giving
the outgoing flux of gravitational radiation in the wave zone. Actually, this
'quadrupole flux formula' is, in my opinion, rather less well established than
the 'secular acceleration (quadrupole) formula' (172) (because of the
problem of justifying the 'good' quadrupole (182) for condensed bodies), but
I quote it here only to justify the name 'gravitational radiation damping'.
However, I wish to stress that what is actually observed is an orbital effect,
so that $A_{\rm damping}$ should rather be thought of as the relativistic descendant of
the Laplace effect.
Indeed, Laplace (1805), in his famous Traité de Mécanique Céleste which
studied 'les altérations que le mouvement des planètes et des comètes peut
éprouver [...] par la transmission successive de la pesanteur' (i.e. the orbital
perturbations due to a finite velocity of propagation of gravity), introduced
the idea that if gravity propagates with a finite velocity, say $c_g$, then there will
be a small modification of Newton's law (due to aberration effects)
amounting to the addition of a small 'damping' term of the type:
He then showed that this damping term would cause a shrinkage and a
circularisation of the orbit, together with a secular acceleration of the
angular motion. He was aware of the fact that the last effect is the easiest to
observe; and even concluded from the 'known' secular acceleration of the
Moon that the velocity of propagation of gravity must be
$$c_g \gtrsim 7\times10^{6}\,c_{\rm light}\,! \tag{186}$$
The last conclusion of Laplace is not quite correct, as we now know, but
we can still admire his profound insight into what would be the main orbital
effects caused by a finite velocity of propagation of gravity. Indeed, it is clear
from eq. (161) that the Einsteinian damping term (179) (in harmonic
coordinates) contains many terms of the Laplace damping type (185).
However, the particular structure of Einstein's theory has resulted in
considerable reduction of the relative magnitude of this damping term from
the (not so) 'naive' expectation, $v/c$, due to aberration effects, down to the
present $(v/c)^5$. Still, the qualitative conclusions of Laplace are correct, the
main effects of the damping term (179) being, indeed, a shrinkage and a
6.16 Conclusion
Henri Poincaré (1934) once remarked that real problems can never be
classified as solved or unsolved ones, but that they are always more or less
solved ('il y a seulement des problèmes plus ou moins résolus'). This remark
applies particularly well to the problem of motion which has had a
chequered history. Even the Newtonian problem of motion, which appeared
to be well understood after the development of the powerful methods of
classical celestial mechanics (codified around 1889 in the treatise of
Tisserand), embarked on an entirely new career after the work of Poincaré
(1892) which has led to many further developments (see e.g. Arnold, 1978;
Gallavotti, 1983). In my opinion the Einsteinian problem of motion has not
even reached a classical stage where the basic problems appear
(provisionally) as 'well understood'. At first sight the best-developed
approximation method in general relativity, the 'post-Newtonian' one,
would seem to constitute such a classical stage, but the literature on the
post-Newtonian problem of motion is full of repetitions, errors or
ambiguities. Several problems (especially at the 1PN approximation) have
been done over and over again with slightly modified approaches, and
different notation (or worse, the same notation for different quantities), and
still it seems to me that some of the basic issues have not really been
tackled. Moreover, the development of other approximation methods
('post-Minkowskian', 'singular perturbation') has rather complicated the
picture by giving rise to many hybrids (some of which have been discussed
above).
My original intention was to conclude this survey by giving a list of the
issues that need to be clarified. I renounced this project because, if one
wishes to look at the work done with a critical eye, nearly all aspects of
the problem of motion need to be thoroughly re-investigated for
mathematical, physical or conceptual reasons ; so that the list of open
problems would, consistent with the remark of Poincaré, include all the
issues discussed above. Another reason for renouncing this project is that
Brumberg (1986) has very recently given such a list, to which I gladly refer
the reader.
One thing is certain: the Einsteinian problem of motion is no longer a
purely theoretical problem, thanks to the dramatic improvement in the
precision of position measurements in the solar system, and to the discovery
of the binary pulsar 1913+16 which is a marvellous relativistic laboratory;
the Einsteinian problem of motion has become an important tool of modern
astrophysics. It is therefore of some urgency, not only to complete and unify
the work already done, but also to develop new approaches which will keep
the best aspects of existing methods while freeing themselves from their
conceptual and/or technical drawbacks. These new approaches should aim
at both formal and conceptual clarification of the basic issues, and at
securing more accurate explicit results. Let us hope that such methods will
be developed before the (first) centenary of the publication of Einstein's
classic paper on 'the foundation of the general theory of relativity' ('Die
Grundlage der allgemeinen Relativitätstheorie', Einstein, 1916a) which is
comparable with Newton's Philosophiae Naturalis Principia Mathematica
both in its importance for physics and its structure which consists, after a
conceptual introduction, of a formal mathematical part followed by a
physical part where connection is made with the natural world.
References
Anderson, J. L. (1980). Phys. Rev. Lett., 45, 1745-8.
Anderson, J. L. and Decanio, T. C. (1975). Gen. Rel. Grav., 6, 197.
7.1 Introduction
Theory may often delay understanding of new phenomena observed
with new technology unless theorists are quite open-minded as to what
types of physical laws may need to be applied: conservatism is unsafe.
. . . In astrophysics, historically, theories have only seldom had
predictive usefulness as guides to experimenters.
(Greenstein, 1984)
How difficult were these advances? If we now acquaint ourselves with
some of the detours, errors and psychological blocks, it is not to console
ourselves with the thought that even great physicists do not always
move in a straight line, but rather to grasp in some measure how
difficult it really was.
(Hund, 1971)
To pause halfway up a steep hill, look down at one's starting-point barely
half a mile below and recall the arduous, winding path one has actually
followed, is a salutary exercise in humility. The subject of this essay - the
evolution of our ideas since the age of Laplace concerning the dark objects
that populate the universe - is now at that middle distance where a
retrospective look begins to discern broad contours without yet losing all of
the detail that will in time become laundered and streamlined by Darwinian
selection of citations into the handful of names and dates of the potted
histories.
Tracing this evolution, with its meanderings and vicissitudes, as a
continuous process, means abandoning attempts at a 'balanced' account
which gives consideration to ideas in proportion to their evolutionary
† Work supported by Canadian Institute for Advanced Research and by Natural
Sciences and Engineering Research Council of Canada.
200 W Israel
blame for my misconstruction of their replies or the way it has all been
pieced together, include J. D. Barrow, S. Chandrasekhar, D. Finkelstein,
S. W. Hawking, D. P. Hube, M. D. Kruskal, W. H. McCrea, M. J. Rees,
K. S. Thorne and G. M. Volkoff, and I should like to extend to them my
deepest thanks. My thanks are also due to Professor Z. Maki and Professor
K. Nishijima, the successive Directors of the Research Institute for
Fundamental Physics, Kyoto University, and their colleagues, for their kind
hospitality at Yukawa Hall, where this work was begun. I am indebted to
the staff of the Scientific Periodicals Library, Cambridge University, for
making their unique resources available, and uncomplainingly bringing out
an endless stream of volumes from their stacks, and to Yoshiko Fujinaka,
Mary Yiu and Cheryl Torbett for their superb technical assistance.
7.2 Early speculations (1784-1921)
Unseen bodies may, for aught we can tell, predominate in mass over the
sum-total of those that shine: they supply possibly the chief part of the
motive power of the universe.
(Clerke, 1903)
In regions where our ignorance is great, occasional guesses are
permissible.
(Lodge, 1921)
The name of the Reverend John Michell (MA (Cantab.) 1752, BD 1761) fell
so quickly into obscurity after his death that two of the English scientific
classics of our century, Eddington's Internal Constitution of the Stars (1926)
and Hawking and Ellis's Large Scale Structure of Space-Time (1973) give
credit to a Frenchman, P. S. Laplace (1796), as having been first to suggest
that 'the attractive force of a heavenly body could be so large that light could
not flow out of it'.
Yet in his day Michell's reputation as a natural philosopher stood second
in English circles only to that of his contemporary and friend, Henry
Cavendish. The paper communicated to the Royal Society by Cavendish on
November 27, 1783, in which Michell expounded this and many related
ideas, caused such a stir in London circles that it overshadowed exciting
news coming from Paris about Coulomb's electrical experiments,† as extant
† Coulomb's invention of the torsion balance had actually been anticipated by Michell,
and the inverse-square law by Cavendish, using the now celebrated null method of
measuring the force inside a hollow charged conductor. Both Cavendish and Michell
were reclusive personalities who published very little of their wide-ranging work. Many
of Cavendish's electrical researches were found after his death in a number of sealed
packets of papers, eventually published, under the editorship of Maxwell, in 1879. None
of Michell's unpublished work survives.
might be orbiting a central, very massive dark object of the sort postulated
by Laplace, but decided against it on the grounds that the resulting proper
motions would be too large not to be noticed.
Speculation about invisible stars went quickly out of favour as the wave
theory of light gained ascendancy, especially after the discovery of
interference by Thomas Young in 1801, and removed any reason to believe
that light should be affected by gravity. References to this hypothesis were
deleted from the 1808 and later editions of Laplace's Exposition du
Système du Monde. But it was natural that it would resurface, albeit in a
significantly different form, after the success of the 1919 eclipse expedition.
In February 1920, A. Anderson of University College, Galway, made an
intriguing speculation in the Philosophical Magazine:
We may remark, though perhaps the assumption is very violent, that if
the mass of the sun were concentrated in a sphere of diameter 1.47
kilometres, the index of refraction near it would become infinitely great,
and we should have a very powerful condensing lens, too powerful
indeed, for the light emitted by the sun itself would have no velocity at
its surface. Thus if, in accordance with the suggestion of Helmholtz, the
body of the sun should go on contracting there will come a time when it
is shrouded in darkness, not because it has no light to emit, but because
its gravitational field will become impermeable to light.
This was an extraordinary anticipation of the gravitational collapse
scenario which was to be so adamantly resisted in the 1930s and for a
quarter of a century afterward.
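For orientation, the length scale in Anderson's passage can be checked numerically. The short sketch below evaluates $GM/c^2$ and the Schwarzschild radius $2GM/c^2$ for the Sun; the constants and function names are assumptions introduced here, not part of the text. The first quantity comes out close to the 1.47 km figure that Anderson quotes, and the second is twice that.

```python
# Assumed constants (SI); the solar GM product is used directly.
GM_SUN = 1.32712440018e20   # m^3 s^-2
C = 299792458.0             # m s^-1


def gravitational_radius_km(gm=GM_SUN):
    """Return GM/c^2 in kilometres."""
    return gm / C**2 / 1000.0


def schwarzschild_radius_km(gm=GM_SUN):
    """Return 2GM/c^2 in kilometres."""
    return 2.0 * gravitational_radius_km(gm)


if __name__ == "__main__":
    print(f"GM/c^2  = {gravitational_radius_km():.3f} km")
    print(f"2GM/c^2 = {schwarzschild_radius_km():.3f} km")
```

Both lengths scale linearly with mass, so the same functions can be reused for other bodies by passing a different `gm` value.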
The convenient fiction, due to Eddington, that the paths of light rays in a
gravitational field could be formally described and visualized as
propagation in an 'aether' of variable refractive index, was imaginatively
taken up by Sir Oliver Lodge, whose firm belief in the aether had somehow
withstood the assault of the 1905 relativistic revolution (see, e.g. his
contributions to the discussion of the results of the 1919 Sobral expedition,
Lodge, 1920). In a remarkable address to a Student's Science Club at the
University of Birmingham in 1921, he presented on this basis an essentially
correct physical picture of the full range of black holes that are of
astrophysical interest today:
If light is subject to gravity, if in any real sense light has weight, it is
natural to trace the consequences of such a fact. One of these
consequences would be that a sufficiently massive and concentrated
body would be able to retain light and prevent its escaping. And the
body need not be a single mass or sun, it might be a stellar system of
Dark stars: the evolution of an idea 205
permanence of things and the Lucretian doctrine of atoms that could 'not be
swamped by any force, for they are preserved indefinitely by their absolute
solidity'. As the web of observation and theory slowly tightened, the
scientific reaction - first disregard, then dismay, yielding only gradually to
the beginnings of acceptance - set a pattern for the disclosures yet to come.
The companions of Procyon and Sirius had puzzled astronomers since
before the turn of the century. Comparable in mass with the sun, they were
more than a hundred times dimmer, 'a condition that we are powerless to
explain' (Campbell, 1913). Superficially, there seemed to be two
alternatives:
Either they have a far less surface brilliancy than the sun or their density
is much greater. There can be no doubt that the former is the case.
(Newcomb, 1908)
This emphatic conclusion would have caused no misgivings if these
objects had been unmistakably red and thus of low surface temperature:
their dimness would have been explicable and not at all unusual. But, as far
as it could be distinguished from the overwhelming brilliance of Sirius itself,
the companion appeared nearly as white to the eye.† In December 1915,
Walter S. Adams announced that he had finally succeeded in securing a
spectrogram with the Cassegrain reflector at Mt Wilson as the companion
passed to its furthest distance from the primary in its 49-year orbit. It
showed a spectrum identical with that of Sirius. Puzzlement now gave way
to some embarrassment, since spectral intensity measurements of 109 stars
by Wilsing and Scheiner at Potsdam during the previous five years had
demonstrated (as had earlier theoretical work by Schwarzschild) that to a
fair approximation stars radiate like black bodies, with effective
temperatures well correlated with colour index and spectral type.‡ If the
observations were taken at face value, the dimness of Sirius B could not be
attributed to a low surface temperature. Presumably the reaction of the
anonymous reporter for The Observatory (1916) was not untypical: 'The
results rather suggest that the spectrum obtained is due to light from Sirius
† In the case of Procyon, the spectral class of the companion remains somewhat
uncertain to this day due to the proximity of the bright component (Liebert, 1980).
‡ According to the classification system perfected at the Harvard College Observatory in
the early 1900s, stellar spectra may be divided into several reasonably well-defined types,
labelled O, B, A, F, G, K, M in order of decreasing surface temperature. Sub-classes are
denoted by the digits 0 to 9, so that A0 is the hottest star of type A. The colour of a star
is a clue to its spectral type. Stars of types O and B (characterized by prominence of
helium lines) are bluish-white, type A (hydrogen lines) white, types F and G (calcium
lines) yellow, types K and M (metallic lines) orange to red.
class was known at the time. (With communication impeded by the war,
Öpik as yet was unaware of Adams's work on the spectrum of Sirius B.) He
wrote:
Among the binaries with known orbital elements there is one which is
not included in the preceding discussion, o2 Eridani; . . . the density
according to equation (6) . . . would be 25 000. This impossible result
indicates that in this case our assumptions are wrong; the only possible
explanation is that, however high the temperature, the surface
brightness or the radiating power is very low; probably o2 Eridani is a
pair of very rarefied nebulae.
(Öpik, 1916)
The key to the true explanation was forged just six months later and in a
closely related context, but it took another eight years before anyone saw the
connection. On 8 December 1916 Eddington presented to the Royal
Astronomical Society the first of his pioneering papers on radiative
transport in stellar interiors. In the ensuing discussion, Jeans pointed out
that the value μ = 54 which Eddington had adopted for the mean molecular
weight of the stellar material was almost certainly an overestimate, since 'for
these temperatures and energy we have very hard Röntgen radiation, and so
the atoms in the gas will be smashed up' (The Observatory, 1917). Jeans
(1928) later traced this idea back to Descartes (1644), who had conjectured
that the sun and fixed stars were made of matter 'which possesses such
violence of agitation that, impinging upon other bodies, it gets divided into
indefinitely minute particles'. Under such conditions, Jeans was now
pointing out, electrons normally bound to much heavier nuclei would be set
free as independent particles, thus greatly reducing the mean molecular
weight of the material.
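Jeans's point can be put in numbers with the standard textbook relation for the mean molecular weight of a single species, μ = A / (1 + number of free electrons per atom), where A is the atomic mass number. The sketch below is an illustration under that assumption; the function name and the choice of iron as the example element are hypothetical and not taken from the text.

```python
def mean_molecular_weight(mass_number, atomic_number, free_electrons=None):
    """Mean mass per free particle, in units of the hydrogen atom mass.

    mass_number    : A, nucleons per atom
    atomic_number  : Z, electrons per neutral atom
    free_electrons : electrons stripped from each atom (default: all Z)
    """
    if free_electrons is None:
        free_electrons = atomic_number
    if not 0 <= free_electrons <= atomic_number:
        raise ValueError("free_electrons must lie between 0 and Z")
    # One nucleus plus the liberated electrons share the mass A.
    return mass_number / (1 + free_electrons)


if __name__ == "__main__":
    # Illustrative element: iron (A = 56, Z = 26).
    print(mean_molecular_weight(56, 26, free_electrons=0))  # neutral atoms
    print(mean_molecular_weight(56, 26))                    # fully ionized
```

For neutral iron the function returns 56, of the same order as the value μ = 54 mentioned above, while complete ionization brings it down to 56/27, just above 2. For heavy elements A/(Z + 1) stays close to 2 whatever the species, which is why ionization alone removes most of the dependence on composition.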
In his subsequent work, Eddington (1918, 1921) revised his estimate for μ
downwards to values in the range 2 to 4, but, like everyone else (e.g. Jeffreys,
1918), he still believed that, in stars like the sun, material would deviate
drastically from a perfect gas. It was more than just 'common sense' that
deterred the astronomers from questioning this axiom. Breakdown of the
gas laws at ordinary densities was a pillar of the giant-and-dwarf
evolutionary theory of Russell (1914), which held sway well into the 1920s.
In Russell's version of the theory, stars were pictured as beginning their lives
as gaseous red giants, contracting and heating up as they (supposedly)
evolve towards the blue end of the main sequence, at which point their
material becomes liquid and begins to cool. The stars then evolve along the
main sequence to end their luminous phase as red dwarfs. The discovery of
† Nearly two years earlier the statistical thermodynamics of relativistic Bose and Fermi
gases had received a general and comprehensive formulation at the hands of Ferencz
Jüttner (1928) of Breslau. However, Jüttner did not explicitly consider the limit of
complete degeneracy (which greatly simplifies the general formulae), and he was unaware
of the astrophysical relevance of relativistic degeneracy. He considered his work to be of
theoretical interest only. His paper appeared in a German physics journal and did not
come to the attention of astronomers for some years.
one. Specifically, consider an electron gas in its lowest quantum state: each
of the lowest quantum cells (with phase volume $h^3$), up to a momentum
ceiling $p_*$ (say), will be filled by a pair of electrons having opposite spins. The
summation $\sum$ is thus to be replaced by the integral $\int_0^{p_*} \ldots\, 2(4\pi p^2\,dp)/h^3$.
Eddington's expressions then give $P_{\rm Edd} \sim p_*^5$, $n \sim p_*^3$, i.e. $P_{\rm Edd} \sim n^{5/3}$, which is
precisely Fowler's formula.
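The scaling just quoted can be made explicit with the standard non-relativistic calculation for a completely degenerate electron gas. The steps below are a sketch under stated assumptions: the isotropic kinetic expression $P = \tfrac{1}{3}\int p\,v\,dn$ with $v = p/m_e$ is the usual textbook starting point and stands in for Eddington's own stress-tensor expressions, which are not reproduced here; $p_*$ denotes the momentum ceiling and $m_e$ the electron mass.

```latex
\begin{align*}
n &= \int_0^{p_*} \frac{2\,(4\pi p^2\,dp)}{h^3} = \frac{8\pi p_*^3}{3h^3}, \\
P &= \frac{1}{3}\int_0^{p_*} p\,\frac{p}{m_e}\,\frac{2\,(4\pi p^2\,dp)}{h^3}
   = \frac{8\pi p_*^5}{15\,m_e h^3}, \\
p_* &= \left(\frac{3h^3 n}{8\pi}\right)^{1/3}
\quad\Longrightarrow\quad
P = \frac{1}{20}\left(\frac{3}{\pi}\right)^{2/3}\frac{h^2}{m_e}\,n^{5/3}.
\end{align*}
```

The first two lines give $n \propto p_*^3$ and $P \propto p_*^5$; eliminating $p_*$ yields the $n^{5/3}$ law.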
Eddington's own presentation of his arguments never took quite this
simple form, and he never referred to their 1923 pre-quantum antecedents,
presumably because he considered that enumeration of quantum states is a
procedure different in principle from the counting of classical particles.†
Nevertheless, his classical and quantum procedures conform to the same
mould. (The resemblance is plainest in the 1940 paper.)
I have gone into this point in painstaking detail because it seems to have
gone generally unnoticed and because it reveals the underlying continuity
and inner consistency of Eddington's thought over a time-span stretching
well before the events of 1935. By his own account (Eddington, 1936b), the
motivation which first caused him to question the Stoner-Anderson result
was the 'stellar buffoonery' to which it led; but there can be little doubt that
the reason for his sustained opposition was grounded in purely technical
considerations whose seeds went back more than a decade. Indeed, it is
interesting that after its debut as the launch-pad of his 1935 paper, the
contraction scenario and its 'absurdity' make no further appearance in his
published work.
To speculate about a hypothetical replay of the famous RAS talk, with all
reference to a 'reductio ad absurdum' magically removed, is idle but hard to
resist (Chandrasekhar, 1969, 1972; Sullivan, 1979; Wali, 1982). Could it
have changed astrophysical history? Given the limited optical window of
pre-war astronomy and the many psychological hurdles, the possibility
appears remote of any radical shift in pre-war opinion on this issue.
In the first instance, there were Eddington's own preconceptions. By 1936
his unconventional stress tensor had so permeated his thinking on 'molar
relativity' and Fundamental Theory that a change of course would have
meant dismantling a complex interlocking structure. Attempts to rebut his
contentions must have been a source of deep frustration to both sides, since
they were often speaking different languages. In the manner typical of
creative minds, his way to assimilate a new body of ideas, such as relativity
or quantum mechanics, was to re-create it in his own terms. It made
communication not always easy. Once, defending the Eddington model for
main-sequence stars against a rival theory of Milne's, he declared,
Prof. Milne did not enter into detail as to why he arrives at results so
widely different from my own; and my interest in the rest of the paper is
dimmed because it would be absurd to pretend that I think there is the
remotest chance of his being right. . . . I maintain that the interior
determines the state of the photosphere; Prof. Milne somehow reaches
the opposite view.
(Eddington, 1929)
It stung Milne to the retort:
I recognize that Sir Arthur Eddington has dug a most valuable trench
into unknown territory. But he has encountered a rocky obstacle which
he cannot get round. If he would make the mental effort to scramble up
the sides of the trench he would find the surrounding country totally
different from what he had imagined and the obstacle entirely an
underground one.
(Milne, 1930)
Suppose, nevertheless, that Eddington's technical concerns could have
been sufficiently allayed to make the Stoner-Anderson formula and the
Chandrasekhar limit logically acceptable to him. Would his belief in the
contraction scenario as an inescapable consequence have held firm? This is
the hardest and most hypothetical question of all: we do not know how deep
his convictions on this issue really went. Others who accepted the
Chandrasekhar limit did not feel impelled to this extremity. He himself
speaks of 'various accidents intervening to save the star'. One has to judge
whether it is likely that, lacking any observational support for collapse, he
would have been prepared to abandon an astronomically conservative
stance to take a position that even the arch-radical Landau at that time
considered untenable. If the inner conviction was there, it is more than
probable that he would have done so. The courage and integrity with which
he defended his beliefs were legendary. Eddington's failure in 1935 was not a
failure of nerve, but an aberration of a soaring imagination.
One is on much safer ground in predicting the probable reaction of his
contemporaries. In 1935, the astronomical community was not yet ready to
'buy' the idea of gravitational collapse, not even if a master salesman like
Eddington had been ready to exert all of his persuasion. This is plain from
the apathy that greeted Oppenheimer's proposal of the idea four years later.
Even today, after half a century, it would be rash indeed to claim that the
black hole's battle for acceptance is finally over.
Dark stars: the evolution of an idea 223
The year 1935 marks the end of a glorious period in British astronomical
history. For an unbroken span of 20 years the meeting rooms of the Royal
Astronomical Society in Burlington House had held the centre of the world's
astrophysical stage. But now the spotlight was moving westward.
the term 'neutron star' in its presently understood sense. It came in a brief
paragraph, appended as an 'additional remark', to a visionary pair of papers
by Baade and Zwicky in which supernovae were named and distinguished as
a class of objects fundamentally different from ordinary novae:
In addition, the new problem of developing a more detailed picture of
the happenings in a supernova now confronts us. With all reserve we
advance the view that a supernova represents the transition of an
ordinary star into a neutron star, consisting mainly of neutrons. Such a
star may possess a very small radius and an extremely high density. As
neutrons can be packed much more closely than ordinary nuclei and
electrons, the 'gravitational packing' energy in a cold neutron star may
become very large, and under certain circumstances, may far exceed the
ordinary nuclear packing fractions. A neutron star would therefore
represent the most stable configuration of matter as such. The
consequences of this hypothesis will be developed in another place,
where also will be mentioned some observations that tend to support
the idea of stellar bodies made up mainly of neutrons.
(Baade and Zwicky, 1934)
Zwicky (1938, 1939) was well aware that general relativity was required
for a detailed understanding of neutron star structure. But his attempts to
develop a general relativistic theory were overtaken by the definitive (and
differently motivated) work of Oppenheimer and Volkoff (1939).
It should be mentioned here that the general idea that stellar explosions
were associated with collapse to a superdense configuration was not original
with Baade and Zwicky but had been in the air for several years (e.g. Menzel,
1926; Milne, 1931), and appeared to derive observational support from the
resemblance of old novae and the nuclei of planetary nebulae to white dwarf
stars (e.g. Gerasomovic, 1931). Perhaps the earliest reference to the idea is to
be found in a paper by H. N. Russell (1925). This was a last-ditch attempt to
reconcile the giant-and-dwarf evolutionary theory with Eddington's
mass-luminosity relation, but it anticipates in a very generalized yet remarkably
judicious way, considering the state of nuclear physics at the time, a number
of modern ideas about the ignition of thermonuclear reactions in stars and
about stellar evolution.
It would appear that Lev Landau began to think about neutron stars on
the very day that news of Chadwick's discovery reached Copenhagen. Leon
Rosenfeld (1973) has recalled how, on that evening, as he, Bohr and Landau
sat discussing its various implications, Landau came out with the idea of
'weird stars' (unheimliche Sterne). But it was five years before Landau
committed any of his thoughts to paper. In a brief article entitled 'The origin
of stellar energy' he pointed out that
. . . in spite of the fact that the 'neutronic' state of matter is, in usual
conditions, energetically less favourable, since the reaction of neutron
formation is strongly endothermic, this state must nevertheless become
stable when the mass of the body is large enough. In this case the
gravitational energy gained in going over to the neutronic state with its
greater density, compensates the losses of internal energy.†
Landau's estimate of 10⁻³ M⊙ for the minimum mass of a neutron core
(using Fowler's non-relativistic equation of state) was later revised to a few
tenths of a solar mass by Oppenheimer and Serber (1938). His motivation
emerges in the concluding paragraphs:
. . . we can regard a star as a body which has a neutronic core whose
steady growth liberates the energy which maintains the star at its high
temperatures. . . .
. . . the author has shown in a previous article [Landau, 1932] that
the formation of a core must certainly take place in a body with a mass
greater than 1.5 ⊙. In stars with smaller mass the conditions which
could make possible the formation of the initial core have yet to be
made clear.
(Landau, 1938)
Similar sentiments were expressed by Gamow (1937) in the final chapter
of his book, Atomic Nuclei and Nuclear Transformations, although at the
time of writing the outlook for a thermonuclear origin of stellar energy was
not as dim as it came to seem two years later. The situation as it appeared in
the fall of 1938 was summarized by Oppenheimer and Robert Serber:
It would seem that the formation of deuterons by proton collision, and
at the least partially regenerative capture of protons by elements
between carbon and oxygen could be made to account successfully for
the main-sequence stars. . . . Nevertheless it has been clear that these
reactions could in no way account for the enormously greater radiation
of such stars as Capella, and for these one would either have to invoke
other and readier nuclear reactions with a correspondingly reduced
time-scale, or one would be led, as in the earlier arguments of Milne, to
expect serious deviations from the Eddington model.
† T. E. Sterne (1933) had shown earlier that, even without taking gravitational energy
into account, transition to the neutronic phase is energetically favourable for cold matter
compressed to densities exceeding 101 g/cm 3 .
226 W Israel
fictitious character of the singularity, and may even have supposed that this
was common knowledge in relativistic circles.
By the time the Oppenheimer-Snyder work appeared in print in
September 1939, Great Britain and France were at war with Germany. The
authors had exhausted the topic as far as it could be pursued in the prewar
'optical' era of astronomy. No tie-up with observation was apparent in the
scenario of a star that is invisible to begin with, and at the end of its nuclear
† In his popular book of 1940, Gamow devoted considerable space to Zwicky's ideas,
but became noncommittal on this issue in his subsequent technical papers (Gamow and
Schonberg, 1941; Gamow, 1944).
his assistants, B. Kent Harrison and Masami Wakano. It marked the entry
of a charismatic new personality into the then moribund field of general
relativity. Wheeler's interest was not primarily in the possibility of deriving
observable predictions. He was disturbed by a question of principle - 'What
is the final equilibrium state of an A-nucleon system when A is large?' - and
he found the Draconian answer offered by Oppenheimer and Snyder difficult
to accept.
Assembling all the available theoretical information, Harrison and
Wheeler constructed a semi-empirical equation of state for 'cold matter,
catalysed to the endpoint of thermonuclear evolution' at all stages of
compression from 10 g/cm³ to supranuclear densities. Using this, Wakano
performed 44 numerical integrations, for different choices of central density,
of the general-relativistic stellar equilibrium equations on Princeton's
MANIAC computer. The resulting plot of mass vs. central density was a
curve with two humps, corresponding to the white dwarf and neutron star
configurations. The Chandrasekhar and Oppenheimer-Volkoff results were
confirmed† and for the first time brought together into a single overall
picture.
Wheeler's report on this work to the Solvay congress in Brussels in
June 1958 was only part of a panoramic survey of the entire field that did not
shrink from dragging basic issues into the open:
Of all the implications of general relativity for the structure and
evolution of the universe, this question of the fate of great masses of
matter is one of the most challenging. Moreover, the issue cannot be
escaped by appealing to stellar explosion or rotational disruption, for
the issue as it presents itself today is one of principle, not one of
observational astrophysics.
Won't the star explode? Let it! . . . Simply catch the ejected matter
and extract its kinetic energy. Let it fall back on the star. Then the
original number of nucleons, A, is restored; but the mass-energy of the
system drops. Ultimately the star gets tired. It can't eject matter. It
can't radiate photons. It can't emit neutrinos. It comes into the
absolutely lowest state possible for an A-nucleon system under the dual
action of nuclear and gravitational forces . . . this is the state we are
interested in as a matter of principle.
† Equations of state currently considered viable give mass limits three or four times
higher than the original Oppenheimer-Volkoff value (Baym and Pethick, 1979).
constitution of a nucleon.
(Adams et al., 1958)
Five or six years later, when it became commonplace to consider seriously
the collapse of astrophysical objects 10⁸ times as massive as the sun, it was
easy to allay these concerns. For masses of this order, as Zel'dovich and
Novikov (1965) emphasized, densities are only a few g/cm³ at the time of
crossing the gravitational radius:
Under these conditions, which are in no way remarkable, certainly
nothing fantastic can take place. The only thing that is unusually large
is the gravitational field, but according to the principle of equivalence
the gravitational field itself does not produce local changes in the laws
that govern physical processes.
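The order of magnitude quoted here can be checked with a back-of-envelope estimate. The sketch below is not taken from the text: it assumes a uniform sphere of mass M whose radius equals the gravitational radius 2GM/c², uses rounded CGS constants, and the function names are illustrative only.

```python
import math

# Rounded CGS constants (assumed values, not quoted in the text).
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm/s
M_SUN = 1.989e33    # solar mass, g


def gravitational_radius_cm(mass_solar):
    """Gravitational radius r_g = 2GM/c^2, in cm, for a mass in solar units."""
    return 2.0 * G * mass_solar * M_SUN / C**2


def mean_density_at_gravitational_radius(mass_solar):
    """Mean density (g/cm^3) of a uniform sphere of radius r_g and mass M."""
    r_g = gravitational_radius_cm(mass_solar)
    volume = 4.0 / 3.0 * math.pi * r_g**3
    return mass_solar * M_SUN / volume


if __name__ == "__main__":
    for m in (1.0, 1.0e8):
        rho = mean_density_at_gravitational_radius(m)
        print(f"M = {m:.0e} M_sun: mean density ~ {rho:.2e} g/cm^3")
```

Because r_g grows linearly with M, the mean density falls as 1/M². Under these assumptions a solar-mass object reaches its gravitational radius at supranuclear density, while an object of 10⁸ solar masses does so at roughly the density of water.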
that 'cosmic' (i.e. coordinate) time and proper time would run in opposite
directions for a stationary observer in the 'interior' region. He concluded,
'Of course, only a segment of the solution that does not extend as far as the
singular sphere can actually be realized in Nature'.
The first explicit statement that r = 2m is not singular came from Lemaitre
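(1933):
La singularité du champ de Schwarzschild est donc une singularité
fictive, analogue à celle qui se présentait à l'horizon du centre dans la
forme originale de l'univers de Sitter.
[The singularity of the Schwarzschild field is thus a fictitious
singularity, analogous to the one that appeared at the horizon of the
centre in the original form of the de Sitter universe.]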
He demonstrated this by exhibiting a coordinate system (essentially that
of Painleve and Gullstrand) in terms of which the metric at r = 2m was
manifestly regular.
Lemaitre's statement was buried as an incidental remark in the middle of
a long and somewhat inaccessible paper on cosmology and it went largely
unnoticed. However, it did attract the attention of another cosmologist,
Howard Percy Robertson. Examining the orbits of radially moving test
particles in the Schwarzschild field, Robertson noticed (see Robertson and
Noonan, 1968) a second point of resemblance between r = 2m and the
de Sitter horizon : although a particle dropped from any finite radius takes
an infinite coordinate time to reach r = 2m, the proper time measured by a
comoving observer is finite. The Schwarzschild 'singularity' is neither
singular nor inaccessible.
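Robertson's observation can be illustrated numerically. The sketch below makes a simplifying assumption that is not in the text: the test particle falls radially with zero velocity at infinity (rather than being dropped from rest at a finite radius), for which both times have closed forms in units G = c = 1. The function names are illustrative only.

```python
import math


def proper_time(r0, r, m=1.0):
    """Proper time for radial infall from r0 to r (zero velocity at infinity).

    Follows from dr/dtau = -sqrt(2m/r) in units G = c = 1.
    """
    return (2.0 / 3.0) * (r0**1.5 - r**1.5) / math.sqrt(2.0 * m)


def _antiderivative(x):
    # Integral of x^4 / (x^2 - 1), valid for x > 1, where x = sqrt(r / 2m).
    return x**3 / 3.0 + x + 0.5 * math.log((x - 1.0) / (x + 1.0))


def coordinate_time(r0, r, m=1.0):
    """Schwarzschild coordinate time for the same fall; requires r > 2m.

    Follows from dt/dr = -sqrt(r/2m) / (1 - 2m/r).
    """
    if r <= 2.0 * m:
        raise ValueError("coordinate time diverges at r = 2m")
    x0 = math.sqrt(r0 / (2.0 * m))
    x = math.sqrt(r / (2.0 * m))
    return 4.0 * m * (_antiderivative(x0) - _antiderivative(x))


if __name__ == "__main__":
    m, r0 = 1.0, 10.0
    print("proper time to r = 2m:", proper_time(r0, 2.0 * m, m))
    for eps in (1e-3, 1e-6, 1e-9):
        t = coordinate_time(r0, 2.0 * m + eps, m)
        print(f"coordinate time to r = 2m + {eps:g}: {t:.2f}")
```

The proper time to reach r = 2m is a finite number, while the coordinate time grows logarithmically without bound as the end point approaches r = 2m.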
In 1939, Robertson gave a lecture on the Schwarzschild 'singularity' in
Toronto. (In the audience was the Professor of Applied Mathematics, J. L.
Synge, who was to retain vivid memories of the talk.) He also discussed the
problem with Einstein, who was sufficiently intrigued that in May he sent off
a paper whose declared aim was to examine 'whether it is possible to build
up a field containing such singularities with the help of actual gravitating
masses, or whether such regions with vanishing g₄₄ do not exist in cases
which have physical reality'.
The essential result of this investigation is a clear understanding as to
'remove' the singularity and extend the spacetime left many unanswered
questions and obscurities. The original Schwarzschild metric is
time-reversible and one would have expected this to carry over to any complete
extension. Yet the extensions of Lemaitre and Robertson were clearly not
reversible : they allowed only 'one-way traffic' through r = 2m. Moreover,
the old Flamm-Weyl-Einstein-Rosen picture of two asymptotically flat
spaces connected by a throat seemed quite incompatible with these
extensions. Synge saw this problem as an ideal proving ground for a cause he
had long championed : the geometric approach to general relativity.
Synge's attitude to the possible physical significance of such
investigations is made clear in the opening paragraphs of his paper.
Remarking that the Schwarzschild line element appears to be valid only for
r > 2m, he continues :
This limitation is not commonly regarded as serious, and certainly is
not so if the general theory of relativity is thought of solely as a
macroscopic theory to be applied to astronomical problems, for then
the singularity r = 2m is buried inside the body, i.e. outside the domain
of the field equations Rₘₙ = 0. But if we accord to these equations an
importance comparable to that which we attach to Laplace's equation,
we can hardly remain satisfied by an appeal to the known sizes of
astronomical bodies. We have a right to ask whether the general theory
of relativity actually denies the existence of a gravitating particle, or
whether [the standard Schwarzschild metric] may not in fact lead to
the field of a particle in spite of the apparent singularity at r = 2m.
(Synge, 1950)
He proceeds to exhibit the complete analytical extension of the
Schwarzschild manifold in terms of coordinates u, v with the property that
v ± u = const are the paths of radial light rays, i.e. Kruskal-like coordinates.
Unfortunately, he obtained something extra. For r < 2m, Synge's
coordinates are inverse trigonometric functions of the standard Kruskal
coordinates. He was thus led to a picture of an infinite lattice of regions with
0 ≤ r ≤ 2m in which test particles which have fallen through r = 2m will
oscillate forever. If truncated at the first of the geometrical singularities r = 0
- which would represent its maximal extension as a manifold - Synge's
extension is geometrically the same as Kruskal's. But, because of the
complexity of its analysis and the bizarre conclusions, its essential
significance was not understood.
In the view of many relativists the whole procedure of analytic extension
of spacetimes was a mathematical game without clearly defined rules (e.g.
Bel, 1969). Is the extended manifold necessarily unique (even aside from
topological ambiguities)? (Misner (1967) showed by a counter-example that
the answer is no.) Why insist on analytic continuation for hyperbolic
equations whose characteristic surfaces can admit discontinuities? At
what stage of the continuation does the mathematics take leave of physical
reality? The beginnings of a clear answer to these questions did not emerge
until the early 1970s.
The decisive step in the unravelling of the extended Schwarzschild
manifold was taken by an amateur unaware of previous work in the field, the
noted plasma physicist Martin Kruskal. In the mid-1950s Kruskal and a few
colleagues at Princeton formed a small study group to teach themselves
general relativity, using one of the current textbooks as a guide. Kruskal
noticed that the Schwarzschild metric near r = 2m closely resembles the flat
Minkowski metric expressed in uniformly accelerated ('Rindler')
coordinates. By simply following for the Schwarzschild case the path that
leads back from Rindler to Lorentz coordinates, he arrived at the
well-known coordinates associated with his name.
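The resemblance Kruskal noticed can be made explicit in a few lines. The following is a sketch of the standard near-horizon argument in units G = c = 1, not a reproduction of Kruskal's own calculation; the auxiliary symbols ρ, η, T and X are introduced here purely for illustration.

```latex
% Schwarzschild metric (G = c = 1):
%   ds^2 = -(1 - 2m/r) dt^2 + (1 - 2m/r)^{-1} dr^2 + r^2 d\Omega^2 .
% Near the horizon, introduce \rho by
\[
  r = 2m + \frac{\rho^{2}}{8m}
  \quad\Longrightarrow\quad
  1 - \frac{2m}{r} \simeq \frac{\rho^{2}}{16m^{2}},
  \qquad
  dr = \frac{\rho}{4m}\, d\rho ,
\]
% so that, to leading order in \rho,
\[
  ds^{2} \simeq -\rho^{2}\, d\eta^{2} + d\rho^{2} + (2m)^{2}\, d\Omega^{2},
  \qquad
  \eta \equiv \frac{t}{4m}.
\]
% The (\eta, \rho) part is flat space in Rindler coordinates. The map back to
% Lorentz coordinates,
\[
  T = \rho \sinh\eta, \qquad X = \rho \cosh\eta
  \quad\Longrightarrow\quad
  -\rho^{2}\, d\eta^{2} + d\rho^{2} = -dT^{2} + dX^{2},
\]
% is manifestly regular at \rho = 0, i.e. at r = 2m.
```

The exact Kruskal coordinates for r > 2m have the same hyperbolic structure, with ρ replaced by (r/2m − 1)^{1/2} e^{r/4m}.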
Kruskal shelved his calculation when Wheeler, to whom he showed it,
reacted without special interest. A couple of years later (probably in 1958),
well aware now of its importance, Wheeler reacquainted himself with
Kruskal's transformation and publicized it at the Royaumont conference on
general relativity in June 1959. By the end of 1959 the result had still not
appeared in print, though there were several references to its existence (e.g.
Finkelstein (1958) and Fronsdal (1959) mention it in notes added in proof).
Finally, in desperation, Wheeler was driven to writing up the work himself
in a brief paper for which author's credit was given to Kruskal (1960).
Meanwhile, others had independently tackled the problem without
knowledge of Kruskal's or other previous work. In a paper entitled
'Past-future asymmetry of the gravitational field of a point particle' David
Finkelstein (1958) at the Stevens Institute of Technology showed that
Schwarzschild's metric can be extended in two ways, using either retarded or
advanced time. At the time of submitting his manuscript, Finkelstein
believed that these were inequivalent extensions (rather than
complementary parts of a single complete extension) but was able to correct
this impression in a postscript added after Kruskal (whom he met at a
plasma physics conference) explained his work to him. (Finkelstein's
extension remains of practical importance because it covers in a simple
explicit form as much of the exterior vacuum manifold as is needed for the
description of spherical collapse.) Results equivalent to Kruskal's were
† In the last few years, an analogous idea has been revived in a quantum context by
't Hooft (e.g. 1985; cf. Gibbons, 1986; Sanchez and Whiting, 1986).
presence of the [NeV] lines!! There the matter stands at present; the
collision case appears to be well established now.
(Baade, 1952)
We have this family of chaps, less than a second of arc across. We don't
know what to make of them.
(Hanbury Brown, 1961)
Radar expertise (and surplus equipment) acquired in World War II was
the stimulus that enabled British and Australian scientists to spearhead the
advance of radio astronomy in the post-war decade. This ushered in the
most eventful era in the history of astronomy since the time of Galileo. An
adequate account of these developments would require a book, and this
brief section cannot do more than sketchily trace one strand of this rich
tapestry. Numerous vivid accounts (many first-hand) that fill out the picture
are now available (e.g. J. L. Greenstein, 1963, 1984; Robinson et al., 1965;
Hey, 1973; Bell Burnell, 1977; Sullivan, 1979; Hoyle, 1981; Smith and
Lovell, 1983; Ginzburg, 1984; G. Greenstein, 1984; Hazard, 1985; and
Blandford, this volume, Chapter 8).
In 1950 just one galaxy (Andromeda) was known to be a radio source, but
several dozen other discrete sources (generally much more powerful) had
been detected. Only three of these had been associated with visible objects :
one was coincident with the Crab nebula and two others (NGC4486 and
5128) with 'nebulae' in Virgo (Messier 87) and Centaurus, whose status as
external galaxies was not to be established for another couple of years. The
prevailing consensus - Thomas Gold (1951) was the only vocal dissenter -
was that the discrete radio sources were dark objects, 'radio stars' (e.g. Ryle,
1950) within our galaxy. To place them outside the galaxy would have
ascribed to them an intrinsic radio power far exceeding Andromeda;
moreover, they were not correlated in position with prominent galaxies. It
even appeared that a local background of such discrete sources was actually
required to explain the long wavelength component of the continuum radio
emission from the Milky Way. Kiepenheuer (1950) had already explained
this component as synchrotron emission from cosmic ray electrons
spiralling in the galactic magnetic field, and these ideas were actively
pursued by Ginzburg and others in the Soviet Union over the next few years.
But they did not sink in in Western astronomical circles until the mid-1950s,
when measurements of the optical polarization of the Crab nebula
(proposed by I. M. Gordon) spectacularly vindicated Shklovsky's (1953)
idea that the synchrotron mechanism was at work in the Crab.
optical search by the Caltech astronomer Maarten Schmidt (1963) (to whom
Hazard had sent his result at the suggestion of John Bolton), now showed
that
The only objects seen on a 200-inch plate near . . . the radio source
3C273 . . . are a star of about thirteenth magnitude and a faint wisp or
jet. . . .
Spectra of the star were taken with the prime-focus spectrograph of
the 200-in. telescope. . . . A redshift Δλ/λ₀ of 0.158 allows identification
of four emission bands as Balmer lines. . . .
Greenstein (1963) immediately re-examined his 1961 plates of 3C48. He and
Matthews were now able to identify six lines by applying a redshift of 37 per
cent.
All these revelations appeared back-to-back in the March 16 issue of
Nature along with a theoretical paper by Hoyle and Fowler (1963),
proposing a collapsing superstar hypothesis, which concluded:
Our present opinion is that only through the contraction of a mass of
10⁷-10⁸ M⊙ to the relativity limit can the energies of the strongest
sources be obtained.
An earlier suggestion that gravitational energy released in contraction
might account for the radio sources had come from Ginzburg (1961):
. . . the gravitational contraction of a galaxy (or of its central region),
moon, but astronomers do not seem to need such an object to account for
planetary motions!
Harlan Smith reported on a study of old Harvard plates which revealed
optical variations in 3C273 on timescales of years and months, with even
some indications of flashes in which the brightness doubled in less than ten
days. It appeared that the primary source of energy could not be more than a
light-week in diameter.
Most of the Soviet scientists invited were unable to attend, but they sent
preprints which circulated freely. An extraordinary one, by Zel'dovich and
Novikov (1964), envisaged a supermassive object powered by accretion
(Zel'dovich, 1964; Salpeter, 1964), and argued that its mass must be at least
10⁸ M⊙ if the observed luminosity is not to exceed the Eddington limit. The
strongest source could be adequately fuelled by accretion of just a few solar
masses per year. Ginzburg (1964), Ozernoy and Kardashev (1964) stressed
the role of magnetic fields, which would be strongly amplified by
flux-freezing when a compact object forms in a collapse. These ideas paved the
way for the theoretical understanding of pulsars a few years later.
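The two figures in this argument, a minimum mass of order 10⁸ solar masses and a fuelling rate of a few solar masses per year, follow from standard formulae. The sketch below is an illustration only: the luminosity of 10⁴⁶ erg/s and the 10 per cent efficiency are assumed inputs (the text does not quote the luminosity used by Zel'dovich and Novikov), the Eddington limit is taken for ionized hydrogen with Thomson scattering, and the function names are hypothetical.

```python
import math

# Rounded CGS constants (assumed values, not quoted in the text).
G = 6.674e-8          # cm^3 g^-1 s^-2
C = 2.998e10          # cm/s
M_SUN = 1.989e33      # g
M_P = 1.6726e-24      # proton mass, g
SIGMA_T = 6.6524e-25  # Thomson cross-section, cm^2
YEAR_S = 3.156e7      # seconds per year


def eddington_luminosity(mass_solar):
    """Eddington luminosity (erg/s) for ionized hydrogen: 4 pi G M m_p c / sigma_T."""
    return 4.0 * math.pi * G * mass_solar * M_SUN * M_P * C / SIGMA_T


def minimum_mass_solar(luminosity):
    """Smallest mass (solar units) whose Eddington limit is not exceeded."""
    return luminosity / eddington_luminosity(1.0)


def accretion_rate_solar_per_year(luminosity, efficiency=0.1):
    """Accretion rate (solar masses per year) needed for L = efficiency * Mdot * c^2."""
    mdot_g_per_s = luminosity / (efficiency * C**2)
    return mdot_g_per_s * YEAR_S / M_SUN


if __name__ == "__main__":
    L = 1.0e46  # erg/s, an assumed quasar luminosity for illustration
    print(f"minimum mass ~ {minimum_mass_solar(L):.1e} M_sun")
    print(f"accretion rate ~ {accretion_rate_solar_per_year(L):.1f} M_sun/yr")
```

With these assumed inputs the minimum mass comes out just under 10⁸ solar masses and the accretion rate at about two solar masses per year, consistent in order of magnitude with the figures cited above.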
The basic difficulty with gravitational collapse as a power source was
emphasized by Freeman Dyson in one of the concluding summaries. The
timescale for collapse is only about a day; the lifetime of the sources,
inferred, for example, from the length of the jets, is at least 10⁶-10⁷ years.
Attempts over the next couple of years to meet this difficulty head-on by
replacing the Oppenheimer-Snyder scenario by an oscillatory picture (e.g.
Gertsenshtein, 1967) came to nothing. Modifying the Einstein field
equations with a negative-energy cosmological - or 'C-term' - (Hoyle and
Narlikar, 1964; Faulkner et al., 1964) introduces repulsive gravitational
forces which can lead to a bounce at small radius, and a pulsation as seen by
a comoving observer. On the other hand, it seemed obvious (Zel'dovich and
Novikov, 1965) that an external observer could never see the bouncing
object re-emerge from the event horizon r = 2m that it has not yet entered in
his remote future! The elliptic reinterpretation of the extended
Schwarzschild spacetime predicts oscillatory behaviour for infalling test
particles at the price of relinquishing ordinary ideas of causality within a
sphere slightly larger than the gravitational radius, which, it could be - and
was - argued, might lie beyond the range of conscious intervention (Israel,
1966). Even so, it proved impossible to devise a consistent dynamical picture
of a gravitating source which revisits the same Cauchy surface.
There were, of course, many other hypotheses about quasars and active
galactic nuclei : chain reactions of supernova explosions in distant galaxies,
matter-antimatter annihilation, white holes, quasi-local hypotheses in
which the quasars were massive objects ejected at relativistic speeds from the
nucleus of our own galaxy, and so on. In retrospect it seems clear that many
of these ideas were over-reactions to phenomena whose power (as distinct
from total energy) requirements did not call for such extraordinary physics.
Only one or two seem to have stood the test of time: the model suggested by
Lynden-Bell (1969) for the more quiescent sources in which a super-massive
black hole is fed by an accretion disc,† and the 'magnetoid' or 'spinar' model
of Ginzburg and Ozernoy (1977), Morrison and others, in which the energy
source is a magnetized supermassive disc stabilized by rotation. As seen
from a present-day vantage point the situation in the mid-1960s has been
well summarized by Rees (1984):
There has been progress towards a consensus in that some bizarre ideas
that could be seriously discussed a decade ago have been discarded. But
if we compare present ideas with the most insightful proposals
advanced when quasars were first discovered 20 years ago (such
proposals being selected, of course, with benefit of hindsight), progress
indeed seems meager. It is especially instructive to read Zel'dovich and
Novikov's (1964) paper entitled 'The mass of quasi-stellar objects'. In
this paper, on the basis of early data on 3C273, they conjectured the
following: (a) Radiation pressure perhaps balances gravity, so the
central mass is ~10⁸ M⊙. (b) For a likely efficiency of 10% the
accretion rate would be 3 M⊙ yr⁻¹. (c) The radiation would come from
an effective 'photosphere' at a radius of ~2 × 10¹⁵ cm (i.e. ≫ r_g), outside
of which line opacity would cause radiation to drive a wind. (d) The
accretion may be self-regulatory, with a characteristic time scale of
~3 yr. These suggestions accord with the ideas that remain popular
today, and we cannot yet make any firmly based statements that are
more specific.
By good fortune the fourth 'Texas' symposium in Dallas in December
1968 was perfectly timed to catch the excitement of the unravelling of the
pulsar mystery. It was at the meeting itself that most delegates learned of the
Crab pulsar's deceleration measured at Arecibo. Rotational energy was
being lost at a rate which exactly accounted for the energy radiated (mainly
in X-rays) by the nebula! At one stroke Gold's (1968) model of pulsars was
vindicated and the existence of neutron stars established beyond reasonable
doubt. To borrow a well-worn phrase, seldom in the history of astronomy
has so bizarre a concept been so firmly established in so short a time,
although the 'central star' in the Crab, whose spectrum showed a blank, had
for 25 years been suspected of being the seat of mysterious, highly energetic
processes (Minkowski, 1942; Hoyle, 1955). A hint of the true source of the
Crab's radiation had come from Wheeler (1966) in a review of 'superdense
stars':
. . . several workers have looked to the residual neutron star itself as a
device to power this radiation, either through its heat energy or
through its energy of vibration. . . . (Energy of rotation appears not yet
to have been investigated as a source of power. Presumably this
mechanism can only be effective - if then - when the magnetic field of
the residual neutron star is well coupled to the surrounding ion clouds.)
Spinning neutron stars with oblique magnetic axes had been studied by
Pacini (1967) several months before Hewish's (1968) group announced their
discovery. Yet no one managed to predict that such objects would pulse, and
the exact mechanism remains a matter of controversy to this day.
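The energy balance that made the Crab pulsar's deceleration so persuasive can be reproduced with the standard spin-down formula for a rotating rigid body. The sketch below is illustrative only: the period, its rate of change and the moment of inertia are commonly quoted approximate values assumed here, not figures given in the text, and the function name is hypothetical.

```python
import math


def spin_down_power(period_s, period_derivative, moment_of_inertia=1.0e45):
    """Rate of loss of rotational energy, in erg/s.

    For E = (1/2) I Omega^2 with Omega = 2 pi / P, the loss rate is
    -dE/dt = 4 pi^2 I Pdot / P^3.  The default moment of inertia
    (1e45 g cm^2) is a conventional assumed value for a neutron star.
    """
    return 4.0 * math.pi**2 * moment_of_inertia * period_derivative / period_s**3


if __name__ == "__main__":
    # Approximate Crab pulsar parameters (assumed for illustration).
    crab_period = 0.0331   # s
    crab_pdot = 4.2e-13    # s/s
    power = spin_down_power(crab_period, crab_pdot)
    print(f"Crab spin-down power ~ {power:.1e} erg/s")
```

With these assumed inputs the loss rate is a few times 10³⁸ erg/s, the order of magnitude needed to power the nebula.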
X-ray astronomy owes its inception in the late 1940s to another legacy of
World War II: a small stock of V2 rockets captured from the Germans.
X-rays from outside the solar system were first detected in 1962. In 1963, taking
advantage of a lunar occultation, Herbert Friedman and his colleagues at
the Naval Research Laboratory identified the Crab Nebula as a diffuse
X-ray source, and, in 1964, they identified the irregularly variable source
Cygnus X-1. By 1971, after it had been located to within a few minutes of arc
by Uhuru, Cygnus X-1 was linked with a variable radio source which was
then more accurately pinpointed by radio astronomers in Green Bank and
Holland. It was finally identified with the one-line optical binary HDE
226868 by L. Webster and P. Murdin at the Royal Greenwich Observatory
and by C. T. Bolton (1972) at the David Dunlap Observatory (Toronto).
The mass of ~6 M⊙ estimated for the unseen component made it the first
durable observational candidate for a black hole, and it remained the only
one for more than a decade.†
Fig. 7.1. Is there a black hole in Cygnus X-1? A bet between Stephen
Hawking and Kip Thorne made at Caltech in December 1974, on which
neither side has yet collected.
visible. The core like the Cheshire cat fades from view. One leaves
behind only its grin, the other, only its gravitational attraction. . . .
Moreover light and particles incident from the outside emerge and go
down the black hole only to add to its mass and increase its
gravitational attraction.
(Wheeler, 1968)
Black hole research is a good example of the importance of pictures and
phrases. Before the mid- 1960s the object we now call a 'black hole' was
referred to in the English literature as a 'collapsed star' and in the
Russian literature as a 'frozen star'. The corresponding mental picture,
based on stellar collapse as viewed in Schwarzschild coordinates was
one of a collapsing star that contracts more and more rapidly as the
grip of gravity gets stronger and stronger, the contraction then slowing
because of a growing gravitational redshift and ultimately freezing to a
halt at an 'infinite-redshift surface' (Schwarzschild radius), there to
hover for all eternity. Of course, from the work of Oppenheimer and
Snyder ( 1939) we were aware of an alternate viewpoint, that of an
observer on the surface of the collapsing star who sees no freezing but
instead experiences collapse to a singularity in a painfully short time.
But because nothing inside the infinite-redshift surface can ever
influence the external universe, that 'comoving viewpoint' seemed
irrelevant for astrophysics. Thus astrophysical theorizing in the early
1960s (e.g . Zel'dovich and Novikov, 1964, 1965) was dominated by the
'frozen-star viewpoint'. As long as this viewpoint prevailed , physicists
failed to realize that black holes can be dynamical , evolving, energy
storing and energy-releasing objects.
(Thorne et al., 1986)
In 1963, all at once putative experts in a field they had left untilled for a
quarter of a century, the relativists scurried to make up for lost time. Their
success exceeded all expectations. Within a few years, understanding of
gravitational collapse progressed from its inchoate beginnings to a
sophisticated discipline comparable in elegance, rigour and generality to
thermodynamics, a subject which it turned out unexpectedly to resemble.
The array of problems was formidable. For the first time, the full non-linearity
of the Einstein field equations had to be confronted for a generic
situation. Astrophysics insistently posed the question: what is the evolution
and endstate of a generic collapse with all complications included: pressure,
rotation, magnetization, etc.? For example, when a star collapses toward its
gravitational radius, one expects its magnetic field to become strongly
† Specifically, the argument was that a generic solution involves eight arbitrary functions
of the spatial coordinates (six components of the spatial metric tensor and their six time
derivatives less four degrees of freedom due to arbitrariness of the spacetime
coordinates), whereas a singular solution appeared to allow only seven arbitrary
functions. The conflict with the singularity theorems of Hawking and Penrose (1969) was
later traced to a subtle point (Khalatnikov and Lifshitz, 1970; Misner, 1969): the generic
solution does not have an analytic expansion near a singularity as Lifshitz and
Khalatnikov had originally assumed, but exhibits a complicated stochastic ('mixmaster')
behaviour.
DZN next addressed the question of the nature of the final 'frozen' state,
in which the external field becomes asymptotically stationary. Examination
of special cases indicated that static fields with multipole moments become
singular at the event horizon. They went on to argue as follows:
An analysis of the static solution outside the body shows that the
deviation from the spherical solution, which is caused by a change in
the source of the field, leads to the appearance of true singularities of
spacetime on the Schwarzschild surface $g_{00} = 0$. On the other hand, in
the comoving system of a contracting body with small initial deviations
from sphericity in the density distribution, the Schwarzschild surface is
in no way specially distinguished and is not accompanied by the
appearance of true singularities either in the metric or in the density. A
comparison of these results leads to the conclusion that the quadrupole
and higher moments of the external gravitational field attenuate during
the relativistic stages of collapse.†
This conclusion was highly intriguing but puzzling, since no dynamical
explanation was evident or, indeed, offered for the decay of the multipole
moments. One possibility, suggested later, was that it was an apparent
effect, seen from the outside, due to 'Lorentz contraction of the bulges on the
star's surface' (Thorne, 1967). DZN were willing to commit themselves only
on what the cause was not:
The change in the multipole moments during the course of contraction
of the body should be accompanied by radiation of gravitational waves,
but the energy carried by this radiation is small. The radiation of waves
is a consequence of the change of the multipole moments and should
not be regarded as the cause of their total damping. We note that in
Newtonian theory, the moments also vary during the course of the
compression of the body, but for finite body dimensions they are finite.
In Einstein's theory, a relativistic damping is superimposed on this
change in the moments of the external field, due to the change in the
dimensions of the contracting body.
The external observer 'sees' (for example, with the aid of neutrino
and antineutrino radiation) in the ultimate cooled state the finite
nonsphericity of the distribution of masses in the sources of the field.
However, this nonsphericity is not at all manifest in the external field.
† Ginzburg and Ozernoy (1965) had noted a few months earlier that, for an external
observer, the effective magnetic moment of a magnetized spherical star contracting
quasistatically toward its gravitational radius appears to attenuate like $t^{-1}$.
† Because of difficulties with referees, this paper had to wait three years for publication.
bets) :
. . . only two alternatives would be open - either the body has to divest
itself of all quadrupole and higher moments by some mechanism
(perhaps gravitational radiation), or else an event horizon ceases to
exist.
(Israel, 1967a)
In an article in Nature, I expressed my concerns more freely:
Because of the slowing-down of processes for the external observer, it
seems reasonable to suppose . . . that it is permissible to treat the
limiting external field as static. The foregoing theorem on singular
event horizons then applies. Before reaching a singularity at the event
horizon, the star will respond to the precursory large tidal forces in such
a way as to subdue, if possible, the rise of the singularity itself. There are
two possibilities.
The first is that the star shakes off its quadrupole and higher
moments, and is, so to speak, patted into spherical shape before
crossing the event horizon. Some such idea has been suggested
previously.† The chief difficulty here is to find a plausible mechanism
. . . which will accomplish this task in the finite proper time available.
The second possibility, which has not been considered before, is that
the event horizon becomes smudged out and obliterated.
(Israel, 1967b)
This argument was an object lesson in how to arrive at nonsense by
stretching the frozen star picture beyond its proper limits. What is important
to remember is that it was not then clear just where these limits were, and no
overall picture was available. Perhaps the biggest conceptual hurdle then
facing the theorists was to reconcile the duality of the frozen star picture and
the local dynamical view of free fall. It was like knowing about
complementarity without having command over the mathematical
apparatus of quantum mechanics. In some ways it was worse, because in
problems like the decay of multipole moments both aspects of the duality
were involved.
The naive version of the frozen star picture was essentially this: as the star
freezes and the last gravitational waves depart,‡ the external field reduces to
† Here I was uncomfortably aware that DZN had not said exactly this. On the other
hand, I did not know exactly what it was they had said, much less how to express it
succinctly!
‡ With entrance to the hole apparently obstructed by the frozen star, it required some
stretch of imagination to appreciate that gravitational radiation could freely propagate
inwards as well as outwards.
its Coulomb-like part, which cannot fade away because it is anchored in the
source still lingering just outside its gravitational radius. The guiding model
is, of course, the electromagnetic field, where wavelike (transverse)
components can be self-sustaining (i.e. source-free) but Coulomb-like
(longitudinal) fields require a source. The defect of this naive version is that
in a non-linear theory like gravitation a sufficiently strong field can be its
own source and thus even Coulomb-like fields can be self-sustaining. It
becomes possible to conceive that in a collapse the field and the material
source completely part company, and their multipole moments become
independent, as DZN (read with hindsight) seem perhaps to have implied.
But a heuristic picture which needs such a sophisticated re-interpretation
becomes practically almost useless. What the frozen star picture was in the
last analysis telling us was that the frozen star is completely irrelevant to the
final state!
There would be little point to this tedious rehash of personal errors
and confusions if I did not think that it reflects, however distortedly, the
significant shift that occurred during 1967-1969 in the general way of
thinking about the end-state of a gravitational collapse. Although the idea of
horizon instability found a few echoes in the literature (e.g. Janis et al., 1968;
Bel, 1969), generally the frozen star was giving way to a new image: an
elemental, self-sustaining gravitational field which has severed all causal
connection with the material source that created it, and settled, like a soap
bubble,† into the simplest configuration consistent with the external
constraints. It was, above all, Roger Penrose, at least in England, who
inspired and guided the transition to this radically new viewpoint, so aptly
encoded in Wheeler's (1968) coincidental and timely coinage of the term
'black hole'.‡
When we met in London near the end of 1967, Penrose explained this view
to me essentially as follows:
Doubts have frequently been expressed concerning (the no-hair
conjecture), since it is felt that a body would be unlikely to throw off all
its excess multipole moments just as it crossed the Schwarzschild
radius. But . . . I would certainly not expect the body itself to throw off
its multipole moments. On the other hand, the gravitational field itself
has a lot of settling to do after the body has fallen into the 'hole'. The
† This analogy is, I believe, essentially due to Misner (Thorne, 1970, 1972).
‡ The term was casually slipped into an address to the New York meeting of the
American Association for the Advancement of Science on 29 December 1967. Wheeler
had first used it a few months earlier in informal discussions at a conference at the
Institute for Space Studies organized by V. Canuto.
injected into this region splitting into two particles. One has negative energy
and drops into the hole; the other escapes to the outside with greater energy
than the particle that was injected. The nett effect is an extraction of some of
the hole's rotational energy, with a corresponding reduction of its mass and
angular momentum.
A 21-year-old Princeton graduate student, Demetrios Christodoulou
(1970), undertook a detailed study of the efficiency of quasistatic processes of
this type, in which the black hole's parameters are changed gradually by
injecting a succession of test particles. He found that the efficiency is limited
by an inequality which, in effect, states that a certain quantity can never
decrease. This is a function of the mass and angular momentum of the black
hole which Christodoulou called the 'irreducible mass'; its simple
geometrical significance was not yet apparent.
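The bound on extraction efficiency can be made concrete with a short numerical sketch. The text does not write out the irreducible mass explicitly; the code below assumes the standard Kerr expression in geometric units (G = c = 1), M_irr² = ½[M² + √(M⁴ − J²)], and the function names are illustrative only.

```python
import math


def irreducible_mass(mass: float, ang_mom: float) -> float:
    """Irreducible mass of a rotating (Kerr) black hole, with G = c = 1.

    Assumed standard formula (not quoted in the text):
        M_irr^2 = (M^2 + sqrt(M^4 - J^2)) / 2
    """
    if abs(ang_mom) > mass ** 2:
        raise ValueError("a black hole requires |J| <= M^2")
    return math.sqrt(0.5 * (mass ** 2 + math.sqrt(mass ** 4 - ang_mom ** 2)))


def extractable_fraction(mass: float, ang_mom: float) -> float:
    """Largest fraction of M that any process could remove, since M_irr cannot decrease."""
    return 1.0 - irreducible_mass(mass, ang_mom) / mass


if __name__ == "__main__":
    # Scan the spin parameter J/M^2 from non-rotating (0) to extreme (1).
    for spin in (0.0, 0.5, 0.9, 1.0):
        print(f"J/M^2 = {spin:.1f}: extractable fraction = {extractable_fraction(1.0, spin):.4f}")
```

For a non-rotating hole nothing can be extracted, while for the extreme case J = M² the fraction reaches 1 − 1/√2, a little under 30 per cent of the mass.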
At the Texas Symposium in Austin in December 1970, Stephen Hawking
(1971) crowned the edifice of classical black hole theory with his elegant
demonstration that Christodoulou's result was a special case of a
very general law. Hawking's theorem states that the area of the event
horizon can never decrease in any interaction of a black hole with its
environment, provided the energy density of any accreted material is non-negative.
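As an illustration of how the area theorem constrains a process, the sketch below applies it to the merger of two non-rotating holes. It assumes the standard horizon-area formula A = 8π[M² + √(M⁴ − J²)] in units G = c = 1, which the text does not state, and for simplicity treats the final hole as non-rotating too; the function names are hypothetical.

```python
import math


def horizon_area(mass: float, ang_mom: float = 0.0) -> float:
    """Event-horizon area of a Kerr black hole (G = c = 1).

    Assumed standard formula: A = 8*pi*(M^2 + sqrt(M^4 - J^2)),
    which reduces to 16*pi*M^2 for a non-rotating hole.
    """
    return 8.0 * math.pi * (mass ** 2 + math.sqrt(mass ** 4 - ang_mom ** 2))


def max_radiated_fraction(m1: float, m2: float) -> float:
    """Upper bound on the fraction of m1 + m2 that can be radiated in a merger.

    The final area must be at least the sum of the initial areas, so a
    non-rotating remnant has mass >= sqrt(m1^2 + m2^2).
    """
    min_final_mass = math.sqrt(m1 ** 2 + m2 ** 2)
    return 1.0 - min_final_mass / (m1 + m2)


if __name__ == "__main__":
    print(f"equal masses:   {max_radiated_fraction(1.0, 1.0):.4f}")
    print(f"10:1 mass ratio: {max_radiated_fraction(10.0, 1.0):.4f}")
```

The bound is largest for equal masses, where it equals 1 − 1/√2; it is a ceiling set by the area law alone, not an estimate of what a real collision radiates.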
The obvious analogies with thermodynamics already foretokened in the
title of Christodoulou's (1970) paper, 'Reversible and irreversible
transformations in black-hole physics', were extensively developed by
Bardeen, Carter and Hawking during the Les Houches Summer School on
black holes in August 1972 (Carter, 1973). Black hole analogues were
formulated for all four of the laws of thermodynamics, with the black hole's
area playing the role of entropy and its surface gravity the role of
temperature.
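The correspondence can be set out compactly. What follows is a sketch in geometric units (G = c = 1) using the customary symbols M (mass), A (horizon area), κ (surface gravity), Ω_H (horizon angular velocity) and J (angular momentum); the symbols and normalization are not given in the text and are assumed here from the standard formulation.

```latex
\begin{align*}
\text{Zeroth law:}\quad & \kappa = \text{const. over the horizon of a stationary black hole}
  && (T = \text{const. in equilibrium}) \\
\text{First law:}\quad & dM = \frac{\kappa}{8\pi}\,dA + \Omega_H\,dJ
  && (dE = T\,dS + \text{work terms}) \\
\text{Second law:}\quad & dA \ge 0
  && (dS \ge 0) \\
\text{Third law:}\quad & \kappa = 0 \ \text{cannot be reached in a finite sequence of operations}
  && (T = 0 \ \text{unattainable})
\end{align*}
```

Read this way, the term (κ/8π) dA occupies the place of T dS, which is why area is paired with entropy and surface gravity with temperature.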
However, there was almost universal agreement that this analogy was
purely formal:
In fact the effective temperature of a black hole is absolute zero. One
way of seeing this is to note that a black hole cannot be in equilibrium
with black body radiation at any non-zero temperature, because no
radiation could be emitted from the hole whereas some radiation would
always cross the horizon into the black hole.
(Bardeen, Carter and Hawking, 1973)
The real effective temperature of a black hole is well defined and
unambiguously zero, as also are its chemical potentials. The ordinary
particle conservation laws, and the ordinary second law of
References
Gold, T. (1968). Rotating neutron stars as the origin of the pulsating radio sources.
Nature, 218, 731-2.
Greenstein, G. (1984). Frozen Star. Macdonald: London.
Greenstein, J. L. (1963). Quasi-stellar radio sources. Scientific American, 209 (December),
54-62.
Greenstein, J. L. (1984). Optical and radio astronomers in the early years. In Sullivan
(1984), pp. 67-81.
Gullstrand, A. (1922). Allgemeine Lösung des statischen Einkörperproblems in der
Einsteinschen Gravitationstheorie. Arkiv. Mat. Astron. Fys., 16(8), 1-15.
Harrison, B. K., Thorne, K. S., Wakano, M. and Wheeler, J. A. (1965). Gravitation
Theory and Gravitational Collapse. University of Chicago Press: Chicago.
Harrison, E. R. (1972). Particle barriers in cosmology. Comment Astrophys. Space Phys.,
4, 187-92.
Hartle, J. B. and Hawking, S. W. (1976). Path-integral derivation of black-hole radiance.
Phys. Rev., D13, 2188-203.
Hawking, S. W. (1971). Gravitational radiation from colliding black holes. Phys. Rev.
Lett., 26, 1344-6.
Hawking, S. W. (1972). Black holes in general relativity. Commun. Math. Phys., 25,
152-66.
Hawking, S. W. (1974). Black hole explosions? Nature, 248, 30-1.
Hawking, S. W. (1987). From the Big Bang to Black Holes: A Short History of Time.
Bantam Books: New York (in press).
Hawking, S. W. and Ellis, G. F. R. (1973). The Large Scale Structure of Space-Time.
Cambridge University Press: Cambridge.
Hawking, S. W. and Penrose, R. (1969). The singularities of gravitational collapse and
cosmology. Proc. R. Soc. (London), A314, 529-48.
Hazard, C. (1985). The coming of age of QSOs. In Active Galactic Nuclei, ed. J. E.
Dyson, pp. 1-19. Manchester University Press: Manchester.
Hazard, C., Mackey, M. B. and Shimmins, A. J. (1963). Investigation of the radio source
3C273 by the method of lunar occultations. Nature, 197, 1037-9.
Hertzprung, E. (1905). Zur Strahlung der Sterne (giants and dwarfs). Z. Wiss. Photog., 3.
Engl. transl. in Source Book in Astronomy, ed. H. Shapley. Harvard University Press:
Cambridge, Mass.
Hertzprung, E. (1915). Effective wavelengths of absolutely faint stars. Astrophys. J., 42,
111-19.
Hewish, A. (1975). Pulsars and high-density physics. Rev. Mod. Phys., 47, 567-72.
Hewish, A., Bell, S. J., Pilkington, J. D. H., Scott, P. F. and Collins, R. A. (1968).
Observation of a rapidly pulsating radio source. Nature, 217, 709-13.
Hey, J. S. (1973). The Evolution of Radio Astronomy. Neale Watson: New York.
't Hooft, G. (1985). On the quantum structure of a black hole. Nucl. Phys., B256, 727-45.
Hoyle, F. (1946). The synthesis of the elements from hydrogen. Mon. Not. R. Astron.
Soc., 106, 343-83.
Hoyle, F. (1955). Frontiers of Astronomy, p. 204. Heinemann: London.
Hoyle, F. (1981). The Quasar Controversy Resolved. University College Cardiff Press:
Cardiff.
Hoyle, F. and Fowler, W. A. (1963). Nature of strong radio sources. Nature, 197, 533-5.
Hoyle, F., Fowler, W. A., Burbidge, G. R. and Burbidge, E. M. (1964). On relativistic
astrophysics. Astrophys. J., 139, 909-28.
Hoyle, F. and Narlikar, J. V. (1964). On the avoidance of singularities in C-field
cosmology. Proc. R. Soc., A278, 465-78.
Hund, F. (1936). Materie unter sehr hohen Drucken und Temperaturen. Ergebnisse
Exakten Naturwiss., 15, 189-228.
Hund, F. (1971). Irrwege und Hemmungen beim Werden der Quantentheorie. In
Quanten und Felder, ed. H. P. Dürr, pp. 1-10. F. Vieweg: Braunschweig.
Israel, W. (1966). Is gravitational collapse really irreversible? Nature, 211, 466-7.
Israel, W. (1967a). Event horizons in static vacuum space-times. Phys. Rev., 164, 1776-9.
Israel, W. (1967b). Possible instability in the self-closure phenomenon in gravitational
collapse. Nature, 216, 148-9 and 312.
Israel, W. (1968). Event horizons in static electrovac space-times. Commun. Math. Phys.,
8, 245-60.
Israel, W. (1971). Event horizons and gravitational collapse. J. Gen. Rel. Grav., 2, 53-61.
Israel, W. (1973). Entropy and black hole dynamics. Lett. Nuovo Cimento, 6, 267-9.
Israel, W. (1983). Black holes. Sci. Prog. (Oxford), 68, 333-63.
Israel, W. and Khan, K. A. (1964). Collinear particles and Bondi dipoles in general
relativity. Nuovo Cimento, 33, 331-44.
Jaki, S. L. (1978). Johann Georg von Soldner and the gravitational binding of light, with
an English translation of his essay on it published in 1801. Found. Phys., 8, 927-50.
Janis, A. I., Newman, E. T. and Winicour, J. (1968). Reality of the Schwarzschild
singularity. Phys. Rev. Lett., 20, 878-80.
Jeans, J. H. (1926). On liquid stars and the liberation of stellar energy. Mon. Not. R.
Astron. Soc., 87, 400-14.
Jeans, J. H. (1928). Astronomy and Cosmogony, pp. 72, 352 and Chap. 5. Cambridge
University Press: Cambridge.
Jeffreys, H. (1918). The compressibility of dwarf stars and planets. Mon. Not. R. Astron.
Soc., 78, 183-4.
Jeffreys, H. (1930). Convections in stars. Mon. Not. R. Astron. Soc., 91, 121-2.
Jüttner, F. (1928). Die relativistische Quantentheorie des idealen Gases. Z. Phys., 47,
542-66.
Kardashev, N. S. (1964). Magnetic collapse and the nature of intense sources of cosmic
radio-frequency emission. Sov. Astron. AJ, 8, 643-8.
Kerr, R. P. (1963). Gravitational collapse and rotation. In Robinson, Schild and
Schucking (1965), pp. 99-102.
Khalatnikov, I. M. and Lifshitz, E. M. (1970). General cosmological solution of the
gravitational equations with a singularity in time. Phys. Rev. Lett., 24, 76-9.
Kiepenheuer, K. O. (1950). Cosmic rays as the source of general galactic radio emission.
Phys. Rev., 79, 738.
Komar, A. (1965). Bootstrap gravitational geons. Phys. Rev., 137B, 462-6.
Kosirev, N. A. (1934). Radiative equilibrium of the extended photosphere. Mon. Not. R.
Astron. Soc., 94, 430-43.
Kretschman, E. (1923). Das statische Einkörperproblem in der Einsteinschen Theorie.
Antwort an Hrn A. Gullstrand. Arkiv Mat. Astron. Fys., 17, 1-4.
Kruskal, M. D. (1960). Maximal extension of Schwarzschild metric. Phys. Rev., 119,
1743-5.
Landau, L. (1932). On the theory of stars. Phys. Z. Sowjetunion, 1, 285-7.
Landau, L. (1938). The origin of stellar energy. Nature, 141, 333-4.
Laplace, P. S. (1796). Exposition du Système du Monde, vol. 2, p. 305. J. B. M. Duprat:
Paris. See also Hawking and Ellis (1973), pp. 365-8.
Laue, M. von (1953). Die Relativitätstheorie (3rd ed.), vol. 2, p. 135. F. Vieweg:
Braunschweig.
Lemaître, G. (1933). L'univers en expansion. Ann. Soc. Sci. (Bruxelles), A53, 51-85.
Liebert, J. (1980). White dwarf stars. Ann. Rev. Astron. Astrophys., 18, 363-98.
Lifshitz, E. M., Sudakov, V. V. and Khalatnikov, I. M. (1961). Singularities of
cosmological solutions of gravitational equations III. Sov. Phys. JETP, 13, 1298-303.
Lifshitz, E. M. and Khalatnikov, I. M. (1963). Problems of relativistic cosmology. Sov.
Phys. Uspekhi, 6, 495-522.
Lindquist, R. W. and Wheeler, J. A. (1957). Dynamics of a lattice universe by the
Schwarzschild-cell method. Rev. Mod. Phys., 29, 432-43.
Lodge, Sir Oliver (1920). Discussion on theory of relativity. Mon. Not. R. Astron. Soc.,
80, 96-118.
Lodge, Sir Oliver (1921). On the supposed weight and ultimate fate of radiation. Phil.
Mag., 41, 549-57.
Lucretius (55 BC). On the Nature of the Universe, p. 59. Penguin Books:
Harmondsworth.
Lynden-Bell, D. (1969). Galactic nuclei as collapsed old quasars. Nature, 223, 690-4.
McCormmach, R. (1968). John Michell and Henry Cavendish: weighing the stars. Brit.
J. Hist. Sci., 4, 126-55.
McCrea, W. H. (1964). Release of gravitational energy in general relativity. Nature, 201,
589.
Manasse, F. K. (1963). Distortion in the metric of a small centre of gravitational
attraction due to its proximity to a very large mass. J. Math. Phys., 4, 746-61.
Matthews, T. A. and Sandage, A. R. (1963). Optical identification of 3C48, 3C196 and
3C286 with stellar objects. Astrophys. J., 138, 30-56.
Menzel, D. H. (1926). The planetary nebulae. Publ. Astron. Soc. Pacific, 38, 295-312.
Michell, John (1784). On the means of discovering the distance, magnitude, etc., of the
fixed stars, in consequence of the diminution of their light, in case such a diminution
should be found to take place in any of them, and such other data should be procured
from observations, as would be further necessary for that purpose. Phil. Trans. R. Soc.
(London), 74, 35-57. Reprinted in Detweiler (1982).
Milne, E. A. (1930). The analysis of stellar structure. Mon. Not. R. Astron. Soc., 91,
4-55; Observatory, 53, 305-8.
Milne, E. A. (1931). On dense stars. Observatory, 54, 140-5.
Minkowski, R. (1942). The Crab nebula. Astrophys. J., 96, 199-213.
Minkowski, R. (1961). Letter. Sky and Telescope, 21, 23.
Misner, C. W. (1967). Taub-Nut space as a counterexample to almost anything. In
Relativity Theory and Astrophysics, vol. 1, ed. J. Ehlers, pp. 160-7. American
Mathematical Society: Providence, Rhode Island.
Misner, C. W. and Wheeler, J. A. (1957). Classical physics as geometry: gravitation,
electromagnetism, unquantized charge and mass as properties of empty space. Ann.
Phys., 2, 525-603.
Misner, C. W. (1969). Mixmaster universe. Phys. Rev. Lett., 22, 1071-4.
Misner, C. W. (1972). Interpretation of gravitational wave observations. Phys. Rev. Lett.,
28, 994-7.
Møller, Chr. and Chandrasekhar, S. (1935). Relativistic degeneracy. Mon. Not. R.
Astron. Soc., 95, 673-6.
Morrison, P. (1965). Summary. In Robinson, Schild and Schucking (1965), pp. 437-8.
Morrison, P. (1977). Astronomy and the laws of physics. In Highlights of Astronomy,
vol. 4, part 1, ed. E. A. Muller, p. 35. Reidel: Dordrecht.
Mysak, L. A. and Szekeres, G. (1966). Behaviour of the Schwarzschild singularity in
superimposed gravitational fields. Can. J. Phys., 44, 617-27.
Oppenheimer, J. R. and Serber, R. (1938). On the stability of stellar neutron cores. Phys.
Rev., 54, 540.
Oppenheimer, J. R. and Snyder, H. (1939). On continued gravitational contraction.
Phys. Rev., 56, 455-9.
Oppenheimer, J. R. and Volkoff, G. (1939). On massive neutron cores. Phys. Rev., 55,
374-81.
Pacini, F. (1967). Energy emission from a neutron star. Nature, 216, 567-8.
Painlevé, P. (1921). La Mécanique classique et la théorie de la relativité. C.R. Acad. Sci.
(Paris), 173, 677-80.
Pais, A. (1982). Subtle is the Lord . . . The Science and the Life of Albert Einstein, p. 509.
Clarendon Press: Oxford.
Papapetrou, A. (1966). Champs gravitationnels stationnaires à symétrie axiale. Ann.
Inst. H. Poincaré, A IV, 83-105.
Payne-Gaposchkin, C. (1957). The Galactic Novae, p. 312. North-Holland: Amsterdam.
Peierls, R. (1936). Note on the derivation of the equation of state for a degenerate
relativistic gas. Mon. Not. R. Astron. Soc., 96, 780-4.
Penrose, R. (1965). Gravitational collapse and space-time singularities. Phys. Rev. Lett.,
14, 57-9.
Penrose, R. (1969). Gravitational collapse: the role of general relativity. Riv. Nuovo
Cimento, 1, 252-76.
Prendergast, K. H. and Burbidge, G. R. (1968). On the nature of some galactic X-ray
sources. Astrophys. J., 151, L83-8.
Price, R. H. (1972). Nonspherical perturbations of relativistic gravitational collapse.
Phys. Rev., D5, 2419-54.
Rees, M. J. (1984). Black hole models for active galactic nuclei. Ann. Rev. Astron.
Astrophys., 22, 471-506.
Rindler, W. (1965). Elliptic Kruskal-Schwarzschild space. Phys. Rev. Lett., 15, 1001-2.
Robertson, H. P. and Noonan, T. W. (1968). Relativity and Cosmology, pp. 246-52.
W. B. Saunders: Philadelphia.
Robinson, D. C. (1975). Uniqueness of the Kerr black hole. Phys. Rev. Lett., 34, 905-6.
Robinson, I., Schild, A. and Schucking, E. L. (eds.) (1965). Quasistellar Sources and
Gravitational Collapse, Editors' introduction, pp. i-xvii. University of Chicago Press:
Chicago.
Rosenfeld, L. (1973). Remarks in Seizième Conseil de Physique Solvay, p. 174. Stoops:
Brussels.
Russell, H. N. (1912). Relation between the spectra and other characteristics of the stars.
Proc. Amer. Phil. Soc., 51, 569.
Russell, H. N. (1914). Relations between the spectra and other characteristics of the
stars. Popular Astron., 22, 275-94, 331-51.
Russell, H. N. (1925). The problem of stellar evolution. Nature, 116, 209-12.
Russell, H. N. (1939). Introductory remarks to the Paris meeting: Novae and White
Dwarfs. Hermann: Paris (1941). Reprinted in Schatzman (1958), p. 1.
Russell, H. N. (1944). Note on white dwarfs and small companions. Astron. J., 51, 13.
Russell, H. N., Dugan, R. S. and Stewart, J. Q. (1938). Astronomy, vol. 2, pp. 918,
958-9. Ginn: Boston.
Ryle, M. (1950). Radio astronomy. Rep. Prog. Phys., 13, 184-246.
Ryle, M. (1955). Meeting of Royal Astronomical Society, 13 May 1955. Observatory, 15,
104-8.
Salpeter, E. E. (1964). Accretion of interstellar matter by massive objects. Astrophys. J.,
140, 796-9.
Sanchez, N. and Whiting, B. (1986). Quantum field theory and the antipodal
identification of black holes. Preprint.
Schatzman, E. (1958). White Dwarfs, pp. 68-73, 166-70. North-Holland: Amsterdam.
Schmidt, M. (1963). 3C273: a star-like object with a large redshift. Nature, 197, 1040.
Schönberg, M. and Chandrasekhar, S. (1942). On the evolution of main-sequence stars.
Astrophys. J., 96, 161-72.
Schwarzschild, K. (1916a). Über das Gravitationsfeld eines Masses nach der
Einsteinschen Theorie. Sitzungsberichte Königlich Preuss. Akad. Wiss., Physik-Math.
Kl., 189-96. Translation in Detweiler (1982).
Schwarzschild, K. (1916b). Über das Gravitationsfeld einer Kugel aus inkompressibler
Flüssigkeit nach der Einsteinschen Theorie. Sitzungsberichte Königlich Preuss. Akad.
Wiss., Physik-Math. Kl., 424-34.
Sen, N. R. (1934). On the equilibrium of an incompressible sphere. Mon. Not. R. Astron.
Soc., 94, 550-64.
Shklovsky, J. S. (1953). On the nature of the Crab nebula's optical emission. Dokl. Akad.
Nauk. SSR, 90, 983.
Shklovsky, J. S. (1955). On the nature of the emission from the galaxy NGC4486. In
Radio Astronomy, ed. H. C. van de Hulst, pp. 205-7. IAU Symposium No. 4, Jodrell
Bank, August 1955. Cambridge University Press: Cambridge.
Shklovsky, J. S. (1978). Stars: Their Birth, Life and Death. Freeman: San Francisco.
Simpson, M. and Penrose, R. (1973). Internal instability in a Reissner-Nordström black
hole. Int. J. Theor. Phys., 7, 183-97.
Sky and Telescope (1961). American astronomers report: First true radio star? Sky and
Telescope, 21, 148.
Sky and Telescope (1964). Dallas conference on super radio sources. Sky and Telescope,
27, 80-4.
Smith, F. Graham and Lovell, B. (1983). On the discovery of extragalactic radio sources.
J. Hist. Astron., 14, 155-65.
Souriau, J. M. (1965). Prolongements du champ de Schwarzschild. Bull. Soc. Math.
(France), 93, 193-207.
Sterne, T. E. (1933). The equilibrium theory of the abundance of the elements: a
statistical investigation of assemblies in equilibrium in which transmutations occur.
Mon. Not. R. Astron. Soc., 93, 736-67 (see p. 750).
Stoner, E. C. (1929). The limiting density in white dwarf stars. Phil. Mag., 7, 63-70.
Stoner, E. C. (1930). The equilibrium of dense stars. Phil. Mag., 9, 944-63.
Strömgren, B. (1937). Die Theorie der Sterninnern und die Entwicklung der Sternen.
Ergebnisse Exakt. Naturwiss., 16, 465-534.
Struve, O. (1960). The problem of Cygnus A. Sky and Telescope, 20, 259-62.
Struve, O. (1962). The Universe, chap. 5. MIT Press: Cambridge, Mass.
Sullivan, W. (1979). Black Holes. Doubleday: New York.
Sullivan, W. T. III (1984). The Early Years of Radio Astronomy. Cambridge University
Press: Cambridge.
Synge, J. L. (1949). Gravitational field of a particle. Nature, 164, 148-9.
Synge, J. L. (1950). The gravitational field of a particle. Proc. R. Irish Acad., A53,
83-114.
Synge, J. L. (1960). Relativity: The General Theory, p. 283. North-Holland: Amsterdam.
Szekeres, G. (1960). On the singularities of a Riemannian manifold. Publ. Math.
Debrecen, 7, 285-301.
Thompson, L. A. (1984). High-resolution imaging from Mauna Kea: Cygnus A.
Astrophys. J., 279, L47-9.
Thorne, K. S. (1967). The general relativistic theory of stellar structure and dynamics. In
High-Energy Astrophysics, vol. III, ed. C. DeWitt, E. Schatzman and P. Veron, p. 414.
Gordon and Breach: New York.
Thorne, K. S. (1970). Nonspherical gravitational collapse: does it produce black holes?
Comment. Astrophys. Space Phys., 2, 191-9.
Thorne, K. S. (1972). Nonspherical gravitational collapse: a short review. In Magic
without Magic, ed. J. R. Klauder, pp. 231-58. W. H. Freeman: San Francisco.
Thorne, K. S., Price, R. H. and MacDonald, D. A. (eds) (1986). Black Holes: The
Membrane Paradigm, chap. 1. Yale University Press: New Haven.
Tipler, F. J., Clarke, C. J. S. and Ellis, G. F. R. (1980). Singularities and horizons - a
review article. In General Relativity and Gravitation, ed. A. Held, Vol. 2, pp. 97-206.
Plenum Press: New York.
Unruh, W. G. and Wald, R. M. (1982). Acceleration radiation and the generalized
second law of thermodynamics. Phys. Rev., D25, 942-58.
Wali, K. C. (1982). Chandrasekhar vs. Eddington - an unanticipated confrontation.
Physics Today (October), pp. 33-40.
Weisskopf, V. F. (1975). Of atoms, mountains and stars: a study in qualitative physics.
Science, 187, 605-12.
Weyl, H. (1917). Zur Gravitationstheorie. Ann. Phys., 54, 117-45.
Wheeler, J. A. (1955). Geons. Phys. Rev., 97, 511-36.
Wheeler, J. A. (1966). Superdense stars. Ann. Rev. Astron. Astrophys., 4, 393-432, see
p. 418.
Wheeler, J. A. (1968). Our universe: the known and the unknown. Amer. Sci., 56, 1-20.
Whittaker, E. T. (1949). From Euclid to Eddington, p. 124. Cambridge University Press:
Cambridge.
Wolfendale, A. W. (1986). Address on presentation of Gold Medal of the R.A.S. to
Ya. B. Zel'dovich. Q. J. R. Astron. Soc., 27, 541-2.
Zel'dovich, Ya. B. (1961). The equation of state at ultrahigh densities. Sov. Phys. JETP,
14, 1143-7 (1962).
Zel'dovich, Ya. B. (1964). The fate of a star and the evolution of gravitational energy
upon accretion. Sov. Phys. Dokl., 9, 195-7.
Zel'dovich, Ya. B. (1965). Survey of modern cosmology. Adv. in Astron. Astrophys., 3,
241-379, see p. 335.
Zel'dovich, Ya. B. (1970). Generation of waves by a rotating body. JETP Lett., 14,
180-1.
Zel'dovich, Ya. B. and Novikov, I. D. (1964). Mass of quasi-stellar objects. Sov. Phys.
Dokl., 9, 834-7.
276 W Israel
8.1 Introduction
The idea that stars might exist with surface escape velocities in excess of the
speed of light is at least two hundred years old (Michell, 1784; Laplace,
1795). However, it was not until the work of Chandrasekhar (1931) that
there was good reason to believe that they ought to exist. As is well known,
Chandrasekhar showed that white dwarf stars, supported by relativistic
electron degeneracy pressure, have a maximum mass of $1.4\,M_\odot$ (for a mean
molecular weight per electron of 2; cf. also Landau, 1932). Somewhat
later, the discovery of the neutron led to the proposal that neutron stars also
exist (Landau, cited in Shapiro and Teukolsky, 1983), and to the prescient
suggestion by Baade and Zwicky (1934) that 'supernovae represent the
transitions from ordinary stars into neutron stars'. The maximum mass of a
neutron star was calculated (Oppenheimer and Volkoff, 1939) to be
$0.75\,M_\odot$. Although this is rather lower than the value calculated using a
modern equation of state, it verified the general conclusion that cold stars
were limited to a mass equivalent to $\sim (Gm_p^2/\hbar c)^{-3/2}$ nucleons.
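The size of this limiting nucleon number can be checked with a few lines of arithmetic. The following is a minimal sketch in CGS units; the rounded constants and the function name are illustrative assumptions, and the result is only an order-of-magnitude scale, not a stellar-structure calculation.

```python
# Order-of-magnitude check of the limiting mass scale for cold stars (CGS units).
# The constants are rounded values assumed for illustration.
G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
HBAR = 1.0546e-27   # reduced Planck constant, erg s
C = 2.998e10        # speed of light, cm s^-1
M_P = 1.6726e-24    # proton mass, g
M_SUN = 1.989e33    # solar mass, g


def limiting_nucleon_number():
    """Return (G m_p^2 / hbar c)^(-3/2), the characteristic nucleon count."""
    alpha_g = G * M_P**2 / (HBAR * C)  # dimensionless gravitational coupling
    return alpha_g ** -1.5


n_nucleons = limiting_nucleon_number()
mass_in_suns = n_nucleons * M_P / M_SUN
print(f"N ~ {n_nucleons:.2e} nucleons, i.e. M ~ {mass_in_suns:.2f} solar masses")
```

The scale comes out at a couple of solar masses, the same order as the white dwarf and neutron star limits quoted above; the numerical coefficients in those limits depend on the equation of state.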
are measured in units of m and only negative contours are shown.) The
specific angular momentum is chosen to have a constant value l = 4m so that
the isobars coincide with the equipotential surfaces and the contour passing
through the cusp has zero binding energy. Gas can fill up the region occupied
by the contours outside r = 4m, and may pour through the cusp onto the
black hole. The zero energy surface defines a pair of funnels which may be
responsible for channelling some of the outflow and beaming some of the
radiation in bright quasars and Seyfert galaxies.
Astrophysical black holes 283
hole. However, we can describe the gravitational far field around a spinning
black hole using an electromagnetic analogy and decomposing it into a
gravitoelectric field $\mathbf{g}$ (the usual Newtonian gravity) and a gravitomagnetic
contribution $\mathbf{H}$. The Einstein equations for $\mathbf{g}$, $\mathbf{H}$ for material of density $\mu$ and
velocity $v \ll c$ approximate to a form strikingly similar to Maxwell's
equations; the geodesic equation for the motion of a freely falling particle
becomes equivalent to the Lorentz force law;
$$\nabla \cdot \mathbf{g} = -4\pi\mu, \qquad \nabla \cdot \mathbf{H} = 0, \qquad \nabla \times \mathbf{g} = 0,$$
(e.g. Braginsky et al., 1977, where the equations are actually given to second
order in $v$). $\mathbf{g}$ and $\mathbf{H}$ can be derived from scalar and vector potentials,
respectively; $\mathbf{g} = -\nabla\phi$, $\mathbf{H} = \nabla \times \boldsymbol{\gamma}$, where $\phi$, $\boldsymbol{\gamma}$ are related to the metric tensor,
$g_{\alpha\beta}$, by $\phi = (1 + g_{00})/2$, $\gamma_i = g_{0i}$. Note the minus sign in the 'Poisson' and
'Ampère' equations expressing the attractive character of gravity and the
extra factor of 4, which relates to the spin-2 nature of the gravitational field.
Now, from our electromagnetic experience, we can infer immediately that
at a distance $r \gg m$, where the space is approximately flat, a hole with spin
angular momentum $\mathbf{S}$ will be surrounded by a gravitoelectric field
$\mathbf{g} = -m\hat{\mathbf{r}}/r^2$, where the gravitoelectric charge is the hole's mass $m$. If we use
the classical gyromagnetic ratio, we identify $\mathbf{S}/2$ as the gravitomagnetic
dipole moment. Now if we use the fact that the gravitomagnetic field is $-4$
times the usual dipolar field, we obtain an expression for the dipolar
gravitomagnetic field surrounding a spinning black hole,
$$\mathbf{H} = 2[\mathbf{S} - 3(\mathbf{S} \cdot \hat{\mathbf{r}})\hat{\mathbf{r}}]/r^3. \qquad (8.3)$$
$\mathbf{g}$ and $\mathbf{H}$ communicate information to a distant observer about the two
parameters $m$, $\mathbf{S}$ which characterise an uncharged black hole.
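The statement that the gravitomagnetic field is $-4$ times the usual dipolar field of a moment $\mathbf{S}/2$ can be checked numerically. The sketch below assumes units with G = c = 1 and uses hypothetical helper names; it evaluates the expression in (8.3) and compares it with the textbook dipole formula $[3(\boldsymbol{\mu} \cdot \hat{\mathbf{r}})\hat{\mathbf{r}} - \boldsymbol{\mu}]/r^3$.

```python
import math


def dot(a, b):
    """Euclidean dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def gravitomagnetic_dipole(spin, r_vec):
    """H = 2[S - 3(S.r_hat) r_hat]/r^3, as in equation (8.3)."""
    r = math.sqrt(dot(r_vec, r_vec))
    r_hat = [x / r for x in r_vec]
    s_r = dot(spin, r_hat)
    return [2.0 * (s - 3.0 * s_r * n) / r**3 for s, n in zip(spin, r_hat)]


def ordinary_dipole(moment, r_vec):
    """Usual dipolar field [3(mu.r_hat) r_hat - mu]/r^3 of a dipole moment mu."""
    r = math.sqrt(dot(r_vec, r_vec))
    r_hat = [x / r for x in r_vec]
    m_r = dot(moment, r_hat)
    return [(3.0 * m_r * n - m) / r**3 for m, n in zip(moment, r_hat)]


# Arbitrary illustrative spin vector and field point (assumed values).
spin = [0.2, -0.1, 1.0]
point = [1.0, 2.0, -0.5]

h_field = gravitomagnetic_dipole(spin, point)
minus_four_dipole = [-4.0 * b for b in ordinary_dipole([s / 2.0 for s in spin], point)]

# On the spin axis the field is -4S/r^3; in the equatorial plane it is +2S/r^3.
h_pole = gravitomagnetic_dipole([0.0, 0.0, 1.0], [0.0, 0.0, 2.0])
h_equator = gravitomagnetic_dipole([0.0, 0.0, 1.0], [2.0, 0.0, 0.0])
print(h_field, minus_four_dipole, h_pole, h_equator)
```

The two evaluations agree component by component, which is just the algebraic identity behind (8.3).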
We can now use these fields to calculate the rate of precession of a ring of
gas in inclined orbit about a spinning black hole with specific angular
momentum $\mathbf{l}$. Pursuing our electromagnetic analogy, a torque per unit mass
of $\mathbf{r} \times (\mathbf{v} \times \mathbf{H})$ will act upon elements of the ring. Using the formula for
dipolar gravitomagnetic field and averaging over azimuth for a small angle
of inclination gives a mean torque per unit mass of $2(\mathbf{S} \times \mathbf{l})/r^3$. The
284 R. D. Blandford
allowed spin angular momentum of the black hole and $v_r$ is the inflow speed.
(We discuss this further below.) A corollary is that the hole may change its
spin substantially after it has increased its mass by a fractional amount
$(S/S_{\max})(m/r_{\rm BP})^{1/2}$. Some extragalactic radio sources show jets that bend on
both sides of the nucleus with a pronounced 'inversion' or 'S' symmetry (see
Fig. 8.2). If jets are launched along the spin axis of the hole, then this type of
distortion may be related to changes in the spin direction of the hole.
Fig. 8.2. 'Inversion symmetric' extragalactic radio source associated with the
galaxy NGC326 (marked with a +). The apparent S-type symmetry of the
radio contours might be caused by a systematic change in the orientation of
the spin axis of a central black hole. (R. Ekers, personal communication.)
[Radio contour map of NGC326; axes: RA, Dec.]
8.2.3 Electromagnetic effects
Another aspect of black hole astrophysics that is amenable to a classical or,
more specifically, 'Maxwellian' interpretation is the interaction of a black
hole with a magnetic field. Although charged (Kerr-Newman) black holes
are irrelevant to astronomy, uncharged Kerr holes interacting with
magnetic fields supported by currents external to the horizon may be highly
relevant. Some intuition about this interaction can be obtained using a
sequence of four thought experiments.
Firstly, if a Schwarzschild black hole is placed in a uniform electric field,
then, after the transients have decayed, a modified static field will remain in
which the physical components of the electric field (as seen by an observer
hovering over the horizon) become radial at the event horizon (e.g. Press,
1972). The observer might therefore think that the black hole is behaving
like an electrical conductor and he can introduce a fictitious surface charge
density on the horizon consistent with Gauss' theorem. This representation
of an event horizon as a 'quasi-Newtonian' surface endowed with
conductivity, as well as viscosity, surface pressure, momentum etc.
(Damour, 1978; Znajek, 1978) is termed the membrane paradigm and has
been extensively explored by Thorne and colleagues (Thorne, Price and
MacDonald, 1986).
We ought to clarify what is meant by electric and magnetic fields at this
stage. One way to do this is to define a set of fiducial observers. These
observers will be at rest in Schwarzschild coordinates if the hole is non-rotating
and coincide with the 'zero angular momentum' observers in the
spacetime of a rotating hole (Bardeen, 1973). The electric and magnetic
fields that we are talking about are the ones that they would measure. In the
absence of free charges, these fields are solenoidal on the 3D hypersurfaces of
constant Schwarzschild or Boyer-Lindquist time. We can therefore define
electric and magnetic field lines (see Fig. 8.3).
In a second thought experiment, let a cloud of magnetised plasma fall into
a black hole. As it approaches the horizon, the cloud will be tidally distorted
and the electromagnetic field will vary rapidly. After the plasma has crossed
the horizon, electromagnetic fields will linger in the neighbourhood of the
horizon for roughly a light crossing time before decaying to leave behind a
'bald' black hole (consistent with the no-hair theorems). This tells us that if
we regard the event horizon as possessing a conductivity, its value cannot be
infinite. In flat space, a sphere of radius $\lambda$ and electrical resistance $R$ will
lose magnetic field in a time $\sim \lambda/Rc^2$. Equating this to the light-crossing
time $\sim \lambda/c$ tells us that the resistance of the hole is $R_H \sim 1/c = 30\,\Omega$.
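The conversion of $1/c$ into ohms is a matter of units. In Gaussian units a resistance has dimensions of s cm$^{-1}$, and 1 s cm$^{-1}$ corresponds to $c^2 \times 10^{-9}$ ohms with $c$ in cm s$^{-1}$. The following minimal sketch (function and variable names are illustrative assumptions) carries out the conversion and, for comparison, also converts $4\pi/c$, the impedance of free space in the same units.

```python
import math

C_CGS = 2.99792458e10                 # speed of light, cm s^-1
OHMS_PER_GAUSSIAN_UNIT = C_CGS**2 * 1e-9  # 1 s/cm expressed in ohms (~9e11)


def gaussian_resistance_to_ohms(resistance_s_per_cm):
    """Convert a resistance from Gaussian units (s/cm) to ohms."""
    return resistance_s_per_cm * OHMS_PER_GAUSSIAN_UNIT


# Equating the decay time lambda/(R c^2) to the light-crossing time lambda/c
# gives R ~ 1/c, independent of the size lambda.
r_hole_ohms = gaussian_resistance_to_ohms(1.0 / C_CGS)
free_space_ohms = gaussian_resistance_to_ohms(4.0 * math.pi / C_CGS)
print(f"1/c     -> {r_hole_ohms:.1f} ohm")
print(f"4 pi/c  -> {free_space_ohms:.1f} ohm")
```

The first figure reproduces the 30 ohms quoted in the text; the second shows that the estimate is within an order of magnitude of the impedance of free space.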
E_H , B_H = lim_{r → 2m} E r h , B r h , (8.7)
Fig. 8.3. An isolated Schwarzschild black hole is immersed in a uniform
electric field. (a) If we plot the electric field lines in Schwarzschild coordinates
(in which a finite interval of proper radius close to the horizon corresponds to
a very small interval of coordinate radius), the field lines appear to remain
straight. (b) If we transform the radial coordinate to proper radius, then the
field lines curve and cross the horizon normally. Observers hovering just
above the horizon and falling through it radially will both measure a field
perpendicular to the horizon. (Adapted from Thorne et al., 1986.)
lying in the horizon, which remain finite. (The radial field components
remain finite without any renormalisation.) Note that
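$$\mathbf{E}_H = \mathbf{E}' + \hat{\mathbf{r}} \times \mathbf{B}', \qquad \mathbf{B}_H = \mathbf{B}' - \hat{\mathbf{r}} \times \mathbf{E}', \qquad (8.8)$$
verifying that $\mathbf{E}_H$, $\mathbf{B}_H$ are equal in magnitude and perpendicular (see Fig.
8.4).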
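That fields built as in (8.8) are equal in magnitude and mutually perpendicular can be confirmed numerically. The sketch below assumes that the primed fields are purely tangential (perpendicular to the radial unit vector), which is an assumption made here for illustration; the sample values and helper names are likewise hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    """Euclidean dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def horizon_fields(e_prime, b_prime, r_hat):
    """Return (E_H, B_H) from E_H = E' + r_hat x B', B_H = B' - r_hat x E'."""
    e_h = [e + c for e, c in zip(e_prime, cross(r_hat, b_prime))]
    b_h = [b - c for b, c in zip(b_prime, cross(r_hat, e_prime))]
    return e_h, b_h


# Radial direction along z; arbitrary tangential sample fields (assumed values).
r_hat = [0.0, 0.0, 1.0]
e_prime = [0.3, -1.2, 0.0]
b_prime = [0.7, 0.4, 0.0]

e_h, b_h = horizon_fields(e_prime, b_prime, r_hat)
e_dot_b = dot(e_h, b_h)
e_mag = math.sqrt(dot(e_h, e_h))
b_mag = math.sqrt(dot(b_h, b_h))
print(e_h, b_h, e_dot_b, e_mag, b_mag)
```

For tangential inputs the result satisfies $\mathbf{B}_H = -\hat{\mathbf{r}} \times \mathbf{E}_H$, the relation between the fields of a plane wave travelling inward across the horizon.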
Furthermore, the Poynting energy flux crossing the horizon, as measured
by an observer at infinity, will be smaller than that measured by the
Fig. 8.4. (a) When an isolated black hole is penetrated by electric field lines,
we can imagine that these terminate on surface charges located just above the
horizon (i.e. on an imaginary surface called the stretched horizon). The
associated surface charge density is given by $E_\perp/4\pi$. (b) A renormalised
component of tangential magnetic field $\mathbf{B}_H$ just outside the horizon can
likewise be thought of as terminating at a surface current density $\mathbf{J}_H =
\hat{\mathbf{r}} \times \mathbf{B}_H/4\pi$. (c) The surface boundary condition discussed in the text implies
Fig. 8.6. (a) When an isolated hole spins (with angular velocity $\Omega_H$), parallel
to an external magnetic field $B_0$, an electric field and an apparent
quadrupolar surface charge density will be induced. The potential difference
between the pole and the equator is $V \sim \Omega_H B_0 m^2$. (b) When the pole and the
equator are connected via a load of resistance $R_L$, a current will flow and
power will be dissipated in the load. This power derives from the spin of the
hole. (Adapted from Thorne et al., 1986.)
current flows along the magnetic surfaces. These surfaces therefore replace
the wires used in the circuit analysis.
Now decompose the magnetosphere into many nested elementary circuits
in which the current flows along adjacent magnetic surfaces and connects
across an annular ring of the horizon (resistance $\Delta R_H$) and a part of the load
(resistance $\Delta R_L$). When the load supplies most of the resistance in the
elementary circuit ($\Delta R_H \ll \Delta R_L$), the field lines will move with the hole
($\omega \simeq \Omega_H$). When the load resistance is negligible (e.g. if it is a highly
conducting accretion disk), the magnetic field lines will be frozen into it and
will rotate with its angular velocity $\Omega_L$ (e.g. MacDonald and Thorne, 1982).
If we transform into a frame corotating with the magnetic field lying
between two magnetic surfaces, with angular velocity $\omega$, the potential
difference across the annular ring of horizon is $\Delta V_H = (\Omega_H - \omega)\Delta\Phi/2\pi$, where
$\Delta\Phi$ is the magnetic flux between the two surfaces. Similarly, the potential
difference across the load is $\Delta V_L = (\omega - \Omega_L)\Delta\Phi/2\pi$. Now current
conservation requires that $I = \Delta V_H/\Delta R_H = \Delta V_L/\Delta R_L$ or
$$\frac{\omega - \Omega_L}{\Omega_H - \omega} = \frac{\Delta R_L}{\Delta R_H}. \qquad (8.10)$$
The magnetosphere acts as a sort of clutch that couples the hole to the load
and rotates with a compromise angular velocity $\omega = (\Omega_H \Delta R_L + \Omega_L \Delta R_H)/
(\Delta R_L + \Delta R_H)$. As it has no inertia itself, it transmits a torque $\Delta G = I\Delta\Phi/2\pi =
I^2(\Delta R_H + \Delta R_L)/(\Omega_H - \Omega_L)$. This torque does work at a rate $-\Delta G\,\Omega_H$ on the
hole, changing its spin energy at this rate. It also does work at a rate $\Delta G\,\Omega_L$
on the load, increasing its spin energy. The sum of these two powers, which
of course must be negative, represents dissipation at a rate $I^2 \Delta R_H$ in the hole
plus $I^2 \Delta R_L$ in the load.
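The bookkeeping for one elementary circuit can be written out explicitly. The sketch below is an illustration only: the function name, the dictionary keys and the sample numbers are assumptions, and all quantities are in arbitrary consistent units. It solves for the compromise angular velocity, the current and the torque, and then checks that the two work rates sum to minus the total ohmic dissipation.

```python
import math


def elementary_circuit(omega_h, omega_l, dr_h, dr_l, dphi):
    """Solve one elementary circuit linking an annulus of horizon to a load.

    omega_h, omega_l : angular velocities of hole and load
    dr_h, dr_l       : resistances of the horizon annulus and of the load
    dphi             : magnetic flux between the two magnetic surfaces
    """
    # Compromise angular velocity of the field lines, from equation (8.10).
    omega = (omega_h * dr_l + omega_l * dr_h) / (dr_l + dr_h)
    # Current from the potential drop across the horizon annulus.
    current = (omega_h - omega) * dphi / (2.0 * math.pi * dr_h)
    torque = current * dphi / (2.0 * math.pi)
    return {
        "omega": omega,
        "current": current,
        "torque": torque,
        "power_on_hole": -torque * omega_h,
        "power_on_load": torque * omega_l,
        "dissipation": current**2 * (dr_h + dr_l),
    }


# Illustrative numbers: equal horizon and load resistances (assumed values).
result = elementary_circuit(omega_h=0.25, omega_l=0.05,
                            dr_h=4.0 * math.pi, dr_l=4.0 * math.pi, dphi=1.0)
net_power = result["power_on_hole"] + result["power_on_load"]
print(result, net_power)
```

With equal resistances the field lines rotate at the mean of the two angular velocities, and in the limits $\Delta R_H \ll \Delta R_L$ and $\Delta R_L \to 0$ the function returns $\omega \to \Omega_H$ and $\omega \to \Omega_L$ respectively, as described in the text.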
Next, let us idealise the problem further by imagining that the hole is
rotating with a moderate angular velocity ($\Omega_H \simeq 1/8m$) and that low latitude
magnetic field lines connect to the inner parts of the disk, which is an
excellent conductor and rotates more rapidly than the hole with angular
velocity $\Omega_D > \Omega_H$. Intermediate latitude field lines connect to the disk at
large radii where the disk is rotating less rapidly than the hole. Finally, high
latitude field lines are open and accelerate a relativistic wind.
Under these conditions, the black hole may evolve in several different
ways. The disk-connected flux will exert a torque on the hole given by an
integral over the horizon
$$G_D = \int_{S_H} (\Omega_D - \Omega_H)\, r_H^2 B_n \, d\Phi, \qquad (8.11)$$
where $2\pi r_H$ is the circumference of the horizon at that latitude and $B_n$ is the
normal component of magnetic field at the horizon. Clearly $G_D$ can have
either sign and in particular, it is possible for the disk to spin up the hole in
an analogous manner to the spin up in pulsating X-ray binaries. The mass of
the hole will increase at a rate that is the sum of the ohmic dissipation and
the net work done on it by the disk torque. Conversely, a rapidly spinning
hole may actually inhibit disk accretion if it is magnetically coupled to the
inner edge of the accretion disk .
The open field lines can also change the mass and the spin of the hole.
However, the boundary conditions at large distance are problematical. If we
try to match the outflow onto a wind with slow speed $\sim v$, then the effective
impedance will be $R_L \sim v \ll R_H$. Energy will only be extracted from the hole
with low efficiency in this case. However, if the outflow is relativistic then we
can have a good impedance matching and roughly half the spin energy can
be liberated. Unfortunately, it is hard to be more quantitative than this, as
we have to match the magnetospheric solutions onto an outflowing solution
and thus determine the value of w on the field lines.
In an important extension of these ideas, Phinney (1983) (cf. also
Damour, 1975; Carter, 1979) has replaced the force-free equations with
relativistic MHD theory. This allows particles, as well as electromagnetic
fields, to transport energy and angular momentum. In the limit that the fluid
motion dominates the magnetic stresses, the Penrose process is recovered.
Phinney argues that, although the particle stresses may be small, the
requirement that the outflowing plasma pass through the necessary Alfvénic
and magnetosonic critical points to give a supersonic wind may actually
determine $\omega$ and therefore the efficiency of energy extraction, independent of
the details of the dissipation in the load . Another aspect of the theory which
must be treated self-consistently is the shape of the magnetic surfaces . These
will be determined by imposing poloidal force balance and , if we are given a
prescription for fixing $\omega$, by the distribution of toroidal current in the
accretion disk.
Electromagnetic extraction of energy from a spinning hole is a strong
candidate for powering the jets of relativistic plasma that are observed in
extragalactic radio sources. It may also be relevant to stellar mass black
holes in mass transfer binary systems.
8.2.4 Accretion onto black holes
As this topic has been extensively reviewed elsewhere, I shall just consider a
few recent developments.
disks around massive black holes in AGN. Limit cycle behaviour can be
observed in accretion disks and may be explained by a variety of different
mechanisms leading to episodic accretion (e.g. Lin et al., 1985; Shields et al.,
1986). There is no guarantee that the accretion will be steady. In addition, as
the disk is differentially rotating, it may be able to generate magnetic fields
by some sort of dynamo process (e.g. Pudritz, 1983) as well as convect them
inwards. These fields can energise a corona where non-thermal radiation
lower levels of activity may be supported in this way (e.g. Duncan and
Shapiro, 1983; David, Durisen and Cohn, 1986). The rate of supply of gas
may however be enhanced if the potential is triaxial (e.g. Norman and Silk,
1983) or the black hole is in a binary (e.g. Roos, 1981). An interesting
mechanism proposed by Carter and Luminet (1982) is that the tidal shock a
star receives on passage by a black hole may detonate a nuclear explosion
and lead to a sudden release of energy. However, more detailed calculations
have concluded that this does not occur in practice (Bicknell and Gingold,
1986). Massive black holes may actually be built up out of stars in a dense
cluster. In an impressive series of relativistic stellar dynamical computations,
Shapiro and Teukolsky (1985) have shown how a collapse may begin.
However, it remains to be seen if the collapse will continue and whether or
not a large black hole can be made in this manner.
Extensive reviews of accretion onto black holes are to be found in the
books by Shapiro and Teukolsky (1983) and Frank, King and Raine (1985)
as well as the articles by Pringle (1981), Begelman, Blandford and Rees
(1984), Begelman (1985) and especially Rees (1984).
8.3.1.1 Early history
Although the discovery of quasars in 1963 is widely cited as initiating the
modern era of astronomy, three parallel lines of investigation were all
leading up to this discovery and already posed the problems with which
astrophysicists were soon to be confronted. Together, and with the
considerable benefit of hindsight, these observations also provided strong
clues as to the solutions of these problems.
Firstly, Fath (1908) observed the spiral galaxy NGC 1068 (of course
before it was known to be an external galaxy), and discovered unusual high
excitation lines. Subsequent observations by Slipher (1917) demonstrated
that it had a redshift of 1100 km s$^{-1}$ and that the lines were broader than
could be accounted for by rotation of the galaxy. Other galaxies like NGC
1068 showing bright condensed nuclei and broad, high excitation emission
lines were studied by Seyfert (1943) and belong to a class which now bears
his name. Interest in these galaxies waned somewhat until the late 1950s
when Woltjer (1959) realised that Seyfert galaxies constituted a few per cent
of all spirals and that consequently they must live for a few per cent of the age
of the universe, at least $10^8$ yr - much longer than the time it would take the
emitting gas to escape from the nucleus. As the powers of nuclei of the
brighter Seyfert galaxies were known to approach those of typical galaxies
($\sim 4 \times 10^{43}$ erg s$^{-1}$), it was already apparent that nuclear activity required
considerable energy ($\gtrsim 10^{59}$ erg per galaxy). Furthermore, the existence of
smaller scale activity in our Galaxy and the Andromeda galaxy strongly
hinted that most bright galaxies had active nuclei. Woltjer also deduced that
a blue photo-ionising spectrum would be necessary to excite the emitting gas
and suggested that it was produced by a cluster of hot stars, an idea that was
contradicted by the spectroscopic measurements of NGC 1068 by Burbidge,
Burbidge and Prendergast (1959), which required a smaller mass-to-ultraviolet
light ratio than a normal distribution of stars.
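The energy requirement follows from multiplying the nuclear power by the minimum lifetime. A minimal sketch of the arithmetic (the variable names and the rounded length of the year are assumptions):

```python
SECONDS_PER_YEAR = 3.156e7   # rounded length of a year in seconds

seyfert_power = 4e43         # erg s^-1, nuclear power quoted in the text
seyfert_lifetime_yr = 1e8    # yr, minimum lifetime quoted in the text

seyfert_energy = seyfert_power * seyfert_lifetime_yr * SECONDS_PER_YEAR
print(f"Energy per active nucleus ~ {seyfert_energy:.1e} erg")
```

The product is a little over $10^{59}$ erg, in line with the figure quoted.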
Secondly, many radio 'stars' were identified with external galaxies
following the demonstration by Baade in 1951 (using an accurate radio
position determined by Smith) that the second most powerful radio source
in the sky, Cygnus A (Reber, 1944), was identified with an unusual galaxy at
redshift $z = 0.05$ (Baade and Minkowski, 1954). The radio power was
estimated to be $10^{44}$ erg s$^{-1}$, larger than the optical luminosity of a bright
galaxy. What was equally remarkable was the discovery of high excitation
emission lines in the galaxy spectrum. These were interpreted as evidence for
a galaxy collision, an idea which had extraordinary longevity in view of the
large power involved. In fact the energetics were strained even further when
Shklovsky and Ginzburg proposed that the radio emission be produced by
the synchrotron process and Burbidge (1959) showed that a total energy of
$3 \times 10^{59}$ erg was required to account for the intensity of the radio emission.
Furthermore, if the ratio of relativistic protons to electrons in the source
were to be similar to that found in galactic cosmic rays the minimum energy
would have to increase by a factor ten to $\sim 3 \times 10^{60}$ erg, the rest mass
equivalent of $2 \times 10^6\,M_\odot$, enough to power the radio source for a billion
years.
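Both conversions in the last sentence are simple to verify. The sketch below uses rounded CGS constants (assumed values) to turn the minimum energy into a rest-mass equivalent and into a lifetime at the quoted radio power.

```python
C = 2.998e10                 # speed of light, cm s^-1
M_SUN = 1.989e33             # solar mass, g
SECONDS_PER_YEAR = 3.156e7   # rounded length of a year in seconds

min_energy = 3e60            # erg, minimum energy including protons (from the text)
radio_power = 1e44           # erg s^-1, radio power of Cygnus A (from the text)

rest_mass_suns = min_energy / (M_SUN * C**2)
lifetime_yr = min_energy / radio_power / SECONDS_PER_YEAR
print(f"Rest-mass equivalent ~ {rest_mass_suns:.1e} solar masses")
print(f"Lifetime at 1e44 erg/s ~ {lifetime_yr:.1e} yr")
```

The results, roughly $2 \times 10^6$ solar masses and $10^9$ yr, agree with the rounded values in the text.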
Strong evidence against the colliding galaxy model also came from the
interferometric results of Jennison and Das Gupta (1953) who showed that
Cygnus A in fact comprised two separate regions of emission that straddled
the optical galaxy instead of being concentrated on it. This pattern was to be
repeated in many more radio galaxies (Matthews, Maltby and Moffet,
1962). Powerful radio galaxies gave the first indications of the enormous
energies involved in the most extreme examples of AGN.
The third line of enquiry, although limited to a single example, was to
provide the link between the first two. In 1917, Curtis discovered that the
elliptical galaxy M87 was unusual in that it had a narrow linear feature, now
called a jet, protruding from its nucleus. In 1956, Hiltner discovered linear
polarisation in the jet which enabled Burbidge (1956) to infer that the optical
emission was synchrotron radiation by relativistic electrons and that at least
$10^{56}$ erg of energy was required.
So, by the time of the announcement of the discovery of the quasar 3C273
(Schmidt, 1963; Hazard, Mackey and Shimmins, 1963), there was good
evidence for ongoing nuclear activity and the copious production of relativistic
electrons with aggregate energy far in excess of what might be expected from
normal stars. This was even more strongly true of 3C273, which had an
optical power of $\sim 10^{46}$ erg s$^{-1}$ and an even longer jet than that in M87, seen at
both radio and optical wavelengths. The association of quasars with these
milder phenomena was clearly understood (e.g. Burbidge, Burbidge and
Sandage, 1964).
8.3.1.2 Early models of quasars
In addition to these observational precursors of quasars, the theoretical
ideas that were to prove useful were also being developed. In fact the first
Texas conference on relativistic astrophysics (Robinson, Schild and
Schucking, 1964) was convened prior to the discovery of the quasars that
were to dominate its discussions, stimulated in part by the theoretical
challenges posed by the double radio sources (Maran and Cameron, 1964).
Hoyle and Fowler had previously proposed a model of AGN that involved a
'superstar' forming out of $\sim 10^6\,M_\odot$ of radiation-dominated gas. This
the ratio of the gas pressure to the radiation pressure, already a small
number. Superstars could therefore not form efficient machines for
converting mass into energy; neither did they seem particularly promising
for accelerating relativistic electrons. Nevertheless, this model did seem to
be on the right track, because Smith and Hoffleit (1963) soon discovered that
quasars were variable on timescales $\lesssim 1$ month, implying that the energy was
produced within a region $\lesssim 10^{17}$ cm across. (Curiously, the discovery of
even more rapid variability in Seyfert galaxies was not made until 1968, by
Pacholczyk and Weymann (1968).)
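The size limit is the distance light travels in the variability time. A minimal sketch, taking a month as 30 days (an assumption) and ignoring relativistic and cosmological corrections:

```python
C = 2.998e10                      # speed of light, cm s^-1
SECONDS_PER_DAY = 86400.0


def light_travel_size(variability_days):
    """Upper limit on the source size (cm) from a variability timescale in days."""
    return C * variability_days * SECONDS_PER_DAY


quasar_size_limit = light_travel_size(30.0)
print(f"Size limit for 1-month variability ~ {quasar_size_limit:.1e} cm")
```

This gives about $8 \times 10^{16}$ cm, which rounds to the $10^{17}$ cm quoted.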
Various efforts to stabilise superstars using rotation, turbulence, nuclear
reactions and a 'C-field' proved unconvincing. Although it was shown that a
low entropy (i.e. thin) disk could exist with a binding energy up to 0.38 times
its mass (Bardeen and Wagoner, 1969), this structure was liable to serious
instabilities (Salpeter and Wagoner, 1971; Salpeter, 1971) and would
probably either fragment into stars or thicken to a high entropy disk
radiating a significant fraction of the Eddington limit, of unspecified
structure and unknown stability properties.
Apart from the problem of not releasing much energy before collapsing
there was the separate difficulty of accelerating relativistic electrons to
produce non-thermal synchrotron radiation. For this reason several authors
proposed that the active agent within a quasar be a magnetised spinning
superstar or disk, called a spinar or magnetoid (e.g. Sturrock, 1965;
Ginzburg and Ozernoi, 1964). A simple estimate based on the virial theorem
suggested that a mass $M$ would be permeated by a magnetic flux
$\sim 10^{30} (M/M_\odot)$ G cm$^2$. For a mass $\sim 10^8\,M_\odot$ and a size $\sim 10^{16}$ cm as
indicated by the observations this suggested field strengths up to $10^6$ G which
would cause the disk to evolve rapidly dynamically (e.g. Woltjer, 1971).
Again most of the energy would be released explosively near the end point of
the evolution just prior to the formation of a black hole. Several suggestions
were made that the double radio sources might be formed as a consequence
of such an explosion or outburst from a magnetised spinning superstar.
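The field strength follows from dividing the flux by the square of the size. A short sketch of that estimate (the function name is illustrative; factors of order unity such as $\pi$ are ignored, as in the text):

```python
def spinar_field_gauss(mass_in_suns, size_cm, flux_per_solar_mass=1e30):
    """Field strength B ~ flux / size^2 for a flux of ~1e30 (M/M_sun) G cm^2."""
    flux = flux_per_solar_mass * mass_in_suns   # G cm^2
    return flux / size_cm**2


field_estimate = spinar_field_gauss(mass_in_suns=1e8, size_cm=1e16)
print(f"B ~ {field_estimate:.0e} G")
```

For $10^8$ solar masses and $10^{16}$ cm this returns $10^6$ G, as stated.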
Now these models essentially identified the formation of a black hole with
the end of the activity. By contrast, Greenstein and Schmidt (1964) realised
that it was '... important to know if continued energy and mass output from
such a collapsed object are possible.' A clear answer to this question was
provided by Zeldovich and Novikov (1964) and Salpeter (1964). These
authors pointed out that gas accreting onto a black hole could release its
energy reasonably efficiently. In the case of a Schwarzschild black hole, gas
should orbit the hole in circular orbits and so the efficiency of energy release
should be given roughly by the binding energy of the most bound stable
circular orbit (i.e. $\sim 0.06$). They also realised that if gas were made freely
available the accretion rate could build up to a limit setting in when the
pressure of the escaping radiation equalled the attractive pull of gravity.
Under either optically thick or optically thin conditions, this is given by
(8.12)
$$t_{\rm Edd} = \frac{M}{\dot{M}_{\rm Edd}} = \frac{\sigma_T c}{4\pi G m_p} \simeq 5 \times 10^8\ {\rm yr}. \qquad (8.13)$$
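The characteristic timescale $\sigma_T c/4\pi G m_p$ depends only on fundamental constants and can be evaluated directly. The sketch below uses rounded CGS values (assumptions) and takes the limiting accretion rate to be the Eddington luminosity divided by $c^2$, i.e. without any efficiency factor.

```python
import math

G = 6.674e-8                 # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10                 # speed of light, cm s^-1
M_P = 1.6726e-24             # proton mass, g
SIGMA_T = 6.6524e-25         # Thomson cross-section, cm^2
SECONDS_PER_YEAR = 3.156e7   # rounded length of a year in seconds


def eddington_time_years():
    """Return t_Edd = sigma_T c / (4 pi G m_p) in years."""
    t_seconds = SIGMA_T * C / (4.0 * math.pi * G * M_P)
    return t_seconds / SECONDS_PER_YEAR


t_edd_yr = eddington_time_years()
print(f"t_Edd ~ {t_edd_yr:.2e} yr")
```

The value is about $4.5 \times 10^8$ yr, independent of the mass of the hole, consistent with the rounded figure of $5 \times 10^8$ yr.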
It now seems rather surprising that black hole models of AGN were not
developed more rapidly (although the subsequent discovery of the
microwave background in 1965 and of pulsars in 1967 must have distracted
attention away from quasars). Part of the explanation was that it was hard to
understand how the infall of gas could co-exist with outflow as apparently
observed in the inner parts of several galaxies, including our own. Another
reason was that it seems not to have been realised that radio sources,
quasars and Seyfert galaxies required there to be a more or less steady
release of energy over timescales $\gtrsim 10^8$ yr. Most models seem to have
envisaged impulsive energy release through a scaled up solar flare, black
hole formation or binary coalescence (e.g. Hoyle and Fowler, 1963a,b;
Burbidge, 1964). This contradiction was particularly acute for the double
radio sources like Cygnus A which showed evidence for ongoing nuclear
activity through the high excitation nuclear emission lines and yet were
modelled by two anti-parallel 'jets' created in a single explosion (e.g. Le
Blanc and Wilson, 1969). It is quite ironic that two of the staunchest
defenders of the steady state cosmology, Hoyle and Burbidge, should have
believed in 'big bangs' in galactic nuclei.
The papers of Zeldovich, Novikov and Salpeter supplied the foundation
for most current thinking about the central powerhouse in AGN. However,
they were not seriously developed until Lynden-Bell (1969) derived and
applied a theory of thin accretion disks in orbit around massive
Schwarzschild black holes. He argued that magnetic stresses within the disk
would produce an effective viscosity which would cause angular momentum
to be driven outwards at the same time as mass flowed radially inwards,
releasing its gravitational binding energy in the process. Lynden-Bell
initially followed Salpeter in assuming that the hole was non-rotating.
However, as Bardeen (1970) was quick to point out, a black hole in an AGN
would probably be rapidly rotating, having been spun up by the gas from the
surrounding accretion disk. In this case the Kerr metric is appropriate and
stable circular orbits have a binding energy per unit mass that can be as large
as $0.42c^2$ when $\Omega_H$ approaches its maximal value of $1/2m$. (The efficiency of
energy release is still 0.2 for a hole rotating with angular velocity equal to
0.75 of its maximal value.) Spinning black holes can therefore provide
extremely efficient machines for converting the rest mass of accreted gas into
radiative energy. Accretion disk models had a further advantage. As the
magnetic field lines would be mostly frozen in to the differentially rotating
disk and would therefore have to keep undergoing magnetic reconnection,
strong, inductive electric fields would be created in the (relatively small)
reconnection volumes and these would be able to accelerate relativistic
electrons. However, in this and subsequent papers (Lynden-Bell, 1971;
Lynden-Bell and Rees, 1971), accretion disks were still associated more with
dead quasars, i.e. local, low power AGN. The active, high power objects
were identified with massive, uncollapsed disks.
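The efficiencies quoted for Schwarzschild and Kerr holes can be reproduced from the standard closed-form expression for the radius of the innermost stable prograde circular orbit in the Kerr metric, together with the relation $E = (1 - 2m/3r)^{1/2}$ for the specific energy on that orbit. Neither formula is given in the text; both are brought in here as assumptions for this sketch, as are the function names. The spin parameter $a$ is in units of $m$, and $\Omega_H = a/2mr_+$ is used to relate it to the angular velocity of the hole.

```python
import math


def isco_radius(a):
    """Radius (in units of m) of the innermost stable prograde circular orbit.

    a is the dimensionless spin S/m^2, with 0 <= a <= 1.
    """
    z1 = 1.0 + (1.0 - a * a) ** (1.0 / 3.0) * (
        (1.0 + a) ** (1.0 / 3.0) + (1.0 - a) ** (1.0 / 3.0))
    z2 = math.sqrt(3.0 * a * a + z1 * z1)
    return 3.0 + z2 - math.sqrt((3.0 - z1) * (3.0 + z1 + 2.0 * z2))


def binding_efficiency(a):
    """Binding energy per unit rest mass on the innermost stable orbit."""
    r = isco_radius(a)
    return 1.0 - math.sqrt(1.0 - 2.0 / (3.0 * r))


def spin_from_angular_velocity(fraction_of_max):
    """Spin a for Omega_H equal to the given fraction of its maximum, 1/2m."""
    x = fraction_of_max
    return 2.0 * x / (1.0 + x * x)


eff_schwarzschild = binding_efficiency(0.0)
eff_extreme_kerr = binding_efficiency(1.0)
eff_three_quarters = binding_efficiency(spin_from_angular_velocity(0.75))
print(eff_schwarzschild, eff_extreme_kerr, eff_three_quarters)
```

The three values come out close to 0.06, 0.42 and 0.2, matching the figures in the text; an angular velocity of 0.75 of the maximum corresponds to $a = 0.96$.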
The idea that quasars in the prime of life might be powered by accretion
disks around black holes gradually gained ground, matching contemporary
developments in the study of binary X-ray sources (e.g. Rees, 1977). At the
same time, the principal competition , models involving dense clusters of
stars, seemed to lose ground as the observational position developed.
8.3.1.3 Star clusters
A competitive class of quasar models to those involving single coherent
objects developed in parallel to models involving massive black holes. This
involved dense star clusters (e.g. Gold, Axford and Ray, 1964; Spitzer and
Saslaw, 1966; Colgate, 1967; Sanders, 1970). In initial versions of this model
it was proposed that quasars were the consequence of an exaggerated rate of
supernovae within the cluster. $10^8\,M_\odot$ of high mass stars in a volume
$\sim 1$ pc$^3$ can qualitatively reproduce many of the observed features of a
quasar, the ionising radiation coming mostly from the hot stars, the
outbursts being attributed to the supernovae themselves and the non-thermal
emission deriving from the overlapping supernova remnants. In a
star cluster this dense, the relative velocity of the stars becomes comparable
with their surface escape speeds. This implies that physical collisions become
competitive with gravitational collisions. Unfortunately, the consequences
of these collisions have never properly been understood. If the stars coalesce,
then rapidly evolving high mass stars can be assembled. It was even
far larger than the size of the compact radio sources which are typically
$\sim 1$ pc in size. This implies that the collimation of the jets occurs on scales
$\lesssim 1$ pc. There was no straightforward way to accomplish this in the context
of a massive star cluster, especially when it was discovered that the source
axes on the VLBI scale were roughly parallel to the larger scale structure.
Attempts to concentrate a star cluster into a volume $\lesssim 1$ pc$^3$ only hasten its
evolution into a massive black hole (Begelman and Rees, 1978 ; Shapiro and
Teukolsky, 1985). A stable compact gyroscope, i.e. a single spinning object,
rather than an assemblage of small, independent radiators was suggested.
there are $N_Q \sim 4 \times 10^6$ quasars on the sky. An average magnitude for these
quasars of $20.5^{\rm m}$ corresponds to a flux $F_Q \sim 2 \times 10^{-13}$ erg cm$^{-2}$ s$^{-1}$ if each
quasar radiates like an A0 star. However, quasars must radiate significantly
more UV radiation than an A0 star in order to account for the observed
emission lines, and so we must multiply this flux by a bolometric correction
which we estimate as $f_{UV} \sim 3$. The mean redshift of these quasars is $\langle z \rangle \sim 2$
and so each quasar photon will have its energy reduced by a factor
$\sim (1 + \langle z \rangle)^{-1}$ as the universe expands. The mean energy density of quasar
light observed at earth is then $U_Q \sim N_Q F_Q f_{UV} (1 + \langle z \rangle)/c \sim 1.5 \times
10^{-16}$ erg cm$^{-3}$ $\sim 3000\,M_\odot$ Mpc$^{-3}$. (This is a conservative estimate. Any
increase in the number of quasars or their bolometric corrections would
only increase the required efficiency.) Now, the local density of bright
galaxies is roughly $\sim 10^{-2} h^{3}$ Mpc$^{-3}$, where $h$ is the Hubble constant in
units of 100 km s$^{-1}$ Mpc$^{-1}$. Therefore, at least $3 \times 10^5\,M_\odot$ of energy
equivalent must be released per bright galaxy. Now if we introduce an
efficiency of conversion of rest mass into radiant energy $\varepsilon \sim 0.1$, the remnant
nuclear masses per bright galaxy must be $\sim 3 \times 10^6 (\varepsilon/0.1)^{-1} h^{-3}\,M_\odot$, if we
assume that quasars are powered gravitationally.
As we discuss below, it appears that most bright galaxies have central
masses $< 3 \times 10^6\,M_\odot$. We can also probably already exclude the possibility
that 10 per cent of nuclear remnants are heavier than $3 \times 10^9\,M_\odot$. Hence,
the required efficiency of energy production must exceed $\sim 10^{-3}$ ($h = 1$), or
$\sim 10^{-2}$ ($h = 0.5$). This virtually rules out star cluster models, as well as black
hole models that postulate super-critical accretion with low radiative
efficiency (e.g. Abramowitz, Jaroszynski and Sikora, 1978).
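The chain of estimates above can be checked numerically. The following Python sketch is illustrative only: the function names, the rounded cgs constants and the mean remnant mass of $3 \times 10^8\,M_\odot$ (our reading of the statement that at most 10 per cent of remnants exceed $3 \times 10^9\,M_\odot$) are assumptions introduced here, as is the scaling of the galaxy number density with $h^3$. With the rounded inputs quoted in the text the energy density comes out within a factor of two of the quoted value, which is as much agreement as an order of magnitude estimate of this kind allows.

```python
# Constants in cgs units (rounded).
C = 2.998e10          # speed of light, cm/s
M_SUN = 1.989e33      # solar mass, g
MPC = 3.086e24        # megaparsec, cm


def quasar_energy_density(n_q=4e6, flux=2e-13, f_uv=3.0, mean_z=2.0):
    """U_Q ~ N_Q F_Q f_UV (1 + <z>) / c, in erg cm^-3."""
    return n_q * flux * f_uv * (1.0 + mean_z) / C


def mass_equivalent_density(u_q):
    """Convert an energy density (erg cm^-3) to M_sun Mpc^-3 via E = m c^2."""
    return u_q * MPC**3 / (M_SUN * C**2)


def energy_per_galaxy(rho, h=1.0, n_gal=1e-2):
    """Mass-equivalent energy (M_sun) per bright galaxy.

    Assumes a galaxy number density of n_gal * h^3 per Mpc^3.
    """
    return rho / (n_gal * h**3)


def remnant_mass(e_gal, efficiency=0.1):
    """Remnant mass (M_sun) needed to radiate e_gal at the given efficiency."""
    return e_gal / efficiency


def minimum_efficiency(e_gal, mean_remnant=3e8):
    """Efficiency required if the mean remnant mass is mean_remnant (assumed)."""
    return e_gal / mean_remnant


if __name__ == "__main__":
    print(f"U_Q from rounded inputs: {quasar_energy_density():.2e} erg cm^-3")
    rho = mass_equivalent_density(1.5e-16)  # energy density quoted in the text
    print(f"mass equivalent: {rho:.0f} M_sun Mpc^-3")
    for h in (1.0, 0.5):
        e_gal = energy_per_galaxy(rho, h)
        print(f"h = {h}: energy per galaxy {e_gal:.1e} M_sun, "
              f"remnant mass {remnant_mass(e_gal):.1e} M_sun, "
              f"minimum efficiency {minimum_efficiency(e_gal):.1e}")
```

The minimum efficiency rises by a factor of eight between $h = 1$ and $h = 0.5$ simply because the inferred number of galaxies sharing the same radiated energy falls as $h^3$.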
As we discuss below, the strongest evidence for black holes in X-ray
binaries is dynamical. Unfortunately, unlike in the stellar mass case, the
alternatives to black holes (e.g. spinars, magnetoids etc.) are far less well
specified than neutron stars, and therefore cannot be so easily ruled out on
the basis of mass determinations. There have been several attempted
dynamical determinations of the central masses of AGN.
The observational study of the nucleus of M87 by Young et al. (1978) and
Sargent et al. (1978) showed that there was a central cusp of light over and
above the central core of stars and that the measured velocity dispersion of
the stars rises within the central $\sim 100$ pc (cf. also de Vaucouleurs and
Nieto, 1979). They interpreted these observations in terms of a model
hole might modulate the light output with a regular period $\gtrsim 1$ h. The effects
of Lense-Thirring precession might even be detectable. Searches for low
power periodic signals in the continuum emission of selected objects might
turn up an example of this.
Alternatively, when two galaxies (each possessing black holes in their
nuclei) merge, the hole associated with the smaller galaxy will be dragged
into the nucleus of the larger galaxy in a few orbital periods. Thereafter, the
dynamical evolution time will increase to $\sim 10^9$ yr as the interaction with
individual stars takes over (Begelman, Blandford and Rees, 1980; Roos,
1981). The holes will then spiral together until they become so close that
gravitational radiation by the black hole binary takes over. Now, if we
assume that the heavier black hole has a mass $m_1 = 10^6 m_{16}\,M_\odot$, and the
lighter one a mass $m_2 = 10^6 m_{26}\,M_\odot$, then the lifetime to coalescence will
be related to the orbital period $P$ yr by $T \sim 4 \times 10^7 P^{8/3} m_{16}^{-2/3} m_{26}^{-1}$ yr. Searches
for regular periodic behaviour in quasars (e.g. Ozernoi and Chertoprid,
1969) have not yet been fruitful and we would not expect them to contain
binaries with measurable periods. However, it is just possible that a few
lower power Seyferts may harbour binary black holes with detectable
periods $\lesssim 1$ yr. A black hole binary would probably perturb the central
stellar distribution so strongly that the black hole would be supplied with
gas and the nucleus would be active at this time. For black hole masses
$m_{16} \sim m_{26} \lesssim 1$, and an assumed merger rate of one per galaxy, roughly 1 in
300 galaxies could contain binaries and show periodic behaviour with
$P \lesssim 1$ yr. (As we discuss below, Lacey and Ostriker (1975) propose that
binaries of this type occur very frequently in galactic nuclei.) These binaries
would probably not be seen as spectroscopic binaries because the sizes of the
broad emission line regions are inferred to be significantly larger than the
radii of these hypothetical binary orbits. However, they may be detectable as
eclipsing binaries if both black holes are accompanied by extensive accretion
disks. A regular monitoring program of nearby Seyfert and LINER nuclei
might possibly uncover an example.
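The lifetime to coalescence and the resulting fraction of galaxies hosting a short-period binary can be estimated with a few lines of Python. This is a minimal sketch under stated assumptions: it uses the standard quadrupole-formula lifetime of a circular binary, $T = 5 c^5 a^4 / [256\, G^3 m_1 m_2 (m_1 + m_2)]$, together with Kepler's third law, and it takes a galaxy age of $10^{10}$ yr; neither the function names nor the adopted age come from the text.

```python
import math

# Constants in cgs units (rounded).
G = 6.674e-8       # gravitational constant
C = 2.998e10       # speed of light, cm/s
M_SUN = 1.989e33   # solar mass, g
YEAR = 3.156e7     # year, s


def coalescence_time_yr(m1_msun, m2_msun, period_yr):
    """Gravitational-radiation lifetime (yr) of a circular binary.

    Masses are in solar masses and the orbital period in years.
    """
    m1 = m1_msun * M_SUN
    m2 = m2_msun * M_SUN
    m_tot = m1 + m2
    p = period_yr * YEAR
    # Kepler's third law gives the orbital separation.
    a = (G * m_tot * p**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)
    t = 5.0 * C**5 * a**4 / (256.0 * G**3 * m1 * m2 * m_tot)
    return t / YEAR


def fraction_with_short_period(m1_msun, m2_msun, period_yr, age_yr=1e10):
    """Fraction of galaxies caught with P below period_yr, for one merger
    per galaxy over an assumed age age_yr."""
    return coalescence_time_yr(m1_msun, m2_msun, period_yr) / age_yr


if __name__ == "__main__":
    t = coalescence_time_yr(1e6, 1e6, 1.0)
    print(f"T(P = 1 yr, m1 = m2 = 1e6 M_sun) = {t:.2e} yr")
    print(f"fraction of galaxies: 1 in {1.0 / fraction_with_short_period(1e6, 1e6, 1.0):.0f}")
```

For two $10^6\,M_\odot$ holes with a 1 yr period this gives a lifetime of a few times $10^7$ yr, so that a few galaxies per thousand would be caught in this phase, in line with the estimate above.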
the emission from an accretion disk by the orbit of a single star as discussed
above and would instead probably represent the presence of a massive
uncollapsed object. Another possibility is the detection of X-ray or $\gamma$-ray
lines with a common redshift relative to the host galaxy of $z \sim 0.1$-$0.2$. This
would certainly revive interest in neutron star cluster models of AGN.
Alternatively, Space Telescope (operating with its advertised resolving
power) may fail to detect any evidence for dormant central masses in the
nuclei of local galaxies. As the arguments above imply, this would call into
question far more than the black hole model (especially if $h \sim 0.5$).
However, perhaps the most telling argument against black hole models
would come if it could be demonstrated that the compact optical and X-ray
sources are generally displaced from the dynamical centres of their host
galaxies as well as each other. An argument similar to this has already been
used to limit the masses of black holes perhaps associated with X-ray
sources in globular clusters. More relevantly, it has been applied to our
Galactic centre and it is illustrative to discuss the current controversy as to
whether or not our Galactic centre contains a massive black hole.
the fraction of the time when it is emitting at a rate comparable with the
Eddington limit will be correspondingly small.
(Cowley et al., 1983). This source has two favourable features. Firstly, the
mass function is computed to be $2.3 \pm 0.3\,M_\odot$ and secondly the distance to
the Magellanic clouds is known fairly accurately to be $d = 55$ kpc. Repeating
the arguments listed above for Cyg X-1, we find that the B3V spectral type
optical companion should have a mass $\sim 7\,M_\odot$ if there has been no unusual
evolution. Independent of the evolutionary history, the avoidance of X-ray
eclipses requires that $m_x \gtrsim 10 \pm 4\,M_\odot$ (Paczynski, 1984).
Unfortunately, there is a possible complication in this system because the
accretion disk around the X-ray source may contribute a significant portion
of the optical emission from this binary allowing both the compact object
and its companion to be of lower mass. In particular, the mass of the X-ray
source may possibly be as low as $\sim 2.5\,M_\odot$ (Mazeh et al., 1986),
uncomfortably close to the Oppenheimer-Volkoff limit. In this case the
companion would have to be a $\lesssim 1\,M_\odot$ helium star, a possibility that should
eventually be ruled out using high dispersion spectroscopy.
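To see how a measured mass function constrains the compact object, one can invert the standard definition $f = (m_x \sin i)^3/(m_x + m_0)^2$ for $m_x$. The Python sketch below does this by bisection; the function names and the choice of an edge-on orbit ($\sin i = 1$) as the limiting case are assumptions made for illustration, and $m_0$ denotes the mass of the optical companion.

```python
def mass_function(m_x, m_0, sin_i=1.0):
    """f(m_0, m_x, i) = (m_x sin i)^3 / (m_x + m_0)^2, in solar masses."""
    return (m_x * sin_i) ** 3 / (m_x + m_0) ** 2


def compact_mass(f, m_0, sin_i=1.0, hi=1.0e3, tol=1.0e-8):
    """Solve mass_function(m_x, m_0, sin_i) = f for m_x by bisection.

    The mass function increases monotonically with m_x, so the root
    between 0 and hi is unique.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mass_function(mid, m_0, sin_i) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Values quoted for LMC X-3: f = 2.3 M_sun, companion of about 7 M_sun.
    print(f"edge-on minimum m_x = {compact_mass(2.3, 7.0):.2f} M_sun")
    # A less inclined orbit (here sin i = 0.8, purely illustrative) needs more mass.
    print(f"m_x for sin i = 0.8  = {compact_mass(2.3, 7.0, 0.8):.2f} M_sun")
```

Because $\sin i \le 1$, the edge-on solution is a strict lower bound on $m_x$ for a given companion mass; the absence of eclipses excludes the edge-on case and therefore pushes the bound higher.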
8.3.2.4 A0620-00
Most recently, McClintock and Remillard (1986) have shown that the
optical counterpart of the X-ray nova A0620-00 is a spectroscopic binary. (An
X-ray nova is a transient X-ray source that becomes very bright for several
months and then fades away.) Fortunately, A0620-00 also brightened at
optical wavelengths and so there is no doubt about the identity of the
companion. McClintock and Remillard found that $P_b = 7.75$ h and $K = 457$ km s$^{-1}$,
giving a mass function $f(m_0, m_x, i) = 3.2 \pm 0.2\,M_\odot$. Again, there
are no X-ray eclipses and from the spectral type of the optical companion
(K5V) and modelling of the light curve, it is deduced that $m_x > 3.2\,M_\odot$
(McClintock, 1986). This system is unusual in one other respect. During its
1975 outburst, it became the brightest X-ray source in the sky. However,
during quiescence it is unobservably faint in X-rays and so appears to
radiate far less in X-rays than would be expected on the basis of the optical
emission from the accretion disk. Somehow or other the X-ray energy that
should be released by gas as it spirals in towards the black hole is being
absorbed or scattered out of the line of sight. (Alternatively, it is possible
that the gas accumulates in a reservoir in the outer parts of the disk.) Until
the reason for this behaviour is understood, there must be some small
residual doubt about the inferred geometry of the binary.
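The mass function quoted for A0620-00 follows directly from the orbital period and velocity semi-amplitude of the companion through the standard relation $f = P_b K^3 / (2\pi G)$ for a circular orbit. The short Python sketch below evaluates it; the function name and the rounded constants are ours.

```python
import math

G = 6.674e-8       # gravitational constant, cgs
M_SUN = 1.989e33   # solar mass, g


def mass_function_from_orbit(period_hours, k_km_s):
    """Mass function (M_sun) from the orbital period and the velocity
    semi-amplitude K of the optical companion, assuming a circular orbit."""
    period_s = period_hours * 3600.0
    k_cm_s = k_km_s * 1.0e5
    return period_s * k_cm_s**3 / (2.0 * math.pi * G) / M_SUN


if __name__ == "__main__":
    f = mass_function_from_orbit(7.75, 457.0)
    print(f"f = {f:.2f} M_sun")
```

Since $f = m_x \sin^3 i / (1 + m_0/m_x)^2 \le m_x$, the value obtained is itself a lower limit on the mass of the compact object, whatever the mass of the companion.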
8.3.2.5 LMC X-1
A fourth source, LMC X-1, has been put forward as a black hole candidate
on dynamical grounds by Hutchings et al. (1983). The optical identification
8.3.2.6 SS433
An extensively studied galactic object which has also been argued to contain
a black hole is SS433. SS433 originally appeared in a list of emission line
stars. After some false trails (chronicled in Margon, 1980), Clark and
Murdin (1978) realised that it was coincident with an X-ray source and a
compact radio source at the centre of a large supernova remnant W50 which
they naturally presumed were related. They also noted that, in addition to
the unusually strong and broad hydrogen and helium emission lines, several
weaker unidentifiable lines were also present in the spectrum. In a more
careful study of the spectrum, Margon et al. (1979) identified the weaker
lines with redshifted and blueshifted hydrogen and helium transitions.
Furthermore, the Doppler shifts changed from night to night in a systematic
manner.
Among a host of ephemeral theoretical suggestions was one by Fabian
and Rees (1979), who suggested that the emission is produced by two
oppositely directed jets, analogous to those found in the double radio
sources, but in this case containing dense clouds of emitting gas moving with
relativistic speed. Milgrom (1979) further speculated that the modulation of
the Doppler shift may be due to precession of these jets about an axis fixed in
space. Further observations by Abell and Margon (1979) verified this
kinematic model. Using several periods of data, we now know that the jet
speed is stable at $v = 0.26c$, the vertex angle of the cone on which the jet
precesses is $\theta = 20^\circ$, the inclination of the cone axis to the line of sight is
$j = 80^\circ$ and the precession period is $P_p = 163$ d (Margon, 1984).
SS433 also exhibits a 13.1 d, $0.7^m$ photometric double-peaked variation
(as well as a slower 163 d variation). This is attributed to mutual eclipsing of
an accretion disk surrounding a compact object and its binary companion
orbiting with a 13.1 d period (Leibowicz, 1984). This same period is also
detectable in small radial velocity variation of the brighter, 'stationary'
emission lines. However, the hydrogen lines vary with a smaller velocity
amplitude and different orbital phase to the higher excitation He II lines. It
is important to determine which of these two radial velocity variations
should be used to determine the mass function, because the former
possibility gives $f(m_0, m_x, i) \sim 0.5\,M_\odot$ and a mass $m_x \sim 1\,M_\odot$ for the
compact object (which could presumably be a neutron star) and the latter
suggests $f(m_0, m_x, i) \sim 11\,M_\odot$ implying a black hole companion. Now,
Crampton and Hutchings (1981) have argued that the higher excitation
helium lines should be produced closer to the photo-emission source,
presumably the compact object, and that the hydrogen lines are associated
with gas streams hitting the outer edge of the accretion disk in a hot spot.
Furthermore, the phase of maximum He II radial velocity is roughly 90°
different from the primary minimum in the photometric variation which is
entirely consistent with the gas stream interpretation.
Further confirmatory evidence that the He II lines measure the velocity of
the compact object and consequently that SS433 is a high mass system is
provided by the discovery of orbital modulation of the 'moving' emission
lines. The wavelength $\lambda_0$ of a spectral line emitted by gas in either jet is
related to the emitted wavelength $\lambda_e$ by
$$\lambda_0 = \frac{\lambda_e}{(c^2 - v^2)^{1/2}} \left[ c \pm v \left( \sin j \sin\theta \cos\psi + \cos j \cos\theta \right) \right], \qquad (8.15)$$
where the $\pm$ signs refer to the two jets and $\psi$ is the precessional phase which
can be written as $2\pi t/P_p + \psi_0$ in the absence of orbital modulation. Now, on
quite general grounds, we expect the star in orbit about the compact object
to nutate and to perturb the disk (together with the jets which are generally
assumed to be launched perpendicular to the disk) at half the synodic
period, $P_s = 1/[2(P_b^{-1} \pm P_p^{-1})]$. If, as we shall find, the precession is retrograde
with respect to the orbital motion, we must choose the plus sign and
$P_s = 6.06$ d. (Retrograde precession will result if the compact object exerts a
torque on the star and the disk is slave to the star, or the star forces the disk
to precess directly. Lense-Thirring precession would be prograde and is
l l(m0 + mx) 2 M 0 . The final stage in the argument is to put a lower bound on
the mass of the companion star $m_0 \gtrsim 20\,M_\odot$ from the requirement that it fill
its Roche lobe and be able to radiate the observed luminosity of $\sim 8 \times 10^{38}$ erg s$^{-1}$
(quoted by Band and Grindlay, 1985). The mass of the compact
object is then $m_x \gtrsim 7\,M_\odot$, indicating a black hole and consistent with it
radiating $\sim 10^{39}$ erg s$^{-1}$ at the Eddington limit.
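The consistency between the inferred mass and the luminosity can be checked against the standard Eddington luminosity for ionized hydrogen, $L_E = 4\pi G M m_p c/\sigma_T$. The following Python sketch is a simple illustration; the function names and the rounded constants are ours.

```python
import math

# Constants in cgs units (rounded).
G = 6.674e-8          # gravitational constant
C = 2.998e10          # speed of light, cm/s
M_SUN = 1.989e33      # solar mass, g
M_P = 1.6726e-24      # proton mass, g
SIGMA_T = 6.652e-25   # Thomson cross-section, cm^2


def eddington_luminosity(mass_msun):
    """Eddington luminosity (erg/s) for pure ionized hydrogen."""
    return 4.0 * math.pi * G * mass_msun * M_SUN * M_P * C / SIGMA_T


def minimum_mass_for_luminosity(lum_erg_s):
    """Smallest mass (M_sun) that can radiate lum_erg_s without exceeding
    its Eddington limit."""
    return lum_erg_s / eddington_luminosity(1.0)


if __name__ == "__main__":
    print(f"L_E(7 M_sun) = {eddington_luminosity(7.0):.2e} erg/s")
    print(f"mass needed for 8e38 erg/s = {minimum_mass_for_luminosity(8e38):.1f} M_sun")
```

A $7\,M_\odot$ object has an Eddington limit a little below $10^{39}$ erg s$^{-1}$, so the dynamical mass and the luminosity argument point in the same direction.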
However, as was the case with Cyg X-1, the compact object could itself be
a close binary with orbital period $P_{b2} \sim 1.5$ d and precessional period
$P_p \simeq 4 P_b^2 / (3 P_{b2} \cos\beta)$. A recent study of this possibility by Fabian et al. (1986)
shows that it is possible to devise evolutionary pathways to this
configuration. This model allows the compact object to be a neutron star.
However, it seems hard for this system to avoid producing a further
modulation of the optical emission with a period $\sim P_{b2}$. So far, this period
has not been detected.
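The numbers in this alternative model hang together, as a quick evaluation of the precession formula shows. In the Python sketch below the angle $\beta$ is set to $20^\circ$, i.e. equal to the opening angle of the precession cone; that identification is an assumption made here for illustration, as are the function names.

```python
import math


def precession_period(p_outer_d, p_inner_d, beta_deg):
    """P_p ~ 4 P_b^2 / (3 P_b2 cos(beta)), with all periods in days."""
    return 4.0 * p_outer_d**2 / (3.0 * p_inner_d * math.cos(math.radians(beta_deg)))


def inner_period(p_outer_d, p_prec_d, beta_deg):
    """Inner orbital period P_b2 (days) needed to give precession period P_p."""
    return 4.0 * p_outer_d**2 / (3.0 * p_prec_d * math.cos(math.radians(beta_deg)))


if __name__ == "__main__":
    print(f"P_p  = {precession_period(13.1, 1.5, 20.0):.1f} d")
    print(f"P_b2 = {inner_period(13.1, 163.0, 20.0):.2f} d")
```

With $P_b = 13.1$ d and $P_{b2} = 1.5$ d the formula returns a precession period close to the observed 163 d, which is why an inner period of about 1.5 d is singled out.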
In conclusion, the weight of evidence favours SS433 containing a black
hole. However, the arguments for this are less direct than in Cyg X-1. The
importance of SS433 is that it shares some properties with active galactic
nuclei (relativistic precessing jets, winds, broad emission lines etc.) and it
may be a miniature version of a quasar.
Fig. 8.7. Kinematic model of SS433. Two antiparallel jets emanating from a
compact object in a binary system precess on a cone with opening angle
$\theta \sim 20^\circ$ and period $P_p \sim 163$ d. Tidal torques produced by the orbiting
companion impose a small nutation on the jets with a 6.06 d period.
(Adapted from Collins, 1985.)
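The kinematic model is simple enough to evaluate directly. The Python sketch below computes the ratio of observed to emitted wavelength for either jet from the relativistic Doppler formula of equation (8.15), and the nutation period from half the synodic period. The parameter values are those quoted in the text ($v = 0.26c$, $\theta = 20^\circ$, $j = 80^\circ$, $P_b = 13.1$ d, $P_p = 163$ d); the function names and the convention that `sign = +1` labels the jet with the plus sign are ours.

```python
import math


def wavelength_ratio(psi, sign, beta=0.26, theta_deg=20.0, j_deg=80.0):
    """lambda_0 / lambda_e for one jet at precessional phase psi (radians).

    sign is +1 or -1 and selects the jet; beta = v/c.
    """
    theta = math.radians(theta_deg)
    j = math.radians(j_deg)
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # Cosine of the angle between the jet axis and the line of sight.
    projection = (math.sin(j) * math.sin(theta) * math.cos(psi)
                  + math.cos(j) * math.cos(theta))
    return gamma * (1.0 + sign * beta * projection)


def nutation_period(p_b=13.1, p_p=163.0, retrograde=True):
    """Half the synodic period, P_s = 1 / [2 (1/P_b +/- 1/P_p)], in days."""
    s = 1.0 if retrograde else -1.0
    return 1.0 / (2.0 * (1.0 / p_b + s / p_p))


if __name__ == "__main__":
    for phase in (0.0, 0.25, 0.5):
        psi = 2.0 * math.pi * phase
        z_plus = wavelength_ratio(psi, +1) - 1.0
        z_minus = wavelength_ratio(psi, -1) - 1.0
        print(f"phase {phase:4.2f}: z = {z_plus:+.3f}, {z_minus:+.3f}")
    print(f"retrograde P_s = {nutation_period():.2f} d")
    print(f"prograde   P_s = {nutation_period(retrograde=False):.2f} d")
```

Even when the two jets lie in the plane of the sky their lines share a common redshift $\gamma - 1 \approx 0.036$, the transverse Doppler shift, which is one of the more direct signatures of the relativistic jet speed.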
Devinney, 1971) and BM Ori (Wilson, 1972) can all be modelled using
normal companion stars and so there is no obligation to postulate the
presence of a black hole. However, the absence of X-rays only forbids an
accreting compact object, and even here the example of A0620-00 warns us
that it is possible for X-rays from a disk to be beamed away from us or
attenuated. It is still worth investigating other possibilities. Black hole
binaries that are currently X-ray active may constitute only a small fraction
of the total.
Rapid ($\sim 1$ ms), aperiodic X-ray variability (e.g. Sunyaev, 1973) was
proposed as a signature of a black hole and for a long while Cir X-1 was held
to be a black hole on these grounds. However, Cir X-1 undergoes X-ray
bursting on timescales $\sim 10$ s and this is strongly believed to be a signature
of a neutron star (Tennant, Fabian and Shafer, 1986). Conversely, the
known neutron star binary, V0332+52 (as well as GX339-4 and Cir X-1)
exhibits rapid flickering, just like Cyg X-1 (Stella et al., 1985). It is rather
surprising that rapid variability has been regarded for so long as a peculiar
signature of a black hole as orbits near the surface of a $\sim 1.4\,M_\odot$, 10 km
neutron star have periods $\sim 1$ ms, similar to those for orbits around a more
massive black hole.
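The comparison of orbital timescales is easily made quantitative. The Python sketch below evaluates the Keplerian period $P = 2\pi (r^3/GM)^{1/2}$ at the surface of a neutron star and at the innermost stable circular orbit of a non-rotating black hole, $r = 6GM/c^2$. The choice of a $10\,M_\odot$ hole and of the Schwarzschild innermost stable orbit are assumptions made for illustration; they are not specified in the text.

```python
import math

# Constants in cgs units (rounded).
G = 6.674e-8       # gravitational constant
C = 2.998e10       # speed of light, cm/s
M_SUN = 1.989e33   # solar mass, g


def kepler_period(mass_msun, radius_cm):
    """Orbital period (s) of a circular orbit of the given radius."""
    return 2.0 * math.pi * math.sqrt(radius_cm**3 / (G * mass_msun * M_SUN))


def isco_period(mass_msun):
    """Period (s), as measured at infinity, of the innermost stable circular
    orbit of a non-rotating black hole, r = 6 G M / c^2."""
    r_isco = 6.0 * G * mass_msun * M_SUN / C**2
    return kepler_period(mass_msun, r_isco)


if __name__ == "__main__":
    print(f"neutron star surface (1.4 M_sun, 10 km): {1e3 * kepler_period(1.4, 1.0e6):.2f} ms")
    print(f"black hole ISCO (10 M_sun):              {1e3 * isco_period(10.0):.2f} ms")
```

Both periods come out in the millisecond range, differing by less than an order of magnitude, which illustrates why rapid variability alone cannot distinguish the two kinds of object.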
Spectral criteria have also been proposed as an indication of the presence
of a black hole. In particular, White and Marshall (1984) pointed out that
the black hole candidates had unusually soft (i.e. steep) X-ray spectra.
However, these sources are not always in this state and many of them
(notably Cyg X-1) change frequently to emit a hard (i.e. flat) X-ray spectrum.
Unfortunately Cir X-1 behaves in a similar manner and so the usefulness of
this criterion is suspect at the moment.
As an illustration of the importance of obtaining a good X-ray position,
consider the case of the X-ray source OAO 1653-40 identified by Polidan et
al. (1978) with the supergiant binary V861 Sco (Wolff and Beichmann,
1979), on the basis of two X-ray eclipses separated by four times the known
binary period. If the identification had been correct, then the mass function
of $0.5\,M_\odot$ and spectral type of the star would have implied a minimum mass
for the secondary star of $12.5\,M_\odot$. However, Parmar et al. (1980) found that
the X-ray emission came from a nearby pulsating X-ray source and that
V861 Sco was undetectable.
A more direct indication of a black hole would appear to be the discovery
of an X-ray source with an X-ray luminosity exceeding the Eddington limit
for a star with mass equal to the Oppenheimer-Volkoff limit. SMC X-1
(which is at the known distance of the Magellanic clouds) might have been
Volkoff limit.) Unfortunately, even if black holes are this common, the
nearby ones will only accrete very slowly in the interstellar medium, and will
be unobservably faint (e.g. Shapiro and Teukolsky, 1983).
From the dynamics of stars in the solar neighbourhood, we can estimate
the density of matter in the galactic disk. This is known as the Oort limit and
the stars that we can see account for slightly more than half of it. Black holes
formed as stellar remnants in the first generation of Galactic stars may
account for the difference ($\sim 0.1\,M_\odot$ pc$^{-3}$, Bahcall, 1984). There is also a
'missing mass' problem associated with the halo of our Galaxy and again,
there must be as much invisible mass as visible mass at the solar radius.
These may be primordial ($\sim 10^5\,M_\odot$) black holes. However there are some
constraints. Lacey and Ostriker (1985) have proposed that the halo mass
comprise $\sim 10^6\,M_\odot$ black holes perhaps formed primordially from
isothermal fluctuations in a hot, baryon-density universe (e.g. Carr and
Rees, 1984). These holes would steadily heat the galactic disk and thereby
explain the observed relationship between disk thickness and stellar age.
Lacey and Ostriker demonstrate that their interactions with individual stars
can change the local stellar velocity distribution function to a form
somewhat similar to that measured. Dynamical friction would cause masses
this heavy to sink into the galactic centre over the lifetime of a galaxy. In fact
there is a problem in that far too much mass will be accreted in our Galaxy.
These authors therefore propose that when three holes accumulate in a
galactic nucleus a dynamical slingshot (e.g. Begelman, Blandford and Rees,
1980), will lead to the ejection of some or all of the black holes. Presumably,
when the hole settling rate is large enough, a more massive black hole can be
formed. A large cosmic density of $\sim 10^6\,M_\odot$ black holes, which this proposal
would require, might produce some observable gravitational lens images. A
difficulty with this model is that interstellar gas accreting onto these black
holes might make them bright enough to have been detected. For instance,
making somewhat optimistic assumptions about the accretion rate,
McDowell (1985) has deduced, on the basis of their absence from IRAS
source catalogues and proper motion surveys, that halo holes must be less
massive than $\sim 1000\,M_\odot$. Another possible problem has been discussed by
Bahcall, Hut and Tremaine (1984) who have argued that the existence of
very wide binaries, that would otherwise be disrupted by dynamical
interaction with heavy stars, constrains the missing mass in the disk to
comprise objects less massive than $2\,M_\odot$, presumably excluding black
holes. This may also constrain the masses of halo holes.
As reviewed in Blandford and Thorne (1979), globular clusters are
roughly a hundred times more X-ray luminous per unit mass than the rest of
the galaxy and when this was discovered, several authors proposed that
$\sim 1000\,M_\odot$ black holes reside in their nuclei (e.g. Bahcall and Ostriker,
1975). However, when improved X-ray positions became available, it was
clear that the X-ray sources were not located at the cluster centres as would
be expected for a massive object. From the observed distribution of these
sources, it can be concluded that the ratio of the mass of the X-ray-emitting
object to the mass of the field stars (typically $\sim 0.5\,M_\odot$) lies in the range
$\sim 1.5$-$5$ (Grindlay, 1985). They are therefore believed to be accreting
neutron star binaries. Nevertheless, a problem may remain in that more
globular clusters should show evidence for collapse of their cores. Larson
( 1984) has proposed that in fact central cusps are present in many globular
clusters, but these comprise invisible black holes which sink to the bottom of
the potential well.
Discovery of six pairs of quasars that are possibly gravitationally lensed
(e.g. Turner, 1986), (three of which show no evidence for luminous lenses),
has revived interest in cosmological $\sim 10^{12}\,M_\odot$ massive black holes (e.g.
Paczynski, 1986). These have the additional advantage that they create only
two images as generally observed to be the case, rather than an odd number
of images as one would expect with a non-singular transparent lens (e.g.
Narayan, 1986).
Planetary mass primordial black holes have been postulated by Freese,
Price and Schramm (1983). These would have masses between the Hawking
mass ($\sim 10^{15}$ g) and $1\,M_\odot$. They may stimulate galaxy formation.
8.4 Future prospects
Black holes, an inevitable consequence of general relativity, were predicted
long before there were any observational indications that they exist.
Although astronomers have treated them with some scepticism, their place
as endpoints of stellar and galactic nuclear evolution now seems fairly
secure. The arguments for the X-ray sources Cyg X-1, LMC X-3 and A0620-00
(and possibly also LMC X-1 and SS433) being black holes in mass
transfer binaries are at least as strong as (to take one example) the reasons
for believing that helium is made in the big bang, and, in both cases, rely
essentially on Newtonian dynamics, the assumption that locally discovered
physical laws operate elsewhere in the universe, and common sense. In
active galactic nuclei, there is now an impressive body of circumstantial
evidence that most of the power derives from a coherent spinning object of
size not a lot larger than its Schwarzschild radius. The efficiency with which
this object converts mass into radiation must exceed $\sim 10^{-3}$ and perhaps
also $10^{-2}$. A $\sim 10^6$-$10^9\,M_\odot$ black hole fits this description completely. The
reason why we do not reject competitors such as spinars with the conviction
that we reject neutron stars in Cyg X- 1 is that spinars are less well-defined
physical entities, and we find it hard to describe their structure and examine
their stability properties. Perhaps Nature will have the same difficulties.
After all, we do not consider it reasonable that the X-ray-emitting object in
Cyg X-1 be a rapidly spinning disk of gas, supported by turbulence and
magnetic field for, say, $\sim 10^{15}$ rotational periods. So, why should we take
any more seriously the notion of a similar structure a million times larger in
AGN surviving for $\sim 10^9$ periods?
Of course none of this proves that black holes exist. As I have argued, a
rigorous proof is going to be extremely hard to come by. This is the normal
state of affairs in most sciences. It is surely more productive at this stage to
accept the evidence and proceed. To quote the third volume of the Principia
(Newton, 1687), 'In experimental philosophy we are to look upon
propositions inferred by general induction from phenomena as accurately or
very nearly true, notwithstanding any contrary hypothesis that may be
imagined, till such time as other phenomena occur, by which they may either
be made more accurate, or liable to exceptions.'
Let me give some examples of ways of proceeding that are actively being
pursued. Firstly, it is slightly surprising that so few stellar mass black holes
have been found; far fewer than was expected in 1972. By contrast, almost 500
neutron stars have been discovered since 1967. Perhaps this is telling us
something of importance about mass loss in the final stages of stellar
evolution or the dynamics of supernova explosions and that the usual corpse
of a massive star is a neutron star, black holes requiring somewhat unusual
conditions for their formation. Secondly, black holes in binary systems
display some common, though not exclusive, properties such as rapid
variability, spectral changes etc. Perhaps this behaviour, although
improperly understood, can be isolated in AGN and used to infer the masses
of the black holes. Thirdly, SS433 shows features in common with some
extragalactic double radio sources. Have we looked hard enough for a true
extragalactic counterpart to SS433?
Excepting chance discoveries, the prospects for observational advance in
this field lie mostly with space astronomy. X-ray telescopes are crucial to
finding more stellar mass black holes and to probing the most rapid
variability in AGN. Further advances must await the launches of ROSAT
and AXAF. In addition, it will be a great disappointment if Space Telescope
does not revolutionise our view of AGN by probing the central stellar cusps,
finding optical jets and telling us what types of galaxy are associated with the
different forms of activity. In particular, understanding the 'environmental
impact' of an active nucleus on the surrounding galaxy should help us to
determine the duty cycle for this activity. Finally, Gamma Ray Observatory
may discover the high energy photons that should be emitted if electron-positron
pair plasmas are a prominent feature of AGN.
On the theoretical front, the major obstacle to progress has been our
ignorance of the details of the viscosity in accretion disks and of the
dynamical importance of large scale magnetic field. The best prospects for
progress here probably lie with large scale numerical computation. This will
be carried out in the form of experiments on the behaviour of simple systems
which should give further insight into the general properties of relativistic
magnetohydrodynamical flows. There is little prospect of performing
detailed simulations of sources to which observational data can be fitted.
Similar explorations of radiative transfer in high temperature plasmas are
underway and should be interpreted in the same spirit.
Although progress on the astrophysical theory of black holes has been
disappointingly slow, it is some comfort to recall that it took a hundred
years for the work of Euler, Lagrange and Laplace to reap the full benefit of
the scientific revolution brought about by the publication of the Principia
and over 150 years for its most spectacular observational success, the
discovery of Neptune. Perhaps we should be more patient in our attempts to
derive comparable results from the general theory of relativity.
Acknowledgements
I thank the Chief, Division of Radiophysics, CSIRO, Sydney and the Acting
Director Anglo-Australian Observatory for hospitality during the writing of
this review. I also thank David Allen for helpful discussions on the Galactic
centre and Roger Romani for guidance on SS433. I am grateful to Ineke
Stacey for assistance with the preparation of the manuscript and Robyn
Shobbrook for bibliographic help. Support under the National Science
Foundation grant AST84-15355 is gratefully acknowledged.
References
Alcock, C. R. (1986). The Origin and Evolution of Neutron Stars, ed. D. Helfand. Reidel:
Dordrecht. (in press).
Arnett, W. D. and Bowers, R. L. (1977). Astrophys. J. Suppl., 33, 415.
Arons, J., Kulsrud, R. M. and Ostriker, J. P. (1975). Astrophys. J., 198, 687.
Avni, Y. and Bahcall, J. N. (1975). Astrophys. J., 197, 675.
Baade, W. and Zwicky, F. (1934). Phys. Rev., 45, 138.
Baade, W. and Minkowski, R. (1954). Astrophys. J., 206, 14.
Backer, D. and Sramek, R. (1987). In preparation.
Bahcall, J. N., Hut, P. and Tremaine, S. D. (1984). Astrophys. J., 290, 15.
Bahcall, J. N. and Ostriker, J. P. (1975). Nature, 256, 23.
Bahcall, J. N. (1978). Ann. Rev. Astron. Astrophys., 16, 241.
Bahcall, J. N. and Soneira, R. M. (1980). Astrophys. J. Suppl., 44, 73.
Bahcall, J. N. (1984). Astrophys. J., 287, 926.
Balick, B. and Brown, R. L. (1974). Astrophys. J., 194, 265.
Band, D. L. and Grindlay, J. E. (1985). Astrophys. J., 285, 702.
Bardeen, J. M. (1970). Nature, 226, 64.
Bardeen, J. (1973). Black Holes, ed. B. DeWitt and B. S. DeWitt. Gordon and Breach:
New York.
Bardeen, J. and Wagoner, R. V. (1969). Astrophys. J. Lett., 158, L65.
Bardeen, J. M. and Petterson, J. A. (1975). Astrophys. J. Lett., 195, 65.
Barker, B. M. and O'Connell, R. F. (1975). Phys. Rev., D12, 329.
Barnes, J., Goodman, J. and Hut, P. (1986). Astrophys. J., 300, 112.
Barr, R. and Mushotzky, R. F. (1986). Nature, 322, 421.
Begelman, M. C. (1985). Astrophysics of Active Galaxies and Quasi Stellar Objects, ed.
J. Miller. University Science Books: California.
Begelman, M. C., Blandford, R. D. and Rees, M. J. (1980). Nature, 287, 307.
Begelman, M. C. and Rees, M. J. (1978). Mon. Not. R. Astron. Soc., 188, 847.
Begelman, M. C., Blandford, R. D. and Rees, M. J. (1984). Rev. Mod. Phys., 56, 255.
Bethe, H. A. and Brown, G. (1985). Sci. Amer., 252, 5, 40.
Bicknell, G. V. and Gingold, R. A. (1986). Preprint.
Bisnovatyi-Kogan, G. S. and Sunyaev, R. A. (1972). Sov. Astron. A.J., 16, 206.
Blandford, R. D. and Thorne, K. S. (1979). General Relativity, ed. S. W. Hawking and
W. Israel. Cambridge University Press: Cambridge.
Blandford, R. D. and Rees, M. J. (1974). Mon. Not. R. Astron. Soc., 169, 395.
Blandford, R. D. and Znajek, R. L. (1977). Mon. Not. R. Astron. Soc., 179, 433.
Bolton, C. T. (1972). Nature, 235, 271.
Boyle, B. J., Fong, R., Shanks, T. and Peterson, B. A. (1987). Mon. Not. R. Astron. Soc.,
In press.
Braginsky, V. B., Caves, C. M. and Thorne, K. S. (1977). Phys. Rev., D15, 2047.
Bregman, J., Butter, D., Kemper, B., Koski, A., Kraft, R. P. and Stone, R. P. S. (1973).
Astrophys. J. Lett., 186, L117.
Brown, R. L. and Liszt, H. S. (1984). Ann. Rev. Astron. Astrophys., 22, 223.
Burbidge, G. R. (1956). Astrophys. J., 129, 849.
Burbidge, G. R. (1959). Paris Symposium on Radio Astronomy, ed. R. N. Bracewell.
Stanford: California.
Burbidge, G. R. (1964). Proceedings Solvay Conference.
Burbidge, E. M., Burbidge, G. R. and Prendergast, K. H. (1959). Astrophys. J., 130, 26.
Burbidge, E. M., Burbidge, G. R. and Sandage, A. (1964). Rev. Mod. Phys., 35, 97.
Cameron, A. G. W. (1971). Nature, 229, 178.
Carr, B. J. and Rees, M. J. (1984). Mon. Not. R. Astron. Soc., 206, 315.
Collapse, ed. Robinson, I., Schild, A., Schucking, E. L., Chicago University Press:
Chicago.
Giuricin, G., Mardirossian, F., Mezzetti, M. and Ramella, M. (eds.) (1986). Structure
and Evolution of Active Galactic Nuclei. Reidel: Dordrecht.
Goldreich, P., Goodman, J. and Narayan, R. (1986). Mon. Not. R. Astron. Soc. In press.
Greenstein, J. and Schmidt, M. (1964). Astrophys. J., 140, 161.
Grindlay, J. (1985). Dynamics of Star Clusters, ed. J. Goodman and P. Hut. Reidel:
Dordrecht.
Gunn, J. E. and Ostriker, J. P. (1970). Astrophys. J., 157, 1395.
Gurzadyan, V. G. and Ozernoi, L. M. (1981). Astron. Astrophys., 95, 39.
Hall, D. N. B., Kleinmann, S. G. and Scoville, N. Z. (1982). Astrophys. J. Lett., 260,
L53.
Harrison, B. K., Thorne, K. S., Wakano, M. and Wheeler, J. A. (1965). Gravitational
Theory and Gravitational Collapse. Chicago University Press: Chicago.
Hawking, S. W. (1965). Phys. Rev. Lett., 15, 689.
Hawley, J. (1986). Preprint.
Hazard, C., MacKey, M. B. and Shimmins, A. J. (1963). Nature, 197, 1037.
Hewish, A., Bell, S. J., Pilkington, J. D. H., Scott, P. F. and Collins, R. A. (1968).
Nature, 217, 709.
Hoyle, F. and Fowler, W. A. (1963a). Mon. Not. R. Astron. Soc., 125, 169.
Hoyle, F. and Fowler, W. A. (1963b). Nature, 197, 533.
Hutchings, J. B., Crampton, D. and Cowley, A. P. (1983). Astrophys. J. Lett., 275, L43.
Iben, I. (1964). Quasi-stellar Sources and Gravitational Collapse, ed. Robinson, I.,
Schild, A. and Schucking, E. L. Chicago University Press: Chicago.
Jennison, R. C. and Das Gupta , M. K. ( 1953). Nature, 1 72, 96 .
Katz, J. I . , Anderson, S . F . , Margon, B. and Grandis, S. A . ( 1982). Astrophys. J., 260,
780.
Keel , W. ( 1985). Astrophysics of Active Galaxies and Quasi-stellar Objects, ed. J . M iller.
University Science Book s : Mill Valley, California.
Kerr, R. P. ( 1963). Phys. Rev. Lett. , 1 1 , 237.
Koo , D. C. ( 1986). Structure and Evolution of Active Galactic Nuclei, eds . G. Giuricin,
F. Mardirossion, M. Mezzetti and M . Ramella. Reidel : Dordrecht.
Kormendy, J. ( 1 986). (Preprint).
Krolik, J. H. and London, R . A. ( 1 983). Astrophys. J., 267, 37 1 .
Kumar, S . and Pringle, J. E. ( 1985). Mon. Not. R . Astron. Soc., 213, 435.
Lacey, C. G. and Ostriker, J. P. ( 1 975). Nature, 256, 23.
Landau , L. ( 1932). Phys. Z. Sowjetunion. , l, 285.
Laplace, P . S . ( 1795). Le System du Monde, Vol . II. Paris.
Larson , R. B. ( 1984). Mon. Not. R. Astron. Soc., 2 10 , 763 .
Le Blanc, J. M . and Wilson, J. R . ( 1969). Astrophys. J., 161 , 54 1 .
Leibowicz, E . ( 1984). Mon. Not. R . Astron . Soc . , 210, 279.
Lense, J. and Thirring, H. ( 1 9 1 8). Phys. Z . , 19, 156.
Lightman, A. P. and Shapiro, S . L. (1977) . Astrophys. J., 2 1 1 , 244.
Lin, D. N. C., Papaloizou , J. C . B. and Faulkner, J. ( 1985). Mon . Not. R . Astron . Soc.,
2 1 2 , 105.
Lingenfelter, R. E. and Ramaty, R. ( 1982). The Galactic Centre, ed . G. Riegler and
R . Blandford . American Institute of Physics, New York .
Lo, K-Y., Backer, D . C . , Ekers, R . D . , Kellermann , K . I . , Reid , M . and Moran , J. M .
( 198 5) . Nature, 315, 124.
Lo, K-Y . ( 1 986). Science, 233, 1 394 .
Astrophysical black holes 327
Longair, M . S., Ryle, M . and Scheuer, P . A. G . ( 1973). Mon. Not. R. Astron. Soc. , 164,
243 .
Lucke, R . , Yentis, D . , Friedman, H . , Fritz, G . and Shulman , S . ( 1976). Astrophys. J.
Lett., 206, L25 .
Lynden-Bell , D . ( 1969). Nature, 223, 690.
Lynden-Bell , D . ( 19 7 1 ) . Nuclei of Galaxies, ed. D . J . K . O'Connell. North Holland :
Amsterdam .
Lynden-Bell , D . and Rees, M . J . ( 197 1). Mon . Not. R. Astron. Soc. , 152 , 46 1 .
Lynden-Bell , D . ( 1978). Phys. Scripta, 17, 185.
Lyne, A . G . , Manchester, R . N . and Taylor, J . H . ( 1985). Mon. Not. R. Astron. Soc.,
213, 6 13 .
M acDonald, D . A . and Thorne, K . S . ( 1982). Mon. Not. R. Astron. Soc. , 198, 345 .
McClintock, J . E . and Remillard, R . A . ( 1 986). Astrophys. J., 308 .
McClintock, J . E. ( 1986) . Preprint.
McDowell, J. ( 1985). Mon. Not. R. Astron. Soc. , 217, 77.
Maran , S . P. and Cameron, A . G . W . ( 1964) . Physics of Non-thermal Radio Sources.
NASA : Washington .
Margon , B . , Ford, H . C., Katz, J . I . , Kwitter, K . B . , Ulrich , R . K . , Stone, R . P . S. and
Klemola, A . ( 1 979). Astrophys. J. Lett., 230, L4 1 .
Margon , B . ( 1980). Sci. Amer., 243, 4 , 44.
Margon , B. ( 1984). Ann . Rev. Astron. Astrophys. , 22, 507.
Margon , B . , Bowyer, S. and Stone, R. S. ( 1973) . Astrophys. J. Lett., 185, L l l3 .
M arshall, F. E., Holt, S . , Mushotzky, R . F. and Becker, R . M . ( 1983). Astrophys. J .
Lett., 269, L3 l 1 .
Matthews, T . A . , M altby, P . and Moffet, A . T . ( 1962). Astrophys. J., 137 , 1 5 3 .
M azeh, T . , von Paradis, J . , van der Heuvel, E . P . J . and Savonije, G. J . ( 1986). Astron.
Astrophys., 157, 1 1 3 .
Michell, J . (1 784). Philos. Trans., 74, 3 5 .
Milgrom, M . ( 1 979). Astron. Astrophys., 76, L3 .
Miller, J . (ed.) ( 1 985). Astrophysics of Active Galaxies and Quasi-Stellar Objects,
University Science Books : California.
Misner, C. W., Thorne, K. S. and Wheeler, J. A. (1973). Gravitation. Freeman :
San Francisco.
Moffet , A . T . , Gubbay, J . , Robertson, D . S . and Legg, A . J . (197 1 ). External Galaxies
and Quasi-stellar Objects, ed. D. S. Evans. Reidel: Dordrecht.
Narayan , R. ( 1986). Quasars, ed. G. Swarup and V. Kapahi. Reidel : Dordrecht.
Newsom, G . H. and Collins , G . W . II ( 1 982). Astrophys. J., 262, 7 14.
Newton , A . J . and Binney, J . ( 1984) . Mon. Not. R. Astron . Soc., 210, 7 1 1 .
Norman, C . A . and Silk, J . ( 1983). Astrophys. J., 266, 502.
Novikov, I . D. and Thorne, K. S. ( 1973). Black Holes, ed. B. DeWitt and B. S. DeWitt.
Gordon and Breach : New York.
Oppenheimer, J. R. and Volkoff, G. M . ( 1939) . Phys . Rev., 55, 374.
Oppenheimer, J. R. and Snyder, H. ( 1939) . Phys. Rev., 56, 455.
Ozernoi, L. M. and Chertoprid, V . E. ( 1969). Sov. Astr. A. J., 46, 940.
Pacholczyk, A . G . and Weymann, R. E. ( 1968). Astron. J., 73, 8 50.
Paczynski, B . ( 1974). Astron. Astrophys. , 34, 16 1 .
Paczynski, B . ( 1984). Astrophys. J. Lett. , 273, L8 1 .
Paczynski, B . ( 1986). Nature, 321 , 4 19.
Paczynski, B. and Wiita, P. J . ( 1 980). Astron. Astrophys., 88, 23.
Papaloizou, J . C. B . and Pringle, J . E. ( 1984) . Mon. Not. R. Astron. Soc., 208, 72 1 .
328 R. D. Blandford
Papaloizou, J . C. B . and Pringle, J. E. ( 1985). Mon. Not. R. Astron. Soc. , 213, 799 .
Pakull, M . W. and Angebault, L. P . ( 1986) . Nature, 322 , 5 1 1 .
Parmar, A . N . , Brandiardi-Raymont, G., Pollard , G . S . G . , Sanford, P . W., Fabian ,
A . C., Stewart, G . C., Schreier, E. J . , Poliden, R . S . , Oegerle, W . E. and Locke, M .
( 1980). Mon . Not. R. Astron. Soc., 193, 49p.
Penrose, R . ( 1965). Phys. Rev. Lett., 14, 57.
Penrose, R. ( 1969). Rev. Nuovo. Cimento, 1, 252.
Phinney, E. S. ( 1983). Unpublished thesis, University of Cambridge .
Polidan, R . S., Pollar, G. S . G . , Sanford, P . W . and Locke, M . C. ( 1978). Nature, 275,
296.
Press, W. H. ( 1972). Astrophys. J., 138, 2 1 1 .
Pringle, J. E . ( 198 1). Ann. Rev. Astron . Astrophys., 19, 137.
Pudritz, R . E. ( 1983). Mon. Not. R. Astron . Soc., 195, 88 1 .
Reber, G. ( 1944). Astrophys. J . , 100, 279.
Rees, M . J. ( 1966). Nature, 2 1 1 , 468 .
Rees , M . J . ( 197 1). Nature, 229, 3 12.
Rees, M. J . ( 1977). Ann. NY Acad. Sci. , 302, 6 1 3 .
Rees , M . J . ( 1978). Nature, 275, 5 16.
Rees, M. J . ( 1982). The Galactic Centre, ed . G . Riegler and R . D . Blandford . American
Institute of Physics : New York.
Rees, M . J. ( 1984). Ann. Rev. Astron. Astrophys., 22, 47 1 .
Rhoades, C. E . and Ruffini , R . ( 1 974) . Phys. Rev. Lett., 33, 324.
Richstone, D. and Tremaine, S . D . ( 1985). Astrophys. J., 296, 370.
Robinson, I . , Schild, A . and Schucking, E. ( 1964). Quasi-stellar Sources and Gravitational
Collapse. Chicago University Press : Chicago.
Roos, N. ( 198 1). Astron. Astrophys. , 104, 2 1 8 .
Sakimoto, P . J . and Coroniti, F. V . ( 198 1). Astrophys. J., 247, 19.
Salpeter, E. E. ( 1964). Astrophys. J., 140, 796 .
Salpeter, E. E. ( 197 1). Nature Phys. Sci., 233, 5 .
Salpeter, E. E . and Wagoner, R . V . ( 197 1). Astrophys. J., 164, 557.
Sanders, R . H. ( 1970). Astrophys. J., 162 , 784.
Sanders, R . H . and Allen , D . A . ( 1986) . Nature, 319, 19 1 .
Sanford, P . W . , Laskarides, P . and Salton, J . eds. ( 1982). Galactic X-ray Sources. Wiley :
London .
Sargent, W. L . W., Young, P . J . , Boksenberg, A . , Shortridge, K . , Lynds, C. R . and
Hartwick, F. D. A. ( 1978). Astrophys. J., 221 , 7 3 1 .
Scheuer, P . A. G . ( 1974) . Mon. Not. R . Astron. Soc., 166, 5 1 3 .
Schmidt , M . ( 1963). Nature, 197, 1040.
Schultz, A. L. and Price, R. H. ( 1985). Astrophys. J., 291 , 1 .
Schreier, E . , Gursky, H . , Kellogg, E . , Tananbaum, H . , Giacconi, R . ( 197 1). Astrophys. J .
Lett., 1 70, L2 1 .
Seyfert, C. K . ( 1943). Astrophys. J . , 97, 28.
Shakura, N . and Sunyaev, R . A . ( 1973). Astron . Astrophys., 24, 337.
Shapiro, S . L. and Teukolsky , S . A. ( 1983). Physics of Compact Objects: Black Holes,
White Dwarfs and Neutron Stars. Wiley : New York.
Shapiro, S . L. and Teukolsky. S . A. ( 1985). Astrophys. J. Lett., 292, L4 1 .
Shields, G. A . , McKee, C . F . , Lin , D . N . C. and Begelman , M . C. ( 1986) . Astrophys. J.,
306, 90.
Shklovsky, I . S. ( 1967). Astrophys. J. Lett., 148, L l .
As t rophysical black holes 329
9.1 Introduction
attempts to analyze radiation reaction in such systems in the late 1940s and
1950s (pp. 73 and 74 of Damour, 1983) shook physicists' faith in the ability
of the waves to carry off energy, and even in the correctness of the
Landau-Lifshitz formula for the emitted wave field. It required a clever thought
experiment by Bondi (1957) to restore faith in the energy of the waves, and a
series of beautiful and rigorous studies of the asymptotic properties of the
waves at infinity by Bondi and collaborators (Bondi, 1960; Bondi, van der
Burg, and Metzner, 1962; Sachs, 1962, 1963; Penrose, 1963a,b) and of the
propagation of short-wavelength waves through a curved background
spacetime by Isaacson (1968a,b) to restore faith that the fundamental theory
of gravitational waves is soundly based.
The experimental search for cosmic gravitational waves was initiated by
Joseph Weber (1960) at a time when almost nothing was known about
possible cosmic sources and when nobody else had the vision to see that
there were technological possibilities of ultimate success. After a decade of
effort, Weber (1969) announced to the world tentative evidence that his
resonant-bar gravity-wave detectors - one near Washington, DC, the other
near Chicago - were being excited simultaneously by gravitational waves.
There followed a six-year period of excitement and feverish effort as 15 other
research groups around the world tried to construct and operate similar bar
detectors (Tyson and Giffard, 1978; Amaldi and Pizella, 1979; de Sabbata
and Weber, 1977; Weber, 1986, and references therein). Sadly, even with
markedly improved sensitivities, these efforts gave no convincing evidence
that gravity waves were actually being seen.
In parallel with this experimental effort, astrophysicists worldwide
struggled through the early 1970s to milk, from electromagnetic
observations of the universe and from fundamental theory, as much
information as possible about the characteristics of the gravity waves that
might be bathing the earth. By the mid-1970s a fuzzy but helpful picture had
begun to emerge: While the sensitivities of the detectors to
kilohertz-frequency bursts arriving, say, three times per year had improved during the
early 70s from dimensionless amplitude $h_{3/\mathrm{yr}} \simeq 1 \times 10^{-15}$ to $h_{3/\mathrm{yr}} \simeq 3 \times 10^{-16}$
(a factor 10 improvement in energy flux), it seemed highly unlikely that such
bursts bathing the earth would exceed $h_{3/\mathrm{yr}} \simeq 1 \times 10^{-16}$; a reasonable
probability of success would require $h_{3/\mathrm{yr}} \simeq 10^{-20}$ or better; and a high
probability would require $h_{3/\mathrm{yr}} \simeq 10^{-21}$ to $h_{3/\mathrm{yr}} \simeq 10^{-22}$ (Smarr, ed., 1979;
Fig. 9.4 below). Although these estimates were discouraging, the theoretical
efforts that produced them were making clear the enormous potential payoff
that could follow the successful detection of gravity waves.
334 K. S. Thorne
Fortunately, the experimental efforts of the early 1970s had pointed the
way toward major possible detector improvements; and, consequently,
although most of the first-generation experimental groups became
discouraged and dropped out, a handful of highly talented groups continued
onward into the 1980s with a second-generation effort involving major
technological changes such as cooling the bars to liquid-helium
temperatures, changing bar materials, switching from passive to active
transducers, and even developing completely new types of detectors, most
notably laser-interferometer gravity-wave detectors (called 'beam' detectors
in this chapter). These second-generation efforts have reached fruition in the
last few years: bars with kilohertz burst sensitivities $h_{3/\mathrm{yr}} \sim 10^{-17}$ (30 times
better in amplitude than the first generation and 1000 times better in energy) are
now collecting data in coordinated searches (Section 9.5.2(d) below); and
small-scale beam detectors with $h_{3/\mathrm{yr}} \simeq 5 \times 10^{-17}$ are now operating (Section
9.5.3(d) below) as prototypes for full-scale detectors with projected ultimate
sensitivities in the $10^{-22}$ region (Section 9.5.3(g) below). The regime of
possible success has been reached, and the regimes of reasonably probable
success and highly probable success look reachable - though only with
vigorous continuing efforts and the expenditure of non-trivial sums of
money.
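The amplitude and energy figures quoted in this Introduction are related by simple arithmetic, since at fixed frequency the energy flux of a wave scales as the square of its amplitude. The following minimal Python sketch checks the two quoted improvements; the function name is illustrative only and does not come from the text.

```python
def energy_gain(h_old, h_new):
    """Factor by which the detectable burst energy flux drops when the
    amplitude sensitivity improves from h_old to h_new (flux scales as h**2)."""
    return (h_old / h_new) ** 2

# First-generation bars during the early 1970s: 1e-15 -> 3e-16 in amplitude.
first_generation = energy_gain(1e-15, 3e-16)   # about 11, i.e. "a factor 10"

# Second-generation bars: 3e-16 -> 1e-17, a factor 30 in amplitude.
second_generation = energy_gain(3e-16, 1e-17)  # about 900, i.e. roughly 1000
```

The rounding in the text (10 rather than 11, 1000 rather than 900) is within the precision of the quoted sensitivities.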
In parallel with these 1980s' second-generation efforts, theorists have
redoubled their struggle to firm up our understanding of the waves bathing
the earth, but with only modest results: the problem of knowing what kinds
of sources actually occur, and how frequently, is hampered by the paucity
of electromagnetic information; and, as a result, the apparent recent
improvements in our knowledge (Section 9.4 below) might be little more
than changes of fashion. On the other hand, given a specific scenario for how
a postulated source behaves, theorists have become far more adept than
before - thanks not least to supercomputers - at computing the details of the
gravitational waves it should emit (Section 9.3.3 below). As a consequence,
when waves are ultimately detected, the prospects have become reasonable
for deciphering from them the details of their sources.
While the present, 1987, gravity-wave searches might bring success, it is
not likely they will. Thus, we must anticipate a continued vigorous effort at
technology development during the coming years, with the prospects of
success improving significantly at each step along the way. In parallel, we
must anticipate a continuing major effort by relativity theorists to refine
their ability to decipher the source behaviors corresponding to postulated
gravitational-wave forms, and a continuing effort by astrophysicists to give
Gravitational radiation 335
a normal stellar core to form a neutron star, subsection c), the collapse of a
star or star cluster to form a black hole (subsection d), the inspiral and
coalescence of compact binaries (neutron stars and black holes, subsection
e), and the fall of stars and small holes into supermassive holes (subsection f).
Because our knowledge of sources is so poor, it is useful to estimate how
strong the strongest wave bursts bathing the earth could be without
violating our cherished beliefs about the laws of physics and the nature of the
universe; this is done in subsection g. The periodic sources treated in Section
9.4.2 include rotating neutron stars (rigidly rotating pulsars, and neutron
stars spun up by accretion until they encounter a radiation-reaction-driven
instability, subsection b), and binary stars (including unevolved binaries,
W UMa stars, white-dwarf binaries, and neutron-star binaries, subsection
c). The stochastic sources in Section 9.4.3 include large numbers of binary
stars whose waves superpose stochastically (subsection b), pre-galactic,
Population III stars (subsection c), the big-bang singularity in which the
universe began - with subsequent parametric amplification of its waves by
background curvature in inflationary and other scenarios (subsection d),
phase transitions in the subsequent but still early universe (subsection e),
and cosmic strings produced by phase transitions (subsection f). Present
estimates of the strengths of the waves from all these sources are shown in
Figs. 9.4 (burst), 9.6 (periodic), and 9.7 (stochastic) along with the
sensitivities of present and proposed detectors.
The detectors in Section 9.5 are divided into those that operate in the
high-frequency regime, $f \gtrsim 10$ Hz (Sections 9.5.2, 9.5.3, and 9.5.4), those for
low frequencies, $10\ \mathrm{Hz} \gtrsim f \gtrsim 10^{-5}$ Hz (Section 9.5.5), and those for very low
frequencies, $f \lesssim 10^{-5}$ Hz (Section 9.5.6). The high-frequency detectors are
all earth-based; but because of seismic and gravity-gradient noise, the low-
and very-low-frequency detectors must be space-based. Sections 9.5.2 and
9.5.3 describe in great detail the earth-based, high-frequency bar and beam
detectors which have been under development for many years and show
great promise for the future. Section 9.5.4 describes briefly other types of
earth-based, high-frequency detectors. Section 9.5.5 describes
low-frequency detectors including doppler tracking of spacecraft (subsection a),
beam detectors in space which hold great promise for the turn of the century
(subsection b), the normal modes of the earth and sun (subsections c and d),
the vibrations of blocks of the earth's crust (subsection e), and the
earth-orbiting skyhook (subsection f). Section 9.5.6 describes very-low-frequency
detectors including the timing of pulsars (neutron-star rotations), which
recently has placed interesting observational limits on a stochastic
lies in how she analyzes and thinks about the force of gravity $F_j$. As a
relativist, she recognizes (cf. Box 37.1 of MTW) that $F_j$ is made up of a nearly
steady, nearly position-independent component caused by her own failure
to fall freely, plus a component proportional to the particle's Cartesian
coordinate position $x^j$ (relative acceleration of particle and origin of
coordinates) which is caused by spacetime curvature $R_{\alpha\beta\gamma\delta}$. This latter
contribution,
$$F_j = -m R_{j0k0}\, x^k \tag{2}$$
(where the index 0 denotes a component along her time basis vector), she
splits up into a piece that changes slowly in time (background curvature
contribution) plus a piece that is rapidly varying. She makes certain that
there are no rapidly moving or rapidly changing nearby sources of gravity to
account for the rapid variations; if there are none, then she can attribute the
rapidly varying component of the force to gravitational waves
$$F_j^{GW} = -m R^{GW}_{j0k0}\, x^k. \tag{3}$$
It is conventional to use, as the primary entity for describing a
gravitational wave, not the Riemann curvature tensor $R^{GW}_{\alpha\beta\gamma\delta}$ which has
dimensions 1/time$^2$ (or 1/length$^2$), but rather a dimensionless
'gravitational-wave field' $h^{TT}_{jk}$. In terms of the force-producing components of the Riemann
tensor $R^{GW}_{j0k0}$ and proper time $t$ as measured by our observer, $h^{TT}_{jk}$ is defined
by (cf. Section 2.3 of Thorne 1983)
$$\frac{\partial^2 h^{TT}_{jk}}{\partial t^2} = -2 R^{GW}_{j0k0}. \tag{4}$$
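Fig. 9.1. Lines of force for gravitational waves (equations (5) and (7)): (a)
with '+' polarization, $m\,\delta\ddot{x} = \tfrac{1}{2} m \ddot{h}_+ x$ and $m\,\delta\ddot{y} = -\tfrac{1}{2} m \ddot{h}_+ y$; and (b) with '×'
polarization, $m\,\delta\ddot{x} = \tfrac{1}{2} m \ddot{h}_\times y$ and $m\,\delta\ddot{y} = \tfrac{1}{2} m \ddot{h}_\times x$ - where dots denote time
derivatives. [Figure not reproduced: panels (a) and (b), each drawn in the x-y plane.]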
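Combining equations (3) and (4) gives the gravity-wave force on a test particle as $F_j^{GW} = \tfrac{1}{2} m\, \ddot{h}^{TT}_{jk} x^k$. The Python sketch below evaluates this for a monochromatic, purely '+'-polarized wave travelling along z. The amplitude, frequency, mass, and positions are illustrative assumptions (SI units), not values taken from the text, and the function names are hypothetical.

```python
import math

def h_ddot_plus(h_amp, f_hz, t):
    """Second time derivative of h_jk for h_+ = h_amp*cos(2*pi*f*t), h_x = 0,
    with the wave propagating along z (so only the x-y block is non-zero)."""
    w = 2.0 * math.pi * f_hz
    hpp = -w * w * h_amp * math.cos(w * t)
    return [[hpp, 0.0, 0.0],
            [0.0, -hpp, 0.0],
            [0.0, 0.0, 0.0]]

def tidal_force(mass, x, h_ddot):
    """F_j = (1/2) * m * (d^2 h_jk / dt^2) * x^k, from equations (3) and (4)."""
    return [0.5 * mass * sum(h_ddot[j][k] * x[k] for k in range(3))
            for j in range(3)]

hdd = h_ddot_plus(1e-21, 1.0e3, 0.0)                 # assumed 1 kHz wave, h = 1e-21
force_on_x_axis = tidal_force(1.0, [1.0, 0.0, 0.0], hdd)  # 1 kg, 1 m out along x
force_on_y_axis = tidal_force(1.0, [0.0, 1.0, 0.0], hdd)  # 1 kg, 1 m out along y
```

At any instant the force along x on the first particle is equal and opposite to the force along y on the second, which is the quadrupolar '+' pattern of panel (a).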
quantity
$$h_+ = h^{TT}_{xx} = -h^{TT}_{yy} \tag{7a}$$
produces a force field with the orientation of a '+' sign, while
$$h_\times = h^{TT}_{xy} = h^{TT}_{yx} \tag{7b}$$
produces one with the orientation of a '×' sign. Thus, $h_+$ and $h_\times$ are called
the 'plus' and 'cross' (or + and ×) gravity-wave amplitudes. From these
amplitudes and the polarization tensors $e^+_{xx} = -e^+_{yy} = 1$, $e^\times_{xy} = e^\times_{yx} = 1$ (all
other components zero), one can reconstruct the full wave field
$$h^{TT}_{jk} = h_+ e^+_{jk} + h_\times e^\times_{jk}. \tag{7c}$$
It is straightforward to show that, if one rotates the x and y axes in the
transverse plane through an angle $\Delta\psi$, the gravity-wave amplitudes are
changed to
$$h_+^{\mathrm{new}} = h_+^{\mathrm{old}} \cos 2\Delta\psi + h_\times^{\mathrm{old}} \sin 2\Delta\psi,$$
discussed above, it has 'spin-weight two' (it behaves like a spin-two field
under rotations). For further details see Section 2.3.2 of Thorne (1983), and
for the mathematics that goes along with the concepts of 'spin-weight' and
'boost-weight', see Geroch, Held and Penrose (1973).
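The reconstruction (7c) and the spin-two behaviour under rotations can be checked numerically. The sketch below builds the transverse 2×2 block of $h^{TT}_{jk}$ from the two amplitudes and projects it onto axes rotated by $\Delta\psi$. It assumes the new axes are obtained by a counterclockwise rotation of the old ones; the cross-amplitude it returns is simply what that projection yields, not a formula quoted from the text. All names are illustrative.

```python
import math

E_PLUS = [[1.0, 0.0], [0.0, -1.0]]   # e^+_{xx} = -e^+_{yy} = 1
E_CROSS = [[0.0, 1.0], [1.0, 0.0]]   # e^x_{xy} = e^x_{yx} = 1

def wave_field(h_plus, h_cross):
    """Transverse block of h_jk = h_+ e^+_jk + h_x e^x_jk, equation (7c)."""
    return [[h_plus * E_PLUS[j][k] + h_cross * E_CROSS[j][k] for k in range(2)]
            for j in range(2)]

def amplitudes_in_rotated_axes(h_plus, h_cross, dpsi):
    """Return (h_+, h_x) measured along axes rotated by dpsi (radians)."""
    h = wave_field(h_plus, h_cross)
    ex = [math.cos(dpsi), math.sin(dpsi)]    # new x axis (assumed convention)
    ey = [-math.sin(dpsi), math.cos(dpsi)]   # new y axis
    def contract(a, b):
        return sum(a[j] * h[j][k] * b[k] for j in range(2) for k in range(2))
    return contract(ex, ex), contract(ex, ey)

new_plus, new_cross = amplitudes_in_rotated_axes(1.0, 0.5, 0.3)
half_turn = amplitudes_in_rotated_axes(1.0, 0.5, math.pi)
```

The projected plus-amplitude reproduces the $\cos 2\Delta\psi$, $\sin 2\Delta\psi$ combination above, and a rotation through $\pi$ (rather than $2\pi$) already returns the original amplitudes, as expected for a spin-two field.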
computed from the Riemann tensor (equation (4)) in the proper reference
frame of any observer who is nearly at rest in the TT coordinate system. For
proofs and discussions, see e.g. Sections 35.4, 37.1, and 37.2 of MTW.
Whereas physics in a proper reference frame can be formulated in the
Newtonian language of three-dimensional forces, including gravitational
forces, physics in a TT coordinate system must be formulated in the
relativistic language of geodesic motion and vanishing divergence of the
stress-energy tensor; cf. Section 9.5.1 below.
$$T^{GW}_{\alpha\beta} = \frac{1}{32\pi} \left\langle h^{TT}_{jk,\alpha}\, h^{TT}_{jk,\beta} \right\rangle, \tag{9}$$
where $\langle \cdots \rangle$ means 'average over several wavelengths'. (For a pedagogical
derivation and discussion see Sections 35.7-35.15 of MTW; for a beautiful
rederivation by the method of averaged Lagrangians see MacCallum and
Taub, 1973.) If the waves are propagating in the z direction, this
stress-energy tensor takes the standard form for a bundle of zero-rest-mass
particles (gravitons) moving at the speed of light in the z direction:
$$T^{GW}_{00} = -T^{GW}_{0z} = -T^{GW}_{z0} = T^{GW}_{zz} = \frac{1}{16\pi} \left\langle \left(\frac{\partial h_+}{\partial t}\right)^2 + \left(\frac{\partial h_\times}{\partial t}\right)^2 \right\rangle. \tag{10}$$
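To get a feel for the size of the energy flux in equation (10), one can restore the factors of $G$ and $c$ (the equation is written in units with $G = c = 1$), which gives a flux $c^3/(16\pi G)\,\langle \dot{h}_+^2 + \dot{h}_\times^2 \rangle$. The Python sketch below evaluates this for a monochromatic wave; the choice of a sinusoidal waveform, the rounded cgs constants, and the sample amplitude and frequency are assumptions made for illustration.

```python
import math

G_CGS = 6.674e-8    # gravitational constant, cm^3 g^-1 s^-2 (rounded)
C_CGS = 2.998e10    # speed of light, cm s^-1 (rounded)

def energy_flux(h_plus_amp, h_cross_amp, f_hz):
    """Time-averaged energy flux in erg cm^-2 s^-1 for sinusoidal h_+ and h_x
    of the given amplitudes at frequency f_hz, from equation (10) with G and c
    restored. The time average of (d/dt of A*cos(w*t))**2 is 0.5*(A*w)**2."""
    w = 2.0 * math.pi * f_hz
    mean_hdot_sq = 0.5 * w * w * (h_plus_amp ** 2 + h_cross_amp ** 2)
    return C_CGS ** 3 / (16.0 * math.pi * G_CGS) * mean_hdot_sq

# Assumed example: a linearly polarized 1 kHz wave with h = 1e-21.
flux = energy_flux(1e-21, 0.0, 1.0e3)
```

Even for an amplitude as small as $10^{-21}$, the flux comes out at roughly a hundred erg cm$^{-2}$ s$^{-1}$ while the wave lasts, because of the very large factor $c^3/G$.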
Contrast this huge gravity-wave energy flux with the peak electromagnetic
flux at the height of the supernova, $\sim 10^{-9}\ \mathrm{erg\,cm^{-2}\,s^{-1}}$; but note that the
gravity waves should last for only a few milliseconds, while the strong
electromagnetic output lasts for days.
Corresponding to the huge energy flux (11) in an astrophysically
interesting gravitational wave is a huge occupation number for the quantum
states of the gravitational-wave field: it is not hard to show that for the
above supernova burst only a handful of quantum states are occupied; and
they each contain $n \sim 10^{75}$ gravitons (equations (6)-(8) of Thorne et al.,
1979). This means that the waves behave exceedingly classically;
quantum-mechanical corrections to the classical theory have fractional magnitude
$1/\sqrt{n} \sim 10^{-37}$. (Although the full quantization of the gravitational field is
exceedingly difficult and not yet fully under control, the quantization of
weak gravitational waves propagating through a smooth background
spacetime - equivalent to weak waves in flat spacetime - has been well
understood for decades; see, e.g., the most elementary aspects of Feynman,
1963; DeWitt, 1967a,b.)
Isaacson's stress-energy tensor (9) for gravitational waves has the same
properties and plays the same role as the stress-energy tensor for any other
field or form of matter in the background spacetime. For example, $T^{GW}_{\alpha\beta}$
generates background curvature through the Einstein field equations
(averaged over several wavelengths of the waves); also $T^{GW}_{\alpha\beta}$ has vanishing
divergence (conservation of gravity-wave energy and momentum) in
spacetime regions where the waves are not being generated, absorbed, or
scattered. For full details see Isaacson (1968b) or Section 35.15 of MTW.
9.3 The generation and propagation of gravitational waves
9.3.1 Wave propagation split off from wave generation
Turn, now, to the generation of gravitational waves and their propagation
from their source to the earth. Mathematically, the wave-generation
problem and the wave-propagation problem are each difficult - though for
very different reasons, so that to handle the difficulties requires two very
back-reaction effects of the wave emission on the source, i.e. the 'radiation
reaction'.
(14)
for the magnitude $h$ of the gravity-wave field $h^{TT}_{jk}$ at earth.
From the 'exact' quadrupole formula (12) for the wave field in the local
wave zone and Isaacson's formula (9) for the stress-energy tensor of the
waves, one can compute the fluxes of energy and angular momentum carried
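$$\frac{dJ_i^{GW}}{dt} = \frac{2}{5} \sum_{j,k,l} \epsilon_{ijk} \left\langle \frac{\partial^2 \mathcal{I}_{jl}}{\partial t^2} \frac{\partial^3 \mathcal{I}_{kl}}{\partial t^3} \right\rangle. \tag{16}$$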
The linear momentum carried off by the waves vanishes when one computes
it by the quadrupole formalism; but when one includes higher-order
corrections to the field emitted by slow-motion sources, one finds for the rate
of emission of linear momentum (first derived by Papapetrou, 1962, 1971;
for a more modern derivation in the notation of this chapter see Section
IV.C of Thorne, 1980b)
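$$\frac{dP_i^{GW}}{dt} = \frac{2}{63} \sum_{j,k} \left\langle \frac{\partial^3 \mathcal{I}_{jk}}{\partial t^3} \frac{\partial^4 \mathcal{I}_{jki}}{\partial t^4} \right\rangle + \frac{16}{45} \sum_{j,k,a} \epsilon_{ijk} \left\langle \frac{\partial^3 \mathcal{I}_{ja}}{\partial t^3} \frac{\partial^3 \mathcal{S}_{ka}}{\partial t^3} \right\rangle. \tag{17}$$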
Here $\mathcal{I}_{ijk}$ is the source's 'mass octupole moment' and $\mathcal{S}_{ij}$ is its 'current
quadrupole moment' (gravitational analog of magnetic quadrupole
moment). For sources with weak internal gravity and stresses (nearly
Newtonian sources), these moments are computable from the simple
volume integrals
$$\mathcal{I}_{ijk} = \left( \int \rho\, x_i x_j x_k\, d^3x \right)^{STF}, \tag{18a}$$
$$\mathcal{S}_{ij} = \left( \int \rho\, \epsilon_{ipq} x_p v_q x_j\, d^3x \right)^{STF}, \tag{18b}$$
where, as in equation (15a), STF means 'make it symmetric and trace-free',
i.e. 'symmetrize on all free indices and remove the traces on all pairs of free
indices'. Independently of the strengths of the internal gravity and stresses,
the moments can be read off the Newtonian potential $\Phi \simeq -\tfrac{1}{2}(g_{00} + 1)$
(equation (13b)) and off the 'gravitomagnetic potential' $\beta_j \equiv g_{0j}$ in the
source's weak-field near zone:
(19)
In equation (19) $J_p$, the moment in the leading, dipolar term, is the source's
angular momentum. For further details, discussions, and derivations see
Thorne (1983) or Thorne (1980b). For a discussion of the gravitomagnetic
potential see, e.g., Chapter 3 of Thorne, Price and Macdonald (1986).
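As an illustration of the 'symmetric and trace-free' (STF) prescription, the following Python sketch evaluates the current quadrupole moment (18b) for a nearly Newtonian source. It assumes the mass distribution can be approximated by a set of point masses, so that the volume integral over $\rho$ becomes a sum over particles; that discretization, the function names, and the sample binaries are illustrative and are not taken from the text.

```python
def cross(a, b):
    """Vector product a x b, i.e. epsilon_ipq a_p b_q."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def current_quadrupole(masses, positions, velocities):
    """S_ij = STF part of sum_n m_n (x_n cross v_n)_i (x_n)_j, the
    point-mass analogue of the volume integral (18b)."""
    raw = [[0.0] * 3 for _ in range(3)]
    for m, x, v in zip(masses, positions, velocities):
        l = cross(x, v)
        for i in range(3):
            for j in range(3):
                raw[i][j] += m * l[i] * x[j]
    # Symmetrize on the two free indices, then remove the trace.
    sym = [[0.5 * (raw[i][j] + raw[j][i]) for j in range(3)] for i in range(3)]
    trace = sym[0][0] + sym[1][1] + sym[2][2]
    for i in range(3):
        sym[i][i] -= trace / 3.0
    return sym

# Assumed examples: two bodies on opposite sides of their centre of mass,
# moving in the x-y plane (arbitrary units).
equal = current_quadrupole([1.0, 1.0],
                           [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
                           [[0.0, 1.0, 0.0], [0.0, -1.0, 0.0]])
unequal = current_quadrupole([2.0, 1.0],
                             [[1.0, 0.0, 0.0], [-2.0, 0.0, 0.0]],
                             [[0.0, 1.0, 0.0], [0.0, -2.0, 0.0]])
```

For the equal-mass pair the moment vanishes by symmetry, while the unequal pair has non-zero $\mathcal{S}_{xz} = \mathcal{S}_{zx}$, which is the kind of asymmetry needed for the second term of (17) to contribute.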
The laws of conservation of energy, angular momentum, and linear
momentum imply that radiation reaction should deplete the source's
energy, angular momentum, and linear momentum at just the right rates to
compensate for the losses (15), (16), and (17); and a detailed analysis of the
radiation reaction forces reveals that this is so (Peres, 1960). Particularly
convenient in analyzing the radiation reaction in a source with weak
self-gravity is a Newtonian-type radiation-reaction potential (Burke, 1969;
Thorne, 1969; Chandrasekhar and Esposito, 1970; Section 36.8 of MTW).
Different physicists feel comfortable with different levels of rigor. In recent
years these differences have shown up strongly and publicly in a controversy
over derivations of the quadrupole wave-generation formula (12) and the
formula for the energy sapped from a source by radiation reaction (negative
of equation (15)). Many physicists - myself among them - were quite
satisfied with derivations at the level of rigor, e.g., of Landau and Lifshitz
(1941) and Peres (1960). Others (e.g. Ehlers et al., 1976) felt that those early
derivations were inadequately rigorous and, correspondingly, that the
quadrupole formulae were suspect for sources with non-negligible
self-gravity. The controversy was heightened by the fact that there were
mathematical errors in some (but not all) of the early derivations (see
Walker and Will, 1980 and Section 3.4.2 of Thorne, 1983 for discussions).
The controversy was still raging in the early 1980s; see, e.g. Ashtekar
(1983). However, during the mid 1980s it has largely subsided. There are
now many new derivations of the quadrupole formulae, with much
improved rigor, and they all produce the same, standard results; see, e.g.,
Anderson et al. (1982); Blanchet and Damour (1984); Christodoulou and
Schmidt (1979); Isaacson, Welling and Winicour (1984); and for reviews see
Will (1986), Schutz (1986a) and Damour (1987).
Of particular interest is radiation reaction in the binary pulsar
PSR 1913+16, which should cause the binary system's two neutron stars to
spiral together slowly with a consequent gradual decrease in their orbital
period. Because the duration of observations of the pulsar is short (12
years), the cumulative effects of radiation reaction on the orbit during those
observations are 100 times smaller than post-Newtonian effects; and,
consequently, the detailed effects of the radiation reaction could not be fully
understood until the orbital equations were fully under control up through
post-post-Newtonian order. Damour and Deruelle (1986) have now
brought the orbital equations fully under control; and there is now a
beautiful agreement between those equations - including the quadrupolar
radiation reaction - and the observational data. For a detailed discussion
see Chapter 6 of this book.
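The size of the expected orbital-period decrease can be estimated with a short calculation. The Python sketch below uses the standard quadrupole-formula result for an eccentric Keplerian binary (the Peters-Mathews expression), which is not written out in this section and is brought in here as an assumption. The orbital period, eccentricity, and masses are rounded, approximate values for PSR 1913+16, used for illustration only.

```python
import math

T_SUN = 4.925491e-6   # G * M_sun / c^3 in seconds

def orbital_period_derivative(p_b, ecc, m1, m2):
    """Rate of change of the orbital period (dimensionless, s/s) caused by
    quadrupole radiation reaction in a Keplerian binary.
    p_b: orbital period in seconds; ecc: eccentricity;
    m1, m2: masses in solar masses."""
    enhancement = ((1.0 + (73.0 / 24.0) * ecc ** 2 + (37.0 / 96.0) * ecc ** 4)
                   / (1.0 - ecc ** 2) ** 3.5)
    mass_factor = m1 * m2 / (m1 + m2) ** (1.0 / 3.0)
    return (-(192.0 * math.pi / 5.0)
            * (2.0 * math.pi * T_SUN / p_b) ** (5.0 / 3.0)
            * mass_factor * enhancement)

# Approximate, rounded parameters (assumed): P_b ~ 27907 s, e ~ 0.617,
# masses ~ 1.44 and 1.39 solar masses.
p_dot = orbital_period_derivative(27907.0, 0.617, 1.44, 1.39)
```

With these inputs the period derivative comes out at a few times $10^{-12}$ seconds per second and is negative, i.e. the orbit shrinks, which illustrates why many years of timing are needed before the cumulative effect becomes measurable.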
distance r, the strongest emitters will be those with the largest non-spherical
kinetic energies, i.e. those with the largest internal masses and with internal
velocities approaching the speed of light. Thus, the strongest emitters are
likely to violate the slow-motion assumption which underlies the
quadrupole formalism, and will require for accurate analysis either
higher-order corrections to the quadrupole formalism, or wave-generation
formalisms that do not entail any slow-motion assumption.
There are a number of other wave-generation formalisms which can be
applied to such sources. This section is a catalog of them, with references to
detailed presentations and applications. For an out-of-date but unified
presentation of most of these formalisms see Thorne (1977).
To be tractable with a minimum of numerical computation, a
wave-generation formalism must break the extreme non-linearity of the Einstein
field equations by imposing a power-series expansion in some small quantity
and keeping only the lowest, linear order or the lowest few orders.
Wave-generation formalisms can be classified according to their choice of the small
expansion parameter. Slow-motion formalisms (of which the quadrupole
formalism is an example) expand in $L/\bar{\lambda}$ = (size of source)/(reduced
wavelength of waves); subsection (a) below. Post-Minkowski formalisms
expand in the strength of the gravitational field inside the source, i.e. in the
magnitude of the deviations of the metric coefficients from their Minkowski
values; subsection (b). Post-Newtonian formalisms expand simultaneously
in $L/\bar{\lambda}$ and the strength of the internal gravitational field; subsection (c).
Perturbation formalisms expand in the deviations of the metric from its form
for some non-radiative, astrophysical system - e.g. from the Kerr metric for
a rotating black hole, or from the metric for an equilibrium, rotating,
relativistic stellar model, or from the Friedman-Robertson-Walker metric
for a homogeneous, isotropic 'big-bang'; subsection (d).
Of all astrophysical sources, the very strongest emitters will entail
gravitationally induced large-amplitude, high-velocity, non-spherical
internal motions - e.g. the inspiral and coalescence of a binary black hole or
binary neutron-star system. For such sources there is no small parameter in
which one can expand. The only way to compute the full details of the wave
field emitted by such sources is by the techniques of numerical relativity: the
numerical solution of the full Einstein field equations on a supercomputer;
subsection (e).
beyond linear order; and they are sometimes called 'fast-motion' to contrast
them with slow-motion formalisms.
There is a systematic way to take a post-Minkowski wave-generation
formalism that is accurate to a given order in the strength of the source's
internal gravity, and iterate it to obtain a formalism of higher accuracy
(Thorne and Kovacs, 1975; Thorne, 1977).
The wave-generation formalism that is accurate to first post-Minkowski
order (first order in the strength of internal gravity) is 'Linearized theory',
i.e. the linear approximation to general relativity. Linearized theory is
discussed in most textbooks, e.g. Chapter 18 and Sections 35.1-35.6 of
MTW. Halpern and Desbrandes (1969) and, independently, Press (1977)
have derived a particularly useful Linearized wave-generation formula for
systems with sizes L large compared to a reduced wavelength $\bar{\lambda}$. Examples of
gravity-wave generation that have been analyzed by Linearized theory are:
(i) the coherent (but painfully slow) transformation of electromagnetic
waves into gravitational waves (first considered by Gertsenshtein, 1962,
subsequent work reviewed in Section 4.1 of Grishchuk and Polnarev, 1980);
and (ii) the waves emitted by the explosion of a non-spherical nuclear bomb
(Wheeler, 1962; Wood et al., 1970).
Linearized theory is completely ignorant of the source's internal gravity;
it can correctly predict the emitted waves only if the source's motions are
governed by non-gravitational forces - typically by electric or magnetic
forces. For systems with significant but weak internal gravity (e.g. stellar
pulsations, binary systems, and high-speed stellar encounters), one must use
a wave-generation formalism accurate to the next, 'post-post-Minkowski'
or 'post-Linear' order. For the details of such a formalism, see, e.g., Thorne
and Kovacs (1975), Crowley and Thorne (1977); and for its application to
high-speed stellar encounters (gravitational bremsstrahlung radiation) see
Kovacs and Thorne (1977, 1978). For a recent review see Westpfahl (1985).
Thus far nobody has developed a post²-Linear wave-generation
formalism in detail - i.e. a formalism accurate to post³-Minkowski order.
There has been no great need for such a formalism, and the post-Linear
formalism is sufficiently hard to work with in practice (cf. Kovacs and
Thorne, 1977) that it is not clear whether post²-Linear would be
significantly easier than full-blown numerical relativity.
inside the source the deviations of the metric from Minkowski (i.e. the
dimensionless strength of the source's gravity) have a magnitude ε of order
(L/ƛ)²; and accordingly they expand the Einstein field equations
simultaneously in ε (post-Minkowski expansion) and L/ƛ (slow-motion
expansion). See, e.g., Burke (1979) for a review of the method.
The lowest-order wave generation formalism that results from this
expansion is called 'Newtonian' because it computes the evolution of the
source using Newton's laws of gravity and mechanics, then evaluates the
source's time-evolving quadrupole moment using the standard Newtonian
volume integral (13a), then inserts that quadrupole moment into the
standard quadrupole wave-generation formula (12). It is this Newtonian
version of the quadrupole formalism that has been especially controversial
(see the end of Section 3.2 above) but is now almost universally agreed to be
highly reliable.
There have been a large number of important wave-generation
calculations with this formalism. Some examples are: (i) the waves emitted
by binary systems in Newtonian, elliptical orbits (Peters and Mathews,
1963); (ii) the waves emitted by a variety of models of stars that collapse to
form neutron stars (Saenz and Shapiro, 1978, 1981); and (iii) the waves
emitted in the 'head-on' collision of two compact stars (Gilden and Shapiro,
1984).
The Newtonian wave-generation formalism starts losing accuracy when
the source's internal gravity becomes too strong (ε ~ 0.05) and its internal
velocities too high (v ~ 0.2) (Turner and Will, 1978) - e.g. in the late stages of
the spiraling together of a neutron-star binary system. In such a situation it
is useful to include the next higher-order corrections (one order higher in the
strength of gravity ε, two higher in the speed L/ƛ). The result is the
post-Newtonian formalism (Epstein and Wagoner, 1975; Wagoner, 1977;
Tsvetkov, 1984). Examples of calculations that have been performed with
the post-Newtonian formalism are the radiation from a system of bodies
whose sizes are all small compared to their separations (Wagoner and Will,
1976), gravitational bremsstrahlung at moderate velocities (Turner and
Will, 1978), and the radiation emitted by a slowly rotating star that collapses
to a neutron star (Turner and Wagoner, 1979).
The foundations for a post²-Newtonian wave-generation formalism have
also been worked out (Section V.E. of Thorne, 1980b); but it has never been
developed in full detail or applied to any sources. Such a formalism - by
contrast with the post²-Linear - would likely be far more tractable than
full-blown numerical relativity; so it may one day prove useful in tying down the
gravitational waves from large-amplitude processes involving neutron stars.
wave zones (Section 9.3. 1), because the transition from wave generation to
wave propagation is a temporal rather than a spatial one : the transition
occurs when the size of the cosmological horizon expands to become much
larger than a wavelength, thereby unfreezing a set of frozen-in initial
perturbations. The only way, today, to analyze primordial waves is by a
perturbation formalism somewhat akin to that used for stars and black
holes: the unperturbed configuration is typically a non-radiative,
spatial dimensions and one time (e.g. Nakamura, 1987). This full '3 + 1'
effort will require using the world's largest supercomputers, and will require
new techniques for slicing spacetime into space plus time, for choosing the
spatial coordinates, and for differencing the Einstein equations. The effort
may absorb almost as many person-years as the development of
gravitational-wave detectors; but it will be well worthwhile: the payoffs will
include the ability to compute in detail the waveforms from the strongest
gravity-wave sources in the universe, such as the spiraling together and
coalescence of two black holes - waveforms that will be crucial to the
interpretation of gravity-wave observations and to their use for strong-field,
highly dynamical tests of general relativity.
One should not be misled into believing that numerical relativity will be
the totally dominant tool for realistic gravitational-wave calculations in the
coming years. On the contrary, we can expect a healthy interaction between
numerical and analytical techniques; for discussion see Schutz (1986c).
they are fully formed, using a linearized approximation to the Einstein field
equations (Sections 35.13 and 35.14 of MTW): (i) one introduces a field h̄_αβ
(which is actually the trace-reversed contribution of the waves to the
spacetime metric in a suitable gauge). This field is defined to be equal to h^TT_jk
in the region from which the waves are propagating (the local wave zone in
the case of isolated sources; the very early universe in the case of primordial
waves). One then evolves the field h̄_αβ out into the surrounding universe and
to earth using the curved-spacetime wave equation (equation (35.64) of
MTW)
\[ \bar h_{\alpha\beta|\mu}{}^{\mu} + g^{\rm B}_{\alpha\beta}\,\bar h^{\mu\nu}{}_{|\nu\mu} - 2\bar h_{\mu(\alpha}{}^{|\mu}{}_{\beta)} + 2R^{\rm B}_{\mu\alpha\nu\beta}\,\bar h^{\mu\nu} - 2R^{\rm B}_{\mu(\alpha}\bar h_{\beta)}{}^{\mu} = -16\pi\,\delta T_{\alpha\beta} . \tag{22} \]
Here g^B_αβ is the background metric, the subscript and superscript | denote
covariant derivatives with respect to the background metric, R^B_αβ and R^B_αβγδ
are the Ricci and Riemann curvature tensors of the background; and δT_αβ is
the perturbation in the non-gravitational stress-energy tensor produced by
the trace-reversed metric perturbation h̄_αβ itself.
Although the field h̄_αβ initially is wavelike and thus has ƛ ≪ ℒ ≲ ℛ, it
might propagate into regions where the background has very short
lengthscales, ℒ ≲ ƛ. (For example, the waves produced by the Crab pulsar,
with ƛ ~ 1000 km, may propagate through a massive white dwarf with
ℒ ~ 1000 km and ℛ ~ 30 000 km, or even through a neutron star with
ℒ ~ 10 km and ℛ ~ 30 km.) If this happens, one need not worry. The wave
equation (22), because it depends for its validity only on the weakness of the
field h̄^αβ and not on the shortwave assumption, remains valid and carries the
field through the region of short background lengthscale (where, strictly
speaking, it is no longer a gravitational wave), and thence onward into
regions of long lengthscale (where it once again is a gravitational wave).
In those regions where h̄_αβ is a wave, i.e. has ƛ ≪ ℒ, one can compute from
it the gravitational-wave field h^TT_jk by a very simple prescription: introduce
the proper reference frame of a specific observer; and in that frame discard
the time-time and time-space parts of h̄_αβ, and algebraically project out
from the space-space parts those pieces that are transverse to the
propagation direction and are trace-free. The result will be h^TT_jk. For further
discussion and justifications see Box 35.1 of MTW and Section 2.4.2 of
Thorne (1983).
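The projection step in this prescription can be sketched numerically. The block below is a minimal illustration, not the chapter's own algorithm: it assumes the standard projector P_jk = δ_jk - n_j n_k built from a unit propagation direction n, and the function name `tt_project` is hypothetical.

```python
def tt_project(h, n):
    """Return the transverse-traceless part of a symmetric 3x3 tensor h.

    h : 3x3 nested list (space-space components in the observer's frame)
    n : unit 3-vector along the propagation direction (assumed normalized)
    Uses the standard formula h_TT = P h P - (1/2) P tr(P h),
    with P_jk = delta_jk - n_j n_k.
    """
    idx = range(3)
    # Projector onto the plane transverse to n.
    P = [[(1.0 if j == k else 0.0) - n[j] * n[k] for k in idx] for j in idx]
    # Transverse part: P h P.
    PhP = [[sum(P[j][a] * h[a][b] * P[b][k] for a in idx for b in idx)
            for k in idx] for j in idx]
    # Its trace, which is removed to make the result trace-free.
    trace = sum(PhP[j][j] for j in idx)
    return [[PhP[j][k] - 0.5 * P[j][k] * trace for k in idx] for j in idx]


# Example: a wave travelling along z; only the x-y block can survive.
h_example = [[1.0, 2.0, 3.0],
             [2.0, 5.0, 6.0],
             [3.0, 6.0, 9.0]]
h_tt = tt_project(h_example, [0.0, 0.0, 1.0])
```

For propagation along z the surviving components are h_xx = -h_yy and h_xy = h_yx, which are the two polarization amplitudes used later in the section.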
As I shall discuss below, the effects of the wave-stimulated stress-energy
perturbations δT_αβ are never large enough to be astrophysically important,
so one can ignore them in propagation calculations. Moreover, in regions
(24)
where r is distance to the source and θ, φ are spherical polar angles centered
on the source. For a slow-motion source the function A^TT_jk is just twice the TT
part of the second time derivative of the source's quadrupole moment, as
one sees trivially by matching (24) onto (12) in the source's local wave zone.
For a source at a large cosmological redshift z ≳ 1, if one approximates the
background spacetime geometry by that of a Friedmann-Robertson-Walker
cosmological model, the geometric-optics propagation produces the
same effects for gravity waves as for light: (i) the magnitude of h^TT_jk falls off
with the same 1/R behavior as for light, where R is (1/2π) × (circumference of
a sphere passing through the earth and centered on the source, at the time
the waves reach earth) (equations (29.28)-(29.33) of MTW); (ii) the
polarization, like that of light in vacuum, is parallel transported radially
from source to earth; and (iii) the time dependence of the wave form is
unchanged by propagation, except for a frequency-independent redshift
f_received/f_emitted = 1/(1 + z). For further details and derivations see Section
2.5.4 of Thorne (1983) or Section 7.2 of Thorne (1977).
9.3.5 A catalog of wave-propagation effects
In principle gravitational waves can experience almost all the familiar
peculiarities of propagation that electromagnetic waves experience. Here I
shall enumerate those that have been studied, mention briefly their
importance or unimportance, and give references for further detail.
(d) Diffraction
Near the focal point of a gravitational lens the waves cease to propagate
along null rays and begin to diffract, thereby lessening the strength of the
focusing. The analysis of this is no different for gravitational waves than for
another ('plane-wave collision'): the focusing itself is produced by the
wave-generated background curvature. Moreover, if the waves are precisely
planar, a spacetime singularity forms at the focal plane (Khan and Penrose,
1971; Szekeres, 1972; Nutku and Halil, 1977; Tipler, 1980; Matzner and
Tipler, 1984; Chandrasekhar and Xanthopoulos, 1986; Yurtsever, 1987a),
and the generation of background curvature plays a key role in the
singularity. In the more realistic case (which, however, almost certainly does
not occur in the real universe except conceivably near the big-bang), in
which the waves are almost planar but die out slowly at large transverse
distances, if the transverse size is sufficiently large compared to the initial
wave amplitude, then the focusing probably still drives the amplitude up far
enough - before diffraction can act - to make background curvature
generation become strong and force a singularity to form (Yurtsever,
1987b).
bathing the earth. Such estimates will be stated below, with references; and
they are collected together below in Figs. 9.6 (burst sources), 9.7 (periodic
sources) and 9.6 (stochastic sources).
9.4.1 Burst sources
( ;
L\h ;.r = L\ � 4 A � V�
m rF (25)
Here the summation is over the free bodies in the system, m_A and v_A^j are the
mass and velocity of body A, and Δ denotes the change from before the burst
is emitted to afterward.
As is discussed by Braginsky and Thorne (1987), the memory part of a
burst, Δh^TT_jk, can be studied by any detector (with adequate sensitivity) that
operates at a frequency lower than the burst's characteristic frequency,
f ≲ f_c. Put equivalently, one can think of a burst with memory as having a
signal that extends down to all frequencies below f_c (cf. the 'zero-frequency
limit' discussed by Smarr, 1977b and by Bontz and Price, 1979).
Current prejudice suggests that the strongest of burst sources (and thus
the most interesting) may produce normal bursts rather than bursts with
memory; but this prejudice could perfectly well be wrong. In accord with
this prejudice, the remainder of this section will focus on normal bursts.
(b) Characterization of the waves from a normal burst source and the noise in
a detector searching for them
Burst sources are best characterized by their full wave forms h^TT_jk(t).
However, when comparing with detector sensitivities, it is helpful to have a
more compact characterization. Past discussions have used a loosely
defined 'characteristic amplitude' h_c and 'characteristic frequency' f_c.
However, the factors of order 3 that are glossed over by the loose definitions
are beginning to be important in the planning of gravity-wave searches,
especially in the case of the inspiral and coalescence of binary neutron stars
(subsection (e) below); and I therefore shall be quite careful in my definitions
of h_c and f_c and subsequently in my corresponding discussions of detector
sensitivities.
As an aid to defining h_c and f_c with care, I show in Fig. 9.2 a diagram of the
emission, propagation, and absorption of the wave. The source has
preferred local Cartesian axes (x, y, z) with respect to which its internal
structure is especially simple. We shall denote by (ι, β) the direction toward
earth (spherical polar angles) relative to those axes. The detector, similarly,
has preferred local Cartesian axes (x, y, z) with respect to which its internal
structure is especially simple; and we shall denote by (θ, φ) the direction
toward the source (spherical polar angles) relative to those detector axes.
The waves themselves, as they pass through the detector, are most simply
described in a third set of Cartesian axes (x', y', z') with origin at the
detector's center of mass, z' axis along the waves' propagation direction, and
x' and y' axes so oriented in the polarization plane as to make the wave
forms h_+ = h^TT_x'x' = -h^TT_y'y' and h_× = h^TT_x'y' = h^TT_y'x' especially simple. We shall
denote by ψ the angle between the x' polarization axis and the φ = 0 plane.
Fig. 9.2. The angles ι, β, ψ, θ, φ which characterize the emission, propagation
and detection of a gravitational wave.
From a model for the source one can compute the wave forms
h_+(t - z'; ι, β) and h_×(t - z'; ι, β) arriving at the detector. If the detector is
\[ h(t) = F_+(\theta,\phi,\psi)\,h_+(t;\iota,\beta) + F_\times(\theta,\phi,\psi)\,h_\times(t;\iota,\beta) . \tag{26} \]
Here F_+ and F_× are detector beam-pattern functions (to be discussed in
Sections 9.5.2 and 9.5.3 below), which depend on the direction of the source
(θ, φ) and the orientation ψ of the polarization axes relative to the detector's
orientation (Fig. 9.2) and which have values in the range 0 ≤ |F_A| ≤ 1. There
will be some special choice of θ, φ, ψ for which F_+ = 1 and F_× = 0; we shall
call this the 'optimum source direction and polarization' for the + mode of
the waves.
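As a rough illustration of equation (26), the sketch below combines the two wave forms with beam-pattern factors. The explicit F_+ and F_× used here are an assumption: they are the widely quoted patterns for a right-angle interferometer and are not given in this section, and sign conventions differ between authors. The function names are hypothetical.

```python
import math


def beam_patterns(theta, phi, psi):
    """Assumed beam patterns (F_plus, F_cross) of a right-angle interferometer.

    theta, phi : source direction relative to the detector axes (radians)
    psi        : polarization angle (radians)
    These expressions are a common textbook form, not taken from this text.
    """
    a = 0.5 * (1.0 + math.cos(theta) ** 2) * math.cos(2.0 * phi)
    b = math.cos(theta) * math.sin(2.0 * phi)
    f_plus = a * math.cos(2.0 * psi) - b * math.sin(2.0 * psi)
    f_cross = a * math.sin(2.0 * psi) + b * math.cos(2.0 * psi)
    return f_plus, f_cross


def detector_response(h_plus, h_cross, theta, phi, psi):
    """Scalar wave form h = F_plus * h_plus + F_cross * h_cross, as in (26)."""
    f_plus, f_cross = beam_patterns(theta, phi, psi)
    return f_plus * h_plus + f_cross * h_cross


# A source directly overhead with psi = 0 is an 'optimum' case for the + mode.
optimum = beam_patterns(0.0, 0.0, 0.0)
```

With these assumed patterns the overhead, ψ = 0 configuration gives F_+ = 1 and F_× = 0, matching the 'optimum source direction and polarization' described above, and both factors stay within the stated range for every direction.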
In Section 9.5 we shall characterize the noise in a detector by a
frequency-dependent spectral density S_h(f) (with dimensions Hz⁻¹) defined as follows:
if a precisely sinusoidal gravitational wave with known phase α, known
frequency f > 0 and unknown rms amplitude h_0,
\[ h = \sqrt{2}\,h_0\cos(2\pi f t + \alpha), \qquad \alpha = {\rm const.}, \tag{27a} \]
impinges on the detector, and if the experimenters seek to detect the wave by
Fourier analyzing the detector output with a bandwidth Δf (integration
time τ̂ = 1/Δf), then the amplitude signal-to-noise ratio will be
\[ \frac{S}{N} = \frac{h_0}{[S_h(f)\,\Delta f]^{1/2}} . \tag{27b} \]
In much of the gravitational-wave literature this S_h(f) is denoted [h(f)]²;
i.e. h(f) = [S_h(f)]^{1/2} (with dimensions Hz^{-1/2}).
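The definition (27b) is easy to evaluate directly. The following sketch is only an illustration of that formula; the function name and the sample numbers are hypothetical.

```python
import math


def sinusoid_snr(h0, s_h, integration_time):
    """Amplitude signal-to-noise ratio for a sinusoid of rms amplitude h0.

    s_h              : spectral density S_h(f) at the wave's frequency, in 1/Hz
    integration_time : integration time in seconds; bandwidth = 1 / time
    Implements S/N = h0 / sqrt(S_h * bandwidth).
    """
    bandwidth = 1.0 / integration_time
    return h0 / math.sqrt(s_h * bandwidth)


# Illustrative numbers only: quadrupling the integration time doubles S/N.
snr_short = sinusoid_snr(2.0, 4.0, 1.0)
snr_long = sinusoid_snr(2.0, 4.0, 4.0)
```

The square-root growth with integration time is the reason narrow-band searches for periodic sources benefit from long observations.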
weighted by 1/S_h(f) (so that noisy frequencies are suppressed):
\[ \tilde K(f) = \frac{\tilde h(f)}{S_h(f)}, \qquad \tilde h(f) = \int_{-\infty}^{+\infty} h(t)\,e^{i2\pi f t}\,dt , \tag{28a} \]
\[ K(t) = \int_{-\infty}^{+\infty} \tilde K(f)\,e^{-i2\pi f t}\,df . \tag{28b} \]
(ii) One then takes the output of the detector, which includes noise and
possibly signal; from it one computes the gravitational wave form h_output(t)
that would have been required to produce the observed output if there
were no noise, and from that the quantity
\[ W = \int_{-\infty}^{+\infty} K(t - t_0)\,h_{\rm output}(t)\,dt . \tag{28c} \]
(iii) This quantity will have root-mean-square contribution N from noise;
and if the signal was actually present with starting time t_0, it will have a
signal contribution S (i.e. W = N + S), with squared signal-to-noise ratio
\[ \frac{S^2}{N^2} = \int_0^{\infty} \frac{2|\tilde h(f)|^2}{S_h(f)}\,df . \tag{29} \]
See Wiener (1949), Sections 25-27 of Wainstein and Zubakov (1962), Kafka
(1977), and Michelson and Taber (1984) for proofs and discussion. The
integral in (29) is from 0 to ∞ rather than -∞ to +∞ and there is a factor 2
present because S_h(f) is a 'one-sided spectral density' (defined only for f ≥ 0;
negative frequencies are folded into positive). The squared signal-to-noise
ratio (29) will be the basis for our definitions of h_c and f_c.
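For sampled data the integral in (29) has to be approximated. The sketch below uses a plain trapezoidal rule over positive frequencies; the function name and the flat test spectrum are hypothetical, and a real analysis would need a finer treatment of the frequency grid.

```python
def snr_squared(freqs, h_tilde_abs, s_h):
    """Trapezoidal estimate of S^2/N^2 = integral over f >= 0 of 2|h~(f)|^2 / S_h(f).

    freqs       : increasing positive frequencies (Hz)
    h_tilde_abs : |h~(f)| sampled at those frequencies
    s_h         : one-sided spectral density S_h(f) at those frequencies
    """
    integrand = [2.0 * a * a / s for a, s in zip(h_tilde_abs, s_h)]
    total = 0.0
    for k in range(len(freqs) - 1):
        # Area of one trapezoid between neighbouring frequency samples.
        total += 0.5 * (integrand[k] + integrand[k + 1]) * (freqs[k + 1] - freqs[k])
    return total


# Flat signal spectrum and flat noise over a 200 Hz band (illustrative values).
flat_case = snr_squared([100.0, 200.0, 300.0], [3.0, 3.0, 3.0], [2.0, 2.0, 2.0])
```

For the flat case the integrand is constant, so the estimate is simply that constant times the bandwidth, which makes it a convenient sanity check.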
Expression (29) is the squared signal-to-noise ratio for a specific
('fiducial') source at some specific distance r_0 from earth. Suppose that space
is uniformly filled with sources, all identical to this fiducial source but with
random directions (θ, φ), orientations (ι, β), and polarization angles ψ.
Suppose, further, that inside the source's distance r_0 there is, on average, one
burst each D_0 days. Then what, on average, will be the squared signal-to-noise
ratio (S²/N²)_strongest of the strongest burst that occurs each D_0 days?
One might expect the answer to be the fiducial S²/N² of (29), averaged over
all the angles θ, φ, ψ, ι, β. Not so. There is a statistical preference for
directions and polarizations that give larger values of S²/N², because they
can be seen out to greater distances where the event rate is greater. This
effect gives, assuming the event rate goes up as r³ and h^TT_jk goes down as 1/r,
(S²/N²)_strongest = ⟨S³/N³⟩^{2/3}, where ⟨·⟩ denotes an average over randomly
(30)
(3 1a)
(3 1b)
(32)
\[ h_{3/{\rm yr}} \simeq (3\text{-}5)\,h_n(f_c) . \tag{34} \]
Here the range 3-5 corresponds to the range of frequencies of interest for
most burst searches, 10⁻⁴ Hz to 10⁴ Hz. Note that because the events being
sought are so far out on the tail of the Gaussian probability distribution,
changing by a factor 10 or 100 the number of trial starting times t_0, or asking
for 99% or 99.9% confidence rather than 90%, would have a negligible
effect on this h_3/yr.
It is often useful to rewrite the f_c and h_c of equations (31) in terms of the
energy flux per unit frequency dE_GW/dA df carried past the detector by the
waves. From equation (10) for the waves' stress-energy tensor together with
Parseval's theorem we infer that
\[ \frac{dE_{\rm GW}}{dA\,df} = \frac{\pi}{2}\,f^2\left(|\tilde h_+(f)|^2 + |\tilde h_\times(f)|^2\right), \tag{35} \]
where the extra factor 2 comes from folding negative frequencies into
positive (so f > 0). When this quantity is averaged over all directions ι, β, it
gives (4πr_0²)⁻¹ dE_GW/df, where r_0 is the distance to the source; and
consequently equations (31) become
\[ f_c = \left[\int_{-\infty}^{+\infty} \frac{dE_{\rm GW}/df}{f\,S_h(f)}\,d\ln f\right]^{-1} \left[\int_{-\infty}^{+\infty} \frac{dE_{\rm GW}/df}{S_h(f)}\,d\ln f\right] , \tag{36a} \]
(36b)
For most burst sources (e.g. supernovae) the wave form is so uncertain
that a careful calculation of h_c is unjustified. In such cases it is useful to
reexpress h_c (equation (36b)), approximately, in terms of the total energy
ΔE_GW radiated:
\[ h_c \simeq \left(\frac{3}{2\pi^2}\,\frac{\Delta E_{\rm GW}}{f_c}\right)^{1/2} \frac{1}{r_0} = 2.7\times10^{-20} \left(\frac{\Delta E_{\rm GW}}{M_\odot c^2}\right)^{1/2} \left(\frac{1\,{\rm kHz}}{f_c}\right)^{1/2} \left(\frac{10\,{\rm Mpc}}{r_0}\right) , \tag{37} \]
where M⊙ is the mass of the sun and 10 Mpc is the distance to the center of
the Virgo cluster of galaxies (assuming a Hubble constant of 100 km s⁻¹
Mpc⁻¹).
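The numerical coefficient in equation (37) can be checked with a few lines of code. The sketch below assumes the geometrized form h_c ≈ [3ΔE_GW/(2π² f_c)]^{1/2}/r_0 and restores factors of G and c by hand; the rounded SI constants and the function name are supplied here for illustration and are not part of the text.

```python
import math

# Rounded SI values, assumed for this sketch.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
MPC = 3.086e22     # one megaparsec, m


def characteristic_amplitude(delta_e_solar, f_c_hz, r0_mpc):
    """Approximate h_c for a burst radiating delta_e_solar * M_sun c^2.

    Uses h_c ~ sqrt(3 G dE / (2 pi^2 c^3 f_c)) / r0, i.e. the geometrized
    expression with G and c restored.
    """
    delta_e = delta_e_solar * M_SUN * C ** 2   # radiated energy in joules
    r0 = r0_mpc * MPC                          # distance in metres
    return math.sqrt(3.0 * G * delta_e / (2.0 * math.pi ** 2 * C ** 3 * f_c_hz)) / r0


# One solar rest mass radiated at 1 kHz from 10 Mpc.
h_fiducial = characteristic_amplitude(1.0, 1000.0, 10.0)
```

The fiducial value comes out close to the quoted coefficient 2.7 × 10⁻²⁰, and the function scales as the square root of the energy and inversely as the square root of the frequency, as the formula requires.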
We turn now to a discussion of specific burst sources, and their
characteristic frequencies f_c and wave strengths h_c.
(c) Supernovae (collapse to neutron star)
Supernovae of 'type II' are believed, with a high level of confidence, to be
created by the gravitational collapse, to a neutron-star state, of the cores of
massive, highly evolved stars. Supernovae of 'type I', by contrast, are
thought to result from nuclear explosions of white dwarfs that are accreting
mass from close companions - explosions in which the stellar core probably
does not, but might, collapse to a neutron-star state (Woosley and Weaver,
1986; Evans, Iben and Smarr, 1987, and references therein) . In addition to
these two types of optically observed supernovae, there may well be stellar
collapses to a neutron star that produce little optical display ('optically
silent supernovae').
degree of non-sphericity in the stellar collapse that triggers it, and somewhat
on the speed of collapse - i.e. on whether the collapse is nearly free-fall ('cold
collapse') or is more gentle due to the resistance of thermal pressure ('hot
collapse'). Perfectly spherical collapse will produce no waves; highly
non-spherical collapse will produce strong waves. Little is known about the
degree of non-sphericity in type II (which are surely due to stellar collapse),
but current prejudice suggests that the typical type II might be quite
spherical and thus poorly radiating. If type I are due to explosion of an
accreting white dwarf and, contrary to current thought, the explosion is
accompanied by collapse of the stellar core to a neutron star, then the white
dwarf might be rapidly rotating due to the accretion, and centrifugal forces
might then cause it to collapse very non-spherically and radiate strongly.
In the mid-1970s there was a swing of fashion from believing that the
collapse is cold and fast to believing it is hot and slow (e.g. Wilson, 1974;
Schramm and Arnett, 1975). Some astrophysicists were aghast at the
consequence of this swing: for example, in the highly non-spherical but
axisymmetric collapse models of Saenz and Shapiro (1978, their tables 1-4)
the total energy radiated as gravitational waves was reduced from
ΔE_GW/M⊙c² ≃ 6 × 10⁻³ for cold and fast to ≃ 1 × 10⁻⁵ for hot and slow.
This boded ill for attempts to detect gravitational waves, the astrophysicists
thought. However, closer scrutiny revealed relatively little change in the
prospects for detection: the total energy radiated ΔE_GW is a rather poor
indicator of detectability. Much more relevant is the amplitude signal-to-noise
ratio S/N = h_c/h_n(f_c) (equation (33)). The planned LIGO beam
detectors, when optimized for frequency f_c, have h_n(f_c) ∝ f_c (Fig. 9.4 and
equation (125a)); and this together with equation (37) gives
\[ S/N \propto (\Delta E_{\rm GW}/f_c^3)^{1/2} . \tag{38} \]
Although the new fashion (hot and slow) corresponded to a reduction in
ΔE_GW by a factor 600, the characteristic frequency of the waves also went
down (from ≃ 3000 Hz to ≃ 500 Hz; see Saenz and Shapiro, 1978); and,
correspondingly, S/N was reduced by less than a factor 2.
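The arithmetic behind this conclusion follows from the scaling of S/N with the radiated energy divided by the cube of the characteristic frequency, all under a square root. A minimal sketch, with a hypothetical function name:

```python
import math


def snr_ratio(energy_ratio, frequency_ratio):
    """Ratio of two S/N values under S/N proportional to sqrt(dE / f_c^3).

    energy_ratio    : new radiated energy divided by old
    frequency_ratio : new characteristic frequency divided by old
    """
    return math.sqrt(energy_ratio / frequency_ratio ** 3)


# Hot-and-slow versus cold-and-fast: energy down by 600, f_c from 3000 Hz to 500 Hz.
hot_over_cold = snr_ratio(1.0 / 600.0, 500.0 / 3000.0)

# The same scaling: an eightfold energy drop is offset by halving the frequency.
balanced = snr_ratio(1.0 / 8.0, 1.0 / 2.0)
```

Here `hot_over_cold` is 0.6, so the signal-to-noise ratio falls by a factor of about 1.7 even though the energy falls by 600.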
This illustrates the importance of thinking in terms of h_c, f_c, and S/N =
Fig. 9.3. Wave forms produced by two very different scenarios for the
collapse of a normal star to form a neutron star. Wave form (a) is from Saenz
and Shapiro (1978); (b) is from Saenz and Shapiro (1981).
[Figure: two panels, (a) and (b), each plotting the wave form h against time.]
The very different right-hand curve in Fig. 9.3 (Saenz and Shapiro, 1981)
implies a quadrupole moment that somehow is driven into sinusoidal
oscillations which initially increase in amplitude and then die out. Again, the
fact that the period (~ 0.6 ms) is that of a neutron-star pulsation suggests
that something is triggering, then damping, such a pulsation. If this wave
form was seen roughly one day before an optical supernova was found in the
Virgo cluster, one would infer that the waves were from the supernova and
one could deduce that the quadrupole-moment oscillations are so large in
amplitude that they must have absorbed, say, 10⁻⁴ of the collapse energy.
The natural explanation might be parametric amplification of quadrupole
neutron star pulsations by a bouncing stellar collapse, followed by
hydrodynamic damping - the process that gave rise to this computed wave
form.
A large amount of effort has gone into model calculations of gravitational
collapse to a neutron star and the waves it emits; but the effort has not
produced a consensus by any means! For a detailed review of the literature
up to 1982 see Eardley (1983); for an update on that review see Müller
(1984). In the case of rapidly rotating collapse, where the emission should be
strongest but the event rate is totally unknown (most collapses could be
slowly rotating), there are three radically different scenarios and
corresponding wave characteristics: (i) The star may remain axisymmetric
throughout the collapse. In this case the best current 'wisdom' (but by no
means a consensus) comes from calculations by Müller (1982) and is
pessimistic. Those calculations predict the strongest emission to come in two
different spectral regions: f_c ~ 1000 Hz where ΔE_GW ~ 1 × 10⁻⁷ M⊙c² and
h_c ~ 1 × 10⁻²³ (10 Mpc/r_0) due to the initial collapse and bounce; and
f_c ~ 10⁴ Hz where ΔE_GW ~ 10⁻⁶ M⊙c² and h_c ~ 1 × 10⁻²³ (10 Mpc/r_0) due to
pulsations of the newly formed neutron star. (ii) The star may become
unstable to an 'm = 2 bar-mode' deformation so it rotates end-over-end like
an American football. In this case the best current 'wisdom' is more
optimistic: the calculations of Ipser and Managan (1984) predict a highly
monochromatic emission at f ~ 1000 Hz, lasting for ~ 30 cycles and
producing ΔE_GW ~ 3 × 10⁻⁴ M⊙c² and h_c ~ 5 × 10⁻²² (10 Mpc/r_0). (iii) The
collapsing star may become so strongly unstable to non-axisymmetric
perturbations that on the way down it breaks up into two or more discrete
lumps. Very little is known about this possibility, and radiation reaction
from the m = 2 mode (case ii) might prevent it from occurring at all (Ipser,
1986). Eardley (1983) argues that if it does occur, it may produce quite
strong waves: ΔE_GW ~ (a few) × 10⁻² M⊙c² at f_c ~ 1000 Hz, corresponding
to h_c ~ 4 × 10⁻²¹ (10 Mpc/r_0).
Because this best wisdom is so insecure, Fig. 9.4 shows wave strengths
based not on these specific models, but rather on the general equation (37)
for several possible values of ΔE_GW and r_0, and for the entire range of
characteristic frequencies that have shown up in model calculations,
200 Hz ≲ f_c ≲ 10 000 Hz. Note that detectability depends strongly on
whether the waves come off at low frequencies or high: a factor 8 reduction
in ΔE_GW can be compensated by a factor 2 reduction in f_c.
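1986; and equation (37) above)
\[ f_c \simeq \frac{1}{5\pi M} = 1.3\times10^{4}\,{\rm Hz}\left(\frac{M_\odot}{M}\right) , \tag{40a} \]
\[ h_c \simeq \left(\frac{15}{2\pi}\,\varepsilon\right)^{1/2} \frac{M}{r_0} = 7\times10^{-22} \left(\frac{\varepsilon}{0.01}\right)^{1/2} \left(\frac{M}{M_\odot}\right) \left(\frac{10\,{\rm Mpc}}{r_0}\right) = 1.0\times10^{-20} \left(\frac{\varepsilon}{0.01}\right)^{1/2} \left(\frac{10^3\,{\rm Hz}}{f_c}\right) \left(\frac{10\,{\rm Mpc}}{r_0}\right) . \tag{40b} \]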
If the collapse is axisymmetric, then the efficiency ε probably does not exceed
7 × 10⁻⁴ (Stark and Piran, 1986). However, in the non-axisymmetric case
(e.g. formation of an elongated configuration due to rapid rotation, or
bifurcation into one or more lumps during collapse (the 'collapse, pursuit,
and plunge' scenario of Ruffini and Wheeler, 1971)), the efficiency might be
in the range 0.01-0.1 (see, e.g., Eardley, 1983, and Rees, 1983). The source
characteristics (40) are shown in Fig. 9.4 for black-hole births at the Hubble
distance and at the distance of Virgo, with efficiencies of ε = 10⁻² and 10⁻⁴.
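The source characteristics (40) can be evaluated for any hole mass and efficiency. The sketch below assumes the geometrized readings f_c ≈ 1/(5πM) and h_c ≈ (15ε/2π)^{1/2} M/r_0, converts mass and distance to seconds, and uses rounded SI constants supplied here; the function name is hypothetical.

```python
import math

# Rounded SI values, assumed for this sketch.
G = 6.674e-11      # m^3 kg^-1 s^-2
C = 2.998e8        # m/s
M_SUN = 1.989e30   # kg
MPC = 3.086e22     # m


def black_hole_birth(mass_solar, efficiency, r0_mpc):
    """Return (f_c in Hz, h_c) for collapse to a hole of the given mass.

    efficiency is the fraction of the hole's rest mass radiated as waves.
    Mass and distance are converted to seconds (G = c = 1 units) first.
    """
    m_seconds = G * mass_solar * M_SUN / C ** 3
    r0_seconds = r0_mpc * MPC / C
    f_c = 1.0 / (5.0 * math.pi * m_seconds)
    h_c = math.sqrt(15.0 * efficiency / (2.0 * math.pi)) * m_seconds / r0_seconds
    return f_c, h_c


# A one-solar-mass hole at the Virgo distance with efficiency 0.01.
f_virgo, h_virgo = black_hole_birth(1.0, 0.01, 10.0)
```

For these inputs the results land near 1.3 × 10⁴ Hz and 7 × 10⁻²², in line with the coefficients quoted in (40); heavier holes radiate at proportionally lower frequency and larger amplitude.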
amplitudes are A1 = =
systems whose components are neutron stars or black holes, and are close
enough together to be driven into coalescence by gravitational radiation
reaction in a time less than the age of the universe. The binary pulsar
PSR 1913+16 is an example of such a system; it will coalesce 3.5 × 10⁸ years
from now.
As the two bodies in a compact binary spiral together, they emit periodic
gravitational waves with a frequency that sweeps upward toward a
maximum,
\[ f_{\max} \simeq 1\,{\rm kHz} \quad \text{for neutron stars,} \tag{41a} \]
\[ f_{\max} \simeq \frac{10\,{\rm kHz}}{M_1/M_\odot} \quad \text{for holes with the larger having mass } M_1 . \tag{41b} \]
The wave form during the nearly Newtonian part of the frequency sweep,
f ≪ f_max, is easily computed from the quadrupole formalism (Section 9.3.2).
The post-Newtonian corrections to this waveform will become more and
more important as f rises toward f_max; they are given in Wagoner and Will
(1976); see also Gal'tsov, Matiukhin and Petukhov (1980). Ultimately, near
f_max, higher-order corrections or full non-linear relativity are needed to get
the wave form reasonably accurately. The final, coalescence stage will be
especially interesting and complex in the case of a neutron-star binary, and
may be quite sensitive to the masses of the two stars; and as with
supernovae, we might not understand reliably what to expect until
gravitational-wave observations show us. For a first, preliminary
theoretical effort at understanding, see Clark and Eardley (1977). For black
holes, by contrast, numerical relativity is likely to give us, within the next
five years or so, a detailed and highly reliable picture of the final coalescence
and the wave forms it produces, including the dependence on the holes'
masses and angular momenta. Comparison of the predicted wave forms
with observed ones will constitute the strongest test ever of general
relativity. (The wave forms for the astrophysically unlikely cases of head-on
collisions of two identical non-rotating holes or neutron stars have already
been evaluated by numerical relativity; see Smarr, 1977a for black holes,
and Gilden and Shapiro, 1984 for neutron stars.)
Because the binary system spends far more time in the early, low-frequency
part of the sweep than in the later, high-frequency part or in the
final coalescence, and because planned gravity wave detectors have less
amplitude noise at low frequencies, f ~ 100 Hz, than at high, f ≫ 100 Hz (cf.
Fig. 9.4), it will be easier for detectors to see the Newtonian regime of the
sweep than the post-Newtonian regime or the final coalescence - except in
given as a function of time by (MTW equation (36.17))
\[ f = \frac{1}{\pi}\left[\frac{5}{256}\,\frac{1}{\mu M^{2/3}(t_0 - t)}\right]^{3/8} . \tag{42d} \]
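The frequency sweep (42d) can be evaluated and inverted numerically. The sketch below reads (42d) as f = (1/π)[5/(256 μ M^{2/3} (t_0 - t))]^{3/8} in units with G = c = 1, with masses converted to seconds through the rounded value GM⊙/c³ ≈ 4.925 × 10⁻⁶ s; that constant and both function names are supplied here for illustration.

```python
import math

T_SUN = 4.925e-6   # G * M_sun / c^3 in seconds (rounded, assumed)


def chirp_frequency(mu_solar, m_total_solar, time_left):
    """Gravity-wave frequency (Hz) when time_left seconds remain before t_0."""
    mu = mu_solar * T_SUN
    m = m_total_solar * T_SUN
    inner = 5.0 / (256.0 * mu * m ** (2.0 / 3.0) * time_left)
    return inner ** (3.0 / 8.0) / math.pi


def time_to_coalescence(mu_solar, m_total_solar, f_hz):
    """Inverse of chirp_frequency: seconds remaining when the waves are at f_hz."""
    mu = mu_solar * T_SUN
    m = m_total_solar * T_SUN
    return 5.0 / (256.0 * mu * m ** (2.0 / 3.0) * (math.pi * f_hz) ** (8.0 / 3.0))


# Two 1.4 solar-mass neutron stars: reduced mass 0.7, total mass 2.8.
tau_100 = time_to_coalescence(0.7, 2.8, 100.0)
```

For this binary only a couple of seconds remain once the waves reach 100 Hz, while the steep (8/3)-power dependence on frequency means the system lingers far longer at lower frequencies.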
The most promising detectors for coalescing neutron-star binaries and
low-mass black-hole binaries are beam detectors in the planned
multi-kilometer LIGOs. As we shall see in Section 9.5.3(e), a beam detector can be
operated in several different optical configurations. The optimum
configuration for searching for coalescing binaries is likely to be one with
light recycling, for which the spectral density of shot noise (the dominant
noise above some 'seismic cutoff' frequency f_s) will have the form
\[ S_h(f) = {\rm const} \times f_k\left[1 + (f/f_k)^2\right] \quad \text{at } f > f_s . \tag{43a} \]
Here f_k is a 'knee frequency' which the experimenters can adjust by
changing the reflectivities of certain mirrors in their detectors; see equation
(117c) and Fig. 9.13 below, and associated discussion. The constant in
equation (43a) is independent of the choice of f_k. At frequencies below the
'seismic cutoff' f_s seismic noise is likely to come on very strong; accordingly,
we shall make the approximation
(43b)
By Fourier-transforming the wave forms (42), squaring, and averaging
over the source orientation angle ι, we obtain
\[ \left\langle |\tilde h_+|^2 + |\tilde h_\times|^2 \right\rangle = \frac{\pi}{12}\,\frac{\mu^2}{r^2}\,\frac{M^3}{\mu}\,\frac{1}{(\pi M f)^{7/3}} . \tag{44} \]
By inserting this source strength (44) and the detector noise (43) into
equation (30) and maximizing the resulting signal-to-noise ratio with
respect to the knee frequency f_k, we find that the experimenter will do best to
choose
(45)
For smaller choices of f_k there is not a wide enough frequency band between
the seismic cutoff and the knee to take optimal advantage of the broad-band
nature of the signal. For larger values of f_k the experimenter loses because
the height S_h(f) of the 'noise floor' at f_s < f ≲ f_k (equation (43a)) is
proportional to f_k. With this choice of knee, equations (31a) and (43)-(45)
give for the characteristic frequency f_c = 0.909 f_k. Below (equation (125a)) we
shall characterize the sensitivities of beam detectors in full scale LIGOs,
when searching for bursts, by the noise amplitude h_n at the knee; and,
correspondingly, we here shall set f_c (which after all is somewhat arbitrary)
to f_k rather than 0.909 f_k:
$$ f_c = f_k = 1.44 f_s. \tag{46a} $$
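The optimum knee can be checked numerically. The following is a minimal sketch, not the author's calculation: it assumes that the squared signal-to-noise ratio is proportional to the integral of |h~(f)|^2 / S_h(f) over f > f_s, with |h~|^2 proportional to f^(-7/3) and S_h proportional to f_k [1 + (f/f_k)^2], and that the noise is effectively infinite below f_s. The function names `snr_squared` and `best_knee_ratio` are illustrative.

```python
# Sketch: locate the knee frequency that maximizes the signal-to-noise ratio.
# Assumed model (labelled as an assumption, see the lead-in):
#   S^2/N^2  ~  integral_{f_s}^{inf} f^(-7/3) / ( f_k [1 + (f/f_k)^2] ) df


def snr_squared(k, n=400):
    """Relative S^2/N^2 for knee ratio k = f_k / f_s (arbitrary units).

    With u = f/f_s the integral is  int_1^inf u^(-7/3) k / (k^2 + u^2) du.
    The substitution t = 1/u maps it onto [0, 1]:
        int_0^1 k t^(7/3) / (1 + k^2 t^2) dt,
    evaluated here with Simpson's rule (n must be even).
    """
    def g(t):
        return k * t ** (7.0 / 3.0) / (1.0 + (k * t) ** 2)

    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def best_knee_ratio(lo=0.5, hi=5.0, step=0.005):
    """Grid search for the ratio f_k / f_s that maximizes snr_squared."""
    best_k, best_val = lo, snr_squared(lo)
    steps = int(round((hi - lo) / step))
    for i in range(1, steps + 1):
        k = lo + i * step
        val = snr_squared(k)
        if val > best_val:
            best_k, best_val = k, val
    return best_k


if __name__ == "__main__":
    print("optimal f_k / f_s is roughly", round(best_knee_ratio(), 2))
```

Under these assumptions the maximum is broad, so a coarse grid search is adequate; a finer optimizer would change the result only in the third decimal place.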
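Equations (31b) and (43)-(45) then give for the characteristic amplitude of
the waves from inspiraling binaries
$$ h_c = 0.237\,\frac{\mu^{1/2} M^{1/3}}{r f_c^{1/6}} = 4.1 \times 10^{-22} \left(\frac{\mu}{M_\odot}\right)^{1/2} \left(\frac{M}{M_\odot}\right)^{1/3} \left(\frac{100\ \mathrm{Mpc}}{r}\right) \left(\frac{100\ \mathrm{Hz}}{f_c}\right)^{1/6}. \tag{46b} $$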
The characteristic amplitude (46b) is enhanced over the actual rms wave
strength (h_+^2(t_c) + h_×^2(t_c))^{1/2} the waves have at the time t = t_c when they sweep
through frequency f_c - enhanced by very nearly the square root of the
number of periods, n = (f^2/ḟ) at f = f_c, = (5/96π)(M/μ)(πMf_c)^{-5/3}, that the
binary spends in the vicinity of the frequency f_c. This √n ≃
28 (μ/M_⊙)^{-1/2} (M/M_⊙)^{-1/3} (f_c/100 Hz)^{-5/6} enhancement corresponds to the
enhancement in effective signal that the experimenters will achieve by
optimal signal processing in their search for these frequency-sweeping
bursts.
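As a rough numerical check of the scalings just quoted, the sketch below evaluates the characteristic amplitude h_c = 0.237 μ^{1/2} M^{1/3} / (r f_c^{1/6}) and the cycle-count enhancement √n with n = (5/96π)(M/μ)(πMf_c)^{-5/3}, in geometrized units converted to seconds. The constants and the function names are assumptions of this sketch, not taken from the text.

```python
import math

T_SUN = 4.925e-6                  # G * M_sun / c^3 in seconds (assumed value)
MPC_S = 3.0857e24 / 2.99792e10    # one megaparsec expressed in light-seconds


def h_c_inspiral(mu, M, r_mpc, f_c):
    """Characteristic amplitude of an inspiraling binary.

    mu, M : reduced and total mass in solar masses
    r_mpc : distance in Mpc
    f_c   : characteristic frequency in Hz
    """
    mu_s = mu * T_SUN
    M_s = M * T_SUN
    r_s = r_mpc * MPC_S
    return 0.237 * math.sqrt(mu_s) * M_s ** (1.0 / 3.0) / (r_s * f_c ** (1.0 / 6.0))


def sqrt_n_cycles(mu, M, f_c):
    """Square root of the number of periods spent near frequency f_c."""
    n = (5.0 / (96.0 * math.pi)) * (M / mu) * (math.pi * M * T_SUN * f_c) ** (-5.0 / 3.0)
    return math.sqrt(n)


if __name__ == "__main__":
    # Two 1.4 solar-mass neutron stars: mu = 0.7, M = 2.8 solar masses.
    print(h_c_inspiral(0.7, 2.8, 100.0, 100.0))
    print(sqrt_n_cycles(0.7, 2.8, 100.0))
```

Setting μ = M = 1 solar mass, r = 100 Mpc and f_c = 100 Hz should reproduce the numerical prefactors of the scaled formulas to within the rounding of the constants.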
From a study of the waveform (42) using broad-band detectors at several
widely spaced locations on the earth, one can deduce the following
information : (i) the direction to the source (which comes from phase
differences in the signal at different detectors in different locations); (ii) the
inclination of the orbit to the line of sight (which comes from the amplitudes
in the two different polarization modes) ; (iii) the direction the stars move in
their orbit (which comes from the + or - sign in equation (42b)); (iv) the
combination (μ^3 M^2)^{1/5} of the reduced and total masses; and (v) the distance r
to the source. If the mass combination (μ^3 M^2)^{1/5} is ≲ 1.5 M_⊙, one can be
fairly sure the binary was made of neutron stars; if it is much larger, one can
be fairly sure that at least one of the bodies was a massive black hole.
Especially intriguing is the possibility (Schutz, 1986b) that, in the case of
neutron stars, the coalescence will produce electromagnetic emission (e.g.
due to an explosion of the less massive star, Blinnikov et al., 1984) that is
strong enough to be detected at earth and thereby to pin down the source's
location with far higher precision than can be obtained from the
gravitational waves. In this case, a redshift will probably be obtainable from
optical observations; and that redshift together with the gravitational-wave
determined distance r will give a value for the Hubble constant. Schutz
(1986b), from a detailed study of the expected noise in future beam detectors,
concludes that the prospects are good thereby to obtain a significantly better
value for the Hubble constant than we now have. Even in the absence of
electromagnetic signals from the coalescence, it may prove possible by
statistical means to determine the Hubble constant using combined data for
a number of coalescences ; see Schutz (1986b) for details.
Clark, van den Heuvel and Sutantyo ( 1979) have estimated, from
neutron-star observations in our own galaxy, that to see three coalescences
of neutron-star binaries per year one must look out to a distance of
100 � i8° Mpc, where the quoted uncertainties are at the 90 % confidence
level. Correspondingly, in Fig. 9.4 are shown the characteristic amplitude
and frequency, for a range of values of the seismic cutoff f_s and
corresponding values of f_c = 1.44 f_s, produced by the coalescence of two 1.4
solar mass neutron stars at 100 Mpc (estimated event rate about three per
year) and at 1 the Hubble distance (event rate about ten per day). Because
much has been learned observationally about the statistics of neutron-star
binaries since these estimates of event rates were made, a careful restudy of
the estimates is much needed.
As Fig. 9.4 shows, future earth based beam detectors may be able to see
black-hole coalescences throughout the universe, so long as the more
massive of the two holes does not exceed 1000 M_⊙/(1 + z), where z is the
hole's cosmological redshift. The coalescence rate for black-hole binaries of
a few solar masses could be of order that for neutron-star binaries (a few per
year at 100 Mpc), or might well be far lower. Particularly intriguing is a
scenario, suggested as very plausible by Shapiro and Teukolsky (1985) and
Quinlan and Shapiro (1987), in which a large fraction of galactic nuclei
create, at some phase of their evolution, a dense cluster of neutron stars and
small-mass black holes which - on a timescale of only a few years - form
tight binaries that coalesce, with the coalesced holes then forming new tight
binaries that coalesce, ... until the cluster goes unstable and collapses to
form a single large hole. This scenario suggests that in typical years the earth
might be hit by a number of spiraling wave bursts from coalescing binaries of
masses 3 M_⊙-1000 M_⊙ at the Hubble distance, z ~ 1. Also intriguing but
much less likely is a scenario discussed by Bond and Carr (1984) in which a
sizable fraction of the mass of the universe is in black-hole binaries with
masses ~100-1000 M_⊙ for which the coalescence rate could be several per
year in the local group of galaxies (distance ~ 1 Mpc).
One or more black hole binaries of any mass up to ~10^8 M_⊙ might have
formed in the nuclei of a reasonable fraction of all galaxies during the past
life of the universe, leading to event rates ≳ 1/year out to the Hubble
distance (cf. Fig. 1 of Rees, 1983) - or they might never form. There is
actually observational evidence for supermassive black-hole binaries
formed by the coalescence of galactic nuclei (Begelman, Blandford and Rees,
1980; Rees, 1983); but the rate of such events probably does not exceed 1/60
years out to the Hubble distance (Rees, 1983).
(f) The fall of stars and small holes into supermassive holes
The supermassive (M_1 ≳ 10^5 M_⊙) black holes thought to inhabit the nuclei
of galaxies might typically grow by accretion on timescales as short as 10^8
years; see, e.g. Section 8.6 of Blandford and Thorne (1979). When such a
hole grows larger than 10^9 M_⊙, normal stars can pass near or plunge
through its horizon without being torn apart tidally, and the number of stars
that so scatter or plunge could well be of order one per year or more (e.g.
Dymnikova, Popov and Zentsova, 1982). For smaller supermassive holes,
scattering or plunging normal stars will be tidally disrupted, reducing the
strength of their waves; but the reduction will not be great, at least in the
case of radial infall, unless the hole is below 10^6 M_⊙ (Nakamura and Sasaki,
1981; Haugan, Shapiro and Wasserman, 1982). For any hole, neutron stars
and satellite holes can scatter or plunge through without enough disruption
to strongly suppress their radiation; but the event rate (per supermassive
hole) will typically be well below one per year.
The wave forms emitted when a star or small hole is scattered by or
plunges into a supermassive hole have been evaluated with high precision
using perturbation formalisms; see, e.g. Detweiler and Szedenits (1979),
Kojima and Nakamura (1984a). The characteristic frequency and amplitude
for typical (non-head-on) impact parameters are
$$ f_c \simeq \frac{1}{20 M_1} = 10^{-4}\ \mathrm{Hz}\left(\frac{M_1}{10^8 M_\odot}\right)^{-1}, \tag{47a} $$
(47b)
where M_1 is the mass of the large hole, M_2 is that of the infalling body, and
r_0 ~ 10 Mpc might give a reasonable event rate since there are ~100 galaxies
as massive as or more massive than our own inside this distance, including
M87 for which observational data suggest a central black hole of mass M_1 ~
4 × 10^9 M_⊙ (Section 8.3.1.4 of Chapter 8). These h_c and f_c are plotted in
Fig. 9.4 for several interesting sets of parameters. It is conceivable that such
plunge bursts will be seen by beam detectors in space, if and when they are
flown.
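To see where the 10^{-4} Hz scale comes from, the short sketch below evaluates the characteristic plunge frequency under the assumption f_c ≃ 1/(20 M_1), with the hole mass converted from solar masses to seconds (G = c = 1). The constant and the function name are illustrative.

```python
T_SUN = 4.925e-6   # G * M_sun / c^3 in seconds (assumed value)


def plunge_frequency(m1_solar):
    """Characteristic frequency (Hz) for a body plunging into a hole of
    mass m1_solar solar masses, assuming f_c = 1 / (20 M_1) in
    geometrized units."""
    return 1.0 / (20.0 * m1_solar * T_SUN)


if __name__ == "__main__":
    for m1 in (1e5, 1e6, 1e8, 4e9):
        print(m1, plunge_frequency(m1))
```

Because f_c scales inversely with M_1, holes of 10^5 to 10^9 solar masses span roughly 0.1 Hz down to 10^{-5} Hz, which is why such bursts are discussed in connection with space-based detectors.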
9.4.2 Periodic sources
(a) Characterization of the waves from periodic sources and the noise in a
detector searching for them
The gravitational waves from a periodic source will be characterized by a
discrete set of frequencies, and the waves at a given frequency will typically
be right-hand or left-hand elliptically polarized; i.e. for some suitable choice
of polarization axes e_{x'}, e_{y'}, they will have the form (similar to that of a
decaying binary, equation (42), but with constant frequency)
$$ h_+(t) = h_{0+} \cos 2\pi f t, \qquad h_\times(t) = \pm h_{0\times} \sin 2\pi f t. \tag{48} $$
If one wishes to quantify in a precise and standard manner all the properties
of these waves, including the orientations of the 'preferred' x' and y'
polarization axes, one might best do so using 'Stokes Parameters' analogous
to those used in electromagnetic theory (Section 15 of Chandrasekhar,
1950). However, in this chapter we shall be concerned only with the
frequencies f of the waves, and at a given emitted frequency, with a suitably
defined characteristic amplitude h_c and a corresponding noise amplitude h_n
in a detector searching for the waves.
As an aid in defining h_c and h_n, consider the following situation (analog of
that for burst sources in Section 9.4.1(a)): a theorist tells us the frequency f,
the phase, and the amplitudes h_{0+}(ι, β, r) and h_{0×}(ι, β, r) to be expected from
Here
(50)
(with ⟨···⟩ denoting an average over ι and β) is the characteristic amplitude
and
$$ h_n(f) = \frac{[S_h(f)/\hat\tau]^{1/2}}{\langle |F_+(\theta, \phi, \psi)|^2 \rangle^{1/2}} \tag{51} $$
i s the noise amplitude. (The he of equation (50) is (1)! larger than one naively
would expect from equation (27). This factor (1)1 is an approximate
correction for the fact that the angle averages in S/N should not be over
squares but, rather, over squares associated with the rotation of the earth
during data collection, then over (squares)1 covering the rest of the sky and
the orientation of the source, followed by a ! power after averaging ; cf. the
discussion preceding equation (30).) Correspondingly, if experimenters wish
to be 90% confident of having seen that n_0th brightest source after ⅓ year of
search, then S/N must exceed 1.655 ≈ 1.7 (Gaussian probability
distribution), and h_c must exceed
$$ h_{3/yr} = 1.7\,h_n \simeq 1.7\,\frac{[S_h(f) \times 10^{-7}\ \mathrm{Hz}]^{1/2}}{\langle F_+^2 \rangle^{1/2}} \quad \text{if } f \text{ and phase are known}. \tag{52a} $$
In the case that theory and electromagnetic observation have failed to tell us
in advance the phase and frequency of the source, except to within Δf, the
experimenters must try ~f/Δf values of the frequency, and correspondingly
Fig. 9.6. The characteristic amplitudes h_c (equation (50)) and frequencies f of waves from
several postulated periodic sources (thin curves), and the sensitivities h_{3/yr} of several existing
and planned detectors (thick curves and circles) (h_{3/yr} is the amplitude h_c of the weakest source
detectable with 90% confidence in a ⅓ yr = 10^7 s integration if the frequency and phase of the
source are known in advance; equation (52a)). The sources shown in the high-frequency
region, f ≳ 10 Hz, are all special cases of rotating, nonaxisymmetric neutron stars (Section
9.4.2(b)). The steeply sloping dotted lines labeled NS Rotation refer to rigidly rotating
neutron stars with moment of inertia I_zz = 10^45 g cm^2, and with various ellipticities ε and
distances r labeled on the lines (equation (55)). The sources in the low-frequency region,
f ≲ 0.1 Hz, are all binary star systems in our galaxy (Section 9.4.2(c)): several specific, known
binaries, which are indicated by name (μ Sco, V Pup, ...); the strongest six spectral lines from
the famous binary pulsar PSR 1913+16; and the estimated strengths of the strongest
white-dwarf ('WD') and neutron-star ('NS') binaries in our galaxy. The detectors are discussed in
detail in the indicated subsections of Section 9.5.
[Fig. 9.6 plot not reproduced. Horizontal axis: frequency f, Hz.]
Smith (1976) and references therein. In old pulsars that have been spun up
by accretion to near-millisecond rotation rates, theory and observational
data suggest that the crust and core are quite well annealed into a nearly
axisymmetric shape (Alpar and Pines, 1985); but in neutron stars that are
only tens or hundreds or thousands of years old, it might well be otherwise
(e.g. Zimmermann, 1978). (ii) The star's internal magnetic field, if sufficiently
strong, could produce sufficient magnetic pressure to distort the star
significantly (Zimmermann, 1978; Gal'tsov, Tsvetkov and Tsirulev, 1984).
However, 'sufficiently strong' means, in the case, e.g., of the Crab Pulsar, ten
times stronger than the star's measured surface field. (iii) If the star is
rotating with a period shorter than a critical rotation period, P_crit ~ 0.7-1.7 ms
(which depends on the star's structure and its temperature-dependent
viscosity), then an instability driven by gravitational radiation reaction
('Chandrasekhar (1970)-Friedman-Schutz (1978)', or 'CFS', instability) will
create and maintain significantly strong hydrodynamic waves in the star's
surface layers and mantle, propagating in the opposite direction to the star's
rotation; and these will radiate strongly. For detailed discussions see
Wagoner (1984), Lindblom (1986, 1987), Friedman, Ipser and Parker
(1986), Schutz (1987), Cutler and Lindblom (1987).
At present we are extremely ignorant of the degree of asymmetry in
rotating neutron stars, and accordingly we are ignorant of the strengths of
the periodic waves to be expected from them . Pessimists will note that there
is no observational evidence in any observed pulsar for sufficient
non-axisymmetry to produce interestingly strong waves. Pessimists will point,
especially, to the extremely small slowdown rate of the 1.6 ms pulsar
PSR 1937+21, which implies such weak radiation reaction that the
characteristic amplitude at earth cannot exceed 1 × 10^{-27} (Fig. 9.6), and the
star's non-axisymmetric ellipticity cannot exceed 3 × 10^{-9}.
Optimists will also point to PSR 1937+21 and some other millisecond
pulsars, and note a reasonably likely scenario for their origin (van den
Heuvel, 1984; Wagoner, 1984): that they were spun up long ago by accretion
from a binary companion until they hit the CFS instability, that they
remained just beyond the instability point for a while, with the spinup torque
of accretion being counterbalanced by gravitational radiation reaction , and
that the accretion stopped long ago leaving the stars plenty of time to anneal
and settle down into their presently observed , highly axisymmetric states.
Given the extreme observational difficulty of finding by electromagnetic
means evidence for rapid neutron-star rotation (see, e.g., Section IV of
Reynolds and Stinebring (1984) for searches in the radio), it may well be that
there are a number of accreting neutron stars in our galaxy now in the CFS
regime, radiating strong gravitational waves. For such a star the energy
being radiated in gravitational waves and that being radiated as
accretion-induced X-rays will both be proportional to the accretion rate; and
consequently the characteristic amplitude of the gravitational waves at earth
will be proportional to the square root of the X-ray flux arriving at earth, F_X:
$$ h_c \simeq 2 \times 10^{-27} \left(\frac{300\ \mathrm{Hz}}{f}\right)^{1/2} \left(\frac{F_X}{10^{-8}\ \mathrm{erg\ cm^{-2}\ s^{-1}}}\right)^{1/2} \tag{53} $$
(Wagoner, 1984). The frequency f of the waves will be f = l v_p/(2πR) where R
hydrodynamic wave, and v_p is the pattern speed of the wave as seen in the
inertial frame of distant observers. The X-ray flux F_X ~ 10^{-8} erg cm^{-2} s^{-1}
is 1/20 that of Sco X-1, the brightest quasi-steady source in the sky and itself a
candidate for a CFS-unstable object. As Fig. 9.6 shows, stars with X-ray
fluxes as low as 1/100 that of Sco X-1 could be interestingly strong sources of
gravitational waves. In such a star the density waves in the surface layers
should modulate the emitted X-rays, but the sensitivities of past X-ray
telescopes have been too poor to detect such rapid and weak modulations.
There is an interesting proposal (Wood et al., 1986) for a new, more sensitive
X-ray telescope designed to search for such modulations in Sco X-1 and
other, weaker X-ray sources. Such a telescope, operated in coordination
with gravitational-wave detectors, might one day give a wealth of new
information about neutron stars.
Even the most rapidly rotating of neutron stars will be smaller than a
reduced wavelength of its emitted gravitational waves and thus can be
described with reasonable accuracy by a slow-motion formalism (Ipser,
197 1). If the emitter is CFS density waves, the multipoles involved are l 3,=
will be emitted at twice the rotation frequency. From the quadrupole variant
of the slow-motion formalism we can then compute that the waves will have
the standard periodic form of equation (48) with amplitudes
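$$ h_{0+} = 2(1 + \cos^2 \iota)\,\frac{(\mathcal{I}_{xx} - \mathcal{I}_{yy})(\pi f)^2}{r}, \qquad h_{0\times} = 4 \cos\iota\,\frac{(\mathcal{I}_{xx} - \mathcal{I}_{yy})(\pi f)^2}{r}. \tag{54} $$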
Here ι is the angle between the neutron star's rotation axis and the line of
sight from the earth, and ℐ_xx and ℐ_yy are the components of the star's
quadrupole moment along the principal axes in its equatorial plane. The
characteristic amplitude of these waves (equation (50)) is
$$ h_c = 8\pi^2 \left(\frac{2}{15}\right)^{1/2} \frac{\varepsilon I_{zz} f^2}{r} = 7.7 \times 10^{-20}\,\varepsilon \left(\frac{I_{zz}}{10^{45}\ \mathrm{g\ cm^2}}\right) \left(\frac{f}{1\ \mathrm{kHz}}\right)^2 \left(\frac{10\ \mathrm{kpc}}{r}\right), \tag{55} $$
where I_zz is the moment of inertia of the star about its rotation axis (so
½ I_zz (πf)^2 is its rotational energy), and
$$ \varepsilon = \frac{\mathcal{I}_{xx} - \mathcal{I}_{yy}}{I_{zz}} \tag{56} $$
is its 'gravitational ellipticity' in the equatorial plane. All neutron stars for
which masses have been measured have M near 1.4 M_⊙; and depending on
the equation of state these masses correspond to 3 × 10^44 g cm^2 ≲ I_zz ≲ 3 ×
10^45 g cm^2. The likely values of the ellipticity ε are far less clear.
The observed slow-down rates of the Crab (f = 60 Hz, r = 2 kpc), Vela
(f = 22 Hz, r = 500 pc), and PSR 1937+21 (f = 1.25 kHz, r = 5 kpc) pulsars,
if due to gravitational radiation reaction (possible but not likely),
correspond to ε ≃ 6 × 10^{-4}, 4 × 10^{-3}, and 3 × 10^{-9} respectively; and to h_c ≃
8 × 10^{-25}, 3 × 10^{-24}, and 1 × 10^{-27}. Zimmermann (1978) argues that
reasonable values for the Crab and Vela are ε ~ 3 × 10^{-6} and 3 × 10^{-5}
corresponding to h_c ~ 4 × 10^{-27} and 2 × 10^{-26}. Alpar and Pines (1985)
suggest reasonable values for PSR 1937+21 (which is old and well annealed)
in the range ε ~ 4 × 10^{-10}-1 × 10^{-11} corresponding to h_c ~ 1 × 10^{-28}.
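The pulsar amplitudes quoted above can be reproduced from the scaled form of the rotating-star amplitude, h_c = 7.7 × 10^{-20} ε (I_zz/10^45 g cm^2)(f/1 kHz)^2 (10 kpc/r). The sketch below is only an arithmetic check; the function name and the default moment of inertia of 10^45 g cm^2 are assumptions made here.

```python
def h_c_rotating_star(eps, f_hz, r_kpc, izz=1e45):
    """Characteristic amplitude of a rigidly rotating, slightly
    non-axisymmetric neutron star.

    eps   : gravitational ellipticity
    f_hz  : gravitational-wave frequency in Hz
    r_kpc : distance in kpc
    izz   : moment of inertia in g cm^2 (assumed default 1e45)
    """
    return 7.7e-20 * eps * (izz / 1e45) * (f_hz / 1e3) ** 2 * (10.0 / r_kpc)


if __name__ == "__main__":
    # (name, ellipticity, frequency in Hz, distance in kpc) as quoted in the text
    pulsars = [
        ("Crab", 6e-4, 60.0, 2.0),
        ("Vela", 4e-3, 22.0, 0.5),
        ("PSR 1937+21", 3e-9, 1250.0, 5.0),
    ]
    for name, eps, f, r in pulsars:
        print(name, h_c_rotating_star(eps, f, r))
```

The quadratic dependence on frequency is why the millisecond pulsar, despite its tiny ellipticity bound, is not hopelessly weaker than the much more distorted Crab.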
Blandford (1984) points out that, if there is a population of young pulsars
(not yet discovered) that are spinning down by gravitational radiation
reaction on a spin-down timescale τ_GW, then (i) the nearest will be at a
distance r ≃ R_G (τ_B/τ_GW)^{1/2}, where τ_B (assumed < τ_GW) is the mean time between
births of these pulsars in our galaxy and R_G is the radius of the galaxy's disk;
(ii) the flux of gravitational-wave energy at earth from the nearest such
pulsar, (3/64π)(2πf)^2 h_c^2, will be equal to [I_zz (πf)^2/τ_GW](4πr^2)^{-1}; and (iii) as a
consequence, the characteristic amplitude of the waves from the nearest one
will be
$$ h_c \simeq \left[\frac{4}{3}\,\frac{I_{zz}}{r^2 \tau_{GW}}\right]^{1/2} \simeq \left[\frac{4}{3}\,\frac{I_{zz}}{R_G^2 \tau_B}\right]^{1/2} \sim 1.1 \times 10^{-25} \left(\frac{10^4\ \mathrm{years}}{\tau_B}\right)^{1/2}, \tag{57} $$
independently of its frequency and ellipticity.
Fig. 9.6 shows values of h_c and f corresponding to some of the above
possibilities.
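A short numerical sketch of Blandford's argument follows. It equates the flux (3/64π)(2πf)^2 h_c^2 to I_zz (πf)^2 / (4π r^2 τ_GW) with r = R_G (τ_B/τ_GW)^{1/2}, which gives h_c = [(4/3) G I_zz / (c^3 R_G^2 τ_B)]^{1/2} in cgs units. The values I_zz = 10^45 g cm^2 and R_G = 10 kpc are assumptions of this sketch (the text does not state which values underlie its numerical estimate), as are the function name and the constants.

```python
import math

G = 6.674e-8        # gravitational constant, cgs
C = 2.99792e10      # speed of light, cm/s
KPC = 3.0857e21     # one kiloparsec in cm
YEAR = 3.156e7      # one year in seconds


def h_c_nearest_pulsar(tau_b_years, izz=1e45, r_g_kpc=10.0):
    """Characteristic amplitude from the nearest young pulsar that spins
    down by gravitational radiation reaction.

    tau_b_years : mean time between births of such pulsars, in years
    izz         : moment of inertia in g cm^2 (assumed)
    r_g_kpc     : radius of the galactic disk in kpc (assumed)
    """
    tau_b = tau_b_years * YEAR
    r_g = r_g_kpc * KPC
    return math.sqrt((4.0 / 3.0) * G * izz / (C ** 3 * r_g ** 2 * tau_b))


if __name__ == "__main__":
    print(h_c_nearest_pulsar(1e4))
```

Note that neither the frequency nor the ellipticity enters: both cancel once the distance to the nearest source is tied to the spin-down timescale.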
If the star does not rotate about a principal axis of its moment of inertia,
then it will precess . When one idealizes the star's interior as rigid, then
although the interior gravity is significantly non-Newtonian, the precession
is still described by the classic equations of Euler (see Thorne and Gürsel,
1983, for a proof); and the resulting waves will have a form that typically will
entail significant spectral components at three frequencies: twice the
rotation frequency, and the rotation frequency plus and minus the
precession frequency (Zimmermann, 1980; Zimmermann and Szedenits,
1979). In reality, pliability of the neutron-star material will cause significant
deviations from rigid rotation ; but these three frequencies may still be
dominant. If the star is idealized as a fluid body deformed by the pressure of
an off-axis internal magnetic field, then the star does not precess and the
radiation is emitted at the rotation frequency and twice the rotation
frequency (Gal'tsov, Tsvetkov and Tsirulev, 1984) .
If and when gravitational waves from rotating neutron stars are detected,
they may carry a wealth of information about the star's structure and
dynamics in the amplitudes and relative phasings of their various spectral
components. Especially interesting may be the evolution of the various
spectral components after a star quake; together with electromagnetic
timing of the post-quake rotation, these may give us new insights into the
coupling of the solid crust or core to the fluid mantle.
$$ h_c = 8 \left(\frac{2}{15}\right)^{1/2} \frac{\mu}{r} (\pi M f)^{2/3} = 8.7 \times 10^{-21} \left(\frac{\mu}{M_\odot}\right) \left(\frac{M}{M_\odot}\right)^{2/3} \left(\frac{100\ \mathrm{pc}}{r}\right) \left(\frac{f}{10^{-3}\ \mathrm{Hz}}\right)^{2/3}, \tag{58} $$
where μ is the reduced mass and M the total mass of the system. This
amplitude is plotted in Fig. 9.6 for a few of the most strongly radiating
known binaries. For lists of the most strongly radiating binaries and their
characteristics see Braginsky (1965) and Douglass and Braginsky (1979).
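For readers who want to evaluate this amplitude for a particular system, here is a minimal sketch of h_c = 8 (2/15)^{1/2} (μ/r)(πMf)^{2/3} in geometrized units converted to seconds. The constants and the function name are assumptions of the sketch; f is the gravitational-wave frequency.

```python
import math

T_SUN = 4.925e-6                 # G * M_sun / c^3 in seconds (assumed value)
PC_S = 3.0857e18 / 2.99792e10    # one parsec expressed in light-seconds


def h_c_binary(mu, M, r_pc, f):
    """Characteristic amplitude of a circular binary.

    mu, M : reduced and total mass in solar masses
    r_pc  : distance in parsecs
    f     : gravitational-wave frequency in Hz
    """
    prefactor = 8.0 * math.sqrt(2.0 / 15.0)
    mu_s = mu * T_SUN
    M_s = M * T_SUN
    r_s = r_pc * PC_S
    return prefactor * (mu_s / r_s) * (math.pi * M_s * f) ** (2.0 / 3.0)


if __name__ == "__main__":
    # Reference case used for the scaled form: mu = M = 1 solar mass,
    # r = 100 pc, f = 1e-3 Hz.
    print(h_c_binary(1.0, 1.0, 100.0, 1e-3))
```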
White-dwarf and neutron-star binaries should also be important emitters
- and they should extend to higher frequencies than ordinary binaries; but
there is a paucity of observational data on them and the example of shortest
known period has f only 3 × 10^{-3} Hz (see above). From the data that do
exist (e.g. Iben and Tutukov, 1984), Lipunov and Postnov (1986), Lipunov,
Postnov and Prokhorov (1987), and Hils et al. (1987) have estimated the
characteristic amplitudes of the strongest white-dwarf and neutron-star
binaries; see Fig. 9.6. For a very detailed treatment of the white-dwarf case
see Evans, Iben and Smarr (1987). The highest frequency to be expected for
any white-dwarf binary in our galaxy is 0.06 Hz since mass transfer from the
less massive star to the more massive begins at or before this frequency; the
highest for any neutron-star binary is 0.007 Hz since any binary of higher
frequency than this would coalesce in a time less than the mean interval
between coalescences, ~10^4 years.
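The 0.007 Hz ceiling for neutron-star binaries can be checked with the quadrupole-formula chirp law, f = (1/π)[(5/256)/(μ M^{2/3} (t_0 - t))]^{3/8}, inverted to give the time remaining before coalescence. The sketch below assumes two 1.4 solar-mass stars; the constants and function names are illustrative.

```python
import math

T_SUN = 4.925e-6   # G * M_sun / c^3 in seconds (assumed value)
YEAR = 3.156e7     # one year in seconds


def time_to_coalescence(mu, M, f):
    """Seconds remaining before coalescence for a circular binary now
    radiating at gravitational-wave frequency f (Hz); masses in solar masses."""
    mu_s = mu * T_SUN
    M_s = M * T_SUN
    return (5.0 / 256.0) / (mu_s * M_s ** (2.0 / 3.0) * (math.pi * f) ** (8.0 / 3.0))


def frequency_before_coalescence(mu, M, tau):
    """Inverse relation: frequency (Hz) a time tau (s) before coalescence."""
    mu_s = mu * T_SUN
    M_s = M * T_SUN
    return (1.0 / math.pi) * ((5.0 / 256.0) / (mu_s * M_s ** (2.0 / 3.0) * tau)) ** (3.0 / 8.0)


if __name__ == "__main__":
    tau = time_to_coalescence(0.7, 2.8, 0.007)
    print("years to coalescence:", tau / YEAR)
```

For these masses the remaining lifetime at 0.007 Hz comes out somewhat under 10^4 years, consistent in order of magnitude with the mean interval quoted above.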
Gravitational radiation reaction plays an important role in driving the
evolution of close binary systems; see Paczynski and Sienkiewicz (1981) for
details.
9.4.3 Stochastic sources
(a) Characterization of stochastic gravitational waves
It is useful to think about stochastic gravitational waves in terms of
traveling-wave normal modes of the gravitational field. As with the
electromagnetic field, there are two modes (because of two polarization
states) for each volume (2πħ)^3 in phase space; and correspondingly, one can
easily show, the energy density per unit logarithmic interval of frequency
divided by the critical energy density ρ_crit ~ 10^{-8} erg cm^{-3} to close the
universe is
divided
n (f)
d
=
E GW/d 3x d n f ___!!____ ( f )3
I _
(59)
Gw - 1037 1 kHz
Pent.
Here n̄ is the average number of quanta in all modes with frequencies of
order f. Below we shall see that Ω_GW(f) is likely to be ≳ 10^{-14} at all
frequencies of interest, and correspondingly the mean number of quanta in
each mode is likely to be ≳ 10^20 - so large that a classical treatment is in
order.
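The order of magnitude of the occupation number can be estimated by direct mode counting. The sketch below assumes two polarization modes per phase-space cell (2πħ)^3, as stated above, which gives an energy density per logarithmic frequency interval of 8π h f^4 n̄ / c^3; dividing by ρ_crit ~ 10^{-8} erg cm^{-3} yields Ω_GW. The constants and function names are assumptions of this sketch.

```python
import math

H_PLANCK = 6.62607e-27   # Planck constant, erg s
C = 2.99792e10           # speed of light, cm/s
RHO_CRIT = 1.0e-8        # critical energy density, erg cm^-3 (rough value)


def omega_gw(n_bar, f):
    """Omega_GW for mean occupation number n_bar at frequency f (Hz),
    from counting two modes per cell (2 pi hbar)^3 of phase space."""
    energy_density = 8.0 * math.pi * H_PLANCK * f ** 4 * n_bar / C ** 3
    return energy_density / RHO_CRIT


def occupation_number(omega, f):
    """Mean number of quanta per mode for a given Omega_GW at frequency f."""
    return omega / omega_gw(1.0, f)


if __name__ == "__main__":
    print("Omega_GW per quantum at 1 kHz:", omega_gw(1.0, 1e3))
    print("quanta per mode for Omega_GW = 1e-14 at 1 kHz:",
          occupation_number(1e-14, 1e3))
```

Even for Ω_GW as small as 10^{-14} the occupation number at kilohertz frequencies is enormous, which is the basis for treating the waves classically.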
In TT coordinates the metric perturbation due to stochastic gravitational
waves, evaluated at any chosen location x^i, will be a sum over contributions
of all the modes of the field
$$ h^{TT}_{jk}(t, x^i) = \sum_K h^{TT}_{K\,jk}(t, x^i). \tag{60} $$
Here the index K labels modes of the field. For stochastic gravitational
waves, the wave field h^TT_Kjk associated with mode K can be regarded as a
'random process' (i.e. a stochastically fluctuating function of time t); and the
total field h^TT_jk at location x^i is the sum over all of the modes' random
processes.
The field of a chosen mode K can be expressed as
$$ h^{TT}_{K\,jk} = h_K(t, x^i)\,e^K_{jk}, \tag{61} $$
where h_K(t) is its scalar wave function and e^K_jk is its constant polarization
tensor, so normalized that e^K_jk e^K_jk = 2 in Cartesian coordinates; cf. equation
(7c). Then h_K is a scalar random process in time (at fixed x^i) and its statistical
properties can be characterized by a spectral density S_{h_K}(f).
I shall assume that the modes are defined in such a way that there is no
significant correlation between their wave fields h^TT_Kjk. As a result, when one
averages the stress-energy tensor (9) of the waves at x^i over a sufficiently
$$ I_f \equiv \frac{dE}{dA\,dt\,df\,d\Omega} = \frac{\pi f^2}{4}\,\frac{1}{\Delta\Omega} \sum_K S_{h_K}. \tag{62} $$
Here E denotes energy, A denotes area, Ω denotes solid angle, and the sum is
over all modes K with propagation directions in the infinitesimal solid angle
ΔΩ. The total energy per unit logarithmic interval (used in defining Ω_GW(f)
above) can be expressed in terms of the specific intensity in the standard way
$$ \Omega_{GW}(f)\,\rho_{crit} = \frac{dE}{d^3x\,d\ln f} = \int f I_f\,d\Omega = \frac{\pi f^3}{4} \sum_K S_{h_K}, \tag{63} $$
As the
for integral
burst and i s over
periodic the entire
waves, so sphere
also and
for the sum
stochastic, i
we s over
shall al l modes.
introduce
asituation:
single characteri s ti c amplitude he that is tied to a specific experimental
the experimenters use two identical gravity-wave receivers
(broad-band or narrow), separated by a distance 4: l. = c/2nf, to search for
isotropic, stochastic waves in the neighborhood of frequency f. The search is
performed by a standard technique (Bendat, Section of Drever, 1958 ; 9
1983 ) : 1
the outputs, h 1 (t) and h 2(t) of detectors and 2 are passed through
i.1denti� calcentered
filter� onwhifrequencies
ch admit onlyf. The Fourier
fi l components
tered outputs in a bandwidth
w 1 (t) and w 2 (t) are
f f +
then multiplied together and integrated for a time tof get a si n gle number
W= n�+i W 1 (t)w 2 (t). This number will consist of a signal due to the identical
stochastic backgrounds contained in w 1 (t) and w 2 (t), and a Gaussian noise
due to the independent noises i n the two detectors
7
(each with
1958 )
the same
spectral densi t y Sh (f)). It turns out (e. g . Chapter of Bendat that the
ratio of the signal to the root-mean-square noise is
(64)
where
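$$ h_c(f) = \left[f \sum_K S_{h_K}(f)\right]^{1/2} = \left[\frac{4}{\pi f} \int I_f\,d\Omega\right]^{1/2} = \left[\frac{4}{\pi f^2}\,\Omega_{GW}(f)\,\rho_{crit}\right]^{1/2} \tag{65} $$
is the characteristic amplitude of the waves, and
$$ h_n(f) = \frac{[f S_h(f)]^{1/2}}{(\tfrac{1}{2}\hat\tau\,\Delta f)^{1/4}\,\langle F_+^2 \rangle^{1/2}} \tag{66} $$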
is the characteristic noise amplitude of the detectors. (Note: ρ_crit/1.7 ×
10^{-8} erg cm^{-3} = (H_0/100 km s^{-1} Mpc^{-1})^2 where H_0 is the Hubble
constant.) Correspondingly, if the experimenters wish to be 90% confident
of having seen the stochastic background during a search of duration τ̂ = ⅓
year, the Gaussian probability distribution for S/N requires S/N = 1.7,
which in turn means that h_c must exceed the noise level
$$ h_{3/yr}(f) = 1.7\,h_n(f) = 2.0 \left(\frac{\Delta f}{10^{-7}\ \mathrm{Hz}}\right)^{-1/4} \frac{[f S_h(f)]^{1/2}}{\langle F_+^2 \rangle^{1/2}}. \tag{67} $$
The characteristic amplitudes h_c of various possible stochastic sources,
and the noise levels h_{3/yr} of various detectors are shown in Fig. 9.7.
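To translate a value of Ω_GW into a characteristic amplitude, one can use the relation h_c = [(4/(π f^2)) Ω_GW ρ_crit]^{1/2}, written in cgs units as h_c = [4 G ρ_crit Ω_GW / (π c^2 f^2)]^{1/2}. The sketch below takes that relation as an assumption, with ρ_crit = 1.7 × 10^{-8} erg cm^{-3} (i.e. H_0 = 100 km s^{-1} Mpc^{-1}); the function name and constants are illustrative.

```python
import math

G = 6.674e-8          # gravitational constant, cgs
C = 2.99792e10        # speed of light, cm/s
RHO_CRIT = 1.7e-8     # critical energy density in erg cm^-3 for H_0 = 100 km/s/Mpc


def h_c_stochastic(omega_gw, f, rho_crit=RHO_CRIT):
    """Characteristic amplitude of an isotropic stochastic background.

    omega_gw : energy density per logarithmic frequency interval, in units
               of the critical density
    f        : frequency in Hz
    """
    return math.sqrt(4.0 * G * rho_crit * omega_gw / (math.pi * C ** 2 * f ** 2))


if __name__ == "__main__":
    for f in (1e-4, 1.0, 100.0):
        print(f, h_c_stochastic(1e-14, f))
```

At fixed Ω_GW the amplitude falls as 1/f, so a background that is marginal for ground-based detectors at 100 Hz corresponds to a far larger h_c in the millihertz band.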
(b) Binary stars
So many binary stars in our galaxy and in other galaxies radiate in the
frequency region f ≲ 0.03 Hz that they should superpose to produce a strong
stochastic background. Lipunov and Postnov (1986), Lipunov, Postnov
and Prokhorov (1987) and Hils et al. (1987) have made careful calculations
of the characteristic amplitude of this stochastic background as a function of
frequency; the results of Hils et al. are shown in Fig. 9.7 for the contribution
of our own galaxy (which should be concentrated in the galactic plane). The
contributions of all other galaxies (which should be isotropic) should be
down from those of our own galaxy by (h_c)_other/(h_c)_us ~ 0.15.
The binary stochastic background in Fig. 9.7 is broken up into
contributions from various types of binaries. Those shown as solid curves
(unevolved binaries, W UMa stars (first discussed by Mironovskii, 1966),
and cataclysmic variables (white-dwarf/normal-star systems)) are rather
firmly based on optical studies of the statistics of these types of stars, and
thus are rather reliable. Those shown dashed (close white-dwarf binaries (see
Evans, Iben and Smarr, 1987, and the above references), and neutron-star
binaries) are based on so little observational data and so much theory that
they are highly uncertain.
The binary background presents a serious potential obstacle to searches
for other kinds of waves in the frequency band 0.03 Hz ≳ f ≳ 10^{-5} Hz where
space-based beam detectors will operate. A broad-band burst can be seen
above this background only if it has h_c(burst) > h_c(background), which means, e.g.,
that a 1 M_⊙ star falling into a supermassive black hole in the Virgo cluster
will not be discernible unless the hole's mass is M_1 < 3 × 10^5 M_⊙ (cf. Figs. 9.4
and 9.7). A periodic source can be seen, after an integration time τ̂, only if it
has h_c(periodic) > (f τ̂)^{-1/2} h_c(background), which means, e.g., that if the close
white-dwarf binaries are as numerous as estimated, the binaries i Boo and SS Cyg
will be discernible only after integration times of τ̂ > 10^7 s. For further
discussion see Evans, Iben and Smarr (1987) and Hils et al. (1987).
[Figure not reproduced. Horizontal axis: frequency f, Hz.]
dependent and vacuum-dependent expansion rate in the very early universe.
Some otherwise plausible models produce so much, Ω_GW(f) ≳ 1, as to be in
violent conflict with the observed current state of the universe (e.g. p. 621 of
Zel'dovich and Novikov, 1983). Other, equally plausible models can
produce so little, Ω_GW(f) ≲ 10^{-14}, that there is no hope of detecting the
waves in the foreseeable future.
In currently fashionable inflationary models of the universe vacuum
fluctuations which initially are smaller than the horizon (ƛ ≪ ℛ_B =
(background radius of curvature)) are driven outside the horizon (ƛ ≫ ℛ_B) by
the inflationary expansion. While outside the horizon, they are 'frozen' with
constant amplitude h. Much later, after inflation ends, non-inflationary
expansion brings them back inside the horizon. The number of quanta in
each mode before entering and after leaving the horizon is
n- [ 1�11 (ff}Ml'( h�x)- : ( h�) 2'
where the first factor is the waves' energy density, the second is the volume
occupied by each mode, and the third is 1/(energy of one graviton). Before
entering the horizon n ≃ ½; so the above relation says that upon leaving it
$$ n_{out} \simeq n_{enter} \left(\frac{ƛ_{leave}}{ƛ_{enter}}\right)^2 \sim \frac{1}{2} \left(\frac{a_{leave}}{a_{enter}}\right)^2, $$
where a is the expansion factor of the universe. Thus, the epoch of amplitude
freezing is actually an epoch of parametric amplification (stimulated
creation of new gravitons); and the total number of gravitons created
depends on the total amount of expansion that occurs while the waves are
outside the horizon. (There will be additional parametric amplification as
the waves emerge from the horizon, ƛ ~ ℛ_B, but in inflationary models that is
generally small compared to the amplification during freezing, ƛ ≫ ℛ_B.) The
total amount of inflationary expansion differs from one inflationary model
to another; and correspondingly, the models can give Ω_GW as large as
unity or Ω_GW too small (≲ 10^{-14}) for there to be hope of detecting the waves.
For discussions of the influence of the equation of state in the early
universe on the spectrum of the amplified waves, see Grishchuk (1977) and
Fig. 4 of Grishchuk and Polnarev (1980). For calculations of the waves
produced by specific inflationary scenarios see Starobinsky (1979),
Rubakov, Sazhin and Veryaskin (1982), Abbott and Wise (1984), Halliwell
and Hawking (1985), Mijic, Morris and Suen (1986) and references therein.
Because the range of possible strengths of primordial waves is so great, we
do not bother to show it in Fig. 9.7 - aside from indicating the values of h_c
corresponding to various values of Ω_GW(f).
(e) Phase transitions
During the early expansion of the universe, there may have been first-order phase transitions associated with QCD interactions and with electroweak interactions. In each of these phase transitions the original phase would be supercooled, by the cosmological expansion, below the equilibrium temperature of the new phase. Bubbles of the new phase would then nucleate at isolated locations and expand at near-light velocity until they have compressed the original phase enough for the two phases to coexist in equilibrium. As Witten (1984) has pointed out, and Hogan (1986) has analyzed in detail, this 'cavitation' should have produced gravitational waves in two ways: (i) directly from expanding bubbles and the sound waves they generate, and (ii) subsequently from the inhomogeneities associated with the two co-existing phases (large-scale density inhomogeneities and corresponding inhomogeneities in the Hubble expansion rate). The resulting gravitational waves should possess a spectrum that peaks at wavelengths which were of order the horizon size when the cavitation occurred. Those wavelengths correspond to frequencies today $f_{max} \sim (2 \times 10^{-7}\ \mathrm{Hz})(kT/1\ \mathrm{GeV})$, where $T$ is the temperature of the phase transition. Hogan's (1986) predicted spectra, shown in Fig. 9.7, thus peak at $f_{max} \sim 2 \times 10^{-8}$ Hz (QCD, $T \sim 100$ MeV) and $f_{max} \sim 2 \times 10^{-5}$ Hz
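As a quick check of the scaling $f_{max} \sim (2 \times 10^{-7}\ \mathrm{Hz})(kT/1\ \mathrm{GeV})$ quoted above, the following sketch evaluates the peak frequency for a given transition temperature. The function name is hypothetical, and the 100 GeV input in the usage example is an illustrative assumption rather than a value from the text.

```python
F_MAX_PER_GEV_HZ = 2e-7  # coefficient quoted in the text, in Hz per GeV of kT


def peak_frequency_hz(kT_gev: float) -> float:
    """Present-day peak frequency of phase-transition waves, f_max ~ 2e-7 Hz * (kT / 1 GeV)."""
    if kT_gev <= 0:
        raise ValueError("kT must be positive")
    return F_MAX_PER_GEV_HZ * kT_gev


if __name__ == "__main__":
    print(peak_frequency_hz(0.1))    # QCD-scale transition, kT ~ 100 MeV
    print(peak_frequency_hz(100.0))  # assumed 100 GeV transition, for illustration only
```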
formation will face severe difficulties. Fig. 9.7 shows the predicted waves (68). From that diagram and the corresponding discussions in Section 9.5 it is clear that several different observational techniques have the prospect of placing cosmic string theory in jeopardy - or, hopefully, of discovering string-produced waves. (The apparent disproof of $\Omega_{GW} \sim 10^{-7}$ coming from 5°-scale anisotropy of the cosmic microwave radiation (Fig. 9.7 and Section 9.5.6(c)) does not in fact constrain cosmic strings, since this observational limit is sensitive only to waves that were present and had (reduced wavelength) ~ (horizon size) at the epoch of recombination - before the strings that produce this wavelength began to vibrate and radiate.)
9.5 Detection of gravitational waves
9 . 5 . 1 Methods of analyzing gravitational wave detectors
When analyzing the performance of a gravitational wave detector, it is
important to pay attention to the size $L$ of the detector compared with the reduced wavelength $\lambdabar$ of the waves it seeks.
If $L \ll \lambdabar$ then the detector can be contained entirely in the proper reference
frame of its center, and the analysis can be performed using non-relativistic
concepts augmented by the quadrupolar gravity-wave force field (3), (5). If
one prefers, of course, one instead can analyze the detector in TT
coordinates using general relativistic concepts and the spacetime metric (8). The two analyses are guaranteed to give the same predictions for the detector's performance, unless errors are made. However, errors are much more likely in the TT analysis than in the proper-reference-frame analysis, because our physical intuition about how experimental apparatus behaves is proper-reference-frame based rather than TT-coordinate based. As an example, we intuitively assume that if a microwave cavity is rigid, its walls will reside at fixed coordinate locations $x^i$. This remains true in the detector's proper reference frame (aside from fractional changes of order $(L^2/\lambdabar^2)h$, which are truly negligible if the detector is small and which the proper-reference-frame analysis ignores). But it is not true in TT coordinates; there the coordinate locations of a rigid wall are disturbed by fractional amounts of order $h$, which are crucial to analyses of microwave-cavity-based gravity wave detectors.
Thus, for small detectors, $L \ll \lambdabar$, the proper-reference-frame analysis is much to be preferred.
For large detectors, $L \gtrsim \lambdabar$, one cannot introduce a proper reference frame that covers the entire detector. Such detectors can only be analyzed using general relativistic concepts in TT coordinates (usually the best) or in some other suitable coordinate system (rarely as good).
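The size criterion above is easy to evaluate for a given detector. The sketch below computes the reduced wavelength $\lambdabar = c/2\pi f$, the ratio $L/\lambdabar$, and the fractional correction $(L/\lambdabar)^2 h$ that the proper-reference-frame analysis ignores. The function names and the cut-off value `threshold=0.1` used to decide whether $L \ll \lambdabar$ are illustrative assumptions, not part of the text.

```python
import math

C_LIGHT_M_S = 2.998e8  # speed of light in m/s


def reduced_wavelength_m(f_hz: float) -> float:
    """Reduced wavelength lambdabar = c / (2 pi f) of a gravitational wave."""
    return C_LIGHT_M_S / (2.0 * math.pi * f_hz)


def neglected_fractional_change(size_m: float, f_hz: float, h: float) -> float:
    """Order of magnitude (L/lambdabar)^2 * h ignored by the proper-frame analysis."""
    return (size_m / reduced_wavelength_m(f_hz)) ** 2 * h


def proper_frame_ok(size_m: float, f_hz: float, threshold: float = 0.1) -> bool:
    """True if L/lambdabar is below an (assumed) threshold standing in for L << lambdabar."""
    return size_m / reduced_wavelength_m(f_hz) < threshold


if __name__ == "__main__":
    print(reduced_wavelength_m(1000.0))   # roughly 48 km at 1 kHz
    print(proper_frame_ok(2.0, 1000.0))   # a 2 m bar at 1 kHz
```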
9 .5 .2 Resonant bar detectors
More effort has been put into resonant bars than into any other type of gravity-wave detector. Weber's original detectors were of the resonant-bar type; all but one of the first-generation (pre-1977) earth-based detectors were of this type; and eight of the world's twelve research groups now building and operating earth-based detectors are working with bars. Of the current bar efforts three are in the United States (the University of Maryland (Weber, 1986), Stanford University (Boughn et al., 1982; Michelson, 1983) and Louisiana State University (Hamilton et al., 1986)); two are in Europe (the University of Rome with its detector sited at CERN (Amaldi et al., 1984) and Moscow University (Braginsky, 1983)); and three are in the Far East (The University of Western Australia in Perth (Blair, 1983), Tokyo University (Owa et al., 1986) and Guangzhou, China (Hu et al., 1986)). The improvements in resonant-bar sensitivities since Weber's first detector have been a factor of roughly 200 in amplitude, corresponding to 40 000 in energy; and significant further improvements are yet to come.
(a) How a resonant-bar detector works
Schematically (Fig. 9.8), a resonant-bar detector consists of a large, heavy, solid bar whose mechanical oscillations are driven by gravitational waves, a transducer that converts information about the bar's oscillations into an electrical signal, an amplifier for the electrical signal, and a recording system. The transducer and amplifier together are sometimes called the sensor.
The transducer typically is mounted on one end of the bar (though other mountings are sometimes used), and it produces an output voltage or current proportional to the displacement $x(t)$ of the bar's end from equilibrium. Although $x(t)$ is a sum of contributions from all the $\sim 10^{29}$ normal modes of the bar, the transducer's output is filtered by the amplifier so that only the contribution of the bar's fundamental normal mode is passed on through. This is accomplished by a band-pass filter centered on the frequency $f_0$ of the fundamental mode, with bandwidth $\Delta f$ somewhat smaller than the difference $f_1 - f_0$ between the bar's fundamental and its first harmonic. Thus, in effect, it is the fundamental mode of the bar that acts as the gravity-wave detector; and all the other normal modes are almost irrelevant.
Since the fundamental mode involves the relative in and out motion of the bar's left and right ends with just one node (at the bar's center), it corresponds to a standing sound wave with wavelength twice the length of the bar. Correspondingly, the bar's length must be
$$ L = \frac{v_s}{2 f_0}, \qquad (69) $$
where $v_s$ is the speed of sound in the bar. Typical solid materials have longitudinal sound speeds of order 5 km s$^{-1}$; astrophysics suggests (Section 9.4) that 1 kHz is a reasonable frequency to search for gravitational waves; and correspondingly the lengths of typical resonant bar detectors are about 2 m and their masses are several tonnes. Notice that equation (69) gives for the ratio of the length of the bar to the reduced wavelength of the
Fig. 9.8. Schematic diagram of a resonant-bar detector for gravitational waves. The angles $(\theta, \phi, \psi)$ characterizing the propagation and polarization directions of the waves relative to the detector are a specialization of the angles $(\theta, \phi, \psi)$ shown in Fig. 9.2.
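The half-wavelength condition for the bar's fundamental mode, $L = v_s/2f_0$, can be checked with a few lines of arithmetic. The sketch below also evaluates the ratio of the bar length to the gravitational wave's reduced wavelength, which under that condition works out to $L/\lambdabar = \pi v_s/c$; this second expression is derived here from $\lambdabar = c/2\pi f_0$ and is offered as an assumption-level illustration, since the corresponding sentence in the text is incomplete. Function names are hypothetical.

```python
import math

C_LIGHT_M_S = 2.998e8  # speed of light in m/s


def bar_length_m(v_sound_m_s: float, f0_hz: float) -> float:
    """Length of a bar whose fundamental longitudinal mode has frequency f0: L = v_s / (2 f0)."""
    return v_sound_m_s / (2.0 * f0_hz)


def length_over_reduced_wavelength(v_sound_m_s: float) -> float:
    """L / lambdabar = pi * v_s / c for a bar tuned to the wave frequency (independent of f0)."""
    return math.pi * v_sound_m_s / C_LIGHT_M_S


if __name__ == "__main__":
    print(bar_length_m(5000.0, 1000.0))             # 2.5 m for v_s = 5 km/s, f0 = 1 kHz
    print(length_over_reduced_wavelength(5000.0))   # of order 5e-5
```

The smallness of this ratio is what justifies treating a bar with the proper-reference-frame analysis of Section 9.5.1.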
all that can be monitored is the total change $\Delta X$ in the complex amplitude from before the wave arrives until after it has passed. Only uncommonly long bursts, those lasting for more than $f_0/\Delta f \sim 100$ cycles, can be monitored in greater detail. In the future, however, there is hope of bringing the sensor noise under better control and thereby opening up the bandwidth to $\Delta f \simeq 0.2 f_0$ to permit detailed monitoring of much shorter bursts (Michelson and Taber, 1984). (For a description of several first-generation broad-band bars see Figure 2 of Drever, 1977 and associated text, and references therein.)
(b) The sensitivity of bar detectors to short bursts
A bar detector couples to the field (equation (26))
$$ h(t) = F_+(\theta, \phi, \psi)\,h_+(t; \iota, \beta) + F_\times(\theta, \phi, \psi)\,h_\times(t; \iota, \beta), \qquad (73) $$
where, if the bar is axially symmetric and the direction $(\theta, \phi)$ and polarization angle $\psi$ are defined as in Fig. 9.8, the beam-pattern factors are
$$ F_+ = \sin^2\theta \cos 2\psi, \qquad F_\times = \sin^2\theta \sin 2\psi. \qquad (74) $$
(See Chapter 37 of MTW for the key elements of a derivation.) We shall presume, throughout this subsection, that $h(t)$ is a burst of such short duration $\Delta t \lesssim \hat\tau = 1/\Delta f$ that the optimal way to search for it is to measure the mean square change $|\Delta X|^2$ it produces in the fundamental mode's complex amplitude. In this case the general formula (29) for the ratio $S^2/N^2$ of the burst's squared signal to the mean-square Gaussian noise in the detector can be reduced to the simple form (see, e.g., Giffard, 1976; Pallottino and Pizzella, 1981; Michelson and Taber, 1984)
$$ \frac{S^2}{N^2} = \frac{\tfrac12 M_{\rm eff} (2\pi f_0)^2 |\Delta X|^2}{k T_n}. \qquad (75) $$
Here $k$ is Boltzmann's constant, $T_n$ is a 'noise temperature' which characterizes the overall noise in the detector, and $M_{\rm eff}$ is an 'effective mass' associated with the fundamental mode, so defined that $\tfrac12 M_{\rm eff} |X|^2 (2\pi f_0)^2$ is the total energy in the mode when it is vibrating with complex amplitude $X$. (Since $X$ is actually the amplitude of motion of the end of the bar, for a bar that has uniform cross-section and is long compared to its diameter, the effective mass is $M_{\rm eff} = \tfrac12(1 + \nu^2)M$ where $\nu$ is the Poisson ratio of the bar's material and $M$ is the bar's mass.)
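The beam-pattern factors (74) and the burst signal-to-noise ratio (75) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, SI units are assumed, and the effective mass, amplitude change and noise temperature are supplied by the caller rather than derived from a detector model.

```python
import math

K_BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann's constant in SI units


def bar_beam_patterns(theta: float, psi: float) -> tuple[float, float]:
    """Beam-pattern factors of an axially symmetric bar, equation (74).

    theta is the angle between the wave direction and the bar axis,
    psi the polarization angle (both in radians).
    """
    s2 = math.sin(theta) ** 2
    return s2 * math.cos(2.0 * psi), s2 * math.sin(2.0 * psi)


def burst_snr_squared(m_eff_kg: float, f0_hz: float,
                      delta_x_m: float, t_noise_k: float) -> float:
    """S^2/N^2 of equation (75): deposited mode energy divided by k * T_n."""
    deposited_energy = 0.5 * m_eff_kg * (2.0 * math.pi * f0_hz) ** 2 * abs(delta_x_m) ** 2
    return deposited_energy / (K_BOLTZMANN_J_PER_K * t_noise_k)


if __name__ == "__main__":
    print(bar_beam_patterns(math.pi / 2, 0.0))  # broadside wave, '+' polarization
    print(burst_snr_squared(1000.0, 900.0, 1e-18, 0.02))  # hypothetical detector numbers
```

Since $S^2/N^2$ is quadratic in $|\Delta X|$, halving the noise temperature buys only a factor $\sqrt{2}$ in detectable amplitude.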
Because the net wave-induced change $\Delta X$ in complex amplitude is independent of the mode's initial complex amplitude, the numerator of equation (75) is the energy that the wave would have deposited in the mode if the mode had been initially unexcited. This deposited energy can conveniently be expressed in terms of the cross-section $\sigma_0(f)$ that the mode would present to the wave if the wave had hit it from an optimal direction (broadside, $\theta = \pi/2$) and with an optimal polarization ($+$ mode with $\psi = 0$):
$$ \tfrac12 M_{\rm eff} (2\pi f_0)^2 |\Delta X|^2 = \int_0^\infty \frac{\pi}{2} f^2 |\tilde h(f)|^2 \sigma_0(f)\,df. \qquad (76) $$
Here $\tilde h(f)$ is the Fourier transform of $h(t)$ (equation (73)), and for an optimal direction and polarization $(\pi/2) f^2 |\tilde h(f)|^2$ would be the energy per unit area per unit frequency ($f \geq 0$) carried by the waves. Because $\sigma_0(f)$ is extremely
(82)
For a broad-band burst that is peaked near the detector's resonant frequency $f_0$ (so the $f_c$ of equation (31a) is approximately $f_0$) and that lasts for a time not much longer than $\Delta t = 1/f_0$, the narrow-band characteristic amplitude (80) will be roughly equal to the broad-band characteristic amplitude (31b). For such bursts, and only for such bursts, it makes sense to plot a bar detector's narrow-band $h_{3/{\rm yr}}$ on the same graph (Fig. 9.4) as a broad-band detector's $h_{3/{\rm yr}}$. Inspiraling binaries do not belong to this class of bursts, so their detection by bars must be discussed separately from Fig. 9.4.
(c) How to optimize the sensitivities of bar detectors
Equation (82) shows that to optimize the sensitivity of a bar detector to
short, broad-band bursts, one must achieve the largest possible frequency-
cable or prongs that suspend the bar, and the residual gas in the vacuum chamber (Braginsky, Mitrofanov and Panov, 1985). If, as is normal, this environment is thermalized at some physical temperature $T_b$ (subscript 'b' for bar or for thermal bath), then these couplings cause the mode's amplitude to execute a random walk (Brownian motion) in the domain $|X| \lesssim X_{th}$ corresponding to an energy $kT_b$:
(84)
The fluctuation-dissipation theorem states that the timescale on which this random walk produces changes of order $X_{th}$ is the same as the timescale $\tau^* = Q/\pi f_0$ for large-amplitude vibrations to be damped frictionally. (Here Q
temperature
(88)
In practice, the experimenters choose a bar early in their experiments, thereby fixing the integrated cross section $\int \sigma_0\,df$; and they then struggle for many years to develop a sensor and its coupling to the bar, and a thermally cold environment, that will minimize the noise temperature $T_n$. Maximizing $\int \sigma_0\,df$ is achieved by maximizing the bar's mass and its velocity of sound (and, to a small extent, optimizing its shape subject to other experimental constraints such as available cryostats). Minimizing $T_n$ is achieved according to equation (88) by (i) maximizing the fundamental mode's quality factor $Q$ (i.e. minimizing its coupling to the rest of the world), (ii) cooling the bar to as low a physical temperature $T_b$ as possible, (iii) maximizing the strength $\beta$ of coupling of the transducer to the bar, (iv) using an amplifier with as low a 'noise number' $kT_a/(2\pi\hbar f_a)$ as possible (the Heisenberg uncertainty principle limits the noise number to be $\gtrsim 1$; Weber, 1959; Heffner, 1962; Caves, 1982), and (v) struggling to get good impedance matching of the transducer and amplifier (a requirement for equation (88) to be valid).
(d) Parameters of first- and second-generation bar detectors
The first-generation bar detectors (pre-1977) were all made of aluminum, weighed roughly 1.5 tonnes, and had $f_0 \simeq 1.6$ kHz and $Q \sim 10^5$; they all operated at room temperature, $T_b \sim 300$ K; and most used piezo-electric transducers - i.e. crystals glued to the bar which produce small voltages when squeezed. For most the coupling of the transducer to the bar was weak, $\beta \lesssim 10^{-4}$; but those in Britain achieved strong coupling, $\beta \simeq 0.2$ and hence wide bandwidth $\Delta f/f_0 \sim 1$ at the price of reducing the bar's $Q$ from $Q \sim 10^5$ to $Q \sim 2000$. The best amplifiers that could be impedance matched to the piezo-electric transducers had rather large noise numbers. For these first-generation bars the integrated cross-sections were $\int \sigma_0\,df \simeq 2 \times 10^{-21}$ cm$^2$ Hz, and the lowest detector noise temperatures were $T_n \simeq 4$ K corresponding to a minimum detectable burst amplitude with $\tfrac13$ year of observation $h_{3/{\rm yr}} \sim 3 \times 10^{-16}$. Despite great effort in the early 70s by excellent experimenters, there was great room for improvement. (For a thorough review of the first-generation experiments see Amaldi and Pizzella, 1979; for other reviews, see Drever, 1977, Tyson and Giffard, 1978 and Weber, 1986.)
In moving into the second generation, almost all the groups cooled their bars to liquid helium temperatures ($T_b = 1.5$-$4$ K rather than 300 K).
room temperature, but with a sensitivity $h_{3/{\rm yr}} \simeq 1.6 \times 10^{-16}$, two times better than that of any of the first-generation room-temperature bars (Hu et al., 1986). The Moscow group with its small silicon and sapphire bars operates at a much higher frequency than any of the other groups, $f_0 \simeq 8$ kHz; by the time this book is published they will likely be on the air with $h_{3/{\rm yr}} \simeq 4 \times 10^{-17}$. The Tokyo group, on the other hand, has chosen a much lower frequency than the others, $f_0 \simeq 60$ Hz and is now operating a narrow-band search for periodic waves from the Crab pulsar (see below). These sensitivities are shown in Fig. 9.4, along with those of other kinds of detectors. For details of the present detector configurations and near-term plans, see the gravitational-wave articles in the proceedings of recent conferences (Ruffini, ed., 1986; MacCallum, ed., 1987).
(89 )
and with the beam-factor averages having the values (79). In particular, the detector's characteristic noise amplitude (equation (51)) is
$$ h_n = \left[15\,\frac{G}{c^3}\,kT_b\,\frac{1}{\int \sigma_0\,df}\,\frac{1}{Q f_0 \hat\tau}\right]^{1/2}, \qquad (90) $$
and the brightest source that can be seen with 90% confidence in $\hat\tau = \tfrac13$ year of integration is
$$ h_{3/{\rm yr}} = 1.7\,h_n = 3.9 \times 10^{-25} \left(\frac{T_b}{1\ {\rm K}}\right)^{1/2} \left(\frac{10^{-21}\ {\rm cm^2\ Hz}}{\int \sigma_0\,df}\right)^{1/2} \left(\frac{1000\ {\rm Hz}}{f_0}\right)^{1/2} \left(\frac{10^7}{Q}\right)^{1/2}. \qquad (91) $$
The Tokyo group is currently carrying out a search for gravitational
waves from the Crab pulsar using the above technique (Owa et al., 1986).
Their 74 kg, cryogenically cooled antenna has $Q = 2.1 \times 10^7$, $T_b = 4$ K, $f_0 = 60$ Hz, and $\int \sigma_0\,df \simeq 2.2 \times 10^{-27}$ cm$^2$ Hz (so low because for noncylindrical antennas with frequency $f_0$ lowered by special shaping,
J a0 df oc f5 ; Hirakawa et al., 1976). Correspondingly it has h 3 ;yr "'
$3 \times 10^{-22}$. No other present bars are optimized for periodic waves, since there are no known sources in their frequency bands (~900 Hz and ~8 kHz). However, with the technology of present burst-optimized bars it should be possible to achieve the thermal-noise-limited sensitivity (91) with $T_b = 4$ K, $\int \sigma_0\,df \simeq 8 \times 10^{-21}$ cm$^2$ Hz, $f_0 \simeq 900$ Hz, and $Q \simeq 5 \times 10^6$ corresponding to $h_{3/{\rm yr}} \simeq 4 \times 10^{-25}$ (Stanford, LSU, Rome, Maryland); and $T_b = 4$ K, $\int \sigma_0\,df \simeq 2 \times 10^{-23}$ cm$^2$ Hz, $f_0 \simeq 8$ kHz, and $Q \simeq 2 \times 10^9$ corresponding to $h_{3/{\rm yr}} \simeq 1.4 \times 10^{-25}$ (Moscow). See Fig. 9.6.
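The thermal-noise-limited sensitivity to periodic waves can be evaluated numerically. The sketch below assumes the noise amplitude has the form $h_n = [15\,(G/c^3)\,kT_b / (\int\sigma_0\,df\; Q f_0 \hat\tau)]^{1/2}$ with $h_{3/{\rm yr}} = 1.7\,h_n$ and $\hat\tau = \tfrac13$ year, in cgs units; the function names and rounded constants are choices made for this illustration.

```python
import math

# Physical constants in cgs units (rounded).
G_CGS = 6.674e-8        # cm^3 g^-1 s^-2
C_CGS = 2.998e10        # cm s^-1
K_B_CGS = 1.381e-16     # erg K^-1
THIRD_YEAR_S = 3.156e7 / 3.0  # integration time of one third of a year, in seconds


def h_n_periodic(t_bath_k: float, sigma_int_cm2_hz: float, f0_hz: float,
                 q: float, tau_s: float = THIRD_YEAR_S) -> float:
    """Thermal-noise amplitude for periodic waves, assumed form of equation (90)."""
    return math.sqrt(15.0 * G_CGS * K_B_CGS * t_bath_k
                     / (C_CGS ** 3 * sigma_int_cm2_hz * q * f0_hz * tau_s))


def h_3yr_periodic(t_bath_k: float, sigma_int_cm2_hz: float,
                   f0_hz: float, q: float) -> float:
    """90%-confidence sensitivity after one third of a year: 1.7 * h_n."""
    return 1.7 * h_n_periodic(t_bath_k, sigma_int_cm2_hz, f0_hz, q)


if __name__ == "__main__":
    # Parameters quoted in the text for burst-optimized bars.
    print(h_3yr_periodic(4.0, 8e-21, 900.0, 5e6))    # text quotes about 4e-25
    print(h_3yr_periodic(4.0, 2e-23, 8000.0, 2e9))   # text quotes about 1.4e-25
```

With these inputs the sketch reproduces the quoted sensitivities to within a few per cent, the residual coming from rounding of the constants and of the integration time.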
When searching for stochastic background it is desirable to open up the bandwidth $\Delta f$ until the sensor noise becomes almost as large as the bar's thermal noise ($\Delta f \simeq 0.5$ Hz for the present Stanford bar). With $S_h(f)$ then given (approximately) by the thermal-noise spectral density (equation (89)), the noise amplitude $h_n$ and the 90%-confidence $\tfrac13$-year sensitivity $h_{3/{\rm yr}}$ for stochastic background become (equations (66) and (67))
$$ h_n(f) \simeq \left[15\,\frac{G}{c^3}\,kT_b\,\frac{1}{\int \sigma_0\,df}\,\frac{1}{Q}\,\frac{1}{\sqrt{\tfrac12 \hat\tau\,\Delta f}}\right]^{1/2}, \qquad (92) $$
$$ h_{3/{\rm yr}} = 1.7\,h_n \simeq 8 \times 10^{-22} \left(\frac{T_b}{1\ {\rm K}}\right)^{1/2} \left(\frac{10^{-21}\ {\rm cm^2\ Hz}}{\int \sigma_0\,df}\right)^{1/2} \left(\frac{10^7}{Q}\right)^{1/2} \left(\frac{1\ {\rm Hz}}{\Delta f}\right)^{1/4}. \qquad (93) $$
This $h_{3/{\rm yr}}$ is shown in Fig. 9.7 for the parameters of 1987 bar technology (those given in the last paragraph, plus $\Delta f \simeq 1$ Hz). For details of searches for stochastic background that were carried out using first-generation bar detectors, see Hough et al. (1975) and Hirakawa and Narihara (1975). For discussions of sensitivity that are more detailed and sophisticated than the above sketch, see Hirakawa, Owa and Iso (1985), Weiss (1979) and references therein.
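The stochastic-background sensitivity can be evaluated in the same way as the periodic one. The sketch below assumes $h_n = [15\,(G/c^3)\,kT_b/(\int\sigma_0\,df\;Q)]^{1/2}\,(\tfrac12\hat\tau\,\Delta f)^{-1/4}$ with $\hat\tau = \tfrac13$ year and $h_{3/{\rm yr}} = 1.7\,h_n$, in cgs units; the function name and rounded constants are illustrative choices.

```python
import math

# Physical constants in cgs units (rounded).
G_CGS = 6.674e-8
C_CGS = 2.998e10
K_B_CGS = 1.381e-16
THIRD_YEAR_S = 3.156e7 / 3.0  # one third of a year in seconds


def h_3yr_stochastic(t_bath_k: float, sigma_int_cm2_hz: float, q: float,
                     bandwidth_hz: float, tau_s: float = THIRD_YEAR_S) -> float:
    """Thermal-noise-limited 90%-confidence sensitivity to a stochastic background.

    Assumed form: 1.7 * sqrt(15 G k T_b / (c^3 * integral(sigma_0 df) * Q))
                  / (tau * bandwidth / 2) ** 0.25
    """
    thermal = math.sqrt(15.0 * G_CGS * K_B_CGS * t_bath_k
                        / (C_CGS ** 3 * sigma_int_cm2_hz * q))
    return 1.7 * thermal / (0.5 * tau_s * bandwidth_hz) ** 0.25


if __name__ == "__main__":
    # Reference parameters: T_b = 1 K, 1e-21 cm^2 Hz, Q = 1e7, 1 Hz bandwidth.
    print(h_3yr_stochastic(1.0, 1e-21, 1e7, 1.0))  # about 8e-22
```

Note the weak fourth-root dependence on bandwidth: widening $\Delta f$ by a factor of 16 improves the sensitivity only by a factor of 2.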
detectors using the present kinds of sensors : the fundamental mode of a bar,
being highly decoupled from the rest of the world, can be regarded as a
simple harmonic oscillator with mass $M_{\rm eff}$ and angular frequency $\omega_0 = 2\pi f_0$. As such, it is subject to the laws of quantum mechanics for oscillators: its generalized position $x$ and momentum $p$ must be regarded as hermitian operators that fail to commute, $[x, p] = i\hbar$; and, correspondingly, the real and imaginary parts of its complex amplitude,
$$ X_1 = x \cos\omega_0 t - \left(\frac{p}{M_{\rm eff}\,\omega_0}\right) \sin\omega_0 t, \qquad X_2 = x \sin\omega_0 t + \left(\frac{p}{M_{\rm eff}\,\omega_0}\right) \cos\omega_0 t, \qquad (94) $$
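A useful property of the quadrature amplitudes $X_1$ and $X_2$ is that, for a freely evolving oscillator, they are constants of the motion; only an external force (such as a gravitational wave) changes them. The sketch below verifies this for a classical, undriven oscillator, which is a simplifying assumption made for illustration; the quantum operators obey the same algebraic definitions. All names and sample values are hypothetical.

```python
import math


def free_oscillator(x0: float, p0: float, t: float,
                    m_eff: float, omega0: float) -> tuple[float, float]:
    """Position and momentum at time t of an undriven classical harmonic oscillator."""
    c, s = math.cos(omega0 * t), math.sin(omega0 * t)
    x = x0 * c + (p0 / (m_eff * omega0)) * s
    p = p0 * c - m_eff * omega0 * x0 * s
    return x, p


def quadratures(x: float, p: float, t: float,
                m_eff: float, omega0: float) -> tuple[float, float]:
    """Real and imaginary parts (X1, X2) of the complex amplitude, as in equation (94)."""
    c, s = math.cos(omega0 * t), math.sin(omega0 * t)
    x1 = x * c - (p / (m_eff * omega0)) * s
    x2 = x * s + (p / (m_eff * omega0)) * c
    return x1, x2


if __name__ == "__main__":
    m_eff, omega0 = 1000.0, 2.0 * math.pi * 900.0  # hypothetical bar parameters (SI)
    x0, p0 = 1e-18, 3e-12
    for t in (0.0, 1e-4, 0.37):
        x, p = free_oscillator(x0, p0, t, m_eff, omega0)
        print(quadratures(x, p, t, m_eff, omega0))  # same pair at every t
```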
(95)
(97)
Notice that this standard quantum limit places the severe constraint
$$ h_{3/{\rm yr}} \gtrsim 4.4 \times 10^{-20} \left(\frac{f_0}{1000\ {\rm Hz}}\right)^{1/2} \left(\frac{10^{-21}\ {\rm cm^2\ Hz}}{\int \sigma_0\,df}\right)^{1/2} \qquad (99) $$
on the detector's burst sensitivity (82). It is fairly likely, though far from certain, that the strongest kilohertz-frequency bursts striking the earth three times per year have characteristic amplitudes $h_c < 10^{-20}$ (see Section 9.4.1); and, correspondingly, it may turn out to be crucial for bar detectors of the future to circumvent the standard quantum limit (98), (99).
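To compare the standard quantum limit with expected burst amplitudes, constraint (99) can be wrapped in a small function. The square-root dependence on $f_0$ and on $1/\int\sigma_0\,df$ used here is an assumption (it is what one expects if the limit corresponds to depositing roughly one quantum $2\pi\hbar f_0$ in the mode); the function name is hypothetical.

```python
import math


def sql_burst_limit(f0_hz: float = 1000.0,
                    sigma_int_cm2_hz: float = 1e-21) -> float:
    """Standard-quantum-limit bound on h_3/yr for a bar, assumed scaling of constraint (99)."""
    if f0_hz <= 0 or sigma_int_cm2_hz <= 0:
        raise ValueError("frequency and integrated cross-section must be positive")
    return 4.4e-20 * math.sqrt(f0_hz / 1000.0) * math.sqrt(1e-21 / sigma_int_cm2_hz)


if __name__ == "__main__":
    print(sql_burst_limit())               # reference bar: 4.4e-20
    print(sql_burst_limit(900.0, 8e-21))   # parameters of a present burst-optimized bar
```

Even for the larger integrated cross-section of present bars the bound stays above $10^{-20}$, which is why the text regards evading the limit as potentially crucial.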
The uncertainty principle (96) suggests a promising method for circumventing the standard quantum limit (Thorne et al., 1978; Braginsky, Vorontsov and Khalili, 1978): one should devise a new kind of sensor that measures $X_1$ with high accuracy, while giving up accuracy on $X_2$. Such sensors, called 'back-action-evading sensors' (a special case of 'quantum non-demolition sensors'), are now under development in a number of laboratories (Braginsky, 1983; Bocko and Johnson, 1984; Oelfke, 1983; Blair, 1982); and they may make possible bar sensitivities in the 1990s that will beat the standard quantum limit by modest factors. For a detailed review of quantum non-demolition measurements - i.e. measurements that do not change the quantum state of the system being measured - see Caves (1983).
Although a back-action-evading sensor gives up accuracy on one of the
wave's two quadrature components, that accuracy can be regained by
looking at the same wave with two different detectors : on one detector, with
complex amplitude X = X 1 + i X 2 , measure X 1 with high accuracy and X 2
with poor; on the other, with complex amplitude Y = Y 1 + iY2 , measure Y2
with high accuracy and Y1 with poor. From X 1 infer the detailed evolution
of one of the wave's two quadrature components; from Y2 infer the
evolution of the other. In this way , in principle, the quantum mechanical
properties of the detector can be completely circumvented and the only
constraints of principle on the accuracy of measurement are associated with
quantization of the waves themselves. For further discussion and details see
the reviews by Caves et al. (1980); Caves (1983); Braginsky, Vorontsov and Thorne (1980).
It is worth noting that a back-action-evasion measurement, ideally performed, should drive the bar's fundamental mode into a 'squeezed state' (Hollenhorst, 1979). Squeezed states have been studied extensively in recent years in the context of quantum optics (see, e.g. Schumaker, 1986; and Walls, 1983); and we shall return to them in Section 9.5.3(f) below when discussing beam detectors.
9 .5 .3 Beam detectors
(a) A brief history of beam-detector research
The germ of the idea of a laser-interferometer gravitational-wave detector
('beam detector') can be found in Pirani (1956); but - so far as I am aware - the first explicit suggestion of such a detector was made by Gertsenshtein and Pustovoit (1962). In the mid-1960s Joseph Weber, unaware of the
Gertsenshtein-Pustovoit work , reinvented the idea but left it lying in his
laboratory notebook unpublished and unpursued. In 1970 Rainer Weiss at
MIT, unaware of Gertsenshtein-Pustovoit or Weber, reinvented the idea
and carried out a detailed design and feasibility study (Weiss, 1972) in which
many of the techniques now being used were conceived . Unfortunately,
Weiss was unable to obtain funding to push forward with a significant
experimental effort.
Robert Forward at Hughes Research Laboratories in Malibu , California,
having learned the concept of the beam detector from Weber (his former
thesis advisor), was motivated indirectly by Weiss in 1971 to construct a prototype detector with funding from Hughes. By 1972 Forward and his colleagues at Hughes were operating the world's first prototype beam detector - an instrument that demonstrated the idea could really work, and that was remarkably sensitive considering the modest effort put into it: $[S_h(f)]^{1/2} \simeq 2 \times 10^{-16}$ Hz$^{-1/2}$ between 2500 Hz and 25 000 Hz, corresponding to $h_{3/{\rm yr}} \simeq 1 \times 10^{-13}$ for 2500 Hz bursts (Moss, Miller and Forward, 1971; Forward and Moss, 1972; Forward, 1978). Regrettably, Forward could not obtain funds to move from this first prototype to a more sophisticated instrument; so his project was shut down.
With the completion of the first generation of bar detectors in 1975, each
experimental group that decided to stay in the field looked carefully at a
variety of possibilities for sensitivity improvement. While most groups
decided to stick with bars, two switched to beam detectors : Munich, led by
H . Billing, and Glasgow , led by Ronald Drever with Jim Hough second in
command. The Munich group was strongly influenced by a proposal to
develop beam detectors that Weiss had submitted to NSF , and that NSF
had refused to fund ; and so Munich pushed forward (Winkler, 1977) along
the lines that Weiss had hoped to follow , using a Michelson interferometer
design (see below) . The Glasgow group first built a small Michelson
interferometer (Drever et al., 1977), then switched in 1977 to a new Fabry-Perot design invented by Drever (Drever et al., 1980).
In 1979 Caltech managed to attract Drever away from Glasgow (part-time at first, full-time later), leaving Hough as the Glasgow leader. At Caltech Drever started up a beam-detector project; and NSF, finally recognizing that beam detectors were worth funding, agreed to support both Weiss at MIT to develop his original idea of a Michelson system and Drever at Caltech to develop his Fabry-Perot system. More recently, in 1983, Alain Brillet initiated a beam-detector effort in Orsay, France (Brillet and Tourrenc, 1983; Brillet, 1985).
Munich, Glasgow , Caltech and MIT all now have working beam
detectors with amplitude sensitivities ~2000 times better than that of Forward's first prototype but ~5 times worse than the best bars. These detectors are small-scale (1-40 m) prototypes for the full-scale (several kilometer) beam detectors that will be required for real success. Design and costing studies are now underway for the full-scale systems (called 'Laser Interferometer Gravity Wave Observatories' or LIGOs); for details of these studies, see Linsay et al. (1983), Drever et al. (1985), Maischberger et al. (1985), Winkler et al. (1986), Hough et al. (1986). There is hope that full-scale LIGOs will be constructed in the late 1980s and early 1990s and will be operating in the mid- to late-1990s with sensitivities in the region where
gravity waves are expected.
waves will be weak above this frequency (see page 158 of Thorne, 1978) ; and
they have their best sensitivities at frequencies below 1 kHz. Correspondingly, the waves they seek all have reduced wavelengths $\lambdabar > 5$ km and most have $\lambdabar > 50$ km. Since the planned detectors all have sizes $L \lesssim 4$ km, the condition $L \ll \lambdabar$ for use of a 'proper-reference-frame analysis' is
satisfied, though only marginally in extreme cases. I shall use such an
analysis in the discussion below. For an outline of the alternative, TT
analysis, see, e.g., Exercise 37.6 of MTW.
A beam detector consists of one or more receivers that are operated
simultaneously, with cross-correlated outputs - the cross-correlation , as
usual, being the key to removing spurious, non-Gaussian noise. A simple
version of a Michelson-type receiver is shown in Fig. 9.9, three-dimensionally in part (a) and as seen from above in part (b). (Ignore for the moment the propagation and polarization pieces of part (a).) The receiver consists of three masses which hang on wires from overhead supports and swing like pendula. The masses are arranged at the ends and corner of a right-angle L. When a gravitational wave propagates vertically through the receiver with polarization axes along the L ('+' polarization), its
quadrupolar force field (5) pushes together the masses on one arm of the L while pushing apart the masses on the other arm. In the next half-cycle of the wave, the directions of the pushes are reversed. Since the waves being sought have frequencies $f$ far above the 1 Hz swinging frequency of the pendula, the pendular restoring forces have no opportunity to make themselves felt: the masses respond to the gravity-wave pushes as though they were free. With the origin of the proper reference frame placed on the central mass, the central mass is left unaffected while the end masses oscillate longitudinally with displacements
$$ \delta x(t) = \tfrac12 L h_+(t) \quad \text{for mass on } x \text{ axis}, \qquad (100a) $$
$$ \delta y(t) = -\tfrac12 L h_+(t) \quad \text{for mass on } y \text{ axis} \qquad (100b) $$
(equation (6)). Here $L$ is the (approximately equal) length of each arm. Correspondingly, there is an oscillation in the difference $l(t)$ of the arm lengths, $\delta l(t) = \delta x(t) - \delta y(t)$, given by
$$ \delta l(t) = h_+(t) L. \qquad (101) $$
It is straightforward to show that in the more general case of a wave which impinges from a direction $(\theta, \phi)$ on the sky with polarization axes rotated at an angle $\psi$ relative to the constant-$\phi$ plane (Fig. 9.9(a)), the difference in arm lengths $l$ oscillates as
$$ \delta l(t) = h(t) L, \qquad (102) $$
where $h(t)$ has the standard form (26)
$$ h(t) = F_+(\theta, \phi, \psi)\,h_+(t; \iota, \beta) + F_\times(\theta, \phi, \psi)\,h_\times(t; \iota, \beta), \qquad (103) $$
with beam-pattern factors (cf. Forward, 1978; Rudenko and Sazhin, 1980; Estabrook, 1985; Schutz and Tinto, 1987)
$$ F_+(\theta, \phi, \psi) = \tfrac12 (1 + \cos^2\theta) \cos 2\phi \cos 2\psi - \cos\theta \sin 2\phi \sin 2\psi, \qquad (104a) $$
$$ F_\times(\theta, \phi, \psi) = \tfrac12 (1 + \cos^2\theta) \cos 2\phi \sin 2\psi + \cos\theta \sin 2\phi \cos 2\psi. \qquad (104b) $$
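The beam-pattern factors (104a), (104b) and the arm-length response (102) are simple to evaluate. The following sketch is a direct transcription; the function names are hypothetical, angles are in radians, and the 4 km arm length in the usage example is just a representative full-scale value.

```python
import math


def michelson_beam_patterns(theta: float, phi: float, psi: float) -> tuple[float, float]:
    """Beam-pattern factors (F_plus, F_cross) of a right-angle Michelson receiver."""
    a = 0.5 * (1.0 + math.cos(theta) ** 2) * math.cos(2.0 * phi)
    b = math.cos(theta) * math.sin(2.0 * phi)
    f_plus = a * math.cos(2.0 * psi) - b * math.sin(2.0 * psi)
    f_cross = a * math.sin(2.0 * psi) + b * math.cos(2.0 * psi)
    return f_plus, f_cross


def arm_length_difference(h_plus: float, h_cross: float, arm_length_m: float,
                          theta: float = 0.0, phi: float = 0.0,
                          psi: float = 0.0) -> float:
    """Change delta_l = h * L in the arm-length difference, with h = F+ h+ + Fx hx."""
    f_plus, f_cross = michelson_beam_patterns(theta, phi, psi)
    return (f_plus * h_plus + f_cross * h_cross) * arm_length_m


if __name__ == "__main__":
    # Vertically incident '+' wave of amplitude 1e-21 on 4 km arms.
    print(arm_length_difference(1e-21, 0.0, 4000.0))
```

For vertical incidence with polarization axes along the arms the factors reduce to $F_+ = 1$, $F_\times = 0$, recovering equation (101); rotating $\psi$ merely mixes the two factors while leaving $F_+^2 + F_\times^2$ unchanged.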
The difference of arm lengths $l(t)$ is monitored by Michelson interferometry: a beam splitter and two mirrors are attached to the corner mass as shown in Fig. 9.9(b), and one mirror is attached to each end mass. A laser beam shines through a hole in the corner mass and onto the beam splitter, which directs half the beam toward each end mass. The mirrors on the end masses reflect the beams back toward the corner-mass mirrors, which in turn reflect the beams back to the end masses, which reflect the beams back through holes in the corner-mass mirrors and onto the beam splitter where they are recombined. Part of the recombined beam goes out one side of the beam splitter toward the laser (ignore for now the 'recycling mirror' in (b); it is absent in the simple version of the receiver being discussed here); the other part of the recombined beam goes out the other side toward a photodetector. Oscillations in the arm-length difference $\delta l(t)$ produce oscillations in the relative phases of the recombining light, and thence oscillations in the fraction of the light which goes to the photodetector versus that which goes back toward the laser. The photodetector, by monitoring the oscillations in received intensity, in effect is monitoring the oscillations $\delta l(t)$ of arm-length difference and thence the gravity-wave oscillations $h(t)$.
In practice the laser beams are made to bounce back and forth in the arms not just twice as shown in Fig. 9.9(b), but rather a large number $B$ of round trips, making $B$ distinct spots on each end mirror. In the simple case that the gravity-wave-induced arm-length difference $\delta l = Lh$ does not change much during these many round trips (see subsection (e) below for the case of large change), the bouncing light beam will build up during its $B$ trips a total phase delay
(105)
where $\lambdabar_e = \lambda_e/2\pi$ is the reduced wavelength of the light ($\lambdabar_e = 0.0818$ microns for the light from the argon ion lasers currently being used). This phase delay can be monitored, by the photodetector, with a precision $\Delta\Phi = 1/(N_\gamma \eta)^{1/2}$, where $N_\gamma$ is the total number of photons that the laser puts out during the time $\hat\tau$ over which the photodetector intensity is averaged, and $\eta$ is the photon counting efficiency of the photodetector ($\eta \sim 0.4$-$0.9$). When searching for a gravity-wave burst with characteristic frequency $f$, it is optimal to average the photodetector intensity for half a gravity-wave period, $\hat\tau \simeq 1/2f$; and correspondingly the phase delay can be inferred with a photon-counting-noise ('shot-noise') precision
$$ \Delta\Phi_{\rm shot} = \frac{1}{(N_\gamma \eta)^{1/2}} = \left(\frac{\hbar c/\lambdabar_e}{I_0\,\eta\,(1/2f)}\right)^{1/2}, \qquad (106) $$
where $I_0$ is the laser output power. By comparing equations (105) and (106) we obtain a rough estimate of the amplitude of a gravity-wave burst that produces a signal of the same strength as the rms shot noise
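The shot-noise phase precision can be estimated from the photon-counting argument above: the laser emits $N_\gamma = I_0 \hat\tau / (\hbar c/\lambdabar_e)$ photons during the averaging time $\hat\tau = 1/2f$, and the phase precision is $1/(N_\gamma \eta)^{1/2}$. The sketch below implements only this step, in SI units; the function name, the 1 W laser power and the efficiency of 0.5 are illustrative assumptions.

```python
import math

HBAR_J_S = 1.054571817e-34   # reduced Planck constant
C_LIGHT_M_S = 2.99792458e8   # speed of light


def shot_noise_phase(laser_power_w: float, f_gw_hz: float, eta: float,
                     lambdabar_light_m: float = 0.0818e-6) -> float:
    """Shot-noise-limited phase precision 1 / sqrt(N_gamma * eta).

    N_gamma is the number of photons emitted in half a gravity-wave period;
    the default reduced wavelength is the argon-ion value quoted in the text.
    """
    averaging_time_s = 1.0 / (2.0 * f_gw_hz)
    photon_energy_j = HBAR_J_S * C_LIGHT_M_S / lambdabar_light_m
    n_photons = laser_power_w * averaging_time_s / photon_energy_j
    return 1.0 / math.sqrt(n_photons * eta)


if __name__ == "__main__":
    # Assumed example: 1 W of laser light, 1 kHz burst, 50% counting efficiency.
    print(shot_noise_phase(1.0, 1000.0, 0.5))
```

The precision improves only as the square root of the laser power, so a fourfold increase in power halves the phase noise.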
trapped in the cavity, building up to high intensity before exiting back
toward the beam splitter. Slight changes in the length of each cavity drive
the cavity slightly off resonance and thereby produce sharp changes in the
phase of the exiting light. Consequently, when the exiting light beams from
the two cavities recombine at the beam splitter, their relative phase is highly
sensitive to slight modulations $\delta l$ of the two cavities' length difference $l$; and
correspondingly the intensity of light onto the photodetector is highly
sensitive to $\delta l$. If the corner mirrors have a probability for reflecting photons
$\mathcal{R}_c$ and a transmission probability $1 - \mathcal{R}_c$ (and no scattering), and if the end
mirrors reflect much more efficiently, then this Fabry-Perot sensitivity is
described by the same formulas (105)-(107) as for a Michelson receiver, with
the number of round-trips $B$ in each Michelson arm replaced by $4/(1 - \mathcal{R}_c)$:
$$B \to 4/(1 - \mathcal{R}_c). \qquad (108)$$
It is also possible (and, in fact, is current practice) to operate a Fabry-Perot
receiver in an alternative mode where, instead of recombining and
interfering the beams, the laser's frequency is locked to the eigenfrequency of
one arm and the difference between the laser's frequency and that of the
other arm is the gravity-wave signal; see, e.g. Hough et al. (1983) or Spero
(1986a) for details. This mode of operation is technically easier than beam
recombination, but the ease is bought at the price of some debilitation in the
ultimately achievable shot noise.
(c) Noise in beam detectors
Photon shot noise is but one of many noise sources that plague beam
detectors. Almost always, thus far, the other noise sources are so strong that
the effects of shot noise are lost amidst them. Typically the experimenters
struggle for a long time to reduce the other noises sufficiently that shot noise
shows up; then they improve the shot noise somewhat by increasing the
laser power; then they begin a long struggle once again with the other noise
sources. In this way the overall gravity-wave amplitude noise at kilohertz
frequencies has been reduced during the period 1980-86 by a factor $\sim 1000$
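and, correspondingly,
$$h_{3/\mathrm{yr}}(f_c) = 11\,[f_c S_h(f_c)]^{1/2} \quad \text{for bursts (equation (34))}, \qquad (111)$$
$$h_{3/\mathrm{yr}}(f) = 3.8\,[S_h(f) \times 10^{-7}\,\mathrm{Hz}]^{1/2} \quad \text{for periodic waves (equation (52a))}, \qquad (112)$$
$$h_{3/\mathrm{yr}}(f) = 4.5 \left(\frac{\Delta f}{10^{-7}\,\mathrm{Hz}}\right)^{-1/4} [f S_h(f)]^{1/2} \quad \text{for stochastic waves}. \qquad (113)$$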
noise in the mirrors and Pockels cells above 6000 Hz. In Figs. 9.4, 9.6 and 9.7
(upper right) are shown the sensitivities $h_{3/\mathrm{yr}}$ (equations (111)-(113)) for the
Munich and Caltech prototypes in 1986 (Schilling, 1986; Spero, 1986b) and
Glasgow in early 1987. As an illustration of the sensitivity progress, Fig. 9.4
also shows $h_{3/\mathrm{yr}}$ for bursts in the Munich and Caltech prototypes a few
months after each was first turned on (1980 and 1983).
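The conversion from a noise spectral density $S_h(f)$ to the sensitivities $h_{3/\mathrm{yr}}$ of equations (111)-(113) can be sketched in a few lines of Python. The function names are hypothetical, and the exponents $1/2$ and $-1/4$ are read from the damaged equations above, so this is an illustration under those assumptions rather than a definitive transcription.

```python
import math


def h3yr_burst(f_c_hz, s_h):
    """Equation (111): bursts with characteristic frequency f_c."""
    return 11.0 * math.sqrt(f_c_hz * s_h)


def h3yr_periodic(s_h):
    """Equation (112): periodic waves (the 1e-7 Hz factor is from the text)."""
    return 3.8 * math.sqrt(s_h * 1e-7)


def h3yr_stochastic(f_hz, s_h, bandwidth_hz):
    """Equation (113): stochastic waves searched for in a given bandwidth."""
    return 4.5 * (bandwidth_hz / 1e-7) ** -0.25 * math.sqrt(f_hz * s_h)


if __name__ == "__main__":
    s_h = 1e-40  # assumed spectral density in 1/Hz, for illustration only
    print(h3yr_burst(1000.0, s_h))
    print(h3yr_periodic(s_h))
    print(h3yr_stochastic(1000.0, s_h, bandwidth_hz=1000.0))
```

For a fixed $S_h$, widening the search bandwidth improves the stochastic sensitivity only as the fourth root of the bandwidth.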
(e) Spectral density of shot noise for simple, recycling and resonating
receivers
In the present prototypes it is advantageous to store the light in the arms as
long as possible, thereby building up the largest possible phase shift and
gravity-wave sensitivity; cf. the $B$-dependence in equations (105) and (107).
However, in a kilometer-scale LIGO one easily can store the light longer
than half a gravity-wave period, i.e. for more than $B = 75\,(1\ \mathrm{km}/L) \times
(1000\ \mathrm{Hz}/f)$ round-trip traverses of an arm. Such long storage is
self-defeating: the phase shift built up so laboriously during the first half-period
of the wave gets removed during the second half-period because $h(t)$ reverses
sign.
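The figure of 75 round trips follows from setting the storage time $2BL/c$ equal to half a gravity-wave period, $1/2f$, which gives $B = c/(4Lf)$. A minimal Python check (the function name is an assumption introduced here):

```python
C_KM_PER_S = 299792.458  # speed of light, km/s


def half_period_round_trips(arm_length_km, f_gw_hz):
    """Round trips B for which the storage time 2BL/c equals 1/(2 f)."""
    return C_KM_PER_S / (4.0 * arm_length_km * f_gw_hz)


if __name__ == "__main__":
    print(half_period_round_trips(1.0, 1000.0))  # 1 km arms, 1 kHz wave
    print(half_period_round_trips(4.0, 100.0))   # 4 km arms, 100 Hz wave
```

For 1 km arms and a 1 kHz wave this gives very nearly 75, in agreement with the coefficient in the text.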
This shows up clearly in the spectral densities of shot noise $S_h(f)$ for the
idealized Michelson and Fabry-Perot receivers of Figs. 9.9 and 9.10 (still
without the 'recycling mirrors' in place). Assuming as above that the
Michelson mirrors have negligible losses during $B$ round trips in each arm,
and the Fabry-Perot end mirrors have negligible transmission compared to
the corner mirrors
$$1 - \mathcal{R}_E \ll 1 - \mathcal{R}_c = 4/B \qquad (114)$$
(cf. equation (108)), the spectral densities of shot noise are (cf. Gürsel et al.,
Fig. 9.11. Square root of spectral density of noise $[S_h(f)]^{1/2}$ plotted against
frequency $f$ for the Munich Michelson-type beam detector with 30 m arms,
as of February 1986.
[Plot: horizontal axis $f$, Hz, from 0 to 5000; vertical axis ticks from $3\times10^{-20}$ to $3\times10^{-18}$.]
[Figure: plot including a curve labelled FABRY-PEROT; axis labels and caption not recovered.]
maxima ($S_h = \infty$) occur when the light is stored for an integral number of
gravity-wave periods. This is just what we would expect from the fact that
$h(t)$ reverses sign every half-period, thereby removing during a second
half-period the signal put onto the light during a first half-period. In the
Fabry-Perot receiver there are no oscillations of $S_h(f)$ because different photons
experience different storage times, i.e. because of the probabilistic nature of
the reflectivity $\mathcal{R}_c$.
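The cancellation described above can be illustrated with a simple model, assumed here for illustration only: every photon is stored for the same time $\tau$, and the wave is a unit sinusoid of frequency $f$. The phase accumulated over the storage time is then proportional to the integral of the sinusoid over $\tau$, whose amplitude (maximized over the wave's phase) is $|\sin(\pi f\tau)|/(\pi f)$. The Python function below is a sketch of that model, not the spectral density formula of the text.

```python
import math


def accumulated_signal(f_gw_hz, storage_time_s):
    """Amplitude of the integral of a unit sinusoid over the storage time."""
    return abs(math.sin(math.pi * f_gw_hz * storage_time_s)) / (math.pi * f_gw_hz)


if __name__ == "__main__":
    f = 1000.0  # assumed wave frequency, Hz
    for periods in (0.25, 0.5, 1.0, 1.5, 2.0):
        tau = periods / f
        print(periods, accumulated_signal(f, tau))
```

In this model the signal peaks at half a period and vanishes whenever the storage time is a whole number of periods, which is where the noise maxima of the Michelson receiver occur.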
Drever (1983) has devised a method for improving the sensitivity of either
a Michelson or a Fabry-Perot when mirror reflectivities permit storing light
for much longer than a half-period. The basic idea is to extract the light after
a half-period, when further storage is self-defeating, and then reinsert it into
the cavity along with and in phase with new laser light. More specifically, for
the Michelson of Fig. 9.9(b), one adjusts the relative arm lengths so that very
little of the recombined light emerges from the beam splitter toward the
photodiode where the gravity-wave signal is read out (a mode of
operation that optimizes the sensitivity, it turns out). Then most of the
recombined light emerges toward the laser; and 'recycling mirrors' are
inserted to direct that emergent light back into the beam splitter along with
fresh laser light. For the Fabry-Perot of Fig. 9.10 the same effect is achieved
with a single recycling mirror.
As an aid in quantifying the recycling-induced improvement in shot noise,
consider the (realistic) situation in which the technology of mirror coatings
limits the mirror reflectivities to some maximum value $\mathcal{R}_{\max}$ (0.9999 at
present; perhaps 0.99999 a few years from now). It obviously is optimal to
place at the end of each arm a mirror with this maximum, so $\mathcal{R}_E = \mathcal{R}_{\max}$.
Consider, for concreteness, the Fabry-Perot receiver of Fig. 9.10. If the
corner mirrors are chosen also to have the maximum reflectivity, $\mathcal{R}_c = \mathcal{R}_{\max}$,
then essentially all the unused light leaks out the end mirrors and nothing is
gained by recycling. In this case of a simple, non-recycling Fabry-Perot with
$\mathcal{R}_c = \mathcal{R}_E = \mathcal{R}_{\max}$ the spectral density of shot noise is (cf. Gürsel et al., 1983;
Meers, 1983; Brillet and Meers, 1987)
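$$S_h(f) = S_0\left[1 + \left(\frac{f}{f_0}\right)^2\right], \qquad (116a)$$
where
$$S_0 = \frac{2\hbar c\,\bar\lambda_e\,(1-\mathcal{R}_E)^2}{\cdots}, \qquad (116b)$$
$$f_0 = \frac{(1-\mathcal{R}_E)\,c}{4\pi L} = 2.4\ \mathrm{Hz}\left(\frac{1-\mathcal{R}_E}{10^{-4}}\right)\left(\frac{1\ \mathrm{km}}{L}\right). \qquad (116c)$$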
(Note that, because so much light is lost out the end mirrors, this noise is
worse than the $\mathcal{R}_c \to \mathcal{R}_E$ limit of equation (115a): worse by a factor 4 at
frequencies $f \gg f_0$ and by a factor 16 at $f \ll f_0$.) This non-recycled ('simple')
Fabry-Perot noise is shown as a solid curve in Fig. 9.13.
An experimenter who wishes to improve on this noise level by recycling
must choose a frequency $f_k \gg f_0$ near which the noise level is to be
minimized. The minimal noise level near $f = f_k$ will then be achieved by
setting
$$1 - \mathcal{R}_c = 8\pi L f_k/c, \qquad (117a)$$
so the effective storage time in each arm is
$2BL/c = 8Lc^{-1}/(1 - \mathcal{R}_c) = (\pi f_k)^{-1} = (1/\pi) \times$ (period of a gravity wave with the optimal frequency $f_k$).
Minimal noise also requires a special choice for the reflectivity $\mathcal{R}_R$ of the
recycling mirror (Fig. 9.10)
$$1 - \mathcal{R}_R = \frac{4(1-\mathcal{R}_E)}{1-\mathcal{R}_c}. \qquad (117b)$$
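The mirror choices for a recycled Fabry-Perot can be tabulated with a short Python sketch. It uses equation (116c) for $f_0$, reads equation (117a) as $1 - \mathcal{R}_c = 8\pi L f_k/c$ (the value implied by the storage time $(\pi f_k)^{-1}$ quoted in the text), and uses equation (117b) for the recycling mirror. The function names, the 4 km arm length and the 100 Hz knee frequency in the usage lines are illustrative assumptions.

```python
import math

C = 2.99792458e8  # speed of light, m/s


def f0_simple(one_minus_r_end, arm_length_m):
    """Equation (116c): characteristic frequency of the simple Fabry-Perot."""
    return one_minus_r_end * C / (4.0 * math.pi * arm_length_m)


def corner_transmission(f_knee_hz, arm_length_m):
    """Equation (117a), as read here: 1 - R_c = 8 pi L f_k / c."""
    return 8.0 * math.pi * arm_length_m * f_knee_hz / C


def recycling_transmission(one_minus_r_end, one_minus_r_corner):
    """Equation (117b): 1 - R_R = 4 (1 - R_E) / (1 - R_c)."""
    return 4.0 * one_minus_r_end / one_minus_r_corner


def storage_time(one_minus_r_corner, arm_length_m):
    """Effective storage time 2BL/c with B = 4 / (1 - R_c)."""
    return 8.0 * arm_length_m / (C * one_minus_r_corner)


if __name__ == "__main__":
    arm_m, f_knee, loss_end = 4000.0, 100.0, 1e-4  # assumed example values
    t_corner = corner_transmission(f_knee, arm_m)
    print("f0 =", f0_simple(loss_end, arm_m), "Hz")
    print("1 - R_c =", t_corner)
    print("1 - R_R =", recycling_transmission(loss_end, t_corner))
    print("storage time =", storage_time(t_corner, arm_m), "s")
```

With $1 - \mathcal{R}_E = 10^{-4}$ and $L = 1$ km, `f0_simple` returns about 2.4 Hz, matching the coefficient in equation (116c).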
With these choices of reflectivity, the light-recycling Fabry-Perot receiver of
Fig. 9.10 has shot noise (cf. Gürsel et al., 1983; Meers, 1983; Brillet and
Meers, 1987)
$$S_h(f) = \frac{f_k}{2f_0}\,S_0\left[1 + \left(\frac{f}{f_k}\right)^2\right]. \qquad (117c)$$
This noise is depicted in Fig. 9.13 for two choices of $f_k$ (dashed curves). Note
that this noise has a 'knee' at the frequency $f_k$; for this reason $f_k$ is called the
knee frequency. Note further that recycling produces an overall
improvement in $S_h(f)$, at frequencies $f \gtrsim f_k$, by a factor $f_0/2f_k$ relative to a
non-recycled Fabry-Perot with equal reflectivities for corner and end
mirrors (equations (116)) and an improvement by $2f_0/f_k$ relative to the very
best non-recycled Fabry-Perot, one with $1 - \mathcal{R}_{\max} = 1 - \mathcal{R}_E \ll 1 - \mathcal{R}_c =
8\pi L f_k/c$ (equation (115a)). This improvement at $f \gtrsim f_k$ is bought at the price
of worsened noise at $f \lesssim (f_k f_0/2)^{1/2}$. Note further that the improvement factor,
over an optimal non-recycled Fabry-Perot at $f \gtrsim f_k$, is
$$\frac{S_h^{\rm recycled}}{S_h^{\rm non\text{-}recycled}} = \frac{2f_0}{f_k} = \frac{4(1-\mathcal{R}_E)}{1-\mathcal{R}_c}, \qquad (118)$$
which is 1/(the mean number of times that the light can be recycled before it
is lost by leakage through the high-reflectivity end mirrors). This is just what
physical intuition should suggest.
light recycling (equation (117c)), in which the end mirrors have the maximum
achievable reflectivity $\mathcal{R}_E = \mathcal{R}_{\max}$, the corner mirrors are adjusted to produce
the desired knee frequency ($f_{k1}$ and $f_{k2}$ for the two curves shown; equation
(117a)), and the recycling mirror is adjusted to minimize the noise at the knee
frequency (equation (117b)). The dotted curves are for a Fabry-Perot with
light resonating (equation (119c); Fig. 9.14) in which the end mirrors have the
maximum achievable reflectivity $\mathcal{R}_E = \mathcal{R}_{\max}$; the corner mirrors are adjusted
to produce the desired resonant frequency for gravity waves ($f_{R1}$ and $f_{R2}$ for
the two curves shown; equation (119a)); and the recycling mirror is adjusted
to minimize the noise at the resonant frequency (equation (119b)).
[Plot: curves labelled RECYCLING; horizontal axis $\ln f$ with marks at $0.5f_{k1}$, $f_{k1}$, $2f_{k1}$, $0.5f_{k2}$, $f_{k2}$, $2f_{k2}$, $f_{R1}$ and $f_{R2}$.]
[Figure: two-panel schematic, (a) and (b), with the labels END MASS and PHOTODIODE; caption not recovered.]
says should work (Caves, 1987b), for beating the standard quantum limit
when performing a narrow-band measurement of periodic gravitational
waves. The key idea (which was invented independently and earlier in the
classical domain by Gordienko, Gusev and Rudenko, 1977) is to place a
spring between each mirror and the companion mass on which it rides,
thereby turning the mirror and companion into a two-mode system. By
setting the ratio of the spring frequency over the gravity-wave frequency
equal to the square root of the mass of the big companion over the mass of
the little mirror, one can force the laser beam's fluctuating light pressure to
act on the motion of the big mass and not the mirror, while the
interferometer reads out the mirror's motion. The result is an improved
immunity of the receiver to Heisenberg's quantum noise over a narrow
frequency band. While this scheme is clever and conceptually satisfactory,
whether it can be implemented in a practical manner is not yet clear.
Although, for broad-band measurements with beam receivers, there as yet
is no known way of beating the standard quantum limit, Caves (1981) has
invented a clever scheme for getting closer to the quantum limit when
inadequate laser power causes the shot noise to exceed it. The key to Caves'
idea is his discovery that the ultimate source of the photon shot noise and the
fluctuating radiation-pressure noise is not, as people previously thought,
fluctuations in the laser output. Rather, it is fluctuations in the quantum
electrodynamic vacuum state ('vacuum fluctuations') that enter the beam
splitter at right-angles to the incoming laser light and superpose on the laser
light as it heads toward the two arms. As Caves has shown, by 'squeezing the
vacuum' (i.e. reducing the vacuum fluctuations in the $\cos[(ct-x)/\bar\lambda_e]$ part of
the light while increasing them in the $\sin[(ct-x)/\bar\lambda_e]$ part; achievable in
principle by sending the vacuum through a pumped, non-linear medium),
one can reduce the beam receiver's shot noise at the expense of increasing the
fluctuations in its light-pressure noise. The net result is the same as if one
were using a laser of higher power: in the typical case, where the actual
power is too low to permit achieving the quantum limit, one improves the
sensitivity and moves toward the quantum limit.
Recently Caves (1987c) has elucidated the ultimate sensitivities achievable
when one combines his squeezed-vacuum technique with light recycling and
resonating. He finds that because, in a resonating system, noise associated
with losses in the end mirrors is as important on resonance as vacuum
fluctuations entering the beam splitter, squeezing of the vacuum is not
useful. By contrast, when combined with light recycling, squeezed-vacuum
techniques can reduce $S_h(f)$, over a broad band of frequencies $\Delta f \sim f$, to $S_0$
$$h_{3/\mathrm{yr}} = 4.5\left(\frac{\Delta f}{10^{-7}\,\mathrm{Hz}}\right)^{-1/4}(fS_h)^{1/2} = 1.1\times10^{-22}\times(\ldots) \quad \text{for stochastic sources}, \qquad (123c)$$
where the search for stochastic waves is presumed to use a bandwidth $\Delta f = f$.
Below $\sim 500$ Hz it is presumed that seismic noise debilitates the
performance of these first-generation LIGO receivers.
During the years following the first gravity-wave searches in the LIGOs,
the experimenters plan to run a sequence of detectors with ever improving
sensitivities, pushing the sensitivities ever downward, and improving the
seismic isolation so the detectors can operate at ever decreasing minimum
frequencies. A reasonable goal by the end of the 1990s is to approach the
lower-most thick-dashed curves in Figs. 9.4, 9.6 and 9.7. These curves
correspond to advanced detectors with the following characteristics:
$L$ = (arm length) = 4 km,
$\bar\lambda_e = 0.0818\ \mu$m (Argon-ion laser light),
$I_0\eta$ = (laser power) $\times$ (photodetector sensitivity) = 100 W,
$$= 5.2\times10^{-25}\times(\ldots) \quad \text{for stochastic waves}. \qquad (125c)$$
Here it is assumed that the stochastic search is restricted to the narrow
bandwidth $\Delta f = f_0$ over which the resonating receiver has good
performance. The quantum-limit noise, which exceeds these shot noise
levels at $f \lesssim 100$ Hz, is given by (equations (121) and (111)-(113))
$$h_{3/\mathrm{yr}} = 11\left[\frac{2\hbar}{\pi^2 mL^2 f}\right]^{1/2} = 1.3\times10^{-23}\left(\frac{1000\ \mathrm{Hz}}{f}\right)^{1/2} \quad \text{for bursts}, \qquad (126a)$$
$$h_{3/\mathrm{yr}} = 3.8\left[\frac{2\hbar\times10^{-7}\,\mathrm{Hz}}{\pi^2 mL^2 f^2}\right]^{1/2} = 4.4\times10^{-29}\left(\frac{1000\ \mathrm{Hz}}{f}\right) \quad \text{for periodic waves}, \qquad (126b)$$
$$h_{3/\mathrm{yr}} = 4.5\,[\,\cdots\,]$$
(the remainder of this expression is not recoverable)
1/100 that of Sco X-1, and a stochastic background at $f \sim 30$ Hz with $\Omega_{\rm GW}$ as
small as $10^{-10}$. It seems likely to me that such sensitivities are more than
adequate for success; waves are likely to be detected at more modest
sensitivities than these, somewhere between those of the 'first-generation'
detectors and those of the 'advanced detectors' (Figs. 9.4, 9.6 and 9.7).
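The numerical coefficients quoted for the quantum-limit sensitivities (126a) and (126b) can be checked with a short Python sketch. It reads the bracket in (126a) as $2\hbar/(\pi^2 mL^2 f)$, takes $L = 4$ km from the list of advanced-detector characteristics, and assumes a mirror mass of $10^3$ kg; that mass is not stated in the surviving text and is chosen here only because it reproduces the quoted coefficients. All function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s


def s_quantum_limit(f_hz, mass_kg, arm_length_m):
    """Quantum-limit spectral density, read here as 2 hbar / (pi^2 m L^2 f^2)."""
    return 2.0 * HBAR / (math.pi ** 2 * mass_kg * arm_length_m ** 2 * f_hz ** 2)


def h3yr_burst_ql(f_hz, mass_kg, arm_length_m):
    """Equation (126a): 11 [f S]^(1/2)."""
    return 11.0 * math.sqrt(f_hz * s_quantum_limit(f_hz, mass_kg, arm_length_m))


def h3yr_periodic_ql(f_hz, mass_kg, arm_length_m):
    """Equation (126b): 3.8 [S x 1e-7 Hz]^(1/2)."""
    return 3.8 * math.sqrt(s_quantum_limit(f_hz, mass_kg, arm_length_m) * 1e-7)


if __name__ == "__main__":
    mass_kg, arm_m = 1000.0, 4000.0  # mass is an assumption; L = 4 km is from the text
    print(h3yr_burst_ql(1000.0, mass_kg, arm_m))
    print(h3yr_periodic_ql(1000.0, mass_kg, arm_m))
```

Under these assumptions the two values at 1000 Hz agree with the quoted $1.3\times10^{-23}$ and $4.4\times10^{-29}$ to within a few per cent.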
(h) Ideas for other types of beam detectors
The Michelson and Fabry-Perot configurations, on which almost all
experimental work to date has focussed, are not the only possible types of
beam receivers. A number of others have been conceived of, but have not
been pursued vigorously. Since the LIGOs are intended to house several
different receivers simultaneously and to have lifetimes of $\gtrsim 20$ years, during
which a number of generations of receivers will be built and operated, some
of these other configurations might one day operate in a LIGO. These other
configurations include: (i) A frequency-tagged interferometer or Michelson
with overlapping beams (Drever and Weiss, 1983), in which the light beam in
each arm is made to shift in frequency with each round-trip pass, so that the
beams of successive passes can overlap each other but not interfere with each
other; such an interferometer is basically a Michelson but with small,
Fabry-Perot-size mirrors. (ii) Active interferometers (Bagaev et al., 1981;
Brillet and Tourrenc, 1983), in which an active medium (atoms or molecules
with transitions near the frequency of the laser light) resides in the arms (or
arm - there might be only one) of the interferometer. (iii) Spectroscopic
detectors (Nesterikhin, Rautian and Smirnov, 1978; Brillet and Tourrenc,
1983; Bordé et al., 1983), in which laser spectroscopy is used to monitor the
frequency shifts produced by the gravitational waves.
Although in principle these alternative types of beam detectors can
achieve sensitivities comparable to those of the standard (Michelson and
Fabry-Perot) types, practical issues make the active and spectroscopic
detectors look substantially less promising (Brillet and Tourrenc, 1983); and
the frequency-tagged detector has not yet been pursued in sufficient depth to
venture a tentative verdict.
9.5.4 Other types of earth-laboratory detectors
In addition to bar detectors and beam detectors, a number of other types of
gravity-wave detectors that could operate in an earth-bound laboratory
have been conceived of. Although none of these has looked sufficiently
promising to justify substantial experimental effort, some are question
marks (because they have not been pursued far enough for a clear verdict)
rather than discards. Because I have not looked at most of them in enough
detail, I shall not venture to sort out the question marks from the total
rejects.
(a) Electromagnetically coupled detectors
One type of transducer used on bar detectors (e.g. by the Moscow and Perth
groups) is a re-entrant microwave-cavity resonator whose capacitance is
modulated by the bar's vibrations (Section 10 of Braginsky, Mitrofanov and
Panov, 1985; Blair, 1983). There is an obvious similarity of this to a
Fabry-Perot beam receiver in which each arm is an optical cavity with length
modulated by the motions of the swinging masses. In fact, one can imagine a
continuous sequence of detector configurations that leads from one to the
other, by splitting the bar into pieces and gradually moving them apart, with
the microwave cavity gradually being distorted and expanded until it
becomes the optical cavity of the Fabry-Perot (Caves, 1978).
This argument leads to the recognition that bar detectors and beam
detectors are but two examples of a large class of possible
'electromagnetically coupled detectors', in which gravitational waves drive
the motions of masses, and electromagnetic fields measure those motions;
or, when the detector gets larger than a reduced wavelength, gravity waves
drive vibrations of both the electromagnetic fields and the masses, which are
coupled together.
Other earth-laboratory-scale electromagnetically coupled configurations,
besides resonant bars and optical beams, that have been
considered theoretically and show some promise but have not been pursued
in a serious experimental way, include: (i) large microwave cavities in which
wall motion, driven by gravitational waves with $\bar\lambda \gg L$, upconverts
microwave quanta from one mode to another mode of slightly higher
frequency (Pegoraro, Picasso and Radicati, 1978; Caves, 1979); (ii) optical
or microwave cavities with $L \gtrsim \bar\lambda$, intended for detection of high-frequency
waves $f \gtrsim 10^4$ Hz, in which the gravitational waves interact directly with the
resonating electromagnetic field to move quanta from one mode into
another, or interact directly with a DC electric or magnetic field to create
quanta at the gravity-wave frequency (Braginsky et al., 1973); (iii) as a
specific, much studied example of such a cavity: an optical or microwave
ring resonator in which circularly polarized gravitational waves
propagating orthogonal to the resonator plane resonate with the circulating
electromagnetic field, producing a linearly growing phase shift in the field.
This scheme, proposed by Braginsky and Menskii (1971), was incorrectly
analyzed by them and by me (Box 37.6 of MTW) - a source of some
embarrassment. My error was in thinking I could cover the detector with a
single proper reference frame, when its diameter $L$ is larger than the reduced
wavelength $\bar\lambda$ of the gravitational waves it sees. My incorrect conclusion, and
that of Braginsky and Menskii (1971), was a quadratically growing phase
shift, $\Delta\Phi \propto h(ct/\bar\lambda)(ct/\bar\lambda_e)$. The correct conclusion (Linet and Tourrenc,
1976) is a linearly growing phase shift, $\Delta\Phi \propto h(ct/\bar\lambda_e)$, which makes this
scheme no better in principle than a standard beam detector. WARNING:
The literature is full of similarly incorrect analyses of the various detectors
described in this section, 9.5.4; (iv) an optical cavity filled with an isotropic
medium, or an optical fiber, in which gravity-wave-produced strains induce
optical effects such as birefringence (Iacopini et al., 1979; Vinet, 1985); (v)
detectors using the Mössbauer effect (Kauffman, 1970).
For a general review of electromagnetically coupled detectors, see
Grishchuk (1983), and for general analyses of them, see Tourrenc and
Grossiord (1974), Tourrenc (1978) and Teissier du Cros (1985).
Analyses of such detectors sometimes produce wildly overoptimistic
conclusions because they overlook the mundane issue of thermal noise in the
mechanical parts of the detector (e.g. the walls of an electromagnetic cavity).
For an example of an analysis that does take thermal noise into account
properly, see Caves (1979).
(b) Superfluid interferometers and superconducting circuits
Anandan (1981), Chiao (1982) and Anandan and Chiao (1982) have
suggested that if it is possible to construct a superfluid weak link, analogous
to the superconducting weak link (Josephson junction) on which SQUIDS
are based, then such a link could form the basis for a superfluid ring
interferometer that would be sensitive to gravitational waves.
Unfortunately, such weak links do not yet exist. Schrader (1984) and
independently Anandan (1985, 1986) have suggested a gravity-wave
detector based on the direct interaction between the gravitational wave and
the magnetic field in superconducting solenoids, with the resulting current
change monitored by a SQUID.
9.5.5 Low-frequency detectors (10-$10^{-5}$ Hz)
As one goes to lower and lower frequencies it becomes harder and harder to
isolate a gravity-wave detector in an earth-bound laboratory from seismic
and acoustic vibrations and from fluctuating gravity gradients due to
people, animals, trucks, etc. Ultimately, somewhere around 10 Hz, isolation
will become impossibly difficult (see, e.g., Saulson, 1984). Thus, to operate in
the 'low-frequency region' $10\ \mathrm{Hz} \gtrsim f \gtrsim 10^{-5}$ Hz will require putting
detectors in space, or using normal modes of the earth or the sun as the
detectors.
A number of space-borne or earth- or sun-normal-mode detectors have
been conceived of for operation at low frequencies; and several have been
constructed and have produced interesting limits on cosmic gravitational
waves. In this section I shall describe these briefly.
(a) Doppler tracking of spacecraft
At periods between a few minutes and a few hours the best present
gravity-wave detector is the doppler tracking of spacecraft.
In doppler tracking a highly stable clock on earth ('master oscillator') is
used to control the frequency of a monochromatic radio wave, which is
transmitted from earth to the spacecraft. This 'uplink' radio wave is received
by the spacecraft and 'transponded' back to the earth; i.e. the uplink wave is
used by the spacecraft to control, in a phase-coherent way, the frequency of
the 'downlink' radio wave that it transmits back to earth. When the
downlink radio wave is received at earth, its frequency is compared with that of
the master oscillator; and from that comparison a doppler shift is read out.
When gravitational radiation sweeps through the solar system - most
interestingly with a reduced wavelength of order the earth-spacecraft
distance so it must be analyzed by TT methods - it perturbs the earth, the
spacecraft, and the propagating radio wave. The net result is to produce
fluctuations in the measured doppler shift with magnitude $\delta\nu/\nu \sim h^{TT}_{jk}$. Each
feature in the gravitational waveform $h^{TT}_{jk}(t)$ shows up three times in the
doppler shift, in a manner that can be regarded as due to interaction of the
radio wave with the gravity wave at the events of emission from earth,
transponding by spacecraft, and reception at earth (Estabrook and
Wahlquist, 1975). This triplet structure can be used to help distinguish the
effects of the gravitational wave from noise in the doppler data.
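The triplet structure can be sketched in Python using the three-pulse form in which the response of Estabrook and Wahlquist (1975) is commonly quoted. That formula is not given in the text above and sign conventions vary between authors, so it should be read as an assumption: `psi` stands for the wave amplitude projected onto the earth-spacecraft line, `mu` for the cosine of the angle between that line and the wave's propagation direction, and `two_way_light_time` for the round-trip light time. The Gaussian burst is purely illustrative.

```python
import math


def pulse_weights(mu):
    """Weights of the three copies of the waveform in the doppler record."""
    return ((mu - 1.0) / 2.0, -mu, (1.0 + mu) / 2.0)


def three_pulse_response(psi, t, two_way_light_time, mu):
    """Fractional doppler shift at time t (assumed three-pulse form)."""
    w_early, w_middle, w_late = pulse_weights(mu)
    # Delays 0, (1 + mu) T / 2 and T correspond in this convention to the
    # events of reception, transponding and emission respectively.
    return (w_early * psi(t)
            + w_middle * psi(t - (1.0 + mu) * two_way_light_time / 2.0)
            + w_late * psi(t - two_way_light_time))


def gaussian_burst(t, amplitude=1e-15, width_s=10.0):
    """Illustrative burst waveform; not taken from the text."""
    return amplitude * math.exp(-0.5 * (t / width_s) ** 2)


if __name__ == "__main__":
    light_time, mu = 2000.0, 0.5  # assumed values, seconds and dimensionless
    for t in (0.0, 1500.0, 2000.0):
        print(t, three_pulse_response(gaussian_burst, t, light_time, mu))
```

The three weights sum to zero, so a wave that varies slowly compared with the light time produces almost no net doppler shift; a short burst instead appears three times, with fixed relative amplitudes and spacings that noise is unlikely to mimic.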
The use of doppler data for gravity-wave searches was first proposed by
Braginsky and Gertsenshtein (1967), and was first pursued with preexisting
data by Anderson (1971). The response of doppler tracking to a gravity wave
was worked out by Davies (1974) for special cases and by Estabrook and
Wahlquist (1975) for the general case. Experimental feasibility and noise
dispersion will be sufficiently low that fluctuations in dispersion due to water
vapor in the earth's atmosphere may become the most serious noise source
(Armstrong and Sramek, 1982). Plans for monitoring the water vapor as a
means of reducing this noise are being developed (Resch et al., 1984). In the
more distant future, doppler tracking may be improved by using dual
frequency X-band/K-band (30 GHz) tracking to reduce and monitor the
ionized plasma dispersion, by installing a highly stable clock on board the
spacecraft (Smarr et al., 1983) and/or by moving the earth-based antenna
into earth orbit. These improvements might ultimately produce $[fS_h(f)]^{1/2}$ as
low as $10^{-17}$ (Estabrook, 1987).
Figs. 9.4, 9.6 and 9.7 show the sensitivities $h_{3/\mathrm{yr}}$ corresponding to these
noise levels (equations (34), (52a) and (67) with $\Delta f = f$).
(f) Skyhook
Braginsky and Thorne (1985) have suggested an earth-orbiting 'skyhook'
gravity-wave detector that would operate in the 0.1-0.01 Hz region with
sensitivity $[fS_h(f)]^{1/2} \sim 3\times10^{-17}$, which is much better than present or near
future doppler tracking and roughly comparable to that hoped for from
blocks of the earth's crust (see Figs. 9.4, 9.6 and 9.7). The skyhook would
consist of two masses, one on each end of a long thin cable with a spring at its
center. As it orbits the earth, the cable would be stretched radially by the
earth's tidal gravitational field. Gravitational waves would pull the masses
apart and push them together in an oscillatory fashion; their motion would
be transmitted to the spring by the cable; and a sensor would monitor the
spring's resulting motion.
If it ever flies, the skyhook's role will be to provide, with a simple and
inexpensive device, a moderate-sensitivity coverage of the 0.1-0.01 Hz
region during the epoch before far more sensitive beam detectors are built
and installed in space.
9.5.6 Very-low-frequency detectors (frequencies below $10^{-5}$ Hz)
At frequencies below about $10^{-5}$ Hz the only sources of gravitational waves
are probably stochastic background from the early universe (Sections
9.4.3(d,e,f)); and the best detectors involve use of distant astronomical
bodies.
and Rees (1983) used data from Helfand et al. (1980) to obtain a similar
limit.
Intrinsic noise in all the pulsars used in these analyses would have made it
difficult to improve on these limits at fixed frequency (though, by further
observations the region covered could have been extended to lower
frequencies; the lowest being of order 1/(total time since the measurements
began)). Fortunately, while these analyses were in process, they were put out
of business by the discovery (Backer et al., 1982) of a pulsar far quieter than
any previously known: the millisecond pulsar PSR 1937+21. From three
years of 1937+21 timing data taken with the Arecibo radio telescope, Davis
et al. (1985) and Taylor (1987) have now placed the limit
$$\Omega_{\rm GW}(f) \lesssim 1\times10^{-6}\left(\frac{f}{10^{-8}\,\mathrm{Hz}}\right)^4 \quad \text{for } f \gtrsim f_{\min} = 10^{-8}\ \mathrm{Hz} \qquad (127)$$
on any isotropic stochastic background: see Fig. 9.7.
This level of sensitivity is so good that further progress at fixed frequency
is limited by the long-term frequency stability of the world's best atomic
clocks. Thus, unless clocks improve, we can expect the coefficient in (127) to
improve at best as $t^{-1}$, while the lower frequency limit $f_{\min}$ decreases as $t^{-1}$,
producing $\Omega_{\rm GW}(f_{\min}) \propto t^{-5}$ (with $t = 0$ in 1982). Ultimately, when clocks
have improved by one or two orders of magnitude, noise due to interstellar
scintillation may become a problem (Armstrong, 1984).
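The $t^{-5}$ scaling can be made concrete with a small Python sketch. It assumes, as a normalization not stated explicitly in the text, that the limit (127) with coefficient $10^{-6}$ and $f_{\min} = 10^{-8}$ Hz corresponds to $t = 3$ years of timing data; the function names are hypothetical.

```python
def omega_gw_limit(f_hz, coefficient=1e-6, f_ref_hz=1e-8):
    """Upper limit of the form (127): coefficient * (f / f_ref)^4."""
    return coefficient * (f_hz / f_ref_hz) ** 4


def projected_limit_at_fmin(years_of_data, baseline_years=3.0):
    """Projected f_min and limit there, if both the coefficient and f_min
    fall as 1/t from their assumed values at the baseline."""
    scale = baseline_years / years_of_data
    coefficient = 1e-6 * scale
    f_min_hz = 1e-8 * scale
    return f_min_hz, omega_gw_limit(f_min_hz, coefficient)


if __name__ == "__main__":
    for years in (3.0, 6.0, 12.0):
        print(years, projected_limit_at_fmin(years))
```

Doubling the span of data in this sketch lowers the limit at $f_{\min}$ by a factor $2^5 = 32$: one factor of 2 from the coefficient and four from the $f^4$ dependence evaluated at the lower $f_{\min}$.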
Recently two other quiet, fast pulsars PSR 1855+09 and PSR 1953+29
have been discovered (Segelstein et al., 1986). Together with PSR 1937+21
and others that we can hope for, they may one day form a network for
gravitational-wave searches and observations. Such a network would
alleviate problems with interstellar scintillations and atomic clock
fluctuations.
9.6 Conclusion
As I look back over this review of gravitational waves, I am struck by the
enormous changes in our theoretical understanding - or at least in
theoretical fashion - that have occurred over the past 5, 10 and 15 years; and
I am impressed even more by the progress that experimenters have made in
the quest to invent, design and build detectors of ever greater sensitivity.
That the quest ultimately will succeed seems almost assured. The only
question is when, and with how much further effort. Five years ago Jerry
Ostriker and I made a bet, to wit:
Whereas both Jeremiah P. Ostriker and Kip S. Thorne believe that
Einstein's equations are valid
And both are convinced that these equations predict the existence of
gravitational waves
And both are confident that Nature will provide what physical law
predicts
And both have faith that scientists can ultimately observe whatever
Nature does supply
Nevertheless, they differ on the likely strengths of natural sources and on
the probability of a near-future and verifiable detection.
Therefore they agree to wager one case of good red wine (JPO to supply
French wine, KST to supply California) on the detection of
extraterrestrial gravitational waves before the next Millennium (January
1, 2000). KST wins the wager if at least two experimental groups observe
phenomena which they agree are gravitational waves. If not, JPO wins.
Signed and officially sealed
this sixth day of May 1981
Jeremiah P. Ostriker
Kip S. Thorne
I expect to win - but I won't guarantee it.
Acknowledgments
For helpful comments on the manuscript of this chapter I thank Thibault
Damour, Ron Drever, John Armstrong, Peter Bender, Jiri Bicak, David
Blair, Roger Blandford, Herman Bondi, Carlton Caves, Frank Estabrook,
Charles Evans, Sam Finn, Craig Hogan, Jim Hough, Ed Leaver, Brian
Meers, Roger Romani, David Schoemaker, Dan Stinebring, and Kimio
Tsubono.
References
10. 1 Introduction
The first relativistic solutions for a homogeneous expanding universe were
found by Friedmann (1922) before Hubble (1929) discovered the recession of
the nebulae. Hubble's work, which showed that the universe did not
resemble Einstein's (1917) static model, stimulated further studies of
relativistic cosmology by Lemaître, Tolman and others. But there was then -
and remained for several decades - a severe mismatch between the relative
sophistication of the theory and the sparseness of the relevant data.
Hubble's work suggested that galaxies would have been crowded together
in the past, and emerged from some kind of 'beginning'. But he had no direct
evidence for cosmic evolution. Indeed the steady state theory, proposed in
1948 as a tenable alternative to the 'big bang', envisaged continuous
creation of new matter and new galaxies, so that despite the expansion the
overall cosmic scene never changed.
We would not expect to discern any cosmic evolutionary trend unless we
can probe out to substantial redshifts. This entails studying objects billions
of light years away with redshifts $z \approx 1$; although a programme to measure
the cosmic deceleration was pursued for many years with the 200-inch
Palomar telescope, the results were inconclusive, partly because normal
galaxies are not luminous enough to be detectable at sufficiently large
redshifts. It was Ryle and his colleagues, in the late 1950s, who found the first
evidence that our entire universe was evolving. His radio telescope could
pick up emission from some active galaxies (the ones thought to harbour
massive black holes) even when these were too far away to be observed
optically. One cannot determine a redshift or distance of these sources from
radio measurements alone, but Ryle assumed that, statistically at least, the
ones appearing faint were more distant than those appearing intense. He
460 M . J . Rees
counted the numbers with various apparent intensities, and found that there
were too many apparently faint ones (in other words, those at large
distances) compared with brighter and closer ones (Ryle, 1958). This was
discomforting to the 'steady statesmen', but compatible with an evolving
universe if galaxies were more prone to violent outbursts in the remote past,
when they were young. The subsequent discovery by optical astronomers of
many hundreds of active galaxies at very large redshifts (quasars) has borne
out this trend; but these objects, and their evolution, are still too poorly
understood to be used for determining the geometry and deceleration of the
universe.
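
The counting argument can be made concrete with a short sketch. It assumes, purely for illustration, identical sources of luminosity L spread uniformly through static Euclidean space; this simplification, the function names and the numerical values are not taken from the text. Under that assumption the number of sources brighter than flux S scales as S^(-3/2), so an excess of faint sources would show up as a slope steeper than -1.5.

```python
import math

def euclidean_counts(flux, luminosity=1.0, density=1.0):
    """Number of sources brighter than `flux` for a uniform Euclidean population.

    A source of luminosity L is seen above flux S out to r = sqrt(L / (4 pi S)),
    so N(>S) = (4 pi / 3) * n * r**3, which scales as S**-1.5.
    All quantities are in arbitrary (hypothetical) units.
    """
    r_max = math.sqrt(luminosity / (4.0 * math.pi * flux))
    return (4.0 * math.pi / 3.0) * density * r_max ** 3

def log_slope(flux_a, flux_b):
    """Logarithmic slope d(log N)/d(log S) between two flux levels."""
    n_a, n_b = euclidean_counts(flux_a), euclidean_counts(flux_b)
    return (math.log(n_b) - math.log(n_a)) / (math.log(flux_b) - math.log(flux_a))

print(f"slope for a uniform, non-evolving population: {log_slope(1.0, 0.01):.2f}")
```

Comparing an observed slope with this reference value is one simple way to express "too many faint sources"; a real analysis would also have to allow for redshift and a spread of luminosities.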
The clinching evidence for a 'big bang' came when Penzias and Wilson
(1965) detected the cosmic microwave background radiation. This radiation
(whose thermal spectrum was quickly established) implied that intergalactic
space was not completely cold, but at a temperature of 2.7 K. The
corresponding photon density is $\sim 4 \times 10^{8}$ m$^{-3}$, implying that there are
$\sim 10^{9}$ photons for every baryon. This discovery quickly led to a general
acceptance of the so-called 'hot big bang' cosmology - a shift in the
consensus among cosmologists as sudden and drastic as the shift of
geophysical opinion in favour of continental drift that took place
contemporaneously. There seemed no plausible way of accounting for the
microwave background radiation except on the hypothesis that it was a relic
of an epoch when the entire universe was hot, dense and opaque. Moreover,
the high intrinsic isotropy of this radiation - better than one part in $10^{4}$ -
meant that the Robertson-Walker metric was a better approximation to the
real universe than the theorists of the 1930s would have dared to hope.
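
The photon density quoted for a 2.7 K background can be checked with the standard blackbody formula $n_\gamma = (2\zeta(3)/\pi^2)(kT/\hbar c)^3$. The sketch below is an illustration only; the function name and the rounded physical constants are assumptions of this example rather than anything specified in the text.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
ZETA_3 = 1.2020569      # Riemann zeta(3)

def photon_number_density(temperature_k):
    """Blackbody photon number density in m^-3: (2 zeta(3) / pi^2) (kT / hbar c)^3."""
    inverse_length = K_B * temperature_k / (HBAR * C)
    return 2.0 * ZETA_3 / math.pi ** 2 * inverse_length ** 3

n_gamma = photon_number_density(2.7)
print(f"n_gamma(2.7 K) ~ {n_gamma:.1e} m^-3")
```

Because the density scales as $T^3$, a modest change in the assumed temperature shifts the result noticeably, which is why the temperature is kept as an explicit argument.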
In the late 1960s some theorists were emboldened to carry out a series of
now-classic investigations of the early stages of a Friedmann universe
composed of matter and radiation. Insofar as the present 'mix' of matter and
thermal radiation was known, cosmologists could infer the appropriate
equation of state at earlier times, and deduce the universe's 'thermal
history'. Powerful corroborative support for the hot big bang came when the
composition of material emerging from the 'fireball' was calculated, and
found to be 25 per cent helium and 75 per cent hydrogen (Hoyle and Tayler,
1964; Peebles, 1966; Wagoner, Fowler and Hoyle, 1967). This was specially
gratifying because the theory of stellar nucleosynthesis, which worked so
well for carbon, iron, etc., was hard-pressed to explain why there was so
much helium, and why its abundance was so uniform. Attributing most of
the observed helium to the big bang therefore solved a long-standing
problem in nucleogenesis, and bolstered cosmologists' confidence in
Galaxy formation and dark matter 461
extrapolating right back to the first few seconds of the universe's history, and
assuming that the laws of microphysics were the same then as now. The
applicability of the Friedmann equations at early times could not of course
be taken for granted; the universe is certainly not completely homogeneous
now, and could in principle have been more irregular, and more anisotropic,
in the past. But the measured isotropy, together with the singularity
theorems, implied that there must be some singularities in the universe's past
(Hawking and Ellis, 1968), and the observed helium abundance (sensitive to
the expansion rate when the temperature was $\sim 10^{10}$ K) constrains any
possible early anisotropies.
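
The 25 per cent helium figure follows from simple bookkeeping once the neutron-to-proton ratio at the time of nucleosynthesis is given. The sketch below assumes a ratio of about 1/7, a commonly quoted textbook value that is not stated in the text, and assumes that every surviving neutron ends up in helium-4; the function name is hypothetical.

```python
def helium_mass_fraction(neutron_to_proton):
    """Helium-4 mass fraction if every surviving neutron is bound into helium-4.

    Each helium-4 nucleus uses 2 neutrons and 2 protons, so
    Y = 2 (n/p) / (1 + n/p).
    """
    return 2.0 * neutron_to_proton / (1.0 + neutron_to_proton)

# n/p ~ 1/7 is an assumed illustrative value, not a number from the text.
y_helium = helium_mass_fraction(1.0 / 7.0)
print(f"Y = {y_helium:.2f}, hydrogen fraction = {1.0 - y_helium:.2f}")
```

The neutron-to-proton ratio itself depends on the expansion rate when the temperature was near $10^{10}$ K, which is why the helium abundance constrains early anisotropies.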
More detailed calculations, combined with better observations of
background radiation and of element abundances, have strengthened the
consensus that the hot big bang model is basically valid. Several discrepant
results could have emerged during the last 20 years, but did not: for instance,
the standard Friedmann hot big bang would need drastic modification if any
object were found to have a zero helium/hydrogen ratio, if any species of
neutrino were found experimentally to have a mass in the range 1 keV-
1 MeV, or if there were a glaring discrepancy between the ages of the oldest
stars and the largest plausible timescale in such models. The hot big bang is
not yet firm dogma. Conceivably, our satisfaction will prove as transitory as
that of a Ptolemaic astronomer who successfully fits a new epicycle. But the
hot big bang model certainly seems more plausible than any equally specific
alternative - most cosmologists would make the stronger claim that it has a
more than 50 per cent chance of being essentially correct.
The main stages in the evolution of a standard hot big bang universe are
depicted schematically in Fig. 10.1. Uncertainties about the relevant physics
impede our confidence in discussing the extensive span of logarithmic time
$10^{-43}$-$10^{-4}$ s when thermal energies exceed 100 MeV. When $t \gtrsim 10^{-4}$ s we
can consistently utilise physics which is 'well known'; and, so long as the
universe remains almost homogeneous, the evolution is straightforwardly
calculable. However, at some stage small initial perturbations must have
evolved into gravitationally bound systems (protogalaxies?); even though
the controlling physics may then be Newtonian gravity and gas dynamics,
the onset of non-linearity induces challengingly complex behaviour.
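
The correspondence between cosmic time and thermal energy used above can be sketched with the rough radiation-era scaling $kT \sim 1\,{\rm MeV} \times (t/1\,{\rm s})^{-1/2}$. This normalisation is an order-of-magnitude assumption of the example (it ignores the slowly varying number of particle species), and the function names are hypothetical.

```python
MEV_IN_KELVIN = 1.160e10  # 1 MeV divided by Boltzmann's constant, in kelvin

def thermal_energy_mev(t_seconds):
    """Rough radiation-era thermal energy kT in MeV, assuming kT ~ 1 MeV at t = 1 s."""
    return t_seconds ** -0.5

def temperature_kelvin(t_seconds):
    """The same estimate expressed as a temperature in kelvin."""
    return thermal_energy_mev(t_seconds) * MEV_IN_KELVIN

for t in (1e-4, 1.0, 100.0):
    print(f"t = {t:g} s: kT ~ {thermal_energy_mev(t):g} MeV, "
          f"T ~ {temperature_kelvin(t):.1e} K")
```

With this scaling, 100 MeV corresponds to $t \sim 10^{-4}$ s and $10^{10}$ K to roughly one second, in line with the epochs mentioned in the text.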
The progress of the last 20 years in delineating the big bang model has
therefore brought two sets of questions into sharper focus.
(i) How did the universe behave during the very earliest phases when the
physics is uncertain? Such fundamental properties as its scale, overall
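[Fig. 10.1. Schematic timeline of a standard hot big bang universe, from the Planck time ($10^{-43}$ s, quantum gravity) through inflation ('flatness' + fluctuations?) and baryosynthesis ($n_b/n_\gamma$) near $10^{-35}$ s, where the relevant physics is very speculative (or unknown); an era in which the relevant physics is well known; the last scattering of the microwave background; and the most distant quasar and the present epoch, where the relevant astrophysics is not well known.]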
that the stars and gas we observe may be little more than a tracer for the
material that is dynamically dominant. The evidence for 'dark matter' dates
back more than fifty years, but has firmed up since the classic papers of
Einasto, Kaasik and Saar (1974) and Ostriker, Peebles and Yahil (1974) and
is now quite compelling (for a recent review, see the proceedings of IAU
Symposium 117 (Knapp and Kormendy, 1986)); what the dark matter
consists of is, however, still a mystery.
The masses inferred from relative motions of galaxies in apparently-bound
groups and clusters exceed by a factor $\sim 10$ those inferred from the internal
M/L can be obtained for the matter in the outlying parts of galaxies with
measured rotation curves, and for the haloes of edge-on galaxies. Values of
M/L exceeding 300 solar units are sometimes found (see Rubin (1986) for a
recent review).
Estimates of the masses of galaxy clusters come from the virial theorem - a
technique first applied to the Coma cluster by Zwicky (1933). This method is
now complemented by X-ray studies of thermal emission from hot gas in
clusters, which probe the depth of the gravitational potential well. In
well-studied clusters which appear to have reached virial equilibrium, M/L is
typically 200 solar units. This matter must mainly be in some unknown form
- neither ordinary stars nor the gas that emits X-rays. The data are
summarised in Figure 10.2.
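
A virial mass estimate of the kind described can be sketched as follows. The form $M \sim 3\sigma_{\rm los}^2 R / G$ is an order-of-magnitude assumption (the numerical coefficient depends on the mass profile and on how the radius is defined), and the velocity dispersion and radius used here are hypothetical round numbers, not measurements quoted in the text.

```python
G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30  # solar mass, kg
MPC = 3.086e22    # megaparsec, m

def virial_mass_solar(sigma_los_km_s, radius_mpc):
    """Order-of-magnitude virial mass M ~ 3 sigma_los^2 R / G, in solar masses.

    sigma_los_km_s: line-of-sight velocity dispersion of the member galaxies.
    radius_mpc: characteristic radius of the cluster.
    """
    sigma = sigma_los_km_s * 1.0e3   # convert km/s to m/s
    radius = radius_mpc * MPC        # convert Mpc to m
    return 3.0 * sigma ** 2 * radius / G / M_SUN

# Hypothetical rich-cluster values, chosen only to illustrate the scaling.
mass = virial_mass_solar(sigma_los_km_s=1000.0, radius_mpc=3.0)
print(f"virial mass ~ {mass:.1e} solar masses")
```

Dividing such a mass by the cluster's total luminosity in solar units gives the M/L ratio discussed above; the estimate is only meaningful for systems that have actually reached virial equilibrium.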
All things considered, the existence of dark matter is quite unsurprising -
there are all too many forms it could take, and the aim of observers and
theorists must be to narrow down the range of options. These topics have
been discussed and reviewed at greater length elsewhere (e.g. Dekel, Einasto
and Rees, 1987, and references cited therein).
The present Hubble timescale $t_H$ is still uncertain, so in quoting numerical
values I will follow a widespread convention and introduce a quantity $h =
(3 \times 10^{17}\,{\rm s}/t_H)$. The experts advocate values of $h$ in the range 0.5-1; for a
detailed assessment, see Hodge (1981) or Rowan-Robinson (1985). The
Fig. 10.2. (a) The apparent increase with scale of the mass-to-light ratio.
This increase is due to two distinct trends: (i) ordinary stars and gas are a
decreasing proportion of the mass of the larger systems; and (ii) in the larger
systems, even the 'ordinary' (star + gas) components have a larger M/L,
because they consist primarily of elliptical galaxies with few young stars, and
contain much hot gas revealed only by its X-ray emission. (b) The same data
re-plotted, with effect (ii) subtracted out. One finds that the physically more
fundamental ratio of 'ordinary' matter to 'dark' matter is independent of scale
in all virialized systems larger than galaxies, and has a value consistent with
$\Omega = 0.1$-$0.2$. The situation is less clear on still larger scales (superclusters)
because the dynamics are uncertain and virial equilibrium does not prevail.
This figure is adapted from Faber (1984) and Blumenthal et al. (1984); the
latter paper gives fuller details of the data on which it is based. (The masses of
dwarf spheroidal galaxies are uncertain and controversial; the diagrams
show these systems plotted twice, depending on whether or not they contain
dark matter.)
[Fig. 10.2 plots, panels (a) and (b): mass-to-light ratio against log M (dynamical mass, axis running from 6 to 16), with points labelled dwarf spheroidals, small spiral groups, small E groups and large cluster cores.]
mean baryon density is then $n_b = 11\,\Omega_b h^2$ m$^{-3}$, where $\Omega_b$ is defined as the
fraction of the critical density $\rho_{\rm crit} = (\tfrac{8}{3}\pi G t_H^2)^{-1}$ that is in baryonic form. An
important quantity is
$$\mathcal{S}^{-1} = n_b/n_\gamma = 3 \times 10^{-8} \left(\frac{T}{2.7\,{\rm K}}\right)^{-3} (\Omega_b h^2), \qquad (2.1)$$
$\mathcal{S}$ being a measure of the entropy per baryon; it is a number which GUT
models must attempt to explain.
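
The numerical coefficients in the baryon density and in eq. (2.1) can be reproduced from the convention $t_H = 3 \times 10^{17}\,{\rm s}/h$. The sketch below assumes the critical density is $3/(8\pi G t_H^2)$, counts each baryon as one proton mass, and scales the photon density of $4 \times 10^8$ m$^{-3}$ at 2.7 K as $T^3$; the function names and rounded constants are choices made for this illustration.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_PROTON = 1.6726e-27  # proton mass, kg

def hubble_time_s(h):
    """Hubble timescale from the convention h = 3e17 s / t_H."""
    return 3.0e17 / h

def critical_density(h):
    """Critical density 3 / (8 pi G t_H^2) in kg m^-3."""
    return 3.0 / (8.0 * math.pi * G * hubble_time_s(h) ** 2)

def baryon_number_density(omega_b, h):
    """Mean baryon density in m^-3, counting each baryon as one proton mass."""
    return omega_b * critical_density(h) / M_PROTON

def baryon_to_photon(omega_b, h, temperature_k=2.7):
    """n_b / n_gamma, with the photon density 4e8 m^-3 at 2.7 K scaled as T^3."""
    n_gamma = 4.0e8 * (temperature_k / 2.7) ** 3
    return baryon_number_density(omega_b, h) / n_gamma

print(f"n_b for Omega_b h^2 = 1: {baryon_number_density(1.0, 1.0):.1f} m^-3")
print(f"n_b/n_gamma for Omega_b h^2 = 1: {baryon_to_photon(1.0, 1.0):.1e}")
```

The small difference between the computed coefficient and the rounded value of 11 reflects the rounding of $t_H$ to $3 \times 10^{17}$ s in the convention itself.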
A direct lower limit on $\Omega_b$ of order $10^{-2}$ can be set from the observed
mean density of baryons in conspicuously 'luminous' form (visible galaxies,
and intergalactic gas revealed by its X-ray emission), implying a
baryon-to-photon ratio of not less than about $3 \times 10^{-10} h^2$.
There is no firm evidence for any antimatter in the universe (apart from a
small fraction of antiparticles in cosmic rays, which could have been
produced in high-energy collisions). Strong constraints on the presence of
antimatter in and around our Galaxy are set by the measured limits to the
X-ray background. Nevertheless, if one is strictly agnostic and free from
theoretical preconceptions, one can certainly envisage that the universe
might possess matter-antimatter symmetry (i.e. that the overall net baryon
number, and $n_b/n_\gamma$, are zero) provided that the scale of the regions of each
'sign' is at least as large as a cluster of galaxies.
The inferred dark matter in the halos of individual galaxies and in clusters
of galaxies apparently contributes a fraction $\Omega = 0.1$-$0.2$ of the critical
cosmological density. Its smoother and less clumped distribution suggests
that it underwent less dissipation during the processes of galaxy formation
than the luminous stars and gas. (In a final section I shall address the
separate question of whether the data are compatible with there being still
more dark matter; enough, in particular, to provide the entire critical
density ($\Omega = 1$).)
Three strands of evidence could eventually pin down what the dark
matter in halos and clusters of galaxies really is.
(i) Particle physics. When our theories of high-energy physics become less
speculative, and we can calculate how many particles of each species
(with known mass) should have survived as relics of the big bang, it may
turn out that at least one specific species with non-zero rest mass is
predicted , on the basis of standard cosmology, to contribute
significantly to $\Omega$.
fragmentation of the first clouds proceed right down to the 'opacity limited'
Jeans mass? Or, is fragmentation impeded by collisions (and coalescence) of
protostars, or by tidal effects? I do not believe we can yet answer these
questions with any confidence; it is therefore worth considering both
'Jupiters' and 'VMOs' seriously, in the hope that observations can offer
some firmer clues than theory.
One way of discriminating between the 'Jupiter' and 'VMO' options is by
searching for evidence of gravitational lensing. The probability of seeing
lensing due to an object in our own halo is only of order $10^{-6}$ (Refsdal, 1964;
Paczynski, 1986b); but the cross-section for effective lensing is proportional
Fig. 10.3. This diagram, from Carr, Bond and Arnett (1984), shows various
constraints on the fraction $\Omega_*$ of the critical cosmological density that can be
contributed by first-generation stars (or their remnants) in different mass
ranges. The objects are assumed to form at a redshift $z_f$. There are dynamical
constraints on the number of $\gtrsim 10^6 M_\odot$ black holes in galactic halos, because
such massive bodies would have sedimented inwards via dynamical drag on
the ordinary stars. The requirement not to overproduce heavy elements
constrains the number of remnants of ordinary heavy stars which end their
lives by exploding as supernovae. Stars in the mass range 0.1-1 $M_\odot$ would
still be shining after $\sim 10^{10}$ yr, producing too much background light. The
possible options for dark matter are 'Jupiters' of below 0.1 $M_\odot$, or the
remnants of very massive objects (VMOs) in the mass range from a few
hundred to $10^6 M_\odot$, which could have formed at a large $z_f$. (These latter
objects do not necessarily eject any material processed beyond helium, and
leave black hole remnants.) One way of deciding between these two
possibilities is by seeking evidence for gravitational lensing, as discussed in
the text.
[Fig. 10.3 plot: $\Omega_*$ (vertical axis, $10^{-5}$ to $10^{-1}$) against $M$ in $M_\odot$ (horizontal axis, $10^{-1}$ to $10^{7}$), with 4, 8, 15 and $M_c$ marked on the mass axis.]
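
The small lensing probability quoted for objects in our own halo can be motivated by a standard order-of-magnitude argument that is not spelled out in the text: for a self-gravitating system of lenses the optical depth is roughly $(v/c)^2$, where $v$ is the characteristic orbital speed. The sketch below assumes a halo circular speed of 220 km/s, which is an illustrative value rather than a number from the text.

```python
C_KM_S = 2.998e5  # speed of light, km/s

def lensing_optical_depth(circular_speed_km_s):
    """Order-of-magnitude lensing probability for halo objects, tau ~ (v / c)^2."""
    return (circular_speed_km_s / C_KM_S) ** 2

# 220 km/s is an assumed circular speed for the Galactic halo.
tau = lensing_optical_depth(220.0)
print(f"lensing optical depth ~ {tau:.1e}")
```

In this estimate the probability does not depend on the masses of the individual lenses, so distinguishing 'Jupiters' from VMO remnants relies on other properties of the lensing events.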
how compelling the case for 'standard' big bang nucleosynthesis actually is.
Even in the context of this theory, Applegate, Hogan and Scherrer (1986)
have recently pointed out that small scale inhomogeneities in the
baryon-photon ratio, such as might arise at the quark-hadron transition, could
modify the resultant abundances: if the scale of the inhomogeneities is
smaller than the neutron diffusion scale at $t \approx 1$ s, then the neutron-proton
ratio can vary from place to place in a manner that would be impossible in
any homogeneous model, with the result that the $^4$He abundance can be
lower than in the standard model, and the observed deuterium could be
primordial even if $\Omega_b = 1$.
More than 15 years ago, Cowsik and McLelland (1973) and Marx and
Szalay (1972) conjectured that neutrinos could provide the 'unseen' mass in
galactic halos and clusters. At that time the suggestion was not followed up
very extensively; but in the 1980s physicists became more open-minded
about the possibility of non-zero neutrino masses. A change of theoretical
attitude, coupled with experimental claims that $m_\nu \approx 36$ eV (Lyubimov et al.,
1980), stimulated astrophysicists to explore scenarios for galaxy formation in
which neutrino clustering and diffusion play a key role (see Section 10.4).
More recently, other kinds of non-baryonic matter, such as supersymmetric
particles and axions, have also been considered.
Provided that we know the mass and annihilation cross-section for any
species of elementary particle, we can in principle calculate how many
survive from the big bang, and the resultant contribution each species makes
to $\Omega$. Progress in experimental particle physics may therefore reveal a
particle which must contribute significantly to $\Omega$, unless we abandon the hot
big bang theory entirely. Indeed, once we admit the possibility that
non-baryonic matter may be dynamically important, the ratio $n_{\rm non\text{-}baryonic}/n_b$ in
the universe becomes another fundamental number, as important for
cosmology as $n_b/n_\gamma$.
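
For light neutrinos the relic calculation reduces to a one-line scaling, because the surviving number density is fixed by the photon density. The sketch below uses the commonly quoted approximation $\Omega_\nu h^2 \approx \sum m_\nu / (94\ {\rm eV})$; the normalisation of about 94 eV is an assumption of this example and is not given in the text, and the function name is hypothetical.

```python
EV_PER_UNIT_OMEGA_H2 = 94.0  # assumed: summed neutrino mass (eV) giving Omega h^2 = 1

def omega_neutrino(mass_sum_ev, h):
    """Approximate density parameter of relic light neutrinos.

    mass_sum_ev: sum of the neutrino rest masses in eV.
    h: Hubble parameter in the convention h = 3e17 s / t_H.
    """
    return mass_sum_ev / EV_PER_UNIT_OMEGA_H2 / h ** 2

for h in (0.5, 1.0):
    print(f"m_nu = 36 eV, h = {h}: Omega_nu ~ {omega_neutrino(36.0, h):.2f}")
```

On this estimate a single species with a mass of a few tens of eV would already be dynamically important, which is why the claimed mass attracted so much attention.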
scattering recoil, and could thereby become trapped (Steigman et al., 1978; Spergel
and Press, 1985). Despite being vastly outnumbered by ordinary nuclei, they
could then contribute to energy transport in the solar core, because their
mean free path is so long; the central temperature would then be slightly
lower, with resultant observational consequences such as an alteration in
the frequency of some modes of solar oscillation, and a reduction in the $^8$B
neutrino flux. Over the lifetime of the Sun, an isothermal core of 'inos' could
build up a mass of $10^{-12} M_\odot$ if annihilations did not occur. Annihilations
would restrict this buildup, unless the cross-section for annihilation is far
below that for scattering, or unless the big bang produces an excess of inos
over anti-inos (as it does for baryons). However, even if annihilations
prevent a dense enough core building up to affect the Sun's structure, energetic
neutrinos from these annihilations may reveal their presence in the
underwater detectors developed to search for proton decay. Already, scalar
or Dirac neutrinos with mass exceeding 6 GeV can be excluded, and
analogous limits come from considering annihilations in the Earth rather
than the Sun (Silk et al., 1985).
If our galactic halo were composed of massive weakly interacting
particles, then their density near the Earth would be $\sim 10^5\, m_{\rm GeV}^{-1}$ m$^{-3}$
(with $m_{\rm GeV}$ the particle mass in GeV), and
their typical velocities $\sim 300$ km s$^{-1}$. There is a genuine prospect of
background in the 1960s. A null result would surprise nobody; on the other
hand, such experiments could reveal new supersymmetric particles (or
axions, as the case may be), as well as determining what 90 per cent of our
universe consists of. (Because the detection is sensitive to velocity, they
would even reveal the halo's velocity dispersion and rotation. The mean
velocity of halo particles relative to the detector would have an annual
variation, because of the Earth's motion around the Sun. Such an annual
modulation, with an amplitude of a few per cent and a peak in June, would
be an unambiguous signature discriminating against spurious background.)
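
The size and phase of the annual modulation can be illustrated with a simple kinematic sketch. It assumes that the Sun moves through a non-rotating halo at about 230 km/s, that the Earth's orbital velocity of 30 km/s is inclined at about 60 degrees to that motion, and that the two add most favourably around day 152 of the year (early June). All of these numbers and the function names are assumptions of the example, not values given in the text; only the local density scaling of $10^5\, m_{\rm GeV}^{-1}$ m$^{-3}$ is taken from the passage above.

```python
import math

V_SUN = 230.0           # assumed speed of the Sun through the halo, km/s
V_EARTH = 30.0          # Earth's orbital speed, km/s
INCLINATION_DEG = 60.0  # assumed angle between Earth's orbit and the Sun's motion
PEAK_DAY = 152.0        # assumed day of year of maximum relative speed (early June)

def projected_orbital_speed():
    """Component of the Earth's orbital velocity along the Sun's motion, km/s."""
    return V_EARTH * math.cos(math.radians(INCLINATION_DEG))

def mean_relative_speed(day_of_year):
    """Approximate speed of the detector through the halo on a given day, km/s."""
    phase = 2.0 * math.pi * (day_of_year - PEAK_DAY) / 365.25
    return V_SUN + projected_orbital_speed() * math.cos(phase)

def fractional_modulation():
    """Half the peak-to-trough swing divided by the yearly mean speed."""
    return projected_orbital_speed() / V_SUN

def halo_number_density(mass_gev):
    """Local number density in m^-3, using the scaling 1e5 / m_GeV quoted above."""
    return 1.0e5 / mass_gev

print(f"fractional modulation ~ {100.0 * fractional_modulation():.1f} per cent")
print(f"speed in early June: {mean_relative_speed(152):.0f} km/s, "
      f"in early December: {mean_relative_speed(335):.0f} km/s")
```

A signal whose rate follows this yearly pattern would be hard to mimic with a terrestrial background, which is the point made in the passage.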
models apply back at temperatures of around $10^{15}$ GeV, corresponding to
times $\sim 10^{-36}$ s. The observed baryon-to-photon ratio (eq. (2.1)), a measure
of the fractional excess of baryons over their antiparticles at early times, is
$\sim 10^{-9}$. (Were it much smaller than this, the universe would not be
baryon-dominated when its age was of order a characteristic stellar lifetime.) The
value of the net baryon excess arising from out-of-equilibrium decay of X
and Y particles can be computed, given a specific GUT (Kolb and Wolfram,
1980); it involves a small parameter related to the CP-violation parameter in
weak interactions. This work is not yet on the same footing as the
calculations of primordial helium and deuterium; it is perhaps at the same
level as nucleosynthesis was in the pioneering days of Gamow and Lemaître.
But if it could be firmed up it would represent an extraordinary triumph.
The mixture of radiation and matter characterizing our universe would not
be ad hoc but would be a consequence of the simplest initial conditions. Also,
as well as vindicating a GUT, it would reassure us about extrapolating in
one bound, based on a Friedmann model, right back to the threshold of
classical cosmology, almost back to the Planck time. On a logarithmic scale,
this is a bigger extrapolation from the nucleosynthesis era than is involved in
going to that era from the present time. It would also place constraints on
dissipative processes arising from viscosity, phase transitions, black hole
evaporation, etc., which might occur as the universe cooled through the
'desert' between $10^{15}$ and 100 GeV. Although these ideas are still
speculative, the 'prediction' of the photon-baryon ratio may turn out to
offer one of the few empirical tests of GUTs. (If baryon number were strictly
conserved and the universe actually possessed a conserved quantity of order
$10^{80}$, then the concept of inflation would lose its appeal. The advent of
GUTs is therefore a prerequisite for the viability of such theories,
irrespective of the mechanism that drives the phase of exponential growth.)
If the baryon-photon ratio could be calculated, this would determine $\Omega_b$.
If $\Omega_b < 1$, then a strictly flat universe would require some non-baryonic
contribution. Of course, one may eventually have theoretical knowledge of
the rest masses of all other relevant particles; such information, in
conjunction with knowledge of $n_b/n_\gamma$, would determine their contribution to
$\Omega$ also. Looked at from this point of view, it perhaps seems coincidental
that non-baryonic matter should dominate, but only by an order of
magnitude rather than a vastly larger factor.
amplitudes. Note, however, that we can use these to infer the total density
fluctuations only insofar as galaxies are a good tracer for the overall mass
distribution. Large-scale inhomogeneities would also induce 'peculiar
velocities' - deviations from the Hubble flow - in galaxies and even entire
clusters. The existing evidence on superclustering - though agreed by all to
be of primary importance - is still tentative and ambiguous.
A quite separate line of attack on large-scale structure is offered by the
background radiation. The microwave background isotropy (established
via relative rather than absolute measurements) is already known to be
amazingly precise: effects at the level of one part in $10^4$ would be detectable
in a total cosmic background which is $\sim 10^2$ times weaker than the contribution
from the Earth itself. Recent reviews of the data have been given by
Partridge (1986), Wilkinson (1986) and Kaiser and Silk (1986). There are no
confirmed anisotropies apart from a 'dipole' anisotropy, of $\Delta T/T \approx 1.2 \times
10^{-3}$, indicating a motion of our Local Group of galaxies towards a
direction 45° from the Virgo cluster. The upper limits are of order $10^{-4}$ on
all angles from a few arc minutes up to 90° (quadrupole).
The microwave photons were last scattered at a 'cosmic photosphere'
whose redshift $z_*$ and thickness depend on the thermal history of the matter.
If the primordial plasma (re)combined as it cooled adiabatically below a
temperature of a few thousand degrees, and there was no significant later
reheating, then $z_* \approx 10^3$; the effective thickness of the photosphere,
determined by the width $\Delta z_*$ of the function $(d/dz)e^{-\tau(z)}$, is then only
$\sim 0.1 z_*$. This is because the reduction in the free electron density (which
results in the 'fog lifting' and the microwave photons becoming free to travel
uninterruptedly) is rather sudden. If reheating caused the intergalactic
medium to 'fog up' again at a later epoch, with $\tau > 1$ at redshifts below 1000,
then $z_*$ would be smaller and $\Delta z_* \approx z_*$. Three cases are illustrated in Fig. 10.4.
Note that reheating at $z < 10$ could never generate $\tau > 1$, even if the universe
contained enough ionized intergalactic gas to give $\Omega = 1$.
A Friedmann model with $\Omega_0 < 1$ evolves very like a 'flat' model at
redshifts $> \Omega_0^{-1}$. Provided that $z_* > \Omega_0^{-1}$, which is so for almost all relevant
models, the angle subtended by a given comoving region on the last
scattering shell is essentially independent of $z_*$. A comoving length $l$
corresponds to an angle
$$\theta \approx \tfrac{1}{2}\Omega_0\,(l/ct_H). \qquad (3.1)$$
A region of present length 1 Mpc subtends an angle $\sim \tfrac{1}{2}\Omega_0 h$ arc minutes. A
region that came within the horizon at $z_*$ has present length $l_* \approx 2ct_H \times
(\Omega_0 z_*)^{-1/2}$ and subtends an angle $\theta_* \approx (\Omega_0/z_*)^{1/2}$.
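
The angular scales quoted here can be evaluated numerically. The sketch assumes the relation $\theta \approx \tfrac{1}{2}\Omega_0\, l/(ct_H)$, which is the form consistent with both the 1 Mpc figure and the horizon angle $\theta_* \approx (\Omega_0/z_*)^{1/2}$ given in the text, together with the convention $t_H = 3 \times 10^{17}\,{\rm s}/h$. The function names and rounded constants are illustrative.

```python
import math

C = 2.998e8     # speed of light, m/s
MPC = 3.086e22  # megaparsec, m

def hubble_length_mpc(h):
    """c * t_H in Mpc, with t_H = 3e17 s / h."""
    return C * 3.0e17 / h / MPC

def angle_arcmin(length_mpc, omega_0, h):
    """Angle subtended by comoving length l, using theta ~ (Omega_0 / 2) l / (c t_H)."""
    theta_rad = 0.5 * omega_0 * length_mpc / hubble_length_mpc(h)
    return math.degrees(theta_rad) * 60.0

def horizon_angle_deg(omega_0, z_star):
    """Angle subtended by the horizon scale at z_star: theta_* ~ sqrt(Omega_0 / z_star)."""
    return math.degrees(math.sqrt(omega_0 / z_star))

print(f"1 Mpc, Omega_0 = h = 1: {angle_arcmin(1.0, 1.0, 1.0):.2f} arcmin")
print(f"horizon at z_* = 1000: {horizon_angle_deg(1.0, 1000.0):.1f} degrees")
print(f"horizon at z_* = 10:   {horizon_angle_deg(1.0, 10.0):.0f} degrees")
```

These relations hold only for $z_*$ well above $\Omega_0^{-1}$, where the angle becomes insensitive to the redshift of last scattering.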
by
$$\varepsilon(l) = \left\{ \text{value of } \langle(\delta\rho/\rho)^2\rangle^{1/2} \text{ when scale } l \text{ is first encompassed within particle horizon} \right\}. \qquad (3.2)$$
If the growth were unaffected by pressure, this spectrum leads to
$\langle(\delta\rho/\rho)^2\rangle^{1/2} \propto \varepsilon(l) \times l^{-2}$ for all scales $l$ within the horizon at a given time -
density contrasts in the linear regime would grow $\propto R(t)$, and would
eventually condense out as virialised systems with gravitational binding
energy $\sim \varepsilon c^2$ per unit mass. The quantity $\varepsilon$ is the 'curvature fluctuation',
which one would like to be able to calculate from first principles.
Any fluctuation straddling the cosmic photosphere whose comoving
length-scale exceeds $l_*$, the horizon scale at $z_*$, but is less than $ct_H$ would
generate a temperature anisotropy ($\Delta T/T$) which is simply of order $\varepsilon$, as was
first calculated by Sachs and Wolfe (1967). Similar fluctuations along the
line-of-sight with redshifts less than $z_*$ have smaller effects (Rees and
Sciama, 1968; Dyer, 1976). If $z_* \approx 1000$, then all fluctuations subtending
angular scales $\gtrsim 2\Omega_0^{1/2}$ degrees exceed the horizon scale; even if $z_* \approx 10$ (an
implausibly extreme lower limit, attained only if most of the critical density
Fig. 10.4. Optical depth $\tau(z)$ (dot-dash lines) of the universe back to a
redshift $z$, and 'visibility factors' $\tau e^{-\tau}$ (dotted lines) for two values of $x_e$, the
fraction of the critical density in the form of ionized plasma (assumed
independent of $z$ for illustrative purposes). Even a small amount of reheating
is sufficient to produce a last scattering surface that is more 'smeared out' in
redshift than the standard model of recombination (labelled 'rec'). In some
models for galaxy formation where energy (and maybe heavy elements) are
generated at large redshifts, other kinds of opacity may be competitive with
Thomson scattering (from Hogan, Kaiser and Rees, 1982).
[Fig. 10.4 plot: $\tau$ (vertical axis, 0 to 2) against $1 + z$ (horizontal axis, 1 to 1000), with curves for $x_e = 1$, $x_e = 0.001$ and standard recombination ('rec').]
were in gas which had been reionized at $z > 10$), the horizon scale subtends
$\approx 15°$. Consequently the high isotropy of the background ($\Delta T/T < 10^{-4}$)
implies that, if $\Omega_0 \approx 1$,
$$\varepsilon \lesssim 10^{-4} \ \text{for} \ \begin{cases} 1 > L/ct_H > \tfrac{1}{3} & (z_* \approx 10) \\ 1 > L/ct_H > \tfrac{1}{30} & (z_* \approx 10^3). \end{cases} \qquad (3.3)$$
cause large angle (quadrupole and octopole) effects; the smaller angular
scale observations probe smaller values of $L$ (cf. (3.1)).
[If $\Omega_0 \leq 1$ then our particle horizon will eventually grow to encompass an
infinite amount of matter that we cannot yet observe. Inhomogeneities on
scales $\gtrsim ct_H$ can induce gradients and shear across the region within our
horizon, and may cause anisotropy in the microwave background. The
observed limits to dipole and quadrupole anisotropy can thereby be used to
set some limits on these still larger scales (Grishchuk and Zel'dovich, 1978;
Kaiser, 1982). These limits, which are themselves somewhat
model-dependent, become progressively less stringent (in terms of $\varepsilon$) on larger
scales. We certainly cannot rule out a gross deviation from homogeneity
(bubble wall? Minkowski space?) on scales $\gtrsim 10^2 ct_H$. There can only be
'philosophical' reasons for extrapolating the observed homogeneity of the
part of the universe we can now study to a domain which (if the universe is
open) is infinitely larger.]
The scales to which (3.3) applies are larger than those on which we
observe the most conspicuous inhomogeneity in the distribution of galaxies.
It would be of particular interest in constraining theories of galaxy
formation if we could observe temperature fluctuations due to the same
scales which now display clustering. The upper limits to $\Delta T/T$ on the
relevant scales of a few arc minutes (cf. (3.1)) are at the level of a few
times $10^{-5}$.
On angles corresponding to scales $L > L_*$ we are basically just seeing
'metric fluctuations' which have been unaffected by pressure gradients and
have evolved acausally throughout the (post-inflationary?) Friedmannian
expansion phase of the universe. But if $L < L_*$ various complications arise.
(i) The Sachs-Wolfe contribution to $\Delta T/T$ may not be the dominant one.
Doppler effects due to peculiar motions may be even more important;
so also may effects due to the changes in the recombination and
decoupling time in perturbed regions.
(ii) Pressure gradients and damping may have affected the perturbations
candidates for the dark matter in galactic halos and clusters: low mass stars;
black hole remnants of very massive objects; or non-baryonic matter,
perhaps in the form of supersymmetric particles or axions. I would myself
lay similar odds on these three options at the moment. However, it is
gratifying that we can expect the odds to change quite rapidly, owing either
to improved observational and experimental searches, or to progress in
particle physics.
Fig. 10.5. This diagram, from Rees (1983), depicts the astrophysical limits
on the amplitude $\varepsilon$ of adiabatic metric fluctuations, on various scales $l$. On
large scales, the microwave background isotropy offers stringent upper
limits. On very small scales, $\varepsilon$ must merely be not so large that too much of
the universe collapses as the relevant scales enter the horizon; the absence of
distortion in the microwave background sets a slightly better limit for mass
scales $10^4$-$10^{13} M_\odot$. The requirement that bound systems have 'turned
around' by the present epoch gives lower limits (also plotted). A spectrum
with $\varepsilon \approx 10^{-4}$ on all scales is acceptable if the universe is dominated by
non-baryonic matter which can start clustering before $t_{\rm rec}$. If $\varepsilon$ has a power-law
dependence on $l$ and is $\sim 10^{-4}$ on the scales relevant to galaxy formation,
then it cannot fall off more steeply than $\propto l^{-0.15}$ without causing excessive
production of primordial mini-holes. Also plotted (dotted lines) are the
amplitudes $\varepsilon_g$ of primordial gravitational waves that can, within the next few
years, be probed by: (a) Doppler tracking of spacecraft, (b) timing of 'quiet'
pulsars, and (c) timing of the orbit of the binary pulsar. This diagram is
drawn assuming a 'flat' background universe with $\Omega_0 = 1$. If $\Omega_0 < 1$ the limits
on large scales are modified. Indeed the appropriate definition of $\varepsilon$ is
[Fig. 10.5 plot: amplitude $\varepsilon$ against scale, showing the limits described in the caption, the gravitational-wave curves (a) to (c), and the galaxy-formation requirement with and without 'inos'.]
A separate line of attack that might shorten the list of candidates entails
exploring the implications of each for galaxy formation - specifically, for
processes whereby small primordial perturbations evolve into
protogalaxies and clusters. The key parameter here is the spectrum of the density
fluctuations at the era of recombination, which is determined by initial
conditions (described, for instance, by (3.2), modified by damping processes
during the fireball phase ($t \lesssim 10^6$ years)). The mass-scale of the first bound
system (i.e. the scale on which the growing perturbations after
recombination first become non-linear) roughly corresponds to the scale at
which the fluctuation spectrum peaks. Fig. 10.6 depicts three rival
cosmogonic schemes (see caption for further details). The disparity between
them is highlighted if we envisage what the universe would have been like,
according to each, at the epoch corresponding to z < 10. In hierarchical
'bottom-up' schemes, a lot could have happened already. In others, the
universe is then still amorphous neutral hydrogen, with all parts expanding
in nearly undeviated Hubble flow. The cosmological 'dark age' starts when
microwave background photons shift longward of the visible band, when
the universe is $10^6$ years old: it ends when the first bound systems (or their
constituent stars) light up. We know that quasars - and so, presumably,
some galaxies - had formed at $z \sim 4$ ($t \sim 10^9 h^{-1}$ years if $\Omega_0 = 1$), but this is
merely a lower limit to the redshift of the first non-linearity. We do not know
whether the 'dark age' is brief, or lasts for a billion years. We are more
confused and ignorant about this phase of cosmic history than many seem to
be about the first $10^{-35}$ s.
A related cosmogonic issue is this. To what extent is the present large-scale
structure a rather direct consequence of initial fluctuations, imprinted
at $10^{-35}$ s? Or does it, contrariwise, result from secondary perturbations
generated or seeded by the first bound systems? Hogan (1986) has suggested
the phrase 'paleogeny versus neogeny', to denote this dichotomy.
[Figure: three-panel diagram not reproduced. Recoverable labels: vertical scale marked at redshifts 30, 300 and 3000 and at $10^9$, $10^7$ and $10^6$ (units not recoverable); 'Quasars'; '(Re)combination'; horizontal axes in mass, $10^5$-$10^{15}\,M_\odot$.]
relativistically until the stage in the cosmic expansion when $kT$ fell below
$\sim 10$ eV, with the consequence that scales $\lesssim 10^{15}\,M_\odot$ are homogenised by
free-streaming, leading to a cosmogony where superclusters form first and
then have to fragment into individual galaxies.)
It is relatively straightforward to calculate the growth factor for density
perturbations in 'cold dark matter' between the era when they enter the
particle horizon and the present. All scales grow roughly as the scale factor
(i.e. $(\delta\rho/\rho) \propto R \propto T^{-1} \propto t^{2/3}$) after the epoch when the universe becomes
dynamically dominated by dark matter. This result can be derived by simple
arguments based on Newtonian cosmology (McCrea and Milne, 1934).
However, at earlier times the universe is dominated by photons, and the
expansion timescale is proportional to $(G\rho_{\rm rad})^{-1/2}$; perturbations in the
dark matter alone with a growth time $(G\rho_{\rm CDM})^{-1/2}$ do not then have time to
grow significantly. See Fig. 10.7. This effect, termed 'stagspansion' by
Blumenthal and Primack (1983), was first calculated by Guyot and
Zel'dovich (1970) and Meszaros (1974). If the initial fluctuations have the
Harrison (1970)-Zel'dovich (1972) scale-independent form, so that each
scale has the same amplitude $\varepsilon$ when it is first encompassed within the
particle horizon (i.e. at a time $\propto M$), the present-day fluctuations would be
proportional to $M^{-2/3}$ for large masses, and be essentially independent of $M$
at low mass. The spectrum flattens at low masses because all scales entering
the horizon before the epoch of equal densities undergo essentially the same
growth. The resultant spectrum of density perturbations actually has a
rather gradual 'rollover' in the transition region (Fig. 10.8). There is
nevertheless a characteristic mass imprinted by cosmology, which is smaller
than the mass now within a 'Hubble volume' by a factor $(\Omega_{\rm rad}/\Omega_{\rm DM})^{3/2}$,
where $\Omega_{\rm rad}$ ($\sim 10^{-4}$) is the present contribution of primordial radiation to $\Omega$.
The amplitude of the fluctuations (i.e. the value of $\varepsilon$ in (3.2)) cannot yet be
reliably predicted theoretically. The vertical scaling in Fig. 10.8 can
Galaxy formation and dark matter 483
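The shape of the processed Harrison-Zel'dovich spectrum described above can be illustrated with a short sketch. It is only a rough illustration: it assumes an idealised sharp break at the characteristic mass (the true rollover is gradual), and the function names, the break mass of $10^{15}\,M_\odot$ and the sample values of $\Omega_{\rm rad}$ and $\Omega_{\rm DM}$ are assumptions made for the example, not values taken from the text.

```python
def characteristic_mass_factor(omega_rad: float, omega_dm: float) -> float:
    """Factor by which the imprinted mass is smaller than the Hubble-volume mass."""
    return (omega_rad / omega_dm) ** 1.5


def relative_fluctuation(mass: float, mass_break: float) -> float:
    """Idealised present-day fluctuation amplitude, normalised to 1 at low mass.

    Flat below the break (all such scales undergo the same growth) and
    falling as M**(-2/3) above it. The sharp break is an assumption of
    this sketch; the real spectrum rolls over gradually.
    """
    if mass <= mass_break:
        return 1.0
    return (mass / mass_break) ** (-2.0 / 3.0)


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not from the text).
    factor = characteristic_mass_factor(omega_rad=1e-4, omega_dm=1.0)
    print(f"mass suppression factor: {factor:.1e}")
    for mass in (1e6, 1e12, 1e15, 1e18):  # solar masses
        amplitude = relative_fluctuation(mass, mass_break=1e15)
        print(f"M = {mass:.0e}: relative amplitude {amplitude:.3g}")
```

Only the two power laws and the factor $(\Omega_{\rm rad}/\Omega_{\rm DM})^{3/2}$ come from the discussion above; the overall normalisation is left arbitrary, since the amplitude cannot yet be predicted.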
[Figure: diagram not reproduced. Recoverable labels: $M_H$(CDM); $\delta\rho/\rho$; 'Supercluster', 'Cluster', 'Galaxy'; 'No damping of small scales'; 'No significant growth'.]
Because the spectrum in Fig. 10.8 is so nearly flat for small M, the typical
fluctuation of $\sim 10^6\,M_\odot$ would collapse no earlier than the epoch
[Figure axis: $\log M/M_\odot$, marked from 6 to 18.]
adiabatic), the first bound baryonic systems would be associated with high-$\sigma$
peaks in the dark matter distribution on mass scales $10^5$-$10^6\,M_\odot$. These
would collapse at a redshift $z$ of order $10\nu$, where $\nu$ is the number of standard
deviations above the mean. Everything is straightforward while all
perturbations remain linear, but once the first bound baryonic systems have
formed - once 'first light' occurs - what happens next could be crucially
influenced by feedback from these systems. This influence depends on the
mass spectrum of the first-generation stellar objects.
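The relation between peak height and collapse epoch quoted above can be tabulated with a few lines of code. This is a minimal sketch, assuming that the fluctuations are Gaussian, so that the fraction of material lying above $\nu$ standard deviations is a one-sided Gaussian tail; that assumption, and the function names, are introduced here for illustration. The coefficient 10 in $z \sim 10\nu$ is the order-of-magnitude value given in the text.

```python
import math


def collapse_redshift(nu: float, z_per_sigma: float = 10.0) -> float:
    """Order-of-magnitude collapse redshift, z ~ 10 * nu."""
    return z_per_sigma * nu


def fraction_above(nu: float) -> float:
    """One-sided Gaussian tail: probability of exceeding nu standard deviations.

    Gaussian statistics are an assumption of this sketch.
    """
    return 0.5 * math.erfc(nu / math.sqrt(2.0))


if __name__ == "__main__":
    for nu in (1.0, 2.0, 3.0, 4.0):
        print(f"nu = {nu:.0f}: z ~ {collapse_redshift(nu):.0f}, "
              f"fraction above = {fraction_above(nu):.2e}")
```

The rarer the peak, the earlier it collapses, but the smaller the fraction of matter involved.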
Our poor understanding of early star formation impedes quantitative
study of all galaxy formation schemes, particularly the hierarchical ones
where the first stars form at pregalactic epochs. We cannot reliably decide
Fig. 10.9. This shows three stages in the evolution, at redshifts z = 2.5, 1.0 and
0 respectively, of an N-body computer simulation intended to represent the
contents of a cubical volume of present dimensions $14\,h^{-1}$ Mpc in a universe
with $\Omega = 1$ (Frenk et al., 1985). The initial fluctuations were chosen to match
those expected for cold dark matter (see Fig. 10.8). When the amplitude is
chosen so as to match the present-day scale of clustering there is no further
freedom in the model; it is therefore gratifying that the matter is aggregated in
systems whose properties match those of galactic halos. The baryonic
component, which settles dissipatively in these halos to form the luminous
content of galaxies, is unimportant for the overall dynamics. The present
structure is the outcome of gravitational instability, in an initially almost
homogeneous expanding universe, developing into the non-linear regime. In
a celebrated letter to Bentley, Newton wrote 'It seems to me, that if the matter
of our sun and planets and all the matter of the universe were evenly scattered
throughout all heavens, and every particle had an innate gravity toward all
the rest, and ... if the matter were evenly disposed throughout an infinite
space, it could never convene into one mass; but some of it would convene
into one mass and some into another, so as to make an infinite number of
great masses, scattered at great distances from one another throughout all
that infinite space. And thus might the sun and fixed stars be formed ...'
[Figure: three simulation panels not reproduced; panel titles: z = 2.5, z = 1.0, z = 0.]
whether a collapsing baryonic cloud of $\sim 10^5\,M_\odot$ will turn into a single very
massive star, or fragment into $10^7$ stars of ultralow mass. In the latter case,
the first stars would not inject any significant energy into the remaining gas.
On the other hand, if the first stars were ordinary massive stars or
supermassive objects, their ultraviolet radiation would photoionize the
medium. Indeed, only $\sim 10^{-4}$ of the initial mass need turn into such stars in
[...]
the cosmogonic process until systems exceeding this larger mass undergo
collapse; moreover, no more than $10^{-4}$ of the baryons (i.e. the $\gtrsim 3\sigma$ peaks)
need condense into massive stars before reheating chokes off any further
collapse of $\sim 10^5\,M_\odot$ systems.
Gas would subsequently condense into dark matter potential wells in the
mass range $10^8$-$10^{12}\,M_\odot$. The lower limit is set by the Jeans mass after
reionization, and the upper is the maximum scale where bremsstrahlung and
hydrogen and helium recombination cooling is efficient enough (see Fig.
10.10; Compton cooling would allow larger and hotter clouds to collapse at
$z \gtrsim 10$). The luminosity function of galaxies depends primarily on two
things.
(i) We need to know the mass distribution of isolated virialized systems of
dark matter. This is a hard problem when the initial fluctuations have
the spectrum shown in Fig. 10.8 because of the cross-talk between
various scales, and its solution awaits N-body simulations with larger
dynamic range than have so far been carried out. At the moment it is
unclear whether well-defined substructure survives, or whether bound
systems on scales of, say, $10^8\,M_\odot$ are rapidly engulfed in a larger system
before baryons have much chance to condense within them.
(ii) The luminosity of a galaxy resulting from infall into a given potential
well depends on what fraction of infalling gas is retained in each
virialized clump of dark matter (likely to be larger for deeper potential
wells, i.e. those of large mass and/or those which evolve from high-$\sigma$
peaks) and on the kind of stars it turns into.
The above two issues will need to be settled before we can reliably
estimate the luminosity function of galaxies, even when our starting point is
the specific assumption of a cold dark matter spectrum that evolved from
Harrison/Zel'dovich initial fluctuations.
The following are among the observations that would help to decide
among the three cosmogonic schemes illustrated in Fig. 10.6, and to test the
'cold dark matter' model in particular.
(i) The upper limits on the microwave background fluctuations on small
angular scale constrain the amplitude of $\langle(\delta\rho/\rho)^2\rangle^{1/2}$ at the
Fig. 10.10. This diagram, adapted from Rees and Ostriker (1977), delineates
the mass-radius relation for a self-gravitating gas cloud whose cooling
and dynamical timescales are equal (assuming cooling due only to
bremsstrahlung, H and He recombination, and line emission). A cloud of
given mass whose radius was initially very large would deflate quasistatically
(because $t_{\rm cool} > t_{\rm dyn}$) until it crossed the critical line; it would then collapse in
free fall and could fragment into stars. This simple argument (which can
readily be modified to allow for non-spherical geometry, a non-baryonic
component of mass, etc.) suggests why, irrespective of the cosmological
details, no galaxies form with baryonic masses $> 10^{12}\,M_\odot$ and radii
$> 10^5$ pc. We would like to understand why galaxies have these observed
masses and radii to the same extent that we understand the dimensions of
individual stars. The order-of-magnitude considerations summarised in this
diagram are probably part of the story, but to fill in the details we need to
know more about the initial fluctuations, and also about the efficiency of
star formation in protogalaxies.
[Figure: plot not reproduced. Recoverable axis: $M/M_\odot$, marked at $10^{10}$, $10^{12}$ and $10^{14}$.]
In an influential review paper published more than a decade ago, Gott et al.
(1974) summarized the evidence bearing on $\Omega$. They concluded that the
dynamical evidence favoured a value 0.1-0.2, and noted that if the matter
were all baryonic, the lower end of this range was compatible with the value
favoured by standard big-bang nucleosynthesis (for a Hubble time $t_H \simeq 2 \times
10^{10}$ yr, a value consistent with the ages of globular clusters, etc.). Much new
evidence has accumulated since 1974, especially on cluster dynamics and
element abundances; and some relevant theoretical issues have been refined
and elaborated. But, if one were to update Gott et al.'s discussion, their net
conclusion would not change much.
There has, however, been a marked change in theorists' attitudes. This is
partly because non-baryonic matter is now taken much more seriously, and
seems in some ways almost a natural expectation. But the main element in
the discussion is the concept of 'inflation': this is so appealing, and resolves
some well-known and stubborn cosmological paradoxes in such a natural
way that it instils a strong prejudice in favour of $\Omega_0 = 1$. It is perhaps worth
spelling out the basis for this prejudice.
For all the present observable universe to have evolved from a region that
was in causal contact at the earliest times, inflation by a factor of at least $10^{30}$
is required. In most versions of inflation the exponential growth, once
started, rapidly continues for many expansion timescales: it is likely to
overshoot, stretching any small part of an initial chaotic hypersurface so
that it becomes essentially flat over our present horizon scale. This would
yield $\Omega_0 = 1$, with a precision of order 1 part in $10^4$ (the expected fluctuation
amplitude). For inflation to yield the dynamically preferred value $\Omega = 0.1$ or
0.2 the inflation factor would have to be 'just' $10^{30}$, making the present
Robertson-Walker curvature radius of the order of the Hubble radius. This
would demand some coincidence. But there would then be an additional
requirement that appears still more contrived: our presently observable part
of the universe would have to arise from a segment of the initial hypersurface
with the seemingly very special property that its curvature was uniform to a
few parts in $10^5$; otherwise the curvature fluctuations that could induce
quadrupole effects in the microwave background would not be at least $10^4$
times smaller than the overall Robertson-Walker curvature (Wilkinson,
1986). Our universe could thus not have inflated from a typical element of an
initial chaotic hypersurface: if $\Omega \neq 1$, the required region would have to be
special, rather as a sphere would seem specially smooth if its surface
irregularities amounted to $10^{-5}$ of the uniform mean curvature.
10.5.1 Inferring $\Omega_0$ from the cosmic deceleration
In Friedmann models with zero cosmological constant, $\Omega$ is directly
proportional to the deceleration of the cosmological expansion.
only if the Friedmann equations apply, if the cosmical constant is zero, and if
the dominant form of mass-energy has an associated pressure � -�pc 2 •
Ideally, therefore, one would like to determine $\Omega$ and R independently, as a
test of these assumptions.
10.5.2 'Biased' galaxy formation
The dynamical evidence from clusters and from galactic halos (surveyed in
Section 10.2) does not offer evidence for any value of $\Omega_0$ higher than $\sim 0.2$. If
$\Omega_0$ is indeed unity, then 80 per cent of the mass is unaccounted for even by
dynamical considerations: it is not just the light, but evidence for gravitating
matter itself, which is 'missing'. This raises the interesting question of
whether the dynamical evidence is nonetheless compatible with $\Omega_0 = 1$.
If $\Omega_0 = 1$, then the dominant mass must not participate fully in the
observed clustering: galaxies must be more 'clumped' than mass in general,
so that their spatial distribution enhances and exaggerates the
inhomogeneity on large scales. This requires some kind of 'bias' in the
formation of galaxies. There are three ways this would come about (see
Dekel and Rees, 1987, for fuller discussion):
(i) The entire universe may be pervaded by a uniform component of
'missing mass' of a different nature from the clustered dark matter,
contributing $\Omega \sim 0.8$.
(ii) There may be one important kind of dark matter, but the baryonic
component may be segregated from it even on scales $\sim 30\,h^{-1}$ Mpc, so
that galaxy formation occurs only in certain regions. This could result
from large-scale fluctuations in the initial baryon/photon ratio, from
gas dissipation within superclusters of collisionless dark matter (e.g.
neutrinos), or from very energetic winds or blast waves pushing the gas
over large distances.
(iii) Less extravagant in energy is the possibility that the large-scale baryon
distribution does trace dark matter on scales $\gtrsim 1\,h^{-1}$ Mpc, but the
efficiency with which baryons turn into luminous galaxies is modulated
by large-scale environmental effects.
The universe may be dynamically dominated by 'ultrahot' weakly
under current discussion with the observed universe. Therefore, the search
for an appropriate physical bias mechanism, and especially for confirming
observational evidence, is of great importance.
On the theoretical side, the bias mechanism is intimately related to the
cosmogonic scenario and the nature of the dark matter. Although some of
the proposed bias mechanisms may seem somewhat contrived, others are
very plausible physically : some or all of them must have affected galaxy
formation to some degree and they should be worked out in more detail. It is
evident that the notion that galaxies trace mass is an unjustified assumption.
Answers to the following observational questions would help to
distinguish between various options outlined in this section.
(i) How much diffuse gas is there in the voids?
(ii) Are voids empty of all types of galaxy, or only those types that are most
conspicuous? Any evidence that galaxies of different morphological
types display unequal degrees of clustering is relevant here.
(iii) Are there any galactic-mass dark halos with no luminous galaxy within
them? Such objects might be expected in the cold dark matter model
(assuming fluctuations with random phases) if biasing is indeed
important; and could, if their core radius were small enough, account
for gravitational lensing of quasars even when no lens is visible.
(iv) Have the large-scale inhomogeneities in the galaxy distribution
(superclusters, etc.) given rise to substantial deviations from the Hubble
flow on corresponding scales?
I shall conclude by summarising two candidate cosmologies, both of
which now seem appealing in their different ways :
(i) The first model has $\Omega_b = \Omega_{\rm total} \approx 0.15$ (and a Hubble time $t_H = 2 \times
10^{10}$ yr). The baryons would be partly in 'luminous' form (stars and gas)
and partly in 'dark matter' (low mass stars and/or black holes).
(ii) Alternatively, theoretical predilection may dispose us to favour $\Omega_{\rm total} =
1 \pm 10^{-4}$. (The small uncertainty would arise only from the initial
curvature fluctuations, which need to be $10^{-4}$-$10^{-5}$ on the scale of
galaxies and clusters, and would, in the simplest models, extend with
similar amplitude up to the Hubble radius and even beyond.)
Conventional big bang nucleosynthesis then suggests $\Omega_b \approx 0.1$,
$\Omega_{\rm non\text{-}baryon} \approx 0.9$, and introduces a new dimensionless ratio into
cosmology. The large-scale structure of the universe then involves some
kind of biasing in the formation of bright galaxies.
It may become easier to decide between the options when we have a better
knowledge of galactic evolution and star formation. It is still unclear
whether the luminous parts of galaxies result from infall into a preformed
halo composed of quite different stuff, or whether, contrariwise, galaxies and
their halos resulted from a single dissipative collapse process whereby the
IMF gradually changed as contraction proceeded. Observations of quasars
back to redshifts $z \simeq 4$ can now provide direct evidence of what the universe
was like at an era when galaxy formation was still going on. Numerical
simulations which can follow not only gravitational clustering (cf. Fig. 10.9)
but also the complexities of gas dynamics - shock waves, radiative cooling
and fragmentation - will soon become feasible.
The problems of large-scale cosmogony are so intermeshed that we will
not really solve any until the whole picture comes into better focus. For
instance, we cannot test theories of galaxy formation and evolution until we
understand star formation (and the possible role of active galactic nuclei) as
well as the initial fluctuations. The relevant physics is not specially recondite
- indeed most of the physics that astrophysicists need is in 'Landau and
Lifshitz' - and the most intractable phenomena are complex manifestations
of Newtonian gravity and dissipative gas dynamics. But when we ask where
the initial fluctuations come from or what the dark halos are made of, we
realise that even the most ordinary galaxies pose questions that may
transcend the physics we understand.
References
11.1 Introduction
The origin of structure in the universe is one of the great cosmological
mysteries. Newton thought that it could not be explained by natural causes
and attributed it to God. Now, 300 years after the publication of Principia,
the problem is still unresolved, and possible ways to approach it have
started to emerge only in the last several years. One of the possibilities is
related to cosmic strings, which could arise as a random network of line-like
defects at a phase transition in the early Universe. In this scenario, massive
closed loops of string serve as seeds for the formation of galaxies and clusters
of galaxies. While matter is being accreted onto the loops, they oscillate
violently, lose their energy by gravitational radiation, shrink and disappear.
Apart from their possible role in galaxy formation, strings are fascinating
objects in their own right. The physical properties of strings are very
different from those of more familiar systems and can give rise to a rich
variety of unusual physical phenomena. In particular, if strings exist, they
can produce a number of characteristic observational effects detectable with
existing astronomical instruments.
The physical properties, evolution and cosmological consequences of
strings have been studied extensively for several years, but the subject is still
rapidly expanding. For an up-to-date guide to the literature the reader is
referred to Vilenkin (1985) and Preskill (1985). A review of all aspects of
cosmic strings would require writing a book, and here I have chosen just one
aspect which I think is the most appropriate one for a volume dedicated to
Newton's Principia.
This article reviews the gravitational properties of cosmic strings. Some of
these properties have been used to analyse various cosmological effects of
strings, but here my emphasis will be on the basic physics. Cosmological and
500 A . Vilenkin
$$0 = \int T_i^{\ j}{}_{,j}\, x^k\, d^2x = -\int T_i^{\ j}\, \partial_j x^k\, d^2x = -\int T_i^{\ k}\, d^2x \qquad (i, j, k = 1, 2). \qquad (2.3)$$
Then, neglecting the width of the string, we can write
$$T_\mu^{\ \nu} = \mu\, \delta(x)\, \delta(y)\, \mathrm{diag}(1, 0, 0, 1). \qquad (2.4)$$
We see that the tension along the string is equal to the linear mass density.
where the Lagrangian $\mathcal{L}$ is invariant under (3.2) and (3.3), $\gamma$ is the
determinant of the metric tensor of the surface,
$$\gamma_{ab} = g_{\mu\nu}\, \frac{\partial x^\mu}{\partial \zeta^a}\, \frac{\partial x^\nu}{\partial \zeta^b}, \qquad (3.5)$$
The 'building blocks' for the Lagrangian $\mathcal{L}$ are the string tension $\mu$ and the
geometric quantities, such as the intrinsic and extrinsic curvature of the
surface (3.1) and their covariant derivatives. Note that the 4-velocity $u^\mu$ is
not among the building blocks; the reason is that the local rest frame of the
string, and therefore the 4-velocity, is defined only up to longitudinal
Lorentz boosts.
The dimension of $\mathcal{L}$ is mass squared, and we can write
$$\mathcal{L} = -\mu + \alpha K + \beta K^2/\mu + \cdots, \qquad (3.7)$$
where $K$ stands for the curvature (with indices suppressed) and $\alpha$, $\beta$ are
numerical coefficients. For a string with a typical curvature radius $R$, we
have $K \sim R^{-2} \ll \mu$ (since the string thickness is $\delta \sim \mu^{-1/2}$ and we assumed
that $R \gg \delta$). Hence, the curvature-dependent terms in (3.7) can be neglected
and we obtain
$$S = -\mu \int (-\gamma)^{1/2}\, d^2\zeta. \qquad (3.8)$$
This is the Nambu action for a string (see, for example, Scherk, 1975). Up to
an overall factor, it is given by the surface area traversed by the string in
† Note that the second requirement is not satisfied for global strings. The long-range
force in this case is due to the interaction of strings with Goldstone bosons; the
corresponding action functional is derived in Vilenkin and Vachaspati (1986).
Gravitational interactions of cosmic strings 503
spacetime. Note that (3.8) is similar to the action for a relativistic particle,
$$S = -m \int ds, \qquad (3.9)$$
which is proportional to the length of the particle's world line.
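Varying the action (3.8) with respect to $x^\mu(\zeta^a)$ we obtain the equations of
motion for a string:
$$\frac{\partial}{\partial \zeta^0}\left\{(-\gamma)^{-1/2}\left[(\dot{x} \cdot x')\, x'^\mu - x'^2\, \dot{x}^\mu\right]\right\} + \frac{\partial}{\partial \zeta^1}\left\{(-\gamma)^{-1/2}\left[(\dot{x} \cdot x')\, \dot{x}^\mu - \dot{x}^2\, x'^\mu\right]\right\} = 0. \qquad (3.10)$$
The energy-momentum tensor can be found by varying $S$ with respect to $g_{\mu\nu}$
(Turok, 1984; Vachaspati, 1986):
$$T^{\mu\nu} = \mu \int d^2\zeta\, (-\gamma)^{-1/2}\, \delta^{(4)}\!\left(x^\sigma - x^\sigma(\zeta^a)\right) \left\{x'^2\, \dot{x}^\mu \dot{x}^\nu + \dot{x}^2\, x'^\mu x'^\nu - (\dot{x} \cdot x')\left(\dot{x}^\mu x'^\nu + x'^\mu \dot{x}^\nu\right)\right\}. \qquad (3.11)$$
It is easily verified that for a straight string in flat spacetime lying along the z
axis, $t = \zeta^0$, $z = \zeta^1$, $x = y = 0$, and eq. (3.11) reduces to eq. (2.4).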
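$$T^{\mu\nu}(\mathbf{r}, t) = \mu \int d\zeta\, \left(\dot{x}^\mu \dot{x}^\nu - x'^\mu x'^\nu\right) \delta^{(3)}\!\left(\mathbf{r} - \mathbf{x}(\zeta, t)\right). \qquad (4.7)$$
$$E = \int T_0^{\ 0}\, d^3x = \mu \int d\zeta. \qquad (4.8)$$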
In the following two sections we shall find some solutions of the string
equations of motion in flat spacetime .
$T = L/2$, since
$$\mathbf{x}(\zeta + L/2,\, t + L/2) = \mathbf{x}(\zeta, t). \qquad (5.5)$$
In fact, the period can be smaller than $L/2$ for some special loop trajectories.
An interesting property of the loop solutions is that the string typically
reaches the velocity of light at some points at certain moments during its
period (Turok, 1984). From eq. (5.1) we have
$$\dot{\mathbf{x}}^2(\zeta, t) = \tfrac{1}{4}\left[\mathbf{a}'(\zeta - t) - \mathbf{b}'(\zeta + t)\right]^2. \qquad (5.6)$$
Now, it follows from (5.2) and (5.3) that the vector functions $\mathbf{a}'(\zeta)$ and $-\mathbf{b}'(\zeta)$
describe closed curves on a unit sphere as $\zeta$ runs from 0 to $L$. These functions
should satisfy
(5.7)
and are otherwise arbitrary. If the two curves intersect, $\mathbf{a}'(\zeta_a) = -\mathbf{b}'(\zeta_b)$, then
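appropriate choice of axes (Fig. 11.1) the shape of the string near a cusp is
given by $y \propto x^{2/3}$.
The rms string velocity in a loop can be defined as $v_{\rm rms} = \langle v^2 \rangle^{1/2}$, where
$$\langle v^2 \rangle = \frac{1}{T}\int_0^T dt\, \frac{1}{L}\int_0^L d\zeta\, \dot{\mathbf{x}}^2(\zeta, t) = \tfrac{1}{2} - \tfrac{1}{2}\left\langle \mathbf{a}'(\zeta - t) \cdot \mathbf{b}'(\zeta + t) \right\rangle \qquad (5.12)$$
and I have used eq. (5.6). Using the identity
$$\frac{d}{dt}\int \mathbf{a}' \cdot \mathbf{b}\; d\zeta = \int \left(-\mathbf{a}'' \cdot \mathbf{b} + \mathbf{a}' \cdot \mathbf{b}'\right) d\zeta$$
and the fact that an average over period of a time derivative is equal to zero,
we see that the last term in (5.12) vanishes, and thus
$$\langle v^2 \rangle = 0.5. \qquad (5.14)$$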
To give a specific example of loop trajectories, we can choose $\mathbf{a}$ and $\mathbf{b}$ to
describe circular motion in planes at an angle $\phi$ to one another:
$$\mathbf{a}(\zeta) = \alpha^{-1}\left(\mathbf{e}_1 \sin\alpha\zeta + \mathbf{e}_3 \cos\alpha\zeta\right),$$
$$\mathbf{b}(\zeta) = \beta^{-1}\left[(\mathbf{e}_1 \cos\phi + \mathbf{e}_2 \sin\phi)\sin\beta\zeta + \mathbf{e}_3 \cos\beta\zeta\right]. \qquad (5.15)$$
Here, $\mathbf{e}_i$ is a unit vector along the $x_i$-axis, $\alpha = 2\pi m/L$, $\beta = 2\pi n/L$, m and n are
relatively prime integers (otherwise the parameter $\zeta$ traverses the loop more
than once between 0 and L and can be redefined to make m and n relatively
prime). It is easily checked that the period of the loop (5.15) is T = L/2n. The
family of solutions (5.15) was found by Burden (1985). Solutions with
(m, n) = (1, 1) as well as other families of solutions had been studied earlier by
Kibble and Turok (1982) and Turok (1984).
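The result $\langle v^2 \rangle = 0.5$ can be checked numerically for this family of loops. The following is a minimal sketch, not part of the original analysis: it assumes the loop is written as $\mathbf{x}(\zeta, t) = \tfrac{1}{2}[\mathbf{a}(\zeta - t) + \mathbf{b}(\zeta + t)]$, so that the squared velocity is $\tfrac{1}{4}[\mathbf{a}'(\zeta - t) - \mathbf{b}'(\zeta + t)]^2$ as in eq. (5.6), with $\mathbf{a}' = \mathbf{e}_1\cos\alpha\zeta - \mathbf{e}_3\sin\alpha\zeta$ and $\mathbf{b}' = (\mathbf{e}_1\cos\phi + \mathbf{e}_2\sin\phi)\cos\beta\zeta - \mathbf{e}_3\sin\beta\zeta$. The function names, the grid size and the choice $L = 1$, $\phi = 1$ are illustrative assumptions.

```python
import math


def a_prime(zeta, m, L):
    """Derivative a'(zeta) for the loop family: e1 cos(alpha zeta) - e3 sin(alpha zeta)."""
    alpha = 2.0 * math.pi * m / L
    return (math.cos(alpha * zeta), 0.0, -math.sin(alpha * zeta))


def b_prime(zeta, n, L, phi):
    """Derivative b'(zeta); phi is the angle between the two planes of motion."""
    beta = 2.0 * math.pi * n / L
    c = math.cos(beta * zeta)
    return (math.cos(phi) * c, math.sin(phi) * c, -math.sin(beta * zeta))


def speed_squared(zeta, t, m, n, L, phi):
    """Squared string velocity, (1/4) [a'(zeta - t) - b'(zeta + t)]^2."""
    ap = a_prime(zeta - t, m, L)
    bp = b_prime(zeta + t, n, L, phi)
    return 0.25 * sum((p - q) ** 2 for p, q in zip(ap, bp))


def mean_square_velocity(m, n, L=1.0, phi=1.0, steps=200):
    """Average of the squared velocity over the loop and over a time interval L.

    The interval L is a whole number of oscillation periods, so a uniform
    grid in zeta and t gives the period average.
    """
    total = 0.0
    for i in range(steps):
        t = L * i / steps
        for j in range(steps):
            total += speed_squared(L * j / steps, t, m, n, L, phi)
    return total / steps ** 2


if __name__ == "__main__":
    for m, n in ((1, 1), (1, 2), (2, 3)):
        print(f"(m, n) = ({m}, {n}): <v^2> = {mean_square_velocity(m, n):.6f}")
    # For m = n = 1 the point zeta = L/2 at t = L/4 moves at the speed of light.
    print("cusp speed squared:", speed_squared(0.5, 0.25, 1, 1, 1.0, 1.0))
```

The same routine can be used to locate the luminal points of a given loop by scanning `speed_squared` over $\zeta$ and $t$.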
Eq. (5.15) with m = n = 1 describes an elliptical loop which rotates and
turns into a double line; at that moment the ends of the line are moving at
the speed of light. Then the loop returns to the elliptical shape and goes
through the cycle again. The degenerate cases $\phi = 0$ and $\phi = \pi$ correspond to
an oscillating circular loop and to a rotating double line, respectively. Loops
with $(m, n) \neq (1, 1)$ never collapse to a double line; in fact, loops with $m = 1$,
$n \neq 1$ never self-intersect. Several snapshots of a loop with m = 1, n = 2,
(4.1). Then the string trajectory is described by two functions, x(z, t) and
y(z, t), and it is easily verified that
$$x = f(z \pm t), \qquad y = g(z \pm t) \qquad (6.1)$$
is a solution of eq. (3.10) for arbitrary functions f and g. These solutions
describe waves of arbitrary shape propagating along the string with the
velocity of light. Note that a superposition of waves travelling in opposite
directions is not a solution, since eq. (3.10) is nonlinear.
$\rho \to \infty$. It is unlikely that this metric has anything to do with strings which
could have formed in our Universe, and we dismiss the case (7.5) as
unphysical. The remaining solution (7.4) is just the metric of flat space in
cylindrical coordinates,
$$ds^2 = dt^2 - d\rho^2 - \rho^2\, d\phi^2 - dz^2. \qquad (7.6)$$
Thus, we have reached a surprising conclusion that the spacetime outside a
straight string is flat. We shall see, however, that it is only locally flat and
that globally it is not equivalent to Minkowski space (Vilenkin, 1981a).
To represent the interior of the string, we shall first assume a simple
energy-momentum tensor of the form (Linet, 1985; Gott, 1985; Hiscock,
1985)
$$T_\mu^{\ \nu} = \sigma(\rho)\, \mathrm{diag}(1, 0, 0, 1). \qquad (7.7)$$
The general case will be discussed later. The form (7.7) is clearly consistent
with all the symmetries of the string. In the limit of negligible string
thickness, $\sigma(\rho)$ is a $\delta$-function and (7.7) reduces to (2.4). Substituting (7.1)
and (7.7) in Einstein's equations, it is easily shown that $A(\rho)$ = const. By a
suitable rescaling of t and z we can set A = 1; then
$$ds^2 = dt^2 - dz^2 - d\rho^2 - B^2(\rho)\, d\phi^2 \qquad (7.8)$$
and Einstein's equations reduce to a single equation for the function $B(\rho)$:
$$B''/B = -8\pi G \sigma. \qquad (7.9)$$
The metric (7.8) is nonsingular on the axis $\rho = 0$ only if
$$B(0) = 0, \qquad B'(0) = 1. \qquad (7.10)$$
The mass per unit length of string is
$$\mu = \int_0^\infty d\rho \int_0^{2\pi} d\phi\; \sigma \left({}^{(2)}g\right)^{1/2} = \frac{1}{4G}\left[1 - B'(\infty)\right], \qquad (7.11)$$
where ${}^{(2)}g_{ij}$ is the metric on the surface (t, z) = const and ${}^{(2)}g = B^2$ is its
determinant. At large distances from the string ($\rho \gg \delta$), $\sigma \to 0$, $B'(\rho) \to B'(\infty)$
and we obtain
$$ds^2 = dt^2 - dz^2 - d\rho^2 - (1 - 4G\mu)^2 \rho^2\, d\phi^2. \qquad (7.12)$$
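The relation $B'(\infty) = 1 - 4G\mu$ can be verified numerically for a simple model of the string interior. The sketch below is illustrative only: it assumes a uniform core, $\sigma = \sigma_0$ for $\rho$ smaller than a chosen radius and zero outside, works in units with $G = 1$, and integrates eq. (7.9) outward from the conditions (7.10) with a fourth-order Runge-Kutta step. The core profile, the parameter values and the function names are assumptions made for the example. For the uniform core the interior solution is $B = \sin(k\rho)/k$ with $k^2 = 8\pi G \sigma_0$, which gives an independent check.

```python
import math


def solve_string_interior(sigma, G, rho_max, steps=20000):
    """Integrate B'' = -8 pi G sigma(rho) B outward from the axis (RK4).

    Starts from B(0) = 0, B'(0) = 1 and accumulates the mass per unit
    length mu = 2 pi * integral of sigma(rho) B(rho) d(rho).
    Returns (B, dB, mu) evaluated at rho_max.
    """
    def deriv(rho, state):
        B, dB, _mu = state
        s = sigma(rho)
        return (dB, -8.0 * math.pi * G * s * B, 2.0 * math.pi * s * B)

    h = rho_max / steps
    state = (0.0, 1.0, 0.0)
    for i in range(steps):
        rho = i * h
        k1 = deriv(rho, state)
        k2 = deriv(rho + h / 2, tuple(y + h / 2 * k for y, k in zip(state, k1)))
        k3 = deriv(rho + h / 2, tuple(y + h / 2 * k for y, k in zip(state, k2)))
        k4 = deriv(rho + h, tuple(y + h * k for y, k in zip(state, k3)))
        state = tuple(
            y + h / 6 * (a + 2 * b + 2 * c + d)
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


def uniform_core(sigma0, radius):
    """Hypothetical core profile: constant density inside `radius`, zero outside."""
    return lambda rho: sigma0 if rho < radius else 0.0


if __name__ == "__main__":
    G, sigma0, radius = 1.0, 0.01, 1.0  # illustrative values
    B, dB, mu = solve_string_interior(uniform_core(sigma0, radius), G, rho_max=2.0)
    k = math.sqrt(8.0 * math.pi * G * sigma0)
    print("B'(outside)      :", dB)
    print("1 - 4 G mu       :", 1.0 - 4.0 * G * mu)
    print("analytic cos(k b):", math.cos(k * radius))
```

Outside the core $B'$ stays constant, so the exterior metric takes the form (7.12) up to a shift of the radial coordinate.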
A coordinate transformation
$$(1 - 4G\mu)\phi \to \phi \qquad (7.13)$$
brings the metric (7.12) to a locally Minkowskian form (7.6), but then the
angle $\phi$ varies in the range
$$0 < \phi < (1 - 4G\mu)2\pi. \qquad (7.14)$$
Thus, the effect of the string is to introduce an azimuthal 'deficit angle'
$$\delta = 8\pi G\mu, \qquad (7.15)$$
with the result that a surface of constant t and z has the geometry of a cone
rather than that of a plane. The point of the cone is smoothed on a scale
$\rho \sim \delta$. In the limit of an infinitely thin string the cone has a sharp point; the
corresponding spacetime is called conical.
The dimensionless parameter $G\mu$ plays an important role in the physics of
cosmic strings. Its magnitude can be estimated using eq. (2.1),
$$G\mu \sim (\eta/m_P)^2 \ll 1, \qquad (7.16)$$
where $\eta$ is the symmetry breaking scale of strings, $m_P$ is the Planck mass and
it is assumed that $\eta \ll m_P$. The string scenario of galaxy formation requires
$G\mu \sim 10^{-6}$, which corresponds to $\eta \sim 10^{16}$ GeV.
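These order-of-magnitude statements are easily put into numbers. The sketch below evaluates the deficit angle $\delta = 8\pi G\mu$ and the symmetry breaking scale implied by $G\mu \sim (\eta/m_P)^2$ for $G\mu = 10^{-6}$. The Planck mass of about $1.22 \times 10^{19}$ GeV is the standard value and is not quoted in the text; the function names are illustrative, and since the relation for $\eta$ holds only up to a numerical factor of order unity the result should be read as an estimate.

```python
import math

# Standard approximate value of the Planck mass; an input assumed for this sketch.
PLANCK_MASS_GEV = 1.22e19


def deficit_angle(g_mu: float) -> float:
    """Conical deficit angle in radians, delta = 8 pi G mu."""
    return 8.0 * math.pi * g_mu


def symmetry_breaking_scale_gev(g_mu: float) -> float:
    """Order-of-magnitude estimate of eta from G mu ~ (eta / m_P)^2."""
    return PLANCK_MASS_GEV * math.sqrt(g_mu)


def radians_to_arcsec(angle: float) -> float:
    """Convert an angle from radians to seconds of arc."""
    return math.degrees(angle) * 3600.0


if __name__ == "__main__":
    g_mu = 1e-6
    delta = deficit_angle(g_mu)
    print(f"deficit angle: {delta:.3e} rad = {radians_to_arcsec(delta):.2f} arcsec")
    print(f"eta ~ {symmetry_breaking_scale_gev(g_mu):.2e} GeV")
```

A deficit angle of a few seconds of arc sets the scale of the angular effects that a string with this value of $G\mu$ can produce.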
11.8 Gravitational field of a straight string (continued)
We derived eq. (7.15) for the conical deficit angle assuming a simple form of
the energy-momentum tensor (7.7). In the general case, the components $T_\rho^{\ \rho}$
and $T_\phi^{\ \phi}$ do not vanish inside the string and eq. (7.15) has to be modified. We
shall see, however, that this equation holds in the cosmologically interesting
case of $G\mu \ll 1$ (the reason, basically, is that for $G\mu \ll 1$ the metric inside the
string is approximately flat and that transverse tensions average out to zero
in flat spacetime (see eq. (2.3))).
We shall first express the deficit angle in terms of the curvature tensor in
the interior of the string (Ford and Vilenkin, 1981). Let S denote a surface
(t, z) = const and let ${}^{(2)}g_{ij}$ be the metric on S. The scalar curvature of S is
$${}^{(2)}R = -2B''/B. \qquad (8.1)$$
Integrating ${}^{(2)}R$ over the surface and using eq. (7.12) we obtain the following
expression for the deficit angle:
$$\delta = \tfrac{1}{2} \int {}^{(2)}R \left({}^{(2)}g\right)^{1/2} d^2x. \qquad (8.2)$$
This result, combined with eq. (8.2), enables us to find the deficit angle in
terms of the Riemann tensor of the four-dimensional spacetime.
In general, it does not seem to be possible to express $\delta$ in terms of the
four-dimensional Ricci tensor, and hence of the energy-momentum tensor of the
string, $T_{\mu\nu}$. However, in the case that the gravitational field is sufficiently
weak that the linearized theory may be applied, such an expression can be
given. Let the metric be
$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, (8.4)
where $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$ and $h_{\mu\nu} \ll 1$. With the gauge condition
$\partial_\nu (h^\nu_\mu - \tfrac{1}{2} \delta^\nu_\mu h) = 0$ (8.5)
the linearized field equations become
$\Box h_{\mu\nu} = -16\pi G S_{\mu\nu}$, (8.6)
where $\Box = \partial_t^2 - \nabla^2$, the indices are raised and lowered using the flat metric
$\eta_{\mu\nu}$ and
$S_{\mu\nu} = T_{\mu\nu} - \tfrac{1}{2} \eta_{\mu\nu} T$. (8.7)
The linearized Riemann tensor is
$^{(4)}R_{\alpha\mu\beta\nu} = \tfrac{1}{2} (h_{\alpha\nu,\mu\beta} + h_{\mu\beta,\alpha\nu} - h_{\mu\nu,\alpha\beta} - h_{\alpha\beta,\mu\nu})$. (8.8)
For a metric independent of t and z,
$^{(2)}R = 2\,{}^{(4)}R_{1212} = 2h_{12,12} - h_{11,22} - h_{22,11}$. (8.9)
Using (8.5) and (8.6), this can be rewritten as
$^{(2)}R = -16\pi G (T_{11} + T_{22} - \tfrac{1}{2} T)$. (8.10)
Substituting this in (8.2) and using (2.2), (2.3), we obtain
$\delta = 8\pi G \int T^0_0 \, d^2x = 8\pi G\mu$. (8.11)
The use of the linear perturbation theory is justified for $G\mu \ll 1$, which is the
case for strings of cosmological interest.
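To get a feeling for the size of the deficit angle in eq. (8.11), one can evaluate it for the value $G\mu \sim 10^{-6}$ mentioned earlier. The following is a minimal sketch; the function name and the conversion to arcseconds are illustrative choices, not part of the original derivation.

```python
import math


def deficit_angle(g_mu: float) -> float:
    """Conical deficit angle delta = 8*pi*G*mu in radians (eq. 8.11, G*mu << 1)."""
    return 8.0 * math.pi * g_mu


delta_rad = deficit_angle(1e-6)
delta_arcsec = math.degrees(delta_rad) * 3600.0
print(f"delta = {delta_rad:.2e} rad = {delta_arcsec:.1f} arcsec")
```

The result is a few arcseconds, which sets the angular scale of the double images sketched in Fig. 11.3.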
A non-perturbative analysis of Einstein's equations with a realistic
energy-momentum tensor of a string has been given by Garfinkle (1985).
The gravitational field of global strings, for which the energy-momentum
tensor falls off only as an inverse square of the distance from the string, has
been discussed by Aryal and Everett (1986). I should mention also that
spacetimes with conical singularities had been studied long before their
relevance to cosmic strings was recognized. See, for example, Bach and Weyl
(1922), Sokolov and Starobinsky (1977) and Israel (1977).
Gravitational interactions of cosmic strings 51 1
Fig. 11.3. Light rays emitted by the quasar intersect behind the string and
the observer sees two images of the same quasar.
512 A . Vilenkin
For example, the metrics
$ds^2 = (1 - 2m/r)\,dt^2 - (1 - 2m/r)^{-1}\,dr^2 - r^2\,d\Omega^2$, (10.1)
$ds^2 = dt^2 - a^2(t)[(1 - kr^2)^{-1}\,dr^2 + r^2\,d\Omega^2]$ (10.2)
with
$d\Omega^2 = d\theta^2 + (1 - 4G\mu)^2 \sin^2\theta\,d\phi^2$ (10.3)
describe a black hole with a string passing through it, and a Robertson-Walker
universe with a string, respectively.
Some well-known solutions of Einstein's equations can be re-interpreted
in terms of cosmic strings. For example, the solution found many years ago
by Bach and Weyl (1922) describes, with an appropriate choice of
integration constants, a pair of black holes held apart by cosmic strings
extending to infinity in opposite directions. If one of the black holes is
removed to a very large, but not infinite, distance, one obtains a solution
representing a black hole suspended by a string in a weak uniform
gravitational field, g. To linear order in g,
$ds^2 = (1 - 2m/r)(1 + 2gz)\,dt^2 + (1 - 2gz)[(1 - 2m/r)^{-1}\,dr^2$
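Using eqs. (4.7) and (8.7) and integrating over $\mathbf{r}'$ we obtain (Turok, 1984;
Vachaspati, 1986)
$h_{\mu\nu}(\mathbf{r}, t) = -4G\mu \int \frac{F_{\mu\nu}(\zeta, \tau)\, d\zeta}{|\mathbf{r} - \mathbf{x}(\zeta, \tau)|\,(1 - \mathbf{n} \cdot \dot{\mathbf{x}}(\zeta, \tau))}$, (11.4)
where
$\mathbf{n} = \frac{\mathbf{r} - \mathbf{x}(\zeta, \tau)}{|\mathbf{r} - \mathbf{x}(\zeta, \tau)|}$, (11.5)
$F_{\mu\nu} = \dot{x}_\mu \dot{x}_\nu - x'_\mu x'_\nu + \eta_{\mu\nu} x'^\alpha x'_\alpha$ (11.6)
and the retarded time $\tau$ is defined by
$\tau = t - |\mathbf{r} - \mathbf{x}(\zeta, \tau)|$. (11.7)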
The motion of non-relativistic test particles is determined mainly by the
time-averaged field of the loop,
$\langle h_{\mu\nu}(\mathbf{r}) \rangle = \frac{2}{L} \int_0^{L/2} dt\, h_{\mu\nu}(\mathbf{r}, t)$. (11.8)
The period of the oscillating component of the field, T = L/2, is much shorter
than the time it takes for the particle to traverse a distance $\sim L$. The effect of
the oscillating component on the particle trajectory at a distance $\lesssim L$ from
(11.9)
we obtain for the average field (Turok, 1983)
$\langle h_{\mu\nu}(\mathbf{r}) \rangle = -\frac{8G\mu}{L} \int_0^{L/2} d\tau \int_0^L d\zeta\, \frac{F_{\mu\nu}(\zeta, \tau)}{|\mathbf{r} - \mathbf{x}(\zeta, \tau)|}$. (11.10)
$h_{\mu\nu}(\mathbf{r}, t) \approx -4G\mu F_{\mu\nu} r^{-1} \int_{-\infty}^{\infty} d\zeta\, (1 - \mathbf{n} \cdot \dot{\mathbf{x}}(\zeta, \tau))^{-1}$, (11.17)
The metric (11.22) is singular at z = t, $h_{\mu\nu} \propto |z - t|^{-1/3}$,
and the z-component of the force acting on a test particle is
$F_z \propto (z - t)^{-4/3}\,\mathrm{sign}(z - t)$. (11.24)
The force which is initially repulsive, grows in magnitude, becomes infinite
and instantaneously changes into an infinite attractive force, which then
decreases. (Note, however, that the total momentum transferred to the
particle is finite.) This burst of gravitational field propagates along the beam
at the speed of light. Of course, the linear perturbation theory cannot be
trusted at points where $h_{\mu\nu}$ diverges, but we expect eq. (11.24) to apply for
|z - t| not too small. It is possible that nonlinear effects and the back reaction
of the gravitational field on the string near the cusps make $h_{\mu\nu}$ finite at z = t.
Eq. (11.17) gives the most singular components of the gravitational field
on the beam. The analysis of other components and of the gravitational field
slightly off the beam is rather complicated and will not be given here.
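can be found from the following equations (Weinberg, 1972):
$P = \dot{E} = \sum_n P_n = \sum_n \int d\Omega\, \frac{dP_n}{d\Omega}$, (12.4)
$\frac{dP_n}{d\Omega} = \frac{G\omega_n^2}{\pi} \left\{ T^*_{\mu\nu}(\omega_n, \mathbf{k})\, T^{\mu\nu}(\omega_n, \mathbf{k}) - \tfrac{1}{2} |T^\nu_\nu(\omega_n, \mathbf{k})|^2 \right\}$. (12.5)
Here, $dP_n/d\Omega$ is the radiation power at frequency $\omega_n = 2\pi n/T = 4\pi n/L$ per
unit solid angle in the direction of $\mathbf{k}$, $|\mathbf{k}| = \omega_n$ and
$T^{\mu\nu}(\omega_n, \mathbf{k}) = \frac{1}{T} \int_0^T dt\, \exp(i\omega_n t) \int d^3x\, \exp(-i\mathbf{k} \cdot \mathbf{x})\, T^{\mu\nu}(\mathbf{x}, t)$ (12.6)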
and $\rho_c$ is the critical density. Eq. (12.9) is expected to apply in a wide range of
frequencies,
(12.10)
With $G\mu \sim 10^{-6}$, as required by the cosmic string scenario of galaxy
formation, eq. (12.9) gives $\Omega_g(\omega) \sim 10^{-7}$. A gravitational background of
such intensity should be observable using the millisecond pulsar (Hogan
and Rees, 1984; Witten, 1984).
[Figure: $\gamma$ (vertical axis, 0 to 200) plotted for n = 1, n = 3 and n = 5.]
It vanishes for loops described by eq. (5.15) because of their high symmetry.
In general, one expects $|\dot{\mathbf{P}}|$ to be of the same order as $\dot{E}$ in eq. (12.2). A
numerical calculation for several asymmetric loops gives (Vachaspati and
Vilenkin, 1985)
$|\dot{\mathbf{P}}| = \gamma_P G\mu^2$ (13.2)
with $\gamma_P \sim 10$.
A loop radiating momentum at the rate (13.2) will move with acceleration
$\dot{v} = \gamma_P G\mu/L$. (13.3)
If it starts from rest, then by the end of its life it will reach the velocity
$v \sim \dot{v}\tau \sim \gamma_P/\gamma \sim 0.1$, (13.4)
where $\gamma$ is from eq. (12.3).
In deriving the estimate (13.4) we assumed that the direction of $\dot{\mathbf{P}}$ does not
change appreciably during the loop's lifetime. This is far from being clear.
For example, one can argue that angular momentum radiation can prevent
the loop from accumulating a large velocity. The angular momentum of a
loop is $l \sim \mu L^2$ and the angular momentum radiation rate can be written as
(on dimensional grounds)
$|\dot{\mathbf{l}}| = \gamma_l G\mu^2 L$, (13.5)
where $\gamma_l$ is a numerical coefficient. Let us first consider the component of $\dot{\mathbf{l}}$
parallel to $\mathbf{l}$, which changes only the magnitude of $\mathbf{l}$. If the torque (13.5)
causes the loop to rotate as a solid body, then the corresponding angular
acceleration is $\dot{\Omega} \sim G\mu L^{-2}$. The time it takes the loop to rotate by about one
radian is $\Delta t \sim \dot{\Omega}^{-1/2} \sim (G\mu)^{-1/2} L$, and the velocity accumulated during this
time with acceleration (13.3) is $v \sim (G\mu)^{1/2} \ll 1$. Note, however, that the
assumption that the loop reacts to a torque like a solid is not well justified.
Besides, even if it does, the rotation axis is not, in general, at right angles to
the direction of $\dot{\mathbf{P}}$, and then the loop will accelerate along the direction of $\mathbf{l}$.
Torques perpendicular to $\mathbf{l}$ tend to change the direction of angular
momentum. However, because of the large value of $l$, the loop behaves as a
relativistic gyroscope, and the direction of its rotation axis is very stable
(Hogan, 1986). According to eq. (13.5), this direction changes on a time
scale, $l/|\dot{\mathbf{l}}| \sim L/(\gamma_l G\mu)$, comparable with the lifetime of the loop. To
summarize, our qualitative arguments suggest that loops can accelerate to
velocities $v \sim 0.1$, at least if $\dot{\mathbf{P}} \cdot \mathbf{l} \neq 0$. To reach a reliable conclusion, one has to
study the back reaction of gravitational radiation on the loop. Note also
that here we discussed the gravitational rocket effect in flat empty space. The
effects of the cosmological expansion and of the gravitational drag due to
z only in the combination (t - z). On the other hand, $(\partial_t^2 - \partial_z^2)$ acts only on
$ds^2 = dt^2 - dz^2 - (1 - h)(d\rho^2 + \rho^2\,d\phi^2)$, (14.6)
where $\rho^2 = x^2 + y^2$. A coordinate transformation
$(1 - h)\rho^2 = (1 - 8G\mu)\rho'^2$ (14.7)
brings it to the form (7.12). The metric (14.6) is locally flat, but for a
nontrivial choice of f and g the metric (14.4) has a nonvanishing Riemann
tensor.
To elucidate some interesting properties of the gravitational field ( 14.4) ,
let us consider a pulse of length d propagating on a straight string. We can
choose the origin of z so that at any time t the pulse is localized in the range
$t \le z \le t + d$. (14.8)
An unusual feature of the gravitational field of the pulse is that for values of z
outside this range the metric is the same as for a straight string. Hence, the
pulse exerts a gravitational force on a test particle only for a short period
$\Delta t \sim d$ when the particle is within the slab (14.8). Suppose the particle is
initially at rest at a large distance, p, from the string (much greater than the
amplitude of the pulse). Then it can be shown that, after the pulse has passed
by, the particle has a velocity
v= � fr2 + g")
4 µ
dt ( 14.9)
Acknowledgements
I am grateful to Tanmay Vachaspati for many helpful discussions and to
Bruce Allen for his useful comments on the manuscript. This work was
supported in part by the National Science Foundation and by the General
Electric Company.
References
12.1 Introduction
During the last decade, experimental particle physics has tended to confirm
the notion that the standard SU(3) × SU(2) × U(1) model accounts for
essentially all the physics that we have seen. In hopes of discovering new
physical laws, many particle theorists have turned to speculations on what
happens beyond the standard model, and much of this speculation has
centered on grand unified theories (GUTs). (For a review, see Langacker,
1981.) GUTs explain the quantization of charge and also make a very good
prediction for $\sin^2\theta_W$ ($\theta_W$ = Weinberg angle), giving grand unification a
certain amount of plausibility. But the most dramatic predictions of GUTs
occur only at the extraordinary energy scale of $10^{14}$ GeV. By the standards
of the local power company, this is not an extraordinary amount of energy -
it is roughly what it takes to light a 100 W light bulb for about a minute.
However, the idea of having that much energy on a single elementary
particle is extraordinary. If we were to try to build a $10^{14}$ GeV accelerator
with present technology, we could in principle do it, more or less. It would
be a linear accelerator with a length of about one light-year. Now such an
accelerator is unlikely to be funded, so we must turn to other means to see
the $10^{14}$ GeV consequences of GUTs. According to standard cosmology,
the universe had a temperature with $kT = 10^{14}$ GeV at about $10^{-35}$ s after
† This work is supported in part by funds provided by the Robert A. Welch Foundation
and in part by the US National Science Foundation (NSF) under contract PHY
8304629.
‡ This work is supported in part by funds provided by the US Department of Energy
(DOE) under contract DE-AC02-76ER03069, and in part by the National Aeronautics
and Space Administration (NASA) under grant NAGW-553.
Inflationary cosmology 52S
the big bang, and thus the universe itself becomes the best laboratory for
studying the physics of very high energies.
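The comparison between the GUT energy scale and a household light bulb can be checked with a one-line unit conversion. The sketch below uses the standard conversion 1 GeV ≈ 1.602 × 10^-10 J; the variable names are illustrative only.

```python
GEV_IN_JOULES = 1.602176634e-10  # 1 GeV expressed in joules

gut_energy_gev = 1e14
gut_energy_joules = gut_energy_gev * GEV_IN_JOULES

bulb_power_watts = 100.0
seconds_lit = gut_energy_joules / bulb_power_watts
print(f"{gut_energy_joules:.2e} J keeps a 100 W bulb lit for {seconds_lit:.0f} s")
```

The answer comes out at a couple of minutes, the same order of magnitude as the rough figure quoted in the introduction.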
The interface between particle physics and cosmology has become a very
active field in recent years, and one of the outcomes has been the
development of inflationary cosmology. The inflationary universe is a
modification of the standard hot big bang model, motivated by several flaws
that emerge when the standard model is extrapolated backward to very
early times. The inflationary model agrees precisely with the standard model
description of the observed universe for all times later than about $10^{-30}$ s,
and all the successes of the standard model are preserved. For the first
fraction of a second, however, the scenario is dramatically different.
According to the inflationary model, the universe underwent a brief period
of exponential expansion, or inflation, during which its scale factor
increased by a factor perhaps $10^{50}$ times larger than in standard cosmology.
In the course of this spectacular growth spurt all the matter, energy, and
entropy in the universe could have been created from virtually nothing.
The goal of this article is to explain the basics of inflationary models, and
then to summarize some of the more recent research. Emphasis will be
placed on research in which we have been involved - we do not mean to
imply that these investigations are more important than others, but we want
to write about what we understand well. Furthermore, there are already
very good reviews which emphasize other lines of development. Linde's
review (1984a) gives particular attention to the recent work in which he has
been involved, such as chaotic inflation, inflation in the context of
supergravity, and also quantum cosmology. There is also an article by Linde
(Chapter 13, this volume) which we have not yet seen. Brandenberger's
review (1985) emphasizes the underlying quantum field theory, such as the
effective potential, the finite temperature effective potential, the decay of the
false vacuum, and Hawking radiation in de Sitter space. Steinhardt's article
(1986) features a detailed explanation of the properties that an underlying
particle theory must have in order to be consistent with inflationary
cosmology. Turner's 'Inflationary Paradigm' (1985) emphasizes the role of
density fluctuations, the requirements on the underlying particle theory, and
some simple examples of particle theories that work. As the title implies, the
paper also stresses that inflation is not a specific theory, but rather a
mechanism which can be implemented in a number of different ways. We
would also like to recommend the reprint volume edited by Abbott and Pi
(1986).
Throughout this paper we will set $\hbar = c = k = 1$, and we will take the GeV
5 26 S. K . Blau and A . H. Guth
(2 .4)
(2. 12)
Since the universe expands adiabatically, the entropy per comoving volume
$S = R^3 s$ (2.13)
species does not change, one has
$R \propto T^{-1}$. (2.14)
Using eq. (2.10) one finds
$\rho \propto R^{-4}$ (radiation-dominated), (2.15)
which can be substituted into the Einstein equation (2.4). For the early
universe R is small enough so that the curvature term $-k/R^2$ may be
ignored, leading to the simple solution
$R \propto t^{1/2}$ (radiation-dominated). (2.16)
It follows that H(t) = 1/(2t), and eqs. (2.4) and (2.10) can then be used to find
$T^2 = \left( \frac{45}{\pi^3 N_{\rm eff}} \right)^{1/2} \frac{M_P}{4t}$ (radiation-dominated), (2.17)
where $M_P = 1/G^{1/2} = 1.2 \times 10^{19}$ GeV is the Planck mass.
At the highest temperatures all of the fundamental particles of nature
contributed to the thermal radiation . As the temperature fell below the mass
of a given particle species, those particles disappeared from the thermal
equilibrium gas - they are sometimes said to have 'frozen out'. Modern
particle physics also predicts that there must have been a number of phase
transitions which took place as the universe cooled, but in the standard
model one assumes that these phase transitions were inconsequential . That
is, they occurred quickly when the temperature fell to the critical
temperature, and the release of latent heat was negligible.
When the temperature reached 1 MeV, the effectively massless degrees of
freedom were the photons, electrons, positrons, and neutrinos. If there are
three species of neutrinos, then $N_{\rm eff} = 10\tfrac{3}{4}$ and t = 0.74 s. Calculations of the
neutrino interaction rates indicate that at this time the neutrinos began to
'decouple' - i.e., they lost thermal contact with the rest of matter. For all
times somewhat later than this, the neutrinos can be described as a
collisionless gas. It follows that the neutrinos maintained a thermal
distribution, with $RT_\nu$ equal to the same value it had before decoupling.
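The numbers quoted for T = 1 MeV can be reproduced from eq. (2.17). The sketch below rests on two assumptions that are not spelled out in this passage: that $N_{\rm eff}$ counts each bosonic spin state as 1 and each fermionic spin state as 7/8, and that $M_P = 1.22 \times 10^{19}$ GeV with $\hbar = 6.582 \times 10^{-25}$ GeV s for the conversion to seconds. The function names are illustrative.

```python
import math

M_PLANCK_GEV = 1.22e19          # assumed Planck mass, 1/sqrt(G), in GeV
GEV_INV_IN_SECONDS = 6.582e-25  # hbar in GeV*s, converts GeV^-1 to seconds


def n_eff(boson_states: int, fermion_states: int) -> float:
    """Effective massless degrees of freedom: bosons count 1, fermions 7/8."""
    return boson_states + 0.875 * fermion_states


def time_at_temperature(t_gev: float, n: float) -> float:
    """Cosmic time in seconds, from eq. (2.17) solved for t."""
    prefactor = math.sqrt(45.0 / (math.pi**3 * n))
    return prefactor * M_PLANCK_GEV / (4.0 * t_gev**2) * GEV_INV_IN_SECONDS


# photons: 2 states; electrons and positrons: 2 spin states each;
# three neutrino species: one helicity state each for particle and antiparticle
n_1mev = n_eff(boson_states=2, fermion_states=4 + 6)
t_1mev = time_at_temperature(1e-3, n_1mev)
print(n_1mev, round(t_1mev, 2))
```

Because t scales as $1/T^2$ at fixed $N_{\rm eff}$, halving the temperature multiplies the time by four.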
When $T \approx \tfrac{1}{2}$ MeV (at $t \approx 3.0$ s), the electron-positron pairs began to freeze
out. Their entropy (given by eq. (2.12) with $N_{\rm eff} = \tfrac{7}{2}$) was imparted to the
photons, and as a result $RT_\gamma$ was increased by a factor of $(\tfrac{11}{4})^{1/3}$. The ratio of
the neutrino temperature to the photon temperature,
$T_\nu/T_\gamma = (\tfrac{4}{11})^{1/3}$, (2.18)
has been maintained until the present. If we assume that there are three
would be given by
$\rho_m(t) = \rho_{c0} \left[ \frac{R(t_0)}{R(t)} \right]^3 \approx \rho_{c0} \left[ \frac{T_\gamma(t)}{T_{\gamma 0}} \right]^3$, (2.22)
where $t_0$ denotes the present time, and $T_{\gamma 0}$ denotes the present value of the
photon temperature. This expression will equal the energy density $\rho_r$ of
† It is conceivable that the universe today is dominated by some unseen relativistic decay
product, but we will not deal with this possibility here. (See Turner, Steigman and
Krauss, 1984.)
$l_H(t) = R(t) \int_0^t \frac{dt'}{R(t')}$, (3.2)
which is the total distance a light pulse could have traveled since the initial
singularity. In the standard cosmology $R(t) \propto t^{1/2}$, so $l_H = 2t$. Since the phase
equilibrium point in the standard model, and the only time scale which
appears in the equations describing the early universe is the Planck time
$t_P = (M_P)^{-1} = 5.4 \times 10^{-44}$ s. (3.10)
Since the universe is about $10^{60}$ Planck times old, it is remarkable that $\Omega$ is
still in the vicinity of one.
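The figure of about $10^{60}$ Planck times follows directly from eq. (3.10). The sketch below assumes an age of roughly $10^{10}$ yr, matching the Hubble time used elsewhere in this section; the variable names are illustrative.

```python
PLANCK_TIME_S = 5.4e-44     # eq. (3.10)
SECONDS_PER_YEAR = 3.156e7

age_s = 1.0e10 * SECONDS_PER_YEAR  # assumed age of the universe, about 10^10 yr
age_in_planck_times = age_s / PLANCK_TIME_S
print(f"{age_in_planck_times:.1e} Planck times")
```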
Of course for a k = 0 universe $\Omega$ is always exactly equal to one, but we
regard this possibility as very unlikely. Recall that the Robertson-Walker
metric is defined for any real value of k, and that the discrete choices (+1,
-1, or 0) are obtained by rescaling the coordinates. Thus any positive k is
scaled to +1, any negative k is scaled to -1, but k = 0 is obtained only by
starting with k = 0, which is a set of measure zero on the real line. In the
following discussion the possibility k = 0 is ignored.
We can express the flatness problem more quantitatively by considering
the behavior in time of $(\Omega - 1)/\Omega$, which using eqs. (2.4) and (2.7) can be
written as
$\frac{\Omega - 1}{\Omega} = \frac{3k}{8\pi G \rho R^2}$. (3.11)
The scale factor R can be eliminated from this equation by using (2.13),
leading to
$\frac{\Omega - 1}{\Omega} = \frac{3k s^{2/3}}{8\pi G \rho S^{2/3}}$. (3.12)
This equation is useful because the quantities on the right-hand side are easy
to evaluate: $\rho$ and s are expressed in terms of the temperature by eqs. (2.10)
and (2.12), and S is constant as the universe evolves. To calculate the value of
S, note that eqs. (2.4) and (2.7) imply that
$R^2 = \frac{k}{H^2(\Omega - 1)}$, (3.13)
and therefore
$S = \left[ \frac{k}{H^2(\Omega - 1)} \right]^{3/2} s$. (3.14)
The right-hand side of this equation can be evaluated for the present era,
taking $H \approx (10^{10}\ \mathrm{yr})^{-1}$, $|\Omega - 1| < 1$, and $s = 2.8 \times 10^3\ \mathrm{cm}^{-3}$. The result is
$S > 10^{87}$, (3.15)
which is an extraordinarily large number. The fact that S is so large is an
alternative statement of the flatness problem. In the context of the standard
cosmological model S is a parameter whose value is fixed by the initial
conditions, and one would expect its value to be of order unity unless there is
some reason to believe otherwise. In this language, the flatness problem is
the fact that the standard model provides no reason to believe otherwise.
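The bound (3.15) can be reproduced with a short calculation. With $|\Omega - 1| < 1$ and $|k| = 1$, eq. (3.13) gives $R > H^{-1}$, so S exceeds the present entropy density times the cube of the Hubble length. The sketch below assumes c = 1 so that $H^{-1} = 10^{10}$ light-years; the conversion factor and variable names are illustrative.

```python
CM_PER_LIGHT_YEAR = 9.461e17

hubble_length_cm = 1.0e10 * CM_PER_LIGHT_YEAR  # 1/H for H ~ (10^10 yr)^-1, with c = 1
entropy_density = 2.8e3                        # s in cm^-3, as quoted in the text

# |Omega - 1| < 1 with |k| = 1 implies R > 1/H, hence S = R^3 s > s / H^3
entropy_lower_bound = hubble_length_cm**3 * entropy_density
print(f"S > {entropy_lower_bound:.1e}")
```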
One can calculate, for example, the allowed range of $\Omega$ when the
temperature was 1 MeV, when the processes of big bang nucleosynthesis
were just beginning (at about one second after the big bang). Using eq. (3.12)
one finds
(3.16a)
At the time of the GUT phase transition, when $T \approx 10^{14}$ GeV, one has
$|\Omega - 1| \lesssim 10^{-49}$. (3.16b)
Finally, if one extrapolates all the way back to the Planck time, when $T \approx 10^{19}$ GeV, then
(3.16c)
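Bounds of this kind can be estimated numerically from eq. (3.12). The sketch below is a rough illustration under several assumptions that are not stated in this passage: the standard thermal expressions $\rho = (\pi^2/30) N_{\rm eff} T^4$ and $s = (2\pi^2/45) N_{\rm eff} T^3$ are taken for eqs. (2.10) and (2.12), $S$ is set to its lower bound $10^{87}$, $M_P = 1.22 \times 10^{19}$ GeV, and $N_{\rm eff} \approx 160$ is used as a plausible value at the GUT scale.

```python
import math

M_PLANCK_GEV = 1.22e19  # assumed Planck mass; G = 1/M_P^2 in natural units
S_COMOVING = 1.0e87     # lower bound on S from eq. (3.15)


def omega_deviation_bound(t_gev: float, n: float, s_total: float = S_COMOVING) -> float:
    """Upper bound on |Omega - 1|/Omega from eq. (3.12) with |k| = 1."""
    rho = (math.pi**2 / 30.0) * n * t_gev**4       # assumed form of eq. (2.10)
    s = (2.0 * math.pi**2 / 45.0) * n * t_gev**3   # assumed form of eq. (2.12)
    return 3.0 * s ** (2 / 3) * M_PLANCK_GEV**2 / (8.0 * math.pi * rho * s_total ** (2 / 3))


bound_1mev = omega_deviation_bound(1e-3, 10.75)
bound_gut = omega_deviation_bound(1e14, 160.0)  # N_eff ~ 160 is an assumption
print(f"{bound_1mev:.1e}  {bound_gut:.1e}")
```

Under these assumptions the GUT-era value falls below the $10^{-49}$ of eq. (3.16b), and the bound tightens as $1/T^2$ toward earlier times.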
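$r_{\rm coord} = \int_{t_r}^{t_0} \frac{dt'}{b\,t'^{2/3}} = \frac{3}{b}\,(t_0^{1/3} - t_r^{1/3})$.
The coordinate value of the horizon distance at $t_r$ is given by
$l_{H,\rm coord} = \int_0^{t_r} \frac{dt'}{b\,t'^{2/3}} = \frac{3}{b}\,t_r^{1/3}$. (3.18)
$N = \frac{2 r_{\rm coord}}{l_{H,\rm coord}} = 2\left[ \left( \frac{t_0}{t_r} \right)^{1/3} - 1 \right] = 2\left[ \left( \frac{T_r}{T_0} \right)^{1/2} - 1 \right] \approx 75$. (3.19)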
† In the above discussion we have assumed for simplicity that $\Omega = 1$ and that the
universe was matter-dominated throughout the relevant portion of its history. The
calculation has been carried out (Guth, 1983a) without using either of the
approximations, and it was found that the problem is even a little worse - the two
regions were at least 90 horizon distances apart.
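The estimate $N \approx 75$ in eq. (3.19) depends only on the ratio of photon temperatures. As an illustration, the sketch below assumes a temperature of about 4000 K when the background radiation was released and 2.7 K today; both values are assumptions introduced here, since they are not quoted in this passage.

```python
import math

T_RELEASE_K = 4000.0  # assumed photon temperature when the radiation was released
T_TODAY_K = 2.7       # assumed present photon temperature


def horizon_separation(t_release: float, t_now: float) -> float:
    """Eq. (3.19): N = 2[(T_r/T_0)^(1/2) - 1], flat matter-dominated universe."""
    return 2.0 * (math.sqrt(t_release / t_now) - 1.0)


n_horizons = horizon_separation(T_RELEASE_K, T_TODAY_K)
print(round(n_horizons))
```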
region with excess mass will produce an attractive gravitational field, further
increasing the mass density contrast. Thus, if one extrapolates to very early
times one must find a much more uniform distribution of mass. At $t \approx 10^{-35}$ s
one must assume that these perturbations were present, but it is
difficult to understand why they were so incredibly small.
To determine how small the perturbations must have been at $t = 10^{-35}$ s,
we must review some of the basic facts concerning the growth of mass
density perturbations (Olson, 1976; Bardeen, 1980; Press and Vishniac,
1980). Since we are interested in the early universe, it will be sufficient to deal
with perturbations in a flat, radiation-dominated Robertson-Walker
universe. One begins by decomposing the mass density function into
homogeneous and inhomogeneous pieces:
$\rho(\mathbf{x}, t) = \rho_0(t) + \delta\rho(\mathbf{x}, t)$, (3.20)
where $\mathbf{x}$ is the comoving coordinate of the Robertson-Walker coordinate
system. The next step is to write down the general relativity equations that
describe the evolution of $\delta\rho$ and the changes in the metric which are induced.
Since $\delta\rho/\rho$ is assumed to be small in the early universe, one can simplify the
equations by expanding to first order in this quantity. Finally, the linearized
equations can be Fourier transformed, and the resulting equations are then
soluble. The results that one finds depend somewhat on how one chooses to
define the equal-time hypersurfaces in the perturbed spacetime - one
popular choice is the 'comoving gauge', in which the time variable is defined
by a set of clocks that move with the matter fluid. One then finds that the
behavior of the Fourier transformed quantity $\delta\rho(\mathbf{k}, t)/\rho_0$ depends on the
relative magnitudes of the physical wavelength and the Hubble length
$H^{-1} = 2t$. (Since the horizon length $l_H = 2t$ as well, one often hears the words
(3.22)
where $H_0$, $\Omega_{B0}$, and $R(t_0)$ refer to the present values of the Hubble constant,
the baryonic contribution to $\Omega$, and the scale factor, respectively. Using the
fact that $RT \approx$ constant, the last factor can be replaced by $T^3(t)/T^3(t_0)$, and
T(t) can be evaluated using eq. (2.17). The baryonic mass within the sphere
is then given by
(3.23)
In the second line above we took the value of $N_{\rm eff}$ from eq. (2.19), which
applies for times later than a few seconds. Taking $\Omega_{B0} h_0^2 \approx 10^{-2}$ and setting
$M_B = M_g$, one finds
(3.24)
The galaxy scale perturbations will grow linearly with t for times earlier
than $t(M_g)$, and so they will grow by a factor of about $2 \times 10^{44}$ between
$t = 10^{-35}$ s and $t(M_g)$. Given the desired value of $10^{-4}$ at $t(M_g)$, one has
$\frac{\delta\rho}{\rho}$ (galaxy scale) $\approx 5 \times 10^{-49}$ at $t = 10^{-35}$ s. (3.25)
This sounds like a very small number, but to decide for sure one should
compare it to some reasonable standard. As a comparison, we will consider
the $1/N^{1/2}$ Poisson fluctuations which are characteristic of any gas of
particles with random locations.
The universe contains about $10^{10}$ particles per baryon - mostly photons
and neutrinos - and so the number of particles associated with a typical
galaxy is given by
$N = 10^{12} M_\odot \times \frac{10^{57}\ \text{baryons}}{M_\odot} \times \frac{10^{10}\ \text{particles}}{\text{baryon}} = 10^{79}$ particles. (3.26)
Thus, the Poisson fluctuations in the mass density are
$\frac{\delta\rho}{\rho} = \frac{1}{\sqrt{N}} \approx 3 \times 10^{-40}$. (3.27)
These Poisson fluctuations are nine orders of magnitude larger than those
permitted by the standard cosmological model. If we extrapolate this model
back to the Planck time, the required $\delta\rho/\rho$ is smaller still,
and so at this time the fluctuations required in the standard model are
seventeen orders of magnitude smaller than Poisson fluctuations.
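The orders of magnitude quoted here can be checked with a few lines of arithmetic. The sketch below takes the particle count of eq. (3.26) and the required amplitude of eq. (3.25), and assumes that the linear growth in t stated above continues to hold all the way back to the Planck time of eq. (3.10); the variable names are illustrative.

```python
import math

# eq. (3.26): solar masses x baryons per solar mass x particles per baryon
particles_per_galaxy = 1e12 * 1e57 * 1e10
poisson = 1.0 / math.sqrt(particles_per_galaxy)

# eq. (3.25): amplitude 1e-4 at t(M_g), divided by the growth factor 2e44
required_at_1e35 = 1e-4 / 2e44
orders_at_1e35 = math.log10(poisson / required_at_1e35)

# assumed: linear growth in t extrapolated from 1e-35 s back to the Planck time
required_at_planck = required_at_1e35 * (5.4e-44 / 1e-35)
orders_at_planck = math.log10(poisson / required_at_planck)

print(round(orders_at_1e35), round(orders_at_planck))
```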
Thus, if one is to begin the standard cosmological model at times as early
as $10^{-35}$ s, then one must begin with density fluctuations which are
non-zero, but which are incredibly small - much smaller than $1/N^{1/2}$. Initial
conditions of this sort seem very peculiar, since density fluctuations of order
$1/N^{1/2}$ are almost universal in macroscopic systems - they are present (in flat
space) for a classical gas in thermal equilibrium, and also for quantum
thermal radiation of either bosons or fermions. Density fluctuations are
much less than $1/N^{1/2}$ only for highly ordered systems, such as a
configuration of particles arranged uniformly on a lattice. Density
fluctuations are also much less than $1/N^{1/2}$ for a degenerate Fermi gas at
zero temperature - in this case, the fluctuations are suppressed by the
long-range correlations imposed by the Pauli exclusion principle.
The density fluctuation problem may be viewed as a local version of the
flatness problem, and it was described in general terms in the same paper by
Dicke and Peebles (1979). The flatness problem is, in essence, the
observation that the universe in the large is unstable against perturbations
away from $\Omega = 1$. The density fluctuation problem hinges upon the fact that
the universe is unstable against mass density perturbations over a very wide
range of mass scales.
For a wide range of parameters (Guth and Tye, 1980) the Higgs field in the
minimal SU(5) GUT has an effective potential with precisely these
properties. The false vacuum is a Higgs field configuration which breaks the
gauge symmetry to SU(4) × U(1) rather than SU(3) × SU(2) × U(1). The
critical temperature is $T_c \approx 10^{14}$ GeV.
Although a false vacuum has never been observed, its essential properties
are independent of the details of the particle theory and can therefore be
predicted rather unambiguously. The false vacuum has a constant energy
density $\rho = \rho_f$, the value of which is determined by the parameters of the
particle theory. The energy density is typically of the order of the fourth
power of the characteristic mass scale of the theory, which for GUTs means
that
$\rho_f \sim (10^{14}\ \mathrm{GeV})^4$. (4.2)
This energy density is almost unimaginably large. It is the energy density
that a large star would have if it were compressed to the size of a proton.
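The comparison with a compressed star can be made concrete by converting the fourth power of the GUT mass scale into ordinary units. The sketch below assumes $\hbar c = 1.973 \times 10^{-14}$ GeV cm and 1 GeV $= 1.783 \times 10^{-24}$ g; the "large star" of 10 solar masses and the proton radius of $10^{-13}$ cm are illustrative choices made here, not values from the text.

```python
import math

HBAR_C_GEV_CM = 1.973e-14  # hbar*c in GeV*cm
GRAMS_PER_GEV = 1.783e-24  # mass equivalent of 1 GeV


def energy_density_g_per_cm3(mass_scale_gev: float) -> float:
    """Convert rho ~ M^4 in natural units to a mass density in g/cm^3."""
    gev_per_cm3 = mass_scale_gev**4 / HBAR_C_GEV_CM**3
    return gev_per_cm3 * GRAMS_PER_GEV


rho_false = energy_density_g_per_cm3(1e14)

# illustrative comparison: a 10 solar mass star inside a sphere of radius 1e-13 cm
star_mass_g = 10 * 1.989e33
proton_volume_cm3 = 4.0 / 3.0 * math.pi * (1e-13) ** 3
rho_star = star_mass_g / proton_volume_cm3

print(f"{rho_false:.1e} g/cm^3 vs {rho_star:.1e} g/cm^3")
```

Under these assumptions the two densities agree to within an order of magnitude.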
We may deduce the energy-momentum tensor $T_{\mu\nu}$ of the Higgs field in the
false vacuum by observing that $T_{\mu\nu}$ is a covariantly conserved tensor
constructed from the Higgs field, the metric, and the first and second
derivatives of these fields. The Higgs field in the false vacuum is constant, so
the only tensors that may be constructed are $g_{\mu\nu}$ and
$G_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R$. (4.3)
Thus, the energy-momentum tensor must assume the form
$T_{\mu\nu} = A g_{\mu\nu} + B G_{\mu\nu}$, (4.4)
where A and B are constants. The Einstein field equations are
$G_{\mu\nu} = -8\pi G T_{\mu\nu}$, (4.5)
so that the term $BG_{\mu\nu}$ in eq. (4.4) may be absorbed into the left-hand side of
(4.5) by redefining the gravitational constant. Having done this we write
$T_{\mu\nu} = \rho_f g_{\mu\nu}$, (4.6)
where eq. (2.2) has been used to identify the constant A with the energy
density of the false vacuum. Eq. (4.6) implies that the pressure p of the false
vacuum equals $-\rho_f$; it is large and negative. Since the pressure is constant, it
has no gradient and therefore does not produce any mechanical forces.
However, the pressure does have very dramatic gravitational effects which
will be discussed below. The energy-momentum tensor of eq. (4.6) produces
a term in the Einstein field equations which is identical in form to a positive
cosmological constant - the only difference is that the energy-momentum
tensor of the false vacuum is not permanent, but changes form when the false
vacuum decays.
The false vacuum is unstable, and decays by the spontaneous nucleation
of bubbles of the new phase. For some parameters it is also possible for
magnetic monopoles, which could remain as relics from a high temperature
past, to serve as nucleation sites for bubbles of the new phase (Steinhardt,
1981a, b; Guth and E. Weinberg, 1981). However, it appears that a phase
transition driven by monopoles could never be slow enough to allow for a
successful inflationary model, so we will assume that these monopoles are
stable against nucleation. (Steinhardt (1981a, b) has shown that this
assumption is valid for a wide range of parameters in the context of the
minimal S U(5) GUT.) The process of random bubble nucleation will be
discussed in more detail in Section 12.6, but for now it will be sufficient to
simply assume that the nucleation rate is very low.
We are now ready to describe the chronology of the original inflationary
model. As with other cosmological scenarios, the starting point is somewhat
a matter of taste and philosophical prejudice . An advantage of the
inflationary scenario is that it appears to allow a wide variety of starting
configurations - the resulting universe is very insensitive to the details of the
initial conditions. We require only that the early universe was hot ($T > T_c$) in
at least some places, and that at least some of these regions were expanding
rapidly enough so that they would cool to $T_c$ before gravitational effects
reversed the expansion. At $T_c$ it is necessary that at least some of these hot
regions had a size about equal to the horizon distance.
If the Higgs field were in thermal equilibrium within such a hot region,
then $\langle\phi\rangle \approx \phi_{\rm false}$. In fact, the universe had not had time to thermalize at this
point (Steigman, 1983). Therefore, we need to assume that there were some
regions of high energy density with $\langle\phi\rangle \approx \phi_{\rm false}$. These regions would cool to
$T_c$ and would then start to supercool below $T_c$. At this point the phase
transition would begin, occurring through the spontaneous nucleation of
bubbles of the new phase. We assume, however, that the nucleation rate is
very slow, so that the initially very hot regions would supercool to T near
zero while remaining near the false vacuum. (This assumption is known
(Guth and E. Weinberg, 1981) to be valid for a wide range of parameters in
the minimal SU(5) model.)
We now follow the evolution of those regions which have supercooled and
approached the false vacuum state. To see what happens next, it is easiest to
begin by assuming that the region is homogeneous, isotropic, and flat.
(Later we will describe what happens when these assumptions are dropped.)
The region can then be described by the metric of eq. (2.1) with k = 0, and eq.
(2.4) becomes
(4.7)
which has the solution
$R(t) \propto e^{\chi t}$ (4.8)
† The properties of de Sitter space are well described by Hawking and Ellis (1973).
which subsequently supercool into the false vacuum state, it does explain the
origin of most of the momentum of the cosmic expansion : the big bang gets
its big push from the false vacuum. The standard cosmology, by contrast,
makes no attempt to explain the expansion of the big bang.
Now let us consider what would happen if the initial region were not
homogeneous, isotropic, and flat. In that case, one must examine the
behavior of perturbations about the de Sitter metric. These perturbations
seem to be governed by a 'cosmological no-hair theorem', which states that
whenever the energy-momentum tensor is given by eq. (4.6), any locally
measurable perturbation about the de Sitter metric is damped exponentially
on the time scale of $\chi^{-1}$. Any initial particle density is diluted to negligibility,
and any initial distortion of the metric is stretched (i.e., redshifted) until it is
no longer locally detectable.
If other matter is present in addition to that described by eq. (4.6), then the
theorem is still applicable provided that the energy-momentum tensor of
the additional matter obeys certain conditions, known as the strong and
dominant energy conditions. The strong energy condition states that
$\left(T_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} T^{\lambda}{}_{\lambda}\right) W^{\mu} W^{\nu} \ge 0$ (4.10)
for any timelike vector $W^{\mu}$, and reduces in the case of a perfect fluid to the
condition $\rho + 3p \ge 0$. The dominant energy condition states that for any
timelike $W^{\mu}$, $T_{\mu\nu} W^{\mu} W^{\nu} \ge 0$ and $T_{\mu\nu} W^{\nu}$ is non-spacelike. For a perfect fluid
this condition reduces to $\rho \ge |p|$.
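As a quick illustration of the two perfect-fluid conditions just stated, the following Python sketch evaluates them for a few equations of state. The function names and the sample fluids are illustrative choices, not taken from the text; the false vacuum entry assumes p = -rho.

```python
def strong_energy_ok(rho, p):
    """Perfect-fluid form of the strong energy condition: rho + 3p >= 0."""
    return rho + 3.0 * p >= 0.0


def dominant_energy_ok(rho, p):
    """Perfect-fluid form of the dominant energy condition: rho >= |p|."""
    return rho >= abs(p)


# Illustrative equations of state, in units where the energy density is 1.
fluids = {
    "dust (p = 0)": (1.0, 0.0),
    "radiation (p = rho/3)": (1.0, 1.0 / 3.0),
    "false vacuum (p = -rho)": (1.0, -1.0),
}

for name, (rho, p) in fluids.items():
    print(name, "strong:", strong_energy_ok(rho, p),
          "dominant:", dominant_energy_ok(rho, p))
```

With these sample values, ordinary matter and radiation pass both checks, while the false vacuum passes the dominant condition and fails the strong one, which is why the conditions are imposed only on the additional matter.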
The no-hair theorem has been demonstrated (Frieman and Will, 1982;
Barrow, 1983; Boucher and Gibbons, 1983; Ginsparg and Perry, 1983) in
the context of linearized perturbation theory, and it is conjectured to hold
even for large perturbations (Gibbons and Hawking, 1977; Hawking and
Moss, 1982). Its validity in the non-perturbative regime has also been
verified in certain exactly soluble models (Wald, 1983). Recently Jensen and
Stein-Schabes (1986) have given a proof which holds whenever a
synchronous reference frame exists - i.e., whenever spacetime can be filled
with a family of timelike geodesics which do not intersect each other. For
non-perturbative situations, the statement of the no-hair conjecture must
include some kind of restriction to insure that the false vacuum energy
density becomes dominant. In particular, this restriction must exclude the
case of a closed universe that has a positive cosmological constant which is
too weak to prevent it from collapsing.† Although to our knowledge this
extra restriction has never been stated, one would guess that it is not
† We thank David Garfinkle for making us aware of this simple example.
Inflationary cosmology 545
† Although the authors continue to believe that the 'no-hair' conjecture is valid under
reasonable circumstances, the issues have become controversial. Ford (1985) and Barrow
(1986) have discussed instabilities of de Sitter space, but they deal with situations in
which there is no false vacuum of the type we have discussed. Traschen and Hill (1986)
conclude that de Sitter space can be destabilized by minimally coupled scalar fields, but
their method is invalid unless there are N species of such fields, with $N H^2 \gtrsim M_P^2$. For
typical GUT parameters, this would require an unrealistic $10^{18}$ scalar fields. Antoniadis,
Iliopoulos and Tomaras (1986) have argued that the graviton propagator in de Sitter
space has infrared singularities, but it appears (Allen, 1986) that the pathologies which
they discovered were caused by their choice of gauge conditions. Other papers discussing
instabilities of de Sitter space include Myhrvold (1983a, b), Mottola (1985, 1986) and
Mazur and Mottola (1986).
During the inflationary era one has $p = -\rho = -\rho_f$, and one can see that eq.
(2.5) is satisfied identically, with the energy of the expanding gas increasing
due to the negative pressure. If the spacetime were asymptotically
Minkowskian it would be possible to define a conserved total (i.e., matter
plus gravitational) energy (see, for example, S. Weinberg, 1972, pp. 165-72;
or Witten, 1981a). The Robertson-Walker metric, however, does not admit
a global conservation law of this type.
Perhaps the most startling idea to come out of recent developments in
cosmology is the suggestion that the universe may possess no conserved
quantities which distinguish it from the vacuum. The total matter energy in
the observed universe is of course very large - $\sim 10^{78}$ GeV - but we have just
seen that this quantity is not conserved. The baryon number of the universe
is apparently also huge - $\sim 10^{78}$ - but according to GUTs, baryon number is
also not conserved. On the other hand, there are several quantities in nature
which we believe are exactly conserved, such as electric charge and angular
momentum. The consistency of our theories would break down if these
quantities were not conserved. It is therefore very suggestive to note that the
observed value for each of these quantities is compatible with zero. Thus,
provided that baryon number is not conserved, the universe appears to be
devoid of all conserved quantities. In that case, it is tempting to believe that
the universe began from nothing, or from almost nothing. The inflationary
universe illustrates the latter possibility. The idea that the universe may have
emerged from 'absolutely nothing' was suggested by Tryon (1973), and
further suggestions along these lines were discussed by Brout, Englert and
Spindel (1979); Gott (1982); Atkatz and Pagels (1982); Vilenkin (1982,
1983a, b, 1984, 1985b); and Linde (1983a, 1984b, c).
Let us now return to discuss the fate of those regions which have been
undergoing exponential expansion in the false vacuum state. We suppose
that inflation continued for a time $\Delta t$, during which the regions expanded by
a factor
$Z \equiv e^{\chi \Delta t}$. (4.11)
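To get a feel for the numbers, the sketch below relates the expansion factor to the duration of inflation measured in units of the expansion time, assuming purely exponential growth at a constant rate chi. The function names and the target factor of 10^25 are illustrative assumptions.

```python
import math


def expansion_factor(chi, delta_t):
    """Factor by which lengths grow during exponential expansion at rate chi."""
    return math.exp(chi * delta_t)


def duration_for(z, chi=1.0):
    """Duration needed to expand by a factor z (units set by the choice of chi)."""
    return math.log(z) / chi


# Illustrative target: an expansion factor of 10**25.
n = duration_for(1e25)
print(f"chi * delta_t = {n:.1f}")
```

Because the dependence is logarithmic, even an enormous expansion factor corresponds to fewer than a hundred expansion times.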
The inflationary era then ended with a phase transition from the metastable
phase (with $\langle\phi\rangle \approx \phi_{\rm false}$) to the stable phase (with $\langle\phi\rangle \approx \phi_{\rm true}$). The original
inflationary universe model relies on the assumption that the phase
transition took place suddenly, with rapid thermalization of the latent heat.
This assumption is now known to be false, and in Section 12.6 we will
discuss what would actually happen. For pedagogical purposes, however,
we will now follow the logic of the original model. We will therefore suppose
that when the energy density $\rho_f$ of the false vacuum was released, it rapidly
thermalized to produce a hot gas of particles - precisely the initial state that
was postulated in the standard cosmological model. Thus, in the
inflationary model the false vacuum energy is also the source of essentially
all the entropy in the observed universe.
The temperature to which this gas reheated can be calculated for any
particular particle theory by using the conservation of energy. Typically one
finds that the reheating temperature is given by
(4. 12)
From here on, the inflationary scenario joins the standard cosmology. As
the hot gas cooled below the GUT scale, the baryon non-conserving
interactions produced a small net excess of quarks over antiquarks. These
excess quarks eventually resulted in the baryons which we observe in the
universe today (see, for example, Kolb and Turner, 1983; Yoshimura, 1981).
Thus, in this model the false vacuum energy is also the source of essentially
all the matter in the universe. Note that the baryon number production must
occur after the inflationary era so that the baryon number density is not
unacceptably diluted.
While the inflationary scenario was certainly motivated by GUTs, it is
worth noting that there are really only two properties of the GUTs that are
essential to allow inflation to take place. First, the theory must contain some
sort of a false vacuum - i.e., a metastable state with $\rho + 3p < 0$ - to drive the
inflation. And, second, the theory must allow for baryon production to take
place after the decay of the false vacuum.
Jo
ftc dt'
� < l8 (tr) = R ( tc ) R(t') = 2tc . (5. 1)
.
In the inflationary model, on the other hand, the value of R(t) increases by a
factor of Z during the period of inflation, so
"/8 (tr) � 2Ztc . (5.2)
size of the observable universe at times before the GUT phase transition was
smaller than it would have been in the standard scenario by a factor of Z. If
$Z > 10^{25}$ then the entire observable universe would have been within its
$r(t, t_N) = \int_{t_N}^{t} \frac{dt'}{R(t')} = \chi^{-1} \left( e^{-\chi t_N} - e^{-\chi t} \right)$. (6.2)
As $t \to \infty$,
$r(t, t_N) \to \chi^{-1} e^{-\chi t_N}$. (6.3)
The fact that the limit is finite reflects the existence of event horizons in
de Sitter space; if an event takes place at time $t_E$, a comoving observer whose
coordinate distance from the event is greater than $\chi^{-1} e^{-\chi t_E}$ will never be
able to detect it. The physical horizon distance is then $R(t_E)\, \chi^{-1} e^{-\chi t_E} = \chi^{-1}$,
independent of time. Note that (6.3) does not imply that the bubble stops
growing - it continues to grow at the speed of light. However, the scale of the
coordinate system is changing so fast that the coordinate velocity of light
approaches zero.
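The finite limit can be checked numerically. The sketch below assumes the de Sitter scale factor R(t) = exp(chi t), compares a direct trapezoidal integration of dt'/R(t') with the closed form, and evaluates the physical horizon distance. All function names and parameter values are illustrative.

```python
import math


def coordinate_radius(t, t_n, chi=1.0):
    """Closed-form coordinate radius, assuming R(t) = exp(chi * t)."""
    return (math.exp(-chi * t_n) - math.exp(-chi * t)) / chi


def coordinate_radius_numeric(t, t_n, chi=1.0, steps=100_000):
    """Trapezoidal estimate of the integral of dt'/R(t') from t_n to t."""
    h = (t - t_n) / steps
    total = 0.5 * (math.exp(-chi * t_n) + math.exp(-chi * t))
    for i in range(1, steps):
        total += math.exp(-chi * (t_n + i * h))
    return total * h


def asymptotic_radius(t_n, chi=1.0):
    """Limit of the coordinate radius as t goes to infinity."""
    return math.exp(-chi * t_n) / chi


def physical_horizon(t_e, chi=1.0):
    """Scale factor at t_e times the asymptotic coordinate radius."""
    return math.exp(chi * t_e) * asymptotic_radius(t_e, chi)


print(coordinate_radius(5.0, 0.0), coordinate_radius_numeric(5.0, 0.0))
print(physical_horizon(3.0, chi=2.0))
```

The coordinate radius saturates while the physical horizon distance stays fixed at 1/chi, whatever the time of the event.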
Now consider what happens when random bubble formation begins, at a
time which we will call $t_B$. The first bubbles which form will have the largest
asymptotic coordinate radii, while the asymptotic coordinate radii of
bubbles which form at a later time t will be smaller by a factor
$\exp\{-\chi(t - t_B)\}$. The asymptotic coordinate volume of bubbles which form
another bubble will remain forever inside it, having no effect on p(t). Second,
we will pretend that each bubble appears instantaneously with its
asymptotic coordinate radius $\chi^{-1} e^{-\chi t_N}$. Since $\chi^{-1}$ is a short time for our
p(t = t, ) = exp
"
{�
dn Z = p, } (6.6)
(6 .7)
The range of values for $\epsilon$ which can arise for reasonable values of the
parameters in the minimal SU(5) GUT has been investigated (Guth and
E. Weinberg, 1983). It was found that values anywhere in the range
$10^{-10\,000} \lesssim \epsilon \lesssim 10^{17}$ (6.8)
are quite plausible. There are two main reasons why one finds such a
spectacularly large range for $\epsilon$. It is actually the exponents that appear in eq.
(6.8) which are calculated, and these exponents in turn depend on rather
high powers of the GUT parameters. Thus, values of $\epsilon$ which satisfy eq. (6.7)
are quite plausible, but it would require significant fine-tuning of parameters
to have $\epsilon$ near to $10^{-2}$. It is much more plausible to have $\epsilon$ many orders of
magnitude smaller.
For $\epsilon \ll 1$, the instantaneous phase transition assumed in the previous
section must be replaced by one in which p(t) approaches zero with an
exponential time constant which is very long compared with the expansion
time $\chi^{-1}$.
To see if such a slow phase transition might work, one must examine the
properties of the resulting distribution of bubbles (Guth and E. Weinberg,
1983). It can be proven rigorously that if $\epsilon < 10^{-6}$, then the bubbles will form
finite sized clusters only, no matter how long one waits, even though
$p(t) \to 0$. In mathematical terminology, the bubbles never 'percolate'. It can
wall of such a bubble could not thermalize until T fell below 106 K, a value
too low for baryon production or even nucleosynthesis.
The conclusion is therefore clear that the original inflationary scenario is
unworkable. What is needed is a more graceful way to end the period of
inflation. The model would work if one could find a mechanism which
would allow $\epsilon$ to be much less than one during the inflationary era, and then
to simultaneously and quickly change to a large value throughout the space.
No mechanism of this type has ever been found. In the next section we will
discuss the new inflationary model, which provides an elegant solution to
this 'graceful exit' problem.
how it ends. Some time ago Tryon (1973) proposed the speculative but
attractive idea that the universe may have begun from a quantum
fluctuation of 'absolutely nothing'. This idea has since been pursued in the
context of the inflationary universe by Vilenkin (1982, 1983a, b, 1984, 1985b)
and Linde (1983a, 1984b, c). In these scenarios the universe tunnels directly
from a state of 'absolute nothingness' into the false vacuum, with no need for
an intermediate hot phase. In a similar spirit Hartle and Hawking (1983)
have proposed a unique wave function for the universe, incorporating
dynamics which leads to an inflationary era. (See also Hawking, 1984a, b,
1986; Moss and Wright, 1984; Hawking and Luttrell, 1984; and Hawking
and Wu, 1985.) Linde (1983c, d, 1986a, b, c) has proposed and developed the
idea of chaotic inflation, in which inflation is driven by a scalar field which is
initially chaotic but far from thermal equilibrium. (See also Goncharov and
Linde, 1984a, b, c; Goncharov, Linde and Vysotskii, 1984; and Khlopov
and Linde, 1984.)
In order for the new inflationary universe scenario to occur, the
underlying particle theory must contain a scalar field $\phi$ which has the
following properties:†
(i) The effective potential function $V(\phi)$ must have a minimum at a value of
$\phi$ not equal to zero.
(ii) $V(\phi)$ must be very flat in the vicinity of $\phi = 0$. The value $\phi = 0$ is usually
assumed to be a local maximum of $V(\phi)$.
(iii) At high temperature T, the thermal equilibrium value of $\phi$ (i.e., the
minimum of the finite temperature effective potential $V_T(\phi, T)$) should
lie at $\phi = 0$.
An effective potential function of this general form is shown in Fig. 12.3. The
point $\phi = 0$ is an equilibrium point which in the example shown is just barely
unstable. We will refer to the field configuration $\phi = 0$ at zero temperature as
the false vacuum, even though the term is traditionally reserved for
configurations that are classically stable. The example shown is in fact the
minimal SU(5) Coleman-Weinberg potential (Coleman and E. Weinberg,
1973), which will be discussed in more detail in Section 12.10.
The early papers on the new inflationary universe assumed that the Higgs
field responsible for breaking the grand unified symmetry would also play
the role of the field $\phi$ which drives the inflation. However, it was soon found
(as we will discuss in Sections 12.8 and 12.10) that the effective potential for
† The properties required of the scalar field are described in more detail by Steinhardt
and Turner (1984).
the Higgs field is not flat enough, due to the radiative corrections arising
from the gauge couplings. The quantum fluctuations in the Higgs field then
result in mass density fluctuations which are unacceptably large. Thus, in
most newer models the $\phi$ field is a gauge singlet 'inflaton' field, which in
many cases serves no function other than the driving of inflation. Pi (1984),
however, has proposed that the axion field can drive inflation, and Ovrut
and Steinhardt (1983, 1984a, b, c, 1986; see also Albrecht and Steinhardt,
1983; and Lindblom, Ovrut and Steinhardt, 1986) have shown that inflation
can be driven by the same scalar field that breaks supersymmetry. We will
use the term 'new inflation' to refer to any model based on a slow rollover
phase transition.
The initial conditions of the universe required by the new inflationary
cosmology are almost identical to those assumed in the original inflationary
scenario. In either case the initial conditions imposed are much less strict
than they are in the standard cosmology. The early universe must have
contained some regions with temperature $T > T_c$, which were expanding
rapidly enough so that they would cool down to $T_c$ before gravitational
effects had a chance to reverse the expansion. Such regions must have had a
size of about a horizon length $\chi^{-1}$ when $T = T_c$, or else they would have
subsequently collapsed.
Fig. 12.3. The form of the scalar field effective potential function required for
the new inflationary universe model. The example shown is the potential for
the minimal SU(5) theory with Coleman-Weinberg parameters.
[Plot: $V(\phi)$ on the vertical axis, with the level $\rho_f$ marked, against $\phi$ ($10^{15}$ GeV) on the horizontal axis, from 0 to 1.5.]
pushed from the top of the plateau by quantum and/or thermal fluctuations.
These fluctuations were random, and therefore the field began to roll down
the hill at different times in different places. The evolution of the scalar field
In some models the natural mass scale of the effective potential in the vicinity
of $\phi = 0$ might be much less than $\chi$, but the Gibbons-Hawking thermal
effects mentioned above would in any case prevent the coherence length
from being greater than a number of order $\chi^{-1}$. For typical GUT
parameters, $\chi^{-1} \approx 10^{-24}$ cm.
The regions over which the scalar field was approximately uniform will be
called 'coherence regions'. We avoid the use of the word 'bubble' in
describing these regions, for they are non-spherical and they do not have
sharp boundaries.
The initial fluctuations must be described quantum mechanically, but the
subsequent evolution can be described classically. (This transition from a
quantum to a classical description will be discussed in detail in Section 12.9.)
The scalar field within a coherence region was essentially homogeneous, so
we may ignore the gradient terms in the classical equations of motion.
Thermal effects were important in the initial fluctuations, but as inflation
continued the temperature rapidly became negligible. One then has $\Box\phi =
-\partial V/\partial\phi$, which in the de Sitter metric of eq. (6.1) becomes
$\ddot\phi + 3\chi\dot\phi = -\partial V/\partial\phi$. (7.3)
where $\phi_0(t)$ is the classical homogeneous field which obeys eq. (7.3).
Including the spatial derivative term in $\Box\phi$, the equation of motion (using
the de Sitter metric of eq. (6.1)) becomes
(8 .2)
$f(\mathbf{x}, t) = \int d^3k\, e^{i\mathbf{k}\cdot\mathbf{x}}\, \tilde f(\mathbf{k}, t)$. (8.4)
_ �_ (8.7)
k a2 v _ 1 /2 "
__
- ( ¢ o ( t * ))
a¢ 2
From eq. (7.2) it follows that A * � x - 1 .
The history of a fluctuation with a given comoving wave number k is
shown in Fig. 12.4. The physical wavelength $\lambda$ grows exponentially during
the inflationary era, which extends from $t_i$ to $t_f$, and then grows as $t^{1/2}$ during
the radiation-dominated era. The graph also shows the Hubble length,
$H^{-1}(t)$, which is constant during the inflationary era and grows linearly with
t during the radiation-dominated era. Note that the curves for $\lambda$ and $H^{-1}$
cross at two points, labelled A and B - the physical wavelength is initially
less than the Hubble length, then becomes larger, and then becomes smaller
Fig. 12.4. Wavelengths and the Hubble length as a function of time. One
curve shows the evolution of the physical wavelength $\lambda$ of a typical
fluctuation, proportional to R(t), and the other curve shows the evolution of
the Hubble length $H^{-1}(t)$.
[Plot: the inflationary era is marked along the time axis.]
and
(8.8b)
Note that there are no physical processes which operate at scales larger
than the Hubble length, since points at this separation are (at least
temporarily) causally disconnected. Thus the amplitude of the fluctuation
shown in Fig. 12.4 is actually determined by processes that occur in the
vicinity of point A, when the physical wavelength is $\sim \chi^{-1}$. The form of Fig.
12.4, therefore, illustrates an important difference between inflationary and
standard cosmology. If Fig. 12.4 had been drawn for standard cosmology,
the physical wavelength would cross the Hubble length only once, and at all
times earlier than the crossing time the physical wavelength would be larger
than the Hubble length. For this reason it is hard even to conceive of a
mechanism , within the standard cosmology, by which the density
perturbations could be determined by physical processes.
Eq. (8.5) is a linear equation of second order. The general solution can
therefore be expressed in terms of two linearly independent basis functions,
and in principle two initial conditions must be given to uniquely specify a
solution. We will be interested in the behavior of the solution, however, only
during the period when the scalar field begins to roll off the hill, at $t \approx t_f$.
Assuming that the time scale for this rolling is large compared with $\chi^{-1}$, it
will be shown in the Appendix that the damping term in eq. (8.5) causes one
of the two linearly independent solutions to become negligible. Specifically,
we will demonstrate that for $t > t^*(k)$ one of the two basis functions grows
monotonically, and the other falls off faster than $e^{-3\chi t}$. (To see that this
result is reasonable, the reader may wish to check the simplified case in
which the right-hand side of the equation is replaced by $m^2\,\delta\tilde\phi$.) By the
assumption (8.8a) the falling basis function can be neglected, and then the
growing basis function becomes the essentially unique solution. Since the
neglected basis function falls off so rapidly, the expression '$\gg 1$' in eq. (8.8a)
can be interpreted roughly as '$\gtrsim 5$'.
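The simplified check suggested in the parenthetical remark can be written out explicitly. The form of the equation used here is an assumption: it takes the left-hand side to consist of the second time derivative plus a damping term $3\chi$ times the first derivative, with a constant $m^2 > 0$ on the right-hand side.

```latex
\ddot{\delta\tilde\phi} + 3\chi\,\dot{\delta\tilde\phi} = m^2\,\delta\tilde\phi ,
\qquad
\delta\tilde\phi \propto e^{st}
\;\Longrightarrow\;
s^2 + 3\chi s - m^2 = 0 ,
\qquad
s_\pm = -\tfrac{3}{2}\chi \pm \sqrt{\tfrac{9}{4}\chi^2 + m^2} .
```

Under this assumption $s_+ > 0$, so one basis function grows monotonically, while $s_- < -3\chi$, so the other falls off faster than $e^{-3\chi t}$, in line with the general statement.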
For $\chi(t - t^*) \gg 1$ the second term on the right-hand side of eq. (8.5) can be
neglected, and the equation is then identical to the one satisfied by $\dot\phi_0(t)$.
Since we have argued that the solution to the equation is essentially unique,
it follows that $\delta\tilde\phi(\mathbf{k}, t) \propto \dot\phi_0(t)$. The proportionality constant may depend on
k, and we give it the name $-\delta\tilde\tau(\mathbf{k})$. Thus,
$\delta\tilde\phi(\mathbf{k}, t) = -\delta\tilde\tau(\mathbf{k})\,\dot\phi_0(t)$, (8.9)
from which it follows that
$\delta\phi(\mathbf{x}, t) = -\delta\tau(\mathbf{x})\,\dot\phi_0(t)$. (8.10)
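$\langle \delta\tilde f^*(\mathbf{k}, t)\, \delta\tilde f(\mathbf{k}, t) \rangle = \int \frac{d^3x'}{(2\pi)^3} \int \frac{d^3x}{(2\pi)^3}\, e^{i\mathbf{k}\cdot(\mathbf{x}' - \mathbf{x})}\, \langle \delta f(\mathbf{x}', t)\, \delta f(\mathbf{x}, t) \rangle$ (8.14)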
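diverges as the volume of space, since the integrand depends only on $\mathbf{x}' - \mathbf{x}$.
This infinity can be factored out by considering the expectation value
$\langle \delta\tilde f^*(\mathbf{k}', t)\, \delta\tilde f(\mathbf{k}, t) \rangle$, which can be written as
$\langle \delta\tilde f^*(\mathbf{k}', t)\, \delta\tilde f(\mathbf{k}, t) \rangle = k^{-3}\, \delta^3(\mathbf{k}' - \mathbf{k})\, [\Delta f(\mathbf{k}, t)]^2$, (8.15)
where
(8.16)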
by following the evolution using eq. (8.5). Depending on the form of
$\partial^2 V/\partial\phi^2(\phi_0)$, this method may or may not be tractable analytically - it
would certainly be tractable numerically. One can get a good
approximation to the answer, however, by using a simple matching
argument. While eq. (8.18) is accurate only for $t \ll t^*(k)$, it should be a
reasonable estimate at $t \approx t^*(k)$. Similarly the equation
$\Delta\phi(\mathbf{k}, t) = \Delta\tau(\mathbf{k})\, \dot\phi_0(t)$, (8.19)
which follows from eq. (8.9), is accurate only for $t \gg t^*(k)$, but it should also
be a reasonable estimate for $t \approx t^*(k)$. Combining this equation with (8.12),
one finds
$\left.\frac{\Delta\rho(\mathbf{k})}{\rho}\right|_H \approx \frac{4\chi\, \Delta\phi(\mathbf{k}, t^*(k))}{\dot\phi_0(t^*(k))}$. (8.20)
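Using eqs. (8.18) and (8.6) one then has
$\left.\frac{\Delta\rho(\mathbf{k})}{\rho}\right|_H \approx \frac{\chi^2}{\pi^{3/2}\, \dot\phi_0(t^*(k))} \left[ 1 + \frac{1}{\chi^2} \frac{\partial^2 V}{\partial\phi^2}(\phi_0(t^*(k))) \right]^{1/2}$. (8.21)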
In Section 12.10 we will apply this result to the specific case of the SU(5)
Coleman-Weinberg potential.
Since the rollover process must be slow for the new inflationary scenario
to be viable, the function $\phi_0(t^*(k))$ must be a slowly varying function of k - it
follows that the fluctuation amplitude given by eq. (8.21) is slowly varying. A
spectrum for which $[\Delta\rho(k)/\rho]|_H$ = constant is completely scale invariant, and
is known as the Harrison-Zeldovich spectrum (Harrison , 1970; Zeldovich,
1972). Thus, inflationary models yield a nearly Harrison-Zeldovich
spectrum.
Note that the perturbations arise from a nearly free quantum field theory,
and therefore their probability distribution is Gaussian - each Fourier mode
obeys a Gaussian probability distribution, and has a phase and magnitude
uncorrelated with those of the other modes.
The perturbations produced in this way are 'adiabatic', in the sense that
the baryon number to entropy ratio is unperturbed. The adiabaticity is a
consequence of the short range nature of the baryon production processes,
which occurred in these models very shortly after the GUT phase transition.
The natural length scale for these interactions was necessarily smaller than
the Hubble length, but at this time the perturbations of approximately
galactic scale had wavelengths of order $10^{21}$ times the Hubble length. These
perturbations are described by an inhomogeneous time delay $\delta\tau(\mathbf{x})$ of the
phase transition, which occurred in a background geometry which was not
yet significantly perturbed. Thus the time delay function $\delta\tau(\mathbf{x})$ was
essentially constant within any volume of Hubble size. Any two distinct
volumes of Hubble size therefore underwent the same evolution, although
slightly out of synchronization. The net baryon number produced per unit of
entropy was determined by this evolution, and is therefore the same
everywhere.
Thus, the generic prediction of inflation is a nearly Harrison-Zeldovich
spectrum of adiabatic, Gaussian fluctuations. A spectrum of this type, with
an amplitude of $[\Delta\rho(k)/\rho]|_H \approx 10^{-4}$ or $10^{-5}$, appears to be a viable
candidate for the spectrum that produced the observed structure of the
universe. It is very difficult, however, to know whether this spectrum is
compatible in detail with the observed universe. To answer this question one
would have to calculate the evolution of these perturbations up to present
time - a calculation which requires an understanding of the dark matter in
the universe and the roles of non-linear dynamics and perhaps non-gravitational
forces. The difficulties of the calculation can be minimized by
focussing attention on the large-scale structure of the microwave
background radiation (see, for example, Abbott and Wise, 1984a, b), since in
this case a linearized, purely gravitational calculation is expected to be valid.
Unfortunately, though, it is very difficult to obtain data with sufficient
precision. The problem of comparing theory with observation is further
complicated by the need to predict the observable consequences of the
perturbations; i.e., one must understand how enhancements in the matter
density lead eventually to the production of detectable radiation. It has
frequently been assumed naively that 'light follows mass', but now the
possibility of 'biased galaxy formation' (Kaiser, 1986; Bardeen et al., 1986) is
taken quite seriously. The uncertainties about the connection between light
and mass can be bypassed by directly observing the peculiar velocity field
(see, for example, Kaiser, 1983; Vittorio and Silk, 1985; Vittorio,
Juszkiewicz and Davis, 1986; Vittorio and Turner, 1986), but these
measurements are very difficult. Research on these questions is very active at
present, but it seems too early to expect a definitive answer. There is some
evidence, however, that a Harrison-Zeldovich spectrum of adiabatic,
Gaussian fluctuations cannot account for the detailed structure of the
observed universe. For a sample of the literature, see IAU ( 1986).
The perturbations discussed above are generated by the non-uniformity
of the slow-rollover phase transition, and are expected to be present in any
inflationary model. It is therefore very economical to assume that these
fluctuations are responsible for the formation of structure in the universe. It
believable answers to the most important questions. We will not present the
calculations here, but we will describe the assumptions and the conclusions.
The idealization of a globally homogeneous classical background
solution is undoubtedly invalid at some level , and a number of questions can
be raised about its use :
(i) Is the picture of a classical slow rollover valid? It has been pointed out
by Mazenko et al. (1985) that at high temperatures, when one says that
$\phi \approx 0$, one really means that the spatial or time average of $\phi$ is about
equal to zero - the field itself is undergoing large fluctuations. As the
system cools, they argue, it is possible that these fluctuations, which
extend initially out to the minimum of the potential $\phi_c$ or beyond, will
cause the scalar field to settle quickly into small regions with $\phi = \pm\phi_c$ in
each of these regions. When one looks at the spatial average one might
see what appears to be a rolling motion, but the actual local dynamics
could be quite different.
(ii) What is the physical significance of the classical function $\phi_0(t)$?
Hawking and Moss (1983) point out that the system begins in a thermal
ensemble which possesses an exact symmetry, $\phi \to -\phi$. The dynamics
is also consistent with this symmetry, and it therefore follows that
$\langle\phi(\mathbf{x}, t)\rangle$ remains zero for all time - the field presumably does roll down
the hill but, since it is equally likely to roll in any direction, the
expectation value remains zero. A number of authors have suggested
that the quantity $\langle\phi^2(\mathbf{x}, t)\rangle^{1/2}$ can play the role of $\phi_0(t)$, but this
this calculation . The methods discussed below will address the question
in what we feel is a more transparent way. The results obtained with
these newer methods are essentially in agreement with those of the
earlier authors.
In this section we will try to provide answers for these questions. To
understand the quantum mechanical behavior of unstable systems better,
we begin by discussing the relatively simple problem of an upside-down
harmonic oscillator. Consider a single particle of mass m moving in one
dimension, under the influence of the potential
$V(x) = -\tfrac{1}{2} k x^2$. (9.1)
We will track the behavior of a wave packet initially having the Gaussian
form
$\psi(x, t = 0) \propto \exp\{-x^2/2h_0^2\}$. (9.2)
The solution is expressed most simply by introducing the new variables
$b^2 = \hbar/(mk)^{1/2}$ (9.3a)
and
$\omega^2 = k/m$. (9.3b)
Note that b corresponds to the characteristic quantum mechanical length
scale of the problem, analogous to the Bohr radius of the hydrogen atom.
More precisely, b corresponds to the width of the ground state wave
function of the right-side-up harmonic oscillator potential with the same
absolute value of k. Similarly, $\omega$ describes the natural frequency of the
corresponding right-side-up harmonic oscillator. The solution maintains
the Gaussian form $\psi(x, t) = A(t) \exp(-B(t) x^2)$ for all time, where A(t) and
B(t) are complex functions. B(t) can be expressed as
$B(t) = \frac{1}{2b^2} \tan(\phi - i\omega t)$, (9.4)
where $\tan\phi = b^2/h_0^2$. For large times this solution has the form
$\psi(x, t) \propto \exp\left\{ -\frac{x^2}{2h^2(t)} + \frac{i x^2}{2b^2} \right\}$, (9.5a)
where
$h(t) = \frac{(b^4 + h_0^4)^{1/2}}{2h_0}\, e^{\omega t}$. (9.5b)
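The Gaussian solution can be checked numerically. Substituting a Gaussian of the form A(t) exp(-B(t) x^2) into the Schrodinger equation for the inverted oscillator gives, for the x^2 terms, dB/dt = -i (2 hbar B^2 / m + k / (2 hbar)). The sketch below verifies that the tangent expression for B(t) satisfies this relation and that the width approaches the exponential large-time form. Units with hbar = m = k = 1 and all function names are illustrative assumptions.

```python
import cmath
import math

HBAR = M = K = 1.0                 # illustrative units
B_SQ = HBAR / math.sqrt(M * K)     # b^2
OMEGA = math.sqrt(K / M)           # natural frequency omega


def B(t, h0):
    """B(t) = tan(phi - i*omega*t) / (2 b^2), with tan(phi) = b^2 / h0^2."""
    phi = math.atan(B_SQ / h0**2)
    return cmath.tan(phi - 1j * OMEGA * t) / (2.0 * B_SQ)


def width(t, h0):
    """Packet width h(t), defined through Re B(t) = 1 / (2 h^2)."""
    return 1.0 / math.sqrt(2.0 * B(t, h0).real)


def width_asymptotic(t, h0):
    """Large-time width: sqrt(b^4 + h0^4) / (2 h0) * exp(omega t)."""
    return math.sqrt(B_SQ**2 + h0**4) / (2.0 * h0) * math.exp(OMEGA * t)


def riccati_residual(t, h0, dt=1e-5):
    """dB/dt + i (2 hbar B^2 / m + k / (2 hbar)); zero for a true solution."""
    dB = (B(t + dt, h0) - B(t - dt, h0)) / (2.0 * dt)
    return dB + 1j * (2.0 * HBAR * B(t, h0) ** 2 / M + K / (2.0 * HBAR))


print(abs(riccati_residual(0.7, 1.3)))
print(width(5.0, 1.0), width_asymptotic(5.0, 1.0))
```

Evaluating `width_asymptotic` at a fixed time for several values of `h0` shows the smallest late-time width at `h0` equal to b.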
The behavior of eq. (9.5b) is easily understood - the width of the wave packet
at large times is minimized for a specific value of $h_0$, which turns out to be b.
If $h_0$ were chosen very small, then the width would become large due to the
classically down the hill, but the time at which it started to roll is described
by a probability distribution. In this situation an ensemble average (whether
it be an average of x or $x^2$) can obscure the physics, since it averages over
systems that are in very different stages of their evolution.
Having discussed this toy problem, we can now describe the quantum
field theory model proposed by GP. Any perturbative treatment of quantum
field theory (including effective action methods) begins with a free field
theory approximation, but the standard methods were developed for the
purpose of treating perturbations about stable configurations. The GP
model, on the other hand, is a free field theory approximation to a scalar
field in an unstable configuration, perched at the top of a hill in the potential
energy diagram. The model is somewhat crude, but we believe that it
qualitatively describes the correct physics and provides a reasonable
approximation to the behavior of the scalar field in the new inflationary
universe. It is plausible that this model could serve as a valid zero order
approximation to a systematic calculation in which interactions are taken
into account perturbatively - so far, however, no such calculation has been
carried out.
� 4
Vo (</>) = -1:(µ 2 - !a T 2 )</> 2 + '
a
(9. 10)
$p = \left( \frac{9}{4} + \frac{\mu^2}{\chi^2} \right)^{1/2}$, (9.16b)
and
(9.16c)
The linear combination in (9.16) was chosen because at asymptotically early
times it has the form
$\psi(\mathbf{k}, t) \to e^{i\theta(t)}\, e^{-3\chi t/2}\, \frac{e^{-i\omega(t) t}}{[2\omega(t)]^{1/2}}$, (9.17)
where $e^{i\theta(t)}$ is a slowly varying phase factor. The factor $e^{-3\chi t/2}$ would not be
present in Minkowski space - it is a slowly varying shift in the normalization
conventions. Using the canonical commutation relations with eqs. (9.16)
and (9.12), it can be shown that
$[d(\mathbf{k}), d^\dagger(\mathbf{k}')] = \delta^3(\mathbf{k}' - \mathbf{k})$. (9.18)
Given the mode function behavior (9.17) and the commutation relations
(9.18), it follows that at early times the operators $d^\dagger(\mathbf{k})$ and $d(\mathbf{k})$ can be
interpreted as creation and annihilation operators of nearly Minkowskian
particles.†
For the special case $\mu = 0$, the Hankel function reduces to the simple
closed form expression
$H^{(1)}_{3/2}(z) = -\left( \frac{2}{\pi z} \right)^{1/2} \left( 1 + \frac{i}{z} \right) e^{iz}$. (9.19)
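One way to gain confidence in a closed form of this kind is to check that it satisfies Bessel's equation, z^2 w'' + z w' + (z^2 - nu^2) w = 0, with nu = 3/2. The sketch below does this with finite differences for real positive z, using the standard closed form of the order-3/2 Hankel function of the first kind. The function names and step size are illustrative assumptions.

```python
import cmath
import math


def hankel1_three_halves(z):
    """Closed form of the order-3/2 Hankel function of the first kind, z > 0."""
    return -cmath.sqrt(2.0 / (math.pi * z)) * (1.0 + 1j / z) * cmath.exp(1j * z)


def bessel_residual(z, nu=1.5, h=1e-4):
    """Left-hand side of Bessel's equation, using central finite differences."""
    w = hankel1_three_halves
    d1 = (w(z + h) - w(z - h)) / (2.0 * h)
    d2 = (w(z + h) - 2.0 * w(z) + w(z - h)) / h**2
    return z**2 * d2 + z * d1 + (z**2 - nu**2) * w(z)


for z in (0.7, 2.0, 10.0):
    print(z, abs(bessel_residual(z)))
```

The residuals are limited only by the finite-difference error, which suggests the closed form is consistent with the differential equation it is meant to solve.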
Note that the choice of the linear combination of solutions to define the
mode function $\psi(\mathbf{k}, t)$ is merely a choice of convention, and does not
constitute a physical assumption. A different choice for $\psi(\mathbf{k}, t)$ would lead to
a different meaning for the operators $d^\dagger(\mathbf{k})$ and $d(\mathbf{k})$. The statement that the
operators $d(\mathbf{k})$ annihilate the vacuum, on the other hand, is a physical
assumption. For the conventions used here (which are more or less
† To compare the normalization conventions, however, one must be aware of several
differences between this formalism and the standard Minkowski space formalism: in this
formalism $\psi(\mathbf{k}, t)$ contains an explicit factor $e^{-3\chi t/2}$; the measure of integration in (9.12)
and the $\delta$-function in (9.18) are expressed in terms of coordinate momenta rather than
physical momenta.
$$\langle d^\dagger(\vec{k})\, d(\vec{k}') \rangle = \frac{\delta^3(\vec{k}' - \vec{k})}{\exp\{(k^2 + \gamma^2)^{1/2}/\tilde{T}\} - 1}. \qquad (9.21)$$
For late times it can be shown that the Fourier amplitudes of the field obey a
classical probability distribution, just as the variable x did in the example of
the upside-down harmonic oscillator. In this case the classical
approximation becomes valid for a given mode when its frequency (see eq.
(9.15)) becomes small compared with the de Sitter expansion rate $\chi$.
Since the GP model is a free field theory, essentially everything about it
can be calculated exactly. Here we will summarize the most important
results.
(i) Qualitative behavior of $\phi(\vec{x}, t)$. One might ask what $\phi(\vec{x}, t)$ would
typically look like if one measured it at all points in space at a given time, but
one must remember that a quantum field at a single point is invariably
described by a probability distribution with an infinite width. To obtain a
measurable operator one must 'smear' the quantum field over some finite
volume. One therefore defines a smeared field
subtractions, but one must bear in mind the caveat that such an expectation
value averages over systems in different stages of evolution. For potentials
which are extremely flat (i.e., $\mu^2 \lesssim \chi^2$ and $a \ll 1$), the calculation of $\langle \hat\phi_l^2(t) \rangle$ for
$l \approx 1/\tilde{T}$ shows that $\phi_l(\vec{x}, t)$ has a high probability of hovering at $\phi_l \approx 0$ for a
long time before beginning to roll down the hill of $V(\phi)$. Thus, the picture of
the classical slow rollover is valid under these circumstances, but it would
not be valid for a value of $a$ of order one.
(ii) The meaning of $\phi_0(t)$. In the new inflationary universe scenario, the
observed universe develops from part of a much larger region which has
evolved into the false vacuum. Define a 'wave number of the universe' $k_u$ as
the inverse of the coordinate diameter of the region which evolves into the
observed universe. One can then write a Fourier expansion for the
smeared field, separating it into two parts:
$$\phi_l(\vec{x}, t) = \int_{k<k_u} d^3k\,[\cdots] + \int_{k>k_u} d^3k\,[\cdots], \qquad (9.23)$$
where the integrand $[\cdots]$ has the same form as in eq. (9.12), but with an
additional damping factor $e^{-(1/2)k^2 l^2}$ which arises from the smearing. The
first term on the right-hand side describes fluctuations with wavelengths
longer than the diameter of the observed universe, and thus can be
considered homogeneous for astronomical purposes. It is this term that we
identify with $\phi_0(t)$. In this formalism it obeys a classical equation of motion,
but the time at which the rolling begins is described by a probability
distribution. The second term represents inhomogeneities on scales less than
that of the observed universe, and these correspond to the term $\delta\phi(\vec{x}, t)$ in
eq. (8.1).
(iii) How long will $\phi_0(t)$ hover around $\phi \approx 0$? In the GP model this
question can be answered unambiguously by calculating the probability
distribution of $\phi_0(t)$. One finds that the probability distribution is Gaussian,
and that $\langle \phi_0^2(t) \rangle \approx \langle \phi_l^2(t) \rangle$ for $l = 1/\tilde{T}$, so the comments made above in
paragraph (i) apply. The GP formalism can also be used to discuss the case
of a Coleman-Weinberg potential. The results are similar to those obtained
by Linde (1982c) and Vilenkin and Ford (1982), but the picture is somewhat
different. We will return to this discussion in the next section, where we will
discuss the Coleman-Weinberg potential in detail.
(iv) Calculation of density perturbations. The probability distribution for
density perturbations $\delta\rho/\rho$ in the GP model can be calculated exactly. The
results are in excellent agreement with those obtained by applying the
For details about the effective potential function, the reader can refer to the
articles by Coleman and E. Weinberg (1973), S. Weinberg (1973), Jackiw
(1974), and Dolan and Jackiw (1974a). For more details about the finite
temperature effective potential, there are articles by Kirzhnits and Linde
(1972), Dolan and Jackiw (1974b), S. Weinberg (1974), and Bernard (1974).
There is a review article on effective potentials by Coleman (1975), and a
review article on finite temperature effects by Linde (1979). The review
article by Brandenberger (1985) contains a thorough treatment of both the
zero temperature and finite temperature effective potentials.
In the minimal SU(5) theory, the full gauge symmetry is broken to the
subgroup SU(3) × SU(2) × U(1) by a set of Higgs fields $\Phi$ which transform
according to the adjoint representation of SU(5). That is, $\Phi$ represents a
traceless, hermitian, 5 × 5 matrix of fields (with 24 independent components)
which transforms under the gauge group as
$$\Phi'(x) = g^{-1}(x)\,\Phi(x)\,g(x), \qquad (10.1)$$
where $g(x)$ denotes an SU(5) matrix. The symmetry breaking is
accomplished by the fields $\Phi$ acquiring a vacuum expectation value of the
form
$$\Phi = \left(\frac{2}{15}\right)^{1/2} \phi\; \mathrm{diag}\left[1, 1, 1, -\tfrac{3}{2}, -\tfrac{3}{2}\right], \qquad (10.2)$$
[Figure: the effective potential plotted against $\phi$ (in units of $10^{15}$ GeV); one curve is labeled $1.5 \times 10^{14}$ GeV.]
potential. For small $T$ the dip which creates the minimum at $\phi = 0$ has a
depth of order $T^4$, and a width of order $T$.
Having summarized the particle physics background, we are now prepared
to discuss the details of the inflationary scenario in the context of the
minimal SU(5) GUT. Following the general scenario laid out in Section
12.7, we assume that the early universe contained hot expanding regions
within which $\langle\phi\rangle$ was near to its thermal equilibrium value of zero. As the
temperature of such a region fell below $T_c$, it became possible for bubbles of
the new phase to nucleate. However, as long as $T \gg \chi \approx 10^{10}$ GeV one can
rely on flat space calculations (Sher, 1981; see also Abbott, 1981; Billoire
and Tamvakis, 1982; and Cook and Mahanthappa, 1982) that show that the
nucleation rate is completely negligible.
Once the temperature fell to the order of $\chi$, the scalar field was agitated by
quantum and thermal fluctuations which tended to start it rolling down the
hill of the potential energy diagram. It is important to know whether the
scalar field would remain on the top of the hill for long enough to allow for
sufficient inflation. This question has been addressed by Linde (1982c) and
Vilenkin and Ford (1982), who concluded that this model does not lead to
enough inflation. It will be easier to discuss the issues, however, after more
groundwork has been laid, so we will return to this topic at the end of the
section. Meanwhile, in order to continue our pedagogical discussion, we will
assume that the scalar field remained perched at the top of the hill of the
potential energy diagram for some unspecified length of time.
The scalar field then rolled down the hill of the effective potential diagram,
obeying the classical equation of motion. The general form of this equation
was given as (7.3), which for the case of the Coleman-Weinberg potential
becomes
( 10 . 7)
The logarithm is a slowly varying function compared with $\phi^3$. The coupling
constant $\alpha$ may also be viewed as a function of $\phi$, because it is a 'running'
coupling constant (Sher, 1981) whose value depends on the choice of a
renormalization scale $\mu$:
$$\alpha(\mu^2) = \frac{12\pi}{40\,\ln\left(\mu^2/\Lambda^2_{SU(5)}\right)}, \qquad (10.8)$$
where $\Lambda_{SU(5)} = 2.5 \times 10^5$ GeV is fixed by the requirement that $\alpha \approx \tfrac{1}{45}$ at the
GUT scale.† The renormalization scale is in principle arbitrary, but the
† We have used here the renormalization group equations for the unbroken SU(5) gauge
group, since we are interested in the case $\phi \approx 0$.
Fig. 12.6. The evolution of the scalar field during a slow-rollover phase
transition in the minimal SU(5) theory.
[Plot: $\phi$ (GeV) on a logarithmic axis, with ticks at $10^{11}$ and $10^{13}$, against $\chi(t_r - t)$.]
$$l_0 = R(t_0)\,k^{-1} = \frac{R(t_0)}{R(t_r)}\, k^{-1} e^{\chi t_r} \approx \left(\frac{T_{\rm reheat}}{T_0}\,\chi^{-1}\right) \left(\chi k^{-1} e^{\chi t_r}\right). \qquad (10.13)$$
Thus
(10. 14)
where
$$b = \frac{T_{\rm reheat}}{T_0}\,\chi^{-1} \approx 10\ \mathrm{m} \approx 10^{-15}\ \text{light-yr}. \qquad (10.15)$$
A galactic scale corresponds to about $10^6$ light-yr, so $l_0/b \approx 10^{21}$.
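The orders of magnitude quoted here are easy to check. The following is a minimal sketch, assuming the standard conversion 1 light-yr ≈ 9.46 × 10^15 m (a value supplied here, not given in the text); the variable names are illustrative only.

```python
import math

# Assumed conversion factor (not from the text): metres per light-year.
METERS_PER_LIGHT_YEAR = 9.4607e15


def meters_to_light_years(meters: float) -> float:
    """Convert a length in metres to light-years."""
    return meters / METERS_PER_LIGHT_YEAR


# b ~ 10 m, expressed in light-years (the text quotes ~1e-15 light-yr).
b_light_years = meters_to_light_years(10.0)

# Galactic scale l_0 ~ 1e6 light-yr over the rounded b ~ 1e-15 light-yr.
ratio = 1.0e6 / 1.0e-15

# Natural logarithm of the ratio: the number of e-foldings between b and l_0.
ln_ratio = math.log(ratio)

if __name__ == "__main__":
    print(f"b        = {b_light_years:.2e} light-yr")
    print(f"l_0 / b  = {ratio:.1e}")
    print(f"ln(l_0/b) = {ln_ratio:.2f}")
    # A correction 40 times smaller than ln(l_0/b) is of order unity:
    print(f"ln(l_0/b) / 40 = {ln_ratio / 40.0:.2f}")
```

The logarithm comes out close to 48, so a correction term that is a factor of 40 smaller is of order one.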
Eq. (10. 12) can now be rewritten as
x(tr - t * ) = ln ( l 0/b) - 1 I n[�x(tr - t * )] . (10. 16)
For wave numbers corresponding to the scale of galaxies the second term on
the right-hand side is a factor of 40 smaller than the first term, and it will
therefore be neglected in what follows. The fluctuation amplitude as
expressed by eq. (8.21) can then be evaluated using eqs. (10.10), (10.16), and
(10.17)
In deriving (10.17) the second term inside the square brackets on the right-hand
side of eq. (8.21) was neglected - on the scale of galaxies this term has a
value of 0.09, which is small compared with one.
Inserting numbers into (10.17), one finds that on the scale of galaxies
$$\left.\frac{\Delta\rho}{\rho}\right|_H \approx 70 \qquad (l_0 \approx 10^6\ \text{light-yr}). \qquad (10.18)$$
The density fluctuation amplitude varies slowly with wavelength - for the
scale of the observed universe one finds
$$\left.\frac{\Delta\rho}{\rho}\right|_H \approx 91 \qquad (l_0 \approx 10^{10}\ \text{light-yr}). \qquad (10.19)$$
These values of $(\Delta\rho/\rho)|_H$ are much too large to be acceptable - the desired
number would be about $10^{-4}$. In fact, the numbers obtained here are so
large that the predictions of the theory are uncertain. The linearized
calculation which we have carried out requires that $(\Delta\rho/\rho)|_H \ll 1$, so the
calculation is not valid. The calculation suggests, however, that the density
perturbations are strongly non-linear at the time when the wavelength
becomes comparable with the Hubble length, so it seems reasonable to
speculate that the matter would collapse to black holes on approximately
this scale. Note that in a flat universe any sphere with a radius $r$ equal to the
Hubble length encloses a mass $M = \tfrac{4}{3}\pi r^3 \rho$ such that $r = r_{\rm Schwarzschild} = 2GM$. In
an unperturbed flat universe the momentum of the expansion prevents the
collapse of such a sphere to a black hole, but one would expect that any
increase in mass density by a factor of 2 or more would result in black hole
formation. Since $(\Delta\rho/\rho)|_H \gg 1$ on all relevant scales, we expect that the black
holes would continually coalesce to produce larger black holes on the scale
of the ever-increasing Hubble length.
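The statement that a Hubble-length sphere in a flat universe sits exactly at its Schwarzschild radius can be verified directly. The sketch below assumes the flat-universe Friedmann relation $H^2 = 8\pi G\rho/3$ and units with $c = 1$; the relation and the function name are supplied here for illustration and are not quoted from the text.

```python
import math


def schwarzschild_to_hubble_ratio(rho: float, overdensity: float = 1.0,
                                  G: float = 1.0) -> float:
    """Return 2GM / r for a sphere of radius r = 1/H (units with c = 1).

    Assumes the flat-universe Friedmann relation H^2 = 8 pi G rho / 3 for the
    background density rho. The sphere is filled with density overdensity*rho.
    """
    hubble = math.sqrt(8.0 * math.pi * G * rho / 3.0)
    r = 1.0 / hubble                       # Hubble length
    mass = (4.0 / 3.0) * math.pi * r ** 3 * (overdensity * rho)
    return 2.0 * G * mass / r              # Schwarzschild radius over r


if __name__ == "__main__":
    for rho in (1e-3, 1.0, 50.0):
        print(rho, schwarzschild_to_hubble_ratio(rho))
    # Doubling the density doubles the ratio:
    print(schwarzschild_to_hubble_ratio(1.0, overdensity=2.0))
```

The ratio is independent of the background density, and it scales linearly with the overdensity inside the sphere.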
In any case it seems clear that the theory is inconsistent with the observed
universe. If the true prediction of the theory were $(\Delta\rho/\rho)|_H \approx 10^{-4}$, then the
linearized calculation would have been valid and could not have led to the
results (10.18) and (10.19).
Note that the desired answer of about $10^{-4}$ could be obtained if a were
† This formula is a factor of $\sqrt{2}$ larger than that in Guth and Pi (1982) because here we
derive the asymptotic amplitude of the wave, while the earlier reference derived the
asymptotic root-mean-square value.
smaller, but it would have to be very much smaller - the required value is
about $10^{-12}$. Thus, it seems that an extremely weakly interacting scalar field
is needed to drive inflation.
We can now return to the question of whether or not the Higgs field in this
model would remain perched at the top of the hill for long enough to provide
an adequate amount of inflation. As mentioned above, this question has
been studied by Linde (1982c) and Vilenkin and Ford (1982; see also
Vilenkin, 1983c), who found a negative answer. Their argument begins with
the observation that for $\phi \approx 0$, the Coleman-Weinberg potential can be
approximated by $V(\phi)$ = constant, describing a free, massless scalar field in
de Sitter space (minimally coupled to gravity). This quantum field theory
has the peculiar property that there exists no stable vacuum state - the scalar
field never settles near any particular value, but instead undergoes an
endless random walk up and down the real number line. Since there is no
stable vacuum the renormalized expectation value $\langle\phi^2\rangle$ is time-dependent,
and is found to behave as
$$\langle \phi^2(t) \rangle = \frac{\chi^3}{4\pi^2}\,(t - t_0), \qquad (10.20)$$
where $t_0$ depends on the arbitrary infinite subtraction. If one interprets
$\langle\phi^2(t)\rangle$ as the quantum version of the classical $\phi_0^2(t)$ (an interpretation
which we do not recommend), then one can compare the time derivative of
(10.20) with the time derivative of (10.10). If one insists that $d\langle\phi^2(t)\rangle/dt \lesssim
d\phi_0^2/dt$, one finds that
$$\chi(t_r - t) \lesssim \left(\frac{6\pi^2}{\lambda}\right)^{1/2} \approx 11. \qquad (10.21)$$
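The linear growth of $\langle\phi^2\rangle$ in (10.20) is the signature of a random walk. The sketch below illustrates this under a heuristic that is an assumption made here, not a statement from the text: in each Hubble time $\chi^{-1}$ the field receives an independent Gaussian kick of root-mean-square size $\chi/2\pi$. The function names, the number of walkers, and the seed are all illustrative.

```python
import math
import random


def phi_squared_formula(chi: float, elapsed_time: float) -> float:
    """Linear growth law of eq. (10.20) with t - t_0 = elapsed_time."""
    return chi ** 3 * elapsed_time / (4.0 * math.pi ** 2)


def simulate_phi_squared(chi: float = 1.0, n_hubble_times: int = 40,
                         n_walkers: int = 5000, seed: int = 1) -> float:
    """Monte Carlo estimate of <phi^2> after n_hubble_times steps.

    Assumption (heuristic, not from the text): one Gaussian kick of rms
    chi / (2 pi) per Hubble time 1/chi, independent between steps.
    """
    rng = random.Random(seed)
    kick_rms = chi / (2.0 * math.pi)
    total = 0.0
    for _ in range(n_walkers):
        phi = 0.0
        for _ in range(n_hubble_times):
            phi += rng.gauss(0.0, kick_rms)
        total += phi * phi
    return total / n_walkers


if __name__ == "__main__":
    chi, steps = 1.0, 40
    measured = simulate_phi_squared(chi, steps)
    predicted = phi_squared_formula(chi, steps / chi)
    print(f"simulated <phi^2> = {measured:.4f}, formula = {predicted:.4f}")
```

With a few thousand walkers the simulated mean square should agree with the linear law to within a few per cent of statistical noise.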
Comparing the coefficient of $T^2$ with eqs. (9.10) and (9.14), one sets
$$a = 20\pi\alpha \approx 1.4 \qquad (10.23a)$$
$$\gamma^2 = \tfrac{1}{4} a \tilde{T}^2 \approx 0.35\,\tilde{T}^2. \qquad (10.23b)$$
(Note that the term in eq. (10.22) proportional to $T^4$ is independent of $\phi$,
and therefore has no dynamical effect. The term is not included in the GP
calculation.)
As explained in the previous section, GP define the smeared field $\phi_l(\vec{x}, t)$,
and then calculate the finite quantity $\langle \hat\phi_l^2(\vec{x}, t) \rangle$. This quantity averages over
systems in different stages of evolution, and therefore the interpretation has
to be done carefully - nonetheless, a small value for the quantity implies
unambiguously that quantum fluctuations are small. The calculation is
carried out using eqs. (9.12), (9.22), (9.16), and (9.21), and then eq. (9.19) is
used to reduce the answer to the special case $\mu = 0$. The result is given by
$$\langle \hat\phi_l^2(\vec{x}, t) \rangle = \frac{\chi^2}{4\pi^2} \int_0^\infty \frac{k^2\,dk\; e^{-k^2 l^2}}{(k^2 + \gamma^2)^{3/2}} \left\{ 1 + \frac{k^2 + \gamma^2}{\chi^2}\, e^{-2\chi t} \right\} \coth\left\{ \frac{(k^2 + \gamma^2)^{1/2}}{2\tilde{T}} \right\}. \qquad (10.24)$$
As an example, we can take the comoving smearing length as $l = 1/\tilde{T}$, so the
physical smearing length at any time is $1/T$, where $T = \tilde{T} e^{-\chi t}$ is the
background temperature. On dimensional grounds, one expects this
smearing length to correspond roughly to a coherence length. The integral
can then be carried out numerically, yielding
$$\langle \hat\phi_l^2(t) \rangle = 0.0219\,\chi^2 + 0.0198\,T^2. \qquad (10.25)$$
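The numerical integration can be sketched in a few lines. The code below is only an illustration and rests on assumptions stated here: the integrand is taken to be $k^2 e^{-k^2 l^2} (k^2+\gamma^2)^{-3/2} \{1 + (k^2+\gamma^2) e^{-2\chi t}/\chi^2\} \coth[(k^2+\gamma^2)^{1/2}/2\tilde{T}]$ with overall prefactor $\chi^2/4\pi^2$, the constant temperature scale $\tilde{T}$ is set to 1, and $l = 1/\tilde{T}$, $\gamma^2 = 0.35\,\tilde{T}^2$ as in the text. The trapezoidal rule, the cutoff `k_max`, and the function name are choices made for this sketch.

```python
import math


def smeared_field_coefficients(gamma2: float = 0.35, l: float = 1.0,
                               k_max: float = 8.0, n: int = 4000):
    """Return (c_chi, c_T) with <phi_l^2> = c_chi * chi^2 + c_T * T^2.

    Units: the constant temperature scale is set to 1, so k, gamma and 1/l
    are measured in that unit. Trapezoidal rule on [0, k_max]; the integrand
    vanishes at k = 0 and is negligible at k_max because of exp(-k^2 l^2).
    """
    h = k_max / n
    sum_chi = 0.0
    sum_T = 0.0
    for i in range(1, n + 1):
        k = i * h
        e2 = k * k + gamma2
        e = math.sqrt(e2)
        weight = 0.5 if i == n else 1.0          # trapezoid end-point weight
        common = k * k * math.exp(-k * k * l * l) / math.tanh(e / 2.0)
        sum_chi += weight * common / (e2 * e)    # term multiplying chi^2
        sum_T += weight * common / e             # term multiplying T^2
    prefactor = h / (4.0 * math.pi ** 2)
    return prefactor * sum_chi, prefactor * sum_T


if __name__ == "__main__":
    c_chi, c_T = smeared_field_coefficients()
    print(f"<phi_l^2> = {c_chi:.4f} chi^2 + {c_T:.4f} T^2")
```

Because the second term in the braces carries $e^{-2\chi t}$, it combines with the temperature scale into $T^2$, which is why the result separates into a constant piece and a piece that redshifts away.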
Thus, the mean-squared value of the smeared field consists of a thermal
contribution which redshifts to zero, and a residual contribution. Note that
$\langle \hat\phi_l^2(t) \rangle$ does not contain a contribution that grows linearly with $t$, as in eq.
(10.20). The difference is in the nature of the short distance cutoff. Eq. (10.24)
is expressed in terms of a fixed coordinate smearing length $l$ - this type of
smearing is appropriate to discuss the behavior of the mean value of $\phi$
throughout the region which will evolve to become the observed universe.
The renormalized expectation value $\langle\phi^2\rangle$, on the other hand, is calculated
with a fixed physical cutoff momentum - this type of cutoff is appropriate to
discuss local measurements of $\phi$, where the short wavelength fluctuations
time constants of the slow rollover. The assumption that $\delta\phi(\vec{x}, t)$ is a small
perturbation that can be treated linearly is therefore badly violated, and the
entire picture of the slow rollover breaks down. Thus, this model seems to
exhibit the pathological behavior suggested by Mazenko, Unruh and Wald
(1985).
We conclude that the new inflationary model with a minimal SU(5) GUT
has two flaws: the density fluctuations are unacceptably large, and the
amount of inflation is insufficient. Both of these flaws, however, are
symptoms of one underlying problem: the quantum fluctuations in $\delta\phi(\vec{x}, t)$
are too large. As we saw in eq. (10.27), however, the size of these quantum
fluctuations is controlled by general principles of quantum field theory, and
does not depend on the detailed shape of the minimal SU(5) Coleman-Weinberg
potential. What is needed, then, is a potential for which the
classical solution gives a value of $\dot\phi(t)$ large enough to suppress the effect of
these fluctuations. The example studied here suggests that the requirement
of obtaining $[\Delta\rho(k)/\rho]|_H \approx 10^{-4}$ is more stringent than the requirement of
obtaining sufficient inflation. Thus, it seems plausible that in models for
which $[\Delta\rho(k)/\rho]|_H$ has an acceptable value, the expected amount of inflation
will greatly exceed the minimal requirement of 58 time constants.
Once the failure of the minimal SU(5) theory was discovered, efforts
turned toward the construction of particle theories that have the properties
necessary for the new inflationary cosmology. The first model to be studied
(Dimopoulos and Raby, 1983; Albrecht et al., 1982, 1983) was the
'geometric hierarchy model' (proposed by Witten (1981b)), a
supersymmetric model which makes use of a fundamental energy scale of
about $10^{12}$ GeV. This model provided acceptable density fluctuations and
much more than adequate inflation, but the reheating temperature of the
model was too low to allow for baryogenesis. Subsequent work has led to a
number of models that are believed to be completely successful. In a series of
papers (Ellis et al., 1982, 1983a, b; Nanopoulos et al., 1983a, b; Nanopoulos
and Srednicki, 1983; Nanopoulos, Olive and Srednicki, 1983; Gelmini,
Nanopoulos and Olive, 1983; Olive, 1983; Linde, 1983e, f; Ellis et al., 1984;
Kounnas and Quiros, 1985; Ellis et al., 1985; Gelmini, Kounnas and
Nanopoulos, 1985; Enqvist et al., 1985a, b; Jensen and Olive, 1985, 1986), a
group loosely centered at CERN has developed a model called 'primordial
inflation', in which inflation is driven by a scalar field which breaks
supersymmetry near the Planck scale. Alternative approaches using
supersymmetry and/or supergravity have been developed by Ovrut and
Steinhardt (1983, 1984a, b, c, 1986; see also Albrecht and Steinhardt, 1983;
and Lindblom, Ovrut and Steinhardt, 1986) and by Holman, Ramond and
Ross (1984). Shafi and Vilenkin (1984) have shown that supersymmetry is
not necessary for inflation - they constructed a model based on a non-minimal
SU(5) theory that includes a weakly coupled gauge singlet
'inflaton' field. Pi (1984) developed a variant of this model in which the role
of the inflaton is played by the axion field. Other particle physics models that
are believed to lead to successful inflationary cosmologies have been
constructed by Hung (1984), Gupta and Quinn (1984), and Shafi and
Stecker (1984).
At the least, the models mentioned above demonstrate that the problems
associated with inflation in the context of the minimal SU(5) GUT can be
overcome. However, we think it is fair to say that all of these theories appear
somewhat contrived. The ultimate goal of constructing a theory which is
consistent with inflation and which also provides an elegant solution to the
problems of particle physics has yet to be achieved.
implies that the metric in the true vacuum region has the usual
Schwarzschild form, so the gravitational field is not expected to oppose the
force of the pressure gradient. Thus, the second observer would not expect to
see inflation. The resolution of this paradox relies on a highly non-Euclidean
geometry, allowing the false vacuum region to inflate without moving
outward into the true vacuum region .
An example of a false vacuum bubble solution is illustrated in Fig. 12.7.
The figure shows a diagram of spacetime, with the angular coordinates $\theta$ and
$\phi$ suppressed - light-like lines travel at 45°. To the right of the bubble wall
(shown as a heavy line with an arrow on it) the diagram represents a region
of Schwarzschild space, shown in Kruskal-Szekeres coordinates. The
behavior of the solution can be illustrated more intuitively by the diagrams
in Fig. 12.8, which show the spatial hypersurfaces at successive values of the
time coordinate used in Fig. 12.7. The diagrams are drawn by suppressing
one dimension of the hypersurface and embedding the resulting two-dimensional
surface in a three-dimensional space so that the curvature can
be displayed. The labels (a)-(d) in Fig. 12.8 correspond to the horizontal
slices indicated in Fig. 12.7.
The resolution of the paradox is now apparent : the shape is distorted so
that a force driving the bubble wall from the true vacuum to the false
vacuum pushes the wall to larger values of the radius. Meanwhile the false
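[Fig. 12.7: spacetime diagram of a false vacuum bubble solution. Labels recovered from the figure: r = 0, non-singular, singular; slices b and c.]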
Fig. 12.8. The evolution of a false vacuum bubble solution. Each lettered
diagram illustrates a spacelike hypersurface indicated in Fig. 12.7. The
diagrams are drawn by suppressing one dimension of the hypersurface and
embedding the resulting two-dimensional surface in a three-dimensional
space so that the curvature can be displayed. The false vacuum region is
indicated by shading. Note that diagram (d) shows a child universe detaching
from the original spacetime.
[Panels: (a), (b), (c), (d)]
12.12 Conclusion
As emphasized by Turner (1985), inflation is really a paradigm, and not a
specific theory. In particular, there are a number of variations of inflation
that appear to be capable of achieving the same objectives. Of these,
probably the best known is Linde's proposal of chaotic inflation (Linde,
1983b, c, d, 1986a, b, c; see also Goncharov and Linde, 1984a, b, c;
Goncharov et al., 1984; and Khlopov and Linde, 1984). Our primary reason
for omitting a discussion of chaotic inflation was the simple fact that Linde is
also contributing an article to this volume (Chapter 13). The key idea of
chaotic inflation is the observation that inflation can work for a scalar field
potential as simple as $V(\phi) = \lambda\phi^4$, provided that one makes some
assumptions about the initial conditions. Linde proposes that the scalar
field begins in a chaotic state, so that there are some regions in which the
value of $\phi$ is a few times larger than $M_P$. These regions must exceed some
minimal size, which is estimated to be somewhere between $2H^{-1}$ (Linde,
1984a) and $9H^{-1}$ (Turner, 1985), where $H$ denotes the Hubble constant.
Then $\phi$ rolls down the hill of the potential energy diagram, and a
straightforward calculation indicates that there is an adequate amount of
inflation. The Hubble 'constant' is not a constant in this case, but it is slowly
varying, so the expansion can be called 'quasi-exponential'. In order for the
density fluctuations to have the desired magnitude of $(\Delta\rho/\rho)|_H \approx 10^{-4}$, the
coupling constant $\lambda$ must be very small: $\lambda \approx 10^{-13}$. This number is typical of
working inflationary models. Thus, chaotic inflation can greatly enlarge the
class of particle theories consistent with inflationary cosmology - the
plateau of the effective potential diagram is no longer required. It seems
difficult to assess the plausibility of the initial conditions required for chaotic
inflation, but the same can be said for the standard new inflationary
scenario. Either model appears to be much more plausible than the standard
cosmology (Section 12.2), and probably both models could benefit from
more work concerning this question.
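The 'straightforward calculation' for a quartic potential can be sketched as follows. This is an illustration under assumptions made here rather than the authors' own calculation: the standard slow-roll approximations ($3H\dot\phi \approx -V'$, $H^2 \approx 8\pi V/3M_P^2$), units with $M_P = 1$, and an end of inflation at $\phi_{\rm end} = M_P/\sqrt{\pi}$ (where the slow-roll parameter reaches one). Under these assumptions the number of e-foldings is $N = \pi(\phi_{\rm start}^2 - \phi_{\rm end}^2)/M_P^2$, independent of $\lambda$. All function names are hypothetical.

```python
import math

# Assumed end of slow roll for V ~ phi^4, in units of the Planck mass M_P = 1.
PHI_END = 1.0 / math.sqrt(math.pi)


def efolds_quartic(phi_start: float, phi_end: float = PHI_END) -> float:
    """Slow-roll e-foldings for V proportional to phi^4 (independent of lambda)."""
    return math.pi * (phi_start ** 2 - phi_end ** 2)


def efolds_numeric(phi_start: float, phi_end: float = PHI_END,
                   n: int = 10000) -> float:
    """Same quantity from N = 8 pi * integral of V/V' dphi (midpoint rule).

    For V ~ phi^4 the ratio V/V' equals phi/4.
    """
    h = (phi_start - phi_end) / n
    total = 0.0
    for i in range(n):
        phi = phi_end + (i + 0.5) * h
        total += phi / 4.0
    return 8.0 * math.pi * total * h


def phi_for_efolds(n_efolds: float, phi_end: float = PHI_END) -> float:
    """Initial field value (in Planck masses) giving n_efolds of inflation."""
    return math.sqrt(n_efolds / math.pi + phi_end ** 2)


if __name__ == "__main__":
    print(f"N(phi = 4.5 M_P)       = {efolds_quartic(4.5):.1f}")
    print(f"phi needed for N = 58  = {phi_for_efolds(58.0):.2f} M_P")
```

In this sketch a starting value between four and five Planck masses already gives about 60 e-foldings, consistent with the statement that $\phi$ need only be a few times larger than $M_P$.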
Another variation of inflation is the Starobinsky model (Starobinsky,
1979, 1980), a model proposed in its original form several years before
inflation itself. In this model the energy density of the false vacuum is
supplied by the curved space quantum corrections to the renormalized
expectation value of the energy-momentum tensor of one or more scalar
fields in the theory (Dowker and Critchley, 1976). While the initial
conditions required for this model do not appear plausible at the level of
classical physics, the desired conditions can perhaps be achieved in models
in which the universe is created by a quantum fluctuation from 'nothing'.
For a summary of the model and its literature, see Linde (1984a) and
Vilenkin (1985c).
Other alternative inflationary models include induced gravity inflation
(Accetta, Zoller and Turner, 1985; Spokoiny, 1984), and inflation driven by
a scalar field which represents the radius of a compactified dimension (Shafi
and Wetterich, 1983, 1985). Both of these possibilities, along with chaotic
inflation, are summarized in Turner's review (1985).
In conclusion, we want to say that the basic idea of inflation - the idea that
the universe went through a period during which it expanded exponentially
while trapped in a false vacuum - appears to us to be probably correct. It is a
very simple and natural idea in the context of spontaneously broken gauge
theories, and it seems to solve some very fundamental cosmological
problems. On the other hand, we clearly do not yet have the details straight.
In order to understand the density fluctuations, for example, we must at the
same time understand the details of particle physics at GUT energy scales.
We are presumably some distance from that goal.
Finally, we wish to emphasize that probably the most dramatic recent
development in cosmology is the realization that the universe may be
completely devoid of conserved quantum numbers. If so, then, even if we do
not understand the precise scenario, it becomes very plausible that our
observed universe emerged from nothing or from almost nothing. The
universe may indeed be the ultimate free lunch.
Acknowledgements
We would like to thank the many people with whom we have interacted over
the years, exchanging information, insights, opinions, and suggestions
concerning inflationary cosmology. A complete list is impossible to
construct, but the list should certainly include Larry Abbott, Jim Bardeen,
Marc Davis, Lawrence Ford, Stephen Hawking, Lawrence Krauss, Andrei
Linde, Ian Moss, So-Young Pi, Bill Press, David Schramm, Marc Sher,
Gary Steigman, Paul Steinhardt, Michael Turner, Henry Tye, Alex
Vilenkin, and Erick Weinberg. In addition, our understanding of density
fluctuations in the new inflationary universe was aided by conversations
with Aleksei Starobinsky and Jemal Guven, and our understanding of false
vacuum bubbles was aided by the input of Edward Farhi, Jose Figueroa-
Appendix
In this appendix we will prove a mathematical property of the solutions to a
differential equation which was used in Section 12.8. The equation has the
form
(A.1)
where the dot denotes differentiation with respect to $t$, and where there exists
a $t^*$ such that $\omega^2(t) > 0$ for $t > t^*$. We will show that one can find two linearly
independent solutions $u_1(t)$ and $u_2(t)$ with the property that for $t > t^*$, $u_1(t)$ is
monotonically increasing and $|u_2(t)| <$ constant $\times\, e^{-3\chi t}$. Thus, at large times
there is an essentially unique solution $u_1(t)$.
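The claim can be illustrated in the simplest special case. The sketch below assumes that the equation has the form $\ddot{u} + 3\chi\dot{u} = \omega^2 u$, a form inferred here from the Wronskian relation and the monotonicity statement rather than quoted from the text, and takes $\omega$ constant so that the solutions are pure exponentials. The second solution is built as $u_2 = u_1 \int_t^\infty e^{-3\chi t'}/u_1^2(t')\,dt'$, again an assumption about the construction. All function names are hypothetical.

```python
import math


def exponents(chi: float, omega: float):
    """Roots r of r^2 + 3 chi r - omega^2 = 0 for the assumed constant-omega equation."""
    disc = math.sqrt(9.0 * chi * chi + 4.0 * omega * omega)
    return (-3.0 * chi + disc) / 2.0, (-3.0 * chi - disc) / 2.0


def u1(t: float, chi: float, omega: float) -> float:
    """Growing solution exp(r_plus * t): monotonically increasing since r_plus > 0."""
    r_plus, _ = exponents(chi, omega)
    return math.exp(r_plus * t)


def u2(t: float, chi: float, omega: float) -> float:
    """Decaying solution u1(t) * integral_t^inf exp(-3 chi s) / u1(s)^2 ds, in closed form."""
    r_plus, _ = exponents(chi, omega)
    return math.exp(-(3.0 * chi + r_plus) * t) / (3.0 * chi + 2.0 * r_plus)


def bound(t: float, chi: float, omega: float) -> float:
    """Upper bound exp(-3 chi t) / (3 chi u1(t)) on the decaying solution."""
    return math.exp(-3.0 * chi * t) / (3.0 * chi * u1(t, chi, omega))


if __name__ == "__main__":
    chi, omega = 1.0, 0.5
    r_plus, r_minus = exponents(chi, omega)
    print(f"r_plus = {r_plus:.4f}, r_minus = {r_minus:.4f}")
    for t in (0.0, 1.0, 3.0):
        print(t, u2(t, chi, omega), bound(t, chi, omega))
```

In this special case the decaying exponent is always below $-3\chi$, so the second solution falls off faster than $e^{-3\chi t}$ while the first grows, which is the content of the theorem.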
To prove the theorem, we first construct $u_1(t)$. This can be done by
choosing initial data at $t = t^*$, with $u_1(t^*) > 0$ and $\dot{u}_1(t^*) > 0$. By writing eq.
(A.1) as
(A.2)
one can see immediately that $\dot{u}_1 > 0$ for all $t > t^*$.
To construct $u_2(t)$, consider an arbitrary solution $v(t)$ and construct the
Wronskian
(A.3)
One can easily verify that $\dot{W} = -3\chi W$, so one has $W = W_0 e^{-3\chi t}$. It follows
that
$$u_1^2\,\frac{d}{dt}\left(\frac{v}{u_1}\right) = W_0\, e^{-3\chi t}, \qquad (A.4)$$
which can be integrated to give
t > t* ,
' (A. 6)
$$0 < u_2(t) < \frac{e^{-3\chi t}}{3\chi u_1(t)} < \frac{e^{-3\chi t}}{3\chi u_1(t^*)}. \qquad (A.7)$$
References
Aaronson, M., Mould, J., Huchra, J., Sullivan, W. T., Schommer, R. A. and Bothun, G. D. (1980). Astrophys. J., 239, 12, 66-B1.
Abbott, L. F. (1981). Nucl. Phys., B185, 233.
Abbott, L. F., Farhi, E. and Wise, M. B. (1982). Phys. Lett., 117B, 29.
Abbott, L. and Pi, S.-Y. (1986). Inflationary Cosmology. World Scientific: Singapore.
Abbott, L. F. and Sikivie, P. (1983). Phys. Lett., 120B, 133.
Abbott, L. F. and Wise, M. B. (1984a). Phys. Lett., 135B, 279.
Abbott, L. F. and Wise, M. B. (1984b). Astrophys. J., 282, L47.
Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover: New York.
Accetta, F., Zoller, D. and Turner, M. S. (1985). Phys. Rev., D31, 3046.
Albrecht, A. and Brandenberger, R. (1985). Phys. Rev., D31, 1225.
Albrecht, A., Brandenberger, R. and Matzner, R. (1985). Phys. Rev., D32, 1280.
Albrecht, A., Dimopoulos, S., Fischler, W., Kolb, E. W., Raby, S. and Steinhardt, P. J. (1982). Proceedings of the 3rd Marcel Grossmann Meeting on the Recent Developments in General Relativity, ed. Hu Ning, p. 511. North-Holland: Amsterdam.
Albrecht, A., Dimopoulos, S., Fischler, W., Kolb, E. W., Raby, S. and Steinhardt, P. J. (1983). Nucl. Phys., B229, 528.
Albrecht, A. and Steinhardt, P. J. (1982). Phys. Rev. Lett., 48, 1220.
Albrecht, A. and Steinhardt, P. J. (1983). Phys. Lett., 131B, 45.
Albrecht, A., Steinhardt, P. J., Turner, M. S. and Wilczek, F. (1982). Phys. Rev. Lett., 48, 1437.
Allen, B. (1986). The graviton propagator in de Sitter space, Tufts preprint TUTP 86-9.
Antoniadis, I., Iliopoulos, J. and Tomaras, T. N. (1986). Phys. Rev. Lett., 56, 1319.
Aryal, M., Everett, A. E., Vilenkin, A. and Vachaspati, T. (1986). Phys. Rev., D34, 434.
Atkatz, D. and Pagels, H. (1982). Phys. Rev., D25, 2065.
Aurilia, A., Denardo, G., Legovini, F. and Spallucci, E. (1984). Phys. Lett., 147B, 258.
Aurilia, A., Denardo, G., Legovini, F. and Spallucci, E. (1985). Nucl. Phys., B252, 523.
Axenides, M., Brandenberger, R. and Turner, M. S. (1983). Phys. Lett., 126B, 178.
Bardeen, J. M. (1980). Phys. Rev., D22, 1882.
Bardeen, J. M., Bond, J. R., Kaiser, N. and Szalay, A. S. (1986). The statistics of peaks of Gaussian random fields. To be published in Astrophys. J.
Bardeen, J. M., Steinhardt, P. J. and Turner, M. S. (1983). Phys. Rev., D28, 679.
Barrow, J. D. (1983). In The Very Early Universe (Proceedings of the Nuffield Workshop), ed. G. W. Gibbons, S. W. Hawking and S. T. C. Siklos, p. 267. Cambridge University Press: Cambridge.
Barrow, J. D. (1986). The deflationary universe: an instability of the de Sitter universe. Sussex preprint.
Bennett, D. P. (1986). Phys. Rev., D33, 872.
Berezin, V. A., Kuzmin, V. A. and Tkachev, I. I. (1983). Phys. Lett., 120B, 91.
Berezin, V. A., Kuzmin, V. A. and Tkachev, I. I. (1985). In Proceedings of 3rd Seminar on Quantum Gravity, 1984, ed. M. A. Markov, V. A. Berezin and V. P. Frolov, p. 605. World Scientific: Singapore.
Bernard, C. (1974). Phys. Rev., D9, 3313.
Billoire, A. and Tamvakis, K. (1982). Nucl. Phys., B200, 329.
Blau, S. K., Guendelman, E. I. and Guth, A. H. (1986). The dynamics of false vacuum bubbles. Submitted to Phys. Rev., D.
Bludman, S. A. and Ruderman, M. A. (1977). Phys. Rev. Lett., 38, 255.
Boesgaard, A. and Steigman, G. (1985). To be published in Ann. Rev. Astron. Astrophys.
Boucher, W. and Gibbons, G. W. (1983). In The Very Early Universe (Proceedings of the Nuffield Workshop), ed. G. W. Gibbons, S. W. Hawking and S. T. C. Siklos, p. 273. Cambridge University Press: Cambridge.
Branch, D. (1979). Mon. Not. R. Astron. Soc., 186, 609.
Brandenberger, R. H. (1984). Nucl. Phys., B245, 328.
Brandenberger, R. H. (1985). Rev. Mod. Phys., 57, 1.
Brandenberger, R., Kahn, R. and Press, W. H. (1983). Phys. Rev., D28, 1809.
Brandenberger, R. H. and Turok, N. (1986). Phys. Rev., D33, 2182.
Brout, R., Englert, F. and Spindel, P. (1979). Phys. Rev. Lett., 43, 417.
Bunch, T. S. and Davies, P. C. W. (1978). Proc. Roy. Soc. London, Ser., A360, 117.
Buras, A. J., Ellis, J., Gaillard, M. K. and Nanopoulos, D. V. (1978). Nucl. Phys., B135, 66.
Callan, C. G. and Coleman, S. (1977). Phys. Rev., D16, 1762.
Chudnovsky, E. M., Field, G. B., Spergel, D. N. and Vilenkin, A. (1986). Superconducting cosmic strings. Submitted to Phys. Rev., D.
Coleman, S. (1975). In Laws of Hadronic Matter (Erice, 1973), ed. A. Zichichi, p. 139. Academic Press: New York.
Coleman, S. (1977). Phys. Rev., D15, 2929.
Coleman, S. (1979). In The Whys of Subnuclear Physics (Proceedings of the International School of Subnuclear Physics, Ettore Majorana, Erice, 1977), ed. A. Zichichi. Plenum: New York.
Coleman, S. and De Luccia, F. (1980). Phys. Rev., D21, 3305.
Coleman, S. and Weinberg, E. J. (1973). Phys. Rev., D7, 1888.
Cook, G. P. and Mahanthappa, K. T. (1982). Phys. Rev., D25, 1154.
Dicke, R. H. and Peebles, P. J. E. (1979). In General Relativity: An Einstein Centenary Survey, ed. S. W. Hawking and W. Israel. Cambridge University Press: Cambridge.
Inflationary cosmology 599
Dicus, D., Kolb, E. W. and Teplitz, V. (1977). Phys. Rev. Lett., 39, 168.
Dimopoulos, S. and Raby, S. (1983). Nucl. Phys., B219, 479.
Dine, M. and Fischler, W. (1983). Phys. Lett., 120B, 137.
Dolan, L. and Jackiw, R. (1974a). Phys. Rev., D9, 2904.
Dolan, L. and Jackiw, R. (1974b). Phys. Rev., D9, 3320.
Dolgov, A. D. and Linde, A. D. (1982). Phys. Lett., 116B, 329.
Dowker, J. S. and Critchley, R. (1976). Phys. Rev., D13, 3224.
Einhorn, M. B. and Sato, K. (1981). Nucl. Phys., B180 [FS2], 385.
Ellis, J., Enqvist, K., Gelmini, G., Kounnas, C., Masiero, A., Nanopoulos, D. V. and
Smirnov, A. Yu. (1984). Phys. Lett., 147B, 27.
Ellis, J., Enqvist, K., Nanopoulos, D. V., Olive, K. A. and Srednicki, M. (1985). Phys.
Lett., 152B, 175 [Erratum: Phys. Lett., 156B, 452 (1985)].
Ellis, J., Nanopoulos, D. V., Olive, K. A. and Tamvakis, K. (1982). Phys. Lett., 118B,
335.
Ellis, J., Nanopoulos, D. V., Olive, K. A. and Tamvakis, K. (1983a). Nucl. Phys., B221,
524.
Ellis, J., Nanopoulos, D. V., Olive, K. A. and Tamvakis, K. (1983b). Phys. Lett., 120B,
331.
Ellis, J. and Steigman, G. (1980). Phys. Lett., 89B, 186.
Enqvist, K., Nanopoulos, D. V., Quiros, M. and Kounnas, C. (1985a). Nucl. Phys.,
B262, 538.
Enqvist, K., Nanopoulos, D. V., Quiros, M. and Kounnas, C. (1985b). Nucl. Phys.,
B262, 556.
Faber, S. M. and Gallagher, J. S. (1979). Ann. Rev. Astron. Astrophys., 17, 135.
Farhi, E. and Guth, A. H. (1986). An obstacle to creating a universe in the laboratory,
unpublished.
Ford, L. H. (1985). Phys. Rev., D31, 710.
Frieman, J. A. and Will, C. M. (1982). Astrophys. J., 259, 437.
Gelmini, G. B., Kounnas, C. and Nanopoulos, D. V. (1985). Nucl. Phys., B250, 177.
Gelmini, G. B., Nanopoulos, D. V. and Olive, K. A. (1983). Phys. Lett., 131B, 53.
Georgi, H. and Glashow, S. L. (1974). Phys. Rev. Lett., 32, 438.
Georgi, H., Quinn, H. R. and Weinberg, S. (1974). Phys. Rev. Lett., 33, 451.
Gibbons, G. W. and Hawking, S. W. (1977). Phys. Rev., D15, 2738.
Ginsparg, P. and Perry, M. J. (1983). Nucl. Phys., B222, 245.
Gliner, E. B. (1965). Zh. Eksp. Teor. Fiz., 49, 542 (JETP Lett., 22, 378 (1966)).
Gliner, E. B. (1970). Dokl. Akad. Nauk SSSR, 192, 771 (Sov. Phys. Doklady, 15, 559
(1970)).
Gliner, E. B. and Dymnikova, I. G. (1975). Pis'ma Astron. Zh., 1, 7 (Sov. Astr. Lett., 1,
93 (1975)).
Goldman, T., Kolb, E. W. and Toussaint, D. (1981). Phys. Rev., D23, 867.
Goncharov, A. S. and Linde, A. D. (1984a). Zh. Eksp. Teor. Fiz., 86, 1594 (JETP, 59,
930 (1984)).
Goncharov, A. S. and Linde, A. D. (1984b). Phys. Lett., 139B, 27.
Goncharov, A. S. and Linde, A. D. (1984c). Class. Quantum Grav., 1, L75.
Goncharov, A. S., Linde, A. D. and Vysotskii, M. I. (1984). Phys. Lett., 147B, 279.
Gott, J. R. (1982). Nature, 295, 304.
Gupta, S. and Quinn, H. R. (1984). Phys. Rev., D29, 2791.
Guth, A. H. (1981). Phys. Rev., D23, 347.
Guth, A. H. (1983a). In Asymptotic Realms of Physics: Essays in Honor of Francis E.
600 S. K. Blau and A . H. Guth
Kodama, H., Sasaki, M. and Sato, K. (1982). Prog. Theor. Phys., 68, 1979.
Kodama, H., Sasaki, M., Sato, K. and Maeda, K. (1981). Prog. Theor. Phys., 66, 2052.
Kolb, E. W. and Turner, M. S. (1983). Ann. Rev. Nucl. Part. Sci., 33, 645.
Kolb, E. W. and Wolfram, S. (1980). Astrophys. J., 239, 428.
Kounnas, C. and Quiros, M. (1985). Phys. Lett., 151B, 189.
Langacker, P. (1981). Phys. Rep., 72C, 185.
Lazarides, G., Shafi, Q. and Trower, W. P. (1982). Phys. Rev. Lett., 49, 1756.
Lindblom, P. R., Ovrut, B. A. and Steinhardt, P. J. (1986). Phys. Lett., 172B, 309.
Linde, A. D. (1974). Zh. Eksp. Teor. Fiz., 19, 320 (JETP Lett., 19, 183 (1974)).
Linde, A. D. (1979). Rep. Prog. Phys., 42, 389.
Linde, A. D. (1982a). Phys. Lett., 108B, 389.
Linde, A. D. (1982b). Phys. Lett., 114B, 431.
Linde, A. D. (1982c). Phys. Lett., 116B, 335.
Linde, A. D. (1983a). In The Very Early Universe (Proceedings of the Nuffield
Workshop), ed. G. W. Gibbons, S. W. Hawking and S. T. C. Siklos, p. 205.
Cambridge University Press: Cambridge.
Linde, A. D. (1983b). Zh. Eksp. Teor. Fiz., 38, 149 (JETP Lett., 38, 176).
Linde, A. D. (1983c). Phys. Lett., 129B, 177.
Linde, A. D. (1983d). Phys. Lett., 131B, 330.
Linde, A. D. (1983e). Phys. Lett., 132B, 317.
Linde, A. D. (1983f). Zh. Eksp. Teor. Fiz., 37, 606 (JETP Lett., 37, 724 (1983)).
Linde, A. D. (1984a). Rep. Prog. Phys., 47, 925.
Linde, A. D. (1984b). Nuovo Cim. Lett., 39, 401.
Linde, A. D. (1984c). Zh. Eksp. Teor. Fiz., 87, 369 (JETP Lett., 60(2), 211 (1984)).
Linde, A. D. (1984d). Zh. Eksp. Teor. Fiz., 40, 496 (JETP Lett., 40, 1333 (1984)).
Linde, A. D. (1985). Phys. Lett., 158B, 375.
Linde, A. D. (1986a). Mod. Phys. Lett., A1, 81.
Linde, A. D. (1986b). Eternally existing, self-reproducing inflationary universe. To be
published in Proceedings of Nobel Symposium on Unification of Fundamental
Interactions.
Linde, A. D. (1986c). Eternally existing self-reproducing chaotic inflationary universe.
Phys. Lett., 175B, 395.
Lindley, D. (1985). The inflationary universe: a brief history. Unpublished.
Maeda, K., Sato, K., Sasaki, M. and Kodama, H. (1982). Phys. Lett., 108B, 98.
Mazenko, G. F., Unruh, W. G. and Wald, R. M. (1985). Phys. Rev., D31, 273.
Mazur, P. and Mottola, E. (1986). Spontaneous breaking of de Sitter symmetry by
radiative effects. Santa Barbara Inst. Theor. Phys. preprint NSF-ITP-85-153.
Misner, C. W., Thorne, K. S. and Wheeler, J. A. (1973). Gravitation. Freeman: San
Francisco.
Moss, I. G. (1983). Phys. Lett., 128B, 385.
Moss, I. G. (1984). Nucl. Phys., B238, 436.
Moss, I. G. and Wright, W. A. (1984). Phys. Rev., D29, 1067.
Mottola, E. (1985). Phys. Rev., D31, 754.
Mottola, E. (1986). Phys. Rev., D33, 1616.
Myhrvold, N. P. (1983a). Phys. Rev., D28, 2439.
Myhrvold, N. P. (1983b). Phys. Lett., 132B, 308.
Nanopoulos, D. V., Olive, K. A. and Srednicki, M. (1983). Phys. Lett., 127B, 30.
Nanopoulos, D. V., Olive, K. A., Srednicki, M. and Tamvakis, K. (1983a). Phys. Lett.,
123B, 41.
Nanopoulos, D. V., Olive, K. A., Srednicki, M. and Tamvakis, K. (1983b). Phys. Lett.,
124B, 171.
Nanopoulos, D. V. and Srednicki, M. (1983). Phys. Lett., 133B, 287.
Olive, K. A. (1983). In Galaxies and the Early Universe (Proceedings of the 18th
Rencontre de Moriond, Astrophysics Meeting), ed. J. Audouze and
J. Tran Thanh Van, p. 3. Reidel: Dordrecht.
Olson, D. W. (1976). Phys. Rev., D14, 327.
Ostriker, J. P., Thompson, C. and Witten, E. (1986). Cosmological effects of
superconducting strings. Submitted to Phys. Lett.
Ovrut, B. A. and Steinhardt, P. J. (1983). Phys. Lett., 133B, 161.
Ovrut, B. A. and Steinhardt, P. J. (1984a). Phys. Rev. Lett., 53, 732.
Ovrut, B. A. and Steinhardt, P. J. (1984b). Phys. Rev., D30, 2061.
Ovrut, B. A. and Steinhardt, P. J. (1984c). Phys. Lett., 147B, 263.
Ovrut, B. A. and Steinhardt, P. J. (1986). In Inner Space/Outer Space: The Interface
between Cosmology and Particle Physics, ed. E. W. Kolb, M. S. Turner, D. Lindley,
K. Olive and D. Seckel. Chicago University Press: Chicago.
Peebles, P. J. E. (1979). Astron. J., 84, 730.
Peebles, P. J. E. (1984). Astrophys. J., 284, 439.
Pi, S.-Y. (1984). Phys. Rev. Lett., 52, 1725.
Preskill, J. P. (1979). Phys. Rev. Lett., 43, 1365.
Preskill, J., Wise, M. B. and Wilczek, F. (1983). Phys. Lett., 120B, 127.
Press, W. H. and Vishniac, E. T. (1980). Astrophys. J., 239, 1.
Rindler, W. (1956). Mon. Not. R. Astron. Soc., 116, 663.
Sandage, A. and Tammann, G. A. (1976). Astrophys. J., 210, 7.
Sasaki, M. and Sato, K. (1982). Prog. Theor. Phys., 68, 1979.
Sato, K. (1981a). Phys. Lett., 99B, 66.
Sato, K. (1981b). Mon. Not. R. Astron. Soc., 195, 467.
Sato, K. (1981c). Prog. Theor. Phys., 66, 2287.
Sato, K., Kodama, H., Sasaki, M. and Maeda, K. (1982). Phys. Lett., 108B, 103.
Sato, K., Sasaki, M., Kodama, H. and Maeda, K. (1981). Prog. Theor. Phys., 65, 1443.
Scherrer, R. J. and Frieman, J. A. (1986). Phys. Rev., D33, 3556.
Schramm, D. N. and Steigman, G. (1981). Astrophys. J., 243, 1.
Seckel, D. and Turner, M. S. (1985). Phys. Rev., D32, 3178.
Shafi, Q. and Stecker, F. W. (1984). Phys. Rev. Lett., 53, 1292.
Shafi, Q. and Vilenkin, A. (1984). Phys. Rev. Lett., 52, 691.
Shafi, Q. and Wetterich, C. (1983). Phys. Lett., 129B, 387.
Shafi, Q. and Wetterich, C. (1985). Phys. Lett., 152B, 51.
Sher, M. (1981). Phys. Rev., D24, 1699.
Silk, J. and Turner, M. S. (1986). Double inflation. Submitted to Phys. Rev., D.
Spokoiny, B. L. (1984). Phys. Lett., 147B, 39.
Starobinsky, A. A. (1979). Zh. Eksp. Teor. Fiz., 30, 719 (JETP Lett., 30, 682 (1979)).
Starobinsky, A. A. (1980). Phys. Lett., 91B, 99.
Starobinsky, A. A. (1982). Phys. Lett., 117B, 175.
Steigman, G. (1983). In Unification of the Fundamental Particle Interactions II
(Proceedings of the Europhysics Study Conference, Erice, Italy, Oct. 6-14, 1981), ed.
J. Ellis and S. Ferrara. Plenum: New York.
Steinhardt, P. J. (1981a). Phys. Rev., D24, 842.
Steinhardt, P. J. (1981b). Nucl. Phys., B190, 583.
Steinhardt, P. J. (1986). In High Energy Physics, 1985 (Proceedings of the Yale
13.1 Introduction
One of the most popular models for the evolution of the universe developed
at the beginning of the 1980s is the inflationary universe scenario. This
scenario in its present form (see e.g. Linde, 1984c, 1986c; Blau and Guth, this
volume, Ch. 12) makes it possible to give answers to about ten different
problems relating both to cosmology and to elementary particle physics. It
then becomes possible to understand why the observable part of the
universe, with a size $\sim 10^{28}$ cm, is so flat, homogeneous and isotropic, why
we do not see any primordial monopoles, what is the origin of galaxies, etc.
No alternative theory that can solve all these problems has been suggested
so far. Therefore it seems plausible that something like inflation actually did
occur in the very early stages of the evolution of the universe.
Historically, there were many different versions of the inflationary
universe scenario. For example, about 20 years ago it was argued (Gliner,
1965) that the superdense baryonic matter should have the vacuum-like
equation of state, $p = -\rho$, and its energy-momentum tensor should have the
form $T_{\mu\nu} \sim g_{\mu\nu}\Lambda$. This would lead to exponential expansion of the universe in
the very early stages of its evolution (Gliner, 1970; Gliner and Dymnikova,
1975; Gurevich, 1975). At present, however, it seems that the equation of
state of superdense baryonic matter should be not $p = -\rho$ but $p = \tfrac{1}{3}\rho$.
The next stage in the development of the inflationary universe scenario is
associated with the Starobinsky model (Starobinsky, 1979, 1980). He noted
that the exponentially expanding Friedmann universe (de Sitter space) is an
unstable solution of the Einstein equations with quantum corrections
(Dowker and Critchley, 1976), which after the development of the instability
transforms into the hot Friedmann universe. The main purpose of this
model was to solve the singularity problem, which proved to be too
Inflation and quantum cosmology 605
temperature $T$ in the very early universe. Typically, the minimum of $V(\phi)$ at
large $T$ is displaced to $\phi = 0$, and with decrease in $T$, a phase transition to
some state $\phi_0 \neq 0$ occurs (Kirzhnits, 1972; Linde, 1979). If this phase
transition proceeds from a strongly supercooled vacuum state $\phi = 0$, the
total entropy of the universe after the phase transition may increase
considerably (Linde, 1979). As pointed out by Guth (1981), this effect can
help to solve the horizon problem, the flatness problem and the primordial
monopole problem. His model (now called the old inflationary universe
scenario) was very simple and attractive. It was based on four important
assumptions.
(1) Initially the universe was in a symmetric state $\phi = 0$ due to high
temperature effects.
(2) After the universe expands the field $\phi$ becomes trapped at a local
minimum of $V(\phi)$ near $\phi = 0$. With a decrease of temperature the total
energy-momentum tensor of matter becomes equal to $T_{\mu\nu} = g_{\mu\nu} V(0)$.
The universe in such a state expands exponentially.
(3) The stage of exponential expansion (inflation) finishes at the moment of
the phase transition to the stable state $\phi = \phi_0 \neq 0$.
(4) The phase transition occurs due to formation of bubbles with $\phi = \phi_0$.
The process of reheating the universe occurs due to bubble wall
collisions.
Unfortunately, in this scenario the universe becomes extremely
inhomogeneous after reheating (Guth, 1981; Hawking and Moss, 1982;
Guth and Weinberg, 1983). This problem was solved in the context of the
new inflationary universe scenario (Linde, 1982a-d; Albrecht and
Steinhardt, 1982). In the new scenario the last two assumptions mentioned
above were abandoned. This scenario is still very popular. In my opinion,
however, this scenario also is not perfect. It can be realised only in
some theories with rather unnatural potentials $V(\phi)$ that are extremely flat
606 A. Linde
near $\phi = 0$ and are sufficiently curved near the global minimum of $V(\phi)$.
Density perturbations produced in this scenario are sufficiently small only if
$V(0)/M_P^4 \sim 10^{-10}$ and if the field $\phi$ interacts extremely weakly with all other
fields (e.g. $\lambda \sim 10^{-12}$ for $V(\phi) \sim V(0) - \tfrac{1}{4}\lambda\phi^4$). The first condition means that
inflation in this scenario starts very late, which makes it impossible to solve
the horizon problem in this scenario if the universe is closed: the universe
typically collapses before the beginning of inflation (Linde, 1984c). The
second condition implies that thermal effects typically cannot raise the field
$\phi$ to the top of the effective potential at $\phi = 0$, and in such a case inflation
does not occur at all (Linde, 1984c, 1985a, b). Consequently, despite many
efforts, no realistic versions of the new inflationary universe scenario have
been suggested so far.
The only scenario in which these problems do not arise is the chaotic
inflationary scenario (Linde, 1983b, 1984a, b, 1985a, b). The main idea of
this scenario is based on the investigation of the possibility of inflation in a
universe filled with some non-equilibrium initial distribution of the field $\phi$,
without making any ad hoc assumptions (1)-(4) concerning thermal
equilibrium, supercooling, etc. This scenario is a most general one, and,
surprisingly enough, it works perfectly well in a wide class of theories with
fairly natural effective potentials $V(\phi)$. Therefore in the present paper we
shall consider only this version of the inflationary universe scenario.
The concrete models of inflation are modified with each new development
of the underlying elementary particle theory. However, some basic features
of these models remain intact. Many conceptual problems of the
inflationary cosmology have now been solved. But two problems are still
widely discussed at present.
The first problem is connected with the initial conditions in the early
universe. This problem is also related to the singularity problem. The main
part of the problem is not the existence of singularities in the universe, but
the statement (or the common belief) that the universe does not exist
eternally and that there exists 'some time at which there is no spacetime at
all'. What is the origin of the universe? Was it created in a singular state or
has it appeared due to a quantum jump 'from nothing'? Which initial
conditions in the new-born universe are most natural?
Different cosmologists give different answers to these questions. Some of
these answers are based on a phenomenological description of the universe
soon after its formation (Linde, 1983b, 1985a, b). Another approach is based
on the investigation of the wave function of the universe (DeWitt, 1967;
Wheeler, 1968). The choice of a particular wave function in this approach is
$$H^2 + \frac{k}{a^2} = \frac{8\pi}{3M_P^2}\left(\tfrac{1}{2}\dot\phi^2 + V(\phi)\right).$$
Here $V(\phi)$ is the effective potential of the field $\phi$ (in our case $V(\phi) = \tfrac{1}{2}m^2\phi^2$),
$H = \dot a/a$, $a(t)$ is the scale factor of the locally Friedmannian universe (inside
the domain under consideration), $k = +1$, $-1$, or $0$ for a closed, open or flat
universe, respectively. If the field $\phi$ initially is sufficiently large ($\phi \gtrsim M_P$),
then the functions $\phi(t)$ and $a(t)$ rapidly approach the asymptotic regime
$$\phi(t) = \phi_0 - \frac{mM_P}{2(3\pi)^{1/2}}\, t, \qquad (2.4)$$
$$a(t) = a_0 \exp\left(\frac{2\pi}{M_P^2}\left(\phi_0^2 - \phi^2(t)\right)\right). \qquad (2.5)$$
According to (2.4) and (2.5), during a time $\tau \sim \phi/mM_P$ the value of the field $\phi$
remains almost unchanged and the universe expands quasi-exponentially:
$$a(t + \Delta t) \sim a(t) \exp(H \Delta t) \qquad (2.6)$$
for $\Delta t \lesssim \tau = \phi/mM_P$. Here
$$H = \frac{2\pi^{1/2}}{\sqrt{3}}\, \frac{m\phi}{M_P}, \qquad (2.7)$$
$H \gg \tau^{-1}$ for $\phi \gg M_P$.
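As a numerical illustration of eqs. (2.4)-(2.7), the following Python sketch integrates $d \ln a = H\,dt$ along the linear trajectory (2.4) and compares the result with the exponent of eq. (2.5). It is only a sketch under stated assumptions: units with $M_P = 1$, an illustrative mass $m = 10^{-5} M_P$, illustrative initial and final field values, and hypothetical function names.

```python
import math

M_P = 1.0  # assumption: units in which the Planck mass is 1


def hubble(phi, m):
    """Eq. (2.7): H = 2 sqrt(pi/3) * m * phi / M_P."""
    return 2.0 * math.sqrt(math.pi / 3.0) * m * phi / M_P


def phi_of_t(phi0, m, t):
    """Eq. (2.4): phi(t) = phi0 - m M_P t / (2 sqrt(3 pi))."""
    return phi0 - m * M_P * t / (2.0 * math.sqrt(3.0 * math.pi))


def efolds_analytic(phi0, phi):
    """ln(a/a0) from eq. (2.5): (2 pi / M_P^2) (phi0^2 - phi^2)."""
    return 2.0 * math.pi * (phi0 ** 2 - phi ** 2) / M_P ** 2


def efolds_numeric(phi0, phi_end, m, steps=10000):
    """Midpoint-rule integral of H dt while phi falls from phi0 to phi_end."""
    t_end = (phi0 - phi_end) * 2.0 * math.sqrt(3.0 * math.pi) / (m * M_P)
    dt = t_end / steps
    total = 0.0
    for i in range(steps):
        total += hubble(phi_of_t(phi0, m, (i + 0.5) * dt), m) * dt
    return total


m = 1.0e-5      # illustrative mass in units of M_P
phi0 = 10.0     # illustrative initial field, phi0 >> M_P
phi_end = 1.0   # illustrative final field

n_num = efolds_numeric(phi0, phi_end, m)
n_ana = efolds_analytic(phi0, phi_end)
h_tau = hubble(phi0, m) * phi0 / (m * M_P)  # H * tau with tau = phi / (m M_P)

print(f"e-folds (numeric)  = {n_num:.6f}")
print(f"e-folds (analytic) = {n_ana:.6f}")
print(f"H * tau at phi0    = {h_tau:.1f}")
```

The two e-fold counts should agree because $H$ is linear in $t$ along (2.4), and the product $H\tau$ grows as $\phi^2/M_P^2$, which is the sense in which $H \gg \tau^{-1}$ for $\phi \gg M_P$.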
The regime of quasi-exponential expansion (inflation) occurs for $\phi \gtrsim \tfrac{1}{5}M_P$.
For $\phi \lesssim \tfrac{1}{5}M_P$ the field $\phi$ oscillates rapidly, and if this field interacts with other
matter fields (which are not written explicitly in (2.1)), its potential energy
$V(\phi \sim \tfrac{1}{5}M_P) \sim m^2 M_P^2$ is transformed into heat. The reheating temperature $T_R$
may be of the order $(mM_P)^{1/2}$ or somewhat smaller, depending on the
strength of the interaction of the field $\phi$ with other fields. It is important that
$T_R$ does not depend on the initial value $\phi_0$ of the field $\phi$. The only parameter
which depends on $\phi_0$ is the scale factor $a(t)$, which grows $\exp((2\pi/M_P^2)\phi_0^2)$
times during inflation.
If, as is usually assumed, a classical description of the universe becomes
possible only when the energy-momentum tensor of matter becomes
smaller than $M_P^4$, then at this moment $\partial_\mu\phi\,\partial^\mu\phi \lesssim M_P^4$ and $V(\phi) \lesssim M_P^4$.
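Therefore the only constraint on the initial amplitude of the field $\phi$ is given
by $\tfrac{1}{2}m^2\phi^2 \lesssim M_P^4$. This gives a typical initial value of the field $\phi$:
$$\phi_0 \sim \frac{M_P^2}{m}. \qquad (2.8)$$
$$l \sim M_P^{-1} \exp\left(\frac{2\pi}{M_P^2}\phi_0^2\right) \sim M_P^{-1} \exp\left(\frac{2\pi M_P^2}{m^2}\right). \qquad (2.9)$$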
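To get a feeling for the numbers in eqs. (2.8) and (2.9), the short Python sketch below evaluates the exponent for one illustrative choice of mass. The value $m = 10^{-5} M_P$ and the approximate Planck length of $1.6 \times 10^{-33}$ cm are assumptions made for this example; the variable names are hypothetical.

```python
import math

PLANCK_LENGTH_CM = 1.6e-33  # assumption: approximate value of 1/M_P in cm
m_over_MP = 1.0e-5          # assumption: illustrative mass ratio m / M_P

# Eq. (2.8): typical initial field in units of M_P
phi0_over_MP = 1.0 / m_over_MP

# Exponent of eq. (2.9): 2 pi phi0^2 / M_P^2 = 2 pi M_P^2 / m^2
exponent = 2.0 * math.pi * phi0_over_MP ** 2

# exp(exponent) overflows a float, so work with log10 of the length instead
log10_l_cm = math.log10(PLANCK_LENGTH_CM) + exponent / math.log(10.0)

print(f"phi0 / M_P      = {phi0_over_MP:.1e}")
print(f"exponent        = {exponent:.3e}")
print(f"log10(l / 1 cm) = {log10_l_cm:.3e}")
```

For comparison, the size of the observable part of the universe corresponds to a value of about 28 on the same logarithmic scale.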
$$= \int dx(t)\, \exp(iS(x(t))), \qquad (3.2)$$
where $\Psi_n$ is a complete set of energy eigenstates corresponding to the
energies $E_n \geq 0$. To obtain an expression for the ground-state wave function
$\Psi_0(x)$, one should make a rotation $t \to -i\tau$ and take the limit as $\tau' \to -\infty$.
In the summation (3.2) only the term $n = 0$ with $E_0 = 0$ survives, and the
integral transforms into $\int dx(\tau) \exp(-S_E(x(\tau)))$. Hartle and Hawking have
argued that the generalisation of this result to the case of interest in the
semiclassical approximation would yield eq. (3.1).
The gravitational action corresponding to the Euclidean section $S^4$ of
de Sitter space $dS_4$ with $a(\tau) = H^{-1}(\phi) \cos H\tau$ is negative,
$$S_E(a, \phi) = -\frac{1}{2} \int d\eta \left[\left(\frac{da}{d\eta}\right)^2 - a^2 + \frac{\Lambda}{3} a^4\right] \cdot \frac{3\pi M_P^2}{2} = -\frac{3M_P^4}{16 V(\phi)}. \qquad (3.3)$$
Here $\eta$ is the conformal time, $\eta = \int dt/a(t)$, $\Lambda = 8\pi V/M_P^2$. Therefore,
according to (3.1),
$$\Psi_0(a, \phi) \sim \exp(-S_E(a, \phi)) \sim \exp\left(\frac{3M_P^4}{16 V(\phi)}\right). \qquad (3.4)$$
This means that the probability $P$ of finding the universe in the state with
$\phi = \mathrm{const}$, $a = H^{-1}(\phi) = (3M_P^2/8\pi V(\phi))^{1/2}$ is given by
$$P(\phi) \sim |\Psi_0|^2 \sim \exp\left(\frac{3M_P^4}{8V(\phi)}\right). \qquad (3.5)$$
This expression has a very sharp maximum as $V(\phi) \to 0$. Therefore the
probability of finding the universe in a state with a large field $\phi$ and having a
long stage of inflation becomes strongly diminished. One can argue, of
course, that at $V(\phi) \gg M_P^4$ the function $P(\phi)$ becomes constant, and if one
introduces some cut-off at $V(\phi) \to 0$ and integrates over all values of the field
$\phi$ from $-\infty$ to $+\infty$, then the probability of finding the universe with large $\phi$
(with $V(\phi) \gg M_P^4$) becomes large (Hawking and Page, 1986). However, this
would break the rule according to which one should try not to appeal to the
processes which occur at densities much greater than the Planck
density $M_P^4$.
There exists an alternative choice of the wave function of the universe. It
can be argued that the analogy between the standard theory (3.2) and the
gravitational theory (3.3) is incomplete. Indeed, there is an overall minus
sign in the expression for $S_E(a, \phi)$ (3.3), which indicates that the
gravitational 'energy' associated with the scale factor $a$ is negative. (This is
related to the well-known fact that the total energy of a closed universe is
zero, being a sum of the positive energy of matter and the negative energy of
the scale factor $a$.) In such a case, to obtain $\Psi_0$ from (3.2) one should rotate $t$
not to $-i\tau$ but to $+i\tau$,† which leads to (Linde, 1984)
$$P \sim |\Psi_0(a, \phi)|^2 \sim \exp(-2|S_E(a, \phi)|) \sim \exp\left(-\frac{3M_P^4}{8V(\phi)}\right). \qquad (3.6)$$
Actually, this result is valid only if the evolution of the field $\phi$ is very slow, so
that this field acts only as a cosmological constant $\Lambda(\phi) = 8\pi V(\phi)/M_P^2$ in
(3.3). Fortunately, this is indeed the case during inflation.
Later the same result was obtained by another method, devised by
Zeldovich and Starobinsky (1984), Rubakov (1984) and Vilenkin (1983).
This result can be interpreted as a probability of quantum tunnelling of the
universe from $a = 0$ (from 'nothing') to $a = H^{-1}(\phi)$. In complete agreement
with the results of the previous section, eq. (3.6) predicts that a typical initial
value of the field $\phi$ is given by $V(\phi) \sim M_P^4$ (if one does not speculate about the
possibility that $V(\phi) \gg M_P^4$), which leads to a very long stage of inflation.
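The opposite trends of the two candidate probabilities, eqs. (3.5) and (3.6), can be made concrete with a small numerical comparison. The Python sketch below tabulates the logarithm of each expression for the potential $V = \tfrac{1}{2}m^2\phi^2$; the mass $m = 10^{-5} M_P$, the sample field values and the function names are illustrative assumptions, and normalisation factors are ignored.

```python
import math

M_P = 1.0  # assumption: units in which the Planck mass is 1


def potential(phi, m):
    """V(phi) = m^2 phi^2 / 2, the potential considered in this chapter."""
    return 0.5 * m ** 2 * phi ** 2


def log_p_hartle_hawking(v):
    """ln P from eq. (3.5): +3 M_P^4 / (8 V)."""
    return 3.0 * M_P ** 4 / (8.0 * v)


def log_p_tunnelling(v):
    """ln P from eq. (3.6): -3 M_P^4 / (8 V)."""
    return -3.0 * M_P ** 4 / (8.0 * v)


m = 1.0e-5  # illustrative mass in units of M_P
for phi in (1.0, 1.0e2, 1.0e4, 1.0e5):
    v = potential(phi, m)
    print(f"phi = {phi:8.1e}  V = {v:8.1e}  "
          f"ln P (3.5) = {log_p_hartle_hawking(v):+.3e}  "
          f"ln P (3.6) = {log_p_tunnelling(v):+.3e}")
```

Expression (3.5) is largest at small $V(\phi)$, whereas (3.6) is exponentially suppressed there and becomes of order unity only as $V(\phi)$ approaches $M_P^4$.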
It must be said that there is no rigorous proof of either eq. (3.1) or eq. (3.6),
† In our opinion this does not lead to a change of sign of the gravitational constant, as
claimed by Hawking (this volume, Ch. 14). In any case, the usual rule of rotation does
not give $\Psi_0$ if $E_n < 0$.
and the physical meaning of creation of everything from 'nothing' is far from
clear. Therefore a deeper understanding of the physical processes in the
inflationary universe is necessary in order to investigate the wave function of
the universe $\Psi_0(a, \phi)$ and to suggest a correct interpretation of this wave
function. With this purpose we shall try to investigate the global structure of
the inflationary universe, and go beyond the minisuperspace approach used
in the derivation of eqs. (3.1) and (3.6).
(4.1)
and with initial wavelength $l \sim H^{-1}$. Later, their wavelength grows
exponentially as $a(t)$, eq. (2.5), and the field $\phi + \delta\phi(x)$ becomes almost exactly
homogeneous, slowly decreasing according to eq. (2.4). However, at the
same time new perturbations of the field $\phi$ are generated, and so on. This
process looks like a Brownian motion of the field $\phi$. Inhomogeneities of the
resulting distribution of the field $\phi$ lead to density perturbations $\delta\rho(x)$,
which grow very slowly (logarithmically) as their wavelengths grow. On a
galactic scale $\delta\rho/\rho \sim 10(m/M_P)$, which at $m \sim 10^{-5} M_P \sim 10^{14}$ GeV, gives the
desirable value $\delta\rho/\rho \sim 10^{-4}$ necessary for galaxy formation (Mukhanov and
Chibisov, 1981, 1982; Hawking, 1982, 1985; Starobinsky, 1982; Guth and
Pi, 1982; Bardeen et al., 1983). However, on a much greater scale
perturbations $\delta\rho/\rho$ become very large. The estimates of $\delta\rho(x)/\rho$ in the
chaotic inflation scenario show (Linde, 1984c) that density perturbations
formed at the moment at which the classical scalar field was equal to $\phi$ have
the amplitude
$$\frac{\delta\rho(\phi)}{\rho} \sim C\, \frac{\phi\, [V(\phi)]^{1/2}}{M_P^3} \sim \frac{m}{M_P} \left(\frac{\phi}{M_P}\right)^2, \qquad (4.2)$$
where $C = O(1)$. This means that $\delta\rho/\rho \sim 1$ for
$$\phi \gtrsim \frac{M_P^{1/2}}{m^{1/2}}\, M_P. \qquad (4.3)$$
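The estimate (4.2) can be checked against the galactic-scale figure $\delta\rho/\rho \sim 10(m/M_P)$ quoted above. The Python sketch below does this under two assumptions that are not taken from the text: galactic scales are assumed to form roughly 60 e-folds before the end of inflation, and the coefficient $C$ is set to 1. The function names are hypothetical.

```python
import math

M_P = 1.0  # assumption: units in which the Planck mass is 1


def phi_at_efolds(n_efolds):
    """Field value n_efolds before the end of inflation, from eq. (2.5)
    with the final field neglected: n = 2 pi phi^2 / M_P^2."""
    return M_P * math.sqrt(n_efolds / (2.0 * math.pi))


def density_contrast(phi, m, c=1.0):
    """Last form of eq. (4.2): C (m / M_P) (phi / M_P)^2."""
    return c * (m / M_P) * (phi / M_P) ** 2


def phi_unit_contrast(m):
    """Field at which (4.2) with C = 1 gives delta rho / rho = 1."""
    return M_P * math.sqrt(M_P / m)


m = 1.0e-5                     # mass quoted in the text, in units of M_P
phi_gal = phi_at_efolds(60.0)  # assumption: ~60 e-folds for galactic scales

print(f"phi_gal / M_P     = {phi_gal:.2f}")
print(f"delta rho / rho   = {density_contrast(phi_gal, m):.2e}")
print(f"phi(contrast = 1) = {phi_unit_contrast(m):.1f} M_P")
```

With these inputs $(\phi/M_P)^2$ is close to 10 on galactic scales, which is where the factor 10 in $\delta\rho/\rho \sim 10(m/M_P)$ comes from, while the perturbations reach order unity only at the much larger field of (4.3).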
Perturbations which are formed at that moment have at present the
wavelength
(4.4)
decreases by
$$\Delta\phi = \frac{mM_P}{2\sqrt{3}\,\pi^{1/2}H} = \frac{M_P^2}{4\pi\phi}, \qquad (4.5)$$
in accordance with eq. (2.4). The physical size of this domain during the time
$\Delta t = H^{-1}$ increases $e$ times, and its physical volume increases $e^3$ times. As a
result of generation of perturbations of the field $\phi$, eq. (4.1), the value of the
field $\phi$ in this domain becomes $\phi - \Delta\phi + \delta\phi(x)$. Note that for
$\phi \gg M_P(M_P/m)^{1/2}$, $|\delta\phi(x)|$ is much greater than $\Delta\phi$, see Fig. 13.1. Since a
typical wavelength of the perturbations $\delta\phi(x)$ generated during the time
$\Delta t = H^{-1}$ is $O(H^{-1})$, the domain whose initial size is $O(H^{-1})$, after
expanding $e$ times, becomes divided into $O(e^3/2)$ domains of size $O(H^{-1})$,
in which the field has a value $\phi - O(H)$, and $O(e^3/2)$ domains of size
[Figure 13.1: plot of $V$ against $\phi$; axis marks $M_P^4$, $mM_P^3$, $0$ and $M_P^2/m$.]
Fig. 13.1. The region of possible values of the field $\phi$ is divided into four
parts: (1) $\phi > M_P^2/m$. Fluctuations of metric and of the scalar field $\phi$ are
extremely large, and any classical description of this field and of spacetime
seems impossible. (2) $M_P(M_P/m)^{1/2} \lesssim \phi \lesssim M_P^2/m$. In this region fluctuations
of the field $\phi$ are large, but dispersion of these fluctuations averaged over the
coordinate volume is small, $\Delta_c \ll \bar\phi_c$. In this region $\bar\phi_c$ rolls down according to
eq. (2.4), but the field $\phi$ averaged over the physical volume, $\bar\phi_p$, diffuses up to
$\phi \sim M_P^2/m$. (3) $M_P \lesssim \phi \lesssim M_P(M_P/m)^{1/2}$. In this region $\bar\phi_c \approx \bar\phi_p$ decreases
according to eq. (2.4). Inflation occurs until the field $\phi$ enters the region
$\phi \lesssim M_P$. (4) $\phi \lesssim M_P$. The field $\phi$ rolls down, oscillates near the minimum of
$V(\phi)$ and its energy transforms into heat.
$O(H^{-1})$ with field value $\phi + O(H)$. In other words, the original domain
(mini-universe) of size $O(H^{-1})$ after the time $\Delta t = H^{-1}$, expands and
separates into $O(e^3)$ mini-universes of a size $O(H^{-1})$, and in (almost)
half of these mini-universes the field $\phi$ grows rather than decreases. During
the next interval $\Delta t = H^{-1}$ the total number of domains with a growing
field $\phi$ increases again, and so on. This means that the total physical
volume of domains containing permanently growing field $\phi$ increases as
$\exp[(3 - \ln 2)Ht] \sim \exp(2.3H(\phi)t)$ for $\phi \gg M_P(M_P/m)^{1/2}$, whereas the total
physical volume of domains in which the field $\phi$ does not decrease grows
approximately as $\tfrac{1}{2}\exp(3Ht)$. Since the value of $H(\phi)$ increases with
a growth of $\phi$, the main part of the physical volume of the universe
emerges as a result of expansion of domains with a maximal possible
field $\phi$, i.e. with $\phi \sim M_P^2/m$, at which $V(\phi) \sim M_P^4$. (Note that for $\phi \gg M_P^2/m$,
if it is possible to consider such domains at a classical level, the process
of self-reproduction of inflationary mini-universes with a growing
field $\phi$ becomes suppressed, since at $V(\phi) \gg M_P^4$ a typical value of
$\partial_\mu(\delta\phi)\,\partial^\mu(\delta\phi) \sim H^4$ is greater than $V(\phi)$, which does not lead to creation of
inflationary mini-universes with $\phi \gg M_P^2/m$.) Therefore, whereas the field $\phi$
averaged over the coordinate volume of any given domain (i.e. $\bar\phi_c$),
gradually decreases in accordance with eq. (2.4), the field $\phi$ averaged over
the physical volume of a domain that initially contains the field
$\phi \gtrsim M_P(M_P/m)^{1/2}$ grows to $\bar\phi_p \sim M_P^2/m$ (Linde, 1986a).
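The counting argument above can be illustrated with a few lines of Python. The sketch compares the classical decrease (4.5) of the field during one Hubble time with the size of a quantum jump, and evaluates the growth exponent of the number of domains in which the field has grown at every step. The jump amplitude $H/2\pi$ per Hubble time is an assumption of this example (the text only quotes jumps of order $H$), as are the mass $m = 10^{-5} M_P$ and the function names.

```python
import math

M_P = 1.0  # assumption: units in which the Planck mass is 1


def hubble(phi, m):
    """Eq. (2.7): H = 2 sqrt(pi/3) * m * phi / M_P."""
    return 2.0 * math.sqrt(math.pi / 3.0) * m * phi / M_P


def classical_step(phi):
    """Eq. (4.5): decrease of phi during one Hubble time."""
    return M_P ** 2 / (4.0 * math.pi * phi)


def quantum_step(phi, m):
    """Assumed fluctuation amplitude per Hubble time, H / (2 pi)."""
    return hubble(phi, m) / (2.0 * math.pi)


def growing_domains(n_steps):
    """Number of domains in which phi grew at every one of n_steps Hubble
    times, if each domain splits into e^3 parts and phi grows in half."""
    return (math.e ** 3 / 2.0) ** n_steps


m = 1.0e-5                           # illustrative mass in units of M_P
phi_star = M_P * math.sqrt(M_P / m)  # the boundary M_P (M_P/m)^{1/2}

for phi in (0.1 * phi_star, phi_star, 10.0 * phi_star):
    ratio = quantum_step(phi, m) / classical_step(phi)
    print(f"phi = {phi:9.1f} M_P   |delta phi| / Delta phi = {ratio:.3g}")

rate = math.log(growing_domains(1))  # growth exponent per Hubble time
print(f"growth exponent per Hubble time = {rate:.4f}")
```

The ratio scales as $\phi^2$, so it crosses unity near $\phi \sim M_P(M_P/m)^{1/2}$, and the growth exponent reproduces the factor $3 - \ln 2 \approx 2.3$ in the volume estimate.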
It may be useful to look at the same problem from another point of view.
The Brownian motion of the field $\phi$ at $M_P \lesssim \phi \lesssim M_P^2/m$ can be described (for
changes of the field $\phi$ that are not too large) by the diffusion equation
$$\frac{\partial P_c}{\partial t} = \frac{\partial}{\partial\phi}\left(\frac{\partial(\mathcal{D}P_c)}{\partial\phi} + \frac{P_c}{3H}\frac{\partial V}{\partial\phi}\right), \qquad (4.6)$$
where the coefficient of diffusion $\mathcal{D} = H^3/8\pi^2$. This equation for the case
$H(\phi) = \mathrm{const.}$ was first derived by Starobinsky (1984, 1986); for a more
detailed derivation see Goncharov and Linde (1986, 1987). For the special
case $\partial V/\partial\phi = 0$ this equation was obtained by Vilenkin (1983).
The stationary solution ($\partial P_c/\partial t = 0$) would be
$$P_c \sim \exp(3M_P^4/8V(\phi)), \qquad (4.7)$$
which is equal to the square of the Hartle-Hawking wave function of
the universe (3.5) (Starobinsky, 1986; Linde, 1986b, c). A more general
stationary solution would also contain a term
$$\frac{3\pi j_0 M_P^2}{H V(\phi)} \exp\left(\frac{3M_P^4}{8V(\phi)}\right) \int d\phi' \exp\left(-\frac{3M_P^4}{8V(\phi')}\right) \qquad (4.8)$$
(Starobinsky, 1986), which would correspond to a stationary flux of
probability $j_0$ from $\phi = +\infty$ to $\phi = -\infty$. However, eq. (4.6) actually has no
normalisable stationary solutions, since it is not valid at $|\phi| \lesssim M_P$ (and at
$|\phi| \gtrsim M_P^2/m$).† The field $\phi$ at $|\phi| \lesssim M_P$ is rapidly oscillating rather than slowly
rolling down, there is no diffusion of the field $\phi$ from the region $\phi \lesssim M_P$ to
the region $\phi \gg M_P$, and the averaged field $\phi$ decreases according to eq. (2.4).
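The statement that (4.7) is a stationary solution of eq. (4.6) can be checked numerically. The Python sketch below evaluates, for $P_c = \exp(3M_P^4/8V)$ and $V = \tfrac{1}{2}m^2\phi^2$, the pieces of the probability flux $\partial(\mathcal{D}P_c)/\partial\phi + (P_c/3H)\,\partial V/\partial\phi$ divided by $P_c$. It assumes $H^2 = 8\pi V/3M_P^2$, uses central finite differences for the derivatives, and takes illustrative values of $m$ and $\phi$; all function names are hypothetical.

```python
import math

M_P = 1.0  # assumption: units in which the Planck mass is 1


def potential(phi, m):
    """V(phi) = m^2 phi^2 / 2."""
    return 0.5 * m ** 2 * phi ** 2


def hubble(phi, m):
    """H from H^2 = 8 pi V / (3 M_P^2), valid during slow roll."""
    return math.sqrt(8.0 * math.pi * potential(phi, m) / 3.0) / M_P


def diffusion(phi, m):
    """Diffusion coefficient H^3 / (8 pi^2) of eq. (4.6)."""
    return hubble(phi, m) ** 3 / (8.0 * math.pi ** 2)


def log_p(phi, m):
    """ln P_c for the candidate solution (4.7)."""
    return 3.0 * M_P ** 4 / (8.0 * potential(phi, m))


def ddphi(f, phi, m, rel_step=1.0e-5):
    """Central finite-difference derivative of f with respect to phi."""
    h = rel_step * phi
    return (f(phi + h, m) - f(phi - h, m)) / (2.0 * h)


def flux_terms(phi, m):
    """The three pieces of [d(D P)/dphi + P V'/(3H)] / P for P = exp(log_p)."""
    gradient_term = diffusion(phi, m) * ddphi(log_p, phi, m)   # D * P'/P
    drift_term = ddphi(potential, phi, m) / (3.0 * hubble(phi, m))
    prefactor_term = ddphi(diffusion, phi, m)                  # dD/dphi
    return gradient_term, drift_term, prefactor_term


m, phi = 1.0e-5, 1.0e3  # illustrative values with M_P << phi << M_P^2/m
grad, drift, pref = flux_terms(phi, m)

print(f"D P'/P        = {grad:+.6e}")
print(f"V'/(3H)       = {drift:+.6e}")
print(f"dD/dphi       = {pref:+.6e}")
print(f"(dD/dphi) / (V'/(3H)) = {pref / drift:.3e}")
print(f"4 V / M_P^4           = {4.0 * potential(phi, m) / M_P ** 4:.3e}")
```

The first two terms cancel, and the remaining term from the $\phi$-dependence of $\mathcal{D}$ is smaller by a factor $4V/M_P^4$, so (4.7) is stationary up to a slowly varying pre-exponential factor as long as $V(\phi) \ll M_P^4$.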
A detailed discussion of the solutions of eq. (4.6) is included in a separate
publication (Goncharov and Linde, 1986, 1987; Linde, 1986c). Here we
would like to discuss some qualitative features of these solutions and their
physical interpretation.
Let us consider a domain of initial size $l \sim H^{-1}(\phi)$. The field $\phi = \phi_0$ inside
such a domain is distributed homogeneously, since its classical part becomes
homogeneous due to inflation and perturbations $\delta\phi(x)$ become essential
only at a scale $l \gtrsim H^{-1}$. There are two main stages of evolution of the field $\phi$
inside this domain. The first stage has a duration $\Delta t \sim 2\sqrt{3\pi}\,\phi_0/mM_P$.
During this time the average field $\bar\phi_c$ inside this domain remains practically
unchanged, $\bar\phi_c \approx \phi_0$, eq. (2.4), whereas the dispersion $\Delta_c^2 = \langle\delta\phi^2\rangle$ grows as
$(H^3/4\pi^2)\Delta t$ (Vilenkin and Ford, 1982; Linde, 1982c; Starobinsky, 1982) up to
the value $\Delta_c^2 \sim (Cm^2\phi_0^4/M_P^4)$, $C = O(1)$. Note that, at $m^2\phi_0^2/2 \ll M_P^4$, $\Delta_c$ is much
smaller than $\phi_0$. Therefore the behaviour of the field $\bar\phi_c$ at the next stage of
the universe's expansion can be described by eq. (2.4), and the field $\bar\phi_c$
decreases linearly. Fluctuations $\delta\phi$ are also generated at that stage, but they
have much smaller amplitude and dispersion, whereas the dispersion of
$\delta\phi(x)$ produced at the first stage behaves as $\dot{\bar\phi}_c$ (Guth and Pi, 1982), and it
therefore remains constant during inflation, $|\delta\phi(x)| \sim |\dot{\bar\phi}_c| = mM_P/2\sqrt{3\pi} =$
const. As a result, the function $P_c(\phi)$ is mainly determined by the
fluctuations produced at the first stage, and is given by
$$P_c(\phi) \sim \exp\left(-\frac{(\phi - \bar\phi_c(t))^2}{2\Delta_c^2}\right) \sim \exp\left(-\frac{(\phi - \bar\phi_c)^2 M_P^4}{2Cm^2\phi_0^4}\right). \qquad (4.9)$$
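The growth law $\Delta_c^2 = (H^3/4\pi^2)\Delta t$ behind eq. (4.9) can be illustrated by a simple random walk. The Python sketch below is a toy model under stated assumptions, not the derivation used in the cited papers: $H$ is held constant, each Hubble time contributes one independent Gaussian kick of standard deviation $H/2\pi$, and the values of $m$, $\phi_0$, the number of steps and the number of walkers are illustrative. It also evaluates the coefficient $C$ obtained when the first-stage duration $\Delta t = 2\sqrt{3\pi}\,\phi_0/mM_P$ is inserted.

```python
import math
import random

M_P = 1.0  # assumption: units in which the Planck mass is 1


def hubble(phi, m):
    """Eq. (2.7): H = 2 sqrt(pi/3) * m * phi / M_P."""
    return 2.0 * math.sqrt(math.pi / 3.0) * m * phi / M_P


def first_stage_dispersion(phi0, m):
    """(H^3 / 4 pi^2) * Delta t with Delta t = 2 sqrt(3 pi) phi0 / (m M_P)."""
    dt = 2.0 * math.sqrt(3.0 * math.pi) * phi0 / (m * M_P)
    return hubble(phi0, m) ** 3 * dt / (4.0 * math.pi ** 2)


def simulate_variance(h, n_hubble_times, n_walkers, seed=0):
    """Sample variance of a walk with one Gaussian kick of standard
    deviation H / (2 pi) per Hubble time, with H held constant."""
    rng = random.Random(seed)
    sigma = h / (2.0 * math.pi)
    finals = []
    for _ in range(n_walkers):
        x = 0.0
        for _ in range(n_hubble_times):
            x += rng.gauss(0.0, sigma)
        finals.append(x)
    mean = sum(finals) / n_walkers
    return sum((x - mean) ** 2 for x in finals) / (n_walkers - 1)


m, phi0 = 1.0e-5, 1.0e3  # illustrative values with M_P << phi0 << M_P^2/m
h = hubble(phi0, m)

n_steps = 50             # number of Hubble times simulated
predicted = h ** 3 / (4.0 * math.pi ** 2) * (n_steps / h)
measured = simulate_variance(h, n_steps, n_walkers=5000)

c_coeff = first_stage_dispersion(phi0, m) * M_P ** 4 / (m ** 2 * phi0 ** 4)

print(f"predicted variance = {predicted:.4e}")
print(f"measured variance  = {measured:.4e}")
print(f"coefficient C      = {c_coeff:.4f}")
```

In this toy model the coefficient comes out as $C = 4/3$, consistent with the statement $C = O(1)$.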
† Eq. (4.8) (without the term (4.7)) may represent such a solution, but only if there exists
a constant probability flux from the region $\phi \gg M_P^2/m$, where eq. (4.6) is not valid.
stage of diffusion from $\phi_0$ to some field $\phi$ with $V(\phi) \lesssim M_P^4$ is given by
$$P_c(\phi) \sim \exp\left(-\sqrt{3\pi}\, \frac{3M_P^3}{m^3\phi\, t}\right). \qquad (4.11)$$
This solution describes quantum creation of domains of a size $l \gtrsim H^{-1}(\phi)$,
which occurs due to the diffusion of the field $\phi$ from $\phi_0 \lesssim M_P^2/m$ to $\phi \gg \phi_0$.
Direct diffusion with formation of a domain filled with the field $\phi$ is
possible only during the time $t = c(2\sqrt{3\pi}\,\phi/mM_P)$, $c = O(1)$. At larger
times a more rapid process is a diffusion to some field $\tilde\phi > \phi$ and a
subsequent classical rolling down from $\tilde\phi$ to $\phi$. Therefore one may interpret
a distribution $P_c(\phi)$ formed after a time $t = c(2\sqrt{3\pi}\,\phi/mM_P)$ as a probability
of a quantum creation of a mini-universe filled with a field $\phi$ (Linde,
1986b, c; Goncharov and Linde, 1987)
$$P_c(\phi) \sim \exp\left(-c\, \frac{3M_P^4}{2m^2\phi^2}\right) \sim \exp\left(-2c\, \frac{3M_P^4}{8V(\phi)}\right), \qquad (4.12)$$
which is in agreement with the estimate (3.6) of the probability of a
quantum creation of the inflationary universe (Linde, 1984a, b; Zeldovich
and Starobinsky, 1984; Rubakov, 1984; Vilenkin, 1983). The same result is
valid for all sufficiently steep potentials $V(\phi)$, in particular for all $V(\phi) \sim \phi^n$
(Linde, 1986b, c; Goncharov and Linde, 1987), and, as discussed by
Rubakov (1984), no quantum particle production occurs during the process
of the mini-universe creation. Presumably, the process considered above is
complementary to the process of quantum creation 'from nothing' (see also
the next section). In our case, creation of an inflationary (mini-) universe
618 A . Linde
assume that the universe was created as a whole at some initial moment t =0
(Linde, 1982e).
There are some problems associated with this suggestion. The parts of the
universe with $\phi = 0$ locally have the same geometrical properties as de Sitter
space. A geodesically complete de Sitter space is closed and its scale factor is
$a(t) = H^{-1}\cosh Ht$. For $-\infty < t < 0$, $a(t)$ decreases, and the lifetime of such a
space in the unstable vacuum state $\phi = 0$ is finite. Therefore such a universe
cannot remain in the unstable state $\phi = 0$ at $t = 0$, when inflation starts. One
may therefore wonder whether the eternally existing universe without initial
singularity and with the major part of its volume in the unstable state $\phi = 0$
is geodesically complete, or whether the initial singularity in this scenario is
unavoidable (Linde, 1983a).
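The behaviour of the closed de Sitter scale factor quoted above can be checked numerically. The following is only a minimal sketch: the function names and the choice of units with $H = 1$ are assumptions made for illustration and do not come from the text.

```python
import math

def scale_factor(t, hubble=1.0):
    """Closed de Sitter scale factor a(t) = cosh(H t) / H."""
    return math.cosh(hubble * t) / hubble

def scale_factor_rate(t, hubble=1.0):
    """Time derivative da/dt = sinh(H t)."""
    return math.sinh(hubble * t)

# Sample the contracting branch (t < 0), the bounce (t = 0)
# and the expanding branch (t > 0).
for t in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"t = {t:+.1f}   a = {scale_factor(t):8.4f}   da/dt = {scale_factor_rate(t):+8.4f}")
```

The rate is negative for every $t < 0$ and vanishes only at $t = 0$, which is the contraction referred to in the paragraph above.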
One should note that the global geometry of such a universe differs
considerably from the geometry of de Sitter space, since the part of the
coordinate volume which remains in the de Sitter phase $\phi = 0$ becomes
infinitesimally small in the course of time. Therefore it is not excluded that
such a universe is geodesically complete, which gives us a possible solution
of the problem of the initial cosmological singularity (Linde, 1982e).
However, there is no need to enter into a detailed discussion of this
possibility here, since no realistic versions of the old and of the new
inflationary universe scenario have been elaborated so far, and the
probability of quantum creation of the universe with $\phi = 0$, $V(0) \lesssim 10^{-10}M_p^4$
(see the beginning of this chapter) is vanishingly small, see eq. (3.6).
On the other hand, in the eternally existing chaotic self-reproducing
inflationary universe discussed above the major part of the physical volume
of the universe emerges from the regions with $V(\phi) \sim M_p^4$. In this sense the
Thus, there exist two main versions of the chaotic inflationary scenario.
(i) There may exist an initial global singular space-like hypersurface. In
this case the universe as a whole emerges from a state with a Planck density
$\rho \sim M_p^4$ at some moment $t = t_p$, at which it becomes possible to speak about
the universe in terms of classical spacetime. A natural initial value of the field
$\phi$ at the Planck time is $\phi \sim M_p^2/m \sim 10^5 M_p$ for $m \sim 10^{-5}M_p$. Then the
universe endlessly reproduces itself, due to generation of long-wave
fluctuations of the field $\phi$. This process occurs for $M_p(M_p/m)^{1/2} \lesssim \phi \lesssim M_p^2/m$
($300M_p \lesssim \phi \lesssim M_p^2/m$), i.e. for $mM_p^3 \lesssim V(\phi) \lesssim M_p^4$. In the domains with $\phi \lesssim
M_p(M_p/m)^{1/2}$ this process becomes inefficient, and each such domain after
inflation looks like a Friedmann mini-universe of a size $l \sim
M_p^{-1}\exp[2\pi(M_p/m)] \sim 10^{10^5}$ cm. In this model the universe has a beginning
but has no end.
(ii) The possibility that the universe has a global singular space-like
hypersurface seems rather improbable unless the universe is compact and its
initial size is $O(M_p^{-1})$. There is no reason for different, causally disconnected,
regions of the universe to start their expansion simultaneously. If the
universe is not compact, there should be no global beginning for its
evolution. A model which illustrates this possibility was suggested above.
The inflationary universe may infinitely reproduce itself, and it may have no
beginning and no end. Any two points of such a universe in a sufficiently
distant past could be causally connected and corresponding observers could
synchronise their clocks even though later they may live in mini-universes
which have become causally disconnected due to the exponential expansion
of the universe.
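The numbers quoted in version (i) can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: it takes the quadratic potential $V(\phi) = m^2\phi^2/2$ (assumed here because it matches the bounds $mM_p^3 \lesssim V \lesssim M_p^4$ given above), works in units with $M_p = 1$, and uses variable names that are purely illustrative.

```python
import math

M_P = 1.0          # Planck mass, used as the unit of mass (assumption)
m = 1e-5 * M_P     # inflaton mass value quoted in the text

def potential(phi):
    """Assumed quadratic potential V(phi) = m^2 phi^2 / 2."""
    return 0.5 * m**2 * phi**2

# Upper edge of the self-reproduction range: Planck density, V ~ M_p^4.
phi_upper = M_P**2 / m
# Lower edge of the range: phi ~ M_p (M_p / m)^(1/2).
phi_lower = M_P * math.sqrt(M_P / m)

# Exponent in the size estimate l ~ M_p^-1 exp[2 pi (M_p / m)].
exponent = 2.0 * math.pi * (M_P / m)

print(f"phi_lower = {phi_lower:.1f} M_p, phi_upper = {phi_upper:.3g} M_p")
print(f"V(phi_lower) = {potential(phi_lower) / (m * M_P**3):.2f} m M_p^3")
print(f"V(phi_upper) = {potential(phi_upper) / M_P**4:.2f} M_p^4")
print(f"growth exponent 2 pi M_p / m = {exponent:.4g}")
```

With these inputs the lower edge comes out near $316M_p$, in line with the rounded value $300M_p$ in the text, and the two edges correspond to $V = mM_p^3/2$ and $V = M_p^4/2$ respectively.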
Note that the 'energy' of the scale factor $a(t)$ is negative, and inflation may
be considered as a result of instability arising from the pumping of energy
from $a(t)$ to the field $\phi(t)$ (Linde, 1984c). For example, the energy of the
scalar field $\phi$ in a closed inflationary universe grows exponentially, whereas
the energy of the scale factor $a(t)$ becomes exponentially large and negative,
the sum of their energies being equal to zero. It was unclear why this
instability, being potentially possible, does not develop after inflation. We
now know a possible answer. The gravitational vacuum in the major part of
its physical volume always remains in the unstable inflationary state with
energy density of the order of $M_p^4$. However, during the evolution of the
universe many islands of stability are formed, one of which is the
mini-universe in which we now live.
Note that this process may occur at densities many orders of magnitude
smaller than $M_p^4$. Therefore to prove the very existence of the process of
self-reproduction of the inflationary universe in our scenario there is no need to
appeal to unknown physical processes at densities greater than the Planck
density $\rho_p \sim M_p^4$.
formed. Since the Hubble parameter $H(\phi)$ during mini-universe formation is
very large, $H = (8\pi V(\phi)/3M_p^2)^{1/2} \gtrsim 10^{-2}M_p \sim 10^{17}$ GeV, fluctuations of the
scalar fields $\Phi$ are strong enough to transfer the classical fields $\Phi$ in the
new-born mini-universe from one local minimum of the effective potential
$V(\Phi, \phi)$ to another (Linde, 1983c, 1984c). This changes the low-energy
elementary particle physics inside the new mini-universe. A particular
example which is relevant to this effect is the supersymmetric SU(5) model.
The effective potential $V(\Phi)$ in this model has several minima of
approximately equal depth, and only one of them corresponds to the
desirable symmetry breaking SU(3) × U(1). Even if the universe initially was
in one particular vacuum state corresponding to one of these minima of
$V(\Phi)$, after inflation it becomes divided into many mini-universes
corresponding to all possible minima of $V(\Phi)$. A typical time which is
necessary for a quantum tunneling from one local minimum of $V(\Phi)$ to
another is many orders greater than $10^{10}$ years, which is the age of the
observable part of the universe. In one of these mini-universes the vacuum
solutions, such as
$$ds^2 = dt^2 - \tfrac{1}{3}H^{-2}\left(\cosh^2(\sqrt{3}\,Ht)\,dx^2 + d\theta^2 + \sin^2\theta\,d\phi^2\right). \qquad (6.2)$$
This solution describes a universe that is a product of a two-dimensional
de Sitter space $dS_2$ and a compact sphere $S_2$ with a very small radius
$R = (\sqrt{3}H)^{-1}$. This is a particular case of a Kantowski-Sachs universe
(Kantowski and Sachs, 1966; Kofman et al., 1983; Paul et al., 1986). This
solution is unstable; it (locally) transforms into $dS_4$. However, it is possible
to stabilize it, for example by an appropriate addition of $R^n$ terms to the
Lagrangian, eq. (2.1) (Deruelle and Madore, 1986; Linde and Zelnikov,
1987).
Recently it was shown that long-wave quantum perturbations of scalar
fields $\phi$ similar to those discussed in Section 13.4 are generated in the
$dS_2 \times S_2$ universe (6.2) as well (Kofman and Starobinsky, 1987).† This result
is directly related to the main conclusions of the present chapter (Linde,
1986c). Namely, if inflation is initially three dimensional as in ordinary de
Sitter space $dS_4$ (6.1), then fluctuations of the scalar field $\phi$ create some
inflationary domains (mini-universes), in which $V(\phi) \sim M_p^4$. Due to large
fluctuations of $\phi$ and of the metric $g_{\mu\nu}$ inside domains of a size $O(H^{-1}) \sim M_p^{-1}$,
expansion inside some of these domains may become one dimensional (6.2),
independently of what occurs in the nearby domains. This leads to formation
of Kantowski-Sachs 'branches', which spread out of de Sitter space.
However, fluctuations in the Kantowski-Sachs mini-universes lead again to
creation of domains with $V(\phi) \sim M_p^4$ where de Sitter mini-universes (6.1) can
be created. With account taken of this effect, the inflationary universe in our
scenario may appear as a system of huge inflationary bubbles $dS_4$ connected
with each other by thin inflationary tubes $dS_2 \times S_2$ of exponentially large
length (Fig. 13.2).
The possibility of existence of such a complicated spacetime structure is
directly related to the 'no-hair' theorem for de Sitter space, which is valid
also for the one-dimensional exponential expansion (6.2). The processes
which occur in a part of a tube (6.2) of initial size $\Delta l \lesssim H^{-1}$ proceed
independently of what occurs in other parts of the universe. Actually, the
Kantowski-Sachs universe (6.2) is not the only solution of the Einstein
equations with $V(\phi) > 0$ which leads to spontaneous compactification of a
† This is actually a general property of all inflationary models related to the existence of
$d$ zero modes of a scalar field $\phi$ in the Euclidean section $S_d$ of $d$-dimensional de Sitter
space $dS_d$.
Fig. 13.2. This picture gives some idea of the global structure of the chaotic
inflationary universe. Child universes created where $V(\phi) \ll M_p^4$ have the
same 'genetic code' as their mother universes: they have the same number of
dimensions and the same (or almost the same) vacuum structure. However,
the universes created where $V(\phi)$ is not much smaller than $M_p^4$ are
'mutants', which may have different dimensionality and different low-energy
elementary particle physics inside them. Each mini-universe may die, but the
universe as a whole has no end and may have no beginning. A typical
thickness of 'tubes' connecting mini-universes after inflation may become
very large. However, during inflation some of the tubes have a very small
thickness. For example, some of the de Sitter mini-universes $dS_4$ may be
connected by the Kantowski-Sachs tubes $dS_{d-n} \times S_n$, where the radius of the
sphere $S_n$ is of the order $O(H^{-1}(\phi))$. The mini-universes compactified during
inflation may serve as seeds for the next stage of the process of
compactification, which occurs after inflation.
$M_p^4$ for ever due to the effects connected with the long-wave fluctuations of
the field $\phi$, see Section 13.4. In those domains of the universe in which the
field ¢ slowly decreases, the radius of compactification R11(¢) slowly grows
as $H^{-1}(\phi)$ unless there exists some dynamical mechanism which fixes the
radius of compactification at some value R 0 � M; 1 . In this sense the
domains of the universe compactified during inflation may serve as seeds for
different possible types of compactified spaces in the Kaluza-Klein theories.
This provides us with a new scenario of compactification as compared
with the scenario of a classical power-law Kasner-like anisotropic expansion
discussed by several authors (see, e.g., Freund, 1982; Freund and Oh, 1985).
Namely, during inflation at $mM_p^3 \lesssim V(\phi) \lesssim M_p^4$ mini-universes with all
possible types of compactification are formed in different causally
disconnected regions of the universe, and those mini-universes in which
inflation (in the uncompactified directions) remains possible after their
formation later become exponentially large. The process of formation of
new mini-universes has no end; in the major part of the physical volume of
the universe this process occurs even now, and therefore even if the
probability of formation of a mini-universe of some particular type is
strongly suppressed, many such mini-universes should exist at present.
According to this scenario, we live in a four-dimensional spacetime with
our type of compactification, not for the reason that other types of
compactification are impossible (or improbable), but for the reason that life
of our type cannot exist in spaces with other dimensionality and with a
different low-energy elementary particle physics (Linde, 1983a, 1984c,
1986c; see also Rozental, 1980, 1984).
13.6.3 Inflation and the anthropic principle
Discussion of some of the problems considered above is based on the
so-called anthropic principle (Dicke, 1961; Collins and Hawking, 1973; Carr
and Rees, 1979; Rozental, 1980; Linde, 1984c). According to this principle,
we live in a four-dimensional, homogeneous, isotropic world with
$e^2/4\pi = 1/137$, $m_e = 0.5$ MeV, etc., for the reason that life of our type in a
different world would be impossible. For example, life of our type would be
Several years ago the anthropic principle seemed rather esoteric, since it
implied that many universes might exist, but we live in just one of them, which
is sufficiently suitable for us. It was not clear in what sense one could speak
of many universes if our universe is unique, and whether such fundamental
constants of nature as the dimensionality of spacetime, the vacuum energy,
the value of electric charge, etc. , can change when one 'travels' from one
universe to another.
A possible answer to the first part of this question was suggested by the
many-worlds interpretation of quantum mechanics (Everett, 1957). Further
investigation of this possibility may be of profound importance. However, it
seems doubtful that it is possible to use this interpretation correctly without
a proper understanding of the nature of consciousness. Does an observer
just observe the universe, or does he 'create' it? (Wheeler, 1979). What is
actually split: the universe or consciousness? Can consciousness exist 'by
itself' (like spacetime without matter) or is it merely an arena for the
manifestation of spacetime and matter? In what sense can consciousness
'choose' a universe to live in?
Thus the development of physics reveals problems which traditionally
were beyond the scope of physics . It seems that to go further we must
investigate these problems without prejudice, rather than wait until
philosophers try to do it for us. However, such a path is not easy to follow,
and it seems encouraging that inflation makes it possible to circumvent
some of the above-mentioned problems that precluded the justification of
the anthropic principle. Namely, from the scenario discussed in this chapter
it follows that even if the universe at some time contains only one domain
(mini-universe) of initial size $O(H^{-1}) \gtrsim O(M_p^{-1})$, later it splits into many
causally disconnected mini-universes of exponentially large size, in which all
possible types of compactification and all possible vacuum states are
realised . In this sense our universe actually consists of many universes of all
possible types . Whereas several years ago the dimensionality of spacetime,
the vacuum energy density, the value of electric charge, the Yukawa
couplings, etc. , were regarded as true constants, it now becomes clear that
these 'constants' actually depend on the type of compactification and on the
mechanism of symmetry breaking, which may be different in different
domains of our universe .
One of the main objections to the anthropic principle was the assertion
that for the existence of life of our type there is no need for our universe to be
so uniform on scales much greater than the scale of the Solar System. We
now know the answer to this question. The size of the homogeneous locally
Friedmannian mini-universes in which $\delta\rho/\rho \sim 10^{-4}$ (which is necessary for
the formation of galaxies of our type) becomes after inflation much greater
than the size of the Solar System. For example, in the model (2.1) $\delta\rho/\rho$ on a
galactic scale is given by $O(10)(m/M_p)$, which gives $m \sim 10^{-5}M_p$. On the
other hand, the size of a uniform part of our universe is greater than
$M_p^{-1}\exp[2\pi(M_p/m)] \sim 10^{10^5}$ cm, which is much greater than the observable
part of the universe ($\sim 10^{28}$ cm).
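The comparison of scales made above is easiest to follow in logarithms, since the size itself overflows any floating-point type. The sketch below makes three assumptions that are not in the text: the $O(10)$ coefficient is taken to be exactly 10, the Planck length is taken as roughly $1.6 \times 10^{-33}$ cm, and all names are illustrative.

```python
import math

DENSITY_CONTRAST = 1e-4        # delta rho / rho needed for galaxy formation (from the text)
PREFACTOR = 10.0               # the O(10) coefficient, taken as exactly 10 (assumption)
PLANCK_LENGTH_CM = 1.6e-33     # approximate Planck length in cm (assumption)
LOG10_OBSERVABLE_CM = 28.0     # size of the observable part of the universe (from the text)

# delta rho / rho ~ O(10) (m / M_p)  =>  m / M_p ~ 1e-5
mass_ratio = DENSITY_CONTRAST / PREFACTOR

# l ~ M_p^-1 exp[2 pi (M_p / m)]; work with log10 to avoid overflow.
log10_uniform_cm = (2.0 * math.pi / mass_ratio) / math.log(10.0) + math.log10(PLANCK_LENGTH_CM)

print(f"m / M_p ~ {mass_ratio:.1e}")
print(f"log10(uniform size / cm)    ~ {log10_uniform_cm:.4g}")
print(f"log10(observable size / cm) = {LOG10_OBSERVABLE_CM}")
```

The first logarithm is of order $10^5$ while the second is 28, so the quoted size $10^{10^5}$ cm should be read as an order-of-magnitude statement about the exponent.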
This means that actually it is possible to justify some kind of weak
anthropic principle in the inflationary cosmology. The line of thought
advocated here is an alternative to the old assumption that in a 'true' theory
it must be possible to compute unambiguously all masses, coupling
constants, etc. From our point of view, it is rather improbable that a 'true'
theory must have only one 'true' ground state. In the context of the unified
theories of all fundamental interactions, the validity of this assumption
becomes unlikely. According to the scenario discussed in this chapter, this
assumption is probably incorrect; in any case it is not necessary. The old
question of why our universe is the only possible one now is replaced by a
new question : in which theories is the existence of mini-universes of our type
possible? This question is still very difficult, but it is much easier than the old
one.
13.7 Conclusions
The inflationary universe scenario continues to develop rapidly. It appears
that the global structure of the universe according to this scenario is
determined by quantum effects, that the major part of the physical volume of
the universe should even now be in an inflationary state with $V(\phi) \sim M_p^4$,
that the universe eternally reproduces itself, and that it consists of
exponentially many mini-universes of different types. Some of these
results are model-dependent, some are not. We do not know what the fate of
the ideas discussed above will be, nor how they will be modified by future
developments of elementary particle physics and of the theory of
superstrings. In any case, the inflationary universe scenario may serve as a
good example, showing how many exciting surprises the theory of gravity
developed three hundred years ago can still give us.
References
14.1 Introduction
A few years ago I received a reprint request from an Institute of Quantum
Oceanography somewhere in the Soviet Far East. I thought: What could be
more ridiculous? Oceanography is a subject that is pre-eminently classical
because it describes the behaviour of very large systems. Moreover,
oceanography is based on the Navier-Stokes equation, which is a classical
effective theory describing how large numbers of particles interact according
to a more basic theory, quantum electrodynamics. Presumably, any
quantum effects would have to be calculated in the underlying theory.
Why is quantum cosmology any less ridiculous than quantum
oceanography? After all, the universe is an even bigger and more classical
system than the oceans. Further, general relativity, which we use to describe
the universe, may be only a low energy effective theory which approximates
some more basic theory, such as string theory.
The answer to the first objection is that the spacetime structure of the
universe is certainly classical today, to a very good approximation.
However, there are problems with a large or infinite universe, as Newton
realised. One would expect the gravitational attraction between all the
different bodies in the universe to cause them to accelerate towards each
other. Newton argued that this would indeed happen in a large but finite
universe. However, he claimed that in an infinite universe the bodies would
not all come together because there would not be a central point for them to
fall to. This is a fallacious argument because in an infinite universe any point
can be regarded as the centre. A correct treatment shows that an infinite
universe cannot remain in a stationary state if gravity is attractive. Yet so
firmly held was the belief in an unchanging universe that when Einstein first
proposed general relativity he added a cosmological constant in order to
632 S. W Hawking
obtain a static solution for the universe, thus missing a golden opportunity
to predict that the universe should be expanding or contracting. I shall
discuss later why it should be that we observe it to be expanding and not
contracting.
If one traces the expansion back in time, one finds that all the galaxies
would have been on top of each other about 15 thousand million years ago.
At first it was thought that there was an earlier contracting phase and that
the particles in the universe would come very close to each other but would
miss each other. The universe would reach a high but finite density and
would then re-expand (Lifshitz and Khalatnikov, 1963). However, a series of
theorems (Hawking and Penrose, 1970; Hawking and Ellis, 1973) showed
that if classical general relativity were correct, there would inevitably be a
singularity at which all physical laws would break down. Thus classical
cosmology predicts its own downfall. In order to determine how the classical
evolution of the universe began one has to appeal to quantum cosmology
and study the early quantum era.
But what about the second objection? Is general relativity the
fundamental underlying theory of gravity or is it just a low energy
approximation to some more basic theory? The fact that pure general
relativity is not finite at two loops (Goroff and Sagnotti, 1985) suggests it is
not the ultimate theory. It is an open question whether supergravity, the
supersymmetric extension of general relativity, is finite at three loops and
beyond, but no one is prepared to do the calculation. Recently, however,
people have begun to consider seriously the possibility that general relativity
may be just a low energy approximation to some theory such as
superstrings, although the evidence that superstrings are finite is not, at the
moment, any better than that for supergravity.
Even if general relativity is only a low energy effective theory it may yet be
sufficient to answer the key question in cosmology : Why did the classical
evolution phase of the universe start off the way it did? An indication that
this is indeed the case is provided by the fact that many of the features of the
universe that we observe can be explained by supposing that there was a
phase of exponential 'inflationary' expansion in the early universe. This is
described in more detail in the articles by Linde, and Blau and Guth
(Chapters 13, 12, this volume). In order not to generate fluctuations in the
microwave background bigger than the observational upper limit of $10^{-4}$,
the energy density in the inflationary era cannot have been greater than
about $10^{-10}m_p^4$ (Rubakov et al., 1982; Hawking, 1984a). This would put the
inflationary era well inside the regime in which general relativity should be a
Quantum cosmology 633
good approximation. It would also be well inside the region in which any
possible extra dimensions were compactified. Thus it might be reasonable to
hope that the saddle point or semi-classical approximation to the quantum
mechanical path integral for general relativity in four dimensions would give
a reasonable indication of how the universe began. In what follows I shall
assume that the lowest-order term in the action for a spacetime metric is the
Einstein one, as it must be for agreement with ordinary, low energy,
observations. However, I shall bear in mind the possibilities of higher-order
terms and extra dimensions.
14.2 The quantum state of the universe
I shall use the Euclidean path integral approach. The basic assumption of
this is that the 'probability' in some sense of a positive definite spacetime
metric $g_{\mu\nu}$ and matter fields $\phi$ on a manifold $M$ is proportional to $\exp(-\tilde I)$
where $\tilde I$ is the Euclidean action. In general relativity
$$\tilde I = -\frac{1}{16\pi}\int_M R\,(g)^{1/2}\,d^dx + \frac{1}{8\pi}\int_{\partial M} K\,(h)^{1/2}\,d^{d-1}x - \int_M L_m[\phi]\,(g)^{1/2}\,d^dx,$$
where $h$ and $K$ are respectively the determinant of the first fundamental form
and the trace of the second fundamental form of the boundary $\partial M$ of $M$. In
string theory the action $\tilde I$ of a metric $g_{\mu\nu}$, antisymmetric tensor field $B_{\mu\nu}$ and
dilaton field $\phi$ is given by the log of the path integral of the string action
over all maps of string world sheets into the given space. For most fields the
path integral will not be conformally invariant. This will mean that the path
integral diverges and $\tilde I$ will be infinite. Such fields will be suppressed by an
infinite factor. However, the path integral over maps into certain
background fields will be conformally invariant. The action for these fields
will be that of general relativity plus higher-order terms.
The probability of an observable $O$ having the value $A$ can be found by
summing the projection operator $\Pi_A$, weighted by the basic probability, over all
Euclidean metrics and fields belonging to some class $C$:
$$P_O(A) = \int_C d[g_{\mu\nu}]\,d[\phi]\,\Pi_A \exp(-\tilde I),$$
where $\Pi_A = 1$ if the value of $O$ is $A$ and zero otherwise. From such
probabilities and the conditional probability, the probability of $A$ given $B$,
$$P(A|B) = \frac{P(A,B)}{P(B)},$$
where $P(A,B)$ is the joint probability of $A$ and $B$, one can calculate the
outcome of all allowable measurements.
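The way projection operators and conditional probabilities enter can be illustrated with a finite toy model in which the path integral is replaced by a sum over a handful of 'histories'. Everything in the sketch below is invented for illustration (the list of histories, their actions and the two yes/no observables); it is not a discretisation of gravity, only a picture of how the weights $\exp(-\tilde I)$ are combined.

```python
import math

# Hypothetical toy ensemble: each history carries a Euclidean action and
# the values of two yes/no observables A and B.
histories = [
    {"action": 0.5, "A": 1, "B": 0},
    {"action": 1.0, "A": 1, "B": 1},
    {"action": 2.0, "A": 0, "B": 1},
    {"action": 3.0, "A": 0, "B": 0},
]

def weight(history):
    """Basic 'probability' weight exp(-I) of a single history."""
    return math.exp(-history["action"])

def probability(projector):
    """Normalised sum of exp(-I) over histories on which the projector equals 1."""
    total = sum(weight(h) for h in histories)
    return sum(weight(h) for h in histories if projector(h)) / total

def conditional(projector_a, projector_b):
    """P(A | B) = P(A, B) / P(B)."""
    joint = probability(lambda h: projector_a(h) and projector_b(h))
    return joint / probability(projector_b)

is_a = lambda h: h["A"] == 1
is_b = lambda h: h["B"] == 1

print("P(A)     =", probability(is_a))
print("P(B)     =", probability(is_b))
print("P(A | B) =", conditional(is_a, is_b))
```

The overall normalisation cancels in the conditional probability, which is why only ratios of such sums matter for predictions.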
The choice of the class $C$ of metrics and fields on which one considers the
probability measure $\exp(-\tilde I)$ determines the quantum state of the universe.
C is usually specified by the asymptotic behaviour of the metric and matter
fields, just as the state of the universe in classical general relativity can be
specified by the asymptotic behaviour of these fields. For instance, one could
demand that C consist of all metrics that approach the metric of Euclidean
flat space outside some compact region and all matter fields that go to zero
at infinity. The quantum state so defined is the vacuum state used in S matrix
calculations. In these one considers incoming and outgoing states that differ
from Euclidean flat space and zero matter fields at infinity in certain ways.
The path integral over all such fields gives the amplitude to go from the
initial to the final state.
In these S matrix calculations one considers only measurements at infinity
and does not ask questions about what happens in the middle of the
spacetime. However, this is not much help for cosmology : it is unlikely that
the universe is asymptotically flat, and, even if it were, we are not really
interested in what happens at infinity but in events in some finite region
surrounding us. Suppose we took the class C of metrics and matter fields
that defines the quantum state of the universe to be the class described above
of asymptotically Euclidean metrics and fields. Then the path integral to
calculate the probability of a value of an observable $O$ would receive
contributions from two kinds of metrics. There would be connected
asymptotically Euclidean metrics and there would be disconnected metrics
which consisted of a compact component that contained the observable $O$
and a separate asymptotically Euclidean component. One cannot exclude
disconnected metrics from the class C because any disconnected metric can
be approximated arbitrarily closely by a connected metric in which the
different components are joined by thin tubes with negligible action. It turns
out that for observables that depend only on a compact region the dominant
contribution to the path integral comes from the compact regions of
disconnected metrics. Thus, as far as cosmology is concerned, the
probabilities of observables would be almost the same if one took the class C
to consist of compact metrics and matter fields that are regular on them.
In fact, this seems a much more natural choice for the class C that defines
the quantum state of the universe. It does not refer to any unobserved
asymptotic region and it does not involve any boundary or edge to
spacetime at infinity or a singularity where one would have to appeal to
some outside agency to set the boundary conditions. It would mean that
spacetime would be completely self contained and would be determined
completely by the laws of physics : there would not be any points where the
laws broke down and there would not be any edge of spacetime at which
unpredictable influences could enter the universe. This choice of boundary
conditions for the class C can be paraphrased as : 'The boundary condition
of the universe is that it has no boundary' (Hawking, 1982; Hartle and
Hawking, 1983 ; Hawking, 1984b).
This choice of the quantum state of the universe is very analogous to the
vacuum state in string theory which is defined by all maps of closed string
world sheets without boundary into Euclidean flat space. More generally,
one can define a 'ground' state of no string excitations about any set of
background fields that satisfy certain conditions by all maps of closed string
world sheets into the background. Thus one can regard the 'no boundary'
quantum state for the universe as a 'ground' state (Hartle and Hawking,
1983). It is, however, different from other ground states. In other quantum
theories non-trivial field configurations have positive energy. They therefore
cannot appear in the zero energy ground state except as quantum
fluctuations. In the case of gravity it is also true that any asymptotically flat
metric has positive energy, except flat space, which has zero energy.
However, in a closed, non-asymptotically flat universe there is no infinity at
which to define the energy of the field configuration. In a sense the total
energy of a closed universe is zero : the positive energy of the matter fields
and gravitational waves is exactly balanced by the negative potential energy
which arises because gravity is attractive. It is this negative potential energy
that allows non-trivial gravitational fields to appear in the 'ground' state of
the universe.
Unfortunately, this negative energy also causes the Euclidean action $\tilde I$ for
general relativity to be unbounded below (Gibbons et al., 1978), thus
causing $\exp(-\tilde I)$ not to be a good probability measure on the space $C$ of field
configurations. In certain cases it may be possible to deal with this difficulty
by rotating the contour of integration of the conformal factor in the path
integral from real values to be parallel to the imaginary axis. However, there
does not seem to be a general prescription that will guarantee that the path
integral converges. This difficulty might be overcome in string theory where
the string action is positive in Euclidean backgrounds. It may be, however,
that the difficulty in making the path integral converge is fundamental to the
fact that the 'ground' state of the universe seems to be highly non-trivial. In
any event it would seem reasonable to expect that the main contribution to
the path integral would come from fields that are near stationary points of
the action $\tilde I$, that is, near solutions of the field equations.
ask questions only about the $(d-1)$-metric $h_{ij}$ induced on $S$ by the $d$-metric $g_{\mu\nu}$
on $M$ because the components $n^\mu g_{\mu\nu}$ of $g_{\mu\nu}$ that lie out of $S$ can be given any
values by a diffeomorphism of $M$ that leaves $S$ fixed. Thus the probability
that the surface $S$ has the induced metric $h_{ij}$ and matter fields $\phi_0$ is
$$P(h_{ij}, \phi_0) = \int_C d[g_{\mu\nu}]\,d[\phi]\,\Pi(h_{ij}, \phi_0)\exp(-\tilde I),$$
where $\Pi(h_{ij}, \phi_0)$ is the projection operator which has value 1 if the induced
metric and matter fields on $S$ have the given values and is zero otherwise.
One can cut the manifold $M$ at the surface $S$ to obtain a new manifold $\tilde M$
bounded by two copies $S$ and $S'$ of $S$. One can then define $\rho(h_{ij}, \phi_0; h'_{ij}, \phi'_0)$ to
be the path integral over all metrics and matter fields on $\tilde M$ which agree with
the given values $h_{ij}$, $\phi_0$ on $S$ and $h'_{ij}$, $\phi'_0$ on $S'$. The quantity $\rho$ can be regarded
as a density matrix describing the quantum state of the universe as seen from
a single spacelike surface for the following reasons:
(i) The diagonal elements of $\rho$, that is, when $h_{ij} = h'_{ij}$ and $\phi_0 = \phi'_0$, give the
probability of finding a surface $S$ with the metric $h_{ij}$ and matter
fields $\phi_0$.
(ii) If $S$ divides $M$ into two parts, the manifold $\tilde M$ will consist of two
disconnected parts, $\tilde M_+$ and $\tilde M_-$. The path integral for $\rho$ will factorise:
$$\rho(h_{ij}, \phi_0; h'_{ij}, \phi'_0) = \Psi_+(h_{ij}, \phi_0)\,\Psi_-(h'_{ij}, \phi'_0),$$
where the wave functions $\Psi_+$ and $\Psi_-$ are given by the path integral
over all metrics and matter fields on $\tilde M_+$ and $\tilde M_-$ respectively which
have the given values on $S$ and $S'$. If the matter fields $\phi$ are CP invariant,
$\Psi_+ = \Psi_-$ and both are real (Hawking, 1985). $\Psi$ is known as 'The Wave
One can think of the density matrices which do not factorise in the
following way: Imagine a set of surfaces $\Sigma_i$ which, together with S, divide the
spacetime manifold M into two parts. One can take the disjoint union of the
$\Sigma_i$ and S as the surface which is used to define $\rho$ (there is no reason why this
surface has to be connected). In this case the cut manifold $\tilde{M}$ will be
disconnected and the path integral for $\rho$ will factorise into the product of two
wave functions which will depend on the metrics and matter fields on two
sets of surfaces, S, $\Sigma_i$ and S', $\Sigma'_i$. The quantity $\rho$ will therefore be the density
matrix for a pure quantum state. However, an observer will be able to
measure the metric and matter fields only on one connected component of
the surface (say, S) and will not know anything about their values on the
other components, $\Sigma_i$, or even if any other components are required to divide
the spacetime manifold into two parts. The observer will therefore have to
sum over all possible metrics and matter fields on the surfaces $\Sigma_i$. This
summation or trace over the fields on the $\Sigma_i$ will reduce $\rho$ to a density matrix
corresponding to a mixed state in the fields on the remaining surfaces S and
S' . It is like when you have a system consisting of two parts A and B.
Suppose the system is in a pure quantum state but that you can observe only
part A. Then, as you have no knowledge about B, you have to sum over all
possibilities for B, with equal weight. This reduces the density matrix for the
system from a pure state to a mixed state.
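The A-and-B analogy above can be sketched numerically. The following is a minimal illustration, assuming two hypothetical two-state systems; the function names `reduced_density_matrix` and `purity` and the sample states are illustrative choices, not anything defined in the text. Summing over all possibilities for B is implemented as a partial trace, and the purity Tr(rho^2) is used to tell a pure state (value 1) from a mixed one (value below 1).

```python
import math

def reduced_density_matrix(psi, dim_a, dim_b):
    """Trace out part B of a pure state psi of the composite system A+B.

    psi is a flat list of amplitudes indexed as psi[a * dim_b + b].
    """
    rho = [[0j] * dim_a for _ in range(dim_a)]
    for a in range(dim_a):
        for a2 in range(dim_a):
            # rho_A[a][a2] = sum over b of c[a, b] * conj(c[a2, b])
            rho[a][a2] = sum(
                psi[a * dim_b + b] * complex(psi[a2 * dim_b + b]).conjugate()
                for b in range(dim_b)
            )
    return rho

def purity(rho):
    """Tr(rho^2): equal to 1 for a pure state, less than 1 for a mixed state."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real

# Hypothetical example: the entangled pure state (|00> + |11>)/sqrt(2).
psi_entangled = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
rho_entangled = reduced_density_matrix(psi_entangled, 2, 2)

# For comparison, the product state |0>|0>: observing only A loses nothing.
psi_product = [1.0, 0.0, 0.0, 0.0]
rho_product = reduced_density_matrix(psi_product, 2, 2)

print(purity(rho_entangled), purity(rho_product))
```

In this toy case the entangled state gives a reduced matrix that is mixed, while the product state stays pure, which is the distinction the paragraph draws between knowing and not knowing the fields on the unobserved components.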
The summation over all fields on the surfaces $\Sigma_i$ is equivalent to joining
the surfaces $\Sigma_i$ to $\Sigma'_i$ and doing the path integral over all metrics and matter
fields on a manifold $\tilde{M}$ whose only boundaries are the surfaces S and S'.
There is an overcounting because, as well as summing over all metrics and
matter fields, one is summing over all positions of the surfaces $\Sigma_i$ in M.
However, the path integral over these extra degrees of freedom can be
factored out by introducing ghosts. The reduced path integral is then the
same as that for the density matrix $\rho$ for a single pair of surfaces S and S'.
Thus one can see that the reason that the density matrix for S corresponds to
a mixed state is that one is observing the state of the universe on a single
spacelike surface and ignoring the possibility that spacetime may be not
simply connected and so require other surfaces $\Sigma_i$ as well as S to divide M
into two parts.
where S is the surface t = 0. The Euclidean action can then be written in the
Hamiltonian form :
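\[ I = -\int dt\, d^{d-1}x \left( \pi^{ij} \dot{h}_{ij} + \pi_{\phi} \dot{\phi} - N H^{0} - N_i H^{i} \right), \]
where $\pi^{ij} = -(h^{1/2}/16\pi)(K^{ij} - h^{ij}K)$ is the Euclidean momentum conjugate
to $h_{ij}$, $K_{ij}$ is the second fundamental form of S,
\[ H^{0} = 16\pi\, G_{ijkl}\, \pi^{ij} \pi^{kl} - \frac{1}{16\pi}\, h^{1/2}\; {}^{(d-1)}R + T^{00}, \]
\[ H^{i} = -2\, \pi^{ij}{}_{;j} + T^{0i}, \]
\[ G_{ijkl} = \tfrac{1}{2}\, h^{-1/2} \left( h_{ik} h_{jl} + h_{il} h_{jk} - h_{ij} h_{kl} \right). \]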
As was stated above, the components of $g_{\mu\nu}$ that lie out of the surface S can
be given any values by a diffeomorphism of M that leaves S fixed. This
means that the variational derivative of the path integral for $\rho$ with respect
to N and $N_i$ on S must be zero:
\[ \frac{\delta\rho}{\delta N_i} = -\int d[g_{\mu\nu}]\, d[\phi_0]\, \frac{\delta I}{\delta N_i}\, \exp(-I) = -H^{i}\rho = 0, \]
\[ \frac{\delta\rho}{\delta N} = -\int d[g_{\mu\nu}]\, d[\phi_0]\, \frac{\delta I}{\delta N}\, \exp(-I) = -H\rho = 0, \]
where the operators $H$ and $H^{i}$ are obtained from the corresponding classical
expressions by replacing the Euclidean momentum $\pi^{ij}$ by $-\delta/\delta h_{ij}$ and $\pi_{\phi}$ by
$-\delta/\delta\phi$.
The first equation is called the momentum constraint. It is a first-order
equation for $\rho$ on superspace, the space W of all metrics $h_{ij}$ and matter fields
$\phi$ on a surface S. It implies that $\rho$ is the same for metrics and matter fields
which can be obtained from each other by coordinate transformations in S.
The second equation is called the Wheeler-DeWitt equation. It holds at
each point of superspace, except where $h_{ij} = h'_{ij}$ and $\phi_0 = \phi'_0$. When this is
true, the separation between S and S' in the metric $g_{\mu\nu}$ on the manifold M
may be zero. In this case, it is no longer true that the variation of $\rho$ with
respect to N is zero. There is an infinite-dimensional delta function on the
right-hand side of the Wheeler-DeWitt equation. Thus, the Wheeler-DeWitt
equation is like the equation for the propagator, $G(x, x') = \langle \phi(x)\phi(x') \rangle$:
\[ (-\Box + m^2)\, G(x, x') = \delta(x, x'). \]
As the point x tends towards x', the propagator diverges like $r^{2-d}$, where r
is the distance between x and x'. Thus G(x, x) will be infinite. Similarly,
$\rho(h_{ij}, \phi_0; h_{ij}, \phi_0)$, the diagonal elements of the density matrix, will be infinite.
This infinity arises from Euclidean geometries of the form $S \times S^1$, where the
$S^1$ is of very short radius. However, we are interested really only in the
probabilities for Lorentzian geometries, because we live in a Lorentzian
universe, not a Euclidean one. One can recognise the part of the density
matrix $\rho$ that corresponds to Lorentzian geometries by the fact that it will
oscillate rapidly as a function of the scale factor of the metrics $h_{ij}$ and $h'_{ij}$
(Hawking, 1984b). One therefore wants to subtract out the infinite,
Euclidean, component and leave a finite, Lorentzian, component. One way
of doing this is to consider only spacetime manifolds M which the surface S
divides into two parts. The density matrix from such geometries will be of
the factorised form:
\[ \rho(h_{ij}, \phi_0; h'_{ij}, \phi'_0) = \Psi(h_{ij}, \phi_0)\, \Psi(h'_{ij}, \phi'_0), \]
where the wave function $\Psi$ obeys the Wheeler-DeWitt equation with no
delta function on the right-hand side. This part of the density matrix will
therefore remain finite when $h_{ij} = h'_{ij}$ and $\phi_0 = \phi'_0$. In a supersymmetric
theory, such as supergravity or superstrings, the infinity at the diagonal in
the density matrix would probably be cancelled by the fermions.
14.5 Minisuperspace
The Wheeler-DeWitt equation can be regarded as a second-order
differential equation for $\rho$ or $\Psi$ on superspace, the infinite-dimensional
space of all metrics and matter fields on S. It is hard to solve such an
equation . Instead , progress has been made by using finite dimensional
approximations to superspace, called minisuperspaces, first introduced by
Misner ( 1970). In other words, one reduces the infinite number of degrees of
freedom of the gravitational and matter fields and of the gauge to a finite
number and solves the Wheeler-DeWitt equation on a finite-dimensional
space.
14.5.1 de Sitter model
The simplest example is a homogeneous isotropic four-dimensional universe
with a cosmological constant and metric
\[ ds^2 = \sigma^2 \left[ N^2\, dt^2 + a^2\, d\Omega_3^2 \right]. \]
The action is
\[ I = -\frac{1}{2} \int dt\, N a \left[ \frac{1}{N^2} \left( \frac{da}{dt} \right)^2 + 1 - \lambda a^2 \right], \]
where $\sigma^2 = 2/(3\pi m_p^2)$, a is the radius of the 3-sphere space-like surfaces and
$\lambda = \tfrac{1}{3}\sigma^2 \Lambda$. One can choose N = a. The first two terms in the Euclidean action
are negative definite. This means that the path integral over a does not
converge. However, one can make the path integral converge by taking a to
be imaginary. This corresponds to integrating the conformal factor over a
contour parallel to the imaginary axis (Gibbons et al . , 1978).
With a imaginary, the action is the same as that of the anharmonic
oscillator. The density matrix p(a, a') is given by a path integral over all
values of a on a manifold M bounded by surfaces with radii a and a'. There
are two kinds of such manifold : ones that have two disconnected
components, which correspond to spacetimes that are divided in two by S,
and connected ones, which correspond to non-simply connected spacetimes
that S does not divide.
Consider first the case in which S divides M in two. The density matrix
from these geometries that S divides into two is the product of wave
functions:
\[ \rho(a, a') = \Psi(a)\, \Psi(a'), \]
where the wave function $\Psi$ is given by a path integral over compact
4-geometries bounded by a 3-sphere of radius a or a'. One would expect this
path integral to be approximately $A \exp(-B)$, where B is the action of a
solution of the classical Euclidean field equations with the given boundary
conditions and the prefactor A is given by a path integral over small
fluctuations about the solution of the classical field equations. The compact
homogeneous isotropic solution of the Euclidean field equations is a
4-sphere of radius $\lambda^{-1/2}$. A 3-sphere of radius $a < \lambda^{-1/2}$ can fit into such a
4-sphere in two positions: it can bound more or less than half the 4-sphere.
The action B of both these solutions of the classical equations is negative,
with the action of more than half the 4-sphere being the more negative. One
might therefore expect that this solution would provide the dominant
contribution to the path integral . However, if one takes the scale factor a to
be imaginary, in order to make the path integral converge, and then
analytically continues back to real a, one finds that the dominant
contribution comes from the solution that corresponds to less than half the
4-sphere, rather than the other solution which corresponds to more than
half the 4-sphere, as one might have expected . This conclusion also follows
from an analysis of the path integral in the K representation (Hartle and
Hawking, 1983).
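The claim that both 4-sphere solutions have negative action, with more than half the 4-sphere being the more negative, can be checked with a short numerical sketch. It assumes the gauge N = 1 (rather than N = a), in which a(tau) = sin(sqrt(lambda) tau)/sqrt(lambda) solves the Euclidean equations for the action quoted above, and compares a midpoint-rule integral with the closed form -[1 -/+ (1 - lambda a^2)^(3/2)]/(3 lambda) obtained by doing the integral by hand. The function names and the sample values lambda = 1, a = 0.5 are illustrative assumptions.

```python
import math

def euclidean_action(a, lam, more_than_half=False, steps=20000):
    """Midpoint-rule value of I = -1/2 * int dtau a (adot^2 + 1 - lam a^2), N = 1,
    along the 4-sphere solution a(tau) = sin(sqrt(lam) tau) / sqrt(lam)."""
    root = math.sqrt(lam)
    tau_end = math.asin(root * a) / root        # 3-sphere bounds less than half
    if more_than_half:
        tau_end = math.pi / root - tau_end      # 3-sphere bounds more than half
    h = tau_end / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        r = math.sin(root * tau) / root         # scale factor
        rdot = math.cos(root * tau)             # its tau-derivative
        total += r * (rdot ** 2 + 1.0 - lam * r ** 2) * h
    return -0.5 * total

def closed_form(a, lam, more_than_half=False):
    """Hand-integrated action: -(1 -/+ (1 - lam a^2)^(3/2)) / (3 lam)."""
    sign = 1.0 if more_than_half else -1.0
    return -(1.0 + sign * (1.0 - lam * a * a) ** 1.5) / (3.0 * lam)

B_less = euclidean_action(0.5, 1.0)
B_more = euclidean_action(0.5, 1.0, more_than_half=True)
print(B_less, closed_form(0.5, 1.0))
print(B_more, closed_form(0.5, 1.0, more_than_half=True))
```

The sketch only compares the two classical actions; it does not address which saddle point dominates after the contour rotation to imaginary a, which is the subtler point made in the text.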
In terms of the gauge choice N = a, used above, the path integral is over a
with a the given value at t = 0 and a = 0 at $t = +\infty$. This path integral is the
same as that for the propagator for the anharmonic oscillator from $ia$ at t = 0
to 0 at $t = \infty$. But this gives the ground state wave function. Thus
less than half the 4-sphere, as above. However, for $a > \lambda^{-1/2}$, there is no
Euclidean solution of the classical field equations for a compact
homogeneous isotropic 4-space bounded by a 3-sphere of radius a. Instead
there are complex metrics which are solutions of the field equations with the
required properties. Near the 3-sphere of radius a, one can take a section
through the complexified spacetime manifold on which the metric is real and
Lorentzian. This is reflected in the fact that $A_0(ia)$ will oscillate for $a > \lambda^{-1/2}$:
exponential wave functions correspond to Euclidean 4-geometries and
divide into two parts is given by a path integral with a fixed at the given
values at t = 0 and $t = t_1$ for some Euclidean time interval $t_1$. But this is equal
to the real part of the propagator $K(ia, 0; ia', t_1)$ for the anharmonic
oscillator from $ia$ at t = 0 to $ia'$ at $t = t_1$.
where $A_n(x)$ are the wave functions of the excited states of the anharmonic
oscillator and $E_n$ are the energy levels. To obtain the density matrix one has
to integrate over all values of $t_1$ because the two surfaces can have any time
separation:
\[ \rho(a, a') = \mathrm{Re} \int_0^{\infty} K(ia, 0; ia', t_1)\, dt_1 = \mathrm{Re} \sum_n \frac{A_n(ia)\, A_n(ia')}{E_n}. \]
One can interpret this as saying that the universe is in the state specified by
the wave function $\mathrm{Re}(A_n(ia))$ with the relative probability $(E_n)^{-1}$. Note that
the universe need not be 'on shell' in the sense that the Wheeler-DeWitt
operator acting on $A_n$ is not 0, but $E_n$. This term in the Wheeler-DeWitt
equation acts as if the universe contained a certain amount of negative
energy radiation. It will cause the classical solution corresponding to $A_n$ by
the WKB approximation to bounce at a larger radius than $\lambda^{-1/2}$. Thus, the
effect of the universe being in a mixed quantum state might be observable.
However, at large values of a, the effect of the negative energy radiation
would be very small and the universe would expand exponentially, like the
de Sitter solution.
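The relative probability $(E_n)^{-1}$ quoted above comes from the integral over the time separation. As a hedged illustration, assume the usual spectral form of a Euclidean propagator, in which each state contributes a factor exp(-E_n t_1); integrating that factor over all t_1 from 0 to infinity gives 1/E_n. The sketch below checks this numerically for a few toy energy levels. The levels, the cutoff `t_max` and the function name are hypothetical and are not the anharmonic-oscillator spectrum of the model.

```python
import math

def time_integrated_weight(energy, t_max=200.0, steps=200000):
    """Midpoint-rule estimate of int_0^t_max exp(-energy * t) dt.

    For energy * t_max >> 1 this approximates the full integral, 1 / energy.
    """
    h = t_max / steps
    return sum(math.exp(-energy * (k + 0.5) * h) for k in range(steps)) * h

# Hypothetical positive energy levels, chosen only for illustration.
toy_levels = [0.5, 1.5, 2.5]
weights = [time_integrated_weight(e) for e in toy_levels]

for e, w in zip(toy_levels, weights):
    print(e, w, 1.0 / e)
```

Lower-lying states therefore receive larger relative weight, which is all the sketch is meant to show.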
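scalar field $\phi$ that is constant on the surfaces of homogeneity is
\[ I = -\frac{1}{2} \int dt\, N a \left[ \frac{1}{N^2} \left( \left( \frac{da}{dt} \right)^2 - a^2 \left( \frac{d\phi}{dt} \right)^2 \right) + 1 - a^2 m^2 \phi^2 \right]. \]
wave function will obey the Wheeler-DeWitt equation
\[ \frac{1}{2} \left[ \frac{1}{a^p} \frac{\partial}{\partial a} \left( a^p \frac{\partial}{\partial a} \right) - \frac{1}{a^2} \frac{\partial^2}{\partial \phi^2} - a^2 + a^4 m^2 \phi^2 \right] \Psi(a, \phi) = 0, \]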
which the metric is nearly real and Lorentzian. This solution will have a
minimum radius of order $1/(m\phi)$ and will expand exponentially with $\phi$ slowly
decreasing . It will be a quantum realisation of the 'chaotic inflation' model
proposed by Linde ( 1983).
After an exponential expansion of the universe by a factor of order
$\exp(\tfrac{3}{2}\phi^2)$, the scalar field will start to oscillate with frequency m. The
energy-momentum tensor of the scalar field will change from that of an effective
cosmological constant to that of pressure-free matter. The universe will
change from an exponential expansion to a matter-dominated one. In a
model with other matter fields, one would expect the energy in the massive
scalar field oscillations to be converted into zero rest mass particles. The
universe would then expand as a radiation-dominated model .
The universe would expand to a maximum radius and then recollapse.
One would expect that if such complex, almost Lorentzian , geometries
contributed to the wave function in their expanding phase, they would also
contribute in their contracting phase. However, although a few solutions
will bounce at small radius and expand again (Hawking, 1984b; Page,
1985a, b), most solutions will collapse to a singularity. They will give an
oscillating contribution to the wave function, even in the region $a < 1/(m\phi)$ of
superspace where the dominant contribution is exponential. It will also
mean that the boundary condition for the Wheeler-DeWitt equation on the
light cone $x = \pm y$ is not exactly $\Psi = 1$, as was assumed in some earlier papers
(Hawking and Wu, 1985; Moss and Wright, 1983).
The density matrix from geometries that S does not divide into two parts
has not been calculated yet . By analogy with the de Sitter model , one might
expect that the part which corresponds to Lorentzian geometries would
behave like solutions with a massive scalar field and negative energy
radiation . One would not expect the negative energy to prevent collapse to a
singularity.
To summarise, in this model , the universe begins its expansion from a
non-singular state. It expands in an inflationary manner, goes over to a
matter or radiation-dominated expansion , reaches a maximum radius and
recollapses to a singularity. This will be discussed further in Sections 14.7
and 14.8.
However, ultimately one would like to know the density matrix or wave
function on the whole of superspace, not just a finite-dimensional subspace.
This is a bit of a tall order but one can use a 'midisuperspace' approximation
in which one takes the action to all orders in a finite number of degrees of
freedom and to second order in the remaining degrees of freedom.
A treatment of the massive scalar field model on these lines has been given
by Halliwell and Hawking ( 1985). The two degrees of freedom of the model
described above are treated exactly, and the rest as perturbations on the
background determined by the two-dimensional minisuperspace model . As
in the model above, the oscillating part of the background wave function
corresponds by the WKB approximation to a universe which starts at a
minimum radius, expands in an inflationary and then a matter-dominated
manner, reaches a maximum radius and recollapses to a singularity.
From the 'no boundary' condition the behaviour of the perturbations is
determined by a path integral of the perturbation modes over the compact
geometries represented by the background wave function. In the case of
Euclidean geometries that are part of a 4-sphere or of complex geometries
that are near such a Euclidean geometry, one can use an adiabatic
approximation to show that the perturbation modes are in their ground
state, with the minimum excitation compatible with the uncertainty
principle. This means that the Lorentzian geometries that correspond to the
oscillating part of the wave function start off at the minimum radius with all
the perturbation modes in the ground state. As the universe inflates, the
adiabatic approximation remains good and the perturbation modes remain
in their ground states until their wavelength becomes longer than the
horizon size or, in other words, their frequency is red shifted to less than the
expansion time scale. After this, the wave functions of the perturbation
modes freeze and do not relax adiabatically to remain in the ground state as
the frequency of the modes changes.
The perturbation modes remain frozen until the wavelength of the modes
becomes less than the horizon size again during the matter- or
radiation-dominated expansion. Because they have not been able to relax
adiabatically, they will then be in a highly excited state. After this, they will
evolve like classical perturbations of a Friedmann universe. They will have a
'scale free' spectrum, that is, their rms amplitude at the time the wavelength
equals the horizon size will be independent of the wavelength. The
amplitude will be roughly $10(m/m_p)$, where m is the mass of the scalar field.
Thus they would have the right amplitude of about $10^{-4}$ to account for
galaxy formation if m is about $10^{14}$ GeV.
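The arithmetic behind this estimate is short enough to spell out. The sketch below assumes a Planck mass of about 1.22e19 GeV, a value supplied here for illustration and not quoted in the text, and evaluates the rough amplitude 10 m/m_p for m = 1e14 GeV. The function name is hypothetical.

```python
# Approximate Planck mass in GeV (assumed value, not taken from the text).
PLANCK_MASS_GEV = 1.22e19

def perturbation_amplitude(m_gev, planck_mass_gev=PLANCK_MASS_GEV):
    """Rough rms amplitude at horizon crossing, 10 * (m / m_p)."""
    return 10.0 * m_gev / planck_mass_gev

amplitude = perturbation_amplitude(1e14)
mass_ratio = 1e14 / PLANCK_MASS_GEV

print(amplitude)   # of order 1e-4
print(mass_ratio)  # of order 1e-5
```

With these inputs the amplitude comes out within a factor of order unity of $10^{-4}$, and the mass ratio is of order $10^{-5}$.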
In order to generate sufficient inflation, the initial value of the scalar field
$\phi$ has to be greater than about 8. However, with $m = 10^{-5} m_p$, the energy
density of the scalar field will still be a lot less than the Planck density. Thus
it may be reasonable in quantum cosmology to ignore higher-order terms
and extra dimensions.
In the recollapse phase the perturbations will continue to grow classically.
They will not return to their ground state when the universe becomes small
again, as I suggested (Hawking, 1985). The reason is that when they start
expanding, the background compact geometry bounded by the surface S is
near to the Euclidean geometry of half a 4-sphere. On such a background the
adiabatic approximation will hold for the perturbation modes, so they will
be in their ground state. However, when the universe recollapses, the
background geometry will be near a Lorentzian solution which expands and
recontracts. The adiabatic approximation will not hold on such a
background . Thus the perturbation modes will not be in their ground state
when the universe recollapses, but will be highly excited .
would seem to imply that there was some concept of time in which the
universe did not exist before a certain instant and then came into being. But
time is defined only within the universe, and does not exist outside it, as was
pointed out by Saint Augustine (400) : 'What did God do before He made
Heaven and Earth? I do not answer as one did merrily : He was preparing
Hell for those that ask such questions. For at no time had God not made
anything because time itself was made by God .'
The modern view is very similar. In general relativity, time is just a
coordinate that labels events in the universe. It does not have any meaning
outside the spacetime manifold . To ask what happened before the universe
began is like asking for a point on the Earth at 91° north latitude; it just is
not defined . Instead of talking about the universe being created, and maybe
coming to an end , one should just say : The universe is.
References
Gibbons, G. W., Hawking, S. W. and Perry, M. J. (1978). Nucl. Phys., B138, 141.
Goroff, M. H. and Sagnotti, A. (1985). Phys. Lett., 160B, 81.
Halliwell, J. J. and Hawking, S. W. (1985). Phys. Rev., D31, 1777.
Hartle, J. B. and Hawking, S. W. (1983). Phys. Rev., D28, 2960.
Hawking, S. W. (1982). In Astrophysical Cosmology. Proceedings of the Study Week on
Cosmology and Fundamental Physics, ed. H. A. Bruck, G. V. Coyne and M. S.
Longair. Pontificia Academiae Scientarium: Vatican City.
Hawking, S. W. and Penrose, R. (1970). Proc. Roy. Soc. Lon., A314, 529.
Hawking, S. W. and Ellis, G. F. R. (1973). The Large Scale Structure of Space-Time.
Cambridge University Press: Cambridge.
Hawking, S. W. (1984a). Phys. Lett., 150B, 339.
Hawking, S. W. (1984b). Nucl. Phys., B239, 257.
Hawking, S. W. and Wu, Z. C. (1985). Phys. Lett., 151B, 15.
Hawking, S. W. (1985). Phys. Rev., D32, 2489.
Hawking, S. W. and Page, D. N. (1986). Nucl. Phys., B264, 185.
Hawking, S. W. (1987). Physica Scripta (in press).
Laflamme, R. (1987). The wave function of a S1 x S2 universe. Preprint, to be published.
Lifshitz, E. M. and Khalatnikov, I. M. (1963). Adv. Phys., 12, 185.
Linde, A. D. (1983). Phys. Lett., 129B, 177.
Linde, A. D. (1985). Phys. Lett., 162B, 281.
Misner, C. W. (1970). In Magic without Magic, ed. J. R. Klauder. Freeman: San
Francisco.
Moss, I. and Wright, W. (1983). Phys. Rev., D29, 1067.
Page, D. N. (1985a). Class. & Q.G., 1, 417.
15.1 Introduction
Superstring theory is an approach to the unification of fundamental
particles and forces that has attracted a great deal of attention in the last two
years [1]. It offers the possibility of overcoming many of the shortcomings of
the standard model and providing a unified theory free from much of the
arbitrariness that is inherent in conventional point-particle theories.
The standard model of electroweak and strong forces has enjoyed an
enormous amount of success. Indeed, it appears to be consistent with all
established particle physics experiments. This being the case, the first
question one should ask is why one should even be looking for something
better. Most criticisms of the standard model are based on the fact that it
requires a number of arbitrary choices and fine-tuning adjustments of
parameters. These features do not prove that it is wrong or even incomplete.
However, given the history of successes in elementary particle physics, it is
natural to seek a deeper underlying theory that can account for many of the
arbitrary choices and parameters. These include the choice of gauge groups
and representations, the number of families of quarks and leptons, the
origins of the Higgs symmetry-breaking mechanism, and the specific values
of various parameters.
The aesthetic shortcomings of the standard model or a grand unified
model do not prove that it is wrong or incomplete. The thing that does this
most convincingly is the absence of gravity. There is a straightforward way
to couple Einsteinian gravity to any relativistic field theory by following the
† Work supported in part by the US Department of Energy under contract DEAC 03-81-ER40050.
‡ Based on a lecture presented at the Twenty-third International Conference on High
Energy Physics in Berkeley, California, July 1986.
also necessary that loop amplitudes be free from anomalies and possess
modular invariance : properties to which we shall return . Additional criteria,
perhaps of non-perturbative origin , may eventually be found to be necessary
also.
Six string theories are known that satisfy the consistency properties
mentioned above. Each of them requires that spacetime has ten dimensions
(nine space and one time) and that the two-dimensional world-sheet action
has superconformal symmetry. These theories are listed in Table 15.1. There
is some evidence that the three heterotic theories are actually different
phases of the same theory, which would reduce the number of different
theories to be considered .
Superstring theories possess not only two-dimensional superconformal
symmetry but local ten-dimensional super-Poincare symmetry, a possibility
first pointed out by Gliozzi , Scherk and Olive [6] . There are three
possibilities for D = 10 supersymmetry, each of which can be realized in a
superstring theory. The minimal irreducible spinor in D = 10 satisfies
simultaneous Majorana and Weyl properties and has 16 independent real
components. A theory with a single conserved Majorana-Weyl supercharge
has N = 1 supersymmetry, a possibility realized by three of the six theories
listed in the table. There are two distinct possibilities for theories with two
conserved Majorana-Weyl supercharges. Either they have opposite
chirality (type IIA) or they have the same chirality (type IIB). This is the
maximum amount of supersymmetry that is possible in an interacting
theory (corresponding to N = 8 supersymmetry in four dimensions). The
type II theories are not promising for phenomenology because they do not
accommodate elementary Yang-Mills fields. However, the type IIB theory
is in some respects the most beautiful of all the theories, and it is not totally
out of the question that it could form the basis of a realistic phenomenology.
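The component counting in this paragraph can be made explicit. The sketch below assumes the standard rules that a Dirac spinor in even dimension D has 2^(D/2) complex components, and that the Majorana and Weyl conditions each halve the number of real components; it should only be applied in dimensions where those conditions can actually be imposed (both together in D = 10, Majorana alone in D = 4). The function names are illustrative.

```python
def dirac_complex_components(dim):
    """Complex components of a Dirac spinor in even spacetime dimension dim."""
    return 2 ** (dim // 2)

def majorana_real_components(dim):
    """Real components after a Majorana (reality) condition.

    2 * 2^(dim/2) real components of the Dirac spinor, halved once.
    """
    return dirac_complex_components(dim)

def majorana_weyl_real_components(dim):
    """Real components after both Majorana and Weyl conditions (halved twice)."""
    return dirac_complex_components(dim) // 2

# One Majorana-Weyl supercharge in D = 10.
mw_ten = majorana_weyl_real_components(10)

# Two such supercharges (type II), counted in units of a D = 4 Majorana spinor.
type_ii_real = 2 * mw_ten
four_dim_n = type_ii_real // majorana_real_components(4)

print(mw_ten, type_ii_real, four_dim_n)
```

The count reproduces the 16 real components of the minimal D = 10 spinor and the correspondence between two ten-dimensional supercharges and N = 8 supersymmetry in four dimensions.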
Table 15.1
Theory      Ref.   Supersymmetry   Gauge group
Type I      [11]   N = 1           SO(32)
Type IIA    [12]   N = 2
Type IIB    [12]   N = 2
Heterotic   [13]   N = 1           SO(32)
Heterotic   [13]   N = 1           E8 x E8
Heterotic   [14]   N = 0           SO(16) x SO(16)
one string breaking into two , or two strings joining to give a single one. The
surface shown is topologically like a pair of pants. Since the surface is
smooth the particular spacetime point at which the interaction takes place is
a frame-dependent question . If the time slices are taken using planes that are
tilted relative to the ones shown , a different interaction point would be
identified . Thus there is no preferred point on the surface and the existence of
interaction is an inevitable feature of geometric origin, not something that
needs to be appended to the theory in an ad hoc way.
In calculating S-matrix elements one should consider amplitudes for
scattering closed strings that propagate to time $\pm\infty$, i.e., with tubes
emerging from the surface that extend to infinity. However, the conformal
symmetry of the underlying two-dimensional field theory [ 16, 17] , implies
that world sheets that differ by a conformal transformation are equivalent .
In particular, one may map the asymptotic strings to finite coordinates.
When this is done, they are represented by points on the world sheet .
Altogether, the amplitude is then represented by a closed oriented surface
with 'punctures' representing the initial and final string states.
The topological classification of closed oriented two-dimensional surfaces
is characterized by a single integer, the genus g, which can be thought of as
the number of handles that are attached to a sphere. Thus, as shown in Fig.
15.2, the sphere has g = 0, the torus has g = 1, and so forth.
Fig. 15.1. A piece of world sheet in the shape of pants. The slice at time $t_1$
shows two closed strings while the one at time $t_2$ shows one closed string.
[Fig. 15.2: closed oriented surfaces of genus g = 0 (sphere), g = 1 (torus), g = 2, etc.]
\[ \int_{M_g} d\mu \prod_{i<j} (F_{ij})^{k_i \cdot k_j}. \tag{1} \]
In this expression the indices i and j refer to the external particles and the $k_i$
are the momenta (assumed to be on-shell). The function $\log F_{ij}$ is
proportional to the Green's function on the world sheet between the
coordinates of particles i and j. Also, $d\mu$ represents a suitable integration
measure on the 'moduli space' $M_g$.
Not only the two-dimensional world sheet, but also the $2(3g - 3 + n)$-dimensional
moduli space has a complex structure. As a result it is possible
to express the measure $d\mu$ and functions $F_{ij}$ in terms of holomorphic
functions on $M_g$. The space $M_g$ is naturally expressed as a quotient of a space
$T_g$, known as Teichmüller space, divided by an infinite discrete group of
transformations. The expressions $d\mu$ and $F_{ij}$ can be expressed as products of
various terms that are well defined on $T_g$, but are multivalued ('line bundles')
on $M_g$. In order for the loop amplitude to be well defined it is necessary that
when the relevant products are formed, expressions that are single-valued
on $M_g$ ('modular invariant') result. The fact that this works for each of the
various string theories is highly non-trivial . In fact, it almost completely
determines the various functions!
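The dimension formula quoted above, 2(3g - 3 + n) real dimensions for a genus-g surface with n punctures, is easy to tabulate. The sketch below is a minimal helper with hypothetical function names; it assumes the standard stability condition 2g - 2 + n > 0, outside of which the formula does not apply, and also records the Euler characteristic 2 - 2g of the closed surface.

```python
def euler_characteristic(genus):
    """Euler characteristic of a closed oriented surface with the given genus."""
    return 2 - 2 * genus

def moduli_space_real_dimension(genus, punctures):
    """Real dimension 2(3g - 3 + n) of the moduli space of punctured surfaces.

    Only meaningful when 2g - 2 + n > 0 (assumed stability condition).
    """
    if 2 * genus - 2 + punctures <= 0:
        raise ValueError("formula requires 2g - 2 + n > 0")
    return 2 * (3 * genus - 3 + punctures)

# Sphere with four punctures: one complex modulus (the cross ratio).
dim_sphere_4 = moduli_space_real_dimension(0, 4)
# Torus with one puncture: one complex modulus.
dim_torus_1 = moduli_space_real_dimension(1, 1)
# Genus two with no punctures.
dim_genus_2 = moduli_space_real_dimension(2, 0)

print(dim_sphere_4, dim_torus_1, dim_genus_2)
print(euler_characteristic(0), euler_characteristic(1), euler_characteristic(2))
```

The real dimension is always even, consistent with the statement that moduli space carries a complex structure.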
Moduli space $M_g$ has a very rich and subtle topology. In fact, it even has
boundary components that correspond to singular limits of the geometry in
which the world sheet 'degenerates'. The relevant issue for finiteness is
whether or not the measure dµ diverges on any of these boundary
components so fast as to give a singular integral . There are two distinct ways
in which the surface can degenerate. One, depicted in Fig. 15.3(a), involves
the formation of a long thin tube that separates the diagram into two pieces.
The second, shown in Fig. 15.3(b), involves the formation of a long thin tube
that does not separate the surface.
choice of background fields. In short, one would like to know the string
analog of the Hilbert-Einstein action of general relativity. This has not yet
been achieved, but an enormous effort by many workers over the past year
has brought us much closer to this goal . In particular, a beautiful field theory
for open strings, correct at least at tree level , has been achieved . It is almost
impossible to give adequate credit to all the workers who have made
important contributions. Some of them are Siegel and Zwiebach [21],
Banks and Peskin [22], Neveu and West [23], Hata, Itoh, Kugo,
Kunitomo and Ogawa [24]. I will give a brief description of the results in
the form developed by Witten [25].
One important remark is that in field theory an individual Feynman
diagram is given by an integral whose integration region is topologically
trivial: some sort of hypercube. Thus, since moduli space $M_g$ has a
complicated topology, it cannot arise from a single diagram. What happens
is that many different field theory diagrams give different topologically
trivial pieces of the integration region $M_g$, all with the same integrand, and
the sum gives the complete genus g amplitude. Thus the individual diagrams
provide a 'triangulation' of moduli space [26] .
Witten's description of the field theory of open bosonic strings has many
analogies with Yang-Mills theory. This is not really surprising in as much as
open strings are an infinite-component generalization of Yang-Mills fields.
(2a)
a, µ
a matrix of one form. This is a natural quantity from a geometric point of
view. The analogous object in open-string field theory is the string field
\[ A[x^{\mu}(\sigma), c(\sigma)]. \tag{2b} \]
This is a functional field that creates or destroys an entire string with
coordinates $x^{\mu}(\sigma)$, $c(\sigma)$, where the parameter $\sigma$ is taken to have the range
$0 \le \sigma \le \pi$. The coordinates $c(\sigma)$ are anticommuting ghost degrees of freedom
that arise in the first quantization of the action. They are essential so that,
when A is expanded in an infinite sequence of point fields, there is an
appropriate set of auxiliary Stueckelberg fields at each mass level. The details
of such an expansion are quite complicated, but the mathematics of the
complete string field is not so bad. The Yang-Mills field is one of the infinity
of terms in such an expansion .
Fig. 15.4. An open string has a left-hand ($\sigma < \pi/2$) and right-hand ($\sigma > \pi/2$)
segment, depicted in (a), which can be treated as matrix indices. The
multiplication A * B = C is depicted in (b).
The string field A can be regarded as a matrix (in analogy to $A_{ab}$) by
regarding the coordinates with $0 \le \sigma \le \pi/2$ as providing the left matrix index
and those with $\pi/2 \le \sigma \le \pi$ as providing the right matrix index, as shown in
Fig. 15.4(a). One could also associate quark-like charges with the ends of the
strings, which would then be included in the matrix labels as well. This is a
minor and inessential complication, which we shall suppress. By not
including such charges we are constructing the string generalization of U(1)
gauge theory. U(1) gauge theory (without matter fields) is a free theory, but
the string extension has non-trivial interactions, as we shall see.
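A crude way to picture why the string extension of U(1) gauge theory can interact is to discretise the idea of left and right half-strings as matrix indices. The sketch below is only a toy model under strong assumptions: each half-string is allowed just two hypothetical configurations, so a string field becomes a 2 x 2 matrix, and the product A * B is taken to be ordinary matrix multiplication (the right half of A is glued to the left half of B and summed over). The ghost coordinates and the continuum limit are ignored entirely.

```python
def star(a, b):
    """Toy star product: glue the right index of a to the left index of b."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical 'string fields' on a two-configuration half-string.
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
C = [[1, 2], [3, 4]]

AB = star(A, B)
BA = star(B, A)

# The product is associative, like matrix multiplication ...
left = star(star(A, B), C)
right = star(A, star(B, C))

# ... but not commutative, even though no charges were attached to the ends.
print(AB, BA)
print(left == right)
```

In this picture an ordinary U(1) point field would be a single number, for which products commute; the matrix structure supplied by the half-strings is what leaves room for a non-vanishing commutator.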
In the case of Yang-Mills theory, we can multiply two fields by the rule
(3a)
\[ \int g^{\mu\rho} g^{\nu\lambda} F_{\mu\nu} F_{\rho\lambda} \tag{7} \]
identifies the left and right segments of the string field Lagrangian. Thus in
the case of string theory we define
\[ \int X = \int \prod_{\sigma < \pi/2} \delta\big( x(\sigma) - x(\pi - \sigma) \big) \cdots e^{-(3i/2)\phi(\pi/2)}\, X. \tag{8} \]
As indicated in Fig. 15.5(a), this identifies the left and right segments of $X$. A
ghost factor has been inserted at the midpoint. ($\phi$ is a bosonized form of the
ghost coordinates.) The $\cdots$ signifies that analogous integrations should also
be performed for the ghost coordinates. We now have the necessary
ingredients to write a string action. If we try to emulate the Yang-Mills
formula we run into a problem, namely no analog of the metric $g^{\mu\rho}$ has been
defined. Rather than trying to find one, it proves more fruitful to look for a
gauge-invariant action that does not require one. The simplest possibility is
given by the Chern-Simons form
$$ I \sim \int \bigl( A * QA + \tfrac{2}{3}\, A * A * A \bigr). \qquad (9) $$
Fig. 15.6. The three-string vertex with external string propagators attached
is given by a world sheet that is flat everywhere except at the interaction point
where a small circle of radius $r$ has circumference $3\pi r$.
15.5 Anomalies
The gauge invariance of classical gauge theories implies the existence of
conserved gauge currents, whose form can be deduced by the standard
Noether procedure. In Yang-Mills theories the associated conserved
charges are the symmetry generators. In general relativity the conserved
current is the energy-momentum tensor and the associated charges are the
energy and momentum. The classical conservation of these currents can,
under certain circumstances, be destroyed by quantum effects called
anomalies. When this happens unphysical modes of the gauge fields become
coupled in the S matrix, leading to a breakdown of unitarity and causality.
Thus, in general, all anomalies in gauge currents must cancel or else the
theory must be rejected as inconsistent.
The Feynman diagrams that give rise to anomalies typically arise at
one-loop order and involve a loop of chiral fermions. In the well-known case of a
gauge theory in four dimensions the simplest diagram that can give rise to
anomalies is a triangle diagram, as depicted in Fig. 15.7(a). At one vertex we
have attached the current whose conservation we wish to study. Gauge
fields are attached to the other two. The anomaly that occurs typically has
the form
$$ \partial_\mu J^\mu \sim \epsilon^{\mu\nu\rho\lambda} F_{\mu\nu} F_{\rho\lambda}. $$
Analogous phenomena can occur in higher dimensions, but then it is
necessary to consider diagrams with more external lines. The reason is
simply that the $\epsilon$ symbol has $D$ indices in $D$ dimensions. Thus in ten
dimensions, for example, the simplest anomaly has the structure
$$ \partial_\mu J^\mu \sim \epsilon^{\mu_1 \mu_2 \cdots \mu_{10}} F_{\mu_1 \mu_2} \cdots F_{\mu_9 \mu_{10}}. $$
The corresponding Feynman diagram must have at least six legs (hexagon
graph) as shown in Fig. 15.7(b).
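The leg count quoted above follows from simple index counting. The short worked step below spells it out; the symbol $n_{\rm legs}$ is introduced here purely for illustration and does not appear in the original text.

```latex
% Each field strength F_{\mu\nu} absorbs two of the D indices of \epsilon,
% so D/2 gauge fields are needed, plus one leg for the current J^\mu itself.
n_{\rm legs} = \frac{D}{2} + 1, \qquad
D = 4:\ n_{\rm legs} = 3 \ \text{(triangle)}, \qquad
D = 10:\ n_{\rm legs} = 6 \ \text{(hexagon)}.
```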
In the case of string theories the situation is similar, but one should
consider a Feynman diagram appropriate to string theory, i.e. a
two-dimensional surface. For example, in the case of a theory involving closed
oriented strings the only relevant diagram is a torus diagram with external
strings (one to represent the gauge current and five to represent gauge fields),
as shown in Fig. 15.8. In the case of type IIA superstrings all anomalies
trivially cancel because of the left-right symmetry of the theory. In the case
of the type IIB theory and the heterotic string theories the spectrum is not
There are many groups in addition to the two that have been mentioned
that have 496 generators. To obtain additional restrictions we consider
anomaly diagrams involving Yang-Mills fields as well as gravitons. The
cancellation of all such anomalies leads to the additional requirement that
an arbitrary generator F of the algebra, expressed in the adjoint
representation, should satisfy the equation [11]
$$ \mathrm{Tr}\, F^6 = \tfrac{1}{48}\, \mathrm{Tr}\, F^2\, \mathrm{Tr}\, F^4 - \tfrac{1}{14400}\, (\mathrm{Tr}\, F^2)^3. $$
Remarkably this equation, with exactly the right coefficients, is satisfied for
both SO(32) and $E_8 \times E_8$. The only other possible solutions are
$E_8 \times [U(1)]^{248}$ and $[U(1)]^{496}$, neither of which seems to correspond to a string
theory. In the anomaly analysis an antisymmetric tensor field $B_{\mu\nu}$ that is part
of the supergravity multiplet plays an important role. The details are given
in the references.
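The generator count and the trace condition can be checked numerically for SO(32). The sketch below is only an illustration under stated assumptions: it takes a random real antisymmetric matrix as a generic so(n) element, and uses the fact that the adjoint of so(n) is the antisymmetric rank-2 tensor representation, so the adjoint eigenvalues are pairwise sums of the vector-representation eigenvalues. The function names are hypothetical, and the $E_8 \times E_8$ case is not tested here beyond counting generators.

```python
import numpy as np

DIM_E8 = 248  # number of generators of E8


def dim_so(n):
    """Number of generators of SO(n)."""
    return n * (n - 1) // 2


def adjoint_traces(n, seed=0):
    """Return (Tr F^2, Tr F^4, Tr F^6) in the adjoint of so(n) for a random F."""
    rng = np.random.default_rng(seed)
    m = rng.standard_normal((n, n))
    f = m - m.T                      # generic real antisymmetric matrix (element of so(n))
    mu = np.linalg.eigvals(f)        # eigenvalues in the vector representation
    a, b = np.triu_indices(n, k=1)   # all index pairs with a < b
    adj = mu[a] + mu[b]              # eigenvalues on antisymmetric rank-2 tensors (adjoint)
    return tuple(np.sum(adj ** k).real for k in (2, 4, 6))


def relative_residual(n, seed=0):
    """(Tr F^6 - Tr F^2 Tr F^4 / 48 + (Tr F^2)^3 / 14400) / |Tr F^6| for so(n)."""
    t2, t4, t6 = adjoint_traces(n, seed)
    return (t6 - t2 * t4 / 48 + t2 ** 3 / 14400) / abs(t6)


if __name__ == "__main__":
    print("generators of SO(32):  ", dim_so(32))
    print("generators of E8 x E8: ", 2 * DIM_E8)
    print("residual for so(32):   ", relative_residual(32))
    print("residual for so(30):   ", relative_residual(30))
```

Under these assumptions the residual for n = 32 should vanish up to rounding error, while for a neighbouring value such as n = 30 it should not, which illustrates how sharply the condition singles out SO(32) among the orthogonal groups.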
The $SO(16) \times SO(16)$ heterotic string theory, listed in Table 15.1, has only
240 generators. This is possible because it is not supersymmetric. It is chiral,
however, and the cancellation of anomalies involves 'miracles' analogous to
those of the other theories [14].
The groups SO(32) and $E_8 \times E_8$, singled out by the anomaly cancellation
analysis in superstring theories, had previously arisen in another context
[32]. Specifically, mathematicians have investigated lattices, like those of
solid-state physics, in higher dimensions. They were led to consider in
particular lattices that are self-dual (i.e. are coincident with the dual lattice)
for which the distance squared of each lattice site from the origin is an even
integer. (A cubic lattice generated by orthogonal unit vectors is self-dual but
not even.) It turns out that even self-dual lattices only occur in dimensions
that are multiples of eight. In eight dimensions there is just one and it is
generated by the root vectors of the Lie algebra of $E_8$. In 16 dimensions there
are two of them. One is obvious from the eight-dimensional construction,
namely the root lattice generated by $E_8 \times E_8$. The second possibility is
closely associated with SO(32). It is generated by root vectors of the algebra
SO(32) and certain spinorial weights of its covering group spin(32). More
precisely, it is the weight lattice of the group spin(32)/$Z_2$. This coincidence
between self-dual lattices and the algebras singled out by the anomaly
analysis was skillfully exploited in the construction of the heterotic string
theories. The basic idea is that modes that travel clockwise (right-moving)
on the string are described by the mathematics of a ten-dimensional
superstring whereas those that travel counterclockwise (left-moving) are
described by the mathematics of a 26-dimensional bosonic string. Ten of the
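The lattice statements made earlier in this paragraph can be illustrated with Gram matrices. This is a minimal sketch, assuming the standard facts that an integral lattice is self-dual exactly when its Gram determinant is 1, and that it is even exactly when the Gram matrix has an even diagonal. The Dynkin-diagram labellings and the function names are choices made here for illustration only.

```python
import numpy as np


def cartan_matrix(rank, edges):
    """Gram matrix of simple roots: 2 on the diagonal, -1 for each Dynkin edge."""
    gram = 2 * np.eye(rank, dtype=int)
    for i, j in edges:  # nodes are labelled 1..rank
        gram[i - 1, j - 1] = gram[j - 1, i - 1] = -1
    return gram


def gram_determinant(gram):
    """Determinant of an integer Gram matrix, rounded back to an integer."""
    return int(round(np.linalg.det(gram)))


def is_self_dual(gram):
    return gram_determinant(gram) == 1


def is_even(gram):
    # With integer off-diagonal entries, every lattice vector has even
    # length squared exactly when all basis vectors do.
    return bool(np.all(np.diag(gram) % 2 == 0))


# E8: chain 1-3-4-5-6-7-8 with node 2 attached to node 4 (assumed labelling).
E8 = cartan_matrix(8, [(1, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (2, 4)])
# D16 (the roots of SO(32)): chain 1-2-...-15 with node 16 attached to node 14.
D16 = cartan_matrix(16, [(k, k + 1) for k in range(1, 15)] + [(14, 16)])
CUBIC = np.eye(8, dtype=int)  # lattice generated by orthogonal unit vectors

if __name__ == "__main__":
    for name, gram in (("E8", E8), ("cubic", CUBIC), ("D16 roots", D16)):
        print(name, "det =", gram_determinant(gram), "even =", is_even(gram))
```

The SO(32) root lattice on its own comes out even but with determinant 4, so it is not self-dual; this is consistent with the remark that spinorial weights must be added to reach the self-dual lattice of spin(32)/$Z_2$.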
15.6 Compactification
It may be that our theoretical understanding of superstring theory is not yet
sufficiently developed to be able to do correct phenomenology. If we
nevertheless choose to plunge in with reckless abandon, then it is clear that a
crucial question is what to do with the six extra spatial dimensions. The
natural guess is that they curl up to form a six-dimensional space $K$ that is
sufficiently small not to have been observed. In fact, the only fundamental
scale in the theory is the Planck length, and it is natural to suppose that the
internal space K is roughly of this size. This means that it is about the same
size that is characteristic of strings themselves.
Since string theory contains gravity it should determine spacetime
geometry dynamically, and so we must require that the 'background
geometry' $M_4 \times K$ corresponds to a solution of the classical equations of
motion. That is the classical statement sometimes called 'spontaneous
compactification'. Quantum mechanically, it should be determined as part
of the characterization of the vacuum: the quantum state of lowest energy. It
is generally assumed, since that is all we can do at this point, that a classical
solution is a good approximation to a quantum ground state. This might not
be true. Another potential pitfall arises from describing $K$ in terms of
classical differential geometry. In doing this it is implicit that the geometry
is determined entirely in terms of the gravitational field (metric tensor).
However, this is just one of an infinity of modes of the string. Of course, it is
singled out by being massless. The massive modes could play an equally
important role, however, in characterizing the geometry and topology of $K$
if $K$ is not much larger than the characteristic string length scale. In this case
a whole new type of geometry, let us call it 'string geometry', may be
required for a suitable description of $K$. Since this does not yet exist, one
assumes that it is not necessary. I would not be surprised, however, if the
prevailing opinion changes when these matters are better understood.
Whatever the correct language for describing it may turn out to be, once
the vacuum of the theory is identified correctly we shall be in a strong
position for calculating many quantities of physical interest. The particle
References
Quantum field theories are usually studied either by means of path integrals
or by means of canonical quantization. Path integral quantization has the
great virtue of explicitly maintaining all relevant symmetries, such as
Poincaré invariance. The canonical approach is usually interpreted as an
approach that ruins Poincaré invariance from the beginning through an
explicit choice of a 'time' coordinate. This is not necessarily so, however.
The essence of the canonical formalism can be developed in a way that
manifestly preserves all relevant symmetries, including Poincaré invariance
(Witten, 1986; Zuckerman, 1986). The purpose of the present paper is to
carry this out in the case of non-abelian gauge theories and general
relativity.
In the canonical formalism of a theory with $N$ degrees of freedom, one
usually introduces momenta and coordinates $p^i$ and $q^j$, $i, j = 1, \ldots, N$. One
then defines the two-form
$$ \omega = \sum_i dp^i \wedge dq^i. \qquad (1) $$
It is convenient to combine the $p^i$ and $q^j$ in a variable $Q^I$, $I = 1, \ldots, 2N$, with
$Q^i = p^i$ for $i \le N$ and $Q^i = q^{i-N}$ for $i > N$. One can think of $\omega$ as an
antisymmetric $2N \times 2N$ matrix $\omega_{IJ}$ whose non-zero matrix elements are
$\omega_{i,i+N} = -\omega_{i+N,i} = 1$. This matrix is invertible; we will denote the inverse
matrix as $\omega^{IJ}$. One defines the Poisson bracket of any two functions $A(Q^I)$
and $B(Q^I)$ by
$$ [A, B] = \omega^{IJ} \frac{\partial A}{\partial Q^I} \frac{\partial B}{\partial Q^J}. \qquad (2) $$
† Research supported in part by the National Science Foundation under grants
PHY80-19754 and PHY86-16129.
Given the form of (1), this is easily seen to coincide with the usual definitions
of Poisson brackets. The advantage of the definition in (2) is that, as is well
known (see e.g. Abraham and Marsden, 1967), the essential features of $\omega$ can
be described in an invariant way. Let $Z$ be the phase space of the theory
under discussion, that is, the space on which the $p$s and $q$s are coordinates. If
one interprets $\omega$ as a two-form on $Z$, then it is clearly a closed two-form,
$$ d\omega = 0, \qquad (3) $$
since its components are constant in the coordinate system used in (1). What
is more, we have already noted that the matrix $\omega$ is invertible. The converse
to this is as follows. Let $\omega$ be any two-form on a manifold $Z$ (which for us will
be the phase space of a physical theory). Suppose that $\omega$ is closed (obeys (3)),
and is non-degenerate in the sense that at each point $z \in Z$, the matrix $\omega_{IJ}(z)$
is invertible. Then it is a classical theorem that locally one can introduce
coordinates on $Z$ to put $\omega$ in the standard form (1). (This is not true globally
in theories with interesting geometrical content.) A non-degenerate closed
two-form is called a symplectic structure. Thus, to describe the canonical
formalism of a theory it is not at all necessary to find or choose $p$s and $q$s; the
essence of the matter is to describe a symplectic structure on the classical
phase space.
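The matrix form of the Poisson bracket described above can be illustrated with a small numerical sketch. It assumes the ordering $Q = (p^1, \ldots, p^N, q^1, \ldots, q^N)$ and the convention $\omega_{i,i+N} = -\omega_{i+N,i} = 1$ from the text. The helper names and the finite-difference gradient are choices made here for illustration, not part of the original construction.

```python
import numpy as np


def symplectic_matrix(n):
    """omega_{IJ} for Q = (p^1..p^n, q^1..q^n): omega_{i,i+n} = -omega_{i+n,i} = 1."""
    omega = np.zeros((2 * n, 2 * n))
    i = np.arange(n)
    omega[i, i + n] = 1.0
    omega[i + n, i] = -1.0
    return omega


def gradient(f, point, h=1e-5):
    """Central-difference estimate of dF/dQ^I at the given phase-space point."""
    point = np.asarray(point, dtype=float)
    grad = np.zeros_like(point)
    for k in range(point.size):
        step = np.zeros_like(point)
        step[k] = h
        grad[k] = (f(point + step) - f(point - step)) / (2 * h)
    return grad


def poisson_bracket(a, b, point, omega_inverse):
    """[A, B] = omega^{IJ} dA/dQ^I dB/dQ^J, with omega^{IJ} the inverse matrix."""
    return gradient(a, point) @ omega_inverse @ gradient(b, point)


def momentum(Q):
    return Q[0]


def coordinate(Q):
    return Q[1]


def oscillator_energy(Q):
    """Illustrative Hamiltonian (unit mass and frequency assumed)."""
    return 0.5 * (Q[0] ** 2 + Q[1] ** 2)


if __name__ == "__main__":
    omega_inverse = np.linalg.inv(symplectic_matrix(1))
    point = np.array([0.3, -1.2])  # (p, q)
    print("[q, p] =", poisson_bracket(coordinate, momentum, point, omega_inverse))
    print("[q, H] =", poisson_bracket(coordinate, oscillator_energy, point, omega_inverse))
    print("[p, H] =", poisson_bracket(momentum, oscillator_energy, point, omega_inverse))
```

With these conventions the bracket of $q$ with $p$ should come out close to 1, and the brackets with the illustrative Hamiltonian should reproduce Hamilton's equations, $[q, H] = p$ and $[p, H] = -q$.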
Clearly, the notion of a 'symplectic structure on phase space' is a more
intrinsic concept than the idea of choosing $p$s and $q$s. However, at first sight
it might appear that the very concept of phase space is a non-covariant
concept, tied to a non-covariant, Hamiltonian description. This is not really
so. The whole idea, classically, of picking $p$s and $q$s is that the initial values of
the $p$s and $q$s determine a solution of the classical equations. More precisely,
classical solutions of any given physical theory, in any given coordinate
system, are in one-to-one correspondence with the values of the $p$s and $q$s at
time zero. This simple consideration leads us to a manifestly covariant
definition of what we mean by classical phase space: in a given physical
theory, classical phase space is the space of solutions of the classical
equations. We can always, if we wish, pick a coordinate system and identify
the classical solutions with the initial data in that coordinate system, but
there is no necessity to make such a non-covariant choice.
Given a relativistic field theory, such as the scalar field theory with
Lagrangian
$$ A = \int dx_1 \cdots dx_n\, \alpha_{x_1 \ldots x_n}(\phi)\, \delta\phi(x_1)\, \delta\phi(x_2) \cdots \delta\phi(x_n) \qquad (7) $$
is not unique, since the $\delta\phi(x)$ are not linearly independent, being subject
to (6).
Finally, we need an exterior derivative on $Z$, which we will call $\delta$ and
which must map $k$-forms to $(k + 1)$-forms. It should obey
$$ \delta^2 = 0 \qquad (9) $$
and the Leibniz rule
$$ \delta(AB) = \delta A \cdot B + (-1)^A A \cdot \delta B. \qquad (10) $$
We define $\delta$ by saying that acting on the zero-form (function) $\phi(x)$, $\delta$ gives
the one-form that we have called $\delta\phi(x)$. The Leibniz rule then determines the
action of $\delta$ on an arbitrary zero-form $\Gamma$:
$$ \delta(\Gamma(\phi)) = \int dx\, \frac{\delta\Gamma}{\delta\phi(x)}\, \delta\phi(x), \qquad (11) $$
where $\delta\Gamma/\delta\phi(x)$ is just the variational derivative of $\Gamma$ with respect to $\phi(x)$.
Equation (11) is a familiar formula to which we are giving, perhaps, a
slightly novel meaning. To act with $\delta$ on $k$-forms with $k > 0$, one must bear in
mind, first of all, that as the one-form $\delta\phi(x)$ is the exterior derivative of the
zero-form $\phi(x)$, it must be closed:
$$ \delta(\delta\phi(x)) = 0. \qquad (12) $$
Also, using the Leibniz rule, we then have the exterior derivative of a general
$k$-form (7):
$$ \delta A = \int dx_0 \cdots dx_n\, \frac{\delta \alpha_{x_1 \ldots x_n}(\phi)}{\delta\phi(x_0)}\, \delta\phi(x_0)\, \delta\phi(x_1) \cdots \delta\phi(x_n). \qquad (13) $$
Although our definition of $\delta$ has been rather formal, one can readily see that
it possesses the standard properties of the exterior derivative. Thus, if $V$ is a
vector field, $A$ a zero-form, and $i_V$ the operation of contraction with $V$, we
have $V(A) = i_V\, \delta A$.
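As a concrete instance of these rules, consider a simple quadratic zero-form. The functional $\Gamma$ below is chosen here purely for illustration and is not taken from the original text. The second line uses only the closure of $\delta\phi(x)$ and the anticommutativity of one-forms.

```latex
% Illustrative zero-form (an assumption made for this example):
\Gamma(\phi) = \tfrac{1}{2} \int dx\, \phi(x)^2
\quad\Longrightarrow\quad
\delta\Gamma = \int dx\, \phi(x)\, \delta\phi(x).
% Acting again with \delta, using \delta(\delta\phi(x)) = 0 and
% \delta\phi(x)\,\delta\phi(x) = -\delta\phi(x)\,\delta\phi(x) = 0:
\delta(\delta\Gamma) = \int dx\, \delta\phi(x)\, \delta\phi(x) = 0 .
```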
Having defined the relevant concepts, how are we to find a symplectic
structure in, say, the scalar field theory (4)? The idea (Witten, 1986;
Zuckerman, 1986) is to consider the 'symplectic current'
$$ J^\alpha(x) = \delta\phi(x)\, \partial^\alpha \delta\phi(x). \qquad (14) $$
At each spacetime point, (14) is a two-form on $Z$; but in its dependence on $x$,
$J^\alpha$ is a conserved current:
$$ \partial_\alpha J^\alpha = 0. \qquad (15) $$
To verify (15), one needs the equation of motion (6) and the fact that $\delta\phi$ is
anticommuting. As $J^\alpha$ is conserved, its integral over an initial value
hypersurface $\Sigma$,
$$ \omega = \int_\Sigma d\Sigma_\alpha\, J^\alpha, \qquad (16) $$
is Poincaré invariant.
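The independence of the symplectic form from the choice of initial-value surface can be seen in the simplest possible setting. The sketch below is a toy analogue, not the field-theory construction itself: it assumes a single harmonic oscillator with unit mass and frequency, labels each classical solution by two constants $(a, b)$, and evaluates the antisymmetric pairing $dp \wedge dq$ of two linearized solutions at different times. All names are hypothetical.

```python
import numpy as np


def solution(a, b, t):
    """q(t) = a cos t + b sin t and p(t) = dq/dt (unit mass and frequency assumed)."""
    q = a * np.cos(t) + b * np.sin(t)
    p = -a * np.sin(t) + b * np.cos(t)
    return q, p


def symplectic_pairing(var1, var2, t):
    """Evaluate dp ^ dq on two linearized solutions, using the data at time t.

    The equation of motion is linear, so a variation (da, db) of the labels
    of a solution is itself a solution with those labels.
    """
    dq1, dp1 = solution(var1[0], var1[1], t)
    dq2, dp2 = solution(var2[0], var2[1], t)
    return dp1 * dq2 - dp2 * dq1


if __name__ == "__main__":
    v1, v2 = (0.7, -0.4), (1.5, 2.0)
    for t in (0.0, 1.0, 2.5):
        print("t =", t, " pairing =", symplectic_pairing(v1, v2, t))
```

The pairing should come out the same on every time slice and equal to $db_1\, da_2 - da_1\, db_2$, so it can be regarded as a two-form on the space of solutions rather than on the data at one chosen time.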
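It is easy to see that $\omega$ is also closed; $\delta A_\mu$, being the exterior derivative of
the zero-form $A_\mu$, is closed, while, in view of (21) and the anticommutativity
of $\delta$, we have
$$ \delta(\delta F_{\mu\alpha}) = \delta(D_\mu \delta A_\alpha - D_\alpha \delta A_\mu) = \delta([A_\mu, \delta A_\alpha] - [A_\alpha, \delta A_\mu]) = \{\delta A_\mu, \delta A_\alpha\} - \{\delta A_\alpha, \delta A_\mu\} = 0. \qquad (24) $$
Therefore, $\omega$ is closed.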
It remains to discuss the behavior of $\omega$ under gauge transformations. First
of all, the gauge transformation law for the gauge field is
$$ A_\mu \to A_\mu + [D_\mu, \epsilon] = A_\mu + \partial_\mu \epsilon + [A_\mu, \epsilon]. \qquad (25) $$
Varying (25), we find that under gauge transformations, $\delta A$ transforms as
$$ \delta A_\mu \to \delta A_\mu + [\delta A_\mu, \epsilon]. \qquad (26) $$
In particular, $\delta A$ transforms homogeneously under gauge transformations.
And $\delta F$ transforms in the same way:
$$ \delta F_{\mu\alpha} \to \delta F_{\mu\alpha} + [\delta F_{\mu\alpha}, \epsilon]. \qquad (27) $$
Consequently, $J^\alpha$ and $\omega$ are gauge invariant.
This is an important step in the right direction, but it is not the end of the
story. Let $\hat{Z}$ be the space of solutions of (18), and let $Z$ be the space of
solutions of (18) modulo gauge transformations. Thus, $Z = \hat{Z}/G$, with $G$
being the group of gauge transformations. So far we have defined a
gauge-invariant closed two-form $\omega$ on $\hat{Z}$. What we want is a gauge-invariant closed
two-form on $Z$. We would like to show that the differential form $\omega$ that we
have defined on $\hat{Z}$ is the pullback from $Z$ to $\hat{Z}$ of a differential form on $Z$,
which we will also call $\omega$. This will be so if the following condition is obeyed.
If $V$ is any vector field tangent to the $G$ orbits on $\hat{Z}$, and $i_V$ is the operation of
contraction with $V$, the requirement is $i_V \omega = 0$. This is a fancy way of saying
that the components of $\omega$ in the gauge directions are zero; one must require
this since the gauge directions are eliminated in passing from $\hat{Z}$ to $Z$, so a
differential form on $Z$ cannot have non-zero components in those directions.
In Yang-Mills theory, the gauge directions in field space are simply
$$ \delta A_\mu = D_\mu \epsilon. \qquad (28) $$
More generally, we can consider a field variation
$$ \delta A_\mu = \delta' A_\mu + D_\mu \epsilon, \qquad (29) $$
which has a gauge component $D_\mu \epsilon$, and another component $\delta' A_\mu$ which is
not pure gauge. To verify that $\omega$ has vanishing components in the gauge
directions, we must show that if we insert (29) in the definition of $\omega$, the term
proportional to $\epsilon$ drops out. The expression which must vanish is
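$$ \Delta\omega = \int d\Sigma_\alpha\, \mathrm{Tr}\,\bigl[ D_\mu \epsilon\, \delta F^{\alpha\mu} + \delta A_\mu\, [F^{\alpha\mu}, \epsilon] \bigr] = \int d\Sigma_\alpha\, \partial_\mu \mathrm{Tr}\,(\epsilon\, \delta F^{\alpha\mu}). \qquad (30) $$
In the last step, we have used anticommutativity of $\epsilon$ and $\delta F^{\alpha\mu}$ and the
equation of motion (19). Indeed, (30) vanishes, being the integral of a total
derivative, and this completes the construction of a symplectic structure on
the gauge-invariant space $Z$.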
$$ \omega = \int_\Sigma d\Sigma_\alpha\, \sqrt{g}\, J^\alpha $$
Poincaré invariant.
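$$ \delta\omega = \int_\Sigma d\Sigma_\alpha\, \bigl( \delta\sqrt{g}\, J^\alpha + \sqrt{g}\, \delta J^\alpha \bigr) $$
$$ \delta J^\alpha = -\tfrac{1}{2}\, \delta\Gamma^\alpha_{\mu\nu}\, \delta g^{\mu\nu}\, \delta \ln g + \tfrac{1}{2}\, \delta\Gamma^\nu_{\mu\nu}\, \delta g^{\alpha\mu}\, \delta \ln g. $$
Remembering that $\delta \ln g$ is an anticommuting one-form whose square is
zero, we have
$$ \delta J^\alpha = -\tfrac{1}{2}\, J^\alpha\, \delta \ln g. $$
As $\delta\sqrt{g} = \tfrac{1}{2} \sqrt{g}\, \delta \ln g$, $\delta\omega = 0$. Thus, $\omega$ is closed.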
It now remains to investigate gauge invariance of $\omega$. The fact that $\omega$ is
invariant under diffeomorphisms is relatively trivial; it follows from the fact
that all ingredients in the definition of $\omega$, including $\delta\Gamma$, transform
homogeneously, like tensors. As in the Yang-Mills case, the more delicate
point is to show that we obtain a closed two-form not just on the space $\hat{Z}$ of
solutions of the Einstein equations, but also on the subtler space $Z = \hat{Z}/G$,
with $G$ being the group of diffeomorphisms. We must show that components
of $\omega$ tangent to the $G$ orbits vanish. Under a diffeomorphism $x^\mu \to x^\mu + \epsilon^\mu$,
the metric changes by
$$ g_{\mu\nu} \to g_{\mu\nu} + D_\mu \epsilon_\nu + D_\nu \epsilon_\mu. $$
We assume $\epsilon$ has compact support or, more generally, is asymptotic at
infinity to a Killing vector field. If we write
$$ \delta g_{\mu\nu} = \delta' g_{\mu\nu} + D_\mu \epsilon_\nu + D_\nu \epsilon_\mu, \qquad (36) $$
where the $D\epsilon$ terms are pure gauge but $\delta' g$ is not, then our task is to show
that, with (36) inserted in $\omega$, the $D\epsilon$ terms do not contribute. It is useful first
to rewrite $J^\alpha$ as
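$$ J^\alpha = \tfrac{1}{2}\, g^{\alpha\beta} \bigl( D_\mu \delta g_{\nu\beta} + D_\nu \delta g_{\mu\beta} - D_\beta \delta g_{\mu\nu} \bigr)\, \delta g^{\mu\nu} - \tfrac{1}{2}\, D_\mu \delta g^{\mu\alpha}\, \delta \ln g + \tfrac{1}{2}\, \delta g^{\mu\alpha}\, D_\mu \delta \ln g - \tfrac{1}{2}\, (D^\alpha \delta \ln g)\, \delta \ln g. \qquad (37) $$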
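Inserting (36) in (37), the term linear in $\epsilon$ is (after dropping the $'$ from $\delta' g$)
$$ \Delta J^\alpha = [D_\mu, D^\alpha]\, \epsilon_\nu\, \delta g^{\mu\nu} + [D_\nu, D_\mu]\, \delta g^{\alpha\mu}\, \epsilon^\nu + D_\nu \bigl[ (D_\mu \delta g^{\mu\nu} + D^\nu \delta \ln g)\, \epsilon^\alpha + D^\nu \delta g^{\mu\alpha}\, \epsilon_\mu + D_\mu \epsilon^\alpha\, \delta g^{\mu\nu} + \tfrac{1}{2}\, D^\nu \epsilon^\alpha\, \delta \ln g - (\alpha \leftrightarrow \nu) \bigr], \qquad (38) $$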
where we have used (33), (32), and the identity $[D_\mu, D_\nu]\, \epsilon^\mu = 0$, which follows
from $R_{\mu\nu} = 0$. We are entitled to discard from $\Delta J^\mu$ terms of the form $D_\alpha X^{\mu\alpha}$,
with $X^{\mu\alpha}$ being an antisymmetric tensor; such terms will vanish when
inserted in the integral for $\omega$. Discarding such total derivatives from (38), we
References