This is Volume 45 in
INTERNATIONAL GEOPHYSICS SERIES
A series of monographs and textbooks
Edited by RENATA DMOWSKA and JAMES R. HOLTON
A complete list of the books in this series appears at the end of this volume.
GEOPHYSICAL DATA ANALYSIS:
DISCRETE INVERSE THEORY
Revised Edition
William Menke
Lamont-Doherty Geological Observatory and
Department of Geological Sciences
Columbia University
Palisades, New York
Formerly of
College of Oceanography
Oregon State University
ACADEMIC PRESS, INC.
Harcourt Brace Jovanovich, Publishers
San Diego New York Berkeley Boston
London Sydney Tokyo Toronto
Copyright © 1989, 1984 by Academic Press, Inc.
by any means, electronic or mechanical, including photocopy, recording, or
any information storage and retrieval system, without permission in writing
from the publisher.
Academic Press, Inc.
San Diego, California 92101
United Kingdom Edition published by
Academic Press Limited
24-28 Oval Road, London NWI 7DX
Library of Congress Cataloging-in-Publication Data
Menke, William.
Geophysical data analysis : discrete inverse theory / William Menke. - Rev. ed.
p. cm.
Bibliography: p.
Includes index.
ISBN 0-12-490921-3 (alk. paper)
1. Geophysics--Measurement. 2. Oceanography--Measurement.
3. Inverse problems (Differential equations)--Numerical solutions.
I. Title. II. Series.
QC802.A1M46 1989
551--dc20 89-31224
CIP
Printed in the United States of America
89 90 91 92 9 8 7 6 5 4 3 2 1
CONTENTS
PREFACE
INTRODUCTION
1 DESCRIBING INVERSE PROBLEMS
1.1 Formulating Inverse Problems
1.2 The Linear Inverse Problem
1.3 Examples of Formulating Inverse Problems
1.4 Solutions to Inverse Problems
2 SOME COMMENTS ON PROBABILITY THEORY
2.1 Noise and Random Variables
2.2 Correlated Data
2.3 Functions of Random Variables
2.4 Gaussian Distributions
2.5 Testing the Assumption of Gaussian Statistics
2.6 Confidence Intervals
3 SOLUTION OF THE LINEAR, GAUSSIAN INVERSE PROBLEM, VIEWPOINT 1: THE LENGTH METHOD
3.1 The Lengths of Estimates
3.2 Measures of Length
3.3 Least Squares for a Straight Line
3.4 The Least Squares Solution of the Linear Inverse Problem
3.5 Some Examples
3.6 The Existence of the Least Squares Solution
3.7 The Purely Underdetermined Problem
3.8 Mixed-Determined Problems
3.9 Weighted Measures of Length as a Type of A Priori Information
3.10 Other Types of A Priori Information
3.11 The Variance of the Model Parameter Estimates
3.12 Variance and Prediction Error of the Least Squares Solution
4 SOLUTION OF THE LINEAR, GAUSSIAN INVERSE PROBLEM, VIEWPOINT 2: GENERALIZED INVERSES
4.1 Solutions versus Operators
4.2 The Data Resolution Matrix
4.3 The Model Resolution Matrix
4.4 The Unit Covariance Matrix
4.5 Resolution and Covariance of Some Generalized Inverses
4.6 Measures of Goodness of Resolution and Covariance
4.7 Generalized Inverses with Good Resolution and Covariance
4.8 Sidelobes and the Backus-Gilbert Spread Function
4.9 The Backus-Gilbert Generalized Inverse for the Underdetermined Problem
4.10 Including the Covariance Size
4.11 The Trade-off of Resolution and Variance
5 SOLUTION OF THE LINEAR, GAUSSIAN INVERSE PROBLEM, VIEWPOINT 3: MAXIMUM LIKELIHOOD METHODS
5.1 The Mean of a Group of Measurements
5.2 Maximum Likelihood Solution of the Linear Inverse Problem
5.3 A Priori Distributions
5.4 Maximum Likelihood for an Exact Theory
5.5 Inexact Theories
5.6 The Simple Gaussian Case with a Linear Theory
5.7 The General Linear, Gaussian Case
5.8 Equivalence of the Three Viewpoints
5.9 The F Test of Error Improvement Significance
5.10 Derivation of the Formulas of Section 5.7
6 NONUNIQUENESS AND LOCALIZED AVERAGES
6.1 Null Vectors and Nonuniqueness
6.2 Null Vectors of a Simple Inverse Problem
6.3 Localized Averages of Model Parameters
6.4 Relationship to the Resolution Matrix
6.5 Averages versus Estimates
6.6 Nonunique Averaging Vectors and A Priori Information
7 APPLICATIONS OF VECTOR SPACES
7.1 Model and Data Spaces
7.2 Householder Transformations
7.3 Designing Householder Transformations
7.4 Transformations That Do Not Preserve Length
7.5 The Solution of the Mixed-Determined Problem
7.6 Singular-Value Decomposition and the Natural Generalized Inverse
7.7 Derivation of the Singular-Value Decomposition
7.8 Simplifying Linear Equality and Inequality Constraints
7.9 Inequality Constraints
8 LINEAR INVERSE PROBLEMS AND NON-GAUSSIAN DISTRIBUTIONS
8.1 L1 Norms and Exponential Distributions
8.2 Maximum Likelihood Estimate of the Mean of an Exponential Distribution
8.3 The General Linear Problem
8.4 Solving L1 Norm Problems
8.5 The L∞ Norm
9 NONLINEAR INVERSE PROBLEMS
9.1 Parameterizations
9.2 Linearizing Parameterizations
9.3 The Nonlinear Inverse Problem with Gaussian Data
9.4 Special Cases
9.5 Convergence and Nonuniqueness of Nonlinear L2 Problems
9.6 Non-Gaussian Distributions
9.7 Maximum Entropy Methods
10 FACTOR ANALYSIS
10.1 The Factor Analysis Problem
10.2 Normalization and Physicality Constraints
10.3 Q-Mode and R-Mode Factor Analysis
10.4 Empirical Orthogonal Function Analysis
11 CONTINUOUS INVERSE THEORY AND TOMOGRAPHY
11.1 The Backus-Gilbert Inverse Problem
11.2 Resolution and Variance Trade-off
11.3 Approximating Continuous Inverse Problems as Discrete Problems
11.4 Tomography and Continuous Inverse Theory
11.5 Tomography and the Radon Transform
11.6 The Fourier Slice Theorem
11.7 Backprojection
12 SAMPLE INVERSE PROBLEMS
12.1 An Image Enhancement Problem
12.2 Digital Filter Design
12.3 Adjustment of Crossover Errors
12.4 An Acoustic Tomography Problem
12.5 Temperature Distribution in an Igneous Intrusion
12.6 L1, L2, and L∞ Fitting of a Straight Line
12.7 Finding the Mean of a Set of Unit Vectors
12.8 Gaussian Curve Fitting
12.9 Earthquake Location
12.10 Vibrational Problems
13 NUMERICAL ALGORITHMS
13.1 Solving Even-Determined Problems
13.2 Inverting a Square Matrix
13.3 Solving Underdetermined and Overdetermined Problems
13.4 L2 Problems with Inequality Constraints
13.5 Finding the Eigenvalues and Eigenvectors of a Real Symmetric Matrix
13.6 The Singular-Value Decomposition of a Matrix
13.7 The Simplex Method and the Linear Programming Problem
14 APPLICATIONS OF INVERSE THEORY TO GEOPHYSICS
14.1 Earthquake Location and the Determination of the Velocity Structure of the Earth from Travel Time Data
14.2 Velocity Structure from Free Oscillations and Seismic Surface Waves
14.3 Seismic Attenuation
14.4 Signal Correlation
14.5 Tectonic Plate Motions
14.6 Gravity and Geomagnetism
14.7 Electromagnetic Induction and the Magnetotelluric Method
14.8 Ocean Circulation
APPENDIX A: Implementing Constraints with Lagrange Multipliers
APPENDIX B: L2 Inverse Theory with Complex Quantities
REFERENCES
INDEX
INTERNATIONAL GEOPHYSICS SERIES
PREFACE
Every researcher in the applied sciences who has analyzed data has
practiced inverse theory. Inverse theory is simply the set of methods
used to extract useful inferences about the world from physical mea-
surements. The fitting of a straight line to data involves a simple appli-
cation of inverse theory. Tomography, popularized by the physician’s
CAT scanner, uses it on a more sophisticated level.
The study of inverse theory, however, is more than the cataloging of
methods of data analysis. It is an attempt to organize these techniques,
to bring out their underlying similarities and pin down their differ-
ences, and to deal with the fundamental question of the limits of
information that can be gleaned from any given data set.
Physical properties fall into two general classes: those that can be
described by discrete parameters (e.g., the mass of the earth or the
position of the atoms in a protein molecule) and those that must be
described by continuous functions (e.g., temperature over the face of
the earth or electric field intensity in a capacitor). Inverse theory em-
ploys different mathematical techniques for these two classes of pa-
rameters: the theory of matrix equations for discrete parameters and
the theory of integral equations for continuous functions.
Being introductory in nature, this book deals only with “discrete
inverse theory,” that is, the part of the theory concerned with parame-
ters that either are truly discrete or can be adequately approximated as
discrete. By adhering to these limitations, inverse theory can be pre-
sented on a level that is accessible to most first-year graduate students
and many college seniors in the applied sciences. The only mathemat-
ics that is presumed is a working knowledge of the calculus and linear
algebra and some familiarity with general concepts from probability
theory and statistics.
The treatment of inverse theory in this book is divided into four
parts. Chapters 1 and 2 provide a general background, explaining what
inverse problems are and what constitutes their solution as well as
reviewing some of the basic concepts from probability theory that will
be applied throughout the text. Chapters 3-7 discuss the solution of
the canonical inverse problem: the linear problem with Gaussian sta-
tistics. This is the best understood of all inverse problems; and it is here
that the fundamental notions of uncertainty, uniqueness, and resolu-
tion can be most clearly developed. Chapters 8-11 extend the discus-
sion to problems that are non-Gaussian and nonlinear. Chapters 12-
14 provide examples of the use of inverse theory and a discussion of the
numerical algorithms that must be employed to solve inverse problems
on a computer.
Many people helped me write this book. I am very grateful to my
students at Columbia University and at Oregon State University for
the helpful comments they gave me during the courses I have taught on
inverse theory. Critical readings of the manuscript were undertaken by
Leroy Dorman, L. Neil Frazer, and Walt Pilant; I thank them for their
advice and encouragement. I also thank my copyeditor, Ellen Drake,
draftsperson, Susan Binder, and typist, Lyn Shaterian, for the very
professional attention they gave to their respective work. Finally, I
thank the many hundreds of scientists and mathematicians whose
ideas I drew upon in writing this book.
INTRODUCTION
Inverse theory is an organized set of mathematical techniques for
reducing data to obtain useful information about the physical world
on the basis of inferences drawn from observations. Inverse theory, as
we shall consider it in this book, is limited to observations and
questions that can be represented numerically. The observations of the
world will consist of a tabulation of measurements, or “data.” The
questions we want to answer will be stated in terms of the numerical
values (and statistics) of specific (but not necessarily directly measurable)
properties of the world. These properties will be called “model
parameters” for reasons that will become apparent. We shall assume
that there is some specific method (usually a mathematical theory or
model) for relating the model parameters to the data.
The question, What causes the motion of the planets?, for example,
is not one to which inverse theory can be applied. Even though it is
perfectly scientific and historically important, its answer is not numer-
ical in nature. On the other hand, inverse theory can be applied to the
question, Assuming that Newtonian mechanics applies, determine the
number and orbits of the planets on the basis of the observed orbit of
Halley’s comet. The number of planets and their orbital ephemerides
are numerical in nature. Another important difference between these
two problems is that the first asks us to determine the reason for the
orbital motions, and the second presupposes the reason and asks us
only to determine certain details. Inverse theory rarely supplies the
kind of insight demanded by the first question; it always demands that
the physical model be specified beforehand.
The term “inverse theory” is used in contrast to “forward theory,”
which is defined as the process of predicting the results of measure-
ments (predicting data) on the basis of some general principle or model
and a set of specific conditions relevant to the problem at hand. Inverse
theory, roughly speaking, addresses the reverse problem: starting with
data and a general principle or model, it determines estimates of the
model parameters. In the above example, predicting the orbit of
Halley’s comet from the presumably well-known orbital ephemerides
of the planets is a problem for forward theory.
Another comparison of forward and inverse problems is provided
by the phenomenon of temperature variation as a function of depth
beneath the earth’s surface. Let us assume that the temperature increases
linearly with depth in the earth; that is, temperature T is related
to depth z by the rule T(z) = az + b, where a and b are numerical
constants. If one knows that a = 0.1 and b = 25, then one can solve
the forward problem simply by evaluating the formula for any desired
depth. The inverse problem would be to determine a and b on the basis
of a suite of temperature measurements made at different depths in,
say, a bore hole. One may recognize that this is the problem of fitting a
straight line to data, which is a substantially harder problem than the
forward problem of evaluating a first-degree polynomial. This brings
out a property of most inverse problems: that they are substantially
harder to solve than their corresponding forward problems.
Forward problem:
model parameters → model → prediction of data
Inverse problem:
data → model → estimates of model parameters
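The contrast between the two directions can be made concrete with the temperature example. The following is a minimal sketch, assuming Python with NumPy; the depths are hypothetical values chosen for illustration, the data are noise-free, and `np.polyfit` is used only as one convenient way to fit a line, not as a method prescribed by the text.

```python
import numpy as np

def forward(a, b, z):
    """Forward problem: predict temperature at depths z from known a and b."""
    return a * z + b

# Forward problem, using the constants quoted in the text (a = 0.1, b = 25).
a_true, b_true = 0.1, 25.0
z = np.array([0.0, 10.0, 20.0, 30.0, 40.0])  # hypothetical depths
T_pred = forward(a_true, b_true, z)

# Inverse problem: estimate a and b from the (here noise-free) temperatures.
# np.polyfit returns coefficients from highest degree down: [slope, intercept].
a_est, b_est = np.polyfit(z, T_pred, deg=1)

print("predicted temperatures:", T_pred)
print("estimated a, b:", a_est, b_est)
```

With noise-free data the estimates reproduce a and b to rounding error; with real measurements they would only approximate them, which is where the questions of error and uniqueness discussed below begin.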
Note that the role of inverse theory is to provide information about
unknown numerical parameters that go into the model, not to provide
the model itself. Nevertheless, inverse theory can often provide a
means for assessing the correctness of a given model or of discriminat-
ing between several possible models.
The model parameters one encounters in inverse theory vary from
discrete numerical quantities to continuous functions of one or more
variables. The intercept and slope of the straight line mentioned above
are examples of discrete parameters. Temperature, which varies con-
tinuously with position, is an example of a continuous function. This
book deals only with discrete inverse theory, in which the model
parameters are represented as a set of a finite number of numerical
values. This limitation does not, in practice, exclude the study of
continuous functions, since they can usually be adequately approxi-
mated by a finite number of discrete parameters. Temperature, for
example, might be represented by its value at a finite number of closely
spaced points or by a set of splines with a finite number of coefficients.
This approach does, however, limit the rigor with which continuous
functions can be studied. Parameterizations of continuous functions
are always both approximate and, to some degree, arbitrary, properties
that cast a certain amount of imprecision into the theory. Nevertheless,
discrete inverse theory is a good starting place for the study of
inverse theory in general, since it relies mainly on the theory of vectors
and matrices rather than on the somewhat more complicated theory of
continuous functions and operators. Furthermore, careful application
of discrete inverse theory can often yield considerable insight, even
when applied to problems involving continuous parameters.
Although the main purpose of inverse theory is to provide estimates
of model parameters, the theory has a considerably larger scope. Even
in cases in which the model parameters are the only desired results,
there is a plethora of related information that can be extracted to help
determine the “goodness” of the solution to the inverse problem. The
actual values of the model parameters are indeed irrelevant in cases
when we are mainly interested in using inverse theory as a tool in
experimental design or in summarizing the data. Some of the ques-
tions inverse theory can help answer are the following.
(a) What are the underlying similarities among inverse problems?
(b) How are estimates of model parameters made?
(c) How much of the error in the measurements shows up as error
in the estimates of the model parameters?
(d) Given a particular experimental design, can a certain set of
model parameters really be determined?
These questions emphasize that there are many different kinds of
answers to inverse problems and many different criteria by which the
goodness of those answers can be judged. Much of the subject of
inverse theory is concerned with recognizing when certain criteria are
more applicable than others, as well as detecting and avoiding (if
possible) the various pitfalls that can arise.
Inverse problems arise in many branches of the physical sciences.
An incomplete list might include such entries as
(a) medical tomography,
(b) image enhancement,
(c) curve fitting,
(d) earthquake location,
(e) factor analysis,
(f) determination of earth structure from geophysical data,
(g) satellite navigation,
(h) mapping of celestial radio sources with interferometry, and
(i) analysis of molecular structure by x-ray diffraction.
Inverse theory was developed by scientists and mathematicians
having various backgrounds and goals. Thus, although the resulting
versions of the theory possess strong and fundamental similarities,
they have tended to look, superficially, very different. One of the goals
of this book is to present the various aspects of discrete inverse theory
in such a way that both the individual viewpoints and the “big picture”
can be clearly understood.
There are perhaps three major viewpoints from which inverse
theory can be approached. The first and oldest sprang from probability
theory, a natural starting place for such “noisy” quantities as observations
of the real world. In this version of inverse theory the data and
model parameters are treated as random variables, and a great deal of
emphasis is placed on determining the probability distributions that
they follow. This viewpoint leads very naturally to the analysis of error
and to tests of the significance of answers.
The second viewpoint developed from that part of the physical
sciences that retains a deterministic stance and avoids the explicit use
of probability theory. This approach has tended to deal only with
estimates of model parameters (and perhaps with their error bars)
rather than with probability distributions per se. Yet what one means
by an estimate is often nothing more than the expected value of a
probability distribution: the difference is only one of emphasis.
The third viewpoint arose from a consideration of model parameters
that are inherently continuous functions. Whereas the other two
viewpoints handled this problem by approximating continuous func-
tions with a finite number of discrete parameters, the third developed
methods for handling continuous functions explicitly. Although continuous
inverse theory is not within the scope of this book, many of the
concepts originally developed for it have application to discrete inverse
theory, especially when it is used with discretized continuous func-
tions.
This book is written at a level that might correspond to a first
graduate course in inverse theory for quantitative applied scientists.
Although inverse theory is a mathematical subject, an attempt has
been made to keep the mathematical treatment self-contained. With a
few exceptions, only a working knowledge of the calculus and matrix
algebra is presumed. Nevertheless, the treatment is in no sense simpli-
fied. Realistic examples, drawn from the scientific literature, are used
to illustrate the various techniques. Since in practice the solutions to
most inverse problems require substantial computational effort, attention
is given to the kinds of algorithms that might be used to implement
the solutions on a modern digital computer.
1 DESCRIBING INVERSE PROBLEMS
1.1 Formulating Inverse Problems
The starting place in most inverse problems is a description of the
data. Since in most inverse problems the data are simply a table of
numerical values, a vector provides a convenient means of their
representation. If N measurements are performed in a particular
experiment, for instance, one might consider these numbers as the
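elements of a vector d of length N. Similarly, the model parameters can
be represented as the elements of a vector m, which is of length M.
$$
\begin{aligned}
\text{data:}\quad & \mathbf{d} = [d_1, d_2, d_3, d_4, \ldots, d_N]^T \\
\text{model parameters:}\quad & \mathbf{m} = [m_1, m_2, m_3, m_4, \ldots, m_M]^T
\end{aligned}
\tag{1.1}
$$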
Here T signifies transpose.
The basic statement of an inverse problem is that the model parame-
ters and the data are in some way related. This relationship is called the
model. Usually the model takes the form of one or more formulas that
the data and model parameters are expected to follow.
If, for instance, one were attempting to determine the density of an
object by measuring its mass and volume, there would be two data,
mass and volume (say, $d_1$ and $d_2$, respectively), and one unknown
model parameter, density (say, $m_1$). The model would be the statement
that density times volume equals mass, which can be written compactly
by the vector equation $d_2 m_1 = d_1$.
In more realistic situations the data and model parameters are
related in more complicated ways. Most generally, the data and model
parameters might be related by one or more implicit equations such as
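$$
\begin{aligned}
f_1(\mathbf{d}, \mathbf{m}) &= 0 \\
f_2(\mathbf{d}, \mathbf{m}) &= 0 \\
&\;\;\vdots \\
f_L(\mathbf{d}, \mathbf{m}) &= 0
\end{aligned}
\tag{1.2}
$$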
where L is the number of equations. In the above example concerning
the measuring of density, L = 1 and $d_2 m_1 = d_1$ would constitute the
one equation of the form $f_1(\mathbf{d}, \mathbf{m}) = 0$. These implicit equations, which
can be compactly written as the vector equation f(d, m) = 0, summa-
rize what is known about how the measured data and the unknown
model parameters are related. The purpose of inverse theory is to
solve, or “invert,” these equations for the model parameters, or what-
ever kinds of answers might be possible or desirable in any given
situation.
No claims are made either that the equations f(d, m) = 0 contain
enough information to specify the model parameters uniquely or that
they are even consistent. One of the purposes of inverse theory is to
answer these kinds of questions and to provide means of dealing with
the problems that they imply. In general, f(d, m) = 0 can consist of
arbitrarily complicated (nonlinear) functions of the data and model
parameters. In many problems, however, the equation takes on one of
several simple forms. It is convenient to give names to some of these
special cases, since they commonly arise in practical problems; and we
shall give them special consideration in later chapters.
1.1.1 IMPLICIT LINEAR FORM
The function f is linear in both data and model parameters and can
therefore be written as the matrix equation
$$
\mathbf{f}(\mathbf{d}, \mathbf{m}) = \mathbf{0} = \mathbf{F} \begin{bmatrix} \mathbf{d} \\ \mathbf{m} \end{bmatrix}
\tag{1.3}
$$
where F is an L × (M + N) matrix.
1.1.2 EXPLICIT FORM
In many instances it is possible to separate the data from the model
parameters and thus to form L = N equations that are linear in the
data (but still nonlinear in the model parameters through a vector
function g).
$$
\mathbf{f}(\mathbf{d}, \mathbf{m}) = \mathbf{0} = \mathbf{d} - \mathbf{g}(\mathbf{m})
\tag{1.4}
$$
1.1.3 EXPLICIT LINEAR FORM
In the explicit linear form the function g is also linear, leading to the
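N × M matrix equation (where L = N)
$$
\mathbf{f}(\mathbf{d}, \mathbf{m}) = \mathbf{0} = \mathbf{d} - \mathbf{G}\mathbf{m}
\tag{1.5}
$$
Using this form is equivalent to saying that the matrix F in Section
1.1.1 is
$$
\mathbf{F} = [\,\mathbf{I} \;\; -\mathbf{G}\,]
\tag{1.6}
$$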
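The link between the implicit and explicit linear forms can be checked numerically. The sketch below assumes Python with NumPy and an arbitrary, randomly generated data kernel G (purely illustrative). It takes F = [I, −G], the sign convention under which F applied to the stacked vector [d; m] reproduces d − Gm exactly; the opposite overall sign would describe the same constraint, since the result is set to zero.

```python
import numpy as np

N, M = 4, 2                      # number of data and of model parameters
rng = np.random.default_rng(0)
G = rng.normal(size=(N, M))      # arbitrary illustrative data kernel
m = rng.normal(size=M)           # arbitrary model parameters
d = G @ m                        # data that satisfy d = Gm exactly

# Implicit linear form: F is L x (M + N), with L = N in the explicit case.
F = np.hstack([np.eye(N), -G])
x = np.concatenate([d, m])       # stacked vector [d; m]

residual = F @ x                 # equals d - Gm, so it should vanish
print(F.shape, np.allclose(residual, 0.0))

# For data that do not fit the model, F x still equals d - Gm.
d_other = rng.normal(size=N)
misfit = F @ np.concatenate([d_other, m])
```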
1.2 The Linear Inverse Problem
The simplest and best-understood inverse problems are those that
can be represented with the explicit linear equation Gm = d. This
equation, therefore, forms the foundation of the study of discrete
inverse theory. As will be shown below, many important inverse
problems that arise in the physical sciences involve precisely this
equation. Others, while involving more complicated equations, can
often be solved through linear approximations.
The matrix G is called the data kernel, in analogy to the theory of
integral equations, in which the analogs of the data and model parame-
ters are two continuous functions d(x) and m(x), where x is some
independent variable. Continuous inverse theory lies between these
two extremes, with discrete data but a continuous model function:
Discrete inverse theory:
$$
d_i = \sum_{j=1}^{M} G_{ij} m_j
\tag{1.7a}
$$
Continuous inverse theory:
$$
d_i = \int G_i(x)\, m(x)\, dx
\tag{1.7b}
$$
Integral equation theory:
$$
d(y) = \int G(y, x)\, m(x)\, dx
\tag{1.7c}
$$
The main difference between discrete inverse theory, continuous
inverse theory, and integral equation theory is whether the model m
and data d are treated as continuous functions or discrete parameters.
The data $d_i$ in inverse theory are necessarily discrete, since inverse
theory is concerned with deducing information from observational
data, which always have a discrete nature. Both continuous inverse
problems and integral equations can be converted to discrete inverse
problems by approximating the integral as a summation using the
trapezoidal rule or some other quadrature formula.
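As a rough illustration of this conversion, the sketch below (Python with NumPy assumed) approximates the integrals of the continuous problem by the trapezoidal rule, so that the discrete data kernel has elements G[i, j] = w_j G_i(x_j) with quadrature weights w_j. The helper `discretize_kernel`, the three kernel functions, and the model function m(x) = x² are hypothetical choices made only so that the result can be compared with known integrals.

```python
import numpy as np

def discretize_kernel(kernels, x):
    """Build G with G[i, j] = w_j * G_i(x_j), using trapezoidal weights w_j."""
    dx = np.diff(x)
    w = np.zeros_like(x)
    w[:-1] += dx / 2.0           # each interval contributes half its width
    w[1:] += dx / 2.0            # to the node at either end
    return np.array([k(x) * w for k in kernels])

x = np.linspace(0.0, 1.0, 201)   # sample points on the interval [0, 1]

# Hypothetical kernels G_i(x), one per datum.
kernels = [
    lambda x: np.ones_like(x),
    lambda x: x,
    lambda x: np.exp(-x),
]

G = discretize_kernel(kernels, x)  # N x M matrix, here 3 x 201
m = x**2                           # model function sampled at the points x
d = G @ m                          # discrete approximation of the integrals

print(d)
```

The three entries of d approximate the integrals of x², x³, and x² e^(−x) over [0, 1], that is 1/3, 1/4, and 2 − 5/e, with an error that shrinks as the sampling is refined.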
1.3 Examples of Formulating
Inverse Problems
1.3.1 EXAMPLE 1: FITTING A STRAIGHT LINE
Suppose that N temperature measurements $T_i$ are made at depths $z_i$
in the earth. The data are then a vector d of N measurements of
temperature, where $\mathbf{d} = [T_1, T_2, T_3, \ldots, T_N]^T$. The depths $z_i$ are
not, strictly speaking, data. Instead, they provide some auxiliary infor-
mation that describes the geometry of the experiment. This distinction
will be further clarified below.
Suppose that we assume a model in which temperature is a linear
function of depth: T = a + bz. The intercept a and slope b then form
the two model parameters of the problem, $\mathbf{m} = [a, b]^T$. According to
the model, each temperature observation must satisfy T = a + bz:
$$
\begin{aligned}
T_1 &= a + b z_1 \\
T_2 &= a + b z_2 \\
&\;\;\vdots \\
T_N &= a + b z_N
\end{aligned}
\tag{1.8}
$$
These equations can be arranged as the matrix equation Gm = d:
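$$
\begin{bmatrix} T_1 \\ T_2 \\ \vdots \\ T_N \end{bmatrix}
=
\begin{bmatrix} 1 & z_1 \\ 1 & z_2 \\ \vdots & \vdots \\ 1 & z_N \end{bmatrix}
\begin{bmatrix} a \\ b \end{bmatrix}
\tag{1.9}
$$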
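The structure of this matrix equation can be sketched in a few lines of Python with NumPy. The depths and the values of a and b below are hypothetical, and the data are synthetic and noise-free. Solving with `np.linalg.lstsq` anticipates the least squares method, which the text has not yet introduced at this point; it is used here only to show that the G built from the depths recovers the model parameters.

```python
import numpy as np

# Hypothetical depths (auxiliary information) and true model parameters.
z = np.array([100.0, 200.0, 300.0, 400.0, 500.0])
a_true, b_true = 10.0, 0.03

# Synthetic, noise-free temperature data following T = a + b z.
d = a_true + b_true * z

# Data kernel for m = [a, b]^T: a column of ones and a column of depths.
G = np.column_stack([np.ones_like(z), z])   # shape N x 2

# One possible way to solve Gm = d (a least squares solver).
m_est, *_ = np.linalg.lstsq(G, d, rcond=None)

print(G)
print("estimated [a, b]:", m_est)
```

Note that the depths enter only through G, while the temperatures alone make up d, which is the distinction between auxiliary information and data drawn above.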
1.3.2 EXAMPLE 2: FITTING A PARABOLA
If the model in Example 1 is changed to assume a quadratic variation
of temperature with depth of the form $T = a + bz + cz^2$, then a
new model parameter is added to the problem, $\mathbf{m} = [a, b, c]^T$. The
number of model parameters is now M = 3. The data are supposed to
satisfy
$$
\begin{aligned}
T_1 &= a + b z_1 + c z_1^2 \\
T_2 &= a + b z_2 + c z_2^2 \\
&\;\;\vdots \\
T_N &= a + b z_N + c z_N^2
\end{aligned}
\tag{1.10}
$$