DISCRETE INVERSE AND STATE
ESTIMATION PROBLEMS
With Geophysical Fluid Applications

The problems of making inferences about the natural world from noisy observations
and imperfect theories occur in almost all scientific disciplines. This book addresses
these problems using examples taken from geophysical fluid dynamics. It focuses
on discrete formulations, both static and time-varying, known variously as inverse,
state estimation or data assimilation problems. Starting with fundamental algebraic
and statistical ideas, the book guides the reader through a range of inference tools
including the singular value decomposition, Gauss–Markov and minimum variance
estimates, Kalman filters and related smoothers, and adjoint (Lagrange multiplier)
methods. The final chapters discuss a variety of practical applications to geophysical
flow problems.
Discrete Inverse and State Estimation Problems: With Geophysical Fluid Appli-
cations is an ideal introduction to the topic for graduate students and researchers
in oceanography, meteorology, climate dynamics, geophysical fluid dynamics, and
any field in which models are used to interpret observations. It is accessible to
a wide scientific audience, as the only prerequisite is an understanding of linear
algebra.

Carl Wunsch is Cecil and Ida Green Professor of Physical Oceanography at the
Department of Earth, Atmospheric and Planetary Sciences, Massachusetts Institute
of Technology. After gaining his Ph.D. in geophysics in 1966 at MIT, he rose
through the department, becoming its head for the period 1977–81. He
subsequently served as Secretary of the Navy Research Professor and has held senior
visiting positions at many prestigious universities and institutes across the world.
His previous books include Ocean Acoustic Tomography (Cambridge University
Press, 1995) with W. Munk and P. Worcester, and The Ocean Circulation Inverse
Problem (Cambridge University Press, 1996).

CARL WUNSCH
Department of Earth, Atmospheric and Planetary Sciences
Massachusetts Institute of Technology

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press
The Edinburgh Building, Cambridge, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521854245

© C. Wunsch 2006

This publication is in copyright. Subject to statutory exception and to the provision of
relevant collective licensing agreements, no reproduction of any part may take place
without the written permission of Cambridge University Press.

First published in print format 2006

Cambridge University Press has no responsibility for the persistence or accuracy of URLs
for external or third-party internet websites referred to in this publication, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
To Walter Munk for decades of friendship and exciting collaboration.
Contents

Preface
Acknowledgements
Part I Fundamental machinery
1 Introduction
1.1 Differential equations
1.2 Partial differential equations
1.3 More examples
1.4 Importance of the forward model
2 Basic machinery
2.1 Background
2.2 Matrix and vector algebra
2.3 Simple statistics: regression
2.4 Least-squares
2.5 The singular vector expansion
2.6 Combined least-squares and adjoints
2.7 Minimum variance estimation and simultaneous equations
2.8 Improving recursively
2.9 Summary
Appendix 1. Maximum likelihood
Appendix 2. Differential operators and Green functions
Appendix 3. Recursive least-squares and Gauss–Markov solutions
3 Extensions of methods
3.1 The general eigenvector/eigenvalue problem
3.2 Sampling
3.3 Inequality constraints: non-negative least-squares
3.4 Linear programming
3.5 Empirical orthogonal functions
3.6 Kriging and other variants of Gauss–Markov estimation
3.7 Non-linear problems
4 The time-dependent inverse problem: state estimation
4.1 Background
4.2 Basic ideas and notation
4.3 Estimation
4.4 Control and estimation problems
4.5 Duality and simplification: the steady-state filter and adjoint
4.6 Controllability and observability
4.7 Non-linear models
4.8 Forward models
4.9 A summary
Appendix. Automatic differentiation and adjoints
5 Time-dependent methods – 2
5.1 Monte Carlo/ensemble methods
5.2 Numerical engineering: the search for practicality
5.3 Uncertainty in Lagrange multiplier method
5.4 Non-normal systems
5.5 Adaptive problems
Appendix. Doubling
Part II Applications
6 Applications to steady problems
6.1 Steady-state tracer distributions
6.2 The steady ocean circulation inverse problem
6.3 Property fluxes
6.4 Application to real oceanographic problems
6.5 Linear programming solutions
6.6 The β-spiral and variant methods
6.7 Alleged failure of inverse methods
6.8 Applications of empirical orthogonal functions (EOFs) (singular vectors)
6.9 Non-linear problems
7 Applications to time-dependent fluid problems
7.1 Time-dependent tracers
7.2 Global ocean states by Lagrange multiplier methods
7.3 Global ocean states by sequential methods
7.4 Miscellaneous approximations and applications
7.5 Meteorological applications
References
Index
Colour plates between pp. 182 and 183.
Preface

This book is to a large extent the second edition of The Ocean Circulation Inverse
Problem, but it differs from the original version in a number of ways. While teach-
ing the basic material at MIT and elsewhere over the past ten years, it became
clear that it was of interest to many students outside of physical oceanography –
the audience for whom the book had been written. The oceanographic material,
instead of being a motivating factor, was in practice an obstacle to understanding
for students with no oceanic background. In the revision, therefore, I have tried to
make the examples more generic and understandable, I hope, to anyone with even
rudimentary experience with simple fluid flows.
Also many of the oceanographic applications of the methods, which were still
novel and controversial at the time of writing, have become familiar and almost
commonplace. The oceanography, now confined to the two last chapters, is thus
focussed less on explaining why and how the calculations were done, and more on
summarizing what has been accomplished. Furthermore, the time-dependent prob-
lem (here called “state estimation” to distinguish it from meteorological practice)
has evolved rapidly in the oceanographic community from a hypothetical method-
ology to one that is clearly practical and in ever-growing use.
The focus is, however, on the basic concepts and not on the practical numerical
engineering required to use the ideas on the very large problems encountered with
real fluids. Anyone attempting to model the global ocean or atmosphere or equiv-
alent large scale system must confront issues of data storage, code parallelization,
truncation errors, grid refinement, and the like. Almost none of these important
problems are taken up here. Before constructive approaches to the practical prob-
lems can be found, one must understand the fundamental ideas. An analogy is the
need to understand the implications of Maxwell’s equations for electromagnetic
phenomena before one undertakes to build a high fidelity receiver. The effective
engineering of an electronic instrument can only be helped by good understanding
of how one works in principle, albeit the details of making one work in practice
can be quite different.
In the interests of keeping the book as short as possible, I have, however, omitted
some of the more interesting theoretical material of the original version, which
readers can find in the wider literature on control theory. It is assumed that the
reader has a familiarity at the introductory level with matrices and vectors, although
everything is ultimately defined in Chapter 2.
Finally, I have tried to correct the dismaying number of typographical and other
errors in the previous book, but have surely introduced others. Reports of errors of
any type will be gratefully received.
I thank the students and colleagues who over the years have suggested correc-
tions, modifications, and clarifications. My time and energies have been supported
financially by the National Aeronautics and Space Administration, and the National
Science Foundation through grants and contracts, as well as by the Massachusetts
Institute of Technology through the Cecil and Ida Green Professorship.
Acknowledgements

The following figures are reproduced by permission of the American Geophysical
Union: 4.8, 6.16–6.20, 6.23–6.26, 6.31, 7.3–7.5 and 7.7–7.10. Figures 2.16, 6.21,
6.27, 6.32, 7.1 and 7.11 are reproduced by permission of the American Meteorological
Society. Figure 6.22 is reproduced by permission of Kluwer.
Part I
Fundamental machinery
1
Introduction

The most powerful insights into the behavior of the physical world are obtained
when observations are well described by a theoretical framework that is then avail-
able for predicting new phenomena or new observations. An example is the observed
behavior of radio signals and their extremely accurate description by the Maxwell
equations of electromagnetic radiation. Other such examples include planetary mo-
tions through Newtonian mechanics, or the movement of the atmosphere and ocean
as described by the equations of fluid mechanics, or the propagation of seismic
waves as described by the elastic wave equations. To the degree that the theoretical
framework supports, and is supported by, the observations one develops sufficient
confidence to calculate similar phenomena in previously unexplored domains or to
make predictions of future behavior (e.g., the position of the moon in 1000 years,
or the climate state of the earth in 100 years).
Developing a coherent view of the physical world requires some mastery, there-
fore, of both a framework, and of the meaning and interpretation of real data.
Conventional scientific education, at least in the physical sciences, puts a heavy
emphasis on learning how to solve appropriate differential and partial differential
equations (Maxwell, Schrödinger, Navier–Stokes, etc.). One learns which problems
are “well-posed,” how to construct solutions either exactly or approximately, and
how to interpret the results. Much less emphasis is placed on the problems of under-
standing the implications of data, which are inevitably imperfect – containing noise
of various types, often incomplete, and possibly inconsistent and thus considered
mathematically “ill-posed” or “ill-conditioned.” When working with observations,
ill-posedness is the norm, not the exception.
Many interesting problems arise in using observations in conjunction with theory.
In particular, one is driven to conclude that there are no well-posed problems outside
of textbooks, that stochastic elements are inevitably present and must be confronted,
and that more generally, one must make inferences about the world from data that
are necessarily always incomplete. The main purpose of this introductory chapter
is to provide some comparatively simple examples of the type of problems one
confronts in practice, and for which many interesting and useful tools exist for their
solution. In an older context, this subject was called the “calculus of observations.”1
Here we refer to “inverse methods,” although many different approaches are so
labeled.

1.1 Differential equations


Differential equations are often used to describe natural processes. Consider the
elementary problem of finding the temperature in a bar where one end, at $r = r_A$, is
held at constant temperature $T_A$, and at the other end, $r = r_B$, it is held at temperature
$T_B$. The only mechanism for heat transfer within the bar is by molecular diffusion,
so that the governing equation is

$$\kappa \frac{d^2 T}{dr^2} = 0, \tag{1.1}$$

subject to the boundary conditions

$$T(r_A) = T_A, \quad T(r_B) = T_B. \tag{1.2}$$

Equation (1.1) is so simple we can write its solution in a number of different ways.
One form is

$$T(r) = a + br, \tag{1.3}$$

where $a$, $b$ are unknown parameters, until some additional information is provided.
Here the additional information is contained in the boundary conditions (1.2), and,
with two parameters to be found, there is just sufficient information, and

$$T(r) = \frac{r_B T_A - r_A T_B}{r_B - r_A} + \frac{T_B - T_A}{r_B - r_A}\, r, \tag{1.4}$$

which is a straight line. Such problems, or analogues for much more complicated
systems, are sometimes called “forward” or “direct” and they are “well-posed”:
exactly enough information is available to produce a unique solution insensitive to
perturbations in any element (easily proved here, not so easily in other cases). The
solution is both stable and differentiable. This sort of problem and its solution is
what is generally taught in elementary science courses.
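
As a minimal sketch of this forward problem, the two boundary conditions (1.2) can be treated as two linear equations in the two unknowns $a$, $b$ of (1.3) and compared with the closed form (1.4). The function name and the numerical values below are illustrative assumptions, not taken from the text.

```python
import numpy as np

def line_through_boundaries(r_a, t_a, r_b, t_b):
    """Return (a, b) in T(r) = a + b*r from T(r_a) = t_a and T(r_b) = t_b."""
    # Each boundary condition is one linear equation in the unknowns (a, b).
    E = np.array([[1.0, r_a],
                  [1.0, r_b]])
    y = np.array([t_a, t_b])
    a, b = np.linalg.solve(E, y)
    return a, b

# Hypothetical boundary data, chosen only for illustration.
r_a, t_a, r_b, t_b = 0.0, 10.0, 2.0, 30.0
a, b = line_through_boundaries(r_a, t_a, r_b, t_b)

# Coefficients of the closed-form solution (1.4), for comparison.
a_exact = (r_b * t_a - r_a * t_b) / (r_b - r_a)
b_exact = (t_b - t_a) / (r_b - r_a)
```

With exactly two conditions for two parameters the 2 × 2 system has a unique solution, which is the sense in which the problem is well-posed.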
On the other hand, the problems one encounters in actually doing science differ
significantly – both in the questions being asked, and in the information available.
For example:

1. One or both of the boundary values $T_A$, $T_B$ is known from measurements; they are thus
given as $T_A = T_A^{(c)} \pm \Delta T_A$, $T_B = T_B^{(c)} \pm \Delta T_B$, where the $\Delta T_{A,B}$ are an estimate of the
possible inaccuracies in the theoretical values $T_i^{(c)}$. (Exactly what that might mean is
taken up later.)
2. One or both of the positions, $r_{A,B}$, is also the result of measurement and are of the form
$r_{A,B}^{(c)} \pm \Delta r_{A,B}$.
3. $T_B$ is missing altogether, but is known to be positive, $T_B > 0$.
4. One of the boundary values, e.g., $T_B$, is unknown, but an interior value $T_{\mathrm{int}} = T_{\mathrm{int}}^{(c)} \pm \Delta T_{\mathrm{int}}$
is provided instead. Perhaps many interior values are known, but none of them
perfectly.

Other possibilities exist. But even this short list raises a number of interesting,
practical problems. One of the themes of this book is that almost nothing in reality
is known perfectly. It is possible that $\Delta T_A$, $\Delta T_B$ are very small; but as long as they
are not actually zero, there is no longer any possibility of finding a unique solution.
Many variations on this model and theme arise in practice. Suppose the problem
is made slightly more interesting by introducing a “source” $S_T(r)$, so that the
temperature field is thought to satisfy the equation

$$\frac{d^2 T(r)}{dr^2} = S_T(r), \tag{1.5}$$
along with its boundary conditions, producing another conventional forward problem.
One can convert (1.5) into a different problem by supposing that one knows
$T(r)$, and seeks $S_T(r)$. Such a problem is even easier to solve than the conventional
one: differentiate $T$ twice. Because convention dictates that the “forward”
or “direct” problem involves the determination of $T(r)$ from a known $S_T(r)$ and
boundary data, this latter problem might be labeled as an “inverse” one – simply
because it contrasts with the conventional formulation.
In practice, a whole series of new problems can be raised: suppose $S_T(r)$
is imperfectly known. How should one proceed? If one knows $S_T(r)$ and $T(r)$
at a series of positions $r_i \neq r_A, r_B$, could one nonetheless deduce the boundary
conditions? Could one deduce $S_T(r)$ if it were not known at these interior
values?
$T(r)$ has been supposed to satisfy the differential equation (1.5). For many
purposes, it is helpful to reduce the problem to one that is intrinsically discrete.
One way to do this would be to expand the solution in a system of polynomials,

$$T(r) = \alpha_0 r^0 + \alpha_1 r^1 + \cdots + \alpha_m r^m, \tag{1.6}$$

and

$$S_T(r) = \beta_0 r^0 + \beta_1 r^1 + \cdots + \beta_n r^n, \tag{1.7}$$

where the $\beta_i$ would conventionally be known, and the problem has been reduced
from the need to find a function $T(r)$ defined for all values of $r$, to one in which
only the finite number of parameters $\alpha_i$, $i = 0, 1, \ldots, m$, must be found.
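
The polynomial route can be sketched as follows, assuming $T$ satisfies (1.5). Substituting (1.6) and (1.7) into (1.5) and matching powers of $r$ gives $(k+2)(k+1)\,\alpha_{k+2} = \beta_k$, so that $m = n + 2$, and the two remaining coefficients $\alpha_0$, $\alpha_1$ follow from the boundary conditions. The function name and test values are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def poly_temperature(beta, r_a, t_a, r_b, t_b):
    """Coefficients alpha_0..alpha_m of (1.6), given beta_0..beta_n of (1.7)."""
    beta = np.asarray(beta, dtype=float)
    n = beta.size - 1
    alpha = np.zeros(n + 3)                  # m = n + 2
    k = np.arange(n + 1)
    # Matching powers of r in d2T/dr2 = S_T fixes alpha_2 .. alpha_m.
    alpha[2:] = beta / ((k + 2) * (k + 1))
    # Value of the part already determined, at the two boundaries.
    part_a = P.polyval(r_a, alpha)
    part_b = P.polyval(r_b, alpha)
    # The boundary conditions give two equations for alpha_0 and alpha_1.
    E = np.array([[1.0, r_a],
                  [1.0, r_b]])
    alpha[:2] = np.linalg.solve(E, np.array([t_a - part_a, t_b - part_b]))
    return alpha

# S_T = 2 with T(0) = 0, T(1) = 1 should give T = r**2.
alpha_quadratic = poly_temperature([2.0], 0.0, 0.0, 1.0, 1.0)
# S_T = 6 r with the same boundary values should give T = r**3.
alpha_cubic = poly_temperature([0.0, 6.0], 0.0, 0.0, 1.0, 1.0)
```

The unknown function has been replaced by a short vector of coefficients, found from a handful of linear equations.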
An alternative discretization is obtained by using the coordinate $r$. Divide the
interval $r_A = 0 \le r \le r_B$ into $N - 1$ intervals of length $\Delta r$, so that $r_B = (N - 1)\Delta r$.
Then, taking a simple one-sided difference:

$$
\begin{aligned}
T(2\Delta r) - 2T(\Delta r) + T(0) &= (\Delta r)^2 S_T(\Delta r), \\
T(3\Delta r) - 2T(2\Delta r) + T(1\Delta r) &= (\Delta r)^2 S_T(2\Delta r), \\
&\ \ \vdots \\
T((N-1)\Delta r) - 2T((N-2)\Delta r) + T((N-3)\Delta r) &= (\Delta r)^2 S_T((N-2)\Delta r).
\end{aligned}
\tag{1.8}
$$

If one counts the number of equations in (1.8) it is readily found that there are $N - 2$,
but with a total of $N$ unknown $T(p\Delta r)$. The two missing pieces of information are
provided by the two boundary conditions $T(0\Delta r) = T_0$, $T((N-1)\Delta r) = T_{N-1}$.
Thus the problem of solving the differential equation has been reduced to finding
the solution of a set of ordinary linear simultaneous algebraic equations, which we
will write, in the notation of Chapter 2, as

$$Ax = b, \tag{1.9}$$

where $A$ is a square matrix, $x$ is the vector of unknowns $T(p\Delta r)$, and $b$ is the vector
of source values $S_T(p\Delta r)$, and of boundary values. The list above, of variations, e.g., where
a boundary condition is missing, or where interior values are provided instead of
boundary conditions, then becomes statements about having too few, or possibly
too many, equations for the number of unknowns. Uncertainties in the $T_i$ or in
the $S_T(p\Delta r)$ become statements about having to solve simultaneous equations with
uncertainties in some elements. That models, even non-linear ones, can be reduced
to sets of simultaneous equations, is the unifying theme of this book. One might
need truly vast numbers of grid points, $p\Delta r$, or polynomial terms, and ingenuity in
the formulation to obtain adequate accuracy, but as long as the number of parameters
$N < \infty$, one has achieved a great, unifying simplification.
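
A minimal sketch of the reduction to (1.9): the $N - 2$ difference equations (1.8) and the two boundary conditions are stacked into one square system and solved. The row ordering, function name, and test source are assumptions made for illustration; the source $S_T = 2$ is chosen because $T = r^2$ then satisfies the difference equations exactly.

```python
import numpy as np

def build_system(source, t_first, t_last, dr):
    """Assemble A x = b from (1.8) plus the two boundary conditions.

    source[p] holds S_T(p*dr) for p = 0..N-1; only interior values are used.
    """
    n = len(source)
    A = np.zeros((n, n))
    b = np.zeros(n)
    # Rows 0..N-3: the N-2 difference equations, one per interior point p.
    for p in range(1, n - 1):
        A[p - 1, p - 1:p + 2] = [1.0, -2.0, 1.0]
        b[p - 1] = dr**2 * source[p]
    # Last two rows: boundary conditions T(0) = t_first, T((N-1)dr) = t_last.
    A[n - 2, 0] = 1.0
    b[n - 2] = t_first
    A[n - 1, n - 1] = 1.0
    b[n - 1] = t_last
    return A, b

N = 11
dr = 0.1
r = dr * np.arange(N)
source = np.full(N, 2.0)
A, b = build_system(source, t_first=0.0, t_last=r[-1] ** 2, dr=dr)
T = np.linalg.solve(A, b)

# Dropping one boundary condition leaves N-1 equations in N unknowns.
A_missing = A[:-1]
rank_missing = np.linalg.matrix_rank(A_missing)
```

Removing a boundary row is the discrete counterpart of item 3 in the list of variations: the system then has fewer independent equations than unknowns.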
Consider a little more interesting ordinary differential equation, that for the
simple mass–spring oscillator:

$$m \frac{d^2 \xi(t)}{dt^2} + \varepsilon \frac{d\xi(t)}{dt} + k_0\, \xi(t) = S_\xi(t), \tag{1.10}$$

where $m$ is mass, $k_0$ is a spring constant, and $\varepsilon$ is a dissipation parameter. Although
the equation is slightly more complicated than (1.5), and we have relabeled the
independent variable as $t$ (to suggest time), rather than as $r$, there really is no
fundamental difference. This differential equation can also be solved in any number
of ways. As a second-order equation, it is well-known that one must provide two
extra conditions to have enough information to have a unique solution. Typically,
there are initial conditions, $\xi(0)$, $d\xi(0)/dt$ – a position and velocity, but there is
nothing to prevent us from assigning two end conditions, $\xi(0)$, $\xi(t = t_f)$, or even
two velocity conditions $d\xi(0)/dt$, $d\xi(t_f)/dt$, etc.
If we naively discretize (1.10) as we did the straight-line equation, we have

$$
\xi(p\Delta t + \Delta t) - \left(2 - \frac{\varepsilon \Delta t}{m} - \frac{k_0 (\Delta t)^2}{m}\right) \xi(p\Delta t) - \left(\frac{\varepsilon \Delta t}{m} - 1\right) \xi(p\Delta t - \Delta t) = (\Delta t)^2\, \frac{S_\xi((p-1)\Delta t)}{m}, \quad 2 \le p \le N - 1, \tag{1.11}
$$
which is another set of simultaneous equations as in (1.9) in the unknown $\xi(p\Delta t)$;
an equation count again would show that there are two fewer equations than unknowns
– corresponding to the two boundary or two initial conditions. In Chapter 2,
several methods will be developed for solving sets of simultaneous linear equations,
even when there are apparently too few or too many of them. In the present case,
if one were given $\xi(0)$, $\xi(1\Delta t)$, Eq. (1.11) could be stepped forward in time, generating
$\xi(3\Delta t)$, $\xi(4\Delta t)$, $\ldots$, $\xi((N-1)\Delta t)$. The result would be identical to the
solution of the simultaneous equations – but with far less computation.
But if one were given $\xi((N-1)\Delta t)$ instead of $\xi(1\Delta t)$, such a simple time-stepping
rule could no longer be used. A similar difficulty would arise if $S_\xi(j\Delta t)$ were
missing for some $j$, but instead one had knowledge of $\xi(p\Delta t)$, for some $p$. Looked
at as a set of simultaneous equations, there is no conceptual problem: one simply
solves it, all at once, by Gaussian elimination or equivalent. There is a problem
only if one sought to time-step the equation forward, but without the required
second condition at the starting point – there would be inadequate information to
go forward in time. Many of the so-called inverse methods explored in this book
are ways to solve simultaneous equations while avoiding the need for all-at-once
brute-force solution. Nonetheless, one is urged to always recall that most of the
interesting algorithms are just clever ways of solving large sets of such equations.
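
The equivalence of time-stepping and all-at-once solution can be sketched as below. The parameter values, the zero forcing, and the function names are illustrative assumptions; in addition, the recursion is started at $p = 1$ here (an assumption about the index range) so that every value after the two starting ones is generated.

```python
import numpy as np

m, eps, k0 = 1.0, 0.1, 1.0        # hypothetical mass, dissipation, spring constant
dt, N = 0.1, 50
forcing = np.zeros(N)             # S_xi(p*dt); unforced in this sketch

c1 = 2.0 - eps * dt / m - k0 * dt**2 / m   # coefficient of xi(p dt) in (1.11)
c2 = eps * dt / m - 1.0                    # coefficient of xi(p dt - dt) in (1.11)

def rhs(p):
    """Right-hand side of (1.11) for index p."""
    return dt**2 * forcing[p - 1] / m

def step_forward(xi0, xi1):
    """Time-step (1.11) from the two starting values xi(0), xi(dt)."""
    xi = np.zeros(N)
    xi[0], xi[1] = xi0, xi1
    for p in range(1, N - 1):
        xi[p + 1] = c1 * xi[p] + c2 * xi[p - 1] + rhs(p)
    return xi

def solve_all_at_once(xi0, cond_index, cond_value):
    """Solve the N-2 difference equations together with
    xi[0] = xi0 and xi[cond_index] = cond_value as one system A x = b."""
    A = np.zeros((N, N))
    b = np.zeros(N)
    for p in range(1, N - 1):
        A[p - 1, p + 1] = 1.0
        A[p - 1, p] = -c1
        A[p - 1, p - 1] = -c2
        b[p - 1] = rhs(p)
    A[N - 2, 0] = 1.0
    b[N - 2] = xi0
    A[N - 1, cond_index] = 1.0
    b[N - 1] = cond_value
    return np.linalg.solve(A, b)

xi_stepped = step_forward(0.0, 0.1)
# Same two starting values, solved as simultaneous equations.
xi_initial = solve_all_at_once(0.0, 1, 0.1)
# Final value given instead of xi(dt): time-stepping is unavailable,
# but the simultaneous equations are solved in exactly the same way.
xi_end = solve_all_at_once(0.0, N - 1, xi_stepped[-1])
```

Only the last row of the matrix changes between the initial-value and end-value versions; the solver does not care which two conditions were supplied.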

1.2 Partial differential equations


Finding the solutions of linear differential equations is equivalent, when discretized,
to solving sets of simultaneous linear algebraic equations. Unsurprisingly, the same
is true of partial differential equations. As an example, consider a very familiar
problem:
Solve

$$\nabla^2 \phi = \rho, \tag{1.12}$$

for $\phi$, given $\rho$, in the domain $r \in D$, subject to the boundary conditions $\phi = \phi_0$ on
the boundary $\partial D$, where $r$ is a spatial coordinate of dimension greater than 1.
This statement is the Dirichlet problem for the Laplace–Poisson equation, whose
solution is well-behaved, unique, and stable to perturbations in the boundary data,
$\phi_0$, and the source or forcing, $\rho$. Because it is the familiar boundary value problem,
it is by convention again labeled a forward or direct problem. Now consider a
different version of the above:
Solve (1.12) for $\rho$ given $\phi$ in the domain $D$.
This latter problem is again easier to solve than the forward one: differentiate
$\phi$ twice to obtain the Laplacian, and $\rho$ is obtained from (1.12). Because the problem
as stated is inverse to the conventional forward one, it is labeled, as with the
ordinary differential equation, an inverse problem. It is inverse to a more familiar
boundary value problem in the sense that the usual unknowns $\phi$ have been inverted
or interchanged with (some of) the usual knowns $\rho$. Notice that both forward and
inverse problems, as posed, are well-behaved and produce uniquely determined
answers (ruling out mathematical pathologies in any of $\rho$, $\phi_0$, $\partial D$, or $\phi$). Again,
there are many variations possible: one could, for example, demand computation
of the boundary conditions, $\phi_0$, from given information about some or all of $\phi$, $\rho$.
Write the Laplace–Poisson equation in finite difference form for two Cartesian
dimensions:

$$\phi_{i+1,j} - 2\phi_{i,j} + \phi_{i-1,j} + \phi_{i,j+1} - 2\phi_{i,j} + \phi_{i,j-1} = (\Delta x)^2 \rho_{ij}, \quad i, j \in D, \tag{1.13}$$

with square grid elements of dimension $\Delta x$. To make the bookkeeping as simple
as possible, suppose the domain $D$ is the square $N \times N$ grid displayed in Fig. 1.1,
so that $\partial D$ is the four line segments shown. There are $(N - 2) \times (N - 2)$ interior
grid points, and Eqs. (1.13) are then $(N - 2) \times (N - 2)$ equations in $N^2$ of the $\phi_{ij}$.
If this is the forward problem with $\rho_{ij}$ specified, there are fewer equations than
unknowns. But appending the set of boundary conditions to (1.13):

$$\phi_{ij} = \phi^0_{ij}, \quad i, j \in \partial D, \tag{1.14}$$
there are precisely $4N - 4$ of these conditions, and thus the combined set (1.13)
plus (1.14), written as (1.9) with

$$x = \mathrm{vec}\{\phi_{ij}\} = [\, \phi_{11} \ \ \phi_{12} \ \ \ldots \ \ \phi_{NN} \,]^T,$$

$$b = \mathrm{vec}\{\rho_{ij}, \phi^0_{ij}\} = [\, \rho_{22} \ \ \rho_{23} \ \ \ldots \ \ \rho_{N-1,N-1} \ \ \phi^0_{11} \ \ \ldots \ \ \phi^0_{N,N} \,]^T,$$

is a set of $M = N^2$ equations in $M = N^2$ unknowns. (The operator, vec,
forms a column vector out of the two-dimensional array $\phi_{ij}$; the superscript $T$ is the
vector transpose, defined in Chapter 2.) The nice properties of the Dirichlet problem
can be deduced from the well-behaved character of the matrix $A$. Thus the forward
problem corresponds directly with the solution of an ordinary set of simultaneous
algebraic equations.2 One complementary inverse problem says: “Using (1.9)
compute $\rho_{ij}$ and the boundary conditions, given $\phi_{ij}$,” which is an even simpler
computation – it involves just multiplying the known $x$ by the known matrix $A$.

Figure 1.1 Square, homogeneous grid used for discretizing the Laplacian, thus
reducing the partial differential equation to a set of linear simultaneous equations.
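
A small sketch of the discrete Dirichlet problem, assuming a row-by-row ordering of the unknowns, interior equations listed before boundary equations, and a test field $\phi = x^2 + y^2$ whose Laplacian is 4 everywhere. These choices and the function name are illustrative only.

```python
import numpy as np

def dirichlet_system(N):
    """Matrix A of (1.13) plus (1.14) on an N x N grid.

    Unknown phi_ij sits at position i*N + j of x; the (N-2)**2 interior
    equations come first, followed by the 4N-4 boundary conditions.
    """
    idx = lambda i, j: i * N + j
    A = np.zeros((N * N, N * N))
    row = 0
    for i in range(1, N - 1):
        for j in range(1, N - 1):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                A[row, idx(i + di, j + dj)] = 1.0
            A[row, idx(i, j)] = -4.0
            row += 1
    n_interior = row
    for i in range(N):
        for j in range(N):
            if i in (0, N - 1) or j in (0, N - 1):
                A[row, idx(i, j)] = 1.0      # phi_ij = boundary value
                row += 1
    return A, n_interior

N = 6
dx = 1.0 / (N - 1)
A, n_interior = dirichlet_system(N)
rank = np.linalg.matrix_rank(A)

# Inverse problem: given phi everywhere, multiply by A to obtain b,
# whose entries are (dx**2)*rho at interior points and the boundary values.
i, j = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
phi_true = ((i * dx) ** 2 + (j * dx) ** 2).ravel()
b = A @ phi_true
rho = b[:n_interior] / dx**2

# Forward problem: given b (sources and boundary values), solve for phi.
phi = np.linalg.solve(A, b)
```

The matrix has full rank $N^2$, which is the discrete expression of the uniqueness of the Dirichlet solution.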
But now let us make one small change in the forward problem, making it the
Neumann one:
Solve

$$\nabla^2 \phi = \rho, \tag{1.15}$$

for $\phi$, given $\rho$, in the domain $r \in D$ subject to the boundary conditions $\partial\phi/\partial\hat{m} = \phi_0$
on the boundary $\partial D$, where $r$ is the spatial coordinate and $\hat{m}$ is the unit normal to
the boundary.
This new problem is another classical, much analyzed forward problem. It is,
however, well-known that the solution is indeterminate up to an additive constant.
This indeterminacy is clear in the discrete form: Eqs. (1.14) are now replaced by

$$\phi_{i+1,j} - \phi_{i,j} = \phi^0_{ij}, \quad i, j \in \partial D', \tag{1.16}$$

etc., where $\partial D'$ represents the set of boundary indices necessary to compute the
local normal derivative. There is a new combined set:

$$Ax = b_1, \quad x = \mathrm{vec}\{\phi_{ij}\}, \quad b_1 = \mathrm{vec}\{\rho_{ij}, \phi^0_{ij}\}. \tag{1.17}$$

Because only differences of the $\phi_{ij}$ are specified, there is no information concerning
the absolute value of $x$. When some machinery is obtained in Chapter 2, we
will be able to demonstrate automatically that even though (1.17) appears to be M
equations in M unknowns, in fact only M − 1 of the equations are independent,
and thus the Neumann problem is an underdetermined one. This property of the
Neumann problem is well-known, and there are many ways of handling it, either
in the continuous or discrete forms. In the discrete form, a simple way is to add
one equation setting the value at any point to zero (or anything else). A further
complication with the Neumann problem is that it can be set up as a contradiction –
even while underdetermined – if the flux boundary conditions do not balance the in-
terior sources. The simultaneous presence of underdetermination and contradiction
is commonplace in real problems.
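
The rank deficiency, the one-extra-equation remedy, and the possibility of contradiction can all be seen numerically in a small sketch. The particular boundary differencing used here (each boundary point minus its nearest interior point, with the diagonal neighbour at corners), the unit grid spacing, and the function name are assumptions made for illustration.

```python
import numpy as np

def neumann_system(N):
    """Matrix A of (1.13) plus one-sided boundary differences as in (1.16)."""
    idx = lambda i, j: i * N + j
    A = np.zeros((N * N, N * N))
    row = 0
    for i in range(1, N - 1):
        for j in range(1, N - 1):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                A[row, idx(i + di, j + dj)] = 1.0
            A[row, idx(i, j)] = -4.0
            row += 1
    n_interior = row
    for i in range(N):
        for j in range(N):
            if i in (0, N - 1) or j in (0, N - 1):
                # Nearest interior point (the diagonal neighbour at corners).
                ii = min(max(i, 1), N - 2)
                jj = min(max(j, 1), N - 2)
                A[row, idx(i, j)] = 1.0
                A[row, idx(ii, jj)] = -1.0
                row += 1
    return A, n_interior

N = 6
M = N * N
A, n_interior = neumann_system(N)
rank = np.linalg.matrix_rank(A)            # only M - 1 independent equations

# Remedy: append one equation fixing the value at a single point.
pin = np.zeros((1, M))
pin[0, 0] = 1.0
rank_pinned = np.linalg.matrix_rank(np.vstack([A, pin]))

# A consistent right-hand side determines phi only up to a constant.
rng = np.random.default_rng(0)
x_true = rng.standard_normal(M)
x_ls = np.linalg.lstsq(A, A @ x_true, rcond=None)[0]
spread = np.ptp(x_true - x_ls)             # difference is a constant offset

# Contradiction: uniform interior source with zero boundary flux.
b_bad = np.zeros(M)
b_bad[:n_interior] = 1.0
x_bad = np.linalg.lstsq(A, b_bad, rcond=None)[0]
misfit = np.linalg.norm(A @ x_bad - b_bad)
```

Every row of this matrix sums to zero, so a constant vector lies in its null space; the unbalanced source then leaves a residual that no choice of $\phi$ can remove.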

1.3 More examples


A tracer box model
In scientific practice, one often has observations of elements of the solution of the
differential system or other model. Such situations vary enormously in the com-
plexity and sophistication of both the data and the model. A useful and interesting
example of a simple system, with applications in many fields, is one in which there
is a large reservoir (Fig. 1.2) connected to a number of source regions which provide
fluid to the reservoir. One would like to determine the rate of mass transfer from
each source region to the reservoir.
Suppose that some chemical tracer or dye, $C_0$, is measured in the reservoir, and
that the concentrations of the dye, $C_i$, in each source region are known. Let the
unknown transfer rates be $J_{i0}$ (transfer from source $i$ to reservoir 0). Then we must
have

$$C_1 J_{10} + C_2 J_{20} + \cdots + C_N J_{N0} = C_0 J_{0\infty}, \tag{1.18}$$

which says that, for a steady state, the rate of transfer in must equal the rate of
transfer out (written $J_{0\infty}$). To conserve mass,

$$J_{10} + J_{20} + \cdots + J_{N0} = J_{0\infty}. \tag{1.19}$$

This model has produced two equations in $N + 1$ unknowns, $[J_{10}, J_{20}, \ldots, J_{N0}, J_{0\infty}]$,
which evidently is insufficient information if $N > 1$. The equations
have also been written as though everything were perfect. If, for example, the
tracer concentrations $C_i$ were measured with finite precision and accuracy (they
always are), the resulting inaccuracy might be accommodated as

$$C_1 J_{10} + C_2 J_{20} + \cdots + C_N J_{N0} + n = C_0 J_{0\infty}, \tag{1.20}$$

where $n$ represents the resulting error in the equation. Its introduction produces
another unknown. If the reservoir were capable of some degree of storage or fluctuation
in level, an error term could be introduced into (1.19) as well. One should
also notice that, as formulated, one of the apparently infinite number of solutions
to Eqs. (1.18, 1.19) includes $J_{i0} = J_{0\infty} = 0$ – no flow at all. More information is
required if this null solution is to be excluded.

Figure 1.2 A simple reservoir problem in which there are multiple sources of
flow, at rates $J_{i0}$, each carrying an identifiable property $C_i$, perhaps a chemical
concentration. In the forward problem, given $J_{i0}$, $C_i$ one could calculate $C_0$. One
form of inverse problem provides $C_0$ and the $C_i$ and seeks the values of $J_{i0}$.
To make the problem slightly more interesting, suppose that the tracer $C$ is
radioactive, and diminishes with a decay constant $\lambda$. Equation (1.18) becomes

$$C_1 J_{10} + C_2 J_{20} + \cdots + C_N J_{N0} - C_0 J_{0\infty} = -\lambda C_0. \tag{1.21}$$

If $C_0 > 0$, $J_{i0} = J_{0\infty} = 0$ is no longer a possible solution, but there remain many more
unknowns than equations. These equations are once again in the canonical linear
form $Ax = b$.
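
As a sketch of how underdetermined the box model is, the two equations (1.21) and (1.19) can be written as a $2 \times (N+1)$ system. The concentrations and decay constant are hypothetical, and the minimum-norm solution from the pseudo-inverse is used purely as one convenient member of the solution family, not as a method attributed to the text; nothing here forces the transfer rates to be positive.

```python
import numpy as np

C_sources = np.array([1.0, 2.0, 4.0])   # C_1..C_N, hypothetical values
C0 = 2.5                                # reservoir concentration, hypothetical
lam = 0.1                               # decay constant, hypothetical

N = C_sources.size
# Unknowns x = [J_10, ..., J_N0, J_0inf].
A = np.vstack([
    np.append(C_sources, -C0),          # tracer balance with decay, (1.21)
    np.append(np.ones(N), -1.0),        # mass conservation, (1.19)
])
b = np.array([-lam * C0, 0.0])

rank = np.linalg.matrix_rank(A)         # 2 equations, N + 1 = 4 unknowns

# One particular solution: the minimum-norm one.
x_min_norm = np.linalg.pinv(A) @ b

# Rows of Vt beyond the rank span the null space of A.
_, _, Vt = np.linalg.svd(A)
null_vectors = Vt[rank:]
# Adding any null-space vector gives another exact solution.
x_other = x_min_norm + 3.0 * null_vectors[0]
```

Both vectors satisfy the two equations exactly, so the data alone cannot choose between them; additional information is needed to do so.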

Figure 1.3 Generic tomographic problem in two dimensions. Measurements are
made by integrating through an otherwise impenetrable solid between the transmitting
sources and receivers using x-rays, sound, radio waves, etc. Properties
can be anything measurable, including travel times, intensities, group velocities,
etc., as long as they are functions of the parameters sought (such as the density or
sound speed). The tomographic problem is to reconstruct the interior from these
integrals. In the particular configuration shown, the source and receiver are sup-
posed to revolve so that a very large number of paths can be built up. It is also
supposed that the division into small rectangles is an adequate representation. In
principle, one can have many more integrals than the number of squares defining
the unknowns.

A tomographic problem
So-called tomographic problems occur in many fields, most notably in medicine, but
also in materials testing, oceanography, meteorology, and geophysics. Generically,
they arise when one is faced with the problem of inferring the distribution of
properties inside an area or volume based upon a series of integrals through the
region. Consider Fig. 1.3, where, to be specific, suppose we are looking at the top
of the head of a patient lying supine in a so-called CAT-scanner. The two external
shell sectors represent a source of x-rays, and a set of x-ray detectors. X-rays are
emitted from the source and travel through the patient along the indicated lines
where the intensity of the received beam is measured. Let the absorptivity/unit
length within the patient be a function, c(r), where r is the vector position within
the patient’s head. Consider one source at rs and a receptor at re connected by the
path as indicated. Then the intensity measured at the receptor is

\[ I(\mathbf{r}_s, \mathbf{r}_e) = \int_{\mathbf{r}_s}^{\mathbf{r}_e} c(\mathbf{r}(s))\, ds, \tag{1.22} \]

where s is the arc length along the path. The basic tomographic problem is to
determine c(r) for all r in the patient, from measurements of I. c can be a function
of both position and the physical parameters of interest. In the medical problem, the
shell sectors rotate around the patient, and an enormous number of integrals along
(almost) all possible paths are obtained. An analytical solution to this problem,
as the number of paths becomes infinite, is produced by the Radon transform.3
Given that tumors and the like have a different absorptivity to normal tissue, the
reconstructed image of c(r) permits physicians to “see” inside the patient. In most
other situations, however, the number of paths tends to be much smaller than the
formal number of unknowns and other solution methods must be found.

Figure 1.4 Simplified geometry for defining a tomographic problem. Some squares
may have no integrals passing through them; others may be multiply-covered.
Boxes outside the physical body can be handled in a number of ways, including
the addition of constraints setting the corresponding cj = 0.

Note first, however, that Eq. (1.22) should be modified to reflect the inability
of any system to produce a perfect measurement of the integral, and so, more
realistically,

\[ I(\mathbf{r}_s, \mathbf{r}_e) = \int_{\mathbf{r}_s}^{\mathbf{r}_e} c(\mathbf{r}(s))\, ds + n(\mathbf{r}_s, \mathbf{r}_e), \tag{1.23} \]

where n is the measurement noise.


To proceed, surround the patient with a bounding square (Fig. 1.4) – to produce a
simple geometry – and divide the area into sub-squares as indicated, each numbered
in sequence, j = 1, 2, . . . , N . These squares are supposed sufficiently small that
c (r) is effectively constant within them. Also number the paths, i = 1, 2, . . . , M.
Then Eq. (1.23) can be approximated with arbitrary accuracy (by letting the sub-
square dimensions become arbitrarily small) as

\[ I_i = \sum_{j=1}^{N} c_j r_{ij} + n_i. \tag{1.24} \]

Here ri j is the arc length of path i within square j (most of them will vanish for
any particular path). These last equations are of the form
Ex + n = y, (1.25)
where E = {rij}, x = [cj], y = [Ii], n = [ni]. Quite commonly there are many
more unknown c j than there are integrals Ii . (In the present context, there is no
distinction made between using matrices A, E. E will generally be used where noise
elements are present, and A where none are intended.)
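A very small instance makes the structure of Eq. (1.25) concrete. The sketch below is an illustration under stated assumptions, not a configuration from the text: a 2 × 2 grid of unit squares is crossed by two horizontal and two vertical straight paths, the absorptivities are made-up numbers, and the noise is set to zero. Even with as many paths as unknowns, a checkerboard pattern leaves every integral unchanged, so the data cannot determine c uniquely.

```python
def matvec(E, x):
    """Return E x for a matrix stored as a list of rows."""
    return [sum(e * xi for e, xi in zip(row, x)) for row in E]


# 2 x 2 grid of unit squares, numbered j = 1..4 row by row:
#   [1 2]
#   [3 4]
# r_ij is the length of path i inside square j.
E = [
    [1.0, 1.0, 0.0, 0.0],   # path 1: through the top row
    [0.0, 0.0, 1.0, 1.0],   # path 2: through the bottom row
    [1.0, 0.0, 1.0, 0.0],   # path 3: through the left column
    [0.0, 1.0, 0.0, 1.0],   # path 4: through the right column
]

c_true = [1.0, 2.0, 3.0, 4.0]            # illustrative absorptivities c_j
y = matvec(E, c_true)                    # noise-free integrals I_i

# A checkerboard pattern is invisible to these four paths: E times it is zero.
null = [1.0, -1.0, -1.0, 1.0]
c_other = [c + 5.0 * v for c, v in zip(c_true, null)]
y_other = matvec(E, c_other)
print(y, matvec(E, null), y_other)
```

The two different interiors c_true and c_other produce identical data, which is the kind of ambiguity that the later machinery is designed to describe.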
Tomographic measurements do not always consist of x-ray intensities. In seis-
mology or oceanography, for example, c j is commonly 1/v j , where v j is the speed
of sound or seismic waves within the area; I is then a travel time rather than an inten-
sity. The equations remain the same, however. This methodology also works in three
dimensions, the paths need not be straight lines, and there are many generalizations.4
A problem of great practical importance is determining what one can say about the
solutions to Eqs. (1.25) even where many more unknowns exist than formal pieces
of information yi .
As with all these problems, many other forms of discretization are possible. For
example, the continuous function c (r) can be expanded:

\[ c(\mathbf{r}) = \sum_{n} \sum_{m} a_{nm} T_n(r_x) T_m(r_y), \tag{1.26} \]
where r = (rx, ry), and the Tn are any suitable expansion functions (sines and
cosines, Chebyshev polynomials, etc.). The linear equations (1.25) then represent
constraints leading to the determination of the anm.

A second tracer problem


Consider the closed volume in Fig. 1.5 enclosed by four boundaries as shown.
There are steady flows, vi (z) , i = 1, . . . , 4, either into or out of the volume, each
carrying a corresponding fluid of constant density ρ0. z is the vertical coordinate.
If the width of each boundary is li , the statement that mass is conserved within the
volume is simply
\[ \sum_{i=1}^{4} l_i \rho_0 \int_{-h}^{0} v_i(z)\, dz = 0, \tag{1.27} \]

where the convention is made that flows into the box are positive, and flows out
are negative. z = −h is the lower boundary of the volume and z = 0 is the top
one. If the vi are unknown, Eq. (1.27) represents one equation (constraint) in four
unknowns:
\[ \int_{-h}^{0} v_i(z)\, dz, \quad 1 \le i \le 4. \tag{1.28} \]

Figure 1.5 Volume of fluid bounded on four open vertical and two horizontal sides
across which fluid is supposed to flow. Mass is conserved, giving one relationship
among the fluid transports vi ; conservation of one or more other tracers Ci leads
to additional useful relationships.

One possible, if boring, solution is vi (z) = 0. To make the problem somewhat
more interesting, suppose that, for some mysterious reason, the vertical derivatives,
$v_i'(z) = dv_i(z)/dz$, are known so that
\[ v_i(z) = \int_{z_0}^{z} v_i'(z')\, dz' + b_i(z_0), \tag{1.29} \]

where z 0 is a convenient place to start the integration (but can be any value).
bi are integration constants (bi = vi (z 0 )) that remain unknown. Constraint (1.27)
becomes


\[ \sum_{i=1}^{4} l_i \rho_0 \int_{-h}^{0} \left[ \int_{z_0}^{z} v_i'(z')\, dz' + b_i(z_0) \right] dz = 0, \tag{1.30} \]

or

\[ \sum_{i=1}^{4} h l_i b_i(z_0) = - \sum_{i=1}^{4} l_i \int_{-h}^{0} dz \int_{z_0}^{z} v_i'(z')\, dz', \tag{1.31} \]

where the right-hand side is known. Equation (1.31) is still one equation in four
unknown bi , but the zero-solution is no longer possible, unless the right-hand side
vanishes. Equation (1.31) is a statement that the weighted average of the bi on the
left-hand side is known. If one seeks to obtain estimates of the bi separately, more
information is required.
Suppose that information pertains to a tracer, perhaps a red dye, known to be
conservative, and that the box concentration of red dye, C, is known to be in a
steady state. Then conservation of C becomes
\[ \sum_{i=1}^{4} \left[ h l_i \int_{-h}^{0} C_i(z)\, dz \right] b_i = - \sum_{i=1}^{4} l_i \int_{-h}^{0} dz \int_{z_0}^{z} C_i(z') v_i'(z')\, dz', \tag{1.32} \]

where Ci (z) is the concentration of red dye on each boundary. Equation (1.32)
provides a second relationship for the four unknown bi . One might try to measure
another dye concentration, perhaps green dye, and write an equation for this second
tracer, exactly analogous to (1.32). With enough such dye measurements, there
might be more constraint equations than unknown bi . In any case, no matter how
many dyes are measured, the resulting equation set is of the form (1.9). The number
of boundaries is not limited to four, but can be either fewer, or many more.5

Vibrating string
Consider a uniform vibrating string anchored at its ends r x = 0, r x = L . The free
motion of the string is governed by the wave equation
\[ \frac{\partial^2 \eta}{\partial r_x^2} - \frac{1}{c^2} \frac{\partial^2 \eta}{\partial t^2} = 0, \quad c^2 = T/\rho, \tag{1.33} \]
where T is the tension and ρ the density. Free modes of vibration (eigen-frequencies)
are found to exist at discrete frequencies, sq ,
\[ 2\pi s_q = \frac{q \pi c}{L}, \quad q = 1, 2, 3, \ldots, \tag{1.34} \]
which is the solution to a classical forward problem. A number of interesting and
useful inverse problems can be formulated. For example, given sq ± Δsq , q =
1, 2, . . . , M, to determine L or c. These are particularly simple problems, because
there is only one parameter, either c or L, to determine. More generally, it is obvious
from Eq. (1.34) that one has information only about the ratio c/L – they could not
be determined separately.
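The statement that only the ratio c/L is determined can be verified numerically. The following sketch uses made-up values of c and L and hypothetical function names; it computes the forward frequencies from Eq. (1.34) and then recovers c/L by an ordinary least-squares fit of 2sq = q (c/L), one simple choice of estimator among many.

```python
def mode_frequencies(c, L, n_modes):
    """Forward problem, Eq. (1.34): s_q = q c / (2 L) for q = 1..n_modes."""
    return [q * c / (2.0 * L) for q in range(1, n_modes + 1)]


def estimate_ratio(freqs):
    """Least-squares estimate of c/L from the model 2 s_q = q (c/L)."""
    num = sum(q * 2.0 * s for q, s in enumerate(freqs, start=1))
    den = sum(q * q for q in range(1, len(freqs) + 1))
    return num / den


s_a = mode_frequencies(c=100.0, L=2.0, n_modes=5)
s_b = mode_frequencies(c=200.0, L=4.0, n_modes=5)   # different c and L, same c/L
ratio = estimate_ratio(s_a)
print(s_a == s_b, ratio)
```

Doubling both c and L leaves every frequency unchanged, so the fit returns the ratio and nothing more.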
Suppose, however, that the density varies along the string, ρ = ρ(r x ), so that
c = c (r x ). Then, it may be confirmed that the observed frequencies are no longer
given by Eq. (1.34), but by expressions involving the integral of c over the length
of the string. An important problem is then to infer c(r x ), and hence ρ(r x ). One
might wonder whether, under these new circumstances, L can be determined inde-
pendently of c?
A host of such problems exist, in which the observed frequencies of free modes
are used to infer properties of media in one to three dimensions. The most elaborate
applications are in geophysics and solar physics, where the normal mode frequen-
cies of the vibrating whole Earth or Sun are used to infer the interior properties
(density, elastic parameters, magnetic field strength, etc.).6 A good exercise is to
render the spatially variable string problem into discrete form.

1.4 Importance of the forward model


Inference about the physical world from data requires assertions about the structure
of the data and its internal relationships. One sometimes hears claims from people
who are expert in measurements that “I don’t use models.” Such a claim is almost
always vacuous. What the speaker usually means is that he doesn’t use equations, but
is manipulating his data in some simple way (e.g., forming an average) that seems
to be so unsophisticated that no model is present. Consider, however, a simple
problem faced by someone trying to determine the average temperature in a room.
A thermometer is successively placed at different three-dimensional locations, ri ,
at times ti . Let the measurements be yi and the value of interest be

\[ \tilde{m} = \frac{1}{M} \sum_{i=1}^{M} y_i. \tag{1.35} \]

In deciding to compute, and use, m̃ the observer has probably made a long list of very
sophisticated, but implicit, model assumptions. Among them we might suggest: (1)
Thermometers actually measure the length of a fluid, or an oscillator frequency, or
a voltage and require knowledge of the relation to temperature as well as potentially
elaborate calibration methods. (2) That the temperature in the room is sufficiently
slowly changing that all of the ti can be regarded as effectively identical. A different
observer might suggest that the temperature in the room is governed by shock waves
bouncing between the walls at intervals of seconds or less. Should that be true, m̃
constructed from the available samples might prove completely meaningless. It
might be objected that such an hypothesis is far-fetched. But the assumption that
the room temperature is governed, e.g., by a slowly evolving diffusion process, is
a specific, and perhaps incorrect model. (3) That the errors in the thermometer are
such that the best estimate of the room mean temperature is obtained by the simple
sum in Eq. (1.35). There are many measurement devices for which this assumption
is a very poor one (perhaps the instrument is drifting, or has a calibration that varies
with temperature), and we will discuss how to determine averages in Chapter 2. But
the assumption that property m̃ is useful, is a strong model assumption concerning
both the instrument being used and the physical process it is measuring.
This list can be extended, but more generally, the inverse problems listed earlier
in this chapter only make sense to the degree that the underlying forward model
is likely to be an adequate physical description of the observations. For example,
if one is attempting to determine ρ in Eq. (1.15) by taking the Laplacian ∇²φ,
(analytically or numerically), the solution to the inverse problem is only sensible if
this equation really represents the correct governing physics. If the correct equation
to use were, instead,
\[ \frac{\partial^2 \phi}{\partial r_x^2} + \frac{1}{2} \frac{\partial \phi}{\partial r_y} = \rho, \tag{1.36} \]
where r y is another coordinate, the calculated value of ρ would be incorrect. One
might, however, have good reason to use Eq. (1.15) as the most likely hypothesis,
but nonetheless remain open to the possibility that it is not an adequate descrip-
tor of the required field, ρ. A good methodology, of the type to be developed in
subsequent chapters, permits posing the question: is my model consistent with the
data? If the answer to the question is “yes,” a careful investigator would never
claim that the resulting answer is the correct one and that the model has been “val-
idated” or “verified.” One claims only that the answer and the model are consistent
with the observations, and remains open to the possibility that some new piece
of information will be obtained that completely invalidates the model (e.g., some
direct measurements of ρ showing that the inferred value is simply wrong). One
can never validate or verify a model; one can only show consistency with existing
observations.7

Notes
1 Whittaker and Robinson (1944).
2 Lanczos (1961) has a much fuller discussion of this correspondence.
3 Herman (1980).
4 Herman (1980); Munk et al. (1995).
5 Oceanographers will recognize this apparently highly artificial problem as being a slightly
simplified version of the so-called geostrophic inverse problem, and which is of great practical
importance. It is a central subject in Chapter 6.
6 Aki and Richards (1980). A famous two-dimensional version of the problem is described by Kač
(1966); see also Gordon and Webb (1996).
7 Oreskes et al. (1994).
2
Basic machinery

2.1 Background
The purpose of this chapter is to record a number of results that are useful in
finding and understanding the solutions to sets of usually noisy simultaneous linear
equations and in which formally there may be too much or too little information.
A lot of the material is elementary; good textbooks exist, to which the reader
will be referred. Some of what follows is discussed primarily so as to produce
a consistent notation for later use. But some topics are given what may be an
unfamiliar interpretation, and I urge everyone to at least skim the chapter.
Our basic tools are those of matrix and vector algebra as they relate to the solution
of linear simultaneous equations, and some elementary statistical ideas – mainly
concerning covariance, correlation, and dispersion. Least-squares is reviewed, with
an emphasis placed upon the arbitrariness of the distinction between knowns, un-
knowns, and noise. The singular-value decomposition is a central building block,
producing the clearest understanding of least-squares and related formulations.
Minimum variance estimation is introduced through the Gauss–Markov theorem
as an alternative method for obtaining solutions to simultaneous equations, and its
relation to and distinction from least-squares is discussed. The chapter ends with
a brief discussion of recursive least-squares and estimation; this part is essential
background for the study of time-dependent problems in Chapter 4.

2.2 Matrix and vector algebra


This subject is very large and well-developed and it is not my intention to re-
peat material better found elsewhere.1 Only a brief survey of essential results is
provided.
A matrix is an M × N array of elements of the form
A = {Ai j }, i = 1, 2, . . . , M, j = 1, 2, . . . , N .

Normally a matrix is denoted by a bold-faced capital letter. A vector is a special
case of an M × 1 matrix, written as a bold-face lower case letter, for example, q.
Corresponding capital or lower case letters for Greek symbols are also indicated in
bold-face. Unless otherwise stipulated, vectors are understood to be columnar. The
transpose of a matrix A is written AT and is defined as {AT }i j = A ji , an interchange
of the rows and columns of A. Thus (AT )T = A. Transposition applied to vectors
is sometimes used to save space in printing, for example, q = [q1 , q2 , . . . , q N ]T is
the same as
\[ \mathbf{q} = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_N \end{bmatrix}. \]

Matrices and vectors 


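A conventional measure of length of a vector is $\sqrt{\mathbf{a}^T \mathbf{a}} = \sqrt{\sum_{i}^{N} a_i^2} = \|\mathbf{a}\|$. The inner,
or dot, product between two L × 1 vectors a, b is written $\mathbf{a}^T \mathbf{b} \equiv \mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{L} a_i b_i$
and is a scalar. Such an inner product is the “projection” of a onto b (or vice versa).
It is readily shown that $|\mathbf{a}^T \mathbf{b}| = \|\mathbf{a}\| \|\mathbf{b}\| |\cos \phi| \le \|\mathbf{a}\| \|\mathbf{b}\|$, where the magnitude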
of cos φ ranges between zero, when the vectors are orthogonal, and one, when they
are parallel.
Suppose we have a collection of N vectors, ei , each of dimension N . If it is
possible to represent perfectly an arbitrary N -dimensional vector f as the linear
sum
\[ \mathbf{f} = \sum_{i=1}^{N} \alpha_i \mathbf{e}_i, \tag{2.1} \]

then ei are said to be a “basis.” A necessary and sufficient condition for them to
have that property is that they should be “independent,” that is, no one of them
should be perfectly representable by the others:
\[ \mathbf{e}_j - \sum_{i=1,\, i \ne j}^{N} \beta_i \mathbf{e}_i \ne 0, \quad j = 1, 2, \ldots, N. \tag{2.2} \]

A subset of the e j are said to span a subspace (all vectors perfectly representable
by the subset). For example, [1, −1, 0]T , [1, 1, 0]T span the subspace of all vectors
[v1 , v2 , 0]T . A “spanning set” completely describes the subspace too, but might have
additional, redundant vectors. Thus the vectors [1, −1, 0]T , [1, 1, 0]T , [1, 1/2, 0]T
span the subspace but are not a basis for it.
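The redundancy can be made explicit with a short check that is not in the original text: the third vector is a linear combination of the first two, with coefficients found by solving a + b = 1, −a + b = 1/2,

\[ \begin{bmatrix} 1 \\ 1/2 \\ 0 \end{bmatrix} = \frac{1}{4} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + \frac{3}{4} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \]

so it adds nothing to the subspace already spanned by the other two.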

Figure 2.1 Schematic of expansion of an arbitrary vector f in two vectors e1 , e2
which may nearly coincide in direction.

The expansion coefficients α i in (2.1) are obtained by taking the dot product
of (2.1) with each of the vectors in turn:
\[ \sum_{i=1}^{N} \alpha_i \mathbf{e}_k^T \mathbf{e}_i = \mathbf{e}_k^T \mathbf{f}, \quad k = 1, 2, \ldots, N, \tag{2.3} \]

which is a system of N equations in N unknowns. The α i are most readily found
if the ei are a mutually orthonormal set, that is, if

\[ \mathbf{e}_i^T \mathbf{e}_j = \delta_{ij}, \]

but this requirement is not a necessary one. With a basis, the information contained
in the set of projections, eiT f = fT ei , is adequate then to determine the α i and thus
all the information required to reconstruct f is contained in the dot products.
The concept of “nearly dependent” vectors is helpful and can be understood
heuristically. Consider Fig. 2.1, in which the space is two-dimensional. Then the
two vectors e1 , e2 , as depicted there, are independent and can be used to expand
an arbitrary two-dimensional vector f in the plane. The simultaneous equations
become

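\[ \begin{aligned} \alpha_1 \mathbf{e}_1^T \mathbf{e}_1 + \alpha_2 \mathbf{e}_1^T \mathbf{e}_2 &= \mathbf{e}_1^T \mathbf{f}, \\ \alpha_1 \mathbf{e}_2^T \mathbf{e}_1 + \alpha_2 \mathbf{e}_2^T \mathbf{e}_2 &= \mathbf{e}_2^T \mathbf{f}. \end{aligned} \tag{2.4} \]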

The vectors become nearly parallel as the angle φ in Fig. 2.1 goes to zero; as long as
they are not identically parallel, they can still be used mathematically to represent f
perfectly. An important feature is that even if the lengths of e1, e2 , f are all order-one,
the expansion coefficients α 1 , α 2 can have unbounded magnitudes when the angle
φ becomes small and f is nearly orthogonal to both (measured by angle η).
That is to say, we find readily from (2.4) that


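\[ \alpha_1 = \frac{\mathbf{e}_1^T \mathbf{f}\; \mathbf{e}_2^T \mathbf{e}_2 - \mathbf{e}_2^T \mathbf{f}\; \mathbf{e}_1^T \mathbf{e}_2}{\mathbf{e}_1^T \mathbf{e}_1\; \mathbf{e}_2^T \mathbf{e}_2 - \left( \mathbf{e}_1^T \mathbf{e}_2 \right)^2}, \tag{2.5} \]

\[ \alpha_2 = \frac{\mathbf{e}_2^T \mathbf{f}\; \mathbf{e}_1^T \mathbf{e}_1 - \mathbf{e}_1^T \mathbf{f}\; \mathbf{e}_2^T \mathbf{e}_1}{\mathbf{e}_1^T \mathbf{e}_1\; \mathbf{e}_2^T \mathbf{e}_2 - \left( \mathbf{e}_1^T \mathbf{e}_2 \right)^2}. \tag{2.6} \]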
Suppose for simplicity that f has unit length, and that the ei have also been normal-
ized to unit length as shown in Fig. 2.1. Then,
\[ \alpha_1 = \frac{\cos(\eta - \phi) - \cos\phi \cos\eta}{1 - \cos^2\phi} = \frac{\sin\eta}{\sin\phi}, \tag{2.7} \]
\[ \alpha_2 = \cos\eta - \sin\eta \cot\phi, \tag{2.8} \]

whose magnitudes can become arbitrarily large as φ → 0. One can imagine a
situation in which α 1 e1 and α 2 e2 were separately measured and found to be very
large. One could then erroneously infer that the sum vector, f, was equally large.
This property of the expansion in non-orthogonal vectors potentially producing
large coefficients becomes important later (Chapter 5) as a way of gaining insight
into the behavior of so-called non-normal operators. The generalization to higher
dimensions is left to the reader’s intuition. One anticipates that as φ becomes very
small, numerical problems can arise in using these “almost parallel” vectors.
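The growth of the coefficients is easy to see numerically. The sketch below is a minimal illustration with assumed geometry: e2 lies along the first coordinate axis, e1 makes an angle φ with it, and f is a unit vector at angle η from e2 (so η − φ from e1), matching the cosines used in Eq. (2.7). The function name is hypothetical.

```python
import math


def expansion_coefficients(phi, eta):
    """Solve Eq. (2.4) for unit vectors e1, e2 separated by angle phi."""
    e1 = (math.cos(phi), math.sin(phi))
    e2 = (1.0, 0.0)
    f = (math.cos(eta), math.sin(eta))

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1]

    a11, a12, a22 = dot(e1, e1), dot(e1, e2), dot(e2, e2)
    b1, b2 = dot(e1, f), dot(e2, f)
    det = a11 * a22 - a12 * a12
    alpha1 = (b1 * a22 - b2 * a12) / det     # Eq. (2.5)
    alpha2 = (b2 * a11 - b1 * a12) / det     # Eq. (2.6)
    return alpha1, alpha2


eta = math.pi / 2                             # f orthogonal to e2
for phi in (1.0, 0.1, 0.01):
    a1, a2 = expansion_coefficients(phi, eta)
    print(phi, a1, a2)
```

Although e1, e2, and f all have unit length, the two coefficients grow roughly like 1/φ and nearly cancel in the sum.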

Gram–Schmidt process
One often has a set of p independent, but non-orthonormal vectors, hi , and it is con-
venient to find a new set gi , which are orthonormal. The “Gram–Schmidt process”
operates by induction. Suppose the first k of the hi have been orthonormalized to a
new set, gi . To generate vector k + 1, let
\[ \mathbf{g}_{k+1} = \mathbf{h}_{k+1} - \sum_{j=1}^{k} \gamma_j \mathbf{g}_j. \tag{2.9} \]

Because gk+1 must be orthogonal to the preceding gi , i = 1, . . . , k, take the dot
products of (2.9) with each of these vectors, producing a set of simultaneous equations
for determining the unknown γ j . The resulting gk+1 is easily given unit norm
by dividing by its length.
Given the first k of N necessary vectors, an additional N − k independent vectors,
hi , are needed. There are several possibilities. The necessary extra vectors might
be generated by filling their elements with random numbers. Or a very simple trial
set like hk+1 = [1, 0, 0, . . . , 0]T , hk+2 = [0, 1, 0, . . . 0]T , . . . might be adequate. If
one is unlucky, the set chosen might prove not to be independent of the existing gi .
But a simple numerical perturbation usually suffices to render them so. In practice,
the algorithm is changed to what is usually called the “modified Gram–Schmidt
process” for purposes of numerical stability.2
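A minimal sketch of the modified Gram–Schmidt ordering follows, written in plain Python so that it has no dependencies. It is one common way of arranging the computation, not necessarily the form used in the cited reference: because the gj are already orthonormal, each γj is a single dot product, and in the modified version it is taken with the partially reduced vector rather than with the original hk+1. The function names, tolerance, and test vectors are assumptions made for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def modified_gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize `vectors` in order; raise if one depends on the earlier ones."""
    basis = []
    for h in vectors:
        g = list(h)
        # Remove the component along each earlier g_j, updating g as we go
        # (the "modified" ordering, which is less sensitive to round-off).
        for gj in basis:
            gamma = dot(gj, g)
            g = [gi - gamma * gji for gi, gji in zip(g, gj)]
        norm = math.sqrt(dot(g, g))
        if norm < tol:
            raise ValueError("vector is not independent of the preceding ones")
        basis.append([gi / norm for gi in g])
    return basis


G = modified_gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for g in G:
    print(g)
```

Passing the spanning set [1, −1, 0], [1, 1, 0], [1, 1/2, 0] to the same function raises the error, which is the "unlucky" dependent case described above.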

2.2.1 Matrix multiplication and identities


It has been found convenient and fruitful to usually define multiplication of two
matrices A, B, written as C = AB, such that
\[ C_{ij} = \sum_{p=1}^{P} A_{ip} B_{pj}. \tag{2.10} \]

For the definition (2.10) to make sense, A must be an M × P matrix and B must
be P × N (including the special case of P × 1, a column vector). That is, the two
matrices must be “conformable.” If two matrices are multiplied, or a matrix and
a vector are multiplied, conformability is implied – otherwise one can be assured
that an error has been made. Note that AB ≠ BA even where both products exist,
except under special circumstances. Define A² = AA, etc. Other definitions of
matrix multiplication exist, and are useful, but are not needed here.
The mathematical operation in (2.10) may appear arbitrary, but a physical inter-
pretation is available: Matrix multiplication is the dot product of all of the rows of A
with all of the columns of B. Thus multiplication of a vector by a matrix represents
the projections of the rows of the matrix onto the vector.
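The definition (2.10) translates directly into code. The sketch below is a plain-Python illustration with hypothetical names and made-up matrices: each element of the product is the dot product of a row of A with a column of B, a conformability check guards the inner dimension, and the two small matrices show that the order of multiplication matters.

```python
def matmul(A, B):
    """C = AB from Eq. (2.10): C_ij is the dot product of row i of A with column j of B."""
    P = len(B)
    if any(len(row) != P for row in A):
        raise ValueError("matrices are not conformable")
    cols = list(zip(*B))                      # columns of B
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = matmul(A, B)      # swaps the columns of A
BA = matmul(B, A)      # swaps the rows of A
print(AB, BA)
```

Here B is a permutation matrix, so multiplying on the right or on the left rearranges columns or rows respectively, and the two products differ.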
Define a matrix, E, each of whose columns is the corresponding vector ei , and
a vector, α = {α i }, in the same order. Then the expansion (2.1) can be written
compactly as

f = Eα. (2.11)

A “symmetric matrix” is one for which AT = A. The product AT A represents
the array of all the dot products of the columns of A with themselves, and similarly,
AAT represents the set of all dot products of all the rows of A with themselves.
It follows that (AB)T = BT AT . Because we have (AAT )T = AAT , (AT A)T = AT A,
both of these matrices are symmetric.
The “trace” of a square M × M matrix A is defined as $\mathrm{trace}(\mathbf{A}) = \sum_{i=1}^{M} A_{ii}$. A
“diagonal matrix” is square and zero except for the terms along the main diagonal,
although we will later generalize this definition. The operator diag(q) forms a square
diagonal matrix with q along the main diagonal.
The special L × L diagonal matrix I L , with Iii = 1, is the “identity.” Usually,
when the dimension of I L is clear from the context, the subscript is omitted. IA = A,
AI = A, for any A for which the products make sense. If there is a matrix B, such
that BE = I, then B is the “left inverse” of E. If B is the left inverse of E and E is
square, a standard result is that it must also be a right inverse: EB = I. B is then
called “the inverse of E” and is usually written E−1 . Square matrices with inverses
are “non-singular.” Analytical expressions exist for a few inverses; more generally,
linear algebra books explain how to find them numerically when they exist. If E is
not square, one must distinguish left and right inverses, sometimes written, E+ , and
referred to as “generalized inverses.” Some of them will be encountered later. A
useful result is that (AB)−1 = B−1 A−1 , if the inverses exist. A notational shorthand
is (A−1 )T = (AT )−1 ≡ A−T .
The “length,” or norm, of a vector has already been introduced. But several
choices are possible; for present purposes, the conventional l2 norm already defined,
\[ \|\mathbf{f}\|_2 \equiv (\mathbf{f}^T \mathbf{f})^{1/2} = \left( \sum_{i=1}^{N} f_i^2 \right)^{1/2}, \tag{2.12} \]

is most useful; often the subscript is omitted. Equation (2.12) leads in turn to the
measure of distance between two vectors, a, b, as

\[ \|\mathbf{a} - \mathbf{b}\|_2 = \sqrt{(\mathbf{a} - \mathbf{b})^T (\mathbf{a} - \mathbf{b})}, \tag{2.13} \]
which is the familiar Cartesian distance. Distances can also be measured in such a
way that deviations of certain elements of c = a − b count for more than others –
that is, a metric, or set of weights can be introduced with a definition,

\[ \|\mathbf{c}\|_W = \sqrt{\sum_{i} c_i W_{ii} c_i}, \tag{2.14} \]
depending upon the importance to be attached to magnitudes of different elements,
stretching and shrinking various coordinates. Finally, in the most general form,
distance can be measured in a coordinate system both stretched and rotated relative
to the original one

\[ \|\mathbf{c}\|_W = \sqrt{\mathbf{c}^T \mathbf{W} \mathbf{c}}, \tag{2.15} \]
where W is an arbitrary matrix (but usually, for physical reasons, symmetric and
positive definite,3 implying that cT Wc ≥ 0).
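The effect of the weights can be seen by measuring the same difference vector three ways. The sketch below is illustrative only: the vectors and weight matrices are made-up, and the function name is hypothetical. It evaluates the weighted distance with the identity (ordinary Cartesian distance), with a diagonal W that stretches the first coordinate, and with a full symmetric positive definite W.

```python
import math


def weighted_norm(c, W):
    """Return sqrt(c^T W c) for W stored as a list of rows."""
    Wc = [sum(w * ci for w, ci in zip(row, c)) for row in W]
    return math.sqrt(sum(ci * wci for ci, wci in zip(c, Wc)))


a = [4.0, 5.0]
b = [1.0, 1.0]
c = [ai - bi for ai, bi in zip(a, b)]         # c = a - b = [3, 4]

identity = [[1.0, 0.0], [0.0, 1.0]]           # ordinary Cartesian distance
diag_w = [[4.0, 0.0], [0.0, 1.0]]             # first element weighted four times as heavily
full_w = [[2.0, 1.0], [1.0, 2.0]]             # stretched and rotated coordinates

print(weighted_norm(c, identity), weighted_norm(c, diag_w), weighted_norm(c, full_w))
```

The three distances are 5, √52, and √74, so the choice of W changes which deviations are regarded as large.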

2.2.2 Linear simultaneous equations


Consider a set of M-linear equations in N -unknowns,
Ex = y. (2.16)
Because of the appearance of simultaneous equations in situations in which the yi are
observed, and where x are parameters whose values are sought, it is often convenient
to refer to (2.16) as a set of measurements of x that produced the observations or
data, y. If M > N , the system is said to be “formally overdetermined.” If M < N ,
it is “underdetermined,” and if M = N , it is “formally just-determined.” The use
of the word “formally” has a purpose we will come to. Knowledge of the matrix
inverse to E would make it easy to solve a set of L equations in L unknowns, by
left-multiplying (2.16) by E−1 :
\[ \mathbf{E}^{-1} \mathbf{E} \mathbf{x} = \mathbf{I} \mathbf{x} = \mathbf{x} = \mathbf{E}^{-1} \mathbf{y}. \]
The reader is cautioned that although matrix inverses are a very powerful theoretical
tool, one is usually ill-advised to solve large sets of simultaneous equations by
employing E−1 ; better numerical methods are available for the purpose.4
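One standard alternative to forming the inverse is Gaussian elimination with partial pivoting followed by back substitution. The sketch below is a minimal plain-Python version for small, square, non-singular systems, intended only to show the idea; the function name and the test system are assumptions, and production work would rely on a library routine rather than this loop.

```python
def solve(E, y):
    """Solve the square system E x = y by Gaussian elimination with partial pivoting."""
    n = len(E)
    # Work on an augmented copy [E | y] so the inputs are left untouched.
    M = [list(map(float, row)) + [float(yi)] for row, yi in zip(E, y)]
    for k in range(n):
        # Pivot on the largest remaining entry in column k.
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        if M[p][k] == 0.0:
            raise ValueError("matrix is singular")
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


E = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
x_true = [1.0, -2.0, 3.0]
y = [sum(e * xi for e, xi in zip(row, x_true)) for row in E]
x = solve(E, y)
print(x)
```

The solution is obtained without ever computing the elements of the inverse matrix.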
There are several ways to view the meaning of any set of linear simultaneous
equations. If the columns of E continue to be denoted ei , then (2.16) is
x1 e1 + x2 e2 + · · · + x N e N = y. (2.17)
The ability to so describe an arbitrary y, or to solve the equations, would thus
depend upon whether the M × 1 vector y can be specified by a sum of N -column
vectors, ei . That is, it would depend upon their being a spanning set. In this view,
the elements of x are simply the corresponding expansion coefficients. Depending
upon the ratio of M to N , that is, the number of equations compared to the number of
unknown elements, one faces the possibility that there are fewer expansion vectors
ei than elements of y (M > N ), or that there are more expansion vectors available
than elements of y (M < N ). Thus the overdetermined case corresponds to having
fewer expansion vectors, and the underdetermined case corresponds to having more
expansion vectors, than the dimension of y. It is possible that in the overdetermined
case, the too-few expansion vectors are not actually independent, so that there are
even fewer vectors available than is first apparent. Similarly, in the underdetermined
case, there is the possibility that although it appears we have more expansion vectors
than required, fewer may be independent than the number of elements of y, and the
consequences of that case need to be understood as well.
An alternative interpretation of simultaneous linear equations denotes the rows
of E as riT , i = 1, 2, . . . , M. Then Eq. (2.16) is a set of M-inner products,
riT x = yi , i = 1, 2, . . . , M. (2.18)
That is, the set of simultaneous equations is also equivalent to being provided with
the value of M-dot products of the N -dimensional unknown vector, x, with M
known vectors, ri . Whether that is sufficient information to determine x depends
upon whether the ri are a spanning set. In this view, in the overdetermined case,
one has more dot products available than unknown elements xi , and, in the under-
determined case, there are fewer such values than unknowns.
A special set of simultaneous equations for square matrices, A, is labelled the
“eigenvalue/eigenvector problem,”

Ae =λe. (2.19)

In this set of linear simultaneous equations one seeks a special vector, e, such
that for some as yet unknown scalar eigenvalue, λ, there is a solution. An N × N
matrix will have up to N solutions (λi , ei ), but the nature of these elements and their
relations require considerable effort to deduce. We will look at this problem more
later; for the moment, it again suffices to say that numerical methods for solving
Eq. (2.19) are well-known.
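As a small illustration of what such a numerical method looks like, the sketch below uses power iteration, one of the simplest approaches. It is a stand-in chosen for brevity, not the method used by general-purpose libraries: it finds only the eigenvalue of largest magnitude, and it assumes that eigenvalue is well separated and that the starting vector is not orthogonal to its eigenvector. The matrix and function name are made-up.

```python
import math


def power_iteration(A, n_iter=200):
    """Estimate the dominant eigenpair of A by repeated multiplication."""
    n = len(A)
    e = [1.0] + [0.0] * (n - 1)               # arbitrary starting vector
    for _ in range(n_iter):
        Ae = [sum(a * ei for a, ei in zip(row, e)) for row in A]
        norm = math.sqrt(sum(v * v for v in Ae))
        e = [v / norm for v in Ae]
    Ae = [sum(a * ei for a, ei in zip(row, e)) for row in A]
    lam = sum(ei * v for ei, v in zip(e, Ae))  # Rayleigh quotient e^T A e, with e of unit length
    return lam, e


A = [[2.0, 1.0], [1.0, 2.0]]                  # eigenvalues 3 and 1
lam, e = power_iteration(A)
print(lam, e)
```

For this symmetric example the iteration converges to λ = 3 with e proportional to [1, 1]T, and the pair satisfies Ae = λe to rounding error.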

2.2.3 Matrix norms


A number of useful definitions of a matrix size, or norm, exist. The so-called
“spectral norm” or “2-norm” defined as

\[ \|\mathbf{A}\|_2 = \sqrt{\text{maximum eigenvalue of } (\mathbf{A}^T \mathbf{A})} \tag{2.20} \]

is usually adequate. Without difficulty, it may be seen that this definition is equiv-
alent to
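\[ \|\mathbf{A}\|_2 = \max \sqrt{\frac{\mathbf{x}^T \mathbf{A}^T \mathbf{A} \mathbf{x}}{\mathbf{x}^T \mathbf{x}}} = \max \frac{\|\mathbf{A}\mathbf{x}\|_2}{\|\mathbf{x}\|_2} \tag{2.21} \]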
where the maximum is defined over all vectors x.5 Another useful measure is the
“Frobenius norm,”

\[ \|\mathbf{A}\|_F = \sqrt{\sum_{i=1}^{M} \sum_{j=1}^{N} A_{ij}^2} = \sqrt{\mathrm{trace}(\mathbf{A}^T \mathbf{A})}. \tag{2.22} \]

Neither norm requires A to be square. These norms permit one to derive various useful
results. Consider the following illustration. Suppose Q is square, and ‖Q‖ < 1,
then

(I + Q)−1 = I − Q + Q2 − · · · , (2.23)

which may be verified by multiplying both sides by I + Q, doing term-by-term
multiplication and measuring the remainders with either norm.
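The series (2.23) can be checked numerically for a small matrix. The sketch below is an illustration with a made-up Q and hypothetical function names: it confirms that the Frobenius norm of Q is less than one (which bounds the spectral norm as well), sums the first 60 terms of the series, and multiplies the result by I + Q.

```python
import math


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def frobenius_norm(A):
    """Eq. (2.22): square root of the sum of squared elements."""
    return math.sqrt(sum(a * a for row in A for a in row))


def neumann_inverse(Q, n_terms):
    """Partial sum I - Q + Q^2 - ... of Eq. (2.23), using n_terms terms."""
    n = len(Q)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for _ in range(1, n_terms):
        # Each new term is the previous one multiplied by -Q.
        term = [[-t for t in row] for row in matmul(term, Q)]
        total = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(total, term)]
    return total


Q = [[0.1, 0.2], [0.3, 0.1]]
approx = neumann_inverse(Q, 60)
I_plus_Q = [[1.1, 0.2], [0.3, 1.1]]
check = matmul(I_plus_Q, approx)              # should be close to the identity
print(frobenius_norm(Q), check)
```

The product is the identity to rounding error, and the remainder after k terms shrinks at least as fast as ‖Q‖ raised to the power k.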
Nothing has been said about actually finding the numerical values of either
the matrix inverse or the eigenvectors and eigenvalues. Computational algorithms
for obtaining them have been developed by experts, and are discussed in many
good textbooks.6 Software systems like MATLAB, Maple, IDL, and Mathematica
implement them in easy-to-use form. For purposes of this book, we assume the
reader has at least a rudimentary knowledge of these techniques and access to a
good software implementation.

2.2.4 Identities: differentiation


There are some identities and matrix/vector definitions that prove useful.
A square “positive definite” matrix A, is one for which the scalar “quadratic
form,”
J = xT Ax,
is positive for all possible vectors x. (It suffices to consider only symmetric A be-
cause for a general matrix, xT Ax = xT [(A + AT )/2]x, which follows from the scalar
property of the quadratic form.) If J ≥ 0 for all x, A is “positive semi-definite,”
or “non-negative definite.” Linear algebra books show that a necessary and suffi-
cient requirement for positive definiteness is that A has only positive eigenvalues
(Eq. 2.19) and a semi-definite one must have all non-negative eigenvalues.
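The remark that only the symmetric part of A contributes to the quadratic form can be confirmed directly. The sketch below uses a made-up non-symmetric matrix and hypothetical function names; it evaluates J with A and with (A + AT)/2 for a few trial vectors.

```python
def quadratic_form(A, x):
    """J = x^T A x for A stored as a list of rows."""
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return sum(xi * axi for xi, axi in zip(x, Ax))


def symmetric_part(A):
    """Return (A + A^T) / 2."""
    n = len(A)
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]


A = [[2.0, 3.0], [-1.0, 2.0]]          # not symmetric
S = symmetric_part(A)                  # [[2, 1], [1, 2]], with eigenvalues 1 and 3
trial_vectors = ([1.0, 0.0], [1.0, -1.0], [-2.0, 5.0])
values = [(quadratic_form(A, x), quadratic_form(S, x)) for x in trial_vectors]
print(S, values)
```

Both forms give the same value of J for every trial vector, and since the symmetric part has positive eigenvalues, J is positive for each nonzero x tried.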
We end up doing a certain amount of differentiation and other operations with
respect to matrices and vectors. A number of formulas are very helpful, and save a
lot of writing. They are all demonstrated by doing the derivatives term-by-term. Let
q, r be N × 1 column vectors, and A, B, C be matrices. The derivative of a matrix
by a scalar is just the matrix of element by element derivatives. Alternatively, if s
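is any scalar, its derivative by a vector,

$$\frac{\partial s}{\partial \mathbf{q}} = \left[\frac{\partial s}{\partial q_1} \;\cdots\; \frac{\partial s}{\partial q_N}\right]^T, \qquad (2.24)$$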
is a column vector (the gradient; some authors define it to be a row vector). The
derivative of one vector by another is defined as a matrix:
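$$\frac{\partial \mathbf{r}}{\partial \mathbf{q}} = \left\{\frac{\partial r_i}{\partial q_j}\right\} =
\begin{Bmatrix}
\dfrac{\partial r_1}{\partial q_1} & \dfrac{\partial r_2}{\partial q_1} & \cdots & \dfrac{\partial r_M}{\partial q_1} \\
\dfrac{\partial r_1}{\partial q_2} & \cdot & \cdots & \dfrac{\partial r_M}{\partial q_2} \\
\vdots & \cdot & \cdot & \vdots \\
\dfrac{\partial r_1}{\partial q_N} & \cdot & \cdots & \dfrac{\partial r_M}{\partial q_N}
\end{Bmatrix} \equiv \mathbf{B}. \qquad (2.25)$$

If $\mathbf{r}$, $\mathbf{q}$ are of the same dimension, the determinant of $\mathbf{B}$, $\det(\mathbf{B})$, is the “Jacobian”
of $\mathbf{r}$.7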
The second derivative of a scalar,
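$$\frac{\partial^2 s}{\partial \mathbf{q}^2} = \left\{\frac{\partial}{\partial q_i}\frac{\partial s}{\partial q_j}\right\} =
\begin{Bmatrix}
\dfrac{\partial^2 s}{\partial q_1^2} & \dfrac{\partial^2 s}{\partial q_1 \partial q_2} & \cdots & \dfrac{\partial^2 s}{\partial q_1 \partial q_N} \\
\vdots & \cdot & \cdot & \vdots \\
\dfrac{\partial^2 s}{\partial q_N \partial q_1} & \cdot & \cdots & \dfrac{\partial^2 s}{\partial q_N^2}
\end{Bmatrix}, \qquad (2.26)$$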

is the “Hessian” of s and is the derivative of the gradient of s.


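Assuming conformability, the inner product, $J = \mathbf{r}^T\mathbf{q} = \mathbf{q}^T\mathbf{r}$, is a scalar. The
differential of $J$ is

$$dJ = d\mathbf{r}^T\mathbf{q} + \mathbf{r}^T d\mathbf{q} = d\mathbf{q}^T\mathbf{r} + \mathbf{q}^T d\mathbf{r}, \qquad (2.27)$$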

and hence the partial derivatives are


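$$\frac{\partial(\mathbf{q}^T\mathbf{r})}{\partial\mathbf{q}} = \frac{\partial(\mathbf{r}^T\mathbf{q})}{\partial\mathbf{q}} = \mathbf{r}, \qquad (2.28)$$

$$\frac{\partial(\mathbf{q}^T\mathbf{q})}{\partial\mathbf{q}} = 2\mathbf{q}. \qquad (2.29)$$

It follows immediately that, for matrix/vector products,

$$\frac{\partial}{\partial\mathbf{q}}(\mathbf{B}\mathbf{q}) = \mathbf{B}^T, \qquad \frac{\partial}{\partial\mathbf{q}}(\mathbf{q}^T\mathbf{B}) = \mathbf{B}. \qquad (2.30)$$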
The first of these is used repeatedly, and attention is called to the apparently trivial
fact that differentiation of Bq with respect to q produces the transpose of B – the
origin, as seen later, of so-called adjoint models. For a quadratic form,
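$$J = \mathbf{q}^T\mathbf{A}\mathbf{q}, \qquad \frac{\partial J}{\partial\mathbf{q}} = (\mathbf{A} + \mathbf{A}^T)\mathbf{q}, \qquad (2.31)$$

and the Hessian of the quadratic form is $2\mathbf{A}$ if $\mathbf{A} = \mathbf{A}^T$.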
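These identities are easy to confirm numerically. The sketch below is an illustration under assumed values (the matrices, the vector, and the step size are all hypothetical): it compares central finite differences with the gradient of the quadratic form, and with the derivative of $\mathbf{B}\mathbf{q}$ laid out as in Eq. (2.25), so that row $j$ holds the derivatives with respect to $q_j$.

```python
def quadratic_form(A, q):
    """Scalar J = q^T A q."""
    n = len(q)
    return sum(q[i] * A[i][j] * q[j] for i in range(n) for j in range(n))

def matvec(B, q):
    """Matrix-vector product r = B q."""
    return [sum(b * x for b, x in zip(row, q)) for row in B]

def shifted(q, i, h):
    """Copy of q with element i displaced by h."""
    out = q[:]
    out[i] += h
    return out

h = 1e-6                                        # finite-difference step (assumed)
A = [[1.0, 2.0, 0.0], [0.0, 3.0, -1.0], [4.0, 0.0, 2.0]]
B = [[1.0, -2.0, 0.5], [3.0, 0.0, 4.0]]         # 2 x 3, so r = Bq has M = 2 elements
q = [0.5, -1.0, 2.0]
N = len(q)

# Gradient of the quadratic form: numerical versus (A + A^T) q.
grad_numeric = [
    (quadratic_form(A, shifted(q, i, h)) - quadratic_form(A, shifted(q, i, -h))) / (2 * h)
    for i in range(N)
]
grad_analytic = [sum((A[i][j] + A[j][i]) * q[j] for j in range(N)) for i in range(N)]

# Derivative of Bq: row j holds d(Bq)_i / dq_j for i = 1..M, which should be B^T.
dr_dq_numeric = [
    [(up - down) / (2 * h)
     for up, down in zip(matvec(B, shifted(q, j, h)), matvec(B, shifted(q, j, -h)))]
    for j in range(N)
]
B_transpose = [list(col) for col in zip(*B)]
print(grad_numeric, grad_analytic)
```

With this layout the numerical derivative of $\mathbf{B}\mathbf{q}$ has $N$ rows and $M$ columns, matching $\mathbf{B}^T$ rather than $\mathbf{B}$.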
Differentiation of a scalar function (e.g., J in Eq. 2.31) or a vector by a matrix,
A, is readily defined.8 Differentiation of a matrix by another matrix results in a
third, very large, matrix. One special case of the differential of a matrix function
proves useful later on. It can be shown9 that

$$d\mathbf{A}^n = (d\mathbf{A})\,\mathbf{A}^{n-1} + \mathbf{A}\,(d\mathbf{A})\,\mathbf{A}^{n-2} + \cdots + \mathbf{A}^{n-1}(d\mathbf{A}), \qquad (2.32)$$

where A is square. Thus the derivative with respect to some scalar, k, is


 
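$$\frac{d\mathbf{A}^n}{dk} = \frac{d\mathbf{A}}{dk}\mathbf{A}^{n-1} + \mathbf{A}\frac{d\mathbf{A}}{dk}\mathbf{A}^{n-2} + \cdots + \mathbf{A}^{n-1}\frac{d\mathbf{A}}{dk}. \qquad (2.33)$$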
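The ordering of the factors matters whenever $\mathbf{A}$ and its derivative do not commute. As a hedged illustration, the sketch below assumes a hypothetical linear dependence $\mathbf{A}(k) = \mathbf{A}_0 + k\mathbf{A}_1$ with $n = 3$, and compares the sum of ordered products with a central finite difference and with the scalar-style rule $n\mathbf{A}^{n-1}\,d\mathbf{A}/dk$, which fails here.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matadd(X, Y):
    """Element-by-element sum of two matrices."""
    return [[x + y for x, y in zip(xr, yr)] for xr, yr in zip(X, Y)]

def matpow(A, n):
    """A raised to the non-negative integer power n."""
    size = len(A)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, A)
    return result

def power_derivative(A, dA_dk, n):
    """Sum over m = 0..n-1 of A^m (dA/dk) A^(n-1-m), the ordered-product rule."""
    size = len(A)
    total = [[0.0] * size for _ in range(size)]
    for m in range(n):
        total = matadd(total, matmul(matmul(matpow(A, m), dA_dk), matpow(A, n - 1 - m)))
    return total

A0 = [[1.0, 2.0], [0.0, 1.0]]      # hypothetical matrices
A1 = [[0.0, 1.0], [1.0, 0.0]]      # dA/dk; does not commute with A0

def A_of(k):
    return [[a + k * b for a, b in zip(ar, br)] for ar, br in zip(A0, A1)]

k, h, n = 0.3, 1e-5, 3
analytic = power_derivative(A_of(k), A1, n)
up, down = matpow(A_of(k + h), n), matpow(A_of(k - h), n)
numeric = [[(u - d) / (2 * h) for u, d in zip(ur, dr)] for ur, dr in zip(up, down)]
scalar_rule = [[n * v for v in row] for row in matmul(matpow(A_of(k), n - 1), A1)]
print(analytic, numeric, scalar_rule)
```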
There are a few, unfortunately unintuitive, matrix inversion identities that are
essential later. They are derived by considering the square, partitioned matrix,
 
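$$\begin{Bmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B}^T & \mathbf{C} \end{Bmatrix}, \qquad (2.34)$$

where $\mathbf{A}^T = \mathbf{A}$, $\mathbf{C}^T = \mathbf{C}$, but $\mathbf{B}$ can be rectangular of conformable dimensions
in (2.34).10 The most important of the identities, sometimes called the “matrix
inversion lemma” is, in one form,

$$\{\mathbf{C} - \mathbf{B}^T\mathbf{A}^{-1}\mathbf{B}\}^{-1} = \{\mathbf{I} - \mathbf{C}^{-1}\mathbf{B}^T\mathbf{A}^{-1}\mathbf{B}\}^{-1}\mathbf{C}^{-1}
= \mathbf{C}^{-1} - \mathbf{C}^{-1}\mathbf{B}^T(\mathbf{B}\mathbf{C}^{-1}\mathbf{B}^T - \mathbf{A})^{-1}\mathbf{B}\mathbf{C}^{-1}, \qquad (2.35)$$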

where it is assumed that the inverses exist.11 A variant is


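$$\mathbf{A}\mathbf{B}^T(\mathbf{C} + \mathbf{B}\mathbf{A}\mathbf{B}^T)^{-1} = (\mathbf{A}^{-1} + \mathbf{B}^T\mathbf{C}^{-1}\mathbf{B})^{-1}\mathbf{B}^T\mathbf{C}^{-1}. \qquad (2.36)$$

Equation (2.36) is readily confirmed by left-multiplying both sides by $(\mathbf{A}^{-1} + \mathbf{B}^T\mathbf{C}^{-1}\mathbf{B})$,
and right-multiplying by $(\mathbf{C} + \mathbf{B}\mathbf{A}\mathbf{B}^T)$ and showing that the two sides of
the resulting equation are equal.
Another identity, found by “completing the square,” is demonstrated by directly
multiplying it out, and requires $\mathbf{C} = \mathbf{C}^T$ ($\mathbf{A}$ is unrestricted, but the matrices must
be conformable as shown):

$$\mathbf{A}\mathbf{C}\mathbf{A}^T - \mathbf{B}\mathbf{A}^T - \mathbf{A}\mathbf{B}^T = (\mathbf{A} - \mathbf{B}\mathbf{C}^{-1})\mathbf{C}(\mathbf{A} - \mathbf{B}\mathbf{C}^{-1})^T - \mathbf{B}\mathbf{C}^{-1}\mathbf{B}^T. \qquad (2.37)$$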
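Because these identities are unintuitive, a numerical spot check can be reassuring. The following sketch assumes small hypothetical 2 × 2 matrices (so that a closed-form inverse suffices) with $\mathbf{A}$ and $\mathbf{C}$ symmetric, and evaluates each side of the three identities above; it demonstrates consistency for one example and is not a proof.

```python
from functools import reduce

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def chain(*matrices):
    """Product of several matrices, taken left to right."""
    return reduce(matmul, matrices)

def transpose(X):
    return [list(col) for col in zip(*X)]

def combine(X, Y, sign=1.0):
    """X + sign * Y, element by element."""
    return [[x + sign * y for x, y in zip(xr, yr)] for xr, yr in zip(X, Y)]

def inverse_2x2(X):
    """Closed-form inverse; assumes a non-zero determinant."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]

def max_abs_difference(X, Y):
    return max(abs(x - y) for xr, yr in zip(X, Y) for x, y in zip(xr, yr))

A = [[4.0, 1.0], [1.0, 3.0]]       # symmetric (hypothetical values)
C = [[5.0, 2.0], [2.0, 6.0]]       # symmetric
B = [[1.0, 2.0], [0.0, 1.0]]
I = [[1.0, 0.0], [0.0, 1.0]]
Ai, Ci, Bt = inverse_2x2(A), inverse_2x2(C), transpose(B)

# Matrix inversion lemma: three expressions that should agree.
first = inverse_2x2(combine(C, chain(Bt, Ai, B), -1.0))
second = matmul(inverse_2x2(combine(I, chain(Ci, Bt, Ai, B), -1.0)), Ci)
inner = inverse_2x2(combine(chain(B, Ci, Bt), A, -1.0))
third = combine(Ci, chain(Ci, Bt, inner, B, Ci), -1.0)

# The variant identity.
left = chain(A, Bt, inverse_2x2(combine(C, chain(B, A, Bt))))
right = chain(inverse_2x2(combine(Ai, chain(Bt, Ci, B))), Bt, Ci)

# Completing the square.
D = combine(A, matmul(B, Ci), -1.0)
square_lhs = combine(combine(chain(A, C, transpose(A)), matmul(B, transpose(A)), -1.0),
                     matmul(A, Bt), -1.0)
square_rhs = combine(chain(D, C, transpose(D)), chain(B, Ci, Bt), -1.0)
print(max_abs_difference(first, third), max_abs_difference(left, right))
```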

2.3 Simple statistics: regression


2.3.1 Probability densities, moments
Some statistical ideas are required, but the discussion is confined to stating some
basic notions and to developing a notation.12 We require the idea of a probability
density for a random variable $x$. This subject is a very deep one, but our approach
is heuristic.13 Suppose that an arbitrarily large number of experiments can be conducted
for the determination of the values of $x$, denoted $X_i$, $i = 1, 2, \ldots, M$, and
a histogram of the experimental values found. The frequency function, or probability
density, will be defined as the limit, supposing it exists, of the histogram
of an arbitrarily large number of experiments, $M \to \infty$, divided into bins of arbitrarily
small value ranges, and normalized by $M$, to produce the fraction of the
total appearing in the ranges. Let the corresponding limiting frequency function
be denoted $p_x(X)\,dX$, interpreted as the fraction (probability) of values of $x$ lying
in the range, $X \le x \le X + dX$. As a consequence of the definition, $p_x(X) \ge 0$
and

$$\int_{\text{all } X} p_x(X)\,dX = \int_{-\infty}^{\infty} p_x(X)\,dX = 1. \qquad (2.38)$$

The infinite integral is a convenient way of representing an integral over “all $X$,”
as $p_x$ simply vanishes for impossible values of $X$. (It should be noted that this
so-called frequentist approach has fallen out of favor, with Bayesian assumptions
being regarded as ultimately more rigorous and fruitful. For introductory purposes,
however, empirical frequency functions appear to provide an adequate intuitive
basis for proceeding.)
The “average,” or “mean,” or “expected value” is denoted $\langle x \rangle$ and defined as

$$\langle x \rangle \equiv \int_{\text{all } X} X\,p_x(X)\,dX = m_1. \qquad (2.39)$$

The mean is the center of mass of the probability density. Knowledge of the true
mean value of a random variable is commonly all that we are willing to assume
known. If forced to “forecast” the numerical value of $x$ under such circumstances,
often the best we can do is to employ $\langle x \rangle$. If the deviation from the true mean is
denoted $x'$ so that $x = \langle x \rangle + x'$, such a forecast has the virtue that we are assured
the average forecast error, $\langle x' \rangle$, would be zero if many such forecasts are made. The
bracket operation is very important throughout this book; it has the property that if
$a$ is a non-random quantity, $\langle ax \rangle = a\langle x \rangle$ and $\langle ax + y \rangle = a\langle x \rangle + \langle y \rangle$.
Quantity $\langle x \rangle$ is the “first-moment” of the probability density. Higher order moments
are defined as

$$m_n = \langle x^n \rangle = \int_{-\infty}^{\infty} X^n p_x(X)\,dX,$$

where $n$ are the non-negative integers. A useful theoretical result is that a knowledge
of all the moments is usually enough to completely define the probability density
itself. (There are troublesome situations with, e.g., non-existent moments,
as with the so-called Cauchy distribution, $p_x(X) = (2/\pi)\,(1/(1 + X^2))$, $X \ge 0$,
whose mean is infinite.) For many important probability densities, including the
Gaussian, a knowledge of the first two moments $n = 1, 2$ is sufficient to define
all the others, and hence the full probability density. It is common to define the
moments for $n > 1$ about the mean, so that one has
$$\mu_n = \langle (x - \langle x \rangle)^n \rangle = \int_{-\infty}^{\infty} (X - \langle X \rangle)^n p_x(X)\,dX.$$

$\mu_2$ is the variance and often written $\mu_2 = \sigma^2$, where $\sigma$ is the “standard deviation.”
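To make the moment definitions concrete, the sketch below evaluates them by simple numerical quadrature for a Gaussian density. The mean, standard deviation, integration limits, and number of steps are all assumed values chosen for illustration. It also shows the sense in which the first two moments fix the others: for a Gaussian, the fourth moment about the mean equals $3\sigma^4$.

```python
import math

def gaussian_density(X, mean, sigma):
    """Gaussian probability density with the given mean and standard deviation."""
    return math.exp(-0.5 * ((X - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def moment(density, n, about=0.0, lower=-30.0, upper=30.0, steps=60000):
    """Midpoint-rule approximation to the integral of (X - about)^n p(X) dX.

    The finite limits stand in for the infinite integral; they are assumed wide
    enough that the density is negligible outside them.
    """
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        X = lower + (i + 0.5) * width
        total += (X - about) ** n * density(X)
    return total * width

mean, sigma = 1.0, 2.0                       # hypothetical parameters

def p(X):
    return gaussian_density(X, mean, sigma)

m0 = moment(p, 0)                            # normalization, should be 1
m1 = moment(p, 1)                            # first moment, the mean
mu2 = moment(p, 2, about=m1)                 # variance, sigma^2
mu4 = moment(p, 4, about=m1)                 # fourth moment about the mean
print(m0, m1, mu2, mu4, 3.0 * sigma ** 4)
```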

2.3.2 Sample estimates: bias


In observational sciences, one normally must estimate the values defining the probability
density from the data itself. Thus the first moment, the mean, is often computed
as the “sample average,”

$$\tilde{m}_1 = \langle x \rangle_M \equiv \frac{1}{M}\sum_{i=1}^{M} X_i. \qquad (2.40)$$
The notation $\tilde{m}_1$ is used to distinguish the sample estimate from the true value, $m_1$.
On the other hand, if the experiment of computing $\tilde{m}_1$ from $M$ samples could be
repeated many times, the mean of the sample estimates would be the true mean.
This conclusion is readily seen by considering the expected value of the difference
from the true mean:

$$\left\langle \langle x \rangle_M - \langle x \rangle \right\rangle = \left\langle \frac{1}{M}\sum_{i=1}^{M} X_i - \langle x \rangle \right\rangle
= \frac{1}{M}\sum_{i=1}^{M} \langle X_i \rangle - \langle x \rangle = \frac{M}{M}\langle x \rangle - \langle x \rangle = 0.$$

Such an estimate is said to be “unbiassed”: its expected value is the quantity one
seeks.
The interpretation is that, for finite M, we do not expect that the sample mean
will equal the true mean, but that if we could produce sample averages from distinct
groups of observations, the sample averages would themselves have an average that
will fluctuate about the true mean, with equal probability of being higher or lower.
There are many sample estimates, however, some of which we encounter, where
the expected value of the sample estimate is not equal to the true value. Such
an estimator is said to be “biassed.” A simple example of a biassed estimator is the
“sample variance,” defined as

$$s^2 \equiv \frac{1}{M}\sum_{i=1}^{M} \left(X_i - \langle x \rangle_M\right)^2. \qquad (2.41)$$

For reasons explained later in this chapter (p. 42), one finds that

$$\langle s^2 \rangle = \frac{M - 1}{M}\,\sigma^2 \ne \sigma^2,$$
and thus the expected value is not the true variance. (This particular estimate is
“asymptotically unbiassed,” as the bias vanishes as M → ∞.)
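The bias can be seen exactly in a small discrete example. The sketch below is an illustration under assumed conditions: the random variable is the face of a fair six-sided die, $M = 3$ independent samples are drawn per experiment, and, instead of repeating the experiment many times, every equally likely set of samples is enumerated so that the expected values are exact fractions.

```python
from fractions import Fraction
from itertools import product

faces = [Fraction(v) for v in range(1, 7)]     # fair die: each face has probability 1/6
M = 3                                          # samples per experiment (assumed)

true_mean = sum(faces) / len(faces)
true_variance = sum((v - true_mean) ** 2 for v in faces) / len(faces)

outcomes = list(product(faces, repeat=M))      # all 6^M equally likely sample sets
sample_means = [sum(s) / M for s in outcomes]
sample_variances = [
    sum((v - m) ** 2 for v in s) / M           # divides by M, as in the sample variance
    for s, m in zip(outcomes, sample_means)
]

expected_sample_mean = sum(sample_means) / len(outcomes)
expected_sample_variance = sum(sample_variances) / len(outcomes)
bias_factor = expected_sample_variance / true_variance
print(expected_sample_mean, true_mean, expected_sample_variance, true_variance, bias_factor)
```

The expected sample mean reproduces the true mean, while the expected sample variance is smaller than the true variance by the factor $(M - 1)/M$.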
We are assured that the sample mean is unbiassed. But the probability that
$\langle x \rangle_M = \langle x \rangle$, that is that we obtain exactly the true value, is very small. It helps to
have a measure of the extent to which $\langle x \rangle_M$ is likely to be very far from $\langle x \rangle$. To do
so, we need the idea of dispersion – the expected or average squared value of some
quantity about some interesting value, like its mean. The most familiar measure of
dispersion is the variance, already used above, the expected fluctuation of a random
variable about its mean:

$$\sigma^2 = \langle (x - \langle x \rangle)^2 \rangle.$$