Network Quality Control
P.J.G. Teunissen
Series on Mathematical Geodesy and Positioning
This work is licensed under a Creative Commons Attribution 4.0 International license
To promote open access, this new edition of Network Quality Control is published by TU Delft
Open Publishing instead of Delft Academic Press. The book builds on the foundations of
adjustment theory and testing theory to enable precise and reliable designs of geodetic
networks and their connections. As such, the book is a natural follow-on of the books
Adjustment Theory (2nd Ed. 2024) and Testing Theory (3rd Ed. 2024), both with TU Delft Open
Publishing.
September, 2024
This book is the result of a series of lectures and courses the author has given on the topic
of network analysis. During these courses it became clear that there is a need for refer-
ence material that integrates network analysis with the statistical foundations of parameter
estimation and hypothesis testing. Network quality control deals with the qualitative as-
pects of network design, network adjustment, network validation and network connection,
and as such conveys the necessary knowledge for computing and analysing networks in
an integrated manner.
In completing the book, the author received valuable assistance from Ir. Hedwig Ver-
hoef, Dr. Ir. Dennis Odijk and Ria Scholtes. Hedwig Verhoef has also been one of the
lecturers and took care of editing a large portion of the book. This assistance is gratefully acknowledged.
P.J.G. Teunissen
December 2006
Contents
1 An overview 1
2 Estimation and precision 9
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Consistency and uniqueness . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 The Linear A-model: observation equations . . . . . . . . . . . . . . . . 13
2.3.1 Least-squares estimates . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.2 A stochastic model for the observations . . . . . . . . . . . . . . 17
2.3.3 Least-squares estimators . . . . . . . . . . . . . . . . . . . . . . 18
2.3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4 The Nonlinear A-model: observation equations . . . . . . . . . . . . . . 20
2.4.1 Nonlinear observation equations . . . . . . . . . . . . . . . . . . 20
2.4.2 The linearized observation equations . . . . . . . . . . . . . . . . 23
2.4.3 Least-Squares iteration . . . . . . . . . . . . . . . . . . . . . . . 28
2.5 The B-Model: condition equations . . . . . . . . . . . . . . . . . . . . . 29
2.5.1 Linear condition equations . . . . . . . . . . . . . . . . . . . . . 29
2.5.2 Nonlinear condition equations . . . . . . . . . . . . . . . . . . . 32
2.6 Special Least-Squares procedures . . . . . . . . . . . . . . . . . . . . . 32
2.6.1 Recursive least-squares . . . . . . . . . . . . . . . . . . . . . . . 33
2.6.2 Constrained least-squares . . . . . . . . . . . . . . . . . . . . . . 34
2.6.3 Minimally constrained least-squares . . . . . . . . . . . . . . . . 36
2.7 Quality control: precision . . . . . . . . . . . . . . . . . . . . . . . . . . 40
A Appendix 111
A.1 Mean and variance of scalar random variables . . . . . . . . . . . . . . . 111
A.2 Mean and variance of vector random variables . . . . . . . . . . . . . . . 115
B References 123
Chapter 1
An overview
This introductory chapter gives an overview of the material presented in the book. The
book consists of three parts: a first part on estimation theory, a second part on testing theory and a third part on network theory. The first two parts are of a more general
nature. The material presented therein is in principle applicable to any geodetic project
where measurements are involved. Most of the examples given however, are focussed
on the network application. In the third part, the computation and validation of geodetic
networks is treated. In this part, we make frequent use of the material presented in the
first two parts. In order to give a bird’s eye view of the material presented, we start with a
brief overview of the three parts.
ADJUSTMENT: The need for an adjustment arises when one has to solve an inconsis-
tent system of equations. In geodesy this is most often the case, when one has to solve
a redundant system of observation equations. The adjustment principle used is that of
least-squares. A prerequisite for applying this principle in a proper way, is that a number
of basic assumptions need to be made about the input data, the measurements. Since measurements are always uncertain to some degree, they are modeled as sample values of a
random vector, the m-vector of observables y (note: the underscore will be used to denote
random variables). In case the vector of observables is normally distributed, its distri-
bution is uniquely characterized by the first two (central) moments: the expectation (or
mean) E{y} and the dispersion (or variance) D{y}. Information on both the expectation
and dispersion needs to be provided, before any adjustment can be carried out.
The information on the expectation of y is given in the form of the linear system of observation equations

E{y} = Ax
This system is referred to as the functional model. It is given once the design matrix A of
order m × n is specified.
The system as it is given here, is linear in x. Quite often however, the observation
equations are nonlinear. In that case a linearization needs to be carried out, to make the
system linear again. The parameter vector x usually consists of coordinates and possi-
bly, additional nuisance parameters, such as for instance orientation unknowns in case of
theodolite measurements. The coordinates could be of any type. For instance, they could
Least-squares: Once the measurements have been collected and the functional model
and the stochastic model have been specified, the actual adjustment can be carried out.
The least-squares estimator of the unknown parameter vector x, is given as
x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y
It depends on the design matrix A, the variance matrix Qy and the vector of observables
y. With x̂, one can compute the adjusted observables as ŷ = Ax̂ and the least- squares
residuals as ê = y − ŷ.
The above expression for the least-squares estimator is based on a functional model
which is linear. In the nonlinear case, one will first have to apply a linearization before
the above expression can be applied. For the linearization one will need approximate
values for the unknown parameters. In case approximate knowledge of the geometry of the network is already available, the approximate coordinates of the network points
can be obtained from a map. If not, a minimum set of the observations themselves will
have to be used for computing approximate coordinates. In case the approximate values
of the unknown parameters are rather poor, one often will have to iterate the least-squares
solution.
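To make this computation concrete, the following small sketch (not from the book; the design matrix, variance matrix and observations are made-up values) evaluates x̂, ŷ and ê for a linear model with numpy.

```python
import numpy as np

# Hypothetical 3x2 design matrix, variance matrix and observations (made-up values)
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
Qy = np.diag([1e-4, 1e-4, 2e-4])           # variance matrix of the observables
y = np.array([1.013, 2.021, 3.030])        # sample values of y

W = np.linalg.inv(Qy)                      # weight matrix W = Qy^-1
N = A.T @ W @ A                            # normal matrix A^T Qy^-1 A
x_hat = np.linalg.solve(N, A.T @ W @ y)    # x̂ = (A^T Qy^-1 A)^-1 A^T Qy^-1 y
y_hat = A @ x_hat                          # adjusted observables ŷ = A x̂
e_hat = y - y_hat                          # least-squares residuals ê = y - ŷ
print(x_hat, e_hat)
```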
Quality: Every function of a random vector, is itself a random variable as well. Thus
x̂ is a random vector, just like the vector of observables y is. And when x̂ is linearly
related to y, it will have a normal distribution whenever y has one. The mean of x̂ is given as

E{x̂} = x
Thus the expectation of the least-squares estimator equals the unknown, but sought for
parameter vector x. This property is known as unbiasedness. From an empirical point
of view, the equation implies that if the adjustment were repeated, each time with
measurements collected under similar circumstances, then the different outcomes of the
adjustment would on the average coincide with x. It will be clear, that this is a desirable
property indeed.
The dispersion of x̂, describing its precision, is given as

Qx̂ = (A^T Qy^{-1} A)^{-1}
This variance matrix is independent of y. This is a very useful property, since it implies
that one can compute the precision of the least-squares estimator without having the actual
measurements available. Only the two matrices A and Qy need to be known. Thus once
the functional model and stochastic model have been specified, one is already in a position
to know the precision of the adjustment result. It also implies, that if one is not satisfied
with this precision, one can change it by changing A and/or Qy . This is typically done at
the design stage of a geodetic project, prior to the actual measurement stage. Changing
the geometry of the network and/or adding/deleting observables, will change A. Using
different measurement equipment and/or different measurement procedures, changes Qy .
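A sketch of this design-stage reasoning, with assumed numbers: the variance matrix Qx̂ = (A^T Qy^{-1} A)^{-1} is evaluated from A and Qy alone, and then re-evaluated after an assumed improvement of the measurement precision.

```python
import numpy as np

def covariance_of_estimator(A, Qy):
    """Qx = (A^T Qy^-1 A)^-1; needs only the design matrix and the variance matrix."""
    W = np.linalg.inv(Qy)
    return np.linalg.inv(A.T @ W @ A)

# Hypothetical design: 3 observations of 2 parameters (made-up values)
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Qx1 = covariance_of_estimator(A, np.diag([1e-4, 1e-4, 2e-4]))
# Assumed better instrument: a smaller Qy gives a smaller Qx without any observations
Qx2 = covariance_of_estimator(A, np.diag([1e-4, 1e-4, 2e-4]) / 4.0)
print(np.sqrt(np.diag(Qx1)), np.sqrt(np.diag(Qx2)))   # standard deviations of x̂
```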
TESTING: Applying only an adjustment to the observed data is not enough. The result
of an adjustment and its quality rely heavily on the validity of the functional and stochastic
model. Errors in one of the two, or in both, will invalidate the adjustment results. One
therefore needs, in addition to the methods of adjustment theory, also methods that allow
one to check the validity of the assumptions underlying the functional and stochastic
model. These methods are provided for by the theory of statistical testing.
Model errors: One can make various errors when formulating the model needed for
the adjustment. The functional model could have been misspecified, E{y} ≠ Ax. The stochastic model could have been misspecified, D{y} ≠ Qy. Even the distribution of y
need not be normal. In these lecture notes, we restrict our attention to misspecifications
in the functional model. These are by far the most common modelling errors that occur
in practice. Denoting the model error as b, we have E{y} = Ax + b. If it is suspected
that model errors did indeed occur, one usually, on the basis of experience, has a fair idea
what type of model error could have occurred. This implies that one is able to specify the
vector b in the form of equations like
b = C∇
Test statistic: It will be intuitively clear that the least-squares residual vector ê, must
play an important role in validating the model. It is zero, when the measurements form a
perfect match with the functional model, and it departs from zero, the more the measure-
ments fail to match the model. A test statistic is a random variable that measures on the
basis of the least-squares residuals, the likelihood that a model error has occurred. For a
model error of the type C∇, it reads
Tq = ê^T Qy^{-1} C (C^T Qy^{-1} Qê Qy^{-1} C)^{-1} C^T Qy^{-1} ê
It depends, apart from the least-squares residuals, also on the matrix C, on the design
matrix A (through Qê ) and on the variance matrix Qy . The test statistic has a central Chi-
squared distribution with q degrees of freedom, χ²(q, 0), when the model error is absent. When the value of the test statistic falls in the right tail-area of this distribution, one is inclined to believe that the model error indeed occurred. Thus the presence of the model error is considered likely when Tq > χ²_{αq}(q, 0), where αq is the chosen level of significance.
Testing procedure: In practice it is generally not only one model error one is concerned
about, but quite often many more than one. In order to take care of these various potential
modelling errors, one needs a testing procedure. It consists of three steps: detection,
identification and adaptation. The purpose of the detection step is to infer whether one
has any reason to believe that the model is wrong. In this step one still has no particular model error in mind. The test statistic for detection reads

T_{m−n} = ê^T Qy^{-1} ê

One decides to reject the model when T_{m−n} > χ²_{α_{m−n}}(m − n, 0).
When the detection step leads to rejection, the next step is the identification of the
most likely model error. The identification step is performed with test statistics like T q .
It implies that one needs to have an idea about the type of model errors that are likely to
occur in the particular application at hand. Each member of this class of potential model
errors is then specified through a matrix C. In case of one dimensional model errors, such
as blunders, the C-matrix becomes a vector, denoted as c. In that case q = 1 and the test
statistic T q simplifies considerably. One can then make use of its square-root, which reads
w = (c^T Qy^{-1} ê) / (c^T Qy^{-1} Qê Qy^{-1} c)^{1/2}
This test statistic has a standard normal distribution N(0, 1) in the absence of the model error. The particular model error that corresponds with the vector c, is then said to have occurred with a high likelihood, when |w| > N_{α₁/2}(0, 1). In order to have the model error
detected and identified with the same probability, one will have to relate the two levels
of significance, αm−n and α1 . This is done by equating the power and the noncentrality
parameters of the above two test statistics T m−n and w.
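The following sketch illustrates the w-test for one-dimensional errors (conventional data snooping), using made-up numbers and a critical value from the standard normal distribution; it only demonstrates the formula above, not the complete detection-identification-adaptation procedure.

```python
import numpy as np
from scipy import stats

# Hypothetical model and data (made-up values)
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
Qy = np.diag([1e-4] * 4)
y = np.array([1.000, 2.002, 3.015, -0.999])       # third observation carries a small error

W = np.linalg.inv(Qy)
Qx = np.linalg.inv(A.T @ W @ A)
x_hat = Qx @ A.T @ W @ y
e_hat = y - A @ x_hat
Qe = Qy - A @ Qx @ A.T                            # variance matrix of the residuals

alpha1 = 0.001
crit = stats.norm.ppf(1.0 - alpha1 / 2.0)         # two-sided critical value N_{alpha1/2}(0,1)
for i in range(len(y)):
    c = np.zeros(len(y)); c[i] = 1.0              # one-dimensional error in observation i
    w = (c @ W @ e_hat) / np.sqrt(c @ W @ Qe @ W @ c)
    print(i, round(w, 2), abs(w) > crit)          # flag the suspect observation
```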
Once certain model errors have been identified as sufficiently likely, the last step con-
sists of an adaptation of the data and/or model. This implies either a remeasurement of the
data or the inclusion of additional parameters into the model, such that the model errors
are accounted for. In both cases one always should check again of course, whether the
newly created situation is acceptable or not.
Quality: In case a model error of the type C∇ occurs, the least-squares estimator x̂ will
become biased. Thus E{x̂} ≠ x. The dispersion or precision of the estimator however,
remains unaffected by this model error. The bias in x̂, due to a model error C∇, is given
as
∇x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} C∇
The purpose of testing the model, is to minimize the risk of having a biased least-squares
solution. However, one should realize that the outcomes of the statistical tests are not
exact and thus also prone to errors. It depends on the 'strength' of the model, how much
confidence one will have in the outcomes of these statistical tests. A measure of this
confidence is provided for by the concept of reliability. When the above w-test statistic is
used, the size of the model error that can be found with a probability γ , is given by the
Minimal Detectable Bias (MDB). It reads
|∇| = [ λ(α₁, 1, γ) / (c^T Qy^{-1} Qê Qy^{-1} c) ]^{1/2}
where λ (α1 , 1, γ ) is a known function of the level of significance α1 and the detection
probability (power) γ . The set of MDB’s, one for each model error considered, is said to
describe the internal reliability of the model.
As it was the case with precision, the internal reliability can be computed once the
design matrix A and the variance matrix Qy are available. Changing A and/or changing
Qy , will change the MDB’s. In this way one can thus change (e.g. improve) the internal
reliability. Substitution of C | ∇ | for C∇ in the above expression for ∇x̂, will show by
how much the least-squares solution becomes biased, when a model error of the size of
the MDB occurs. The bias vectors ∇x̂, one for each model error considered, are then said
to describe the external reliability of the model.
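As a sketch of how the internal reliability can be evaluated (again with made-up numbers): the factor λ(α₁, 1, γ) is obtained here by a small numerical search over the noncentral Chi-squared distribution, and the MDB of each observation then follows from the formula above.

```python
import numpy as np
from scipy import stats, optimize

def noncentrality(alpha1, gamma, df=1):
    """Solve P(chi2'_df(lam) > k_alpha) = gamma for the noncentrality parameter lam."""
    k = stats.chi2.ppf(1.0 - alpha1, df)
    return optimize.brentq(lambda lam: stats.ncx2.sf(k, df, lam) - gamma, 1e-6, 100.0)

# Hypothetical model (made-up values)
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
Qy = np.diag([1e-4] * 4)
W = np.linalg.inv(Qy)
Qe = Qy - A @ np.linalg.inv(A.T @ W @ A) @ A.T

lam = noncentrality(alpha1=0.001, gamma=0.80)     # roughly 17 for these choices
for i in range(A.shape[0]):
    c = np.zeros(A.shape[0]); c[i] = 1.0
    mdb = np.sqrt(lam / (c @ W @ Qe @ W @ c))      # |∇| for a blunder in observation i
    print(i, mdb)
```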
points. The determination and validation of the overall geometry is usually divided into two
phases: (1) the free network phase, and (2) the connected network phase.
Free network phase: In this phase, the known coordinates of the control points do not
take part in the determination of the geometry. It is thus free from the influence of the
existing control points. The idea is that a good geodetic network should be sufficiently
precise and reliable in itself, without the need of external control. It implies, when in the
second phase, the connected network phase, rejection of the model occurs, that one has
good reason to believe that the cause for rejection should be sought in the set of control
points, instead of in the geometry of the free network.
As with any geodetic project, the three steps involved in the free network phase are:
design (precision and reliability), adjustment (determination of geometry) and testing
(validation of geometry). With free networks however, there is one additional aspect
that should be considered carefully. It is the fundamental non-uniqueness in the relation
between geodetic observables and coordinates. This implies, that when computing co-
ordinates for the free network, additional information in the form of minimal constraints
are needed, to eliminate the non-uniqueness between observables and coordinates. The
minimal constraints however, are not unique. There is a whole set from which they can
be chosen. This implies that the set of adjusted coordinates of the free network, including their variance matrix and external reliability, is not unique either. This in turn implies that one should only use procedures for evaluating the precision and reliability that are guaranteed to be invariant to the choice of minimal constraints. If this precaution is not taken, one will end up using an evaluation procedure of which the outcome is dependent on the arbitrary choice of minimal constraints.
Connected network phase: The purpose of this second phase is to integrate the geom-
etry of the free network into the geometry of the control points. The observables are the
coordinates of the free network and the coordinates of the control points. Since the co-
ordinates of the two sets are often given in different coordinate systems, the connection
model will often be based on a coordinate transformation from the coordinate system of
the free network to that of the control network.
In contrast to the free network phase, the design, adjustment and testing are now
somewhat nonstandard. First of all there is not much left to design. Once the free network
phase has been passed, the geometry of the free network as well as that of the control
points are given. This implies that already at the design stage of the free network, one
should take into account the distribution of the free network points with respect to the
distribution of the control points.
Secondly, the adjustment in the connected network phase is not an ordinary least-
squares adjustment. In most applications, it is not very practical to see the coordinates
of the control points change every time a free network is connected to them. This would happen however, if an ordinary adjustment were carried out. Thus instead, a
constrained adjustment is applied, with the explicit constraints that the coordinates of the
control points remain fixed.
For testing however, a constrained adjustment would not be realistic. After all, the
coordinates of the control points are still samples from random variables and therefore
not exact. Thus for the validation of the connected geometry, the testing is based on the
least-squares residuals that follow from an ordinary adjustment and not from a constrained
adjustment.
Chapter 2
Estimation and precision
2.1 Introduction
In this equation the known scalars aiα determine how the measurements are related to the
unknown parameters. By introducing the matrix A and the vectors y and x as
A = [a11 ... a1n; ... ; am1 ... amn] , y = (y1, ..., ym)^T , x = (x1, ..., xn)^T
equation (2.1) can be written in matrix-vector form as
y = Ax   (2.2)

with y the m × 1 vector of observations, A the m × n design matrix and x the n × 1 parameter vector. A solution x of (2.2) exists if and only if y can be written as a linear combination of the column vectors of matrix A, that is, if and only if

y ∈ R(A)   (2.3)

where R(A) denotes the range space of matrix A.
Systems of equations for which this holds true, are called consistent. A system is said
to be inconsistent if it is not consistent. In this case the vector y can not be written as a
linear combination of the column vectors of matrix A and hence no vector x exists such
that (2.2) holds true.
Since y ∈ Rm, it follows from (2.3) that consistency is guaranteed if R(A) = Rm, that is, if the range space of matrix A equals the whole m-dimensional observation space. But
R(A) = Rm only holds true, if the dimension of R(A) equals the dimension of Rm , which
is m. It follows therefore, since the dimension of R(A) equals the rank of matrix A (note:
the rank of a matrix equals the total number of linear independent columns of this matrix),
that consistency is guaranteed if and only if
rank A = m (2.4)
In all other cases, rank A < m, the linear system may or may not be consistent.
Let us now assume that the system is indeed consistent. The next question one may ask
is whether the solution to (2.2) is unique or not. That is, whether the information content
of the measurements collected in the vector y is sufficient to determine the parameter
vector x uniquely. The solution is only unique if all the column vectors of matrix A are
linearly independent. Hence, the solution is unique if the rank of matrix A equals the number
of unknown parameters,
rank A = n (2.5)
To see this, assume x and x′ ≠ x to be two different solutions of (2.2). Then Ax = Ax′ or A(x − x′) = 0. But this can only be the case if some of the columns of matrix A are linearly dependent, which contradicts the assumption (2.5) of full rank. Thus the solution is unique when rank A = n and it is nonunique when rank A < n.
Unless otherwise stated, we will assume from now on that the matrix A of the linear
system (2.2) is of full rank.
With rank A = n and the fact that the rank of a matrix is always less than or equal to the number of its rows and the number of its columns, it follows that we can discriminate between the following two cases

m = n = rank A    or    m > n = rank A   (2.6)
In the first case, both (2.4) and (2.5) are satisfied. This implies that the system is both
consistent and unique. Thus a solution exists and this solution is also unique. The unique
solution, denoted by x̂, is given as
x̂ = A−1 y (2.7)
where A−1 denotes the inverse of matrix A.
Example 1
Consider the linear system
(2; 1) = [1, 3; 2, −1] (x1; x2)   (2.8)

with y = (2, 1)^T and A = [1, 3; 2, −1].
In the second case, only (2.5) is satisfied. This implies that a unique solution exists,
provided that the system is consistent. If the system is consistent, the unique solution can
be obtained by inverting n out of the m > n linear equations. Hence, we first partition
(2.2) as
(y1; y2) = (A1; A2) x
where y1 is an n-vector, y2 is an m − n-vector and A1 and A2 are of order n × n and
(m − n) × n respectively. The unique solution x̂ follows then as
x̂ = A1^{-1} y1
Note that y2 is not used in computing x̂. This is due to the fact that in the present situation,
y2 is consistent with y1 and hence, it does not contain any additional information.
Example 2
Consider the linear system
(−2; 3; −1) = [1, 3; 2, −1; 1, 2] (x1; x2)   (2.9)
Taking, for instance, the last two equations of (2.9), the unique solution follows as

(x̂1; x̂2) = [2, −1; 1, 2]^{-1} (3; −1) = (1/5)[2, 1; −1, 2] (3; −1) = (1; −1)
The question that remains is: What to do when the linear system of equations is incon-
sistent? In that case, we first need to make the system consistent before a solution can be
computed. There are however many ways in which an inconsistent system can be made
consistent. In the next section we will give a way, which is intuitively appealing. And
later on we will show that this approach of finding a solution to an inconsistent system
also has some optimality properties, in particular in a probabilistic context.
In (2.11), y and A are given, whereas x and e are unknown. From the geometry of
figure 2.1 it seems intuitively appealing to estimate x as x̂ such that Ax̂ is as close as possible to the given measurement or observation vector y. In other words, the idea is to
Figure 2.1 The geometry of y = Ax + e.
find that value of x that minimizes the length of the vector e = y − Ax. This idea leads to
the following minimization problem:
AT Ax̂ = AT y (2.16)
Since rank A^T A = rank A = n, the system is consistent and has a unique solution. Through an inversion of the normal matrix A^T A the unique solution of (2.16) is found as

x̂ = (A^T A)^{-1} A^T y
That this solution x̂ is indeed the minimizer of (2.14) follows from the fact that the matrix
∂ 2 F/∂ x2 of (2.15) is indeed positive-definite. The vector x̂ is known as the least-squares
estimate of x, since it produces the least possible value of the sum-of-squares function
F(x).
From the normal equations (2.16) follows that AT (y − Ax̂) = 0. This shows that the
vector ê = y−Ax̂, which is the least-squares estimate of e, is orthogonal to the range space
of matrix A (see figure 2.2):
Figure 2.2 The geometry of least-squares.
Example 3
Let us consider the next problem
(y1; y2) = (1; 1) x + (e1; e2)
2x̂ = y1 + y2

x̂ = (1/2)(y1 + y2)
Thus, to estimate x, one adds the measurements and divides by the number
of measurements. Hence, the least-squares estimate equals in this case the
arithmetic average.
The least-squares estimates of the observations and observation errors follow
from ŷ = Ax̂ and ê = y − ŷ as
(ŷ1; ŷ2) = (1/2)(y1 + y2; y1 + y2) , and (ê1; ê2) = (1/2)(y1 − y2; y2 − y1)
ê^T ê = (1/2)(y1 − y2)²
The solution of (2.19) can be derived along lines which are similar as the ones used
for solving (2.12). The solution of (2.19) reads
x̂ = (AT WA)−1 AT Wy (2.20)
This is the weighted least-squares estimate of x. In case of weighted least-squares the
normal equations read: AT WAx̂ = AT Wy. This shows that the vector ê = y − Ax̂, which is
the weighted least-squares estimate of e, satisfies
AT W ê = 0 , with ê = y − Ax̂ (2.21)
If the inner product of the observation space Rm is defined as (a, b) = a^T W b, ∀a, b ∈ Rm, (2.21) can also be written as (Ax, ê) = 0, ∀x ∈ Rn. This shows that also in the case of weighted least-squares, the vector ê can be considered to be orthogonal to the range space of A.
A summary of the least-squares algorithm is given in Table 2.1.
Example 4
Consider again the problem
(y1; y2) = (1; 1) x + (e1; e2)

but now with the weight matrix

W = [w11, 0; 0, w22]

The weighted least-squares estimate (2.20) then reads

x̂ = (w11 y1 + w22 y2) / (w11 + w22)
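A quick numerical check of this example (the weights and observations below are arbitrary values chosen for illustration): the general weighted least-squares formula (2.20) indeed reproduces the weighted average.

```python
import numpy as np

A = np.array([[1.0], [1.0]])                     # two observations of a single parameter x
W = np.diag([2.0, 3.0])                          # arbitrary weights w11, w22
y = np.array([10.2, 9.8])

x_hat = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)          # (A^T W A)^-1 A^T W y
weighted_mean = (2.0 * 10.2 + 3.0 * 9.8) / (2.0 + 3.0)     # (w11 y1 + w22 y2)/(w11 + w22)
print(x_hat[0], weighted_mean)                   # identical results
```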
2.3.2 A stochastic model for the observations

In order to model the uncertainty in the measurements, the m-vector of measurements is considered to be a sample value of a random vector of observables y (Note: the underscore indicates that we are dealing with
a random variable). It is furthermore assumed that the vector of observables y can be
written as the sum of a deterministic functional part Ax and a random residual part e:
y = Ax + e (2.22)
E{e} = 0 (2.23)
where E{.} stands for the mathematical expectation operator. The measurement variabil-
ity itself is modelled through the dispersion- or variance matrix of e. We will assume that
this matrix is known and denote it by Qy :
D{e} = Qy (2.24)
where D{.} stands for the dispersion operator. It is defined in terms of E{.} as

D{e} = E{(e − E{e})(e − E{e})^T}
With (2.23) and (2.24) we are now in the position to determine the mean and variance
matrix of the vector of observables y. Application of the law of propagation of means and
the law of propagation of variances to (2.22) gives with (2.23) and (2.24):

E{y} = Ax ; D{y} = Qy   (2.25)
This will be our model for the vector of observables y. As the results of the next sec-
tion show, model (2.25) enables us to describe the quality of the results of least-squares
estimation in terms of the mean and the variance matrix.
The first moment; the mean: Together with E{y} = Ax, an application of the propaga-
tion law of means to (2.26) gives
E{x̂} = x , E{ŷ} = E{y} , E{ê} = E{e} = 0   (2.27)
These results show, that under the assumption that (2.25) holds, the least-squares estima-
tors are unbiased estimators. Note that this property of unbiasedness is independent of
the choice for the weight matrix W.
The second moment; the variance matrix: Together with D{y} = Qy , an application of
the propagation law of variances and covariances to (2.26) gives
Qx̂ = (A^T W A)^{-1} A^T W Qy W A (A^T W A)^{-1}
Qŷ = A Qx̂ A^T   (2.28)
Qê = [I − A(A^T W A)^{-1} A^T W] Qy [I − A(A^T W A)^{-1} A^T W]^T
and

Qx̂ŷ = Qx̂ A^T
Qx̂ê = (A^T W A)^{-1} A^T W Qy − Qx̂ A^T   (2.29)
Qŷê = A Qx̂ê
The above variance matrices enable us now to give a complete precision description
of any arbitrary linear function of the estimators. Consider for instance the linear function
θ̂ = aT x̂. Application of the propagation law of variances gives then for the precision of
θ̂ : σθ̂2 = aT Qx̂ a.
The above results enable us to describe the quality of the results of least-squares es-
timation in terms of the mean and the variance matrix. The introduction of a stochastic
model for the vector of observables y enables us however also to judge the merits of the
least-squares principle itself. Recall that the least-squares principle was introduced on the
basis of intuition and not on the basis of probabilistic reasoning. With the mathematical
model (2.24) one could now however try to develop an estimation procedure that pro-
duces estimators with certain well defined probabilistic optimality properties. One such
procedure is based on the principle of ”Best Linear Unbiased Estimation (BLUE)”.
Assume that we are interested in estimating a parameter θ which is a linear function
of x:
θ = a^T x   (2.30)
The estimator of θ will be denoted as θ̂ . Then according to the BLUE’s criteria, the
estimator θ̂ of θ has to be a linear function of y,
θ̂ = l^T y   (2.31)
such that it is unbiased,
E{θ̂ } = θ (2.32)
and of minimal variance,

E{(θ̂ − θ)²} = minimal   (2.33)
The objective is thus to find a vector l ∈ Rm such that with (2.31), the conditions (2.32)
and (2.33) are satisfied. It can be shown that the solution to the above problem is given by
l^T = a^T (A^T Qy^{-1} A)^{-1} A^T Qy^{-1}

so that

θ̂ = a^T (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y   (2.34)
This is the best linear unbiased estimator of θ . The important result (2.34) shows that the
best linear unbiased estimator of x is given by
x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y   (2.35)
A comparison between (2.26) and (2.35) shows that the BLUE of x is identical to the
weighted least-squares estimator of x if the weight matrix W is taken to be equal to the
inverse of the variance matrix of y:
W = Qy^{-1}   (2.36)
This is an important result, because it shows that the weighted least-squares estimators are best in the probabilistic sense of having minimal variance if (2.36) holds. The variances and covariances of these estimators follow if the weight matrix W is replaced in (2.28) and (2.29) by Qy^{-1}.
From now on we will always assume, unless stated otherwise, that the weight matrix W is chosen to be equal to Qy^{-1}. Consequently no distinction will be made any more in these lecture notes between weighted least-squares estimators and best linear unbiased estimators. Instead we will simply speak of least-squares estimators.
2.3.4 Summary
In Table 2.2 an overview is given of the main results of Least-Squares Estimation.
The table shows the linear model of observation equations, the linear A-model. Based
on A, Qy and y, the least-squares estimator of the unknown parameter vector x can be
determined. And from it, one can determine the adjusted vector of observables, ŷ, the
least-squares residual vector, ê, and the least-squares estimator of an arbitrary function
θ = a^T x, as θ̂ = a^T x̂. As shown in the table, each of these random vectors has its own mean and variance matrix. Some of the random vectors are correlated, such as x̂ and ŷ, and some of them are not, such as x̂ and ê.
Table 2.2 Overview of the least-squares estimators: x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y ; ŷ = A x̂ ; ê = y − ŷ ; θ̂ = a^T x̂, together with their means and variance matrices.
2.4 The Nonlinear A-model: observation equations

So far the model of observation equations has been assumed to be linear in the unknown parameters. In practical applications there are however only a few cases where this assumption truly holds. A
typical example is levelling. In the majority of applications however the m-vector E{y} is
nonlinearly related to the n-vector of unknown parameters x. This implies that instead of
the linear A-model (2.25), we are generally dealing with a nonlinear model of observation
equations:
E{y} = A(x) ; D{y} = Qy (2.37)
where A(.) is a nonlinear vectorfunction from Rn into Rm . The following two simple
examples should make this clear.
Example 5
Consider the configuration of figure 2.3. The x,y coordinates of the three
points 1, 2 and 3 are known and the coordinates x4 and y4 of point 4 are unknown. The observables consist of the three azimuth variates a14, a24 and a34.

Figure 2.3 Configuration with the known points 1, 2, 3, the unknown point 4 and the azimuths a14, a24, a34; xij = xj − xi, yij = yj − yi.

Since azimuth and coordinates are related as
tan aij = (xj − xi)/(yj − yi) = xij/yij
The model of observation equations for the configuration of figure 2.3 reads
E{(a14; a24; a34)} = (arctan[x14/y14]; arctan[x24/y24]; arctan[x34/y34])
This model consists of three nonlinear observation equations in the two un-
known parameters x4 and y4 .
Example 6
Consider the situation of figure 2.4. It shows two cartesian coordinate sys-
tems: the x,y-system and the u,v-system. The two systems only differ in their
orientation. This means that if the coordinates of a point i are given in the
u,v-system, (ui , vi ), a rotation through an angle α is needed to obtain the
coordinates of the same point i in the x,y-system, (xi , yi ):
(xi; yi) = [cos α, −sin α; sin α, cos α] (ui; vi)   (2.38)
Let us now assume that we have at our disposal the coordinate observables
of two points in both coordinate systems: (xi , yi ) and (ui , vi ), i = 1, 2. Using
(2.38), our model reads then
E{(xi; yi)} = [cos α, −sin α; sin α, cos α] E{(ui; vi)} , i = 1, 2   (2.39)
Figure 2.4 The x,y-system and the u,v-system, which differ only in their orientation through the rotation angle α.
Taylor’s Theorem: Let f (x) be a function from Rn into R which is smooth enough. Let
x0 ∈ Rn be an approximation to x ∈ Rn and define Δx = x − x0, and θ = x0 + t(x − x0) with t ∈ (0, 1).
In (2.41) and (2.42), ∂_{α1...αq} f(x) denotes the qth-order partial derivative of f(x) evaluated at x. For the case q = 2, it follows from (2.41) and (2.42) that
f(x) = f(x0) + Σ_{α=1}^{n} ∂α f(x0) Δxα + (1/2) Σ_{α=1}^{n} Σ_{β=1}^{n} ∂²_{αβ} f(θ) Δxα Δxβ   (2.43)
If the first-order partial derivatives are collected in the gradient vector ∂x f(x0) and the second-order partial derivatives in the matrix ∂²xx f(θ), then equation (2.43) may be written in the more compact matrix-vector form as
f(x) = f(x0) + ∂x f(x0)^T Δx + (1/2) Δx^T ∂²xx f(θ) Δx   (2.44)
This important result shows that a nonlinear function f (x) can be written as a sum of
three terms. The first term in this sum is the zero-order term f (x0 ). The zero-order term
depends on x0 but is independent of x. The second term in the sum is the first-order term
∂x f (x0 )T Δx. It depends on x0 and is linearly dependent on x. Finally, the third term in the
sum is the second-order remainder R2 (θ , Δx).
A consequence of Taylor’s Theorem is that the remainder R2 (θ , Δx) can be made
arbitrarily small by choosing the approximation x0 close enough to x. Now assume that the
x0 approximation is chosen such that the second-order remainder can indeed be neglected.
Then, instead of (2.44) we may write to a sufficient approximation:
f (x) = f (x0 ) + ∂x f (x0 )T Δx (2.45)
Hence, if x0 is sufficiently close to x, the nonlinear function f (x) can be approximated to
a sufficient degree by the function f (x0 ) + ∂x f (x0 )T Δx which is linear in x. This function
is the linearized version of f (x). A geometric interpretation of this linearization is given
in figure 2.5 for the case n = 1. Let us now apply the above linearization to our nonlinear
observation equations
E{y} = A(x) = (a1(x); ...; am(x))   (2.46)
Figure 2.5 The nonlinear curve y = f(x) and its linear tangent y = f(x0) + (d/dx) f(x0)(x − x0).
Each nonlinear observation equation ai (x), can now be linearized according to (2.45).
This gives
(a1(x); ...; am(x)) = (a1(x0); ...; am(x0)) + (∂x a1(x0)^T; ...; ∂x am(x0)^T) Δx   (2.47)
If we denote the m × n matrix of (2.47) as ∂x A(x0 ), and substitute (2.47) into (2.46) we
get

E{y} = A(x0) + ∂x A(x0) Δx   (2.48)

If we bring the constant m-vector A(x0) to the left-hand side of the equation and define Δy = y − A(x0), we finally obtain our linearized model of observation equations

E{Δy} = ∂x A(x0) Δx ; D{Δy} = Qy   (2.49)

This is the linearized A-model. Compare (2.49) with (2.37) and (2.25). Note when com-
paring (2.49) with (2.25) that in the linearized A-model Δy takes the place of y, ∂x A(x0 )
takes the place of A and Δx takes the place of x. Since the linearized A-model is linear, our
standard formulae of least-squares can be applied again. This gives for the least-squares
estimator x̂ = x0 + Δx̂ of x:

x̂ = x0 + (∂x A(x0)^T Qy^{-1} ∂x A(x0))^{-1} ∂x A(x0)^T Qy^{-1} Δy   (2.50)

with variance matrix

Qx̂ = (∂x A(x0)^T Qy^{-1} ∂x A(x0))^{-1}   (2.51)
It will be clear that the above results, (2.50) and (2.51), are approximate in the sense that
the second-order remainder is neglected. But these approximations are good enough if the
second-order remainder can be neglected to a sufficient degree. In this case also the op-
timality conditions of least-squares (unbiasedness, minimal variance) hold to a sufficient
degree. A summary of the linearized least-squares estimators is given in Table 2.3.
Table 2.3 Summary of the linearized least-squares estimators and their variance matrices; in particular Qê = Qy − Qŷ.
Example 7
Consider the configuration of figure 2.6. The x,y coordinates of the three
points 1, 2 and 3 are known and the two coordinates x4 and y4 of point 4 are
unknown. The observables consist of the three distance variates l14, l24, l34. Since distance and coordinates are related as

lij = [(xj − xi)² + (yj − yi)²]^{1/2} = (xij² + yij²)^{1/2}

the model of observation equations for the configuration of figure 2.6 reads
E{(l14; l24; l34)} = ((x14² + y14²)^{1/2}; (x24² + y24²)^{1/2}; (x34² + y34²)^{1/2})   (2.52)
This model consists of three nonlinear observation equations in the two un-
known parameters x4 and y4 .
In order to linearize (2.52) we need approximate values for the unknown
coordinates x4 and y4. These approximate values will be denoted as x4^0 and y4^0.

Figure 2.6 Configuration with the known points 1, 2, 3, the unknown point 4 and the distances l14, l24, l34; xij = xj − xi, yij = yj − yi.

Linearization of (2.52) then gives

E{(Δl14; Δl24; Δl34)} = [x14^0/l14^0, y14^0/l14^0; x24^0/l24^0, y24^0/l24^0; x34^0/l34^0, y34^0/l34^0] (Δx4; Δy4)   (2.53)

where

Δli4 = li4 − li4^0 , li4^0 = [(x4^0 − xi)² + (y4^0 − yi)²]^{1/2} , i = 1, 2, 3
Δx4 = x4 − x4^0 , Δy4 = y4 − y4^0
Example 8
Consider the nonlinear A-model (2.40) of example 6. The unknown parame-
ters are α and ui and vi for i = 1, 2. The approximate values of these param-
eters will be denoted as α 0 , u0i , v0i , for i = 1, 2. Linearization of (2.40) gives
then
E{(Δx1; Δy1; Δx2; Δy2; Δu1; Δv1; Δu2; Δv2)} = ∂x A(x0) (Δα; Δu1; Δv1; Δu2; Δv2)   (2.54)

with the rows of the 8 × 5 matrix ∂x A(x0) given by

[−u1^0 sin α^0 − v1^0 cos α^0 , cos α^0 , −sin α^0 , 0 , 0]
[ u1^0 cos α^0 − v1^0 sin α^0 , sin α^0 ,  cos α^0 , 0 , 0]
[−u2^0 sin α^0 − v2^0 cos α^0 , 0 , 0 , cos α^0 , −sin α^0]
[ u2^0 cos α^0 − v2^0 sin α^0 , 0 , 0 , sin α^0 ,  cos α^0]
[0 , 1 , 0 , 0 , 0]
[0 , 0 , 1 , 0 , 0]
[0 , 0 , 0 , 1 , 0]
[0 , 0 , 0 , 0 , 1]
where

Δxi = xi − xi^0 , xi^0 = ui^0 cos α^0 − vi^0 sin α^0
Δyi = yi − yi^0 , yi^0 = ui^0 sin α^0 + vi^0 cos α^0
Δui = ui − ui^0 , Δvi = vi − vi^0 , for i = 1, 2
Δα = α − α^0
One may expect that the new approximation x1 = x0 + Δx̂ is a better approximation than x0. That is, it seems reasonable to expect that x1 is closer
to the true least-squares estimate than x0 . In fact one can show that this is indeed the
case for most practical applications. But if x1 is a better approximation than x0 , a further
improvement can be expected if we replace x0 by x1 in the linearization of the nonlinear
model. The recomputed linearized least-squares estimate reads then
By repeating this process a number of times, one can expect that finally the solution
converges to the actual least-squares estimate x̂. This is called the least-squares iteration
process. The iteration is usually terminated if the difference between successive solutions
is negligible. A flowdiagram of the least-squares iteration process is shown in Figure 2.7.
Figure 2.7 Flow diagram of the least-squares iteration process: initialize with x0 and i = 0, compute the update, and stop once the norm of the correction drops below the tolerance δ, after which x̂ := xi+1.
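The sketch below runs such an iteration for a distance-resection problem of the type of Example 7; the coordinates, observations and tolerance are made-up values, and the rows of the design matrix follow from the linearization of (2.52).

```python
import numpy as np

# Hypothetical known points 1, 2, 3 and made-up distance observations to unknown point 4
known = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]])
l_obs = np.array([72.14, 72.14, 69.31])
Qy = np.diag([1e-4, 1e-4, 1e-4])
W = np.linalg.inv(Qy)

x = np.array([40.0, 40.0])                       # approximate coordinates x4^0, y4^0
for _ in range(20):
    diff = x - known                             # rows (x4 - xi, y4 - yi)
    l0 = np.linalg.norm(diff, axis=1)            # computed distances l_i4^0
    J = diff / l0[:, None]                       # design matrix of the linearized model
    dl = l_obs - l0                              # observed minus computed: Δl
    dx = np.linalg.solve(J.T @ W @ J, J.T @ W @ dl)
    x = x + dx                                   # improved approximate values
    if np.linalg.norm(dx) < 1e-8:                # stop when successive solutions agree
        break
print(x)                                         # converges to roughly (50, 52) here
```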
2.5 The B-Model: condition equations

With m observation equations in n unknown parameters, the number of independent condition equations equals

r = m − n   (2.56)
This number is also referred to as the redundancy of the model. We will now show how
one can construct the condition equations, given the linear A-model (2.55). Each of the
column vectors of matrix A is an element of the observationspace Rm . Together, the n-
number of linearly independent column vectors of A span the range space of A. This range
space has dimension n and it is a linear subspace of Rm : R(A) ⊂ Rm . Since dimR(A) = n
and dim Rm = m, one can find (m − n) linearly independent vectors bi in Rm that are orthogonal to R(A):
bi ⊥ R(A) or AT bi = 0 , i = 1, . . ., (m − n)
From this follows, if the (m − n)-number of linearly independent vectors bi are collected
in an m × (m − n) matrix B as B = (b1, . . ., b_{m−n}), that

B^T A = 0 ; rank B = m − n   (2.57)

where B^T is of order (m − n) × m and A of order m × n, so that B^T A is of order (m − n) × n.
This result may now be used to obtain the model of condition equations from (2.55).
Premultiplication of the linear system of observation equations in (2.55) by BT gives
together with (2.57), the following linear model of condition equations:

B^T E{y} = 0 ; D{y} = Qy   (2.58)
Example 9
Consider the following linear A-model:
E{(y1; y2; y3)} = (1; 1; 1) x ; D{y} = Qy   (2.59)
The two vectors b1 = (1, −1, 0)^T and b2 = (0, 1, −1)^T are linearly independent and are both orthogonal to the column vector of
matrix A in (2.59). Hence the linear model of condition equations corresponding with (2.59) reads

[1, −1, 0; 0, 1, −1] E{(y1; y2; y3)} = (0; 0) ; D{y} = Qy   (2.60)
Table 2.4 Least-squares estimators in terms of the linear B-model: ŷ = [I − Qy B(B^T Qy B)^{-1} B^T] y ; ê = y − ŷ.
Now that we have the linear B-model (2.58) at our disposal, how are we going to
compute the corresponding least-squares estimators? We know how to compute the least-
squares estimators for the linear A-model. The corresponding formulae are however all
expressed in terms of the A-matrix. What is needed is therefore to transform these for-
mulae such that they are expressed in terms of the B-matrix. This is possible with the
following important matrix identity:
A(A^T Qy^{-1} A)^{-1} A^T Qy^{-1} = I − Qy B(B^T Qy B)^{-1} B^T   (2.61)
The proof of this matrix identity goes as follows. We define two matrices C and C̄ as:
C = (A , Qy B)  and  C̄ = ((A^T Qy^{-1} A)^{-1} A^T Qy^{-1} ; (B^T Qy B)^{-1} B^T)   (2.62)
Since both matrices C and C̄ are of the order m × m and since both can be shown to be of
full rank, it follows that both matrices are invertible. From (2.62) follows with the help of
(2.57) that C̄C = Im . Hence C̄ = C −1 and therefore CC̄ = Im . Substitution of (2.62) into
this last expression proves (2.61).
With (2.61) and the least-squares results of Table 2.2 of Section 2.3.4 we are now in the
position to derive the expressions for the least-squares estimators in terms of the matrix
B. The results are summarized in table 2.4.
For some applications it may happen, when formulating the condition equations, that
the linear combinations of the observables do not sum up to a zero vector, but instead to a
known nonzero vector b. In that case the model of condition equations reads
BT E{y} = b ; D{y} = Qy
Due to the fact that b ≠ 0, the solution for ŷ now takes a somewhat different form. It reads

ŷ = y − Qy B(B^T Qy B)^{-1}(B^T y − b)

Note that this solution reduces to the one given earlier, when b = 0.
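A sketch of the B-model computation of Table 2.4 for the situation of Example 9, with made-up observations: the adjusted observables ŷ follow from y, Qy and B alone, without ever forming the A-matrix.

```python
import numpy as np

# Example 9 setting: three observations of one quantity, two condition equations (made-up data)
y = np.array([5.012, 5.004, 5.009])
Qy = np.diag([1e-4, 1e-4, 1e-4])
B = np.array([[1.0, 0.0],
              [-1.0, 1.0],
              [0.0, -1.0]])                      # columns b1, b2 with B^T A = 0 for A = (1,1,1)^T

# Table 2.4: y_hat = [I - Qy B (B^T Qy B)^-1 B^T] y
P = np.eye(3) - Qy @ B @ np.linalg.inv(B.T @ Qy @ B) @ B.T
y_hat = P @ y
e_hat = y - y_hat
print(y_hat, e_hat)                              # y_hat equals the arithmetic mean here
```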
2.5.2 Nonlinear condition equations

The nonlinear B-model of condition equations reads

B^T(E{y}) = 0 ; D{y} = Qy   (2.63)

where B^T(.) is a nonlinear vector function from Rm into Rm−n. The relation between the
nonlinear B-model and the nonlinear A-model is given by
BT (A(x)) = 0 , ∀ x ∈ Rn (2.64)
This is the nonlinear generalization of (2.57). If we take the partial derivative with respect
to x of (2.64) and apply the chain rule, we get

[∂y B(y0)]^T ∂x A(x0) = 0 , with y0 = A(x0)   (2.65)

This is the linearized version of (2.64). Compare (2.65) with (2.57). With (2.65) we
are now in the position to construct the linearized B-model from the linearized A-model
(2.49). Premultiplication of (2.49) with the matrix [∂y B(y0 )]T gives together with (2.65)
the result

[∂y B(y0)]^T E{Δy} = 0 ; D{Δy} = Qy   (2.66)

This is the linearized B-model. With (2.66) we are now in the position again to apply our
standard least-squares estimation formulae.
2.6 Special Least-Squares procedures

In this section three special cases of least-squares estimation will be discussed. They
are: recursive least-squares, constrained least-squares and minimally constrained least-
squares. In particular the last two will be needed when we discuss the adjustment and
testing of geodetic networks. Constrained least-squares, which can be seen as a particular
form of recursive least-squares, deals with the problem of solving a model of observation
equations, when there are explicit constraints on the unknown parameters. As it is shown,
the solution can be obtained in two steps. In the first step the model is solved without the
constraints, while in the second step, the final solution is obtained by using the results of
the first step in a formulation based on condition equations.
Minimally constrained least-squares is needed when the design matrix fails to be of
full rank. This typically occurs in cases of free network adjustments. Due to the rank de-
fect of the design matrix, a set of necessary and sufficient constraints need to be imposed
on the parameter vector in order to be able to compute a solution. The set of constraints
that can be imposed however, is not unique. This implies that a whole family of solutions
can be computed, of which each member is characterized by a particular set of minimal
constraints.
2.6.1 Recursive least-squares

Consider the model of observation equations partitioned into two sets,

E{(y1; y2)} = (A1; A2) x , D{(y1; y2)} = [Q1, 0; 0, Q2]

Note that y1 and y2 are assumed to be uncorrelated. This assumption is essential in order
to be able to formulate a recursive solution. We will denote the least-squares solution
which is based on the first set of observation equations as x̂(1) and the least- squares
solution which is based on both sets, as x̂(2) . The least-squares solution which is based on
the first set of observation equations reads

x̂(1) = (A1^T Q1^{-1} A1)^{-1} A1^T Q1^{-1} y1 , Qx̂(1) = (A1^T Q1^{-1} A1)^{-1}
and the least-squares solution which is based on the complete set of observation equations,
reads
x̂(2) = (A1^T Q1^{-1} A1 + A2^T Q2^{-1} A2)^{-1} (A1^T Q1^{-1} y1 + A2^T Q2^{-1} y2) , Qx̂(2) = (A1^T Q1^{-1} A1 + A2^T Q2^{-1} A2)^{-1}   (2.69)
Comparing the two solutions shows that the second solution can also be written as
x̂(2) = (Qx̂(1)^{-1} + A2^T Q2^{-1} A2)^{-1} (Qx̂(1)^{-1} x̂(1) + A2^T Q2^{-1} y2) , Qx̂(2) = (Qx̂(1)^{-1} + A2^T Q2^{-1} A2)^{-1}   (2.70)
This result shows that the solution of the partitioned model can be obtained in two steps.
First one solves for the first set of observation equations. This gives x̂(1) and Qx̂(1). Then in the second step this result is used together with y2 to obtain x̂(2) and Qx̂(2). This recursive
procedure has an important implication in practice. It implies that if new observations,
say y2, become available one does not need to save the old observations y1 to compute x̂(2). One can compute x̂(2) from y2 and the solution x̂(1) of the first step. It will be clear that one can extend the recursion to more than two steps by using a partitioning of the
model of observation equations into more than two sets.
The expression now clearly shows how x̂(2) is found from updating x̂(1) . The correction to
x̂(1) depends on the difference y2 −A2 x̂(1) . Since A2 x̂(1) can be interpreted as the prediction
of E{y2 } based on y1 , the difference y2 − A2 x̂(1) is called the predicted residual of E{y2 }.
Note that the predicted residual is not the same as the least-squares residual. The least-
squares residual of E{y2} namely reads y2 − A2 x̂(2).
Note that in the above expressions we need to invert a matrix having an order which
is equal to the number of entries in the parameter vector x. One can however also derive
an expression for x̂(2) in which a matrix needs to be inverted which has an order equal to
the dimension of y2 . From the matrix identity
(Qx̂(1)^{-1} + A2^T Q2^{-1} A2)^{-1} A2^T Q2^{-1} = Qx̂(1) A2^T (Q2 + A2 Qx̂(1) A2^T)^{-1}

it follows that (2.70) can also be written as

x̂(2) = x̂(1) + Qx̂(1) A2^T (Q2 + A2 Qx̂(1) A2^T)^{-1} (y2 − A2 x̂(1))
Qx̂(2) = Qx̂(1) − Qx̂(1) A2^T (Q2 + A2 Qx̂(1) A2^T)^{-1} A2 Qx̂(1)   (2.72)
Both expressions (2.70) and (2.72) give identical results. But the second expression is
more advantageous than the first, when the dimension of y2 is small compared to the
dimension of x. In that case the order of the matrix that needs to be inverted is smaller.
The above expression shows that the correction to x̂(1) is small if the predicted residual
is small. This is also what one would expect. Also note that the correction is small if the
variance of x̂(1) is small. This is also understandable, because if the variance of x̂(1) is
small one has more confidence in it and one therefore would like to give more weight to
it than to y2 .
The above expression also shows how the variance matrix gets updated. Because of
the minus sign, the precision of the estimator gets better. This is understandable, since by
including y2 more information is available to estimate x.
The above results show how the least-squares solution of the parameter vector x can
be updated in recursive form. But this is not the only solution that can be updated recur-
sively. Also the weighted sum-of- squares of the least-squares residuals, can be updated
recursively. It can be shown that
ê^T Qy^{-1} ê = ê1^T Q1^{-1} ê1 + v2^T Q_{v2}^{-1} v2   (2.73)
where ê1 = y1 − A1 x̂(1) is the least-squares residual vector of the first step and v2 = y2 −
A2 x̂(1) is the predicted residual which becomes available in the second step.
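A numerical sketch (made-up data) showing that the recursive update (2.70) reproduces the batch solution (2.69).

```python
import numpy as np

# Hypothetical partitioned model (made-up values)
A1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]); Q1 = np.diag([1e-4] * 3)
A2 = np.array([[1.0, -1.0]]);                         Q2 = np.diag([2e-4])
y1 = np.array([1.001, 2.003, 3.002]); y2 = np.array([-1.004])

# Step 1: solution from the first set only
Qx1 = np.linalg.inv(A1.T @ np.linalg.inv(Q1) @ A1)
x1 = Qx1 @ A1.T @ np.linalg.inv(Q1) @ y1

# Step 2: recursive update (2.70), using x1 and Qx1 instead of y1
Qx2 = np.linalg.inv(np.linalg.inv(Qx1) + A2.T @ np.linalg.inv(Q2) @ A2)
x2 = Qx2 @ (np.linalg.inv(Qx1) @ x1 + A2.T @ np.linalg.inv(Q2) @ y2)

# Batch solution (2.69) from all observations at once gives the same result
N = A1.T @ np.linalg.inv(Q1) @ A1 + A2.T @ np.linalg.inv(Q2) @ A2
x_batch = np.linalg.solve(N, A1.T @ np.linalg.inv(Q1) @ y1 + A2.T @ np.linalg.inv(Q2) @ y2)
print(x2, x_batch)
```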
2.6.2 Constrained least-squares

In some applications there are explicit constraints on the parameter vector. In that case, we are dealing with a model of observation equations, with constraints on the parameter vector. Our model reads then

E{y} = Ax , D{y} = Qy , with B^T x = b   (2.74)

The equations of B^T x = b constitute the constraints, where matrix B and vector b are
assumed known. In order to solve the above model in a least-squares sense, we can make
use of the results of the previous section. First note that we may write (2.74) also as
E{(y; b)} = (A; B^T) x , D{(y; b)} = [Qy, 0; 0, Qb = 0]   (2.75)
This is again a model of observation equations which has been partitioned into two sets.
The only difference with the model treated in the previous section is that the variance
matrix corresponding to the second set of observation equations is zero (Qb = 0). This
variance matrix is set to zero, since it is assumed that the relations BT x = b are strictly
valid. Thus b, with sample value b, is interpreted as an observable having a zero variance
matrix.
Based on the results of the previous section, it follows that the solution of the above
model can also be obtained in two steps. The solution of the first step will be denoted as
x̂ and the solution of the second step as x̂b . The solution of the first step reads then
x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y , Qx̂ = (A^T Qy^{-1} A)^{-1}   (2.76)
This is the solution one would get when the constraints are not taken into account. The
solution of the second step reads
x̂b = x̂ + Qx̂ B(B^T Qx̂ B)^{-1} (b − B^T x̂)   (2.77)
Qx̂b = Qx̂ − Qx̂ B(B^T Qx̂ B)^{-1} B^T Qx̂
This solution directly follows from using (2.72). Note that A2 and Q2 of the previous
section, correspond with BT and Qb = 0 of the present section.
The above result shows that also the solution of a constrained model of observation
equations can be obtained in two steps. The constraints are disregarded in the first step.
In the second step, the results of the first step are used together with the constraints. Note
that the solution of the second step is in fact the solution one would obtain when solving
for the model of condition equations BT E{x̂} = b , D{x̂} = Qx̂ .
As in the previous section, also the weighted sum-of-squares of the least-squares resid-
uals can be written as a sum. If we denote the least-squares residual vector for the constrained model as êb and its counterpart when the constraints are disregarded as ê, it follows
that
êb^T Qy^{-1} êb = ê^T Qy^{-1} ê + (b − B^T x̂)^T (B^T Qx̂ B)^{-1} (b − B^T x̂)   (2.78)
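A sketch of the two-step constrained adjustment (2.76)-(2.77), with made-up numbers and a single hypothetical constraint on the sum of the two parameters.

```python
import numpy as np

# Hypothetical unconstrained model (made-up values)
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Qy = np.diag([1e-4] * 3)
y = np.array([1.004, 2.001, 2.998])

# Constraint B^T x = b: here x1 + x2 = 3 exactly
B = np.array([[1.0], [1.0]])
b = np.array([3.0])

# Step 1 (2.76): unconstrained solution
W = np.linalg.inv(Qy)
Qx = np.linalg.inv(A.T @ W @ A)
x_hat = Qx @ A.T @ W @ y

# Step 2 (2.77): update the first-step result with the constraints
K = Qx @ B @ np.linalg.inv(B.T @ Qx @ B)
x_b = x_hat + K @ (b - B.T @ x_hat)
Qx_b = Qx - K @ B.T @ Qx
print(x_b, B.T @ x_b)                            # the constraint is satisfied exactly
```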
Remark: In some applications it may happen that we perform the adjustment with con-
straints on the parameters, although we know that these constraints are in fact not deter-
ministic but instead stochastic. Thus although we know that Qb ≠ 0, we then still perform the adjustment as if Qb = 0. This approach of course, will give a result which is less precise than formally claimed. Taking the actual variance matrix Qb of b into account, the variance matrix of the estimator x̂b reads

Qx̂b = Qx̂ − Qx̂ B(B^T Qx̂ B)^{-1} [B^T Qx̂ B − Qb] (B^T Qx̂ B)^{-1} B^T Qx̂   (2.79)
This result reduces to that of (2.77) when Qb = 0, showing that Qx̂b < Qx̂. This inequality is not guaranteed anymore however, when Qb ≠ 0. It then depends on whether B^T Qx̂ B > Qb holds true or not. This explains why it is of importance when connecting networks to existing control, to have an existing control which is of a better precision than the precision of the network connected to it. If this is not the case, then B^T Qx̂ B < Qb and thus Qx̂b > Qx̂, showing that the precision of the network after the connection is poorer than it was before the connection.
2.6.3 Minimally constrained least-squares

In this section we consider the situation in which the design matrix A of the model of observation equations

E{y} = Ax , D{y} = Qy   (2.80)

with y the m × 1 vector of observables, A the m × n design matrix, x the n × 1 parameter vector and Qy the m × m variance matrix, is not of full rank. Let us assume that the rank of matrix A is given as
rank A = r < n
Hence, the rank defect equals (n − r). This implies that there exist (n − r) linearly independent combinations of the column vectors of matrix A that produce the zero vector. These
linear combinations are said to span the null space of matrix A. The null space of A is
defined as
N(A) = {x ∈ Rn | Ax = 0}
It is a subspace of the parameter space Rn and its dimension equals (n − r). Let us assume
that the columns of the n × (n − r) matrix G span the null space of A. Then

R(G) = N(A)
Thus the range space of G equals the null space of A and the columns of matrix G form
the linear combinations that need to be taken of the columns of matrix A to obtain the zero
vector. Since AG = O, it follows that
E{y} = Ax = A(x + Gγ ) , with γ ∈ Rn−r
This shows that E{y} remains unchanged, when we add the vector Gγ , with γ ∈ Rn−r
arbitrary, to the vector x. Hence, E{y} is invariant to this type of change of the parameter vector. It will now be clear, since E{y} is insensitive to the above changes of the parameter vector, that one cannot expect the rank-deficient model of observation equations to have a unique solution. The information content of the measurements is simply not sufficient to determine x uniquely.
In practice one will meet such situations for instance, when adjusting so-called free
networks. Imagine the simple example of adjusting a single levelling loop. In this case
the levelled height differences constitute the measurements and the heights of the points
in the loop constitute the unknown parameters. It will be clear that one can not determine
heights from observed height differences only. The height differences will not change
when one adds a constant value to all the heights of the points in the levelling loop. This
shows that we need some additional information, in order to be able to solve for the
heights. For this simple example, the height of one of the points in the levelling loop
would suffice already.
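A sketch of this levelling situation with made-up height differences: the normal matrix of the rank-deficient model is singular, and fixing the height of one point, a minimal constraint, makes the solution computable.

```python
import numpy as np

# Levelling loop over points 1-2-3 (made-up height-difference observations)
# Observables: h2-h1, h3-h2, h1-h3; parameters: h1, h2, h3
A = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0]])
y = np.array([1.502, 0.998, -2.499])
Qy = np.diag([1e-6] * 3)
W = np.linalg.inv(Qy)

N = A.T @ W @ A
print(np.linalg.matrix_rank(N))          # rank 2 < 3: the normal matrix is singular

# Minimal constraint: fix the height of point 1 (B^T x = 0 with B = e1),
# equivalently keep only the columns of A for h2 and h3 (the matrix B_perp)
B_perp = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Ab = A @ B_perp
beta = np.linalg.solve(Ab.T @ W @ Ab, Ab.T @ W @ y)
x_b = B_perp @ beta                      # heights with h1 held fixed at zero
print(x_b)
```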
Also for the general case, the lack of information in E{y} to determine x uniquely,
implies that additional information is needed. We will formulate the additional informa-
tion as constraints on the parameter vector x. Thus instead of (2.80), we will consider the
constrained model
E{y} = Ax , D{y} = Qy with BT x = 0 (2.81)
(note: For reasons of simplicity we have set the value of the constant vector b in the
constraints, equal to zero.) The matrix B is assumed to be of full rank. Thus the rows of
the matrix BT are assumed to be linearly independent.
When introducing the constraints, it is of importance to understand, that they are
merely used as a tool to allow us to be able to compute a solution for x (note: there are
other, but equivalent, ways to deal with a rank defect model of observation equations, for
instance by using the theory of generalized inverses. This however, is beyond the scope
of the present lecture notes). Since the constraints are only there to allow us to compute
a solution for x, the constraints should not interfere with the intrinsic properties of the
adjustment itself. In other words, the constraints should contain the information which
is minimally needed to eliminate the lack of information in E{y}. This implies that the
constraints should satisfy two conditions. First, the constraints should be such that indeed
a solution for x can be computed. This however, is not the only condition the constraints
should satisfy. If it would be the only condition, then B = I would be an admissible
choice. But this choice can not be allowed of course, since it overconstrains the solution.
In fact when B is chosen equal to the identity matrix, no adjustment would be necessary
anymore and the measurements would not contribute to the solution. This clearly, is not
acceptable. Thus the constraints should also satisfy a second condition, which is that the
least-squares solution of the measurements, ŷ, is invariant to the particular choice of the
constraints.
If we translate the above conditions in terms of linear algebra, it follows that the
constraints are admissible, if and only if the matrix B satisfies

R(B) ⊕ R(A^T) = Rn   (2.82)
Thus the range space of matrix B should be complementary to the range space of matrix
AT . Two spaces are said to be complementary, when the two spaces together span the
whole space of real numbers and their intersection only contains the zero vector. Since
the dimension of Rn equals n and the dimension of R(AT ) equals the rank of A, which is
rank A = r, it follows that the dimension of R(B) must equal
dimension R(B) = n − r
This shows that the number of linearly independent constraints that are admitted, equals
(n − r). This is also understandable if one considers the dimension of the null space of A.
That is, one needs as many constraints as there are dimensions in the null space of A.
An alternative, but equivalent formulation of the above condition (2.82) is as follows. Let the linearly independent columns of the n × r matrix B⊥ span the null space of matrix B^T. The above condition is then equivalent to

R(B⊥) ⊕ N(A) = Rn   (2.83)
Thus the range space of matrix B⊥ should be complementary to the null space of matrix
A. A direct consequence of this condition is that the matrix G, of which the columns span
N(A), and the matrix B⊥ , together form a square and full rank matrix. Thus the partitioned
matrix (B⊥ , G) is square and invertible. Since it is square and invertible, it may be used
to reparametrize the parameter vector x as
x = (B⊥, G) (β; γ)   (2.84)
This equation establishes a one-to-one relation between on the one hand x and on the other
hand β and γ. When we substitute (2.84) into (2.81), we obtain

E{y} = A B⊥ β , D{y} = Qy , with B^T G γ = 0   (2.85)

since AG = O and B^T B⊥ = O. Note that this reparametrization shows which part of the
unknown parameters can be determined from the measurements and which part is deter-
mined by the constraints. Since the matrix BT G is square and invertible, it directly follows
from the constraints that γ = 0 (note: the invertibility of BT G is a direct consequence of
(2.82) or (2.83)). Thus γ is determined by the constraints and β is the part that still needs
to be determined in a least-squares sense from the measurements. The least-squares solu-
tion for β reads
β̂ = [(B^⊥)^T A^T Q_y^{-1} A B^⊥]^{-1} (B^⊥)^T A^T Q_y^{-1} y
If we substitute this together with γ = 0 into (2.84), we obtain the least-squares solution
of the minimally constrained model (2.81) as
x̂_b = B^⊥ [(B^⊥)^T A^T Q_y^{-1} A B^⊥]^{-1} (B^⊥)^T A^T Q_y^{-1} y
Q_{x̂_b} = B^⊥ [(B^⊥)^T A^T Q_y^{-1} A B^⊥]^{-1} (B^⊥)^T        (2.86)
This is the unique least-squares solution of the minimally constrained model (2.81) and it
is one of the infinitely many least-squares solutions of the rank defect model (2.80). The
whole family of least-squares solutions of model (2.80) reads then
x̂ = x̂_b + G γ        (2.87)

Conversely, any least-squares solution x̂ of model (2.80) can be transformed back into the minimally constrained solution by means of

x̂_b = S_b x̂ , Q_{x̂_b} = S_b Q_x̂ S_b^T        (2.88)

The second equation has been obtained from applying the error propagation law to the first equation. This result shows that we only need to know the transformation matrix

S_b = I_n − G (B^T G)^{-1} B^T

and thus the matrices B and G, in order to obtain x̂_b from x̂. This transformation matrix
is a very famous one in geodesy and is known as the S-transformation. With the S-
transformation we are thus able to transform any least-squares solution to a minimally
constrained solution, specified by the matrix B. The matrix G, of which the columns span
the null space of A, depends on A and thus on the type of measurements which are included
in the model. For a levelling network for instance, the undetermined part will correspond
with a constant shift in the height of all points. For a triangulation network however, the
undetermined part will correspond to a translation, a rotation and a scale change of the
network. Angles do namely not contain information on the position, orientation and size
of the network.
One may wonder how the above results correspond to the results which were obtained
in the previous section for the case of constrained least-squares. In the previous section,
where matrix A was assumed to be of full rank, we showed that the least-squares solution
of the constrained model could be obtained in two steps. The unconstrained solution of
the first step, was used as input for the second step to finally come up with the constrained
solution. These two steps can also be recognized in case of the minimally constrained so-
lution. The first step corresponds then with x̂, which is an arbitrary least-squares solution
of model (2.80), and the second step corresponds then with (2.88).
Unbiasedness: In one of our earlier sections we claimed that the least-squares estimator
of the parameter vector is unbiased and thus that E{x̂} = x holds true. At that point
however, we still assumed the matrix A to be of full rank. So, what happens when the
matrix A is not of full rank? We know that in that case the measurements fail to contain
enough information to determine x uniquely. We may therefore suspect that in that case
the least-squares estimators will also fail to be unbiased estimators of the parameter vector
x. And indeed, when we take the expectation of (2.86), we obtain
E{x̂b } = (B⊥ )[(B⊥ )T AT Q−1 ⊥ −1 ⊥ T T −1
y A(B )] (B ) A Qy E{y}
= (B )[(B ) A Qy A(B⊥ )]−1 (B⊥ )T AT Q−1
⊥ ⊥ T T −1
y Ax = x
Now that we have developed most of the adjustment theory that will be needed in these
lecture notes, let us pause for a few moments and reflect on the various steps involved
when an adjustment is carried out.
Formulate model: Before any adjustment can be carried out, one first must have a clear
idea of: (1) the observables that are going to be used, (2) their assumed relation to the
unknown parameters and (3) their assumed precision. Thus first one must be able to
formulate the model of observation equations
E{y} = Ax , D{y} = Qy (2.91)
This model consists of the functional model E{y} = Ax, which is specified through the
m×n design matrix A, and of the stochastic model D{y} = Qy , which is specified through
the m × m variance matrix Qy . In the functional model the link is established between the
measurements and the unknown parameters. It captures the geometry of the network. In
the stochastic model, one specifies the precision of the measurements. It depends on the
type of measurement equipment used and on the measurement procedures used.
Adjustment: Based on the above model and on an available sample of y, the mea-
surements, one can commence with the actual adjustment. Using the principle of least-
squares, the estimator for the unknown parameter vector x reads
x̂ = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} y        (2.92)
Here it has been assumed that the design matrix is of full rank. If this is not the case, then
the approach of using minimal constraints needs to be used.
If we assume that y is normally distributed, then its probability density function is
completely specified by its first two moments, being its expectation and its dispersion.
This then also holds true for the probability density function of the estimator x̂. The qual-
ity of this estimator can thus be characterized by its expectation E{x̂} and its dispersion
D{x̂}. For its expectation we have
E{x̂} = x (2.93)
Loosely speaking, this implies that if the adjustment is repeated a sufficient number of
times, each time with a new sample from y, that the various outcomes for x̂ would on
the average coincide with x. This property of unbiasedness is an important one and it is
automatically fulfilled when the least-squares principle is used, provided of course that
the model which forms the basis of the adjustment is correct. In the next chapter we will
see what happens when the model is specified incorrectly.
Precision: For the moment we will assume that the model used is correct and thus that the
estimator x̂ is unbiased. In that case, the quality of the estimator is completely captured
by its dispersion, being the variance matrix Q_x̂. Thus, under the provision that the least-squares estimator is unbiased, we may judge its quality on the basis of its variance matrix

Q_x̂ = (A^T Q_y^{-1} A)^{-1}        (2.94)

The variance matrix is said to describe the precision of the estimator. It describes the
amount of variability in samples of x̂ around its mean.
Since the variance matrix depends on A and on Qy , one can change Qx̂ by changing
A and/or Qy . This thus gives us a way of improving the precision of the least-squares
solution. For instance, if for a certain application the precision turns out to be not good
enough, one may decide to use more precise measurement equipment. In that case Qy
changes for the better and also Qx̂ will then change for the better. In many applications
however, one will not have too much leeway in choosing from different sets or types of
measurement equipment. In that case one depends on A for improving the precision (note:
for that reason, matrix A is also often referred to as the design matrix). There are two ways
in which A can be changed. One can change the dimension of the matrix and/or one can
change its structure. For instance, when the precision of a geodetic network is not good
enough one can decide to add more observations to the network. In that case, matrix A is
extended row wise. However, it may also be possible that a mere change in the geometry
of the network already suffices to improve its precision. In that case, it is the structure of
A that changes, while its dimension stays the same.
Precision testing: Although we know how to change Q_x̂ for the better or the worse, we of course still need a way of deciding when the precision, as expressed by the variance matrix, is good enough. It will be clear that this depends very much on the particular application at hand. What is important, though, is that one has a precision criterion available by which the precision of the least-squares solution can be judged. The following are some approaches that can be used for testing the precision of the solution.
It may happen in a particular application that one is only interested in one particular function of the parameters, say θ = f^T x. In that case it becomes very easy to judge its quality. One then simply has to compare its variance with the given criterion

σ_θ̂² = f^T Q_x̂ f < criterion        (2.95)
The situation becomes more difficult when the application at hand requires that the precision of more than one function be judged. Let us assume that the functions of interest are given as θ = F^T x, where F is a matrix. The corresponding variance matrix is then given as Q_θ̂ = F^T Q_x̂ F. One way to judge the precision in this case is by inspecting the variances and covariances of the matrix Q_θ̂. An alternative way would be to use the average precision as precision measure. In that case one relies on the trace of the variance matrix
(1/p) trace(F^T Q_x̂ F) < criterion        (2.96)
where p is the dimension of the vector θ. When using the trace one has to be aware of the fact that one is not taking all the information of the variance matrix Q_θ̂ into account. It depends only on the variances and not on the covariances. Hence, the correlation between the entries of θ̂ is then not taken into account. When using the trace one should also make sure that it makes sense to speak of an average variance. In other words, the entries of θ should be of the same type of variable. It would not make sense to use the trace when θ contains completely different variables, each having its own physical dimension.
The trace of Q_θ̂ equals the sum of its eigenvalues. Instead of the sum of eigenvalues, one might decide that it suffices to consider the largest eigenvalue λ_max only,

λ_max(F^T Q_x̂ F) < criterion        (2.97)

In that case one is thus testing whether the function of θ which has the poorest precision still passes the precision criterion. When this test is passed successfully, one knows that all other functions of θ will also have a precision which is better than the criterion. For some applications this may be an advantage, but it could be a disadvantage as well. It could be a disadvantage in the sense that the above test could be overly conservative. That is, when the function having the poorest precision passes the above test, all other functions, which by definition have a better precision, may turn out to be unnecessarily precise.
So far we assumed that the precision criterion was given in scalar form. But this need not be the case. The precision criterion could also be given in matrix form. In that case one is working with a criterion matrix, which we will denote as C_x. The precision test then amounts to testing whether the precision as expressed by the actual variance matrix Q_θ̂ = F^T Q_x̂ F is better than or as good as the precision expressed by the criterion matrix F^T C_x F. Also this test can be executed by means of solving an eigenvalue problem, but now it will be a generalized eigenvalue problem

| F^T Q_x̂ F − λ F^T C_x F | = 0        (2.98)
Note that when the matrix F^T C_x F is taken as the identity matrix, the largest eigenvalue of (2.98) reduces to that of (2.97). Using the generalized eigenvalue problem is thus indeed more general than the previously discussed approaches. It is characterized by the fact that it allows one to compare the actual variance matrix Q_θ̂ directly with its criterion. The two matrices F^T Q_x̂ F and F^T C_x F are identical when all eigenvalues equal one, and all functions of θ̂ have a better precision than the criterion when the largest generalized eigenvalue is less than one.
So far we assumed the matrix A to be of full rank. In many geodetic applications, however, this is not the case. We know from our section on minimally constrained least-squares that the variance matrix Q_x̂ will not be unique when A has a rank defect. In that case the variance matrix depends on the chosen set of minimal constraints. Since these constraints should not affect our conclusion when evaluating the precision, we have to make sure that our procedure of precision testing is invariant to the minimal constraints. This is possible when we make use of the S-transformations. Let C_x be the criterion matrix and Q_{x̂_b} the variance matrix of a minimally constrained solution. The eigenvalues of the generalized eigenvalue problem

| Q_{x̂_b} − λ S_b C_x S_b^T | = 0

where S_b is the S-transformation that corresponds to the minimal constraints of Q_{x̂_b}, are then invariant to the chosen set of minimal constraints. Thus when the largest eigenvalue is less than one, it is guaranteed that all functions of x̂_b have a precision which is better than they would have were Q_{x̂_b} replaced by the criterion matrix.
Chapter 3

Testing and reliability
3.1 Introduction
In the previous chapter we introduced the model of observation equations

E{y} = Ax , D{y} = Q_y        (3.1)

and showed how a solution for the unknown parameter vector x, based on the least-squares
principle, could be obtained. The least-squares solution x̂ is optimal in the sense that it is
unbiased and that it is of minimal variance in the class of linear unbiased estimators. These
optimality properties only hold true however, when the model (3.1) is correct. There are
many ways in which the model could have been misspecified. The functional model
could be wrong, in which case E{y} = Ax. As a consequence, the least-squares estimator
becomes biased, E{x̂} = x. Or, the stochastic model could be wrong, in which case
D{y} = Qy . As a consequence, the property of minimal variance is lost.
The topic of the present chapter is to present ways of checking the validity of the
above model. We start off in section 3.2, by discussing the basic concepts of hypothesis
testing. We consider the general form of a null hypothesis and an alternative hypothesis,
show what a test between the two implies and discuss the two types of errors one can make. Following these basic concepts, we move on in section 3.3 to the testing of the above model against the alternative model

E{y} = Ax + C∇ , D{y} = Q_y        (3.2)

Note that the two models only differ in their functional model. Hence we restrict ourselves
in these lecture notes to misspecifications in the mean of y. In geodetic practice these are
by far the most frequently occurring type of model errors (e.g. measurement blunders,
unaccounted systematic effects). In section 3.3, we present and discuss the test statistic
T q through which the model (3.1) can be tested against the alternative (3.2).
In most practical applications, it is usually not only one model error one is concerned
about, but quite often many more than one. To each of these model errors there belongs
a vector C∇, with the matrix C specifying how the model error is related to the vector
of observables. As a result one is not dealing with only one alternative hypothesis, but
with as many alternative hypotheses as there are model errors one is willing to consider.
This implies that one needs a testing procedure for handling the various alternative hy-
potheses. In section 3.4 such a procedure is presented. It consists of a detection step, an
identification step and an adaptation step.
Just like the results of an adjustment are not exact (that is, not deterministic, but
stochastic), also the results of the statistical tests are not exact. The confidence one will
have in the outcomes of the statistical tests depends in a large part on the ’strength’ of the
model. A practical way of diagnosing this confidence is provided for by the concept of
reliability. It is introduced in section 3.5. Reliability together with precision, can then be
considered to describe the quality which one can expect of the results of an adjustment
and testing. They are both considered in the last section of this chapter.
Example 10
According to the postulated theory or hypothesis the three points 1, 2 and 3
of figure 3.1 lie on one straight line. In order to test or verify this hypothesis
[Figure 3.1: the three points 1, 2 and 3 on a straight line, with the distances l12, l23 and l13 between them.]
we need to design an experiment such that its outcome can be compared with
the theoretically predicted value.
If the postulated hypothesis is correct, the three distances l12, l23 and l13 should satisfy the relation

H : l12 + l23 − l13 = 0

Thus we can measure the three distances and check whether the measured values satisfy this relation. In case of agreement we are inclined to accept the hypothesis that the three points lie on one straight line. In case
of disagreement we are inclined to reject hypothesis H.
It will be clear that in practice the testing of hypotheses is complicated by the fact that
experiments (in particular experiments where measurements are involved) in general do
not give outcomes that are exact. That is, experimental outcomes are usually affected by
an amount of uncertainty, due for instance to measurement errors. In order to take care
of this uncertainty, we will, in analogy with our derivation of Adjustment Theory in the
previous chapter, model the uncertainty by making use of the results from the theory of
random variables. The verification or testing of postulated hypotheses will therefore be
based on the testing of hypotheses of random variables of which the probability distri-
bution depends on the theory or hypothesis postulated. From now on we will therefore
consider statistical hypotheses.
Example 10 (continued)
We know from experience that in many cases the uncertainty in geodetic mea-
surements can be adequately modeled by the normal distribution. We there-
fore model the three distances between the three points 1, 2 and 3 as normally
distributed random variables.¹ If we also assume that the three distances are uncorrelated and all have the same known variance σ² = 1/3, the simultaneous probability density function of the three distance observables becomes:
(l_13, l_12, l_23)^T ∼ N( (E{l_13}, E{l_12}, E{l_23})^T , Q ) with Q = (1/3) I_3        (3.5)
The notation N(a, B) is a shorthand notation for the normal distribution, having a as mean and B as variance matrix. Statement (3.5) could already be considered a statistical hypothesis.
¹ Note that strictly speaking distances can never be normally distributed. A distance is always nonnegative, whereas the normal distribution, due to its infinite tails, admits negative sample values.
However, we cannot make this relation hold for the random variables l 12 , l 23
and l 13 . This is simply because of the fact that random variables cannot be
equal to a constant. Thus, a statement like: l 12 + l 23 − l 13 = 0 is nonsensical.
What we can do is assume that relation (3.6) holds for the expected values of
the random variables l12, l23 and l13:

E{l12} + E{l23} − E{l13} = 0        (3.7)
For the hypothesis considered this relation makes sense. It can namely be
interpreted as stating that if the measurement experiment were to be repeated
a great number of times, then on the average the measurements will satisfy
(3.7). With (3.5) and (3.7) we can now state our statistical hypothesis as:
H : (l_13, l_12, l_23)^T ∼ N( (E{l_13}, E{l_12}, E{l_23})^T , (1/3) I_3 )        (3.8)
This hypothesis has the same structure as (3.4) with the three means playing
the role of the parameter x.
In many hypothesis-testing problems two hypotheses are discussed: The first, the
hypothesis being tested, is called the null hypothesis and is denoted by H0 . The second
is called the alternative hypothesis and is denoted by HA . The thinking is that if the null
hypothesis H0 is false, then the alternative hypothesis HA is true, and vice versa. We often
say that H0 is tested against, or versus, HA .
In studying hypotheses it is also convenient to classify them into one of two types by
means of the following definition: if a hypothesis completely specifies the distribution,
that is, if it specifies its functional form as well as the values of its parameters, it is called
a simple hypothesis; otherwise it is called a composite hypothesis.
Example 10 (continued)
In our example, (3.8) is the hypothesis to be tested. Thus, the null hypothesis
reads in our case:
H0 : (l_13, l_12, l_23)^T ∼ N( (E{l_13}, E{l_12}, E{l_23})^T , (1/3) I_3 )        (3.9)
Critical region K: The critical region K of a test is the set of sample values of y for which
H0 is to be rejected. Thus, H0 is rejected if y ∈ K.
It will be obvious that we would like to choose a critical region so as to obtain a test
with desirable properties, that is, a test that is ”best” in a certain sense. But let us first
have a look at a simple testing problem for which, on more or less intuitive grounds, an
acceptable critical region can be found.
Example 11
Let us assume that a geodesist measures a scalar variable, and that this mea-
surement can be modeled as a random variable y with density function
p_y(y) = (1/√(2π)) exp[ −½ (y − E{y})² ]        (3.11)
Thus, it is assumed that y has a normal distribution with unit variance. Al-
though this assumption constitutes a statistical hypothesis, it will not be
tested here because the geodesist is quite certain of the validity of this as-
sumption. The geodesist is however not certain about the value of the expec-
tation of y. His assumption is that the value of E{y} is x0 . This assumption
is the statistical hypothesis to be tested. Denote this hypothesis by H0 . Then,
H0 : E{y} = x0        (3.12)

and the alternative hypothesis reads

HA : E{y} ≠ x0        (3.13)
Thus the problem is one of testing the simple hypothesis H0 against the com-
posite hypothesis HA . To test H0 , a single observation on the random variable
y is made. In real-life problems one usually takes several observations, but to
avoid complicating the discussion at this stage only one observation is taken
here. On the basis of the value of y obtained, denoted by y, a decision will
be made either to accept H0 or reject it. The latter decision, of course, is
equivalent to accepting HA . The problem then is to determine what values
of y should be selected for accepting H0 and what values for rejecting H0 . If
a choice has been made of the values of y that will correspond to rejection,
then the remaining values of y will necessarily correspond to acceptance. As
defined above, the rejection values of y constitute the critical region K of the
test. Figure 3.2 shows the distribution of y under H0 and under two possible
alternatives HA1 and HA2. Looking at this figure, it seems reasonable to reject H0 if the observation y is remote enough from E{y} = x0. If H0 is true, the observation will namely most likely fall close to x0.

[Figure 3.2: the distribution of y under H0 (with mean E{y} = x0) and under the two alternatives HA1 and HA2; the critical region K consists of the two outer regions (reject), with the acceptance region in between.]
Table 3.1 shows the decision table with the type I and II errors.
The size of a type I error is defined as the probability that a sample value of y falls in
the critical region when in fact H0 is true. This probability is denoted by α and is called
the size of the test or the level of significance of the test. Thus:

α = P(y ∈ K | H0) = ∫_K p_y(y | H0) dy        (3.14)

The size of the test, α, can be computed once the critical region K and the probability density function of y under H0 are known.

Table 3.1: The decision table with the type I and type II errors.

                          H0 true                  H0 false
  Reject H0 (y ∈ K)       wrong: type I error      correct
  Accept H0 (y ∉ K)       correct                  wrong: type II error

The size of a type II error is defined as the probability that a sample value of y falls outside the critical region when in fact H0 is false. This probability is denoted by β. Thus:

β = P(y ∉ K | HA) = ∫_{y ∉ K} p_y(y | HA) dy

or

β = P(y ∉ K | HA) = 1 − ∫_K p_y(y | HA) dy        (3.15)

The size of a type II error, β, can be computed once the critical region K and the probability density function of y under HA are known.
Example 12
Assume that y is distributed as

y ∼ N(E{y}, σ²)        (3.16)

and that the two hypotheses read

H0 : E{y} = x0        (3.17)

and

HA : E{y} = xA        (3.18)

[Figure: the distributions of y under H0 (located at x0) and under HA (located at xA).]

where the values of x0, xA and σ are assumed known. In the present example the form of the critical region has been chosen
right-sided. Its location is determined by the value of kα , the so-called critical
value of the test. Thus, for the present example the size of the test can be
computed as
α = ∫_{kα}^{∞} p_y(y | x0) dy

or, since

p_y(y | x0) = (1/(√(2π) σ)) exp[ −½ (y − x0)²/σ² ]

as

α = ∫_{kα}^{∞} (1/(√(2π) σ)) exp[ −½ (y − x0)²/σ² ] dy        (3.19)

This integral can be evaluated with the help of the table of the standard normal distribution. Note namely that the random variable

z = (y − x0)/σ        (3.20)

is standard normally distributed under H0. And since

α = P(y > kα | H0) = P( z > (kα − x0)/σ | H0 )        (3.21)

we can use the last expression of (3.21) for computing α. Application of the change of variables (3.20) to (3.19) gives

α = ∫_{(kα − x0)/σ}^{∞} (1/√(2π)) exp[ −½ z² ] dz        (3.22)
[Figure: two possible locations of the critical value kα for the right-sided test: (a) kα between x0 and xA, and (b) kα to the left of x0; in both cases the critical region K (reject) lies to the right of kα and the acceptance region to its left.]
We can now make use of the table of the standard normal distribution. Table 3.2 shows some typical values of α and kα for the case that x0 = 1 and σ = 2.
As we have seen the location of the critical region K is determined by the
value chosen for kα , the critical value of the test. But what value should
we choose for kα ? Here the geodesist should base his judgement on his
experience. Usually one first makes a choice for the size of the test, α , and
then by using (3.22) or Table 3.2 determines the corresponding critical value
kα . For instance, if one fixes α at α = 0.01, the corresponding critical value
kα (for the present example with x0 = 1 and σ = 2) reads kα = 5.64. The
choice of α is based on the probability of a type I error one is willing to
accept. For instance, if one chooses α as α = 0.01, one is willing to accept
that 1 out of 100 experiments leads to rejection of H0 when in fact H0 is
true.
Let us now consider the size of a type II error, β. Figure 3.6 shows for the present example the size of a type II error, β. It corresponds to the area under the graph of the distribution of y under HA for the interval complementary to the critical region K.

Table 3.2: Some typical values of α, (kα − x0)/σ and kα for the case x0 = 1 and σ = 2.

  α        (kα − x0)/σ      kα
  0.1      1.28             3.56
  0.05     1.65             4.30
  0.01     2.32             5.64
  0.001    2.98             6.96

[Figure 3.6: The sizes of the type I and type II errors, α and β, for testing H0 : E{y} = x0 versus HA : E{y} = xA > x0; β is the area under the density of y under HA to the left of kα, α the area under the density of y under H0 to the right of kα.]

The size of a type II error, β, can be computed once the critical region K and the probability density function of y under HA are known:

β = P(y ∉ K | HA) = ∫_{−∞}^{kα} p_y(y | xA) dy
or, since
p_y(y | xA) = (1/(√(2π) σ)) exp[ −½ (y − xA)²/σ² ]

as

β = ∫_{−∞}^{kα} (1/(√(2π) σ)) exp[ −½ (y − xA)²/σ² ] dy        (3.23)
Also this value can be computed with the help of the table of the standard normal distribution. But first some transformations are needed. It will be clear that the probability that a sample or observation of y falls in the critical region K when in fact HA is true, equals 1 − β:

1 − β = P(y ∈ K | HA) = ∫_{kα}^{∞} (1/(√(2π) σ)) exp[ −½ (y − xA)²/σ² ] dy

This formula has the same structure as (3.19). The value 1 − β can therefore be computed in exactly the same manner as the size of the test, α, was computed. And from 1 − β it is trivial to compute β, the size of the type II error.
a From the nature of the experimental data and the consideration of the assertions that are to be examined, identify the appropriate null hypothesis H0 and alternative hypothesis HA.
b Determine the critical region K. In practice this implies that one determines a function of y, the test statistic, and that H0 is rejected whenever its value exceeds a chosen critical value. The values of y for which this happens form the critical region.
c Specify the size of the type I error, α, that one wishes to assign to the testing process. Use tables to determine the location of the critical region K from

α = P(y ∈ K | H0) = ∫_K p_y(y | x0) dy
For many distributions, like the normal distribution or the Chi-squared distribution,
these tables can be found in standard textbooks on statistics.
d Check, if possible, the size of the type II error, β, to ensure that there exists a reasonable protection against type II errors. Its complement, γ = 1 − β, is known as the detection probability or as the power of the test.
e After the test has been explicitly formulated, determine whether the sample or ob-
servation y of y falls in the critical region K or not. Reject H0 if y ∈ K, and accept
H0 if y ∈/K. Never claim however that the hypotheses have been proved false or
true by the testing.
Let us now return to the model of observation equations

E{y} = Ax , D{y} = Q_y
We have seen that in order to be able to apply the least-squares principle, only the first two moments of the random vector of observables, y, need to be specified. The first moment
(mean) E{y} = Ax and the second moment (variance matrix) D{y} = Qy . In the case of
statistical testing however, this is not sufficient. In addition, one will have to specify the
type of probability function of y as well. Since most observational data in geodesy can
be modelled as samples drawn from a normal distribution, we will assume that y has the
normal probability density function
p_y(y | x) = (2π)^{−m/2} |Q_y|^{−1/2} exp[ −½ (y − Ax)^T Q_y^{-1} (y − Ax) ]
Thus we assume that y is normally distributed, with mean Ax and with the variance matrix Q_y. In shorthand notation

Ho : y ∼ N(Ax, Q_y)        (3.26)

and, correspondingly, for the alternative hypothesis

Ha : y ∼ N(Ax + C∇, Q_y)        (3.27)

If we now write the vector of observables as

y = Ax + e        (3.28)

then

E{e | Ho} = 0 and E{e | Ha} = C∇ ≠ 0        (3.29)

Thus the mean of the residual vector e will be zero when Ho is true and unequal to zero
when Ho is false. This shows that if we would have a sample or measurement of e avail-
able, we could use it to decide on the validity of Ho . Would the sample be close to zero,
we would be inclined to accept Ho and would it differ greatly from zero, we would be
inclined to reject Ho . Unfortunately, no sample of e = y − Ax is available, since x is un-
known. Instead of considering e, let us therefore consider its estimator. The least-squares
solution of x and e under Ho , reads
x̂ = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} y
ê = [ I_m − A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} ] y
This shows, when we take the expectation and use (3.29), that
Ho : E{x̂} = x , E{ê} = 0        Ha : E{x̂} ≠ x , E{ê} = [ I_m − A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} ] C∇ ≠ 0        (3.30)
Thus apart from the mean of e, also the mean of ê is zero when Ho is true and nonzero
when Ho is false. Note that in the first case, x̂ is an unbiased estimator of x, while it
becomes biased when Ho is false. Contrary to e, we do have a sample available of ê, since
it is a function of y. Hence, instead of using e, we could use ê to decide on the validity of
Ho . If the sample of ê is close to zero, we are inclined to accept Ho and if it differs greatly
from zero, we are inclined to reject Ho .
The least-squares model error: Instead of using the least-squares residual vector ê, it
also seems intuitively clear that the model error ∇ itself must be instrumental in deciding
on the validity of Ho. Under Ha, we have

Ha : E{y} = Ax + C∇ , D{y} = Q_y        (3.31)

The model error ∇ itself is unknown of course. Let us therefore consider its least-squares estimator ∇̂. Since it is a function of y, we do have a sample value of it available. On the basis of this value we could also decide on the validity of Ho. If the sample value of ∇̂ is small, we are inclined to believe that the model error is absent and thus that Ho is true. On the other hand, if the sample value of ∇̂ turns out to be significant, we will certainly not be inclined to believe Ho and rather have more faith in Ha.
From the above discussion, it seems that both ê and ∇̂ can be used for the testing of
Ho against Ha . One can therefore expect that the two estimators must be related in some
way. And this is indeed the case. To show this, we first solve for ∇. The normal equations
that belong to Ha of (3.31), read
[ A^T Q_y^{-1} A    A^T Q_y^{-1} C ] [ x̂_a ]   [ A^T Q_y^{-1} y ]
[ C^T Q_y^{-1} A    C^T Q_y^{-1} C ] [ ∇̂   ] = [ C^T Q_y^{-1} y ]        (3.32)
where x̂a is given the subindex a to indicate that it is the least-squares estimator of x under
Ha and not the least-squares estimator of x under Ho . From the above normal equations,
the reduced normal equations for ∇̂ follow as
C^T Q_y^{-1} [ I_m − A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} ] C ∇̂ = C^T Q_y^{-1} [ I_m − A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} ] y
The least-squares estimator of the model error and its variance matrix follow therefore as

∇̂ = [ C^T Q_y^{-1} Q_ê Q_y^{-1} C ]^{-1} C^T Q_y^{-1} ê , Q_∇̂ = [ C^T Q_y^{-1} Q_ê Q_y^{-1} C ]^{-1}        (3.33)
The one dimensional case: Let us first consider the one dimensional case. In that case, q = 1 and ∇̂ becomes a scalar instead of a vector. The significance of ∇̂ can now be measured by using the precision of the estimator, thus by using its standard deviation σ_∇̂. We therefore divide ∇̂ by its standard deviation σ_∇̂ and define the random variable

w = ∇̂ / σ_∇̂        (3.35)
This random variable has a standard normal distribution under Ho . Thus under Ho , it has
a zero mean, with a variance of one. Under Ha however, it will have a nonzero mean, but
again with a variance of one. Thus
Ho : w ∼ N(0, 1) and Ha : w ∼ N(∇/σ_∇̂, 1)        (3.36)
Since the distribution of w is completely known under Ho, we are now in a position to test the significance of the model error. The model error is said to be significant, if

| w | > N_{α/2}(0, 1)        (3.37)

where N_{α/2}(0, 1) is the critical value of the standard normal distribution, based on the level of significance α (the test is two-sided). Note that instead of working with the absolute value of w, one can also work with its square. In that case, the test reads

w² > χ²_α(1, 0)        (3.38)

where χ²_α(1, 0) is the critical value of the central Chi-squared distribution having one degree of freedom.
The higher dimensional case: One cannot use the above test when q > 1. In that case we need to take the complete variance matrix Q_∇̂ into account to measure the significance of ∇̂. Instead of using the one dimensional test statistic w², we therefore use the q-dimensional test statistic

T_q = ∇̂^T Q_∇̂^{-1} ∇̂        (3.39)

Note that T_q = w², when q = 1. In the higher dimensional case, the model error is said to be significant, if

T_q > χ²_α(q, 0)        (3.40)

where χ²_α(q, 0) is the critical value of the central Chi-squared distribution, having q degrees of freedom.
Using the least-squares residual: We have seen that the least-squares estimator of the model error, ∇̂, can be written as a function of the least-squares residual vector ê. This implies that the above test statistic T_q can also be expressed in terms of ê. If we substitute (3.33) into (3.39), we get

T_q = ê^T Q_y^{-1} C [ C^T Q_y^{-1} Q_ê Q_y^{-1} C ]^{-1} C^T Q_y^{-1} ê        (3.41)

Although the two test statistics (3.39) and (3.41) are identical, the second expression is often the more practical one, since the least-squares residual vector ê is often already available from the adjustment based on Ho.
In the previous section we gave the test statistic for testing the null hypothesis Ho against
a particular alternative hypothesis Ha . In most practical applications however, it is usually
not only one model error one is concerned about, but quite often many more than one. This
implies that one needs a testing procedure for handling the various alternative hypotheses.
In this subsection we will discuss a way of structuring such a testing procedure. It will
consist of the following three steps: detection, identification and adaptation.
3.4.1 Detection
Since one usually first wants to know whether one can have any confidence in the assumed
null hypothesis without the need to specify any particular alternative hypothesis, the first
step consists of a check on the overall validity of Ho . This implies that one opposes the
null hypothesis to the most relaxed alternative hypothesis possible. The most relaxed
alternative hypothesis is the one that leaves the observables completely free. Hence, un-
der this alternative hypothesis no restrictions at all are imposed on the observables. We
therefore have the situation

Ho : E{y} = Ax versus Ha : E{y} ∈ R^m        (3.42)

Since E{y} ∈ R^m implies that matrix (A, C) is square and invertible, it follows that matrix C has q = m − n columns and that its range space is complementary to the range space of A. Thus R^m = R(A) ⊕ R(C). It can be shown that in this case, the test statistic of (3.41) simplifies to

T_{m−n} = ê^T Q_y^{-1} ê        (3.43)

The appropriate test statistic for testing the null hypothesis against the most relaxed alternative hypothesis is thus equal to the weighted sum-of-squares of the least-squares residuals. The null hypothesis will then be rejected when

T_{m−n} > χ²_α(m − n, 0)        (3.44)
The σ̂ 2 test: In the literature one often sees the above overall model test also formulated
in a slightly different way. Let us use the factorization Q_y = σ² G_y, where σ² is the variance factor of unit weight and where G_y is the corresponding cofactor matrix. It can be shown that

σ̂² = ê^T G_y^{-1} ê / (m − n)

is an unbiased estimator of σ². Thus E{σ̂²} = σ². The test (3.44) can now also be formulated as

σ̂² / σ² > χ²_α(m − n, 0) / (m − n) = F_α(m − n, ∞, 0)

where F_α(m − n, ∞, 0) is the critical value of the central F-distribution having m − n and ∞ degrees of freedom.
3.4.2 Identification
In the detection phase, one tests the overall validity of the null hypothesis. If this leads to
a rejection of the null hypothesis, one has to search for possible model misspecifications. That is, one will then have to try to identify the model error which caused the rejection of the null hypothesis. This implies that one will have to specify, through the matrix C, the type of likely model errors. This specification of possible alternative hypotheses is application dependent and is one of the more difficult tasks in hypothesis testing. It namely depends very much on one's experience which type of model errors one considers to be likely.
The 1-dimensional case: In case the model error can be represented by a scalar, q = 1
and matrix C reduces to a vector which will be denoted by the lowercase character c. This
implies that the alternative hypothesis takes the form
Ha : E{y} = Ax + c∇ (3.45)
The alternative hypothesis is specified, once the vector c is specified. The appropriate test
statistic for testing the null hypothesis against the above alternative hypothesis Ha follows
when the vector c is substituted for the C-matrix in (3.41). It gives

T_{q=1} = w² = (c^T Q_y^{-1} ê)² / (c^T Q_y^{-1} Q_ê Q_y^{-1} c)

or when the square root is taken

w = c^T Q_y^{-1} ê / √(c^T Q_y^{-1} Q_ê Q_y^{-1} c)        (3.46)
This test statistic has a standard normal distribution N(0, 1) under Ho . The evidence on
whether the model error as specified by (3.45) did or did not occur, is based on the test
| w | > N_{α₁/2}(0, 1)        (3.47)
Data snooping: Apart from the possibility of having a one dimensional test as (3.47), it is
standard practice in geodesy to always first check the individual observations for potential
blunders. This implies that the alternative hypotheses take the form
Hai : E{y} = Ax + c_i ∇ , i = 1, . . ., m        (3.48)

with

c_i = (0, . . ., 0, 1, 0, . . ., 0)^T
Thus ci is a unit vector having the 1 as its ith entry. The additional term ci ∇ models the
presence of a blunder in the ith observation. The appropriate test statistic for testing the
null hypothesis against the above alternative hypothesis Hai is again of the general form
of (3.46), but now with the c-vector chosen as ci ,
w_i = c_i^T Q_y^{-1} ê / √(c_i^T Q_y^{-1} Q_ê Q_y^{-1} c_i)        (3.49)
This test statistic has of course also a standard normal distribution N(0, 1) under Ho . By
letting i run from 1 up to and including m, one can screen the whole data set on the
presence of potential blunders in the individual observations. The test statistic w_i which returns the largest value in absolute sense then pinpoints the observation which is most likely corrupted with a blunder. Its significance is measured by comparing the value of the test statistic with the critical value. Thus the jth observation is suspected to have a blunder, when

| w_j | = max_i | w_i | and | w_j | > N_{α₁/2}(0, 1)        (3.50)

This procedure of screening each individual observation for the presence of a blunder is known as data snooping.
In many applications in practice, the variance matrix Q_y is diagonal. If that is the case, the expression of the above test statistic simplifies considerably. With a diagonal Q_y-matrix, we have

w_i = ê_i / σ_{ê_i}

The appropriate test statistic is thus equal to the least-squares residual of the ith observation divided by the standard deviation of that residual.
The higher dimensional case: It may happen that a particular model error can not be
represented by a single scalar. In that case q > 1 and ∇ becomes a vector. The appropriate
test statistic is then the one we met earlier, namely
T_q = ê^T Q_y^{-1} C [ C^T Q_y^{-1} Q_ê Q_y^{-1} C ]^{-1} C^T Q_y^{-1} ê        (3.51)
It is through the matrix C that one specifies the type of model error.
3.4.3 Adaptation
Once one or more likely model errors have been identified, a corrective action needs to be
undertaken in order to get the null hypothesis accepted. Here, one of the two following
approaches can be used in principle. Either one replaces the data or part of the data with
new data such that the null hypothesis does get accepted, or, one replaces the original
null hypothesis with a new hypothesis that does take the identified model errors into ac-
count. The first approach amounts to a remeasurement of (part of) the data. This approach is feasible, for instance, when in the case of data snooping some individual observations are identified as being potentially corrupted by blunders. These are then the observations which get remeasured. In the second approach no remeasurement is undertaken. Instead the model of the null hypothesis is enlarged by adding additional parameters such that all identified model errors are taken care of. Thus with this approach, the identified alternative hypothesis becomes the new null hypothesis.
Once the adaptation step is completed, one of course still has to make sure whether
the newly created situation is acceptable or not. This at least implies a repetition of the
detection step. When adaptation is applied, one also has to be aware of the fact that since
the model may have changed, also the ’strength of the model’ may have changed. In
fact, when the model is adapted through the addition of more explanatory parameters, the
model has become weaker in the sense that the test statistics will now have less detection
and identification power. That is, the reliability has become poorer. It depends on the
particular application at hand, whether this is considered acceptable or not.
3.5 Reliability
In the previous section we considered a testing procedure for the detection, identification
and adaptation of model errors. Hence, we now know how to search for potential model
errors and how to test their significance. But what we do not know yet, is how well these
tests will perform. In particular we would like to know how the tests perform in terms of
their power of detecting the model errors.
Note that the power γ depends on the three parameters α , q and λ . Using a shorthand
notation, we write
γ = γ (α , q, λ ) (3.56)
When testing, we of course would like to have a sufficiently high probability of correctly
detecting a model error when it occurs. One can make γ larger by increasing α , or, by
decreasing q, or, by increasing λ . This can be seen as follows. A larger α implies a
smaller critical value χα2 (q, 0) and via the integral (3.55) thus a larger power γ . Thus if
we want a smaller probability for the error of the first kind (α smaller), this will go at the
cost of a smaller γ as well. That is, one can not simultaneously decrease α and increase
γ.
The power γ also gets larger when q gets smaller. This is also understandable. When q gets smaller, the fewer additional parameters are used in Ha and therefore the more "information" is used in formulating Ha. For such an alternative hypothesis, one would expect
that if it is true, the probability of accepting it will be higher than for an alternative hy-
pothesis that contains more additional parameters. Finally, the power γ also gets larger,
when λ gets larger. This is understandable when one considers (3.53). For instance, one
would expect to have a higher probability of correctly rejecting the null hypothesis, when
the model error gets larger. And when ∇ gets larger, also the noncentrality parameter λ
gets larger.
Using α and/or q as tuning parameters to increase γ , does not make much sense how-
ever. The parameter q can not be changed at will, since it depends on the type of model
error one is considering. And increasing α , also does not make sense, since it would lead
to an increased probability of an error of the first kind. Hence, this leaves us with λ .
According to (3.53), the noncentrality parameter depends on the model error C∇, the design matrix A and the variance matrix Q_y. Since one also cannot change the model error at will, it is through changes in the variance matrix Q_y and/or in the design matrix A that one can increase λ, thereby improving the detection power of the test. Recall that it is also through these two matrices that one can improve the precision of the least-squares solution. Thus the detection power γ of the tests and the precision of the least-squares solution can be improved by the same means.
Minimal Detectable Biases (MDB’s): In the one dimensional case q = 1 and the matrix
C reduces to the vector c. Inverting (3.60) becomes then rather straightforward. As a
result we get for the size of the model error
| ∇ | = √( λ(α₁, 1, γ) / (c^T Q_y^{-1} Q_ê Q_y^{-1} c) )        (3.61)
The variate | ∇ | is known as the Minimal Detectable Bias (MDB). It is the size of the
model error that can just be detected with a probability γ , using the appropriate w-test
statistic. Larger errors will be detected with a larger probability and smaller errors with a
smaller probability.
In order to guarantee that the model error c∇ is detected with the same probability
by both the w-test and the overall model test, one will have to relate the critical values of
the two tests. This can be done by equalizing both the powers of the two tests and their
noncentrality parameters. Thus if λ (αm−n , m − n, γm−n) is the noncentrality parameter of
the T m−n -test statistic and λ (α1 , 1, γ1) the noncentrality parameter of the T 1 -test statistic,
we have
λ (αm−n , m − n, γ ) = λ (α1 , 1, γ ) (3.62)
From this relation, α_{m−n}, and thus the critical value of the overall model test, can be computed once the power γ and the level of significance α₁ are chosen. Common values in the case of geodetic networks are α₁ = 0.001 and γ = 0.80.
Data snooping: Note that the MDB can be computed for each alternative hypothesis,
once the vector c of that particular alternative hypothesis has been specified. In case of
data snooping, the MDB’s of the individual observations are computed as
| ∇_i | = √( λ(α₁, 1, γ) / (c_i^T Q_y^{-1} Q_ê Q_y^{-1} c_i) ) , i = 1, . . ., m        (3.63)
As with the w_i-test statistic, also this expression simplifies considerably when the variance matrix Q_y is diagonal. In that case we have

| ∇_i | = σ_{y_i} √( λ(α₁, 1, γ) / (1 − σ_{ŷ_i}²/σ_{y_i}²) )        (3.64)

where σ_{y_i}² is the a priori variance of the ith observation and σ_{ŷ_i}² is the a posteriori variance of this observation. We thus clearly see that a better precision of the observation, as well as a larger amount by which its precision gets improved by the adjustment, will improve the internal reliability, that is, will result in a smaller MDB.
The higher dimensional case: When q > 1, the inversion is a bit more involved. In this case ∇ is a vector, which implies that its direction needs to be taken into account as well. We use the factorization ∇ = ‖∇‖ d, where d is a unit vector (d^T d = 1). If we substitute this factorization into (3.60) and then invert the result, we get

‖∇‖ = √( λ(α_q, q, γ) / (d^T C^T Q_y^{-1} Q_ê Q_y^{-1} C d) ) , d a unit vector        (3.65)

The size of the model error now depends on the chosen direction vector d. But by letting d vary over the unit sphere in R^q, one can obtain the whole range of MDB's that can be detected with a probability γ.
This shows that the bias in x̂, due to the presence of the model error C∇, is

∇x̂ = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} C∇

Of course we do not know the actual model error C∇. But we do know the size of the model error that can be detected with a probability of γ. Hence, if we use the MDB and replace C∇ by c | ∇ | in case q = 1 or by C d ‖∇‖ in case q > 1, the vector ∇x̂ will show us by how much the least-squares estimator x̂ gets biased, when a model error of the size of the MDB would occur. Note that there are as many MDB's as there are alternative hypotheses. Hence, there are also as many vectors ∇x̂. This whole set is said to describe the external reliability.
In certain applications, one may not be interested in the whole parameter vector x, but only in particular functions of it, say θ = F^T x. In that case the external reliability is described by

∇θ̂ = F^T ∇x̂

A particular case occurs when one is only interested in the x₁-part of the parameter vector x = (x₁^T, x₂^T)^T. In that case F^T = (I, 0). The bias-vector ∇x̂₁ can then be shown to equal

∇x̂₁ = (Ā₁^T Q_y^{-1} Ā₁)^{-1} Ā₁^T Q_y^{-1} C∇

where, with the partitioning A = (A₁, A₂), the matrix Ā₁^T Q_y^{-1} Ā₁ is the reduced normal matrix and Ā₁ = [ I − A₂ (A₂^T Q_y^{-1} A₂)^{-1} A₂^T Q_y^{-1} ] A₁.
Bias-to-Noise Ratios (BNR's): In order to measure the significance of the external reliability, one can compare the bias-vectors ∇x̂ with the precision, or variance matrix Q_x̂, of x̂. This can be done by using the Bias-to-Noise Ratio (BNR)

λ_x̂ = ∇x̂^T Q_x̂^{-1} ∇x̂

Note that the BNR's are dimensionless and that they measure the squared lengths of the bias-vectors in the metric defined by the appropriate variance matrix. Also note that the BNR's are scalars. Thus λ_x̂ is a scalar, whereas ∇x̂ is a vector. This implies that with the BNR's, one only needs to evaluate a scalar per alternative hypothesis, whereas otherwise a complete vector would have to be evaluated for each alternative hypothesis.
From a computational point of view, there are also some shortcuts that can be used when computing the BNR's. For instance, when the complete vector ∇x̂ is considered, it can be shown that

λ_x̂ = ∇x̂^T Q_x̂^{-1} ∇x̂ = (C∇)^T Q_y^{-1} (C∇) − λ

The second expression on the right hand side may sometimes be easier to compute than the first expression on the right hand side, in particular when Q_y is diagonal. For the BNR
of a subset of the parameter vector, one can show that
An important feature of the BNR’s is that they can be used to formulate upperbounds
on the external reliability of functions of the parameters. For instance, if we consider the
function θ̂ = f T x̂ having the variance σθ̂2 and the bias ∇θ̂ = f T ∇x̂, then
| ∇θ̂ | ≤ σ_θ̂ √λ_x̂        (3.72)
This shows, that the potential bias in θ̂ due to an undetected model error of the size of the
MDB, will never be larger than the standard deviation times the square root of the BNR.
Connection of height networks: Consider two overlapping height networks, in which the heights h_i and H_i, i = 1, . . ., n, of the n common points have been observed in the first and the second network, respectively. The two height systems are assumed to differ by a scale factor λ and a translation t, so that the observation equations read

E{h_i} = λ H_i + t , E{H_i} = H_i , i = 1, . . ., n        (3.73)
Data snooping applied to the above connection model, implies that we are testing for
errors in the individual height coordinates. It will be clear that with the above model, one
will never be able to discriminate whether a blunder occurred in the h-coordinate or in the H-coordinate of a point i. That is, one will never be able to pinpoint a blunder in a height coordinate to one of the two height systems. Hence, it suffices to restrict our attention to
the wi -test statistics of one of the two sets of heights. We choose to restrict our attention to
the h-coordinates. Since the complete variance matrix of the observables is diagonal, we
can make use of the simplified expression (3.64) for the MDB’s. For the hi coordinates, it
reads
| ∇_i | = σ_h √( 17.075 / (1 − σ_{ĥ_i}²/σ_h²) )        (3.74)
Note that we assumed λ (α1 , 1, γ ) = 17.075. This value is based on the values α1 = 0.001
and γ = 0.80.
Scale and translation absent: When both scale and translation are absent, the model of
observation equations becomes linear and the least-squares solution of (3.73) amounts to taking a simple weighted average of the data. The variance of the least-squares solution ĥ_i then reads

σ_{ĥ_i}² = σ_h² σ_H² / (σ_h² + σ_H²)        (3.75)
Substitution into (3.74) gives for the MDB
| ∇_i | = √( 17.075 (σ_h² + σ_H²) )        (3.76)
This shows that the MDB will be about 5.8 times the standard deviation of the height
coordinate, if the two networks are equally precise. Hence, a blunder of this size in the hi
coordinate can be found with a probability of 80% with the wi -test.
Only scale is absent: In this case the model of observation equations is again linear. Note
however that the redundancy has decreased by one. In the previous case the redundancy
equalled n, whereas now, due to the additional unknown translation, it equals n − 1. Due
to the decrease in redundancy, the model will have less 'strength' and one can therefore
expect that the results of the adjustment will also be somewhat less precise. And indeed,
the variance of the least-squares solution ĥi reads now
σ_{ĥ_i}² = ( σ_h² σ_H² / (σ_h² + σ_H²) ) ( 1 + (σ_h²/σ_H²)(1/n) )        (3.77)
Note that (3.75) and (3.77) will differ less, the larger n is. Hence, the difference between
the two variances will become smaller when the number of points in the two overlapping
networks increases. The difference will also be small when σH2 >> σh2 , that is, when the
second network is considerably less precise than the first network. But in that case we also
would have σ_{ĥ_i}² ≈ σ_h², thus showing that no significant improvement in precision has taken place. This is of course understandable, because if the second network is considerably less precise than the first network, it will also not contribute much to the adjustment.
Substitution of (3.77) into (3.74) gives for the MDB
| ∇_i | = √( 17.075 (σ_h² + σ_H²) / (1 − 1/n) )        (3.78)
Note that the MDB goes to infinity when n = 1. This is due to the fact that there is no
redundancy when n = 1. In that case no model errors at all can be found by the statistical
tests and the solution is said to be infinitely unreliable. Also note that the above MDB is
larger than the one of (3.76). This is due to the decrease in redundancy of one. That is,
the outcomes of the statistical tests will now be somewhat less reliable than they were in
the previous case. But again, this difference can be made small by increasing the number
of points n.
Only translation is absent: When only the translation is absent, we again have a model
with a redundancy of n − 1. Note however, that the observation equations are now non-
linear. Hence, first a linearization needs to be carried out. The linearized model reads
E{ (Δh_1, . . ., Δh_n, ΔH_1, . . ., ΔH_n)^T } = [ λ° I_n , H° ; I_n , 0 ] (ΔH_1, . . ., ΔH_n, Δλ)^T , H° = (H_1°, . . ., H_n°)^T        (3.79)
where λ o is the approximate scale factor and Hio , i = 1, . . ., n, are the approximate heights
of the second network. Since geodetic networks often only differ slightly in scale, one
may take as approximate value λ o = 1. We will do so here also.
Note that the design matrix of the above model depends on the approximate heights H_i°. Hence, also the precision of the least-squares estimators will depend on them. The variance of the least-squares solution Δĥ_i now reads

σ_{ĥ_i}² = ( σ_h² σ_H² / (σ_h² + σ_H²) ) ( 1 + (σ_h²/σ_H²) (H_i°)² / ∑_{j=1}^{n} (H_j°)² )        (3.80)
Compare this result with that of (3.77). Substitution of (3.80) into (3.74) gives for the
MDB
| ∇_i | = √( 17.075 (σ_h² + σ_H²) / (1 − (H_i°)² / ∑_{j=1}^{n} (H_j°)²) )        (3.81)
Compare this result with that of (3.78). In the previous case, the MDB was, apart from the
a priori precision, only dependent on the number of points n. In the present case however,
the MDB has also become dependent on the heights of the points themselves. Thus in the
above expression for the MDB, we see four effects at work. As before, the MDB gets smaller when the a priori precision improves and/or when the
number of points increases. But now the MDB also gets smaller when the height of the
point being tested is closer to zero and/or when the heights of the remaining points are
further away from zero.
Both scale and translation present: As in the previous case, the observation equations
are again nonlinear. The linearized model reads
E{ (Δh_1, . . ., Δh_n, ΔH_1, . . ., ΔH_n)^T } = [ λ° I_n , u_n , H° ; I_n , 0 , 0 ] (ΔH_1, . . ., ΔH_n, Δt, Δλ)^T , u_n = (1, . . ., 1)^T        (3.82)
Due to the additional unknown, the translation t, the redundancy now equals n − 2. Thus again we can expect a somewhat less precise and less reliable result. The variance of the least-squares solution Δĥ_i now reads

σ_{ĥ_i}² = ( σ_h² σ_H² / (σ_h² + σ_H²) ) ( 1 + (σ_h²/σ_H²)( 1/n + (H̄_i°)² / ∑_{j=1}^{n} (H̄_j°)² ) )        (3.83)

where H̄_i° = H_i° − (1/n) ∑_{j=1}^{n} H_j°, i.e. the difference in height of point i with respect to the average height of the network. Compare this result with that of (3.80). Substitution of
(3.83) into (3.74) gives for the MDB
| ∇_i | = √( 17.075 (σ_h² + σ_H²) / (1 − 1/n − (H̄_i°)² / ∑_{j=1}^{n} (H̄_j°)²) )        (3.84)
Compare this result with that of (3.81). We see almost the same four effects present in the
expression for the MDB. The only difference is that the heights are now referenced to the
average height of the network, instead of to zero, as it was the case when the translation
was absent. Thus the point having the smallest MDB is the one whose height is closest to the average of the network. If it coincides with the average, then H̄_i° = 0 and the above expression reduces to that of (3.78).
We are now in a position to summarize the two main diagnostics by which the quality
of estimation and testing can be characterized. They are precision and reliability. In
order to discuss them properly, we follow the general steps involved when performing the
adjustment and the testing.
Formulate Ho and Ha : In order to be able to perform the adjustment, one first must have
a working hypothesis, the null hypothesis Ho , available. It reads
This is the model one beliefs to be true and on which one would like to base the ad-
justment. But of course, relying on this model without checking its validity would be
dangerous, since errors in it could completely ruin the results of the adjustment. The
purpose of testing is therefore to check whether the null hypothesis is likely to be true
or not. This is done by opposing the above model to one or more alternative hypotheses
Ha . In these lecture notes we have restricted ourselves to alternative hypotheses, which
differ from the null hypothesis in their functional model only. The alternative hypotheses
considered are therefore of the form

H_a: \quad E\{y\} = Ax + C\nabla, \quad D\{y\} = Q_y \qquad (3.86)

The two types of hypotheses thus differ in their mean of y only, E\{y \mid H_a\} = E\{y \mid H_o\} + C\nabla. The vector \nabla y = C\nabla, with matrix C known and vector \nabla unknown, describes the
assumed model error.
The quality under Ho and Ha : Based on our working hypothesis Ho , the least-squares
estimator of x reads
\hat{x} = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1}\, y \qquad (3.87)
It will be clear that the quality of x̂, as expressed by its expectation and its dispersion,
depends on whether Ho is true or Ha is true. In the first case we have
\text{mean}: \; E\{\hat{x}\} = x, \qquad \text{variance}: \; D\{\hat{x}\} = Q_{\hat{x}} \qquad (H_o \text{ true}) \qquad (3.88)

and in the second case

\text{mean}: \; E\{\hat{x}\} = x + \nabla\hat{x}, \qquad \text{variance}: \; D\{\hat{x}\} = Q_{\hat{x}} \qquad (H_a \text{ true}) \qquad (3.89)

with the bias \nabla\hat{x} = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} C\nabla.
Since one can never be completely sure whether the null hypothesis is true or whether
one of the alternative hypotheses is true, the quality of the estimator is made up of the two
components ∇x̂ and Qx̂ . The variance matrix, which describes the precision, is known and
in the last section of the previous chapter we discussed how one could evaluate it. The
bias ∇x̂ however, is unknown, since it depends on the unknown model error C∇.
Testing: In order to minimize the risk that a bias like ∇x̂ indeed occurs, one needs to
check the validity of the null hypothesis. This is the whole purpose of testing. Although
there is only one null hypothesis, there are in practice often many more than one alter-
native hypotheses. The class of alternative hypotheses considered depends very much on
the application at hand and on one's experience. Through the alternative hypotheses one
specifies the model errors that one believes are likely to occur. The testing procedure to
be applied then consists of detection, identification and adaptation. In the detection step
the overall validity of the null hypothesis is checked. Once this test leads to a rejection
of Ho , one needs to identify the most likely alternative hypothesis. In the final step, the
adaptation step, one then corrects for the identified misspecification in the null hypothesis.
Reliability: Although the statistical testing of Ho minimizes the risk that a bias like ∇x̂
occurs, one should realize that the outcomes of the statistical tests are not exact and thus
still prone to errors (type I and type II errors). It depends on the 'strength' of the model
how much confidence one will have in these outcomes. A measure of this confidence is
provided by the concept of reliability. When the w-test statistics are used, the internal
reliability is described by the set of MDB’s, one for each alternative hypothesis. The
MDB is given as
| \nabla | = \sqrt{\frac{\lambda(\alpha_1, 1, \gamma)}{c^T Q_y^{-1} Q_{\hat{e}} Q_y^{-1} c}} \qquad (3.90)
It is the size of the model error that can be found with a probability γ , when using the
w-test. The internal reliability improves when the MDB’s get smaller and gets worse
when the MDB’s get larger. Note however, that it only makes sense to consider these
MDB’s when the statistical tests are actually carried out. Hence, the solution is said to be
infinitely unreliable by definition, if no statistical testing has taken place. Also note, that
the MDB, apart from being dependent on the chosen values for α1 and γ , is governed by
the c-vector, the design matrix A and the variance matrix Qy . The vector c can not be
changed at will, since it depends on the particular model error one is considering. Hence,
this leaves us, much like in the case of precision, with the two matrices A and Qy for
improving the internal reliability.
The external reliability describes how model errors of the size of the MDB’s propagate
into the results of the adjustment,

\nabla\hat{x} = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} c\,\nabla \qquad (3.91)

with | \nabla | the MDB of (3.90). This vector thus describes the bias in x̂, when a model error of the size of the MDB has
occurred.
Precision and reliability: Once one has evaluated the precision of the least-squares so-
lution and found it adequate for the application at hand, one can evaluate the significance
of the bias vector ∇x̂ through the Bias-to-Noise Ratio (BNR)

\lambda_{\hat{x}} = \sqrt{\nabla\hat{x}^T\, Q_{\hat{x}}^{-1}\,\nabla\hat{x}} \qquad (3.92)

The dimensionless BNR measures the bias relative to the precision. For an arbitrary
function θ̂ = f^T x̂, the BNR can be used to obtain the upper bound

\frac{|\nabla\hat{\theta}|}{\sigma_{\hat{\theta}}} \le \lambda_{\hat{x}} \qquad (3.93)
Thus if for the particular application at hand, the BNR is considered to be small enough,
the bias in any function of x̂ is guaranteed to be sufficiently insignificant as well.
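The quantities summarized above can be computed directly from A, Q_y and c. The following Python sketch, with a purely illustrative model and c-vector, evaluates the MDB (3.90), the corresponding bias ∇x̂ and the BNR (3.92):

import numpy as np

# assumed illustrative model E{y} = A x, D{y} = Qy
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
Qy = 0.02**2 * np.eye(4)
c = np.array([0.0, 0.0, 1.0, 0.0])   # assumed model error: blunder in the third observation
lam0 = 17.075                        # lambda(alpha_1, 1, gamma) used in the text

Qy_inv = np.linalg.inv(Qy)
N = A.T @ Qy_inv @ A                 # normal matrix
Qx = np.linalg.inv(N)                # Q_xhat
Qe = Qy - A @ Qx @ A.T               # Q_ehat, variance matrix of the residuals

mdb = np.sqrt(lam0 / (c @ Qy_inv @ Qe @ Qy_inv @ c))   # MDB (3.90)
nabla_x = Qx @ A.T @ Qy_inv @ c * mdb                   # external reliability (3.91)
bnr = np.sqrt(nabla_x @ N @ nabla_x)                    # BNR (3.92)

print("MDB:", mdb, " bias in x_hat:", nabla_x, " BNR:", bnr)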
Chapter 4
Adjustment and validation of networks
4.1 Introduction
In this chapter we will discuss the various computational steps involved in determining
the geometry of a geodetic network. Let us first briefly review the general steps involved.
They are the design, the adjustment and the testing.
Design: Before one can start, one needs to specify the functional model and the stochastic
model. In the functional model, one formulates the assumed relation between the observ-
ables and the unknown parameters. These observation equations depend on the type of
observables used (e.g. angles, distances, baselines, etc.) and on the choice of parameteri-
zation (e.g. Cartesian coordinates or geographic coordinates). The observation equations
may be linear or nonlinear. In the nonlinear case, they first need to be linearized. For the
linearization, approximate values for the parameters are needed. When the approximate
geometry of the network is known, it is often possible to obtain the approximate coordi-
nates from a map. The approximate coordinates can also be computed from a sufficient
set of observations. The design matrix A is known, once the functional model is specified.
With the stochastic model, one specifies the assumed distributional properties of the
observables. In geodesy it often suffices to assume the observables to be normally dis-
tributed. In addition one has to specify the second moment, the variance matrix, of the
distribution. The variance matrix of the observables describes their precision. The spec-
ification of this variance matrix Qy depends on the measurement equipment and on the
measurement procedures used.
The two matrices A and Qy are known, once the functional and stochastic model are
known. These two matrices can be used before the actual adjustment and testing is carried
out, to infer the expected quality in terms of precision and reliability, of the network. In
order to evaluate the precision of the least-squares solution x̂, its variance matrix Qx̂ is
used. This matrix quantifies how random errors in the observables propagate into the
least-squares solution. In order to evaluate the reliability of the least-squares solution, the
minimal detectable bias vector ∇x̂ is used. It quantifies how a potential error c∇ in the
functional model propagates into the least-squares solution. The size of the potential error
is coupled to the power of the corresponding test statistic. Both Qx̂ and ∇x̂ depend on A
and Qy . Hence, one can improve the precision and reliability, by changing A and/or Qy .
Adjustment: Once one is satisfied with the design of the network, the actual adjustment
can be carried out. It needs, apart from the design matrix A and the variance matrix Qy ,
of course also the actual observations y. The adjustment is based on the principle of least-
squares and it produces a solution for the unknown parameters. This solution is obtained
by solving the normal equations, for which usually the Cholesky-decomposition is used.
In case of nonlinear observation equations, the least-squares solution is usually iterated
using the normal equations based on the linearized observation equations. The number of
iterations will be small, when good approximate values are used.
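A minimal Python sketch of this adjustment step, for an assumed linear(ized) model with illustrative numbers, forming the normal equations and solving them with a Cholesky decomposition:

import numpy as np
from scipy.linalg import cho_factor, cho_solve

# assumed illustrative linearized model E{y} = A x, D{y} = Qy
A = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])
Qy = 0.003**2 * np.eye(3)
y = np.array([1.012, 0.498, -1.507])

W = np.linalg.inv(Qy)                  # weight matrix
N = A.T @ W @ A                        # normal matrix
b = A.T @ W @ y                        # right-hand side

x_hat = cho_solve(cho_factor(N), b)    # Cholesky solution of the normal equations
Q_xhat = np.linalg.inv(N)              # variance matrix of the estimates
e_hat = y - A @ x_hat                  # least-squares residuals

print("x_hat:", x_hat)
print("residuals:", e_hat)

For a nonlinear model this solution step is simply repeated, each time relinearizing at the latest estimates, until the corrections become negligible.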
On the basis of the assumption that the model (functional and stochastic) has been
specified correctly, the linear(ized) least-squares estimators are known to be unbiased and
of minimal variance. These properties will fail to hold however, when a misspecified
model has been used in the computations. Hence, before the least-squares solution x̂ can
be accepted, one has to test the validity of the model.
Testing: The validity of the model (the null hypothesis) is tested by opposing it to vari-
ous alternative models (the alternative hypotheses). For each alternative hypothesis, one
has an appropriate test statistic. But all test statistics are functions of the least-squares
residual vector ê. The testing procedure consists of the following three steps: detection,
identification and adaptation. In the detection step, the null hypothesis is opposed to the
most relaxed alternative hypothesis. The purpose of the detection step is to infer whether
one has any reason to believe that the null hypothesis is indeed wrong. When the detection
step leads to a rejection of the null hypothesis, the next step is the identification of the
most likely model error. For identification one needs to specify the alternative hypothesis.
This choice depends on the type of model error one expects to be present. Hence, it very
much depends on the application at hand. It is standard practice however, to have data
snooping included in the identification step. In case of data snooping each of the individ-
ual observations is screened for potential blunders. Once certain model errors have been
identified as sufficiently likely, the last step consists of an adaptation of the data and/or
model. Depending on the situation, two approaches are possible in principle. One can
either decide to remeasure some of the observables, or, one can decide to include addi-
tional parameters in the model, such that the model errors are accounted for. The first
approach is possible in case the model errors are due to clear measurement errors, e.g.
blunders in individual observations. The second approach can be used for more compli-
cated situations. In this case the identified alternative hypothesis will become the new null
hypothesis. One should be aware however, that in this case, due to the change in model,
the precision and reliability of the solution will change as well.
The above considerations for the design, the adjustment and the testing, are valid for
any geodetic project where measurements are used to determine unknown parameters.
When computing geodetic networks however, some additional aspects need to be consid-
ered as well. The construction of a geodetic network implies that the geometry of the
configuration of a set of points is determined. The set of points usually consists of: (1)
newly established points, of which the coordinates still need to be determined, and (2)
already existing points, the so-called control points, of which the coordinates are known.
By means of a network adjustment the relative geometry of the new points is determined
and integrated into the geometry of the existing control points. The determination of the
geometry is usually divided into two parts, the so-called free network adjustment and the
connection adjustment.
The free network: In the free network adjustment, the known coordinates of the control
points do not take part in the determination of the geometry of the point field. This
adjustment step is thus free from the influence of the existing control points. The idea is
that a good geodetic network should be sufficiently precise and reliable in itself, without
the need of external control.
Since the coordinates of the control points do not take part in the adjustment, one
is confronted with the fundamental non-uniqueness in the relation between geodetic ob-
servables and coordinates. In a levelling network for instance, absolute heights can not
be determined if only height differences are measured. That is, for computing heights,
additional information is needed on the absolute height of the network. Similarly, one
can not obtain the position, the orientation and the scale of a triangulation network if only
angles are measured. The additional information which is needed to be able to compute
coordinates from the geodetic observables, is provided for in the form of so-called min-
imal constraints. These minimal constraints are not unique. There is a whole set from
which the constraints can be chosen. It is important though, that the constraints are mini-
mal. That is, they should not only be necessary to eliminate the lack of information in the
observables, but they should also be sufficient.
After the design of the free network, its coordinates are computed by means of a
least-squares adjustment and its validity is checked by means of the statistical testing
of the observations. The coordinates depend on the chosen minimal constraints, but the
statistical tests do not.
The connected network: Once the geometry of the free network has been determined
to ones satisfaction, it needs to be integrated into the existing geometry of the control
points. The data used for this connection, are the results of the free network adjustment
together with the coordinates of the control points. One can discriminate between the so-
called constrained connection and the unconstrained connection. In most applications,
it is not very practical to see the coordinates of the control points change every time a
free network is connected to them. This would happen however, when an ordinary least-
squares adjustment is carried out. In that case all observations, including the coordinates
of the control points, would get corrections due to the least-squares adjustment. In order to
circumvent this, no ordinary adjustment, but a constrained adjustment is carried out. This
implies that the connection is carried out, with the explicit constraints that the coordinates
of the existing control points remain fixed.
For the statistical testing of the observations however, a constrained adjustment would
not be realistic. Although practice dictates that the coordinates of the control points re-
main fixed, these coordinates are of course still samples from random variables. Hence,
for the statistical testing of the observations, the variance matrix of the control points
should not be set to zero, but should be included in the adjustment as well. Thus, the
constrained connection is carried out for the final computation of the coordinates, but the
unconstrained connection is used for the statistical testing of the control points.
This chapter is organized as follows. First we will discuss the free network case and then
we will show how such networks can be connected to existing control. For the free networks, we
present the observation equations, discuss their invariance and show how one can choose
the minimal constraints.
In this section we consider networks for which the observational data are insufficient
to determine either the (horizontal and/or vertical) position, orientation or scale of the
network (note: here and in the remaining part of the lecture notes, we will disregard so
called configuration defects. That is, we assume that sufficient observational data are used
to determine the configuration of the network).
As we have seen in the earlier sections on adjustment theory, a consequence of the
non-uniqueness is that the design matrix A of the model of observation equations E{y} =
Ax, D{y} = Qy , will have a rank defect. Therefore no unbiased linear estimator x̂ = Ly
of x exists, since this would require E{x̂} = LE{y} = LAx = x for all x, or LA = I, which is
impossible since the rank of a product of two matrices can not exceed the rank of either
factor. But although x is not unbiased estimable, we have seen that functions of x exist
that are unbiased estimable. In particular we have shown that the minimally constrained
solution x̂b is an unbiased estimator of Sb x, with Sb being the S-transformation defined by
the null space of the design matrix, N(A) = R(G), and the minimal constraints BT x = 0.
In this section we will show how minimally constrained solutions for geodetic net-
works can be constructed. We start off by presenting the nonlinear observation equations
of some geodetic observables and their linearized versions. Then we discuss the invari-
ance properties of these geodetic observables. Their invariance can usually be related to
properties of coordinate transformations. Once these properties of invariance are under-
stood, one will be able to identify the null space of the design matrix A. As a consequence
the matrix G, of which the columns span N(A), can be constructed. Understanding the in-
variance, also allows one to specify the minimal constraints BT x = 0. They are needed to
be able to compute a particular least-squares solution of the rank defect model E{y} = Ax,
D{y} = Qy . Such a solution is one of the many free network solutions that can be com-
puted. By means of the S-transformation, one is able to transform one particular free
network solution into another. This section will be concluded with an adjustment exam-
ple of a free GPS network and a testing example of a free levelling network.
Height difference: Probably the simplest of all geodetic observation equations is the one
that corresponds with observed height differences. Let hi j denote the height difference be-
tween two points i and j, and let the heights of these two points be denoted as respectively
hi and h j . The observation equation for an observed height difference reads then
E{hi j } = h j − hi (4.1)
Azimuth: If we consider a two dimensional network and assume that all network points
are located in the two-dimensional Euclidean plane, the nonlinear observation equation
for an azimuth ai j between the two points i and j reads
E\{a_{ij}\} = \arctan\frac{x_j - x_i}{y_j - y_i} \qquad (4.2)

Linearization gives

E\{\Delta a_{ij}\} = \frac{y_j^o - y_i^o}{(l_{ij}^o)^2}\,\Delta x_j - \frac{y_j^o - y_i^o}{(l_{ij}^o)^2}\,\Delta x_i - \frac{x_j^o - x_i^o}{(l_{ij}^o)^2}\,\Delta y_j + \frac{x_j^o - x_i^o}{(l_{ij}^o)^2}\,\Delta y_i \qquad (4.3)

with

\Delta a_{ij} = a_{ij} - \arctan\frac{x_j^o - x_i^o}{y_j^o - y_i^o}

and where l_{ij}^o = \sqrt{(x_j^o - x_i^o)^2 + (y_j^o - y_i^o)^2} denotes the approximate distance between the two points.
Distance: For a planar network, the nonlinear observation equation for a distance reads
E\{l_{ij}\} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2} \qquad (4.4)
Linearization gives
E\{\Delta l_{ij}\} = \frac{x_j^o - x_i^o}{l_{ij}^o}\,\Delta x_j - \frac{x_j^o - x_i^o}{l_{ij}^o}\,\Delta x_i + \frac{y_j^o - y_i^o}{l_{ij}^o}\,\Delta y_j - \frac{y_j^o - y_i^o}{l_{ij}^o}\,\Delta y_i \qquad (4.5)
with
\Delta l_{ij} = l_{ij} - \sqrt{(x_j^o - x_i^o)^2 + (y_j^o - y_i^o)^2}
Angle: An angle \alpha_{ijk} between three points i, j and k, is the difference between the two azimuths a_{jk} and a_{ji}. Thus we have

E\{\alpha_{ijk}\} = E\{a_{jk} - a_{ji}\} = \arctan\frac{x_k - x_j}{y_k - y_j} - \arctan\frac{x_i - x_j}{y_i - y_j} \qquad (4.6)

Its linearized version then follows from the linearized versions of the two azimuths.
Distance ratio: The distance ratio v_{ijk} between three points i, j and k, is the ratio of the two distances l_{jk} and l_{ji}. Thus we have

E\{v_{ijk}\} = E\left\{\frac{l_{jk}}{l_{ji}}\right\} = \frac{\sqrt{(x_k - x_j)^2 + (y_k - y_j)^2}}{\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}} \qquad (4.7)
Linearization gives
E\{\Delta v_{ijk}\} = \frac{l_{ji}^o\,\Delta l_{jk} - l_{jk}^o\,\Delta l_{ji}}{(l_{ji}^o)^2}
This can be further expressed in terms of the coordinate increments by making use of
(4.5).
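The linearized coefficients of (4.3) and (4.5) are easily evaluated numerically. The following sketch (illustrative, not from the book) returns the design-matrix row of an azimuth and of a distance observation, for the assumed unknown ordering (Δx_i, Δy_i, Δx_j, Δy_j):

import numpy as np

def azimuth_row(xi, yi, xj, yj):
    """Coefficients of (dx_i, dy_i, dx_j, dy_j) in the linearized azimuth (4.3)."""
    dx, dy = xj - xi, yj - yi
    l2 = dx**2 + dy**2
    return np.array([-dy / l2,  dx / l2,  dy / l2, -dx / l2])

def distance_row(xi, yi, xj, yj):
    """Coefficients of (dx_i, dy_i, dx_j, dy_j) in the linearized distance (4.5)."""
    dx, dy = xj - xi, yj - yi
    l = np.hypot(dx, dy)
    return np.array([-dx / l, -dy / l, dx / l, dy / l])

# assumed approximate coordinates of points i and j
print(azimuth_row(0.0, 0.0, 100.0, 50.0))
print(distance_row(0.0, 0.0, 100.0, 50.0))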
Direction: The observation equation for an observed direction r_{ij} reads

E\{r_{ij}\} = a_{ij} + o_j \qquad (4.8)
where o j is the orientation unknown. Apart form the additional orientation unknown, the
linearized observation equation for a direction is the same as that for an azimuth. Note
that the difference of two directions having the same orientation, produces an angle.
Pseudo-distance: The observation equation for an observed pseudo-distance s_{ij}, with unknown scale factor \lambda_i, reads

E\{s_{ij}\} = \lambda_i\, l_{ij} \qquad (4.9)
This can be further expressed in terms of the coordinate increments by making use of
(4.5). Note that the ratio of two pseudo-distances having the same scale factor, produces
a distance ratio.
The three dimensional case: So far we assumed all network points to lie in the two di-
mensional Euclidean plane. For a three dimensional network though, we have to take the
third dimension into account as well. For a network of a sufficiently large extent, also the
change in the direction of the plumbline will have to be taken into account. This implies
that for direction measurements like the azimuth ai j and the zenith angle zi j , the astro-
nomical latitude Φi and astronomical longitude Λi will enter the observation equations as
well. The nonlinear observation equations for respectively the azimuth, the zenith angle
and the distance between two points i and j, are given as
E\{a_{ij}\} = \arctan\frac{-\sin\Lambda_i\,(x_j - x_i) + \cos\Lambda_i\,(y_j - y_i)}{-\sin\Phi_i\cos\Lambda_i\,(x_j - x_i) - \sin\Phi_i\sin\Lambda_i\,(y_j - y_i) + \cos\Phi_i\,(z_j - z_i)}

E\{z_{ij}\} = \arccos\frac{\cos\Phi_i\cos\Lambda_i\,(x_j - x_i) + \cos\Phi_i\sin\Lambda_i\,(y_j - y_i) + \sin\Phi_i\,(z_j - z_i)}{\sqrt{(x_j - x_i)^2 + (y_j - y_i)^2 + (z_j - z_i)^2}}

E\{l_{ij}\} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2 + (z_j - z_i)^2}
The unknowns in these observation equations are now apart from the Cartesian coordi-
nates, also the astronomical latitude and longitude. The linearization of the above obser-
vation equations is left as an exercise to the reader.
GPS baseline: As our last example we consider the three dimensional baseline. When
expressed in Cartesian coordinates, the observation equations for its three components
read
E{xi j } = x j − xi
E{yi j } = y j − yi (4.11)
E{zi j } = z j − zi
As with the height differences no linearization is needed, since the equations are already
linear in the parameters.
Instead of using Cartesian coordinates, one may of course use other type of coordinates
as well. In three dimensions, one often also makes use of the geographic coordinates φ ,
λ and h, where h now refers to the height above the reference ellipsoid. The Cartesian
coordinates and geographic coordinates are related as
x = (N + h) cos φ cos λ
y = (N + h) cos φ sin λ (4.12)
z = (N(1 − e2 ) + h) sin φ
where N is the radius of curvature in the prime vertical
N = \frac{a}{\sqrt{1 - e^2\sin^2\phi}}
and e^2 = (a^2 - b^2)/a^2, with a and b being the lengths of the major and minor axes of the
ellipsoid of revolution.
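A short Python sketch of the mapping (4.12); the GRS80 values of a and e² are used here only as an assumed example ellipsoid:

import numpy as np

def geographic_to_cartesian(phi, lam, h, a=6378137.0, e2=0.00669438002290):
    """Convert geographic (phi, lam in radians, h in metres) to Cartesian, cf. (4.12)."""
    N = a / np.sqrt(1.0 - e2 * np.sin(phi)**2)   # radius of curvature in the prime vertical
    x = (N + h) * np.cos(phi) * np.cos(lam)
    y = (N + h) * np.cos(phi) * np.sin(lam)
    z = (N * (1.0 - e2) + h) * np.sin(phi)
    return np.array([x, y, z])

print(geographic_to_cartesian(np.radians(52.0), np.radians(4.4), 43.0))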
The one dimensional case: As a one dimensional example we consider the case of level-
ling. Let hi denote the height of point i in the first coordinate system and let Hi denote the
height of the same point in the second coordinate system. The coordinate transformation
between the two coordinate systems reads then
hi = Hi + t (4.13)
where t is a translation or a shift in height, which is constant for all points. It will be clear
that this coordinate transformation leaves observed height differences invariant. That is,
for the observation equation of a height difference we have
E{hi j } = h j − hi = H j − Hi (4.14)
This shows that the height differences are invariant for the translation t. This immedi-
ately implies that the design matrix A of a levelling network that is built up from height
differences only, will have a null space which is spanned by the column of the matrix
G = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}_{n\times 1} \qquad (4.15)
Thus once we know the transformation which leaves the observation equations invariant,
we only need to take its partial derivative with respect to the parameters in order to obtain
the G-matrix.
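For the levelling case this is easily verified numerically. The sketch below, for an assumed four-point levelling network, checks that the design matrix indeed annihilates the G-matrix of (4.15):

import numpy as np

# assumed levelling network: observed height differences h12, h23, h31, h34, h42
# unknowns are the heights h1..h4
A = np.array([[-1,  1,  0,  0],
              [ 0, -1,  1,  0],
              [ 1,  0, -1,  0],
              [ 0,  0, -1,  1],
              [ 0,  1,  0, -1]], dtype=float)

G = np.ones((4, 1))          # null-space matrix (4.15): a common shift of all heights

print(A @ G)                 # zero vector: height differences are invariant for the shift
print(np.linalg.matrix_rank(A))   # rank 3 < 4: one datum (minimal) constraint is needed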
The transformation which plays an important role in case of geodetic observables, is
the Similarity transformation. For the two dimensional case, it reads (see figure 4.1)
\begin{bmatrix} x_i \\ y_i \end{bmatrix} = \lambda\begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix}\begin{bmatrix} u_i \\ v_i \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} \qquad (4.20)
with
λ : scale
α : rotation
tx , ty : translation
and where xi , yi are the coordinates of the first coordinate system and ui , vi , the coordinates
of the second coordinate system. Linearization of the similarity transformation requires approximate values for the transformation parameters; the transformation itself is illustrated in figure 4.1.

[Figure 4.1: the two dimensional similarity transformation between the (u, v)- and the (x, y)-coordinate system, with scale λ, rotation angle α and translations t_x, t_y]

Distance ratio: When the coordinate transformation (4.20) is substituted into the observation equation of a distance ratio, the scale, the rotation and the two translations all cancel,
thus showing that the transformation parameters λ, α, t_x and t_y are absent. These parameters will also be absent when one considers the observation equation of an angle. As
a consequence, the design matrix A of a geodetic network, that has been built up from
distance ratios and/or angles only, will have a null space which is spanned by the columns
of the matrix
G = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & x_i^o & -y_i^o \\ 0 & 1 & y_i^o & x_i^o \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}_{2n\times 4} \qquad (4.22)
where n is the number of points in the network.
Since matrix G has four independent columns, four constraints are needed to take care
of the nonuniqueness. The simplest way to fix the degrees of freedom of scale, orientation
and translation, would be to fix the coordinates of two points. The corresponding B-matrix
reads then
B^T = \begin{bmatrix} 0 & 0 & \cdots & I_2 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & I_2 & \cdots & 0 & 0 \end{bmatrix}_{4\times 2n} \qquad (4.23)
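The following sketch builds the G-matrix (4.22) for assumed approximate coordinates and checks that a B-matrix of the type (4.23), here taken to fix the coordinates of points 1 and 2, yields an invertible B^T G, i.e. an admissible set of minimal constraints:

import numpy as np

xy = np.array([[0.0, 0.0],      # assumed approximate coordinates of the network points
               [100.0, 0.0],
               [100.0, 80.0],
               [0.0, 80.0]])
n = xy.shape[0]

# G-matrix (4.22): translations, scale and rotation, 2n x 4
G = np.zeros((2 * n, 4))
for i, (x, y) in enumerate(xy):
    G[2*i]     = [1.0, 0.0,  x, -y]
    G[2*i + 1] = [0.0, 1.0,  y,  x]

# B-matrix of type (4.23): fix the coordinates of points 1 and 2, 4 x 2n
B_T = np.zeros((4, 2 * n))
B_T[0:2, 0:2] = np.eye(2)
B_T[2:4, 2:4] = np.eye(2)

print(np.linalg.matrix_rank(B_T @ G))   # 4: the constraints are minimal (B^T G invertible)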
Azimuth: When we substitute the coordinate transformation (4.20) into the observa-
tion equation of an azimuth, we get
E\{a_{ij}\} = \arctan\frac{x_{ij}}{y_{ij}} = \arctan\frac{\cos\alpha\,u_{ij} - \sin\alpha\,v_{ij}}{\sin\alpha\,u_{ij} + \cos\alpha\,v_{ij}}
This shows that scale, λ , and the two translations, tx , ty , get eliminated, but that the angle
of rotation α does not get eliminated. Hence, with azimuth observables one cannot de-
termine the scale and position of the network, but only its orientation. As a consequence,
the design matrix A of a geodetic network, that has been built up from azimuths only, will
have a null space which is spanned by the columns of the matrix
G = \begin{bmatrix} \vdots & \vdots & \vdots \\ 1 & 0 & x_i^o \\ 0 & 1 & y_i^o \\ \vdots & \vdots & \vdots \end{bmatrix}_{2n\times 3} \qquad (4.24)
Now three constraints are needed. One could for instance fix the two coordinates of the
first point. This takes care of the two translational degrees of freedom. The scale can be
fixed by constraining the distance between the first and second point. The corresponding
B-matrix reads then
B^T = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & 0 & \cdots \\ -x_{12}^o & -y_{12}^o & x_{12}^o & y_{12}^o & 0 & \cdots \end{bmatrix}_{3\times 2n} \qquad (4.25)
Distance: For distance observables one can expect that the rotation angle and the
translations get eliminated. And indeed, we have
E\{l_{ij}\} = \sqrt{x_{ij}^2 + y_{ij}^2} = \lambda\sqrt{u_{ij}^2 + v_{ij}^2}
The three dimensional case: In three dimensions, the nonlinear similarity transformation
is given as
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} = \lambda\, R(\alpha)R(\beta)R(\gamma)\begin{bmatrix} u_i \\ v_i \\ w_i \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix} \qquad (4.27)
with
λ : scale
α, β , γ : rotation
tx , ty, tz : translation
and the three rotation matrices
R(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{bmatrix}, \quad R(\beta) = \begin{bmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{bmatrix}, \quad R(\gamma) = \begin{bmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}
GPS baseline: Since the baseline observable is only invariant for translations and not
for rotations and scale changes, the G-matrix of a geodetic network built up from baselines
only, is given as
G = \begin{bmatrix} \vdots & \vdots & \vdots \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \vdots & \vdots & \vdots \end{bmatrix}_{3n\times 3} \qquad (4.29)
Note that this is the three dimensional analogue of the levelling-case.
To conclude this section, we have given in table 4.1 the various entries of the G-matrix, when different types of geodetic observables are used.
Adjustment example of a free GPS network: Consider a small GPS network of three points, connected by three observed baselines, each denoted as b_{ij} = (x_{ij}, y_{ij}, z_{ij})^T.
It is our goal to determine the Cartesian coordinates of the three points 1, 2 and 3. The
position vector of point i will be denoted as pi . Thus
pi = (xi , yi , zi )T
We will assume that the three baselines have been determined independently. Thus no
correlation is assumed to exist between the baselines. We also assume that the three
baselines have been determined with the same precision. Thus we assume that all three
baselines have the same variance matrix, which will be denoted as Q.
Based on the above assumptions, we can formulate the model of observation equations
as
E\left\{\begin{bmatrix} b_{12} \\ b_{23} \\ b_{31} \end{bmatrix}\right\} = \begin{bmatrix} -I_3 & I_3 & 0 \\ 0 & -I_3 & I_3 \\ I_3 & 0 & -I_3 \end{bmatrix}\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}, \qquad D\left\{\begin{bmatrix} b_{12} \\ b_{23} \\ b_{31} \end{bmatrix}\right\} = \begin{bmatrix} Q & 0 & 0 \\ 0 & Q & 0 \\ 0 & 0 & Q \end{bmatrix} \qquad (4.30)
This is a system of 9 observation equations in 9 unknowns. Note however, that the design
matrix is not of full rank. It has a rank defect of 3. Thus the redundancy of the model
equals 9 − 9 + 3 = 3. Thus if one would formulate the model in terms of condition
equations, one would have 3 independent condition equations. These three equations are
given as E{b12 + b23 + b31 } = 0.
Due to the rank defect of the model of observation equations, we need to specify
minimal constraints in order to be able to compute a particular least-squares solution. We
know that the range space of the matrix B of the minimal constraints BT x = 0, needs to
satisfy Rn = R(B) ⊕ R(AT ) = R(B) ⊕ N(A)⊥ . Thus R(B) needs to be complementary to
N(A)^⊥ = R(G)^⊥. The null space of A is spanned by the columns of the matrix

G = \begin{bmatrix} I_3 \\ I_3 \\ I_3 \end{bmatrix} \qquad (4.31)
Point 1 as fixed (datum) point: The choice of matrix B_{(1)} for the minimal constraints corresponds to a fixing of the coordinates of point 1. The corresponding least-squares solution will be denoted as x̂_{(1)}. It reads

\hat{x}_{(1)} = B_{(1)}^{\perp}\left[(B_{(1)}^{\perp})^T A^T Q_y^{-1} A\, B_{(1)}^{\perp}\right]^{-1}(B_{(1)}^{\perp})^T A^T Q_y^{-1}\, y
Since

A^T Q_y^{-1} A = \begin{bmatrix} 2Q^{-1} & -Q^{-1} & -Q^{-1} \\ -Q^{-1} & 2Q^{-1} & -Q^{-1} \\ -Q^{-1} & -Q^{-1} & 2Q^{-1} \end{bmatrix} \quad\text{and}\quad B_{(1)}^{\perp} = \begin{bmatrix} 0 & 0 \\ I_3 & 0 \\ 0 & I_3 \end{bmatrix}

it follows that

\left[(B_{(1)}^{\perp})^T A^T Q_y^{-1} A\, B_{(1)}^{\perp}\right]^{-1} = \begin{bmatrix} 2Q^{-1} & -Q^{-1} \\ -Q^{-1} & 2Q^{-1} \end{bmatrix}^{-1} = \begin{bmatrix} \tfrac{2}{3}Q & \tfrac{1}{3}Q \\ \tfrac{1}{3}Q & \tfrac{2}{3}Q \end{bmatrix}
Hence, the variance matrix of the minimally constrained least-squares solution, keeping
point 1 fixed, reads
Q_{\hat{x}_{(1)}} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \tfrac{2}{3}Q & \tfrac{1}{3}Q \\ 0 & \tfrac{1}{3}Q & \tfrac{2}{3}Q \end{bmatrix} \qquad (4.33)
This matrix describes the precision of the coordinates of the free GPS network, when the
minimal constraints correspond to a fixing of point 1.
Point 3 as fixed (datum) point: Instead of choosing point 1 as fixed point, one may
of course also choose another point, say point 3. In that case the variance matrix of the
minimally constrained least-squares solution, reads
Q_{\hat{x}_{(3)}} = \begin{bmatrix} \tfrac{2}{3}Q & \tfrac{1}{3}Q & 0 \\ \tfrac{1}{3}Q & \tfrac{2}{3}Q & 0 \\ 0 & 0 & 0 \end{bmatrix} \qquad (4.34)
This matrix describes the precision of the coordinates of the free GPS network, when the
minimal constraints correspond to a fixing of point 3. Note that the two variance matrices
of (4.33) and (4.34) differ greatly. It is important to recognize however, that these two
variance matrices contain identical information. One can transform the one into the other
by means of an S-transformation. The transformation that transforms any arbitrary least-squares solution to x̂_{(1)} reads
S_{(1)} = I_9 - G\left(B_{(1)}^T G\right)^{-1} B_{(1)}^T = \begin{bmatrix} 0 & 0 & 0 \\ -I_3 & I_3 & 0 \\ -I_3 & 0 & I_3 \end{bmatrix} \qquad (4.35)
Hence, the variance matrix Q_{x̂(1)} can be obtained from the variance matrix Q_{x̂(3)} by means of the transformation (verify yourself) Q_{x̂(1)} = S_{(1)}\, Q_{x̂(3)}\, S_{(1)}^T.
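This relation can be verified numerically. In the sketch below, Q is an assumed 3×3 baseline variance matrix; the rest follows (4.33), (4.34) and (4.35):

import numpy as np

I3 = np.eye(3)
Q = np.diag([1e-4, 1e-4, 4e-4])      # assumed baseline variance matrix Q

def blocks(M):
    """Expand a 3x3 block pattern (entries are multiples of Q) into a 9x9 matrix."""
    return np.block([[m * Q for m in row] for row in M])

# variance matrices (4.33) and (4.34)
Qx1 = blocks([[0, 0, 0], [0, 2/3, 1/3], [0, 1/3, 2/3]])
Qx3 = blocks([[2/3, 1/3, 0], [1/3, 2/3, 0], [0, 0, 0]])

# S-transformation (4.35) that maps any solution to the one with point 1 fixed
S1 = np.block([[0*I3, 0*I3, 0*I3],
               [-I3,   I3,  0*I3],
               [-I3,  0*I3,  I3]])

print(np.allclose(Qx1, S1 @ Qx3 @ S1.T))   # True: Q_x(1) = S(1) Q_x(3) S(1)^T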
Fixing the sum of the coordinates: Instead of fixing the three coordinates of one of the
points of the network, one may also decide to fix, over all points of the network, the sum of the x-coordinates, the sum of the y-coordinates and the sum of the z-coordinates. Also these
three constraints are admissible. The S-transformation that corresponds with this set of minimal constraints is given as

S_{(1+2+3)} = I_9 - G\left(B_{(1+2+3)}^T G\right)^{-1} B_{(1+2+3)}^T = \begin{bmatrix} \tfrac{2}{3}I_3 & -\tfrac{1}{3}I_3 & -\tfrac{1}{3}I_3 \\ -\tfrac{1}{3}I_3 & \tfrac{2}{3}I_3 & -\tfrac{1}{3}I_3 \\ -\tfrac{1}{3}I_3 & -\tfrac{1}{3}I_3 & \tfrac{2}{3}I_3 \end{bmatrix} \qquad (4.36)
With this S-transformation, we can thus obtain the variance matrix of x̂_{(1+2+3)} from Q_{x̂(3)} as

Q_{\hat{x}_{(1+2+3)}} = S_{(1+2+3)}\, Q_{\hat{x}_{(3)}}\, S_{(1+2+3)}^T = \begin{bmatrix} \tfrac{2}{9}Q & -\tfrac{1}{9}Q & -\tfrac{1}{9}Q \\ -\tfrac{1}{9}Q & \tfrac{2}{9}Q & -\tfrac{1}{9}Q \\ -\tfrac{1}{9}Q & -\tfrac{1}{9}Q & \tfrac{2}{9}Q \end{bmatrix} \qquad (4.37)
The entries of this variance matrix again differ greatly from the entries of Q_{x̂(1)} and Q_{x̂(3)}. The three minimally constrained solutions x̂_{(1)}, x̂_{(3)} and x̂_{(1+2+3)} are however completely equivalent. All three produce the same least-squares solution for the measurements. This thus also holds for the variance matrix Q_ŷ. Verify yourself that indeed Q_ŷ = A Q_{x̂(1)} A^T = A Q_{x̂(3)} A^T = A Q_{x̂(1+2+3)} A^T.
Evaluation of free network precision: When evaluating the precision of a free network,
one has to make sure that the evaluation is not affected by the choice of minimal con-
straints. These constraints do not contain information which is essential for the precision-
evaluation. They are merely a tool to be able to compute coordinates. Thus when the precision evaluation is based on, say Q_{x̂(1)}, one should use a procedure which gives results that are identical to the results that one would obtain when the precision evaluation is based on, say Q_{x̂(3)}. Hence, when one wants to compare Q_{x̂(1)} with a criterion matrix, one has to make sure that the criterion matrix is defined with respect to the same set of minimal constraints. This can be accomplished by transforming the criterion matrix with the appropriate S-transformation. Thus when C_x denotes the criterion matrix, one should compare Q_{x̂(1)} with S_{(1)} C_x S_{(1)}^T, and Q_{x̂(3)} with S_{(3)} C_x S_{(3)}^T. Only then will the two evaluations give identical results.
In case one decides to base the evaluation on the generalized eigenvalue problem, the appropriate formulation is thus

\left| Q_{\hat{x}_{(1)}} - \lambda\, S_{(1)} C_x S_{(1)}^T \right| = 0 \qquad (4.38)

and not | Q_{x̂(1)} − λ C_x | = 0. And if it is Q_{x̂(3)} that needs to be evaluated, the appropriate formulation is

\left| Q_{\hat{x}_{(3)}} - \lambda\, S_{(3)} C_x S_{(3)}^T \right| = 0 \qquad (4.39)

and not | Q_{x̂(3)} − λ C_x | = 0. Both eigenvalue problems (4.38) and (4.39) will give identical results for the eigenvalues.
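As an illustration (not from the book), the sketch below evaluates (4.38) with scipy for the GPS example above, using an assumed simple diagonal criterion matrix. Because both matrices are singular in the rows and columns of the fixed point 1, only the sub-blocks of points 2 and 3 are used:

import numpy as np
from scipy.linalg import eigh

I3 = np.eye(3)
Q = 1e-4 * np.eye(3)                         # assumed baseline variance matrix

# Q_x(1) of (4.33), minimal constraints: point 1 fixed
Qx1 = np.block([[0*I3, 0*I3, 0*I3],
                [0*I3, 2/3*Q, 1/3*Q],
                [0*I3, 1/3*Q, 2/3*Q]])

Cx = 2e-4 * np.eye(9)                        # assumed criterion matrix for the coordinates
S1 = np.block([[0*I3, 0*I3, 0*I3],
               [-I3,   I3,  0*I3],
               [-I3,  0*I3,  I3]])
Cx1 = S1 @ Cx @ S1.T                         # criterion matrix in the same S-system

# generalized eigenvalues of the sub-blocks of points 2 and 3 only
idx = np.arange(3, 9)
lam = eigh(Qx1[np.ix_(idx, idx)], Cx1[np.ix_(idx, idx)], eigvals_only=True)
print(lam.max())     # precision meets the criterion if the largest eigenvalue is <= 1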
Thus the network consists of two levelling loops, loop 1 − 2 − 3 and loop 2 − 3 − 4. The
observables are assumed to be uncorrelated, all having the same variance σ 2 . We know
that we need to introduce one minimal constraint in order to eliminate the rank deficiency.
As minimal constraint we choose to fix the height of the first point:
h1 = 0
The model of observation equations, with the minimal constraint included, reads then
E\left\{\begin{bmatrix} h_{12} \\ h_{23} \\ h_{31} \\ h_{34} \\ h_{42} \end{bmatrix}\right\} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & -1 & 1 \\ 1 & 0 & -1 \end{bmatrix}\begin{bmatrix} h_2^1 \\ h_3^1 \\ h_4^1 \end{bmatrix}, \qquad Q_y = \sigma^2 I_5 \qquad (4.40)
Note that due to the elimination of h1 , the design matrix is indeed of full rank. The
remaining heights are given the upper index 1 to show that they are defined with respect
to the fixing of the height of the first point. There are 5 observations and 3 unknowns.
The redundancy is therefore equal to 2. The two condition equations can be identified as
E{h12 + h23 + h31 } = 0 and E{h23 + h34 + h42 } = 0.
Detection: In order to test the above model, we will start with the overall model test.
The general expression for the corresponding test statistic reads
T = \hat{e}^T Q_y^{-1}\,\hat{e}

Under H_o it has a central Chi-square distribution with the degrees of freedom equal to the redundancy of the model, here 2.
Identification: When the detection step leads to a rejection of H_o, data snooping is applied: each individual observation is screened by means of its w-test statistic. These test statistics have a standard normal distribution under the null hypothesis. When the w_i-test
statistic is worked out for the above model, we get (verify yourself)
w_1 = \frac{1}{2\sigma\sqrt{6}}\,(3h_{12} + 2h_{23} + 3h_{31} - h_{34} - h_{42})

w_2 = \frac{1}{\sigma\sqrt{8}}\,(h_{12} + 2h_{23} + h_{31} + h_{34} + h_{42})

w_3 = w_1 \qquad (4.42)

w_4 = \frac{1}{2\sigma\sqrt{6}}\,(-h_{12} + 2h_{23} - h_{31} + 3h_{34} + 3h_{42})

w_5 = w_4
Note that some of the test statistics are identical. This is understandable if one considers
the geometry of the levelling network. A blunder in h12 can not be discriminated from a
blunder in h31 (w3 = w1 ) . Also, a blunder in h34 can not be discriminated from a blunder
in h42 (w5 = w4 ).
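The w-test statistics (4.42), and the loop test (4.43) discussed next, can be reproduced numerically. The sketch below uses the design matrix of (4.40) with assumed sample values for the observations:

import numpy as np

sigma = 0.002                                  # assumed standard deviation of the height differences
A = np.array([[ 1,  0,  0],                    # design matrix of (4.40)
              [-1,  1,  0],
              [ 0, -1,  0],
              [ 0, -1,  1],
              [ 1,  0, -1]], dtype=float)
y = np.array([1.004, 0.497, -1.500, 0.702, -1.199])   # assumed observed h12, h23, h31, h34, h42

Qy = sigma**2 * np.eye(5)
W = np.linalg.inv(Qy)
Qx = np.linalg.inv(A.T @ W @ A)
e = y - A @ Qx @ A.T @ W @ y                   # least-squares residuals
Qe = Qy - A @ Qx @ A.T                         # their variance matrix

def w_test(c):
    return (c @ W @ e) / np.sqrt(c @ W @ Qe @ W @ c)

# data snooping: one w-test statistic per observation
w = np.array([w_test(ci) for ci in np.eye(5)])
print(w)                                       # note w[2] == w[0] and w[4] == w[3]

# constant shift in the first levelling loop, c = (1,1,1,0,0)^T, cf. (4.43)
print(w_test(np.array([1.0, 1.0, 1.0, 0.0, 0.0])),
      (y[0] + y[1] + y[2]) / (np.sqrt(3) * sigma))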
If in addition to potential blunders in the data, one also suspects that, say all three
observed height differences of the levelling loop 1 − 2 − 3 are erroneous by a constant
amount, then also an additional identification test needs to be performed. In this case one
has to make use of the general expression for the w-test statistic. It reads
w = \frac{c^T Q_y^{-1}\hat{e}}{\sqrt{c^T Q_y^{-1} Q_{\hat{e}} Q_y^{-1} c}}
The appropriate c-vector for testing whether a constant shift in the first three observations
occurred or not, reads
c = (1, 1, 1, 0, 0)T
When the expression of the w-test statistic is worked out for the above model, we get
(verify yourself)
w = \frac{h_{12} + h_{23} + h_{31}}{\sigma\sqrt{3}} \qquad (4.43)
Hence, the appropriate test statistic equals the misclosure of the levelling loop 1 − 2 − 3,
divided by its standard deviation.
Adaptation: In case data snooping led to the conclusion that, say, the second obser-
vation was erroneous, one can decide to remeasure this observation and after remeasure-
ment, again apply the whole testing procedure. For the case that all first three observations
are off by a constant amount, one can of course also opt for remeasurement. But instead,
one may also decide to adapt the model. In that case the new model becomes
E\left\{\begin{bmatrix} h_{12} \\ h_{23} \\ h_{31} \\ h_{34} \\ h_{42} \end{bmatrix}\right\} = \begin{bmatrix} 1 & 0 & 0 & 1 \\ -1 & 1 & 0 & 1 \\ 0 & -1 & 0 & 1 \\ 0 & -1 & 1 & 0 \\ 1 & 0 & -1 & 0 \end{bmatrix}\begin{bmatrix} h_2^1 \\ h_3^1 \\ h_4^1 \\ \nabla \end{bmatrix}, \qquad Q_y = \sigma^2 I_5 \qquad (4.44)
One should be aware however, that this change also results in a change for precision and
reliability. Both will become poorer and it depends on the particular application at hand
whether one is willing to accept this or not.
The coordinates in the coordinate system of the free network will be denoted as p, and the coordinates in the coordinate system of the control network as q,

p = (\ldots, x_i, y_i, \ldots)^T, \qquad q = (\ldots, u_i, v_i, \ldots)^T
The free network and the control network will usually overlap. Hence, three types of points
can be discriminated. The points that are part of the free network, but not of the control
network. The points that are part of both the free network and the control network. And
the points that are part of the control network, but not of the free network. The two sets
of coordinates of the free network will be denoted as
p_1 = (x_1, y_1, \ldots, x_{n_1}, y_{n_1})^T, \qquad p_2 = (x_{n_1+1}, y_{n_1+1}, \ldots, x_{n_1+n_2}, y_{n_1+n_2})^T
where p1 contains the coordinates of the n1 points of the free network that are not part of
the overlap, and where p2 contains the coordinates of the n2 points of the free network
that are part of the overlap. Thus the total number of points of the free network is assumed
to be equal to n1 + n2 . As a result of the free network adjustment, we thus have available
E\left\{\begin{bmatrix} p_1 \\ p_2 \end{bmatrix}\right\} = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix}, \qquad D\left\{\begin{bmatrix} p_1 \\ p_2 \end{bmatrix}\right\} = \begin{bmatrix} Q_{p_1} & Q_{p_1 p_2} \\ Q_{p_2 p_1} & Q_{p_2} \end{bmatrix} \qquad (4.45)
The two sets of coordinates of the control network will similarly be denoted as q_2 and q_3, where q_2 contains the coordinates of the n_2 points of the control network that are part of
the overlap, and where q3 contains the coordinates of the n3 points of the control network
that are not part of the overlap. Thus the total number of points of the control network is
assumed to be equal to n2 + n3 . Of the control network, we assume to have available
E\left\{\begin{bmatrix} q_2 \\ q_3 \end{bmatrix}\right\} = \begin{bmatrix} q_2 \\ q_3 \end{bmatrix}, \qquad D\left\{\begin{bmatrix} q_2 \\ q_3 \end{bmatrix}\right\} = \begin{bmatrix} Q_{q_2} & Q_{q_2 q_3} \\ Q_{q_3 q_2} & Q_{q_3} \end{bmatrix} \qquad (4.46)
In order to be able to combine (4.45) with (4.46), we still need to consider the coordi-
nate transformation between the two coordinate systems. Although the type of coordinate
transformation that is needed, depends on the particular application at hand, any coordi-
nate transformation will be of the general form p = F(q, t), where t denotes the vector of
transformation parameters. When applied to the two sets p1 and p2 , we thus have
p_1 = F_1(q_1, t), \qquad p_2 = F_2(q_2, t) \qquad (4.47)
With (4.45), (4.46) and (4.47), we are now in the position to formulate the model of
observation equations for the connection adjustment. It reads
E\left\{\begin{bmatrix} p_1 \\ p_2 \\ q_2 \\ q_3 \end{bmatrix}\right\} = \begin{bmatrix} F_1(q_1, t) \\ F_2(q_2, t) \\ q_2 \\ q_3 \end{bmatrix}, \qquad D\left\{\begin{bmatrix} p_1 \\ p_2 \\ q_2 \\ q_3 \end{bmatrix}\right\} = \begin{bmatrix} Q_{p_1} & Q_{p_1 p_2} & 0 & 0 \\ Q_{p_2 p_1} & Q_{p_2} & 0 & 0 \\ 0 & 0 & Q_{q_2} & Q_{q_2 q_3} \\ 0 & 0 & Q_{q_3 q_2} & Q_{q_3} \end{bmatrix} \qquad (4.48)
Note that the observation equations are linear in q3 , but possibly nonlinear in q1 , q2 and t.
Whether the observation equations are nonlinear or not, depends on the type of coordinate
transformation used. In case of nonlinear observation equations, we still need to apply a
linearization in order to obtain linear(ized) observation equations. Linearization of (4.48)
gives
E\left\{\begin{bmatrix} \Delta p_1 \\ \Delta p_2 \\ \Delta q_2 \\ \Delta q_3 \end{bmatrix}\right\} = \begin{bmatrix} T_1 & 0 & 0 & A_1 \\ 0 & T_2 & 0 & A_2 \\ 0 & I & 0 & 0 \\ 0 & 0 & I & 0 \end{bmatrix}\begin{bmatrix} \Delta q_1 \\ \Delta q_2 \\ \Delta q_3 \\ \Delta t \end{bmatrix} \qquad (4.49)
where
T_1 = \partial_{q_1} F_1(q_1^o, t^o), \quad A_1 = \partial_t F_1(q_1^o, t^o), \qquad T_2 = \partial_{q_2} F_2(q_2^o, t^o), \quad A_2 = \partial_t F_2(q_2^o, t^o) \qquad (4.50)
The structure of these four matrices of course also depends on the type of coordinate
transformation used. As an example, we will show what they look like when the three
dimensional similarity transformation is used.
The three dimensional similarity transformation between two sets of Cartesian coor-
dinates reads
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} = \lambda\, R(\alpha)R(\beta)R(\gamma)\begin{bmatrix} u_i \\ v_i \\ w_i \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix} \qquad (4.51)
with the scale λ , the translation vector (tx , ty, tz)T and the three rotation matrices
R(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{bmatrix}, \quad R(\beta) = \begin{bmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{bmatrix}, \quad R(\gamma) = \begin{bmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}
The matrices A_1 and A_2 then take the form

A_1 = \begin{bmatrix} \lambda^o R^o W_1^o & I_3 \\ \vdots & \vdots \\ \lambda^o R^o W_{n_1}^o & I_3 \end{bmatrix}_{3n_1\times 7}, \qquad A_2 = \begin{bmatrix} \lambda^o R^o W_{n_1+1}^o & I_3 \\ \vdots & \vdots \\ \lambda^o R^o W_{n_1+n_2}^o & I_3 \end{bmatrix}_{3n_2\times 7} \qquad (4.52)

where R^o = R(\alpha^o)R(\beta^o)R(\gamma^o) and where the 3 × 4 matrix W_i^o collects the partial derivatives of the similarity transformation with respect to the scale and the three rotation angles, evaluated at the approximate transformation parameters and the approximate coordinates (u_i^o, v_i^o, w_i^o) of point i.
We conclude this section with some remarks.
Remark 1: In the above linearization, no particular assumptions were made about the
values of the approximate transformation parameters. In some applications it may happen
however, that the two coordinate systems differ only slightly in scale and orientation. In
that case one can choose as approximate values λ o = 1, α o = β o = γ o = 0. Note that this
results in a considerable simplification of the above matrices. We then have λ o Ro = I3
and
W_i^o = \begin{bmatrix} u_i^o & 0 & -w_i^o & v_i^o \\ v_i^o & w_i^o & 0 & -u_i^o \\ w_i^o & -v_i^o & u_i^o & 0 \end{bmatrix}
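A small sketch constructing this simplified W_i^o and the corresponding 3 × 7 design-matrix block [W_i^o, I_3] of (4.52), for assumed approximate coordinates of a point; the parameter ordering (Δλ, Δα, Δβ, Δγ, Δt_x, Δt_y, Δt_z) is an assumption of the sketch:

import numpy as np

def W_simplified(u, v, w):
    """W_i^o of Remark 1 (approximate values lambda = 1, alpha = beta = gamma = 0)."""
    return np.array([[u, 0.0, -w,  v],
                     [v,  w, 0.0, -u],
                     [w, -v,   u, 0.0]])

def design_block(u, v, w):
    """3 x 7 block of the design matrix for one point: [W_i^o, I3], cf. (4.52)."""
    return np.hstack([W_simplified(u, v, w), np.eye(3)])

print(design_block(3821.0, 604.0, 5100.0))   # assumed approximate coordinates of a point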
Remark 2: When the similarity transformation is used in the connection model (4.48)
with all its transformation parameters included, one should recognize that the scale, the
orientation and the location of the free network is abandoned in favour of the scale, the
orientation and the location of the control network. Thus only the shape of the free net-
work will then contribute to the determination of the geometry of the connected network.
Remark 3: Above it was assumed that all transformation parameters of the similarity
transformation are unknown. In some applications it may happen however, that all or
some of these transformation parameters are known. For instance, it may happen that
one knows (from an earlier adjustment) the relative orientation and scale between the
WGS84 coordinate system and the National coordinate system. This would imply that
if a free GPS network is expressed in the WGS84 system and the coordinates of the
control network in the National coordinate system, that the scale and three orientation
parameters are not needed as unknowns in the model of observation equations. If they are
known with a sufficient precision, they could be treated as constants. The only remaining
transformation parameters would then be the three translation parameters.
Remark 4: Above it was assumed that the coordinates of the control network are of the
Cartesian type. This need not be the case of course. The model is easily adapted, however, for the case that one uses coordinates other than Cartesian coordinates. One only needs to substitute
in the above nonlinear model of observation equations, the relation that exists between
Cartesian coordinates and the type of coordinates that one needs to use.
Remark 5: In the model (4.48), it was assumed that no correlation exists between the
coordinates of the free network and the coordinates of the control network. For most
practical applications this assumption is realistic, since the measurement processes that
produced the coordinates of the two networks can usually be assumed to be independent.
Remark 6: In the model (4.48) it was also assumed that the variance matrix of the co-
ordinates of the control network, Qq , is available. In practice this may not be the case
however. Fortunately, for the constrained connection adjustment it is not needed. In that
case the adjustment is performed with Qq = 0. For the statistical testing of the model
(4.48), it is needed however. In that case one will have to work with a substitute ’vari-
ance matrix’, that then hopefully will give a sufficiently good approximation to the actual
variance matrix Qq .
Remark 7: In the connection model (4.48) we included the coordinates of the nonover-
lapping points of the control network, q3 . In the remaining part of these lecture notes,
they will be disregarded however. These coordinates do not contribute to the redundancy
of the model and therefore also not to the results of statistical testing, and they remain
unchanged when a constrained connection adjustment with Qq = 0 is applied.
The redundancy of this model equals (2n2 − nt ) in two dimensions and (3n2 − nt ) in
three dimensions, where nt is the dimension of the vector of transformation parameters.
98 Network Quality Control
Thus with the full similarity transformation in two dimensions, we have a redundancy of
(2n2 − 4). This shows that in the overlap of the two networks, a minimum of two points
is needed.
We can reduce the number of unknown parameters in this model, if we eliminate Δq2
by making use of the coordinate differences \Delta d = \Delta p_2 - T_2\,\Delta q_2. This gives

H_o: \; E\{\Delta d\} = A_2\,\Delta t, \qquad D\{\Delta d\} = Q_{p_2} + T_2 Q_{q_2} T_2^T \qquad (4.54)

This model will be our null hypothesis H_o. As alternative hypothesis H_a, we will consider the model

H_a: \; E\{\Delta d\} = \begin{bmatrix} A_2 & C \end{bmatrix}\begin{bmatrix} \Delta t \\ \nabla \end{bmatrix}, \qquad D\{\Delta d\} = Q_{p_2} + T_2 Q_{q_2} T_2^T \qquad (4.55)
where ∇ is an r × 1 vector of assumed model errors and C is a matrix that specifies how
the model errors are related to the observations. From our chapter on statistical testing, we
know that the above null hypothesis can be tested against the above alternative hypothesis,
using the test statistic
T_r = \hat{e}_d^T Q_d^{-1} C\left(C^T Q_d^{-1} Q_{\hat{e}_d} Q_d^{-1} C\right)^{-1} C^T Q_d^{-1}\,\hat{e}_d \qquad (4.56)

where ê_d is the least-squares residual vector of Δd and Q_{ê_d} its variance matrix. Assuming normally distributed observables, one will decide to reject H_o on the basis of H_a when

T_r > \chi_\alpha^2(r, 0)

with \chi_\alpha^2(r, 0) the critical value of the central Chi-square distribution, having r degrees of freedom.
Usually one will start with the most relaxed alternative hypothesis. This is the hypoth-
esis for which the matrix (A2 ,C) is square and regular. The corresponding test statistic
reads
T = \hat{e}_d^T Q_d^{-1}\,\hat{e}_d
Under the null hypothesis Ho , it has a central Chi-square distribution with the degrees of
freedom being equal to the redundancy of the model under Ho . The test corresponding to
the most relaxed alternative hypothesis, is usually referred to as the overall model test and
its purpose is to detect unspecified model errors.
When the overall model test leads to a rejection of the null hypothesis, there is a high
likelihood that the model under Ho has been specified incorrectly. Potential model errors
are: (1) the assumed relation between the two coordinate systems is wrong; (2) the shape
of the free network differs significantly from the shape of the control network, due to,
for instance, an error in one or more of the coordinates; (3) the assumptions about the
stochastic model are incorrect, due to, for instance, a too optimistically specified variance
matrix of the coordinates.
In the following we will restrict ourselves to model errors in the functional model.
They can be specified by means of the matrix C. We will now discuss some examples.
Errors in the coordinates: Since the coordinates form the observations in the connec-
tion model, it seems reasonable to first check the coordinates on possible errors. First
note that one will never be able to find errors in the coordinates of the points in the two
nonoverlapping parts of the two networks. These coordinates do not contribute to the
redundancy of the model and will therefore not be part of the test statistics. Fortunately,
these errors will also have no influence on the adjusted coordinates of the other points
after the connection.
Secondly note, that one will also not be able to discriminate whether an error has oc-
curred in the coordinates of the free network or in the coordinates of the control network.
This is due to the fact that the observations in the model (4.55) are formed from coordinate
differences.
In order to check for a potential error in the coordinates, the simplest alternative hy-
pothesis is the one for which it is assumed that only one coordinate of one of the points
is wrong. In this case, ∇ is a scalar (r = 1) and matrix C becomes a vector, which will be
denoted by the lowercase character c. Hence, instead of using (4.56) one can in this case
also use its square-root, being the w-test statistic
w = \frac{c^T Q_d^{-1}\hat{e}_d}{\sqrt{c^T Q_d^{-1} Q_{\hat{e}_d} Q_d^{-1} c}}
For the three dimensional case, the c-vector then takes one of the following three forms
cxi = (0 0 0 . . .1 0 0 . . .0 0 0)T
cyi = (0 0 0 . . .0 1 0 . . .0 0 0)T
czi = (0 0 0 . . .0 0 1 . . .0 0 0)T
where i refers to the point being tested. In this way one can systematically check all
coordinates of all overlapping points on potential errors.
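A sketch of this screening for an assumed three-point overlap with a translation-only transformation and illustrative variance matrices; the c-vector picks out the x-coordinate of the point being tested:

import numpy as np

n2 = 3                                       # assumed number of points in the overlap (3D)
A2 = np.tile(np.eye(3), (n2, 1))             # translation-only transformation: A2 = [I3; I3; I3]
Qd = (0.01**2 + 0.02**2) * np.eye(3 * n2)    # assumed Q_d = Q_p2 + T2 Qq2 T2^T
d = np.array([0.012, -0.004, 0.008,          # assumed coordinate differences per point
              0.055,  0.001, 0.006,
              0.009, -0.007, 0.011])

W = np.linalg.inv(Qd)
Qt = np.linalg.inv(A2.T @ W @ A2)
e = d - A2 @ Qt @ A2.T @ W @ d               # residuals of the connection adjustment
Qe = Qd - A2 @ Qt @ A2.T

def w_test(c):
    return (c @ W @ e) / np.sqrt(c @ W @ Qe @ W @ c)

# screen the x-coordinate of every overlapping point
for i in range(n2):
    c = np.zeros(3 * n2); c[3 * i] = 1.0
    print(f"point {i+1}: w_x = {w_test(c):.2f}")

In this illustrative data set the second point carries a 5.5 cm offset in x, and its w-test statistic stands out accordingly.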
Point identification errors: An error in only one coordinate of a point may occur due to
typing errors. But in other cases of course, it is more likely, when an error occurs, that it
will not be confined to a single coordinate only, but instead will affect all coordinates of
a point. Such a type of error occurs when one erroneously believes that the free network
coordinates and the control network coordinates indeed refer to the same point. These
types of errors are called point identification errors.
In order to test for these types of errors, the C-matrix takes the form
C_i = \begin{bmatrix} 0 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 & 1 & \cdots & 0 & 0 & 0 \end{bmatrix}^T
Height or eccentricity error: In the case of GPS it regularly happens that an error is made in the measured GPS antenna height. The c-vector then takes the form

c_i = \left(0\ \ldots\ 0\ \; h_{x_i}\ h_{y_i}\ h_{z_i}\ \; 0\ \ldots\ 0\right)^T
where the vector (hxi , hyi , hzi )T specifies the direction of the potential antenna height offset
of point i in the approximate coordinate system. In case of an eccentricity error, the C-
matrix has two columns,
C_i = \begin{bmatrix} 0 & \cdots & 0 & e_{x_i} & e_{y_i} & e_{z_i} & 0 & \cdots & 0 \\ 0 & \cdots & 0 & f_{x_i} & f_{y_i} & f_{z_i} & 0 & \cdots & 0 \end{bmatrix}^T
where the two vectors (exi , eyi , ezi )T and ( f xi , f yi , f zi )T span the plane in which the eccen-
tricity error of point i is supposed to have taken place.
Thus only the translation vector is included in the coordinate transformation. Now assume
that the null hypothesis gets rejected by the overall model test and that one suspects that
the two coordinate systems, apart from differing in their location, differ also slightly in
scale. In that case one will have to oppose the above null hypothesis to the following
alternative hypothesis
H_a: \; E\left\{\begin{bmatrix} \vdots \\ \Delta x_i - \Delta u_i \\ \Delta y_i - \Delta v_i \\ \Delta z_i - \Delta w_i \\ \vdots \end{bmatrix}\right\} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & 0 & u_i^o \\ 0 & 1 & 0 & v_i^o \\ 0 & 0 & 1 & w_i^o \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}\begin{bmatrix} \Delta t_x \\ \Delta t_y \\ \Delta t_z \\ \Delta\ln\lambda \end{bmatrix}
The model is thus enlarged with one additional column, being the column which models
the scale difference between the two coordinate systems. This column vector is thus the appropriate choice for the c-vector,

c = (\ldots,\ u_i^o,\ v_i^o,\ w_i^o,\ \ldots)^T

Note that λ^o has been taken equal to one, since the difference in scale between the two
coordinate systems, although potentially significant, was still assumed to be small.
In the previous section the validation of the connection model was considered. In the
present section we will consider the estimation problem. For validation, we could restrict
our attention to the coordinates of the overlapping points. For estimation however, we
need to take the coordinates of the nonoverlapping points of the free network into account
as well. After all, the purpose of the connection is to obtain the coordinates of these newly
established points in the coordinate system of the control network. Thus instead of the
model (4.54), we will now work with the enlarged model
E\left\{\begin{bmatrix} \Delta p_1 \\ \Delta d \end{bmatrix}\right\} = \begin{bmatrix} T_1 & A_1 \\ 0 & A_2 \end{bmatrix}\begin{bmatrix} \Delta q_1 \\ \Delta t \end{bmatrix}, \qquad D\left\{\begin{bmatrix} \Delta p_1 \\ \Delta d \end{bmatrix}\right\} = \begin{bmatrix} Q_{p_1} & Q_{p_1 p_2} \\ Q_{p_2 p_1} & Q_d \end{bmatrix} \qquad (4.57)
Remember that the coordinate differences of the overlapping points are contained in Δd =
Δp2 − T2 Δq2 , which has as variance matrix, Qd = Q p2 + T2 Qq2 T2T .
The above model can be solved using the standard least-squares algorithm. As a
result one would obtain the unconstrained least-squares solution. In practice however,
circumstances often dictate that the coordinates of the control network remain unchanged.
This can be accomplished by applying the standard least-squares algorithm to the above
model, with the additional constraints that
Qq2 := 0 and Δq2 := 0 (4.58)
Setting Qq2 to zero, implies that no least-squares correction is given to Δq2 . This together
with the setting of Δq2 to zero, implies that the coordinates of the control points in the
overlap remain fixed to their original values (note: the original values are used as approx-
imate values in the linearization). In order to discriminate between the unconstrained and
the constrained least-squares solution, we will use the superscript c for the constrained
case. Thus Δd and Δd c have the same variance matrix, namely Qd , but their sample
values differ. They are given respectively as
Δd = Δp2 − T2 Δq2 and Δd c = Δp2
Instead of solving the above model in one step, we will solve it in three steps. This
will help us in getting a somewhat better insight in the various aspects of the connection
adjustment.
The first step: We already remarked that the coordinates of the nonoverlapping points
of the free network do not contribute to the redundancy of the above model. Likewise,
these coordinates also do not contribute to the determination of the transformation param-
eters. Only the coordinates of the overlapping points contribute to the determination of
the transformation parameters. We therefore start by first considering
E{Δd} = A2 Δt , D{Δd} = Qd (4.59)
From it the unconstrained least-squares estimator of the vector of transformation param-
eters follows as
\Delta\hat{t} = (A_2^T Q_d^{-1} A_2)^{-1} A_2^T Q_d^{-1}\,\Delta d, \qquad Q_{\hat{t}} = (A_2^T Q_d^{-1} A_2)^{-1} \qquad (4.60)
Note that in the application of the error propagation law, the randomness of Δq2 has been
taken into account. After all, although we constrain the solution to the coordinates of
the control network, these coordinates are still of a stochastic nature. We know that a
least-squares solution that takes the stochasticity of all variables into account will give
estimators that are of minimal variance. In the above constrained least-squares solution
this is not the case. Thus

Q_{\hat{t}^{\,c}} \ge Q_{\hat{t}}

showing that the transformation parameters of the unconstrained solution are of a better
precision than the ones of the constrained solution. This is thus the price one pays, when
one is forced to keep the coordinates of the control network unchanged.
The unconstrained solution for the coordinate differences and their least-squares resid-
uals are given as
\Delta\hat{d} = A_2\,\Delta\hat{t}, \qquad Q_{\hat{d}} = A_2 Q_{\hat{t}} A_2^T

\hat{e}_d = \Delta d - \Delta\hat{d}, \qquad Q_{\hat{e}_d} = Q_d - A_2 Q_{\hat{t}} A_2^T \qquad (4.62)
As we have seen in the previous section, these least-squares residuals form the basis for
the statistical testing of the validity of the connection model.
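The first step is summarized in the following sketch (illustrative numbers only), which estimates a translation-plus-scale transformation from the coordinate differences of the overlapping points and forms the residuals of (4.62):

import numpy as np

# assumed approximate coordinates of the overlapping points in the control system
q2 = np.array([[0.0, 0.0, 0.0],
               [1000.0, 0.0, 20.0],
               [0.0, 800.0, -15.0],
               [900.0, 750.0, 5.0]])
n2 = q2.shape[0]

# A2 for a translation-plus-scale transformation: columns (dtx, dty, dtz, dln_lambda)
A2 = np.hstack([np.tile(np.eye(3), (n2, 1)), q2.reshape(-1, 1)])
Qd = 1e-4 * np.eye(3 * n2)                     # assumed variance matrix Q_d
d = np.array([0.010, 0.002, -0.003,            # assumed coordinate differences Delta d
              0.014, 0.001, -0.002,
              0.009, 0.006, -0.004,
              0.013, 0.005, -0.001])

W = np.linalg.inv(Qd)
Qt = np.linalg.inv(A2.T @ W @ A2)              # Q_that of (4.60)
t_hat = Qt @ A2.T @ W @ d                      # unconstrained transformation parameters (4.60)
e_d = d - A2 @ t_hat                           # residuals (4.62)

print("t_hat:", t_hat)
print("standard deviations:", np.sqrt(np.diag(Qt)))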
The second step: In this second step we will determine the solution for the coordinates
of the free network in the nonoverlapping part. We know that Δp1 does not contribute to
the redundancy of the model. If in addition, Δp1 would not correlate with Δp2 and thus
not with Δd, then Δ\hat{p}_1 = Δp_1 would hold true. That is, in that case the coordinates of the
free network in the nonoverlapping part would not change. But since Q_{p_1 p_2} \neq 0,
it follows that these coordinates do change. In fact, their residual vector can be computed
from ê_d as

\hat{e}_{p_1} = Q_{p_1 p_2} Q_d^{-1}\,\hat{e}_d
Of course if the connection model fails to have any redundancy at all, then êd = 0 and
Δ p̂1 = Δp1 . This would for instance happen when in the two dimensional case, the con-
nection model is based on the full similarity transformation and the number of points in
the overlap equals only two.
The constrained least-squares solution reads
The third step: After the unconstrained adjustment of the connection model (4.57), we
have
Since matrix T1 is invertible, it follows after substitution of (4.60) and (4.63), that
\Delta\hat{q}_1 = T_1^{-1}\left(\begin{bmatrix} I & -Q_{p_1 p_2} Q_d^{-1} \end{bmatrix}\left(\begin{bmatrix} \Delta p_1 \\ \Delta d \end{bmatrix} - \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}\Delta\hat{t}\right)\right) \qquad (4.65)
This is the final unconstrained least-squares solution of the coordinates of the nonoverlap-
ping points of the free network, expressed in the coordinate system of the control network.
This is thus the solution one gets for the coordinates of the nonoverlapping points, when
the model (4.57) is solved using the standard least-squares algorithm.
In case of a constrained adjustment of the connection model (4.57) we have
In this case the transformation A_1\Delta\hat{t}^{\,c} is directly applied to \Delta p_1, without taking the vector of least-squares residuals \hat{e}_d^{\,c} = \Delta d^{\,c} - \Delta\hat{d}^{\,c} into account.
No redundancy: The above simplified expression is also obtained in case there is no
redundancy in the connection model. An absence of redundancy implies that the points
in the overlap have as many coordinates as there are transformation parameters. Hence,
there is no need for an adjustment to determine the transformation parameters. They are
uniquely determined from the coordinates of the points in the overlap and the least-squares
residual vector êcd is identically zero.
No coordinate transformation: In case the two coordinate systems of the free network
and the control network coincide, the transformation parameters will be absent from the
model and T1 will be equal to the identity matrix. In that case the above constrained
solution simplifies to
The correction to the coordinates of Δp1 is now only based on the difference between the
coordinates of the points in the overlap.
We already remarked that the precision of the constrained solution is less than that of the unconstrained solution, Q_{q̂_1^c} > Q_{q̂_1}. This need not be dramatic. It is simply a consequence of the fact that by constraining the solution, one is in fact using a less than optimal estimator. But what is important to recognize though, is that the constrained least-squares solution can be poorer than the solution one had before the connection was carried out.
This is easily shown for the above solution (4.67). Application of the error propagation
law to (4.67) gives
Q_{\hat{q}_1^c} = Q_{p_1} - Q_{p_1 p_2} Q_{p_2}^{-1}\left(Q_{p_2} - Q_{q_2}\right) Q_{p_2}^{-1} Q_{p_2 p_1}
But this means, if the coordinates of the control network are of a poorer precision than the coordinates of the free network (Q_{p_2} < Q_{q_2}), that the adjusted coordinates, when constraining is applied, will have a precision which is poorer than before the adjustment was carried out (Q_{q̂_1^c} > Q_{p_1}). This is the reason why in surveying one works from the
’large to the small’. That is, in the densification process, the free networks are connected
to ’higher order’ control networks, i.e. networks that are of a better precision than the free
networks.
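This effect is easy to verify numerically. The sketch below uses hypothetical one-dimensional (co)variances, so that the matrices reduce to scalars, and shows that Q_{q̂_1^c} indeed exceeds Q_{p_1} as soon as Q_{p_2} < Q_{q_2}.

# Scalar illustration of Qq1c = Qp1 - Qp1p2 Qp2^{-1} (Qp2 - Qq2) Qp2^{-1} Qp2p1
Qp1, Qp2, Qp1p2 = 4.0, 2.0, 1.5   # free-network (co)variances (assumed values)
Qq2 = 3.0                         # control-network variance, poorer than Qp2 (Qp2 < Qq2)

Qq1c = Qp1 - Qp1p2 * (1.0 / Qp2) * (Qp2 - Qq2) * (1.0 / Qp2) * Qp1p2
print(Qq1c, Qq1c > Qp1)           # 4.5625 > 4.0: the connected solution is poorer than Qp1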
The two overlapping levelling networks are thus assumed to differ in height only. The
redundancy of the model equals (n − 1). First we will give the least-squares estimators of
Hi and t, and then we will discuss the detection and identification step.
Least-squares estimators: After forming the system of normal equations of the above
model and solving it for the translation parameter, we get
t̂ = h̄ − H̄ ,    σ_{t̂}² = (1/n)(σ_h² + σ_H²)        (4.70)

where

h̄ = (1/n) ∑_{i=1}^n h_i ,    H̄ = (1/n) ∑_{i=1}^n H_i
are the average heights of the two networks. In this particular case, the least-squares
estimator of the translation parameter thus equals the difference of the height averages of
the two networks.
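As a numerical illustration of (4.70), the following sketch computes t̂ and its variance from two small sets of overlap heights; the height values and standard deviations are assumed, not taken from the text.

import numpy as np

h = np.array([10.02, 12.51, 9.48, 11.03])   # heights in the free network (assumed, m)
H = np.array([12.00, 14.50, 11.45, 13.02])  # heights of the same points in the control system (assumed, m)
sigma_h, sigma_H = 0.005, 0.010             # standard deviations (assumed, m)

n = len(h)
t_hat = h.mean() - H.mean()                 # t_hat = mean(h) - mean(H), eq. (4.70)
var_t_hat = (sigma_h**2 + sigma_H**2) / n   # variance of t_hat, eq. (4.70)
print(t_hat, var_t_hat)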
When we solve the system of normal equations for the height coordinate Hi of point
i, we get
Ĥ_i = H_i − [σ_H²/(σ_h² + σ_H²)] [(H_i − H̄) − (h_i − h̄)]
σ_{Ĥ_i}² = σ_H² [1 − (σ_H²/(σ_h² + σ_H²))(1 − 1/n)]        (4.71)

The corresponding least-squares residual of H_i reads

ê_{H_i} = [σ_H²/(σ_h² + σ_H²)] [(H_i − H̄) − (h_i − h̄)]
It is constructed from a difference of two differences. The two differences are the height
of point i with respect to the average height in the first height system and the height of
point i with respect to the average height in the second height system. Note that the
residual gets smaller when σ_H² gets smaller. This is understandable. The more precise the
H_i coordinate is, the more confidence one has in it and the less one is willing to give it a
large correction. In the limiting case σ_H² = 0, we have Ĥ_i = H_i. In that case no correction
is applied at all. If we consider the other limiting case, σ_h² = 0, then of course the h_i
coordinate would not get a correction, ĥ_i = h_i. In this case we also would have

Ĥ_i = h_i + H̄ − h̄ = h_i − t̂

thus showing that Ĥ_i is now obtained from simply applying the translation estimator to h_i.
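Continuing with the same assumed numbers, the sketch below evaluates (4.71): the adjusted heights Ĥ_i, their variance, and the residuals ê_{H_i}.

import numpy as np

h = np.array([10.02, 12.51, 9.48, 11.03])   # free-network heights (assumed, m)
H = np.array([12.00, 14.50, 11.45, 13.02])  # control heights (assumed, m)
sigma_h, sigma_H = 0.005, 0.010             # standard deviations (assumed, m)
n = len(h)

w = sigma_H**2 / (sigma_h**2 + sigma_H**2)          # weight factor in (4.71)
diff = (H - H.mean()) - (h - h.mean())              # difference of the two differences
H_hat = H - w * diff                                # adjusted heights of (4.71)
var_H_hat = sigma_H**2 * (1 - w * (1 - 1/n))        # variance of the adjusted heights
e_H = w * diff                                      # least-squares residuals
print(H_hat, var_H_hat, e_H)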
Detection: To be able to test the above model, the least-squares residuals should at least not
be identical to zero, which would happen when there is no redundancy. Since the redundancy
equals (n − 1), we thus need more than one point in the two overlapping networks.
To test the above model, we first consider its overall validity. The general form of the
appropriate test statistic for detection reads T_redundancy = ê^T Q_y^{-1} ê. When this expression
is worked out for the above model we get

T_{n−1} = [ ∑_{i=1}^n (H_i − h_i)² − n(H̄ − h̄)² ] / (σ_h² + σ_H²)        (4.72)
Note that the term n(H̄ − h̄)² would be absent in case the translation parameter t would
be absent from the model. In that case the test statistic would simply equal the sum of
the squared height differences divided by the sum of the two variances. The redundancy
would then also have increased by one to n.
With the above test statistic, the model is invalidated or rejected when

T_{n−1} > χ²_α(n − 1, 0)        (4.73)
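A minimal sketch of the detection step (4.72)-(4.73), again with assumed heights and standard deviations; the critical value of the central chi-square distribution is obtained with scipy.

import numpy as np
from scipy.stats import chi2

h = np.array([10.02, 12.51, 9.48, 11.03])    # free-network heights (assumed, m)
H = np.array([12.00, 14.50, 11.45, 13.02])   # control heights (assumed, m)
sigma_h, sigma_H, alpha = 0.005, 0.010, 0.05
n = len(h)

# Test statistic (4.72)
T = (np.sum((H - h)**2) - n * (H.mean() - h.mean())**2) / (sigma_h**2 + sigma_H**2)

# Rejection criterion (4.73): compare with the chi-square critical value, n-1 degrees of freedom
k = chi2.ppf(1 - alpha, df=n - 1)
print(T, k, "rejected" if T > k else "accepted")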
When rejection occurs, there are of course many types of misspecifications in the model
that could have caused it. The misspecifications could be in the functional and/or the
stochastic model. For instance, when the stochastic model is too optimistic, the variances
are too small and the value of T_{n−1} would be unrealistically large. As a result, one could
have an unjustified rejection of the model. This is the reason why the variance σ_H² is not
set to zero when testing the connection model. Thus the unconstrained connection is used
for testing, while the constrained connection, with σ_H² = 0, is used for the actual coordinate
computation.
Instead of having a too optimistic stochastic model, one could of course also have used
a too pessimistic stochastic model. In that case the variances are too large and Tn−1 would
be unrealistically small. If one suspects this to be the case, then the above one-sided test
should be replaced by its two-sided counterpart. Since the expectation of T n−1 /(n − 1)
equals one under the null hypothesis, one then tests whether this ratio is significantly
larger or smaller than one. We will however assume that the stochastic model is correct
and restrict our attention to misspecifications in the functional model.
Identification: If we assume that the rejection of the model could have been caused by
an error in the coordinate set of the H-system, the c-vector of the w-test statistic will have
the form
c = (0^T, c_H^T)^T   with   c_H = (c_{H_1}, …, c_{H_n})^T        (4.74)
Data snooping: If we want to screen the individual H_i coordinates for the presence of
blunders, the c_H-vector takes the form

c_H = (0, …, 0, 1, 0, …, 0)^T

with the 1 located at the i-th position.
The size of the blunder that can be found with a probability γ = 0.80 using this test statistic
with a level of significance α1 = 0.001, is given by the MDB,
|∇_i| = √[ 17.075(σ_h² + σ_H²) / (1 − 1/n) ]        (4.78)
This is the MDB which we already met earlier in the section on Reliability.
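The MDB (4.78) is a one-line computation. In the sketch below, the value 17.075 is the non-centrality parameter belonging to α1 = 0.001 and γ = 0.80, while the standard deviations and the number of overlap points are assumed values.

import math

sigma_h, sigma_H = 0.005, 0.010   # standard deviations (assumed, m)
n = 10                            # number of points in the overlap (assumed)
lam0 = 17.075                     # non-centrality parameter for alpha1 = 0.001, gamma = 0.80

# MDB of a blunder in a single H_i coordinate, eq. (4.78)
mdb = math.sqrt(lam0 * (sigma_h**2 + sigma_H**2) / (1 - 1/n))
print(mdb)                        # minimal detectable bias in metres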
Testing for scale: If one suspects that the model was rejected due to the erroneous
assumption that scale is absent, the appropriate c-vector for testing for the presence of scale is
The size of the scale factor that can then be found with probability γ = 0.80, is given by
the MDB

|∇_λ| = √[ 17.075(σ_h² + σ_H²) / ∑_{i=1}^n (H_i^0 − H̄^0)² ]        (4.79)

This shows that the presence of scale becomes better identifiable when ∑_{i=1}^n (H_i^0 − H̄^0)²
is large, that is, when the network has large height differences with respect to the average
height.
4.5 Summary
As a conclusion to this chapter, we have summarized the main steps involved in four
flow diagrams. In figure 4.2, the adjustment and testing steps are shown for both the free
network and the connected network (note: the constrained connection is also referred
to as a pseudo least-squares connection).

Figure 4.2: flow diagram of the free network phase (formulation of the observation model, free network adjustment and testing of the observations) followed by the connection phase (least-squares connection adjustment and testing with the known control points, and the pseudo least-squares connection yielding the resulting coordinates).

The model used for the connection is usually
based on a coordinate transformation. After all, the coordinates of the free network may
not be defined in the same coordinate system as those of the control network. In case
of connecting a GPS-based free network to existing control, one is dealing with WGS84
coordinates and with the coordinates of the National Reference System (NatRS). The
model for the connection can be formulated in two ways. Either one formulates it so that
the results are expressed in the WGS84 coordinate system, or one formulates it so that the
results are expressed in the NatRS coordinate system. The first formulation is shown in
figure 4.3 and the second formulation is shown in figure 4.4. With the first formulation
one of course will need an additional (back) transformation step to get the final results
expressed in the NatRS coordinate system.
The various steps involved in the transformation between the two systems are shown
in figure 4.5.
Figure 4.3: flow diagram of the first formulation (results in the WGS84 system): known control points in the local system, pseudo least-squares adjustment, coordinates in WGS84, transformation WGS84 to local, coordinates in the local system.

Figure 4.4: flow diagram of the second formulation (results in the local system): GPS baselines in WGS84, transformation WGS84 to local, GPS baselines in the local system, pseudo least-squares adjustment, coordinates in the local system.

Figure 4.5: flow diagram of the transformation steps: GPS baseline network, transformation to ellipsoidal coordinates, ellipsoidal coordinates in the NatRS.
Appendix
In this appendix, we consider the first two moments of a random variable, the mean and
the variance. We also show how they propagate when functions of the random variable
are taken.
First we consider the mean of a scalar random variable and give the propagation law
of the mean when the function is either linear or nonlinear. In the latter case, use is made
of a linearization. Then we do the same for the variance of a scalar random variable. After
the scalar case has been treated, we generalize to the vectorial case.
Mean of a scalar random variable: Let x be a continuous scalar random variable with
probability density function px (x). The expectation or mean of x is by definition the
integral
E{x} = ∫_{−∞}^{+∞} x p_x(x) dx        (A.1)

Given a function f(x), we can form a new random variable y = f(x), whose mean is, by the same definition,

E{y} = ∫_{−∞}^{+∞} y p_y(y) dy        (A.2)

It appears therefore, that to determine the mean of y, we must first find its probability
density function p_y(y). This, however, is not necessary. As the next theorem shows, E{y}
can be expressed directly in terms of the function f(x) and the density p_x(x) of x.
Theorem:
E{f(x)} = ∫_{−∞}^{+∞} f(x) p_x(x) dx        (A.3)
Proof: We shall sketch a proof using the curve f (x) of figure A.1. With y = f (x1 ) =
f (x2 ) = f (x3 ) as in the figure, we see that
P(y ≤ y ≤ y + dy) = P(x_1 ≤ x ≤ x_1 + dx_1) + P(x_2 − dx_2 ≤ x ≤ x_2) + P(x_3 ≤ x ≤ x_3 + dx_3)

or

p_y(y) dy = p_x(x_1) dx_1 + p_x(x_2) dx_2 + p_x(x_3) dx_3

Figure A.1: the function y = f(x), with the points x_1, x_2, x_3 that are mapped into the interval (y, y + dy).

Multiplying by y, we obtain

y p_y(y) dy = f(x_1) p_x(x_1) dx_1 + f(x_2) p_x(x_2) dx_2 + f(x_3) p_x(x_3) dx_3
Thus, to each differential in (A.2) there corresponds one or more differentials in (A.3). As
dy covers the y-axis, the corresponding dx’s are nonoverlapping and they cover the entire
x-axis. Hence, the integrals in (A.2) and (A.3) are equal.
It appears from (A.3) that to determine the mean of y, we must know the probability
density function p_x(x). In general this is true. However, in the special case that the
function f(x) is linear, the complete density of x need not be known; only the mean
m_x of x is needed.
Theorem (propagation law of the mean): Given a scalar random variable x and a func-
tion f (x), we form the random variable y = f (x). If the function f (x) is linear,
f(x) = ax + b        (A.4)

then

m_y = a m_x + b        (A.5)

Theorem (linearized propagation law of the mean): Given a scalar random variable x and a
nonlinear function f(x), we form the random variable y = f(x). Let x_0 be an approximation
to a sample of x and define Δy = y − f(x_0) and Δx = x − x_0. Then we have to a first order

m_Δy ≐ (d/dx) f(x_0) m_Δx        (A.6)
Proof: Application of Taylor’s formula gives

y = f(x_0) + (d/dx) f(x_0)(x − x_0) + (1/2)(d²/dx²) f(x_0)(x − x_0)² + …

If we take the expectation we get

E{y} = f(x_0) + (d/dx) f(x_0) E{(x − x_0)} + (1/2)(d²/dx²) f(x_0) E{(x − x_0)²} + …

This may be written with

E{(x − x_0)} = m_x − x_0 = m_Δx

and

E{(x − x_0)²} = E{((x − m_x) − (x_0 − m_x))²}
             = E{(x − m_x)²} − 2E{(x − m_x)(x_0 − m_x)} + E{(x_0 − m_x)²}
             = E{(x − m_x)²} + m_Δx²
             = σ_x² + m_Δx²

as

m_Δy = (d/dx) f(x_0) m_Δx + (1/2)(d²/dx²) f(x_0) [σ_x² + m_Δx²] + …
If we neglect the second and higher order terms, the result (A.6) follows.
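The first-order result (A.6) can be checked by simulation. The sketch below compares the linearized m_Δy with a Monte Carlo estimate for an assumed nonlinear function and an assumed normal distribution of x.

import numpy as np

rng = np.random.default_rng(0)
f = np.sin                            # assumed nonlinear function f(x)
df = np.cos                           # its derivative d/dx f(x)
mx, sigma_x, x0 = 0.52, 0.02, 0.50    # assumed mean, standard deviation and approximate value

x = rng.normal(mx, sigma_x, 200_000)
m_dy_mc = np.mean(f(x) - f(x0))       # Monte Carlo estimate of E{y} - f(x0)
m_dy_lin = df(x0) * (mx - x0)         # linearized propagation law (A.6)
print(m_dy_mc, m_dy_lin)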
Examples
This number will also be denoted by σ_x². The positive constant σ_x is called the standard
deviation of x. From the definition it follows that

σ_x² = E{x²} − 2m_x E{x} + m_x²

or

σ_x² = E{x²} − m_x²
Theorem (propagation law of the variance): Given a scalar random variable x and a
function f (x), we form the random variable y = f (x). If the function f (x) is linear,
f(x) = ax + b        (A.9)

then

σ_y² = a² σ_x²        (A.10)

Proof:

σ_y² = ∫_{−∞}^{+∞} [(ax + b) − (a m_x + b)]² p_x(x) dx = a² ∫_{−∞}^{+∞} (x − m_x)² p_x(x) dx = a² σ_x²
The above result shows that if f (x) is linear, knowledge of σx2 is sufficient for computing
the variance of y = f (x). For nonlinear functions f (x) this is generally not true. If the
function f (x) is nonlinear one will generally need to know the complete density px (x)
of x. However, by using Taylor’s formula an approximation to the variance of y can be
derived that gets round the difficulty of having to know px (x).
Theorem (Linearized propagation law of the variance): Given a scalar random vari-
able x and a nonlinear function f (x), we form the random variable y = f (x). Let x0 be an
approximation to a sample of x. Then a first-order approximation to the variance of y is
σ_y² ≐ [ (d/dx) f(x_0) ]² σ_x²        (A.11)
Proof: Substitution of

f(x) = f(x_0) + (d/dx) f(x_0)(x − x_0) + …
m_y = f(x_0) + (d/dx) f(x_0)(m_x − x_0) + …

into

σ_y² = ∫_{−∞}^{+∞} ( f(x) − m_y )² p_x(x) dx

gives, after neglecting the second and higher order terms, the result (A.11).
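In the same way, the first-order variance (A.11) can be compared with a Monte Carlo value; the function and the numbers below are assumptions used purely for illustration.

import numpy as np

rng = np.random.default_rng(1)
f, df = np.sin, np.cos                 # assumed nonlinear function and its derivative
mx, sigma_x, x0 = 0.52, 0.02, 0.50     # assumed mean, standard deviation and approximate value

x = rng.normal(mx, sigma_x, 200_000)
var_mc = np.var(f(x))                  # Monte Carlo variance of y = f(x)
var_lin = (df(x0) ** 2) * sigma_x**2   # linearized propagation law (A.11)
print(var_mc, var_lin)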
Examples
This result seems to contradict definition (A.1). Note however that (A.14) reduces to
(A.1), since
∫_{−∞}^{+∞} p_x(x_1, …, x_j, …, x_n) dx_j = p_x(x_1, …, x_{j−1}, x_{j+1}, …, x_n)
Thus in order to compute E{xi } one only needs the marginal density pxi (xi ).
It appears, therefore, that to determine the mean of yi , we must first find its marginal
probability density function pyi (yi ). This, however, is not necessary. As the next theorem
shows, E{yi } can be expressed directly in terms of the vectorfunction F(x) and the joint
density of x.
Theorem:
E{F(x)} = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} F(x_1, …, x_n) p_x(x_1, …, x_n) dx_1 … dx_n        (A.17)

or, in compact vector notation,

E{F(x)} = ∫ F(x) p_x(x) dx        (A.18)
Proof: The proof is similar to the proof given in the earlier section of this appendix.
The following theorem is an extremely important one, and it is used frequently in
these lecture notes.
Theorem (propagation law of the mean): Given a random n-vector x and a vectorfunc-
tion F(x), F : Rn → Rm , we form the random m-vector y = F(x). If the vectorfunction
F(x) is linear,
F(x) = A x + b        (A.19)

with A an m × n matrix and b an m-vector, then

m_y = A m_x + b        (A.20)
Proof: With the columns a_i of A we may write Ax = ∑_{i=1}^n a_i x_i, so that

E{F(x)} = ∑_{i=1}^n a_i ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} x_i p_x(x_1, …, x_n) dx_1 … dx_n + b ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} p_x(x_1, …, x_n) dx_1 … dx_n

or

E{F(x)} = ∑_{i=1}^n a_i ∫_{−∞}^{+∞} x_i p_{x_i}(x_i) dx_i + b = ∑_{i=1}^n a_i m_{x_i} + b = A m_x + b
Without proof we also give the linearized version of the propagation law of the mean.
Theorem (linearized propagation law of the mean): Given a random n-vector x and a
nonlinear vectorfunction F(x), F : Rn → Rm we form the random m-vector y = F(x). Let
x0 ∈ Rn be an approximation to a sample of x and define Δy = y − F(x0 ) and Δx = x − x0 .
Then we have to a first-order
m_Δy ≐ ∂_x F(x_0) m_Δx        (A.21)

where ∂_x F(x_0) is the m × n matrix of partial derivatives of F, evaluated at x_0.
Examples
With the approximate values x01 , x02 and x03 we have to a first-order
E{ [ y_1 − sin(x_1^0 x_2^0) − x_3^0 ;  y_2 − (x_1^0)² − x_2^0 − 4 ] } ≐
    [ x_2^0 cos(x_1^0 x_2^0)   x_1^0 cos(x_1^0 x_2^0)   1 ;   2x_1^0   1   0 ] E{ [ x_1 − x_1^0 ;  x_2 − x_2^0 ;  x_3 − x_3^0 ] }
Variancematrix of a random vector: Let xi , i = 1, 2, . . ., n be n continuous scalar random
variables with joint probability density function px (x1 , . . ., xn ). The variancematrix of the
random n-vector (x1 , x2 , . . ., xn )T is by definition the integral
E{(x − E{x})(x − E{x})^T} = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} (x − E{x})(x − E{x})^T p_x(x_1, …, x_n) dx_1 … dx_n        (A.22)

in which x − E{x} is the column vector (x_1 − E{x_1}, …, x_n − E{x_n})^T. This variancematrix
will also be denoted by Q_x. Using vector notation we may write (A.22) in the compact form

E{(x − m_x)(x − m_x)^T} = ∫ (x − m_x)(x − m_x)^T p_x(x) dx        (A.23)
From (A.22) it follows that

E{(x_i − m_{x_i})(x_j − m_{x_j})} = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} (x_i − m_{x_i})(x_j − m_{x_j}) p_x(x_1, …, x_n) dx_1 … dx_n

Integration gives

E{(x_i − m_{x_i})(x_j − m_{x_j})} = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x_i − m_{x_i})(x_j − m_{x_j}) p_x(x_i, x_j) dx_i dx_j        (A.24)
in which p_x(x_i, x_j) is the joint density function of the two random variables x_i and
x_j. The scalar (A.24) is called the covariance of the two random variables x_i and x_j.
This covariance will also be denoted as σ_{x_i x_j}. Note that the off-diagonal elements of the
variancematrix Q_x consist of the covariances between the elements of the random vector x.
If i = j it follows from (A.24) that

E{(x_i − m_{x_i})²} = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x_i − m_{x_i})² p_x(x_i, x_j) dx_i dx_j

Integration gives

E{(x_i − m_{x_i})²} = ∫_{−∞}^{+∞} (x_i − m_{x_i})² p_{x_i}(x_i) dx_i

This is the variance of x_i. Note that the variance σ_{x_i}² is the i-th diagonal element of the
variancematrix Q_x. Hence, the variancematrix Q_x can be written as
Q_x = [ σ_{x_1}²     σ_{x_1x_2}   ···   σ_{x_1x_n}
        σ_{x_2x_1}   σ_{x_2}²
        ⋮                        ⋱
        σ_{x_nx_1}                     σ_{x_n}²  ]        (A.25)
Theorem (propagation law of variances): Given a random n-vector x with variancematrix Q_x
and a linear vectorfunction F(x) = Ax + b, with A an m × n matrix, we form the random
m-vector y = F(x). Then

Q_y = A Q_x A^T        (A.27)

with Q_y the m × m variancematrix of y.
or, written out in terms of the columns a_i of A and the rows a_i^T of A^T, as

Q_y = [ a_1 ··· a_n ] [ σ_{x_1x_1} ··· σ_{x_1x_n} ;  ⋮  ⋱  ⋮ ;  σ_{x_nx_1} ··· σ_{x_nx_n} ] [ a_1^T ;  ⋮ ;  a_n^T ] = A Q_x A^T
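In matrix form, (A.27) is a one-line computation. The sketch below applies it to an assumed Q_x and A and cross-checks the result with a sampling estimate.

import numpy as np

rng = np.random.default_rng(2)
Qx = np.array([[4.0, 1.0], [1.0, 9.0]])   # assumed variancematrix of x
A = np.array([[1.0, 2.0], [0.0, 3.0]])    # assumed linear map y = Ax + b
b = np.array([1.0, -1.0])

Qy = A @ Qx @ A.T                         # propagation law of variances (A.27)

# Sampling cross-check: simulate x, map to y and estimate the variancematrix of y
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=Qx, size=200_000)
y = x @ A.T + b
print(Qy)
print(np.cov(y, rowvar=False))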
Without proof we also give the linearized version of the propagation law of variances: with x_0 an approximation to a sample of x, a first-order approximation to the variancematrix of y = F(x) reads Q_y ≐ ∂_x F(x_0) Q_x ∂_x F(x_0)^T.
Examples
… = [ 50   25
      25   41 ]
2. Let the two random variables y1 and y2 be defined as
y_1 = x_1 + x_2² + x_1 x_3
y_2 = x_1³ + sin x_2        (A.31)
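For the nonlinear pair (A.31), the linearized propagation laws require the matrix of partial derivatives ∂_x F(x_0). The sketch below forms this matrix analytically and propagates an assumed variancematrix Q_x; the approximate values x_0 and Q_x are hypothetical, since the worked numbers of the original example are not reproduced here.

import numpy as np

def F(x):
    # y1 = x1 + x2^2 + x1*x3,  y2 = x1^3 + sin(x2)   (A.31)
    return np.array([x[0] + x[1]**2 + x[0]*x[2], x[0]**3 + np.sin(x[1])])

def jacobian(x):
    # Partial derivatives of (A.31) with respect to x1, x2, x3
    return np.array([[1 + x[2], 2*x[1], x[0]],
                     [3*x[0]**2, np.cos(x[1]), 0.0]])

x0 = np.array([1.0, 0.5, 2.0])     # assumed approximate values
Qx = np.diag([0.01, 0.02, 0.01])   # assumed variancematrix of x

J = jacobian(x0)
Qy = J @ Qx @ J.T                  # linearized propagation of variances
print(F(x0), Qy)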
Appendix B
References
In order to aid a further study in the field of adjustment, testing and computation of geode-
tic networks, the list of references given below gives an introductory overview of the
existing international literature. With a few exceptions, the list concentrates on books,
lecture notes and reports.
[2] Baarda, W. (1968): A Testing Procedure for use in Geodetic Networks, Netherlands
Geodetic Commission, Publications on Geodesy, New Series, Vol. 2. No. 5, Delft.
[4] Graybill, F.A. (1976): Theory and Application of the Linear Model, Duxbury Press.
[6] Meissl, P. (1982): Least Squares Adjustment: A Modern Approach, Mitteilungen der
Geodätischen Institute der Technischen Universität Graz, Folge 43.
[7] Mikhail, E. M. (1976): Observations and Least Squares, University Press of Amer-
ica.
[8] Rao, C.R. (1973): Linear Statistical Inference and its Applications, 2nd edition,
Wiley series in probability and mathematical statistics.
[9] Teunissen, P.J.G. (2001): Adjustment Theory: an introduction, 2nd edition, Series
on Mathematical Geodesy and Positioning, Delft University Press, ISBN 90-407-
1974-8.
The first two references concentrate on testing theory. Much of the quality control
procedures which have become standard practice nowadays, can be traced back to these
two original works. The references [3], [4], [5] and [8] are reference books, whereas the
references [6], [7], [9] and [10] are more of the lecture notes type. Reliability theory is
only dealt with in the first two and in the last reference.
[1] Alberda, J.E. (1974): Planning and Optimization of Networks: Some General Con-
siderations. Boll. Geod. Sc. Aff., 33, pp. 209-240.
[2] Alberda, J.E. (1981): Inleiding Landmeetkunde, 3rd edition, Delft University Press.
[5] Delft Geodetic Computing Centre (Eds.) (1982): Forty Years of Thought. Anniver-
sary volume (Vol. 1 and 2) on the occasion of prof. Baarda’s 65th birthday, Delft.
[6] Grafarend, E., F. Sansò (Eds.) (1985): Optimization and Design of Geodetic Networks,
Springer Verlag.
[7] Grafarend, E., B. Schaffrin (1974): Unbiased free net adjustment. Survey review, 22,
pp. 200-218.
[10] Mierlo van, J. (1979): Free Network Adjustment and S-transformations, DGK B,
No. 252, pp. 41-54.
[11] Moffitt, F.H., H. Bouchard (1982): Surveying, 7th edition, Harper and Row.
[13] Teunissen, P.J.G. (1984): Generalized Inverses, the Datum Problem and S-transformations,
Mathematical and Physical Geodesy Report 84.1, Delft, 44 pp.
The references [2], [4], [8], [9], [11], and [12], are all textbooks on surveying. The
references [2], [4], [9] and [11], are of an introductory nature. Reference [8] concentrates
on functional models, in particular spatial modelling and reference [12] includes aspects
of quality control. Optimization and design aspects of geodetic networks are treated in
their whole range of variety in the references [1], [5] and [6]. Free networks are treated
in particular in [3], [7], [10] and [13]. The concept of S-transformations was introduced
in [3] and its relation to generalized inverses is included in [10] and [13].
GPS Surveying
[1] Husti, G.J. (2000): Global Positioning System (in Dutch), Series on Mathematical
Geodesy and Positioning, Delft University Press, ISBN 90-407-1977-2.
[3] Leick, A. (2005): GPS Satellite Surveying, 3rd edition, John Wiley and Sons.
[5] Teunissen, P.J.G. and A. Kleusberg (1998): GPS for Geodesy, 2nd edition, Springer
Verlag.
Network quality control
Peter J.G. Teunissen
The aim of computing a geodetic network is to determine the geometry of the configuration of a set of points
from spatial observations (e.g. GPS baselines and/or terrestrial measurements). The configuration of points
usually consists of newly established points, of which the coordinates still need to be determined, and already
existing points, the so-called control points, of which the coordinates are known.
Network quality control deals with the qualitative aspects of network design, network adjustment, network
validation and network connection. By means of a network adjustment the relative geometry of the new points is
determined and integrated into the geometry of the existing control points. Prior to the network adjustment, the
geometry of the network is designed on the basis of precision and reliability criteria.
The adjustment and validation of the overall geometry can be divided into two phases, the free network phase
and the connected network phase. In the free network phase, the known coordinates of the control points do not
take part in the adjustment and validation. The possible use of a free network phase is based on the idea that a
good geodetic network should be sufficiently precise and reliable in itself, without the need of external control.
Moreover, it allows one to validate the quality of the external control.
In the connected network phase, the geometry of the free network is integrated into the geometry of the control
points. Adjustment and validation in this second phase differs from the free network phase. The adjustment in
the second phase is a constrained connection adjustment, since it is often not practical to see the coordinates
of the control points change every time a free network is connected to them. For the validation of the connected
network, however, the unconstrained connection adjustment is used as input. This allows one to take the intrinsic
uncertainty of the coordinates of the control points into account in the connection phase.
The goal of this introductory text on network quality control is to convey the necessary knowledge for designing,
adjusting and testing geodetic networks. For the purpose of network design, the precision and reliability theory
is worked out in detail. This includes the minimal detectable biases and the bias-to-noise ratios. For the purpose
of the network adjustment, the principles of unconstrained, constrained, and minimally constrained least-squares
estimation are treated. For the network testing, the principles of hypothesis testing are presented and
worked out for the different network cases. For the free network phase this includes the overall model test,
the w-test, and the data snooping procedure. For the connected network phase, it includes the T-test, with an
emphasis on the detection and identification of errors in the control points.
P.J.G. Teunissen
Delft University of Technology,
Faculty of Civil Engineering and Geosciences