Network Quality Control
P.J.G. Teunissen
Series on Mathematical Geodesy and Positioning
This work is licensed under a Creative Commons Attribution 4.0 International license
To promote open access, this new edition of Network Quality Control is published by TU Delft
Open Publishing instead of Delft Academic Press. The book builds on the foundations of
adjustment theory and testing theory to enable precise and reliable designs of geodetic
networks and their connections. As such, the book is a natural follow-on of the books
Adjustment Theory (2nd Ed. 2024) and Testing Theory (3rd Ed. 2024), both with TU Delft Open
Publishing.
September, 2024
This book is the result of a series of lectures and courses the author has given on the topic
of network analysis. During these courses it became clear that there is a need for refer-
ence material that integrates network analysis with the statistical foundations of parameter
estimation and hypothesis testing. Network quality control deals with the qualitative as-
pects of network design, network adjustment, network validation and network connection,
and as such conveys the necessary knowledge for computing and analysing networks in
an integrated manner.
In completing the book, the author received valuable assistance from Ir. Hedwig Ver-
hoef, Dr. Ir. Dennis Odijk and Ria Scholtes. Hedwig Verhoef has also been one of the
lecturers and took care of editing a large portion of the book. This assistance is gratefully acknowledged.
P.J.G. Teunissen
December 2006
Contents
1 An overview 1
2 Estimation and precision 9
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Consistency and uniqueness . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 The Linear A-model: observation equations . . . . . . . . . . . . . . . . 13
2.3.1 Least-squares estimates . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.2 A stochastic model for the observations . . . . . . . . . . . . . . 17
2.3.3 Least-squares estimators . . . . . . . . . . . . . . . . . . . . . . 18
2.3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4 The Nonlinear A-model: observation equations . . . . . . . . . . . . . . 20
2.4.1 Nonlinear observation equations . . . . . . . . . . . . . . . . . . 20
2.4.2 The linearized observation equations . . . . . . . . . . . . . . . . 23
2.4.3 Least-Squares iteration . . . . . . . . . . . . . . . . . . . . . . . 28
2.5 The B-Model: condition equations . . . . . . . . . . . . . . . . . . . . . 29
2.5.1 Linear condition equations . . . . . . . . . . . . . . . . . . . . . 29
2.5.2 Nonlinear condition equations . . . . . . . . . . . . . . . . . . . 32
2.6 Special Least-Squares procedures . . . . . . . . . . . . . . . . . . . . . 32
2.6.1 Recursive least-squares . . . . . . . . . . . . . . . . . . . . . . . 33
2.6.2 Constrained least-squares . . . . . . . . . . . . . . . . . . . . . . 34
2.6.3 Minimally constrained least-squares . . . . . . . . . . . . . . . . 36
2.7 Quality control: precision . . . . . . . . . . . . . . . . . . . . . . . . . . 40
A Appendix 111
A.1 Mean and variance of scalar random variables . . . . . . . . . . . . . . . 111
A.2 Mean and variance of vector random variables . . . . . . . . . . . . . . . 115
B References 123
Chapter 1
An overview
This introductory chapter gives an overview of the material presented in the book. The
book consists of three parts: a first part on estimation theory, a second part on testing theory and a third part on network theory. The first two parts are of a more general
nature. The material presented therein is in principle applicable to any geodetic project
where measurements are involved. Most of the examples given however, are focussed
on the network application. In the third part, the computation and validation of geodetic
networks is treated. In this part, we make frequent use of the material presented in the
first two parts. In order to give a bird’s eye view of the material presented, we start with a
brief overview of the three parts.
ADJUSTMENT: The need for an adjustment arises when one has to solve an inconsis-
tent system of equations. In geodesy this is most often the case, when one has to solve
a redundant system of observation equations. The adjustment principle used is that of
least-squares. A prerequisite for applying this principle in a proper way, is that a number
of basic assumptions need to be made about the input data, the measurements. Since measurements are always uncertain to some degree, they are modeled as sample values of a
random vector, the m-vector of observables y (note: the underscore will be used to denote
random variables). In case the vector of observables is normally distributed, its distri-
bution is uniquely characterized by the first two (central) moments: the expectation (or
mean) E{y} and the dispersion (or variance) D{y}. Information on both the expectation
and dispersion needs to be provided, before any adjustment can be carried out.
The information on the expectation of y is given in the form of the linear system of observation equations

E{y} = Ax
This system is referred to as the functional model. It is given once the design matrix A of
order m × n is specified.
The system as it is given here, is linear in x. Quite often however, the observation
equations are nonlinear. In that case a linearization needs to be carried out, to make the
system linear again. The parameter vector x usually consists of coordinates and possi-
bly, additional nuisance parameters, such as for instance orientation unknowns in case of
theodolite measurements. The coordinates could be of any type. For instance, they could
Least-squares: Once the measurements have been collected and the functional model
and the stochastic model have been specified, the actual adjustment can be carried out.
The least-squares estimator of the unknown parameter vector x, is given as
x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y
It depends on the design matrix A, the variance matrix Qy and the vector of observables
y. With x̂, one can compute the adjusted observables as ŷ = Ax̂ and the least- squares
residuals as ê = y − ŷ.
The above expression for the least-squares estimator is based on a functional model
which is linear. In the nonlinear case, one will first have to apply a linearization before
the above expression can be applied. For the linearization one will need approximate
values for the unknown parameters. In case approximate knowledge of the geometry of the network is already available, the approximate coordinates of the network points
can be obtained from a map. If not, a minimum set of the observations themselves will
have to be used for computing approximate coordinates. In case the approximate values
of the unknown parameters are rather poor, one often will have to iterate the least-squares
solution.
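To make this computation concrete, the following small sketch (not from the book; the design matrix, variance matrix and observations are made-up values) evaluates x̂, ŷ and ê for a linear model with numpy.

```python
import numpy as np

# Hypothetical 3x2 design matrix, variance matrix and observations (made-up values)
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
Qy = np.diag([1e-4, 1e-4, 2e-4])           # variance matrix of the observables
y = np.array([1.013, 2.021, 3.030])        # sample values of y

W = np.linalg.inv(Qy)                      # weight matrix W = Qy^-1
N = A.T @ W @ A                            # normal matrix A^T Qy^-1 A
x_hat = np.linalg.solve(N, A.T @ W @ y)    # x̂ = (A^T Qy^-1 A)^-1 A^T Qy^-1 y
y_hat = A @ x_hat                          # adjusted observables ŷ = A x̂
e_hat = y - y_hat                          # least-squares residuals ê = y - ŷ
print(x_hat, e_hat)
```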
Quality: Every function of a random vector, is itself a random variable as well. Thus
x̂ is a random vector, just like the vector of observables y is. And when x̂ is linearly
related to y, it will have a normal distribution whenever y has one. The mean of x̂ is given as

E{x̂} = x
Thus the expectation of the least-squares estimator equals the unknown, but sought for
parameter vector x. This property is known as unbiasedness. From an empirical point
of view, the equation implies that if the adjustment were repeated, each time with
measurements collected under similar circumstances, then the different outcomes of the
adjustment would on the average coincide with x. It will be clear, that this is a desirable
property indeed.
The dispersion of x̂, describing its precision, is given as

Qx̂ = (A^T Qy^{-1} A)^{-1}
This variance matrix is independent of y. This is a very useful property, since it implies
that one can compute the precision of the least-squares estimator without having the actual
measurements available. Only the two matrices A and Qy need to be known. Thus once
the functional model and stochastic model have been specified, one is already in a position
to know the precision of the adjustment result. It also implies, that if one is not satisfied
with this precision, one can change it by changing A and/or Qy . This is typically done at
the design stage of a geodetic project, prior to the actual measurement stage. Changing
the geometry of the network and/or adding/deleting observables, will change A. Using
different measurement equipment and/or different measurement procedures, changes Qy .
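A sketch of this design-stage reasoning, with assumed numbers: the variance matrix Qx̂ = (A^T Qy^{-1} A)^{-1} is evaluated from A and Qy alone, and then re-evaluated after an assumed improvement of the measurement precision.

```python
import numpy as np

def covariance_of_estimator(A, Qy):
    """Qx = (A^T Qy^-1 A)^-1; needs only the design matrix and the variance matrix."""
    W = np.linalg.inv(Qy)
    return np.linalg.inv(A.T @ W @ A)

# Hypothetical design: 3 observations of 2 parameters (made-up values)
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Qx1 = covariance_of_estimator(A, np.diag([1e-4, 1e-4, 2e-4]))
# Assumed better instrument: a smaller Qy gives a smaller Qx without any observations
Qx2 = covariance_of_estimator(A, np.diag([1e-4, 1e-4, 2e-4]) / 4.0)
print(np.sqrt(np.diag(Qx1)), np.sqrt(np.diag(Qx2)))   # standard deviations of x̂
```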
TESTING: Applying only an adjustment to the observed data is not enough. The result
of an adjustment and its quality rely heavily on the validity of the functional and stochastic
model. Errors in one of the two, or in both, will invalidate the adjustment results. One
therefore needs, in addition to the methods of adjustment theory, also methods that allow
one to check the validity of the assumptions underlying the functional and stochastic
model. These methods are provided for by the theory of statistical testing.
Model errors: One can make various errors when formulating the model needed for
the adjustment. The functional model could have been misspecified, E{y} ≠ Ax. The stochastic model could have been misspecified, D{y} ≠ Qy. Even the distribution of y
need not be normal. In these lecture notes, we restrict our attention to misspecifications
in the functional model. These are by far the most common modelling errors that occur
in practice. Denoting the model error as b, we have E{y} = Ax + b. If it is suspected
that model errors did indeed occur, one usually, on the basis of experience, has a fair idea
what type of model error could have occurred. This implies that one is able to specify the
vector b in the form of equations like
b = C∇
Test statistic: It will be intuitively clear that the least-squares residual vector ê, must
play an important role in validating the model. It is zero, when the measurements form a
perfect match with the functional model, and it departs from zero, the more the measure-
ments fail to match the model. A test statistic is a random variable that measures on the
basis of the least-squares residuals, the likelihood that a model error has occurred. For a
model error of the type C∇, it reads
Tq = ê^T Qy^{-1} C (C^T Qy^{-1} Qê Qy^{-1} C)^{-1} C^T Qy^{-1} ê
It depends, apart from the least-squares residuals, also on the matrix C, on the design
matrix A (through Qê ) and on the variance matrix Qy . The test statistic has a central Chi-
squared distribution with q degrees of freedom, χ²(q, 0), when the model error is absent. When the value of the test statistic falls in the right tail-area of this distribution, one is inclined to believe that the model error indeed occurred. Thus the presence of the model error is considered likely when Tq > χ²_{αq}(q, 0), where αq is the chosen level of significance.
Testing procedure: In practice it is generally not only one model error one is concerned
about, but quite often many more than one. In order to take care of these various potential
modelling errors, one needs a testing procedure. It consists of three steps: detection,
identification and adaptation. The purpose of the detection step is to infer whether one
has any reason to believe that the model is wrong. In this step one still has no particular model error in mind. The test statistic for detection reads

T_{m−n} = ê^T Qy^{-1} ê

One decides to reject the model when T_{m−n} > χ²_{α_{m−n}}(m − n, 0).
When the detection step leads to rejection, the next step is the identification of the
most likely model error. The identification step is performed with test statistics like T q .
It implies that one needs to have an idea about the type of model errors that are likely to
occur in the particular application at hand. Each member of this class of potential model
errors is then specified through a matrix C. In case of one dimensional model errors, such
as blunders, the C-matrix becomes a vector, denoted as c. In that case q = 1 and the test
statistic T q simplifies considerably. One can then make use of its square-root, which reads
w = (c^T Qy^{-1} ê) / (c^T Qy^{-1} Qê Qy^{-1} c)^{1/2}
This test statistic has a standard normal distribution N(0, 1) in the absence of the model error. The particular model error that corresponds with the vector c, is then said to have occurred with a high likelihood, when |w| > N_{α₁/2}(0, 1). In order to have the model error
detected and identified with the same probability, one will have to relate the two levels
of significance, αm−n and α1 . This is done by equating the power and the noncentrality
parameters of the above two test statistics T m−n and w.
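The following sketch illustrates the w-test for one-dimensional errors (conventional data snooping), using made-up numbers and a critical value from the standard normal distribution; it only demonstrates the formula above, not the complete detection-identification-adaptation procedure.

```python
import numpy as np
from scipy import stats

# Hypothetical model and data (made-up values)
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
Qy = np.diag([1e-4] * 4)
y = np.array([1.000, 2.002, 3.015, -0.999])       # third observation carries a small error

W = np.linalg.inv(Qy)
Qx = np.linalg.inv(A.T @ W @ A)
x_hat = Qx @ A.T @ W @ y
e_hat = y - A @ x_hat
Qe = Qy - A @ Qx @ A.T                            # variance matrix of the residuals

alpha1 = 0.001
crit = stats.norm.ppf(1.0 - alpha1 / 2.0)         # two-sided critical value N_{alpha1/2}(0,1)
for i in range(len(y)):
    c = np.zeros(len(y)); c[i] = 1.0              # one-dimensional error in observation i
    w = (c @ W @ e_hat) / np.sqrt(c @ W @ Qe @ W @ c)
    print(i, round(w, 2), abs(w) > crit)          # flag the suspect observation
```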
Once certain model errors have been identified as sufficiently likely, the last step con-
sists of an adaptation of the data and/or model. This implies either a remeasurement of the
data or the inclusion of additional parameters into the model, such that the model errors
are accounted for. In both cases one always should check again of course, whether the
newly created situation is acceptable or not.
Quality: In case a model error of the type C∇ occurs, the least-squares estimator x̂ will
become biased. Thus E{x̂} ≠ x. The dispersion or precision of the estimator however,
remains unaffected by this model error. The bias in x̂, due to a model error C∇, is given
as
∇x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} C∇
The purpose of testing the model, is to minimize the risk of having a biased least-squares
solution. However, one should realize that the outcomes of the statistical tests are not
exact and thus also prone to errors. It depends on the 'strength' of the model, how much
confidence one will have in the outcomes of these statistical tests. A measure of this
confidence is provided for by the concept of reliability. When the above w-test statistic is
used, the size of the model error that can be found with a probability γ , is given by the
Minimal Detectable Bias (MDB). It reads
|∇| = [ λ(α₁, 1, γ) / (c^T Qy^{-1} Qê Qy^{-1} c) ]^{1/2}
where λ (α1 , 1, γ ) is a known function of the level of significance α1 and the detection
probability (power) γ . The set of MDB’s, one for each model error considered, is said to
describe the internal reliability of the model.
As it was the case with precision, the internal reliability can be computed once the
design matrix A and the variance matrix Qy are available. Changing A and/or changing
Qy , will change the MDB’s. In this way one can thus change (e.g. improve) the internal
reliability. Substitution of C | ∇ | for C∇ in the above expression for ∇x̂, will show by
how much the least-squares solution becomes biased, when a model error of the size of
the MDB occurs. The bias vectors ∇x̂, one for each model error considered, are then said
to describe the external reliability of the model.
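As a sketch of how the internal reliability can be evaluated (again with made-up numbers): the factor λ(α₁, 1, γ) is obtained here by a small numerical search over the noncentral Chi-squared distribution, and the MDB of each observation then follows from the formula above.

```python
import numpy as np
from scipy import stats, optimize

def noncentrality(alpha1, gamma, df=1):
    """Solve P(chi2'_df(lam) > k_alpha) = gamma for the noncentrality parameter lam."""
    k = stats.chi2.ppf(1.0 - alpha1, df)
    return optimize.brentq(lambda lam: stats.ncx2.sf(k, df, lam) - gamma, 1e-6, 100.0)

# Hypothetical model (made-up values)
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
Qy = np.diag([1e-4] * 4)
W = np.linalg.inv(Qy)
Qe = Qy - A @ np.linalg.inv(A.T @ W @ A) @ A.T

lam = noncentrality(alpha1=0.001, gamma=0.80)     # roughly 17 for these choices
for i in range(A.shape[0]):
    c = np.zeros(A.shape[0]); c[i] = 1.0
    mdb = np.sqrt(lam / (c @ W @ Qe @ W @ c))      # |∇| for a blunder in observation i
    print(i, mdb)
```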
points. The determination and validation of the overall geometry is usually divided into two
phases: (1) the free network phase, and (2) the connected network phase.
Free network phase: In this phase, the known coordinates of the control points do not
take part in the determination of the geometry. It is thus free from the influence of the
existing control points. The idea is that a good geodetic network should be sufficiently
precise and reliable in itself, without the need of external control. It implies, when in the
second phase, the connected network phase, rejection of the model occurs, that one has
good reason to believe that the cause for rejection should be sought in the set of control
points, instead of in the geometry of the free network.
As with any geodetic project, the three steps involved in the free network phase are:
design (precision and reliability), adjustment (determination of geometry) and testing
(validation of geometry). With free networks however, there is one additional aspect
that should be considered carefully. It is the fundamental non-uniqueness in the relation
between geodetic observables and coordinates. This implies, that when computing co-
ordinates for the free network, additional information in the form of minimal constraints
are needed, to eliminate the non-uniqueness between observables and coordinates. The
minimal constraints however, are not unique. There is a whole set from which they can
be chosen. This implies that the set of adjusted coordinates of the free network, including their variance matrix and external reliability, is not unique either. This in turn implies that one should only use procedures for evaluating the precision and reliability that are guaranteed to be invariant to the choice of minimal constraints. If this precaution is not taken, one will end up using an evaluation procedure of which the outcome is dependent on the arbitrary choice of minimal constraints.
Connected network phase: The purpose of this second phase is to integrate the geom-
etry of the free network into the geometry of the control points. The observables are the
coordinates of the free network and the coordinates of the control points. Since the co-
ordinates of the two sets are often given in different coordinate systems, the connection
model will often be based on a coordinate transformation from the coordinate system of
the free network to that of the control network.
In contrast to the free network phase, the design, adjustment and testing are now
somewhat nonstandard. First of all there is not much left to design. Once the free network
phase has been passed, the geometry of the free network as well as that of the control
points are given. This implies that already at the design stage of the free network, one
should take into account the distribution of the free network points with respect to the
distribution of the control points.
Secondly, the adjustment in the connected network phase is not an ordinary least-
squares adjustment. In most applications, it is not very practical to see the coordinates
of the control points change every time a free network is connected to them. This would happen however, if an ordinary adjustment were carried out. Thus instead, a
constrained adjustment is applied, with the explicit constraints that the coordinates of the
control points remain fixed.
For testing however, a constrained adjustment would not be realistic. After all, the
coordinates of the control points are still samples from random variables and therefore
not exact. Thus for the validation of the connected geometry, the testing is based on the
least-squares residuals that follow from an ordinary adjustment and not from a constrained
adjustment.
Chapter 2
Estimation and precision
2.1 Introduction
In this equation the known scalars aiα determine how the measurements are related to the
unknown parameters. By introducing the matrix A and the vectors y and x as
A = [a11 ... a1n; ... ; am1 ... amn] , y = (y1, ..., ym)^T , x = (x1, ..., xn)^T
equation (2.1) can be written in matrix-vector form as
y = Ax   (2.2)

with y the m × 1 vector of observations, A the m × n design matrix and x the n × 1 parameter vector. A solution x of (2.2) exists if and only if y can be written as a linear combination of the column vectors of matrix A, that is, if and only if

y ∈ R(A)   (2.3)

where R(A) denotes the range space of matrix A.
Systems of equations for which this holds true, are called consistent. A system is said
to be inconsistent if it is not consistent. In this case the vector y can not be written as a
linear combination of the column vectors of matrix A and hence no vector x exists such
that (2.2) holds true.
Since y ∈ Rm, it follows from (2.3) that consistency is guaranteed if R(A) = Rm, that is, if the range space of matrix A equals the whole m-dimensional observation space. But
R(A) = Rm only holds true, if the dimension of R(A) equals the dimension of Rm , which
is m. It follows therefore, since the dimension of R(A) equals the rank of matrix A (note:
the rank of a matrix equals the total number of linear independent columns of this matrix),
that consistency is guaranteed if and only if
rank A = m (2.4)
In all other cases, rank A < m, the linear system may or may not be consistent.
Let us now assume that the system is indeed consistent. The next question one may ask
is whether the solution to (2.2) is unique or not. That is, whether the information content
of the measurements collected in the vector y is sufficient to determine the parameter
vector x uniquely. The solution is only unique if all the column vectors of matrix A are
linearly independent. Hence, the solution is unique if the rank of matrix A equals the number
of unknown parameters,
rank A = n (2.5)
To see this, assume x and x′ ≠ x to be two different solutions of (2.2). Then Ax = Ax′ or A(x − x′) = 0. But this can only be the case if some of the columns of matrix A are linearly dependent, which contradicts the assumption (2.5) of full rank. Thus the solution is unique when rank A = n and it is nonunique when rank A < n.
Unless otherwise stated, we will assume from now on that the matrix A of the linear
system (2.2) is of full rank.
With rank A = n and the fact that the rank of a matrix is always less than or equal to the number of its rows and the number of its columns, it follows that we can discriminate between the following two cases

m = n = rank A    or    m > n = rank A   (2.6)
In the first case, both (2.4) and (2.5) are satisfied. This implies that the system is both
consistent and unique. Thus a solution exists and this solution is also unique. The unique
solution, denoted by x̂, is given as
x̂ = A−1 y (2.7)
where A−1 denotes the inverse of matrix A.
Example 1
Consider the linear system
(2; 1) = [1, 3; 2, −1] (x1; x2)   (2.8)

with y = (2, 1)^T and A = [1, 3; 2, −1].
In the second case, only (2.5) is satisfied. This implies that a unique solution exists,
provided that the system is consistent. If the system is consistent, the unique solution can
be obtained by inverting n out of the m > n linear equations. Hence, we first partition
(2.2) as
(y1; y2) = (A1; A2) x
where y1 is an n-vector, y2 is an m − n-vector and A1 and A2 are of order n × n and
(m − n) × n respectively. The unique solution x̂ follows then as
x̂ = A1^{-1} y1
Note that y2 is not used in computing x̂. This is due to the fact that in the present situation,
y2 is consistent with y1 and hence, it does not contain any additional information.
Example 2
Consider the linear system
(−2; 3; −1) = [1, 3; 2, −1; 1, 2] (x1; x2)   (2.9)
Taking, for instance, the last two equations of (2.9), the unique solution follows as

(x̂1; x̂2) = [2, −1; 1, 2]^{-1} (3; −1) = (1/5)[2, 1; −1, 2] (3; −1) = (1; −1)
The question that remains is: What to do when the linear system of equations is incon-
sistent? In that case, we first need to make the system consistent before a solution can be
computed. There are however many ways in which an inconsistent system can be made
consistent. In the next section we will give a way, which is intuitively appealing. And
later on we will show that this approach of finding a solution to an inconsistent system
also has some optimality properties, in particular in a probabilistic context.
In (2.11), y and A are given, whereas x and e are unknown. From the geometry of
figure 2.1 it seems intuitively appealing to estimate x as x̂ such that Ax̂ is as close as possible to the given measurement or observation vector y. In other words, the idea is to
Figure 2.1 The geometry of y = Ax + e.
find that value of x that minimizes the length of the vector e = y − Ax. This idea leads to
the following minimization problem:
AT Ax̂ = AT y (2.16)
Since rank A^T A = rank A = n, the system is consistent and has a unique solution. Through an inversion of the normal matrix A^T A the unique solution of (2.16) is found as

x̂ = (A^T A)^{-1} A^T y
That this solution x̂ is indeed the minimizer of (2.14) follows from the fact that the matrix
∂ 2 F/∂ x2 of (2.15) is indeed positive-definite. The vector x̂ is known as the least-squares
estimate of x, since it produces the least possible value of the sum-of-squares function
F(x).
From the normal equations (2.16) follows that AT (y − Ax̂) = 0. This shows that the
vector ê = y−Ax̂, which is the least-squares estimate of e, is orthogonal to the range space
of matrix A (see figure 2.2):
Figure 2.2 The geometry of least-squares.
Example 3
Let us consider the next problem
(y1; y2) = (1; 1) x + (e1; e2)
2x̂ = y1 + y2

x̂ = (1/2)(y1 + y2)
Thus, to estimate x, one adds the measurements and divides by the number
of measurements. Hence, the least-squares estimate equals in this case the
arithmetic average.
The least-squares estimates of the observations and observation errors follow
from ŷ = Ax̂ and ê = y − ŷ as
(ŷ1; ŷ2) = (1/2)(y1 + y2; y1 + y2) , and (ê1; ê2) = (1/2)(y1 − y2; y2 − y1)
ê^T ê = (1/2)(y1 − y2)²
The solution of (2.19) can be derived along lines which are similar as the ones used
for solving (2.12). The solution of (2.19) reads
x̂ = (AT WA)−1 AT Wy (2.20)
This is the weighted least-squares estimate of x. In case of weighted least-squares the
normal equations read: AT WAx̂ = AT Wy. This shows that the vector ê = y − Ax̂, which is
the weighted least-squares estimate of e, satisfies
AT W ê = 0 , with ê = y − Ax̂ (2.21)
If the inner product of the observation space Rm is defined as (a, b) = a^T W b, ∀a, b ∈ Rm, (2.21) can also be written as (Ax, ê) = 0, ∀x ∈ Rn. This shows that also in the case of weighted least-squares, the vector ê can be considered to be orthogonal to the range space of A.
A summary of the least-squares algorithm is given in Table 2.1.
Example 4
Consider again the problem
(y1; y2) = (1; 1) x + (e1; e2)

but now with the weight matrix

W = [w11, 0; 0, w22]

The weighted least-squares estimate (2.20) then reads

x̂ = (w11 y1 + w22 y2) / (w11 + w22)
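A quick numerical check of this example (the weights and observations below are arbitrary values chosen for illustration): the general weighted least-squares formula (2.20) indeed reproduces the weighted average.

```python
import numpy as np

A = np.array([[1.0], [1.0]])                     # two observations of a single parameter x
W = np.diag([2.0, 3.0])                          # arbitrary weights w11, w22
y = np.array([10.2, 9.8])

x_hat = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)          # (A^T W A)^-1 A^T W y
weighted_mean = (2.0 * 10.2 + 3.0 * 9.8) / (2.0 + 3.0)     # (w11 y1 + w22 y2)/(w11 + w22)
print(x_hat[0], weighted_mean)                   # identical results
```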
2.3.2 A stochastic model for the observations

In order to model the uncertainty in the measurements, the m-vector of measurements is considered to be a sample value of a random vector of observables y (Note: the underscore indicates that we are dealing with
a random variable). It is furthermore assumed that the vector of observables y can be
written as the sum of a deterministic functional part Ax and a random residual part e:
y = Ax + e (2.22)
E{e} = 0 (2.23)
where E{.} stands for the mathematical expectation operator. The measurement variabil-
ity itself is modelled through the dispersion- or variance matrix of e. We will assume that
this matrix is known and denote it by Qy :
D{e} = Qy (2.24)
where D{.} stands for the dispersion operator. It is defined in terms of E{.} as

D{e} = E{(e − E{e})(e − E{e})^T}
With (2.23) and (2.24) we are now in the position to determine the mean and variance
matrix of the vector of observables y. Application of the law of propagation of means and
the law of propagation of variances to (2.22) gives with (2.23) and (2.24):

E{y} = Ax ; D{y} = Qy   (2.25)
This will be our model for the vector of observables y. As the results of the next sec-
tion show, model (2.25) enables us to describe the quality of the results of least-squares
estimation in terms of the mean and the variance matrix.
The first moment; the mean: Together with E{y} = Ax, an application of the propaga-
tion law of means to (2.26) gives
E{x̂} = x , E{ŷ} = E{y} , E{ê} = E{e} = 0   (2.27)
These results show, that under the assumption that (2.25) holds, the least-squares estima-
tors are unbiased estimators. Note that this property of unbiasedness is independent of
the choice for the weight matrix W.
The second moment; the variance matrix: Together with D{y} = Qy , an application of
the propagation law of variances and covariances to (2.26) gives
Qx̂ = (A^T W A)^{-1} A^T W Qy W A (A^T W A)^{-1}
Qŷ = A Qx̂ A^T   (2.28)
Qê = [I − A(A^T W A)^{-1} A^T W] Qy [I − A(A^T W A)^{-1} A^T W]^T
and

Qx̂ŷ = Qx̂ A^T
Qx̂ê = (A^T W A)^{-1} A^T W Qy − Qx̂ A^T   (2.29)
Qŷê = A Qx̂ê
The above variance matrices enable us now to give a complete precision description
of any arbitrary linear function of the estimators. Consider for instance the linear function
θ̂ = aT x̂. Application of the propagation law of variances gives then for the precision of
θ̂ : σθ̂2 = aT Qx̂ a.
The above results enable us to describe the quality of the results of least-squares es-
timation in terms of the mean and the variance matrix. The introduction of a stochastic
model for the vector of observables y enables us however also to judge the merits of the
least-squares principle itself. Recall that the least-squares principle was introduced on the
basis of intuition and not on the basis of probabilistic reasoning. With the mathematical
model (2.24) one could now however try to develop an estimation procedure that pro-
duces estimators with certain well defined probabilistic optimality properties. One such
procedure is based on the principle of ”Best Linear Unbiased Estimation (BLUE)”.
Assume that we are interested in estimating a parameter θ which is a linear function
of x:
θ = a^T x   (2.30)
The estimator of θ will be denoted as θ̂ . Then according to the BLUE’s criteria, the
estimator θ̂ of θ has to be a linear function of y,
θ̂ = l^T y   (2.31)
such that it is unbiased,
E{θ̂ } = θ (2.32)
and of minimal variance,

E{(θ̂ − θ)²} = minimal   (2.33)
The objective is thus to find a vector l ∈ Rm such that with (2.31), the conditions (2.32)
and (2.33) are satisfied. It can be shown that the solution to the above problem is given by
l^T = a^T (A^T Qy^{-1} A)^{-1} A^T Qy^{-1}

so that

θ̂ = a^T (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y   (2.34)
This is the best linear unbiased estimator of θ . The important result (2.34) shows that the
best linear unbiased estimator of x is given by
x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y   (2.35)
A comparison between (2.26) and (2.35) shows that the BLUE of x is identical to the
weighted least-squares estimator of x if the weight matrix W is taken to be equal to the
inverse of the variance matrix of y:
W = Qy^{-1}   (2.36)
This is an important result, because it shows that the weighted least-squares estimators are best in the probabilistic sense of having minimal variance if (2.36) holds. The variances and covariances of these estimators follow if the weight matrix W is replaced in (2.28) and (2.29) by Qy^{-1}.
From now on we will always assume, unless stated otherwise, that the weight matrix W is chosen to be equal to Qy^{-1}. Consequently no distinction will be made any more in these lecture notes between weighted least-squares estimators and best linear unbiased estimators. Instead we will simply speak of least-squares estimators.
2.3.4 Summary
In Table 2.2 an overview is given of the main results of Least-Squares Estimation.
The table shows the linear model of observation equations, the linear A-model. Based
on A, Qy and y, the least-squares estimator of the unknown parameter vector x can be
determined. And from it, one can determine the adjusted vector of observables, ŷ, the
least-squares residual vector, ê, and the least-squares estimator of an arbitrary function
θ = a^T x, as θ̂ = a^T x̂. As shown in the table, each of these random vectors has its own mean and variance matrix. Some of the random vectors are correlated, such as x̂ and ŷ, and some of them are not, such as x̂ and ê.
Table 2.2 Overview of the least-squares estimators: x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y ; ŷ = A x̂ ; ê = y − ŷ ; θ̂ = a^T x̂, together with their means and variance matrices.
2.4 The Nonlinear A-model: observation equations

So far the model of observation equations has been assumed to be linear in the unknown parameters. In practical applications there are however only a few cases where this assumption truly holds. A
typical example is levelling. In the majority of applications however the m-vector E{y} is
nonlinearly related to the n-vector of unknown parameters x. This implies that instead of
the linear A-model (2.25), we are generally dealing with a nonlinear model of observation
equations:
E{y} = A(x) ; D{y} = Qy (2.37)
where A(.) is a nonlinear vectorfunction from Rn into Rm . The following two simple
examples should make this clear.
Example 5
Consider the configuration of figure 2.3. The x,y coordinates of the three
points 1, 2 and 3 are known and the coordinates x4 and y4 of point 4 are unknown. The observables consist of the three azimuth variates a14, a24 and a34.

Figure 2.3 Configuration with the known points 1, 2, 3, the unknown point 4 and the azimuths a14, a24, a34; xij = xj − xi, yij = yj − yi.

Since azimuth and coordinates are related as
tan aij = (xj − xi)/(yj − yi) = xij/yij
The model of observation equations for the configuration of figure 2.3 reads
E{(a14; a24; a34)} = (arctan[x14/y14]; arctan[x24/y24]; arctan[x34/y34])
This model consists of three nonlinear observation equations in the two un-
known parameters x4 and y4 .
Example 6
Consider the situation of figure 2.4. It shows two cartesian coordinate sys-
tems: the x,y-system and the u,v-system. The two systems only differ in their
orientation. This means that if the coordinates of a point i are given in the
u,v-system, (ui , vi ), a rotation through an angle α is needed to obtain the
coordinates of the same point i in the x,y-system, (xi , yi ):
(xi; yi) = [cos α, −sin α; sin α, cos α] (ui; vi)   (2.38)
Let us now assume that we have at our disposal the coordinate observables
of two points in both coordinate systems: (xi , yi ) and (ui , vi ), i = 1, 2. Using
(2.38), our model reads then
E{(xi; yi)} = [cos α, −sin α; sin α, cos α] E{(ui; vi)} , i = 1, 2   (2.39)
Figure 2.4 The x,y-system and the u,v-system, which differ only in their orientation through the rotation angle α.
Taylor’s Theorem: Let f (x) be a function from Rn into R which is smooth enough. Let
x0 ∈ Rn be an approximation to x ∈ Rn and define Δx = x − x0, and θ = x0 + t(x − x0) with t ∈ (0, 1).
In (2.41) and (2.42), ∂_{α1...αq} f(x) denotes the qth-order partial derivative of f(x) evaluated at x. For the case q = 2, it follows from (2.41) and (2.42) that
f(x) = f(x0) + Σ_{α=1}^{n} ∂α f(x0) Δxα + (1/2) Σ_{α=1}^{n} Σ_{β=1}^{n} ∂²_{αβ} f(θ) Δxα Δxβ   (2.43)
If the first-order partial derivatives are collected in the gradient vector ∂x f(x0) and the second-order partial derivatives in the matrix ∂²xx f(θ), then equation (2.43) may be written in the more compact matrix-vector form as
f(x) = f(x0) + ∂x f(x0)^T Δx + (1/2) Δx^T ∂²xx f(θ) Δx   (2.44)
This important result shows that a nonlinear function f (x) can be written as a sum of
three terms. The first term in this sum is the zero-order term f (x0 ). The zero-order term
depends on x0 but is independent of x. The second term in the sum is the first-order term
∂x f (x0 )T Δx. It depends on x0 and is linearly dependent on x. Finally, the third term in the
sum is the second-order remainder R2 (θ , Δx).
A consequence of Taylor’s Theorem is that the remainder R2 (θ , Δx) can be made
arbitrarily small by choosing the approximation x0 close enough to x. Now assume that the
x0 approximation is chosen such that the second-order remainder can indeed be neglected.
Then, instead of (2.44) we may write to a sufficient approximation:
f (x) = f (x0 ) + ∂x f (x0 )T Δx (2.45)
Hence, if x0 is sufficiently close to x, the nonlinear function f (x) can be approximated to
a sufficient degree by the function f (x0 ) + ∂x f (x0 )T Δx which is linear in x. This function
is the linearized version of f (x). A geometric interpretation of this linearization is given
in figure 2.5 for the case n = 1. Let us now apply the above linearization to our nonlinear
observation equations
E{y} = A(x) = (a1(x); ...; am(x))   (2.46)
Figure 2.5 The nonlinear curve y = f(x) and its linear tangent y = f(x0) + (d/dx) f(x0)(x − x0).
Each nonlinear observation equation ai (x), can now be linearized according to (2.45).
This gives
(a1(x); ...; am(x)) = (a1(x0); ...; am(x0)) + (∂x a1(x0)^T; ...; ∂x am(x0)^T) Δx   (2.47)
If we denote the m × n matrix of (2.47) as ∂x A(x0 ), and substitute (2.47) into (2.46) we
get

E{y} = A(x0) + ∂x A(x0) Δx   (2.48)

If we bring the constant m-vector A(x0) to the left-hand side of the equation and define Δy = y − A(x0), we finally obtain our linearized model of observation equations

E{Δy} = ∂x A(x0) Δx ; D{Δy} = Qy   (2.49)

This is the linearized A-model. Compare (2.49) with (2.37) and (2.25). Note when com-
paring (2.49) with (2.25) that in the linearized A-model Δy takes the place of y, ∂x A(x0 )
takes the place of A and Δx takes the place of x. Since the linearized A-model is linear, our
standard formulae of least-squares can be applied again. This gives for the least-squares
estimator x̂ = x0 + Δx̂ of x:

x̂ = x0 + (∂x A(x0)^T Qy^{-1} ∂x A(x0))^{-1} ∂x A(x0)^T Qy^{-1} Δy   (2.50)

with variance matrix

Qx̂ = (∂x A(x0)^T Qy^{-1} ∂x A(x0))^{-1}   (2.51)
It will be clear that the above results, (2.50) and (2.51), are approximate in the sense that
the second-order remainder is neglected. But these approximations are good enough if the
second-order remainder can be neglected to a sufficient degree. In this case also the op-
timality conditions of least-squares (unbiasedness, minimal variance) hold to a sufficient
degree. A summary of the linearized least-squares estimators is given in Table 2.3.
Table 2.3 Summary of the linearized least-squares estimators and their variance matrices; in particular Qê = Qy − Qŷ.
Example 7
Consider the configuration of figure 2.6. The x,y coordinates of the three
points 1, 2 and 3 are known and the two coordinates x4 and y4 of point 4 are
unknown. The observables consist of the three distance variates l14, l24, l34. Since distance and coordinates are related as

lij = [(xj − xi)² + (yj − yi)²]^{1/2} = (xij² + yij²)^{1/2}

the model of observation equations for the configuration of figure 2.6 reads
E{(l14; l24; l34)} = ((x14² + y14²)^{1/2}; (x24² + y24²)^{1/2}; (x34² + y34²)^{1/2})   (2.52)
This model consists of three nonlinear observation equations in the two un-
known parameters x4 and y4 .
In order to linearize (2.52) we need approximate values for the unknown
coordinates x4 and y4. These approximate values will be denoted as x4^0 and y4^0.

Figure 2.6 Configuration with the known points 1, 2, 3, the unknown point 4 and the distances l14, l24, l34; xij = xj − xi, yij = yj − yi.

Linearization of (2.52) then gives

E{(Δl14; Δl24; Δl34)} = [x14^0/l14^0, y14^0/l14^0; x24^0/l24^0, y24^0/l24^0; x34^0/l34^0, y34^0/l34^0] (Δx4; Δy4)   (2.53)

where

Δli4 = li4 − li4^0 , li4^0 = [(x4^0 − xi)² + (y4^0 − yi)²]^{1/2} , i = 1, 2, 3
Δx4 = x4 − x4^0 , Δy4 = y4 − y4^0
Example 8
Consider the nonlinear A-model (2.40) of example 6. The unknown parame-
ters are α and ui and vi for i = 1, 2. The approximate values of these param-
eters will be denoted as α 0 , u0i , v0i , for i = 1, 2. Linearization of (2.40) gives
then
E{(Δx1; Δy1; Δx2; Δy2; Δu1; Δv1; Δu2; Δv2)} = ∂x A(x0) (Δα; Δu1; Δv1; Δu2; Δv2)   (2.54)

with the rows of the 8 × 5 matrix ∂x A(x0) given by

[−u1^0 sin α^0 − v1^0 cos α^0 , cos α^0 , −sin α^0 , 0 , 0]
[ u1^0 cos α^0 − v1^0 sin α^0 , sin α^0 ,  cos α^0 , 0 , 0]
[−u2^0 sin α^0 − v2^0 cos α^0 , 0 , 0 , cos α^0 , −sin α^0]
[ u2^0 cos α^0 − v2^0 sin α^0 , 0 , 0 , sin α^0 ,  cos α^0]
[0 , 1 , 0 , 0 , 0]
[0 , 0 , 1 , 0 , 0]
[0 , 0 , 0 , 1 , 0]
[0 , 0 , 0 , 0 , 1]
where

Δxi = xi − xi^0 , xi^0 = ui^0 cos α^0 − vi^0 sin α^0
Δyi = yi − yi^0 , yi^0 = ui^0 sin α^0 + vi^0 cos α^0
Δui = ui − ui^0 , Δvi = vi − vi^0 , for i = 1, 2
Δα = α − α^0
One may expect that the new approximation x1 = x0 + Δx̂ is a better approximation than x0. That is, it seems reasonable to expect that x1 is closer
to the true least-squares estimate than x0 . In fact one can show that this is indeed the
case for most practical applications. But if x1 is a better approximation than x0 , a further
improvement can be expected if we replace x0 by x1 in the linearization of the nonlinear
model. The recomputed linearized least-squares estimate reads then
By repeating this process a number of times, one can expect that finally the solution
converges to the actual least-squares estimate x̂. This is called the least-squares iteration
process. The iteration is usually terminated if the difference between successive solutions
is negligible. A flowdiagram of the least-squares iteration process is shown in Figure 2.7.
Figure 2.7 Flow diagram of the least-squares iteration process: initialize with x0 and i = 0, compute the update, and stop once the norm of the correction drops below the tolerance δ, after which x̂ := xi+1.
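The sketch below runs such an iteration for a distance-resection problem of the type of Example 7; the coordinates, observations and tolerance are made-up values, and the rows of the design matrix follow from the linearization of (2.52).

```python
import numpy as np

# Hypothetical known points 1, 2, 3 and made-up distance observations to unknown point 4
known = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]])
l_obs = np.array([72.14, 72.14, 69.31])
Qy = np.diag([1e-4, 1e-4, 1e-4])
W = np.linalg.inv(Qy)

x = np.array([40.0, 40.0])                       # approximate coordinates x4^0, y4^0
for _ in range(20):
    diff = x - known                             # rows (x4 - xi, y4 - yi)
    l0 = np.linalg.norm(diff, axis=1)            # computed distances l_i4^0
    J = diff / l0[:, None]                       # design matrix of the linearized model
    dl = l_obs - l0                              # observed minus computed: Δl
    dx = np.linalg.solve(J.T @ W @ J, J.T @ W @ dl)
    x = x + dx                                   # improved approximate values
    if np.linalg.norm(dx) < 1e-8:                # stop when successive solutions agree
        break
print(x)                                         # converges to roughly (50, 52) here
```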
2.5 The B-Model: condition equations

With m observation equations in n unknown parameters, the number of independent condition equations equals

r = m − n   (2.56)
This number is also referred to as the redundancy of the model. We will now show how
one can construct the condition equations, given the linear A-model (2.55). Each of the
column vectors of matrix A is an element of the observationspace Rm . Together, the n-
number of linearly independent column vectors of A span the range space of A. This range
space has dimension n and it is a linear subspace of Rm : R(A) ⊂ Rm . Since dimR(A) = n
and dim Rm = m, one can find (m − n) linearly independent vectors bi in Rm that are orthogonal to R(A):
bi ⊥ R(A) or AT bi = 0 , i = 1, . . ., (m − n)
From this follows, if the (m − n)-number of linearly independent vectors bi are collected
in an m × (m − n) matrix B as B = (b1, . . ., b_{m−n}), that

B^T A = 0 ; rank B = m − n   (2.57)

where B^T is of order (m − n) × m and A of order m × n, so that B^T A is of order (m − n) × n.
This result may now be used to obtain the model of condition equations from (2.55).
Premultiplication of the linear system of observation equations in (2.55) by BT gives
together with (2.57), the following linear model of condition equations:

B^T E{y} = 0 ; D{y} = Qy   (2.58)
Example 9
Consider the following linear A-model:
E{(y1; y2; y3)} = (1; 1; 1) x ; D{y} = Qy   (2.59)
The two vectors b1 = (1, −1, 0)^T and b2 = (0, 1, −1)^T are linearly independent and are both orthogonal to the column vector of
matrix A in (2.59). Hence the linear model of condition equations corresponding with (2.59) reads

[1, −1, 0; 0, 1, −1] E{(y1; y2; y3)} = (0; 0) ; D{y} = Qy   (2.60)
Table 2.4 Least-squares estimators in terms of the linear B-model: ŷ = [I − Qy B(B^T Qy B)^{-1} B^T] y ; ê = y − ŷ.
Now that we have the linear B-model (2.58) at our disposal, how are we going to
compute the corresponding least-squares estimators? We know how to compute the least-
squares estimators for the linear A-model. The corresponding formulae are however all
expressed in terms of the A-matrix. What is needed is therefore to transform these for-
mulae such that they are expressed in terms of the B-matrix. This is possible with the
following important matrix identity:
A(A^T Qy^{-1} A)^{-1} A^T Qy^{-1} = I − Qy B(B^T Qy B)^{-1} B^T   (2.61)
The proof of this matrix identity goes as follows. We define two matrices C and C̄ as:
C = (A , Qy B)  and  C̄ = ((A^T Qy^{-1} A)^{-1} A^T Qy^{-1} ; (B^T Qy B)^{-1} B^T)   (2.62)
Since both matrices C and C̄ are of the order m × m and since both can be shown to be of
full rank, it follows that both matrices are invertible. From (2.62) follows with the help of
(2.57) that C̄C = Im . Hence C̄ = C −1 and therefore CC̄ = Im . Substitution of (2.62) into
this last expression proves (2.61).
With (2.61) and the least-squares results of Table 2.2 of Section 2.3.4 we are now in the
position to derive the expressions for the least-squares estimators in terms of the matrix
B. The results are summarized in table 2.4.
For some applications it may happen, when formulating the condition equations, that
the linear combinations of the observables do not sum up to a zero vector, but instead to a
known nonzero vector b. In that case the model of condition equations reads
BT E{y} = b ; D{y} = Qy
Due to the fact that b ≠ 0, the solution for ŷ now takes a somewhat different form. It reads

ŷ = y − Qy B(B^T Qy B)^{-1}(B^T y − b)

Note that this solution reduces to the one given earlier, when b = 0.
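A sketch of the B-model computation of Table 2.4 for the situation of Example 9, with made-up observations: the adjusted observables ŷ follow from y, Qy and B alone, without ever forming the A-matrix.

```python
import numpy as np

# Example 9 setting: three observations of one quantity, two condition equations (made-up data)
y = np.array([5.012, 5.004, 5.009])
Qy = np.diag([1e-4, 1e-4, 1e-4])
B = np.array([[1.0, 0.0],
              [-1.0, 1.0],
              [0.0, -1.0]])                      # columns b1, b2 with B^T A = 0 for A = (1,1,1)^T

# Table 2.4: y_hat = [I - Qy B (B^T Qy B)^-1 B^T] y
P = np.eye(3) - Qy @ B @ np.linalg.inv(B.T @ Qy @ B) @ B.T
y_hat = P @ y
e_hat = y - y_hat
print(y_hat, e_hat)                              # y_hat equals the arithmetic mean here
```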
2.5.2 Nonlinear condition equations

The nonlinear B-model of condition equations reads

B^T(E{y}) = 0 ; D{y} = Qy   (2.63)

where B^T(.) is a nonlinear vector function from Rm into Rm−n. The relation between the
nonlinear B-model and the nonlinear A-model is given by
BT (A(x)) = 0 , ∀ x ∈ Rn (2.64)
This is the nonlinear generalization of (2.57). If we take the partial derivative with respect
to x of (2.64) and apply the chain rule, we get

[∂y B(y0)]^T ∂x A(x0) = 0 , with y0 = A(x0)   (2.65)

This is the linearized version of (2.64). Compare (2.65) with (2.57). With (2.65) we
are now in the position to construct the linearized B-model from the linearized A-model
(2.49). Premultiplication of (2.49) with the matrix [∂y B(y0 )]T gives together with (2.65)
the result

[∂y B(y0)]^T E{Δy} = 0 ; D{Δy} = Qy   (2.66)

This is the linearized B-model. With (2.66) we are now in the position again to apply our
standard least-squares estimation formulae.
2.6 Special Least-Squares procedures

In this section three special cases of least-squares estimation will be discussed. They
are: recursive least-squares, constrained least-squares and minimally constrained least-
squares. In particular the last two will be needed when we discuss the adjustment and
testing of geodetic networks. Constrained least-squares, which can be seen as a particular
form of recursive least-squares, deals with the problem of solving a model of observation
equations, when there are explicit constraints on the unknown parameters. As it is shown,
the solution can be obtained in two steps. In the first step the model is solved without the
constraints, while in the second step, the final solution is obtained by using the results of
the first step in a formulation based on condition equations.
Minimally constrained least-squares is needed when the design matrix fails to be of
full rank. This typically occurs in cases of free network adjustments. Due to the rank de-
fect of the design matrix, a set of necessary and sufficient constraints need to be imposed
on the parameter vector in order to be able to compute a solution. The set of constraints
that can be imposed however, is not unique. This implies that a whole family of solutions
can be computed, of which each member is characterized by a particular set of minimal
constraints.
2.6.1 Recursive least-squares

Consider the model of observation equations partitioned into two sets,

E{(y1; y2)} = (A1; A2) x , D{(y1; y2)} = [Q1, 0; 0, Q2]

Note that y1 and y2 are assumed to be uncorrelated. This assumption is essential in order
to be able to formulate a recursive solution. We will denote the least-squares solution
which is based on the first set of observation equations as x̂(1) and the least- squares
solution which is based on both sets, as x̂(2) . The least-squares solution which is based on
the first set of observation equations reads

x̂(1) = (A1^T Q1^{-1} A1)^{-1} A1^T Q1^{-1} y1 , Qx̂(1) = (A1^T Q1^{-1} A1)^{-1}
and the least-squares solution which is based on the complete set of observation equations,
reads
x̂(2) = (A1^T Q1^{-1} A1 + A2^T Q2^{-1} A2)^{-1} (A1^T Q1^{-1} y1 + A2^T Q2^{-1} y2) , Qx̂(2) = (A1^T Q1^{-1} A1 + A2^T Q2^{-1} A2)^{-1}   (2.69)
Comparing the two solutions shows that the second solution can also be written as
x̂(2) = (Qx̂(1)^{-1} + A2^T Q2^{-1} A2)^{-1} (Qx̂(1)^{-1} x̂(1) + A2^T Q2^{-1} y2) , Qx̂(2) = (Qx̂(1)^{-1} + A2^T Q2^{-1} A2)^{-1}   (2.70)
This result shows that the solution of the partitioned model can be obtained in two steps.
First one solves for the first set of observation equations. This gives x̂(1) and Qx̂(1). Then in the second step this result is used together with y2 to obtain x̂(2) and Qx̂(2). This recursive
procedure has an important implication in practice. It implies that if new observations,
say y2, become available one does not need to save the old observations y1 to compute x̂(2). One can compute x̂(2) from y2 and the solution x̂(1) of the first step. It will be clear that one can extend the recursion to more than two steps by using a partitioning of the
model of observation equations into more than two sets.
The expression now clearly shows how x̂(2) is found from updating x̂(1) . The correction to
x̂(1) depends on the difference y2 −A2 x̂(1) . Since A2 x̂(1) can be interpreted as the prediction
of E{y2 } based on y1 , the difference y2 − A2 x̂(1) is called the predicted residual of E{y2 }.
Note that the predicted residual is not the same as the least-squares residual. The least-
squares residual of E{y2} namely reads y2 − A2 x̂(2).
Note that in the above expressions we need to invert a matrix having an order which
is equal to the number of entries in the parameter vector x. One can however also derive
an expression for x̂(2) in which a matrix needs to be inverted which has an order equal to
the dimension of y2 . From the matrix identity
(Qx̂(1)^{-1} + A2^T Q2^{-1} A2)^{-1} A2^T Q2^{-1} = Qx̂(1) A2^T (Q2 + A2 Qx̂(1) A2^T)^{-1}

it follows that (2.70) can also be written as

x̂(2) = x̂(1) + Qx̂(1) A2^T (Q2 + A2 Qx̂(1) A2^T)^{-1} (y2 − A2 x̂(1))
Qx̂(2) = Qx̂(1) − Qx̂(1) A2^T (Q2 + A2 Qx̂(1) A2^T)^{-1} A2 Qx̂(1)   (2.72)
Both expressions (2.70) and (2.72) give identical results. But the second expression is
more advantageous than the first, when the dimension of y2 is small compared to the
dimension of x. In that case the order of the matrix that needs to be inverted is smaller.
The above expression shows that the correction to x̂(1) is small if the predicted residual
is small. This is also what one would expect. Also note that the correction is small if the
variance of x̂(1) is small. This is also understandable, because if the variance of x̂(1) is
small one has more confidence in it and one therefore would like to give more weight to
it than to y2 .
The above expression also shows how the variance matrix gets updated. Because of
the minus sign, the precision of the estimator gets better. This is understandable, since by
including y2 more information is available to estimate x.
The above results show how the least-squares solution of the parameter vector x can
be updated in recursive form. But this is not the only solution that can be updated recur-
sively. Also the weighted sum-of- squares of the least-squares residuals, can be updated
recursively. It can be shown that
ê^T Qy^{-1} ê = ê1^T Q1^{-1} ê1 + v2^T Q_{v2}^{-1} v2   (2.73)
where ê1 = y1 − A1 x̂(1) is the least-squares residual vector of the first step and v2 = y2 −
A2 x̂(1) is the predicted residual which becomes available in the second step.
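A numerical sketch (made-up data) showing that the recursive update (2.70) reproduces the batch solution (2.69).

```python
import numpy as np

# Hypothetical partitioned model (made-up values)
A1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]); Q1 = np.diag([1e-4] * 3)
A2 = np.array([[1.0, -1.0]]);                         Q2 = np.diag([2e-4])
y1 = np.array([1.001, 2.003, 3.002]); y2 = np.array([-1.004])

# Step 1: solution from the first set only
Qx1 = np.linalg.inv(A1.T @ np.linalg.inv(Q1) @ A1)
x1 = Qx1 @ A1.T @ np.linalg.inv(Q1) @ y1

# Step 2: recursive update (2.70), using x1 and Qx1 instead of y1
Qx2 = np.linalg.inv(np.linalg.inv(Qx1) + A2.T @ np.linalg.inv(Q2) @ A2)
x2 = Qx2 @ (np.linalg.inv(Qx1) @ x1 + A2.T @ np.linalg.inv(Q2) @ y2)

# Batch solution (2.69) from all observations at once gives the same result
N = A1.T @ np.linalg.inv(Q1) @ A1 + A2.T @ np.linalg.inv(Q2) @ A2
x_batch = np.linalg.solve(N, A1.T @ np.linalg.inv(Q1) @ y1 + A2.T @ np.linalg.inv(Q2) @ y2)
print(x2, x_batch)
```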
2.6.2 Constrained least-squares

In some applications there are explicit constraints on the parameter vector. In that case, we are dealing with a model of observation equations, with constraints on the parameter vector. Our model reads then

E{y} = Ax , D{y} = Qy , with B^T x = b   (2.74)

The equations of B^T x = b constitute the constraints, where matrix B and vector b are
assumed known. In order to solve the above model in a least-squares sense, we can make
use of the results of the previous section. First note that we may write (2.74) also as
E{(y; b)} = (A; B^T) x , D{(y; b)} = [Qy, 0; 0, Qb = 0]   (2.75)
This is again a model of observation equations which has been partitioned into two sets.
The only difference with the model treated in the previous section is that the variance
matrix corresponding to the second set of observation equations is zero (Qb = 0). This
variance matrix is set to zero, since it is assumed that the relations BT x = b are strictly
valid. Thus b, with sample value b, is interpreted as an observable having a zero variance
matrix.
Based on the results of the previous section, it follows that the solution of the above
model can also be obtained in two steps. The solution of the first step will be denoted as
x̂ and the solution of the second step as x̂b . The solution of the first step reads then
x̂ = (A^T Qy^{-1} A)^{-1} A^T Qy^{-1} y , Qx̂ = (A^T Qy^{-1} A)^{-1}   (2.76)
This is the solution one would get when the constraints are not taken into account. The
solution of the second step reads
x̂b = x̂ + Qx̂ B(B^T Qx̂ B)^{-1} (b − B^T x̂)   (2.77)
Qx̂b = Qx̂ − Qx̂ B(B^T Qx̂ B)^{-1} B^T Qx̂
This solution directly follows from using (2.72). Note that A2 and Q2 of the previous
section, correspond with BT and Qb = 0 of the present section.
The above result shows that also the solution of a constrained model of observation
equations can be obtained in two steps. The constraints are disregarded in the first step.
In the second step, the results of the first step are used together with the constraints. Note
that the solution of the second step is in fact the solution one would obtain when solving
for the model of condition equations BT E{x̂} = b , D{x̂} = Qx̂ .
As in the previous section, also the weighted sum-of-squares of the least-squares resid-
uals can be written as a sum. If we denote the least-squares residual vector for the constrained model as êb and its counterpart when the constraints are disregarded as ê, it follows
that
êb^T Qy^{-1} êb = ê^T Qy^{-1} ê + (b − B^T x̂)^T (B^T Qx̂ B)^{-1} (b − B^T x̂)   (2.78)
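A sketch of the two-step constrained adjustment (2.76)-(2.77), with made-up numbers and a single hypothetical constraint on the sum of the two parameters.

```python
import numpy as np

# Hypothetical unconstrained model (made-up values)
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Qy = np.diag([1e-4] * 3)
y = np.array([1.004, 2.001, 2.998])

# Constraint B^T x = b: here x1 + x2 = 3 exactly
B = np.array([[1.0], [1.0]])
b = np.array([3.0])

# Step 1 (2.76): unconstrained solution
W = np.linalg.inv(Qy)
Qx = np.linalg.inv(A.T @ W @ A)
x_hat = Qx @ A.T @ W @ y

# Step 2 (2.77): update the first-step result with the constraints
K = Qx @ B @ np.linalg.inv(B.T @ Qx @ B)
x_b = x_hat + K @ (b - B.T @ x_hat)
Qx_b = Qx - K @ B.T @ Qx
print(x_b, B.T @ x_b)                            # the constraint is satisfied exactly
```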
Remark: In some applications it may happen that we perform the adjustment with con-
straints on the parameters, although we know that these constraints are in fact not deter-
ministic but instead stochastic. Thus although we know that Qb ≠ 0, we then still perform the adjustment as if Qb = 0. This approach of course, will give a result which is less precise than formally claimed. Taking the actual variance matrix Qb of b into account, the variance matrix of the estimator x̂b reads

Qx̂b = Qx̂ − Qx̂ B(B^T Qx̂ B)^{-1} [B^T Qx̂ B − Qb] (B^T Qx̂ B)^{-1} B^T Qx̂   (2.79)
This result reduces to that of (2.77) when Qb = 0, showing that Qx̂b < Qx̂. This inequality is not guaranteed anymore however, when Qb ≠ 0. It then depends on whether B^T Qx̂ B > Qb holds true or not. This explains why it is of importance when connecting networks to existing control, to have an existing control which is of a better precision than the precision of the network connected to it. If this is not the case, then B^T Qx̂ B < Qb and thus Qx̂b > Qx̂, showing that the precision of the network after the connection is poorer than it was before the connection.
2.6.3 Minimally constrained least-squares

In this section we consider the situation in which the design matrix A of the model of observation equations

E{y} = Ax , D{y} = Qy   (2.80)

with y the m × 1 vector of observables, A the m × n design matrix, x the n × 1 parameter vector and Qy the m × m variance matrix, is not of full rank. Let us assume that the rank of matrix A is given as
rank A = r < n
Hence, the rank defect equals (n − r). This implies that there exist (n − r) linearly independent combinations of the column vectors of matrix A that produce the zero vector. These
linear combinations are said to span the null space of matrix A. The null space of A is
defined as
N(A) = {x ∈ Rn | Ax = 0}
It is a subspace of the parameter space Rn and its dimension equals (n − r). Let us assume
that the columns of the n × (n − r) matrix G span the null space of A. Then

R(G) = N(A)
Thus the range space of G equals the null space of A and the columns of matrix G form
the linear combinations that need to be taken of the columns of matrix A to obtain the zero
vector. Since AG = O, it follows that
E{y} = Ax = A(x + Gγ ) , with γ ∈ Rn−r
This shows that E{y} remains unchanged, when we add the vector Gγ , with γ ∈ Rn−r
arbitrary, to the vector x. Hence, E{y} is invariant to this type of change of the parameter vector. It will now be clear, since E{y} is insensitive to the above changes of the parameter vector, that one cannot expect the rank-deficient model of observation equations to have a unique solution. The information content of the measurements is simply not sufficient to determine x uniquely.
In practice one will meet such situations for instance, when adjusting so-called free
networks. Imagine the simple example of adjusting a single levelling loop. In this case
the levelled height differences constitute the measurements and the heights of the points
in the loop constitute the unknown parameters. It will be clear that one can not determine
heights from observed height differences only. The height differences will not change
when one adds a constant value to all the heights of the points in the levelling loop. This
shows that we need some additional information, in order to be able to solve for the
heights. For this simple example, the height of one of the points in the levelling loop
would suffice already.
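A sketch of this levelling situation with made-up height differences: the normal matrix of the rank-deficient model is singular, and fixing the height of one point, a minimal constraint, makes the solution computable.

```python
import numpy as np

# Levelling loop over points 1-2-3 (made-up height-difference observations)
# Observables: h2-h1, h3-h2, h1-h3; parameters: h1, h2, h3
A = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0]])
y = np.array([1.502, 0.998, -2.499])
Qy = np.diag([1e-6] * 3)
W = np.linalg.inv(Qy)

N = A.T @ W @ A
print(np.linalg.matrix_rank(N))          # rank 2 < 3: the normal matrix is singular

# Minimal constraint: fix the height of point 1 (B^T x = 0 with B = e1),
# equivalently keep only the columns of A for h2 and h3 (the matrix B_perp)
B_perp = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
Ab = A @ B_perp
beta = np.linalg.solve(Ab.T @ W @ Ab, Ab.T @ W @ y)
x_b = B_perp @ beta                      # heights with h1 held fixed at zero
print(x_b)
```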
Also for the general case, the lack of information in E{y} to determine x uniquely,
implies that additional information is needed. We will formulate the additional informa-
tion as constraints on the parameter vector x. Thus instead of (2.80), we will consider the
constrained model
E{y} = Ax , D{y} = Qy with BT x = 0 (2.81)
(note: For reasons of simplicity we have set the value of the constant vector b in the
constraints, equal to zero.) The matrix B is assumed to be of full rank. Thus the rows of
the matrix BT are assumed to be linearly independent.
When introducing the constraints, it is of importance to understand, that they are
merely used as a tool to allow us to be able to compute a solution for x (note: there are
other, but equivalent, ways to deal with a rank defect model of observation equations, for
instance by using the theory of generalized inverses. This however, is beyond the scope
of the present lecture notes). Since the constraints are only there to allow us to compute
a solution for x, the constraints should not interfere with the intrinsic properties of the
adjustment itself. In other words, the constraints should contain the information which
is minimally needed to eliminate the lack of information in E{y}. This implies that the
constraints should satisfy two conditions. First, the constraints should be such that indeed
a solution for x can be computed. This however, is not the only condition the constraints
should satisfy. If it would be the only condition, then B = I would be an admissible
choice. But this choice can not be allowed of course, since it overconstrains the solution.
In fact when B is chosen equal to the identity matrix, no adjustment would be necessary
anymore and the measurements would not contribute to the solution. This clearly, is not
acceptable. Thus the constraints should also satisfy a second condition, which is that the
least-squares solution of the measurements, ŷ, is invariant to the particular choice of the
constraints.
If we translate the above conditions in terms of linear algebra, it follows that the
constraints are admissible, if and only if the matrix B satisfies

R(B) ⊕ R(A^T) = Rn   (2.82)
Thus the range space of matrix B should be complementary to the range space of matrix
AT . Two spaces are said to be complementary, when the two spaces together span the
whole space of real numbers and their intersection only contains the zero vector. Since
the dimension of Rn equals n and the dimension of R(AT ) equals the rank of A, which is
rank A = r, it follows that the dimension of R(B) must equal
dimension R(B) = n − r
This shows that the number of linearly independent constraints that are admitted, equals
(n − r). This is also understandable if one considers the dimension of the null space of A.
That is, one needs as many constraints as there are dimensions in the null space of A.
An alternative, but equivalent formulation of the above condition (2.82) is as follows. Let the linearly independent columns of the n × r matrix B⊥ span the null space of matrix B^T. The above condition is then equivalent to

R(B⊥) ⊕ N(A) = Rn   (2.83)
Thus the range space of matrix B⊥ should be complementary to the null space of matrix
A. A direct consequence of this condition is that the matrix G, of which the columns span
N(A), and the matrix B⊥ , together form a square and full rank matrix. Thus the partitioned
matrix (B⊥ , G) is square and invertible. Since it is square and invertible, it may be used
to reparametrize the parameter vector x as
x = (B⊥, G) (β; γ)   (2.84)
This equation establishes a one-to-one relation between on the one hand x and on the other
hand β and γ. When we substitute (2.84) into (2.81), we obtain

E{y} = A B⊥ β , D{y} = Qy , with B^T G γ = 0   (2.85)

since AG = O and B^T B⊥ = O. Note that this reparametrization shows which part of the
unknown parameters can be determined from the measurements and which part is deter-
mined by the constraints. Since the matrix BT G is square and invertible, it directly follows
from the constraints that γ = 0 (note: the invertibility of BT G is a direct consequence of
(2.82) or (2.83)). Thus γ is determined by the constraints and β is the part that still needs
to be determined in a least-squares sense from the measurements. The least-squares solu-
tion for β reads
β̂ = [(B^⊥)^T A^T Q_y^{-1} A B^⊥]^{-1} (B^⊥)^T A^T Q_y^{-1} y
If we substitute this together with γ = 0 into (2.84), we obtain the least-squares solution
of the minimally constrained model (2.81) as
x̂_b = B^⊥ [(B^⊥)^T A^T Q_y^{-1} A B^⊥]^{-1} (B^⊥)^T A^T Q_y^{-1} y
Q_{x̂_b} = B^⊥ [(B^⊥)^T A^T Q_y^{-1} A B^⊥]^{-1} (B^⊥)^T        (2.86)
This is the unique least-squares solution of the minimally constrained model (2.81) and it
is one of the infinitely many least-squares solutions of the rank defect model (2.80). The
whole family of least-squares solutions of model (2.80) reads then
x̂ = x̂_b + G γ        (2.87)

Conversely, any least-squares solution x̂ of model (2.80) can be transformed back into the minimally constrained solution by means of

x̂_b = S_b x̂ , Q_{x̂_b} = S_b Q_x̂ S_b^T        (2.88)

The second equation has been obtained from applying the error propagation law to the first equation. This result shows that we only need to know the transformation matrix

S_b = I_n − G (B^T G)^{-1} B^T

and thus the matrices B and G, in order to obtain x̂_b from x̂. This transformation matrix
is a very famous one in geodesy and is known as the S-transformation. With the S-
transformation we are thus able to transform any least-squares solution to a minimally
constrained solution, specified by the matrix B. The matrix G, of which the columns span
the null space of A, depends on A and thus on the type of measurements which are included
in the model. For a levelling network for instance, the undetermined part will correspond
with a constant shift in the height of all points. For a triangulation network however, the
undetermined part will correspond to a translation, a rotation and a scale change of the
network. Angles do namely not contain information on the position, orientation and size
of the network.
One may wonder how the above results correspond to the results which were obtained
in the previous section for the case of constrained least-squares. In the previous section,
where matrix A was assumed to be of full rank, we showed that the least-squares solution
of the constrained model could be obtained in two steps. The unconstrained solution of
the first step, was used as input for the second step to finally come up with the constrained
solution. These two steps can also be recognized in case of the minimally constrained so-
lution. The first step corresponds then with x̂, which is an arbitrary least-squares solution
of model (2.80), and the second step corresponds then with (2.88).
Unbiasedness: In one of our earlier sections we claimed that the least-squares estimator
of the parameter vector is unbiased and thus that E{x̂} = x holds true. At that point
however, we still assumed the matrix A to be of full rank. So, what happens when the
matrix A is not of full rank? We know that in that case the measurements fail to contain
enough information to determine x uniquely. We may therefore suspect that in that case
the least-squares estimators will also fail to be unbiased estimators of the parameter vector
x. And indeed, when we take the expectation of (2.86), we obtain
E{x̂b } = (B⊥ )[(B⊥ )T AT Q−1 ⊥ −1 ⊥ T T −1
y A(B )] (B ) A Qy E{y}
= (B )[(B ) A Qy A(B⊥ )]−1 (B⊥ )T AT Q−1
⊥ ⊥ T T −1
y Ax = x
Now that we have developed most of the adjustment theory that will be needed in these
lecture notes, let us pause for a few moments and reflect on the various steps involved
when an adjustment is carried out.
Formulate model: Before any adjustment can be carried out, one first must have a clear
idea of: (1) the observables that are going to be used, (2) their assumed relation to the
unknown parameters and (3) their assumed precision. Thus first one must be able to
formulate the model of observation equations
E{y} = Ax , D{y} = Qy (2.91)
This model consists of the functional model E{y} = Ax, which is specified through the
m×n design matrix A, and of the stochastic model D{y} = Qy , which is specified through
the m × m variance matrix Qy . In the functional model the link is established between the
measurements and the unknown parameters. It captures the geometry of the network. In
the stochastic model, one specifies the precision of the measurements. It depends on the
type of measurement equipment used and on the measurement procedures used.
Adjustment: Based on the above model and on an available sample of y, the mea-
surements, one can commence with the actual adjustment. Using the principle of least-
squares, the estimator for the unknown parameter vector x reads
x̂ = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} y        (2.92)
Here it has been assumed that the design matrix is of full rank. If this is not the case, then
the approach of using minimal constraints needs to be used.
If we assume that y is normally distributed, then its probability density function is
completely specified by its first two moments, being its expectation and its dispersion.
This then also holds true for the probability density function of the estimator x̂. The qual-
ity of this estimator can thus be characterized by its expectation E{x̂} and its dispersion
D{x̂}. For its expectation we have
E{x̂} = x (2.93)
Loosely speaking, this implies that if the adjustment is repeated a sufficient number of
times, each time with a new sample from y, that the various outcomes for x̂ would on
the average coincide with x. This property of unbiasedness is an important one and it is
automatically fulfilled when the least-squares principle is used, provided of course that
the model which forms the basis of the adjustment is correct. In the next chapter we will
see what happens when the model is specified incorrectly.
Precision: For the moment we will assume that the model used is correct and thus that the
estimator x̂ is unbiased. In that case, the quality of the estimator is completely captured
by its dispersion, being the variance matrix Q_x̂. Thus, under the provision that the least-squares estimator is unbiased, we may judge its quality on the basis of its variance matrix

Q_x̂ = (A^T Q_y^{-1} A)^{-1}        (2.94)

The variance matrix is said to describe the precision of the estimator. It describes the
amount of variability in samples of x̂ around its mean.
Since the variance matrix depends on A and on Qy , one can change Qx̂ by changing
A and/or Qy . This thus gives us a way of improving the precision of the least-squares
solution. For instance, if for a certain application the precision turns out to be not good
enough, one may decide to use more precise measurement equipment. In that case Qy
changes for the better and also Qx̂ will then change for the better. In many applications
however, one will not have too much leeway in choosing from different sets or types of
measurement equipment. In that case one depends on A for improving the precision (note:
for that reason, matrix A is also often referred to as the design matrix). There are two ways
in which A can be changed. One can change the dimension of the matrix and/or one can
change its structure. For instance, when the precision of a geodetic network is not good
enough one can decide to add more observations to the network. In that case, matrix A is
extended row wise. However, it may also be possible that a mere change in the geometry
of the network already suffices to improve its precision. In that case, it is the structure of
A that changes, while its dimension stays the same.
Precision testing: Although we know how to change Q_x̂ for the better or the worse, we of course still need a way of deciding when the precision, as expressed by the variance matrix, is good enough. It will be clear that this depends very much on the particular application at hand. What is important, though, is that one has a precision criterion available by which the precision of the least-squares solution can be judged. The following are some approaches that can be used for testing the precision of the solution.
It may happen in a particular application that one is only interested in one particular function of the parameters, say θ = f^T x. In that case it becomes very easy to judge its quality. One then simply has to compare its variance with the given criterion

σ_θ̂² = f^T Q_x̂ f < criterion        (2.95)
The situation becomes more difficult when the application at hand requires that the precision of more than one function be judged. Let us assume that the functions of interest are given as θ = F^T x, where F is a matrix. The corresponding variance matrix is then given as Q_θ̂ = F^T Q_x̂ F. One way to judge the precision in this case is by inspecting the variances and covariances of the matrix Q_θ̂. An alternative way would be to use the average precision as precision measure. In that case one relies on the trace of the variance matrix
(1/p) trace(F^T Q_x̂ F) < criterion        (2.96)
where p is the dimension of the vector θ. When using the trace one has to be aware of the fact that one is not taking all the information of the variance matrix Q_θ̂ into account. It depends only on the variances and not on the covariances. Hence, the correlation between the entries of θ̂ is then not taken into account. When using the trace one should also make sure that it makes sense to speak of an average variance. In other words, the entries of θ should be of the same type of variable. It would not make sense to use the trace when θ contains completely different variables, each having its own physical dimension.
The trace of Q_θ̂ equals the sum of its eigenvalues. Instead of the sum of eigenvalues, one might decide that it suffices to consider the largest eigenvalue λ_max only,

λ_max(F^T Q_x̂ F) < criterion        (2.97)

In that case one is thus testing whether the function of θ which has the poorest precision still passes the precision criterion. When this test is passed successfully, one knows that all other functions of θ will also have a precision which is better than the criterion. For some applications this may be an advantage, but it could be a disadvantage as well. It could be a disadvantage in the sense that the above test could be overly conservative. That is, when the function having the poorest precision passes the above test, all other functions, which by definition have a better precision, may turn out to be unnecessarily precise.
So far we assumed that the precision criterion was given in scalar form. But this need not be the case. The precision criterion could also be given in matrix form. In that case one is working with a criterion matrix, which we will denote as C_x. The precision test then amounts to testing whether the precision as expressed by the actual variance matrix Q_θ̂ = F^T Q_x̂ F is better than or as good as the precision expressed by the criterion matrix F^T C_x F. Also this test can be executed by means of solving an eigenvalue problem, but now it will be a generalized eigenvalue problem

| F^T Q_x̂ F − λ F^T C_x F | = 0        (2.98)
Note that when the matrix F^T C_x F is taken as the identity matrix, the largest eigenvalue of (2.98) reduces to that of (2.97). Using the generalized eigenvalue problem is thus indeed more general than the previously discussed approaches. It is characterized by the fact that it allows one to compare the actual variance matrix Q_θ̂ directly with its criterion. The two matrices F^T Q_x̂ F and F^T C_x F are identical when all eigenvalues equal one, and all functions of θ̂ have a better precision than the criterion when the largest generalized eigenvalue is less than one.
So far we assumed the matrix A to be of full rank. In many geodetic applications, however, this is not the case. We know from our section on minimally constrained least-squares that the variance matrix Q_x̂ will not be unique when A has a rank defect. In that case the variance matrix depends on the chosen set of minimal constraints. Since these constraints should not affect our conclusion when evaluating the precision, we have to make sure that our procedure of precision testing is invariant to the minimal constraints. This is possible when we make use of the S-transformations. Let C_x be the criterion matrix and Q_{x̂_b} the variance matrix of a minimally constrained solution. The eigenvalues of the generalized eigenvalue problem

| Q_{x̂_b} − λ S_b C_x S_b^T | = 0

where S_b is the S-transformation that corresponds to the minimal constraints of Q_{x̂_b}, are then invariant to the chosen set of minimal constraints. Thus when the largest eigenvalue is less than one, it is guaranteed that all functions of x̂_b have a precision which is better than they would have were Q_{x̂_b} replaced by the criterion matrix.
Chapter 3

Testing and reliability
3.1 Introduction
In the previous chapter we introduced the model of observation equations

E{y} = Ax , D{y} = Q_y        (3.1)

and showed how a solution for the unknown parameter vector x, based on the least-squares
principle, could be obtained. The least-squares solution x̂ is optimal in the sense that it is
unbiased and that it is of minimal variance in the class of linear unbiased estimators. These
optimality properties only hold true however, when the model (3.1) is correct. There are
many ways in which the model could have been misspecified. The functional model
could be wrong, in which case E{y} = Ax. As a consequence, the least-squares estimator
becomes biased, E{x̂} = x. Or, the stochastic model could be wrong, in which case
D{y} = Qy . As a consequence, the property of minimal variance is lost.
The topic of the present chapter is to present ways of checking the validity of the
above model. We start off in section 3.2, by discussing the basic concepts of hypothesis
testing. We consider the general form of a null hypothesis and an alternative hypothesis,
show what a test between the two implies and discuss the two types of errors one can make. Following these basic concepts, we move on in section 3.3 to the testing of the above model against the alternative model

E{y} = Ax + C∇ , D{y} = Q_y        (3.2)

Note that the two models only differ in their functional model. Hence we restrict ourselves
in these lecture notes to misspecifications in the mean of y. In geodetic practice these are
by far the most frequently occurring type of model errors (e.g. measurement blunders,
unaccounted systematic effects). In section 3.3, we present and discuss the test statistic
T q through which the model (3.1) can be tested against the alternative (3.2).
In most practical applications, it is usually not only one model error one is concerned
about, but quite often many more than one. To each of these model errors there belongs
a vector C∇, with the matrix C specifying how the model error is related to the vector
of observables. As a result one is not dealing with only one alternative hypothesis, but
with as many alternative hypotheses as there are model errors one is willing to consider.
This implies that one needs a testing procedure for handling the various alternative hy-
potheses. In section 3.4 such a procedure is presented. It consists of a detection step, an
identification step and an adaptation step.
Just like the results of an adjustment are not exact (that is, not deterministic, but
stochastic), also the results of the statistical tests are not exact. The confidence one will
have in the outcomes of the statistical tests depends in a large part on the ’strength’ of the
model. A practical way of diagnosing this confidence is provided for by the concept of
reliability. It is introduced in section 3.5. Reliability together with precision, can then be
considered to describe the quality which one can expect of the results of an adjustment
and testing. They are both considered in the last section of this chapter.
Example 10
According to the postulated theory or hypothesis the three points 1, 2 and 3
of figure 3.1 lie on one straight line. In order to test or verify this hypothesis
[Figure 3.1: the three points 1, 2 and 3 on a straight line, with the distances l12, l23 and l13 between them.]
we need to design an experiment such that its outcome can be compared with
the theoretically predicted value.
If the postulated hypothesis is correct, the three distances l12, l23 and l13 should satisfy the relation

H : l12 + l23 − l13 = 0

Thus we can measure the three distances and check whether the measured values satisfy this relation. In case of agreement we are inclined to accept the hypothesis that the three points lie on one straight line. In case
of disagreement we are inclined to reject hypothesis H.
It will be clear that in practice the testing of hypotheses is complicated by the fact that
experiments (in particular experiments where measurements are involved) in general do
not give outcomes that are exact. That is, experimental outcomes are usually affected by
an amount of uncertainty, due for instance to measurement errors. In order to take care
of this uncertainty, we will, in analogy with our derivation of Adjustment Theory in the
previous chapter, model the uncertainty by making use of the results from the theory of
random variables. The verification or testing of postulated hypotheses will therefore be
based on the testing of hypotheses of random variables of which the probability distri-
bution depends on the theory or hypothesis postulated. From now on we will therefore
consider statistical hypotheses.
Example 10 (continued)
We know from experience that in many cases the uncertainty in geodetic mea-
surements can be adequately modeled by the normal distribution. We there-
fore model the three distances between the three points 1, 2 and 3 as normally
distributed random variables.¹ If we also assume that the three distances are uncorrelated and all have the same known variance σ² = 1/3, the simultaneous probability density function of the three distance observables becomes:
(l_13, l_12, l_23)^T ∼ N( (E{l_13}, E{l_12}, E{l_23})^T , Q ) with Q = (1/3) I_3        (3.5)
The notation N(a, B) is a shorthand notation for the normal distribution, having a as mean and B as variance matrix. Statement (3.5) could already be considered a statistical hypothesis.
¹ Note that strictly speaking distances can never be normally distributed. A distance is always nonnegative, whereas the normal distribution, due to its infinite tails, admits negative sample values.
However, we cannot make this relation hold for the random variables l 12 , l 23
and l 13 . This is simply because of the fact that random variables cannot be
equal to a constant. Thus, a statement like: l 12 + l 23 − l 13 = 0 is nonsensical.
What we can do is assume that relation (3.6) holds for the expected values of
the random variables l12, l23 and l13:

E{l12} + E{l23} − E{l13} = 0        (3.7)
For the hypothesis considered this relation makes sense. It can namely be
interpreted as stating that if the measurement experiment were to be repeated
a great number of times, then on the average the measurements will satisfy
(3.7). With (3.5) and (3.7) we can now state our statistical hypothesis as:
H : (l_13, l_12, l_23)^T ∼ N( (E{l_13}, E{l_12}, E{l_23})^T , (1/3) I_3 )        (3.8)
This hypothesis has the same structure as (3.4) with the three means playing
the role of the parameter x.
In many hypothesis-testing problems two hypotheses are discussed: The first, the
hypothesis being tested, is called the null hypothesis and is denoted by H0 . The second
is called the alternative hypothesis and is denoted by HA . The thinking is that if the null
hypothesis H0 is false, then the alternative hypothesis HA is true, and vice versa. We often
say that H0 is tested against, or versus, HA .
In studying hypotheses it is also convenient to classify them into one of two types by
means of the following definition: if a hypothesis completely specifies the distribution,
that is, if it specifies its functional form as well as the values of its parameters, it is called
a simple hypothesis; otherwise it is called a composite hypothesis.
Example 10 (continued)
In our example, (3.8) is the hypothesis to be tested. Thus, the null hypothesis
reads in our case:
H0 : (l_13, l_12, l_23)^T ∼ N( (E{l_13}, E{l_12}, E{l_23})^T , (1/3) I_3 )        (3.9)
Critical region K: The critical region K of a test is the set of sample values of y for which
H0 is to be rejected. Thus, H0 is rejected if y ∈ K.
It will be obvious that we would like to choose a critical region so as to obtain a test
with desirable properties, that is, a test that is ”best” in a certain sense. But let us first
have a look at a simple testing problem for which, on more or less intuitive grounds, an
acceptable critical region can be found.
Example 11
Let us assume that a geodesist measures a scalar variable, and that this mea-
surement can be modeled as a random variable y with density function
p_y(y) = (1/√(2π)) exp[ −½ (y − E{y})² ]        (3.11)
Thus, it is assumed that y has a normal distribution with unit variance. Al-
though this assumption constitutes a statistical hypothesis, it will not be
tested here because the geodesist is quite certain of the validity of this as-
sumption. The geodesist is however not certain about the value of the expec-
tation of y. His assumption is that the value of E{y} is x0 . This assumption
is the statistical hypothesis to be tested. Denote this hypothesis by H0 . Then,
H0 : E{y} = x0        (3.12)

and the alternative hypothesis reads

HA : E{y} ≠ x0        (3.13)
Thus the problem is one of testing the simple hypothesis H0 against the com-
posite hypothesis HA . To test H0 , a single observation on the random variable
y is made. In real-life problems one usually takes several observations, but to
avoid complicating the discussion at this stage only one observation is taken
here. On the basis of the value of y obtained, denoted by y, a decision will
be made either to accept H0 or reject it. The latter decision, of course, is
equivalent to accepting HA . The problem then is to determine what values
of y should be selected for accepting H0 and what values for rejecting H0 . If
a choice has been made of the values of y that will correspond to rejection,
then the remaining values of y will necessarily correspond to acceptance. As
defined above, the rejection values of y constitute the critical region K of the
test. Figure 3.2 shows the distribution of y under H0 and under two possible
alternatives HA1 and HA2. Looking at this figure, it seems reasonable to reject H0 if the observation y is remote enough from E{y} = x0. If H0 is true, the observation will namely most likely fall close to x0.

[Figure 3.2: the distribution of y under H0 (with mean E{y} = x0) and under the two alternatives HA1 and HA2; the critical region K consists of the two outer regions (reject), with the acceptance region in between.]
Table 3.1 shows the decision table with the type I and II errors.
The size of a type I error is defined as the probability that a sample value of y falls in
the critical region when in fact H0 is true. This probability is denoted by α and is called
the size of the test or the level of significance of the test. Thus:

α = P(y ∈ K | H0) = ∫_K p_y(y | H0) dy        (3.14)

The size of the test, α, can be computed once the critical region K and the probability density function of y under H0 are known.

Table 3.1: The decision table with the type I and type II errors.

                          H0 true                  H0 false
  Reject H0 (y ∈ K)       wrong: type I error      correct
  Accept H0 (y ∉ K)       correct                  wrong: type II error

The size of a type II error is defined as the probability that a sample value of y falls outside the critical region when in fact H0 is false. This probability is denoted by β. Thus:

β = P(y ∉ K | HA) = ∫_{y ∉ K} p_y(y | HA) dy

or

β = P(y ∉ K | HA) = 1 − ∫_K p_y(y | HA) dy        (3.15)

The size of a type II error, β, can be computed once the critical region K and the probability density function of y under HA are known.
Example 12
Assume that y is distributed as

y ∼ N(E{y}, σ²)        (3.16)

and that the two hypotheses read

H0 : E{y} = x0        (3.17)

and

HA : E{y} = xA        (3.18)

[Figure: the distributions of y under H0 (located at x0) and under HA (located at xA).]

where the values of x0, xA and σ are assumed known. In the present example the form of the critical region has been chosen
right-sided. Its location is determined by the value of kα , the so-called critical
value of the test. Thus, for the present example the size of the test can be
computed as
α = ∫_{kα}^{∞} p_y(y | x0) dy

or, since

p_y(y | x0) = (1/(√(2π) σ)) exp[ −½ (y − x0)²/σ² ]

as

α = ∫_{kα}^{∞} (1/(√(2π) σ)) exp[ −½ (y − x0)²/σ² ] dy        (3.19)

This integral can be evaluated with the help of the table of the standard normal distribution. Note namely that the random variable

z = (y − x0)/σ        (3.20)

is standard normally distributed under H0. And since

α = P(y > kα | H0) = P( z > (kα − x0)/σ | H0 )        (3.21)

we can use the last expression of (3.21) for computing α. Application of the change of variables (3.20) to (3.19) gives

α = ∫_{(kα − x0)/σ}^{∞} (1/√(2π)) exp[ −½ z² ] dz        (3.22)
[Figure: two possible locations of the critical value kα for the right-sided test: (a) kα between x0 and xA, and (b) kα to the left of x0; in both cases the critical region K (reject) lies to the right of kα and the acceptance region to its left.]
We can now make use of the table of the standard normal distribution. Table 3.2 shows some typical values of α and kα for the case that x0 = 1 and σ = 2.
As we have seen the location of the critical region K is determined by the
value chosen for kα , the critical value of the test. But what value should
we choose for kα ? Here the geodesist should base his judgement on his
experience. Usually one first makes a choice for the size of the test, α , and
then by using (3.22) or Table 3.2 determines the corresponding critical value
kα . For instance, if one fixes α at α = 0.01, the corresponding critical value
kα (for the present example with x0 = 1 and σ = 2) reads kα = 5.64. The
choice of α is based on the probability of a type I error one is willing to
accept. For instance, if one chooses α as α = 0.01, one is willing to accept
that 1 out of 100 experiments leads to rejection of H0 when in fact H0 is
true.
Let us now consider the size of a type II error, β. Figure 3.6 shows for the present example the size of a type II error, β. It corresponds to the area under the graph of the distribution of y under HA for the interval complementary to the critical region K.

Table 3.2: Some typical values of α, (kα − x0)/σ and kα for the case x0 = 1 and σ = 2.

  α        (kα − x0)/σ      kα
  0.1      1.28             3.56
  0.05     1.65             4.30
  0.01     2.32             5.64
  0.001    2.98             6.96

[Figure 3.6: The sizes of the type I and type II errors, α and β, for testing H0 : E{y} = x0 versus HA : E{y} = xA > x0; β is the area under the density of y under HA to the left of kα, α the area under the density of y under H0 to the right of kα.]

The size of a type II error, β, can be computed once the critical region K and the probability density function of y under HA are known:

β = P(y ∉ K | HA) = ∫_{−∞}^{kα} p_y(y | xA) dy
or, since
p_y(y | xA) = (1/(√(2π) σ)) exp[ −½ (y − xA)²/σ² ]

as

β = ∫_{−∞}^{kα} (1/(√(2π) σ)) exp[ −½ (y − xA)²/σ² ] dy        (3.23)
Also this value can be computed with the help of the table of the standard normal distribution. But first some transformations are needed. It will be clear that the probability that a sample or observation of y falls in the critical region K when in fact HA is true, equals 1 − β:

1 − β = P(y ∈ K | HA) = ∫_{kα}^{∞} (1/(√(2π) σ)) exp[ −½ (y − xA)²/σ² ] dy

This formula has the same structure as (3.19). The value 1 − β can therefore be computed in exactly the same manner as the size of the test, α, was computed. And from 1 − β it is trivial to compute β, the size of the type II error.
a From the nature of the experimental data and the consideration of the assertions that are to be examined, identify the appropriate null hypothesis H0 and alternative hypothesis HA.
b Determine the critical region K. In practice this implies that one determines a function of y, the test statistic, and that H0 is rejected whenever its value exceeds a chosen critical value. The values of y for which this happens form the critical region.
c Specify the size of the type I error, α, that one wishes to assign to the testing process. Use tables to determine the location of the critical region K from

α = P(y ∈ K | H0) = ∫_K p_y(y | x0) dy
For many distributions, like the normal distribution or the Chi-squared distribution,
these tables can be found in standard textbooks on statistics.
d Check, if possible, the size of the type II error, β, to ensure that there exists a reasonable protection against type II errors. Its complement, γ = 1 − β, is known as the detection probability or as the power of the test.
e After the test has been explicitly formulated, determine whether the sample or ob-
servation y of y falls in the critical region K or not. Reject H0 if y ∈ K, and accept
H0 if y ∈/K. Never claim however that the hypotheses have been proved false or
true by the testing.
Let us now return to the model of observation equations

E{y} = Ax , D{y} = Q_y
We have seen that in order to be able to apply the least-squares principle, only the first two moments of the random vector of observables, y, need to be specified. The first moment
(mean) E{y} = Ax and the second moment (variance matrix) D{y} = Qy . In the case of
statistical testing however, this is not sufficient. In addition, one will have to specify the
type of probability function of y as well. Since most observational data in geodesy can
be modelled as samples drawn from a normal distribution, we will assume that y has the
normal probability density function
p_y(y | x) = (2π)^{−m/2} |Q_y|^{−1/2} exp[ −½ (y − Ax)^T Q_y^{-1} (y − Ax) ]
Thus we assume that y is normally distributed, with mean Ax and with the variance matrix Q_y. In shorthand notation

Ho : y ∼ N(Ax, Q_y)        (3.26)

and, correspondingly, for the alternative hypothesis

Ha : y ∼ N(Ax + C∇, Q_y)        (3.27)

If we now write the vector of observables as

y = Ax + e        (3.28)

then

E{e | Ho} = 0 and E{e | Ha} = C∇ ≠ 0        (3.29)

Thus the mean of the residual vector e will be zero when Ho is true and unequal to zero
when Ho is false. This shows that if we would have a sample or measurement of e avail-
able, we could use it to decide on the validity of Ho . Would the sample be close to zero,
we would be inclined to accept Ho and would it differ greatly from zero, we would be
inclined to reject Ho . Unfortunately, no sample of e = y − Ax is available, since x is un-
known. Instead of considering e, let us therefore consider its estimator. The least-squares
solution of x and e under Ho , reads
x̂ = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} y
ê = [ I_m − A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} ] y
This shows, when we take the expectation and use (3.29), that
Ho : E{x̂} = x , E{ê} = 0        Ha : E{x̂} ≠ x , E{ê} = [ I_m − A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} ] C∇ ≠ 0        (3.30)
Thus apart from the mean of e, also the mean of ê is zero when Ho is true and nonzero
when Ho is false. Note that in the first case, x̂ is an unbiased estimator of x, while it
becomes biased when Ho is false. Contrary to e, we do have a sample available of ê, since
it is a function of y. Hence, instead of using e, we could use ê to decide on the validity of
Ho . If the sample of ê is close to zero, we are inclined to accept Ho and if it differs greatly
from zero, we are inclined to reject Ho .
The least-squares model error: Instead of using the least-squares residual vector ê, it
also seems intuitively clear that the model error ∇ itself must be instrumental in deciding
on the validity of Ho. Under Ha, we have

Ha : E{y} = Ax + C∇ , D{y} = Q_y        (3.31)

The model error ∇ itself is unknown of course. Let us therefore consider its least-squares estimator ∇̂. Since it is a function of y, we do have a sample value of it available. On the basis of this value we could also decide on the validity of Ho. If the sample value of ∇̂ is small, we are inclined to believe that the model error is absent and thus that Ho is true. On the other hand, if the sample value of ∇̂ turns out to be significant, we will certainly not be inclined to believe Ho and rather have more faith in Ha.
From the above discussion, it seems that both ê and ∇̂ can be used for the testing of
Ho against Ha . One can therefore expect that the two estimators must be related in some
way. And this is indeed the case. To show this, we first solve for ∇. The normal equations
that belong to Ha of (3.31), read
[ A^T Q_y^{-1} A    A^T Q_y^{-1} C ] [ x̂_a ]   [ A^T Q_y^{-1} y ]
[ C^T Q_y^{-1} A    C^T Q_y^{-1} C ] [ ∇̂   ] = [ C^T Q_y^{-1} y ]        (3.32)
where x̂a is given the subindex a to indicate that it is the least-squares estimator of x under
Ha and not the least-squares estimator of x under Ho . From the above normal equations,
the reduced normal equations for ∇̂ follow as
C^T Q_y^{-1} [ I_m − A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} ] C ∇̂ = C^T Q_y^{-1} [ I_m − A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} ] y
The least-squares estimator of the model error and its variance matrix follow therefore as

∇̂ = [ C^T Q_y^{-1} Q_ê Q_y^{-1} C ]^{-1} C^T Q_y^{-1} ê , Q_∇̂ = [ C^T Q_y^{-1} Q_ê Q_y^{-1} C ]^{-1}        (3.33)
The one dimensional case: Let us first consider the one dimensional case. In that case, q = 1 and ∇̂ becomes a scalar instead of a vector. The significance of ∇̂ can now be measured by using the precision of the estimator, thus by using its standard deviation σ_∇̂. We therefore divide ∇̂ by its standard deviation σ_∇̂ and define the random variable

w = ∇̂ / σ_∇̂        (3.35)
This random variable has a standard normal distribution under Ho . Thus under Ho , it has
a zero mean, with a variance of one. Under Ha however, it will have a nonzero mean, but
again with a variance of one. Thus
Ho : w ∼ N(0, 1) and Ha : w ∼ N(∇/σ_∇̂, 1)        (3.36)
Since the distribution of w is completely known under Ho, we are now in a position to test the significance of the model error. The model error is said to be significant, if

| w | > N_{α/2}(0, 1)        (3.37)

where N_{α/2}(0, 1) is the critical value of the standard normal distribution, based on the level of significance α (the test is two-sided). Note that instead of working with the absolute value of w, one can also work with its square. In that case, the test reads

w² > χ²_α(1, 0)        (3.38)

where χ²_α(1, 0) is the critical value of the central Chi-squared distribution having one degree of freedom.
The higher dimensional case: One cannot use the above test when q > 1. In that case we need to take the complete variance matrix Q_∇̂ into account to measure the significance of ∇̂. Instead of using the one dimensional test statistic w², we therefore use the q-dimensional test statistic

T_q = ∇̂^T Q_∇̂^{-1} ∇̂        (3.39)

Note that T_q = w², when q = 1. In the higher dimensional case, the model error is said to be significant, if

T_q > χ²_α(q, 0)        (3.40)

where χ²_α(q, 0) is the critical value of the central Chi-squared distribution, having q degrees of freedom.
Using the least-squares residual: We have seen that the least-squares estimator of the model error, ∇̂, can be written as a function of the least-squares residual vector ê. This implies that the above test statistic T_q can also be expressed in terms of ê. If we substitute (3.33) into (3.39), we get

T_q = ê^T Q_y^{-1} C [ C^T Q_y^{-1} Q_ê Q_y^{-1} C ]^{-1} C^T Q_y^{-1} ê        (3.41)

Although the two test statistics (3.39) and (3.41) are identical, the second expression is often the more practical one, since the least-squares residual vector ê is often already available from the adjustment based on Ho.
In the previous section we gave the test statistic for testing the null hypothesis Ho against
a particular alternative hypothesis Ha . In most practical applications however, it is usually
not only one model error one is concerned about, but quite often many more than one. This
implies that one needs a testing procedure for handling the various alternative hypotheses.
In this subsection we will discuss a way of structuring such a testing procedure. It will
consist of the following three steps: detection, identification and adaptation.
3.4.1 Detection
Since one usually first wants to know whether one can have any confidence in the assumed
null hypothesis without the need to specify any particular alternative hypothesis, the first
step consists of a check on the overall validity of Ho . This implies that one opposes the
null hypothesis to the most relaxed alternative hypothesis possible. The most relaxed
alternative hypothesis is the one that leaves the observables completely free. Hence, un-
der this alternative hypothesis no restrictions at all are imposed on the observables. We
therefore have the situation

Ho : E{y} = Ax versus Ha : E{y} ∈ R^m        (3.42)

Since E{y} ∈ R^m implies that matrix (A, C) is square and invertible, it follows that matrix C has q = m − n columns and that its range space is complementary to the range space of A. Thus R^m = R(A) ⊕ R(C). It can be shown that in this case, the test statistic of (3.41) simplifies to

T_{m−n} = ê^T Q_y^{-1} ê        (3.43)

The appropriate test statistic for testing the null hypothesis against the most relaxed alternative hypothesis is thus equal to the weighted sum-of-squares of the least-squares residuals. The null hypothesis will then be rejected when

T_{m−n} > χ²_α(m − n, 0)        (3.44)
The σ̂ 2 test: In the literature one often sees the above overall model test also formulated
in a slightly different way. Let us use the factorization Q_y = σ² G_y, where σ² is the variance factor of unit weight and where G_y is the corresponding cofactor matrix. It can be shown that

σ̂² = ê^T G_y^{-1} ê / (m − n)

is an unbiased estimator of σ². Thus E{σ̂²} = σ². The test (3.44) can now also be formulated as

σ̂² / σ² > χ²_α(m − n, 0) / (m − n) = F_α(m − n, ∞, 0)

where F_α(m − n, ∞, 0) is the critical value of the central F-distribution having m − n and ∞ degrees of freedom.
3.4.2 Identification
In the detection phase, one tests the overall validity of the null hypothesis. If this leads to
a rejection of the null hypothesis, one has to search for possible model misspecifications. That is, one will then have to try to identify the model error which caused the rejection of the null hypothesis. This implies that one will have to specify, through the matrix C, the type of likely model errors. This specification of possible alternative hypotheses is application dependent and is one of the more difficult tasks in hypothesis testing. It namely depends very much on one's experience which type of model errors one considers to be likely.
The 1-dimensional case: In case the model error can be represented by a scalar, q = 1
and matrix C reduces to a vector which will be denoted by the lowercase character c. This
implies that the alternative hypothesis takes the form
Ha : E{y} = Ax + c∇ (3.45)
The alternative hypothesis is specified, once the vector c is specified. The appropriate test
statistic for testing the null hypothesis against the above alternative hypothesis Ha follows
when the vector c is substituted for the C-matrix in (3.41). It gives

T_{q=1} = w² = (c^T Q_y^{-1} ê)² / (c^T Q_y^{-1} Q_ê Q_y^{-1} c)

or when the square root is taken

w = c^T Q_y^{-1} ê / √(c^T Q_y^{-1} Q_ê Q_y^{-1} c)        (3.46)
This test statistic has a standard normal distribution N(0, 1) under Ho . The evidence on
whether the model error as specified by (3.45) did or did not occur, is based on the test
| w | > N_{α₁/2}(0, 1)        (3.47)
Data snooping: Apart from the possibility of having a one dimensional test as (3.47), it is
standard practice in geodesy to always first check the individual observations for potential
blunders. This implies that the alternative hypotheses take the form
Hai : E{y} = Ax + c_i ∇ , i = 1, . . ., m        (3.48)

with

c_i = (0, . . ., 0, 1, 0, . . ., 0)^T
Thus ci is a unit vector having the 1 as its ith entry. The additional term ci ∇ models the
presence of a blunder in the ith observation. The appropriate test statistic for testing the
null hypothesis against the above alternative hypothesis Hai is again of the general form
of (3.46), but now with the c-vector chosen as ci ,
w_i = c_i^T Q_y^{-1} ê / √(c_i^T Q_y^{-1} Q_ê Q_y^{-1} c_i)        (3.49)
This test statistic has of course also a standard normal distribution N(0, 1) under Ho . By
letting i run from 1 up to and including m, one can screen the whole data set on the
presence of potential blunders in the individual observations. The test statistic w_i which returns the largest value in absolute sense then pinpoints the observation which is most likely corrupted with a blunder. Its significance is measured by comparing the value of the test statistic with the critical value. Thus the jth observation is suspected to have a blunder, when

| w_j | = max_i | w_i | and | w_j | > N_{α₁/2}(0, 1)        (3.50)

This procedure of screening each individual observation for the presence of a blunder is known as data snooping.
In many applications in practice, the variance matrix Q_y is diagonal. If that is the case, the expression of the above test statistic simplifies considerably. With a diagonal Q_y-matrix, we have

w_i = ê_i / σ_{ê_i}

The appropriate test statistic is thus equal to the least-squares residual of the ith observation divided by the standard deviation of that residual.
The higher dimensional case: It may happen that a particular model error can not be
represented by a single scalar. In that case q > 1 and ∇ becomes a vector. The appropriate
test statistic is then the one we met earlier, namely
T_q = ê^T Q_y^{-1} C [ C^T Q_y^{-1} Q_ê Q_y^{-1} C ]^{-1} C^T Q_y^{-1} ê        (3.51)
It is through the matrix C that one specifies the type of model error.
3.4.3 Adaptation
Once one or more likely model errors have been identified, a corrective action needs to be
undertaken in order to get the null hypothesis accepted. Here, one of the two following
approaches can be used in principle. Either one replaces the data or part of the data with
new data such that the null hypothesis does get accepted, or, one replaces the original
null hypothesis with a new hypothesis that does take the identified model errors into ac-
count. The first approach amounts to a remeasurement of (part of) the data. This approach is feasible, for instance, when in the case of data snooping some individual observations are identified as being potentially corrupted by blunders. These are then the observations which get remeasured. In the second approach no remeasurement is undertaken. Instead the model of the null hypothesis is enlarged by adding additional parameters such that all identified model errors are taken care of. Thus with this approach, the identified alternative hypothesis becomes the new null hypothesis.
Once the adaptation step is completed, one of course still has to make sure whether
the newly created situation is acceptable or not. This at least implies a repetition of the
detection step. When adaptation is applied, one also has to be aware of the fact that since
the model may have changed, also the ’strength of the model’ may have changed. In
fact, when the model is adapted through the addition of more explanatory parameters, the
model has become weaker in the sense that the test statistics will now have less detection
and identification power. That is, the reliability has become poorer. It depends on the
particular application at hand, whether this is considered acceptable or not.
3.5 Reliability
In the previous section we considered a testing procedure for the detection, identification
and adaptation of model errors. Hence, we now know how to search for potential model
errors and how to test their significance. But what we do not know yet, is how well these
tests will perform. In particular we would like to know how the tests perform in terms of
their power of detecting the model errors.
Note that the power γ depends on the three parameters α , q and λ . Using a shorthand
notation, we write
γ = γ (α , q, λ ) (3.56)
When testing, we of course would like to have a sufficiently high probability of correctly
detecting a model error when it occurs. One can make γ larger by increasing α , or, by
decreasing q, or, by increasing λ . This can be seen as follows. A larger α implies a
smaller critical value χα2 (q, 0) and via the integral (3.55) thus a larger power γ . Thus if
we want a smaller probability for the error of the first kind (α smaller), this will go at the
cost of a smaller γ as well. That is, one can not simultaneously decrease α and increase
γ.
The power γ also gets larger when q gets smaller. This is also understandable. When q gets smaller, the fewer additional parameters are used in Ha and therefore the more "information" is used in formulating Ha. For such an alternative hypothesis, one would expect
that if it is true, the probability of accepting it will be higher than for an alternative hy-
pothesis that contains more additional parameters. Finally, the power γ also gets larger,
when λ gets larger. This is understandable when one considers (3.53). For instance, one
would expect to have a higher probability of correctly rejecting the null hypothesis, when
the model error gets larger. And when ∇ gets larger, also the noncentrality parameter λ
gets larger.
Using α and/or q as tuning parameters to increase γ , does not make much sense how-
ever. The parameter q can not be changed at will, since it depends on the type of model
error one is considering. And increasing α , also does not make sense, since it would lead
to an increased probability of an error of the first kind. Hence, this leaves us with λ .
According to (3.53), the noncentrality parameter depends on the model error C∇, the design matrix A and the variance matrix Q_y. Since one also cannot change the model error at will, it is through changes in the variance matrix Q_y and/or in the design matrix A that one can increase λ, thereby improving the detection power of the test. Recall that it is also through these two matrices that one can improve the precision of the least-squares solution. Thus the detection power γ of the tests and the precision of the least-squares solution can be improved by the same means.
Minimal Detectable Biases (MDB’s): In the one dimensional case q = 1 and the matrix
C reduces to the vector c. Inverting (3.60) becomes then rather straightforward. As a
result we get for the size of the model error
| ∇ | = √( λ(α₁, 1, γ) / (c^T Q_y^{-1} Q_ê Q_y^{-1} c) )        (3.61)
The variate | ∇ | is known as the Minimal Detectable Bias (MDB). It is the size of the
model error that can just be detected with a probability γ , using the appropriate w-test
statistic. Larger errors will be detected with a larger probability and smaller errors with a
smaller probability.
In order to guarantee that the model error c∇ is detected with the same probability
by both the w-test and the overall model test, one will have to relate the critical values of
the two tests. This can be done by equalizing both the powers of the two tests and their
noncentrality parameters. Thus if λ (αm−n , m − n, γm−n) is the noncentrality parameter of
the T m−n -test statistic and λ (α1 , 1, γ1) the noncentrality parameter of the T 1 -test statistic,
we have
λ (αm−n , m − n, γ ) = λ (α1 , 1, γ ) (3.62)
From this relation, α_{m−n}, and thus the critical value of the overall model test, can be computed once the power γ and the level of significance α₁ are chosen. Common values in the case of geodetic networks are α₁ = 0.001 and γ = 0.80.
Data snooping: Note that the MDB can be computed for each alternative hypothesis,
once the vector c of that particular alternative hypothesis has been specified. In case of
data snooping, the MDB’s of the individual observations are computed as
| ∇_i | = √( λ(α₁, 1, γ) / (c_i^T Q_y^{-1} Q_ê Q_y^{-1} c_i) ) , i = 1, . . ., m        (3.63)
As with the w_i-test statistic, also this expression simplifies considerably when the variance matrix Q_y is diagonal. In that case we have

| ∇_i | = σ_{y_i} √( λ(α₁, 1, γ) / (1 − σ_{ŷ_i}²/σ_{y_i}²) )        (3.64)

where σ_{y_i}² is the a priori variance of the ith observation and σ_{ŷ_i}² is the a posteriori variance of this observation. We thus clearly see that a better precision of the observation, as well as a larger amount by which its precision gets improved by the adjustment, will improve the internal reliability, that is, will result in a smaller MDB.
The higher dimensional case: When q > 1, the inversion is a bit more involved. In this case ∇ is a vector, which implies that its direction needs to be taken into account as well. We use the factorization ∇ = ‖∇‖ d, where d is a unit vector (d^T d = 1). If we substitute this factorization into (3.60) and then invert the result, we get

‖∇‖ = √( λ(α_q, q, γ) / (d^T C^T Q_y^{-1} Q_ê Q_y^{-1} C d) ) , d a unit vector        (3.65)

The size of the model error now depends on the chosen direction vector d. But by letting d vary over the unit sphere in R^q, one can obtain the whole range of MDB's that can be detected with a probability γ.
This shows that the bias in x̂, due to the presence of the model error C∇, is

∇x̂ = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} C∇

Of course we do not know the actual model error C∇. But we do know the size of the model error that can be detected with a probability of γ. Hence, if we use the MDB and replace C∇ by c | ∇ | in case q = 1 or by C d ‖∇‖ in case q > 1, the vector ∇x̂ will show us by how much the least-squares estimator x̂ gets biased, when a model error of the size of the MDB would occur. Note that there are as many MDB's as there are alternative hypotheses. Hence, there are also as many vectors ∇x̂. This whole set is said to describe the external reliability.
In certain applications, one may not be interested in the whole parameter vector x, but only in particular functions of it, say θ = F^T x. In that case the external reliability is described by

∇θ̂ = F^T ∇x̂

A particular case occurs when one is only interested in the x₁-part of the parameter vector x = (x₁^T, x₂^T)^T. In that case F^T = (I, 0). The bias-vector ∇x̂₁ can then be shown to equal

∇x̂₁ = (Ā₁^T Q_y^{-1} Ā₁)^{-1} Ā₁^T Q_y^{-1} C∇

where, with the partitioning A = (A₁, A₂), the matrix Ā₁^T Q_y^{-1} Ā₁ is the reduced normal matrix and Ā₁ = [ I − A₂ (A₂^T Q_y^{-1} A₂)^{-1} A₂^T Q_y^{-1} ] A₁.
Bias-to-Noise Ratios (BNR's): In order to measure the significance of the external reliability, one can compare the bias-vectors ∇x̂ with the precision, or variance matrix Q_x̂, of x̂. This can be done by using the Bias-to-Noise Ratio (BNR)

λ_x̂ = ∇x̂^T Q_x̂^{-1} ∇x̂

Note that the BNR's are dimensionless and that they measure the squared lengths of the bias-vectors in the metric defined by the appropriate variance matrix. Also note that the BNR's are scalars. Thus λ_x̂ is a scalar, whereas ∇x̂ is a vector. This implies that with the BNR's, one only needs to evaluate a scalar per alternative hypothesis, whereas otherwise a complete vector would have to be evaluated for each alternative hypothesis.
From a computational point of view, there are also some shortcuts that can be used when computing the BNR's. For instance, when the complete vector ∇x̂ is considered, it can be shown that

λ_x̂ = ∇x̂^T Q_x̂^{-1} ∇x̂ = (C∇)^T Q_y^{-1} (C∇) − λ

The second expression on the right hand side may sometimes be easier to compute than the first expression on the right hand side, in particular when Q_y is diagonal. For the BNR
of a subset of the parameter vector, one can show that
An important feature of the BNR’s is that they can be used to formulate upperbounds
on the external reliability of functions of the parameters. For instance, if we consider the
function θ̂ = f T x̂ having the variance σθ̂2 and the bias ∇θ̂ = f T ∇x̂, then
| ∇θ̂ | ≤ σ_θ̂ √λ_x̂        (3.72)
This shows, that the potential bias in θ̂ due to an undetected model error of the size of the
MDB, will never be larger than the standard deviation times the square root of the BNR.
Connection of height networks: Consider two overlapping height networks, in which the heights h_i and H_i, i = 1, . . ., n, of the n common points have been observed in the first and the second network, respectively. The two height systems are assumed to differ by a scale factor λ and a translation t, so that the observation equations read

E{h_i} = λ H_i + t , E{H_i} = H_i , i = 1, . . ., n        (3.73)
Data snooping applied to the above connection model, implies that we are testing for
errors in the individual height coordinates. It will be clear that with the above model, one
will never be able to discriminate whether a blunder occurred in the h-coordinate or in the H-coordinate of a point i. That is, one will never be able to pinpoint a blunder in a height coordinate to one of the two height systems. Hence, it suffices to restrict our attention to
the wi -test statistics of one of the two sets of heights. We choose to restrict our attention to
the h-coordinates. Since the complete variance matrix of the observables is diagonal, we
can make use of the simplified expression (3.64) for the MDB’s. For the hi coordinates, it
reads
| ∇_i | = σ_h √( 17.075 / (1 − σ_{ĥ_i}²/σ_h²) )        (3.74)
Note that we assumed λ (α1 , 1, γ ) = 17.075. This value is based on the values α1 = 0.001
and γ = 0.80.
Scale and translation absent: When both scale and translation are absent, the model of
observation equations becomes linear and the least-squares solution of (3.73) amounts to taking a simple weighted average of the data. The variance of the least-squares solution ĥ_i then reads

σ_{ĥ_i}² = σ_h² σ_H² / (σ_h² + σ_H²)        (3.75)
Substitution into (3.74) gives for the MDB
| ∇_i | = √( 17.075 (σ_h² + σ_H²) )        (3.76)
This shows that the MDB will be about 5.8 times the standard deviation of the height
coordinate, if the two networks are equally precise. Hence, a blunder of this size in the hi
coordinate can be found with a probability of 80% with the wi -test.
Only scale is absent: In this case the model of observation equations is again linear. Note
however that the redundancy has decreased by one. In the previous case the redundancy
equalled n, whereas now, due to the additional unknown translation, it equals n − 1. Due
to the decrease in redundancy, the model will have less 'strength' and one can therefore
expect that the results of the adjustment will also be somewhat less precise. And indeed,
the variance of the least-squares solution ĥi reads now
σ_{ĥ_i}² = ( σ_h² σ_H² / (σ_h² + σ_H²) ) ( 1 + (σ_h²/σ_H²)(1/n) )        (3.77)
Note that (3.75) and (3.77) will differ less, the larger n is. Hence, the difference between
the two variances will become smaller when the number of points in the two overlapping
networks increases. The difference will also be small when σH2 >> σh2 , that is, when the
second network is considerably less precise than the first network. But in that case we also
would have σ_{ĥ_i}² ≈ σ_h², thus showing that no significant improvement in precision has taken place. This is of course understandable, because if the second network is considerably less precise than the first network, it will also not contribute much to the adjustment.
Substitution of (3.77) into (3.74) gives for the MDB
| ∇_i | = √( 17.075 (σ_h² + σ_H²) / (1 − 1/n) )        (3.78)
Note that the MDB goes to infinity when n = 1. This is due to the fact that there is no
redundancy when n = 1. In that case no model errors at all can be found by the statistical
tests and the solution is said to be infinitely unreliable. Also note that the above MDB is
larger than the one of (3.76). This is due to the decrease in redundancy of one. That is,
the outcomes of the statistical tests will now be somewhat less reliable than they were in
the previous case. But again, this difference can be made small by increasing the number
of points n.
Only translation is absent: When only the translation is absent, we again have a model
with a redundancy of n − 1. Note however, that the observation equations are now non-
linear. Hence, first a linearization needs to be carried out. The linearized model reads
E{ (Δh_1, . . ., Δh_n, ΔH_1, . . ., ΔH_n)^T } = [ λ° I_n , H° ; I_n , 0 ] (ΔH_1, . . ., ΔH_n, Δλ)^T , H° = (H_1°, . . ., H_n°)^T        (3.79)
where λ o is the approximate scale factor and Hio , i = 1, . . ., n, are the approximate heights
of the second network. Since geodetic networks often only differ slightly in scale, one
may take as approximate value λ o = 1. We will do so here also.
Note that the design matrix of the above model depends on the approximate heights H_i°. Hence, also the precision of the least-squares estimators will depend on them. The variance of the least-squares solution Δĥ_i now reads

σ_{ĥ_i}² = ( σ_h² σ_H² / (σ_h² + σ_H²) ) ( 1 + (σ_h²/σ_H²) (H_i°)² / ∑_{j=1}^{n} (H_j°)² )        (3.80)
Compare this result with that of (3.77). Substitution of (3.80) into (3.74) gives for the
MDB
| ∇_i | = √( 17.075 (σ_h² + σ_H²) / (1 − (H_i°)² / ∑_{j=1}^{n} (H_j°)²) )        (3.81)
Compare this result with that of (3.78). In the previous case, the MDB was, apart from the
a priori precision, only dependent on the number of points n. In the present case however,
the MDB has also become dependent on the heights of the points themselves. Thus in the
above expression for the MDB, we see four effects at work. As before, the MDB gets smaller when the a priori precision improves and/or when the
number of points increases. But now the MDB also gets smaller when the height of the
point being tested is closer to zero and/or when the heights of the remaining points are
further away from zero.
Both scale and translation present: As in the previous case, the observation equations
are again nonlinear. The linearized model reads
E{ (Δh_1, . . ., Δh_n, ΔH_1, . . ., ΔH_n)^T } = [ λ° I_n , u_n , H° ; I_n , 0 , 0 ] (ΔH_1, . . ., ΔH_n, Δt, Δλ)^T , u_n = (1, . . ., 1)^T        (3.82)
Due to the additional unknown, the translation t, the redundancy now equals n − 2. Thus again we can expect a somewhat less precise and less reliable result. The variance of the least-squares solution Δĥ_i now reads

σ_{ĥ_i}² = ( σ_h² σ_H² / (σ_h² + σ_H²) ) ( 1 + (σ_h²/σ_H²)( 1/n + (H̄_i°)² / ∑_{j=1}^{n} (H̄_j°)² ) )        (3.83)

where H̄_i° = H_i° − (1/n) ∑_{j=1}^{n} H_j°, i.e. the difference in height of point i with respect to the average height of the network. Compare this result with that of (3.80). Substitution of
(3.83) into (3.74) gives for the MDB
| ∇_i | = √( 17.075 (σ_h² + σ_H²) / (1 − 1/n − (H̄_i°)² / ∑_{j=1}^{n} (H̄_j°)²) )        (3.84)
Compare this result with that of (3.81). We see almost the same four effects present in the
expression for the MDB. The only difference is that the heights are now referenced to the
average height of the network, instead of to zero, as it was the case when the translation
was absent. Thus the point having the smallest MDB is the one whose height is closest to the average of the network. If it coincides with the average, then H̄_i° = 0 and the above expression reduces to that of (3.78).
We are now in a position to summarize the two main diagnostics by which the quality
of estimation and testing can be characterized. They are precision and reliability. In
order to discuss them properly, we follow the general steps involved when performing the
adjustment and the testing.
Formulate Ho and Ha : In order to be able to perform the adjustment, one first must have
a working hypothesis, the null hypothesis Ho , available. It reads
This is the model one beliefs to be true and on which one would like to base the ad-
justment. But of course, relying on this model without checking its validity would be
dangerous, since errors in it could completely ruin the results of the adjustment. The
purpose of testing is therefore to check whether the null hypothesis is likely to be true
or not. This is done by opposing the above model to one or more alternative hypotheses
Ha . In these lecture notes we have restricted ourselves to alternative hypotheses, which
differ from the null hypothesis in their functional model only. The alternative hypotheses
considered are therefore of the form

H_a: \quad E\{y\} = Ax + C\nabla, \quad D\{y\} = Q_y \qquad (3.86)

The two types of hypotheses thus differ in their mean of y only, E\{y \mid H_a\} = E\{y \mid H_o\} + C\nabla. The vector \nabla y = C\nabla, with matrix C known and vector \nabla unknown, describes the
assumed model error.
The quality under Ho and Ha : Based on our working hypothesis Ho , the least-squares
estimator of x reads
\hat{x} = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1}\, y \qquad (3.87)
It will be clear that the quality of x̂, as expressed by its expectation and its dispersion,
depends on whether Ho is true or Ha is true. In the first case we have
\text{mean}: \; E\{\hat{x}\} = x, \qquad \text{variance}: \; D\{\hat{x}\} = Q_{\hat{x}} \qquad (H_o \text{ true}) \qquad (3.88)

and in the second case

\text{mean}: \; E\{\hat{x}\} = x + \nabla\hat{x}, \qquad \text{variance}: \; D\{\hat{x}\} = Q_{\hat{x}} \qquad (H_a \text{ true}) \qquad (3.89)

with the bias \nabla\hat{x} = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} C\nabla.
Since one can never be completely sure whether the null hypothesis is true or whether
one of the alternative hypotheses is true, the quality of the estimator is made up of the two
components ∇x̂ and Qx̂ . The variance matrix, which describes the precision, is known and
in the last section of the previous chapter we discussed how one could evaluate it. The
bias ∇x̂ however, is unknown, since it depends on the unknown model error C∇.
Testing: In order to minimize the risk that a bias like ∇x̂ indeed occurs, one needs to
check the validity of the null hypothesis. This is the whole purpose of testing. Although
there is only one null hypothesis, there are in practice often many more than one alter-
native hypotheses. The class of alternative hypotheses considered depends very much on
the application at hand and on one's experience. Through the alternative hypotheses one
specifies the model errors that one believes are likely to occur. The testing procedure to
be applied then consists of detection, identification and adaptation. In the detection step
the overall validity of the null hypothesis is checked. Once this test leads to a rejection
of Ho , one needs to identify the most likely alternative hypothesis. In the final step, the
adaptation step, one then corrects for the identified misspecification in the null hypothesis.
Reliability: Although the statistical testing of Ho minimizes the risk that a bias like ∇x̂
occurs, one should realize that the outcomes of the statistical tests are not exact and thus
still prone to errors (type I and type II errors). It depends on the 'strength' of the model
how much confidence one will have in these outcomes. A measure of this confidence is
provided by the concept of reliability. When the w-test statistics are used, the internal
reliability is described by the set of MDB’s, one for each alternative hypothesis. The
MDB is given as
| \nabla | = \sqrt{\frac{\lambda(\alpha_1, 1, \gamma)}{c^T Q_y^{-1} Q_{\hat{e}} Q_y^{-1} c}} \qquad (3.90)
It is the size of the model error that can be found with a probability γ , when using the
w-test. The internal reliability improves when the MDB’s get smaller and gets worse
when the MDB’s get larger. Note however, that it only makes sense to consider these
MDB’s when the statistical tests are actually carried out. Hence, the solution is said to be
infinitely unreliable by definition, if no statistical testing has taken place. Also note, that
the MDB, apart from being dependent on the chosen values for α1 and γ , is governed by
the c-vector, the design matrix A and the variance matrix Qy . The vector c can not be
changed at will, since it depends on the particular model error one is considering. Hence,
this leaves us, much like in the case of precision, with the two matrices A and Qy for
improving the internal reliability.
The external reliability describes how model errors of the size of the MDB’s propagate
into the results of the adjustment,

\nabla\hat{x} = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} c\,\nabla \qquad (3.91)

with | \nabla | the MDB of (3.90). This vector thus describes the bias in x̂, when a model error of the size of the MDB has
occurred.
Precision and reliability: Once one has evaluated the precision of the least-squares so-
lution and found it adequate for the application at hand, one can evaluate the significance
of the bias vector ∇x̂ through the Bias-to-Noise Ratio (BNR)

\lambda_{\hat{x}} = \sqrt{\nabla\hat{x}^T\, Q_{\hat{x}}^{-1}\,\nabla\hat{x}} \qquad (3.92)

The dimensionless BNR measures the bias relative to the precision. For an arbitrary
function θ̂ = f^T x̂, the BNR can be used to obtain the upper bound

\frac{|\nabla\hat{\theta}|}{\sigma_{\hat{\theta}}} \le \lambda_{\hat{x}} \qquad (3.93)
Thus if for the particular application at hand, the BNR is considered to be small enough,
the bias in any function of x̂ is guaranteed to be sufficiently insignificant as well.
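The quantities summarized above can be computed directly from A, Q_y and c. The following Python sketch, with a purely illustrative model and c-vector, evaluates the MDB (3.90), the corresponding bias ∇x̂ and the BNR (3.92):

import numpy as np

# assumed illustrative model E{y} = A x, D{y} = Qy
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
Qy = 0.02**2 * np.eye(4)
c = np.array([0.0, 0.0, 1.0, 0.0])   # assumed model error: blunder in the third observation
lam0 = 17.075                        # lambda(alpha_1, 1, gamma) used in the text

Qy_inv = np.linalg.inv(Qy)
N = A.T @ Qy_inv @ A                 # normal matrix
Qx = np.linalg.inv(N)                # Q_xhat
Qe = Qy - A @ Qx @ A.T               # Q_ehat, variance matrix of the residuals

mdb = np.sqrt(lam0 / (c @ Qy_inv @ Qe @ Qy_inv @ c))   # MDB (3.90)
nabla_x = Qx @ A.T @ Qy_inv @ c * mdb                   # external reliability (3.91)
bnr = np.sqrt(nabla_x @ N @ nabla_x)                    # BNR (3.92)

print("MDB:", mdb, " bias in x_hat:", nabla_x, " BNR:", bnr)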
Chapter 4
Adjustment and validation of networks
4.1 Introduction
In this chapter we will discuss the various computational steps involved in determining
the geometry of a geodetic network. Let us first briefly review the general steps involved.
They are the design, the adjustment and the testing.
Design: Before one can start, one needs to specify the functional model and the stochastic
model. In the functional model, one formulates the assumed relation between the observ-
ables and the unknown parameters. These observation equations depend on the type of
observables used (e.g. angles, distances, baselines, etc.) and on the choice of parameteri-
zation (e.g. Cartesian coordinates or geographic coordinates). The observation equations
may be linear or nonlinear. In the nonlinear case, they first need to be linearized. For the
linearization, approximate values for the parameters are needed. When the approximate
geometry of the network is known, it is often possible to obtain the approximate coordi-
nates from a map. The approximate coordinates can also be computed from a sufficient
set of observations. The design matrix A is known, once the functional model is specified.
With the stochastic model, one specifies the assumed distributional properties of the
observables. In geodesy it often suffices to assume the observables to be normally dis-
tributed. In addition one has to specify the second moment, the variance matrix, of the
distribution. The variance matrix of the observables describes their precision. The spec-
ification of this variance matrix Qy depends on the measurement equipment and on the
measurement procedures used.
The two matrices A and Qy are known, once the functional and stochastic model are
known. These two matrices can be used before the actual adjustment and testing is carried
out, to infer the expected quality in terms of precision and reliability, of the network. In
order to evaluate the precision of the least-squares solution x̂, its variance matrix Qx̂ is
used. This matrix quantifies how random errors in the observables propagate into the
least-squares solution. In order to evaluate the reliability of the least-squares solution, the
minimal detectable bias vector ∇x̂ is used. It quantifies how a potential error c∇ in the
functional model propagates into the least-squares solution. The size of the potential error
is coupled to the power of the corresponding test statistic. Both Qx̂ and ∇x̂ depend on A
and Qy . Hence, one can improve the precision and reliability, by changing A and/or Qy .
Adjustment: Once one is satisfied with the design of the network, the actual adjustment
can be carried out. It needs, apart from the design matrix A and the variance matrix Qy ,
of course also the actual observations y. The adjustment is based on the principle of least-
squares and it produces a solution for the unknown parameters. This solution is obtained
by solving the normal equations, for which usually the Cholesky-decomposition is used.
In case of nonlinear observation equations, the least-squares solution is usually iterated
using the normal equations based on the linearized observation equations. The number of
iterations will be small, when good approximate values are used.
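A minimal Python sketch of this adjustment step, for an assumed linear(ized) model with illustrative numbers, forming the normal equations and solving them with a Cholesky decomposition:

import numpy as np
from scipy.linalg import cho_factor, cho_solve

# assumed illustrative linearized model E{y} = A x, D{y} = Qy
A = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])
Qy = 0.003**2 * np.eye(3)
y = np.array([1.012, 0.498, -1.507])

W = np.linalg.inv(Qy)                  # weight matrix
N = A.T @ W @ A                        # normal matrix
b = A.T @ W @ y                        # right-hand side

x_hat = cho_solve(cho_factor(N), b)    # Cholesky solution of the normal equations
Q_xhat = np.linalg.inv(N)              # variance matrix of the estimates
e_hat = y - A @ x_hat                  # least-squares residuals

print("x_hat:", x_hat)
print("residuals:", e_hat)

For a nonlinear model this solution step is simply repeated, each time relinearizing at the latest estimates, until the corrections become negligible.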
On the basis of the assumption that the model (functional and stochastic) has been
specified correctly, the linear(ized) least-squares estimators are known to be unbiased and
of minimal variance. These properties will fail to hold however, when a misspecified
model has been used in the computations. Hence, before the least-squares solution x̂ can
be accepted, one has to test the validity of the model.
Testing: The validity of the model (the null hypothesis) is tested by opposing it to vari-
ous alternative models (the alternative hypotheses). For each alternative hypothesis, one
has an appropriate test statistic. But all test statistics are functions of the least-squares
residual vector ê. The testing procedure consists of the following three steps: detection,
identification and adaptation. In the detection step, the null hypothesis is opposed to the
most relaxed alternative hypothesis. The purpose of the detection step is to infer whether
one has any reason to believe that the null hypothesis is indeed wrong. When the detection
step leads to a rejection of the null hypothesis, the next step is the identification of the
most likely model error. For identification one needs to specify the alternative hypothesis.
This choice depends on the type of model error one expects to be present. Hence, it very
much depends on the application at hand. It is standard practice however, to have data
snooping included in the identification step. In case of data snooping each of the individ-
ual observations is screened for potential blunders. Once certain model errors have been
identified as sufficiently likely, the last step consists of an adaptation of the data and/or
model. Depending on the situation, two approaches are possible in principle. One can
either decide to remeasure some of the observables, or, one can decide to include addi-
tional parameters in the model, such that the model errors are accounted for. The first
approach is possible in case the model errors are due to clear measurement errors, e.g.
blunders in individual observations. The second approach can be used for more compli-
cated situations. In this case the identified alternative hypothesis will become the new null
hypothesis. One should be aware however, that in this case, due to the change in model,
the precision and reliability of the solution will change as well.
The above considerations for the design, the adjustment and the testing, are valid for
any geodetic project where measurements are used to determine unknown parameters.
When computing geodetic networks however, some additional aspects need to be consid-
ered as well. The construction of a geodetic network implies that the geometry of the
configuration of a set of points is determined. The set of points usually consists of: (1)
newly established points, of which the coordinates still need to be determined, and (2)
already existing points, the so-called control points, of which the coordinates are known.
By means of a network adjustment the relative geometry of the new points is determined
and integrated into the geometry of the existing control points. The determination of the
geometry is usually divided into two parts, the so-called free network adjustment and the
connection adjustment.
The free network: In the free network adjustment, the known coordinates of the control
points do not take part in the determination of the geometry of the point field. This
adjustment step is thus free from the influence of the existing control points. The idea is
that a good geodetic network should be sufficiently precise and reliable in itself, without
the need of external control.
Since the coordinates of the control points do not take part in the adjustment, one
is confronted with the fundamental non-uniqueness in the relation between geodetic ob-
servables and coordinates. In a levelling network for instance, absolute heights can not
be determined if only height differences are measured. That is, for computing heights,
additional information is needed on the absolute height of the network. Similarly, one
can not obtain the position, the orientation and the scale of a triangulation network if only
angles are measured. The additional information which is needed to be able to compute
coordinates from the geodetic observables, is provided for in the form of so-called min-
imal constraints. These minimal constraints are not unique. There is a whole set from
which the constraints can be chosen. It is important though, that the constraints are mini-
mal. That is, they should not only be necessary to eliminate the lack of information in the
observables, but they should also be sufficient.
After the design of the free network, its coordinates are computed by means of a
least-squares adjustment and its validity is checked by means of the statistical testing
of the observations. The coordinates depend on the chosen minimal constraints, but the
statistical tests do not.
The connected network: Once the geometry of the free network has been determined
to ones satisfaction, it needs to be integrated into the existing geometry of the control
points. The data used for this connection, are the results of the free network adjustment
together with the coordinates of the control points. One can discriminate between the so-
called constrained connection and the unconstrained connection. In most applications,
it is not very practical to see the coordinates of the control points change every time a
free network is connected to them. This would happen however, when an ordinary least-
squares adjustment is carried out. In that case all observations, including the coordinates
of the control points, would get corrections due to the least-squares adjustment. In order to
circumvent this, no ordinary adjustment, but a constrained adjustment is carried out. This
implies that the connection is carried out, with the explicit constraints that the coordinates
of the existing control points remain fixed.
For the statistical testing of the observations however, a constrained adjustment would
not be realistic. Although practice dictates that the coordinates of the control points re-
main fixed, these coordinates are of course still samples from random variables. Hence,
for the statistical testing of the observations, the variance matrix of the control points
should not be set to zero, but should be included in the adjustment as well. Thus, the
constrained connection is carried out for the final computation of the coordinates, but the
unconstrained connection is used for the statistical testing of the control points.
This chapter is organized as follows. First we will discuss the free network case and then
we will show how such networks can be connected to existing control. For the free networks, we
present the observation equations, discuss their invariance and show how one can choose
the minimal constraints.
In this section we consider networks for which the observational data are insufficient
to determine either the (horizontal and/or vertical) position, orientation or scale of the
network (note: here and in the remaining part of the lecture notes, we will disregard so
called configuration defects. That is, we assume that sufficient observational data are used
to determine the configuration of the network).
As we have seen in the earlier sections on adjustment theory, a consequence of the
non-uniqueness is that the design matrix A of the model of observation equations E{y} =
Ax, D{y} = Qy , will have a rank defect. Therefore no unbiased linear estimator x̂ = Ly
of x exists, since this would require E{x̂} = LE{y} = LAx = x for all x, or LA = I, which is
impossible since the rank of a product of two matrices can not exceed the rank of either
factor. But although x is not unbiased estimable, we have seen that functions of x exist
that are unbiased estimable. In particular we have shown that the minimally constrained
solution x̂b is an unbiased estimator of Sb x, with Sb being the S-transformation defined by
the null space of the design matrix, N(A) = R(G), and the minimal constraints BT x = 0.
In this section we will show how minimally constrained solutions for geodetic net-
works can be constructed. We start off by presenting the nonlinear observation equations
of some geodetic observables and their linearized versions. Then we discuss the invari-
ance properties of these geodetic observables. Their invariance can usually be related to
properties of coordinate transformations. Once these properties of invariance are under-
stood, one will be able to identify the null space of the design matrix A. As a consequence
the matrix G, of which the columns span N(A), can be constructed. Understanding the in-
variance, also allows one to specify the minimal constraints BT x = 0. They are needed to
be able to compute a particular least-squares solution of the rank defect model E{y} = Ax,
D{y} = Qy . Such a solution is one of the many free network solutions that can be com-
puted. By means of the S-transformation, one is able to transform one particular free
network solution into another. This section will be concluded with an adjustment exam-
ple of a free GPS network and a testing example of a free levelling network.
Height difference: Probably the simplest of all geodetic observation equations is the one
that corresponds with observed height differences. Let hi j denote the height difference be-
tween two points i and j, and let the heights of these two points be denoted as respectively
hi and h j . The observation equation for an observed height difference reads then
E{hi j } = h j − hi (4.1)
Azimuth: If we consider a two dimensional network and assume that all network points
are located in the two-dimensional Euclidean plane, the nonlinear observation equation
for an azimuth ai j between the two points i and j reads
E\{a_{ij}\} = \arctan\frac{x_j - x_i}{y_j - y_i} \qquad (4.2)

Linearization gives

E\{\Delta a_{ij}\} = \frac{y_j^o - y_i^o}{(l_{ij}^o)^2}\,\Delta x_j - \frac{y_j^o - y_i^o}{(l_{ij}^o)^2}\,\Delta x_i - \frac{x_j^o - x_i^o}{(l_{ij}^o)^2}\,\Delta y_j + \frac{x_j^o - x_i^o}{(l_{ij}^o)^2}\,\Delta y_i \qquad (4.3)

with

\Delta a_{ij} = a_{ij} - \arctan\frac{x_j^o - x_i^o}{y_j^o - y_i^o}

and where l_{ij}^o = \sqrt{(x_j^o - x_i^o)^2 + (y_j^o - y_i^o)^2} denotes the approximate distance between the two points.
Distance: For a planar network, the nonlinear observation equation for a distance reads
E\{l_{ij}\} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2} \qquad (4.4)
Linearization gives
E\{\Delta l_{ij}\} = \frac{x_j^o - x_i^o}{l_{ij}^o}\,\Delta x_j - \frac{x_j^o - x_i^o}{l_{ij}^o}\,\Delta x_i + \frac{y_j^o - y_i^o}{l_{ij}^o}\,\Delta y_j - \frac{y_j^o - y_i^o}{l_{ij}^o}\,\Delta y_i \qquad (4.5)
with
\Delta l_{ij} = l_{ij} - \sqrt{(x_j^o - x_i^o)^2 + (y_j^o - y_i^o)^2}
Angle: An angle \alpha_{ijk} between three points i, j and k, is the difference between the two azimuths a_{jk} and a_{ji}. Thus we have

E\{\alpha_{ijk}\} = E\{a_{jk} - a_{ji}\} = \arctan\frac{x_k - x_j}{y_k - y_j} - \arctan\frac{x_i - x_j}{y_i - y_j} \qquad (4.6)

Its linearized version then follows from the linearized versions of the two azimuths.
Distance ratio: The distance ratio v_{ijk} between three points i, j and k, is the ratio of the two distances l_{jk} and l_{ji}. Thus we have

E\{v_{ijk}\} = E\left\{\frac{l_{jk}}{l_{ji}}\right\} = \frac{\sqrt{(x_k - x_j)^2 + (y_k - y_j)^2}}{\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}} \qquad (4.7)
Linearization gives
E\{\Delta v_{ijk}\} = \frac{l_{ji}^o\,\Delta l_{jk} - l_{jk}^o\,\Delta l_{ji}}{(l_{ji}^o)^2}
This can be further expressed in terms of the coordinate increments by making use of
(4.5).
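The linearized coefficients of (4.3) and (4.5) are easily evaluated numerically. The following sketch (illustrative, not from the book) returns the design-matrix row of an azimuth and of a distance observation, for the assumed unknown ordering (Δx_i, Δy_i, Δx_j, Δy_j):

import numpy as np

def azimuth_row(xi, yi, xj, yj):
    """Coefficients of (dx_i, dy_i, dx_j, dy_j) in the linearized azimuth (4.3)."""
    dx, dy = xj - xi, yj - yi
    l2 = dx**2 + dy**2
    return np.array([-dy / l2,  dx / l2,  dy / l2, -dx / l2])

def distance_row(xi, yi, xj, yj):
    """Coefficients of (dx_i, dy_i, dx_j, dy_j) in the linearized distance (4.5)."""
    dx, dy = xj - xi, yj - yi
    l = np.hypot(dx, dy)
    return np.array([-dx / l, -dy / l, dx / l, dy / l])

# assumed approximate coordinates of points i and j
print(azimuth_row(0.0, 0.0, 100.0, 50.0))
print(distance_row(0.0, 0.0, 100.0, 50.0))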
Direction: The observation equation for an observed direction r_{ij} reads

E\{r_{ij}\} = a_{ij} + o_j \qquad (4.8)
where o j is the orientation unknown. Apart form the additional orientation unknown, the
linearized observation equation for a direction is the same as that for an azimuth. Note
that the difference of two directions having the same orientation, produces an angle.
Pseudo-distance: The observation equation for an observed pseudo-distance s_{ij}, with unknown scale factor \lambda_i, reads

E\{s_{ij}\} = \lambda_i\, l_{ij} \qquad (4.9)
This can be further expressed in terms of the coordinate increments by making use of
(4.5). Note that the ratio of two pseudo-distances having the same scale factor, produces
a distance ratio.
The three dimensional case: So far we assumed all network points to lie in the two di-
mensional Euclidean plane. For a three dimensional network though, we have to take the
third dimension into account as well. For a network of a sufficiently large extent, also the
change in the direction of the plumbline will have to be taken into account. This implies
that for direction measurements like the azimuth ai j and the zenith angle zi j , the astro-
nomical latitude Φi and astronomical longitude Λi will enter the observation equations as
well. The nonlinear observation equations for respectively the azimuth, the zenith angle
and the distance between two points i and j, are given as
E\{a_{ij}\} = \arctan\frac{-\sin\Lambda_i\,(x_j - x_i) + \cos\Lambda_i\,(y_j - y_i)}{-\sin\Phi_i\cos\Lambda_i\,(x_j - x_i) - \sin\Phi_i\sin\Lambda_i\,(y_j - y_i) + \cos\Phi_i\,(z_j - z_i)}

E\{z_{ij}\} = \arccos\frac{\cos\Phi_i\cos\Lambda_i\,(x_j - x_i) + \cos\Phi_i\sin\Lambda_i\,(y_j - y_i) + \sin\Phi_i\,(z_j - z_i)}{\sqrt{(x_j - x_i)^2 + (y_j - y_i)^2 + (z_j - z_i)^2}}

E\{l_{ij}\} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2 + (z_j - z_i)^2}
The unknowns in these observation equations are now apart from the Cartesian coordi-
nates, also the astronomical latitude and longitude. The linearization of the above obser-
vation equations is left as an exercise to the reader.
GPS baseline: As our last example we consider the three dimensional baseline. When
expressed in Cartesian coordinates, the observation equations for its three components
read
E{xi j } = x j − xi
E{yi j } = y j − yi (4.11)
E{zi j } = z j − zi
As with the height differences no linearization is needed, since the equations are already
linear in the parameters.
Instead of using Cartesian coordinates, one may of course use other type of coordinates
as well. In three dimensions, one often also makes use of the geographic coordinates φ ,
λ and h, where h now refers to the height above the reference ellipsoid. The Cartesian
coordinates and geographic coordinates are related as
x = (N + h) cos φ cos λ
y = (N + h) cos φ sin λ (4.12)
z = (N(1 − e2 ) + h) sin φ
where N is the radius of curvature in the prime vertical
N = \frac{a}{\sqrt{1 - e^2\sin^2\phi}}
and e^2 = (a^2 - b^2)/a^2, with a and b being the lengths of the major and minor axes of the
ellipsoid of revolution.
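A short Python sketch of the mapping (4.12); the GRS80 values of a and e² are used here only as an assumed example ellipsoid:

import numpy as np

def geographic_to_cartesian(phi, lam, h, a=6378137.0, e2=0.00669438002290):
    """Convert geographic (phi, lam in radians, h in metres) to Cartesian, cf. (4.12)."""
    N = a / np.sqrt(1.0 - e2 * np.sin(phi)**2)   # radius of curvature in the prime vertical
    x = (N + h) * np.cos(phi) * np.cos(lam)
    y = (N + h) * np.cos(phi) * np.sin(lam)
    z = (N * (1.0 - e2) + h) * np.sin(phi)
    return np.array([x, y, z])

print(geographic_to_cartesian(np.radians(52.0), np.radians(4.4), 43.0))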
The one dimensional case: As a one dimensional example we consider the case of level-
ling. Let hi denote the height of point i in the first coordinate system and let Hi denote the
height of the same point in the second coordinate system. The coordinate transformation
between the two coordinate systems reads then
hi = Hi + t (4.13)
where t is a translation or a shift in height, which is constant for all points. It will be clear
that this coordinate transformation leaves observed height differences invariant. That is,
for the observation equation of a height difference we have
E{hi j } = h j − hi = H j − Hi (4.14)
This shows that the height differences are invariant for the translation t. This immedi-
ately implies that the design matrix A of a levelling network that is built up from height
differences only, will have a null space which is spanned by the column of the matrix
G = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}_{n\times 1} \qquad (4.15)
Thus once we know the transformation which leaves the observation equations invariant,
we only need to take its partial derivative with respect to the parameters in order to obtain
the G-matrix.
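For the levelling case this is easily verified numerically. The sketch below, for an assumed four-point levelling network, checks that the design matrix indeed annihilates the G-matrix of (4.15):

import numpy as np

# assumed levelling network: observed height differences h12, h23, h31, h34, h42
# unknowns are the heights h1..h4
A = np.array([[-1,  1,  0,  0],
              [ 0, -1,  1,  0],
              [ 1,  0, -1,  0],
              [ 0,  0, -1,  1],
              [ 0,  1,  0, -1]], dtype=float)

G = np.ones((4, 1))          # null-space matrix (4.15): a common shift of all heights

print(A @ G)                 # zero vector: height differences are invariant for the shift
print(np.linalg.matrix_rank(A))   # rank 3 < 4: one datum (minimal) constraint is needed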
The transformation which plays an important role in case of geodetic observables, is
the Similarity transformation. For the two dimensional case, it reads (see figure 4.1)
\begin{bmatrix} x_i \\ y_i \end{bmatrix} = \lambda\begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix}\begin{bmatrix} u_i \\ v_i \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} \qquad (4.20)
with
λ : scale
α : rotation
tx , ty : translation
and where xi , yi are the coordinates of the first coordinate system and ui , vi , the coordinates
of the second coordinate system. Linearization of the similarity transformation requires approximate values for the transformation parameters; the transformation itself is illustrated in figure 4.1.

[Figure 4.1: the two dimensional similarity transformation between the (u, v)- and the (x, y)-coordinate system, with scale λ, rotation angle α and translations t_x, t_y]

Distance ratio: When the coordinate transformation (4.20) is substituted into the observation equation of a distance ratio, the scale, the rotation and the two translations all cancel,
thus showing that the transformation parameters λ, α, t_x and t_y are absent. These parameters will also be absent when one considers the observation equation of an angle. As
a consequence, the design matrix A of a geodetic network, that has been built up from
distance ratios and/or angles only, will have a null space which is spanned by the columns
of the matrix
G = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & x_i^o & -y_i^o \\ 0 & 1 & y_i^o & x_i^o \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}_{2n\times 4} \qquad (4.22)
where n is the number of points in the network.
Since matrix G has four independent columns, four constraints are needed to take care
of the nonuniqueness. The simplest way to fix the degrees of freedom of scale, orientation
and translation, would be to fix the coordinates of two points. The corresponding B-matrix
reads then
B^T = \begin{bmatrix} 0 & 0 & \cdots & I_2 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & I_2 & \cdots & 0 & 0 \end{bmatrix}_{4\times 2n} \qquad (4.23)
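The following sketch builds the G-matrix (4.22) for assumed approximate coordinates and checks that a B-matrix of the type (4.23), here taken to fix the coordinates of points 1 and 2, yields an invertible B^T G, i.e. an admissible set of minimal constraints:

import numpy as np

xy = np.array([[0.0, 0.0],      # assumed approximate coordinates of the network points
               [100.0, 0.0],
               [100.0, 80.0],
               [0.0, 80.0]])
n = xy.shape[0]

# G-matrix (4.22): translations, scale and rotation, 2n x 4
G = np.zeros((2 * n, 4))
for i, (x, y) in enumerate(xy):
    G[2*i]     = [1.0, 0.0,  x, -y]
    G[2*i + 1] = [0.0, 1.0,  y,  x]

# B-matrix of type (4.23): fix the coordinates of points 1 and 2, 4 x 2n
B_T = np.zeros((4, 2 * n))
B_T[0:2, 0:2] = np.eye(2)
B_T[2:4, 2:4] = np.eye(2)

print(np.linalg.matrix_rank(B_T @ G))   # 4: the constraints are minimal (B^T G invertible)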
Azimuth: When we substitute the coordinate transformation (4.20) into the observa-
tion equation of an azimuth, we get
E\{a_{ij}\} = \arctan\frac{x_{ij}}{y_{ij}} = \arctan\frac{\cos\alpha\,u_{ij} - \sin\alpha\,v_{ij}}{\sin\alpha\,u_{ij} + \cos\alpha\,v_{ij}}
This shows that scale, λ , and the two translations, tx , ty , get eliminated, but that the angle
of rotation α does not get eliminated. Hence, with azimuth observables one cannot de-
termine the scale and position of the network, but only its orientation. As a consequence,
the design matrix A of a geodetic network, that has been built up from azimuths only, will
have a null space which is spanned by the columns of the matrix
G = \begin{bmatrix} \vdots & \vdots & \vdots \\ 1 & 0 & x_i^o \\ 0 & 1 & y_i^o \\ \vdots & \vdots & \vdots \end{bmatrix}_{2n\times 3} \qquad (4.24)
Now three constraints are needed. One could for instance fix the two coordinates of the
first point. This takes care of the two translational degrees of freedom. The scale can be
fixed by constraining the distance between the first and second point. The corresponding
B-matrix reads then
B^T = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & 0 & \cdots \\ -x_{12}^o & -y_{12}^o & x_{12}^o & y_{12}^o & 0 & \cdots \end{bmatrix}_{3\times 2n} \qquad (4.25)
Distance: For distance observables one can expect that the rotation angle and the
translations get eliminated. And indeed, we have
E\{l_{ij}\} = \sqrt{x_{ij}^2 + y_{ij}^2} = \lambda\sqrt{u_{ij}^2 + v_{ij}^2}
The three dimensional case: In three dimensions, the nonlinear similarity transformation
is given as
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} = \lambda\, R(\alpha)R(\beta)R(\gamma)\begin{bmatrix} u_i \\ v_i \\ w_i \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix} \qquad (4.27)
with
λ : scale
α, β , γ : rotation
tx , ty, tz : translation
and the three rotation matrices
R(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{bmatrix}, \quad R(\beta) = \begin{bmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{bmatrix}, \quad R(\gamma) = \begin{bmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}
GPS baseline: Since the baseline observable is only invariant for translations and not
for rotations and scale changes, the G-matrix of a geodetic network built up from baselines
only, is given as
G = \begin{bmatrix} \vdots & \vdots & \vdots \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \vdots & \vdots & \vdots \end{bmatrix}_{3n\times 3} \qquad (4.29)
Note that this is the three dimensional analogue of the levelling-case.
To conclude this section, we have given in table 4.1 the various entries of the G-matrix, when different types of geodetic observables are used.
Adjustment example of a free GPS network: Consider a small GPS network of three points, connected by three observed baselines, each denoted as b_{ij} = (x_{ij}, y_{ij}, z_{ij})^T.
It is our goal to determine the Cartesian coordinates of the three points 1, 2 and 3. The
position vector of point i will be denoted as pi . Thus
pi = (xi , yi , zi )T
We will assume that the three baselines have been determined independently. Thus no
correlation is assumed to exist between the baselines. We also assume that the three
baselines have been determined with the same precision. Thus we assume that all three
baselines have the same variance matrix, which will be denoted as Q.
Based on the above assumptions, we can formulate the model of observation equations
as
E\left\{\begin{bmatrix} b_{12} \\ b_{23} \\ b_{31} \end{bmatrix}\right\} = \begin{bmatrix} -I_3 & I_3 & 0 \\ 0 & -I_3 & I_3 \\ I_3 & 0 & -I_3 \end{bmatrix}\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}, \qquad D\left\{\begin{bmatrix} b_{12} \\ b_{23} \\ b_{31} \end{bmatrix}\right\} = \begin{bmatrix} Q & 0 & 0 \\ 0 & Q & 0 \\ 0 & 0 & Q \end{bmatrix} \qquad (4.30)
This is a system of 9 observation equations in 9 unknowns. Note however, that the design
matrix is not of full rank. It has a rank defect of 3. Thus the redundancy of the model
equals 9 − 9 + 3 = 3. Thus if one would formulate the model in terms of condition
equations, one would have 3 independent condition equations. These three equations are
given as E{b12 + b23 + b31 } = 0.
Due to the rank defect of the model of observation equations, we need to specify
minimal constraints in order to be able to compute a particular least-squares solution. We
know that the range space of the matrix B of the minimal constraints BT x = 0, needs to
satisfy Rn = R(B) ⊕ R(AT ) = R(B) ⊕ N(A)⊥ . Thus R(B) needs to be complementary to
N(A)^⊥ = R(G)^⊥. The null space of A is spanned by the columns of the matrix

G = \begin{bmatrix} I_3 \\ I_3 \\ I_3 \end{bmatrix} \qquad (4.31)
Point 1 as fixed (datum) point: The choice of matrix B_{(1)} for the minimal constraints corresponds to a fixing of the coordinates of point 1. The corresponding least-squares solution will be denoted as x̂_{(1)}. It reads

\hat{x}_{(1)} = B_{(1)}^{\perp}\left[(B_{(1)}^{\perp})^T A^T Q_y^{-1} A\, B_{(1)}^{\perp}\right]^{-1}(B_{(1)}^{\perp})^T A^T Q_y^{-1}\, y
Since

A^T Q_y^{-1} A = \begin{bmatrix} 2Q^{-1} & -Q^{-1} & -Q^{-1} \\ -Q^{-1} & 2Q^{-1} & -Q^{-1} \\ -Q^{-1} & -Q^{-1} & 2Q^{-1} \end{bmatrix} \quad\text{and}\quad B_{(1)}^{\perp} = \begin{bmatrix} 0 & 0 \\ I_3 & 0 \\ 0 & I_3 \end{bmatrix}

it follows that

\left[(B_{(1)}^{\perp})^T A^T Q_y^{-1} A\, B_{(1)}^{\perp}\right]^{-1} = \begin{bmatrix} 2Q^{-1} & -Q^{-1} \\ -Q^{-1} & 2Q^{-1} \end{bmatrix}^{-1} = \begin{bmatrix} \tfrac{2}{3}Q & \tfrac{1}{3}Q \\ \tfrac{1}{3}Q & \tfrac{2}{3}Q \end{bmatrix}
Hence, the variance matrix of the minimally constrained least-squares solution, keeping
point 1 fixed, reads
Q_{\hat{x}_{(1)}} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & \tfrac{2}{3}Q & \tfrac{1}{3}Q \\ 0 & \tfrac{1}{3}Q & \tfrac{2}{3}Q \end{bmatrix} \qquad (4.33)
This matrix describes the precision of the coordinates of the free GPS network, when the
minimal constraints correspond to a fixing of point 1.
Point 3 as fixed (datum) point: Instead of choosing point 1 as fixed point, one may
of course also choose another point, say point 3. In that case the variance matrix of the
minimally constrained least-squares solution, reads
Q_{\hat{x}_{(3)}} = \begin{bmatrix} \tfrac{2}{3}Q & \tfrac{1}{3}Q & 0 \\ \tfrac{1}{3}Q & \tfrac{2}{3}Q & 0 \\ 0 & 0 & 0 \end{bmatrix} \qquad (4.34)
This matrix describes the precision of the coordinates of the free GPS network, when the
minimal constraints correspond to a fixing of point 3. Note that the two variance matrices
of (4.33) and (4.34) differ greatly. It is important to recognize however, that these two
variance matrices contain identical information. One can transform the one into the other
by means of an S-transformation. The transformation that transforms any arbitrary least-squares solution to x̂_{(1)} reads
S_{(1)} = I_9 - G\left(B_{(1)}^T G\right)^{-1} B_{(1)}^T = \begin{bmatrix} 0 & 0 & 0 \\ -I_3 & I_3 & 0 \\ -I_3 & 0 & I_3 \end{bmatrix} \qquad (4.35)
Hence, the variance matrix Q_{x̂(1)} can be obtained from the variance matrix Q_{x̂(3)} by means of the transformation (verify yourself) Q_{x̂(1)} = S_{(1)}\, Q_{x̂(3)}\, S_{(1)}^T.
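This relation can be verified numerically. In the sketch below, Q is an assumed 3×3 baseline variance matrix; the rest follows (4.33), (4.34) and (4.35):

import numpy as np

I3 = np.eye(3)
Q = np.diag([1e-4, 1e-4, 4e-4])      # assumed baseline variance matrix Q

def blocks(M):
    """Expand a 3x3 block pattern (entries are multiples of Q) into a 9x9 matrix."""
    return np.block([[m * Q for m in row] for row in M])

# variance matrices (4.33) and (4.34)
Qx1 = blocks([[0, 0, 0], [0, 2/3, 1/3], [0, 1/3, 2/3]])
Qx3 = blocks([[2/3, 1/3, 0], [1/3, 2/3, 0], [0, 0, 0]])

# S-transformation (4.35) that maps any solution to the one with point 1 fixed
S1 = np.block([[0*I3, 0*I3, 0*I3],
               [-I3,   I3,  0*I3],
               [-I3,  0*I3,  I3]])

print(np.allclose(Qx1, S1 @ Qx3 @ S1.T))   # True: Q_x(1) = S(1) Q_x(3) S(1)^T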
Fixing the sum of the coordinates: Instead of fixing the three coordinates of one of the
points of the network, one may also decide to fix, over all points of the network, the sum of the x-coordinates, the sum of the y-coordinates and the sum of the z-coordinates. Also these
three constraints are admissible. The S-transformation that corresponds with this set of minimal constraints is given as

S_{(1+2+3)} = I_9 - G\left(B_{(1+2+3)}^T G\right)^{-1} B_{(1+2+3)}^T = \begin{bmatrix} \tfrac{2}{3}I_3 & -\tfrac{1}{3}I_3 & -\tfrac{1}{3}I_3 \\ -\tfrac{1}{3}I_3 & \tfrac{2}{3}I_3 & -\tfrac{1}{3}I_3 \\ -\tfrac{1}{3}I_3 & -\tfrac{1}{3}I_3 & \tfrac{2}{3}I_3 \end{bmatrix} \qquad (4.36)
With this S-transformation, we can thus obtain the variance matrix of x̂_{(1+2+3)} from Q_{x̂(3)} as

Q_{\hat{x}_{(1+2+3)}} = S_{(1+2+3)}\, Q_{\hat{x}_{(3)}}\, S_{(1+2+3)}^T = \begin{bmatrix} \tfrac{2}{9}Q & -\tfrac{1}{9}Q & -\tfrac{1}{9}Q \\ -\tfrac{1}{9}Q & \tfrac{2}{9}Q & -\tfrac{1}{9}Q \\ -\tfrac{1}{9}Q & -\tfrac{1}{9}Q & \tfrac{2}{9}Q \end{bmatrix} \qquad (4.37)
The entries of this variance matrix again differ greatly from the entries of Q_{x̂(1)} and Q_{x̂(3)}. The three minimally constrained solutions x̂_{(1)}, x̂_{(3)} and x̂_{(1+2+3)} are however completely equivalent. All three produce the same least-squares solution for the measurements. This thus also holds for the variance matrix Q_ŷ. Verify yourself that indeed Q_ŷ = A Q_{x̂(1)} A^T = A Q_{x̂(3)} A^T = A Q_{x̂(1+2+3)} A^T.
Evaluation of free network precision: When evaluating the precision of a free network,
one has to make sure that the evaluation is not affected by the choice of minimal con-
straints. These constraints do not contain information which is essential for the precision-
evaluation. They are merely a tool to be able to compute coordinates. Thus when the precision evaluation is based on, say Q_{x̂(1)}, one should use a procedure which gives results that are identical to the results that one would obtain when the precision evaluation is based on, say Q_{x̂(3)}. Hence, when one wants to compare Q_{x̂(1)} with a criterion matrix, one has to make sure that the criterion matrix is defined with respect to the same set of minimal constraints. This can be accomplished by transforming the criterion matrix with the appropriate S-transformation. Thus when C_x denotes the criterion matrix, one should compare Q_{x̂(1)} with S_{(1)} C_x S_{(1)}^T, and Q_{x̂(3)} with S_{(3)} C_x S_{(3)}^T. Only then will the two evaluations give identical results.
In case one decides to base the evaluation on the generalized eigenvalue problem, the appropriate formulation is thus

\left| Q_{\hat{x}_{(1)}} - \lambda\, S_{(1)} C_x S_{(1)}^T \right| = 0 \qquad (4.38)

and not | Q_{x̂(1)} − λ C_x | = 0. And if it is Q_{x̂(3)} that needs to be evaluated, the appropriate formulation is

\left| Q_{\hat{x}_{(3)}} - \lambda\, S_{(3)} C_x S_{(3)}^T \right| = 0 \qquad (4.39)

and not | Q_{x̂(3)} − λ C_x | = 0. Both eigenvalue problems (4.38) and (4.39) will give identical results for the eigenvalues.
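As an illustration (not from the book), the sketch below evaluates (4.38) with scipy for the GPS example above, using an assumed simple diagonal criterion matrix. Because both matrices are singular in the rows and columns of the fixed point 1, only the sub-blocks of points 2 and 3 are used:

import numpy as np
from scipy.linalg import eigh

I3 = np.eye(3)
Q = 1e-4 * np.eye(3)                         # assumed baseline variance matrix

# Q_x(1) of (4.33), minimal constraints: point 1 fixed
Qx1 = np.block([[0*I3, 0*I3, 0*I3],
                [0*I3, 2/3*Q, 1/3*Q],
                [0*I3, 1/3*Q, 2/3*Q]])

Cx = 2e-4 * np.eye(9)                        # assumed criterion matrix for the coordinates
S1 = np.block([[0*I3, 0*I3, 0*I3],
               [-I3,   I3,  0*I3],
               [-I3,  0*I3,  I3]])
Cx1 = S1 @ Cx @ S1.T                         # criterion matrix in the same S-system

# generalized eigenvalues of the sub-blocks of points 2 and 3 only
idx = np.arange(3, 9)
lam = eigh(Qx1[np.ix_(idx, idx)], Cx1[np.ix_(idx, idx)], eigvals_only=True)
print(lam.max())     # precision meets the criterion if the largest eigenvalue is <= 1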
Thus the network consists of two levelling loops, loop 1 − 2 − 3 and loop 2 − 3 − 4. The
observables are assumed to be uncorrelated, all having the same variance σ 2 . We know
that we need to introduce one minimal constraint in order to eliminate the rank deficiency.
As minimal constraint we choose to fix the height of the first point:
h1 = 0
The model of observation equations, with the minimal constraint included, reads then
E\left\{\begin{bmatrix} h_{12} \\ h_{23} \\ h_{31} \\ h_{34} \\ h_{42} \end{bmatrix}\right\} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & -1 & 1 \\ 1 & 0 & -1 \end{bmatrix}\begin{bmatrix} h_2^1 \\ h_3^1 \\ h_4^1 \end{bmatrix}, \qquad Q_y = \sigma^2 I_5 \qquad (4.40)
Note that due to the elimination of h1 , the design matrix is indeed of full rank. The
remaining heights are given the upper index 1 to show that they are defined with respect
to the fixing of the height of the first point. There are 5 observations and 3 unknowns.
The redundancy is therefore equal to 2. The two condition equations can be identified as
E{h12 + h23 + h31 } = 0 and E{h23 + h34 + h42 } = 0.
Detection: In order to test the above model, we will start with the overall model test.
The general expression for the corresponding test statistic reads
T = \hat{e}^T Q_y^{-1}\,\hat{e}

Under H_o it has a central Chi-square distribution with the degrees of freedom equal to the redundancy of the model, here 2.
Identification: When the detection step leads to a rejection of H_o, data snooping is applied: each individual observation is screened by means of its w-test statistic. These test statistics have a standard normal distribution under the null hypothesis. When the w_i-test
statistic is worked out for the above model, we get (verify yourself)
w_1 = \frac{1}{2\sigma\sqrt{6}}\,(3h_{12} + 2h_{23} + 3h_{31} - h_{34} - h_{42})

w_2 = \frac{1}{\sigma\sqrt{8}}\,(h_{12} + 2h_{23} + h_{31} + h_{34} + h_{42})

w_3 = w_1 \qquad (4.42)

w_4 = \frac{1}{2\sigma\sqrt{6}}\,(-h_{12} + 2h_{23} - h_{31} + 3h_{34} + 3h_{42})

w_5 = w_4
Note that some of the test statistics are identical. This is understandable if one considers
the geometry of the levelling network. A blunder in h12 can not be discriminated from a
blunder in h31 (w3 = w1 ) . Also, a blunder in h34 can not be discriminated from a blunder
in h42 (w5 = w4 ).
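The w-test statistics (4.42), and the loop test (4.43) discussed next, can be reproduced numerically. The sketch below uses the design matrix of (4.40) with assumed sample values for the observations:

import numpy as np

sigma = 0.002                                  # assumed standard deviation of the height differences
A = np.array([[ 1,  0,  0],                    # design matrix of (4.40)
              [-1,  1,  0],
              [ 0, -1,  0],
              [ 0, -1,  1],
              [ 1,  0, -1]], dtype=float)
y = np.array([1.004, 0.497, -1.500, 0.702, -1.199])   # assumed observed h12, h23, h31, h34, h42

Qy = sigma**2 * np.eye(5)
W = np.linalg.inv(Qy)
Qx = np.linalg.inv(A.T @ W @ A)
e = y - A @ Qx @ A.T @ W @ y                   # least-squares residuals
Qe = Qy - A @ Qx @ A.T                         # their variance matrix

def w_test(c):
    return (c @ W @ e) / np.sqrt(c @ W @ Qe @ W @ c)

# data snooping: one w-test statistic per observation
w = np.array([w_test(ci) for ci in np.eye(5)])
print(w)                                       # note w[2] == w[0] and w[4] == w[3]

# constant shift in the first levelling loop, c = (1,1,1,0,0)^T, cf. (4.43)
print(w_test(np.array([1.0, 1.0, 1.0, 0.0, 0.0])),
      (y[0] + y[1] + y[2]) / (np.sqrt(3) * sigma))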
If in addition to potential blunders in the data, one also suspects that, say all three
observed height differences of the levelling loop 1 − 2 − 3 are erroneous by a constant
amount, then also an additional identification test needs to be performed. In this case one
has to make use of the general expression for the w-test statistic. It reads
w = \frac{c^T Q_y^{-1}\hat{e}}{\sqrt{c^T Q_y^{-1} Q_{\hat{e}} Q_y^{-1} c}}
The appropriate c-vector for testing whether a constant shift in the first three observations
occurred or not, reads
c = (1, 1, 1, 0, 0)T
When the expression of the w-test statistic is worked out for the above model, we get
(verify yourself)
w = \frac{h_{12} + h_{23} + h_{31}}{\sigma\sqrt{3}} \qquad (4.43)
Hence, the appropriate test statistic equals the misclosure of the levelling loop 1 − 2 − 3,
divided by its standard deviation.
Adaptation: In case data snooping led to the conclusion that, say, the second obser-
vation was erroneous, one can decide to remeasure this observation and after remeasure-
ment, again apply the whole testing procedure. For the case that all first three observations
are off by a constant amount, one can of course also opt for remeasurement. But instead,
one may also decide to adapt the model. In that case the new model becomes
E\left\{\begin{bmatrix} h_{12} \\ h_{23} \\ h_{31} \\ h_{34} \\ h_{42} \end{bmatrix}\right\} = \begin{bmatrix} 1 & 0 & 0 & 1 \\ -1 & 1 & 0 & 1 \\ 0 & -1 & 0 & 1 \\ 0 & -1 & 1 & 0 \\ 1 & 0 & -1 & 0 \end{bmatrix}\begin{bmatrix} h_2^1 \\ h_3^1 \\ h_4^1 \\ \nabla \end{bmatrix}, \qquad Q_y = \sigma^2 I_5 \qquad (4.44)
One should be aware however, that this change also results in a change for precision and
reliability. Both will become poorer and it depends on the particular application at hand
whether one is willing to accept this or not.
The coordinates in the coordinate system of the free network will be denoted as p, and the coordinates in the coordinate system of the control network as q,

p = (\ldots, x_i, y_i, \ldots)^T, \qquad q = (\ldots, u_i, v_i, \ldots)^T
The free network and the control network will usually overlap. Hence, three types of points
can be discriminated. The points that are part of the free network, but not of the control
network. The points that are part of both the free network and the control network. And
the points that are part of the control network, but not of the free network. The two sets
of coordinates of the free network will be denoted as
p_1 = (x_1, y_1, \ldots, x_{n_1}, y_{n_1})^T, \qquad p_2 = (x_{n_1+1}, y_{n_1+1}, \ldots, x_{n_1+n_2}, y_{n_1+n_2})^T
where p1 contains the coordinates of the n1 points of the free network that are not part of
the overlap, and where p2 contains the coordinates of the n2 points of the free network
that are part of the overlap. Thus the total number of points of the free network is assumed
to be equal to n1 + n2 . As a result of the free network adjustment, we thus have available
E\left\{\begin{bmatrix} p_1 \\ p_2 \end{bmatrix}\right\} = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix}, \qquad D\left\{\begin{bmatrix} p_1 \\ p_2 \end{bmatrix}\right\} = \begin{bmatrix} Q_{p_1} & Q_{p_1 p_2} \\ Q_{p_2 p_1} & Q_{p_2} \end{bmatrix} \qquad (4.45)
The two sets of coordinates of the control network will similarly be denoted as q_2 and q_3, where q_2 contains the coordinates of the n_2 points of the control network that are part of
the overlap, and where q3 contains the coordinates of the n3 points of the control network
that are not part of the overlap. Thus the total number of points of the control network is
assumed to be equal to n2 + n3 . Of the control network, we assume to have available
E\left\{\begin{bmatrix} q_2 \\ q_3 \end{bmatrix}\right\} = \begin{bmatrix} q_2 \\ q_3 \end{bmatrix}, \qquad D\left\{\begin{bmatrix} q_2 \\ q_3 \end{bmatrix}\right\} = \begin{bmatrix} Q_{q_2} & Q_{q_2 q_3} \\ Q_{q_3 q_2} & Q_{q_3} \end{bmatrix} \qquad (4.46)
In order to be able to combine (4.45) with (4.46), we still need to consider the coordi-
nate transformation between the two coordinate systems. Although the type of coordinate
transformation that is needed, depends on the particular application at hand, any coordi-
nate transformation will be of the general form p = F(q, t), where t denotes the vector of
transformation parameters. When applied to the two sets p1 and p2 , we thus have
p_1 = F_1(q_1, t), \qquad p_2 = F_2(q_2, t) \qquad (4.47)
With (4.45), (4.46) and (4.47), we are now in the position to formulate the model of
observation equations for the connection adjustment. It reads
E\left\{\begin{bmatrix} p_1 \\ p_2 \\ q_2 \\ q_3 \end{bmatrix}\right\} = \begin{bmatrix} F_1(q_1, t) \\ F_2(q_2, t) \\ q_2 \\ q_3 \end{bmatrix}, \qquad D\left\{\begin{bmatrix} p_1 \\ p_2 \\ q_2 \\ q_3 \end{bmatrix}\right\} = \begin{bmatrix} Q_{p_1} & Q_{p_1 p_2} & 0 & 0 \\ Q_{p_2 p_1} & Q_{p_2} & 0 & 0 \\ 0 & 0 & Q_{q_2} & Q_{q_2 q_3} \\ 0 & 0 & Q_{q_3 q_2} & Q_{q_3} \end{bmatrix} \qquad (4.48)
Note that the observation equations are linear in q3 , but possibly nonlinear in q1 , q2 and t.
Whether the observation equations are nonlinear or not, depends on the type of coordinate
transformation used. In case of nonlinear observation equations, we still need to apply a
linearization in order to obtain linear(ized) observation equations. Linearization of (4.48)
gives
E\left\{\begin{bmatrix} \Delta p_1 \\ \Delta p_2 \\ \Delta q_2 \\ \Delta q_3 \end{bmatrix}\right\} = \begin{bmatrix} T_1 & 0 & 0 & A_1 \\ 0 & T_2 & 0 & A_2 \\ 0 & I & 0 & 0 \\ 0 & 0 & I & 0 \end{bmatrix}\begin{bmatrix} \Delta q_1 \\ \Delta q_2 \\ \Delta q_3 \\ \Delta t \end{bmatrix} \qquad (4.49)
where
T_1 = \partial_{q_1} F_1(q_1^o, t^o), \quad A_1 = \partial_t F_1(q_1^o, t^o), \qquad T_2 = \partial_{q_2} F_2(q_2^o, t^o), \quad A_2 = \partial_t F_2(q_2^o, t^o) \qquad (4.50)
The structure of these four matrices of course also depends on the type of coordinate
transformation used. As an example, we will show what they look like when the three
dimensional similarity transformation is used.
The three dimensional similarity transformation between two sets of Cartesian coor-
dinates reads
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} = \lambda\, R(\alpha)R(\beta)R(\gamma)\begin{bmatrix} u_i \\ v_i \\ w_i \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix} \qquad (4.51)
with the scale λ , the translation vector (tx , ty, tz)T and the three rotation matrices
R(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{bmatrix}, \quad R(\beta) = \begin{bmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{bmatrix}, \quad R(\gamma) = \begin{bmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}
The matrices A_1 and A_2 then take the form

A_1 = \begin{bmatrix} \lambda^o R^o W_1^o & I_3 \\ \vdots & \vdots \\ \lambda^o R^o W_{n_1}^o & I_3 \end{bmatrix}_{3n_1\times 7}, \qquad A_2 = \begin{bmatrix} \lambda^o R^o W_{n_1+1}^o & I_3 \\ \vdots & \vdots \\ \lambda^o R^o W_{n_1+n_2}^o & I_3 \end{bmatrix}_{3n_2\times 7} \qquad (4.52)

where R^o = R(\alpha^o)R(\beta^o)R(\gamma^o) and where the 3 × 4 matrix W_i^o collects the partial derivatives of the similarity transformation with respect to the scale and the three rotation angles, evaluated at the approximate transformation parameters and the approximate coordinates (u_i^o, v_i^o, w_i^o) of point i.
We conclude this section with some remarks.
Remark 1: In the above linearization, no particular assumptions were made about the
values of the approximate transformation parameters. In some applications it may happen
however, that the two coordinate systems differ only slightly in scale and orientation. In
that case one can choose as approximate values λ o = 1, α o = β o = γ o = 0. Note that this
results in a considerable simplification of the above matrices. We then have λ o Ro = I3
and
W_i^o = \begin{bmatrix} u_i^o & 0 & -w_i^o & v_i^o \\ v_i^o & w_i^o & 0 & -u_i^o \\ w_i^o & -v_i^o & u_i^o & 0 \end{bmatrix}
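A small sketch constructing this simplified W_i^o and the corresponding 3 × 7 design-matrix block [W_i^o, I_3] of (4.52), for assumed approximate coordinates of a point; the parameter ordering (Δλ, Δα, Δβ, Δγ, Δt_x, Δt_y, Δt_z) is an assumption of the sketch:

import numpy as np

def W_simplified(u, v, w):
    """W_i^o of Remark 1 (approximate values lambda = 1, alpha = beta = gamma = 0)."""
    return np.array([[u, 0.0, -w,  v],
                     [v,  w, 0.0, -u],
                     [w, -v,   u, 0.0]])

def design_block(u, v, w):
    """3 x 7 block of the design matrix for one point: [W_i^o, I3], cf. (4.52)."""
    return np.hstack([W_simplified(u, v, w), np.eye(3)])

print(design_block(3821.0, 604.0, 5100.0))   # assumed approximate coordinates of a point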
Remark 2: When the similarity transformation is used in the connection model (4.48)
with all its transformation parameters included, one should recognize that the scale, the
orientation and the location of the free network is abandoned in favour of the scale, the
orientation and the location of the control network. Thus only the shape of the free net-
work will then contribute to the determination of the geometry of the connected network.
Remark 3: Above it was assumed that all transformation parameters of the similarity
transformation are unknown. In some applications it may happen however, that all or
some of these transformation parameters are known. For instance, it may happen that
one knows (from an earlier adjustment) the relative orientation and scale between the
WGS84 coordinate system and the National coordinate system. This would imply that
if a free GPS network is expressed in the WGS84 system and the coordinates of the
control network in the National coordinate system, that the scale and three orientation
parameters are not needed as unknowns in the model of observation equations. If they are
known with a sufficient precision, they could be treated as constants. The only remaining
transformation parameters would then be the three translation parameters.
Remark 4: Above it was assumed that the coordinates of the control network are of the
Cartesian type. This need not be the case of course. The model is easily adapted, however, for the case that one uses coordinates other than Cartesian coordinates. One only needs to substitute
in the above nonlinear model of observation equations, the relation that exists between
Cartesian coordinates and the type of coordinates that one needs to use.
Remark 5: In the model (4.48), it was assumed that no correlation exists between the
coordinates of the free network and the coordinates of the control network. For most
practical applications this assumption is realistic, since the measurement processes that
produced the coordinates of the two networks can usually be assumed to be independent.
Remark 6: In the model (4.48) it was also assumed that the variance matrix of the co-
ordinates of the control network, Qq , is available. In practice this may not be the case
however. Fortunately, for the constrained connection adjustment it is not needed. In that
case the adjustment is performed with Qq = 0. For the statistical testing of the model
(4.48), it is needed however. In that case one will have to work with a substitute ’vari-
ance matrix’, that then hopefully will give a sufficiently good approximation to the actual
variance matrix Qq .
Remark 7: In the connection model (4.48) we included the coordinates of the nonover-
lapping points of the control network, q3 . In the remaining part of these lecture notes,
they will be disregarded however. These coordinates do not contribute to the redundancy
of the model and therefore also not to the results of statistical testing, and they remain
unchanged when a constrained connection adjustment with Qq = 0 is applied.
The redundancy of this model equals (2n2 − nt ) in two dimensions and (3n2 − nt ) in
three dimensions, where nt is the dimension of the vector of transformation parameters.
98 Network Quality Control
Thus with the full similarity transformation in two dimensions, we have a redundancy of
(2n2 − 4). This shows that in the overlap of the two networks, a minimum of two points
is needed.
We can reduce the number of unknown parameters in this model, if we eliminate Δq2
by making use of the coordinate differences \Delta d = \Delta p_2 - T_2\,\Delta q_2. This gives

H_o: \; E\{\Delta d\} = A_2\,\Delta t, \qquad D\{\Delta d\} = Q_{p_2} + T_2 Q_{q_2} T_2^T \qquad (4.54)

This model will be our null hypothesis H_o. As alternative hypothesis H_a, we will consider the model

H_a: \; E\{\Delta d\} = \begin{bmatrix} A_2 & C \end{bmatrix}\begin{bmatrix} \Delta t \\ \nabla \end{bmatrix}, \qquad D\{\Delta d\} = Q_{p_2} + T_2 Q_{q_2} T_2^T \qquad (4.55)
where ∇ is an r × 1 vector of assumed model errors and C is a matrix that specifies how
the model errors are related to the observations. From our chapter on statistical testing, we
know that the above null hypothesis can be tested against the above alternative hypothesis,
using the test statistic
T_r = \hat{e}_d^T Q_d^{-1} C\left(C^T Q_d^{-1} Q_{\hat{e}_d} Q_d^{-1} C\right)^{-1} C^T Q_d^{-1}\,\hat{e}_d \qquad (4.56)

where ê_d is the least-squares residual vector of Δd and Q_{ê_d} its variance matrix. Assuming normally distributed observables, one will decide to reject H_o on the basis of H_a when

T_r > \chi_\alpha^2(r, 0)

with \chi_\alpha^2(r, 0) the critical value of the central Chi-square distribution, having r degrees of freedom.
Usually one will start with the most relaxed alternative hypothesis. This is the hypoth-
esis for which the matrix (A2 ,C) is square and regular. The corresponding test statistic
reads
T = \hat{e}_d^T Q_d^{-1}\,\hat{e}_d
Under the null hypothesis Ho , it has a central Chi-square distribution with the degrees of
freedom being equal to the redundancy of the model under Ho . The test corresponding to
the most relaxed alternative hypothesis, is usually referred to as the overall model test and
its purpose is to detect unspecified model errors.
When the overall model test leads to a rejection of the null hypothesis, there is a high
likelihood that the model under Ho has been specified incorrectly. Potential model errors
are: (1) the assumed relation between the two coordinate systems is wrong; (2) the shape
of the free network differs significantly from the shape of the control network, due to,
for instance, an error in one or more of the coordinates; (3) the assumptions about the
stochastic model are incorrect, due to, for instance, a too optimistically specified variance
matrix of the coordinates.
In the following we will restrict ourselves to model errors in the functional model.
They can be specified by means of the matrix C. We will now discuss some examples.
Errors in the coordinates: Since the coordinates form the observations in the connec-
tion model, it seems reasonable to first check the coordinates on possible errors. First
note that one will never be able to find errors in the coordinates of the points in the two
nonoverlapping parts of the two networks. These coordinates do not contribute to the
redundancy of the model and will therefore not be part of the test statistics. Fortunately,
these errors will also have no influence on the adjusted coordinates of the other points
after the connection.
Secondly note, that one will also not be able to discriminate whether an error has oc-
curred in the coordinates of the free network or in the coordinates of the control network.
This is due to the fact that the observations in the model (4.55) are formed from coordinate
differences.
In order to check for a potential error in the coordinates, the simplest alternative hy-
pothesis is the one for which it is assumed that only one coordinate of one of the points
is wrong. In this case, ∇ is a scalar (r = 1) and matrix C becomes a vector, which will be
denoted by the lowercase character c. Hence, instead of using (4.56) one can in this case
also use its square-root, being the w-test statistic
w = \frac{c^T Q_d^{-1}\hat{e}_d}{\sqrt{c^T Q_d^{-1} Q_{\hat{e}_d} Q_d^{-1} c}}
For the three dimensional case, the c-vector then takes one of the following three forms
cxi = (0 0 0 . . .1 0 0 . . .0 0 0)T
cyi = (0 0 0 . . .0 1 0 . . .0 0 0)T
czi = (0 0 0 . . .0 0 1 . . .0 0 0)T
where i refers to the point being tested. In this way one can systematically check all
coordinates of all overlapping points on potential errors.
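A sketch of this screening for an assumed three-point overlap with a translation-only transformation and illustrative variance matrices; the c-vector picks out the x-coordinate of the point being tested:

import numpy as np

n2 = 3                                       # assumed number of points in the overlap (3D)
A2 = np.tile(np.eye(3), (n2, 1))             # translation-only transformation: A2 = [I3; I3; I3]
Qd = (0.01**2 + 0.02**2) * np.eye(3 * n2)    # assumed Q_d = Q_p2 + T2 Qq2 T2^T
d = np.array([0.012, -0.004, 0.008,          # assumed coordinate differences per point
              0.055,  0.001, 0.006,
              0.009, -0.007, 0.011])

W = np.linalg.inv(Qd)
Qt = np.linalg.inv(A2.T @ W @ A2)
e = d - A2 @ Qt @ A2.T @ W @ d               # residuals of the connection adjustment
Qe = Qd - A2 @ Qt @ A2.T

def w_test(c):
    return (c @ W @ e) / np.sqrt(c @ W @ Qe @ W @ c)

# screen the x-coordinate of every overlapping point
for i in range(n2):
    c = np.zeros(3 * n2); c[3 * i] = 1.0
    print(f"point {i+1}: w_x = {w_test(c):.2f}")

In this illustrative data set the second point carries a 5.5 cm offset in x, and its w-test statistic stands out accordingly.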
Point identification errors: An error in only one coordinate of a point may occur due to
typing errors. But in other cases of course, it is more likely, when an error occurs, that it
will not be confined to a single coordinate only, but instead will affect all coordinates of
a point. Such a type of error occurs when one erroneously believes that the free network
coordinates and the control network coordinates indeed refer to the same point. These
types of errors are called point identification errors.
In order to test for these types of errors, the C-matrix takes the form
C_i = \begin{bmatrix} 0 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 & 1 & \cdots & 0 & 0 & 0 \end{bmatrix}^T
Height or eccentricity error: In the case of GPS it regularly happens that an error is made in the measured GPS antenna height. The c-vector then takes the form

c_i = \left(0\ \ldots\ 0\ \; h_{x_i}\ h_{y_i}\ h_{z_i}\ \; 0\ \ldots\ 0\right)^T
where the vector (hxi , hyi , hzi )T specifies the direction of the potential antenna height offset
of point i in the approximate coordinate system. In case of an eccentricity error, the C-
matrix has two columns,
C_i = \begin{bmatrix} 0 & \cdots & 0 & e_{x_i} & e_{y_i} & e_{z_i} & 0 & \cdots & 0 \\ 0 & \cdots & 0 & f_{x_i} & f_{y_i} & f_{z_i} & 0 & \cdots & 0 \end{bmatrix}^T
where the two vectors (exi , eyi , ezi )T and ( f xi , f yi , f zi )T span the plane in which the eccen-
tricity error of point i is supposed to have taken place.
Thus only the translation vector is included in the coordinate transformation. Now assume
that the null hypothesis gets rejected by the overall model test and that one suspects that
the two coordinate systems, apart from differing in their location, differ also slightly in
scale. In that case one will have to oppose the above null hypothesis to the following
alternative hypothesis
H_a: \; E\left\{\begin{bmatrix} \vdots \\ \Delta x_i - \Delta u_i \\ \Delta y_i - \Delta v_i \\ \Delta z_i - \Delta w_i \\ \vdots \end{bmatrix}\right\} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & 0 & u_i^o \\ 0 & 1 & 0 & v_i^o \\ 0 & 0 & 1 & w_i^o \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}\begin{bmatrix} \Delta t_x \\ \Delta t_y \\ \Delta t_z \\ \Delta\ln\lambda \end{bmatrix}
The model is thus enlarged with one additional column, being the column which models
the scale difference between the two coordinate systems. This column vector is thus the appropriate choice for the c-vector,

c = (\ldots,\ u_i^o,\ v_i^o,\ w_i^o,\ \ldots)^T

Note that λ^o has been taken equal to one, since the difference in scale between the two
coordinate systems, although potentially significant, was still assumed to be small.
In the previous section the validation of the connection model was considered. In the
present section we will consider the estimation problem. For validation, we could restrict
our attention to the coordinates of the overlapping points. For estimation however, we
need to take the coordinates of the nonoverlapping points of the free network into account
as well. After all, the purpose of the connection is to obtain the coordinates of these newly
established points in the coordinate system of the control network. Thus instead of the
model (4.54), we will now work with the enlarged model
E\left\{\begin{bmatrix} \Delta p_1 \\ \Delta d \end{bmatrix}\right\} = \begin{bmatrix} T_1 & A_1 \\ 0 & A_2 \end{bmatrix}\begin{bmatrix} \Delta q_1 \\ \Delta t \end{bmatrix}, \qquad D\left\{\begin{bmatrix} \Delta p_1 \\ \Delta d \end{bmatrix}\right\} = \begin{bmatrix} Q_{p_1} & Q_{p_1 p_2} \\ Q_{p_2 p_1} & Q_d \end{bmatrix} \qquad (4.57)
Remember that the coordinate differences of the overlapping points are contained in Δd =
Δp2 − T2 Δq2 , which has as variance matrix, Qd = Q p2 + T2 Qq2 T2T .
The above model can be solved using the standard least-squares algorithm. As a
result one would obtain the unconstrained least-squares solution. In practice however,
circumstances often dictate that the coordinates of the control network remain unchanged.
This can be accomplished by applying the standard least-squares algorithm to the above
model, with the additional constraints that
Qq2 := 0 and Δq2 := 0 (4.58)
Setting Qq2 to zero, implies that no least-squares correction is given to Δq2 . This together
with the setting of Δq2 to zero, implies that the coordinates of the control points in the
overlap remain fixed to their original values (note: the original values are used as approx-
imate values in the linearization). In order to discriminate between the unconstrained and
the constrained least-squares solution, we will use the superscript c for the constrained
case. Thus Δd and Δd c have the same variance matrix, namely Qd , but their sample
values differ. They are given respectively as
Δd = Δp2 − T2 Δq2 and Δd c = Δp2
Instead of solving the above model in one step, we will solve it in three steps. This
will help us in getting a somewhat better insight in the various aspects of the connection
adjustment.
The first step: We already remarked that the coordinates of the nonoverlapping points
of the free network do not contribute to the redundancy of the above model. Likewise,
these coordinates also do not contribute to the determination of the transformation param-
eters. Only the coordinates of the overlapping points contribute to the determination of
the transformation parameters. We therefore start by first considering
E{Δd} = A2 Δt , D{Δd} = Qd (4.59)
From it the unconstrained least-squares estimator of the vector of transformation param-
eters follows as
\Delta\hat{t} = (A_2^T Q_d^{-1} A_2)^{-1} A_2^T Q_d^{-1}\,\Delta d, \qquad Q_{\hat{t}} = (A_2^T Q_d^{-1} A_2)^{-1} \qquad (4.60)
Note that in the application of the error propagation law, the randomness of Δq2 has been
taken into account. After all, although we constrain the solution to the coordinates of
the control network, these coordinates are still of a stochastic nature. We know that a
least-squares solution that takes the stochasticity of all variables into account will give
estimators that are of minimal variance. In the above constrained least-squares solution
this is not the case. Thus

Q_{\hat{t}^{\,c}} \ge Q_{\hat{t}}

showing that the transformation parameters of the unconstrained solution are of a better
precision than the ones of the constrained solution. This is thus the price one pays, when
one is forced to keep the coordinates of the control network unchanged.
The unconstrained solution for the coordinate differences and their least-squares resid-
uals are given as
\Delta\hat{d} = A_2\,\Delta\hat{t}, \qquad Q_{\hat{d}} = A_2 Q_{\hat{t}} A_2^T

\hat{e}_d = \Delta d - \Delta\hat{d}, \qquad Q_{\hat{e}_d} = Q_d - A_2 Q_{\hat{t}} A_2^T \qquad (4.62)
As we have seen in the previous section, these least-squares residuals form the basis for
the statistical testing of the validity of the connection model.
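The first step is summarized in the following sketch (illustrative numbers only), which estimates a translation-plus-scale transformation from the coordinate differences of the overlapping points and forms the residuals of (4.62):

import numpy as np

# assumed approximate coordinates of the overlapping points in the control system
q2 = np.array([[0.0, 0.0, 0.0],
               [1000.0, 0.0, 20.0],
               [0.0, 800.0, -15.0],
               [900.0, 750.0, 5.0]])
n2 = q2.shape[0]

# A2 for a translation-plus-scale transformation: columns (dtx, dty, dtz, dln_lambda)
A2 = np.hstack([np.tile(np.eye(3), (n2, 1)), q2.reshape(-1, 1)])
Qd = 1e-4 * np.eye(3 * n2)                     # assumed variance matrix Q_d
d = np.array([0.010, 0.002, -0.003,            # assumed coordinate differences Delta d
              0.014, 0.001, -0.002,
              0.009, 0.006, -0.004,
              0.013, 0.005, -0.001])

W = np.linalg.inv(Qd)
Qt = np.linalg.inv(A2.T @ W @ A2)              # Q_that of (4.60)
t_hat = Qt @ A2.T @ W @ d                      # unconstrained transformation parameters (4.60)
e_d = d - A2 @ t_hat                           # residuals (4.62)

print("t_hat:", t_hat)
print("standard deviations:", np.sqrt(np.diag(Qt)))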
The second step: In this second step we will determine the solution for the coordinates
of the free network in the nonoverlapping part. We know that Δp1 does not contribute to
the redundancy of the model. If in addition, Δp1 would not correlate with Δp2 and thus
not with Δd, then Δ\hat{p}_1 = Δp_1 would hold true. That is, in that case the coordinates of the
free network in the nonoverlapping part would not change. But since Q_{p_1 p_2} \neq 0,
it follows that these coordinates do change. In fact, their residual vector can be computed
from ê_d as

\hat{e}_{p_1} = Q_{p_1 p_2} Q_d^{-1}\,\hat{e}_d
Of course if the connection model fails to have any redundancy at all, then êd = 0 and
Δ p̂1 = Δp1 . This would for instance happen when in the two dimensional case, the con-
nection model is based on the full similarity transformation and the number of points in
the overlap equals only two.
The constrained least-squares solution reads
The third step: After the unconstrained adjustment of the connection model (4.57), we
have
Since matrix T1 is invertible, it follows after substitution of (4.60) and (4.63), that
\Delta\hat{q}_1 = T_1^{-1}\left(\begin{bmatrix} I & -Q_{p_1 p_2} Q_d^{-1} \end{bmatrix}\left(\begin{bmatrix} \Delta p_1 \\ \Delta d \end{bmatrix} - \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}\Delta\hat{t}\right)\right) \qquad (4.65)
This is the final unconstrained least-squares solution of the coordinates of the nonoverlap-
ping points of the free network, expressed in the coordinate system of the control network.
This is thus the solution one gets for the coordinates of the nonoverlapping points, when
the model (4.57) is solved using the standard least-squares algorithm.
In case of a constrained adjustment of the connection model (4.57) we have
In this case the transformation A_1\Delta\hat{t}^{\,c} is directly applied to \Delta p_1, without taking the vector of least-squares residuals \hat{e}_d^{\,c} = \Delta d^{\,c} - \Delta\hat{d}^{\,c} into account.
No redundancy: The above simplified expression is also obtained in case there is no
redundancy in the connection model. An absence of redundancy implies that the points
in the overlap have as many coordinates as there are transformation parameters. Hence,
there is no need for an adjustment to determine the transformation parameters. They are
uniquely determined from the coordinates of the points in the overlap and the least-squares
residual vector êcd is identically zero.
No coordinate transformation: In case the two coordinate systems of the free network
and the control network coincide, the transformation parameters will be absent from the
model and T1 will be equal to the identity matrix. In that case the above constrained
solution simplifies to
The correction to the coordinates of Δp1 is now only based on the difference between the
coordinates of the points in the overlap.
We already remarked that the precision of the constrained solution is less than that of the unconstrained solution, Q_{q̂_1^c} > Q_{q̂_1}. This need not be dramatic. It is simply a consequence of the fact that by constraining the solution, one is in fact using a less than optimal estimator. But what is important to recognize though, is that the constrained least-squares solution can be poorer than the solution one had before the connection was carried out.
This is easily shown for the above solution (4.67). Application of the error propagation
law to (4.67) gives
Q_{\hat{q}_1^c} = Q_{p_1} - Q_{p_1 p_2} Q_{p_2}^{-1}\left(Q_{p_2} - Q_{q_2}\right) Q_{p_2}^{-1} Q_{p_2 p_1}
But this means, if the coordinates of the control network are of a poorer precision than the coordinates of the free network (Q_{p_2} < Q_{q_2}), that the adjusted coordinates, when constraining is applied, will have a precision which is poorer than before the adjustment was carried out (Q_{q̂_1^c} > Q_{p_1}). This is the reason why in surveying one works from the
’large to the small’. That is, in the densification process, the free networks are connected
to ’higher order’ control networks, i.e. networks that are of a better precision than the free
networks.
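This effect is easy to verify numerically. The sketch below uses hypothetical one-dimensional (co)variances, so that the matrices reduce to scalars, and shows that Q_{q̂_1^c} indeed exceeds Q_{p_1} as soon as Q_{p_2} < Q_{q_2}.

# Scalar illustration of Qq1c = Qp1 - Qp1p2 Qp2^{-1} (Qp2 - Qq2) Qp2^{-1} Qp2p1
Qp1, Qp2, Qp1p2 = 4.0, 2.0, 1.5   # free-network (co)variances (assumed values)
Qq2 = 3.0                         # control-network variance, poorer than Qp2 (Qp2 < Qq2)

Qq1c = Qp1 - Qp1p2 * (1.0 / Qp2) * (Qp2 - Qq2) * (1.0 / Qp2) * Qp1p2
print(Qq1c, Qq1c > Qp1)           # 4.5625 > 4.0: the connected solution is poorer than Qp1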
The two overlapping levelling networks are thus assumed to differ in height only. The
redundancy of the model equals (n − 1). First we will give the least-squares estimators of
Hi and t, and then we will discuss the detection and identification step.
Least-squares estimators: After forming the system of normal equations of the above
model and solving it for the translation parameter, we get
t̂ = h̄ − H̄ ,    σ_{t̂}² = (1/n)(σ_h² + σ_H²)        (4.70)

where

h̄ = (1/n) ∑_{i=1}^n h_i ,    H̄ = (1/n) ∑_{i=1}^n H_i
are the average heights of the two networks. In this particular case, the least-squares
estimator of the translation parameter thus equals the difference of the height averages of
the two networks.
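As a numerical illustration of (4.70), the following sketch computes t̂ and its variance from two small sets of overlap heights; the height values and standard deviations are assumed, not taken from the text.

import numpy as np

h = np.array([10.02, 12.51, 9.48, 11.03])   # heights in the free network (assumed, m)
H = np.array([12.00, 14.50, 11.45, 13.02])  # heights of the same points in the control system (assumed, m)
sigma_h, sigma_H = 0.005, 0.010             # standard deviations (assumed, m)

n = len(h)
t_hat = h.mean() - H.mean()                 # t_hat = mean(h) - mean(H), eq. (4.70)
var_t_hat = (sigma_h**2 + sigma_H**2) / n   # variance of t_hat, eq. (4.70)
print(t_hat, var_t_hat)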
When we solve the system of normal equations for the height coordinate Hi of point
i, we get
Ĥ_i = H_i − [σ_H²/(σ_h² + σ_H²)] [(H_i − H̄) − (h_i − h̄)]
σ_{Ĥ_i}² = σ_H² [1 − (σ_H²/(σ_h² + σ_H²))(1 − 1/n)]        (4.71)

The corresponding least-squares residual of H_i reads

ê_{H_i} = [σ_H²/(σ_h² + σ_H²)] [(H_i − H̄) − (h_i − h̄)]
It is constructed from a difference of two differences. The two differences are the height
of point i with respect to the average height in the first height system and the height of
point i with respect to the average height in the second height system. Note that the
residual gets smaller when σ_H² gets smaller. This is understandable. The more precise the
H_i coordinate is, the more confidence one has in it and the less one is willing to give it a
large correction. In the limiting case σ_H² = 0, we have Ĥ_i = H_i. In that case no correction
is applied at all. If we consider the other limiting case, σ_h² = 0, then of course the h_i
coordinate would not get a correction, ĥ_i = h_i. In this case we also would have

Ĥ_i = h_i + H̄ − h̄ = h_i − t̂

thus showing that Ĥ_i is now obtained from simply applying the translation estimator to h_i.
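Continuing with the same assumed numbers, the sketch below evaluates (4.71): the adjusted heights Ĥ_i, their variance, and the residuals ê_{H_i}.

import numpy as np

h = np.array([10.02, 12.51, 9.48, 11.03])   # free-network heights (assumed, m)
H = np.array([12.00, 14.50, 11.45, 13.02])  # control heights (assumed, m)
sigma_h, sigma_H = 0.005, 0.010             # standard deviations (assumed, m)
n = len(h)

w = sigma_H**2 / (sigma_h**2 + sigma_H**2)          # weight factor in (4.71)
diff = (H - H.mean()) - (h - h.mean())              # difference of the two differences
H_hat = H - w * diff                                # adjusted heights of (4.71)
var_H_hat = sigma_H**2 * (1 - w * (1 - 1/n))        # variance of the adjusted heights
e_H = w * diff                                      # least-squares residuals
print(H_hat, var_H_hat, e_H)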
Detection: To be able to test the above model, the least-squares residuals should at least not
be identical to zero, which would happen when there is no redundancy. Since the redundancy
equals (n − 1), we thus need more than one point in the two overlapping networks.
To test the above model, we first consider its overall validity. The general form of the
appropriate test statistic for detection reads T_redundancy = ê^T Q_y^{-1} ê. When this expression
is worked out for the above model we get

T_{n−1} = [ ∑_{i=1}^n (H_i − h_i)² − n(H̄ − h̄)² ] / (σ_h² + σ_H²)        (4.72)
Note that the term n(H̄ − h̄)² would be absent in case the translation parameter t would
be absent from the model. In that case the test statistic would simply equal the sum of
the squared height differences divided by the sum of the two variances. The redundancy
would then also have increased by one to n.
With the above test statistic, the model is invalidated or rejected when

T_{n−1} > χ²_α(n − 1, 0)        (4.73)
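A minimal sketch of the detection step (4.72)-(4.73), again with assumed heights and standard deviations; the critical value of the central chi-square distribution is obtained with scipy.

import numpy as np
from scipy.stats import chi2

h = np.array([10.02, 12.51, 9.48, 11.03])    # free-network heights (assumed, m)
H = np.array([12.00, 14.50, 11.45, 13.02])   # control heights (assumed, m)
sigma_h, sigma_H, alpha = 0.005, 0.010, 0.05
n = len(h)

# Test statistic (4.72)
T = (np.sum((H - h)**2) - n * (H.mean() - h.mean())**2) / (sigma_h**2 + sigma_H**2)

# Rejection criterion (4.73): compare with the chi-square critical value, n-1 degrees of freedom
k = chi2.ppf(1 - alpha, df=n - 1)
print(T, k, "rejected" if T > k else "accepted")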
When rejection occurs, there are of course many types of misspecifications in the model
that could have caused it. The misspecifications could be in the functional and/or the
stochastic model. For instance, when the stochastic model is too optimistic, the variances
are too small and the value of T_{n−1} would be unrealistically large. As a result, one could
have an unjustified rejection of the model. This is the reason why the variance σ_H² is not
set to zero when testing the connection model. Thus the unconstrained connection is used
for testing, while the constrained connection, with σ_H² = 0, is used for the actual coordinate
computation.
Instead of having a too optimistic stochastic model, one could of course also have used
a too pessimistic stochastic model. In that case the variances are too large and Tn−1 would
be unrealistically small. If one suspects this to be the case, then the above one-sided test
should be replaced by its two-sided counterpart. Since the expectation of T n−1 /(n − 1)
equals one under the null hypothesis, one then tests whether this ratio is significantly
larger or smaller than one. We will however assume that the stochastic model is correct
and restrict our attention to misspecifications in the functional model.
Identification: If we assume that the rejection of the model could have been caused by
an error in the coordinate set of the H-system, the c-vector of the w-test statistic will have
the form
c = (0^T, c_H^T)^T   with   c_H = (c_{H_1}, …, c_{H_n})^T        (4.74)
Data snooping: If we want to screen the individual H_i coordinates for the presence of
blunders, the c_H-vector takes the form

c_H = (0, …, 0, 1, 0, …, 0)^T

with the 1 located at the i-th position.
The size of the blunder that can be found with a probability γ = 0.80 using this test statistic
with a level of significance α1 = 0.001, is given by the MDB,
|∇_i| = √[ 17.075(σ_h² + σ_H²) / (1 − 1/n) ]        (4.78)
This is the MDB which we already met earlier in the section on Reliability.
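The MDB (4.78) is a one-line computation. In the sketch below, the value 17.075 is the non-centrality parameter belonging to α1 = 0.001 and γ = 0.80, while the standard deviations and the number of overlap points are assumed values.

import math

sigma_h, sigma_H = 0.005, 0.010   # standard deviations (assumed, m)
n = 10                            # number of points in the overlap (assumed)
lam0 = 17.075                     # non-centrality parameter for alpha1 = 0.001, gamma = 0.80

# MDB of a blunder in a single H_i coordinate, eq. (4.78)
mdb = math.sqrt(lam0 * (sigma_h**2 + sigma_H**2) / (1 - 1/n))
print(mdb)                        # minimal detectable bias in metres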
Testing for scale: If one suspects that the model was rejected due to the erroneous
assumption that scale is absent, the appropriate c-vector for testing for the presence of scale is
The size of the scale factor that can then be found with probability γ = 0.80, is given by
the MDB

|∇_λ| = √[ 17.075(σ_h² + σ_H²) / ∑_{i=1}^n (H_i^0 − H̄^0)² ]        (4.79)

This shows that the presence of scale becomes better identifiable when ∑_{i=1}^n (H_i^0 − H̄^0)²
is large, that is, when the network has large height differences with respect to the average
height.
4.5 Summary
As a conclusion to this chapter, we have summarized the main steps involved in four
flow diagrams. In figure 4.2, the adjustment and testing steps are shown for both the free
network and the connected network (note: the constrained connection is also referred
to as a pseudo least-squares connection).

Figure 4.2: flow diagram of the free network phase (formulation of the observation model, free network adjustment and testing of the observations) followed by the connection phase (least-squares connection adjustment and testing with the known control points, and the pseudo least-squares connection yielding the resulting coordinates).

The model used for the connection is usually
based on a coordinate transformation. After all, the coordinates of the free network may
not be defined in the same coordinate system as those of the control network. In case
of connecting a GPS-based free network to existing control, one is dealing with WGS84
coordinates and with the coordinates of the National Reference System (NatRS). The
model for the connection can be formulated in two ways. Either one formulates it so that
the results are expressed in the WGS84 coordinate system, or one formulates it so that the
results are expressed in the NatRS coordinate system. The first formulation is shown in
figure 4.3 and the second formulation is shown in figure 4.4. With the first formulation
one of course will need an additional (back) transformation step to get the final results
expressed in the NatRS coordinate system.
The various steps involved in the transformation between the two systems are shown
in figure 4.5.
Figure 4.3: flow diagram of the first formulation (results in the WGS84 system): known control points in the local system, pseudo least-squares adjustment, coordinates in WGS84, transformation WGS84 to local, coordinates in the local system.

Figure 4.4: flow diagram of the second formulation (results in the local system): GPS baselines in WGS84, transformation WGS84 to local, GPS baselines in the local system, pseudo least-squares adjustment, coordinates in the local system.

Figure 4.5: flow diagram of the transformation steps: GPS baseline network, transformation to ellipsoidal coordinates, ellipsoidal coordinates in the NatRS.
Appendix
In this appendix, we consider the first two moments of a random variable, the mean and
the variance. We also show how they propagate when functions of the random variable
are taken.
First we consider the mean of a scalar random variable and give the propagation law
of the mean when the function is either linear or nonlinear. In the latter case, use is made
of a linearization. Then we do the same for the variance of a scalar random variable. After
the scalar case has been treated, we generalize to the vectorial case.
Mean of a scalar random variable: Let x be a continuous scalar random variable with
probability density function px (x). The expectation or mean of x is by definition the
integral
E{x} = ∫_{−∞}^{+∞} x p_x(x) dx        (A.1)

Given a function f(x), we can form a new random variable y = f(x), whose mean is, by the same definition,

E{y} = ∫_{−∞}^{+∞} y p_y(y) dy        (A.2)

It appears therefore, that to determine the mean of y, we must first find its probability
density function p_y(y). This, however, is not necessary. As the next theorem shows, E{y}
can be expressed directly in terms of the function f(x) and the density p_x(x) of x.
Theorem:
E{f(x)} = ∫_{−∞}^{+∞} f(x) p_x(x) dx        (A.3)
Proof: We shall sketch a proof using the curve f (x) of figure A.1. With y = f (x1 ) =
f (x2 ) = f (x3 ) as in the figure, we see that
P(y ≤ y ≤ y + dy) = P(x_1 ≤ x ≤ x_1 + dx_1) + P(x_2 − dx_2 ≤ x ≤ x_2) + P(x_3 ≤ x ≤ x_3 + dx_3)

or

p_y(y) dy = p_x(x_1) dx_1 + p_x(x_2) dx_2 + p_x(x_3) dx_3

Figure A.1: the function y = f(x), with the points x_1, x_2, x_3 that are mapped into the interval (y, y + dy).

Multiplying by y, we obtain

y p_y(y) dy = f(x_1) p_x(x_1) dx_1 + f(x_2) p_x(x_2) dx_2 + f(x_3) p_x(x_3) dx_3
Thus, to each differential in (A.2) there corresponds one or more differentials in (A.3). As
dy covers the y-axis, the corresponding dx’s are nonoverlapping and they cover the entire
x-axis. Hence, the integrals in (A.2) and (A.3) are equal.
It appears from (A.3) that to determine the mean of y, we must know the probability
density function p_x(x). In general this is true. However, in the special case that the
function f(x) is linear, the complete density of x need not be known; only the mean
m_x of x is needed.
Theorem (propagation law of the mean): Given a scalar random variable x and a func-
tion f (x), we form the random variable y = f (x). If the function f (x) is linear,
f(x) = ax + b        (A.4)

then

m_y = a m_x + b        (A.5)

Theorem (linearized propagation law of the mean): Given a scalar random variable x and a
nonlinear function f(x), we form the random variable y = f(x). Let x_0 be an approximation
to a sample of x and define Δy = y − f(x_0) and Δx = x − x_0. Then we have to a first order

m_Δy ≐ (d/dx) f(x_0) m_Δx        (A.6)
Proof: Application of Taylor’s formula gives

y = f(x_0) + (d/dx) f(x_0)(x − x_0) + (1/2)(d²/dx²) f(x_0)(x − x_0)² + …

If we take the expectation we get

E{y} = f(x_0) + (d/dx) f(x_0) E{(x − x_0)} + (1/2)(d²/dx²) f(x_0) E{(x − x_0)²} + …

This may be written with

E{(x − x_0)} = m_x − x_0 = m_Δx

and

E{(x − x_0)²} = E{((x − m_x) − (x_0 − m_x))²}
             = E{(x − m_x)²} − 2E{(x − m_x)(x_0 − m_x)} + E{(x_0 − m_x)²}
             = E{(x − m_x)²} + m_Δx²
             = σ_x² + m_Δx²

as

m_Δy = (d/dx) f(x_0) m_Δx + (1/2)(d²/dx²) f(x_0) [σ_x² + m_Δx²] + …
If we neglect the second and higher order terms, the result (A.6) follows.
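The first-order result (A.6) can be checked by simulation. The sketch below compares the linearized m_Δy with a Monte Carlo estimate for an assumed nonlinear function and an assumed normal distribution of x.

import numpy as np

rng = np.random.default_rng(0)
f = np.sin                            # assumed nonlinear function f(x)
df = np.cos                           # its derivative d/dx f(x)
mx, sigma_x, x0 = 0.52, 0.02, 0.50    # assumed mean, standard deviation and approximate value

x = rng.normal(mx, sigma_x, 200_000)
m_dy_mc = np.mean(f(x) - f(x0))       # Monte Carlo estimate of E{y} - f(x0)
m_dy_lin = df(x0) * (mx - x0)         # linearized propagation law (A.6)
print(m_dy_mc, m_dy_lin)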
Examples
This number will also be denoted by σ_x². The positive constant σ_x is called the standard
deviation of x. From the definition it follows that

σ_x² = E{x²} − 2m_x E{x} + m_x²

or

σ_x² = E{x²} − m_x²
Theorem (propagation law of the variance): Given a scalar random variable x and a
function f (x), we form the random variable y = f (x). If the function f (x) is linear,
f(x) = ax + b        (A.9)

then

σ_y² = a² σ_x²        (A.10)

Proof:

σ_y² = ∫_{−∞}^{+∞} [(ax + b) − (a m_x + b)]² p_x(x) dx = a² ∫_{−∞}^{+∞} (x − m_x)² p_x(x) dx = a² σ_x²
The above result shows that if f (x) is linear, knowledge of σx2 is sufficient for computing
the variance of y = f (x). For nonlinear functions f (x) this is generally not true. If the
function f (x) is nonlinear one will generally need to know the complete density px (x)
of x. However, by using Taylor’s formula an approximation to the variance of y can be
derived that gets round the difficulty of having to know px (x).
Theorem (Linearized propagation law of the variance): Given a scalar random vari-
able x and a nonlinear function f (x), we form the random variable y = f (x). Let x0 be an
approximation to a sample of x. Then a first-order approximation to the variance of y is
σ_y² ≐ [ (d/dx) f(x_0) ]² σ_x²        (A.11)
Proof: Substitution of

f(x) = f(x_0) + (d/dx) f(x_0)(x − x_0) + …
m_y = f(x_0) + (d/dx) f(x_0)(m_x − x_0) + …

into

σ_y² = ∫_{−∞}^{+∞} ( f(x) − m_y )² p_x(x) dx

gives, after neglecting the second and higher order terms, the result (A.11).
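In the same way, the first-order variance (A.11) can be compared with a Monte Carlo value; the function and the numbers below are assumptions used purely for illustration.

import numpy as np

rng = np.random.default_rng(1)
f, df = np.sin, np.cos                 # assumed nonlinear function and its derivative
mx, sigma_x, x0 = 0.52, 0.02, 0.50     # assumed mean, standard deviation and approximate value

x = rng.normal(mx, sigma_x, 200_000)
var_mc = np.var(f(x))                  # Monte Carlo variance of y = f(x)
var_lin = (df(x0) ** 2) * sigma_x**2   # linearized propagation law (A.11)
print(var_mc, var_lin)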
Examples
This result seems to contradict definition (A.1). Note however that (A.14) reduces to
(A.1), since
∫_{−∞}^{+∞} p_x(x_1, …, x_j, …, x_n) dx_j = p_x(x_1, …, x_{j−1}, x_{j+1}, …, x_n)
Thus in order to compute E{xi } one only needs the marginal density pxi (xi ).
It appears, therefore, that to determine the mean of yi , we must first find its marginal
probability density function pyi (yi ). This, however, is not necessary. As the next theorem
shows, E{yi } can be expressed directly in terms of the vectorfunction F(x) and the joint
density of x.
Theorem:
E{F(x)} = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} F(x_1, …, x_n) p_x(x_1, …, x_n) dx_1 … dx_n        (A.17)

or, in compact vector notation,

E{F(x)} = ∫ F(x) p_x(x) dx        (A.18)
Proof: The proof is similar to the proof given in the earlier section of this appendix.
The following theorem is an extremely important one, and it is used frequently in
these lecture notes.
Theorem (propagation law of the mean): Given a random n-vector x and a vectorfunc-
tion F(x), F : Rn → Rm , we form the random m-vector y = F(x). If the vectorfunction
F(x) is linear,
F(x) = A x + b        (A.19)

with A an m × n matrix and b an m-vector, then

m_y = A m_x + b        (A.20)
Proof: With the columns a_i of A we may write Ax = ∑_{i=1}^n a_i x_i, so that

E{F(x)} = ∑_{i=1}^n a_i ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} x_i p_x(x_1, …, x_n) dx_1 … dx_n + b ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} p_x(x_1, …, x_n) dx_1 … dx_n

or

E{F(x)} = ∑_{i=1}^n a_i ∫_{−∞}^{+∞} x_i p_{x_i}(x_i) dx_i + b = ∑_{i=1}^n a_i m_{x_i} + b = A m_x + b
Without proof we also give the linearized version of the propagation law of the mean.
Theorem (linearized propagation law of the mean): Given a random n-vector x and a
nonlinear vectorfunction F(x), F : Rn → Rm we form the random m-vector y = F(x). Let
x0 ∈ Rn be an approximation to a sample of x and define Δy = y − F(x0 ) and Δx = x − x0 .
Then we have to a first-order
m_Δy ≐ ∂_x F(x_0) m_Δx        (A.21)

where ∂_x F(x_0) is the m × n matrix of partial derivatives of F, evaluated at x_0.
Examples
With the approximate values x01 , x02 and x03 we have to a first-order
E{ [ y_1 − sin(x_1^0 x_2^0) − x_3^0 ;  y_2 − (x_1^0)² − x_2^0 − 4 ] } ≐
    [ x_2^0 cos(x_1^0 x_2^0)   x_1^0 cos(x_1^0 x_2^0)   1 ;   2x_1^0   1   0 ] E{ [ x_1 − x_1^0 ;  x_2 − x_2^0 ;  x_3 − x_3^0 ] }
Variancematrix of a random vector: Let xi , i = 1, 2, . . ., n be n continuous scalar random
variables with joint probability density function px (x1 , . . ., xn ). The variancematrix of the
random n-vector (x1 , x2 , . . ., xn )T is by definition the integral
E{(x − E{x})(x − E{x})^T} = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} (x − E{x})(x − E{x})^T p_x(x_1, …, x_n) dx_1 … dx_n        (A.22)

in which x − E{x} is the column vector (x_1 − E{x_1}, …, x_n − E{x_n})^T. This variancematrix
will also be denoted by Q_x. Using vector notation we may write (A.22) in the compact form

E{(x − m_x)(x − m_x)^T} = ∫ (x − m_x)(x − m_x)^T p_x(x) dx        (A.23)
From (A.22) it follows that

E{(x_i − m_{x_i})(x_j − m_{x_j})} = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} (x_i − m_{x_i})(x_j − m_{x_j}) p_x(x_1, …, x_n) dx_1 … dx_n

Integration gives

E{(x_i − m_{x_i})(x_j − m_{x_j})} = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x_i − m_{x_i})(x_j − m_{x_j}) p_x(x_i, x_j) dx_i dx_j        (A.24)
in which p_x(x_i, x_j) is the joint density function of the two random variables x_i and
x_j. The scalar (A.24) is called the covariance of the two random variables x_i and x_j.
This covariance will also be denoted as σ_{x_i x_j}. Note that the off-diagonal elements of the
variancematrix Q_x consist of the covariances between the elements of the random vector x.
If i = j it follows from (A.24) that

E{(x_i − m_{x_i})²} = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x_i − m_{x_i})² p_x(x_i, x_j) dx_i dx_j

Integration gives

E{(x_i − m_{x_i})²} = ∫_{−∞}^{+∞} (x_i − m_{x_i})² p_{x_i}(x_i) dx_i

This is the variance of x_i. Note that the variance σ_{x_i}² is the i-th diagonal element of the
variancematrix Q_x. Hence, the variancematrix Q_x can be written as
Q_x = [ σ_{x_1}²     σ_{x_1x_2}   ···   σ_{x_1x_n}
        σ_{x_2x_1}   σ_{x_2}²
        ⋮                        ⋱
        σ_{x_nx_1}                     σ_{x_n}²  ]        (A.25)
Theorem (propagation law of variances): Given a random n-vector x with variancematrix Q_x
and a linear vectorfunction F(x) = Ax + b, with A an m × n matrix, we form the random
m-vector y = F(x). Then

Q_y = A Q_x A^T        (A.27)

with Q_y the m × m variancematrix of y.
or, written out in terms of the columns a_i of A and the rows a_i^T of A^T, as

Q_y = [ a_1 ··· a_n ] [ σ_{x_1x_1} ··· σ_{x_1x_n} ;  ⋮  ⋱  ⋮ ;  σ_{x_nx_1} ··· σ_{x_nx_n} ] [ a_1^T ;  ⋮ ;  a_n^T ] = A Q_x A^T
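In matrix form, (A.27) is a one-line computation. The sketch below applies it to an assumed Q_x and A and cross-checks the result with a sampling estimate.

import numpy as np

rng = np.random.default_rng(2)
Qx = np.array([[4.0, 1.0], [1.0, 9.0]])   # assumed variancematrix of x
A = np.array([[1.0, 2.0], [0.0, 3.0]])    # assumed linear map y = Ax + b
b = np.array([1.0, -1.0])

Qy = A @ Qx @ A.T                         # propagation law of variances (A.27)

# Sampling cross-check: simulate x, map to y and estimate the variancematrix of y
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=Qx, size=200_000)
y = x @ A.T + b
print(Qy)
print(np.cov(y, rowvar=False))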
Without proof we also give the linearized version of the propagation law of variances: with x_0 an approximation to a sample of x, a first-order approximation to the variancematrix of y = F(x) reads Q_y ≐ ∂_x F(x_0) Q_x ∂_x F(x_0)^T.
Examples
… = [ 50   25
      25   41 ]
2. Let the two random variables y1 and y2 be defined as
y_1 = x_1 + x_2² + x_1 x_3
y_2 = x_1³ + sin x_2        (A.31)
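For the nonlinear pair (A.31), the linearized propagation laws require the matrix of partial derivatives ∂_x F(x_0). The sketch below forms this matrix analytically and propagates an assumed variancematrix Q_x; the approximate values x_0 and Q_x are hypothetical, since the worked numbers of the original example are not reproduced here.

import numpy as np

def F(x):
    # y1 = x1 + x2^2 + x1*x3,  y2 = x1^3 + sin(x2)   (A.31)
    return np.array([x[0] + x[1]**2 + x[0]*x[2], x[0]**3 + np.sin(x[1])])

def jacobian(x):
    # Partial derivatives of (A.31) with respect to x1, x2, x3
    return np.array([[1 + x[2], 2*x[1], x[0]],
                     [3*x[0]**2, np.cos(x[1]), 0.0]])

x0 = np.array([1.0, 0.5, 2.0])     # assumed approximate values
Qx = np.diag([0.01, 0.02, 0.01])   # assumed variancematrix of x

J = jacobian(x0)
Qy = J @ Qx @ J.T                  # linearized propagation of variances
print(F(x0), Qy)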
Appendix B
References
In order to aid a further study in the field of adjustment, testing and computation of geode-
tic networks, the list of references given below gives an introductory overview of the
existing international literature. With a few exceptions, the list concentrates on books,
lecture notes and reports.
[2] Baarda, W. (1968): A Testing Procedure for use in Geodetic Networks, Netherlands
Geodetic Commission, Publications on Geodesy, New Series, Vol. 2. No. 5, Delft.
[4] Graybill, F.A. (1976): Theory and Application of the Linear Model, Duxbury Press.
[6] Meissl, P. (1982): Least Squares Adjustment: A Modern Approach, Mitteilungen der
Geodätischen Institute der Technischen Universität Graz, Folge 43.
[7] Mikhail, E. M. (1976): Observations and Least Squares, University Press of Amer-
ica.
[8] Rao, C.R. (1973): Linear Statistical Inference and its Applications, 2nd edition,
Wiley series in probability and mathematical statistics.
[9] Teunissen, P.J.G. (2001): Adjustment Theory: an introduction, 2nd edition, Series
on Mathematical Geodesy and Positioning, Delft University Press, ISBN 90-407-
1974-8.
The first two references concentrate on testing theory. Much of the quality control
procedures which have become standard practice nowadays, can be traced back to these
two original works. The references [3], [4], [5] and [8] are reference books, whereas the
references [6], [7], [9] and [10] are more of the lecture notes type. Reliability theory is
only dealt with in the first two and in the last reference.
[1] Alberda, J.E. (1974): Planning and Optimization of Networks: Some General Con-
siderations. Boll. Geod. Sc. Aff., 33, pp. 209-240.
[2] Alberda, J.E. (1981): Inleiding Landmeetkunde, 3rd edition, Delft University Press.
[5] Delft Geodetic Computing Centre (Eds.) (1982): Forty Years of Thought. Anniver-
sary volume (Vol. 1 and 2) on the occasion of prof. Baarda’s 65th birthday, Delft.
[6] Grafarend, E., F. Sansò (Eds.) (1985): Optimization and Design of Geodetic Networks,
Springer Verlag.
[7] Grafarend, E., B. Schaffrin (1974): Unbiased free net adjustment. Survey review, 22,
pp. 200-218.
[10] Mierlo van, J. (1979): Free Network Adjustment and S-transformations, DGK B,
No. 252, pp. 41-54.
[11] Moffitt, F.H., H. Bouchard (1982): Surveying, 7th edition, Harper and Row.
[13] Teunissen, P.J.G. (1984): Generalized Inverses, the Datum Problem and S-transformations,
Mathematical and Physical Geodesy Report 84.1, Delft, 44 pp.
The references [2], [4], [8], [9], [11], and [12], are all textbooks on surveying. The
references [2], [4], [9] and [11], are of an introductory nature. Reference [8] concentrates
on functional models, in particular spatial modelling and reference [12] includes aspects
of quality control. Optimization and design aspects of geodetic networks are treated in
their whole range of variety in the references [1], [5] and [6]. Free networks are treated
in particular in [3], [7], [10] and [13]. The concept of S-transformations was introduced
in [3] and its relation to generalized inverses is included in [10] and [13].
GPS Surveying
[1] Husti, G.J. (2000): Global Positioning System (in Dutch), Series on Mathematical
Geodesy and Positioning, Delft University Press, ISBN 90-407-1977-2.
[3] Leick, A. (2005): GPS Satellite Surveying, 3rd edition, John Wiley and Sons.
[5] Teunissen, P.J.G. and A. Kleusberg (1998): GPS for Geodesy, 2nd edition, Springer
Verlag.
Network quality control
Peter J.G. Teunissen
The aim of computing a geodetic network is to determine the geometry of the configuration of a set of points
from spatial observations (e.g. GPS baselines and/or terrestrial measurements). The configuration of points
usually consists of newly established points, of which the coordinates still need to be determined, and already
existing points, the so-called control points, of which the coordinates are known.
Network quality control deals with the qualitative aspects of network design, network adjustment, network
validation and network connection. By means of a network adjustment the relative geometry of the new points is
determined and integrated into the geometry of the existing control points. Prior to the network adjustment, the
geometry of the network is designed on the basis of precision and reliability criteria.
The adjustment and validation of the overall geometry can be divided into two phases, the free network phase
and the connected network phase. In the free network phase, the known coordinates of the control points do not
take part in the adjustment and validation. The possible use of a free network phase is based on the idea that a
good geodetic network should be sufficiently precise and reliable in itself, without the need of external control.
Moreover, it allows one to validate the quality of the external control.
In the connected network phase, the geometry of the free network is integrated into the geometry of the control
points. Adjustment and validation in this second phase differs from the free network phase. The adjustment in
the second phase is a constrained connection adjustment, since it is often not practical to see the coordinates
of the control points change every time a free network is connected to them. For the validation of the connected
network, however, the unconstrained connection adjustment is used as input. This allows one to take the intrinsic
uncertainty of the coordinates of the control points into account in the connection phase.
The goal of this introductory text on network quality control is to convey the necessary knowledge for designing,
adjusting and testing geodetic networks. For the purpose of network design, the precision and reliability theory
is worked out in detail. This includes the minimal detectable biases and the bias-to-noise ratios. For the purpose
of the network adjustment, the principles of unconstrained, constrained, and minimally constrained least-squares
estimation are treated. For the network testing, the principles of hypothesis testing are presented and
worked out for the different network cases. For the free network phase this includes the overall model test,
the w-test, and the data snooping procedure. For the connected network phase, it includes the T-test, with an
emphasis on the detection and identification of errors in the control points.
P.J.G. Teunissen
Delft University of Technology,
Faculty of Civil Engineering and Geosciences