
Introduction to Laboratory

Skills (ILS)
31ILS

August 2023

Introduction

The course ’Introduction to Laboratory Skills (ILS, course code 31ILS)’ consists of two
parts: a theoretical part (lectures) and a part with practical exercises. The practical
exercises consist of an experimental part with six experiments in which the obtained
theoretical knowledge has to be applied, and an introduction to some important computer
skills. More information can be found in the study guide. These lecture notes will serve
as a manual for the theoretical part.
Besides learning how to deal with uncertainties and errors in measurement results,
the curriculum also covers a simple introduction to probability theory and analyzing
measurement results (fitting). At the end of each chapter there is a section called
’Assignments and problems’. The assignments are meant to elucidate the course material;
the problems are more challenging and at exam level.
Knowledge of the contents of this course is a requirement to be able to perform exper-
iments on a sufficient level in the follow-up courses.

For further reading we suggest:

J.R. Taylor: “An Introduction to Error Analysis”

P.R. Bevington & D.K. Robinson: “Data Reduction and Error Analysis for the Physical
Sciences”

D.C. Baird: “Experimentation: An Introduction to Measurement Theory and Experiment Design”

The laboratory staff

Table of contents

1 Analyzing measurement results
  1.1 Introduction
  1.2 Measurement errors and their causes
  1.3 Random errors and systematic errors
  1.4 Confidence intervals
    1.4.1 The 100% confidence interval
    1.4.2 The 68% confidence interval
  1.5 Notation of measurement results
  1.6 Consistency of measurement results
  1.7 Absolute and relative uncertainties
  1.8 Exercises and problems
    1.8.1 Questions

2 100% Confidence intervals
  2.1 Linear approximations of functions
    2.1.1 One-dimensional functions x = f(x1)
    2.1.2 Two-dimensional functions x = f(x1, x2)
    2.1.3 Higher-dimensional functions x = f(x1, x2, ..., xn)
  2.2 Calculating independent uncertainties
    2.2.1 The general case: x = f(x[1], x[2], ..., x[n])
    2.2.2 The sum of measured quantities: x = x[1] + x[2]
    2.2.3 The difference in measured quantities: x = x[1] − x[2]
    2.2.4 Multiplication by a constant: x = C x[1]
    2.2.5 x = C1 x[1] + C2 x[2] + ... + Cn x[n] = Σ(i=1..n) Ci x[i]
    2.2.6 The product of measured quantities: x = x[1] · x[2]
    2.2.7 The ratio of measured quantities: x = x[1] / x[2]
    2.2.8 Powers of measured quantities: x = x[1]^α
    2.2.9 x = C · x[1]^α1 · x[2]^α2 · ... · x[n]^αn = C Π(i=1..n) x[i]^αi
    2.2.10 Examples
  2.3 Dependent uncertainties
  2.4 Exercises and problems
    2.4.1 Questions
    2.4.2 Problems

3 Measuring with random variations: 68% intervals
  3.1 Introduction
  3.2 Definitions
  3.3 Calculation rules for the expectation value and the variance
  3.4 The Gaussian distribution
  3.5 The uncertainty in measurements with random variations
    3.5.1 The width of the probability distribution
    3.5.2 The accuracy of the mean
  3.6 An example
  3.7 The relationship between the measurement accuracy and the number of measurements
  3.8 Calculation rules for 68% intervals
  3.9 Overview of calculation rules for uncertainties in measurement results
  3.10 The uncertainty in the uncertainty
  3.11 Dependent uncertainties
    3.11.1 The covariance
    3.11.2 Calculation rule for quantities with dependent uncertainties
  3.12 Exercises and problems
    3.12.1 Questions
    3.12.2 Problems

4 Distribution functions
  4.1 Introduction
  4.2 Discrete and continuous distributions
  4.3 The binomial distribution
  4.4 The Poisson distribution
    4.4.1 The Poisson distribution in a different perspective
  4.5 The normal, or Gaussian distribution
    4.5.1 The reduced normal distribution
  4.6 Comparison of the three distribution functions
  4.7 Exercises and problems
    4.7.1 Questions
    4.7.2 Problems

5 Combining measurement results
  5.1 Introduction
  5.2 Combining measurements of the same quantity
  5.3 Finding a straight line through a number of measurements: the method of least squares
    5.3.1 A random straight line
    5.3.2 A straight line through the origin
    5.3.3 The general case: unequal uncertainties Sȳi
  5.4 Other relationships between the points (xi, yi)
    5.4.1 Relationships that can be made linear
    5.4.2 The method of least squares for non-linear relationships
  5.5 Exercises and problems
    5.5.1 Questions
    5.5.2 Problems

A Proof that ∫_(−∞)^(+∞) e^(−x²) dx = √π
B Combining measurements of the same quantity
C The uncertainty in the method of least squares
D Overview of formulas and distribution functions
E Overview of distribution functions
F χ² test of a distribution function
G Formulas for least squares fits
H Answers Questions
I Answers Problems
1. Analyzing measurement results

1.1 Introduction
When the value of a physical quantity is determined with an experiment, the accuracy
with which this happens is of great importance. Such an experiment (measurement)
often endeavors to test a (physical) model in which the value of this quantity is predicted
or in which other measurements have to be verified. Another possibility is that the
obtained value of the physical quantity is used in additional experiments or calculations.
For all of these cases it is key to know how accurately the physical quantity has been
determined, or equivalently, what the uncertainty is in the measured value. Since
it’s practically impossible to determine a value of a quantity with infinite precision,
there will always be a discrepancy between the real value, which we obviously don’t
know, and the experimentally measured value. When we are able to give the upper
bound of the size of the difference between the real and the measured value, we have
automatically given an indication of the accuracy of the measurement. It is often
difficult to determine this upper bound; sometimes we will even have to be content
with making an estimation. However, it is utterly absurd to give an experimentally
obtained value without mentioning its uncertainty.
As an example, we can look at the determination of the density ρ of an object. The
objective of this experiment is to see if this object is made of 18-karat gold (ρ = 15.5
g/cm3 ) or a cheap alloy (ρ = 13.8 g/cm3 ). The mass m of the object is measured with
a scale and equals 1238 g. The volume is determined by fully submerging the object
in water. From the rise of the water level ∆h and the surface area S, the volume can
be calculated. The area S is determined to be S = 165.13 cm2 and the water level
is measured with a ruler and equals 25 cm and 25.5 cm before and after submerging,
respectively. The density can now be calculated with
    ρ = m / (S ∆h) .                                                (1.1)
Inserting the measured values gives ρ = 14.994 g/cm3. We can now straightforwardly
conclude: the result is much closer to 15.5 g/cm3 than to 13.8 g/cm3, so the object is
made of gold.
However, we have failed to mention the uncertainty with which ρ is determined. Let
us make a rough estimate. It’s quite realistic to assume that the reading error from
the ruler is 0.5 mm. Consequently, the water level before submersion must have been
between 24.95 cm and 25.05 cm. After submersion the level must have been between
25.45 and 25.55 cm. The rise of the water level ∆h is then 0.4 cm in the smallest
case and 0.6 cm in the highest case (25.55 − 24.95 cm). Had we used the latter
value in our calculations, we would have found ρ = 12.49 g/cm3, and had we used
∆h = 0.4 cm, we would have found ρ = 18.74 g/cm3. Therefore, the real value
of ρ lies between these two values, which means that from these measurements it is
impossible to conclude whether or not this object is made of gold.
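The numbers in this example can be retraced in a few lines. The values are the ones given in the text; the helper function `density` is illustrative and not part of the course material.

```python
# Retracing the density example; density() is an illustrative helper.
def density(m, S, dh):
    """Density (g/cm^3) from mass m (g), area S (cm^2), level rise dh (cm): rho = m/(S dh)."""
    return m / (S * dh)

m, S = 1238.0, 165.13             # measured mass and surface area
rho_nominal = density(m, S, 0.5)  # dh = 0.5 cm gives about 14.994 g/cm^3
rho_low = density(m, S, 0.6)      # largest possible dh -> smallest density, about 12.49
rho_high = density(m, S, 0.4)     # smallest possible dh -> largest density, about 18.74
print(rho_low, rho_nominal, rho_high)
```

Both candidate densities (13.8 and 15.5 g/cm3) fall inside the interval [rho_low, rho_high], which is exactly why the measurement is inconclusive.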
This example points out what a major role the uncertainty of a measurement plays in
an experiment. We even conveniently neglected the fact that the mass and the surface
area of the water were also subject to measurement errors which add to the uncertainty
in the final result. The example not only shows that it is necessary to calculate the
uncertainty after the experiment, but also that it can be very useful to estimate the
uncertainty beforehand. The reading error of the ruler put a spanner in the works for
us and we could have known this beforehand. Generally, the methods and equipment
that are required to obtain an acceptable amount of uncertainty are determined before
the experiment. Afterwards there should of course be a thorough analysis of whether
or not the uncertainty criteria were actually met.
In this course we will examine all of this very carefully. We will get acquainted with
various types of errors and uncertainties and we will be introduced to a set of basic
mathematical rules to calculate uncertainties in our measurement results. In the former
example, the situation gets more complicated when the uncertainties in m and in S are
included in the calculation. We will also see how different measurement results (of the
same quantity) can be combined and how you can present your results, for example in
a graph.
Before we will examine the various types of errors and their causes, we should first
pay attention to the terms ’measurement error’ and ’measurement uncertainty’. A
measurement error is the difference between an (experimentally) obtained value of a
(physical) quantity and the real value of that quantity. Generally, this error is not
known, because if it were, we could calculate the real value from the experimentally
obtained value. In some cases there are measurement errors that we do know and
thus can correct for. A more detailed description of that topic is given in the next
section. A measurement uncertainty is the maximum value that this measurement
error can have, or gives a value below which the measurement error will be with a
certain probability. The measurement uncertainty is generally determined from the
used measurement methods and the used equipment. As stated earlier, this can be quite
challenging in some cases and sometimes an estimation is the best available option.
The terms measurement error and uncertainty are used interchangeably quite often.
Although there is a fundamental difference in the meaning of these words, we will not
be too strict in using these terms.

1.2 Measurement errors and their causes
To distinguish between various sources of measurement errors, we will first consider the
general setup of a physical experiment. This is schematically depicted in figure 1.1.

[Diagram: blocks labelled ‘physical system’, ‘measurement system’, ‘model/theory’ and ‘observer’, all embedded in the ‘environment’.]

Figure 1.1: Schematic depiction of an experiment.

In a physical experiment there is an interaction between the experimenter (the observer)
and the measurement system (the equipment etc.), which in its turn interacts with the
physical system (the quantity under examination). All of this together is placed in an
environment that can influence the physical system, the measurement system and of
course also the experimenter. Additionally, the observer has a certain picture of the
physical system by means of a model or a theory. Measurement errors that will occur
in the experiment will find their cause in one of the parts of the measurement system
mentioned above or in the interaction between these parts. We will list a number of
errors and give several examples.

1. The physical system / The quantity under investigation

(a) The quantity under investigation is not well defined.


- The length of a rubber band is not well defined if the size of the tensile
force in this rubber band is not mentioned. Also, the start- and endpoint
have to be well defined to speak of a length.

(b) The measured quantity is subject to fundamental fluctuations (noise).
- The size of any electrical resistance is fluctuating slightly in time. In very
accurate experiments this can play a role in the uncertainty.

2. The measurement system

(a) Calibration errors: it is possible that measurement instruments are not cal-
ibrated properly.
- The speedometer in a car usually does not give the correct value for your
actual speed.
(b) Limited reading accuracy from measurement equipment.
- A caliper can only be read with an accuracy of 0.05 mm. When a higher
accuracy is required, another instrument needs to be used (e.g. a microme-
ter).
(c) Bad equipment.
- For example, if there is a small gap between two gears, friction in the
bearings or a dead travel in the instrument, this will inevitably lead to faulty
measurement results.
(d) The sensitivity of the instrument varies in time.
- Lasers have to warm up before they meet their specifications.
(e) Fluctuations in the measurement equipment
- Especially electronic instrumentation can exhibit fluctuations. For example,
an amplifier will always be subject to noise.
(f) The measurement system affects the quantity under investigation (physical
system).
- When a voltmeter with a small input impedance is connected to a circuit,
this will often result in a faulty measured value.
- When the temperature of a small amount of liquid is measured with a
thermometer that has a significantly different temperature (too cold/too
hot), this will also result in a measurement error.

3. The observer

(a) The experimenter is using the equipment in a faulty manner.


- For example, the experimenter can use the wrong settings in a device.
(b) The experimenter reads out the measured values incorrectly.
(c) Reaction time, physiological errors.
- Stopping a stopwatch too late or being influenced by the results of previous
measurements.

4. The environment

(a) Influences of the environment affecting the physical system, the measurement
system or the observer.
- Magnetic or electric fields can disturb a measurement (e.g. an elevator in
a building).
- Vibrations in the building can cause errors.
- Fluctuations in temperature.

5. The model / The theory

(a) The theory can be too simple or incorrect.


- Measurements of a resistor of which the resistance depends on the amount
of current flowing through it, can go completely wrong if the theory assumes
the resistance of the resistor to be current or temperature independent.

1.3 Random errors and systematic errors


Experimental uncertainties that cause a difference between the measured value and the
real value of a physical quantity can be grouped into two categories, the random error
and the systematic error. Systematic errors always give an error in the same ‘direction’
(too big/too small) when the experiment is repeated. For example, when a voltmeter
shows a measured value of 9.7 V at an applied voltage of 10 V, we are dealing with
a systematic error. No matter how often we repeat the experiment, we will always
find a value that is too small. Besides systematic errors there are random errors or, in
other words, fluctuations. If we repeat the measurement, we will sometimes find a value
that is too small and at other times a value that is too big. There is no preferential
‘direction’.
Obviously, systematic errors are much more troublesome than random errors. In the
latter case, repeating the experiment and then averaging will often suffice to give a
satisfactory result. Systematic errors call for a different approach. We will have to
trace and eliminate these errors if possible, or at least correct for them. However,
discovering systematic errors is often quite challenging. It is often necessary to take a
critical look at each aspect of the used measurement method and work out if something
could have gone wrong at any stage. Sometimes the instruments have to be recalibrated
and in the worst case a whole new setup has to be developed.
However, some tricks exist to detect systematic errors and estimate their size.

• Sometimes it is possible to use an additional method to perform the same type of
measurements. When the results are consistent with each other, it is likely that
all systematic errors have been eliminated.

• Similarly, systematic errors can be detected by using other instruments and com-
paring the results.
• In numerous cases it is also possible to use a reference measurement of an already
known quantity. For example, if we have to determine the wavelength of the
light emitted by a gas-discharge lamp containing an unknown gas, we could first
perform measurements on a lamp with a known gas that emits light at a known
wavelength.

• Finally, in some experiments it is possible to do a measurement without a
measurement object, a so-called reference or ‘blank’ measurement. For example, when
the intensity of a light source has to be determined, we can first perform a mea-
surement with the light source turned off to measure the background radiation,
which can in its turn be subtracted from the ‘real’ measurement.

It is not always possible to eliminate systematic errors. In the previous example we
could try to make the room pitch black and eliminate the background radiation. If
this is unsuccessful, we can also try to correct for it. We will first have to determine
the magnitude of the correction. However, in both cases we will have to deal with the
systematic errors before turning to the random errors.
As stated earlier, random errors have a fluctuating character and unpredictably change
sign. The origin of these errors can be found in the physical system, the measurement
system, the observer or the environment. Some quantities exhibit intrinsic fluctuations,
for example a resistance will always fluctuate. Additionally, fluctuations can be induced
by the measurement system; for example, an amplifier will add noise to the measurement
signal. When the observer reads out measurement data from an analog voltmeter on
which there is an indicator pointing between two markers and he estimates the position
of the indicator to be 0.1 times the distance between the two markers, he will not be
able to determine this very accurately. Consequently, when repeating the experiment
several times, there will be a certain spread around the real value (position). Finally,
the environment can also induce fluctuations in the measurement. For example, when
a temperature-dependent quantity is measured or if the measurement equipment is
temperature dependent, there will be fluctuations present in the measurements due to
the fluctuations in the temperature of the environment.
Obviously, random errors, as opposed to systematic errors, cannot be eliminated (com-
pletely). At times it is possible to reduce their size, for example by using better equip-
ment (e.g. an amplifier that generates less noise) or for example by stabilizing the
temperature of the environment. The accuracy of the result of an experiment that is
subject to fluctuations can be improved by performing more and/or longer measure-
ments. In chapter 3 we will discuss how we can calculate both the uncertainty from a
series of measurements and the size of the fluctuations.
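The distinction between the two kinds of error can be made concrete with a small simulation (not from the course text; the voltage, offset and noise level below are invented for illustration): averaging many noisy voltmeter readings suppresses the random error, but a systematic calibration offset survives the averaging untouched.

```python
import random

random.seed(1)            # reproducible fluctuations
true_value = 10.0         # the "real" voltage (V)
offset = -0.3             # hypothetical systematic error: meter reads 0.3 V too low

def measure():
    # One reading = real value + systematic offset + random fluctuation (sigma = 0.2 V)
    return true_value + offset + random.gauss(0.0, 0.2)

readings = [measure() for _ in range(10_000)]
mean = sum(readings) / len(readings)
# The mean converges to about 9.7 V, not 10 V: the fluctuations average out,
# the systematic error does not.
print(f"average of {len(readings)} readings: {mean:.2f} V")
```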

1.4 Confidence intervals
In section 1.1 we have already seen that it is rather pointless to talk about the outcome
of an experiment without talking about the reliability of the result or, in other words, its
uncertainty. We can do this by providing each result with a so called confidence interval.
We will distinguish between two kinds of confidence intervals, the 100% confidence
interval (also called the maximum error) and the 68% confidence interval (the standard
deviation). A 100% confidence interval denotes the interval in which the real value lies
with a probability of (almost) 100%. A 68% interval denotes the interval in which the
real value lies with a probability of 68%. This means that there still is a significant
probability that the real value is outside of this interval. In the following sections, we
will learn when to use these intervals and how we should use them.

1.4.1 The 100% confidence interval


When a measurement is performed in which all systematic errors have been eliminated,
but no fluctuations are observed, we will use 100% confidence intervals. An example
of this is when we measure a voltage V with a digital (multi-)meter. If the measuring
range of the meter has been chosen such that the last digit it displays is not fluctuating,
it will give the same result for each measurement.
For example, suppose the meter is used on a range of 0-200 mV and it displays a
constant value of 126 mV (no digits after the decimal point). The equipment is then
insufficiently accurate to observe fluctuations, so we will use a 100% confidence
interval. Since we do not know whether the meter rounds or truncates the real value,
it is possible that the real value lies just above 125 mV or just below 127 mV. Therefore,
we will use an uncertainty of 1 mV. After all, we know with 100% certainty that the
real value will lie between 125 mV and 127 mV. The notation we will use in this case is
V ± ∆V =(126±1) mV, in which the 100% confidence interval is denoted by ∆V .
the 68% intervals we use a slightly different notation.
Another scenario in which the 100% intervals are used, is when we only do a single
measurement of a quantity, even when we know that fluctuations will be present. Ac-
tually, we should perform several measurements, calculate the average, and determine
the interval in which the real value lies with a probability of 68% (see §1.4.2). However,
the size of this interval will depend on the size of the fluctuations and thus the spread
between the individual measurements. After all, when the spread between the individ-
ual measurements is large, the obtained average will be a less accurate approximation
compared to when the spread is small. Later in this course we will calculate the size
of the 68% interval from the spread between individual measurements. However, when
we only measure once, we will be unable to determine this spread (unless we ’coinci-
dentally’ know how large the spread would be) and thus we will have to turn to 100%
confidence intervals. We will have to make a good estimation of the maximum error in
our measurement.

Summarizing: We use the maximum error as uncertainty when we perform a single
measurement OR when each repetition of the measurement gives the exact same result.
We denote the result of a measurement of a quantity p with a maximum error ∆p as
p ± ∆p .
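For the digital-meter case above, the ±1-of-the-last-digit rule can be written down directly (the helper name is assumed, not from the notes):

```python
def display_interval(reading, last_digit):
    """100% confidence interval for a digital display that may either round or
    truncate: the real value lies within one unit of the last displayed digit."""
    return reading - last_digit, reading + last_digit

low, high = display_interval(126, 1)   # meter shows 126 mV, last digit worth 1 mV
print(f"V = (126 +/- 1) mV, i.e. {low} mV < V < {high} mV")
```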

1.4.2 The 68% confidence interval


Usually, measurements do not give the same result when repeating the experiment. In
this case we will have to measure several times and calculate the 68% confidence interval
from the spread in the results. We call this the standard deviation. The notation that
we use in this case is p ± Sp . How we can calculate Sp and how we should use it will
be discussed later in the course. The requirement that the 68% interval has to fulfill
is that random errors that we observe in the measurement result are large compared
to the reading error of our measurement instruments. We will (obviously) see that
increasing the amount of measurements will decrease the size of the confidence interval.
So the more often we repeat our measurement, the more accurate the final result will
become. However, it is not possible to exceed the reading accuracy. In other words, a
68% confidence interval that is smaller than the reading error is not possible. The exact
meaning of the 68% interval is rather difficult to understand. In any case, it provides
us with an interval around the measured (average) value in which the real value will lie
with a probability of 68%. We will come back to this later.
Summarizing: We will use the standard deviation as uncertainty when the measurements
show random errors AND more than one measurement is performed.

1.5 Notation of measurement results


When the experimenter gives his results, he will have to ask himself what these re-
sults actually mean and what the uncertainties mean. The situation can occur where
the calculation of the uncertainty of a measured length of 62.61 mm gives a result of
4.044873298 mm. This number of decimals is easily achieved with modern equipment.
The question arises whether or not it is useful to denote the result as being (62.61±
4.044873298) mm. The answer to this question clearly is no, this is pointless. The last
digits of the uncertainty provide us with no useful information whatsoever and should
therefore be omitted.
A possible next attempt could be (62.61±4.04) mm. The average and the uncertainty
now have an equal number of digits after the decimal point, but the answer to our
previously asked question is still: no, it’s still nonsense! The digits after the decimal
point of the uncertainty do not give any insight into how accurately the length is
determined. Also, when
this uncertainty is determined from the spread observed in a series of measurements,
there is a large probability that a slightly different uncertainty would be found when
the measurement series is repeated. This means that the uncertainty is subject to its
own uncertainty, which (see §3.10) in a series of 10 measurements will approximately
be 25% and after 50 measurements will still be approximately 10%. We will therefore
have to round our uncertainty to one significant figure, unless the measurement series
is so large that we can round to two significant figures. However, this will almost never
be the case. In our example this means that the right value of our uncertainty is 4 mm.
However, the notation (62.61±4) mm is still incorrect, because mentioning the 0.61 mm
in the length has no added value considering that the uncertainty is 4 mm.
The only correct notation is therefore (63±4) mm, in which we have rounded the length
to the same significant figure as the uncertainty.
Only when we use the result not as a final result, but as an intermediate result (so when
we have to use it in our further calculations), we are allowed to use two significant digits
(so (62.6±4.0) mm).
Note that the units are of great importance. Without giving the correct units, a result
is just as useless. Obviously, the result and the uncertainty have to be given in the
same unit.
In some cases, a measurement result will not be accompanied by an uncertainty. We
will then agree that the uncertainty is given by half of the least significant digit, so
in the case of a length of 63 mm this would mean that the full notation would be
(63.0±0.5) mm. However, we advise you to always denote the uncertainty.
Some problems can arise when the uncertainties have to be denoted in factors of ten
of the used units. For example, a length of 73.24 mm has an uncertainty of 33 mm.
Because we have to round to one significant digit, we cannot use 30 mm as a value for the
uncertainty. Instead, we will have to use the so called scientific notation: (7±3)·101 mm.
We will also not write (53700±400) J, but we do write (5.37±0.04)·104 J.
We will briefly list the rules below:

1. Uncertainties in the measurement results will be denoted with one significant


figure. Rounding is necessary.

2. For intermediate results, two significant figures can be taken into account.

3. The least significant figure in a result has to have the same position relative to
the decimal point as that of the uncertainty.

4. Units have to be mentioned and both the result and the uncertainty should obvi-
ously have the same unit.
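Rules 1 and 3 can be sketched as a small formatting helper. This is an illustration, not part of the course material, and it deliberately skips the scientific-notation case needed for results like the 73.24 mm example.

```python
import math

def format_result(value, uncertainty):
    """Round the uncertainty to one significant figure and the value to the
    same decimal position (notation rules 1 and 3)."""
    exp = math.floor(math.log10(abs(uncertainty)))  # position of the leading digit
    u = round(uncertainty, -exp)   # one significant figure
    v = round(value, -exp)         # same decimal position as the uncertainty
    digits = max(0, -exp)          # decimals to print after the point
    return f"({v:.{digits}f} ± {u:.{digits}f})"

print(format_result(62.61, 4.044873298), "mm")   # -> (63 ± 4) mm
```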

1.6 Consistency of measurement results
Uncertainties in measurement results will have to be denoted to give an idea of the accu-
racy with which the result has been determined and to be able to for example compare
it with the result obtained with other measurement methods or other equipment.
When 100% intervals are used, it is self-evident that multiple measurements of the
same quantity are consistent if their confidence intervals overlap. After all, we know
with 100% certainty that the real value will have to lie within the given intervals. We
then also know that the real value has to lie within the region of overlap of two 100%
intervals.

- For example, two measurements of the density of a certain substance result
  in the values ρ1 ± ∆ρ1 = (15±2) g/cm³ and ρ2 ± ∆ρ2 = (13.9±0.2) g/cm³.
  In these measurements there is overlap between the confidence intervals, so
  the results are consistent.
- Two different measurements of a resistance R result in the values
  R1 ± ∆R1 = (36±2) Ω and R2 ± ∆R2 = (46±5) Ω. Since there is no overlap
  between the confidence intervals, the results are inconsistent.
When 68% intervals are used, the situation gets slightly more complicated. Because
there is a 32% probability that the real value is not in the given confidence interval,
there is still a reasonable probability that, when no overlap is found, the results are in
fact consistent. However, it should be noted that this is only the case when the obtained
intervals are in each other's vicinity: the probability that two results lying far apart
are in fact consistent is negligible. Vice versa, it is also possible that when
there is an overlap, the overlapping region does not contain the actual value. There is
apparent consistency. In this case we will have to be careful with our conclusions. In
the case of the 68% intervals, there is no guarantee that the real value is in the region
of overlap. We will come back to this later in chapter 5.
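For 100% intervals, the overlap test can be written down directly. The function below is our own sketch (names are ours; the example values are taken from the text above).

```python
def consistent_100(x1, dx1, x2, dx2):
    """Two results with 100% (maximum-error) intervals are consistent
    exactly when the intervals [x - dx, x + dx] overlap."""
    return abs(x1 - x2) <= dx1 + dx2

# Densities (15 +/- 2) and (13.9 +/- 0.2) g/cm^3: intervals overlap
print(consistent_100(15, 2, 13.9, 0.2))   # True
# Resistances (36 +/- 2) and (46 +/- 5) Ohm: no overlap
print(consistent_100(36, 2, 46, 5))       # False
```

For 68% intervals this simple test is not conclusive, as discussed above.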

1.7 Absolute and relative uncertainties


The uncertainties as we have seen them so far (so ∆p or Sp ), are absolute uncertainties.
The advantage of using these is that it is directly clear what the size of the confidence
interval is that belongs to the measurement result. Besides absolute uncertainties,
relative uncertainties (also called the precision) are also commonly used, so ∆p /p or
Sp /p. These denote what fraction (or what percentage) the uncertainty is of the real
value. In some calculation rules, which we will encounter in the next chapters, we
use relative uncertainties. Note that absolute uncertainties have the same unit as the
corresponding measured quantity and that relative uncertainties have no unit (they are
dimensionless)!
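The conversion between the two forms is a one-liner. The sketch below uses our own helper names and an illustrative voltage value; note how the absolute value keeps the uncertainty positive for a negative quantity.

```python
def relative(p, dp):
    """Relative uncertainty (a dimensionless fraction) from an absolute one.
    Note the abs(): uncertainties are positive even for negative quantities."""
    return dp / abs(p)

def absolute(p, rel):
    """Absolute uncertainty (same unit as p) from a relative one."""
    return rel * abs(p)

# Illustrative values: a voltage of (-3.04 +/- 0.07) V
rel = relative(-3.04, 0.07)
print(rel)                    # ~0.023, i.e. about 2.3% precision
print(absolute(-3.04, rel))   # back to the absolute uncertainty, ~0.07 V
```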

1.8 Exercises and problems

1.8.1 Questions
1. Look again at the examples of the measurement uncertainties of §1.2. Which of
these uncertainties are random and which are systematic?

2. A steel ruler of 30 cm with a reading accuracy of 0.5 mm is used to measure


the length of a rod with a length of about 12 cm. What is the absolute maximal
accuracy with which the length can be determined? What is the relative accuracy?

3. Can a steel ruler with a scale division of 0.5 mm be used to measure a distance
of 2 cm with 1% relative precision? And if the ruler has a vernier scale, readable
at 0.05 mm?

4. An electronic voltmeter has a reading accuracy of 0.1 Volt. What is the smallest
voltage that can be measured with a relative accuracy of 5%? And with a precision
of 1%?

5. Write the following measurement results for a height h, a time interval t, a charge
Q, a wavelength λ and a momentum p in a clearer form with the correct number
of significant digits:
h ± ∆h = (6.04 ± 0.05429) m
t ± ∆t = (20.5432 ± 1) s
Q ± ∆Q = (−3.21·10⁻¹⁹ ± 2.67·10⁻²⁰) C
λ ± ∆λ = (0.000000673 ± 0.00000007) m
p ± ∆p = (4.367·10³ ± 42) g·m/s

2. 100% Confidence intervals

2.1 Linear approximations of functions


In §2.2 we will derive the calculation rules for the usage of 100% intervals. From §2.2.6
onwards we’ll have to use approximations that assume that the relative uncertainties are
small. The mathematical approximations that are necessary for this will be presented
in this section.

2.1.1 One-dimensional functions x = f (x1 )

Figure 2.1: The function x = f(x1). An increment ∆x1 in x1 gives an increment
∆x = f(x1 + ∆x1) − f(x1) in the function value.

We have a function f which has the value x at the point x1 and we’ll investigate what
the value of this function will be in the vicinity of this point x1 . This might look similar
to what is happening in figure 2.1. If we assume that the piece of the curve between
the points x1 and x1 + ∆x1 is a straight line, which is a fair assumption when ∆x1 is
small enough, we can express the slope of the curve as

f′(x1) = slope ≈ (f(x1 + ∆x1) − f(x1))/∆x1 = ∆x/∆x1   (2.1)

This expression is exact in the limit ∆x1 → 0 (as this is the definition of the derivative
f ′ (x1 )). From the expression above we can conclude two things: firstly, we can write

f (x1 + ∆x1 ) ≈ f (x1 ) + ∆x1 f ′ (x1 ) , (2.2)


and secondly
∆x ≈ ∆x1 f ′ (x1 ) . (2.3)
Again: this only works when ∆x1 is small!
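Equations (2.2) and (2.3) are easy to check numerically. The sketch below (our own choice of function and numbers) compares the exact value of f(x1 + ∆x1) with the linear estimate for a shrinking ∆x1.

```python
import math

# f(x) = sin(x) around x1 = 0.5, so f'(x1) = cos(x1)
x1 = 0.5
for dx1 in (0.1, 0.01, 0.001):
    exact = math.sin(x1 + dx1)
    linear = math.sin(x1) + dx1 * math.cos(x1)
    # The difference shrinks roughly as dx1**2, so the approximation
    # rapidly becomes excellent as dx1 gets small
    print(dx1, exact - linear)
```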

2.1.2 Two-dimensional functions x = f (x1 , x2 )

Figure 2.2: The function x = f(x1, x2). A small piece of the plane is shown, with
corner points (1) at (x1, x2), (2) at (x1 + ∆x1, x2), (3) at (x1, x2 + ∆x2) and
(4) at (x1 + ∆x1, x2 + ∆x2); the height differences along the edges are ∆xa and ∆xb.

Analogous to the one-dimensional case we now want to find an expression for f(x1 +
∆x1, x2 + ∆x2). Now the function f(x1, x2) does not describe a line anymore as in
the one-dimensional case, but it describes a plane in three-dimensional space. A small
piece of such a plane is drawn in figure 2.2. In this plane, point (1) is the point for
which x = f(x1, x2) holds and point (4) is the point for which the value of the function
equals x + ∆x = f(x1 + ∆x1, x2 + ∆x2). We want to determine the height difference
∆x between these two points. To go from point (1) to point (4) we can go along the
edge via point (2), but we can also go along the other edge and pass point (3). We are
going to assume, analogous to the one-dimensional case, that ∆x1 and ∆x2 will both
be so small that the small piece of the plane is flat (in figure 2.2 this is not quite the
case). In this case it doesn't matter which of the two paths we follow: both edges are
identical. We will go along point (2) in the upward direction. The trajectory from point
(1) to point (2) is a one-dimensional function, since x2 remains unchanged for the whole
trajectory and can therefore be assumed constant. The height difference ∆xa between
the points (1) and (2) can be calculated using equation 2.3:

∆xa ≈ ∆x1 f′(x1),   (2.4)
in which f′(x1) is the slope of the edge of the small piece of the plane between the points
(1) and (2). We will call this the partial derivative of the function f with respect to the
variable x1. We do not denote this with f′(x1), but with ∂f/∂x1. This partial derivative
is defined as

∂f/∂x1 (x1, x2) = lim_{dx1→0} (f(x1 + dx1, x2) − f(x1, x2))/dx1.   (2.5)
Note that here x2 is taken as a constant. However, this partial derivative ∂f/∂x1 is still
different for each x2 and therefore is still a function of x2. The second part of the
trajectory takes us from point (2) to point (4). The height difference between these
points can be calculated in a similar way, but now x1 is constant for this part of the
trajectory and we can write

∆xb ≈ ∆x2 ∂f/∂x2,   (2.6)

in which ∂f/∂x2 is the partial derivative of the function f with respect to the variable x2,
defined as

∂f/∂x2 (x1, x2) = lim_{dx2→0} (f(x1, x2 + dx2) − f(x1, x2))/dx2.   (2.7)
Note that with partial derivatives we pretend that the function only depends on one
variable and that the other variables are constant.

Intermezzo: examples of calculating partial derivatives.

– f(x, y) = x + y ⇒ ∂f/∂x = 1, ∂f/∂y = 1
– f(x, y) = sin(x + y) ⇒ ∂f/∂x = cos(x + y), ∂f/∂y = cos(x + y)
– f(x, y) = xy ⇒ ∂f/∂x = y, ∂f/∂y = x
– f(x, y) = sin(xy) ⇒ ∂f/∂x = y cos(xy), ∂f/∂y = x cos(xy)

The total height difference ∆x = ∆xa + ∆xb is given by

∆x = ∆xa + ∆xb = ∆x1 ∂f/∂x1 (x1, x2) + ∆x2 ∂f/∂x2 (x1, x2).   (2.8)

With this equation we have found an expression for f(x1 + ∆x1, x2 + ∆x2):

f(x1 + ∆x1, x2 + ∆x2) ≈ f(x1, x2) + ∆x1 ∂f/∂x1 (x1, x2) + ∆x2 ∂f/∂x2 (x1, x2).   (2.9)

We can also write

∆x ≈ ∆x1 ∂f/∂x1 + ∆x2 ∂f/∂x2.   (2.10)
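Equation (2.10) can be verified in the same numerical way as the one-dimensional case. The sketch below (our own function and values) uses f(x1, x2) = sin(x1 x2), whose partial derivatives appear in the intermezzo above.

```python
import math

x1, x2 = 1.0, 0.7
dx1, dx2 = 1e-3, 2e-3
exact = math.sin((x1 + dx1) * (x2 + dx2)) - math.sin(x1 * x2)
# Eq. (2.10) with df/dx1 = x2 cos(x1 x2) and df/dx2 = x1 cos(x1 x2)
linear = dx1 * x2 * math.cos(x1 * x2) + dx2 * x1 * math.cos(x1 * x2)
print(exact, linear)   # the two values agree to first order in dx1, dx2
```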
2.1.3 Higher-dimensional functions x = f(x1, x2, ..., xn)
For functions of multiple variables it's not possible to make an understandable sketch
anymore. However, completely analogous to earlier derivations, we can write

f(x1 + ∆x1, ..., xn + ∆xn) ≈ f(x1, ..., xn) + ∆x1 ∂f/∂x1 + ... + ∆xn ∂f/∂xn   (2.11)

and

∆x ≈ ∆x1 ∂f/∂x1 + ∆x2 ∂f/∂x2 + ... + ∆xn ∂f/∂xn.   (2.12)

2.2 Calculating independent uncertainties


When a quantity has to be calculated from several other measured quantities, the
question arises how the uncertainty in the final answer can be determined from the
uncertainties in the measured quantities. For example, when the volume of a cylinder
has to be determined and the height and the diameter of the cylinder have been mea-
sured, we need to have a method to determine the uncertainty of the volume from the
uncertainties in the height and in the diameter. One method could be to fill in the
minimum and maximum values for the measured quantities, thereby determining the
interval in which the actual value lies. This is a rather tedious method (especially with
complicated functions) and can even lead to erroneous results. In this chapter we’ll
derive the calculation method with which we can calculate the final result quite easily.
In this chapter we’ll limit ourselves to maximum uncertainties (100% intervals). For
the 68% intervals, similar methods can be derived, which only look slightly different.
We’ll look at these methods in the next chapter.
Before we continue, two things should be noted. Firstly: uncertainties are always
positive, even when the quantity at hand is negative. For example, with a measured
voltage of V = (−3.04±0.07) V the uncertainty ∆V = 0.07 V is positive! Secondly:
uncertainties in measured quantities that will be used in further calculations should
always be independent of each other. In the example of determining the volume of
a cylinder, the height and diameter should be determined independently in order for
the uncertainties to be independent. This will generally be true, but measurement
techniques exist for which this is not the case. We will elaborate on this later in this
chapter.
In section 2.2.1 a general expression for the uncertainty ∆x will first be given. In the
subsequent sections, this expression will be worked out for simple cases. In section
2.2.10 several examples will be given.

2.2.1 The general case: x = f(x[1], x[2], ..., x[n])
The measured quantities are x[1] ± ∆x[1], x[2] ± ∆x[2], ..., x[n] ± ∆x[n]. Using equation
(2.11), we see that

x ± ∆x = f(x[1] ± ∆x[1], x[2] ± ∆x[2], ..., x[n] ± ∆x[n])
        = f(x[1], x[2], ..., x[n]) ± ∆x[1] ∂f/∂x[1] ± ∆x[2] ∂f/∂x[2] ± ... ± ∆x[n] ∂f/∂x[n].   (2.13)

This only works when all ∆x[i] are small, but we assumed this anyway. It's clear that x
has its maximum at

x + ∆x = x + ∆x[1] |∂f/∂x[1]| + ∆x[2] |∂f/∂x[2]| + ... + ∆x[n] |∂f/∂x[n]|   (2.14)

and its minimum at

x − ∆x = x − ∆x[1] |∂f/∂x[1]| − ∆x[2] |∂f/∂x[2]| − ... − ∆x[n] |∂f/∂x[n]|.   (2.15)

Note the absolute values!

The general formula for the uncertainty ∆x therefore is

∆x = Σ_{i=1}^{n} |∂f/∂x[i]| · ∆x[i].   (2.16)

This is the most general formula. With this, all formulas in the following sections can be
derived. Check this yourself.
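Equation (2.16) also lends itself to a direct numerical implementation. The sketch below is our own (the helper name and the finite-difference step are assumptions, not part of the notes); it approximates the partial derivatives with central differences and sums their absolute contributions.

```python
import math

def max_uncertainty(f, values, uncertainties, step=1e-6):
    """Propagate maximum (100%) uncertainties through x = f(x1, ..., xn)
    via eq. (2.16): sum over i of |df/dxi| * dxi. The partial derivatives
    are approximated with central differences. Our own illustrative helper."""
    total = 0.0
    for i, dxi in enumerate(uncertainties):
        up = list(values); up[i] += step
        down = list(values); down[i] -= step
        deriv = (f(*up) - f(*down)) / (2 * step)
        total += abs(deriv) * dxi
    return total

# Cylinder volume V = pi d^2 h / 4 (see the examples in section 2.2.10),
# with illustrative values d = 2 cm, h = 5 cm and maximum errors 0.01 and 0.02 cm
V = lambda d, h: math.pi * d * d * h / 4
dV = max_uncertainty(V, [2.0, 5.0], [0.01, 0.02])
# The analytic rule dV/V = 2 dd/d + dh/h should give the same number
print(dV, V(2.0, 5.0) * (2 * 0.01 / 2.0 + 0.02 / 5.0))
```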

2.2.2 The sum of measured quantities: x = x[1] + x[2]


Measured are the quantities x[1] and x[2] with uncertainties ∆x[1] and ∆x[2] respectively.
We want to determine the uncertainty ∆x in the quantity x, which is the sum of x[1]
and x[2], so x = x[1] + x[2]. The interval in which the actual value of the first quantity
lies is of course [x[1] − ∆x[1], x[1] + ∆x[1]] and the second quantity lies in the interval
[x[2] − ∆x[2], x[2] + ∆x[2]]. The smallest value that x can have is equal to the sum of the
smallest values of x[1] and x[2], so

x − ∆x = (x[1] − ∆x[1] ) + (x[2] − ∆x[2] ) = (x[1] + x[2] ) − (∆x[1] + ∆x[2] ). (2.17)

The upper limit of x is given by the sum of the upper limits of x[1] and x[2] :

x + ∆x = (x[1] + ∆x[1] ) + (x[2] + ∆x[2] ) = (x[1] + x[2] ) + (∆x[1] + ∆x[2] ). (2.18)

We can conclude from this that

∆x = ∆x[1] + ∆x[2] . (2.19)


2.2.3 The difference in measured quantities: x = x[1] − x[2]
In a similar way we can now calculate the uncertainty when two measured quantities
are subtracted from each other. However, now x is at its minimum when x[1] is at its
minimum and x[2] at its maximum, so

x − ∆x = (x[1] − ∆x[1] ) − (x[2] + ∆x[2] ) = (x[1] − x[2] ) − (∆x[1] + ∆x[2] ). (2.20)

Now x is at its maximum when x [1] is at its maximum and x[2] at its minimum, so

x + ∆x = (x[1] + ∆x[1] ) − (x[2] − ∆x[2] ) = (x[1] − x[2] ) + (∆x[1] + ∆x[2] ). (2.21)

Again, we see that the obtained equation is

∆x = ∆x[1] + ∆x[2] . (2.22)

2.2.4 Multiplication by a constant: x = C x[1]


We want to determine the uncertainty ∆x of the quantity x, which is the product of a
constant C and the measured quantity x [1] . We need to distinguish between positive
and negative values of C. When C is positive, x of course is at its maximum when x [1]
is at its maximum, so

x + ∆x = C (x[1] + ∆x[1] ) = C x[1] + C ∆x[1] , (2.23)

and x is at its minimum when x [1] is at its minimum:

x − ∆x = C (x[1] − ∆x[1] ) = C x[1] − C ∆x[1] . (2.24)

In this case ∆x = C ∆x[1] . When C is negative, x is at its maximum when x[1] is at its
minimum, so
x + ∆x = C (x[1] − ∆x[1] ) = C x[1] − C ∆x[1] , (2.25)
and x is at its minimum when x [1] is at its maximum, so

x − ∆x = C (x[1] + ∆x[1] ) = C x[1] + C ∆x[1] . (2.26)

Therefore, in this case ∆x = −C ∆x[1] . Conclusion: the general expression is

∆x = |C| ∆x[1] . (2.27)

2.2.5 x = C1 x[1] + C2 x[2] + ... + Cn x[n] = Σ_{i=1}^{n} Ci x[i]

This case is a combination of §2.2.2 and §2.2.4. The result can easily be guessed:

∆x = Σ_{i=1}^{n} |Ci| ∆x[i].   (2.28)

2.2.6 The product of measured quantities: x = x[1] · x[2]
Here we should also distinguish between positive and negative values, but now those of
x[1] and x[2]. Let's assume both are positive for now. The same reasoning as with the
sum of two measured quantities can be used: x is at its maximum when both x[1] and
x[2] are at their maximum, so

x + ∆x = (x[1] + ∆x[1])(x[2] + ∆x[2]) = x[1] x[2] + x[2] ∆x[1] + x[1] ∆x[2] + ∆x[1] ∆x[2]
       = x[1] x[2] (1 + ∆x[1]/x[1] + ∆x[2]/x[2] + (∆x[1]/x[1])(∆x[2]/x[2]))   (2.29)

and x is at its minimum when both x[1] and x[2] are at their minimum:

x − ∆x = (x[1] − ∆x[1])(x[2] − ∆x[2]) = x[1] x[2] − x[2] ∆x[1] − x[1] ∆x[2] + ∆x[1] ∆x[2]
       = x[1] x[2] (1 − ∆x[1]/x[1] − ∆x[2]/x[2] + (∆x[1]/x[1])(∆x[2]/x[2])).   (2.30)

When we divide the left and right part of equation (2.29) by x (= x[1] x[2]), we find

1 + ∆x/x = 1 + ∆x[1]/x[1] + ∆x[2]/x[2] + (∆x[1]/x[1])(∆x[2]/x[2]).   (2.31)

If all relative uncertainties in the measured quantities are small, i.e. ∆x[1]/x[1] and
∆x[2]/x[2] are small (compared to 1), then the product of both is even smaller. When
for example the relative uncertainties are measured to be 5% (so ∆x[1]/x[1] = ∆x[2]/x[2] =
0.05), then the term (∆x[1]/x[1])·(∆x[2]/x[2]) = 0.0025. This means that this term is
negligible compared to ∆x[1]/x[1] and ∆x[2]/x[2]. We find that

1 + ∆x/x ≈ 1 + ∆x[1]/x[1] + ∆x[2]/x[2],   (2.32)

from which follows that

∆x/x ≈ ∆x[1]/x[1] + ∆x[2]/x[2].   (2.33)

From the expression for x − ∆x we find the same via

1 − ∆x/x = 1 − ∆x[1]/x[1] − ∆x[2]/x[2] + (∆x[1]/x[1])(∆x[2]/x[2]) ≈ 1 − ∆x[1]/x[1] − ∆x[2]/x[2]   (2.34)

(again with small relative uncertainties), from which again follows that

∆x/x ≈ ∆x[1]/x[1] + ∆x[2]/x[2].   (2.35)

Neglecting the term (∆x[1]/x[1])(∆x[2]/x[2]) is only allowed when ∆x[1]/x[1] ≪ 1 and
∆x[2]/x[2] ≪ 1, so when ∆x[1] ≪ x[1] and ∆x[2] ≪ x[2].
The absolute uncertainties in the measured quantities have to be much smaller than
the quantities themselves!
If now one (or both) of the measured quantities has a negative value, the calculation
gets a little bit more complicated (just as in the case when x = C x[1] with C < 0). Try
to derive by yourself that the general formula looks like the following:

∆x/|x| = ∆x[1]/|x[1]| + ∆x[2]/|x[2]|.   (2.36)
2.2.7 The ratio of measured quantities: x = x[1]/x[2]
Again, this case is slightly more difficult. Like in the last case, we will first consider
positive values of x[1] and x[2] for simplicity; we'll generalize this later. The quantity x
to be calculated is at its maximum when x[1] is at its maximum, but when x[2] is at its
minimum (the smaller the denominator, the larger the fraction), so

x + ∆x = (x[1] + ∆x[1])/(x[2] − ∆x[2]) = (x[1]/x[2]) · (1 + ∆x[1]/x[1])/(1 − ∆x[2]/x[2]).   (2.37)

Intermezzo: We will first show that 1/(1−δ) ≈ 1 + δ and that 1/(1+δ) ≈ 1 − δ if δ is
sufficiently small, i.e. δ ≪ 1. This is relatively simple to understand when
we note that

(1 + δ) · (1 − δ) = 1 + δ − δ − δ² = 1 − δ² ≈ 1.   (2.38)

This of course only works when δ² ≪ 1. By dividing the left and right part
of the equation by 1 + δ, we find

1 − δ ≈ 1/(1 + δ).   (2.39)

By not dividing by 1 + δ, but by 1 − δ, we find that

1 + δ ≈ 1/(1 − δ).   (2.40)

With this the proof is given. When δ = 0.1, the result is still 1% accurate
(check this yourself).
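That check is quickly done numerically; the short sketch below (our own) prints the relative error of the approximation for a few values of δ.

```python
# Relative error of the approximation 1/(1 - delta) ~ 1 + delta;
# it behaves as delta**2, so at delta = 0.1 the result is 1% accurate
for delta in (0.1, 0.05, 0.01):
    exact = 1.0 / (1.0 - delta)
    approx = 1.0 + delta
    print(delta, abs(exact - approx) / exact)
```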

We are going to use the approximation above to work out equation (2.37). If we call
∆x[2]/x[2] = δ, the factor 1/(1−δ) will appear in our equation. So, using the rules we found
above, we can approximate this by 1 + δ = 1 + ∆x[2]/x[2]. Therefore we find

x + ∆x ≈ (x[1]/x[2]) (1 + ∆x[1]/x[1])(1 + ∆x[2]/x[2]) ≈ (x[1]/x[2]) (1 + ∆x[1]/x[1] + ∆x[2]/x[2]),   (2.41)

where we again have neglected the term (∆x[1]/x[1])(∆x[2]/x[2]). Dividing the left and right part by
x (= x[1]/x[2]) gives

∆x/x = ∆x[1]/x[1] + ∆x[2]/x[2].   (2.42)

In a similar way we can derive that

x − ∆x = (x[1] − ∆x[1])/(x[2] + ∆x[2]) = (x[1]/x[2]) · (1 − ∆x[1]/x[1])/(1 + ∆x[2]/x[2])
       ≈ (x[1]/x[2]) (1 − ∆x[1]/x[1])(1 − ∆x[2]/x[2]) ≈ (x[1]/x[2]) (1 − ∆x[1]/x[1] − ∆x[2]/x[2]),   (2.43)
from which again follows that

∆x/x = ∆x[1]/x[1] + ∆x[2]/x[2].   (2.44)

When x[1] or x[2] (or both) are negative, we find the same result as with the product, so

∆x/|x| = ∆x[1]/|x[1]| + ∆x[2]/|x[2]|.   (2.45)

Note that when we add or subtract two quantities, their absolute uncertainties should
be added (see equations (2.19) and (2.22)), whereas when we multiply or divide two
quantities, their relative uncertainties should be added (see equations (2.36) and (2.45)).


2.2.8 Powers of measured quantities: x = (x[1])^α

We again start with the case x[1] > 0 and α > 0. We write

x + ∆x = (x[1] + ∆x[1])^α = (x[1])^α (1 + ∆x[1]/x[1])^α.   (2.46)

Intermezzo: we will show that (1 + δ)^α ≈ 1 + αδ when δ ≪ 1. Using
equation (2.2) this can be seen fairly easily. Consider the function f(x) = x^α:
its derivative is f′(x) = α x^(α−1). If we now insert the values x1 = 1 and
∆x1 = δ in equation (2.2), we find

(1 + δ)^α ≈ 1^α + δ α 1^(α−1) = 1 + αδ.   (2.47)

We can use the approximation above in the expression for x + ∆x. If we again take
δ = ∆x[1]/x[1], then

x + ∆x ≈ (x[1])^α (1 + α ∆x[1]/x[1])   (2.48)

and therefore

∆x/x = α ∆x[1]/x[1].   (2.49)
We find exactly the same result when we perform the calculation for x − ∆x .

When we look at the case where α < 0 (but still x[1] > 0), then actually x = (x[1])^α =
1/(x[1])^|α|. This means that x is at its maximum when x[1] is at its minimum:

x + ∆x = 1/(x[1] − ∆x[1])^|α| = (x[1])^α (1/(1 − ∆x[1]/x[1]))^|α|.   (2.50)
We again use 1/(1 − ∆x[1]/x[1]) ≈ 1 + ∆x[1]/x[1] (see equation (2.40)) and then divide
by x (= (x[1])^α), in order to find

1 + ∆x/x ≈ (1 + ∆x[1]/x[1])^|α| ≈ 1 + |α| ∆x[1]/x[1],   (2.51)

where we have also used equation (2.47). We therefore find

∆x/x = |α| ∆x[1]/x[1].   (2.52)
Via x − ∆x we find the same expression.
Derive by yourself that, when negative values of x[1] are allowed, the general expression
becomes:

∆x/|x| = |α| ∆x[1]/|x[1]|.   (2.53)

2.2.9 x = C · (x[1])^α1 · (x[2])^α2 · ... · (x[n])^αn = C Π_{i=1}^{n} (x[i])^αi

Show by yourself that in this case

∆x/|x| = Σ_{i=1}^{n} |αi| (∆x[i]/|x[i]|).   (2.54)

Note that the constant C doesn't show up in ∆x/|x| (it does of course in ∆x).
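Equation (2.54) is the rule used most often in practice. The sketch below (our own helper name) applies it to the circle and cylinder examples worked out in the next section.

```python
def relative_uncertainty_product(exponents, rel_uncertainties):
    """Eq. (2.54): for x = C * (x1^a1) * ... * (xn^an), the relative
    maximum uncertainty is sum over i of |a_i| * (dx_i / |x_i|).
    Our own illustrative helper."""
    return sum(abs(a) * r for a, r in zip(exponents, rel_uncertainties))

# Circle area S = (pi/4) d^2: a 1% measurement of d gives a 2% area
print(relative_uncertainty_product([2], [0.01]))            # 0.02
# Cylinder volume V = (pi/4) d^2 h: 1% on d and 0.4% on h, about 2.4%
print(relative_uncertainty_product([2, 1], [0.01, 0.004]))
```

Note that the constant C indeed plays no role here.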

2.2.10 Examples
• Suppose that we want to measure the surface area S of a circle and in order to
do this we measure the diameter d. The uncertainty in d is ∆d. We calculate the
area by S = πd²/4, but what is ∆S?

We use equation (2.54), but only with one term (so n = 1). We set x = S, C = π/4,
x[1] = d and α1 = 2. Therefore,

∆S/S = |2| ∆d/|d| = 2 ∆d/d.   (2.55)
An alternative method would be to use the general expression (2.16). Again, we
set x[1] = d and use the function S = f(d) = (π/4)d². The partial derivative we
now need is ∂f/∂x[1] = ∂S/∂d = (π/2)d. Substituting this in the general equation
gives

∆S = |(π/2)d| ∆d = (π/2)d ∆d,   (2.56)

and dividing by S = (π/4)d² gives

∆S/S = ((π/2)d ∆d) / ((π/4)d²) = 2 ∆d/d.   (2.57)

• Suppose we want to measure the volume of a cylinder with height h and diameter d.

In order to do this we measure h ± ∆h and d ± ∆d. The volume V of course is
given by V = πd²h/4. Using equation (2.54) we can calculate ∆V. We set x = V,
C = π/4, x[1] = d, α1 = 2, x[2] = h and α2 = 1 in order to find

∆V/V = |2| ∆d/|d| + |1| ∆h/|h| = 2 ∆d/d + ∆h/h.   (2.58)

Again, the alternative is to use equation (2.16). We again set x[1] = d, x[2] = h
and use the function V = f(d, h) = πd²h/4. The partial derivatives we need are
∂f/∂x[1] = ∂V/∂d = πdh/2 and ∂f/∂x[2] = ∂V/∂h = πd²/4. Substitution gives

∆V = |πdh/2| ∆d + |πd²/4| ∆h = (π/2)dh ∆d + (π/4)d² ∆h,   (2.59)

and dividing by V = (π/4)d²h gives

∆V/V = ((π/2)dh ∆d)/((π/4)d²h) + ((π/4)d² ∆h)/((π/4)d²h) = 2 ∆d/d + ∆h/h.   (2.60)

• The refractive index of glass can be determined by shining light on a flat glass
surface as shown in the figure below.

Figure: a light ray passing from air into glass at a flat interface.

By measuring the angle of refraction r ± ∆r at a certain angle of incidence i ± ∆i,
we can, using the equation

n = sin i / sin r,   (2.61)

determine the refractive index n. How do we calculate ∆n? We do this again by
using equation (2.16). We set x[1] = i, x[2] = r and use the function n = f(i, r) =
sin i / sin r = sin i · (sin r)⁻¹. The partial derivatives we need are ∂f/∂x[1] =
∂n/∂i = cos i / sin r and ∂f/∂x[2] = ∂n/∂r = −(sin i · cos r)/sin²r. Inserting this
in the general equation gives

∆n = |∂n/∂i| ∆i + |∂n/∂r| ∆r = (cos i / sin r) ∆i + (sin i · cos r / sin²r) ∆r.   (2.62)

We can in fact use this to calculate ∆n. We can simplify the equation above a
little by dividing by |n|. We then obtain

∆n/|n| = ∆i/|tan i| + ∆r/|tan r|.   (2.63)

Check this yourself.


Suppose that we measure i = (36 ± 1)° and r = (23 ± 1)°. In order to use the
equation above, we have to convert the uncertainties to radians instead of just
filling in the uncertainties in degrees. Ask yourself why. Converting to radians
gives ∆i = ∆r = 0.018 (not the final result, so two significant digits). By filling in
everything, we find n = sin 36°/sin 23° = 1.50432 (not yet rounded, we can do this
when we've calculated ∆n) and ∆n/n = 0.018/tan 36° + 0.018/tan 23° = 0.0672.
Rounding gives us the final result: n ± ∆n = (1.5 ± 0.1). Note that the refractive
index is a dimensionless quantity.
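The worked example can be reproduced numerically. The sketch below is our own; it deliberately uses the rounded intermediate value 0.018 rad from the text rather than the exact conversion of 1°.

```python
import math

i, r = math.radians(36), math.radians(23)
di = dr = 0.018   # 1 degree in radians, rounded to two significant digits
n = math.sin(i) / math.sin(r)
rel_n = di / abs(math.tan(i)) + dr / abs(math.tan(r))
print(round(n, 5), round(rel_n, 4))   # 1.50432 0.0672 -> n = (1.5 +/- 0.1)
```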
2.3 Dependent uncertainties
We mentioned before that all calculation rules we’ve used so far can only be used when
the uncertainties are independent. What that means and what can go wrong will be
demonstrated using an example.

Figure: two resistors R1 and R2 connected in parallel.

The total resistance R of two resistances R1 and R2 in parallel is known to be


1/R = 1/R1 + 1/R2,   (2.64)

from which follows that

R = R1 R2 / (R1 + R2).   (2.65)

If we would know (by measurement) that R1 = (5.4 ± 0.1) Ω and also that
R2 = (1.40 ± 0.05) Ω, we could fill this in:

R = ((5.4 ± 0.1) · (1.40 ± 0.05)) / ((5.4 ± 0.1) + (1.40 ± 0.05)) Ω.   (2.66)

The product in the numerator we can calculate using section 2.2.6 (equa-
tion (2.36)): (5.4 ± 0.1) · (1.40 ± 0.05) Ω² = (7.56 ± 0.41) Ω² (check this
yourself). The sum in the denominator we can calculate using section 2.2.2
(equation (2.19)): (5.4 ± 0.1) Ω + (1.40 ± 0.05) Ω = (6.80 ± 0.15) Ω. Note that
this isn't the final result yet, so we still include two significant digits (since
we'll have to use it in further calculations). Inserting these numbers in the
ratio gives:

R = (7.56 ± 0.41) Ω² / (6.80 ± 0.15) Ω = (1.11 ± 0.08) Ω,   (2.67)
where we have used the results of §2.2.7.

We made a huge mistake here. In the last step we divided R1 R2 by R1 +R2 and used the
calculation rules for independent uncertainties. This of course is not correct. If in reality
R1 R2 was bigger than we thought it would be (so bigger than 7.56 Ω2 ), then R1 + R2
would of course also be bigger than the 6.80 Ω with which we are calculating. And if
R1 R2 would be smaller than 7.56 Ω2 , then R1 + R2 would also be smaller than 6.80 Ω.
In the derivation of the uncertainty in the ratio of two quantities we assumed that one
quantity can have a variation in the positive direction, while the other quantity has a
variation in the negative direction (this gave the maximum variation of the ratio itself).
This cannot happen in the case demonstrated above; the variations in the numerator
and in the denominator always go in the same direction (too big / too small). One
would expect the true uncertainty in R to be smaller than what we’ve calculated here.
We’ll demonstrate this.
The simplest way is to use the equation 1/R = 1/R1 + 1/R2 and to first calculate the
uncertainties in 1/R1 and in 1/R2. From
R1 = (5.4 ± 0.1)Ω (2.68)
follows that
1/R1 = (0.1852 ± 0.0034) Ω⁻¹;   (2.69)
and from
R2 = (1.40 ± 0.05)Ω (2.70)
follows that
1/R2 = (0.714 ± 0.026) Ω⁻¹   (2.71)
(equation (2.53) with α = −1). The sum of these two quantities can be calculated quite
easily:
1/R = (0.899 ± 0.029) Ω⁻¹,   (2.72)
from which again follows that

R = (1.11 ± 0.04)Ω (2.73)

(again equation (2.53) with α = −1). We see that the uncertainty is indeed smaller
than what we found in the incorrect case.
An alternative way would have been to use the general expression (equation (2.16)). In
that case we find that
∆R = (∂R/∂R1) ∆R1 + (∂R/∂R2) ∆R2.   (2.74)
We can calculate the partial derivatives:

∂R/∂R1 = (R2/(R1 + R2))²,
∂R/∂R2 = (R1/(R1 + R2))².   (2.75)
Check this yourself. Also note that ∂R/∂R1 and ∂R/∂R2 are dimensionless. The
expression for ∆R becomes

∆R = (R2/(R1 + R2))² ∆R1 + (R1/(R1 + R2))² ∆R2,   (2.76)
and filling in all numbers again gives ∆R = 0.04 Ω.
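Both calculations are easy to redo numerically; the sketch below (our own) contrasts the incorrect "independent" treatment with the correct one.

```python
# R = R1 R2 / (R1 + R2) with R1 = (5.4 +/- 0.1) and R2 = (1.40 +/- 0.05) Ohm
R1, dR1 = 5.4, 0.1
R2, dR2 = 1.40, 0.05
R = R1 * R2 / (R1 + R2)

# Incorrect: numerator and denominator treated as independent (eq. 2.45)
rel_num = dR1 / R1 + dR2 / R2          # product rule for R1 * R2
rel_den = (dR1 + dR2) / (R1 + R2)      # sum rule for R1 + R2
dR_wrong = R * (rel_num + rel_den)

# Correct: general formula (2.16) with the partial derivatives of eq. (2.75)
dR_right = (R2 / (R1 + R2))**2 * dR1 + (R1 / (R1 + R2))**2 * dR2

print(round(R, 2), round(dR_wrong, 2), round(dR_right, 2))   # 1.11 0.08 0.04
```

The dependent treatment indeed overestimates the uncertainty by a factor of about two here.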

2.4 Exercises and problems

2.4.1 Questions
1. Calculate the partial derivatives ∂f/∂x and ∂f/∂y of the following functions:

(a) f(x, y) = x⁴y²
(b) f(x, y) = x²/(x + y)
(c) f(x, y) = (4x² + 2y)/(3x − y)
(d) f(x, y) = sin(x)/sin(y)
(e) f(x, y) = x sin(xy)

2. Determine the density ρ ± ∆ρ of a body if its mass is m ± ∆m = (24.320 ± 0.005) g
and its volume is V ± ∆V = (10.20 ± 0.05) cm³.

3. Someone has two numbers: x ± ∆x = 10 ± 1 and y ± ∆y = 20 ± 1. What is the


best estimate of the product p = xy? First calculate the smallest possible value
and then calculate the largest possible value. What is p ± ∆p in this case? Now
∆p ∆x ∆y
calculate p ± ∆p using the expression = + . Compare the results.
p x y
Next, do the same for x ± ∆x = 10 ± 6 and y ± ∆y = 20 ± 15, first using
the minimum/maximum value and then using the formula. Again, compare the
results and explain.

4. With a good stopwatch and some experience, one can measure time intervals from
approximately 1 second to many minutes, with a maximum error of approximately
0.1 s. Say we want to measure the period T ≈ 0.5 s of a pendulum. If we
measure only one swing, then we have a relative uncertainty of about 20%. By
measuring multiple swings, the relative uncertainty can be improved (=reduced),
as illustrated by the following.

(a) If we measure the duration of 5 subsequent swings as (2.4 ± 0.1) s, how large
is T with relative and absolute uncertainty?
(b) If 20 subsequent swings take (9.4 ± 0.1) s, what is T (including uncertainty)?

5. An experiment with a simple pendulum is used to determine the gravitational
acceleration using T = 2π√(l/g). How large are the absolute error and relative
uncertainty in g if the precision (=relative uncertainty) of T is 2% and that of l
is 1.5%?
6. The heat capacity S of a liquid is measured using a continuous flow calorimeter,
obeying V I = JSQ(T2 − T1 ). V and I have 2% precision, J is known ‘exactly’,
Q is measured with 0.5% precision and T1 and T2 are measured with ± 0.1 ◦ C
maximum error. What is the minimum value of T2 − T1 such that S can be
determined with an accuracy of at least 10% ?

7. The wavelength λ of monochromatic light can be determined by shining a beam on


a diffraction grating and measuring the angle θ over which the light is diffracted.
The wavelength can be determined from λ = d sin(θ) where d is the slit separation.
The angle θ can be measured with an accuracy of 1 arc minute, the slit separation
d is assumed to be known precisely. Calculate the relative uncertainty ∆λ/λ if
we measure θ = 15◦ 35′ .
8. If z = x/(x² + 1), calculate the ratio of the accuracy (=relative uncertainty) in z
to that in x. In other words, calculate (∆z/|z|) / (∆x/|x|).

9. If z = sin(x) and x lies between x0 − ∆x and x0 + ∆x , between which boundaries


does z lie?

10. The focal distance f of a thin lens is given by 1/f = 1/o + 1/i, where o and i are
the object and image distance, respectively. If the object distance is o ± ∆o =
(24.30 ± 0.05) cm and the image distance is i ± ∆i = (17.40 ± 0.05) cm, calculate
f ± ∆f.

2.4.2 Problems
1. An electrician uses two resistors in a parallel circuit. The resistance of these
resistors is rated by the manufacturer at R1 = 100 Ω ± 5% and R2 = 270 Ω ± 5%,
respectively. The uncertainties herein are 100%-intervals. The total resistance R
of the parallel circuit follows from 1/R = 1/R1 + 1/R2.

(a) Calculate the value of R and its uncertainty


(b) If the electrician wishes to have a more precisely known value of R, which of
the resistors should he swap for a more accurately rated one?

2. Using a grating spectroscope, the wavelength of monochromatic light can be de-


termined. This can be done as follows:

Figure: a lamp and collimator illuminate a grating; a rotatable scope is used to
view the diffraction pattern.

Through the scope (which is rotatable around the centre of the circle), one can
look at the pattern of maxima and minima in light intensity which occurs due
to the grating. The 0-th order maximum (n = 0) occurs at α0 ≃ 0°, the n-th
maximum occurs at an angle α with regard to the n=0-maximum, where α is
given by
nλ = d sin (α) .

Here, λ is the wavelength of light (to be determined) and d the grating constant
(the distance between the grooves of the grating). By measuring the angle α at
the n-th maximum, λ can be determined. This angle α is determined by once
measuring it for a left turn (positive angle α+ ) and once measuring it for a right
turn (negative angle α− ). Since the pattern is symmetric around n = 0, i.e.,
around α0 ≃ 0°, we calculate α as the difference of these two results, divided by
two:

α+ − α−
α= .
2
The grating constant d is very accurately known and equals d = 2190 nm. The
wavelength to be determined is around λ ≃ 600 nm. The uncertainty in determining
the angles is taken as the readout accuracy of the apparatus, which is
∆α+ = ∆α− = 1′ = (1/60)°.

(a) Why is α determined as discussed (by measuring once for a left turn and
once for a right turn) and not by taking α = α+ − α0 (so, measuring once
for a left turn and once at n = 0 with α0 ≃ 0)?
(b) Show that the uncertainty ∆λ in the calculated wavelength is given by
∆λ = (d/n)·√(1 − (nλ/d)²)·∆α+. Realise that ∆α+ = ∆α−.

(c) Calculate the uncertainty in the calculated wavelength if it is measured at


n = 1 (so, at the first maximum). Assume that λ ≃ 600 nm.
(d) What is the highest possible value of n? Explain why.
(e) For what n is the uncertainty ∆λ minimal?
(f) Calculate the minimal ∆λ .
(g) We measure at n = 2. The results are: α+ = (32°29′ ± 0°1′) and α− =
(−32°39′ ± 0°1′). Calculate the wavelength λ and its uncertainty ∆λ.

3. (a) What are the two conditions under which the general formula for 100%-
intervals is valid?
(b) For both conditions, explain why they are necessary. Possibly, give an
example to prove this.
(c) If y = x1 + x2 where x1 and x2 are measured with respective uncertainties
∆x1 and ∆x2, are both conditions still necessary?
4. Someone wants to determine the resistance of an unknown resistor R. They
have at their disposal a voltage source producing an unknown voltage V , an
amperemeter that can measure current with an accuracy of ∆i (100%-interval) and
a very precise resistance R1 (so ∆R1 = 0). The unknown resistance is determined
as follows: the voltage source is connected to the unknown resistor and with
the amperemeter, the current I0 running through it is measured. In this case
V = I0 R. Then, the voltage source is connected over a series circuit containing
both the unknown and the known resistor, and the current I1 is measured. Now,
V = I1 (R + R1 ).

(a) Show that the unknown resistance R can be calculated using
    R = R1 · I1 / (I0 − I1 ).
(b) If the uncertainty in the current measurement is called ∆i (so ∆I0 = ∆I1 =
∆i ), give an expression for the uncertainty ∆R to which R can be determined.
(c) For what value of R1 is the measurement most accurate? Express this opti-
mal R1 as a function of R. Hint: Write ∆R as a function of V , R, R1 and
∆i (not I0 and I1 ), and calculate when ∆R is minimal.

In an experiment, a resistor with R1 = 1.0000 kΩ is used. The results of the
measurements are I0 = (6.5 ± 0.1) mA and I1 = (4.6 ± 0.1) mA (both 100%-
intervals).

(d) Calculate the unknown resistance R and its uncertainty.


(e) Which resistance R1 would have given an optimal result, and what would
the uncertainty ∆R have been in that case?

5. Using the thin lens formula 1/f = 1/i + 1/o, the focal length f of an ideal lens is
determined. To this end, an object is placed at a distance o from the lens, and
the distance i from the image to the lens is measured. The measurements of o
and i are performed by measuring the positions of the object (po ), lens (pl ), and
image (pi ) on an optical rail. Then of course, o = pl − po and i = pi − pl . We
measure po = (26.35 ± 0.05) cm, pl = (60.40 ± 0.05) cm and pi = (86.8 ± 0.2) cm.
All uncertainties are 100%-intervals. The uncertainty in pi is larger than in po
and pl , since it is difficult to determine where the image is exactly sharp.

(a) Calculate i and o and their uncertainties ∆i and ∆o .


(b) Can these values be used to calculate f ? Explain your answer.
(c) Calculate the focal length f of the lens and its uncertainty ∆f .

3. Measuring with random variations: 68%
intervals

3.1 Introduction
In the last chapter we looked at measurements where no spread in the measurement
results was found and we derived calculation rules for this. In this chapter we’ll look at
measurements where, upon repeating the measurement, we find a different result each
time. It's obvious that these results have to be averaged to get the final result, but how
we determine an uncertainty and how we use it in calculations isn’t so obvious. For this
reason, we’ll investigate this in more detail. But before we do that, some definitions
should first be given and some concepts should be introduced.

3.2 Definitions
When a different result is found each time a measurement is repeated, it’s impossi-
ble to predict what the exact result of the next (new) measurement will be. In that
case we speak of a random quantity or a stochastic variable. In some cases the result
of a measurement can take any possible value, but in most cases the possible values
of the results are limited in some way. The set of all possible outcomes of a certain
measurement we call the sample space. This sample space might be continuous (when
we are dealing with continuous stochastic variables) or discrete (for discrete stochastic
variables). The sample space of a continuous stochastic variable is by definition in-
finitely big and therefore not all possible outcomes will be found with a finite number
of measurements. When we are dealing with discrete stochastic variables, the sample
space isn’t infinitely big by definition, but it still can be. When we are performing
measurements, we make a set of observations. These we call the population. Of such a
population we can make a frequency distribution. When we are dealing with a discrete
quantity, this simply is the number of times an outcome from the sample space occurs
in the population (so the number of times a certain value has been found in our series
of measurements).

For example, a die has been thrown 30 times. The outcome we call x. It's
clear that x is a discrete stochastic variable and the frequency distribution
F(x) of the population of 30 throws might look like figure 3.1.

When we are dealing with a continuous stochastic variable, we can’t easily make such a
distribution. When we’re dealing with a discrete quantity, an element from the sample

Figure 3.1: Frequency distribution F(x) of the results of 30 throws with a die.

space will (with enough measurements) occur multiple times in the population, but
with a continuous quantity this is not the case. We should therefore divide the sample
space in intervals and count how many times an outcome falls in one of these intervals.
How big we make these intervals of course depends on the size of the sample. The
bigger the sample (the amount of measurements), the smaller we can choose the size
of the intervals. For this continuous case with intervals, we don’t use the frequency
distribution, but the frequency density distribution, which is defined as the number of
measurements in a certain interval divided by the size of the interval.

For example, we can measure the velocity of ions in a gas discharge. The
measurement is repeated 400 times and this has resulted in:

Speed of the     Number     Speed of the     Number     Speed of the     Number
ions in m/s      of ions    ions in m/s      of ions    ions in m/s      of ions
0-20 0 180-200 25 360-380 25
20-40 0 200-220 45 380-400 16
40-60 0 220-240 50 400-420 12
60-80 0 240-260 43 420-440 10
80-100 1 260-280 37 440-460 5
100-120 0 280-300 32 460-480 3
120-140 2 300-320 20 480-500 0
140-160 7 320-340 25 500-520 1
160-180 18 340-360 23 520-540 0

The frequency density distribution then looks like the one in figure 3.2. Note
the unit of the frequency density distribution in this figure (s/m). Try to
think of why this is the case.
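The step from the binned counts above to a frequency density can be sketched in a few lines of Python (a check on the table; the names `counts` and `density` are ours, not from the text):

```python
# Counts of ions per 20 m/s velocity bin, taken from the table above
# (bins 0-20, 20-40, ..., 520-540 m/s).
counts = [0, 0, 0, 0, 1, 0, 2, 7, 18, 25, 45, 50, 43, 37, 32, 20,
          25, 23, 25, 16, 12, 10, 5, 3, 0, 1, 0]
bin_width = 20.0  # m/s

# Frequency density = number of measurements in a bin / size of the bin.
# Its unit is (number of ions) / (m/s), i.e. s/m, as noted for figure 3.2.
density = [n / bin_width for n in counts]

print(sum(counts))   # 400 measurements in total
print(max(density))  # 2.5 s/m, the tallest bar of figure 3.2
```

This also makes the unit of figure 3.2 transparent: dividing a pure count by a bin width in m/s leaves a quantity with unit s/m.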

Figure 3.2: Frequency density distribution F (v) of the velocities of ions in a gas discharge
experiment.

In this example measurements of the velocities of different ions in a gas discharge were
done. The results (probably) represent what the distribution of the velocities of different
ions looks like. In other words, measurements have been done on ions that probably
all have a different velocity. However, it also is possible to perform measurements on
a quantity that has one single true value, but where the measurements still show a
spread around this single value. This can have multiple causes, as we have already seen
in chapter 1. We will take a closer look at this case, although what will follow is also
true for the measurements in the example above.
We will perform measurements on a quantity that has one true value xt . We don’t know
this xt , but we want to find an approximation (estimate) that is as accurate as possible.
It’s fair to assume that the probability of finding a result close to xt is bigger than the
probability of finding a result far from xt . We now define the probability density p(x) as
the probability of finding a result between x and x + dx, divided by dx. In other words,
p(x)dx is the probability of finding a result between x and x + dx. One condition that
should be noted is that dx should be small (we should actually take the limit dx → 0).
Since the probability of finding a result near xt is bigger than that of finding one far
from xt , the probability density will look something like figure 3.3.
The maximum of p(x) of course lies at x = xt . The probability of finding a result
between x = a and x = b upon performing a measurement is P (a ≤ x ≤ b) = ∫_a^b p(x) dx.
This is the size of the shaded area below the curve. Also note that ∫_{−∞}^{+∞} p(x) dx = 1.

We assumed that we were dealing with a continuous quantity x. If x were
a discrete quantity, where M outcomes are possible (x1 , .., xM ), this of
course would become Σ_{k=1}^{M} p(xk ) ∆xk = 1, where ∆xk is the distance
between the possible outcome xk and its neighbors.

Figure 3.3: Probability density p(x). The probability P (a ≤ x ≤ b) of finding a result


between a and b is the shaded area under the curve.

We’ll now introduce another concept and that is the expectation value ε ⟨x⟩ of the
quantity x, which for a continuous probability distribution is defined as

ε⟨x⟩ = ∫_{−∞}^{+∞} x p(x) dx.    (3.1)

This expectation value of x is nothing more than the sum (=integral) of all possible
values of x, multiplied by the corresponding probability of finding this result. It is
actually the result one would expect if the measurement would be performed an infinite
number of times and all results would be averaged. It won’t amaze you that for the
probability density in the figure above ε ⟨x⟩ = xt , i.e. the true value. You therefore
expect, upon repeating the measurement an infinite number of times and averaging the
results, to find the true value as the average. However, in practice we won’t be measuring
an infinite number of times, but only a finite number of times and therefore the found
average won’t exactly be the true value xt , but only an approximation. However, for
measurements that show a small spread in the results (so a ‘narrow’ probability density
function), the average will be a better approximation of xt than for measurements with
a large spread. The deviation of the found average from xt is of course the uncertainty
in the experiment. We’ll try to calculate this in one of the coming sections. It’s not
only possible to calculate the expectation value of x, but we can do this for any function
of x, so
ε⟨f (x)⟩ = ∫_{−∞}^{+∞} f (x) p(x) dx,    (3.2)

for example ε⟨x²⟩ = ∫_{−∞}^{+∞} x² p(x) dx. In the next section we'll derive a number of
calculation rules for the expectation value, with which we can calculate the difference
34
between the mean of a (finite) number of measurements and the true value. Lastly, we
define the variance var⟨x⟩ of x. This is defined as the expectation value of (x − xt )2 ,
so the square of the difference between the measured quantity x and the true value xt .
This is also denoted by σ 2 :

var⟨x⟩ = σ² = ε⟨(x − xt )²⟩.    (3.3)

Just as ε ⟨x⟩ is the result you would expect to find upon repeating the measurement an
infinite number of times and averaging the results, var⟨x⟩ is the expectation value of
how far the measured value lies from the true value (and then the square of this) upon
repeating the experiment an infinite number of times. We use the square in order to
get rid of any possible minus signs. We do this because ε ⟨(x − xt )⟩ will be 0, since we’ll
measure a value smaller than xt about as often as we'll measure a value bigger than
xt . The square root of the variance, √var⟨x⟩, is a measure for the size of the spread
of the results and therefore for the width of the probability distribution. The σ in the
equation (σ = √var⟨x⟩) is also called the standard deviation.
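As a concrete discrete illustration (a fair die, in the spirit of the earlier example; the numbers are ours, not from the text), the expectation value and variance follow directly from the definitions, with p(xk ) = 1/6 for every face:

```python
# Expectation value and variance of a fair die, straight from the
# definitions: eps<x> = sum of x_k p(x_k), var<x> = sum of (x_k - eps<x>)^2 p(x_k).
outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6  # equal probability for each face

mean = sum(x * p for x in outcomes)
var = sum((x - mean) ** 2 * p for x in outcomes)
sigma = var ** 0.5  # the standard deviation

print(mean)   # 3.5
print(var)    # 35/12, about 2.917
```

Here the "true value" role is played by the expectation value 3.5, and σ ≈ 1.7 quantifies how far a single throw typically lies from it.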

3.3 Calculation rules for the expectation value and


the variance
The following calculation rules can be derived quite easily:

i. ε ⟨ax⟩ = a ε ⟨x⟩

ii. ε ⟨x + y⟩ = ε ⟨x⟩ + ε ⟨y⟩

iii. var⟨ax⟩ = a2 var⟨x⟩

iv. var⟨x + y⟩ = var⟨x⟩ + var⟨y⟩

Proof:

i. ε⟨ax⟩ = ∫_{−∞}^{+∞} ax p(x) dx = a ∫_{−∞}^{+∞} x p(x) dx = a ε⟨x⟩.  Q.e.d.

ii. ε⟨x + y⟩ = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x + y) p(x) p(y) dx dy
    = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} x p(x) p(y) dx dy + ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} y p(x) p(y) dx dy
    = ∫_{−∞}^{+∞} x p(x) dx ∫_{−∞}^{+∞} p(y) dy + ∫_{−∞}^{+∞} p(x) dx ∫_{−∞}^{+∞} y p(y) dy
    = ∫_{−∞}^{+∞} x p(x) dx + ∫_{−∞}^{+∞} y p(y) dy = ε⟨x⟩ + ε⟨y⟩,
    where we used the normalisations ∫_{−∞}^{+∞} p(x) dx = ∫_{−∞}^{+∞} p(y) dy = 1.  Q.e.d.

iii. var⟨ax⟩ = ε⟨(ax − axt )²⟩ = ε⟨a²(x − xt )²⟩ = a² ε⟨(x − xt )²⟩ = a² var⟨x⟩, where
we have used calculation rule (i).  Q.e.d.
iv. var⟨x + y⟩ = ε⟨[(x + y) − (xt + yt )]²⟩ = ε⟨[(x − xt ) + (y − yt )]²⟩ =
ε⟨(x − xt )²⟩ + ε⟨(y − yt )²⟩ + 2ε⟨(x − xt )(y − yt )⟩ =
ε⟨(x − xt )²⟩ + ε⟨(y − yt )²⟩ = var⟨x⟩ + var⟨y⟩, where the cross term vanishes
because x and y are independent, so ε⟨(x − xt )(y − yt )⟩ = ε⟨x − xt ⟩ ε⟨y − yt ⟩ = 0.  Q.e.d.
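The rules can also be checked numerically by drawing many samples of two independent random variables (a sketch; the Gaussian distributions, the constant a, and the sample size are arbitrary choices of ours):

```python
import random

random.seed(1)
N = 100_000
x = [random.gauss(10.0, 2.0) for _ in range(N)]   # var<x> close to 4
y = [random.gauss(-3.0, 1.0) for _ in range(N)]   # var<y> close to 1

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((u - m) ** 2 for u in v) / len(v)

a = 5.0
# rule (i):   eps<ax> = a eps<x>
lhs_i, rhs_i = mean([a * u for u in x]), a * mean(x)
# rule (iii): var<ax> = a^2 var<x>
lhs_iii, rhs_iii = var([a * u for u in x]), a ** 2 * var(x)
# rule (iv):  var<x + y> = var<x> + var<y> for independent x and y
lhs_iv, rhs_iv = var([u + w for u, w in zip(x, y)]), var(x) + var(y)
print(lhs_iv, rhs_iv)  # agree up to sampling noise
```

Rules (i) and (iii) hold exactly on any sample; rule (iv) holds only up to sampling noise, since the cross term ε⟨(x − xt )(y − yt )⟩ is zero on average but not for any finite sample.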

3.4 The Gaussian distribution


In by far the most cases, the spread in the results will follow a so called normal dis-
tribution (also called a Gaussian distribution). This has a probability density function
p(x) that looks like
p(x) = α e^{−β(x−xt )²},    (3.4)
where α and β are constants. We can calculate these constants using the rules that
exist for a probability density function p(x). The first rule is that ∫_{−∞}^{+∞} p(x) dx = 1.
The second rule is that var⟨x⟩ is denoted by σ².
Using the first rule with the above mentioned equation gives a relationship between α
and β: α = √(β/π). Check this yourself and use the fact that ∫_{−∞}^{+∞} e^{−x²} dx = √π (see also
appendix A). We can therefore already simplify the equation to

p(x) = √(β/π) e^{−β(x−xt )²}.    (3.5)
Using the second rule we get:
var⟨x⟩ = ε⟨(x − xt )²⟩ = ∫_{−∞}^{+∞} (x − xt )² p(x) dx = σ².    (3.6)

By inserting p(x) in the above equation (again, check this yourself) we get β = 1/(2σ²), so
the probability distribution now looks like the following:

p(x) = (1/(σ√(2π))) e^{−(x−xt )²/(2σ²)}.    (3.7)
A plot of this distribution can be found in figure 3.4.
Now the following important results can be calculated: the probability of a result
between xt − σ and xt + σ (= P (xt − σ ≤ x ≤ xt + σ) = ∫_{xt −σ}^{xt +σ} p(x) dx) is equal to
68.27 %, the probability of a result between xt − 2σ and xt + 2σ is 95.45 %, and the
probability of a result between xt − 3σ and xt + 3σ is 99.73 %.
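These percentages can be verified with the error function, since for a Gaussian the probability of a result within k standard deviations of xt equals erf(k/√2) (a quick numerical check; this identity is standard but not derived in the text):

```python
import math

# For a Gaussian distribution, P(x_t - k*sigma <= x <= x_t + k*sigma)
# equals erf(k / sqrt(2)).
for k in (1, 2, 3):
    p = math.erf(k / math.sqrt(2))
    print(f"within {k} sigma: {100 * p:.2f} %")
# within 1 sigma: 68.27 %
# within 2 sigma: 95.45 %
# within 3 sigma: 99.73 %
```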


Figure 3.4: Probability distribution of a Gaussian or normal distribution. The probabilities


of finding an outcome in the shaded areas are denoted in the figure.

3.5 The uncertainty in measurements with random


variations
In this section we’ll use the definitions and calculation rules for the expectation value
and the variance to give a value for the uncertainty in a series of measurements. We’ll
first do a small recap. We are measuring a quantity with a true value xt . The measure-
ments show random variations, which means that the measurements spread around the
true value xt . If we perform one measurement, the probability of finding an outcome
(result) with a value between x and x + dx (dx → 0) is equal to p(x) dx. Here, p(x)
is the probability density function like the one drawn in the last section. When we
perform many measurements (in principle an infinite amount), the spread in the results
of the measurements (the frequency distribution) will look the same as this p(x). So,
the spread of the results around the true value xt is determined by the width of this
function. The mean of the (still infinite amount of) results will be equal to the true
value xt . In practice we’ll only be able to perform a finite amount of measurements.
The series of measurements that we perform consists of N measurements and the results
of these N measurements we will call x1 , x2 , .., xN . The mean x̄ we calculate with
N
1 X
x̄ = xi . (3.8)
N i=1

We of course hope that this average x̄ will be a good approximation of the true value
xt . Whether or not this is the case will depend on (a) the number of measurements (the
more the better) and (b) the probability density function p(x). With a small amount of
measurements it’s probable that the found average will give a bad value (approximation)
for xt . For a narrow probability density function the found average x̄ will be more
reliable than for a wide function. To determine the reliability of the average (which is
the uncertainty in the final result that we’re looking for), we have to investigate (a)
what the width of the probability density is and (b) what the connection is between
the width and the reliability of x̄.

3.5.1 The width of the probability distribution


To describe the width of the probability distribution we defined the variance σ 2 . This
gives the average of the square of the difference between the measurement results and the
true value when we perform extremely many (an infinite amount of) measurements.
When performing a finite amount of measurements this won’t exactly be true, but only
in approximation. The average of the square of the difference between the results and
the true value with a finite amount of measurements we also call the sample variance,
which we denote by S 2 and is equal to
S² = (1/N) Σ_{i=1}^{N} (xi − xt )².    (3.9)

With an infinite amount of measurements σ 2 = S 2 , but with a finite amount σ 2 ≈ S 2 .


So, in order to make an estimation of the width of the probability distribution, we
can try to calculate the sample variance S 2 , since we then have an approximation for
σ 2 . There lies one big problem in trying this: we don’t know xt , but only x̄ as an
approximation of xt and we therefore can’t calculate S 2 using the above equation. If
we eliminate xt however, this problem is fixed. In order to do that, we write
S² = (1/N) Σ_{i=1}^{N} (xi − xt )² = (1/N) Σ_{i=1}^{N} (xi − x̄ + x̄ − xt )².    (3.10)

We haven’t done anything illegal here (yet). We can expand the square:

S² = (1/N) Σ_i [(xi − x̄)² + (x̄ − xt )² + 2(xi − x̄)(x̄ − xt )],    (3.11)

and the summation we can perform over each term individually:

S² = (1/N) Σ_i (xi − x̄)² + (1/N) Σ_i (x̄ − xt )² + (2/N) Σ_i (xi − x̄)(x̄ − xt ).    (3.12)

Here we use the abbreviation Σ_i for Σ_{i=1}^{N}. In the second term nothing is dependent on i,
so the summation is nothing more than the sum of N times the same number ((x̄ − xt )²),
so it can be written as (1/N) · N · (x̄ − xt )² = (x̄ − xt )². In the last term we can place
(x̄ − xt ) in front of the summation, as this is not dependent on i. We now have

S² = (1/N) Σ_i (xi − x̄)² + (x̄ − xt )² + (2/N)(x̄ − xt ) Σ_i (xi − x̄).    (3.13)
The summation in the last term can be calculated, since Σ_i (xi − x̄) = Σ_i xi − Σ_i x̄ =
N x̄ − N x̄ = 0. Therefore, the last term is 0, so we are left with

S² = (x̄ − xt )² + (1/N) Σ_i (xi − x̄)².    (3.14)

The second term on the right hand side of the above equation resembles the definition
of the sample variance, but now with x̄ instead of xt . This is nice, because we can
actually calculate this term. However, the first term still gives us trouble, as it contains
xt again. The problem now has become: can we make an estimation of (x̄ − xt )2 ? It
probably won’t amaze you that we can, since otherwise we would have been deriving
all these equations for nothing. We can write
x̄ − xt = (1/N) Σ_i xi − xt = (1/N) Σ_i xi − (1/N) Σ_i xt = (1/N) Σ_i (xi − xt ),    (3.15)

and the square of this is

(x̄ − xt )² = [(1/N) Σ_i (xi − xt )]² = (1/N²) [Σ_i (xi − xt )] · [Σ_i (xi − xt )].    (3.16)

The last part of this equation consists of the product of two summations. In both
summations, i runs from 1 to N , but each value i in the first summation is multiplied
by all values of i from the second summation. We might as well call this j (to prevent
further confusion) and therefore
(x̄ − xt )² = (1/N²) [Σ_i (xi − xt )] · [Σ_j (xj − xt )],    (3.17)

where i and j are completely independent of one another. Because of this independence,
both summations may stand at the beginning of the equation:
(x̄ − xt )² = (1/N²) Σ_i Σ_j [(xi − xt ) (xj − xt )].    (3.18)
The Σ_i Σ_j means nothing more than a summation over all values of i and all values of
j. We'll distinguish between terms where i and j have the same value and terms where
this isn't the case. Therefore, we write

(x̄ − xt )² = (1/N²) Σ_i (xi − xt )² + (1/N²) Σ_i Σ_{j≠i} [(xi − xt ) (xj − xt )].    (3.19)

The last term is a summation over all values of i and j that aren’t equal to each other,
so xi and xj in this term always represent results of different measurements. Because
the results show a random spread around a true value xt , (xi − xt ) will be positive or
negative an equal amount of times. The same goes for (xj − xt ), but because it concerns
different measurements, this is completely uncorrelated. This means that the second
term of the above equation will also be pretty much 0 when we perform a large amount
of measurements, and therefore we’re left with just the first term:
(x̄ − xt )² ≈ (1/N²) Σ_i (xi − xt )².    (3.20)

On the right hand side of this equation we recognize the definition of the sample variance
S², but now with a factor 1/N² instead of 1/N, so we can write:

(x̄ − xt )² ≈ (1/N) S².    (3.21)
With this we’ve made an estimation of the first term in equation (3.14). By filling in
this estimation, we find
S² ≈ (1/N) S² + (1/N) Σ_i (xi − x̄)².    (3.22)
By bringing the first part of the right hand side to the left:

(1 − 1/N) S² ≈ (1/N) Σ_i (xi − x̄)²,    (3.23)

and multiplying both sides with N and dividing by (N − 1) finally gives an expression
for the sample variance
S² ≈ (1/(N − 1)) Σ_i (xi − x̄)².    (3.24)
This is a value we can calculate! The width S of the probability distribution then is
(roughly) equal to the square root of this, so
S ≈ √[(1/(N − 1)) Σ_i (xi − x̄)²].    (3.25)

Summarizing: We have found a way to use a series of N measurements with results


x1 , x2 , .., xN to estimate the true value xt (we used x̄ for this). Moreover, we have
made an estimation of the standard deviation, i.e. the width of the probability distri-
bution function p(x) that corresponds to the experiment. We used two approximations
for this. Firstly, we assumed that the square root of the sample variance (S) is a good
approximation for the standard deviation σ and secondly, that we can neglect the term
Σ_i Σ_{j≠i} [(xi − xt ) · (xj − xt )]. The condition for which these approximations are valid is
that the number of measurements N is large. In practice it turns out that a measure-
ment series of N = 10 already works reasonably well.
The next question of course is what we can do with this number: what does it actually
mean? Well, it means that when the probability distribution is Gaussian (and we’re
going to assume that for now), we now know that a probability of roughly 68 % can be
assigned to finding a result between x̄ − S and x̄ + S. Equivalently, we can state that
about 68 % of the results from a measurement series will lie in this interval.
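The estimate S and this 68 % statement can be tried out on simulated data (a sketch; the true value 50.0, σ = 0.5, and the series length are made-up numbers):

```python
import random

random.seed(7)
# Simulated measurement series: true value x_t = 50.0, sigma = 0.5
data = [random.gauss(50.0, 0.5) for _ in range(1000)]

N = len(data)
xbar = sum(data) / N
S = (sum((x - xbar) ** 2 for x in data) / (N - 1)) ** 0.5   # eq. (3.25)

# Roughly 68 % of the individual results should lie within xbar +/- S.
inside = sum(1 for x in data if xbar - S <= x <= xbar + S)
print(f"S = {S:.3f}, fraction inside xbar +/- S: {inside / N:.2f}")
```

With 1000 simulated measurements, S lands close to the σ = 0.5 that was put in, and close to 68 % of the results fall in the interval x̄ ± S.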
3.5.2 The accuracy of the mean
We’re still looking for the accuracy with which the mean x̄ of a measurement series is
determined as an approximation of the true value xt of the quantity under investigation.
By now we know how the width S of the probability distribution can be calculated,
but that still isn’t the quantity that we’re looking for. After all, if we were to take
a bigger series of measurements (so a larger value of N ), the width of the probability
distribution, and therefore S, will not change, whereas it’s clear that the final result
would be more accurate. We should therefore find the relationships between S, x̄ and
the uncertainty in x̄.
To determine this uncertainty we have to ask ourselves what would happen if we
would repeat the series of N measurements. Of course, we wouldn’t be doing this in
real life, but let’s do it in our minds for now. It’s probable that upon repeating the
measurements we’d find a slightly different value for x̄ and if we were to repeat it again,
x̄ would again be slightly different, and so on. We should therefore also find a spread
in the averages and this spread would again be a Gaussian distribution. But because
one of these averages is a much better estimation of xt (and therefore it will be much
closer to xt ) than one single measurement, the spread in the distribution of the means
will be (much) smaller than the spread in the separate measurements (the width of the
probability distribution). The question is how much. Let’s call this spread Sm , the
standard deviation of the mean (completely analogous to S). If we could determine
this Sm , we’d also know the width of the probability distribution of the averages (= the
spread of the mean) and we’d know that an average will lie between xt −Sm and xt +Sm
with a probability of 68%. Conversely, if the average x̄ is determined only once, the
true value would lie between x̄ − Sm and x̄ + Sm with a probability of 68%, with which
the uncertainty that we're looking for is found. An illustration of a probability function
of separate measurements and one of averages can be found in figure 3.5. The areas in
which with 68% certainty a separate measurement xi or an average x̄, respectively, will
lie are denoted in the figure.
In the case of the separate measurements we had to repeat the experiment N times in
order to calculate S. Should we now repeat the whole series a couple of times in order
to be able to calculate Sm ? Fortunately, the answers to this is: ‘no’. There is a method
to do this from one measurement series. Completely analogous to the definition of S
(S² ≈ var⟨x⟩), we define Sm as follows:

Sm² = var⟨x̄⟩.    (3.26)

We can write x̄ as (1/N) Σ_i xi , which gives

Sm² = var⟨(1/N) Σ_i xi ⟩.    (3.27)

We then use calculation rule (iii) from §3.3 to find

Sm² = (1/N)² var⟨Σ_i xi ⟩    (3.28)

Figure 3.5: Distribution of the means (narrow curve) compared to that of separate mea-
surements (wider curve).

and because the measurement results xi are independent, using calculation rule (iv),
we find
Sm² = (1/N²) Σ_i var⟨xi ⟩.    (3.29)
Now, the variance var⟨xi ⟩ of course is the same for all xi and equal to var⟨x⟩, so the
summation contains N times the same number var⟨x⟩ and we find

Sm² = (1/N) var⟨x⟩ = (1/N) S².    (3.30)
Because we could calculate S, we can also calculate Sm and we obtain the final result:
Sm = √[(1/(N (N − 1))) Σ_{i=1}^{N} (xi − x̄)²].    (3.31)

Indeed, we only have to determine the average of a series of N measurements once and
then we can calculate the standard deviation of the sample mean, as Sm is officially
called. And because there is 68% certainty that the true value xt is in between x̄ − Sm
and x̄ + Sm , we have also found the uncertainty; the 68% confidence interval.

3.6 An example
A number of measurements of the length of an object give the results 50.35-50.33-50.45-
50.38-50.35-50.27-50.41-50.39-50.36 and 50.39 mm. The average of this is 50.37 mm, so
we can calculate:

i xi (mm) xi − x̄ (mm) (xi − x̄)2 (mm2 )


1 50.35 -0.02 4·10−4
2 50.33 -0.04 1.6·10−3
3 50.45 0.08 6.4·10−3
4 50.38 0.01 1·10−4
5 50.35 -0.02 4·10−4
6 50.27 -0.10 1·10−2
7 50.41 0.04 1.6·10−3
8 50.39 0.02 4·10−4
9 50.36 -0.01 1·10−4
10 50.39 0.02 4·10−4
x̄ = (1/10) Σ xi = 50.368 mm        Σ (xi − x̄)² = 2.14 · 10−2 mm²

From this we calculate

S = √[(1/(N − 1)) Σ_i (xi − x̄)²] = 0.049 mm    (3.32)

and from this

Sm = S/√N = 0.015 mm.    (3.33)
So the final result of the measurement is

x̄ ± Sm = (50.37 ± 0.02) mm.    (3.34)
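The numbers in this example can be reproduced in a few lines (using eqs. (3.8), (3.25) and (3.33); the variable names are ours):

```python
# Length measurements from the example, in mm
data = [50.35, 50.33, 50.45, 50.38, 50.35,
        50.27, 50.41, 50.39, 50.36, 50.39]

N = len(data)
xbar = sum(data) / N                                        # eq. (3.8)
S = (sum((x - xbar) ** 2 for x in data) / (N - 1)) ** 0.5   # eq. (3.25)
Sm = S / N ** 0.5                                           # eq. (3.33)

print(f"xbar = {xbar:.3f} mm")  # 50.368 mm
print(f"S    = {S:.3f} mm")     # 0.049 mm
print(f"Sm   = {Sm:.3f} mm")    # 0.015 mm
```

Note that the table's deviations were taken from the rounded mean 50.37 mm; computing from the unrounded mean 50.368 mm changes S only in the third decimal.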

3.7 The relationship between the measurement ac-


curacy and the number of measurements
As noted earlier, the measurement accuracy (the uncertainty of the mean) depends
on the width S (or actually σ) of the distribution function and on the number of
measurements we perform in the measurement series. The spread in these measurements
of course is fixed, so S is independent of the number of measurements we perform.
This might seem strange when looking at the definition of S², since a factor 1/N occurs
somewhere. You should however keep in mind that the summation Σ_i (xi − x̄)², which
also occurs in the definition of S², increases linearly with N when N is big enough.
Try to come up with an explanation for this yourself.
Since S is not dependent on N and since the standard deviation of the mean is defined
as Sm = S/√N , we can conclude that the latter will most definitely depend on the number
of measurements. We also see that we can expect that Sm will decrease with increasing
N , so the result will get more accurate with a larger measurement series. An unpleasant
point here is that the uncertainty is proportional to the factor 1/√N : the accuracy is not
inversely proportional to the number of measurements but only with the square root.
This means that when we perform N measurements and find out that the uncertainty
is still too big and that it should be decreased by a factor 2, then we should extend
the measurement series to 4N measurements to reach this goal. So in order to get
a factor p increase in accuracy, we should perform p2 times the number of
measurements.
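The 1/√N behaviour can be made visible with a small simulation (a sketch; the σ = 1 Gaussian noise and the trial count are arbitrary choices of ours):

```python
import random

random.seed(3)

def spread_of_mean(N, trials=2000):
    """Empirical standard deviation of the mean of N measurements
    drawn from a Gaussian with sigma = 1."""
    means = []
    for _ in range(trials):
        sample = [random.gauss(0.0, 1.0) for _ in range(N)]
        means.append(sum(sample) / N)
    mu = sum(means) / trials
    return (sum((m - mu) ** 2 for m in means) / trials) ** 0.5

# Quadrupling N should halve the spread of the mean: Sm ~ 1/sqrt(N).
s25 = spread_of_mean(25)    # close to 1/5
s100 = spread_of_mean(100)  # close to 1/10
print(s25, s100, s25 / s100)
```

The ratio comes out close to 2, matching the rule that a factor p gain in accuracy costs p² times as many measurements.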

3.8 Calculation rules for 68% intervals


In chapter 2 we derived the calculation rules for 100% confidence intervals. We already
noted there that the calculation rules for 68% intervals look slightly different. To
illustrate this we will, before deriving a general rule, show the calculation rule for the
sum of two quantities.
First we should define our notation a bit more strictly. The uncertainty in the result
of a measurement series is the standard deviation of the mean Sm . The standard
deviation of the measurement results themselves was S. We will get in trouble if we
use multiple quantities each having their own uncertainty. The standard deviation of
the measurement results of a quantity x we will now call Sx and the standard deviation
of the mean of x we will call Sx̄ . The mean of the measurements xi of a quantity x we
will call x̄.
We have measured the quantities x[1] and x[2] (results are x̄[1] and x̄[2] , respectively)
and found the uncertainties Sx̄[1] and Sx̄[2] (68% intervals). The quantity x that we
should calculate is the sum of these two measured quantities, i.e. x = x[1] + x[2] . The
uncertainty in x we will call, in accordance with the definitions we stated earlier, Sx̄ and for
this we can write
Sx̄² = var⟨x̄⟩.    (3.35)

Now since x = x[1] + x[2] and therefore x̄ = x̄[1] + x̄[2] , we get

Sx̄² = var⟨x̄[1] + x̄[2] ⟩ = var⟨x̄[1] ⟩ + var⟨x̄[2] ⟩ = Sx̄[1]² + Sx̄[2]².    (3.36)

So in contrast to the 100% intervals where upon summation of two quantities the
uncertainties should be added, we should now ‘quadratically’ add the uncertainties
with 68% intervals.
We will now directly derive the general calculation rule for combining uncertainties
in the case of 68% confidence intervals. We have measured M quantities x[1] , .., x[M ]
with independent uncertainties Sx̄[1] , .., Sx̄[M ] . We are looking for the uncertainty in the
quantity x which is a function of the M measured quantities according to

x = f (x[1] , ...., x[M ] ). (3.37)


We are going to be looking for the uncertainty in x, so for Sx̄ . The measured quantities
x[1] , .., x[M ] all have a true value xt[1] , .., xt[M ] and the results of the measurement series are
x̄[1] , .., x̄[M ] . We are going to assume that the uncertainties in the measured quantities
are small, which implies that the averages x̄[1] , .., x̄[M ] do not deviate too much from the
true values xt[1] , .., xt[M ] . We can then write the function as

x̄ = f (x̄[1] , .., x̄[M ] ) = f (xt[1] + (x̄[1] − xt[1] ), ...., xt[M ] + (x̄[M ] − xt[M ] ))    (3.38)

and since all terms x̄[i] − xt[i] are assumed to be small, we can use the approximation
from §2.1.3 and write

x̄ ≈ f (xt[1] , .., xt[M ] ) + (x̄[1] − xt[1] ) ∂f /∂x[1] + .... + (x̄[M ] − xt[M ] ) ∂f /∂x[M ]
  = f (xt[1] , .., xt[M ] ) + Σ_{i=1}^{M} (x̄[i] − xt[i] ) ∂f /∂x[i] ,    (3.39)

where ∂f /∂x[i] is the partial derivative of f with respect to x[i] at the point (xt[1] , ...., xt[M ] ).
The uncertainty in x is determined by Sx̄² = var⟨x̄⟩ in which we can insert the afore-
mentioned equation, resulting in

Sx̄² = var⟨x̄⟩ ≈ var⟨f (xt[1] , ...., xt[M ] ) + Σ_{i=1}^{M} (x̄[i] − xt[i] ) ∂f /∂x[i] ⟩.    (3.40)

The right hand side of this equation is the variance of the sum of a couple of terms
that are all independent of each other. After all, the first term is f (xt[1] , ...., xt[M ] ) = xt ,
which is a constant, and the other terms (x̄[i] − xt[i] ) ∂f /∂x[i] are all independent since
the quantities x[i] are. We can therefore apply calculation rule (iv) from §3.3 and
because var⟨f (xt[1] , .., xt[M ] )⟩ = var⟨xt ⟩ = 0 (a constant has no variance), we can write

Sx̄² = Σ_{i=1}^{M} var⟨(x̄[i] − xt[i] ) ∂f /∂x[i] ⟩.    (3.41)

Now using calculation rule (iii) from §3.3 and by knowing that ∂f /∂x[i] also is a constant,
we find

Sx̄² = Σ_{i=1}^{M} (∂f /∂x[i] )² var⟨x̄[i] − xt[i] ⟩ = Σ_{i=1}^{M} (∂f /∂x[i] )² var⟨x̄[i] ⟩ = Σ_{i=1}^{M} (∂f /∂x[i] )² Sx̄[i]².    (3.42)

Here we have used the fact that var⟨x̄[i] − xt[i] ⟩ = var⟨x̄[i] ⟩. This is correct as xt[i] is a
constant. We have now found a general equation for the uncertainty Sx̄ :
Sx̄ = √[ Σ_{i=1}^{M} (∂f /∂x[i] )² Sx̄[i]² ].    (3.43)
This looks quite similar to the one for 100% intervals (see §2.2.1), but now the terms
should be added quadratically.
From this general calculation rule we can derive all other specific calculation rules. Try
this yourself for several cases. In the next section a full overview of the calculation rules
for both 100% intervals and 68% intervals can be found.
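The general rule (3.43) can be applied numerically, estimating each partial derivative with a central difference (a generic sketch; the function name, step size, and numbers are illustrative, not from the text):

```python
def combine_68(f, values, errors, h=1e-6):
    """Combine independent 68%-interval uncertainties via eq. (3.43):
    S_x^2 = sum over i of (df/dx[i])^2 * S[i]^2,
    with each df/dx[i] estimated by a central difference."""
    total = 0.0
    for i, S_i in enumerate(errors):
        up = list(values); up[i] += h
        dn = list(values); dn[i] -= h
        dfdx = (f(*up) - f(*dn)) / (2 * h)
        total += (dfdx * S_i) ** 2
    return total ** 0.5

# Example: x = x[1] + x[2] with uncertainties 0.3 and 0.4 gives 0.5,
# because with 68% intervals the uncertainties add quadratically.
S = combine_68(lambda x1, x2: x1 + x2, [1.0, 2.0], [0.3, 0.4])
print(S)  # 0.5
```

For simple functions like sums and products the table in the next section gives the same answer directly; the numerical route is mainly useful for a complicated f where the derivatives are tedious to do by hand.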

3.9 Overview of calculation rules for uncertainties in measurement results

The following table gives an overview of all calculation rules, both for 100% intervals
and for 68% intervals. In principle all rules can be derived from the general one (the
last row in the table). However, in many cases it is easier to use one of the other rules.

x                        100% intervals                                68% intervals

x^[1] + C                Δx = Δx^[1]                                   Sx̄ = Sx̄[1]
x^[1] + x^[2]            Δx = Δx^[1] + Δx^[2]                          Sx̄² = Sx̄[1]² + Sx̄[2]²
x^[1] − x^[2]            Δx = Δx^[1] + Δx^[2]                          Sx̄² = Sx̄[1]² + Sx̄[2]²
C x^[1]                  Δx = |C| Δx^[1]                               Sx̄ = |C| Sx̄[1]
Σᵢ Cᵢ x^[i]              Δx = Σᵢ |Cᵢ| Δx^[i]                           Sx̄² = Σᵢ Cᵢ² Sx̄[i]²
x^[1] · x^[2]            Δx/|x| = Δx^[1]/|x^[1]| + Δx^[2]/|x^[2]|      (Sx/x)² = (Sx[1]/x^[1])² + (Sx[2]/x^[2])²
x^[1] / x^[2]            Δx/|x| = Δx^[1]/|x^[1]| + Δx^[2]/|x^[2]|      (Sx/x)² = (Sx[1]/x^[1])² + (Sx[2]/x^[2])²
(x^[1])^α                Δx/|x| = |α| Δx^[1]/|x^[1]|                   Sx/|x| = |α| Sx[1]/|x^[1]|
C Πᵢ (x^[i])^{αᵢ}        Δx/|x| = Σᵢ |αᵢ| Δx^[i]/|x^[i]|               (Sx/x)² = Σᵢ αᵢ² (Sx[i]/x^[i])²
f(x^[1], …, x^[M])       Δx = Σᵢ |∂f/∂x^[i]| Δx^[i]                    Sx̄² = Σᵢ (∂f/∂x^[i])² Sx̄[i]²

3.10 The uncertainty in the uncertainty
In chapter 1 we saw that the uncertainty in a measurement series can usually be rounded
to one significant digit, because this uncertainty also has its own uncertainty. The
uncertainty as described by equation (3.8) will give a slightly different value for each
measurement series, in the same way that the average of a new measurement series
will have a slightly different value. So, just like with the averages, the uncertainties will
show a spread when the measurement series is repeated a couple of times. We can again
do this in our minds, since we can calculate that the spread in the standard deviation
S has a variance
var⟨S⟩ = σ² / (2(N − 1)) ≈ S² / (2(N − 1)),   (3.44)

and the variance in the standard deviation of the mean is

var⟨Sm⟩ = σm² / (2(N − 1)) ≈ Sm² / (2(N − 1)),   (3.45)

where N again is the size of the measurement series (the number of measurements).
From this it follows that for the relative uncertainties S_S/S and S_Sm/Sm

S_S / S = S_Sm / Sm = √( 1 / (2(N − 1)) ).   (3.46)
We won't go into the details of the derivation of the above equation. In figure 3.6 the
fraction S_Sm/Sm is plotted as a function of the number of measurements N in the series.

Figure 3.6: Ratio (in %) of the uncertainty S_S in the standard deviation S and the standard
deviation itself as a function of the number of measurements N in the series.

The point that lies all the way to the left in this curve is for N =2, so this has an
uncertainty of over 70% in the standard deviation. Even with N =10 we still find an
uncertainty of 25% and this slowly gets better with increasing N , as can be seen in the
figure.
Example: after an experiment an uncertainty (standard deviation of the
mean) of Sm = 0.44 has been found. We learned that we should round this
to Sm = 0.4. The question is whether or not this is problematic. It's a
problem when the error that we make (let us call this ΔSm) is bigger than
(or equal to) the uncertainty in the value of Sm, so when

ΔSm ≥ S_Sm.

Dividing this equation by Sm gives

ΔSm / Sm ≥ S_Sm / Sm = √( 1 / (2(N − 1)) ),

and from this it follows that

N ≥ 1 + (1/2) (Sm / ΔSm)².

By filling in the found standard deviation and the rounding error (ΔSm = 0.04), we see
that only when N > 61 the rounding wouldn't be appropriate. So in this
case only with a very big measurement series (over 61 measurements) a
rounding of the uncertainty to two digits would have been meaningful.
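The numbers in this example are quickly checked. A small Python sketch (our own illustration, not course code; the helper name `rel_uncertainty_of_S` is ours) evaluates eq. (3.46) and the resulting minimum series size:

```python
import math

def rel_uncertainty_of_S(N):
    """Relative uncertainty S_S/S (= S_Sm/Sm) of the standard deviation, eq. (3.46)."""
    return math.sqrt(1.0 / (2.0 * (N - 1)))

# Values quoted around figure 3.6:
pct_2 = 100 * rel_uncertainty_of_S(2)    # about 71 % for N = 2
pct_10 = 100 * rel_uncertainty_of_S(10)  # about 24 % for N = 10

# The example above: Sm = 0.44 rounded to 0.4, so the rounding error is 0.04
Sm, dSm = 0.44, 0.04
N_min = 1 + 0.5 * (Sm / dSm) ** 2        # 61.5: two digits only pay off for N > 61
```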

3.11 Dependent uncertainties


(This section isn’t part of the official course material)
In §2.3 we saw that in the case of 100% intervals we should be very cautious when
the uncertainties of the quantities that we’re using to calculate another quantity are
dependent upon one another. The calculation rules do not apply in those cases. The
same goes for the calculation rules of 68% intervals that we derived in this chapter.
The reason why the calculation rules do not apply anymore is easy to understand.
During the derivations we assumed we could apply the calculation rules in §3.3 for the
expectation value and the variance. Rule (iv) said we could write the variance of a sum
of two variables as the sum of the variances of the variables separately:

var ⟨x + y⟩ = var ⟨x⟩ + var ⟨y⟩ . (3.47)

This was under the assumption that ε ⟨(x − xt )(y − yt )⟩ = 0, which means that with
many (an infinite amount of) measurements the average of the product of the deviations
in x and y from their true values is zero. This only happens when the uncertainties in
the quantities x and y are independent.
3.11.1 The covariance
When the uncertainties in the measured quantities x and y are dependent on each other,
the term ε ⟨(x − xt )(y − yt )⟩ should not be removed. We also call ε ⟨(x − xt )(y − yt )⟩
the covariance of x and y and denote this by covar⟨xy⟩. So, we have

var ⟨x + y⟩ = var ⟨x⟩ + var ⟨y⟩ + 2 covar ⟨xy⟩ , (3.48)

where
covar⟨xy⟩ = ε⟨(x − xt)(y − yt)⟩ = ∬₋∞^{+∞} (x − xt)(y − yt) p(x, y) dx dy.   (3.49)

Here, p(x, y) is the probability density function of the variables x and y together.

3.11.2 Calculation rule for quantities with dependent uncertainties
The general calculation rule that we had for quantities with independent uncertainties
(Sx̄² = (∂f/∂x^[1])² Sx̄[1]² + (∂f/∂x^[2])² Sx̄[2]²) can be extended for the case of dependent uncertainties. In the case of a combination of two quantities (so M = 2), this becomes

Sx̄² = (∂f/∂x^[1])² Sx̄[1]² + (∂f/∂x^[2])² Sx̄[2]² + 2 (∂f/∂x^[1]) (∂f/∂x^[2]) Sx̄[1]x̄[2],   (3.50)

where Sx̄[1]x̄[2] is an estimate of the covariance covar⟨x̄^[1] x̄^[2]⟩. We should determine
this estimate and we can do that from our measurement series. A good approximation
is

covar⟨x̄^[1] x̄^[2]⟩ ≈ (1/N) Σᵢ (x_i^[1] − x̄^[1]) (x_i^[2] − x̄^[2]).   (3.51)

It seems that we can calculate the uncertainty in a quantity that is a combination of
multiple quantities with dependent uncertainties. However, this tends to be a lot of
work and it should be noted that one should try to avoid this (for example, in the same
way as in section 2.3).
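Equation (3.51) is straightforward to evaluate from data. The sketch below (our own illustration with hypothetical data; the function name `covariance` is ours) estimates the covariance and checks rule (3.50) for the simple case f = x + y, where both partial derivatives are 1:

```python
def covariance(xs, ys):
    """Sample estimate of covar<x y>, eq. (3.51): (1/N) sum (x_i - xbar)(y_i - ybar)."""
    N = len(xs)
    xbar = sum(xs) / N
    ybar = sum(ys) / N
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / N

# Fully dependent (hypothetical) data: ys = 2 * xs
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# For f = x + y (df/dx = df/dy = 1), rule (3.50) reads
# var<x+y> = var<x> + var<y> + 2 covar<xy>
var_direct = covariance([x + y for x, y in zip(xs, ys)],
                        [x + y for x, y in zip(xs, ys)])  # var<x+y> computed directly
var_rule = covariance(xs, xs) + covariance(ys, ys) + 2 * covariance(xs, ys)
```

Both sides come out to 11.25 here; dropping the covariance term would give only 6.25, which shows why the independent-uncertainty rules fail for dependent quantities.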

3.12 Exercises and problems

3.12.1 Questions
1. When measuring the thickness of a thin layer of liquid helium, the following angles
(in minutes) were observed:
34 35 45 40 46
38 47 36 38 34
33 36 43 43 37
38 32 38 40 33
38 40 48 39 32
36 40 40 36 34

(a) Draw a histogram of the observations.


(b) Which observation has the highest frequency (i.e., occurs most often)?
(c) What is the so-called median, i.e., the measured value with as many obser-
vation above it as below it?
(d) What is the mean (i.e., average)?
(e) Calculate the best estimate of the standard deviation of the population.
(f) What boundaries give a single measurement a 68% probability of falling
between them?
(g) Which boundaries have a 95% chance of containing a single observation?

2. Assume that the (non-Gaussian) probability density distribution p(x) is given by

p(x) = C for |x| < a


= 0 for |x| ≥ a,

where a and C are constants.

(a) Make a schematic drawing of p(x).

(b) Use the normalization condition ∫₋∞^{+∞} p(x) dx = 1 to express C in terms of a.

(c) What is the meaning of a?

(d) Calculate x̄ = ε⟨x⟩ = ∫₋∞^{+∞} x p(x) dx and σ² = ∫₋∞^{+∞} (x − ε⟨x⟩)² p(x) dx.

Interpretation of this result: if measurements of a quantity constantly lie in the
interval [x̄ − a, x̄ + a], then σ ≈ a/2 is a decent estimate.

3. Someone measures a voltage V. He repeats the measurement N times and finds
a spread (standard deviation) S of 0.5 mV. How did he determine S (give an
expression)? How does S depend on the total number of measurements N?
4. Determine the density ρ ± Sρ of a body if its mass is m ± Sm = (24.320 ± 0.005)
g and its volume is V ± SV = (10.20 ± 0.05) cm3 .

5. For a simple pendulum of length l, the period of oscillation is given by T =
2π√(l/g). From this it follows that g = 4π²l/T². Calculate the acceleration g
due to gravity and its uncertainty Sg if it is determined that l ± Sl = (93.0 ± 0.1)
cm and T ± ST = (1.936 ± 0.004) s.

6. The Wheatstone bridge consists of the following electrical circuit:

[Circuit diagram: a voltage source V across a slide-wire of total length L, divided by the wiper into segments x and y; an ammeter (A) connects the wiper to the junction between the unknown resistor R and the accurately known resistor R1.]

A voltage V is applied over a variable resistor, the wiper of which is connected to
an ammeter (A). The electrical circuit also contains a resistor with unknown
(to be determined) resistance R and a resistor with a very accurately known
resistance R1. The position of the wiper is set such that no current flows through
the ammeter. The total length of the variable resistor is L. In that case we measure
the distance x from the wiper to the left side of the variable resistor (making
the distance from the wiper to the right side y = L − x). The resistance R can now be
determined using the expression

R = (x/y) R1.

Given is R1 = (1.00 ± 0.01) kΩ and L = 30.000 cm. x = (12.0 ± 0.5) cm is
measured. These uncertainties reflect a 68%-confidence interval. Calculate the
unknown resistance R and its uncertainty.
7. A spring, loaded with mass m, oscillates with a period T = 2π√(m/k), where k is
the spring constant. For 8 different masses, the oscillation period is measured
(see the table). For each measurement, calculate the spring constant and give the
average result including uncertainty.
m (g) 513 581 634 691 752 834 901 950
T (s) 1.24 1.33 1.36 1.44 1.50 1.59 1.65 1.69
8. The average x̄ of N quantities xi (i = 1, …, N) is given by x̄ = (1/N) Σ_{i=1}^{N} xi. The deviation
of xi from this average is given by di = xi − x̄. Prove that the mean d̄ of these
deviations is d̄ = 0.
deviations is d¯ = 0.

3.12.2 Problems
1. Because it is (sometimes) difficult to perform calculations with a Gaussian prob-
ability distribution, it can be coarsely approximated by the following probability
density function p(x), where a is a constant.

[Figure: a step-shaped p(x), equal to b for −a < x < a, equal to b/2 for a < |x| < 2a, and 0 elsewhere.]

(a) Show that b = 1/(3a) holds.
(b) Show that for a probability function which is symmetric around x = 0 (so
for which p(x) = p(−x)), it is generally true that xw = ε⟨x⟩ = 0.
(c) Calculate for the given probability density function the variance var⟨x⟩ of x
and from that the standard deviation σ. Express σ in terms of a.
(d) Calculate which percentage of measurements (out of an infinite amount of
measurements) will give a value between xw − σ and xw + σ (so in this case
between −σ and σ). The constants a and b should be eliminated.

2. The normal distribution is given by the probability density

p(x) = (1 / (σ√(2π))) exp( −(x − x̄)² / (2σ²) ),

where x̄ = ε⟨x⟩ and σ is the standard deviation. The 'Full Width at Half Max-
imum' (FWHM) is defined as the width of the distribution function at half its
height (so the distance between the points for which p(x) = ½ p(x̄) holds).

Show that the expression FWHM = 2σ√(2 ln 2) ≈ 2.35 σ holds true.

3. In an experiment, the velocity of ions in a gas discharge is measured. The results
are given in the frequency density distribution below.

[Figure: frequency density F(v) (in s/m, vertical axis up to 10) versus ion velocity v (0–300 m/s), showing a single peaked distribution.]

We assume that the velocities follow a Gaussian distribution. Estimate the vari-
ance var⟨v⟩ of the ion velocity.

Hint: use the result of the previous exercise.

4. A manufacturer produces balls for ball bearings. When he sells them, he naturally
has to state the diameter of the balls and the uncertainty therein. He measures the
diameters of one batch, and finds the diameters follow a Gaussian distribution.
Because he thinks a 68%-confidence interval is not accurate enough, he states
the 95%-confidence interval (the 2S-interval). The measured batch contains 10
different balls, and the results are 4.995 - 5.004 - 4.998 - 5.001 - 4.989 - 5.007 -
5.001 - 5.001 - 4.999 - 5.004 mm. Since the measurements were very precise, the
individual errors are negligible.
What value and uncertainty does he state for the diameter of the balls? Indicate
how these values are calculated (state the formula).
Hint: The balls are sold separately and each ball has to satisfy the given specifi-
cations.

5. Someone connects an unknown resistor to a current source and measures simul-
taneously the current I through the resistor and the voltage V over the resistor
a few times. There appear to be fluctuations in both the current measurements
and the voltage measurements. The results of the separate measurements are pre-
sented in the table below. The last column gives the resistance calculated from
the pair of voltage and current measurements.

i   Ii (µA)   Vi (µV)   Ri (Ω)
1   5.045     50.6      10.030
2   5.150     50.7       9.845
3   5.015     49.8       9.930
4   5.015     50.2      10.010
5   5.330     51.8       9.719
6   4.985     49.2       9.870
7   4.930     49.9      10.122
8   5.115     50.9       9.951

Now the resistance can be determined in a few ways. Firstly, one can calcu-
late the average current I with uncertainty SI and the average voltage V with
uncertainty SV (from columns 2 and 3), and from these determine the average
resistance R ± SR = (V ± SV)/(I ± SI). Another method is to take the individual resistance
values Ri (column 4) and from these calculate an average resistance R with un-
certainty SR. Calculate R using both methods and give the formulas with which
the uncertainties are calculated. Explain why the two methods give different un-
certainties SR. Which method is correct, and why is the other method incorrect?
6. Someone measures an electrical current I. They repeat the measurement N times
and find a spread in the individual measurements (standard deviation) S of 0.5
µA. Answer the following questions:

(a) How did they determine S (give an expression)?


(b) How does S depend on the total number of measurements N ?
(c) If they want to know the current I within an uncertainty of 0.1 µA, how
many measurements (N ) will they have to perform? Explain your answer.

7. The resistance R of an (unknown) resistor is measured by applying a voltage V and


measuring the current I. The applied voltage V is 3 Volt, and is very (infinitely)
accurately known. The current is measured 10 times and there appears to be some
spread in the results. From these measurements it is calculated that I = (2.3 ±
0.2) µA. How many extra measurements are necessary such that the resistance
can be determined to a relative accuracy of 5%? Note that we are dealing with
68%-intervals. Explain your answer.
8. In some cases, a 68%-interval is not precise enough. After all, there is still a 32%
chance that the real value of the quantity of interest lies outside this interval. As
an alternative, two times the standard deviation from the average (i.e., 2S) is often
given. This is a 95%-interval. Show that the general formula for 95%-intervals is
the same as that for 68%-intervals.
9. An electrician uses two resistors in a parallel circuit. The resistance of these
resistors is rated by the manufacturer at R1 = 100 Ω ± 5% and R2 = 270 Ω ± 5%, re-
spectively. In contrast to the same question from the last chapter, the uncertain-
ties are now 68%-intervals. The total resistance R of the parallel circuit follows
from 1/R = 1/R1 + 1/R2.

(a) Calculate the value of R and its uncertainty.
(b) If the electrician wishes to have a more precisely known value of R, which of
the resistors should he swap for a more accurately rated one?

4. Distribution functions

4.1 Introduction

In chapter 3 we briefly discussed the Gaussian (normal) distribution, because in most


cases the spread of measurement results will follow this distribution function. The
expressions found for the standard deviation S, for the standard deviation of the mean
Sm and the calculation rules that we have derived, are not exclusively valid for Gaussian
distributions. They are valid for every possible distribution function p(x) that results
in a spread of values around the real value, given that the spread is not too large. Only
when we started calling Sm the 68% confidence interval, we have used the Gaussian
distribution. This is because for this distribution we may state that when we measure
(infinitely) many times, 68% of the measurement results will lie between xt − σ and
xt + σ. For other distribution functions this might not be the case. In this chapter
several important distribution functions will be introduced that will coincide in their
limiting cases.

4.2 Discrete and continuous distributions

When conducting an experiment it is possible that the result can have any value in
a continuous range. However, it is also possible that the result is discrete. In both
cases infinite repetition of the experiment will result in a distribution of measurement
values that is proportional to the distribution function. For example, in the first
example in §3.2 (the dice-throwing experiment) we expect that each result has the same
probability, namely 1/6. Therefore, many throws will result in a frequency distribution
F(x) that is the same, constant value for each x. The fact that this was not the case in
the example is caused by the limited number of throws. After countless measurements
(N ), we find
P (x = xk) ≈ (1/N) F(xk),   (4.1)
in which P (x = xk ) is the probability of finding a result xk . If the sample space
is limited in size, as was the case with the dice, the probability distribution will be
sufficiently approximated by a finite amount of measurements. In a very large sample
space this is rather difficult, since either countless measurements have to be performed
or the measurement results have to be divided in several classes. The latter leads to a
frequency density distribution F (x), in which the number of measurement results in a
certain interval has to be divided by the width ∆x of that interval. This procedure also
has to be followed when measuring continuous quantities. The probability P (x = xk )
to find a measurement result in the interval k is now approximated by
P (x = xk) ≈ (1/N) F (xk) Δx,   (4.2)
where N again is the total number of measurements, F (xk ) the value of the frequency
density distribution at the k-th interval and ∆x is the interval width. For continuous
quantities we have defined a probability density function p(x) in §3.2, in which the
probability P (a ≤ x ≤ b) was given by
P (a ≤ x ≤ b) = ∫ₐᵇ p(x) dx.   (4.3)

In large populations this probability density function p(x) can be approximated by
(1/N) F (xk).
As we have seen several times before, the probability density distributions look rather
often like the Gaussian curve in §3.4. Actually, the experiment with the dice is not a
very good example for this, since it has a uniform distribution (all results have an equal
probability). However, when the experiment is repeated N times and we add all N
throws, we will quickly find a probability distribution that rather looks like a Gaussian
curve. Figure 4.1 illustrates this for N = 2, N = 4 and N = 6. The probability
distributions are depicted with circles and the best-fitting Gaussian curves are the solid
lines. For N = 6 we find a Gaussian curve that is in very good agreement with the
measured probability distributions.

Figure 4.1: The probability p(x) to measure x in an experiment, where the thrown number
of dots of all N dice throws are added together. Depicted are the distributions
for 2, 4 and 6 throws respectively (circles). The solid lines are the best-fitting
Gaussian distributions.
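The convergence shown in figure 4.1 is easy to reproduce with a short simulation (our own sketch, not course code; the helper name `dice_sum_distribution` is ours): sum the eyes of N dice many times and histogram the results.

```python
import random
from collections import Counter

random.seed(1)  # reproducible illustration

def dice_sum_distribution(n_dice, trials=100_000):
    """Empirical probability P(x) for the sum x of n_dice fair dice."""
    counts = Counter(sum(random.randint(1, 6) for _ in range(n_dice))
                     for _ in range(trials))
    return {x: c / trials for x, c in sorted(counts.items())}

p2 = dice_sum_distribution(2)
# The exact peak for two dice is P(7) = 6/36; with more dice the histogram
# approaches the best-fitting Gaussian, as in figure 4.1.
```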

Although a dice experiment is not really similar to a real physical experiment, discrete
probability distributions are nonetheless very common in physics. In the next few
sections we will discuss the binomial distribution and the Poisson distribution as they
are very important discrete probability distributions in physics. Additionally, we will
discuss the normal distribution, or Gaussian distribution, as a continuous distribution.

4.3 The binomial distribution


As a first example of a discrete probability distribution we will consider the position of a
drunken man that moves in one dimension, for example in a narrow alley. The drunken
man takes steps of length l and the probability that he steps forward we call p and the
probability that he steps backwards we call q = 1 − p. We assume that the man is
sufficiently intoxicated, such that the direction of his current step is independent of the
direction of his previous steps. In the simplest case this is p = q = 21 , but when the alley
is uphill p ̸= q, since q will be higher. After taking N steps, the drunk will find himself
at position x = ml, where −N ≤ m ≤ N . The question is, what is the probability that
the man will find himself at position x = ml after N steps. This probability we call
PN (m). To arrive at position m, the man has to have taken n1 steps forward and n2
steps backward. The variables n1 and n2 have to satisfy the equations

n1 + n2 = N,
n1 − n2 = m, (4.4)

from which we can easily calculate that

N +m
n1 = ,
2
N −m
n2 = . (4.5)
2
We will also see that when the total amount of steps N is even, the position m also has
to be even and when N is odd, m also has to be odd. Consequently, the number of steps
forward and backward are fixed. If we are at position m after N steps, according to the
equation above we have to have taken n1 steps in the forward direction and n2 steps
in the backward direction. However, it is completely irrelevant when those steps have
been taken. It is completely possible that he first took n1 steps forward and then n2
steps backward, but every sequence of steps is allowed as long as their total add up to
n1 and n2 respectively. The probability that he takes n1 steps forward and subsequently
n2 steps backward is
p · p · ⋯ · p (n1 factors) × q · q · ⋯ · q (n2 factors) = p^{n1} q^{n2}.   (4.6)

Every sequence of n1 steps forward and n2 steps backward has that same probability.
The total probability to end up at position m is equal to the amount of possible ways
to take n1 steps forward and n2 steps backward, multiplied by the probability pn1 q n2 .
The total amount of possibilities is given by

N! / (n1! n2!) = N! / (n1! (N − n1)!) = (N choose n1),   (4.7)

in which N! stands for N · (N − 1) · (N − 2) · … · 1 (and by definition 0! = 1). Why this
is the case, we will see in a moment. The probability that the drunk is at position m
after N steps can now be written as

PN (m) = PN (n1, N − n1) = (N choose n1) p^{n1} q^{N−n1} = (N choose n1) p^{n1} (1 − p)^{N−n1}.   (4.8)

With PN (n1, N − n1) we obviously mean the probability to take n1 steps forward and
N − n1 steps backward. This probability distribution we call the binomial distribution.
Its name comes from the so-called binomial expansion

(p + q)^N = Σ_{n=0}^{N} [N! / (n! (N − n)!)] p^n q^{N−n} = Σ_{n=0}^{N} (N choose n) p^n q^{N−n}.   (4.9)

From this we instantly notice that

Σ_{n1=0}^{N} PN (n1, N − n1) = (p + q)^N = 1^N = 1,   (4.10)

and this is rather pleasant, because that is the probability that the man will end up at
any place, which is of course 1.

    The reason that the number of possible ways to take n1 steps forward and
    n2 steps backward equals (N choose n1) can be intuitively explained as follows.
    It is equivalent to the case in which we have to calculate the number of ways
    to arrange a sequence of n1 red balls and n2 blue balls. If the balls are
    numbered, and thus distinguishable, this is simple: for the first ball there
    are N positions available, for the second N − 1, for the third N − 2 and so
    on. For the last ball there is only one position available. The total number
    of possibilities is therefore N (N − 1)(N − 2) ⋯ 1 = N!. However, when only
    the color of the ball is considered and not the number (the drunken man
    doesn't number his steps either), we realize that we have double counted
    quite a few times when coming to N! possibilities. We will have to divide
    by the number of ways to arrange n1 numbered red balls and also by the
    number of ways to arrange n2 numbered blue balls. The former is of course n1!
    (following the same reasoning as for N balls) and the latter is n2!. The total
    number of sequences now is

    N! / (n1! n2!) = N! / (n1! (N − n1)!) = (N choose n1),

    which is the original expression that we wanted to prove.
Figure 4.2 gives the probability distribution PN (m) = PN (n1 , N − n1 ) for the case
p = q = 1/2 and N = 20. Since the probabilities of taking a forward or backward step
are equal, the largest probability is the probability that the drunken man will end up at
m = 0. If the total number of steps is uneven, the probability distribution will have its
maxima at m = −1 and m = 1. Since it’s rather difficult to work with m (you would
have to keep track of whether m is even or uneven every single time), usually only n1
is considered. If we just call it n, the binomial distribution takes the form
 
PN (n) = (N choose n) p^n (1 − p)^{N−n},   (4.11)

where n is the number of steps forward (with corresponding probability p). In this
equation it is unimportant whether N is even or uneven. From now on we will continue
to use this form.

Figure 4.2: Binomial distribution PN (m) for the case p = q = 1/2, or equivalently the
probability that a drunken man finds himself at position m after taking N steps
if the probabilities of taking a step forward and taking a step backward are equal.
For the graphed distribution N = 20. The number n1 along the horizontal axis
is the number of steps taken in the forward direction.

We will start by calculating what the expectation value ε⟨n⟩ of n is (upon performing
many measurements) and the variance var⟨n⟩ = ε⟨(n − ε⟨n⟩)²⟩.
ε⟨n⟩ is given by

ε⟨n⟩ = Σ_{n=0}^{N} n PN (n) = Σ_{n=0}^{N} n (N choose n) p^n (1 − p)^{N−n}.   (4.12)
The n = 0 term in this summation gives 0, so we can leave it out and we write

ε⟨n⟩ = Σ_{n=1}^{N} n (N choose n) p^n (1 − p)^{N−n} = Σ_{n′=0}^{N−1} (n′ + 1) (N choose n′+1) p^{n′+1} (1 − p)^{N−n′−1},   (4.13)

in which we have introduced an n′ in the right part, which is equal to n′ = n − 1 (so
n = n′ + 1). We can now write (n′ + 1)(N choose n′+1) as N (N−1 choose n′) (the proof for this will be given
below), and the expression becomes

ε⟨n⟩ = N Σ_{n′=0}^{N−1} (N−1 choose n′) p · p^{n′} (1 − p)^{N−n′−1} = N p Σ_{n′=0}^{N−1} (N−1 choose n′) p^{n′} (1 − p)^{(N−1)−n′}.   (4.14)

Introducing the variable M = N − 1 results in

ε⟨n⟩ = N p Σ_{n′=0}^{M} (M choose n′) p^{n′} (1 − p)^{M−n′} = N p (p + 1 − p)^M = N p (1)^M = N p.   (4.15)

Therefore, the expectation value ε⟨n⟩ is given by

ε⟨n⟩ = N p.   (4.16)

    The proof that (n′ + 1)(N choose n′+1) = N (N−1 choose n′) can be given as follows:

    (n′ + 1)(N choose n′+1) = (n′ + 1) N! / ((n′ + 1)! (N − n′ − 1)!) = N! / (n′! (N − n′ − 1)!),

    since (n′ + 1)! = (n′ + 1) · n′ · (n′ − 1) · … · 1 = (n′ + 1) · n′!. This results in

    N! / (n′! (N − n′ − 1)!) = N (N − 1)! / (n′! ((N − 1) − n′)!) = N (N−1 choose n′).

    Q.e.d.

The variance var⟨n⟩ is defined as

var⟨n⟩ = ε⟨(n − ε⟨n⟩)²⟩ = ε⟨(n − N p)²⟩,   (4.17)

where we have used the obtained expression for ε⟨n⟩. Expanding this gives

var⟨n⟩ = ε⟨(n − N p)²⟩ = ε⟨n² − 2N p n + N²p²⟩ = ε⟨n²⟩ − 2N p ε⟨n⟩ + N²p²
       = ε⟨n²⟩ − 2N²p² + N²p² = ε⟨n²⟩ − N²p².   (4.18)
Now we need to calculate ε⟨n²⟩:

ε⟨n²⟩ = Σ_{n=0}^{N} n² (N choose n) p^n (1 − p)^{N−n} = Σ_{n′=0}^{N−1} (n′ + 1) N (N−1 choose n′) p^{n′+1} (1 − p)^{N−n′−1},   (4.19)

where we have applied the same trick we used earlier. In the term on the right we can
split off the n′ + 1, which results in

ε⟨n²⟩ = N p Σ_{n′=0}^{N−1} (N−1 choose n′) p^{n′} (1 − p)^{N−1−n′} + N Σ_{n′=0}^{N−1} n′ (N−1 choose n′) p^{n′+1} (1 − p)^{N−n′−1}.   (4.20)

The first term in this equation again equals N p (according to equations (4.14) and
(4.15)) and for the right term we can again use the trick, so that we obtain

ε⟨n²⟩ = N p + N p² (N − 1) Σ_{n″=0}^{N−2} (N−2 choose n″) p^{n″} (1 − p)^{N−2−n″} = N p + N(N − 1)p².   (4.21)

When we insert this in equation (4.18), we will find

var⟨n⟩ = N p + N(N − 1)p² − N²p² = N p − N p² = N p(1 − p) = N p q.   (4.22)

Thus, the variance equals

var⟨n⟩ = N p q.   (4.23)
The example of the drunken man we have considered until now seems to be rather un-
related to real physical systems. However, the distribution function is certainly relevant
in physics. Often the distribution function will take a significantly more complicated
form since it has to be taken in 2 or 3 dimensions instead of the one-dimensional version
we used in our example. We will name a few physical examples.
• Magnetism: an atom has spin 1/2 and a magnetic moment µ. According to quantum
mechanics, the spin can be either ’up’ or ’down’ with respect to a given direction.
When the probabilities of both spin states are equal, then what is the net magnetic
moment of N of these atoms?
• Diffusion of a molecule in a gas: a certain molecule moves in three dimensions
in a gas with an average distance ℓ between the collisions with other molecules.
What distance has it travelled after N collisions?
• The light intensity as a result of N non-coherent light sources: the amplitude
of the light of each source can be considered as a 2-dimensional vector whose
direction determines the phase. All phases are random and the amplitude of the
net result determines the intensity of the light of all sources together. This has
to be determined statistically.
Summarizing
For the binomial distribution PN (n) = (N choose n) p^n q^{N−n} (with p + q = 1) the expectation
value is given by

ε⟨n⟩ = N p   (4.24)

and the variance by

var⟨n⟩ = N p q  ⇒  σ = √(N p q).   (4.25)
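The results ε⟨n⟩ = Np and var⟨n⟩ = Npq can be checked with a direct simulation of the drunkard's steps (our own sketch, not part of the course material; the helper name `forward_step_stats` is ours):

```python
import random

random.seed(2)  # reproducible illustration

def forward_step_stats(N, p, trials=50_000):
    """Simulate the number of forward steps n out of N (each with probability p)
    and return the sample mean and variance of n over many walks."""
    ns = [sum(1 for _ in range(N) if random.random() < p) for _ in range(trials)]
    mean = sum(ns) / trials
    var = sum((n - mean) ** 2 for n in ns) / trials
    return mean, var

mean, var = forward_step_stats(20, 0.5)
# Expect mean close to N p = 10 and var close to N p q = 5 (eqs. 4.24, 4.25)
```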
4.4 The Poisson distribution
The Poisson distribution is essentially a special case of the binomial distribution. We
will let the number of steps N approach infinity, but the product N p (and thus ε ⟨n⟩)
will remain constant. Consequently, p will approach 0. An important example of this
is the measurement of the radioactivity of a sample. We measure pulses with a Geiger-
Müller counter for a time t. The number of pulses that we measure during this time t
we call n. The probability of measuring a pulse during a time interval ∆t is p. In our
measurement time t there are N = t/Δt of these time intervals. The probability P (n)
that we will measure n pulses in N time intervals is equal to the probability that the
drunken man from the previous section took n steps forward out of a total of N steps.
This means that P (n) follows the binomial distribution
 
P (n) = (N choose n) p^n (1 − p)^{N−n}.   (4.26)

However, the decay process is not discrete like the steps from the drunken man, but
continuous. This means that we are not really dealing with time intervals ∆t, but that
we will need to make these time intervals infinitesimally small. So we take the limit
Δt → 0, which implies that N = t/Δt → ∞. However, the average number of pulses
µ = ε⟨n⟩ that we will count during the measurement time t, if we repeat the experiment
an infinite amount of times, remains unchanged. This is undoubtedly related to the
magnitude of the radioactivity of the sample. We will have to keep ε⟨n⟩ = N p = µ
constant. Hence, the equation becomes
P (n) = lim_{N→∞} (N choose n) (µ/N)^n (1 − µ/N)^{N−n}.   (4.27)
We can evaluate this limit analytically. We will not give the proof here, but eventually
we will find for the Poisson distribution
P (n) = (µ^n / n!) e^{−µ},   (4.28)
where µ is (again) the average number of pulses that we would measure if we would
repeat the experiment an infinite amount of times. The expectation value ε⟨n⟩ and the
variance var⟨n⟩ = ε ⟨(n − ε⟨n⟩)2 ⟩ can be calculated easily.
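The limit in equation (4.27) can also be checked numerically. The comparison below (our own illustration; the helper names are ours) keeps µ = Np fixed while N grows:

```python
import math

def binom_pmf(N, n, p):
    """Binomial probability P_N(n), eq. (4.26)."""
    return math.comb(N, n) * p**n * (1 - p) ** (N - n)

def poisson_pmf(n, mu):
    """Poisson probability P(n), eq. (4.28)."""
    return mu**n * math.exp(-mu) / math.factorial(n)

mu = 3.0
# Largest pointwise difference over the first few n, for increasing N at fixed mu = N p:
diffs = {N: max(abs(binom_pmf(N, n, mu / N) - poisson_pmf(n, mu)) for n in range(12))
         for N in (10, 100, 1000)}
# diffs shrinks as N grows, confirming that the binomial approaches the Poisson
```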

ε ⟨n⟩ = µ and var ⟨n⟩ = N pq (from the binomial distribution) = µ, because N p = µ was
kept constant and q = 1, because p was set to 0. The last result is rather unexpected.
In chapter 3 we have seen that when the measurement data shows a variation around
an expectation value, it was necessary to repeat the experiment several times to get an
impression of the magnitude of this variation. From this variation we could determine
the uncertainty in the expected value. When we know that we are dealing with a process
that follows a Poisson distribution, then in theory it is possible that we can estimate the
variation that we would find if we would repeat the experiment several times, in a single
measurement. If we assume that this single measurement results in a value n1 that is
close to µ, then the variation in the measurements will be σ = √(var⟨n⟩) = √µ ≈ √n1.
With this we can also say something about the probability and the probability interval
in which µ must lie. We will then know that there is a 68% probability that µ will lie
between the values n1 − √n1 and n1 + √n1.

We have assumed that √µ ≈ √n1. However, in a single measurement we
know that n1 can deviate from µ significantly. We will now show that the
approximation is acceptable nonetheless, provided that µ is not too small.
We know that there is a 68% probability that a single measurement lies
within the boundaries µ − √µ and µ + √µ, so with a 68% probability it holds
that µ − √µ < n1 < µ + √µ. Let us calculate what we would find for √n1
for the boundary n1 = µ + √µ:

√n1 = √(µ + √µ) = √(µ (1 + 1/√µ)) = √µ (1 + 1/√µ)^{1/2}.

If µ ≫ 1, then 1/√µ ≪ 1, so the approximation we used in equation (2.47)
applies:

√n1 ≈ √µ (1 + 1/(2√µ)) = √µ + 1/2.

Analogously we will find for the 68% lower boundary √µ − 1/2. Note that
with a 68% probability the square root of a single measurement result will
deviate at most 1/2 from √µ and thus will be an acceptable approximation
of the true value.

Summarizing
For the Poisson distribution P(n) = (µ^n/n!) e^{−µ} the expectation value and the
variance are given by

ε⟨n⟩ = µ    (4.29)

and

var⟨n⟩ = µ ⇒ σ = √µ.    (4.30)
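The fact that a single count already estimates its own uncertainty can be checked numerically. The sketch below is not part of the notes: it generates Poisson-distributed counts by accumulating exponential waiting times with mean 1 (anticipating §4.4.1), so that the expected count µ equals the window length, and confirms that the variance of the counts equals their mean, i.e. σ = √µ.

```python
import random

random.seed(1)

def poisson_count(mu):
    """Count pulses in a window of length mu when the mean waiting
    time between pulses is 1 (so the expected count is mu)."""
    t, n = 0.0, 0
    while True:
        t += random.expovariate(1.0)   # exponential waiting time, tau = 1
        if t >= mu:
            return n
        n += 1

mu = 100.0
counts = [poisson_count(mu) for _ in range(20000)]
mean = sum(counts) / len(counts)
var = sum((n - mean) ** 2 for n in counts) / (len(counts) - 1)
print(round(mean, 1), round(var, 1))   # both close to mu, so sigma = sqrt(mu) = 10
```

A single measured count n1 near µ thus gives √n1 ≈ 10 as an estimate of the spread, without repeating the experiment.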
4.4.1 The Poisson distribution in a different perspective
In our previous statements about the Poisson distribution we have assumed that we
have a fixed time interval and we have calculated how many pulses we can expect in
this interval. We will now consider a fixed number of pulses and find out how long it
will take to measure this number of pulses. We will start simple with one pulse. The
question is: how long will it take until the next pulse will be measured? Since the pulses
from a radioactive sample are not measured at regular time intervals, we are dealing
with a probability distribution. This probability distribution of measuring a next pulse
at time t, we will call p1 (t). We start our measurement at time t = 0 and we divide
the time interval until time t at which the next pulse is measured in small intervals dt.
We know for sure that in all those intervals dt there was no pulse measured, except for
the time interval t and t + dt, in which there was one measured pulse. The probability
of measuring a pulse in such a time interval dt we call q for now, so the probability is
described by
p_1(t) dt = (1 − q)^N q.    (4.31)
In this equation we have N intervals dt until time t, so dt = t/N. The term (1 − q)^N
is the probability that in the first N intervals there is no pulse and the term q is the
probability that in the N +1’th interval there is a pulse. The probability q of measuring
a pulse in the time interval dt of course depends on the size of this interval. For very
small time intervals dt we can write q = q0 dt = dt/τ , in which q0 = 1/τ is a constant.
If we insert this in equation (4.31) and make the intervals infinitesimally small (so the
number of steps N gets infinitely large), we will find

p_1(t) dt = lim_{N→∞} (1 − t/(Nτ))^N (1/τ) dt.    (4.32)

If we subsequently use the expression

exp(z) = lim_{m→∞} (1 + z/m)^m,

then we will find for the probability distribution p_1(t):

p_1(t) = (1/τ) exp(−t/τ).    (4.33)
Note that we have automatically met the condition ∫_0^∞ p_1(t) dt = 1. We can also
directly see that the assumption is correct that the probability q of measuring a pulse
in time interval dt is equal to q = dt/τ. After all, the probability of measuring a pulse
in a small interval is nothing more than

q = lim_{t↓0} p_1(t) dt = dt/τ.
We will now find an expression for pN (t). This is the probability density function that
describes the probability of measuring the N ’th pulse at time t. This probability is of
course a summation of all probabilities to measure N − 1 pulses in a time t′ (where
0 ≤ t′ ≤ t) and the last pulse in the remaining time t − t′, or equivalently

p_N(t) = ∫_0^t p_{N−1}(t′) p_1(t − t′) dt′.    (4.34)

After some calculations we get

p_2(t) = (t/τ²) exp(−t/τ),
p_3(t) = (t²/(2τ³)) exp(−t/τ),
p_4(t) = (t³/(6τ⁴)) exp(−t/τ), etc.    (4.35)
From this we can see that the general expression for p_N(t) has to be something like
p_N(t) = (α_N/τ^N) t^{N−1} exp(−t/τ). By inserting this in equation (4.34), we will find

(α_N/τ^N) t^{N−1} exp(−t/τ) = ∫_0^t (α_{N−1}/τ^{N−1}) t′^{N−2} exp(−t′/τ) (1/τ) exp(−(t − t′)/τ) dt′
                            = (α_{N−1}/τ^N) exp(−t/τ) ∫_0^t t′^{N−2} dt′
                            = (α_{N−1}/τ^N) exp(−t/τ) t^{N−1}/(N − 1)    (4.36)

and since α_1 = 1, we get

α_N = α_{N−1}/(N − 1) = 1/(N − 1)!.    (4.37)
The general expression for p_N(t) is therefore

p_N(t) = (t^{N−1}/((N − 1)! τ^N)) exp(−t/τ).    (4.38)

From this expression we can subsequently also derive equation (4.28). This goes as
follows. The probability of measuring n pulses in a time interval T we call PT (n).
This probability is the summation of all probabilities that the n pulses will occur in an
interval t (with 0 ≤ t ≤ T ) and that in the rest of the interval (so T − t) there are no
more pulses:
P_T(n) = ∫_0^T p_n(t) P_{T−t}(0) dt.    (4.39)
The probability PT −t (0) to find no more pulses in the interval T − t is obviously equal
to the sum of all probabilities that the next pulse is later than that, which means that
P_{T−t}(0) = ∫_{T−t}^∞ p_1(t′) dt′ = ∫_{T−t}^∞ (1/τ) exp(−t′/τ) dt′
           = exp(−(T − t)/τ).    (4.40)

Inserting this in equation (4.39) gives

P_T(n) = ∫_0^T (t^{n−1}/((n − 1)! τ^n)) exp(−t/τ) exp(−(T − t)/τ) dt
       = (exp(−T/τ)/((n − 1)! τ^n)) ∫_0^T t^{n−1} dt.    (4.41)

From this the final result follows:

P_T(n) = ((T/τ)^n/n!) exp(−T/τ).    (4.42)
This is the same expression as equation (4.28) with µ = T /τ . From this we can
immediately see that the average number of pulses is proportional to the duration T of
the time interval. The average number of pulses per unit of time is 1/τ .

4.5 The normal, or Gaussian distribution


In §3.4 we have extensively discussed this distribution. Here, we will elaborate on its
relation to the two aforementioned distribution functions. We will again consider
the random walk of the drunken man, but for simplicity we will only let him walk
forward (probability p), or stand still (probability q = 1 − p). When the man walked
a total of N steps, of which n were forward and N − n were a step in place, he will
finally find himself at position nl. Also, we let the number of steps N approach infinity.
However, we will not keep the product N p fixed like with the Poisson distribution, but
we will take the product N l fixed, so the size of his steps will approach 0. This way the
drunk will not walk infinitely far away. The probability of taking n steps forward out
of N steps in total and standing still for N − n times, was

P_N(n, N − n) = \binom{N}{n} p^n q^{N−n},    (4.43)

and thus the probability to end up at position nl was

P_N(n) = \binom{N}{n} p^n q^{N−n},    (4.44)
in which of course p + q = 1. When N becomes very (in principle infinitely) large, we
can write this as a continuous function:

P_N(n) = (1/√(2πNpq)) e^{−(n−pN)²/(2Npq)}.    (4.45)

We will not get into the mathematical details. If we call the position of the drunken
man x = nl, then we can write for this the Gaussian distribution

P_N(x) = (1/(√(2π)σ)) e^{−(x−µ)²/(2σ²)},    (4.46)

as we have previously seen in §3.4. From this it directly follows that

µ = pNl,
σ = √(Npq) l = √(Np(1 − p)) l.    (4.47)
We can conclude that when the spread in measurement values is the result of a (very)
large number of elementary steps that in themselves have nothing to do with a normal
distribution, the sum of those elementary steps is nevertheless described by a normal
distribution. So regardless of the distribution of each of the elementary steps, the
Gaussian distribution arises anyway. This is also the reason why we assume that in the
large majority of cases the spread in measurement results is described by this
distribution. Even when it is not certain that this holds for a particular physical
quantity, it is usually assumed anyway.

4.5.1 The reduced normal distribution


A downside to the use of the normal distribution is that it is rather hard to do calcula-
tions with it. For example, the probability to find a result x between the values a and
b is

∫_a^b p(x) dx = ∫_a^b (1/(√(2π)σ)) e^{−(x−x̄)²/(2σ²)} dx,    (4.48)
and this integral cannot be solved analytically, only numerically. Tabulating is a pos-
sibility, but for each value of the mean x̄ and standard deviation σ we would have to
make a new table. To get around this, we create a new variable z,
z = (x − x̄)/σ.    (4.49)
Since x follows a normal distribution (with average x̄ and standard deviation σ), z also
follows a normal distribution:
φ(z) = (1/√(2π)) e^{−z²/2}.    (4.50)
This is the normal distribution with mean z̄ = 0 and standard deviation σ = 1. Below
we have provided a table with the so called p-value, P (z ≥ zc ). This is the probability
of finding a result z that is larger than the determined value zc and is given by
P(z ≥ z_c) = ∫_{z_c}^∞ φ(z) dz = ∫_{z_c}^∞ (1/√(2π)) e^{−z²/2} dz.    (4.51)

zc P (z ≥ zc ) zc P (z ≥ zc ) zc P (z ≥ zc ) zc P (z ≥ zc )
0.0 0.500 1.0 0.159 2.0 0.023 3.0 1.35·10−3
0.1 0.460 1.1 0.136 2.1 0.018 3.1 9.68·10−4
0.2 0.421 1.2 0.115 2.2 0.014 3.2 6.87·10−4
0.3 0.382 1.3 0.097 2.3 0.011 3.3 4.83·10−4
0.4 0.345 1.4 0.081 2.4 0.008 3.4 3.37·10−4
0.5 0.309 1.5 0.067 2.5 6.21·10−3 3.5 2.33·10−4
0.6 0.274 1.6 0.055 2.6 4.66·10−3 3.6 1.59·10−4
0.7 0.242 1.7 0.045 2.7 3.47·10−3 3.7 1.08·10−4
0.8 0.212 1.8 0.036 2.8 2.56·10−3 3.8 7.24·10−5
0.9 0.184 1.9 0.029 2.9 1.87·10−3 4.0 3.17·10−5

If both x̄ and σ of a normal distribution are known, it is possible to calculate the
p-value from the table.
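The table can also be reproduced in software. This sketch uses the identity P(z ≥ z_c) = ½ erfc(z_c/√2), which follows from the definition of the complementary error function (the notes themselves only provide the table).

```python
import math

def p_value(zc):
    """Exceedance probability P(z >= zc) of the reduced normal distribution."""
    return 0.5 * math.erfc(zc / math.sqrt(2))

for zc in (0.0, 1.0, 2.0, 3.0):
    print(zc, round(p_value(zc), 5))
# 0.0 0.5
# 1.0 0.15866
# 2.0 0.02275
# 3.0 0.00135
```

For a measurement x from a normal distribution, first reduce it with z = (x − x̄)/σ and then look up (or compute) the exceedance probability.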

4.6 Comparison of the three distribution functions


In the previous section we have seen that the sum of a large number of elementary
stochastic steps follows a normal distribution. The question arises how large those
steps really need to be to speak of a Gaussian distribution. In §4.2 we have given the
results of the sum of several dice throws. While one throw with the dice is not even
close to following a normal distribution, we can see from the figure in §4.2 that already
for N = 4 the Gaussian curve is approximated very well. Apparently it converges to a
normal distribution rather quickly.
In figure 4.3 we have depicted the three distribution functions once again for the same
parameters N , p, µ and σ. For the binomial distribution the parameters N and p are
essential and we have used N = 80 and p = 0.1. From this follows that for the Poisson
distribution we have to take µ = N p = 8.0 to be able to compare it with the binomial
distribution and for the normal distribution µ = pNl = 8.0, from which follows that
l = 1 and thus σ = √(Npq) l = √7.2 = 2.68.


Figure 4.3: Comparison of the three probability distribution functions. The Gaussian distri-
bution (solid line) is depicted for σ = 2.68 and x = 8.0, the Poisson distribution
(squares) for µ = 8.0 and the binomial distribution (circles) for N = 80 and
p = 0.1.
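The numbers behind figure 4.3 can be recomputed directly. The sketch below evaluates the three functions at a few values of n for N = 80 and p = 0.1; near the peak (n ≈ 8) the three nearly coincide.

```python
import math

N, p = 80, 0.1
mu = N * p                            # 8.0
sigma = math.sqrt(N * p * (1 - p))    # sqrt(7.2) = 2.68...

def binomial(n):
    return math.comb(N, n) * p ** n * (1 - p) ** (N - n)

def poisson(n):
    return mu ** n / math.factorial(n) * math.exp(-mu)

def gauss(x):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

for n in (4, 8, 12):
    print(n, round(binomial(n), 4), round(poisson(n), 4), round(gauss(n), 4))
```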

4.7 Exercises and problems

4.7.1 Questions
1. The width of the normal distribution is characterized by the parameter σ (standard
   deviation). The width of the distribution at half its height (i.e., the distance
   between the points for which p(x) = ½ p(x̄)) is called the ‘Full Width at Half
   Maximum’ (FWHM). Prove that FWHM = 2σ√(2 ln 2) = 2.35σ.
   This is a way to quickly determine the standard deviation from a plotted frequency
   distribution (∼ probability density): σ = FWHM/2.35 ≈ (3/7) FWHM.

2. Take note: The first part of this exercise is a copy from an exercise in chapter
3. When measuring the thickness of a thin layer of liquid helium, the following
angles (in minutes) were observed:
34 35 45 40 46
38 47 36 38 34
33 36 43 43 37
38 32 38 40 33
38 40 48 39 32
36 40 40 36 34

(a) Draw a histogram of the observations.


(b) Which observation has the highest frequency (i.e., occurs most often)?
(c) What is the so-called median, i.e., the measured value with as many obser-
vations above it as below it?
(d) What is the mean (i.e., average)?
(e) Calculate the best estimate of the standard deviation of the population.
(f) Calculate the standard deviation of the mean.
(g) Calculate the standard deviation of the standard deviation.
(h) What boundaries give a single measurement a 68% probability of falling
between them?
(i) Which boundaries have a 95% chance of containing a single observation?
(j) Give the boundaries between which the average lies with a 68% probability.
(k) Between which boundaries does the average lie with a 95% probability?
(l) Between which boundaries does the standard deviation lie with a 68% prob-
ability?
(m) A normally distributed sample space has a so-called ‘probable error’ p ≈
0.67σ (with σ the standard deviation of the population). It has as a theo-
retical property that:
i. 25% of the observations lie under x̄ − p (x̄ is the average of the popula-
tion),
ii. 50% of the observations lie between x̄ − p and x̄ + p,
iii. 25% of the observations lie above x̄ + p.
For the given 30 angles, check how well they satisfy this theoretical property.
(n) Say that one of the observations had given a value of 55. Should this obser-
vation be accepted as ‘correct’ or should it be rejected and why?
(o) Take two random samples of 5 observations each from the whole series. Cal-
culate their mean and standard deviation. Compare these results to each
other and to the more accurate values from the total sample of 30 observa-
tions.
(p) Say that the experiment requires a standard deviation of the average that
is at most 1% of the average. How many observations would have to be
performed, at least?
(q) If the standard deviation of the average has to be within 5% of the average,
how many observations have to be performed?

3. Using the reduced normal distribution, show that there is a 68% probability of
finding a measurement result between x̄ − σ and x̄ + σ for a Gaussian probability
distribution.
4. Someone measures the radioactivity of a sample using a Geiger-Müller counter.
During a time interval t the apparatus counts n pulses. This amount of pulses
satisfies the Poisson distribution
p(n) = (µ^n/n!) e^{−µ}
where p(n) is the probability of counting n pulses during the whole time interval
and µ is the ‘real value’, i.e., µ is the average amount of counted pulses if the
experiment were repeated an infinite amount of times. Of course, µ is propor-
tional to the level of radioactivity of the sample. The experimenter repeats the
measurement 10 times for a time interval t=200 s and finds as results 4673, 4628,
4656, 4509, 4698, 4710, 4642, 4590, 4558 and 4731 pulses, respectively.
(a) To verify whether the measured amount of pulses satisfies the Poisson distri-
bution, one can compare the spread in measurements (standard deviation)
with the theoretical standard deviation. Do so and give an uncertainty in
your answer. Use that the standard deviation S_S of the standard deviation
S is approximated by the formula S_S/S = √(1/(2(N − 1))), where N is the
amount of measurements. What is the conclusion?
(b) Give the best estimate of µ and the uncertainty.
Now measurements are performed using another sample, which is about half as
radioactive as the first sample (i.e. µ for this second sample is approximately half
the value of µ for the first sample). The experimenter decides to perform one
measurement during one time interval.
(c) What time interval should be chosen so that µ can be determined with the
same absolute accuracy as for the first sample? You do not have to give an
uncertainty, this is only an estimate.
(d) What time interval should be chosen so that µ can be determined with the
same relative accuracy as for the first sample? (Again, no uncertainty is
needed)

4.7.2 Problems
1. A number of measurements of the length of an object give as results 50.32, 50.41,
50.35, 50.38, 50.26, 50.44, 50.38, 50.38, 50.36, 50.41, 50.55 mm.
The last measurement is quite far from the average. Should the measurement be
rejected or is it likely that it is a ‘good’ measurement? In your answer, use the
reduced normal distribution. This reduced normal distribution is given by the
function

φ(z) = (1/√(2π)) e^{−z²/2}.

It is derived from the ‘ordinary’ normal distribution (Gaussian distribution)

p(x) = (1/(σ√(2π))) e^{−(x−x̄)²/(2σ²)}.

The so-called exceedance probability P(z ≥ z_c) of the reduced normal distribution
is given by:

P(z ≥ z_c) = ∫_{z_c}^∞ φ(z) dz.

This is the probability to find (measure) a value of z which is larger than zc . In
the lecture notes, a table is given where this exceedance probability is given as a
function of zc .

Hint: Calculate how far off the final measurement is from the average of the first
10 measurements, and calculate using the reduced normal distribution what the
probability is of finding a measurement outcome that is at least as far away from
the average. Is it likely to find such an outcome in a relatively small measurement
series?

2. The presidential election in a very large country had 2 candidates A and B. The
amount of votes cast was so large that it can be considered infinite. To get an
estimate of which of the two candidates got the most votes, a representative sample
is taken. Of all the votes cast, N are selected and viewed. Of this sample of N
ballots, n were for candidate A and N − n were for candidate B. For simplicity’s
sake, we assume there were no invalid ballots. Using this sample, we of course
want to make a statement about the fraction x of all votes that candidate A got.
As a first estimate, you might say that x = n/N , but is this true? In a sample
resulting in n votes for A and N − n votes for B, we can write the probability
density function for x as:

p(x) = (N + 1) \binom{N}{n} x^n (1 − x)^{N−n}.

Because x is a fraction, of course 0 ≤ x ≤ 1 holds. Note that this probability
density looks very much like the binomial distribution, but now with the fraction
x as unknown (instead of the amount n).
For each N and n, p (x) is a different function. For example for N = 10 and n = 3
(left) and for N = 10 and n = 7 (right), it looks as follows:

[Plots of p(x) for N = 10 with n = 3 (left) and n = 7 (right).]

The given probability density function is identical to the ‘Beta distribution’ which
is defined as

p(x) = (1/B(a, b)) x^{a−1} (1 − x)^{b−1}.

In the expression for this probability density, the term B(a, b) appears. This is
the so-called Beta-function, which is not important for us at this point. Known
(from literature, but you could also just calculate this) is that the expectation
value ε⟨x⟩ equals

ε⟨x⟩ = a/(a + b)

and the variance var⟨x⟩ equals

var⟨x⟩ = ab/((a + b)²(a + b + 1)).

(a) Express the symbols a and b of the Beta-distribution in terms of n and/or
    N from the ‘presidential distribution’.
(b) Show that the maximum of the distribution p(x) is always at x = n/N.
(c) If from the sample of N ballots, n are for candidate A, what do you expect
    is the original fraction of all votes cast that candidate A got? Hint: use the
    expectation value.
(d) Shortly explain why the answer to (c) does not equal n/N (which may be
    your intuitive expectation). Hint: Take a look at the plots of p(x).
(e) Show that the variance var⟨x⟩ of x equals var⟨x⟩ = ε⟨x⟩(1 − ε⟨x⟩)/(N + 3).
(f) From a sample of N = 100, there are n = 49 votes for candidate A and 51
votes for B. What do you expect x to be and what is the uncertainty in this
expectation? Can a conclusion be drawn from this sample?
(g) If in reality, 50.1% of all votes are for candidate A (so xw = 0.501), how
large should N be to appoint a winner with approximately 68% certainty (σ-
interval) from the sample? And how large should N be to get 95% certainty
(2σ-interval)?

3. In a casino, players can buy chips for € 1,-. The owner wants to introduce the
   following game. A player repeatedly draws one card from a very large collection
   of cards. The ratio between face cards (Jack–Ace) and pip cards (2–10) is 4:9 and
   does not change by drawing cards. Each time a player draws a card, he has to
   stake 2 chips. The player continues drawing cards as long as he draws face cards.
   When the player draws a pip card (2, 3, ..., 10), the game ends and the player wins
   the square of the amount of drawn cards (including the last pip card) in chips.
   So, if the player immediately draws a pip card, he wins 1 chip but he had staked
   2 (net loss); if he first draws a face card and then a pip card, he wins 4 chips but
   he had staked 4 (break-even); etc.

(a) Show that the probability P(n) that the player has to stop after the n-th
    card (n = 1, ..., ∞) equals

    P(n) = (9/4)(4/13)^n

(b) Show that in the long term, the casino will turn a loss with this game and
calculate what the average loss (to the casino) per player will be
Hint: below this exercise there is a list of expressions for series
(c) Calculate what the minimum stake (in chips) per drawn card should be,
    such that the casino can make a profit in the long term.
(d) Now players have to stake the amount of chips calculated in (c). What is
the probability that a player makes a profit in 1 game?

Some series:
∑_{n=1}^∞ 1/n = 1 + 1/2 + 1/3 + 1/4 + ⋯ = ∞
∑_{n=1}^∞ (−1)^{n−1}/n = 1 − 1/2 + 1/3 − 1/4 + ⋯ = ln(2)
∑_{n=1}^∞ 1/n² = 1 + 1/4 + 1/9 + 1/16 + ⋯ = π²/6
∑_{n=1}^∞ (−1)^{n−1}/(2n − 1) = 1 − 1/3 + 1/5 − 1/7 + ⋯ = π/4
∑_{n=0}^∞ r^n = 1 + r + r² + r³ + r⁴ + ⋯ = 1/(1 − r) if |r| < 1
∑_{n=1}^∞ n r^n = r + 2r² + 3r³ + 4r⁴ + ⋯ = r/(1 − r)² if |r| < 1
∑_{n=1}^∞ n² r^n = r + 4r² + 9r³ + 16r⁴ + ⋯ = r(1 + r)/(1 − r)³ if |r| < 1
4. Two players A and B throw an (ideal) die. They take turns throwing, as long as
   each roll is larger than the (opponent’s) roll before it. The one who throws a roll
   equal to or lower than the previous roll loses the game. The table below gives the
   probability P(n) that the game ends after precisely the n-th turn.

n P (n)
1 0
2 27216/46656 = 7/12
3 15120/46656 = 35/108
4 3780/46656 = 35/432
5 ?
6 35/46656
7 1/46656

(a) Explain why P (n)=0 for n > 7.


(b) Show by calculation that P (7)=1/46656.
(c) Calculate P (5).
(d) If this game is repeated an infinite amount of times, what is the average of
the amount of throws and what is the spread (standard deviation) in the
amount of throws?
(e) Which player loses most often?

Now the players play for money. Both players stake 1 Euro at the beginning of
the game. After the second throw (if the game has not finished by that time), the
player who wins most often (according to the previous question) stakes another
Euro before each new roll. The other player does not. The player who wins the
game, takes the pot.

(f) Calculate who is expected to make the most money. Calculate the average
profit per game.
5. For this exercise, the following is given:
   ∫ t e^{αt} dt = ((αt − 1)/α²) e^{αt} and ∫ t² e^{αt} dt = ((α²t² − 2αt + 2)/α³) e^{αt}.
The duration of phone calls is distributed according to the so-called exponen-
tial distribution. This distribution is described by the probability density function

p(t) = K e−t/µ

where K and µ are constants. Moreover, µ = 4.05 minutes.

(a) Express the constant K in terms of µ.


(b) Calculate the average duration t̄ of phone calls.
(c) Calculate the probability that a call takes longer than 8 minutes.

The calling rates are € 0.20 as a starting rate (which is always charged) plus
€ 0.20 per minute. This price is not rounded to whole minutes, but calculated
exactly as a function of the call duration.

(d) Calculate the average price per phone call.

The phone company offers alternative rates. Here, no starting rate is charged, but
the rate is € 0.04 times the call duration (expressed in minutes) squared. Here,
again, the exact call duration is used and not rounded to whole minutes.

(e) Again calculate the average price per phone call, but now for these alternative
rates. Which rates are better for the caller?
(f) For which average call duration (instead of the given 4.05 minutes) is the
turning point? (i.e., are both rates equal on average)?

5. Combining measurement results

5.1 Introduction
So far we have talked about the way in which one single measurement (100% intervals)
or a series of measurements (68% intervals) should be analyzed. At this moment we will
limit ourselves to measurements that show a spread in the results. From a measurement
series one can determine the average and its uncertainty. After this the result can be
compared to values from literature or to the results of other measurements of the same
quantity. In §1.6 we discussed how we can draw conclusions from this comparison.
When the measurement results from different experiments are consistent, all results
can be combined to form an even more accurate final result. For example, when two
series of measurements of a quantity x result in x1 = 9.8 ± 0.3 and x2 = 10.2 ± 0.6
(where we neglect the units for now), we can already feel that the true value of x will
lie somewhere above 9.8 and below 10.2. Since the uncertainty in x1 is smaller than the
one in x2 , the true value will probably lie slightly closer to x1 than to x2 . The question
is whether or not a better estimation of the true value than x1 or x2 is possible and
how we should determine this. The answer will be derived in the next section.
Another way to combine measurement results is used when a quantity is a function of
another (measured) quantity and measurements have been performed for several values
of the latter quantity. For example, the extension of a spring is measured and each time
a different mass is hanging from it. If the extension is plotted as a function of the mass,
the measurements will more or less (depending on the uncertainty) follow a straight
line. The slope of this line gives the spring constant. When the measured extensions
have a big uncertainty and therefore will spread widely along the best-fitting line, the
slope of this line will also have a big uncertainty. How we can determine the best-fitting
line and the uncertainty in its slope from a series of measurements will be discussed
further on in this chapter.
It is of course also possible that the measurement points do not follow a straight line,
but have some other functional dependence. Sometimes it is possible to perform an
operation on the measurement points such that they follow a straight line, but in cases
where this is not possible a different method should be used.

5.2 Combining measurements of the same quantity
When we have different results from different measurement series of the same quantity
x, the question is: how can we combine these to form a final result? For example, the
quantity x is determined from 3 measurement series, resulting in

x1 = 9.8 ± 0.3;
x2 = 10.2 ± 0.6;
x3 = 9.9 ± 0.9. (5.1)

How should we average this and how do we determine the uncertainty in the final result?
We always start with checking whether or not the measurements are consistent (see
§1.6). This obviously is the case here. We already mentioned earlier that the true
value will probably lie closer to x1 than to x2 , since the uncertainty in x1 is smaller.
It is obvious that the results should be averaged with a weighing factor that in one
way or another depends on the uncertainty: the smaller the uncertainty, the bigger the
weighing factor. The weighing factors we will use will be called G1 , G2 and G3 . If we
know these factors, we can calculate the weighted average x̄ using
G1 x1 + G2 x2 + G3 x3
x̄ = . (5.2)
G1 + G2 + G3
To derive what G1 , G2 and G3 should be, we assume, for convenience, that the results
are determined using the same method, so with the same spread in the separate mea-
surement results. So, all measurement series have the same spread S. Therefore, we
know that the respective uncertainties Sx̄1 , Sx̄2 and Sx̄3 are determined from
S
Sx̄i = √ (i = 1..3), (5.3)
Ni
where Ni is the number of measurements in measurement series i. From the uncertain-
ties Sx̄1, Sx̄2 and Sx̄3 in our example we now know that

N1 = (S/Sx̄1)² = (S/0.3)²,
N2 = (S/Sx̄2)² = (S/0.6)²,
N3 = (S/Sx̄3)² = (S/0.9)².    (5.4)

In total we have N1 measurements with x̄=9.8, spread in accordance with the spread
S, N2 measurements with x̄=10.2, spread in accordance with the spread S and N3
measurements with x̄=9.9, also spread in accordance with the spread S. If we now
consider this as one big measurement series (with N1 + N2 + N3 measurements in total),
we find

x̄ = (N1 x1 + N2 x2 + N3 x3)/(N1 + N2 + N3)
  = [(S/Sx̄1)² x1 + (S/Sx̄2)² x2 + (S/Sx̄3)² x3]/[(S/Sx̄1)² + (S/Sx̄2)² + (S/Sx̄3)²],    (5.5)
where we have used the earlier expressions for N1, .., N3. Dividing both the numerator
and the denominator by S² finally gives

x̄ = [x1/(Sx̄1)² + x2/(Sx̄2)² + x3/(Sx̄3)²]/[1/(Sx̄1)² + 1/(Sx̄2)² + 1/(Sx̄3)²],    (5.6)

which gives us our weighing factors G1, G2 and G3:

Gi = (1/Sx̄i)², i = 1..3.    (5.7)

The uncertainty in the final result can be found just as easily: N1 + N2 + N3 measure-
ments with spread S gives

Sm = S/√(N1 + N2 + N3) = S/√((S/Sx̄1)² + (S/Sx̄2)² + (S/Sx̄3)²).    (5.8)

By eliminating S we obtain the final result

Sm = 1/√(G1 + G2 + G3).    (5.9)
For our example this gives a final result of x̄ = 9.9 ± 0.3.
In this derivation we assumed that all measurement results xi ±Sx̄i are determined with
the same method of measuring and have the same spread S. This doesn’t necessarily
have to be the case. It is possible to prove that for measurement results with different
spreads in the separate measurements, we can use the same formulas. We will not go
into the details of this.
Summarizing
When N measurement series give the results x1 ± Sx̄1, x2 ± Sx̄2 up to xN ± Sx̄N, we can
determine the weighted average by

x̄ = (∑_{i=1}^{N} Gi xi)/(∑_{i=1}^{N} Gi),    (5.10)

where

Gi = (1/Sx̄i)²,    (5.11)

and the uncertainty in this weighted average is

Sx̄ = 1/√(∑_{i=1}^{N} Gi).    (5.12)
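A short sketch of this recipe applied to the example series from the beginning of this section (x1 = 9.8 ± 0.3, x2 = 10.2 ± 0.6, x3 = 9.9 ± 0.9):

```python
import math

def weighted_average(values, uncertainties):
    """Weighted average and its uncertainty, Eqs. (5.10)-(5.12)."""
    weights = [1 / s ** 2 for s in uncertainties]   # G_i = (1/S_i)^2
    mean = sum(g * x for g, x in zip(weights, values)) / sum(weights)
    s_mean = 1 / math.sqrt(sum(weights))
    return mean, s_mean

mean, s_mean = weighted_average([9.8, 10.2, 9.9], [0.3, 0.6, 0.9])
print(round(mean, 1), round(s_mean, 1))   # 9.9 0.3, the final result quoted above
```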
Note that the uncertainty Sx̄ in the weighted average is always smaller than
the uncertainties Sx̄i in the separate results. This can easily be proven by
the following equations. For the sum of all weighing factors we have

∑_i Gi = G1 + G2 + ... + Gj + ... + GN ≥ Gj

for all j, since all factors Gi are positive. From this it immediately follows
that

1/(∑_i Gi) ≤ 1/Gj, ∀j = 1..N,

and therefore also that

1/√(∑_i Gi) ≤ 1/√Gj ⇒ Sx̄ ≤ Sx̄j, ∀j = 1..N.

5.3 Finding a straight line through a number of measurements: the method of least squares

5.3.1 An arbitrary straight line


We will now look at a case where two measured quantities depend linearly on each
other. These quantities we will call x and y and thus we can state

y = y(x) = ax + b. (5.13)

This dependence (the slope a and the offset b) we want to determine from the measure-
ments (xi , yi ). Each measurement point is determined from its own measurements series
and we will assume that the uncertainty in all yi is constant and equal to Sȳ . Also, for
now, we will assume that the uncertainties in all xi are very small (xi is determined
with infinite accuracy). In a graph this might look like what can be seen in figure 5.1.
The measurement points are indicated by dots with corresponding regions of error (the
error bars) that have the same size for all measurements (Sȳ ). It is clear that all
points are consistent with the straight line that has been drawn through them. We are
looking for the best-fitting line y = ax + b, so the question is: what are a and b? The
measurement points we call (xi , yi ) (i = 1..N ). At the point x = xi we expect to find
y = y(xi ) = axi + b, but we find y = yi . The difference we call ∆yi (see figure), so

∆yi = yi − y(xi ) = yi − (axi + b). (5.14)

The parameters a and b need to be chosen such that the measurement points (xi , yi )
all lie as close to the line as possible. This means that the deviations ∆yi need to
Figure 5.1: The best-fitting straight line through the points (xi , yi ). We assumed that the
uncertainties in all xi are negligible and that those in all yi (the error bars) are
the same for all points. The vertical distance from a point (xi , yi ) to the straight
line is ∆yi .

be as small as possible. We could consider minimizing the sum ∑∆yi of all these
deviations, but since ∆yi can be both positive and negative according to the definition
above, minimizing the sum ∑∆yi won’t give the desired result. We could minimize the
sum ∑|∆yi|, but usually it is hard to calculate with absolute values, so instead we are
going to minimize the sum ∑(∆yi)². We also have mathematical reasons for this, but
we will get back to that. We define
N
X N
X
S= ∆yi2 = (yi − axi − b)2 = f (a, b). (5.15)
i=1 i=1

With f (a, b) we mean that this S is a function of the variables a and b. We need to
choose a and b such that S = f (a, b) is minimal. We know that a one-dimensional
function f(x) is minimal when its first derivative is zero: df/dx = 0. Analogous to this, f(a, b) is at its minimum when ∂f/∂a = 0 and ∂f/∂b = 0, where ∂f/∂a and ∂f/∂b are the partial derivatives of f with respect to a and b, respectively (see also §2.1.2). From the first condition (∂S/∂a = 0) we deduce

∂S/∂a = ∂/∂a Σ_{i=1}^{N} (yi − axi − b)² = Σ_{i=1}^{N} ∂/∂a (yi − axi − b)²
      = Σᵢ −2xi (yi − axi − b) = −2 Σᵢ xi yi + 2a Σᵢ xi² + 2b Σᵢ xi = 0
  ⇒ a Σᵢ xi² + b Σᵢ xi = Σᵢ xi yi,   (5.16)
and from the second (∂S/∂b = 0)

∂S/∂b = ∂/∂b Σ_{i=1}^{N} (yi − axi − b)² = Σ_{i=1}^{N} ∂/∂b (yi − axi − b)²
      = −2 Σᵢ (yi − axi − b) = −2 Σᵢ yi + 2a Σᵢ xi + 2b Σᵢ 1 = 0
  ⇒ a Σᵢ xi + bN = Σᵢ yi.   (5.17)

Note that the summations can be calculated from the measurements and therefore they
are constants. We now have a system of two equations with two unknowns:

α11 a + α12 b = δ1 ,
(5.18)
α21 a + α22 b = δ2 ,

where α11 = Σᵢ xi², α12 = α21 = Σᵢ xi, α22 = N, δ1 = Σᵢ xi yi and δ2 = Σᵢ yi are constants. We can solve this system and we find

a = (α22 δ1 − α12 δ2) / (α11 α22 − α12 α21),
b = (α11 δ2 − α21 δ1) / (α11 α22 − α12 α21).   (5.19)

By filling in α11 , α12 , α21 , α22 , δ1 and δ2 we get the final result
a = ( N Σᵢ xi yi − Σᵢ xi Σᵢ yi ) / D,
b = ( Σᵢ xi² Σᵢ yi − Σᵢ xi Σᵢ xi yi ) / D,   (5.20)

where D = N Σᵢ xi² − ( Σᵢ xi )². Note that this can also be written as D = N Σᵢ (xi − x̄)². From this we immediately see that at all times D > 0.
With this we can calculate the slope a and the offset b of the best-fitting line. In the
introduction of this chapter we gave the example of determining a spring constant.
This constant follows directly from the above equation for a. However, we know that
the measurements will show a spread around the best-fitting line, so there will be an
uncertainty in the values of a and b we just calculated. Obviously, we also want to
calculate these uncertainties. These uncertainties must have something to do with the
spread Sȳ in the individual measurements of y and therefore also with the spread of the
points around the line we just found. We will not provide the proof for the following
expressions for the uncertainties Sa and Sb in a and b.
Sa = √( N Sȳ² / D ) = Sȳ √( N / D ),
Sb = √( Sȳ² Σᵢ xi² / D ) = Sȳ √( Σᵢ xi² / D ),   (5.21)

where again D = N Σᵢ xi² − ( Σᵢ xi )² = N Σᵢ (xi − x̄)². We emphasize again that so
far the measurement points yi were averages of measurement series and that they had
an uncertainty Sȳ , which was the standard deviation of the mean. However, it can also
occur that the measurement points (xi , yi ) come from just one measurement instead
of a whole series. We are still assuming that the spread (width of the probability
distribution) is the same for each measurement. In that case the measurements yi don’t
have an uncertainty Sȳ , but Sy (the standard deviation in the separate measurements).
A problem is emerging here. When we had a measurement series, we could determine
the uncertainty of the mean (i.e. the standard deviation of the mean) by using equation
(3.31). The only way to determine the standard deviation and the standard deviation
of the mean is by using a series of measurements. However, now we have only one
measurement per point and from this we cannot determine Sy . We do have a series of
measurements yi , but each point yi has its own true value yi,t . But since we assumed
that all measurements show the same spread, we can still calculate Sy . Analogous to
equation (3.9) we can write
Sy² = (1/N) Σ_{i=1}^{N} (yi − yi,t)²,   (5.22)

where we now have a true value yi,t for each measurement point yi . We also know that
the true values yi,t lie on a straight line (the line that we are looking for). Similar to
section 3.5.1, where we didn’t know the true value xt , but did know an average x̄, we
now don’t know the exact straight line on which the true values yi,t lie, but we have
found an approximation y(x) = ax + b. When we filled in x̄ instead of xt in section
3.5.1, we saw that we had to use a factor 1/(N − 1) instead of 1/N. In the same way we can show that when we fill in axi + b instead of yi,t, the factor 1/N should be changed into 1/(N − 2). We will not prove this here. The 2 in this factor equals the number of unknowns we want to determine (a and b); N − 2 is the so-called number of degrees of freedom. Generally, with p unknowns, we should use the factor 1/(N − p). For example, with a straight line y = ax (in the next section), we have the factor 1/(N − 1).
We can now calculate the term Sy2 , which occurs in the expressions for Sa and Sb , using

Sy² = 1/(N − 2) Σ_{i=1}^{N} ∆yi² = 1/(N − 2) Σ_{i=1}^{N} (yi − (axi + b))².   (5.23)
Summarizing
We saw that we can calculate the best-fitting line through a series of measurements
(xi , yi ) by using equation (5.20). We assumed here that the uncertainties in all xi
are negligible and that those in all yi are equal. When the points yi are averages of
series of measurements, they have an uncertainty Sȳ and then we can calculate the
uncertainties Sa and Sb in a and b, respectively, using equation (5.21). When the points
yi are separate measurements, they have an uncertainty Sy . We can calculate this using
equation (5.23). For the uncertainties Sa and Sb we can still use equation (5.21), but
now with Sy instead of Sȳ .
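As a concrete illustration of this summary, equations (5.20), (5.21) and (5.23) can be put into a short routine. This is our own sketch; the function name `fit_line` and the example data are not part of the lecture notes.

```python
from math import sqrt

def fit_line(x, y):
    """Least-squares fit of y = a*x + b with equal uncertainties in all y.

    Follows equations (5.20), (5.21) and (5.23): the spread Sy is estimated
    from the residuals with a factor 1/(N - 2), as for separate measurements.
    """
    N = len(x)
    Sx = sum(x)
    Sy_sum = sum(y)
    Sxx = sum(xi * xi for xi in x)
    Sxy = sum(xi * yi for xi, yi in zip(x, y))
    D = N * Sxx - Sx ** 2                 # D = N*sum(x^2) - (sum x)^2 > 0
    a = (N * Sxy - Sx * Sy_sum) / D       # equation (5.20)
    b = (Sxx * Sy_sum - Sx * Sxy) / D
    # residual spread Sy^2, equation (5.23)
    S2 = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y)) / (N - 2)
    S_a = sqrt(N * S2 / D)                # equation (5.21)
    S_b = sqrt(S2 * Sxx / D)
    return a, b, S_a, S_b

# Example (made-up data): points exactly on y = 2x + 1 give a = 2, b = 1
# and zero residual spread, hence zero uncertainties.
a, b, S_a, S_b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```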

5.3.2 A straight line through the origin


The expressions above apply to an arbitrary straight line y = ax + b. However, in some
cases the measurement points (xi , yi ) should lie on a line through the origin:
y(x) = ax. (5.24)
The expressions we found earlier then simplify a great deal. Analogous to the earlier
calculations we define
∆yi = yi − y(xi ) = yi − axi (5.25)
and the sum of the quadratic deviations from the straight line

S = Σ_{i=1}^{N} ∆yi² = Σ_{i=1}^{N} (yi − axi)² = f(a).   (5.26)

This S is only a function of one variable (a) and minimizing this is easy:

dS/da = d/da Σᵢ (yi − axi)² = Σᵢ −2xi (yi − axi) = 2a Σᵢ xi² − 2 Σᵢ xi yi = 0
  ⇒ a Σᵢ xi² = Σᵢ xi yi,   (5.27)

from which follows that

a = Σᵢ xi yi / Σᵢ xi².   (5.28)

The uncertainty Sa in this is given by

Sa = Sy / √( Σᵢ xi² ),   (5.29)

where Sy is the standard deviation of the mean (so actually Sȳ ) when the measurement
points yi are averages of series of measurements. So, Sy is given by
Sy² = 1/(N − 1) Σ_{i=1}^{N} ∆yi² = 1/(N − 1) Σ_{i=1}^{N} (yi − axi)²,   (5.30)

when the points yi are separate measurements. Again note that in this expression for Sy² there is a factor 1/(N − 1), whereas in the general case we had a factor 1/(N − 2).
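Equations (5.28)-(5.30) fit in a few lines. A minimal sketch (our own illustration; the function name and data are hypothetical):

```python
from math import sqrt

def fit_through_origin(x, y):
    """Least-squares fit of y = a*x (a straight line through the origin).

    Implements equations (5.28)-(5.30): one fitted parameter, so the
    residual spread uses a factor 1/(N - 1).
    """
    N = len(x)
    Sxx = sum(xi * xi for xi in x)
    Sxy = sum(xi * yi for xi, yi in zip(x, y))
    a = Sxy / Sxx                                          # equation (5.28)
    # residual spread, equation (5.30)
    S2 = sum((yi - a * xi) ** 2 for xi, yi in zip(x, y)) / (N - 1)
    S_a = sqrt(S2) / sqrt(Sxx)                             # equation (5.29)
    return a, S_a

# Example (made-up data): points exactly on y = 2x give a = 2 with zero spread.
a, S_a = fit_through_origin([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```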

5.3.3 The general case: unequal uncertainties Sȳi


In the last section we assumed that all measurement points yi had the same constant
uncertainty Sȳ . Every point yi was an average of a measurement series and Sȳ was the
standard deviation of the mean. However, in many cases the uncertainties Sȳi will not
be the same. The question then is whether or not the equations in the last section
still apply in those cases. The answer to this is: ‘no’. This can be understood quite
easily. Analogous to the way in which we combined measurements of the same quantity
in section 5.2, a measurement point with a small uncertainty is more important in
determining the best-fitting straight line than a point with a big uncertainty. We did
not account for this in our earlier calculations (all uncertainties were equal, so we didn’t
have to). After all, we minimized the sum of the quadratic deviations from the straight
line in equation (5.15), and there all points have an equal weighting factor. It seems obvious that, just like in section 5.2, we should use weighting factors Gi that will look something like Gi = 1/Sȳi². We will demonstrate this.
We are going to use a series of measurements (xi , yi ) in which the uncertainties in xi
are negligible and the measurement points yi have uncertainties Si (we use Si instead
of Sȳi for convenience). We will be looking for the best-fitting line
y(x) = ax + b (5.31)
to these points. Every measured yi has a true value yi,t that should lie on the straight
line, so yi,t = axi + b. The probability Pi of finding a measurement point yi with a true
value yi,t is
Pi = 1/(Si √(2π)) exp( −(yi − yi,t)²/(2Si²) ) = 1/(Si √(2π)) exp( −(yi − y(xi))²/(2Si²) ),   (5.32)
where we have assumed that the averages yi have a Gaussian distribution with a width
Si around a true value yi,t (see figure 3.5). The total probability that a line with
unknowns a and b is the line that we are looking for, i.e. the one belonging to the whole
set of measurement points (xi , yi ), is the product of all individual probabilities Pi :
P(a, b) = P1 P2 ⋯ PN = Π_{i=1}^{N} Pi.   (5.33)

By filling in the expression we found earlier for Pi, we get

P(a, b) = Π_{i=1}^{N} [ 1/(Si √(2π)) exp( −(yi − y(xi))²/(2Si²) ) ]
        = [ Π_{i=1}^{N} 1/(Si √(2π)) ] exp( −½ Σ_{i=1}^{N} (yi − y(xi))²/Si² ).   (5.34)
Next, we define χ² (‘chi squared’) as

χ² = Σ_{i=1}^{N} (yi − y(xi))² / Si²,   (5.35)

which allows us to write the probability as

P(a, b) = [ Π_{i=1}^{N} 1/(Si √(2π)) ] exp( −½ χ² ).   (5.36)
Note that the factor Πᵢ 1/(Si √(2π)) doesn't depend on the straight line that we choose, since it only contains the uncertainties Si in the measurement points. In other words, this factor is constant. The factor exp(−½ χ²) does depend on the straight line that we choose, i.e. on the choice of the parameters a and b. We now choose the straight line that has the largest total probability of being the right one, which boils down to maximizing P(a, b). However, when P(a, b) is maximal, χ² is minimal. So, maximizing P(a, b) can be done by minimizing χ². Note that this is similar to minimizing S = Σᵢ ∆yi², as we did in section 5.3.1, but now with weighting factors 1/Si² (since χ² = Σᵢ ∆yi²/Si²).
Solving this also goes analogously to what we did earlier, but the result looks a little more complicated:

a = ( Σᵢ 1/Si² Σᵢ xi yi/Si² − Σᵢ xi/Si² Σᵢ yi/Si² ) / D,
b = ( Σᵢ xi²/Si² Σᵢ yi/Si² − Σᵢ xi/Si² Σᵢ xi yi/Si² ) / D,   (5.37)

where D = Σᵢ 1/Si² Σᵢ xi²/Si² − ( Σᵢ xi/Si² )². Note that when all uncertainties Si are equal, the fractions in equation (5.37) can be divided out, which gives the ‘old’ result from section 5.3.1. The equations for the uncertainties Sa and Sb in a and b, respectively, also look slightly different from what we found in section 5.3.1:

Sa = √( Σᵢ 1/Si² / D ),
Sb = √( Σᵢ xi²/Si² / D ),   (5.38)

where again D = Σᵢ 1/Si² Σᵢ xi²/Si² − ( Σᵢ xi/Si² )². Here we can also find the old result by setting all uncertainties Si equal. This whole approach only works when all uncertainties Si (so for each i) are known. That is only the case when each point yi and its uncertainty Si = Sȳi are determined from a measurement series. So, this method does not work when the points yi are separate measurements.
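The weighted (‘chi-squared’) recipe of equations (5.37) and (5.38) can be sketched as follows. This is our own illustration; the function name and example data are hypothetical.

```python
from math import sqrt

def fit_line_weighted(x, y, s):
    """Chi-squared fit of y = a*x + b with known uncertainties s[i] = S_i.

    Implements equations (5.37) and (5.38); each point carries a weight
    1/S_i^2, so precise points pull harder on the fitted line.
    """
    w = [1.0 / si ** 2 for si in s]
    S1 = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    D = S1 * Sxx - Sx ** 2
    a = (S1 * Sxy - Sx * Sy) / D          # equation (5.37)
    b = (Sxx * Sy - Sx * Sxy) / D
    S_a = sqrt(S1 / D)                    # equation (5.38)
    S_b = sqrt(Sxx / D)
    return a, b, S_a, S_b

# Example (made-up data): with equal uncertainties the weights divide out
# and the unweighted result of section 5.3.1 is recovered (a = 2, b = 1).
a, b, S_a, S_b = fit_line_weighted([0.0, 1.0, 2.0, 3.0],
                                   [1.0, 3.0, 5.0, 7.0],
                                   [1.0, 1.0, 1.0, 1.0])
```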
5.4 Other relationships between the points (xi, yi)

In the last section we fitted a straight line through the measurement points (xi , yi ).
One can of course imagine that the relationship between the points xi and yi (or the
model) is not linear. In this case we should use other techniques. We will demonstrate
this in the coming sections.

5.4.1 Relationships that can be made linear

For example, when the points (xi , yi ) follow the relationship (model)

y = a ebx , (5.39)

it is not only possible to plot yi as a function of xi (which will not give us a straight
line), but also to plot the natural logarithm ln(yi ) as a function of xi . After all, the
equation above can be written as

ln(y) = bx + ln(a). (5.40)

This does give us a straight line with slope b and offset ln(a). So when we plot ln(yi )
as a function of xi , we can use the method from the last section and fit a straight line.
However, this does not instantly solve our problem. First of all, we have a dimension
problem. The measured quantity y has a dimension and in the equation above we just
took the logarithm of this. So, we should always be cautious and realize that we are
looking at a number and then the logarithm of this. The second problem is worse
though. The condition for using the ‘simple’ method of least squares from section 5.3.1
was that the uncertainties in y were equal for all yi . When we take the logarithm of yi ,
this will not be the case anymore. After all, according to §3.8 the uncertainty in ln(y)
is given by
Sln(y) = ∂ln(y)/∂y · Sy = Sy / y,   (5.41)
and this isn’t necessarily constant, since y can be different for each measurement point.
We can then use the simple method of least squares only to obtain an indication of the best-fitting
straight line. From the equation above, it is obvious that, when Sy is constant, the
uncertainty Sln(y) is big for small values of y and small for big values of y. The points
(xi , yi ) that have a small Sln(yi ) then have much more influence on the position of the
best-fitting line than the points with big Sln(y) . If we want to do it the right way, we
should use the more extended equations from section 5.3.3.
The trick we used here to bring points that do not lie on a straight line onto one can be applied quite often, not only in the case of an exponential function. In all cases one should be cautious with dimensions, and the uncertainty in the y-values will no longer be constant.
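The linearization above can be tried out in a few lines. This is our own sketch with made-up data; for brevity it uses the plain least-squares result of section 5.3.1 only as the ‘indication’ mentioned above, not the weighted method of section 5.3.3.

```python
from math import exp, log

# Hypothetical noise-free data that follow y = a*exp(b*x) with a = 3, b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0 * exp(0.5 * xi) for xi in xs]

# Plot/fit ln(y) against x: the slope is b and the offset is ln(a).
lny = [log(yi) for yi in ys]
N = len(xs)
D = N * sum(x * x for x in xs) - sum(xs) ** 2
b = (N * sum(x * v for x, v in zip(xs, lny)) - sum(xs) * sum(lny)) / D
ln_a = (sum(lny) - b * sum(xs)) / N
a = exp(ln_a)   # back-transform the offset to recover the prefactor
```

With exact input the slope b = 0.5 and prefactor a = 3 are recovered; with real, noisy data the caveat about unequal uncertainties Sln(y) applies.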
5.4.2 The method of least squares for non-linear relationships
As an example we take the position of a falling object (in vacuum). Theoretically, the
position y can be described by
y = y0 + v0 t + ½ g t²,   (5.42)
where y0 is the initial position (at time t = 0), v0 the initial velocity and g the gravita-
tional acceleration. The measurement points (ti , yi ) obviously do not lie on a straight
line, but on a parabola. It is the term v0 t that puts a spanner in the works here. If this
term wasn’t there, we could simply plot yi as a function of t2i to get a straight line. Try

to figure out for yourself why this works better than plotting yi as a function of ti. So in
this case we should use another method.
We could try to modify the method of least squares for a parabolic curve, but a more
general case is that of an M th-degree polynomial
y = a0 + a1 x + a2 x2 + ... + aM xM . (5.43)

Our job is now to find the parameters a0 , ..aM such that the curve fits best to our data.
The square of the difference ∆yi between the measurement values yi and the polynomial
fit can be defined analogous to the linear case:
∆yi² = ( yi − a0 − a1 xi − a2 xi² − ... − aM xi^M )²,   (5.44)
and the sum S of these quadratic deviations is given by
S = Σᵢ ∆yi² = Σᵢ ( yi − a0 − a1 xi − a2 xi² − ... − aM xi^M )².   (5.45)

This S can again be minimized by setting


∂S/∂a0 = ∂S/∂a1 = ... = ∂S/∂aM = 0.   (5.46)
This gives the following system of M + 1 linear equations with M + 1 unknowns (a0, ..., aM):

⎡ N          Σ xi         Σ xi²        ...  Σ xi^M      ⎤ ⎡ a0 ⎤   ⎡ Σ yi       ⎤
⎢ Σ xi       Σ xi²        Σ xi³        ...  Σ xi^(M+1)  ⎥ ⎢ a1 ⎥   ⎢ Σ xi yi    ⎥
⎢ ⋮          ⋮            ⋮                 ⋮           ⎥ ⎢ ⋮  ⎥ = ⎢ ⋮          ⎥   (5.47)
⎢ Σ xi^k     Σ xi^(k+1)   Σ xi^(k+2)   ...  Σ xi^(k+M)  ⎥ ⎢ ak ⎥   ⎢ Σ xi^k yi  ⎥
⎢ ⋮          ⋮            ⋮                 ⋮           ⎥ ⎢ ⋮  ⎥   ⎢ ⋮          ⎥
⎣ Σ xi^M     Σ xi^(M+1)   Σ xi^(M+2)   ...  Σ xi^(2M)   ⎦ ⎣ aM ⎦   ⎣ Σ xi^M yi  ⎦
where we have already used matrix notation. Check this yourself. Mathematics tells us
that this is quite easily solved. We won’t go into the details here.
Mathematics also tells us that many functions can be written as a polynomial (the
so-called series expansion). In principle the degree of such polynomials (M ) is infinite,
but the coefficients ai usually get negligibly small for large i. We can approximate these
functions with an M th-degree polynomial with a finite M . In that case we can use the
method from this section to calculate the coefficients ai that aren’t negligible.
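The normal equations (5.47) can be built and solved directly. Below is our own pure-Python sketch (function name and example data are hypothetical); in practice one would hand the system to a linear-algebra library.

```python
def fit_polynomial(x, y, M):
    """Least-squares fit of an M-th degree polynomial via the normal
    equations (5.47): build the (M+1)x(M+1) system of power sums and
    solve it with Gaussian elimination (partial pivoting)."""
    n = M + 1
    A = [[sum(xi ** (r + c) for xi in x) for c in range(n)] for r in range(n)]
    rhs = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(n)]
    # forward elimination
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    # back substitution
    a = [0.0] * n
    for r in range(n - 1, -1, -1):
        a[r] = (rhs[r] - sum(A[r][c] * a[c] for c in range(r + 1, n))) / A[r][r]
    return a  # coefficients a0 .. aM

# Example (made-up data): an exact parabola y = 1 + 2x + 3x^2 is recovered.
coef = fit_polynomial([0.0, 1.0, 2.0, 3.0, 4.0],
                      [1.0, 6.0, 17.0, 34.0, 57.0], M=2)
```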
5.5 Exercises and problems

5.5.1 Questions
1. The charge of an electron is determined in three different ways (using 3 different
measurement series). The results are:
Q1 = (1.58 ± 0.04) · 10−19 C,
Q2 = (1.64 ± 0.09) · 10−19 C,
Q3 = (1.60 ± 0.03) · 10−19 C.
Combine these three results to a single end result and calculate the uncertainty
therein.
2. A vehicle passes 4 different positions xi with a constant speed v. The positions
are known to an accuracy of 0.1 m (68% confidence interval). To determine the
speed v of the vehicle, the time instants at which it passes the positions xi are
measured. These times ti are measured with an accuracy of 0.5 s (again, 68%
confidence interval). The measurement results are:
xi (m) 0 1000 2000 3000
ti (s) 12.4 37.2 60.9 84.9

(a) In order to determine the speed using the least-squares method, what would
you plot along the x-axis and what on the y-axis?

An alternative method of determining v is to take the average speeds when passing


the different positions xi (so, for each position dividing the distance covered by
the time taken), and combining the obtained speeds vi in a correct way to get a
final answer.
(b) Calculate v and its uncertainty using this alternative method. Present your
final answer in a correct way.
3. To determine the spring constant k of a spring, the spring is loaded with different
masses m and the resulting length l of the spring is measured. The uncertainty
in the masses is Sm = 0.02 g, and that in the lengths is Sl = 0.3 cm. The results
of the measurements are
m (g) 200 300 400 500 600 700 800 900
l (cm) 5.1 5.5 5.9 6.8 7.4 7.5 8.6 9.4
Because it holds that mg = k(l − l0), where g is the acceleration due to gravity and l0 is the length of the unloaded spring, the line l = l0 + mg/k has to be fitted through the measurement points as well as possible. Determine l0, k and the uncertainty in both.
4. A train passes four different positions x with a speed v, assumed constant. De-
termine v and the uncertainty therein. The data are
x (m) 0 1000 2000 3000
t (s) 17.6 40.4 67.7 90.1
5. The rate A at which a radioactive sample decays, decreases according to
A = A0 e^(−t/τ),
with τ the average lifetime of the sample. The following measurement results are
obtained (a.u.=arbitrary units)
t (hour) 0 1 2 3
A (a.u.) 13.8 7.9 6.1 2.9
Give the best estimate for τ .
6. If y = Af (x) + Bg(x), with A and B unknown parameters and f and g known
functions (e.g., f (x) = cos(x) and g(x) = sin(x)), and N measurement points
{xi , yi } are available, we find the best estimates of A and B from

A Σᵢ {f(xi)}² + B Σᵢ f(xi) g(xi) = Σᵢ yi f(xi),
A Σᵢ f(xi) g(xi) + B Σᵢ {g(xi)}² = Σᵢ yi g(xi).

A, B, SA and SB follow from

A = ( Σ yi f(xi) Σ {g(xi)}² − Σ yi g(xi) Σ f(xi)g(xi) ) / D,
B = ( Σ {f(xi)}² Σ yi g(xi) − Σ yi f(xi) Σ f(xi)g(xi) ) / D,
SA² = Sy² Σ {g(xi)}² / D,   SB² = Sy² Σ {f(xi)}² / D,

where

D = Σ {f(xi)}² Σ {g(xi)}² − { Σ f(xi) g(xi) }²   and
Sy² = 1/(N − 2) Σᵢ { yi − A f(xi) − B g(xi) }².

Apply these expressions to the following problem.


The height y of a mass loading a vertical spring is given by
y = A cos(ωt) + B sin(ωt) = y0 sin(ωt + φ).
With a fast camera, an observer finds for 5 ‘equidistant’ time instants the following
heights y:
t (s) -0.4 -0.2 0.0 0.2 0.4
y (cm) 3 -16 6 9 -8
Measurements also show that ω = 10 rad/s to a very high precision. If Sȳ ≈ 2
cm, do the data lie acceptably on the known curve? First plot the points and try
to fit a sine through them as well as possible. Then determine y0 and φ.
5.5.2 Problems
1. An experimenter measures the light intensity of a light beam which he attenuates
(weakens) using plates of absorbing material. He uses four plates with different
thicknesses. The results are given in the table below. All intensity measurements
in this table are the result of a measurement series. The uncertainties in them are
therefore 68%-intervals. The uncertainties in the thicknesses are 100%-intervals.

Thickness (mm) Intensity (W/m2 )


0.500 ± 0.005 9.2 ± 0.9
0.995 ± 0.005 4.7 ± 0.5
2.050 ± 0.005 0.81 ± 0.08
2.750 ± 0.005 0.31 ± 0.03
The light intensity I as function of the thickness d of the absorbing material is
given by
I = I0 exp (−Cd) ,
where I0 is the intensity of the unattenuated beam and C is the absorption coefficient.

(a) How can a linear fit be used to determine I0 and C? In other words, what should you plot against what to get a linear relation?
(b) What are the conditions under which the least-squares method (for which the expressions are given in Appendix G of the lecture notes) is valid?
(c) Make a correct graph of the measurement results showing a linear relation,
according to the rules.
(d) Determine visually (from the graph) the unknowns I0 and C.
(e) Determine using the least-squares method the unknowns I0 and C and their
uncertainties.

2. An electron can be bound to a proton in different ways (forming a hydrogen


atom). During an experiment, the binding energies of the electron to the proton
are measured. A theoretical model from J.R.Rydberg around 1890 predicts that
the binding energy is given by:
En = C / n²,
where En is the binding energy of the electron to the proton in units of electron
volts (eV) , and n is a positive integer (n = 1, 2, 3, ....). According to Rydberg’s
calculations, C = 13.61 eV.
An experimenter obtains the following measurement results:
E1 = 13.7 eV
E2 = 3.7 eV
E3 = 1.3 eV
E4 = 1.1 eV
E5 = 0.4 eV
The uncertainty in all measurements is 0.3 eV (68%-intervals).

(a) How would you plot the data in a graph to determine C from a linear rela-
tion? Should this line pass through the origin?
(b) Make a clear graph of the measurement results showing a linear relation,
according to the rules.
(c) What is the (inevitable) disadvantage of this way of plotting?
(d) From the graph, estimate C and its uncertainty (so not using the least-
squares method).
(e) Calculate C and its uncertainty using the least-squares method.
(f) Is the experimentally found value of C consistent with the value predicted
by Rydberg? Explain your answer.
(g) An alternative method is to calculate a value of C for each measurement
result (using C = n2 En ), and combining the 5 obtained values of C into one
final result. Do so and indicate how the uncertainties in the values of C are
calculated, and how the uncertainty of the end result is determined.
(h) In the previous question, which of the measured energies was most important
in determining the final answer? Can you also explain this using the graph
from question (b)?

3. In an experiment, the voltage Vi coming from a measurement device is measured


at time instants ti . The observed signals schematically look as shown in the
image below. In reality however, there are a lot more data points.

[Schematic plot of the voltage V against the time t (ns), with t running from 0 to 50 ns.]
At first (until t ≃ 10 ns) the signal level is constant. From t ≃ 10 ns to t = 50 ns, theory predicts that the voltage will follow V(ti) = Vb − V0 exp(−(ti − t0)/τ), where V0 is the (theoretical) voltage at time t0. Here, t0 can be freely chosen in the mentioned
interval. From the measured voltages, τ should be determined. Each voltage Vi is
measured exactly once, hence the constant voltage Vb is determined from a large
amount of measurements. The time instants ti are very precisely determined,
hence the uncertainty in them is negligible. To avoid problems with dimensions,
we divide the above expression by a reference voltage Vref , so it becomes
 
V(ti)/Vref = Vb/Vref − (V0/Vref) exp(−(ti − t0)/τ).
As a reference voltage, we can for example take Vref = 1 V.
Somewhere between t = 0 ns and t = 10 ns, an interval is chosen to determine Vb .
Then, between t = 10 ns and t = 50 ns an interval is chosen in which the measured
voltages are manipulated such that a linear relation is obtained, from which the
decay time τ can be determined. t0 is chosen as a lower bound to this interval,
and tn is chosen as an upper bound (so it contains a total of n+1 measurement
points).

(a) What are the two unknowns (parameters to be fitted) in the second part (t
between 10 ns and 50 ns)?
(b) What do you plot against what to get a linear relation in the second part
(t between 10 ns and 50 ns)? How can the two unknowns from part (a) be
calculated from this plot (only indicate the method)?
(c) Why is it necessary that Vb be determined very accurately?
(d) In part (b), why can the upper bound tn of the time interval not be taken at
50 ns, and should tn in fact be significantly smaller? From the plot, estimate
how large tn may be.
(e) We know that the spread (standard deviation) SV of all measured voltages
Vi (so in both intervals) is constant; however, how large it is is unknown.
If we assume that the uncertainty SVb in Vb is sufficiently small when it is
smaller than 10% of the accuracy SV of the individual voltage measurements,
how many points are needed in determining Vb ?
(f) What is the accuracy Syi of the y-quantity in the linear graph of exercise (b)? Express it in terms of SV. Assume that SVb is negligible.
(g) When fitting a straight line, an expression is minimised. Which expression
is minimized in this case?
(h) The expressions for a straight-line-fit with unequal uncertainties are given in
appendix G of the lecture notes. Is it necessary to know SV in order to use
these formulas?

4. Using a Geiger-Müller counter, measurements are performed on a radioactive


sample. With this instrument, pulses are counted during some time interval. We
know that the chance P (n) of measuring n pulses during this time interval is equal
to
P(n) = (µ^n / n!) exp(−µ),
where µ is the average amount of pulses that would be measured if the experiment
were repeated an infinite amount of times. In other words, µ = ε ⟨n⟩.

(a) Using the expression for the Poisson distribution, show that µ is indeed the
average amount of pulses for an infinite amount of measurements, so that
indeed ε ⟨n⟩ = µ.

Hint: Σ_{n=0}^{∞} µ^n / n! = exp(µ).

The measurements are performed for different time intervals T . Each time interval
T of course has a different average µ (T ). Naturally, µ (T ) ∝ T . For each time
interval, N measurements are performed where N is relatively large. The average
n̄ of such a series of N measurements is (hopefully) a decent approximation of
the real average µ (T ) for that time interval. Because we are performing real
measurement series (at different time intervals), we can calculate their standard
deviations. For a Poisson distribution, we know that theoretically, the variance
var⟨n⟩ of the amount of measured pulses is equal to the average µ. To check this,
we compare the measured standard deviations in the different measurement series
(at different time intervals) to their theoretical values.

(b) What is the theoretical relation between the standard deviation σ and the
average µ?

The measurement results are given in the table below. Here, n̄ is the average
amount of pulses of the measurement series and S is the standard deviation as
determined from the series. All measurement series consisted of N = 30 measure-
ments. We call the separate measurements of each series ni (i = 1..30).
Time interval T (s) n̄ S
10 71 8.4
20 145 11.2
30 217 13.3
40 293 16.4
50 357 19.5

(c) Using which formula were the S-values in the table determined?
(d) If the measured averages n̄ approximate µ (T ), what is their theoretical un-
certainty? Only give an expression. Express this uncertainty in terms of µ
(≈ n̄). The answer cannot contain the separate measurements ni.

The uncertainties in the measured standard deviations can be calculated using SS/S = 1/√(2(N − 1)). Note that the relative uncertainty in S is the same for each measurement series.

(e) If we want to plot the measured standard deviations and the measured av-
erage amounts of pulses in some way to check the theoretical relation from
part (b) using the least-squares method, what will you plot against what?
Explain your answer.
(f) Should the straight line from the previous question pass through the origin
or not?
(g) What are the uncertainties in the quantities plotted along the x-axis and the
y-axis, respectively?
(h) Make a clear graph, following the rules.
(i) Using the least-squares method, determine the slope of the graph and its
uncertainty.
(j) Is the calculated slope consistent with theory?

5. Hint: The derivative of tan α is 1/ cos2 α.


The height H of a tower is measured at different distances D from the foot of
the tower, by using a telescope to measure the angle α between the line connect-
ing the top of the tower and the point of observation, and the horizontal plane.
Schematically, the set-up looks as in the image below. The telescope is mounted
on a tripod at a height of H0 = 1.8 m exactly (so the uncertainty is negligible).

[Sketch: the telescope, mounted at height H0, views the top of the tower (height H) at distance D under an angle α with the horizontal plane.]

Of course tan(α) = (H − H0)/D holds. The measurements are given in a table:
Di (m) αi (degree)
50.0 ± 0.5 63 ± 2
100.0 41
150.0 32
200.0 26
250.0 20

The uncertainties in all distances Di are equal and those in all measured angles
αi are also equal.

(a) If you want to determine the height H using a straight line fit, what do you
plot against what? Clearly indicate what is plotted along the x-axis and
what on the y-axis, and explain why.
(b) Which of the formulas in appendix G of the lecture notes is used to find the
best-fitting straight line? Explain why.
(c) How can this/these formula(s) be used to determine the height and its uncertainty? Give, if necessary, the formula for the uncertainty SH in H.
(d) Make a correct graph showing a linear relation, according to the rules.
(e) Determine graphically (from the graph, without calculating), the height of
the tower and the uncertainty therein.
(f) Determine by calculation the height of the tower and the uncertainty therein.

A. Proof that ∫_{−∞}^{+∞} e^(−x²) dx = √π

In section 3.4 we claimed that the integral


∫_{−∞}^{+∞} e^(−x²) dx = √π.   (A.1)

We will now provide the proof for this. Since the function e^(−x²) has no primitive function, we cannot directly calculate the integral, so we will have to use a trick. We will call the integral I for now, so

I = ∫_{−∞}^{+∞} e^(−x²) dx.   (A.2)

The square of this integral is


I² = ∫_{−∞}^{+∞} e^(−x²) dx ∫_{−∞}^{+∞} e^(−y²) dy = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} e^(−(x²+y²)) dx dy.   (A.3)

Subsequently we will change to polar coordinates (r, φ) instead of using Cartesian co-
ordinates (x, y). Note that x2 + y 2 = r2 and that dx dy = r dr dφ, so the expression
becomes
I² = ∫_0^{2π} dφ ∫_0^{+∞} e^(−r²) r dr.   (A.4)

Since the integral covers the whole xy-plane, we have to integrate r from 0 to ∞ and
φ from 0 to 2π. The obtained integral we can evaluate analytically, since the function
r e^(−r²) does have a primitive function (−½ e^(−r²)). The first part (the integral over φ) simply gives us a constant 2π. This means that

I² = 2π [ −½ e^(−r²) ]_0^∞ = 2π (0 + ½) = π.   (A.5)

This is the square of the integral that we were originally trying to calculate, so
∫_{−∞}^{+∞} e^(−x²) dx = √π.   (A.6)

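As a numerical sanity check of this result (our own illustration, not part of the proof), one can integrate e^(−x²) over a wide but finite interval with Simpson's rule:

```python
from math import exp, pi, sqrt

def simpson(f, lo, hi, n=20000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(lo + k * h)
    return s * h / 3.0

# e^(-x^2) is utterly negligible beyond |x| ~ 10, so [-10, 10] suffices.
I = simpson(lambda x: exp(-x * x), -10.0, 10.0)
# I agrees with sqrt(pi) to many decimal places.
```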
B. Combining measurements of the same
quantity

In this appendix we will show that the equations (5.10) until (5.12) are valid in the gen-
eral case, which means that they are also valid when the acquired measurement results
are obtained with different measurement methods. We will consider N measurement
series (independent) with results x̄i ± Sx̄i . We will assume that the measurement results
of all series show a Gaussian spread (and according to §3.5.2 the means do as well). We
now want to combine these N results to one ‘average’ x̄. The probability pi (x̄) dx that
the true value lies within an interval dx around x̄ is for series i given by the probability
density function !
1 (x̄ − x̄i )2
pi (x̄) = √ exp − (B.1)
Sx̄i 2π 2Sx̄2i

The total probability density p (x̄) of all measurement series combined, is the product
of all pi (x̄):
p(x̄) = Π_{i=1}^{N} pi(x̄) = [ Π_{i=1}^{N} 1/(Sx̄i √(2π)) ] exp( − Σ_{i=1}^{N} (x̄ − x̄i)²/(2Sx̄i²) ).   (B.2)

We should choose the mean x̄ in such a way that this probability density is maximal. So
the mean x̄ that we are looking for has to give a maximum probability density p(x̄),
so
    dp(x̄)/dx̄ = 0.    (B.3)
Since p (x̄) is of the form C exp (f (x̄)), we can state that (chain rule)

    df(x̄)/dx̄ = 0 ⇒ Σ_i (x̄ − x̄_i)/S_x̄i² = 0.    (B.4)

From this it simply follows that

    x̄ = ( Σ_i x̄_i/S_x̄i² ) / ( Σ_i 1/S_x̄i² ),    (B.5)
with which equation (5.10) is proven to be correct.
The uncertainty Sx̄ of this result can be calculated quite easily. Applying the general
calculation rule (3.43) gives
    S_x̄² = Σ_i ( ∂x̄/∂x̄_i )² S_x̄i².    (B.6)
The partial derivative in this equation is equal to

    ∂x̄/∂x̄_i = (1/S_x̄i²) / ( Σ_j 1/S_x̄j² )    (B.7)

and thus

    S_x̄² = Σ_i [ (1/S_x̄i²) / ( Σ_j 1/S_x̄j² ) ]² S_x̄i² = ( Σ_i 1/S_x̄i² ) / ( Σ_j 1/S_x̄j² )² = 1 / ( Σ_i 1/S_x̄i² ),    (B.8)

which means that


    S_x̄ = 1 / sqrt( Σ_i 1/S_x̄i² ),    (B.9)

with which equation (5.12) is proven to be correct.
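Equations (B.5) and (B.9) translate directly into code. A minimal sketch in NumPy (the function name and the example numbers are our own):

```python
import numpy as np

def combine_measurements(means, sigmas):
    """Weighted mean (B.5) and its uncertainty (B.9), weights w_i = 1/S_x̄i²."""
    means = np.asarray(means, dtype=float)
    w = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    return np.sum(w * means) / np.sum(w), 1.0 / np.sqrt(np.sum(w))

# Example: two independent results, the second twice as uncertain.
m, s = combine_measurements([9.81, 9.87], [0.02, 0.04])
print(m, s)
```

As expected, the combined mean lies closest to the more precise measurement, and the combined uncertainty is smaller than either individual one.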

C. The uncertainty in the method of least
squares

In this appendix we will provide a derivation for the uncertainties Sa and Sb in the
coefficients a and b of the best-fitting straight line y = ax + b for a set of data points
(xi , yi ). We will do this for the general case of §5.3.3, in which all data points yi (in
theory) have different uncertainties Si = Sȳi . The coefficients a and b themselves are
given by equation (5.37)
The uncertainties we find with the help of the general calculation rule (3.43):

    S_a² = Σ ( ∂a/∂y_i )² S_i² ;
    S_b² = Σ ( ∂b/∂y_i )² S_i² .    (C.1)

From equation (5.37) we can calculate the partial derivative ∂a/∂y_i:

    ∂a/∂y_i = (1/D) [ (x_i/S_i²) Σ_j 1/S_j² − (1/S_i²) Σ_j x_j/S_j² ],    (C.2)

where D = ( Σ 1/S_i² )( Σ x_i²/S_i² ) − ( Σ x_i/S_i² )². Inserting this in equation (C.1) gives
    S_a² = Σ_i (1/D²) [ (x_i/S_i²) Σ_j 1/S_j² − (1/S_i²) Σ_j x_j/S_j² ]² S_i²    (C.3)
        = (1/D²) [ ( Σ_i 1/S_i² )² ( Σ_i x_i²/S_i² ) − ( Σ_i x_i/S_i² )² ( Σ_i 1/S_i² ) ]
        = (1/D²) ( Σ_i 1/S_i² ) [ ( Σ_i 1/S_i² )( Σ_i x_i²/S_i² ) − ( Σ_i x_i/S_i² )² ]
        = ( Σ_i 1/S_i² ) / D.

This is the result of equation (5.38).


For the partial derivative ∂b/∂y_i we can say that

    ∂b/∂y_i = (1/D) [ (1/S_i²) Σ_j x_j²/S_j² − (x_i/S_i²) Σ_j x_j/S_j² ].    (C.4)

Again inserting this in equation (C.1) gives


    S_b² = Σ_i (1/D²) [ (1/S_i²) Σ_j x_j²/S_j² − (x_i/S_i²) Σ_j x_j/S_j² ]² S_i²    (C.5)
        = (1/D²) [ ( Σ_i 1/S_i² )( Σ_i x_i²/S_i² )² − ( Σ_i x_i/S_i² )² ( Σ_i x_i²/S_i² ) ]
        = (1/D²) ( Σ_i x_i²/S_i² ) [ ( Σ_i 1/S_i² )( Σ_i x_i²/S_i² ) − ( Σ_i x_i/S_i² )² ]
        = ( Σ_i x_i²/S_i² ) / D.
This is again the result of equation (5.38).
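The closed-form sums of equations (5.37) and (5.38) derived in this appendix can be implemented in a few lines. A minimal sketch (the function name is ours):

```python
import numpy as np

def weighted_line_fit(x, y, s):
    """Best fit y = a x + b for points with individual uncertainties S_i,
    using the closed-form sums of this appendix (equations 5.37 and 5.38)."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    w = 1.0 / np.asarray(s, float) ** 2
    S1, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
    Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
    D = S1 * Sxx - Sx**2
    a = (S1 * Sxy - Sx * Sy) / D
    b = (Sxx * Sy - Sx * Sxy) / D
    return a, b, np.sqrt(S1 / D), np.sqrt(Sxx / D)

# Points on y = 2x + 1 with equal uncertainties reproduce a = 2, b = 1 exactly.
a, b, Sa, Sb = weighted_line_fit([0, 1, 2, 3], [1, 3, 5, 7], [1, 1, 1, 1])
print(a, b, Sa, Sb)
```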

D. Overview of formulas and distribution
functions

On the next page we have provided an overview of the formulas that can be utilized for
continuous and discrete probability distributions and measurements.

Row | Continuous probabilities | Discrete probabilities | Measurements
Measurement quantity | x | x_k (k = 1..M) | measurement values x_i (i = 1..N)
Frequency distribution | — | F(x_k) = N P_k = N p_k Δx | F(x_k) (k = 1..M) (for discrete outcomes x_k)
Frequency density | — | F(x_k) ≈ N p(x_k) | F(x_k) = M_k/Δx (M_k = number of measurements in interval Δx around x_k)
Probability density | p(x) | p_k = p(x_k) | p_k = F(x_k)/N = (1/Δx)(M_k/N)
Probability | P(a ≤ x_i ≤ b) = ∫_a^b p(x) dx | P_k = p_k Δx (continuous quantities); P_k = F(x_k)/N (discrete quantities); P(a ≤ x_i ≤ b) = Σ_{k=k_a}^{k_b} P_k | P_k = M_k/N = p_k Δx ≈ F(x_k)Δx/N; P(a ≤ x_i ≤ b) = Σ_{k=k_a}^{k_b} P_k
Expectation value | ε⟨x⟩ = ∫_{−∞}^{∞} x p(x) dx; ε⟨f(x)⟩ = ∫_{−∞}^{∞} f(x) p(x) dx | ε⟨x⟩ = Σ_{k=1}^{M} x_k P_k; ε⟨f(x)⟩ = Σ_{k=1}^{M} f(x_k) P(x_k) | ε⟨x⟩ = (1/N) Σ_{i=1}^{N} x_i; ε⟨f(x)⟩ = (1/N) Σ_{i=1}^{N} f(x_i)
Mean | x̄ = ε⟨x⟩ = ∫ x p(x) dx | x̄ = ε⟨x⟩ = Σ_{k=1}^{M} x_k P_k | x̄ = (1/N) Σ_i x_i ≈ Σ_{interv. k} x_k M_k/N = Σ_{interv. k} x_k p_k Δx
Variance | σ² = var⟨x⟩ = ε⟨(x − x_t)²⟩ = ∫_{−∞}^{∞} (x − x_t)² p(x) dx | σ² = var⟨x⟩ = Σ_{k=1}^{M} (x_k − x_t)² P(x_k) | S² = var⟨x⟩ = (1/N) Σ_{i=1}^{N} (x_i − x_t)² ≈ (1/(N−1)) Σ_{i=1}^{N} (x_i − x̄)²

E. Overview of distribution functions

In this appendix we will list the three distribution functions from chapter 4.
Gaussian distribution:
    Probability density: p(x) = 1/(σ√(2π)) exp( −(x − µ)²/(2σ²) )
    Mean: x̄ = µ;  Expectation value: ε⟨x⟩ = x̄ = µ;  Variance: var⟨x⟩ = σ²
Binomial distribution:
    Probability: P_N(n) = (N choose n) pⁿ (1 − p)^{N−n}
    Mean: n̄ = Np;  Expectation value: ε⟨x⟩ = n̄ = Np;  Variance: var⟨x⟩ = σ² = Np(1 − p)
Poisson distribution:
    Probability: P(n) = (µⁿ/n!) e^{−µ}
    Mean: n̄ = µ;  Expectation value: ε⟨x⟩ = n̄ = µ;  Variance: var⟨x⟩ = σ² = µ
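The Poisson column is the limit of the binomial column for large N and small p with µ = Np fixed. This can be illustrated with a short standard-library sketch (the function names are ours):

```python
from math import comb, exp, factorial

def binomial_pmf(n, N, p):
    # P_N(n) = (N choose n) p^n (1-p)^(N-n)
    return comb(N, n) * p**n * (1 - p) ** (N - n)

def poisson_pmf(n, mu):
    # P(n) = (mu^n / n!) e^(-mu)
    return mu**n * exp(-mu) / factorial(n)

# For large N and small p with µ = Np fixed, the binomial tends to the Poisson.
mu = 2.0
for N in (10, 100, 1000):
    print(N, binomial_pmf(3, N, mu / N), poisson_pmf(3, mu))
```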

F. χ2 test of a distribution function

In chapter 4 we have discussed several distribution functions that we often encounter


while doing experiments. However, we will not always know whether or not the measurement
results (have to) follow a certain distribution function. Therefore, it is sometimes
useful to check whether or not the distribution function we assume to be relevant (the
fit) accurately describes the observed behavior. For this, we need a test for our fit.
An example is depicted in figure F.1.

[Figure: histogram h(x_k) with the assumed probability density p(x) and the deviations h(x_k) − N P(x_k), at values x_1, x_2, …, x_k]

Figure F.1: Test of a distribution function.

The measured quantity x is continuous. The measurements have been plotted as a


histogram h(xk ). This shows the number of observed measurement results that fall in
an interval with a width ∆x around a value xk . We have performed N measurements
in total, which means that
    Σ_{k=1}^{M} h(x_k) = N.    (F.1)

In the figure the assumed probability density function p(x) is plotted as well. To be
able to estimate the accuracy with which the fit p(x) describes the measurement results,
we will have to compare the results with the expected results. The probability P (xk )
to find a result in the interval around xk in a single measurement, is
    P(x_k) = ∫_{x_k − Δx/2}^{x_k + Δx/2} p(x) dx ≈ p(x_k) Δx    (F.2)

and the total number of measurements that we expect in the interval is N P (xk ). We
define, analogous to equation (5.35), again
    χ² = Σ_{k=1}^{M} (h(x_k) − N P(x_k))² / (S_k(h))²,    (F.3)

where (Sk (h))2 = var⟨h(xk )⟩ is the spread in the histogram values h(xk ). In other words,
this is the spread in values h(xk ) that we would find if we would measure the histogram
a (large) number of times. Fortunately, we will not have to do this to be able to
determine the size of Sk (h). The probability P (hk ) of finding exactly hk measurements
in the interval around xk after N measurements is given by the binomial distribution
 
    P(h_k) = (N choose h_k) (P(x_k))^{h_k} (1 − P(x_k))^{N−h_k}.    (F.4)

From the binomial distribution we know that (see chapter 4 and appendix E) the mean
is given by
    h̄_k = N p = N P(x_k),    (F.5)
and the variance by
 
    var⟨h_k⟩ = N p(1−p) = N P(x_k) (1 − P(x_k)) = h̄_k (1 − h̄_k/N).    (F.6)

When the total number of measurements N becomes relatively large, we will see that

var ⟨hk ⟩ ≈ h̄k = N P (xk ). (F.7)

Hence, the expression for χ2 becomes


    χ² = Σ_{k=1}^{M} (h(x_k) − N P(x_k))² / ( N P(x_k)(1 − P(x_k)) ) ≈ Σ_{k=1}^{M} (h(x_k) − N P(x_k))² / ( N P(x_k) ).    (F.8)

The value for χ2 we can now calculate, since the values h(xk ) are measured and P (xk )
is known according to equation F.2. Note that in this equation we assume a continuous
probability distribution; however, also in the case of a discrete distribution we know
P (xk ) and we can calculate χ2 . The value of χ2 is used as a measure for how good
the agreement between data and distribution function is. If the measured frequencies
(histogram values) h(xk ) exactly match the predicted values N P (xk ), we would find
χ2 = 0. However, this is very improbable, since (as stated earlier) there will definitely
be a spread in the values of h(xk ). Large values of χ2 correspond to large deviations
from the assumed distribution function.
To calculate what the probability is to find a certain value for χ2 , we will use the so
called reduced Chi squared χ2ν . This is defined as

    χ²_ν = χ²/ν,    (F.9)
where ν is the so called number of degrees of freedom. This number of degrees of
freedom ν is the number of xk values M minus the number of parameters that we have
to calculate from the measurement data to describe the probability distribution. For
example, for the Gaussian distribution we have to calculate the mean and the standard
deviation, so in that case ν = M − 2. For the Poisson distribution we only need the
mean (the standard deviation is automatically known, since σ = √n̄), so ν = M − 1.
The expectation value for χ²_ν is
    ε⟨χ²_ν⟩ = 1.    (F.10)
When χ2ν is determined from the measurement data, we can calculate what the proba-
bility is that this is the case and thus determine what the probability is that the given
distribution function is correct. It is outside of the scope of these lecture notes to go
into detail on this topic. For example, the book of Bevington and Robinson lists a table
giving these probabilities.
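The χ²_ν recipe above is easy to automate. A minimal sketch (the function name is ours), using the large-N approximation of equation (F.8):

```python
import numpy as np

def chi2_reduced(h, expected, n_params):
    """Reduced chi-squared of histogram counts h(x_k) against the expected
    counts N·P(x_k), using the large-N approximation of equation (F.8)."""
    h = np.asarray(h, float)
    expected = np.asarray(expected, float)
    chi2 = np.sum((h - expected) ** 2 / expected)
    nu = len(h) - n_params   # degrees of freedom: ν = M − number of fitted parameters
    return chi2 / nu

# A value near 1 indicates a plausible fit; 0 would mean a perfect match.
print(chi2_reduced([10, 20, 10], [12, 16, 12], 1))
```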

G. Formulas for least squares fits
All sums run over the data points i.

y = ax + b,  S_yi = S_y = const.:
    a = ( N Σx_i y_i − Σx_i Σy_i ) / ( N Σx_i² − (Σx_i)² )         S_a = sqrt( N S_y² / ( N Σx_i² − (Σx_i)² ) )
    b = ( Σy_i Σx_i² − Σx_i Σx_i y_i ) / ( N Σx_i² − (Σx_i)² )     S_b = sqrt( S_y² Σx_i² / ( N Σx_i² − (Σx_i)² ) )

y = ax,  S_yi = S_y = const.:
    a = Σx_i y_i / Σx_i²                                           S_a = S_y / sqrt( Σx_i² )

y = kx + b (k = const.),  S_yi = S_y = const.:
    b = ( Σy_i − k Σx_i ) / N                                      S_b = S_y / √N

y = a₁x + a₂x²,  S_yi = S_y = const.:
    a₁ = ( Σx_i⁴ Σx_i y_i − Σx_i³ Σx_i² y_i ) / ( Σx_i² Σx_i⁴ − (Σx_i³)² )     S_a₁ = S_y sqrt( Σx_i⁴ / ( Σx_i² Σx_i⁴ − (Σx_i³)² ) )
    a₂ = ( Σx_i² Σx_i² y_i − Σx_i³ Σx_i y_i ) / ( Σx_i² Σx_i⁴ − (Σx_i³)² )     S_a₂ = S_y sqrt( Σx_i² / ( Σx_i² Σx_i⁴ − (Σx_i³)² ) )

y = ax + b,  S_yi = S_i ≠ const.:
    a = ( Σ(1/S_i²) Σ(x_i y_i/S_i²) − Σ(x_i/S_i²) Σ(y_i/S_i²) ) / ( Σ(1/S_i²) Σ(x_i²/S_i²) − (Σ(x_i/S_i²))² )
    S_a = sqrt( Σ(1/S_i²) / ( Σ(1/S_i²) Σ(x_i²/S_i²) − (Σ(x_i/S_i²))² ) )
    b = ( Σ(x_i²/S_i²) Σ(y_i/S_i²) − Σ(x_i/S_i²) Σ(x_i y_i/S_i²) ) / ( Σ(1/S_i²) Σ(x_i²/S_i²) − (Σ(x_i/S_i²))² )
    S_b = sqrt( Σ(x_i²/S_i²) / ( Σ(1/S_i²) Σ(x_i²/S_i²) − (Σ(x_i/S_i²))² ) )

y = ax,  S_yi = S_i ≠ const.:
    a = Σ(x_i y_i/S_i²) / Σ(x_i²/S_i²)                             S_a = 1 / sqrt( Σ(x_i²/S_i²) )

y = kx + b (k = const.),  S_yi = S_i ≠ const.:
    b = ( Σ(y_i/S_i²) − k Σ(x_i/S_i²) ) / Σ(1/S_i²)                S_b = 1 / sqrt( Σ(1/S_i²) )

y = a₁x + a₂x²,  S_yi = S_i ≠ const.:
    a₁ = ( Σ(x_i⁴/S_i²) Σ(x_i y_i/S_i²) − Σ(x_i³/S_i²) Σ(x_i² y_i/S_i²) ) / ( Σ(x_i²/S_i²) Σ(x_i⁴/S_i²) − (Σ(x_i³/S_i²))² )
    S_a₁ = sqrt( Σ(x_i⁴/S_i²) / ( Σ(x_i²/S_i²) Σ(x_i⁴/S_i²) − (Σ(x_i³/S_i²))² ) )
    a₂ = ( Σ(x_i²/S_i²) Σ(x_i² y_i/S_i²) − Σ(x_i³/S_i²) Σ(x_i y_i/S_i²) ) / ( Σ(x_i²/S_i²) Σ(x_i⁴/S_i²) − (Σ(x_i³/S_i²))² )
    S_a₂ = sqrt( Σ(x_i²/S_i²) / ( Σ(x_i²/S_i²) Σ(x_i⁴/S_i²) − (Σ(x_i³/S_i²))² ) )
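As an illustration, the first row of the table (straight line with constant S_y) can be implemented directly; a minimal sketch (the function name is ours):

```python
import numpy as np

def line_fit_const_sy(x, y, sy):
    """Least-squares line y = a x + b for constant uncertainty S_y
    (first row of the table)."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    N = len(x)
    d = N * np.sum(x * x) - np.sum(x) ** 2
    a = (N * np.sum(x * y) - np.sum(x) * np.sum(y)) / d
    b = (np.sum(y) * np.sum(x * x) - np.sum(x) * np.sum(x * y)) / d
    return a, b, np.sqrt(N * sy**2 / d), np.sqrt(sy**2 * np.sum(x * x) / d)

a, b, Sa, Sb = line_fit_const_sy([0, 1, 2, 3], [2, 5, 8, 11], 0.1)
print(a, b)  # points on y = 3x + 2 give back a = 3, b = 2
```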
H. Answers Questions

Only answers to odd-numbered questions are given

Question 2.4.1.1
(a) ∂/∂x (x⁴y²) = 4x³y²
    ∂/∂y (x⁴y²) = 2x⁴y
(b) ∂/∂x (x²/(x² + y)) = 2xy/(x² + y)²
    ∂/∂y (x²/(x² + y)) = −x²/(x² + y)²
(c) ∂/∂x ((4x² + 2y)/(3x − y)) = (12x² − 8xy − 6y)/(3x − y)²
    ∂/∂y ((4x² + 2y)/(3x − y)) = (4x² + 6x)/(3x − y)²
(d) ∂/∂x (sin x/sin y) = cos x/sin y
    ∂/∂y (sin x/sin y) = −sin x cos y/(sin y)²
(e) ∂/∂x (x sin(xy)) = sin(xy) + xy cos(xy)
    ∂/∂y (x sin(xy)) = x² cos(xy)

Question 2.4.1.3
(i) p_min = 171, p_max = 231; consistent with ∆p/p = 0.15, ∆p = 30

(ii) p_min = 20, p_max = 560

The uncertainty interval is not symmetric around p = 200; the uncertainties are not small!

Question 2.4.1.5
∆g/g = 0.055, ∆g = 0.055 g

Question 2.4.1.7
∆λ/λ = (cos θ/sin θ) ∆θ
With ∆θ = 1′ = 2.91·10⁻⁴ rad this results in ∆λ/λ = 1.04·10⁻³

Question 2.4.1.9
sin x − ∆x · cos x and sin x + ∆x · cos x

Question 3.12.1.1

(b) 38,40

(c) Median = 38

(d) Mean = 38.3

(e) Standard deviation = 4.4

(f) 34,43

(g) 30,47

Question 3.12.1.3
S = sqrt( (1/(N−1)) Σ_{i=1}^{N} (V_i − V̄)² )
The standard deviation does NOT depend on N.

Question 3.12.1.5
g ± sg = (9.80 ± 0.04) m/s2

Question 3.12.1.7
k̄ ± sk̄ = (13.16 ± 0.06) N/m

Question 4.7.1.1
p(x) = 1/(σ√(2π)) exp( −(x − x̄)²/(2σ²) )
It is immediately clear that the maximum value is
p(x̄) = 1/(σ√(2π))
Thus if we solve p(x) = ½p(x̄) for x:
exp( −(x − x̄)²/(2σ²) ) = ½
This gives (x − x̄)²/(2σ²) = ln 2 → (x − x̄)² = 2σ² ln 2 → x_{1,2} = x̄ ± σ√(2 ln 2)
Therefore the FWHM = 2σ√(2 ln 2) ≈ 2.35σ. Q.E.D.

Question 4.7.1.3
P (x̄ − σ < x < x̄ + σ) = P (−1 < z < 1) = P (z ≥ −1) − P (z ≥ 1) = 1 − 2 × P (z ≥
1) = 1 − 2 × 0.159 = 0.681

Question 5.5.1.1

x̄ ± Sx̄ = (1.60 ± 0.02) · 10−19 C

Question 5.5.1.3
Partly done during lecture, see slides for explanation

Question 5.5.1.5
τ = 2.025 s−1

I. Answers Problems

Problem 2.4.2.1

(a) Method 1: 1/R = 1/R₁ + 1/R₂

R₁ = 100 Ω ± 5% = (100 ± 5) Ω ⇒ 1/R₁ = (0.01000 ± 0.00050) Ω⁻¹,
R₂ = 270 Ω ± 5% = (270 ± 14) Ω ⇒ 1/R₂ = (0.00370 ± 0.00019) Ω⁻¹,
since the relative uncertainties in Ri and 1/Ri are equal.
The absolute uncertainties can now be added:
1/R = (0.01370 ± 0.00069) Ω⁻¹
⇒ R = (73 ± 4) Ω

This last step is valid because the relative uncertainties in R and in 1/R are equal.
Method 2: R = R₁R₂/(R₁ + R₂)
∂R/∂R₁ = [R₂(R₁ + R₂) − R₁R₂]/(R₁ + R₂)² = R₂²/(R₁ + R₂)²
∂R/∂R₂ = R₁²/(R₁ + R₂)²
⇒ ∆R = |∂R/∂R₁| ∆R₁ + |∂R/∂R₂| ∆R₂
⇒ ∆R = R₂²/(R₁ + R₂)² ∆R₁ + R₁²/(R₁ + R₂)² ∆R₂ = 2.663 Ω + 1.023 Ω = 3.686 Ω

(b) The largest contribution to the uncertainty comes from R1 : S1/R1 = 0.0005 Ω−1 vs.
S1/R2 = 0.0002 Ω−1 with method 1 or 2.663 Ω vs. 1.023 Ω.
Thus replacing R1 is the most sensible.

Problem 2.4.2.2

(a) If α = α+ − α0 were to be used, then the uncertainty ∆α in α is of course

∆α = 2∆α+ .

With the applied method, the uncertainty is

∆α = ½ · 2∆α+ = ∆α+ .

This method is therefore more precise and results in a more accurate value for λ.
(b)
nλ = d sin α ⇒ λ = (d/n) sin( (α+ − α−)/2 ).
The uncertainty ∆λ now follows from:
∆λ = |∂λ/∂α+| ∆α+ + |∂λ/∂α−| ∆α− = ( |∂λ/∂α+| + |∂λ/∂α−| ) ∆α+ (because ∆α+ = ∆α−)
⇒ ∆λ = (d/n) cos( (α+ − α−)/2 ) ∆α+ = (d/n) |cos α| ∆α+ = (d/n) √(1 − sin²α) ∆α+
      = (d/n) √(1 − (nλ/d)²) ∆α+ = √( (d/n)² − λ² ) ∆α+
Alternatively:
λ = (d/n) sin α
⇒ ∆λ = |∂λ/∂α| ∆α = (d/n) |cos α| ∆α = (d/n) |cos α| ∆α+ (because according to (a), ∆α = ∆α+)
Continued as above.

(c) Use part (b) and substitute everything:

∆λ = ∆α+ √( (d/n)² − λ² ) = (1/60)(π/180) √( (2190/1)² − 600² ) = 0.6 nm

(d) In the formula nλ = d sin α, of course the value of sin α has to remain smaller than 1
(for sin α = 1 the angle is π/2 which is the extreme).
Thus
sin α = nλ/d ≤ 1 ⇒ n ≤ d/λ = 2190/600 = 3.65
Therefore n = 3 is the maximum.
(e) ∆λ = ∆α+ √( (d/n)² − λ² ) is smallest for n as big as possible, hence for n = 3.
(f) ∆λ = ∆α+ √( (d/n)² − λ² ) = (1/60)(π/180) √( (2190/3)² − 600² ) = 0.12 nm.
 
(g) λ = (d/n) sin α = (d/n) sin( (α+ − α−)/2 ) = (2190/2) sin(32.567°) = 589.417 nm (not rounded off yet).
∆λ = (d/n) √(1 − sin²α) ∆α+ = (2190/2) √(1 − sin²32.567°) (1/60)(π/180) = 0.27 nm.
Hence the final answer is λ = (589.4 ± 0.3) nm.

Problem 2.4.2.3

(a) (i) The uncertainties ∆xi have to be very small (with respect to xi ).
(ii) The uncertainties ∆xi have to be independent of each other.

(b) (i) In the derivation, the slope of the function f was taken as constant around xi ,
thus in the one dimensional case, df /dxi = ∆f /∆xi .
Example: sin α with ∆α very large, then ∆sin α > 1 is possible.
(ii) It is presumed that the correct values of the measured xi can lie randomly within
the intervals so that the lower- and upperbounds of the known quantity x can
be reached by assuming the necessary lower- and upperbounds of xi (or possibly
another value within the interval ). This is not (always) possible for uncertainties
that are not independent.
Counterexample: y = x1 /x2 with x2 = 1 − x1 . If the actual value of x1 is larger
than the measured value, then the actual value of x2 must be smaller than the
measured value.

(c) The first condition is no longer necessary since for summation it holds that ∆y =
∆x1 + ∆x2 , provided that the uncertainties are independent. This is because y =
f (x1 , x2 ) = x1 + x2 is a linear function. The slope of f (x1 , x2 ) is therefore not only
constant in a small region near the point (x1 , x2 ), but everywhere. The uncertainties
∆x1 and ∆x2 would still have to be independent.

Problem 2.4.2.4

(a)

V = I₀R = I₁(R + R₁)
⇒ R(I₀ − I₁) = I₁R₁ ⇒ R = R₁ I₁/(I₀ − I₁)

(b)
 
∆R = |∂R/∂I₀| ∆I₀ + |∂R/∂I₁| ∆I₁ = ( |∂R/∂I₀| + |∂R/∂I₁| ) ∆I
   = ( I₁R₁/(I₀ − I₁)² + I₀R₁/(I₀ − I₁)² ) ∆I
⇒ ∆R = R₁ (I₀ + I₁)/(I₀ − I₁)² ∆I

(c) I0 and I1 are entirely determined by V , R and R1 , thus ∆R can also be written as

∆R = (R/V) R₁⁻¹ (R + R₁)(2R + R₁) ∆I
⇒ ∆R = (R/V) ( 2R²/R₁ + 3R + R₁ ) ∆I

This expression includes 3 terms: a term ∝ 1/R1 , a term that does not depend on R1
and a term ∝ R1 .
For small values of R1 the first term dominates (∆R → ∞) and for large values of R1
the third term dominates (∆R → ∞). So somewhere there has to be a minimum.
∆R is minimal when d∆R /dR1 = 0:

d∆R/dR₁ = −R₁⁻²(R + R₁)(2R + R₁) + R₁⁻¹(2R + R₁) + R₁⁻¹(R + R₁) = 0
⇒ R₁² − 2R² = 0 ⇒ R₁ = R√2

(d) Using (a) we find


R = (4.6/(6.5 − 4.6)) × 1 kΩ = 2.42 kΩ (not rounded off yet)
The uncertainty is found using (b):
∆R = ((6.5 + 4.6)/(6.5 − 4.6)²) × 0.1 × 1 kΩ = 0.31 kΩ

Therefore the answer is


R = (2.4 ± 0.3) kΩ

(e) Using (c) we find that the result is optimal when R₁ = R√2 = 3.4 kΩ.
The uncertainty ∆R was
∆R^[old] = (R/V) (R₁^[old])⁻¹ (R + R₁^[old])(2R + R₁^[old])
and thus the ratio between the 'old' uncertainty and the new:
∆R/∆R^[old] = [ R₁^[old] (R + R₁)(2R + R₁) ] / [ R₁ (R + R₁^[old])(2R + R₁^[old]) ]
            = (1.0 × 5.8 × 8.2)/(3.4 × 3.4 × 5.8) = 0.7 ⇒ ∆R = 0.2 kΩ

Problem 2.4.2.5

(a)

i = pi − pl ⇒ ∆i = ∆pi + ∆pl
o = pl − po ⇒ ∆o = ∆pl + ∆po

Substituting everything gives:

i = (26.40 ± 0.25) cm
o = (34.05 ± 0.10) cm

or rounded off correctly:

i = (26.4 ± 0.3) cm
o = (34.1 ± 0.1) cm

(b) These values cannot be used since the calculated uncertainties in i and o are dependent
(both depend on ∆pl ).
(c) Since uncertainties are dependent we have to use the general rule for 100% intervals,
starting with the uncertainties in the measurements. From the thin lens formula it
follows that
f = io/(i + o) = (pi − pl)(pl − po)/(pi − po)
The uncertainty ∆f in f follows from
∆f = |∂f/∂po| ∆po + |∂f/∂pl| ∆pl + |∂f/∂pi| ∆pi
Calculating the derivatives gives
∆f = (pi − pl)²/(pi − po)² ∆po + |pi − 2pl + po|/(pi − po) ∆pl + (pl − po)²/(pi − po)² ∆pi
   = i²/(i + o)² ∆po + |i − o|/(i + o) ∆pl + o²/(i + o)² ∆pi
   = [ i² ∆po + o² ∆pi + |(i − o)(i + o)| ∆pl ] / (i + o)²
   = [ i² ∆po + o² ∆pi + |i² − o²| ∆pl ] / (i + o)²
Substituting everything gives f = 14.870 cm and ∆f = 0.079 cm and thus the final
answer is:
f = (14.87 ± 0.08) cm

Problem 3.12.2.1

(a) From
∫_{−∞}^{+∞} p(x) dx = 1
follows
∫_{−∞}^{+∞} p(x) dx = 2a·(b/2) + 2ab = 3ab = 1
⇒ b = 1/(3a)
(b)
x_w = ε⟨x⟩ = ∫_{−∞}^{+∞} x p(x) dx
    = ∫_{−∞}^{0} x p(x) dx + ∫_{0}^{+∞} x p(x) dx
    = −∫_{0}^{∞} x p(−x) dx + ∫_{0}^{+∞} x p(x) dx
    = ∫_{0}^{+∞} x [p(x) − p(−x)] dx
For a symmetric probability density p(x) = p(−x), and thus the [p(x) − p(−x)] term
in this integral is 0.
Conclusion: ε⟨x⟩ = 0 for a symmetric probability density.

(c)
var⟨x⟩ = ∫_{−∞}^{+∞} (x − x_w)² p(x) dx = ∫_{−∞}^{+∞} x² p(x) dx
       = 2 ∫_0^a b x² dx + 2 ∫_a^{2a} (b/2) x² dx
       = 2b [x³/3]_0^a + b [x³/3]_a^{2a} = (2/3)a³b + (b/3)(8a³ − a³) = 3a³b = a²
Hence var⟨x⟩ = a² and from this σ = √var⟨x⟩ = a.

(d) The part of the measured results lying between −σ and +σ is calculated using
∫_{−σ}^{+σ} p(x) dx = ∫_{−a}^{+a} p(x) dx = 2ab = 2/3
Thus 66.7% lies in this region.

Problem 3.12.2.2
p(x) = 1/(σ√(2π)) exp( −(x − x̄)²/(2σ²) )
It is immediately clear that the maximum value is
p(x̄) = 1/(σ√(2π))
Thus if we solve p(x) = ½p(x̄) for x:
exp( −(x − x̄)²/(2σ²) ) = ½
⇒ exp( (x − x̄)²/(2σ²) ) = 2
⇒ (x − x̄)²/(2σ²) = ln 2
⇒ (x − x̄)² = 2σ² ln 2
⇒ x_{1,2} = x̄ ± σ√(2 ln 2)
The FWHM is therefore
FWHM = x₂ − x₁ = 2σ√(2 ln 2) ≈ 2.35 σ

Q.e.d.
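The FWHM result can also be checked numerically; a minimal sketch (the value of σ and the grid resolution are our own choices):

```python
import numpy as np

# Numerical check that the full width at half maximum of a Gaussian
# equals 2σ√(2 ln 2) ≈ 2.35 σ.
sigma = 1.7
x = np.linspace(-8 * sigma, 8 * sigma, 2_000_001)
p = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
above = x[p >= p.max() / 2]   # abscissas at or above half maximum
fwhm = above[-1] - above[0]
print(fwhm / sigma)           # ≈ 2.3548
```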
Problem 3.12.2.3
We know that
var⟨x⟩ = σ²
From the figure we estimate
FWHM ≈ 110 m/s ≈ 2.35 σ
Therefore
σ ≈ 47 m/s and var⟨x⟩ ≈ 2.2·10³ m²/s²

Problem 3.12.2.4
The value he reports for the diameter is the average:
d̄ = (1/10) Σ dᵢ = 4.9999 mm
For the uncertainty he reports the standard deviation (multiplied by two), not that of the
average. Because he reports the uncertainty in the diameter of a single ball, he therefore
has to report the uncertainty in one single measurement, thus the standard deviation of
separate measurements. This is determined using
S_d = sqrt( (1/(N−1)) Σ (dᵢ − d̄)² ) = sqrt( (1/9) Σ (dᵢ − d̄)² ) = 0.0051 mm

Thus he reports 0.01 mm as the uncertainty (95% interval):

d = (5.00 ± 0.01) mm

Problem 3.12.2.5
First method:
Use
x̄ = (1/N) Σ xᵢ
and
S_x̄ = sqrt( (1/(N(N−1))) Σ (xᵢ − x̄)² )
for both the current and voltage. We then find

Ī ± S_Ī = (5.073 ± 0.044) µA and
V̄ ± S_V̄ = (50.39 ± 0.28) µV

For the resistance we find


R = (50.39 ± 0.28) µV / (5.073 ± 0.044) µA = (9.9 ± 0.1) Ω

The uncertainty is found by the quadratic sum of the relative uncertainties.


Second method:
Averaging Ri -values as with I and V in the first method gives:

R ± SR = (9.93 ± 0.04) Ω
The difference in the uncertainty with respect to the first method is clear. This is because
the uncertainties in V and in I are not independent of each other. Apparently there are
fluctuations present in the current which lead to fluctuations in the measured potential. The
first method therefore is not the correct one because it assumes that SI and SV are independent
when calculating the uncertainty in R.
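The difference between the two methods can be made visible with a small Monte Carlo sketch (the numbers are our own toy values; they mimic the situation where current fluctuations drive the voltage fluctuations):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: the current fluctuates, and through Ohm's law the measured
# voltage fluctuates along with it, so S_I and S_V are NOT independent.
R_true = 9.93                                          # Ω (assumed)
I = 5.073e-6 + 4.4e-8 * rng.standard_normal(100_000)   # A
V = R_true * I                                         # V, fully correlated

# First method (wrong here): quadratic sum of the relative uncertainties.
S_R_1 = R_true * np.sqrt((I.std() / I.mean()) ** 2 + (V.std() / V.mean()) ** 2)
# Second method (right here): spread of the individual ratios R_i = V_i / I_i.
S_R_2 = (V / I).std()

print(S_R_1, S_R_2)  # the second method gives (essentially) zero spread
```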

Problem 3.12.2.6

(a) S is calculated with
S = sqrt( (1/(N−1)) Σᵢ (Iᵢ − Ī)² )

(b) S is determined by the spread in separate measurements, which is of course independent
of the number of measurements N.

(c) The uncertainty is
S_m = S/√N ⇒ N = (S/S_m)² = (0.5/0.1)² = 25
Thus 25 measurements are needed.

Problem 3.12.2.7
The measured uncertainty is S_I = 0.2 µA. From this follows
S_R/R = S_I/I = 0.2/2.3 ⇒ S_R = (0.2/2.3) R
The desired uncertainty is
S_R/R = 0.05 ⇒ S_R = 0.05 R
This is 0.2/(2.3 × 0.05) = 1.74 times as accurate, hence 1.74² ≈ 3.0 times as many
measurements are needed = 30 measurements. Therefore 20 extra measurements are needed.

Problem 3.12.2.8
The general formula:
S_y = sqrt( Σ_{i=1}^{n} (∂y/∂xᵢ)² S²_{xᵢ} )
Denoting the 95% intervals D_{xᵢ} = 2S_{xᵢ} and D_y = 2S_y, the 95% uncertainty in the
final answer then is
D_y = 2S_y = 2 sqrt( Σ (∂y/∂xᵢ)² S²_{xᵢ} )
    = sqrt( Σ (∂y/∂xᵢ)² · 4 S²_{xᵢ} )
    = sqrt( Σ (∂y/∂xᵢ)² (2S_{xᵢ})² )
    = sqrt( Σ (∂y/∂xᵢ)² D²_{xᵢ} )
and thus D_y = sqrt( Σ_{i=1}^{n} (∂y/∂xᵢ)² D²_{xᵢ} ) is the same rule as for 68% intervals.

Problem 3.12.2.9

(a) Method 1: 1/R = 1/R₁ + 1/R₂
R₁ = 100 Ω ± 5% = (100 ± 5) Ω ⇒ 1/R₁ = (0.01000 ± 0.00050) Ω⁻¹
R₂ = 270 Ω ± 5% = (270 ± 14) Ω ⇒ 1/R₂ = (0.00370 ± 0.00019) Ω⁻¹
because the relative uncertainties in Rᵢ and 1/Rᵢ are equal.
We can now take the quadratic sum of the absolute uncertainties:
1/R = (0.01370 ± 0.00053) Ω⁻¹
⇒ R = (73 ± 3) Ω
The last step is valid because the relative uncertainties in R and in 1/R are equal.
Method 2: R = R₁R₂/(R₁ + R₂)
∂R/∂R₁ = [R₂(R₁ + R₂) − R₁R₂]/(R₁ + R₂)² = R₂²/(R₁ + R₂)²
∂R/∂R₂ = R₁²/(R₁ + R₂)²
⇒ S_R² = (∂R/∂R₁)² S²_{R₁} + (∂R/∂R₂)² S²_{R₂}
⇒ S_R² = ( R₂²/(R₁ + R₂)² )² S²_{R₁} + ( R₁²/(R₁ + R₂)² )² S²_{R₂} = 7.089 Ω² + 0.972 Ω² = 8.061 Ω²
⇒ S_R = 2.84 Ω
(b) The biggest contribution to the uncertainty is accounted for by R1 : S1/R1 = 0.0005 Ω−1
vs. S1/R2 = 0.0002 Ω−1 in method 1 or 7.089 Ω2 vs. 0.972 Ω2 .
Hence replacing R1 would be the most sensible.

Problem 4.7.2.1
The average of the first 10 measurements:

l̄ = (1/10) Σᵢ lᵢ = 50.369 mm
Spread (standard deviation) in the first 10 measurements:
σ_l = sqrt( (1/(N−1)) Σ (lᵢ − l̄)² ) = 0.051 mm (spread of separate measurements)
The last measurement differs 50.55 − 50.369 = 0.181 mm from it, which is 0.181/0.051 = 3.6 σ.
The probability for z ≥ 3.6 is 1.59·10⁻⁴.
This is very small for a series of 10 measurements, hence reject.
Problem 4.7.2.2

(a) Comparing both equations for p (x) simply gives (equating powers):

a−1=n⇒a=n+1

and
b−1=N −n⇒b=N +1−n

(b) p(x) = K xⁿ (1 − x)^{N−n} is maximal if
dp(x)/dx = 0
⇒ K n x^{n−1} (1 − x)^{N−n} − K xⁿ (N − n)(1 − x)^{N−n−1} = 0
⇒ n(1 − x) − (N − n)x = 0
⇒ n − Nx = 0 ⇒ x = n/N

Q.e.d.

(c) Expectation value of the original fraction x is

a n+1 n+1
ε ⟨x⟩ = = =
a+b n+1+N +1−n N +2

(d) It is not equal to the x where p(x) is maximal, because the distribution function is not
symmetric (see figure). For example, in the left figure the probability that x_w is greater
than n/N is more than 50%.
(e)
var⟨x⟩ = ab / ( (a + b)²(a + b + 1) )
       = (n + 1)(N + 1 − n) / ( (N + 2)²(N + 3) )
       = (1/(N + 3)) · ((n + 1)/(N + 2)) · ((N + 2 − (n + 1))/(N + 2))
       = (1/(N + 3)) ε⟨x⟩ (1 − ε⟨x⟩)
       = ε⟨x⟩(1 − ε⟨x⟩)/(N + 3).

(f) For N = 100 and n = 49
ε⟨x⟩ = 50/102 = 0.490
The uncertainty in it is the standard deviation σ, hence
σ = √var⟨x⟩ = sqrt( (0.49 × 0.51)/103 ) = 0.049
Conclusion:
x = (0.49 ± 0.05)

(g) With x_w = 0.501 it has to hold that:
σ ≈ sqrt( (0.501 × 0.499)/(N + 3) ) < (0.501 − 0.5) = 0.001
After all, the uncertainty (= σ) has to be smaller than the deviation with respect
to 50%.
Hence
sqrt( (0.501 × 0.499)/(N + 3) ) < 0.001
⇒ N + 3 > 0.25 × 10⁶
⇒ N > 250 000
For 95% confidence it has to hold that
2σ < 0.001
⇒ 2 sqrt( (0.501 × 0.499)/(N + 3) ) < 0.001
⇒ N + 3 > 4 × 0.25 × 10⁶
⇒ N > 1 000 000

We can also consider this problem from the binomial distribution perspective, with the
same result. This goes as follows.
For the actual fraction x_w for candidate A, the probability P(n) of n votes for him out
of a trial of N votes in total is equal to
P(n) = (N choose n) x_wⁿ (1 − x_w)^{N−n} (binomial distribution).

Hence the expected number of votes is
ε⟨n⟩ = N x_w
(known from the binomial distribution). The uncertainty herein then is
σ = √var⟨n⟩ = √( N x_w (1 − x_w) )
(also known from the binomial distribution). The result will be unequivocal (within ca. 68%)
if the expected number of votes N x_w deviates more than σ from ½N (50% for both
candidates). Thus if
|N x_w − ½N| > √( N x_w (1 − x_w) )
⇒ 0.001 √N > √(0.501 × 0.499) ≈ 0.5
⇒ N > 250 000
The result is unequivocal within ca. 95% if
|N x_w − ½N| > 2σ = 2 √( N x_w (1 − x_w) )
⇒ 0.001 √N > 2 √(0.501 × 0.499) ≈ 1.0
⇒ N > 1 000 000

Problem 4.7.2.3

(a) The probability that the player has to stop after the n-th card is equal to the probability
that the player first draws n − 1 cards and subsequently draws a non-card. The proba-
bility of drawing n − 1 cards is (4/13)n−1 and the probability of drawing a non-card is
9/13. The total probability is the product of both probabilities, thus
P(n) = (4/13)^{n−1} (9/13) = (9/4)(4/13)ⁿ.
(b) The profit of a player is given by the 'profit function'
W(n) = n² − 2n
where n is the number of cards drawn. The n²-term is the payout and the 2n-term is
the stake (2 chips per drawn card). The average profit per participant is the expectation
value of this profit function:
ε⟨W(n)⟩ = Σ_{n=1}^{∞} W(n)P(n) = (9/4) Σ_{n=1}^{∞} (n² − 2n)(4/13)ⁿ
        = (9/4) Σ_{n=0}^{∞} n²(4/13)ⁿ − (18/4) Σ_{n=0}^{∞} n(4/13)ⁿ
The first term is calculated with the aid of the last series in the attachments with
r = 4/13 and the second term with the second-to-last series. We then find
ε⟨W(n)⟩ = (9/4) · ((17·4)/13²)/(9/13)³ − (18/4) · (4/13)/(9/13)² = 5.98
The expected profit per participant (and thus the loss for the casino) is 5.98 chips.

(c) The required stake per card is m chips. The expectation value of the profit per player
therefore is
ε⟨W(n)⟩ = (9/4) Σ_{n=0}^{∞} n²(4/13)ⁿ − (9m/4) Σ_{n=0}^{∞} n(4/13)ⁿ = (13/9)(17/4 − 9m/13)
The casino will be making a profit if
17/4 − 9m/13 < 0 ⇒ m > (17 × 13)/(9 × 4) = 6.13
Thus the casino will be making a profit when the stake is 7 chips or more.

(d) Players therefore have to stake 7 chips per card. The profit function then becomes
W(n) = n² − 7n
Only for n ≥ 8 is W(n) > 0 and will a player make a profit. The probability that he
draws 8 or more cards is
P(n ≥ 8) = Σ_{n=8}^{∞} P(n) = (9/4) Σ_{n=8}^{∞} (4/13)ⁿ = (9/4)(4/13)⁸ Σ_{n=0}^{∞} (4/13)ⁿ
With the aid of the fifth series we then find
P(n ≥ 8) = (9/4)(4/13)⁸ · 1/(9/13) = (4/13)⁷ = 0.00026.
The probability therefore is extremely small.
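The probability distribution of part (a) can be cross-checked with a small Monte Carlo sketch of our own:

```python
import random

random.seed(42)

# Monte Carlo cross-check of P(n) = (9/4)(4/13)^n: keep drawing while a
# "card" (probability 4/13) comes up; n is the draw at which the game stops.
def play():
    n = 1
    while random.random() < 4 / 13:
        n += 1
    return n

trials = 200_000
games = [play() for _ in range(trials)]
print(games.count(1) / trials, 9 / 13)   # estimate vs. exact P(1)
print(sum(games) / trials, 13 / 9)       # estimate vs. exact mean number of draws
```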

Problem 4.7.2.4

(a) After 7 throws it is certainly over since the longest set is comprised of the throws
1-2-3-4-5-6. The 7th throw is always smaller.

(b) For 7 throws, the order of part (a) has to be thrown. The probability of each throw is
1/6 and thus the total probability is (1/6)⁶ = 1/46656.
(c) The total probability is 1, hence
Σ_{n=1}^{7} P(n) = (27216 + 15120 + 3780 + 35 + 1)/46656 + P(5) = 1
⇒ P(5) = 504/46656 = 7/648 = 0.0108.
(d) The average number of throws per game is
n̄ = Σ_{n=1}^{7} n P(n) = (2×27216 + 3×15120 + 4×3780 + 5×504 + 6×35 + 7×1)/46656
  = 117649/46656 = 2.522
The standard deviation is calculated using
σ² = Σ_{n=1}^{7} (n − n̄)² P(n) = 0.486
⇒ σ = 0.697
(e) The probability that the first player wins is
P₁ = P(2) + P(4) + P(6) = (27216 + 3780 + 35)/46656 = 31031/46656 = 0.6651.
The probability that the second player wins is
P₂ = P(3) + P(5) + P(7) = (15120 + 504 + 1)/46656 = 15625/46656 = 0.3349.
(f) The table below shows the stake and profit/loss for both players per throw. The profit
columns apply if the game is over after that particular throw. The profit is then the
pot minus the stake of that player.

throw | stake A | stake B | pot | profit A | profit B
  1   |    1    |    1    |  2  |   −1     |   −1
  2   |    0    |    0    |  2  |    1     |   −1
  3   |    1    |    0    |  3  |   −2     |    2
  4   |    1    |    0    |  4  |    1     |   −1
  5   |    1    |    0    |  5  |   −4     |    4
  6   |    1    |    0    |  6  |    1     |   −1
  7   |    1    |    0    |  7  |   −6     |    6

The expectation value for the profit of the first player is
ε⟨W_A⟩ = Σ_{n=1}^{7} W_A(n)P(n)
= (−1×0 + 1×27216 − 2×15120 + 1×3780 − 4×504 + 1×35 − 6×1)/46656 = −0.0264
and thus for the second player
ε⟨W_B⟩ = Σ_{n=1}^{7} W_B(n)P(n)
= (−1×0 − 1×27216 + 2×15120 − 1×3780 + 4×504 − 1×35 + 6×1)/46656 = 0.0264
The second player will therefore win, with an average profit of 2.64 cents per game.

Problem 4.7.2.5
R∞ R∞ ∞
Ke−t/µ dt = −Kµe−t/µ 0 = Kµ = 1 ⇒ K = 1/µ

(a) p(t)d(t) =
0 0

R∞ 1 R∞ −t/µ ∞
dt = −(t + µ)e−t/µ 0 = µ = 4.05 min

9b) t̄ = tp(t)dt = te
0 µ0

R∞ ∞
p(t)dt = −e−t/µ 8 = e−1.975 = 0.139

(c) P (t > 8[min]) =
8
Thus 13.9 % probability.

(d) Costpricefunction K(t) = 0.2 + 0.2t and the expectation value therefore is
1 R∞ 0.2 R∞
ε ⟨K(t)⟩ = K(t)e−t/µ dt = (1 + t)e−t/µ dt = 0.2(1 + t̄) = 1.01 Euro
µ0 µ 0

(e) New costpricefunction K(t) = 0.04t2 and the expectation value therefore is
1 R∞ 0.04 R∞ 2 −t/µ
ε ⟨K(t)⟩ = K(t)e−t/µ dt = t e dt
µ0 µ 0

= 0.04 (−t2 − 2µt − 2µ2 )e−t/µ 0 = 0.08µ2 = 1.31 Euro
 

The first tariff is therefore cheaper


131
(f) In the case of smaller average call duration’s the second tariff is cheaper because it goes
to zero as the call duration goes to zero. Whilst with the second tariff the fixed fee
remains. The turning point is found by solving:
ε ⟨K1 (t)⟩ = ε ⟨K2 (t)⟩ ⇒ 0.2(1 + µ) = 0.08µ2 ⇒ µ = 3.266 min
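Parts (d) and (e) can be cross-checked by sampling the exponential distribution; a minimal sketch of our own:

```python
import random

random.seed(1)

# Sampling check: call durations t follow p(t) = (1/µ)e^(-t/µ) with
# µ = 4.05 min; compare the two tariffs ε⟨0.2 + 0.2t⟩ and ε⟨0.04t²⟩.
mu = 4.05
t = [random.expovariate(1 / mu) for _ in range(1_000_000)]

tariff1 = sum(0.2 + 0.2 * ti for ti in t) / len(t)  # exact: 0.2(1 + µ) = 1.01
tariff2 = sum(0.04 * ti**2 for ti in t) / len(t)    # exact: 0.08µ² ≈ 1.31
print(tariff1, tariff2)
```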

Problem 5.5.2.1

(a) I = I₀ exp(−Cd) ⇒ ln(I) = ln(I₀) − Cd.
Thus plot ln(I) against d, or vice versa. Criteria: negligible uncertainties along the x-axis,
constant uncertainties along the y-axis. Thus (also looking at the following parts):
plot ln(I) against d.

(b) The conditions are:

• There has to be (or is expected to be) a linear relation between x and y (stating the obvious);

• The uncertainties of the variable along the x-axis must be negligible;

• Only the measurements of y are assumed to suffer appreciable uncertainties, i.e. S_y ≪ y.

(c) In a figure:

[Figure: ln(I) plotted against thickness d (mm), d from 0.0 to 3.0 mm]

(d) Drawing lines of worst fit:


[Figure: data with lines of worst fit ln(I) = 3.11 − 1.59 d and ln(I) = 2.83 − 1.43 d; ln(I) vs. thickness d (mm)]

Hence the slope is
a = (−1.51 ± 0.08) mm⁻¹ = −C
⇒ C = (1.51 ± 0.08) mm⁻¹
and the y-intercept is
b = (2.97 ± 0.14) = ln(I₀)
⇒ I₀ = (19 ± 3) W/m²
The uncertainty herein is determined via
S_ln(I₀) = S_I₀/I₀ ⇒ S_I₀ = I₀ S_ln(I₀) = 19.49 · 0.14 = 2.7 W/m²

(e) Method 1: Make use of the formulas in appendix G: Formulas least-square-fits


(Note: In the answer (excessively) many numbers are given to show that method 1 and
method 2 lead to exactly the same result)
xᵢ = d (mm) | I (W/m²) | yᵢ = ln(I) | S_yi = S_I/I | 1/S_yi² | xᵢyᵢ/S_yi² | xᵢ/S_yi² | yᵢ/S_yi² | xᵢ²/S_yi²
0.5000 | 9.2  |  2.2192 | 0.0978 | 104.4938 |  115.9465 |  52.2469 |  231.8931 |  26.1235
0.9950 | 4.7  |  1.5476 | 0.1064 |  88.3600 |  136.0589 |  87.9182 |  136.7426 |  87.4786
2.0500 | 0.81 | −0.2107 | 0.0988 | 102.5156 |  −44.2845 | 210.1570 |  −21.6022 | 430.8219
2.7500 | 0.31 | −1.1712 | 0.0968 | 106.7778 | −343.9049 | 293.6389 | −125.0563 | 807.5069
Sums: Σxᵢ = 6.2950 mm, Σyᵢ = 2.3849, Σ1/S_yi² = 402.1472, Σxᵢyᵢ/S_yi² = −136.1839 mm,
Σxᵢ/S_yi² = 643.9610 mm, Σyᵢ/S_yi² = 221.9772, Σxᵢ²/S_yi² = 1351.9309 mm²

Slope:
a = [ Σ(1/Sᵢ²) Σ(xᵢyᵢ/Sᵢ²) − Σ(xᵢ/Sᵢ²) Σ(yᵢ/Sᵢ²) ] / [ Σ(1/Sᵢ²) Σ(xᵢ²/Sᵢ²) − (Σ(xᵢ/Sᵢ²))² ]
  = (402.1472 × (−136.1839) − 643.9610 × 221.9772) / (402.1472 × 1351.9309 − 643.9610²)
  = −197710.6420/128989.4663 = −1.5328 mm⁻¹ = −C.
Uncertainty:
S_a = sqrt( Σ(1/Sᵢ²) / [ Σ(1/Sᵢ²) Σ(xᵢ²/Sᵢ²) − (Σ(xᵢ/Sᵢ²))² ] )
    = sqrt(402.1472/128989.4663) = 0.05584 mm⁻¹ = S_C.

y-intercept:
b = [ Σ(xᵢ²/Sᵢ²) Σ(yᵢ/Sᵢ²) − Σ(xᵢ/Sᵢ²) Σ(xᵢyᵢ/Sᵢ²) ] / [ Σ(1/Sᵢ²) Σ(xᵢ²/Sᵢ²) − (Σ(xᵢ/Sᵢ²))² ]
  = (1351.9309 × 221.9772 − 643.9610 × (−136.1839)) / 128989.4663
  = 387794.9525/128989.4663 = 3.0064 = ln(I₀)
⇒ I₀ = 20.2147 W/m².

Uncertainty

Sb = √{ Σ(xi²/Si²) / [Σ(1/Si²) Σ(xi²/Si²) − (Σ(xi/Si²))²] }
   = √(1351.9309 / 128989.4663) = 0.1024
⇒ SI0 = I0 Sln(I0) = 20.2147 · 0.1024 = 2.07 W/m².

Hence final result (now correctly rounded):

C = (1.53 ± 0.06) mm−1


and I0 = (20 ± 2) W/m².
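As a cross-check, the Method 1 arithmetic can be reproduced with a short Python sketch (Python is not part of this course, which uses Origin; this is only an independent check of the hand calculation). The absolute uncertainties SI are read back from the table's Syi = SI/I column, so they are inferred rather than quoted from the problem text:

```python
import math

# Data from the table; the absolute uncertainties S_I are inferred
# from the tabulated S_yi = S_I / I values (an assumption).
d  = [0.5, 0.995, 2.05, 2.75]        # thickness (mm)
I  = [9.2, 4.7, 0.81, 0.31]          # intensity (W/m^2)
SI = [0.9, 0.5, 0.08, 0.03]          # uncertainty in I (W/m^2)

# Linearize: y = ln(I), S_y = S_I / I, weights G_i = 1/S_yi^2.
x = d
y = [math.log(v) for v in I]
w = [(v / s) ** 2 for v, s in zip(I, SI)]

# Weighted sums appearing in the least-squares formulas (appendix G).
S0  = sum(w)
Sx  = sum(wi * xi for wi, xi in zip(w, x))
Sy  = sum(wi * yi for wi, yi in zip(w, y))
Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

delta = S0 * Sxx - Sx ** 2
a  = (S0 * Sxy - Sx * Sy) / delta    # slope (1/mm)
b  = (Sxx * Sy - Sx * Sxy) / delta   # intercept
Sa = math.sqrt(S0 / delta)           # uncertainty in the slope
Sb = math.sqrt(Sxx / delta)          # uncertainty in the intercept
```

Running this reproduces a ≈ −1.533 mm⁻¹, b ≈ 3.006, Sa ≈ 0.056 and Sb ≈ 0.102, identical to the hand calculation above.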

Method 2: Make use of Origin


Output Origin:

[Origin output: tables with the fitted slope, intercept and their standard errors.]

From these tables one reads off (not yet correctly rounded):

a = (−1.53277 ± 0.05584) mm−1


b = (3.00641 ± 0.10238)

From this follows (correctly rounded)

C = (1.53 ± 0.06) mm−1


I0 = (exp(b) ± exp(b) Sb) = (20 ± 2) W/m²

Problem 5.5.2.2

(a) Plotting En against 1/n² gives a straight line through the origin with slope C.
The grounds for this are:

• The uncertainties are present in En but not in n; therefore any transformation is applied to n (plotted along the x-axis), while En goes along the y-axis.

• The uncertainties in En are constant, so En itself is used along the y-axis rather than some function f(En).

(b) In a figure:

[Plot of En (eV) against 1/n²; 1/n² runs from 0.0 to 1.2, En from 0 to 16 eV.]

(c) The disadvantage is that the points will not be distributed evenly over the x-range of the graph.

(d) Drawing lines of worst fit:

[Plot of En (eV) against 1/n² with the two lines of worst fit y = 13.4 x and y = 14.0 x.]

Estimating from this:


C = (13.7 ± 0.3) eV
(e) (This part is allowed to be solved using Origin; the answer should be exactly the same)
Using the method of least-squares fitting for a line through the origin:

xi = 1/n²   yi = Ei (eV)   xi yi (eV)   xi²
1.0         13.7           13.7         1.0
0.25         3.7           0.925        0.0625
0.111        1.3           0.144        0.0123
0.0625       1.1           0.0688       0.0039
0.04         0.4           0.016        0.0016
                           Σxi yi = 14.85   Σxi² = 1.0804

Slope:

a = Σxi yi / Σxi² = 14.85 / 1.0804 = 13.74 eV

Uncertainty:

Sa = Syi / √(Σxi²) = 0.3 / √1.0804 = 0.29 eV

Conclusion:

C = (13.7 ± 0.3) eV
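The same through-origin fit with equal uncertainties can be sketched in a few lines of Python (an alternative check to Origin; the numbers are those of the table above):

```python
import math

# x_i = 1/n^2 and y_i = E_n (eV); all y-uncertainties equal S_y = 0.3 eV.
x  = [1.0 / k ** 2 for k in (1, 2, 3, 4, 5)]
y  = [13.7, 3.7, 1.3, 1.1, 0.4]
Sy = 0.3

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)

a  = sum_xy / sum_xx            # slope of y = a*x (eV)
Sa = Sy / math.sqrt(sum_xx)     # its uncertainty (eV)
```

This gives a ≈ 13.75 eV and Sa ≈ 0.29 eV, consistent with the hand calculation.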

1. The measurements are consistent because the values overlap within their uncertainties.

2. In a table:

n2 En (eV) Ci = n2 En (eV)
1 13.7 ± 0.3 13.7 ± 0.3
4 3.7 ± 0.3 14.8 ± 1.2
9 1.3 ± 0.3 11.7 ± 2.7
16 1.1 ± 0.3 17.6 ± 4.8
25 0.4 ± 0.3 10.0 ± 7.5

The uncertainties SCi are determined via

SCi = n2 SEn

Using the expressions for combining the results Ci , we find:


C = Σi Gi Ci / Σi Gi ,   with   Gi = 1/SCi²   and   SC = 1/√(Σi Gi)

⇒ C = (13.8 ± 0.3) eV
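The weighted combination can likewise be verified with a few lines of Python (a sketch; the Ci and SCi are taken from the table above):

```python
import math

C  = [13.7, 14.8, 11.7, 17.6, 10.0]   # C_i = n^2 * E_n (eV)
SC = [0.3, 1.2, 2.7, 4.8, 7.5]        # their uncertainties (eV)

G     = [1.0 / s ** 2 for s in SC]                 # weights G_i = 1/S_Ci^2
C_bar = sum(g * c for g, c in zip(G, C)) / sum(G)  # weighted mean
S_bar = 1.0 / math.sqrt(sum(G))                    # its uncertainty
```

This yields C ≈ 13.75 eV with SC ≈ 0.29 eV; the first point clearly dominates the weighted sum.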

3. The first measurement gives the most important contribution to C as it has the smallest uncertainty. In the graph this is the point that lies farthest to the right and that essentially determines the position of the line.
Problem 5.5.2.3

(a) The two unknowns in the second part are V0 and τ

(b) For the second part applies:

(V0/Vref) exp(−(ti − t0)/τ) = (Vb − Vi)/Vref

⇒ ln(V0/Vref) − (ti − t0)/τ = ln((Vb − Vi)/Vref)

Thus plotting ln((Vb − Vi)/Vref) against ti − t0 yields a straight line with slope a = −1/τ and intercept b = ln(V0/Vref).
And thus τ = −1/a and V0 = Vref exp(b).
 
(c) Because ln((Vb − Vi)/Vref) is plotted along the y-axis, and Vb is determined only once. An incorrect value of Vb propagates in the same way through all y-points of the linear regression and is therefore a systematic error.

(d) For large values of t the signal approaches the value Vb. Because of the noise in the measurement points it occasionally happens that Vi > Vb, so that (Vb − Vi)/Vref becomes negative. The logarithm then cannot be taken. From the figure it can be estimated that Vb ≃ 8 V. Below t = 41 n all Vi-values lie below Vb.

(e) SVi is the standard deviation of the separate measurements. SVb is the standard deviation of the mean, because Vb is determined from a series of measured values. This series of course exhibits the same spread as the separate measurements, thus

SVb = SVi / √N.

If one demands that SVb = 0.1 SVi then √N = 10 and thus N = 100.

(f)

Sy = |d ln((Vb − Vi)/Vref) / dVi| · SVi = [Vref/(Vb − Vi)] · (SVi/Vref) = SVi/(Vb − Vi)

(g) In this case what is minimized is

χ² = Σi Gi (yi − a xi − b)²

whereby the weights Gi are given by

Gi = 1/Syi² = ((Vb − Vi)/SVi)².

Thus minimized is

χ² = Σi ((Vb − Vi)/SVi)² · [ ln((Vb − Vi)/Vref) − a(ti − t0) − b ]².
(h) All the terms in the expressions for both the slope a and the intercept b contain factors 1/Si², where Si is given by SVi/(Vb − Vi) (see (f)). Because SVi is constant, the factor 1/SVi² can be taken outside all the summations and eliminated by division, so SVi cancels everywhere. This of course does not apply to the uncertainties Sa and Sb: there the numerator contains a factor 1/SV² and the denominator a factor 1/SV⁴ (both under the square root), hence Sa and Sb are both proportional to SV. Conclusion: we can determine the best fitting straight line, but not the uncertainties.
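This cancellation can be demonstrated numerically. The sketch below (with made-up data; the values and the two guessed uncertainties are hypothetical, not taken from the problem) fits the same points twice with different constant y-uncertainties: a and b come out identical, while Sa scales with the assumed uncertainty.

```python
import math

def weighted_line_fit(x, y, S):
    """Weighted least-squares fit y = a*x + b; returns a, b, S_a, S_b."""
    w = [1.0 / s ** 2 for s in S]
    S0  = sum(w)
    Sx  = sum(wi * xi for wi, xi in zip(w, x))
    Sy  = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = S0 * Sxx - Sx ** 2
    a = (S0 * Sxy - Sx * Sy) / delta
    b = (Sxx * Sy - Sx * Sxy) / delta
    return a, b, math.sqrt(S0 / delta), math.sqrt(Sxx / delta)

# Hypothetical data with the same (unknown, constant) y-uncertainty.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 1.9, 4.2, 5.8, 8.1]

a1, b1, Sa1, _ = weighted_line_fit(x, y, [1.0] * 5)   # guess S_V = 1
a2, b2, Sa2, _ = weighted_line_fit(x, y, [5.0] * 5)   # guess S_V = 5

# a and b are unchanged; S_a scales linearly with the assumed S_V.
```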

Problem 5.5.2.4

(a)

⟨n⟩ = Σ(n=0→∞) n P(n) = Σ(n=0→∞) n exp(−µ) µⁿ/n!
    = exp(−µ) Σ(n=1→∞) n µⁿ/n!
    = exp(−µ) Σ(n=1→∞) µⁿ/(n − 1)!
    = µ exp(−µ) Σ(n=1→∞) µⁿ⁻¹/(n − 1)!
    = µ exp(−µ) Σ(n=0→∞) µⁿ/n! = µ exp(−µ) exp(µ) = µ

(b) σ = √(var(n)) = √µ
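Both results can be verified numerically by truncating the sums (a sketch; µ = 3.7 is an arbitrary test value, and 100 terms are far more than enough because the terms decay factorially):

```python
import math

mu = 3.7   # arbitrary test value for the Poisson parameter

# P(n) for n = 0..100; the truncated tail is utterly negligible here.
P = [math.exp(-mu) * mu ** n / math.factorial(n) for n in range(101)]

norm = sum(P)                                              # should be ~1
mean = sum(n * p for n, p in enumerate(P))                 # should be ~mu
var  = sum((n - mean) ** 2 * p for n, p in enumerate(P))   # should be ~mu
```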

(c) Determined using

S = √[ (1/(N − 1)) Σ(i=1→N) (ni − n̄)² ]

(d)

Sn̄ = S/√N = √(µ/N)

(e) We know then that S ≈ √n̄. Because the relative uncertainty in S is constant we must plot ln(S):

ln(S) ≈ ½ ln(n̄).

Thus plotting ln(S) against ln(n̄) gives a straight line through the origin with slope 0.5.
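This prediction can also be illustrated by simulation. The sketch below draws N = 30 Poisson-distributed counts for five rates roughly matching the n̄-values of the table (the true count rates are not given in the problem, so these µ are assumptions), computes n̄ and S for each, and fits ln(S) against ln(n̄) through the origin; the slope comes out close to 0.5:

```python
import math
import random

def poisson_sample(mu, rng):
    # Count unit-rate exponential arrivals in [0, mu]; their number is
    # Poisson(mu). This avoids the underflow of exp(-mu) for large mu.
    k, t = 0, 0.0
    while True:
        t += -math.log(1.0 - rng.random())   # next inter-arrival time
        if t > mu:
            return k
        k += 1

rng = random.Random(42)
N = 30                              # measurements per rate, as in the problem
rates = [71, 145, 217, 293, 357]    # assumed rates (approx. the table's n-bar)

lx, ly = [], []
for mu in rates:
    sample = [poisson_sample(mu, rng) for _ in range(N)]
    m = sum(sample) / N
    S = math.sqrt(sum((v - m) ** 2 for v in sample) / (N - 1))
    lx.append(math.log(m))
    ly.append(math.log(S))

# Through-origin fit of ln(S) = slope * ln(n-bar); expect slope near 0.5.
slope = sum(a * b for a, b in zip(lx, ly)) / sum(a * a for a in lx)
```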

(f) Yes, because it is a linear relation of the form y = ax. There is only a single unknown, the slope (which should be 0.5).

(g) On the x-axis:

ln(n̄), with uncertainty Sx = Sln(n̄) = Sn̄/n̄ ≈ 1/√(N n̄)

On the y-axis:

ln(S), with uncertainty Sy = Sln(S) = SS/S = 1/√(2(N − 1)) = 1/√58 = 0.13.

(h) (This section is allowed to be determined using Origin; the answer should be the same)
In a table:

x = ln(n̄)   Sx = 1/√(N n̄)   y = ln(S)   Sy
4.263       0.022            2.13       0.13
4.977       0.015            2.42       0.13
5.380       0.012            2.59       0.13
5.680       0.011            2.80       0.13
5.878       0.010            2.97       0.13

In a figure:

[Plot of ln(S) against ln(n̄); ln(n̄) runs from 4.0 to 6.0, ln(S) from 2.0 to 3.2.]

(i) In a table:

xi      yi      xi yi     xi²
4.263   2.13     9.080    18.173
4.977   2.42    12.044    24.771
5.380   2.59    13.934    28.944
5.680   2.80    15.904    32.262
5.878   2.97    17.458    34.551
                Σ = 68.420   Σ = 138.701

The slope therefore is

a = Σxi yi / Σxi² = 68.420 / 138.701 = 0.493
The uncertainty therein:

Sa = Sy / √(Σxi²) = 0.13 / √138.701 = 0.011

The final answer therefore is


a = (0.49 ± 0.01)

(j) The theoretical slope is 0.5. Hence the experiment and theory are in agreement because
the theoretical value lies within the uncertainty range of the measurement.

Problem 5.5.2.5
(This section is allowed to be determined using Origin; the answer should be the same)

(a) tan α = (H − H0)/D, hence plot tan α against 1/D. This gives a straight line through the origin with slope H − H0. Alternatively 1/tan α can be plotted against D; this gives a straight line through the origin with slope 1/(H − H0). Along the x-axis goes 1/D or D because their uncertainties are negligible. Along the y-axis goes tan α or 1/tan α. In both cases the uncertainty is not constant, so it won't make a difference.

(b) The line goes through the origin and the uncertainties along the y-axis are not constant
and thus the latter method should be made use of: straight line through the origin with
different uncertainties.

(c) If tan α is plotted against 1/D, the height H sought is equal to a + H0 and the uncertainty SH in H is equal to the uncertainty Sa in the slope (because H0 does not have an uncertainty).
If 1/tan α is plotted against D, the height H sought is equal to H0 + 1/a and the uncertainty SH in H is equal to Sa/a² (because SH = |dH/da| Sa = Sa/a²).
da
(d) If tan α is plotted against 1/D:

i   1/Di (m⁻¹)   tan αi   S_tan αi
1   0.020        1.963    0.17
2   0.010        0.869    0.061
3   0.0067       0.625    0.049
4   0.0050       0.488    0.043
5   0.0040       0.364    0.040

[Plot of tan α against 1/D; 1/D runs from 0.000 to 0.025 m⁻¹, tan α from 0.0 to 2.2.]

If 1/tan α is plotted against D:

i   Di (m)   1/tan αi   S_1/tan αi
1   50       0.510      0.044
2   100      1.150      0.081
3   150      1.60       0.12
4   200      2.05       0.18
5   250      2.75       0.30

[Plot of 1/tan α against D; D runs from 0 to 300 m, 1/tan α from 0.0 to 3.0.]
(e) Drawing two lines of worst fit:

[Left: tan α against 1/D (m⁻¹) with worst-fit lines y = 88.2 x and y = 96.2 x. Right: 1/tan α against D (m) with worst-fit lines y = 0.0104 x and y = 0.0113 x.]

Estimation from this gives:

H = (94 ± 4) m
(f) Method 1: tan α against 1/D:

i   xi = 1/Di   yi = tan αi   Si = Sαi/cos² αi   xi²/Si²      xi yi/Si²
1   0.02        1.96          0.17               0.01384      1.3564
2   0.01        0.869         0.061              0.02687      2.3354
3   0.0067      0.625         0.049              0.01870      1.7441
4   0.0050      0.488         0.043              0.01352      1.3196
5   0.0040      0.364         0.040              0.01000      0.9100
                                                 Σ = 0.08293  Σ = 7.6655

H = a + H0 = 7.6655/0.08293 + 1.8 = 94.23 m

SH = Sa = 1/√0.08293 = 3.5 m

Hence

H = (94 ± 4) m
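Method 1 can again be cross-checked with a short Python sketch (weighted through-origin fit; the numbers are those of the table above, and H0 is taken from the problem as exact):

```python
import math

# x = 1/D (1/m), y = tan(alpha), S = S_alpha / cos^2(alpha), from the table.
x = [0.020, 0.010, 0.0067, 0.0050, 0.0040]
y = [1.96, 0.869, 0.625, 0.488, 0.364]
S = [0.17, 0.061, 0.049, 0.043, 0.040]

w = [1.0 / s ** 2 for s in S]
sum_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
sum_wxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

a  = sum_wxy / sum_wxx            # slope = H - H0 (m)
Sa = 1.0 / math.sqrt(sum_wxx)     # uncertainty in the slope (m)

H0 = 1.8                          # given without uncertainty (m)
H  = a + H0
SH = Sa
```

This reproduces H ≈ 94.2 m with SH ≈ 3.5 m.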
Method 2: 1/tan α against D:

i   xi = Di   yi = 1/tan αi   Si = Sαi/sin² αi   xi²/Si²         xi yi/Si²
1   50        0.510           0.044              1.291·10⁶       13171
2   100       1.150           0.081              1.524·10⁶       17528
3   150       1.600           0.124              1.463·10⁶       15609
4   200       2.05            0.18               1.234·10⁶       12654
5   250       2.75            0.30               0.694·10⁶        7639
                                                 Σ = 6.208·10⁶   Σ = 66601

a = 66601 / (6.208·10⁶) = 0.01073 m⁻¹ ⇒ H = H0 + 1/a = 95.01 m

Sa = 1/√(6.208·10⁶) = 4.0·10⁻⁴ m⁻¹ ⇒ SH = Sa/a² = 3.5 m

Hence

H = (95 ± 4) m

