Introduction to Laboratory Skills (ILS)
Lecture Notes
Course code 31ILS
August 2023
Introduction
The course ’Introduction to Laboratory Skills (ILS, course code 31ILS)’ consists of two
parts: a theoretical part (lectures) and a part with practical exercises. The practical
exercises consist of an experimental part with six experiments in which the obtained
theoretical knowledge has to be applied, and an introduction to some important computer
skills. More information can be found in the study guide. These lecture notes will serve
as a manual for the theoretical part.
Besides learning how to deal with uncertainties and errors in measurement results,
the curriculum also covers a simple introduction to probability theory and analyzing
measurement results (fitting). At the end of each chapter there is a section called 'Exercises and problems'. The exercises are meant to elucidate the course material; the problems are more challenging and at exam level.
Knowledge of the contents of this course is a requirement to be able to perform exper-
iments on a sufficient level in the follow-up courses.
Recommended literature: P.R. Bevington & D.K. Robinson, "Data Reduction and Error Analysis for the Physical Sciences".
Table of contents

2.2.10 Examples
2.3 Dependent uncertainties
2.4 Exercises and problems
2.4.1 Questions
2.4.2 Problems

4 Distribution functions
4.1 Introduction
4.2 Discrete and continuous distributions
4.3 The binomial distribution
4.4 The Poisson distribution
4.4.1 The Poisson distribution in a different perspective
4.5 The normal, or Gaussian distribution
4.5.1 The reduced normal distribution
4.6 Comparison of the three distribution functions
4.7 Exercises and problems
4.7.1 Questions
4.7.2 Problems
1. Analyzing measurement results
1.1 Introduction
When the value of a physical quantity is determined with an experiment, the accuracy
with which this happens is of great importance. Such an experiment (measurement) often endeavors to test a (physical) model in which the value of this quantity is predicted, or to verify other measurements. Another possibility is that the obtained value of the physical quantity is used in additional experiments or calculations.
For all of these cases it is key to know how accurate the physical quantity has been
determined, or equivalently, what the uncertainty is in the measured value. Since
it’s practically impossible to determine a value of a quantity with infinite precision,
there will always be a discrepancy between the real value, which we obviously don’t
know, and the experimentally measured value. When we are able to give the upper
bound of the size of the difference between the real and the measured value, we have
automatically given an indication of the accuracy of the measurement. It is often
difficult to determine this upper bound; sometimes we will even have to be content
with making an estimation. However, it is utterly absurd to give an experimentally
obtained value without mentioning its uncertainty.
As an example, we can look at the determination of the density ρ of an object. The
objective of this experiment is to see if this object is made of 18-karat gold (ρ = 15.5
g/cm3 ) or a cheap alloy (ρ = 13.8 g/cm3 ). The mass m of the object is measured with
a scale and equals 1238 g. The volume is determined by fully submerging the object
in water. From the rise of the water level ∆h and the surface area S, the volume can
be calculated. The area S is determined to be S = 165.13 cm2 and the water level
is measured with a ruler and equals 25 cm and 25.5 cm before and after submerging,
respectively. The density can now be calculated with
\[ \rho = \frac{m}{S\,\Delta h}. \qquad (1.1) \]
Inserting the measured values gives ρ = 14.994 g/cm³. We can now straightforwardly conclude: the result is much closer to 15.5 g/cm³ than to 13.8 g/cm³, so the object is made of gold.
However, we have failed to mention the uncertainty with which ρ is determined. Let
us make a rough estimate. It’s quite realistic to assume that the reading error from
the ruler is 0.5 mm. Consequently, the water level before submersion must have been
between 24.95 cm and 25.05 cm. After submersion the level must have been between
25.45 and 25.55 cm. The rise of the water level ∆h is then 0.4 cm in the smallest case (25.45−25.05 cm) and 0.6 cm in the largest case (25.55−24.95 cm). Had we used the latter value in our calculations, we would have found ρ = 12.49 g/cm³, and had we used ∆h = 0.4 cm, we would have found ρ = 18.74 g/cm³. Therefore, the real value
of ρ lies between these two values, which means that from these measurements it is
impossible to conclude whether or not this object is made of gold.
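The arithmetic above is easy to reproduce numerically. Below is a minimal Python sketch (not part of the original notes) that computes the extreme values of ρ from the stated reading error; all numbers are taken from the example.

```python
m = 1238.0    # mass in g
S = 165.13    # surface area in cm^2

# Extreme water levels given a reading error of 0.05 cm per reading
dh_min = 25.45 - 25.05   # smallest possible rise, 0.40 cm
dh_max = 25.55 - 24.95   # largest possible rise, 0.60 cm

rho_max = m / (S * dh_min)   # a small dh gives a large density
rho_min = m / (S * dh_max)   # a large dh gives a small density
print(f"rho lies between {rho_min:.1f} and {rho_max:.1f} g/cm^3")
# Both candidate densities (13.8 and 15.5 g/cm^3) fall inside this range,
# so the measurement cannot distinguish the two materials.
```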
This example points out what a major role the uncertainty of a measurement plays in
an experiment. We even conveniently neglected the fact that the mass and the surface
area of the water were also subject to measurement errors which add to the uncertainty
in the final result. The example not only shows that it is necessary to calculate the
uncertainty after the experiment, but also that it can be very useful to estimate the
uncertainty beforehand. The reading error of the ruler put a spanner in the works for
us and we could have known this beforehand. Generally, the methods and equipment
that are required to obtain an acceptable amount of uncertainty are determined before
the experiment. Afterwards there should of course be a thorough analysis of whether
or not the uncertainty criteria were actually met.
In this course we will examine all of this very carefully. We will get acquainted with
various types of errors and uncertainties and we will be introduced to a set of basic
mathematical rules to calculate uncertainties in our measurement results. In the former
example, the situation gets more complicated when the uncertainties in m and in S are
included in the calculation. We will also see how different measurement results (of the
same quantity) can be combined and how you can present your results, for example in
a graph.
Before we will examine the various types of errors and their causes, we should first
pay attention to the terms ’measurement error’ and ’measurement uncertainty’. A
measurement error is the difference between an (experimentally) obtained value of a
(physical) quantity and the real value of that quantity. Generally, this error is not known, because if it were, we could calculate the real value from the experimentally obtained value. In some cases there are measurement errors that we do know and
thus can correct for. A more detailed description of that topic is given in the next
section. A measurement uncertainty is the maximum value that this measurement error can have, or a value below which the measurement error will lie with a certain probability. The measurement uncertainty is generally determined from the
used measurement methods and the used equipment. As stated earlier, this can be quite
challenging in some cases and sometimes an estimation is the best available option.
The terms measurement error and uncertainty are used interchangeably quite often.
Although there is a fundamental difference in the meaning of these words, we will not
be too strict in using these terms.
1.2 Measurement errors and their causes
To distinguish between various sources of measurement errors, we will first consider the
general setup of a physical experiment. This is schematically depicted in figure 1.1.
[Figure 1.1: General setup of a physical experiment, with as elements the physical system, the measurement system, the model/theory, the observer, and the environment.]
1. The physical system
(b) The measured quantity is subject to fundamental fluctuations (noise).
- The size of any electrical resistance fluctuates slightly in time. In very accurate experiments this can play a role in the uncertainty.
2. The measurement system
(a) Calibration errors: it is possible that measurement instruments are not calibrated properly.
- The speedometer in a car usually does not give the correct value for your
actual speed.
(b) Limited reading accuracy from measurement equipment.
- A caliper can only be read with an accuracy of 0.05 mm. When a higher
accuracy is required, another instrument needs to be used (e.g. a microme-
ter).
(c) Bad equipment.
- For example, a small gap between two gears, friction in the bearings, or play (backlash) in the instrument will inevitably lead to faulty measurement results.
(d) The sensitivity of the instrument varies in time.
- Lasers have to warm up before they meet their specifications.
(e) Fluctuations in the measurement equipment
- Especially electronic instrumentation can exhibit fluctuations. For example,
an amplifier will always be subject to noise.
(f) The measurement system affects the quantity under investigation (physical
system).
- When a voltmeter with a small input impedance is connected to a circuit,
this will often result in a faulty measured value.
- When the temperature of a small amount of liquid is measured with a
thermometer that has a significantly different temperature (too cold/too
hot), this will also result in a measurement error.
3. The observer
4. The environment
(a) Influences of the environment affecting the physical system, the measurement
system or the observer.
- Magnetic or electric fields can disturb a measurement (e.g. an elevator in
a building).
- Vibrations in the building can cause errors.
- Fluctuations in temperature.
• Similarly, systematic errors can be detected by using other instruments and com-
paring the results.
• In numerous cases it is also possible to use a reference measurement of an already
known quantity. For example, if we have to determine the wavelength of the
light emitted by a gas-discharge lamp containing an unknown gas, we could first
perform measurements on a lamp with a known gas that emits light at a known
wavelength.
1.4 Confidence intervals
In section 1.1 we have already seen that it is rather pointless to talk about the outcome
of an experiment without talking about the reliability of the result or, in other words, its
uncertainty. We can do this by providing each result with a so called confidence interval.
We will distinguish between two kinds of confidence intervals, the 100% confidence
interval (also called the maximum error) and the 68% confidence interval (the standard
deviation). A 100% confidence interval denotes the interval in which the real value lies
with a probability of (almost) 100%. A 68% interval denotes the interval in which the
real value lies with a probability of 68%. This means that there still is a significant
probability that the real value is outside of this interval. In the following sections, we
will learn when to use these intervals and how we should use them.
Summarizing: We use the maximum error as uncertainty when we perform a single
measurement OR when each repetition of the measurement gives the exact same result.
We denote the result of a measurement of a quantity p with a maximum error ∆p as
p ± ∆p .
1. The uncertainty in a final result is rounded to one significant figure.
2. For intermediate results, two significant figures can be taken into account.
3. The least significant figure in a result has to be at the same decimal position as that of the uncertainty.
4. Units have to be mentioned and both the result and the uncertainty should obvi-
ously have the same unit.
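These rules are mechanical enough to automate. The following Python helper is a sketch of our own (not part of the course material) that applies rules 1 and 3: it rounds the uncertainty to one significant figure and aligns the result to the same decimal position.

```python
import math

def format_result(value: float, uncertainty: float) -> str:
    """Round the uncertainty to one significant figure and align the
    value to the same decimal position (rules 1 and 3 above)."""
    exponent = math.floor(math.log10(uncertainty))  # first significant digit
    unc = round(uncertainty, -exponent)
    if unc >= 10 ** (exponent + 1):   # e.g. 0.096 rounds up to 0.1
        exponent += 1
    digits = max(0, -exponent)
    return f"({round(value, -exponent):.{digits}f} ± {unc:.{digits}f})"

print(format_result(6.04, 0.05429), "m")   # (6.04 ± 0.05) m
print(format_result(20.5432, 1.0), "s")    # (21 ± 1) s
```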
1.6 Consistency of measurement results
Uncertainties in measurement results have to be stated to give an idea of the accuracy with which the result has been determined, and to be able to, for example, compare it with the result obtained with other measurement methods or other equipment.
When 100% intervals are used, it is self-evident that multiple measurements of the
same quantity are consistent if their confidence intervals overlap. After all, we know
with 100% certainty that the real value will have to lie within the given intervals. We
then also know that the real value has to lie within the region of overlap of two 100%
intervals.
When 68% intervals are used, the situation gets slightly more complicated. Because
there is a 32% probability that the real value is not in the given confidence interval,
there is still a reasonable probability that, when no overlap is found, the results are in
fact consistent. However, it should be noted that this is only the case when the obtained intervals are in each other's vicinity. The probability of finding two consistent results that are far apart is negligible. Conversely, it is also possible that when there is an overlap, the overlapping region does not contain the actual value. The consistency is then only apparent. In this case we will have to be careful with our conclusions. In
the case of the 68% intervals, there is no guarantee that the real value is in the region
of overlap. We will come back to this later in chapter 5.
1.8 Exercises and problems
1.8.1 Questions
1. Look again at the examples of the measurement uncertainties of §1.2. Which of
these uncertainties are random and which are systematic?
3. Can a steel ruler with a scale division of 0.5 mm be used to measure a distance
of 2 cm with 1% relative precision? And if the ruler has a vernier scale, readable
at 0.05 mm?
4. An electronic voltmeter has a reading accuracy of 0.1 Volt. What is the smallest
voltage that can be measured with a relative accuracy of 5%? And with a precision
of 1%?
5. Write the following measurement results for a height h, a time interval t, a charge
Q, a wavelength λ and a momentum p in a clearer form with the correct number
of significant digits:
h ± ∆h = (6.04 ± 0.05429) m
t ± ∆t = (20.5432 ± 1) s
Q ± ∆Q = (−3.21·10⁻¹⁹ ± 2.67·10⁻²⁰) C
λ ± ∆λ = (0.000000673 ± 0.00000007) m
p ± ∆p = (4.367·10³ ± 42) g·m/s
2. 100% Confidence intervals
[Figure 2.1: The function f near the point x₁. Over the interval from x₁ to x₁ + ∆x₁ the function value changes from x = f(x₁) to x + ∆x = f(x₁ + ∆x₁).]
We have a function f which has the value x at the point x1 and we’ll investigate what
the value of this function will be in the vicinity of this point x1 . This might look similar
to what is happening in figure 2.1. If we assume that the piece of the curve between
the points x1 and x1 + ∆x1 is a straight line, which is a fair assumption when ∆x1 is
small enough, we can express the slope of the curve as

\[ f'(x_1) \approx \frac{f(x_1 + \Delta x_1) - f(x_1)}{\Delta x_1} = \frac{\Delta x}{\Delta x_1}. \]

This expression is exact in the limit ∆x₁ → 0 (as this is the definition of the derivative f′(x₁)). From the expression above it follows that for small ∆x₁ we can write

\[ \Delta x \approx \Delta x_1\, f'(x_1). \qquad (2.3) \]
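A quick numerical illustration of this linear approximation (our own example, not from the notes), using f(x) = x² at x₁ = 2 where f′(x₁) = 4:

```python
# The true change of f and the straight-line estimate agree better and
# better as dx1 shrinks, as claimed above.
f = lambda x: x**2
x1, fprime = 2.0, 4.0

for dx1 in (0.1, 0.01, 0.001):
    exact = f(x1 + dx1) - f(x1)   # true change of the function
    linear = fprime * dx1         # linear (tangent-line) estimate
    print(f"dx1={dx1}: exact={exact:.6f}, linear={linear:.6f}")
```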
[Figure 2.2: A small piece of the plane x = f(x₁, x₂), with corner points ① at f(x₁, x₂) and ④ at f(x₁ + ∆x₁, x₂ + ∆x₂); the height differences ∆xa and ∆xb along the edges via points ② and ③ add up to ∆x.]
Analogous to the one-dimensional case we now want to find an expression for f (x1 +
∆x1 , x2 + ∆x2 ). Now the function f (x1 , x2 ) does not describe a line anymore as in
the one-dimensional case, but it describes a plane in three-dimensional space. A small
piece of such a plane is drawn in figure 2.2. In this plane, point ① is the point for which x = f(x₁, x₂) holds and point ④ is the point for which the value of the function equals x + ∆x = f(x₁ + ∆x₁, x₂ + ∆x₂). We want to determine the height difference ∆x between these two points. To go from point ① to point ④ we can go along the edge via point ②, but we can also go along the other edge and pass point ③. We are
going to assume, analogous to the one-dimensional case, that ∆x1 and ∆x2 will both
be so small that the small piece of the plane is flat (in figure 2.2 this is not quite the
case). In this case it doesn’t matter which of the two paths we follow: both edges are
identical. We will go along point ② in the upward direction. The trajectory from point ① to point ② is a one-dimensional function, since x₂ remains unchanged for the whole trajectory and can therefore be assumed constant. The height difference ∆xa between the points ① and ② can be calculated using equation 2.3:
∆xa ≈ ∆x1 f ′ (x1 ), (2.4)
in which f′(x₁) is the slope of the edge of the small piece of the plane between the points ① and ②. We will call this the partial derivative of the function f with respect to the variable x₁. We do not denote this with f′(x₁), but with ∂f/∂x₁. This partial derivative is defined as

\[ \frac{\partial f}{\partial x_1}(x_1, x_2) = \lim_{dx_1 \to 0} \frac{f(x_1 + dx_1, x_2) - f(x_1, x_2)}{dx_1}. \qquad (2.5) \]

Note that here x₂ is taken as a constant. However, this partial derivative ∂f/∂x₁ is still different for each x₂ and therefore is still a function of x₂. The second part of the trajectory takes us from point ② to point ④. The height difference between these points can be calculated in a similar way, but now x₁ is constant for this part of the trajectory and we can write

\[ \Delta x_b \approx \Delta x_2\, \frac{\partial f}{\partial x_2}, \qquad (2.6) \]

in which ∂f/∂x₂ is the partial derivative of the function f with respect to the variable x₂, defined as

\[ \frac{\partial f}{\partial x_2}(x_1, x_2) = \lim_{dx_2 \to 0} \frac{f(x_1, x_2 + dx_2) - f(x_1, x_2)}{dx_2}. \qquad (2.7) \]
Note that with partial derivatives we pretend that the function only depends on one
variable and that the other variables are constant.
– f(x, y) = x + y ⇒ ∂f/∂x = 1, ∂f/∂y = 1
– f(x, y) = sin(x + y) ⇒ ∂f/∂x = cos(x + y), ∂f/∂y = cos(x + y)
– f(x, y) = xy ⇒ ∂f/∂x = y, ∂f/∂y = x
– f(x, y) = sin(xy) ⇒ ∂f/∂x = y cos(xy), ∂f/∂y = x cos(xy)
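These derivatives can be verified mechanically; below is a small sketch of our own (assuming the sympy library is available):

```python
import sympy as sp

x, y = sp.symbols('x y')
# Print each function together with its partial derivatives
for f in (x + y, sp.sin(x + y), x * y, sp.sin(x * y)):
    print(f, '->', sp.diff(f, x), ',', sp.diff(f, y))
# x + y      -> 1 , 1
# sin(x + y) -> cos(x + y) , cos(x + y)
# x*y        -> y , x
# sin(x*y)   -> y*cos(x*y) , x*cos(x*y)
```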
2.2.1 The general case: x = f(x[1], x[2], ..., x[n])

The measured quantities are x[1] ± ∆x[1], x[2] ± ∆x[2], ..., x[n] ± ∆x[n]. Using equation (2.11), we see that

\[ \Delta x = \sum_{i=1}^{n} \left| \frac{\partial f}{\partial x^{[i]}} \right| \Delta x^{[i]}. \qquad (2.16) \]

This is the most general formula. With this, all formulas in the next section can be derived. Check this yourself.
2.2.2 The sum of measured quantities: x = x[1] + x[2]

The upper limit of x is given by the sum of the upper limits of x[1] and x[2]:

\[ x + \Delta x = \left( x^{[1]} + \Delta x^{[1]} \right) + \left( x^{[2]} + \Delta x^{[2]} \right), \]

so that the absolute uncertainties add:

\[ \Delta x = \Delta x^{[1]} + \Delta x^{[2]}. \qquad (2.19) \]
2.2.3 The difference of measured quantities: x = x[1] − x[2]

Now x is at its maximum when x[1] is at its maximum and x[2] at its minimum, so

\[ x + \Delta x = \left( x^{[1]} + \Delta x^{[1]} \right) - \left( x^{[2]} - \Delta x^{[2]} \right), \]

and again the absolute uncertainties add:

\[ \Delta x = \Delta x^{[1]} + \Delta x^{[2]}. \qquad (2.22) \]
2.2.4 Multiplication by a constant: x = C x[1]

In this case (C positive) ∆x = C ∆x[1]. When C is negative, x is at its maximum when x[1] is at its minimum, so

\[ x + \Delta x = C\left( x^{[1]} - \Delta x^{[1]} \right) = C x^{[1]} - C\,\Delta x^{[1]}, \qquad (2.25) \]

and x is at its minimum when x[1] is at its maximum, so that in both cases

\[ \Delta x = |C|\, \Delta x^{[1]}. \]
2.2.5 x = C₁ x[1] + C₂ x[2] + ... + Cₙ x[n] = Σᵢ₌₁ⁿ Cᵢ x[i]
This case is a combination of §2.2.2 and §2.2.4. The result can easily be guessed:

\[ \Delta x = \sum_{i=1}^{n} |C_i|\, \Delta x^{[i]}. \qquad (2.28) \]
2.2.6 The product of measured quantities: x = x[1] · x[2]
Here we should also distinguish between positive and negative values, but now those of
x[1] and x[2] . Let’s assume both are positive for now. The same reasoning as with the
sum of two measured quantities can be used: x is at its maximum when both x[1] and
x[2] are at their maximum, so
\[ x + \Delta x = \left( x^{[1]} + \Delta x^{[1]} \right)\left( x^{[2]} + \Delta x^{[2]} \right) = x^{[1]} x^{[2]} \left( 1 + \frac{\Delta x^{[1]}}{x^{[1]}} + \frac{\Delta x^{[2]}}{x^{[2]}} + \frac{\Delta x^{[1]}}{x^{[1]}} \frac{\Delta x^{[2]}}{x^{[2]}} \right) \qquad (2.29) \]

and x is at its minimum when both x[1] and x[2] are at their minimum:

\[ x - \Delta x = \left( x^{[1]} - \Delta x^{[1]} \right)\left( x^{[2]} - \Delta x^{[2]} \right) = x^{[1]} x^{[2]} \left( 1 - \frac{\Delta x^{[1]}}{x^{[1]}} - \frac{\Delta x^{[2]}}{x^{[2]}} + \frac{\Delta x^{[1]}}{x^{[1]}} \frac{\Delta x^{[2]}}{x^{[2]}} \right). \qquad (2.30) \]

When we divide the left and right part of equation (2.29) by x (= x[1] x[2]), we find

\[ 1 + \frac{\Delta x}{x} = 1 + \frac{\Delta x^{[1]}}{x^{[1]}} + \frac{\Delta x^{[2]}}{x^{[2]}} + \frac{\Delta x^{[1]}}{x^{[1]}} \frac{\Delta x^{[2]}}{x^{[2]}}. \qquad (2.31) \]
If all relative uncertainties in the measured quantities are small, i.e. ∆x[1]/x[1] and ∆x[2]/x[2] are small (compared to 1), then the product of both is even smaller. When for example the relative uncertainties are measured to be 5% (so ∆x[1]/x[1] = ∆x[2]/x[2] = 0.05), then the term (∆x[1]/x[1])·(∆x[2]/x[2]) = 0.0025. This means that this term is negligible compared to ∆x[1]/x[1] and ∆x[2]/x[2].
We find that
\[ 1 + \frac{\Delta x}{x} \approx 1 + \frac{\Delta x^{[1]}}{x^{[1]}} + \frac{\Delta x^{[2]}}{x^{[2]}}, \qquad (2.32) \]

from which follows that

\[ \frac{\Delta x}{x} \approx \frac{\Delta x^{[1]}}{x^{[1]}} + \frac{\Delta x^{[2]}}{x^{[2]}}. \qquad (2.33) \]
From the expression for x − ∆x we find the same via

\[ 1 - \frac{\Delta x}{x} = 1 - \frac{\Delta x^{[1]}}{x^{[1]}} - \frac{\Delta x^{[2]}}{x^{[2]}} + \frac{\Delta x^{[1]}}{x^{[1]}} \frac{\Delta x^{[2]}}{x^{[2]}} \approx 1 - \frac{\Delta x^{[1]}}{x^{[1]}} - \frac{\Delta x^{[2]}}{x^{[2]}} \qquad (2.34) \]

(again with small relative uncertainties), from which again follows that

\[ \frac{\Delta x}{x} \approx \frac{\Delta x^{[1]}}{x^{[1]}} + \frac{\Delta x^{[2]}}{x^{[2]}}. \qquad (2.35) \]
Neglecting the term (∆x[1]/x[1])·(∆x[2]/x[2]) is only allowed when ∆x[1]/x[1] ≪ 1 and ∆x[2]/x[2] ≪ 1, so when ∆x[1] ≪ x[1] and ∆x[2] ≪ x[2].
The absolute uncertainties in the measured quantities have to be much smaller than the quantities themselves!
If now one (or both) of the measured quantities has a negative value, the calculation
gets a little bit more complicated (just as in the case when x = C x[1] with C < 0). Try
to derive by yourself that the general formula looks like the following:
\[ \frac{\Delta x}{|x|} = \frac{\Delta x^{[1]}}{|x^{[1]}|} + \frac{\Delta x^{[2]}}{|x^{[2]}|}. \qquad (2.36) \]
2.2.7 The ratio of measured quantities: x = x[1] /x[2]
Again, this case is slightly more difficult. Like in the last case, we will first consider
positive values of x[1] and x[2] for simplicity; we’ll generalize this later. The quantity x
to be calculated is at its maximum when x[1] is at its maximum, but when x[2] is at its
minimum (the smaller the denominator, the larger the fraction), so
\[ x + \Delta x = \frac{x^{[1]} + \Delta x^{[1]}}{x^{[2]} - \Delta x^{[2]}} = \frac{x^{[1]}}{x^{[2]}} \cdot \frac{1 + \Delta x^{[1]}/x^{[1]}}{1 - \Delta x^{[2]}/x^{[2]}}. \qquad (2.37) \]
Intermezzo: We will first show that 1/(1−δ) ≈ 1 + δ and that 1/(1+δ) ≈ 1 − δ if δ is sufficiently small, i.e. δ ≪ 1. This is relatively simple to understand when we note that

\[ (1 + \delta)(1 - \delta) = 1 + \delta - \delta - \delta^2 = 1 - \delta^2 \approx 1. \qquad (2.38) \]

This of course only works when δ² ≪ 1. By dividing the left and right part of the equation by 1 + δ, we find

\[ 1 - \delta \approx \frac{1}{1 + \delta}. \qquad (2.39) \]

By not dividing by 1 + δ, but by 1 − δ, we find that

\[ 1 + \delta \approx \frac{1}{1 - \delta}. \qquad (2.40) \]

With this the proof is given. When δ = 0.1, the result is still 1% accurate (check this yourself).
We are going to use the approximation above to work out equation (2.37). If we call ∆x[2]/x[2] = δ, the factor 1/(1−δ) will appear in our equation. So, using the rules we found above, we can approximate this by 1 + δ = 1 + ∆x[2]/x[2]. Therefore we find

\[ x + \Delta x \approx \frac{x^{[1]}}{x^{[2]}} \left( 1 + \frac{\Delta x^{[1]}}{x^{[1]}} \right)\left( 1 + \frac{\Delta x^{[2]}}{x^{[2]}} \right) \approx \frac{x^{[1]}}{x^{[2]}} \left( 1 + \frac{\Delta x^{[1]}}{x^{[1]}} + \frac{\Delta x^{[2]}}{x^{[2]}} \right), \qquad (2.41) \]

where we again have neglected the term (∆x[1]/x[1])·(∆x[2]/x[2]). Dividing the left and right part by x (= x[1]/x[2]) gives

\[ \frac{\Delta x}{x} = \frac{\Delta x^{[1]}}{x^{[1]}} + \frac{\Delta x^{[2]}}{x^{[2]}}. \qquad (2.42) \]
In a similar way we can derive that
\[ x - \Delta x = \frac{x^{[1]} - \Delta x^{[1]}}{x^{[2]} + \Delta x^{[2]}} = \frac{x^{[1]}}{x^{[2]}} \cdot \frac{1 - \Delta x^{[1]}/x^{[1]}}{1 + \Delta x^{[2]}/x^{[2]}} \approx \frac{x^{[1]}}{x^{[2]}} \left( 1 - \frac{\Delta x^{[1]}}{x^{[1]}} \right)\left( 1 - \frac{\Delta x^{[2]}}{x^{[2]}} \right) \approx \frac{x^{[1]}}{x^{[2]}} \left( 1 - \frac{\Delta x^{[1]}}{x^{[1]}} - \frac{\Delta x^{[2]}}{x^{[2]}} \right), \qquad (2.43) \]

from which again follows that

\[ \frac{\Delta x}{x} = \frac{\Delta x^{[1]}}{x^{[1]}} + \frac{\Delta x^{[2]}}{x^{[2]}}. \qquad (2.44) \]

When x[1] or x[2] (or both) are negative, we find the same result as with the product, so

\[ \frac{\Delta x}{|x|} = \frac{\Delta x^{[1]}}{|x^{[1]}|} + \frac{\Delta x^{[2]}}{|x^{[2]}|}. \qquad (2.45) \]
Note that when we add or subtract two quantities, their absolute uncertainties should
be added (see equations (2.19) and (2.22)), whereas when we multiply or divide two
quantities, their relative uncertainties should be added (see equations (2.36) and (2.45)).
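As an illustration, here is a minimal Python sketch (the helper names are ours, not from the notes) of the two rules just stated, applied to two measured quantities with 100% intervals:

```python
# A quantity is represented as a (value, uncertainty) pair.
def add_100(a, b):
    """x = a + b: absolute uncertainties add."""
    return (a[0] + b[0], a[1] + b[1])

def mul_100(a, b):
    """x = a * b: relative uncertainties add."""
    x = a[0] * b[0]
    rel = a[1] / abs(a[0]) + b[1] / abs(b[0])
    return (x, abs(x) * rel)

R1, R2 = (5.4, 0.1), (1.40, 0.05)   # values reused in a later example
print("sum:     %.2f ± %.2f" % add_100(R1, R2))   # 6.80 ± 0.15
print("product: %.2f ± %.2f" % mul_100(R1, R2))   # 7.56 ± 0.41
```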
2.2.8 Powers of measured quantities: x = (x[1])^α

We again start with the case x[1] > 0 and α > 0. We write

\[ x + \Delta x = \left( x^{[1]} + \Delta x^{[1]} \right)^{\alpha} = x \left( 1 + \frac{\Delta x^{[1]}}{x^{[1]}} \right)^{\alpha}. \qquad (2.46) \]
Analogous to the intermezzo above, one can show that for δ ≪ 1

\[ (1 + \delta)^{\alpha} \approx 1 + \alpha\,\delta. \qquad (2.47) \]

We can use this approximation in the expression for x + ∆x. If we again take δ = ∆x[1]/x[1], then
\[ x + \Delta x \approx x \left( 1 + \alpha\, \frac{\Delta x^{[1]}}{x^{[1]}} \right) \qquad (2.48) \]

and therefore

\[ \frac{\Delta x}{x} = \alpha\, \frac{\Delta x^{[1]}}{x^{[1]}}. \qquad (2.49) \]
We find exactly the same result when we perform the calculation for x − ∆x .
α
When we look at the case where α < 0 (but still x[1] > 0), then actually x = x[1] =
1 [1]
|α| . This means that x is at its maximum when x is at its minimum:
( )
x [1]
!|α|
1 [1] α
1 [1] α
1
x + ∆x = |α|
= x |α| = x ∆x[1]
. (2.50)
(x[1] − ∆x[1] ) 1−
∆x[1]
1− x[1]
x[1]
We again use 1/(1 − ∆x[1]/x[1]) ≈ 1 + ∆x[1]/x[1] (see equation (2.40)) and then divide by x (= (x[1])^α), in order to find

\[ 1 + \frac{\Delta x}{x} \approx \left( 1 + \frac{\Delta x^{[1]}}{x^{[1]}} \right)^{|\alpha|} \approx 1 + |\alpha|\, \frac{\Delta x^{[1]}}{x^{[1]}}, \qquad (2.51) \]

where we have also used equation (2.47). We therefore find

\[ \frac{\Delta x}{x} = |\alpha|\, \frac{\Delta x^{[1]}}{x^{[1]}}. \qquad (2.52) \]
Via x − ∆x we find the same expression.
Derive by yourself that, when negative values of x[1] are allowed, the general expression becomes:

\[ \frac{\Delta x}{|x|} = |\alpha|\, \frac{\Delta x^{[1]}}{|x^{[1]}|}. \qquad (2.53) \]
2.2.9 x = C (x[1])^α₁ (x[2])^α₂ ⋯ (x[n])^αₙ

Combining the above results gives

\[ \frac{\Delta x}{|x|} = \sum_{i} |\alpha_i|\, \frac{\Delta x^{[i]}}{|x^{[i]}|}. \qquad (2.54) \]

Note that the constant C doesn't show up in ∆x/|x| (it does of course in ∆x).
2.2.10 Examples
• Suppose that we want to measure the surface area S of a circle and in order to
do this we measure the diameter d. The uncertainty in d is ∆d . We calculate the
area by S = πd2 /4, but what is ∆S ?
We use equation (2.54), but only with one term (so n = 1). We set x = S, C = π/4, x[1] = d and α₁ = 2. Therefore,

\[ \frac{\Delta S}{S} = |2|\, \frac{\Delta d}{|d|} = 2\, \frac{\Delta d}{d}. \qquad (2.55) \]
An alternative method would be to use the general expression (2.16). Again, we set x[1] = d and use the function S = f(d) = (π/4)d². The partial derivative we now need is ∂f/∂x[1] = ∂S/∂d = (π/2)d. Substituting this in the general equation gives

\[ \Delta S = \left| \frac{\pi}{2} d \right| \Delta d = \frac{\pi}{2}\, d\, \Delta d, \qquad (2.56) \]

and dividing by S = (π/4)d² gives

\[ \frac{\Delta S}{S} = \frac{(\pi/2)\, d\, \Delta d}{(\pi/4)\, d^2} = 2\, \frac{\Delta d}{d}. \qquad (2.57) \]
• Similarly, suppose we want to determine the volume V = πd²h/4 of a cylinder from its measured diameter d ± ∆d and height h ± ∆h. Equation (2.54) now gives

\[ \frac{\Delta V}{V} = |2|\, \frac{\Delta d}{|d|} + |1|\, \frac{\Delta h}{|h|} = 2\, \frac{\Delta d}{d} + \frac{\Delta h}{h}. \qquad (2.58) \]
Again, the alternative is to use equation (2.16). We again set x[1] = d, x[2] = h and use the function V = f(d, h) = πd²h/4. The partial derivatives we need are ∂f/∂x[1] = ∂V/∂d = πdh/2 and ∂f/∂x[2] = ∂V/∂h = πd²/4. Substitution gives

\[ \Delta V = \left| \frac{\pi}{2} dh \right| \Delta d + \left| \frac{\pi}{4} d^2 \right| \Delta h = \frac{\pi}{2}\, dh\, \Delta d + \frac{\pi}{4}\, d^2\, \Delta h, \qquad (2.59) \]

and dividing by V = (π/4)d²h gives

\[ \frac{\Delta V}{V} = \frac{(\pi/2)\, dh\, \Delta d}{(\pi/4)\, d^2 h} + \frac{(\pi/4)\, d^2\, \Delta h}{(\pi/4)\, d^2 h} = 2\, \frac{\Delta d}{d} + \frac{\Delta h}{h}. \qquad (2.60) \]
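The same bookkeeping can be left to a computer algebra system. Below is a sketch (assuming sympy is available; the cylinder numbers are invented for illustration) of the general formula (2.16):

```python
import sympy as sp

def max_error(f, values, uncertainties):
    """Apply equation (2.16): sum over |df/dx_i| * dx_i."""
    return sum(abs(sp.diff(f, s).subs(values)) * u
               for s, u in uncertainties.items())

d, h = sp.symbols('d h', positive=True)
V = sp.pi * d**2 * h / 4          # volume of the cylinder
vals = {d: 2.0, h: 10.0}          # hypothetical measured values (cm)
uncs = {d: 0.01, h: 0.05}         # hypothetical maximum errors (cm)

dV = max_error(V, vals, uncs)
V0 = V.subs(vals)
print(sp.N(V0), sp.N(dV), sp.N(dV / V0))
# The relative error 0.015 equals 2*(0.01/2) + 0.05/10, as (2.58) predicts.
```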
• The refractive index of glass can be determined by shining light on a flat glass
surface as shown in the figure below.
[Figure: a light ray refracting at a flat air-glass interface, with angle of incidence i in air and angle of refraction r in glass.]
We can in fact use this to calculate ∆n . We can simplify the equation above a
little by dividing by |n|. We then obtain
\[ \frac{\Delta n}{|n|} = \frac{\Delta i}{|\tan i|} + \frac{\Delta r}{|\tan r|}. \qquad (2.63) \]
[Figure: two resistors R₁ and R₂ connected in parallel.]
If we know (by measurement) that R₁ = (5.4 ± 0.1) Ω and that R₂ = (1.40 ± 0.05) Ω, we can fill this in to the expression R = R₁R₂/(R₁ + R₂):
The product in the numerator we can calculate using section 2.2.6 (equation (2.36)): (5.4 ± 0.1) · (1.40 ± 0.05) Ω² = (7.56 ± 0.41) Ω² (check this yourself). The sum in the denominator we can calculate using section 2.2.2 (equation (2.19)): ((5.4 ± 0.1) + (1.40 ± 0.05)) Ω = (6.80 ± 0.15) Ω. Note that this isn't the final result yet, so we still include two significant digits (since we'll have to use it in further calculations). Inserting these numbers in the ratio gives:

\[ R = \frac{(7.56 \pm 0.41)\ \Omega^2}{(6.80 \pm 0.15)\ \Omega} = (1.11 \pm 0.08)\ \Omega, \qquad (2.67) \]
where we have used the results of §2.2.7.
We made a huge mistake here. In the last step we divided R₁R₂ by R₁ + R₂ and used the calculation rules for independent uncertainties. This of course is not correct. If in reality R₁R₂ was bigger than we thought it would be (so bigger than 7.56 Ω²), then R₁ + R₂ would of course also be bigger than the 6.80 Ω with which we are calculating. And if R₁R₂ were smaller than 7.56 Ω², then R₁ + R₂ would also be smaller than 6.80 Ω.
In the derivation of the uncertainty in the ratio of two quantities we assumed that one
quantity can have a variation in the positive direction, while the other quantity has a
variation in the negative direction (this gave the maximum variation of the ratio itself).
This cannot happen in the case demonstrated above; the variations in the numerator
and in the denominator always go in the same direction (too big / too small). One
would expect the true uncertainty in R to be smaller than what we’ve calculated here.
We’ll demonstrate this.
The simplest way is to use the equation 1/R = 1/R₁ + 1/R₂ and to first calculate the uncertainties in 1/R₁ and in 1/R₂. From
\[ R_1 = (5.4 \pm 0.1)\ \Omega \qquad (2.68) \]

follows that

\[ \frac{1}{R_1} = (0.1852 \pm 0.0034)\ \Omega^{-1}; \qquad (2.69) \]

and from

\[ R_2 = (1.40 \pm 0.05)\ \Omega \qquad (2.70) \]

follows that

\[ \frac{1}{R_2} = (0.714 \pm 0.026)\ \Omega^{-1} \qquad (2.71) \]

(equation (2.53) with α = −1). The sum of these two quantities can be calculated quite easily:

\[ \frac{1}{R} = (0.899 \pm 0.029)\ \Omega^{-1}, \qquad (2.72) \]

from which again follows that

\[ R = (1.11 \pm 0.04)\ \Omega \qquad (2.73) \]

(again equation (2.53) with α = −1). We see that the uncertainty is indeed smaller than what we found in the incorrect case.
An alternative way would have been to use the general expression (equation (2.16)). In
that case we find that
\[ \Delta R = \frac{\partial R}{\partial R_1}\, \Delta R_1 + \frac{\partial R}{\partial R_2}\, \Delta R_2. \qquad (2.74) \]

We can calculate the partial derivatives:

\[ \frac{\partial R}{\partial R_1} = \left( \frac{R_2}{R_1 + R_2} \right)^2, \qquad \frac{\partial R}{\partial R_2} = \left( \frac{R_1}{R_1 + R_2} \right)^2. \qquad (2.75) \]

Check this yourself. Also note that ∂R/∂R₁ and ∂R/∂R₂ are dimensionless. The expression for ∆R becomes

\[ \Delta R = \left( \frac{R_2}{R_1 + R_2} \right)^2 \Delta R_1 + \left( \frac{R_1}{R_1 + R_2} \right)^2 \Delta R_2, \qquad (2.76) \]
and filling in all numbers again gives ∆R = 0.04 Ω.
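The claim is easy to verify numerically: because R increases monotonically with both R₁ and R₂ (see (2.75)), the true 100%-interval follows from scanning the extreme values. A short sketch (not part of the notes):

```python
R1, dR1 = 5.4, 0.1
R2, dR2 = 1.40, 0.05

# Evaluate R = R1*R2/(R1+R2) at all four corners of the uncertainty box
candidates = [r1 * r2 / (r1 + r2)
              for r1 in (R1 - dR1, R1 + dR1)
              for r2 in (R2 - dR2, R2 + dR2)]
R = R1 * R2 / (R1 + R2)
print(f"R = {R:.3f} Ohm, true range [{min(candidates):.3f}, {max(candidates):.3f}]")
# The half-width of the true range is about 0.04 Ohm, matching the correct
# calculation, not the 0.08 Ohm from (wrongly) treating numerator and
# denominator as independent.
```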
2.4 Exercises and problems
2.4.1 Questions
1. Calculate the partial derivatives ∂f/∂x and ∂f/∂y of the following functions:

(a) f(x, y) = x⁴y²
(b) f(x, y) = x²/(x + y)
(c) f(x, y) = (4x² + 2y)/(3x − y)
(d) f(x, y) = sin(x)/sin(y)
(e) f(x, y) = x sin(xy)
4. With a good stopwatch and some experience, one can measure time intervals from
approximately 1 second to many minutes, with a maximum error of approximately
0.1 s. Say we want to measure the period T ≈ 0.5 s of a pendulum. If we
measure only one swing, then we have a relative uncertainty of about 20%. By
measuring multiple swings, the relative uncertainty can be improved (=reduced),
as illustrated by the following.
(a) If we measure the duration of 5 subsequent swings as (2.4 ± 0.1) s, how large
is T with relative and absolute uncertainty?
(b) If 20 subsequent swings take (9.4 ± 0.1) s, what is T (including uncertainty)?
10. The focal distance f of a thin lens is given by 1/f = 1/o + 1/i, where o and i are the object and image distance, respectively. If the object distance is o ± ∆o = (24.30 ± 0.05) cm and the image distance is i ± ∆i = (17.40 ± 0.05) cm, calculate f ± ∆f.
2.4.2 Problems
1. An electrician uses two resistors in a parallel circuit. The resistance of these resistors is rated by the manufacturer at R₁ = 100 Ω ± 5% and R₂ = 270 Ω ± 5%, respectively. The uncertainties herein are 100%-intervals. The total resistance R of the parallel circuit follows from 1/R = 1/R₁ + 1/R₂.
2. [Figure: spectrometer setup with a lamp, a collimator, a grating, and a scope, placed around the centre of a circle.]
Through the scope (which is rotatable around the centre of the circle), one can
look at the pattern of maxima and minima in light intensity which occurs due
to the grating. The 0-th order maximum (n = 0) occurs at α₀ ≃ 0°, the n-th maximum occurs at an angle α with regard to the n = 0 maximum, where α is
given by
nλ = d sin (α) .
Here, λ is the wavelength of light (to be determined) and d the grating constant
(the distance between the grooves of the grating). By measuring the angle α at
the n-th maximum, λ can be determined. This angle α is determined by once
measuring it for a left turn (positive angle α+ ) and once measuring it for a right
turn (negative angle α₋). Since the pattern is symmetric around n = 0, i.e., around α₀ ≃ 0°, we calculate α as the difference of these two results, divided by two:

\[ \alpha = \frac{\alpha_+ - \alpha_-}{2}. \]
The grating constant d is very accurately known and equals d = 2190 nm. The
wavelength to be determined is around λ ≃ 600 nm. The uncertainty in deter-
mining the angles is taken as the readout accuracy of the apparatus, which is
∆α₊ = ∆α₋ = 1′ = (1/60)°.
(a) Why is α determined as discussed (by measuring once for a left turn and
once for a right turn) and not by taking α = α+ − α0 (so, measuring once
for a left turn and once at n = 0 with α0 ≃ 0)?
(b) Show that the uncertainty ∆λ in the calculated wavelength is given by

\[ \Delta\lambda = \frac{d}{n} \sqrt{1 - \left( \frac{n\lambda}{d} \right)^2}\; \Delta\alpha_+. \]

Realise that ∆α₊ = ∆α₋.
3. (a) What are the two conditions under which the general formula for 100%-intervals is valid?
(b) For both conditions, explain why they are necessary. Possibly, give an example to prove this.
(c) If y = x₁ + x₂ where x₁ and x₂ are measured with respective uncertainties ∆x₁ and ∆x₂, are both conditions still necessary?
4. Someone wants to determine the resistance of an unknown resistor R. They
have at their disposal a voltage source producing an unknown voltage V, an ammeter that can measure current with an accuracy of ∆i (100%-interval) and a very precise resistance R1 (so ∆R1 = 0). The unknown resistance is determined as follows: the voltage source is connected to the unknown resistor and with the ammeter, the current I0 running through it is measured. In this case
V = I0 R. Then, the voltage source is connected over a series circuit containing
both the unknown and the known resistor, and the current I1 is measured. Now,
V = I1 (R + R1 ).
(a) Show that the unknown resistance R can be calculated using

\[ R = R_1 \frac{I_1}{I_0 - I_1}. \]
(b) If the uncertainty in the current measurement is called ∆i (so ∆I0 = ∆I1 =
∆i ), give an expression for the uncertainty ∆R to which R can be determined.
(c) For what value of R1 is the measurement most accurate? Express this opti-
mal R1 as a function of R. Hint: Write ∆R as a function of V , R, R1 and
∆i (not I0 and I1 ), and calculate when ∆R is minimal.
5. Using the thin lens formula 1/f = 1/i + 1/o, the focal distance f of an ideal lens is determined. To this end, an object is placed at a distance o from the lens, and the distance i from the image to the lens is measured. The measurements of o and i are performed by measuring the positions of the object (po), lens (pl), and image (pi) on an optical rail. Then of course, o = pl − po and i = pi − pl. We measure po = (26.35 ± 0.05) cm, pl = (60.40 ± 0.05) cm and pi = (86.8 ± 0.2) cm. All uncertainties are 100%-intervals. The uncertainty in pi is larger than in po and pl, since it is difficult to determine where the image is exactly sharp.
3. Measuring with random variations: 68%
intervals
3.1 Introduction
In the last chapter we looked at measurements where no spread in the measurement results was found, and we derived calculation rules for this. In this chapter we'll look at measurements where, upon repeating the measurement, we find a different result each time. It's obvious that these results have to be averaged to get to the final result, but how we determine an uncertainty and how we use it in calculations isn't so obvious. For this reason, we'll investigate this in more detail. But before we do that, some definitions should first be given and some concepts should be introduced.
3.2 Definitions
When a different result is found each time a measurement is repeated, it’s impossi-
ble to predict what the exact result of the next (new) measurement will be. In that
case we speak of a random quantity or a stochastic variable. In some cases the result
of a measurement can take any possible value, but in most cases the possible values
of the results are limited in some way. The set of all possible outcomes of a certain
measurement we call the sample space. This sample space might be continuous (when
we are dealing with continuous stochastic variables) or discrete (for discrete stochastic
variables). The sample space of a continuous stochastic variable is by definition in-
finitely big and therefore not all possible outcomes will be found with a finite number
of measurements. When we are dealing with discrete stochastic variables, the sample
space isn’t infinitely big by definition, but it still can be. When we are performing
measurements, we make a set of observations. These we call the population. Of such a
population we can make a frequency distribution. When we are dealing with a discrete
quantity, this simply is the number of times an outcome from the sample space occurs
in the population (so the number of times a certain value has been found in our series
of measurements).
For example, a die has been thrown 30 times. The outcome we call x. It's clear that x is a discrete stochastic variable and the frequency distribution F(x) of the population of 30 throws might look like figure 3.1.
When we are dealing with a continuous stochastic variable, we can’t easily make such a
distribution. When we’re dealing with a discrete quantity, an element from the sample
Figure 3.1: Frequency distribution F(x) of the results of 30 throws with a die.
space will (with enough measurements) occur multiple times in the population, but
with a continuous quantity this is not the case. We should therefore divide the sample space into intervals and count how many times an outcome falls in each of these intervals. How big we make these intervals of course depends on the size of the sample. The bigger the sample (the number of measurements), the smaller we can choose the size of the intervals. For this continuous case with intervals, we don't use the frequency distribution, but the frequency density distribution, which is defined as the number of measurements in a certain interval divided by the size of the interval.
For example, we can measure the velocity of ions in a gas discharge. The measurement is repeated 400 times. The frequency density distribution of the results then looks like the one in figure 3.2. Note the unit of the frequency density distribution in this figure (s/m). Try to think of why this is the case.
Figure 3.2: Frequency density distribution F (v) of the velocities of ions in a gas discharge
experiment.
In this example measurements of the velocities of different ions in a gas discharge were
done. The results (probably) represent what the distribution of the velocities of different
ions looks like. In other words, measurements have been done on ions that probably
all have a different velocity. However, it also is possible to perform measurements on
a quantity that has one single true value, but where the measurements still show a
spread around this single value. This can have multiple causes, as we have already seen
in chapter 1. We will take a closer look at this case, although what will follow is also
true for the measurements in the example above.
We will perform measurements on a quantity that has one true value xt . We don’t know
this xt , but we want to find an approximation (estimate) that is as accurate as possible.
It’s fair to assume that the probability of finding a result close to xt is bigger than the
probability of finding a result far from xt . We now define the probability density p(x) as
the probability of finding a result between x and x + dx, divided by dx. In other words,
p(x)dx is the probability of finding a result between x and x + dx. One condition that
should be noted is that dx should be small (we should actually take the limit dx → 0).
Since the probability of finding a result near xt is bigger than that of finding one far
from xt , the probability density will look something like figure 3.3.
The maximum of p(x) of course lies at x = xt. The probability of finding a result between x = a and x = b upon performing a measurement is

\[ P(a \le x \le b) = \int_a^b p(x)\, dx. \]

This is the size of the shaded area below the curve. Also note that \( \int_{-\infty}^{+\infty} p(x)\, dx = 1 \).
We'll now introduce another concept and that is the expectation value ε⟨x⟩ of the quantity x, which for a continuous probability distribution is defined as

\[ \varepsilon\langle x \rangle = \int_{-\infty}^{+\infty} x\, p(x)\, dx. \qquad (3.1) \]
This expectation value of x is nothing more than the sum (=integral) of all possible
values of x, multiplied by the corresponding probability of finding this result. It is
actually the result one would expect if the measurement would be performed an infinite
number of times and all results would be averaged. It won’t amaze you that for the
probability density in the figure above ε ⟨x⟩ = xt , i.e. the true value. You therefore
expect, upon repeating the measurement an infinite number of times and averaging the
results, to find the true value as the average. However, in practice we won’t be measuring
an infinite number of times, but only a finite number of times and therefore the found
average won’t exactly be the true value xt , but only an approximation. However, for
measurements that show a small spread in the results (so a ‘narrow’ probability density
function), the average will be a better approximation of xt than for measurements with
a large spread. The deviation of the found average from xt is of course the uncertainty
in the experiment. We’ll try to calculate this in one of the coming sections. It’s not
only possible to calculate the expectation value of x, but we can do this for any function
of x, so
\[ \varepsilon\langle f(x) \rangle = \int_{-\infty}^{+\infty} f(x)\, p(x)\, dx, \qquad (3.2) \]

for example \( \varepsilon\langle x^2 \rangle = \int_{-\infty}^{+\infty} x^2\, p(x)\, dx \). In the next section we'll derive a number of
calculation rules for the expectation value, with which we can calculate the difference
between the mean of a (finite) number of measurements and the true value. Lastly, we define the variance var⟨x⟩ of x. This is defined as the expectation value of (x − xt)², so the square of the difference between the measured quantity x and the true value xt. This is also denoted by σ²:

\[ \operatorname{var}\langle x \rangle = \varepsilon\big\langle (x - x_t)^2 \big\rangle = \sigma^2. \]
Just as ε⟨x⟩ is the result you would expect to find upon repeating the measurement an infinite number of times and averaging the results, var⟨x⟩ is the expectation value of how far (squared) the measured value lies from the true value upon repeating the experiment an infinite number of times. We use the square in order to get rid of any possible minus signs. We do this because ε⟨(x − xt)⟩ will be 0, since we'll measure a value smaller than xt about as often as we'll measure a value bigger than xt. The square root of the variance, √var⟨x⟩, is a measure for the size of the spread of the results and therefore for the width of the probability distribution. The σ in the equation (σ = √var⟨x⟩) is also called the standard deviation.
i. ε⟨ax⟩ = a ε⟨x⟩
ii. ε⟨x + y⟩ = ε⟨x⟩ + ε⟨y⟩ (for independent x and y)

Proof:

i. \( \varepsilon\langle ax \rangle = \int_{-\infty}^{+\infty} ax\, p(x)\, dx = a \int_{-\infty}^{+\infty} x\, p(x)\, dx = a\, \varepsilon\langle x \rangle. \) Q.e.d.

ii. \( \varepsilon\langle x + y \rangle = \iint_{-\infty}^{+\infty} (x + y)\, p(x)\, p(y)\, dx\, dy = \iint x\, p(x)\, p(y)\, dx\, dy + \iint y\, p(x)\, p(y)\, dx\, dy \)
\( = \int x\, p(x)\, dx \int p(y)\, dy + \int p(x)\, dx \int y\, p(y)\, dy = \int x\, p(x)\, dx + \int y\, p(y)\, dy = \varepsilon\langle x \rangle + \varepsilon\langle y \rangle. \) Q.e.d.

(In the last step we used that \( \int_{-\infty}^{+\infty} p(x)\, dx = \int_{-\infty}^{+\infty} p(y)\, dy = 1 \); writing the joint density as p(x)p(y) assumes that x and y are independent.)
By inserting p(x) in the above equation (again, check this yourself) we get β = 1/(2σ²), so the probability distribution now looks like the following:

\[ p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - x_t)^2}{2\sigma^2}}. \qquad (3.7) \]
A plot of this distribution can be found in figure 3.4.
Now the following important results can be calculated: the probability of a result between xt − σ and xt + σ, i.e. \( P(x_t - \sigma \le x \le x_t + \sigma) = \int_{x_t-\sigma}^{x_t+\sigma} p(x)\, dx \), is equal to 68.27%; the probability of a result between xt − 2σ and xt + 2σ is 95.45%; and the probability of a result between xt − 3σ and xt + 3σ is 99.73%.
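These percentages follow from the error function and can be checked in a few lines of Python (a sketch, not part of the notes):

```python
# P(|x - xt| <= k*sigma) = erf(k / sqrt(2)) for a Gaussian distribution
from math import erf, sqrt

for k in (1, 2, 3):
    print(f"{k} sigma: {100 * erf(k / sqrt(2)):.2f} %")
# 1 sigma: 68.27 %
# 2 sigma: 95.45 %
# 3 sigma: 99.73 %
```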
[Figure 3.4: the Gaussian probability density p(x) of equation (3.7), centred around the true value xt.]
We of course hope that this average x̄ will be a good approximation of the true value xt. Whether or not this is the case will depend on (a) the number of measurements (the more the better) and (b) the probability density function p(x). With a small amount of
measurements it’s probable that the found average will give a bad value (approximation)
for xt . For a narrow probability density function the found average x̄ will be more
reliable than for a wide function. To determine the reliability of the average (which is
the uncertainty in the final result that we’re looking for), we have to investigate (a)
what the width of the probability density is and (b) what the connection is between
the width and the reliability of x̄.
We haven’t done anything illegal here (yet). We can expand the square:
\[ S^2 = \frac{1}{N} \sum_i \left[ (x_i - \bar{x})^2 + (\bar{x} - x_t)^2 + 2 (x_i - \bar{x})(\bar{x} - x_t) \right]. \qquad (3.11) \]
Here we use the abbreviation \( \sum_i \) for \( \sum_{i=1}^{N} \). In the second term nothing is dependent on i, so the summation is nothing more than the sum of N times the same number ((x̄ − xt)²), so it can be written as (1/N) · N · (x̄ − xt)² = (x̄ − xt)². In the last term we can place (x̄ − xt) in front of the summation, as this is not dependent on i. We now have
\[ S^2 = \frac{1}{N} \sum_i (x_i - \bar{x})^2 + (\bar{x} - x_t)^2 + \frac{2}{N} (\bar{x} - x_t) \sum_i (x_i - \bar{x}). \qquad (3.13) \]
The summation in the last term can be calculated, since \( \sum_i (x_i - \bar{x}) = \sum_i x_i - \sum_i \bar{x} = N\bar{x} - N\bar{x} = 0 \). Therefore, the last term is 0, so we are left with

\[ S^2 = (\bar{x} - x_t)^2 + \frac{1}{N} \sum_i (x_i - \bar{x})^2. \qquad (3.14) \]
The second term on the right hand side of the above equation resembles the definition
of the sample variance, but now with x̄ instead of xt . This is nice, because we can
actually calculate this term. However, the first term still gives us trouble, as it contains
xt again. The problem now has become: can we make an estimation of (x̄ − xt )2 ? It
probably won’t amaze you that we can, since otherwise we would have been deriving
all these equations for nothing. We can write
\[ \bar{x} - x_t = \frac{1}{N} \sum_i x_i - x_t = \frac{1}{N} \left( \sum_i x_i - N x_t \right) = \frac{1}{N} \sum_i (x_i - x_t), \qquad (3.15) \]
The last part of this equation consists of the product of two summations. In both
summations, i runs from 1 to N , but each value i in the first summation is multiplied
by all values of i from the second summation. We might as well call this j (to prevent
further confusion) and therefore
" # " #
1 X X
(x̄ − xt )2 = 2 (xi − xt ) · (xj − xt ) , (3.17)
N i j
where i and j are completely independent of one another. Because of this independence,
both summations may stand at the beginning of the equation:
1 XX
(x̄ − xt )2 = [(xi − xt ) (xj − xt )] . (3.18)
N2 i j
The \( \sum_i \sum_j \) means nothing more than a summation over all values of i and all values of j. We'll distinguish between terms where i and j have the same value and terms where this isn't the case. Therefore, we write

\[ (\bar{x} - x_t)^2 = \frac{1}{N^2} \sum_i (x_i - x_t)^2 + \frac{1}{N^2} \sum_i \sum_{j \ne i} \left[ (x_i - x_t)(x_j - x_t) \right]. \qquad (3.19) \]
The last term is a summation over all values of i and j that aren’t equal to each other,
so xi and xj in this term always represent results of different measurements. Because
the results show a random spread around a true value xt , (xi − xt ) will be positive or
negative an equal amount of times. The same goes for (xj − xt ), but because it concerns
different measurements, this is completely uncorrelated. This means that the second
term of the above equation will also be pretty much 0 when we perform a large amount
of measurements, and therefore we’re left with just the first term:
\[ (\bar{x} - x_t)^2 \approx \frac{1}{N^2} \sum_i (x_i - x_t)^2. \qquad (3.20) \]
On the right hand side of this equation we recognize the definition of the sample variance S², but now with a factor 1/N² instead of 1/N, so we can write:

\[ (\bar{x} - x_t)^2 \approx \frac{1}{N} S^2. \qquad (3.21) \]
With this we've made an estimation of the first term in equation (3.14). By filling in this estimation, we find

\[ S^2 \approx \frac{1}{N} S^2 + \frac{1}{N} \sum_i (x_i - \bar{x})^2. \qquad (3.22) \]

By bringing the first part of the right hand side to the left:

\[ \left( 1 - \frac{1}{N} \right) S^2 \approx \frac{1}{N} \sum_i (x_i - \bar{x})^2, \qquad (3.23) \]

and multiplying both sides with N and dividing by (N − 1) finally gives an expression for the sample variance

\[ S^2 \approx \frac{1}{N - 1} \sum_i (x_i - \bar{x})^2. \qquad (3.24) \]
This is a value we can calculate! The width S of the probability distribution then is (roughly) equal to the square root of this, so

\[ S \approx \sqrt{\frac{1}{N - 1} \sum_i (x_i - \bar{x})^2}. \qquad (3.25) \]
Figure 3.5: Distribution of the means (narrow curve) compared to that of separate mea-
surements (wider curve).
and because the measurement results xi are independent, using calculation rule (iv),
we find
\[ S_m^2 = \frac{1}{N^2} \sum_i \operatorname{var}\langle x_i \rangle. \qquad (3.29) \]
Now, the variance var⟨xi ⟩ of course is the same for all xi and equal to var⟨x⟩, so the
summation contains N times the same number var⟨x⟩ and we find
\[ S_m^2 = \frac{1}{N} \operatorname{var}\langle x \rangle = \frac{1}{N} S^2. \qquad (3.30) \]
Because we could calculate S, we can also calculate Sm and we obtain the final result:
\[ S_m = \sqrt{\frac{1}{N (N - 1)} \sum_{i=1}^{N} (x_i - \bar{x})^2}. \qquad (3.31) \]
Indeed, we only have to determine the average of a series of N measurements once and then we can calculate the standard deviation of the sample mean, as Sm is officially called. And because there is 68% certainty that the true value xt lies between x̄ − Sm and x̄ + Sm, we have also found the uncertainty: the 68% confidence interval.
3.6 An example

A number of measurements of the length of an object give the results 50.35, 50.33, 50.45, 50.38, 50.35, 50.27, 50.41, 50.39, 50.36 and 50.39 mm. The average of this is x̄ = 50.37 mm. With equation (3.25) we calculate S ≈ 0.049 mm, and with equation (3.31) Sm ≈ 0.015 mm, so the final result is (50.37 ± 0.02) mm.
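This example is easily verified with numpy (a sketch, not part of the notes), using equations (3.25) and (3.31):

```python
import numpy as np

x = np.array([50.35, 50.33, 50.45, 50.38, 50.35,
              50.27, 50.41, 50.39, 50.36, 50.39])
N = len(x)
mean = x.mean()
S = x.std(ddof=1)       # sample standard deviation, eq. (3.25)
Sm = S / np.sqrt(N)     # standard deviation of the mean, eq. (3.31)
print(f"mean = {mean:.2f} mm, S = {S:.3f} mm, Sm = {Sm:.3f} mm")
# mean = 50.37 mm, S = 0.049 mm, Sm = 0.015 mm  ->  (50.37 ± 0.02) mm
```

Note the `ddof=1` argument: it makes numpy divide by N − 1 rather than N, exactly the correction derived in (3.24).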
For the sum of two measured quantities, x = x[1] + x[2], the calculation rules for the variance give

\[ S_{\bar{x}}^2 = \operatorname{var}\big\langle \bar{x}^{[1]} + \bar{x}^{[2]} \big\rangle = \operatorname{var}\big\langle \bar{x}^{[1]} \big\rangle + \operatorname{var}\big\langle \bar{x}^{[2]} \big\rangle = S_{\bar{x}^{[1]}}^2 + S_{\bar{x}^{[2]}}^2. \qquad (3.36) \]
So in contrast to the 100% intervals where upon summation of two quantities the
uncertainties should be added, we should now ‘quadratically’ add the uncertainties
with 68% intervals.
We will now directly derive the general calculation rule for combining uncertainties
in the case of 68% confidence intervals. We have measured M quantities x[1] , .., x[M ]
with independent uncertainties Sx̄[1] , .., Sx̄[M ] . We are looking for the uncertainty in the
quantity x which is a function of the M measured quantities according to x = f(x[1], .., x[M]). The best approximation of x is x̄ = f(x̄[1], .., x̄[M]), and since all terms x̄[i] − xt[i] are assumed to be small, we can use the approximation from §2.1.3 and write

\[ \begin{aligned} \bar{x} &\approx f(x_t^{[1]}, .., x_t^{[M]}) + \left( \bar{x}^{[1]} - x_t^{[1]} \right) \frac{\partial f}{\partial x^{[1]}} + \cdots + \left( \bar{x}^{[M]} - x_t^{[M]} \right) \frac{\partial f}{\partial x^{[M]}} \\ &= f(x_t^{[1]}, .., x_t^{[M]}) + \sum_{i=1}^{M} \left( \bar{x}^{[i]} - x_t^{[i]} \right) \frac{\partial f}{\partial x^{[i]}}, \end{aligned} \qquad (3.39) \]

where ∂f/∂x[i] is the partial derivative of f with respect to x[i] at the point (xt[1], ...., xt[M]).
The uncertainty in x is determined by Sx̄² = var⟨x̄⟩, in which we can insert the aforementioned equation, resulting in

\[ S_{\bar{x}}^2 = \operatorname{var}\langle \bar{x} \rangle \approx \operatorname{var}\left\langle f(x_t^{[1]}, ...., x_t^{[M]}) + \sum_{i=1}^{M} \left( \bar{x}^{[i]} - x_t^{[i]} \right) \frac{\partial f}{\partial x^{[i]}} \right\rangle. \qquad (3.40) \]
The right hand side of this equation is the variance of the sum of a couple of terms that are all independent of each other. After all, the first term is f(xt[1], ...., xt[M]) = xt, which is a constant, and the other terms (x̄[i] − xt[i]) ∂f/∂x[i] are all independent since the uncertainties in the measured quantities are independent. Therefore

\[ S_{\bar{x}}^2 = \sum_{i=1}^{M} \operatorname{var}\left\langle \left( \bar{x}^{[i]} - x_t^{[i]} \right) \frac{\partial f}{\partial x^{[i]}} \right\rangle. \qquad (3.41) \]
Now using calculation rule (iii) from §3.3 and by knowing that ∂f/∂x[i] also is a constant, we find

\[ S_{\bar{x}}^2 = \sum_{i=1}^{M} \left( \frac{\partial f}{\partial x^{[i]}} \right)^2 \operatorname{var}\left\langle \bar{x}^{[i]} - x_t^{[i]} \right\rangle = \sum_{i=1}^{M} \left( \frac{\partial f}{\partial x^{[i]}} \right)^2 \operatorname{var}\left\langle \bar{x}^{[i]} \right\rangle = \sum_{i=1}^{M} \left( \frac{\partial f}{\partial x^{[i]}} \right)^2 S_{\bar{x}^{[i]}}^2. \qquad (3.42) \]

Here we have used the fact that var⟨x̄[i] − xt[i]⟩ = var⟨x̄[i]⟩. This is correct as xt[i] is a
constant. We have now found a general equation for the uncertainty Sx̄ :
\[ S_{\bar{x}} = \sqrt{\sum_{i=1}^{M} \left( \frac{\partial f}{\partial x^{[i]}} \right)^2 S_{\bar{x}^{[i]}}^2}. \qquad (3.43) \]
This looks quite similar to the one for 100% intervals (see §2.2.1), but now the terms
should be added quadratically.
From this general calculation rule we can derive all other specific calculation rules. Try
this yourself for several cases. In the next section a full overview of the calculation rules
for both 100% intervals and 68% intervals can be found.
Overview of the calculation rules (100% intervals: maximum errors ∆; 68% intervals: standard deviations S):

x = x[1] · x[2]:
  ∆x/|x| = ∆x[1]/|x[1]| + ∆x[2]/|x[2]|
  (Sx/x)² = (Sx[1]/x[1])² + (Sx[2]/x[2])²

x = x[1]/x[2]:
  ∆x/|x| = ∆x[1]/|x[1]| + ∆x[2]/|x[2]|
  (Sx/x)² = (Sx[1]/x[1])² + (Sx[2]/x[2])²

x = (x[1])^α:
  ∆x/|x| = |α| ∆x[1]/|x[1]|
  Sx/|x| = |α| Sx[1]/|x[1]|

x = C Πᵢ (x[i])^αᵢ:
  ∆x/|x| = Σᵢ |αᵢ| ∆x[i]/|x[i]|
  (Sx/x)² = Σᵢ αᵢ² (Sx[i]/x[i])²

x = f(x[1], x[2], .., x[M]):
  ∆x = Σᵢ |∂f/∂x[i]| ∆x[i]
  Sx̄² = Σᵢ (∂f/∂x[i])² Sx̄[i]²
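The 68%-interval rules above can be evaluated mechanically with the general formula (3.43). Below is a sketch (assuming sympy; the helper name is ours), applied to the thin-lens relation from question 10 of §2.4, treating the reading errors as 68% intervals purely for illustration:

```python
import sympy as sp

def std_68(f, values, std_devs):
    """Apply eq. (3.43): add (df/dx_i * S_i)^2 in quadrature."""
    var = sum((sp.diff(f, s).subs(values))**2 * S**2
              for s, S in std_devs.items())
    return sp.sqrt(var)

o, i = sp.symbols('o i', positive=True)
f_lens = o * i / (o + i)            # equivalent to 1/f = 1/o + 1/i
vals = {o: 24.30, i: 17.40}         # object and image distance (cm)
stds = {o: 0.05, i: 0.05}           # reading errors, here taken as S's

print(sp.N(f_lens.subs(vals)), sp.N(std_68(f_lens, vals, stds)))
# about 10.14 cm with an uncertainty of about 0.02 cm
```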
3.10 The uncertainty in the uncertainty
In chapter 1 we saw that the uncertainty in a measurement series can usually be rounded
to one significant digit, because this uncertainty also has its own uncertainty. The
uncertainty as described by equation (3.8) will give a slightly different value for each
measurement series, in the same way that the average of a new measurement series
will have a slightly different value. So, just like with the averages, the uncertainties will
show a spread when the measurement series is repeated a couple of times. We can again
do this in our minds, since we can calculate that the spread in the standard deviation
S has a variance
\[ \operatorname{var}\langle S \rangle = \frac{1}{2(N - 1)}\, \sigma^2 \approx \frac{1}{2(N - 1)}\, S^2, \qquad (3.44) \]

and the variance in the standard deviation of the mean is

\[ \operatorname{var}\langle S_m \rangle = \frac{1}{2(N - 1)}\, \sigma_m^2 \approx \frac{1}{2(N - 1)}\, S_m^2, \qquad (3.45) \]
where N again is the size of the measurement series (the number of measurements).
From this follows that for the relative uncertainties S_S/S and S_{Sm}/Sm:

\[ \frac{S_S}{S} = \frac{S_{S_m}}{S_m} = \sqrt{\frac{1}{2(N - 1)}}. \qquad (3.46) \]
We won't go into the details of the derivation of the above equation. In figure 3.6 the fraction S_{Sm}/Sm is plotted as a function of the number of measurements N in the series.
Figure 3.6: Ratio (in %) of the uncertainty S_S in the standard deviation S to the standard deviation itself, as a function of the number of measurements N in the series.
The point that lies all the way to the left in this curve is for N =2, so this has an
uncertainty of over 70% in the standard deviation. Even with N =10 we still find an
uncertainty of 25% and this slowly gets better with increasing N , as can be seen in the
figure.
Example: after an experiment an uncertainty (standard deviation of the
mean) of Sm = 0.44 has been found. We learned that we should round this
to Sm = 0.4. The question is whether or not this is problematic. It’s a
problem when the error that we make (let us call this ∆Sm) is bigger than (or equal to) the uncertainty in the value of Sm, so when

\[ \Delta S_m \ge S_{S_m} = S_m \sqrt{\frac{1}{2(N - 1)}}. \]

By filling in the found standard deviation and the rounding error (∆Sm = 0.04), we see that only when N > 61 the rounding wouldn't be appropriate. So in this case, only with a very big measurement series (over 61 measurements) would a rounding of the uncertainty to two digits have been meaningful.
This was under the assumption that ε ⟨(x − xt )(y − yt )⟩ = 0, which means that with
many (an infinite amount of) measurements the average of the product of the deviations
in x and y from their true values is zero. This only happens when the uncertainties in
the quantities x and y are independent.
3.11.1 The covariance
When the uncertainties in the measured quantities x and y are dependent on each other,
the term ε⟨(x − xt)(y − yt)⟩ should not be removed. We also call ε⟨(x − xt)(y − yt)⟩ the covariance of x and y and denote this by covar⟨xy⟩. So, we have

\[ \operatorname{var}\langle x + y \rangle = \operatorname{var}\langle x \rangle + \operatorname{var}\langle y \rangle + 2\, \operatorname{covar}\langle xy \rangle, \qquad (3.48) \]

where

\[ \operatorname{covar}\langle xy \rangle = \varepsilon\big\langle (x - x_t)(y - y_t) \big\rangle = \iint_{-\infty}^{+\infty} (x - x_t)(y - y_t)\, p(x, y)\, dx\, dy. \qquad (3.49) \]
where S_{x̄[1]x̄[2]} is an estimation of the covariance covar⟨x̄[1]x̄[2]⟩. We should determine this estimation and we can do that from our measurement series. A good approximation is

\[ \operatorname{covar}\big\langle \bar{x}^{[1]} \bar{x}^{[2]} \big\rangle \approx \frac{1}{N} \sum_i \left( x_i^{[1]} - \bar{x}^{[1]} \right) \left( x_i^{[2]} - \bar{x}^{[2]} \right). \qquad (3.51) \]
It seems that we can calculate the uncertainty in a quantity that is a combination of
multiple quantities with dependent uncertainties. However, this tends to be a lot of
work and it should be noted that one should try to avoid this (for example, in the same
way as in section 2.3).
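To see what the estimate (3.51) measures, here is a small numerical sketch (assuming numpy; the scenario is invented): two series of paired measurements share a common fluctuation, so their deviations are correlated and the covariance estimate is clearly non-zero.

```python
import numpy as np

rng = np.random.default_rng(0)
common = rng.normal(0.0, 1.0, 1000)            # shared fluctuation
x = 5.0 + common + rng.normal(0, 0.2, 1000)    # both series feel 'common',
y = 3.0 + common + rng.normal(0, 0.2, 1000)    # so their errors correlate

# Sample covariance of the paired deviations, as in eq. (3.51)
cov = np.mean((x - x.mean()) * (y - y.mean()))
print(f"estimated covariance: {cov:.3f}")      # close to var(common) = 1
```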
3.12 Exercises and problems
3.12.1 Questions
1. When measuring the thickness of a thin layer of liquid helium, the following angles
(in minutes) were observed:
34 35 45 40 46
38 47 36 38 34
33 36 43 43 37
38 32 38 40 33
38 40 48 39 32
36 40 40 36 34
3.12.2 Problems
1. Because it is (sometimes) difficult to perform calculations with a Gaussian prob-
ability distribution, it can be coarsely approximated by the following probability
density function p(x), where a is a constant.
[Figure: p(x) = b for −a ≤ x ≤ a, p(x) = b/2 for a ≤ |x| ≤ 2a, and p(x) = 0 elsewhere.]
(a) Show that b = 1/(3a) holds.
(b) Show that for a probability function which is symmetric around x = 0 (so
for which p(x) = p(−x)), it is generally true that xw = ε ⟨x⟩ = 0.
(c) Calculate for the given probability density function the variance var⟨x⟩ of x
and from that the standard deviation σ. Express σ in terms of a.
(d) Calculate which percentage of measurements (out of an infinite amount of
measurements) will give a value between xw − σ and xw + σ (so in this case
between −σ and σ). The constants a and b should be eliminated.
2. A Gaussian probability distribution is given by

\[ p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \bar{x})^2}{2\sigma^2}}, \]

where x̄ = ε⟨x⟩ and σ is the standard deviation. The 'Full Width at Half Maximum' (FWHM) is defined as the width of the distribution function at half its height (so the distance between the points for which p(x) = ½ p(x̄) holds).

Show that the expression FWHM = 2σ√(2 ln 2) ≈ 2.35 σ holds true.
3. [Figure: frequency density distribution F(v) (in s/m) of measured ion velocities v between 0 and 300 m/s.]
We assume that the velocities follow a Gaussian distribution. Estimate the vari-
ance var⟨v⟩ of the ion velocity.
4. A manufacturer produces balls for ball bearings. When he sells them, he naturally
has to state the diameter of the balls and the uncertainty therein. He measures the
diameters of one batch, and finds the diameters follow a Gaussian distribution.
Because he thinks a 68%-confidence interval is not accurate enough, he states
the 95%-confidence interval (the 2S-interval). The measured batch contains 10
different balls, and the results are 4.995 - 5.004 - 4.998 - 5.001 - 4.989 - 5.007 -
5.001 - 5.001 - 4.999 - 5.004 mm. Since the measurements were very precise, the
individual errors are negligible.
What value and uncertainty does he state for the diameter of the balls? Indicate
how these values are calculated (state the formula).
Hint: The balls are sold separately and each ball has to satisfy the given specifications.
4. Distribution functions
4.1 Introduction
When conducting an experiment it is possible that the result can have any value in
a continuous range. However, it is also possible that the result is discrete. In both
cases infinite repetition of the experiment will result in a distribution of measurement
values that is proportional to the distribution function. For example, we expect in the first example in §3.2 (the dice throwing experiment) that each result has the same probability, namely 1/6.
F(x) that is the same, constant value for each x. The fact that this was not the case in
the example is caused by the limited number of throws. After countless measurements
(N), we find

$$P(x = x_k) \approx \frac{1}{N}\mathcal{F}(x_k), \qquad (4.1)$$
in which P (x = xk ) is the probability of finding a result xk . If the sample space
is limited in size, as was the case with the dice, the probability distribution will be
sufficiently approximated by a finite amount of measurements. In a very large sample
space this is rather difficult, since either countless measurements have to be performed
or the measurement results have to be divided in several classes. The latter leads to a
frequency density distribution F (x), in which the number of measurement results in a
certain interval has to be divided by the width ∆x of that interval. This procedure also
has to be followed when measuring continuous quantities. The probability P (x = xk )
to find a measurement result in the interval k is now approximated by
$$P(x = x_k) \approx \frac{1}{N} F(x_k)\,\Delta x, \qquad (4.2)$$
where N again is the total number of measurements, F (xk ) the value of the frequency
density distribution at the k-th interval and ∆x is the interval width. For continuous
quantities we have defined a probability density function p(x) in §3.2, in which the
probability P (a ≤ x ≤ b) was given by
$$P(a \le x \le b) = \int_a^b p(x)\, dx. \qquad (4.3)$$
Figure 4.1: The probability p(x) to measure x in an experiment where the thrown numbers of dots of all N dice throws are added together. Depicted are the distributions for 2, 4 and 6 throws respectively (circles). The solid lines are the best-fitting Gaussian distributions.
Although a dice experiment is not really similar to a real physical experiment, discrete
probability distributions are nonetheless very common in physics. In the next few
sections we will discuss the binomial distribution and the Poisson distribution as they
are very important discrete probability distributions in physics. Additionally, we will
discuss the normal distribution, or Gaussian distribution, as a continuous distribution.
4.3 The binomial distribution

As the classic example, consider a drunken man who takes N steps along a line, each step going forward with probability p and backward with probability q = 1 − p. If $n_1$ is the number of steps forward, $n_2$ the number of steps backward, and m his final position (measured in steps from his starting point), then

$$n_1 + n_2 = N, \qquad n_1 - n_2 = m, \qquad (4.4)$$

so that

$$n_1 = \frac{N+m}{2}, \qquad n_2 = \frac{N-m}{2}. \qquad (4.5)$$
Note that when the total number of steps N is even, the position m also has to be even, and when N is odd, m has to be odd as well. Consequently, the numbers of steps forward and backward are fixed: if we are at position m after N steps, according to the equations above we must have taken $n_1$ steps in the forward direction and $n_2$ steps in the backward direction. However, it is completely irrelevant when those steps have been taken. It is possible that he first took $n_1$ steps forward and then $n_2$ steps backward, but every sequence of steps is allowed as long as the totals add up to $n_1$ and $n_2$ respectively. The probability that he takes $n_1$ steps forward and subsequently $n_2$ steps backward is
$$\underbrace{p \cdot p \cdots p}_{n_1\ \text{factors}} \; \underbrace{q \cdot q \cdots q}_{n_2\ \text{factors}} = p^{n_1} q^{n_2}. \qquad (4.6)$$
Every sequence of n1 steps forward and n2 steps backward has that same probability.
The total probability to end up at position m is equal to the amount of possible ways
to take n1 steps forward and n2 steps backward, multiplied by the probability pn1 q n2 .
The total amount of possibilities is given by
$$\frac{N!}{n_1!\, n_2!} = \frac{N!}{n_1!\,(N - n_1)!} = \binom{N}{n_1}, \qquad (4.7)$$
in which N ! stands for N · (N − 1) · (N − 2) · ... · 1 (and by definition 0! = 1). Why this
is the case, we will see in a moment. The probability that the drunk is at position m
after N steps can now be written as
$$P_N(m) = P_N(n_1, N - n_1) = \binom{N}{n_1} p^{n_1} q^{N - n_1} = \binom{N}{n_1} p^{n_1} (1 - p)^{N - n_1}. \qquad (4.8)$$
With PN (n1 , N − n1 ) we obviously mean the probability to take n1 steps forward and
N − n1 steps backward. This probability distribution we call the binomial distribution.
Its name comes from the so called binomial expansion
$$(p + q)^N = \sum_{n=0}^{N} \frac{N!}{n!\,(N - n)!}\, p^n q^{N-n} = \sum_{n=0}^{N} \binom{N}{n} p^n q^{N-n}. \qquad (4.9)$$

Since p + q = 1, summing the binomial distribution over all n gives

$$\sum_{n=0}^{N} P_N(n) = (p + q)^N = 1, \qquad (4.10)$$

and this is rather pleasant, because that is the probability that the man will end up at any place whatsoever, which is of course 1.
The reason that the number of possible ways to take $n_1$ steps forward and $n_2$ steps backward equals $\binom{N}{n_1}$ can be intuitively explained as follows. It is equivalent to calculating the number of ways to arrange a sequence of $n_1$ red balls and $n_2$ blue balls. If the balls are numbered, and thus distinguishable, this is simple: for the first ball there are N positions available, for the second N − 1, for the third N − 2 and so on. For the last ball there is only one position available. The total number of possibilities is therefore N(N − 1)(N − 2)...1 = N!. However, when only the color of a ball is considered and not its number (the drunken man doesn't number his steps either), we realize that we have double-counted quite a few times in arriving at N! possibilities. We will have to divide by the number of ways to arrange $n_1$ numbered red balls, which following the same reasoning as for N balls is $n_1!$, and also by the number of ways to arrange $n_2$ numbered blue balls, which is $n_2!$. The total number of sequences now is

$$\frac{N!}{n_1!\, n_2!} = \frac{N!}{n_1!\,(N - n_1)!} = \binom{N}{n_1},$$

which is the original expression that we wanted to prove.
Figure 4.2 gives the probability distribution $P_N(m) = P_N(n_1, N - n_1)$ for the case p = q = ½ and N = 20. Since the probabilities of taking a forward or backward step are equal, the largest probability is the probability that the drunken man will end up at m = 0. If the total number of steps is odd, the probability distribution has its maxima at m = −1 and m = 1. Since it's rather difficult to work with m (you would have to keep track of whether m is even or odd every single time), usually only $n_1$ is considered. If we just call it n, the binomial distribution takes the form

$$P_N(n) = \binom{N}{n} p^n (1 - p)^{N - n}, \qquad (4.11)$$

where n is the number of steps forward (with corresponding probability p). In this equation it is unimportant whether N is even or odd. From now on we will continue to use this form.
Figure 4.2: Binomial distribution PN (m) for the case p = q = 1/2, or equivalently the
probability that a drunken man finds himself at position m after taking N steps
if the probabilities of taking a step forward and taking a step backward are equal.
For the graphed distribution N = 20. The number n1 along the horizontal axis
is the number of steps taken in the forward direction.
We will start by calculating the expectation value ε⟨n⟩ of n (upon performing many measurements) and the variance var⟨n⟩ = ε⟨(n − ε⟨n⟩)²⟩. ε⟨n⟩ is given by

$$\varepsilon\langle n\rangle = \sum_{n=0}^{N} n\, P_N(n) = \sum_{n=0}^{N} n \binom{N}{n} p^n (1 - p)^{N-n}. \qquad (4.12)$$
The n = 0 term in this summation gives 0, so we can leave it out, and with the substitution n′ = n − 1 we write

$$\varepsilon\langle n\rangle = \sum_{n=1}^{N} n \binom{N}{n} p^n (1 - p)^{N-n} = \sum_{n'=0}^{N-1} (n' + 1) \binom{N}{n'+1} p^{n'+1} (1 - p)^{N-n'-1}. \qquad (4.13)$$

Using the identity $(n'+1)\binom{N}{n'+1} = N\binom{N-1}{n'}$ (proven below), this becomes

$$\varepsilon\langle n\rangle = N p \sum_{n'=0}^{N-1} \binom{N-1}{n'} p^{n'} (1 - p)^{(N-1)-n'}, \qquad (4.14)$$

and the remaining sum is, according to the binomial expansion (4.9),

$$\sum_{n'=0}^{N-1} \binom{N-1}{n'} p^{n'} (1 - p)^{(N-1)-n'} = \left(p + (1 - p)\right)^{N-1} = 1, \qquad (4.15)$$

so that

$$\varepsilon\langle n\rangle = N p. \qquad (4.16)$$
The proof that $(n'+1)\binom{N}{n'+1} = N\binom{N-1}{n'}$ can be given as follows.

$$(n'+1)\binom{N}{n'+1} = (n'+1)\,\frac{N!}{(n'+1)!\,(N-n'-1)!} = \frac{N!}{n'!\,(N-n'-1)!} = N\,\frac{(N-1)!}{n'!\,((N-1)-n')!} = N\binom{N-1}{n'}.$$

Q.e.d.
For the variance we use var⟨n⟩ = ε⟨(n − ε⟨n⟩)²⟩ = ε⟨n²⟩ − ε⟨n⟩² = ε⟨n²⟩ − N²p², where we have used the obtained expression for ε⟨n⟩. Now we need to calculate ε⟨n²⟩:
$$\varepsilon\langle n^2\rangle = \sum_{n=0}^{N} n^2 \binom{N}{n} p^n (1 - p)^{N-n} = \sum_{n'=0}^{N-1} (n'+1)\, N \binom{N-1}{n'} p^{n'+1} (1 - p)^{N-n'-1}, \qquad (4.19)$$
where we have applied the same trick we used earlier. In the term on the right we can split off the n′ + 1, which results in

$$\varepsilon\langle n^2\rangle = N p \sum_{n'=0}^{N-1} \binom{N-1}{n'} p^{n'} (1 - p)^{N-1-n'} + \sum_{n'=0}^{N-1} n'\, N \binom{N-1}{n'} p^{n'+1} (1 - p)^{N-n'-1}. \qquad (4.20)$$
The first term in this equation again equals N p (according to equations (4.14) and (4.15)) and for the right term we can again use the trick, so that we obtain

$$\varepsilon\langle n^2\rangle = N p + N(N-1) p^2 \sum_{n''=0}^{N-2} \binom{N-2}{n''} p^{n''} (1 - p)^{N-2-n''} = N p + N(N-1) p^2. \qquad (4.21)$$
The variance then follows as var⟨n⟩ = ε⟨n²⟩ − ε⟨n⟩² = N p + N(N−1)p² − N²p² = N p (1 − p) = N p q. Summarizing, for the binomial distribution the expectation value is given by

$$\varepsilon\langle n\rangle = N p \qquad (4.24)$$

and the variance by

$$\mathrm{var}\langle n\rangle = N p q \;\Rightarrow\; \sigma = \sqrt{N p q}. \qquad (4.25)$$
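As a quick numerical check of (4.24) and (4.25), the sketch below evaluates the binomial distribution (4.11) for example values of N and p and computes ε⟨n⟩ and var⟨n⟩ directly from the probabilities:

```python
import numpy as np
from math import comb

N, p = 20, 0.5                    # example values
n = np.arange(N + 1)
P = np.array([comb(N, k) * p**k * (1 - p)**(N - k) for k in n])

mean = np.sum(n * P)              # ε<n>
var = np.sum((n - mean)**2 * P)   # var<n>
print(f"sum of P = {P.sum():.6f} (should be 1)")
print(f"ε<n>   = {mean:.4f}, N·p   = {N*p}")
print(f"var<n> = {var:.4f}, N·p·q = {N*p*(1-p)}")
```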
4.4 The Poisson distribution
The Poisson distribution is essentially a special case of the binomial distribution. We
will let the number of steps N approach infinity, but the product N p (and thus ε ⟨n⟩)
will remain constant. Consequently, p will approach 0. An important example of this
is the measurement of the radioactivity of a sample. We measure pulses with a Geiger-
Müller counter for a time t. The number of pulses that we measure during this time t
we call n. The probability of measuring a pulse during a time interval ∆t is p. In our measurement time t there are N = t/∆t of these time intervals. The probability P(n)
that we will measure n pulses in N time intervals is equal to the probability that the
drunken man from the previous section took n steps forward out of a total of N steps.
This means that P (n) follows the binomial distribution
$$P(n) = \binom{N}{n} p^n (1 - p)^{N-n}. \qquad (4.26)$$
However, the decay process is not discrete like the steps from the drunken man, but
continuous. This means that we are not really dealing with time intervals ∆t, but that
we will need to make these time intervals infinitesimally small. So we take the limit ∆t → 0, which implies that N = t/∆t → ∞. However, the average number of pulses
µ = ε⟨n⟩ that we will count during the measurement time t, if we repeat the experiment
an infinite amount of times, remains unchanged; this average is of course determined by the magnitude of the radioactivity of the sample. We will have to keep ε⟨n⟩ = N p = µ
constant. Hence, the equation becomes
$$P(n) = \lim_{N \to \infty} \binom{N}{n} \left(\frac{\mu}{N}\right)^n \left(1 - \frac{\mu}{N}\right)^{N-n}. \qquad (4.27)$$
We can evaluate this limit analytically. We will not give the proof here, but eventually
we will find for the Poisson distribution
$$P(n) = \frac{\mu^n}{n!}\, e^{-\mu}, \qquad (4.28)$$
where µ is (again) the average number of pulses that we would measure if we would
repeat the experiment an infinite amount of times. The expectation value ε⟨n⟩ and the
variance var⟨n⟩ = ε ⟨(n − ε⟨n⟩)2 ⟩ can be calculated easily.
We find ε⟨n⟩ = µ and var⟨n⟩ = N p q (from the binomial distribution) = µ, because N p = µ was kept constant and q = 1 − p → 1, because p approaches 0. The last result is rather unexpected.
In chapter 3 we have seen that when the measurement data shows a variation around
an expectation value, it was necessary to repeat the experiment several times to get an
impression of the magnitude of this variation. From this variation we could determine
the uncertainty in the expected value. When we know that we are dealing with a process that follows a Poisson distribution, we can in principle estimate, from a single measurement, the variation that we would find if we were to repeat the experiment several times. If we assume that this single measurement results in a value $n_1$ that is close to µ, then the variation in the measurements will be $\sigma = \sqrt{\mathrm{var}\langle n\rangle} = \sqrt{\mu} \approx \sqrt{n_1}$. With this we can also say something about the interval in which µ must lie: there is a 68% probability that µ lies between the values $n_1 - \sqrt{n_1}$ and $n_1 + \sqrt{n_1}$.
We have assumed that $\sqrt{\mu} \approx \sqrt{n_1}$. However, in a single measurement we know that $n_1$ can deviate from µ significantly. We will now show that the approximation is acceptable nonetheless, provided that µ is not too small. We know that there is a 68% probability that a single measurement lies within the boundaries $\mu - \sqrt{\mu}$ and $\mu + \sqrt{\mu}$, so with a 68% probability it holds that $\mu - \sqrt{\mu} < n_1 < \mu + \sqrt{\mu}$. Let us calculate what we would find for $\sqrt{n_1}$ for the boundary $n_1 = \mu + \sqrt{\mu}$:

$$\sqrt{n_1} = \sqrt{\mu + \sqrt{\mu}} = \sqrt{\mu\left(1 + \tfrac{1}{\sqrt{\mu}}\right)} = \sqrt{\mu}\left(1 + \tfrac{1}{\sqrt{\mu}}\right)^{1/2}.$$
Summarizing: for the Poisson distribution $P(n) = \frac{\mu^n}{n!} e^{-\mu}$ the expectation value is given by

$$\varepsilon\langle n\rangle = \mu \qquad (4.29)$$

and the variance by

$$\mathrm{var}\langle n\rangle = \mu \;\Rightarrow\; \sigma = \sqrt{\mu}. \qquad (4.30)$$
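The limit taken in (4.27) can also be checked numerically: keeping µ = Np fixed while N grows, the binomial probabilities approach the Poisson probability (4.28). A sketch with arbitrary example values:

```python
from math import comb, exp, factorial

mu, n = 8.0, 5                    # example: P(5 pulses) for µ = 8
poisson = mu**n / factorial(n) * exp(-mu)
print(f"Poisson : {poisson:.6f}")

for N in (20, 100, 1000, 10000):  # increasing number of intervals
    p = mu / N
    binom = comb(N, n) * p**n * (1 - p)**(N - n)
    print(f"binomial, N = {N:5d}: {binom:.6f}")
```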
4.4.1 The Poisson distribution in a different perspective
In our previous statements about the Poisson distribution we have assumed that we
have a fixed time interval and we have calculated how many pulses we can expect in
this interval. We will now consider a fixed number of pulses and find out how long it
will take to measure this number of pulses. We will start simple with one pulse. The
question is: how long will it take until the next pulse will be measured? Since the pulses
from a radioactive sample are not measured at regular time intervals, we are dealing
with a probability distribution. This probability distribution of measuring a next pulse
at time t, we will call p1 (t). We start our measurement at time t = 0 and we divide
the time interval until time t at which the next pulse is measured in small intervals dt.
We know for sure that in all those intervals dt there was no pulse measured, except for
the time interval t and t + dt, in which there was one measured pulse. The probability
of measuring a pulse in such a time interval dt we call q for now, so the probability is
described by
$$p_1(t)\, dt = (1 - q)^N q. \qquad (4.31)$$

In this equation we have N intervals dt until time t, so dt = t/N. The term $(1 - q)^N$ is the probability that in the first N intervals there is no pulse, and the term q is the probability that in the (N+1)-th interval there is a pulse.
a pulse in the time interval dt of course depends on the size of this interval. For very
small time intervals dt we can write q = q0 dt = dt/τ , in which q0 = 1/τ is a constant.
If we insert this in equation (4.31) and make the intervals infinitesimally small (so the
number of steps N gets infinitely large), we will find
$$p_1(t)\, dt = \lim_{N \to \infty} \left(1 - \frac{t}{N\tau}\right)^N \frac{1}{\tau}\, dt. \qquad (4.32)$$

Using $\lim_{N\to\infty}(1 - x/N)^N = e^{-x}$ this gives

$$p_1(t)\, dt = \frac{1}{\tau} \exp\left(-\frac{t}{\tau}\right) dt. \qquad (4.33)$$
We will now find an expression for pN (t). This is the probability density function that
describes the probability of measuring the N ’th pulse at time t. This probability is of
course a summation of all probabilities to measure N − 1 pulses in a time t′ (where
0 ≤ t′ ≤ t) and the last pulse in the remaining time t − t′ , or equivalently
$$p_N(t) = \int_0^t p_{N-1}(t')\, p_1(t - t')\, dt'. \qquad (4.34)$$
As a solution we try the form (with $\alpha_N$ a constant still to be determined)

$$p_N(t) = \alpha_N\, \frac{t^{N-1}}{\tau^N} \exp\left(-\frac{t}{\tau}\right). \qquad (4.35)$$

Substituting this, together with (4.33), into equation (4.34) gives

$$\alpha_N\, \frac{t^{N-1}}{\tau^N} \exp\left(-\frac{t}{\tau}\right) = \int_0^t \alpha_{N-1}\, \frac{t'^{N-2}}{\tau^{N-1}} \exp\left(-\frac{t'}{\tau}\right) \frac{1}{\tau} \exp\left(-\frac{t - t'}{\tau}\right) dt' = \frac{\alpha_{N-1}}{\tau^N} \exp\left(-\frac{t}{\tau}\right) \int_0^t t'^{N-2}\, dt' = \frac{\alpha_{N-1}}{\tau^N} \exp\left(-\frac{t}{\tau}\right) \frac{t^{N-1}}{N-1}, \qquad (4.36)$$

and since $\alpha_1 = 1$, we get

$$\alpha_N = \frac{\alpha_{N-1}}{N-1} = \frac{1}{(N-1)!}. \qquad (4.37)$$
The general expression for $p_N(t)$ is therefore

$$p_N(t) = \frac{t^{N-1}}{(N-1)!\, \tau^N} \exp\left(-\frac{t}{\tau}\right). \qquad (4.38)$$
From this expression we can subsequently also derive equation (4.28). This goes as
follows. The probability of measuring n pulses in a time interval T we call PT (n).
This probability is the summation of all probabilities that the n pulses will occur in an
interval t (with 0 ≤ t ≤ T ) and that in the rest of the interval (so T − t) there are no
more pulses:
$$P_T(n) = \int_0^T p_n(t)\, P_{T-t}(0)\, dt. \qquad (4.39)$$
The probability $P_{T-t}(0)$ to find no more pulses in the interval T − t is obviously equal to the sum of all probabilities that the next pulse comes later than that, which means that

$$P_{T-t}(0) = \int_{T-t}^{\infty} p_1(t')\, dt' = \int_{T-t}^{\infty} \frac{1}{\tau} \exp\left(-\frac{t'}{\tau}\right) dt' = \exp\left(-\frac{T-t}{\tau}\right). \qquad (4.40)$$
Inserting this and (4.38) into (4.39) gives

$$P_T(n) = \int_0^T \frac{t^{n-1}}{(n-1)!\, \tau^n} \exp\left(-\frac{t}{\tau}\right) \exp\left(-\frac{T-t}{\tau}\right) dt = \frac{\exp(-T/\tau)}{(n-1)!\, \tau^n} \int_0^T t^{n-1}\, dt = \frac{(T/\tau)^n}{n!} \exp\left(-\frac{T}{\tau}\right), \qquad (4.41)$$

which, with µ = T/τ, is precisely the Poisson distribution (4.28).
zc P (z ≥ zc ) zc P (z ≥ zc ) zc P (z ≥ zc ) zc P (z ≥ zc )
0.0 0.500 1.0 0.159 2.0 0.023 3.0 1.35·10−3
0.1 0.460 1.1 0.136 2.1 0.018 3.1 9.68·10−4
0.2 0.421 1.2 0.115 2.2 0.014 3.2 6.87·10−4
0.3 0.382 1.3 0.097 2.3 0.011 3.3 4.83·10−4
0.4 0.345 1.4 0.081 2.4 0.008 3.4 3.37·10−4
0.5 0.309 1.5 0.067 2.5 6.21·10−3 3.5 2.33·10−4
0.6 0.274 1.6 0.055 2.6 4.66·10−3 3.6 1.59·10−4
0.7 0.242 1.7 0.045 2.7 3.47·10−3 3.7 1.08·10−4
0.8 0.212 1.8 0.036 2.8 2.56·10−3 3.8 7.24·10−5
0.9 0.184 1.9 0.029 2.9 1.87·10−3 4.0 3.17·10−5
If both x̄ and σ of a normal distribution are known, the exceedance probability (p-value) for any measured value x can be calculated from the table via z = (x − x̄)/σ.
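The tabulated values can be reproduced with the complementary error function, since P(z ≥ z_c) = ½ erfc(z_c/√2); a minimal sketch:

```python
from math import erfc, sqrt

def exceedance(zc: float) -> float:
    """P(z >= zc) for the reduced normal distribution."""
    return 0.5 * erfc(zc / sqrt(2))

for zc in (0.0, 1.0, 2.0, 3.0):
    print(f"z_c = {zc}: P = {exceedance(zc):.5f}")
# Expected from the table: 0.500, 0.159, 0.023, 1.35e-3
```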
4.6 Comparison of the three distribution functions

Figure 4.3: Comparison of the three probability distribution functions. The Gaussian distribution (solid line) is depicted for σ = 2.68 and x̄ = 8.0, the Poisson distribution (squares) for µ = 8.0 and the binomial distribution (circles) for N = 80 and p = 0.1.
4.7 Exercises and problems

4.7.1 Questions
1. The width of the normal distribution is characterized by the parameter σ (standard deviation). The width of the distribution at half its height (i.e., the distance between the points for which $p(x) = \frac{1}{2} p(\bar{x})$) is called the ‘Full Width at Half Maximum’ (FWHM). Prove that $\mathrm{FWHM} = 2\sigma\sqrt{2\ln(2)} \approx 2.35\,\sigma$.
This is a way to quickly determine the standard deviation from a plotted frequency distribution (∼ probability density): $\sigma = \mathrm{FWHM}/2.35 \approx \frac{3}{7}\,\mathrm{FWHM}$.
2. Take note: The first part of this exercise is a copy from an exercise in chapter
3. When measuring the thickness of a thin layer of liquid helium, the following
angles (in minutes) were observed:
34 35 45 40 46
38 47 36 38 34
33 36 43 43 37
38 32 38 40 33
38 40 48 39 32
36 40 40 36 34
3. Using the reduced normal distribution, show that there is a 68% probability of
finding a measurement result between x̄ − σ and x̄ + σ for a Gaussian probability
distribution.
4. Someone measures the radioactivity of a sample using a Geiger-Müller counter.
During a time interval t the apparatus counts n pulses. This amount of pulses
satisfies the Poisson distribution
$$p(n) = \frac{\mu^n}{n!}\, e^{-\mu}$$
where p(n) is the probability of counting n pulses during the whole time interval
and µ is the ‘real value’, i.e., µ is the average amount of counted pulses if the
experiment were repeated an infinite amount of times. Of course, µ is propor-
tional to the level of radioactivity of the sample. The experimenter repeats the
measurement 10 times for a time interval t=200 s and finds as results 4673, 4628,
4656, 4509, 4698, 4710, 4642, 4590, 4558 and 4731 pulses, respectively.
(a) To verify whether the measured amount of pulses satisfies the Poisson distribution, one can compare the spread in the measurements (standard deviation) with the theoretical standard deviation. Do so and give an uncertainty in your answer. Use that the standard deviation $S_S$ of the standard deviation S is approximated by the formula $\frac{S_S}{S} = \sqrt{\frac{1}{2(N-1)}}$, where N is the amount of measurements. What is the conclusion?
(b) Give the best estimate of µ and the uncertainty.
Now measurements are performed using another sample, which is about half as
radioactive as the first sample (i.e. µ for this second sample is approximately half
the value of µ for the first sample). The experimenter decides to perform one
measurement during one time interval.
(c) What time interval should be chosen so that µ can be determined with the
same absolute accuracy as for the first sample? You do not have to give an
uncertainty, this is only an estimate.
(d) What time interval should be chosen so that µ can be determined with the
same relative accuracy as for the first sample? (Again, no uncertainty is
needed)
4.7.2 Problems
1. A number of measurements of the length of an object give as results 50.32, 50.41,
50.35, 50.38, 50.26, 50.44, 50.38, 50.38, 50.36, 50.41, 50.55 mm.
The last measurement is quite far from the average. Should the measurement be
rejected or is it likely that it is a ‘good’ measurement? In your answer, use the
reduced normal distribution. This reduced normal distribution is given by the
function
$$\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}.$$
It is derived from the ‘ordinary’ normal distribution (Gaussian distribution)

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\bar{x})^2}{2\sigma^2}}.$$
The so-called exceedance probability P (z ≥ zc ) of the reduced normal distribution
is given by:
$$P(z \ge z_c) = \int_{z_c}^{\infty} \varphi(z)\, dz.$$
This is the probability to find (measure) a value of z which is larger than zc . In
the lecture notes, a table is given where this exceedance probability is given as a
function of zc .
Hint: Calculate how far off the final measurement is from the average of the first 10 measurements, and calculate using the reduced normal distribution what the probability is of finding a measurement outcome that is at least as far away from the average. Is it likely to find such an outcome in a relatively small measurement series?
2. The presidential election in a very large country had 2 candidates A and B. The
amount of votes cast was so large that it can be considered infinite. To get an
estimate of which of the two candidates got the most votes, a representative sample
is taken. Of all the votes cast, N are selected and viewed. Of this sample of N
ballots, n were for candidate A and N − n were for candidate B. For simplicity’s
sake, we assume there were no invalid ballots. Using this sample, we of course
want to make a statement about the fraction x of all votes that candidate A got.
As a first estimate, you might say that x = n/N , but is this true? In a sample
resulting in n votes for A and N − n votes for B, we can write the probability
density function for x as:
$$p(x) = (N + 1) \binom{N}{n} x^n (1 - x)^{N-n}.$$
Because x is a fraction, of course 0 ≤ x ≤ 1 holds. Note that this probability
density looks very much like the binomial distribution, but now with the fraction
x as unknown (instead of the amount n).
For each N and n, p (x) is a different function. For example for N = 10 and n = 3
(left) and for N = 10 and n = 7 (right), it looks as follows:
[Figure: the probability density p(x) for N = 10, n = 3 (left, peaked near x = 0.3) and N = 10, n = 7 (right, peaked near x = 0.7), for 0 ≤ x ≤ 1.]
The given probability density function is identical to the ‘Beta distribution’, which is defined as

$$p(x) = \frac{1}{B(a, b)}\, x^{a-1} (1 - x)^{b-1}.$$
In the expression for this probability density, the term B(a, b) appears. This is the so-called Beta function, which is not important for us at this point. Known (from literature, but you could also just calculate this) is that the expectation value ε⟨x⟩ equals

$$\varepsilon\langle x\rangle = \frac{a}{a + b}$$

and the variance var⟨x⟩ equals

$$\mathrm{var}\langle x\rangle = \frac{ab}{(a + b)^2 (a + b + 1)}.$$
3. In a casino, players can buy chips for € 1 each. The owner wants to introduce the
following game. A player continuously draws one card from a very large collection
of cards. The ratio between face cards (Jack-Ace) and pip cards (2-10) is 4:9 and
does not change by drawing cards. Each time a player draws a card, he has to
stake 2 chips. The player continues drawing cards as long as he draws face cards.
When the player draws a pip card (2,3,...,10), the game ends and the player wins
the square of the amount of drawn cards (including the last pip card) in chips.
So, if the player immediately draws a pip card, he wins 1 chip but he had staked
2 (net loss); if he first draws a face card and then a pip card, he wins 4 chips but
he had staked 4 (break-even); etc.

(a) Show that the probability P(n) that the player has to stop after the n-th card (n = 1, ..., ∞) equals

$$P(n) = \frac{9}{4} \left(\frac{4}{13}\right)^n.$$
(b) Show that in the long term, the casino will turn a loss with this game and
calculate what the average loss (to the casino) per player will be
Hint: below this exercise there is a list of expressions for series
(c) Calculate what the minimum stake (in chips) per drawn card should be, such that the casino can make a profit in the long term.
(d) Now players have to stake the amount of chips calculated in (c). What is
the probability that a player makes a profit in 1 game?
Some series:

$$\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \infty$$

$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \ln(2)$$

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots = \frac{\pi^2}{6}$$

$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n-1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{\pi}{4}$$

$$\sum_{n=0}^{\infty} r^n = 1 + r + r^2 + r^3 + r^4 + \cdots = \frac{1}{1-r} \quad \text{if } |r| < 1$$

$$\sum_{n=1}^{\infty} n r^n = r + 2r^2 + 3r^3 + 4r^4 + \cdots = \frac{r}{(1-r)^2} \quad \text{if } |r| < 1$$

$$\sum_{n=1}^{\infty} n^2 r^n = r + 4r^2 + 9r^3 + 16r^4 + \cdots = \frac{r(1+r)}{(1-r)^3} \quad \text{if } |r| < 1$$
4. Two players A and B throw an (ideal) die. They take turns throwing, as long as each roll is larger than the (opponent's) roll before it. The one who throws a roll equal to or lower than the previous roll loses the game. The table below gives the probability P(n) that the game ends after precisely the n-th turn.
n P (n)
1 0
2 27216/46656 = 7/12
3 15120/46656 = 35/108
4 3780/46656 = 35/432
5 ?
6 35/46656
7 1/46656
Now the players play for money. Both players stake 1 Euro at the beginning of
the game. After the second throw (if the game has not finished by that time), the
player who wins most often (according to the previous question) stakes another
Euro before each new roll. The other player does not. The player who wins the
game, takes the pot.
(f) Calculate who is expected to make the most money. Calculate the average
profit per game.
5. For this exercise, the following is given:

$$\int t\, e^{\alpha t}\, dt = \frac{\alpha t - 1}{\alpha^2}\, e^{\alpha t} \quad \text{and} \quad \int t^2 e^{\alpha t}\, dt = \frac{\alpha^2 t^2 - 2\alpha t + 2}{\alpha^3}\, e^{\alpha t}.$$
The duration of phone calls is distributed according to the so-called exponen-
tial distribution. This distribution is described by the probability density function
$$p(t) = K\, e^{-t/\mu}.$$
The calling rates are € 0.20 as a starting rate (which is always charged) plus € 0.20 per minute. This price is not calculated in full minutes, but exactly as a function of the call duration.
The phone company offers alternative rates. Here, no starting rate is charged, but the rate is € 0.04 times the call duration (expressed in minutes) squared. Here, again, the exact call duration is used, not rounded to whole minutes.
(e) Again calculate the average price per phone call, but now for these alternative
rates. Which rates are better for the caller?
(f) For which average call duration (instead of the given 4.05 minutes) is the
turning point? (i.e., are both rates equal on average)?
5. Combining measurement results
5.1 Introduction
So far we have talked about the way in which one single measurement (100% intervals)
or a series of measurements (68% intervals) should be analyzed. At this moment we will
limit ourselves to measurements that show a spread in the results. From a measurement
series one can determine the average and its uncertainty. After this the result can be
compared to values from literature or to the results of other measurements of the same
quantity. In §1.6 we discussed how we can draw conclusions from this comparison.
When the measurement results from different experiments are consistent, all results
can be combined to form an even more accurate final result. For example, when two
series of measurements of a quantity x result in x1 = 9.8 ± 0.3 and x2 = 10.2 ± 0.6
(where we neglect the units for now), we can already feel that the true value of x will
lie somewhere above 9.8 and below 10.2. Since the uncertainty in x1 is smaller than the
one in x2 , the true value will probably lie slightly closer to x1 than to x2 . The question
is whether or not a better estimation of the true value than x1 or x2 is possible and
how we should determine this. The answer will be derived in the next section.
Another way to combine measurement results is used when a quantity is a function of
another (measured) quantity and measurements have been performed for several values
of the latter quantity. For example, the extension of a spring is measured and each time
a different mass is hanging from it. If the extension is plotted as a function of the mass,
the measurements will more or less (depending on the uncertainty) follow a straight
line. The slope of this line gives the spring constant. When the measured extensions
have a big uncertainty and therefore will spread widely along the best-fitting line, the
slope of this line will also have a big uncertainty. How we can determine the best-fitting
line and the uncertainty in its slope from a series of measurements will be discussed
further on in this chapter.
It is of course also possible that the measurement points do not follow a straight line,
but have some other functional dependence. Sometimes it is possible to perform an
operation on the measurement points such that they follow a straight line, but in cases
where this is not possible a different method should be used.
5.2 Combining measurements of the same quantity
When we have different results from different measurement series of the same quantity
x, the question is: how can we combine these to form a final result? For example, the
quantity x is determined from 3 measurement series, resulting in
x1 = 9.8 ± 0.3;
x2 = 10.2 ± 0.6;
x3 = 9.9 ± 0.9. (5.1)
How should we average this and how do we determine the uncertainty in the final result?
We always start with checking whether or not the measurements are consistent (see
§1.6). This obviously is the case here. We already mentioned earlier that the true
value will probably lie closer to x1 than to x2 , since the uncertainty in x1 is smaller.
It is obvious that the results should be averaged with a weighing factor that in one
way or another depends on the uncertainty: the smaller the uncertainty, the bigger the
weighing factor. The weighing factors we will use will be called G1 , G2 and G3 . If we
know these factors, we can calculate the weighted average x̄ using
$$\bar{x} = \frac{G_1 x_1 + G_2 x_2 + G_3 x_3}{G_1 + G_2 + G_3}. \qquad (5.2)$$
To derive what G1 , G2 and G3 should be, we assume, for convenience, that the results
are determined using the same method, so with the same spread in the separate mea-
surement results. So, all measurement series have the same spread S. Therefore, we
know that the respective uncertainties Sx̄1 , Sx̄2 and Sx̄3 are determined from
$$S_{\bar{x}_i} = \frac{S}{\sqrt{N_i}} \quad (i = 1..3), \qquad (5.3)$$
where Ni is the number of measurements in measurement series i. From the uncertain-
ties Sx̄1 , Sx̄2 and Sx̄3 in our example we now know that
$$N_1 = \left(\frac{S}{S_{\bar{x}_1}}\right)^2 = \left(\frac{S}{0.3}\right)^2, \quad N_2 = \left(\frac{S}{S_{\bar{x}_2}}\right)^2 = \left(\frac{S}{0.6}\right)^2, \quad N_3 = \left(\frac{S}{S_{\bar{x}_3}}\right)^2 = \left(\frac{S}{0.9}\right)^2. \qquad (5.4)$$
In total we have N1 measurements with x̄=9.8, spread in accordance with the spread
S, N2 measurements with x̄=10.2, spread in accordance with the spread S and N3
measurements with x̄=9.9, also spread in accordance with the spread S. If we now
consider this as one big measurement series (with N1 + N2 + N3 measurements in total),
we find
$$\bar{x} = \frac{N_1 x_1 + N_2 x_2 + N_3 x_3}{N_1 + N_2 + N_3} = \frac{\left(\frac{S}{S_{\bar{x}_1}}\right)^2 x_1 + \left(\frac{S}{S_{\bar{x}_2}}\right)^2 x_2 + \left(\frac{S}{S_{\bar{x}_3}}\right)^2 x_3}{\left(\frac{S}{S_{\bar{x}_1}}\right)^2 + \left(\frac{S}{S_{\bar{x}_2}}\right)^2 + \left(\frac{S}{S_{\bar{x}_3}}\right)^2}, \qquad (5.5)$$
where we have used the earlier expressions for N1 , .., N3 . Dividing both the numerator
and the denominator by S 2 finally gives
$$\bar{x} = \frac{\frac{x_1}{(S_{\bar{x}_1})^2} + \frac{x_2}{(S_{\bar{x}_2})^2} + \frac{x_3}{(S_{\bar{x}_3})^2}}{\left(\frac{1}{S_{\bar{x}_1}}\right)^2 + \left(\frac{1}{S_{\bar{x}_2}}\right)^2 + \left(\frac{1}{S_{\bar{x}_3}}\right)^2}. \qquad (5.6)$$
The uncertainty in the final result can be found just as easily: N₁ + N₂ + N₃ measurements with spread S gives

$$S_m = \frac{S}{\sqrt{N_1 + N_2 + N_3}} = \frac{S}{\sqrt{\left(\frac{S}{S_{\bar{x}_1}}\right)^2 + \left(\frac{S}{S_{\bar{x}_2}}\right)^2 + \left(\frac{S}{S_{\bar{x}_3}}\right)^2}} = \frac{1}{\sqrt{\left(\frac{1}{S_{\bar{x}_1}}\right)^2 + \left(\frac{1}{S_{\bar{x}_2}}\right)^2 + \left(\frac{1}{S_{\bar{x}_3}}\right)^2}}. \qquad (5.8)$$
In general, for N results $\bar{x}_i \pm S_{\bar{x}_i}$, the weighted average is

$$\bar{x} = \frac{\sum_{i=1}^{N} G_i\, \bar{x}_i}{\sum_{i=1}^{N} G_i}, \qquad (5.10)$$

where

$$G_i = \left(\frac{1}{S_{\bar{x}_i}}\right)^2, \qquad (5.11)$$

and the uncertainty in this weighted average is

$$S_{\bar{x}} = \frac{1}{\sqrt{\sum_{i=1}^{N} G_i}}. \qquad (5.12)$$
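As an illustration, the weighted average of the three results in (5.1) follows directly from (5.10) to (5.12); a minimal Python sketch:

```python
import numpy as np

x = np.array([9.8, 10.2, 9.9])        # results x̄_i of the series
S = np.array([0.3, 0.6, 0.9])         # their uncertainties S_x̄i

G = 1 / S**2                          # weighing factors, eq. (5.11)
x_mean = np.sum(G * x) / np.sum(G)    # weighted average, eq. (5.10)
S_mean = 1 / np.sqrt(np.sum(G))       # its uncertainty, eq. (5.12)
print(f"weighted average: {x_mean:.2f} ± {S_mean:.2f}")
```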
Note that the uncertainty $S_{\bar{x}}$ in the weighted average is always smaller than the uncertainties $S_{\bar{x}_i}$ in the separate results. This can easily be proven as follows. For the sum of all weighing factors we have

$$\sum_i G_i = G_1 + G_2 + ... + G_j + ... + G_N \ge G_j$$

for all j, since all factors $G_i$ are positive. From this it immediately follows that

$$\frac{1}{\sum G_i} \le \frac{1}{G_j}, \quad \forall j = 1..N,$$

and therefore also that

$$\frac{1}{\sqrt{\sum G_i}} \le \frac{1}{\sqrt{G_j}} \;\Rightarrow\; S_{\bar{x}} \le S_{\bar{x}_j}, \quad \forall j = 1..N.$$
Suppose now that a quantity y depends linearly on a quantity x,

$$y = y(x) = ax + b. \qquad (5.13)$$
This dependence (the slope a and the offset b) we want to determine from the measure-
ments (xi , yi ). Each measurement point is determined from its own measurements series
and we will assume that the uncertainty in all yi is constant and equal to Sȳ . Also, for
now, we will assume that the uncertainties in all xi are very small (xi is determined
with infinite accuracy). In a graph this might look like what can be seen in figure 5.1.
The measurement points are indicated by dots with corresponding regions of error (the
error bars) that have the same size for all measurements (Sȳ ). It is clear that all
points are consistent with the straight line that has been drawn through them. We are
looking for the best-fitting line y = ax + b, so the question is: what are a and b? The
measurement points we call (xi , yi ) (i = 1..N ). At the point x = xi we expect to find
y = y(xᵢ) = axᵢ + b, but we find y = yᵢ. The difference we call ∆yᵢ (see figure), so

$$\Delta y_i = y_i - y(x_i) = y_i - (a x_i + b). \qquad (5.14)$$
The parameters a and b need to be chosen such that the measurement points (xi , yi )
all lie as close to the line as possible. This means that the deviations ∆yi need to
[Figure: the line y = ax + b with slope a = tan(α), offset b, and a measurement point (xᵢ, yᵢ) at vertical distance ∆yᵢ from the line.]
Figure 5.1: The best-fitting straight line through the points (xi , yi ). We assumed that the
uncertainties in all xi are negligible and that those in all yi (the error bars) are
the same for all points. The vertical distance from a point (xi , yi ) to the straight
line is ∆yi .
be as small as possible. We could consider minimizing the sum $\sum \Delta y_i$ of all these deviations, but since ∆yᵢ can be both positive and negative according to the definition above, minimizing the sum $\sum \Delta y_i$ won't give the desired result. We could minimize the sum $\sum |\Delta y_i|$, but it is usually hard to calculate with absolute values, so instead we are going to minimize the sum $\sum (\Delta y_i)^2$. We also have mathematical reasons for this, but we will get back to that. We define

$$S = \sum_{i=1}^{N} \Delta y_i^2 = \sum_{i=1}^{N} (y_i - a x_i - b)^2 = f(a, b). \qquad (5.15)$$
With f(a, b) we mean that this S is a function of the variables a and b. We need to choose a and b such that S = f(a, b) is minimal. We know that a one-dimensional function f(x) is minimal when its first derivative is zero: df/dx = 0. Analogous to this, f(a, b) is at its minimum when ∂f/∂a = 0 and ∂f/∂b = 0, where ∂f/∂a and ∂f/∂b are the partial derivatives of f with respect to a and b, respectively (see also §2.1.2). From the first condition (∂S/∂a = 0) we deduce
$$\frac{\partial S}{\partial a} = \frac{\partial}{\partial a} \sum_{i=1}^{N} (y_i - a x_i - b)^2 = \sum_{i=1}^{N} \frac{\partial}{\partial a} (y_i - a x_i - b)^2 = \sum_i -2 x_i (y_i - a x_i - b) = -2\sum_i x_i y_i + 2a \sum_i x_i^2 + 2b \sum_i x_i = 0$$

$$\Rightarrow\; a \sum_i x_i^2 + b \sum_i x_i = \sum_i x_i y_i, \qquad (5.16)$$
and from the second

$$\frac{\partial S}{\partial b} = \frac{\partial}{\partial b} \sum_{i=1}^{N} (y_i - a x_i - b)^2 = \sum_{i=1}^{N} \frac{\partial}{\partial b} (y_i - a x_i - b)^2 = \sum_i -2 (y_i - a x_i - b) = -2\sum_i y_i + 2a \sum_i x_i + 2b \sum_i 1 = 0$$

$$\Rightarrow\; a \sum_i x_i + b N = \sum_i y_i. \qquad (5.17)$$
Note that the summations can be calculated from the measurements and therefore they
are constants. We now have a system of two equations with two unknowns:
$$\alpha_{11}\, a + \alpha_{12}\, b = \delta_1, \qquad \alpha_{21}\, a + \alpha_{22}\, b = \delta_2, \qquad (5.18)$$

with the solution

$$a = \frac{\alpha_{22}\, \delta_1 - \alpha_{12}\, \delta_2}{\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}}, \qquad b = \frac{\alpha_{11}\, \delta_2 - \alpha_{21}\, \delta_1}{\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}}. \qquad (5.19)$$
By filling in α₁₁, α₁₂, α₂₁, α₂₂, δ₁ and δ₂ we get the final result

$$a = \frac{N \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{D}, \qquad b = \frac{\sum_i x_i^2 \sum_i y_i - \sum_i x_i \sum_i x_i y_i}{D}, \qquad (5.20)$$

where $D = N \sum_i x_i^2 - \left(\sum_i x_i\right)^2$. Note that this can also be written as $D = N \sum_i (x_i - \bar{x})^2$. From this we immediately see that at all times D > 0.
With this we can calculate the slope a and the offset b of the best-fitting line. In the
introduction of this chapter we gave the example of determining a spring constant.
This constant follows directly from the above equation for a. However, we know that
the measurements will show a spread around the best-fitting line, so there will be an
uncertainty in the values of a and b we just calculated. Obviously, we also want to
calculate these uncertainties. These uncertainties must have something to do with the
spread Sȳ in the individual measurements of y and therefore also with the spread of the
points around the line we just found. We will not provide the proof for the following
expressions for the uncertainties Sₐ and S_b in a and b.

$$S_a = \sqrt{\frac{N S_{\bar{y}}^2}{D}} = S_{\bar{y}} \sqrt{\frac{N}{D}}, \qquad S_b = \sqrt{\frac{S_{\bar{y}}^2 \sum_i x_i^2}{D}} = S_{\bar{y}} \sqrt{\frac{\sum_i x_i^2}{D}}, \qquad (5.21)$$

where again $D = N \sum_i x_i^2 - \left(\sum_i x_i\right)^2 = N \sum_i (x_i - \bar{x})^2$. We emphasize again that so
far the measurement points yi were averages of measurement series and that they had
an uncertainty Sȳ , which was the standard deviation of the mean. However, it can also
occur that the measurement points (xi , yi ) come from just one measurement instead
of a whole series. We are still assuming that the spread (width of the probability
distribution) is the same for each measurement. In that case the measurements yi don’t
have an uncertainty Sȳ , but Sy (the standard deviation in the separate measurements).
A problem is emerging here. When we had a measurement series, we could determine
the uncertainty of the mean (i.e. the standard deviation of the mean) by using equation
(3.31). The only way to determine the standard deviation and the standard deviation
of the mean is by using a series of measurements. However, now we have only one
measurement per point and from this we cannot determine Sy . We do have a series of
measurements yi , but each point yi has its own true value yi,t . But since we assumed
that all measurements show the same spread, we can still calculate Sy . Analogous to
equation (3.9) we can write
$$S_y^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - y_{i,t})^2, \qquad (5.22)$$
where we now have a true value yi,t for each measurement point yi . We also know that
the true values yi,t lie on a straight line (the line that we are looking for). Similar to
section 3.5.1, where we didn’t know the true value xt , but did know an average x̄, we
now don’t know the exact straight line on which the true values yi,t lie, but we have
found an approximation y(x) = ax + b. When we filled in x̄ instead of xt in section
3.5.1, we saw that we had to use a term 1/(N − 1) instead of 1/N. In the same way we can show that when we fill in axᵢ + b instead of $y_{i,t}$, the term 1/N should be changed into 1/(N − 2). We will not prove this here. The 2 in this term comes from the so-called number of degrees of freedom. In this case we have two unknowns that we want to determine (a and b), so the number of degrees of freedom indeed is 2. Generally, with p unknowns, we should use the term 1/(N − p). For example, with a straight line y = ax (in the next section), we have the term 1/(N − 1).
We can now calculate the term Sy2 , which occurs in the expressions for Sa and Sb , using
$$S_y^2 = \frac{1}{N-2} \sum_{i=1}^{N} \Delta y_i^2 = \frac{1}{N-2} \sum_{i=1}^{N} \left(y_i - (a x_i + b)\right)^2. \qquad (5.23)$$
Summarizing
We saw that we can calculate the best-fitting line through a series of measurements
(xi , yi ) by using equation (5.20). We assumed here that the uncertainties in all xi
are negligible and that those in all yi are equal. When the points yi are averages of
series of measurements, they have an uncertainty Sȳ and then we can calculate the
uncertainties Sa and Sb in a and b, respectively, using equation (5.21). When the points
yi are separate measurements, they have an uncertainty Sy . We can calculate this using
equation (5.23). For the uncertainties Sa and Sb we can still use equation (5.21), but
now with Sy instead of Sȳ .
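Equations (5.20), (5.21) and (5.23) translate directly into code. A sketch with made-up data points (here the yᵢ are separate measurements, so Sy is estimated with N − 2 degrees of freedom):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # made-up measurements
N = len(x)

D = N * np.sum(x**2) - np.sum(x)**2                      # eq. (5.20)
a = (N * np.sum(x*y) - np.sum(x) * np.sum(y)) / D
b = (np.sum(x**2) * np.sum(y) - np.sum(x) * np.sum(x*y)) / D

Sy2 = np.sum((y - (a*x + b))**2) / (N - 2)               # eq. (5.23)
Sa = np.sqrt(N * Sy2 / D)                                # eq. (5.21)
Sb = np.sqrt(Sy2 * np.sum(x**2) / D)
print(f"a = {a:.3f} ± {Sa:.3f}, b = {b:.3f} ± {Sb:.3f}")
```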
For a straight line through the origin, y = ax, the sum of squared deviations is $S = \sum_i (y_i - a x_i)^2$. This S is only a function of one variable (a) and minimizing it is easy:

$$\frac{dS}{da} = \frac{d}{da} \sum_i (y_i - a x_i)^2 = \sum_i -2 x_i (y_i - a x_i) = 2a \sum_i x_i^2 - 2 \sum_i x_i y_i = 0 \;\Rightarrow\; a \sum_i x_i^2 = \sum_i x_i y_i, \qquad (5.27)$$
where $S_y$ is the standard deviation of the mean (so actually $S_{\bar{y}}$) when the measurement points yᵢ are averages of series of measurements. $S_y$ is given by

$$S_y^2 = \frac{1}{N-1} \sum_{i=1}^{N} \Delta y_i^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - a x_i)^2, \qquad (5.30)$$
when the points yi are separate measurements.
Again note that in this expression for $S_y^2$ there is a factor 1/(N − 1), whereas in the general case we had a factor 1/(N − 2).
Next, we define χ² (‘Chi squared’) as

$$\chi^2 = \sum_{i=1}^{N} \frac{(y_i - y(x_i))^2}{S_i^2}, \qquad (5.35)$$

which allows us to write the probability as

$$P(a, b) = \left[\prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\, S_i}\right] \exp\left(-\frac{1}{2}\chi^2\right). \qquad (5.36)$$
Note that the term $\prod_i \frac{1}{\sqrt{2\pi}\, S_i}$ doesn't depend on the straight line that we choose, since it only contains the uncertainties Sᵢ in the measurement points. In other words, this term is constant. The term $\exp\left(-\frac{1}{2}\chi^2\right)$ does depend on the straight line that we choose, i.e. on the choice of the parameters a and b. We now choose the straight line that has the largest total probability of being the right one, which boils down to maximizing P(a, b). However, when P(a, b) is maximal, χ² is minimal. So, maximizing P(a, b) can be done by minimizing χ². Note that this is similar to minimizing $S = \sum_i \Delta y_i^2$, as we did in section 5.3.1, but now with weighing factors 1/Sᵢ² (since $\chi^2 = \sum_i \Delta y_i^2 / S_i^2$).
Solving this goes analogously to what we did earlier, but the result looks a little more complicated:

$$a = \frac{\sum_i \frac{1}{S_i^2} \sum_i \frac{x_i y_i}{S_i^2} - \sum_i \frac{x_i}{S_i^2} \sum_i \frac{y_i}{S_i^2}}{D}, \qquad b = \frac{\sum_i \frac{x_i^2}{S_i^2} \sum_i \frac{y_i}{S_i^2} - \sum_i \frac{x_i}{S_i^2} \sum_i \frac{x_i y_i}{S_i^2}}{D}, \qquad (5.37)$$

where $D = \sum_i \frac{1}{S_i^2} \sum_i \frac{x_i^2}{S_i^2} - \left(\sum_i \frac{x_i}{S_i^2}\right)^2$. Note that when all uncertainties Sᵢ are equal, the
fractions in equation (5.37) can be divided out, which gives the ‘old’ result from section
5.3.1. The equations for the uncertainties Sa and Sb in a and b, respectively, also look
slightly different from what we found in section 5.3.1:
$$S_a = \sqrt{\frac{\sum_i \frac{1}{S_i^2}}{D}}, \qquad S_b = \sqrt{\frac{\sum_i \frac{x_i^2}{S_i^2}}{D}}, \qquad (5.38)$$

where again $D = \sum_i \frac{1}{S_i^2} \sum_i \frac{x_i^2}{S_i^2} - \left(\sum_i \frac{x_i}{S_i^2}\right)^2$. Here we can also find the old result by setting all uncertainties Sᵢ equal. This whole story only works when all uncertainties Sᵢ (so for each i) are known. That is only the case when each point yᵢ and its uncertainty $S_i = S_{\bar{y}_i}$ are determined from a measurement series. So, this method does not work when the points yᵢ are separate measurements.
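The weighted expressions (5.37) and (5.38) can be coded just as directly; below is a sketch with hypothetical points ȳᵢ whose uncertainties Sᵢ are known from measurement series:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.1, 5.8, 8.3])     # hypothetical averages ȳ_i
S = np.array([0.2, 0.1, 0.3, 0.2])     # their known uncertainties S_i
w = 1 / S**2                           # weighing factors 1/S_i²

D = np.sum(w) * np.sum(w * x**2) - np.sum(w * x)**2
a = (np.sum(w) * np.sum(w*x*y) - np.sum(w*x) * np.sum(w*y)) / D   # eq. (5.37)
b = (np.sum(w*x**2) * np.sum(w*y) - np.sum(w*x) * np.sum(w*x*y)) / D
Sa = np.sqrt(np.sum(w) / D)                                       # eq. (5.38)
Sb = np.sqrt(np.sum(w * x**2) / D)
print(f"a = {a:.3f} ± {Sa:.3f}, b = {b:.3f} ± {Sb:.3f}")
```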
5.4 Other relationships between the points (xi, yi)
In the last section we fitted a straight line through the measurement points (xi , yi ).
One can of course imagine that the relationship between the points xi and yi (or the
model) is not linear. In this case we should use other techniques. We will demonstrate
this in the coming sections.
For example, when the points (xi , yi ) follow the relationship (model)
$$y = a\, e^{bx}, \qquad (5.39)$$
it is not only possible to plot yi as a function of xi (which will not give us a straight
line), but also to plot the natural logarithm ln(yi ) as a function of xi . After all, the
equation above can be written as

$$\ln(y) = \ln(a) + b\, x. \qquad (5.40)$$

This does give us a straight line with slope b and offset ln(a). So when we plot ln(yᵢ)
as a function of xi , we can use the method from the last section and fit a straight line.
However, this does not instantly solve our problem. First of all, we have a dimension
problem. The measured quantity y has a dimension and in the equation above we just
took the logarithm of this. So, we should always be cautious and realize that we are
looking at a number and then the logarithm of this. The second problem is worse
though. The condition for using the ‘simple’ method of least squares from section 5.3.1
was that the uncertainties in y were equal for all yi . When we take the logarithm of yi ,
this will not be the case anymore. After all, according to §3.8 the uncertainty in ln(y)
is given by
$$S_{\ln(y)} = \frac{\partial \ln(y)}{\partial y}\, S_y = \frac{S_y}{y}, \qquad (5.41)$$
and this isn’t necessarily constant, since y can be different for each measurement point.
We can only use the method of least squares to obtain an indication of the best-fitting
straight line. From the equation above, it is obvious that, when Sy is constant, the
uncertainty Sln(y) is big for small values of y and small for big values of y. The points
(xi , yi ) that have a small Sln(yi ) then have much more influence on the position of the
best-fitting line than the points with big Sln(y) . If we want to do it the right way, we
should use the more extended equations from section 5.3.3.
The trick we used here to bring points that do not lie on a straight line onto a straight line can be applied quite often, so not only in the case of an exponential function. In all cases one should be cautious with dimensions, and the uncertainty in the y-values will generally no longer be constant.
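To make the changing uncertainty concrete: the sketch below (made-up data) fits ln(yᵢ) against xᵢ with the weighted formulas of section 5.3.3, using S_ln(y) = S_y/y from (5.41); the y-values are understood to be divided by a reference value of 1 in the same units before taking the logarithm:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([5.0, 3.1, 1.9, 1.1, 0.7])   # made-up data, y ≈ a·exp(bx)
Sy = 0.1                                   # equal uncertainty in all y_i

z = np.log(y)            # y divided by a reference value of 1 (same units)
Sz = Sy / y              # uncertainty in ln(y), eq. (5.41): not constant!
w = 1 / Sz**2

# Weighted straight-line fit of z = ln(a) + b·x, cf. eq. (5.37)
D = np.sum(w) * np.sum(w * x**2) - np.sum(w * x)**2
b = (np.sum(w) * np.sum(w*x*z) - np.sum(w*x) * np.sum(w*z)) / D
ln_a = (np.sum(w*x**2) * np.sum(w*z) - np.sum(w*x) * np.sum(w*x*z)) / D
print(f"b = {b:.3f}, a = {np.exp(ln_a):.3f}")
```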
5.4.2 The method of least squares for non-linear relationships
As an example we take the position of a falling object (in vacuum). Theoretically, the
position y can be described by
$$y = y_0 + v_0 t + \tfrac{1}{2} g t^2, \qquad (5.42)$$
where y0 is the initial position (at time t = 0), v0 the initial velocity and g the gravita-
tional acceleration. The measurement points (ti , yi ) obviously do not lie on a straight
line, but on a parabola. It is the term v0 t that puts a spanner in the works here. If this
term wasn't there, we could simply plot yᵢ as a function of tᵢ² to get a straight line. Try to figure out for yourself why this works better than plotting $\sqrt{y_i}$ as a function of tᵢ. So in
this case we should use another method.
We could try to modify the method of least squares for a parabolic curve, but a more
general case is that of an M th-degree polynomial
$$y = a_0 + a_1 x + a_2 x^2 + ... + a_M x^M. \qquad (5.43)$$
Our job is now to find the parameters a0 , ..aM such that the curve fits best to our data.
The square of the difference ∆yi between the measurement values yi and the polynomial
fit can be defined analogous to the linear case:
$$\Delta y_i^2 = \left(y_i - a_0 - a_1 x_i - a_2 x_i^2 - ... - a_M x_i^M\right)^2, \qquad (5.44)$$
and the sum S of these quadratic deviations is given by
$$S = \sum_i \Delta y_i^2 = \sum_i \left(y_i - a_0 - a_1 x_i - a_2 x_i^2 - ... - a_M x_i^M\right)^2. \qquad (5.45)$$
Setting the partial derivatives ∂S/∂aⱼ to zero for j = 0..M again yields a system of M + 1 linear equations in the coefficients a₀, .., a_M, which can be written in matrix notation. Check this yourself. Mathematics tells us that such a system is quite easily solved. We won't go into the details here.
Mathematics also tells us that many functions can be written as a polynomial (the
so-called series expansion). In principle the degree of such polynomials (M ) is infinite,
but the coefficients ai usually get negligibly small for large i. We can approximate these
functions with an M th-degree polynomial with a finite M . In that case we can use the
method from this section to calculate the coefficients ai that aren’t negligible.
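In practice such a polynomial fit amounts to solving that (M+1)×(M+1) linear system numerically, which standard libraries do directly. A sketch using numpy's polynomial least-squares fit on made-up free-fall-like data:

```python
import numpy as np

t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])      # times t_i (s)
y = np.array([1.0, 3.3, 7.1, 13.2, 21.5])    # made-up positions y_i (m)

# Fit y = a0 + a1·t + a2·t² (an M = 2 polynomial); np.polyfit returns
# the coefficients ordered from the highest power down.
a2, a1, a0 = np.polyfit(t, y, deg=2)
print(f"y0 ≈ {a0:.2f} m, v0 ≈ {a1:.2f} m/s, g ≈ {2*a2:.2f} m/s²")
```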
5.5 Exercises and problems
5.5.1 Questions
1. The charge of an electron is determined in three different ways (using 3 different
measurement series). The results are:
Q1 = (1.58 ± 0.04) · 10−19 C,
Q2 = (1.64 ± 0.09) · 10−19 C,
Q3 = (1.60 ± 0.03) · 10−19 C.
Combine these three results to a single end result and calculate the uncertainty
therein.
2. A vehicle passes 4 different positions xi with a constant speed v. The positions
are known to an accuracy of 0.1 m (68% confidence interval). To determine the
speed v of the vehicle, the time instants at which it passes the positions xi are
measured. These times ti are measured with an accuracy of 0.5 s (again, 68%
confidence interval). The measurement results are:
xi (m) 0 1000 2000 3000
ti (s) 12.4 37.2 60.9 84.9
(a) In order to determine the speed using the least-squares method, what would
you plot along the x-axis and what on the y-axis?
$$A \sum_i \{f(x_i)\}^2 + B \sum_i f(x_i)\, g(x_i) = \sum_i y_i f(x_i),$$

$$A \sum_i f(x_i)\, g(x_i) + B \sum_i \{g(x_i)\}^2 = \sum_i y_i g(x_i).$$
(a) How can a straight-line fit be used to determine I₀ and C? In other words, what should you plot against what to get a linear relation?
(b) What are the conditions under which the least-squares method (for which
the expressions are given in Appendix G of the lecture notes) are valid?
(c) Make a correct graph of the measurement results showing a linear relation,
according to the rules.
(d) Determine visually (from the graph) the unknowns I0 and C.
(e) Determine using the least-squares method the unknowns I0 and C and their
uncertainties.
(a) How would you plot the data in a graph to determine C from a linear rela-
tion? Should this line pass through the origin?
(b) Make a clear graph of the measurement results showing a linear relation,
according to the rules.
(c) What is the (inevitable) disadvantage of this way of plotting?
(d) From the graph, estimate C and its uncertainty (so not using the least-
squares method).
(e) Calculate C and its uncertainty using the least-squares method.
(f) Is the experimentally found value of C consistent with the value predicted
by Rydberg? Explain your answer.
(g) An alternative method is to calculate a value of C for each measurement
result (using C = n2 En ), and combining the 5 obtained values of C into one
final result. Do so and indicate how the uncertainties in the values of C are
calculated, and how the uncertainty of the end result is determined.
(h) In the previous question, which of the measured energies was most important
in determining the final answer? Can you also explain this using the graph
from question (b)?
[Figure: a measured voltage signal V (a few volts) as a function of time t from 0 to 50 ns.]
At first (until t ≃ 10 ns) the signal level is constant. From t ≃ 10 ns to t = 50 ns, theory predicts that the voltage will follow $V(t_i) = V_b - V_0 \exp\left(-\frac{t_i - t_0}{\tau}\right)$, where V₀ is the (theoretical) voltage at time t₀. Here, t₀ can be freely chosen in the mentioned
interval. From the measured voltages, τ should be determined. Each voltage Vi is
measured exactly once, hence the constant voltage Vb is determined from a large
amount of measurements. The time instants ti are very precisely determined,
hence the uncertainty in them is negligible. To avoid problems with dimensions,
we divide the above expression by a reference voltage Vref , so it becomes
$$\frac{V(t_i)}{V_{\mathrm{ref}}} = \frac{V_b}{V_{\mathrm{ref}}} - \frac{V_0}{V_{\mathrm{ref}}} \exp\left(-\frac{t_i - t_0}{\tau}\right).$$
As a reference voltage, we can for example take Vref = 1 V.
Somewhere between t = 0 ns and t = 10 ns, an interval is chosen to determine Vb .
Then, between t = 10 ns and t = 50 ns an interval is chosen in which the measured
voltages are manipulated such that a linear relation is obtained, from which the
decay time τ can be determined. t0 is chosen as a lower bound to this interval,
and tn is chosen as an upper bound (so it contains a total of n+1 measurement
points).
(a) What are the two unknowns (parameters to be fitted) in the second part (t
between 10 ns and 50 ns)?
(b) What do you plot against what to get a linear relation in the second part
(t between 10 ns and 50 ns)? How can the two unknowns from part (a) be
calculated from this plot (only indicate the method)?
(c) Why is it necessary that Vb be determined very accurately?
(d) In part (b), why can the upper bound tn of the time interval not be taken at
50 ns, and should tn in fact be significantly smaller? From the plot, estimate
how large tn may be.
(e) We know that the spread (standard deviation) SV of all measured voltages
Vi (so in both intervals) is constant; however, how large it is is unknown.
If we assume that the uncertainty SVb in Vb is sufficiently small when it is
smaller than 10% of the accuracy SV of the individual voltage measurements,
how many points are needed in determining Vb ?
(f) What is the accuracy $S_{y_i}$ of the y-quantity in the linear graph of exercise (b)? Express it in terms of $S_V$. Assume that $S_{V_b}$ is negligible.
(g) When fitting a straight line, an expression is minimised. Which expression
is minimized in this case?
(h) The expressions for a straight-line fit with unequal uncertainties are given in appendix G of the lecture notes. Is it necessary to know $S_V$ in order to use these formulas?
(a) Using the expression for the Poisson distribution, show that µ is indeed the
average amount of pulses for an infinite amount of measurements, so that
indeed ε ⟨n⟩ = µ.
Hint: $\sum_{n=0}^{\infty} \frac{\mu^n}{n!} = \exp(\mu)$.
The measurements are performed for different time intervals T . Each time interval
T of course has a different average µ (T ). Naturally, µ (T ) ∝ T . For each time
interval, N measurements are performed where N is relatively large. The average
n̄ of such a series of N measurements is (hopefully) a decent approximation of
the real average µ (T ) for that time interval. Because we are performing real
measurement series (at different time intervals), we can calculate their standard
deviations. For a Poisson distribution, we know that theoretically, the variance
var⟨n⟩ of the amount of measured pulses is equal to the average µ. To check this,
we compare the measured standard deviations in the different measurement series
(at different time intervals) to their theoretical values.
(b) What is the theoretical relation between the standard deviation σ and the
average µ?
The measurement results are given in the table below. Here, n̄ is the average
amount of pulses of the measurement series and S is the standard deviation as
determined from the series. All measurement series consisted of N = 30 measure-
ments. We call the separate measurements of each series nᵢ (i = 1..30).
Time interval T (s)    n̄      S
10                     71     8.4
20                    145    11.2
30                    217    13.3
40                    293    16.4
50                    357    19.5
(c) Using which formula were the S-values in the table determined?
(d) If the measured averages n̄ approximate µ(T), what is their theoretical uncertainty? Only give an expression. Express this uncertainty in terms of µ (≈ n̄). The answer cannot contain the separate measurements nᵢ.
(e) If we want to plot the measured standard deviations and the measured av-
erage amounts of pulses in some way to check the theoretical relation from
part (b) using the least-squares method, what will you plot against what?
Explain your answer.
96
(f) Should the straight line from the previous question pass through the origin
or not?
(g) What are the uncertainties in the quantities plotted along the x-axis and the
y-axis, respectively?
(h) Make a clear graph, following the rules.
(i) Using the least-squares method, determine the slope of the graph and its
uncertainty.
(j) Is the calculated slope consistent with theory?
[Figure: a tower of height H is viewed from a distance D under an angle α, measured at eye height H₀.]

Of course $\tan(\alpha) = \frac{H - H_0}{D}$ holds. The measurements are given in a table:
Di (m) αi (degree)
50.0 ± 0.5 63 ± 2
100.0 41
150.0 32
200.0 26
250.0 20
The uncertainties in all distances Di are equal and those in all measured angles
αi are also equal.
(a) If you want to determine the height H using a straight line fit, what do you
plot against what? Clearly indicate what is plotted along the x-axis and
what on the y-axis, and explain why.
(b) Which of the formulas in appendix G of the lecture notes is used to find the
best-fitting straight line? Explain why.
(c) How can this/these formula(s) be used to determine the height and its un-
certainty? Give, if necessary, the formula for the uncertainty $S_H$ in H.
(d) Make a correct graph showing a linear relation, according to the rules.
(e) Determine graphically (from the graph, without calculating), the height of
the tower and the uncertainty therein.
(f) Determine by calculation the height of the tower and the uncertainty therein.
A. Proof that $\int_{-\infty}^{+\infty} e^{-x^2}\, dx = \sqrt{\pi}$

We will now provide the proof for this. Since the function $e^{-x^2}$ has no primitive function,
we cannot directly calculate the integral, so we will have to use a trick. We will call the
integral I for now, so
$$I = \int_{-\infty}^{+\infty} e^{-x^2}\, dx. \qquad (A.2)$$
Squaring I and writing the square as a double integral gives

$$I^2 = \int_{-\infty}^{+\infty} e^{-x^2}\, dx \int_{-\infty}^{+\infty} e^{-y^2}\, dy = \iint e^{-(x^2 + y^2)}\, dx\, dy. \qquad (A.3)$$

Subsequently we will change to polar coordinates (r, φ) instead of using Cartesian coordinates (x, y). Note that x² + y² = r² and that dx dy = r dr dφ, so the expression becomes

$$I^2 = \int_0^{2\pi} d\varphi \int_0^{+\infty} e^{-r^2}\, r\, dr. \qquad (A.4)$$
Since the integral covers the whole xy-plane, we have to integrate r from 0 to ∞ and φ from 0 to 2π. The obtained integral we can evaluate analytically, since the function $r\, e^{-r^2}$ does have a primitive function ($-\tfrac{1}{2} e^{-r^2}$). The first part (the integral over φ) simply gives us a constant 2π. This means that

$$I^2 = 2\pi \left[-\tfrac{1}{2} e^{-r^2}\right]_0^{\infty} = 2\pi\left(0 + \tfrac{1}{2}\right) = \pi. \qquad (A.5)$$
This is the square of the integral that we were originally trying to calculate, so
$$\int_{-\infty}^{+\infty} e^{-x^2}\, dx = \sqrt{\pi}. \qquad (A.6)$$
B. Combining measurements of the same
quantity
In this appendix we will show that the equations (5.10) until (5.12) are valid in the gen-
eral case, which means that they are also valid when the acquired measurement results
are obtained with different measurement methods. We will consider N measurement
series (independent) with results x̄i ± Sx̄i . We will assume that the measurement results
of all series show a Gaussian spread (and according to §3.5.2 the means do as well). We
now want to combine these N results to one ‘average’ x̄. The probability pi (x̄) dx that
the true value lies within an interval dx around x̄ is for series i given by the probability
density function

$$p_i(\bar{x}) = \frac{1}{S_{\bar{x}_i}\sqrt{2\pi}} \exp\left(-\frac{(\bar{x} - \bar{x}_i)^2}{2 S_{\bar{x}_i}^2}\right). \qquad (B.1)$$
The total probability density p (x̄) of all measurement series combined, is the product
of all pi (x̄):
$$p(\bar{x}) = \prod_{i=1}^{N} p_i(\bar{x}) = \left(\prod_{i=1}^{N} \frac{1}{S_{\bar{x}_i}\sqrt{2\pi}}\right) \exp\left(-\sum_{i=1}^{N} \frac{(\bar{x} - \bar{x}_i)^2}{2 S_{\bar{x}_i}^2}\right). \qquad (B.2)$$
We should choose the mean x̄ in such a way that this probability density is optimal. So
the mean x̄ that we are looking for, has to give a maximum probability density p (x̄),
so
$$\frac{dp(\bar{x})}{d\bar{x}} = 0. \qquad (B.3)$$
Since p(x̄) is of the form C exp(f(x̄)), we can state that (chain rule)

$$\frac{df(\bar{x})}{d\bar{x}} = 0 \;\Rightarrow\; \sum_i \frac{\bar{x} - \bar{x}_i}{S_{\bar{x}_i}^2} = 0. \qquad (B.4)$$
and thus

$$S_{\bar{x}}^2 = \sum_i \frac{1}{\left[\sum_j \frac{1}{S_{\bar{x}_j}^2}\right]^2} \frac{1}{S_{\bar{x}_i}^2} = \frac{\sum_i \frac{1}{S_{\bar{x}_i}^2}}{\left[\sum_j \frac{1}{S_{\bar{x}_j}^2}\right]^2} = \frac{1}{\sum_i \frac{1}{S_{\bar{x}_i}^2}}. \qquad (B.8)$$
C. The uncertainty in the method of least
squares
In this appendix we will provide a derivation for the uncertainties Sa and Sb in the
coefficients a and b of the best-fitting straight line y = ax + b for a set of data points
(xi , yi ). We will do this for the general case of §5.3.3, in which all data points yi (in
theory) have different uncertainties Si = Sȳi . The coefficients a and b themselves are
given by equation (5.37)
The uncertainties we find with the help of the general calculation rule (3.43):

$$S_a^2 = \sum_i \left(\frac{\partial a}{\partial y_i}\right)^2 S_i^2, \qquad S_b^2 = \sum_i \left(\frac{\partial b}{\partial y_i}\right)^2 S_i^2. \qquad (C.1)$$
From equation (5.37) we can calculate the partial derivative ∂a/∂yᵢ:

$$\frac{\partial a}{\partial y_i} = \frac{1}{D}\left(\frac{x_i}{S_i^2} \sum_j \frac{1}{S_j^2} - \frac{1}{S_i^2} \sum_j \frac{x_j}{S_j^2}\right), \qquad (C.2)$$

where $D = \sum_i \frac{1}{S_i^2} \sum_i \frac{x_i^2}{S_i^2} - \left(\sum_i \frac{x_i}{S_i^2}\right)^2$. Inserting this in equation (C.1) gives
    S_a² = Σ_i (1/D²) [ (x_i/S_i²) Σ_j 1/S_j² − (1/S_i²) Σ_j x_j/S_j² ]² S_i²    (C.3)
         = (1/D²) [ (Σ_i 1/S_i²)² (Σ_i x_i²/S_i²) − (Σ_i x_i/S_i²)² (Σ_i 1/S_i²) ]
         = (1/D²) (Σ_i 1/S_i²) [ (Σ_i 1/S_i²)(Σ_i x_i²/S_i²) − (Σ_i x_i/S_i²)² ]
         = (Σ_i 1/S_i²) / D.
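These formulas translate directly into code. Below is a sketch assuming NumPy; the function and variable names are our own choice.

```python
import numpy as np

def weighted_line_fit(x, y, S):
    """Best-fitting straight line y = a*x + b for points with uncertainties S_i.

    Implements the weighted least-squares formulas of equation (5.37) and the
    uncertainties S_a^2 = (sum 1/S_i^2)/D and S_b^2 = (sum x_i^2/S_i^2)/D.
    """
    x, y, S = (np.asarray(v, float) for v in (x, y, S))
    w = 1.0 / S**2
    D = np.sum(w) * np.sum(w * x**2) - np.sum(w * x)**2
    a = (np.sum(w) * np.sum(w * x * y) - np.sum(w * x) * np.sum(w * y)) / D
    b = (np.sum(w * x**2) * np.sum(w * y) - np.sum(w * x) * np.sum(w * x * y)) / D
    Sa = np.sqrt(np.sum(w) / D)
    Sb = np.sqrt(np.sum(w * x**2) / D)
    return a, b, Sa, Sb
```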
D. Overview of formulas and distribution
functions
Below we provide an overview of the formulas that apply to continuous and discrete probability distributions and to measurements.
For each quantity the three cases are: continuous probabilities / discrete probabilities / measurements.

Measurement quantity
  continuous: x
  discrete: x_k (k = 1..M)
  measurements: measured values x_i (i = 1..N)

Frequency distribution (for discrete outcomes x_k)
  expected: F(x_k) = N P_k = N p_k Δx
  measured: F(x_k), k = 1..M

Frequency density
  expected: F(x_k) ≈ N p(x_k)
  measured: F(x_k) = M_k/Δx  (M_k = number of measurements in the interval Δx around x_k)

Probability density
  continuous: p(x)
  discrete: p_k = p(x_k)
  measurements: p_k = F(x_k)/N = (1/Δx)(M_k/N)

Probability
  continuous: P(a ≤ x ≤ b) = ∫_a^b p(x) dx, per interval P_k = p_k Δx, so P(a ≤ x ≤ b) = Σ_{k=k_a}^{k_b} P_k
  discrete: P_k = F(x_k)/N
  measurements: P_k = M_k/N = p_k Δx ≈ F(x_k) Δx/N

Expectation value
  continuous: ε⟨x⟩ = ∫_{−∞}^{∞} x p(x) dx,  ε⟨f(x)⟩ = ∫_{−∞}^{∞} f(x) p(x) dx
  discrete: ε⟨x⟩ = Σ_{k=1}^{M} x_k P_k,  ε⟨f(x)⟩ = Σ_{k=1}^{M} f(x_k) P_k
  measurements: ε⟨x⟩ = (1/N) Σ_{i=1}^{N} x_i,  ε⟨f(x)⟩ = (1/N) Σ_{i=1}^{N} f(x_i)

Mean
  continuous: x̄ = ε⟨x⟩ = ∫ x p(x) dx
  discrete: x̄ = ε⟨x⟩ = Σ_k x_k P_k
  measurements: x̄ = (1/N) Σ_i x_i ≈ Σ_{intervals k} x_k p_k Δx = Σ_{intervals k} x_k M_k/N

Variance
  continuous: σ² = var⟨x⟩ = ε⟨(x − x_t)²⟩ = ∫_{−∞}^{∞} (x − x_t)² p(x) dx
  discrete: σ² = var⟨x⟩ = Σ_{k=1}^{M} (x_k − x_t)² P_k
  measurements: S² = var⟨x⟩ = (1/N) Σ_{i=1}^{N} (x_i − x_t)² ≈ (1/(N−1)) Σ_{i=1}^{N} (x_i − x̄)²
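The 'measurements' column can be mirrored in code. Below is a minimal sketch assuming NumPy, with made-up Gaussian data; the bin count is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10.0, 0.5, size=1000)      # N measurement values x_i

counts, edges = np.histogram(x, bins=20)  # M_k per interval
dx = edges[1] - edges[0]
F = counts / dx                           # frequency density F(x_k) = M_k/dx
p = counts / (x.size * dx)                # density estimate p_k = M_k/(N dx)
print(p.sum() * dx)                       # ~1: the estimated density integrates to one
print(x.mean(), x.var(ddof=1))            # mean and variance as in the last rows
```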
E. Overview of distribution functions
In this appendix we will list the three distribution functions from chapter 4.
Gaussian distribution:
  probability density: p(x) = 1/(σ√(2π)) · exp(−(x − µ)²/(2σ²))
  mean: x̄ = µ;  expectation value: ε⟨x⟩ = x̄ = µ;  variance: var⟨x⟩ = σ²

Binomial distribution:
  probability: P_N(n) = (N choose n) pⁿ (1 − p)^{N−n}
  mean: n̄ = Np;  expectation value: ε⟨n⟩ = n̄ = Np;  variance: var⟨n⟩ = σ² = Np(1 − p)

Poisson distribution:
  probability: P(n) = (µⁿ/n!) e^{−µ}
  mean: n̄ = µ;  expectation value: ε⟨n⟩ = n̄ = µ;  variance: var⟨n⟩ = σ² = µ
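scipy.stats can confirm these means and variances; a minimal sketch, with arbitrary example parameters of our choosing.

```python
from scipy import stats

N, p = 100, 0.04
mu = N * p  # matching Poisson mean

print(stats.binom.stats(N, p))                  # mean Np = 4.0, variance Np(1-p) = 3.84
print(stats.poisson.stats(mu))                  # mean mu = 4.0, variance mu = 4.0
print(stats.norm.stats(loc=mu, scale=mu**0.5))  # mean mu, variance sigma^2 = mu
```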
F. χ² test of a distribution function

[Figure: a measured histogram h(x_k) over intervals around x_1, x_2, ..., x_k, with the assumed probability density p(x) drawn through it; the deviations h(x_k) − N P(x_k) are indicated.]
In the figure the assumed probability density function p(x) is plotted as well. To be
able to estimate the accuracy with which the fit p(x) describes the measurement results,
we will have to compare the results with the expected results. The probability P (xk )
to find a result in the interval around x_k in a single measurement is
    P(x_k) = ∫_{x_k−Δx/2}^{x_k+Δx/2} p(x) dx ≈ p(x_k) Δx    (F.2)
and the total number of measurements that we expect in the interval is N P (xk ). We
define, analogous to equation (5.35), again
    χ² = Σ_{k=1}^{M} (h(x_k) − N P(x_k))² / (S_k(h))²,    (F.3)
where (S_k(h))² = var⟨h(x_k)⟩ is the spread in the histogram values h(x_k). In other words, this is the spread in values h(x_k) that we would find if we were to measure the histogram a (large) number of times. Fortunately, we will not have to do this to be able to determine the size of S_k(h). The probability P(h_k) of finding exactly h_k measurements in the interval around x_k after N measurements is given by the binomial distribution

    P(h_k) = (N choose h_k) (P(x_k))^{h_k} (1 − P(x_k))^{N−h_k}.    (F.4)
From the binomial distribution we know that (see chapter 4 and appendix E) the mean
is given by
h̄k = N p = N P (xk ), (F.5)
and the variance by
    var⟨h_k⟩ = N p(1 − p) = N P(x_k)(1 − P(x_k)) = h̄_k (1 − h̄_k/N).    (F.6)
When the total number of measurements N becomes relatively large and the intervals are chosen such that P(x_k) ≪ 1, the factor (1 − h̄_k/N) is approximately 1, so that

    (S_k(h))² = var⟨h_k⟩ ≈ h̄_k = N P(x_k).

We can now calculate the value of χ², since the values h(x_k) are measured and P(x_k) is known according to equation (F.2). Note that in this equation we assume a continuous probability distribution; however, in the case of a discrete distribution we also know P(x_k) and can calculate χ². The value of χ² is used as a measure of how good the agreement between data and distribution function is. If the measured frequencies
(histogram values) h(xk ) exactly match the predicted values N P (xk ), we would find
χ2 = 0. However, this is very improbable, since (as stated earlier) there will definitely
be a spread in the values of h(xk ). Large values of χ2 correspond to large deviations
from the assumed distribution function.
To calculate the probability of finding a certain value for χ², we use the so-called reduced chi-squared χ²_ν. This is defined as

    χ²_ν = χ²/ν,    (F.9)

where ν is the so-called number of degrees of freedom. This number of degrees of freedom ν is the number of x_k values M minus the number of parameters that we have to calculate from the measurement data to describe the probability distribution. For example, for the Gaussian distribution we have to calculate the mean and the standard deviation, so in that case ν = M − 2. For the Poisson distribution we only need the mean (the standard deviation is then automatically known, since σ = √n̄), so ν = M − 1.
The expectation value for χ²_ν is

    ε⟨χ²_ν⟩ = 1.    (F.10)
When χ²_ν is determined from the measurement data, we can calculate the probability of finding a value this large or larger, and thus determine the probability that the given distribution function is correct. It is outside the scope of these lecture notes to go into detail on this topic. The book by Bevington and Robinson, for example, lists a table giving these probabilities.
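The whole recipe fits in a few lines of code. Below is a sketch for a Poisson hypothesis, assuming NumPy and SciPy; the data are simulated, and the cut on expected counts > 5 is a common practical choice, not something the notes prescribe.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.poisson(4.0, size=500)        # simulated counting experiment

h = np.bincount(data)                    # histogram values h(x_k) for k = 0, 1, ...
k = np.arange(h.size)
N = data.size

mu = data.mean()                         # the one parameter estimated from the data
expected = N * stats.poisson.pmf(k, mu)  # N P(x_k)

mask = expected > 5                      # keep intervals with enough expected counts
chi2 = np.sum((h[mask] - expected[mask])**2 / expected[mask])
nu = mask.sum() - 1                      # degrees of freedom: M - 1 for a Poisson fit
print(chi2 / nu)                         # reduced chi-squared, should be ~1
```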
G. Formulas for least squares fits
For each fitting function, the fitting parameters and their uncertainties are:

1. y = ax + b, with S_{y_i} = S_y = const.:

   a = ( N Σx_i y_i − Σx_i Σy_i ) / ( N Σx_i² − (Σx_i)² )
   b = ( Σy_i Σx_i² − Σx_i Σx_i y_i ) / ( N Σx_i² − (Σx_i)² )
   S_a = √( N S_y² / ( N Σx_i² − (Σx_i)² ) )
   S_b = √( S_y² Σx_i² / ( N Σx_i² − (Σx_i)² ) )

2. y = ax, with S_{y_i} = S_y = const.:

   a = Σx_i y_i / Σx_i²
   S_a = S_y / √(Σx_i²)

3. y = kx + b, with k = const. and S_{y_i} = S_y = const.:

   b = ( Σy_i − k Σx_i ) / N
   S_b = S_y / √N

4. y = a₁x + a₂x², with S_{y_i} = S_y = const.:

   a₁ = ( Σx_i⁴ Σx_i y_i − Σx_i³ Σx_i² y_i ) / ( Σx_i² Σx_i⁴ − (Σx_i³)² )
   a₂ = ( Σx_i² Σx_i² y_i − Σx_i³ Σx_i y_i ) / ( Σx_i² Σx_i⁴ − (Σx_i³)² )
   S_{a₁} = S_y √( Σx_i⁴ / ( Σx_i² Σx_i⁴ − (Σx_i³)² ) )
   S_{a₂} = S_y √( Σx_i² / ( Σx_i² Σx_i⁴ − (Σx_i³)² ) )

5. y = ax + b, with S_{y_i} = S_i ≠ const.:

   a = ( Σ(1/S_i²) Σ(x_i y_i/S_i²) − Σ(x_i/S_i²) Σ(y_i/S_i²) ) / D
   b = ( Σ(x_i²/S_i²) Σ(y_i/S_i²) − Σ(x_i/S_i²) Σ(x_i y_i/S_i²) ) / D
   S_a = √( Σ(1/S_i²) / D )
   S_b = √( Σ(x_i²/S_i²) / D )

   where D = Σ(1/S_i²) Σ(x_i²/S_i²) − ( Σ(x_i/S_i²) )²

6. y = ax, with S_{y_i} = S_i ≠ const.:

   a = Σ(x_i y_i/S_i²) / Σ(x_i²/S_i²)
   S_a = 1 / √( Σ(x_i²/S_i²) )

7. y = kx + b, with k = const. and S_{y_i} = S_i ≠ const.:

   b = ( Σ(y_i/S_i²) − k Σ(x_i/S_i²) ) / Σ(1/S_i²)
   S_b = 1 / √( Σ(1/S_i²) )

8. y = a₁x + a₂x², with S_{y_i} = S_i ≠ const.:

   a₁ = ( Σ(x_i⁴/S_i²) Σ(x_i y_i/S_i²) − Σ(x_i³/S_i²) Σ(x_i² y_i/S_i²) ) / D′
   a₂ = ( Σ(x_i²/S_i²) Σ(x_i² y_i/S_i²) − Σ(x_i³/S_i²) Σ(x_i y_i/S_i²) ) / D′
   S_{a₁} = √( Σ(x_i⁴/S_i²) / D′ )
   S_{a₂} = √( Σ(x_i²/S_i²) / D′ )

   where D′ = Σ(x_i²/S_i²) Σ(x_i⁴/S_i²) − ( Σ(x_i³/S_i²) )²
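As an illustration of one entry of the table, below is a sketch of case 4 (y = a₁x + a₂x² with constant S_y), assuming NumPy; the function name is our own choice.

```python
import numpy as np

def fit_quadratic_through_origin(x, y, Sy):
    """Least-squares fit of y = a1*x + a2*x**2 with constant uncertainty Sy."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    D = np.sum(x**2) * np.sum(x**4) - np.sum(x**3)**2
    a1 = (np.sum(x**4) * np.sum(x * y) - np.sum(x**3) * np.sum(x**2 * y)) / D
    a2 = (np.sum(x**2) * np.sum(x**2 * y) - np.sum(x**3) * np.sum(x * y)) / D
    Sa1 = Sy * np.sqrt(np.sum(x**4) / D)
    Sa2 = Sy * np.sqrt(np.sum(x**2) / D)
    return a1, a2, Sa1, Sa2
```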
H. Answers to Questions
Question 2.4.1.1

(a) ∂/∂x (x⁴y²) = 4x³y² ;  ∂/∂y (x⁴y²) = 2x⁴y
(b) ∂/∂x (x²/(x² + y)) = 2xy/(x² + y)² ;  ∂/∂y (x²/(x² + y)) = −x²/(x² + y)²
(c) ∂/∂x ((4x² + 2y)/(3x − y)) = (12x² − 8xy − 6y)/(3x − y)² ;  ∂/∂y ((4x² + 2y)/(3x − y)) = (4x² + 6x)/(3x − y)²
(d) ∂/∂x (sin x/sin y) = cos x/sin y ;  ∂/∂y (sin x/sin y) = −sin x cos y/sin²y
(e) ∂/∂x (x sin(xy)) = sin(xy) + xy cos(xy) ;  ∂/∂y (x sin(xy)) = x² cos(xy)
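These derivatives are easy to double-check symbolically; a minimal sketch, assuming SymPy is installed.

```python
import sympy as sp

x, y = sp.symbols('x y')
functions = [x**4 * y**2,
             x**2 / (x**2 + y),
             (4*x**2 + 2*y) / (3*x - y),
             sp.sin(x) / sp.sin(y),
             x * sp.sin(x*y)]

for f in functions:
    # Print both partial derivatives of each function from (a)-(e).
    print(sp.simplify(sp.diff(f, x)), '|', sp.simplify(sp.diff(f, y)))
```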
Question 2.4.1.3

(i) p_min = 171, p_max = 231, which is consistent with Δp/p = 0.15 and Δp = 30.
Question 2.4.1.5

Δg/g = 0.055, i.e. Δg = 0.055 g.
Question 2.4.1.7

Δλ/λ = (cos θ/sin θ) Δθ. With Δθ = 1′ = 2.91 · 10⁻⁴ rad this results in Δλ/λ = 1.04 · 10⁻³.
Question 2.4.1.9
sin x − ∆x · cos x and sin x + ∆x · cos x
Question 3.12.1.1
(b) 38,40
(c) Median = 38
(f) 34,43
(g) 30,47
Question 3.12.1.3

S = √( (1/(N−1)) Σ_{i=1}^{N} (V_i − V̄)² )

The standard deviation does NOT depend on N.
Question 3.12.1.5
g ± sg = (9.80 ± 0.04) m/s2
Question 3.12.1.7
k̄ ± sk̄ = (13.16 ± 0.06) N/m
Question 4.7.1.1

p(x) = 1/(σ√(2π)) · exp(−(x − x̄)²/(2σ²))

It is immediately clear that the maximum is

p(x̄) = 1/(σ√(2π)).

Thus if we solve p(x) = ½ p(x̄) for x:

exp(−(x − x̄)²/(2σ²)) = ½

This gives (x − x̄)²/(2σ²) = ln 2 → (x − x̄)² = 2σ² ln 2 → x_{1,2} = x̄ ± σ√(2 ln 2).

Therefore the FWHM = x₂ − x₁ = 2σ√(2 ln 2) ≈ 2.35σ. Q.E.D.
Question 4.7.1.3

P(x̄ − σ < x < x̄ + σ) = P(−1 < z < 1) = P(z ≥ −1) − P(z ≥ 1) = 1 − 2 × P(z ≥ 1) = 1 − 2 × 0.159 = 0.682
Question 5.5.1.1
x̄ ± Sx̄ = (1.60 ± 0.02) · 10−19 C
Question 5.5.1.3
Partly done during lecture, see slides for explanation
Question 5.5.1.5
τ = 2.025 s⁻¹
I. Answers to Problems
Problem 2.4.2.1

(a) Method 1: 1/R = 1/R₁ + 1/R₂

    R₁ = 100 Ω ± 5% = (100 ± 5) Ω ⇒ 1/R₁ = (0.01000 ± 0.00050) Ω⁻¹,
    R₂ = 270 Ω ± 5% = (270 ± 14) Ω ⇒ 1/R₂ = (0.00370 ± 0.00019) Ω⁻¹,

    since the relative uncertainties in R_i and 1/R_i are equal.
    The absolute uncertainties can now be added:

    1/R = (0.01370 ± 0.00069) Ω⁻¹ ⇒ R = (73 ± 4) Ω

    This last step is valid because the relative uncertainties in R and in 1/R are equal.

    Method 2: R = R₁R₂/(R₁ + R₂)

    ∂R/∂R₁ = (R₂(R₁ + R₂) − R₁R₂)/(R₁ + R₂)² = R₂²/(R₁ + R₂)²
    ∂R/∂R₂ = R₁²/(R₁ + R₂)²

    ⇒ ΔR = |∂R/∂R₁| ΔR₁ + |∂R/∂R₂| ΔR₂
          = R₂²/(R₁ + R₂)² · ΔR₁ + R₁²/(R₁ + R₂)² · ΔR₂ = 2.663 Ω + 1.023 Ω = 3.686 Ω
(b) The largest contribution to the uncertainty comes from R1 : S1/R1 = 0.0005 Ω−1 vs.
S1/R2 = 0.0002 Ω−1 with method 1 or 2.663 Ω vs. 1.023 Ω.
Thus replacing R1 is the most sensible.
Problem 2.4.2.2

(a) First method: Δα = 2Δα₊.
    Second method: Δα = ½ · 2Δα₊ = Δα₊.
    This method is therefore more precise and results in a more accurate value for λ.
(b) nλ = d sin α ⇒ λ = (d/n) sin((α₊ − α₋)/2).

    The uncertainty Δλ now follows from:

    Δλ = |∂λ/∂α₊| Δα₊ + |∂λ/∂α₋| Δα₋ = (|∂λ/∂α₊| + |∂λ/∂α₋|) Δα₊   (because Δα₊ = Δα₋)

    ⇒ Δλ = (d/n) cos((α₊ − α₋)/2) Δα₊ = (d/n) |cos α| Δα₊ = (d/n) √(1 − sin²α) Δα₊
         = (d/n) √(1 − (nλ/d)²) Δα₊

    Alternatively:

    λ = (d/n) sin α
    ⇒ Δλ = |∂λ/∂α| Δα = (d/n) |cos α| Δα = (d/n) |cos α| Δα₊   (because according to (a), Δα = Δα₊)

    Continued as above.
(d) In the formula nλ = d sin α, the value of sin α of course has to remain smaller than 1 (for sin α = 1 the angle is π/2, which is the extreme). Thus

    sin α = nλ/d ≤ 1 ⇒ n ≤ d/λ = 2190/600 = 3.65

    Therefore n = 3 is the maximum.
(e) Δλ = Δα₊ √((d/n)² − λ²) is smallest for n as big as possible, hence for n = 3.

(f) Δλ = Δα₊ √((d/n)² − λ²) = (1/60)(π/180) √((2190/3)² − 600²) = 0.12 nm.

(g) λ = (d/n) sin((α₊ − α₋)/2) = (2190/2) sin(32.567°) = 589.417 nm (not rounded off yet).

    Δλ = (d/n) √(1 − sin²α) Δα₊ = (2190/2) √(1 − sin²(32.567°)) · (1/60)(π/180) = 0.27 nm.
Hence the final answer is λ = (589.4 ± 0.3) nm.
Problem 2.4.2.3
(a) (i) The uncertainties ∆xi have to be very small (with respect to xi ).
(ii) The uncertainties ∆xi have to be independent of each other.
(b) (i) In the derivation, the slope of the function f was taken as constant around xi ,
thus in the one dimensional case, df /dxi = ∆f /∆xi .
Example: sin α with ∆α very large, then ∆sin α > 1 is possible.
(ii) It is presumed that the true values of the measured x_i can lie anywhere within their intervals, so that the lower and upper bounds of the derived quantity can be reached by assuming the appropriate lower and upper bounds of the x_i (or possibly another value within the interval). This is not (always) possible for uncertainties that are not independent.
     Counterexample: y = x₁/x₂ with x₂ = 1 − x₁. If the actual value of x₁ is larger than the measured value, then the actual value of x₂ must be smaller than the measured value.
(c) The first condition is no longer necessary since for summation it holds that ∆y =
∆x1 + ∆x2 , provided that the uncertainties are independent. This is because y =
f (x1 , x2 ) = x1 + x2 is a linear function. The slope of f (x1 , x2 ) is therefore not only
constant in a small region near the point (x1 , x2 ), but everywhere. The uncertainties
∆x1 and ∆x2 would still have to be independent.
Problem 2.4.2.4
(a)
V = I0 R = I1 (R + R1 )
I1
⇒ R (I0 − I1 ) = I1 R1 ⇒ R = R1
I0 − I1
(b)
∂R ∂R ∂R ∂R
∆R = ∆ I0 + ∆ I1 = + ∆I
∂I0 ∂I1 ∂I0 ∂I1
I1 R1 I0 R1
= + ∆I
(I0 − I1 )2 (I0 − I1 )2
I0 + I1
⇒ ∆R = R1 ∆I
(I0 − I1 )2
(c) I₀ and I₁ are entirely determined by V, R and R₁, thus ΔR can also be written as

    ΔR = (R/V) R₁⁻¹ (R + R₁)(2R + R₁) ΔI

    ⇒ ΔR = (R/V) ( 2R²/R₁ + 3R + R₁ ) ΔI

    This expression contains 3 terms: a term ∝ 1/R₁, a term that does not depend on R₁, and a term ∝ R₁. For small values of R₁ the first term dominates (ΔR → ∞) and for large values of R₁ the third term dominates (ΔR → ∞). So somewhere in between there has to be a minimum.
    ΔR is minimal when dΔR/dR₁ = 0:

    dΔR/dR₁ ∝ −R₁⁻²(R + R₁)(2R + R₁) + R₁⁻¹(2R + R₁) + R₁⁻¹(R + R₁) = 0

    ⇒ R₁² − 2R² = 0 ⇒ R₁ = R√2
Problem 2.4.2.5

(a) i = p_i − p_l ⇒ Δi = Δp_i + Δp_l
    o = p_l − p_o ⇒ Δo = Δp_l + Δp_o

    i = (26.40 ± 0.25) cm, rounded: i = (26.4 ± 0.3) cm
    o = (34.05 ± 0.10) cm, rounded: o = (34.1 ± 0.1) cm
(b) These values cannot be used since the calculated uncertainties in i and o are dependent
(both depend on ∆pl ).
(c) Since the uncertainties are dependent, we have to use the general rule for 100% intervals, starting from the uncertainties in the measured positions. From the thin lens formula it follows that

    f = io/(i + o) = (p_i − p_l)(p_l − p_o)/(p_i − p_o)

    The uncertainty Δf in f follows from

    Δf = |∂f/∂p_o| Δp_o + |∂f/∂p_l| Δp_l + |∂f/∂p_i| Δp_i
       = (p_i − p_l)²/(p_i − p_o)² · Δp_o + |p_i − 2p_l + p_o|/(p_i − p_o) · Δp_l + (p_l − p_o)²/(p_i − p_o)² · Δp_i
       = i²/(i + o)² · Δp_o + |i − o|/(i + o) · Δp_l + o²/(i + o)² · Δp_i
       = ( i² Δp_o + o² Δp_i + |(i − o)(i + o)| Δp_l ) / (i + o)²
       = ( i² Δp_o + o² Δp_i + |i² − o²| Δp_l ) / (i + o)²

    Substituting everything gives f = 14.870 cm and Δf = 0.079 cm, and thus the final answer is:

    f = (14.87 ± 0.08) cm
Problem 3.12.2.1

(a) From

    ∫_{−∞}^{+∞} p(x) dx = 1

    it follows that

    ∫_{−∞}^{+∞} p(x) dx = 2a · (b/2) + 2ab = 3ab = 1  ⇒  b = 1/(3a)
(b) x_w = ε⟨x⟩ = ∫_{−∞}^{+∞} x p(x) dx
         = ∫_{−∞}^{0} x p(x) dx + ∫_{0}^{+∞} x p(x) dx

    Substituting x → −x in the first integral:

         = −∫_{0}^{∞} x p(−x) dx + ∫_{0}^{+∞} x p(x) dx
         = ∫_{0}^{+∞} x [p(x) − p(−x)] dx

    For a symmetric probability density p(x) = p(−x), so the [p(x) − p(−x)] term in this integral is 0.
    Conclusion: ε⟨x⟩ = 0 for a symmetric probability density.
(c) var⟨x⟩ = ∫_{−∞}^{+∞} (x − x_w)² p(x) dx = ∫_{−∞}^{+∞} x² p(x) dx
          = 2 ∫_{0}^{a} b x² dx + 2 ∫_{a}^{2a} (b/2) x² dx
          = 2b [x³/3]₀^a + b [x³/3]_a^{2a}
          = (2/3) a³ b + (1/3) b (8a³ − a³) = 3a³b = a²

    Hence var⟨x⟩ = a², and from this σ = √(var⟨x⟩) = a.
Problem 3.12.2.2

p(x) = 1/(σ√(2π)) · exp(−(x − x̄)²/(2σ²))

It is immediately clear that the maximum is

p(x̄) = 1/(σ√(2π))

Thus if we solve p(x) = ½ p(x̄) for x:

exp(−(x − x̄)²/(2σ²)) = ½
⇒ exp((x − x̄)²/(2σ²)) = 2
⇒ (x − x̄)²/(2σ²) = ln 2
⇒ (x − x̄)² = 2σ² ln 2
⇒ x_{1,2} = x̄ ± σ√(2 ln 2)

Q.e.d.
Problem 3.12.2.3

We know that var⟨x⟩ = σ². From the figure we estimate

FWHM ≈ 110 m/s ≈ 2.35 σ

Therefore σ ≈ 47 m/s and var⟨x⟩ ≈ 2.2 · 10³ m²/s².
Problem 3.12.2.4

The value he reports for the diameter is the average:

d̄ = (1/10) Σ d_i = 4.9999 mm

For the uncertainty he reports the standard deviation (multiplied by two), and not that of the average: because he reports the uncertainty in the diameter of a single ball, he has to report the uncertainty in one single measurement, i.e. the standard deviation of the separate measurements. This is determined using

S_d = √( (1/(N−1)) Σ (d_i − d̄)² ) = √( (1/9) Σ (d_i − d̄)² ) = 0.0051 mm

d = (5.00 ± 0.01) mm
Problem 3.12.2.5

First method: use

x̄ = (1/N) Σ x_i   and   S_x̄ = √( (1/(N(N−1))) Σ (x_i − x̄)² )

for both the current and the voltage. We then find

R ± S_R = (9.93 ± 0.04) Ω

Second method: calculate R_i = V_i/I_i for each individual measurement pair and determine the mean and its uncertainty from those values. The difference in the uncertainty with respect to the first method is clear. This is because the uncertainties in V and in I are not independent of each other: apparently there are fluctuations present in the current which lead to fluctuations in the measured potential. The first method therefore is not the correct one, because it assumes that S_I and S_V are independent when calculating the uncertainty in R.
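The point can be made concrete with simulated data in which the current fluctuates and the voltage follows it. A sketch assuming NumPy; all numbers here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
I = 1.0 + 0.02 * rng.standard_normal(10)         # fluctuating current (A), hypothetical
V = 9.93 * I + 0.005 * rng.standard_normal(10)   # voltage follows the same fluctuations

# Correct approach for dependent fluctuations: one R per (V, I) pair, then average.
R = V / I
print(R.mean(), R.std(ddof=1) / np.sqrt(R.size))

# Naive approach (wrong here): propagate S_V and S_I as if they were independent.
Rn = V.mean() / I.mean()
S_naive = Rn * np.sqrt((V.std(ddof=1) / np.sqrt(10) / V.mean())**2
                       + (I.std(ddof=1) / np.sqrt(10) / I.mean())**2)
print(Rn, S_naive)   # overestimates the uncertainty in R
```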
Problem 3.12.2.6

The uncertainty of the mean is

S_m = S/√N  ⇒  N = (S/S_m)² = (0.5/0.1)² = 25
Problem 3.12.2.7

The measured uncertainty is S_I = 0.2 µA. From this follows

S_R/R = S_I/I = 0.2/2.3  ⇒  S_R = (0.2/2.3) R

This is 0.2/(2.3 × 0.05) = 1.74 times less accurate than required, hence 1.74² ≈ 3.0 times as many measurements are needed: 30 measurements. Therefore 20 extra measurements are needed.
Problem 3.12.2.8

The general formula:

S_y = √( Σ_{i=1}^{n} (∂y/∂x_i)² S_{x_i}² )

Denoting the 95%-intervals D_{x_i} = 2S_{x_i} and D_y = 2S_y, the 95%-uncertainty in the final answer then is

D_y = 2S_y = 2 √( Σ (∂y/∂x_i)² S_{x_i}² )
    = √( 4 Σ (∂y/∂x_i)² S_{x_i}² )
    = √( Σ (∂y/∂x_i)² · 2² S_{x_i}² )
    = √( Σ (∂y/∂x_i)² (2S_{x_i})² )
    = √( Σ (∂y/∂x_i)² D_{x_i}² )

and thus D_y = √( Σ_{i=1}^{n} (∂y/∂x_i)² D_{x_i}² ) is the same rule as for 68%-intervals.
Problem 3.12.2.9

(a) Method 1: 1/R = 1/R₁ + 1/R₂

    R₁ = 100 Ω ± 5% = (100 ± 5) Ω ⇒ 1/R₁ = (0.01000 ± 0.00050) Ω⁻¹
    R₂ = 270 Ω ± 5% = (270 ± 14) Ω ⇒ 1/R₂ = (0.00370 ± 0.00019) Ω⁻¹

    because the relative uncertainties in R_i and 1/R_i are equal.
    We can now take the quadratic sum of the absolute uncertainties:

    1/R = (0.01370 ± 0.00053) Ω⁻¹ ⇒ R = (73 ± 3) Ω

    The last step is valid because the relative uncertainties in R and in 1/R are equal.

    Method 2: R = R₁R₂/(R₁ + R₂)

    ∂R/∂R₁ = (R₂(R₁ + R₂) − R₁R₂)/(R₁ + R₂)² = R₂²/(R₁ + R₂)²
    ∂R/∂R₂ = R₁²/(R₁ + R₂)²

    ⇒ S_R² = (∂R/∂R₁)² S_{R₁}² + (∂R/∂R₂)² S_{R₂}²
           = ( R₂²/(R₁ + R₂)² )² S_{R₁}² + ( R₁²/(R₁ + R₂)² )² S_{R₂}²
           = 7.089 Ω² + 0.972 Ω² = 8.061 Ω² ⇒ S_R = 2.84 Ω
(b) The biggest contribution to the uncertainty is accounted for by R1 : S1/R1 = 0.0005 Ω−1
vs. S1/R2 = 0.0002 Ω−1 in method 1 or 7.089 Ω2 vs. 0.972 Ω2 .
Hence replacing R1 would be the most sensible.
Problem 4.7.2.1

The average of the first 10 measurements:

l̄ = (1/10) Σ_i l_i = 50.369 mm
(a) Comparing both equations for p(x) simply gives (equating powers):

    a − 1 = n ⇒ a = n + 1    and    b − 1 = N − n ⇒ b = N + 1 − n

(b) dp(x)/dx = 0
    ⇒ K n x^{n−1} (1 − x)^{N−n} − K x^n (N − n)(1 − x)^{N−n−1} = 0
    ⇒ n(1 − x) − (N − n)x = 0
    ⇒ n − Nx = 0 ⇒ x = n/N
    Q.e.d.

(c) ε⟨x⟩ = a/(a + b) = (n + 1)/(n + 1 + N + 1 − n) = (n + 1)/(N + 2)

(d) It is not equal to the x where p(x) is maximal, because the distribution function is not symmetric (see figure). For example, in the left figure the probability that x_w is greater than n/N is more than 50%.
(e) var⟨x⟩ = ab / ( (a + b)² (a + b + 1) )
           = (n + 1)(N + 1 − n) / ( (N + 2)² (N + 3) )
           = (1/(N + 3)) · ((n + 1)/(N + 2)) · ((N + 2 − (n + 1))/(N + 2))
           = ε⟨x⟩ (1 − ε⟨x⟩) / (N + 3).

    Now the uncertainty (= σ) has to be smaller than the deviation with respect to 50%. Hence

    √( 0.501 × 0.499 / (N + 3) ) < 0.001
    ⇒ N + 3 > 0.25 × 10⁶
    ⇒ N > 250 000

    If one demands 2σ < 0.001:

    2 √( 0.501 × 0.499 / (N + 3) ) < 0.001
    ⇒ N + 3 > 4 × 0.25 × 10⁶
    ⇒ N > 1 000 000
    We can also consider this problem from the binomial distribution perspective, with the same result. This goes as follows.
    For the actual fraction x_w for candidate A, the probability P(n) of n votes for him out of a sample of N votes in total is equal to

    P(n) = (N choose n) x_w^n (1 − x_w)^{N−n}    (binomial distribution),

    with

    ε⟨n⟩ = N x_w

    (also known from the binomial distribution). The result will be unequivocal (within ca. 68%) if the expected number of votes N x_w deviates more than σ from ½N (50% for both candidates). Thus if

    |N x_w − ½N| > √( N x_w (1 − x_w) )
    ⇒ 0.001 √N > √(0.501 × 0.499) ≈ 0.5
    ⇒ N > 250 000

    For the 2σ criterion:

    |N x_w − ½N| > 2σ = 2 √( N x_w (1 − x_w) )
    ⇒ 0.001 √N > 2 √(0.501 × 0.499) ≈ 1.0
    ⇒ N > 1 000 000
Problem 4.7.2.3

(a) The probability that the player has to stop after the n-th card is equal to the probability that the player first draws n − 1 cards and subsequently draws a non-card. The probability of drawing n − 1 cards is (4/13)^{n−1} and the probability of drawing a non-card is 9/13. The total probability is the product of both probabilities, thus

    P(n) = (9/13) (4/13)^{n−1} = (9/4) (4/13)^n.
(b) The profit of a player is given by the 'profit function'

    W(n) = n² − 2n

    where n is the number of cards drawn. The n²-term is the payment and the 2n-term is the stake (2 chips per drawn card). The average profit per participant is the expectation value of this profit function:

    ε⟨W(n)⟩ = Σ_{n=1}^{∞} W(n) P(n) = (9/4) Σ_{n=1}^{∞} (n² − 2n) (4/13)^n
            = (9/4) Σ_{n=0}^{∞} n² (4/13)^n − (18/4) Σ_{n=0}^{∞} n (4/13)^n

    The first term is calculated with the aid of the last series in the appendix with r = 4/13, and the second term with the second-to-last series. We then find

    ε⟨W(n)⟩ = (9/4) · (17/13)/(9/13)³ − (18/4) · (4/13)/(9/13)² = 5.98

    The expected profit per participant (and thus the loss for the casino) is 5.98 chips.
(c) The required stake per card is m chips. The expectation value of the profit per player therefore is

    ε⟨W(n)⟩ = (9/4) Σ_{n=0}^{∞} n² (4/13)^n − (9m/4) Σ_{n=0}^{∞} n (4/13)^n = (13/9)² ( 17/4 − 9m/13 )

    The casino will be making a profit if

    17/4 − 9m/13 < 0 ⇒ m > 17 × 13/(9 × 4) = 6.13

    Thus the casino will be making a profit when the stake is 7 chips or more.
(d) Players therefore have to stake 7 chips per card. The profit function then becomes

    W(n) = n² − 7n

    Only for n ≥ 8 is W(n) > 0, so only then does a player make a profit. The probability that he draws 8 or more cards is

    P(n ≥ 8) = Σ_{n=8}^{∞} P(n) = (9/4) Σ_{n=8}^{∞} (4/13)^n = (9/4) (4/13)⁸ Σ_{n=0}^{∞} (4/13)^n

    With the aid of the fifth series we then find

    P(n ≥ 8) = (9/4) (4/13)⁸ · 1/(9/13) = (4/13)⁷ = 0.00026.

    The probability therefore is extremely small.
Problem 4.7.2.4

(a) After 7 throws the game is certainly over, since the longest possible rising sequence consists of the throws 1-2-3-4-5-6; the 7th throw is then always smaller.

(b) For the game to last 7 throws, the sequence of part (a) has to be thrown. The probability of each throw is 1/6, and thus the total probability is

    P(7) = (1/6)⁶ = 1/46656.
(c) The total probabilities must add up to 1, hence

    Σ_{n=1}^{7} P(n) = (27216 + 15120 + 3780 + 35 + 1)/46656 + P(5) = 1

    ⇒ P(5) = 504/46656 = 7/648 = 0.0108.
(d) The average number of throws per game is

    n̄ = Σ_{n=1}^{7} n P(n) = (2 × 27216 + 3 × 15120 + 4 × 3780 + 5 × 504 + 6 × 35 + 7 × 1)/46656
      = 117649/46656 = 2.522

    The standard deviation is calculated using

    σ² = Σ_{n=1}^{7} (n − n̄)² P(n) = 0.486  ⇒  σ = 0.697
(e) The probability that the first player wins is

    P₁ = P(2) + P(4) + P(6) = (27216 + 3780 + 35)/46656 = 31031/46656 = 0.6651.

    The probability that the second player wins is

    P₂ = P(3) + P(5) + P(7) = (15120 + 504 + 1)/46656 = 15625/46656 = 0.3349.
(f) The table below shows the stake and profit/loss for both players per throw. The profit columns apply if the game is over after that particular throw. The profit is then the pot minus the total stake of that player.

    throw | stake A | stake B | pot | profit A | profit B
      1   |    1    |    1    |  2  |    -1    |    -1
      2   |    0    |    0    |  2  |     1    |    -1
      3   |    1    |    0    |  3  |    -2    |     2
      4   |    1    |    0    |  4  |     1    |    -1
      5   |    1    |    0    |  5  |    -4    |     4
      6   |    1    |    0    |  6  |     1    |    -1
      7   |    1    |    0    |  7  |    -6    |     6
    The expectation value for the profit of the first player is

    ε⟨W_A⟩ = Σ_{n=1}^{7} W_A(n) P(n)
           = (−1 × 0 + 1 × 27216 − 2 × 15120 + 1 × 3780 − 4 × 504 + 1 × 35 − 6 × 1)/46656 = −0.0264

    and thus for the second player

    ε⟨W_B⟩ = Σ_{n=1}^{7} W_B(n) P(n)
           = (−1 × 0 − 1 × 27216 + 2 × 15120 − 1 × 3780 + 4 × 504 − 1 × 35 + 6 × 1)/46656 = 0.0264

    The second player will therefore win, with an average profit of 2.64 cents per game.
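These probabilities and expectation values can be checked with a simple Monte Carlo. A sketch assuming NumPy, using the profit table above as given; the sample size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
# Profit for player A if the game ends after throw n (n = 1..7), from the table.
WA = {1: -1, 2: 1, 3: -2, 4: 1, 5: -4, 6: 1, 7: -6}

lengths = []
for _ in range(200_000):
    prev, n = rng.integers(1, 7), 1
    while True:
        t = rng.integers(1, 7)
        n += 1
        if t <= prev:          # the game ends when a throw is not higher than the last
            break
        prev = t
    lengths.append(n)

lengths = np.array(lengths)
print(lengths.mean())                     # ~2.522
print(np.mean(lengths % 2 == 0))          # P(first player wins) ~0.665
print(np.mean([WA[n] for n in lengths]))  # ~ -0.026
```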
Problem 4.7.2.5

(a) ∫_0^∞ p(t) dt = ∫_0^∞ K e^{−t/µ} dt = [−Kµ e^{−t/µ}]_0^∞ = Kµ = 1 ⇒ K = 1/µ

(b) t̄ = ∫_0^∞ t p(t) dt = (1/µ) ∫_0^∞ t e^{−t/µ} dt = [−(t + µ) e^{−t/µ}]_0^∞ = µ = 4.05 min

(c) P(t > 8 min) = ∫_8^∞ p(t) dt = [−e^{−t/µ}]_8^∞ = e^{−8/µ} = e^{−1.975} = 0.139

    Thus a 13.9% probability.

(d) Cost function K(t) = 0.2 + 0.2t, and the expectation value therefore is

    ε⟨K(t)⟩ = (1/µ) ∫_0^∞ K(t) e^{−t/µ} dt = (0.2/µ) ∫_0^∞ (1 + t) e^{−t/µ} dt = 0.2 (1 + t̄) = 1.01 euro

(e) New cost function K(t) = 0.04t², and the expectation value therefore is

    ε⟨K(t)⟩ = (1/µ) ∫_0^∞ K(t) e^{−t/µ} dt = (0.04/µ) ∫_0^∞ t² e^{−t/µ} dt
            = (0.04/µ) [−µ(t² + 2µt + 2µ²) e^{−t/µ}]_0^∞ = 0.04 · 2µ² = 0.08µ² = 1.31 euro
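The probabilities and expectation values above are easy to check numerically; a minimal sketch assuming SciPy.

```python
import numpy as np
from scipy.integrate import quad

mu = 4.05                           # mean waiting time in minutes
p = lambda t: np.exp(-t / mu) / mu  # exponential waiting-time density

# (c): probability of waiting more than 8 minutes
print(quad(p, 8, np.inf)[0])                                 # ~0.139
# (d): expected cost for K(t) = 0.2 + 0.2 t
print(quad(lambda t: (0.2 + 0.2*t) * p(t), 0, np.inf)[0])    # ~1.01
# (e): expected cost for K(t) = 0.04 t^2
print(quad(lambda t: 0.04 * t**2 * p(t), 0, np.inf)[0])      # ~1.31
```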
Problem 5.5.2.1

• There has to be (or is expected to be) a linear relation between x and y (stating the obvious);
(c) In a figure:

    [Figure: ln(I) against thickness d (mm), first with the data points only, then with the two lines of worst fit ln(I) = 3.11 − 1.59 d and ln(I) = 2.83 − 1.43 d.]
    S_{ln(I₀)} = S_{I₀}/I₀ ⇒ S_{I₀} = I₀ S_{ln(I₀)} = 19.49 · 0.14 = 2.7 W/m²

    Slope:
    a = ( Σ(1/S_i²) Σ(x_i y_i/S_i²) − Σ(x_i/S_i²) Σ(y_i/S_i²) ) / ( Σ(1/S_i²) Σ(x_i²/S_i²) − (Σ(x_i/S_i²))² )
      = ( 402.1472 · (−136.1839) − 643.9610 · 221.9772 ) / ( 402.1472 · 1351.9309 − 643.9610² )
      = −197710.6420/128989.4663
      = −1.5328 mm⁻¹ = −C.
    Uncertainty:

    S_a = √( Σ(1/S_i²) / ( Σ(1/S_i²) Σ(x_i²/S_i²) − (Σ(x_i/S_i²))² ) )
        = √( 402.1472/128989.4663 ) = 0.05584 mm⁻¹ = S_C.
    y-intercept:

    b = ( Σ(x_i²/S_i²) Σ(y_i/S_i²) − Σ(x_i/S_i²) Σ(x_i y_i/S_i²) ) / ( Σ(1/S_i²) Σ(x_i²/S_i²) − (Σ(x_i/S_i²))² )
      = ( 1351.9309 · 221.9772 − 643.9610 · (−136.1839) ) / 128989.4663
      = 387794.9525/128989.4663
      = 3.0064 = ln(I₀)
    ⇒ I₀ = 20.2147 W/m².
    Uncertainty:

    S_b = √( Σ(x_i²/S_i²) / ( Σ(1/S_i²) Σ(x_i²/S_i²) − (Σ(x_i/S_i²))² ) )
        = √( 1351.9309/128989.4663 ) = 0.1024
    ⇒ S_{I₀} = I₀ S_{ln(I₀)} = 20.2147 · 0.1024 = 2.07 W/m².
Problem 5.5.2.2

(a) Plotting E_n against 1/n² gives a straight line through the origin with slope C. The grounds for this are:

    • The uncertainties are present in E_n but not in n, thus we plot (a function of) n along the x-axis and (a function of) E_n along the y-axis.
    • The uncertainties in E_n are constant, thus we use E_n itself instead of a function f(E_n).
(b) In a figure:

    [Figure: E_n (eV) plotted against 1/n².]
(c) The disadvantage is that the points will not be equally distributed along the x-range in the graph.

    [Figure: E_n (eV) against 1/n² with two lines of worst fit, y = 13.4x and y = 14.0x.]

    Slope:

    a = Σx_i y_i / Σx_i² = 14.85/1.0804 = 13.74 eV

    Uncertainty:

    S_a = S_y/√(Σx_i²) = 0.3/√1.0804 = 0.29 eV

    Conclusion:

    C = (13.7 ± 0.3) eV
2. In a table:

    n²  | E_n (eV)   | C_i = n² E_n (eV)
    1   | 13.7 ± 0.3 | 13.7 ± 0.3
    4   |  3.7 ± 0.3 | 14.8 ± 1.2
    9   |  1.3 ± 0.3 | 11.7 ± 2.7
    16  |  1.1 ± 0.3 | 17.6 ± 4.8
    25  |  0.4 ± 0.3 | 10.0 ± 7.5

    with S_{C_i} = n² S_{E_n}.

3. The first measurement gives the most important contribution to C, as it has the smallest uncertainty. In the graph this is the point that lies farthest to the right and that wholly determines the positioning of the line.
Problem 5.5.2.3

(d) For large values of t the signal approaches the value V_b. Because of the noise in the measurement points it occasionally happens that V_i > V_b, and as a result (V_b − V_i)/V_ref becomes negative; the logarithm then cannot be taken. From the figure it can be estimated that V_b ≃ 8 V. Below t = 41 all V_i-values lie below V_b.

(e) S_{V_i} is the standard deviation of the separate measurements. S_{V_b} is the standard deviation of the average, because V_b is determined from a series of measured values. These exhibit of course the same spread as the separate measured values, thus

    S_{V_b} = S_{V_i}/√N.

    If one demands that S_{V_b} = 0.1 S_{V_i}, then √N = 10 and thus N = 100.
(f) With y = ln((V_b − V_i)/V_ref):

    S_y = |dy/dV_i| S_{V_i} = (V_ref/(V_b − V_i)) · (1/V_ref) · S_{V_i} = S_{V_i}/(V_b − V_i)

    Thus the quantity to be minimized is

    χ² = Σ_i [ ((V_b − V_i)/S_{V_i}) ( ln((V_b − V_i)/V_ref) − a(t_i − t₀) − b ) ]².
(h) All terms in the expressions for both the slope a and the intercept b contain factors 1/S_i², where S_i is given by S_{V_i}/(V_b − V_i) (see (f)). Because S_{V_i} is constant, the factor 1/S_{V_i}² can be taken outside all summations and is eliminated by division: S_{V_i}² cancels everywhere. This of course does not apply to the uncertainties S_a and S_b: there the numerator contains a factor 1/S_V² and the denominator a factor 1/S_V⁴ (both under the square root), hence S_a and S_b are both proportional to S_V. Conclusion: we can determine the best-fitting straight line, but not the uncertainties.
Problem 5.5.2.4

(a) ε⟨n⟩ = Σ_{n=0}^{∞} n P(n) = Σ_{n=0}^{∞} n (µⁿ/n!) exp(−µ)
         = exp(−µ) Σ_{n=1}^{∞} n µⁿ/n!
         = exp(−µ) Σ_{n=1}^{∞} µⁿ/(n − 1)!
         = µ exp(−µ) Σ_{n=1}^{∞} µ^{n−1}/(n − 1)!
         = µ exp(−µ) Σ_{n=0}^{∞} µⁿ/n! = µ exp(−µ) exp(µ) = µ
(b) σ = √(var⟨n⟩) = √µ

(d) S_n̄ = S/√N = √(µ/N)
√
(e) We know then that S ≈ n. Because the relative uncertainty in S is constant we must
plot ln(S):
1
ln(S) ≈ 2 ln(n).
Thus plot: ln(S) against ln(n) gives a straight line through the origin with slope 0.5.
(f) Yes because it is a linear relation of the form y = ax. There is only a single unknown,
the slope ( which should be 0.5).
(h) (This part may be done using Origin; the answer should be the same.) In a figure:

    [Figure: ln(S) plotted against ln(n̄).]
(i) In a table:

    x_i    | y_i  | x_i y_i | x_i²
    4.263  | 2.13 |  9.080  | 18.173
    4.977  | 2.42 | 12.044  | 24.771
    5.380  | 2.59 | 13.934  | 28.944
    5.680  | 2.80 | 15.904  | 32.262
    5.878  | 2.97 | 17.458  | 34.551
           |      | Σ = 68.420 | Σ = 138.701

    Slope:

    a = Σx_i y_i / Σx_i² = 68.420/138.701 = 0.493

    Uncertainty:

    S_a = S_y/√(Σx_i²) = 0.13/√138.701 = 0.011
(j) The theoretical slope is 0.5. Hence the experiment and theory are in agreement because
the theoretical value lies within the uncertainty range of the measurement.
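Part (i) is a one-liner with the y = ax formulas from appendix G; a minimal sketch assuming NumPy, using the tabulated values above.

```python
import numpy as np

x = np.array([4.263, 4.977, 5.380, 5.680, 5.878])  # ln(nbar)
y = np.array([2.13, 2.42, 2.59, 2.80, 2.97])       # ln(S)
Sy = 0.13

a = np.sum(x * y) / np.sum(x**2)   # slope of y = a*x
Sa = Sy / np.sqrt(np.sum(x**2))    # its uncertainty
print(a, Sa)                       # ~0.493 +/- 0.011
```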
Problem 5.5.2.5

(This part may be done using Origin; the answer should be the same.)

(a) tan α = (H − H₀)/D, hence plot tan α against 1/D. This gives a straight line through the origin with slope H − H₀. Alternatively, 1/tan α can be plotted against D. This gives a straight line through the origin with slope 1/(H − H₀). Along the x-axis goes 1/D or D, because their uncertainties are negligible. Along the y-axis goes tan α or 1/tan α. In both cases the uncertainty is not constant, so it won't make a difference.

(b) The line goes through the origin and the uncertainties along the y-axis are not constant, and thus the latter method should be used: a straight line through the origin with different uncertainties.

(c) If tan α is plotted against 1/D, the height H sought is equal to a + H₀ and the uncertainty S_H in H is equal to the uncertainty S_a in the slope (because H₀ does not have an uncertainty).
    If 1/tan α is plotted against D, the height H sought is equal to H₀ + 1/a and the uncertainty S_H in H is equal to S_a/a² (because S_H = |dH/da| S_a = S_a/a²).
(d) If tan α is plotted against 1/D:

    [Figure: tan α plotted against 1/D (m⁻¹); a second panel shows 1/tan α plotted against D (m).]
(e) Drawing two lines of worst fit:

    [Figure: left, tan α against 1/D (m⁻¹) with worst-fit lines y = 88.2x and y = 96.2x; right, 1/tan α against D (m) with worst-fit lines y = 0.0104x and y = 0.0113x.]