Error Analysis
I Introduction
Every measured physical quantity has an uncertainty or error associated with it.
An experiment, in general, involves
(i) Direct measurement of various quantities (primary measurements) and
(ii) Calculation of the physical quantity of interest which is a function of the measured
quantities. An uncertainty or error in the final result arises because of the errors in the
primary measurements (assuming that there is no approximation involved in the
calculation). For example, the result of a recent experiment to determine the velocity of
light (Phys. Rev. Lett. 29, 1346 (1972)) was given as

c = 299,792,458 ± 1.1 m/sec.

The error in the value of c arises from the errors in the primary measurements, viz.,
frequency and wavelength. [Physics Today, July 2007]
Error analysis, therefore, consists of
(i) Estimating the errors in all primary measurements, and
(ii) Propagating the error at each step of the calculation. This analysis serves two
purposes. First, the error in the final result (± 1.1 m/sec in the above example) is an
indication of the precision of the measurement and, therefore, an important part of
the result. Second, the analysis also tells us which primary measurement is causing
more error than others and thus indicates the direction for further improvement of the
experiment.
For example, in measuring 'g' with a simple pendulum, if the error analysis
reveals that the errors in 'g' caused by measurements of l (length of the pendulum)
and T (time period) are 0.5 cm/sec2 and 3.5 cm/sec2 respectively, then we know that
there is no point in trying to devise a more accurate measurement of l. Rather, we
should try to reduce the uncertainty in T by counting a larger number of periods or
using a better device to measure time. Thus, error analysis prior to the experiment is
an important aspect of planning the experiment.
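To make this concrete, here is a minimal Python sketch propagating assumed uncertainties in l and T through the standard simple-pendulum relation g = 4π²l/T². The numbers are chosen only to mimic the situation described above; they are not taken from a real measurement.

```python
import math

# Illustrative numbers (assumed, not taken from an actual measurement):
l, dl = 100.0, 0.05     # length of the pendulum in cm, with its estimated error
T, dT = 2.007, 0.0035   # time period in s, with its estimated error

g = 4 * math.pi**2 * l / T**2   # simple-pendulum relation g = 4*pi^2*l/T^2

# Contribution of each primary error: |dg/dl|*dl and |dg/dT|*dT (see Eq. (3) below)
dg_l = (4 * math.pi**2 / T**2) * dl
dg_T = (8 * math.pi**2 * l / T**3) * dT

print(f"g = {g:.1f} cm/s^2")
print(f"error from l: {dg_l:.2f} cm/s^2, error from T: {dg_T:.2f} cm/s^2")
print(f"maximum error in g: {dg_l + dg_T:.2f} cm/s^2")
```

With these assumed inputs the contribution from T dominates by a wide margin, which is exactly the situation described in the paragraph above.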
Nomenclature
(i) 'Discrepancy' denotes the difference between two measured values of the same
quantity.
(ii) 'Systematic errors' are errors which occur in every measurement in the same
way, often in the same direction and of the same magnitude; for example, length
measurement with a faulty scale. These errors can, in principle, be eliminated or
corrected for.
(iii) 'Random errors' are errors which can cause the result of a measurement to deviate
in either direction from its true value. We shall confine our attention to these
errors, and discuss them under two heads: estimated and statistical errors.
II Estimated Errors
Estimating a primary error
What matters really is the effective least count and not the nominal least count.
For example, in measuring electric current with an ammeter, if the smallest division
corresponds to 0.1 amp, but the marks are far enough apart so that you can easily make
out a quarter of a division, then the effective least count will be 0.025 amp. On the
other hand, if you are reading a vernier scale where five successive marks on the
vernier scale (say, the 27th to the 31st) look equally well in coincidence with the main scale, the
effective least count is about five times the nominal one. Therefore, make a judicious estimate
of the least count. The estimated error is, in general, related to the limiting factor in the
accuracy. This limiting factor need not always be the least count. As another example,
in a null-point electrical measurement, suppose the deflection in the galvanometer
remains zero for all values of the resistance R from 351 Ω to 360 Ω. In that case, the
uncertainty in R is 10 Ω (or rather, ± 5 Ω), even though the least count of the resistance box
may be 1 Ω. (Consider, similarly, what limits the accuracy of a stop-watch.)
How do we calculate the error associated with f, which is a function of the measured
quantities a, b, and c?
Let f = f(a, b, c).   (1)
Then
df = (∂f/∂a) da + (∂f/∂b) db + (∂f/∂c) dc.   (2)
Eq. (2) relates the differential increment in f to the differential increments in a,
b, c. Thus, if our errors in a, b, c (denoted Δa, Δb, Δc) are small compared to a, b, c,
respectively, then we may say
Δf = |∂f/∂a| Δa + |∂f/∂b| Δb + |∂f/∂c| Δc,   (3)
where the modulus signs have been put in because the errors in a, b, and c are independent of
each other and may be in either the positive or the negative direction. Therefore, the
maximum possible error will be obtained only by adding the absolute values of all the
independent contributions. (All the Δ's are considered positive by definition.) Special
care has to be taken when the errors are not all independent of each other. This will become
clear in special case (v) below.
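Eq. (3) can also be applied mechanically with a computer-algebra package. The sketch below uses sympy; the function f = ab/c and all numerical values are assumptions made purely for illustration.

```python
import sympy as sp

a, b, c = sp.symbols('a b c', positive=True)
f = a * b / c                          # an assumed example function of the measured quantities

values = {a: 10.0, b: 5.1, c: 2.0}     # measured values (made up)
errors = {a: 0.1, b: 0.2, c: 0.05}     # estimated errors (made up)

# Eq. (3): maximum possible error = sum of |df/dx| * (error in x) over the independent variables
df = sum(abs(sp.diff(f, x).subs(values)) * dx for x, dx in errors.items())

print("f  =", float(f.subs(values)))
print("df =", float(df))
```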
(i) For addition or subtraction, the absolute errors are added, e.g.,
if f = a + b − c, then
Δf = Δa + Δb + Δc.   (4)
(ii) For multiplication and division, the fractional (or percent) errors are added,
e.g.,
if f = ab/c, then
Δf/f = Δa/a + Δb/b + Δc/c.   (5)
(iii) For raising to a constant power, including fractional powers, the fractional error
is multiplied by the power, e.g.,
if f = a^3.6, then
Δf/f = 3.6 (Δa/a).   (6)
(iv) In mixed calculations, break up the calculation into simple parts, e.g.,
if f = a/b + c^(3/2), then
Δf = Δ(a/b) + Δ(c^(3/2)).
As Δ(a/b) = (1/b) Δa + (a/b²) Δb
and Δ(c^(3/2)) = (3/2) √c Δc,
we get
Δf = (1/b) Δa + (a/b²) Δb + (3/2) √c Δc.   (7)
Note that the same result could have been derived directly by differentiation.
(v) Consider f = a² + ab/c.
The relation for the error, before putting in the modulus signs, is
Δf = (b/c) Δa + (a/c) Δb − (ab/c²) Δc + 2a Δa.
Note that the Δa factors in the first and fourth terms are not independent errors.
Therefore, we must not add the absolute values of these two terms indiscriminately.
The correct way to handle it is to collect the coefficients of each independent error
before putting in the modulus signs, i.e.,
Δf = |2a + b/c| Δa + |a/c| Δb + |ab/c²| Δc.   (8)
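The lesson of case (v) is that collecting the coefficient of each independent error first is the same as differentiating f as a whole with respect to each variable. A quick symbolic check of Eq. (8), assuming sympy is available:

```python
import sympy as sp

a, b, c = sp.symbols('a b c', positive=True)
f = a**2 + a*b/c        # the function of special case (v)

# df/da collects both occurrences of a automatically: 2a + b/c, as in Eq. (8)
for x in (a, b, c):
    print(f"df/d{x} =", sp.simplify(sp.diff(f, x)))
```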
III Statistical Errors
Statistical errors arise in measurements that consist of counting random events: repeated
counts fluctuate, and the result of the experiment is taken to be the most likely value.
The exact definition of "most likely" depends on the distribution governing the
random events. For all random processes whose probability of occurrence is small
and constant, the Poisson distribution is applicable, i.e.,
P_n = (m^n / n!) e^(−m),   (9)
where P_n is the probability of observing a particular count n when the
expectation value is m.
It can be shown that if an infinite number of measurements are made, (i) their
average would be m and (ii) their standard deviation (s.d.) would be √m, for this
distribution. Also, if m is not too small, then 68% (nearly two-thirds) of the
measurements would yield numbers within one s.d., i.e., in the range m ± √m.
In practice, from a finite set of k measurements x_1, …, x_k, the standard deviation is
estimated as
σ = √[ Σ_{n=1}^{k} (Δx_n)² / (k − 1) ],   (10)
where Δx_n is the deviation of the measurement x_n from the mean. However, since we
know the distribution, we can ascribe the s.d. even to a single measurement.
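A quick numerical check of these statements about the Poisson distribution (the expectation value m = 100 is assumed for illustration; numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 100                                   # assumed expectation value of the count
counts = rng.poisson(m, size=100_000)     # many repeated counting measurements

print("mean               :", counts.mean())        # close to m
print("standard deviation :", counts.std(ddof=1))   # close to sqrt(m) = 10
frac = np.mean(np.abs(counts - m) <= np.sqrt(m))
print("fraction within m +/- sqrt(m):", frac)       # roughly two-thirds, cf. the 68% quoted above
```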
If the quantities a, b, c have independent statistical errors σ_a, σ_b, σ_c, the error in
f = f(a, b, c) is obtained by adding the individual contributions in quadrature:
σ_f² = (∂f/∂a)² σ_a² + (∂f/∂b)² σ_b² + (∂f/∂c)² σ_c².   (11)
(i) For addition or subtraction, the squares of the errors are added, e.g.,
if f = a + b − c,
then σ_f² = σ_a² + σ_b² + σ_c².   (12)
(ii) For multiplication or division, the squares of the fractional errors are added, e.g.,
if f = ab/c,
then (σ_f/f)² = (σ_a/a)² + (σ_b/b)² + (σ_c/c)².   (13)
(iii) If a measurement is repeated n times, the error in the mean is a factor √n less than
the error in a single measurement, i.e.,
σ_f̄ = σ_f / √n.   (14)
Note that Eqs. (11)–(14) apply to any statistical quantities a, b, etc., i.e., to primary
measurements as well as to computed quantities, whereas
σ_m = √m   (15)
applies only to a directly counted (Poisson-distributed) number.
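For comparison, a minimal sketch of how the same fractional uncertainties combine linearly under the estimated-error rule (Eq. (5)) and in quadrature under the statistical rule (Eq. (13)); all numbers are made up:

```python
import math

# f = a*b/c with made-up values; the same numbers are treated first as estimated
# errors (which add linearly) and then as standard deviations (which add in quadrature).
a, da = 10.0, 0.1
b, db = 5.1, 0.2
c, dc = 2.0, 0.05

f = a * b / c

df_max = f * (da/a + db/b + dc/c)                               # Eq. (5)
sigma_f = f * math.sqrt((da/a)**2 + (db/b)**2 + (dc/c)**2)      # Eq. (13)

print(f"f = {f:.2f}, maximum error = {df_max:.2f}, statistical error = {sigma_f:.2f}")
```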
IV Miscellaneous
Repeated measurements
If a quantity is measured n times, the error in the mean is
Δf̄ = Δf / √n,   (16)
where Δf is the error in one measurement. Hence one way of minimising random
errors is to repeat the measurement many times.
Graphs
Here is a simple method of obtaining the best fit for a straight line on a graph.
Having plotted all the points (x_1, y_1), …, (x_n, y_n), plot also the centroid (x̄, ȳ).
Then consider all straight lines through the centroid (use a transparent ruler) and visually
judge which one will represent the best mean.
Having drawn the best line, estimate the error in the slope as follows. Rotate
the ruler about the centroid until its edge passes through the cluster of points at the
'top right' and the 'bottom left'. This new line gives one extreme possibility; let the
difference between the slopes of this line and the best line be Δm_1. Similarly determine Δm_2
corresponding to the other extreme. The error in the slope may be taken as
Δm = [(Δm_1 + Δm_2)/2] · (1/√n),   (17)
where n is the number of points. The factor √n comes in because evaluating the slope
from the graph is essentially an averaging process.
It should be noted that if the scale of the graph is not large enough, the least
count of the graph may itself become a limiting factor in the accuracy of the result.
Therefore, it is desirable to select the scale so that the least count of the graph paper is
much smaller than the experimental error.
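A small numerical illustration, on made-up data, of why pivoting the ruler about the centroid is sensible: the least-squares line discussed later passes exactly through (x̄, ȳ). Eq. (17) is then applied to extreme slopes judged by eye (the Δm values below are assumed).

```python
import numpy as np

# Made-up data points (x_i, y_i)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

# Centroid of the data: the point about which the transparent ruler is pivoted
xc, yc = x.mean(), y.mean()
slope, intercept = np.polyfit(x, y, 1)   # least-squares line, for comparison

print(f"centroid: ({xc:.2f}, {yc:.2f})")
print("best line passes through the centroid:", np.isclose(slope * xc + intercept, yc))

# Eq. (17): error in the slope from the two extreme lines judged by eye (assumed numbers)
dm1, dm2, n = 0.15, 0.11, len(x)
dm = 0.5 * (dm1 + dm2) / np.sqrt(n)
print(f"estimated error in the slope: {dm:.3f}")
```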
V Instructions
1. Calculate the estimated/statistical error for the final result. In any graph you plot,
show error bars. (If the errors are too small to show up on the graph, then write them
somewhere on the graph).
2. If the same quantity has been measured/calculated many times, you need not
determine the errors each time. Similarly, one typical error bar on the graph will be
enough.
3. Show the main steps of the error calculation. For example,
if f = ab, then Δf/f = Δa/a + Δb/b;
with a = 10.0 ± 0.1 and b = 5.1 ± 0.2,
f = 51.0 and Δf = 51.0 × (0.1/10.0 + 0.2/5.1) = 0.51 + 2.0 ≈ 2.5.
Therefore, f = 51.0 ± 2.5.
Here the penultimate step must not be skipped because it shows that the contribution to
the error from b is large.
4. Where the final result is a known quantity (for example, e/m), show the discrepancy
of your result from the standard value. If this is much greater than, or even much less
than, the estimated error, this is abnormal and requires explanation. (g = 9.3 ± 0.1 or
10.0 ± 0.5) OR (g = 9.8 ± 0.5 or 9.8 ± 0.1).
Consider N measurements of a quantity x, yielding values x_1, x_2, …, x_N. One defines
Mean: x̄ = lim_{N→∞} (1/N) Σ_{i=1}^{N} x_i,   (18)
Deviations: The deviation d_i of any measurement x_i from the mean x̄ of the parent
distribution is defined as
d_i = x_i − x̄.   (19)
Note that if x̄ is the true value of the quantity being measured, d_i is also the true
error in x_i.
The arithmetic average of the deviations for an infinite number of observations
must vanish, by the definition of the mean (Eq. (18)):
lim_{N→∞} (1/N) Σ_{i=1}^{N} (x_i − x̄) = [lim_{N→∞} (1/N) Σ_{i=1}^{N} x_i] − x̄ = 0.   (20)
There are several indices one can use to indicate the spread (dispersion) of the
measurements about the central value, i.e., the mean value. The dispersion is a
measure of precision. One can define the average deviation d̄ as the average of the
magnitudes of the deviations (absolute values of the deviations),
d̄ = lim_{N→∞} (1/N) Σ_{i=1}^{N} |x_i − x̄|,
which can be used as a measure of the dispersion of the expected observations about the
mean. However, a more appropriate measure of the dispersion is found in the
parameter called the standard deviation σ, defined as
σ² = lim_{N→∞} (1/N) Σ_{i=1}^{N} (x_i − x̄)² = lim_{N→∞} [(1/N) Σ_{i=1}^{N} x_i²] − x̄².   (21)
For a finite number of observations, σ is estimated as
σ = √[ Σ_{i=1}^{N} d_i² / N ].   (22)
The expression derived from a statistical analysis is
σ = √[ Σ_{i=1}^{N} d_i² / (N − 1) ],   (23)
where the denominator is N-1 instead of N. In practice, the distinction between these
formulae is unimportant. According to the general theory of statistics the reliability
of a result depends upon the number of measurements and, in general, improves with
the square root of the number.
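A short sketch of these definitions applied to a small made-up data set (numpy assumed available):

```python
import numpy as np

# Made-up repeated measurements of a single quantity
x = np.array([9.78, 9.82, 9.75, 9.81, 9.79, 9.84])
N = len(x)

mean = x.mean()                                       # Eq. (18), for finite N
avg_dev = np.abs(x - mean).mean()                     # average deviation d-bar
sigma_N = np.sqrt(np.sum((x - mean)**2) / N)          # Eq. (22): denominator N
sigma_N1 = np.sqrt(np.sum((x - mean)**2) / (N - 1))   # Eq. (23): denominator N - 1

print(f"mean = {mean:.3f}")
print(f"average deviation = {avg_dev:.4f}")
print(f"sigma with N = {sigma_N:.4f}, sigma with N-1 = {sigma_N1:.4f}")
print(f"error in the mean = {sigma_N1 / np.sqrt(N):.4f}")   # cf. Eq. (14)
```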
Least-squares fit to a straight line
Suppose we wish to describe our data by the linear relation
y = a + b x,   (24)
by determining the values of the coefficients a and b such that the discrepancy is
minimized between the values of our measurements y_i and the corresponding
values y = f(x_i) given by Eq. (24). We cannot determine the coefficients exactly with
only a finite number of observations, but we do want to extract from these data the
most probable estimates for the coefficients.
The problem is to establish criteria for minimizing the discrepancy and
optimizing the estimates of the coefficients. For any arbitrary values of a and b, we can
calculate the deviations Δy_i between each of the observed values y_i and the
corresponding calculated values,
Δy_i = y_i − y(x_i) = y_i − a − b x_i.   (25)
If the coefficients are well chosen, these deviations should be relatively small. The
sum of these deviations is not a good measure of how well we have approximated
the data with our calculated straight line because large positive deviations can be
balanced by large negative ones to yield a small sum even when the fit is bad. We might
however consider summing up the absolute values of the deviations, but this leads
to difficulties in obtaining an analytical solution. We consider instead the sum of the
squares of deviations.
There is no unique correct method for optimizing the coefficients which is
valid for all cases. There exists, however, a method which can be fairly well justified,
which is simple and straightforward, which is well established experimentally (???)
as being appropriate, and which is accepted by convention. This is the method of
least squares which we will explain using the method of maximum likelihood.
Method of maximum likelihood
Assume that the actual (parent) relation between y and x is
y(x) = a_0 + b_0 x.   (26)
For any given value x = x_i, we can calculate the probability P_i of making the
observed measurement y_i, assuming a Gaussian distribution with standard deviation
σ_i for the observations about the actual value y(x_i). It is given by
P_i = [1/(σ_i √(2π))] exp{ −(1/2) [(y_i − y(x_i))/σ_i]² }.   (27)
The probability of making the observed set of measurements of the N values of y_i is the
product of these probabilities,
P(a_0, b_0) = Π P_i = [Π 1/(σ_i √(2π))] exp{ −(1/2) Σ [(y_i − y(x_i))/σ_i]² },   (28)
where the product is taken for i ranging from 1 to N.
Similarly, for any estimated values of the coefficients a and b, we can calculate the
probability that we should make the observed set of measurements,
P(a, b) = [Π 1/(σ_i √(2π))] exp{ −(1/2) Σ (Δy_i/σ_i)² }.   (29)
The method of maximum likelihood consists of making the assumption that the
observed set of measurements is more likely to have come from the actual parent
distribution of Eq. (26) than from any other similar distribution with different
coefficients and, therefore, the probability in Eq. (28) is the maximum probability
attainable with Eq. (29). The best estimates for a and b are therefore those values
which maximise the probability in Eq. (29). The first term of Eq. (29) is a constant,
independent of the values of a or b. Thus, maximizing the probability P(a, b) is
equivalent to minimizing the sum in the exponential. We define the quantity χ² to
be this sum,
χ² = Σ (Δy_i/σ_i)² = Σ (1/σ_i²) (y_i − a − b x_i)²,   (30)
where Σ always implies Σ_{i=1}^{N}, and we consider this to be the appropriate measure of the
goodness of fit.
Our method for finding the optimum fit to the data will be to minimize this
weighted sum of squares of deviations and, hence, to find the fit which produces the
smallest sum of squares or the least-squares fit.
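As a small illustration on made-up data, χ² of Eq. (30) can be evaluated directly for any trial pair (a, b); the least-squares fit is simply the pair that makes it smallest:

```python
import numpy as np

# Made-up data with equal uncertainties sigma_i = 0.2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma = 0.2

def chi2(a, b):
    """Eq. (30): weighted sum of squared deviations from the line y = a + b*x."""
    return np.sum(((y - a - b * x) / sigma) ** 2)

print(chi2(1.0, 2.0))    # chi^2 for one trial pair of coefficients
print(chi2(0.9, 2.02))   # a slightly different line gives a different chi^2
```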
Minimizing χ²: In order to find the values of the coefficients a and b which
yield the minimum value of χ², we use the method of differential calculus for
minimizing a function with respect to more than one coefficient. The minimum value
of the function χ² of Eq. (30) is the one which yields a value of zero for both of the partial
derivatives with respect to the coefficients:
∂χ²/∂a = (∂/∂a) (1/σ²) Σ (y_i − a − b x_i)² = −(2/σ²) Σ (y_i − a − b x_i) = 0,
∂χ²/∂b = (∂/∂b) (1/σ²) Σ (y_i − a − b x_i)² = −(2/σ²) Σ x_i (y_i − a − b x_i) = 0,   (31)
where we have for the present assumed all of the standard deviations equal, σ_i = σ. In
other words, the errors in the y's are assumed to be the same for all values of x. These
equations can be rearranged to yield a pair of simultaneous equations,
Σ y_i = a N + b Σ x_i,
Σ x_i y_i = a Σ x_i + b Σ x_i².   (32)
We wish to solve Eqs. (32) for the coefficients a and b. This will give us the
values of the coefficients for which χ², the sum of squares of the deviations of the
data points from the calculated fit, is a minimum. The solutions are
a = [ Σ x_i² Σ y_i − Σ x_i Σ x_i y_i ] / [ N Σ x_i² − (Σ x_i)² ],
b = [ N Σ x_i y_i − Σ x_i Σ y_i ] / [ N Σ x_i² − (Σ x_i)² ].   (33)
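Eq. (33) transcribes directly into code. The data below are made up, and np.polyfit is used only as an independent cross-check of the closed-form result:

```python
import numpy as np

# Made-up data, equal errors in y assumed
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 2.9, 5.1, 7.2, 8.8, 11.1])
N = len(x)

# Eq. (33): closed-form least-squares solution for y = a + b*x
Delta = N * np.sum(x**2) - np.sum(x)**2
a = (np.sum(x**2) * np.sum(y) - np.sum(x) * np.sum(x * y)) / Delta   # intercept
b = (N * np.sum(x * y) - np.sum(x) * np.sum(y)) / Delta              # slope

print(f"a (intercept) = {a:.3f}, b (slope) = {b:.3f}")
print("cross-check with np.polyfit:", np.polyfit(x, y, 1))   # returns [slope, intercept]
```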
Errors in the coefficients a and b: Now we enquire what errors should be assigned
to a and b. In general, the errors in the y's corresponding to different values of x will be
different. To find the standard deviation in a, say S_a, we proceed in the following way.
The deviation in a gets contributions from the variations in the individual y_i's. The
contribution of the deviation Δy_n of a typical measured value y_n to the standard
deviation S_a is found using the first equation of Eq. (33), reproduced below as
a = [ Σ x_n² Σ y_n − Σ x_n Σ x_n y_n ] / [ N Σ x_n² − (Σ x_n)² ].   (34)
By differentiating it partially with respect to y_i we get
∂a/∂y_i = [ Σ x_n² − x_i Σ x_n ] / [ N Σ x_n² − (Σ x_n)² ].   (35)
Since Δy_i is assumed statistically independent of x_i, we may replace (Δy_i)² by its
average value,
⟨(Δy_i)²⟩ = (1/N) Σ (Δy_i)² = σ_y²,   (36)
so that the contribution of y_i to the deviation in a is
(∂a/∂y_i) σ_y = [ Σ x_n² − x_i Σ x_n ] σ_y / [ N Σ x_n² − (Σ x_n)² ].   (37)
The standard deviation S_a of the intercept a is found by squaring this expression,
summing over all measured values of y (that is, summing the index i from 1 to N), and
taking the square root of this sum. It should also be realized that Σ x_i = Σ x_n and
Σ x_i² = Σ x_n². The result of this calculation is
S_a = σ_y √[ Σ x_n² / (N Σ x_n² − (Σ x_n)²) ].   (38)
In a similar manner, the standard deviation S_b of the slope b can be found as
S_b = σ_y √[ N / (N Σ x_n² − (Σ x_n)²) ].   (39)
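And a sketch of Eqs. (38) and (39) on the same made-up data. One assumption is added that is not in the text above: σ_y is estimated from the scatter of the residuals about the fitted line, with N − 2 in the denominator to account for the two fitted parameters, whereas the text treats σ_y as known.

```python
import numpy as np

# Same made-up data as in the previous sketch
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 2.9, 5.1, 7.2, 8.8, 11.1])
N = len(x)

Delta = N * np.sum(x**2) - np.sum(x)**2
a = (np.sum(x**2) * np.sum(y) - np.sum(x) * np.sum(x * y)) / Delta   # intercept
b = (N * np.sum(x * y) - np.sum(x) * np.sum(y)) / Delta              # slope

# Estimate the common error in y from the scatter about the fitted line
# (N - 2 because two parameters have been fitted; this step is an added assumption).
residuals = y - a - b * x
sigma_y = np.sqrt(np.sum(residuals**2) / (N - 2))

S_a = sigma_y * np.sqrt(np.sum(x**2) / Delta)   # Eq. (38): error in the intercept a
S_b = sigma_y * np.sqrt(N / Delta)              # Eq. (39): error in the slope b

print(f"a = {a:.3f} +/- {S_a:.3f}")
print(f"b = {b:.3f} +/- {S_b:.3f}")
```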
EXAMPLE I
Random residuals and an order-of-magnitude smaller normalized χ² (~10⁻¹⁰) for the √T fit,
as against systematic residuals and χ² ~ 10⁻⁹ for the ln T fit.
Our experimental accuracy is ~5 ppm (how do we know? from repeated measurements,
fluctuations of the nanovoltmeter, and the crimp connections).
Goodness of fit ~10 ppm [normalized χ² ~ 10⁻¹⁰]: a good fit to that extent!
If our accuracy had been better than 1 ppm, the fit would have been considered poor.
EXAMPLE II
[Figure: long moment (EMU) vs. temperature (K), 0–90 K, with the fit-parameter box from the fitting software.]
[Table: sample composition (sample ID), M(0) (emu), C (10⁻⁴ K⁻³/²), normalized R², and spin-wave stiffness D (meV Å²) for each sample; values not reproduced here.]
[Figure: M (10⁻⁴ emu) vs. T (K), 15–30 K, for Sample1_30min@1000 Oe, with the best fit using Bloch's T³/² law.]
M(T) between 10 and 30 K (≪ T_c) fits quite well to Bloch's T³/² law, giving spin-wave
stiffness constants D = 157 and 195 meV Å² for the thickest films with Mo ~13.5% and ~10.9%.
The change in M is still ~1%, but the absolute change is only 2×10⁻⁶ emu, as against 2×10⁻² emu for
the bulk. The sample is just push-fitted into the straw with no packing whatsoever.
Superparamagnetism
The samples are single layers of Ni nanoparticles, with non-conducting Al₂O₃ on
both sides, deposited on both Si and sapphire substrates using the PLD technique.
M(H, T) was measured using a Quantum Design MPMS (SQUID magnetometer).
The diamagnetic contribution of the substrate was subtracted (typically χ = −2 × 10⁻⁴ emu/tesla).
[Figure: M (emu) vs. T (K), 0–300 K. M(T) at H = 200 Oe for the 6 nm Ni sample. The Langevin/Brillouin function, M ∝ (coth x − 1/x) with x = μH/k_B T, fits well with μ = 2700 μ_B.]