Chapter 6 - Statistical Data
• Experimentalists use statistical calculations
to sharpen their judgments concerning the
quality of experimental measurements.
1. Defining a numerical interval around the
mean of a set of replicate analytical results
within which the population mean can be
expected to lie with a certain probability.
This interval is called the confidence
interval.
Types of Laboratory Errors
• Systematic Error – represented by a constant
bias between your results and the true answer.
• Random Error – the result of random variation
in experimental data.
• Individual results from a set of
measurements are seldom the same, so a
central or “best” value is used for the set.
• 1st – the central value of a set should be
more reliable than any of the individual
results.
• 2nd – variation in the data should provide
a measure of the uncertainty associated
with the central result.
– Mean or median may serve as the central
value for a set of replicated measurements.
• Mean – the average value of two or more
measurements:

x̄ = Σxi / N

For the iron-determination data set discussed
below, Mean = 19.8 ppm.
• What is Precision?
– All these terms are a function of the
deviation from the mean, di, or just the
deviation, which is defined as

di = |xi − x̄|
• The relationship between deviation from
the mean and the three precision terms is
given by the following equations:

s = √[ Σdi² / (N − 1) ]   when N is small;

σ = √[ Σ(xi − μ)² / N ]   when N → ∞.
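As a sketch of the small-N formula in code (Python assumed; the data are the specimen-1 Hg results that appear in Table 2.1 later in this chapter):

```python
import math

def sample_std(data):
    """Sample standard deviation: s = sqrt(sum of squared deviations / (N - 1))."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)  # sum of squares of deviations from the mean
    return math.sqrt(ss / (n - 1))

# Specimen 1 Hg results (ppm) from Table 2.1
hg = [1.80, 1.58, 1.64]
print(round(sum(hg) / len(hg), 3))  # mean, ppm
print(round(sample_std(hg), 3))     # s, ppm
```

The sum of squared deviations this computes (≈ 0.0259) matches the Table 2.1 entry to rounding.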
spooled – the standard deviation pooled from more
than one set of data:

spooled = √[ (Σ(xi − x̄1)² + Σ(xj − x̄2)² + …) / (N1 + N2 + … − Nsets) ]

where Nsets is the number of data sets pooled.
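A minimal sketch of the pooled calculation; the first data set is from Table 2.1, the second is hypothetical:

```python
import math

def pooled_std(*datasets):
    """s_pooled = sqrt(total sum of squared deviations / (N_total - number of sets))."""
    total_ss = 0.0
    total_n = 0
    for data in datasets:
        m = sum(data) / len(data)
        total_ss += sum((x - m) ** 2 for x in data)
        total_n += len(data)
    return math.sqrt(total_ss / (total_n - len(datasets)))

set1 = [1.80, 1.58, 1.64]        # specimen 1, Table 2.1
set2 = [1.75, 1.68, 1.70, 1.72]  # hypothetical second set of replicates
print(round(pooled_std(set1, set2), 3))
```

Pooling is useful because each small replicate set alone gives a poor estimate of the method's precision.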
• Variance = s²
How about Accuracy?
• We may determine precision just by replicating
a measurement; accuracy requires the true value.
The absolute error is

E = xi − xt ; where xt = the true or accepted value
of the quantity.
Eg: Data from the determination of iron
concentration (shown earlier):
- The absolute error of the result immediately to the
left of the true value of 20.00 ppm is −0.2 ppm
Fe; the result at 20.10 ppm has an error of +0.1
ppm Fe.
Relative Error (Er)
The percent relative error is given by the
expression

Er = (xi − xt)/xt × 100%
Fig. 3.2 Normal error curve (©Gary Christian,
Analytical Chemistry, 6th Ed., Wiley).
The quantity z represents the deviation of
a result from the mean for a population
of data, measured relative to the standard
deviation:

z = (x − μ)/σ
CL for SINGLE and FEW MEASUREMENTS

For a single measurement:  CL = x ± zs
For N measurements:        CL = x̄ ± zs/√N
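The two confidence-limit expressions above can be sketched as follows (the numbers used here are hypothetical illustrations, not from the chapter's examples):

```python
import math

def confidence_interval(x, s, n=1, z=1.96):
    """CL = x ± z*s/sqrt(N); with n=1 this reduces to the single-measurement case."""
    half = z * s / math.sqrt(n)
    return x - half, x + half

# hypothetical: mean of 4 results, s = 0.10, 95% confidence (z = 1.96)
lo, hi = confidence_interval(20.0, 0.10, n=4, z=1.96)
print(round(lo, 3), round(hi, 3))
```

Note how the interval narrows by √N as replicates are added, which is the point of the "For N measurements" form.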
Example 2.1
The mercury in samples of seven fish
taken from Chesapeake Bay was
determined by a method based on the
absorption of radiation by gaseous
elemental mercury (Table 2.1).
Table 2.1 Results for Example 2.1

Specimen | No. of samples measured | Hg content, ppm  | Mean, ppm Hg | Sum of squares of deviations from mean
1        | 3                       | 1.80, 1.58, 1.64 | 1.673        | 0.0258
Solution
i) For a single measurement, CL = x ± zs, with z
taken from the table of confidence levels for
various values of z (Table 2.2).
From these calculations, we conclude that it is
80% probable that μ, the population mean (and,
in the absence of determinate error, the true
value), lies in the interval between 1.67 and 1.93
ppm Hg.
Table 2.2 Confidence Levels for Various Values of z

Confidence Level, % | z
50   | 0.67
68   | 1.00
80   | 1.29
90   | 1.64
95   | 1.96
95.4 | 2.00
99   | 2.58
99.7 | 3.00
99.9 | 3.29
Solution
b) For the three measurements, CL = x̄ ± zs/√N,
again with z taken from Table 2.2.
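The two parts of the solution can be reproduced numerically. The pooled standard deviation s = 0.10 ppm is an assumption here (it is used in the full worked example but does not appear in this excerpt):

```python
import math

s = 0.10    # assumed pooled std. dev. of the method, ppm Hg (not shown in this excerpt)
z80 = 1.29  # z for 80% confidence, Table 2.2

# i) a single measurement of 1.80 ppm
x = 1.80
print(round(x - z80 * s, 2), round(x + z80 * s, 2))  # 1.67 to 1.93 ppm, as in the text

# b) the mean of the three specimen-1 results
mean = 1.673
half = z80 * s / math.sqrt(3)
print(round(mean - half, 2), round(mean + half, 2))
```

Part (b) gives a narrower interval than part (i), since averaging three results shrinks the uncertainty by √3.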
Example 2.2
How many replicate measurements of specimen 1 in example
2.1 are needed to decrease the 95% confidence interval to ±
0.07 ppm Hg?
Solution
Rearranging the confidence-limit relation,
zs/√N = 0.07, gives N = (zs/0.07)², with
z = 1.96 at the 95% confidence level.
We conclude that eight measurements would provide a
slightly better than 95% chance of the population mean
lying within ± 0.07 ppm of the experimental mean.
Select a confidence level (95% is a good choice) for the number of samples analyzed
(= degrees of freedom + 1).
Confidence limit = x̄ ± ts/√N.
It depends on the precision, s, and the confidence level you select.
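A sketch of the t-based confidence limit, using a hard-coded excerpt of 95% t values (the entries for 2 and 3 degrees of freedom appear in this chapter; the others are standard tabulated values):

```python
import math

# t values at 95% confidence, keyed by degrees of freedom (excerpt of Table 2.4)
T95 = {1: 12.71, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57}

def confidence_limit_t(data):
    """CL = mean ± t*s/sqrt(N), with t chosen for N-1 degrees of freedom."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    t = T95[n - 1]
    half = t * s / math.sqrt(n)
    return mean - half, mean + half

# specimen-1 Hg data from Table 2.1
lo, hi = confidence_limit_t([1.80, 1.58, 1.64])
print(round(lo, 2), round(hi, 2))
```

The t-based interval is much wider than the z-based one from the same data, because with few replicates s is itself uncertain.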
Table 2.4
Example 2.3

Here, x̄ = 0.252/3 = 0.084. Table 2.4
indicates that t = 4.30 for two degrees of
freedom and 95% confidence. Thus,
CL = x̄ ± ts/√N.
(b) Because s = 0.005% is a good estimate of
σ, z may be used in place of t.
– If agreement is found, the hypothetical model
serves as the basis for further experiments.
• Experimental results seldom agree exactly
with those predicted from a theoretical
model.
• Consequently, scientists and engineers
frequently must judge whether a numerical
difference is a manifestation of the random
errors inevitable in all measurements.
• Certain statistical tests are useful in
sharpening these judgments.
Tests of this kind make use of a null
hypothesis.
These probability levels are often called
significance levels (α).
1) The mean of an experimental data set (x̄) with
what is believed to be the true value (μ);
Comparing an Experimental
Mean with the True Value
• A common way of testing for bias in an
analytical method is to use the method to
analyze a sample whose composition is
accurately known.
• Bias in an analytical method is illustrated in
Figure 3.2.
• Method A has no bias, so the population mean
μA is the true value (xt).
• Method B has a systematic error, or bias, that is
given by

bias = μB − xt = μB − μA
(Figure: distributions of analytical results for
methods A and B, plotted against the analytical
result, with xt marking the true value.)
• The judgment must then be made whether
this difference is the consequence of
random error or, alternatively, a systematic
error.
• In treating this type of problem statistically,
the difference x̄ − xt is compared with the
difference that could be caused by random
error.
• If the observed difference (x̄ − xt) is smaller
than that computed for a chosen probability,
the null hypothesis that x̄ and xt are the same
cannot be rejected (no significant difference).
• If x̄ − xt is significantly larger than either the
expected or the critical value, we may assume
that the difference is real and the systematic
error is significant.
• The critical value for rejecting the null
hypothesis is calculated by rearranging the
confidence-limit expression given earlier.
• From Table 2.4, we find that at
the 95% confidence level, t has a value of
3.18 for three degrees of freedom.
• Thus we can calculate a test value of t from
our data and compare it with the values given
in Table 2.4 at the desired confidence
level.
• The t test value is calculated from

t = (x̄ − xt) / (s/√N)
Since 4.375 > 3.18 (tcrit), we can conclude
that a difference this large is significant and
reject the null hypothesis at the confidence
level chosen (95%).
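The whole test can be sketched as follows; the four replicate results and the accepted value here are hypothetical, chosen only to illustrate the comparison with tcrit = 3.18:

```python
import math

def t_vs_true(data, x_true):
    """t = |mean - x_true| * sqrt(N) / s."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return abs(mean - x_true) * math.sqrt(n) / s

# hypothetical four replicate results vs. an accepted value of 20.00
t_calc = t_vs_true([20.10, 20.30, 20.25, 20.15], 20.00)
t_crit = 3.18  # 95% confidence, 3 degrees of freedom (Table 2.4)
print(t_calc > t_crit)  # True: the difference is significant, reject the null hypothesis
```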
Comparing Two Experimental
Means
• Used to judge whether a difference in the
means of two sets of identical analyses is
real and constitutes evidence that the
samples are different, or whether the
discrepancy is simply a consequence of
random errors in the two sets.
Eg:

Data set, Material 1 | Data set, Material 2
x1 | x1
x2 | x2
x3 | x3
x4 | x4
⋮  | ⋮
(N1 values) | (N2 values)
• Thus, the variance of the difference (d =
x̄1 − x̄2) between the means is given by

sd² = sm1² + sm2²
If we then assume that the pooled standard
deviation spooled is a good estimate of both
sm1 and sm2, then

sm1 = spooled/√N1   and   sm2 = spooled/√N2
Substituting these into the t equation given
earlier (with x̄2 in place of xt), we find that

t = (x̄1 − x̄2) / ( spooled √(1/N1 + 1/N2) )

where spooled is the pooled standard deviation
of the two sets.
We then compare our test value of t with the critical
value obtained from Table 2.5 for the
particular confidence level desired.
The number of degrees of freedom for finding the
critical value of t in Table 2.5 is N1 + N2 − 2.
If the absolute value of the calculated test
statistic is smaller than the critical value, the null
hypothesis is accepted and no significant
difference between the means has been
demonstrated, and vice versa.
If a good estimate of σ is available, the equation can be
modified by inserting z for t and σ for s.
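A sketch of the two-means t test under these assumptions (the two data sets here are hypothetical):

```python
import math

def two_sample_t(a, b):
    """t = (mean_a - mean_b) / (s_pooled * sqrt(1/N1 + 1/N2))."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # pooled sum of squared deviations, N1 + N2 - 2 degrees of freedom
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    s_pooled = math.sqrt(ss / (na + nb - 2))
    return (ma - mb) / (s_pooled * math.sqrt(1 / na + 1 / nb))

# hypothetical results for the same material from two analysts
t = two_sample_t([3.80, 3.90, 3.85], [3.70, 3.75, 3.65])
print(round(abs(t), 2))
```

With N1 + N2 − 2 = 4 degrees of freedom, the resulting |t| would be compared with the tabulated critical value at the chosen confidence level.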
F = s12/s22.
You compare the variances of two different methods to see if there is a
significant difference in the methods, at the 95% confidence level.
Table 2.5
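A minimal sketch with hypothetical standard deviations; by the usual convention the larger variance goes in the numerator so that F ≥ 1:

```python
def f_ratio(s1, s2):
    """F = larger variance / smaller variance, so F >= 1."""
    v1, v2 = s1 ** 2, s2 ** 2
    return max(v1, v2) / min(v1, v2)

# hypothetical standard deviations of two methods
F_calc = f_ratio(0.12, 0.08)
print(round(F_calc, 2))  # compare with the tabulated critical F value
```

If F_calc exceeds the tabulated critical value, the precisions of the two methods differ significantly.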
Using the Q test
• This ratio is then compared with rejection
values Qcrit found in Table 2.6.
QCalc = outlier difference/range.
If QCalc > QTable, then reject the outlier as due to a systematic error.
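A sketch of the Q test; the data set and the critical value Qcrit = 0.829 (a commonly tabulated 95% value for N = 4, assumed here rather than read from Table 2.6) are illustrative:

```python
def q_calc(data):
    """Q = |suspect value - nearest neighbour| / (largest - smallest)."""
    s = sorted(data)
    gap_low = s[1] - s[0]    # gap if the smallest value is the suspect
    gap_high = s[-1] - s[-2] # gap if the largest value is the suspect
    spread = s[-1] - s[0]
    # test whichever end has the larger gap as the suspect value
    return max(gap_low, gap_high) / spread

# hypothetical replicate results with a suspect low value
Q = q_calc([5.12, 5.15, 5.18, 4.80])
Q_crit = 0.829  # assumed 95% value for N = 4
print(Q > Q_crit)  # True: the 4.80 result would be rejected
```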
Eg:
To test the null hypothesis at the 5% probability
level (observed differences occur by chance 5
times in 100).