
Statistics in Analytical Chemistry-Part 1: Instructor: Nguyen Thao Trang

This document discusses errors in analytical chemistry measurements. It defines types of errors like systematic, random, and gross errors. It also covers important terms like precision, accuracy, mean, and median. Significant figures rules for addition, subtraction, multiplication, and division are provided. The document discusses calculating absolute and relative errors. It explains that random errors cannot be eliminated and cause replicate measurements to vary randomly around the mean.

Uploaded by

Leo Pis

Analytical Chemistry

Chapter 2
Statistics in Analytical Chemistry- Part 1

Instructor: Nguyen Thao Trang


Outline
• Errors in chemical analysis
– Important terms
– Significant figures
– Systematic errors
– Random errors

• Statistical treatment of random errors
– Gaussian distribution
– Error propagation
– Confidence interval

Introduction
• All measurements involve errors and uncertainties.
• Example: errors involved in a titration

[Figures: chem.uiuc.edu, chem-ilp.net]

– Difference in the color of the solution at the endpoint: caused by the experimenter.
– Difference in the volume of titrant used: caused by personal error, failure to calibrate the buret, …

Introduction
• Measurement errors are an inherent part of the quantized world we live in → it is impossible to perform a chemical analysis that is free of errors and uncertainties.

• Errors and uncertainties need to be minimized in a chemical analysis. Frequent calibration, analysis of known samples, and careful technique lessen some errors.

• The size of the errors needs to be estimated to decide whether it is acceptable.

Important terms
• Mean: $\bar{X}$ is the numerical average:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

where $X_i$ is the ith measurement, and n is the number of independent measurements.

Important terms
• Median:
– $X_{med}$ is the middle value when the data are ordered from the smallest to the largest value.
– Odd number of measurements: the median is the middle value.
– Even number of measurements: the median is the average of the (n/2)th and the (n/2 + 1)th measurements, where n is the number of measurements.

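The two definitions above can be checked with a short Python sketch. It assumes the six replicate iron results (in ppm) that appear in the worked example later in these slides, and uses the standard-library `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

# Six replicate iron results (ppm), taken from the worked example later in these slides.
results = [19.4, 19.5, 19.6, 19.8, 20.1, 20.3]

x_bar = mean(results)    # sum of all values divided by n
x_med = median(results)  # n is even, so this averages the 3rd and 4th ordered values

print(f"mean   = {x_bar:.1f} ppm")
print(f"median = {x_med:.1f} ppm")
```

With an even number of results, the median is not one of the measured values but the average of the two central ones.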
Important terms
• Precision: refers to the closeness of results obtained from identical measurements → describes reproducibility.

• Accuracy: describes how close a measurement is to the true value and is expressed by the error.

• Precision and accuracy are both achieved when results are close to each other and to the true value.

Significant figures
• Significant figures: the number of digits reported in a measurement reflects the accuracy of the measurement and the precision of the measurement device.

• Significant figures are all certain figures plus one extra figure having some uncertainty.
• Example:

Significant figures
• Rule 1: Disregard all initial zeros; all remaining digits, including terminal zeros and zeros between nonzero digits, are significant.

• Examples: determine the number of significant figures in
a. 0.005
b. 0.030
c. 0.207
d. 92500

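Rule 1 can be expressed as a small Python sketch. The helper name `count_sig_figs` is hypothetical, and the function assumes plain decimal strings without exponents. It follows Rule 1 exactly as stated, so terminal zeros in a whole number such as 92500 are counted as significant, even though many conventions treat that case as ambiguous.

```python
def count_sig_figs(number: str) -> int:
    """Count significant figures in a plain decimal string, following Rule 1 as stated."""
    digits = number.lstrip("+-").replace(".", "")
    # Initial zeros are disregarded; everything that remains is counted,
    # including zeros between nonzero digits and terminal zeros.
    return len(digits.lstrip("0"))


for value in ["0.005", "0.030", "0.207", "92500"]:
    print(value, "->", count_sig_figs(value))
```

The input is kept as a string on purpose: converting "0.030" to a float would discard the terminal zero that Rule 1 counts.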
Significant figures
• Rule 2: For addition and subtraction, the smallest number of digits to the right of the decimal point sets the significance.
• Examples:

1.362 + 3.111 = 4.473
22.989 770 + 35.453 = 58.442 770 → 58.443 (the last three digits are not significant; rounding up)

• Rule for rounding to drop all insignificant digits: round up for digits ≥ 5, round down for digits < 5.
• Exercises:
1) Round to 3 significant figures: 0.135 2; 0.021 674
2) Write the answer with the correct number of digits: 12.3 – 1.63 = ; 1.021 + 1.63 =

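A minimal Python sketch of Rule 2, using the standard-library `decimal` module so that the number of decimal places in each term is preserved. The function name `add_with_sig_figs` is an assumption made for illustration, and the rounding mode is chosen to match the "round up for digits ≥ 5" rule on the slide.

```python
from decimal import Decimal, ROUND_HALF_UP


def add_with_sig_figs(*terms: str) -> Decimal:
    """Add decimal strings (prefix a term with '-' to subtract) and round per Rule 2."""
    values = [Decimal(t) for t in terms]
    # The term with the fewest decimal places has the largest (least negative) exponent.
    exponent = max(v.as_tuple().exponent for v in values)
    quantum = Decimal(1).scaleb(exponent)
    return sum(values).quantize(quantum, rounding=ROUND_HALF_UP)


print(add_with_sig_figs("1.362", "3.111"))
print(add_with_sig_figs("22.989770", "35.453"))
```

Passing the numbers as strings matters here: a float would not remember how many decimal places were written.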
Significant figures
• Rule 3: For multiplication and division, the smallest number of significant digits determines the significance.
• Examples:

$(3.26 \times 10^{-5}) \times 1.78 = 5.80 \times 10^{-5}$
$34.60 \div 2.462\,87 = 14.05$

• Exercise:
Write the answer with the correct number of digits: 4.34 × 9.2 = 39.928 (unrounded)

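Rule 3 amounts to rounding the raw product or quotient to the smallest number of significant figures among the inputs. The sketch below assumes the caller supplies that count; `round_sig` is a hypothetical helper, and it returns a string in scientific notation so that terminal zeros stay visible.

```python
def round_sig(value: float, sig_figs: int) -> str:
    """Format value in scientific notation with the requested number of significant figures."""
    return f"{value:.{sig_figs - 1}e}"


# Both factors have 3 significant figures, so the product keeps 3.
print(round_sig(3.26e-5 * 1.78, 3))
```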
Significant figures
• Rule 4:
– Number of digits in the mantissa of log x = number of significant figures in x.
• Example:

– Number of digits in antilog x ($10^x$) = number of significant figures in the mantissa of x.
• Example:

• Exercises: find the number of significant figures of these numbers:
log 0.001 237 = ? ; log 3.2 = ?
antilog 4.37 = ? ; $10^{2.600}$ = ?

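As a hedged illustration of Rule 4, the Python sketch below formats a logarithm and an antilogarithm with the number of digits the rule prescribes. The function names and the inputs (339 and −2.530) are assumptions chosen for illustration; they are not the slide's own examples or exercises.

```python
import math


def log10_with_sig_figs(x: str) -> str:
    """Return log10(x) with as many mantissa (decimal) digits as x has significant figures."""
    sig_figs = len(x.replace(".", "").lstrip("0"))
    return f"{math.log10(float(x)):.{sig_figs}f}"


def antilog10_with_sig_figs(x: str) -> str:
    """Return 10**x with as many significant figures as x has mantissa (decimal) digits."""
    mantissa_digits = len(x.split(".")[1]) if "." in x else 0
    return f"{10 ** float(x):.{max(mantissa_digits - 1, 0)}e}"


print(log10_with_sig_figs("339"))         # 339 has 3 significant figures
print(antilog10_with_sig_figs("-2.530"))  # the mantissa .530 has 3 digits
```

The digits before the decimal point of a logarithm (the characteristic) only locate the decimal point of x, which is why they are not counted.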
Errors
• Absolute error E: in the measurement of a quantity x, it is given by the equation

$$E = X_i - X_t$$

where $X_t$ is the true or accepted value.

– Example: results from 6 replicate determinations of iron in aqueous samples of a standard solution containing 20.0 ppm iron(III):

1st: 19.4; 2nd: 19.5; 3rd: 19.6; 4th: 19.8; 5th: 20.1; 6th: 20.3.

• Absolute error of the 4th replicate: E = 19.8 − 20.0 = −0.2 ppm
• Absolute error of the 5th replicate: E = 20.1 − 20.0 = 0.1 ppm

– The sign is retained when stating the absolute error.

Errors
• Relative error $E_r$: a more useful quantity than the absolute error.

$$E_r = \frac{X_i - X_t}{X_t} \qquad \%E_r = \frac{X_i - X_t}{X_t} \times 100\%$$

– Example: results from six replicate determinations of iron in aqueous samples of a standard solution containing 20.0 ppm iron(III):

1st: 19.4; 2nd: 19.5; 3rd: 19.6; 4th: 19.8; 5th: 20.1; 6th: 20.3
Mean = 19.8
Relative error for the mean:
$E_r$ = (19.8 − 20.0) × 100% / 20.0 = −1%

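The absolute and relative errors for the iron example can be reproduced with a few lines of Python. This is a minimal sketch with illustrative variable names; the mean is rounded to one decimal place first, as on the slide.

```python
true_value = 20.0  # ppm iron(III) in the standard solution
results = [19.4, 19.5, 19.6, 19.8, 20.1, 20.3]

# Absolute error of each replicate, sign retained.
absolute_errors = [round(x - true_value, 1) for x in results]

# Relative error of the mean, using the mean rounded to one decimal place.
mean_result = round(sum(results) / len(results), 1)
relative_error_pct = (mean_result - true_value) / true_value * 100

print("absolute errors (ppm):", absolute_errors)
print(f"mean = {mean_result} ppm, relative error = {relative_error_pct:.0f}%")
```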
Errors
• Results can be precise without being accurate or accurate
without being precise.

Source: Fundamentals of Analytical Chemistry, Skoog, D. A.
Errors
• Every measurement has some uncertainty, called experimental error.
• Experimental error is classified as systematic or random.
• Systematic errors:
– Also called determinate errors, they arise from a flaw in the equipment or the design of an experiment. If you conduct the experiment again in exactly the same manner, the error is reproducible.
– In principle, systematic error can be discovered and corrected.
– Example: a solution of known pH 7.20 gives a measured pH of 7.38 → the reading is 0.18 unit too high. When you read a pH of 7.00, what is the actual pH of the sample?

[Figure: www.twinklinghope.wordpress.com]
Systematic errors
• 3 types of systematic errors:

– Instrumental errors: are caused by non-ideal instrument behavior, by faulty calibrations, or by use under inappropriate conditions. Calibration or proper use eliminates most systematic errors of this type.

– Method errors: arise from non-ideal chemical or physical behavior of analytical systems. Errors inherent in a method are often difficult to detect and are thus the most serious of the three types of systematic error.

– Personal errors: result from the carelessness, inattention, or personal limitations of the experimenter.

Random errors
• Random errors:
– Also called indeterminate errors, they arise from uncontrolled variables in the measurement.
– Random error has an equal chance of being positive or negative.
– It is always present and cannot be corrected.
– Example: reading a scale between two marks: 58.? (58.2, 58.3, or 58.4)

Gross errors
• Gross errors:
– Gross errors differ from indeterminate and determinate errors. They usually occur only occasionally, are often large, and may cause a result to be either high or low.

– They are often the product of human error.

– Example: loss of precipitate before weighing → low result; touching a weighing bottle with bare hands after zeroing the balance → high mass reading.

– Gross errors lead to outliers, results that appear to differ markedly from all other data in a set of replicate measurements.

– Statistical tests can be performed to determine whether a result is an outlier.

Statistical treatment of random errors
• Random or indeterminate errors exist in every measurement.

• They can never be totally eliminated and are often the major source of uncertainty in a determination.

• The accumulated effect of the individual uncertainties causes replicate measurements to fluctuate randomly around the mean of the set.

Statistical treatment of random errors
• Distribution of random errors:
– Example: calibration of a 10 mL pipet, replicated 50 times.

→ The distribution of replicate data from most quantitative analytical experiments approaches a Gaussian (bell-shaped) curve:

$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}$$

μ: population mean
σ: population standard deviation

Source: Fundamentals of Analytical Chemistry, Skoog, D. A.
Statistical treatment of random errors
• Statistical analysis is based on the assumption that random errors in analytical results follow a Gaussian, or normal, distribution.

• A population is the collection of all measurements of interest; it can be real and finite, or hypothetical or conceptual.

• A population is characterized by taking a sample from it.

• The larger the number of measurements in the sample, the closer their distribution comes to the normal distribution.

Properties of Gaussian curve
• Difference between the sample mean $\bar{X}$ and the population mean μ:

$$\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}$$
where N is the number of measurements in the sample set.

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
where N is the number of measurements in the population.

• When no systematic errors are present, the population mean is also the true value.
• When the number of measurements in the sample set is small, $\bar{X}$ generally differs from μ.
• The probable difference between $\bar{X}$ and μ decreases as the number of measurements making up the sample increases.

Properties of Gaussian curve
• Population standard deviation σ: a measure of the precision of a population of data.

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

where N is the number of data points that make up the population.

If $z = \frac{X - \mu}{\sigma}$, then

$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-z^2/2}$$

Source: Fundamentals of Analytical Chemistry, Skoog, D. A.
Properties of Gaussian curve
• Area under the Gaussian curve: the area between a pair of limits gives the probability that a measured value falls between those limits.
– Example: calculate the probability that a measured value lies within ±σ of the mean.

$$\text{area} = \int_{\mu-\sigma}^{\mu+\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/2\sigma^2}\, dx = \int_{-1}^{+1} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz = 0.683$$

~68.3% of the values lie within ±σ (z = ±1).
~99.7% of the values lie within ±3σ (z = ±3).

Source: Fundamentals of Analytical Chemistry, Skoog, D. A.
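The area between −z and +z under the standard Gaussian curve has a closed form in terms of the error function, erf(z/√2). The Python sketch below uses `math.erf` from the standard library to evaluate it; the function name `probability_within` is illustrative.

```python
import math


def probability_within(z: float) -> float:
    """Area under the standard Gaussian curve between -z and +z."""
    return math.erf(z / math.sqrt(2))


for z in (1, 2, 3):
    print(f"within ±{z}σ: {probability_within(z):.1%}")
```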
Properties of Gaussian curve
• The area under the entire Gaussian curve = 1 → 100% of the values making up the population lie within ±∞.

Properties of Gaussian curve

Source: Quantitative Chemical Analysis, Daniel Harris

Sample standard deviation
• Sample standard deviation s (absolute standard deviation):

$$s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}} = \sqrt{\frac{\sum_{i=1}^{N} d_i^2}{N - 1}}$$

• Where $(X_i - \bar{X})$ represents the deviation $d_i$ of the value $X_i$ from the mean $\bar{X}$.
• (N − 1) is the number of degrees of freedom.
• Alternative expression of s:

$$s = \sqrt{\frac{\sum_{i=1}^{N} X_i^2 - \frac{\left(\sum_{i=1}^{N} X_i\right)^2}{N}}{N - 1}}$$

Sample standard deviation
• Sample standard deviation s:
– Example:
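As a stand-in illustration (not the slide's original example), the Python sketch below computes s for the six iron results used earlier in these slides. It evaluates both the definition and the alternative expression, and compares them with `statistics.stdev` from the standard library.

```python
import math
import statistics

results = [19.4, 19.5, 19.6, 19.8, 20.1, 20.3]  # ppm iron, from the earlier example
n = len(results)
x_bar = sum(results) / n

# Definition: root of the summed squared deviations over N - 1 degrees of freedom.
s_definition = math.sqrt(sum((x - x_bar) ** 2 for x in results) / (n - 1))

# Alternative expression using the sum of squares and the square of the sum.
sum_x = sum(results)
sum_x2 = sum(x * x for x in results)
s_alternative = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(f"s (definition)  = {s_definition:.3f} ppm")
print(f"s (alternative) = {s_alternative:.3f} ppm")
print(f"statistics.stdev = {statistics.stdev(results):.3f} ppm")
```

The alternative expression subtracts two large, nearly equal numbers, so it is more sensitive to rounding of intermediate values than the definition is.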

Sample standard deviation
• Pooling data to increase the reliability of s:
– The pooled estimate of σ, $s_{pooled}$, is a weighted average of the individual estimates:

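The slide's own formula did not survive extraction. A common textbook form, assumed here, sums the squared deviations of every data set about its own mean and divides by the total number of degrees of freedom: $s_{pooled} = \sqrt{\frac{\sum_i (x_i - \bar{x}_1)^2 + \sum_j (x_j - \bar{x}_2)^2 + \cdots}{N_1 + N_2 + \cdots - N_t}}$, where $N_t$ is the number of data sets pooled. The Python sketch below implements that form; the function name and the two small data sets are hypothetical.

```python
import math


def pooled_std(*data_sets: list[float]) -> float:
    """Pooled standard deviation of several data sets, each about its own mean."""
    sum_sq_dev = 0.0
    degrees_of_freedom = 0
    for data in data_sets:
        set_mean = sum(data) / len(data)
        sum_sq_dev += sum((x - set_mean) ** 2 for x in data)
        degrees_of_freedom += len(data) - 1  # one degree of freedom lost per set
    return math.sqrt(sum_sq_dev / degrees_of_freedom)


# Hypothetical data sets, for illustration only.
print(f"{pooled_std([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]):.3f}")
```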
Sample standard deviation
• Variance ($s^2$): can be used to describe the precision of the data.

$$s^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1} = \frac{\sum_{i=1}^{N} d_i^2}{N - 1}$$

• Relative standard deviation (RSD):

$$RSD = \frac{s}{\bar{X}}$$

– The result is often expressed in ppt (parts per thousand):

$$RSD \text{ in ppt} = \frac{s}{\bar{X}} \times 1000 \text{ ppt}$$

– The result is also expressed in percent, as the coefficient of variation (CV):

$$CV = \frac{s}{\bar{X}} \times 100\%$$

• Spread or range (w): describes the precision of a set of replicate results.

$$w = \text{largest value} - \text{smallest value}$$

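These precision measures can be computed together. The Python sketch below applies them to the six iron results from the earlier slides, using the standard-library `statistics` module; the variable names are illustrative.

```python
import statistics

results = [19.4, 19.5, 19.6, 19.8, 20.1, 20.3]  # ppm iron, from the earlier example

x_bar = statistics.mean(results)
s = statistics.stdev(results)  # sample standard deviation (N - 1 in the denominator)

variance = s ** 2
rsd_ppt = s / x_bar * 1000     # relative standard deviation in parts per thousand
cv_percent = s / x_bar * 100   # coefficient of variation in percent
spread = max(results) - min(results)

print(f"variance = {variance:.3f}")
print(f"RSD      = {rsd_ppt:.1f} ppt")
print(f"CV       = {cv_percent:.1f} %")
print(f"spread   = {spread:.1f} ppm")
```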
Error propagation
• Addition/subtraction:

If $y = a + b - c$, then $s_y = \sqrt{s_a^2 + s_b^2 + s_c^2}$

• Example:
Standard deviation of the result:

• Multiplication/division:

If $y = a \times b / c$, then $\frac{s_y}{y} = \sqrt{\left(\frac{s_a}{a}\right)^2 + \left(\frac{s_b}{b}\right)^2 + \left(\frac{s_c}{c}\right)^2}$

• Example:

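The two rules above can be sketched in Python as follows. The helper names and all numerical values are hypothetical, chosen only to show the arithmetic: absolute standard deviations combine in quadrature for sums and differences, and relative standard deviations combine in quadrature for products and quotients.

```python
import math


def s_sum(*abs_sds: float) -> float:
    """Standard deviation of a sum or difference: absolute deviations in quadrature."""
    return math.sqrt(sum(s ** 2 for s in abs_sds))


def rel_s_product(*value_sd_pairs: tuple[float, float]) -> float:
    """Relative standard deviation of a product or quotient: relative deviations in quadrature."""
    return math.sqrt(sum((s / value) ** 2 for value, s in value_sd_pairs))


# Hypothetical values: y = 0.50(±0.02) + 4.10(±0.03) - 1.97(±0.05)
print(f"s_y (sum)      = {s_sum(0.02, 0.03, 0.05):.3f}")

# Hypothetical values: y = 4.10(±0.02) × 0.0050(±0.0001) / 1.97(±0.04)
y = 4.10 * 0.0050 / 1.97
rel = rel_s_product((4.10, 0.02), (0.0050, 0.0001), (1.97, 0.04))
print(f"s_y/y (product) = {rel:.4f}, s_y = {y * rel:.6f}")
```

Note that the sign of each term (plus or minus, multiply or divide) does not matter: uncertainties always add in quadrature.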
Error propagation
• Exponential:

If $y = a^x$, then $\frac{s_y}{y} = x\,\frac{s_a}{a}$ (the exponent x can be considered free of uncertainty).

• Example:

Error propagation
• Logarithm and antilogarithm:

If $y = \log x$, then $s_y = \frac{1}{\ln 10}\,\frac{s_x}{x} \cong 0.434\,29\,\frac{s_x}{x}$

If $y = 10^x$, then $\frac{s_y}{y} = (\ln 10)\, s_x \cong 2.302\,6\, s_x$

• Examples:

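The propagation rules for powers, logarithms, and antilogarithms can be written as three one-line Python functions. The function names are hypothetical, and the numerical factors are computed from `math.log(10)` rather than typed in.

```python
import math


def rel_s_power(a: float, s_a: float, x: float) -> float:
    """Relative standard deviation of y = a**x, with the exponent x free of uncertainty."""
    return abs(x) * s_a / a


def s_log10(x: float, s_x: float) -> float:
    """Absolute standard deviation of y = log10(x)."""
    return s_x / (math.log(10) * x)


def rel_s_antilog10(s_x: float) -> float:
    """Relative standard deviation of y = 10**x."""
    return math.log(10) * s_x


print(f"1/ln 10 = {1 / math.log(10):.5f}")
print(f"ln 10   = {math.log(10):.4f}")
```

The log and antilog rules mirror each other: taking a logarithm turns a relative uncertainty in x into an absolute uncertainty in y, and taking an antilogarithm does the reverse.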
Confidence intervals (CI)
• The confidence interval for the mean is the range of values within which the population mean μ is expected to lie with a certain probability.
• Example: it is 99% probable that the true population mean for a set of calcium measurements lies in the interval 7.25% ± 0.15% Ca. Thus, the population mean should lie in the interval from 7.10% to 7.40% Ca with 99% probability.

• 99% → confidence level
• 7.10% to 7.40% → confidence interval
• 7.10%, 7.40% → confidence limits

[Figure: % calcium (Ca) scale marked at 7.10%, 7.25%, and 7.40%; 99% chance that the true value lies in this interval]

CI when σ is known or s is a good approximation of σ

• For a single measurement:
– CI for μ = X ± zσ (z comes from the area under the Gaussian curve)
– % confidence is the % area defined by ±z.

z = ±0.67 → 50% probability that μ will fall in the interval X ± 0.67σ
z = ±2.58 → 99% probability that μ will fall in the interval X ± 2.58σ

– The probability that a result lies outside the confidence interval is often called the significance level.
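The z value for a given confidence level is the point that leaves half of the remaining area in each tail of the Gaussian curve. The Python sketch below obtains it from `statistics.NormalDist` in the standard library; the function name `z_for_confidence` is illustrative.

```python
from statistics import NormalDist


def z_for_confidence(level: float) -> float:
    """Two-sided z value such that the area between -z and +z equals `level`."""
    return NormalDist().inv_cdf((1 + level) / 2)


for level in (0.50, 0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_for_confidence(level):.2f}")
```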
CI when σ is known or s is a good approximation of σ
– Values for z at various confidence levels are listed in Table 7-1.

• For a series of measurements:
– CI for μ = $\bar{X} \pm \frac{z\sigma}{\sqrt{N}}$, where $\bar{X}$ is the experimental mean of N measurements.

CI when σ is known or s is a good approximation of σ

• Example 1: Determine the 80% and 95% confidence intervals for (a) the first entry (1108 mg/L glucose) and (b) the mean value for month 1. Assume that in each part, s = 19 is a good estimate of σ.

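Part (a) of this example can be worked through with a short Python sketch. It assumes σ = 19 mg/L as stated, and takes z from `statistics.NormalDist`; the function name `ci_half_width` is hypothetical. The same function handles part (b) once the number of measurements in month 1, which is not shown in this text, is passed as `n`.

```python
import math
from statistics import NormalDist


def ci_half_width(sigma: float, level: float, n: int = 1) -> float:
    """Half-width z*sigma/sqrt(n) of the confidence interval for the mean of n measurements."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    return z * sigma / math.sqrt(n)


first_entry = 1108  # mg/L glucose
for level in (0.80, 0.95):
    half_width = ci_half_width(19, level)
    print(f"{level:.0%} CI: {first_entry} ± {half_width:.1f} mg/L")
```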
CI when σ is known or s is a good approximation of σ

• Example 2: How many replicate measurements in month 1 are needed to decrease the 95% confidence interval to 1100.3 ± 10.0 mg/L of glucose?

→ 14 measurements are needed to provide a slightly better than 95% chance that the population mean will lie within ±10 mg/L of the experimental mean.

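The result of 14 measurements follows from solving zσ/√N ≤ 10.0 for N and rounding up to the next whole number. A minimal Python sketch, assuming σ = 19 mg/L from Example 1 and a hypothetical function name:

```python
import math
from statistics import NormalDist


def replicates_needed(sigma: float, half_width: float, level: float) -> int:
    """Smallest N for which z*sigma/sqrt(N) does not exceed the requested half-width."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    return math.ceil((z * sigma / half_width) ** 2)


print(replicates_needed(sigma=19, half_width=10.0, level=0.95))
```

Because N must be a whole number, rounding up gives an interval slightly narrower than requested, which is why the chance is slightly better than 95%.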
CI when σ is unknown
• Often, limitations in time or in the amount of available sample prevent us from assuming that s is a good estimate of σ.
• Use the statistical parameter t (Student's t), which is defined in exactly the same way as z except that s is substituted for σ.
• For a single measurement with result x:

$$t = \frac{x - \mu}{s}$$

• For the mean of N measurements:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{N}}$$

• CI for the mean of N replicate measurements:

$$\text{CI for } \mu = \bar{x} \pm \frac{ts}{\sqrt{N}}$$

Note: t depends on the desired confidence level and on the number of degrees of freedom (N − 1) in the calculation of s.
CI when σ is unknown
• Values of t at different degrees of freedom and confidence levels:

CI when σ is unknown
• Example 1: A chemist obtained the following data for the alcohol content of a sample of blood: % C2H5OH: 0.084, 0.089, and 0.079. Calculate the 95% confidence interval for the mean assuming:
(a) The three results obtained are the only indication of the precision of the method.

CI when σ is unknown
(b) From previous experience with hundreds of samples, we know that the standard deviation of the method, s = 0.005% C2H5OH, is a good estimate of σ.

→ A sure knowledge of σ can decrease the confidence interval by a significant amount (±0.006% with σ known, as compared to ±0.012% with σ unknown).

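Both parts of this example can be reproduced with a short Python sketch. The t value (4.30 for 2 degrees of freedom at the 95% level) and the z value (1.96) are standard table values entered by hand here, since the slide's table did not survive extraction; the variable names are illustrative.

```python
import math
import statistics

data = [0.084, 0.089, 0.079]  # % C2H5OH
n = len(data)
x_bar = statistics.mean(data)
s = statistics.stdev(data)

T_95_2DOF = 4.30  # Student's t, 95% confidence, N - 1 = 2 degrees of freedom (table value)
Z_95 = 1.96       # z for 95% confidence (table value)

# (a) sigma unknown: use t and the s calculated from the three results.
half_width_t = T_95_2DOF * s / math.sqrt(n)

# (b) sigma known from long experience: use z with s = 0.005 as the estimate of sigma.
half_width_z = Z_95 * 0.005 / math.sqrt(n)

print(f"(a) {x_bar:.3f} ± {half_width_t:.3f} % C2H5OH")
print(f"(b) {x_bar:.3f} ± {half_width_z:.3f} % C2H5OH")
```

The two intervals use the same standard deviation of 0.005%; the only difference is t versus z, which shows how much the small number of degrees of freedom widens the interval.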
