
Statistics - Review 26 Oct 2022

The mid-term examination for CHEM2241 will take place on November 2nd from 10:30 am to 12:20 pm. It will consist of 5 compulsory questions covering topics such as error propagation, statistical data treatment, confidence intervals, t-tests, F-tests, and Q-tests. The exam is worth 20% of the student's overall grade and the higher of the mid-term or final exam mark will be used. Students should review their lab notes, textbook, and lecture materials in preparation with a focus on calculations, experiments, and statistical formulas.

Uploaded by

Tsz Wun CHOW

CHEM2241 tutorial

Oct 26, 2022


Mid-term Examination
• Nov. 02, 2022 (Wed), 10:30 am – 12:20 pm
• 2-hour examination
• 5 compulsory questions on statistics: error types and propagation; statistical data treatment and evaluation (confidence intervals, t-test, F-test, Q-test); sampling, standardization, and calibration
• 20% weighting
• The test mark will be replaced by the examination mark if the latter is higher
Mid-term Examination format and syllabus
Questions on calculation and explanation/discussion:
• Calculation (bookwork and challenging questions)
• Know your experiments (lab classes and lectures); review the textbook and lecture notes
• Statistical formulas and tables will be provided
• Spectroscopic methods (instrumentation) will not be examined
Examination techniques

• Arrive on time
• Bring a functional calculator (approved by the Exam Secretary)
• Use 5-10 minutes to check all questions; you may ask questions related to the examination paper only in the first 30 minutes
• Reserve 10 minutes for checking your answers at the end of the examination
• Allow about 1 minute per mark when answering questions
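Calibration curve
[Figure: intensity versus concentration with the fitted line y = mx + b. The line is chosen so that the sum of the squared vertical deviations, d1² + d2² + d3² + d4², is a minimum. The uncertainty of the line leads to Conc. = ? ± ? M.]

Uncertainties in numerical computations
[Figure: a measured mass (4.635 ± 0.002 g) and a measured volume (1.13 ± 0.05 mL) are combined into a computed density = ? ± ? g/mL.]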
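The density question can be worked through with the usual propagation rule for a quotient, in which the relative uncertainties add in quadrature. The sketch below assumes the two uncertainties are independent random errors; the function name `divide_with_uncertainty` is only illustrative.

```python
import math

def divide_with_uncertainty(a, sa, b, sb):
    """Return (q, sq) for q = a / b, assuming independent random errors sa and sb."""
    q = a / b
    # For multiplication and division, relative uncertainties combine in quadrature.
    relative = math.sqrt((sa / a) ** 2 + (sb / b) ** 2)
    return q, abs(q) * relative

# Values from the slide: mass in g, volume in mL.
density, s_density = divide_with_uncertainty(4.635, 0.002, 1.13, 0.05)
print(f"density = {density:.2f} ± {s_density:.2f} g/mL")
```

The volume term dominates: its relative uncertainty (about 4%) is roughly a hundred times that of the mass, so the result is limited by the volume measurement.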

Topic                                                                                      Chapter
Errors in chemical analysis: systematic and random errors                                  5, 6
Statistical data treatment and evaluation: confidence intervals, t-test, F-test, Q-test    7
Quality assurance: sampling, standardization, and calibration                              8
Chapter 6D, Skoog and West’s Fundamentals of Analytical Chemistry, 9th edition, Brooks/Cole (2014)
Take-home message: Tools of Analytical Chemistry
• Chapter 2: Chemicals, apparatus, and unit operations of analytical chemistry
• Chapters 5-6: Errors in chemical analysis
• Chapter 7: Statistical data treatment and evaluation
• Chapter 8: Standardization and calibration

2017 CHEM2241 IKC 6


Tolerances of Laboratory Apparatus

*Class B volumetric apparatus. Tolerances for Class A ≅ half of Class B.

Typical sample mass = 1 g


Relative error of a balance = (0.1 × 10⁻³ g)/(1 g) = 1 × 10⁻⁴
The most accurate apparatus in the laboratory is the analytical balance.
Therefore, other laboratory apparatus is calibrated using the analytical balance.
Reference:
Vogel’s Textbook of Quantitative Chemical analysis, 5/e, p. 78-89, Longman (1989).
Significant Figures
Significant figures in a number are all the certain digits plus the first
uncertain digit, i.e., the minimum number of digits required to express a
value in scientific notation without loss of precision

Rule 1. The last digit of a value is known to within ± 1


Example
• Weight measured with an analytical balance is reported to 0.1 mg (0.0001 g), e.g., 12.3456 g implies 12.3456 ± 0.0001 g
• Volume of a pipet is reported to 0.01 mL, e.g., 25.00 mL
• Volume of solution delivered by a buret is reported to 0.01 mL

If possible, uncertainty will be expressed as the standard deviation of the mean or a confidence interval.
Chemical analysis may involve several steps. The random error of
each step contributes to the overall error of the measurement.

Example of error propagation
(Mixed operations)

Consider the following operations:

Applications of statistics to Analytical Chemistry
1. Establish confidence limits
2. Determine if two means differ (t test)
3. Calibration curves (least-squares methods)
4. Determine if the precision of two sets of measurements differs (F test)
5. Determine if an outlier can be discarded (Q test)
Statistics: some basics
Gaussian Distribution
µ = population mean (average)

$$y = \frac{e^{-(x-\mu)^2/2\sigma^2}}{\sigma\sqrt{2\pi}}$$
Gaussian distribution
Population distribution (raw data): $\mu = \bar{x} \pm z\sigma$ (see Table 7-1, p. 125)

Confidence limits
• 50% confidence level that the true mean μ lies within ±0.67σ
• 95% confidence level that the true mean μ lies within ±1.96σ
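The two z values quoted above can be checked against the Gaussian curve itself: the fraction of the population within ±zσ of the mean equals erf(z/√2). The following is a minimal sketch using only the Python standard library; the helper name `fraction_within` is an assumption for illustration.

```python
import math

def fraction_within(z):
    """Fraction of a Gaussian population lying within ±z standard deviations of the mean."""
    return math.erf(z / math.sqrt(2))

for z in (0.67, 1.96):
    print(f"±{z}σ covers {100 * fraction_within(z):.1f}% of the population")
```

The values come out close to 50% and 95%, which is why z = 0.67 and z = 1.96 appear in the table for those confidence levels.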



http://onlinestatbook.com/stat_sim/sampling_dist/index.html
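Gaussian Distribution: Normal error curve
[Figure: left panel, the population distribution (frequency versus measured volume), centered on the average (mean). Right panel, the distribution of sample means after sampling N measurements at a time (frequency versus measured mean volume), centered on the mean of sample means, with a width given by the standard error of the mean.]

$s/\sqrt{N}$ is the standard error of the mean.

The standard error of the mean becomes smaller as N increases:
• N > 5: the change in the standard error of the mean becomes moderate
• N > 10: < 1% improvement in precision

For N measurements on a subset/sample, for large samples:

$$\mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}}$$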
Statistical Data Treatment and Evaluation
Inferring population mean from sample mean
[Figure: a sample of N measurements with mean x̄ and standard deviation s is drawn from the population (µ, σ) and used to infer µ.]
• Two to five replicates of a sample are analyzed
• The mean (average) or the median represents the best estimate of the true value, assuming no systematic error
• Variation in the data (precision or standard deviation) indicates the degree of confidence in the experimental result

1. Establish confidence limits

2. Determine if two means differ (t test)


Case I. Comparing a Measured Result with a “Known” Value

Case II: Decide whether two sets of replicate measurements give “the same” or “different” results,
within a stated confidence level.

Case III. Comparing Individual Differences: we use two methods to make single measurements on
several different samples. Do the two methods give the same answer “within experimental error”?



Confidence Level and Interval
Population confidence limits
• 50% confidence level that the true mean μ lies within ±0.67σ (±0.67σ = confidence interval)
• 95% confidence level that the true mean μ lies within ±1.96σ


As the sample size gets smaller, s is no longer a good estimate of σ (the population standard deviation).

For a small set of N measurements, the confidence limits (CL) for µ are given by

$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}}$$

• t is used instead of z to account for the larger uncertainty in estimating σ from a small set of data (confidence interval adjustment).
• No confidence interval adjustment is needed when s is known to be a good estimate of σ, even for a small sample size; use z.
Confidence Interval
Calculating Confidence Intervals
The carbohydrate content of a glycoprotein (a protein with sugars attached to it) is found to be
12.6, 11.9, 13.0, 12.7, and 12.5 wt% (g carbohydrate/100 g glycoprotein) in replicate analyses.
Find the 50% and 90% confidence intervals for the carbohydrate content.
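One way to work this example is to apply μ = x̄ ± ts/√N directly. The sketch below uses only the standard library; the two t values are copied from a standard two-sided Student's t table for 4 degrees of freedom and are an assumption of this sketch, not something computed by the code.

```python
import math
import statistics

# Two-sided Student's t values for 4 degrees of freedom (N - 1 = 4),
# taken from a standard t table (assumed here, not computed).
T_TABLE_DF4 = {50: 0.741, 90: 2.132}

data = [12.6, 11.9, 13.0, 12.7, 12.5]  # wt% carbohydrate, from the example
n = len(data)
mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (N - 1 in the denominator)

def half_width(t):
    """Half-width t*s/sqrt(N) of the confidence interval for the mean."""
    return t * s / math.sqrt(n)

for level, t in T_TABLE_DF4.items():
    print(f"{level}% CI: {mean:.2f} ± {half_width(t):.2f} wt%")
```

The 90% interval is wider than the 50% interval for the same data, because a larger t is needed to be more confident that the interval contains µ.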

The Meaning of a Confidence Interval

If we repeated sets of five measurements many times, half of the 50% confidence intervals are expected to include the true mean, µ. Nine-tenths of the 90% confidence intervals are expected to include the true mean, µ.


The Meaning of a Confidence Interval
Example: The population mean and population standard deviation of the nitrate content in sea water, with a Gaussian population, were 10 000 and 1 000 ppb. Four numbers were chosen, and their mean and standard deviation were calculated. This was done 100 times.

[Figure: 50% and 90% confidence intervals for the same set of random data. Filled squares are the data points whose confidence interval does not include the true population mean of 10 000.]

• 50% confidence interval (Figure a): 45 of the 100 error bars pass through the horizontal line at 10 000.
• 90% confidence interval (Figure b): 89 of the 100 error bars cross the horizontal line at 10 000.
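The experiment described above is easy to repeat numerically. The following is a rough simulation sketch, not the procedure used to make the figure: it draws four values from a Gaussian population with mean 10 000 and standard deviation 1 000, builds the t-based interval, and counts how often the interval contains the true mean. The t values for 3 degrees of freedom are taken from a standard table, and the seed and number of trials are arbitrary choices.

```python
import math
import random

# Two-sided Student's t values for 3 degrees of freedom (N = 4), from a standard t table.
T_DF3 = {50: 0.765, 90: 2.353}
N = 4
MU, SIGMA = 10_000.0, 1_000.0

def coverage(level, trials=20_000, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean MU."""
    rng = random.Random(seed)
    t = T_DF3[level]
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
        mean = sum(sample) / N
        s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (N - 1))
        if abs(mean - MU) <= t * s / math.sqrt(N):
            hits += 1
    return hits / trials

for level in (50, 90):
    print(f"{level}% intervals containing the true mean: {coverage(level):.3f}")
```

With many trials the fractions settle near 0.50 and 0.90. With only 100 trials, as in the figure, counts such as 45 and 89 are ordinary statistical scatter around those values.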



Applications of statistics to Analytical Chemistry

1. Establish confidence limits


2. Determine if two means differ (t test)
• Comparing a measured result to a “known” value
• Comparing replicate measurements
• Comparing individual differences.

3. Calibration curves (least-squares methods)


4. Determine if the precision of two sets of measurements differs (F test)
5. Determine if an outlier can be discarded (Q test)
Case 1. Comparing a Measured Result with a “Known” Value
Example: In a new method for determining selenourea in water, the following values were obtained for tap water samples spiked with 50.0 ng/ml of selenourea (Standard Reference Material sample certified by the National Institute of Standards and Technology): 50.4, 50.7, 49.1, 49.0, 51.1 ng/ml. Is there any evidence of systematic error (at the 95% confidence level)?

(Aller, A.J. and Robles, L.C., 1998, Analyst, 123: 919)

Adopting the null hypothesis that there is no systematic error, µ = 50.0:

Mean = 50.06; standard deviation = 0.956

t = |50.06 − 50.0| × √5 / 0.956 = 0.14

For 95% probability and N = 5 (DF = 4), the tabulated t = 2.78.

The observed value (0.14) < 2.78.

The null hypothesis is retained: there is no evidence of systematic error. Note again that this does not mean that there are no systematic errors, only that they have not been demonstrated.
Example
A solution is prepared from primary standard AgNO3 (µ = 100.00%) and used to test the purity of NaCl. The experimental results are 99.94, 99.91, 99.96, and 99.92% NaCl.

x̄ = 99.93%, s = 0.022, N = 4, µ = 100.00%

$$t = \frac{|\bar{x} - \mu|}{s}\sqrt{N} = 6.4$$

From the t-table, for 3 degrees of freedom at the 95% confidence level, t = 3.2. Since 6.4 > 3.2, the difference is significant, and the purity of the NaCl sample is suspect.
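Both Case 1 examples (selenourea and NaCl purity) follow the same recipe, which can be sketched as a small function. The critical values below are taken from a standard two-sided t table and are an assumption of the sketch. Note that the code keeps the unrounded mean, so the NaCl example gives t of about 6.1 rather than the 6.4 obtained from the rounded values x̄ = 99.93 and s = 0.022; the conclusion is the same.

```python
import math
import statistics

def t_one_sample(data, mu):
    """t = |mean - mu| * sqrt(N) / s for replicate results compared with a known value."""
    n = len(data)
    return abs(statistics.mean(data) - mu) * math.sqrt(n) / statistics.stdev(data)

# Critical values (95%, two-sided) from a standard t table, keyed by degrees of freedom.
T_CRIT_95 = {3: 3.182, 4: 2.776}

examples = {
    "selenourea": ([50.4, 50.7, 49.1, 49.0, 51.1], 50.0),
    "NaCl purity": ([99.94, 99.91, 99.96, 99.92], 100.00),
}

for name, (data, mu) in examples.items():
    t = t_one_sample(data, mu)
    verdict = "significant" if t > T_CRIT_95[len(data) - 1] else "not significant"
    print(f"{name}: t = {t:.2f} ({verdict} at the 95% level)")
```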



Case 2. Comparing replicate measurements
If µ is not known with a high level of certainty, a pooled s must be used.

• t-test: test whether two means differ

We measure a quantity multiple times by two different methods that give two different answers, each with its own standard deviation. Do the two results agree with each other “within experimental error”?


t-test: test whether two means differ

$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\dfrac{s_{pooled}^2}{N_1} + \dfrac{s_{pooled}^2}{N_2}}} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}}\sqrt{\frac{N_1 N_2}{N_1 + N_2}}$$

where

$$\frac{s_{pooled}^2}{N_1} + \frac{s_{pooled}^2}{N_2} = s_{pooled}^2\,\frac{N_1 + N_2}{N_1 N_2}$$

Pooled standard deviation $s_{pooled}$:

$$s_{pooled} = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{(N_1 - 1) + (N_2 - 1)}}$$

• $s_{pooled}^2$ is a weighted average of the variances $s_1^2$ and $s_2^2$
Example
x̄1 = 1.015, s1 = 0.0055, N1 = 6
x̄2 = 1.007, s2 = 0.0054, N2 = 6
s_pooled = 0.0055

At the 90% confidence level, t = 1.81 for 10 degrees of freedom. The difference between the two experimental means should, 90 times out of 100, lie within the range obtained by rearranging the expression for t:

$$\bar{x}_1 - \bar{x}_2 = \pm\frac{t\,s_{pooled}}{\sqrt{N_1 N_2/(N_1 + N_2)}} = \pm\frac{1.81 \times 0.0055}{\sqrt{6 \times 6/(6 + 6)}} = \pm 0.0057$$

A difference of ±0.0057 or greater is expected no more frequently than 10 times out of 100. Since ∆x = 0.008, the difference is significant. This conclusion (the data sets are different) will, on average, be wrong fewer than 10 times in 100.
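The same comparison can be made by computing t directly from the summary statistics and comparing it with the tabulated value. This is a minimal sketch under the assumption, implicit in pooling, that the two methods have similar standard deviations; the function names are illustrative.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation: square root of the weighted average of the two variances."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def t_two_sample(x1, s1, n1, x2, s2, n2):
    """t = |x1 - x2| / s_pooled * sqrt(N1*N2 / (N1 + N2))."""
    sp = pooled_sd(s1, n1, s2, n2)
    return abs(x1 - x2) / sp * math.sqrt(n1 * n2 / (n1 + n2))

T_TABLE = 1.81  # 90% confidence, 10 degrees of freedom, as quoted in the example

sp = pooled_sd(0.0055, 6, 0.0054, 6)
t_calc = t_two_sample(1.015, 0.0055, 6, 1.007, 0.0054, 6)
print(f"s_pooled = {sp:.4f}, t_calculated = {t_calc:.2f}, significant: {t_calc > T_TABLE}")
```

Comparing t_calculated with t_table is equivalent to comparing ∆x = 0.008 with the ±0.0057 range: both lead to the conclusion that the two means differ at the 90% level.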



Case 3. Comparing Individual Differences:
Paired t test
Example: measurements of aluminum in 11 samples of drinking water. Results for
Method 1 are in column B and results for Method 2 are in column C. For each
sample, the two results are similar, but not identical. (95% confidence)

We find that t_calculated (1.224) is less than t_table (2.228) listed in the t-table for 95% confidence and 10 degrees of freedom. There is more than a 5% chance that the two sets of results lie “within experimental error” of each other, so we conclude that the results are not significantly different.
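The paired t test works on the per-sample differences d between the two methods: t = |d̄|·√n / s_d, with n − 1 degrees of freedom. The sketch below shows the calculation only. The aluminum data from the example are not reproduced in this text, so the numbers used here are hypothetical values made up for illustration.

```python
import math
import statistics

def t_paired(method_1, method_2):
    """t = |mean(d)| * sqrt(n) / s_d, where d are the per-sample differences."""
    d = [a - b for a, b in zip(method_1, method_2)]
    return abs(statistics.mean(d)) * math.sqrt(len(d)) / statistics.stdev(d)

# Hypothetical results for five samples (NOT the aluminum data from the example).
method_1 = [10.2, 9.8, 11.1, 10.5, 9.9]
method_2 = [10.0, 9.9, 10.8, 10.6, 9.7]
T_TABLE_95_DF4 = 2.776  # two-sided, 95%, 4 degrees of freedom (standard t table)

t_calc = t_paired(method_1, method_2)
print(f"t_calculated = {t_calc:.3f}; significant at 95%: {t_calc > T_TABLE_95_DF4}")
```

Pairing removes the sample-to-sample variation, which would otherwise swamp a small difference between the methods if the two columns were compared as independent sets.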
Class Practice
Beer's law: A = abc
Given the table of [KMnO4] vs. absorbance (at 525 nm) and that the cell length used is 1 cm, calculate
(i) the molar absorptivity ε (or a);
(ii) the concentration of an unknown KMnO4 solution with measured absorbance = 0.401.

Slope = A/c = lε, so ε = Slope / l

[Figure: calibration plot of absorbance versus concentration. The unknown sample's absorbance is read across to the fitted line and down to the unknown sample's concentration. Fitted line: y = 1450.8x + 0.0002, R² = 0.99997]
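Using the fitted line quoted above, both parts of the practice question reduce to one-line calculations. The sketch assumes the concentration axis is in mol/L (the table itself is not reproduced here), so ε comes out in L mol⁻¹ cm⁻¹; the constant and function names are illustrative.

```python
SLOPE = 1450.8      # slope of the fitted line (absorbance per mol/L, assumed units)
INTERCEPT = 0.0002  # intercept of the fitted line (absorbance units)
PATH_CM = 1.0       # cell length stated in the question

def molar_absorptivity(slope=SLOPE, path_cm=PATH_CM):
    """epsilon = slope / l, from Slope = A/c = l * epsilon."""
    return slope / path_cm

def concentration(absorbance, slope=SLOPE, intercept=INTERCEPT):
    """Invert the calibration line y = slope * x + intercept for x."""
    return (absorbance - intercept) / slope

epsilon = molar_absorptivity()
c_unknown = concentration(0.401)
print(f"epsilon = {epsilon:.1f} L/(mol cm), unknown concentration = {c_unknown:.3e} mol/L")
```

Because the intercept is so small, ignoring it (c ≈ A/slope) changes the answer only in the fourth significant figure.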
Standardization and Calibration (refer to “Fitting Straight-Line Relations”, Errors in Chemical Analyses)

• Calibration curve: n points are used; $\bar{y}$ is the mean value of y for the n points.
• The standard deviation about the regression:

$$s_r = \sqrt{\frac{S_{yy} - m^2 S_{xx}}{n - 2}}$$

• The standard deviation of the slope:

$$s_m = \frac{s_r}{\sqrt{S_{xx}}}$$

• The standard deviation of the intercept:

$$s_b = s_r\sqrt{\frac{\sum x_i^2}{n\sum x_i^2 - \left(\sum x_i\right)^2}}$$

• The standard deviation for results from the calibration curve:

$$s_c = \frac{s_r}{m}\sqrt{\frac{1}{M} + \frac{1}{n} + \frac{(\bar{y}_c - \bar{y})^2}{m^2 S_{xx}}}$$

M is the number of measurements of the unknown giving the mean $\bar{y}_c$.
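The four standard deviations above can be computed together from the calibration points. The following is a sketch under the usual least-squares assumptions (all error in y, equal variance at every point). The five calibration points and the unknown's response of 2.65 are illustrative values assumed for this sketch; they are not given in this tutorial.

```python
import math

def calibration_stats(x, y):
    """Least-squares line y = m*x + b with the standard deviations s_r, s_m and s_b."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = s_xy / s_xx
    b = y_bar - m * x_bar
    s_r = math.sqrt((s_yy - m**2 * s_xx) / (n - 2))  # about the regression
    s_m = s_r / math.sqrt(s_xx)                      # of the slope
    sum_x2 = sum(xi**2 for xi in x)
    s_b = s_r * math.sqrt(sum_x2 / (n * sum_x2 - sum(x) ** 2))  # of the intercept
    return {"n": n, "m": m, "b": b, "y_bar": y_bar, "s_xx": s_xx,
            "s_r": s_r, "s_m": s_m, "s_b": s_b}

def unknown_from_curve(fit, y_c, M):
    """Concentration x_c and its standard deviation s_c for the mean response y_c of M readings."""
    x_c = (y_c - fit["b"]) / fit["m"]
    s_c = (fit["s_r"] / fit["m"]) * math.sqrt(
        1 / M + 1 / fit["n"] + (y_c - fit["y_bar"]) ** 2 / (fit["m"] ** 2 * fit["s_xx"])
    )
    return x_c, s_c

# Illustrative five-point calibration (assumed values, not from this tutorial).
conc = [0.352, 0.803, 1.08, 1.38, 1.75]
signal = [1.09, 1.78, 2.60, 3.03, 4.01]

fit = calibration_stats(conc, signal)
x_c, s_c = unknown_from_curve(fit, y_c=2.65, M=1)
print(f"m = {fit['m']:.3f} ± {fit['s_m']:.3f}, b = {fit['b']:.3f} ± {fit['s_b']:.3f}")
print(f"x_c = {x_c:.3f} ± {s_c:.3f} (single reading)")
```

Calling `unknown_from_curve(fit, y_c=2.65, M=4)` instead shows how averaging four readings of the unknown shrinks s_c: the 1/M term drops, while the 1/n term and the distance from the centroid ȳ stay fixed.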



D.A. Skoog et al., “Fundamentals of Analytical Chemistry”, 9th edition, Example 8-5