Chapter 2 - Data Analysis II

The document discusses statistical methods used in data analysis and evaluation, including confidence limits, confidence intervals, t-tests, F-tests, outlier detection, calibration curves, and correlation coefficients. Specific statistical tests and their appropriate uses are defined.

DATA ANALYSIS

Assoc. Prof. Dr. Azli B. Sulaiman


Department of Chemistry
Universiti Teknologi Malaysia
81310 UTM Johor Bahru
Johor Darul Takzim
[email protected]
LECTURE OUTLINES
• Errors in Chemical Analysis
• Descriptive Statistics
• Precision and Accuracy
• Types of Error
• Significant Figures
• Statistics in Data Evaluation
• Calibration Curve
• Method of Validation
STATISTICS IN DATA EVALUATION

• Defining confidence limits


• Estimating the difference between two means
(t test)
• Estimating the precision of data from two
experiments (F test)
• Deciding to accept or reject outliers (Q test)
• Calibration graphs
• Methods of validation
CONFIDENCE LIMITS AND
CONFIDENCE INTERVAL

• Confidence - we assert, with a certain probability,
that the confidence interval includes the true
value.
• The greater the certainty, the greater the
interval required.
• Confidence Limits (CL)
Interval around the mean that probably contains μ.
Extreme values of the range, a < x < b.

• Confidence Interval (CI)


Range within which we assume the true value lies.
The magnitude of the confidence limits.
x̄ ± Confidence Limits

• Confidence Level
Fixes the level of probability that the mean is within
the confidence limits.
99.9%, 99%, 95%, 90%, 80%, 68%, 50%
CONFIDENCE LIMITS (CL) OF MEAN

• Since the exact value of the population mean, μ,
cannot be determined, one must use
statistical theory to set limits around the
measured mean, x̄, that probably contain μ.

• CL only have meaning when the measured
standard deviation, s, is a good
approximation of the population standard
deviation, σ, and there is no bias in the
measurement.
CONFIDENCE LIMITS (CL)

In the absence of any systematic errors, the limits within
which the population mean (μ) is expected to lie with a
given degree of probability.

[Figure: Normal distribution curves (dN/N versus z) showing the central areas corresponding to the 50% (z = ±0.67), 80% (z = ±1.29), and 95% (z = ±1.96) confidence levels.]
CONFIDENCE INTERVAL (CI)

• CI when σ is known (population):

CI for μ = x̄ ± zσ/√N

N = number of measurements
VALUES FOR z AT VARIOUS
CONFIDENCE LEVELS

Confidence Level, % z

50 0.67
68 1.00
80 1.29
90 1.64
95 1.96
96 2.00
99 2.58
99.7 3.00
99.9 3.29
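A minimal sketch of this calculation in Python (the replicate values and σ below are invented purely for illustration), computing the z-based interval x̄ ± zσ/√N:

```python
import numpy as np

# Hypothetical replicate measurements (illustrative values only)
data = np.array([10.05, 10.10, 10.15, 10.05, 10.10])

sigma = 0.05   # population standard deviation, assumed known here
z = 1.96       # z for the 95% confidence level (from the table above)

x_bar = data.mean()
N = len(data)

half_width = z * sigma / np.sqrt(N)   # z*sigma/sqrt(N)
print(f"95% CI: {x_bar:.3f} +/- {half_width:.3f}")
```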
• CI for a Small Data Set (N < 20),
σ Not Known

CI for μ = x̄ ± ts/√N

• Values of t depend on the degrees of freedom,
(N − 1), and the confidence level (from Table t).
• t is also known as 'Student's t' and will be used in
hypothesis tests.

Example 2
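A corresponding sketch for the small-data-set case (again with invented values; SciPy is assumed to be available and supplies the critical t value in place of the table on the next slide):

```python
import numpy as np
from scipy import stats

# Hypothetical small data set (N < 20); values are illustrative only
data = np.array([10.05, 10.10, 10.15, 10.05, 10.10])

N = len(data)
x_bar = data.mean()
s = data.std(ddof=1)        # sample standard deviation

conf = 0.95
t_crit = stats.t.ppf(1 - (1 - conf) / 2, N - 1)   # two-sided t, N-1 degrees of freedom

half_width = t_crit * s / np.sqrt(N)              # t*s/sqrt(N)
print(f"{conf:.0%} CI: {x_bar:.3f} +/- {half_width:.3f}")
```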
VALUES OF t AT VARIOUS
CONFIDENCE LEVEL

Degrees of Freedom (N-1)    80%    90%    95%    99%
1 3.08 6.31 12.7 63.7


2 1.89 2.92 4.30 9.92
3 1.64 2.35 3.18 5.84
4 1.53 2.13 2.78 4.60
5 1.48 2.02 2.57 4.03
6 1.44 1.94 2.45 3.71
7 1.42 1.90 2.36 3.50
8 1.40 1.86 2.31 3.36
9 1.38 1.83 2.26 3.25
19 1.33 1.73 2.10 2.88
59 1.30 1.67 2.00 2.66
 1.29 1.64 1.96 2.58
OTHER USAGE OF CONFIDENCE
INTERVAL

• To determine the number of replicates needed
for the mean to fall within a desired confidence
interval.

• To detect systematic error.


TESTING A HYPOTHESIS

[Flow chart: Observation → Hypothesis → Model → Valid? If NO, reject the hypothesis and revise it; if YES, the model becomes the basis for further experiments.]
SIGNIFICANCE TESTS

• A significance test examines whether the difference
between two results is significant
(due to systematic error) or not
significant (merely due to random error).
NULL HYPOTHESIS, Ho

• The values of two measured quantities do not differ


(significantly) UNLESS we can prove that the two
values are significantly different.
“Innocent until proven guilty”

• The calculated value of a parameter from the


equation is compared to the parameter value from
the table.

• If the calculated value is smaller than the table value,


the hypothesis is accepted, and vice versa.
NULL HYPOTHESIS, Ho

Can be used to compare:


• μ and x̄
• x̄₁ and x̄₂
• s and σ
• s₁ and s₂
APPLICATION OF t-TEST

A t-test is used to compare one set of
measurements with another to decide
whether or not they are significantly
different.
t TEST

1. Comparison between experimental mean
and true mean (x̄ and μ)

• To check for the presence of systematic error
(see the sketch below)
• Steps for t test
• Example 5
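A minimal sketch of this comparison in Python (the replicate results and the accepted value are invented for illustration; SciPy is only used to supply the critical t value):

```python
import numpy as np
from scipy import stats

# Hypothetical replicate results and an accepted (true) value
results = np.array([4.95, 5.02, 4.98, 5.06, 4.99])
mu_true = 5.00

N = len(results)
x_bar = results.mean()
s = results.std(ddof=1)

# tcalc = |x_bar - mu| * sqrt(N) / s, compared with the table t at N-1 degrees of freedom
t_calc = abs(x_bar - mu_true) * np.sqrt(N) / s
t_table = stats.t.ppf(0.975, N - 1)     # 95% confidence, two-sided

print(f"tcalc = {t_calc:.2f}, ttable = {t_table:.2f}")
print("Ho accepted: no evidence of systematic error" if t_calc < t_table
      else "Ho rejected: systematic error suspected")
```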
t TEST

2. Compare x̄₁ and x̄₂ from two sets of data

• Normally used to determine whether the two
samples are identical or not.
• The difference between the means of two sets of
the same analysis provides information on the
similarity of the samples or the existence of
random error.
• Steps (see the sketch below)
• Example 6
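A sketch of the two-mean comparison using SciPy's pooled two-sample t-test (the two data sets are invented for illustration; the same decision could equally be made by computing tcalc by hand and comparing it with the table value):

```python
import numpy as np
from scipy import stats

# Two hypothetical sets of results for the same analysis
method_1 = np.array([10.1, 10.3, 10.2, 10.4, 10.2])
method_2 = np.array([10.5, 10.4, 10.6, 10.5, 10.7])

# Pooled (equal-variance) two-sample t-test; SciPy returns t and a p-value
t_stat, p_value = stats.ttest_ind(method_1, method_2, equal_var=True)

alpha = 0.05    # 95% confidence level
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print("Ho accepted: means not significantly different" if p_value > alpha
      else "Ho rejected: means differ significantly")
```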
F TEST

• Comparing the precision of two


measurements
• Is Method A more precise than Method B?
• Is there any significant difference between
both methods?
• With degrees of freedom = N − 1
Ho: s₁ = s₂

• Then, if Fcalc < Ftable, Ho is accepted.

• Since the value of F is always greater than or equal
to 1, the smaller variance (the more precise method)
always becomes the denominator:

Fcalc = V1/V2 = s₁²/s₂², where V1 > V2
(see the sketch after this list)

• Example 7
• Table F
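A minimal sketch of the F test (the two data sets are invented for illustration; SciPy supplies the one-tailed critical F in place of Table F):

```python
import numpy as np
from scipy import stats

# Hypothetical replicate results from two methods
method_a = np.array([20.1, 20.3, 20.2, 20.4, 20.2])
method_b = np.array([20.0, 20.6, 19.8, 20.5, 20.3])

var_a = method_a.var(ddof=1)
var_b = method_b.var(ddof=1)

# Larger variance in the numerator so that F >= 1
if var_a >= var_b:
    F_calc = var_a / var_b
    df_num, df_den = len(method_a) - 1, len(method_b) - 1
else:
    F_calc = var_b / var_a
    df_num, df_den = len(method_b) - 1, len(method_a) - 1

F_table = stats.f.ppf(0.95, df_num, df_den)   # one-tailed, 95% confidence
print(f"Fcalc = {F_calc:.2f}, Ftable = {F_table:.2f}")
print("Ho accepted: precisions not significantly different" if F_calc < F_table
      else "Ho rejected: precisions differ significantly")
```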
DIXON’S TEST OR Q TEST

• A way of detecting an outlier, a data point that
statistically does not belong to the set.

Data: 10.05, 10.10, 10.15, 10.05, 10.45, 10.10

• By inspection, 10.45 seems to lie outside the
normal range of the data.
• Should this data point be eliminated?
(See the sketch below.)
• Example 8
• Table Q
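A minimal sketch of the Q test applied to the data above (the critical Q used here is a commonly quoted figure for n = 6 at 90% confidence and should be checked against the course's Table Q):

```python
import numpy as np

# Data from the slide; 10.45 is the suspect value
data = np.array([10.05, 10.10, 10.15, 10.05, 10.45, 10.10])
data_sorted = np.sort(data)

suspect = data_sorted[-1]                      # suspected outlier (largest value)
gap = suspect - data_sorted[-2]                # difference from its nearest neighbour
data_range = data_sorted[-1] - data_sorted[0]  # spread of the whole data set

Q_calc = gap / data_range

Q_crit = 0.56   # commonly tabulated value for n = 6, 90% confidence (verify in Table Q)

print(f"Qcalc = {Q_calc:.2f}, Qcrit = {Q_crit:.2f}")
print("Reject 10.45 as an outlier" if Q_calc > Q_crit else "Retain 10.45")
```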
CALIBRATION GRAPHS

• Commonly used in analytical chemistry to find the


quantitative relation between two variables
(e.g. response and concentration).

• Calibration curves are normally linear; however,
not all the points lie exactly on the drawn straight
line (due to random error).

• Regression analysis can be done on the data to see


how good the linearity of the data is.
(Method of least squares)
METHOD OF LEAST SQUARES

• Linear relationship between analytical signal (y) and


concentration (x).
• Calculate the best straight line through the data
points, each of which is subject to experimental error.
CALIBRATION CURVES

[Figure: Calibration curve of Response (Y) versus Concentration (X); fitted straight line y = mx + c, where m = slope and c = intercept.]
CALIBRATION METHODS

• Standard Calibration Method


• Standard Addition Method
STANDARD CALIBRATION METHOD

[Figure: A blank, a series of standard solutions (1, 2, 3, 4 and 5 ppm), and the samples.]
STANDARD CALIBRATION METHOD

Concentration (ppm) Absorbance


0.00 0.00
1.00 0.06
2.00 0.13
3.00 0.21
4.00 0.25
5.00 0.29
Sample 0.22
STANDARD CALIBRATION METHOD

[Figure: Calibration plot of Absorbance versus Concentration (ppm); least-squares line y = 0.06x + 0.01.]
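A sketch of the least-squares fit using the calibration data from the table above (SciPy's linregress is assumed to be available; the interpolation of the sample concentration from its absorbance of 0.22 is shown for illustration):

```python
import numpy as np
from scipy import stats

# Calibration data from the table above
conc = np.array([0.00, 1.00, 2.00, 3.00, 4.00, 5.00])        # ppm
absorbance = np.array([0.00, 0.06, 0.13, 0.21, 0.25, 0.29])

fit = stats.linregress(conc, absorbance)   # least-squares straight line
print(f"y = {fit.slope:.3f} x + {fit.intercept:.3f},  R = {fit.rvalue:.4f}")

# Interpolate the sample concentration from its measured absorbance
sample_abs = 0.22
sample_conc = (sample_abs - fit.intercept) / fit.slope
print(f"Sample concentration ~ {sample_conc:.2f} ppm")
```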
STANDARD ADDITION METHOD

[Figure: Standard addition series: a blank and sample aliquots spiked with 0, 10, 20, 50 and 100 ppm of added standard, i.e. (x + 0), (x + 10), (x + 20), (x + 50) and (x + 100) ppm.]
STANDARD ADDITION METHOD

Concentration (ppm) Absorbance


(x + 0.00) 5.0
(x + 10.00) 11.0
(x + 20.00) 17.0
(x + 50.00) 28.0
(x + 100.00) 55.0
STANDARD ADDITION METHOD

[Figure: Standard addition plot of Absorbance versus added Concentration (ppm); the fitted line is extrapolated to zero absorbance, and the magnitude of the (negative) x-intercept gives the unknown concentration x in the sample.]
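A sketch of the standard addition calculation using the data from the table above (SciPy's linregress is assumed; the unknown concentration x is estimated from the x-intercept of the fitted line):

```python
import numpy as np
from scipy import stats

# Standard addition data from the table above: added concentration vs. absorbance
added = np.array([0.0, 10.0, 20.0, 50.0, 100.0])     # ppm of standard added
absorbance = np.array([5.0, 11.0, 17.0, 28.0, 55.0])

fit = stats.linregress(added, absorbance)

# Extrapolate to zero absorbance: the magnitude of the x-intercept
# estimates the unknown concentration x already present in the sample.
x_intercept = -fit.intercept / fit.slope
print(f"slope = {fit.slope:.3f}, intercept = {fit.intercept:.2f}")
print(f"Estimated unknown concentration x ~ {abs(x_intercept):.1f} ppm")
```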
CORRELATION COEFFICIENT

• To estimate how well the experimental
points fit a straight line.
• Calculate the product-moment correlation
coefficient, R (or r).
• The value of R lies in the range −1 to +1
(−1 ≤ R ≤ +1).
CORRELATION COEFFICIENT

• The closer the R value is to +1 (or −1), the better
the correlation between y and x.
R = +1: perfect positive correlation, with all
points lying on a straight line with
positive slope.
R = −1: perfect negative correlation.

• A coefficient of determination, R², greater than 0.999
is evidence of an acceptable fit of the data to the
regression line.
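A minimal sketch of the R calculation, reusing the calibration data from the standard calibration example for illustration:

```python
import numpy as np

# Calibration data (from the standard calibration example above)
x = np.array([0.00, 1.00, 2.00, 3.00, 4.00, 5.00])
y = np.array([0.00, 0.06, 0.13, 0.21, 0.25, 0.29])

# Product-moment (Pearson) correlation coefficient
R = np.corrcoef(x, y)[0, 1]
print(f"R = {R:.4f}, R^2 = {R**2:.4f}")
```

With these illustrative values R² comes out well below 0.999, showing that a visually reasonable line can still fail the stated criterion.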
METHOD VALIDATION

DEFINITION

• Method validation is the process of confirming


that the analytical procedure employed for a
specific test is suitable for its intended use.

• The process of verifying that a procedure or


method yields acceptable results.
METHOD VALIDATION

• WHY?
To defend the validity of the results and
demonstrate that the method is fit for its
intended purpose.
• WHO?
The responsibility of the laboratories.
• HOW?
Based on an evaluation of the method's
performance and the estimated uncertainty
in the result.
VALIDATION OF ANALYTICAL METHOD
(METHOD VALIDATION)

• Analysis of Standard Samples (SRM)


• Analysis by Other Methods
• Standard Addition to the Sample
METHOD VALIDATION

1. Analysis of Standard Samples


• A sample whose analyte concentration is
known.
• Standard Reference Materials (SRMs)
can be obtained from the National Institute
of Standards and Technology (NIST).
• The analyte concentration in the SRM
has been certified by the institute.
• Compare the data obtained from the
method with the certified value.
2. Analysis by Other Methods
• The result of the analytical method can be
evaluated by comparison with data obtained
from a different method.

3. Standard Addition to the Sample


• A known amount of the analyte is added
to the sample and then analyzed by the
proposed method.
• The effectiveness of the method can be
established by evaluating the recovery of the
added quantity (see the sketch below).
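A minimal sketch of the recovery calculation (all values are invented for illustration):

```python
# Hypothetical spike-recovery check
c_unspiked = 4.1   # analyte found in the original sample (ppm)
c_added = 5.0      # known amount of analyte added (ppm)
c_spiked = 8.9     # analyte found in the spiked sample (ppm)

# Percent recovery of the added quantity
recovery = (c_spiked - c_unspiked) / c_added * 100
print(f"Recovery = {recovery:.1f} %")
```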
METHOD VALIDATION

• Limit of Detection (LOD)


• Limit of Quantitation (LOQ)
• Precision/Repeatability/Reproducibility
• Accuracy
• Sensitivity
• Specificity
• Linearity
• Range
• Ruggedness/Robustness
METHOD VALIDATION

Determine precision, accuracy and detection limit


when a single analyst uses the method to analyze a
standard sample of known composition.
• Detection limit - determined by analyzing a
reagent blank for each type of sample matrix for
which a method will be used.
• Precision - determined by analyzing replicate
portions of a standard sample at various
concentrations.
• Accuracy - evaluated by a t-test.
METHOD VALIDATION

Blind analysis of a sample whose
analyte concentration is unknown to the
analyst.
• The sample is analyzed several times and
the average is determined.
• The average should agree with the expected value
to within three, and preferably two, standard
deviations.
PUBLISHED GUIDANCE
• ICH-Q2A. (1994). Text on Validation of Analytical
Procedures.
• ICH-Q2B. (1995). Validation of Analytical Procedures:
Methodology.
• CDER. (1994). Reviewer Guidance: Validation of
Chromatographic Method.
• CDER. (2000). Analytical Procedures and Method
Validation.
• USP. (2012). Validation of Analytical Methods and
Procedures.
ICH: International Conference on Harmonisation
CDER: Center for Drug Evaluation and Research
USP: US Pharmacopeia
Examples of Methods That Require
Validation Documentation

• Dissolution Methods
• Titration Methods
• Spectrophotometric Methods
(UV-VIS, IR, AAS, NMR, XRD, ICP-MS
etc.)
• Chromatographic Methods
• Capillary Electrophoresis Methods
• Particle Size Analysis Methods
• Automated Analytical Methods
LIMIT OF DETECTION (LOD)

• The detection limit of an individual analytical


procedure is the lowest amount of analyte in
a sample which can be detected.
• Concentration of an analyte which gives an
instrument signal (y) significantly different
from the blank or background signal.
LIMIT OF QUANTITATION (LOQ)

• The lowest amount of analyte in a sample


which can be quantitatively determined with
suitable precision and accuracy.
• The quantitation limit is a parameter of
quantitative assays for low levels of
compounds in sample matrices, and is used
particularly for the determination of impurities
and/or degradation products.
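The slides do not give explicit formulas for LOD and LOQ; a widely used convention estimates them from the standard deviation of replicate blank signals and the calibration slope (3s for LOD, 10s for LOQ). A sketch under that assumption, with invented numbers:

```python
import numpy as np

# Hypothetical blank signals and calibration slope (illustrative values)
blank_signals = np.array([0.010, 0.012, 0.009, 0.011, 0.010, 0.013, 0.011])
slope = 0.06                     # calibration sensitivity (signal per ppm)

s_blank = blank_signals.std(ddof=1)

# Common 3-sigma / 10-sigma conventions (not stated on the slide)
LOD = 3 * s_blank / slope        # limit of detection, ppm
LOQ = 10 * s_blank / slope       # limit of quantitation, ppm
print(f"LOD ~ {LOD:.3f} ppm, LOQ ~ {LOQ:.3f} ppm")
```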
PRECISION

• Expresses within-laboratories variations:


different days, different analysts, different
equipment, etc.
• The precision of the data is estimated from the
deviation of replicate analyses from their
mean.
REPEATABILITY

• Expresses the precision under the same


operating conditions over a short interval of
time.
• Repeatability is also termed intra-assay
precision.
• Done by performing replicates.
• Purpose:
- to ensure that the method is working properly.
- to reduce the sampling error.
REPRODUCIBILITY

• Expresses the precision between laboratories


(collaborative studies, usually applied to
standardization of methodology).
• Under reproducibility conditions, identical
samples are analyzed in different laboratories.
ACCURACY

• The accuracy of an analytical procedure


expresses the closeness of agreement
between accepted value (either as a
conventional true value or an accepted
reference value) and the value found.
SENSITIVITY

• Defined as the concentration of an element


required to produce a signal of 1%
absorption (0.0044 absorbance units).
• Can be determined by reading the
absorbance produced by a known
concentration of the element.
SPECIFICITY

• Specificity is the ability to assess


unequivocally the analyte in the presence of
components which may be expected to be
present. Typically these might include
impurities, degradants, matrix, etc.
• It is not always possible to demonstrate that
an analytical procedure is specific for a
particular analyte (complete discrimination).
LINEARITY

The linearity of an analytical procedure is its


ability (within a given range) to obtain test
results which are directly proportional to the
concentration (amount) of analyte in the
sample.
RANGE

The range of an analytical procedure is the


interval between the upper and lower
concentration (amounts) of analyte in the
sample for which it has been
demonstrated that the analytical procedure
has a suitable level of precision, accuracy
and linearity.
RUGGEDNESS/ROBUSTNESS

• The precision of one lab over multiple


days, which may include multiple analysts,
multiple instruments, different sources of
reagents, etc.
• How sensitive is the method to deliberate
or uncontrolled small changes in
parameters such as sample size,
temperature, pH, time etc.
