Applications of Sampling For Theory: Pentti Minkkinen
Pentti Minkkinen
Lappeenranta University of Technology
e-mail: [email protected]
Sources of sampling error
Correct and incorrect sampling
Estimation of sampling uncertainty
Optimization of sampling procedures
Practical examples
WSC2, Barnaul, March 2003
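[Figure: LOT -> Primary sample -> Secondary sample -> Analysis -> Result x, with standard deviations s_1, s_2, s_3 of the three steps and s_x of the result]
GOAL: x = μ (the true value of the lot)
Propagation of errors:
$$ s_x = \sqrt{\sum_i s_i^2} $$
Example:
$$ s_x = \sqrt{(5\,\%)^2 + (2\,\%)^2 + (1\,\%)^2} = \sqrt{30\,(\%)^2} = 5.5\,\% $$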
Analytical process usually contains several sampling and
sample preparation steps
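The propagation rule for independent steps (variances add) can be checked numerically. The sketch below is only an illustration; the function name `combined_std` is hypothetical, and the step values are the 5 %, 2 % and 1 % from the slide example.

```python
import math

def combined_std(step_stds):
    """Root sum of squares of independent standard deviations."""
    return math.sqrt(sum(s ** 2 for s in step_stds))

# Relative standard deviations (%) of primary sampling,
# secondary sampling and analysis, as in the slide example.
steps = [5.0, 2.0, 1.0]
s_x = combined_std(steps)
print(f"s_x = {s_x:.1f} %")
```

The largest step dominates: removing the 1 % analytical error entirely would barely change the total.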
SAMPLING
Art of cutting a small portion of material from a large lot and
transferring it to the analyzer
Theory of sampling (theoretical) distributions with known
properties
SLOGANS
The result is not better than the sample that it is based on
Sample must be representative
Theory that combines both technical and statistical parts
of sampling has been developed by Pierre Gy:
Sampling for Analytical Purposes, Wiley, 1998
PLANNING OF SAMPLING
1. GATHERING OF INFORMATION
What are the analytes to be determined?
What kind of estimates are needed?
Average (hour, day, shift, batch, shipment, etc.)
Distribution (heterogeneity) of the determinand in the lot
Highest or lowest values
Is useful a priori information available (variance estimates, unit
costs)?
Is all the necessary personnel and equipment available?
What is the maximum cost or uncertainty level of the investigation?
PLANNING OF SAMPLING
2. DECISIONS TO BE MADE
Manual vs. automatic sampling
Sampling frequency
Sample sizes
Sampling locations
Individual vs. composite samples
Sampling strategy
Random selection
Stratified random selection
Systematic stratified selection
Training
Error components of analytical determination according to P. Gy
Global Estimation Error (GEE)
    Total Analytical Error (TAE)
    Total Sampling Error (TSE)
        Point Selection Error (PSE)
            Long Range Point Selection Error (PSE1)
            Periodic Point Selection Error (PSE2)
        Fundamental Sampling Error (FSE)
        Grouping and Segregation Error (GSE)
        Point Materialization Error (PME)
            Increment Delimitation Error (IDE)
            Increment Extraction Error (IXE)
            Increment and Sample Preparation Error (IPE)
        Weighting Error (SWE)
GEE = TSE + TAE
TSE = (PSE + FSE + GSE) + (IDE + IXE + IPE) + SWE
0-D lot
[Figure: sample taken from an ideally mixed lot]
If the lot to be sampled can be mixed before sampling, it can be treated
as a 0-dimensional lot. The fundamental sampling error determines the
correct sampling error and can be estimated by using binomial or
Poisson distributions as models, or by using Gy's fundamental sampling
error equations.
1-D and 2-D lots
[Figure: samples cut from 1-D and 2-D lots]
If the lot cannot be mixed before sampling, the dimensionality of the lot
depends on how the samples are delimited and cut from the lot.
Autocorrelation has to be taken into account in sampling error estimation.
3-D lot
[Figure: samples taken from a 3-D lot]
A lot is 3-dimensional if none of its dimensions is completely
included in the sample.
Weighting error
Sample No.   Concentration c_i (mg/l)   Volume V_i (m^3)   c_i V_i (g)
1            6.25                       4.58               28.6
2            4.36                       3.71               16.2
3            5.58                       5.20               28.99
4            4.64                       5.71               26.48
5            4.86                       4.54               22.08
6            3.65                       6.78               24.75
7            3.73                       7.12               26.55
8            5.98                       5.81               34.76
9            4.96                       5.86               29.05
Mean         4.89                       5.479              26.39
Sum          44.01                      49.3               237.47
Total emission estimate (unweighted):
$$ M = 9\,\bar{c}\,\bar{V} = 9 \times 4.89\ \mathrm{g/m^3} \times 5.479\ \mathrm{m^3} = 241.13\ \mathrm{g} $$
Total emission estimate (weighted):
$$ M_w = \sum_i c_i V_i = 9\,\bar{c}_w\,\bar{V} = 237.47\ \mathrm{g} $$
Weighted mean concentration:
$$ \bar{c}_w = \frac{\sum_i c_i V_i}{\sum_i V_i} = \frac{237.47\ \mathrm{g}}{49.3\ \mathrm{m^3}} = 4.82\ \mathrm{mg/l} $$
Weighting error (in concentration): 4.89 - 4.82 = 0.07 mg/l
Weighting error (in total emission): 241.13 - 237.47 = 3.66 g
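The weighting error can be recomputed from the tabulated concentrations and volumes. This is a minimal sketch with hypothetical variable names. Products computed from the printed c_i and V_i differ slightly in the last digits from the tabulated c_i V_i column, presumably because the table was built from unrounded values. With the values as printed, the weighted mean concentration evaluates to about 4.82 mg/l.

```python
# Concentrations (mg/l = g/m^3) and volumes (m^3) from the table
conc = [6.25, 4.36, 5.58, 4.64, 4.86, 3.65, 3.73, 5.98, 4.96]
vol = [4.58, 3.71, 5.20, 5.71, 4.54, 6.78, 7.12, 5.81, 5.86]

n = len(conc)
mean_c = sum(conc) / n            # unweighted mean concentration
mean_v = sum(vol) / n             # mean volume

# Unweighted estimate: n * mean concentration * mean volume
m_unweighted = n * mean_c * mean_v

# Weighted estimate: sum of the individual mass flows c_i * V_i
m_weighted = sum(c * v for c, v in zip(conc, vol))
c_weighted = m_weighted / sum(vol)   # volume-weighted mean concentration

print(f"unweighted total: {m_unweighted:.2f} g")
print(f"weighted total:   {m_weighted:.2f} g")
print(f"weighted mean:    {c_weighted:.2f} mg/l")
print(f"weighting error:  {m_unweighted - m_weighted:.2f} g")
```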
[Figure: incorrect vs. correct sample delimitation]
Correct design for proportional sampler:
correct increment extraction
[Figure: sample cutter with dimensions a, b, c and cutter speed v]
v = constant ≤ 0.6 m/s
if d > 3 mm, b > 3d = b_0
if d < 3 mm, b > 10 mm = b_0
d = diameter of largest particles
b_0 = minimum opening of the sample cutter
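The cutter design rules can be written as a small check. The sketch below is an illustration only: the function names are hypothetical, the speed condition is read as v ≤ 0.6 m/s, and the case d = 3 mm exactly (not covered by the rules as stated) is assumed to fall under the 10 mm rule.

```python
def min_cutter_opening_mm(d_mm):
    """Minimum cutter opening b_0 (mm) for largest particle diameter d (mm)."""
    # d > 3 mm: b_0 = 3d; otherwise b_0 = 10 mm (d = 3 mm is an assumption)
    return 3.0 * d_mm if d_mm > 3.0 else 10.0

def cutter_is_correct(b_mm, d_mm, v_m_per_s, v_max=0.6):
    """True if opening b exceeds b_0 and the cutter speed is within v_max."""
    return b_mm > min_cutter_opening_mm(d_mm) and v_m_per_s <= v_max

print(min_cutter_opening_mm(5.0))        # 3 * 5 mm
print(cutter_is_correct(20.0, 5.0, 0.5))
```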
Process analyzers often have sample delimitation problems
Incorrect Increment and Sample Preparation Errors
Contamination (extraneous material in sample)
Losses (adsorption, condensation, precipitation, etc.)
Alteration of chemical composition (preservation)
Alteration of physical composition (agglomeration,
breaking of particles, moisture, etc.)
Involuntary mistakes (mixed sample numbers, lack of
knowledge, negligence)
Deliberate faults (salting of gold ores, deliberate errors in
increment delimitation, forgery, etc.)
Estimation of Fundamental Sampling Error
by Using Poisson Distribution
The Poisson distribution describes the random distribution of rare
events in a given interval.
If n is the number of critical particles in the sample, the standard
deviation expressed as the number of particles is
$$ \sigma_n = \sqrt{n} \qquad (1) $$
The relative standard deviation is
$$ \sigma_r = \frac{1}{\sqrt{n}} \qquad (2) $$
Example
Plant Manager: I am producing fine-ground limestone that is used
in paper mills for coating printing paper. According to their
specification my product must not contain more than 5 particles/tonne
of particles larger than 5 µm. How should I sample my product?
Sampling Expert: That is a bit too general a question. Let's first
define our goal. Would 20 % relative standard deviation for the
coarse particles be sufficient?
Plant Manager: Yes.
Sampling Expert: Well, let's consider the problem. We could use the
Poisson distribution to estimate the required sample size. Let's see:
The maximum relative standard deviation s_r = 20 % = 0.2. From
equation 2 we can estimate how many coarse particles there should be
in the sample to have this standard deviation:
$$ n = \frac{1}{s_r^2} = \frac{1}{0.2^2} = 25 $$
If 1 tonne contains 5 coarse particles, this result means that the primary
sample should be 25/5 = 5 tonnes. This is a good example of an impossible
sampling problem. Even if you could take a 5 tonne sample, there
is no feasible technology to separate and count the coarse particles
from it. You shouldn't try the traditional analytical approach in
controlling the quality of your product. Instead, if the specification is really
sensible, you forget the particle size analyzers and maintain the quality
of your product by process technological means, that is, you take care
that all equipment is regularly serviced and its high performance
maintained to guarantee the product quality.
Plant Manager: Thank you
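The Poisson estimate in the dialogue can be reproduced in a few lines. This is a sketch with hypothetical function names; it assumes, as the Sampling Expert does, that the coarse particles are rare and randomly distributed. With 25 particles needed and 5 particles per tonne, the arithmetic gives 25 / 5 = 5 tonnes.

```python
def particles_needed(rel_std):
    """Number of critical particles giving the target relative std (eq. 2)."""
    return 1.0 / rel_std ** 2

def sample_mass_tonnes(rel_std, particles_per_tonne):
    """Sample mass that contains the needed number of particles on average."""
    return particles_needed(rel_std) / particles_per_tonne

n_particles = particles_needed(0.2)
mass = sample_mass_tonnes(0.2, 5.0)
print(f"particles needed: {n_particles:.0f}")
print(f"sample mass: {mass:.1f} tonnes")
```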
P. Gy's Fundamental sampling error model
$$ \sigma_r^2 = C d^3 \left( \frac{1}{M_s} - \frac{1}{M_L} \right) \approx \frac{C d^3}{M_s} \quad \text{if } M_s \ll M_L $$
$$ \sigma_r = \frac{\sigma_a}{a_L} = \text{relative standard deviation} $$
where
M_s = sample size
M_L = lot size
d = particle size (95 % top size)
a_L = average concentration of the analyte in the lot
C = sampling constant
SAMPLING CONSTANT C
$$ C = f\,g\,\beta\,c $$
c = composition factor
β = liberation factor
g = size distribution factor
f = shape factor
Estimation of shape factor
[Figure: particle shapes of size d with their shape factors]
f = 1 (cube)
f = 0.524 (sphere)
f = 0.5 (default in most cases)
f = 0.1 (flakes)
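Estimation of liberation factor for unliberated and
liberated particles
[Figure: unliberated particle of size d containing critical grains of size L]
Unliberated particles: $\beta = \sqrt{L/d}$
Liberated particles (L = d): $\beta = 1$
$\beta_{max} = 1$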
Estimation of size distribution factor, g
Wide size distribution (d/d_0.05 > 4)             default g = 0.25
Medium distribution (d/d_0.05 = 4...2)            g = 0.50
Narrow distribution (1 < d/d_0.05 < 2)            g = 0.75
Identical particles (d/d_0.05 = 1)                g = 1.00
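The size distribution classes can be expressed as a lookup on the ratio d/d_0.05. The sketch below is an illustration; the function name is hypothetical, and the boundary ratios 2 and 4 are assigned to the medium class, following the "4...2" range given for it.

```python
def size_distribution_factor(d, d_005):
    """Return g from the ratio of the top size d to d_0.05 (same units)."""
    ratio = d / d_005
    if ratio < 1.0:
        raise ValueError("d must not be smaller than d_0.05")
    if ratio > 4.0:
        return 0.25   # wide distribution (default)
    if ratio >= 2.0:
        return 0.50   # medium distribution
    if ratio > 1.0:
        return 0.75   # narrow distribution
    return 1.00       # identical particles

print(size_distribution_factor(1.0, 0.1))   # ratio 10, wide distribution
```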
Estimation of constitution factor, c
$$ c = \frac{\left(1 - \frac{a_L}{\alpha}\right)^2}{\frac{a_L}{\alpha}}\,\rho_c + \left(1 - \frac{a_L}{\alpha}\right)\rho_m $$
ρ_m = density of matrix
ρ_c = density of critical particles
α = concentration of determinand in critical particles
a_L = average concentration of the lot
Example:
A chicken feed (density = 0.67 g/cm^3) contains on average 0.05 % of an
enzyme powder that has a density of 1.08 g/cm^3. From the size distribution of the
enzyme, the particle size d = 1.00 mm and the size range factor g = 0.5 could be
estimated.
Estimate the fundamental sampling error for the following analytical procedure.
First a 500 g sample is taken from a 25 kg bag. This sample is ground to a
particle size -0.5 mm. Then the enzyme is extracted from a 2 g sample by
using a proper solvent and the concentration is determined by using liquid
chromatography.
The relative standard deviation of the chromatographic measurement is 5 %.
Data: M_L = 25000 g; d = 1 mm; ρ_c = 1.08 g/cm^3; α = 100 %; β = 1;
ρ_m = 0.67 g/cm^3; a_L = 0.05 %; f = 0.5; c = 2160 g/cm^3
Primary sample: d_1 = 1 mm; M_S1 = 500 g; M_L1 = 25000 g; g_1 = 0.5;
C_1 = 540 g/cm^3; s_r1 = 0.033 = 3.3 %
Secondary sample: d_2 = 0.5 mm; M_S2 = 2 g; M_L2 = 500 g; g_2 = 0.25;
C_2 = 270 g/cm^3; s_r2 = 0.13 = 13 %
Analysis: s_r3 = 0.05 = 5 %
Total relative standard deviation:
$$ s_t = \sqrt{\sum_i s_{ri}^2} = 0.143 = 14.3\,\% $$
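The chicken feed example can be recomputed step by step from Gy's model. The sketch below is a hedged illustration with hypothetical function names; particle sizes are converted to cm because the sampling constant is in g/cm^3, and the results agree with the slide values to within rounding.

```python
import math

def constitution_factor(a_l, alpha, rho_c, rho_m):
    """Constitution factor c (g/cm^3); a_l and alpha as mass fractions."""
    ratio = a_l / alpha
    return (1.0 - ratio) ** 2 / ratio * rho_c + (1.0 - ratio) * rho_m

def fse_rel_std(C, d_cm, m_sample, m_lot):
    """Relative std of the fundamental sampling error (masses in g)."""
    return math.sqrt(C * d_cm ** 3 * (1.0 / m_sample - 1.0 / m_lot))

# Material data from the example
c = constitution_factor(a_l=0.0005, alpha=1.0, rho_c=1.08, rho_m=0.67)
f, beta = 0.5, 1.0

C1 = f * 0.50 * beta * c          # primary sample, g = 0.5
C2 = f * 0.25 * beta * c          # secondary sample, g = 0.25

s_r1 = fse_rel_std(C1, 0.10, 500.0, 25000.0)   # 500 g from 25 kg, d = 1 mm
s_r2 = fse_rel_std(C2, 0.05, 2.0, 500.0)       # 2 g from 500 g, d = 0.5 mm
s_r3 = 0.05                                    # chromatographic measurement

s_t = math.sqrt(s_r1 ** 2 + s_r2 ** 2 + s_r3 ** 2)
print(f"c = {c:.0f} g/cm^3, C1 = {C1:.0f}, C2 = {C2:.0f}")
print(f"s_r1 = {s_r1:.3f}, s_r2 = {s_r2:.3f}, s_t = {s_t:.3f}")
```

The 2 g secondary sample dominates the total, so enlarging it or grinding finer would reduce the total error far more than improving the chromatographic step.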
Analysis of Mineral Mixtures by Using IR Spectrometry
Pentti Minkkinen a), Marko Lallo a), Pekka Sten b), and Markku J. Lehtinen c)
a) Lappeenranta University of Technology, Department of Chemical Technology, P.O. Box 20,
FIN-53851 Lappeenranta, Finland
b) Technical Research Centre of Finland (VTT), Chemical Technology, Mineral Processing,
P.O. Box 1405, FIN-83501 Outokumpu, Finland
c) Partek Nordkalk Oy Ab, Poikkitie 1, FIN-53500 Lappeenranta, Finland
(Present address: Geological Survey of Finland, R & D Department/Mineralogy and
Applied Mineralogy, P.O.Box 96, FIN-2151 Espoo, Finland)
Content
Introduction
Sampling error estimation and sample preparation
Design of calibration and test sets
Calibration (PLS)
Results
Introduction
In mineral processing it is important to know quantitatively
the mineral composition of the material to be processed
The methods presently in use are time consuming
Only a few reports in the literature on the use of IR for mineral
analysis
Feasibility of FTIR (Nicolet Magna 560) to analyze
quantitatively mineral species associated with wollastonite
was studied
Mineral Mixture Studied
Mineral        Formula       Concentration range (%)
Wollastonite   CaSiO3        80 - 95
Calcite        CaCO3         1 - 10
Dolomite       CaMg(CO3)2    1 - 10
Quartz         SiO2          1 - 10
Diopside       CaMgSi2O6     1 - 10
MATERIAL PROPERTIES NEEDED IN SAMPLING ERROR ESTIMATION
Component      Density (g/cm^3)   Particle size, d_0.95 (µm)
Wollastonite   3.0                37
Calcite        2.8                45
Dolomite       2.86               31
Quartz         2.65               45
Diopside       3.3                43
KBr            2.75
Design of calibration set
Mixture design for five components by using the
XVERT algorithm (CORNELL, J.A., Experiments with Mixtures,
2nd Edition, Wiley, 1990, pp 139-227)
37 calibration and 5 validation standards
Calibration
Spectra recorded with Nicolet Magna 560 FTIR
spectrometer
Calibration with PLS1 (model calculated by using
TURBOQUANT program)
PREPARATION OF CALIBRATION STANDARDS
1. Pure minerals (d = 1 mm) were ground
individually for 2 min in a swing mill
2. 30 mg - 2.95 g of each mineral was carefully
weighed to obtain the designed composition,
which was carefully mixed for 3 min in a Retsch
Spectro Mill
3. 20 mg of the mineral mixture was carefully
weighed into 4.98 g of KBr and mixed for 3 min in
a Retsch Spectro Mill
4. 200 mg of the mineral-KBr mixture was pressed
into a tablet for the IR measurement
Sampling errors of sample preparation and IR measurement
Dilution factor = 0.004 (20 mg of mineral mixture in 5.0 g of mineral-KBr mixture)
Tablet Preparation:
Lot size = M_L1 = 5 g
Sample size = M_s1 = 0.2 g
IR Measurement:
Lot size = M_L2 = 200 mg
Sample size = M_s2 = 38 % of 0.2 g = 76 mg
RESULTS
[Figure: FSE of quartz determination; s_r1, s_r2 and s_rt (%) vs. quartz concentration (%)]
[Figure: FSE of wollastonite determination; s_r1, s_r2 and s_rt (%) vs. wollastonite concentration (%)]
Experimental result (PLS-calibration)
[Figure: QUARTZ, designed vs. predicted concentration (%); * = calibration, o = test set]
[Figure: WOLLASTONITE, designed vs. predicted concentration (%); * = calibration, o = test set]
Conclusions
FTIR and PLS can be used for mineral analysis
Design and preparation of the calibration set
important; pure minerals hard to get and vary in
composition from deposit to deposit and also within
a deposit
Reproducible sample preparation important both
from the spectroscopic point of view and to control
the sampling error
Still difficult for a routine laboratory
Uses of Gy's fundamental sampling error model
s_r of a given sample size
Minimum M_s for a required s_r
Maximum d for given M_s and s_r
Audit and design of multistep sampling procedures
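The first three uses follow from rearranging the model s_r^2 = C d^3 (1/M_s - 1/M_L). The sketch below illustrates this under the assumption that the sampling constant C stays fixed when d or M_s changes; in practice g and β may depend on d. The function names and the target of 2 % are hypothetical, and C = 540 g/cm^3 and M_L = 25000 g are borrowed from the chicken feed example.

```python
import math

def fse_rel_std(C, d_cm, m_sample, m_lot):
    """s_r of a given sample size."""
    return math.sqrt(C * d_cm ** 3 * (1.0 / m_sample - 1.0 / m_lot))

def min_sample_mass(C, d_cm, s_r, m_lot):
    """Minimum M_s (g) that reaches the required s_r."""
    return 1.0 / (s_r ** 2 / (C * d_cm ** 3) + 1.0 / m_lot)

def max_particle_size(C, m_sample, s_r, m_lot):
    """Maximum d (cm) for a given M_s and s_r, with C held constant."""
    return (s_r ** 2 / (C * (1.0 / m_sample - 1.0 / m_lot))) ** (1.0 / 3.0)

C, m_lot = 540.0, 25000.0
m_s = min_sample_mass(C, 0.1, 0.02, m_lot)        # d = 1 mm, target 2 %
d_max = max_particle_size(C, 500.0, 0.02, m_lot)  # 500 g sample, target 2 %
print(f"minimum sample mass: {m_s:.0f} g")
print(f"maximum particle size: {10 * d_max:.2f} mm")
```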
Estimation of point selection error, PSE
PSE depends on the sample selection strategy if consecutive
values are autocorrelated. Selection options:
random
stratified random
stratified systematic
PSE is the error of the mean of a continuous lot estimated by using
discrete samples.
Point selection error has two components: PSE = PSE1 + PSE2
PSE1 ... error component caused by random drift
PSE2 ... error component caused by cyclic drift
Statistics of correlated series is needed to evaluate the sampling
variance.
[Figure: concentration vs. time, random selection]
[Figure: concentration vs. time, stratified selection]
[Figure: concentration vs. time, systematic selection]
When sampling autocorrelated series, the same number of
samples gives different uncertainties for the mean depending
on the selection strategy
Random sampling:
$$ s_{\bar{x}} = \frac{s_p}{\sqrt{n}} $$
Stratified sampling:
$$ s_{\bar{x}} = \frac{s_{str}}{\sqrt{n}} $$
Systematic sampling:
$$ s_{\bar{x}} = \frac{s_{sys}}{\sqrt{n}} $$
s_p is the process standard deviation; s_str and s_sys are standard
deviation estimates where the autocorrelation has been taken into account.
Normally s_p > s_str > s_sys, except in periodic processes, where s_sys
may be the largest.
Estimation of PSE by variography
Variographic experiment: N samples collected at equal
distances
Variogram of heterogeneity calculated:
$$ V_j = \frac{1}{2(N-j)} \sum_{i=1}^{N-j} \left( h_{i+j} - h_i \right)^2, \qquad j = 1, 2, \ldots, \frac{N}{2} $$
Heterogeneity of the process:
$$ h_i = \frac{a_i - a_L}{a_L} \cdot \frac{M_i}{\bar{M}}, \qquad i = 1, 2, \ldots, N $$
To estimate variances the variogram has to be integrated (numerically
in Gy's method)
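The variogram calculation can be sketched as follows. This is an illustration only: the function names are hypothetical, a_L is taken as the mass-weighted mean of the series and M-bar as the mean increment mass (both assumptions), equal masses are assumed when none are given, and the input series is synthetic rather than measured data.

```python
def heterogeneity(a, m=None):
    """Heterogeneity contributions h_i of a series of N increments."""
    n = len(a)
    if m is None:
        m = [1.0] * n                     # assume equal increment masses
    m_mean = sum(m) / n
    a_lot = sum(ai * mi for ai, mi in zip(a, m)) / sum(m)
    return [(ai - a_lot) / a_lot * mi / m_mean for ai, mi in zip(a, m)]

def variogram(h, max_lag=None):
    """V_j for lags j = 1 .. max_lag (default N // 2)."""
    n = len(h)
    if max_lag is None:
        max_lag = n // 2
    values = []
    for j in range(1, max_lag + 1):
        ss = sum((h[i + j] - h[i]) ** 2 for i in range(n - j))
        values.append(ss / (2.0 * (n - j)))
    return values

# Synthetic periodic series with a period of 4 sampling intervals
series = [10.0, 12.0, 10.0, 8.0] * 3
h = heterogeneity(series)
v = variogram(h)
for lag, value in enumerate(v, start=1):
    print(f"V_{lag} = {value:.3f}")
```

For this periodic series the variogram returns to zero at lag 4, which is the signature of a cyclic process.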
[Figure: variograms V vs. sample interval, panels A to D]
Shapes of variograms: A. Random process; B. Process with
non-periodic drift; C. Periodic process; D. Complex process
Estimation of sulfur in wastewater stream
[Figure: heterogeneity h_i vs. time (days)]
Heterogeneity of the process, s_p = 0.282 = 28.2 %
[Figure: variogram V_i of sulfur in wastewater stream vs. sample interval (d)]
[Figure: s_str and s_sys, s_r (%) vs. sample interval (d)]
Relative standard deviation estimates, which take autocorrelation
into account
Estimate the uncertainty of the annual mean if one
sample/week is analyzed by using systematic sample
selection
Sampling interval = 7 d, s_sys = 7.8 %
Number of samples/y = n = 52
Standard deviation of the annual mean:
$$ s_{\bar{x}} = \frac{s_{sys}}{\sqrt{n}} = \frac{7.8\,\%}{\sqrt{52}} = 1.1\,\% $$
Expanded uncertainty:
$$ U_{0.95} = 2\,s_{\bar{x}} = 2.2\,\% $$
Process standard deviation was 28.2 %. If the number of samples is
estimated by using normal approximation (or samples are selected
completely randomly), the required number of samples for the same
uncertainty is:
$$ n = \frac{s_p^2}{s_{\bar{x}}^2} = \frac{(28.2\,\%)^2}{(1.1\,\%)^2} = 657 $$
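The annual-mean calculation can be reproduced numerically. This is a minimal sketch with hypothetical variable names. The figure of 657 samples follows from the rounded value of 1.1 % for the standard deviation of the mean; with the unrounded value the same formula gives about 680.

```python
import math

s_sys = 7.8      # % relative std at a 7 d systematic sampling interval
s_p = 28.2       # % process standard deviation
n = 52           # one sample per week for a year

s_mean = s_sys / math.sqrt(n)     # standard deviation of the annual mean
u_95 = 2.0 * s_mean               # expanded uncertainty, coverage factor 2

# Samples needed for the same uncertainty with random selection
n_random_rounded = (s_p / round(s_mean, 1)) ** 2   # uses 1.1 %, as on the slide
n_random_exact = (s_p / s_mean) ** 2               # uses the unrounded value

print(f"s_mean = {s_mean:.2f} %, U = {u_95:.1f} %")
print(f"n (rounded s_mean) = {n_random_rounded:.0f}")
print(f"n (unrounded s_mean) = {n_random_exact:.0f}")
```

Either way, systematic selection reaches with 52 samples what random selection would need more than ten times as many samples to achieve.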
CONCLUSIONS
Sampling uncertainty can be, and should be estimated
If the sampling uncertainty is not known it is questionable
whether the sample should be analyzed at all
Sampling nearly always takes a significant part of the
total uncertainty budget
Optimization of sampling and analytical procedures may
result in significant savings, or better results, on
scales from laboratory procedures and process sampling
to large national surveys
THANK YOU