EA-4/16 G Rev00 December 2003
Publication
Reference EA-4/16 G:2003
EA guidelines
on the expression
of uncertainty
in quantitative testing
PURPOSE
The purpose of this document is to harmonise the evaluation of uncertainties associated with
measurement and test results within EA. To achieve this, recommendations and advice are
given for the evaluation of those uncertainties.
Authorship
Official Language
The text may be translated into other languages as required. The English language version
remains the definitive version.
Copyright
The copyright of this text is held by EA. The text may not be copied for resale.
Further information
For further information about this publication, contact your National member of EA. Please
check our website http://www.european-accreditation.org for up-to-date information.
CONTENTS
1 INTRODUCTION
2 SCOPE OF APPLICATION
3 POLICY STATEMENT
5.1 Requirements
6.2 Data accumulated during validation and verification of a test method prior to application
in the testing environment
6.3 Interlaboratory study of test method performance according to ISO 5725 or equivalent
10 REFERENCES
11 BIBLIOGRAPHY
12 APPENDIX
1 INTRODUCTION
In general, the GUM is also applicable in testing, although there are decisive differences
between measurement and testing procedures. The very nature of some testing
procedures may make it difficult to apply the GUM strictly. Section 6 provides guidance
on how to proceed in such cases.
Wherever possible, accredited testing laboratories are required, when reporting the
uncertainties associated with quantitative results, to do so in accordance with the GUM.
A basic requirement of the GUM is the use of a model for the evaluation of uncertainty.
The model should include all quantities that can contribute significantly to the uncertainty
associated with the test result. There are circumstances, however, where the effort
required to develop a detailed model is not justified. In such cases other identified
guidance should be adopted, and other methods, based for example on validation and
method performance data, should be used.
To ensure that clients benefit fully from laboratories' services, accredited testing
laboratories have developed appropriate principles for their collaboration with
clients. Clients have the right to expect that the test reports are factually correct, useful
and comprehensive. Depending on the situation, clients are also interested in quality
features, especially:
- the reliability of the results and a quantitative statement on this reliability, i.e. the
uncertainty;
- the level of confidence of a conformity statement about the product that can be
inferred from the test result and the associated expanded uncertainty.
Other quality features such as repeatability, intermediate precision, reproducibility,
trueness, robustness and selectivity are also important for the characterisation of the
quality of a test method.
This document does not deal with the use of uncertainty in conformity assessment. In
general, the quality of a test result does not reflect the best achievable or the smallest
uncertainty. Section 2 defines the scope of application of this guide and Section 3
presents a policy statement jointly made by EUROLAB, EURACHEM and EA.
Sections 4, 5 and 6 are tutorial. Section 4 provides a brief summary of the
GUM. Section 5 summarises the existing requirements according to ISO/IEC 17025 [7]
and the strategy for the implementation of uncertainty evaluation. It also addresses
some difficulties associated with uncertainty evaluation in testing. Section 6 explains the
use of validation and method performance data for evaluating uncertainty in testing. EA
requirements on reporting the result of a measurement are given in Section 7. Guidance
on a stepwise implementation of uncertainty in testing is provided in Section 8. The
benefits of elaborating the uncertainty associated with the values obtained in quantitative
testing are indicated in Section 9.
2 SCOPE OF APPLICATION
3 POLICY STATEMENT
Testing laboratories should not be expected to do more than take notice of, and apply the
uncertainty-related information given in the standard, i.e. quote the applicable figure, or
perform the applicable procedure for uncertainty estimation. Standards specifying test
methods should be reviewed concerning estimation and statement of uncertainty of test
results, and revised accordingly by the standards organisation.
7. The required depth of the uncertainty estimations may differ between technical
fields. Factors to be taken into account include:
- common sense;
- influence of the uncertainty of measurement on the result (appropriateness of the
determination);
- appropriateness;
- classification of the degree of rigour in the determination of uncertainty of
measurement.
1 The term evaluation has been used in preference to the term estimation. The former term is more
general and is applicable to different approaches to uncertainty evaluation. This choice is also made to be
consistent with the vocabulary used in the GUM.
2 The laboratories have to demonstrate full compliance with the test methods.
The GUM is based on sound theory and provides a consistent and transferable
evaluation of measurement uncertainty and supports metrological traceability. The
following paragraphs provide a brief interpretation of the basic ideas and concepts.
Three levels in the GUM can be identified. These are basic concepts, recommendations
and evaluation procedures. Consistency requires the basic concepts to be accepted and
the recommendations to be followed. The basic evaluation procedure presented in the
GUM, the law of propagation of uncertainty, applies to linear or linearised models (see
below). It should be applied whenever appropriate, since it is straightforward and easy
to implement. However, for some cases more advanced methods such as the use of
higher-order expansion of the model or the propagation of probability distributions may
be required.
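The more general method mentioned above, the propagation of probability distributions, can be illustrated with a minimal Monte Carlo sketch. This is an illustration only, not part of the guide: the model y = V/I, the Gaussian input PDFs and all numerical values are hypothetical assumptions.

```python
import random
import statistics

# Propagation of probability distributions, sketched for a hypothetical
# model y = V / I with Gaussian PDFs assigned to the inputs V and I.
random.seed(1)
N = 100_000
ys = [random.gauss(10.0, 0.05) / random.gauss(2.0, 0.01) for _ in range(N)]

y_best = statistics.mean(ys)   # best estimate of the measurand
u_y = statistics.stdev(ys)     # standard uncertainty from the sample spread
```

For this nearly linear model the simulated u_y agrees with the law of propagation of uncertainty (about 0.035 here); for strongly non-linear models the two approaches can differ, which is when the propagation of PDFs is needed.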
The basic concepts in uncertainty evaluation are:
- the knowledge about any quantity that influences the measurand is in principle
incomplete and can be expressed by a probability density function (PDF) for the
values attributable to the quantity based on that knowledge;
- the expectation value of that PDF is taken as the best estimate of the value of the
quantity;
- the standard deviation of that PDF is taken as the standard uncertainty
associated with that estimate;
- the PDF is based on knowledge about a quantity that may be inferred from:
  - repeated measurements (Type A evaluation);
  - scientific judgement based on all the available information on the possible
variability of the quantity (Type B evaluation).
PDF, i.e. the best estimate of each quantity and the standard uncertainty
associated with that estimate
- Propagation of uncertainty. The basic procedure (the law of propagation of
uncertainty) can be applied to linear or linearised models, but is subject to some
restrictions. A working group of the Joint Committee for Guides in Metrology
(JCGM) is preparing guidance for a more general method (the propagation of
PDFs) that includes the law of propagation of uncertainty as a special case;
- Stating the complete result of a measurement by providing the best estimate of
the value of the measurand, the combined standard uncertainty associated with
that estimate and an expanded uncertainty (Section 7).
The GUM [1] provides guidance on stating a complete result of a measurement in its
section 7, titled "Reporting uncertainty". Section 7 in this document follows the
recommendations of the GUM and provides some more detailed guidance. Note that the
GUM permits the use of either the combined standard uncertainty uc(y) or the expanded
uncertainty U(y), i.e. the half width of an interval having a stated level of confidence, as a
measure of uncertainty. However, if the expanded uncertainty is used, one must state
the coverage factor k, which is equal to the value of U(y)/uc(y).
For the evaluation of the uncertainty associated with the measurand Y one needs only to
know
the model, Y = f(X1,..., XN),
the best estimates xi of all input quantities Xi and
the standard uncertainties u(xi) associated with the xi, and the correlation
coefficients r(xi,xj) associated with each pair xi, xj.
The best estimate xi is the expected value of the PDF for Xi, u(xi) is the standard
deviation of that PDF and r(xi,xj) is the ratio of the covariance between xi and xj and the
product of the standard deviations.
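These three ingredients are all the basic procedure needs. A minimal numerical sketch of the law of propagation of uncertainty follows (Python; the model y = V/I, the finite-difference step and all values are illustrative assumptions, not material from this guide). Sensitivity coefficients ci are approximated by central finite differences rather than derived analytically.

```python
import math

def combined_standard_uncertainty(f, x, u, r=None, h=1e-6):
    """Law of propagation of uncertainty for y = f(x1, ..., xN).

    x: best estimates xi; u: standard uncertainties u(xi);
    r: optional matrix of correlation coefficients r(xi, xj)
    (uncorrelated inputs assumed if omitted).
    """
    n = len(x)
    c = []  # sensitivity coefficients ci = df/dxi (central differences)
    for i in range(n):
        xp, xm = list(x), list(x)
        step = h * max(abs(x[i]), 1.0)
        xp[i] += step
        xm[i] -= step
        c.append((f(*xp) - f(*xm)) / (2 * step))
    var = 0.0
    for i in range(n):
        for j in range(n):
            rij = 1.0 if i == j else (r[i][j] if r else 0.0)
            var += c[i] * c[j] * u[i] * u[j] * rij
    return math.sqrt(var)

# Hypothetical model: resistance R = V / I with uncorrelated inputs.
uc = combined_standard_uncertainty(lambda V, I: V / I,
                                   x=[10.0, 2.0], u=[0.05, 0.01])
```

Here c1 = 1/I = 0.5 and c2 = -V/I^2 = -2.5, giving uc = sqrt((0.5 * 0.05)^2 + (2.5 * 0.01)^2) which is approximately 0.035.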
To state the combined standard uncertainty uc(y) associated with the measurement
result y, no further knowledge of the PDF is required. To state the half width of an
interval having a stated level of confidence, i.e. an expanded uncertainty, it is necessary
to know the PDF. This requires more knowledge since the two parameters, expectation
value and standard deviation, do not fully specify a PDF unless it is known to be
Gaussian.
5.1 Requirements
In principle, the standard ISO/IEC 17025 does not include new requirements
concerning measurement uncertainty but it deals with this subject in more detail than
the previous version of this standard:
5.4.6.2 Testing laboratories shall have and shall apply procedures for estimating
uncertainty of measurement. In certain cases the nature of the test method may
preclude rigorous, metrologically and statistically valid, calculation of uncertainty of
measurement.
NOTE 1 Sources contributing to the uncertainty include, but are not necessarily
limited to, the reference standards and reference materials used, methods and
equipment used, environmental conditions, properties and conditions of the item being
tested or calibrated, and the operator.
NOTE 2 The predicted long-term behaviour of the tested and/or calibrated item
is not normally taken into account when estimating the measurement uncertainty.
NOTE 3 For further information, see ISO 5725 and the Guide to the Expression
of Uncertainty in Measurement (see bibliography).
The difference between the terminology used in measurement and testing activities
will be more clearly seen upon comparing the definitions of the two operations:
There are, however, important differences in the practice of measurement (as seen in
calibration and in testing), and these affect the practice of uncertainty evaluation:
A test result typically depends on the method and on the specific procedure used to
determine the characteristic, sometimes strongly. In general, different test methods
may yield different results, because a characteristic is not necessarily a well-defined
measurand.
Test methods are often determined by conventions. These conventions reflect different
concerns or aims:
- the test must be representative of the real conditions of use of the product;
- the test conditions are often a compromise between extreme conditions of use;
- individual test conditions should control the variability in the test result.
To achieve the last aim, a nominal value and a tolerance for the relevant conditions are
defined. The test temperature is often specified, e.g. 38.0 °C ± 0.5 °C. However, not all
conditions can be controlled. This lack of knowledge introduces variability to the
results. A desirable feature of a test method is to control such variability.
For tests, an indicator (such as a physical quantity) is used to express the test results.
For instance, the ignition time is often used as an indicator for a burning test. The
uncertainty associated with the measurement of the ignition time adds variability to the
test results. However, this contribution to the variability is generally dwarfed by
contributions inherent in the test method and uncontrolled conditions, although this
aspect should be confirmed.
Testing laboratories should scrutinise all elements of the test method and the
conditions prevailing during its application in order to evaluate the uncertainty
associated with a test result.
In principle, the mathematical model describing the test procedure can be established
as proposed in the GUM. However, the derivation of the model may be infeasible for
economic or other reasons. In such cases alternative approaches may be used. In
particular, the major sources of variability can often be assessed by interlaboratory
studies as stated in ISO 5725 [8], which provides estimates of repeatability,
reproducibility and (sometimes) trueness of the method.
Despite the differences in terminology above, for the purposes of this document, a
quantitative test result is considered to be a measurement result in the sense used in
the GUM. The important distinction is that a comprehensive mathematical model, which
describes all the effects on the measurand, is less likely to be available in testing. The
evaluation of uncertainty in testing may therefore require the use of validation and
method performance studies as described in section 6.
6.2 Data accumulated during validation and verification of a test method prior to
application in the testing environment
6.2.1 In practice, the fitness for purpose of test methods applied for routine testing is
frequently checked through method validation and verification studies. The data so
accumulated can inform the evaluation of uncertainty for test methods. Validation
studies for quantitative test methods typically determine some or all of the following
parameters:
Precision. Studies within a laboratory will obtain precision under repeatability conditions
and intermediate conditions, ideally over time and across different operators and types
of test item. The observed precision of a testing procedure is an essential component
of overall uncertainty, whether determined by a combination of individual variances or
by a study of the complete method in operation.
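For instance, repeatability estimates gathered under different conditions (e.g. different days or operators) can be pooled into a single within-laboratory precision figure. A minimal sketch follows (Python; the grouping and all replicate values are hypothetical illustrations):

```python
import math
import statistics

# Pooled standard deviation from k groups of replicate results obtained
# under repeatability conditions (hypothetical data: three groups,
# e.g. three days or three operators).
groups = [
    [10.1, 10.3, 10.2],
    [10.4, 10.2],
    [10.0, 10.2, 10.1, 10.3],
]
num = sum((len(g) - 1) * statistics.variance(g) for g in groups)
den = sum(len(g) - 1 for g in groups)
s_pooled = math.sqrt(num / den)  # pooled within-laboratory standard deviation
```

Weighting each group's sample variance by its degrees of freedom lets groups of unequal size contribute proportionately to the pooled estimate.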
Bias. The bias of a test method is usually determined by studying relevant reference
materials or test samples. The aim is typically to identify and eliminate significant bias.
In general, the uncertainty associated with the determination of the bias is an important
component of overall uncertainty.
Selectivity and specificity. These terms relate to the ability of a test method to respond
to the appropriate measurand in the presence of interfering influences, and are
particularly important in chemical testing. They are, however, qualitative concepts and
do not directly provide uncertainty information, though the influence of interfering
effects may in principle be used in uncertainty evaluation [12].
6.2.3 The general principles of applying validation and performance data to uncertainty
evaluation are similar to those applicable to the use of performance data (above).
However, it is likely that the performance data available will adequately cover fewer
contributions. Correspondingly, further supplementary estimates will be required. A
typical procedure is:
- Compile a list of relevant sources of uncertainty. It is usually convenient to include
any measured quantities held constant during a test, and to incorporate
appropriate precision terms to account for the variability of individual
measurements or the test method as a whole. A cause and effect diagram [13] is a
very convenient way to summarise the uncertainty sources, showing how they
relate to each other and indicating their influence on the uncertainty associated
with the result;
- Assemble the available method performance and calibration data;
- Check to see which sources of uncertainty are adequately accounted for by the
available data. It is not generally necessary to obtain separately the effects of all
contributions; where several effects contribute to an overall performance figure, all
such effects may be considered to be accounted for. Precision data covering a
wide variety of sources of variation are therefore particularly useful as they will
often encompass many effects simultaneously (but note that in general precision
data alone are insufficient unless all other factors are assessed and shown to be
negligible);
- For any sources of uncertainty not adequately covered by existing data, either
seek additional information from the literature or existing data (certificates,
equipment specifications, etc.) or plan experiments to obtain the required
additional data.
These principles are applicable to test methods that have been subjected to
interlaboratory study. For these cases, reference to ISO TS 21748 is recommended for
details of the relevant procedure. The EURACHEM/CITAC guide [12] also gives
guidance on the application of interlaboratory study data in chemical testing.
6.3.2 The additional sources (6.3.1 iii)) that may need particular consideration are:
- Sampling. Collaborative studies rarely include a sampling step. If the method used
in-house involves sub-sampling, or the measurand is a bulk property of a small
sample, the effects of sampling should be investigated and included;
- Pre-treatment. In most studies, samples are homogenised, and may additionally
be stabilised, before distribution. It may be necessary to investigate and add the
effects of the particular pre-treatment procedures applied in-house;
- Method bias. Method bias is often examined prior to or during interlaboratory
study, where possible by comparison with reference methods or materials. Where
the bias itself, the standard uncertainties associated with the reference values
used, and the standard uncertainty associated with the estimated bias are all small
compared with the reproducibility standard deviation, no additional allowance need
be made for the uncertainty associated with method bias. Otherwise, it will be
necessary to make such an allowance;
- Variation in conditions. Laboratories participating in a study may tend to steer their
results towards the means of the ranges of the experimental conditions, resulting
in underestimates of the ranges of results possible within the method definition.
Where such effects have been investigated and shown to be insignificant across
their full permitted range, however, no further allowance is required;
- Changes in sample type. The uncertainty arising from samples with properties
outside the range covered by the study will need to be considered.
6.4.2 Quality control (QC) data of this kind will not generally include sub-sampling, the effect
of differences between test items, the effects of changes in the level of response, or
inhomogeneity in test items. QC data should accordingly be applied with caution to
similar materials, and with due allowance for additional effects that may reasonably
apply.
6.4.3 Data points from QC data that gave rise to rejection of measurement and test results
and to corrective action should normally be eliminated from the data set before
calculating the standard deviation.
6.5.2 In general, proficiency tests are not carried out sufficiently frequently to provide good
estimates of the performance of an individual laboratory's implementation of a test
method. Additionally, the nature of the test items circulated will typically vary, as will the
expected result. It is thus difficult to accumulate representative data for well-
characterised test items. Furthermore, many schemes use consensus values to assess
laboratory performance, which occasionally lead to apparently anomalous results for
individual laboratories. Their use for the evaluation of uncertainty is accordingly limited.
However, in the special case where:
- the types of test items used in the scheme are appropriate to the types tested
routinely,
- the assigned values in each round are traceable to appropriate reference values,
and
- the uncertainty associated with the assigned value is small compared with the
observed spread of results,
the dispersion of the differences between the reported values and the assigned values
obtained in repeated rounds provides a basis for an evaluation of the uncertainty
arising from those parts of the measurement procedure within the scope of the
scheme.
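Under those conditions the evaluation amounts to computing the dispersion of the reported-minus-assigned differences over the available rounds. A minimal sketch follows (Python; the five differences are hypothetical values, not data from any scheme):

```python
import math

# Differences d_i = reported value - assigned value over five
# hypothetical proficiency-test rounds for comparable test items.
d = [0.4, -0.2, 0.1, 0.3, -0.1]

mean_d = sum(d) / len(d)  # average deviation (indicates a systematic effect)
s_d = math.sqrt(sum((x - mean_d) ** 2 for x in d) / (len(d) - 1))
```

s_d then characterises the parts of the procedure exercised by the scheme; a non-negligible mean_d would additionally have to be treated as a systematic deviation (see 6.5.3).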
6.5.3 Systematic deviation from traceable assigned values and any other sources of
uncertainty (such as those noted in connection with the use of interlaboratory study
data obtained in accordance with ISO 5725) must also be taken into account.
6.5.4 It is recognised that the above approach is relatively restricted. Recent guidance from
EUROLAB [14] suggests that proficiency testing data may have wider applicability in
providing a preliminary estimate of uncertainty in some circumstances.
- The relative sizes of the largest and the smaller contributions. For example, a
contribution that is one fifth of the largest contribution will contribute at most 2% of
the combined standard uncertainty;
- The effect on the reported uncertainty. It is imprudent to make approximations that
materially affect the reported uncertainty or the interpretation of the result;
- The degree of rigour justified for the uncertainty evaluation, taking into account the
client and regulatory and other external requirements identified, for example,
during contract review.
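The 2% figure quoted above follows directly from quadrature combination and can be checked numerically (illustrative values only):

```python
import math

# Combine the largest contribution u1 in quadrature with a second
# contribution u2 that is one fifth of its size.
u1 = 1.0
u2 = u1 / 5
uc = math.sqrt(u1 ** 2 + u2 ** 2)   # sqrt(1.04), approximately 1.0198
increase = (uc - u1) / u1           # fractional increase over u1 alone
```

The combined value exceeds u1 by about 1.98%, which is why contributions much smaller than the largest one can often be neglected.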
Where the conditions above are met, and the method is operated within its scope and
field of application, it is normally acceptable to apply the data from prior studies
(including validation studies) directly to uncertainty evaluations in the laboratory in
question.
For methods operating within their defined scope, when the reconciliation stage shows
that all the identified sources have been included in the validation study or when the
contributions from any remaining sources have been shown to be negligible, the
reproducibility standard deviation sR may be used as the combined standard
uncertainty.
If there are any significant sources of uncertainty that are not included in the validation
study their contribution is evaluated separately and combined with sR to obtain the
overall uncertainty.
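The combination described here is a root sum of squares of sR with the separately evaluated contributions. A minimal sketch follows (Python; the function name and all numerical values are illustrative assumptions):

```python
import math

def combine_with_sR(sR, extra=()):
    """Combine the reproducibility standard deviation sR with standard
    uncertainties from sources not covered by the validation study."""
    return math.sqrt(sR ** 2 + sum(u ** 2 for u in extra))

# sR from the validation study plus two additional contributions:
u_overall = combine_with_sR(0.12, extra=[0.05, 0.03])
```

When the extra list is empty this reduces to sR itself, matching the case where the validation study already covers all identified sources.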
7.1 Once the expanded uncertainty has been calculated for a specified level of confidence
(typically 95%), the test result y and the expanded uncertainty U should be reported as
y ± U and accompanied by a statement of confidence. This statement will depend on
the nature of the probability distribution; some examples are presented below.
All clauses below that relate to a 95% level of confidence require modification if a
different level of confidence is required.
7.1.2 t-distribution
The t-distribution may be assumed if the conditions for normality (above) apply but the
degrees of freedom is less than 30. Under these circumstances the following statement
(in which the appropriate numerical values are substituted for XX and YY) can be
made:
7.2 For the purposes of this document the term approximately is interpreted
as meaning effectively or for most practical purposes.
7.3 Reference should also be made to the method by which the uncertainties
have been evaluated.
7.4 In some testing situations it may not be possible to evaluate a metrologically sound
numerical value for each component of uncertainty; in such circumstances the means
of reporting should be such that this is clear. For example, if the uncertainty is based
only on repeatability, without consideration of other factors, then this should be
stated.
7.5 Unless sampling uncertainty has been fully taken into account, it should also be made
clear that the result and the associated uncertainty apply to the tested sample only and
do not apply to any batch from which the sample may have been taken.
7.6 The number of decimal digits in a reported uncertainty should always reflect practical
measurement capability. In view of the process for evaluating uncertainties, it is rarely
justified to report more than two significant digits. Often a single significant digit is
appropriate. Similarly, the numerical value of the result should be rounded so that the
last decimal digit corresponds to the last digit of the uncertainty. The normal rules of
rounding can be applied in both cases.
For example, if a result of 123.456 units is obtained, and an uncertainty of 2.27 units
has resulted from the evaluation, the use of two significant digits would give the
rounded values 123.5 units ± 2.3 units.
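The rounding rule of 7.6 can be sketched as a small helper (Python; the function name and the two-digit default are illustrative, not prescribed by this guide):

```python
import math

def round_result(y, U, sig=2):
    """Round the expanded uncertainty U to `sig` significant digits and
    the result y to the same last decimal place (cf. 7.6)."""
    if U == 0:
        return y, U
    # Decimal exponent of the last significant digit of U:
    exponent = math.floor(math.log10(abs(U))) - (sig - 1)
    U_rounded = round(U / 10 ** exponent) * 10 ** exponent
    decimals = max(-exponent, 0)
    return round(y, decimals), round(U_rounded, decimals)

result, uncertainty = round_result(123.456, 2.27)
```

Applied to the example above, 123.456 units with U = 2.27 units becomes 123.5 units ± 2.3 units.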
7.7 The test result can usually be expressed as y ± U. However, there may be situations
where the upper and lower bounds are different; for example if cosine errors are
involved. If such differences are small then the most practical approach is to report the
expanded uncertainty as the larger of the two. However, if there is a significant
difference between the upper and lower values they should be evaluated and reported
separately. This may be achieved, for example, by determining the shortest coverage
interval at the desired level of confidence in the PDF for the measurand.
For example, for an uncertainty of +6.5 units and −6.7 units, for practical purposes
±6.7 units could simply be stated. However, if the values were +6.5 units and −9.8
units they should be separated, e.g. +6.5 units; −9.8 units.
This aspect has to be taken into account when implementing ISO/IEC 17025.
Laboratories cannot in general be expected to initiate scientific research to assess the
uncertainties associated with their measurements and tests. The respective
requirements of the accreditation bodies should be adapted according to the current
state of knowledge in the respective testing field.
- list those quantities and parameters that are expected to have a significant
influence on the uncertainty and estimate their contribution to the overall
uncertainty;
- use data concerning repeatability or reproducibility that might be available from
validation, internal quality assurance or interlaboratory comparisons;
- refer to data or procedures given in the relevant testing standards;
- combine the approaches mentioned above.
- recent data from internal quality assurance in order to broaden the statistical
basis for the uncertainty evaluation;
- new data from the participation in interlaboratory comparisons or proficiency tests;
- revisions of the relevant standards;
- specific guidance documents for the respective testing field.
There are several advantages linked with the evaluation of measurement uncertainty in
testing, although the task can be time-consuming.
- Calibration costs can be reduced if it can be shown from the evaluation that
particular influence quantities do not substantially contribute to the uncertainty.
10 REFERENCES
[1] Guide to the Expression of Uncertainty in Measurement. BIPM, IEC, IFCC, ISO,
IUPAC, IUPAP, OIML. International Organization for Standardization, Printed in
Switzerland, ISBN 92-67-10188-9, First Edition, 1993. Corrected and reprinted 1995.
[3] ISO/IEC Guide 2:1996, Standardization and related activities - General vocabulary
[5] ISO/IEC 3534-1:1994, Statistics - Vocabulary and symbols Part 1: Probability and
general statistical terms
[6] ISO/IEC 3534-2:1994, Statistics - Vocabulary and symbols Part 2: Statistical quality
control
[7] ISO/IEC 17025:1999, General requirements for the competence of testing and
calibration laboratories
[8] ISO 5725:1994, Accuracy (trueness and precision) of measurement methods
and results
[9] ISO/TS 21748:2002, Guide to the use of repeatability, reproducibility and trueness
estimates in measurement uncertainty evaluation
[10] EA-3/04, Use of Proficiency Testing as a Tool for Accreditation in Testing (with
EUROLAB and EURACHEM) Aug 2001
[13] EURACHEM, The Fitness for Purpose of Analytical Methods (ISBN 0-948926-12-0),
1998
11 BIBLIOGRAPHY
12 APPENDIX