Guidelines for the Validation and Verification of Chemical Test Methods
Technical Note 17—Guidelines for the validation and verification of chemical test methods
determined under such conditions. Test materials should be typical of samples normally analysed. Sample preparation should be consistent with normal practice and variations in reagents, test equipment, analysts and instrumentation should be representative of those normally encountered.
Precision may vary with analyte concentration. This should be investigated if analyte concentration is expected to vary by more than 50% of an average value. For some tests, it may be appropriate to determine precision at only one or two concentrations of particular significance to the users of test data, eg. a production QC specification or regulatory limit.
For single-laboratory validation, the best measure of precision is obtained by replicate analyses of independently prepared test portions of a laboratory sample, CRM or RM, under normal longer term operating conditions. Usually this will involve the determination of intra-laboratory reproducibility as described above.
If data are available from precision experiments carried out on different samples, possibly at different times and there is no significant difference between the variances from each data set, the data may be combined to calculate a pooled standard deviation.

2.4.2 Trueness
The bias of a measurement result may be seen as the combination of the bias of the method itself, laboratory bias and the bias attributable to a particular analytical run.
A reference material, containing an analyte of known concentration may be used to estimate the bias of a test result. If bias is not determined in each run, an estimate of the average bias is best achieved by comparing test results, obtained in different runs over several days, with the known value. Reference materials should match the matrices and analytes of the samples to be tested by the method.
Certified Reference Materials (CRMs)
CRMs contain measurands with assigned values, traceable to international standards with stated uncertainties. When CRMs are available to match the matrices and values of laboratory samples, they present the best option for estimating bias. Ideally, several CRMs with appropriate matrices and analyte concentrations should be measured. However, for most test methods, suitable CRMs are not available, and alternatives are necessarily employed to estimate bias.
Certified reference materials are also used to establish the traceability of calibrations.
Reference Materials (RMs)
If CRMs are not available, other reference materials may be used to estimate bias, provided they are matrix matched with the samples to be tested and sufficiently characterised with respect to the analytes of interest. Materials characterised by restricted collaborative testing may be suitable for the purpose. Laboratories may use RMs characterised against CRMs for routine quality control as an acceptable, cost-effective alternative to the regular analysis of CRMs.
Spiked samples
If neither suitable CRMs nor RMs are available, bias may be investigated by the analysis of spiked samples, i.e. samples to which a known concentration of analyte has been added. For some tests, eg. pesticide residue analysis, laboratories may be able to spike samples that have been determined not to contain detectable residues of the analyte(s) of interest. However, for many tests, it will be necessary to spike samples that contain natural concentrations of analyte(s).
In such cases, bias is estimated from the difference between results obtained for analysis of the sample in its spiked and original states. Caution is advised when evaluating bias from the analysis of spiked samples since the recovery may be better for spiked analyte compared to ‘native’ analyte, or incurred residues/contaminants. For example, whilst spiking drinking water with fluoride would allow a reliable estimate of recovery the same may not be true for spiking a soil with organochlorine pesticides. This is largely due to different extraction efficiencies for ‘added’ and ‘native’ analytes. If possible, spiked recovery data should be substantiated by some means; for example, participation in Proficiency Trials involving natural samples or samples with incurred residues/contamination.
In some cases, laboratories will have to rely solely on spiked recovery data to estimate bias. In such instances, it should be noted that while a 100% recovery does not necessarily indicate trueness, a poor recovery definitely indicates bias, albeit a possible underestimate of the total bias.
Reference methods
A reference method with a known bias may be used to investigate the bias of another method. Typical samples covering the range of matrices and analyte concentrations relevant to proposed testing programs are analysed by both methods. The significance of the bias of the test method may be estimated by statistical analysis (a t-test) of the results obtained.

2.5 Limit of detection and limit of quantitation
The limit of detection (LOD) of a method is the smallest amount or concentration of an analyte that can be reliably distinguished from zero. In other words, the LOD is the lowest value measured by a method that is greater than the uncertainty associated with it. (Taylor, 1989)
It is a NATA requirement that trace organic analytes must be positively identified by an appropriate confirmatory technique. In this context, for trace organic analyses, the LOD is the smallest amount or concentration that can be readily distinguished from zero and be positively identified according to predetermined criteria and/or levels of confidence.
The limit of quantitation (LOQ) of a method is often defined as the lowest concentration of analyte that can be determined with an acceptable level of uncertainty. Various conventions have been applied to estimating the LOQ. Perhaps the most common recommendation is to quote the LOQ as 3 times the LOD.
There is no need to estimate the LOD or LOQ for methods that will always be applied to determine analyte concentrations much greater than the LOQ. However, the estimates often have great importance for trace and ultra-trace methods where concentrations of concern are often close to the LOD or LOQ and results reported as ‘not detected’ may nevertheless have significant impact on risk assessments or regulatory decisions.
The LOD of a method should not be confused with the lowest instrumental response. The use of a signal to noise ratio for an analytical standard introduced to an instrument is a useful indicator of instrument performance but an inappropriate means of estimating the LOD of a method.
In order to estimate the LOD of a method, analyses should be performed on samples, including all steps of the analytical procedure. The LOD may be determined by analysing 7 replicate samples at each of 3 concentrations, the lowest concentration
being reasonably close to zero. A plot of standard deviation vs concentration is then extrapolated to estimate the standard deviation at zero concentration (s0). The LOD of the method is taken as 3s0, which gives 95% confidence that the method would detect an analyte present in a sample at that concentration.
Alternatively, 7 replicate analyses may be performed at a single concentration equal to about twice the LOQ. (The analyst will need to apply informed judgement in selecting the appropriate concentration). In such circumstances, the standard deviation of these replicates can be assumed to approximate s0, and the LOD may be calculated as described above.

2.6 Range
The working range of a method is defined as the concentration range within which results will have an acceptable level of uncertainty. In terms of the parameters discussed above, this could be taken to equate to the concentration range between the LOQ and the upper limit of the linear calibration. In practice, acceptable uncertainties may be achieved at concentrations greater than this upper limit (beyond the extent of the determined linear range). However, it is more prudent to consider the validated range, i.e. the range between the LOQ and the highest concentration studied during validation.

2.7 Ruggedness
The ruggedness (a measure of robustness) of a method is the degree to which results are unaffected by minor changes from the experimental conditions described in the procedure; for example, small changes in temperature, pH, reagent concentration, flow rates, extraction times, composition of mobile phase. Ruggedness is investigated by measuring the effects on the results of small, planned changes to the method conditions. In some cases, information may be available from studies conducted during in-house method development. Intra-laboratory reproducibility investigations, by their nature, take into account some aspects of a method’s ruggedness.
The simplest tests of ruggedness consider only one method variable at a time. Youden and Steiner (1975) describe a Plackett-Burman designed experiment that provides an economical and efficient approach whereby seven variables are evaluated by conducting only eight analyses. Both approaches assume independence of effects.
In practice, an experienced analyst will be able to identify those method parameters with the potential to affect results and introduce controls, eg. specified limits for temperature, time or pH ranges, to guard against such effects.

2.8 Measurement Uncertainty
Measurement Uncertainty (MU) is defined as ‘a parameter, associated with the result of a measurement, that characterises the dispersion of the values that could reasonably be attributed to the measurand’ (ISO, 1993). Knowledge of MU is necessary for the effective comparison of measurements and for comparison of measurements with specification limits. Furthermore, MU is inextricably linked to traceability. Test results must be traceable to stated references, usually national or international standards, through an unbroken chain of comparisons, all having stated uncertainties (ISO, 1993; Eurachem/CITAC, 2003). The ISO/IEC 17025 Standard requires that laboratories estimate MU for their non-standard analytical methods and, where applicable, report the MU associated with results. Therefore the estimation of MU is an essential requirement of method validation.
Numerous references are available that present different approaches for the estimation of MU. ISO has published guidelines on the estimation of MU (ISO, 1995) and Eurachem/CITAC has published an interpretation of how they may be applied to analytical measurements (Eurachem/CITAC, 2000). These documents have now been supplemented by guidelines and examples from a number of other sources (ILAC, 2002; APLAC, 2003; UKAS, 2000; ISO/TS, 2004; Magnusson et al., 2003) aiming to provide laboratories with more practical examples and simpler approaches which may be used to calculate reasonable estimates of MU. Excellent examples are also available from the website www.measurementuncertainty.org/
Technical Note 33 entitled Guidelines for estimating and reporting measurement uncertainty of chemical test results is available on the NATA website (www.nata.asn.au). The site also provides some worked examples as well as links to other informative web sites.
The information gained from other aspects of method validation, as described above, will be sufficient to produce a reasonable estimate of MU. These data can be supplemented with data from regular QC checks once the method is operational and data resulting from participation in relevant Proficiency Trials. Estimates may also be based on, or partly based on, published data and professional judgement. As with all aspects of method validation, estimates of measurement uncertainty should be fit-for-purpose. The required rigour for estimates will vary according to the rationale for testing; the principle being that estimates should be reasonable for the intended purpose. A reasonable estimate of MU may be obtained from consideration of long-term precision (intra-laboratory reproducibility) and bias. In some instances, other significant contributors to MU, eg. purity of standards, which may not be covered by these parameters may need to be included in the estimation.
It is recommended that a test result and its associated MU be expressed in the same units (Eurachem/CITAC, 2000). It is desirable to estimate MU at the values most important to the users of the results produced by the method e.g. critical concentrations such as a QC specification or regulatory limit.
NATA’s current policies with respect to MU require laboratories to provide examples of MU estimates using a documented procedure. The procedure may cite literature references and include alternative approaches. Examples of MU estimates that are submitted for review should include:
• A brief description of the method including the formulae used to calculate results
• Consideration of the approach to be used for estimating MU
• Consideration of the possible contributors to MU (a ‘fishbone’ cause and effect diagram may help)
• Step-wise calculations for estimating each contribution to MU as per the chosen approach
• Argument for disregarding the uncertainty associated with any test parameter originally identified as a potential contributor to uncertainty
• The equation used for combining standard uncertainties
• The calculation of expanded uncertainty
• An example to show how results would be reported
• A reality check; i.e. does the estimate make sense, based on the laboratory’s experience or other relevant information?
The example should be descriptive enough to allow an independent reviewer to easily follow the process. Values
calculated by spreadsheets should be accompanied by a description of how they were derived. Generally a spreadsheet alone will not suffice.
Any judgemental decisions based on experience should be briefly justified. Estimates based solely on precision need to be supported by evidence to demonstrate that precision is the only significant contributor to MU; and in particular that bias, and the uncertainty associated with an estimate of bias, are not significant contributors to the uncertainty of results.

3. Verification of previously validated methods
Methods published by organisations such as Standards Australia, ASTM, USEPA, APHA and IP have already been subject to validation by collaborative studies and found to be fit for purpose for the scope defined by the method. Therefore, the rigour of testing required to introduce such a method into a laboratory is less than that required to validate an in-house method. Essentially the laboratory only needs to verify that their operators using their equipment in their laboratory environment can apply the method satisfactorily. Full validation is required if a laboratory has reason to significantly modify a standard method, for example, use a different extraction solvent or use HPLC instead of GLC for determination.
Additional validation should be considered if the validation data for a standard method is not available to the laboratory or the laboratory needs to apply specifications more stringent than those for which the standard method has been validated. Minor modifications to previously validated in-house methods, for example, using the same type of chromatographic column from a different manufacturer, should also be verified.
The key parameters to consider in the verification process will depend on the nature of the method and the range of sample matrices likely to be encountered. The determination of bias and precision are minimum requirements. Ideally the laboratory will be able to demonstrate performance in line with method specifications. If not, judgement should be exercised to determine whether the method can be applied to generate test results fit for purpose.
For trace analyses the laboratory should also confirm that the achievable LOD and LOQ are fit for purpose.

4. Summary
Table 1 (below) summarises the parameters that need consideration when planning method validation and method verification investigations. The table also includes brief notes on how each performance characteristic may be determined and the
Sensitivity | Analysis of spiked samples or standards prepared in sample extract solution | Initial check for satisfactory gradient for plot of response vs concentration. (More appropriately a QC issue following initial check)
Selectivity | Consideration of potential interferences, analysis of samples spiked with possible interferents. (Method Development may have overcome potential issues) | If required, one-off tests should suffice
 | Sample Spikes; Comparison with Standard Methods; Results from Collaborative Studies |
Precision; intra-laboratory reproducibility | Replicate analysis of samples; if possible selected to contain analytes at concentrations most relevant to users of test results | At least 7 replicates for each matrix
Limit of Detection. Limit of Quantitation | Analysis of samples containing low concentrations of analytes. Note: The determination of LOD and LOQ is normally only required for methods intended to determine analytes at about these concentrations | At least 7 replicates at each of 3 concentrations including a concentration close to zero (graphical method), or at least 7 replicates at a concentration estimated to be equal to twice the LOQ (statistical method). Separate determinations may be required for different matrices
Working Range | Evaluation of data from bias and possibly LOQ determinations |
Ruggedness | Consider those steps of the method which if varied marginally, would possibly affect results. | Introduce appropriate limits to method parameters likely to impact results if not carefully controlled
 | Investigate if necessary: (i) single variable test | Test and re-test with small change to one method parameter
 | Investigate if necessary: (ii) multi variable test | Plackett-Burman designed experiment. (Ref, Youden and Steiner, 1975)
Measurement Uncertainty | Utilise other validation data, combined with any other complementary data available, eg. results from collaborative studies, proficiency tests, round-robin tests, in-house QC data | Calculate a reasonable, fit-for-purpose estimate of MU. Ensure estimates are aligned with the concentration(s) most relevant to the users of results
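The graphical LOD method listed in the table (at least 7 replicates at each of 3 concentrations, described in Section 2.5) might be sketched as follows. This is only an illustration under stated assumptions: the note says the plot of standard deviation vs concentration is "extrapolated" without naming a technique, so the straight-line least-squares fit used here is an assumption, and the function names and replicate values are hypothetical. The factors of 3 follow the conventions quoted in Section 2.5 (LOD taken as 3s0, LOQ quoted as 3 times the LOD).

```python
from statistics import mean, stdev


def estimate_s0(replicates_by_conc):
    """Estimate s0, the standard deviation at zero concentration.

    replicates_by_conc maps each spiked concentration to its replicate
    results. A straight line is fitted to SD vs concentration by ordinary
    least squares (an assumed choice of extrapolation) and the intercept
    is returned.
    """
    concs = sorted(replicates_by_conc)
    sds = [stdev(replicates_by_conc[c]) for c in concs]
    x_bar, y_bar = mean(concs), mean(sds)
    sxx = sum((x - x_bar) ** 2 for x in concs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(concs, sds))
    slope = sxy / sxx
    s0 = y_bar - slope * x_bar
    if s0 <= 0:
        # A non-positive intercept means the extrapolation is not usable.
        raise ValueError("extrapolated s0 is not positive; review the data")
    return s0


def lod_loq(s0):
    """Return (LOD, LOQ) with LOD = 3 * s0 and LOQ = 3 * LOD."""
    lod = 3.0 * s0
    return lod, 3.0 * lod


# Hypothetical data: 7 replicates at each of 3 low concentrations.
data = {
    0.5: [0.48, 0.53, 0.46, 0.51, 0.55, 0.47, 0.50],
    1.0: [0.95, 1.04, 1.01, 0.97, 1.06, 0.99, 0.98],
    2.0: [1.93, 2.05, 2.08, 1.96, 2.02, 1.94, 2.02],
}
s0 = estimate_s0(data)
lod, loq = lod_loq(s0)
print(f"s0 = {s0:.4f}, LOD = {lod:.4f}, LOQ = {loq:.4f}")
```

For the single-concentration alternative, the standard deviation of the 7 replicates would be passed to `lod_loq` directly in place of the extrapolated s0.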
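The multi-variable ruggedness test in the table (seven variables evaluated in eight analyses, Section 2.7) can be illustrated with a small sketch. The design below is built from a full two-level factorial in three factors, with the other four columns formed as products. This is one common way to obtain an orthogonal eight-run design and is an assumption here: the row order and signs may differ from the layout printed in Youden and Steiner (1975). The results are hypothetical. Each effect is the mean result at the altered level minus the mean at the nominal level, which relies on the independence of effects noted in Section 2.7.

```python
from itertools import product


def eight_run_design():
    """Return 8 runs x 7 factors; +1 is the nominal level, -1 the altered one."""
    runs = []
    for a, b, c in product((1, -1), repeat=3):
        # Columns 4 to 7 are products of the first three, which keeps
        # every pair of columns balanced and orthogonal.
        runs.append((a, b, c, a * b, a * c, b * c, a * b * c))
    return runs


def factor_effects(design, results):
    """Effect of each factor: mean(result at +1) - mean(result at -1)."""
    effects = []
    for j in range(len(design[0])):
        high = [r for row, r in zip(design, results) if row[j] == 1]
        low = [r for row, r in zip(design, results) if row[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


design = eight_run_design()
# Hypothetical results (e.g. % recovery) for the eight analyses, in run order.
results = [98.2, 97.9, 98.5, 98.1, 95.4, 95.9, 95.6, 95.2]
for name, effect in zip("ABCDEFG", factor_effects(design, results)):
    print(f"factor {name}: effect = {effect:+.2f}")
```

An effect that is large compared with the method's intra-laboratory standard deviation points to a parameter that needs a specified control limit.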
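For the Measurement Uncertainty row, Section 2.8 notes that a reasonable estimate may be obtained from long-term precision and bias, and asks for the equation used to combine standard uncertainties and for the calculation of expanded uncertainty. The sketch below is one possible reading, not a procedure prescribed by this note. It assumes independent components combined by root-sum-of-squares and a coverage factor k = 2, both taken from general GUM-style practice. The pooled standard deviation follows the description in Section 2.4.1. All names and values are hypothetical.

```python
from math import sqrt
from statistics import variance


def pooled_sd(datasets):
    """Pooled standard deviation of several replicate data sets.

    Only appropriate when the variances of the sets do not differ
    significantly, as Section 2.4.1 requires.
    """
    num = sum((len(d) - 1) * variance(d) for d in datasets)
    den = sum(len(d) - 1 for d in datasets)
    return sqrt(num / den)


def combined_uncertainty(*components):
    """Root-sum-of-squares of standard uncertainties (assumed independent)."""
    return sqrt(sum(u ** 2 for u in components))


def expanded_uncertainty(u_c, k=2.0):
    """Expanded uncertainty U = k * u_c; k = 2 is an assumed coverage factor."""
    return k * u_c


# Hypothetical long-term QC results (mg/kg) from three separate runs.
qc_runs = [
    [10.1, 9.8, 10.3, 10.0],
    [9.9, 10.2, 10.4, 9.7],
    [10.0, 10.2, 9.6, 10.1],
]
u_precision = pooled_sd(qc_runs)
u_bias = 0.15  # hypothetical standard uncertainty of the bias estimate
u_c = combined_uncertainty(u_precision, u_bias)
print(f"u_c = {u_c:.3f} mg/kg, U (k=2) = {expanded_uncertainty(u_c):.3f} mg/kg")
```

All components must be in the same units as the test result before they are combined, in line with the recommendation in Section 2.8.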
recommended minimum number of replicate tests required for each determination.
Not all parameters need to be assessed for all methods. The rigour of validation should be sufficient to ensure that test results produced by a method are technically sound and will satisfy the client’s needs. Well-planned method validation studies will be based on a clear understanding of the specific requirements for the method in use. Within this framework, carefully designed experiments will provide information to satisfy more than one of the parameters in Table 1. For example, information on precision and bias could be obtained from replicate analysis of a CRM, and precision data would also be generated via the determination of the LOD.
It is good practice for laboratories to keep comprehensive records of method validation, including the procedures used for validation, the results obtained and a statement as to whether the method is fit for its intended use.

5. References
APLAC (2003) Interpretation and Guidance on the Estimation of Uncertainty of Measurement in Testing, APLAC TC 005, APLAC, March 2003, www.ianz.govt.nz/aplac.
AS 2850:1996 Chemical Analysis - Interlaboratory Test Programs for Determining Precision of Analytical Method(s) - Guide to the Planning and Conduct.
Eurachem/CITAC (2000) Quantifying Uncertainty in Analytical Measurement, Eurachem/CITAC Guide CG4, 2nd Edition, 2000, ISBN 0-948926-15-5, www.eurachem.ul.pt.
Eurachem/CITAC (2003) Traceability in Chemical Measurement, Eurachem/CITAC Guide, 2003, www.eurachem.ul.pt.
Hibbert (2004) Method Validation, in Encyclopedia of Analytical Science, 2nd Edition, Quality Assurance, Elsevier Ltd. 2004 (in print)
ILAC (2002) Introducing the Concept of Uncertainty of Measurement in Testing in Association with the Application of the Standard ISO/IEC 17025, ILAC G17, 2002, www.ilac.org.
ISO (1993) International Vocabulary of Basic and General Terms in Metrology, 2nd Edition, ISO, Geneva, 1993, ISBN 92-67-01075-1.
ISO (1995) Guide to the Expression of Uncertainty in Measurement, 1st Edition, ISO, Geneva, 1995, ISBN 92-67-10188-9.
ISO/TS (2004) Guidance for the Use of Repeatability, Reproducibility and Trueness Estimates in Measurement Uncertainty Estimation, ISO/TS 21748:2004.
LGC (2003) In-House Method Validation: A Guide for Chemical Laboratories, Laboratory of the Government Chemist (LGC), UK, 2003.
B. Magnusson, T. Naykki, H. Hovind and M. Krysell, Handbook for Calculation of Measurement Uncertainty in Environmental Laboratories, NORDTEST Report TR537, 2003.
J. C. Miller and J. N. Miller, Statistics and Chemometrics for Analytical Chemistry, 4th Edition, Prentice Hall, 2000, ISBN 0-13-022888-5, and/or VAMSTAT II Statistics Training for Valid Analytical Measurements CD-ROM, www.vam.org.uk.
M. I. Mulholland and D. B. Hibbert, Linearity and the Limitation of Least Squares Calibration, J. Chromatography A, 762, 73-82, 1997.
NATA (2006) Guidelines for estimating and reporting measurement uncertainty of chemical test results, Technical Note No. 33, June 2006.
Taylor, J.K., Quality Assurance of Chemical Measurements, sixth edition, Lewis Publishers, ISBN 0-87371-097-5, 1989, page 79.
Thompson M., Ellison S.L.R. and Wood R., Harmonised Guidelines for Single-Laboratory Validation of Methods of Analysis (IUPAC Technical Report), Pure Appl. Chem., 74(5), 835-855, 2002.
UKAS (2000) The Expression of Uncertainty in Testing, UKAS LAB12, 1st Edition, October 2000, www.ukas.com.
W. J. Youden and E. H. Steiner, Statistical Manual of the Association of Official Analytical Chemists, AOAC, 1975, ISBN 0-935584-15-3.

6. Additional reading
F. M. Garfield, E. Klestan and J. Hirsch, Quality Assurance Principles for Analytical Laboratories, 3rd Edition, AOAC International, 2000, ISBN-0-935584-70-6.
Validation of Analytical Methods for Food Control, Report of a Joint FAO/IAEA Expert Consultation, December 1997, FAO Food and Nutrition Paper No. 68, FAO, Rome (1998).
M. Thompson and R. Wood. Harmonised guidelines for internal quality control in analytical chemistry laboratories, Pure Appl. Chem. 67 (4), 49-56 (1995).
M. Thompson, S. Ellison, A. Fajgelj, P. Willetts, R. Wood. Harmonised guidelines for the use of recovery information in analytical measurement, Pure Appl. Chem. 71 (2), 337-348 (1999).
Validation of Chemical Analytical Methods, NMKL Secretariat, Finland, NMKL Procedure No. 4, (1996).
EURACHEM Guide: The fitness for purpose of analytical methods. A Laboratory Guide to method validation and related topics, LGC, Teddington 1996. Also available from the EURACHEM Secretariat and website.