What Are Repeatability and Reproducibility, Part 3 Their Meaning in Gage R&R Methodology, ASTM Data Points, July-August 2009
July/August 2009
by Stephen Luko
A: Gage r&R methodology was developed in the 1960s to address the estimation of
measurement system variation as applied to manufacturing. The automotive industry led
in developing and applying this technique; today, gage r&R is a standard practice in many
Here, we compare and contrast the use of repeatability (lowercase r) and reproducibility
(uppercase R) in traditional manufacturing with their use in ASTM International
standards. The most important difference is that the latter probably includes far more
applications to raw material testing, while the former probably includes more applications
in metal fabrication, molding and machining, assembly of subsystems and other
manufacturing and fabrication-type gaging. This distinction is important because
materials-type tests are often subject to more sources of variation than industrial
For ASTM materials testing, the key terms for r&R are repeatability conditions and
reproducibility conditions. ASTM E177, Practice for Use of the Terms Precision and Bias
in ASTM Test Methods, defines repeatability conditions as “conditions where independent
test results are obtained with the same method on identical test items in the same
laboratory by the same operator using the same equipment within short intervals of time.”
To adapt this to gage r&R methodology, we interpret test results simply as gage
measurements and laboratory as facility. The reason is that the term test result is more
general than measurement, and facility implies that the measurements are made in one
location — not necessarily a lab. With these distinctions, there is no difference between r
as used in ASTM or in manufacturing.
$$y_{ijk} = x_i + \alpha_j + \varepsilon_{ijk} \qquad (1)$$

In Equation 1, $y_{ijk}$ is the kth repeat measurement of the ith part by the jth operator. The
$x_i$ component is the true value of the ith part dimension, the $\alpha_j$ component is the
reproducibility effect associated with operator j, and $\varepsilon_{ijk}$ is the random repeatability
error that occurs with each measurement. Each measurement, y, is the sum of these
three components. The reproducibility term ($\alpha_j$) may be thought of as a kind of personal
bias associated with an operator; that is, each operator measures the various parts somewhat
differently from the true value x, and this is the individual's effect. When we use several
operators in a gage r&R study, we effectively pick a random sample of operators from a
potentially infinite universe of all such possible operators. The $\alpha_j$ terms are assumed to
have a mean of 0 and an unknown variance $\sigma_\alpha^2$.
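Assuming the three components are independent, the variance of the measurements, y, equals
the sum of the individual variance components, as in Equation 2:

$$\sigma_y^2 = \sigma_x^2 + \sigma_\alpha^2 + \sigma_\varepsilon^2 \qquad (2)$$

Here $\sigma_x^2$ is the part-to-part variance, $\sigma_\alpha^2$ is the reproducibility (operator)
variance and $\sigma_\varepsilon^2$ is the repeatability variance.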
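As a rough numerical check of the relationship in Equation 2, the following Python sketch simulates measurements as the sum of three independent components and compares the sample variance of y with the sum of the component variances. The standard deviations, the number of draws and the names `sd_part`, `sd_operator` and `sd_repeat` are illustrative assumptions, not values from a real study, and normal distributions are assumed only for convenience.

```python
import random

def simulate_total_variance(sd_part, sd_operator, sd_repeat, n_draws=100_000, seed=1):
    """Simulate y = part value + operator effect + repeatability error.

    Each draw uses a fresh, independent part, operator and error term, so the
    sample variance of y should approach the sum of the three variances.
    All inputs are hypothetical illustration values.
    """
    rng = random.Random(seed)
    y = [
        rng.gauss(10.0, sd_part)        # true part value (mean 10.0 is arbitrary)
        + rng.gauss(0.0, sd_operator)   # operator (reproducibility) effect, mean 0
        + rng.gauss(0.0, sd_repeat)     # repeatability error, mean 0
        for _ in range(n_draws)
    ]
    mean_y = sum(y) / len(y)
    return sum((v - mean_y) ** 2 for v in y) / len(y)

sd_part, sd_operator, sd_repeat = 2.0, 0.5, 0.3
expected = sd_part**2 + sd_operator**2 + sd_repeat**2
observed = simulate_total_variance(sd_part, sd_operator, sd_repeat)
print(f"sum of component variances: {expected:.3f}")
print(f"simulated variance of y:    {observed:.3f}")
```

With a large number of draws the two printed values should agree closely; the small remaining gap is sampling error.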
A gage r&R study is a designed experiment used to estimate the individual variance
components. Typically, the reproducibility variance $\sigma_\alpha^2$ and the repeatability
variance $\sigma_\varepsilon^2$ are the main components of interest. The method based on sample
ranges has remained popular for many years, particularly for small to modest
sample sizes. Today, many computer packages will perform gage r&R using
the analysis of variance (ANOVA) technique as well as the range technique. The
following simple illustrations exemplify the method where sample ranges are used.
Five repeat measurements of a cylindrical shaft diameter were made by a single appraiser
using the same measurement system under the same conditions. The resulting data were:
3.158, 3.157, 3.161, 3.165 and 3.151. The range of the five measurements is
$R = 3.165 - 3.151 = 0.014$. This is converted into an estimate of the repeatability standard
deviation by dividing by a constant, $d_2$, which for five measurements is 2.326. The resulting
estimate of the repeatability standard deviation is
$\hat{\sigma}_\varepsilon = 0.014/2.326 \approx 0.0060$. For several data sets, use the average
range in this calculation.
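The single-range calculation can be sketched in Python as follows. The function name `repeatability_sd_from_range` and the small `D2` lookup are illustrative assumptions; the values stored are the commonly tabulated $d_2$ constants for two through five measurements, and the range is computed directly from the five listed measurements.

```python
# Commonly tabulated d2 constants, keyed by the number of repeat measurements.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def repeatability_sd_from_range(measurements):
    """Estimate the repeatability standard deviation as range / d2."""
    n = len(measurements)
    if n not in D2:
        raise ValueError(f"no d2 constant stored for {n} measurements")
    sample_range = max(measurements) - min(measurements)
    return sample_range / D2[n]

# The five shaft-diameter measurements listed in the text.
shaft = [3.158, 3.157, 3.161, 3.165, 3.151]
sample_range = max(shaft) - min(shaft)
sd_hat = repeatability_sd_from_range(shaft)
print(f"range = {sample_range:.3f}, estimated sd = {sd_hat:.4f}")
```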
More generally, a gage r&R experiment has p appraisers, n parts and m repeats. One
standard plan uses p = 3, n = 10 and m = 3, for a total of 90 observations. Suppose
we have performed such an experiment. The 90 measurements comprise np = 30 sets of
repeated measurements, and each set of three has a range. Denote the average of these
ranges by $\bar{R}$ and suppose it equals 8.4. For m = 3 repeats and np = 30 sets, the
constant $d_2^*$ is approximately 1.693. Accordingly, the estimate of the repeatability
standard deviation is:

$$\hat{\sigma}_\varepsilon = \bar{R}/d_2^* = 8.4/1.693 \approx 4.96$$
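A minimal Python sketch of the average-range calculation follows. The function name `repeatability_sd_from_sets` is hypothetical, and the data are made up for illustration only: three sets of three readings, repeated ten times to give 30 sets whose average range is 8.4. The constant is passed in by the caller because $d_2^*$ depends on both the set size and the number of sets.

```python
def repeatability_sd_from_sets(sets_of_repeats, d2_star):
    """Average the within-set ranges, then divide by the d2* constant."""
    ranges = [max(s) - min(s) for s in sets_of_repeats]
    average_range = sum(ranges) / len(ranges)
    return average_range / d2_star

# Hypothetical repeat sets (not from the article); ranges are 8.0, 9.0 and 8.2.
base_sets = [
    [100.0, 104.0, 108.0],
    [97.0, 106.0, 101.0],
    [110.0, 101.8, 105.0],
]
example_sets = base_sets * 10  # 30 sets of 3 repeats, average range 8.4

sd_repeat = repeatability_sd_from_sets(example_sets, d2_star=1.693)
print(f"estimated repeatability sd = {sd_repeat:.2f}")
```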
For reproducibility, we need the range of the appraiser averages. For three appraisers, this
is the largest appraiser average minus the smallest. Denote this range as $R_A$ and
suppose it equals 6.89. Using standard formulas (Reference 1), the reproducibility standard
deviation, $\hat{\sigma}_\alpha$, may be calculated as shown. The conversion constant
$d_2^* = 1.912$, appropriate for p = 3 appraisers, is used.

$$\hat{\sigma}_\alpha = \sqrt{(R_A/d_2^*)^2 - \hat{\sigma}_\varepsilon^2/(nm)} = \sqrt{(6.89/1.912)^2 - (8.4/1.693)^2/30} \approx 3.49$$
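The reproducibility step can be sketched in Python as well. This sketch assumes the standard formula is the appraiser-variation expression from the cited AIAG manual, in which the repeatability contribution (the squared repeatability standard deviation divided by the number of parts times the number of repeats) is subtracted before taking the square root. The function name and the three appraiser averages are hypothetical, chosen only so that their range is 6.89.

```python
import math

def reproducibility_sd(appraiser_averages, d2_star, repeatability_sd, n_parts, n_repeats):
    """Range-method estimate of the reproducibility standard deviation.

    Assumes the AIAG-style formula:
        sqrt((R_A / d2*)^2 - repeatability_sd^2 / (n_parts * n_repeats))
    The result is set to 0 if the quantity under the root is negative.
    """
    range_of_averages = max(appraiser_averages) - min(appraiser_averages)
    under_root = (range_of_averages / d2_star) ** 2 - repeatability_sd**2 / (n_parts * n_repeats)
    return math.sqrt(max(under_root, 0.0))

# Hypothetical appraiser averages with a range of 6.89.
averages = [100.00, 103.10, 106.89]
sd_repro = reproducibility_sd(
    averages,
    d2_star=1.912,                 # constant quoted for p = 3 appraisers
    repeatability_sd=8.4 / 1.693,  # repeatability estimate from the average range
    n_parts=10,
    n_repeats=3,
)
print(f"estimated reproducibility sd = {sd_repro:.2f}")
```

Clamping at zero covers the case where repeatability error alone accounts for the spread in the appraiser averages, so no separate operator effect can be estimated.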
1. Measurement Systems Analysis Reference Manual, 3rd edition, Automotive Industry
Action Group, Southfield, Mich., 2005.
Stephen Luko, Pratt & Whitney Aircraft, is chair of Committee E11 on Quality and
Statistics, and a fellow of ASTM.
Dean Neubauer is the DataPoints column coordinator and E11.90.03 publications chair.