MSA (Measurement System Analysis)
Ion Teohari
Engineer in Electronics and Telecommunications
8 years – System engineer for minicomputers
8 years – Sales Manager - IT
8 years – Country Manager – META Group – IT&C consulting
8 years – International Certified Lean Six Sigma Black Belt
Aurelian Iuscu
Engineer TCM and Economist
3 years – Retail banking
7 years – Projects and organization
3 years – Business processes improvement
3 years – International Certified Lean Six Sigma Black Belt
– How do you know that the data you have used is accurate and precise?
– How do you know if a measurement is repeatable and reproducible?
Whadda ya wanna measure!?!
Revision 2 of 07.02.2015 – Academia TÜV Rheinland România
Poor Measures
Difficult measures
Poor sampling
Producers
Suppliers
Inputs to the measurement process:
– Item to be measured
– Reference measurement
– Operator
– Measurement equipment
– Procedure
– Environment
The item to be measured can be a physical part, document or a scenario for customer service.
Operator can refer to a person, or to different instruments measuring the same products.
Reference is a standard that is used to calibrate the equipment.
Procedure is the method used to perform the test.
Equipment is the device used to measure the product.
Environment is the surroundings where the measures are performed.
Whenever you measure anything, the variation that you observe can be
segmented into the following components…
Observed Variation breaks down into Precision and Accuracy.
All measurement systems have error. If you don’t know how much of the variation you
observe is contributed by your measurement system, you cannot make confident
decisions.
If you were one speeding ticket away from losing your license,
how fast would you be willing to drive in a school zone?
Repeatability
For example:
Manufacturing: One person measures the purity of multiple
samples of the same vial and gets different purity measures.
Transactional: One person evaluates a contract multiple times
(over a period of time) and makes different determinations of errors.
Reproducibility
For example:
Manufacturing: Different people perform purity test on samples
from the same vial and get different results.
Transactional: Different people evaluate the same contract and
make different determinations.
[Diagram: a centered, on-target distribution (blue) vs. reducing spread (green) – distinguishing process variation from measurement variation.]
Bias
Bias is defined as the deviation of the measured value from the actual
value.
Stability is bias characterized as a function of time.

Linearity is assessed by plotting the bias (y) against the reference value (x) and fitting a straight line:

y = a + b·x, where y is the bias, x is the reference value, a is the intercept and b is the slope.

% Linearity = |slope| × 100
Most industrial measurement systems can be divided into two categories: variable measurement systems and attribute measurement systems. An attribute gage cannot indicate how good or how bad a part is, but only whether the part is accepted or rejected. The most common of these is a Go/No-Go gage.
Attribute:
– Pass/Fail
– Go/No Go
– Document Preparation
– Surface imperfections
– Customer Service Response

Variable:
– Continuous scale
– Discrete scale
– Critical dimensions
– Pull strength
– Warp

Variable MSA
σ_E² = σ_Rpt² + σ_Rpd²

%R&R = 6σ_E / (USL − LSL) × 100

where σ_Rpt² is the repeatability variance, σ_Rpd² is the reproducibility variance and USL − LSL is the tolerance.
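As a quick numeric sketch of the formula above (all variance and tolerance values below are illustrative, not from a real study):

```python
import math

def percent_rr(var_rpt, var_rpd, usl, lsl):
    """%R&R: six gage standard deviations as a share of the tolerance."""
    # sigma_E^2 = sigma_Rpt^2 + sigma_Rpd^2
    sigma_e = math.sqrt(var_rpt + var_rpd)
    return 6 * sigma_e / (usl - lsl) * 100

# Illustrative numbers: repeatability variance 0.0004, reproducibility
# variance 0.0005, tolerance = USL - LSL = 10.2 - 9.8
print(round(percent_rr(0.0004, 0.0005, 10.2, 9.8), 1))  # → 45.0
```

A %R&R this high would flag the gage as unacceptable under the usual AIAG-style thresholds.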
Gage R & R Study
Example: 10 units are measured by 3 people. These units are then randomized
and a second measure on each unit is taken.
Crossed Design
A Crossed Design is used only in non-destructive testing and assumes that all the
parts can be measured multiple times by either operators or multiple machines.
Gives the ability to separate part-to-part Variation from measurement system
Variation.
Assesses Repeatability and Reproducibility.
Assesses the interaction between the operator and the part.
Nested Design
A Nested Design is used for destructive testing and also situations where it is not
possible to have all operators or machines measure all the parts multiple times.
Destructive testing assumes that all the parts within a single batch are identical
enough to claim they are the same.
Nested designs are used to test measurement systems where it is not possible
(or desirable) to send operators with parts to different locations.
Do not include all possible combinations of factors.
Uses a slightly different mathematical model than the Crossed Design.
For extreme cases, a minimum of two appraisers can be used, but this is strongly
discouraged as a less accurate estimate of measurement variation will result.
5. Let appraiser A measure 10 parts in a random order while you record the data, noting the concealed marking. Let appraisers B and C measure the same 10 parts.
Note: Do not allow the appraisers to witness each other performing the
measurement. The reason is the same as why the unit markings are concealed,
TO PREVENT BIAS.
6. Repeat the measurements for all three appraisers, but this time present the
samples to each in a random order different from the original measurements.
This is to again help reduce bias in the measurements.
10 Parts × 3 Trials × 3 Appraisers
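The randomized presentation in steps 5 and 6 can be sketched as follows (a minimal illustration; the part labels and seed are made up):

```python
import random

def run_orders(parts, appraisers, trials, seed=42):
    """Build an independent random presentation order per appraiser per trial."""
    rng = random.Random(seed)
    plan = {}
    for trial in range(1, trials + 1):
        for appraiser in appraisers:
            order = parts[:]
            rng.shuffle(order)  # a fresh random order each time (step 6)
            plan[(appraiser, trial)] = order
    return plan

plan = run_orders([f"P{i:02d}" for i in range(1, 11)], ["A", "B", "C"], trials=3)
print(len(plan))  # → 9
```

Each of the nine appraiser-trial sequences covers all ten parts, in a different order, which is exactly what concealing the markings and re-randomizing protects.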
Analysis Techniques: Variable Gage Analysis
[Chart: ranges (0.01 mm scale) for parts 1–5 measured by operators A, B and C.]

The average range for each operator is calculated:

Rbar_operator = Σ R / No. of Parts

The average of the measurements taken by an operator is calculated:

Xbar_operator = Σ X / (Trials × Parts)
A control chart of ranges is created. The centerline represents the average range
for all operators in the study, while the upper and lower control limit constants are
based on the number of times each operator measured each part (trials).
Rbarbar = Σ Rbar / No. of Operators

UCL_R = D4 × Rbarbar
LCL_R = D3 × Rbarbar
The centerline and control limits are graphed onto a control chart and the
calculated ranges are then plotted on the control chart. The range control chart is
examined to determine measurement process stability. If any of the plotted
ranges fall outside the control limits the measurement process is not stable,
and further analysis should not take place. However, it is common to have the
particular operator re-measure the particular process output again and use that
data if it is in-control.
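The stability check described above can be sketched as follows (the D4 values are the standard control-chart constants for 2 and 3 trials, and D3 = 0 at these subgroup sizes; the range data are invented):

```python
# D4 control-chart constants by subgroup size (number of trials); D3 = 0 here
D4 = {2: 3.267, 3: 2.574}

def range_chart(ranges_by_operator, trials):
    """Average range, control limits and a stability verdict for a gage study."""
    all_ranges = [r for op in ranges_by_operator.values() for r in op]
    r_bar = sum(all_ranges) / len(all_ranges)      # centerline
    ucl, lcl = D4[trials] * r_bar, 0.0             # UCL_R = D4*Rbarbar, LCL_R = 0
    stable = all(r <= ucl for r in all_ranges)     # any point above UCL -> unstable
    return r_bar, ucl, lcl, stable

ranges = {"A": [0.03, 0.05, 0.04], "B": [0.04, 0.03, 0.05], "C": [0.05, 0.04, 0.03]}
r_bar, ucl, lcl, stable = range_chart(ranges, trials=2)
print(round(r_bar, 3), round(ucl, 4), stable)  # → 0.04 0.1307 True
```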
The equipment variation EV = Rbarbar × K1 (K1 is a constant based on the number of trials) is compared to the tolerance or the process variation:

%EV (TOL) = EV / (USL − LSL) × 100
%EV (PROC) = EV / (5.15 σ_m) × 100

The appraiser variation is:

AV = √[ (Xdiff × K2)² − EV² / (n × t) ]

where n is the number of parts and t is the number of trials. Xdiff is the difference between the largest average reading by an operator and the smallest average reading by an operator. The constant K2 is based on the number of different conditions analyzed. The appraiser variation is often compared to the process output tolerance or process output variation to determine a percent appraiser variation (%AV):

%AV (TOL) = AV / (USL − LSL) × 100
%AV (PROC) = AV / (5.15 σ_m) × 100
R&R = √(EV² + AV²)

The gage error (R&R) is compared to the process output tolerance to estimate the precision-to-tolerance ratio (P/T ratio). This is important to determine if the measurement system can discriminate between good and bad output.

P/T = R&R / (USL − LSL) × 100

The basic interest of studying the measurement process is to determine if the measurement system is capable of measuring a process output characteristic with its own unique variability. This is known as the Percent R&R (%R&R) and is calculated as follows:

%R&R = R&R / (5.15 σ_m) × 100
Process or Total Variation:
If the process output variation (σ_m) is not known, the total variation can be estimated using the data in the study. First the part variation is determined:

PV = Rp × K3

Rp is the range of the part averages, while K3 is a constant based on the number of parts in the study.
The total variation (TV) is the square root of the sum of the squares of R&R and the part variation:

σ_m = TV = √(R&R² + PV²)
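The whole average-and-range calculation can be strung together as below. The K1/K2/K3 values are the classic AIAG table constants for 3 trials, 3 appraisers and 10 parts under the 5.15σ convention, and every input number is illustrative — verify the constants against your own tables before relying on them:

```python
import math

K1 = 3.05   # 3 trials (AIAG table constant, 5.15-sigma convention)
K2 = 2.70   # 3 appraisers
K3 = 1.62   # 10 parts

def gage_rr(r_bar_bar, x_diff, r_p, n_parts, n_trials):
    """Average-and-range Gage R&R: EV, AV, R&R, PV, TV and %R&R."""
    ev = r_bar_bar * K1                               # equipment variation
    av_sq = (x_diff * K2) ** 2 - ev ** 2 / (n_parts * n_trials)
    av = math.sqrt(max(av_sq, 0.0))                   # appraiser variation (>= 0)
    rr = math.sqrt(ev ** 2 + av ** 2)                 # combined gage error
    pv = r_p * K3                                     # part variation
    tv = math.sqrt(rr ** 2 + pv ** 2)                 # total variation
    return ev, av, rr, pv, tv, rr / tv * 100          # final value: %R&R

ev, av, rr, pv, tv, pct_rr = gage_rr(r_bar_bar=0.04, x_diff=0.02, r_p=0.8,
                                     n_parts=10, n_trials=3)
print(round(pct_rr, 1))  # → 10.1
```

A %R&R around 10% would sit right at the boundary of a clearly acceptable system.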
σ_t² = σ_p² + σ_o² + σ_po² + σ_r²

The part-to-part variation is estimated by σ_p²; the operator variation by σ_o²; the interaction effect by σ_po²; and repeatability by σ_r².
MINITAB™ calculates a column of variance components (VarComp) which is used to calculate % Gage R&R using the ANOVA method.
Estimates for a Gage R&R study are obtained by calculating the variance components for each term and for error. The Repeatability, Operator and Operator*Part components are summed to obtain the total variability due to the measuring system.
We use variance components to assess the variation contributed by each source of measurement error relative to the total variation.
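A sketch of how those variance components fall out of a balanced two-way ANOVA by hand (synthetic data with a fixed seed; this is not the Minitab example that follows, and negative estimates are clamped to zero, as most packages do):

```python
import numpy as np

def varcomps(x):
    """x[p, o, r] = measurement of part p by operator o, replicate r (balanced)."""
    P, O, R = x.shape
    grand = x.mean()
    part_mean = x.mean(axis=(1, 2))
    op_mean = x.mean(axis=(0, 2))
    cell_mean = x.mean(axis=2)
    ss_p = O * R * np.sum((part_mean - grand) ** 2)
    ss_o = P * R * np.sum((op_mean - grand) ** 2)
    ss_po = R * np.sum((cell_mean - part_mean[:, None] - op_mean[None, :] + grand) ** 2)
    ss_e = np.sum((x - cell_mean[:, :, None]) ** 2)
    ms_p, ms_o = ss_p / (P - 1), ss_o / (O - 1)
    ms_po = ss_po / ((P - 1) * (O - 1))
    ms_e = ss_e / (P * O * (R - 1))
    var_r = ms_e                                   # repeatability
    var_po = max((ms_po - ms_e) / R, 0.0)          # operator*part interaction
    var_o = max((ms_o - ms_po) / (P * R), 0.0)     # operator (reproducibility part)
    var_p = max((ms_p - ms_po) / (O * R), 0.0)     # part-to-part
    return var_p, var_o, var_po, var_r

rng = np.random.default_rng(0)
parts = rng.normal(0.0, 1.0, 10)                   # true part effects
ops = rng.normal(0.0, 0.3, 3)                      # true operator effects
x = parts[:, None, None] + ops[None, :, None] + rng.normal(0.0, 0.2, (10, 3, 2))
var_p, var_o, var_po, var_r = varcomps(x)
print(var_p > var_r)  # part-to-part variation dominates this synthetic gage
```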
[Table: system acceptability thresholds by % Tolerance, % Study Variance or % Contribution; 5 or more distinct categories recommended.]
Acceptability Criteria:
For a gage deemed to be INCAPABLE for its application, the team must review the design of the gage to improve its intended application and its ability to measure critical characteristics correctly. Also, if re-calibration is required, follow the calibration steps.
If repeatability is large compared to reproducibility, the reasons might be:
1) the instrument needs maintenance or the gage should be redesigned,
2) the location for gaging needs to be improved,
3) there is excessive within-part variation.
If reproducibility is large compared to repeatability, then the possible causes
could be:
1) inadequate training on the gage,
2) calibrations are not effective,
3) a fixture may be needed to help use the gage more consistently.
Example Minitab output (ANOVA method). The session window and six-panel graph reduce to the following recoverable figures:

Source            VarComp   %Contribution
Total Gage R&R    0.004437    10.67
  Repeatability   0.001292     3.10
  Reproducibility 0.003146     7.56
    Operator      0.000912     2.19
Total Variation   0.041602   100.00

ANOVA table (excerpt): Repeatability DF = 30, MS = 0.001292; Total DF = 59, SS = 2.24912. Study variation columns are reported as StDev (SD), Study Var (5.15*SD), %Study Var (%SV) and %Tolerance (SV/Toler).

[Graph panels: Components of Variation (Gage R&R, Repeat, Reprod, Part-to-Part); R Chart by Operator (Rbar = 0.03833, UCL = 0.1252, LCL = 0); Xbar Chart by Operator (Mean = 0.8075, UCL = 0.8796, LCL = 0.7354); By Part (parts 1–10); By Operator (operators 1–3); callouts: 1 Stability, 3 Repeatability, 5 Reproducibility.]
Stability (a measure of predictability)

What it is:
– The property of being in statistical control on the Range Chart
– Unpredictable measurements obtained when the same operator measures the same part with the same gage

Where you look for it:
– The Range Chart

What you'd like to see:
– All points below the upper control limit on the Range Chart

[Chart: the measurement of part #4 shows one point clearly different.]
Discrimination

What it is:
– The ability of the measurement system's units to adequately identify variation in a measured parameter
– Insufficient discrimination results from inadequate measurement units being used

[Chart: R Chart by operator — Rbar = 0.005733, UCL = 0.01476, LCL = 0.]
Repeatability

What it is:
– The variation between successive measurements of the same part, same characteristic, by the same person using the same instrument
– Also called Test-Retest Error and Operator Uncertainty

Where you look for it:
– The Range Chart

What you'd like to see:
– No significant average differences in ranges between operators

[Chart: R Chart by operator — Operator 1: repeatability is poor, ranges are high; Operator 2: repeatability is good, ranges are low.]
Reproducibility

What it is:
– The difference between the average measures of the different operators
– Consists of Operator Bias and the Operator-by-Part Interaction

Where you look for it:
– The Xbar Chart, viewed "By Operator"

What you'd like to see:
– No different patterns between operators

[Chart: Xbar Chart by Operator (Julie, Matt, Mike) — all inspectors show a very similar pattern and average: no Operator Bias.]
% Study Variation

%Study Variation = (σ_R&R / σ_TOTAL) × 100

– Looks at standard deviations instead of variances
– Expresses the Measurement System standard deviation (R&R) as a percentage of the total observed process standard deviation
– Compared against the acceptance criteria for % Study Variation and % Tolerance
Number of Distinct Categories

ndc = √2 × (σ_Process Output / σ_R&R)

– The number of distinct categories within the process data that the Measurement System can discern
– Indicates how well a measurement process can detect process output variation, process shifts and improvement
– This number represents the number of non-overlapping confidence intervals that will span the range of product variation

Acceptance criteria (number of distinct categories): 0–3 — not acceptable; 4 — marginal; 5 or more — recommended.
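Both metrics can be computed directly from the VarComp numbers in the Minitab example earlier (σ = √VarComp: Total Gage R&R 0.004437, Total Variation 0.041602):

```python
import math

def study_metrics(sd_rr, sd_total):
    """%Study Variation and the number of distinct categories (ndc)."""
    pct_study = sd_rr / sd_total * 100
    sd_part = math.sqrt(sd_total ** 2 - sd_rr ** 2)   # part-to-part SD
    ndc = int(math.sqrt(2) * sd_part / sd_rr)         # truncated, per convention
    return pct_study, ndc

sd_rr = math.sqrt(0.004437)     # Total Gage R&R VarComp
sd_total = math.sqrt(0.041602)  # Total Variation VarComp
pct, ndc = study_metrics(sd_rr, sd_total)
print(round(pct, 1), ndc)  # → 32.7 4
```

With ndc = 4 and %Study Variation around 33%, that example system sits in the marginal-to-unacceptable zone.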
Suppose the standard deviation for one part measured by one person many times is 9.5.
The variation due to the measurement system, expressed as a percent of study variation, accounts for 92.21% of the variation seen in the process.
By AIAG standards this gage should not be used. By any standard, the data being produced by this gage is not valid for analysis.
Repeatability Problems:
Calibrate or replace gage.
If only occurring with one operator, re-train.
Reproducibility Problems:
Measurement machines
Similar machines
Ensure all have been calibrated and that the standard measurement
method is being utilized.
Dissimilar machines
One machine is superior.
Operators
Training and skill level of the operators must be assessed.
Operators should be observed to ensure that standard procedures are followed.
Operator/machine by part interactions
Understand why the operator/machine had problems measuring some parts and
not others.
Re-measure the problem parts.
The problem could be a result of gage linearity, a fixture problem, or poor gage design.
What caused it
Measures are rounded off – They now appear the same
Insufficient gage
SOP
Process capability
Cost and difficulty in replacing device
What caused it
Incorrect data entry/transposing data
Operator misreads gage
Lack of an SOP
Operator changed technique during MSA
Other?
What can you do about it
Improve the SOP and constrain the technique
Capture data correctly
If appropriate, ensure we are clocking or locating on the same feature each time
Remove debris
Other?
Bias = Observed average (Xbar) − Known reference value (X)

Process Variation = 6σ range

Percent Bias = (Bias / Process Variation) × 100
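A quick numeric sketch of the bias calculation (the readings, reference value and process sigma are invented):

```python
def percent_bias(readings, reference, process_sigma):
    """Bias of the observed average vs. a known reference, as % of the 6-sigma spread."""
    bias = sum(readings) / len(readings) - reference
    return bias, bias / (6 * process_sigma) * 100

readings = [10.02, 10.03, 10.01, 10.04, 10.02]   # repeated measures of one master part
bias, pct = percent_bias(readings, reference=10.00, process_sigma=0.05)
print(round(bias, 3), round(pct, 1))  # → 0.024 8.0
```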
Possible causes
Out of calibration
Worn or damaged fixture, equipment, instrument
Wrong gage
Environmental conditions
Operator skill level, performed wrong method
Attribute MSA
Accuracy checks
Assess standards against customers’ requirements
Identify how well Measurement System conforms to a “known master”
Precision checks
To determine if inspectors (appraisers) across all shifts, machines, lines, etc…
use the same criteria to evaluate items – Reproducibility
To quantify the ability of inspectors (appraisers) or gages to accurately repeat
their inspection decisions – Repeatability
To identify how well inspectors/gages measure a known master (possibly defined by
the customer) to ensure no misclassification occurs
How often operators decide to ship truly defective product
How often operators do not ship truly acceptable product
To determine areas where
Training is needed
Procedures or control plans are lacking
Standards are not clearly defined
Gage adjustment or correlation is necessary
Take 60 seconds to count the number of times "F" appears in the paragraph below. (Answer: 36)
The Necessity of Training Farm Hands for First Class
Farms in the Fatherly Handling of Farm Live Stock is
Foremost in the Eyes of Farm Owners. Since the
Forefathers of the Farm Owners Trained the Farm Hands
for First Class Farms in the Fatherly Handling of Farm
Live Stock, the Farm Owners Feel they should carry on
with the Family Tradition of Training Farm Hands of First
Class Farmers in the Fatherly Handling of Farm Live
Stock Because they Believe it is the Basis of Good
Fundamental Farm Management.
Suppose that invoice quality is a key to the process throughput. In other words, if an
invoice is incomplete, the rework required impacts the quantity that can be
processed in a day. Two appraisers are asked to independently evaluate ten
invoices randomly selected from different days. The results of the study are shown
below:
Invoice #   Appraiser 1   Appraiser 2   Agreement?
 1          Bad           Bad           Y
 2          Good          Bad           N
 3          Good          Good          Y
 4          Bad           Bad           Y
 5          Good          Good          Y
 6          Good          Good          Y
 7          Good          Bad           N
 8          Good          Bad           N
 9          Good          Good          Y
10          Good          Good          Y
We could simply look at the percent of the time they agree (% agree vs. % disagree) as a metric for between-appraiser agreement. But what would that not take into account? The agreement that would occur purely by chance. Kappa corrects for this:

Kappa = (P_observed − P_chance) / (1 − P_chance)
P_observed is the proportion of times the judges agree. P_chance, the proportion of agreement expected by chance, is calculated from the marginal proportions of each appraiser:

                    Appraiser 2: Good   Appraiser 2: Bad   Total
Appraiser 1: Good   5/10 = .5           3/10 = .3          .8
Appraiser 1: Bad    0/10 = 0            2/10 = .2          .2
Total               .5                  .5

P_obs = .5 + .2 = .7
P_chance = (P_A1 bad)(P_A2 bad) + (P_A1 good)(P_A2 good) = (.2)(.5) + (.8)(.5) = .5
Kappa = (P_obs − P_chance) / (1 − P_chance) = (.7 − .5) / (1 − .5) = 0.4

How is this interpreted? A Kappa of 0.4 indicates only moderate agreement beyond what chance alone would produce.
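The same Kappa can be computed directly from the two rating columns of the invoice table above:

```python
def cohen_kappa(r1, r2):
    """Cohen's Kappa between two raters over the same items."""
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    # chance agreement from each rater's marginal proportions per category
    p_chance = sum((r1.count(c) / n) * (r2.count(c) / n) for c in set(r1) | set(r2))
    return (p_obs - p_chance) / (1 - p_chance)

a1 = ["Bad", "Good", "Good", "Bad", "Good", "Good", "Good", "Good", "Good", "Good"]
a2 = ["Bad", "Bad", "Good", "Bad", "Good", "Good", "Bad", "Bad", "Good", "Good"]
print(round(cohen_kappa(a1, a2), 2))  # → 0.4
```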
Improvement - If the Within Appraiser Kappa scores are low, that appraiser may need training. Do they understand the characteristic they are looking for? Are the instructions clear to them?
- If the Between Appraiser Kappa scores are low, each appraiser may have a
differing definition of the categories – A standardized definition can improve
this situation
- If improvements are made, the study should be repeated to confirm
improvements have worked
Recall what ordinal data is – Categorical variables that have three or more
possible levels with a natural ordering, such as strongly disagree, disagree,
neutral, agree, and strongly agree; or use a numeric scale such as 1-5
When the attribute data can be represented by three or more categories that can be
arranged in a rank order, Kendall’s Coefficient of Concordance (KCC) can be used
to evaluate the measurement system
KCC ranges from 0 (no association) to 1 (strong association).
Unlike Kappa, KCC does not treat misclassifications equally – e.g., the difference between "mild" and "medium" is smaller than the difference between "mild" and "very hot".
KAPPA: Pass/Fail
KCC: Mild/Medium/Hot/Very Hot (Hot sauce)
Let’s walk through an example to
see how these are calculated.
Three judges score the quality of a proposal. The scale they use is 1-5, 1
being “poor”, 5 being “excellent.” The results from their scoring are provided
in the following table:
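Since the scoring table itself is not reproduced in this extract, the sketch below uses a hypothetical score matrix for three judges and five proposals; it implements the basic Kendall's W formula without the tie correction, so it assumes no judge gives the same score twice:

```python
def kendalls_w(scores):
    """Kendall's W for m judges over n items; scores[j][i] = judge j's score of item i."""
    m, n = len(scores), len(scores[0])
    ranks = []
    for row in scores:                      # convert each judge's scores to ranks 1..n
        order = sorted(range(n), key=lambda i: row[i])
        r = [0] * n
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        ranks.append(r)
    totals = [sum(ranks[j][i] for j in range(m)) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)  # spread of item rank sums
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical 1-5 quality scores from three judges for five proposals
scores = [[1, 2, 3, 4, 5],
          [2, 1, 3, 4, 5],
          [1, 3, 2, 4, 5]]
print(round(kendalls_w(scores), 2))  # → 0.89
```

A value near 0.89 indicates strong, though not perfect, concordance between the judges; identical rankings would give exactly 1.0.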
As stated on the prior page, the p-value for KCC should also be low,
generally less than 0.05 – This reduces your risk of getting an
acceptable KCC just by random chance.
Planning
Sample size
More is better – As your sample size increases, your confidence
intervals around your KCC decrease
Collect as many samples as practically possible, 20 minimum is a
guideline, ≥ 30 is best
Perform at least two trials per appraiser
Sample part selection
Parts in the study should represent the full range of variation and
thus utilize the full range of the rating scale
Execution
Parts should be rated in random order independently (no comparisons)
Study should be blind
Rating time should be similar to that “normally” used
Analysis
Prior to reviewing the KCC value, check to see that the p-value is low (generally < 0.05)
– If it is not, add more samples to the study or add another trial
Review the repeatability portion first (Within Appraiser), if an appraiser’s KCC is very low,
he/she may need improvement (see Improvement below)
For appraisers that have acceptable repeatability, review the reproducibility portion
(Between Appraiser)
If a “gold standard” is available (ratings of the samples known by some other means as being
“correct”), compare each appraiser to them for “calibration”
Use the field in MINITAB, “Known Standard Attribute”
Improvement
If the Within appraiser KCC scores are low, that appraiser may need training. Do they
understand the rating scale? Are the instructions clear to them?
If the Between Appraiser KCC scores are low, each appraiser may have a differing
definition of the rating scale – A standardized definition can improve this situation
If improvements are made, the study should be repeated to confirm improvements have
worked
Good luck on the exam!
www.tuv.ro
[email protected]