Data and Basic Stats Rev C 1-25 (Compatibility Mode)
Introduction To Data
1
Six Sigma Breakthrough Steps
2
Basic Data: Questions to Answer
• What is data?
• What are the different types of data?
• Why is continuous data better?
• What is a data collection plan?
• What is a rational subgroup?
3
Data
An individual fact or a
collection of facts about
something is called DATA
4
Types of Data
5
Discrete Vs. Continuous Data
TEMPERATURE
Caliper
Thermometer
Time
6
Discrete Vs. Continuous Data
• Discrete Data
• Provides sparse information
• Continuous Data
• Is rich with information
7
Categories of Scales
8
What about this?
9
Or This?
10
Or this?
11
Data & Statistics
Important:
• DATA, by itself, DOES NOT provide
information.
• You have to MANIPULATE the data for it
to give information.
• We use STATISTICS to manipulate data.
12
Statistical Techniques
14
Data Collection Plan Questions
15
Data Collection Plan Questions (continued)
16
Data Collection Plan Model
• Plan Activities
• Answer Critical Questions
• Execute Data Collection Plan
17
Rational Subgroups
• If you collect data across these conditions, it may contain, and hide,
variation from special or assignable causes; that variation should be
attributed directly to the special cause that produced it.
19
Basic Data Questions Summary
20
Basic Data Questions Summary
21
Basic Data Lessons Learned
22
Basic Data: Deliverables
23
Data Collection Exercise
25
Six Sigma Breakthrough Steps
26
Basic Data: Questions to Answer
28
Statistics – Benefits of Plotting the Data
29
Variability, Centering, and Stability
• Variability
– How much does a process vary? Every process has
some movement; not every piece will come out
“exactly” the same.
30
Variability Measures - Formulas
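• Variance (σ²; s²): The average squared deviation of each
individual data point from the mean.
$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
• Standard Deviation (σ; s): The square root of the variance. Most
commonly used measurement to quantify variability.
$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
Computers do all the hard work
31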
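Since computers do the hard work, the two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the `readings` values are hypothetical, and the result is cross-checked against Python's built-in `statistics` module.

```python
import statistics


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


def sample_std_dev(data):
    """Square root of the sample variance."""
    return sample_variance(data) ** 0.5


# Hypothetical measurements, used only to exercise the formulas.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

s_squared = sample_variance(readings)
s = sample_std_dev(readings)

# The hand-written versions should agree with the standard library.
print(s_squared, statistics.variance(readings))
print(s, statistics.stdev(readings))
```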
Variability Exercise
• Common Cause:
– This is the normal “bouncing around” of any process.
– This is what we saw within each of the 3 teams/shifts.
– To reduce this type of variation, we usually need to
modify the process or technology.
• Special or Assignable Cause:
– This is the variation due to an “assignable” input such
as each shift using different targets, change of
material vendors, using tools past their replacement
point.
– This is what we saw between the 3 teams/shifts.
– To reduce this type of variation, we usually need to
develop and enforce better controls for our process.
33
Variability, Centering, and Stability
34
Measures of Central Tendency - Exercise
• Calculate the Mean, Median and Mode for each data set
shown below. Use the space provided in the chart for your
answers.
Index Data Set 1 Data Set 2 Data Set 3
A 5 3 9
B 6 6 1
C 4 3 1
D 5 4 8
E 5 3 1
F 7 4 6
G 4 16 10
H 7 4 1
I 6 5 7
J 3 3 1
K 3 4 10
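Once the exercise has been worked by hand, the answers can be checked with a short script. The sketch below uses the three data sets from the table above; the variable names are illustrative, and `multimode` is used because a data set may have more than one mode.

```python
from statistics import mean, median, multimode

# Values copied from the exercise table, rows A through K.
data_sets = {
    "Data Set 1": [5, 6, 4, 5, 5, 7, 4, 7, 6, 3, 3],
    "Data Set 2": [3, 6, 3, 4, 3, 4, 16, 4, 5, 3, 4],
    "Data Set 3": [9, 1, 1, 8, 1, 6, 10, 1, 7, 1, 10],
}

summary = {
    name: {
        "mean": mean(values),
        "median": median(values),
        "modes": sorted(multimode(values)),  # every value tied for most frequent
    }
    for name, values in data_sets.items()
}

for name, stats in summary.items():
    print(name, stats)
```

Comparing the three summaries shows why the mean alone does not describe a data set.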
35
Variability, Centering, and Stability
[Figure: two run charts plotted against Observation, one with the vertical axis in meters and one in Feet]
37
No Stability = “all bets are off”
[Figure: three control charts of Sample Mean versus Sample Number (0 to 20), with center lines X=100.7, X=101.0, and X=115.0]
39
Statistics - Mini Road Map
40
Variation Is the Enemy
41
Shaft and Bushing Example
Bushing
Clearance
IDbush - ODshaft = Clearance
1.002” - 1.000” = .002” (total)
42
Example Continued
• If you use the entire tolerance, you could have a shaft
of .995”, and a bushing of 1.007” for a clearance per
side of .006” (.012” total).
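The clearance arithmetic in this example can be reproduced with a small helper. This is a sketch only: the function name is hypothetical, and `Decimal` is used so that the thousandths of an inch are not blurred by binary floating-point rounding.

```python
from decimal import Decimal


def clearance(bushing_id, shaft_od):
    """Return (total clearance, clearance per side) in inches."""
    total = bushing_id - shaft_od
    return total, total / 2


# Dimensions from the first shaft and bushing slide.
first_total, first_per_side = clearance(Decimal("1.002"), Decimal("1.000"))

# Dimensions from the "entire tolerance" case in the slide above.
worst_total, worst_per_side = clearance(Decimal("1.007"), Decimal("0.995"))

print(first_total, first_per_side)
print(worst_total, worst_per_side)
```

The total clearance grows from .002" to .012" when both parts sit at the edge of their tolerance.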
43
Distributions
44
Dotplot & Histogram
:
:
. . . : . .
:: : :::.:: :: . ::
. : .. .:.:.:::::::::::::::.::.::::..: : .
-------+---------+---------+---------+---------+-------GPM
49.00 49.50 50.00 50.50 51.00
[Figure: histogram, Frequency versus GPM]
45
Smoothed (Normal) Distribution
46
Normal Distribution
47
The Normal Distribution - Properties
48
The Normal Distribution – Property 1
49
Normal Curve Probabilities
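[Figure: normal curve, probability of sample value versus number of standard deviations from -4 to 4. About 68% of values fall within ±1 standard deviation, 95% within ±2, and 99.73% within ±3. The point of inflection is marked on the curve.]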
50
Normal Curve - Exercise 1
[Figure: normal curve; X axis in Inches marked 2, 4, 6, 8, 10, 12, 14, 16, 18]
51
Normal Curve - Exercise 2
[Figure: normal curve; X axis marked -4, -3, -2, -1, 0, 1, 2, 3, 4]
52
Normal Curve - Exercise 3
[Figure: normal curve; X axis marked -8, -5, -2, 1, 4, 7, 10, 13, 16]
53
Normal Curve - Exercise 4
[Figure: normal curve; X axis marked 4, 5.5, 7, 8.5, 10, 11.5, 13, 14.5, 16]
54
Normal Curve - Exercise 5
[Figure: normal curve]
55
How do I know if my data is Normal?
[Figure: histogram of C1 (Frequency) beside a normal probability plot of C1 (Probability)]
C1 Average: 70
Std Dev: 10
N of data: 500
Anderson-Darling Normality Test
A-Squared: 0.418
p-value: 0.328
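The idea behind a normal probability plot can be sketched without a statistics package. The example below is an illustration under stated assumptions, not the Anderson-Darling test reported above: it pairs sorted data with standard normal quantiles and uses the correlation of those pairs as a rough measure of how straight the plot is. The function names, the plotting-position formula, and the simulated data are all assumptions made for the sketch.

```python
import random
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Pair each sorted observation with the standard normal quantile for its rank."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting position (i - 0.375) / (n + 0.25) is one common convention (assumed here).
    return [
        (std_normal.inv_cdf((i - 0.375) / (n + 0.25)), x)
        for i, x in enumerate(ordered, start=1)
    ]


def straightness(points):
    """Pearson correlation of the plot coordinates; near 1.0 means a nearly straight line."""
    zs = [z for z, _ in points]
    xs = [x for _, x in points]
    mz, mx = mean(zs), mean(xs)
    covariance = sum((z - mz) * (x - mx) for z, x in points)
    spread = (sum((z - mz) ** 2 for z in zs) * sum((x - mx) ** 2 for x in xs)) ** 0.5
    return covariance / spread


rng = random.Random(0)  # fixed seed so the simulated (hypothetical) data is repeatable
normal_like = [rng.gauss(70, 10) for _ in range(500)]
skewed = [rng.expovariate(1 / 10) for _ in range(500)]

r_normal = straightness(normal_plot_points(normal_like))
r_skewed = straightness(normal_plot_points(skewed))
print(r_normal, r_skewed)
```

Data that follows a normal distribution plots close to a straight line, so its correlation sits closer to 1 than that of the skewed data. A formal decision would still use a test such as Anderson-Darling in Minitab.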
56
Normal Probability Plots (continued)
[Figure: frequency histograms of C2 and C3 with their normal probability plots, labeled Positive Skewed Distribution and Negative Skewed Distribution, plus a normal probability plot of a "Mystery Distribution" (Average: 100, Anderson-Darling Normality Test)]
57
Stop
58
Z Transformation
Z transformation takes a normal distribution
and translates it to a normalized distribution
(with a mean of zero and a standard
deviation of 1.0).
Let's assume a process with Mu = 10 and
Std Dev = 2, and a USL of 13.
$Z = \frac{X - \mu}{\sigma}$
$Z = \frac{13 - 10}{2} = 1.5$
Question 1: If my tolerance is 13, how many inches
away am I from my mean?
Question 2: If std deviation is 2, how many std dev
is my tolerance from my mean?
[Figure: normal curve centered at 10 with the USL at 13. X scale (units are inches): 4 6 8 10 12 14 16. Z scale (units are standard deviations): -3 -2 -1 0 1 2 3]
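The Z transformation is a one-line calculation. The sketch below uses the mean of 10 and standard deviation of 2 from this slide; the function names are illustrative, and the inverse function is added to show how a Z value maps back to the X scale.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


def x_from_z(z, mean, std_dev):
    """Inverse transformation: the X value that sits z standard deviations from the mean."""
    return mean + z * std_dev


inches_from_mean = 13 - 10          # Question 1: distance on the X scale
z_usl = z_score(13, 10, 2)          # Question 2: the same distance on the Z scale
x_back = x_from_z(z_usl, 10, 2)     # round trip back to inches

print(inches_from_mean, z_usl, x_back)
```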
59
Z Transformation - Exercise
$Z = \frac{X - \mu}{\sigma}$
First Question, what is the mean and standard deviation?
X      Z
?      1
10     ?
6      ?
?      -3
?      1.5
?      -2.25
13     ?
[Figure: normal curve; X scale (units are inches): 4 6 8 10 12 14 16]
61
Probability
Example:
68% of the points will be between plus and minus one standard deviation.
[Figure: normal curve, probability of sample value versus number of standard deviations from -4 to 4, with 68%, 95%, and 99.73% bands marked]
64
Probability – Exercise method
For X = 12, Z = +1
The area under the curve to the right of Z = +1 is 16%.
The area under the curve to the left of Z = +1 is the 16% tail
below Z = -1 plus the 68% between Z = -1 and Z = +1:
16 + 68 = 84%
ALTERNATE
To get from the mean to 1 sigma is half of the 68% between +/- 1 sigma.
Add this 34% to the 50% that lies to the left of the mean = 84%
[Figure: normal curve with 16% tails on each side. X scale (units are inches): 4 6 8 10 12 14 16. Z scale (units are standard deviations): -3 -2 -1 0 1 2 3]
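The areas in this exercise can be checked with Python's `statistics.NormalDist`. This is a sketch that assumes the same process as the earlier Z transformation slide (mean 10, standard deviation 2), which is consistent with X = 12 giving Z = +1; the variable names are illustrative.

```python
from statistics import NormalDist

process = NormalDist(mu=10, sigma=2)  # assumed process: mean 10, std dev 2

area_left_of_12 = process.cdf(12)               # everything below Z = +1
area_right_of_12 = 1 - area_left_of_12          # upper tail beyond Z = +1
area_within_one_sigma = process.cdf(12) - process.cdf(8)

print(round(area_left_of_12, 4))
print(round(area_right_of_12, 4))
print(round(area_within_one_sigma, 4))
```

The slide's 16%, 68%, and 84% are rounded versions of these areas.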
65
Z Table
66
Z-Table Exercises
67
Z Transformation - Use
The Standardized Z Transformation: $Z = \frac{X - \mu}{\sigma}$
Suppose the diameters of shafts are normally distributed with a mean
of (45) and a standard deviation of (1). The customer derived upper
specification limit is (47.5). What is the DPMO for this process?
$Z = \frac{X - \mu}{\sigma}$
$Z = \frac{47.5 - 45}{1}$
$Z = 2.5$
From a Z-table the probability that a shaft is less
than (47.5) is 99.38% and the probability of a
defect is (1 - .9938) or .0062 (0.62%).
DPMO = .0062 x 1,000,000 = 6,200
[Figure: normal curve with the USL at 47.5; DEFECTS are the area to the right of the USL]
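The shaft example can be reproduced without a printed Z-table. The sketch below is illustrative (the function name is hypothetical) and uses `statistics.NormalDist` for the tail area. At full precision it gives roughly 6,210 DPMO, so expect small differences from a value read off a rounded table.

```python
from statistics import NormalDist


def dpmo_above_usl(mean, std_dev, usl):
    """Defects per million opportunities for a normal process with only an upper spec limit."""
    p_defect = 1 - NormalDist(mean, std_dev).cdf(usl)  # area to the right of the USL
    return p_defect * 1_000_000


z = (47.5 - 45) / 1
shaft_dpmo = dpmo_above_usl(mean=45, std_dev=1, usl=47.5)

print(z, round(shaft_dpmo))
```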
69
Z Transformation
DPMO Calculation For a Lower Spec
Same process: µ = 10 and Std Dev = 2, LSL = 8
$Z = \frac{8 - 10}{2} = -1$
Question: If my LSL is 8, what % of my production is defective?
(Green area under the curve)
Answer: Use Z table or Minitab for Z = -1
Probability of a defect is 15.87%
[Figure: normal curve centered at 10 with the LSL at 8. X scale: 4 6 8 10 12 14 16. Z scale: -3 -2 -1 0 1 2 3]
70
Z Transformation
DPMO Calculation Z bench
Question: If my USL is 13 and my LSL is 8, what % of my production is defective?
(Red and green areas under the curve)
Answer: Use Z table or Minitab for Z = 1.5 and Z = -1 and add
the probabilities of defects on both sides
Probability of a defect below LSL is 15.87%
Probability of a defect past USL is 6.68%
[Figure: normal curve centered at 10 with the LSL at 8 and the USL at 13. X scale: 4 6 8 10 12 14 16. Z scale: -3 -2 -1 0 1 2 3]
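The two tail areas can be computed and added in a few lines. This sketch assumes the process on the slide (mean 10, standard deviation 2, LSL 8, USL 13); the function name is illustrative.

```python
from statistics import NormalDist


def defect_probabilities(mean, std_dev, lsl, usl):
    """Return (P below LSL, P above USL, total) for a normally distributed process."""
    dist = NormalDist(mean, std_dev)
    p_below_lsl = dist.cdf(lsl)
    p_above_usl = 1 - dist.cdf(usl)
    return p_below_lsl, p_above_usl, p_below_lsl + p_above_usl


p_low, p_high, p_total = defect_probabilities(mean=10, std_dev=2, lsl=8, usl=13)

print(f"below LSL: {p_low:.2%}")
print(f"above USL: {p_high:.2%}")
print(f"total:     {p_total:.2%}")
```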
71
Z Transformation
Z Bench Calculation for Combined Defects
P. USL = 6.68%
P. LSL = 15.87%
P. Total = 22.55%
Total probability of a defect is 22.55%.
Question: If I threw all my defects on one side, how many std dev
would fit between the mean and the line where the defects start?
Answer: Use Z table or Minitab for p = .2255
From Z table or Minitab, find Z = .75
[Figure: normal curve centered at 10 with all defects moved to the right of 11.5 (Z = 0.75). X scale: 4 6 8 10 12 14 16. Z scale: -3 -2 -1 0 1 2 3]
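Looking up Z for a given tail probability is an inverse normal calculation. The sketch below uses `NormalDist.inv_cdf` in place of a Z table or Minitab; the function name is illustrative, and the mean of 10 and standard deviation of 2 are carried over from the earlier slides.

```python
from statistics import NormalDist


def z_bench(p_total_defect):
    """Z value that leaves p_total_defect in the upper tail of a standard normal curve."""
    return NormalDist().inv_cdf(1 - p_total_defect)


z = z_bench(0.2255)
x_where_defects_start = 10 + z * 2  # back on the X scale: mean + Z * std dev

print(round(z, 2), round(x_where_defects_start, 1))
```

The result is close to the Z = .75 and X = 11.5 shown on the slide; the small difference comes from rounding.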
72
Z Transformation
Z Bench Calculation for Combined Defects
Total Probability of a defect is 22.55% (area under curve to right)
[Figure: normal curve centered at 10 with the defect area to the right of 11.5. Z scale: -3 -2 -1 0 1 2 3, with Z = 0.75 marked]
73
Z Bench versus Cpk and Ppk
Cpk and Ppk take into account only those defects associated with
the closest spec limit.
Z bench takes into account all of the defects.
[Figure: normal curve centered at 10 with LSL and USL marked, USL at 13. X scale: 4 6 8 10 12 14 16. Z scale: -3 -2 -1 0 1 2 3]
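The difference between the two measures shows up when one spec limit moves and the other stays put. The sketch below is an illustration under assumptions: it uses the common definition Cpk = min(USL - mean, mean - LSL) / (3 x std dev), a single standard deviation for the process, and a hypothetical second USL of 20. Cpk and Ppk differ in which standard deviation estimate they use; that distinction is not modeled here.

```python
from statistics import NormalDist


def cpk(mean, std_dev, lsl, usl):
    """Capability index based only on the nearest spec limit."""
    return min(usl - mean, mean - lsl) / (3 * std_dev)


def z_bench(mean, std_dev, lsl, usl):
    """Z value equivalent to the combined defect probability from both tails."""
    dist = NormalDist(mean, std_dev)
    p_total = dist.cdf(lsl) + (1 - dist.cdf(usl))
    return NormalDist().inv_cdf(1 - p_total)


# Process from the slides: mean 10, std dev 2, LSL 8, USL 13.
cpk_slide = cpk(10, 2, 8, 13)
zb_slide = z_bench(10, 2, 8, 13)

# Hypothetical comparison: widen only the far (upper) limit.
cpk_wide = cpk(10, 2, 8, 20)
zb_wide = z_bench(10, 2, 8, 20)

print(round(cpk_slide, 3), round(zb_slide, 2))
print(round(cpk_wide, 3), round(zb_wide, 2))
```

Cpk is unchanged because the LSL is still the closest limit, while Z bench improves because the defects past the USL have almost disappeared.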
74
Population Vs Sample
[Figure: a population of 100 marbles with a Sample drawn from it]
• Population:
– Is every part that exists.
– Is difficult and expensive to measure because of volume
• Sample:
– Is a small subset of the population
– Is selected randomly to best represent the population
– After a change to a process, a new sample can be easily taken and
used to determine if an improvement has truly been made
• Note:
– In actual usage it is common for the symbols for the Population
Parameters (σ = population standard deviation) and (µ = population
mean) to be used in place of the symbols for the Sample Statistics
(s = standard deviation of a sample) and (X bar = sample mean).
• Procedure:
– Set up the catapult and keep all conditions except
ammo type fixed for this exercise.
– Pull back angle must be selected and fixed at
approximately half way back.
– Select and use a ping pong ball and a die or different
ball for the two types of ammo.
– Launch five test fires with each type to estimate range.
– Two inspectors silently record distance values for 20
launches of each ammo type using same operator.
– Perform appropriate analysis as listed on following
pages.
– Input measured distance by averaging the prerecorded
data from the two different inspectors (data sheet on
next page).
77
Exercise - Data Sheet
Date:              Line/team:
Operator:          Angle:
Inspector:
Test Fire # | Ping Pong | Die | Special cause notes
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
78
Exercise - Deliverables
79
Basic Data Lessons Learned
80
Basic Data Lessons Learned
81
Basic Data Questions to Answer
85