
Data Analysis

1. Data Analysis Intro 3


Who, What, When, Where, How, Why
2. Basic Statistics 10
Types of Data 11
Mean & Standard Deviation 13
Normality 22
3. Graphs 29
Pareto Chart 31
Scatter Plot 32
Time Series 33
Analysis Chart Decision Tool 36
4. Control Charts 38
Common Cause and Special Cause 41
Spec Limits vs. Control Limits 44
Continuous vs. Discrete 46
Control Chart Decision Tree 64-65
5. Sampling in the Measure Phase 67
Calculating Required Sample Size 75-80
6. Measurement Systems (MSA) 83
R&R 85
7. Capability 93
Z-table 103
8. Hypothesis Testing 111
Null and Alternative (HO and HA) 113
The P-value 114
Test Selection Tool 118
9. Means/Medians Tests 122
Analysis Path 125
Means tests 126
Nonparametrics 137
Choice of Test Decision Tool 148
10. Regression 150
Correlation 152
Fitted Line 154
Multiple Regression 155
R² 156
11. Chi-Square 160
Expected Values 163
12. Design of Experiment (DOE) 165
Introduction to Factorials 167
13. Sampling in the Improve Phase 178
Risk, Power 180
Sample Size Calculator 181

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 1
PREFACE

Welcome to the Quality Department!


A formidable challenge in business today is not only to make effective changes, but also
to sustain the increased efficiency upon completion of any improvement effort. This is
especially difficult in the leasing and logistics businesses, as new branches and locations
are opening all the time, each demanding its own customized approach.
With the advent of the Business Process Management System (BPMS), processes are
standardized and measured for their performance and their improvement consistently
across the company. As a result of this organized approach, the opportunity to identify
areas in need of improvement is increasing, and the tactical response to such needs is the
Six Sigma methodology. Processes are improved using the DMAIC framework or
redesigned via DMADV - design for Six Sigma. These projects are intended to nail
down the root causes of defects and minimize, eliminate or prevent them. Projects are
executed by teams of individuals who are involved in the relevant process, and these
teams are assisted by experts in the management of projects and the analysis of data.
You are the improvement methodology expert. It is your job to move the project through
its various phases, making sure that all the requirements of each phase are met before
going forward. It is your job to report your team’s findings and progress to the
Champion and the Process Owner. You are also responsible for the clear and convincing
analysis of the data gathered by your team. Complex processes yield seemingly chaotic
data and it takes a skilled analyst to pull truth from such data. That is our challenge and
these are our tools. You must collect appropriate data, make sense of all the numbers,
and draw accurate conclusions from them. The branch or location, our business, and
even the lives of people inside and outside this company depend on your investigation of
the information available and the clarity of your reasoning. You must understand the
story that the data is telling you, and you must not make a mistake.
This book is intended to improve your data analysis skills. There is ample material
available from many other sources that can help you with the details and intricacies of
data analysis, but this text brings together the most commonly used techniques from our
business and explains them to a level of understanding beneath which no member of the
Quality Department should fall. As I stated previously, a consistent approach is vital to
the pursuit of quality. We must all function at the same level of competency. We will
cover areas that you may not see in your everyday mission, but I encourage you to learn
as much as you can and to refer to these chapters as often as possible.
I charge each and every one of you with strengthening your data analysis aptitude and
sustaining that improvement, and I know you can do it.

David A. Taliaferro
June, 2000

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 2
Data Analysis Introduction
It is important to begin a project with the end result in mind. In the case of a DMAIC
or DMADV project, we want to identify solutions using the Six Sigma methodology and
following all the steps prescribed therein. During the course of your project there will be
many opportunities to analyze data, and it is important to know what the results of your
analysis must show. By understanding the desired result, you can better prepare for the
collection of data, and you can keep your focus on the goal while you chart your course
through the phases of your project.

Six Sigma dates back to 1987 in the United States and its foundation is reducing the
variation in a process so that nearly all of the output falls within the customer’s specification
limits. The term “Six Sigma” has a statistical definition – 68.3% of all the output of a process
will fall within one sigma, or standard deviation, above and below the mean of that process,
and 95.5% will fall within two sigmas. The idea is to have six sigmas fit between the mean and the
customer’s specification limits, so that 99.9997% of all the output falls within specification, leaving
only 3.4 defective parts out of one million. In order to reach the point where the process is performing so well, it is necessary to
use relevant and accurate data to develop conclusions.

The steps of Six Sigma are Define, Measure, Analyze, Improve and Control, and the
DMAIC training waves I & III cover the fundamentals of this methodology. The focus of this
book is the analysis of data relevant to each of these phases.

DEFINE
In the DEFINE phase, the team will collect facts about a problem or opportunity.
One of the elements of a project charter is the Goal Statement, and the criteria for a
good goal statement are that it is Specific, Measurable, Attainable, Relevant and Time-Bound
(SMART). Identifying a measurable goal is the first use of data in the DMAIC process. Later
in the project, the improvement will be judged against the goal statement, so it is important to
keep the SMART criteria in mind.
Another purpose of the project charter is identifying roles and responsibilities for the
team members. Data collection will often be conducted by team members at various locations
and with varying levels of statistical aptitude. It is important for the Black Belt to make data
collection easy and to clearly define the objectives for all team members. While a Master
Black Belt is assigned to each project, he or she will not always be available every time a
question arises regarding data collection or analysis. The Black Belt must be prepared to
answer such questions confidently and correctly. That is the objective of this manual, and
wherever possible in these pages, supporting information is provided to further clarify the
answers you will give to inquisitive team members and stakeholders.
An important deliverable of the Define phase is a list of customer CTQs – items that
are critical to quality for the customer. These items begin as high-level needs and are
translated into specific, measurable customer requirements. The CTQ items must be

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 3
measurable in order to track progress and show improvement. By knowing what kind of
analysis you will have to perform later in the project, and by knowing what kind of control
charts will show the process to the stakeholders most clearly, you can influence the decision on
what to include on the list of CTQs.
The process mapping portion of the Define phase is crucial to the success of a DMAIC
project. Everything else springs from the process map: the inputs and outputs of the process,
the data collection plan, and everything that follows. The team will identify measures related to
Inputs, Outputs and Process steps, and these are taken directly from well-prepared SIPOCs and
deployment flowcharts. These measures will provide the data that will be analyzed in order to
make improvements, so it is important to keep the SMART principle in mind when determining
what to measure.

MEASURE
In the MEASURE phase, the team will determine baseline performance of the process to
understand how well we are meeting customer requirements.
The first step in the Measure phase is to identify what output is most important to the
customer – the Project Y. The name “Y” is given to a dependent variable that is influenced by
one or more independent variables, or X’s. The Y variable is a function of the X’s, shown
mathematically as: y = f(x₁, x₂, x₃, …, xₙ). In other words, all the inputs come together to shape
the output. Through DMAIC, we will examine which X’s or combinations of X’s are
negatively affecting the performance of our process and then identify ways to improve them.
The Data Collection Plan is produced by the team and can be used several times
throughout the DMAIC process. A good plan will have a clear objective and will identify
exactly what data are to be collected, by whom, at what frequency, and where that data can be
gathered. This is the first time we will distinguish between the two main types of data – discrete
and continuous. Discrete data can be counted or categorized, and continuous data are measured
along a scale of values at very detailed intervals. Further definitions and examples of discrete
and continuous data follow in this manual, but in short, continuous data should be used
whenever possible because discrete data only records whether a point is defective or not and
continuous data show the degree to which any point deviates from the target. For example,
suppose a truck has a scheduled delivery to make – the discrete measure is whether or not the
truck was on time, and the continuous measure is how many minutes early or late the delivery
was made. For a group of trucks making deliveries over one month, the discrete measure is
how many trucks were on time, while the continuous measure is the average delivery time for
all the trucks relative to schedule and how many minutes early or late the individual trucks are
generally arriving. We can learn more about the process when we use continuous
measurements.
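
The same distinction can be sketched outside Minitab. The short Python example below is not part of the original manual and the delivery figures are made up; it simply summarizes the same deliveries both ways, showing how the discrete count hides the detail that the continuous measure keeps.

# Hypothetical delivery data: minutes late (negative = early).
minutes_late = [-10, 5, 0, 22, -3, 47, 8, -15, 3, 60]

# Discrete view: simply count how many deliveries were on time.
on_time = sum(1 for m in minutes_late if m <= 0)
print(f"On-time deliveries: {on_time} of {len(minutes_late)}")

# Continuous view: how early or late deliveries run on average.
mean_late = sum(minutes_late) / len(minutes_late)
print(f"Average minutes relative to schedule: {mean_late:.1f}")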
Another aspect of the Data Collection Plan is the measurement system analysis (MSA).
We are trying to measure the amount of variation in our process, but this is difficult to do when
there may be variation in the way we are measuring the process. If we are trying to determine
whether a truck was early or late, but our clocks are not calibrated or different people are using
their own wristwatches to record the time, then the variation in the measurement system is
added to the variation in the actual process. As a result, we do not get a clear picture of how
early or late the truck really is. In some cases the truck will appear

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 4
to have arrived later than it did, and in some cases the time recorded will make the truck
appear to have arrived on time when it was actually late. Additionally, if the time is being
rounded to the nearest 15-minute increment, the guidelines on when to round up or down
must be clear to everybody who is recording time. Measurements must be made consistently
so that the only variation perceived is the actual variation in the process.
The next step in the Measure phase is to describe and display the variation in the
process. Too often we hear that the average of a sample is used to make decisions with no
mention of the variation above or below that average. The problem with using only the
average is that customers see much more than that. Customers see the variation in the
process, and a high level of variation leads to customer dissatisfaction, no matter what the
average is. Even though your supplier’s deliveries are on time on the average, those two or
three deliveries that are two hours late, and even the deliveries that are an hour too early are
causing you to make adjustments and revise your schedule. Your supplier is causing you
problems, but he doesn’t know it because he’s only looking at his average and he thinks he’s
doing fine.
There will always be variation in a process, but the key to Six Sigma is to minimize
the controllable variation. There are two types of variation: common cause and special
cause. Common cause variation is always present, it creates random fluctuation in the
process, and it does not make a process out of control. Special cause variation, on the other
hand, is not always present. It is unexpected and unpredictable, and it is usually an
indication that a process is out of control. In order to separate common cause and special
cause variation we use graphical tools to chart the variation in the process and interpret what
the graphs show us. Once we understand the variation, we can develop the improvement
objective for the project, whether it is to reduce variation, to shift the average of the process,
or to do both. The data we gather can tell us the most about the process when we are able to
graph it, so we will try to collect data that we can translate into effective graphs.
The final step in the Measure phase is to calculate the process Sigma, or capability.
Since our objective is Six Sigma, we must first know how the process is operating at present
so we know how much improvement is required. Also, by establishing the baseline
capability, we can track the improvement throughout the project. The capability of the
process is calculated differently based on whether we have a discrete or continuous measure.
The capability analysis will answer two questions:
- How capable is the process of hitting the customer’s target with regularity?
- How capable is the process of hitting ANY target with regularity?
The sample data that is collected by the team will determine the baseline capability, and the
sample data collected after the improvement project will show a new, hopefully better,
capability figure. However, these samples show only a limited picture of the process. The
process could go on for a much longer time, and some aspects of the process can change
over that time. If the process shows very little variation after the improvement project,
which is the objective, then it should behave just as well in the future even if things change –
machines get older, business policies change, technological advancements arise, other parts
of the business are streamlined. If your DMAIC project is successful, then the process will
be more effective regardless of these changes. In determining the capability of the process,
this is known as the Shift – a negative 1.5 Sigma shift can be expected in a process in the
long term, so for the short term we allow for that shift in our calculation of capability.
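
As a rough illustration of that convention, the sketch below is not from the manual and assumes the SciPy library is available; it converts a long-term defect rate into a short-term sigma level by adding back the 1.5-sigma shift.

# Assumes SciPy; the defect rate is the 3.4 DPMO figure quoted earlier.
from scipy.stats import norm

defects_per_million = 3.4
p_defective = defects_per_million / 1_000_000

z_long_term = norm.ppf(1 - p_defective)   # Z that leaves p_defective in the tail
sigma_level = z_long_term + 1.5           # add back the conventional 1.5-sigma shift

print(f"Long-term Z: {z_long_term:.2f}, short-term sigma level: {sigma_level:.2f}")
# Long-term Z is about 4.5, so the short-term sigma level is about 6.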

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 5
ANALYZE
In the Analyze phase of DMAIC, the team will identify the root cause(s) of the problem and
find the key to solving the problem.
The purpose of the Analyze phase is to find the X’s that cause the variation that the
customer feels. While a lot of intuition is required to narrow your list of variables to the vital
few, the data that the project team collects and the analysis you will perform will help you
uncover sources of variation that were not previously apparent.
The initial data analysis will focus on segmentation as you categorize the variables
based on their behavior. The center (mean or median) and spread (standard deviation) of the
data will behave differently for different segments, and your improvement objective will be
defined by those differences. If the mean of one segment is different from that of another
segment, you will study both segments separately and look for the causes of that difference –
if one is better than the other, use it as a benchmark. If the mean is the same for two groups
but the standard deviation is different, you will use the best practices of the tighter process to
improve the performance of the other. If the center and spread are the same for all segments,
it will be necessary to investigate other possible segmentation factors and try again.
Segmentation is grouping or separating variables for study; stratification is studying
the variables separately. These steps are made easier with the use of graphical tools like the
Pareto diagram, Boxplot, and Time Series Plot. The Pareto diagram shows defects by
category, ranking each one in order of severity, so the team can focus on the most serious
causes of trouble first. The Boxplot shows the characteristics of one or more groups of data
and illustrates the center, spread, and shape of the data for each group compared with the
others so the team can understand which segments are performing better or worse. The Time
Series Plot shows the data over time so the team can see a picture of the variation in the
process and also spot trends, patterns, or cycles in the process.
Another graphical tool used in data analysis is the histogram. Each output value from
the process is placed along a horizontal axis of all possible values. Additional occurrences of
the same output value are stacked on top of one another, resulting in taller stacks of points at
the more frequent outputs and smaller stacks at less frequent values. When the histogram is
complete and all the points measured in the process are plotted, the picture should resemble a
bell-shaped curve, with the highest point in the center at the most frequently occurring value.
The frequency of values to the right and left of the highest point should decrease at the same
rate in both directions, and there may be some individual points located far from the center in
both directions. This is a description of a normal distribution. When the actual curve is
drawn across the tops of the stacks of output values it should resemble a bell in that it will be
symmetrical to the right and left of the center. When data are normally distributed, we are
able to analyze and make predictions based on our analysis. When data are not normally
distributed, our conclusions must be validated using additional analysis techniques.
Narrowing to root causes is performed using hypothesis-testing tools that are chosen
based on the type of data being studied. Continuous inputs and outputs are studied using
Regression analysis, which produces a prediction equation that explains a large amount of the
variation in the process. If the inputs are discrete – categorized or counted – variables, and
the output is continuous, the t-test is used to compare the output to a target value or to
compare the output of two distinct groups. The ANOVA data analysis tool is used when
comparing more than two groups of this type. The relationship between discrete input

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 6
variables and a discrete output variable is studied using the Chi-square test for independence,
which determines whether one or more inputs influence the output more than others do.
The result of each of these tests is judged by the p-value. The p-value is the
probability of seeing results at least as extreme as yours if the assumed hypothesis is actually
true. For example, my hypothesis is that two locations are performing to the same level of
efficiency, and my test yields a p-value of 0.89. Data like mine would be quite ordinary if the
locations really are performing the same, so I have no reason to reject that hypothesis. The
question may arise: “what if my hypothesis is that the locations are performing differently – that
one is better than the other?” The assumed, or null, hypothesis of every test is that the data sets
are equal, or in this case, that the locations are performing equally. Therefore, the p-value
always relates to the null hypothesis: it is the chance of seeing results like yours, and therefore of
making a mistake, if you reject the null hypothesis when it is actually true. If my p-value had
been 0.03, then there is only a 3% chance of seeing data like mine if the locations really are
performing the same, so I would conclude that they are not.
Generally, we will set a limit of 0.05 for a p-value to tell us that we should reject the
null hypothesis. When you see a p-value of 0.05 or less, the data sets are not equal, the
processes are different, or the process is not hitting its target value. The p-value is also used
to determine whether a set of data is distributed normally. The physical description of the
histogram above paints a good picture of normal data, but the proof is in the p-value. Again,
the assumption, or null hypothesis, is that the set of data is normal, and only a p-value of
0.05 or less will denote non-normal data.
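
As an illustration of this decision rule, the sketch below uses hypothetical numbers and assumes the SciPy library (Minitab is the tool used in this manual); it runs a two-sample t-test comparing two locations and applies the 0.05 limit.

# Hypothetical service times (minutes per visit) for two locations.
from scipy.stats import ttest_ind

location_a = [52, 48, 55, 60, 49, 53, 58, 51]
location_b = [50, 47, 54, 57, 52, 49, 56, 53]

t_stat, p_value = ttest_ind(location_a, location_b)
print(f"p-value = {p_value:.3f}")

if p_value <= 0.05:
    print("Reject the null hypothesis: the locations appear to perform differently.")
else:
    print("No evidence of a difference: treat the locations as performing the same.")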
The purpose of Six Sigma methodology is to identify variables that cause variation
and other problems in our process and sort out those that are most important to resolve. The
Analyze phase is where the funnel narrows and only the vital X’s are left to study.

IMPROVE
In the Improve phase, the team will evaluate possible solutions based on their impact on
performance.
Improvement begins when the team has identified the root causes of the problem and
begins generating ideas for how to improve these variables. The statistical tool most
prominently used during this phase is Design of Experiment (DOE). DOE involves testing
the process at different settings for each variable, for example, studying the fuel efficiency
of an automobile under different combinations of variables: tire pressure of 30 or 40 psi, no
passengers or four passengers, air conditioner on or off, 55 miles per hour or 65. By testing
all the variables at the same time you will arrive at the combination that provides the greatest
fuel efficiency, and you can endeavor to drive under those conditions whenever possible.
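
For the fuel-efficiency example, testing every combination means running a two-level full factorial. The sketch below is not part of the manual; the factor names and levels are taken from the example above, and the code simply lists the 2⁴ = 16 runs.

# List every combination of the four two-level factors from the example.
from itertools import product

factors = {
    "tire_pressure_psi": (30, 40),
    "passengers":        (0, 4),
    "air_conditioner":   ("off", "on"),
    "speed_mph":         (55, 65),
}

runs = list(product(*factors.values()))
print(f"{len(runs)} runs")           # 16 runs for a 2^4 full factorial
for run in runs:
    print(dict(zip(factors.keys(), run)))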
Another important aspect of data analysis is that of sample size. There are two times
when sample size is important: when determining the amount of data to collect in order to
establish baseline data; and when deciding how much data to collect during the Improve
phase in order to show improvement in the process. Sample size calculation involves
estimating the proportion of defective items and estimating the standard deviation of the
population before any data is collected. For that reason, it is necessary to analyze the sample
after the data has been gathered and validate the assumptions that were used during
calculation of the required sample size.
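
Two standard sample-size formulas follow from those estimates. The sketch below is not taken from the manual’s own calculator; it uses the common 95% confidence formulas, and the example inputs are made up.

# Common sample-size formulas at 95% confidence (Z = 1.96).
import math

Z = 1.96

def n_for_mean(sigma_estimate, margin_of_error):
    # n = (Z * sigma / E)^2 for estimating a mean from continuous data.
    return math.ceil((Z * sigma_estimate / margin_of_error) ** 2)

def n_for_proportion(p_estimate, margin_of_error):
    # n = (Z / E)^2 * p * (1 - p) for estimating a proportion from discrete data.
    return math.ceil((Z / margin_of_error) ** 2 * p_estimate * (1 - p_estimate))

print(n_for_mean(sigma_estimate=15, margin_of_error=5))         # e.g. minutes
print(n_for_proportion(p_estimate=0.10, margin_of_error=0.03))  # e.g. proportion defective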

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 7
CONTROL
In the Control phase, the team will quantify the change in performance to make sure
improvement is sustained.
Statistical Process Control (SPC) has roots that are even deeper than Six Sigma. In
the 1920’s Dr. Walter Shewhart developed SPC while working at Western Electric Company.
His belief was that data must be shown in context to be understood and that using only the
average or range of a set of numbers to make decisions is not effective. All the individual
values must be presented in the order that they were collected over time to show the behavior
of the process.
The primary tool in SPC is the control chart. A control chart shows data in time
sequence (or time series) so that the process owner can see trends, patterns, and cycles over
time. A center line is calculated from the data points indicating the average value, and upper
and lower control limits are added to show the highest and lowest value the process should
generate under normal circumstances.
Control charts are used to show whether a process is in statistical control. The
importance of SPC lies in being able to predict the output of a process - any process. The
process must have a consistent and predictable output before it can be adjusted to meet a
customer’s requirements. Trying to change the process with insufficient information -
without a full understanding of what the process is capable of generating - is like shooting at
a moving target.
There are different control charts available depending on the type of data being
analyzed. Once again, continuous and discrete are the main categories of data. The
continuous data control charts plot individual values or the average of a set of values over
time, along with a second chart that shows the range of the set of values or, in the case of the
individuals chart, the range from one point to the next. The discrete data control charts are
further divided into those that will chart the number of items in a set that were defective and
those that will chart the total number of defects that are produced. Once again these groups
will be divided into those sets of data that have a constant opportunity for defects and those
for which the opportunity for defects is variable.
Control charts will be used to track the dashboard measures of every process, so they
must be structured in a usable format and they must measure relevant information that will be
used to continually improve the process.

This data analysis manual has been prepared to assist Black Belts with the quantitative
aspects of DMAIC and DMADV projects. Each chapter includes examples and exercises
intended to help during classroom discussion and later for reference. Instruction will be
presented using Minitab statistical software and Microsoft Excel. Familiarity with Excel is
assumed; familiarity with Minitab is not assumed and instructions for use are included.

Questions can be directed to the Quality Department at Green Hills.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 8
CASE STUDY
To enhance the instructional effort of this manual, a case study has been included. The data
set will be available from the instructor and through the Intranet.

The case study data is related to a Penske Truck Leasing branch that performs maintenance on
trucks leased by five fictional customers. The two types of service being observed are
preventive maintenance and repairs to the truck’s air conditioning system. Two years of PM
and repair visits have been recorded, with the amount of time spent in the shop and the cost of
the service for each visit.

Both continuous and discrete data are included, and the student will be able to practice using
nearly every analysis tool in this manual.

Case study data is collected in a spreadsheet entitled Air Conditioner Exercise.xls


Worksheet “Data” contains the two years of data for each service visit for the five fictional
customers.
Worksheet “Customer Info” is the Pivot Table showing annual summaries for time spent in
service and cost of service for each customer.
Worksheet “Test” is the data corresponding to a test of the measurement system accuracy, in
this case the time to reach the required pressure in the A/C system. Also, baseline and
improved capability can be tested using this data.

Use the analysis tools and control charts to show which customers have more problems and
for what reason(s).

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 9
Basic Statistics

What is Sigma?

OBJECTIVES
- Develop an understanding of data types and explain the difference between
continuous and discrete data.
- Describe data characteristics using center, spread and shape information.
- Calculate mean, median and standard deviation.
- Illustrate the Normal Curve as a basis for assumptions in data analysis.
- Introduce the p-value as a decision tool.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 10
Types of Data
• Discrete / Attribute
  – Counted events
    • number of errors
    • number of trucks
  – Categories
    • Employee 1, Employee 2, Employee 3
    • 1st shift, 2nd shift
    • Yes / No
• Continuous / Variable
  – Cycle time
  – Revenue / Cost

Discrete data are counted events or categories. Continuous data are measured on a
scale to a very fine level of detail. Examples of discrete and continuous data are
shown above. It is important to identify the data type before any measurements
begin.
It is important to know the type of data being collected and studied. The tests that
will be used to chart and analyze the data will be specific to the type of data under
investigation. The measure of process capability, or sigma (σ), is calculated
differently for discrete and continuous data.
Generally, continuous data will provide a more complete picture of variation in a
process than discrete data. However, discrete data can be used in many ways to
show causes for defects, frequency of defects under various conditions and other
useful information. It is important to note that discrete and continuous data are
often used together. Analysis tools will differ depending on which type of data is
measured for the input to the process and which type is measured for the output.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 11
Distributions
• Show the center, spread and shape of data
over time.
• Key to understanding variation
– Plot the data
– Observe shape of the distribution
• Tools:
– Frequency plot
– Histogram
– Boxplot

Creating a graph of data will give you a visual understanding of the process you are
observing. You can begin to determine the objective of your improvement process
once the center, spread and shape of the data are identified.
Center: The center of data refers to the mean (average value) or median (middle
value) of the data you are observing. Identifying the center of your data is critical
to understanding the process because it represents the most likely or expected value.
Spread: The spread of data refers to how far the individual values move away
from the center (mean/median). Variance and standard deviation are the two major
measures of spread.
Shape: The shape of data shows us what portion of the data is higher or lower than
the mean or median. Shape also shows which values occur most frequently. There
are many shapes that can occur (normal, skewed, bimodal). In this section you will
learn about the different graphs used to display these shapes (boxplot, dotplot,
histogram).

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 12
Center
Central Tendency
• Mean: Average of a set of values
– All points influence the mean
– Extreme points pull the mean toward them
• Median: Center number of a data set
– Structural center of distribution
– Not influenced by extreme values
– Preferred for extreme (long-tailed) distributions

The most common way to summarize a data set using only one value is the central
tendency. This is the value we expect is most likely to occur. We will use different
measures of central tendency depending on the distribution of the data we are
observing.
The mean of a set of data is the average of all individual values.
The median is the middle value of the data set when the data is ordered from lowest
to highest.
The mode of the data set is the one value that occurs most frequently. The mode
will see very limited use in our data analysis.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 13
Center
Mean and Median
For the following data set:
3.5, 9.5, 5.0, 2.5, 6.5, 5.0, 4.5, 3.0, 10.0

The mean equals:
µ = ΣX / N
49.5 ÷ 9 = 5.5

The median is:
For an odd number of values
3.5, 9.5, 5.0, 2.5, 6.5, 5.0, 4.5, 3.0, 10.0
ordered: 2.5, 3.0, 3.5, 4.5, 5.0, 5.0, 6.5, 9.5, 10.0
median = 5.0

For an even number of values
3, 5, 5, 2, 6, 5, 2, 3, 1, 6, 4, 5, 3, 2
ordered: 1, 2, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5, 6, 6
median = 3.5

To calculate the mean (µ) of a data set:


- add all the values to get the total (Σx)
- divide that total by the number of individual values (N).
In the example above, the total of the values is 49.5 and there are 9 values in the
data set. 49.5(Σx)÷9(N)=5.5(µ)

To calculate the median (x̃) of a data set:


- order the values from lowest to highest
- find the middle value
- if the data set has an odd number of values, there will be a definite middle
value
- if the data set has an even number of values, look for the middle two
values and find the value that is halfway between them by adding them
together and dividing by two.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 14
Center
Quartile Values
[Two example histograms:
Skewed Right – Mean = 5, Median = 3.5, 1st Quartile = 1.5
Skewed Left – Mean = 36, Median = 37.5, 3rd Quartile = 39.5]

How do we find central tendency for skewed data?

In cases dealing with a skewed distribution, the truest measure of central tendency
may be neither the mean nor the median. In these cases, the first or third quartile
value may be used.
On a histogram, data that is skewed to the right will look tall at the left end of the x-
axis and will taper far to the right. For this distribution, the first quartile value can
be used. This value is halfway between the lowest value and the median. 25% of
all the values in the data set will fall to the left of the first quartile.
On a histogram, data that is skewed to the left will look tall at the right end of the x-
axis and will taper far to the left. For this distribution, the third quartile value can
be used. This value is halfway between the median and the highest value of the
data set. 25% of all the values in the data set will fall to the right of the third
quartile.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 15
Spread of Data
• Variance: Average squared distance from the mean
• Standard Deviation: Square root of the Variance
  – Most commonly used to show variation
  – Standard Deviation = Sigma (σ)
• Range: Highest value to lowest
  – Even one extreme point changes the range

In data analysis it is critical to understand how far the data is spread away from the center (mean,
median). Understanding spread will give you a better understanding about the stability of your
process. In most cases, data that is spread away from the mean indicates large variation (not stable)
and data gathered close to the mean indicates small variation (stable). Below are the major indicators
of spread that are most commonly used in data analysis.
Variance (σ²): A measure of how the data tends to vary around the mean. Variance takes into
account all the data points and is a good indicator of stability in a process. Variance can also be
computed as the Standard Deviation squared.
Standard Deviation (σ): The square root of the variance. Like the variance, it takes into account all
data points. It is the most commonly used measure of spread.
Range: The simplest measure of spread for a set of data. Range measures the distance between the
largest and smallest observations in the data. Although it is the easiest measure of spread to calculate,
it conveys the least amount of information. Since range is determined by the two extreme values of a
data set, it gives no indication about the spread of data between these two points. (Maximum –
Minimum = Range)
Span: The distance from the 5th percentile to the 95th percentile. Span is used to eliminate the extreme
high and low values in a data set. This is a useful statistic to measure if you feel the data contains
extreme outliers that are not true indicators of your process.
Stability Factor: Used with quartile measurements to show variation. It is calculated by dividing the
first quartile by the third quartile. A stability factor close to zero indicates a high level of variation and
closer to one shows stability.
Z score: Used to show variation without the actual values. By rescaling the data to a mean of 0 and a
standard deviation of 1, we can show how many standard deviations an individual value is positioned away
from the mean. Use this transformation to show the proportion of output that is out of spec. The Z-score is
also used when comparing sets of data with different scales to show the level of variation without the actual
numbers (dollar amounts, revenue miles, payroll, etc.).
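
The sketch below is not from the manual and the data set is made up; it computes these spread measures for a small sample, with span and the stability factor following the definitions above.

import statistics

data = sorted([12, 15, 11, 18, 14, 16, 13, 20, 15, 17])

variance = statistics.pvariance(data)      # average squared distance from the mean
std_dev = statistics.pstdev(data)          # sigma: square root of the variance
data_range = max(data) - min(data)         # maximum minus minimum

quartiles = statistics.quantiles(data, n=4)      # [Q1, Q2, Q3]
percentiles = statistics.quantiles(data, n=20)   # 5th, 10th, ... 95th percentiles
span = percentiles[-1] - percentiles[0]          # 5th to 95th percentile
stability_factor = quartiles[0] / quartiles[2]   # near 1 = stable, near 0 = high variation

z_scores = [(x - statistics.mean(data)) / std_dev for x in data]
print(variance, std_dev, data_range, span, round(stability_factor, 2))
print(z_scores[-1])   # standard deviations the largest value sits above the mean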

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 16
Center and Spread
Calculations
• Population

  Mean (µ) = ( Σ xᵢ ) / N, summing i = 1 to N

  Standard Deviation (σ) = √[ Σ (µ − xᵢ)² / N ], summing i = 1 to N

• Sample

  Mean (x̄) = ( Σ xᵢ ) / n, summing i = 1 to n

  Standard Deviation (s) = √[ Σ (x̄ − xᵢ)² / (n − 1) ], summing i = 1 to n

N = number in entire population, n = number in sample
x = each individual value, i = sequence number of each value

To calculate the mean of a population or sample*:


- Add all the individual values
- Divide by the number of values
The resulting number is the most likely value for data sets that are normally
distributed.
To calculate the standard deviation of a population or sample*:
- Subtract each value from the mean and square the result
- Add all the squared differences
- Then, for an entire population, divide this sum by the total number of
individual values (N).
- For a sample, divide the sum by the number of individuals values in
the sample (n) minus 1
- The final step is to take the square root of the resulting quotient.
*Note: Dividing by n − 1 instead of n increases the standard deviation slightly. The standard deviation of a
sample tends to be smaller than that of the entire population, and our objective is to estimate the
parameters of the population using the statistics of the sample.

Notice the different symbols used for population and sample data:
- The mean of a population is shown by the Greek letter mu - µ
- The mean of a sample is shown by a lower case letter x with a bar
above it - x̄
- The standard deviation of a population is shown by the Greek letter
sigma - σ
- The standard deviation of a sample is shown by a lower case - s
*Note: Population and Sample characteristics will be explained in Chapter 5.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 17
Center and Spread
Calculations - Exercise

Calculate the mean and standard deviation for the following sample:
5, 9, 11, 6, 5, 12, 8, 10, 10, 7

mean = __________
std dev = _________

Worked answer:
5+9+11+6+5+12+8+10+10+7 = 83 ; 83/10 = 8.3 mean
Deviations from the mean: 8.3-5=3.3; 8.3-9=-0.7; 8.3-11=-2.7; 8.3-6=2.3; 8.3-5=3.3; 8.3-12=-3.7; 8.3-8=0.3; 8.3-10=-1.7; 8.3-10=-1.7; 8.3-7=1.3
Squared deviations: 3.3²=10.89; (-0.7)²=0.49; (-2.7)²=7.29; 2.3²=5.29; 3.3²=10.89; (-3.7)²=13.69; 0.3²=0.09; (-1.7)²=2.89; (-1.7)²=2.89; 1.3²=1.69
10.89+0.49+7.29+5.29+10.89+13.69+0.09+2.89+2.89+1.69 = 56.1
56.1/9 = 6.23 ; √6.23 = 2.497 standard deviation

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 18
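
The exercise answer, and the note about dividing by n − 1, can be checked with the Python lines below (not part of the original manual).

from statistics import mean, stdev, pstdev

sample = [5, 9, 11, 6, 5, 12, 8, 10, 10, 7]

print(mean(sample))    # 8.3
print(stdev(sample))   # about 2.50 - sample formula, divides by n - 1
print(pstdev(sample))  # about 2.37 - population formula, divides by n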
Shape
Frequency Plot

[Dotplot of example data, values ranging from about 10 to 16]

Minitab>Graph>Dotplot

Dotplots clearly illustrate variation and are especially useful for comparing several
groups of data. Each dot represents one output value or data point. Suppose the
graph above is showing the time it takes to prepare invoices. One invoice may take
10 days to prepare, while another may take over 16.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 19
Shape
Histogram

[Histogram of the same example data (values from about 9 to 16) with a fitted curve. The curved line is added to show how the distribution would look if we were to collect an infinite number of samples.]

Minitab>Stat>Basic Statistics>Display Descriptive Statistics

Histograms clearly illustrate the data at specified intervals. This is the same data
displayed on the previous dotplot. On the histogram, the height of each bar is
determined by the number of output values that fall within each interval. From the
example you can see the two tallest bars in the middle indicate that the data points
between 12.3 and 13.3 represent the most common output values.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 20
Shape
Boxplots
[Two example boxplots with annotations: Outliers (stars); 4th Quartile (76%-100%) – whisker up to the highest value within the upper limit; 3rd Quartile (51%-75%); Median; 2nd Quartile (26%-50%); 1st Quartile (1%-25%) – whisker down to the lowest value within the lower limit.]

Minitab>Graph>Boxplot

Boxplots are another excellent tool in observing the distribution of single or


multiple sets of data. Minitab takes the data and displays it in a box format. This
box is divided by a single line depicting the median. The thickness of the box is
determined by the spread of the data. The thicker the box, the more dispersed the
data is from the median. The thinner the box, the more data points fall closer to the
median. By observing the median it is easy to see if your data is skewed. The areas
above and below the median, but still within the box, show the size and spread of
the second and third quartile. The lines extending from the box (whiskers) extend
to the furthest points that are still within the upper and lower limits. These
whiskers cover the 1st and 4th quartiles.
Stars are in place to show outliers and their locations.
A red dot is often added to indicate the mean of the process.

Note: It is easy to see that the box on the right has a nice even distribution, while the box on the left
is skewed and possesses outliers.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 21
Shape
The Normal Curve
• The curve can be divided in half around the
most frequently occurring value.
• Peak of the Curve represents the center of
the process
• Area under the Curve equals 100% of the
output of the process
• The Curve never reaches zero
From GE Capital DMAIC Wave I

Almost everyone has seen a bell-shaped curve (page 14). Most natural and man-
made data will occur in a bell-shaped curve and many of our assumptions in data
analysis are made based on this shape. We call it the “Normal Curve.”
There are certain properties to the normal curve:
- It is symmetrical - the curve can be folded in half at its highest point,
which is the center of the process (mean).
- The curve never reaches zero. If we were to collect an infinite number of
samples, there would always be some values far to the right and left.
- The area under the curve represents 100% of the possible output of the
process. If we know how much of the output falls to the right of a certain
point, then we know how much falls to the left.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 22
Shape
Normal Curves

The Normal Curve is defined by its mean and standard deviation.


What is the difference among all the curves shown above? Which one is the strongest when
it comes to observing data? The difference in the curves above is the standard deviation.
Generally, the narrowest curve is the most desirable because it represents the tightest
process and that is the objective of Six Sigma methodology - to reduce variation.
The Normal Curve is divided into segments measured by standard deviation.
From the mean to the left or right:
- One standard deviation includes 34% of possible outputs (68.3% together)
- Two standard deviations include 48% of possible outputs (95.5% together)
- Three standard deviations include 49.9% (99.7% together)

[Normal curve divided at -3s, -2s, -1s, mean, +1s, +2s, +3s (s = Standard Deviation), with the area under the curve marked: 68.3% within ±1s, 95.5% within ±2s, 99.7% within ±3s.]
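
These areas come straight from the cumulative distribution function of the normal curve. The sketch below is not from the manual and assumes SciPy is available; it reproduces the three figures quoted above.

from scipy.stats import norm

for k in (1, 2, 3):
    area = norm.cdf(k) - norm.cdf(-k)   # area within +/- k standard deviations of the mean
    print(f"within +/-{k} sigma: {area:.2%}")
# Prints 68.27%, 95.45% and 99.73%, matching the 68.3%, 95.5% and 99.7% quoted above.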

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 23
Shape
Probability Plot

[Normal probability plot of the example data. ML Estimates – Mean: 61.28, StDev: 9.32. The solid line marks the fitted mean and the dashed lines mark the confidence intervals. By observing your data points within the confidence limits, you can better understand the normality of your data.]

Minitab>Graph>Probability Plot

Probability plots are an excellent tool for visually interpreting the normality of your
entire set of data. Once your data points have been entered, Minitab will
automatically assign confidence intervals based on that data. Points are said to be
normal if they appear to fall within these confidence intervals. This is also referred
to as “good fit.” If too many points appear to fall outside of the confidence intervals,
the data can be said to have a distribution that is not normal.

Note: Probability plots are good visual indicators of the normality of your data, but always use
hypothesis testing (p-values) to ensure the strength of your data. See p-values on page 26.
Note: Confidence Interval is the range above and below the target line within which we are 95%
sure that all points would line up if the distribution were truly normal. We will see confidence
intervals elsewhere in this manual and they often refer to the range that should contain the mean of
the data set. With larger sample sizes the confidence intervals are tighter because we can be more
certain of the target values. With smaller sample sizes the confidence intervals are generally wider
and this situation may require additional sampling to be sure of the mean.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 24
Probability Plots

[Three example probability plots:
NORMAL PROCESS – ML Estimates: Mean -0.108, StDev 1.074
MULTIPLE PROCESSES – ML Estimates: Mean 1738.28, StDev 284.938
UNSTABLE PROCESS – ML Estimates: Mean 32.25, StDev 46.95]

Minitab>Graph>Probability Plot

Probability plots are not only indicators of normality, they also tell you something
about the process they represent by the shape of their distribution. Data can take on
an infinite number of distribution shapes. Displayed above are a few common
shapes that can provide information to you about your process. Distributions that
resemble these shapes commonly reflect the processes labeled above them.

Note: Never come to a full conclusion about your data by observing just one statistical tool. Always
view and test your data to the fullest.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 25
The p-value
Descriptive Statistics
Variable: Length

Anderson-Darling Normality Test
A-Squared: 0.583
P-Value: 0.127

Mean          61.2825
StDev          9.3525
Variance      87.4684
Skewness      -0.30
Kurtosis       0.0384
N            143

Minimum       36.0000
1st Quartile  57.0000
Median        61.0000
3rd Quartile  67.5000
Maximum       83.0000

95% Confidence Interval for Mu:     59.7365 to 62.8286
95% Confidence Interval for Sigma:   8.3797 to 10.5827
95% Confidence Interval for Median: 60.0000 to 63.0000

The p-value is used to help us make decisions about the significance of our data set. It
represents the chance of making an error if you reject an assumption that is actually true. The
p-value is the probability that results like yours could occur if your assumption is actually true.
In the output above, the Anderson-Darling p-value of 0.127 is greater than 0.05, so we do not
reject the assumption that the Length data are normally distributed.

Note: We will learn more about the p-value during the Hypothesis Testing Chapter.
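
For readers working outside Minitab, the sketch below uses simulated data and assumes NumPy and SciPy; the test used is the D'Agostino-Pearson normality test rather than Anderson-Darling, but the p-value decision is applied the same way.

import numpy as np
from scipy.stats import normaltest

rng = np.random.default_rng(1)
lengths = rng.normal(loc=61.3, scale=9.4, size=143)   # simulated "Length" data

stat, p_value = normaltest(lengths)
print(f"p-value = {p_value:.3f}")
if p_value <= 0.05:
    print("Reject the assumption of normality.")
else:
    print("No evidence against normality: treat the data as normal.")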

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 26
Common Distributions

[Four example histograms: Bimodal, Skewed Right, Skewed Left, Sawtooth]

The data we collect will not always distribute itself in a perfect normal curve.
Often data will take the shape of one of the distributions shown above. Your
actions will be different based on the shape, center and spread of the data displayed.

Note: Refer to GE Capital DMAIC Wave I for more information on distribution shapes.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 27
Summary
• Continuous or Discrete data will determine
what tests to use in data analysis
• Center, spread and shape of data are critical
to understanding process
• Central tendency and variance are the
foundation of data analysis
• Many assumptions in data analysis are
based on the normality of data

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 28
Graphs

Paint a Picture of Data

OBJECTIVES
- Select the right graphical tool to illustrate variation to stakeholders.
- Use Pareto Chart to show defect categories and frequencies.
- Use Scatter Plot to show variation for input and output variables.
- Use Time Series Plot to show variation over time.
- Use Main Effects Plot and Interval Plot as categorization tools.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 29
Graphs
[Examples of graphs: a Time Series Plot of monthly sales (millions) by month and quarter, a character-graph Scatter Plot of car sales based on temperature, and a Pareto Chart of defects for manufactured washing machines.]

Graphs are essential to data analysis for a number of reasons. Visually displaying
data can make it more understandable to the human mind. Graphs take complex
data and structure it for easy interpretation. There are a variety of graphs to choose
from based on the type of data you have and the output you are looking for. Graphs
help us understand the nature of variation and help separate signals from noise*.
Dr. Donald Wheeler, the author of Understanding Variation, states that graphs
should be the primary presentation tool in data analysis. He goes on to say that if
you can’t show it (your data) graphically, you probably haven’t reached a good
conclusion.

*Note: See page 41 for a discussion of signal and noise.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 30
Pareto Chart
Defects for Manufactured Washing Machines

[Pareto chart: bars show the count of each defect type from most to least common, the left y-axis shows counts, the right y-axis shows percent, and the Cumulative % Line shows the running total.]

Defect:   Defective Parts  Missing Parts  Scratch  Wrong Color  Dent  Others
Count:    267              88             35       23           19    5
Percent:  61.1             20.1           8.0      5.3          4.3   1.1
Cum %:    61.1             81.2           89.2     94.5         98.9  100.0

Minitab>Stat>Quality Tools>Pareto Chart

A Pareto chart may look at first like a simple bar chart, but when used properly it can
provide a vast amount of information about your data. Pareto charts are best utilized for
categorizing defects. Look at the example above. The chart shows reasons for defective
washing machines coming off an assembly line. The graph displays the defects from left
to right and from most common to least common. Graphically displaying your data from
most common to least common defects gives you the opportunity to focus attention on
the most vital defects. The y-axis on the left shows the number of defects by count and
the y-axis on the right shows what percentage those defects make up among all defects.
The cumulative percentage line is also inserted into the graph to show the cumulative
percentage after each type of defect. All of this information, count, percentage and
cumulative percentage, is listed at the bottom of the graph page.
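
A Pareto chart can also be built outside Minitab. The sketch below is not part of the manual and assumes the matplotlib library; it uses the washing-machine counts from the example above and adds the cumulative percentage line.

import matplotlib.pyplot as plt

defects = {"Defective Parts": 267, "Missing Parts": 88, "Scratch": 35,
           "Wrong Color": 23, "Dent": 19, "Others": 5}

labels = list(defects.keys())
counts = list(defects.values())
total = sum(counts)
cum_pct = [sum(counts[:i + 1]) / total * 100 for i in range(len(counts))]

fig, ax = plt.subplots()
ax.bar(labels, counts)                    # defect counts, most to least common
ax.set_ylabel("Count")
ax.tick_params(axis="x", labelrotation=45)

ax2 = ax.twinx()                          # second y-axis for the percentages
ax2.plot(labels, cum_pct, marker="o")     # cumulative % line
ax2.set_ylim(0, 100)
ax2.set_ylabel("Cumulative %")

plt.tight_layout()
plt.show()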

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 31
Scatter Plot
Car Sales Based on Temperature

[Character-graph Scatter Plot: Temperature (roughly 60 to 84) on the y-axis, Sales (17.5 to 35.0) on the x-axis; each + marks one observation.]

Minitab>Graph>Character Graphs>Scatter Plot

Many times the number of occurrences can say a lot about your data. Suppose you
want to know how many times something occurred within your data set at a specific
level. Scatter Plot charts are an excellent visual aid for viewing pairs of data
together at one time. Variables are assigned along the X and Y axis, while points
within the chart are represented by a designated symbol. Each symbol indicates
one occurrence. Observing where and how many times something takes place can
help you pinpoint problem areas or areas deserving further analysis. In the example
above, a car salesman uses a Scatter Plot to graph the number of automobile sales
based on outside temperature.

Note: It is very important not to conclude causal relationships between predictor variables on the x-
axis and response variables on the y-axis when using Scatter Plots. These graphs should be used as
an exploration tool when viewing your data, not as a tool that shows definite cause and effect.
Cause and Effect will be covered in depth in Chapter 10.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 32
Time Series Plot

[Time Series Plot: Sales (millions) on the y-axis, charted by Month (J through D) and Quarter (1 through 4).]

Minitab>Stat>Time Series>Time Series Plot


or
Minitab>Graph>Time Series Plot

Many of the charts that we commonly see in use today can be classified as Time
Series Plots. Time Series Plots show the progression of data or process over time.
It is very common for news organizations, corporations and governments to utilize
these types of charts because just about anyone can understand them.
For your own data analysis, a Time Series Plot can help you understand a great deal
about your data or process. Not only can these charts show you variation, but also
useful indicators such as long- and short-term trends and seasonality. In the
example above, you can see that the sales manager at XYZ company charted sales
for one year by month and quarter.

Note: Minitab gives you a great deal of options for displaying Time Series Plots. Take the time and
set your charts to accurately reflect your data. Minitab will enable you to display your data in time
increments as small as a second and date increments as large as a year.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 33
Main Effects Plot
Canceled Orders by Day of the Week

[Main Effects Plot: the mean number of canceled orders (roughly 200 to 4200) for each day of the week, with the overall mean shown as a dashed line.]

Minitab>Stat>ANOVA>Main Effects Plot

Many times we are asked to compare data from different groups. This data can be
segmented for analysis. A good tool you can use to show how a process performs
under different conditions is the Main Effects Plot. The data is sorted into
categories that are displayed on the x-axis and the data in each category is evaluated
separately. The mean of the data for each category is shown by the dots on the
connected line and the overall mean is displayed as a dashed line.
This graph can show a good snapshot of a process to identify significant differences
in performance among groups.
The example above shows the number of orders that are canceled at a distribution
center in Brazil. The cancellations are sorted by the day of the week. You can see
that cancellations are more common on Saturday. The next step in improving your
process might be to answer the question, “what is special about Saturday?”

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 34
Interval Plot
Canceled Orders by Day of the Week

[Interval Plot: canceled orders (roughly 500 to 6500) for each day of the week, showing the mean and interval for each day.]

Minitab>Stat>ANOVA>Interval Plot

An Interval Plot can be used to show your data in greater detail than the Main
Effects Plot. The Interval Plot shows the mean of the data set in each category and
the confidence interval for the mean or the standard error for the data set. This can
help to identify the greatest and smallest amount of variation among the categories.
The farther apart the interval lines, the greater the variation in that category.
With an Interval Plot, we can see that the average number of canceled orders is
greater on Saturday. The number of cancellations on a given Saturday can reach a
number much higher than that average. We also see that the number of canceled
orders seems very predictable on Thursday because the interval lines are closest
together. If we compare Thursday to Saturday, we may find a way to improve the
process.
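The interval drawn for each category is typically the category mean plus or minus a t-based margin of error. A rough sketch for one category, using hypothetical Saturday counts and scipy for the t critical value:

import math
from statistics import mean, stdev
from scipy import stats

# Hypothetical Saturday cancellation counts (not the data charted above)
saturday = [4100, 4300, 3900, 4200, 4600, 3800]

n = len(saturday)
xbar = mean(saturday)
s = stdev(saturday)                       # sample standard deviation
t_crit = stats.t.ppf(0.975, df=n - 1)     # two-sided 95% critical value
margin = t_crit * s / math.sqrt(n)

print(f"mean = {xbar:.0f}, 95% CI = ({xbar - margin:.0f}, {xbar + margin:.0f})")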

Minitab Note: For both the Main Effects Plot and the Interval Plot it may be necessary to stack the
data in your Minitab worksheet. To do this, use the path: Manip>Stack/Unstack>Stack Columns.
Store the stacked data in a blank column (usually named “Stack”) and use that column for the
Response. Store the subscripts in another blank column (usually named “Group”) and these
subscripts become the categories for the x-axis. Use this column for the Factor and the categories
will often show up as numbers. Instead of Friday, Monday, Saturday… you will see 1, 2, 3... You
will have to translate the numbers back to the category names, keeping in mind that Minitab orders
words alphabetically.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 35
Analysis Chart Decision Tool

(Matrix: each chart type covered in this chapter is scored against the data characteristics listed below. A filled symbol marks the optimal chart for that characteristic; an X marks a chart that meets the requirement.)

Characteristics compared:
Continuous Data
Discrete Data
Trends & Seasonality
X & Y Axis
Data Counts
Data Percentages
Mean of Data
Range of Data
Data Over Time
Defects
Control Limits
Segmented Data
Occurrences
Displays Variation

Optimal = filled symbol
Meets Requirements = X

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 36
Summary
• Graphs help us understand the nature of variation
• Graphs make nature of data more accessible to
the human mind
• Graphs help display the context of the data
• Graphs should be primary presentation tool in
data analysis
- If you can’t show it graphically, you probably don’t have a good conclusion

• Graphs help separate the signal from the noise


Source: Donald Wheeler: Understanding Variation

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 37
Control Charts

The Primary Tools in


Statistical Process Control

OBJECTIVES
- Introduce control charts as an essential tool in Statistical Process Control
and the DMAIC process.
- Revisit continuous and discrete data.
- Explain types of control charts and their construction.
- Differentiate control limits from specification limits.
- Highlight variation so that Black Belts can differentiate between common
cause (noise) and special cause (signals).

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 38
Control Charts
(Chart: AUGUST DELIVERIES, an individuals chart of roughly 500 observations in time order, 1 AUG through 31 AUG. The center line, the average, is X=1.225; the Upper Control Limit is 3.0SL=6.519; the Lower Control Limit is -3.0SL=-4.069. The x-axis is Observation Number.)
Minitab>Stat>Control Charts

Is the process in control? One of the most user-friendly tools you can use to evaluate
the performance of a process over time is a Control Chart. These charts plot data in
time sequence and detect the presence of special cause variation. There are a variety
of Control Charts to choose from depending on the type of data you are observing, but all have
the same basic structure and characteristics. Once the data is entered, Minitab will plot a
center line based on the average of the data being plotted. Minitab will also plot the
upper and lower control limits above and below the center line. Once these lines are
in place, it is easy to observe your data points and judge how well the process is
controlled. The following slides detail some of the Control Charts you can use.

Note: When entering your data into Minitab, ensure your data is in time order sequence. By doing so
you will be able to observe the changes in your process over time. The chart above displays August
data in time sequence from the beginning of the month to the end.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 39
Why Use Control Charts?
• Statistical Process Control (SPC) is used to
determine the amount of variation in a process
• Graphically displaying process data provides a
basic description of variation and its sources.
• Once the process is in control, improvement
objectives can be set.

In the 1920’s, Dr. Walter Shewhart developed Statistical Process Control (SPC) at
the Western Electric Company as a way to identify common cause and special cause
variation in a process. SPC helps us to find signals that a process is out of control
amid the uncontrolled noise that produces random variation.
Control Charts are the primary tool in SPC because they show process variation over
time. They separate signals from noise and they show data in time series, not as
isolated points. Better management decisions can be made when the data is seen in
the context of time.
Control Charts also help us track improvement. Once a process is in control, the
team can decide what improvement action is required. This is a step in the Measure
process of DMAIC (State Improvement Goal). In order to meet the customer’s
needs, the team can decide to:
- Reduce process variation if the variation is too great
- Shift the process mean (average) if the customer’s specification is not being
met
- Accomplish both a reduction in variation and a mean shift.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 40
Common Cause
• Noise
Process inputs that consistently cause random variation in the output. Noise is also referred to as white noise, random variation, common cause variation, noncontrollable variation, and within-group variation.

Special Cause
• Signals
Signals are changes in output caused by a specific factor such as environmental conditions or process input parameters. Special cause variation can be directly accounted for and potentially removed. It is a measure of process control, or how well the process is performing compared to its potential. Signals are also known as nonrandom variation, controllable variation, and between-group variation.

All processes will exhibit variation. The approach of the Black Belt team toward
improving the process will be determined by the level of common cause and special
cause variation.
As long as the data stays within the control limits and does not exhibit any sort of
special cause variation, it is said to be in control and can be targeted for
improvement.
Noise (common cause variation) is present in all processes. If it is mistaken for a
signal, time and money can be wasted trying to improve a process that is not
possible to improve. This can result in more variation, and this mistake is known as
“tampering”.
Special cause variation is not only shown by a point above or below the control
limits, but also by trends, patterns, or cycles visible in the data. Without a control
chart, signals can be missed and opportunities for improvement can be lost. Control
charts are very useful: they show not only the variation itself, but also how that
variation behaves over time.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 41
Special Cause Patterns
(I Chart for Shift)
Shift: 8 or more points in a row on the same side of the center line indicate a shift in the process. Find out what was different about the process during the shift.

(I Chart for Trend)
Trend: 7 or more points in a row continuously increasing or decreasing. Often seen after some change was made in the process. Helps to show a positive or negative effect.

Special cause variation is not only shown by points above or below the control
limits, but also by trends, patterns or cycles in the data. Any set of points that looks
non-random can be a signal that something is changing in the process.
For example, the shift shown above can be an indication that something was
different in the process for that period of time. An investigation into what inputs or
process steps were different during that time may reveal a special cause for that
variation. A corrective action can be generated to make the process more
consistent.
An upward or downward trend is usually easy to see and may be a result of a
temporary increase in attention or decline in vigilance. For example, a machine
may be slowing down, speeding up or there may be some other short-term change in
the process.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 42
Special Cause Patterns
(I Chart for Same Value)
Same Value: 7 or more points having the same value. The measurement device may be stuck or the operator may be rounding to the next whole number.

(I Chart for Cycle)
Cycle: The chart shows higher and lower groups of values over equal periods of time. Find out what the clusters above and below the center line have in common.

When the same value is repeated in the data, it can be an indication of a


measurement system that is stuck or, in many cases, an operator who is rounding off
the measurements.
Cycles can be short or long and may be hard to identify, but the key is to look for
runs of similar length above and below the center line. Even one point alternating
above and below for a long enough run, 14 or 15 points in a row, can be an
indication that some factor is influencing the process in a predictable manner. Any
variation that is not random is considered a special cause and merits investigation.
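These run rules are straightforward to check programmatically. A minimal sketch in Python, assuming the thresholds described above (8 points for a shift, 7 for a trend) and made-up data:

def longest_run_one_side(values, center):
    # Longest run of consecutive points strictly on one side of the center line
    longest = run = 0
    prev_side = 0
    for v in values:
        side = (v > center) - (v < center)   # +1 above, -1 below, 0 on the line
        run = run + 1 if side != 0 and side == prev_side else (1 if side != 0 else 0)
        prev_side = side
        longest = max(longest, run)
    return longest

def longest_trend(values):
    # Longest run of consecutive points that keep increasing (or keep decreasing)
    longest = run = 1
    direction = 0
    for prev, curr in zip(values, values[1:]):
        step = (curr > prev) - (curr < prev)
        run = run + 1 if step != 0 and step == direction else (2 if step != 0 else 1)
        direction = step
        longest = max(longest, run)
    return longest

data = [5.1, 5.3, 5.6, 5.8, 6.0, 6.1, 6.3, 6.4, 5.2, 5.0]  # hypothetical values
center = 5.5
print("Possible shift:", longest_run_one_side(data, center) >= 8)
print("Possible trend:", longest_trend(data) >= 7)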

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 43
Control Limits
vs.
Specification Limits
(Chart: a control chart with center line R=3.673, upper limit 3.0SL=12.00 and lower limit -3.0SL=0.000 over about 50 observations. A marks the numeric limit values; B marks the limit lines that span the length of the data set.)

It is very easy to mistake Control limits for the customer’s specification limits when
viewing charts that utilize these control tools. Control limits and Spec limits are points
set above and below the mean of a set of data. These points are usually indicated by a
numeric value (see A above) and a designated line that spans the length of the data set
(see B above). These numeric values and lines act as a warning system for your data
and help you determine when the process is not operating in a predictable manner. As a
general rule, points that fall on or within these limits are said to be in control. Points
outside of these limits are said to be out of control.
Now that we understand the fundamentals of Control and Spec limits, let's find out how
they are different and when to use each. Control limits are often referred to
as the natural process limits. The term "natural" comes from the fact that
these control limits are set by the existing process. In other words, the chart will
automatically calculate and set these limits by applying a formula to the input data.
This is referred to as the voice of the process. This is very important to understand because
this is the data telling us what is going on with our process, like it or not.
Specification limits are not automatically set by the chart based on a formula, they must
be entered manually. That means you can set these limits according to customer
requirements (VOC/CTQ), Six Sigma standards or any desired level that you choose.
Minitab gives you the ability to set these limits as part of its menu options.

Note: Minitab uses “SL” to denote upper and lower control limits. SL means Sigma Limit, not
specification limit.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 44
Control Limits
vs.
Specification Limits

(Chart: a u chart with center line U=0.1726, upper limit 3.0SL=0.3488 and lower limit -3.0SL=0.000 over 10 samples.)

Just because your data meets the requirements of the customer's specification
limits does not necessarily mean that your process is in control. Suppose you
set Spec limits for a particular set of data based on VOC, historical requirements
or some other standard. You plot the data against the customer's specifications
and find that all your data points fall within the specification limits. Does this
mean your process is in control? You cannot answer that question until you do some
more analysis. You must then run the same process data through a Control chart to
view it against the natural process limits. When you view the results, you may find
that a few of your points fall outside the control limits, signaling that the process
is out of control. It is possible that the Specification limits the customer set were
not strict enough to properly evaluate the level of control for your process data. In
order to properly measure the level of control, always evaluate it against the strictest
requirements. In these cases, always apply control limits before settling for
Specification Limits. We use Control Charts first to determine whether the process is
in control and later to determine the capability of our process (Ch. 7).

Note: What if my customer’s requirements (CTQ’s) are not that strict and I can get away with
meeting their CTQ’s based on the set Spec limits?
Remember, the goal of process control is 6 Sigma. Just because we are meeting our customer
requirements at a lower sigma level, does not mean we should stop striving to achieve the highest
level of quality possible.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 45
Continuous vs. Discrete Data
Variables: Continuous, measured data
Examples: Cycle Time, Temperature, Miles per Gallon, Annual Revenue
Charts: I-MR Chart, X-R & X-s Charts

Attributes: Discrete, counted data
Examples: 1, 2, 3, 4, etc.; Yes/No; Truck 1, Truck 2, Truck 3...
Charts: np & p Charts, c & u Charts

The choice of control chart will first be determined by the type of data you are
observing.
Continuous data will be graphically represented by individual values or groups of
values over time. The variation over time will be shown as lines sloping up or
down.
Discrete data will be graphically represented by a number of points or a percent of
the sample that is defective. The variation in the process will be summarized in
categories. Discrete data can also be charted using the counted number of defects
over time. The variation in the process will be shown as increasing or decreasing
totals.
The Continuous data control charts are:
- Individuals and Moving Range
- X-bar/R & X-bar/S
The Discrete data control charts are:
- np chart, for number of defective units
- p chart, for proportion defective
- c chart, for counted number of defects
- u chart, for counted number of defects per unit

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 46
Continuous Data
(Chart: AUGUST DELIVERIES individuals chart with center line X=4.480, upper limit 3.0SL=14.25 and lower limit -3.0SL=-5.290 over about 50 observations; the x-axis is Observation Number.)

Types of Data: continuous, measured


Examples: cycle time, temperature,
miles per hour, revenue
Types of Charts: Individuals, Moving Range
I-MR, Xbar, Xbar-S, Xbar-R
R chart, S chart

The choice of which control chart you will use to observe your data will be
determined by the type of data you are working with. Data can be categorized into
two types: Continuous Data and Discrete Data.
Continuous data is graphically represented by individual values or groups of
values (ranges) as they appear over time.

Note: In the chart above, the sloping lines show delivery times as they fluctuate during a given period.
The fact that the deliveries can occur at infinite times throughout the period makes the data
Continuous.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 47
Continuous Data Control Charts
I, MR - single values
X-bar - group averages
R - group size of 5 or less
S - group size greater than 5

Individuals - shows each value in time sequence

Moving Range - shows the change from one value to the next

X-bar Chart - shows the average value for each group over time

R Chart - shows the range, highest to lowest, for each group over time.

S Chart - shows the standard deviation for each group over time.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 48
Individuals Chart
(Chart: AUGUST DELIVERIES individuals chart with center line X=4.480, upper limit 3.0SL=14.25 and lower limit -3.0SL=-5.290, plotted by Observation Number.)

Minitab>Stat>Control Charts>Individuals

Individuals Charts are used when data points represent individual observations. You
can use the Individuals Chart to track the process level and detect special cause when
data points are set up in time order sequence.
In the above example you can see deliveries as they occur over time for XYZ Company.
Their mean delivery time is 4.48 hours early. This means that the average delivery
arrives more than 4 hours earlier than it is due. Deliveries occur above and below this
mean throughout the month.

Note: Use an Individuals chart when you want to look at individual observations over time. This should
not be confused with an Xbar chart, which graphs mean values for sets of data over time. Even when
you are charting the total number of invoices prepared in a week, the Individuals chart is used because
each weekly total is a single number and not an average.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 49
Moving Range Chart
(Chart: AUGUST DELIVERIES moving range chart with center line R=3.673, upper limit 3.0SL=12.00 and lower limit -3.0SL=0.000, plotted by Observation Number.)

Minitab>Stat>Control Charts>Moving Range

Observing the change in your data over time may give you some important insight into
your process. Moving Range Charts help you observe these changes and trends as they
occur. Just like all Control Charts, the center line and upper/lower control limits appear
according to the voice of the process. Again, you have the opportunity to change these to
reflect standards that are current for your process.
Example: In the example above, the first delivery was delivered 8 hours ahead of
schedule, the second 7 hours ahead, the third 4 hours ahead and the fourth 2 hours
ahead. The Moving Range Chart graphs these changes as 1…3…2... or
(8 - 7 = 1, 7 - 4 = 3, 4 - 2 = 2...).
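For reference, the limits Minitab draws on the individuals and moving range charts can be approximated by hand with the standard I-MR constants (2.66 for the individuals limits and 3.267 for the moving range upper limit). A minimal sketch with hypothetical hours-early values:

# Hypothetical delivery data (hours ahead of schedule)
values = [8, 7, 4, 2, 6, 5, 9, 3, 4, 6, 7, 5]

moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
x_bar = sum(values) / len(values)
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Standard I-MR constants: 3/d2 with d2 = 1.128 gives 2.66; D4 = 3.267 for the MR chart
i_ucl = x_bar + 2.66 * mr_bar
i_lcl = x_bar - 2.66 * mr_bar
mr_ucl = 3.267 * mr_bar
mr_lcl = 0.0

print(f"I chart:  center={x_bar:.2f}  UCL={i_ucl:.2f}  LCL={i_lcl:.2f}")
print(f"MR chart: center={mr_bar:.2f}  UCL={mr_ucl:.2f}  LCL={mr_lcl:.2f}")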

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 50
I-MR Chart
(Chart: AUGUST DELIVERIES I-MR chart. Individuals panel: X=4.480, UCL 3.0SL=14.25, LCL -3.0SL=-5.290. Moving Range panel: R=3.673, UCL 3.0SL=12.00, LCL -3.0SL=0.000.)

Minitab>Stat>Control Charts>I-MR

Suppose you want to observe an Individuals Chart and a Moving Range Chart for your
data simultaneously. Minitab gives you the power to do so with the command listed
above. Notice that although no points go out of the control limits in the Individuals
Chart on top, one point exceeds the limits in the Moving Range Chart. This may or
may not be a problem, but as you can see looking at the same data in simultaneous
formats can give you a powerful comparison tool.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 51
X-bar Chart
(Chart: X-bar Chart for 1Shift - 3Shift with center line X=15.02, UCL 3.0SL=15.49 and LCL -3.0SL=14.54, plotted by Sample Number.)

Minitab>Stat>Control Charts>Xbar

Suppose your process is broken up into subgroups - samples of data to be studied
together - and each subgroup contains numerous individual data points. How
can you observe your process without cluttering your graph with all of those
individual points? Using the X-bar Chart you can graph the mean of each subgroup
to show how well your process is controlled.
Example: XYZ Company manufactures parts. They run three shifts per day and
want to look at the number of rejected parts per shift each day. Using an X-bar
chart they were able to graph the mean each day.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 52
R Chart
(Chart: R Chart for 1Shift - 3Shift with center line R=0.4654, UCL 3.0SL=1.198 and LCL -3.0SL=0.00, plotted by Sample Number.)

Minitab>Stat>Control Charts>R

Suppose again you are dealing with a process whose data is organized into smaller
subgroups. Now you want to look at the range within those subgroups
to understand the stability of your process. One way to do this is to use an R Chart
to view the range of your subgroup data, comparing one subgroup to the others
over time.
Example: XYZ Company runs three shifts per day and wants to look at the range of
rejected parts across the shifts each day. The R Chart takes the lowest value of the three
shifts and subtracts it from the highest. This difference gives the data points
shown above, which represent the range for each day.

Note: R Charts should only be used to track sample sizes of 5 or less. With sample sizes of 6 or
more, use an S Chart.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 53
Xbar-R Chart
(Chart: Xbar/R Chart for 1Shift-3Shift. X-bar panel: X=15.02, UCL 3.0SL=15.49, LCL -3.0SL=14.54. R panel: R=0.4654, UCL 3.0SL=1.198, LCL -3.0SL=0.00.)

Minitab>Stat>Control Charts>Xbar-R

Just as Minitab gave you the opportunity to observe the Individuals Chart and
Moving Range Chart simultaneously on the I-MR Chart, it also permits you to view
the X-bar and R Charts at the same time. Remember that R Charts are for sample
sizes less than or equal to 5.
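For reference, a minimal sketch of the subgroup calculations behind an Xbar-R chart, using hypothetical per-shift reject counts and the standard control chart constants for subgroups of size 3 (A2 = 1.023, D3 = 0, D4 = 2.574):

# Hypothetical rejected-part counts per day: one value per shift (subgroup size 3)
days = [(15, 14, 16), (15, 15, 14), (16, 15, 15), (14, 15, 16), (15, 16, 15)]

means = [sum(d) / len(d) for d in days]        # X-bar chart points
ranges = [max(d) - min(d) for d in days]       # R chart points

x_double_bar = sum(means) / len(means)         # X-bar chart center line
r_bar = sum(ranges) / len(ranges)              # R chart center line

A2, D3, D4 = 1.023, 0.0, 2.574                 # constants for subgroup size 3
print(f"X-bar limits: {x_double_bar - A2 * r_bar:.2f} to {x_double_bar + A2 * r_bar:.2f}")
print(f"R limits:     {D3 * r_bar:.2f} to {D4 * r_bar:.2f}")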

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 54
S Chart
(Chart: S Chart for Machine1 - Machine9 with center line S=0.2498, UCL 3.0SL=0.4398 and LCL -3.0SL=0.05972, plotted by Sample Number.)

Minitab>Stat>Control Charts>S

S Charts are very similar to R Charts. S Charts graph standard deviation for sample
sizes greater than 5 and detect variation within a process.
Example: Management of XYZ Company is looking at another facility where 9
assembly lines are utilized 24 hours a day to produce parts. They want to track the
number of parts produced on all the assembly lines to get a picture of their daily
productivity. To do so they used an S Chart to graph the standard deviation of a large
sample of data to detect the amount of variation in the plant.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 55
Xbar-S Chart
(Chart: Xbar/S Chart for Machine1-Machine9. X-bar panel: X=14.97, UCL 3.0SL=15.23, LCL -3.0SL=14.71. S panel: S=0.2498, UCL 3.0SL=0.4398, LCL -3.0SL=0.05972.)

Minitab>Stat>Control Charts>Xbar-S

The Xbar-S Chart will provide the same double chart viewing that the I-MR Chart
and the Xbar-R Chart provides. As you can see in the example above, both the S
Chart and the X-bar Chart come up when the Minitab command is applied. You
may ask yourself, “if I can only use S Charts for large sample sizes and R Charts for
small sample sizes, do I have to use a different Xbar Chart for different sample
sizes?” The answer is “no.” Xbar charts can be utilized for any sample size.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 56
Discrete Data
(Chart: DEFECTIVE PARTS p chart with center line P=0.01282, UCL 3.0SL=0.02353 and LCL -3.0SL=0.002104; the y-axis is Proportion, plotted by Sample Number.)

Types of Data: Discrete, Counted Data

Examples: 1, 2, 3, 4….
Truck 1, Truck 2, Truck 3

Types of Charts: p charts, c charts, u charts


np charts

Discrete data is graphically represented by the number of points defective or the


percent of a sample that is defective. The sloping lines in the chart above show the
number of defective parts for different samples. The fact that the defective parts are
counted units makes the data Discrete. Since these sample sizes are different
quantities, they are compared to each other based on proportion (Y axis).

Note: The type of data you are working with, Continuous or Discrete, determines what control
chart you will use.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 57
Defect vs. Defective
(Example: a completed 2000 Training Enrollment Form with participant information, program information and manager's authorization sections; four fields on the form are flagged as defects.)

From the previous slide we learned that Discrete Data represents counted defective
points within our data set. In order to determine what Discrete Control Chart we
will use, we must determine if our data measures Defects or Defectives.
Defects are occurrences of non-compliance. They are counted events that can
theoretically be innumerable. For example, the number of scratches on a
manufactured glass window.
Defectives are items that have one or more defects and are part of a specific number
of items. If you know how many defectives you have, then you also know how many non-
defectives you have. For example, the number of defective microwaves
manufactured in August, or the number of failing scores on a particular science exam.
We will use the example on the slide above to further differentiate between Defects
and Defectives. The application above has a number of fields that applicants must
fill in to enroll in a class. When the training manager reviewed this application, he
found it to have 4 fields that were either filled in incorrectly or not filled in at all.
According to the training department standards, these fields constitute a defect. If
your data is measuring the amount of incorrect fields on this application, then you
will document 4 Defects.
Now let's change your requirements and assume this application is part of a batch of
10 applications. If this application and 3 others were found to have defects
within them, then 4 out of the batch are Defectives.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 58
Discrete Data Control Charts
p - Defectives, variable sample size
np - Defectives, constant sample size
c - Defects, constant opportunity
u - Defects, variable opportunity

np - shows the number of defective units

p chart - shows the proportion of defective units in the sample

c chart - shows the total counted number of defects

u chart - shows the counted number of defects per unit

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 59
p Charts
(Chart: P Chart for Rejected with center line P=0.1236, UCL 3.0SL=0.1934 and LCL -3.0SL=0.05380; the y-axis is Proportion, plotted by date from 11/1 through 11/14.)

Minitab>Stat>Control Charts>P

P Charts track the proportion of defectives for a series of samples. Since we want
to compare samples of varying sizes to one another, we must use proportion
defective (Y axis). This is very useful if you are comparing data that comes in
varying sample sizes. Notice that the upper and lower control limits also
fluctuate from sample to sample, since the sample size changes throughout the graph.
Example: In the graph above, XYZ company produces parts and wants you to
compare defective parts produced every day. You gather the number of parts produced
every day and the number of defective parts among them. Since you find
that the sample size fluctuates from day to day, you conclude that you must use a p
Chart to compare proportion defective.
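Behind the p Chart, the center line is the overall proportion defective and each sample's limits depend on that sample's size. A minimal sketch of the standard p chart limit calculation, using hypothetical daily counts:

import math

# Hypothetical daily data: (parts produced, defective parts)
samples = [(820, 95), (910, 120), (760, 88), (1005, 130), (870, 101)]

total_defective = sum(d for _, d in samples)
total_inspected = sum(n for n, _ in samples)
p_bar = total_defective / total_inspected          # center line

for n, d in samples:
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)     # varies with sample size
    ucl = p_bar + 3 * sigma
    lcl = max(0.0, p_bar - 3 * sigma)
    print(f"n={n:4d}  p={d / n:.3f}  LCL={lcl:.3f}  UCL={ucl:.3f}")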

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 60
np Charts
(Chart: Defective Parts np chart with center line NP=13.80, UCL 3.0SL=24.86 and LCL -3.0SL=2.741; the y-axis is Sample Count, plotted by Time in hours.)

Minitab>Stat>Control Charts>NP

np Charts track the number of defectives for a series of samples. np Charts are
very similar to p Charts; the major difference is that np Charts observe samples
drawn from a constant sample size. The Y axis in the graph above tracks defectives by
number, not proportion. The upper and lower control limits are constant because
the sample size is consistent throughout the data set.
Example: XYZ company produces parts and wants to compare defective parts
produced every hour. Management believes that the first 900 parts produced every
hour yield the least amount of defective parts. You gather the data for the first 900
parts produced every hour and the number of defective parts among that 900.
You decide to use an np Chart to compare the number defective since the sample size
is constant.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 61
c Charts
(Chart: Defective Application Fields c chart with center line C=10.70, UCL 3.0SL=20.51 and LCL -3.0SL=0.8867; the y-axis is Sample Count, plotted for sample applications over time. One point is flagged above the upper limit.)

Minitab>Stat>Control Charts>C

The c Chart tracks the number of defects for a series of similar samples. Many times
the items or processes we are asked to analyze have a number of subprocesses that go
into their makeup. For example, each part installed into an automobile has the
potential to be out of compliance, so you could track automobile models by the number
of parts rejected from them. These collective subprocesses are referred to as defect
opportunities.
Use the c Chart to track data that have the same number of defect opportunities in
each sample.
Example: XYZ insurance company wants you to compare a single type of
application that is filled out by new customers. You take the applications received
every hour and compare the number of defects on each application. Since the
opportunities for defects are the same on each application, you determine that a c Chart
should be used to compare the number of defects on each application.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 62
u Charts
(Chart: Defective Application Fields u chart with center line U=0.1726, UCL 3.0SL=0.3488 and LCL -3.0SL=0.000; the y-axis is Sample Proportion, plotted for sample applications over time.)

Minitab>Stat>Control Charts>U

The u Chart tracks the number of defects per unit sampled. Unlike c Charts, which
can only compare samples with the same number of defect opportunities, u Charts
can compare samples with different numbers of defect opportunities. For example,
compare the number of parts rejected from different model cars. Since different
model cars possess different parts, their defect opportunities will be different. A u
Chart gives you the power to track these differences and compare their performance
under different circumstances. The Y axis on the graph above tracks defects by
proportion, and the upper and lower control limits fluctuate. This is due to the
change in defect opportunities between samples.
Example: XYZ insurance company wants you to compare 3 different types of
applications that are filled out by customers. Management wants you to take the
first 10 applications received every hour and compare the number of defects on
each application. Since the number of defect opportunities is different for the 3
applications, you determine that a u Chart should be used to compare the
number of defects per application.

Note: the overall sample amount that you collect every hour can either be constant or fluctuating.
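The u Chart calculation is similar to the p Chart but works on defects per unit, with limits that depend on the number of units in each sample. A minimal sketch with hypothetical hourly samples:

import math

# Hypothetical hourly samples: (applications checked, total defects found)
samples = [(10, 18), (10, 12), (12, 25), (8, 9), (10, 16)]

u_bar = sum(c for _, c in samples) / sum(n for n, _ in samples)  # center line

for n, c in samples:
    sigma = math.sqrt(u_bar / n)          # limits vary with the number of units
    ucl = u_bar + 3 * sigma
    lcl = max(0.0, u_bar - 3 * sigma)
    print(f"n={n:2d}  u={c / n:.2f}  LCL={lcl:.2f}  UCL={ucl:.2f}")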

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 63
Control Chart Decision Tool
(continuous data)
Continuous Data
- Single value:
  - Display each value in time sequence: Individuals
  - Display the change from one value to the next: Moving Range
  - View both charts together: I-MR
- Group data (ranges, means, std. dev):
  - Sample size of 5 or less:
    - Mean: X-bar
    - Range (highest value minus lowest value): R chart
    - View both charts together: X-bar R
  - Sample size greater than 5:
    - Mean: X-bar
    - Standard deviation: S chart
    - View both charts together: X-bar S

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 64
Control Chart Decision Tool
(discrete data)
Discrete Data
- Defectives:
  - Varying sample size: p Chart (proportion of defective units per sample)
  - Constant sample size: np Chart (total number of defective units)
- Defects:
  - Varying defect opportunities: u Chart (defects per unit)
  - Consistent defect opportunities: c Chart (total defects)

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 65
Summary
• Control Charts are the Voice of the Process
• Changes due to Noise are expected
• Out of Control points, Patterns, and Trends
are Signals
• The essence of statistical control is
predictability Dr. Wheeler, Understanding Variation

• Control Charts can be produced automatically

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 66
Sampling in the Measure Phase

Collecting the Proper Amount of


Data to Perform a Good Analysis

OBJECTIVES
- Know the difference between Sample and Population
- Understand why sampling is used and the basic techniques for sampling
- Be able to identify some of the problems that can result from sampling
- Familiarize yourself with large and small sampling concepts
- Understand the basic notation used in sampling
- Understand how sampling applies to the Measure phase of DMAIC
- Learn to calculate sample size for continuous and discrete data.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 67
Sampling

(Diagram: drawing a sample from a population involves risk, denoted α and β.)

We are learning that the data we observe hold a great deal of diverse properties
within them. These properties provide us with a wealth of information and a great
opportunity for analysis. Where does this data come from? How do we know that
the data we are observing will really tell us something truthful about the processes
we are observing? Your data will never be 100% accurate in data analysis, but
there are steps you can take to ensure your observed data is as strong as possible.
One of these steps is called sampling. There are specific times that you will be
required to sample data throughout the DMAIC process. In this section we will be
discussing sampling in the Measure phase. We will discuss sampling in the
Improve phase in a later chapter.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 68
Population vs. Sample

POPULATION SAMPLE

One of the first questions you may ask yourself when starting data analysis is “how
much data should I observe?” and “where should I draw this data from?” These two
questions, how much and where to draw from, are the heart of sampling. A
population is a collection of all items or observations. A sample is a subset of the
population. For example, a local manufacturing manager wants to look at defective
parts data for one year. He has the option of looking at all defective parts for the
year (population) or a portion of the defective parts for the year (sample). The
decision to use the entire population or a sample will be discussed further in the
“Why Sample?” section of this chapter.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 69
Population vs. Sample Notation

µ - Population Mean

x - Sample Mean

σ - Population Standard Deviation

s - Sample Standard Deviation

In the previous slides we discussed the differences between Population and Sample.
Although a sample is a subset of a population, it would be incorrect to say that they
share all characteristics. Because a sample is only a portion of the population, its
mean and standard deviation will generally differ somewhat from the population's. The
statistical notations for each type of mean and standard deviation are listed above.
They will be useful in further analysis and notation.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 70
Why Sample?

Time Constraints

$
Cost Constraints
Improved Accuracy

Observing data based upon an entire population is almost always the ideal way to
conduct data analysis. When everything about a population is taken into
consideration, the truest observations can be made. You should consider using an
entire population in data analysis if the data you are observing is finite (has
boundaries) or if it already has been gathered and no further effort is required to
acquire data. Also consider using an entire population if technology allows you to
gather information quickly or if the taking of samples could change or distort the
items being observed.
The reality of our ever-changing work place is that there are a great deal of
limitations placed upon project management and data analysis. Gathering data can
be both time consuming and costly. Most processes we observe today are infinite
and operate continually. It would be almost impossible to observe every data point
in an operation. Many times we have no choice but to sample populations due to
these constraints. By utilizing the proper tools when sampling, we can estimate and
predict the parameters* of a population based on the statistics* of a sample. In
some cases you may be able to obtain more accurate results from a sample than the
results of a census (population data). If the data-gathering process relies heavily on
human input, the sample may contain fewer human errors than the entire population
and, therefore, can be more accurate.

Note: Characteristics (mean, median, standard deviation) of a population are often referred to as
“parameters”. Characteristics of a sample are referred to as “statistics”.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 71
Sampling Techniques

Samples taken during the Measure phase are very important. They are your first
look inside the processes you are working to improve and they set the tone for the
remaining portions of the DMAIC process in determining courses of action.
Because of the importance of sampling, you must determine the proper sampling
technique to use in drawing your data from the population. Your sampling
technique will be determined by the type of data you have, the sources of variation
in your data and other specified factors. Refer to the Measure portion of the
DMAIC training manual, the GE Six Sigma Quality Coach or any other statistics
reference to determine the technique that is right for your analysis. Ensure your
technique is a proper fit for the data you are observing. It is also important to take
into consideration the following sections involving sample size and pitfalls when
determining sample technique.
Here is a list of common sampling techniques:

Nonprobability: convenience, judgment, quota
Statistical: cluster, simple random, stratified random, systematic

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 72
Data Types
Continuous
• Measured on a Scale
- Cycle Time
- Cost or Revenue in Dollars
• Statistics on Counted Events

Discrete
• Counted Events
• Categories
- Good/Bad, Yes/No
- Driver 1, Driver 2, Driver 3

We mentioned in the Basic Statistics chapter that continuous data and discrete data
are measured and displayed differently. It is important to distinguish between the
two at the earliest stages of a DMAIC project. Continuous data can be measured in
more detail, show variation in detail and are generally preferred for control charts
and dashboard measures. Discrete data can also be very helpful in uncovering
causes of variation.
Continuous data are measured on a continuum, which implies that they are the
result of an ongoing process. Time, temperature, air pressure and revenue are all
examples of data that are usually classified as continuous.
Discrete data are counted events or categories. Days of the week, number of trucks
and number of defect opportunities are all types of data usually classified as
discrete. (A large sample size is usually required for discrete measures)
You can calculate the appropriate sample size for a set of data by utilizing specific
formulas for discrete and continuous data.
It is important to note that the discrete calculation for sample size requires a
proportion of defective units. If your discrete data has more than 7 categories and is
not easily converted to a proportion of defective units, the continuous calculation
for sample size may be used for your discrete data type. These formulas are covered
in the following sections.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 73
Precision

∆ ∆
| | |
X-∆ X X+∆

Population mean = X +/- ∆

∆ - Greek letter Delta

Will the mean of the sample (x̄) equal the mean of the population (µ) it was drawn
from? Chances are there will be some statistical difference between the two. But if
we use proper sampling techniques, our sample mean will be very close to our
population mean (sometimes even equal to it). How close is good enough for our
process? You can assign a Delta value to the sample mean to determine if it
is a good enough fit. This Delta value is also referred to as the "degree of Precision."
The degree of precision is a value that, when added to or subtracted from the sample mean,
defines an interval that is still acceptable for your process. For example, suppose an
engineer determines that any sample mean that is within 2 (∆) of the population mean
is statistically sufficient. If the sample mean is equal to 30, then he can be confident that
the population mean is between 28 and 32.

Note: The Delta (∆) statistic used to signify the degree of precision must be expressed in the same unit
of measure as the mean and standard deviation. Delta should not be expressed as a percentage unless
mean and standard deviation are percentages.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 74
Sample Size
Continuous Data

n = (1.96 s / ∆)²

where n = sample size, 1.96 = the confidence level (95%), s = the standard deviation, and ∆ = the degree of precision.

There are three factors necessary to calculate the required sample size for a continuous
measure. Confidence level and Precision are chosen by the Black Belt and the Standard
Deviation is obtained from the data:
1. Confidence level - how confident we are that the interval (+/- ∆) will contain the
mean of the population.*
2. Precision (or accuracy) - tolerance level above and below the sample mean within
which we are confident the population mean will fall.
3. Standard deviation - The standard deviation must be ascertained from the
population data if available or calculated from a small sample for use in the sample size
equation.

*1.96 represents a 95% confidence interval. This is derived from the z-table of area
under the normal curve.

Area under
the curve

1.96 = 95%
-3s -2s -1s mean +1s +2s +3s
|--68.3%--|
|--------95.5%--------|
|------------99.7%-------------|
Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 75
Continuous Data Example

How many boxes are


loaded in an hour?

A distribution center loads identical cartons onto outbound trucks. We want to


determine the average number of cartons loaded in an hour. Historically, the
standard deviation of cartons loaded at similar locations is 8.5 per hour.
We want to estimate the number of cartons loaded in one hour at this location to
within +/- 5 cartons with a 95% confidence interval.
We know that:
1. Level of confidence is 95% (1.96 on the z-table)
2. Desired level of precision is +/- 5 cartons
3. Standard deviation is 8.5 cartons
We set up the equation this way:

n = [(1.96 x 8.5) / 5]² → n = 11.1

The resulting sample size of 11.1 (round up to 12) means that after about 12 hours of
observation, the sample average number of cartons loaded per hour will be within +/- 5
cartons of the true average with 95% certainty. For example, if the sample average is
112 cartons per hour, we can be 95% confident that the true average rate is between
107 and 117 cartons per hour.
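The same calculation can be written as a small function; the 1.96 corresponds to the 95% confidence level described above:

import math

def continuous_sample_size(std_dev, precision, z=1.96):
    # n = (z * s / delta)^2, rounded up to the next whole observation
    return math.ceil((z * std_dev / precision) ** 2)

# Carton-loading example: s = 8.5 cartons, precision = +/- 5 cartons, 95% confidence
print(continuous_sample_size(8.5, 5))   # 12 (11.1 rounded up)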

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 76
Sample Size
Discrete Data

n = (1.96 / ∆)² P(1-P)

where n = sample size, 1.96 = the confidence level (95%), ∆ = the degree of precision, and P = the estimated proportion defective.

There are three factors necessary to calculate the required sample size for a discrete
measure:
1. Confidence level - how confident we are that the interval (+/- ∆) will contain the
mean of the population.
2. Precision (or accuracy) - tolerance level above and below the sample mean
within which we are confident the population mean will fall.
3. Estimated proportion defective - the rate of defects we expect to see is set by a
Black Belt and expressed as a percentage. Sample size is maximized at P=0.5 (see
chart next page).
It is important to recheck the Proportion defective to verify the original assumption
of P after collecting the sample. Recalculate the required sample size (n) using the
actual Proportion defective (P) and if the new calculation results in a sample size
that is higher than the original result, then collect the additional samples and
recalculate P.
If it is not possible to collect more samples at this time, then change the formula
algebraically to solve for the degree of precision (∆) and apply it as your new
precision level for that sample. The new ∆ will give you a wider confidence
interval.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 77
Estimated Proportion Defective

(Chart: required sample size, from small to large, plotted against Estimated Proportion Defective (P) from 0% to 100%; sample size peaks at P = 50%.)

The chart above shows that, as Estimated Proportion Defective moves closer to
50%, sample size increases. Sample size is maximized at 50%. Keep this in mind
when using the formula for calculating discrete sample size.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 78
Discrete Data Example

How many tires will


leak after 1000 miles?

A certain brand of tires is purchased new from a non-Six Sigma manufacturer.


Some of the tires purchased are defective and have been found to show leakage
after 1000 miles. Historically, about 7 out of every 100 are found to be defective.
We want to estimate the defective rate within +/- 3 tires with a 95% confidence
interval.

1. Level of confidence is 95% (1.96 on the z-table)


2. Desired level of precision is +/- 3 tires in 100 or 0.03
3. Estimated proportion defective is 7 in 100 or 0.07
We set up the equation this way:

n = (1.96 / 0.03)² x 0.07(1 - 0.07) → n = 278

The resulting sample size of 278 means that the observed defect rate will be within
+/- 3 tires per 100 (0.03) of the true rate, with 95% certainty. In other words, if,
after checking 278 tires, the rate of defective tires is 7 out of every 100, then we
can be 95% sure that the actual defect rate is between 4 and 10 out of every 100.
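The discrete version can be written the same way (tire example shown):

import math

def discrete_sample_size(p_defective, precision, z=1.96):
    # n = (z / delta)^2 * P * (1 - P), rounded up
    return math.ceil((z / precision) ** 2 * p_defective * (1 - p_defective))

# Tire example: P = 0.07, precision = +/- 0.03, 95% confidence
print(discrete_sample_size(0.07, 0.03))   # 278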

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 79
Sampling Tips
To calculate: Minimum Sample
Average 5 - 10
Standard Deviation 25 - 30
Proportion Defective 100 (with >5 expected)
Histogram or Pareto 50
Scatter Plot 25
Control Chart 12

All of the above calculations and tools have a minimum sample size requirement in
order to be effective. Too few samples lead to confidence intervals that are very
wide and therefore not useful.
Remember that these are minimums! The higher your confidence level, the
stronger your conclusions and predictions will be. Confidence level increases as
sample size increases.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 80
Potential Pitfalls
in Sampling

• BIAS
• CHANGES
• SAMPLING PLAN

BIAS - Listed below are some of the systematic differences in the sample as a result of the
selection process. Always check your data to ensure you avoid these common pitfalls.
1. Selection Bias - No structure in determining which items to sample
- Convenience sampling
- Systematic sampling (though sometimes it is appropriate)
2. Environmental Bias - Outdated sample
3. Non-Response Bias - The opinions of those who do not respond will not be
included.
4. Measurement Bias - Inconsistencies in data collection
- Operational definitions
- Unequal batch sizes
- Inconsistent collectors
- Non-representative samples (example: collecting
information on only the late invoices and not all invoices)
CHANGES - If changes to the process are made during the sampling process, those
samples already taken are no longer valid
SAMPLING PLAN - It is important to prepare the sampling plan with awareness of
potential bias. Always define the role of data collectors and stick to the plan throughout
the sampling process.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 81
Summary
• Proper sampling of data can bring you the
same results as observing an entire population
• Using samples is both cost efficient and time
effective
• Which sampling technique to choose will be
determined by the type of data being observed
and the amount of variation within it
• Sample size is determined by formulas for
continuous and discrete data.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 82
Measurement Systems Analysis

The First Step in Data Collection

OBJECTIVES
- Illustrate the importance of accurate measurements in data collection.
- Introduce the Gage R&R study of measurement variation and the Discrete
Data Analysis.
- Set a target for allowable measurement variation.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 83
Measurement Systems

A measurement system is only helpful if it is consistently reliable. Each person who


collects data must be able to obtain the same result each time the same measurement is
taken. In addition, all people who collect data the same way must be able to obtain the
same result, and that measurement must be accurate.
Ask a room full of people what time it is and you’ll get many different answers, yet
there is only one exact time. Our individual measurement systems (wristwatches) may
not be synchronized and, therefore, the time we read is different, one from another and
from the actual time.
Measuring data is a process. This process needs to be examined to make sure that the
variation due to the measurement system itself represents only a small fraction of the
total variation in the data being observed. Using the example of wristwatches, if we
are measuring delivery time to the minute with a target of 08:00 and the receiving
clerk at the dock logs the arrival time as 08:04 but the driver logs 07:58, there is a
potential disagreement on whether the shipment was on time or late. Further, if these
two wristwatches are still six minutes apart after one month, the majority of shipments
for that month could be recorded as late when the driver was up to five minutes early
on his/her own watch. On top of this, if the monthly average and variation for delivery
time are being recorded, the variation will be amplified by the measurement system
error causing an even more inaccurate picture of the delivery time.
It is important to standardize the system that will be used to collect data.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 84
% R&R
• Repeatability - One person measures something
the same way each time
• Reproducibility - Everyone measures the same
thing the same way
• % R&R - Amount of variation due to the
difference in measurements
• Target - Less than 30%

Repeatability is the measure of one person’s ability to take the same measurement
more than once and get the same reading each time. If I look at a wooden ruler
directly in front of me, I may get a different measurement than if I lean to the right or
left and read it then. If I am counting the occurrence of the letter “f” on a typewritten
page, I may get a different result by scanning the page top to bottom the first time
and from side to side the next.
Reproducibility is the ability of a measurement to be taken by more than one person
with equally consistent results. If I look at the ruler directly in front of me, I may get
a different result than a much taller person who is trying to read the ruler from his/her
height.
The score calculated from the data sampled is called the Percentage of Repeatability
and Reproducibility (%R&R), and it represents the amount of variation found in the
data that is due to measurement error. The target for this score is a value less than
30%.
Other concerns for measurement system consistency are:
- Accuracy. The difference between the observed measurement and a
standard or actual value.
- Stability. The measurement system must be consistent over time.
- Linearity. There must be equal consistency throughout the range of the
measurement system. For example, a tape measure may be stretched like
taffy at one end.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 85
Conducting a Study

Conducting a Measurement System Analysis requires people and time. Each
measurement included in the study must be taken by each person twice.
There are two objectives to the process:
- Evaluate the repeatability of each person who uses the measurement
system, that is, his or her ability to get the same result each time the
measurement is taken.
- Evaluate the reproducibility of the measurement system itself, regardless
of the number of operators, that is, the consistency of measurements from
one operator to the next.
In order to accomplish this, it will be necessary to identify three operators who
normally collect the measurement in question, for example, actual editors who
proofread documents or the actual dock workers who log delivery time. Then each
of the operators will be asked to evaluate at least ten different samples, two times
each. The second time through the samples, the operators should evaluate them in a
different sequence than the first time so that the initial measurements are forgotten.
It is best when the tester knows the real score for each of the samples before the test
so that the accuracy of the operators’ measurements can be compared to a target,
but this is not required.
The result will be a chart showing the measurement taken by each operator for each
sample, as shown on the next page.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 86
Data Format: each row lists the PART, the OPERATOR, and the MEASUREMENT taken.

part  oper  measure
1     1     0.33
1     1     0.34
1     2     0.33
1     2     0.33
1     3     0.33
1     3     0.34
2     1     0.36
2     1     0.36
2     2     0.36
2     2     0.37
2     3     0.36
2     3     0.37
3     1     0.35
3     1     0.34
3     2     0.33
3     2     0.33
3     3     0.34
3     3     0.33

Minitab>Stat>Quality Tools>Gage R&R Study

This is the structure of the data required for analysis in Minitab.


The Gage R&R Study will produce a %R&R figure that should be less than 30% for
the data collection to be reliable.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 87
Results
Gage R&R
Source %Contribution %Study Var
% R&R
Total Gage R&R 14.55 38.14
Repeatability 10.91 33.03
Reproducibility 3.64 19.07
oper 0.00 0.00
oper*part 3.64 19.07
Part-To-Part 85.45 92.44
Total Variation 100.00 100.00

The Minitab Session window shows the results of the study and the top number
under “%Study Var” is the %R&R. In the above example, the measurement system
analysis produced an R&R score of 38.14%, outside the recommended tolerance of
30%.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 88
Example
Reported Delivery Time

(Dotplot: reported delivery time in minutes, roughly -500 to 500; each dot represents up to 31 observations.)

This is actual data from a Penske Logistics location in Ohio. Drivers reported their
delivery time to dispatchers who recorded the number of minutes early or late against
the scheduled delivery time.
First, notice the extremely tall stack of points at 0. It is obvious that many drivers
simply reported that they were on time and the dispatchers recorded the scheduled
delivery time as the actual time, therefore the delivery was 0 minutes early or late.
Next, notice the wide variation in delivery times - as much as six hours
early, or to the left of 0, to more than 4 hours late.
If the drivers who reported “on time” were actually early, how many more points
could be stacked at those earlier intervals? Late delivery is generally recognized to
be a problem, but early delivery can disrupt a customer just as much. The variation
in the measurement system distorts our understanding of what the customer feels.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 89
Discrete Data Analysis Method

SCORING REPORT                           DATE: 3/10/96
Attribute Legend (used in computations)  NAME: Allied Employee
  1 = pass                               PRODUCT: 3313 Spark Plug
  2 = fail                               BUSINESS: F&SP

          Known       Individual 1     Individual 2     Individual 3     Match       Match
Sample #  Attribute   Try #1  Try #2   Try #1  Try #2   Try #1  Try #2   Each Other  Known
 1        pass        pass    pass     pass    pass     fail    fail     N           N
 2        pass        pass    pass     pass    pass     fail    fail     N           N
 3        fail        fail    fail     fail    pass     fail    fail     N           N
 4        fail        fail    fail     fail    fail     fail    fail     Y           Y
 5        fail        fail    fail     pass    fail     fail    fail     N           N
 6        pass        pass    pass     pass    pass     pass    pass     Y           Y
 7        pass        fail    fail     fail    fail     fail    fail     Y           N
 8        pass        pass    pass     pass    pass     pass    pass     Y           Y
 9        fail        pass    pass     pass    pass     pass    pass     Y           N
10        fail        pass    pass     fail    fail     fail    fail     N           N
11        pass        pass    pass     pass    pass     pass    pass     Y           Y
12        pass        pass    pass     pass    pass     pass    pass     Y           Y
13        fail        fail    fail     fail    fail     fail    fail     Y           Y
14        fail        fail    fail     pass    fail     fail    fail     N           N

"REPEATABILITY" ->                     100.00%          78.57%          100.00%
"ACCURACY" ->                           78.57%          64.29%           71.43%
"OVERALL REPEATABILITY & REPRODUCIBILITY" -> 57.14%
"OVERALL REPEATABILITY, REPRODUCIBILITY & ACCURACY" -> 42.86%

For discrete measurements, a similar tool is used. The Discrete Data Analysis Method
is available for download at the GE Capital Six Sigma Quality Coach website. This
tool also returns a %R&R figure that should be greater than 90% before you continue
collecting data.
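For readers who want to check the arithmetic behind a scoring report like the one above, here is a minimal illustrative sketch in Python (not the GE spreadsheet itself); the four-sample data set, operator labels and helper functions are invented for the example.

# Illustrative sketch of the attribute-study percentages (not the GE tool).
known = ["pass", "pass", "fail", "fail"]            # true attribute for each sample
trials = {                                          # each operator scores every sample twice
    "op1": [("pass", "pass"), ("pass", "pass"), ("fail", "fail"), ("fail", "pass")],
    "op2": [("pass", "pass"), ("fail", "fail"), ("fail", "fail"), ("fail", "fail")],
}

def repeatability(tries):
    # share of samples where the operator agrees with him/herself on both tries
    return sum(t1 == t2 for t1, t2 in tries) / len(tries)

def accuracy(tries, truth):
    # share of samples where both tries match the known value
    return sum(t1 == t2 == k for (t1, t2), k in zip(tries, truth)) / len(truth)

def overall_rr(all_trials):
    # share of samples where every try by every operator is identical
    samples = list(zip(*all_trials.values()))
    return sum(len({v for pair in s for v in pair}) == 1 for s in samples) / len(samples)

for op, tries in trials.items():
    print(op, f"repeatability={repeatability(tries):.0%}", f"accuracy={accuracy(tries, known):.0%}")
print(f"overall R&R={overall_rr(trials):.0%}")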

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 90
What Next?

If the variation from the measurement system is greater than the acceptable limit of
30%, it is necessary to reduce that figure before proceeding with data collection.
Fixing a measurement system is a project itself, demanding close investigation of the
process and development of possible solutions.
It is important to find ways to improve the repeatability and reproducibility of the
measurement system so that measurements are accurate and consistent regardless of
who is taking them.
The fishbone, or Ishikawa, diagram is a useful tool in uncovering possible causes of
variation, and it is very useful in MSA. Use of this tool is explained in the Memory
Jogger and at the GE Capital Six Sigma Quality Coach website.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 91
Summary
• Measurement error increases the variation
found in the process
• Without MSA you run the risk of making
incorrect decisions
• Uncover variation and reduce it before
collecting data for a project

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 92
Capability (σ)

Meeting the Customer’s Needs

OBJECTIVES
- Calculate Process Sigma, a deliverable from the Measure Phase.
- Outline the methods for calculating capability using discrete and
continuous data.
- Introduce the Six Sigma tool in Minitab and the GE Process Sigma
Calculator in Excel.
- Use the capability study to determine the appropriate improvement
objective for the project.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 93
Capability
Baseline (determined in the Measure phase):
• Z - a statistical unit of measure that reflects process capability
• Yield - the percentage of output that meets specifications

Improved:
• Set objective for process improvement
• Match Voice of the Customer (CTQs) to Voice of the Process (variation)

Capturing the Baseline capability of the process is essential to tracking improvements
to that process. There are many ways to measure capability: total defects, defects
per unit, non-conforming parts ratio, average and variation, first pass / final yield,
and defects per million opportunities (DPMO). In order to measure capability
consistently among all processes we use sigma (Z)* and standardize all
measurements to that index.
Once the baseline sigma has been determined, all improvements can be measured
against it in the future.

*Note: The letter Z is used for the sigma level as a measure of process capability. The original sigma,
σ or s, refers to standard deviation of a data set.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 94
Sigma Scale
Process Sigma     Defects per Million Opportunities
6                 3.4
5                 230
4                 6,210
3                 66,800
2                 308,000

Improvement by one sigma level requires an exponential reduction of defects per
million opportunities. The difference from one level to the next is not the same for
all intervals. It is important that we do not interpolate or estimate a sigma level in
between those published in the Abridged Process Sigma Conversion Table, found in
the wave training manuals. Use only one decimal place when reporting a sigma level
you’ve calculated from the table. Only if one of the computer programs has been
used to determine the sigma level can it be shown with two decimal places.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 95
Current Process
[Distribution of current process output with LSL and USL marked; scale 5 to 30. Part of the output falls outside the spec limits.]

A baseline capability study will show you how the process is performing against the
customer’s specifications (CTQs). The team will then be able to effectively state
the project’s improvement objective.
The objective of most projects will be to make all output fall within the
specification limits. Often, however, the objective will be to reduce defects. In that
case, picture a distribution like the one shown above with no lower specification
limit, only an upper specification limit since we are only concerned with the
maximum number of defects allowed.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 96
Objective
[Improved process distribution falling entirely between LSL and USL; scale 17 to 23.]

The improved process capability will ideally have all output falling within the
customer’s specifications limits. This will be proven by the increase in the sigma
level from the baseline capability.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 97
Continuous vs. Discrete

As always, the type of data being studied is important to the capability analysis.
Discrete and continuous data are handled differently with different formulas and
computer programs.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 98
Discrete Method
UNIT The item produced or
processed
DEFECT Any event that does not meet
a customer CTQ
DEFECT Any event that provides a
OPPORTUNITY chance of not meeting CTQ
DEFECTIVE A unit with one or more
defects

The discrete method involves calculating defects per million opportunities (DPMO)
and converting that figure to a sigma value using the Abridged Process Sigma
Conversion Table. In order to calculate DPMO it is necessary to record the number
of units being studied, how many defects were counted, and how many defect
opportunities are found on each unit. For example, an invoice may have several
blank fields to complete and some or all of the blanks could be filled in incorrectly.
Each blank would be considered a defect opportunity, and each defect opportunity
that is found to be incorrect is a defect. As we discussed in Chapter 4, Control
Charts, a defective unit is one that contains one or more defects. In this case, when
counting defective units, there is only one defect opportunity per unit - it is either
right or wrong, on time or late, complete or incomplete, etc.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 99
Defects per Opportunity (DPO)
Defects per Million Opportunities (DPMO)

DPO = D / (N x O)          D = Number of Defects
                           N = Number of Units
                           O = Opportunities per Unit

DPMO = 1,000,000 x DPO

Yield = 1 - DPO            Yield: The percentage of output that meets specifications

The formulas for DPMO and process yield use only these variables. After the yield
or DPMO has been calculated, that number can be converted to a sigma value from
the table and recorded as baseline sigma or later in the project as improved sigma.
Remember that first pass yield is always used to calculate process sigma. It is
calculated by counting the number of defects discovered at any point during the
process. Even the defects that are discovered before they reach the customer are
counted because they demand rework and cost time and money. An inspection that
turns up defects is useful, but should not be relied on for quality.
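As an illustrative sketch rather than part of the GE calculator, the same three formulas can be written in a few lines of Python; the numbers plugged in below are the rental-contract figures from the example that follows.

# Minimal sketch of the DPO / DPMO / Yield formulas on this page.
def process_metrics(defects, units, opportunities_per_unit):
    dpo = defects / (units * opportunities_per_unit)   # defects per opportunity
    dpmo = dpo * 1_000_000                             # defects per million opportunities
    yld = 1 - dpo                                      # share of opportunities done right
    return dpo, dpmo, yld

dpo, dpmo, yld = process_metrics(defects=97, units=1247, opportunities_per_unit=6)
print(f"DPO={dpo:.3f}  DPMO={dpmo:,.0f}  Yield={yld:.1%}")   # DPO=0.013  DPMO=12,965  Yield=98.7%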

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 100
Example
• Over the last three months, a Black Belt collected data on
the number of errors made on rental contracts.
• There are six fields on the rental contract, all of which
must be filled in correctly (6 defect opportunities)
• 1247 contracts were written during that time, and there
were 97 errors (incorrectly completed fields) among them.

» DPO = 97 / (1247 x 6) = 97 / 7482 = 0.013


» DPMO = 0.013 x 1,000,000 = 13,000
» Yield = 1 - 0.013 = 0.987 or 98.7%
» Process is at 3.7 Sigma (from sigma table)

In this example, the calculations are based on six defect opportunities per unit.
Multiplying 1247 by six gives us 7482, which is the number of times a defect can be
made in that sample. If we were recording defective contracts there would only be
one defect opportunity per unit because the contract was either approved on the first
pass or it was sent back for re-work. In that case, if 55 of the contracts had one or
more defects and were considered defective, then the calculation would be 55 /
(1247 x 1) = 0.044 or 44,106 DPMO for a process sigma of 3.2.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 101
GE Process Sigma Calculator

General Worksheet For Calculating Process Sigma

                                                                          enter
1  Number Of Units Processed                                    N =        1247
2  Total Number Of Defects Made (Include Defects Made
   And Later Fixed)                                              D =          97
3  Number Of Defect Opportunities Per Unit                       O =           6
4  Solve For Defects Per Million Opportunities                            12,964
5  Look Up Process Sigma In Abridged Sigma Conversion Table      Sigma =   3.73 *

*Use only one decimal place for Sigma - round to the lower score, 3.7 in this case.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 102
Z - Table

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
-4 0.000032 0.000021 0.000013 0.000009 0.000005 0.000003 0.000002 0.000001 0.000001 0.000000
-3 0.001350 0.000968 0.000687 0.000483 0.000337 0.000233 0.000159 0.000108 0.000072 0.000048
-2 0.022750 0.017864 0.013903 0.010724 0.008198 0.006210 0.004661 0.003467 0.002555 0.001866
-1 0.158655 0.135666 0.115070 0.096801 0.080757 0.066807 0.054799 0.044565 0.035930 0.028716
-0 0.500000 0.460172 0.420740 0.382089 0.344578 0.308538 0.274253 0.241964 0.211855 0.184060
0 0.500000 0.539828 0.579260 0.617911 0.655422 0.691462 0.725747 0.758036 0.788145 0.815940
1 0.841345 0.864334 0.884930 0.903199 0.919243 0.933193 0.945201 0.955435 0.964070 0.971284
2 0.977250 0.982136 0.986097 0.989276 0.991802 0.993790 0.995339 0.996533 0.997445 0.998134
3 0.998650 0.999032 0.999313 0.999517 0.999663 0.999767 0.999841 0.999892 0.999928 0.999952
4 0.999968 0.999979 0.999987 0.999991 0.999995 0.999997 0.999998 0.999999 0.999999 1.000000

The Z-table is used to determine the area under the curve between specification
limits. In the case of a one-sided distribution with no lower spec limit, the area of the
data that falls below the upper spec limit is the portion of the process output that is in
compliance with CTQs. To read the chart, calculate the z-score to the right (positive)
and the left (negative) of the mean. The mean in these calculations is represented by
0. Negative numbers between 0 and -1 are listed next to the “-0” on the left.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 103
Continuous Method

From GE Capital Six Sigma Quality Coach

Studying capability with continuous data involves an assumption of normality,
discussed in Chapter 2, Basic Statistics. The area under the normal curve represents
100% of all possible output of the process with its current settings. When the data
is shown graphically, the amount of data outside the specification limits is apparent.
The calculations for capability involve solving for the area under the curve that is
between the specification limits, and that area equates to the percentage of the
output that is in compliance. This percentage is the yield of the process and can be
converted to a sigma value with the Abridged Process Sigma Conversion Table.
To calculate sigma, it will be necessary to know the mean of the data and the
standard deviation. The standard deviation and spec limits will be converted to a Z-
score that will standardize the process. For example, if the standard deviation of the
data is 6 and the upper spec limit is 18, the upper spec limit is 3 standard deviations
from the mean, and the portion of the data that falls more than 3σ to the right of the
mean is out of spec.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 104
Example - Cycle Time
• New hire orientation must be completed no sooner than 10
days and no later than 30 days after hiring date.
• 127 new employees were hired in the last six months.
Days to Complete Orientation
[Histogram: frequency of number of days to complete orientation, roughly 10 to 35.]

Continuous data has been collected with specification limits for the CTQ. The
mean and standard deviation will be calculated and the specification limits will be
converted into z-scores, positive and negative. The portion of data that is outside
the specification limits will be determined and the percentage of data that is in
compliance with CTQ (between the spec limits) will be identified as the yield of the
process.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 105
Calculations

Mean = 22.4                   Upper spec limit (USL) = 30
Standard Deviation = 3.9      Lower spec limit (LSL) = 10

[Normal curve with LSL and USL marked at 10 and 30; the Z2 and Z1 areas in the tails beyond the spec limits are the out-of-spec areas.]

Z1 = (USL - x̄) / s = (30 - 22.4) / 3.9 = 1.9     from table: 97.13% (area below the USL)
Z2 = (LSL - x̄) / s = (10 - 22.4) / 3.9 = -3.2    from table: 0.07% (area below the LSL)

Total Area (Yield) = 97.2% or 3.4 Sigma*
*1.9 Long Term Sigma

The calculations in this example are based on normal data. The upper specification
limit lies 1.9 standard deviations to the right of the mean, and the lower specification
limit lies 3.2 standard deviations to the left of the mean. The percentage of
data between the spec limits is the area below the USL (97.13%) less the area below
the LSL (0.07%) - roughly 97.2% when unrounded values are used - for a sigma value of 3.4.
This is the short term sigma of the process. Over time the process can vary due to
environmental conditions, changes to the process, policy changes, etc. The standard
shift from short term sigma to long term sigma in the quality industry is -1.5. This
means that the process will be 1.5 sigma less capable in the long term than it is in
the short term. In the example above, the long term sigma would be 1.9 with a -1.5
sigma shift (3.4 - 1.5 = 1.9)

Note: For reporting, use the short term sigma, because it is an indication of the potential of the
process.
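As a hedged illustration (not a required step), the same area-under-the-curve arithmetic can be reproduced in Python with the standard normal CDF instead of the printed Z-table; the mean and standard deviation used below are the unrounded short-term values Minitab reports for this data set on the next page.

# Sketch of the continuous capability calculation using the normal CDF.
from math import erf, sqrt

def phi(z):
    # standard normal cumulative distribution: area to the left of z
    return 0.5 * (1 + erf(z / sqrt(2)))

mean, stdev = 22.4094, 3.94715      # from the Minitab output (ST values)
usl, lsl = 30, 10

z_usl = (usl - mean) / stdev        # about 1.92
z_lsl = (lsl - mean) / stdev        # about -3.14
yld = phi(z_usl) - phi(z_lsl)       # area between the spec limits
print(f"Z(USL)={z_usl:.2f}  Z(LSL)={z_lsl:.2f}  Yield={yld:.1%}")   # about 97.2%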

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 106
Minitab Capability Analysis
[Capability histogram with LSL and USL marked; short-term (ST, solid) and long-term (LT, dashed) normal curves overlaid; scale 10 to 35.]

Process Data                  Potential (ST) Capability    Overall (LT) Capability
USL          30.0000          Cp   0.84                    Pp   0.84
Target       *                CPU  0.64                    PPU  0.64
LSL          10.0000          CPL  1.05                    PPL  1.05
Mean         22.4094          Cpk  0.64                    Ppk  0.64
Sample N     127              Cpm  *
StDev (ST)   3.94715
StDev (LT)   3.95058

                Observed Performance   Expected ST Performance   Expected LT Performance
PPM < LSL                  0.00                  833.55                   841.38
PPM > USL              31496.06                27237.02                 27342.27
PPM Total              31496.06                28070.57                 28183.65

Minitab>Stat>Quality Tools>Capability Analysis (Normal)

When executing the capability analysis in Minitab, arrange all the data points in one
column of the worksheet in time order. The output is a very busy chart, but the graph
that is shown tells the main part of the story. The process is capable in the short term
to the level shown by the red (solid) line and in the long term by the black (dashed)
line. Numerically, the sigma value can be calculated from the output data as well.
On the left side Minitab posts the short term (potential) and long term (overall) results
of the capability study. The Cpk number, in this case 0.64, is the demonstrated
excellence of the process and is converted to sigma by multiplying by 3, in this case
1.9. Apparently Minitab doesn’t think as highly of our process as we did in our
manual calculation above. This is because the tolerances included in the computer
program attempt to predict the future allowing for changes in people, machines, and
the other factors from the fishbone diagram. Whichever method you select for
reporting must be consistent from the Measure phase through the Control phase.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 107
Minitab - Six Sigma Feature
Report 1: Executive Summary

[The Executive Summary combines an I and MR chart of the data, the actual (LT) and potential (ST) distributions plotted against the specifications, a PPM benchmark chart, and panels for Process Demographics, Capability Indices, and Process Benchmarks.]

Process Demographics: Date: 11/19/99; Project: New Hire; Department: HR; Process: Orientation; Upper Spec: 30; Lower Spec: 10; Time Span: 6 months

Capability Indices (ST / LT): Mean 20.0000 / 22.4094; StDev 3.8591 / 3.9428; Z.USL 2.5913 / 1.9252; Z.LSL 2.5913 / 3.1474; Z.Bench 2.3431 / 1.9122; Z.Shift 0.4309; P.Total 0.009563 / 0.027926; Yield 99.0437 / 97.2074; PPM 9562.52 / 27926.4; Cp 0.86; Cpk 0.65; Pp 0.84; Ppk 0.64

Process Benchmarks: Sigma (Z.Bench) 2.34 Potential (ST) / 1.91 Actual (LT)
Minitab>Six Sigma>Process Report

The Six Sigma module in the latest version of Minitab may be a good way to present
the capability of the process to those who are reviewing the project’s progress. The
data must be arranged the same way it is for Capability Analysis. An area is
provided for information about the process and the sample data on the Executive
Summary portion of the report.
Once again, remember to be consistent in showing the potential capability of the
process because the Six Sigma module, the Capability Analysis, and the manual
calculations may show different short term sigma values.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 108
State Improvement Goal

• Determined in the Measure Phase


• Reduce Process Variation
• Shift the Process Mean
• Reduce Variation and Shift Mean

When a process is in control and its capability can be studied, the project team will
see a picture of the process and how it meets customer requirements. When the
process is not always within specifications, variation must be reduced. When the
process is not producing the correct output value, the mean must be shifted to the
target. It is also possible that both of these conditions exist and a shift of mean and
reduction in variation are required.
In any case, the sigma value is the preferred method of measuring capability.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 109
Summary
• Sigma (Z) is used as a standard measure of
process capability
• Determining baseline capability helps to
outline the improvement objective
• Use continuous and discrete methods for
calculating process sigma

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 110
Hypothesis Testing

Narrowing to Root Causes

OBJECTIVES
- Introduce the fundamentals of hypothesis testing.
- Determine which test to use for what types of data.
- Define the terms used in subsequent chapters.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 111
Hypothesis Testing

• Is there a difference between data sets?

• Is that difference significant?

• How can I be sure?

In the Analyze Phase of a DMAIC project you are asked to segment the data from
your process in order to identify possible causes of defects and variation. Many
times these variables are segmented on the basis of a gut feeling from the process
owners, and other times the behavior of the data will show you how to separate the
segments. It is necessary at this stage to identify the statistical differences between
sets of data in order to understand which possible causes are actually true causes of
defects and variation in the process. The way to identify the root causes is through
hypothesis testing, testing whether the output of the process is influenced by a
specific input or inputs.
Hypothesis testing will enable you to determine whether an observed difference in
the data is due to random chance or a true difference in the process. Through
hypothesis testing we are able to minimize subjectivity by eliminating the gut feeling
as a decision-making tool. We can also minimize the risk of making an incorrect
decision and wasting time trying to fix a problem that has no bearing on the output of
our process.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 112
H0 - Null Hypothesis ("Everything is the same")
• There is no difference between data sets
• Means are equal, medians are equal
• Data is normal

HA - Alternative Hypothesis ("Something is different")
• There is a significant difference
• Means are not equal, medians are not equal
• Data is not normal

The null hypothesis, or the hypothesis that we assume to be true, is that all things
are equal; that there is no difference between the averages of two data sets; that all
the locations in the region have the same on-time performance; that the data is
normal; that one factor has the same influence on the output as any other.
The alternative hypothesis is that there is a difference.
A hypothesis test assumes that the null hypothesis is true and then sets out to
disprove that hypothesis and to prove the alternative.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 113
The p-value

[p-value scale from 0.00 to 1.00: REJECT the null hypothesis when the p-value falls below the 0.05 cutoff (the data sets are different); ACCEPT it as the p-value approaches 1.00 (the data sets are the same).]

As described in Chapter 1, the p-value is a tool that will help you make decisions.
The p-value represents the probability of a set of data occurring if the null
hypothesis is true. If the null hypothesis is true and the data sets are equal, the p-
value will be high, close to 1.00. On the other hand, if the alternative hypothesis is
true and the data sets are different, the p-value will be low, usually less than 0.05.
The Black Belt decides what the cutoff should be. For most tests it will be 0.05, but
in less important cases it can be as high as 0.10, and in very important cases it can
be as low as 0.01.
The complement of the p-value (1 minus the p-value) is the level of confidence you have
in your conclusion when you reject the null hypothesis, usually 0.95 or 95%. A p-value of 0.01 or less would indicate
that you are 99% sure of your results, and in some cases that level of certainty is
required.
The p-value will help you decide which road to take, whether to accept that your
data sets are equal and that there is no difference, or to reject that hypothesis and
believe, instead, that they are statistically different and should be treated separately.
When you get a p-value from your hypothesis test that is less than 0.05, you can
believe that your variables are behaving differently and this is the next step in
determining root causes.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 114
Your Decision vs. The Truth

                                    Accept Hypothesis          Reject Hypothesis
Hypothesis is True                  Correct                    Type 1 Error (α - Risk)
Hypothesis is False
  (Alternative is True)             Type 2 Error (β - Risk)    Correct

Example: The customs agent assumes that there are no narcotics in the suitcase, and the
truth is either that there are drugs in the suitcase or there are not.
The hypothesis is that there are no drugs because the hypothesis is always that there is
no difference between samples, in this case, no drugs in any suitcase.
A) The agent rejects the hypothesis and searches the bag:
1) If there are drugs in the bag, he will find them and make the arrest.
2) If there are no drugs in the bag, the person will be delayed a few minutes and
possibly grumble about the experience, but no harm was really done.
B) The agent accepts the hypothesis and does not search the bag:
1) If there are no drugs in the bag, there is no harm done.
2) If there are drugs in the bag then they get into the country, and that is much worse
than the error made in A2 above.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 115
α - Risk
• The chance that you will make a
mistake if you reject your hypothesis
but it is actually true.
• False alarm - We take action when
none is necessary
• Generally 0.05 - the limit for the p-value

The errors that can be made when analyzing data are different, and the risk of
making each of those errors is also different. We must determine the amount of risk
we can tolerate before sampling and testing the data.
We will deal with α-risk more often than β-risk because most of the tests we use
will involve the p-value, and that is set at the level of α-risk we identify as
acceptable. Usually 0.05, the level at which we are 95% confident that our null
hypothesis is false, is the cutoff for the p-value.
α-risk is the possibility of making a Type 1 error, rejecting the null hypothesis when
it is actually true.
At times it may be acceptable to use 0.10 if we only need to be 90% certain. At
other times, when it may be a matter of safety or of a large potential expenditure,
0.01, or 99% certainty, may be required.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 116
β - Risk
• The chance that you will make a
mistake if you accept your hypothesis
but it is actually false
• Missed opportunity - We take no action
when we should make a correction
• Typically 0.10 - 0.20

β-risk is the probability that we will miss a signal in the data and take no action to
correct a problem. Our sample data will show us a problem and we will miss it and
miss the opportunity to make an improvement. We will believe that the null
hypothesis is true, that there is no difference in the data sets or that no single factor
influences the outcome more than any other. However, the null hypothesis is false
and there is a difference, but we miss it.
This risk can be a serious matter because opportunities to solve problems may be
hard to find. Most experiments use a β of 0.10 or 0.20. A β of 0.10 means we are
90% confident that we will not miss a signal.
β-risk is the possibility of making a Type 2 error, accepting the null hypothesis
when it is actually false.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 117
Which Test to Use?
                  Discrete X                     Continuous X

Discrete Y        Chi-Square                     Logistic Regression

Continuous Y      ANOVA                          Regression
                  (Means/Medians Tests)

There are several tests we will use to look for root causes, and the choice of test
depends on the type of data being studied.
If the input variable is discrete and the output variable is also discrete, the Chi-
Square test is used.
If the input variable is discrete and the output variable is continuous, the available
tests are ANOVA, t-tests and medians tests.
If the input variable is continuous and the output variable is also continuous,
regression analysis will be employed.
Continuous input and discrete output variables demand the use of logistic
regression, which will not be covered in this manual.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 118
Definitions

• t - Statistic
• f - Statistic
• Chi-Square (χ2)
• Correlation coefficient (R)
• Coefficient of determination (R2)

In the following chapters the tests used to decide between the null hypothesis and the
alternative will be explained. Some of the key terms that will arise during the discussions
are defined here:
- t-statistic is a ratio, the numerator is the difference in the means of two sets of
data and the denominator is the variation of the means. The p-value that results
from the t-test represents the chance of the two means being equal.
- f-statistic is a ratio that compares the variation among groups of data to the
variation within the groups. The f-statistic is used in Analysis of Variance
(ANOVA) to compare the means of three or more sets of data. The p-value that
results from the ANOVA test represents the chance that the means of all groups are
the same.
- Chi-square is a sum total, calculated by subtracting all observed values from
values that should be expected and squaring the differences, then adding the
resulting numbers. The p-value that results from the chi-square test is the chance
that the observed values are what should be expected.
- Correlation coefficient (R) is the measure of the strength of the relationship
between one continuous x and one continuous y. The p-value that results from a
regression analysis is the chance that R equals zero.
- Coefficient of determination (R2) measures the amount of variation that is
explained by the regression model and is expressed as a percentage.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 119
Homogeneity of Variance
Normal Data
[95% confidence intervals for sigmas and plots of the data for the two factor levels]

F-Test:          Test Statistic: 1.398    P-Value: 0.152
Levene's Test:   Test Statistic: 1.448    P-Value: 0.231
Minitab>Stat>ANOVA>Homogeneity of Variance

Some of the tests we will encounter will require the Black Belt to specify whether
the variances of two or more sets of data are equal. In order to answer this question,
the Homogeneity of Variance test must be performed.
The p-value that results from the homogeneity of variance test is the chance that the
sets of data have the same level of variation from the mean.
If the data is normal, use the p-value found in the F-test above. If the data is not
normal, use the p-value found in the Levene’s test below.
The resulting graphs are also useful for illustrating variation in a presentation.
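For those working outside Minitab, a minimal sketch of the same two checks using SciPy's bartlett and levene functions is shown below; the two data sets are simulated stand-ins, not project data.

# Sketch: homogeneity of variance checks in SciPy (illustrative data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group1 = rng.normal(loc=15.3, scale=0.25, size=35)
group2 = rng.normal(loc=15.4, scale=0.30, size=35)

bart_stat, bart_p = stats.bartlett(group1, group2)   # F/Bartlett's test - normal data
lev_stat, lev_p = stats.levene(group1, group2)       # Levene's test - non-normal data
print(f"Bartlett p={bart_p:.3f}   Levene p={lev_p:.3f}")
# p-values above 0.05 -> accept that the variances are equal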

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 120
Summary
• Null hypothesis is assumed
• Null hypothesis states that there is no
difference, all sets of data are equal
• p-value is a decision tool
• Choice of test is determined by the type of
data being studied

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 121
Means / Medians Tests

Are we really different?

OBJECTIVES
- Present the Analysis Path as the method for choosing the correct test when
studying discrete input variables and a continuous output.
- Explain the use of each of the six tests:
  For normal data:       1-Sample t-test, 2-Sample t-test, ANOVA
  For non-normal data:   Wilcoxon Test, Mann-Whitney, Mood's Median Test
- Identify p-values for all tests
- Use the Stack Data command to structure data for tests in Minitab

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 122
MEAN ?

MEDIAN ?

As discussed in the Basic Statistics Chapter, the measure of central tendency, or
most likely expected value, can be either the mean or the median of the data set. In
order to measure the performance of a process, it will be necessary to compare the
center of the data set to a target value or to the center of another data set. While the
dispersion of the data is also very important in improving a process, the central
tendency is the best way to compare whole sets of data with discrete inputs and a
continuous output.
The examples in this chapter will compare distribution centers and box-packing
operators, which are clearly discrete inputs. The Y’s will be arrival time for the
locations and number of boxes loaded per hour for the operators, clearly continuous
outputs.
When data is normal, the mean is the appropriate measure of central tendency
because the distribution will fit under the normal curve and is predictable.
When data is non-normal, the median is more appropriate because extreme values,
or outliers, will pull the mean in one direction away from the actual center of the
data, while the median remains constant in the face of very high or very low
numbers. It is the middle of the road, the structural center of the data set.
Mean is shown with the Greek letter mu (µ).
Median is shown as x-tilde (x̃).

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 123
Levels
• Test of One Level:             µ = target value         x̃ = target value
• Test of Two Levels:            µA = µB                  x̃A = x̃B
• Test of Three or More Levels:  µA = µB = µC ...         x̃A = x̃B = x̃C ...

There are different tests to be used depending, not only on the normality of the data,
but also on the number of data sets being tested.
If we want to measure one process against an objective, for example measuring the
ability of the accounting department to prepare all invoices in 15 days, then we must
use the correct test that compares the central tendency of our output to the target
value of 15.
If we want to compare two sets of data, for example measuring the accounting
department’s completion time for a Type ‘A’ invoice versus a Type ‘B’ invoice, then
we must use the appropriate two-level comparison test.
If we want to compare three or more sets of data, we must use the appropriate test for
that study.
As a result, there are six tests to be covered:
One-level test of the mean
Two-level test of the means
Three or more-level test of the means
One-level test of the median
Two-level test of the medians
Three or more-level test of the medians
We will cover the execution and results of each test.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 124
Analysis Path

1. Control
2. Normality of Data
3. Homogeneity of Variance
4. Choice of Test

[Illustrations: a confidence-interval plot comparing several factor levels, and a one-sample confidence interval with H0 and the sample mean marked.]

The first thing to understand is the order of analysis, or the Analysis Path.
The data must be in control in order to be studied. If special cause variation exists,
it can disturb the analysis and produce inaccurate results. Chart the process and
identify special cause variation so it can be removed from the analysis if possible.
The next step is to determine if the data being studied is normal or non-normal.
The choice of test will depend greatly on this fact.
The homogeneity of variance will not always be used, but it is important to
understand if the data sets being studied are similar, not only in their central
tendency, but also in their dispersion. The choice of test will, in some cases,
depend on this fact.
After these steps you will be able to select the appropriate test to conduct.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 125
One-Level Test of Mean

• Average Delivery Time
• Target Value = 0
  – Meaning: not late or early

"Are we meeting our objective?"

The mean of a sample will give the observer a snapshot of that process for the
present time. If you want to determine the mean of the data in the long term, it is
necessary to test the mean statistically against the target value.
Suppose we wanted to study a distribution center’s ability to meet its required
delivery times exactly, not one minute early or late, that is, a 0 minute average
delivery time. We can take a sample and calculate the mean, but that data is
incomplete because it has been collected over a short time.
Using hypothesis testing, we will be able to determine the confidence interval for
the mean output of the process over the long term, thereby painting a more accurate
picture of the process than just the sample.
In this case, the target value is 0.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 126
Analysis Path

In Control? Yes

[Individuals chart of the data over time; all points fall within the control limits, roughly -5 to 5.]

Minitab>Stat>Control Charts>Individuals

We begin by studying the data to see if it is in control. An individuals chart is
helpful in showing the process over time.
In this case, we see that the process is in control because there are no points outside
the control limits and there are no trends, patterns or cycles in the data. If there
were some evidence that the data is out of control, it would be necessary to identify
the special cause variation and eliminate it before continuing. If the signals are
easily explained, then the out of control points can be eliminated from the study. If
further investigation is required to identify the special causes of variation, then the
focus of the study must turn in that direction until the process is in control.
For this example we are clear to proceed.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 127
Analysis Path

Normal? Yes

[Normal probability plot of the data; points fall along a straight line within the confidence bands.]

Minitab>Graph>Probability Plot

Next it is necessary to learn whether the data is normal or not. As discussed in the
Basic Statistics Chapter, there are several ways to test normality. In this case we’ve
used the Normal Probability Plot to see if all points are within the confidence
interval. The data appears normal, but if there is any doubt, be sure to run
additional tests.
Once we know the data is normal we can select the appropriate test.
Non-normal data will be studied in the second half of this Chapter.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 128
Analysis Path

t - Test
[Histogram of Loc1 with H0 (the test mean, 0) and the 95% confidence interval for the mean marked.]
Test of mu = 0.000 vs mu not = 0.000
Variable N Mean StDev SE Mean T P
Loc1 35 0.114 1.996 0.337 0.34 0.74

Minitab>Stat>Basic Statistics>1-Sample t

The test of the mean for one level is the 1-Sample t-test.
Data must be in one column, though not necessarily in time sequence.
The output display will show the null hypothesis (in blue) based on what you’ve
entered as the test mean, which is the target value, and the 95% confidence interval
for the mean of the process in the long term (in red).
You will also see a box labeled Alternative for the alternative hypothesis. In most
cases “not equal” will be the correct choice, but if you are only interested in
determining whether the actual mean is greater than or less than the target value,
you can select the corresponding choice from the dropdown menu.
In this case the p-value is 0.74, making us confident that the mean of the process in
the long term is 0.
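A hedged sketch of the same one-sample test in Python follows; the Loc1 data here is simulated for illustration, so the numbers will not exactly match the Minitab output above.

# Sketch: 1-sample t-test with SciPy (simulated delivery-time data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
loc1 = rng.normal(loc=0.1, scale=2.0, size=35)            # minutes early/late

t_stat, p_value = stats.ttest_1samp(loc1, popmean=0.0)    # H0: long-term mean = 0
print(f"T={t_stat:.2f}  P={p_value:.2f}")
# A large p-value (above the 0.05 cutoff) means we cannot reject H0.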

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 129
Two-Level Test of Means

• Average Delivery Time
• One Data Set Compared to Another

"How are we performing relative to another location?"

Suppose we want to compare the output of a process when performed by two
different operators or locations. We will want to determine if the mean of the data
from one operator is the same as that of the other. We continue on the Analysis
Path for two levels.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 130
Analysis Path

In Control? Yes          Normal? Yes

[Individuals chart and normal probability plot of the second data set.]

Check the second set of data for control and for normality the same way you
checked the first set. In this case, we see that both conditions are met - the data is
in control and it is normal.
If the data were not in control, it would be necessary to investigate special cause
variation before proceeding.
If the data were not normal, it would be necessary to compare medians instead of
means.
In this example, we can compare the means of the two sets of data.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 131
Analysis Path

Variances Equal? Yes

[95% confidence intervals for sigmas and boxplots for the two factor levels.]

F-Test:          Test Statistic: 1.654    P-Value: 0.147
Levene's Test:   Test Statistic: 4.277    P-Value: 0.042

Minitab>Stat>ANOVA>Homogeneity of Variance

Before comparing the means, however, it is necessary to determine whether the sets
of data have the same spread around the mean. Comparing sets of data with very
different levels of dispersion can lead to faulty conclusions, so it is necessary to
identify the homogeneity (similarity) of variance between the two data sets.
The null hypothesis for the homogeneity of variance test is that the variances of the
data sets are equal and the alternative hypothesis is that they are not equal. The
resulting p-value from the test will tell us whether we should accept the null
hypothesis, believing that the variances are equal, or reject it.
For normal data, the Bartlett’s, or F-test, is used. It is the upper portion of the
output graph. For non-normal data, the result of Levene’s test is used.
In this case, we see that the variances are statistically equal because the p-value of
the F-test is 0.147, above the 0.05 cutoff.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 132
Analysis Path

2-Sample t - Test
[Boxplots of Loc1 and Loc2 with their individual means marked.]

95% CI for mu Loc1 - mu Loc2: ( -1.08, 1.11)


T-Test mu Loc1 = mu Loc2 (vs not =): T = 0.03 P = 0.98
Both use Pooled StDev = 2.30

Minitab>Stat>Basic Statistics>2-Sample t

The test of means for two levels is the 2-Sample t-test.


Data must be in two columns, though not necessarily in time sequence.
The boxplot display will show the two data sets with their individual means. The
null hypothesis for this test is that the means are equal, and the alternative is that
they are not equal, unless you specify under the Alternative that you want to
determine that the mean of the First Sample is greater than or less than the mean of
the Second Sample.
In this case the p-value is 0.98, making us confident that the means of the two data
sets are equal.

Note: For data sets with equal variance, determined through the homogeneity of variance test, check
the box marked Assume equal variances and the test will be conducted using the pooled standard
deviation of both data sets. This will offer a truer picture of the comparison between the means of
both groups. For data sets with unequal variance, leave the box unchecked and the data sets will be
compared using the individual variance for each.
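Below is a hedged Python sketch of the same comparison; the equal_var argument plays the role of the Assume equal variances box, and the two samples are simulated for illustration.

# Sketch: 2-sample t-test with SciPy (simulated data for two locations).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
loc1 = rng.normal(0.1, 2.0, 35)
loc2 = rng.normal(0.0, 2.6, 35)

t_stat, p_value = stats.ttest_ind(loc1, loc2, equal_var=True)   # pooled standard deviation
print(f"T={t_stat:.2f}  P={p_value:.2f}")
# Use equal_var=False when the homogeneity of variance test says the spreads differ.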

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 133
Three or More Levels - Test of Means

• Average Delivery Time
• All Data Sets Compared to Each Other

"How are we performing among all locations?"

When comparing data sets for three or more levels, we must follow the Analysis
Path for each additional data set.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 134
Analysis Path

In Control? Yes     Normal? Yes     Variances Equal? Yes

[Individuals chart, normal probability plot, and 95% confidence intervals for sigmas for the three factor levels.]

Bartlett's Test:   Test Statistic: 2.356    P-Value: 0.308
Levene's Test:     Test Statistic: 2.322    P-Value: 0.103

In this example, the third data set is in control and it is normal. The variances of all
three sets are statistically similar based on the 0.308 p-value for Bartlett’s test, the
test used with normal data. We can proceed with the test of means for three (or
more) levels.
If the data were not in control, it would be necessary to investigate special cause
variation before proceeding.
If the data were not normal, it would be necessary to compare the medians of the
data sets.
If the variances were not equal, it would be necessary to compare the medians of
the data sets in addition to comparing the means in order to support the findings of
the means test.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 135
Analysis Path

ANOVA
Analysis of Variance for Stack1
Source DF SS MS F P
Group1 2 162.31 81.15 16.11 0.000
Error 102 513.72 5.04
Total 104 676.03
Individual 95% CIs For Mean
Based on Pooled StDev
Level N Mean StDev ------+---------+---------+---------+
1 35 0.114 1.996 (-----*-----)
2 35 0.097 2.567 (-----*-----)
3 35 2.743 2.131 (-----*-----)
------+---------+---------+---------+
Pooled StDev = 2.244 0.0 1.2 2.4 3.6

Minitab>Stat>ANOVA>One-way
The test of means for three or more levels is Analysis of Variance or ANOVA.
Data can be in separate columns, or it can be stacked into one column of data with a
separate column of subscripts identifying each input group. Use
Minitab>Manip>Stack/Unstack>Stack Columns to stack the data.
The output display (boxplots optional) will show the data sets with a 95%
confidence interval for their individual means. The null hypothesis for this test is
that all means are equal, and the alternative is that they are not equal.
In this case the p-value is 0.00, making us very confident that the means of the three
data sets are not equal.
We’ve already compared the first two sets of data, so the third set is the one that is
different and that is making the p-value decrease to 0.00. The Minitab session
window display, shown above, also reveals the data set that is not in line with the
others. An unwritten rule is that the more overlap there is among the dashed-line
confidence intervals, the more likely the means are equal. It is clear that set #3 has
a higher mean than the other two.
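A hedged Python sketch of the same one-way ANOVA follows; the three groups are simulated stand-ins for the three locations.

# Sketch: one-way ANOVA with SciPy (simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
loc1 = rng.normal(0.1, 2.0, 35)
loc2 = rng.normal(0.1, 2.6, 35)
loc3 = rng.normal(2.7, 2.1, 35)      # the location with the shifted mean

f_stat, p_value = stats.f_oneway(loc1, loc2, loc3)
print(f"F={f_stat:.2f}  P={p_value:.3f}")
# A p-value near 0 says at least one mean differs from the others.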

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 136
One-Level Test of Median

• Number of Boxes Loaded per Hour
• Target Value = 15

"Am I meeting my objective?"

The median of a sample can also give the observer a snapshot of that process for the
present time. If you want to determine the median of the data in the long term, it is
necessary to test the median statistically against the target value.
Suppose we wanted to study an individual's ability to load a specified number of
cartons in an hour. We can take a sample and calculate the median, but that data is
incomplete because it has been collected over a short time.
Using hypothesis testing, we will be able to determine the confidence interval for the
median output of the process over the long term, thereby painting a more accurate
picture of the process than just the sample.
In this case, the target value is 15.
We will follow the same Analysis Path as before, but in these examples the data is
non-normal. Non-normal data are studied using the medians test found in
Minitab>Stat>Nonparametrics.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 137
Analysis Path

In Control? Yes, with one point to investigate

[Individuals chart of boxes loaded per hour; one point falls just below the lower control limit.]

Minitab>Stat>Control Charts>Individuals

The first step is to check the data for control. In this case we can see one point that
is just below the lower control limit, and this point would require investigation in
order to identify any special cause that may be present.
Other than that, the data seems to be in control, so we can proceed once that
individual point is explained.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 138
Analysis Path

Normal? No

[Normal probability plot; points stack at several values and fall outside the confidence bands.]

Minitab>Graph>Probability Plot

The normality tests will fail with this data set. The data is stacked up at several
points along the distribution and many points fall outside the dashed lines. We will
have to conduct medians tests on this set of data.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 139
Analysis Path

1-Sample Wilcoxon Test

Wilcoxon Signed Rank Test

Test of median = 15.00 versus median not = 15.00

                  N for    Wilcoxon                Estimated
           N      Test     Statistic       P       Median
Oper1     35        32         281.0   0.758        15.00

Minitab>Stat>Nonparametrics>1-Sample Wilcoxon

The test of the median for one level is the 1-Sample Wilcoxon.
Data must be in one column, though not necessarily in time sequence.
The null hypothesis for this test is that the actual median equals the test median,
which you must specify in Minitab, and the alternative hypothesis is that the actual
median does not equal the test median. The output display will show the estimated
median and a p-value for the null hypothesis.
In this case the p-value is 0.758, making us confident that the median of the process
in the long term is 15, which is the target value.
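For reference, here is a hedged Python sketch of a one-sample Wilcoxon test; SciPy works on the differences from the target median, and the operator data is simulated for illustration.

# Sketch: 1-sample Wilcoxon signed rank test with SciPy (simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
oper1 = rng.integers(10, 21, size=35)          # boxes loaded per hour

stat, p_value = stats.wilcoxon(oper1 - 15)     # H0: median of the differences is 0
print(f"Wilcoxon statistic={stat:.1f}  P={p_value:.3f}")
# A large p-value supports the claim that the long-term median equals 15.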

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 140
Two-Level Test of Medians

• Number of Boxes Loaded per Hour
• One Data Set Compared to Another

"Is Operator #1 performing the same as Operator #2?"

Suppose we want to compare the output of a process when performed by two
different operators or locations. We will want to determine if the median of the
data from one operator is the same as that of the other. We continue on the
Analysis Path for two levels.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 141
Analysis Path

In Control? Yes          Normal? No

[Individuals chart and normal probability plot of the second operator's data.]

Check the second set of data for control and for normality the same way you
checked the first set. In this case, we see that the new data is also not normal, but it
is in control.
If the data were not in control, it would be necessary to investigate special cause
variation before proceeding.
If the data were normal, it would still be necessary to compare medians instead of
means because the first set of data is not normal.
In this example, we must compare the medians of the two sets of data.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 142
Analysis Path

Variances Equal? No

[95% confidence intervals for sigmas and boxplots for the two factor levels.]

F-Test:          Test Statistic: 2.663    P-Value: 0.005
Levene's Test:   Test Statistic: 6.121    P-Value: 0.016

Minitab>Stat>ANOVA>Homogeneity of Variance

Before comparing the medians, however, it is necessary to determine whether the
sets of data have the same spread around the mean. Comparing sets of data with
very different levels of dispersion can lead to faulty conclusions, so it is necessary
to identify the homogeneity (similarity) of variance between the two data sets.
The null hypothesis for the homogeneity of variance test is that the variances of the
data sets are equal and the alternative hypothesis is that they are not equal. The
resulting p-value from the test will tell us whether we should accept the null
hypothesis, believing that the variances are equal, or reject it.
For normal data, the Bartlett’s, or F-test, is used. It is the upper portion of the
output graph. For non-normal data, the result of Levene’s test is used.
In this case, we see that the variances are statistically different because the p-value of
Levene’s test is 0.016, below the 0.05 cutoff.
The reason this test is necessary when studying non-normal data is that, if the
variances were equal, you can conduct a means test in support of the findings in the
medians test. This is only true with sample sizes of 25 or more. With unequal
variances and non-normal data and with sample sizes less than 25, only medians
tests can be used.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 143
Analysis Path

Mann - Whitney Test

Mann-Whitney Confidence Interval and Test

Oper1 N = 35 Median = 15.000


Oper2 N = 35 Median = 17.000
Point estimate for ETA1-ETA2 is -2.000
95.0 Percent CI for ETA1-ETA2 is (-3.001,0.000)
W = 1088.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2
is significant at 0.0714

The test is significant at 0.0697 (adjusted for ties)

Cannot reject at alpha = 0.05

Minitab>Stat>Nonparametrics>Mann-Whitney

The test of medians for two levels is the Mann-Whitney test.
Data must be in two columns, though not necessarily in time sequence.
The output display will show the two data sets with their individual medians. The
null hypothesis for this test is that the medians are equal, and the alternative is that
they are not equal, unless you specify under the Alternative that you want to
determine that the median of the First Sample is greater than or less than the median
of the Second Sample.
In this case the p-value is 0.0697, and although that is above the 0.05 cutoff, there
may still be some doubt as to the similarity of the two medians. In some cases, as
discussed in the Basic Statistics Chapter, it may be more appropriate to set the p-
value cutoff at 0.10.
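A hedged Python sketch of the same two-sample comparison follows; the two operators' samples are simulated for illustration.

# Sketch: Mann-Whitney test with SciPy (simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(13)
oper1 = rng.integers(10, 21, size=35)
oper2 = rng.integers(12, 25, size=35)

u_stat, p_value = stats.mannwhitneyu(oper1, oper2, alternative="two-sided")
print(f"U={u_stat:.1f}  P={p_value:.4f}")
# H0: the two distributions (medians) are the same; reject when P is below the cutoff.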

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 144
Three or More Levels
Test of Medians
• Number of Boxes Loaded per Hour
• All Data Sets Compared to Each Other

"Are all operators meeting the standard?"

When comparing data sets for three or more levels, we must follow the Analysis
Path for each additional data set.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 145
Analysis Path

In Control? Yes     Normal? No     Variances Equal? No

[Individuals chart, normal probability plot, and 95% confidence intervals for sigmas for the three factor levels.]

Bartlett's Test:   Test Statistic: 8.036    P-Value: 0.018
Levene's Test:     Test Statistic: 3.263    P-Value: 0.042

In this example, the third data set is in control and is not normal. The variances of all
three sets are not statistically similar based on the 0.042 p-value for Levene’s test, the
test used with non-normal data. We must proceed with the test of medians for three
(or more) levels.
If the data were not in control, it would be necessary to investigate special cause
variation before proceeding.
If the data in the third set were normal, it would still be necessary to compare the
medians of the data sets because the first two sets of data are non-normal.
If the variances were equal, it would be possible to compare the means of the data
sets in addition to comparing the medians in order to support the findings of the
medians test. Since the variances are not equal, we are only able to test the medians.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 146
Analysis Path

Mood’s Median Test


Mood Median Test

Mood median test for Stack2

Chi-Square = 1.22 DF = 2 P = 0.542

Individual 95.0% CIs


Group2 N<= N> Median Q3-Q1 ----+---------+---------+---------+--
1 20 15 15.00 3.00 (------+------)
2 16 19 17.00 6.00 (-------------------+------)
3 20 15 15.00 4.00 (------------+------------)
----+---------+---------+---------+--
13.5 15.0 16.5 18.0
Overall median = 15.00

Minitab>Stat>Nonparametrics>Mood’s Median Test

The test of medians for three or more levels is Mood’s Median Test.
Data must be stacked into one column with a separate column of subscripts identifying
each input group. Use Minitab>Manip>Stack/Unstack>Stack Columns to stack the data.
The output display will show the data sets with a 95% confidence interval for their
individual medians. The null hypothesis for this test is that all medians are equal, and the
alternative is that they are not equal.
In this case the p-value is 0.542, making us confident that the medians of the three data
sets are equal.
If the p-value were less than 0.05, we would look at the output window to determine which
of the data sets was dissimilar from the others. An unwritten rule is that the more overlap
there is among the dashed-line confidence intervals, the more likely the medians are equal.
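A hedged Python sketch of Mood's median test for three operators follows; again the data is simulated purely for illustration.

# Sketch: Mood's median test with SciPy (simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(17)
oper1 = rng.integers(10, 21, size=35)
oper2 = rng.integers(11, 24, size=35)
oper3 = rng.integers(10, 22, size=35)

chi2, p_value, grand_median, table = stats.median_test(oper1, oper2, oper3)
print(f"Chi-Square={chi2:.2f}  P={p_value:.3f}  Overall median={grand_median}")
# H0: all medians are equal; a p-value above 0.05 means we cannot reject it.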

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 147
Choice of Test*

                           Normal                          Non-Normal

One Data Set               t - test                        Wilcoxon

Two Data Sets              2-Sample t - test               Mann-Whitney

Variances Equal            ANOVA                           Mood's Median Test (ANOVA)

Variances Not Equal        ANOVA (Mood's Median Test)      Mann-Whitney / Mood's Median Test

*Assuming Data is in Control

Reviewing the available tests of means and medians, the first choice is whether the data
is normal or non-normal. For normal data, tests of means are appropriate, except when
variances are unequal, in which case medians should be compared to support the means
test. For non-normal data, tests of medians are appropriate, but means tests can be used
to support the findings of the medians tests when variances are equal and sample size is
at least 25.
The next decision is how many levels of data are being studied.
If only one set is being compared to a target value:
- 1-Sample t-test for normal data
- 1-Sample Wilcoxon for non-normal data
If two sets of data are being compared to each other:
- 2-Sample t-test for normal data
- Mann-Whitney test for non-normal data
If three or more sets of data are being compared to each other:
- ANOVA for normal data
- Mood’s Median test for non-normal data
Remember - the first step is to make sure the data is in control.
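Note: One way to see the decision tree above in code form is the Python sketch below (scipy). It assumes the data has already been confirmed to be in control and that you have already judged normality; the supporting tests shown in parentheses in the table are not automated here.

from scipy import stats

def choose_test(groups, target=None, normal=True):
    # One data set compared to a target value
    if len(groups) == 1:
        if normal:
            return stats.ttest_1samp(groups[0], popmean=target)      # 1-Sample t-test
        return stats.wilcoxon([x - target for x in groups[0]])       # 1-Sample Wilcoxon
    # Two data sets compared to each other
    if len(groups) == 2:
        if normal:
            return stats.ttest_ind(groups[0], groups[1])             # 2-Sample t-test
        return stats.mannwhitneyu(groups[0], groups[1])               # Mann-Whitney
    # Three or more data sets
    if normal:
        return stats.f_oneway(*groups)                                # ANOVA
    stat, p, grand_median, table = stats.median_test(*groups)         # Mood's Median Test
    return stat, p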

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 148
Summary
• Choice of test is determined through
Analysis Path
• Study control, normality and variance in
order to choose correct analysis tool
• Test means and medians to support
conclusions
• Stack the data for better analysis

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 149
Regression

If A increases, what does B do...?

OBJECTIVES
- Introduce regression as an analysis tool for continuous data, both input(s)
and output.
- Use Minitab to conduct regression analysis.
- Understand the line of best fit and the regression equation.
- Define correlation and determination.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 150
Regression

+ + =

The Effect of Independent Variables on


Dependent Variables
Minitab>Stat>Regression>Regression

Regression is a very broad topic that explains the effects independent variables have
on dependent variables. Take the illustration above, for example. Suppose you
want to compare the amount of vegetables a farmer produces from year to year.
Rainfall, sunshine and soil fertility are all factors that affect crop levels. We call
these the independent variables since they are independent of one another. The
amount of vegetables produced in one year is the dependent variable since it is
directly dependent on the independent variables: the levels of rain, sunshine and
soil fertility directly affect the amount of crops produced. If you are able to obtain
data on all of these dependent and independent levels from year to year, regression
analysis can let you predict the amount of vegetables produced in the future with
considerable accuracy. As you can tell by now, this can be a very powerful tool for
managers to help them plan, forecast and improve their operations.
Regression can range from the very simple to the very complex. Simple regression
covers the effect that one independent variable has on one dependent variable. An
example of this would be the effect the amount of rainfall alone has on crop
production. Multiple regression is the effect that two or more independent
variables have on a single dependent variable. The illustration above is an example
of multiple regression.

Note: Regression Analysis is only used with continuous data. Continuous input must generate
continuous output in order for this analysis to be effective.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 151
Correlation

Minitab>Stat>Basic Stats>Correlation

When you first take a look at the concept of correlation, you may immediately
question the difference between it and simple regression. While the two concepts are
fundamentally similar, they have certain characteristics that set them apart and provide
a different type of analysis to interpret.
Regression deals with the effect that independent variables have on a dependent
variable. Correlation is different in that it deals with the effect that two separate
variables have on one another. For example, a study showed that the productivity of a
manufacturing plant was directly related to the amount of sleep employees received
the night before. As sleep increased, so did productivity. This is known as a positive
correlation, since the increase in one happened concurrently with an increase in the
other. Another study showed that an increase in spending per pupil in a school
district led to a decrease in failure rates among students. This is classified as a
negative correlation, since the increase in one led to the decrease in the other.
Sometimes variables are found to be totally unrelated, or to hold such a weak correlation
that they really have no relationship to one another. There is no real correlation
between someone’s phone number and the type of car they drive, or between pollution
rates and baseball scores. These types of relationships are said to have zero
correlation. What do you think the illustration above is saying? What type of
correlation would you classify it as?

Note: Scatter Plots are the graph of choice for observing correlation between variables. Refer to the
Graphs Chapter to learn more about scatter plots.
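Note: As a minimal sketch outside Minitab, correlation can be measured in Python with scipy. The sleep and productivity figures below are illustrative, echoing the example above; they are not real study data.

from scipy import stats

sleep_hours = [5, 6, 6.5, 7, 7.5, 8, 8.5, 9]
units_built = [40, 44, 47, 50, 55, 58, 60, 63]

r, p_value = stats.pearsonr(sleep_hours, units_built)
print(f"r = {r:.2f}, p = {p_value:.4f}")
# r near +1 indicates positive correlation, near -1 negative correlation,
# and near 0 little or no correlation.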

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 152
Data Setup

Head.L Head.W Neck.G Length Chest.G Weight


10 5 15 45 23 65
11 6.5 20 47.5 24 70
12 6 17 57 27 74
12.5 5 20.5 59.5 38 142
12 6 18 62 31 121
11 5.5 16 53 26 80
12 5.5 17 56 30.5 108
16.5 9 28 67.5 45 344
16.5 9 27 78 49 371
15.5 8 31 72 54 416
16 8 32 77 52 432
17 10 31.5 72 49 348
15.5 7.5 32 75 54.5 476
17.5 8 32 75 55 478
15 9 33 75 49 386
15.5 6.5 22 62 35 166
13 7 21 70 41 220
15 6.5 28 78 45 334
15 7.5 26.5 73.5 41 262
13.5 8 27 68.5 49 360
15.5 7 29.3 76 53 416

Before any analysis can take place in regression, it is important to ensure that your
data is set up correctly. In this case we will use the Minitab model to explain proper
data set up. As stated before, regression deals with continuous data. This
continuous data contains predictor (sometimes known as the “X”) and the
response (sometimes known as the “Y”). Predictors determine the response.
Therefore, it is important to put all of your predictors in separate columns and the
response in its own column with all the points in sequence.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 153
Fitted Line Plots
Regression Plot: Y = -278.749 + 12.9680X, R-Sq = 93.3%
(Scatter plot of Weight versus Chest.G with the fitted regression line)

Minitab>Stat>Regression>Fitted Line Plot

When graphing simple regression data, the Fitted Line Plot is the most useful in
data analysis. This graph will produce a scatter plot diagram of a single
independent variable (X) and its relation to a single dependent variable (Y). This
allows you to observe the data and its spread along the regression line. The
regression line and its slope are products of the data itself. The slope tells you how
much Y changes for each unit change in X. The tightness of the data
points around the regression line is also very important to observe on this chart.
The tighter the data points around the regression line, the stronger your prediction
equation. This tightness is noted below the equation at the top of the chart as R-sq.
R-sq and the meaning of the % statistic associated with it will be discussed further
in the next section. For now it is important just to note that it signifies tightness of
the data points around the regression line.

Note: The regression line is displayed as the algebraic equation of a line (y = mx + b) located
above the scatter plot. The letter ‘m’ is the slope: the constant that multiplies ‘x’, giving the change
in ‘y’ for each one-unit increase in ‘x’. The letter ‘b’ is the y-intercept, or the value of ‘y’ when ‘x’
equals zero.
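Note: A minimal sketch of the same fitted line calculation in Python (scipy), using the Chest.G and Weight columns from the Data Setup table. If the plot above was produced from the same rows, the slope, intercept and R-Sq printed here should land close to its equation.

from scipy import stats

chest_g = [23, 24, 27, 38, 31, 26, 30.5, 45, 49, 54, 52, 49,
           54.5, 55, 49, 35, 41, 45, 41, 49, 53]
weight = [65, 70, 74, 142, 121, 80, 108, 344, 371, 416, 432, 348,
          476, 478, 386, 166, 220, 334, 262, 360, 416]

fit = stats.linregress(chest_g, weight)                 # line of best fit
print(f"Weight = {fit.intercept:.1f} + {fit.slope:.2f} * Chest.G")
print(f"R-Sq = {fit.rvalue ** 2:.1%}")                  # tightness around the line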

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 154
Multiple Regression

Predictor Coef StDev T P


Constant -248.96 22.91 -10.86 0.000
Chest.W 11.4572 0.8578 13.36 0.000
Length 0.0652 0.7065 0.09 0.927
Age 0.4898 0.1397 3.51 0.001

S = 29.04 R-Sq = 94.4% R-Sq(adj) = 94.2%

Minitab>Stat>Regression>Regression

Fitted Line Plots were very useful in analyzing simple regression examples, but in
the real world, dependent variables (Y) will usually be affected by numerous
independent variables (X).
The data in the example above comes from a study of bears found in North
America. A scientist studying the bears wants to determine if there is a direct
relationship between the weight of the bears (Y) and the three factors (X) listed above:
chest width, length and age. He decides to use multiple regression analysis to test
these factors. After the test is run, he observes the p-value for each factor X and its
relationship to the dependent variable Y. The resulting p-value is an indication of
the level of impact each predictor has on the response. In regression analysis, X
variables with p-values above 0.05 are said to not have an effect on Y (null
hypothesis). X variables with p-values of 0.05 or less are said to have an effect on the Y
output (alternative hypothesis). As you can see in the chart above, both chest width
and age appear to have an effect on the weight of the bears (p-value less than .05).
Length, on the other hand, has a p-value greater than .05. This indicates that there
is probably not a direct relationship between length and weight.

Note: When you perform the Minitab command listed above for multiple regression you will
generate the output shown above, along with other graphs that chart residuals. Residuals and their
role in regression analysis will be covered later in the chapter.
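Note: A minimal sketch of multiple regression outside Minitab, assuming the Python statsmodels package. The chest width, length, age and weight values below are hypothetical stand-ins for the bear study data, which is not reproduced here.

import numpy as np
import statsmodels.api as sm

chest_w = np.array([23, 30, 38, 45, 49, 52, 54, 41, 35, 49])
length = np.array([45, 56, 59, 67, 78, 77, 75, 73, 62, 68])
age = np.array([19, 55, 81, 100, 140, 153, 160, 95, 60, 110])
weight = np.array([65, 108, 142, 344, 371, 432, 478, 262, 166, 360])

# Predictors in separate columns, plus a constant term for the intercept
X = sm.add_constant(np.column_stack([chest_w, length, age]))
model = sm.OLS(weight, X).fit()
print(model.summary())   # coefficients, p-values, R-Sq and adjusted R-Sq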

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 155
Coefficient of Determination (R-Sq)

The Strength of the Prediction Equation

During the section on Fitted Line Plots we briefly mentioned that the R-Sq value
tells us about the tightness of the data points around the regression line. This
tightness, observed with a scatter plot diagram, visually signifies that the regression
equation is either strong or weak. The closer the data points are to the regression
line, the stronger the equation. Conversely, the farther the points fall from the
regression line, the weaker the equation. The R-Sq statistic is a direct reflection of
these data points and their tightness. You interpret R-Sq by observing how close the
statistic is to 1 or 0. The closer the statistic is to 1, the stronger the regression
equation. The closer the statistic is to 0, the weaker the regression
equation.
Another way to describe this is that the R-Sq value represents the portion of total
variation that is explained by the regression equation. If R-Sq is 35%, then only
35% of the variation in the process is explained by that regression model. We will
generally want to see R-Sq scores above 60% to assign a predictor as a possible
root cause.
As with all forms of data analysis, both visual and analytical observations of data
are equally important. Regression analysis is an excellent example of achieving
both of these desired observations.
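Note: As a minimal sketch of what R-Sq represents, it can be computed directly as one minus the ratio of unexplained (residual) variation to total variation. The actual and predicted values below are hypothetical.

import numpy as np

actual = np.array([65, 108, 142, 344, 371, 432])
predicted = np.array([72, 100, 150, 330, 380, 425])    # values from a fitted model

ss_res = np.sum((actual - predicted) ** 2)             # variation left unexplained
ss_tot = np.sum((actual - actual.mean()) ** 2)         # total variation in the data
r_sq = 1 - ss_res / ss_tot
print(f"R-Sq = {r_sq:.1%}")                            # closer to 100% = stronger equation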

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 156
Residuals and Residual Analysis
Residual Model Diagnostics:
- Normal Plot of Residuals (residual versus normal score)
- Residuals vs. Fits (residual versus fitted value)

Minitab>Stat>Regression>Residual Plots

Once you have tested your regression equation, either visually through a scatter plot or
by interpreting the R-Sq value, it is always important to look at the residuals that have
resulted from the equation. Residuals represent the distance between the predicted
value from your regression equation and the actual value that falls above or below the
regression line. The point on the regression line where the output is calculated by the
equation is known as the theoretical fit, or simply the fit. Residuals are also referred to as error.
This is logical because the distance from your predicted value to the actual value
represents a certain amount of error. It shows how much each actual value was off
from your prediction. Residual analysis is important because it gives you the
opportunity to look for normality and random patterns within the data. The
Normal Plot of Residuals should show a normal pattern. The
Residuals vs. Fits plot should be randomly scattered around the center
line. If either of these is not the case, it could be an indicator that you should go back and
check your data for problems.
Note: Minitab can automatically designate two columns to hold both residuals (difference from
regression line to individual value) and fits (value along regression line where the individual value
should be). To do this, click “Storage” in the Minitab dialog box when performing simple and multiple
regression. If you do not specify storage of residuals and fits, you will not be able to initiate the
Residual Plots command to analyze residuals.
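Note: A minimal sketch of residual analysis in Python (scipy), with hypothetical data. It computes the fits and residuals from a simple regression and checks the residuals for normality; the residuals should also scatter randomly around zero.

import numpy as np
from scipy import stats

x = np.array([23, 26, 31, 38, 45, 49, 52, 54])
y = np.array([65, 80, 121, 142, 334, 348, 432, 476])

fit = stats.linregress(x, y)
fits = fit.intercept + fit.slope * x        # theoretical fit on the regression line
residuals = y - fits                        # error: actual value minus fit

print(np.round(residuals, 1))               # should scatter randomly around zero
print(stats.shapiro(residuals))             # p > 0.05 suggests roughly normal residuals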

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 157
Cause

Regression Plot: Y = -278.749 + 12.9680X, R-Sq = 93.3%
(The same Weight versus Chest.G fitted line plot shown earlier)

It is important to understand that regression analysis can be a very valuable tool
during data analysis. Through it you can gain tremendous insight into your data and
possibly uncover factors that you may not have known existed. However, the results you
gain from regression analysis should not be used as the single factor in drawing
conclusions about your data. Just because you find a correlation between two or more
factors, it does not automatically mean that one causes the other. Cause must be
established by investigation, not assumed from a strong correlation alone. If the
regression points toward a possible cause, note it and investigate what you have found
using additional analysis tools.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 158
Summary
• Regression studies the relationship between a
continuous input and a continuous output
• Multiple inputs can be studied against one
output
• P-value and Coefficient of Determination (R-Sq)
define the strength of the relationship
• Cause is not always present; it must be proven

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 159
Chi-square Test of Independence

OBJECTIVES
- Introduce chi-square test of independence analysis as a method for determining the
associations between variables.
- Understand how to interpret a resulting p-value and apply its outcome to a null
hypothesis
- Use Minitab to do a chi-square analysis
- Understand how to construct a contingency table and interpret contingency
analysis

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 160
Chi-square Test of Independence

Everyday we hear examples of individuals proclaiming associations between


variables. “Women shop more than men.” “Foreign cars last longer than domestic
cars.” “First shift is more productive than second shift.” Associations between
discrete data such as these are common, but did you ever stop and wonder if they
were made as a result of statistical analysis or just personal feelings? In either case,
it is important to understand that you can make statistical determinations about
these variables and their relationship to one another. The test most commonly used
in these cases is called the chi square test of independence. We use the term
“independence” because we are trying to determine if a relationship exists between
the variables or if no relationship exists between them (independent). Determining
a relationship can be a significant factor in helping you control your process. After
all, if you find that B has a relationship to A (your process), wouldn’t it be in your
best interest to control B as much as possible?

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 161
Contract Pre-Screen
Chi Square Example

Prescreen Non-prescreen Total


Good 90 40 130
Fair 20 10 30
Poor 10 15 25
Total 120 65 N = 185

P-value = .019

HO: Cycle time is independent of prescreen or non-prescreen


HA: Cycle time is not independent of prescreen or non-prescreen

Minitab>Stat>Tables>Chi-Square Test

The example above used the chi-square test of independence to help determine if
truck leasing contracts that went through a prescreen process had a shorter cycle
time than those that did not go through a prescreen process (see null hypothesis
above). The contracts were rated according to the length of their cycle time into
the categories of good, fair and poor. Categorizing these timeframes converts the
continuous measure into discrete data, and with categorized data we apply discrete
data analysis methods. The test for a relationship between discrete inputs and a discrete
output is chi-square. In order to determine if there is an association between pre-
screen/non-prescreen contracts and cycle-time rating, this data was applied to a chi-square
test. The resulting p-value determines if there is or is not a statistical difference
between the populations represented. As a general rule, when the p-value is less
than or equal to 0.05 we reject the null hypothesis and conclude that there is a
statistical difference between the populations represented by the samples. If the p-
value is greater than 0.05 we accept the null hypothesis and conclude that there is
no statistical difference between the sample populations. Since the p-value in the
example above is less than 0.05, we conclude that contracts that go through a pre-
screen generally have faster cycle times than those that do not.
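Note: A minimal sketch of the same chi-square test outside Minitab, in Python (scipy), using the observed counts from the table above. It should reproduce a p-value of roughly 0.019.

from scipy import stats

# Rows: Good, Fair, Poor; Columns: Prescreen, Non-prescreen
observed = [[90, 40],
            [20, 10],
            [10, 15]]

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"Chi-Square = {chi2:.2f}  DF = {dof}  P = {p_value:.3f}")
# p <= 0.05: reject the null; cycle time is not independent of pre-screening.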

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 162
Expected Value
Chi Square Example

                Prescreen       Non-prescreen       Total
Good            90              40                  130
Fair            20              10                  30
Poor            10              15                  25
Total           120             65                  N = 185

P(good) = Number of good cycle times / Number of cycle times observed = 130 / 185 = 0.70

                Prescreen                       Non-prescreen
Good            Actual = 90, Expected = 84      Actual = 40, Expected = 46
Fair            Actual = 20, Expected = 19      Actual = 10, Expected = 11
Poor            Actual = 10, Expected = 16      Actual = 15, Expected = 9

HO: Cycle time is independent of prescreen or non-prescreen
HA: Cycle time is not independent of prescreen or non-prescreen

Minitab>Stat>Tables>Chi-Square Test

Observing the p-value is not the only way to determine the association between
variables. The data that you have on hand will enable you to conduct contingency
analysis to determine if your variables are statistically independent or not. Again,
we will utilize data from the truck leasing contract example used earlier. Start with
the same chart of data from the previous page. This table of actual values will help
us to determine the expected values listed in the table on the bottom of the page. As
you can see, the leasing contracts are still listed as either good, fair or poor, based
on the length of their cycle time. Now look at the sample equation located
between the two tables. Since we know that 130 out of 185
(70%) total contracts were considered good, we determine our expected value by
multiplying 0.70 by 120 (total prescreens). This gives us an expected value of 84.
This value is the number of prescreened contracts that we expect to be considered
good (short cycle time). The greater the difference between the expected value and
the actual value, the more likely that the null hypothesis is false and should be
rejected. Using contingency analysis enables you to compare each variable
independently. Unlike observing p-values alone, this gives you the opportunity to
explore possible outliers within your data that may provide further information.
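Note: The expected-value arithmetic described above (row total times column total, divided by the grand total) can be sketched in a few lines of Python for every cell at once.

row_totals = {"Good": 130, "Fair": 30, "Poor": 25}
col_totals = {"Prescreen": 120, "Non-prescreen": 65}
n = 185

for row, r_total in row_totals.items():
    for col, c_total in col_totals.items():
        expected = r_total * c_total / n
        print(f"{row} / {col}: expected = {expected:.1f}")
# e.g. Good / Prescreen: 130 * 120 / 185 = 84.3, matching the table above.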

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 163
Summary
• Chi-square test of independence studies the association
between variables and determines if they are independent
of one another
• P-values help determine the validity of a null hypothesis
that determines independence
• The difference between actual and expected values in
contingency analysis helps determine independence
between variables
• Contingency analysis provides deeper insight into your
data rather than simple p-value observation

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 164
Design of Experiment (DOE)

Isolating the Vital Few

OBJECTIVES
- Introduce DOE as a useful tool in isolating root causes
- Outline experimental designs and their usage
- Structure data for analysis in Minitab
- Identify keys for decision-making from results

Note: This text is not designed to cover Design of Experiment to the level of detail that may be
required during DMAIC or DMADV projects. It is intended only as an introduction to the uses of
DOE and a familiarization with the terms and techniques involved. For additional information consult
the GE Six Sigma Quality Coach or other sources listed on the reference page in this manual.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 165
Experimentation
• Used in Analyze Phase to identify critical
process variables
CHARACTERIZING THE PROCESS

• Used in Improve Phase to determine the


best settings for variables
OPTIMIZING THE PROCESS

In the Measure phase, data is collected and the capability of the process is
determined. In the Analyze phase, it is time to dig into the data to find the root
causes of defects and variation within the process. DOE is an analysis tool that can
be very useful in the pursuit of root causes. With it you can determine which
variables (x) most affect the output (y). This includes those x’s and y’s over which
the process owner has control and those that are uncontrollable.
Later, in the Improve phase, it will be necessary to pilot possible solutions. DOE
can be an efficient and inexpensive way to determine optimal settings for the
process in order to meet the customer’s needs.
For DMADV projects, DOE can be useful in understanding x’s early in the design
phase, which will save time in developing the new process.
DOE is a more structured approach to data analysis, identifying exactly which
variables and interactions of variables influence the output and to what degree. If it
is possible to structure your data into an experimental design, it can be a very
powerful analysis tool.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 166
Types of Experiments
• Screening - used to identify important
individual factors
• Full Factorial - used to study all main
effects and interactions
• Fractional Factorial - the middle ground;
used to learn more about factors and
interactions than screening experiments

The decision on which design to use depends on the number of variables being studied
and the amount of time and money available for testing.
When it is possible and affordable, the full factorial is the most complete type of
experimental design. All factors are investigated against all levels of the other factors
and the output is collected for each combination. The results include the level to
which each factor influences the output, as well as the amount of influence that
interaction of two or more variables carries. While it may not be true that factor A or
factor B influence the output of a process individually, some combination of the two
together might raise or lower the output value significantly. Full factorials are best
with fewer than five input variables.
Fractional factorials abbreviate the full factorial and give a picture of the individual
factors known as main effects and provide some idea regarding significant
interactions between factors. Because the full picture of all main effects and
interactions cannot be seen as a result of a fractional factorial experiment, it is best
used when the list of vital x’s has been shortened to fewer than ten.
Screening designs are a first pass at a process with many input variables. They are
used to filter the variables that are not significant in order to more closely study those
that may have significance.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 167
Steps in Experimentation
1 Define the Problem
2 State the Hypothesis
3 Identify independent and dependent variables
4 Determine the test level for each variable
5 Calculate the number of trials needed
6 Construct the experimental table
7 Run the experiment (test the data)
8 Summarize the data
9 Draw conclusions and make recommendations

The steps to conducting a DOE are as shown.


We will cover each of these steps in detail and relate them to an example from a
recent project. The first step in this process is to clearly identify the problem to be
resolved.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 168
Example
1 Define the Problem
On time delivery is inconsistent
2 State the Hypothesis
Some combination of Driver, Load Type, Delivery Sequence
and Distance will provide consistent Delivery Time
3 Identify independent and dependent variables
Dependent variable (y): Delivery Time
Independent variables: Driver (Penske or non-Penske)
(x’s) Load Type (drop/hook or unload)
Delivery Sequence (single or multiple)
Distance (over 250 miles or under)

DEFINE THE PROBLEM


A customer location complained of inconsistent delivery times, so a study was
conducted to uncover the reasons for the variation in the process.
STATE THE HYPOTHESIS
The experimental hypothesis is structured in such a way that some combination of
x’s will produce either a more predictable y or a y that is centered around a customer’s
requirement.
IDENTIFY INDEPENDENT AND DEPENDENT VARIABLES
The variables to be studied are of critical importance because it is necessary to include
ALL possible influencing factors. The dependent variable is the output of the process,
and the independent variables are all those factors that influence the output.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 169
Number of Trials
• Test levels for each variable
– Two levels, high and low
• Treatment combinations
– Exponential equation: (# levels)^(# variables)

Example - three variables, two levels each

2³ = 8 experimental runs

DETERMINE THE TEST LEVEL FOR EACH VARIABLE


The simplest factorial designs to study are Two-level Factorials. Each variable is
assigned a high and a low value for the study and each is run against both settings of
every other variable. As soon as three levels are introduced to one or more of the
variables, the designs get tricky. If possible, stick with two levels.
CALCULATE THE NUMBER OF TRIALS NEEDED
The number of treatment combinations is the number of possible combinations of
variables at each setting. For example, if I wanted to study whether goldfish grow
faster with red light or blue light or with classical music or rock’n’roll, the treatment
combinations would be:
1. Red Light / Classical
2. Blue Light / Classical
3. Red Light / Rock’n’Roll
4. Blue Light / Rock’n’Roll
Two factors, two levels each: two raised to the power of two is four (2² = 4).
With three factors at two levels each, two raised to the power of three is eight (2³ = 8).

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 170
Trials - continued
• Repetitions - Running multiple trials
consecutively

• Replicates - Setting up the experiment to


run again

We have studied the p-value in previous chapters as a decision-making tool in data


analysis. In order to capture a p-value during experimentation, it is necessary to
have more than one test run of each treatment combination. There are two ways to
increase the number of runs: repetitions and replicates.
Repetitions are consecutive runs with the same treatment combination resulting in
two or more output values. For example, I set the inputs: VariableA - Low,
VariableB - Low, VariableC - High, and then run the experiment and get an output
value. Then, before changing to the next treatment combination, I run the test again
and get another output value.
Replicates are reproductions of the whole experiment, that is, the experiment is run
with all treatment combinations and then it is conducted again from the beginning.
For example, I set the inputs and run the experiment for all treatment combinations,
and then return to the first setting and start over, recording the new output values.

Note: All the possible experimental designs are described in Minitab under Stat>DOE>Create
Factorial Design.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 171
Experimental Table
Test Run Variable 1 Variable 2 Variable 3 Output
1 Lo Lo Lo
2 Lo Lo Hi
3 Lo Hi Lo
4 Lo Hi Hi
5 Hi Lo Lo
6 Hi Lo Hi
7 Hi Hi Lo
8 Hi Hi Hi

CONSTRUCT THE EXPERIMENTAL TABLE


When all the levels of each variable are assigned, the experimental table can be
produced for use in setting up the experiment. Notice the patterns of Hi and Lo for
each variable. This is the basic structure, but you may find it more useful to
randomize the runs. In that event, Test Run #8 may be first, followed by Test Run
#3, and so on. This is helpful in maximizing the experimental value of the design
by forcing those participating in the experiment to break down the design after each
treatment combination has been run and set it up anew each time.

Note: The randomizing option is available under Create Factorial Design as well.
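Note: A minimal sketch of building a two-level, three-variable table and randomizing the run order, in Python. The variable names are generic placeholders.

import itertools
import random

levels = ["Lo", "Hi"]
runs = list(itertools.product(levels, repeat=3))   # 2^3 = 8 treatment combinations

random.shuffle(runs)                               # randomized run order
for run_number, (var1, var2, var3) in enumerate(runs, start=1):
    print(run_number, var1, var2, var3)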

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 172
Example - continued
4 Determine the test level for each variable
Driver non-Penske -1 Penske +1
Load Type Unload -1 Drop/Hook +1
Distance Under 250 miles -1 Over 250 miles +1
Delivery Single Stop -1 Multiple +1
5 Calculate the number of trials needed
Four variables, two levels each: 2⁴ = 16 trial runs
6 Construct the experimental table

For our example, the settings for each variable have been determined above. The -1
and +1 are common settings for experimental design.
The number of trials in our experiment depends on the number of variables. Four
variables, two levels of each: two raised to the power of four is sixteen (2⁴ = 16).

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 173
Run the Experiment
RunOrder Drvr Load Stops Dist Time
1 1 1 1 -1 -63
2 -1 -1 -1 1 -14
3 1 -1 -1 -1 -22
4 -1 -1 -1 -1 -16
5 1 1 1 -1 58
6 1 1 1 1 -92
7 1 1 -1 1 -223
8 1 1 -1 1 11
9 -1 1 1 1 -120
10 -1 -1 -1 -1 113
11 -1 1 -1 -1 -71
12 -1 1 1 1 -120
13 1 1 1 -1 -184
14 1 -1 -1 -1 88
15 -1 1 -1 -1 116
16 -1 1 -1 -1 -258
17 -1 1 1 -1 -74
18 1 -1 1 -1 -38
19 -1 -1 -1 -1 -113
20 1 -1 -1 1 -57
21 -1 1 -1 1 -146
22 -1 1 -1 1 46
23 1 1 -1 1 -457
24 -1 -1 1 1 -19
(Excerpt)

RUN THE EXPERIMENT


Once the settings have been determined and the order of the test runs has been
arranged, set each variable for the first test run and record the output. Move on to
the next run, resetting each variable each time, and record all outputs.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 174
Minitab Output
Pareto Chart of the Standardized Effects (response is Output, Alpha = .10):
terms A (Var1), B (Var2) and the interaction AB plotted against the significance line.

Main Effects Plot (data means) for Output: mean output at the low (-1) and high (+1)
settings of Var1 and Var2.

Minitab>Stat>DOE>Analyze Factorial Design
Minitab>Stat>DOE>Factorial Plots

SUMMARIZE THE DATA


Some sample results are shown. The Pareto diagram shows each variable
and the interaction of both. On this chart, anything that falls to the right
of the red dashed line is significant. In this case, only the variable marked
A, or Var1, is significant to the output of this process.
The Main Effects Plot shows the mean of the output at the high and low
level of each variable. This can be useful in understanding which inputs
have an effect and which do not. Now we can see not only that Var1 is
significant, but also how much the output changes between the high and low
settings of that variable.
In the delivery example, all four variables were significant, and the Main Effects
Plot below shows how much influence each has on the output.
Main Effects Plot (data means) for Mean: mean delivery-time output at the low (-1)
and high (+1) settings of Drvr, Load, Stops and Dist.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 175
Interpret Results
Fractional Factorial Fit

Estimated Effects and Coefficients for Output (coded units)

Term         Effect     Coef       StDev Coef   T         P
Constant                21.3125    0.1083       196.88    0.000
Var1         24.1250    12.0625    0.1083       111.43    0.000
Var2         -0.3750    -0.1875    0.1083       -1.73     0.158
Var1*Var2    -0.3750    -0.1875    0.1083       -1.73     0.158

Analysis of Variance for Output (coded units)

Source               DF   Seq SS     Adj SS     Adj MS     F       P
Main Effects         2    1164.31    1164.31    582.156    6E+03   0.000
2-Way Interactions   1    0.28       0.28       0.281      3.00    0.158
Residual Error       4    0.38       0.38       0.094
Pure Error           4    0.38       0.38       0.094
Total                7    1164.97

Minitab>Stat>DOE>Analyze Factorial Design

DRAW CONCLUSIONS AND MAKE RECOMMENDATIONS


The Minitab session window will capture relationships for each variable and the
interactions of each on the output. This result will be shown as a p-value, and we
all know what to do with p-values! Generally, 0.10 is the cutoff in experimentation.
Without replicates or repetitions, however, no p-value will be present in the session
window, and your conclusions will have to be drawn from the charts above.
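Note: A minimal sketch of analyzing a two-level factorial outside Minitab, assuming the Python statsmodels and pandas packages. The replicated output values below are hypothetical; the formula Output ~ Var1 * Var2 fits both main effects and the two-way interaction.

import pandas as pd
import statsmodels.formula.api as smf

data = pd.DataFrame({
    "Var1":   [-1, -1, 1, 1, -1, -1, 1, 1],
    "Var2":   [-1, 1, -1, 1, -1, 1, -1, 1],
    "Output": [9.1, 8.8, 33.2, 33.5, 9.4, 9.0, 33.6, 33.9],
})

model = smf.ols("Output ~ Var1 * Var2", data=data).fit()
print(model.summary())   # coefficients and p-values for Var1, Var2 and Var1:Var2
# With replicates present, effects with p-values below about 0.10 are significant.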

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 176
Summary
• DOE can be used to study data in the
Analyze phase
• DOE can also be used in the Improve phase
to determine optimal settings for a process
• Two-level factorials are simplest to use
• DOE is a world in itself

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 177
Sampling in the Improve Phase

Collecting the Proper Amount of


Data to Show Improvement

OBJECTIVES
- Review α and β risk
- Use the Power and Sample Size Calculator to determine sample size
required to validate improvement.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 178
Key Elements of the Improve Phase

• Prove improvement

• Limit expense in pilot testing

• Make good decisions

In order to justify a solution that has been selected during the Improve phase of a
project, it is necessary to prove that an improvement has been realized.
A pilot is a test of all or part of a proposed solution conducted on a small scale. Pilots
are used to better understand the effects of a chosen solution and to learn about how
to make the full scale implementation more effective.
The desired outcomes of the pilot test are:
1) Statistically prove that the improvement that has been achieved is
legitimate, and
2) Spend as little time and money as possible to prove it
By selecting the appropriate sample size required, we are able to accomplish both
objectives. The Power and Sample Size Calculator will provide sufficient samples to
validate the improvement. Once that number of samples is reached and sufficient
data is captured, we need not continue the test.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 179
Factors
• Alpha Risk (α) - The chance of acting when
there has been no improvement
• Beta Risk (β) - The chance of not acting
when there has been an improvement
• Power (1-β) - The chance that we will not
miss an opportunity for improvement
• Delta (∆) - The amount of difference we’re
trying to detect

As discussed in Chapter 8, Hypothesis Testing, there are elements of risk in


decision-making. The α risk factor is the chance that the tester will reject the null
hypothesis when it was actually true. In other words, she will assume from her
findings that there has been an improvement when there has actually been none.
This is normally set at 0.05 for a 95% confidence level.
The β risk factor is the chance that the tester will accept the null, that there is no
difference, when there actually has been an improvement. This is known as a
missed opportunity. The level is usually set higher than the α risk factor at 0.10 or
0.20 so that it is difficult for an opportunity to be missed.
The power value is a measure of how confident the tester can be that she will detect
an improvement. It is the complement of β risk, so it is generally set at 0.80 or 0.90 (1 -
0.20 or 1 - 0.10). The tester will be able to say that she is 80% or 90% sure that the
improvement opportunity was not missed.
Delta, introduced in Chapter 5, is a measure of the improvement difference we want
to detect. For example, if the new process is expected to create a 50% higher yield
of good products, we set ∆ at 0.50.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 180
Required Sample Sizes
(Minitab Power and Sample Size output: power = 1-β, α risk = 0.05)

In this example of the Power and Sample Size Calculator, we are trying to find the
required sample size that will make us 80% or 90% confident (power, 1-β) that we
have made an improvement in our process. We also want to be 95% certain
(significance level, α risk) that the new process has a mean that is 3 units lower
(difference, ∆) than the previous process, which had a standard deviation of 2 units
(sigma).
When the calculation is executed, we find that we will require 5 samples to be 86%
certain that we properly detected an improvement and 6 samples to be 93% certain.
In other words, once we test 6 samples, if the mean of those samples is 3 units less
than the previous mean, then we are 93% certain that it is a legitimate improvement.
The Power and Sample Size Calculator is very useful in minimizing the number of
samples required to prove improvement.
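Note: A minimal sketch that roughly reproduces this calculation outside Minitab, assuming the Python statsmodels package. The effect size is the difference divided by sigma (3 / 2 = 1.5), with α = 0.05 and a one-sided test in the direction of the expected improvement.

from statsmodels.stats.power import TTestPower

analysis = TTestPower()
for n in (5, 6):
    power = analysis.power(effect_size=1.5, nobs=n, alpha=0.05,
                           alternative="larger")
    print(f"n = {n}: power = {power:.0%}")       # roughly 86% and 93%

# Or solve for the sample size needed to reach a target power:
n_needed = analysis.solve_power(effect_size=1.5, alpha=0.05, power=0.90,
                                alternative="larger")
print(f"Samples needed for 90% power: {n_needed:.1f}")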

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 181
Summary
• The purpose of sampling in the Improve
phase is to validate improvement

• Minitab Power and Sample Size Calculator


requires α, β, ∆ and previous process σ.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 182
References
• GE DMAIC Wave Training
• GE Six Sigma Quality Coach
• Minitab Software (Help Menu)
• Dr. D. J. Wheeler, Understanding Variation: The Key to
Managing Chaos
– SPC Press, Inc., Knoxville
• D.J. Koosis, Statistics: A Self-Teaching Guide, 4th Edition
– John Wiley and Sons, Inc., New York
• Groebner & Shannon, Business Statistics: A Decision-
Making Approach, 3rd Edition
– Merrill Publishing Company, Columbus

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 183
ABOUT THE AUTHORS

Tracy Owens (left) has been a Master Black Belt with Penske Logistics since March
2000. He served six years in the U.S. Army and worked five years in export sales and
Six Sigma quality at a North American truck manufacturer. Tracy holds a Master of
International Business degree from Seattle University.

Cayle Sargent joined Penske Logistics as a Black Belt in September 1999. Prior to
joining the company, he served for four years as an Ordnance officer in the U.S.
Army. Cayle holds a B.S. degree in Quantitative Methods from Western New
England College in Springfield, Massachusetts.

Confidential and Proprietary. Copyright © 2000 Penske Truck Leasing Co., L.P. 184

You might also like