
Control Charts For Non-Normal Data: Illustrative Example From The Construction Industry Business

This document discusses using control charts for non-normal data from the construction industry. It introduces Johnson distributions as a way to model non-normal quality characteristics data to create appropriate control charts. As an example, the authors analyze real field data from ready-mixed concrete production plants where the process data was skewed non-normally distributed. The Johnson distribution system can accommodate different levels of skewness and kurtosis, allowing estimates of percentiles to calculate control limits similar to normally distributed data. This approach allows calculating capability indices and control limits for non-normal data types.


Mathematical and Computational Methods in Science and Engineering

Control Charts for Non-Normal Data: Illustrative Example from the Construction Industry Business

M. AICHOUNI (1, *), A. I. AL-GHONAMY (2) and L. BACHIOUA (3)


(1, 2) BinLaden Research Chair on Quality and Productivity Improvement in the Construction Industry, College of Engineering
(3) Department of Mathematics, Preparatory Year College
University of Hail
P.O. Box. 2440, Hail
SAUDI ARABIA
(*) [email protected] http://faculty.uoh.edu.sa/m.aichouni/

Abstract: - Statistical Process Control charts widely used in industry and services by quality professionals
require that the quality characteristic being monitored is normally distributed. If, in contrast, the distribution of
this characteristic is not normal, any conclusion drawn from control charts on the stability of the process may
be misleading and erroneous. In this paper, an alternative approach has been suggested that is based on the
identification of the best distribution that would fit the data. Specifically, the Johnson distribution was used as a
model to normalize real field data that showed departure from normality. Real field data from the construction
industry was used as a case study to illustrate the proposed analysis.

Key-Words: - Statistical Process Control, Shewhart control charts, non-normal data, Johnson System of
distributions

1 Introduction

Statistical Process Control is a process improvement methodology widely used by modern manufacturing and service organizations. This methodology is mainly based on the use of control charts and frequency distributions of process and quality characteristics data. Common and well-established control charts include the Shewhart control charts (X̄-R and X̄-s charts), the cumulative sum control chart (CUSUM) and the exponentially weighted moving average control chart (EWMA). In process improvement strategies, these control charts are used to monitor product quality and to detect special events in the process that may cause out-of-control situations and lead to an unstable and unpredictable process. Such processes deliver poor-quality products to customers. Customers expect suppliers of products and services to provide proof of process control and process capability. Control charts help an organization's management to improve processes continuously by making them more stable and more capable of producing high quality, meeting customer specifications, and achieving business excellence.

Standard (Shewhart) control charts are designed on the assumption that the process being monitored produces a quality characteristic that can be approximated by a symmetrical normal distribution when only the innate sources of variability are present in the system. The central limit theorem can be used to approximate distributions by the normal distribution, provided that the samples being measured and monitored are large enough. However, in many industrial situations this cannot be assured, and the process output is not normally distributed but heavy-tailed and skewed. Experience has shown that in some manufacturing processes, such as chemical process parameters, cutting tool wear processes and some concrete production processes, the distributions are usually skewed. In this case, standard control charts based on normality assumptions can lead to erroneous conclusions regarding the stability and the capability of the process. Such wrong conclusions can cost manufacturing and service organizations large financial losses and lose customers to competitors. With advances in statistical theory and computing facilities, this problem can be addressed by using distributions that provide a good model for most non-normal quality characteristics. Such an approach has been reported in the technical

ISBN: 978-960-474-372-8 71

literature (Farnum (1996), Chou et al. (1998), Sherill and Johnson (2009), and Derya and Canan (2012)). Derya and Canan (2012) developed standard control charts based on the Weibull, gamma and lognormal distributions. Sherill and Johnson (2009) showed that the exponential, Weibull and lognormal distributions can be used to transform non-normal data for process control and process capability calculations. The objective of the present paper is to examine the use of Johnson's family of distributions to build control charts that can be used for process improvement purposes. A real field case study is presented for ready-mixed concrete production plants, where the process data showed a skewed, non-normal distribution.

2 Johnson's Distributions in Quality Improvement

Statisticians and quality professionals are often faced with the problem of summarizing a set of data by means of a mathematical function that fits the data and allows estimates of percentiles to be obtained. Frequently, they have insufficient theoretical grounds for selecting a model such as the normal, gamma or extreme-value distribution for a "real world" data set (Johnson, 1949). Usually data are obtained and empirical methods are used to draw conclusions and make decisions on process and quality improvement in real business situations. The fitting of empirical distributions to data has a long history, and many different procedures have been advocated. The most common of these is the use of the normal distribution. The central limit theorem leads one to expect this distribution to provide a reasonable representation for many, but not all, physical phenomena (Gerald and Samuel, 1967).

Although models like the gamma, log-normal and beta distributions do lead to a wide diversity of distribution shapes, they still do not provide the degree of generality that is frequently desirable. In 1949, Johnson derived a system of curves that has the flexibility of covering a wide variety of shapes. This system has the practical and theoretical advantage that these curves can be transformed to the normal distribution. The Johnson system is able to closely approximate many of the standard continuous distributions through one of the three functional forms and is thus highly flexible. The Johnson system provides one distribution corresponding to each pair of mathematically possible values of skewness and kurtosis. Any data set can be fitted by a member of the Johnson families such as $S_U$, $S_L$, and $S_B$. This motivated us to use the Johnson system for the analysis of micro array data (Johnson, Kotz & Balakrishnan 1994). This family of distributions, published by the statistician N.L. Johnson in 1949, is perhaps the most versatile choice. It is based on a transformation of the standard normal variable, and includes four forms:
1. Unbounded: the set of distributions that go to infinity in both the upper and lower tails.
2. Bounded: the set of distributions that have a fixed boundary on either the upper or lower tail, or both.
3. Log Normal: a border between the Unbounded and Bounded distribution forms.
4. Normal: a special case of the unbounded form.

Because the Johnson system involves a transformation of the raw variable to a normal variable, estimates of the percentiles of the fitted distribution can be calculated from the normal distribution percentiles, for use in control limit calculations (on the Individual-X chart or the X̄-R charts) or for capability analysis. Thus, although capability indices and control limits are generally only defined for normal variables, this approach allows their calculation for all distribution types (Johnson, Kotz & Balakrishnan 1994). In this study, the authors applied the Johnson system, which includes the $S_U$, $S_L$, and $S_B$ distributions, as the Johnson system exhibits the key property of being able to accommodate all theoretically feasible skewness-kurtosis combinations (Figure 1).

Standard process capability analysis is one of many statistical process control tools widely used in manufacturing and services engineering. It is based on the assumption that process data are normally distributed. When this condition cannot be guaranteed, either capability indices should be computed based on distributions other than the normal, or the data should be transformed so that they conform better to the normal distribution (Farnum, 1996). Sherill and Johnson (2009) and many others showed that the Box-Cox and Johnson transformations help the quality professional to perform a correct process analysis, using both control charts for process stability and capability indices for the capability of the process to meet customer specifications. In addition, it is worth mentioning that in a recent study, Kilink et al. (2012) showed that the compressive strength of concrete elements in buildings is best modeled using the log-normal and the Johnson $S_B$ distributions.
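The percentile argument of this section can be written out explicitly. The following is a sketch only: it assumes that the fitted transformation $g$ is strictly increasing and that $Z = g(X)$ is standard normal, and the symbols $g$, $x_p$, $z_p$ and $\Phi$ are notation introduced here for illustration rather than taken from the paper.

```latex
% p-th percentile x_p of X obtained from the p-th percentile z_p of N(0,1)
P(X \le x_p) = p
\;\Longleftrightarrow\;
P\bigl(g(X) \le g(x_p)\bigr) = p
\;\Longleftrightarrow\;
g(x_p) = z_p = \Phi^{-1}(p)
\;\Longrightarrow\;
x_p = g^{-1}(z_p).

% Conventional three-sigma limits for individual observations use z = -3, 0, +3,
% i.e. the 0.135th, 50th and 99.865th percentiles of the fitted distribution:
\mathrm{LCL} = g^{-1}(-3), \qquad
\mathrm{CL}  = g^{-1}(0),  \qquad
\mathrm{UCL} = g^{-1}(+3).
```

Under these assumptions the limits are generally not symmetric about the centre line, which is how a skewed characteristic is accommodated.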


3 Mathematical Formulation of Johnson's Distributions

As stated earlier, when process data exhibit a non-normal distribution, it is erroneous to draw standard control charts for process improvement and to perform capability analysis. The practical solution is to transform the data and drive them towards normality, using well-established transformations and probability distributions, such as the Box-Cox transformation, the log-normal distribution or the Johnson distributions. Such an approach has been used in the open literature. Basically, the Johnson transformation computes an optimal transformation function from three flexible distribution families ($S_U$, $S_B$, and $S_L$). This makes the transformation more powerful than the other distributions (Sherill and Johnson, 2009).

Figure (1): The Skewness and Kurtosis Plane for the Johnson Distributions

These translations transform any continuous random variable $X$ into a standard normal variable $Z$ using the general form:

$$Z = a + b\, g\!\left(\frac{X-\mu}{\sigma}\right) \qquad (1)$$

where $a$ and $b$ are shape parameters, $\mu$ is a location parameter, $\sigma$ is a scale parameter (in Section 3.1 the scale is denoted $\lambda$ and $\sigma$ is a shape parameter), and $g(x)$ is a function defining the Johnson system of families, determined as:

$$g(x) = \begin{cases} \ln(x), & \text{for the lognormal family,} \\ \ln\left(x + \sqrt{x^{2}+1}\right), & \text{for the unbounded family,} \\ \ln\left(\dfrac{x}{1-x}\right), & \text{for the bounded family,} \\ x, & \text{for the normal family.} \end{cases}$$

As discussed in [Johnson, 1949], the above system has the flexibility to match any feasible set of values for the mean, variance, skewness, and kurtosis coefficients. With this system, the skewness and kurtosis also uniquely identify the appropriate form for the $g$ function.

3.1 Johnson's Translation System:

Johnson proposed three normalizing transformations having the general form:

$$Z = \gamma + \sigma f\!\left(\frac{X-\mu}{\lambda}\right) \qquad (1)$$

where $f(\cdot)$ denotes the transformation function, $Z$ is a standard normal random variable, $\gamma$ and $\sigma$ are shape parameters, $\lambda$ is a scale parameter and $\mu$ is a location parameter. Without loss of generality, it is assumed that $\sigma > 0$ and $\lambda > 0$.

The first transformation proposed by Johnson defines the lognormal system of distributions denoted by $S_L$:

$$Z = \gamma + \sigma \ln\!\left(\frac{X-\mu}{\lambda}\right) = \gamma^{*} + \sigma \ln(X-\mu), \qquad X > \mu, \qquad (2)$$

where $\gamma^{*} = \gamma - \sigma \ln\lambda$.

The bounded system of distributions $S_B$ is defined by:

$$Z = \gamma + \sigma \ln\!\left(\frac{X-\mu}{\mu+\lambda-X}\right), \qquad \mu < X < \mu+\lambda. \qquad (3)$$

$S_B$ curves cover bounded distributions. The distributions can be bounded on either the lower end, or the upper end, or both. This family covers gamma distributions, beta distributions and many others.

The unbounded system of distributions $S_U$ is defined by:

$$Z = \gamma + \sigma \ln\!\left\{\frac{X-\mu}{\lambda} + \left[\left(\frac{X-\mu}{\lambda}\right)^{2} + 1\right]^{1/2}\right\} = \gamma + \sigma \sinh^{-1}\!\left(\frac{X-\mu}{\lambda}\right), \qquad -\infty < X < +\infty. \qquad (4)$$

The $S_U$ curves are unbounded and cover the t and normal distributions, among others.

3.2 Johnson's Family of Distributions:

The Johnson family of distributions is made up of three distributions: Johnson $S_U$, Johnson $S_B$ and lognormal. It covers any specified average, standard deviation, skewness and kurtosis. Together they form a four-parameter family of distributions that covers the entire skewness-kurtosis region other than the impossible region. The Johnson $S_U$ distribution covers the area above the lognormal curve and the Johnson $S_B$ covers the area below the lognormal curve. A family of distributions is several distributions combined so that they cover a well-defined region in a skewness and kurtosis plot (lognormal family of distributions, negative lognormal and normal distributions, etc.). Readers can find detailed developments about the Johnson family of distributions in reference books (Gerald and Samuel, 1967).

This family of distributions is usually parameterized as a function of skewness and kurtosis. Skewness is a measure of asymmetry in the data, so for a normal distribution it takes the value of zero. Negative values of the skewness indicate that the data are skewed left, and positive values indicate that the data are skewed right. Kurtosis, on the other hand, is a measure of whether the data are peaked or flat relative to a normal distribution. The kurtosis of a normal distribution is 3.0. A kurtosis value larger than 3.0 indicates a "peaked" distribution and a kurtosis value less than 3.0 indicates a "flat" distribution. Thus, both can be seen as measures of the shape of a distribution.

Table 1 – Compressive strength data for Ready Mixed Concrete (kgf/cm2)

Sample   Cylinder 1   Cylinder 2   Cylinder 3
1        353.8        363          360.6
2        357.8        358.7        370.9
3        365.2        360          356.6
4        340.4        335.2        330.1
5        359.6        358.1        351.2
6        368.1        366.7        369.3
7        357.9        355.0        350.6
8        337.8        352.6        361.6
9        359.1        349.2        363.7
10       361.1        358.2        358.3
11       358.3        345.7        341.7
12       357.3        359.2        356.9
13       352.6        363.1        374.6
14       360.8        356.2        352.7
15       347.5        339.8        354.3
16       358.2        359.5        353.9
17       375.2        372.5        370.2
18       357.5        359.5        348.9
19       343.2        355.8        362.4
20       362.1        356.6        359.1
21       365.2        362          359.4
22       361.3        346.8        339.0
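The transformations of Section 3.1 and the shape measures of Section 3.2 can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, the parameter values in the usage lines are made up purely to exercise the code, and the skewness and kurtosis are the simple moment-based versions (kurtosis equal to 3.0 for a normal distribution), which may differ from the estimators used by statistical packages.

```python
import math


def johnson_sl(x, gamma, sigma, mu, lam):
    """Lognormal system S_L, Eq. (2); requires x > mu."""
    return gamma + sigma * math.log((x - mu) / lam)


def johnson_sb(x, gamma, sigma, mu, lam):
    """Bounded system S_B, Eq. (3); requires mu < x < mu + lam."""
    return gamma + sigma * math.log((x - mu) / (mu + lam - x))


def johnson_su(x, gamma, sigma, mu, lam):
    """Unbounded system S_U, Eq. (4); defined for every real x."""
    return gamma + sigma * math.asinh((x - mu) / lam)


def johnson_su_inverse(z, gamma, sigma, mu, lam):
    """Invert Eq. (4): map a normal score z back to the original scale."""
    return mu + lam * math.sinh((z - gamma) / sigma)


def skewness_kurtosis(data):
    """Moment-based skewness and kurtosis of a sample."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in data) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in data) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    params = dict(gamma=0.5, sigma=1.2, mu=10.0, lam=4.0)
    z = johnson_su(12.0, **params)
    print(z, johnson_su_inverse(z, **params))  # round trip back to 12.0
    print(skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Keeping the forward and inverse $S_U$ maps side by side makes it easy to move between the original scale and the normal-score scale, which is the step that percentile-based control limits rely on.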

4 Application of Johnson's System of Distributions for Real Field Data

To illustrate the above analysis, real field data from the construction industry business were chosen as a case study. Data from ready-mixed concrete plants were gathered and analyzed using Minitab 16 statistical software. The observed quality characteristic was the compressive strength (kgf/cm2) of concrete as defined by international quality standards (ACI-214). The gathered data consisted of 22 samples of concrete with a nominal specification of 350 kgf/cm2. Each sample was of size 3, and the sampling spanned a period of 30 days. These data are presented in Table 1. Initial analysis of the concrete data using a standard X̄ chart (Figure 2) showed that the process is out of statistical control; this would mean that special causes of variation are affecting the process.

Figure 2 – Standard X̄ chart for the concrete compressive strength (Xbar Chart of RMC350; vertical axis: Sample Mean; horizontal axis: Sample; UCL = 367.87, centre line = 356.66, LCL = 345.45)

The out-of-control conclusion drawn from the X̄ control chart rests on the assumption that the concrete data are normally distributed. Is this assumption correct? If not, what would be the best distribution to fit these real field data? To answer this question, distribution identification was carried out for the data, and the outcome is presented in Figure 3 as probability plots. From this figure, it can be seen that the compressive strength of concrete fits neither the normal, nor the exponential, nor the Weibull, nor the lognormal distribution. It is very obvious that the exponential distribution is a poor model for the concrete data. The Johnson distribution would be an alternative for the model (Kilink et al., 2012; Sherill and Johnson, 2009). The data transformed by the Johnson system are illustrated in Figure 4, where it can be seen that this distribution, shown as a mixture, would be the best model of these concrete data. From this figure, it can be seen that the best fit of the data is obtained within the percentile interval ranging from 1.054 to 98.94. Normality within this interval can be guaranteed. These percentiles correspond to the lower control limit and the upper control limit for the normalized data, which are UCL = 375.2 kgf/cm2 and LCL = 330.1 kgf/cm2. These control limits are used as the new control limits for the X̄ chart, as shown in Figure 5. The control chart with the new control limits indicates the opposite of the earlier conclusion drawn from the standard control chart: the process is shown to be in statistical control.

Figure 3 – Probability Plots for the Concrete Compressive Strength (Probability Plot for RMC350, 95% CI panels. Normal: AD = 1.245, P-value < 0.005. Exponential: AD = 28.824, P-value < 0.003. Weibull: AD = 0.933, P-value = 0.018. Gamma: AD = 1.314, P-value < 0.005)

5 Conclusions

Most statistical process control charts require that the quality characteristic being monitored is normally distributed. If, in contrast, the distribution of the quality characteristic of interest is not normal, the conclusions drawn from control charts on the stability of the process may be misleading and highly erroneous. In this paper, an alternative approach has been suggested that is based on the identification of the best distribution that would fit the data. Specifically, the Johnson distribution was used as a model to normalize real field data that showed departure from normality.

Real field data from the construction industry were used as a case study to illustrate the analysis. Assuming normality when the data were not normally distributed led to the conclusion that the monitored process was out of statistical control, indicating that some special causes are present in the process, which would require management to intervene in the process to prevent the special cause of variation from occurring again. This would certainly incur a cost for the organization. However, when the data were transformed and brought to normality through Johnson transformations, and the new control limits calculated, the new control chart indicated no sign of special causes of variation.

Figure 4 – Probability Plots for the Johnson Transformed data of Concrete Strength (Johnson Transformation for RMC350. Original data: N = 66, AD = 1.245, P-value < 0.005. Transformed data: N = 66, AD = 0.353, P-value = 0.455. P-value for best fit: 0.454685; Z for best fit: 0.5. Best transformation type: SU. Transformation function: 0.193791 + 0.798420 * Asinh((X - 358.804) / 4.63969))

Figure 5 – X̄ chart with the new Control Limits using Johnson transformations (horizontal axis: Sample; UCL(NN) = 375.2; +3 StDev = 372.72; +2 StDev = 367.37; +1 StDev = 362.01; centre line = 356.66; -1 StDev = 351.31; -2 StDev = 345.95; -3 StDev = 340.60; LCL(NN) = 330.10)
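The case study can be explored with a short Python sketch built on the Table 1 data. It rests on assumptions that are not in the paper: the X̄ limits are computed with the average-range method and the standard tabulated constant A2 = 1.023 for subgroups of size 3, whereas Minitab may estimate the within-subgroup variation differently, so the limits need not match Figure 2 exactly. The function names are hypothetical. The transformation function is the one reported in Figure 4, and it is applied here only to show the mechanics; the sketch does not recompute the limits UCL = 375.2 and LCL = 330.1 quoted in the text.

```python
import math
from statistics import mean

# Table 1: compressive strength (kgf/cm2), 22 subgroups of 3 cylinders.
TABLE_1 = [
    (353.8, 363.0, 360.6),
    (357.8, 358.7, 370.9),
    (365.2, 360.0, 356.6),
    (340.4, 335.2, 330.1),
    (359.6, 358.1, 351.2),
    (368.1, 366.7, 369.3),
    (357.9, 355.0, 350.6),
    (337.8, 352.6, 361.6),
    (359.1, 349.2, 363.7),
    (361.1, 358.2, 358.3),
    (358.3, 345.7, 341.7),
    (357.3, 359.2, 356.9),
    (352.6, 363.1, 374.6),
    (360.8, 356.2, 352.7),
    (347.5, 339.8, 354.3),
    (358.2, 359.5, 353.9),
    (375.2, 372.5, 370.2),
    (357.5, 359.5, 348.9),
    (343.2, 355.8, 362.4),
    (362.1, 356.6, 359.1),
    (365.2, 362.0, 359.4),
    (361.3, 346.8, 339.0),
]

A2_N3 = 1.023  # assumed: tabulated X-bar chart constant for subgroup size 3


def xbar_limits(subgroups, a2=A2_N3):
    """Return (LCL, centre line, UCL) using the average-range method."""
    means = [mean(g) for g in subgroups]
    r_bar = mean(max(g) - min(g) for g in subgroups)
    centre = mean(means)
    return centre - a2 * r_bar, centre, centre + a2 * r_bar


def out_of_control(subgroups, lcl, ucl):
    """1-based sample numbers whose subgroup mean falls outside the limits."""
    return [i for i, g in enumerate(subgroups, start=1)
            if not lcl <= mean(g) <= ucl]


def fitted_su(x):
    """Transformation function reported in Figure 4 (Minitab output)."""
    return 0.193791 + 0.798420 * math.asinh((x - 358.804) / 4.63969)


if __name__ == "__main__":
    lcl, centre, ucl = xbar_limits(TABLE_1)
    print(f"LCL={lcl:.2f}  CL={centre:.2f}  UCL={ucl:.2f}")
    print("flagged samples:", out_of_control(TABLE_1, lcl, ucl))
    # Normal scores of the 66 individual observations after transformation.
    z = [fitted_su(x) for g in TABLE_1 for x in g]
    print(f"transformed range: {min(z):.2f} to {max(z):.2f}")
```

Separating the limit calculation from the transformation keeps the two steps of the analysis distinct: the first reproduces the kind of normal-theory chart shown in Figure 2, and the second gives the normal scores on which a Johnson-based chart would be built.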


Acknowledgment
The present research work has been undertaken
within the Bin Laden Research Chair on Quality and
Productivity Improvement in the Construction
Industry funded by the Saudi Bin Laden
Constructions Group; this is gratefully
acknowledged. The opinions and conclusions
presented in this paper are those of the authors and
do not necessarily reflect the views of the
sponsoring organization.

References:
[1] Farnum, N. R., Using Johnson Curves to Describe Non-Normal Process Data, Quality Engineering, Vol. 9, No. 2, December 1996, pp. 329-336.
[2] Chou, Y., Polansky, A. M. and Mason, R. L., Transforming Non-normal Data to Normality in Statistical Process Control, Journal of Quality Technology, Vol. 30, April 1998, pp. 133-141.
[3] Sherill, R. W. and Johnson, L. A., Calculated Decisions, Quality Progress, Vol. 42 (1), 2009, pp. 30-35.
[4] Derya, K. and Canan, H., Control Charts for Skewed Distributions: Weibull, Gamma and Lognormal, Metodoloski zvezki, Vol. 9, No. 2, 2012, pp. 95-106.
[5] Johnson, N. L., Systems of frequency curves generated by methods of translation, Biometrika, Vol. 36, 1949, pp. 149-176.
[6] Hahn J. Gerald and Shapiro S. Samuel, Statistical Models in Engineering, John Wiley and Sons, 1967.
[7] Johnson, N. L., Kotz, S. and Balakrishnan, N., Continuous Univariate Distributions, Second Edition, New York: John Wiley & Sons, 1994.
[8] Kilink, K., Celik, A., Tuncan, M., Tuncan, A., Arslan, G. and Arioz, O., Statistical distributions of in situ microcore concrete strength, Construction & Building Materials, Vol. 26, Issue 1, January 2012, pp. 393-403.
[9] ACI Committee 214, Evaluation of Strength Test Results of Concrete (ACI 214R-02), 2005.
